L19 Programming Problem Championship: Round 2 (Strings)

ugoertz · Post by **ugoertz** » Mon May 08, 2017 8:13 am

bernds wrote:
ugoertz wrote:For the curious, I put my solutions/attempts on github: https://github.com/ugoertz/l19contest. Comments welcome!
I was thinking of doing something like this after the first round, but decided against it - having the solutions googleable kind of breaks the concept of contest web sites.

I was also wondering whether this is frowned upon, but I do not think that google will find these attempts too easily. Also, if one wants to cheat then there are other possibilities (e.g. I just googled one of the problems to check whether the solution is already available and found that the hidden test cases can be found (assuming that kattis uses the same ones as the original contest which seems very likely), and on the corresponding page there is even a notice that solutions will be posted at some later point).

Does this seem reasonable or do people disagree? I for one would surely like to have a look at your solutions.

Ulrich

Bill Spight · Post by **Bill Spight** » Mon May 08, 2017 8:40 am

Some comments on problem 2.

Hidden for courtesy.

Kirby · Post by **Kirby** » Mon May 08, 2017 9:26 am

One thing I've learned from this competition is that the theme doesn't really matter much. Sometimes solving a "recursive" problem isn't really focused on the recursion part, or sometimes solving a "string" problem isn't really about strings. Solving these problems requires general problem-solving skills, because almost always, it's not so much about solving the problem "correctly", but rather about solving it fast and with a low amount of memory.

Anyway, while I haven't been able to spend as much time as I'd like on this competition, I still appreciate it - I end up programming more than I would otherwise over the weekend.

Bill Spight · Post by **Bill Spight** » Mon May 08, 2017 10:04 am

Kirby wrote:almost always, it's not so much about solving the problem "correctly", but rather about solving it fast and with a low amount of memory.

Sort of like playing online go.

quantumf · Post by **quantumf** » Mon May 08, 2017 10:38 am

I found A trivial, B perfect, C-E too difficult. I realize to some extent the choice is guided by the Kattis ratings, which rated B a 6.4, C a 7.3 and D 5.8. I think the submission statistics can be some guide about the accuracy of the ratings - B and C had many submissions (100s) while D had very few (<10).

I'm not fussed about theme. Perhaps a theme around the solution mechanisms might be interesting (recursion, DP, etc), if available.

Thanks for setting this, it's very enjoyable.

jeromie · Post by **jeromie** » Mon May 08, 2017 10:41 am

Bill Spight wrote:
Kirby wrote:almost always, it's not so much about solving the problem "correctly", but rather about solving it fast and with a low amount of memory.
Sort of like playing online go.

These contests are remarkably like solving tsumego. In particular, the nature of the problem changes considerably when you know there is a solution before tackling the problem.

Post by **Solomon** » Mon May 08, 2017 1:26 pm

Problem A

Problem B

The way I approached this problem was to utilize the preprocessing function from the KMP algorithm (call it kmp_pre). Specifically, the function calculates for every index of a pattern P, the length of the longest border (substring that is both a prefix and a suffix) found up to that index. For example, kmp_pre('aabbaabb') = [0 1 0 0 1 2 3 4], and kmp_pre('ababab') = [0 0 1 2 3 4].

If, however, instead of running kmp_pre on P we run it on P + P, then the index i at which the border length equals len(P) tells us the length of the period: len(period) := i - kmp_pre(P + P)[i] + 1. The value of n from the problem statement is then just len(P) / len(period). A way of visualizing this is to imagine P on two pieces of tape that you slide against each other until the borders align; the length of the gap ('<>') on the left tells you the length of 'a' in the problem statement (s = a^n):

Code:

aabbaabbaabbaabb
||||||||
aabbaabb (aligns, but this is the trivial case)
 aabbaabb
  aabbaabb
   aabbaabb
    aabbaabb (aligns)
<  >||||||||
aabbaabbaabbaabb

abababababab
||||||
ababab (aligns, trivial)
 ababab
  ababab (aligns)
<>||||||
abababababab

abcdabcd
||||
abcd (aligns, trivial)
 abcd
  abcd
   abcd
    abcd (aligns)
<  >||||
abcdabcd

Example 1: let P = aabbaabb => len(P) = 8. Then kmp_pre(P + P) = [0 1 0 0 1 2 3 4 5 6 7 8 9 10 11 12]. At i = 11, the border length of P + P == 8, i.e., aabbaabbaabb = P + aabb has a substring of length 8 (namely P) that is both a prefix and a suffix, so len(a) = 11 - 8 + 1 = 4. Then n = len(P) / len(a) = 8 / 4 = 2.

Example 2: let P = ababab => len(P) = 6. Then kmp_pre(P + P) = [0 0 1 2 3 4 5 6 7 8 9 10]. At i = 7, the border length of P + P == 6, so len(a) = 7 - 6 + 1 = 2. Then n = len(P) / len(a) = 6 / 2 = 3.

Example 3: let P = abcd => len(P) = 4. Then kmp_pre(P + P) = [0 0 0 0 1 2 3 4]. At i = 7, the border length of P + P == 4, so len(a) = 7 - 4 + 1 = 4. Then n = len(P) / len(a) = 4 / 4 = 1.

Code:

#include <algorithm>
#include <iostream>
#include <vector>
#include <string>
#include <cmath>

std::vector<int> kmp_preprocess(const std::string &pattern) {
    int pattern_sz = (int) pattern.size();
    std::vector<int> border_array(pattern_sz, 0);
    
    int prev = 0;
    int curr = 1;
    while (curr < pattern_sz) {
        if (pattern[curr] == pattern[prev]) {
            border_array[curr] = prev + 1;
            ++prev;
            ++curr;
        } else if (prev > 0) {
            prev = border_array[prev - 1];
        } else {
            ++curr;
        }
    }

    return border_array;
}

int main() {
    while (true) {
        std::string s;
        // Stop at the "." terminator, or at end of input if it is missing.
        if (!(std::cin >> s) || s == ".") break;

        std::vector<int> kmp_prefix = kmp_preprocess(s + s);
       
        int s_sz = (int) s.size();
        int res = 0;
        for (int i = 0; i < (int) kmp_prefix.size(); ++i) {
            if (kmp_prefix[i] == s_sz) {
                res = i - kmp_prefix[i] + 1;
            }
        }
        std::cout << s_sz / res << std::endl;
    }
}

Problem C

Intuitively I knew this was a DP problem, but the problem was getting the state and the transition correct. For the state, several pieces of information are necessary. Start with size and balance...

size: just the length of the string
balance: counter to keep track of difference between number of ('s vs )'s. e.g., ()()()'s balance = 0, ((('s balance = 3, ()(())(('s balance = 2.

If we utilized just these two pieces of information and ran DP with it, that would be ideal, and it would handle a lot of cases like ['(((', ')))'] or ['(())((', ')))', '(()', '(((())']. However, what this doesn't handle is differentiating between something like (()) versus ))((, both of which would have a balance of 0 and the same size, but clearly must be handled differently (the first case is "free", the second one not so much...). So a third piece of information (call it 'offset') is needed, which is always <= 0 and counts the number of )'s that come before a positively balanced chunk, or the number of ('s that come after a negatively balanced chunk. So (()) would have an offset of 0, but ))(( would have an offset of -2. It's worth noting that if we are only dealt chunks with negative offsets, then we can't merge any chunks together. In the case of ))((, we would need something (one of many examples) like (( (balance +2, offset 0) and )) (balance -2, offset 0) to merge with. Hence, when we sort these chunks, we need to sort by offset (descending) first to find the ones with 0 offset, before looking at balance and size.
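As a rough illustration of the three quantities described above, here is a minimal sketch that computes them with a single left-to-right scan. The names ChunkSummary and summarize are hypothetical, made up for this example. It only shows the forward scan, which is an assumption that holds for chunks with non-negative balance; the full solution further down rescans from the right when the balance is negative.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper type for illustration only.
struct ChunkSummary {
    int size;     // number of characters in the chunk
    int balance;  // count of '(' minus count of ')'
    int offset;   // lowest running balance seen in a left-to-right scan (always <= 0)
};

// Scans left to right, tracking the running balance and its minimum.
// Assumption: forward scan only; negative-balance chunks would need
// the reverse scan used in the full solution.
ChunkSummary summarize(const std::string &chunk) {
    ChunkSummary s{static_cast<int>(chunk.size()), 0, 0};
    for (char c : chunk) {
        s.balance += (c == '(') ? 1 : -1;
        s.offset = std::min(s.offset, s.balance);
    }
    return s;
}
```

With this sketch, summarize("(())") and summarize("))((") both report size 4 and balance 0, but the offsets differ (0 versus -2), which is exactly the distinction the extra field is meant to capture.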

That's pretty much the hard part, the DP is simpler:

Code:

memo[i + chunk.balance] = max(memo[i + chunk.balance], memo[i] + chunk.size)

We keep track of two memo tables: one for positively balanced chunks, one for negatively balanced chunks. After that, we go through the memo tables and find the highest aligned value between them, and that's the answer.
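To make the transition a bit more concrete, here is a stripped-down sketch of the same knapsack-style update. It is a simplification under stated assumptions: offsets are ignored entirely, balances are taken as non-negative magnitudes, and the names Chunk and best_length_by_balance are hypothetical. The index i is the total balance accumulated so far, and memo[i] is the longest total length that reaches it, which is why the table only needs to be about max_number_of_strings * max_length_of_strings wide: the total balance can never exceed the total number of characters.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical minimal chunk record: total length and non-negative balance.
struct Chunk {
    int size;
    int balance;
};

// memo[b] = largest total length of a subset of chunks whose balances sum to b,
// or -1 if no subset reaches b. Assumption: offsets are ignored in this sketch.
std::vector<int> best_length_by_balance(const std::vector<Chunk> &chunks, int max_balance) {
    std::vector<int> memo(max_balance + 1, -1);
    memo[0] = 0;  // the empty selection has balance 0 and length 0
    for (const Chunk &c : chunks) {
        // Iterate i downward so each chunk is used at most once (0/1 knapsack).
        for (int i = max_balance - c.balance; i >= 0; --i) {
            if (memo[i] >= 0) {
                memo[i + c.balance] = std::max(memo[i + c.balance], memo[i] + c.size);
            }
        }
    }
    return memo;
}
```

For the chunks '(((', '(())((' and '(()' (sizes 3, 6, 3 and balances 3, 2, 1), the table ends with memo[6] = 12 (all three chunks) and memo[3] = 9 (the second and third chunks beat '(((' alone). Chunks that are left out simply never contribute to the entry that wins.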

Code:

#include <algorithm>
#include <cstdlib>   // std::abs
#include <iostream>
#include <string>
#include <vector>

#define MAX_N 300
#define MAX_S 300
#define MAX_V ((MAX_N * MAX_S) + 5)

class ParenChunk {
public:
    int size;
    int balance; // 0 = balanced, count( '(' ) > count( ')' ) => balance > 0
    int offset;

    ParenChunk(std::string raw_paren_str) {
        size = raw_paren_str.size();
        balance = 0;
        offset = 0;

        for (const char &c : raw_paren_str) {
            if (c == '(')
                ++balance;
            else
                --balance;
            offset = std::min(offset, balance);
        }

        if (balance < 0) {
            // recount offset because it has to be done in reverse ( '))(' )
            int rebalance = 0;
            offset = 0;
            for (auto it = raw_paren_str.rbegin(); it != raw_paren_str.rend(); ++it) {
                if (*it == '(')
                    --rebalance;
                else
                    ++rebalance;
                offset = std::min(offset, rebalance);
            }
        }
    }

    bool operator< (const ParenChunk &p) const {
        if (offset != p.offset)
            return offset > p.offset;
        if (balance != p.balance)
            return std::abs(balance) > std::abs(p.balance);
        return size > p.size;
    }

    int get_balance_magnitude() const {
        return std::abs(balance);
    }
};

std::ostream &operator<<(std::ostream &strm, const ParenChunk &p) {
    return strm << '(' 
                << std::to_string(p.size)    << ", "
                << std::to_string(p.balance) << ", "
                << std::to_string(p.offset) 
                << ')';
}

void run_dp(const std::vector<ParenChunk> &v, std::vector<int> &memo) {
    for (const auto &p : v) {
        int p_bal = p.get_balance_magnitude();
        for (int i = MAX_V - 1 - p_bal; i >= -p.offset; --i) {
            if (memo[i]) {
                memo[i + p_bal] = std::max(memo[i + p_bal], memo[i] + p.size);
            }
        }

        if (p.offset == 0)
            memo[p_bal] = std::max(memo[p_bal], p.size);
    }
}

void process_input(std::vector<ParenChunk> &pv, std::vector<ParenChunk> &nv) {
    int n;
    std::cin >> n;

    while (n--) {
        std::string raw_paren_str;
        std::cin >> raw_paren_str;
        ParenChunk paren_chunk{raw_paren_str};
        if (paren_chunk.balance >= 0) {
            pv.push_back(paren_chunk);
        } else {
            nv.push_back(paren_chunk);
        }
    }
}

int main() {
    std::vector<ParenChunk> positively_balanced;
    std::vector<ParenChunk> negatively_balanced;
    positively_balanced.reserve(MAX_N);
    negatively_balanced.reserve(MAX_N);

    process_input(positively_balanced, negatively_balanced);

    std::sort(positively_balanced.begin(), positively_balanced.end());
    std::sort(negatively_balanced.begin(), negatively_balanced.end());

    std::vector<int> positively_memo(MAX_V);
    std::vector<int> negatively_memo(MAX_V);
    run_dp(positively_balanced, positively_memo);
    run_dp(negatively_balanced, negatively_memo);

    int res = positively_memo[0] + negatively_memo[0];
    for (int i = 1; i < MAX_V; ++i) {
        if (positively_memo[i] && negatively_memo[i]) // match
            res = std::max(res, positively_memo[i] + negatively_memo[i]);
    }

    std::cout << res << std::endl;
}

Kirby · Post by **Kirby** » Mon May 08, 2017 1:34 pm

Cool, Solomon. Re: KMP, figures that Knuth has already solved the problem for us.

Re: Problem C, I didn't spend much time on this problem since it looked difficult given the time I would spend on the problem, but I came up with the idea of using your idea of "balance" as a metric. However, once I saw that it didn't always work with a counter example, I gave up on that idea, and started looking at the next problem. It didn't occur to me to consider the "offset".

Post by **Solomon** » Mon May 08, 2017 1:38 pm

Leaderboard

bernds - 11
dfan - 9
ugoertz - 8
Solomon - 7
quantumf - 3
Kirby - 2
jeromie - 2
peti29 - 2
gamesorry - 1

Some theme ideas for round 3:

Graphs
- Graph Traversal
- Single-Source Shortest Paths
- Minimum Spanning Trees
- Network Flow
Computational Geometry
- Circles, points, triangles, and lines
- Polygons
DP
- Knapsack variations [0/1, 0/inf, etc.]
- Coin change variations
- Longest increasing subsequence (LIS) variations
- TSP
Math
- Probability theory
- Number theory
  - Primes
  - Catalan number variations
  - Fibonacci number variations

If I don't get any suggestions, I'll just go with graphs on Friday. Also, for next round, I'll have it start on Friday morning Pacific time so that it will be around 18:00 in Europe!

bernds · Post by **bernds** » Mon May 08, 2017 2:22 pm

Solomon wrote: Problem B
The way I approached this problem was to utilize the preprocessing function from the KMP algorithm (call it kmp_pre).

Problem C
So a third piece of information (call it 'offset') is needed, which is always <= 0 and counts the number of )'s that come before a positively balanced chunk, or the number of ('s that come after a negatively balanced chunk.

Here's what I did for Problem E:

Graphs seems like a good choice, better go read up on a few things. I'd also like to thank you for running these. Kirby says he does more programming on weekends this way than he would otherwise; I'll go so far as to say it's more fun than most of the things I get to do during the week doing software professionally these days. It's good to encounter problems unlike the typical ones you see every day.

peti29 · Post by **peti29** » Mon May 08, 2017 4:26 pm

Solomon wrote: Problem C
That's pretty much the hard part, the DP is simpler:
Code:
memo[i + chunk.balance] = max(memo[i + chunk.balance], memo[i] + chunk.size)

Post by **Solomon** » Mon May 08, 2017 6:28 pm

peti29 wrote:
Solomon wrote: Problem C
That's pretty much the hard part, the DP is simpler:
Code:
memo[i + chunk.balance] = max(memo[i + chunk.balance], memo[i] + chunk.size)
I still don't understand
I'm not too familiar with DP. What does 'i' symbolize? What does the above line mean? Why is the range ~= max_number_of_strings * max_length_of_strings?
I briefly thought about some kind of sorting but then I didn't know how to handle when certain items needed to be left out.

Post by **Solomon** » Mon May 08, 2017 6:42 pm

bernds wrote:
Solomon wrote: Problem B
The way I approached this problem was to utilize the preprocessing function from the KMP algorithm (call it kmp_pre).
This is just equivalent to searching for S within S+S (and ignoring the first match), right? I get the feeling this is how it was intended to be solved. I just went with the factorization. The interesting thing is that your solution looks like it should be more efficient, but runs slightly slower in practice in my tests. This is something I'll probably spend time on to try to understand - is C++ string handling that inefficient?
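For reference, the "search for S within S+S, ignoring the first match" idea can be sketched in a few lines. This is only an illustration, not anyone's submitted solution: the function name max_repetitions is hypothetical, the input is assumed to be non-empty, and it leans on std::string::find, whose running time is implementation-dependent, rather than on a hand-written KMP.

```cpp
#include <string>

// Returns the largest n such that s = a^n for some string a.
// Assumption: s is non-empty.
int max_repetitions(const std::string &s) {
    const std::string doubled = s + s;
    // Start at index 1 to skip the trivial match at index 0.
    // A match always exists at index s.size(), so find() cannot fail here.
    const std::string::size_type shift = doubled.find(s, 1);
    return static_cast<int>(s.size() / shift);
}
```

The first non-trivial match position is the length of the shortest period, so for aabbaabb the match is at index 4 and the result is 8 / 4 = 2, agreeing with Example 1 above.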

Post by **Solomon** » Mon May 08, 2017 8:12 pm

tj86430 wrote:BTW, is it possible to submit solutions after the round has closed and to get feedback whether the solution would have been accepted? There are some problems I'd like to try, but I don't usually (read: ever) have time to do those within set timeframe. I might try during e.g. summer vacation, though.

Absolutely! Just go to the contest link and open the problems up, you should still be able to submit (assuming you registered an account).

peti29 · Post by **peti29** » Tue May 09, 2017 1:14 am

Solomon: thank you for the detailed explanation! I almost get it now. Only one thing remains:
