L19 Programming Problem Championship: Round 2 (Strings)

All non-Go discussions should go here.
User avatar
ugoertz
Dies in gote
Posts: 63
Joined: Tue Dec 14, 2010 3:50 am
GD Posts: 0
Been thanked: 40 times

Re: L19 Programming Problem Championship: Round 2

Post by ugoertz »

bernds wrote:
ugoertz wrote:For the curious, I put my solutions/attempts on github: https://github.com/ugoertz/l19contest. Comments welcome!
I was thinking of doing something like this after the first round, but decided against it - having the solutions googleable kind of breaks the concept of contest web sites.
I was also wondering whether this is frowned upon, but I do not think that google will find these attempts too easily. Also, if one wants to cheat then there are other possibilities (e.g. I just googled one of the problems to check whether the solution is already available and found that the hidden test cases can be found (assuming that kattis uses the same ones as the original contest which seems very likely), and on the corresponding page there is even a notice that solutions will be posted at some later point).

Does this seem reasonable or do people disagree? I for one would surely like to have a look at your solutions.

Ulrich
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: L19 Programming Problem Championship: Round 2

Post by Bill Spight »

Some comments on problem 2. :)

Hidden for courtesy.
How to limit the size of n?

The following assumes that the whole string has been read and that an array or arrays or a dictionary is used.

Let Length be the length of the string. Length mod n = 0.

Let Count(char) be the number of occurrences of char in the string. For each character, Count(char) mod n = 0.

That means that we may be able to use the Greatest Common Denominator (GCD) to limit the size of n before matching substrings. We have

n(max) = GCD(Count(“a”), Count(“b”), . . . , Count(“z”))

Since Length = ∑ Count, we can replace one of the Counts with Length. For instance, if Length is a prime, we know that n(max) is Length or 1, but if we only look at the Counts, we may not realize that for some time. For instance, with Length = 7, the Counts (leaving out 0s) might be (2,2,3).

It is a good idea, I think, to sort the Counts and start with the minimum non-zero Count to find the GCD.

Let’s apply all this to the example strings.

String = “abcd”
Length = 4
Count(”a”) = 1
n(max) = GCD(1,4) = 1
n = 1

String = “aaaa”
Length = 4
Count(“a”) = 4
n(max) = GCD(4,4) = 4

String = “ababab”
Length = 6
Count(“a”) = 3
n(max) = GCD(3,6) = 3 ; Note that we do not need to check Count(“b”). It is redundant.

In the next example, let’s just use the numbers.

Length = 48
Count(“a”) = 18
Count(“b”) = 30
n(max) = GCD(18,48) = 6 ; Again, we do not have to make use of Count(“b”).

Can we also make use the number of occurrences of digrams? With a possible Length of 2,000,000 that might help. The answer is yes, with a twist. Let’s look at the last example:

String = “ababab”

Count(“ba”) = 2
Count(“ab”) = 3
n(max) = GCD(2,3) = 1

This is plainly wrong. We cannot use the GCD of the number of occurrences of “ab” and “ba”. The reason is that “ba” does not belong to the repeating substring. The “b” belongs to one substring while the “a” belongs to the next substring. In fact, in this case Count(“ba”) = n-1. It could also equal the count of “ba” in the repeating substring plus n-1. In either case, if we add 1 to it, we can use it, as adding n to a count does no harm. Let’s do that.

GCD(inc(2),3) = 3

So far, so good. But how do we do know whether to add 1 to Count(“ab”) or to Count(“ba”)? Easy. “a” is the first letter in String, and “b” is the last letter. That tells us that “ba” marks the boundaries between repeating substrings.

Now, when we use digrams, we add 1 to the Count of the digram whose first character is the last character in String and whose second character is the first character in String. When we do that, the sum of the counts = Length, as it should.

In this case the digrams give us no reduction of n(max) from the result we get using single letters.

But let’s look at this example:

String = “abbaab”

Using single characters:

Count(“a”) = 3
Count(“b”) = 3
n(max) = GCD(3,3) = 3

Using digrams:

The sorted Counts:

Count(“aa”) = 1
Count(“bb”) = 1
Count(“ba”) = 1
Count(“ab”) = 2

n(max) = GCD(1,1,2,2)) = 1 ; We add 1 to Count(“ba”)
n = 1

OC, in this case we do not have to make the full computation. In fact, since Count(“bb”) = 1, we don’t need to calculate any GCD at all. ;)

Note that even if we get a lower limit for n with the digrams than we do with the single character counts, we may be able to use the single character count result to reduce the limit even further. It does not hurt to start with the single character counts.

String = “abaabababaabacbabaababababaacb”
Length = 30 ; Chosen because of its number of divisors ;)

Count(“c”) = 2
Count(“b”) = 12
Count(“a”) = 16
n(max) = GCD(2,12,16) = 2

Count(“ac”) = 2
Count(“cb”) = 2
Count(“aa”) = 4
Count(“ab”) = 10
Count(“ba”) = 11 ; Add 1 to this

n(max) = GCD(2,2,4,10,12) = 2

This example is meant to illustrate the virtue (I hope) of this approach. The string is perhaps deceptive in part because of so many repetitions of the initial and final parts of the string (the “a”s and “b”s). That means that if we are working down the candidate substrings, we will get early matches, whether we work from the front or the back. There are also a lot of potential “ba” boundaries between substrings. But that means that there are a lot of “a”s and “b”s, but relatively few “c”s. It is the paucity of “c”s that puts a cap on n. With only 2 “c”s n(max) = 2, while using the counts of only “a” and “b” gives us an n(max) of 4.

OC, as always, the proof is in the pudding. :)
Last edited by Bill Spight on Tue May 09, 2017 7:36 am, edited 1 time in total.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
Kirby
Honinbo
Posts: 9553
Joined: Wed Feb 24, 2010 6:04 pm
GD Posts: 0
KGS: Kirby
Tygem: 커비라고해
Has thanked: 1583 times
Been thanked: 1707 times

Re: L19 Programming Problem Championship: Round 2

Post by Kirby »

One thing I've learned from this competition is that the theme doesn't really matter much. Sometimes solving a "recursive" problem isn't really focused on the recursion part, or sometimes solving a "string" problem isn't really about strings. Solving these problems require general problem solving skills, because almost always, it's not so much about solving the problem "correctly", but rather about solving it fast and with a low amount of memory.

Anyway, while I haven't been able to spend as much time as I'd like on this competition, I still appreciate it - I end up programming more than I would otherwise over the weekend :-)
be immersed
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: L19 Programming Problem Championship: Round 2

Post by Bill Spight »

Kirby wrote:almost always, it's not so much about solving the problem "correctly", but rather about solving it fast and with a low amount of memory.
Sort of like playing online go. :mrgreen:
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
User avatar
quantumf
Lives in sente
Posts: 844
Joined: Tue Apr 20, 2010 11:36 pm
Rank: 3d
GD Posts: 422
KGS: komi
Has thanked: 180 times
Been thanked: 151 times

Re: L19 Programming Problem Championship: Round 2

Post by quantumf »

I found A trivial, B perfect, C-E too difficult. I realize to some extent the choice is guided by the Kattis ratings, which rated B a 6.4, C a 7.3 and D 5.8. I think the submission statistics can be some guide about the accuracy of the ratings - B and C had many submissions (100s) while D had very few (<10).

I'm not fussed about theme. Perhaps a theme around the solution mechanisms might be interesting (recursion, DP, etc), if available.

Thanks for setting this, it's very enjoyable.
jeromie
Lives in sente
Posts: 902
Joined: Fri Jan 31, 2014 7:12 pm
Rank: AGA 3k
GD Posts: 0
Universal go server handle: jeromie
Location: Fort Collins, CO
Has thanked: 319 times
Been thanked: 287 times

Re: L19 Programming Problem Championship: Round 2

Post by jeromie »

Bill Spight wrote:
Kirby wrote:almost always, it's not so much about solving the problem "correctly", but rather about solving it fast and with a low amount of memory.
Sort of like playing online go. :mrgreen:
These contests are remarkably like solving tsumego. In particular, the nature of the problem changes considerably when you know there is a solution before tackling the problem.
User avatar
Solomon
Gosei
Posts: 1848
Joined: Tue Apr 20, 2010 9:21 pm
Rank: AGA 5d
GD Posts: 0
KGS: Capsule 4d
Tygem: 치킨까스 5d
Location: Bellevue, WA
Has thanked: 90 times
Been thanked: 835 times

Re: L19 Programming Problem Championship: Round 2

Post by Solomon »

Problem A

Code: Select all

print("".join(word[0].upper() for word in input().split('-')))
Problem B
The way I approached this problem was to utilize the preprocessing function from the KMP algorithm (call it kmp_pre). Specifically, the function calculates for every index of a pattern P, the length of the longest border (substring that is both a prefix and a suffix) found up to that index. For example, kmp_pre('aabbaabb') = [0 1 0 0 1 2 3 4], and kmp_pre('ababab') = [0 0 1 2 3 4].

If however, instead of running kmp_pre on P, we run it on P + P, then the border length found at an index equal to the size of P tells us the length of the period len(period) := i - kmp_pre + 1, and the value of n from the problem statement would just be len(P) / len(period). A way of visualizing this is to imagine P on two pieces of tape, and you're trying to align them such that the borders align, the length of the gap ('<>') on the left telling you the length of 'a' in the problem statement (s = a^n):

Code: Select all

aabbaabbaabbaabb
||||||||
aabbaabb (aligns, but this is the trivial case)
 aabbaabb
  aabbaabb
   aabbaabb
    aabbaabb (aligns)
<  >||||||||
aabbaabbaabbaabb

abababababab
||||||
ababab (aligns, trivial)
 ababab
  ababab (aligns)
<>||||||
abababababab

abcdabcd
||||
abcd (aligns, trivial)
 abcd
  abcd
   abcd
    abcd (aligns)
<  >||||
abcdabcd
Example 1: let P = aabbaabb => len(P) = 8. Then kmp_pre(P + P) = [0 1 0 0 1 2 3 4 5 6 7 8 9 10 11 12]. At i = 11, the border length of P + P == 8, i.e, that aabbaabbaabb = P + aabb has a substring of length 8 that is both a prefix and a suffix (P), so len(a) = 11 - 8 + 1 = 4. Then n = len(P) / len(a) = 8 / 4 = 2.

Example 2: let P = ababab => len(P) = 6. Then kmp_pre(P + P) = [0 0 1 2 3 4 5 6 7 8 9 10]. At i = 7, the border length of P + P == 6, so len(a) = 7 - 6 + 1 = 2. Then n = len(P) / len(a) = 6 / 2 = 3.

Example 3: let P = abcd => len(P) = 4. Then kmp_pre(P + P) = [0 0 0 0 1 2 3 4]. At i = 7, the border length of P + P == 4, so len(a) = 7 - 4 + 1 = 4. Then n = len(P) / len(a) = 4 / 4 = 1.

Code: Select all

#include <algorithm>
#include <iostream>
#include <vector>
#include <string>
#include <cmath>

std::vector<int> kmp_preprocess(const std::string &pattern) {
    int pattern_sz = (int) pattern.size();
    std::vector<int> border_array(pattern_sz, 0);
    
    int prev = 0;
    int curr = 1;
    while (curr < pattern_sz) {
        if (pattern[curr] == pattern[prev]) {
            border_array[curr] = prev + 1;
            ++prev;
            ++curr;
        } else if (prev > 0) {
            prev = border_array[prev - 1];
        } else {
            ++curr;
        }
    }

    return border_array;
}

int main() {
    while (true) {
        std::string s;
        std::cin >> s;
        if (s == ".") break;

        std::vector<int> kmp_prefix = kmp_preprocess(s + s);
       
        int s_sz = (int) s.size();
        int res = 0;
        for (int i = 0; i < (int) kmp_prefix.size(); ++i) {
            if (kmp_prefix[i] == s_sz) {
                res = i - kmp_prefix[i] + 1;
            }
        }
        std::cout << s_sz / res << std::endl;
    }
} 
Problem C
Intuitively I knew this was a DP problem, but problem was getting the state and the transition correct. For the state, several pieces of information is necessary. Start with size and balance...
  • size: just the length of the string
  • balance: counter to keep track of difference between number of ('s vs )'s. e.g., ()()()'s balance = 0, ((('s balance = 3, ()(())(('s balance = 2.
If we utilized just these two pieces of information and ran DP with it, that would be ideal, and it would handle a lot of cases like ['(((', ')))'] or ['(())((', ')))', '(()', '(((())']. However, what this doesn't handle is differentiating between something like (()) versus ))((, both of which would have a balance of 0 and same size, but clearly must be handled differently (the first case is "free", the second one not so much...). So a third piece of information (call it 'offset') is needed, which is always <= 0 and counts the number of )'s that come before a positively balanced chunk, or the number of ('s that come after a negatively balanced chunk. So (()) would have an offset of 0, but ))(( would have an offset of -2. It's worth noting that if we are only dealt chunks with negative offsets, then we can't merge any chunks together. In the case of ))((, we would need something (one of many examples) like (( (balance +2, offset 0) and )) (balance -2, offset 0) to merge with. Hence, when we sort these chunks, we need to sort by offset (descending) first to find the ones with 0 offset, before looking balance and size.

That's pretty much the hard part, the DP is simpler:

Code: Select all

memo[i + chunk.balance] = max(memo[i + chunk.balance], memo[i] + chunk.size)
We keep track of two memo tables: one for positively balanced chunks, one for negatively balanced chunks. After that, we go through the memo tables and find the highest aligned value between them, and that's the answer.

Code: Select all

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

#define MAX_N 300
#define MAX_S 300
#define MAX_V (MAX_N * MAX_S) + 5

class ParenChunk {
public:
    int size;
    int balance; // 0 = balanced, count( '(' ) > count( ')' ) => balance > 0
    int offset;

    ParenChunk(std::string raw_paren_str) {
        size = raw_paren_str.size();
        balance = 0;
        offset = 0;

        for (const char &c : raw_paren_str) {
            if (c == '(')
                ++balance;
            else
                --balance;
            offset = std::min(offset, balance);
        }

        if (balance < 0) {
            // recount offset because it has to be done in reverse ( '))(' )
            int rebalance = 0;
            offset = 0;
            for (auto it = raw_paren_str.rbegin(); it != raw_paren_str.rend(); ++it) {
                if (*it == '(')
                    --rebalance;
                else
                    ++rebalance;
                offset = std::min(offset, rebalance);
            }
        }
    }

    bool operator< (const ParenChunk &p) const {
        if (offset != p.offset)
            return offset > p.offset;
        if (balance != p.balance)
            return std::abs(balance) > std::abs(p.balance);
        return size > p.size;
    }

    int get_balance_magnitude() const {
        return std::abs(balance);
    }
};

std::ostream &operator<<(std::ostream &strm, const ParenChunk &p) {
    return strm << '(' 
                << std::to_string(p.size)    << ", "
                << std::to_string(p.balance) << ", "
                << std::to_string(p.offset) 
                << ')';
}

void run_dp(const std::vector<ParenChunk> &v, std::vector<int> &memo) {
    for (const auto &p : v) {
        int p_bal = p.get_balance_magnitude();
        for (int i = MAX_V - 1 - p_bal; i >= -p.offset; --i) {
            if (memo[i]) {
                memo[i + p_bal] = std::max(memo[i + p_bal], memo[i] + p.size);
            }
        }

        if (p.offset == 0)
            memo[p_bal] = std::max(memo[p_bal], p.size);
    }
}

void process_input(std::vector<ParenChunk> &pv, std::vector<ParenChunk> &nv) {
    int n;
    std::cin >> n;

    while (n--) {
        std::string raw_paren_str;
        std::cin >> raw_paren_str;
        ParenChunk paren_chunk{raw_paren_str};
        if (paren_chunk.balance >= 0) {
            pv.push_back(paren_chunk);
        } else {
            nv.push_back(paren_chunk);
        }
    }
}

int main() {
    std::vector<ParenChunk> positively_balanced;
    std::vector<ParenChunk> negatively_balanced;
    positively_balanced.reserve(MAX_N);
    negatively_balanced.reserve(MAX_N);

    process_input(positively_balanced, negatively_balanced);

    std::sort(positively_balanced.begin(), positively_balanced.end());
    std::sort(negatively_balanced.begin(), negatively_balanced.end());

    std::vector<int> positively_memo(MAX_V);
    std::vector<int> negatively_memo(MAX_V);
    run_dp(positively_balanced, positively_memo);
    run_dp(negatively_balanced, negatively_memo);

    int res = positively_memo[0] + negatively_memo[0];
    for (int i = 1; i < MAX_V; ++i) {
        if (positively_memo[i] && negatively_memo[i]) // match
            res = std::max(res, positively_memo[i] + negatively_memo[i]);
    }

    std::cout << res << std::endl;
}
Kirby
Honinbo
Posts: 9553
Joined: Wed Feb 24, 2010 6:04 pm
GD Posts: 0
KGS: Kirby
Tygem: 커비라고해
Has thanked: 1583 times
Been thanked: 1707 times

Re: L19 Programming Problem Championship: Round 2

Post by Kirby »

Cool, Solomon. Re: KMP, figures that Knuth has already solved the problem for us :-) Re: Problem C, I didn't spend much time on this problem since it looked difficult given the time I would spend on the problem, but I came up with the idea of using your idea of "balance" as a metric. However, once I saw that it didn't always work with a counter example, I gave up on that idea, and started looking at the next problem. It didn't occur to me to consider the "offset".
be immersed
User avatar
Solomon
Gosei
Posts: 1848
Joined: Tue Apr 20, 2010 9:21 pm
Rank: AGA 5d
GD Posts: 0
KGS: Capsule 4d
Tygem: 치킨까스 5d
Location: Bellevue, WA
Has thanked: 90 times
Been thanked: 835 times

Re: L19 Programming Problem Championship: Round 2

Post by Solomon »

Leaderboard

  1. bernds - 11
  2. dfan - 9
  3. ugoertz - 8
  4. Solomon - 7
  5. quantumf - 3
  6. Kirby - 2
  7. jeromie - 2
  8. peti29 - 2
  9. gamesorry - 1

Some theme ideas for round 3:
  • Graphs
    • Graph Traversal
    • Single-Source Shortest Paths
    • Minimum Spanning Trees
    • Network Flow
  • Computational Geometry
    • Circles, points, triangles, and lines
    • Polygons
  • DP
    • Knapsack variations [0/1, 0/inf, etc.]
    • Coin change variations
    • Longest increasing subsequence (LIS) variations
    • TSP
  • Math
    • Probability theory
    • Number theory
      • Primes
      • Catalan number variations
      • Fibonacci number variations
If I don't get any suggestions, I'll just go with graphs on Friday. Also, for next round, I'll have it start in the morning on Friday in PST time so that it will be around 18:00 in Europe!
bernds
Lives with ko
Posts: 259
Joined: Sun Apr 30, 2017 11:18 pm
Rank: 2d
GD Posts: 0
Has thanked: 46 times
Been thanked: 116 times

Re: L19 Programming Problem Championship: Round 2

Post by bernds »

Solomon wrote: Problem B
The way I approached this problem was to utilize the preprocessing function from the KMP algorithm (call it kmp_pre).
This is just equivalent to searching for S withing S+S (and ignoring the first match), right? I get the feeling this is how it was intended to be solved. I just went with the factorization. The interesting thing is that your solution looks like it should be more efficient, but runs slightly slower in practice in my tests. This is something I'll probably spend time on to try to understand - is C++ string handling that inefficient?
Problem C
So a third piece of information (call it 'offset') is needed, which is always <= 0 and counts the number of )'s that come before a positively balanced chunk, or the number of ('s that come after a negatively balanced chunk.
Yes, that matches what I did. I found that if I sort this data correctly, I can just run the DP algorithm over all of it once and get the right answer at the end.
Here's what I did for Problem E:
It's relatively simple to write a recursive algorithm to generate an alphabetical sequence that matches the requirements. You have to keep track of what the previous character was, and how many characters have been used i times for 1 <= i <= k (and set limits as appropriate).
My first attempt at speeding it up was to recognize that the fact that we're duplicating a lot of patterns if we're going through the whole alphabet. If at any point we're choosing a new character that hasn't been seen before, we can remember how many substrings the recursive calls generated, and then any other choice of new characters will generate the same number. As long as that number is smaller than the remaining count, we can skip the recursive call and just subtract.
This turns out to work great for small k such as in the examples, where we always choose many different new characters, but for large k (approaching 26) it becomes less effective, and here we need speedups the most. So in the end I didn't end up using this. Instead, I recognize when the subproblem becomes suitably small (all character counts above 9 fully used up), and encode the pattern of remaining counts into an index into an array of shortcuts. The idea is the same: if we encounter a similar situation twice, we can skip recursion after the first time.
Graphs seems like a good choice, better go read up on a few things. I'd also like to thank you for running these. Kirby says he does more programming on weekends this way than he would otherwise; I'll go so far as to say it's more fun than most of the things I get to do during the week doing software professionally these days. It's good to encounter problems unlike the typical ones you see every day.
peti29
Lives with ko
Posts: 125
Joined: Mon Feb 17, 2014 2:41 am
Rank: KGS 5 kyu
GD Posts: 0
Has thanked: 13 times
Been thanked: 12 times

Re: L19 Programming Problem Championship: Round 2

Post by peti29 »

Solomon wrote: Problem C
That's pretty much the hard part, the DP is simpler:

Code: Select all

memo[i + chunk.balance] = max(memo[i + chunk.balance], memo[i] + chunk.size)
I still don't understand :(
I'm not too familiar with DP. What does 'i' symbolize? What does the above line mean? Why is the range ~= max_number_of_strings * max_length_of_strings?
I briefly thought about some kind of sorting but then I didn't know how to handle when certain items needed to be left out.
User avatar
Solomon
Gosei
Posts: 1848
Joined: Tue Apr 20, 2010 9:21 pm
Rank: AGA 5d
GD Posts: 0
KGS: Capsule 4d
Tygem: 치킨까스 5d
Location: Bellevue, WA
Has thanked: 90 times
Been thanked: 835 times

Re: L19 Programming Problem Championship: Round 2

Post by Solomon »

peti29 wrote:
Solomon wrote: Problem C
That's pretty much the hard part, the DP is simpler:

Code: Select all

memo[i + chunk.balance] = max(memo[i + chunk.balance], memo[i] + chunk.size)
I still don't understand :(
I'm not too familiar with DP. What does 'i' symbolize? What does the above line mean? Why is the range ~= max_number_of_strings * max_length_of_strings?
I briefly thought about some kind of sorting but then I didn't know how to handle when certain items needed to be left out.
The i is the loop variable. I think the best way to show what this memo (memoization table) does is with an example:

Let's say the input we're given are the following:

(((...( [250 ('s, so the balance is +250, and the size is 250]
(((...()...) [200 ('s, followed by 50 )'s, balance +150, size 250]
(((...( [150 ('s, balance +150, size 150]


These are in sorted order as well, and for now to keep things simple we are not considering offset parentheses chunks (so all 3 of the chunks above have an offset value of 0). We have nothing right now, and for our first chunk, memo[chunk.balance] == memo[250] == 0, so we go ahead and add the first chunk to the memo table (memo[250] = 250):

Code: Select all

memo: [ 0, 0, 0, ..., 250,  0,   0,   0, ... ]
        0  1  2       250  251  252  253
What this tells us is that the best choice (based on size) for any combination of chunks (here we just have 1 chunk) that has a balance of +250 has a size of 250. Right now we're just dealing with the positive memo table; if our negative memo table also had a value greater than 0 in index 250, then the answer would be that value + 250 (e.g., if we had processed ()))...) [1 ( and 251 )'s, or a balance of -250 and size 252], the negative memo table would have a value of 252 in index 250, then our answer would be 250 + 252 = 502).

Anyways, so the next chunk we have is (((...()...) [200 ('s, followed by 50 )'s, balance +150, size 250]. We iterate through the memo table from the end and any time we find a value > 0 at index i, we check to see if it's worth adding this chunk or not. That's this part of the code:

Code: Select all

memo[i + chunk.balance] = max(
    memo[i + chunk.balance], // don't bother adding the chunk
    memo[i] + chunk.size)    // add the chunk
)
In this case we definitely should, because memo[250 + 150] == memo[400] == 0. And like the first time, we check memo[chunk.balance] as well (which also == 0, so we update that too) so now our memo table looks like this:

Code: Select all

memo: [ 0, 0, 0, ..., 250, ..., 250, ..., 500, ...]
        0  1  2       150       250       400
You can start to see why the memo table needs to be as large as 300 * 300 now, right? We're only 2 chunks in and we're already up to index 400 (could have gone up to 600 at this point, or 300 * 2, and we can have up to 300 chunks).

Now for the last chunk (((...( [150 ('s, balance +150, size 150]. First index we hit since we're going from the end is i = 400. memo[i + chunk.balance] == memo[400 + 150] == memo[550] == 0, so that's updated to 500 + 150 = 650. But now, when we hit the next index i = 250, we see that memo[i + chunk.balance] == memo[250 + 150] == memo[400] == 500. In this case,

Code: Select all

max(memo[i + chunk.balance], memo[i] + chunk.size)) == max(500, 250 + 150) == max(500, 400) == 500
So this time we don't use the chunk. What does this mean? For sake of simplicity, let's say the only chunks we have that are negatively balanced are ))...) [200 )'s] and ))...) [200 )'s again], so we know the negative memo table's value at index 400 is 400. Look again at the choices we have for our positively balanced chunks. Clearly, it is better to utilize only the first two chunks (for a total size of 500 + 400 = 900) and not bother with the third (for a total size of (250 + 150) + 400 = 800 < 900), and that's what this memo table is tracking for us. So at index 400, we keep what we have, and the same goes for index 150 (why use the 3rd chunk when the 2nd chunk has the same positive balance but greater size?).

Sorry this got so long-winded, but that's essentially what this memoized table is doing for us. It's "standard" DP in the sense that this is almost exactly the technique used in solving problems like the Coin Change problem, a classical DP problem.
User avatar
Solomon
Gosei
Posts: 1848
Joined: Tue Apr 20, 2010 9:21 pm
Rank: AGA 5d
GD Posts: 0
KGS: Capsule 4d
Tygem: 치킨까스 5d
Location: Bellevue, WA
Has thanked: 90 times
Been thanked: 835 times

Re: L19 Programming Problem Championship: Round 2

Post by Solomon »

bernds wrote:
Solomon wrote: Problem B
The way I approached this problem was to utilize the preprocessing function from the KMP algorithm (call it kmp_pre).
This is just equivalent to searching for S withing S+S (and ignoring the first match), right? I get the feeling this is how it was intended to be solved. I just went with the factorization. The interesting thing is that your solution looks like it should be more efficient, but runs slightly slower in practice in my tests. This is something I'll probably spend time on to try to understand - is C++ string handling that inefficient?
Indeed, but when I tried something far simpler like:

Code: Select all

while True:
    s = input()
    if s == '.':
        break
    print(len(s) // (s + s).find(s, 1))
It would pass the first 3 test cases, before hitting TLE on the fourth (I wrote the C++ equivalent, and found it would hit TLE even sooner, at the second test case).
User avatar
Solomon
Gosei
Posts: 1848
Joined: Tue Apr 20, 2010 9:21 pm
Rank: AGA 5d
GD Posts: 0
KGS: Capsule 4d
Tygem: 치킨까스 5d
Location: Bellevue, WA
Has thanked: 90 times
Been thanked: 835 times

Re: L19 Programming Problem Championship: Round 2

Post by Solomon »

tj86430 wrote:BTW, is it possible to submit solutions after the round has closed and to get feedback whether the solution would have been accepted? There are some problems I'd like to try, but I don't usually (read: ever) have time to do those within set timeframe. I might try during e.g. summer vacation, though.
Absolutely! Just go to the contest link and open the problems up, you should still be able to submit (assuming you registered an account).
peti29
Lives with ko
Posts: 125
Joined: Mon Feb 17, 2014 2:41 am
Rank: KGS 5 kyu
GD Posts: 0
Has thanked: 13 times
Been thanked: 12 times

Re: L19 Programming Problem Championship: Round 2

Post by peti29 »

Solomon: thank you for the detailed explanation! I almost get it now. Only one thing remains:
How is it guaranteed that we'll use a piece only once?
Say our test case is: '(', '(', ')', ')' then our positive memo table will be [0, 1, 2] and our negative memo table the same,
so when we compare memo table entries we will add 4 for i==2 which is correct, but then we'll add 2 for i==1 resulting in a sum of 6 which is incorrect. :scratch:
Post Reply