Video series: Learning from AlphaGo

gennan · Post by **gennan** » Sun Jan 21, 2018 12:56 pm

I just finished episode #4: 5-space extensions are wrong

https://www.youtube.com/watch?v=OapPVew_stM

gennan · Post by **gennan** » Tue Feb 06, 2018 1:43 pm

Learning from AlphaGo #5: refuting a greedy joseki

https://www.youtube.com/watch?v=5ouxL3G_tIo

daal · Post by **daal** » Wed Feb 07, 2018 3:11 am

I learned something from your last video that I hadn't previously understood. It is your explanation starting around 3:35 pointing out that the kikashi doesn't lose anything and can be treated lightly, because even if black swallows up the white stone, the loss is compensated by the fact that b had been forced to play inside his own territory. Thanks! Keep up the good work, I like your videos a lot.

BTW, why does b always have such a miserable winrate, and is this opening winrate of 47% reflected in the overall results of alphago vs. alphago? If so, it seems a clear indication that komi is wrong, no?

luigi · Post by **luigi** » Wed Feb 07, 2018 5:38 am

daal wrote:BTW, why does b always have such a miserable winrate, and is this opening winrate of 47% reflected in the overall results of alphago vs. alphago? If so, it seems a clear indication that komi is wrong, no?

AlphaGo uses 7.5 komi and Chinese rules. 7 komi (breaking ties with a button) would probably be ideal.

Uberdude · Post by **Uberdude** » Wed Feb 07, 2018 5:53 am

Yes, AG thinks white is better on the empty board with 7.5 komi. But I don't think that 47% number actually means black wins 47% of the million self-play games they've done; I understand it as an ill-defined goodness metric rather than a real probability, see discussion.

If komi was 6.5 (does that even make sense with Chinese, does it need to change in 2s?) it could be black's "win rate" is 55%, so more lopsided than with 7.5; thus 7.5 might be the non-drawing komi which is closest to even which is probably what you want*.

* Unless you go crazy and make a virtual fractional komi: say if 6.5 komi black wins 55% (+5%) and 7.5 komi 47% (-3%), therefore let's have 7 komi and if it's a draw on the board we assign win to black in 3/8 of cases and white in 5/8 of cases so komi is kinda like 7 and one eighth, where the 50% would be on a linear interpolation between our 2 data points.
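The arithmetic behind that "7 and one eighth" can be checked with a short sketch. The two win rates are the hypothetical figures from the post, not measured data, and the function names are made up for illustration. It assumes black's edge over 50% changes linearly between the two komi values, and that giving black a share p of board ties at integer komi K acts like a komi of K + 1/2 - p.

```python
from fractions import Fraction

def balanced_komi(komi_lo, edge_lo, komi_hi, edge_hi):
    """Komi at which black's edge over 50% hits zero, by linear interpolation.

    edge_lo and edge_hi are black's win rate minus 50% at each komi
    (hypothetical inputs, taken from the post above).
    """
    return komi_lo + (komi_hi - komi_lo) * edge_lo / (edge_lo - edge_hi)

def black_tie_share(target_komi, integer_komi):
    """Share of board ties to award black so integer_komi acts like target_komi.

    Awarding all ties to black is equivalent to integer_komi - 1/2,
    awarding none is equivalent to integer_komi + 1/2.
    """
    return integer_komi + Fraction(1, 2) - target_komi

komi = balanced_komi(Fraction(13, 2), Fraction(5, 100),
                     Fraction(15, 2), Fraction(-3, 100))
share = black_tie_share(komi, 7)
print(komi, share)  # 57/8 (= 7.125) and 3/8
```

So the 3/8 versus 5/8 split follows directly from the two assumed data points; with different win rates the split would change.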

djhbrown · Post by **djhbrown** » Wed Feb 07, 2018 5:59 am

luigi wrote:[...komi

heading off-topic here, so let's take it there:
https://www.lifein19x19.com/viewtopic.p ... 22#p227322

luigi · Post by **luigi** » Wed Feb 07, 2018 6:33 am

Uberdude wrote:7 komi and if it's a draw on the board we assign win to black in 3/8 of cases and white in 5/8 of cases so komi is kinda like 7 and one eighth

Well, with the button, if it's a draw on the board, each color is assigned the win in 1/2 of cases.

gennan · Post by **gennan** » Wed Feb 07, 2018 12:27 pm

It seems that 47% winrate for black with an empty board at 7.5 komi is even optimistic. From the 50 published self-play games Master vs Master, black only won 12 (24%). Also, in AlphaGo's opening database, you gradually see black's winrate drop as the game progresses, even when black plays the best moves according to AlphaGo. Usually black's winrate drops to about 43% by move 30. So as the game progresses, AlphaGo seems to become more certain that 7.5 komi is too much.
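For a rough sense of how far 12 wins in 50 games is from a 47% win rate, a binomial tail can be computed. This is only a sketch: it assumes the 50 games are independent and an unbiased sample, which may not hold for a published selection, and `binom_cdf` is a helper written here for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Chance of black winning 12 or fewer of 50 games if the true rate were 47%.
tail_47 = binom_cdf(12, 50, 0.47)
print(f"P(black wins <= 12 of 50 | p = 0.47) = {tail_47:.5f}")
```

Under those assumptions the tail probability comes out well below 1%, so either the true rate in these games was lower than 47% or the sample is not representative.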

I think that mathematically perfect komi should be an integer. And because of the increasingly bad odds that black seems to have in AlphaGo's evaluation and self-play game results (using 7.5 komi), my guess would be that perfect komi is 6 rather than 7.

But for human (imperfect) games, I prefer a komi that prevents jigo. So perhaps 6.5 komi would be a fair approximation with Japanese rules. With Chinese rules, almost all games end with an odd score difference on the board. So in practice, black needs 7 points more on the board to win with 5.5, 6 or 6.5 komi. But black needs 9 points more on the board to win with 7.5 komi. (note that even though even score differences on the board are rare with Chinese rules, it is possible that the perfect game has it).
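The 7-versus-9 point can be sketched in a few lines. The function name and the `odd_only` flag are made up for illustration; `odd_only=True` models the simplification that board margins under Chinese rules are always odd, which the post notes is usual but not guaranteed.

```python
import math

def min_winning_margin(komi, odd_only=True):
    """Smallest board margin (black minus white) that strictly exceeds komi.

    odd_only=True restricts margins to odd numbers, a simplification of
    area scoring on a 19x19 board without seki.
    """
    margin = math.floor(komi) + 1  # smallest integer strictly above komi
    if odd_only and margin % 2 == 0:
        margin += 1                # skip the even margin
    return margin

for komi in (5.5, 6, 6.5, 7.5):
    print(komi, min_winning_margin(komi))  # 7, 7, 7, 9
```

With `odd_only=False`, 7.5 komi gives a minimum margin of 8, which is why the half-point steps matter more under territory scoring.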

luigi · Post by **luigi** » Wed Feb 07, 2018 1:04 pm

gennan wrote:It seems that 47% winrate for black with an empty board at 7.5 komi is even optimistic. From the 50 published self-play games Master vs Master, black only won 12 (24%). Also, in AlphaGo's opening database, you gradually see black's winrate drop as the game progresses, even when black plays the best moves according to AlphaGo. Usually black's winrate drops to about 43% by move 30. So as the game progresses, AlphaGo seems to become more certain that 7.5 komi is too much.

That's only natural. If AlphaGo played perfectly, it would only display 0% and 100% win rates. It gets closer to perfection as the game progresses (of course, it's never going to display 50% right before passing).

Also, the 50 published games are cherry-picked.

gennan · Post by **gennan** » Wed Feb 07, 2018 1:49 pm

luigi wrote:
gennan wrote:It seems that 47% winrate for black with an empty board at 7.5 komi is even optimistic. From the 50 published self-play games Master vs Master, black only won 12 (24%). Also, in AlphaGo's opening database, you gradually see black's winrate drop as the game progresses, even when black plays the best moves according to AlphaGo. Usually black's winrate drops to about 43% by move 30. So as the game progresses, AlphaGo seems to become more certain that 7.5 komi is too much.
That's only natural. If AlphaGo played perfectly, it would only display 0% and 100% win rates. It gets closer to perfection as the game progresses (of course, it's never going to display 50% right before passing).

Also, the 50 published games are cherry-picked.

Do you think that DeepMind cherry-picked mostly white wins, but in fact unpublished wins had an even color distribution?

Suppose that AlphaGo became so strong that it would evaluate 0% or 100% black winrate for an empty board with 7.5 komi. Will it be 0% or 100%?

My guess is it would be 0%.

But with an integer komi, there can be a komi value where perfect players would always get a jigo. The correct evaluation of an empty board with that komi value would be 50%.

luigi · Post by **luigi** » Thu Feb 08, 2018 3:40 am

gennan wrote:Do you think that DeepMind cherry-picked mostly white wins, but in fact unpublished wins had an even color distribution?

DeepMind's Julian Schrittwieser said:

"In my experience and the experiments we've run, komi 7.5 is very balanced, we only observe a slightly higher winrate for white (55%)."

https://www.reddit.com/r/MachineLearnin ... r/doljugm/

gennan · Post by **gennan** » Thu Feb 08, 2018 11:59 am

luigi wrote:
gennan wrote:Do you think that DeepMind cherry-picked mostly white wins, but in fact unpublished wins had an even color distribution?
DeepMind's Julian Schrittwieser said:

"In my experience and the experiments we've run, komi 7.5 is very balanced, we only observe a slightly higher winrate for white (55%)."

https://www.reddit.com/r/MachineLearnin ... r/doljugm/

Thanks, I did not know that statement. 45% winrate for black is much closer to 50% than 24%. Still, 45% does favor white enough to suggest a komi error of about 1.5 points if it were statistics from human pro games.

I tried to find more details about that statement, but besides some quotes, I couldn't find any. For example, did DeepMind use the same time settings for the published and unpublished games?

DeepMind used 10 minutes per position to generate the learning tool winrates, which is much longer than the time limits used in self-play games AFAIK. I can imagine that with 10 minutes per move in self-play games, black would win less than 45% (tending to follow the winrate trends in the learning tool more closely), because the moves would be closer to perfect and thus black would get fewer chances to upset white's lead, making the effect of oversized komi more pronounced.

djhbrown · Post by **djhbrown** » Thu Feb 08, 2018 4:49 pm

gennan wrote:But with an integer komi, there can be a komi value where perfect players would always get a jigo.

That's not logical - the only thing that's sure is that there can be a komi value where imperfect players would sometimes get a jigo. i would imagine that DM experimented with 6.5 and 7.5 and found that 7.5 was closer to 50%. It's entirely possible that 7.5 is closer to 50% than 7.

But one thing is for sure: if anyone ever learns anything from A0, it won't be anything to do with komi.

To me, the most fascinating thing is the markedly different styles of A0 and Master - but of course, i am biased like hell, because A0's honte style is more like Swim's than Master's

gennan · Post by **gennan** » Fri Feb 09, 2018 3:23 am

djhbrown wrote:i would imagine that DM experimented with 6.5 and 7.5 and found that 7.5 was closer to 50%.

To do that DeepMind would have had to raise different instances of AlphaGo for different komi values. I think they would have mentioned it and reported their findings if they had done that. But I don't think they did, because it would have multiplied the already expensive hardware costs.

djhbrown wrote:But one thing is for sure: if anyone ever learns anything from A0, it won't be anything to do with komi.

I think the komi value is quite important at a high level of play. The move choice in a situation depends not only on possibilities on the board, it also depends on who is leading. When behind (even by a little bit), strong players try to reverse the game with risky moves. When ahead, strong players try to consolidate their lead with safe moves.

The stronger the players, the more accurate their awareness of who is leading in a particular situation, and in close games (as games between evenly matched strong players tend to be) komi size is an important factor in this evaluation. Even the first moves in a game between strong players are determined by komi size. With a large komi, black tends to play a speedy opening and white tends to play a steady opening. The reason is that strong players feel that black cannot afford a steady opening if the komi is as large as 7 points. But in the classical era before komi, black played a steady opening and white played a speedy opening. So strong players feel komi size throughout the game and it affects their moves right from the start.

So I would think that komi size is also quite relevant for the moves that AlphaGo recommends in a particular situation. If you watch Michael Redmond's reviews, you'll find a recurring theme in AlphaGo vs AlphaGo games: black AlphaGo shows signs of desperation already in the middle game. Black AlphaGo tends to turn a close game into a complicated fight and eventually collapses against white AlphaGo. The cause for black's choice seems to be that black AlphaGo realizes that he will fall just short of covering the full komi. If the komi were one point less (so black only needs 7 points more on the board to win instead of 9, with Chinese rules), it is quite possible that black AlphaGo would choose a safer continuation.

If AlphaGo were trained for no komi games, I think it too would play quite differently. So komi may seem irrelevant for what we can learn from AlphaGo, but indirectly it affects what AlphaGo tells us to do.

djhbrown · Post by **djhbrown** » Fri Feb 09, 2018 4:28 am

gennan wrote:Black AlphaGo tends to turn a close game into a complicated fight and eventually collapses against white AlphaGo.... it affects what AlphaGo tells us to do.

i don't want to be a pedantist, but please can you clarify which version of Alfie you refer to? My guess is it's Master v Master, which i regard as being two red herrings squabbling over random rollouts.

i'm no judge, but it does look to me like Alfie0 is calm and collected, whichever colour she takes.

Life In 19x19
