Video series: Learning from AlphaGo
-
gennan
- Lives in gote
- Posts: 497
- Joined: Fri Sep 22, 2017 2:08 am
- Rank: EGF 3d
- GD Posts: 0
- Universal go server handle: gennan
- Location: Netherlands
- Has thanked: 273 times
- Been thanked: 147 times
Re: Video series: Learning from AlphaGo
I just finished episode #4: 5-space extensions are wrong
https://www.youtube.com/watch?v=OapPVew_stM
- daal
- Oza
- Posts: 2508
- Joined: Wed Apr 21, 2010 1:30 am
- GD Posts: 0
- Has thanked: 1304 times
- Been thanked: 1128 times
Re: Video series: Learning from AlphaGo
I learned something from your last video that I hadn't previously understood. It is your explanation starting around 3:35 pointing out that the kikashi doesn't lose anything and can be treated lightly, because even if black swallows up the white stone, the loss is compensated by the fact that b had been forced to play inside his own territory. Thanks! Keep up the good work, I like your videos a lot.
BTW, why does b always have such a miserable winrate, and is this opening winrate of 47% reflected in the overall results of alphago vs. alphago? If so, it seems a clear indication that komi is wrong, no?
Patience, grasshopper.
-
luigi
- Lives in gote
- Posts: 352
- Joined: Wed Jul 06, 2011 12:01 pm
- Rank: Low
- GD Posts: 0
- Location: Spain
- Has thanked: 181 times
- Been thanked: 41 times
Re: Video series: Learning from AlphaGo
daal wrote: BTW, why does b always have such a miserable winrate, and is this opening winrate of 47% reflected in the overall results of alphago vs. alphago? If so, it seems a clear indication that komi is wrong, no?
AlphaGo uses 7.5 komi and Chinese rules. 7 komi (breaking ties with a button) would probably be ideal.
-
Uberdude
- Judan
- Posts: 6727
- Joined: Thu Nov 24, 2011 11:35 am
- Rank: UK 4 dan
- GD Posts: 0
- KGS: Uberdude 4d
- OGS: Uberdude 7d
- Location: Cambridge, UK
- Has thanked: 436 times
- Been thanked: 3718 times
Re: Video series: Learning from AlphaGo
Yes, AG thinks white is better on the empty board with 7.5 komi. But I don't think that 47% number actually means black wins 47% of the million self-play games they've done; I understand it as an ill-defined goodness metric rather than a real probability, see discussion.
If komi was 6.5 (does that even make sense with Chinese, does it need to change in 2s?) it could be black's "win rate" is 55%, so more lopsided than with 7.5; thus 7.5 might be the non-drawing komi which is closest to even which is probably what you want*.
* Unless you go crazy and make a virtual fractional komi: say if 6.5 komi black wins 55% (+5%) and 7.5 komi 47% (-3%), therefore let's have 7 komi and if it's a draw on the board we assign win to black in 3/8 of cases and white in 5/8 of cases so komi is kinda like 7 and one eighth, where the 50% would be on a linear interpolation between our 2 data points.
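The arithmetic behind that "7 and one eighth" can be checked with a short Python sketch. The two win rates are the hypothetical figures from the post, not measured data, and the function names `interpolated_fair_komi` and `black_share_of_draws` are made up for illustration. It also assumes that win rate changes linearly with komi between the two points, which is only a guess.

```python
from fractions import Fraction


def interpolated_fair_komi(komi_lo, black_lo, komi_hi, black_hi):
    """Komi at which a straight line through the two points crosses 50%."""
    half = Fraction(1, 2)
    return komi_lo + (black_lo - half) / (black_lo - black_hi) * (komi_hi - komi_lo)


def black_share_of_draws(effective_komi, integer_komi):
    """Share of board draws to give black so integer_komi acts like effective_komi.

    Giving white every draw acts like integer_komi + 1/2, giving black every
    draw acts like integer_komi - 1/2; everything in between is treated as
    linear (an assumption).
    """
    return integer_komi + Fraction(1, 2) - effective_komi


# Hypothetical data points from the post: 55% at 6.5 komi, 47% at 7.5 komi.
fair = interpolated_fair_komi(Fraction(13, 2), Fraction(55, 100),
                              Fraction(15, 2), Fraction(47, 100))
share = black_share_of_draws(fair, 7)
print(fair, share)  # 57/8 3/8
```

So the interpolated komi is 57/8 = 7.125, and with 7 komi on the board black would be given 3/8 of the draws and white 5/8, matching the numbers above.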
- djhbrown
- Lives in gote
- Posts: 392
- Joined: Tue Sep 15, 2015 5:00 pm
- Rank: NR
- GD Posts: 0
- Has thanked: 23 times
- Been thanked: 43 times
Re: Video series: Learning from AlphaGo
luigi wrote: [...komi
heading off-topic here, so let's take it there:
https://www.lifein19x19.com/viewtopic.p ... 22#p227322
i shrink, therefore i swarm
-
luigi
- Lives in gote
- Posts: 352
- Joined: Wed Jul 06, 2011 12:01 pm
- Rank: Low
- GD Posts: 0
- Location: Spain
- Has thanked: 181 times
- Been thanked: 41 times
Re: Video series: Learning from AlphaGo
Uberdude wrote: 7 komi and if it's a draw on the board we assign win to black in 3/8 of cases and white in 5/8 of cases so komi is kinda like 7 and one eighth
Well, with the button, if it's a draw on the board, each color is assigned the win in 1/2 of cases.
-
gennan
- Lives in gote
- Posts: 497
- Joined: Fri Sep 22, 2017 2:08 am
- Rank: EGF 3d
- GD Posts: 0
- Universal go server handle: gennan
- Location: Netherlands
- Has thanked: 273 times
- Been thanked: 147 times
Re: Video series: Learning from AlphaGo
It seems that a 47% winrate for black on an empty board at 7.5 komi is even optimistic. Of the 50 published Master vs Master self-play games, black won only 12 (24%). Also, in AlphaGo's opening database, you see black's winrate drop gradually as the game progresses, even when black plays the best moves according to AlphaGo. Usually black's winrate drops to about 43% by move 30. So as the game progresses, AlphaGo seems to become more certain that 7.5 komi is too much.
I think that mathematically perfect komi should be an integer. And because of the increasingly bad odds that black seems to have in AlphaGo's evaluation and self-play game results (using 7.5 komi), my guess would be that perfect komi is 6 rather than 7.
But for human (imperfect) games, I prefer a komi that prevents jigo. So perhaps 6.5 komi would be a fair approximation with Japanese rules. With Chinese rules, almost all games end with an odd score difference on the board. So in practice, black needs 7 points more on the board to win with 5.5, 6 or 6.5 komi. But black needs 9 points more on the board to win with 7.5 komi. (note that even though even score differences on the board are rare with Chinese rules, it is possible that the perfect game has it).
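The 7-versus-9 arithmetic can be spelled out in a few lines of Python. This is a minimal sketch under the post's own simplification that the score difference on the board is always odd under Chinese rules; the function name `min_board_margin_to_win` is invented for illustration.

```python
import math


def min_board_margin_to_win(komi, odd_only=True):
    """Smallest board margin for black that strictly exceeds komi.

    With odd_only=True, only odd margins are considered, which is the
    simplifying assumption made in the post for Chinese rules.
    """
    margin = math.floor(komi) + 1  # smallest integer strictly greater than komi
    if odd_only and margin % 2 == 0:
        margin += 1  # skip the even margin, which is assumed not to occur
    return margin


for komi in (5.5, 6, 6.5, 7.5):
    print(komi, min_board_margin_to_win(komi))
# 5.5 7
# 6 7
# 6.5 7
# 7.5 9
```

Under that assumption, 5.5, 6 and 6.5 komi are all equivalent for deciding the winner, and the step to 7.5 costs black two full points on the board.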
-
luigi
- Lives in gote
- Posts: 352
- Joined: Wed Jul 06, 2011 12:01 pm
- Rank: Low
- GD Posts: 0
- Location: Spain
- Has thanked: 181 times
- Been thanked: 41 times
Re: Video series: Learning from AlphaGo
gennan wrote: It seems that 47% winrate for black with an empty board at 7.5 komi is even optimistic. From the 50 published self-play games Master vs Master, black only won 12 (24%). Also, in AlphaGo's opening database, you gradually see black's winrate drop as the game progresses, even when black plays the best moves according to AlphaGo. Usually black's winrate drops to about 43% by move 30. So as the game progresses, AlphaGo seems to become more certain that 7.5 komi is too much.
That's only natural. If AlphaGo played perfectly, it would only display 0% and 100% win rates. It gets closer to perfection as the game progresses (of course, it's never going to display 50% right before passing).
Also, the 50 published games are cherry-picked.
-
gennan
- Lives in gote
- Posts: 497
- Joined: Fri Sep 22, 2017 2:08 am
- Rank: EGF 3d
- GD Posts: 0
- Universal go server handle: gennan
- Location: Netherlands
- Has thanked: 273 times
- Been thanked: 147 times
Re: Video series: Learning from AlphaGo
luigi wrote: That's only natural. If AlphaGo played perfectly, it would only display 0% and 100% win rates. It gets closer to perfection as the game progresses (of course, it's never going to display 50% right before passing). Also, the 50 published games are cherry-picked.
Do you think that DeepMind cherry-picked mostly white wins, but in fact unpublished games had an even color distribution?
Suppose that AlphaGo became so strong that it would evaluate 0% or 100% black winrate for an empty board with 7.5 komi. Will it be 0% or 100%?
My guess is it would be 0%.
But with an integer komi, there can be a komi value where perfect players would always get a jigo. The correct evaluation of an empty board with that komi value would be 50%.
-
luigi
- Lives in gote
- Posts: 352
- Joined: Wed Jul 06, 2011 12:01 pm
- Rank: Low
- GD Posts: 0
- Location: Spain
- Has thanked: 181 times
- Been thanked: 41 times
Re: Video series: Learning from AlphaGo
gennan wrote: Do you think that DeepMind cherry-picked mostly white wins, but in fact unpublished games had an even color distribution?
DeepMind's Julian Schrittwieser said:
"In my experience and the experiments we've run, komi 7.5 is very balanced, we only observe a slightly higher winrate for white (55%)."
https://www.reddit.com/r/MachineLearnin ... r/doljugm/
-
gennan
- Lives in gote
- Posts: 497
- Joined: Fri Sep 22, 2017 2:08 am
- Rank: EGF 3d
- GD Posts: 0
- Universal go server handle: gennan
- Location: Netherlands
- Has thanked: 273 times
- Been thanked: 147 times
Re: Video series: Learning from AlphaGo
luigi wrote: DeepMind's Julian Schrittwieser said: "In my experience and the experiments we've run, komi 7.5 is very balanced, we only observe a slightly higher winrate for white (55%)." https://www.reddit.com/r/MachineLearnin ... r/doljugm/
Thanks, I did not know about that statement. A 45% winrate for black is much closer to 50% than 24%. Still, 45% does favor white enough to suggest a komi error of about 1.5 points if it were statistics from human pro games.
I tried to find more details about that statement, but besides some quotes, I couldn't find any. For example, did DeepMind use the same time settings for the published and unpublished games?
DeepMind used 10 minutes per position to generate the learning tool winrates, which is much longer than the time limits used in self-play games AFAIK. I can imagine that with 10 minutes per move in self-play games, black would win less than 45% (following the winrate trends in the learning tool more closely), because the moves would be closer to perfect and black would get fewer chances to upset white's lead, making the effect of oversized komi more pronounced.
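For anyone who wants to compare the 12 wins out of 50 with a 45% or 47% winrate, here is a rough Python sketch of the binomial tail. It assumes the 50 games are independent and an unbiased sample, which is exactly what is in doubt if they were cherry-picked, so treat it as a back-of-the-envelope check rather than evidence. The function name `prob_at_most` is made up for illustration.

```python
from math import comb


def prob_at_most(wins, games, p):
    """Probability of at most `wins` successes in `games` independent trials,
    each succeeding with probability p (plain binomial tail)."""
    return sum(comb(games, k) * p**k * (1 - p) ** (games - k)
               for k in range(wins + 1))


# Assumed "true" black winrates: 47% (learning tool) and 45% (DeepMind quote).
for p in (0.47, 0.45):
    print(p, prob_at_most(12, 50, p))
```

If the tail probability comes out small, then either the published sample is not representative or the winrate in those games really was lower than 45%; the calculation alone cannot tell which.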
- djhbrown
- Lives in gote
- Posts: 392
- Joined: Tue Sep 15, 2015 5:00 pm
- Rank: NR
- GD Posts: 0
- Has thanked: 23 times
- Been thanked: 43 times
Re: Video series: Learning from AlphaGo
gennan wrote: But with an integer komi, there can be a komi value where perfect players would always get a jigo.
That's not logical - the only thing that's sure is that there can be a komi value where imperfect players would sometimes get a jigo. i would imagine that DM experimented with 6.5 and 7.5 and found that 7.5 was closer to 50%. It's entirely possible that 7.5 is closer to 50% than 7.
But one thing is for sure: if anyone ever learns anything from A0, it won't be anything to do with komi.
To me, the most fascinating thing is the markedly different styles of A0 and Master - but of course, i am biased like hell, because A0's honte style is more like Swim's than Master's
i shrink, therefore i swarm
-
gennan
- Lives in gote
- Posts: 497
- Joined: Fri Sep 22, 2017 2:08 am
- Rank: EGF 3d
- GD Posts: 0
- Universal go server handle: gennan
- Location: Netherlands
- Has thanked: 273 times
- Been thanked: 147 times
Re: Video series: Learning from AlphaGo
djhbrown wrote: i would imagine that DM experimented with 6.5 and 7.5 and found that 7.5 was closer to 50%.
To do that, DeepMind would have had to train separate instances of AlphaGo for different komi values. I think they would have mentioned it and reported their findings if they had. But I don't think they did, because it would have multiplied the already expensive hardware costs.
djhbrown wrote: But one thing is for sure: if anyone ever learns anything from A0, it won't be anything to do with komi.
I think the komi value is quite important at a high level of play. The move choice in a situation depends not only on the possibilities on the board but also on who is leading. When behind (even by a little bit), strong players try to reverse the game with risky moves. When ahead, strong players try to consolidate their lead with safe moves.
The stronger the players, the more accurate their awareness of who is leading in a particular situation. In close games (as games between evenly matched strong players tend to be), komi size is an important factor in this evaluation. Even the first moves in a game between strong players are determined by komi size. With a large komi, black tends to play a speedy opening and white tends to play a steady opening. The reason is that strong players feel that black cannot afford a steady opening if the komi is as large as 7 points. But in the classical era before komi, black played a steady opening and white played a speedy opening. So strong players feel komi size throughout the game and it affects their moves right from the start.
So I would think that komi size is also quite relevant for the moves that AlphaGo recommends in a particular situation. If you watch Michael Redmond's reviews, you'll find a recurring theme in AlphaGo vs AlphaGo games: black AlphaGo already shows signs of desperation in the middle game. Black AlphaGo tends to turn a close game into a complicated fight and eventually collapses against white AlphaGo. The cause of black's choice seems to be that black AlphaGo realizes that he will fall just short of covering the full komi. If the komi were one point less (so black only needs 7 points more on the board to win instead of 9, with Chinese rules), it is quite possible that black AlphaGo would choose a safer continuation.
If AlphaGo were trained for no-komi games, I think it too would play quite differently. So komi may seem irrelevant for what we can learn from AlphaGo, but indirectly it affects what AlphaGo tells us to do.
- djhbrown
- Lives in gote
- Posts: 392
- Joined: Tue Sep 15, 2015 5:00 pm
- Rank: NR
- GD Posts: 0
- Has thanked: 23 times
- Been thanked: 43 times
Re: Video series: Learning from AlphaGo
gennan wrote: Black AlphaGo tends to turn a close game into a complicated fight and eventually collapses against white AlphaGo.... it affects what AlphaGo tells us to do.
i don't want to be a pedantist, but please can you clarify which version of Alfie you refer to? My guess is it's Master v Master, which i regard as being two red herrings squabbling over random rollouts.
i'm no judge, but it does look to me like Alfie0 is calm and collected, whichever colour she takes.
i shrink, therefore i swarm