AlphaGo vs. AlphaGo: 50 Self-Play Games (May 2017)

ez4u · Post by **ez4u** » Thu Jun 01, 2017 1:10 am

Meanwhile, on the original topic of the release of 50 self-play games (out of a hundred million or so): I cannot help but think of one of Khalil Gibran's aphorisms in Sand and Foam, "How mean am I when life gives me gold and I give you silver, and yet I deem myself generous."

AlphaGo forever!

DeepMind and Google, not so much.

Uberdude · Post by **Uberdude** » Thu Jun 01, 2017 1:59 am

ez4u wrote:Meanwhile on the original topic of the release of 50 self-played games (out of a hundred million or so).

They said these 50 were self-play with long time limits, so presumably stronger play, whereas the millions of games used for training the networks were faster and thus potentially weaker and embarrassing. Nevertheless, it sure would have been nice to see some 3-stone handicap games between the AlphaGo Lee and Master versions (Aja confirmed on Facebook that they actually did play with 3 stones; it wasn't a proxy for win percentage).

P.S. Inseong Hwang 8d has made a nice video lecture taking a quick look at some of the interesting moves in these games, particularly a lot of attachments: https://www.youtube.com/watch?v=iO_JmGH8Iu8

ez4u · Post by **ez4u** » Thu Jun 01, 2017 3:54 am

Ask yourself how likely it is that the program played a hundred million silly, embarrassing games against itself and then suddenly transcended to the AlphaGo that we see today.

Alternatively, consider the early 3-3 invasion. How many thousands (millions?) of games must they be holding that illustrate all the alternatives that both sides have tried both for and against the play? Not to mention any indication of what it did not try!

Thx, I'll have to have a look at Inseong's videos.

Uberdude · Post by **Uberdude** » Thu Jun 01, 2017 5:14 am

ez4u wrote:Ask yourself how likely it is that the program played a hundred million silly, embarrassing games against itself and then suddenly transcended to the AlphaGo that we see today.

Yeah, I don't really buy that argument myself

lightvector · Post by **lightvector** » Thu Jun 01, 2017 6:04 am

A good rule of thumb from computer chess, as I seem to recall, is 30 Elo points per doubling of computing power, although as you get closer to the top I think that shrinks a bit. I don't actually know whether that's exactly right any more, and for AlphaGo this is heavily complicated by the fact that you need both CPU and GPU, and whichever one is more constrained becomes the limiting factor. Anyway, 30 Elo points per doubling looks very roughly consistent with DeepMind's original Nature paper showing how earlier versions of AlphaGo scaled with computation.

If the newer versions of AlphaGo don't scale terribly differently, then, ignoring things like fixed per-search overheads and other issues with this kind of extrapolation, back-of-envelope calculations suggest that AlphaGo at 1 second/move should still be at least at the level of a top pro who has a normal slow-game amount of thinking time, and actually likely still a bit beyond it, but possibly not enough to win every time any more against a human who has very long thinking time.

(Evidence from the Master games, even taking into account the fact that they were blitz and therefore disadvantageous for humans, puts a pretty confident floor of about 3800 on Master's possible single-machine Elo rating. It's more likely a couple hundred points higher than this absolute floor, and coupled with the improvement since the Master games, you can easily drop a few hundred rating points in the process of restricting to 1s/move and still end up higher than top pros like Ke Jie, who is around 3600: https://www.goratings.org/en/).
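For anyone who wants to play with the numbers, here is a rough sketch of that back-of-envelope estimate. It takes the 30 Elo per doubling rule of thumb, the 3800 floor and the roughly 3600 rating for Ke Jie from the post above. The 60 seconds/move baseline is purely a made-up assumption for illustration, the function names are hypothetical, and the win-probability formula is just the standard logistic Elo curve, not anything DeepMind has published.

```python
import math

ELO_PER_DOUBLING = 30.0   # rule of thumb quoted in the post
MASTER_FLOOR_ELO = 3800   # "pretty confident floor" from the post
KE_JIE_ELO = 3600         # approximate goratings.org figure from the post
ASSUMED_BASE_SECONDS = 60.0  # ASSUMPTION: hypothetical baseline time per move


def elo_change(time_ratio, elo_per_doubling=ELO_PER_DOUBLING):
    """Elo gained (or lost, if negative) when thinking time is scaled by time_ratio."""
    return elo_per_doubling * math.log2(time_ratio)


def expected_score(elo_diff):
    """Standard logistic Elo expectation for the side that is elo_diff points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Elo lost by cutting the assumed baseline down to 1 second/move.
drop = -elo_change(1.0 / ASSUMED_BASE_SECONDS)
remaining = MASTER_FLOOR_ELO - drop

print(f"doublings lost: {math.log2(ASSUMED_BASE_SECONDS):.2f}")
print(f"estimated Elo drop: {drop:.0f}")
print(f"floor rating at 1 s/move: {remaining:.0f}")
print(f"expected score vs. a {KE_JIE_ELO}-rated player: "
      f"{expected_score(remaining - KE_JIE_ELO):.2f}")
```

Changing `ASSUMED_BASE_SECONDS` or `ELO_PER_DOUBLING` shows how sensitive the conclusion is to those guesses. As noted above, the real scaling probably is not a clean constant per doubling.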

pookpooi · Post by **pookpooi** » Thu Jun 01, 2017 7:22 am

I wonder whether DeepMind limited the release to 50 games instead of millions because other AIs might use such a large set of records as a springboard. They want FineArt and DeepZenGo to struggle through the plateau on their own until the new paper is released. Before revealing the solution, a teacher has to make sure that the students really can't answer the question.

billyswong · Post by **billyswong** » Sat Jun 03, 2017 11:49 pm

Hi, I am new here. Could anyone help me with game 20? Its official result is B+resign. However, when I try to play out the endgame moves to see how close things are (inside the DeepMind website), it ends up showing "White wins by 0.5 points". Did I play some moves wrong?

Uberdude · Post by **Uberdude** » Sun Jun 04, 2017 12:20 am

billyswong wrote:Hi, I am new here. May anyone help me on the game 20? Its official result is B+resign. However, when I try to play out the end game moves and see how close things are (inside the deepmind website), it ends up at "White wins by 0.5 points" showing in front of me. Is it that I played some moves wrong?

Edit: wrong, I missed White's problem in the centre.
Very good point. I think in fact it should be White wins by 1.5, because he can get 2 of the 3 available gote dame: the 2 on the left are simple sente atari dame, and on the right side black needs 2 defensive moves inside so white gets both those dame. Did white AlphaGo not realise this and thus resigned a won game?! I also note there are a lot of captures and white has more than black, another possible source of bugs.

billyswong · Post by **billyswong** » Sun Jun 04, 2017 12:53 am

Uberdude wrote:
billyswong wrote:Hi, I am new here. May anyone help me on the game 20? Its official result is B+resign. However, when I try to play out the end game moves and see how close things are (inside the deepmind website), it ends up at "White wins by 0.5 points" showing in front of me. Is it that I played some moves wrong?
Very good point. I think in fact it should be White wins by 1.5, because he can get 2 of the 3 available gote dame: the 2 on the left are simple sente atari dame, and on the right side black needs 2 defensive moves inside. Did white AlphaGo not realise this and thus resigned a won game?! I also note there are a lot of captures and white has more than black, another possible source of bugs.

White also needs one defensive move inside on the left. Thus my W+0.5 result.

But I am starting to wonder whether this is another broken illusion: we often imagine that area scoring and territory scoring are equivalent, but they are not. Can anyone check the area scoring result? It is quite stupid that the applet on the DeepMind website only counts the score the territory way while the company's AI product counts the area way.

johnsmith · Post by **johnsmith** » Sun Jun 04, 2017 12:53 am

Uberdude wrote:
billyswong wrote:Hi, I am new here. May anyone help me on the game 20? Its official result is B+resign. However, when I try to play out the end game moves and see how close things are (inside the deepmind website), it ends up at "White wins by 0.5 points" showing in front of me. Is it that I played some moves wrong?
Very good point. I think in fact it should be White wins by 1.5, because he can get 2 of the 3 available gote dame: the 2 on the left are simple sente atari dame, and on the right side black needs 2 defensive moves inside so white gets both those dame. Did white AlphaGo not realise this and thus resigned a won game?! I also note there are a lot of captures and white has more than black, another possible source of bugs.

Attached is how I see this game played out till the very end. Black wins by 0.5. That's why white resigned and you can see quite often in these 50 games that one resigns because the opponent was leading by a very small margin of 0.5 or 1.5.

Result:
White: 180.25 = 176 (Points) + 1 (Shared) / 2 + 7.5 (Komi) / 2
Black: 180.75 = 184 (Points) + 1 (Shared) / 2 - 7.5 (Komi) / 2
B + 0.25
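The arithmetic in that result can be reproduced with a small sketch of area (Chinese) counting. The point totals and the 7.5 komi are taken from the figures above. The function name is hypothetical, and the reading of "B + 0.25" as a margin in stones (zi) measured against half the board, which corresponds to 0.5 points, is my interpretation of the usual Chinese convention rather than something stated in the post.

```python
BOARD_POINTS = 19 * 19  # 361 intersections on a 19x19 board


def area_score(black_points, white_points, shared_points, komi=7.5):
    """Return (black, white) totals under area counting.

    Shared points are split evenly, and half the komi is subtracted from
    Black and added to White, so the two totals always sum to the board size.
    """
    if black_points + white_points + shared_points != BOARD_POINTS:
        raise ValueError("points do not add up to the full board")
    black = black_points + shared_points / 2 - komi / 2
    white = white_points + shared_points / 2 + komi / 2
    return black, white


# Figures from the result above: 184 for Black, 176 for White, 1 shared.
black, white = area_score(184, 176, 1)

margin_points = black - white             # margin in points
margin_zi = black - BOARD_POINTS / 2      # margin in stones vs. half the board

print(f"Black: {black}, White: {white}")
print(f"Black's margin: {margin_points} points ({margin_zi} zi)")
```

So the attached "B + 0.25" and the statement "Black wins by 0.5" are the same result expressed in two different units.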

Edit: AlphaGO does NEVER make mistakes!

Uberdude · Post by **Uberdude** » Sun Jun 04, 2017 1:01 am

Ah yes, my mistake, not AlphaGo's. I missed White's problem in the middle, so White loses a point there and Black gets the last dame to win by half.

Cassandra · Post by **Cassandra** » Sun Jun 04, 2017 1:05 am

Uberdude wrote:
billyswong wrote:Hi, I am new here. May anyone help me on the game 20? Its official result is B+resign. However, when I try to play out the end game moves and see how close things are (inside the deepmind website), it ends up at "White wins by 0.5 points" showing in front of me. Is it that I played some moves wrong?
Very good point. I think in fact it should be White wins by 1.5, because he can get 2 of the 3 available gote dame: the 2 on the left are simple sente atari dame, and on the right side black needs 2 defensive moves inside so white gets both those dame. Did white AlphaGo not realise this and thus resigned a won game?! I also note there are a lot of captures and white has more than black, another possible source of bugs.

None of this ...

There are only two dame that are not sente (E5, L3), so each side will get one of these.

When applying CHINESE rules (as all the selfplay-games are played under), Black wins by 0.5 points, as he has played the last move.

It seems to me that the interface on deepmind's website uses JAPANESE style for counting (= territory plus prisoners), thus it has a white win by 0.5 points, not considering the dame on the board.

Baywa · Post by **Baywa** » Sun Jun 04, 2017 2:01 am

johnsmith wrote:Black wins by 0.5. That's why white resigned and you can see quite often in these 50 games that one resigns because the opponent was leading by a very small margin of 0.5 or 1.5.

I noticed that, too. For example game #39 is very similar in this regard.

johnsmith wrote:Edit: AlphaGO does NEVER make mistakes!

Seems like it, at least in the endgame. We've seen several examples (e.g. game #1 against Ke Jie). Although AlphaGo "throws away" several points in exchange for a higher win probability (the way it defines it), it has complete control over the outcome from several moves back.

Edit: Actually, in this series of selfplays it could be interesting to see how hard and close the endgames were fought. The close final score may not tell the whole story.

Bill Spight · Post by **Bill Spight** » Sun Jun 04, 2017 3:03 pm

Baywa wrote:
johnsmith wrote:AlphaGO does NEVER make mistakes!
Seems like that, at least for the endgame. We've seen several examples (e.g. game #1 against Ke Jie). Although AlphaGo "throws away" several points in exchange for higher win probability (the way it defines it) it has complete control about the outcome from several moves back.

I doubt if AlphaGo will drop a point in the late endgame, because of its reading ability. However, what it means by win probability is, AFAIK, unknown, even to its developers. (Because it depends in large part on what the evaluation network has learned.)

One advantage that humans have in the endgame is that the game tends to divide up into independent regions of play. Humans can analyze each independent region separately, which can greatly simplify the challenge of reading. We still have to combine play in all regions, but playing in the hottest region is nearly always correct. AlphaGo, by contrast, always builds a whole board game tree, and must explore more branches than humans.

Here is a problem that an amateur dan player should be able to solve, if she has read Mathematical Go. In fact, White's first non-sente move should be obvious.

Can AlphaGo solve it in 45 seconds? Maybe so, but I'll believe it when I see it.

Baywa wrote:Actually, in this series of selfplays it could be interesting to see how hard and close the endgames were fought. The close final score may not tell the whole story.

I have taken a look at the end of game 31. Neither player dropped a point (that I found), and the play of the approach ko in the top left corner was impressive.

However, the game came down to the final 4/3 pt. ko (by area scoring). Go to move B305. Each player dropped a ko threat, and Black allowed White to get a virtual ko threat (W310). Since White won the ko, anyway, none of this affected the result. Black's play may be explained as playing for an error, but I fail to see how White's play (W308) could do anything but lower the probability of winning. (If it had any effect on that at all, OC.)

Since AlphaGo has shown no sign of plateauing, I expect that it still makes mistakes in the opening and middle game. But are humans good enough to find them? Still, it would not surprise me if humans found AlphaGo occasionally dropping a point or two 80-100 moves from the end.

billyswong · Post by **billyswong** » Mon Jun 05, 2017 12:02 am

Bill Spight wrote: Can AlphaGo solve it in 45 seconds? Maybe so, but I'll believe it when I see it.

I am quite sure AlphaGo can do that. Remember those unofficial games played online in the name of "Master" this January?

Wikipedia wrote: All 60 games except one were fast paced games with three 20 or 30 seconds byo-yomi. Master offered to extend the byo-yomi to one minute when playing with Nie Weiping in consideration of his age.

Life In 19x19
