AlphaGo Teach discussion (Go Tool from DeepMind)

For discussing go computing, software announcements, etc.
lightvector
Lives in sente
Posts: 759
Joined: Sat Jun 19, 2010 10:11 pm
Rank: maybe 2d
GD Posts: 0
Has thanked: 114 times
Been thanked: 916 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by lightvector »

I actually think that for mid- to high-dan amateurs there is some usefulness here. I'm below the strength needed to derive full value from these variations, but I browsed through and still learned some things about openings I actually encounter from both sides: the Chinese, the mini-Chinese, the Kobayashi, the san-ren-sei. There are moves I wouldn't even have considered that I can experiment with now that I know they're possible, and moves I would have considered and rejected as bad that AlphaGo instead evaluates as not-bad, with a simple and clear variation that disproves my misconception.

I also think that at the extremes the evaluations are useful at dan level. When they move noticeably outside the 40%-60% range, I find that more often than not I myself feel dissatisfied with the side that supposedly fell behind, and that gives me some confidence that the evals, once they get that extreme, are at least somewhat informative. I wouldn't put huge value on them at amateur dan, but at minimum they are another hundred good data points for improving one's overall sense of direction of play and whole-board judgment, one more drop in the bucket.
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by moha »

Bill Spight wrote:From the web site:
each move’s winning probability was computed by running an independent search of 10 million simulations from that position.
Simulations of what, pray tell?
I think they simply mean the MCTS effort / iterations / node expansions. EDIT: Actually this was still Master, so rollouts as well.
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Bill Spight »

moha wrote:
Bill Spight wrote:From the web site:
each move’s winning probability was computed by running an independent search of 10 million simulations from that position.
Simulations of what, pray tell?
I think they simply mean the MCTS effort / iterations / node expansions. EDIT: Actually this was still Master, so rollouts as well.


If so, the win rates should be perfect for SDKs, eh?
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
John Fairbairn
Oza
Posts: 3724
Joined: Wed Apr 21, 2010 3:09 am
Has thanked: 20 times
Been thanked: 4672 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by John Fairbairn »

The wording of the announcement is unclear as regards selection of "major" variations or indeed what is meant by "opening" (fuseki or joseki - I infer the former).

But insofar as I can make sense of it, the results don't seem to equate with pro tournament practice and the human moves shown must therefore be from a mix of pro and ama games but mostly ama.

If I follow the first two recommended moves (R16 then Q4) DeepMind gives 11 human moves: C17, C16, C15, C4, D17, D16, D4, D3, E17, O3 and P17.

But doing a similar exercise on a pro tournament database (approx. 90,000 games) gives a quite different result, with 21 moves tried by humans. Human pros did not try the O3 given above, but did try the rest plus C5, C3, D15, D14, D5, E16, E4, E3, O17, P16 and R6.

The AG win rates are potentially interesting (but how much different from pro win rates as shown by Kombilo?), but if they include many wins against amateurs (and/or internet blitz games) they may be somewhat spurious. It would be nice if Nature could be cajoled into including a properly versed go player as one of their referees.

Just offering users the chance to find plays they have never considered before and thus be "creative" (sounds like an AI definition of the word) is not much of a selling point given the wealth of pro games around, and specifically those that were experimental such as New Fuseki, that already do that.
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by moha »

Bill Spight wrote:
moha wrote:I think they simply mean the MCTS effort / iterations / node expansions. EDIT: Actually this was still Master, so rollouts as well.
If so, the win rates should be perfect for SDKs, eh?
Master itself did OK, looking only at these numbers, didn't it? :)

John Fairbairn wrote:The AG win rates are potentially interesting (but how much different from pro win rates as shown by Kombilo?), but if they include many wins against amateurs (and/or internet blitz games)
Again, I think these are only probabilities, estimated by AG with a deep search (and rollouts with the reduced policy net). So there should be some correlation with actual pro data, but significant differences as well.
Baywa
Dies in gote
Posts: 39
Joined: Wed Feb 22, 2017 6:37 am
GD Posts: 0
Has thanked: 40 times
Been thanked: 10 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Baywa »

For what it's worth: I think one can find the games that AlphaGo played against humans inside that tree of variations. For example (I tried this out), game No. 18 of Master (Black) against Ke Jie is contained, and one can follow AlphaGo's evaluations. Interestingly, Master does not always choose the "best" continuation. OTOH, when Ke Jie played 22 M4, AlphaGo's winning percentage rose to 49.9 percent. The move suggested as an alternative, which defended the lower-left corner, would have kept Black's winning percentage at 47.7.

So it might be worthwhile to follow these evaluations.
Couch Potato - I'm just watchin'!
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Bill Spight »

moha wrote:
Bill Spight wrote:
moha wrote:I think they simply mean the MCTS effort / iterations / node expansions. EDIT: Actually this was still Master, so rollouts as well.
If so, the win rates should be perfect for SDKs, eh?
Master itself did OK, looking only at these numbers, didn't it? :)


No. Master had/has a value network and did not rely upon MCTS quasi-random rollouts alone. Also, Master built a game tree and did not rely solely upon "win rates".

Again I ask, what were they simulating?
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
yoyoma
Lives in gote
Posts: 653
Joined: Mon Apr 19, 2010 8:45 pm
GD Posts: 0
Location: Austin, Texas, USA
Has thanked: 54 times
Been thanked: 213 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by yoyoma »

John Fairbairn wrote:The wording of the announcement is unclear as regards selection of "major" variations or indeed what is meant by "opening" (fuseki or joseki - I infer the former).

But insofar as I can make sense of it, the results don't seem to equate with pro tournament practice and the human moves shown must therefore be from a mix of pro and ama games but mostly ama.

If I follow the first two recommended moves (R16 then Q4) DeepMind gives 11 human moves: C17, C16, C15, C4, D17, D16, D4, D3, E17, O3 and P17.

But doing a similar exercise on a pro tournament database (approx. 90,000 games) gives a quite different result, with 21 moves tried by humans. Human pros did not try the O3 given above, but did try the rest plus C5, C3, D15, D14, D5, E16, E4, E3, O17, P16 and R6.

The AG win rates are potentially interesting (but how much different from pro win rates as shown by Kombilo?), but if they include many wins against amateurs (and/or internet blitz games) they may be somewhat spurious. It would be nice if Nature could be cajoled into including a properly versed go player as one of their referees.

Just offering users the chance to find plays they have never considered before and thus be "creative" (sounds like an AI definition of the word) is not much of a selling point given the wealth of pro games around, and specifically those that were experimental such as New Fuseki, that already do that.


My interpretation is the process was:
1) DeepMind picks a tree of interesting openings. From your analysis it seems maybe it included some amateur games. I view that as not a problem.
2) DeepMind uses AlphaGo Master to analyze every position in the tree, and puts the resulting winrates in the tree. This analysis has nothing to do with the winrates pros or amateurs had from those positions. So your point about "wins against amateurs" doesn't apply. If there are some positions in there that are very bad (whether a pro or an amateur reached that position is irrelevant), Master will tell us so.
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by moha »

Bill Spight wrote:
moha wrote:
Bill Spight wrote:If so, the win rates should be perfect for SDKs, eh?
Master itself did OK, looking only at these numbers, didn't it? :)
No. Master had/has a value network and did not rely upon MCTS quasi-random rollouts alone. Also, Master built a game tree and did not rely solely upon "win rates".

Again I ask, what were they simulating?

But why do you think these numbers are rollouts alone?

IMO these are Master's complete evaluations, 50% value net + 50% rollout (and Master is known to have had a weaker value net than Zero, weak enough that dropping rollouts would have damaged its strength in spite of the huge speedup; I also get the feeling that you equate NN policy-based rollouts with the random rollouts of the past).
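[For reference, this 50/50 blend matches the evaluation published for AlphaGo, where a leaf's value is an equal mix of the value network's estimate and the rollout outcome (mixing constant λ = 0.5). A minimal sketch; the function and variable names are illustrative, not DeepMind's code:]

```python
# Mixed leaf evaluation in the style described for AlphaGo: an equal blend
# of the value network's winrate estimate and the rollout result.
LAMBDA = 0.5  # mixing constant; 0.5 means 50% value net, 50% rollout

def leaf_value(value_net_estimate: float, rollout_result: float) -> float:
    """Blend the value-network winrate with the rollout outcome (0..1 each)."""
    return (1 - LAMBDA) * value_net_estimate + LAMBDA * rollout_result

# Example: value net says 0.6, the rollout ended in a win (1.0)
print(leaf_value(0.6, 1.0))  # 0.8
```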

As for "simulations", I think this word is used for pure MCTS as well, even without rollouts: X simulations = X leaf node expansions (in the MCTS tree that Master does build, and which also contains estimated winrates, its only notion of value).

(The word may be a remnant of the idea that to expand a leaf node, a new "simulation" is started from the top, taking a weighted-random branch at each node until it reaches a leaf, which is then expanded.)
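[A toy sketch of this usage, where one "simulation" = one descent from the root to a leaf that is then expanded and evaluated. It assumes a UCB-style selection rule and uses a random stand-in for the value net / rollout; everything here is illustrative, not AlphaGo's implementation:]

```python
import math
import random

class Node:
    def __init__(self):
        self.children = {}    # move -> Node
        self.visits = 0
        self.value_sum = 0.0  # accumulated winrate estimates

    def winrate(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node):
    # UCB1-style selection over already-expanded children
    log_n = math.log(node.visits)
    return max(node.children.values(),
               key=lambda c: c.winrate() + math.sqrt(2 * log_n / (c.visits + 1)))

def run_simulations(root, num_simulations, branching=3):
    for _ in range(num_simulations):    # one iteration = one "simulation"
        path = [root]
        node = root
        while node.children:            # descend to a leaf
            node = select_child(node)
            path.append(node)
        for move in range(branching):   # expand the leaf
            node.children[move] = Node()
        value = random.random()         # stand-in for value net / rollout
        for n in path:                  # back the evaluation up the path
            n.visits += 1
            n.value_sum += value
    return root

root = run_simulations(Node(), num_simulations=100)
print(root.visits)  # 100: the root is visited once per simulation
```

So "10 million simulations" would simply mean 10 million such descents, each growing the tree by one node.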
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Bill Spight »

Bill Spight wrote:If so, the win rates should be perfect for SDKs, eh?


moha wrote:Master itself did OK, looking only at these numbers, didn't it? :)


Bill Spight wrote:No. Master had/has a value network and did not rely upon MCTS quasi-random rollouts alone. Also, Master built a game tree and did not rely solely upon "win rates".

Again I ask, what were they simulating?


moha wrote:But why do you think these numbers are rollouts alone?


I don't think that they are rollouts. They are "independent . . . simulations", according to the site. As such, they should actually be probability estimates.

moha wrote:IMO these ARE Master's evaluations, 50% value net + 50% rollout (and Master is known to have had a weaker value net than Zero, weak enough that dropping rollouts would have damaged its strength in spite of the huge speedup; I also get the feeling that you equate NN policy-based rollouts with the random rollouts of the past).


Maybe you are right. :) But then they would not be "winning probabilities", as the site claims.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by moha »

(Meanwhile I edited my post above to make it clearer.)

I don't think there is any other meaning behind these words than the amount of search done and the resulting value estimate.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Uberdude »

Baywa wrote:For what it's worth: I think one can find the games that AlphaGo played against humans inside that tree of variations. For example (I tried this out), game No. 18 of Master (Black) against Ke Jie is contained, and one can follow AlphaGo's evaluations. Interestingly, Master does not always choose the "best" continuation. OTOH, when Ke Jie played 22 M4, AlphaGo's winning percentage rose to 49.9 percent. The move suggested as an alternative, which defended the lower-left corner, would have kept Black's winning percentage at 47.7.

So it might be worthwhile to follow these evaluations.


Indeed, we can use this to find the mistakes of the pros who were losing by move 30 without making any obviously bad moves (actually I think some were fairly obviously bad once you looked at the games en masse, like the premature hanging connection in the Chinese-opening 3-3 invasion and tenuki).

As for that Ke Jie game, I did think having to answer the slide at 3-3 was really painful, and quite probably more so than playing the kosumi initially and letting Black make whatever loose but non-territorial formation on the lower side.

I made a thread to highlight and discuss interesting moves/evaluations in this data: viewtopic.php?f=15&t=15310.
djhbrown
Lives in gote
Posts: 392
Joined: Tue Sep 15, 2015 5:00 pm
Rank: NR
GD Posts: 0
Has thanked: 23 times
Been thanked: 43 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by djhbrown »

Alfie says all of Black's first moves are <50%, so either Black should resign before placing a stone, or komi is too much.
Attachments
alfie.png
i shrink, therefore i swarm
pookpooi
Lives in sente
Posts: 727
Joined: Sat Aug 21, 2010 12:26 pm
GD Posts: 10
Has thanked: 44 times
Been thanked: 218 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by pookpooi »

djhbrown wrote:Alfie says all black's first moves are <50%, so either black should resign before placing a stone, or komi is too much.

Unless the opponent is God, the resign threshold can be set well below 50%.

And I'm wondering which rules AlphaGo calculates with. From the <50% Black winrate I bet it's Chinese. But DeepZenGo also gives <50% to Black under Japanese rules.
vier
Dies with sente
Posts: 91
Joined: Wed Oct 30, 2013 7:04 am
GD Posts: 0
Has thanked: 8 times
Been thanked: 29 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by vier »

pookpooi wrote:And I'm wondering which rules AlphaGo calculates with. From the <50% Black winrate I bet it's Chinese.

The opening book "book.sgf" has KM[7.5] in the header.