AlphaGo Teach discussion (Go Tool from DeepMind)

For discussing go computing, software announcements, etc.
lightvector
Lives in sente
Posts: 759
Joined: Sat Jun 19, 2010 10:11 pm
Rank: maybe 2d
GD Posts: 0
Has thanked: 114 times
Been thanked: 916 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by lightvector »

I actually think that for mid- to high-dan amateurs there is some usefulness here. I'm below the strength needed to derive full value from these variations, but I browsed through and still learned some things about openings I actually encounter from both sides: the Chinese, the mini-Chinese, the Kobayashi, the san-ren-sei. There are moves I wouldn't even have considered that I can experiment with now that I know they're possible, and moves I would have considered and rejected as bad that AlphaGo instead evaluates as not-bad, with a simple and clear variation that disproves my misconception.

I also think that at the extremes the evaluations are useful at dan level. When they move noticeably outside the 40%-60% range, I find that more often than not I myself feel dissatisfied with the side that supposedly fell behind, and that gives me some confidence that the evals, once they get that extreme, are at least somewhat informative. I wouldn't put huge value on them at amateur dan, but at minimum they are another hundred good data points for improving one's overall sense of direction of play and whole-board judgment, one more drop in the bucket.
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by moha »

Bill Spight wrote:From the web site:
each move’s winning probability was computed by running an independent search of 10 million simulations from that position.
Simulations of what, pray tell?
I think they simply mean the MCTS effort / iterations / node expansions. EDIT: Actually this was still Master, so rollouts as well.
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Bill Spight »

moha wrote:
Bill Spight wrote:From the web site:
each move’s winning probability was computed by running an independent search of 10 million simulations from that position.
Simulations of what, pray tell?
I think they simply mean the MCTS effort / iterations / node expansions. EDIT: Actually this was still Master, so rollouts as well.


If so, the win rates should be perfect for SDKs, eh?
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
John Fairbairn
Oza
Posts: 3724
Joined: Wed Apr 21, 2010 3:09 am
Has thanked: 20 times
Been thanked: 4672 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by John Fairbairn »

The wording of the announcement is unclear as regards selection of "major" variations or indeed what is meant by "opening" (fuseki or joseki - I infer the former).

But insofar as I can make sense of it, the results don't seem to equate with pro tournament practice and the human moves shown must therefore be from a mix of pro and ama games but mostly ama.

If I follow the first two recommended moves (R16 then Q4) DeepMind gives 11 human moves: C17, C16, C15, C4, D17, D16, D4, D3, E17, O3 and P17.

But doing a similar exercise on a pro tournament database (approx. 90,000 games) gives a quite different result, with 21 moves tried by humans. Human pros did not try the O3 given above, but did try the rest plus C5, C3, D15, D14, D5, E16, E4, E3, O17, P16 and R6.

The AG win rates are potentially interesting (but how much different from pro win rates as shown by Kombilo?), but if they include many wins against amateurs (and/or internet blitz games) they may be somewhat spurious. It would be nice if Nature could be cajoled into including a properly versed go player as one of their referees.

Just offering users the chance to find plays they have never considered before and thus be "creative" (sounds like an AI definition of the word) is not much of a selling point given the wealth of pro games around, and specifically those that were experimental such as New Fuseki, that already do that.
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by moha »

Bill Spight wrote:
moha wrote:I think they simply mean the MCTS effort / iterations / node expansions. EDIT: Actually this was still Master, so rollouts as well.
If so, the win rates should be perfect for SDKs, eh?
Master itself did OK, looking only at these numbers, didn't it? :)

John Fairbairn wrote:The AG win rates are potentially interesting (but how much different from pro win rates as shown by Kombilo?), but if they include many wins against amateurs (and/or internet blitz games)
Again, I think these are only probabilities, estimated by AG with a deep search (and rollouts with the reduced policy net). So there should be some correlation with actual pro data, but significant differences as well.
Baywa
Dies in gote
Posts: 39
Joined: Wed Feb 22, 2017 6:37 am
GD Posts: 0
Has thanked: 40 times
Been thanked: 10 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Baywa »

For what it's worth: I think one can find the games that AlphaGo played against humans inside that tree of variations. For example (I tried this out), game No. 18 of Master (Black) against Ke Jie is contained, and one can follow AlphaGo's evaluations. Interestingly, Master does not always choose the "best" continuation. OTOH, when Ke Jie played 22 M4, AlphaGo's winning percentage rose to 49.9 percent. The move suggested as an alternative, which defended the lower-left corner, would have kept Black's winning percentage at 47.7.

So it might be worthwhile to follow these evaluations.
Couch Potato - I'm just watchin'!
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Bill Spight »

moha wrote:
Bill Spight wrote:
moha wrote:I think they simply mean the MCTS effort / iterations / node expansions. EDIT: Actually this was still Master, so rollouts as well.
If so, the win rates should be perfect for SDKs, eh?
Master itself did OK, looking only at these numbers, didn't it? :)


No. Master had/has a value network and did not rely upon MCTS quasi-random rollouts alone. Also, Master built a game tree and did not rely solely upon "win rates".

Again I ask, what were they simulating?
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
yoyoma
Lives in gote
Posts: 653
Joined: Mon Apr 19, 2010 8:45 pm
GD Posts: 0
Location: Austin, Texas, USA
Has thanked: 54 times
Been thanked: 213 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by yoyoma »

John Fairbairn wrote:The wording of the announcement is unclear as regards selection of "major" variations or indeed what is meant by "opening" (fuseki or joseki - I infer the former).

But insofar as I can make sense of it, the results don't seem to equate with pro tournament practice and the human moves shown must therefore be from a mix of pro and ama games but mostly ama.

If I follow the first two recommended moves (R16 then Q4) DeepMind gives 11 human moves: C17, C16, C15, C4, D17, D16, D4, D3, E17, O3 and P17.

But doing a similar exercise on a pro tournament database (approx. 90,000 games) gives a quite different result, with 21 moves tried by humans. Human pros did not try the O3 given above, but did try the rest plus C5, C3, D15, D14, D5, E16, E4, E3, O17, P16 and R6.

The AG win rates are potentially interesting (but how much different from pro win rates as shown by Kombilo?), but if they include many wins against amateurs (and/or internet blitz games) they may be somewhat spurious. It would be nice if Nature could be cajoled into including a properly versed go player as one of their referees.

Just offering users the chance to find plays they have never considered before and thus be "creative" (sounds like an AI definition of the word) is not much of a selling point given the wealth of pro games around, and specifically those that were experimental such as New Fuseki, that already do that.


My interpretation is the process was:
1) DeepMind picks a tree of interesting openings. From your analysis it seems maybe it included some amateur games. I view that as not a problem.
2) DeepMind uses AlphaGo Master to analyze every position in the tree, and puts the resulting winrates in the tree. This analysis has nothing to do with the winrates pros or amateurs had from those positions. So your point about "wins against amateurs" doesn't apply. If there are some positions in there that are very bad (whether a pro or an amateur reached that position is irrelevant), Master will tell us so.
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by moha »

Bill Spight wrote:
moha wrote:
Bill Spight wrote:If so, the win rates should be perfect for SDKs, eh?
Master itself did OK, looking only at these numbers, didn't it? :)
No. Master had/has a value network and did not rely upon MCTS quasi-random rollouts alone. Also, Master built a game tree and did not rely solely upon "win rates".

Again I ask, what were they simulating?

But why do you think these numbers are rollouts alone?

IMO these are Master's complete evaluations, 50% value net + 50% rollout (and Master is known to have had a weaker value net than Zero, weak enough that dropping rollouts would have damaged its strength in spite of the huge speedup; I also get the feeling that you equate NN policy-based rollouts with the random rollouts of the past).
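[For reference, this 50/50 blend matches the evaluation published for AlphaGo, where a leaf's value is an equal mix of the value network's estimate and the rollout outcome (mixing constant λ = 0.5). A minimal sketch; the function and variable names are illustrative, not DeepMind's code:]

```python
# Mixed leaf evaluation in the style described for AlphaGo: an equal blend
# of the value network's winrate estimate and the rollout result.
LAMBDA = 0.5  # mixing constant; 0.5 means 50% value net, 50% rollout

def leaf_value(value_net_estimate: float, rollout_result: float) -> float:
    """Blend the value-network winrate with the rollout outcome (0..1 each)."""
    return (1 - LAMBDA) * value_net_estimate + LAMBDA * rollout_result

# Example: value net says 0.6, the rollout ended in a win (1.0)
print(leaf_value(0.6, 1.0))  # 0.8
```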

As for "simulations", I think this word is used for pure MCTS as well, even without rollouts: X simulations = X leaf node expansions (in the MCTS tree that Master does build, and which also contains estimated winrates, its only notion of value).

(The word may be a remnant of the idea that to expand a leaf node, a new "simulation" is started from the top, taking a weighted-random branch at each node until it reaches a leaf, which is then expanded.)
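[A toy sketch of this usage, where one "simulation" = one descent from the root to a leaf that is then expanded and evaluated. It assumes a UCB-style selection rule and uses a random stand-in for the value net / rollout; everything here is illustrative, not AlphaGo's implementation:]

```python
import math
import random

class Node:
    def __init__(self):
        self.children = {}    # move -> Node
        self.visits = 0
        self.value_sum = 0.0  # accumulated winrate estimates

    def winrate(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node):
    # UCB1-style selection over already-expanded children
    log_n = math.log(node.visits)
    return max(node.children.values(),
               key=lambda c: c.winrate() + math.sqrt(2 * log_n / (c.visits + 1)))

def run_simulations(root, num_simulations, branching=3):
    for _ in range(num_simulations):    # one iteration = one "simulation"
        path = [root]
        node = root
        while node.children:            # descend to a leaf
            node = select_child(node)
            path.append(node)
        for move in range(branching):   # expand the leaf
            node.children[move] = Node()
        value = random.random()         # stand-in for value net / rollout
        for n in path:                  # back the evaluation up the path
            n.visits += 1
            n.value_sum += value
    return root

root = run_simulations(Node(), num_simulations=100)
print(root.visits)  # 100: the root is visited once per simulation
```

So "10 million simulations" would simply mean 10 million such descents, each growing the tree by one node.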
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Bill Spight »

Bill Spight wrote:If so, the win rates should be perfect for SDKs, eh?


moha wrote:Master itself did OK, looking only at these numbers, didn't it? :)


Bill Spight wrote:No. Master had/has a value network and did not rely upon MCTS quasi-random rollouts alone. Also, Master built a game tree and did not rely solely upon "win rates".

Again I ask, what were they simulating?


moha wrote:But why do you think these numbers are rollouts alone?


I don't think that they are rollouts. They are "independent . . . simulations", according to the site. As such, they should actually be probability estimates.

moha wrote:IMO these ARE Master's evaluations, 50% value net + 50% rollout (and Master is known to have had a weaker value net than Zero, weak enough that dropping rollouts would have damaged its strength in spite of the huge speedup; I also get the feeling that you equate NN policy-based rollouts with the random rollouts of the past).


Maybe you are right. :) But then they would not be "winning probabilities", as the site claims.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by moha »

(Meanwhile I edited my post above to make it clearer.)

I don't think there is any other meaning behind these words than the amount of search done and the resulting value estimate.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by Uberdude »

Baywa wrote:For what it's worth: I think one can find the games that AlphaGo played against humans inside that tree of variations. For example (I tried this out), game No. 18 of Master (Black) against Ke Jie is contained, and one can follow AlphaGo's evaluations. Interestingly, Master does not always choose the "best" continuation. OTOH, when Ke Jie played 22 M4, AlphaGo's winning percentage rose to 49.9 percent. The move suggested as an alternative, which defended the lower-left corner, would have kept Black's winning percentage at 47.7.

So it might be worthwhile to follow these evaluations.


Indeed, we can use this to find the mistakes of the pros who were losing by move 30 without making any obviously bad moves (actually I think some were fairly obviously bad once you looked at the games en masse, like the premature hanging connection in the Chinese-opening 3-3 invasion and tenuki).

As for that Ke Jie game, I did think having to answer the slide at 3-3 was really painful, and quite probably more so than playing the kosumi initially and letting Black make whatever loose but non-territorial formation on the lower side.

I made a thread to highlight and discuss interesting moves/evaluations in this data: viewtopic.php?f=15&t=15310.
djhbrown
Lives in gote
Posts: 392
Joined: Tue Sep 15, 2015 5:00 pm
Rank: NR
GD Posts: 0
Has thanked: 23 times
Been thanked: 43 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by djhbrown »

Alfie says all of Black's first moves are <50%, so either Black should resign before placing a stone, or komi is too much.
Attachments
alfie.png
i shrink, therefore i swarm
pookpooi
Lives in sente
Posts: 727
Joined: Sat Aug 21, 2010 12:26 pm
GD Posts: 10
Has thanked: 44 times
Been thanked: 218 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by pookpooi »

djhbrown wrote:Alfie says all black's first moves are <50%, so either black should resign before placing a stone, or komi is too much.

Unless the opponent is God, the resign threshold can be set well below 50%.

And I'm wondering which rules AlphaGo calculates with. From the <50% Black winrate I bet it's Chinese. But DeepZenGo also gives <50% to Black under Japanese rules.
vier
Dies with sente
Posts: 91
Joined: Wed Oct 30, 2013 7:04 am
GD Posts: 0
Has thanked: 8 times
Been thanked: 29 times

Re: AlphaGo Teach discussion (Go Tool from DeepMind)

Post by vier »

pookpooi wrote:And I'm wondering which rules AlphaGo calculates with. From the <50% Black winrate I bet it's Chinese.

The opening book "book.sgf" has KM[7.5] in the header.