Life In 19x19

Posted: **Sat Apr 18, 2020 4:07 am**

I have just begun reading up on how the new amazing Go bots work, so please correct me if anything I write here is misguided or outright wrong.

In the thread "How much do strong bots agree on moves? Case study", the quality of moves in the AlphaGo Master games is discussed primarily in terms of "nth ranked choice" by KataGo, and secondarily in terms of number of playouts. Thread here:
viewtopic.php?f=18&t=17359

Three related questions popped into my head while reading that thread.

1. In a given board position, it could be that there are 10 moves with negligible difference in win rates, so that choosing among them is mostly a matter of personal preference. In another position, there could be only one good move, and even the second best choice has a much lower win rate. Given this variability, why discuss move quality in terms of "nth ranked choice" at all? To me the nth choice carries very little information on move quality. Am I wrong?

2. The SGF also included the number of playouts for the top 3 moves. This gives a bit more information, I believe. To my (very fuzzy) understanding, each playout is selected according to estimated win rates (Bayesian adjusted by number of playouts with a bit of random noise to spread moves out proportionally). The final move is the one with the most playouts, which should also be the move with the highest Bayesian win rate. Two moves with similar numbers of playouts should have similar Bayesian adjusted win rates. I'm in way over my head here, but is this roughly correct? Is there a direct relation between playouts and estimated win rate? For example, if one move has 10k playouts and another only 1k, is it possible to say anything about the difference in expected win rate?

3. When discussing move quality, aren't win rates (again, Bayesian adjusted by number of playouts) the best measure? Why do experts often use fuzzier measures like nth choice or number of playouts?

Thanks in advance for educating me!

Posted: **Sat Apr 18, 2020 4:26 am**

I agree on "the nth choice carries very little information on move quality".

Posted: **Sat Apr 18, 2020 7:35 am**

tango wrote: 1. In a given board position, it could be that there are 10 moves with negligible difference in win rates, so that choosing among them is mostly a matter of personal preference. In another position, there could be only one good move, and even the second best choice has a much lower win rate. Given this variability, why discuss move quality in terms of "nth ranked choice" at all? To me the nth choice carries very little information on move quality. Am I wrong?

Because the post wasn't about the quality of each of those moves, nor an attempt to measure the strength of their play. It was a comparison of their preferences.

It was about how surprisingly often KataGo and AlphaGo Master seemed to agree on exactly the same move, or almost agree, even though AlphaGo Master is supposed not to have been a "pure zero" bot in terms of the data it learned from. Which is interesting - it either means that AlphaGo Master is vastly more "zero-like" than we thought and therefore agrees merely due to having stylistically similar preferences to bots trained without human data, or else that bots have converged on correct play enough that a significant fraction of the time they actually do see unique correct plays, even in many open-space situations where humans would think there should be lots of room for choice. Or maybe a mix of both.

(Note: sometimes a strong preference can be meaningful and correct even with a negligible winrate difference. For example, consider a necessary defense where one move leaves a small ko threat and another move doesn't leave the ko threat and also has no other downside. One would not be surprised if most or all strong bots' policy priors were concentrated on the move that leaves no ko threat, and if with search they all uniformly preferred that move, yet also all judged the winrate difference to be tiny, the kind one would normally consider negligible.)

tango wrote: 2. The SGF also included the number of playouts for the top 3 moves. This gives a bit more information, I believe. To my (very fuzzy) understanding, each playout is selected according to estimated win rates (Bayesian adjusted by number of playouts with a bit of random noise to spread moves out proportionally). The final move is the one with the most playouts, which should also be the move with the highest Bayesian win rate. Two moves with similar numbers of playouts should have similar Bayesian adjusted win rates. I'm in way over my head here, but is this roughly correct? Is there a direct relation between playouts and estimated win rate? For example, if one move has 10k playouts and another only 1k, is it possible to say anything about the difference in expected win rate?

Statistically there is almost certainly some correlation, but it would be quite noisy and have fat tails. Playouts are also selected according to the bot's policy prior, which encodes a lot of strong intuition for good shape and correct direction of play, so a bot can have strong preferences either due to seeing large winrate differences, or due to having strong intuitions together with small winrate differences that agree with or at least don't strongly refute the intuition. So the playout ratio between moves can vary by vast amounts in different situations for the same winrate differences.
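
To make that last point concrete, here is a minimal sketch of how a prior-weighted selection rule can produce very different playout ratios for the same winrate gap. It assumes a simplified AlphaZero-style PUCT formula with fixed, noise-free move values and no deeper tree. The function `puct_visits`, the constant `c_puct=1.5`, and all the numbers are illustrative assumptions, not KataGo's actual search.

```python
import math

def puct_visits(priors, values, total_playouts, c_puct=1.5):
    """Distribute playouts among moves with a simplified PUCT rule.

    Assumption: each move's value (winrate) is fixed and known, so the only
    things driving the playout counts are the value gap and the policy prior.
    """
    visits = [0] * len(priors)
    for n_total in range(total_playouts):
        def score(i):
            # Exploration bonus: large for high-prior, rarely visited moves.
            explore = c_puct * priors[i] * math.sqrt(n_total + 1) / (1 + visits[i])
            return values[i] + explore
        best = max(range(len(priors)), key=score)
        visits[best] += 1
    return visits

# Same 2% winrate gap in all three cases; only the (hypothetical) prior changes.
values = [0.52, 0.50]
even = puct_visits([0.5, 0.5], values, 1000)
skewed = puct_visits([0.9, 0.1], values, 1000)
reversed_prior = puct_visits([0.1, 0.9], values, 1000)
print("even prior:    ", even)
print("skewed prior:  ", skewed)
print("reversed prior:", reversed_prior)
```

Under these assumptions the better move gets a modest playout lead with an even prior and a much larger lead when the prior favors it. When the prior favors the other move, that move can collect more playouts despite its lower winrate.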

tango wrote: 3. When discussing move quality, aren't win rates (again, Bayesian adjusted by number of playouts) the best measure? Why do experts often use fuzzier measures like nth choice or number of playouts?

When you say "Bayesian adjusted by number of playouts" you're sweeping a lot under the rug. In terms of what gets you strongest play in practice if you have to pick a single simple metric alone, number of playouts vastly dominates almost anything else, including winrate. So perhaps you should say "Number of playouts, but Bayesian adjusted very very slightly by winrate".

So:

* You should pay the most attention to winrates (or estimated score difference) when analyzing games to determine how much better or worse different moves are in the most objective way (but make sure they also have enough playouts, or force them to by playing the move on the board itself).
* Playouts are the more reliable indicator of the bot's preferences, such that even given all the statistics of all moves, solely choosing the max-playouts move is almost the strongest possible way to play using only those statistics, on average.
* The ordering of moves that aren't the top-playouts move usually means a little less past the top two or three slots, and to the degree it means something, it also has to do heavily with the bot's preferences and intuitions, rather than with objective move quality. For example, a bot may intuitively prefer a move but eventually refute it tactically, after spending enough playouts on it to prove it objectively one of the worst choices.
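
The distinction in the list above can be sketched as follows. This is only an illustration under stated assumptions: `MoveStats`, `pick_by_playouts`, `winrate_loss`, the `min_playouts` threshold, and all the numbers are made up for the example and do not come from any bot's actual output format.

```python
from dataclasses import dataclass

@dataclass
class MoveStats:
    move: str
    playouts: int
    winrate: float  # from the perspective of the player to move

def pick_by_playouts(stats):
    """The bot's preference: the move that received the most playouts."""
    return max(stats, key=lambda m: m.playouts)

def winrate_loss(stats, min_playouts):
    """Winrate gap to the best move, ignoring moves with too few playouts.

    Moves below min_playouts are dropped because their winrates are too
    noisy to compare; in practice one would play them out on the board
    to force more search.
    """
    trusted = [m for m in stats if m.playouts >= min_playouts]
    best = max(m.winrate for m in trusted)
    return {m.move: best - m.winrate for m in trusted}

# Hypothetical statistics for one position.
stats = [
    MoveStats("D4", 9000, 0.550),
    MoveStats("Q16", 800, 0.548),
    MoveStats("K10", 12, 0.610),  # looks great, but has far too few playouts
]

choice = pick_by_playouts(stats)
loss = winrate_loss(stats, min_playouts=500)
print("preferred move:", choice.move)
print("winrate loss among well-searched moves:", loss)
```

Here the second-ranked move by playouts loses almost nothing in winrate, while the move with the highest raw winrate is excluded because it was barely searched.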

Posted: **Sat Apr 18, 2020 8:39 am**

Thank you very much lightvector for the detailed explanation. My bad for not distinguishing between bot preferences and move quality. Seems that I was more or less on the right track, at least with respect to win rates being a good measure, but thanks for clearing up the details. My "sweeping a lot under the rug" was due to lack of understanding which I now think is somewhat improved thanks to your post. It was also intentional vagueness - as my old professor used to say: "When skating on thin ice, safety is in the speed."

Number of playouts vs bayesian estimate of win rate
