Maharani wrote:
For instance, there is a slight possibility that, with enough play-outs, Kata will start rating the 3-4 points better than the 4-4 points. I want to see if this is something that could happen with a million playouts of move 0.

Interesting question! I think it's unlikely but worth exploring. But I still think you could do it more efficiently.
A million playouts of move 0 gives you, what, 100k playouts of D4 (and similar for the other 4-4 points, but they should all get very similar evaluations), and a few tens of thousands of playouts of D3? You'd do better to just play D3 on the board, get 200k playouts of that position, do the same for D4, then see which has the higher evaluation. You get more (relevant) data for less computation and less of your time.
The problem is that if D3 has a lower policy prior (the network's initial probability for that move) than D4, then it could take some massive number of total playouts before D3 starts getting a significant proportion of that total. That's why, if you're interested in a specific move, it's usually better to play that move on the board rather than analyse from the previous position.
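To make that concrete, here is a rough toy sketch of how a PUCT-style search (the general kind of selection rule used by AlphaZero-like engines) hands out playouts at the root. It is not KataGo's actual search, which differs in many details. The priors, the values, the c_puct constant and the function name simulate_puct are all made up for illustration, and each move's value is treated as fixed and known, which a real search never has.

```python
import math

def simulate_puct(priors, values, total_playouts, c_puct=1.0):
    """Toy root-level PUCT: return the visit count each move ends up with.

    Assumptions (illustrative only): each move's value is a fixed, known
    number, and the score is value + c_puct * prior * sqrt(N) / (1 + n).
    """
    visits = [0] * len(priors)
    for t in range(total_playouts):
        sqrt_total = math.sqrt(t + 1)
        best, best_score = 0, -math.inf
        for i, (p, q) in enumerate(zip(priors, values)):
            # Exploration bonus shrinks as this move collects visits.
            score = q + c_puct * p * sqrt_total / (1 + visits[i])
            if score > best_score:
                best, best_score = i, score
        visits[best] += 1
    return visits

# Hypothetical numbers: four 4-4 points with a high prior, and one 3-4 point
# with a low prior but a slightly better "true" value.
priors = [0.24, 0.24, 0.24, 0.24, 0.04]
values = [0.50, 0.50, 0.50, 0.50, 0.52]

for n in (1_000, 10_000, 100_000):
    v = simulate_puct(priors, values, n)
    print(f"{n:>7} playouts: low-prior move got {v[-1]} ({v[-1] / n:.1%})")
```

In this toy model the low-prior move only picks up a large share of the playouts once the total gets big, even though it was given the better value from the start. With a smaller value gap, or noisy values as in a real search, you'd expect it to take longer still.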