gennan wrote:
The score difference between 2 players over a whole game may then roughly follow a beta distribution.

Whether you use gamma or beta (both imperfect, being continuous), the sum after 150 additions should be pretty close to a normal as well. I think the biggest weakness of these approaches is not the distribution used for the approximation, but the implied assumption that the errors are independent. In reality this is not quite true.
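The convergence claim above can be checked with a quick simulation: sum ~150 skewed per-move errors and see how normal the total looks. This is only a sketch — the gamma shape/scale and the 150-move count are illustrative assumptions, not fitted values.

```python
# Sketch: a sum of many i.i.d. skewed errors approaches a normal (CLT),
# whether each error is modeled as gamma or beta.
# Shape/scale parameters and move count are assumptions for illustration.
import random
import statistics

random.seed(0)

N_MOVES = 150   # errors summed per game (assumption)
N_GAMES = 2000  # simulated games

def game_total():
    # Per-move error drawn from a right-skewed gamma distribution.
    return sum(random.gammavariate(0.5, 2.0) for _ in range(N_MOVES))

totals = [game_total() for _ in range(N_GAMES)]
mean = statistics.mean(totals)
sd = statistics.pstdev(totals)

# For a near-normal distribution, about 68% of totals fall within 1 sd.
within_1sd = sum(abs(t - mean) <= sd for t in totals) / N_GAMES
print(round(within_1sd, 2))
```

Even though each individual draw is strongly skewed, the fraction of game totals within one standard deviation comes out close to the normal distribution's 68%.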
For example, players may be weaker in certain types of games (moyo games or large-scale tactical fights), against certain shapes, or against a certain opponent, which makes many of their errors in such a game larger than usual. The 50-point errors in the sequence above also have a direct relation between them.
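The effect of such per-game correlation can be illustrated numerically: give all errors in a game a shared shift (a "bad day" component) and compare the spread of game totals against the independent case. The component sizes are made-up assumptions, purely to show the direction of the effect.

```python
# Sketch: per-move errors sharing a common per-game component (e.g. a bad
# matchup or an unfamiliar game type) inflate the variance of the total
# far beyond the independent case. All parameters are illustrative.
import random
import statistics

random.seed(1)

N_MOVES = 150
N_GAMES = 3000

def game_total(correlated):
    # Shared per-game shift in mean error size (0 in the independent case).
    bias = random.gauss(0.0, 0.5) if correlated else 0.0
    # Per-move errors are non-negative point losses.
    return sum(max(0.0, random.gauss(1.0 + bias, 1.0)) for _ in range(N_MOVES))

sd_indep = statistics.pstdev([game_total(False) for _ in range(N_GAMES)])
sd_corr = statistics.pstdev([game_total(True) for _ in range(N_GAMES)])
print(sd_corr > sd_indep)  # → True
```

The shared component moves all 150 errors together, so its contribution to the total's standard deviation scales with the move count rather than its square root — which is why the correlated spread dominates.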
On the other hand, Elo works pretty well for many games, despite making an even stronger simplifying assumption (a uniform standard deviation for every player). In go that would not be sustainable, and even for chess it seems quite a stretch in reality. The reason for a smaller error total is almost always smaller individual errors, and thus a smaller deviation for the total.
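For reference, the uniform-spread assumption is baked into the standard logistic Elo expected-score formula, which depends only on the rating difference:

```python
# The standard logistic Elo expected-score formula. It implicitly assumes
# the same performance spread (the 400-point scale constant) for everyone.
def elo_expected(rating_a, rating_b):
    """Expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

print(round(elo_expected(2000, 1800), 2))  # 200-point edge → 0.76
```

A per-player spread (as in Glicko-style models) would replace that fixed 400-point constant with something depending on both players, which is closer to what the go situation seems to require.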
gennan wrote:
My guess is that top AI may only lose about 10 points total per game on average.

This seems a bit too low to me, and there is a problem with these estimates: both humans and bots get stronger with more time, but perfect play (and any distance to it) is a fixed strength, independent of time.
gennan wrote:
total point loss per game:
Code: Select all
rank   real?   KataGo estimate
AG0    10      100

I agree this line is a bit suspicious. Comparing to a different but similarly strong player is already asking for trouble. There is also the variance: over the 8 games above my results fluctuated quite widely, so a lot of samples would be needed for a real average.