New network #212
Quick test about @jlt's law
(reminder : LZ#(n) is stronger than LZ#(n-10) at blocks and time parity)
I added the parameters -m 20, to avoid duplicate games, and -v 1601, to "standardize" the test.
50 games, no duplicates, no errors.
Result :
#212 wins 32-18 (64%)
__________________________________________________________________________
And now, how about a little controversy...
If #n wins 55% of its games against #n-1, and
If #n-1 wins 55% of its games against #n-2, and
...
and #n-9 wins 55% of its games against #n-10
then, if Elo differences simply add up along the chain, #n should win 88% of its games against #n-10. But in this test, it wins only 64%...
In this case, it's as if the real average winrate of #n against #n-1 were only ~51.5%, and not 55%.
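For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes the standard logistic Elo model, in which rating differences add up and the win odds therefore multiply from one step to the next. The function names are my own, not part of any Leela Zero tool.

```python
import math

def elo_diff(p: float) -> float:
    """Elo difference implied by a winrate p (logistic Elo model)."""
    return 400.0 * math.log10(p / (1.0 - p))

def chained_winrate(p_step: float, steps: int) -> float:
    """Winrate over `steps` generations if every single step is won at p_step.

    Assumes Elo differences are additive, i.e. the odds multiply.
    """
    odds = (p_step / (1.0 - p_step)) ** steps
    return odds / (1.0 + odds)

def per_step_winrate(p_total: float, steps: int) -> float:
    """Inverse question: average single-step winrate implied by p_total."""
    odds = (p_total / (1.0 - p_total)) ** (1.0 / steps)
    return odds / (1.0 + odds)

if __name__ == "__main__":
    # Ten steps at 55% each: expected winrate of #n against #n-10.
    print(f"chained: {chained_winrate(0.55, 10):.1%}")
    # Observed 64% over ten steps: implied average per-step winrate.
    print(f"per step: {per_step_winrate(0.64, 10):.1%}")
    # The same observation expressed in Elo points per step.
    print(f"Elo per step: {elo_diff(0.64) / 10:.1f}")
```

The first figure comes out at about 88% and the second at a bit over 51%, in line with the numbers above. Whether Elo is really additive (transitive) between Leela Zero networks is itself an assumption.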
Some caveats: -m 20 can alter results, and 50 games is not enough. Still, I remember @moha saying that the primary source of Elo inflation is the amount of luck accumulated by new networks in test matches. I think he was right.
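To put a rough number on the "50 games is not enough" caveat, here is a small sketch. It computes a 95% Wilson score interval for 32 wins out of 50, and the binomial probability of seeing 32 wins or fewer if the true winrate really were 88%. It assumes the games are independent and share one fixed winrate, which -m 20 and the colour alternation may not strictly guarantee. The function names are my own.

```python
import math

def wilson_interval(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p_hat = wins / games
    denom = 1.0 + z * z / games
    center = (p_hat + z * z / (2.0 * games)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / games
                         + z * z / (4.0 * games * games)) / denom
    return center - half, center + half

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k + 1))

if __name__ == "__main__":
    lo, hi = wilson_interval(32, 50)
    print(f"95% interval for 32/50: {lo:.1%} to {hi:.1%}")
    # How likely is a score of 32 or fewer if the true winrate were 88%?
    print(f"P(wins <= 32 | p = 0.88): {binom_cdf(32, 50, 0.88):.2e}")
```

The interval runs from roughly 50% to 76%, so 50 games pin the winrate down only very loosely. Under these assumptions, though, 88% still falls outside it.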
Code:
gogui-twogtp -black "C:\Users\jm\Desktop\gogui150\leela-zero-0.16-win64OK\leelaz.exe --gtp --weights=C:\Users\jm\Desktop\LZ_networks\212.gz --noponder --gpu 0 --gpu 1 -m 20 -v 1601" -white "C:\Users\jm\Desktop\gogui150\leela-zero-0.16-win64OK\leelaz.exe --gtp --weights=C:\Users\jm\Desktop\LZ_networks\202.gz --noponder --gpu 0 --gpu 1 -m 20 -v 1601" -games 50 -sgffile 212_202 -auto -komi 7.5 -alternate
The 50 games :
Attachment:
212_202.zip [43.7 KiB]
EDIT : #212 is B in the even numbered games, and W in the odd ones.