Thanks. LZ's reported "visits" include tree reuse, so some huge swings there are expected. But there actually seem to be large fluctuations in n/s as well, which is not subject to tree reuse (it's computed as playouts/time, IIRC).
Bottom line is, it's not surprising if results with fixed (asymmetrical) visits differ somewhat from fixed per-game time.
Re: LZ's progression
Posted: Tue Oct 02, 2018 1:19 am
by nbc44
Yet another so-called "time parity" test - 10 min per game.
We must have the same kind of PC (I have 2x1080Ti, i9-7920X), and apart from your -t 12 -q -d, I use the same commands.
On mine, a 10 min (per side) game in fact lasts around 820 sec, as LZ doesn't use up all its time. A 12801 v 3201 visits game lasts about the same time, but slightly favours #157, which takes 10% more time than 181.
181 won 66% of your 200 games at 10 min.
181 won 7 of its 10 games
181 won 4 of its 10 games
181 won 2 of its 10 games
that's 43% of the 30 games, but with ~10% less time than 157
I'd trust your 200 games result. Probably 3x10 games is not enough, or ... maybe it's 181's way of asking for 10% more time
Re: LZ's progression
Posted: Tue Oct 02, 2018 10:38 pm
by nbc44
Due to the fact that your processor is faster, I suggest repeating your last test (10 games) with my settings:
I don't think that the CPU is important, we have the same GPUs, and so, very similar benchmarks
Most probably, 10 or 30 games is not enough. For example, between your two 10 game matches, 181 doubles its winning percentage (20% and 40%; that's a huge swing). That's why I have more faith in your 200 game match.
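To put a rough number on "not enough games", here is a minimal sketch that computes an approximate 95% Wilson score interval for 181's win rate in each match mentioned above. The function name `wilson_interval` is just illustrative, and the 132 wins for the 200 game match is an assumption derived from the quoted 66%.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate (z=1.96)."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# (wins for 181, games played); 132/200 is assumed from the quoted 66%
matches = [(2, 10), (4, 10), (13, 30), (132, 200)]

for wins, games in matches:
    low, high = wilson_interval(wins, games)
    print(f"{wins}/{games}: {wins / games:.0%} (interval {low:.0%} to {high:.0%})")
```

With 10 or 30 games the interval is wide and still straddles 50%, while for the 200 game match it stays above 50%, which fits trusting the longer match more.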
Re: LZ's progression
Posted: Thu Oct 04, 2018 6:23 pm
by splee99
If you have two GPUs, why don't you try assigning GPU0 to 181 and GPU1 to 157? I know this would make things slower, but it would make for a fair match, because some data may be cached in a GPU during the game.
Re: LZ's progression
Posted: Fri Oct 05, 2018 6:22 pm
by nbc44
Last so-called "time parity" test - 10 min per side.
--precision single -t 12 --gpu 0 --gpu 1 --noponder
Part1: #157 (black) vs #181 : +31-69=0
Part2: #157 (white) vs #181 : +53-47=0
Finally: #157 vs #181 : +84-116=0
A farewell to #157!!!
P.S. L0-next 29/09/2018
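For anyone who wants the match result above as a rating gap, here is a small sketch that adds up the two halves from #181's point of view and converts the overall score into an Elo difference with the usual logistic formula. The helper name `elo_diff` is made up for this example, and the formula ignores the (zero) draws.

```python
import math

def elo_diff(wins, losses):
    """Elo difference implied by a win/loss record (logistic model, no draws)."""
    p = wins / (wins + losses)
    return 400 * math.log10(p / (1 - p))

# Results seen from #181's side: (wins, losses)
part1 = (69, 31)  # #157 played black
part2 = (47, 53)  # #157 played white

wins = part1[0] + part2[0]
losses = part1[1] + part2[1]

print(f"#181 v #157: +{wins}-{losses}, score {wins / (wins + losses):.0%}")
print(f"Implied Elo difference: {elo_diff(wins, losses):+.0f}")
```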
Re: LZ's progression
Posted: Sat Oct 20, 2018 9:39 pm
by Vargo
New network (183)
Two 20 game matches #183 v #157 at time parity :
Match 1 :
LZ 015, GPU : 1x1080, no pondering, 5 min. per side , corresponding to ~800 visits for #183 and to ~2800 visits for #157
#183 wins 11-9 (all games by resignation, about 240 moves per game, and 230" used per side and per game)
Match 2 :
GPU : 2x1080Ti, no pondering, 7 min. per side, corresponding to ~3400 visits for #183 and to ~12800 visits for #157
#183 wins 12-8 (all games by resignation, about 240 moves per game, and 320" used per side)
Finally, 183 is better than 157, even at low time settings (less than 2" per move with a 1080)
If someone wants the games...
Re: LZ's progression
Posted: Sun Oct 21, 2018 9:54 pm
by abcd_z
Ah, but I have no GPU, and to get a game that takes less than two hours I have to restrict LZ considerably more than you did. On my computer, time parity is 600 visits for LZ 157 and 153 visits for LZ 183. Does 183 still take the lead under those conditions?
Re: LZ's progression
Posted: Mon Oct 22, 2018 1:07 am
by Vargo
abcd_z wrote: Does 183 still take the lead under those conditions?
Most probably no.
I think 157 would win at least 60% of the games with so few visits.
Maybe I'll set up such a match and tell you the result
EDIT :
Here it is !
#157 v #184 (the latest best) 600 visits for #157 and 153 visits for #184
P.S. There is some interesting analysis of how different nets' strength scales with time: https://github.com/gcp/leela-zero/issues/1914. The relevant summary here: at around 20-300 playouts for the 40b #181, the best 15 block network #157 needs 3 times the playouts to score a 50% win rate, but it is ~4 times faster, so it is stronger at equal time (which, given current hardware, is a sensible time per move for human analysis). Once you get into more playouts, 181's strength improves faster than the 15 block's, so 157 needs more than 3 times the playouts; once that ratio crosses the ~4 times equal-time threshold, the 40 block wins more.
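The equal-time argument boils down to comparing two ratios, which can be sketched like this. The function names are hypothetical, the 3x and 4x figures are the ones quoted above for #157 against #181, and the visit counts are the time-parity figures reported earlier in the thread for #157 against #183/#184, used here only to estimate the speed ratio.

```python
def speed_ratio(visits_fast_net, visits_slow_net):
    """How many times more visits the faster net gets in the same time."""
    return visits_fast_net / visits_slow_net

def faster_net_ahead_at_equal_time(playout_ratio_needed, speed):
    """True if the speed advantage exceeds the playout ratio needed for 50%."""
    return speed > playout_ratio_needed

# Figures quoted in the P.S.: 15b needs ~3x playouts, but is ~4x faster
print(faster_net_ahead_at_equal_time(3, 4))   # low playouts: 15b ahead
# Hypothetical value: if the needed ratio grew to 5x, 40b would be ahead
print(faster_net_ahead_at_equal_time(5, 4))

# Time-parity visits reported in this thread (#157 visits, 40b visits)
reported = {
    "1x1080, 5 min": (2800, 800),
    "2x1080Ti, 7 min": (12800, 3400),
    "CPU only": (600, 153),
}
for setup, (fast, slow) in reported.items():
    print(f"{setup}: speed ratio ~{speed_ratio(fast, slow):.2f}")
```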
A randomly-chosen LZ #183 (a54cd) win (Elf got captured in a ladder! ):
Analysing a bit with Elfv1 myself, Elf thought it was doing slightly better (55%) at the n15 extend. LZ's n15 is a move which surprised and impressed me (though it shares features with a move Blackie showed me at KPMC), and Elf similarly thinks it's good (it starts off not considering it much, but by 1200 playouts thinks it's better than the normal l17 defence). Does this mean push and cut wasn't best? But the big swing happens (the q12 and q10 attachments surprised me but not Elf) with LZ starting the ko with s12 (Elf's win rate collapses to 13%) instead of r13 (Elf win 45%). I suppose r13 is more often the better-shape way to start the ko, but here it gives white decent local threats and a playable position. As for the ladder, at move 72 it thinks black will play the n8 atari, allowing white to trade with the p7 atari, then o7 connect and ko (k15 is an ignored threat), despite the p12 atari for the squeezy ladder having a few playouts and the ladder all the way to the edge of the board being the principal variation.
An Elf v1 win:
P.P.S Had a look at a few more LZ wins, quite a few are Elf blunders of ladders or shortage of liberties (though maybe Elf already thought it was losing from previous non-blunders so is doing bot harakiri). Seems like the 1600 playout limit is not enough to stop Elf making dumb mistakes, maybe Elf would do better at say 10k each. But in my experience it can be quite ladder blunder and blindspot-prone even at higher playouts.
Re: LZ's progression
Posted: Mon Oct 22, 2018 4:41 am
by Vargo
I've launched a 20 game match at 5 min per side per game with GPU 1080. Hopefully, result in 2-3 hours.
The first net is worse than the second
#157a v #181 (398 games)
         wins           black          white
#157a    194  48.74%     85  48.57%    109  48.88%
#181     204  51.26%     90  51.43%    114  51.12%
total                   175  43.97%    223  56.03%