
Re: LZ's progression

Posted: Sat Apr 13, 2019 10:44 pm
by nbc44
hoa803 wrote:It might provoke an interesting discussion - the folks on GitHub don't feel the time parity matches are a good measure of engine strength, but rather visits. I don't claim to totally understand the reasoning but it might be worth looking into.
I think these visit tests are "сферический конь в вакууме" (literally "a spherical horse in a vacuum"), known in the West as a spherical cow. :lol:

P.S. For https://github.com/leela-zero/leela-zer ... -482729494 right now:
#219 (-v 1600) vs elfv2 (-v 3200)
10 wins, 40 losses
50 games played.
:o

EDIT 1.

P.P.S. For https://github.com/leela-zero/leela-zer ... -482957781 right now:

#219 (-v 1600) vs elfv2 (-v 3200)
22 wins, 125 losses
157 games played.
:lol:

to be continued...

Re: LZ's progression

Posted: Sun Apr 14, 2019 12:09 am
by nbc44
Time parity match with statistically significant result :salute:
(part II).
LZ0v17 #219 vs Elfv2
2x1080ti, 10s per move.
C:\APPS\l0gpu17\validation.exe -k 219elfv2-10s -s "0:10" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 10 1" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 10 1"

Code:

#219 v elfv2 ( 400 games)
           wins        black       white
#219   155 38.75%   64 37.21%   91 39.91%
elfv2  245 61.25%  108 62.79%  137 60.09%
                   172 43.00%  228 57.00%
(part III).
LZ0v17 #219 vs Elfv2
2x1080ti, 3s per move.
C:\APPS\l0gpu17\validation.exe -k 219elfv2-3s -s "0:10" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 3 1" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 3 1"

Code:

#219 v elfv2 ( 403 games)
           wins        black       white
#219   153 37.97%   58 35.37%   95 39.75%
elfv2  250 62.03%  106 64.63%  144 60.25%
                   164 40.69%  239 59.31%
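
Whether results like 155/400 and 153/403 count as statistically significant can be checked with a quick calculation. This is only a minimal sketch: it assumes independent games with no draws and uses a normal approximation, and the function name `win_rate_check` is made up for illustration (it is not part of leelaz or validation.exe).

```python
import math

def win_rate_check(wins, games, z=1.96):
    """Return (win rate, Wilson interval low, Wilson interval high, z-score vs 50%)."""
    p = wins / games
    # Wilson score interval for a binomial proportion (z=1.96 is roughly 95%).
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    # z-score under the null hypothesis that both engines win 50% of the games.
    z_score = (wins - games / 2) / math.sqrt(games / 4)
    return p, centre - half, centre + half, z_score

# #219's results from the 10s and 3s matches above.
for label, wins, games in [("10s", 155, 400), ("3s", 153, 403)]:
    p, low, high, z_score = win_rate_check(wins, games)
    print(f"{label}: {p:.2%} (95% interval {low:.2%} to {high:.2%}), z = {z_score:.2f}")
```

In both matches the whole interval stays below 50%, so under these assumptions elfv2's lead is unlikely to be noise.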

Re: LZ's progression

Posted: Sun Apr 14, 2019 4:36 am
by hoa803
My suspicion is that time parity might be correct with two entirely separate machines, with the same hardware, and using ponder. Basically what seems to have been done in the AlphaZero paper? If I recall, they used 90 mins main time and 15s/move byo-yomi. I'm not as sure about doing it on a single machine with --noponder, however.

The reason I wonder is because I know that each move the NN makes is not independent of what it calculated on the previous move(s). We also know that the number of visits calculated on each position will vary wildly for a given amount of time. That will definitely add some serious randomness to the performance of an engine throughout a game. What I don't know is, does that even matter? (see: comment)

Ultimately I think it may be like football. Take the best teams in the world, say Manchester City, Barcelona, etc. Now change the rules of the game in some fundamental way. Maybe some other team will now be stronger.

That may be a poor analogy, but basically I'm saying that when we declare an engine stronger given test XYZ, all we're saying is that the statement is true under those exact conditions only. That is especially so when the engines are very similar in strength, as elf and lz appear to be at this point.

Re: LZ's progression

Posted: Mon Apr 15, 2019 7:02 am
by maf
of course, just an inconvenient truth

Re: LZ's progression

Posted: Mon Apr 15, 2019 1:51 pm
by nbc44
hoa803 wrote:My suspicion is that time parity might be correct with two entirely separate machines, with the same hardware, and using ponder.
What do you think about a test on one computer, with ponder and one dedicated GPU for each side? I believe this would be a more or less fair test.

Re: LZ's progression

Posted: Mon Apr 15, 2019 4:41 pm
by splee99
I think one computer only has one interface bus between the CPU and the GPU. So that part is actually shared, and the bot that uses less interface time will have the advantage.

Re: LZ's progression

Posted: Mon Apr 15, 2019 8:08 pm
by nbc44
splee99 wrote:I think one computer only has one interface bus between the CPU and the GPU. So that part is actually shared, and the bot that uses less interface time will have the advantage.
For 3200 visits and a GPU(!) client? It's funny, isn't it?

Re: LZ's progression

Posted: Mon Apr 15, 2019 11:03 pm
by nbc44
Visit parity match with statistically significant result :salute:
LZ0v17 #219 (1600 visits) vs Elfv2 (3200 visits) 2x1080ti
C:\APPS\l0gpu17\validation.exe -k 219-elfv2 -s "0:1" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g -v 1600 --gpu 0 --gpu 1 --noponder -t 24 -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 --gpu 1 --noponder -t 24 -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

Code:

#219 v elfv2 ( 400 games)
           wins        black       white
#219   129 32.25%   60 31.58%   69 32.86%
elfv2  271 67.75%  130 68.42%  141 67.14%
                   190 47.50%  210 52.50%
In my case, everything is very bad.
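
To put the 32.25% in more familiar terms, a win rate can be converted into an Elo difference. A rough sketch, assuming the standard logistic Elo model and no draws; `elo_difference` is an illustrative helper, not something leelaz provides:

```python
import math

def elo_difference(wins, games):
    """Elo difference implied by a win rate under the logistic Elo model."""
    p = wins / games
    if not 0 < p < 1:
        raise ValueError("win rate must be strictly between 0 and 1")
    return -400 * math.log10(1 / p - 1)

# #219 (1600 visits) against elfv2 (3200 visits): 129 wins in 400 games.
print(f"{elo_difference(129, 400):+.0f} Elo")
```

For this match that puts #219 roughly 130 Elo behind elfv2, and only at these visit counts.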

Re: LZ's progression

Posted: Wed Apr 17, 2019 12:26 am
by nbc44
Visit parity match
LZ0v17 #219 (1600 visits) vs Elfv2 (3200 visits) 1x1080ti per side + ponder
Part1 - #219 (GPU0) vs Elfv2 (GPU1)
C:\APPS\l0gpu17\validation.exe -k 219-elfv2-1gpu -s "0:1" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g -v 1600 --gpu 0 -t 24 -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 1 -t 24 -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

Code:

#219 v elfv2 ( 208 games)
           wins        black       white
#219    74 35.58%   34 34.69%   40 36.36%
elfv2  134 64.42%   64 65.31%   70 63.64%
                    98 47.12%  110 52.88%
Part2 - #219 (GPU1) vs Elfv2 (GPU0)
C:\APPS\l0gpu17\validation.exe -k 219-elfv2-1gpu -s "0:1" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g -v 1600 --gpu 1 -t 24 -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 -t 24 -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

Code:

#219 v elfv2 ( 208 games)
           wins        black       white
#219    81 38.94%   43 39.45%   38 38.38%
elfv2  127 61.06%   66 60.55%   61 61.62%
                   109 52.40%   99 47.60%
Summary:

#219 vs elfv2 (37.26%)
+155-261=0
:clap:
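
The summary line is just the two parts added together; swapping the GPUs between the parts means any difference between the two cards should roughly cancel out. A small sketch of the arithmetic (the names `parts` and `pooled_record` are only for illustration):

```python
# (wins for #219, wins for elfv2) in each part, taken from the tables above.
parts = {
    "Part1 (#219 on GPU0)": (74, 134),
    "Part2 (#219 on GPU1)": (81, 127),
}

def pooled_record(results):
    """Sum wins and losses over all parts and return (wins, losses, win rate)."""
    wins = sum(w for w, _ in results.values())
    losses = sum(l for _, l in results.values())
    return wins, losses, wins / (wins + losses)

wins, losses, rate = pooled_record(parts)
print(f"#219 vs elfv2: +{wins}-{losses} ({rate:.2%})")
```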

Re: LZ's progression

Posted: Wed Apr 17, 2019 2:14 am
by Aram
Is the difference in speed between the ELF network and the 40b network really 2x for you?
I know that theoretically that could be true, but if I've understood correctly, the difference isn't nearly that large in practice?


If you load the 40b network in leela and write netbench 50000, and then load the elf network and write netbench 50000,
do you really play those 50,000 playouts in half the time with the ELF network?

Re: LZ's progression

Posted: Wed Apr 17, 2019 3:35 am
by nbc44
Aram wrote:Is the difference in speed between the ELF network and the 40b network really 2x for you?
I know that theoretically that could be true, but if I've understood correctly, the difference isn't nearly that large in practice?


If you load the 40b network in leela and write netbench 50000, and then load the elf network and write netbench 50000,
do you really play those 50,000 playouts in half the time with the ELF network?
1). viewtopic.php?p=242577#p242577

2).

Code:

c:\apps\l0gpu17\leelaz.exe --precision single -t 24 --gpu 0 --gpu 1  -w C:\APPS\net\00ff08eb.gz

Leela: netbench 50000
50000 evaluations in 58.73 seconds -> 851 n/s

c:\apps\l0gpu17\leelaz.exe --precision single -t 24 --gpu 0 --gpu 1  -w C:\APPS\net\05dbca15.gz

Leela: netbench 50000
50000 evaluations in 29.81 seconds -> 1677 n/s
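
The ratio follows directly from the two netbench runs. A quick sketch of the arithmetic; which file is which network is inferred from the match commands earlier in the thread, and treating one visit as about one network evaluation is an assumption (caching and tree reuse will shift the real numbers):

```python
def evals_per_second(evaluations, seconds):
    """Network evaluations per second for one netbench run."""
    return evaluations / seconds

lz_40b = evals_per_second(50000, 58.73)  # 00ff08eb.gz, used for #219 in the matches
elf_v2 = evals_per_second(50000, 29.81)  # 05dbca15.gz, used for elfv2 in the matches
speed_ratio = elf_v2 / lz_40b

# Assumption: one visit costs about one network evaluation.
lz_seconds = 1600 / lz_40b
elf_seconds = 3200 / elf_v2

print(f"elfv2 evaluates {speed_ratio:.2f}x faster")
print(f"1600 visits on #219: ~{lz_seconds:.2f}s, 3200 visits on elfv2: ~{elf_seconds:.2f}s")
```

On these numbers 1600 vs 3200 visits comes out close to equal time per move on this hardware.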

Re: LZ's progression

Posted: Wed Apr 17, 2019 4:46 pm
by hoa803
NBC, if you're using visit parity you shouldn't use ponder. The time to reach that number of visits varies by position. Time parity matches can use ponder on separate hardware though, similar to how AlphaGo was tested.

There's a thread on GitHub with a visit "parity" (1600 vs 3200) match between 220 and elfv2. The result was inconclusive; it seems to indicate they're about the same strength at that visit count.

Early in the match LZ appeared to be stronger with over 95% confidence, but by the end the result evened out.
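
An early "95% confidence" that later fades is what you'd expect when the running score is checked many times during a match. The simulation below is only an illustration of that effect, not a reconstruction of the GitHub match: it assumes two exactly equal engines, independent games with no draws, and a check after every game from game 20 on. All names and parameters are made up for the example.

```python
import math
import random

def ever_looks_significant(games, rng, z_threshold=1.96, min_games=20):
    """Play `games` fair coin-flip games; report whether any running score crossed the threshold."""
    wins = 0
    for n in range(1, games + 1):
        wins += rng.random() < 0.5
        if n >= min_games:
            # z-score of the running result against a 50% win rate.
            z = (wins - n / 2) / math.sqrt(n / 4)
            if abs(z) > z_threshold:
                return True
    return False

def false_alarm_rate(matches=2000, games=400, seed=1):
    """Fraction of simulated equal-strength matches that looked significant at some point."""
    rng = random.Random(seed)
    hits = sum(ever_looks_significant(games, rng) for _ in range(matches))
    return hits / matches

print(f"{false_alarm_rate():.0%} of equal-strength matches looked significant at some point")
```

The fraction comes out well above the nominal 5%, so a mid-match lead at "95%" is weaker evidence than the same figure at a game count fixed in advance.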

Edit: a word

Re: LZ's progression

Posted: Wed Apr 17, 2019 8:20 pm
by splee99
My observation is that elfv2 is well trained to make sharp attacks in the early stage of a game. However, it does have many blind spots in complicated life-and-death situations, which LZ can take advantage of.

Re: LZ's progression

Posted: Tue Apr 23, 2019 10:22 am
by And
What is the strongest 15b network? edb61bc2, 0a963117 or another?

Re: LZ's progression

Posted: Tue Apr 23, 2019 4:48 pm
by hoa803
And wrote:What is the strongest 15b network? edb61bc2, 0a963117 or another?
There was a GitHub thread about a 15b trained on 40b a while back. Unsure if anyone is still doing it.

https://github.com/leela-zero/leela-zero/issues/2192