All times are UTC - 8 hours [ DST ]




 Post subject: Re: LZ's progression
Post #361 Posted: Sat Apr 13, 2019 10:44 pm 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
hoa803 wrote:
It might provoke an interesting discussion: the folks on GitHub don't feel that time parity matches are a good measure of engine strength and prefer matches at fixed visits. I don't claim to totally understand the reasoning, but it might be worth looking into.

I think these visit tests are a "spherical horse in a vacuum" (сферический конь в вакууме), known in the West as a spherical cow. :lol:

P.S. For https://github.com/leela-zero/leela-zero/issues/2330#issuecomment-482729494 right now:
#219 (-v 1600) vs elfv2 (-v 3200)
10 wins, 40 losses
50 games played.
:o

EDIT 1.

P.P.S. For https://github.com/leela-zero/leela-zero/issues/2330#issuecomment-482957781 right now:

#219 (-v 1600) vs elfv2 (-v 3200)
22 wins, 125 losses
157 games played.
:lol:

to be continued...


Last edited by nbc44 on Sun Apr 14, 2019 5:53 am, edited 1 time in total.
 Post subject: Re: LZ's progression
Post #362 Posted: Sun Apr 14, 2019 12:09 am 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
Time parity match with statistically significant result :salute:
(part II).
LZ0v17 #219 vs Elfv2
2x1080ti, 10s per move.
C:\APPS\l0gpu17\validation.exe -k 219elfv2-10s -s "0:10" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 10 1" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 10 1"

Code:
#219 v elfv2 ( 400 games)
           wins        black       white
#219   155 38.75%   64 37.21%   91 39.91%
elfv2  245 61.25%  108 62.79%  137 60.09%
                   172 43.00%  228 57.00%

(part III).
LZ0v17 #219 vs Elfv2
2x1080ti, 3s per move.
C:\APPS\l0gpu17\validation.exe -k 219elfv2-3s -s "0:10" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 3 1" -- C:\APPS\l0gpu17\leelaz --gtp-command "time_settings 0 3 1"

Code:
#219 v elfv2 ( 403 games)
           wins        black       white
#219   153 37.97%   58 35.37%   95 39.75%
elfv2  250 62.03%  106 64.63%  144 60.25%
                   164 40.69%  239 59.31%
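
One way to check the "statistically significant" claim for the two tables above is a confidence interval on #219's win rate. The sketch below is not part of the original test: it assumes the games are independent with no draws, and `wilson_interval` is a hypothetical helper using the standard Wilson score formula at roughly 95% (z = 1.96).

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate Wilson score interval for a win rate (z = 1.96 is about 95%)."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# Win counts for #219 taken from the 10s and 3s tables above.
for label, wins, games in [("10s/move", 155, 400), ("3s/move", 153, 403)]:
    low, high = wilson_interval(wins, games)
    print(f"{label}: #219 scored {wins}/{games}, interval {low:.3f} to {high:.3f}")
```

If the upper bound stays below 0.5, the deficit is unlikely to be noise under those assumptions, but it still only describes this hardware and these time settings.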


Attachments:
219elfv2-3s.zip [326.11 KiB]
Downloaded 514 times
219elfv2-10s.zip [342.38 KiB]
Downloaded 509 times
 Post subject: Re: LZ's progression
Post #363 Posted: Sun Apr 14, 2019 4:36 am 
Beginner

Posts: 19
Liked others: 0
Was liked: 2
My suspicion is that time parity might be correct with two entirely separate machines, with the same hardware, and using ponder. That seems to be basically what was done in the AlphaZero paper? If I recall, they used 90 mins main time and 15s/move byo-yomi. I'm not as sure about doing it on a single machine with --noponder, however.

The reason I wonder is because I know that each move the NN makes is not independent of what it calculated on the previous move(s). We also know that the number of visits calculated on each position will vary wildly for a given amount of time. That will definitely add some serious randomness to the performance of an engine throughout a game. What I don't know is, does that even matter? (see: comment)

Ultimately I think it may be like football. Take the best teams in the world, say Manchester City, Barcelona, etc. Now change the rules of the game in some fundamental way. Maybe some other team will now be stronger.

That may be a poor analogy, but basically I'm saying that when we declare an engine stronger given test XYZ, all we're really saying is that the statement is true under those exact conditions only, especially when the engines are very similar in strength, like elf and lz appear to be at this point.

 Post subject: Re: LZ's progression
Post #364 Posted: Mon Apr 15, 2019 7:02 am 
Dies in gote

Posts: 30
Liked others: 2
Was liked: 9
Rank: 3d
of course, just an inconvenient truth

 Post subject: Re: LZ's progression
Post #365 Posted: Mon Apr 15, 2019 1:51 pm 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
hoa803 wrote:
My suspicion is that time parity might be correct with two entirely separate machines, with the same hardware, and using ponder.

What do you think about a test on one computer, with ponder and one dedicated GPU for each side? I believe that would be a more or less fair test.

 Post subject: Re: LZ's progression
Post #366 Posted: Mon Apr 15, 2019 4:41 pm 
Dies with sente

Posts: 101
Liked others: 2
Was liked: 16
Rank: KGS 2 D
I think one computer has only one interface bus between the CPU and the GPUs. So that part is actually shared, and the bot using less bus time will gain more of an advantage.

 Post subject: Re: LZ's progression
Post #367 Posted: Mon Apr 15, 2019 8:08 pm 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
splee99 wrote:
I think one computer has only one interface bus between the CPU and the GPUs. So that part is actually shared, and the bot using less bus time will gain more of an advantage.

For 3200 visits and a GPU(!) client? That's funny, isn't it?

 Post subject: Re: LZ's progression
Post #368 Posted: Mon Apr 15, 2019 11:03 pm 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
Visit parity match with statistically significant result :salute:
LZ0v17 #219 (1600 visits) vs Elfv2 (3200 visits) 2x1080ti
C:\APPS\l0gpu17\validation.exe -k 219-elfv2 -s "0:1" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g -v 1600 --gpu 0 --gpu 1 --noponder -t 24 -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 --gpu 1 --noponder -t 24 -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

Code:
#219 v elfv2 ( 400 games)
           wins        black       white
#219   129 32.25%   60 31.58%   69 32.86%
elfv2  271 67.75%  130 68.42%  141 67.14%
                   190 47.50%  210 52.50%


In my case, everything is very bad.
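
For a rough sense of scale, the 129/400 score above can be converted into an Elo gap with the usual logistic Elo formula. This is only a sketch: `elo_diff` is a hypothetical helper, draws are ignored, and the number applies to this exact setup (1600 vs 3200 visits, -t 24) and nothing else.

```python
import math

def elo_diff(win_rate):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return -400 * math.log10(1 / win_rate - 1)

# #219's score from the visit parity table above.
wins, games = 129, 400
print(f"#219 (1600 visits) vs elfv2 (3200 visits): {elo_diff(wins / games):+.0f} Elo")
```

A negative value means #219 is behind under these conditions; a 50% score would map to 0.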


Attachments:
219-elfv2.zip [331.93 KiB]
Downloaded 494 times
 Post subject: Re: LZ's progression
Post #369 Posted: Wed Apr 17, 2019 12:26 am 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
Visit parity match
LZ0v17 #219 (1600 visits) vs Elfv2 (3200 visits) 1x1080ti per side + ponder
Part1 - #219 (GPU0) vs Elfv2 (GPU1)
C:\APPS\l0gpu17\validation.exe -k 219-elfv2-1gpu -s "0:1" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g -v 1600 --gpu 0 -t 24 -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 1 -t 24 -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

Code:
#219 v elfv2 ( 208 games)
           wins        black       white
#219    74 35.58%   34 34.69%   40 36.36%
elfv2  134 64.42%   64 65.31%   70 63.64%
                    98 47.12%  110 52.88%

Part2 - #219 (GPU1) vs Elfv2 (GPU0)
C:\APPS\l0gpu17\validation.exe -k 219-elfv2-1gpu -s "0:1" -g 6 -n C:\APPS\net\00ff08eb.gz -o "-g -v 1600 --gpu 1 -t 24 -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 -t 24 -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

Code:
#219 v elfv2 ( 208 games)
           wins        black       white
#219    81 38.94%   43 39.45%   38 38.38%
elfv2  127 61.06%   66 60.55%   61 61.62%
                   109 52.40%   99 47.60%

Summary:

#219 vs elfv2 (37.26%)
+155-261=0
:clap:
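
The summary line is just the two parts pooled together. A minimal sketch of that arithmetic, with the win/loss counts copied from the Part1 and Part2 tables and a hypothetical helper name `pool`:

```python
def pool(parts):
    """Sum (wins, losses) pairs and return total wins, total losses and the win rate."""
    wins = sum(w for w, _ in parts)
    losses = sum(l for _, l in parts)
    return wins, losses, wins / (wins + losses)

# (wins, losses) for #219 in Part1 (GPU0) and Part2 (GPU1).
wins, losses, rate = pool([(74, 134), (81, 127)])
print(f"#219 vs elfv2: +{wins}-{losses} ({rate:.2%})")
```

Pooling like this assumes the GPU swap does not matter; the per-part rates (35.58% and 38.94%) are close enough that this seems reasonable here.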


Attachments:
219-elfv2-1gpu-part2.zip [175.14 KiB]
Downloaded 476 times
219-elfv2-1gpu-part1.zip [173.27 KiB]
Downloaded 512 times
 Post subject: Re: LZ's progression
Post #370 Posted: Wed Apr 17, 2019 2:14 am 
Dies in gote

Posts: 53
Liked others: 3
Was liked: 33
Rank: KGS 2k
Is the difference in speed between the ELF network and the 40b network really 2x for you?
I know that theoretically that could be true, but if I've understood correctly, the difference isn't nearly that large in practice?


If you load the 40b network in leela and run netbench 50000, and then load the elf network and run netbench 50000,
do you really play those 50,000 playouts in half the time with the ELF network?

 Post subject: Re: LZ's progression
Post #371 Posted: Wed Apr 17, 2019 3:35 am 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
Aram wrote:
Is the difference in speed between the ELF network and the 40b network really 2x for you?
I know that theoretically that could be true, but if I've understood correctly, the difference isn't nearly that large in practice?


If you load the 40b network in leela and run netbench 50000, and then load the elf network and run netbench 50000,
do you really play those 50,000 playouts in half the time with the ELF network?


1). https://lifein19x19.com/viewtopic.php?p=242577#p242577

2).
Code:
c:\apps\l0gpu17\leelaz.exe --precision single -t 24 --gpu 0 --gpu 1  -w C:\APPS\net\00ff08eb.gz

Leela: netbench 50000
50000 evaluations in 58.73 seconds -> 851 n/s

c:\apps\l0gpu17\leelaz.exe --precision single -t 24 --gpu 0 --gpu 1  -w C:\APPS\net\05dbca15.gz

Leela: netbench 50000
50000 evaluations in 29.81 seconds -> 1677 n/s
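
The speed ratio follows directly from the two netbench timings above. A small sketch of the arithmetic (the helper name `evals_per_second` is made up for illustration):

```python
def evals_per_second(evaluations, seconds):
    """Throughput as reported by netbench: evaluations divided by wall time."""
    return evaluations / seconds

# Timings copied from the two netbench runs above.
lz_40b = evals_per_second(50000, 58.73)   # 00ff08eb.gz
elf = evals_per_second(50000, 29.81)      # 05dbca15.gz
ratio = elf / lz_40b
print(f"40b: {lz_40b:.0f} n/s, elf: {elf:.0f} n/s, elf is {ratio:.2f}x faster")
```

This measures raw network evaluations only, so it says nothing about how search overhead changes the ratio in real games.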

 Post subject: Re: LZ's progression
Post #372 Posted: Wed Apr 17, 2019 4:46 pm 
Beginner

Posts: 19
Liked others: 0
Was liked: 2
NBC, if you're using visit parity you shouldn't use ponder. The time to reach that number of visits varies by position. Time parity matches can use ponder on separate hardware though, similar to how AlphaGo was tested.

There's a thread on GitHub with a visit "parity" (1600 vs 3200) match between 220 and elfv2. The result was inconclusive, which seems to indicate they're about the same strength at that visit count.

Early in the match LZ appeared to be stronger with over 95% confidence, but by the end the result evened out.
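
An early "95% confident" lead that later evens out is what you would expect when a running score is checked after every game. The simulation below is a sketch, not a reconstruction of the GitHub match: it assumes two exactly equal engines, independent games, a normal-approximation test, and made-up names (`looks_significant`, `simulate`).

```python
import math
import random

def looks_significant(wins, games, z=1.96):
    """True if the win rate is more than z standard errors away from 50%."""
    if games == 0:
        return False
    se = math.sqrt(0.25 / games)
    return abs(wins / games - 0.5) > z * se

def simulate(games=400, first_check=20, trials=1000, seed=0):
    """Fraction of equal-strength matches that look significant at any point vs at the end."""
    rng = random.Random(seed)
    ever = at_end = 0
    for _ in range(trials):
        wins = 0
        flagged = False
        for n in range(1, games + 1):
            wins += rng.random() < 0.5  # coin-flip game between equal engines
            if n >= first_check and looks_significant(wins, n):
                flagged = True
        ever += flagged
        at_end += looks_significant(wins, games)
    return ever / trials, at_end / trials

ever_rate, end_rate = simulate()
print(f"significant at some point: {ever_rate:.1%}, significant at the end: {end_rate:.1%}")
```

The first number comes out well above the second, which is why a fixed game count decided in advance is a safer basis than watching the running confidence.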

Edit: a word


Last edited by hoa803 on Tue Apr 23, 2019 4:21 pm, edited 1 time in total.
 Post subject: Re: LZ's progression
Post #373 Posted: Wed Apr 17, 2019 8:20 pm 
Dies with sente

Posts: 101
Liked others: 2
Was liked: 16
Rank: KGS 2 D
My observation is that elfv2 is well trained to make sharp attacks in the early stage of a game. However, it has many blind spots in complicated life-and-death situations, which LZ can take advantage of.

 Post subject: Re: LZ's progression
Post #374 Posted: Tue Apr 23, 2019 10:22 am 
Gosei

Posts: 1348
Liked others: 202
Was liked: 203
What is the strongest 15b network? edb61bc2, 0a963117 or another?

 Post subject: Re: LZ's progression
Post #375 Posted: Tue Apr 23, 2019 4:48 pm 
Beginner

Posts: 19
Liked others: 0
Was liked: 2
And wrote:
What is the strongest 15b network? edb61bc2, 0a963117 or another?


There was a GitHub thread about a 15b trained on 40b a while back. Unsure if anyone is still doing it.

https://github.com/leela-zero/leela-zero/issues/2192

 Post subject: Re: LZ's progression
Post #376 Posted: Thu Apr 25, 2019 1:04 am 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
hoa803 wrote:
NBC, if you're using visit parity you shouldn't use ponder. The time to reach that number of visits varies by position. Time parity matches can use ponder on separate hardware though, similar to how AlphaGo was tested.

Why not? I'm using a separate GPU for each net.

hoa803 wrote:
There's a thread on GitHub with a visit "parity" (1600 vs 3200) match between 220 and elfv2. The result was inconclusive, which seems to indicate they're about the same strength at that visit count.


1). Wow. And you are right.
C:\APPS\l0gpu17\validation.exe -k 222-elfv2 -s "0:1" -g 7 -n C:\APPS\net\0407e5b5.gz -o "-g -v 1600 --gpu 0 --gpu 1 -t 1 --noponder -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 --gpu 1 -t 1 --noponder -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

Code:
#222 v elfv2 ( 414 games)
           wins        black       white
#222   203 49.03%   86 48.86%  117 49.16%
elfv2  211 50.97%   90 51.14%  121 50.84%
                   176 42.51%  238 57.49%

2). Hmm. Are you right?
C:\APPS\l0gpu17\validation.exe -k 222-elfv2 -s "0:1" -g 7 -n C:\APPS\net\0407e5b5.gz -o "-g -v 1600 --gpu 0 --gpu 1 -t 12 --noponder -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 --gpu 1 -t 12 --noponder -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

Code:
#222 v elfv2 ( 400 games)
           wins        black       white
#222   174 43.50%   82 43.16%   92 43.81%
elfv2  226 56.50%  108 56.84%  118 56.19%
                   190 47.50%  210 52.50%

3). Oh my God. You are definitely wrong.
C:\APPS\l0gpu17\validation.exe -k 222-elfv2 -s "0:1" -g 6 -n C:\APPS\net\0407e5b5.gz -o "-g -v 1600 --gpu 0 --gpu 1 -t 24 --noponder -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 --gpu 1 -t 24 --noponder -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

Code:
#222 v elfv2 ( 400 games)
           wins        black       white
#222   143 35.75%   58 33.53%   85 37.44%
elfv2  257 64.25%  115 66.47%  142 62.56%
                   173 43.25%  227 56.75%
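
To see whether the drop from -t 1 to -t 12 to -t 24 is more than noise, the three #222 scores above can be compared with a two-proportion z-test. This is a sketch under the assumption of independent games; the helper names are hypothetical, and the p-value uses the normal approximation.

```python
import math

def two_proportion_z(w1, n1, w2, n2):
    """z statistic for the difference between two win rates, using a pooled variance."""
    pooled = (w1 + w2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (w1 / n1 - w2 / n2) / se

def two_sided_p(z):
    """Two-sided p-value for a z statistic under the normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))

# (wins, games) for #222 at each thread setting, from the three tables above.
t1, t12, t24 = (203, 414), (174, 400), (143, 400)
for label, a, b in [("t1 vs t12", t1, t12), ("t1 vs t24", t1, t24)]:
    z = two_proportion_z(*a, *b)
    print(f"{label}: z = {z:.2f}, p = {two_sided_p(z):.4f}")
```

A small p-value for t1 vs t24 would support the point that the thread count alone shifts the result, even with visits fixed.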


Attachments:
222-elfv2-t24.zip [330.8 KiB]
Downloaded 552 times
222-elfv2-t12.zip [326.69 KiB]
Downloaded 530 times
222-elfv2-t1.zip [341.09 KiB]
Downloaded 546 times
 Post subject: Re: LZ's progression
Post #377 Posted: Thu Apr 25, 2019 4:15 am 
Dies in gote

Posts: 53
Liked others: 3
Was liked: 33
Rank: KGS 2k
So you have shown that by increasing the number of threads manually way above the default of the program (which should be the optimum in most cases) you make it play worse?

EDIT:
Or do you want to say that the 20-ish block ELF2 network scales better with threads? Does the 40b network regress with more threads or stay the same?


All in all, it seems a bit confusing that the thread count makes such a difference in playing quality when you are using a fixed number of visits?

 Post subject: Re: LZ's progression
Post #378 Posted: Thu Apr 25, 2019 5:31 am 
Dies in gote

Posts: 50
Liked others: 0
Was liked: 3
Aram wrote:
So you have shown that by increasing the number of threads manually way above the default of the program (which should be the optimum in most cases) you make it play worse?

EDIT:
Or do you want to say that the 20-ish block ELF2 network scales better with threads? Does the 40b network regress with more threads or stay the same?


All in all, it seems a bit confusing that the thread count makes such a difference in playing quality when you are using a fixed number of visits?


I don't know, it's very strange, but it's a fact.

 Post subject: Re: LZ's progression
Post #379 Posted: Thu Apr 25, 2019 6:42 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
I've not been following this thread for a while, so I don't know if this is relevant, but I do recall that when Facebook ran Elf it had more threads or batches than I did when trying to reproduce things. Given that Elf is observed to be quite blind-spotty in not considering enough choices, and that more threads means more independent randomness in choosing which variations to explore, it wouldn't surprise me if Elf benefitted more than LZ from more threads.

 Post subject: Re: LZ's progression
Post #380 Posted: Thu Apr 25, 2019 7:18 am 
Lives in gote

Posts: 337
Liked others: 22
Was liked: 97
Aram wrote:
All in all, it seems a bit confusing that the thread count makes such a difference
Maybe I can bring a little more confusion here ;-)

I had never thought of running this little experiment, but maybe there's something wrong here; the numbers seem weird.

Win 10, i9-12 core, 2x1080Ti

Code:
leelaz --gtp--benchmark -t XXX -w ...\223.gz --gpu 0 --gpu 1


XXX=1 ---> 214 n/s
XXX=4 ---> 610 n/s
XXX=12 ---> 731 n/s
XXX=36 ---> 1091 n/s
XXX=48 ---> 990 n/s
XXX=136 ---> 958 n/s
XXX=200 ---> 793 n/s

The maximum seems to be around t 36, but does it prove anything? :scratch:
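
For reference, here is the same benchmark expressed relative to its peak. It is only a tabulation of the numbers above (the helper name `best_setting` is made up), and it measures throughput, not playing strength.

```python
# Thread count -> evaluations per second, copied from the benchmark above.
bench = {1: 214, 4: 610, 12: 731, 36: 1091, 48: 990, 136: 958, 200: 793}

def best_setting(results):
    """Return the thread count with the highest measured throughput."""
    return max(results, key=results.get)

best = best_setting(bench)
for threads, nps in sorted(bench.items()):
    print(f"-t {threads:>3}: {nps:>4} n/s ({nps / bench[best]:.0%} of peak)")
```

Since the matches earlier in the thread used a fixed visit count, a faster setting would only shorten the games; whatever changes the win rate has to come from how the threads affect the search itself.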

Attachments:
t 1: t1.gif [ 25.93 KiB | Viewed 9461 times ]
t 36: t36.gif [ 60.2 KiB | Viewed 9461 times ]
t 200: t200.gif [ 93.19 KiB | Viewed 9461 times ]
