LZ's progression

nbc44
Dies in gote
Posts: 50
Joined: Sat Sep 15, 2018 2:34 am
GD Posts: 0
Been thanked: 3 times

Re: LZ's progression

Post by nbc44 »

hoa803 wrote:NBC, if you're using visit parity, you shouldn't use ponder. The time to reach that number of visits varies by position. Time parity matches can use ponder on separate hardware, though, similar to how AlphaGo was tested.
Why not? I'm using a separate GPU for each net.
hoa803 wrote:There's a thread on GitHub with a visit "parity" (1600 vs 3200) match between 220 and elfv2. The result was inconclusive, which seems to indicate they're about the same strength at that visit count.
1). Wow. And you are right.
C:\APPS\l0gpu17\validation.exe -k 222-elfv2 -s "0:1" -g 7 -n C:\APPS\net\0407e5b5.gz -o "-g -v 1600 --gpu 0 --gpu 1 -t 1 --noponder -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 --gpu 1 -t 1 --noponder -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

#222 v elfv2 ( 414 games)
           wins        black       white
#222   203 49.03%   86 48.86%  117 49.16%
elfv2  211 50.97%   90 51.14%  121 50.84%
                   176 42.51%  238 57.49%
2). Hmm. Are you right?
C:\APPS\l0gpu17\validation.exe -k 222-elfv2 -s "0:1" -g 7 -n C:\APPS\net\0407e5b5.gz -o "-g -v 1600 --gpu 0 --gpu 1 -t 12 --noponder -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 --gpu 1 -t 12 --noponder -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

#222 v elfv2 ( 400 games)
           wins        black       white
#222   174 43.50%   82 43.16%   92 43.81%
elfv2  226 56.50%  108 56.84%  118 56.19%
                   190 47.50%  210 52.50%
3). Oh my God. You are definitely wrong.
C:\APPS\l0gpu17\validation.exe -k 222-elfv2 -s "0:1" -g 6 -n C:\APPS\net\0407e5b5.gz -o "-g -v 1600 --gpu 0 --gpu 1 -t 24 --noponder -q -d --timemanage off --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g -v 3200 --gpu 0 --gpu 1 -t 24 --noponder -q -d --timemanage off --precision single -w " -- C:\APPS\l0gpu17\leelaz -- C:\APPS\l0gpu17\leelaz

#222 v elfv2 ( 400 games)
           wins        black       white
#222   143 35.75%   58 33.53%   85 37.44%
elfv2  257 64.25%  115 66.47%  142 62.56%
                   173 43.25%  227 56.75%
Attachments
222-elfv2-t24.zip (330.8 KiB)
222-elfv2-t12.zip (326.69 KiB)
222-elfv2-t1.zip (341.09 KiB)
Aram
Dies in gote
Posts: 53
Joined: Tue Jun 14, 2016 9:46 am
Rank: KGS 2k
GD Posts: 0
Has thanked: 3 times
Been thanked: 33 times

Post by Aram »

So you have shown that by increasing the number of threads manually way above the default of the program (which should be the optimum in most cases) you make it play worse?

EDIT:
Or do you want to say that the 20-ish block ELF2 network scales better with threads? Does the 40b network regress with more threads or stay the same?


All in all, it seems a bit confusing that the thread count makes such a difference in play quality when you are using a fixed number of visits.
nbc44
Post by nbc44 »

Aram wrote:So you have shown that by increasing the number of threads manually way above the default of the program (which should be the optimum in most cases) you make it play worse?

EDIT:
Or do you want to say that the 20-ish block ELF2 network scales better with threads? Does the 40b network regress with more threads or stay the same?


All in all, it seems a bit confusing that the thread count makes such a difference in play quality when you are using a fixed number of visits.
I don't know. It's very strange, but it's a fact.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Post by Uberdude »

I've not been following this thread for a while, so I don't know if this is relevant, but I do recall that when Facebook ran Elf it used more threads or batches than I did when trying to reproduce things. Elf is observed to be quite blind-spotty in not considering enough choices, and more threads means more independent randomness in choosing which variations to explore, so it wouldn't surprise me if Elf benefitted more than LZ from more threads.
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Post by Vargo »

Aram wrote:All in all, it seems a bit confusing that the thread count makes such a difference
Maybe I can bring a little more confusion here ;-)

I had never thought of running this little experiment, but maybe there's something wrong here: the numbers seem weird.

Win 10, 12-core i9, 2x 1080 Ti

leelaz --gtp --benchmark -t XXX -w ...\223.gz --gpu 0 --gpu 1
XXX=1   --->  214 n/s
XXX=4   --->  610 n/s
XXX=12  --->  731 n/s
XXX=36  ---> 1091 n/s
XXX=48  --->  990 n/s
XXX=136 --->  958 n/s
XXX=200 --->  793 n/s

The maximum seems to be around -t 36, but does it prove anything? :scratch:
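
To make the shape of these numbers easier to see, here is a small sketch that tabulates the speedup over -t 1 and the throughput per thread. The figures are copied from the list above; each is a single measurement, so this says nothing about run-to-run noise.

```python
# Thread count -> nodes per second, copied from the benchmark list above
nps = {1: 214, 4: 610, 12: 731, 36: 1091, 48: 990, 136: 958, 200: 793}

baseline = nps[1]
best_threads = max(nps, key=nps.get)  # thread count with the highest n/s

for threads, rate in nps.items():
    speedup = rate / baseline
    marker = "  <- highest" if threads == best_threads else ""
    print(f"-t {threads:>3}: {rate:>4} n/s, {speedup:.2f}x vs -t 1, "
          f"{rate / threads:6.1f} n/s per thread{marker}")
```

Raw n/s only measures search speed; it does not by itself show how playing strength at a fixed visit count changes with the thread count.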

t 1: t1.gif (25.93 KiB)
t 36: t36.gif (60.2 KiB)
t 200: t200.gif (93.19 KiB)
nbc44
Post by nbc44 »

The main question is what is more important to us - victory or honesty? :salute:
iopq
Dies with sente
Posts: 113
Joined: Wed Feb 27, 2019 11:19 am
Rank: 1d
GD Posts: 0
Universal go server handle: iopq
Has thanked: 11 times
Been thanked: 27 times

Post by iopq »

Did you set the batch number to half of the threads? You can get better perf.
Vargo
Post by Vargo »

Three last benchmarks:

Not specifying -t XXX seems to give slightly fewer n/s.
1.gif (39.75 KiB)
iopq wrote:Did you set the batch number to half of the threads?
I'm not sure it's better... but again, I find the effect of -t XXX rather bizarre, and these benchmarks may be flawed one way or another...
3.gif (73.69 KiB)

--precision half seems to be about the same as not specifying the precision.
2.gif (57.26 KiB)
:scratch: :scratch: :scratch:
iopq
Post by iopq »

Benchmark with batching; it would be faster than just threading.
Uberdude
Post by Uberdude »

LZ just beat Golaxy in the Fuzhou AI tournament :tmbup:
https://home.yikeweiqi.com/#/live/board/17523
Amtiskaw
Dies in gote
Posts: 38
Joined: Sun Apr 17, 2016 5:22 am
GD Posts: 0
Has thanked: 4 times
Been thanked: 20 times

Post by Amtiskaw »

Is there a way to download SGF from that site, and if yes, can someone post it here? :study:

Alright I think I found them...
Attachments
Leela Zero - Baduki.sgf (2.03 KiB)
Leela Zero - YikeBot.sgf (1.23 KiB)
Golaxy - Leela Zero.sgf (1.4 KiB)
Uberdude
Post by Uberdude »

I've previously hacked the SGF out of yike using browser dev tools; I don't know if there's an easier way.
splee99
Dies with sente
Posts: 101
Joined: Thu Nov 15, 2012 9:46 pm
Rank: KGS 2 D
GD Posts: 0
Has thanked: 2 times
Been thanked: 16 times

Post by splee99 »

iopq wrote:Benchmark with batching; it would be faster than just threading.
Could you please show me the command option for batching? It seems that Sabaki always chooses batch size 1 by default, while autogtp chooses something different.
Amtiskaw
Post by Amtiskaw »

Leela lost both of its semi-final games. I enjoyed watching the second one live; it had a rather drastic semeai, which sadly became 1-eye vs 0-eye...

Attachments
2019-04-27 LeelaZero vs Golaxy.sgf (1.39 KiB)
2019-04-27 Golaxy vs LeelaZero.sgf (1.83 KiB)
hoa803
Beginner
Posts: 19
Joined: Tue Apr 02, 2019 7:12 pm
GD Posts: 0
Been thanked: 2 times

Post by hoa803 »

nbc44 wrote:
hoa803 wrote:NBC, if you're using visit parity, you shouldn't use ponder. The time to reach that number of visits varies by position. Time parity matches can use ponder on separate hardware, though, similar to how AlphaGo was tested.
Why not? I'm using a separate GPU for each net.
Think about what you are trying to do in terms of mathematics. LZ has a chance to win a random game; let us call that probability P.

In a match with ponder turned on, you've introduced another variable: the total thinking time permitted for each engine due to the use of ponder. In a given game, either LZ or Elf is likely to get more overall thinking time. Since we already know that strength is directly related to thinking time, the chance of LZ winning a particular game is now the function P(x), where x is a random variable related to the strength at different thinking times.

That means that the statistical basis being used to evaluate strength is no longer valid, because with fixed visit count and ponder the result is a function of another random variable that we don't know anything about. The function P(x) is most likely Gaussian, but we don't know the standard deviation or anything along those lines. I'm not enough of a mathematician to know what that does to the conclusion over a 400 game match.
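
One way to probe that open question is a quick simulation. This is only a minimal sketch under stated assumptions: each game's win probability is drawn independently of every other game, and the 0.45 mean and 0.10 spread are made-up values chosen for illustration, not measurements from any of the matches in this thread.

```python
import random
import statistics

def play_match(games, win_prob_for_game, rng):
    """Return the number of wins in one match of `games` independent games."""
    return sum(rng.random() < win_prob_for_game(rng) for _ in range(games))

def fixed_prob(rng):
    # Every game has the same win probability (hypothetical value).
    return 0.45

def noisy_prob(rng):
    # Hypothetical model: per-game probability drawn from a Gaussian around
    # 0.45 with standard deviation 0.10, clamped to the range [0, 1].
    return min(1.0, max(0.0, rng.gauss(0.45, 0.10)))

rng = random.Random(0)  # fixed seed so the run is repeatable
fixed = [play_match(400, fixed_prob, rng) for _ in range(1000)]
noisy = [play_match(400, noisy_prob, rng) for _ in range(1000)]

for name, wins in (("fixed", fixed), ("noisy", noisy)):
    print(f"{name}: mean wins {statistics.mean(wins):.1f}, "
          f"std dev {statistics.stdev(wins):.1f}")
```

Under the independence assumption, the spread of 400-game totals stays close to the fixed-probability case, because each game is still a single win-or-loss draw whose chance is the average of P(x). What changes is the meaning of the result: it measures strength averaged over the thinking-time variation rather than at one fixed setting. If ponder systematically gave one engine more time, the result would be biased, and this sketch does not model that.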

Also - you should put your queries about thread and batch count to the actual programmers on GitHub. Again you are introducing variables that you don't understand. I think I've seen some discussion about both batch size and number of threads having an impact on performance. You should definitely ask if you want to understand what is going on. Maybe post your results and see what GCP says about it.