Life In 19x19
http://lifein19x19.com/

LZ's progression
http://lifein19x19.com/viewtopic.php?f=18&t=15718
Page 16 of 21

Author:  And [ Thu Feb 07, 2019 3:43 am ]
Post subject:  Re: LZ's progression

Does anyone know where to download LZ ZQ elf-2 and LZ ZQ elf-5?
https://github.com/breakwa11/GoAIRatings

Author:  Vargo [ Wed Feb 13, 2019 9:56 am ]
Post subject:  Re: LZ's progression

20 game match at time parity between
LZ0.16 #204 and LZ0.16 Elfv2
1x1080, twogtp 1.5.0, 5min per side and per game.

Elfv2 wins 13-7
All games by resignation, no error, no duplicate game.
Stats :
Attachment:
204 v elfv2.gif [ 52.85 KiB | Viewed 11413 times ]

Author:  Uberdude [ Wed Feb 13, 2019 10:07 am ]
Post subject:  Re: LZ's progression

Vargo, about how many playouts per move is this? The official LZ test was 1600 each and LZ won 65%.

Author:  Vargo [ Wed Feb 13, 2019 10:44 am ]
Post subject:  Re: LZ's progression

Uberdude wrote:
how many playouts per move is this?
With 5 min per side per game, only ~3.5 min/game is effectively used, which works out to ~2 s/move. That's similar to -v 1600 for #204 and -v 3000 for Elfv2, all of this on 1x1080.

Author:  Vargo [ Thu Feb 14, 2019 9:39 am ]
Post subject:  Re: LZ's progression

Another 10 game match between LZ0.16_#204 and LZ0.16_Elfv2 at time parity
2x1080Ti, 5 minutes per side per game (probably similar to -v 5000 for #204 and to -v 9000 for Elfv2)
twogtp 1.5.0, no pondering, komi 7.5, no duplicate game, no error.
Result : Elfv2 wins 7-3.

The games :
Attachment:
204_Elfv2.zip [9.67 KiB]
Downloaded 497 times
I've used "-alternate", so, #204 is B in the even numbered games, and #204 is W for the odd numbers.
(#204 only won the games numbered 1, 3, and 6)


The command lines and the stats :
Attachment:
elfv2v204.gif [ 50.97 KiB | Viewed 11353 times ]

Author:  nbc44 [ Sun Feb 17, 2019 9:47 pm ]
Post subject:  Re: LZ's progression

LZ0.16_#204 vs LZ0.16_Elfv2 2x1080Ti, 3s per move:
gogui-twogtp -black "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\05d10f27.gz -t 12 --gpu 0 --gpu 1 --noponder --precision single" -white "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\05dbca15.gz -t 12 --gpu 0 --gpu 1 --noponder --precision single" -games 100 -sgffile 204-elfv2 -auto -time 1s+4s/1 -komi 7.5 -verbose
gogui-twogtp -white "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\05d10f27.gz -t 12 --gpu 0 --gpu 1 --noponder --precision single" -black "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\05dbca15.gz -t 12 --gpu 0 --gpu 1 --noponder --precision single" -games 100 -sgffile elfv2-204 -auto -time 1s+4s/1 -komi 7.5 -verbose

Nothing interesting:

Code:
+28-72=0 (as black)
+34-66=0 (as white)

Total: +62-138=0

Attachments:
204-elfv2-stat.zip [5.72 KiB]
Downloaded 489 times

Author:  Vargo [ Tue Feb 19, 2019 3:10 am ]
Post subject:  Re: LZ's progression

nbc44 wrote:
Nothing interesting:
Why do you say that ? I find it very interesting, particularly considering it's 200 games :tmbup: !
________________________________________________________________________________________

New network #205

40 game match #205 v. Elfv2

1x1080, 5min per side and per game, no pondering, komi 7.5

Elfv2 wins 25-15 (62.5 %)

40 games :
Attachment:
elfV2_205.zip [34.9 KiB]
Downloaded 488 times

Command lines and stats (205 is B) :
Attachment:
Elfv2_205B.gif [ 96.14 KiB | Viewed 11051 times ]
Command lines and stats (205 is W) :
Attachment:
Elfv2_205W.gif [ 99.14 KiB | Viewed 11051 times ]

Author:  nbc44 [ Tue Feb 19, 2019 9:15 pm ]
Post subject:  Re: LZ's progression

Vargo wrote:
Why do you say that ? I find it very interesting, particularly considering it's 200 games :tmbup: !

I suppose the test result is predetermined.

Long test now:
LZ0.16_#205 vs LZ0.16_Elfv2 - 2x1080Ti, 120s (wow!) per move, (it will be 10 games):

+1-4=0 (#205 is black)
+1-4=0 (#205 is white)

Elfv2 wins 8-2 (80 %)

P.S.
Dragon tail loss :o :


Attachments:
File comment: logs-part2
elfv2-205.zip [1.21 MiB]
Downloaded 495 times
File comment: logs-part1
205-elfv2.zip [1.26 MiB]
Downloaded 464 times
games.zip [10.1 KiB]
Downloaded 445 times

Author:  Vargo [ Sun Mar 03, 2019 8:25 am ]
Post subject:  Re: LZ's progression

In another thread, @jlt wrote an interesting comment :
Quote:
... I would be surprised if, for some n, LeelaZero(n) didn't beat LeelaZero(n-10) more than 50% of the time.


The latest 40b network is #207. It's now 50 networks away from the last 15b, and 30+ networks from the last 20b.

20 game matches LZ(n) v. LZ(n-10) at time parity, 3 min/game and /side, 1x1080, komi 7.5, no pondering, LZ0.16, twogtp 1.5.0.

#207 v. #197 --> 12-8 (40b v. 40b)
#197 v. #187 --> 12-8 (40b v. 40b)
#187 v. #177 --> 15-5 (40b v. 40b)
#177 v. #167 --> 13-7 (40b v. 20b)
#167 v. #157 --> 5-15 (20b v. 15b)

And one more match : LZ(n) v. LZ(n-50)

#207 v. #157 --> 15-5 (40b v. 15b)

All games by resignation, no error, no duplicate game.

Average time was around 1.3 sec/move.

Below, the little hands point to the networks #157, #167, #177, etc.
Attachment:
elo2.gif [ 27.31 KiB | Viewed 10717 times ]

If someone wants the games or the stats, I'll upload them.

Author:  jlt [ Sun Mar 03, 2019 8:32 am ]
Post subject:  Re: LZ's progression

Yes, I should have added the condition "if LZ(n) and LZ(n-10) are networks of the same size". Changing the network size introduces some discontinuity. When 20-block networks were introduced, results were disappointing, that's why the LeelaZero project shifted to 40 blocks rather quickly.

Author:  Vargo [ Sun Mar 03, 2019 9:45 am ]
Post subject:  Re: LZ's progression

You're right, 15b #157 was a turning point, and 20b #158 was weaker.

Another 20 game match (just finished, with the same parameters) :
LZ(n) v. LZ(n-49)

#207 v. #158 --> 19-1 (40b v. 20b)

Not very surprising, but still... it's hard to pretend that LZ doesn't progress anymore ;-)

Author:  Vargo [ Mon Mar 04, 2019 10:41 am ]
Post subject:  Re: LZ's progression

100 game match : LZ(today) v. LZ(1 year ago)

One year ago, the best LZ network was #90 (6x128)
2 minutes per game and side, LZ0.16, twogtp 1.5.0 no pondering, komi 7.5, gpu : 1x1080

Try to guess the result :scratch:
NB. Because of the "-alternate" command, #207 is always named B, even though it was W 50 times.
Attachment:
90.jpg [ 331.19 KiB | Viewed 10583 times ]

Author:  Vargo [ Sat Mar 09, 2019 10:46 pm ]
Post subject:  Re: LZ's progression

What's the effect of the number of visits on a given network ?
For example, what would be the score of LZ#207 --visits=801 v. LZ#207 --visits=1601 ?

I ran such a match yesterday (#207 with --visits=1, --visits=401, --visits=801, --visits=1601, --visits=3201) but... the results were inconclusive, more than half the games were duplicates :sad: :sad: :sad:
Attachment:
dup.gif [ 46.78 KiB | Viewed 10288 times ]
probably because #207 knows all the tricks of #207 ;-)

______________________________________________________________________________

Anyway, there's a new network, #208.
20 game matches : #208 with various visits counts, and -m 40
-m 40 is used to have a bit more randomness in the first 40 moves, and so, avoid duplicate games.

Code:
gogui-twogtp -black "C:\PATH TO LZ\leelaz.exe --gtp --weights=C:\PATH TO NETWORKS\208.gz --noponder -m 40 -v yyyy" -white "C:\PATH TO LZ\leelaz.exe --gtp --weights=C:\PATH TO NETWORKS\208.gz --noponder -m 40 -v zzzz" -games 20 -sgffile XXX -auto -komi 7.5 -alternate
twogtp 1.5.0, LZ0.16, gpu:1x1080
no duplicate game, no error.

time/move seems to scale linearly :
-v 1 : ~0 sec/move
-v 401 : ~0.8
-v 801 : ~1.5
-v 1601 : ~3
-v 3201 : ~5 to 6
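
A quick way to check the "scales linearly" impression is a least-squares line through the figures above. This is only a sketch: the timings are the rough values listed, and "~5 to 6" is replaced by its midpoint 5.5, which is an assumption.

```python
# Visits per move and the approximate seconds per move listed above.
# Assumption: "~5 to 6" is represented by its midpoint, 5.5.
visits = [1, 401, 801, 1601, 3201]
seconds = [0.0, 0.8, 1.5, 3.0, 5.5]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(visits, seconds)
visits_per_second = 1.0 / slope

print(f"seconds per visit : {slope:.5f}")
print(f"fixed overhead (s): {intercept:.2f}")
print(f"visits per second : {visits_per_second:.0f}")

# Residuals show how far each rough timing sits from the fitted line.
for v, s in zip(visits, seconds):
    print(f"-v {v:5d}: measured ~{s:.1f}s, fitted {slope * v + intercept:.2f}s")
```

Small residuals compared with the timings themselves would support the linear reading; with only five rough points it is no more than a sanity check.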


Results :
Attachment:
208.gif [ 9.4 KiB | Viewed 10288 times ]



If someone wants all the stats (times, lengths, etc.), I'll upload them.


All the games :
The smaller number of visits is always B in the even numbered games (and W in the odd ones).
For example, 208_401_801-17 is game number 17 between #208 with 401 visits and #208 with 801 visits; the 401-visit side is W.

Attachment:
games.zip [143.93 KiB]
Downloaded 467 times

Author:  maf [ Sun Mar 10, 2019 10:37 am ]
Post subject:  Re: LZ's progression

Did a quick test using LZ207, p100 vs p1000, got 0:20. Nothing surprising, just fyi.

Author:  And [ Mon Mar 11, 2019 11:52 am ]
Post subject:  Re: LZ's progression

Several matches on 25x25. The nets were obtained with the program at https://drive.google.com/open?id=1bgkVB ... oXHUdDuqt7,
see https://github.com/leela-zero/leela-zero/issues/2240. 10 sec/move, CPU only, gogui-twogtp:
LM 192x15 GX89(25x25) - LZ 40x256 #205(25x25) 25:15
LZ 192x15 f438268e(25x25) - LZ 40x256 #205(25x25) 18:22
elf v2 256x20(25x25) - LZ 40x256 #205(25x25) 17:23; all 11 games that elf won as Black were won because of a ladder
The converted minigo (25x25) nets 000930-goliath and 000990-cormorant do not work in gogui or sabaki.
Can someone with a powerful GPU run a couple of matches?

Author:  Vargo [ Tue Mar 12, 2019 5:16 am ]
Post subject:  Re: LZ's progression

Here are some more 20 game matches of #208 v. #208, with --visits=6401

Same parameters, except for --gpu 0 --gpu 1 (2x1080Ti)
It shouldn't change anything.
No error , no duplicate game.

So, same table as before, with an extra line (6401 → ...)
Attachment:
6401.gif [ 12.09 KiB | Viewed 10831 times ]


Seems like more visits really make a difference. I find the score of 6401 v. 801 especially harsh :o !


The games between -v 6401 and -v 3201 (3201 is B in the even numbered games):
Attachment:
208_3201v6401.zip [17.77 KiB]
Downloaded 450 times


The stats for -v 6401 vs -v 3201 :
Attachment:
stats.gif [ 199.73 KiB | Viewed 10776 times ]

If someone wants the other stats or games, I can upload them.

Author:  moha [ Tue Mar 12, 2019 3:56 pm ]
Post subject:  Re: LZ's progression

Vargo wrote:
Seems like more visits really make a difference. I find the score of 6401 v. 801 especially harsh :o !
IIRC similar tests were posted on github a year ago, and that time double playouts seemed to give roughly 75% winrate. This coincides with performance distributions about one standard deviation apart, which in turn can explain quadruple and octuple visits behaviour (3sd->98%, though doubling visits is not the same as doubling playouts, and at high visits the relations may change as well).
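
To make the standard-deviation argument concrete, here is a minimal sketch under one particular reading of it, which is an assumption: each side's per-game performance is an independent normal variable with the same standard deviation, and every doubling shifts the mean by one such standard deviation. The function names are illustrative only.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def win_probability(gap_in_sd):
    """Chance that the stronger side performs better in a single game.

    Assumption: both performances are independent normals with unit
    standard deviation, so their difference has standard deviation sqrt(2).
    """
    return normal_cdf(gap_in_sd / math.sqrt(2.0))


# One, two and three doublings, i.e. gaps of 1, 2 and 3 standard deviations.
for doublings in (1, 2, 3):
    factor = 2 ** doublings
    print(f"x{factor} search: {win_probability(doublings):.1%}")
```

Under these assumptions a one-sd gap gives about 76% and a three-sd gap about 98%, which lines up with the 75% and 98% figures quoted above.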

Author:  nbc44 [ Wed Mar 13, 2019 2:14 pm ]
Post subject:  Re: LZ's progression

Time parity match.
LZ0.16 XXX and LZ0.16 Elfv2
2x1080ti, 60s per move.
C:\APPS\l0gpu16\validation.exe -n C:\APPS\net\XXX.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -n C:\APPS\net\05dbca15.gz -o "-g --gpu 0 --gpu 1 --noponder -t 24 -q -d --precision single -w" -- C:\APPS\l0gpu16\leelaz --gtp-command "time_settings 1 61 1" -- C:\APPS\l0gpu16\leelaz --gtp-command "time_settings 1 61 1" -k XXX-elfv2

1). #205
Code:
#205 v elfv2 ( 26 games)
           wins        black       white
#205    12 46.15%    2 50.00%   10 45.45%
elfv2   14 53.85%    2 50.00%   12 54.55%
                     4 15.38%   22 84.62%

2). #207
Code:
#207 v elfv2 ( 26 games)
           wins        black       white
#207    13 50.00%    7 53.85%    6 46.15%
elfv2   13 50.00%    6 46.15%    7 53.85%
                    13 50.00%   13 50.00%

3). #208
Code:
#208 v elfv2 ( 26 games)
           wins        black       white
#208     4 15.38%    1  9.09%    3 20.00%
elfv2   22 84.62%   10 90.91%   12 80.00%
                    11 42.31%   15 57.69%

4). #210
in progress...

Attachments:
l0-elfv2.zip [69.66 KiB]
Downloaded 434 times

Author:  Vargo [ Sun Mar 17, 2019 3:02 am ]
Post subject:  Re: LZ's progression

New network #212

Quick test about @jlt's law ;-)
(reminder : LZ#(n) is stronger than LZ#(n-10) at blocks and time parity)

added parameters -m 20, to avoid duplicate games, and -v 1601, to "standardize" the test.

50 games (#212 v. #202), no duplicate, no error.
Result : #212 wins 32-18 (64%)
__________________________________________________________________________

And now, how about a little controversy... :D :D

If #n wins 55% of its games against #n-1, and
If #n-1 wins 55% of its games against #n-2,and
...
and #n-9 wins 55% of its games against #n-10

#n should win 88% of its games against #n-10, but in this test, it wins only 64%...


In this case, it's as if the real average winrate of #n against #n-1 was only ~51.5% , and not 55%
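
The 88% and ~51.5% figures follow from the usual Elo (logistic) model, where rating gaps add up along a chain. A minimal sketch of that arithmetic, assuming this model holds exactly (the function names are just for illustration):

```python
import math


def elo_gap(win_rate):
    """Elo difference implied by a win rate under the logistic Elo model."""
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


def win_rate(gap):
    """Expected win rate for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


# Forward: ten steps of 55% each, with the Elo gaps simply added together.
step = elo_gap(0.55)
chained = win_rate(10 * step)
print(f"one step : {step:.1f} Elo")
print(f"ten steps: {10 * step:.1f} Elo -> {chained:.1%}")

# Backward: the observed 64% over ten steps, split into ten equal gaps.
implied_step = win_rate(elo_gap(0.64) / 10)
print(f"implied per-step win rate: {implied_step:.1%}")
```

The forward calculation gives about 88%, and the backward one about 51.4%, in line with the numbers above. Whether gaps really add up like this is exactly the assumption in question.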


Some caveats : -m 20 can alter results, and 50 games is not enough, but still, I remember @moha spoke about the primary source of Elo inflation being the amount of luck accumulated by the new networks in test matches. I think he was right.
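
To put a number on "50 games is not enough", one option is a Wilson score interval for the 32-18 result. This is a sketch only: it treats the games as independent with a fixed win probability, and the 95% level (z = 1.96) is my choice, not something from the test.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumptions: independent games with a constant win probability;
    z = 1.96 corresponds to roughly 95% confidence.
    """
    p = wins / games
    denom = 1.0 + z * z / games
    center = (p + z * z / (2.0 * games)) / denom
    half = z * math.sqrt(p * (1.0 - p) / games + z * z / (4.0 * games ** 2)) / denom
    return center - half, center + half


low, high = wilson_interval(32, 50)
print(f"32/50 wins: {low:.1%} to {high:.1%}")
```

The interval comes out at roughly 50% to 76%, so a 50-game sample cannot separate "barely stronger" from "clearly stronger".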

Code:
gogui-twogtp -black "C:\Users\jm\Desktop\gogui150\leela-zero-0.16-win64OK\leelaz.exe --gtp --weights=C:\Users\jm\Desktop\LZ_networks\212.gz --noponder --gpu 0 --gpu 1 -m 20 -v 1601" -white "C:\Users\jm\Desktop\gogui150\leela-zero-0.16-win64OK\leelaz.exe --gtp --weights=C:\Users\jm\Desktop\LZ_networks\202.gz --noponder --gpu 0 --gpu 1 -m 20 -v 1601" -games 50 -sgffile 212_202 -auto -komi 7.5 -alternate

The 50 games :
Attachment:
212_202.zip [43.7 KiB]
Downloaded 441 times
EDIT : #212 is B in the even numbered games, and W in the odd ones.

Author:  Uberdude [ Sun Mar 17, 2019 6:24 am ]
Post subject:  Re: LZ's progression

Vargo wrote:
If #n wins 55% of its games against #n-1, and
If #n-1 wins 55% of its games against #n-2,and
...
and #n-9 wins 55% of its games against #n-10

#n should win 88% of its games against #n-10, but in this test, it wins only 64%...

In this case, it's as if the real average winrate of #n against #n-1 was only ~51.5% , and not 55%



Why should it? That's an assumption e.g. Elo rating systems take to make the problem simple enough to tackle, but there's no logical 'should' about it. If Man City beat Arsenal 3-0 and Arsenal beat Chelsea 2-0 we can't say Man City should beat Chelsea 5-0.

All times are UTC - 8 hours [ DST ]