Life In 19x19 http://lifein19x19.com/ |
|
LZ's progression http://lifein19x19.com/viewtopic.php?f=18&t=15718 |
Page 16 of 21 |
Author: | And [ Thu Feb 07, 2019 3:43 am ] |
Post subject: | Re: LZ's progression |
Does anyone know where to download LZ ZQ elf-2 and LZ ZQ elf-5? https://github.com/breakwa11/GoAIRatings |
Author: | Vargo [ Wed Feb 13, 2019 9:56 am ] |
Post subject: | Re: LZ's progression |
20-game match at time parity between LZ0.16 #204 and LZ0.16 Elfv2: 1x1080, twogtp 1.5.0, 5 min per side per game.
Elfv2 wins 13-7. All games by resignation, no error, no duplicate game.
Stats : |
Author: | Uberdude [ Wed Feb 13, 2019 10:07 am ] |
Post subject: | Re: LZ's progression |
Vargo, about how many playouts per move is this? The official LZ test was 1600 each and LZ won 65%. |
Author: | Vargo [ Wed Feb 13, 2019 10:44 am ] |
Post subject: | Re: LZ's progression |
Uberdude wrote: how many playouts per move is this?
5 min per side per game is, in fact, ~3.5 min/game effectively used, which is ~2 s/move. It's similar to -v 1600 for #204 and -v 3000 for Elfv2, all this with 1x1080.
|
Author: | Vargo [ Thu Feb 14, 2019 9:39 am ] |
Post subject: | Re: LZ's progression |
Another 10-game match between LZ0.16_#204 and LZ0.16_Elfv2 at time parity: 2x1080Ti, 5 minutes per side per game (probably similar to -v 5000 for #204 and to -v 9000 for Elfv2), twogtp 1.5.0, no pondering, komi 7.5, no duplicate game, no error.
Result : Elfv2 wins 7-3.
The games : Attachment:
I've used "-alternate", so #204 is B in the even-numbered games and W in the odd-numbered ones (#204 only won the games numbered 1, 3, and 6).
The command lines and the stats : |
Author: | nbc44 [ Sun Feb 17, 2019 9:47 pm ] | ||
Post subject: | Re: LZ's progression | ||
LZ0.16_#204 vs LZ0.16_Elfv2, 2x1080Ti, 3s per move. Nothing interesting:
Code:
+28-72=0 (as black)
+34-66=0 (as white)
Total: +62-138=0
|
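As a rough check on how much a 200-game result such as +62-138 can tell us, here is a minimal sketch of a 95% Wilson score interval for a win rate. The function name `wilson_interval` and the choice of the Wilson method are assumptions made for illustration, not something used in the thread, and the calculation treats the games as independent (which duplicate games would violate).

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Return (low, high) bounds of the Wilson score interval for wins/games.

    z=1.96 corresponds to roughly 95% confidence. Assumes independent games.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half

if __name__ == "__main__":
    # 62 wins for #204 out of 200 games, as in the totals above
    low, high = wilson_interval(62, 200)
    print(f"#204 win rate: 31.0%, 95% interval {low:.1%} to {high:.1%}")
```

Under these assumptions the whole interval sits well below 50%, which is one way to see why a 200-game sample is far more informative than a 10- or 20-game one.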
Author: | Vargo [ Tue Feb 19, 2019 3:10 am ] |
Post subject: | Re: LZ's progression |
nbc44 wrote: Nothing interesting:
Why do you say that? I find it very interesting, particularly considering it's 200 games!
________________________________________________________________________________________
New network #205
40-game match #205 v. Elfv2: 1x1080, 5 min per side per game, no pondering, komi 7.5.
Elfv2 wins 25-15 (62.5 %).
40 games : Attachment:
Command lines and stats (205 is B) : |
Author: | nbc44 [ Tue Feb 19, 2019 9:15 pm ] | ||||
Post subject: | Re: LZ's progression | ||||
Vargo wrote: Why do you say that ? I find it very interesting, particularly considering it's 200 games !
I suppose the test result is predetermined.
Long test now: LZ0.16_#205 vs LZ0.16_Elfv2, 2x1080Ti, 120s (wow!) per move (it will be 10 games):
+1-4=0 (#205 is black)
+1-4=0 (#205 is white)
Elfv2 wins 8-2 (80 %)
P.S. Dragon tail loss :
|
Author: | Vargo [ Sun Mar 03, 2019 8:25 am ] |
Post subject: | Re: LZ's progression |
In another thread, @jlt wrote an interesting comment :
Quote: ... I would be surprised if, for some n, LeelaZero(n) didn't beat LeelaZero(n-10) more than 50% of the time.
The last 40b network is #207; it's now 50 networks away from the last 15b, and 30+ networks from the last 20b.
20-game matches LZ(n) v. LZ(n-10) at time parity, 3 min per game and per side, 1x1080, komi 7.5, no pondering, LZ0.16, twogtp 1.5.0.
#207 v. #197 --> 12-8 (40b v. 40b)
#197 v. #187 --> 12-8 (40b v. 40b)
#187 v. #177 --> 15-5 (40b v. 40b)
#177 v. #167 --> 13-7 (40b v. 20b)
#167 v. #157 --> 5-15 (20b v. 15b)
And one more match : LZ(n) v. LZ(n-50)
#207 v. #157 --> 15-5 (40b v. 15b)
All games by resignation, no error, no duplicate game. Average time was around 1.3 sec/move.
Below, the little hands point to the networks #157, #167, #177, etc.
If someone wants the games or the stats, I'll upload them. |
Author: | jlt [ Sun Mar 03, 2019 8:32 am ] |
Post subject: | Re: LZ's progression |
Yes, I should have added the condition "if LZ(n) and LZ(n-10) are networks of the same size". Changing the network size introduces some discontinuity. When 20-block networks were introduced, results were disappointing, that's why the LeelaZero project shifted to 40 blocks rather quickly. |
Author: | Vargo [ Sun Mar 03, 2019 9:45 am ] |
Post subject: | Re: LZ's progression |
You're right, 15b #157 was a turning point, and 20b #158 was weaker.
Another 20-game match (just finished, with the same parameters) : LZ(n) v. LZ(n-49)
#207 v. #158 --> 19-1 (40b v. 20b)
Not very surprising, but still... it's hard to pretend that LZ doesn't progress anymore |
Author: | Vargo [ Mon Mar 04, 2019 10:41 am ] |
Post subject: | Re: LZ's progression |
100-game match : LZ(today) v. LZ(1 year ago)
One year ago, the best LZ network was #90 (6x128).
2 minutes per game and side, LZ0.16, twogtp 1.5.0, no pondering, komi 7.5, gpu : 1x1080
Try to guess the result
NB. Because of the "-alternate" command, #207 is always named B, even though it was W 50 times. |
Author: | Vargo [ Sat Mar 09, 2019 10:46 pm ] |
Post subject: | Re: LZ's progression |
What's the effect of the number of visits on a given network? For example, what would be the score of LZ#207 --visits=801 v. LZ#207 --visits=1601?
I ran such a match yesterday (#207 with --visits=1, --visits=401, --visits=801, --visits=1601, --visits=3201) but... the results were inconclusive: more than half the games were duplicates.
______________________________________________________________________________
Anyway, there's a new network, #208.
20-game matches : #208 with various visit counts, and -m 40
-m 40 is used to have a bit more randomness in the first 40 moves, and so avoid duplicate games.
Code:
gogui-twogtp -black "C:\PATH TO LZ\leelaz.exe --gtp --weights=C:\PATH TO NETWORKS\208.gz --noponder -m 40 -v yyyy" -white "C:\PATH TO LZ\leelaz.exe --gtp --weights=C:\PATH TO NETWORKS\208.gz --noponder -m 40 -v zzzz" -games 20 -sgffile XXX -auto -komi 7.5 -alternate
twogtp 1.5.0, LZ0.16, gpu: 1x1080, no duplicate game, no error.
time/move seems to scale linearly :
-v 1 : ~0 sec/move
-v 401 : ~0.8
-v 801 : ~1.5
-v 1601 : ~3
-v 3201 : ~5 to 6
Results : Attachment: 208.gif
If someone wants all the stats (times, lengths, etc.), I'll upload them.
All the games : the smallest number of visits is always B in the even-numbered games (and W in the odd ones). For example, 208_401_801-17 is game number 17 between #208 with 401 visits and #208 with 801 visits; 401 visits is W.
Attachment:
|
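To put a number on "seems to scale linearly", here is a minimal sketch that fits a straight line to the five time/move figures quoted above by ordinary least squares. The helper name `fit_line` is hypothetical, and taking 5.5 s as the value for "~5 to 6" at -v 3201 is my assumption; the figures themselves are rough, so the fit is only indicative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Visits per move and approximate seconds per move, as quoted in the post.
# 5.5 is an assumed midpoint for "~5 to 6".
VISITS = [1, 401, 801, 1601, 3201]
SECONDS = [0.0, 0.8, 1.5, 3.0, 5.5]

if __name__ == "__main__":
    slope, intercept = fit_line(VISITS, SECONDS)
    print(f"about {slope * 1000:.2f} s per 1000 visits, offset {intercept:.2f} s")
    print(f"roughly {1 / slope:.0f} visits per second on this setup")
```

With these inputs the slope comes out near 1.7 s per 1000 visits, so the linear reading looks reasonable for 1x1080 at these settings.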
Author: | maf [ Sun Mar 10, 2019 10:37 am ] |
Post subject: | Re: LZ's progression |
Did a quick test using LZ207, p100 vs p1000, got 0:20. Nothing surprising, just fyi. |
Author: | And [ Mon Mar 11, 2019 11:52 am ] |
Post subject: | Re: LZ's progression |
Several 25x25 matches, with nets obtained with the program https://drive.google.com/open?id=1bgkVB ... oXHUdDuqt7 (see https://github.com/leela-zero/leela-zero/issues/2240), 10 sec/move, CPU only, gogui-twogtp:
LM 192x15 GX89(25x25) - LZ 40x256 #205(25x25) 25:15
LZ 192x15 f438268e(25x25) - LZ 40x256 #205(25x25) 18:22
elf v2 256x20(25x25) - LZ 40x256 #205(25x25) 17:23; black elf: all games (11) won because of the ladder
The converted minigo(25x25) nets 000930-goliath and 000990-cormorant do not work in gogui and sabaki.
Can someone with a powerful GPU run a couple of matches? |
Author: | Vargo [ Tue Mar 12, 2019 5:16 am ] |
Post subject: | Re: LZ's progression |
Here are some more 20-game matches of #208 v. #208, with --visits=6401.
Same parameters, except for --gpu 0 --gpu 1 (2x1080Ti). It shouldn't change anything. No error, no duplicate game.
So, same table as before, with an extra line (6401 → ...)
Attachment: 6401.gif
Seems like more visits really makes a difference. I find the score of 6401 v. 801 especially harsh!
The games between -v 6401 and -v 3201 (3201 is B in the even-numbered games): Attachment:
The stats for -v 6401 vs -v 3201 :
If someone wants the other stats or games, I can upload them. |
Author: | moha [ Tue Mar 12, 2019 3:56 pm ] |
Post subject: | Re: LZ's progression |
Vargo wrote: Seems like more visits really makes a difference, I find the score of 6401 v. 801 especially harsh !
IIRC similar tests were posted on github a year ago, and at that time doubling the playouts seemed to give a roughly 75% winrate. This coincides with performance distributions about one standard deviation apart, which in turn can explain the behaviour at quadruple and octuple visits (3sd->98%, though doubling visits is not the same as doubling playouts, and at high visits the relations may change as well).
|
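The "one standard deviation apart gives ~75%, three give ~98%" figures can be reproduced with a small sketch. The model is an assumption on my part about what is meant: each side's per-game performance is an independent normal variable with the same standard deviation, the side with the higher draw wins, and each doubling shifts the mean by one standard deviation. The function name `win_probability` is hypothetical.

```python
import math

def win_probability(gap_in_sd):
    """P(stronger side wins) when the two performance means differ by gap_in_sd.

    The difference of two independent normals with equal sd has standard
    deviation sd * sqrt(2), so the probability is Phi(gap / sqrt(2)),
    which simplifies to 0.5 * (1 + erf(gap / 2)).
    """
    return 0.5 * (1.0 + math.erf(gap_in_sd / 2.0))

if __name__ == "__main__":
    for doublings in (1, 2, 3):
        p = win_probability(doublings)
        print(f"{2 ** doublings}x playouts ({doublings} sd apart): {p:.1%}")
```

Under these assumptions one doubling gives about 76%, two give about 92%, and three give about 98%, in line with the rough figures quoted in the post.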
Author: | nbc44 [ Wed Mar 13, 2019 2:14 pm ] | ||
Post subject: | Re: LZ's progression | ||
Time parity match. LZ0.16 XXX and LZ0.16 Elfv2, 2x1080ti, 60s per move.
1). #205
Code:
#205 v elfv2 ( 26 games)
          wins            black           white
#205      12 46.15%       2 50.00%        10 45.45%
elfv2     14 53.85%       2 50.00%        12 54.55%
                          4 15.38%        22 84.62%
2). #207
Code:
#207 v elfv2 ( 26 games)
          wins            black           white
#207      13 50.00%       7 53.85%        6 46.15%
elfv2     13 50.00%       6 46.15%        7 53.85%
                          13 50.00%       13 50.00%
3). #208
Code:
#208 v elfv2 ( 26 games)
          wins            black           white
#208      4 15.38%        1 9.09%         3 20.00%
elfv2     22 84.62%       10 90.91%       12 80.00%
                          11 42.31%       15 57.69%
4). #210 in progress...
|
Author: | Vargo [ Sun Mar 17, 2019 3:02 am ] |
Post subject: | Re: LZ's progression |
New network #212
Quick test of @jlt's law (reminder : LZ#(n) is stronger than LZ#(n-10) at blocks and time parity).
Added parameters: -m 20, to avoid duplicate games, and -v 1601, to "standardize" the test.
50 games, no duplicate, no error.
Result : #212 wins 32-18 (64%)
__________________________________________________________________________
And now, how about a little controversy...
If #n wins 55% of its games against #n-1, and
if #n-1 wins 55% of its games against #n-2, and
...
and #n-9 wins 55% of its games against #n-10,
then #n should win 88% of its games against #n-10, but in this test it wins only 64%...
In this case, it's as if the real average winrate of #n against #n-1 was only ~51.5%, and not 55%.
Some caveats : -m 20 can alter results, and 50 games is not enough. But still, I remember @moha said that the primary source of Elo inflation is the amount of luck accumulated by the new networks in test matches. I think he was right.
Code:
gogui-twogtp -black "C:\Users\jm\Desktop\gogui150\leela-zero-0.16-win64OK\leelaz.exe --gtp --weights=C:\Users\jm\Desktop\LZ_networks\212.gz --noponder --gpu 0 --gpu 1 -m 20 -v 1601" -white "C:\Users\jm\Desktop\gogui150\leela-zero-0.16-win64OK\leelaz.exe --gtp --weights=C:\Users\jm\Desktop\LZ_networks\202.gz --noponder --gpu 0 --gpu 1 -m 20 -v 1601" -games 50 -sgffile 212_202 -auto -komi 7.5 -alternate
The 50 games : Attachment:
EDIT : #212 is B in the even-numbered games, and W in the odd ones.
|
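The 88% and ~51.5% figures follow if one assumes the Elo (logistic) model, in which rating differences add up along the chain, so win odds multiply. The sketch below shows that arithmetic only; the function names are hypothetical, and whether real networks behave transitively like this is exactly the assumption being made.

```python
def chained_winrate(step_winrate, steps):
    """Winrate across `steps` generations if each step has `step_winrate`.

    Under the Elo/logistic assumption, odds p / (1 - p) multiply along the chain.
    """
    odds = (step_winrate / (1.0 - step_winrate)) ** steps
    return odds / (1.0 + odds)

def implied_step_winrate(total_winrate, steps):
    """Inverse of chained_winrate: the per-step winrate implied by the total."""
    odds = (total_winrate / (1.0 - total_winrate)) ** (1.0 / steps)
    return odds / (1.0 + odds)

if __name__ == "__main__":
    print(f"ten 55% steps chain to {chained_winrate(0.55, 10):.1%}")
    print(f"64% over ten steps implies {implied_step_winrate(0.64, 10):.1%} per step")
```

Under that assumption, ten 55% steps chain to about 88%, and an observed 64% over ten steps corresponds to a per-step winrate of a little over 51%.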
Author: | Uberdude [ Sun Mar 17, 2019 6:24 am ] |
Post subject: | Re: LZ's progression |
Vargo wrote: If #n wins 55% of its games against #n-1, and If #n-1 wins 55% of its games against #n-2,and ... and #n-9 wins 55% of its games against #n-10 #n should win 88% of its games against #n-10, but in this test, it wins only 64%... In this case, it's as if the real average winrate of #n against #n-1 was only ~51.5% , and not 55%
Why should it? That's an assumption that e.g. Elo rating systems make to keep the problem simple enough to tackle, but there's no logical 'should' about it. If Man City beat Arsenal 3-0 and Arsenal beat Chelsea 2-0, we can't say Man City should beat Chelsea 5-0. |
All times are UTC - 8 hours [ DST ] |