Life In 19x19 http://lifein19x19.com/ |
|
LZ's progression http://lifein19x19.com/viewtopic.php?f=18&t=15718 |
Page 11 of 21 |
Author: | Uberdude [ Sat Nov 03, 2018 3:40 pm ] |
Post subject: | Re: LZ's progression |
Yakago wrote:
luigi wrote: I'm following Leela's progression with delight. Does anyone know what its current rating against human pros on "normal" hardware (let's say one GTX 1080) would be? Is it better than the #1 human on those conditions? If so, by how much?
Considering that even a few months ago LeelaZero was beating human professionals with 2-3 handicap stones, which it was not even built for, on consumer hardware (and since then there have been various improvements to handicap play), I would think that LeelaZero is well above any human on a GTX 1080. It is possible that you can find a blind spot with a ladder or similar though, depending on the time settings.

I agree LZ is now super-human on a GTX 1080 with, say, 30s a move, but I'm not actually aware of much direct evidence. Yakago, are you referring to Haylee? She is a lot weaker than a top pro. But yes, there's lots of transitive evidence: e.g. a while ago LZ bijxo got some wins in the match vs Golaxy, and Golaxy spanked all the humans it played (except that one loss). Reviewing pro games with LZ (and comparing with Elf), it appears to me that LZ is stronger and much more consistent. It is of course possible that the pro had a good response LZ didn't see that invalidates LZ's judgement, but I expect that's a minority of cases (and I can find some, particularly ladders). Or on wbaduk I see LZ beating up non-top Japanese pros, often winning by resignation in under 100 moves.

I did make an account on Fox to play with LZ, to try to play some of the 9ds there and hopefully eventually top pros to gather some direct evidence. But you start at 3d and need to win 20 games in a row to get a double rank promotion, so that's 60 games of grinding to 9d. Plus I felt a bit bad about beating up 3ds, even though my username says LeelaZero and I say I'm a bot so they can quit before the game counts if they don't want to play (but most are Chinese, so there's a language barrier).

I did actually play a few games this morning. In the one below I liked move 47, an armpit-hit kosumi approach to the 4-4 (usually a terrible beginner mistake), as a leaning attack to kill the group above, though of course my opponent helped with that unnecessary C2 defence. |
Author: | jlt [ Sun Nov 04, 2018 1:25 am ] |
Post subject: | Re: LZ's progression |
Some games of LZ#131 (network ecab83bb, 192x15) against Haylee last May were on relatively powerful hardware. The comments in https://online-go.com/game/12760703 indicate that at each move, the number of visits was typically 200000. On the other hand, https://online-go.com/game/12665694 indicates "1x1080Ti", and a number of visits less than 100000 in general, which is more reasonable. |
Author: | Vargo [ Sun Nov 04, 2018 7:02 am ] |
Post subject: | Re: LZ's progression |
20-game match between LZ 0.16 #187 and Elf v1 (1x1080, 5 min per side per game, no pondering, twogtp 1.4.10).
Elf v1 wins 13:7 (7 wins as B, 6 wins as W).
#187 won 35% of its games, not so bad (reminder: #184 had won only 30% of its 20 games against Elf v1). Maybe not significant, but it seems to go in the right direction.
Average length: 220 moves; average time used per side per game: 215" (about the same for both sides).
Attachment:
|
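A rough way to check the "maybe not significant" remark: the sketch below computes an exact two-sided binomial p-value for 7 wins out of 20 games, under the assumption that games are independent and that the null hypothesis is two equally strong engines (50% win probability). The function name is made up for illustration and was not part of the original test setup.

```python
from math import comb

def binom_two_sided_p(wins: int, games: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than
    the observed one, assuming independent games with win probability p.
    """
    pmf = [comb(games, k) * p**k * (1 - p) ** (games - k) for k in range(games + 1)]
    observed = pmf[wins]
    # Small tolerance so that symmetric outcomes are not lost to rounding.
    return sum(x for x in pmf if x <= observed * (1 + 1e-9))

# 7 wins for #187 in the 20-game match reported above.
p_value = binom_two_sided_p(7, 20)
print(f"two-sided p-value: {p_value:.3f}")
```

A p-value around 0.26 means a 13:7 score is quite compatible with two equally strong engines, so a 20-game match cannot by itself separate a 30% win rate from a 35% one.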
Author: | sorin [ Sun Nov 04, 2018 1:56 pm ] |
Post subject: | Re: LZ's progression |
splee99 wrote: There is a well-known saying among professional go players. When you have to kill a huge dragon to win, you are already far behind. It may still take more time for LZ to understand this. When LZ186 plays black, it almost always tries to kill a huge dragon. LZ185 is more modest and plays more defensive moves.
Recent Go AIs are known for not paying much attention to human pros' sayings, but rather for proving humans wrong. |
Author: | splee99 [ Sun Nov 04, 2018 4:55 pm ] |
Post subject: | Re: LZ's progression |
Well, overplay is always wrong. As we can see, LZ186 is well punished by LZ187, which is quite often in defensive mode. |
Author: | nbc44 [ Sun Nov 04, 2018 10:39 pm ] |
Post subject: | Re: LZ's progression |
Vargo wrote: 20 game match between LZ0.16 #187 and Elf v1 #187 won 35% of its games, not so bad
It looks like you are an optimist:
#187 (black) vs Elf v1: +11 -39
P.S. 3s per move, 2x1080ti
P.P.S. Games ##30-31 are Elf's ladder-blindness, so everything is very bad.
EDIT. #187 (white) vs Elf v1: +17 -33
In total: #187 vs Elf v1: +28 -72
P.S. If someone wants the games... |
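To put the two results on a common scale, a win rate can be converted into an approximate Elo gap. The sketch below assumes the standard logistic Elo model, which is an assumption brought in here and not something either poster used; the function name is made up for illustration.

```python
from math import log10

def elo_diff(score: float) -> float:
    """Elo difference implied by an expected score, logistic Elo model.

    score is the fraction of games won (0 < score < 1). A negative
    result means the player is rated below its opponent.
    """
    return -400 * log10(1 / score - 1)

# Win rates of #187 against Elf v1 taken from the two posts above.
short_match = elo_diff(7 / 20)    # 20 games at 5 min per side
long_match = elo_diff(28 / 100)   # 100 games at 3s per move
print(f"{short_match:.0f} Elo vs {long_match:.0f} Elo")
```

Under that model, 35% corresponds to roughly 110 Elo below Elf v1 and 28% to roughly 165 Elo below, though the two matches used different hardware and time settings, so the figures are not directly comparable.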
Author: | nbc44 [ Mon Nov 05, 2018 7:17 pm ] | ||
Post subject: | Re: LZ's progression | ||
On the other hand ("visit" parity test (-v 1601 -r 5)):
Code:
The first net is worse than the second
Elfv1 v #187 ( 74 games)
              wins          black         white
Elfv1      24 32.43%     15 34.88%      9 29.03%
#187       50 67.57%     28 65.12%     22 70.97%
                         43 58.11%     31 41.89%
|
Author: | Vargo [ Fri Nov 30, 2018 6:19 am ] |
Post subject: | Re: LZ's progression |
Nine 10-game matches at time parity:
157 v 158
157 v 160
157 v 165
157 v 170
157 v 175
157 v 180
157 v 185
157 v 190
157 v 193
All games and results: Attachment:
Below, a graph of the win percentages of #157, with a dashed linear fit, showing the "average progression" of the networks. For example, the second black dot from the left means: #157 has a win rate of 90% against #160.
(1x1080, 2 min per side per game, no pondering, komi 7.5, twogtp 1.5.0, LZ 0.16, #157 is always W in the even-numbered games)
The networks get better indeed!
I'm running the exact same experiment a second time... Results this evening or tomorrow. I hope the linear fits will look similar. |
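For anyone wanting to reproduce the dashed line, a least-squares fit of win rate against network number can be sketched as follows. The opponent numbers are the ones listed in the post, but the win rates here are SYNTHETIC placeholders generated from a formula, since the measured values are only in the attachment; the function name is also just illustrative.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Opponent network numbers as listed in the post.
opponents = [158, 160, 165, 170, 175, 180, 185, 190, 193]
# SYNTHETIC placeholder win rates for #157 (NOT the measured results):
# an exactly linear decline, used only to exercise the fit.
win_rates = [0.80 - 0.01 * (n - 158) for n in opponents]

slope, intercept = linear_fit(opponents, win_rates)
print(f"change in #157 win rate per network number: {slope:+.3f}")
```

With real data, the slope would be the average change in #157's win rate per network number, which is one way to read the "average progression" in the graph.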
Author: | Vargo [ Fri Nov 30, 2018 11:25 am ] |
Post subject: | Re: LZ's progression |
The second series of matches shows progress too, but not as much as the first one. (Same commands, same hardware.)
Games and results of the second series: Attachment:
The combined graph (nine 20-game matches): #157's average win rate goes down from ~75% to ~45%. |
Author: | Vargo [ Sat Dec 01, 2018 3:05 am ] |
Post subject: | Re: LZ's progression |
Third and last experiment, with another nine 10-game matches (I should have done the three experiments in one go, sorry...)
Attachment:
After these 270 games at short time parity (~1 sec/move with 1x1080), the combined graph shows real progress for the 40x256 networks. I think they are now better than #157 (the last 15x192), even for relatively fast games.
I've not plotted the linear fit, because I'm sure someone will tell me the long-term progression model is not linear (true, but it can be a good approximation, locally).
For example: the rightmost blue point means #157 wins 40% against #193. |
Author: | Vargo [ Sun Dec 09, 2018 5:13 am ] |
Post subject: | Re: LZ's progression |
20-game match: #194 v. #157 at time parity
5 min per side per game, komi 7.5, -r 10 for both, LZ 0.16, twogtp 1.5.0
PONDERING ENABLED for both (#194 using gpu 0 and #157 using gpu 1; gpu: 2x1080Ti)
#194 wins 12-8 (7 wins as W)
If someone wants the games or the stats, I'll upload them. |
Author: | splee99 [ Sun Dec 09, 2018 1:58 pm ] |
Post subject: | Re: LZ's progression |
Another strong opponent is Leela Master. I tested the 15b LM_GX88, which is indeed stronger than #193 in time-parity games. However, #194 is now much stronger than LM_GX88. |
Author: | nbc44 [ Mon Dec 10, 2018 3:19 pm ] |
Post subject: | Re: LZ's progression |
Forgotten test of "home-made" LeelaMaster_GX88 vs #157a net.
By visits (standard 3200):
Code:
The first net is better than the second
LeelaMaster_GX88.txt v fc5e0a50.gz ( 231 games)
                          wins          black         white
LeelaMaster_GX88.txt  136 58.87%     58 61.05%     78 57.35%
fc5e0a50.gz            95 41.13%     37 38.95%     58 42.65%
                                     95 41.13%    136 58.87%
By time-parity (3s per move):
Code:
The first net is worse than the second
LeelaMaster_GX88.txt v fc5e0a50.gz ( 308 games)
                          wins          black         white
LeelaMaster_GX88.txt  147 47.73%     53 46.90%     94 48.21%
fc5e0a50.gz           161 52.27%     60 53.10%    101 51.79%
                                    113 36.69%    195 63.31%
P.S. 2x1080ti |
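One way to judge how firm the two verdicts above are is a confidence interval on LeelaMaster_GX88's win rate. The sketch below uses the Wilson score interval at roughly 95% (z = 1.96), assuming independent games; this is an added illustration, not what the validation tool itself reports, and the function name is hypothetical.

```python
from math import sqrt

def wilson_interval(wins: int, games: int, z: float = 1.96):
    """Wilson score interval for a win rate (about 95% when z = 1.96)."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# LeelaMaster_GX88 win counts from the two tables above.
visits_lo, visits_hi = wilson_interval(136, 231)   # visit parity
time_lo, time_hi = wilson_interval(147, 308)       # time parity
print(f"visit parity: {visits_lo:.3f} to {visits_hi:.3f}")
print(f"time parity:  {time_lo:.3f} to {time_hi:.3f}")
```

Under these assumptions the visit-parity interval lies entirely above 50%, while the time-parity interval straddles 50%, so the second table's "worse" verdict is much weaker than the first table's "better".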
Author: | Vargo [ Fri Dec 14, 2018 7:47 am ] |
Post subject: | Re: LZ's progression |
20-game match between #195 and #157 (1x1080, 5 min per side per game, no pondering, komi 7.5, for both matches):
#195 wins 12-8 (average length 248 moves, average time used 243" and 239")
20-game match between #195 and Leela Master GX88 (average length 258 moves, average time used 242" and 237"):
#195 wins 12-8
250+ moves... it seems a lot; maybe it would have been fewer if I'd used -r 10?
If someone wants the games, I'll upload them.
For both matches I used -alternate, so there are automatic color changes which make the reports look weird (because of the "Colors Exchanged" column in the stats).
The stats: |
Author: | Vargo [ Tue Dec 18, 2018 7:57 am ] |
Post subject: | Re: LZ's progression |
Rn, with the LZ weights #191, got 2nd, just behind Golaxy, at the recent AI Ryusei 2018 (see HERE, where there's a link to download the binaries).
20-game match between Rn (with network #195) and LZ 0.16 with #195:
--device-id 0 --weights C:\...\195.gz --komi 7.5 --const-time 2 for Rn
--noponder --weights C:\...\195.gz for LZ
-time 1s+3s/1 -auto -komi 7.5 for both
twogtp 1.5.0, 1x1080. LZ used almost exactly 2 sec/move, and Ray used 1.8 sec/move.
Result: 10-10
No duplicate games or errors. All games were won by resignation, except for two games I discarded because I didn't understand their result... (see image)
Attachment: Attachment:
|
Author: | jlt [ Tue Dec 18, 2018 8:34 am ] |
Post subject: | Re: LZ's progression |
Can you explain a bit more? You said this is a 20-game match but you only posted the results of 11 games (games #0 --#10). On the other hand, the zip files you provided do contain 20 games (essai2-0...essai2-9, essai3-0...essai3-9). What is the correspondence between the table you provided and these 20 games? Also, concerning the two games whose result you don't understand, if you post them we could check manually to see if there is something wrong. |
Author: | Vargo [ Tue Dec 18, 2018 9:58 am ] |
Post subject: | Re: LZ's progression |
jlt wrote: Can you explain a bit more?
You're right, it's not very clear... I ran 20 games, 2 of them with dubious results, so I ran 2 more, and the match result (10-10) counts only the unequivocal games, all won by resignation. I just discarded the two strange games.
The zip "Rn is B" contains 10 games, with 6 wins for Rn as B, and the zip "Rn is W" contains 10 games, with 4 wins for Rn as W.
Concerning the curious (discarded) result in the picture, I think B sees itself as winner (Result_B: B+13.5), W sees itself as winner too (Result_W: W+10.5), and the 3rd column is Result_Referee: ?
I'll run some more games tomorrow, and I'll post the sgf of the dubious ones. |
Author: | Vargo [ Wed Dec 19, 2018 2:19 am ] |
Post subject: | Re: LZ's progression |
Here we are... Another 40-game match, Rn (#195) v. LZ 0.16 (#195).
36 games were won by resignation. I've looked at some of them, and they look ok.
Three games where both sides agree on the winner, but not on the score:
B sees B as winner (Result_B: B+9.5) and W sees B as winner too (Result_W: B+5.5), and the 3rd column is Result_Referee: ?
B sees W as winner (Result_B: W+4.5) and W sees W as winner too (Result_W: W+6.5), and the 3rd column is Result_Referee: ?
B sees W as winner (Result_B: W+48.5) and W sees W as winner too (Result_W: W+6.5), and the 3rd column is Result_Referee: ?
File: "rn195_lz195_2sec_c-5.sgf" in the zip
To my untrained eye, they look like unfinished business. I could understand a resignation, but not a precise score.
One game where Rn and LZ don't agree on the winner:
B sees B as winner (Result_B: B+11.5) and W sees W as winner too (Result_W: W+8.5), and the 3rd column is Result_Referee: ?
File: "rn195_lz195_2sec_c_LZ_B-5.sgf" in the zip
I'm calling the referee (that's you) on these games: how can you count points?
Anyway, discarding the 6 dubious games (out of 62 in all), it's a 56-game match at 2 sec/move, and the score is a perfect draw, at 28-28.
The 40 games: Attachment:
|
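The sorting of games into "agreed", "score disputed" and "winner disputed" described above can be sketched in a few lines. This assumes the Result_B and Result_W columns hold SGF-style result strings such as "B+9.5" or "W+R"; the function name and labels are made up for illustration and are not part of twogtp.

```python
def classify(result_b: str, result_w: str) -> str:
    """Compare the result reported by Black's engine with White's.

    Both arguments are assumed to be SGF-style strings: the winner's
    colour, a plus sign, then the margin (for example "B+9.5" or "W+R").
    """
    winner_b, margin_b = result_b.split("+", 1)
    winner_w, margin_w = result_w.split("+", 1)
    if winner_b != winner_w:
        return "winner disputed"
    if margin_b != margin_w:
        return "score disputed"
    return "agreed"

# The four (Result_B, Result_W) pairs quoted in the post above.
pairs = [("B+9.5", "B+5.5"), ("W+4.5", "W+6.5"),
         ("W+48.5", "W+6.5"), ("B+11.5", "W+8.5")]
labels = [classify(b, w) for b, w in pairs]
for (b, w), label in zip(pairs, labels):
    print(b, w, label)
```

Only the "winner disputed" case changes the match score; whether to also discard "score disputed" games, as was done here, is a judgment call.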
Author: | jlt [ Wed Dec 19, 2018 3:29 am ] |
Post subject: | Re: LZ's progression |
The problem is that the games are unfinished. Consider the last game, rn195_lz195_2sec_c_LZ_B-5.sgf.
Edit: I confused it with rn195_lz195_2sec_LZ_B-5.sgf.
Black was leading by 1.5 points. When Black played P3, White should have protected at P1, but instead connected at G6, so Black played the last move P1. If the game had continued normally, White would have protected at Q2, then Black O2, White O1, and Black captures 6 stones with O2 (snapback), White Q1, Black O1, White R6, and Black would have won by 14.5 points (or 15.5, I don't know which rules are used to calculate the score).
So to summarize:
1) Black was leading.
2) White made a gross blunder (but had a lost game anyway).
3) The game is not finished, but if continued Black would have won anyway.
4) The two players probably don't agree on the life-and-death status of the stone P1, so they calculate the score differently (but I still don't understand how White can come up with a result like W+8.5). |
Author: | Vargo [ Wed Dec 19, 2018 4:44 am ] |
Post subject: | Re: LZ's progression |
Thanks! You're absolutely right in your analysis of rn195_lz195_2sec_LZ_B-5.sgf, but LZ and Rn both agreed it was B+.
The weird one is rn195_lz195_2sec_c_LZ_B-5.sgf (sorry for the confusing names...) |
All times are UTC - 8 hours [ DST ] |