LZ's progression
Re: LZ's progression
"True" time parity test - #186 (white) vs #157a (3s per move, 2x1080ti).
New hope, part2:
29:21 
Finally: #186 vs #157a : +48-52=0
P.S. I cannot explain the results of my test. What is wrong with the black colour? V16, the "--precision half" option, too few games...?
Code:
gogui-twogtp -white "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\f50dc27c.gz -t 12 --gpu 0 --gpu 1 --noponder --precision half" -black "C:\APPS\l0gpu16\leelaz.exe --gtp --weights=C:\APPS\net\fc5e0a50.gz -t 12 --gpu 0 --gpu 1 --noponder --precision half" -games 50 -sgffile 157a-186 -auto -time 1s+4s/1 -komi 7.5 -verbose
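On the "too few games?" question, a rough sanity check is possible without any extra data. The sketch below (plain Python, standard library only; the function names are mine, not from any tool used in this thread) computes an exact two-sided binomial p-value against a 50% win rate and a 95% Wilson interval for the true win rate. It assumes the 100 games are independent and ignores colour.

```python
from math import comb, sqrt

def binom_two_sided_p(wins, games, p0=0.5):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed one."""
    pmf = [comb(games, k) * p0**k * (1 - p0)**(games - k)
           for k in range(games + 1)]
    observed = pmf[wins]
    # Small relative tolerance so ties are counted despite float rounding.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate (z=1.96 gives about 95%)."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

p_value = binom_two_sided_p(48, 100)   # the +48-52=0 result above
low, high = wilson_interval(48, 100)
print(f"p-value vs 50%: {p_value:.2f}, 95% interval: {low:.2f} to {high:.2f}")
```

With 48 wins in 100 games the interval comes out at roughly 38% to 58%, so this result cannot distinguish "somewhat weaker" from "somewhat stronger"; at this sample size only large differences show up.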
Attachments: 186-157a.zip (Part1 & Part2, 97.33 KiB)
-
splee99
- Dies with sente
- Posts: 101
- Joined: Thu Nov 15, 2012 9:46 pm
- Rank: KGS 2 D
- GD Posts: 0
- Has thanked: 2 times
- Been thanked: 16 times
Re: LZ's progression
There is a well-known saying among professional go players. When you have to kill a huge dragon to win, you are already far behind. It may still take more time for LZ to understand this. When LZ186 plays black, it almost always tries to kill a huge dragon. LZ185 is more modest and plays more defensive moves.
Re: LZ's progression
splee99 wrote: ... When LZ186 plays black, it almost always tries to kill a huge dragon. LZ185 is more modest and plays more defensive moves.
And what about #157a? It has the same results, so I disagree with you.
-
splee99
Re: LZ's progression
LZ157 uses 15 blocks, so it is unable to generate any huge dragon (at least not one as large as those generated by the 40-block networks). So there are fewer unexpected outcomes regarding the life and death of a large dragon.
-
Uberdude
- Judan
- Posts: 6727
- Joined: Thu Nov 24, 2011 11:35 am
- Rank: UK 4 dan
- GD Posts: 0
- KGS: Uberdude 4d
- OGS: Uberdude 7d
- Location: Cambridge, UK
- Has thanked: 436 times
- Been thanked: 3718 times
Re: LZ's progression
luigi wrote: I'm following Leela's progression with delight. Does anyone know what its current rating against human pros on "normal" hardware (let's say one GTX 1080) would be? Is it better than the #1 human under those conditions? If so, by how much?
Yakago wrote: Considering that even a few months ago, LeelaZero was beating human professionals with 2-3 handicap stones, which it was not even built for, on consumer hardware. (Since then, there have been various improvements to handicap play.) I would think that LeelaZero is well above any human on a GTX 1080. It is possible that you can find a blind spot with a ladder or similar though, depending on the time settings.
I agree LZ is now super-human on a GTX 1080 with say 30s a move, but I'm not actually aware of much direct evidence. Yakago, are you referring to Haylee? She is a lot weaker than a top pro. But yes, there's lots of transitive evidence, e.g. a while ago LZ bijxo got some wins in the match vs Golaxy, and Golaxy spanked all the humans it played (except that one loss). Reviewing pro games with LZ (and comparing with Elf), it appears to me that LZ is stronger and much more consistent, though it is of course possible the pro had a good response LZ didn't see that invalidates LZ's judgement, but I expect that's a minority (and I can find some, particularly ladders). Or on wbaduk I see LZ beating up non-top Japanese pros, often winning by resignation in under 100 moves.
I did make an account on Fox to play with LZ, to try to play some of the 9ds there and hopefully eventually top pros to gather some direct evidence. But as you start at 3d and need to win 20 games in a row to double rank promote, that's 60 games of grinding to 9d. Plus I felt a bit bad about beating up 3ds, even though my username says LeelaZero and I say I'm a bot so they can quit before the game counts if they don't want to play (but most are Chinese, so language barrier). I did actually play a few games this morning. In this one below I liked move 47, an armpit-hit kosumi approach to the 4-4 (that's usually a terrible beginner mistake), as a leaning attack to kill the group above, though of course my opp helped with that unnecessary c2 defence.
- jlt
- Gosei
- Posts: 1786
- Joined: Wed Dec 14, 2016 3:59 am
- GD Posts: 0
- Has thanked: 185 times
- Been thanked: 495 times
Re: LZ's progression
Some games of LZ#131 (network ecab83bb, 192x15) against Haylee last May were on relatively powerful hardware. The comments in https://online-go.com/game/12760703 indicate that at each move, the number of visits was typically 200000. On the other hand, https://online-go.com/game/12665694 indicates "1x1080Ti", and a number of visits less than 100000 in general, which is more reasonable.
-
Vargo
- Lives in gote
- Posts: 337
- Joined: Sat Aug 17, 2013 5:28 am
- GD Posts: 0
- Has thanked: 22 times
- Been thanked: 97 times
Re: LZ's progression
20 game match between LZ0.16 #187 and Elf v1
(1x1080, 5 min per side and game, no pondering, twogtp 1.4.10)
Elf v1 wins 13:7 (7 wins as B, 6 wins as W)
#187 won 35% of its games, not so bad
(reminder: #184 had won only 30% of its 20 games against Elf v1)
Maybe not significant, but it seems to go in the right direction.
Average length : 220 moves , average time used per side and game: 215" (about the same for both sides)
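Whether 35% (7 of 20) is really better than 30% (6 of 20) can be checked with Fisher's exact test. This is only a sketch: the function name is hypothetical, and it assumes the games are independent and that both 20-game matches were played under the same conditions.

```python
from math import comb

def fisher_two_sided(wins_a, games_a, wins_b, games_b):
    """Two-sided Fisher exact test for two win counts.

    Sums the hypergeometric probabilities of every split of the total
    wins that is no more likely than the observed split.
    """
    total_wins = wins_a + wins_b
    total_games = games_a + games_b
    lowest = max(0, total_wins - games_b)
    highest = min(total_wins, games_a)

    def weight(k):
        # Unnormalised hypergeometric weight, kept as an exact integer.
        return comb(total_wins, k) * comb(total_games - total_wins, games_a - k)

    observed = weight(wins_a)
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    return extreme / comb(total_games, games_a)

# #187: 7 wins in 20 games; #184: 6 wins in 20 games (both vs Elf v1)
print(fisher_two_sided(7, 20, 6, 20))
```

For 7/20 versus 6/20 the two-sided p-value is 1.0, i.e. the two results are statistically indistinguishable, which fits the "maybe not significant" caveat.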
-
sorin
- Lives in gote
- Posts: 389
- Joined: Wed Apr 21, 2010 9:14 pm
- Has thanked: 418 times
- Been thanked: 198 times
Re: LZ's progression
splee99 wrote: There is a well-known saying among professional go players. When you have to kill a huge dragon to win, you are already far behind. It may still take more time for LZ to understand this. When LZ186 plays black, it almost always tries to kill a huge dragon. LZ185 is more modest and plays more defensive moves.
Recent Go AIs are known for not paying much attention to human pros' sayings, but rather for proving humans wrong.
Sorin - 361points.com
Re: LZ's progression
Vargo wrote: 20 game match between LZ0.16 #187 and Elf v1
#187 won 35% of its games, not so bad
It looks like you are an optimist.
P.S. 3s per move, 2x1080ti
P.P.S. Games #30-31 show Elf's ladder blindness, so everything is very bad.
EDIT.
#187(white) vs Elf v1. : +17-33
#187 vs Elf v1. : +28-72
P.S. If someone wants the games...
Re: LZ's progression
On the other hand ("visit" parity test (-v 1601 -r 5)):
Code:
The first net is worse than the second
Elfv1 v #187 (74 games)
         wins          black         white
Elfv1    24  32.43%    15  34.88%     9  29.03%
#187     50  67.57%    28  65.12%    22  70.97%
                       43  58.11%    31  41.89%
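For scale, a win rate can be turned into an approximate Elo gap with the usual logistic formula, 400 * log10(p / (1 - p)). The snippet below is a back-of-the-envelope sketch (the function name is mine; it ignores colour, draws and sampling error) applied to the two #187 vs Elf v1 results above.

```python
from math import log10

def elo_gap(wins, games):
    """Approximate Elo difference implied by a win rate (logistic Elo model)."""
    p = wins / games
    if p <= 0.0 or p >= 1.0:
        raise ValueError("win rate must be strictly between 0 and 1")
    return 400.0 * log10(p / (1.0 - p))

time_parity = elo_gap(28, 100)   # #187 vs Elf v1 at 3s per move: +28-72
visit_parity = elo_gap(50, 74)   # #187 vs Elf v1 at -v 1601: 50 wins in 74 games
print(f"time parity: {time_parity:+.0f} Elo, visit parity: {visit_parity:+.0f} Elo")
```

Under these assumptions #187 is roughly 165 Elo behind Elf v1 at time parity but roughly 130 Elo ahead at visit parity, so the sign of the comparison flips with the test conditions.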
Attachments: elfv1-187-visit.zip (60.1 KiB)
-
Vargo
Re: LZ's progression
Ten 10 game matches at time parity :
157 v 158
157 v 160
157 v 165
157 v 170
157 v 175
157 v 180
157 v 185
157 v 190
157 v 193
All games and results:
Below, a graph of the win percentages of #157, with a dashed linear fit, showing the "average progression" of the networks.
For example, the second black dot from the left means: #157 has a win rate of 90% against #160.
(1x1080, 2 min per side per game, no pondering, komi 7.5, twogtp 1.5.0, LZ016, #157 is always W in the even-numbered games)
The networks are indeed getting better!
I'm running the exact same experiment a second time... Results this evening or tomorrow.
I hope the linear fits will look similar.
-
Vargo
Re: LZ's progression
The second series of matches shows progress too, but not as much as the first one.
(Same commands, same hardware)
Games and results of the second series:
The combined graph (ten 20-game matches):
#157's average win rate goes down from ~75% to ~45%.
-
Vargo
Re: LZ's progression
Third and last experiment with another ten 10-game matches (I should have done the three experiments in one go, sorry...)
After these 270 games at short time parity (~1 sec/move with 1x1080), the combined graph shows real progress for the 40x256 networks. I think they are now better than #157 (the last 15x192), even for relatively fast games.
I've not plotted the linear fit, because I'm sure someone will tell me the long-term progression model is not linear (true, but it can be a good approximation locally).
For example: the rightmost blue point means #157 wins 40% against #193.