 Post subject: so basically leela zero has made 0 progress in the last 4mo
Post #1 Posted: Wed Nov 21, 2018 6:55 am 
Dies in gote

Posts: 63
Liked others: 0
Was liked: 3
So basically Leela Zero has made 0 progress in the last 4 months, since #157, the last 15-block net.

On average hardware, that net still beats the latest 40-block net even at time parity.

Post #2 Posted: Wed Nov 21, 2018 7:27 am 
Lives with ko

Posts: 242
Liked others: 4
Was liked: 57
According to THIS SITE, network #157 (4 months ago) is rated 10.6d and #191 (the latest) is rated 12.85d.
More than 2 stones difference...

But I'm a bit like you: I doubt very much that #191 is over 2 stones stronger than #157 at time parity.
Maybe LZ is simply plateauing, but I hope I'm wrong...

Post #3 Posted: Wed Nov 21, 2018 7:59 am 
Judan

Posts: 6172
Location: Cambridge, UK
Liked others: 353
Was liked: 3333
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
hydrogenpi7 wrote:
on average hardware that still beats the latest 40block even on time parity


Please provide evidence of this (and say what you mean by average hardware and what time per move). A month ago you might have been correct, but the recent 40b networks now surpass #157 at game/analysis-realistic times on my GTX 1060. That is from playing around on Fox and from reviewing, particularly positions with ladders; I've not done a big match of 100+ games. It is not enough to just say "time parity": the extensive research documented at https://github.com/gcp/leela-zero/issues/1914 shows that in time parity tests at very short times the 40-block networks are better than #157 and have been for a while, whilst at more normal times (e.g. 20s a move on a 1060 GPU) #157 was stronger for quite a while, but that has been changing recently.

Vargo did a 20-game match of #185 vs #157 with equal time (5 min/game on a 1080) on 1st Nov and 185 won. viewtopic.php?p=238518#p238518. Or #181 beating #157 in a 200-game time parity match back on Oct 1st: viewtopic.php?p=237567#p237567.


This post by Uberdude was liked by: Gomoto
Post #4 Posted: Wed Nov 21, 2018 10:01 am 
Lives in sente

Posts: 1257
Liked others: 102
Was liked: 265
hydrogenpi7 wrote:
so basically leela zero has made 0 progress in the last 4months since the last 15 block 157 net

on average hardware that still beats the latest 40block even on time parity


One more shitty claim for the road

_________________
North Lecale

Post #5 Posted: Wed Nov 21, 2018 10:06 am 
Beginner

Posts: 18
Liked others: 16
Was liked: 3
Rank: KGS 6k
The new 40x256 nets are all way stronger. For example, net #157 got demolished by an unofficial net called 1fdfb1c5 (trained by bjiyxo), as you can see in the first line here:

http://zero.sjeng.org/network-profiles/ ... 96aea95a0c

This unofficial net got a 337 : 63 (84.25%) result against #157. And the current 40x256 nets are stronger than this older 1fdfb1c5 net.

The problem is that it is hard to compare different sized nets on the same hardware. Smaller nets get more playouts on the same hardware as bigger networks, which is an advantage. But bigger networks have more potential to evaluate and play the better moves.
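To put rough numbers on that trade-off, here is a minimal sketch using the benchmark speeds Vargo reports in this thread for a 2x1080Ti machine (520 n/s for #191, 1510 n/s for #157). The function name and the 5 seconds per move are illustrative assumptions, not anything Leela Zero itself exposes.

```python
def visits_per_move(nodes_per_second: float, seconds_per_move: float) -> int:
    """Approximate number of search nodes available for one move at a fixed time budget."""
    return int(nodes_per_second * seconds_per_move)


SECONDS_PER_MOVE = 5.0  # assumed time budget, for illustration only

small_net = visits_per_move(1510, SECONDS_PER_MOVE)  # 15-block #157
big_net = visits_per_move(520, SECONDS_PER_MOVE)     # 40-block #191
ratio = small_net / big_net

print(f"#157: {small_net} nodes, #191: {big_net} nodes, ratio {ratio:.2f}x")
```

So at time parity on that machine the smaller net searches roughly three times as many nodes per move, and the bigger net has to make up for that with better evaluations.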

Post #6 Posted: Wed Nov 21, 2018 10:52 am 
Lives with ko

Posts: 242
Liked others: 4
Was liked: 57
Tomorrow, I'll run a #191 vs #157 match at time parity, with reasonably long time settings.

I think #191 is stronger, but probably not by 2+ stones. At this level, it's a huge difference.


This post by Vargo was liked by: Gomoto
Post #7 Posted: Wed Nov 21, 2018 9:11 pm 
Gosei

Posts: 1541
Location: Hong Kong
Liked others: 45
Was liked: 521
GD Posts: 1292
I would like to think that bigger networks require better hardware for optimized results. It's like trying to solve math with a basic calculator vs an average computer. Basic addition and subtraction would be restricted to the input speed of the operator, but as we move up the scale to higher levels of mathematics the calculator will lose out. So in terms of progress with the use of a basic calculator, it would seem that moving up to higher levels of mathematics does not produce better results.

I have read in a Chinese go forum that with the new Leela Zero engine (0.16), it would take a 2x 1080Ti setup to be able to enjoy the real speed of the new algorithms.

Quoting the Chinese go forum, via Google Translate:
Quote:
The amount of calculation is large, the demand for graphics cards surpasses almost all large games. If you want to play a big weight like 40B, the two 1080TI is the entry configuration.

_________________
http://tchan001.wordpress.com
A blog on Asian go books, go sightings, and interesting tidbits
Go is such a beautiful game.

Post #8 Posted: Thu Nov 22, 2018 2:29 am 
Lives with ko

Posts: 242
Liked others: 4
Was liked: 57
10 game match #191 v. #157

no pondering, -r 10, komi 7.5

15 min games (per side and game) with 2x1080Ti
benchmark 520 n/s (#191) and 1510 n/s (#157)
average length : 243 moves
average time B : 642"
average time W : 645"
That's around 5-6 s/move, amounting to a reasonable 20" per move (???) for one average GPU.

Result : #191 wins 8-2

Too few games, but large margin.
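For a feel of how much "too few games" matters, here is a small sketch of an exact binomial tail: the chance of one side winning at least 8 of 10 games if the two nets were really equal. It assumes independent games with a 50% win chance each, which is a simplification (colour and opening repetition are ignored), and the function name is made up for this example.

```python
from math import comb


def binomial_tail(wins: int, games: int, p: float = 0.5) -> float:
    """P(X >= wins) for X ~ Binomial(games, p)."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


p_one_sided = binomial_tail(8, 10)
print(f"P(at least 8 wins out of 10 | equal strength) = {p_one_sided:.4f}")
```

That comes to 56/1024, a bit over 5%, so an 8-2 score is suggestive but does not rule out two nets of similar strength.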

If someone wants the games, I'll upload them


This post by Vargo was liked by 2 people: Gomoto, Uberdude
Post #9 Posted: Thu Nov 22, 2018 5:25 am 
Lives with ko

Posts: 183
Liked others: 25
Was liked: 60
Rank: 2d
Vargo wrote:
10 game match #191 v. #157

Result : #191 wins 8-2
I was curious, so I was trying to run something similar yesterday. While I got a similar result, the games all looked identical up to move 40 or so. If the same thing happened in your run, I don't think we can say anything about relative strengths - only that #157 misevaluates one particular fuseki.

I've tried to make an opening book for twogtp, consisting of a number of files with eight moves each, so that the programs would start from those positions. I've left it overnight, and it's not done, but so far the results are far more even, with #157 (playing as Black in every game so far, which you'd expect to be a disadvantage) winning slightly more often.

Hardware is a GTX 1060, and the full command line:

Code:
gogui-twogtp -black "/local/go/software/leela-zero/autogtp/leelaz -w  /local/go/software/leela-zero/autogtp/net157-15x192final-d351f06e.gz --noponder  -g " -white "/local/go/software/leela-zero/autogtp/leelaz -w  /local/go/software/leela-zero/autogtp/net191-53f805d1.gz --noponder  -g " -auto -games 60 -verbose -sgffile lz157-191d -time "1+1/8" -openings /local/go/openings/ -debugtocomment
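One way such eight-move opening files could be produced is by truncating existing game records. The sketch below is an assumption about the workflow, not the script actually used: it only handles simple SGF with no variations, expects the move to be the first property of each node, skips passes, and writes a fixed header (19x19, komi 7.5) chosen for illustration.

```python
import re

# Matches move nodes such as ";B[pd]" or ";W[dp]" (19x19 coordinates a-s).
MOVE_RE = re.compile(r";\s*([BW])\[([a-s]{2})\]")


def first_moves_sgf(sgf_text: str, n_moves: int = 8) -> str:
    """Return a minimal SGF containing only the first n_moves moves of sgf_text."""
    moves = MOVE_RE.findall(sgf_text)[:n_moves]
    body = "".join(f";{colour}[{coord}]" for colour, coord in moves)
    return f"(;GM[1]FF[4]SZ[19]KM[7.5]{body})"


sample = "(;GM[1]SZ[19];B[pd];W[dp];B[pp];W[dd];B[fq];W[cn];B[dr];W[cq];B[iq];W[qf])"
opening = first_moves_sgf(sample)
print(opening)
```

Each truncated record would then be saved as its own file in the directory passed to -openings.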


This post by bernds was liked by 2 people: Bill Spight, Gomoto
Post #10 Posted: Thu Nov 22, 2018 7:44 am 
Lives with ko

Posts: 242
Liked others: 4
Was liked: 57
bernds wrote:
While I got a similar result, the games all looked identical up to move 40 or so.
Nice to see that you have similar results. There's no duplicate game in my 10 games.
Below is a picture of the 10 games at move 60, they don't look too identical.
Attachment:
xx.jpg [ 319.82 KiB ]


This post by Vargo was liked by 2 people: dfan, Drew
Post #11 Posted: Thu Nov 22, 2018 8:46 am 
Gosei

Posts: 1423
Liked others: 705
Was liked: 469
Rank: AGA 3k KGS 1k Fox 1d
GD Posts: 61
KGS: dfan
tchan001 wrote:
I have read in a Chinese go forum that with the new Leela zero engine (0.16), it would take a 2x 1080ti setup to be able to enjoy the real speed of the new algorithms.

I am not sure why they say that. Leela Zero is already strong amateur strength with just a few visits (which you can easily get with no GPU at all). Uberdude's setup seems to be performing quite well with a single graphics card that is less powerful than a 1080Ti.

Post #12 Posted: Thu Nov 22, 2018 9:22 am 
Judan

Posts: 6172
Location: Cambridge, UK
Liked others: 353
Was liked: 3333
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
dfan wrote:
tchan001 wrote:
I have read in a Chinese go forum that with the new Leela zero engine (0.16), it would take a 2x 1080ti setup to be able to enjoy the real speed of the new algorithms.

I am not sure why they say that. Leela Zero is already strong amateur strength with just a few visits (which you can easily get with no GPU at all). Uberdude's setup seems to be performing quite well with a single graphics card that is less powerful than an 1080Ti.


I think what they mean is this:
- LZ 0.16 has some optimisations that make LeelaZero faster
- Some of these optimisations are only available on modern top-notch graphics cards (e.g. reduced precision floating point arithmetic: basically, GPUs used to do 32-bit floating point arithmetic by default, but if your neural network only needs 16-bit or even 8-bit precision and you can pack two 16-bit operations into the space of one 32-bit one, then you can go twice as fast)
- So on some crappy old CPU, 0.16 is probably not much faster than 0.15, if at all
- On my 1060 GPU it is quite a bit faster (how much exactly seems to vary)
- On a 1080Ti or the next-generation 2080s it will be faster still.

So "enjoy" means "derive the maximum benefit". I find less than the maximum benefit still enjoyable :) I mean even a 5-year old PC with LZ will in less than a second give you a good selection of candidate moves based on "shape intuition" comparable to that of a super-high dan player who didn't bother to do any reading or glance across the board for ladders.
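The "two 16-bit values where one 32-bit value used to fit" idea can be illustrated on the storage side with plain Python. This is only a sketch of the number formats: it says nothing about how a GPU schedules the arithmetic or what LZ 0.16 does internally, and the variable names are made up for the example.

```python
import struct

value = 0.1

as_fp32 = struct.pack("<f", value)             # one single-precision float
as_fp16 = struct.pack("<e", value)             # one half-precision float
two_halves = struct.pack("<2e", value, 0.2)    # two half-precision floats

roundtrip32 = struct.unpack("<f", as_fp32)[0]
roundtrip16 = struct.unpack("<e", as_fp16)[0]

print(len(as_fp32), len(as_fp16), len(two_halves))  # sizes in bytes
print(f"fp32 error: {abs(roundtrip32 - value):.2e}")
print(f"fp16 error: {abs(roundtrip16 - value):.2e}")
```

Two half-precision values occupy the same 4 bytes as one single-precision value, at the cost of a visibly larger rounding error.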


This post by Uberdude was liked by 2 people: dfan, Gomoto
Post #13 Posted: Thu Nov 22, 2018 3:46 pm 
Lives in sente

Posts: 1282
Location: Earth
Liked others: 468
Was liked: 209
So basically hydrogenpi has made 0 progress evaluating leela zero appropriately in the last 4mo ;-)


This post by Gomoto was liked by 7 people: abcd_z, Charlie, pnprog, Satorian, sorin, Uberdude, zermelo
Post #14 Posted: Fri Nov 23, 2018 5:22 am 
Lives with ko

Posts: 183
Liked others: 25
Was liked: 60
Rank: 2d
bernds wrote:
I've tried to make an opening book for twogtp, consisting of a number of files with eight moves each, so that the programs would start from those positions. I've left it overnight, and it's not done, but so far the results are far more even, with #157 (playing as Black in every game so far, which you'd expect to be a disadvantage) winning slightly more often.

So, that experiment is complete now. Each program got to play the same initial position both with Black and White.

With #157 as Black: #157 17-14 #191
With #157 as White: #157 12-19 #191

In all, a narrow victory for #191, 33-29. Not enough to demonstrate that #191 has made significant progress. The earlier results suggest it knows something in the very early opening that #157 doesn't, but for analyzing arbitrary positions, I'd say there does not seem to be a big strength difference.

Possible sources of error - the machine wasn't completely unloaded, but the whole thing ran for more than 24 hours, so I'd expect errors from that to average out. The command line was shown above so if there was a setup error people should be able to spot it. I'd be happy if someone could try to reproduce the results.
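To make "not enough to demonstrate significant progress" a bit more concrete, here is a sketch that computes an exact two-sided binomial p-value for the 33-29 score and the Elo gap it would imply. It assumes independent games and an even 50% baseline, and the helper names are only for this example.

```python
from math import comb, log10


def two_sided_p(wins: int, games: int) -> float:
    """Exact two-sided binomial p-value against a 50% win rate."""
    k = max(wins, games - wins)
    tail = sum(comb(games, i) for i in range(k, games + 1)) / 2**games
    return min(1.0, 2 * tail)


def elo_diff(wins: int, losses: int) -> float:
    """Elo difference implied by a win/loss ratio under the logistic Elo model."""
    return 400 * log10(wins / losses)


p_value = two_sided_p(33, 62)
elo = elo_diff(33, 29)
print(f"p-value: {p_value:.2f}, implied Elo difference: {elo:.0f}")
```

The p-value comes out far above the usual 0.05 threshold and the implied gap is only a couple of dozen Elo, so the match is consistent with the two nets being about equal on these openings.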

Here is an example of an opening position that #157 managed to win with both colors.
Attachment:
game1.sgf [260.04 KiB]

Attachment:
game2.sgf [306.19 KiB]

Post #15 Posted: Fri Nov 23, 2018 5:48 am 
Lives in gote

Posts: 306
Location: Deutschland
Liked others: 264
Was liked: 125
Rank: EGF 4 kyu
bernds wrote:
bernds wrote:
I've tried to make an opening book for twogtp,...


With #157 as Black: #157 17-14 #191
With #157 as White: #157 12-19 #191


Perhaps your opening book is biased to favour black.

If you ran the test for more iterations (so that each network gets several opportunities to play each opening as both black and white) we might have enough data to perform statistical tests to theorise on whether the colour or the network is more significant.
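Even before more games are played, the four posted numbers can be tallied by colour and by network. The sketch below is only bookkeeping, assuming each posted score reads (#157 wins)-(#191 wins) for that colour assignment; it is not a proper significance test, and the variable names are made up.

```python
# (black network, white network) -> (black wins, white wins), from the posted scores
results = {
    ("157", "191"): (17, 14),
    ("191", "157"): (19, 12),
}

wins_by_colour = {"black": 0, "white": 0}
wins_by_network = {"157": 0, "191": 0}

for (black_net, white_net), (black_wins, white_wins) in results.items():
    wins_by_colour["black"] += black_wins
    wins_by_colour["white"] += white_wins
    wins_by_network[black_net] += black_wins
    wins_by_network[white_net] += white_wins

total_games = sum(wins_by_colour.values())
print(f"games: {total_games}")
print(f"by colour:  {wins_by_colour}")
print(f"by network: {wins_by_network}")
```

On these numbers Black took 36 of 62 games while #191 took 33 of 62, so the colour split is at least as lopsided as the network split.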

Another option would be to "harvest" an unbiased opening book -- perhaps from Leela Zero match games, available from the web site. You could select openings at move 8 that are most common (given symmetry) and most even for each colour, based on match game outcomes.
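Counting openings "given symmetry" could be sketched as below: map every move sequence to one canonical representative under the 8 board symmetries and count those. The 0-based (x, y) coordinates, the helper names, and the toy data are all assumptions for illustration; transpositions (same position, different move order) and win-rate filtering are not handled.

```python
from collections import Counter

SIZE = 19


def transforms(x: int, y: int) -> list:
    """The 8 symmetries of the square board applied to one 0-based point."""
    m = SIZE - 1
    return [
        (x, y), (m - x, y), (x, m - y), (m - x, m - y),
        (y, x), (m - y, x), (y, m - x), (m - y, m - x),
    ]


def canonical(opening: tuple) -> tuple:
    """Lexicographically smallest of the 8 symmetric images of a non-empty move sequence."""
    images = zip(*(transforms(x, y) for x, y in opening))
    return min(tuple(image) for image in images)


# Toy data: the first two openings are mirror images of each other.
openings = [
    ((15, 3), (3, 15)),
    ((3, 3), (15, 15)),
    ((15, 3), (3, 3)),
]
counts = Counter(canonical(o) for o in openings)
print(counts.most_common())
```

The most common canonical sequences would then be the candidates to check for even results in the match game archive.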

Post #16 Posted: Fri Nov 23, 2018 6:05 am 
Lives with ko

Posts: 183
Liked others: 25
Was liked: 60
Rank: 2d
Charlie wrote:
Perhaps your opening book is biased to favour black.
This is possible, but since both sides got to play each position with both colors, it should not make a difference for testing relative strengths. I did not pick the openings completely at random. They were from pro games, picked with an eye towards looking reasonably even, but if one side made one or two moves that an AI perhaps wouldn't, like 3-3 point openings, it was White. The idea being that LZ evaluates an empty board as favourable for White, and if one could construct more even positions it wouldn't be a bad thing for this sort of test.
Here are the first evaluations in each file where LZ#191 was black:
Code:
NN eval=0.445787
NN eval=0.457268
NN eval=0.466394
NN eval=0.461104
NN eval=0.510580
NN eval=0.487815
NN eval=0.460059
NN eval=0.464494
NN eval=0.452350
NN eval=0.453560
NN eval=0.489030
NN eval=0.461185
NN eval=0.462972
NN eval=0.503706
NN eval=0.444272
NN eval=0.513425
NN eval=0.464524
NN eval=0.433751
NN eval=0.482730
NN eval=0.449405
NN eval=0.470484
NN eval=0.518139
NN eval=0.450151
NN eval=0.474539
NN eval=0.460860
NN eval=0.407501
NN eval=0.511947
NN eval=0.468585
NN eval=0.454147
NN eval=0.459028
NN eval=0.474770
That suggests most of the evaluations were around the 0.46 point that the programs think is the evaluation for an empty board. Now, it's possible that these evaluations don't reflect actual winning percentages, and that would be an interesting find if it could be shown.
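As a quick check of the "around 0.46" reading, the listed values can be parsed and summarised. This is a small sketch over the numbers pasted above; the regular expression and variable names are just for this example.

```python
import re
from statistics import mean, stdev

log_text = """
NN eval=0.445787 NN eval=0.457268 NN eval=0.466394 NN eval=0.461104
NN eval=0.510580 NN eval=0.487815 NN eval=0.460059 NN eval=0.464494
NN eval=0.452350 NN eval=0.453560 NN eval=0.489030 NN eval=0.461185
NN eval=0.462972 NN eval=0.503706 NN eval=0.444272 NN eval=0.513425
NN eval=0.464524 NN eval=0.433751 NN eval=0.482730 NN eval=0.449405
NN eval=0.470484 NN eval=0.518139 NN eval=0.450151 NN eval=0.474539
NN eval=0.460860 NN eval=0.407501 NN eval=0.511947 NN eval=0.468585
NN eval=0.454147 NN eval=0.459028 NN eval=0.474770
"""

evals = [float(v) for v in re.findall(r"NN eval=([0-9.]+)", log_text)]
above_half = sum(e > 0.5 for e in evals)

print(f"openings: {len(evals)}")
print(f"mean eval: {mean(evals):.3f}, std dev: {stdev(evals):.3f}")
print(f"openings where Black is rated above 50%: {above_half}")
```

The mean lands a little under 0.47, and only 5 of the 31 openings are rated above 50% for Black.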

Post #17 Posted: Fri Nov 23, 2018 6:08 am 
Judan

Posts: 6172
Location: Cambridge, UK
Liked others: 353
Was liked: 3333
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
Something I picked up from my Google Translate of Chinese kibitz whilst playing as LeelaZero on Fox is that they use different bots to play as black and white, as some are better at one colour than the other. Elfv1 is unusual in that it thinks black is winning on the empty board, so maybe it's better playing as black? Regarding bernds's test, here are some possibly relevant musings:

As white in parallel 4-4, #157 likes to back off with a knight's move after the approach:
[go]$$B
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . . . X . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . 2 . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . . . X . . . |
$$ | . . . . . 1 . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]


#188 and other recent 40b nets like to counter-approach (as Elfv1 does too: four 4-4 corners followed by four approaches is a very common Elf fuseki, and we have seen pros playing it recently as well). But if white plays the knight's-move answer instead, there is not much loss of winrate.
[go]$$B
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . . . X . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . . . X . . . |
$$ | . . . . . 1 . . . . . . . 2 . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]


Elfv1 thinks the knight's-move answer is a -7% mistake (in a Facebook thread recently, Nikola Mitic reported that some pros reckon 10% for Elf in the opening is about 1 point) and that black's 3-3 invasion punishes it. LZ 188 thinks white is still good here.
[go]$$B
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . 3 . . . . . . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . . . X . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . 2 . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . . . X . . . |
$$ | . . . . . 1 . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]


So it appears that LZ, from #157 to the recent 40b nets, is becoming more like Elfv1. If we assume Elf is "correct" (at least when it plays) that this 3-3 after the knight's-move response is good for black, then #157 will willingly play white in this position whilst 40b won't, and 40b is probably better at making black win from here, though maybe both win more than 50% as black.

Post #18 Posted: Fri Nov 23, 2018 11:25 am 
Lives with ko

Posts: 242
Liked others: 4
Was liked: 57
Another 10 game match between #157 and #191, same PC, same time parity, but with twogtp v1.5.0 and LZ 0.16 (thanks to baduk1 for the workaround!)
So, much better benchmarks (845 n/s for #191 and 2677 n/s for #157)
Same commands as last match.

Average length : 250 moves
Average time per game : 678" for #191 and 662" for #157

All games by resignation, no duplicate game.
At move 60, the 10 games look different from one another :
Attachment:
191.jpg [ 185.37 KiB ]
And the result is... hum....
5-5 Go figure... :scratch:
Attachment:
191_is_W_on_odd_.zip [9.63 KiB]
(I used -alternate, so #191 is W only in the odd numbered games)

Post #19 Posted: Fri Nov 23, 2018 5:38 pm 
Dies with sente

Posts: 94
Liked others: 2
Was liked: 15
Rank: KGS 2 D
Observing the success of Leela-master, I have doubts about the real potential of the "zero" approach. I have no doubt that the zero approach can produce bots playing at superhuman level, but the bot is evidently strongest at the micro-strategy level. For macro strategy, such as how to choose the opening moves, especially in handicap games, it may take a great many self-play games for the bot to learn. With that in mind, even Leela Zero has to use ELF games in its training to make reasonable progress.

Post #20 Posted: Sat Nov 24, 2018 1:51 am 
Lives with ko

Posts: 242
Liked others: 4
Was liked: 57
In another thread, @splee99 said :
Quote:
If you have two GPUs, why don't you try assigning GPU0 to 181 and GPU1 to 157? I know this would make the speed slower, but it will make a fair match because some data may be cached in a GPU during the game.

Good idea.
10 game match #191 v #157 at 5min per game per side, but with pondering enabled for both,
result 6:4 in favor of #191


All the stats and commands :
Attachment:
stats.jpg [ 123.01 KiB ]


The games (191 is B only in even numbered games)
Attachment:
191v157_5m_ponder_191isBfor_even_num.zip [9.92 KiB]
