LZ's progression

Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

Last one for today ;-) :

20-game match LZ15#160 v LZ15_ELF2 at time parity.

5 minutes per game (1xGTX1080), noponder, komi 7.5

ELF2 wins 18:2 (90%, 10 wins as W, 8 wins as B)
MikeKyle
Lives with ko
Posts: 205
Joined: Wed Jul 26, 2017 2:27 am
Rank: EGF 2k
GD Posts: 0
KGS: MKyle
Has thanked: 49 times
Been thanked: 36 times

Re: LZ's progression

Post by MikeKyle »

Great stuff. Thanks for sharing.

Almost off topic, but it seems like you have a method that could be useful for generating large numbers of Bot vs Bot games. I'd be interested to play with that. Are you using a freely available tool? Or something you've developed yourself?
sorin
Lives in gote
Posts: 389
Joined: Wed Apr 21, 2010 9:14 pm
Has thanked: 418 times
Been thanked: 198 times

Re: LZ's progression

Post by sorin »

Today I saw two games on the WBaduk server between Ichiriki Ryo 8p (one of Japan's top pros) and LeelaZero. Both were even games.

LeelaZero won both (one as Black, one as White).

Does anyone know what the event was?
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

@MikeKyle
It's a freely available tool: gogui-twogtp (there are two versions, V1.4.9 and V1.4.10).
It automatically plays n games between two programs, then saves the games and gives you a detailed report for each game (winner, score, number of moves, time spent, etc.).
It works well with LZ, LZ variable komi, gtp4zen, Ray, and gnugo (and probably with others I haven't tried), but NOT with AQ 2.1.1.

V1.4.9 is easy to install under Windows (there's an installer). It works well for even games, but I think there's a scoring problem with handicap games.
You can download the installer here: https://sourceforge.net/projects/gogui/ ... gui/1.4.9/

V1.4.10 is less easy to use at first (because there's no Windows installer), but it handles handicap games well.

Once it's installed, you write a .txt file containing the parameters of the match you want to set up (which programs, number of playouts, time, pondering, number of games to play, etc.), and you rename the .txt file to .bat.

Then you just launch it.

For example, a file named xxx.bat containing:


gogui-twogtp -black "C:\.....path to LZ.....\leelaz.exe --gtp --weights=C:\...path to your weight...\160.txt --visits=1000 --noponder" -white "C:\.....path to LZ.....\leelaz.exe --gtp --weights=C:\...path to your weight...\159.gz --playouts=100 --noponder" -games 30 -sgffile filenameyouwant -auto


would launch a 30-game match between
B: LZ with weights 160.txt (assuming that's the name these weights have on your computer) at 1000 visits and without pondering
and
W: LZ with weights 159.gz (same assumption) at 100 playouts and without pondering.

The final report and the games would be named:

filenameyouwant.dat
filenameyouwant01.sgf
filenameyouwant02.sgf
etc...


EDIT:

For gogui-twogtp V1.4.9, the xxx.bat file must be in the same (install) directory as the file "gogui-twogtp.exe".

For both V1.4.9 and V1.4.10, instead of --playouts=xxx or --visits=yyy, you can set the time:
-time 5 is the same as -time 5m and means 5 min. for the whole game (per side)
-time 1200s means 1200 sec.
-time 120m+30s/1 means 2 hours, plus one 30 sec. byoyomi
-time 1s+5s/1 means 5 sec. per move (total time must be >0)
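
As a rough sketch of how these options fit together, the short Python helper below assembles the single line that goes into the .bat file. The helper name `twogtp_command`, the placeholder paths and the file names are made up for illustration. Only options already mentioned in this post (-black, -white, -games, -sgffile, -time, -auto) are used.

```python
def twogtp_command(black_cmd, white_cmd, games, sgf_prefix,
                   time_spec=None, launcher="gogui-twogtp"):
    """Return the one-line command to paste into the .bat file."""
    parts = [
        launcher,
        "-black", f'"{black_cmd}"',
        "-white", f'"{white_cmd}"',
        "-games", str(games),
        "-sgffile", sgf_prefix,
    ]
    if time_spec is not None:
        # Time limit in the forms listed above, e.g. "5m", "1200s", "120m+30s/1"
        parts += ["-time", time_spec]
    parts.append("-auto")
    return " ".join(parts)


# Hypothetical placeholder paths and weight names: replace with your own.
LEELAZ = r"C:\path-to-LZ\leelaz.exe"
WEIGHTS_DIR = r"C:\path-to-weights"

black = f"{LEELAZ} --gtp --weights={WEIGHTS_DIR}\\157.gz --noponder"
white = f"{LEELAZ} --gtp --weights={WEIGHTS_DIR}\\161.gz --noponder"

# 20 games at 5 minutes per side, with no playout or visit limit.
line = twogtp_command(black, white, games=20, sgf_prefix="match157v161",
                      time_spec="5m")
print(line)
```

The printed line can be saved as the .bat file. The `launcher` argument is there so the same sketch can be reused if the command has to start differently on your install.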

For V1.4.10, as there's no installer, the .bat file must be in the installation directory, and must begin with

java.exe -jar lib\gogui-twogtp.jar -black "C:\etc.etc

For example:

If you want to set up an H7 10-game match between LZ and gnugo, you must use twogtp V1.4.10, and you must have one of the LZ variable komi versions; choose, for example, the one called "dynamic komi for handicap games".

Your .bat file should then be something like:

java.exe -jar lib\gogui-twogtp.jar -black "C:\...path to gnugo ...\gnugo.exe --mode gtp" -white "C:\...path to LZvariablekomi ...\leelazVK.exe --gtp --weights=C:\...path to your weights...\40b_155_328k.gz --visits=2000 --noponder" -handicap 7 -games 10 -sgffile whatuwant -auto

for an H7 game between gnugo and LZ_variable_komi
Last edited by Vargo on Mon Aug 06, 2018 12:18 pm, edited 3 times in total.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: LZ's progression

Post by Uberdude »

sorin wrote: Today I saw two games on the WBaduk server between Ichiriki Ryo 8p (one of Japan's top pros) and LeelaZero. Both were even games.

LeelaZero won both (one as Black, one as White).

Does anyone know what the event was?
I've seen a bunch more, and some by PhoenixGo too. Mostly against young low dan pros. So my guess is it is similar to the online training series against DeepZen from last year. I've not seen a human win. Also I believe the Japan server on WBaduk is a mirror/clone/relay/something of the Japanese Yugen No Ma server. Currently playing is username "aikiller" with 1p Japanese flag, let's see if he or she can live up to that!

Update: Well, aikiller is recorded as winning by 0.5 points, but PhoenixGo played 1 move inside its territory unnecessarily as the last move, presumably because it assumes 7.5 komi but the server rules were 6.5. So not a moral victory even if it technically was one.
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

Someone on reddit/cbaduk asked about 161 vs 157 at time parity...

I've run a 20-game match between #157 and #161 at 5 min. per game per side, no ponder, komi 7.5 (twogtp V1.4.10, 1xGTX1080)

LZ_015#157 v. LZ_015#161 :
LZ#157 wins 17-3 (10 wins as W, 7 wins as B)

Ouch...! 20 games is not enough to be really significant, but still...

If someone wants the games, I'll upload them.
Tryss
Lives in gote
Posts: 502
Joined: Tue May 24, 2011 1:07 pm
Rank: KGS 2k
GD Posts: 100
KGS: Tryss
Has thanked: 1 time
Been thanked: 153 times

Re: LZ's progression

Post by Tryss »

Ouch...! 20 games is not enough to be really significant, but still...
Actually, if program A is at least as strong as B (the win probability of A against B is >= 0.5), then there is under a 0.13% chance that A gets 3 wins or fewer in 20 games. So the result is statistically significant.
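
The figure can be checked with a few lines of Python. This is a minimal sketch that assumes the 20 games are independent with a fixed win probability p (a plain binomial model); the function name is just for illustration.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Chance of at most 3 wins in 20 games for a program that wins half the time.
p_tail = binomial_cdf(3, 20, 0.5)
print(f"P(at most 3 wins in 20 | p = 0.5) = {p_tail:.4%}")
```

With p = 0.5 the sum is 1351 / 2^20, about 0.129%. For any p above 0.5 the probability is smaller still, so p = 0.5 is the least favourable case for the claim.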
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: LZ's progression

Post by Bill Spight »

Tryss wrote:
Ouch...! 20 games is not enough to be really significant, but still...
Actually, if program A is at least as strong as B (the win probability of A against B is >= 0.5), then there is under a 0.13% chance that A gets 3 wins or fewer in 20 games. So the result is statistically significant.
But is statistically significant really significant?

This may sound flip, but it is related to the replication crisis.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

Another 20-game match at 10 min/game (1x1080): #157 wins 16-4 (9 times as W, 7 times as B)

And a 10-game match at something like 15 min/game: #157 wins 6-4 (3 times as W, 3 times as B)
(in fact it's a 5 min/game match, but with 2x1080Ti, which corresponds to ~12-16 min/game on a 1080)

Again, if someone wants the games or the reports, I'll upload them.

In the 50 games played in these matches, #161 won only 11. That's 22%, and yet #161 has a better Elo score than #157. Hmm... How come?

Anyway, if someone wants the reports or the games, I'll upload them.
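
To put a rough error bar on that 22%, here is a small sketch of a 95% Wilson score interval for 11 wins in 50 games. It assumes the 50 games can be pooled as independent trials with one underlying win rate, which is a simplification since the three matches used different time settings and hardware.

```python
from math import sqrt


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate (z = 1.96 gives about 95%)."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half


low, high = wilson_interval(11, 50)
print(f"#161 win rate 22%, interval from {low:.1%} to {high:.1%}")
```

The interval comes out at roughly 13% to 35%, so even its upper end is well below 50% under these assumptions.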
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: LZ's progression

Post by Bill Spight »

Vargo wrote: In the 50 games played in these matches, #161 won only 11. That's 22%, and yet #161 has a better Elo score than #157. Hmm... How come?
How come, indeed?
jlt
Gosei
Posts: 1786
Joined: Wed Dec 14, 2016 3:59 am
GD Posts: 0
Has thanked: 185 times
Been thanked: 495 times

Re: LZ's progression

Post by jlt »

The test matches used to calculate the Elo scores of LZ networks are not played at time parity, but with the same number of visits. 20-block networks take more time per visit than 15-block networks.
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

The Elo scale is at visits parity; that's why I'm doubtful about it... I would prefer something measuring "real" strength, on the same scale as for human players, but maybe that's hard or impossible to do.

Anyway, I find all this very interesting and I hope LZ will bounce back!
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: LZ's progression

Post by Uberdude »

Vargo wrote: In the 50 games played in these matches, #161 won only 11. That's 22%, and yet #161 has a better Elo score than #157. Hmm... How come?
Accumulation of errors without calibration, either against older versions of itself or against an external benchmark. (On a related note, dead-reckoning / inertial guidance systems (as for Apollo or ICBMs) are amazing feats of engineering; I was recently reading about them.)
Uberdude wrote:In fact during normal 15-block training I think doing some occasionally e.g. #157 vs #147 would be a good idea to see how inflated the incremental self-improvement Elo is (based on the incremental Elo differences from the promotions 147->148->149 etc it went from 11401 -> 11806 which predicts #157 would beat #147 91% of the time, but I would bet it would be quite a lot lower than that in reality).
I think I've managed to get twogtp working and am running a 147 vs 157 match now at 3200 visits.

Update: currently #157 leads only 7-3 as white. #157 6-0 as black before my PC crashed (maybe I shouldn't run this match and LZ and Elf analysis concurrently!).
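
For reference, the 91% prediction quoted above can be reproduced with the usual logistic Elo formula. This is a sketch that assumes the standard 400-point scale, which does give the 91% figure for these two ratings.

```python
def expected_score(elo_diff):
    """Expected win rate of the higher-rated side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


diff_157_vs_147 = 11806 - 11401  # 405 Elo, from the promotion chain
print(f"Elo difference: {diff_157_vs_147}")
print(f"Predicted win rate for #157: {expected_score(diff_157_vs_147):.1%}")
```

A direct match well below that predicted rate would suggest the chained promotion Elo is inflated, which is what the 147 vs 157 match is meant to test.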
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

20-game match between LZ015#157 and LZ015#163 at time parity
#163 is 11985 Elo
#157 is 11806 Elo

5-min games, 1x1080, twogtp V1.4.10

#157 wins 16-4 (10 wins as W, 6 wins as B)

I understand the ranking matches are at visits parity, and all that, but there's still something weird with the Elo scale used. For example, the ranking of L-zero has #163 higher than #157 in "Dan scale" ...

Here, #163 wins only 20% of its games against #157... It's only 20 games, but I doubt that #163 is stronger than #157.
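
To see how far apart the two pictures are, the sketch below compares the win rate the listed ratings would predict with the Elo gap implied by the 4-16 result. It assumes the standard logistic Elo model on a 400-point scale. The ratings come from visits-parity matches while this match was at time parity, so the two numbers are not measuring quite the same thing.

```python
from math import log10


def expected_score(elo_diff):
    """Expected win rate for a side that is elo_diff points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def implied_elo_diff(score):
    """Elo difference that would produce the given win rate (0 < score < 1)."""
    return -400.0 * log10(1.0 / score - 1.0)


rated_diff = 11985 - 11806          # #163 minus #157 on the listed scale
predicted = expected_score(rated_diff)
observed = 4 / 20                   # #163's score in this match
implied = implied_elo_diff(observed)

print(f"Rated gap: {rated_diff:+d} Elo, predicted win rate for #163: {predicted:.1%}")
print(f"Observed win rate: {observed:.0%}, implied gap: {implied:+.0f} Elo")
```

The ratings predict about 74% for #163, while a 20% score corresponds to a gap of roughly -240 Elo. With only 20 games the implied gap is a very noisy estimate.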
Attachments
163_157_time5_157isB.zip
(8.86 KiB) Downloaded 622 times
163_157_time5_157isW.zip
(8.71 KiB) Downloaded 623 times
splee99
Dies with sente
Posts: 101
Joined: Thu Nov 15, 2012 9:46 pm
Rank: KGS 2 D
GD Posts: 0
Has thanked: 2 times
Been thanked: 16 times

Re: LZ's progression

Post by splee99 »

Vargo wrote:
Here, #163 wins only 20% of its games against #157... It's only 20 games, but I doubt that #163 is stronger than #157.
Could you try #165? It is noticeably more aggressive than before.