LZ's progression

moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: LZ's progression

Post by moha »

Thanks. Looking at LZ's reported "visits", some huge swings are expected, since that figure includes tree reuse. But there actually seem to be large fluctuations in n/s as well, which is not subject to tree reuse (it's taken as playouts/time, IIRC).

Bottom line is, it's not surprising if results with fixed (asymmetrical) visits differ somewhat from fixed per-game time.
nbc44
Dies in gote
Posts: 50
Joined: Sat Sep 15, 2018 2:34 am
GD Posts: 0
Been thanked: 3 times

Re: LZ's progression

Post by nbc44 »

Yet another so-called "time parity" test - 10 min per game.

Code:

--precision half -t 12 --gpu 0 --gpu 1 --noponder
Part1: #157 (black) vs #181 : +32-68=0

Part2: #157 (white) vs #181 : +35-65=0

Finally: #157 vs #181 : +67-133=0

Is it a miracle, or just an ordinary bug in the "precision half" option? A farewell to #157?

P.S. L0-next 29/09/2018
Attachments
181-157-10m.zip
(97.6 KiB) Downloaded 563 times
157-181-10m.zip
(93.67 KiB) Downloaded 560 times
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

Another brick in the (same) wall
#181 v. #157 (visits=12801 for #157 and visits=3201 for 181)

#181 wins 7-3

(twogtp v1.4.10, noponder, average length : 237 moves, #157 takes 10% more time than #181)
157_181_157isW.zip
(4.86 KiB) Downloaded 557 times
157_181_157isB.zip
(4.91 KiB) Downloaded 566 times
nbc44
Dies in gote
Posts: 50
Joined: Sat Sep 15, 2018 2:34 am
GD Posts: 0
Been thanked: 3 times

Re: LZ's progression

Post by nbc44 »

Quick answer:

Code:

C:\APPS\l0gpu\validation.exe -k 157-181-10 -b C:\APPS\l0gpu\leelaz -n C:\APPS\net\d351f06e.gz -o "-g -v 12801 --gpu 0 --gpu 1 --noponder -t 12 -q -d --timemanage off -w" -b C:\APPS\l0gpu\leelaz -n C:\APPS\net\68824bbc.gz -o "-g -v 3201 --gpu 0 --gpu 1 --noponder -t 12 -q -d --timemanage off -w"

Code:

6 wins, 4 losses
10 games played.
#157 wins 6 - 4 :clap:

edit.

Code:

gogui-twogtp -black "C:\APPS\l0gpu\leelaz.exe --gtp --weights=C:\APPS\net\d351f06e.gz -t 12 --gpu 0 --gpu 1 -v 12801 --noponder --timemanage off" -white "C:\APPS\l0gpu\leelaz.exe --gtp --weights=C:\APPS\net\68824bbc.gz -t 12 --gpu 0 --gpu 1 -v 3201 --noponder --timemanage off" -games 5 -sgffile 157-181-visits -auto -komi 7.5 -verbose
&
gogui-twogtp -black "C:\APPS\l0gpu\leelaz.exe --gtp --weights=C:\APPS\net\68824bbc.gz -t 12 --gpu 0 --gpu 1 -v 12801 --noponder --timemanage off" -white "C:\APPS\l0gpu\leelaz.exe --gtp --weights=C:\APPS\net\d351f06e.gz -t 12 --gpu 0 --gpu 1 -v 3201 --noponder --timemanage off" -games 5 -sgffile 157-181-visits -auto -komi 7.5 -verbose
#157 wins 8 - 2 :clap: x 2

Any idea?
Attachments
157-181.zip
(10.65 KiB) Downloaded 557 times
157-181.zip
12801:3201 visits
(8.69 KiB) Downloaded 526 times
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

We must have the same kind of PC (I have 2x1080Ti, i9-7920X), and except for your -t 12 -q -d, I use the same commands.
On mine, a 10 min (per side) game in fact lasts around 820 sec, as LZ doesn't use up all its time. A 12801 v 3201 visits game lasts about the same time but slightly favours #157, as it takes 10% more time than 181.

181 won 66% of your 200 games at 10 min.

181 won 7 of its 10 games
181 won 4 of its 10 games
181 won 2 of its 10 games
that's 43% of the 30 games, but with ~10% less time than 157

I'd trust your 200 games result. Probably 3x10 games is not enough, or ... maybe it's 181's way of asking for 10% more time ;-)
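
To put rough numbers on how much each sample can say, here is a minimal sketch that computes an approximate 95% Wilson score interval for 181's win rate in both samples (133 of 200, and 7+4+2 = 13 of 30). It assumes the games are independent with a fixed win probability, which real engine matches only approximate, and `wilson_interval` is just an illustrative name.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate Wilson score interval for a win rate (z=1.96 is about 95%)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# 181's results as quoted in this thread
samples = [("200 games at 10 min", 133, 200), ("30 games at fixed visits", 13, 30)]
for label, wins, games in samples:
    lo, hi = wilson_interval(wins, games)
    print(f"{label}: {wins}/{games}, interval {lo:.1%} to {hi:.1%}")
```

Under these assumptions the 200-game interval should sit entirely above 50%, while the 30-game interval spans 50%, which fits the feeling that 3x10 games is too few to say much.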
nbc44
Dies in gote
Posts: 50
Joined: Sat Sep 15, 2018 2:34 am
GD Posts: 0
Been thanked: 3 times

Re: LZ's progression

Post by nbc44 »

Since your processor is faster, I suggest repeating your last test (10 games) with my settings:

Code:

validation -k 157-181 -b leelaz -n #157net -o "-g -v 12801 --gpu 0 --gpu 1 --noponder -t 12 -q -d --timemanage off -w" -b leelaz -n #181net -o "-g -v 3201 --gpu 0 --gpu 1 --noponder -t 12 -q -d --timemanage off -w"
P.S. I have 2 x 1080ti too and Xeon E5-1650 v4, win10.
P.P.S. Benchmark:
C:\>c:\apps\l0gpu\leelaz.exe --benchmark -t 12 --gpu 0 --gpu 1 -w C:\APPS\net\d351f06e.gz
Thinking at most 2678400.0 seconds...
NN eval=0.531266

Q4 -> 1299 (V: 53.45%) (N: 39.76%) PV: Q4 D16 C4 E3 D3 E4 C6 O3 R6 Q3 R3 R2 P3 Q2 P4 R4 P2 R5 Q6 K3 R14 S6 S7 S5 P8 O7 O8
D16 -> 1288 (V: 53.44%) (N: 39.54%) PV: D16 Q4 D3 C5 C4 D5 F3 C14 F17 C16 C17 B17 C15 B16 D15 D17 B15 E17 F16 C10 O17 F18 G18 E18 H15 G14 H14
D4 -> 631 (V: 53.60%) (N: 17.15%) PV: D4 Q4 D17 C15 C16 D15 F17 C6 F3 C4 C3 B3 C5 B4 D5 D3 B5 E3 F4 C10 O3 F2 G2 E2 H5
15.7 average depth, 33 max depth
2555 non leaf nodes, 1.26 average children
3219 visits, 1114455 nodes, 3218 playouts, 1443 n/s


C:\>c:\apps\l0gpu\leelaz.exe --benchmark -t 12 --gpu 0 --gpu 1 -w C:\APPS\net\68824bbc.gz
Thinking at most 2678400.0 seconds...
NN eval=0.532423
Playouts: 1029, Win: 51.61%, PV: D4 D16 Q4 C6 C14 R6 R14 F3 D6 D7 E6 C4 C3 C5 D3 J3 D9 E7
Playouts: 2454, Win: 52.53%, PV: D4 D16 Q4 C6 C14 R6 R14 F3 D6 D7 E6 C4 C3 C5 D3 J3 D9 E7 H4 H3 F4 G7 F17 C12 E15 E12 B16 E2

D4 -> 2318 (V: 53.20%) (N: 77.92%) PV: D4 D16 Q4 C6 C14 R6 R14 F3 D6 D7 E6 C4 C3 C5 D3 J3 D9 E7 H4 H3 F4 G7 F17 C12 E15 E12 B16 E2
Q4 -> 456 (V: 53.27%) (N: 8.53%) PV: Q4 D4 D16 R3 Q3 R4 Q5 R5 R7 R6 Q6 S7 R8 S8 F3 K3 H3 N3 R9 E3 F4 D6 K5
D16 -> 444 (V: 53.26%) (N: 8.65%) PV: D16 D4 Q4 R3 Q3 R4 Q5 R5 R7 R6 Q6 S7 R8 S8 F3 K3 H3 N3 R9 E3 F4 D6
13.3 average depth, 29 max depth
2567 non leaf nodes, 1.25 average children
3219 visits, 1122186 nodes, 3218 playouts, 491 n/s
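
For a rough idea of what visit ratio these two benchmark lines would imply at equal time, the sketch below scales a visit budget by the ratio of the reported n/s figures (1443 for d351f06e, 491 for 68824bbc). It assumes the empty-board benchmark speed is representative of a whole game, which may well not hold since n/s fluctuates; `parity_visits` is a hypothetical helper name.

```python
def parity_visits(fast_nps: float, slow_nps: float, slow_visits: int) -> int:
    """Visits the faster net can search in the time the slower net needs for slow_visits."""
    if fast_nps <= 0 or slow_nps <= 0:
        raise ValueError("speeds must be positive")
    return round(slow_visits * fast_nps / slow_nps)

# n/s values taken from the two benchmark runs above
speed_ratio = 1443 / 491
print(f"speed ratio: {speed_ratio:.2f}")
print("visits for the faster net against 3201:", parity_visits(1443, 491, 3201))
```

On these two numbers the ratio comes out a bit under 3, somewhat less than the 4:1 (12801:3201) used in the visit-based matches, though a single opening position is thin evidence.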
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

I don't think the CPU is important: we have the same GPUs, and so very similar benchmarks.
benchmark.jpg (154.28 KiB)
Most probably, 10 or 30 games is not enough. For example, between your two 10-game matches, 181 doubles its winning percentage (20% and 40%, a huge swing). That's why I have more faith in your 200-game match :)
splee99
Dies with sente
Posts: 101
Joined: Thu Nov 15, 2012 9:46 pm
Rank: KGS 2 D
GD Posts: 0
Has thanked: 2 times
Been thanked: 16 times

Re: LZ's progression

Post by splee99 »

If you have two GPUs, why don't you try assigning GPU0 to 181 and GPU1 to 157? I know this would make things slower, but it would make for a fair match, because some data may be cached in a GPU during the game.
nbc44
Dies in gote
Posts: 50
Joined: Sat Sep 15, 2018 2:34 am
GD Posts: 0
Been thanked: 3 times

Re: LZ's progression

Post by nbc44 »

Last so-called "time parity" test - 10 min per side.

Code:

--precision single -t 12 --gpu 0 --gpu 1 --noponder
Part1: #157 (black) vs #181 : +31-69=0

Part2: #157 (white) vs #181 : +53-47=0

Finally: #157 vs #181 : +84-116=0

A farewell to #157!!! :clap:

P.S. L0-next 29/09/2018
Attachments
181-157-single.zip
(97.11 KiB) Downloaded 659 times
157-181-single.zip
(96.1 KiB) Downloaded 682 times
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

New network :clap: (183)

Two 20 game matches #183 v #157 at time parity :


Match 1 :
LZ 015, GPU : 1x1080, no pondering, 5 min. per side , corresponding to ~800 visits for #183 and to ~2800 visits for #157
#183 wins 11-9
(all games by resignation, about 240 moves per game, and 230" used per side and per game)


Match 2 :
GPU : 2x1080Ti, no pondering, 7 min. per side , corresponding to ~3400 visits for #183 and to ~12800 visits for #157
#183 wins 12-8 (all games by resignation, about 240 moves per game, and 320" used per side)

Finally, 183 is better than 157, even at low time settings (less than 2" per move with a 1080)

If someone wants the games...
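
As a rough check on how strong the evidence from the two matches is, the sketch below pools them (11-9 and 12-8, so 23 wins in 40 games) and computes the exact one-sided binomial probability of at least that many wins if both nets were really equal. Pooling matches played on different hardware and time settings is an assumption made only for illustration, as is treating games as independent coin flips; `tail_if_equal` is a made-up name.

```python
import math

def tail_if_equal(wins: int, games: int) -> float:
    """P(at least `wins` wins in `games` games) when every game is a fair coin flip."""
    if not 0 <= wins <= games:
        raise ValueError("wins must be between 0 and games")
    # Count all outcomes with `wins` or more successes, then divide by 2**games.
    favourable = sum(math.comb(games, k) for k in range(wins, games + 1))
    return favourable / 2 ** games

# 11-9 and 12-8 pooled: 23 wins for #183 in 40 games
print(f"P(>=23 wins of 40 | equal nets) = {tail_if_equal(23, 40):.3f}")
```

This comes out at roughly 0.21, well above the usual 0.05 cut-off, so under these assumptions 23-17 is still compatible with two equal nets and more games would help.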
abcd_z
Beginner
Posts: 12
Joined: Thu Apr 26, 2018 11:32 am
Rank: 15k
GD Posts: 0
Has thanked: 5 times

Re: LZ's progression

Post by abcd_z »

Ah, but I have no GPU, and to get a game that takes less than two hours I have to restrict LZ considerably more than you did. On my computer, time parity is 600 visits for LZ 157 and 153 visits for LZ 183. Does 183 still take the lead under those conditions?
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

abcd_z wrote: Does 183 still take the lead under those conditions?
Most probably no.

I think 157 would win at least 60% of the games with so few visits.

Maybe I'll set up such a match and tell you the result ;-)


EDIT :
Here it is !
#157 v #184 (the latest best) 600 visits for #157 and 153 visits for #184

#157 wins 14-6 (70%)

it's even more than 60 % :D
Last edited by Vargo on Mon Oct 22, 2018 4:29 am, edited 1 time in total.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: LZ's progression

Post by Uberdude »

I wonder how #183 does versus Elf v1 at time parity? I saw it's now stronger (57% win) at visit parity in the test match: http://zero.sjeng.org/match-games/5bcaa ... 3e27abce47.

P.S. Some interesting analysis of how different nets' strength scales with time: https://github.com/gcp/leela-zero/issues/1914. The relevant summary here: at around 20-300 playouts for the 40-block #181, the best 15-block network #157 needs 3 times the playouts to score a 50% win rate, but is ~4 times faster, so it is stronger at equal time (which, given current hardware, is a sensible time per move for human analysis). With more playouts, 181's strength improves faster than the 15-block's, so 157 needs more than 3 times the playouts; once that factor crosses the ~4 times equal-time threshold, the 40-block wins more.
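
The break-even argument can be written down as a tiny comparison: the small net is favoured at equal time as long as its speed advantage exceeds the playout factor it needs for a 50% score. This is only a sketch of that reasoning; `favoured_at_equal_time` is a hypothetical name, and the factor 5.0 in the second call is an invented value standing in for "more than 4", not a measured one.

```python
def favoured_at_equal_time(speed_ratio: float, playouts_needed_ratio: float) -> str:
    """Compare how many times faster the 15-block net is with how many times
    more playouts it needs to score 50% against the 40-block net."""
    if speed_ratio > playouts_needed_ratio:
        return "15-block"
    if speed_ratio < playouts_needed_ratio:
        return "40-block"
    return "even"

# Low-playout regime from the summary: ~4x faster, needs ~3x the playouts
print(favoured_at_equal_time(4.0, 3.0))
# Higher-playout regime: 5.0 is a hypothetical needed factor above the ~4x threshold
print(favoured_at_equal_time(4.0, 5.0))
```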

A randomly-chosen LZ #183 (a54cd) win (Elf got captured in a ladder! :lol: ):

Analysing a bit with Elf v1 myself: Elf thought it was doing slightly better (55%) at the n15 extend. LZ's n15 is a move which surprised and impressed me (though it shares features with a move Blackie showed me at KPMC), and Elf similarly thinks it's good (it starts off not considering it much, but by 1200 playouts thinks it's better than the normal l17 defence). Does this mean the push and cut wasn't best? But the big swing happens (the q12 and q10 attachments surprised me but not Elf) with LZ starting the ko with s12 (Elf's win rate collapses to 13%) instead of r13 (Elf win rate 45%). I suppose r13 is more often the better-shape way to start the ko, but here it gives white decent local threats and a playable position. As for the ladder, at move 72 it thinks black will atari at n8, allowing white to trade with the p7 atari, then o7 connect and ko (k15 is an ignored threat), despite the p12 atari for the squeezy ladder having a few playouts and the ladder all the way to the edge of the board being the principal variation.

An Elf v1 win:


P.P.S Had a look at a few more LZ wins, quite a few are Elf blunders of ladders or shortage of liberties (though maybe Elf already thought it was losing from previous non-blunders so is doing bot harakiri). Seems like the 1600 playout limit is not enough to stop Elf making dumb mistakes, maybe Elf would do better at say 10k each. But in my experience it can be quite ladder blunder and blindspot-prone even at higher playouts.
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: LZ's progression

Post by Vargo »

I've launched a 20 game match at 5 min per side per game with GPU 1080. Hopefully, result in 2-3 hours.
nbc44
Dies in gote
Posts: 50
Joined: Sat Sep 15, 2018 2:34 am
GD Posts: 0
Been thanked: 3 times

Re: LZ's progression

Post by nbc44 »

My old test:

Code:

The first net is worse than the second
#157a v #181 ( 398 games)
           wins        black       white
#157a  194 48.74%   85 48.57%  109 48.88%
#181   204 51.26%   90 51.43%  114 51.12%
                   175 43.97%  223 56.03%
#157a is the strongest 15x192 net:

Code:

2018-07-23 00:13 fc5e0a50 VS d351f06e 220 : 192 (53.40%) 412 / 400 fail)
L0 015 options (2 x 1080ti):

Code:

-g -v 4801 --gpu 1 --gpu 0 --noponder -t 12 -q -d --timemanage off -w C:\APPS\net\fc5e0a50.gz
-g -v 1601 --gpu 0 --gpu 1 --noponder -t 12 -q -d --timemanage off -w C:\APPS\net\68824bbc.gz
Poor #157a network :sad:, but...
x-axis: # of game,
y-axis: winrate of #157a net (0 == 50%)
match157a-181.png (35.29 KiB)
How many games are needed to determine the winner?
:scratch:
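
One hedged way to approach that question is the textbook sample-size approximation for a two-sided z-test of a win rate against 50%. The sketch below assumes independent games, a 5% significance level and 80% power; those defaults and the name `games_needed` are conventional or illustrative choices, not something taken from this thread.

```python
import math
from statistics import NormalDist

def games_needed(true_winrate: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate games needed for a two-sided z-test against 50% to detect true_winrate."""
    if not 0 < true_winrate < 1 or true_winrate == 0.5:
        raise ValueError("true_winrate must be in (0, 1) and differ from 0.5")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Null-hypothesis spread uses sqrt(0.5 * 0.5) = 0.5; the alternative uses the true rate.
    spread = z_alpha * 0.5 + z_beta * math.sqrt(true_winrate * (1 - true_winrate))
    return math.ceil((spread / (true_winrate - 0.5)) ** 2)

for p in (0.51, 0.55, 0.60):
    print(f"true win rate {p:.0%}: about {games_needed(p)} games")
```

Under these assumptions a gap as small as the 51.26% seen in the 398 games above would need many thousands of games to confirm, while a 60% edge shows up within a couple of hundred.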