Life In 19x19
https://lifein19x19.com/

LZ's progression
https://lifein19x19.com/viewtopic.php?f=18&t=15718
Page 7 of 19

Author:  Vargo [ Thu Sep 13, 2018 2:10 am ]
Post subject:  Re: LZ's progression

All these time parity matches are 5min per game and per side, so, it's a bit more than 2 sec per move (with 1x1080), which corresponds roughly to 800 visits for #176, and to 3200 visits for #157.
The matches I ran with longer time settings for other networks had similar results...
Anyway, I've begun a new 157 v 176 match, with 2x1080Ti, 12800 visits for 157 and 3200 visits for 176, which should be approximately at time parity (I'll check, with the .dat)
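As a rough check of the figures above, the per-move budget and the implied search speed can be worked out as follows. This is only a sketch: the average game length of 256 moves is an assumption (it is the figure reported for a later match in this thread), and the function names are made up for illustration.

```python
def seconds_per_move(minutes_per_side: float, game_length_moves: int) -> float:
    """Average thinking time per move for one side under an absolute time limit."""
    moves_per_side = game_length_moves / 2
    return minutes_per_side * 60 / moves_per_side


def visits_per_second(visits: int, secs: float) -> float:
    """Search speed implied by reaching `visits` in `secs` seconds."""
    return visits / secs


if __name__ == "__main__":
    # 256 moves is an assumed average game length, not a measured one
    spm = seconds_per_move(5, 256)
    print(f"{spm:.2f} s per move")
    print(f"#176 (40b): ~{visits_per_second(800, spm):.0f} visits/s")
    print(f"#157 (15b): ~{visits_per_second(3200, spm):.0f} visits/s")
```

Under these assumptions the 15b network searches about four times as many visits per second as the 40b one, which is where the 4:1 visit ratio used for time parity comes from.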

Author:  moha [ Thu Sep 13, 2018 3:42 am ]
Post subject:  Re: LZ's progression

Vargo wrote:
All these time parity matches are 5min per game and per side, so, it's a bit more than 2 sec per move (with 1x1080), which corresponds roughly to 800 visits for #176, and to 3200 visits for #157.
I think in this visit range more search still beats better search. There are around 300 candidate MOVES in a position, so (even if most of them are pruned) this doesn't mean a significantly deep search. Looking a bit deeper is more valuable than the order you FIRST look at the moves (which is all network strength is about).

So in this range I won't expect to see a spectacular improvement between successive networks even if they actually improve. Just lowering official matches from 3200 to 1600 visits had a noticeable negative effect (results more random, promotions scarcer even on a new size).

Quote:
Anyway, I've begun a new 157 v 176 match, with 2x1080Ti, 12800 visits for 157 and 3200 visits for 176, which should be approximately at time parity (I'll check, with the .dat)
Thanks, this will be interesting. I never saw a statistically significant 40b vs 15b match at more realistic time controls (someone posted similar results on github but also only 1-2 sec/move).

Of course testing like this is faster, but if someone uses LZ for serious analysis, he would probably allow at least 10-20 sec per move on a 1080ti (nearly 10k visits - which is where search quality should start to overcome the visit disadvantage). And the tournaments where these high-block nets were first used saw many more visits. On the other hand, it is good to know that users with weaker hardware are better off with 15 blocks for now. There will probably be unofficial 15b nets trained on 40-block selfplay data in the future as well.

Author:  Knotwilg [ Thu Sep 13, 2018 4:33 am ]
Post subject:  Re: LZ's progression

Has LZ also built up a model of the game for itself? Has AlphaGo? I'm confused as to the AI aspect. I understand how it uses MCTS and NN to solve the computation problem, but there's no AI in there, is there? Do these programs always rebuild their intelligence particular to the game? Or has LZ also trained itself like AG has? And what has been the result of AG's training? Did it have an impact only on the MCTS and NN parameters? Or did it rebuild some domain knowledge for itself?

Any articles on that?

Sorry to sound confused.

Author:  Tryss [ Thu Sep 13, 2018 5:47 am ]
Post subject:  Re: LZ's progression

Quote:
Has LZ also built up a model of the game for itself? Has AlphaGo? I'm confused as to the AI aspect. I understand how it uses MCTS and NN to solve the computation problem, but there's no AI in there, is there?

If you want to simplify how AlphaGo/LZ works, it's kinda like this :

There is the intuitive part of LZ's brain: the neural network. LZ sees a position, and her intuition (the neural network) gives her a list of candidate moves and a feeling of who's ahead.

And there is the reading process: the Monte Carlo search (even if it's not really Monte Carlo anymore, because there are no rollouts). LZ reads the most promising moves and uses her intuition to evaluate the resulting positions.

Her intuition (the neural network) is trained by feeding her millions of self-play games played by previous versions of herself. She's told the result of each of these positions, and her intuition learns what's good (= what works) and what's bad (what doesn't). That's how her intuition gets better over time.


Now, what's inside the neural network is quite mysterious, but that's not specific to go. It's a "problem" with all deep neural networks. You can train a network to tell if there's a dog in the picture with really high accuracy, but how exactly the neural network recognises the dog is not well understood.

Author:  Knotwilg [ Thu Sep 13, 2018 7:40 am ]
Post subject:  Re: LZ's progression

Tryss wrote:
And there is the reading process: the Monte Carlo search (even if it's not really Monte Carlo anymore, because there are no rollouts).


OK, now that's confusing: I thought I had to interpret the lower number in Lizzie as "plies", and these plies represent complete rollouts. So now I guess they are not full but partial rollouts, and there's a higher level evaluation than the score.

Tryss wrote:

Her intuition (the neural network) is trained by feeding her millions of self-play games played by previous versions of herself. She's told the result of each of these positions, and her intuition learns what's good (= what works) and what's bad (what doesn't). That's how her intuition gets better over time.

Now, what's inside the neural network is quite mysterious, but that's not specific to go. It's a "problem" with all deep neural networks. You can train a network to tell if there's a dog in the picture with really high accuracy, but how exactly the neural network recognises the dog is not well understood.


OK. So it's AI after all, not merely an inventive way to speed up reading. Only we get no insight into the "model" used for deciding on either the candidates (explore) or the evaluation of the ply (exploit).

Author:  yakcyll [ Thu Sep 13, 2018 8:01 am ]
Post subject:  Re: LZ's progression

Knotwilg wrote:
OK. So it's AI after all, not merely an inventive way to speed up reading. Only we get no insight into the "model" used for deciding on either the candidates (explore) or the evaluation of the ply (exploit).

It's not a way to speed up reading, but rather a way to use prior reading experience to its advantage. Another, I think more precise, way to think about the 'intelligence' or 'intuition' part of a bot is that it does not select moves based on 'feeling', however that could be applied to a program, but rather based on that experience (one could argue that's what intelligence is, but let's avoid that for now). Outside of training, in order to skip the Monte Carlo rollouts, it employs what's called a value network, which is a neural network used to evaluate positions.

Quote:
Game tree is searched in simulations composed from 4 phases:
  • Selection — simulation traverses tree by selecting edges with maximum action value Q (how good this move is).
  • Expansion — if any node is expanded, it is processed once by SL (Supervised Learning) policy network to get prior probabilities for each legal action.
  • Evaluation — each node is evaluated by value network and by FR (Fast Rollout) policy.
  • Backup — action values Q are updated by values collected during evaluation step.
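The four phases quoted above can be sketched roughly as below. This is a toy illustration under loud assumptions, not AlphaGo's or LZ's actual code: the "game" always has three legal moves, `toy_network` is a hypothetical stand-in that returns uniform priors and a random value, evaluation uses the value output only (no fast rollout, as described for LZ earlier in the thread), and selection adds the commonly described prior-weighted exploration bonus to Q rather than using Q alone.

```python
import math
import random
from dataclasses import dataclass, field

MOVES = (0, 1, 2)  # toy game: three legal moves in every position (assumption)


@dataclass
class Node:
    prior: float                 # prior probability from the policy output
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        """Mean backed-up value of this node (0 if never visited)."""
        return self.value_sum / self.visits if self.visits else 0.0


def toy_network(state, rng):
    """Hypothetical stand-in for the network: uniform priors, random value in [-1, 1]."""
    priors = {m: 1.0 / len(MOVES) for m in MOVES}
    return priors, rng.uniform(-1.0, 1.0)


def select_child(node, c_puct=1.5):
    """Selection: pick the child maximising Q plus a prior-weighted exploration bonus."""
    def score(item):
        _, child = item
        bonus = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.q() + bonus
    return max(node.children.items(), key=score)


def simulate(root, rng):
    """Run one simulation (one 'visit') from the root."""
    node, state, path = root, (), [root]
    while node.children:                        # selection: walk down to a leaf
        move, node = select_child(node)
        state += (move,)
        path.append(node)
    priors, value = toy_network(state, rng)     # evaluation: value output, no rollout
    for move, p in priors.items():              # expansion: add children with priors
        node.children[move] = Node(prior=p)
    # The value is from the viewpoint of the side to move at the leaf;
    # the move leading into the leaf was played by the opponent.
    value = -value
    for visited in reversed(path):              # backup: update statistics on the path
        visited.visits += 1
        visited.value_sum += value
        value = -value                          # players alternate at each level


def run_search(simulations, seed=0):
    rng = random.Random(seed)
    root = Node(prior=1.0)
    for _ in range(simulations):
        simulate(root, rng)
    return root


if __name__ == "__main__":
    root = run_search(800)
    for move, child in sorted(root.children.items()):
        print(move, child.visits, round(child.q(), 3))
```

Each call to `simulate` corresponds to one "visit" in the sense used in this thread, so `run_search(800)` is the toy analogue of an 800-visit search.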

I recommend this article, it describes how AG works pretty well. Basically, there's no set of rules or knowledge it applies, directly or indirectly; that's our thing. The bot merely collects the data about board positions during learning and formats it so that it can utilize the experience quickly, on the fly - in the form of synaptic weights.

Author:  Vargo [ Thu Sep 13, 2018 8:57 am ]
Post subject:  Re: LZ's progression

moha wrote:
Thanks, this will be interesting. I never saw a statistically significant 40b vs 15b match at more realistic time controls

Here it is :
20 game match between LZ0.15 #157 and #176
--visits=3201 for #176
--visits=12801 for #157
which amounts to approximately time parity (average of 3.03s/move for #176 and 3.4s/move for #157)
no pondering, twogtp V1.4.10, 2x1080Ti
Average game length : 256 moves

#176 wins 13:7
(65% , all games by resignation, 8 wins as W, 5 as B)

Even if 20 games is not enough, it seems you were right for the longer time settings ;-)
Attachment:
157isW.zip [9.82 KiB]
Attachment:
157isB.zip [9.62 KiB]

Author:  Gomoto [ Thu Sep 13, 2018 9:06 am ]
Post subject:  Re: LZ's progression

Quote:
I recommend this article, it describes how AG works pretty well. Basically, there's no set of rules or knowledge it applies, directly or indirectly; that's our thing. The bot merely collects the data about board positions during learning and formats it so that it can utilize the experience quickly, on the fly - in the form of synaptic weights.


And now please explain how we humans use rules or knowledge to recognize for example an image.

Indeed, there is no difference from the bots: our brain merely collects the data during learning and formats it so that it can utilize the experience quickly, on the fly - in the form of synaptic ...

It is not that easy to define the difference.

Author:  Gomoto [ Thu Sep 13, 2018 9:08 am ]
Post subject:  Re: LZ's progression

And while there are no explicit rules in a neural network, we can check the data like sorin and find "rules" the AI adheres to. For example the josekis and moves it prefers in specific configurations.

Author:  moha [ Thu Sep 13, 2018 9:24 am ]
Post subject:  Re: LZ's progression

Vargo wrote:
Even if 20 games is not enough, it seems you were right for the longer time settings ;-)
Thanks, nice to see 40b win at last. This may also answer your earlier question (why official/elo tests are not at "time parity" - no consistent meaning):

Vargo wrote:
40 games between #157 and #176.
Time parity, 5 min per game, GPU: 1x1080, komi 7.5, no pondering.
#157 wins 29:11 (17 wins as W, 12 wins as B)

Vargo wrote:
20 game match between LZ0.15 #157 and #176
--visits=3201 for #176
--visits=12801 for #157
which amounts to approximately time parity (average of 3.03s/move for #176 and 3.4s/move for #157)
#176 wins 13:7 (65% , all games by resignation, 8 wins as W, 5 as B)
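How much weight the two quoted results can bear may be judged with an exact binomial test against the hypothesis that both networks are equally strong at the given settings. This is a minimal sketch, assuming independent games and no draws (all games here ended by resignation); the function name is illustrative.

```python
from math import comb


def two_sided_binomial_p(wins: int, games: int) -> float:
    """Exact two-sided p-value for `wins` out of `games` if both sides were equally strong."""
    k = max(wins, games - wins)
    # Probability of a result at least this lopsided in one direction
    tail = sum(comb(games, i) for i in range(k, games + 1)) / 2 ** games
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    print(f"13:7  -> p = {two_sided_binomial_p(13, 20):.3f}")
    print(f"29:11 -> p = {two_sided_binomial_p(29, 40):.3f}")
```

A large p-value means the score is still compatible with two equally strong engines, which is the sense in which 20 games is "not enough".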

Author:  nbc44 [ Sat Sep 15, 2018 3:02 am ]
Post subject:  Re: LZ's progression

Vargo wrote:
Here it is :
20 game match between LZ0.15 #157 and #176
--visits=3201 for #176
--visits=12801 for #157
which amounts to approximately time parity (average of 3.03s/move for #176 and 3.4s/move for #157)
no pondering, twogtp V1.4.10, 2x1080Ti
Average game length : 256 moves

#176 wins 13:7
(65% , all games by resignation, 8 wins as W, 5 as B)

Even if 20 games is not enough, it seems you were right for the longer time settings ;-)


My test (l0 v15 #157 vs #176, still in progress) :

Code:
C:\APPS\l0gpu\validation.exe -k 157-176 -b C:\APPS\l0gpu\leelaz -n C:\APPS\net\d351f06e.gz -o "-g -v 12801 --gpu 0 --gpu 1 --noponder -t 12 -q -d -r 5 --timemanage off -w" -b C:\APPS\l0gpu\leelaz -n C:\APPS\net\dabff367.gz -o "-g -v 3201 --gpu 0 --gpu 1 --noponder -t 12 -q -d -r 5 --timemanage off -w"

Code:
Stopping engine.
25 wins, 15 losses
40 games played.
Status: 0 LLR 0.821218 Lower Bound -2.94444 Upper Bound 2.94444
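The "Lower Bound" and "Upper Bound" in this output are consistent with the stopping thresholds of a sequential probability ratio test (SPRT) with 5% error rates, since ln(0.95 / 0.05) = ln 19 ≈ 2.94444. The sketch below reproduces those bounds. The log-likelihood ratio function is a simplified win/loss version, and its hypotheses (0 Elo versus +35 Elo) are assumptions for illustration, so it is not expected to reproduce the reported LLR exactly.

```python
import math


def sprt_bounds(alpha: float = 0.05, beta: float = 0.05):
    """Wald's SPRT stopping thresholds for the log-likelihood ratio."""
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    return lower, upper


def elo_to_score(elo: float) -> float:
    """Expected score for a given Elo advantage (logistic model)."""
    return 1 / (1 + 10 ** (-elo / 400))


def simple_llr(wins: int, losses: int, elo0: float = 0.0, elo1: float = 35.0) -> float:
    """Simplified LLR ignoring draws; elo0 and elo1 are assumed hypotheses."""
    p0, p1 = elo_to_score(elo0), elo_to_score(elo1)
    return wins * math.log(p1 / p0) + losses * math.log((1 - p1) / (1 - p0))


if __name__ == "__main__":
    lower, upper = sprt_bounds()
    print(f"bounds: {lower:.5f} .. {upper:.5f}")
    llr = simple_llr(25, 15)
    print(f"simplified LLR: {llr:.3f}")
    print("keep playing" if lower < llr < upper else "test decided")
```

While the LLR stays between the two bounds, the test has not reached a decision and more games are played.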


P.S. If someone wants the games, I can upload them (after the end of the test).

Author:  explo [ Sat Sep 15, 2018 11:00 am ]
Post subject:  Re: LZ's progression

Who won the 25 games?

Author:  Vargo [ Sat Sep 15, 2018 12:19 pm ]
Post subject:  Re: LZ's progression

There's a new 256x40b network (#177), a good occasion to see if the result of the last match (157 v 176) still holds.

20 game match between LZ0.15 #157 and #177
--visits=3201 for #177
--visits=12801 for #157
approximately time parity (#157 takes a little more time)
no pondering, twogtp V1.4.10, 2x1080Ti

It's a draw 10:10
(all games by resignation)
So, not as good a result as the last match, but a confirmation that the new networks have caught up with the old 20b (given enough time)
Attachment:
177isW.zip [9.56 KiB]
Attachment:
177isB.zip [9.32 KiB]
nbc44 wrote:
My test (l0 v15 #157 vs #176, still in progress) :
Happy to see someone else run matches, thanks ! Looking forward to the final result :)

Author:  nbc44 [ Sat Sep 15, 2018 12:51 pm ]
Post subject:  Re: LZ's progression

explo wrote:
Who won the 25 games?

#157

Author:  moha [ Sat Sep 15, 2018 2:33 pm ]
Post subject:  Re: LZ's progression

Vargo wrote:
It's a draw 10:10 (all games by resignation)
So, not as good a result as the last match, but a confirmation that the new networks have caught up with the old 20b (given enough time)
Depends on what time is "enough" time. :) (I guess you meant old 15b.) Allowing 6 sec instead of 3, for example (6400 visits instead of 3200), would likely shift the score a few percent in 40b's favor (random variance aside), and so on with even more time.

These scaling effects are the heart of the problem. A more practical question is how many visits a user would get in daily use (on which hardware?) when analysing his games.

Author:  nbc44 [ Sun Sep 16, 2018 1:55 am ]
Post subject:  Re: LZ's progression

Vargo wrote:
Looking forward to the final result :)


Nothing interesting right now :D :
Code:
68 wins, 46 losses
114 games played.
Status: 0 LLR 1.64871 Lower Bound -2.94444 Upper Bound 2.94444


P.S. I think 12801 visits is too big for this test.

Author:  Vargo [ Sun Sep 16, 2018 2:25 am ]
Post subject:  Re: LZ's progression

moha wrote:
I guess you meant old 15b
Yes, 20b networks are for Lc0 , Leela Chess Zero is similar to LZ (description HERE)
The Computer Chess Championship is going on these days HERE, and Lc0 is doing particularly well.

moha wrote:
how many visits a user would get in daily use (on which hardware?) when analysing his games
To get 3200 visits (#177) or 12800 visits (#157), with 2x1080Ti, it's around 3 sec/move. For one dedicated GPU, maybe from 5-6 sec for one 1080Ti to 15-20sec (?)

Author:  explo [ Sun Sep 16, 2018 3:45 am ]
Post subject:  Re: LZ's progression

Based on using lizzie, I need around a minute to get 3200 visits on a 40b network. I have a GTX 1050 which I guess is better than what most go players have. Right now most people should rather use #157 if they want to briefly review a game and identify mistakes.

Author:  moha [ Sun Sep 16, 2018 4:29 am ]
Post subject:  Re: LZ's progression

Tests like these could identify the visit points where successive 40b networks overcome the 15b on 4x visits, measuring 40b progress and letting users choose the stronger size for their hardware and time/patience during reviews.

On weak hw this turning point may well remain too high even after substantial 40b training (network strength has less effect during the first bunch of visits, with shallow searches, so the turning point may not decrease too fast in this range), but on 1080ti it seems to be in reach already.

Author:  Vargo [ Sun Sep 16, 2018 10:58 am ]
Post subject:  Re: LZ's progression

explo wrote:
Based on using lizzie, I need around a minute to get 3200 visits on a 40b network.
On my laptop (gpu : GTX965M), I've just run such a game with twogtp (#157 at 12801 visits v. #177 at 3201 visits)
Total time was 5702s. for 286 moves, around 20s. per move. I don't know why there's such a difference...
Attachment:
sc.jpg [ 79.32 KiB ]

All times are UTC - 8 hours [ DST ]