KataGo V1.3

For discussing go computing, software announcements, etc.
jann
Lives in gote
Posts: 445
Joined: Tue May 14, 2019 8:00 pm
GD Posts: 0
Been thanked: 37 times

Re: KataGo V1.3

Post by jann »

lightvector wrote:And how much longer "long" would need to be, or if in fact the 30 or 40 block nets have been trained enough to even have better scaling yet? Nobody has tested yet. :)
If there is a significant strength difference (seen at equal visits), I'd be very surprised if the scaling effect didn't appear in high-visit games (exponential policy benefit, IMO).
go4thewin
Lives with ko
Posts: 150
Joined: Thu Jan 23, 2020 6:09 am
Rank: 25 kyu
GD Posts: 0
Has thanked: 200 times
Been thanked: 30 times

Re: KataGo V1.3

Post by go4thewin »

Any chance for a 15b bot trained on 30b games in the future? Maybe it would eventually get stronger than Elf2 at playout parity? Thanks for great bots!
KataGo 1.3.3 s243 20b (1 playout) vs gtp4zen Zen6 7d: 4-0. Wow.
KataGo 1.3.3 s243 20b (350 playouts, 1 thread) vs LZ 125 (4000 playouts, 1 thread) [9d amateur, beat a pro]: 3-1. The two bots were dead even.
Attachments:
350po3.sgf (Leela white)
350po4.sgf (Kata white)
Lastly, 20b s243 vs 20b s191, both at 16 playouts, 1 thread, engine 1.3.3: 4-2.
They are pretty even on my machine. With continued extended training, it will be interesting to watch the progress!
Last edited by go4thewin on Sun Mar 01, 2020 5:42 am, edited 2 times in total.
set katago to play at your level https://docdro.id/sHZU1ti or experiment with gtp4zen ( https://rb.gy/kx2ilb )
Limeztone
Dies in gote
Posts: 63
Joined: Sun Jan 12, 2020 9:28 pm
GD Posts: 0
Has thanked: 8 times
Been thanked: 4 times

Re: KataGo V1.3

Post by Limeztone »

go4thewin wrote:Any chance for a 15b bot trained on 30b games in the future? Maybe it would eventually get stronger than Elf2 at playout parity?
What makes you think that KataGo with the current best 15 block net is not stronger than Elf v2 at playout parity? In my opinion it already is.

Post by go4thewin »

My tests showed them about even. I forget how I tested, so it might not be right. As a strength estimate, CGOS has 15b at 100 playouts even with Kata 1.3.2 s191 at 50 playouts. Leela Zero did 15b extended training with the 40b net, and it was effective for them: the extended-training 15b is one of their strongest nets. It takes time, though; the 30b bot has to get very strong first for the 15b to make any progress.
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: KataGo V1.3

Post by Vargo »

Katago 1.3.3 , Networks :
20b : b20d4686.txt.gz
30b : b30d5259.bin.gz
40b : b40d5243.bin.gz

9x9 : 100 game tests at visits parity (1600 visits). Gogui-twogtp 1.5.1 , 1xGTX1080

Katago 1.3.3 20b v. Katago 1.3.3 30b
no error, 2 duplicate games,
Katago 1.3.3 30b wins 85-13 (86.7 %)

Katago 1.3.3 20b v. Katago 1.3.3 40b
no error, 14 duplicate games,
Katago 1.3.3 40b wins 69-17 (80.2 %)

For 9x9, at visits parity 30b and 40b seem much stronger than 20b


9x9 : 100 game tests at time parity. Gogui-twogtp 1.5.1 , 2xGTX 1080Ti (2s/move, corresponding to ~ 8000 visits for 20b, and to ~3000-3500 visits for 30b or 40b )

Katago 1.3.3 20b v. Katago 1.3.3 30b
no error, 3 duplicate games,
Katago 1.3.3 20b wins 55-42 (56.7 %)

Katago 1.3.3 20b v. Katago 1.3.3 40b
no error, 2 duplicate games,
Katago 1.3.3 20b wins 57-41 (58.1 %)

30b and 40b seem not too far from 20b at time parity.
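
To get a feel for how much weight roughly 100 games can carry, here is a minimal Python sketch that turns the win-loss counts above into a win rate, a 95% Wilson score interval, and an Elo difference under the usual logistic Elo model. The function name `summarize` and the choice of the Wilson interval are illustrative assumptions, not something used in the tests above; duplicate games are excluded, as in the reported scores.

```python
import math

def summarize(wins, losses, z=1.96):
    """Win rate, Wilson score interval, and implied Elo difference.

    Assumes at least one win and one loss (otherwise the Elo term diverges).
    """
    n = wins + losses
    p = wins / n
    # Wilson score interval for a binomial proportion
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Logistic Elo model: rating difference implied by the observed win rate
    elo = 400 * math.log10(p / (1 - p))
    return p, centre - half, centre + half, elo

# Win-loss counts copied from the 9x9 results above (winner listed first)
results = {
    "visit parity, 30b vs 20b": (85, 13),
    "visit parity, 40b vs 20b": (69, 17),
    "time parity, 20b vs 30b": (55, 42),
    "time parity, 20b vs 40b": (57, 41),
}

for label, (wins, losses) in results.items():
    p, low, high, elo = summarize(wins, losses)
    print(f"{label}: {p:.1%} (95% interval {low:.1%} to {high:.1%}), about {elo:+.0f} Elo")
```

Under these assumptions the visit parity gaps are large compared with the interval width, while the two time parity intervals both include 50%, which fits the "not too far" reading.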

Stats (attached images): vpar.jpg, 4.jpg, 3.jpg
And
Gosei
Posts: 1464
Joined: Tue Sep 25, 2018 10:28 am
GD Posts: 0
Has thanked: 212 times
Been thanked: 215 times

Re: KataGo V1.3

Post by And »

Vargo, why didn't you use a stronger network, g170e 20 block s2.43G?

Post by And »

Can anyone explain the meaning of the visits parity tests? I understand this for training networks, but for the user, what's the point?

Post by Vargo »

And wrote: Vargo, why didn't you use a stronger network, g170e 20 block s2.43G?
Maybe I'll try tomorrow with 20b s2.43G.

Post by And »

It would be interesting to see 19x19! Thanks for your tests!

Post by go4thewin »

And wrote:Can anyone explain the meaning of the visits parity tests? I understand this for training networks, but for the user, what's the point?
Visits, especially with one thread, are reproducible across different hardware. That is very useful if you don't want the bot playing at its strongest, like getting gtp4zen to play at 3 kyu instead of 3 dan. The time per move differs between machines, but the visits parameter is the same and reproducible on any hardware. I know exactly how many playouts KataGo needs to play at a pro level, but the time that takes is different on different hardware. Similarly, I know how to get it to play at 4 dan OGS (1 playout). You can play against a set strength level, especially with non-zero bots trained on SGFs.

Post by And »

go4thewin, that much is clear: https://github.com/breakwa11/GoAIRatings
But what is the point of, for example, a match between a 20-block network and a 40-block network at visit parity? It is obvious that a 40-block network is more powerful, but it uses much more time!

Post by jann »

With a visit parity test you measure the strength difference between nets. A net that appears stronger at 500 visits will likely appear stronger at 1500 visits as well.

With a time parity test you measure the strength and speed differences between nets mixed together, along with any hardware or code speed difference. A setup that appears stronger at 10s/move could appear weaker at 30s/move (or on different hardware), because networks of different strength scale differently with more visits (stronger nets tend to benefit more).

The advantage of knowing the strength difference and the speed difference as two independent values comes when you need to predict the result at a different hardware and time setting (where you cannot test directly).
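
As a rough illustration of the speed half of that prediction, the sketch below converts a measured visits-per-second figure into an expected visit budget at other time settings. The speeds are derived from the 9x9 time parity numbers earlier in the thread (2s/move giving about 8000 visits for 20b and about 3000-3500 for 30b); taking the midpoint 3250 and assuming visits scale linearly with time (ignoring tree reuse and batching effects) are simplifying assumptions, and `visits_per_move` is a hypothetical helper name.

```python
def visits_per_move(visits_per_second, seconds_per_move):
    """Expected visits per move, assuming visits scale linearly with time."""
    return visits_per_second * seconds_per_move

# Speeds derived from the 9x9 time parity test in this thread (2 s/move)
speed_20b = 8000 / 2.0   # about 8000 visits in 2 s
speed_30b = 3250 / 2.0   # midpoint of the reported 3000-3500 visits (assumption)

for seconds in (1, 2, 10, 30):
    v20 = visits_per_move(speed_20b, seconds)
    v30 = visits_per_move(speed_30b, seconds)
    print(f"{seconds:>2} s/move: 20b ~{v20:.0f} visits, 30b ~{v30:.0f} visits "
          f"(ratio {v20 / v30:.2f})")
```

This only gives the visit budgets. The strength at those budgets still has to come from visit parity results at comparable visit counts, which is the other independent value.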
Last edited by jann on Sat Feb 29, 2020 3:51 pm, edited 1 time in total.
inbae
Dies in gote
Posts: 25
Joined: Tue Feb 04, 2020 11:07 am
GD Posts: 0
KGS: inbae
Been thanked: 7 times

Re: KataGo V1.3

Post by inbae »

IMHO, benchmarks should be done at playout parity, not at visit parity. While fixed-visit tests can represent the quality of an engine's analysis in some controlled sense, this is not necessarily related to real-world strength. The number of visits depends heavily on search tree reuse and is influenced by policy sharpness. A visit parity test may not differ much from a playout parity test when two very similar engines (like LZ with two different networks) play each other, but it becomes dubious when the two engines are very different (like LZ vs KG). Playout parity, on the other hand, is more appropriate for measuring engine strength, since the number of playouts is proportional to the time spent.
xela
Lives in gote
Posts: 652
Joined: Sun Feb 09, 2014 4:46 am
Rank: Australian 3 dan
GD Posts: 200
Location: Adelaide, South Australia
Has thanked: 219 times
Been thanked: 281 times

Re: KataGo V1.3

Post by xela »

I think a lot of people tend to use "visits" and "playouts" interchangeably. (The Lizzie interface doesn't help, showing "playouts" and "visits/second" where both are measuring the same thing.)

If there's a difference, my understanding is that "one playout" is one round of exploring from the root to a leaf node, and one playout adds one visit to every node along the way, so that one playout = multiple visits. With this definition, I'd expect visits per second to be more or less constant (ignoring tree reuse), and playouts per second to vary according to the tree depth (which is influenced by policy sharpness). A deep tree with little branching means that each playout requires a lot of visits, so that you get fewer playouts per second. A shallow tree with lots of branching will give you shorter branches on average, so more playouts per second. Tree reuse will affect both numbers.
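
A toy Python sketch of the definition in the previous paragraph: each playout walks from the root to a leaf, adds one visit to every node on the way, and then expands the leaf. Everything here is illustrative: the `Node` class, the uniform random child choice (a stand-in for a real policy-guided selection rule), and the fixed branching factor are assumptions, and actual engines may define and count these terms differently.

```python
import random

class Node:
    def __init__(self):
        self.visits = 0
        self.children = []

def playout(root, branching, rng):
    """One root-to-leaf descent; returns the number of visits it added."""
    node = root
    node.visits += 1
    added = 1
    while node.children:
        # Uniform random choice stands in for a policy-guided selection rule
        node = rng.choice(node.children)
        node.visits += 1
        added += 1
    # Expand the leaf so later playouts can go one level deeper here
    node.children = [Node() for _ in range(branching)]
    return added

def average_visits_per_playout(branching, playouts, seed=0):
    rng = random.Random(seed)
    root = Node()
    total = sum(playout(root, branching, rng) for _ in range(playouts))
    return total / playouts

# branching=1 mimics an extremely sharp policy (a single deep line);
# branching=8 mimics a flatter policy (a wide, shallow tree).
print("narrow tree:", average_visits_per_playout(1, 100))
print("wide tree:  ", average_visits_per_playout(8, 100))
```

With this way of counting, the narrow tree needs more visits per playout than the wide one, which is the effect described above.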

You might notice that Lizzie on an empty board will give you large numbers of "visits/second" (actually playouts in my terminology here), but when you add a few stones, the "visits/second" drops.

inbae, am I using the words in the same way as you, or do you have different definitions?

Post by Vargo »

100 game test 19x19 : KataGo 1.3.3 b20d52587 v. KataGo 1.3.3 b30d5259 at time parity

1s/move, corresponding to ~2000 visits for 20b and to ~800 visits for 30b, twogtp 1.5.1, no error, no duplicate game, all games by resignation.
b20 wins 67-33

Stats (attached image): 1x.jpg