Tuning with 50000 visits:
Z:\>LG0\Lizzie\katago\katago.exe genconfig -model \LG0\Lizzie\katago\g170-b30c320x2-s1287828224-d525929064.bin.gz -output gtp_custom.cfg
=========================================================================
RULES
What rules should KataGo use by default for play and analysis?
(chinese, japanese, korean, tromp-taylor, aga, chinese-ogs, new-zealand, bga, stone-scoring, aga-button):
japanese
=========================================================================
SEARCH LIMITS
When playing games, KataGo will always obey the time controls given by the GUI/tournament/match/online server.
But you can specify an additional limit to make KataGo move much faster. This does NOT affect analysis/review,
only affects playing games. Add a limit? (y/n) (default n):
n
NOTE: No limits configured for KataGo. KataGo will obey time controls provided by the GUI or server or match script
but if they don't specify any, when playing games KataGo may think forever without moving. (press enter to continue)
When playing games, KataGo can optionally ponder during the opponent's turn. This gives faster/stronger play
in real games but should NOT be enabled if you are running tests with fixed limits (pondering may exceed those
limits), or to avoid stealing the opponent's compute time when testing two bots on the same machine.
Enable pondering? (y/n, default n):y
Specify max num seconds KataGo should ponder during the opponent's turn. Leave blank for no limit:
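For reference, answering "y" here and leaving the time prompt blank should translate into pondering settings in the generated config. A sketch using option names from KataGo's example gtp config (the exact lines genconfig writes may differ, so check gtp_custom.cfg itself):

ponderingEnabled = true
# maxTimePondering = 60   # left blank above, so no ponder time cap is set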
=========================================================================
GPUS AND RAM
Finding available GPU-like devices...
Found CUDA device 0: GeForce RTX 2080 Ti
Found CUDA device 1: GeForce RTX 2080 Ti
Specify devices/GPUs to use (for example "0,1,2" to use devices 0, 1, and 2). Leave blank for good default:
"0,1"
could not parse int: "0
Specify devices/GPUs to use (for example "0,1,2" to use devices 0, 1, and 2). Leave blank for good default:
0,1
By default, KataGo will cache up to about 3GB of positions in memory (RAM), in addition to
whatever the current search is using. Specify a max in GB or leave blank for default:
60
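Note the parse error above: the device list must be entered without quotes, i.e. 0,1 rather than "0,1". Selecting both GPUs should produce per-device lines in the generated config. A sketch using option names from KataGo's example gtp config for the CUDA backend (assumed, not copied from the actual gtp_custom.cfg):

numNNServerThreadsPerModel = 2   # one neural-net server thread per GPU
cudaDeviceToUseThread0 = 0       # first RTX 2080 Ti
cudaDeviceToUseThread1 = 1       # second RTX 2080 Ti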
=========================================================================
PERFORMANCE TUNING
Specify number of visits to use test/tune performance with, leave blank for default based on GPU speed.
Use large number for more accurate results, small if your GPU is old and this is taking forever:
50000
Specify number of seconds/move to optimize performance for (default 5), leave blank for default:
2020-03-12 22:55:26+0100: Loading model and initializing benchmark...
=========================================================================
TUNING NOW
Tuning using 50000 visits.
Automatically trying different numbers of threads to home in on the best:
2020-03-12 22:55:26+0100: nnRandSeed0 = 2369906978592220054
2020-03-12 22:55:26+0100: After dedups: nnModelFile0 = \LG0\Lizzie\katago\g170-b30c320x2-s1287828224-d525929064.bin.gz useFP16 auto useNHWC auto
2020-03-12 22:55:28+0100: Cuda backend: Found GPU GeForce RTX 2080 Ti memory 11811160064 compute capability major 7 minor 5
2020-03-12 22:55:28+0100: Cuda backend: Found GPU GeForce RTX 2080 Ti memory 11811160064 compute capability major 7 minor 5
2020-03-12 22:55:28+0100: Cuda backend: Model version 8 useFP16 = true useNHWC = true
2020-03-12 22:55:28+0100: Cuda backend: Model name: g170-b30c320x2-s1287828224-d525929064
2020-03-12 22:55:28+0100: Cuda backend: Model version 8 useFP16 = true useNHWC = true
2020-03-12 22:55:28+0100: Cuda backend: Model name: g170-b30c320x2-s1287828224-d525929064
Possible numbers of threads to test: 1, 2, 3, 4, 5, 6, 8, 10, 12, 16, 20, 24, 32,
numSearchThreads = 5: 10 / 10 positions, visits/s = 533.10 nnEvals/s = 350.16 nnBatches/s = 213.88 avgBatchSize = 1.64 (938.0 secs)
numSearchThreads = 12: 10 / 10 positions, visits/s = 1131.75 nnEvals/s = 769.38 nnBatches/s = 198.99 avgBatchSize = 3.87 (441.9 secs)
numSearchThreads = 10: 10 / 10 positions, visits/s = 964.41 nnEvals/s = 649.12 nnBatches/s = 204.31 avgBatchSize = 3.18 (518.5 secs)
numSearchThreads = 20: 10 / 10 positions, visits/s = 1520.41 nnEvals/s = 1003.61 nnBatches/s = 152.46 avgBatchSize = 6.58 (329.0 secs)
numSearchThreads = 16: 10 / 10 positions, visits/s = 1387.92 nnEvals/s = 932.16 nnBatches/s = 178.77 avgBatchSize = 5.21 (360.4 secs)
numSearchThreads = 24: 10 / 10 positions, visits/s = 1624.20 nnEvals/s = 1089.80 nnBatches/s = 136.46 avgBatchSize = 7.99 (308.0 secs)
numSearchThreads = 32: 10 / 10 positions, visits/s = 1796.26 nnEvals/s = 1201.35 nnBatches/s = 113.86 avgBatchSize = 10.55 (278.5 secs)
Optimal number of threads is fairly high, tripling the search limit and trying again.
2020-03-12 23:49:10+0100: nnRandSeed0 = 6506758374797114957
2020-03-12 23:49:10+0100: After dedups: nnModelFile0 = \LG0\Lizzie\katago\g170-b30c320x2-s1287828224-d525929064.bin.gz useFP16 auto useNHWC auto
2020-03-12 23:49:13+0100: Cuda backend: Found GPU GeForce RTX 2080 Ti memory 11811160064 compute capability major 7 minor 5
2020-03-12 23:49:13+0100: Cuda backend: Found GPU GeForce RTX 2080 Ti memory 11811160064 compute capability major 7 minor 5
2020-03-12 23:49:13+0100: Cuda backend: Model version 8 useFP16 = true useNHWC = true
2020-03-12 23:49:13+0100: Cuda backend: Model name: g170-b30c320x2-s1287828224-d525929064
2020-03-12 23:49:13+0100: Cuda backend: Model version 8 useFP16 = true useNHWC = true
2020-03-12 23:49:13+0100: Cuda backend: Model name: g170-b30c320x2-s1287828224-d525929064
Possible numbers of threads to test: 1, 2, 3, 4, 5, 6, 8, 10, 12, 16, 20, 24, 32, 40, 48, 64, 80, 96,
numSearchThreads = 6: 10 / 10 positions, visits/s = 626.73 nnEvals/s = 407.14 nnBatches/s = 209.06 avgBatchSize = 1.95 (797.9 secs)
numSearchThreads = 48: 10 / 10 positions, visits/s = 2214.93 nnEvals/s = 1421.03 nnBatches/s = 93.34 avgBatchSize = 15.22 (226.0 secs)
numSearchThreads = 64: 10 / 10 positions, visits/s = 2301.42 nnEvals/s = 1500.58 nnBatches/s = 77.43 avgBatchSize = 19.38 (217.5 secs)
numSearchThreads = 80: 10 / 10 positions, visits/s = 2322.34 nnEvals/s = 1543.88 nnBatches/s = 65.55 avgBatchSize = 23.55 (215.6 secs)
numSearchThreads = 40: 10 / 10 positions, visits/s = 1983.09 nnEvals/s = 1353.57 nnBatches/s = 104.84 avgBatchSize = 12.91 (252.3 secs)
Ordered summary of results:
numSearchThreads = 5: 10 / 10 positions, visits/s = 533.10 nnEvals/s = 350.16 nnBatches/s = 213.88 avgBatchSize = 1.64 (938.0 secs) (EloDiff baseline)
numSearchThreads = 6: 10 / 10 positions, visits/s = 626.73 nnEvals/s = 407.14 nnBatches/s = 209.06 avgBatchSize = 1.95 (797.9 secs) (EloDiff +57)
numSearchThreads = 10: 10 / 10 positions, visits/s = 964.41 nnEvals/s = 649.12 nnBatches/s = 204.31 avgBatchSize = 3.18 (518.5 secs) (EloDiff +208)
numSearchThreads = 12: 10 / 10 positions, visits/s = 1131.75 nnEvals/s = 769.38 nnBatches/s = 198.99 avgBatchSize = 3.87 (441.9 secs) (EloDiff +264)
numSearchThreads = 16: 10 / 10 positions, visits/s = 1387.92 nnEvals/s = 932.16 nnBatches/s = 178.77 avgBatchSize = 5.21 (360.4 secs) (EloDiff +334)
numSearchThreads = 20: 10 / 10 positions, visits/s = 1520.41 nnEvals/s = 1003.61 nnBatches/s = 152.46 avgBatchSize = 6.58 (329.0 secs) (EloDiff +362)
numSearchThreads = 24: 10 / 10 positions, visits/s = 1624.20 nnEvals/s = 1089.80 nnBatches/s = 136.46 avgBatchSize = 7.99 (308.0 secs) (EloDiff +381)
numSearchThreads = 32: 10 / 10 positions, visits/s = 1796.26 nnEvals/s = 1201.35 nnBatches/s = 113.86 avgBatchSize = 10.55 (278.5 secs) (EloDiff +408)
numSearchThreads = 40: 10 / 10 positions, visits/s = 1983.09 nnEvals/s = 1353.57 nnBatches/s = 104.84 avgBatchSize = 12.91 (252.3 secs) (EloDiff +436)
numSearchThreads = 48: 10 / 10 positions, visits/s = 2214.93 nnEvals/s = 1421.03 nnBatches/s = 93.34 avgBatchSize = 15.22 (226.0 secs) (EloDiff +471)
numSearchThreads = 64: 10 / 10 positions, visits/s = 2301.42 nnEvals/s = 1500.58 nnBatches/s = 77.43 avgBatchSize = 19.38 (217.5 secs) (EloDiff +467)
numSearchThreads = 80: 10 / 10 positions, visits/s = 2322.34 nnEvals/s = 1543.88 nnBatches/s = 65.55 avgBatchSize = 23.55 (215.6 secs) (EloDiff +451)
Based on some test data, each speed doubling gains perhaps ~250 Elo by searching deeper.
Based on some test data, each thread costs perhaps 7 Elo if using 800 visits, and 2 Elo if using 5000 visits (by making MCTS worse).
So APPROXIMATELY based on this benchmark, if you intend to do a 5 second search:
numSearchThreads = 5: (baseline)
numSearchThreads = 6: +57 Elo
numSearchThreads = 10: +208 Elo
numSearchThreads = 12: +264 Elo
numSearchThreads = 16: +334 Elo
numSearchThreads = 20: +362 Elo
numSearchThreads = 24: +381 Elo
numSearchThreads = 32: +408 Elo
numSearchThreads = 40: +436 Elo
numSearchThreads = 48: +471 Elo (recommended)
numSearchThreads = 64: +467 Elo
numSearchThreads = 80: +451 Elo
Using 48 numSearchThreads!
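The EloDiff column trades raw speed against search quality: more threads raise visits/s (worth roughly 250 Elo per doubling, per the note above), but each extra thread slightly degrades MCTS. A minimal back-of-envelope model in Python using the constants quoted in the log; KataGo's real estimator is more detailed (it interpolates the per-thread cost by visit count), so these outputs only roughly track the EloDiff values above:

import math

# Constants quoted in the genconfig log (approximate).
ELO_PER_DOUBLING = 250.0   # Elo gained per doubling of search speed
ELO_PER_THREAD = 2.0       # per-thread MCTS cost at ~5000 visits

BASELINE_THREADS = 5
BASELINE_VPS = 533.10      # visits/s of the baseline row above

def approx_elo_diff(vps: float, threads: int) -> float:
    """Rough EloDiff vs. the 5-thread baseline."""
    speed_gain = ELO_PER_DOUBLING * math.log2(vps / BASELINE_VPS)
    mcts_cost = ELO_PER_THREAD * (threads - BASELINE_THREADS)
    return speed_gain - mcts_cost

# visits/s figures from the ordered summary above.
for threads, vps in [(6, 626.73), (12, 1131.75), (24, 1624.20),
                     (48, 2214.93), (64, 2301.42), (80, 2322.34)]:
    print(f"numSearchThreads = {threads:2d}: {approx_elo_diff(vps, threads):+5.0f} Elo")

Even with these crude constants, the curve peaks around 48 threads and declines at 64 and 80, matching the tool's recommendation: past some point, extra threads stop buying enough speed to pay for the damage they do to the search.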
=========================================================================
DONE
Writing new config file to gtp_custom.cfg
You should now be able to run KataGo with this config via something like:
LG0\Lizzie\katago\katago.exe gtp -model '\LG0\Lizzie\katago\g170-b30c320x2-s1287828224-d525929064.bin.gz' -config 'gtp_custom.cfg'
Feel free to look at and edit the above config file further by hand in a txt editor.
For more detailed notes about performance and what options in the config do, see:
https://github.com/lightvector/KataGo/b ... xample.cfg
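If you do edit gtp_custom.cfg by hand, the answers given in this session map to a handful of options. A sketch of the relevant lines, using option names from KataGo's example gtp config with values per the answers above (verify against the actual generated file):

rules = japanese
ponderingEnabled = true
numSearchThreads = 48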