Life In 19x19

Posted: **Wed Jul 24, 2019 3:56 am**

what does "special ladder support" mean?

Posted: **Wed Jul 24, 2019 4:10 pm**

Marcel Grünauer wrote: Two things I'd really like are a "benchmark" command like Leela Zero's and a way to make the first moves more random, using something like Leela Zero's "--randomcnt" and "--randomvisits".

I'm using KataGo to play handicap games and it always goes into all available 3-3 points, or if I use a free handicap of, say, four komoku, it plays a keima against the first one and then always attaches against the other komoku stones.

Yep, if you've read my reply to hoa803 right above, the next version will have better support for testing and tuning performance. In the meantime, if you're willing to experiment further, you might be able to do a bit better than just setting it to a number like 10 (unless you did experiment a fair bit in order to eventually settle on 10 as best!). Otherwise, just stay tuned, I guess, for future versions.

For randomizing early play, you can fiddle with the "chosenMoveTemperatureEarly" parameter and the other parameters near it. Actually, if you want to tune anything about KataGo, you might just want to read through the whole gtp_example.cfg! The comments should explain roughly what most of the settings do.
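To make the idea of a move-selection temperature concrete, here is a minimal Python sketch. It is not KataGo's code: the function names, the example visit counts, and the weighting formula (visit share raised to the power 1/temperature) are assumptions chosen for illustration. The comments in gtp_example.cfg remain the reference for what "chosenMoveTemperatureEarly" actually does.

```python
import random

def move_probabilities(visits, temperature):
    """Turn visit counts into selection probabilities (illustrative formula)."""
    if temperature <= 0:
        # Degenerate case: always pick the most-visited move.
        best = max(visits, key=visits.get)
        return {move: (1.0 if move == best else 0.0) for move in visits}
    max_visits = max(visits.values())
    # Dividing by the maximum keeps the powers numerically well behaved.
    weights = {m: (v / max_visits) ** (1.0 / temperature) for m, v in visits.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

def sample_move(visits, temperature, rng=random):
    """Draw one move at random according to the probabilities above."""
    probs = move_probabilities(visits, temperature)
    moves = list(probs)
    return rng.choices(moves, weights=[probs[m] for m in moves], k=1)[0]

# Hypothetical visit counts for three candidate moves.
visits = {"R17": 600, "Q16": 300, "C3": 100}
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, move_probabilities(visits, t))
```

In this sketch, a temperature near 0 always picks the most-visited move, 1 picks in proportion to visits, and higher values flatten the distribution so that less-visited moves get played more often.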

The policy output does weight 3-3 quite heavily. I agree opening variety is a bit low, a common problem for all the 'zero' bots to varying degrees. It's still a topic of research how to encourage variety in a way that isn't totally contrived and that actually helps training strength rather than hurting it. For now though, in the worst case you could force-play alternative moves for white in one or two of the corners before continuing as a normal handicap game. You can also try free-placement patterns that don't involve a stone in each corner: use the stones for corner enclosures of different kinds, or Chinese/star-point/other side formations, and let white have a corner.

And wrote:what does "special ladder support" mean?

KataGo's neural net gets an input plane that marks all stones that are in or can be put into repeated inescapable atari, both now and over the last 2 turns (to help with seeing when moves are ladder breakers/makers). So it should never be mistaken about the status of any direct ladder. In rare situations when very losing and also very behind in score, you might still see it play out a step or two of a broken ladder though.

Posted: **Thu Jul 25, 2019 3:22 am**

lightvector, thank you. Interestingly, was this ladder-related feature of KataGo obtained as a result of training, or was it built into the program from the start?

Posted: **Thu Jul 25, 2019 4:42 am**

And wrote:lightvector, thank you. Interestingly, was this ladder-related feature of KataGo obtained as a result of training, or was it built into the program from the start?

Re-read what I just wrote.

lightvector wrote: KataGo's neural net gets an input plane that marks all stones that are in or can be put into repeated inescapable atari

It's an input to the neural net. So it's not a thing that is being learned or produced by the neural net, it's being provided as an input. Although the neural net will still need to learn how to use the information or how much attention to give to it.

(In case you're unfamiliar with the terminology, in ALL zero-trained programs, the neural net is the thing that gets trained, and is the part of a program responsible for pattern recognition for what moves are worth considering further, as well as judging the whole board to determine whether the result of a possible line of play is good or bad).
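As a rough illustration of what "an input plane" means here, the following Python sketch builds a 19x19 grid of 0s and 1s from a list of marked stones. The function name, the coordinates, and the plane layout are all hypothetical and are not KataGo's actual feature encoding. The point is only that ordinary code fills in the plane before the net is evaluated, and the net then learns how to use it.

```python
BOARD_SIZE = 19

def make_plane(marked_points, size=BOARD_SIZE):
    """Return a size x size grid with 1 at each marked (row, col), else 0."""
    plane = [[0] * size for _ in range(size)]
    for row, col in marked_points:
        if not (0 <= row < size and 0 <= col < size):
            raise ValueError(f"point off the board: {(row, col)}")
        plane[row][col] = 1
    return plane

# Hypothetical result of a hand-written ladder reader (not shown here):
# the stones it found to be capturable in a ladder.
laddered_stones = [(3, 3), (3, 4)]

# This plane would be handed to the net next to the other input planes
# (stone positions and so on). Nothing in it is learned.
ladder_plane = make_plane(laddered_stones)
print(sum(map(sum, ladder_plane)), "stones marked")
```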

Posted: **Thu Jul 25, 2019 5:01 am**

lightvector, I know almost nothing about the inner workings of neural networks, but I understand roughly how they work. Besides, I do not know English and I use a translator (and the translator often translates in such a way that nothing is clear at all!). Thanks for the answer!

P.S. I took a position at the start of a ladder from a game that ELFv2 lost. KataGo immediately understood what would happen next and played differently! I like KataGo more and more!

Posted: **Fri Jul 26, 2019 9:58 am**

On a GT 610, compared to numSearchThreads = 1, position calculation is 2% faster with numSearchThreads = 4, 9% faster with 8, 16% faster with 16, and 19% faster with 32. Is the result better on powerful video cards? Can I change other settings to improve performance? (OpenCL)

Posted: **Sat Jul 27, 2019 8:37 am**

hoa803 wrote: At this setting KataGo was close to twice as fast generating playouts as the 40-block LZ network.

How did you determine this? By measuring the time to calculate a position?

Posted: **Sat Jul 27, 2019 11:35 am**

And wrote:On a GT 610, compared to numSearchThreads = 1, position calculation is 2% faster with numSearchThreads = 4, 9% faster with 8, 16% faster with 16, and 19% faster with 32. Is the result better on powerful video cards? Can I change other settings to improve performance? (OpenCL)

Probably there aren't many settings you can change here that will make a big difference, it's more that KataGo's OpenCL implementation just needs more optimization work - not surprising since KataGo is the first time I've ever written any GPU code whatsoever, and I'm learning as I go.

Once I get more free time to work on it, I'll continue to incrementally improve things in addition to having better benchmarking tools. Future versions some weeks or months down the line should be better.

Posted: **Sun Jul 28, 2019 4:35 am**

Thanks! Waiting for new releases! We are also waiting for the CPU-only version!

Posted: **Mon Jul 29, 2019 10:30 am**

And wrote:Thanks! Waiting for new releases! We are also waiting for the CPU-only version!

You can already run on the CPU if it supports OpenCL. It's quite slow though, since the CPU slowness compounds with the unoptimised OpenCL code.

Posted: **Mon Jul 29, 2019 4:58 pm**

afar wrote:
And wrote:Thanks! Waiting for new releases! We are also waiting for the CPU-only version!
You can already run on the CPU if it supports OpenCL. It's quite slow though, since the CPU slowness compounds with the unoptimised OpenCL code.

Unoptimised? Ouch.

Compared to a naive implementation (i.e. just writing the code literally, taking advantage of the OpenCL kernel parallelism but nothing else special, which is what I started with just to make sure it all compiled), it's perhaps a factor of 5 to 20 better, depending on your hardware. I did spend many hours learning about GPU optimization and improving things over that baseline, although there's still more to do. It calls out to CLBlast for fast matrix multiplication, uses Winograd 4x4 tiles, and a few other tricks. (By the way, the Winograd convolution algorithm and the math behind it are pretty sweet.)

It's just that there's maybe another factor of 1.5 to 3 left to gain depending on your hardware. There's always a little more that can be done.

And yes, if you have a CPU OpenCL implementation (possibly you can download one that works for your CPU architecture), you can run it on the CPU already, but the OpenCL implementation is optimized for GPU and not CPU, and is likely already doing things that make no sense for performance if the underlying engine is a CPU, so it will probably never compete with a sufficiently well-written CPU-specific version. I have a colleague/friend working on getting a CPU-specific version working. Although of course, getting it working and optimizing it fully are again two very different tasks.
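For anyone curious about the Winograd trick mentioned above, here is a minimal Python sketch of its smallest textbook case, the 1D F(2,3) transform. It computes 2 outputs of a 3-tap filter with 4 multiplications instead of 6. This is not KataGo's OpenCL kernel, which works on 2D tiles; the function names and test values are made up for illustration.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]

def winograd_f23(d, g):
    """Same two outputs using only 4 multiplications of transformed values."""
    # Filter transform: depends only on g, so it can be precomputed once.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Input transform: additions and subtractions only.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]
    # The only multiplications that involve the input data.
    m0, m1, m2, m3 = v0 * u0, v1 * u1, v2 * u2, v3 * u3
    # Output transform: additions and subtractions only.
    return [m0 + m1 + m2, m1 - m2 - m3]

d = [1.0, 2.0, 3.0, 4.0]   # arbitrary input tile
g = [0.5, -1.0, 2.0]       # arbitrary filter taps
print(direct_f23(d, g), winograd_f23(d, g))
```

The saving comes from the filter transform being reusable across every tile, so the per-tile cost is dominated by the 4 elementwise multiplications.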

Posted: **Tue Jul 30, 2019 4:35 am**

I played several games of KataGo vs. ELF v2 (OpenCL) on a GT 610 at 30 s/move. With numSearchThreads = 16, ELF won every game; with numSearchThreads = 1, KataGo won every game. Is this normal? Can strength drop that much as numSearchThreads increases? Or is it just too few games? (I want to play ~50.)

Posted: **Tue Jul 30, 2019 5:21 am**

More threads makes a bot much weaker per visit. This is true of LZ too. So increasing threads is only a good idea if it produces a significant increase in visits per second, enough to compensate for the loss in strength.
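A back-of-the-envelope way to read this, sketched in Python: if extra threads multiply visits per second by some factor, each visit must keep at least 1/factor of its value for the change to break even. Treating strength as proportional to "effective visits" is a crude assumption made only for this sketch, and the function names are made up. The speedup figures are the GT 610 numbers reported earlier in the thread.

```python
# Visits-per-second speedups relative to 1 thread, as reported above.
reported_speedup = {4: 1.02, 8: 1.09, 16: 1.16, 32: 1.19}

def break_even_efficiency(speedup):
    """Smallest per-visit efficiency (1.0 = as good as 1 thread) that breaks even."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup

def worth_more_threads(speedup, per_visit_efficiency):
    """True if the extra visits outweigh the per-visit loss in this crude model."""
    return speedup * per_visit_efficiency > 1.0

for threads, speedup in reported_speedup.items():
    needed = break_even_efficiency(speedup)
    print(f"{threads} threads: each visit must keep at least {needed:.0%} of its value")
```

With speedups this small there is very little room for any per-visit loss, which fits the game results reported above.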

Posted: **Tue Jul 30, 2019 2:36 pm**

lightvector wrote:More threads makes a bot much weaker per visit. This is true of LZ too. So increasing threads is only a good idea if it produces a significant increase in visits per second, enough to compensate for the loss in strength.

@lightvector - could you give us the "AI Go for Dummies" explanation behind the weakness caused by increasing threads?

Posted: **Tue Jul 30, 2019 2:59 pm**

Let's say you have given the bot a budget of 10,000 visits (playouts) for a move, either by setting a time limit in which your hardware can do that many, or by setting the number of playouts directly.

With 1 thread, the bot will look through its options with the set algorithm and then choose a move based on visit count, confidence, or whatever criterion has been chosen.
The point is that this 1 thread will use all 10,000 playouts, and it will most likely use them in a way that makes sense for Go.

In Go, many things become apparent only very deep in the search tree. A capturing race, for example, might work with only 1 very specific move sequence, and the bot has to search through all of it to find it. If the move is then chosen by playout count, you additionally need enough playouts on it to make it the most visited answer.

Same thing with a ladder: you might need to spend 3,000-4,000 playouts to make sure that a ladder works or does not work, since at every turn you need to consider all the alternatives.

This all works relatively well with 1 thread, but if you use 10 threads, you are splitting your 10,000-playout budget across 10 threads.

And you might not want the threads to duplicate each other's work.

So thread number 1 now has only 1,000 playouts to figure out that C2 will win the capturing race. The answer that C2 is correct, however, lies so deep that 1,000 playouts are not enough.
The other 9 threads will most likely be looking at other moves. Not a single thread will look deep enough, and the answers in Go are usually hidden very deep.
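The arithmetic in this explanation can be written out in a few lines of Python. The sketch models only the simplified picture described in this post (an even split of the budget, with each thread working alone), not how any engine actually schedules its threads. The function names are made up, and the ladder cost is the low end of the 3000-4000 estimate above.

```python
def per_thread_budget(total_playouts, num_threads):
    """Playouts each thread gets if the budget is split evenly."""
    if num_threads < 1:
        raise ValueError("need at least one thread")
    return total_playouts // num_threads

def can_resolve(total_playouts, num_threads, playouts_needed):
    """True if a single thread's share is enough to read the line out."""
    return per_thread_budget(total_playouts, num_threads) >= playouts_needed

TOTAL_PLAYOUTS = 10_000
LADDER_COST = 3_000  # low end of the estimate in the post

for threads in (1, 2, 3, 4, 10):
    share = per_thread_budget(TOTAL_PLAYOUTS, threads)
    ok = can_resolve(TOTAL_PLAYOUTS, threads, LADDER_COST)
    print(f"{threads:>2} threads: {share:>5} playouts each, ladder resolved: {ok}")
```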

Funnily, the ELF network running on Leela sometimes actually does play stronger with more threads. But this is only because the ELF network is so over-sharpened that it only considers 1 or 2 moves. When faced with a stronger opponent that has trained on a wider variety of gameplay, ELF will then lose, because it doesn't understand that some other move might actually be good.
Increasing the number of threads forces the over-sharpened ELF network to look at moves that it normally wouldn't.
But this really is something I have only ever seen ELF do; everything else drops in strength with more threads and batching.

You trade strength for speed (playouts per second).

Anyway, this is how I've understood it. I'm not a huge expert, so feel free to correct me if I'm wrong.

A new run of KataGo released - strength comparable to ELFv2
