as0770 wrote:I have only 4 cores. So pondering with 2 cores per engine would result in an unfair allocation of CPU time. I think even more important is the number of games: 24 games under these conditions will give a better result than 12 games with doubled time or 2 cores for each engine.
Still, these 24 games are not statistically significant. To get an accurate reflection of strength you need at least 100 games per engine. I just want a fast and rough estimate of the strength. Anyway, maybe I will improve the conditions for the top engines one day :)
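To put a rough number on why 24 games is too few: under the usual normal approximation of a binomial score, the 95% confidence interval on the measured win rate is wide enough at n=24 to swamp typical engine rating differences. A minimal sketch (my own illustration, not anything from the thread; the Elo conversion is the standard logistic formula):

```python
import math

def elo_resolution(n_games, p=0.5, z=1.96):
    """Approximate 95% confidence half-width, in Elo points, for a
    measured score p over n_games (normal approximation to binomial)."""
    se = math.sqrt(p * (1 - p) / n_games)   # standard error of the score
    hi = min(p + z * se, 0.999)             # clamp to keep log10 finite
    lo = max(p - z * se, 0.001)
    elo = lambda s: -400 * math.log10(1 / s - 1)  # score -> Elo difference
    return (elo(hi) - elo(lo)) / 2

for n in (24, 100, 400):
    print(f"{n:4d} games: about ±{elo_resolution(n):.0f} Elo")
```

By this estimate, 24 games resolve strength only to within roughly ±150 Elo, while 100 games get you to roughly ±70 Elo, which matches the "at least 100 games" rule of thumb above.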
Yes, it would be an unfair allocation of CPU time (I have a 4-core CPU too). And yes, 24 games are not very statistically significant at this level. But the number of games is not so important if you randomize the games heavily. In that case you only get statistical information about the quality and coverage of the opening book, the patterns, and the position-specific local algorithms (I'm no guru in English or in Go, so my terminology may be off). Meanwhile the main algorithm (the one that matters in serious games against humans), which is based on the number of simulations and therefore depends on the time and performance settings, is not being tested at all in such sparring games...
In one phrase: for a given total number of simulations, you get a more adequate result from a smaller number of games.
And, Alex, why aren't you sparring against MoGo version 4.86?