Engine Tournament

For discussing go computing, software announcements, etc.
zakki
Beginner
Posts: 14
Joined: Wed Feb 22, 2017 4:44 am
GD Posts: 0
Has thanked: 1 time
Been thanked: 3 times

Re: Engine Tournament

Post by zakki »

I usually use Windows, and Rn has no maintainer on Linux.
Pull requests are welcome.
pnprog
Lives with ko
Posts: 286
Joined: Thu Oct 20, 2016 7:21 am
Rank: OGS 7 kyu
GD Posts: 0
Has thanked: 94 times
Been thanked: 153 times

Re: Engine Tournament

Post by pnprog »

zakki wrote:I usually use Windows, and Rn has no maintainer on Linux.
Pull requests are welcome.
I am confident that at some point somebody will show up and help with Linux. Keep up the good work! :tmbup:
I am the author of GoReviewPartner, a small software aimed at assisting reviewing a game of Go. Give it a try!
as0770
Lives with ko
Posts: 180
Joined: Sun Jun 26, 2016 8:07 am
Rank: Beginner
GD Posts: 0
Has thanked: 15 times
Been thanked: 23 times

Re: Engine Tournament

Post by as0770 »

Now in League A: Leela Zero 5773f44c (2018.01.26). It lost 5 games because of a ladder. Leela has also been updated to v0.11.0.

Leela vs. AQ

    1. AQ 2.0.3                        12/16
    2. Leela 0.11.0 Beta 11             4/16

League A:

    1. Leela 0.11.0                    18/20
    2. Rayon 4.6.0                     15/20
    3. Oakfoam 0.2.1 NG-06             12/20
    4. Leela Zero 0.11 5773f44c         7/20
    5. Hiratuka 10.37B (CPU)            6/20
    6. DarkForest v2 MCTS 1.0           2/20

League B:

    1. Leela Zero 0.11 c83e1b6e        15/20
    2. Pachi DCNN 11.99                13/20
    3. DarkGo 1.0                      12/20
    4. Dream Go 0.5.0                  11/20
    5. Ray 9.0.1                        7/20
    6. MoGo 4.86                        2/20

League C:

    1. MoGo 4.86                       18/20
    2. deltaGo 1.0.0                   14/20
    3. Fuego 1.1                       13/20
    4. Michi C-2 1.4.2                  8/20
    5. Orego 7.08                       5/20
    6. GNU Go 3.8                       2/20

League D:

    1. GNU Go 3.8                      25/28
    2. Hara 0.9                        18/28
    3. Matilda 1.25                    16/28
    4. Indigo 2009                     16/28
    5. Dariush 3.1.5.7                 15/28
    6. Aya 6.34                        13/28
    7. Fudo Go 3.0                      7/28
    8. JrefBot 081016-2022              2/28

League E:

    1. JrefBot 081016-2022             16/20             
    2. Iomrascálaí 0.3.2               12/20
    3. SimpleGo 0.4.3                  11/20
    4. Crazy Patterns 0008-13           7/20
    5. Marcos Go 1.0                    7/20
    6. AmiGo 1.8                        7/20

League F:

    1. AmiGo 1.8                       19/20
    2. Beancounter 0.1                 15/20
    3. Stop 0.9-005                    10/20
    4. GoTraxx 1.4.2                    7/20
    5. CopyBot 0.1                      6/20
    6. Brown 1.0                        3/20

Configuration:
League A: 2h/game, pondering off, 4 threads, 2GB on 4 x Intel® Core™ i5-4210H CPU @ 2.90GHz, 8 GiB Ram and GeForce 840M/PCIe/SSE2
TWOGTP="gogui-twogtp -black \"$BLACK\" -white \"$WHITE\" -games 2 -size 19 -time xx -sgffile xxxx"
gogui -size 19 -program "$TWOGTP" -computer-both -auto

League B-F: 1h/game, pondering off, 1 thread, 1GB on 4 x Intel® Core™ i5-4210H CPU @ 2.90GHz, 8 GiB Ram and GeForce 840M/PCIe/SSE2
TWOGTP="gogui-twogtp -black \"$BLACK\" -white \"$WHITE\" -games 2 -size 19 -time xx -sgffile xxxx"
gogui -size 19 -program "$TWOGTP" -computer-both -auto
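
The two-line wrapper above can also be assembled programmatically, which avoids the nested shell quoting. This is only a sketch: `twogtp_command` is a hypothetical helper, it uses just the options that appear in the configuration above, and the `time_spec` and `sgf_prefix` values stay placeholders (`xx`, `xxxx`) exactly as in the post. Engine commands that themselves contain double quotes are not handled.

```python
import shlex


def twogtp_command(black: str, white: str, time_spec: str, sgf_prefix: str,
                   games: int = 2, size: int = 19) -> list[str]:
    """Build the gogui call from the post as an argument list (no shell needed)."""
    # The inner gogui-twogtp command is passed to gogui as ONE argument;
    # each engine command is wrapped in double quotes, as in the TWOGTP line.
    twogtp = (
        f'gogui-twogtp -black "{black}" -white "{white}" '
        f"-games {games} -size {size} -time {time_spec} -sgffile {sgf_prefix}"
    )
    return ["gogui", "-size", str(size), "-program", twogtp,
            "-computer-both", "-auto"]


# Placeholders "xx" and "xxxx" are kept from the post; fill in real values.
cmd = twogtp_command("gnugo --mode gtp --level 10 --resign-allowed",
                     "mogo", "xx", "xxxx")
print(shlex.join(cmd))  # shell-safe rendering, for inspection only
```

The resulting list could be handed to `subprocess.run(cmd)` on a machine where GoGui and both engines are installed.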

Amigo: amigogtp
AQ: AQ

aq_config.txt:
-main time[sec] =7200
-time controll =true
-japanese rule =true

Aya: Aya.exe --mode gtp --level max
Beancounter: beancounter
Brown: brown.exe
Copybot: python /path/to/__main__.py
CrazyPatterns: CrazyPatterns.exe
Dariush: DarGTP.exe --level 10
DarkForest: taskset -c 0 bash cnn_evaluator.sh 1 /data/local/go and taskset -c 0 th cnnPlayerMCTSV2.lua --num_gpu 1 --num_tree_thread 1 --rollout 2500 --win_rate_thres 0.1
DarkGo: darknet go engine cfg/go.test.cfg go.weights
deltaGo: deltaGo.exe
Dream Go: export NUM_ITER=1375 and dream_go
Fudo Go: taskset -c 0 fudo --boardsize=19 --komi=6.5
Fuego: fuego.exe --config fuego.cfg

fuego.cfg:
uct_param_search number_threads 1
uct_param_search lock_free 0
uct_max_memory 1024000000
uct_param_player reuse_subtree 1
uct_param_player ponder 0
uct_param_player early_pass 1

GnuGo: gnugo --mode gtp --level 10 --resign-allowed
GoTraxx: GoTraxx.exe
Hara: hara
Hiratuka: Hiratuka-19×19.exe -po 175000
IndiGo: Indigo.exe -gtp
Iomrascálaí: taskset -c 0,1 iomrascalai
JrefBot: java -jar jrefgo.jar 10000
Leela: leela_gtp_opencl --gtp --threads 4 --noponder
Leela Zero: leelaz --gtp --threads 4 --w /path/to/Leelaz_best-network_yyyy_mm_dd --noponder
Matilda: matilda

matilda.h:
#define BOARD_SIZ 19
#define DEFAULT_UCT_MEMORY 1000
#define DEFAULT_NUM_THREADS 1

Marcos Go: marcos_go --patterns /path/to/patterns.txt --cycles_mcts 10000 --threads_mcts 1
Michi C-2: michi gtp

ui.c:
init_large_patterns("patterns2.prob", "patterns2.spat"); // Michi's pattern files were renamed because they have the same names as Pachi's files.

MoGo: mogo
Oakfoam: oakfoam -c nicego-cnn-06.gtp

nicego-cnn-06.gtp:
param playouts_per_move_max 40000
param thread_count 4

Orego: java -jar /path/to/orego-7.08.jar threads=1 grace
Pachi: pachidcnn -f pachibook.dat threads=4,max_tree_size=2048,pondering=0
Pachi: pachidcnn -f pachibook.dat threads=1,max_tree_size=1024,pondering=0
Ray: ray --time 3600 --thread 1 --no-debug
Rayon: rayon --thread 4 --no-debug
Simple Go: python /path/to/play_gtp.py --node_limit=100
Stop: /usr/bin/java -ea -jar /path/to/stop-09-005.jar --mode gtp

Links:
Amigo: https://sourceforge.net/projects/amigogtp/
AQ: https://github.com/ymgaq/AQ
Aya: http://www.yss-aya.com/
Brown: http://ricoh51.free.fr/go/engineeng.htm
Beancounter: Private
Copybot: https://github.com/sirtango/ICopyMoves
CrazyPatterns: https://www.remi-coulom.fr/Amsterdam2007/
Dariush: http://ricoh51.free.fr/go/engineeng.htm
DarkForest: https://github.com/facebookresearch/darkforestGo
DarkGo: https://pjreddie.com/darknet/darkgo-go-in-darknet/
deltaGo: http://home.q00.itscom.net/otsuki/delta.html
Dream Go: https://github.com/Chicoryn/dream-go
Fudo Go: http://www.geocities.jp/hideki_katoh/
Fuego: http://fuego.sourceforge.net/
GnuGo: https://www.gnu.org/software/gnugo/devel.html
GoTraxx: http://gotraxx.codeplex.com/
Hara: https://github.com/antoniogarro/Hara
Hiratuka: Non GPU version (10.37B): http://www.vector.co.jp/download/file/winnt/game/fh673259.html / GPU version (10.38B): http://www.vector.co.jp/download/file/winnt/game/fh688349.html
IndiGo: http://www.math-info.univ-paris5.fr/~bouzy/INDIGO.html
Iomrascálaí: https://github.com/ujh/iomrascalai
JrefBot: http://ricoh51.free.fr/go/engineeng.htm
Leela: https://sjeng.org/leela.html
Leela Zero: https://github.com/gcp/leela-zero
Marcos Go: https://github.com/MarcosPividori/Go-player
Matilda: https://github.com/gonmf/matilda
Michi C-2: https://github.com/db3108/michi-c2
MoGo: https://lifein19x19.com/forum/viewtopic.php?p=211091#p211091
Oakfoam: https://bitbucket.org/dsmic/oakfoam
Orego: https://sites.google.com/a/lclark.edu/drake/research/orego
Pachi: http://pachi.or.cz/
Rayon: https://github.com/zakki/Ray
Ray: https://github.com/kobanium/Ray
Simple Go: https://sourceforge.net/projects/londerings/
Stop: https://www.vanheusden.com/stop/

Best,
Alex
q30
Lives with ko
Posts: 145
Joined: Sat Aug 13, 2016 8:23 am
Rank: 30 kyu
GD Posts: 0
Has thanked: 1 time
Been thanked: 1 time

Re: Engine Tournament

Post by q30 »

as0770 wrote:
q30 wrote:
as0770 wrote:You have no idea what you are talking about. Standard deviation doesn't change with the time control.

It depends on the randomness of the games, which changes with the time control...


Then we have to rewrite basic mathematical principles.

So it would be good to rewrite your notions of basic mathematical principles...
To begin with, the standard deviation is the square root of a quantity (find the correct English term yourself) that is given by:
https://wikimedia.org/api/rest_v1/media/math/render/svg/1d1610b913011b6744f23f47e0920974b7f78f58,
where p_i in our case depends, among other things, on the time control...
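
For reference, and assuming the linked image shows the usual definition for a discrete random variable (the image is not reproduced in the thread, so this is an assumption), the standard deviation under discussion would be:

```latex
\sigma = \sqrt{\sum_i p_i \,(x_i - \mu)^2}, \qquad \mu = \sum_i p_i \, x_i
% Special case: one game scored x = 1 for a win (probability p) and x = 0 for a loss
\sigma_{\text{game}} = \sqrt{p \,(1 - p)}
```

Under that reading, the time control can enter only through the probabilities p_i.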
as0770
Lives with ko
Posts: 180
Joined: Sun Jun 26, 2016 8:07 am
Rank: Beginner
GD Posts: 0
Has thanked: 15 times
Been thanked: 23 times

Re: Engine Tournament

Post by as0770 »

q30 wrote:So it would be good to rewrite your notions of basic mathematical principles...
To begin with, the standard deviation is the square root of a quantity (find the correct English term yourself) that is given by:
https://wikimedia.org/api/rest_v1/media/math/render/svg/1d1610b913011b6744f23f47e0920974b7f78f58,
where p_i in our case depends, among other things, on the time control...


Nice that you tried to understand my points. Of course the probability _can_ change _slightly_ with the time control. But the result in a 1h match and a 2h match will be more or less the same. What you claim is that the result of a 2h match will show the relative strength more accurately than a 1h match, and that is nonsense.

Two engines of equal strength playing a two-game match have a 50% chance of a 1-1, a 25% chance of a 0-2 and a 25% chance of a 2-0. If you double the time control from 1h to 2h, the overall winning probability will _maybe_ change to 51:49%. Experience from engine matches in chess is that you get basically the same results at 1min/game and 2h/game as long as there is no significant bug. The difference between a 1h/game and a 2h/game match is not measurable. There is no reason why it should be different in Go. Even if the probability changes to 55:45%, you would need hundreds of games to prove the difference in strength. What I run is a tournament with 20 or 30 games. If I run the tournament twice I can get completely different results. This won't change with 2h/game or pondering on (League A is 2h on 4 threads btw).
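
The figures in this post can be checked with a short calculation. The sketch below is not from the thread: it assumes independent games with a fixed win probability, and `games_needed` is a hypothetical helper that applies a rough two-standard-error rule (normal approximation) rather than a formal significance test.

```python
from math import ceil, comb


def match_score_distribution(games: int, p_win: float) -> dict[int, float]:
    """Probability of each possible number of wins in `games` independent games."""
    return {
        wins: comb(games, wins) * p_win**wins * (1 - p_win) ** (games - wins)
        for wins in range(games + 1)
    }


def games_needed(p_true: float, z: float = 2.0) -> int:
    """Games needed before p_true lies z standard errors away from 50%.

    Uses the standard error of a win rate at p = 0.5, i.e. 0.5 / sqrt(n).
    """
    edge = abs(p_true - 0.5)
    return ceil((z * 0.5 / edge) ** 2)


# Two equal engines, two games: 0, 1 or 2 wins for the first engine.
print(match_score_distribution(2, 0.5))  # {0: 0.25, 1: 0.5, 2: 0.25}

# Games needed to separate a 55:45 or a 51:49 edge from an even match.
print(games_needed(0.55))  # 400
print(games_needed(0.51))  # 10000
```

With these assumptions a 55:45 edge needs on the order of 400 games, which is consistent with "hundreds of games" and far more than a 20 or 30 game league provides.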
lightvector
Lives in sente
Posts: 759
Joined: Sat Jun 19, 2010 10:11 pm
Rank: maybe 2d
GD Posts: 0
Has thanked: 114 times
Been thanked: 916 times

Re: Engine Tournament

Post by lightvector »

as0770 wrote:
q30 wrote:So it would be good to rewrite your notions of basic mathematical principles...
To begin with, the standard deviation is the square root of a quantity (find the correct English term yourself) that is given by:
https://wikimedia.org/api/rest_v1/media/math/render/svg/1d1610b913011b6744f23f47e0920974b7f78f58,
where p_i in our case depends, among other things, on the time control...


Nice that you tried to understand my points. Of course the probability _can_ change _slightly_ with the time control. But the result in a 1h match and a 2h match will be more or less the same. What you claim is that the result of a 2h match will show the relative strength more accurately than a 1h match, and that is nonsense.

Two engines of equal strength playing a two-game match have a 50% chance of a 1-1, a 25% chance of a 0-2 and a 25% chance of a 2-0. If you double the time control from 1h to 2h, the overall winning probability will _maybe_ change to 51:49%. Experience from engine matches in chess is that you get basically the same results at 1min/game and 2h/game as long as there is no significant bug. The difference between a 1h/game and a 2h/game match is not measurable. There is no reason why it should be different in Go. Even if the probability changes to 55:45%, you would need hundreds of games to prove the difference in strength. What I run is a tournament with 20 or 30 games. If I run the tournament twice I can get completely different results. This won't change with 2h/game or pondering on (League A is 2h on 4 threads btw).


Although, it's best not to take this heuristic too seriously, because a nontrivial change is possible. I haven't read it that closely, but my skim of the following thread https://github.com/gcp/leela-zero/issues/667 suggested that Leela Zero has sometimes got noticeably different results between very small numbers of playouts, like 5, and a larger number of playouts, like 1600, where the relative strength difference and even sometimes the ordering of strength would change between the neural nets.

It's actually not surprising at all to me that Leela Zero in some cases could have quite a large difference in strength between tiny numbers of playouts and large numbers of playouts, enough to change the ordering between nets. For example, new candidate nets often appear to vary in strength on the order of multiple hundreds of Elo, so training is very noisy, and there's no reason to expect that the quality of the policy part of the neural net and the value part of the neural net always vary together in the same way. And thinking in those terms, it's pretty obvious that you're measuring something fairly different at 5 playouts vs. at 1600 playouts. With very few playouts you rely on the policy net more heavily.

I agree that if you're only running 20 or 30 games, then of course none of this matters, the noise in 20 to 30 games still dwarfs this. :)
q30
Lives with ko
Posts: 145
Joined: Sat Aug 13, 2016 8:23 am
Rank: 30 kyu
GD Posts: 0
Has thanked: 1 time
Been thanked: 1 time

Re: Engine Tournament

Post by q30 »

as0770 wrote:
q30 wrote:
So it would be good to rewrite your notions of basic mathematical principles...
To begin with, the standard deviation is the square root of a quantity (find the correct English term yourself) that is given by:
https://wikimedia.org/api/rest_v1/media ... 74b7f78f58,
where p_i in our case depends, among other things, on the time control...



Nice that you tried to understand my points. Of course the probability _can_ change _slightly_ with the time control. But the result in a 1h match and a 2h match will be more or less the same. What you claim is that the result of a 2h match will show the relative strength more accurately than a 1h match, and that is nonsense.

Two engines of equal strength playing a two-game match have a 50% chance of a 1-1, a 25% chance of a 0-2 and a 25% chance of a 2-0. If you double the time control from 1h to 2h, the overall winning probability will _maybe_ change to 51:49%. Experience from engine matches in chess is that you get basically the same results at 1min/game and 2h/game as long as there is no significant bug. The difference between a 1h/game and a 2h/game match is not measurable. There is no reason why it should be different in Go. Even if the probability changes to 55:45%, you would need hundreds of games to prove the difference in strength. What I run is a tournament with 20 or 30 games. If I run the tournament twice I can get completely different results. This won't change with 2h/game or pondering on (League A is 2h on 4 threads btw).

You are quite right if the same engine is playing itself. But even with 2 simple MC engines (which, when playing each other, will show the chances you mention as the time per move --> 0), there may be a difference in strength (i.e. in chances) that depends on the time control, because their best-move selection algorithms differ (and this applies even more to more complex engines with more complex algorithms).
You can try comparing the results of 2 engines (of similar strength) under the time and thread settings you used for Leagues B-F with the results of the same engines playing with 2' per move and 4 threads...
as0770
Lives with ko
Posts: 180
Joined: Sun Jun 26, 2016 8:07 am
Rank: Beginner
GD Posts: 0
Has thanked: 15 times
Been thanked: 23 times

Re: Engine Tournament

Post by as0770 »

q30 wrote:You are quite right if the same engine is playing itself. But even with 2 simple MC engines (which, when playing each other, will show the chances you mention as the time per move --> 0), there may be a difference in strength (i.e. in chances) that depends on the time control, because their best-move selection algorithms differ (and this applies even more to more complex engines with more complex algorithms).
You can try comparing the results of 2 engines (of similar strength) under the time and thread settings you used for Leagues B-F with the results of the same engines playing with 2' per move and 4 threads...

You don't get the point. The statistical fluctuation is way too high to measure small differences in strength. I won't play hundreds of games to prove you wrong.
Once again: these are two matches with the same engines and the same conditions:
as0770 wrote:Pachi vs. Hiratuka 8:8
Pachi vs. Hiratuka 2:14

This discussion doesn't make any sense. No more replies by me.
pnprog
Lives with ko
Posts: 286
Joined: Thu Oct 20, 2016 7:21 am
Rank: OGS 7 kyu
GD Posts: 0
Has thanked: 94 times
Been thanked: 153 times

Re: Engine Tournament

Post by pnprog »

as0770 wrote:Now in League A: Leela Zero 5773f44c (2018.01.26). It lost 5 games because of a ladder. Leela has also been updated to v0.11.0.

Thanks for running the tournament and sharing the result. It's nice also to have the list of internet links :salute:
as0770
Lives with ko
Posts: 180
Joined: Sun Jun 26, 2016 8:07 am
Rank: Beginner
GD Posts: 0
Has thanked: 15 times
Been thanked: 23 times

Re: Engine Tournament

Post by as0770 »

lightvector wrote:Although, it's best not to take this heuristic too seriously, because a nontrivial change is possible. I haven't read it that closely, but my skim of the following thread https://github.com/gcp/leela-zero/issues/667 suggested that Leela Zero has sometimes got noticeably different results between very small numbers of playouts, like 5, and a larger number of playouts, like 1600, where the relative strength difference and even sometimes the ordering of strength would change between the neural nets.

It's actually not surprising at all to me that Leela Zero in some cases could have quite a large difference in strength between tiny numbers of playouts and large numbers of playouts, enough to change the ordering between nets. For example, new candidate nets often appear to vary in strength on the order of multiple hundreds of Elo, so training is very noisy, and there's no reason to expect that the quality of the policy part of the neural net and the value part of the neural net always vary together in the same way. And thinking in those terms, it's pretty obvious that you're measuring something fairly different at 5 playouts vs. at 1600 playouts. With very few playouts you rely on the policy net more heavily.

I agree that if you're only running 20 or 30 games, then of course none of this matters, the noise in 20 to 30 games still dwarfs this. :)


Of course with 5 playouts there will be different results, but we are talking about 1h/game vs. 2h/game, which is 7000 vs. 14000 playouts on my system.

It is also funny to follow the history when I replace or remove some engines. Look at Ray:

    1. Ray 9.0.1                    29/32
    2. Pachi DCNN 11.99             28/32
    3. Leela Zero 0.9 (2018.01.01)  19/32
    4. MoGo 4.86                    18/32
    5. deltaGo 1.0.0                17/32
    6. Fuego 1.1                    15/32
    7. Michi C-2 1.4.2               8/32
    8. Orego 7.08                    8/32
    9. GNU Go 3.8                    2/32

    1. Leela Zero 0.11 c83e1b6e        15/20
    2. Pachi DCNN 11.99                13/20
    3. DarkGo 1.0                      12/20
    4. Dream Go 0.5.0                  11/20
    5. Ray 9.0.1                        7/20
    6. MoGo 4.86                        2/20

And at DreamGo:

    1. DreamGo 0.5.0                   15/20
    2. DarkForest v2 MCTS 1.0          12/20
    3. Pachi DCNN 11.99                12/20
    4. DarkGo 1.0                      10/20
    5. Ray 9.0.1                        9/20
    6. MoGo 4.86                        2/20

I do not replay the whole tournament; I just remove the old engines and add the new ones.
as0770
Lives with ko
Posts: 180
Joined: Sun Jun 26, 2016 8:07 am
Rank: Beginner
GD Posts: 0
Has thanked: 15 times
Been thanked: 23 times

Re: Engine Tournament

Post by as0770 »

Updates: DarkForest is relegated to League B and Dream Go made it into League A. Also, the Leela vs. AQ match was replayed with the latest versions. Surprisingly (for me), Leela 0.11 was able to strike back after Leela 0.11 Beta lost 4-12 against AQ 2.0.1.

Unfortunately AQ doesn't work with Rayon and Oakfoam. One of the engines will crash when running on one GPU. So for now AQ can't play in League A.


Leela vs. AQ

    1. Leela 0.11.0                     9/16
    2. AQ 2.1.1                         7/16

League A:

    1. Leela 0.11.0                    18/20
    2. Rayon 4.6.0                     15/20
    3. Oakfoam 0.2.1 NG-06             12/20
    4. Hiratuka 10.37B (CPU)            7/20
    5. Leela Zero 0.11 5773f44c         6/20
    6. DreamGo 0.5.0                    2/20

League B:

    1. DreamGo 0.5.0                   15/20
    2. DarkForest MCTS 1.0             12/20
    3. Pachi 11.99                     12/20
    4. DarkGo 1.0                      10/20
    5. Ray 9.0.1                        9/20
    6. MoGo 4.86                        2/20

League C:

    1. MoGo 4.86                       18/20
    2. deltaGo 1.0.0                   14/20
    3. Fuego 1.1                       13/20
    4. Michi C-2 1.4.2                  8/20
    5. Orego 7.08                       5/20
    6. GNU Go 3.8                       2/20

League D:

    1. GNU Go 3.8                      25/28
    2. Hara 0.9                        18/28
    3. Matilda 1.25                    16/28
    4. Indigo 2009                     16/28
    5. Dariush 3.1.5.7                 15/28
    6. Aya 6.34                        13/28
    7. Fudo Go 3.0                      7/28
    8. JrefBot 081016-2022              2/28

League E:

    1. JrefBot 081016-2022             16/20             
    2. Iomrascálaí 0.3.2               12/20
    3. SimpleGo 0.4.3                  11/20
    4. Crazy Patterns 0008-13           7/20
    5. Marcos Go 1.0                    7/20
    6. AmiGo 1.8                        7/20

League F:

    1. AmiGo 1.8                       19/20
    2. Beancounter 0.1                 15/20
    3. Stop 0.9-005                    10/20
    4. GoTraxx 1.4.2                    7/20
    5. CopyBot 0.1                      6/20
    6. Brown 1.0                        3/20

Configuration:
League A: 1h/game, pondering off, 4 threads, 2GB on 4 x Intel® Core™ i5-4210H CPU @ 2.90GHz, 8 GiB Ram and GeForce 840M/PCIe/SSE2
TWOGTP="gogui-twogtp -black \"$BLACK\" -white \"$WHITE\" -games 2 -size 19 -time xx -sgffile xxxx"
gogui -size 19 -program "$TWOGTP" -computer-both -auto

League B-F: 1h/game, pondering off, 1 thread, 1GB on 4 x Intel® Core™ i5-4210H CPU @ 2.90GHz, 8 GiB Ram and GeForce 840M/PCIe/SSE2
TWOGTP="gogui-twogtp -black \"$BLACK\" -white \"$WHITE\" -games 2 -size 19 -time xx -sgffile xxxx"
gogui -size 19 -program "$TWOGTP" -computer-both -auto

Amigo: amigogtp
AQ: AQ

aq_config.txt:
-main time[sec] =3600
-time controll =true
-japanese rule =true

Aya: Aya.exe --mode gtp --level max
Beancounter: beancounter
Brown: brown.exe
Copybot: python /path/to/__main__.py
CrazyPatterns: CrazyPatterns.exe
Dariush: DarGTP.exe --level 10
DarkForest: taskset -c 0 bash cnn_evaluator.sh 1 /data/local/go and taskset -c 0 th cnnPlayerMCTSV2.lua --num_gpu 1 --num_tree_thread 1 --rollout 750 --win_rate_thres 0.1
DarkGo: darknet go engine cfg/go.test.cfg go.weights
deltaGo: deltaGo.exe
Dream Go: export NUM_ITER=1375 and dream_go
Fudo Go: taskset -c 0 fudo --boardsize=19 --komi=6.5
Fuego: fuego.exe --config fuego.cfg

fuego.cfg:
uct_param_search number_threads 1
uct_param_search lock_free 0
uct_max_memory 1024000000
uct_param_player reuse_subtree 1
uct_param_player ponder 0
uct_param_player early_pass 1

GnuGo: gnugo --mode gtp --level 10 --resign-allowed
GoTraxx: GoTraxx.exe
Hara: hara
Hiratuka: Hiratuka-19×19.exe -po 75000
IndiGo: Indigo.exe -gtp
Iomrascálaí: taskset -c 0,1 iomrascalai
JrefBot: java -jar jrefgo.jar 10000
Leela: leela_gtp_opencl --gtp --threads 4 --noponder
Leela Zero: leelaz --gtp --threads 4 --w /path/to/Leelaz_best-network_yyyy_mm_dd --noponder
Matilda: matilda

matilda.h:
#define BOARD_SIZ 19
#define DEFAULT_UCT_MEMORY 1000
#define DEFAULT_NUM_THREADS 1

Marcos Go: marcos_go --patterns /path/to/patterns.txt --cycles_mcts 10000 --threads_mcts 1
Michi C-2: michi gtp

ui.c:
init_large_patterns("patterns2.prob", "patterns2.spat"); // Michi's pattern files were renamed because they have the same names as Pachi's files.

MoGo: mogo
Oakfoam: oakfoam -c nicego-cnn-06.gtp

nicego-cnn-06.gtp:
param playouts_per_move_max 40000
param thread_count 4

Orego: java -jar /path/to/orego-7.08.jar threads=1 grace
Pachi: pachidcnn -f pachibook.dat threads=1,max_tree_size=1024,pondering=0
Ray: ray --time 3600 --thread 1 --no-debug
Rayon: rayon --thread 4 --no-debug
Simple Go: python /path/to/play_gtp.py --node_limit=100
Stop: /usr/bin/java -ea -jar /path/to/stop-09-005.jar --mode gtp

Links:
Amigo: https://sourceforge.net/projects/amigogtp/
AQ: https://github.com/ymgaq/AQ
Aya: http://www.yss-aya.com/
Brown: http://ricoh51.free.fr/go/engineeng.htm
Beancounter: Private
Copybot: https://github.com/sirtango/ICopyMoves
CrazyPatterns: https://www.remi-coulom.fr/Amsterdam2007/
Dariush: http://ricoh51.free.fr/go/engineeng.htm
DarkForest: https://github.com/facebookresearch/darkforestGo
DarkGo: https://pjreddie.com/darknet/darkgo-go-in-darknet/
deltaGo: http://home.q00.itscom.net/otsuki/delta.html
Dream Go: https://github.com/Chicoryn/dream-go
Fudo Go: http://www.geocities.jp/hideki_katoh/
Fuego: http://fuego.sourceforge.net/
GnuGo: https://www.gnu.org/software/gnugo/devel.html
GoTraxx: http://gotraxx.codeplex.com/
Hara: https://github.com/antoniogarro/Hara
Hiratuka: Non GPU version (10.37B): http://www.vector.co.jp/download/file/winnt/game/fh673259.html / GPU version (10.38B): http://www.vector.co.jp/download/file/winnt/game/fh688349.html
IndiGo: http://www.math-info.univ-paris5.fr/~bouzy/INDIGO.html
Iomrascálaí: https://github.com/ujh/iomrascalai
JrefBot: http://ricoh51.free.fr/go/engineeng.htm
Leela: https://sjeng.org/leela.html
Leela Zero: https://github.com/gcp/leela-zero
Marcos Go: https://github.com/MarcosPividori/Go-player
Matilda: https://github.com/gonmf/matilda
Michi C-2: https://github.com/db3108/michi-c2
MoGo: https://lifein19x19.com/forum/viewtopic.php?p=211091#p211091
Oakfoam: https://bitbucket.org/dsmic/oakfoam
Orego: https://sites.google.com/a/lclark.edu/drake/research/orego
Pachi: http://pachi.or.cz/
Rayon: https://github.com/zakki/Ray
Ray: https://github.com/kobanium/Ray
Simple Go: https://sourceforge.net/projects/londerings/
Stop: https://www.vanheusden.com/stop/

Best,
Alex
q30
Lives with ko
Posts: 145
Joined: Sat Aug 13, 2016 8:23 am
Rank: 30 kyu
GD Posts: 0
Has thanked: 1 time
Been thanked: 1 time

Re: Engine Tournament

Post by q30 »

as0770 wrote:
q30 wrote:You are quite right if the same engine is playing itself. But even with 2 simple MC engines (which, when playing each other, will show the chances you mention as the time per move --> 0), there may be a difference in strength (i.e. in chances) that depends on the time control, because their best-move selection algorithms differ (and this applies even more to more complex engines with more complex algorithms).
You can try comparing the results of 2 engines (of similar strength) under the time and thread settings you used for Leagues B-F with the results of the same engines playing with 2' per move and 4 threads...

You don't get the point. The statistical fluctuation is way too high to measure small differences in strength. I won't play hundreds of games to prove you wrong.
Once again: these are two matches with the same engines and the same conditions:
as0770 wrote:Pachi vs. Hiratuka 8:8
Pachi vs. Hiratuka 2:14

This discussion doesn't make any sense. No more replies by me.

This result only proves that the time control was too short for these engines (or for one of them), so the games were very random...
as0770
Lives with ko
Posts: 180
Joined: Sun Jun 26, 2016 8:07 am
Rank: Beginner
GD Posts: 0
Has thanked: 15 times
Been thanked: 23 times

Re: Engine Tournament

Post by as0770 »

Yet another Leela Zero update in League A, with a network from last Sunday. Although its learning progress seems to be slowing, it has made a big step in the last two weeks; it was even able to win one of four games against Leela 0.11.0:

Leela vs. AQ

    1. Leela 0.11.0                     9/16
    2. AQ 2.1.1                         7/16

League A:

    1. Leela 0.11.0                    17/20
    2. Leela Zero 0.11 cde9c8d4        13/20
    3. Rayon 4.6.0                     13/20
    4. Oakfoam 0.2.1 NG-06             12/20
    5. Hiratuka 10.37B (CPU)            4/20
    6. DreamGo 0.5.0                    1/20

League B:

    1. DreamGo 0.5.0                   15/20
    2. DarkForest MCTS 1.0             12/20
    3. Pachi 11.99                     12/20
    4. DarkGo 1.0                      10/20
    5. Ray 9.0.1                        9/20
    6. MoGo 4.86                        2/20

League C:

    1. MoGo 4.86                       18/20
    2. deltaGo 1.0.0                   14/20
    3. Fuego 1.1                       13/20
    4. Michi C-2 1.4.2                  8/20
    5. Orego 7.08                       5/20
    6. GNU Go 3.8                       2/20

League D:

    1. GNU Go 3.8                      25/28
    2. Hara 0.9                        18/28
    3. Matilda 1.25                    16/28
    4. Indigo 2009                     16/28
    5. Dariush 3.1.5.7                 15/28
    6. Aya 6.34                        13/28
    7. Fudo Go 3.0                      7/28
    8. JrefBot 081016-2022              2/28

League E:

    1. JrefBot 081016-2022             16/20             
    2. Iomrascálaí 0.3.2               12/20
    3. SimpleGo 0.4.3                  11/20
    4. Crazy Patterns 0008-13           7/20
    5. Marcos Go 1.0                    7/20
    6. AmiGo 1.8                        7/20

League F:

    1. AmiGo 1.8                       19/20
    2. Beancounter 0.1                 15/20
    3. Stop 0.9-005                    10/20
    4. GoTraxx 1.4.2                    7/20
    5. CopyBot 0.1                      6/20
    6. Brown 1.0                        3/20

Configuration:
League A: 1h/game, pondering off, 4 threads, 2GB on 4 x Intel® Core™ i5-4210H CPU @ 2.90GHz, 8 GiB Ram and GeForce 840M/PCIe/SSE2
TWOGTP="gogui-twogtp -black \"$BLACK\" -white \"$WHITE\" -games 2 -size 19 -time xx -sgffile xxxx"
gogui -size 19 -program "$TWOGTP" -computer-both -auto

League B-F: 1h/game, pondering off, 1 thread, 1GB on 4 x Intel® Core™ i5-4210H CPU @ 2.90GHz, 8 GiB Ram and GeForce 840M/PCIe/SSE2
TWOGTP="gogui-twogtp -black \"$BLACK\" -white \"$WHITE\" -games 2 -size 19 -time xx -sgffile xxxx"
gogui -size 19 -program "$TWOGTP" -computer-both -auto

Amigo: amigogtp
AQ: AQ

aq_config.txt:
-main time[sec] =3600
-time controll =true
-japanese rule =true

Aya: Aya.exe --mode gtp --level max
Beancounter: beancounter
Brown: brown.exe
Copybot: python /path/to/__main__.py
CrazyPatterns: CrazyPatterns.exe
Dariush: DarGTP.exe --level 10
DarkForest: taskset -c 0 bash cnn_evaluator.sh 1 /data/local/go and taskset -c 0 th cnnPlayerMCTSV2.lua --num_gpu 1 --num_tree_thread 1 --rollout 750 --win_rate_thres 0.1
DarkGo: darknet go engine cfg/go.test.cfg go.weights
deltaGo: deltaGo.exe
Dream Go: export NUM_ITER=1375 and dream_go
Fudo Go: taskset -c 0 fudo --boardsize=19 --komi=6.5
Fuego: fuego.exe --config fuego.cfg

Code: Select all

fuego.cfg:
uct_param_search number_threads 1
uct_param_search lock_free 0
uct_max_memory 1024000000
uct_param_player reuse_subtree 1
uct_param_player ponder 0
uct_param_player early_pass 1

GnuGo: gnugo --mode gtp --level 10 --resign-allowed
GoTraxx: GoTraxx.exe
Hara: hara
Hiratuka: Hiratuka-19x19.exe -po 75000
IndiGo: Indigo.exe -gtp
Iomrascálaí: taskset -c 0,1 iomrascalai
JrefBot: java -jar jrefgo.jar 10000
Leela: leela_gtp_opencl --gtp --threads 4 --noponder
Leela Zero: leelaz --gtp --threads 4 --w /path/to/Leelaz_best-network_yyyy_mm_dd --noponder
Matilda: matilda

Code: Select all

matilda.h:
#define BOARD_SIZ 19
#define DEFAULT_UCT_MEMORY 1000
#define DEFAULT_NUM_THREADS 1

Marcos Go: marcos_go --patterns /path/to/patterns.txt --cycles_mcts 10000 --threads_mcts 1
Michi C-2: michi gtp

Code: Select all

ui.c:
init_large_patterns("patterns2.prob", "patterns2.spat"); // Michis pattern files renamed because they have the same name as Pachis files.

MoGo: mogo
Oakfoam: oakfoam -c nicego-cnn-06.gtp

Code: Select all

nicego-cnn-06.gtp:
param playouts_per_move_max 40000
param thread_count 4

Orego: java -jar /path/to/orego-7.08.jar threads=1 grace
Pachi: pachidcnn -f pachibook.dat threads=1,max_tree_size=1024,pondering=0
Ray: ray --time 3600 --thread 1 --no-debug
Rayon: rayon --thread 4 --no-debug
Simple Go: python /path/to/play_gtp.py --node_limit=100
Stop: /usr/bin/java -ea -jar /path/to/stop-09-005.jar --mode gtp

Links:
Amigo: https://sourceforge.net/projects/amigogtp/
AQ: https://github.com/ymgaq/AQ
Aya: http://www.yss-aya.com/
Brown: http://ricoh51.free.fr/go/engineeng.htm
Beancounter: Private
Copybot: https://github.com/sirtango/ICopyMoves
CrazyPatterns: https://www.remi-coulom.fr/Amsterdam2007/
Dariush: http://ricoh51.free.fr/go/engineeng.htm
DarkForest: https://github.com/facebookresearch/darkforestGo
DarkGo: https://pjreddie.com/darknet/darkgo-go-in-darknet/
deltaGo: http://home.q00.itscom.net/otsuki/delta.html
Dream Go: https://github.com/Chicoryn/dream-go
Fudo Go: http://www.geocities.jp/hideki_katoh/
Fuego: http://fuego.sourceforge.net/
GnuGo: https://www.gnu.org/software/gnugo/devel.html
GoTraxx: http://gotraxx.codeplex.com/
Hara: https://github.com/antoniogarro/Hara
Hiratuka: Non GPU version (10.37B): http://www.vector.co.jp/download/file/winnt/game/fh673259.html / GPU version (10.38B): http://www.vector.co.jp/download/file/winnt/game/fh688349.html
IndiGo: http://www.math-info.univ-paris5.fr/~bouzy/INDIGO.html
Iomrascálaí: https://github.com/ujh/iomrascalai
JrefBot: http://ricoh51.free.fr/go/engineeng.htm
Leela: https://sjeng.org/leela.html
Leela Zero: http://zero.sjeng.org/
Marcos Go: https://github.com/MarcosPividori/Go-player
Matilda: https://github.com/gonmf/matilda
Michi C-2: https://github.com/db3108/michi-c2
MoGo: https://lifein19x19.com/forum/viewtopic.php?p=211091#p211091
Oakfoam: https://bitbucket.org/dsmic/oakfoam
Orego: https://sites.google.com/a/lclark.edu/drake/research/orego
Pachi: http://pachi.or.cz/
Rayon: https://github.com/zakki/Ray
Ray: https://github.com/kobanium/Ray
Simple Go: https://sourceforge.net/projects/londerings/
Stop: https://www.vanheusden.com/stop/

Best,
Alex
Vargo
Lives in gote
Posts: 337
Joined: Sat Aug 17, 2013 5:28 am
GD Posts: 0
Has thanked: 22 times
Been thanked: 97 times

Re: Engine Tournament

Post by Vargo »

Hello,
First, let me say that I'm a huge fan of your engine tournament.

Concerning AQ 2.1.1 v. Leela, the result depends a lot on the hardware, and particularly on the GPU.

A GeForce 1080Ti is roughly 2 to 2.5 times more powerful than an 840M, which means AQ is probably one stone stronger on a 1080Ti than on an 840M.
With 2 GPUs, AQ is certainly 2 stones stronger than on an 840M.
If I'm not mistaken, Leela Zero can handle multiple GPUs, but Leela011 OpenCL can't. So, on a gaming PC with 2 GPUs, AQ should be 2 stones stronger than Leela OpenCL, whereas on a standard PC, your tournament has shown that they're about even.
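
The estimate above reads like the usual rule of thumb that each doubling of computing power is worth about one stone. That rule, and the idea that a second GPU doubles the throughput again, are assumptions for this sketch and not something measured in this thread. The function name `stones_from_speedup` is made up for the example.

```python
import math


def stones_from_speedup(speedup, stones_per_doubling=1.0):
    """Rule-of-thumb strength gain, assuming a fixed gain per doubling of compute."""
    return stones_per_doubling * math.log2(speedup)


# Speed-ups relative to the 840M, using the 2x to 2.5x range quoted above.
# The two-GPU rows assume perfect scaling across GPUs (an assumption).
cases = [
    ("1 x 1080Ti, low estimate", 2.0),
    ("1 x 1080Ti, high estimate", 2.5),
    ("2 x 1080Ti, low estimate", 2 * 2.0),
    ("2 x 1080Ti, high estimate", 2 * 2.5),
]

for label, speedup in cases:
    print(f"{label}: about {stones_from_speedup(speedup):.1f} stones")
```

Under these assumptions one 1080Ti comes out at about 1.0 to 1.3 stones and two at about 2.0 to 2.3 stones, which is in line with the figures given above.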

To test this, I've run matches (16 games each) between AQ 2.1.1 and Leela011 OpenCL: time_settings 900 0 0 (same as on the CGOS server), pondering off for both (Sabaki 033.3 used for all the games).

Even games_____GPU: 1x1080Ti____CPU: i7 6700K____RAM: 32 GB
H2 games_______GPU: 2x1080Ti____CPU: i9 7920X____RAM: 64 GB

Results

Even games : AQ 2.1.1 v. Leela011 OpenCL --------------> AQ: 13/16 , L011: 3/16
H2 games   : AQ 2.1.1 (W) v. Leela011 OpenCL (B) ------> AQ: 9/16 , L011: 7/16
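
Sixteen games is a small sample, so it helps to see how much each score says on its own. The sketch below computes the chance of getting at least that many wins if both sides were really equal (a one-sided binomial tail), plus the Elo gap implied by the even-game score under the standard logistic Elo formula. Treating the games as independent is an assumption, and the function names are made up for the example.

```python
from math import comb, log10


def tail_probability(wins, games, p=0.5):
    """Chance of at least `wins` wins in `games` games if each win has probability p."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


def elo_difference(wins, games):
    """Elo gap implied by a score, using the logistic Elo model."""
    score = wins / games
    return 400 * log10(score / (1 - score))


# Even games: AQ 13/16
print(f"13/16 by luck alone: {tail_probability(13, 16):.3f}")
print(f"Implied Elo gap:     {elo_difference(13, 16):.0f}")

# H2 games: AQ 9/16 while giving two stones
print(f" 9/16 by luck alone: {tail_probability(9, 16):.3f}")
```

With these numbers, 13/16 would happen by chance only about 1% of the time between equal engines, while 9/16 would happen about 40% of the time. So the even-game result is fairly solid, and the H2 result is best read as "roughly even at two stones" rather than as a clear win for either side.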

Unfortunately AQ doesn't work with Rayon and Oakfoam...
I've run games between AQ and Rayon or others, it works well with Sabaki. The problem is that Sabaki doesn't handle consecutive matches automatically. You have to run one game after another. I don't think you can tell Sabaki to automatically run 16 consecutive games between X and Y, save the games, and report the score of the 16-game match at the end. If someone knows how to do it, tell me, I'd be interested.
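
One possible workaround outside Sabaki is gogui-twogtp, the tool used for the tournament configuration earlier in this thread, which can play a fixed number of games in a row. The sketch below only builds the command line and does not start anything. The helper name `twogtp_command`, the file prefix and the `15m` time value are assumptions for illustration; the flags are the gogui-twogtp options as I understand them from the GoGui documentation, so check them against your GoGui version.

```python
import shlex


def twogtp_command(black, white, games, sgf_prefix,
                   size=19, time_spec="15m", alternate=True):
    """Build a gogui-twogtp command line for an unattended multi-game match.

    black, white: full GTP command lines of the two engines.
    sgf_prefix:   games are saved as <prefix>-<n>.sgf and the results are
                  collected in <prefix>.dat (as far as I know).
    time_spec:    GoGui time setting; "15m" is meant to mirror the
                  900-second games above (assumed format).
    """
    cmd = [
        "gogui-twogtp",
        "-black", black,
        "-white", white,
        "-games", str(games),
        "-size", str(size),
        "-time", time_spec,
        "-sgffile", sgf_prefix,
        "-auto",  # play the games without waiting for GTP input
    ]
    if alternate:
        cmd.append("-alternate")  # swap colors between games
    return cmd


# Engine command lines taken from the configuration list earlier in the thread.
cmd = twogtp_command(
    black="AQ",
    white="leela_gtp_opencl --gtp --threads 4 --noponder",
    games=16,
    sgf_prefix="aq-vs-leela",  # hypothetical file prefix
)
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd)` would start the match. For a handicap match, `alternate=False` keeps the colors fixed.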

Thanks for your engine tournament, keep up the good work, it's very interesting :clap:

The games :

Even games, AQ wins :
http://eidogo.com/#43Z2SOX69
http://eidogo.com/#xiY7CYBH
http://eidogo.com/#ffgQmCy6
http://eidogo.com/#wvNnI0Cd
http://eidogo.com/#4o8Bt2DDi
http://eidogo.com/#3qqrHBgd1
http://eidogo.com/#3wls4hDC
http://eidogo.com/#FmaQUvCk
http://eidogo.com/#3U0RMegVA
http://eidogo.com/#PYCapAdC
http://eidogo.com/#1yvV62zbX
http://eidogo.com/#2CBYiDi0a
http://eidogo.com/#3iPHUyV1M

Even games, L011 wins :
http://eidogo.com/#43ay1QnF7
http://eidogo.com/#4o0OQTvU1
http://eidogo.com/#u6ovKvXV


H2 games, AQ wins :
http://eidogo.com/#CChOLmfN
http://eidogo.com/#12Zf9BM93
http://eidogo.com/#kKbAYDgl
http://eidogo.com/#27oUZlYWC
http://eidogo.com/#20Wfekxy4
http://eidogo.com/#y68JWKeu
http://eidogo.com/#3Pt5MiuZ1
http://eidogo.com/#1zxCWuspk
http://eidogo.com/#gPw8oYNv


H2 games, L011 wins :
http://eidogo.com/#2PwvZRi3Y
http://eidogo.com/#2PwvZRi3Y
http://eidogo.com/#3rfaaJ92X
http://eidogo.com/#ybotc5bG
http://eidogo.com/#AW7GziNr
http://eidogo.com/#3qM7UfC8R
as0770
Lives with ko
Posts: 180
Joined: Sun Jun 26, 2016 8:07 am
Rank: Beginner
GD Posts: 0
Has thanked: 15 times
Been thanked: 23 times

Re: Engine Tournament

Post by as0770 »

Vargo wrote:
Unfortunately AQ doesn't work with Rayon and Oakfoam...
I've run games between AQ and Rayon or others, it works well with Sabaki. The problem is that Sabaki doesn't handle consecutive matches automatically. You have to run one game after another


I'm going to try Sabaki, but I think it is a GPU memory conflict. Even running both engines in the console makes one of them crash. I think I need to update my computer...

Thanks for your results. I think in Go you can't define one best engine, because, like you said, the strength depends a lot on the hardware.