Engine Tournament
-
as0770
- Lives with ko
- Posts: 180
- Joined: Sun Jun 26, 2016 8:07 am
- Rank: Beginner
- GD Posts: 0
- Has thanked: 15 times
- Been thanked: 23 times
Re: Engine Tournament
LetterRip wrote: Any idea how much time per move and how many playouts were being used by LZ?
1h/game on one thread, starting with 50 sec/move, which is around 7000 playouts.
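As a rough check of the figures in this post, the playout rate follows directly from the two numbers given. This is only a sketch of the arithmetic; the function name is made up for illustration and is not part of any engine.

```python
def playouts_per_second(playouts: int, seconds: float) -> float:
    """Average playout rate over one move (hypothetical helper)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return playouts / seconds


# Figures quoted in the post: about 7000 playouts in 50 s on one thread.
rate = playouts_per_second(7000, 50)
print(f"{rate:.0f} playouts/s")  # prints "140 playouts/s"
```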
-
as0770
q30 wrote: The results (with pondering - without pondering):
MoGo 3 - 1;
Pachi 3 - 1;
Ray 3 - 1;
Leela 3 - 1;
in all 12 - 4 (details).
I don't know about the quantitative results (in Elo), but there is definitely a qualitative effect, and in matches between Go engines of equivalent strength, the one with pondering may overtake the one without pondering in rating.
One more thing about this test: if you play one engine against itself, the ponder hits are close to 100%. That means the pondering side will benefit a lot more than it would against other engines.
I also did some testing:
In my League A there are two engines that do not ponder: Hiratuka and DarkGo. DarkGo plays instantly, so it doesn't matter whether its opponent is pondering, because the opponent doesn't get any time to ponder. So the only engine that may be affected is Hiratuka, so I tested Hiratuka against Pachi in 30 min games with ponder on and ponder off.
First Round:
Ponder Off: Pachi vs. Hiratuka 5:11
Ponder On: Pachi vs. Hiratuka 8:8
This could indeed be a hint that Pachi becomes significantly stronger with pondering. Because I don't believe this, I ran the same match again under the same conditions. Now I got:
Ponder Off: Pachi vs. Hiratuka 6:10
Ponder On: Pachi vs. Hiratuka 2:14
That is exactly what I expected: the statistical fluctuation when playing matches between engines of similar strength is very high, just like rolling dice. The difference between pondering and not pondering is simply not measurable with such a small number of games.
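To put a number on how uncertain a 16-game result is, here is a minimal sketch that computes a 95% Wilson score interval for Pachi's win rate in each of the four matches above. It assumes every game is an independent win/loss trial with the same win probability, which is a simplification; the function and variable names are illustrative only.

```python
import math


def wilson_interval(wins: int, games: int, z: float = 1.96) -> tuple:
    """Wilson score interval (low, high) for a win rate; z=1.96 is roughly 95%."""
    if games <= 0 or not 0 <= wins <= games:
        raise ValueError("need games > 0 and 0 <= wins <= games")
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half


# Pachi's wins out of 16 games against Hiratuka, taken from the post above.
PACHI_WINS = {
    "Round 1, ponder off": 5,
    "Round 1, ponder on": 8,
    "Round 2, ponder off": 6,
    "Round 2, ponder on": 2,
}

for label, wins in PACHI_WINS.items():
    low, high = wilson_interval(wins, 16)
    print(f"{label}: {wins}/16, 95% interval {low:.2f} to {high:.2f}")
```

Under these assumptions each interval is more than 0.3 wide, so a 16-game match pins the win rate down only very loosely.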
-
q30
- Lives with ko
- Posts: 145
- Joined: Sat Aug 13, 2016 8:23 am
- Rank: 30 kyu
- GD Posts: 0
- Has thanked: 1 time
- Been thanked: 1 time
q30 wrote: ... and in sparrings of equivalent strength Go engines the same with pondering may pass in rating engine without pondering.
as0770 wrote: So we agree that the question is only relevant in matches between engines where one is able to ponder and the other engine is not? ...
So for both the question of the absolute timecontrol and CPU power is much more relevant than the question of running it in ponder on or ponder off matches...
So where is your point always claiming others as "synthetic" results? ...
Also if both can ponder, because one may ponder more effectively than the other...
Yes.
Only for results obtained from tests with unrealistic parameters, for example the time control. For engines with no big difference in strength, the results may differ from those of tests with realistic control parameters.
-
q30
as0770 wrote: ...
That is exactly what I expected: The statistical fluctuation when playing matches between engines with similar strength is very high, just like rolling a dice. The difference between pondering and not pondering is simply not measurable with such a small amount of games.
Try to use a time control equivalent to 2 min per move. In this case the fluctuations will be much smaller and the difference between pondering and not pondering will be measurable...
-
as0770
q30 wrote: Try to use time control equivalent to 2 min per move. In this case fluctuations will be much smaller and difference between pondering and not pondering will be measurable...
You have no idea what you are talking about. Standard deviation doesn't change with the time control.
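For reference, if each game is modelled as an independent win/loss trial with win probability p, the standard error of the measured win rate over n games is sqrt(p(1-p)/n). It contains only p and n, so in this model the time control can matter only through p. The sketch below uses the standard logistic Elo formula to estimate how many games are needed before a given Elo difference stands out from a 50% score. The function names and the 1.96 threshold (roughly 95%) are choices made for this illustration.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** (-elo_diff / 400.0))


def standard_error(p: float, games: int) -> float:
    """Standard error of the observed win rate after `games` independent games."""
    return math.sqrt(p * (1 - p) / games)


def games_needed(elo_diff: float, z: float = 1.96) -> int:
    """Smallest n for which z standard errors fit between p and 0.5."""
    p = expected_score(elo_diff)
    margin = abs(p - 0.5)
    if margin == 0:
        raise ValueError("no difference to detect")
    return math.ceil(z * z * p * (1 - p) / (margin * margin))


for diff in (25, 50, 100, 200):
    print(f"{diff:>3} Elo: expected score {expected_score(diff):.3f}, "
          f"about {games_needed(diff)} games")
```

Small differences need far more than 16 games under this model, whatever the time per move.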
Last edited by as0770 on Sun Jan 21, 2018 12:29 am, edited 1 time in total.
-
as0770
New entry in League B is Dream Go; update in League C: Matilda 1.25.
Leela vs. AQ
1. AQ 2.0.3 12/16
2. Leela 0.11.0 Beta 11 4/16
League A:
1. Leela 0.10.0 22/24
2. Rayon 4.6.0 19/24
3. Oakfoam 0.2.1 NG-06 18/24
4. Hiratuka 10.37B (CPU) 9/24
5. DarkForest v2 MCTS 1.0 7/24
6. DarkGo 1.0 5/24
7. Pachi DCNN 11.99 4/24
League B:
1. Ray 9.0.1 31/36
2. Pachi DCNN 11.99 30/36
3. Dream Go 0.5.0 29/36
4. Leela Zero 0.9 (2018.01.01) 21/36
5. MoGo 4.86 18/36
6. deltaGo 1.0.0 18/36
7. Fuego 1.1 15/36
8. Michi C-2 1.4.2 8/36
9. Orego 7.08 8/36
10. GNU Go 3.8 2/36
League C:
1. GNU Go 3.8 25/28
2. Hara 0.9 18/28
3. Matilda 1.25 16/28
4. Indigo 2009 16/28
5. Dariush 3.1.5.7 15/28
6. Aya 6.34 13/28
7. Fudo Go 3.0 7/28
8. JrefBot 081016-2022 2/28
League D:
1. JrefBot 081016-2022 16/18
2. Iomrascálaí 0.3.2 15/18
3. Crazy Patterns 0008-13 13/18
4. Marcos Go 1.0 13/18
5. AmiGo 1.8 13/18
6. Beancounter 0.1 8/18
7. Stop 0.9-005 5/18
8. GoTraxx 1.4.2 3/18
9. CopyBot 0.1 2/18
10. Brown 1.0 2/18
Best,
Alex
Last edited by as0770 on Mon Jan 22, 2018 10:41 am, edited 2 times in total.
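The standings in this post can be sanity-checked mechanically. The sketch below parses lines of the form "rank name points/games" and checks that the points in a league add up to the number of games played. It assumes every game awards exactly one point in total, which fits the integer scores but is not stated in the post; the regular expression and function names are made up for illustration.

```python
import re

# Rank (the dot is optional), engine name, then points/games at the end of the line.
LINE = re.compile(r"^\s*\d+\.?\s+(?P<name>.+?)\s+(?P<points>\d+)/(?P<games>\d+)\s*$")

LEAGUE_A = """\
1. Leela 0.10.0 22/24
2. Rayon 4.6.0 19/24
3. Oakfoam 0.2.1 NG-06 18/24
4. Hiratuka 10.37B (CPU) 9/24
5. DarkForest v2 MCTS 1.0 7/24
6. DarkGo 1.0 5/24
7. Pachi DCNN 11.99 4/24
"""


def parse_standings(text: str) -> list:
    """Return a list of (name, points, games) tuples, skipping non-matching lines."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            rows.append((m["name"], int(m["points"]), int(m["games"])))
    return rows


def is_consistent(rows: list) -> bool:
    """Each game is counted once per participant, so points must equal games / 2."""
    total_points = sum(points for _, points, _ in rows)
    total_games = sum(games for _, _, games in rows) // 2
    return total_points == total_games


rows = parse_standings(LEAGUE_A)
for name, points, games in rows:
    print(f"{name}: {100 * points / games:.0f}%")
print("consistent:", is_consistent(rows))
```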
-
lemonsqueez
- Dies in gote
- Posts: 22
- Joined: Sat Jan 20, 2018 2:26 pm
- GD Posts: 0
- Been thanked: 18 times
Thanks for running these tournaments, impressive lineup!
Right now the leagues are based on strength, more or less, if I understand correctly.
Just an idea: how about a GPU league and a CPU league for the top programs?
This is already the case mostly; what I mean is that for programs like Leela which can do both, it'd be interesting to see how the CPU version fares. Not sure how practical this would be. Maybe you don't want to have to unplug your graphics card to prevent it from using the GPU =)
-
as0770
lemonsqueez wrote: Thanks for running these tournaments, impressive lineup!
Right now the leagues are based on strength, more or less, if I understand correctly.
You do.
One or two engines play in both the upper and the lower league to provide a virtual connection between the leagues.
lemonsqueez wrote: Just an idea: how about a GPU league and a CPU league for the top programs?
This is already the case mostly; what I mean is that for programs like Leela which can do both, it'd be interesting to see how the CPU version fares. Not sure how practical this would be. Maybe you don't want to have to unplug your graphics card to prevent it from using the GPU =)
I want to have a comparison between GPU and CPU engines, so I don't want to make separate leagues.
In fact I tried to run the GPU engines in CPU mode as well; this works well with Leela. It would play in the same league as Leela GPU, but I don't want to have one engine playing twice in one league. This would distort the results. But you can find Leela CPU in the history.
Rayon CPU is basically Ray 9.0.1 afaik. AQ won't work as a CPU engine here, and for Oakfoam as a CPU engine I would have to adjust some parameters, and there were problems running it. And btw it is very, very weak. For other engines I would have to change the system configuration, but I don't want to mess up my system like this. So after all I would like to run the engines in both modes, but I would face too many problems...
-
as0770
This time I downsized the Leagues to get space for new entries.
In League B Leela Zero is updated to v0.11 and the last 5x64 network (2018.01.17). After that they changed to a 6x128 network.
Also there is a new Engine in League E: SimpleGo 0.4.3
Leela vs. AQ
1. AQ 2.0.3 12/16
2. Leela 0.11.0 Beta 11 4/16
League A:
1. Leela 0.10.0 22/24
2. Rayon 4.6.0 19/24
3. Oakfoam 0.2.1 NG-06 18/24
4. Hiratuka 10.37B (CPU) 9/24
5. DarkForest v2 MCTS 1.0 7/24
6. DarkGo 1.0 5/24
7. Pachi DCNN 11.99 4/24
League B:
1. Leela Zero 0.11 (2018.01.17) 15/20
2. Pachi DCNN 11.99 13/20
3. DarkGo 1.0 12/20
4. Dream Go 0.5.0 11/20
5. Ray 9.0.1 7/20
6. MoGo 4.86 2/20
League C:
1. MoGo 4.86 18/20
2. deltaGo 1.0.0 14/20
3. Fuego 1.1 13/20
4. Michi C-2 1.4.2 8/20
5. Orego 7.08 5/20
6. GNU Go 3.8 2/20
League D:
1. GNU Go 3.8 25/28
2. Hara 0.9 18/28
3. Matilda 1.25 16/28
4. Indigo 2009 16/28
5. Dariush 3.1.5.7 15/28
6. Aya 6.34 13/28
7. Fudo Go 3.0 7/28
8. JrefBot 081016-2022 2/28
League E:
1. JrefBot 081016-2022 16/20
2. Iomrascálaí 0.3.2 12/20
3. SimpleGo 0.4.3 11/20
4. Crazy Patterns 0008-13 7/20
5. Marcos Go 1.0 7/20
6. AmiGo 1.8 7/20
League F:
1. AmiGo 1.8 19/20
2. Beancounter 0.1 15/20
3. Stop 0.9-005 10/20
4. GoTraxx 1.4.2 7/20
5. CopyBot 0.1 6/20
6. Brown 1.0 3/20
Best,
Alex
-
q30
as0770 wrote: You have no idea what you are talking about. Standard deviation doesn't change with the timecontrol.
It depends on game randomness, which changes with the time control...
-
as0770
as0770 wrote: You have no idea what you are talking about. Standard deviation doesn't change with the timecontrol.
q30 wrote: It depends on game randomness, that changes with the time control...
Then we have to rewrite basic mathematical principles.
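One way to make this disagreement concrete: for a fixed per-game win probability, the score of an n-game match follows a binomial distribution, so the chance of any particular result can be computed exactly. The sketch below does that for the 16-game Pachi vs. Hiratuka matches reported earlier in the thread. Pooling the four matches into one win rate (21 wins for Pachi out of 64 games) is an assumption made only for this illustration, as are the function names.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k wins in n independent games with win probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_at_most(k: int, n: int, p: float) -> float:
    """Probability of k or fewer wins."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of k or more wins."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


# Assumption: pool Pachi's results from the four matches (5 + 8 + 6 + 2 wins in 64 games).
pooled = (5 + 8 + 6 + 2) / 64

print(f"pooled win rate for Pachi: {pooled:.3f}")
print(f"P(Pachi wins <= 2 of 16): {prob_at_most(2, 16, pooled):.3f}")
print(f"P(Pachi wins >= 8 of 16): {prob_at_least(8, 16, pooled):.3f}")
```

If the time control changes p, the distribution shifts with it, but for a given p the spread of a 16-game result is fixed by n.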
-
as0770
Cyan wrote: Some strong bots have been updated:
AQ v2.1.1
Leela 0.11.0
Ray 4.32
Pachi 12.00
Leela 0.11 is running and I'll wait for the official release of Pachi 12.
I'll take a look at the others, thank you.
Edit: Tried to compile Ray 4.32 without success. First I had to get some libs of cntk 2.1 although in the readme the cntk version is 2.3. Then I ran into the next error messages. The author is not much interested in making the installation easier, not even in keeping the readme up to date. That's no problem, but I'll leave Rn until it is easier to install.
-
pnprog
- Lives with ko
- Posts: 286
- Joined: Thu Oct 20, 2016 7:21 am
- Rank: OGS 7 kyu
- GD Posts: 0
- Has thanked: 94 times
- Been thanked: 153 times
as0770 wrote: Edit: Tried to compile Ray 4.32 without success. First I had to get some libs of cntk 2.1 although in the readme the cntk version is 2.3. Then I ran into the next error messages. The author is not much interested in making the installation easier, not even in keeping the readme up to date. That's no problem, but I'll leave Rn until it is easier to install.
Yes, and sadly this is hurting an otherwise strong bot.
It would be nice to have somebody who is well versed in Ubuntu/apt/PPA/compilation/deb implement a dedicated PPA for Ubuntu and all the usual Go software. This PPA would include updated deb files for commonly used Go programs (Leela, Sabaki...). Programs that are hard to compile or install, like Ray, would come in some sort of container or snap application (with all dependencies included).
Not quite a comeback of Hikarunix, but still a big improvement.
I am the author of GoReviewPartner, a small software aimed at assisting reviewing a game of Go. Give it a try!