Re: Engine Tournament
Posted: Fri Jan 05, 2018 3:05 pm
Any idea how much time per move and how many playouts were being used by LZ?
Life in 19x19. Go, Weiqi, Baduk... Thats the life.
https://lifein19x19.com/
LetterRip wrote:
Any idea how much time per move and how many playouts were being used by LZ?

1h/game on one thread. Starting with 50 sec/move, which is around 7000 playouts.
q30 wrote:
The results (with pondering - without pondering):
MoGo 3 - 1;
Pachi 3 - 1;
Ray 3 - 1;
Leela 3 - 1;
in all 12 - 4 (details).
I don't know about the quantitative result (in Elo), but there is definitely a qualitative effect, and in sparring between Go engines of equivalent strength, the one with pondering may pass the one without pondering in rating.

One more thing about this test: if you play one engine against itself, the ponder hits are close to 100%. That means the pondering side will benefit a lot more than it would against other engines.
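For a rough sense of what the 12 - 4 total could mean in rating terms, the score can be converted to an Elo difference with the usual logistic formula. The sketch below is only an illustration: it assumes independent games without draws, pools the four engines into a single sample, and uses a normal-approximation interval. The helper names `elo_diff` and `score_interval` are made up for this example.

```python
import math


def elo_diff(p):
    """Elo difference implied by an expected score p (0 < p < 1), logistic model."""
    return 400.0 * math.log10(p / (1.0 - p))


def score_interval(wins, games, z=1.96):
    """Win rate with a normal-approximation interval.

    Assumption: games are independent and there are no draws.
    The bounds are clamped so that elo_diff() stays defined.
    """
    p = wins / games
    se = math.sqrt(p * (1.0 - p) / games)
    eps = 1e-9
    lo = min(max(p - z * se, eps), 1.0 - eps)
    hi = min(max(p + z * se, eps), 1.0 - eps)
    return p, lo, hi


# Pooled result quoted above: 12 wins out of 16 games for the pondering side.
p, lo, hi = score_interval(12, 16)
print(f"score {p:.2f}, Elo {elo_diff(p):.0f} "
      f"(interval {elo_diff(lo):.0f} to {elo_diff(hi):.0f})")
```

With only 16 games the point estimate is about 190 Elo, but the interval around it spans several hundred Elo, so the size of the effect is not pinned down by this sample.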
as0770 wrote:
q30 wrote: ... and in sparring between Go engines of equivalent strength, the one with pondering may pass the one without pondering in rating.
So we agree that the question is only relevant in matches between engines where one is able to ponder and the other engine is not? ...
So for both, the question of the absolute time control and CPU power is much more relevant than the question of running ponder-on or ponder-off matches...
So what is your point in always calling others' results "synthetic"? ...

It is relevant if both can ponder, too, because one may ponder more effectively than the other...
as0770 wrote:
...
That is exactly what I expected: the statistical fluctuation when playing matches between engines of similar strength is very high, just like rolling dice. The difference between pondering and not pondering is simply not measurable with such a small number of games.

Try a time control equivalent to 2 min per move. In that case the fluctuations will be much smaller and the difference between pondering and not pondering will be measurable...
q30 wrote:
Try to use time control equivalent to 2 min per move. In this case fluctuations will be much smaller and difference between pondering and not pondering will be measurable...

You have no idea what you are talking about. Standard deviation doesn't change with the time control.
Code: Select all
1. AQ 2.0.3 12/16
2. Leela 0.11.0 Beta 11 4/16
Code: Select all
1. Leela 0.10.0 22/24
2. Rayon 4.6.0 19/24
3. Oakfoam 0.2.1 NG-06 18/24
4. Hiratuka 10.37B (CPU) 9/24
5. DarkForest v2 MCTS 1.0 7/24
6. DarkGo 1.0 5/24
7. Pachi DCNN 11.99 4/24
Code: Select all
1. Ray 9.0.1 31/36
2. Pachi DCNN 11.99 30/36
3. Dream Go 0.5.0 29/36
4. Leela Zero 0.9 (2018.01.01) 21/36
5. MoGo 4.86 18/36
6. deltaGo 1.0.0 18/36
7. Fuego 1.1 15/36
8. Michi C-2 1.4.2 8/36
9. Orego 7.08 8/36
10. GNU Go 3.8 2/36
Code: Select all
1. GNU Go 3.8 25/28
2. Hara 0.9 18/28
3. Matilda 1.25 16/28
4. Indigo 2009 16/28
5. Dariush 3.1.5.7 15/28
6. Aya 6.34 13/28
7. Fudo Go 3.0 7/28
8. JrefBot 081016-2022 2/28
Code: Select all
1. JrefBot 081016-2022 16/18
2. Iomrascálaí 0.3.2 15/18
3. Crazy Patterns 0008-13 13/18
4. Marcos Go 1.0 13/18
5. AmiGo 1.8 13/18
6. Beancounter 0.1 8/18
7. Stop 0.9-005 5/18
8. GoTraxx 1.4.2 3/18
9. CopyBot 0.1 2/18
10. Brown 1.0 2/18

lemonsqueez wrote:
Thanks for running these tournaments, impressive lineup!
Right now the leagues are based on strength, more or less, if I understand correctly.

You do.
lemonsqueez wrote:
Just an idea: how about a GPU league and a CPU league for the top programs?
This is already the case mostly; what I mean is that for programs like Leela, which can do both, it'd be interesting to see how the CPU version fares. Not sure how practical this would be. Maybe you don't want to have to unplug your graphics card to prevent it from using the GPU =)

I want to have a comparison between GPU and CPU engines, so I don't want to make different leagues.
Code: Select all
1. AQ 2.0.3 12/16
2. Leela 0.11.0 Beta 11 4/16
Code: Select all
1. Leela 0.10.0 22/24
2. Rayon 4.6.0 19/24
3. Oakfoam 0.2.1 NG-06 18/24
4. Hiratuka 10.37B (CPU) 9/24
5. DarkForest v2 MCTS 1.0 7/24
6. DarkGo 1.0 5/24
7. Pachi DCNN 11.99 4/24
Code: Select all
1. Leela Zero 0.11 (2018.01.17) 15/20
2. Pachi DCNN 11.99 13/20
3. DarkGo 1.0 12/20
4. Dream Go 0.5.0 11/20
5. Ray 9.0.1 7/20
6. MoGo 4.86 2/20
Code: Select all
1. MoGo 4.86 18/20
2. deltaGo 1.0.0 14/20
3. Fuego 1.1 13/20
4. Michi C-2 1.4.2 8/20
5. Orego 7.08 5/20
6. GNU Go 3.8 2/20
Code: Select all
1. GNU Go 3.8 25/28
2. Hara 0.9 18/28
3. Matilda 1.25 16/28
4. Indigo 2009 16/28
5. Dariush 3.1.5.7 15/28
6. Aya 6.34 13/28
7. Fudo Go 3.0 7/28
8. JrefBot 081016-2022 2/28
Code: Select all
1. JrefBot 081016-2022 16/20
2. Iomrascálaí 0.3.2 12/20
3. SimpleGo 0.4.3 11/20
4. Crazy Patterns 0008-13 7/20
5. Marcos Go 1.0 7/20
6. AmiGo 1.8 7/20
Code: Select all
1. AmiGo 1.8 19/20
2. Beancounter 0.1 15/20
3. Stop 0.9-005 10/20
4. GoTraxx 1.4.2 7/20
5. CopyBot 0.1 6/20
6. Brown 1.0 3/20

as0770 wrote:
You have no idea what you are talking about. Standard deviation doesn't change with the time control.

It depends on game randomness, which changes with the time control...

q30 wrote:
as0770 wrote: You have no idea what you are talking about. Standard deviation doesn't change with the time control.
It depends on game randomness, which changes with the time control...

Then we have to rewrite basic mathematical principles.
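The disagreement above can be made concrete with a simple binomial model. This is an assumption for illustration, not something either poster specified: each game is an independent win or loss with a fixed win probability p. Under that model the standard error of the measured win rate depends only on p and the number of games, so a time control could matter only by changing p itself. The function names below are hypothetical.

```python
import math


def standard_error(p, games):
    """Standard error of the observed win rate under a binomial model."""
    return math.sqrt(p * (1.0 - p) / games)


def expected_score(elo):
    """Expected score for a player who is `elo` points stronger (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def games_needed(elo, z=1.96):
    """Rough number of games before an `elo` gap exceeds z standard errors.

    Assumption: independent games, no draws, normal approximation.
    """
    if elo <= 0:
        raise ValueError("elo must be positive")
    p = expected_score(elo)
    return math.ceil(z * z * p * (1.0 - p) / (p - 0.5) ** 2)


# Spread of the measured win rate between two equal engines (p = 0.5).
for n in (16, 100, 1000):
    print(n, "games:", round(standard_error(0.5, n), 3))

# Games needed to resolve a given strength gap.
for gap in (20, 50, 100):
    print(gap, "Elo:", games_needed(gap), "games")
```

For two equal engines, 16 games give a standard error of 0.125, which is two games either way, so small matches of this size cannot resolve modest differences under this model.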
Cyan wrote:
Some strong bots have been updated:
AQ v2.1.1
Leela 0.11.0
Ray 4.32
Pachi 12.00

Leela 0.11 is running and I'll wait for the official release of Pachi 12.
as0770 wrote:
Edit: Tried to compile Ray 4.32 without success. First I had to get some libs of cntk 2.1, although in the readme the cntk version is 2.3. Then I ran into the next error messages. The author is not much interested in making the installation easier, not even in keeping the readme up to date. That's no problem, but I'll leave Rn until it is easier to install.

Yes, and sadly it's doing a disservice to this otherwise strong bot.