Thanks for your explanation. That is a clever method. I have a lot of .dat files from GoGui tournaments and always wanted to make such an Elo list. Maybe you can share your Python code. I'm sure it would be useful for others.
Posts: 595 Location: Adelaide, South Australia Liked others: 211 Was liked: 267
Rank: Australian 2 dan
GD Posts: 200
Kris Storm wrote:
Maybe you can share your Python code. I'm sure it would be useful for others.
It needs a bit of a rewrite before I can share it. At the moment it wouldn't work on someone else's computer because of all the hard-coded path names (and I'd be embarrassed to let it out in this shape). I'll add it to my to-do list.
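Until that code is available, here is a minimal sketch of one common way to turn a list of game results into an Elo-style table. It is not the code discussed above and may not match the method used for the tables in this thread: it fits a Bradley-Terry model with the standard minorisation-maximisation update and converts the strengths to the Elo scale. The function name `fit_elo`, the `prior` parameter and the sample results are all hypothetical, and reading the results out of GoGui's .dat files is left out.

```python
from collections import defaultdict
from math import log10


def fit_elo(games, iterations=500, prior=1.0):
    """Fit Elo-scale ratings to a list of (winner, loser) name pairs.

    Hypothetical helper: Bradley-Terry strengths found by the standard
    minorisation-maximisation update, then converted to the Elo scale.
    `prior` adds that many virtual games (half won, half lost) against
    a fixed-strength anchor so unbeaten or winless engines stay finite.
    """
    wins = defaultdict(float)
    played = defaultdict(lambda: defaultdict(float))  # played[a][b] = games between a and b
    for winner, loser in games:
        wins[winner] += 1.0
        wins[loser] += 0.0  # make sure winless engines are registered too
        played[winner][loser] += 1.0
        played[loser][winner] += 1.0

    strength = {name: 1.0 for name in wins}
    for _ in range(iterations):
        updated = {}
        for name, s in strength.items():
            denom = prior / (s + 1.0)  # virtual games against the anchor (strength 1)
            for opponent, n in played[name].items():
                denom += n / (s + strength[opponent])
            updated[name] = (wins[name] + prior / 2.0) / denom
        strength = updated

    # Convert to the Elo scale and centre the list on zero.
    elo = {name: 400.0 * log10(s) for name, s in strength.items()}
    mean = sum(elo.values()) / len(elo)
    return {name: rating - mean for name, rating in elo.items()}


# Made-up results purely for illustration, not taken from the tournament.
games = [("A", "B")] * 3 + [("B", "A")] + [("B", "C")] * 2 + [("A", "C")] * 2
ratings = fit_elo(games)
for name, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{name}: {rating:+.0f}")
```

The ratings are only meaningful relative to each other, which is why the sketch centres them on zero; a real list would usually pin one engine to a chosen reference value instead.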
Going slow this week, because my system keeps crashing! It doesn't like the combination of strong bots and slow games. I think I need to upgrade an NVIDIA driver, but it's hard to find information on which drivers are more stable. At the moment, I can queue up 8 games to be played overnight, but I'll wake up to a black screen and an unresponsive system, and when I reboot it looks like only two or three games got played.
Anyway, new this week:
AQ has entered the 1-minute and 5-minute ratings. It's at a disadvantage because it was trained for Japanese rules and 6.5 komi, and I'm playing all the games with Chinese rules and 7.5 komi (this works for the majority of bots). My guess is that AQ's rating will therefore be 50-100 points below its true strength, but I can't think of a good way to measure how much difference it actually makes.
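For a sense of scale, here is a minimal sketch of what a 50-100 point gap means in win rate, assuming the standard logistic Elo model with a 400-point scale. The function name `expected_score` is illustrative only.

```python
def expected_score(elo_gap):
    """Expected score of the stronger side for a given Elo gap,
    under the standard logistic Elo model (400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))


for gap in (50, 100):
    print(f"{gap} Elo gap -> expected score {expected_score(gap):.1%}")
```

Under that model, a 50-point gap corresponds to an expected score of roughly 57% and a 100-point gap to roughly 64%, so a handicap of this size is hard to detect without a large number of games.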
Looking at the 5-minute games:
AQ played 52 games
50 were won or lost by resignation.
One was lost by AQ by 47.5 points. I can't figure out why AQ didn't resign. The game was 449 moves long. A large group died at move 234, and analysis with LZ_157 says the winrate was below 5% for the rest of the game. The position could have been scored at move 346, but AQ kept playing inside its own territory and trying to live inside black's territory for another 100 moves.
One game was lost by AQ by 2.5 points. Ray actually gave away 2 points with a slack endgame move, so AQ was previously behind by 4.5 points, which makes it hard to explain this as a 6.5 versus 7.5 komi issue.
Of course we don't know how many of the resigned games were due to an overplay that wouldn't have happened with the correct komi.
A few more bots added to the 20-minute ratings. My mathematical model (post number 16 above) is looking about as bad as expected :-) At the slower time limit, it looks like LZ now gets stronger with more threads, unlike in the fast games.
Results so far at 1 minute time limit, based on 1350 games with 62 engines:
Here are the two games where I thought AQ should have resigned.
Attachments:
ray_ELF-AQ-2018-10-11_21-15.sgf [2.85 KiB]
File comment: AQ loses by 2.5 points; ray gives away points at move 282
AQ-ray_W11_12t-2018-10-11_20-15.sgf [3.07 KiB]
File comment: AQ loses by 47.5 points; resignable from move 242; game could have been scored at move 346
Sorry for the long gap between updates! I spent a lot of time figuring out how to update my graphics drivers, but I still haven't solved the crashing problem. It looks like I can't reliably run LZ with 6 or more threads in long games. But that's OK, I've found out what I originally wanted to, which is that LZ (even with 2 threads) seems to achieve superhuman performance on a fairly ordinary computer.
I'm a little surprised to see ELF still at the top of the list, as I thought recent LZ networks had overtaken ELF at time parity. Over the next couple of weeks I'll add some more games to reduce some of the error margins, maybe throw LZ_157 into the mix, and maybe do some benchmarking to see how many visits per second I'm getting for various different networks.
Oh, and for anyone who's observant: in previous posts, the Elo+ and Elo- columns were the wrong way round. I've gone back and edited the earlier posts so they're now correct.
Results so far at 20 minute time limit, based on 228 games with 22 engines:
Here are the final results (unless I get inspired to do more). Looking at the error bounds, we can't say for sure which of the top 6 is actually the strongest, but they all seem to be definitely in the "superhuman" range (considering that the bottom of this list is already amateur dan level). Just for interest, on my hardware LZ_174 and LZ_188 get about 300 visits per second, ELF about 700, GX47 around 1200, LZ_157 around 1500 (numbers are approximate because they vary from one game to another, possibly depending on the board position and how much of the tree is reused from previous moves).
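To illustrate why the error bounds stay wide with this many games, here is a rough sketch that turns a head-to-head record into an approximate 95% interval on the Elo gap. It uses a normal approximation to the binomial and the standard logistic Elo model, which is an assumption and probably not how the Elo+ and Elo- columns in these tables were computed. The function names and the sample records are hypothetical.

```python
from math import log10, sqrt


def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1)."""
    return 400.0 * log10(p / (1.0 - p))


def elo_interval(wins, games, z=1.96):
    """Rough 95% interval for the Elo gap behind a head-to-head record.

    Uses a normal approximation to the binomial, which is crude for
    small samples or lopsided scores; the bounds are clamped so the
    logarithm stays defined.
    """
    p = wins / games
    half_width = z * sqrt(p * (1.0 - p) / games)
    eps = 1e-6
    low = min(max(p - half_width, eps), 1.0 - eps)
    high = min(max(p + half_width, eps), 1.0 - eps)
    return elo_from_score(low), elo_from_score(high)


# Made-up records with the same 60% score at different sample sizes.
for wins, games in ((12, 20), (60, 100), (300, 500)):
    low, high = elo_interval(wins, games)
    print(f"{wins}/{games}: {low:+.0f} to {high:+.0f} Elo")
```

If the 426 games were spread evenly over 25 engines (an assumption), each engine would have played about 34 games, which is in the range where intervals of this kind span well over a hundred points.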
Results at 20 minute time limit, based on 426 games with 25 engines:
Updated with KataGo, OpenCL version (and also throwing in some recent LZ weights for comparison). Just fast games for this one, didn't get around to updating the 20 minute results.
kata_6b is the 6-block network, and you can probably guess the names for 10, 15, 20 blocks. In the 1 minute games I also tried different numbers of threads but didn't see much potential for significant improvement. The suggestion in the config file of trying more threads than you have cores wasn't a success on my hardware.
Results at 1 minute time limit, based on 1520 games with 72 engines:
Thanks, glad you like it!
I think GX47 was the strongest in the GX series when I started doing this (I can't remember exactly, it was a while ago). There are a few newer Leela Master networks now. Download from https://github.com/pangafu/LeelaMasterWeight
For more information about how I downloaded and set up the various engines, see the other thread at https://lifein19x19.com/viewtopic.php?p=236178
Ah, it looks like some of the older networks have been removed from the Google Drive folders. You'd have to raise an issue on GitHub and ask pangafu there if they're still available.
So based on this chart, anyone with a halfway decent GPU running the latest LZ net at any reasonable time limit can already play against an AI opponent that is essentially stronger than AlphaGoLee and catching up to AlphaGoMaster?
hydrogenpi7 wrote:
So based on this chart, anyone with a halfway decent GPU running the latest LZ net at any reasonable time limit can already play against an AI opponent that is essentially stronger than AlphaGoLee and catching up to AlphaGoMaster?
It depends on a bunch of assumptions about how the Elo rating system works. I wouldn't dare to be that precise, but it looks to me like AIs can play at a superhuman level on ordinary PCs with a mid-range GPU.