Home-made Elo ratings for some engines

And
Gosei
Posts: 1464
Joined: Tue Sep 25, 2018 10:28 am
GD Posts: 0
Has thanked: 212 times
Been thanked: 215 times

Re: Home-made Elo ratings for some engines

Post by And »

xela, thank you very much for your great work! Can you explain why, of all the LM_GX networks, you chose LM_GX47? And where can I download LM_B5 and LM_Z2?
xela
Lives in gote
Posts: 652
Joined: Sun Feb 09, 2014 4:46 am
Rank: Australian 3 dan
GD Posts: 200
Location: Adelaide, South Australia
Has thanked: 219 times
Been thanked: 281 times

Post by xela »

Thanks, glad you like it!

I think GX47 was the strongest in the GX series when I started doing this (I can't remember exactly, it was a while ago). There are a few newer Leela Master networks now. You can download them from https://github.com/pangafu/LeelaMasterWeight

For more information about how I downloaded and set up the various engines, see the other thread at viewtopic.php?p=236178
And

Post by And »

xela, I looked through everything several times, but I could not find where to download LM_B5 and LM_Z2 :(
xela

Post by xela »

Ah, it looks like some of the older networks have been removed from the Google Drive folders. You'd have to raise an issue on GitHub and ask pangafu there whether they're still available.
hydrogenpi7
Dies in gote
Posts: 63
Joined: Sat Mar 25, 2017 3:19 pm
GD Posts: 0
Been thanked: 3 times

Post by hydrogenpi7 »

xela wrote: Updated with KataGo, OpenCL version (and also throwing in some recent LZ weights for comparison). Just fast games for this one, didn't get around to updating the 20 minute results.

kata_6b is the 6-block network, and you can probably guess the names for 10, 15, 20 blocks. In the 1 minute games I also tried different numbers of threads but didn't see much potential for significant improvement. The suggestion in the config file of trying more threads than you have cores wasn't a success on my hardware.

Results at 1 minute time limit, based on 1520 games with 72 engines:

Name            Elo   Elo+  Elo-  games  score  avg_opp
kata_15b        4224  184   166   16     75%    4041
LZ_242          4194  215   218   8      63%    4117
LZ_157          4186  94    85    74     74%    3993
LZ_188          4172  174   166   14     57%    4128
LM_GX47         4160  101   94    64     72%    3921
kata_20b        4142  179   167   16     63%    4047
LZ_ELF          4130  85    82    72     61%    4039
kata_10b        4037  118   110   44     68%    3873
LZ_ELF_6t       4024  92    97    54     39%    4100
LZ_174          3993  83    85    72     47%    4010
LZ_173          3941  107   106   54     59%    3836
ray_ELF_12t     3920  88    92    66     39%    3997
LZ_141          3907  98    97    60     58%    3805
kata_10_12t     3895  141   139   24     50%    3902
kata_10_6t      3856  130   134   28     43%    3912
LM_E8           3827  107   109   56     59%    3673
kata_10_2t      3820  140   145   24     42%    3886
LZ_116          3752  89    89    74     49%    3757
LZ_174_6t       3733  108   109   46     50%    3723
ray_173_6t      3698  107   106   40     53%    3681
kata_10_24t     3689  159   186   20     25%    3888
LM_Z2           3679  96    92    62     65%    3525
ray_173_12t     3672  112   110   36     53%    3653
LM_W11          3649  113   116   44     50%    3619
LZ_phoenix      3554  116   118   38     42%    3641
LM_B5           3545  99    99    58     57%    3460
kata_6b         3540  185   188   14     43%    3608
ray_W11_12t     3518  111   116   38     39%    3599
ray_173_2t      3489  124   124   30     50%    3489
LZ_zed          3402  119   122   36     42%    3476
leela           3378  116   115   56     55%    3298
LZ_91           3319  99    105   80     30%    3548
ray_ELF         3280  128   124   34     50%    3308
ray_173         3272  132   126   34     59%    3206
ray_W11         3130  104   103   48     50%    3139
dream_ponder    3129  119   123   40     53%    3051
AQ              3054  146   149   24     46%    3087
oakfoam_nn      2993  117   119   84     62%    2785
dream           2992  116   120   36     44%    3034
LM_GX47_c       2969  123   117   34     59%    2909
LZ_116_c2t      2865  109   112   60     30%    3142
LM_E8_c         2851  141   141   22     50%    2851
LM_B5_c         2849  135   135   24     50%    2849
LZ_116_c6t      2828  138   139   24     50%    2824
LM_Z2_c         2746  121   115   36     61%    2667
LZ_57           2744  114   116   52     50%    2725
LM_W11_c        2683  126   134   28     39%    2754
leela_c1t       2576  109   108   42     52%    2551
leela_c2t       2508  129   136   30     37%    2621
LZ_91_c2t       2506  137   140   26     46%    2533
leela_c         2499  103   101   88     59%    2377
pachi_nn        2400  111   107   76     64%    2228
pachi           2190  127   123   68     54%    2179
leela_nonet     2156  105   102   88     58%    2094
gnugo           1872  89    83    84     64%    1774
gnugo_l7        1871  120   122   52     38%    2005
LZ_57_c2t       1864  246   218   8      63%    1791
gnugo_M         1842  140   133   34     53%    1844
gnugo_l1        1823  91    89    84     48%    1882
gnugo_l4        1807  141   139   32     47%    1862
leela_nonet_1t  1758  244   255   10     10%    2160
oakfoam1        1735  126   122   32     56%    1692
pachi_pat       1711  394   220   2      50%    1711
fuego           1711  90    90    78     37%    1945
oakfoam_book    1628  113   115   40     38%    1731
pachi_1t        1585  207   110   14     14%    1894
oakfoam         1567  93    85    72     25%    1806
oakfoam2        1524  137   53    30     23%    1725
pachi_monte     1523  387   49    2      0%     1711
pachi_plain     1523  387   49    2      0%     1711
michi           1506  339   34    4      0%     1791
matilda         1437  170   -31   44     9%     1877
Results at 5 minute time limit, based on 1680 games with 59 engines:

Name            Elo   Elo+  Elo-  games  score  avg_opp
LZ_242          4662  -15   144   34     79%    4442
LZ_ELF          4506  90    87    66     64%    4389
LM_GX47         4499  97    94    58     64%    4380
kata_20b        4465  106   106   42     52%    4446
LZ_188          4454  112   114   36     47%    4473
LZ_ELF_6t       4444  92    90    62     56%    4390
LZ_157          4390  106   105   44     55%    4354
LZ_174          4259  91    89    76     61%    4149
LZ_173          4257  86    86    80     54%    4189
LZ_141          4243  87    86    82     57%    4160
kata_15b        4156  103   104   46     48%    4172
LZ_phoenix      4144  102   102   54     52%    4109
ray_ELF_12t     4134  97    99    50     48%    4144
LZ_174_6t       4093  80    78    100    57%    4034
ray_173_12t     3944  98    101   54     41%    4017
LM_Z2           3912  92    95    76     49%    3878
LM_B5           3905  112   111   46     59%    3808
ray_173_6t      3853  99    101   48     44%    3902
LZ_116          3844  84    85    98     50%    3827
LM_W11          3802  105   99    64     66%    3657
LM_E8           3787  109   106   50     56%    3740
kata_10b        3697  114   115   48     44%    3777
ray_173_2t      3682  106   108   54     52%    3659
ray_W11_12t     3636  107   99    58     67%    3497
ray_ELF         3491  98    99    66     38%    3641
AQ              3445  102   106   60     50%    3411
ray_173         3422  96    99    62     37%    3548
leela           3386  90    93    98     44%    3425
LZ_zed          3367  105   103   56     50%    3388
LZ_91           3282  92    95    76     37%    3419
kata_6b         3192  109   112   48     38%    3354
ray_W11         3190  94    93    62     50%    3202
dream_ponder    3181  110   112   44     52%    3113
dream           3091  117   121   36     44%    3130
LM_E8_c         3011  104   102   58     53%    2988
LZ_116_c2t      2959  105   104   66     55%    2916
LM_W11_c        2846  116   114   36     53%    2829
LM_GX47_c       2817  106   105   46     46%    2867
oakfoam_nn      2811  92    89    72     51%    2823
leela_c         2733  92    92    74     50%    2731
leela_c2t       2712  83    84    78     49%    2718
LZ_91_c2t       2620  106   101   56     59%    2540
LM_Z2_c         2573  109   110   38     47%    2596
LM_B5_c         2571  114   121   34     38%    2655
LZ_57           2569  113   114   38     45%    2616
leela_c1t       2507  95    100   68     41%    2596
pachi_nn        2400  107   113   64     39%    2514
pachi           2108  112   108   80     58%    2005
LZ_57_c2t       2064  132   122   40     70%    1872
leela_nonet     2058  137   150   42     36%    2157
fuego           1836  108   105   72     65%    1662
pachi_1t        1829  119   114   54     65%    1662
leela_nonet_1t  1827  125   116   52     69%    1624
gnugo           1472  214   -177  106    20%    1763
michi           1438  298   -211  40     55%    1403
oakfoam1        1258  462   -391  28     43%    1401
oakfoam         1039  657   -609  26     27%    1357
oakfoam_book    970   710   -678  32     13%    1406
matilda         947   741   -701  26     15%    1379
So, based on this chart, anyone with a halfway decent GPU running the latest LZ net at any reasonable time setting can already play against an AI opponent that is essentially stronger than AlphaGo Lee and catching up to AlphaGo Master?
xela

Post by xela »

hydrogenpi7 wrote: So, based on this chart, anyone with a halfway decent GPU running the latest LZ net at any reasonable time setting can already play against an AI opponent that is essentially stronger than AlphaGo Lee and catching up to AlphaGo Master?
It depends on a bunch of assumptions about how the Elo rating system works. I wouldn't dare to be that precise, but it looks to me like AIs can play at a superhuman level on ordinary PCs with a mid-range GPU.
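
To make those assumptions a bit more concrete, here is a rough Python sketch. It assumes the standard logistic Elo model (a 400-point difference means 10:1 odds), which may not be exactly what the rating tool behind these tables uses, and it assumes the Elo+ and Elo- columns are upper and lower error margins around each rating. The function names are made up for illustration.

```python
def expected_score(elo_a: float, elo_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo model.

    Assumption: 400 rating points correspond to 10:1 odds. The tool used
    to produce the tables in this thread may use a different model.
    """
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))


def intervals_overlap(elo_a, plus_a, minus_a, elo_b, plus_b, minus_b) -> bool:
    """True if the two [Elo - Elo-, Elo + Elo+] ranges overlap.

    Assumption: Elo+ and Elo- are error margins above and below the rating.
    """
    low_a, high_a = elo_a - minus_a, elo_a + plus_a
    low_b, high_b = elo_b - minus_b, elo_b + plus_b
    return low_a <= high_b and low_b <= high_a


if __name__ == "__main__":
    # 5 minute table: LZ_242 (4662) against LZ_ELF (4506), a 156-point gap.
    p = expected_score(4662, 4506)
    print(f"LZ_242 vs LZ_ELF expected score: {p:.2f}")

    # 1 minute table: kata_15b (4224 +184 -166) against LZ_242 (4194 +215 -218).
    same_ballpark = intervals_overlap(4224, 184, 166, 4194, 215, 218)
    print(f"kata_15b and LZ_242 ranges overlap: {same_ballpark}")
```

Under these assumptions the 156-point gap works out to an expected score of roughly 0.71 for the stronger engine, and the ranges for the top entries in the 1 minute table overlap heavily because they are based on only 8 to 16 games each. That is why the exact ordering near the top should not be read too literally.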
xela

Post by xela »

Looks like someone else has done something a bit more comprehensive, although they're a bit short on details of the methodology.