Analysis of Pro vs Elf with LeelaElf
Posted: Tue Jun 26, 2018 12:29 pm
When Facebook released their Elf OpenGo bot they also included 12 games versus 4 top Korean pros. These seem to have been rather neglected, so I thought I'd analyse one using the LeelaElf converted weights. As well as the go instruction, this lets us see how close our LeelaElf conversion is to the true Elf: does it play the same moves and give the same win% as those included in the released games? (The Elf record uses a -1 to +1 scale whereas LeelaZero/Lizzie use 0% to 100%, so a change from 0.05 to 0.09 in the Elf pro record is a 2% mistake in LZ terms.)
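To make the scale conversion concrete, here is a minimal sketch. It assumes the mapping between the two scales is simply linear (-1 maps to 0%, +1 maps to 100%); the function names are my own for illustration and come from neither Elf nor LeelaZero. It also ignores whose perspective the value is from.

```python
def elf_to_lz_percent(value: float) -> float:
    """Map an Elf value in [-1, +1] to a 0-100% winrate (assumed linear)."""
    if not -1.0 <= value <= 1.0:
        raise ValueError("Elf value must lie in [-1, +1]")
    return (value + 1.0) / 2.0 * 100.0


def elf_change_to_lz_points(before: float, after: float) -> float:
    """Winrate change, in percentage points, between two Elf values."""
    return elf_to_lz_percent(after) - elf_to_lz_percent(before)


# The example from the post: 0.05 -> 0.09 on the Elf scale.
print(round(elf_to_lz_percent(0.05), 2))              # Elf 0.05 as a winrate
print(round(elf_to_lz_percent(0.09), 2))              # Elf 0.09 as a winrate
print(round(elf_change_to_lz_points(0.05, 0.09), 2))  # size of the swing
```

Because the Elf scale is 2 units wide and the LZ scale 100 points wide, any Elf difference is just multiplied by 50 to get percentage points.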
The games (except the first 2) "were played using the v0 pretrained model (publicly available for download). For each move, ELF OpenGo used 2 threads with 40000 rollouts per thread (grouped into batches of 16). This took around 50 seconds per move on a V100 GPU." For comparison, my GeForce 1060 takes 380 seconds to do 80k playouts with LeelaElf (about 8 times as long). To keep a decent review pace I generally only did ~10k playouts, but did more if it looked like another choice might overtake the current top move.
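For anyone wanting to budget their own review time, the arithmetic behind those numbers can be sketched as below. The only inputs are the figures quoted above. The constant and function names are made up for illustration, and it assumes the playout rate stays constant regardless of how many playouts are requested, which is only roughly true in practice.

```python
# Figures quoted in the post.
ELF_ROLLOUTS_PER_MOVE = 2 * 40_000   # 2 threads x 40000 rollouts
ELF_SECONDS_PER_MOVE = 50            # on a V100
GTX1060_SECONDS_FOR_80K = 380        # LeelaElf on a GeForce 1060


def playouts_per_second(playouts: int, seconds: float) -> float:
    """Average playout rate over one move."""
    return playouts / seconds


def seconds_for(playouts: int, rate: float) -> float:
    """Estimated time for a playout budget at a fixed rate (assumption)."""
    return playouts / rate


rate_1060 = playouts_per_second(ELF_ROLLOUTS_PER_MOVE, GTX1060_SECONDS_FOR_80K)
slowdown = GTX1060_SECONDS_FOR_80K / ELF_SECONDS_PER_MOVE
review_seconds = seconds_for(10_000, rate_1060)

print(f"1060 rate: {rate_1060:.0f} playouts/s")
print(f"slowdown vs V100: {slowdown:.1f}x")
print(f"~10k playouts: {review_seconds:.1f} s per move")
```

Under that assumption a ~10k playout review on the 1060 costs roughly the same wall-clock time per move as Elf's full 80k on the V100.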
And the winrate graph. White's biggest mistakes on the way down to 20% were:
- bottom right kick
- hoshi pincer
- hanging connection instead of cut tesuji
- top left approach
- double approach
- triple approach/surround
As for comparing Elf and LeelaElf, the only significant good move Elf found that LeelaElf didn't in the part I analysed was the q12 pincer, though LeelaElf agreed it was good when shown it. They both shared the strange blind spot of not seeing the monkey jump kill at top left (which does cast some doubt on the previous evaluations: black could live if he didn't solid connect before and fell back instead, but then white would capture the one stone in sente, which makes quite a big difference to the strength of the groups). They were in pretty good agreement at finding the pro's mistakes, though the exact win% drops varied somewhat.