LeelaZero adventures on Fox
Posted: Wed Nov 07, 2018 3:10 am
Recent musings about whether LZ is superhuman on moderate hardware have prompted me to make an account on Fox to play as LeelaZero. It's called leelazero7 so that it's hopefully obvious it's a bot, and people can cancel the game (quite a few do) if they don't want to play a bot [I say "Hi, I am LeelaZero network #157 running on GTX 1060. If you don't want to play a bot feel free to quit." Could anyone translate that to Chinese please?]. On Fox you have to start at 3d and can get a double rank promotion if you win 18/20 at 3d or 20/20 at 5d+, and I'm now at 10 wins at 7d. This thread is to record interesting games or things I notice.
My setup is a 3GB GeForce GTX 1060 GPU running the best 15-block network, #157, for now (I might switch to a 40-block network later if the extra strength is needed if/when I get to play top 9ds, or just as an experiment to see how good it is at lower playouts). I run Lizzie and relay the moves by hand. I've made 3 misclicks so far in 48 games: one was a taisha instead of a knight's press against the 3-4, which was a serious mistake but LZ soon recovered from it; the others were small endgame mistakes in positions it was winning by a lot.
I provide some time management help/hindrance to LZ in that I choose when to play. I might take her top choice (by playouts, the blue circle) after just a few seconds (could be ~1k playouts) if I think it's not likely to change or the move isn't important. In hard positions, or where a 2nd/3rd choice is visible with fewer playouts but a higher winrate than the blue top-playouts move, I might let it analyse longer (still within the game time settings, usually 30 sec byo-yomi). Alternatively I play the higher-winrate, lower-playouts move in Lizzie, let it analyse for as many playouts as the top-playouts move had, and if the winrate remains higher I choose that move for the game. This essentially accelerates LZ's switch to the better move, which it would make itself if left to its own devices. Such an approach could be implemented in LZ algorithmically and I expect it would increase its playing strength, as a bodge for what is IMO its too-low exploration. If it looks like LZ might blunder a ladder or hit some other blindspot, I won't step in and use my own go skill to correct the mistake, but I will give it a bit more time. And if several choices are all very close I might pick the more interesting one. Average playouts per move are probably around 5k, which is about 10 seconds. I will use the opponent's time to let LZ ponder if I think it's needed; otherwise I just review the game live and explore other choices for my own interest.
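For anyone who wants to see the "let the higher-winrate move catch up on playouts" check written down, here is a rough Python sketch of what I do by hand. It is not code from LZ or Lizzie: the `Candidate` class, `choose_move` and the `reanalyse` callback are all made-up names for illustration, and `reanalyse` just stands in for "play the move in Lizzie and wait until it has that many playouts".

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    move: str       # board coordinate, e.g. "Q16"
    playouts: int   # playouts the search has spent on this move
    winrate: float  # winrate in percent, from LZ's point of view


def choose_move(candidates: List[Candidate],
                reanalyse: Callable[[str, int], float]) -> str:
    """Pick a move using the manual heuristic (illustrative only).

    `reanalyse(move, playouts)` is a hypothetical callback that returns
    the move's winrate once it has been analysed for `playouts` playouts.
    """
    # The default choice: the move with the most playouts (the blue circle).
    top = max(candidates, key=lambda c: c.playouts)

    # Challengers: fewer playouts than the top move, but a higher winrate.
    challengers = [c for c in candidates
                   if c.playouts < top.playouts and c.winrate > top.winrate]
    if not challengers:
        return top.move

    # Give the most promising challenger as many playouts as the top move had.
    challenger = max(challengers, key=lambda c: c.winrate)
    confirmed = reanalyse(challenger.move, top.playouts)

    # Switch only if the challenger's winrate is still higher afterwards.
    return challenger.move if confirmed > top.winrate else top.move
```

With a top move at 4000 playouts and 55%, and a second choice at 600 playouts and 58%, the second choice gets picked only if it is still above 55% once it has been given 4000 playouts of its own.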
Summary so far: unsurprisingly, all wins. It beats 3d/5d/7d with ease (usually >85% by move 50), though with the 7ds there are now some longer periods of play where the opponent's winrate is flattish instead of constantly declining. It often beats 5ds by only 15-20 points if they don't resign, because it slacks off when leading (playing myself I usually beat Fox 5ds by more; I'm at 6d now and lose some). The 7ds seem to have more fighting spirit, so it crushes them in early fighting. A game against a 7d today was the first time LZ thought it was losing: it played an atari for a ladder that didn't work (even after 15k playouts), the 7d extended, and LZ initially wanted to atari again to continue the failed ladder but quickly realised it didn't work, so it compromised for a bad result (though not as bad as continuing the failed ladder), see below. I've seen this mistake in this joseki in LZ's training games, where it goes unpunished. A dozen moves later LZ was winning again, so no big deal. However, because the non-working ladder atari was still on the board, it initially wanted to play there for the next few moves, so I gave it more playouts (about 10k, 20s a move) so it would get over that delusion and play sensibly at the top right. Another player also saw his winrate increase when LZ thought it could capture a cutting stone in a ladder but couldn't, but that only took LZ from winning by a fair bit to winning by a bit less.
P.S. Does anyone have a list of known pros' usernames on Fox? I have this list from Tygem: https://docs.google.com/spreadsheets/d/ ... sp=sharing, maybe some are the same. I often see wonfun, who I think is Weon Seongjin?