The challenge is to broaden Leela Zero's search to evaluate not just the best move, but also the near misses and plausible alternatives. A broader search will slightly weaken LZ's playing strength but might make it easier to use as an analysis tool. Of course you can get those evaluations by clicking around in the interface (that was the point of the other conversation), but it could be easier.
My suggestion was to give ten visits (or 20, or 100) to every legal move before starting the usual Monte Carlo tree search. (An alternative approach has been tried, focussing on the top four moves: see Uberdude's post here.) It's not too hard to change the LZ code to do that: it takes about 30 extra lines. My implementation is here: experimental software, use at own risk!
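For anyone who wants the idea without reading the patch, here is a rough Python sketch of "forced minimum visits, then normal selection". It is not the actual change (LZ itself is C++), and it only models a single level of the tree: the `Child` class, `select_child`, `search_root`, the priors, and the stub evaluation values are all made up for illustration, and the selection formula is a generic PUCT-style rule rather than LZ's exact one.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    move: str
    prior: float            # policy prior (hypothetical numbers below)
    visits: int = 0
    value_sum: float = 0.0

    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(children, min_visits, c_puct=1.5):
    """Pick the next root move to visit."""
    # Forced phase: any move still short of min_visits goes first.
    for child in children:
        if child.visits < min_visits:
            return child
    # Normal phase: generic PUCT-style score (an assumption, not LZ's code).
    total = sum(c.visits for c in children)
    return max(
        children,
        key=lambda c: c.mean_value()
        + c_puct * c.prior * math.sqrt(total) / (1 + c.visits),
    )


def search_root(children, playouts, min_visits, evaluate):
    """Run a fixed number of playouts from the root (one level only)."""
    for _ in range(playouts):
        child = select_child(children, min_visits)
        child.visits += 1
        child.value_sum += evaluate(child.move)
    return children


# Made-up position: five legal moves with invented priors and values.
PRIORS = {"a": 0.60, "b": 0.30, "c": 0.05, "d": 0.04, "e": 0.01}
VALUES = {"a": 0.55, "b": 0.50, "c": 0.30, "d": 0.60, "e": 0.20}


def make_children():
    return [Child(move, prior) for move, prior in PRIORS.items()]


if __name__ == "__main__":
    for min_visits in (0, 10):
        result = search_root(make_children(), 200, min_visits, VALUES.get)
        print(min_visits, {c.move: c.visits for c in result})
```

With `min_visits=10` every move is guaranteed at least ten visits once the playout budget covers them; after that the loop behaves exactly like the unforced search, which is why the extra cost shrinks as the total number of playouts grows.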
So first of all, it's quite pretty to watch in action! In Lizzie, I've changed the "min playout ratio for stats" parameter from 0.1 to 0.01. Here's a 20-second animation of the first 6,000 or so playouts.
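To show why that parameter matters here, a small sketch of how I understand the filter. This assumes a move is displayed when its playouts are at least the given ratio of the most-visited move's playouts; that reading of the setting is my assumption, and `visible_moves` and the visit counts are invented for the example.

```python
def visible_moves(visits, min_ratio):
    """Return moves whose visit count is at least min_ratio of the top move's.

    Assumed behaviour of the display threshold, not taken from Lizzie's source.
    """
    top = max(visits.values())
    return [move for move, count in visits.items() if count >= min_ratio * top]


# Invented visit counts: "c" has only its 10 forced visits.
early = {"a": 900, "b": 60, "c": 10}
later = {"a": 2000, "b": 60, "c": 10}

print(visible_moves(early, 0.1))
print(visible_moves(early, 0.01))
print(visible_moves(later, 0.01))
```

Under that assumption, a move that never gets more than its 10 forced visits stays on screen at ratio 0.01 only until the top move passes roughly 1,000 visits, which would fit the way the extra candidates fade out as the search runs on.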
Trying it out on some real positions, I found the results aren't all that dramatic. Often it will explore two or three other moves that would otherwise have been ignored -- but when you let it run a bit longer, the "extra" moves disappear, and you end up with the same answers you would have got anyway. I guess this shows that LZ's method of filtering out suboptimal moves is actually doing a pretty good job! So far I've got the most interesting results by watching Lizzie in real time and pausing just as the dust clears, so to speak (i.e. when the animation above changes from hundreds of candidates to just a handful). And slightly older networks seem to be more "open-minded" than the newer, stronger ones.
Some examples on a position I've been spending a bit of time with lately:
Last time I looked at this, most engines would pick two or three of 'a' through 'f' for analysis and would ignore everything else. With my new "LZ-minvisits", do we get any more variety?
- GX-47 unmodified explores a, b, c and e
- GX-47 + 10 visits for all moves explores all of a-f and one other option
- LZ-242 explores 7 different options
- LZ-242 + 10 explores 9 different options after 5,000 playouts. Using 30 forced visits instead of 10 actually narrows the results slightly (one option disappears)
- LZ-157 unmodified is already exploring 13 different moves!
- LZ-157 + 10 looks at 22 moves (and again drops a few options given more playouts)
Hmm, limited to three images per post by the looks of it. More to come soon.