That's me again!
I was thinking about something that might work, but would be a lot of work to implement:
Basically, it would consist of training a set of policy networks, one for each specific level of play (3k, 2k, 1k, 1d, 2d, 3d...).
<Edit> to be clear, I am not proposing to train a bot, only a policy network. Not something that can play Go: no playouts, no tree search, no Monte Carlo rollouts, no value network...</Edit>
A policy network, as I understand it, was developed by DeepMind for their first version of AlphaGo by showing it games of strong amateur players downloaded from the internet. This policy network was used to indicate, for a given game position, what moves a strong amateur would play. This reduced the number of moves AlphaGo had to evaluate (evaluation being done with the value network and Monte Carlo rollouts). Later they used AlphaGo vs AlphaGo games to further improve their policy network.
So, we could try to train one policy network on ~2k players' games, another on ~1k players' games, another on ~1d players' games, and so on.
Note that we don't really care what level a policy network is labelled with (1k, or 3d); we only need the labels to be in increasing order, and ideally regularly spaced in strength. We could classify them using Elo, or simply A, B, C...
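To make the idea concrete, here is a toy sketch of the "one policy per rank band" step. The `train_policy` function is only a stand-in: it builds a move-frequency table per position, whereas a real implementation would train a neural network on board tensors. The rank labels, games, and moves are all made up for illustration; only the ordering of the labels would matter.

```python
from collections import defaultdict

# Toy stand-in for "training a policy network" on one rank band: a table
# of move frequencies per position. A real version would train a neural
# network; this only illustrates bucketing games by rank label.
def train_policy(games):
    counts = defaultdict(lambda: defaultdict(int))
    for game in games:                    # game = list of (position, move)
        for pos, move in game:
            counts[pos][move] += 1
    # Normalise counts into a move distribution per position.
    return {pos: {m: c / sum(moves.values()) for m, c in moves.items()}
            for pos, moves in counts.items()}

# Hypothetical tiny corpus, keyed by rank label (labels are arbitrary).
games_by_rank = {
    "2k": [[("empty", "D4"), ("D4", "Q16")]],
    "1d": [[("empty", "Q16"), ("Q16", "D4")]],
}
policies = {rank: train_policy(gs) for rank, gs in games_by_rank.items()}
```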
With such a set of policy networks, we could evaluate how well a player's moves in a game correlate with each of our policy networks, and draw a chart. One would expect this chart to peak at the policy network closest to that player's level.
Then, by comparing those charts across different games, we could tell that in a particular game a player did not play at his usual level.
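The evaluation step above could be sketched like this: score one game against each rank's policy by the fraction of the player's moves that match that policy's top choice, then take the rank whose policy agrees most as the level estimate. The policies here are hand-made toy tables (position -> move distribution); in practice each would be a trained network, and a real agreement measure might use the full probability the network assigns to the played move rather than top-choice matching.

```python
# Fraction of the player's moves that are the policy's top choice.
def agreement(policy, game):
    hits = sum(1 for pos, move in game
               if pos in policy and max(policy[pos], key=policy[pos].get) == move)
    return hits / len(game)

# Score the game against every rank's policy; the peak is the estimate.
def estimate_level(policies, game):
    chart = {rank: agreement(p, game) for rank, p in policies.items()}
    return max(chart, key=chart.get), chart

# Hypothetical toy policies, one per rank band.
policies = {
    "2k": {"empty": {"D4": 0.6, "C3": 0.4}},
    "1d": {"empty": {"Q16": 0.7, "D4": 0.3}},
}
rank, chart = estimate_level(policies, [("empty", "Q16")])
```

Comparing the `chart` dictionaries from several games of the same player would then show whether one game is an outlier relative to his usual peak.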
The difficult part would be gathering enough games for training: games from players with a stable level, and with those games classified by level...
One way to do that could be to work with Go servers, more specifically with the players they use as rating anchors.
Now, they probably won't want to disclose publicly which players are used as anchors, but maybe this could be done under a non-disclosure agreement. Or maybe they could disclose the information once an anchor is removed; then we could download his games from the period during which he was an anchor.
Or maybe we could collaborate with Go servers to get statistics on which players have a very strong rating confidence.
Once we get enough games to train our policy networks, it also opens up all sorts of possibilities regarding the rating of players or their games (for instance, one could finally learn the equivalence of ranks across Go servers).