Tryss wrote:
Quote:
I am not trolling you. I got all along what you were saying about the smaller net not playing go but learning to predict human play. What I did not, and still do not, understand is what you were trying to say with that specific sentence. But when I asked about it specifically, you thought I did not understand you more generally. It has taken some time to finally focus on that one statement.

Simple: the goal is that the program avoids playing stupid/artificial mistakes, but still makes big/natural mistakes.
Missing 25 stones in atari, or the life and death of some 25-point group, is mostly the same for a bot, but one is a mistake you expect a 10k to make and the other isn't.
A move with a high "9-kyu policy" prior but a bad winrate is probably a natural error; the opposite probably isn't.
So the ability of the small network to distinguish between these two is not really the issue. It is that it will predict different plays under the two different circumstances, based not upon its skill at go, but upon what it thinks a presumably weak human would play. And we think that weak humans are more likely to make a mistake in a life-and-death situation, where there is only one correct play but many possible mistakes, than in a situation where there is only one possible mistake. That makes sense, even for weak bots, not just for weak humans.

Therefore, having seen many more mistakes by weak humans in the life-and-death situation, the net will predict a human mistake more often than it will in the simple situation. Again, this has nothing to do with the net learning to distinguish between the two situations.
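To make that sampling argument concrete, here is a toy sketch (every number and move name below is invented for illustration, not taken from any real net): the policy never classifies the position, but the probability mass it happens to put on losing moves differs, so the mistake rate you get by sampling from it differs too.

[code]
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weak-human policy in two positions (made-up probabilities).
life_and_death = {            # one correct move, many tempting failures
    "correct tesuji":  0.30,
    "wrong hane":      0.25,
    "wrong placement": 0.25,
    "slack defence":   0.20,
}
giant_atari = {               # one obvious rescue, one absurd blunder
    "connect":      0.98,
    "ignore atari": 0.02,
}
losing = {"wrong hane", "wrong placement", "slack defence", "ignore atari"}

def mistake_rate(policy, n=10_000):
    """Sample n moves from the policy and report how often they lose."""
    moves = list(policy)
    probs = np.array([policy[m] for m in moves])
    sample = rng.choice(moves, size=n, p=probs)
    return np.mean([m in losing for m in sample])

print("life-and-death mistake rate:", mistake_rate(life_and_death))  # ~0.70
print("giant-atari mistake rate:   ", mistake_rate(giant_atari))     # ~0.02
[/code]

Same sampling rule in both cases; the difference comes entirely from what the net learned to expect of weak humans in each kind of position.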
Edit: And, BTW, that assumption is not necessarily the case. Some versions of Crazy Stone, for instance, have been trained on weak human play and yet do not judge simple situations the way weak humans do, possibly because those situations do not arise often enough to have been learned. You don't have millions of weak human games to learn from.
But the proof is in the pudding. It's worth a try.
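For what it's worth, a minimal sketch of the selection rule Tryss describes above, assuming we already have a weak "9-kyu" policy net and a strong engine's winrate per move. The function names, thresholds, and dict-based interface are mine, not any engine's actual API:

[code]
import numpy as np

def pick_human_like_move(legal_moves, weak_policy, winrate,
                         blunder_margin=0.30, temperature=1.0, rng=None):
    """Sample a move from the weak-human policy, vetoing artificial blunders.

    legal_moves    : list of candidate moves
    weak_policy    : dict move -> prior from a net trained on weak human games
    winrate        : dict move -> winrate estimate from a strong engine
    blunder_margin : moves losing more than this versus the best move are vetoed
    """
    rng = rng or np.random.default_rng()
    best = max(winrate[m] for m in legal_moves)

    # Drop "stupid/artificial" blunders (e.g. ignoring a 25-stone atari).
    candidates = [m for m in legal_moves if best - winrate[m] <= blunder_margin]
    if not candidates:
        candidates = list(legal_moves)

    # Sample from the weak-human prior so big-but-natural errors still happen.
    p = np.array([weak_policy.get(m, 1e-9) for m in candidates]) ** (1.0 / temperature)
    p /= p.sum()
    return candidates[rng.choice(len(candidates), p=p)]
[/code]

That keeps the two knobs separate: the policy supplies the "what would a 9-kyu play" distribution, and the winrate filter only removes the moves no human of that level would ever consider.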
