Derived Metrics for the Game of Go
Posted: Mon Nov 09, 2020 9:01 am
This thread discusses the paper Derived Metrics for the Game of Go - Intrinsic Network Strength Assessment and Cheat-Detection by Attila Egri-Nagy and Antti Törmänen.
https://arxiv.org/pdf/2009.01606.pdf
So far I have read to section 3.1.
The visit count N(s,a) is defined as the number of times a variation starting with move a at position s is examined. Presumably this has been clarified elsewhere, but I wonder: is this the number of leaves below the (s,a) node, or the number of times the search algorithm walks through the (s,a) node?
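For what it's worth, in a standard AlphaZero-style MCTS loop the two readings coincide, because each walk through an edge expands and evaluates exactly one new leaf below it. A minimal sketch (the node structure and selection rule here are illustrative stand-ins, not taken from the paper):

```python
import random

class Node:
    """Minimal MCTS node; names are illustrative, not from the paper."""
    def __init__(self):
        self.children = {}   # move -> Node
        self.N = {}          # visit count N(s, a) per move a
        self.W = {}          # total leaf score accumulated through (s, a)

def simulate(root, legal_moves, leaf_score, depth=3):
    """One walk from the root: every traversed edge (s, a) gets N += 1.
    Since each walk adds exactly one new leaf evaluation, the count of
    walks through an edge equals the count of evaluated leaves below it
    in this simplified setting."""
    node, path = root, []
    for _ in range(depth):
        a = random.choice(legal_moves)        # stand-in for PUCT selection
        node.N[a] = node.N.get(a, 0) + 1
        node.W[a] = node.W.get(a, 0.0)
        path.append((node, a))
        node = node.children.setdefault(a, Node())
    v = leaf_score()                          # evaluate the new leaf
    for n, a in path:                         # backpropagate the score
        n.W[a] += v
    return v

random.seed(0)
root = Node()
for _ in range(100):
    simulate(root, ["A", "B", "C"], lambda: random.uniform(-1.0, 1.0))
assert sum(root.N.values()) == 100            # one root-edge visit per walk
```

If the engine instead re-uses cached leaf evaluations, the walk count can exceed the leaf count, which is why the question matters.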
Apparently, the scoremean is defined as the mean over all scores visited during the Monte-Carlo search (at the leaves, I suppose). When correct subsequent play approaches a leaf, the scoremean can converge to a strong human's score prediction. So far so good. In the general case, however, which includes many positions long before a leaf, there may be some stability in the values and game-tree-local convergence for strong AI play, but we do not know, for any specific position, by how much the scoremean and a strong human's score prediction differ. The scoremean does not equal a strong human's score prediction.
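Concretely, if the scoremean is maintained the way MCTS value estimates usually are, it is just the running average of the leaf scores seen so far, folded in one at a time. A sketch of that update, assuming (my assumption, not the paper's statement) the standard incremental-mean form:

```python
def update_scoremean(q, n, leaf_score):
    """Fold one new leaf score into a running mean.
    q: current scoremean after n leaf evaluations; returns updated (q, n).
    This is the standard MCTS running-average update; whether the paper's
    scoremean is maintained exactly this way is an assumption here."""
    n += 1
    q += (leaf_score - q) / n
    return q, n

# Toy leaf scores from four simulations
q, n = 0.0, 0
for s in [2.5, -1.0, 4.0, 0.5]:
    q, n = update_scoremean(q, n, s)
assert abs(q - 1.5) < 1e-9   # mean of the four leaf scores
```

Nothing in this update makes q track what a strong human would predict for the root position; it only tracks whatever the search happened to evaluate.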
The paper says: "Every move played in a game reduces the number of its future possibilites [sic]." Unless a superko or similar rule applies, this is just a conjecture, and it is disproven by this counter-example: White's two-eye formation fills the board; White fills one eye, Black passes, White fills the other eye, committing suicide (assuming suicide is legal under the rules). The resulting position has a greater number of future possibilities than the initial position. To get a theorem instead of a conjecture, some presuppositions need to be stated and a proof is required.
The effect of a move is defined as the difference of the scoremeans after and before it. The paper says that statistical information on the effects describes the playing skill of a player. No. It describes only a model of the player's skill, because the scoremean is only a model of correct positional judgement.
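The definition itself is simple arithmetic: the effect of move i is scoremean(after move i) minus scoremean(before move i). A sketch with hypothetical numbers (the scoremean values below are invented for illustration; a sign convention, here Black's perspective throughout, must be fixed):

```python
def move_effects(scoremeans):
    """Effect of move i = scoremean after move i minus scoremean before it.
    scoremeans[0] is the initial position; scoremeans[i] follows move i.
    All values are assumed to be from one fixed perspective (say Black's)."""
    return [after - before for before, after in zip(scoremeans, scoremeans[1:])]

# Hypothetical scoremeans over a four-move sequence
effects = move_effects([0.0, 0.5, -1.0, -0.5, -2.0])
assert effects == [0.5, -1.5, 0.5, -1.5]

black_effects = effects[0::2]   # moves 1 and 3 (Black)
white_effects = effects[1::2]   # moves 2 and 4 (White)
```

Any statistics computed over such effects inherit every limitation of the scoremean itself, which is the point of the objection above.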