Fede wrote:
How large is the error on the control group?
First, a word on the error:
For one training entry, the expected output is (1,0) for a game played by player A, or (0,1) for a game played by player B. And I defined the error simply as the average error over the two outputs. So if the network outputs (x,y) instead of (1,0), the error is (abs(1-x)+abs(0-y))/2
Then I defined the error on the training group or control group simply as the average error over all its entries.
Usually I would use the quadratic error for each training entry. But since one can expect a lot of duplicate entries for both player A and player B (all the games they start with hoshi in the top corner...), I don't want to give more weight than necessary to those entries. So: simple average.
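To make the definition concrete, here is a minimal sketch of that error computation in Python (the function names are my own, not from any actual code):

```python
def entry_error(output, target):
    """Mean absolute error over the two outputs of one entry.

    target is (1, 0) for a game by player A, (0, 1) for player B.
    """
    x, y = output
    tx, ty = target
    return (abs(tx - x) + abs(ty - y)) / 2

def group_error(outputs, targets):
    """Plain average of the per-entry errors over a whole group
    (training or control), with no extra weighting of duplicates."""
    errors = [entry_error(o, t) for o, t in zip(outputs, targets)]
    return sum(errors) / len(errors)

# Example: the network outputs (0.9, 0.2) for a player-A game,
# giving an error of (abs(1-0.9) + abs(0-0.2))/2, i.e. about 0.15
print(entry_error((0.9, 0.2), (1, 0)))
```

With this, the 3% / 43% figures below are just `group_error` evaluated on the training group and on the control group.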
So in my last run, the training group reached 3% average error, while the control group was at 43%.
Fede wrote:
What is the bottleneck at the time being? The games played by A or the games for which we need to establish whether the player was A or B?
If it's the former, I'm still searching new games to use to train the network for A, so the number of games available could increase. It would be nice to have an idea on how many would be needed.
I am not sure, in fact. I guess more training data will make it harder for the network to over-fit, but it might also make it impossible to converge.
I started a run with 2x400 training games and 2x400 control games. It's very slow on my computer, so I will let it run tonight. If it over-fits, then I don't know what to do. If it does not converge, I will try bigger network sizes (more layers).
If it converges but doesn't over-fit: bingo!
In any case, I will share the training data (or the way to generate it from SGF) so that specialists can have a try.
_________________
I am the author of
GoReviewPartner, a small software aimed at assisting reviewing a game of Go. Give it a try!