Life In 19x19
http://lifein19x19.com/

Establishing a player's identity
http://lifein19x19.com/viewtopic.php?f=10&t=15833
Page 1 of 3

Author:  Fede [ Tue Jun 19, 2018 2:41 am ]
Post subject:  Establishing a player's identity

I am the manager of Leagues C and D of the PGETC project. One of the teams in one of my leagues has been accused of cheating by player substitution. What I mean is that, according to the accusation, when player A was supposed to play, player B (who has a higher GoR) took player A's place and used player A's account to play.

I'm investigating the matter, trying to pursue different avenues. Here I will focus only on one, for which I need assistance.

I have gathered a few hundred 19x19 games played by B in the last few years. It's a long shot, but is there any way to use them to extrapolate B's style and check whether it matches the style of A's PGETC games? How should I proceed if I wanted to establish a "fingerprint" of a player's style?

For the time being I won't reveal any other information in order not to introduce bias.

Author:  Javaness2 [ Tue Jun 19, 2018 2:46 am ]
Post subject:  Re: Establishing a player's identity

Make it stop. Please. Make it stop.

Favourite joseki responses could be a way?

Author:  Fede [ Tue Jun 19, 2018 2:57 am ]
Post subject:  Re: Establishing a player's identity

Please, try to stick to the subject.
Sure, the tournament's format can be changed, but that is not part of this discussion.

We all know that in face to face tournaments this kind of trick is much harder to pull off [but not impossible]. The lower leagues of the PGETC are valuable especially for players who aren't able to play face to face with opponents of similar strength and I'd like to do everything I can to help the people who enjoy playing in it continue.

Therefore I'm asking for people to try to be constructive and to focus only on the question I'm asking.
Thank you.

Author:  Uberdude [ Tue Jun 19, 2018 3:03 am ]
Post subject:  Re: Establishing a player's identity

Maybe this automated go style site can help, it's designed to do pretty much what you want. http://gostyle.j2m.cz/webapp.html

Though I've no idea how good it is, one obvious test would be to split your hundreds of games for player B into several groups, analyse each group separately and see how similarly the style program scores them. Then compare to games of player A.

Edit: but if you only have the one game where player B is alleged to have masqueraded as A then that's probably too small for the program to extract a good style fingerprint, as it asks for ~40 games.
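
A rough sketch of that self-consistency test, in Python. It assumes you can turn each group of games into a numeric style vector somehow; the vectors, the group size of 40 and the function names are illustrative assumptions, not the actual output format of the style site.

```python
import random
from itertools import combinations
from math import dist


def split_into_groups(games, group_size=40, seed=0):
    """Shuffle the games and cut them into full groups of group_size.

    Leftover games that do not fill a whole group are dropped.
    """
    shuffled = list(games)
    random.Random(seed).shuffle(shuffled)
    last_start = len(shuffled) - group_size
    return [shuffled[i:i + group_size] for i in range(0, last_start + 1, group_size)]


def mean_pairwise_distance(vectors):
    """Average Euclidean distance between every pair of style vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two style vectors")
    return sum(dist(u, v) for u, v in pairs) / len(pairs)


def mean_distance_to(vectors, target):
    """Average Euclidean distance from each style vector to one target vector."""
    if not vectors:
        raise ValueError("need at least one style vector")
    return sum(dist(v, target) for v in vectors) / len(vectors)
```

If the distance from B's groups to the style vector of A's PGETC games is no larger than the typical distance among B's own groups, that is at least consistent with the accusation. If B's groups already differ a lot from one another, the tool cannot tell much either way.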

Author:  Bojanic [ Tue Jun 19, 2018 3:14 am ]
Post subject:  Re: Establishing a player's identity

Fede wrote:
I am the manager of Leagues C and D of the PGETC project. One of the teams in one of my leagues has been accused of cheating by player substitution. What I mean is that, according to the accusation, when player A was supposed to play, player B (who has a higher GoR) took player A's place and used player A's account to play.

Do you have IP addresses? Or some data from computer on which it was played? (software version, OS, processor, serial, etc...)
On forums those things are the main tool in such analysis.

PS just to note that games are transmitted live, and that player B can simply phone player A when necessary...

Author:  Fede [ Tue Jun 19, 2018 3:30 am ]
Post subject:  Re: Establishing a player's identity

Bojanic wrote:
Fede wrote:
I am the manager of Leagues C and D of the PGETC project. One of the teams in one of my leagues has been accused of cheating by player substitution. What I mean is that, according to the accusation, when player A was supposed to play, player B (who has a higher GoR) took player A's place and used player A's account to play.

Do you have IP addresses? Or some data from computer on which it was played? (software version, OS, processor, serial, etc...)
On forums those things are the main tool in such analysis.

PS just to note that games are transmitted live, and that player B can simply phone player A when necessary...

IP addresses: they are part of my investigation, but by themselves they do not prove much, because almost everyone has a dynamic IP address, which is temporary. My understanding is that, to connect an IP to a specific person, I would need access to the ISP's data, but they wouldn't share anything with me. I cannot get a court to order a suspected player's ISP to provide me the data that would connect an IP to the specific location from which the Internet was accessed.

The accuser(s) say that B is playing the whole game instead of A.


EDIT: Unless someone can help request and obtain data from an ISP, please ignore the IP part of the equation.

Author:  Bill Spight [ Tue Jun 19, 2018 3:50 am ]
Post subject:  Re: Establishing a player's identity

You only said that you had several game records for player B. Can you get the same or similar number of game records for player A? If so, you should be able to train a neural network to distinguish between the two. That is in general an easier task to learn than asking it to distinguish between, say, player B and a player who could be anybody. You give it a game record and ask whether Black or White, depending, is player A or player B? How well it can learn to do that, I don't know, but you will get an error rate saying how well it can distinguish between the two. Which is what you want.

With only a few hundred games upon which to learn, the error rate may be high, but I doubt if humans can do better.

Edit: Even with a high error rate for a single game, you may be able to get a low error rate for the whole set of games. For instance, if the network guesses that player B played 4 out of 5 games instead of player A, there is a good chance that player B played at least one of the games. Which is all you need for cheating.
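
To put a number on that, here is a small Python sketch. It assumes the network wrongly labels a genuine player A game as "B" with a fixed probability (0.3 here, an invented figure) and that the mistakes are independent from game to game. Under those assumptions, it computes how likely it is to see 4 or more "B" verdicts out of 5 if player A actually played every game.

```python
from math import comb


def prob_at_least(k, n, error_rate):
    """Probability of k or more wrong "B" verdicts in n independent games.

    error_rate is the assumed chance that one genuine A game is labelled B.
    """
    return sum(
        comb(n, i) * error_rate**i * (1 - error_rate) ** (n - i)
        for i in range(k, n + 1)
    )


# Assumed 30% single-game error rate, 4 of 5 games labelled B.
print(f"{prob_at_least(4, 5, 0.3):.4f}")  # prints 0.0308
```

So a classifier that is wrong on almost a third of single games would still produce that pattern by chance only about 3% of the time, provided the independence assumption is reasonable.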

Author:  Fede [ Tue Jun 19, 2018 4:17 am ]
Post subject:  Re: Establishing a player's identity

Marcel Grünauer wrote:
Because it seems to be difficult to separate the message from the (jocular) medium, I'll spell it out:

I believe that in light of pro-level-AI-free-for-everyone it will be impossible to prevent this kind of cheating. Not even webcams and screen mirroring will help because the player can just look at a different screen behind the webcam. All these "security measures" just react to existing possibilities - much like airport "security" - and just encourage other forms of cheating while making things more awkward for the honest majority. And cheating is much simpler from one's home than it is at a real-life tournament.

And because players study new moves and variations played by AI, their play will reflect that. So it's an exercise in futility.

Unless you trust players to have some "code of honor" (and you can't) and as long as one in a hundred players (e.g., in the case of the Pandanet tournament) is enough to skew the results, online tournaments simply cannot be trusted.

I understand this and I mostly agree. But my duty is to try to see whether I can get to the end of this. I think I owe it to the players to try my best.

If it wasn't 100% clear, here both A and B are definitely human. No AI is involved in any way.


Bill Spight wrote:
You only said that you had several game records for player B. Can you get the same or similar number of game records for player A?

Not as many as of today, there is at least an order of magnitude of difference between the 19x19 even games I have that were played by B and those played by A. Work in progress. Thanks for the idea, though.

Author:  pnprog [ Tue Jun 19, 2018 7:55 am ]
Post subject:  Re: Establishing a player's identity

Bill Spight wrote:
You only said that you had several game records for player B. Can you get the same or similar number of game records for player A? If so, you should be able to train a neural network to distinguish between the two. That is in general an easier task to learn than asking it to distinguish between, say, player B and a player who could be anybody. You give it a game record and ask whether Black or White, depending, is player A or player B? How well it can learn to do that, I don't know, but you will get an error rate saying how well it can distinguish between the two. Which is what you want.

With only a few hundred games upon which to learn, the error rate may be high, but I doubt if humans can do better.

Edit: Even with a high error rate for a single game, you may be able to get a low error rate for the whole set of games. For instance, if the network guesses that player B played 4 out of 5 games instead of player A, there is a good chance that player B played at least one of the games. Which is all you need for cheating.
This looks like a nice experiment!

It could be possible to use bots to generate plenty of games (gnugo, Leela, pachi...) and make a proof of concept to see if it can work first.

What would you consider for the network input?

Fede wrote:
EDIT: Unless someone can help request and obtain data from an ISP, please ignore the IP part of the equation.
You might get support from the Go server. If they have logs of the IPs used by players before/during/after the game, you might be able to find a correspondence between player B and that game. If player B connected to the server earlier that day with the same IP that was used during the game, this is a very strong indication.
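
If the server did share such logs, the check could look roughly like this Python sketch. The log format (account, IP, time) and the 12-hour window are assumptions made for illustration; a real server log will look different.

```python
from datetime import datetime, timedelta


def accounts_sharing_ip(log, game_account, game_time, window=timedelta(hours=12)):
    """List other accounts seen on the same IP as game_account near game_time.

    log is a hypothetical list of (account, ip, datetime) connection records.
    """
    def near(when):
        return abs(when - game_time) <= window

    # IPs used by the account that played the game, around the game time.
    game_ips = {ip for account, ip, when in log if account == game_account and near(when)}
    # Any other account that connected from one of those IPs in the same window.
    return sorted(
        {
            account
            for account, ip, when in log
            if account != game_account and ip in game_ips and near(when)
        }
    )
```

A match is only an indication: people in the same household, club or office network can share one public IP.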

Author:  Bill Spight [ Tue Jun 19, 2018 8:33 am ]
Post subject:  Re: Establishing a player's identity

pnprog wrote:
Bill Spight wrote:
You only said that you had several game records for player B. Can you get the same or similar number of game records for player A? If so, you should be able to train a neural network to distinguish between the two. That is in general an easier task to learn than asking it to distinguish between, say, player B and a player who could be anybody. You give it a game record and ask whether Black or White, depending, is player A or player B? How well it can learn to do that, I don't know, but you will get an error rate saying how well it can distinguish between the two. Which is what you want.

With only a few hundred games upon which to learn, the error rate may be high, but I doubt if humans can do better.

Edit: Even with a high error rate for a single game, you may be able to get a low error rate for the whole set of games. For instance, if the network guesses that player B played 4 out of 5 games instead of player A, there is a good chance that player B played at least one of the games. Which is all you need for cheating.
This looks like a nice experiment!

It could be possible to use bots to generate plenty of games (gnugo, Leela, pachi...) and make a proof of concept to see if it can work first.

What would you consider for the network input?


Even if there are not enough game records for player A available to make a good discrimination in this case, I think it would be good to develop a tool for telling two players apart. Aside from style, skill difference is important, as well. In any event, the tool should be developed on games other than those played by the players involved in this case. Maybe start with bots, then use the thousands of game records available on the internet.

Would something like this work? Choose Black or White to identify, and then after each move by that player, ask whether it is by Player One or Player Two. By the end of each game you should be able to get both a prediction and a confidence factor. Correct guesses are rewarded.

Early plays are probably more revealing than late plays. It might work to stop after move 150 (or resignation), or even earlier in the game.
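
One possible way to turn the per-move answers into a game-level prediction with a confidence factor, sketched in Python. It assumes a trained network already gives, for each move of the chosen player, a probability that Player One made it; the function name, the cutoff of 75 own moves (roughly move 150 of the game) and the equal prior are assumptions for illustration.

```python
import math


def game_level_probability(move_probs, max_player_moves=75, eps=1e-6):
    """Combine per-move probabilities for Player One into one game-level value.

    Sums the log-odds of each move, treating moves as independent and the
    two players as equally likely beforehand. Returns P(Player One).
    """
    log_odds = 0.0
    for p in move_probs[:max_player_moves]:
        p = min(max(p, eps), 1 - eps)  # keep the logarithm finite
        log_odds += math.log(p / (1 - p))
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1 / (1 + math.exp(-log_odds))
    z = math.exp(log_odds)
    return z / (1 + z)
```

A result above 0.5 is a guess for Player One, and the distance from 0.5 serves as the confidence. Moves within one game are not really independent, so the raw value will tend to be overconfident and should be checked against held-out games.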

Author:  pnprog [ Tue Jun 19, 2018 8:55 am ]
Post subject:  Re: Establishing a player's identity

Bill Spight wrote:
Would something like this work? Choose Black or White to identify, and then after each move by that player, ask whether it is by Player One or Player Two. By the end of each game you should be able to get both a prediction and a confidence factor. Correct guesses are rewarded.

Early plays are probably more revealing than late plays. It might work to stop after move 150 (or resignation), or even earlier in the game.
OK, so basically, one training entry per move (of that player) and per game, with maybe a limit at move 150.

Then, after the network is trained, one feeds it each move of the suspicious game. Something like this.

I will see if I can download games from the computer go server and make a trial this week.

Author:  Bill Spight [ Tue Jun 19, 2018 10:03 am ]
Post subject:  Re: Establishing a player's identity

pnprog wrote:
Bill Spight wrote:
Would something like this work? Choose Black or White to identify, and then after each move by that player, ask whether it is by Player One or Player Two. By the end of each game you should be able to get both a prediction and a confidence factor. Correct guesses are rewarded.

Early plays are probably more revealing than late plays. It might work to stop after move 150 (or resignation), or even earlier in the game.
OK, so basically, one training entry per move (of that player) and per game, with maybe a limit at move 150.

Then, after the network is trained, one feeds it each move of the suspicious game. Something like this.

I will see if I can download games from the computer go server and make a trial this week.


Thanks. :) This is definitely one verified method of cheating, and if you can develop a tool to help detect it, that would be great!

Author:  Fede [ Tue Jun 19, 2018 2:04 pm ]
Post subject:  Re: Establishing a player's identity

pnprog wrote:
Bill Spight wrote:
Would something like this work? Choose Black or White to identify, and then after each move by that player, ask whether it is by Player One or Player Two. By the end of each game you should be able to get both a prediction and a confidence factor. Correct guesses are rewarded.

Early plays are probably more revealing than late plays. It might work to stop after move 150 (or resignation), or even earlier in the game.
OK, so basically, one training entry per move (of that player) and per game, with maybe a limit at move 150.

Then, after the network is trained, one feeds it each move of the suspicious game. Something like this.

I will see if I can download games from the computer go server and make a trial this week.

If you want to obtain many games without downloading them from an active Go server (they may limit access if there are too many requests), there are a few databases that should be freely available: see section "Other Game Collections" of https://senseis.xmp.net/?GoDatabases. The NNGS and Online-Go collections could be good for this, since they have a huge number of games for many different ranks.

Also, if you need processing power, I have an extra (ordinary) Windows machine that is often off.

Author:  Bojanic [ Tue Jun 19, 2018 2:21 pm ]
Post subject:  Re: Establishing a player's identity

Federico,
Have you checked games for program usage?
It is much easier, only 1 player is involved, and it coincides with availability of programs.
If those games started in this year's league, then you have only a few programs to check.

Author:  Gobang [ Tue Jun 19, 2018 6:33 pm ]
Post subject:  Re: Establishing a player's identity

Fede wrote:
I am the manager of Leagues C and D of the PGETC project. One of the teams in one of my leagues has been accused of cheating by player substitution. What I mean is that, according to the accusation, when player A was supposed to play, player B (who has a higher GoR) took player A's place and used player A's account to play.

I'm investigating the matter, trying to pursue different avenues. Here I will focus only on one, for which I need assistance.

I have gathered a few hundred 19x19 games played by B in the last few years. It's a long shot, but is there any way to use them to extrapolate B's style and check whether it matches the style of A's PGETC games? How should I proceed if I wanted to establish a "fingerprint" of a player's style?

For the time being I won't reveal any other information in order not to introduce bias.



"I am the manager of Leagues C and D of the PGETC project." Sorry to hear that.

"How should I proceed if I wanted to establish a "fingerprint" of a player's style?" Have a piece of cake and a nice cup of coffee and think about whether or not this is worth wasting time and energy on.

What can you do? You can ask the player who is accused of cheating whether or not he or she cheated. If the player says no, then that is the end. You can prove within 90% plus probability that cheating did occur, but a recent example shows what will happen. You will not get anywhere. The more time and effort you expend into proving this the more time and effort will be put into refuting your proof. The more effort you put into this the more time you will waste.

Author:  pnprog [ Wed Jun 20, 2018 1:52 am ]
Post subject:  Re: Establishing a player's identity

Fede wrote:
If you want to obtain many games without downloading them from an active Go server (they may limit access if there are too many requests), there are a few databases that should be freely available: see section "Other Game Collections" of https://senseis.xmp.net/?GoDatabases. The NNGS and Online-Go collections could be good for this, since they have a huge number of games for many different ranks.
In fact, CGOS provides links to archived games, so no problem :)

So I downloaded the games from May 2018 (11201 games).

Then, I listed the bots that played the most games, here is the top 20 (together with the number of played games):
Code:
Aya793d_524_ro_2k   1805
Aya798c_F32cn15_5k   1747
Gnugo-3.7.10-a1   1672
Stop-0.9-005-19x19   1313
LZ_62b541_ELF_1600   1232
Maru-3.3.0p-0g   1198
LZ_b6337c69_p1600   1135
DCNN_AyaF128a523x1   1094
LZ_158603eb_1600   672
Maru-3.3.1-0g   611
myCtest-10k   512
Maru-3.2.1-0g   509
Rn.4.32-4c   469
Maru-3.3.2raw-0g   430
GnuGo_3.8_lv10   395
Emily_180511   324
RLO.0.2-4c1g   316
MGX-V14   290
RLO.0.2-16c1g   277
LZ-W5748   273

From there, I guess I will pick Maru-3.3.0p-0g and Gnugo-3.7.10-a1. I will remove the 119 games they have played together from the training set (see the ranking list).
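
For anyone who wants to repeat the counting and filtering step, a minimal Python sketch follows. It assumes each game is available as SGF text with the usual PB and PW properties; the function names are made up for this example, and the simple regular expression does not handle escaped brackets inside names.

```python
import re
from collections import Counter

BLACK_RE = re.compile(r"PB\[([^\]]*)\]")
WHITE_RE = re.compile(r"PW\[([^\]]*)\]")


def players(sgf_text):
    """Return (black_name, white_name) read from the PB and PW properties."""
    black = BLACK_RE.search(sgf_text)
    white = WHITE_RE.search(sgf_text)
    return (black.group(1) if black else None, white.group(1) if white else None)


def most_active(sgf_texts, top=20):
    """Count games per player name and return the top entries."""
    counts = Counter()
    for text in sgf_texts:
        for name in players(text):
            if name:
                counts[name] += 1
    return counts.most_common(top)


def without_head_to_head(sgf_texts, name_a, name_b):
    """Drop the games the two chosen players played against each other."""
    return [t for t in sgf_texts if set(players(t)) != {name_a, name_b}]
```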

Next step is to prepare the training data:
  • I guess I will "encode" a game position using 1 for black stones, -1 for white stones, and 0 for empty intersections? In that case, the neural network will essentially be asked to tell if player A or player B was actually playing in the game, regardless of the color he was playing.
  • Or I could choose 1 for player A or B stones, -1 for opponent stones, and 0 for empty intersections?
  • Any better suggestion? Maybe 1 for black stones but 2 if player A or B is Black, then -1 for white stones but -2 if player A or B is White?
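
The second option could be sketched in Python like this. It assumes the board position has already been reconstructed (captures included) by some SGF replayer and is given as rows of 'B', 'W' and '.' characters; that input format and the function name are assumptions for illustration.

```python
def encode_position(rows, target_color):
    """Encode a board from the point of view of the player to identify.

    rows: strings of 'B', 'W' and '.', one per board line.
    target_color: 'B' or 'W', the colour played by player A or B.
    Returns a flat list: 1 for that player's stones, -1 for the
    opponent's stones, 0 for empty intersections.
    """
    if target_color not in ("B", "W"):
        raise ValueError("target_color must be 'B' or 'W'")
    opponent = "W" if target_color == "B" else "B"
    value = {target_color: 1, opponent: -1, ".": 0}
    return [value[point] for row in rows for point in row]
```

Always passing "B" as target_color gives the first option instead (1 for black, -1 for white, whoever is playing).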

Fede wrote:
Also, if you need processing power, I have an extra (ordinary) Windows machine that is often off.
Let's see. I made some neural network learning code in python in the past. I will start with that because I am comfortable with it. I want to make a proof of concept first, to see if it could work. If the training is really too slow, then yes, I might have to use another computer :)

For information, how many games from player A and player B do you have? (excluding games between player A and B). I might try with those numbers.

Author:  Fede [ Wed Jun 20, 2018 2:22 am ]
Post subject:  Re: Establishing a player's identity

pnprog wrote:
For information, how many games from player A and player B do you have? (excluding games between player A and B). I might try with those numbers.

It depends on which subset of games you'd like to use. I would assume 19x19 only, no handicap? Or also ranked and no bots?

Let's say approx. 1,000 and 100.

Author:  Bill Spight [ Wed Jun 20, 2018 3:42 am ]
Post subject:  Re: Establishing a player's identity

A couple of points that may not need saying.

First, you need to save a fair number of games for testing purposes and not use them for training. That way you can get error estimates.

Second, it is generally desirable to use the same number of games for each player for training. Otherwise the network may be biased towards guessing the player whose games were used more. I don't know how much of a consideration that may be in this case, however.
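
Both points can be sketched in a few lines of Python. The 20% test fraction and the function name are assumptions. The split is done per game rather than per move, so that positions from one game never end up in both training and test data.

```python
import random


def balanced_split(games_a, games_b, test_fraction=0.2, seed=0):
    """Pick equal numbers of games per player, then hold some out for testing.

    Returns (train, test), each a list of (game, label) pairs.
    """
    rng = random.Random(seed)
    n = min(len(games_a), len(games_b))  # same number of games for each player
    chosen_a = rng.sample(list(games_a), n)
    chosen_b = rng.sample(list(games_b), n)
    n_test = max(1, int(n * test_fraction))
    test = [(g, "A") for g in chosen_a[:n_test]] + [(g, "B") for g in chosen_b[:n_test]]
    train = [(g, "A") for g in chosen_a[n_test:]] + [(g, "B") for g in chosen_b[n_test:]]
    rng.shuffle(train)
    return train, test
```

With about 1,000 games for one player and 100 for the other, this keeps 100 of each: 80 per player for training and 20 per player for the error estimate.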

Author:  Bojanic [ Wed Jun 20, 2018 4:43 am ]
Post subject:  Re: Establishing a player's identity

You are wasting time with such analysis.
In League A topic, if direct similarities between Leela and suspicious games were not good enough for some - and they were measured by the program itself, and the program plays very consistently - how do you think that you can compare two human games against each other? With human play varying wildly. And probably a lot of contaminated sources.

And even more surprising, some of the forum members who are not convinced about League A case, are optimistic about this one.

Author:  Fede [ Wed Jun 20, 2018 4:49 am ]
Post subject:  Re: Establishing a player's identity

Bojanic wrote:
You are wasting time with such analysis.
In League A topic, if direct similarities between Leela and suspicious games were not good enough for some - and they were measured by the program itself, and the program plays very consistently - how do you think that you can compare two human games against each other? With human play varying wildly. And probably a lot of contaminated sources.

And even more surprising, some of the forum members who are not convinced about League A case, are optimistic about this one.

Please, give us a chance.
I know, I haven't shared much about the case, so it's perfectly understandable to be pessimistic.

This is only one of the avenues that are being investigated. I think the two cases are very different. Give me until the EGC, then I will show my hand.

All times are UTC - 8 hours [ DST ]