AlesCieply wrote:
frmor wrote:
3) If you are not using a measurable quantity, but expert opinion, you should use more than one expert. You should let the experts analyze the games without them knowing which ones are from live games and which ones are from online games, and without them knowing whether Carlo played white or black. Moreover, you should also secretly give them some random live games by other players of similar level as a control group.
I cannot agree more with you on this. In fact, this is what I suggested when talking to some people: that Carlo's games should be sent for expert (high-level pro) review:
- 3 experts are found; each of them is provided with 3 sets of games, played by 3 players
- the players would be anonymized as Player A (Carlo in the PGETC), Player B (Carlo at regular tournaments), and Player C (any EGF pro player, perhaps one less known to the experts)
- the experts would be asked to estimate how strong the players are (just ordering them according to their strength would do) and whether they feel 2 of the sets were played by the same player

I would consider the experts' view as proof of cheating (or of innocence) if all of them agreed on both questions and thought that Player A was different from (or the same as) Player B.
(Emphasis mine.)
First, "proof" is a bit too strong; "evidence" is better.
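To put a rough number on that, here is a minimal sketch in Python (my own back-of-the-envelope figures, not anything from the proposal) of how often three experts would reach unanimity on the same-player question by guessing alone:

[code]
from itertools import product

# Sketch: if each of 3 experts answered "is Player A the same as
# Player B?" by coin flip, how often would all three agree?
# Assumed setup for illustration only.
n_experts = 3
verdicts = list(product(["same", "different"], repeat=n_experts))

p_unanimous = sum(len(set(v)) == 1 for v in verdicts) / len(verdicts)
p_all_different = (1 / 2) ** n_experts

print(f"P(all three agree, either verdict) = {p_unanimous:.3f}")     # 0.250
print(f"P(all three say 'different')       = {p_all_different:.3f}") # 0.125
[/code]

A one-in-eight chance of unanimity under pure guessing is suggestive when it happens, but it is nowhere near what one would want for proof.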
But all this, to me, is maddeningly indirect. It substitutes the question of whether Carlo A plays like Carlo B for the question of whether Carlo A cheated. OC, the questions are related, but they are not the same, and answering the first does not necessarily require expertise at go. Besides, the answer to the other question, about the level of play, is already indicated by the choice of games: we know that Carlo played better in that tournament than in his other games. The question before us is why.
To utilize the expertise of the judges, I would like them to examine the game records to look for evidence of cheating (or of its absence!). Let me give a couple of examples. I do not claim to be an expert, but I took a look at the fifty plays in question in the Metta-Reem game. It seemed to me that all but eleven were either obvious plays that a kyu player might well find, or parts of one-way streets: consistent sequences in which a play, even if not obvious on its own, only made sense given the earlier plays in the sequence. I did not judge whether any of those eleven plays were evidence of cheating, but I attempted to eliminate the other plays from consideration.
In chess, we have an example of play by a known cheater. See the link, https://www.chess.com/news/view/life-ti ... r-cheating , which sorin posted here; it is the first example of play in the article. The cheater, who had the option of simplifying an obviously won game by trading queens, a line of play which, human vs. human, would probably have led to a quick resignation, instead chose a lengthy combination in which he sacrificed three pawns but ended with a mate in three, at which point his opponent resigned. The evidence of cheating in that game is not statistical but behavioral. That is why I said, let the case be made, by Bojanic and/or others examining the game records. The case, it seems to me, would rest upon behavioral evidence, not statistical evidence.
But an important statistical question arises: can expert go players reliably evaluate game records for evidence of cheating? To answer it, we can run a test such as Javaness2 suggests, in which players cheat in some of the games. To make the test sensitive, and to some extent to simulate the situation where a player is already suspected of cheating, have half the games be ones with cheating and half be ones without, and let each expert divide the games into the two groups accordingly. For instance, you could have half the games played by 6 dans without assistance and half played by 4 dans cheating with Leela 11. (My suspicion is that, at this point in time, even go pros could easily fail such a test.)
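To make the scoring of such a test concrete, here is a minimal sketch in Python; the numbers (20 games, the sample scores) are hypothetical, and treating each call as an independent coin flip is a simplification if the experts are told the exact split:

[code]
from math import comb

def binomial_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of labeling at
    least k of n games correctly by guessing alone."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical test: 20 games, 10 with Leela-assisted play, 10 without.
# How surprising would various scores be if an expert were just guessing?
n_games = 20
for correct in (10, 13, 15, 17):
    print(f"{correct}/{n_games} correct: P(by chance) = "
          f"{binomial_tail(correct, n_games):.3f}")
[/code]

Under this simple null, an expert needs about 15 of 20 games right before chance becomes an implausible explanation, which gives a feel for how many games such a test would need in order to be informative.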