Life In 19x19
http://lifein19x19.com/

“Decision: case of using computer assistance in League A”
http://lifein19x19.com/viewtopic.php?f=10&t=15538
Page 34 of 36

Author:  Javaness2 [ Thu Jun 28, 2018 3:01 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

jlt wrote:
Personally I am not claiming that anyone is right or wrong, I am just waiting for some strong players to play the "4d vs 6d" game. If Bojanic would like to play the game, then there are three possible outcomes:

  • He guesses right most of the time, and never confuses a 4d with a 6d or vice-versa. This would be a strong argument in favor of the validity of his method of analysis.
  • He confuses a 6d with a 4d but never a 4d with a 6d. The test is not conclusive.
  • He confuses a 4d with a 6d. Then, either Bojanic's method is not accurate, or cheating has already occurred in the past as substitution of players (so maybe cheating in PGETC is much more widespread than we previously thought).

Whatever the outcome, I would find the conclusion interesting.


While the test is interesting, I don't think it looks at the right thing.
More correct would be to get somebody like Uberdude to play 5 games with an opponent. In 2 games he will cheat by using Leela. Can you find out in which games he did that? A 5/5 score must be obtained.
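
A rough sketch of why a perfect 5/5 would be demanded, assuming the judge is told that exactly 2 of the 5 games involved Leela and simply guesses which two. There are only 10 possible pairs, so a blind guess is completely right 1 time in 10. The function name is illustrative and not part of any proposal in this thread.

```python
from math import comb


def chance_of_perfect_guess(total_games: int, cheated_games: int) -> float:
    """Probability that picking `cheated_games` games uniformly at random
    selects exactly the games in which cheating took place."""
    return 1 / comb(total_games, cheated_games)


# 5 games, 2 of them played with engine help: C(5, 2) = 10 possible pairs.
print(chance_of_perfect_guess(5, 2))  # 0.1
```

If the judge were not told how many games were cheated, each game would be an independent yes/no call and the chance of a lucky perfect score would be lower still, but even then a single 5-game run is weak evidence on its own.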

Author:  AlesCieply [ Thu Jun 28, 2018 3:09 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

jlt wrote:
It would be nice if Bojanic could play the "4 dan or 6 dan ?" game. I wouldn't expect 100% accuracy, but it would be interesting to check if at least he never confuses a 4 dan with a 6 dan or vice-versa.


Bojanic's analysis is about establishing that Carlo Metta plays quite differently on internet and in his regular games. For the "significant moves" (chosen not to be part of forced sequences) Carlo is much more likely to play in agreement with Leela in his PGETC games than in his regular games. That's what Bojanic's analysis tells us. It has nothing to do with establishing whether he plays as 4d or 6d, though it is understood that Leela is quite a bit stronger than 4d. ;-)
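
To make the comparison described above concrete, here is a minimal sketch of one standard way to compare two agreement rates: a pooled two-proportion z-test. This is not necessarily how Bojanic's analysis was done, and the counts used in the example are made up purely for illustration.

```python
from math import erf, sqrt


def two_proportion_z(matches_a, moves_a, matches_b, moves_b):
    """Return (z, one-sided p-value) for the question
    'is the agreement rate in sample A higher than in sample B?'.

    matches_* = significant moves that agree with the engine's choice
    moves_*   = significant moves examined in that sample
    """
    rate_a = matches_a / moves_a
    rate_b = matches_b / moves_b
    # Pooled rate under the null hypothesis that both samples share one rate.
    pooled = (matches_a + matches_b) / (moves_a + moves_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / moves_a + 1 / moves_b))
    z = (rate_a - rate_b) / std_err
    # Upper tail of the standard normal distribution.
    p_value = 0.5 * (1 - erf(z / sqrt(2)))
    return z, p_value


# Hypothetical counts, NOT taken from the paper:
# sample A = online league games, sample B = regular over-the-board games.
z, p = two_proportion_z(40, 50, 25, 50)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

A test like this only says whether two rates differ by more than chance would explain. It says nothing about why they differ, and it treats every move as an independent trial, which is an optimistic simplification for moves from the same game.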

Author:  AlesCieply [ Thu Jun 28, 2018 3:13 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Javaness2 wrote:
While the test is interesting, I don't think it looks at the right thing.
More correct would be to get somebody like Uberdude to play 5 games with an opponent. In 2 games he will cheat by using Leela. Can you find out in which games he did that? A 5/5 score must be obtained.


Exactly! :bow:

Author:  Uberdude [ Thu Jun 28, 2018 3:30 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

jlt wrote:
I am not talking about kyu players here. Bojanic (5d) as well as other strong players say that it is easy to see from a game if a player is 4d or 6d. In addition, Bojanic can use computer tools to make more accurate analyses. Other people like Uberdude (4d) and Robert Jasiek (5d) think that it is not possible to judge from a single game or from a small number of games.

Actually, I think strong players can do a fairly good job in judging the strength of similar players from their moves (and I was actually impressed how well my wife did on the test, with caveats noted there), but indeed suspect they may not be quite as good as they think, particularly in the case of a 4d playing well or a 6d playing badly (note for my test I didn't look at the games before choosing them so have no idea if any of the 4ds played well or 6ds badly), and small samples are always problematic. I like evidence, and will happily update my views on its production. Lukan (7d) said he'd give it a go, I hope he does.

Gobang wrote:
(I had the nerve to make a critical comment about it and was slammed for my "negativity").

As the one who "slammed" Gobang, I'd call that more of a counter-throw than a slam of my initiation, as I said "I disagree, though I think your constant repetition of negativity is a waste of time." in response to "So all this is is a waste of time, just like 99% of the babble around the topic of detecting online cheats.". That was said out of frustration at his repeated calls to throw our hands in the air, give up on detecting or preventing cheating (at least attempting it in an imperfect way) and just cancel the entire PGETC league (plus demeaning the participants as "kids"). This is despite his admitting he is new to serious/tournament Go and the accounts from various people actually involved of how much they value the league (e.g. dsatkas in Greece, quantumf in SA, Simba and me in UK).

Gobang wrote:
It is also questionable to construct this test with online games where there is no way of verifying who was in fact playing.

As for the possibility of the actual player not being the named one, yes it is non-zero and in the ~4500 games played in the history of the league I think it likely some may be so, but the chance it happened in the 14 cases I picked is pretty slim, more so because they are from higher leagues, so who exactly would the replacement be? Most of the top players of the countries involved participate, so either it's: 1) another player on the team, who isn't playing at the same time, and they all collude to keep the cheating secret; 2) some secret strong player from their country not in the league or unknown to the Go community (are there many of these?); or 3) some pro or strong player in Asia. Also, by being from the same event they have the same time controls, seriousness etc. (and were easy for me to obtain). If someone else could collect game records from e.g. the EGC and we judge those too it would be interesting: maybe a 4d in a 1 hour PGETC game online is generally weaker / plays worse / is judged lower than a 4d at the 2.5 hour EGC? How about at the WAGC? Or the faster KPMC? Or your average 1 hour game from a 3-a-day McMahon (last game worse from tiredness perhaps?).

Java's proposed test is also an interesting one, I hope someone conducts it (but I'm rather busy atm, and won't be able to work on my extension of Ales's mistake analysis for some time; my hope is that will give typical profiles for 4 and 6 dans, be better at identifying them than humans, and by knowing their variance we can answer questions like how unlikely is it a 4d plays as well as a 6d, or did Dragos play particularly poorly in that game vs Carlo particularly well). A lot of consideration in this thread has been on false positives, false negatives are also important to test for (but perhaps less so if we consider punishing an innocent worse than not punishing a guilty).

Edit: As Ales said, the 4 or 6 dan test isn't really relevant to Bojanic's analysis, it was prompted by the "I'm a strong player and I looked at the game and there's no way that's a 4 dan (even on a good day)" type arguments (which has a side premise of "and we don't think Carlo is really a 6 dan based on the WAGC").

Edit 2:
Gobang wrote:
For this 6d or 4d test to make any sense, then it should be done in the context that it was created. A 6d player played an entire serious game with someone who is allegedly 4d, (but most probably just acting a bot for Leela). The 6d player said that it felt nothing like playing against a 4d.

Then someone decided to construct a "test", apparently for the purpose of showing that this 6d may not be a reliable judge of whether his opponent was 4d or stronger. My perception is that someone, with the intention of calling the 6d player's judgement into doubt created this "test".

My prompt wasn't just the Simba vs Carlo game, but also using the views of strong players looking at other past games of Carlo to decide whether he cheated instead of statistical Leela similarity approaches (e.g. suggested by Lukan). Gobang makes the distinction between the actual person playing the game (Simba most recently, we've not heard from Reem\Dragos\Kulkov etc) vs an observer and that they will be better at detecting the opponent's strength or if they cheated. I agree playing is different to watching, but it's not clear to me the player is a better judge: in Javaness's test we could ask the opponent of the sometimes-cheater as well as observers to identify the cheating games. But also if we are trying to distinguish "4d cheating with Leela beating 6d" and "4d not cheating getting that expected 1 in 10 win against a 6d" we'd need lots more than 5 games. Also I would like to cheekily point out the EGF rank of the 6d mentioned is 3d (though I believe he is at least 5d).
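
As a rough illustration of why 5 games is too few, the sketch below takes the "1 in 10" figure above at face value and assumes every game is independent with the same winning chance (both simplifications). It computes how often an honest 4d would reach a given number of wins against a 6d.

```python
from math import comb


def prob_at_least(wins: int, games: int, p_win: float) -> float:
    """Binomial tail: probability of `wins` or more wins in `games`
    independent games, each won with probability `p_win`."""
    return sum(
        comb(games, k) * p_win**k * (1 - p_win) ** (games - k)
        for k in range(wins, games + 1)
    )


# Honest 4d with an assumed 10% winning chance in each game against a 6d.
print(f"at least 1 win in 5 games: {prob_at_least(1, 5, 0.1):.3f}")
print(f"at least 3 wins in 5 games: {prob_at_least(3, 5, 0.1):.4f}")
```

Under these assumptions an honest player wins at least one of five such games about 41% of the time, so a single upset tells us very little. Only a run of wins, or a much longer series, starts to separate the two explanations.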

Edit 3:
Gobang wrote:
Getting kyu players to decide if a 6d or 4d was playing, just by looking at the games is obviously absurd.

I'd say fun but irrelevant:
Uberdude wrote:
What threshold should we take as demonstrating the truth of "It is easy to tell the difference between a 4 dan and 6 dan"? (for sufficiently strong players, weaker players might like to play this game for fun/interest but them being bad at it doesn't show 6 or 7ds couldn't be good at it, I hope some strong players participate).

Author:  dfan [ Thu Jun 28, 2018 5:18 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Javaness2 wrote:
More correct would be to get somebody like Uberdude to play 5 games with an opponent. In 2 games he will cheat by using Leela. Can you find out in which games he did that? A 5/5 score must be obtained.

This would tell us something if the subject failed, but I'm not sure if it would tell us anything if the subject succeeded (since one explanation would just be "Uberdude is bad at cheating").

By the way, chess grandmasters have made "my opponent played too strongly for his rating, therefore he was cheating" accusations that have been determined to be incorrect before. Of course, chess is not go.

Author:  Javaness2 [ Thu Jun 28, 2018 6:18 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Yes, I guess we already discussed somewhere here that we have this problem of people genuinely believing somebody has cheated when they haven't cheated. It happens.

In this case, 5 games would probably be a small sample. I suggested it only as a starting point. Probably a 6 player round robin with 1 player assigned as cheater each round would be more interesting. Who is going to have the time for that though? Nobody...
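
For anyone who did have the time, a minimal sketch of how such an event could be scheduled: the standard circle method gives 5 rounds of 3 games for 6 players, and one designated cheater is drawn per round. The player labels, the function names and the choice that nobody is the cheater twice are all assumptions made for this sketch.

```python
import random


def round_robin(players):
    """Circle-method schedule: a list of rounds, each a list of (a, b) pairings."""
    n = len(players)
    if n % 2:
        raise ValueError("this sketch assumes an even number of players")
    order = list(players)
    rounds = []
    for _ in range(n - 1):
        # Pair the first seat with the last, the second with the second-last, etc.
        rounds.append([(order[i], order[n - 1 - i]) for i in range(n // 2)])
        # Keep the first player fixed and rotate everyone else by one seat.
        order = [order[0], order[-1]] + order[1:-1]
    return rounds


def assign_cheaters(players, rounds, seed=None):
    """Draw one designated cheater per round, never the same player twice."""
    rng = random.Random(seed)
    return rng.sample(list(players), len(rounds))


players = ["P1", "P2", "P3", "P4", "P5", "P6"]  # hypothetical labels
schedule = round_robin(players)
cheaters = assign_cheaters(players, schedule, seed=2018)
for number, (games, cheater) in enumerate(zip(schedule, cheaters), start=1):
    print(f"Round {number}: {games} | designated cheater: {cheater}")
```

The list of designated cheaters would of course be kept from whoever is judging the games until their guesses are in.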

Author:  Bill Spight [ Thu Jun 28, 2018 6:45 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Uberdude wrote:
Gobang wrote:
For this 6d or 4d test to make any sense, then it should be done in the context that it was created. A 6d player played an entire serious game with someone who is allegedly 4d, (but most probably just acting a bot for Leela). The 6d player said that it felt nothing like playing against a 4d.

Then someone decided to construct a "test", apparently for the purpose of showing that this 6d may not be a reliable judge of whether his opponent was 4d or stronger. My perception is that someone, with the intention of calling the 6d player's judgement into doubt created this "test".

My prompt wasn't just the Simba vs Carlo game, but also using the views of strong players looking at other past games of Carlo to decide whether he cheated instead of statistical Leela similarity approaches (e.g. suggested by Lukan). Gobang makes the distinction between the actual person playing the game (Simba most recently, we've not heard from Reem\Dragos\Kulkov etc) vs an observer and that they will be better at detecting the opponent's strength or if they cheated. I agree playing is different to watching, but it's not clear to me the player is a better judge: in Javaness's test we could ask the opponent of the sometimes-cheater as well as observers to identify the cheating games. But also if we are trying to distinguish "4d cheating with Leela beating 6d" and "4d not cheating getting that expected 1 in 10 win against a 6d" we'd need lots more than 5 games. Also I would like to cheekily point out the EGF rank of the 6d mentioned is 3d (though I believe he is at least 5d).


I am not very good at judging the strength of another player, either from observing them or playing them. That in part has to do with my psychology, in two ways. First, I don't really care. Second, I tend to raise the level of my game to meet a challenge.

That said, I do think that it is easier to tell any difference while playing a game. Perhaps it has to do with really getting into the game and understanding the play as well as you can, perhaps it has to do with the time involved. In Uberdude's test I am not going to take an hour or two for each game to make the effort.

But looking back on the few games in which I have felt outgunned, in each game, aside from that feeling, I could point to at most a few plays, usually only one, that gave me that feeling. So just because someone has a feeling that their opponent played much better than expected, I also want to know which plays gave rise to that feeling. (I know that that is not always possible. Sometimes you suddenly realize that you are losing and you don't know why. ;))

Author:  Bill Spight [ Thu Jun 28, 2018 6:48 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

dfan wrote:
By the way, chess grandmasters have made "my opponent played too strongly for his rating, therefore he was cheating" accusations that have been determined to be incorrect before. Of course, chess is not go.


It seems to me that at present it is easier to detect cheating at chess than go. Give us ten years, though. :)

Author:  Bill Spight [ Thu Jun 28, 2018 6:56 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

dfan wrote:
Javaness2 wrote:
More correct would be to get somebody like Uberdude to play 5 games with an opponent. In 2 games he will cheat by using Leela. Can you find out in which games he did that? A 5/5 score must be obtained.

This would tell us something if the subject failed, but I'm not sure if it would tell us anything if the subject succeeded (since one explanation would just be "Uberdude is bad at cheating").


If we are going to make progress at detecting cheating, we need to have verified cases of cheating for our research. The only way to get a large number of verified cases is to have games in which people cheat on purpose for the sake of the research. (OC, the other players must be in on the fact that their opponent may be cheating.)

Author:  Uberdude [ Thu Jun 28, 2018 7:40 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Bill Spight wrote:
That said, I do think that it is easier to tell any difference while playing a game. Perhaps it has to do with really getting into the game and understanding the play as well as you can, perhaps it has to do with the time involved. In Uberdude's test I am not going to take an hour or two for each game to make the effort.

That's a good point about the players getting into the game, they have to take responsibility for their moves so when the casual observer says X should have played so-and-so maybe X did consider that but found a refutation for their opponent further down the line the kibitzer didn't. I remember in my British title match with dhu last year Matthew Macfadyen 6d was commentating and said one of us (let's say me) made a mistake and should have played some move. After the game we reviewed his comments and dhu disagreed as he had a stronger reply which meant the proposed better sequence didn't actually work. He had read this in the game. I hadn't read that far but had come to the same conclusion it was not a promising line for me (good or lucky pruning?). If the judges spent as long as the players (2 to 3 hours) on the game would they read equally thoroughly? If I was to do the test I'd probably spend about 15 minutes per game. But on the flip side, non-players can be more objective and emotionally detached. I often make dumb irrational decisions during the heat of the game, particularly in overtime, which on review afterwards I can easily see were silly.

Author:  Bojanic [ Thu Jun 28, 2018 9:47 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Fenring wrote:
Thanks for the analysis Bogdan.
But maybe it would be better to follow the same process with other European players to have a comparison point?

My name is Milos Bojanic, not Bogdan, and I already explained here as well as in the paper update: in the preliminary analysis I checked all games from the A league and qualifications. Some ten games looked suspicious in the deviations histogram.
After more detailed analysis, some were dismissed, and the two with the most similarities are presented here.

Author:  Bojanic [ Thu Jun 28, 2018 9:54 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

jlt wrote:
It would be nice if Bojanic could play the "4 dan or 6 dan ?" game. I wouldn't expect 100% accuracy, but it would be interesting to check if at least he never confuses a 4 dan with a 6 dan or vice-versa.

For what purposes, except for derailment of this research?
We are not discussing the 4d or 6d difference at all, but the difference from a program that plays very consistently. And that analysis is not performed by me, but by the same program.

Author:  bugsti [ Thu Jun 28, 2018 9:58 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Bojanic wrote:
My name is Milos Bojanic, not Bogdan, and I already explained here as well as in the paper update: in the preliminary analysis I checked all games from the A league and qualifications. Some ten games looked suspicious in the deviations histogram.
After more detailed analysis, some were dismissed, and the two with the most similarities are presented here.


How much time did you spend analyzing each move? How many nodes? I think we need at least 200k nodes per move in order to obtain a reliable deviation histogram, and there will still be many sources of error (some good moves are found after that limit). That requires something like 45 days of calculation on good hardware, or 1 year on a normal PC. :scratch: :scratch:

You also need to produce this histogram for every available strong bot; there are about a dozen right now :shock: :shock:

Author:  Bill Spight [ Thu Jun 28, 2018 10:48 am ]
Post subject:  Re: “Decision: case of using computer assistance in League A

bugsti wrote:
Bojanic wrote:
My name is Milos Bojanic, not Bogdan, and I already explained here as well as in the paper update: in the preliminary analysis I checked all games from the A league and qualifications. Some ten games looked suspicious in the deviations histogram.
After more detailed analysis, some were dismissed, and the two with the most similarities are presented here.


How much time did you spend analyzing each move? How many nodes? I think we need at least 200k nodes per move in order to obtain a reliable deviation histogram, and there will still be many sources of error (some good moves are found after that limit). That requires something like 45 days of calculation on good hardware, or 1 year on a normal PC. :scratch: :scratch:

You also need to produce this histogram for every available strong bot; there are about a dozen right now :shock: :shock:


IMO, our current bots are not good enough. They may play at a superhuman level, despite making blunders, but they were optimized for play, not for evaluation and analysis. (Even though they make use of evaluation. That's not all it takes to play well at the time limits in use. Their evaluation just has to be good enough to play well. :))

Author:  Bojanic [ Thu Jun 28, 2018 12:23 pm ]
Post subject:  Re: “Decision: case of using computer assistance in League A

bugsti wrote:
How much time did you spend analyzing each move? How many nodes? I think we need at least 200k nodes per move in order to obtain a reliable deviation histogram, and there will still be many sources of error (some good moves are found after that limit). That requires something like 45 days of calculation on good hardware, or 1 year on a normal PC. :scratch: :scratch:

You also need to produce this histogram for every available strong bot; there are about a dozen right now :shock: :shock:

Ah, you just came up with 12 Herculean tasks of AI go.
Good idea how to stop investigation, too bad it does not work.

Deviation histograms were actually pretty similar for the quick analysis and for 50k and 200k.
The quick analysis is a preliminary screen, and it serves only to select games for further analysis. Games with similar deviations were then analyzed in greater detail, especially tenuki moves.

There is no need to analyze all moves in all games at 1m variations in all programs, including ones that did not even exist then.

Author:  bugsti [ Thu Jun 28, 2018 1:02 pm ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Bojanic wrote:
Ah, you just came up with 12 Herculean tasks of AI go.
Good idea how to stop investigation, too bad it does not work.


I heard from one of the top teams that the final ranking is approved. They sent invitations to the EGC for the PGETC finals to the top 4 teams. I guess the case is over for EGF officials.

Author:  Javaness2 [ Thu Jun 28, 2018 1:23 pm ]
Post subject:  Re: “Decision: case of using computer assistance in League A

bugsti wrote:
Bojanic wrote:
Ah, you just came up with 12 Herculean tasks of AI go.
Good idea how to stop investigation, too bad it does not work.


I heard from one of the top teams that the final ranking is approved. They sent invitations to the EGC for the PGETC finals to the top 4 teams. I guess the case is over for EGF officials.


No. This would not preclude an appeal

Author:  bugsti [ Thu Jun 28, 2018 1:38 pm ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Javaness2 wrote:

No. This would not preclude an appeal


But I understood that a penalty or disqualification would swap the rankings of Poland and Romania. So how can the top 4 teams play under an open ruling? Am I missing something here?

Author:  HermanHiddema [ Thu Jun 28, 2018 1:55 pm ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Javaness2 wrote:
bugsti wrote:
Bojanic wrote:
Ah, you just came up with 12 Herculean tasks of AI go.
Good idea how to stop investigation, too bad it does not work.


I heard from one of the top teams that the final ranking is approved. They sent invitations to the EGC for the PGETC finals to the top 4 teams. I guess the case is over for EGF officials.


No. This would not preclude an appeal


The PGETC rules on the site have the following to say about appeals, in section 3.3:

PGETC rules wrote:
In case of differences during or after a game, first of all the captains should try to solve the problem. If no solution is found the responsible league manager should be consulted who will decide. Against this decision it is possible for a team captain to escalate the case by involving the appeals commission. The decision of the appeals commission is final


The EGF General Tournament Rules have the following to say about appeals, in section 7.1:

EGF Tournament rules wrote:
The arbitration procedure used to resolve disputes has three levels of operation: the referee, the appeals committee, and the EGF rules commission. A player with a dispute refers the matter to the referee in the first instance. The dispute may then be referred to the next level up if either player is not satisfied with the judgement or its reasoning. The next level may reject to resume a case if it considers the preceding instance's judgement and reasoning obviously right and just.


It then goes on to say in section 7.5

EGF Tournament rules wrote:
If a player in dispute disagrees with the decision of the appeals committee, the matter must be referred to the EGF rules commission for consideration after the end of the tournament. If the dispute affects titles or prizes, the tournament director cannot declare winners or present prizes until the EGF rules commission has given a final judgement on the matter.


Also, importantly, the EGF tournament rules specify the following in section 1.1

EGF Tournament rules wrote:
These are the general tournament rules of the European Go Federation (EGF) and are used in the tournaments of the EGF. The following rulesets apply:
  1. These General Tournament Rules.
  2. The Tournament System Rules of the EGF.
  3. The event's own Particular Tournament Rules specifying details or variations to the General Tournament Rules.


Point 3 seems relevant here. The event's own rules specify that the decision of the appeals committee is final. If we interpret that as a variation to the EGF general tournament rules, then there is no further avenue for appeal. If we think the EGF rules cannot be altered on this issue and keep superseding the PGETC rules, then there is a possible appeal to the rules committee, and no winners can be declared, nor prizes awarded. (This appeal would have to come from the Israeli team, BTW. Neither ruleset allows outsiders to appeal AFAICS.)

There seems to be no specific stipulation of how much time is allowed for an appeal (mostly, I think, because these rules are very much written for over-the-board tournaments, where there's a next round coming up and everything needs to happen a.s.a.p.).

Author:  Gobang [ Thu Jun 28, 2018 2:54 pm ]
Post subject:  Re: “Decision: case of using computer assistance in League A

Many thanks to those who took the time to read and reply to my post.

I hope the discussion continues in a constructive way without derailments.

All times are UTC - 8 hours [ DST ]