 Author: pnprog [ Sun Mar 25, 2018 9:21 pm ] Post subject: Re: “Decision: case of using computer assistance in League A
Jonas Egeberg wrote: As the manager of League A in PGETC I have been in charge of dealing with this matter. I of course had help from other strong, non-biased players in analyzing the games etc. For those asking, what we did is that we checked several of his offline games from recent tournaments, and we also verified with his opponents that they were the actual games played. We checked moves 50-150 and noted the moves as similar if they were within Leela's top 3 moves, and no further than 5% away from its top move. In those games the similarity was 70-80%. We then went back to the PGETC games and checked the game against Israel. In that game we found that the similarity was 98%, where the only move that was different was Leela's move number 4, but still within 1% winrate of its top move.
I am not that good at statistics, but if we consider that a player, on average, plays offline 70-80% of moves that are similar to Leela's (let's take 80%), then what is the probability that he plays 49 or 50 similar moves out of 50 online? In my simulation, the probability is below 0.02%.
Of course, one has to check carefully that the measurement is consistent in both cases and backed by enough data for the offline measure. We also make the assumption that offline performance translates to online performance, that it is consistent across different opponents, and so on.
Besides, this is only mildly related to the topic, but some time ago I stumbled upon this article https://www.chess.com/article/view/better-than-ratings-chess-com-s-new-caps-system#comments that considers a replacement for Elo. In my understanding, it measures the similarity of play between chess players and chess bots as a measure of the player's strength. I was considering adding that sort of calculation to GoReviewPartner, but just for fun.
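The "below 0.02%" figure can also be checked exactly rather than by simulation. What follows is only a minimal sketch, assuming each of the 50 moves is an independent trial with a fixed match probability of 0.8, which is precisely the simplification the post warns about. The function name `binom_tail` is illustrative and not taken from any tool mentioned in the thread.

```python
from math import comb

def binom_tail(n: int, k_min: int, p: float) -> float:
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Assumption: 50 independent moves, each matching Leela with probability 0.8.
prob = binom_tail(50, 49, 0.8)
print(f"P(49 or 50 matches out of 50) = {prob:.4%}")
```

Under these assumptions the result comes out just under 0.02%, in line with the simulated value. The independence assumption is the weak point: moves in a forced sequence are strongly correlated, so the true tail may be heavier.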

 Author: RobertJasiek [ Sun Mar 25, 2018 10:57 pm ] Post subject: Re: “Decision: case of using computer assistance in League A Humans can learn from programs and apply what they have learnt. The result can be very similar games. Therefore, probability is not evidence. What we do need in important online games is referees locally supervising the players to prevent almost all cheating of all forms.

 Author: Javaness2 [ Sun Mar 25, 2018 11:06 pm ] Post subject: Re: “Decision: case of using computer assistance in League A The game in question has 166 moves. The allegation is presumably that after move 4 black starts using Leela; in other words, he turned on Leela after opting for a double 3-3 fuseki. After that, every single move chosen is a 'top' Leela move. The game doesn't look that remarkable to me. The mathematics behind the decision has to be the subject of some debate, especially in the case of an appeal, if online games are still to be taken into account. Nobody has answered my question yet though: is there any appeal of the decision?

 Author: Uberdude [ Mon Mar 26, 2018 1:01 am ] Post subject: Re: “Decision: case of using computer assistance in League A
On the issue of the (mis)use of statistics in (in)justice:
- https://en.wikipedia.org/wiki/Roy_Meadow
- https://en.wikipedia.org/wiki/Lucia_de_Berk
- https://en.wikipedia.org/wiki/Birmingham_Six (many other problems here, but part of the evidence was an expert witness saying he was 99% sure a chemical test proved the accused had been handling explosives; it turned out that handling playing cards could produce a false positive, because their coating is similar to nitroglycerine, and they had indeed been playing cards)

 Author: Javaness2 [ Mon Mar 26, 2018 1:28 am ] Post subject: Re: “Decision: case of using computer assistance in League A
Uberdude wrote: On the issue of the (mis)use of statistics in justice:
- https://en.wikipedia.org/wiki/Roy_Meadow
- https://en.wikipedia.org/wiki/Lucia_de_Berk
- https://en.wikipedia.org/wiki/Birmingham_Six
I wasn't aware that the Birmingham Six was an example of misuse of statistics; wasn't it mostly the simple fabrication of evidence? Mind you, the TOS probably forbids me discussing that here, so here is the game from November instead.

 Author: aitkensam [ Mon Mar 26, 2018 2:14 am ] Post subject: Re: “Decision: case of using computer assistance in League A Is there a reason why the other games played in this year's league did not form part of the investigation? They would presumably provide useful data for determining whether the game against Israel was out of the ordinary for this player. Or whether all online games show a different pattern to all offline games etc.

 Author: zermelo [ Mon Mar 26, 2018 2:15 am ] Post subject: Re: “Decision: case of using computer assistance in League A Some people here have mentioned 98% out of 100 moves, but I suppose it is actually 50 moves by the player, within moves 50-150 of the game. I think it is very important to know why exactly moves 50-150 were studied. Was this decided beforehand? Why are numbers not given for the whole game?

 Author: goer [ Mon Mar 26, 2018 2:18 am ] Post subject: Re: “Decision: case of using computer assistance in League A This guy Metta is the chief referee of the coming EGF Congress in Pisa. What happens then, will the EGF allow him to referee?

 Author: goer [ Mon Mar 26, 2018 2:19 am ] Post subject: Re: “Decision: case of using computer assistance in League A
aitkensam wrote: Is there a reason why the other games played in this year's league did not form part of the investigation? They would presumably provide useful data for determining whether the game against Israel was out of the ordinary for this player. Or whether all online games show a different pattern to all offline games etc.
That's a good point.

 Author: Javaness2 [ Mon Mar 26, 2018 2:22 am ] Post subject: Re: “Decision: case of using computer assistance in League A This is what they did:
Quote: For those asking, what we did is that we checked several of his offline games from recent tournaments, and we also verified with his opponents that they were the actual games played. We checked moves 50-150 and noted the moves as similar if they were within Leela's top 3 moves, and no further than 5% away from its top move. In those games the similarity was 70-80%. We then went back to the PGETC games and checked the game against Israel. In that game we found that the similarity was 98%, where the only move that was different was Leela's move number 4, but still within 1% winrate of its top move.
Leela move 4: does that correspond to move 7 of the game, or does it mean move 57? I guess we will see a technical report soon to clear up this confusion. In any EGF event you have 3 stages of appeal: #1. Appeal to the Referee, #2. Appeal to the Tournament Appeals Committee, #3. Appeal to the EGF version of #2. Given that this happened in November and has only become known now, which steps were taken?

 Author: RobertJasiek [ Mon Mar 26, 2018 2:52 am ] Post subject: Re: “Decision: case of using computer assistance in League A The game has almost only ordinary moves. This alone makes it possible that two different players / programs find mostly the same moves, especially if they have a similar playing style. That the player is said to have studied with the program for 2 years makes it all the more likely that the matching moves are neither coincidence nor cheating but a direct consequence of adopting a playing style and "knowledge" / "experience" from training with the program during the years before the game. I do hope very sincerely that the judgement is overturned by higher instances.

 Author: HermanHiddema [ Mon Mar 26, 2018 2:54 am ] Post subject: Re: “Decision: case of using computer assistance in League A
Javaness2 wrote: This is what they did:
Quote: For those asking, what we did is that we checked several of his offline games from recent tournaments, and we also verified with his opponents that they were the actual games played. We checked moves 50-150 and noted the moves as similar if they were within Leela's top 3 moves, and no further than 5% away from its top move. In those games the similarity was 70-80%. We then went back to the PGETC games and checked the game against Israel. In that game we found that the similarity was 98%, where the only move that was different was Leela's move number 4, but still within 1% winrate of its top move.
Leela move 4: does that correspond to move 7 of the game, or does it mean move 57? I guess we will see a technical report soon to clear up this confusion. In any EGF event you have 3 stages of appeal: #1. Appeal to the Referee, #2. Appeal to the Tournament Appeals Committee, #3. Appeal to the EGF version of #2. Given that this happened in November and has only become known now, which steps were taken?
Leela move 4 means the 4th best move in that position according to Leela's analysis. Of the 50 moves considered, 49 were in the top 3 best moves according to Leela's analysis (with the additional constraint that the move should not be more than 5% worse than Leela's top choice). This is where the 98% number comes from: it is 49/50. The only move out of 50 not in Leela's top 3 was Leela's 4th choice in that position, and was less than 1% worse than Leela's top choice.
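The rule described above (top 3 choices, and no more than 5% worse than the top choice) can be sketched in a few lines. This is a hypothetical illustration, not the script the league used: the data layout (a list of `(move, winrate)` pairs per position), the function names, and ranking candidates by winrate are all assumptions made for the example.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (move, winrate in percent); hypothetical layout

def is_similar(played: str, candidates: List[Candidate],
               top_n: int = 3, max_gap: float = 5.0) -> bool:
    """True if the played move is among the engine's top_n candidates
    and at most max_gap winrate points below the best candidate."""
    # Assumption: candidates are ranked by winrate, highest first.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best_winrate = ranked[0][1]
    for move, winrate in ranked[:top_n]:
        if move == played:
            return best_winrate - winrate <= max_gap
    return False

def similarity(positions: List[Tuple[str, List[Candidate]]]) -> float:
    """Fraction of positions in which the played move counts as similar."""
    if not positions:
        return 0.0
    hits = sum(is_similar(played, cands) for played, cands in positions)
    return hits / len(positions)

# Made-up example position: the 4th choice is within 1% of the top move,
# yet it still does not count as similar under the top-3 rule.
cands = [("D4", 52.0), ("Q16", 51.8), ("C3", 51.5), ("K10", 51.2)]
print(is_similar("Q16", cands), is_similar("K10", cands))
```

With 49 similar moves out of 50 positions, `similarity` would return 0.98. Note that the result depends on the engine version and the number of playouts per position, which the thread does not specify.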

 Author: Javaness2 [ Mon Mar 26, 2018 4:03 am ] Post subject: Re: “Decision: case of using computer assistance in League A Ah, you are right, that seems like the best way to interpret that statement. So did they then only look at a section of the game or at the whole game? For me it would be kind of strange if they didn't look at the whole game. I suppose that a script already exists to show the comparison data per ply for the game.

 Author: jlt [ Mon Mar 26, 2018 4:18 am ] Post subject: Re: “Decision: case of using computer assistance in League A
Uberdude wrote: Out of interest in the quality of the similarity metric used, I downloaded Leela 0.11 (to my crappy laptop, about 10 seconds to get 30k nodes, I don't know how strong it is) and analysed moves 50-80 of Carlo's game vs Israel, and my last PGETC game moves 50-88. In that small section of Carlo's game he got 100% similar, Israel 67%. In my game I got 74% and my opp 89%. (...) if it's possible for an innocent to get 89% on 38 moves then 98% on 100 moves when you've been studying with Leela is suspicious but not good enough proof for punishment.
Let's pick the highest percentage, i.e. 89%. Suppose for simplicity that for each move, the probability of finding Leela's move is p = 0.89. Then for n = 50 moves, the probability of matching exactly 49 moves is n·p^(n-1)·(1-p), which is about 2%. During rounds 1-3 of the Pandanet EGC, 60 games were played, so you would expect at least one false positive.
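The false-positive estimate can be worked through numerically. This is a rough sketch under the post's own simplifications: every move is an independent trial with p = 0.89, and each of the 60 games is treated as one independent test. The variable names are illustrative only, and counting 50 out of 50 as a flag as well is an added assumption.

```python
n_moves = 50   # moves examined per game
p = 0.89       # assumed per-move match probability for an honest player
n_games = 60   # games played in rounds 1-3, as stated in the post

# Probability that an honest player matches exactly 49 of 50 moves.
p_exactly_49 = n_moves * p ** (n_moves - 1) * (1 - p)

# A perfect 50/50 would be flagged too, so include it in the flag rate.
p_all_50 = p ** n_moves
p_flag = p_exactly_49 + p_all_50

# Expected number of honest games flagged, and chance of at least one.
expected_flags = n_games * p_flag
p_at_least_one = 1 - (1 - p_flag) ** n_games

print(f"P(exactly 49/50)        = {p_exactly_49:.2%}")
print(f"P(49 or 50 out of 50)   = {p_flag:.2%}")
print(f"expected flags in {n_games} games = {expected_flags:.2f}")
print(f"P(at least one flag)    = {p_at_least_one:.0%}")
```

Under these assumptions the expected number of flagged honest games is slightly above one, which is the point the post makes. If both players in each game are tested, the number of tests doubles and so does the expectation.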

 Author: Uberdude [ Mon Mar 26, 2018 4:44 am ] Post subject: Re: “Decision: case of using computer assistance in League A jlt, a good point to illustrate, but even that is placing too much significance on this test. That 2% is the chance of a randomly chosen game being at 98% Leela (and you should include 100% too). But when the game you choose to investigate is selected because someone else noticed it was similar to Leela it's like putting the black spot in the golf analogy on the player who hit the hole-in-one after and because he did so. It's not independent so the simple probabilities are not appropriate.

 Author: Javaness2 [ Mon Mar 26, 2018 4:49 am ] Post subject: Re: “Decision: case of using computer assistance in League A Finally I saw a confirmation: an appeal is planned, so the decision is not final. Judging by the last appeal I saw, it could take a year to figure this out.

 Author: ez4u [ Mon Mar 26, 2018 5:11 am ] Post subject: Re: “Decision: case of using computer assistance in League A
jlt wrote:
Uberdude wrote: Out of interest in the quality of the similarity metric used, I downloaded Leela 0.11 (to my crappy laptop, about 10 seconds to get 30k nodes, I don't know how strong it is) and analysed moves 50-80 of Carlo's game vs Israel, and my last PGETC game moves 50-88. In that small section of Carlo's game he got 100% similar, Israel 67%. In my game I got 74% and my opp 89%. (...) if it's possible for an innocent to get 89% on 38 moves then 98% on 100 moves when you've been studying with Leela is suspicious but not good enough proof for punishment.
Let's pick the highest percentage, i.e. 89%. Suppose for simplicity that for each move, the probability of finding Leela's move is p = 0.89. Then for n = 50 moves, the probability of matching exactly 49 moves is n·p^(n-1)·(1-p), which is about 2%. During rounds 1-3 of the Pandanet EGC, 60 games were played, so you would expect at least one false positive.
This calculation was why I was trying to get something more 'common sense' from Bill. If you take 89% (Uberdude's 4d opponent), you get 1.8% or about one time in 54. However, if you plug in 80% (the upper figure for what was observed for the player under discussion) you get a very different result = [edit --> wrong! 0.0000065% or one time in 15 million] 0.018% or one time in about 5,600. And if you take 70% (the lower figure), you get [edit --> wrong!! 5.0E-16 or about one time in 2 quadrillion] 3.8E-7 or about one time in 2.5 million.

 Author: jlt [ Mon Mar 26, 2018 5:26 am ] Post subject: Re: “Decision: case of using computer assistance in League A
ez4u wrote: If you take 89% (Uberdude's 4d opponent), you get 1.8% or about one time in 54. However, if you plug in 80% (upper figure for what was observed for the player under discussion) you get a very different result = 0.0000065% or one time in 15 million. And if you take 70% (the lower figure), you get 5.0E-16 or about one time in 2 quadrillion.
No you don't. If p = 0.8 then n·p^(n-1)·(1-p) is about 1/5600. If p = 0.7 then n·p^(n-1)·(1-p) is about 1/(2.5 million).
The orders of magnitude are certainly very different, but I purposely picked p = 0.89 to be on the safe side, i.e. I would prefer a few cheaters to go unpunished rather than too many innocents punished (together with the whole team).
And also, p = 0.89 may not be unrealistic for that particular game if you assume that most moves were very "ordinary" (dixit Robert Jasiek).
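The corrected figures are easy to reproduce, and doing so shows how strongly the result depends on the assumed per-move probability. The snippet below is a small sketch using the same simplified model as the posts (n = 50 independent moves, exactly one miss); the function name `p_exactly_one_miss` is made up for the example.

```python
def p_exactly_one_miss(n: int, p: float) -> float:
    """Probability of matching exactly n-1 of n independent moves,
    each matched with probability p: n * p^(n-1) * (1-p)."""
    return n * p ** (n - 1) * (1 - p)

n = 50
for p in (0.70, 0.80, 0.89):
    prob = p_exactly_one_miss(n, p)
    # "1 in N" is simply the reciprocal of the probability.
    print(f"p = {p:.2f}: probability {prob:.3g}, about 1 in {1 / prob:,.0f}")
```

Moving p from 0.70 to 0.89 changes the answer by more than four orders of magnitude, so the baseline estimate of p matters far more than the arithmetic.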

 Author: HermanHiddema [ Mon Mar 26, 2018 5:28 am ] Post subject: Re: “Decision: case of using computer assistance in League A IMO, for the appeal, they should analyse a large sample of PGETC games to see how much of an outlier 98% is.

