“Decision: case of using computer assistance in League A”

Bonobo · Post by **Bonobo** » Tue Jun 05, 2018 3:12 am

»New evidence for re-opening the Carlo Metta case«:

https://psv4.userapi.com/c848124/u63838 ... idence.pdf

(again, I don’t understand any of this)

AlesCieply · Post by **AlesCieply** » Tue Jun 05, 2018 3:46 am

Hello, I guess you know I was involved in dealing with the case as a member of the PGETC appeals committee. Since about that time I have also been looking into the matter on my own, trying to devise a better and statistically sounder method to check whether someone used AI in internet games. My analysis is based on comparing the player's performance in internet and live games. While working on it I found what I believe is new evidence, and last Friday I informed the EGF executive and the involved parties. As it looks like it is becoming public knowledge, you may as well have it directly from me. The supplied document is here:
https://drive.google.com/file/d/1NaWwHx ... sp=sharing

The analysis itself evolved a bit since then (in particular, the Kulkov-Metta PGETC game was added), the current version is here
https://docs.google.com/spreadsheets/d/ ... sp=sharing
I would very much appreciate it if it were reviewed and checked by others. I am also open to any criticism on how to improve it or on what mistakes you find in it. I still do not consider it an end product. I would like to add a few more games and make a comparison with games analyzed for different players. I would also like to repeat the analysis for some sequences where Leela might not be sufficiently precise or consistent. However, the work on it is rather slow and tedious. Feel free to contribute to it.

On the first sheet you also find a Pearson's chi-square test to compare the compatibility of the two histograms, for Carlo's internet and regular games. I am not an expert on statistics but I was told that the p-value represents the probability that the two sets could be results for the same population. In our case it means that the probability of both sets of games being played by one person is now about 0.0001 (=0.01%).
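For readers who want to re-check this kind of comparison, here is a minimal sketch of a Pearson chi-square test of homogeneity on two histograms. The bin counts are made up for illustration and are not taken from the spreadsheet, and the function names are hypothetical. Note that, strictly speaking, the p-value is the probability of seeing a discrepancy at least this large if both histograms were drawn from the same distribution; it is not the probability that one person played both sets of games.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a, y = df / 2.0, x / 2.0
    term, total, n = 1.0, 1.0, 1
    while term > 1e-16 * total:
        term *= y / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-y + a * math.log(y) - math.lgamma(a + 1.0))
    return max(0.0, 1.0 - lower)


def chi2_homogeneity(hist_a, hist_b):
    """Pearson chi-square test that two histograms share one distribution.

    Returns (statistic, degrees_of_freedom, p_value).
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same bins")
    # Bins that are empty in both histograms carry no information.
    pairs = [(a, b) for a, b in zip(hist_a, hist_b) if a + b > 0]
    total_a = sum(a for a, _ in pairs)
    total_b = sum(b for _, b in pairs)
    if len(pairs) < 2 or total_a == 0 or total_b == 0:
        raise ValueError("need at least two non-empty bins and two non-empty histograms")
    grand = total_a + total_b
    stat = 0.0
    for a, b in pairs:
        column = a + b
        expected_a = total_a * column / grand
        expected_b = total_b * column / grand
        stat += (a - expected_a) ** 2 / expected_a + (b - expected_b) ** 2 / expected_b
    df = len(pairs) - 1
    return stat, df, chi2_sf(stat, df)


# Made-up bin counts (e.g. moves per "deviation from engine choice" bin).
internet_games = [30, 10, 5]
regular_games = [10, 10, 25]
stat, df, p = chi2_homogeneity(internet_games, regular_games)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p:.2g}")
```

The test also assumes the counts are independent observations and that expected counts per bin are not too small, which is worth checking before reading much into a small p-value.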

Finally, I would very much appreciate if Carlo Metta came out and explained why he presented an apparently fabricated game record to the league manager. I do believe he is in principle an honest man who has done a lot for the go community and can continue to do so. I just think he made a mistake with using AI in his internet games and now is afraid of admitting it.
EDIT: Here I refer to a game record of the Shakhov-Metta game that Carlo himself supplied (among several other records), claiming it was played at a regular tournament and also contained many moves "similar to Leela". In fact, the game was played on KGS and the record was edited to look as if it had been played "live"; see the report for more details.

Javaness2 · Post by **Javaness2** » Tue Jun 05, 2018 4:23 am

I think that the first point in your summary is quite debatable right now.

Carlo Metta’s performance in the first 7 PGETC league games was so exceptional that such a feat may occur once in about 3000 tournaments.

The basic idea, that he did very well in this year's PGETC is of course relevant as an initial starting point. However, the winning percentages from Go Rating are probably not very reliable. Thus the figure you quote (1/3000) is probably best left out. Andrew Simon's already mentioned two players with similar 'super' performances this year.

Has nobody already analysed every internet tournament he played, showing his performance rating there relative to his offline rating performances? Especially in light of this third point, I would say that Carlo's overall performance in the KGS event - I see it is http://www.europeangodatabase.eu/EGD/To ... n=16762284 (but there is also) http://www.europeangodatabase.eu/EGD/To ... y=T171018A - would be interesting.

Coming back to Bojanic's point, it is hard to believe smart guys are going to deliberately disguise their internet games. So I imagine that there is some explanation there.

Tryss · Post by **Tryss** » Tue Jun 05, 2018 4:32 am

A possible source of bias is that you're comparing games he won with games he mostly lost.

The game against Vasquez is closer to the online games than the other regular games, but that is also the one he won.

Charlie · Post by **Charlie** » Tue Jun 05, 2018 4:50 am

AlesCieply wrote:Finally, I would very much appreciate if Carlo Metta came out and explained why he presented an apparently fabricated game record to the league manager.

Just to be clear: the allegedly fabricated game record is *not* the kifu from the game that raised accusations of cheating but another game, between Carlo Metta and Kim Shakhov. You really should be specific in this instance.

AlesCieply · Post by **AlesCieply** » Tue Jun 05, 2018 5:00 am

The basic idea, that he did very well in this year's PGETC is of course relevant as an initial starting point. However, the winning percentages from Go Rating are probably not very reliable. Thus the figure you quote (1/3000) is probably best left out. Andrew Simon's already mentioned two players with similar 'super' performances this year.

Could you provide a reference? I do not recall any 4d (and not fast improving!) player's performance like that. Of course, there are fast improving 1d players who regularly perform as 3d at tournaments. I agree, the figure of 3000 tournaments is approximate, though even if it was 1000 ...

Just to be clear: the allegedly fabricated game record is *not* the kifu from the game that raised accusations of cheating but another game, between Carlo Metta and Kim Shakhov. You really should be specific in this instance.

I am quite specific on it in the report, did not feel like copy/pasting from the report when people can read it.

A possible source of bias is that you're comparing games he won with games he mostly lost.

I am definitely aware of it. The problem is that there are not that many regular games Carlo won recently with the records available.

Charlie · Post by **Charlie** » Tue Jun 05, 2018 5:16 am

AlesCieply wrote:
Just to be clear: the allegedly fabricated game record is *not* the kifu from the game that raised accusations of cheating but another game, between Carlo Metta and Kim Shakhov. You really should be specific in this instance.
I am quite specific on it in the report, did not feel like copy/pasting from the report when people can read it.

Do not be arrogant. Many people who read this thread will not go and download your PDF and read it in great detail.

You are not only accusing someone of cheating but also accusing them of fraudulently fabricating evidence! The very least you could do is exercise some care and diligence in doing so!

Javaness2 · Post by **Javaness2** » Tue Jun 05, 2018 5:21 am

AlesCieply wrote: Could you provide a reference? I do not recall any 4d (and not fast improving!) player's performance like that. Of course, there are fast improving 1d players who regularly perform as 3d at tournaments. I agree, the figure of 3000 tournaments is approximate, though even if it was 1000

I am actually surprised you don't already have this data, because this tournament is so obvious to check.
Just sort this list of performances http://www.europeangodatabase.eu/EGD/To ... y=T160920A

Hidden is the raw gain (at 50%) but of course TPR should be bigger

GoR after tournament: 2051.345 	52.544    Chris Bryant (UK)
GoR after tournament: 2171.653 	47.167          a 1d
GoR after tournament: 2309.134 	46.753   Daniel Hu (UK)
GoR after tournament: 2259.935 	35.214    a 2d
GoR after tournament: 2305.714 	30.255    Carlo Metta
GoR after tournament: 2092.218 	30.121
GoR after tournament: 2397.215 	24.673
GoR after tournament: 2043.823 	21.05
GoR after tournament: 2433.031 	18.133
GoR after tournament: 2121.004 	17.008
GoR after tournament: 2188.365 	16.505
GoR after tournament: 2311.131 	14.747
GoR after tournament: 2310.588 	13.407
GoR after tournament: 2126.191 	13.002
GoR after tournament: 1658.588 	12.903
GoR after tournament: 2083.033 	11.992
GoR after tournament: 2327.441 	11.614
GoR after tournament: 2270.889 	11.504
GoR after tournament: 2252.379 	11.152
GoR after tournament: 2338.201 	10.159
GoR after tournament: 2148.369 	10.044
GoR after tournament: 2032.917 	10.041
GoR after tournament: 2345.868 	9.099
GoR after tournament: 2062.001 	9.088
GoR after tournament: 2554.068 	9.065
GoR after tournament: 2102.409 	7.755
GoR after tournament: 2188.786 	6.338
GoR after tournament: 2082.72 	6.08
GoR after tournament: 2231.591 	4.753
GoR after tournament: 1784.558 	4.605
GoR after tournament: 2435.44 	4.306
GoR after tournament: 2121.256 	4.192
GoR after tournament: 2226.391 	4.011
GoR after tournament: 2262.345 	3.901
GoR after tournament: 2415.592 	3.895
GoR after tournament: 2708.218 	3.89
GoR after tournament: 2145.097 	3.855
GoR after tournament: 2503.11 	3.139
GoR after tournament: 2739.956 	2.731
GoR after tournament: 2347.463 	2.379
GoR after tournament: 2118.761 	2.275
GoR after tournament: 2160.829 	2.194
GoR after tournament: 2174.745 	2.171
GoR after tournament: 2297.122 	1.151
GoR after tournament: 1959.172 	1.076
GoR after tournament: 2333.719 	0.693
GoR after tournament: 2065.674 	0.663
GoR after tournament: 2327.001 	0.086
GoR after tournament: 1833.388 	-0.314
GoR after tournament: 1702.06 	-0.344
GoR after tournament: 1799.57 	-0.369
GoR after tournament: 1989.18 	-0.4
GoR after tournament: 1687.072 	-0.405
GoR after tournament: 2191.142 	-1.31
GoR after tournament: 2148.85 	-1.474
GoR after tournament: 1949.142 	-1.593
GoR after tournament: 1857.958 	-1.651
GoR after tournament: 2350.28 	-1.728
GoR after tournament: 2502.07 	-1.741
GoR after tournament: 2351.544 	-2.286
GoR after tournament: 2168.554 	-2.343
GoR after tournament: 2329.714 	-2.393
GoR after tournament: 2150.505 	-2.815
GoR after tournament: 2191.358 	-2.824
GoR after tournament: 2329.81 	-3.071
GoR after tournament: 2396.55 	-3.084
GoR after tournament: 2228.872 	-3.798
GoR after tournament: 1985.133 	-3.924
GoR after tournament: 2045.834 	-4.042
GoR after tournament: 1977.848 	-4.236
GoR after tournament: 2382.181 	-4.275
GoR after tournament: 2283.186 	-4.295
GoR after tournament: 2269.634 	-4.33
GoR after tournament: 2158.263 	-4.514
GoR after tournament: 2485.878 	-4.538
GoR after tournament: 2441.816 	-5.301
GoR after tournament: 2337.461 	-5.596
GoR after tournament: 2242.681 	-5.609
GoR after tournament: 2607.648 	-5.901
GoR after tournament: 2332.817 	-7.041
GoR after tournament: 2369.122 	-7.446
GoR after tournament: 2206.372 	-7.641
GoR after tournament: 2169.903 	-8.211
GoR after tournament: 2246.102 	-8.296
GoR after tournament: 2131.445 	-8.419
GoR after tournament: 2212.346 	-8.452
GoR after tournament: 2295.239 	-8.911
GoR after tournament: 2618.884 	-9.866
GoR after tournament: 2153.442 	-10.663
GoR after tournament: 2441.498 	-10.776
GoR after tournament: 2387.235 	-11.255
GoR after tournament: 2447.258 	-11.847
GoR after tournament: 1972.993 	-11.971
GoR after tournament: 2362.567 	-12.1
GoR after tournament: 2306.047 	-12.167
GoR after tournament: 2480.258 	-13.333
GoR after tournament: 2485.259 	-13.444
GoR after tournament: 2502.231 	-13.591
GoR after tournament: 2331.652 	-13.996
GoR after tournament: 2370.293 	-14.179
GoR after tournament: 2256.515 	-15.038
GoR after tournament: 2429.096 	-16.269
GoR after tournament: 2077.376 	-22.624
GoR after tournament: 2160.054 	-24.437
GoR after tournament: 1922.164 	-31.362
GoR after tournament: 2249.452 	-32.76

Javaness2 · Post by **Javaness2** » Tue Jun 05, 2018 5:22 am

Just to be clear: the allegedly fabricated game record is *not* the kifu from the game that raised accusations of cheating but another game, between Carlo Metta and Kim Shakhov. You really should be specific in this instance.

Let us say that the word 'fabricated' was not a good choice here. I would have gone for 'modified'. Especially in a paper like this.

bernds · Post by **bernds** » Tue Jun 05, 2018 5:36 am

Javaness2 wrote:
Just to be clear: the allegedly fabricated game record is *not* the kifu from the game that raised accusations of cheating but another game, between Carlo Metta and Kim Shakhov. You really should be specific in this instance.
Let us say that the word 'fabricated' was not a good choice here. I would have gone for 'modified'. Especially in a paper like this.

The report, as I understand it, says Carlo submitted it as an example of an over-the-board tournament game, and it turned out to be a KGS record instead. The word "fabrication" is entirely appropriate if that is indeed correct, and IMO if this is indeed what happened, it justifies any penalty. If you lie to the court, you deserve whatever you get.

Uberdude · Post by **Uberdude** » Tue Jun 05, 2018 5:41 am

AlesCieply wrote:
The basic idea, that he did very well in this year's PGETC is of course relevant as an initial starting point. However, the winning percentages from Go Rating are probably not very reliable. Thus the figure you quote (1/3000) is probably best left out. Andrew Simon's already mentioned two players with similar 'super' performances this year.
Could you provide a reference? I do not recall any 4d (and not fast improving!) player's performance like that. Of course, there are fast improving 1d players who regularly perform as 3d at tournaments. I agree, the figure of 3000 tournaments is approximate, though even if it was 1000 ...

From earlier in thread:

Just to go back to Carlo, I thought I'd work out his performance rating for this season's PGETC. He had great results for a 4d:
- beat Andrey Kulkov 6d (Russia) by 1.5
- beat Ondrej Kruml 5d (Czechia) by 2.5
- beat Dragos Bajenaru 6d (Romania) by resign
- beat Reem Ben David 4d (Israel) by resign *** the famous 98% game
- lost to Mero Csaba 6d (Hungary) by 2.5
- beat Mijodrag Stankovic "5d" 3d by resign
- lost to Andrij Kravets 7d/1p by 7.5

At the start of the season in (1st) September Carlo's rating was 2381 [very similar to me], this was after picking up 50 points at the EGC. Of course his true strength could have been more than that and grown since then too but his rating lagged. His performance rating (using EGD GoR calculator), using current ratings of opponents is 2629, or +248.

How does that compare to other good performances?

Forum regulars may remember I beat Victor Chow 7d from South Africa a few years ago. UK were in league C for the 2014/15 season and my initial rating was 2361. My results were:
- beat Petrauskas 3d (Lithuania) by resign
- beat Chow 6/7d (South Africa) by 0.5
- beat Ganeyev 3k (Kazakhstan) by resign.
As I had no losses my performance rating with the "adjust until input = output" method is infinite, anchoring with a loss to 2700 gives 2666, anchoring with loss to 2800 gives 2719. So +300 ish with big uncertainty as no losses and few games, the only useful information is I beat a 2616 in one game, how flukey was that?

Last season Daniel on the UK team had no losses, this season he had just 1:
- beat Rasmusson 4d (Denmark)
- beat Karadaban 5d (Turkey)
- beat Welticke 6d (Germany)
- lost to Lin 6d (Austria)
Initial rating was 2402. Performance rating 2616 (+214).
If you include the wins (included some 5ds) from the previous season (for which his initial rating was 2262 but he probably wasn't much weaker than he is now) as well then you get performance rating of 2677 (+415).

Update: Chris this season:
- beat Isaksen 2d (Denmark)
- beat Schlattner 2d (Switzerland)
- beat Kuntay 2d (Turkey)
- beat Palant "5d" 4d (Germany) [quotes is his stated grade, no quotes is GoR where 4d is 2351->2450]
- beat Laatikainen "5d" 4d (Finland)
- beat Unger "3d" 4d (Austria)
- beat Hanevik 3d (Norway)
- beat Groenen "6d" 5d (Netherlands)
- beat Ouchterlony "4d" 3d (Sweden)
- lost to Metta 4d (Italy)
Initial rating 2284. Performance rating 2568 (+284). And if like Lukan you believe Carlo was using LeelaZero (I estimate EGF GoR ~2900) in the last game he gets 2781 (+497)
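The "adjust until input = output" idea used above can be sketched as a small root-finding problem: look for the rating at which the expected total score against the listed opponents equals the actual score. This is only an illustration. The plain logistic win curve and its `scale=100` constant are assumptions made for the sketch, not the EGD GoR calculator's formula, and the opponent ratings below are made up, so the numbers will not reproduce the figures quoted in the post.

```python
import math


def expected_score(rating, opponent, scale=100.0):
    """Win probability under a plain logistic model (an assumption, not the GoR formula)."""
    return 1.0 / (1.0 + math.exp((opponent - rating) / scale))


def performance_rating(results, scale=100.0, lo=0.0, hi=4000.0):
    """Rating at which the expected total score equals the actual total score.

    results: list of (opponent_rating, score) with score 1.0 for a win, 0.0 for a loss.
    Returns +inf / -inf when every game was won / lost, since no finite rating fits.
    """
    if not results:
        raise ValueError("need at least one game")
    actual = sum(score for _, score in results)
    if actual >= len(results):
        return math.inf
    if actual <= 0:
        return -math.inf
    # The expected total grows with the rating, so bisection finds the root.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, opp, scale) for opp, _ in results)
        if expected < actual:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Made-up opponent ratings for a player with no losses.
wins_only = [(2300.0, 1.0), (2600.0, 1.0), (1800.0, 1.0)]
print(performance_rating(wins_only))                     # inf: no finite rating fits
# Anchoring with one fictitious loss makes the result finite but anchor-dependent.
print(performance_rating(wins_only + [(2700.0, 0.0)]))
print(performance_rating(wins_only + [(2800.0, 0.0)]))
```

As the last two lines show, the anchored result moves with the chosen anchor, which is the "big uncertainty" mentioned for short, undefeated runs.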

Javaness2 · Post by **Javaness2** » Tue Jun 05, 2018 6:06 am

bernds wrote:The report, as I understand it, says Carlo submitted it as an example of an over-the-board tournament game, and it turned out to be a KGS record instead. The word "fabrication" is entirely appropriate if that is indeed correct, and IMO if this is indeed what happened, it justifies any penalty. If you lie to the court, you deserve whatever you get.

If you don't want your evidence to seem neutral, go ahead and choose fabrication. There are some other f words (foolish) you can put in there while you are at it.

AlesCieply · Post by **AlesCieply** » Tue Jun 05, 2018 6:13 am

Uberdude wrote: Last season Daniel on the UK team had no losses, this season he had just 1:
- beat Rasmusson 4d (Denmark)
- beat Karadaban 5d (Turkey)
- beat Welticke 6d (Germany)
- lost to Lin 6d (Austria)
Initial rating was 2402. Performance rating 2616 (+214).
If you include the wins (included some 5ds) from the previous season (for which his initial rating was 2262 but he probably wasn't much weaker than he is now) as well then you get performance rating of 2677 (+415).

This one really stands out, I admit. Thanks for providing the reference. Such performances are still quite rare, and I would not consider one as proof of anyone cheating on its own. I hope that is also clear from what I say in the report. Do also note that Daniel's strength/rating is still improving and does not look as settled as Carlo Metta's.

Bill Spight · Post by **Bill Spight** » Tue Jun 05, 2018 7:52 am

AlesCieply wrote:Finally, I would very much appreciate if Carlo Metta came out and explained why he presented an apparently fabricated game record to the league manager. I do believe he is in principle an honest man who has done a lot for the go community and can continue to do so. I just think he made a mistake with using AI in his internet games and now is afraid of admitting it.
EDIT: Here I refer to a game record of the Shakhov-Metta game that Carlo himself supplied (among several other records), claiming it was played at a regular tournament and also contained many moves "similar to Leela". In fact, the game was played on KGS and the record was edited to look as if it had been played "live"; see the report for more details.

To me, this behavioral evidence of doctoring and submitting a game record is the strongest evidence of cheating so far. (Assuming that it holds up, OC.) As is so often the case, it is the coverup that gets you.

Bojanic · Post by **Bojanic** » Tue Jun 05, 2018 8:18 am

The rating analysis that Ales made is, imho, just a signal for a warning lamp to light up.
The same goes for when a weaker player wins, or when someone plays stronger online.

Another alarm signal is the deviations diagram in GRP: when it goes up for one side and continues to rise, that is rather suspicious.
But there should be a next step in the analysis, going move by move, because some things could be deceiving.

Today I analyzed a game from PGETC (none of those mentioned here) where basically every move from one player is Leela's suggestion.
Basically, 90% of moves were A and B suggestions, and only one move was not suggested by Leela (although it looks nice).
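A move-by-move match rate of this kind can be computed with a few lines. The sketch below assumes the engine's candidate moves have already been exported per move, ordered best first (A, B, C, ...); the function name, the data layout, and the coordinates are hypothetical and do not come from any particular review tool.

```python
def match_rate(played, suggestions, top_n=2):
    """Fraction of played moves found among the engine's first top_n candidates.

    played:      list of move coordinates, e.g. "Q16"
    suggestions: one list of candidate moves per played move, best first
    """
    if len(played) != len(suggestions):
        raise ValueError("need one suggestion list per played move")
    if not played:
        return 0.0
    hits = sum(
        1 for move, candidates in zip(played, suggestions)
        if move in candidates[:top_n]
    )
    return hits / len(played)


# Made-up example: four moves by one player with the engine's ranked candidates.
played = ["Q16", "D4", "C3", "R4"]
suggestions = [
    ["Q16", "D16"],
    ["D4", "Q4"],
    ["D3", "C4", "C3"],
    ["Q3", "R4"],
]
print(match_rate(played, suggestions, top_n=2))  # share matching choice A or B
```

A high rate on its own says little without a baseline, since strong players also pick engine-approved moves in forced or obvious positions; comparing against the same player's over-the-board games is one way to get that baseline.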

Life In 19x19
