“Decision: case of using computer assistance in League A”

Simba
Lives with ko
Posts: 170
Joined: Wed Feb 16, 2011 8:54 am
Rank: 6d KGS
GD Posts: 0
Has thanked: 14 times
Been thanked: 23 times

Re: “Decision: case of using computer assistance in League A

Post by Simba »

Javaness2 wrote:This post from reddit strikes me as a crock of shite.
Yes, CM could have cheated, but then why tell people about it. Is the allegation that the entire Italian squad was complicit, if so then doesn't that strike you as slightly beyond belief.
People who have done wrong often confide in someone, to ease the burden on their own conscience. This is extremely common. No, I don't think the entire Italian squad was complicit. I think CM is the only one who has done wrong here. His teammate came forward and spoke up, albeit after a short delay, because it was playing on his conscience. That's fine - I forgive them for the delay. They didn't keep it quiet forever.
Javaness2 wrote:Yes okay, Leela 11 and Leela Zero differences exist, but so what? Why would CM want to cheat with the obselete Leela 11 in the first place? Why not pick 1 of many strong LZ networks?
He did pick one of the strong LZ networks - and not the most up-to-date one, because that would be too obvious. If he was accused of cheating, he could point (and was planning to point) at move 156 as proof he wasn't. If he was accused of cheating with Leela Zero, then of course this wouldn't have worked. But having such a quick first line of defence on hand for a general "we think you are still cheating" accusation would've been powerful.
Bill Spight wrote:Also, in your game vs. Metta, if Black 157 descends, White can win the resulting semeai. The play is pretty much a one lane road, so is within the capabilities of a European 4 dan.
This isn't relevant. If he'd been playing for himself, I'm sure he'd have seen it - I certainly saw it, that's why I didn't descend. No one is contesting this. We all can see that it works. What is far more interesting and not at all obvious is what Leela 0.11 and Leela Zero say about it. There is no easy way of finding that this move is such an unlikely and critical blind spot (see criteria 1-4 below). You can check it when told, and check that the refutation when entered manually does make Leela 0.11 see sense, sure, but that's so different from finding it for yourself.
Fenring wrote:What i dont understand is why it would be a very difficult task for a malicious person to produce such a statement?(Np-problem)
All he have to do is to check the Carlo's moves where Leela 0.11 and Leela Zero disagree? check the moves "Leela 11 says it is very bad move but Leela Zero says it is super so he plays it", i dont know how many of them exists,and on which metrics it is based(top 3 choice,Winrate variation) but it doesn't matter,we are clearly not in a situation "impossible to find,easy to check" like PvsNp problem.
I refer you back to PF137's post here: https://www.reddit.com/r/baduk/comments ... d/e0c509f/. If you play this way, Leela 0.11 believes it's going to lose until manually shown the refutation. That is a pretty big thing that could be pointed to as proof that "I didn't cheat, see, look, Leela says it would lose if I did that". Leela 0.11's suggested move wins the game. This is in the endgame too.

I challenge you to find another situation in any of Metta's online games, PGETC or otherwise, played between the release of 0.11, and the date the cheating accusation was first made where the following criteria are met:

1) We are in the endgame. Let's say move > 150 as a guideline.
2) Leela 0.11's best move shows a significant victory for side A.
3) Leela 0.11 has at least one move that incorrectly shows a significant defeat for side A, and Leela 0.11 does not see that it is incorrect until manually shown the refutation.
4) Leela Zero correctly finds the refutation, and recommends the incorrect move itself as its first choice.

Surely you can see that those criteria, which are all met here, are needle-in-haystack. No one is going to sit there to find something like that. You'd also have to be very strong to find something like this because you'd need to meet criteria 3, i.e. you need to find the refutation, or explore extensively with Leela Zero trying to find blind spots in each position in Leela 0.11. The time required here is immense. In comparison, actually checking the presence of such a move that meets the criteria, if you're told where to look, is very, very easy.
Jan.van.Rongen wrote:
Simba wrote: As it happens, the move in question (156) was Leela 0.11's third choice, ...
What does that show? Nothing IMO.

For the 5 alternatives White has for this move (h19, J-19-J16) 3 win the game if my analysis is correct. J18 played by Metta is a well known shape move. It is not even too difficult to see that it works when you notice the dame zumari of Blacks bigger dragon.
Lol, what are you talking about? The person that I quoted was harping on about the whole top-3 Leela thing from ages ago, and how he would, as part of his finding of such a move, exclude top-3 moves by Leela. In fact, the move demonstrated was among Leela 0.11's top 3 choices. Don't try to use what I've said out of context.
Jan.van.Rongen wrote:You are a 3 dan who lost to a 4 dan. What's wrong with that? Why put up a show how you let your team down etc. ? Your only "evidence" that white cheated is an unnamed person that made some remarks on reddit.
My only evidence? Please read through the rest of the thread and the other analyses that people have provided.

I apologise in advance if the next section comes off as arrogant; I'm simply trying to state things factually here.

I'm not 3-dan. I'm 6-dan on KGS, and routinely chew up 4d players without any issue (in fact I do paid teaching for players up to 3d level). I've won every game I've played in PGETC (the tournament that these games are part of) for the past two years, and beat the British champion in fairly serious games with mid-long time limits last year 5-0, albeit with one win on time (he's 4-5d). I attempted to reset to 5-dan before the PGETC league started, but the league organiser didn't allow it, so I'm stuck with an outdated EGF rating and beat the stuffing out of most people I get put against on the lowest board (because the boards are forcibly ordered by rating, as per PGETC rules, and my country has several people with a higher EGF rating than me). I've beaten professionals on even before several times and don't feel out of my depth against them.

EGF ratings are completely unsuitable for players who are primarily online-based. It's impossible to get an accurate reflection of someone's skill from half a dozen games per year in a rating system. But that's one hell of a large topic, and one that we shouldn't get distracted by here. My last face-to-face tournament was in 2016, and I lost two games from lol-nonsense in byo-yomi from clearly winning positions because I'm not used to playing in that setting. Perhaps in over-the-board play, while I remain rusty at it, and that's unlikely to change in the near future, I'd only be 3-4d, but online? I give 2-3 stones to those players.

The gap between 4d and 6d is enormous. I don't want this to sound unkind but you really have no perception of how much stronger than 6d - stronger than the professionals that I've played - someone would have to be to make me feel so utterly helpless like in that game. I wasn't playing against a 4d. No chance in hell. It scares me even thinking how many stones I'd need to take from an opponent that strong. Maybe I'd have a chance with four, but I'm sat here now doubting myself even with four stones. But yeah, don't think for a single second that it's somehow normal for me to get blown into space without any glimmer of hope by a 4d. It isn't. And that certainly isn't unique to me; all 6ds will tell you that 4ds just can't completely oppress them to the point of feeling like they have no hope. You can say all you want as a kibitzer, but you weren't playing. You didn't feel that; don't even pretend to know what it was like.
Bojanic wrote:There is one important point that we have not discussed in this topic.
That is how big difference is between 4d and 5d, and 5d and 6d.
^ This, this so much. And how big the difference between 6d and someone who completely tears them to pieces is. It's pure naivety (and slightly insulting, though understandable given lack of perception of this kind of difference from a player who isn't at this level) to suggest that a 4d is capable of that with no counterplay available.
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: “Decision: case of using computer assistance in League A

Post by Bill Spight »

Bojanic wrote:
Bill Spight wrote: Of the 50 plays examined in the Metta-Ben David game, I judged that only 12 were difficult enough to be relevant to the question of cheating. Using similar but different criteria Bojanic found 6 important plays by Metta between move 50 and 105, a smaller range of 28 plays by him. He and I agreed on 4 of those 6 plays. My guess is that with even several judges we would have consensus on at least 3 plays.
Bill,
in Metta-Ben David game, I estimated that middlegame is from moves 45-105, which is 60 and not 50 moves. As you are used to hear on forums - why don't you read what was posted?
Bojanic, the first important play that you identified was move 51, which came after move 50, the start of the sequence of 50 moves by Metta used in the original verdict, which was where I started looking. You stopped at move 105, which is before the end of that sequence. Sure, there are 60 moves in your sequence, 30 moves by Metta, 28 of which fall in the sequence both of us looked at.

Before you start accusing me of not reading what you wrote, why don't you read what I write more carefully?

Edit: As I said in my response to Ed, I cut you some slack the first time. But not the second.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
bugsti
Dies in gote
Posts: 32
Joined: Tue Jun 05, 2018 2:46 pm
Rank: 5 kyu
GD Posts: 0
Has thanked: 3 times
Been thanked: 7 times

Re: “Decision: case of using computer assistance in League A

Post by bugsti »

Simba wrote:
The gap between 4d and 6d is enormous. I don't want this to sound unkind but you really have no perception of how much stronger than 6d - stronger than the professionals that I've played - someone would have to be to make me feel so utterly helpless like in that game. I wasn't playing against a 4d. No chance in hell. It scares me even thinking how many stones I'd need to take from an opponent that strong. Maybe I'd have a chance with four, but I'm sat here now doubting myself even with four stones. But yeah, don't think for a single second that it's somehow normal for me to get blown into space without any glimmer of hope by a 4d. It isn't. And that certainly isn't unique to me; all 6ds will tell you that 4ds just can't completely oppress them to the point of feeling like they have no hope. You can say all you want as a kibitzer, but you weren't playing. You didn't feel that; don't even pretend to know what it was like.
I quickly reviewed your game with Leela Zero to check some of your claims. Surprisingly, I saw that White made a huge blunder with move 74, and that after he sacrificed his group with the 76-82 sequence you were in a solid winning position according to LZ.

This very much contradicts your account of being "completely oppressed". :scratch:
bugsti
Dies in gote
Posts: 32
Joined: Tue Jun 05, 2018 2:46 pm
Rank: 5 kyu
GD Posts: 0
Has thanked: 3 times
Been thanked: 7 times

Re: “Decision: case of using computer assistance in League A

Post by bugsti »

Simba wrote:
EGF ratings are completely unsuitable for players who are primarily online-based. It's impossible to get an accurate reflection of someone's skill from half a dozen games per year in a rating system. But that's one hell of a large topic, and one that we shouldn't get distracted by here. My last face-to-face tournament was in 2016, and I lost two games from lol-nonsense in byo-yomi from clearly winning positions because I'm not used to playing in that setting. Perhaps in over-the-board play, while I remain rusty at it, and that's unlikely to change in the near future, I'd only be 3-4d, but online? I give 2-3 stones to those players.
So you accidentally admitted that there may be a discrepancy between live and online rating and strength, so much so that on KGS you can give 2-3 stones to someone who can beat you in live matches.

It is also interesting how you judge your mistakes compared to Carlo's mistakes. You call yours "lol-nonsense in byo-yomi from clearly winning positions", while Carlo's live mistakes are instead "evidence of cheating".

If we were in a court of justice your allegations would be annihilated in no time.
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: “Decision: case of using computer assistance in League A

Post by Bill Spight »

Simba wrote:
Bill Spight wrote:Also, in your game vs. Metta, if Black 157 descends, White can win the resulting semeai. The play is pretty much a one lane road, so is within the capabilities of a European 4 dan.
This isn't relevant. If he'd been playing for himself, I'm sure he'd have seen it - I certainly saw it, that's why I didn't descend. No one is contesting this. We all can see that it works. What is far more interesting and not at all obvious is what Leela 0.11 and Leela Zero say about it. There is no easy way of finding that this move is such an unlikely and critical blind spot (see criteria 1-4 below). You can check it when told, and check that the refutation when entered manually does make Leela 0.11 see sense, sure, but that's so different from finding it for yourself.
What I said next was this:
Bill Spight wrote:OTOH, we know that in the heat of battle mistakes are made.
I was pointing out that it was possible that Metta might have missed that fact, left to his own devices. (Not that I am accusing him of cheating.)

I was talking about analyzing the play in terms of go, not statistics. Which is why I followed by saying this:
Bill Spight wrote:Such go analysis is like expert testimony in court, not infallible, but not worthless, either.


As for the question of Leela 11 vs. Leela Zero, that was irrelevant to my point. Sorry for not making that clear. But I also indicated in another note that I thought the question of comparing Leela 11 and Leela Zero was a diversion created by the anonymous accuser. I doubt if Metta (or anybody in their right mind) would cheat by checking the choices of both Leela 11 and Leela Zero and picking one of Leela 11's top three choices unless Leela Zero indicated that Leela 11's top choice was a mistake. But that's my opinion. I am not going to argue the point. I am not here to argue. (Even if I seem to be doing a lot of it lately. ;))
Simba
Lives with ko
Posts: 170
Joined: Wed Feb 16, 2011 8:54 am
Rank: 6d KGS
GD Posts: 0
Has thanked: 14 times
Been thanked: 23 times

Re: “Decision: case of using computer assistance in League A

Post by Simba »

bugsti wrote:I quickly reviewed your game with Leela Zero to check out some of your sentences. Surprisingly I saw that White made a huge blunder with the move 74, and after sacrificing his group with the 76-82 sequence you were in a solid winning position according to LZ.

This is very contradictory compared to your version of being "completely oppress". :scratch:
Look at the rest of the game. When I perform analysis on this myself (and I will, just I don't have much time at all for this sort of thing), I expect to find that he turned LZ on at a certain point - likely after he messed up in this spot.

The idea that a 4d could legitimately make me look like I have no idea how to play from that point is even more eye-watering than the idea that they could do that from start to finish.
bugsti wrote:So you accidentally admitted that there may be discrepancy between live and online rating and strenght, so much so that on KGS you can give 2-3 stones to someone who can beat you in live matches.

It is also interesting how you judge your mistakes compared to Carlo's mistakes. You called yours as "lol-nonsense in byo-yomi from clearly winning positions", Carlo's live mistakes are instead "evidence of cheating".

If we were in a court of justice your allegations would be annihilated in zero time.
'Accidentally' admitted? One has to wonder what you think I was trying to say if not that there is a discrepancy between the two, in particular for people who play almost exclusively one but not the other. Regardless, CM plays plenty in real life; this isn't relevant in his case.

CM's mistakes are not evidence of cheating; the fact that his play matches a bot when he has access to one, and does not match a bot when he does not have access to one, is evidence of cheating.

In my case, the only difference in my play occurs after my main time is over. I suck at real life byo-yomi since I'm just completely not used to it :P . I panic and still play stones out of my bowl instead of out of the lid of the container while blitzing everything at 2 seconds/move because I'm worried about losing on time. When not in byo-yomi, I'm as strong face-to-face as I am online.
RobertJasiek
Judan
Posts: 6273
Joined: Tue Apr 27, 2010 8:54 pm
GD Posts: 0
Been thanked: 797 times

Re: “Decision: case of using computer assistance in League A

Post by RobertJasiek »

Simba wrote:The idea that a 4d could legitimately make me look like I have no idea how to play from that point
You rely on the wrong assumptions that your level of play is constant, that the level of play of a 4d opponent of yours is constant, that a 4d cannot play like a 6d in some of his games or some parts of his games, and that a 4d cannot create positions whose treatment is your weakness.
Simba wrote:The gap between 4d and 6d is enormous.
The gap in effort to reach the ranks - yes. The gap in explicit or subconscious knowledge - yes. The impact on play in a single game - no (as a 5d, I know). The major impact on play in a single game is the difference in winning probability when continuing from the same position: my 4d opponents lose more often and my 6d opponents win more often from the same kind of position. The gap in explicit or subconscious knowledge can be noticed on average over several games against the same or different opponents but does not enable the 6d opponents to play better than 4d in every game and on every move. Proof: 6d (and even 7d) opponents do lose a significant fraction of their games against 5d players.

Furthermore, I have seen great differences of particular players' skill in online versus real world games. Some are better online, some better in the real world.
Bojanic
Lives with ko
Posts: 142
Joined: Fri May 06, 2011 1:35 pm
Rank: 5 dan
GD Posts: 0
Has thanked: 27 times
Been thanked: 89 times

Re: “Decision: case of using computer assistance in League A

Post by Bojanic »

Bill Spight wrote:Bojanic, the first important play that you identified was move 51, which came after move 50, the start of the sequence of 50 moves by Metta used in the original verdict, which was where I started looking. You stopped at move 105, which is before the end of that sequence. Sure, there are 60 moves in your sequence, 30 moves by Metta, 28 of which fall in the sequence both of us looked at.

Before you start accusing me of not reading what you wrote, why don't you read what I write more carefully?
Bill,
in the updated paper, given here:
viewtopic.php?p=232713#p232713

and in the same chart posted earlier here:
viewtopic.php?p=232557#p232557

you can see the analyzed sequences marked in yellow.
Please note that I marked the "joseki" as ending at move 38, and that the sequence of moves after that was analyzed.
There are no tenukis by Metta in this sequence, since he replied to his opponent's moves.
Note also that the chart is the first version: I marked move 45 as a tenuki too, but during detailed analysis I realized it was just a reply.
Therefore, moves 38 to 105 were analyzed, not 50 to 105, which is clearly visible (marked yellow) in the chart shown twice in this topic. The same is true of the other games.
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: “Decision: case of using computer assistance in League A

Post by Bill Spight »

Bojanic wrote:Therefore, moves from 38 to 105 were analyzed, not 50 to 105, which is clearly (marked yellow) visible in chart shown two times in this topic.
I never claimed that you analyzed only moves 50 to 105, I said that that range contained the moves that you analyzed and the moves that I analyzed. We analyzed different ranges.
Fenring
Dies in gote
Posts: 66
Joined: Wed Oct 12, 2016 9:38 am
Rank: FFG 5k
GD Posts: 0
Been thanked: 14 times

Re: “Decision: case of using computer assistance in League A

Post by Fenring »

Simba wrote:
Fenring wrote:What i dont understand is why it would be a very difficult task for a malicious person to produce such a statement?(Np-problem)
All he have to do is to check the Carlo's moves where Leela 0.11 and Leela Zero disagree? check the moves "Leela 11 says it is very bad move but Leela Zero says it is super so he plays it", i dont know how many of them exists,and on which metrics it is based(top 3 choice,Winrate variation) but it doesn't matter,we are clearly not in a situation "impossible to find,easy to check" like PvsNp problem.
I refer you back to PF137's post here: https://www.reddit.com/r/baduk/comments ... d/e0c509f/. If you play this way, Leela 0.11 believes it's going to lose until manually shown the refutation. That is a pretty big thing that could be pointed to as proof that "I didn't cheat, see, look, Leela says it would lose if I did that". Leela 0.11's suggested move wins the game. This is in the endgame too.

I challenge you to find another situation in any of Metta's online games, PGETC or otherwise, played between the release of 0.11, and the date the cheating accusation was first made where the following criteria are met:

1) We are in the endgame. Let's say move > 150 as a guideline.
2) Leela 0.11's best move shows a significant victory for side A.
3) Leela 0.11 has at least one move that incorrectly shows a significant defeat for side A, and Leela 0.11 does not see that it is incorrect until manually shown the refutation.
4) Leela Zero correctly finds the refutation, and recommends the incorrect move itself as its first choice.

Surely you can see that those criteria, which are all met here, are needle-in-haystack. No one is going to sit there to find something like that. You'd also have to be very strong to find something like this because you'd need to meet criteria 3, i.e. you need to find the refutation, or explore extensively with Leela Zero trying to find blind spots in each position in Leela 0.11. The time required here is immense. In comparison, actually checking the presence of such a move that meets the criteria, if you're told where to look, is very, very easy.
Simba, as I have been trying to explain for two posts, this is obviously not needle-in-haystack, and the logic of PF137 that you already showed is really flawless.


Someone who guesses that CM cheated with Leela Zero can easily find this.
I open Carlo's game with Leela Zero and Leela 0.11.
I look only at the moves played by Carlo that are Leela Zero's first choice and have a lower winrate with Leela 0.11.
On these very few moves, I can investigate further, and I don't even need to be strong to check 3). Leela Zero will give me the refutation.

And I repeat the process on all the games I want to check (those played by Carlo after Leela Zero was strong enough).
Not really a needle-in-haystack. I just open the game with 2 bots instead of one and compare.
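
The filter described in this post can be sketched in a few lines. This is only a minimal sketch, assuming the per-move analysis from both engines has already been exported into records. The `MoveRecord` fields and the `min_gap` parameter are hypothetical names, not part of Leela 0.11 or Leela Zero, and "lower winrate" is read here as "Leela 0.11 rates the played move lower than Leela Zero does".

```python
from typing import List, NamedTuple


class MoveRecord(NamedTuple):
    """Hypothetical per-move analysis record, filled in from both engines beforehand."""
    move_number: int
    played: str           # move actually played, e.g. "J18"
    lz_top_choice: str    # Leela Zero's first choice in that position
    lz_winrate: float     # Leela Zero's win rate for the played move (0.0 to 1.0)
    l011_winrate: float   # Leela 0.11's win rate for the played move (0.0 to 1.0)


def suspicious_moves(records: List[MoveRecord], min_gap: float = 0.0) -> List[MoveRecord]:
    """Keep moves that match Leela Zero's first choice but that Leela 0.11 rates lower."""
    return [
        r for r in records
        if r.played == r.lz_top_choice
        and r.lz_winrate - r.l011_winrate > min_gap
    ]
```

Checking criterion 3) for the few moves that survive the filter would still be a separate, manual step.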
Charlie
Lives in gote
Posts: 310
Joined: Mon Feb 06, 2012 2:19 am
Rank: EGF 4 kyu
GD Posts: 0
Location: Deutschland
Has thanked: 272 times
Been thanked: 126 times

Re: “Decision: case of using computer assistance in League A

Post by Charlie »

Simba wrote: I challenge you to find another situation in any of Metta's online games, PGETC or otherwise, played between the release of 0.11, and the date the cheating accusation was first made where the following criteria are met:

1) We are in the endgame. Let's say move > 150 as a guideline.
2) Leela 0.11's best move shows a significant victory for side A.
3) Leela 0.11 has at least one move that incorrectly shows a significant defeat for side A, and Leela 0.11 does not see that it is incorrect until manually shown the refutation.
4) Leela Zero correctly finds the refutation, and recommends the incorrect move itself as its first choice.

Surely you can see that those criteria, which are all met here, are needle-in-haystack. No one is going to sit there to find something like that. You'd also have to be very strong to find something like this because you'd need to meet criteria 3, i.e. you need to find the refutation, or explore extensively with Leela Zero trying to find blind spots in each position in Leela 0.11. The time required here is immense. In comparison, actually checking the presence of such a move that meets the criteria, if you're told where to look, is very, very easy.
Actually, the time required on semi-decent hardware would be a matter of hours. A day at the most. Here is what you do:

1. Download all the SGF files. How many can there be? Even for a prolific online player, I guess there are only hundreds -- not thousands -- of games to check.
2. Write a script that opens each SGF file (any number of libraries exist for this purpose)
3. For moves 150 to the end, do the following:
a) Pass the previous board position to Leela 0.11 and Leela Zero, respectively. (Easily done with GTP)
b) Wait for a certain number of playouts.
c) Compare the suggested moves and find a large discrepancy in win-rates for the actual move between the two bots.
d) Whenever a large difference in win-rates is found, pass the variation from the bot that yields the larger win-rate to the one that returned the lesser and look to see if that variation contains a "refutation"
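
Steps 3a to 3c above could be sketched roughly as follows. This is only a sketch under stated assumptions: `analyse_a` and `analyse_b` are hypothetical callables standing in for whatever wrapper drives Leela 0.11 and Leela Zero (for example over GTP) and returns a win rate per candidate move. SGF parsing and step d) are left out, and the default `threshold` is an arbitrary placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical engine wrapper: takes the moves played so far and returns
# {candidate move: win rate for the side to move, 0.0 to 1.0}.
Analyser = Callable[[Sequence[str]], Dict[str, float]]


@dataclass
class Discrepancy:
    move_number: int
    played: str
    winrate_a: float
    winrate_b: float


def find_discrepancies(
    moves: Sequence[str],
    analyse_a: Analyser,
    analyse_b: Analyser,
    start_move: int = 150,
    threshold: float = 0.3,
) -> List[Discrepancy]:
    """Flag moves from start_move on that the two engines rate very differently."""
    flagged = []
    for index, played in enumerate(moves):
        move_number = index + 1
        if move_number < start_move:
            continue
        position = moves[:index]  # the board position before the move was played
        rates_a = analyse_a(position)
        rates_b = analyse_b(position)
        # Skip moves that either engine did not evaluate at all.
        if played not in rates_a or played not in rates_b:
            continue
        if abs(rates_a[played] - rates_b[played]) >= threshold:
            flagged.append(
                Discrepancy(move_number, played, rates_a[played], rates_b[played])
            )
    return flagged
```

Waiting for a fixed number of playouts per position (step b) would happen inside the analyser callables, which is where nearly all of the running time would go.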

But why? What does it prove? This experiment shows only one fact -- that Leela 0.11 and Leela Zero are completely different -- and that fact has absolutely no bearing on this case.

The fact that two bots are different does not indicate that a human player cheated. How can it?

(@Fenring: I think you mean "flawed". "flawless" is actually a synonym for "perfect")
Last edited by Charlie on Mon Jun 18, 2018 12:19 am, edited 1 time in total.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: “Decision: case of using computer assistance in League A

Post by Uberdude »

Here is my analysis/commentary on the Carlo vs Chris game, which combines my impressions as a 4d kibitzer with post-game analysis with Leela Zero #145. In summary, Carlo did better in a pretty normal modern opening with an early 3-3 and big shimari; Chris played flexibly but Carlo just solidly captured 2 stones on the right for a good result. Chris then found some sharp local moves in the upper right fighting, and when Carlo made a natural-looking mistake of connecting some stones in atari they all died and the game was even to good for Chris. I considered the lower side black group in danger, but neither player played in that area for ages, which was odd (LZ wanted to many times for both). Chris then spoilt his lead with a liberty-reducing push and Carlo found an excellent kosumi to exploit the shortage of liberties, which meant Chris was just running out some weak stones while Carlo's adjacent ones were safe, and there was a long but natural pushing battle. Bad aji on the left side meant Carlo reduced it, and then play finally turned to the lower side, where Chris lost a bunch in the lower left corner to secure his group. In the endgame at the top Chris lost sente, so Carlo got to defend the middle and played a nice but not so amazing tesuji to connect up more efficiently, and then won by resignation.



Here is the winrate graph from Leela Zero #145 on 6k playouts (Chris was black so wins towards the top). It mostly agreed with my feelings watching the game, but I was surprised how good LZ thought the game was for Chris after the upper right fighting, as I thought he had made it even to a little good, but not better for him than it had been for Carlo prior to that.
[Image: Carlo vs Chris winrate.PNG]
Edit: it has been suggested to me that this winrate graph could be skewed in favour of Chris (black) because the game file shows an incorrect komi of -5.5. LeelaZero does not read the komi from the file but always uses 7.5 komi Chinese counting. The league uses Japanese 6.5 komi so this isn't totally correct, but the winner is usually the same under the two rulesets so will only make a significant difference in a very close game.
Attachment: Chris vs Carlo.sgf
pnprog
Lives with ko
Posts: 286
Joined: Thu Oct 20, 2016 7:21 am
Rank: OGS 7 kyu
GD Posts: 0
Has thanked: 94 times
Been thanked: 153 times

Re: “Decision: case of using computer assistance in League A

Post by pnprog »

Simba wrote:I challenge you to find another situation in any of Metta's online games, PGETC or otherwise, played between the release of 0.11, and the date the cheating accusation was first made where the following criteria are met:

1) We are in the endgame. Let's say move > 150 as a guideline.
2) Leela 0.11's best move shows a significant victory for side A.
3) Leela 0.11 has at least one move that incorrectly shows a significant defeat for side A, and Leela 0.11 does not see that it is incorrect until manually shown the refutation.
4) Leela Zero correctly finds the refutation, and recommends the incorrect move itself as its first choice.

Surely you can see that those criteria, which are all met here, are needle-in-haystack. No one is going to sit there to find something like that. You'd also have to be very strong to find something like this because you'd need to meet criteria 3, i.e. you need to find the refutation, or explore extensively with Leela Zero trying to find blind spots in each position in Leela 0.11. The time required here is immense. In comparison, actually checking the presence of such a move that meets the criteria, if you're told where to look, is very, very easy.
Fenring wrote: Simba, as i try to explain you since 2 posts, this is obviously not needle-in-haystack, and logic of PF137 you already show is really flawless.


Someone who guess CM cheat with Leela Zero can easily find this.
I open Carlo's Game with Leela Zero and Leela 0.11.
I take a look only when the moves played by Carlo is first choice of Leela Zero and have lower winrate with Leela 0.11.
On this very few moves, i can investigate further,and i dont even need to be strong to check the 3). Leela Zero will give me the refutation.

And i repeat the process on all games i want to check(those played by Carlo after Leela Zero was strong enough).
Not really a needle-in-haystack. i just open the game with 2 bots instead one and compare.
Charlie wrote: Actually, the time required on semi-decent hardware would be a matter of hours. A day at the most. Here is what you do:

1. Download all the SGF files. How many can there be? Even for a prolific online player, I guess there are only hundreds -- not thousands -- of games to check.
2. Write a script that opens each SGF file (any number of libraries exist for this purpose)
3. For moves 150 to the end, do the following:
a) Pass the previous board position to Leela 0.11 and Leela Zero, respectively. (Easily done with GTP)
b) Wait for a certain number of playouts.
c) Compare the suggested moves and find a large discrepancy in win-rates for the actual move between the two bots.
d) Whenever a large difference in win-rates is found, pass the variation from the bot that yields the larger win-rate to the one that returned the lesser and look to see if that variation contains a "refutation".
Well, if someone with "semi-decent hardware" is willing to give it a try, I can work out something to automate this sort of analysis. (But my computer does not qualify as semi-decent hardware, so somebody else would have to run it on their own computer.)
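As a starting point, here is a minimal Python sketch of steps 3a to 3c from Charlie's outline. It is only an illustration under stated assumptions: the two analyzers are hypothetical callables (not a real Leela or GTP API) that take the moves played so far and return a win rate for each candidate move, and the 0.25 threshold for a "large" discrepancy is an arbitrary guess. SGF parsing and step 3d (feeding one bot's variation to the other) are left out.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: candidate move -> win rate in [0, 1] for the player to move.
Winrates = Dict[str, float]
# Hypothetical analyzer: takes the moves played so far, returns Winrates.
# In practice this would wrap a bot; here it is just a plain function.
Analyzer = Callable[[List[str]], Winrates]


def find_discrepancies(
    moves: List[str],
    old_bot: Analyzer,
    new_bot: Analyzer,
    first_move: int = 150,    # guideline from criterion 1: endgame only
    threshold: float = 0.25,  # assumed meaning of a "large" win-rate gap
) -> List[Tuple[int, str, float, float]]:
    """Return (move number, move, old win rate, new win rate) for every
    played move from first_move on that the two bots rate very differently."""
    flagged = []
    for number in range(max(first_move, 1), len(moves) + 1):
        position = moves[: number - 1]  # moves before the one being checked
        played = moves[number - 1]      # the move actually played
        old = old_bot(position)
        new = new_bot(position)
        if played not in old or played not in new:
            continue  # one bot never considered the move, nothing to compare
        if abs(new[played] - old[played]) >= threshold:
            flagged.append((number, played, old[played], new[played]))
    return flagged
```

Anything this flags would still have to be checked by hand (or by step 3d) to see whether the gap really comes from a refutation that one bot misses.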
Charlie wrote: But why? What does it prove? This experiment shows only one fact -- that Leela 0.11 and Leela Zero are completely different -- and that fact has absolutely no bearing on this case.

The fact that two bots are different does not indicate that a human player cheated. How can it?
Well, let's see the results first. If it turns out that, out of 20 games (2000 endgame moves analysed), only the one specific move already mentioned on Reddit matches all those criteria, then, certainly, finding such a move would be difficult (whatever the methodology used).

I should add this: it might end up being relatively easy to spot such moves once one uses an automatic analysis script designed only for this purpose (if each game has 1 or 2 such moves). But we won't know what methodology was used by this anonymous Redditor to discover this move. Without such a tool, it could still be very hard to spot such a move.
I am the author of GoReviewPartner, a small software aimed at assisting reviewing a game of Go. Give it a try!
Bojanic
Lives with ko
Posts: 142
Joined: Fri May 06, 2011 1:35 pm
Rank: 5 dan
GD Posts: 0
Has thanked: 27 times
Been thanked: 89 times

Re: “Decision: case of using computer assistance in League A

Post by Bojanic »

Bill Spight wrote:
Bojanic wrote:Therefore, moves from 38 to 105 were analyzed, not 50 to 105, which is clearly (marked yellow) visible in chart shown two times in this topic.
I never claimed that you analyzed only moves 50 to 105, I said that that range contained the moves that you analyzed and the moves that I analyzed. We analyzed different ranges.
I see.

And whose analysis were you referring to here?
viewtopic.php?p=232635#p232635
Bill Spight wrote:As for the 98% matching evidence, you must understand that matching one of a bot's top three choices was chosen in order to generate impressive matching numbers, not through any theory of how a player might have cheated. (This motive may have been unconscious.) And restricting the possible matches to the fifty moves between moves 51 - 100 is also suspicious. In addition, it is confirmatory evidence instead of disconfirmatory evidence. IOW, it is not just unsound, it is crap.
Charlie
Lives in gote
Posts: 310
Joined: Mon Feb 06, 2012 2:19 am
Rank: EGF 4 kyu
GD Posts: 0
Location: Deutschland
Has thanked: 272 times
Been thanked: 126 times

Re: “Decision: case of using computer assistance in League A

Post by Charlie »

Looking at Uberdude's analyses of the game Carlo played against Chris and at the game itself, move 156 looks pretty normal to me. Looking at the times in the SGF makes it seem even more normal.

Carlo spent nearly 4 minutes on move 154. Does anyone honestly believe that he needed that long for that hane? No way -- he was obviously thinking about the group at G18 which only has one eye and pretty much zero chance of making another one. When black descended to H18, Carlo played J18 in two seconds -- he had prepared it.

Can a human 4-dan find J18? Yes, sure. I could find it and I'm only 3 kyu! (I'm pretty damn sure I could find it in a game, too. This sort of exact reading is really my only real strength.)

What's there to read? There are basically only about three possible local moves for white and failing to connect or at least make life would be game-over. Sure, there are many cutting points and weaknesses but 4 minutes is also a long time.