AlesCieply wrote:And something to stay on topic:
Thanks, but I did not find any specific moves on that spreadsheet.
Bill, you may also pick the suggestions for any "important moves" that my Leela provided too. They are all in the document I put on-line, https://docs.google.com/spreadsheets/d/ ... =925979564
Just have a look at the sheet MettaBenDavid for the game you started discussing here.
Questions about a game
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
AlesCieply
- Dies in gote
- Posts: 65
- Joined: Mon Sep 10, 2012 5:07 am
- GD Posts: 0
- Has thanked: 31 times
- Been thanked: 55 times
Re: Questions about a game
This is what I see when I open the sheet:
Does it not open properly for you? The #1 column is the top suggestion by Leela; #n is the number of the move (from the list of choices provided by Leela) actually played in the game.
EDIT: Sorry for the large picture, I did not find how to handle it properly.
Last edited by AlesCieply on Fri Jun 22, 2018 1:29 pm, edited 1 time in total.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
Thanks. It opened on a different sheet. But I found the one you showed. 
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
AlesCieply
- Dies in gote
- Posts: 65
- Joined: Mon Sep 10, 2012 5:07 am
- GD Posts: 0
- Has thanked: 31 times
- Been thanked: 55 times
Re: Questions about a game
Fine, you can check all the games I analyzed there and pick the Leela suggestions for any moves you determine as significant. In fact, I already have some more games analyzed; I will upload them to the Google sheets some time next week.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
Added more positions to note #2 ( viewtopic.php?p=233019#p233019 ), to make 12 in all. No more positions to add. 
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
AlesCieply
- Dies in gote
- Posts: 65
- Joined: Mon Sep 10, 2012 5:07 am
- GD Posts: 0
- Has thanked: 31 times
- Been thanked: 55 times
Re: Questions about a game
I think it is the right direction to select a number of significant moves in a game and then establish whether the player made a mistake there and how large the mistake was. At least that is what Ken Regan does for chess games. My concerns are how reliably we can determine the "significant moves" as well as the value of the mistakes (the lowering of the winning probability) that the played moves bear.

Insofar as this is difficult to automate, we have to rely on "experts" to choose the significant moves. However, we need to do it for a large number of games played by many different players to establish how players of different strength would do in these positions. Regan was able to compare the histograms/graphs of someone's play with those already established for players of varied strength (Elo). I am afraid we cannot ask for an "expert opinion" on hundreds (if not thousands) of games. So I am a bit sceptical about this kind of analysis at the moment.

On the other hand, we should still be able to "measure" how Metta's play in internet games differs from his play in regular games. What Milos Bojanic does in his analysis has a qualitative character, but I think his analysis can be improved by making delta-histograms for the significant moves, in a similar way as I do in my analysis for the whole part of the game (moves 31-180).
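For what it's worth, a delta-histogram of this kind is easy to build once the per-move winrate drops have been extracted from the analysis. A minimal sketch in Python — the `deltas` list here is made-up illustration data, standing in for the real per-move differences between Leela's top choice and the move actually played:

```python
from collections import Counter

# Hypothetical winrate drops (in % points) for a set of significant moves:
# the difference between Leela's winrate for its top choice and for the
# move actually played.  Real values would come from the analysis sheets.
deltas = [0.0, 0.4, 1.2, 0.0, 3.5, 0.8, 0.0, 2.1, 0.3, 5.0]

# Bucket the drops into 1%-point-wide bins to form a delta-histogram.
histogram = Counter(int(d) for d in deltas)

for bin_start in sorted(histogram):
    print(f"{bin_start}-{bin_start + 1}% pts: {histogram[bin_start]} moves")
```

The same binning applied to many players' games would give the reference distributions that a Regan-style comparison needs.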
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
AlesCieply wrote:I think it is the right direction to select a number of significant moves in a game and then establish whether the player made a mistake there and how large the mistake was. At least that is what Ken Regan does for chess games. My concerns are how reliably we can determine the "significant moves" as well as the value of the mistakes (the lowering of the winning probability) that the played moves bear.
As I have indicated, I think that the consensus of relatively few experts (maybe even five) can distinguish difficult choices in each game, say for the top 5% - 10% of plays, from less difficult choices. They can also agree on the worst plays, which can happen even when the position is not very difficult. For instance, in the Metta-Ben David game even I felt that …
However, reliably judging the value of mistakes is obviously beyond the ability of Leela 11, as analysis of the 8 Metta games shows. We even see that in the current games of the top AI bots. (Alpha Zero having retired.) But give us a few years, and let us develop bots for the purpose of rating positions and plays. Our current top bots are not optimized for those purposes.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
I have not gotten around to why I started this thread. Busy, busy!
Let me start with some considerations that are broadly applicable.
In this world coincidences abound. So do spurious correlations. When I was a kid I read about one spurious correlation whereby, for several years, it was possible to predict the US economy from the stork population in one European city (Amsterdam, I think, but it was a long time ago). OC, no matter how good the correlation may have been, nobody thought that it was anything but coincidental. Why not? Because we had no theory connecting the two facts. Nothing that we know about biology and economics provides any connection.
Given a correlation, we may search for a theory, an explanation. Often finding any explanation is a challenge. Often, however, we have a number of possible explanations. We may then try to find the best explanation. Generality is desirable, as are brevity and parsimony. The fewer assumptions the explanation requires, the better.
Note that the original correlation which we are trying to explain is not a very good test of the explanation. Sure, if we tested the explanation de novo it would be good evidence, but since we have fitted the theory to the data, we cannot expect less. In science we try to test the explanation or theory with new evidence. In real life "detective" work we may not be able to do that. But in science and detection we do our best to disprove and eliminate possible theories. Especially our pet theories.
Sherlock Holmes wrote:When you have eliminated the impossible, whatever remains, however improbable, must be the truth.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
Every assumption that we make about a theory or about its evidence weakens the support for the theory. Now to make a case for a theory, in lawyerly fashion, may mean making assumptions. That is not an indictment of making a case. But the standpoint of a scientist or detective is different from that of a lawyer or advocate. Not that scientists and detectives don't make cases for their conclusions, but their process, at least ideally, is largely as the Sherlock Holmes quote suggests, to eliminate theories.
The case against Metta began, IIUC, with the observation of a member of the Israeli team who was watching his game vs. Ben David, that a surprising number of Metta's plays matched the plays suggested by Leela 11. OC, that fact raises the question of cheating, and, IMO, justified filing a complaint. One way of cheating in online chess is to make the plays suggested by a superhuman engine. Analysis reveals that the suspected player, who has a modest rating, makes neither blunder nor mistake, but only makes a few of what chess players term inaccuracies. Such play is superhuman. In addition, there is often behavioral evidence, such as the suspected player belittling his opponent.
Now, Simba strongly believes that Metta cheated in their game. He assumes that Metta used Leela Zero to do so. OC, that may be so, but it is an assumption. If Metta was cheating, why did he blunder away a number of stones early in the game? Simba assumes that Metta did not cheat early in the game, because Leela Zero is so strong that Metta could use it to win if he got behind. Now that is a plausible theory of cheating, but it is also ad hoc, tailored to fit the evidence.
Then there is the curious case, advanced by the anonymous accuser, of the now infamous move 156, where Metta did not pick the play recommended by Leela 11, which was a mistake, but picked the play recommended by Leela Zero. Why that might be relevant is a puzzle, given the theory that Metta used Leela Zero to cheat, anyway. To get there the anonymous accuser had to assume, as he stated on reddit, that once he was comfortably ahead Carlo started using both Leela 11 and Leela Zero, side by side, to cheat. Not only is that an additional assumption, it is implausible on its face.
Now, making assumptions weakens a case, but when the assumptions are not pointed out, doing so can appear to strengthen a case. The anonymous accuser's gratuitous assumption performs that function brilliantly. On move 156 Leela 11's play, which Metta did not choose, is a significant error, while Leela Zero's play, which Metta did choose, is not. Such a large discrepancy in evaluation between Leela 11 and Leela Zero is unusual, and Anonymous Accuser uses that fact as proof that Metta was cheating.
OC, the discrepancy is relevant only given the assumption that Metta was using both Leela 11 and Leela Zero simultaneously. And the so-called proof requires the hidden assumption that Metta would not have found move 156 if he were not cheating. In fact, it is an obvious candidate move.
The original case against Metta, made by the Israeli team, also makes assumptions that make the case appear stronger than it is while actually weakening it. One known way of cheating online is to use a superhuman program to choose your plays. Here the obvious suspect is Leela, especially since Metta says that he used it for training. One problem with that theory is that many of his moves do not match Leela's choice. One possibility, OC, is that Metta cheated in a different way. When asked how they might cheat using a superstrong bot, players on this forum suggested that they might use it to avoid blunders. To test that theory would involve looking at individual plays for evidence of errors. I have actually done that, as have others.
Another possibility is that, because Leela is non-deterministic, plays that it suggested to Metta might not match all of its suggestions when the program is run to check for matches. It is, in fact, very likely that the checking run of the program will not find all of the plays where Metta made the same play as Leela suggested. However, it works the other way, as well. There will be plays that match the checking run that did not match the run that Metta used, if he used Leela at all. This behavior of Leela is something that a scientist or detective would examine. In fact, because of the phenomenon called regression to the mean, a run chosen because it has a high number of matches would likely have a higher number of matches than a random run. In any event, the possibility that Metta made plays suggested by Leela that the checking run of Leela does not match does not justify matching second or third choices of that run to Metta's plays. The likely result of doing so would be to grossly overestimate the number of matches, when it is plausible that the number of matches to Leela's top choice alone is already an overestimate, assuming that Metta picked Leela's top choice for several plays.
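The regression-to-the-mean point can be illustrated with a small simulation. All the numbers below are assumptions for illustration only (100 analysed moves, a 65% per-move match probability); the point is simply that, across many non-deterministic runs, the run with the most matches will sit well above the average run, so a run selected for its high match count overstates the true match rate:

```python
import random

random.seed(0)

N_MOVES = 100   # hypothetical number of analysed moves in one game
P_MATCH = 0.65  # assumed chance that a fresh run's top choice matches
                # a given human play; the true value is unknown
N_RUNS = 500

# Because the engine is non-deterministic, each checking run produces
# its own count of matching moves.
match_counts = [sum(random.random() < P_MATCH for _ in range(N_MOVES))
                for _ in range(N_RUNS)]

mean_matches = sum(match_counts) / N_RUNS
best_run = max(match_counts)

print(f"average matches per run: {mean_matches:.1f}")
print(f"matches in the best run: {best_run}")
```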
This is not a criticism of the Israeli team. Their job was to present a case, not to find a verdict. But by matching a range of Metta's plays to Leela's top three choices they gave the appearance, echoed by the claim — by whom, I forget, but it does not really matter — that the probability that Metta cheated in that game is greater than 90%. (The Israeli team found a match of 98% for Metta's plays in the range of moves 50 - 150. Other runs found, as we might have expected, lower matching rates around 93%.) Matches to Leela's top choice were 72%; other runs may have found matches in the mid-60% range. Adding matches to the second and third choices made a big difference in the impression that the evidence gave. 98% matches?
A guilty verdict appeared to be a slam dunk.
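The arithmetic behind that impression is simple. A sketch with hypothetical match counts chosen only to reproduce the quoted percentages over the 101 moves in the 50-150 range (the real counts are in the team's report):

```python
moves = 101   # moves 50-150 inclusive
top1 = 73     # hypothetical count of matches to Leela's #1 choice
top3 = 99     # hypothetical count of matches to any of the top 3 choices

top1_rate = top1 / moves
top3_rate = top3 / moves

print(f"top-1 match rate: {top1_rate:.0%}")   # about 72%
print(f"top-3 match rate: {top3_rate:.0%}")   # about 98%
```

The jump from 72% to 98% is produced entirely by widening the matching criterion, not by any new evidence about the plays themselves.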
To paint such an impressive picture required the assumption that Metta would sometimes pick Leela's second choice and the assumption that he would sometimes pick Leela's third choice. The assumption that he would sometimes pick Leela's fourth choice was not necessary.
The additional assumption was made that he would not pick an obviously bad play. The assumption was also made that Metta would pick a second or third choice in order to avoid detection that he was cheating. This assumption seems rather implausible, given that Leela reveals its second and third choices, possibly among others. How does picking one of them avoid detection?
The choice of the range of 50 of Metta's plays is also suspicious. The arguments that later plays in the endgame might be unreliable and that earlier plays in the opening might provide too many matches because of joseki are plausible. But I cannot avoid the nagging suspicion that a wider range might have been less impressive.
(In particular, Metta's move 37 is problematic for the cheating hypothesis, as we shall see.)
One thing my undergraduate research methods professor stressed was this: Do not throw away any data.
OC, you may have data that are questionable, or outliers that you ignore in reaching your final conclusion. But you have to address those data and make your arguments. You don't just make some plausible assertions and then ignore data without even considering it. The human world is full of plausible assertions.
A lot of people assume that the opening is not a good place to look for evidence of cheating in go by using a super strong bot. I disagree. That may make sense in chess, where players memorize openings and chess engines use opening books. But the opening is more fluid in go, and, perhaps more importantly, super strong go bots excel in strategy, which is paramount in the opening. That's where you can use a bot to advantage to take an early lead. And humans are imitating bots in the opening already. They are making early 3-3 invasions, playing some new AI joseki, and making attachments and diagonal contact plays that humans used to avoid in the opening. Some New Fuseki style plays have made a comeback, as well. Using a bot in the opening is not a dead giveaway. (OC, using a bot in a semeai may be a giveaway when the bot makes a mistake.)
While I have raised questions about the accusations against Metta in this note, my main point is to show the deleterious effect of assumptions. Not only do they weaken a case, they can make it look better than it is.
Last edited by Bill Spight on Mon Jul 09, 2018 2:06 pm, edited 1 time in total.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
Back when I was training at go, I made use of Botvinnik's idea to study positions where I had taken a long time, because they were difficult for me. An unstated assumption in this training is that I was more likely to have made a mistake in a position where I took a long time than in positions where I did not, or to make worse mistakes there. That runs counter to folklore, in which taking a long time enables you to play well. A famous example is the game in which Honinbo Shusai took 8 hours to read out an endgame and find the right play. But if taking a long time meant finding the right play, why study those positions afterwards? Bridge great Alfred Sheinwold once quipped that he played quickly because quick errors were less embarrassing than slow ones.
So I start with the assumption that plays that take more time for a player are more likely to be mistakes than the others, or are worse on average, because they are more difficult for him. As discussion in this thread has indicated, time taken is at best an imperfect measure, since we do not know why the player took a long time. He might have gone to the kitchen to make a sandwich, for instance.
Imperfect though it may be, I take the time taken as a measure of subjective difficulty, and assume that plays where a player takes a long time are on average worse than those where he takes a normal length of time. (Very quick plays may on average be worse, as well, for different reasons.)
I also take the difference between Leela's estimated winrate for a play and its estimated winrate for its top choice as a measure of how bad a play may be. (Or how good, in some instances.) As is well known, that is an imperfect measure, for several reasons. But you make do with what you have.
I have used Ales Cieply's numbers.
I have indicated the relationship between the time taken by Metta in his game with Ben David and the number of % points chucked in the following table.
- 9 longest plays ---- total of 15.6% pts. chucked -- average 1.7%
- 10 next longest plays ---- total of 9.0% pts. chucked -- average 0.9%
- 48 shortest plays ---- total of 23.6% pts. chucked -- average 0.5%
Note that the only other hypothesis advanced in this discussion concerning the length of time Metta took to make a play is that a quick play might not have given him time enough to cheat effectively using Leela. OC, for the plays that took him a long time he had quite enough time to run Leela and cheat.
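The per-group averages in the table follow directly from the totals; a quick check:

```python
# (plays, total % pts. chucked) for each time-usage group in the table
groups = {
    "9 longest plays": (9, 15.6),
    "10 next longest plays": (10, 9.0),
    "48 shortest plays": (48, 23.6),
}

for name, (plays, chucked) in groups.items():
    print(f"{name}: average {chucked / plays:.1f}% pts. chucked per play")
```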
Last edited by Bill Spight on Mon Jul 09, 2018 2:06 pm, edited 1 time in total.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
Most of the positions I posted here were ones on which Metta took the longest (49 sec. or longer). Let's review them. 
Edit: Ales Cieply ran another Leela11 analysis, this time of the whole game, as I suggested, with a setting of 300K+ playouts. This is the most accurate Leela11 analysis that we have. I have edited these positions accordingly.
Edit 2: Ales also managed to get analyses by Leela Zero Elf. I am updating the positions in this note and the next one accordingly. (BTW, Elf agrees with me that the move in question is not good, losing 9% pts. Leela 11 thinks it's OK, but it was trained originally on human play.)
First position.
Second position.
Position 3
Position 4
Position 5
Last edited by Bill Spight on Fri Jul 27, 2018 2:23 pm, edited 7 times in total.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
More positions. 
Edit: I also applied the result of Cieply's new run to these positions.
Edit 2: Added Leela Elf (200k)'s results, using the delta estimate.
Position 6
Position 7
Position 8
Position 9
Position 10
Position 11
Aside from one position, these are the 10 positions on which Carlo Metta took the most time, 49 sec. or more. We do not have Leela11's evaluation for one of them, because it occurred before either Cieply or Bojanic did any evaluations. But it was a joseki choice that stands a good chance of being Cieply's Leela's top choice. Another one was Leela's top choice, and another one Leela considered to be better than its top choice.
Of the 67 plays that Cieply's Leela evaluated, it considered 21 of them to be errors. 7 of them were in the 9 plays that took the longest time. 14 of them were in the 58 remaining plays. So when Carlo took a long time, he was more likely to make an error (according to Leela).
The average winrate loss per error was about 2.3%, regardless of how long Carlo took. So the main difference has to do with the probability of error. When he took less than 49 sec. to make a play, his error rate was around ¼; when he took 49 sec. or more to make a play, his error rate was around ¾. When queried about how they might use a bot to cheat, many of our members said that they would use it to help prevent blunders. Carlo's time usage data does not fit that theory of cheating. Why would he take a long time on a play, only to then pick what Leela told him was a bad play more often than he usually did? What theory of cheating does this data fit?
OC, he might have cleverly taken a long time to pick bad moves, in order to disguise his cheating. But until now, who has suggested that such a strategy might be necessary?
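The error-rate contrast stated above can be checked from the counts given (21 errors among 67 evaluated plays, 7 of them among the 9 longest thinks):

```python
total_plays, total_errors = 67, 21
long_plays, long_errors = 9, 7          # plays that took 49 sec. or more

short_plays = total_plays - long_plays      # 58
short_errors = total_errors - long_errors   # 14

long_rate = long_errors / long_plays
short_rate = short_errors / short_plays

print(f"error rate on long thinks:  {long_rate:.2f}")   # roughly 3/4
print(f"error rate on short thinks: {short_rate:.2f}")  # roughly 1/4
```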
Last edited by Bill Spight on Fri Jul 27, 2018 2:16 pm, edited 4 times in total.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Bojanic
- Lives with ko
- Posts: 142
- Joined: Fri May 06, 2011 1:35 pm
- Rank: 5 dan
- GD Posts: 0
- Has thanked: 27 times
- Been thanked: 89 times
Re: Questions about a game
When I did the analysis of this move, the previous move wN2 had just been played, and for some time after it was played this move was actually Leela's top choice. Leela found a better move later, but this move was rated as the top choice for a considerable time.Bill Spight wrote: Position 7
Those 21 moves are not errors! There were better moves, but those moves were still recommended, and in some cases they were Leela's top choice during some period of time. Even among top European players you can see serious errors, blunders, miscalculations, hallucinations, etc. - but you could not see any of those in two of Carlo's games. Just a long list of Leela's top choices.Bill Spight wrote:Of the 67 plays that Cieply's Leela evaluated, it considered 21 of them to be errors. 7 of them were in the 9 plays that took the longest time. 14 of them were in the 58 remaining plays. So when Carlo took a long time, he was more likely to make an error (according to Leela).
There is a problem with the analysis of time spent - we don't know what he was doing during it.Bill Spight wrote:The average winrate loss per error was about 2.3%, regardless of how long Carlo took. So the main difference has to do with the probability of error. When he took less than 49 sec. to make a play, his error rate was around ¼; when he took 49 sec. or more to make a play, his error rate was around ¾. When queried about how they might use a bot to cheat, many of our members said that they would use it to help prevent blunders. Carlo's time usage data does not fit that theory of cheating. Why would he take a long time on a play, only to then pick what Leela told him was a bad play more often than he usually did? What theory of cheating does this data fit?
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
According to Leela, they reduced Carlo's probability of winning the game. What do you require of an error? That it loses the game after a sequence of perfect play? Call them chucks if you wish. Leela thought that with those 21 moves Carlo chucked a total of 48.8 percentage points.Bojanic wrote:When I did analysis of this move, before previous move, wN2 was played, and for some time after it was played, this move was actually Leela's top choice. Leela found better move later, but this move was rated as top choice for considerable time.Bill Spight wrote: Position 7
Those 21 moves are not errors!Bill Spight wrote:Of the 67 plays that Cieply's Leela evaluated, it considered 21 of them to be errors. 7 of them were in the 9 plays that took the longest time. 14 of them were in the 58 remaining plays. So when Carlo took a long time, he was more likely to make an error (according to Leela).
That question was already addressed. When Carlo took a long time he was more likely to chuck points, despite the fact that Leela could have been running all that time and coming up with good plays, as well as winrate estimates.There is a problem with analysis of spent time - we don't know what he was doing during it.Bill Spight wrote:The average winrate loss per error was about 2.3%, regardless of how long Carlo took. So the main difference has to do with the probability of error. When he took less than 49 sec. to make a play, his error rate was around ¼; when he took 49 sec. or more to make a play, his error rate was around ¾. When queried about how they might use a bot to cheat, many of our members said that they would use it to help prevent blunders. Carlo's time usage data does not fit that theory of cheating. Why would he take a long time on a play, only to then pick what Leela told him was a bad play more often than he usually did? What theory of cheating does this data fit?
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Questions about a game
Well, I meant for these questions to be about behavioral evidence. Namely: assuming that Carlo used a bot to cheat, that bot could have been calculating winrates and picking plays during all that time, so why would he choose options that chucked percentage points when he did not do so on plays that took more normal times?
However, it seems that I have to do some statistics. And remember, I formed my hypothesis before looking at the data: that, on average, the longer a player took to make his play, the more points he would chuck.
So I did a regression using the 67 data points from Cieply's analysis. Here is the equation, using phrases for variables:
expected_percentage_points_chucked = 0.385 + 0.012*seconds_taken_to_play
The correlation coefficient is 0.313.
P(tail) = 0.0050 < 0.01
The correlation between time taken and points chucked (according to Leela) is highly significant.
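A regression like the one above can be reproduced from a table of (seconds taken, percentage points chucked) pairs. Here is a minimal ordinary-least-squares sketch in Python; since the 67 actual data points are not reproduced in the thread, the dataset below is purely hypothetical, and `least_squares` is an illustrative helper rather than whatever tool was actually used.

```python
from math import sqrt

def least_squares(xs, ys):
    """Ordinary least-squares fit ys ~ intercept + slope * xs,
    returning (intercept, slope, correlation_coefficient)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r

# hypothetical (seconds_taken_to_play, percentage_points_chucked) pairs
times = [10, 20, 30, 60, 90, 120]
losses = [0.0, 0.5, 0.0, 1.2, 1.0, 2.3]
b0, b1, r = least_squares(times, losses)
print(f"expected_loss = {b0:.3f} + {b1:.3f} * seconds, r = {r:.3f}")
```

The p-value quoted above would come from a t-test on the correlation coefficient with n - 2 degrees of freedom, which needs a statistics library (e.g. a linear-regression routine with significance output) rather than this bare sketch.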
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.