Questions about a game

Uberdude · Post by **Uberdude** » Tue Jul 10, 2018 2:17 pm

Interesting find Bill. The obvious interpretation to me of chucking more win% on moves with more time is that these are harder moves and despite spending the extra time the chosen play was quite some way off Leela's best move. If less time had been taken maybe it would have been even worse (or maybe better, sometimes it is possible to overthink and come up with a worse move than your intuition). The next step would be doing a similar analysis with another bot. If a similar pattern is found then that is saying the two bots agree on which moves were good and which were bad, so the "spending more time on hard moves and still not playing so well" hypothesis is strengthened. I'm not sure what it means if such a correlation is not found. The absolute win% differences of another bot are not so important: what Leela calls a 2% mistake LeelaElf might call a 10% mistake, because it is stronger and has more extreme opinions. It's the relative difference between fast and slow moves that matters.

Bill Spight · Post by **Bill Spight** » Tue Jul 10, 2018 4:44 pm

Uberdude wrote:Interesting find Bill. The obvious interpretation to me of chucking more win% on moves with more time is that these are harder moves and despite spending the extra time the chosen play was quite some way off Leela's best move.

Carlo took the most time on move 37, almost 4 min. Do we really think that Carlo, if cheating, let Leela run that long because he thought the move was difficult, and then did not choose Leela's best move over one that Leela (probably) said was decidedly inferior? If living in the corner in gote was Leela's best choice after 4 min., then maybe the eye-stealing tesuji, which Cieply's Leela came up with both times after more than 200K rollouts, was a mistake, as was Leela's assessment, in analysis, that Leela's later choice chucked 3.08% or 2.54%, depending on the analysis run. Of course, that is possible, as is the possibility that Leela simply overthought the position.

I suppose that it is possible for a top bot to overthink, but that's not the way to bet. That still leaves the question in my mind, if cheating, why did Carlo let Leela run that long? For instance, if he was running Ales's Leela, it would have told him that living immediately in the corner would have turned a likely win into a likely loss, and Bojanic's Leela would have told him the same thing, but also would have said that his play would have lost 7% pts.

The next step would be doing a similar analysis with another bot.

Yes!

And doing it as I suggest, by making both Carlo's play and the play suggested by the bot, and comparing those results.

Bojanic · Post by **Bojanic** » Tue Jul 10, 2018 10:51 pm

Bill Spight wrote:Carlo took the most time on move 37, almost 4 min. Do we really think that Carlo, if cheating, let Leela run that long because he thought the move was difficult, and then did not choose Leela's best move over one that Leela (probably) said was decidedly inferior? If living in the corner in gote was Leela's best choice after 4 min., then maybe the eye-stealing tesuji, which Cieply's Leela came up with both times after more than 200K rollouts, was a mistake, as was Leela's assessment, in analysis, that Leela's later choice chucked 3.08% or 2.54%, depending on the analysis run. Of course, that is possible, as is the possibility that Leela simply overthought the position.

Bill,
you forgot one important thing: Leela is not good at L&D, as shown in the game from the qualification that was also analyzed.
It made a terrible mistake, and it took Metta a long time to recover.

The long thinking time on this move can be explained very easily: Metta played it himself, as he was not confident in Leela after its previous mistake.

Bill Spight · Post by **Bill Spight** » Wed Jul 11, 2018 1:31 am

Bojanic wrote:
Bill Spight wrote:Carlo took the most time on move 37, almost 4 min. Do we really think that Carlo, if cheating, let Leela run that long because he thought the move was difficult, and then did not choose Leela's best move over one that Leela (probably) said was decidedly inferior? If living in the corner in gote was Leela's best choice after 4 min., then maybe the eye-stealing tesuji, which Cieply's Leela came up with both times after more than 200K rollouts, was a mistake, as was Leela's assessment, in analysis, that Leela's later choice chucked 3.08% or 2.54%, depending on the analysis run. Of course, that is possible, as is the possibility that Leela simply overthought the position.
Bill,
you forgot one important thing: Leela is not good at L&D, as shown in the game from the qualification that was also analyzed.
It made a terrible mistake, and it took Metta a long time to recover.

The long thinking time on this move can be explained very easily: Metta played it himself, as he was not confident in Leela after its previous mistake.

Actually, you are agreeing with me. It is unlikely that Metta would take a long time on this move if he were going to use Leela to decide on which move to play, because Leela is bad at life and death questions. So he did his own thinking for this play.

Bill Spight · Post by **Bill Spight** » Tue Jul 17, 2018 3:10 pm

I have applied the results of Ales Cieply's new and improved Leela 11 run to the positions in the note, viewtopic.php?p=233833#p233833 and the following note.

Javaness2 · Post by **Javaness2** » Tue Jul 17, 2018 10:46 pm

Jan.van.Rongen wrote: 2. The time taken to move contains a wealth of information that has not been used in a thorough analysis. In the Italian appeal only the weaknesses of the original "98% method" were shown, but they failed to provide additional evidence that goes against the allegation of cheating. A statistical analysis can show that both players sped up or slowed down in the same periods, i.e. they could both experience the same situations as difficult or easy. This is IMO strong evidence of not cheating, because it is very difficult to simulate that behaviour.

This idea interested me, so I had a really quick look at this in Excel, so quick that I may have messed up.
If anyone is genuinely interested in the data I suggest they check it.
Column T is the correlation between time spent by Black on Move N and by White on Move N+1.
Column T+ is the correlation between time spent by Black on Move N and Move N+2 and by White on Move N+1 and Move N+3.
No idea if the pattern is normal!

Correlation	T	T+
All	0.190163358	0.645131427
10 to 25	-0.04818757	0.689699683
26 to 40	0.292364124	0.228355808
41 to 55	0.618649288	0.36956237
56 to 70	0.383386508	0.736804967
70 to end	0.08748908	0.671886045
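
For anyone who wants to re-check these numbers outside a spreadsheet, here is a minimal sketch of how the two columns could be computed. It assumes a plain list of thinking times in seconds, one per move, with Black moving first; the function names and the sample times are hypothetical, and reading column T+ as a correlation of two-move sums is only one interpretation of the description above.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)

def pair_correlation(times):
    """Column T: Black's time on move N against White's time on move N+1."""
    black, white = times[0::2], times[1::2]
    n = min(len(black), len(white))
    return pearson(black[:n], white[:n])

def windowed_pair_correlation(times):
    """Column T+ (assumed reading): Black's time on N plus N+2
    against White's time on N+1 plus N+3."""
    black, white = times[0::2], times[1::2]
    n = min(len(black), len(white))
    black_sums = [black[i] + black[i + 1] for i in range(n - 1)]
    white_sums = [white[i] + white[i + 1] for i in range(n - 1)]
    return pearson(black_sums, white_sums)

# Hypothetical thinking times in seconds (move 1, move 2, ...), NOT from the game.
sample_times = [5, 7, 12, 15, 40, 35, 8, 6, 20, 25, 3, 4]
print(round(pair_correlation(sample_times), 3))
print(round(windowed_pair_correlation(sample_times), 3))
```

To reproduce the per-range rows, the same functions would be applied to slices of the full time list (for example the entries for moves 10 to 25).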

AlesCieply · Post by **AlesCieply** » Tue Jul 24, 2018 8:15 am

Bill, I appreciate your extensive analysis a lot, it is really good to get different and well argued views. However, I tend to agree more with Milos on some points. First of all, it is really hard to say what move was Carlo's Leela suggesting (assuming he used it) at the time he was making his decision when compared with Leela's suggestions at 300k+ (or 200k+) nodes. In fact, in the Italian appeal they say that they got a better agreement between Carlo's moves and those suggested by Leela (top 3 choices) when they were running Leela on lower numbers of nodes. This indicates that the winrate estimates found at high number of playouts are not indicative for Carlo's moves. If I was to make an assumption on why Carlo used a lot of time on some moves I would say that he was more likely to spend more time when Leela's suggestions did differ from his own instincts and in positions when he suspected that Leela might be wrong. As a regular user of Leela he must know for sure where Leela's weaknesses are so he should spend more time when such situation occurs.

It should also be noted that when Carlo played fast, the moves were likely obvious or forced, so one is less likely to make an error (chuck a point/delta, as you call it). On the other hand, when he is looking to find a better move than Leela suggests, the move will likely be judged as a mistake by Leela. This can easily explain your observations.

AlesCieply · Post by **AlesCieply** » Tue Jul 24, 2018 8:39 am

I have asked Lukas Podpera (ID lukan here), one of the top European players, to give his "expert view" on the moves in the discussed game that he finds suspicious. I am translating the comments he attached (in Czech) to the game record:

this move is not played much nowadays as alpha-go joseki prefers a different one

the same here, unforced move

most strong players would just play c2, looks like a bot-play

good move but easy to find, most dan players would play there, not suspicious to me

important for a shape, not sure whether 4d player would see it

strong line of play

exquisite timing, not many strong amateur players would choose the right time to play this sequence
:b111: strong move
:b121: nice complex move, I do not believe any 4d would play it, most players would just block and defend the territory at the top
:b139: black is not concerned at all about the loss of territory at the top and ends the game with a sharp combination that I myself would have difficulty to find and play out so accurately
:b153: game finished, no 4d would wrap up the game so calmly and professionally

Bill Spight · Post by **Bill Spight** » Tue Jul 24, 2018 2:57 pm

AlesCieply wrote:First of all, it is really hard to say what move was Carlo's Leela suggesting (assuming he used it) at the time he was making his decision when compared with Leela's suggestions at 300k+ (or 200k+) nodes.

Not the question I am asking. I am asking whether, based upon Botvinnik's ideas, Carlo played worse when he took more time. My preference would have been to use a different bot, such as Elf, to estimate the chunks by Carlo. But you're the one who has made the most thorough analysis, and you used Leela 11.

In fact, in the Italian appeal they say that they got a better agreement between Carlo's moves and those suggested by Leela (top 3 choices) when they were running Leela on lower numbers of nodes. This indicates that the winrate estimates found at high number of playouts are not indicative for Carlo's moves.

I don't follow you. The assumption I am making is that Leela 11, although flawed, is able to compare moves reasonably well, especially with many playouts. My preference would be to use estimates at the same depth of the tree, because of potential horizon effects, but I'll use the delta if necessary.

If I was to make an assumption on why Carlo used a lot of time on some moves I would say that he was more likely to spend more time when Leela's suggestions did differ from his own instincts and in positions when he suspected that Leela might be wrong. As a regular user of Leela he must know for sure where Leela's weaknesses are so he should spend more time when such situation occurs.

In that case, we should expect that some of the deltas actually indicate that Carlo's choice is better than Leela 11's.

Let's take a look at Carlo's invasion, move 33, for instance.

Bill Spight wrote:
$$Bcm33 Invasion
$$ ---------------------------------------
$$ | . . . . X . . . . . . . . . . . . . . |
$$ | . 3 . . . X O . . . . . . . . . . . . |
$$ | . 2 X X X O . . O . . . . . . . X . . |
$$ | . 4 O O X O . O . , . . . . . , . . . |
$$ | . . . . O X X . . . . . . . . . . . . |
$$ | . . . . O . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . 1 . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . O . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . O . . X . . |
$$ | . . X . . . . . . . . . . . . . . . . |
$$ | . . . X X X O . . , . . . . . , . . . |
$$ | . . X O O . O . . . . . . O . . X . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------
Black invaded after 2 min. 13 sec., almost 14 times his average.

In the rsgf file published by Bojanic, Leela's first choice is as follows.

$$Bcm33 Leela
$$ ---------------------------------------
$$ | . . . . X . . . . . . . . . . . . . . |
$$ | . . . . . X O . . . . . . . . . . . . |
$$ | . . X X X O . . O . . . . . . . X . . |
$$ | . . O O X O . O . , . . . . . , . . . |
$$ | . . . . O X X . . . . . . . . . . . . |
$$ | . . 3 . O . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . O . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . O . . X . . |
$$ | . . X . . . . . . . . . . . . . . . . |
$$ | . . . X X X O . . , . . . . . , . . . |
$$ | . . X O O 1 O . . . . . . O . . X . . |
$$ | . . . . . 2 . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------
After the kikashi, Black 1 - White 2, Black 3 plays the eye stealing tesuji in the top left corner.

Ales Cieply's Leela also chose F-03.

It estimated a loss of 2.6%.

Edit: The new run also picked that line, with 265259 playouts. Carlo's play was not in the running, with only 1106 playouts.
After the invasion Leela chose the continuation in the actual game, with 293561 playouts, for an estimated loss of 4.1%, a rather larger number than the first estimate.

For one thing, Leela's choice seems to be quite consistent; even Bojanic's Leela found it. Its delta for Carlo's play was 5%. Even if Carlo thought that Leela's judgement was inferior to his, surely he would have noted what Leela thought, and also that that invasion was off Leela's radar. If he held Leela's evaluation in such disdain, why was he using it to cheat?

I have no personal experience with Leela or other bots, but why would Leela's judgement be bad in this kind of position? Yes, the left side to center seems to be the most urgent part of the board, but there are a number of possibilities there. Yes, there are tactical questions, but strategy is a very important consideration for this position, and strategy is where current top bots excel. Sure, simply because of the number of possibilities, Leela may be unlikely to choose the best play, but that is true for humans, as well. Anyway, the question of invasions and reductions comes up frequently at this stage of the game. Do we really think that good amateurs are better than top bots in these positions?

Lastly, Carlo took more than 2 min. for this play, a long time for him. If he was so confident in his own judgement for this position, why take so long? Wouldn't one minute have been enough?

Bill Spight · Post by **Bill Spight** » Tue Jul 24, 2018 3:52 pm

AlesCieply wrote:I have asked Lukas Podpera (ID lukan here), one of the top European players, to give his "expert view" on the moves in the discussed game that he finds suspicious. I am translating the comments he attached (in Czech) to the game record:
this move is not played much nowadays as alpha-go joseki prefers a different one

Apparently not Leela's choice. delta = 1%. (All deltas taken from your latest and best run.)

the same here, unforced move

Off Leela's radar. delta = 4.1%

most strong players would just play c2, looks like a bot-play

Yes, it does look like a bot move. Mine, too. C-02 is not on Leela's radar, nor mine.

important for a shape, not sure whether 4d player would see it

Carlo took 12 sec. Probably obvious to him, as you say.

strong line of play

14 sec.

exquisite timing, not many strong amateur players would choose the right time to play this sequence

Carlo took 11 sec. This seems to me to be implied in

. Not only by flow of the stones, but because Black seems to be ahead, and the bottom right group is his weakest. Isn't it urgent to settle the bottom right corner?

:b111: strong move

Carlo took 15 sec. delta = 2.7%

:b121: nice complex move, I do not believe any 4d would play it, most players would just block and defend the territory at the top

I would have defended the territory.

Carlo took 49 sec. delta = 1.75%

:b139: black is not concerned at all about the loss of territory at the top and ends the game with a sharp combination that I myself would have difficulty to find and play out so accurately

Carlo took 19 sec.

:b153: game finished, no 4d would wrap up the game so calmly and professionally

Carlo took 11 sec. B153 has been lurking for a while.

AlesCieply · Post by **AlesCieply** » Wed Jul 25, 2018 2:19 am

Bill Spight wrote:My preference would have been to use a different bot, such as Elf, to estimate the chunks by Carlo. But you're the one who has made the most thorough analysis, and you used Leela 11.

Just give me a bit more time, the results of the analysis with Leela Zero (ELF weights) are already in the pipeline ...

Bill Spight wrote:
AlesCieply wrote:In fact, in the Italian appeal they say that they got a better agreement between Carlo's moves and those suggested by Leela (top 3 choices) when they were running Leela on lower numbers of nodes. This indicates that the winrate estimates found at high number of playouts are not indicative for Carlo's moves.
I don't follow you. The assumption I am making is that Leela 11, although flawed, is able to compare moves reasonably well, especially with many playouts. My preference would be to use estimates at the same depth of the tree, because of potential horizon effects, but I'll use the delta if necessary.

The moves and winrates found at 300k+ nodes might be different from those seen by Carlo when he made his move. Concerning the winrates, even at 300k+ nodes it is not rare that the winrate evaluation is off by more than 1%, though they are still determined more precisely than the top suggestions where there can be several options with similar winrates. Carlo could have played Leela's top suggestion even after some additional pondering and the played move's winrate could easily drop by 2-3% when evaluated at 300k+ nodes.

Bill Spight wrote:
AlesCieply wrote:If I was to make an assumption on why Carlo used a lot of time on some moves I would say that he was more likely to spend more time when Leela's suggestions did differ from his own instincts and in positions when he suspected that Leela might be wrong. As a regular user of Leela he must know for sure where Leela's weaknesses are so he should spend more time when such situation occurs.
In that case, we should expect that some of the deltas actually indicate that Carlo's choice is better than Leela 11's.

I think so too. The question is how often it happens. Can we see it in a limited number of positions taken from one game? And can Leela realize the played move is better as soon as it is played? At the end of the Kim-Metta regular game played at WAGC we see Leela remaining blind for a whole sequence of moves played. I think the long time taken on (your position 3) is telling. I am a mere 1d player but I believe even much stronger amateur players would just play as Carlo did (making life in a corner) but without thinking on it too much (I would not think at all on it). What caused Carlo to ponder that long? Was it not Leela persisting that there was a better move? This is a clean example in accordance with my (and Bojanic's) hypothesis that he would think longer in positions where he suspects Leela to misjudge L&D. Well, he could have just gone to the toilet, accepted a brief telephone call or whatever else ...

Bill Spight · Post by **Bill Spight** » Wed Jul 25, 2018 8:45 am

AlesCieply wrote:
Bill Spight wrote:My preference would have been to use a different bot, such as Elf, to estimate the chunks by Carlo. But you're the one who has made the most thorough analysis, and you used Leela 11.
Just give me a bit more time, the results of the analysis with Leela Zero (ELF weights) are already in the pipeline ...

Great!

Bill Spight wrote:
AlesCieply wrote:In fact, in the Italian appeal they say that they got a better agreement between Carlo's moves and those suggested by Leela (top 3 choices) when they were running Leela on lower numbers of nodes. This indicates that the winrate estimates found at high number of playouts are not indicative for Carlo's moves.
I don't follow you. The assumption I am making is that Leela 11, although flawed, is able to compare moves reasonably well, especially with many playouts. My preference would be to use estimates at the same depth of the tree, because of potential horizon effects, but I'll use the delta if necessary.
The moves and winrates found at 300k+ nodes might be different from those seen by Carlo when he made his move. Concerning the winrates, even at 300k+ nodes it is not rare that the winrate evaluation is off by more than 1%, though they are still determined more precisely than the top suggestions where there can be several options with similar winrates. Carlo could have played Leela's top suggestion even after some additional pondering and the played move's winrate could easily drop by 2-3% when evaluated at 300k+ nodes.

I gather that when you say, "This indicates that the winrate estimates found at high number of playouts are not indicative for Carlo's moves," you mean that the winrates for Carlo's moves that he would have seen if he were cheating are not the winrates for those moves after running Leela for at least 300k playouts. (Although after almost 4 min., quien sabe?)

True, but here I am using Leela to estimate the degree of error. If Carlo picked Leela's top choice after running it for about one minute, and the winrate dropped enough with more playouts, so that its estimated error is more than 2%, it's still considered an error.

Bill Spight wrote:
AlesCieply wrote:If I was to make an assumption on why Carlo used a lot of time on some moves I would say that he was more likely to spend more time when Leela's suggestions did differ from his own instincts and in positions when he suspected that Leela might be wrong. As a regular user of Leela he must know for sure where Leela's weaknesses are so he should spend more time when such situation occurs.
In that case, we should expect that some of the deltas actually indicate that Carlo's choice is better than Leela 11's.
I think so too. The question is how often it happens. Can we see it in a limited number of positions taken from one game? And can Leela realize the played move is better as soon as it is played?

A few times in this game Leela considered Carlo's play to be better than its top choice, but not by much.

I think the long time taken on (your position 3) is telling. I am a mere 1d player but I believe even much stronger amateur players would just play as Carlo did (making life in a corner) but without thinking on it too much (I would not think at all on it). What caused Carlo to ponder that long? Was it not Leela persisting that there was a better move? This is a clean example in accordance with my (and Bojanic's) hypothesis that he would think longer in positions where he suspects Leela to misjudge L&D. Well, he could have just gone to the toilet, accepted a brief telephone call or whatever else ...

OC, we do not know why a player might take a long time on a move, but my working hypothesis is that his or her play is generally worse after taking a long time, because the position is subjectively difficult for that player. And, indeed, in this game when Carlo took 49 sec. or longer to make a play, his error rate tripled, according to Leela in your earlier analysis. (I did not check that again, with your latest analysis.)

Now, I am willing to entertain Bojanic's idea that Carlo simply ignored Leela for move 37, because Leela is relatively weak at L&D. But if Carlo was watching Leela's calculations, he would have noticed two things that remained fairly constant throughout. First, Leela understands that White to play in the top left corner can hold Black to one eye. Second, despite that fact, Leela prefers the eye stealing tesuji for Black in that corner (after playing kikashi in the bottom left corner or not). Leela may be looking at a possible seki in the corner, or maybe at a sacrifice. Also, the invasion at

aims at the eye stealing tesuji, especially since it would also be a two space extension from

. (Carlo was lucky that

prevented that extension cum eye stealing tesuji, defending his group on a small scale instead of attacking

on a large scale.) Surely the eye stealing tesuji was on Carlo's mind. He did not need Leela to suggest it to him.

If Carlo was using Leela to cheat at move 37, why didn't he follow its suggestion? It is possible that Leela preferred Carlo's play at the time he made it, but just barely. Every analysis I have seen of this game regards it as an error, and, except for this last one, a significant error. Bojanic's Leela rates it as losing 7%.

Leela's eye stealing tesuji is more than a suggestion.
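
The fast-versus-slow comparison mentioned above (the error rate when a play took 49 sec. or longer against the rest) can be sketched as follows. This is only an illustration: the 2% cutoff for counting a delta as an error, the function names and the sample data are all assumptions, and none of the numbers come from the game.

```python
def error_rate(moves, min_delta=2.0):
    """Fraction of moves whose estimated winrate loss (delta, in % points)
    is at least min_delta. The 2.0 default is an assumed cutoff."""
    if not moves:
        return 0.0
    errors = sum(1 for _seconds, delta in moves if delta >= min_delta)
    return errors / len(moves)

def split_by_time(moves, threshold_sec=49):
    """Split (seconds, delta) pairs into fast and slow groups at threshold_sec."""
    fast = [m for m in moves if m[0] < threshold_sec]
    slow = [m for m in moves if m[0] >= threshold_sec]
    return fast, slow

# Hypothetical (seconds, delta) pairs, NOT the real game data.
moves = [(10, 0.0), (12, 0.5), (15, 3.0), (20, 1.0), (60, 2.5), (90, 0.5), (120, 5.0)]
fast, slow = split_by_time(moves)
print("fast error rate:", error_rate(fast))
print("slow error rate:", error_rate(slow))
```

Running the same split on the deltas from two different bots would show whether both agree that the slow moves were the worse ones, even if their absolute winrate numbers differ.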

AlesCieply · Post by **AlesCieply** » Fri Jul 27, 2018 2:44 am

Bill, attached find two files generated by the GRP for the analysis of the discussed game, the analysis done with Leela Zero, ELF weights. One of the files is all moves, 100k playouts, the other is moves 30-181, 200k playouts.

Some comments:
The analysis was done by someone else, I cannot run it effectively enough on my laptop. Apparently, Leela Zero treats the playouts limit differently than Leela 0.11 as I see much lower number of playouts (below and around 10k) than the preset values. I guess there is a distinction between number of playouts and number of nodes, maybe the preset limit is applied to the nodes analyzed. I am also quite bothered by the precision of the winrate estimates. Those provided by Leela Zero are much less precise than those by Leela 0.11. Just looking at and comparing the numbers provided when the top suggestion was played (where the winrate before and after the move was played should be approximately the same) I am estimating the precision to be on a level of about 3%, in contrast to about 1% I would give to Leela 0.11. Sure, the winrates look more "accurate" in your terminology.

I am not sure what this will do with my kind of analysis which (in a way) relies on sufficiently precise winrates. I guess the mistake histograms will be smeared by the "noise".
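
The precision check described above (compare the winrate before and after a move whenever the bot's top suggestion was actually played, since the two should be about the same) could be sketched like this. The record layout, field names and sample values are hypothetical, and using the root mean square of the differences as the "precision" figure is an assumption about how such a number might be summarized; both winrates are assumed to be from the same side's point of view.

```python
from math import sqrt

def precision_estimate(records):
    """Root mean square of (winrate after - winrate before), in % points,
    over moves where the bot's top suggestion was the move played."""
    diffs = [r["after"] - r["before"] for r in records if r["played_top_choice"]]
    if not diffs:
        raise ValueError("no moves where the top suggestion was played")
    return sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical analysis records, NOT taken from the attached files.
records = [
    {"before": 50.0, "after": 53.0, "played_top_choice": True},
    {"before": 60.0, "after": 57.0, "played_top_choice": True},
    {"before": 40.0, "after": 20.0, "played_top_choice": False},  # ignored
]
print(precision_estimate(records))
```

A larger value for one bot than for another on the same game would correspond to the noisier winrates described above.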

Bill Spight · Post by **Bill Spight** » Fri Jul 27, 2018 3:37 am

AlesCieply wrote:Bill, attached find two files generated by the GRP for the analysis of the discussed game, the analysis done with Leela Zero, ELF weights. One of the files is all moves, 100k playouts, the other is moves 30-181, 200k playouts.

Many thanks.

Some comments:
The analysis was done by someone else, I cannot run it effectively enough on my laptop. Apparently, Leela Zero treats the playouts limit differently than Leela 0.11 as I see much lower number of playouts (below and around 10k) than the preset values. I guess there is a distinction between number of playouts and number of nodes, maybe the preset limit is applied to the nodes analyzed.

I certainly don't know. Uberdude?

I am also quite bothered by the precision of the winrate estimates. Those provided by Leela Zero are much less precise than those by Leela 0.11. Just looking at and comparing the numbers provided when the top suggestion was played (where the winrate before and after the move was played should be approximately the same) I am estimating the precision to be on a level of about 3%, in contrast to about 1% I would give to Leela 0.11. Sure, the winrates look more "accurate" in your terminology.

No, more precise in my terminology. The accuracy is unknown, but we cannot expect greater accuracy than precision (for any given bot).

Leela Zero is stronger than Leela 11, perhaps with less precision.

I am not sure what this will do with my kind of analysis which (in a way) relies on sufficiently precise winrates. I guess the mistake histograms will be smeared by the "noise".

Yeah. As I have said, the bots are trained for play, not analysis.

Edit: But looking at the size of the deltas, that level of precision seems to be good enough to identify probable mistakes.

Bill Spight · Post by **Bill Spight** » Fri Jul 27, 2018 2:25 pm

I have updated the positions in the game according to the new Leela Elf results.

See viewtopic.php?p=233833#p233833 and the next note.
