98% Win "Joseki"?

hyperpape · #1

I saw this claim that a certain joseki in books is evaluated as a 98% win for White: https://twitter.com/GoFederationRu/stat ... 1464165376.

I'm not quite sure what to ask, but what is going on here? Is this an actual recognized joseki? Are the engines consistent in this evaluation? How does this compare to their evaluation of handicap stones?

There's a certain overlap with Bill's question about what win rates mean, but I thought it still deserved a separate post.

dfan · #2

The latest 40-block weights for Leela Zero evaluate that position as 77% for White, which is more in line with what I'm used to seeing for josekis that turn out to be surprisingly bad when evaluated by modern networks.

Perhaps the evaluation in that picture comes from the ELF weights, which are much spikier. (As engines become stronger, their evaluations get closer to 100% and 0%.)
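To make the "spikier" point concrete, here is a minimal sketch of one way to picture it. It assumes, purely for illustration, that an engine's win rate is a logistic function of the point lead with an engine-specific sharpness. The function names, the 3-point lead, and the sharpness values 0.4 and 1.3 are all made up to show the shape; they are not measured from Leela Zero or ELF.

```python
import math


def winrate(lead_points: float, sharpness: float) -> float:
    """Map a point lead to a win probability with a logistic curve.

    This is an illustrative model only, not how any engine computes its value.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * lead_points))


def implied_lead(p: float, sharpness: float) -> float:
    """Invert the logistic curve: the lead that would give win rate p."""
    return math.log(p / (1.0 - p)) / sharpness


if __name__ == "__main__":
    lead = 3.0  # hypothetical lead for White, in points
    for name, k in [("milder engine", 0.4), ("spikier engine", 1.3)]:
        # Same lead, different sharpness, very different win rate.
        print(f"{name}: {winrate(lead, k):.1%}")
```

With these invented parameters the same lead comes out at roughly 77% for the milder curve and roughly 98% for the sharper one, so two engines could agree about the position and still report very different percentages.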

Bill Spight · #3

Interesting.

A couple of comments. Looking in ancient books (but not too ancient), it seems like that joseki has undergone a number of variations in what was considered best play. That kind of makes it ripe for new discoveries.

Also, I would feel better about the winrate claim if the rest of the board were empty. Or switch the colors of the other four stones and see what the winrate estimate is.

dfan · #4

Bill Spight wrote:

Also, I would feel better about the winrate claim if the rest of the board were empty. Or switch the colors of the other four stones and see what the winrate estimate is.

For grins I tried a bunch of different configurations for the other four stones (including leaving them out entirely) and with the same 40-block network (1fdf) Leela Zero's win probabilities for White varied between 75% and 79%.

Bill Spight · #5

dfan wrote:

Bill Spight wrote:

Also, I would feel better about the winrate claim if the rest of the board were empty. Or switch the colors of the other four stones and see what the winrate estimate is.

For grins I tried a bunch of different configurations for the other four stones (including leaving them out entirely) and with the same 40-block network (1fdf) Leela Zero's win probabilities for White varied between 75% and 79%.

Thanks.

That was my guess, ± 2%.

Uberdude · #6

That 98.8% comes from Elf v1; Elf v0 is "only" 97%, and the various LZ networks are lower, as said. Elf does indeed have very strong opinions, but it can also have quite bad blind spots. This is a fairly quiescent position, so I'd trust its judgement more than in some hot ladder position. But just recently I was reviewing an old game with commentary from Go World, Lee Changho vs Yamashita Keigo from the 2004 Ing Cup (http://www.go4go.net/go/games/sgfview/4424), and Elf's view of the bottom left bounced around like crazy even with 10k playouts. (Incidentally, the game records in go4go and GoGoD had quite a few important differences in move order from the Go World one, e.g. m16 before q8 (which I and the bots think is better, or else q15 is good; that's classic human knowledge from before bots made the press super-fashionable), h3 or c5 first, and b4 (which commentator Rin criticised, and Elf too) before or after c2.) Were the players making massive mistakes on every move? Maybe; it is a pretty weird sequence. But Elf also didn't consider moves like h3 and then liked them when shown, so I decided Elf was too much like a random number generator and used the more stable LZ instead.

Playing around with the position a little more in Elf, I think a lot of its dislike for black comes from the white stone in a ladder still being on the board. We know Elf has ladder problems, so could there be something baked into the network that imagines white running out and black failing to capture in a ladder? That would obviously be brilliant for white, so Elf may think the position is better for white than it really is. The first test for calibration is to pass for white and give black sente: how much does that swing the winrate? Answer: black is now 98.8%. Also, if white passes for his first move, Elf gives black 99.4%. So this is saying that on move 3, one extra stone while giving komi is 99.4%, whereas after this joseki, one extra stone while giving komi is a little lower at 98.8%. So it's not quite as good as one extra stone at the beginning.

So how about white passing and black adding an inefficient but not totally useless stone, to remove the ladder uncertainty and see what Elf thinks without any chance of influence from ladder delusions? If black captures the stone, white is around 40%, i.e. black is a fair bit better. That's somewhat understandable: with the stone taken off the board, white's turn above at o7 is less powerful (e.g. the cut isn't atari, or black can safely double hane) and m3 is no longer sente (Elf also likes l3 and takes some playouts to realise it doesn't threaten to save the stone, as the ladder still works). But still, for it to swing so much illustrates what a relatively small advantage 98.8% means to Elf.

Attachment:

98% joseki black capture, white to play.PNG

How about leaving the stone on the board so the white turn still has oomph? (Of course, black might push there quite soon too, to avoid that.) If white passes and black plays o2, then white is still winning, but only at 63%. o2 means that l3 or m3, moves Elf liked before, are no longer sente, so that is quite a minus. But I think this helps put the 98.8% in context: if black had a free but fairly crappy stone at o2, the white lead sounds much less amazing. So was Elf deluded by the ladder, or is being able to force with m3 n3 (in which case the o7 turn still has oomph) really worth almost a whole move? Interesting... (If white passes, we give black the o7 p8 exchange (somewhat generous, as white might tenuki) to remove the benefit of the turn with oomph, and black then plays o2, then white is only 54%.)

Attachment:

98% joseki black o2, white to play.PNG

And if black spends the move at n3, then white is only 47%. So the difference between 47% for white and 98.8% for white is whether or not white has a stone at the marked spot m3. Certainly having a free stone is better than none (it could help a bit in developing the lower side or act as a ladder breaker), but for a stone squashed up against a wall like that (l3 would be better) to make a difference of 52 percentage points in win rate does surprise me. (In this last position LZ #157 thinks white, not black, is winning, at 56%.)

Attachment:

98% joseki black n3, white to play.PNG
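For anyone who wants the Elf numbers from this post side by side, here is a small sketch that tabulates them and shows each drop from the 98.8% baseline, both in percentage points and in log-odds. The win rates are the ones quoted above for Elf v1 with white to play; the labels, names, and the choice of log-odds as a second scale are my own assumptions for illustration, not anything the engine reports.

```python
import math


def log_odds(p: float) -> float:
    """Natural-log odds of a win probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


# White's win rate with white to play, per Elf v1, as quoted in the post.
# The keys are informal labels chosen for this sketch, not engine output.
ELF_WHITE = {
    "joseki as played": 0.988,
    "white passes, black o2": 0.63,
    "black gets o7/p8 exchange, then o2": 0.54,
    "white passes, black n3": 0.47,
    "white passes, black captures": 0.40,
}

BASELINE = ELF_WHITE["joseki as played"]


def drop_in_points(label: str) -> float:
    """Percentage-point drop in White's win rate relative to the baseline."""
    return (BASELINE - ELF_WHITE[label]) * 100.0


if __name__ == "__main__":
    for label, p in ELF_WHITE.items():
        print(f"{label:36s} {p:6.1%}  "
              f"drop {drop_in_points(label):5.1f} pts  "
              f"log-odds {log_odds(p):+.2f}")
```

On the log-odds scale 98.8% sits at roughly 4.4, while everything from 40% to 63% sits close to zero, which is another way of seeing how far out on the curve Elf puts the original position.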

hyperpape · #7

Wonderful analysis. I think this drives home lightvector's point in the other thread about knowing your engine and calibrating your sense of its win rates.

I suppose I should have guessed ahead of time that this is not quite a whole stone: it takes a lot of work to give up a whole stone in a joseki, or even a "joseki".

Uberdude · #8

FWIW this sequence does appear in Takao's joseki dictionary, with the caption "equal" and 1 star, which means "quasi-joseki that is less well established or not as important as the double starred ones". 2 stars are given to this classic one, which Elf usually has a dismal view of (Mike Kyle posted about it), and Viktor Lin's blog also reported an interesting tewari from WeiqiTv that said it was good for white, as Elf also believes.

MikeKyle · #9

Uberdude wrote:

FWIW this sequence does appear in Takao's joseki dictionary, with the caption "equal" and 1 star, which means "quasi-joseki that is less well established or not as important as the double starred ones". 2 stars are given to this classic one, which Elf usually has a dismal view of (Mike Kyle posted about it), and Viktor Lin's blog also reported an interesting tewari from WeiqiTv that said it was good for white, as Elf also believes.

From what I saw, Elf (v0) does not feel like the turn at s7 is often a good idea. I guess due to ladder confusion.

I was thinking of asking Elf to evaluate these sequences, but I thought that the ladder problems might get in the way of good evaluation. I somewhat trust Elf's evaluation at this later stage though, when the ladders are gone. (I imagine that the AI is confident enough with that very short O4 ladder?)
