A pro's view on win rates
-
John Fairbairn
- Oza
- Posts: 3724
- Joined: Wed Apr 21, 2010 3:09 am
- Has thanked: 20 times
- Been thanked: 4672 times
A pro's view on win rates
I said I was going to stop trying to start getting discussions going, in protest at thread hijackers, but this case seems a bit too important to ignore. In any case, it's more about conveying facts than eliciting opinions.
Ohashi Hirofumi 6-dan has become Japan's go-to guy on AI bots. For a long time he has penned an excellent series of articles on AI in Go World. In the latest issue he has started a new series entitled "In search of the weak points in go AI". He has kept himself well abreast of developments in China, Taiwan and Korea, and apparently also the West. It is safe to say that his views, which of course are those of a pro, reflect pro views from elsewhere, at least to the extent that he must be aware of them.
Some points he made in his latest series seemed worth flagging here.
His own experiments use LeelaZero and Elf. He is well up on the various self-play versions of LeelaZero. In short, he is not a dilettante.
Now down to brass tacks. In an effort to ascertain what differences in win rates mean in terms of points, he gives the following two positions (using LeelaZero in this case).
Position A
Black has just played at the triangled point. LZ gives A as the best candidate move with a win rate of 48.9%, and B is second best with 48.3%.
I will add here that the GoGoD database has just over 100 examples of this Avalanche joseki and in every single case the next move is A.
Position B
This is exactly the same position except for the addition of the two triangled stones. This exchange (which, by my reckoning, has never appeared in pro play) is considered to be "approximately a 2-point gain for White."
But in terms of win rate, the AI bot now reverses the ranking of A and B. However, of more significance is that the exchange sparks a big change in win rate. A now becomes 60.4% and B 57.9%.
Ohashi infers from this that a roughly 10% change in win rate at this stage of the game equates to 2 points.
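The arithmetic behind that inference can be checked from the four LeelaZero figures quoted above. This is only a minimal sketch: the variable names are illustrative, and the 2-point value of the exchange is taken on trust from the article.

```python
# Win rates (%) quoted above for the two candidate moves, before and after
# the exchange that the article values at roughly 2 points.
before = {"A": 48.9, "B": 48.3}
after = {"A": 60.4, "B": 57.9}
POINTS = 2  # assumed value of the exchange, per the article

# Change in win rate for each candidate move, in percentage points.
shift = {move: round(after[move] - before[move], 1) for move in before}
average_shift = sum(shift.values()) / len(shift)

print(shift)                   # change per move
print(average_shift / POINTS)  # rough percentage points per point of score
```

The two shifts straddle 10%, which is presumably where the "roughly 10%" comes from.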
He further infers that for amateurs "a 10% drop in win rate is absolutely nothing to worry about. 70% indicates one side is a little better and 90% suggests one side has a winning lead." He goes on to add that, "In games between human players a reversal in fortunes even from 90% is not rare."
I think this assessment tallies quite well with some statements I have seen on this forum, but it is good to have pro backing, of course.
In another part of his article he lists win rates for one and the same position for various bots. The best-move range starts at 42% for FineArt, 44% for PhoenixGo, 46 to 47% for various versions of LeelaZero, up to 51% for ElfOpenGo 1. He points out that the range can be even bigger. In one case the range went from FineArt 49% to Elf 70%, and in another position LeelaZero's 42% contrasted with Elf's 61%.
Ohashi doesn't say this, but it seems that Elf is always at the high end of the scale. He seems, however, to think the difference in figures is down to differences in the algorithms and does not have much comparative value.
However, the idea I found most interesting (and which has also been aired here, in part) is that he recommends that readers view the AI output with the following order of priorities in mind:
1. Candidate moves
2. Number of visits
3. Win rate
He adds: "The idea that number of visits is more important than win rate is maybe somewhat surprising." (Again, not to some people here, but once again pro backing is useful.)
Ohashi doesn't elaborate on that, unfortunately, though he may in his further articles.
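As a rough illustration of why the order matters, the two criteria can pick different moves. The sketch below uses made-up visit counts and win rates (placeholders, not output from any engine) and a hypothetical `Candidate` record. It only shows that "most visited" and "highest win rate" need not agree.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    move: str
    visits: int      # how many playouts the search spent on this move
    winrate: float   # estimated win rate in %

# Placeholder numbers for illustration only, not output from any engine.
candidates = [
    Candidate("X", visits=2100, winrate=48.5),
    Candidate("Y", visits=700, winrate=48.8),
    Candidate("Z", visits=40, winrate=47.0),
]

by_visits = max(candidates, key=lambda c: c.visits)
by_winrate = max(candidates, key=lambda c: c.winrate)
print(by_visits.move, by_winrate.move)
```

A win rate attached to a move with few visits rests on very little search, which is one plausible reason to look at visits first. That is my guess, not Ohashi's stated reasoning.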
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: A pro's view on win rates
John Fairbairn wrote:
I said I was going to stop trying to start getting discussions going, in protest at thread hijackers, but this case seems a bit too important to ignore. In any case, it's more about conveying facts than eliciting opinions.
Ohashi Hirofumi 6-dan has become Japan's go-to guy on AI bots. For a long time he has penned an excellent series of articles on AI in Go World. In the latest issue he has started a new series entitled "In search of the weak points in go AI". He has kept himself well abreast of developments in China, Taiwan and Korea, and apparently also the West. It is safe to say that his views, which of course are those of a pro, reflect pro views from elsewhere, at least to the extent that he must be aware of them.
Some points he made in his latest series seemed worth flagging here.
His own experiments use LeelaZero and Elf. He is well up on the various self-play versions of LeelaZero. In short, he is not a dilettante.
Now down to brass tacks. In an effort to ascertain what differences in win rates mean in terms of points, he gives the following two positions (using LeelaZero in this case).
Position A
Black has just played at the triangled point. LZ gives A as the best candidate move with a win rate of 48.9%, and B is second best with 48.3%.
I will add here that the GoGoD database has just over 100 examples of this Avalanche joseki and in every single case the next move is A.
Position B
This is exactly the same position except for the addition of the two triangled stones. This exchange (which, by my reckoning, has never appeared in pro play) is considered to be "approximately a 2-point gain for White."
But in terms of win rate, the AI bot now reverses the ranking of A and B. However, of more significance is that the exchange sparks a big change in win rate. A now becomes 60.4% and B 57.9%.
Ohashi infers from this that a roughly 10% change in win rate at this stage of the game equates to 2 points.
He further infers that for amateurs "a 10% drop in win rate is absolutely nothing to worry about. 70% indicates one side is a little better and 90% suggests one side has a winning lead." He goes on to add that, "In games between human players a reversal in fortunes even from 90% is not rare."
I think this assessment tallies quite well with some statements I have seen on this forum, but it is good to have pro backing, of course.
Well, I have been saying that less than a 3% win rate difference is not particularly significant. At 10%, though, I might worry. Unless it is a 10% difference by Elf, OC.
As for the 2 pt. difference, that means that, on average, Black's play gains about 2 pts. less than White's play. As for the reverse of the ranking of A and B, Black's play is even worse if Black has already captured the three White stones on the second line, which White B in the first diagram threatens to force. As usual, you want to make your opponent's previous plays inefficient.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
- jlt
- Gosei
- Posts: 1786
- Joined: Wed Dec 14, 2016 3:59 am
- GD Posts: 0
- Has thanked: 185 times
- Been thanked: 495 times
Re: A pro's view on win rates
I've run the same experiment with LZ157. After a few thousand visits, the preferred move is H17; however, the most visited point is H16. The estimated winrates are:
Position A: 48.8% for H17, 48.5% for H16.
Position B: 51.7% for H17, 50.7% for H16.
We can conclude that a -3% mistake is, according to LZ157, equivalent to a 2 point loss at the early stage of the game.
Another experiment: on an empty board, Black's winrate is 46.5% (so White's winrate is 53.5%). If Black passes, then White's winrate becomes 81.7%. Assuming fair komi is about 6 points, and that passing loses twice the komi, we can conclude that, at the early stage of the game, a -30% mistake is equivalent to a 12 point loss.
I have no idea how to interpolate. Maybe a 10% mistake for LZ157 corresponds to about 5 points?
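One way to see the difficulty is to put the two measurements side by side. The sketch below only restates the figures given above (the variable names are illustrative). It shows that they imply different win-rate changes per point, so a single straight line through zero cannot fit both.

```python
# (win-rate change in percentage points, assumed point loss) for each experiment
exchange = (51.7 - 48.8, 2)     # Position A -> Position B, move H17
pass_move = (81.7 - 53.5, 12)   # empty board -> Black passes, assuming 2 x komi of 6

for name, (delta, points) in (("exchange", exchange), ("pass", pass_move)):
    print(f"{name}: {delta:.1f}% for {points} pts = {delta / points:.2f}% per point")
```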
-
Javaness2
- Gosei
- Posts: 1545
- Joined: Tue Jul 19, 2011 10:48 am
- GD Posts: 0
- Has thanked: 111 times
- Been thanked: 322 times
- Contact:
Re: A pro's view on win rates
It's an interesting post, but I don't understand how you can infer that 2 points is worth a 3% difference in win rates from 1 example.
If you have access to Golaxy, you will get both % win rate and estimated points lead being displayed at the same time.
https://www.eurogofed.org/index.html?id=232 Artem wrote this article using Golaxy
- jlt
- Gosei
- Posts: 1786
- Joined: Wed Dec 14, 2016 3:59 am
- GD Posts: 0
- Has thanked: 185 times
- Been thanked: 495 times
Re: A pro's view on win rates
It's certainly difficult to conclude from one example. I don't have access to Golaxy, so here is another attempt to estimate the correspondence between a number of points and winrate. Let's assume that the score (Black's points minus White's points) follows a normal distribution with standard deviation 14 (I picked this value because it's consistent with the winrate going from about 50% to 20% after a 12-point loss). Then we get, at the early stages of the game, and starting from a balanced position:
1 pt = 3%
2 pts = 6%
3 pts = 8%
4 pts = 11%
5 pts = 14%
6 pts = 17%
7 pts = 19%
8 pts = 22%
9 pts = 24%
10 pts = 26%
11 pts = 28%
12 pts = 30%
13 pts = 32%
14 pts = 34%
15 pts = 36%
16 pts = 37%
17 pts = 39%
18 pts = 40%
19 pts = 41%
20 pts = 42%.
This is of course speculative, and not consistent with my previous post, so at least one of these estimates is wrong.
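The table can be regenerated with a few lines of Python. This is a sketch of the model as described above, not necessarily the computation actually used: it assumes the score margin is normally distributed around a balanced position with standard deviation 14, and the function name `winrate_shift` is illustrative.

```python
from statistics import NormalDist

SIGMA = 14  # standard deviation of the score margin, as assumed in the post

def winrate_shift(points, sigma=SIGMA):
    """Win-rate change (in %) for a lead of `points`, starting from 50%."""
    return 100 * (NormalDist(0, sigma).cdf(points) - 0.5)

for pts in range(1, 21):
    print(f"{pts} pts = {winrate_shift(pts):.0f}%")
```

Changing `SIGMA` shows how sensitive the whole table is to that one assumption.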
- Knotwilg
- Oza
- Posts: 2432
- Joined: Fri Jan 14, 2011 6:53 am
- Rank: KGS 2d OGS 1d Fox 4d
- GD Posts: 0
- KGS: Artevelde
- OGS: Knotwilg
- Online playing schedule: UTC 18:00 - 22:00
- Location: Ghent, Belgium
- Has thanked: 360 times
- Been thanked: 1021 times
- Contact:
Re: A pro's view on win rates
In Lizzie, the visual aid to LeelaZero, the best candidate is always indicated in cyan and it's the one visited most.
If 10% equates with a 2 point difference, then it means that all analyses that treat percentage differences of less than 5% are bogus. I've been guilty of that.
It also means that bots are probably already close to the hand of God today.
-
Kirby
- Honinbo
- Posts: 9553
- Joined: Wed Feb 24, 2010 6:04 pm
- GD Posts: 0
- KGS: Kirby
- Tygem: 커비라고해
- Has thanked: 1583 times
- Been thanked: 1707 times
Re: A pro's view on win rates
You’ll have big differences depending on the AI you use. E.g. some versions of Elf are much more confident than some versions of Leela. So it’s hard to equate to points.
I agree with the sentiment that AI is well suited to give us ideas primarily, and that number of visits and win rate are less valuable than those ideas.
be immersed
-
Uberdude
- Judan
- Posts: 6727
- Joined: Thu Nov 24, 2011 11:35 am
- Rank: UK 4 dan
- GD Posts: 0
- KGS: Uberdude 4d
- OGS: Uberdude 7d
- Location: Cambridge, UK
- Has thanked: 436 times
- Been thanked: 3718 times
Re: A pro's view on win rates
Knotwilg wrote:
If 10% equates with a 2 point difference, then it means that all analyses that treat percentage differences of less than 5% are bogus. I've been guilty of that.
It also means that bots are probably already close to the hand of God today.
I disagree with both conclusions (though I think the "if" is not true unless you put big "for some bots, for some positions, at some stages of the game" caveats in). The degree of mistake as measured by winrate loss by a bot like LZ is not a nice function that maps in a reproducible and mutually increasing way to a human perception of the size of a mistake. Here's a simple go problem, white to play. I think many 15ks can get this right. LZ 198 thinks the wrong answer is only a -2% mistake. How many points a mistake is this? Is that even a sensible question? I remember some series in Go World where some pro, Ishida Yoshio iirc, was trying to put numerical point values on various mistakes. I recall being struck by how small he thought various obvious dumb kyu mistakes (like this problem) were. We shouldn't all start willy-nilly making mistakes like this that are well within our abilities to not make just because LZ says it's a modest -2% (of course many games are decided by big blunders). What LZ teaches us is that pros routinely make far bigger mistakes than this, as measured by winrate difference. But they are typically much harder positions.
P.S.
- jlt
- Gosei
- Posts: 1786
- Joined: Wed Dec 14, 2016 3:59 am
- GD Posts: 0
- Has thanked: 185 times
- Been thanked: 495 times
Re: A pro's view on win rates
Knotwilg wrote:
In Lizzie, the visual aid to LeelaZero, the best candidate is always indicated in cyan and it's the one visited most.
Not on my (older) version of Lizzie.
Knotwilg wrote:
It also means that bots are probably already close to the hand of God today.
I don't know about Alphago, but if LZ157 after 3000 visits still thinks she has 20% chances of winning despite passing at move 1, and if this estimate reflects actual winrates in selfplay matches, then I guess LZ157 is still at least 3 stones away from perfect play.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: A pro's view on win rates
jlt wrote:
I've run the same experiment with LZ157. After a few thousand visits, the preferred move is H17, however the most visited point is H16. The estimated winrates are:
Position A: 48.8% for H17, 48.5% for H16.
Position B: 51.7% for H17, 50.7% for H16.
We can conclude that a -3% mistake is, according to LZ157, equivalent to a 2 point loss at the early stage of the game.
Another experiment: on an empty board, Black's winrate is 46.5% (so White's winrate is 53.5%). If Black passes, then White's winrate becomes 81.7%. Assuming fair komi is about 6 points, and that passing loses twice the komi, we can conclude that, at the early stage of the game, a -30% mistake is equivalent to a 12 point loss.
I have no idea how to interpolate. Maybe a 10% mistake for LZ157 corresponds to about 5 points?
The relationship between the difference in estimated win rates and the difference in estimated scores can be roughly linear only when the differences are small. Previous research on handicap stones indicates that up to 9 stones handicap the relationship between handicap difference and score difference is roughly linear. So in the above example if Black passes twice instead of once, White gains a score advantage of four times komi. But White's win rate does not become 112%.
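The saturation is easy to see numerically. The sketch below compares a naive straight-line extrapolation of the one-pass result with the normal score model that jlt proposed earlier in the thread. The standard deviation of 14 and the 12 points per pass are that post's assumptions, not anything LeelaZero reports, and the function names are illustrative.

```python
from statistics import NormalDist

SIGMA = 14                # assumed spread of the score margin (jlt's figure)
ONE_PASS = 12             # points lost by one pass, assuming komi of about 6
START = 53.5              # White's win rate (%) on the empty board
PER_PASS = 81.7 - START   # observed gain (%) from one Black pass

def linear(passes):
    """Naive straight-line extrapolation of the one-pass result."""
    return START + PER_PASS * passes

def normal_model(passes):
    """Win rate (%) if the margin is normal and each pass shifts it by 12 points."""
    margin = NormalDist(0, SIGMA)
    offset = margin.inv_cdf(START / 100)   # lead corresponding to 53.5%
    return 100 * margin.cdf(offset + ONE_PASS * passes)

for n in range(4):
    print(n, round(linear(n), 1), round(normal_model(n), 1))
```

The straight line passes 100% at the second pass, while the model's figure keeps rising but stays below 100%.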
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: A pro's view on win rates
Uberdude wrote:
The degree of mistake as measured by winrate loss by a bot like LZ is not a nice function that maps in a reproducible and mutually increasing way to a human perception of the size of a mistake.
Indeed.
Uberdude wrote:
Here's a simple go problem, white to play. I think many 15ks can get this right.
Simple? Yes, there is a basic principle that leads to the correct answer. But go is hard.
Uberdude wrote:
LZ 198 thinks the wrong answer is only a -2% mistake.
Well within LZ's margin of error. But I agree that humans can reliably order plays, even when the differences are slight.
Let's compare combs. (From an old Brylcreem commercial.)
Without a White stone on the left side, White's wall is not efficient. Besides, it has flaws.
OC, there is a sente/gote difference here, so Black is better off in the corner than in the previous diagram, but White has sente.
However, is White really better off in this diagram? The AlphaGo teaching tool thinks that
- Knotwilg
- Oza
- Posts: 2432
- Joined: Fri Jan 14, 2011 6:53 am
- Rank: KGS 2d OGS 1d Fox 4d
- GD Posts: 0
- KGS: Artevelde
- OGS: Knotwilg
- Online playing schedule: UTC 18:00 - 22:00
- Location: Ghent, Belgium
- Has thanked: 360 times
- Been thanked: 1021 times
- Contact:
Re: A pro's view on win rates
Uberdude wrote:
15 kyus can surely reproduce the answer. However, do they really know why? And do we?
Even pros have been able to reproduce common knowledge without realizing some of it was wrong, as suggested by this higher form of intelligence.
-
Ian Butler
- Lives in gote
- Posts: 646
- Joined: Thu Dec 29, 2016 4:09 pm
- GD Posts: 0
- Has thanked: 62 times
- Been thanked: 116 times
Re: A pro's view on win rates
Knotwilg wrote:
Uberdude wrote:
15 kyus can surely reproduce the answer. However do they really know why? And do we?
Even pros have been able to reproduce common knowledge without realizing some of them were wrong as suggested by this higher form of intelligence.
IS it a higher form of intelligence, though?
It's certainly faster, reading out tens of thousands of moves in mere seconds.
Yet a calculator does calculations way faster than I do. But does it understand these calculations better?
But I know what you mean
- Knotwilg
- Oza
- Posts: 2432
- Joined: Fri Jan 14, 2011 6:53 am
- Rank: KGS 2d OGS 1d Fox 4d
- GD Posts: 0
- KGS: Artevelde
- OGS: Knotwilg
- Online playing schedule: UTC 18:00 - 22:00
- Location: Ghent, Belgium
- Has thanked: 360 times
- Been thanked: 1021 times
- Contact:
Re: A pro's view on win rates
Well, we all "know" that this is "wrong" because White needs a base but it's Black's sente. We've been taught that. We don't think it's particularly useful for White to confine Black into the corner this way.
We "know" this is "right", because White has now sente and the exchange of influence for territory is fair. The fact that Black not only has territory but is also out, didn't affect our judgment. Incidentally ...
... we also knew this was "wrong" because it left bad aji for White at 'a'. I for one had always a bad feeling about this knowledge because I experienced the actual aji at 'a' in the previous diagram. But who was I to question pro knowledge, pouring down on me?
Well, it turns out that what we thought we knew was wrong: this diagram is what the bots like more, probably because this aji is less problematic. I should have trusted my experience, not what I was told.
And while we're at it, the bots don't think the first diagram is more problematic than the second.
But do the bots really understand? Probably not. But I'm not so sure about humans either.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: A pro's view on win rates
Ian Butler wrote:
IS it a higher form of intelligence, though?
Wait until the bots shut down coal-burning power plants and manipulate the stock markets to crash the stocks of oil companies to slow down global warming. Then we'll know.