Bill Spight wrote: Given a method of evaluation that has a probabilistic semantics, such as the percentage of correct answers on a test, or percentage of wins in a contest
jann wrote: The percentage of correct answers is an exact, factual data (just like the percentage of various board scores). The percentage of wins (given those board scores) depends on an arbitrary parameter "komi".

Well, komi is not arbitrary.
KGS Ranking adjustment?
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: KGS Ranking adjustment?
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
Bill Spight
Re: KGS Ranking adjustment?
jlt wrote: @jann: normally, if a strong 1 dan plays against a weak 1 dan, then his winrate with komi 0.5 will be just a bit lower than his winrate with komi 7.5, so for the rating system to be fair, he should be awarded a little more points for a victory with komi 0.5 than for a victory with komi 7.5. How much is "a little more points" is not easy to determine, this has to be calculated using experimental data (see the link in the first post by gennan).

In the NM system, the average difference between strong shodan and weak shodan is defined by a winrate of 50% with ½ pt. komi. We could even get more fine-grained and distinguish between levels according to exact komi. The problem with that, as I see it, is that skill at go is a vector; it is only average skill that is reducible to a number. Trying for too much precision is self-defeating.
Re: KGS Ranking adjustment?
jlt wrote: @jann: normally, if a strong 1 dan plays against a weak 1 dan, then his winrate with komi 0.5 will be just a bit lower than his winrate with komi 7.5, so for the rating system to be fair, he should be awarded a little more points for a victory with komi 0.5 than for a victory with komi 7.5

Sure, but as I wrote above, this strongly depends on the assumption that result range -1 to -6 is not over- or under-represented. This assumption can be wrong (especially if the sample count is low), and these samples carry less info than others (can be seen as both wins or losses).
And in any case this means the rating will carry more random variance, since normally only the players' performance varies, while for "h1"/even the rating will vary and depend on an external factor outside of the player's board performance.
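The "-1 to -6" band can be made concrete with a small sketch. It assumes board margins are counted from White's side before komi (a sign convention the thread does not spell out), and the sample margins are invented for illustration only. The function names are hypothetical.

```python
def white_wins(board_margin: int, komi: float) -> bool:
    """True if White wins once komi is added to White's board margin."""
    return board_margin + komi > 0


def flipped_results(margins, komi_a, komi_b):
    """Board margins whose winner differs between the two komi values."""
    return [m for m in margins
            if white_wins(m, komi_a) != white_wins(m, komi_b)]


# Invented sample of White's board margins (negative: Black is ahead on the board).
sample_margins = [-12, -6, -4, -3, -1, 2, 9]

# Games that count as a win under one komi and as a loss under the other.
flipped = flipped_results(sample_margins, komi_a=7.5, komi_b=0.5)
print(flipped)
```

Every game in `flipped` tells the rating system something different depending on which komi was in force, which is the ambiguity being argued about.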
-
Bill Spight
Re: KGS Ranking adjustment?
jann wrote:And in any case this means the rating will carry more random variance, since normally only the players' performance varies, while for "h1"/even the rating will vary and depend on an external factor outside of the player's board performance.
Such as whether the player takes Black or White at chess?
-
gennan
- Lives in gote
- Posts: 497
- Joined: Fri Sep 22, 2017 2:08 am
- Rank: EGF 3d
- GD Posts: 0
- Universal go server handle: gennan
- Location: Netherlands
- Has thanked: 273 times
- Been thanked: 147 times
Re: KGS Ranking adjustment?
jann wrote: Sure, but as I wrote above, this strongly depends on the assumption that result range -1 to -6 is not over- or under-represented. This assumption can be wrong (especially if the sample count is low), and these samples carry less info than others (can be seen as both wins or losses).
And in any case this means the rating will carry more random variance, since normally only the players' performance varies, while for "h1"/even the rating will vary and depend on an external factor outside of the player's board performance.

What kind of mechanism would cause a sharp probability drop between 6 points win on the board and 7 point win on the board? Without any special mechanism, the probability would drop smoothly. Is this some specially rigged version of go, like a game on a 2x3 board with area scoring and mirror plays forbidden?
- jlt
- Gosei
- Posts: 1786
- Joined: Wed Dec 14, 2016 3:59 am
- GD Posts: 0
- Has thanked: 185 times
- Been thanked: 495 times
Re: KGS Ranking adjustment?
We could imagine a player A, rated 1.75 dan, and a player B, rated 1.45 dan, so that on average, if A takes white, then A's score is (say) -3 points + komi on average, but we don't know anything about the standard deviation. If A's playing style is such that the standard deviation is very small, then A will lose too many games with komi 0 compared to what A's and B's Elo rating would predict. On the other hand, A also sometimes takes black with komi 0 against player C, rated 2.05 dan, and A wins too many games against C, so that the two effects hopefully cancel out.
And if they don't, then A will be demoted to 1.49 dan at some point, and will win against B again, so it's not really a big deal.
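A rough sketch of why the standard deviation matters in this scenario. It assumes A's final margin is normally distributed, which nobody in the thread claims, and it ignores jigo. The mean of -3 is the figure from the post (komi 0); the standard deviations are hypothetical.

```python
import math


def win_probability(mean_margin: float, std_dev: float) -> float:
    """P(final margin > 0) for a normal(mean_margin, std_dev) margin."""
    z = mean_margin / std_dev
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# A takes White with komi 0, so A's average final margin is -3 points.
for sd in (1.0, 5.0, 20.0):  # hypothetical standard deviations
    print(f"sd={sd:>4}: P(A wins) = {win_probability(-3.0, sd):.3f}")
```

Under these assumptions a very narrow spread pushes A's win rate toward zero, while a wide spread keeps it only a little below 50%.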
Re: KGS Ranking adjustment?
gennan wrote: What kind of mechanism would cause a sharp probability drop between 6 points win on the board and 7 point win on the board? Without any special mechanism, the probability would drop smoothly.

Sure, in the long run (and as I wrote if the ratings are adjusted frequently there is no real problem).
But suppose that you are only allowed to observe 10 games of a new player, and need to make the best guess of his strength. With 10 even games I'm confident that I can make a decent guess regardless of the outcomes (even if he happens to lose or win them all).
But with 10 such half-stone "H1" games, if most of them happen to fall in the -1 to -6 range (not too hard to imagine), I cannot even guess whether he is stronger or weaker than his opponents.
-
gennan
Re: KGS Ranking adjustment?
jlt wrote: We could imagine a player A, rated 1.75 dan, and a player B, rated 1.45 dan, so that on average, if A takes white, then A's score is (say) -3 points + komi on average, but we don't know anything about the standard deviation. If A's playing style is such that the standard deviation is very small, then A will lose too many games with komi 0 compared to what A's and B's Elo rating would predict. On the other hand, A also sometimes takes black with komi 0 against player C, rated 2.05 dan, and A wins too many games against C, so that the two effects hopefully cancel out.
And if they don't, then A will be demoted to 1.49 dan at some point, and will win against B again, so it's not really a big deal.

I estimate an average human 1d player would lose about 150 points over a whole game (compared to perfect play). You are hypothesizing what would happen when the probability distribution of his total error has a very sharp peak, so that the standard deviation is as small as 1 point. My question is: Why?
I'm pretty sure that such players don't exist in real life (unless you build an AI that has exactly this behaviour), so I fail to see the point of this scenario. IMO it has little to do with real world statistics.
Last edited by gennan on Mon Jan 27, 2020 12:25 pm, edited 1 time in total.
-
gennan
Re: KGS Ranking adjustment?
jlt wrote: @gennan: I agree with you, I was trying to interpret jann's messages.

Oh, I'm sorry. I missed that it was you instead of jann.
-
gennan
Re: KGS Ranking adjustment?
gennan wrote: What kind of mechanism would cause a sharp probability drop between 6 points win on the board and 7 point win on the board? Without any special mechanism, the probability would drop smoothly.
jann wrote: Sure, in the long run (and as I wrote if the ratings are adjusted frequently there is no real problem).
But suppose that you are only allowed to observe 10 games of a new player, and need to make the best guess of his strength. With 10 even games I'm confident that I can make a decent guess regardless of the outcomes (even if he happens to lose or win them all).
But with 10 such half-stone "H1" games, if most of them happen to fall in the -1 to -6 range (not too hard to imagine), I cannot even guess whether he is stronger or weaker than his opponents.

Well yes, 10 games is not a lot of data if you only look at the win/loss ratio. And when the winrate is close to 100% or 0%, the information content of the data is even lower.
You could extract more data by analyzing the quality of play in those 10 games. Then you have much more data (something like 1200 moves to analyze). I suppose an AI could do that for you. For example, you could extract an error probability distribution of his moves using KataGo and compare it with error probability distributions of typical players of known ranks.
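One way to put a number on the remark about win rates near 0% or 100% is the Shannon entropy of a single win/loss outcome. Choosing entropy as the measure is an assumption made here for illustration, not something stated in the thread, and the function name is hypothetical.

```python
import math


def outcome_entropy_bits(win_rate: float) -> float:
    """Shannon entropy, in bits, of one win/loss result with this win rate."""
    if win_rate <= 0.0 or win_rate >= 1.0:
        return 0.0  # a certain outcome carries no information
    loss_rate = 1.0 - win_rate
    return -(win_rate * math.log2(win_rate) + loss_rate * math.log2(loss_rate))


for p in (0.5, 0.75, 0.9, 0.99):
    print(f"win rate {p:.2f}: {outcome_entropy_bits(p):.3f} bits per game")
```

A result at a 50% win rate carries a full bit, and the figure shrinks steadily as the win rate approaches either extreme.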
Re: KGS Ranking adjustment?
gennan wrote: Well yes, 10 games is not a lot of data if you only look at the win/loss ratio. And when the winrate is close to 100% or 0%, the information content of the data is even lower.

My point was the relative value: 10 even games are a decent amount of data even for 0% or 100%, but 10 "H1" games can be much less informative.
-
Javaness2
- Gosei
- Posts: 1545
- Joined: Tue Jul 19, 2011 10:48 am
- GD Posts: 0
- Has thanked: 111 times
- Been thanked: 322 times
Re: KGS Ranking adjustment?
In the EGF system, komi can be 1 point for an even game. That is entirely valid.
-
Bill Spight
Re: KGS Ranking adjustment?
jann wrote: But suppose that you are only allowed to observe 10 games of a new player, and need to make the best guess of his strength. With 10 even games I'm confident that I can make a decent guess regardless of the outcomes (even if he happens to lose or win them all).
But with 10 such half-stone "H1" games, if most of them happen to fall in the -1 to -6 range (not too hard to imagine), I cannot even guess whether he is stronger or weaker than his opponents.

And your point is?
-
gennan
Re: KGS Ranking adjustment?
Rereading jann's posts, I might make a guess about jann's point:
Handicaps provide fairly large increments of advantage/disadvantage. He worries that with small sample sizes and/or players with very small standard deviations, statistical anomalies could hide in those large increments.
My suggestion to handle this issue: If you worry about this, you could use komi in addition to handicap stones. Then you can make the handicap increments as fine-grained as half-point increments.
Use these as match conditions: if a player loses a game, you change the komi by 0.5 points in his favour. If the komi goes below 0 or reaches 14 points, you add or remove a handicap stone accordingly. If it's a jigo, you keep the same handicap.
Over the course of a match, the handicap + komi should gravitate to some specific value which would give a pretty good indication of the rank gap between these players.
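The match conditions above could be sketched roughly as follows. Several details are assumptions not given in the post: komi is counted as points for White, handicap stones belong to Black, one stone is traded for exactly 14 points of komi when the boundary is crossed, and a jigo leaves both handicap and komi unchanged. All names are hypothetical.

```python
from dataclasses import dataclass

STONE_VALUE = 14.0  # assumed komi equivalent of one handicap stone
KOMI_STEP = 0.5     # adjustment after each decided game


@dataclass(frozen=True)
class Conditions:
    handicap: int  # stones given to Black
    komi: float    # points given to White


def next_conditions(cond: Conditions, result: str) -> Conditions:
    """Conditions for the next game; result is 'B', 'W' or 'jigo'."""
    if result == "jigo":
        return cond
    if result == "W":    # Black lost, so shift komi in Black's favour
        komi = cond.komi - KOMI_STEP
    elif result == "B":  # White lost, so shift komi in White's favour
        komi = cond.komi + KOMI_STEP
    else:
        raise ValueError(f"unknown result: {result!r}")

    handicap = cond.handicap
    if komi < 0:                 # komi went below 0: add a stone
        handicap += 1
        komi += STONE_VALUE
    elif komi >= STONE_VALUE:    # komi reached 14: remove a stone
        handicap -= 1
        komi -= STONE_VALUE
    # A negative handicap would mean the colours should swap; not handled here.
    return Conditions(handicap, komi)


cond = Conditions(handicap=0, komi=0.5)
for result in ("W", "W", "B", "jigo"):
    cond = next_conditions(cond, result)
    print(result, cond)
```

Feeding a long sequence of results through `next_conditions` shows where handicap plus komi settles for a given pair of players.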