KGS ranking revisited

hyperpape · Post by **hyperpape** » Thu May 10, 2012 2:20 pm

Robert, like it or not, he's arguing about a reasonably well-defined concept. The fact that you don't like using the word "consistent" in naming that concept (which could naturally be named "momentary internal consistency") is pretty much irrelevant.

RobertJasiek · Post by **RobertJasiek** » Thu May 10, 2012 10:34 pm

One should not call an inconsistent system consistent.

daal · Post by **daal** » Thu May 10, 2012 10:55 pm

RobertJasiek wrote:
witwit wrote:The most important thing for a ranking system is to be internally consistent
A system with sudden shifts IS inconsistent.

But he said internally consistent. If everyone's rank is affected in the same way by the recalibration, the system will be just as good at predicting an even match as it was before.
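A minimal sketch of that point, assuming a model where the predicted result depends only on the rating difference (a generic logistic curve here, not the actual KGS formula). The player names, ratings, offset and `scale` parameter are all hypothetical.

```python
import math

def win_probability(rating_a, rating_b, scale=1.0):
    """Probability that A beats B under a logistic model of the rating gap.

    Assumption: only the difference rating_a - rating_b matters.
    """
    return 1.0 / (1.0 + math.exp(-(rating_a - rating_b) / scale))

def shift_all(ratings, offset):
    """Apply the same recalibration offset to every player."""
    return {name: rating + offset for name, rating in ratings.items()}

# Hypothetical ratings on an arbitrary scale (higher = stronger).
ratings = {"alice": 3.2, "bob": 2.5, "carol": 4.1}
shifted = shift_all(ratings, 0.7)

before = win_probability(ratings["alice"], ratings["bob"])
after = win_probability(shifted["alice"], shifted["bob"])
print(before, after)  # the two predictions agree up to rounding
```

Under this assumption a uniform bump changes the labels but not the predictions. A recalibration that moves some players more than others would not have this property.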

RobertJasiek · Post by **RobertJasiek** » Fri May 11, 2012 12:04 am

A rating system moving too many players upwards is internally inconsistent because a rating system shall be able to distinguish players instead of creating heaps of fake strong subpopulations.

SpongeBob · Post by **SpongeBob** » Fri May 11, 2012 3:13 am

RobertJasiek wrote:... heaps of fake strong subpopulations.

That sounds scary ...

hyperpape · Post by **hyperpape** » Fri May 11, 2012 4:18 am

RobertJasiek wrote:One should not call an inconsistent system consistent.

You're begging the question! As I pointed out, a good descriptive name for the system he wants is "momentary internal consistency".

jts · Post by **jts** » Fri May 11, 2012 9:51 am

hyperpape wrote:
RobertJasiek wrote:One should not call an inconsistent system consistent.
You're begging the question! As I pointed out, a good descriptive name for the system he wants is "momentary internal consistency".

You could be less obtuse, Robert, if instead you said "One should not call an erratic system consistent." That would make it clear that you are, indeed, making a synthetic claim and would point the way in two directions - first, you could clarify the extent to which the KGS system really is erratic, and second, you could clarify why, if we want a consistent system, we should make it less erratic as well.

The problem with inconsistency, as I see it, is that it leads to situations that violate assumed transitive properties. For example, if F always beats W, W always beats H, and H always beats F, it's difficult to apply any meaningful ranking to the {H, F, W} triad.

However, if a ranking system suddenly shifts, that merely means that it has adapted to new information. If it shifts a lot, that either means it gets lots of new information frequently, or it is very sensitive to what little it does get.
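The triad example above can be checked mechanically. This is only an illustrative sketch: it tries every possible strength ordering of H, F and W and counts how many of the three results each ordering gets wrong. The `upsets` helper is a made-up name for this post.

```python
from itertools import permutations

# (winner, loser) pairs from the example: F beats W, W beats H, H beats F.
results = [("F", "W"), ("W", "H"), ("H", "F")]

def upsets(order, results):
    """Count results where the lower-ranked player won.

    `order` lists the players from strongest to weakest.
    """
    position = {player: index for index, player in enumerate(order)}
    return sum(1 for winner, loser in results if position[winner] > position[loser])

# Best any strict ordering of the three players can do.
best = min(upsets(order, results) for order in permutations("HFW"))
print(best)
```

No ordering explains all three results, because the results form a cycle. A sudden shift in ranks is a different matter: it changes which ordering the system reports, not whether a consistent ordering exists.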

RobertJasiek · Post by **RobertJasiek** » Fri May 11, 2012 10:50 am

jts wrote:if instead you said "One should not call an erratic system consistent."

This is a more general requirement. Fine.

jts wrote:if a ranking system suddenly shifts, that merely means that it has adapted to new information.

So the system outputs for each player: "I have adapted to new information! (I do not tell you which information, nor which adaption.)" ;)

IOW, not each adaption is good.

jts · Post by **jts** » Fri May 11, 2012 11:28 am

RobertJasiek wrote:
jts wrote:if instead you said "One should not call an erratic system consistent."
This is a more general requirement. Fine.

if a ranking system suddenly shifts, that merely means that it has adapted to new information.
So the system outputs for each player: "I have adapted to new information! (I do not tell you which information, nor which adaption.)"

IOW, not each adaption is good.

So your objection is not that it's erratic per se, but that wms will not share with you the information that goes into the daily iteration?

emeraldemon · Post by **emeraldemon** » Fri May 11, 2012 11:50 am

It seems to me that the ideal rating & handicapping system would strive to handicap every match to a 50% win rate. If a player's win rate is much higher or lower than 50%, that player is not being well served by the system. (we do occasionally get threads complaining about this, usually "I've won 10 games in a row and my rank hasn't gone up!")

If this is the metric we want to use, it's very easy to check the error: look at the average win rate of every player over an appreciable number of games, and find the average distance from 50%. There was a competition a while back looking for improvements to Elo that used basically this metric on historical chess data, I believe.
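A rough sketch of that check, assuming the game records are available as (winner, loser) pairs from handicapped games. The function name, the `min_games` cutoff and the sample data are hypothetical.

```python
from collections import defaultdict

def mean_distance_from_even(games, min_games=20):
    """Average of |win rate - 0.5| over players with at least `min_games` games.

    `games` is an iterable of (winner, loser) pairs.
    Returns None if no player has enough games.
    """
    wins = defaultdict(int)
    played = defaultdict(int)
    for winner, loser in games:
        wins[winner] += 1
        played[winner] += 1
        played[loser] += 1

    distances = [
        abs(wins[player] / count - 0.5)
        for player, count in played.items()
        if count >= min_games
    ]
    if not distances:
        return None
    return sum(distances) / len(distances)

# Tiny made-up sample: "a" wins 3 of 4 games against "b".
games = [("a", "b")] * 3 + [("b", "a")]
print(mean_distance_from_even(games, min_games=4))
```

A lower value means the handicapping is closer to even on average. The cutoff matters: with few games per player, win rates far from 50% happen by chance alone.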

jts · Post by **jts** » Fri May 11, 2012 11:55 am

emeraldemon wrote:It seems to me that the ideal rating & handicapping system would strive to handicap every match to a 50% win rate.

This isn't quite right though, as the ratings are continuous, even though the ranks are discrete. So between a 3.9k and a 3.0k we might expect the stronger to win 2/3 of the games, even though from the perspective of the stronger player he may feel frustration that he wins 2/3 of his games and never seems to rank up.
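To make the arithmetic concrete, here is a sketch that assumes a logistic win curve and picks its steepness `k` so that a 0.9-rank gap gives exactly the 2/3 figure used above. The value of `k` is derived from that figure, not taken from KGS, and `displayed_rank` is a hypothetical helper that simply truncates the rating.

```python
import math

def win_probability(gap, k):
    """Chance that the stronger player wins, given the rating gap in ranks."""
    return 1.0 / (1.0 + math.exp(-k * gap))

def displayed_rank(rating_kyu):
    """Hypothetical display rule: drop the fraction, so 3.0 and 3.9 both show as 3k."""
    return f"{int(rating_kyu)}k"

# Solve 1 / (1 + exp(-k * 0.9)) = 2/3 for k, which gives k = ln(2) / 0.9.
k = math.log(2) / 0.9

stronger, weaker = 3.0, 3.9  # kyu ratings: a lower number is stronger
p = win_probability(weaker - stronger, k)
print(displayed_rank(stronger), displayed_rank(weaker), round(p, 3))
```

Both players show the same rank, so the game is even, yet the model expects the stronger one to win about two games in three.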

hyperpape · Post by **hyperpape** » Fri May 11, 2012 12:28 pm

One adaptation is to use all the variations of komi between 6.5 and 0.5 as appropriate. Of course this doesn't remove the problem entirely.

wms · Post by **wms** » Fri May 11, 2012 1:58 pm

emeraldemon wrote:...There was a competition a while back looking for improvements to Elo that used basically this metric on historical chess data, I believe.

A year or two ago somebody surveyed various rank algorithms applied to go. He used the KGS algorithm (I'd given him what he needed to recreate it exactly), Elo, a couple modern systems (Glicko I think was one?), and his own system. He then used ability to predict game outcomes as his metric of how good a system was. I was happy to hear that the KGS system placed second in his study, behind his own, but ahead of Elo and Glicko. But his system did not consider predictability of rank changes; so KGS' penchant for changing your rank when you don't play, or for its occasional bumps where everybody goes up or down together, did not count against it.

I'm terrible with names but probably somebody here on 19x19 will remember who did the study and where the results are.
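For readers wondering what "ability to predict game outcomes" can look like in practice, here is a minimal sketch using plain Elo as the system under test and mean log-loss as the score. This is not the method of the study mentioned above, whose details are not given here. The K-factor, starting rating and game list are assumptions for illustration.

```python
import math

def elo_expected(r_a, r_b):
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def evaluate_elo(games, k=24.0, initial=1500.0):
    """Score Elo by predicting each game before updating on its result.

    `games` is a chronological list of (winner, loser) pairs.
    Returns (mean log-loss, final ratings); lower log-loss is better.
    """
    ratings = {}
    total_loss = 0.0
    for winner, loser in games:
        r_w = ratings.get(winner, initial)
        r_l = ratings.get(loser, initial)
        p = elo_expected(r_w, r_l)   # probability given to what actually happened
        total_loss += -math.log(p)
        ratings[winner] = r_w + k * (1.0 - p)
        ratings[loser] = r_l - k * (1.0 - p)
    return total_loss / len(games), ratings

# Made-up results between three hypothetical players.
games = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]
loss, ratings = evaluate_elo(games)
print(round(loss, 4), ratings)
```

A score like this rewards only prediction quality. As noted above, it says nothing about how predictable or stable the rank changes feel to the players.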

yoyoma · Post by **yoyoma** » Fri May 11, 2012 2:00 pm

wms wrote:
emeraldemon wrote:...There was a competition a while back looking for improvements to Elo that used basically this metric on historical chess data, I believe.
A year or two ago somebody surveyed various rank algorithms applied to go. He used the KGS algorithm (I'd given him what he needed to recreate it exactly), Elo, a couple modern systems (Glicko I think was one?), and his own system. He then used ability to predict game outcomes as his metric of how good a system was. I was happy to hear that the KGS system placed second in his study, behind his own, but ahead of Elo and Glicko. But his system did not consider predictability of rank changes; so KGS' penchant for changing your rank when you don't play, or for its occasional bumps where everybody goes up or down together, did not count against it.

I'm terrible with names but probably somebody here on 19x19 will remember who did the study and where the results are.

http://remi.coulom.free.fr/WHR/

emeraldemon · Post by **emeraldemon** » Fri May 11, 2012 7:47 pm

Thanks for the link. wms, did the results of that study make you consider trying his algorithm?
