Re: KGS ranking revisited

Posted: Tue May 15, 2012 9:58 pm
by jts
Tami wrote:
jts wrote:Just out of curiosity, if you flipped a coin 600 times and, during those 600 flips, got a sequence of 12 consecutive flips of which (in any order) 9 were heads and 3 were tails, would you believe that the odds of flipping heads had changed during those 12 flips?
The fault with this analogy is that people change, coins don't.
Coins are change! (Sorry, it was either that or something about flipping back and forth.)

Well, that just makes one half of the analogy easier to understand than the other half. We know coins don't (usually) change, so it's easy to contemplate with equanimity observing 12 unusually lucky flips (I always prefer heads, myself) and attributing this to chance rather than to a short period of heroic numismatic self-overcoming.
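For anyone who wants to check the coin question numerically, here is a minimal Python sketch. It assumes a fair coin and reads "9 heads and 3 tails" as exactly 9 heads somewhere in a window of 12 consecutive flips; the function names, the trial count and the seed are illustrative choices, not anything from the thread.

```python
import random


def has_streak_window(flips, window=12, heads=9):
    """Return True if any `window` consecutive flips contain exactly `heads` heads.

    `flips` is a sequence of 0 (tails) and 1 (heads).
    """
    if len(flips) < window:
        return False
    count = sum(flips[:window])  # heads in the first window
    if count == heads:
        return True
    # Slide the window one flip at a time, updating the running count.
    for i in range(window, len(flips)):
        count += flips[i] - flips[i - window]
        if count == heads:
            return True
    return False


def estimate_probability(n_flips=600, window=12, heads=9, trials=1000, seed=0):
    """Monte Carlo estimate of how often a fair coin produces such a window."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        flips = [rng.randint(0, 1) for _ in range(n_flips)]
        if has_streak_window(flips, window, heads):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Estimated probability: {estimate_probability():.3f}")
```

Under these assumptions the estimate comes out well above 90%, so a run like that is something to expect in 600 flips rather than a sign that the coin changed.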

On the other hand, if you watch a basketball player make 600 3pt shots, and he makes 45% of them, it's very tempting to believe that there was a brief period of 12 shots when he had "hot hands" and that then he lost his nerve. Humans change, after all. Or, if a fund manager generally does slightly worse than the market over the course of 600 months, but within that period there were twelve consecutive months when he finally figured out the right system, but then everyone else caught on.

If you think you have evidence that a human has changed, you should see whether that evidence helps you predict the future, no?

Re: KGS ranking revisited

Posted: Tue May 15, 2012 10:41 pm
by Tami
jts wrote:If you think you have evidence that a human has changed, you should see whether that evidence helps you predict the future, no?
I agree.

It's not that KGS ranks never change, it's just that they change very slowly, which is frustrating for me and many others. I'm sure anybody who maintained a high winning percentage over a year would gain promotion. But surely people can improve on a monthly basis, if not faster? Why can they not be rewarded sooner for their efforts?

Where do you draw the line anyway? How about making your ranking history stretch over 2 years to make it even more stable? Or how about 5 or 20 years? Should a player demonstrate a new level of strength or weakness over 1 week, 1 month, 1 year, or a decade before that player receives their promotion or demotion?

Speaking for myself, the KGS system does not deliver a 50-50 ratio. I lose more than I win, as it happens. Neither does it give me predominantly close games, even though the system was purportedly designed to. Take Himiko's most active recent month, March, for instance, and you will see there was only one game that did not end in resignation or a large margin.

I'm sure I cannot win this argument. I don't have the background in maths or statistics necessary to support my views. But my subjective impression is that the system is too rigid, and I know that this is not a unique impression. Psychology is important, and providing a more easily comprehensible system might not necessarily satisfy some people, but it would probably be more enjoyable for most to know that when they are playing well, they can receive some sort of reward for it instead of feeling that they are permanently condemned to stay at a certain level.

Up to now, KGS has not had a serious rival. Now it does. Nobody can force KGS to change its systems, but it will certainly be interesting to see how people take to Kaya's rating system when Kaya goes public. I wouldn't be surprised if there were far fewer complaints about it.

Re: KGS ranking revisited

Posted: Tue May 15, 2012 11:12 pm
by RobertJasiek
Mef wrote:as I understand it, your rank only gets heavy when no one else is watching.
No, it is almost always heavy, with these exceptions:

1) I have hardly played at all for about 3+ months. (Rare.)

2) On a very few days, a great winning percentage actually results in a significant increment. (The contrary is much more frequent: one or two bad nights almost invariably lead to a significant decrement.)

Re: KGS ranking revisited

Posted: Tue May 15, 2012 11:24 pm
by RobertJasiek
lemmata wrote:these numbers are just for fun
No, because it is no fun having to play too great a percentage of mismatched opponents.

Re: KGS ranking revisited

Posted: Wed May 16, 2012 12:01 am
by RobertJasiek
jts wrote:Mef, who did an excellent job on his analysis of supposedly "heavy" rankings.
April 2011:
47% but dramatic decrement. Was it one of the server adjustments? I do not recall.

May 2011:
61% but the rating dropped.

June 2011:
62% but the rating dropped.

July 2011:
62% and the rating went up dramatically. Not over a period but very suddenly.

Conclusion for April to July 2011:
- Great frustration: three months of rating development contrary to performance during that period.
- More great frustration at the sudden dramatic jump upwards in July. A jump totally inconsistent with the performance May to July. It would have made much more sense if the rating had improved rather steadily during these three months.

August 2011:
51% but the rating goes upwards. Why has it not gone upwards more, earlier during May to July, when my winning percentages were significantly higher than in August?! This is frustrating again; the rating development defies performance.

September to November 2011:
44%, 39%, 45% but the rating does not drop; instead the rating remains constant. This is frustrating. How can one be motivated to win more while even significantly below average results leave the rating constant?

December 2011:
71%, ok only 17 games, so not much data.

January 2012:
40%, 300 games. For the first time(!), the rating develops as it should: it drops. That it drops only ca. 15% of a rank shows just how very heavy rating changes are! This is frustrating again because 1) it promises just how difficult it will be to increase by 15% of a rank and 2) watching some other players with rank changes by 2+ ranks due to just a few played games confirms that playing too many games is punished.

February + March 2012:
63%, 56 games. 75%, 12 games. This is one of the rare cases where the rating moves up dramatically. More great frustration! Frustration because I have no idea whatsoever why KGS suddenly, exceptionally, fulfils the player's dream. January with 40%, 300 games would have suggested something very different. The percentages from August to January also would have suggested something very different. Furthermore, the comparison with 62%, average 149 games during May to July 2011, with almost thrice as many games as February 2012, would have suggested that the rating development in February + March 2012 could at most have the increment of May to July 2011; but now the rating increment is greater. Great frustration: three reasons why the new increment did not make sense.

April 2012:
2 games, constant rating. This is only the second month of 13 months with a reasonable rating development. Very sad: When I do not play, my rating development makes much more sense than when I play.

May 2012:
The greatest frustration of all: a manual server shift drops the rating by ca 35% of a rank.

Overall conclusion:
For by far most of the time, the rating developments create frustration instead of making sense.

Note:
Remarks like "the system is good enough" while it is most obviously bad multiply that frustration.

Re: KGS ranking revisited

Posted: Wed May 16, 2012 1:53 am
by HermanHiddema
Tami wrote:Mef, I resent being used for that kind of analysis. Couldn't you at least have chosen some other player and WITHHELD their name? Doing what you did has a definite flavour of ad hominem about it, as did another poster's allusion to "illusory superiority".
Well, that other poster would be me then.

And I'm sorry you feel that way. My post was in reply to one by Kaya.gs about the psychology of rating systems. It was not intended to apply to anyone personally, and it also was not meant to apply specifically to the current KGS rating system, or to people's experiences with that. It was just an explanation of, and link to a Wikipedia article about, a well known psychological phenomenon that affects all people in general, and is therefore relevant to all rating systems. It never even crossed my mind that you or anyone else would consider it a personal attack.

I have no opinion, in fact, on whether or not the current parameters for the KGS rating system are optimal. I hardly ever play online anyway. I'm just remarking that it is impossible to please everyone, and that there is, IMO, no possible set of parameters for the KGS rating system that will not result in complaints.

Re: KGS ranking revisited

Posted: Wed May 16, 2012 2:34 am
by RobertJasiek
HermanHiddema wrote:it is impossible to please everyone
It is, however, possible to please a much greater percentage of players.

Re: KGS ranking revisited

Posted: Wed May 16, 2012 4:18 am
by Mef
averell wrote: You act like there are no "system anomalies". Even if a system like KGS is the best available (or reasonably possible) doesn't mean that it's not crap in a lot of ways. For example fake short-lived accounts are a reality. Also the assumption "a constant amount of games played over time" is often not realistic (christmas, other holidays). And "heavyness" of accounts even if a lot of the time is only frustration/bias actually admittedly exists and we're only arguing about how bad it actually is (fast improving people / general population).

On the contrary, I know there are anomalies, I know every rating system will have its quirks, and I've frequently said on both this forum and other places that individual player strength fluctuations due to natural variation will outweigh most of the quirks of any somewhat reasonable rating system.

What I dislike is the notion that we need a process of "I am frustrated and think there is a problem with X; we need to make modification Z" -> "Ok! Let's immediately change the system and add Z!"

What I prefer is "I am frustrated and think there is a problem with X" -> "Ok, let's verify phenomenon X actually occurs. If there is an issue with X, let's identify the extent to which it is a problem and who is affected by it. Once we know who is affected and how they are affected, let's come up with a reasonable mitigation measure that returns us to the expected system behavior."

KGS's rating system is a mathematical model applied to a large dataset; if there are issues, they should manifest themselves in the data. At the very least, a problem should be demonstrable with an example or case study, if not with a large amount of data from a group. As mentioned upthread by others, there are natural biases that will pop up in any and all of our perceptions, and in order to get an honest assessment of what's happening vs. what should be happening, we should be able to come up with objectively measurable criteria that we can go out and test to confirm or disprove our suspicions.
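As one example of an objectively measurable criterion, a monthly win count can be compared with what chance alone would produce. The Python sketch below is only an illustration: it assumes every game is an independent 50-50 event (an assumption that does not hold near the edge of a rank), and it takes two monthly figures quoted earlier in the thread, 40% over 300 games and 71% over 17 games, rounded to 120 and 12 wins. The function name is hypothetical.

```python
from math import comb


def binomial_tail(wins, games, p=0.5):
    """Chance of a result at least as extreme as `wins`, on the same side of
    the expected win count, if each game is won independently with probability p.
    """
    expected = games * p
    if wins >= expected:
        outcomes = range(wins, games + 1)   # this many wins or more
    else:
        outcomes = range(0, wins + 1)       # this many wins or fewer
    return sum(comb(games, k) * p**k * (1 - p) ** (games - k) for k in outcomes)


if __name__ == "__main__":
    # Win counts below are rounded from the percentages quoted in the thread.
    for label, wins, games in [("71% of 17 games", 12, 17),
                               ("40% of 300 games", 120, 300)]:
        print(f"{label}: tail probability {binomial_tail(wins, games):.4f}")
```

With these inputs the 17-game month is roughly a 7% event under pure chance, while the 300-game month is below 0.1%, so only the second says much about a real difference from 50%.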

Re: KGS ranking revisited

Posted: Wed May 16, 2012 4:39 am
by Mef
Tami wrote: In any case, the graph Mef made could even support my points: between September to November I only managed a win rate of 40% or worse, yet my graph went UP.
RobertJasiek wrote:May 2011:
61% but the rating dropped.
This is perhaps another misconception about how rating systems should behave -- both of these cases are correct behavior for this system. Most go rating systems lump people into pools (ranks, grades, etc). The result is that you are not typically playing someone with the same rating as you, but instead playing someone who is average for your pool. What this means is that if you are on either tail of the pool you are not expected to win 50% of your games.

If you have just recently promoted to 1d, you may have a rating of 1.0, but your average opponent will be rated 1.5 -- this means that you are playing your average game under-handicapped by about half a stone. The expected share of games you will win is a bit lower than half (using the EGF numbers, a 50-point rating difference near 1 dan gives about a 37-39% expected win rate; I think KGS is a little wider, maybe 33-35% for a half-stone difference).

This means that if you have just promoted, and you are winning 40% of your games, you are actually doing a little bit better than the rating system was predicting, which means your rating will go up accordingly (maybe to 1.2d or something like that). If you were to maintain a 50% win rate you would be expected to have your rating move up to 1.5.

On the other side of the pool you have the reverse issue, if you are a 1.9d who is on average playing 1.5d players, you are expected to win 60 to 66% of your even games, because you are playing people the system expects are weaker than you. If you are a 1.9d there's a very real chance you will get paired with that person who is a 1.2d and you are playing a game that under slightly different circumstances would be played at a 1 stone handicap. Maintaining a 60% win-rate against an average 1d is merely doing what is expected and will not on its own cause your rating to rise.
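The arithmetic in the paragraphs above can be sketched with a logistic win-probability curve. This is an assumed model for illustration, not KGS's actual code: the steepness `k` (per rank of rating difference) and the `tolerance` are hypothetical parameters, and both function names are made up for this example.

```python
import math


def win_probability(rating_diff, k=1.0):
    """Expected win rate for a player rated `rating_diff` ranks above the
    opponent (negative if weaker), under an assumed logistic model.
    """
    return 1.0 / (1.0 + math.exp(-k * rating_diff))


def rating_direction(observed_win_rate, expected_win_rate, tolerance=0.01):
    """Which way a rating should drift: results above expectation push it up,
    results below expectation push it down.
    """
    if observed_win_rate > expected_win_rate + tolerance:
        return "up"
    if observed_win_rate < expected_win_rate - tolerance:
        return "down"
    return "steady"


if __name__ == "__main__":
    fresh_1d = win_probability(1.0 - 1.5)    # 1.0d against an average 1.5d
    strong_1d = win_probability(1.9 - 1.5)   # 1.9d against an average 1.5d
    print(f"1.0d expected win rate: {fresh_1d:.1%}")
    print(f"1.9d expected win rate: {strong_1d:.1%}")
    print("1.0d winning 40%:", rating_direction(0.40, fresh_1d))
    print("1.9d winning 60%:", rating_direction(0.60, strong_1d))
```

With `k = 1.0` the freshly promoted 1.0d is expected to win about 38% against an average 1.5d, and with `k = 1.3` about 34%, which brackets the figures quoted above. In this sketch a 40% record at 1.0d counts as beating expectations, while 60% at 1.9d is merely on par.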

Any system that assigns players a rating (as opposed to just a rank) will suffer from this. The solution to this "problem" is simple - narrow the pools until you are satisfied with the bandwidth. On KGS the easiest way to do it (if you are worried about being on the edge of a band) is to offer a fractional komi.

Re: KGS ranking revisited

Posted: Wed May 16, 2012 4:55 am
by hyperpape
I see that Mef and I were writing to make the same point at the same time. Oh well.
Tami wrote:Speaking for myself, the KGS system does not deliver a 50-50 ratio.
This is one of the most unintuitive things about ranks. If you just crossed the 1 kyu threshold (or any rank threshold), you need to win less than 50% of your games to maintain or even improve your rank. Why? Well, everyone between 1 kyu and 1 dan is labeled "1 kyu" but the system thinks that some of them are stronger--they're almost 1 dan--and others are weaker. Take the average, who's a ".5 kyu". When you're paired against that player, you are expected to win less than 50%, because that player is a bit stronger than you. So if you win 50% your rank will increase. If you win somewhat less than 50%, your rank may stay stable. And of course it's the opposite as you approach the threshold from below--you have to win more than 50% of your games.
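To see how the break-even win rate moves across a rank, here is a rough Python sketch. It assumes opponents are spread evenly over the whole one-rank band and that win chances follow a logistic curve with a hypothetical steepness `k`; neither assumption comes from KGS documentation, and the function names are invented for the example.

```python
import math


def win_probability(rating_diff, k=1.0):
    """Assumed logistic win probability for a given rating difference in ranks."""
    return 1.0 / (1.0 + math.exp(-k * rating_diff))


def break_even_win_rate(position, k=1.0, steps=100):
    """Win rate that just matches expectation for a player at `position`
    within the band (0.0 = weakest edge, 1.0 = strongest edge), averaged over
    opponents placed evenly across the same band.
    """
    total = 0.0
    for i in range(steps):
        opponent = (i + 0.5) / steps  # midpoint of each slice of the band
        total += win_probability(position - opponent, k)
    return total / steps


if __name__ == "__main__":
    for position in (0.0, 0.25, 0.5, 0.75, 1.0):
        rate = break_even_win_rate(position)
        print(f"position {position:.2f} in band: break-even {rate:.1%}")
```

In the middle of the band the break-even rate is 50% by symmetry; at the weak edge it is below 50% and at the strong edge above, which is the effect described in this post.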

Note: this point isn't really specific to KGS--it's true of any system that uses dan/kyu ranks but estimates how likely you are to win based on your precise rank (e.g. 1.4325 kyu) and updates your rank accordingly. But it is an argument that EGD-style ratings, where you see two extra significant digits, are psychologically better. It's intuitive that a 2200 who beats a 2250 has just had a "good win". But tradition is a hard thing to change.

Re: KGS ranking revisited

Posted: Wed May 16, 2012 4:57 am
by Tami
Mef wrote:On the other side of the pool you have the reverse issue, if you are a 1.9d who is on average playing 1.5d players, you are expected to win 60 to 66% of your even games, because you are playing people the system expects are weaker than you. If you are a 1.9d there's a very real chance you will get paired with that person who is a 1.2d and you are playing a game that under slightly different circumstances would be played at a 1 stone handicap. Maintaining a 60% win-rate against an average 1d is merely doing what is expected and will not on its own cause your rating to rise. Any system that assigns players a rating (as opposed to just a rank) will suffer from this. The solution to this "problem" is simple - narrow the pools until you are satisfied with the bandwidth. On KGS the easiest way to do it (if you are worried about being on the edge of a band) is to offer a fractional komi.
Thanks for explaining this in easy-to-understand terms. It seems a bit clearer to me why it can sometimes feel that there is no benefit in winning, and a lot of loss in losing.

I think it would be an improvement if the KGS algorithm offered the option of modifying komi so that a rated game would carry the same weight for both players.

Re: KGS ranking revisited

Posted: Wed May 16, 2012 6:35 am
by RobertJasiek
Mef wrote:On KGS the easiest way to do it (if you are worried about being on the edge of a band) is to offer a fractional komi.
No. The easiest solution is: Do not use any pools! Players do not alter their strengths in pools but alter them continuously. Therefore pools are a bad model for a rating system.
This is perhaps another misconception about how rating systems should behave
Not a misconception but a different preference, see above.
-- both of these cases are correct behavior for this system.
Correct only under the assumption that the system design criteria (such as using pools at all) were any good.

Re: KGS ranking revisited

Posted: Wed May 16, 2012 6:41 am
by RobertJasiek
Mef wrote:What I prefer is "I am frustrated and think there is a problem with X" -> "Ok, let's verify phenomenon X actually occurs. If there is an issue with X, let's identify the extent to which it is a problem and who is affected by it. Once we know who is affected and how they are affected, let's come up with a reasonable mitigation measure that returns us to the expected system behavior."
I prefer to make changes BEFORE generating any frustration. Specify criteria which a rating system must fulfil for each player. Then design a rating system that will fulfil the criteria for almost all players (98% rather than only 60%).

Re: KGS ranking revisited

Posted: Wed May 16, 2012 6:42 am
by averell
RobertJasiek wrote:
Mef wrote:On KGS the easiest way to do it (if you are worried about being on the edge of a band) is to offer a fractional komi.
No. The easiest solution is: Do not use any pools! Players do not alter their strengths in pools but alter them continuously. Therefore pools are a bad model for a rating system.
A "pool" here is just a word he used to describe the range of opponents that are eligible to play an even game with you. If you want to play even games, you need such a set. The only thing you could criticize is how large it should be (at the moment, all people with a rating that gives them the same "rank").

Re: KGS ranking revisited

Posted: Wed May 16, 2012 6:45 am
by RobertJasiek
stalkor wrote:Also to help players understand how much a win or loss is worth i would like to see an addition in the games tab list where its stated how much that game made your rank shift up or down
I want to see more: BEFORE agreeing on an opponent, I want to see 1) which winning percentage the system assumes and 2) by how many rating points a win / tie / loss will change my rating.
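A pre-game preview of this kind could look roughly like the Python sketch below. To be clear about the assumptions: it uses a simple Elo-style incremental update, which is not how KGS computes ratings, and the names `preview_game`, `k` and `step` are hypothetical, picked only to make the idea concrete.

```python
import math


def preview_game(my_rating, opponent_rating, k=1.0, step=0.05):
    """Show the assumed win chance and the rating change for each result.

    Uses an Elo-style rule: change = step * (score - expected score), with
    score 1 for a win, 0.5 for a tie and 0 for a loss. Ratings are in ranks.
    """
    expected = 1.0 / (1.0 + math.exp(-k * (my_rating - opponent_rating)))
    return {
        "expected_win": expected,
        "if_win": step * (1.0 - expected),
        "if_tie": step * (0.5 - expected),
        "if_loss": step * (0.0 - expected),
    }


if __name__ == "__main__":
    preview = preview_game(1.0, 1.5)  # a fresh 1.0d against an average 1.5d
    print(f"Assumed winning percentage: {preview['expected_win']:.0%}")
    for result in ("if_win", "if_tie", "if_loss"):
        print(f"{result}: {preview[result]:+.3f} ranks")
```

Under this rule the weaker player gains more from a win than they lose from a loss, and the reverse holds for the stronger player, which is the asymmetry discussed earlier in the thread.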