jts wrote:So your objection is not that it's erratic per se
Mainly my objection is the system's design errors.
emeraldemon wrote:look at the average win-rate of every player over an appreciable number of games, and find the average distance from 50%.
This is a simplifying theory but not quite true. When the rating system is bad, some players can play worse than usual: the system expects them to win much more than 50% of their games, winning that much is tiring, and so they win less than they would if they were not forced into becoming tired. E.g., I (and others, from whom I have heard the same) can win ca. 10-12 games in a row, but then one becomes so tired that winning 20-24 games in a row is out of the question. Lost games occur rather quickly: first 1, then 2, then 4, then 8. The more tired one is, the greater the percentage of lost games becomes.
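For anyone who wants to try emeraldemon's measure on real game records, here is a minimal sketch of it. The input format (a list of `(winner, loser)` pairs), the function name and the `min_games` cutoff are all assumptions made for illustration; nothing here is taken from KGS or any other server.

```python
from collections import defaultdict


def mean_distance_from_even(games, min_games=1):
    """Average |win rate - 0.5| over all players with enough games.

    games: iterable of (winner, loser) pairs (hypothetical input format).
    min_games: players with fewer games are ignored (hypothetical cutoff
    standing in for "an appreciable number of games").
    """
    wins = defaultdict(int)
    played = defaultdict(int)
    for winner, loser in games:
        wins[winner] += 1
        played[winner] += 1
        played[loser] += 1

    # One distance per qualifying player: how far the win rate is from 50%.
    distances = [
        abs(wins[player] / count - 0.5)
        for player, count in played.items()
        if count >= min_games
    ]
    if not distances:
        raise ValueError("no player has at least min_games games")
    return sum(distances) / len(distances)
```

A value near 0 means most players win about half of their games; a value near 0.5 means most players win or lose nearly everything. As the reply above points out, this only summarizes the results and says nothing about why a player's win rate ended up where it did.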
hyperpape wrote:One adaptation is to use all the variations of komi between 6.5 and 0.5 as appropriate. Of course this doesn't remove the problem entirely.
witwit wrote:there is no way to objectively measure accuracy like you can when judging the internal accuracy of the system
RobertJasiek wrote:emeraldemon wrote:look at the average win-rate of every player over an appreciable number of games, and find the average distance from 50%.
This is a simplifying theory but not quite true. When the rating system is bad, some players can play worse than usual: the system expects them to win much more than 50% of their games, winning that much is tiring, and so they win less than they would if they were not forced into becoming tired. E.g., I (and others, from whom I have heard the same) can win ca. 10-12 games in a row, but then one becomes so tired that winning 20-24 games in a row is out of the question. Lost games occur rather quickly: first 1, then 2, then 4, then 8. The more tired one is, the greater the percentage of lost games becomes.
jts wrote:Well, not necessarily. If your most recent partners decline, you'll decline too. It just assumes that, in the absence of evidence, you can still beat the same people and lose to the same people.
RobertJasiek wrote:witwit wrote:there is no way to objectively measure accuracy like you can when judging the internal accuracy of the system
Do you say that an objective external measure of internal accuracy cannot exist, or that so far nobody has described one yet?
emeraldemon wrote:Even if it's true that winning is more tiring than losing (which I'm not sure of),
You can't say "Player A would beat Player B 80% of the time if Player A didn't have to win 80% of the time".
are you trying to suggest that we should model this?
witwit wrote:an objective measure of consistency with external systems can only be defined by arbitrarily picking another system to compare against.
emeraldemon wrote:Thanks for the link. wms, did the results of that study make you consider trying his algorithm?
Yes and no. Yes in that it made me decide that if I ever revisit the ranking system, Remi's system would be the first place I go for alternatives. No in that his paper reaffirmed my belief that the KGS system is "good enough" and there is no urgent need to replace it.
Another research direction would be to improve the model. An efficient application of WHR to Go data would require some refinements of the dynamic Bradley-Terry model, that the KGS rating algorithm [13] already has. In particular, it should be able to
– Handle handicap and komi.
– Deal with outliers.
– Handle the fact that beginners make faster progress than experts.
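To make the quoted passage a bit more concrete, here is a rough sketch of the plain Bradley-Terry win probability, with ratings on a natural-log scale. The `advantage_a` offset is purely a hypothetical stand-in for handicap or komi, added only to show where such a term could enter; it is not how KGS or WHR actually handles them.

```python
import math


def bt_win_probability(r_a, r_b, advantage_a=0.0):
    """Bradley-Terry probability that player A beats player B.

    r_a, r_b: ratings on a natural-log scale, so strength = exp(rating).
    advantage_a: hypothetical rating offset for A (e.g. handicap stones
    or komi expressed in rating units); an assumption for illustration.
    """
    # exp(r_a) / (exp(r_a) + exp(r_b)) rewritten as a logistic function
    # of the (adjusted) rating difference.
    return 1.0 / (1.0 + math.exp(-(r_a + advantage_a - r_b)))
```

Under this sketch, two players with equal ratings each win with probability 0.5, and an offset equal to the rating gap brings an uneven pairing back to 0.5. Outliers and the faster progress of beginners are not modeled here at all.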
Kaya.gs wrote:My opinion is that accuracy is just one of the factors in a rating system. The psychology of it is very important. I think the key element that produces discontent with KGS's rating system is heaviness. It's an educated guess that the #1 reason for multiple accounts is the rating system.