Re: The Probability of a Monkey Defeating Yi Chang-ho
Posted: Wed Aug 22, 2012 7:32 am
HermanHiddema wrote: topazg wrote: I'm arguing that the supporting data collected argues towards a relationship between rating and performance where, with a large enough rating gap, the chance of the weaker player winning is 0%. There is limited but existent data supporting this by demonstrating that, for chess at least, the relationship is closer to linear than logistic.
You are arguing that the relationship between rating and performance exists, but the lower and upper bounds of winning chance never reach 100% or 0% - a true logistic function. I'm asking for some data that supports the view that a true logistic function is a more reliable model than a function that is linear with logistic elements.
Oh, I see.
Well, the basis of the Elo rating system and similar systems is logistic. There is no rating difference for which the formula returns 0 or 1. The data fits that curve reasonably well, AFAIK. The fact that a certain result, which according to the formula should have a very small but non-zero chance, has not in fact happened, does not in any way constitute proof that it cannot happen.
My original reference to the chess world revolved around an appeal from mathematicians (led, IIRC, by Jeff Sonas) that FIDE change their rating formula, precisely because the Elo rating system does not, in fact, fit the data very well, the data being closer to a linear model. Elo's model, whilst a really nice way of having a decently designed rating system at the time, is a very crude model, and was never originally based on supporting data (the k values and the rating difference / sd type values were arbitrarily assigned so the model could be refitted to be as closely predictive as possible).
I fully understand that I have a rather sad life to find this so interesting, but I do.
Because everyone loves graphs, and because some people probably don't have the faintest idea what we're talking about, I've made some very crappily drawn graphical examples:
Herman's theory (standard logistic function):
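For anyone who prefers numbers to my drawings, here is a minimal sketch of that curve. It assumes the commonly quoted Elo form with base 10 and a 400-point scale; the function name `elo_expected` and the sample rating gaps are only for illustration.

```python
def elo_expected(r, scale=400.0):
    """Expected score of a player rated r points above the reference player.

    Assumption: the usual logistic form with base 10 and a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-r / scale))


# The curve approaches 0 and 1 as the gap grows, but the formula
# itself never returns exactly 0 or exactly 1 for these gaps.
for r in (0, 200, 400, 800, 1600):
    print(r, elo_expected(r), elo_expected(-r))
```

With very large gaps ordinary floating point will eventually round the result, but that is a limit of the arithmetic, not of the model.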

My theory (somewhat logistic function, but with upper and lower floors), please ignore the fact that I can't draw freehand curves very well:
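One possible way to write such a curve down, purely as a sketch: take the logistic curve, stretch it so it hits exactly 0 and 1 at some cutoff gap, and clamp everything beyond that. The cutoff of 1200 points, the 400-point scale and the function names are all assumptions made up for illustration; none of them are fitted to any data.

```python
def elo_expected(r, scale=400.0):
    """Standard logistic expected score (assumed base 10, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** (-r / scale))


def floored_expected(r, cutoff=1200.0, scale=400.0):
    """Logistic-shaped curve that reaches exactly 0 and 1 at +/- cutoff.

    `cutoff` is a hypothetical rating gap chosen only for illustration.
    """
    # Stretch the logistic curve around 0.5 so that it equals 1 at r = cutoff
    # (and, by symmetry, 0 at r = -cutoff).
    stretch = 0.5 / (elo_expected(cutoff, scale) - 0.5)
    p = 0.5 + (elo_expected(r, scale) - 0.5) * stretch
    # Clamp: beyond the cutoff the weaker player's chance is exactly 0.
    return min(1.0, max(0.0, p))


for r in (-2000, -1200, -400, 0, 400, 1200, 2000):
    print(r, floored_expected(r))
```

Inside the cutoff this stays close to the ordinary logistic curve; the two models only really disagree about what happens at very large rating gaps, which is exactly where data is scarce.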

EDIT: p is the probability of the player beating the reference player (1 being 100%), and r is the amount by which that player's rating differs from the reference player's.