Life In 19x19
http://lifein19x19.com/

Revised European go ratings
http://lifein19x19.com/viewtopic.php?f=10&t=14557
Page 4 of 6

Author:  Schachus [ Fri Oct 06, 2017 9:18 am ]
Post subject:  Re: Revised European go ratings

Great! Can I see these improved predictions somewhere? Actually, I think it's okay if the prediction has slightly more confidence in the stronger player than the actual data shows (because weaker players are ever so slightly more likely to have improved), but of course not nearly this much.

Author:  gennan [ Fri Oct 06, 2017 9:20 am ]
Post subject:  Re: Revised European go ratings

Schachus wrote:
Actually, I'm interested: did you do anything with the "a" in your revised ratings?


I made a new "a" function by fitting to the observed statistics. See http://goratings.eu/Probabilities and http://goratings.eu/Probabilities/A_ObservedEGD, and the other 1/a links on those pages.

Author:  gennan [ Fri Oct 06, 2017 9:23 am ]
Post subject:  Re: Revised European go ratings

Schachus wrote:
Great! Can I see these improved predictions somewhere? Actually, I think it's okay if the prediction has slightly more confidence in the stronger player than the actual data shows (because weaker players are ever so slightly more likely to have improved), but of course not nearly this much.


These are the revised predictions resulting from the modified 1/a: http://goratings.eu/Probabilities/P_PredictedRevised.

Author:  Schachus [ Fri Oct 06, 2017 9:35 am ]
Post subject:  Re: Revised European go ratings

OK, now that you have explained this, I can appreciate your effort much better. That is really great!!
Do I understand correctly that you took the green line from the observed data as the basis of the revised system?

I would suggest making "a" a very, very tiny bit smaller than the observed data suggest. It seems to me that in a rating system that is designed to self-correct wrong ratings over time (caused by a wrong reset, or by not resetting despite improvement, etc.), there will always be a small percentage of games in which a player is wrongly rated. Now it seems slightly more likely that the higher-rated player is overrated than the lower-rated player (because the "real strength" "should be" (I believe) evenly distributed, and an overrated player has a higher rating than his real strength). It would then be correct that the overrated player (who is slightly more often the higher-rated one) wins less often than the ratings say. So, averaging over everything, a very tiny effect that makes the stronger players win very slightly less than they "should" would be "right" for a well-working rating system. Unfortunately, I have no clue how big that very tiny effect has to be. I will think about that...


But now I wish I could see how that came out before you changed the rating resets. Can this still be seen somewhere?

Author:  dfan [ Fri Oct 06, 2017 10:17 am ]
Post subject:  Re: Revised European go ratings

Schachus wrote:
Why are you so sure that conservative resetting leads to deflation? There are absolutely no resets in chess and still there is no deflation... (in fact, chess players are whining about inflation, but I don't really believe in that either).

Chess ratings do have deflationary tendencies when you include "kyu players". In the absence of compensating forces, rapidly improving kids will suck rating points out of the system, as the points required to bring them from 1000 to 2000 (say) have to come from somewhere. The USCF rating system, for example, explicitly includes "bonus" points for exceptional tournament performances (compared to one's current rating) for exactly this reason. Of course older players can put rating points back into the system as they decline too, but the magnitude is less.

You say you don't believe in chess rating inflation either, so I won't spend too much time trying to disprove that, except to say that 1) ratings are higher than they used to be, but in every sport, performance is higher than it used to be, so why not chess?, and 2) studies that evaluate actual move quality by computer engine do not find a drift in the relationship between move quality and Elo rating over time.
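The deflation mechanism described above can be illustrated with a minimal sketch. It assumes a plain Elo update with the same K for both players (base-10 logistic, scale 400), with no bonus points or resets; the ratings, the K value and the ten-game scenario are made up for illustration and are not taken from any federation's actual rules.

```python
def expected_score(r_a, r_b, scale=400.0):
    """Classic Elo expectation for player A (base-10 logistic curve)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / scale))


def update(r_a, r_b, score_a, k=20.0):
    """Return both new ratings after one game; score_a is 1, 0.5 or 0."""
    delta = k * (score_a - expected_score(r_a, r_b))
    # Same K on both sides: whatever A gains, B loses (zero-sum).
    return r_a + delta, r_b - delta


# Hypothetical scenario: a fast-improving kid still rated 1000 beats ten
# different opponents who are each rated 1500.
kid = 1000.0
opponents = [1500.0] * 10
for i, opp in enumerate(opponents):
    kid, opponents[i] = update(kid, opp, score_a=1.0)

pool_before = 1000.0 + 10 * 1500.0
pool_after = kid + sum(opponents)
points_taken = 10 * 1500.0 - sum(opponents)

print(f"kid: {kid:.1f}, taken from opponents: {points_taken:.1f}")
print(f"pool before: {pool_before:.1f}, pool after: {pool_after:.1f}")
```

In this sketch the total number of points in the pool does not change, so every point the kid gains is a point the opponents lose, even though the opponents' own strength has not changed.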

Author:  gennan [ Fri Oct 06, 2017 10:19 am ]
Post subject:  Re: Revised European go ratings

Schachus wrote:
But now I wish I could see how that came out before you changed the rating resets. Can this still be seen somewhere?


I can say that the reset policy has no direct effect on the predicted winrate. It does have an effect on the rating distributions (http://goratings.eu/Histograms/Histogram_Revised?year=1996&country=).

Over the years, the distributions can show a bias (a mismatch between declared ranks and ratings), like the EGD distribution in 2012. To counter this, I find that the reset policy and the epsilon parameter are somewhat interchangeable (the difference is that resets affect individual players and their opponents and spread from there, while the epsilon parameter affects all players). So a reset is a local instrument, while epsilon is a global instrument.

If I use the most liberal reset policy, middle dan ranks will inflate a bit over the years (about 30 points in 20 years), so I changed it to reset to the lower bound of the declared rank for dan players (not contradicting the declared rank, but still taking it with a grain of salt). If I use that policy for kyu ranks, they become deflated over the years (also about 30 points over 20 years). So I picked the middle road.

If I use the EGD reset policy, I can remove the resulting rating distribution bias by using a positive epsilon (larger than the EGD currently uses). The EGD uses quite large K factors for lower ratings (http://goratings.eu/Probabilities/PointContributions_EGD). If I use more normal K factors (http://goratings.eu/Probabilities/PointContributions_Revised) while keeping EGD's reset policy, I need a much larger epsilon. So it seems that EGD's large K factor for lower ratings adds a lot of noise, which obscures its need for a larger epsilon value or a more liberal reset policy.

----

With every modification I make, I clear the system and reprocess all the EGD tournament data (from 1996 to 2017). That reprocessing takes about a minute. I usually do some trial runs on my laptop before pushing it to the web site and reprocessing there.

So basically I have only one version of the system. But I could change some things, reprocess so you can see the effect and change it back when you are done viewing the results. (But tonight I'm a bit too busy for that)

Author:  Schachus [ Fri Oct 06, 2017 12:35 pm ]
Post subject:  Re: Revised European go ratings

dfan wrote:
Schachus wrote:
Why are you so sure that conservative resetting leads to deflation? There are absolutely no resets in chess and still there is no deflation... (in fact, chess players are whining about inflation, but I don't really believe in that either).

Chess ratings do have deflationary tendencies when you include "kyu players". In the absence of compensating forces, rapidly improving kids will suck rating points out of the system, as the points required to bring them from 1000 to 2000 (say) have to come from somewhere. The USCF rating system, for example, explicitly includes "bonus" points for exceptional tournament performances (compared to one's current rating) for exactly this reason. Of course older players can put rating points back into the system as they decline too, but the magnitude is less.

You say you don't believe in chess rating inflation either, so I won't spend too much time trying to disprove that, except to say that 1) ratings are higher than they used to be, but in every sport, performance is higher than it used to be, so why not chess?, and 2) studies that evaluate actual move quality by computer engine do not find a drift in the relationship between move quality and Elo rating over time.


As for the deflation in lower ratings, you are right; the German ("DWZ") system also has some weird countermeasures there. I believe the deflation is caused by the fact that new kids tend to be even weaker than low-rated older kids, and thus start with very low ratings and drag other people down. You are right about that, so maybe you do need rating resets or something similar, at least for DDK. I wasn't thinking about that when I wrote the previous post, my bad.

For the other part: what you say in 2) is basically what I mean: maybe ratings are higher, but players are better. Also, even if ratings really were higher beyond that, I would tend to believe in a "stretching" of the scale (the difference between high- and low-rated players increases), which looks like inflation if you only look at the ratings of top players, rather than an overall inflation.
This stretching can be caused by playing in "elite groups" (you tend to play players close to your own rating), which lets the system lose calibration for rating differences between players who are rated far apart.

Author:  gennan [ Fri Oct 06, 2017 2:02 pm ]
Post subject:  Re: Revised European go ratings

So perhaps we are lucky that go has ranks as a measure of large skill differences. I think it's more reliable and accurate to determine that a handicap of 7 stones rather than 6 or 8 stones results in a 50% winrate (650 points difference) than to use even game winrates to determine if the difference is 650 rather than 550 or 750. You'd have to discriminate between 83%, 86% and 89% according to the revised system (using a 1d as the higher reference rank).

BTW, the EGD predicts something like 99%, 99.4% and 99.9% in this case, but according to my findings, only 5d+ players have such high winrates in even games against opponents that need 7 stones handicap for a 50% winrate (see http://goratings.eu/Probabilities/P_ObservedRevised).
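To get a feel for how hard it is to tell 83%, 86% and 89% apart from even game results alone, here is a back-of-the-envelope sketch. It assumes independent games and a normal approximation to the binomial distribution; the function name and the 95% level (z = 1.96) are illustrative choices, not part of either rating system.

```python
import math


def games_needed(p_a, p_b, z=1.96):
    """Rough number of games needed so that the half-width of a z-sigma
    interval around the observed winrate is at most half the gap between
    the two candidate winrates p_a and p_b."""
    gap = abs(p_a - p_b)
    # Use the candidate closer to 50%, which has the larger variance.
    p = min(p_a, p_b, key=lambda x: abs(x - 0.5))
    variance = p * (1.0 - p)
    return math.ceil(variance * (z / (gap / 2.0)) ** 2)


n_low = games_needed(0.83, 0.86)
n_high = games_needed(0.86, 0.89)
print(n_low, n_high)
```

Under these assumptions, each comparison needs a couple of thousand even games between players roughly 650 points apart.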

So go ranks (handicap calibration) can be used (and they are) in a reset policy quite naturally as a counter against deflation. I did not know that chess rating systems also have this deflation issue in the lower regions, but it does seem unavoidable now that I think about it. It's interesting that they also use some sort of reset policies to counter it, rather than an epsilon parameter. But it makes sense. I think a reset policy works better because it handles the issue at the root. A global epsilon parameter would take far too long to even out the local dips (deflation) caused by kids that quickly progress by 1000 points.

Inflating everybody a little by an epsilon parameter to counter local deflation does not seem a particularly good instrument to me. It's like trying to help the poor in my country by giving everybody (rich, poor, middle class) 1 euro.

I also don't think that indiscriminately increasing the K factor for lower ratings really helps to counter deflation. It helps when a lower rated player plays against much higher rated players, because the lower rated player gets many more points from a win than his higher rated opponent loses. But if the lower rated player mostly plays other lower rated players, it just creates more noise.
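The K factor argument above can be sketched as follows. This is only an illustration: the two-step K schedule, the 1500 threshold and the logistic scale a = 200 are hypothetical values, not the EGD's or the revised system's actual parameters.

```python
import math


def expected(r_a, r_b, a=200.0):
    """Logistic winrate expectation for A (the scale a is an assumption)."""
    return 1.0 / (1.0 + math.exp((r_b - r_a) / a))


def k_for(rating):
    """Hypothetical K schedule: a larger K for lower ratings."""
    return 30.0 if rating < 1500.0 else 15.0


def play(r_winner, r_loser):
    """Return (points gained by the winner, points lost by the loser)."""
    gain = k_for(r_winner) * (1.0 - expected(r_winner, r_loser))
    loss = k_for(r_loser) * expected(r_loser, r_winner)
    return gain, loss


# A lower rated player beats a higher rated one: different K factors.
upset_gain, upset_loss = play(1400.0, 1600.0)
# Two lower rated players meet: same K factor on both sides.
same_gain, same_loss = play(1400.0, 1300.0)

print(f"upset: winner +{upset_gain:.2f}, loser -{upset_loss:.2f}")
print(f"same K: winner +{same_gain:.2f}, loser -{same_loss:.2f}")
```

In the first case the winner gains twice what the loser gives up, so points enter the system. In the second case the exchange is exactly zero-sum, so the larger K only makes both ratings swing more.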

I do think that increasing the K factor of newcomers and newly self-promoted players while decreasing their opponents' K factors could make the reset policy work better. It would dampen rating ripples expanding outward from overranked or underranked resets. I intend to try it out.

Author:  gennan [ Fri Oct 06, 2017 2:16 pm ]
Post subject:  Re: Revised European go ratings

Schachus wrote:
Great! Can I see these improved predictions somewhere? Actually, I think it's okay if the prediction has slightly more confidence in the stronger player than the actual data shows (because weaker players are ever so slightly more likely to have improved), but of course not nearly this much.


I think rating confidence, improving players and winrate predictions are distinct issues, and they are handled by different parts of the system. Respectively: the K factor, resets, and the a function (or rather the function that I call beta, from which a is derived: 1/a is the derivative of this beta function, see the About page).

I don't fully understand which of these issues you are addressing here.

Author:  gennan [ Fri Oct 06, 2017 2:54 pm ]
Post subject:  Re: Revised European go ratings

Schachus wrote:
Still, I think that if your goal is to reflect handicaps well, you need to use handicap game data to check the quality of your ratings, even if there is not a lot of data. I still think checking against the rank defeats the purpose, since that way the system "assign the rating that corresponds to the declared rank" would be optimal, while it clearly isn't.


You can check the statistics for handicap games at http://goratings.eu/Probabilities/P_ObservedRevised. A bit below the graph you can find a handicap selection box.

Author:  gennan [ Sun Oct 08, 2017 7:27 am ]
Post subject:  Re: Revised European go ratings

gennan wrote:
Schachus wrote:
Great! Can I see these improved predictions somewhere? Actually, I think it's okay if the prediction has slightly more confidence in the stronger player than the actual data shows (because weaker players are ever so slightly more likely to have improved), but of course not nearly this much.


I think rating confidence, improving players and winrate predictions are distinct issues and they are handled by different parts of the system. Respectively, the K factor, resets, the a function.


I chose to use the a function to generate K factors for the revised system: K = a * 0.1 (the resulting K factor is shown at http://goratings.eu/Probabilities/PointContributions_Revised). The resulting oscillation sizes in player rating histories look good to me (see the samples at http://goratings.eu/).
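A minimal sketch of what K = a * 0.1 means for a single game. It assumes the expected winrate is a plain logistic curve with scale a, which is a simplification: the revised system derives its prediction from the beta function, and that is not reproduced here. The values a = 200 and a = 100 are hypothetical.

```python
import math


def rating_change(r_self, r_opp, score, a):
    """One-game rating change with the K factor tied to a (K = a * 0.1)."""
    k = a * 0.1
    # Assumed prediction: logistic curve with scale a.
    expected = 1.0 / (1.0 + math.exp((r_opp - r_self) / a))
    return k * (score - expected)


# Winning an even game between equally rated players (expected = 0.5):
change_wide = rating_change(2000.0, 2000.0, 1.0, a=200.0)    # K = 20
change_narrow = rating_change(2000.0, 2000.0, 1.0, a=100.0)  # K = 10

print(change_wide, change_narrow)
```

With this choice the size of a rating step shrinks in proportion to a, so the oscillations stay proportional to the local scale of the winrate curve.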

Author:  gennan [ Sun Oct 08, 2017 7:32 am ]
Post subject:  Re: Revised European go ratings

gennan wrote:
I do think that increasing the K factor of newcomers and newly self-promoted players while decreasing their opponents' K factors could make the reset policy work better. It would dampen rating ripples expanding outward from overranked or underranked resets. I intend to try it out.


I implemented this now.
It does not make a big difference overall. Ratings of newcomers and newly self-promoted players oscillate a bit more (and the ratings of their opponents a bit less). Perhaps the statistics are a bit smoother, but it's not a big difference.

Author:  Schachus [ Sun Oct 08, 2017 8:13 am ]
Post subject:  Re: Revised European go ratings

The last thing (increasing K for newly reset players, decreasing it for their opponents) sounds like a very good idea to me. Sad to hear it doesn't have much effect.

As for the handicap system being a good way to measure larger skill differences: yes, I agree this is something nice that chess doesn't have, but the whole handicap/rank system also brings new problems with it, for example transitivity. If A gives B 3 stones for an even game and B gives C 3 stones as well, is it reasonable to assume A should give C 6 stones (or 5.5, assuming the first one is always just half a stone)? Of course this depends on the particular players and their styles, which is always a problem with breaking strength down to one number in Elo-like systems. But even apart from that, if you average over a lot of players, I'm not sure high and low handicaps really compare to one another in that way.

Author:  gennan [ Sun Oct 08, 2017 12:25 pm ]
Post subject:  Re: Revised European go ratings

Schachus wrote:
The last thing (increasing K for newly reset players, decreasing it for their opponents) sounds like a very good idea to me. Sad to hear it doesn't have much effect.


Well, I think it shouldn't have much effect on overall statistics. After all, it is an instrument that applies temporarily and locally. You can see some effect if you look at rating histories of individual players (it temporarily increases the resettee's rating oscillations and it decreases his opponents' rating oscillations). This reset enhancement increases the accuracy of individual rating histories a little bit (which was the intention), but it has little effect if you look at longer periods or larger populations.

Schachus wrote:
As for the handicap system being a good way to measure larger skill differences: yes, I agree this is something nice that chess doesn't have...


I read that chess also has handicaps (a pawn handicap, a knight handicap), but they aren't used much. I can imagine that it's more complicated than go handicaps (how many rating points is a knight handicap worth? and how does it compare with a pawn handicap?)

Schachus wrote:
..., but the whole handicap/rank system also brings new problems with it, for example transitivity. If A gives B 3 stones for an even game and B gives C 3 stones as well, is it reasonable to assume A should give C 6 stones (or 5.5, assuming the first one is always just half a stone)? Of course this depends on the particular players and their styles, which is always a problem with breaking strength down to one number in Elo-like systems. But even apart from that, if you average over a lot of players, I'm not sure high and low handicaps really compare to one another in that way.


Yes, there is this question with go handicaps. The traditional go ranking system is based on the assumption that handicaps are largely transitive. The statistics that I extracted from the EGD data seem to confirm this.

For example, compare 6 handicap with 0 handicap. The 6 stone handicap is treated by adding 550 points (rating difference = handicap * 100 - 50) to the rating of the handicap receiver. With this bonus, the expected winrates are equal to even game expected winrates.

I left out points where I have too little data (less than 50 games in the EGD), so the 6 handicap curves are much less complete (the 8 stone handicap is too incomplete to say anything at all). But focusing on the horizontal band around 50%, I would say it supports the assumption that 6 handicap stones match 550 rating points over the whole rating scale. For other handicaps, basically the same picture emerges. This suggests that handicap = (rating difference + 50) / 100 roughly holds for all ratings and all handicaps.

I think this implies that go handicaps are roughly transitive (with an error margin of roughly 1 stone).
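The two relations quoted above, and the 3 + 3 stones question, can be written out as a small sketch. The function names are made up for illustration, and treating a 0 handicap as a 0 point bonus is an assumption (the formula is only quoted for handicap games).

```python
def handicap_bonus(handicap):
    """Rating points added to the handicap receiver: handicap * 100 - 50."""
    if handicap <= 0:
        return 0  # assumption: an even game gets no bonus
    return handicap * 100 - 50


def implied_handicap(rating_diff):
    """Inverse relation: handicap = (rating difference + 50) / 100."""
    return (rating_diff + 50) / 100


# Transitivity check: A gives B 3 stones, B gives C 3 stones.
a_over_b = handicap_bonus(3)                      # 250 points
b_over_c = handicap_bonus(3)                      # 250 points
a_over_c = implied_handicap(a_over_b + b_over_c)  # 5.5 stones

print(handicap_bonus(6), handicap_bonus(7), a_over_c)
```

If rating differences simply add up, the formula gives 5.5 stones between A and C, which is the "first stone counts as half" figure from the question and lies within the stated error margin of roughly 1 stone.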

Author:  Pio2001 [ Mon Nov 06, 2017 4:31 am ]
Post subject:  Re: Revised European go ratings

Hi,
I've got two questions about the current European rating system.

Is it true that the bottom rank is 20 kyu? (The bottom rank in France is 30 kyu.)
Do handicap games count as much as games without handicap? (In France, they are weighted with a coefficient equal to 1-H/10, and if White loses, her variation is again multiplied by 1-H/10.)

Author:  Javaness2 [ Mon Nov 06, 2017 4:36 am ]
Post subject:  Re: Revised European go ratings

Yes to both

Author:  Pio2001 [ Mon Nov 06, 2017 4:42 am ]
Post subject:  Re: Revised European go ratings

Thanks !

Author:  Uberdude [ Mon Nov 06, 2017 9:09 am ]
Post subject:  Re: Revised European go ratings

Pio2001 wrote:
Is it true that the bottom rank is 20 kyu? (The bottom rank in France is 30 kyu.)


I've heard a reason for this is 20k+ players have a high variability and tend to improve fast, so if you put them in a rating system you get poor quality ratings (high uncertainty and quickly out of date). However, I've also heard a con in that it could alienate the weaker players, who feel like the ratings system doesn't think of them as real Go players and they don't get the motivation of improving their rating through tournament play. If a rating system could go down to 30k but have a confidence parameter (like Glicko rather than Elo?) and a liberal reset policy that'd seem a good solution to me.

Author:  hyperpape [ Mon Nov 06, 2017 9:49 am ]
Post subject:  Re: Revised European go ratings

AGA has ratings that go arbitrarily far down.

Author:  Pio2001 [ Mon Nov 06, 2017 9:51 am ]
Post subject:  Re: Revised European go ratings

It is true that the ranks below 20 kyu seem quite useless.
But I once encountered a situation similar to the one you describe: beginners were coming to the annual tournament of Lyon (France, 70 players), and, as they were beginners, I registered them as 20 kyu, as I usually do (I consider people capable of playing an opening on a 19x19 board as 20 kyu).

But they were very frightened of being paired with too strong players, and not getting the right handicaps. In fact, as far as handicap is concerned, players below 20 kyu are considered as 20 kyu, so it would have changed nothing.

Another French tournament director (nickname Fenring on the http://go-on.forumactif.com/forum ) also advocates not registering beginners too high. He says that beginners have to cope with their lack of experience and the stress of their first tournament, and usually underperform. There is no need to put an extra challenge on their shoulders on top of that.

Regarding the variability, the French system doesn't use any weighting for players below 20 kyu.
For example, if a 21 kyu plays a 15 kyu with 6 stones, the level variation is weighted for the 15 kyu player (because of the handicap), but not for the 21 kyu (full points are awarded).
The coefficients for fast games, 9x9 games or 13x13 games are also not applied. They get the full variation for each game.
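The weighting rules described in this thread for the French system (the 1-H/10 coefficient, applied a second time when White loses, and no weighting at all below 20 kyu) can be sketched as follows. The function name and signature are made up for illustration, and the separate coefficients for fast, 9x9 and 13x13 games are not modelled.

```python
def french_weight(handicap, own_rank_kyu, is_white=False, lost=False):
    """Weight applied to one player's rating variation for a single game.

    own_rank_kyu is the player's own kyu rank (21 means 21 kyu).
    """
    if own_rank_kyu > 20:
        return 1.0  # below 20 kyu: no weighting, full variation
    weight = 1 - handicap / 10
    if is_white and lost:
        weight *= 1 - handicap / 10  # applied again when White loses
    return weight


# The example above: a 21 kyu (Black) takes 6 stones from a 15 kyu (White).
black_weight = french_weight(6, own_rank_kyu=21)
white_weight_win = french_weight(6, own_rank_kyu=15, is_white=True)
white_weight_loss = french_weight(6, own_rank_kyu=15, is_white=True, lost=True)

print(black_weight, round(white_weight_win, 2), round(white_weight_loss, 2))
```

For the 6 stone game the 21 kyu gets the full variation, while the 15 kyu's variation is scaled to about 0.4 after a win and about 0.16 after a loss.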

Page 4 of 6 All times are UTC - 8 hours [ DST ]