Harleqin wrote:
gennan wrote: The EGD was specifically designed to rate players on a rank scale (based on handicaps), because that is the traditional "rating" scale used in go.
Well, yes, it was supposed to carry that correlation forward, but I don't see how it could actually work:
1. Players use the rating to determine their rank
2. There are very few handicap games in the data
This means that (1.) the rank is not an input anymore, and (2.) the value of handicap is not an input.
This comes on top of the general problem that rating systems do not get enough data to reach any kind of conclusive answer anyway.
Handicaps vs rank gaps
About 10% of the EGD games are handicap games. Indeed that is not really enough to ensure that rank gaps and handicaps stay aligned over time. However, when we analyse those handicap games, the actual results seem to align quite well with expected results. On average the error seems to be less than a stone. So apparently, the EGD managed to keep rank gaps fairly well aligned with handicaps over more than 2 decades. That is encouraging. The system must be doing something right.
Overall rating drift
Besides aligning rank gaps with handicaps, there is also the matter of overall rating drift (inflation/deflation).
1. Before the internet, players would determine their rank by handicap in club games (outside of the EGD). When players entered their 1st tournament, they would declare their club rank at that moment and that would be their initial rating in the EGD.
2. Also, the EGD has a reset mechanism. When a player has improved a lot since their previous tournament (again determined by their current club rank from handicap games), they would declare their new rank on entering the next tournament. If that new rank is more than 1 rank above their previous highest declared rank, the system resets that player's rating to that new rank. So the player is not forced to sandbag and "steal" all their rating points from weaker players.
So the EGD is constantly being fed by club ranks. This is mostly sufficient to keep EGD ranks fairly well aligned to club ranks.
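To make points 1 and 2 concrete, here is a rough sketch of the entry and reset rule as described above. It is not the actual EGD code. The rank encoding (20k = 1 up to 1k = 20, 1d = 21), the 100 rating points per rank, and the names rank_index, rank_to_rating and rating_on_entry are my own assumptions for illustration.

```python
from typing import Optional, Tuple

# Assumption: 100 rating points per rank, with 20k = 100, 1k = 2000, 1d = 2100.
POINTS_PER_RANK = 100


def rank_index(rank: str) -> int:
    """Map '5k' or '2d' to an integer scale: 20k = 1, ..., 1k = 20, 1d = 21."""
    number, kind = int(rank[:-1]), rank[-1].lower()
    if kind == "k":
        return 21 - number
    if kind == "d":
        return 20 + number
    raise ValueError(f"unknown rank: {rank!r}")


def rank_to_rating(index: int) -> int:
    """Rating that corresponds to the bottom of a rank (hypothetical helper)."""
    return index * POINTS_PER_RANK


def rating_on_entry(
    current_rating: Optional[float],
    declared: int,
    highest_declared: Optional[int],
) -> Tuple[float, int]:
    """Return (rating, highest declared rank) when a player enters a tournament."""
    # Point 1: a first-time player starts at the rating of the declared rank.
    if current_rating is None or highest_declared is None:
        return rank_to_rating(declared), declared
    # Point 2: a declaration more than 1 rank above the previous highest
    # declared rank resets the rating to the new rank.
    if declared - highest_declared > 1:
        return rank_to_rating(declared), declared
    # Otherwise the rating is kept; only the declaration record is updated.
    return current_rating, max(highest_declared, declared)
```

So a player rated 1650 whose highest declared rank is 5k and who now declares 3k would be reset to 1800 under these assumptions, while declaring 4k would leave the rating at 1650.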
Internet players do complicate this picture a bit, because a new tournament player may have never played in a club when entering their 1st tournament. So they only know their rank on their favourite go server. And the tournament organisers will then guesstimate their equivalent EGF rank to play in the tournament.
The system depends on the right balance of pessimistic and optimistic rank declarations by tournament players to stay well aligned.
When players stop declaring club promotions when playing in tournaments, it becomes difficult to keep EGD ranks aligned to club ranks. If a former 5k knows they have improved to 3k in their club, but does not update their rank when entering the next tournament, they take rating points away from their opponents to make their own rating rise to 3k. They are effectively sandbagging and causing deflation.
When looking at the EGD data, it does seem that overall, players tend to be pessimistic/conservative, delaying promotion until their EGD rating supports it. There are not enough optimistic promotions to offset the deflation caused by pessimistic (delayed) promotions.
3. So to counter the deflationary effect of improving players who don't promote themselves, we are increasing the rating bonus in the upcoming EGD update, so that rating points are constantly injected into the system (at least in the kyu range).
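A toy illustration of why undeclared improvement is deflationary and what a bonus does about it. This uses a generic Elo-style update, not the actual EGD formula. The scale of 400, the K factor of 24, the flat per-game bonus given to both players, and the function names are all assumptions made up for this sketch. Results are applied in expectation (the player scores their true expected score in each game) so the outcome is deterministic.

```python
from typing import List, Tuple


def expected_score(r_a: float, r_b: float, scale: float = 400.0) -> float:
    """Generic logistic (Elo-style) expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def play_series(
    rating: float,
    opponents: List[float],
    true_score: float,
    k: float = 24.0,
    bonus: float = 0.0,
) -> Tuple[float, List[float]]:
    """Play one game against each opponent and return the updated ratings.

    true_score is the average result the player really achieves per game,
    which can be higher than what their (too low) rating predicts.
    """
    updated = []
    for opp in opponents:
        # Zero-sum part: what the player gains, the opponent loses.
        delta = k * (true_score - expected_score(rating, opp))
        rating += delta + bonus
        # Hypothetical bonus: a flat amount added to both players per game.
        updated.append(opp - delta + bonus)
    return rating, updated


# A player still rated 5k (1600) but really 3k strength scores 50%
# against ten correctly rated 3k opponents (1800).
sandbagger, field = play_series(1600.0, [1800.0] * 10, true_score=0.5)
```

Without a bonus, the total number of rating points stays the same: the improving player climbs towards 1800, and the correctly rated 3k opponents each end up a little below 1800 even though their strength did not change. With a bonus greater than 0, new points enter the pool with every game, which is the kind of injection point 3 refers to.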