EGF Rating System Commission Report 2020

gennan
Lives in gote
Posts: 497
Joined: Fri Sep 22, 2017 2:08 am
Rank: EGF 3d
GD Posts: 0
Universal go server handle: gennan
Location: Netherlands
Has thanked: 273 times
Been thanked: 147 times

Re: EGF Rating System Commission Report 2020

Post by gennan »

So when comparing airbagging and sandbagging, airbagging is the bigger issue on KGS?
That's funny, because on OGS the general perception seems to be that sandbagging is the bigger issue (at least in their forums I don't see nearly as many complaints about airbagging as I see about sandbagging).

Ignoring malicious airbagging/sandbagging, both overly optimistic and overly pessimistic players (both in good faith) have always existed. And I don't think this is a big problem, as long as those populations balance each other out. If a player underpromotes or overpromotes themselves, their rating will settle to a more realistic value soon enough (but perhaps opinions will differ on how soon is soon enough).
Adin
Dies in gote
Posts: 28
Joined: Thu Jun 16, 2016 1:25 pm
Rank: 1 kyu
GD Posts: 0
Been thanked: 2 times

Re: EGF Rating System Commission Report 2020

Post by Adin »

Yes, there are certainly more overrated players than sandbaggers. Especially since AI became widely available, cheating with it has become a huge issue. Not just on KGS, of course; people largely underestimate how serious this is in all online games. KGS is dealing with this and it will be kept under control. The EGF should also take it very seriously.
gennan
Lives in gote
Posts: 497
Joined: Fri Sep 22, 2017 2:08 am
Rank: EGF 3d
GD Posts: 0
Universal go server handle: gennan
Location: Netherlands
Has thanked: 273 times
Been thanked: 147 times

Re: EGF Rating System Commission Report 2020

Post by gennan »

I don't think it's really fair to conflate concerns about KGS ratings with concerns about EGF ratings. Each of those rating systems has its own specific concerns (at least outside of online EGF-rated events in pandemic times).

1: I feel that airbagging and cheating are two different things. Cheating is aimed more at the gratification of beating high dan players, without caring to play by yourself.
I think airbagging is inflating your online rating by exploiting some loopholes of online play (picking opponents, perhaps exploitable bots, but not necessarily cheating with an AI). I suppose the gratification is that your inflated rating allows you to be matched against stronger players than usual, but you will still play by yourself (mostly?).

2: I think neither cheating, nor airbagging is very common in IRL tournament games. So I feel that these are not great concerns for the EGF rating system.
Cheating is not so easy IRL. Yes, it may happen every once in a while, but that won't compromise the rating system much overall.
Airbagging is also not easy IRL. For one, you don't get to pick your opponents. Yes, you may get away with a malicious rating reset once in a while. But that only gets you so far and you cannot repeat that trick over and over. Participating in IRL tournaments is not anonymous, so it's easy for a tournament organiser to check your EGF rating history when you register and they should notice when strange things are going on in there. For example, if you double promote in 5 consecutive tournaments while scoring 0 wins, it should raise some eyebrows.

But in any case, the responsibility to check registrations and prevent cheating belongs to tournament organisers. It is their responsibility to feed the rating system with reliable data. It's not the responsibility of the rating system itself, and these issues are outside the scope of this particular rating system commission.
Adin
Dies in gote
Posts: 28
Joined: Thu Jun 16, 2016 1:25 pm
Rank: 1 kyu
GD Posts: 0
Been thanked: 2 times

Re: EGF Rating System Commission Report 2020

Post by Adin »

This is my new EGD profile: https://www.europeangodatabase.eu/EGD/P ... y=14537380

The fact that tournaments make you go up or down much less than before makes it look like I've been a fairly stable 1k for 8 years and a pretty much stable 1d for the next 6 years. That is simply not true. My strength did go up and down, mostly due to circumstances in my life when I was more or less able to focus on Go. It certainly wasn't as smooth as this graph looks.

The old graph is not only more accurate but also much more exciting: https://www.europeangodatabase.eu/EGD21 ... y=14537380 It has a moment when I nearly drop to 4k, which is when I had some health problems that affected my play. It has another moment when I go up more than 100 points in a 5-game tournament, which is certainly not easy when you're 2k, and it gave me a morale boost.

You might think that "excitement" does not matter as long as the ranks are on average statistically accurate. But most people don't travel for years to offline tournaments to say: "yeah, I've started as 1k and 5 years later I'm still 1k, never went up or down in rank, yay!" That is just not motivating as far as rating is concerned. It's like gambling with pennies, win or lose you do not care as much.
jlt
Gosei
Posts: 1786
Joined: Wed Dec 14, 2016 3:59 am
GD Posts: 0
Has thanked: 185 times
Been thanked: 495 times

Re: EGF Rating System Commission Report 2020

Post by jlt »

You gained 100 points because you won 5/5 games in a tournament, and most of your opponents were stronger players. Wouldn't that already be enough to boost your morale?

Conversely, if you lose all your games in a tournament, wouldn't that be already depressing enough? Why would you insist on losing >100 points in addition? (This happened to me once...)
Javaness2
Gosei
Posts: 1545
Joined: Tue Jul 19, 2011 10:48 am
GD Posts: 0
Has thanked: 111 times
Been thanked: 322 times

Re: EGF Rating System Commission Report 2020

Post by Javaness2 »

A big thank you for updating the rating system
gennan
Lives in gote
Posts: 497
Joined: Fri Sep 22, 2017 2:08 am
Rank: EGF 3d
GD Posts: 0
Universal go server handle: gennan
Location: Netherlands
Has thanked: 273 times
Been thanked: 147 times

Re: EGF Rating System Commission Report 2020

Post by gennan »

Adin wrote:This is my new EGD profile: https://www.europeangodatabase.eu/EGD/P ... y=14537380

The fact that tournaments make you go up or down much less than before makes it look like I've been a fairly stable 1k for 8 years and a pretty much stable 1d for the next 6 years. That is simply not true. My strength did go up and down, mostly due to circumstances in my life when I was more or less able to focus on Go. It certainly wasn't as smooth as this graph looks.

The old graph is not only more accurate but also much more exciting: https://www.europeangodatabase.eu/EGD21 ... y=14537380 It has a moment when I nearly drop to 4k, which is when I had some health problems that affected my play. It has another moment when I go up more than 100 points in a 5-game tournament, which is certainly not easy when you're 2k, and it gave me a morale boost.

You might think that "excitement" does not matter as long as the ranks are on average statistically accurate. But most people don't travel for years to offline tournaments to say: "yeah, I've started as 1k and 5 years later I'm still 1k, never went up or down in rank, yay!" That is just not motivating as far as rating is concerned. It's like gambling with pennies, win or lose you do not care as much.
So you prefer volatile ratings, because that reflects ups and downs in your life better than more stable ratings. I can understand that.

But I think there is also something to say for more stable ratings. At least when your rating reaches a new high level, you can be more confident that you have really reached that level with some consistency, instead of it being just some random noise, a fluke or a "lucky" draw of opponents in a single tournament.
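
The "random noise" point can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses the standard chess Elo expectation formula with a fixed step size `k`, not the actual EGF formulas, and the names `simulate` and `spread` are made up for this example. The player's true strength never changes, so every movement of the rating is pure noise.

```python
import random
import statistics


def expected_score(r_a, r_b, scale=400.0):
    """Standard Elo logistic expectation (an assumption, not the EGF formula)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / scale))


def simulate(k, true_strength=2000.0, games=2000, seed=1):
    """Rating history of a player whose real strength stays constant."""
    rng = random.Random(seed)
    rating = true_strength
    history = []
    for _ in range(games):
        # Assumed: the opponent is correctly rated and equally strong.
        opponent = true_strength
        win_probability = expected_score(true_strength, opponent)
        result = 1.0 if rng.random() < win_probability else 0.0
        # Elo-style update: step size k times (actual - predicted result).
        rating += k * (result - expected_score(rating, opponent))
        history.append(rating)
    return history


def spread(k):
    """Standard deviation of the rating history around its own mean."""
    return statistics.pstdev(simulate(k))


if __name__ == "__main__":
    for k in (10.0, 40.0):
        print(f"k={k}: spread of about {spread(k):.0f} points")
```

With the same sequence of wins and losses, the larger step size should give a visibly wider spread, even though the player's strength is identical in both runs.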
gennan
Lives in gote
Posts: 497
Joined: Fri Sep 22, 2017 2:08 am
Rank: EGF 3d
GD Posts: 0
Universal go server handle: gennan
Location: Netherlands
Has thanked: 273 times
Been thanked: 147 times

Re: EGF Rating System Commission Report 2020

Post by gennan »

Adin wrote:
gennan wrote:some countries have quite conservative policies in this regard and don't allow it, even for low rated players, effectively forcing those players to sandbag until the rating system catches up.
This is a very good reason to not make it even harder for those players by lowering the rating gains.

In real life, players are much more likely to overrate themselves. They will happily think they are stronger than they truly are. The new system will reinforce the idea that "I'm actually much stronger and I can't go up fast enough by playing, so I need to declare a stronger rank". Then, as the rating losses are also significantly reduced, the overrated player will take longer to reach their real rating.

Essentially instead of rewarding or punishing players based on objective results in games the new system promotes setting your own rating based on subjective opinions.
If you check the new rating histories of young European top players, such as Ilja Shikshin or Artem Kachanovskyi, and compare with the old rating histories, you can see for yourself that the new version is quite capable of keeping up even with the progress of prodigies.
It's just generating less noise than the old version, while tracking their progress.

So I don't think it's really justified to assume that the new lower volatility means that it cannot keep up with rapid progress.
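
How much a lower volatility slows down catching up can be estimated with a rough sketch. Everything here is an assumption for illustration: the standard chess Elo expectation formula, a fixed step size `k`, opponents rated exactly at the player's current rating, and average results instead of random ones. The function name `games_to_catch_up` and all default numbers are hypothetical and not taken from the EGF system.

```python
def expected_score(r_a, r_b, scale=400.0):
    """Standard Elo logistic expectation (an assumption, not the EGF formula)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / scale))


def games_to_catch_up(k, start=1000.0, true_strength=1500.0,
                      tolerance=50.0, max_games=10_000):
    """Count games until the rating is within `tolerance` of the true strength.

    Uses the average (expected) result of each game, so the outcome is
    deterministic. Returns None if the rating does not get there in time.
    """
    rating = start
    for games in range(max_games + 1):
        if abs(true_strength - rating) <= tolerance:
            return games
        # Assumed pairing: the opponent's rating equals the player's rating.
        opponent = rating
        actual = expected_score(true_strength, opponent)   # what really happens
        predicted = expected_score(rating, opponent)       # what the rating expects
        rating += k * (actual - predicted)
    return None


if __name__ == "__main__":
    for k in (10.0, 20.0, 40.0):
        print(f"k={k}: {games_to_catch_up(k)} games")
```

A smaller `k` needs more games, but how many more, and whether that matters for someone playing only a few tournaments a year, depends on the real formulas and on the actual step sizes at that rating level.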
Adin
Dies in gote
Posts: 28
Joined: Thu Jun 16, 2016 1:25 pm
Rank: 1 kyu
GD Posts: 0
Been thanked: 2 times

Re: EGF Rating System Commission Report 2020

Post by Adin »

I don't understand why you see these two as good examples of rank progression with the new system. Ilja started as 4k and went to 262 tournaments. Artem had a special promotion to 6k very early in his career and went to almost 200 tournaments. It's obvious that when you start around 5k and go to hundreds of tournaments then it's possible to rise to high dan level.

The real issue is with the average player who goes to 2-4 offline tournaments a year (even after corona is over). If he is a new player now starting at 30k, it might take years for his EGF rank to catch up to his real strength. You are saying that he can just request a special promotion as Artem did, but there are a couple of issues with that. The first is that, as you said, some countries do not allow special promotions and just force you to sandbag. At least the higher volatility used to allow a shorter sandbagging period. The second is that quite a lot of players would gladly take advantage of special promotions becoming more common to overrate themselves. Don't think only in terms of countries where the average age of players is 30-40 and they are mostly well-educated adults. Countries like Romania have *a lot* of kids and teens who see their rank as a way to stand out among their peers, and some of them are ready to use any opportunity to raise it regardless of real strength.

Even take a player like myself, who started as 2k (I had been playing a lot online before going to offline competitions) and who did not have very large rank changes. If I had just floated from 1k to 1d for 14 years, as in my new graph, I would get no motivation from my rating. Going down can hurt, but that hurt is common in all sports and it's necessary to motivate you to improve. Going up is obviously a motivating factor too. You might say that's just a personal preference and other people prefer stability to feel comfortable. That may be true, but the real issue is that the new rank graph does not reflect reality. In these 14 years I was playing a lot online and certainly did not have a smooth ride there, just floating around the same rank. Even though the sample size online was far larger, I was going up and down quite a lot, as my play was certainly changing, and it still does today as I have been using AI a lot for review lately. What you are doing is just cushioning the real rises and falls and creating a much flatter line that will look good in statistics (after all, someone going from 3k to 2d can be well approximated for statistical purposes as a stable 1k) but hides the real evolution of the player.
gennan
Lives in gote
Posts: 497
Joined: Fri Sep 22, 2017 2:08 am
Rank: EGF 3d
GD Posts: 0
Universal go server handle: gennan
Location: Netherlands
Has thanked: 273 times
Been thanked: 147 times

Re: EGF Rating System Commission Report 2020

Post by gennan »

To allow a quicker rating increase, the volatility for low ratings is much larger than the volatility for high ratings (this was also the case in the old system). This volatility is determined by the con function (in chess Elo rating systems this is called K and it also typically varies from large values for low ratings to low values for high ratings).
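
As a rough illustration of the idea, here is a sketch of an Elo-style update in which the step size depends on the player's rating. The shape of `con` below (a straight line from 40 down to 10) and the standard chess Elo expectation formula are assumptions made only for this example; they are not the actual EGF formulas.

```python
def expected_score(r_player, r_opponent, scale=400.0):
    """Standard Elo logistic expectation (an assumption, not the EGF formula)."""
    return 1.0 / (1.0 + 10.0 ** ((r_opponent - r_player) / scale))


def con(rating, high=40.0, low=10.0, r_min=0.0, r_max=2800.0):
    """Hypothetical volatility factor: large for low ratings, small for high ones."""
    clamped = min(max(rating, r_min), r_max)
    fraction = (clamped - r_min) / (r_max - r_min)
    return high - fraction * (high - low)


def update(rating, opponent, result):
    """New rating after one game; result is 1.0 for a win, 0.0 for a loss."""
    return rating + con(rating) * (result - expected_score(rating, opponent))


if __name__ == "__main__":
    # The same event (a win against an equally rated opponent) moves a
    # low-rated player further than a high-rated one.
    for r in (500.0, 1500.0, 2500.0):
        print(f"rating {r:.0f}: gain {update(r, r, 1.0) - r:.1f}")
```

With a factor like this, a win or loss shifts a low rating more than a high one, which is what lets quickly improving beginners move up faster.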

I tried to look up a young Romanian DDK player that didn't use rating resets (special promotions) to compare the old with the new rating history. This may be an example that meets your criteria: https://www.europeangodatabase.eu/EGD/P ... y=18437639, but I still fail to see that the lower volatility is a problem. The volatility of their new rating history is a bit smaller, but it is still large enough to keep up.

As for higher volatility giving higher motivation, I think that's very much a matter of personal taste. I understand that you prefer a high volatility, but I don't feel that the motivation of players should be a design concern for rating systems. We cannot make a rating system that everybody gets to personalize to their own preferences or their own perception of reality. The rating system merely exists to track the level of all European tournament go players, to allow tournament organisers to make good tournament pairings where players get to play opponents of a similar level.
Javaness2
Gosei
Posts: 1545
Joined: Tue Jul 19, 2011 10:48 am
GD Posts: 0
Has thanked: 111 times
Been thanked: 322 times

Re: EGF Rating System Commission Report 2020

Post by Javaness2 »

"The real issue is with the average player who goes to 2-4 offline tournaments a year (even after corona is over). If he is a new player now starting at 30k it might take years for his EGF rank to catch up to his real strength"

Sorry, but why wouldn't he just enter his next tournament at a rank which matched his actual strength?
We know that we can sometimes see kids improve 5 ranks around the level of 30k to 20k practically overnight, something just clicks in their understanding of the game. The stronger players at the local club can surely manage that kind of change.
Adin
Dies in gote
Posts: 28
Joined: Thu Jun 16, 2016 1:25 pm
Rank: 1 kyu
GD Posts: 0
Been thanked: 2 times

Re: EGF Rating System Commission Report 2020

Post by Adin »

gennan wrote:I tried to look up a young Romanian DDK player that didn't use rating resets (special promotions) to compare the old with the new rating history. This may be an example that meets your criteria: https://www.europeangodatabase.eu/EGD/P ... y=18437639, but I still fail to see that the lower volatility is a problem. The volatility of their new rating history is a bit smaller, but it is still large enough to keep up.
That's a player who took part in 29 tournaments over 3 years and a bit. Certainly 9 tournaments per year is way above average.
Javaness2 wrote:Sorry, but why wouldn't he just enter his next tournament at a rank which matched his actual strength?
Because, as gennan said, some countries do not allow you to do it. And it gets even worse now because you are starting at 30k, not 20k.

And for countries that do allow you to set a higher rank, the lower volatility is likely to become an excuse for players to overrate themselves. They will say: "look, I'm winning, but it's too hard to reach my real level by playing because the point gains per win are too small". I already heard that with the old system.
jlt
Gosei
Posts: 1786
Joined: Wed Dec 14, 2016 3:59 am
GD Posts: 0
Has thanked: 185 times
Been thanked: 495 times

Re: EGF Rating System Commission Report 2020

Post by jlt »

The French go Federation allows rating resets, but it has to be approved by a committee:

https://ffg.jeudego.org/informations/of ... uation.php

In the 11k-30k range, only resets by 4 levels or more are allowed. If an application is rejected, then the player has to wait 3 months before applying again.

Maybe other countries should adopt a similar rule.
Javaness2
Gosei
Posts: 1545
Joined: Tue Jul 19, 2011 10:48 am
GD Posts: 0
Has thanked: 111 times
Been thanked: 322 times

Re: EGF Rating System Commission Report 2020

Post by Javaness2 »

I really think that the national federations have to reform if they think rating resets are something special in the outer kyu range. Rating volatility can never fully compensate for completely out of date ratings. The system is not designed to do that.

I know that the Romanian Go Federation has already had some fun discussing who is in control, which rules are right and which are wrong, etc. Here is another opportunity to create some fruitful discussions to bring joy to the community. :)

Seriously though, the EGF does need to standardize the practices of its federations regarding their use of the rating system. For instance
  • Best practice on submission of rating results - some member nations have never submitted tournament results
  • Produce clear guidelines on the limitations of the rating system - i.e. why rating resets are necessary
  • Agreement on the modality of rating resets. Example: a 10k has a record of 10k+ 9k+ 7k+ 6k+ 5k+ in an event. The supreme beings decide he will be promoted to 7k. Is it valid to change his entry rank to 7k before submitting the tournament file for rating, or can it only be reset for his next tournament?
karlsgo
Dies in gote
Posts: 26
Joined: Sat Jul 18, 2015 7:51 am
GD Posts: 0
Location: Karlsruhe/Germany
Has thanked: 1 time
Been thanked: 6 times

Re: EGF Rating System Commission Report 2020

Post by karlsgo »

jlt wrote:The French go Federation …

In the 11k-30k range, only resets by 4 levels or more are allowed.
Two points.

It is sad that it differs from the European rule (which allows 2 stones).

But having sandbaggers by design means that people who do not like sandbagging will not join the tournament.

When I promote tournaments in France, I always get complaints about the French rating, and most players do not like sandbaggers.

As a tournament director, I want players to have good games, and I accept a rating reset if it is comprehensible (an online rating per se is not comprehensible).

Greetings from Germany

Wilhelm