topazg wrote:The reason I don't think it should be based on rating is each player can play something like 52 games (including the teaches, 4 games against each of the other 13 players). In reality, it is rare for players to reach 20 - which means with 32 assumed draws TPR will be naturally lower for higher rated players. This seems unreasonable as the majority of the possible games they could have played are marked as results that never happened, which feels like a poor way of ranking.
There may be other variables you can feed into a TPR, but I can't think of any off the top of my head.
Actually, this is nonsense, sorry. I misread what you were suggesting. TPR doesn't have this problem, but it does have another that numsgil has pointed out: s(g,w) < s(g+1, w) (where s(g,w) is a player's standing after g games with w wins) is a really important feature. If you win 3 or 4 big games and have a 3-0 or 4-0 record, you aren't going to want to play any more - even with the "jigo against self" which is often used in TPR, your performance is sufficiently high that it will be hard for someone to knock you off top spot with a 100% record. I think this is a slightly better system than the current one being used, but the disincentive to continue playing with a good record puts it behind the other proposals (including #wins).
Because I am unsure exactly which parts of your previous post you were calling nonsense, I'll first give a little more depth to my proposal. I used the term TPR, but it is probably a bad term for it; my system would be:
1. Give every participant of the league an initial rating of 0.
2. Add a dummy player (say "Player0") whose rating is fixed at 0, and give every player a jigo against Player0.
3. Use an Elo-type rating system: put in all results to calculate new ratings for all players.
4. Repeat step 3 until all ratings stabilize!

Step 4 here is really the important part. It allows you to run the system completely independently of any prior ratings.
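For concreteness, the four steps might be sketched like this (hypothetical code, not anything the league actually runs; the function names are my own, each game is a (winner, loser) index pair, and instead of K-factor updates I solve each player's rating directly for the value whose expected score matches the actual score, which is the fixed point that repeating step 3 converges to):

```python
def expected(ra, rb):
    """Standard Elo expected score for a player rated ra against rb."""
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))

def league_ratings(n_players, games, rounds=200):
    """games: list of (winner, loser) index pairs. Returns stabilized ratings."""
    ratings = [0.0] * n_players          # step 1: everyone starts at 0
    for _ in range(rounds):              # step 4: repeat until stable
        new_ratings = []
        for p in range(n_players):
            opps = [0.0]                 # step 2: jigo against Player0 (fixed at 0)
            score = 0.5                  # ...which counts as half a point
            for w, l in games:
                if w == p:
                    opps.append(ratings[l])
                    score += 1.0
                elif l == p:
                    opps.append(ratings[w])
            # step 3: find the rating whose total expected score against
            # these opponents equals the actual score (by bisection)
            lo, hi = -4000.0, 4000.0
            for _ in range(60):
                mid = (lo + hi) / 2.0
                if sum(expected(mid, r) for r in opps) < score:
                    lo = mid
                else:
                    hi = mid
            new_ratings.append((lo + hi) / 2.0)
        ratings = new_ratings
    return ratings
```

With only one game between two players, this gives a positive rating to the winner and the mirror-image negative rating to the loser, with Player0 anchoring the scale at 0.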
As shown here, the proposal is a jigo against a dummy player with the base rating, not a jigo against self. The base-rating dummy pulls the rating down a lot more than a jigo against self does if you have played relatively few games. It does still, however, make it possible to end on top with relatively few wins if you have a high win ratio.
You could add part of Harleqin's suggestion into it: calculate a 95% confidence interval based on the number of games and use its lower bound to decrease the rating of players with fewer games. If you map the TPR values to the [0,1] interval, with the average at 0.5, that could work.
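One way that adjustment might look (a sketch only; the normal-approximation interval, the z = 1.96 value, and the function name are all my own assumptions, not Harleqin's exact proposal):

```python
import math

def score_with_confidence(tpr_as_score, n_games, z=1.96):
    """tpr_as_score: the player's TPR mapped into [0, 1] (league average = 0.5).
    Returns the lower bound of a normal-approximation 95% confidence
    interval, so players with fewer games get pulled down more."""
    p = tpr_as_score
    margin = z * math.sqrt(p * (1.0 - p) / n_games)
    return max(0.0, p - margin)

# A 0.8 score over 4 games ranks below a 0.7 score over 20 games:
score_with_confidence(0.8, 4)    # about 0.408
score_with_confidence(0.7, 20)   # about 0.499
```

The nice property is that the penalty shrinks as 1/sqrt(n_games), so playing more games steadily tightens the bound without ever rewarding game count directly.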
It is still possible, however, to have situations where s(g,w) > s(g+1, w). In extreme cases it is even possible to have s(g,w) > s(g+1, w+1), because winning against an opponent with an extremely low rating can actually bring your rating down (in Elo terms: your Score goes up, but your Average Opponent Rating goes down, so your Expected Score also goes up, and it might go up by more than your Score). This is a disadvantage of the system.
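To make that pathological case concrete, here is a toy example using the simple linear approximation to performance rating, i.e. average opponent rating plus 400 * (wins - losses) / games (my choice of formula for illustration, not necessarily the exact TPR formula used here; the ratings are made up):

```python
def linear_perf(opponent_ratings, wins, losses):
    """Linear performance-rating approximation:
    average opponent rating + 400 * (wins - losses) / games."""
    n = len(opponent_ratings)
    return sum(opponent_ratings) / n + 400.0 * (wins - losses) / n

before = linear_perf([2000, 2000], wins=2, losses=0)      # 2400.0
after = linear_perf([2000, 2000, 800], wins=3, losses=0)  # 2000.0
assert after < before  # an extra *win* lowered the performance rating
```

Here adding a win against an 800-rated opponent drags the opponent average down by 400 points while the perfect-score bonus stays capped at +400, so s(3,3) < s(2,2).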
I think it really isn't possible to design a system for a free paired competition that satisfies everyone. You have to find some balance between incentivizing people to play more games and allowing people with fewer games to have a shot at winning, and there will always be people complaining that the balance is out of whack one way or the other.