
All times are UTC - 8 hours [ DST ]




 Post subject: Re: Whole History Rating
Post #21 Posted: Mon Jun 21, 2010 3:14 pm 
Lives in sente

Posts: 914
Liked others: 391
Was liked: 162
Rank: German 2 dan
It seems to me that Robert's fears are quite unspecific. I can see how they apply to the current systems, but I do not see them as an impediment to studying better ones.

Of course, looking at how the new system handles sparse local data is an important subtopic.

Perhaps we could now take a more detailed look at the paper and get an impression of the different parts of the system and how they fit together.

_________________
A good system naturally covers all corner cases without further effort.

 Post subject: Re: Whole History Rating
Post #22 Posted: Mon Jun 21, 2010 4:11 pm 
Lives with ko

Posts: 193
Location: Trondheim, Norway
Liked others: 76
Was liked: 29
Rank: 2d EGF and KGS
GD Posts: 1005
Universal go server handle: sverre
RobertJasiek wrote:
Sverre, there are these possibilities: a) yes, b) give them pseudo-ratings that are shown for their pleasure but otherwise ignored, c) use a rating system that calculates only local ratings anyway.


OK, could you give some precise numbers, for example a minimum number of rated games per year, or objective criteria for when one is in an "isolated subpopulation"? And also an estimate of what percentage of players would be booted from the rating system under these criteria?

 Post subject: Re: Whole History Rating
Post #23 Posted: Mon Jun 21, 2010 4:26 pm 
Tengen

Posts: 4511
Location: Chatteris, UK
Liked others: 1589
Was liked: 656
Rank: Nebulous
GD Posts: 918
KGS: topazg
Sverre wrote:
RobertJasiek wrote:
Sverre, there are these possibilities: a) yes, b) give them pseudo-ratings that are shown for their pleasure but otherwise ignored, c) use a rating system that calculates only local ratings anyway.


OK, could you give some precise numbers, for example a minimum number of rated games per year, or objective criteria for when one is in an "isolated subpopulation"? And also an estimate of what percentage of players would be booted from the rating system under these criteria?


Well, I'm guessing the 5 or 6 games I play a year would probably get me booted anyway...

 Post subject: Re: Whole History Rating
Post #24 Posted: Mon Jun 21, 2010 5:02 pm 
Lives in gote

Posts: 394
Liked others: 29
Was liked: 176
GD Posts: 1072
RobertJasiek wrote:
pwaldron, maybe in theory there are "more information is better" theorems, but currently rating systems are so far from perfect that a more modest approach makes it easier to design better systems. Once we have them, one can still come back to the low-confidence sparse-data noise and see whether it can already be explained well.


Robert, chant ten times: It is always better to have more information.

The very worst you can have is a game prediction algorithm that flips a coin to predict the winner. Every additional game adds more information, and it cannot make a system less accurate. Some game results are more useful than others in pinning down ratings, but they all have value and it is foolish to throw any away. If the information is not useful then it does little to reduce the uncertainty in the resulting estimated parameters (i.e., ratings) but it's never worse to have the information than not to have it.

 Post subject: Re: Whole History Rating
Post #25 Posted: Tue Jun 22, 2010 12:20 am 
Judan

Posts: 6139
Liked others: 0
Was liked: 786
Sverre wrote:
OK, could you give some precise numbers, for example a minimum number of rated games per year, or objective criteria for when one is in an "isolated subpopulation"? And also an estimate of what percentage of players would be booted from the rating system under these criteria?


No. One would have to think about it to set useful values. I have wanted to encourage such thinking; I have not carried it out in detail myself.

 Post subject: Re: Whole History Rating
Post #26 Posted: Tue Jun 22, 2010 12:31 am 
Judan

Posts: 6139
Liked others: 0
Was liked: 786
pwaldron wrote:
Robert, chant ten times: It is always better to have more information.


My first statistics book had a nice example: Estimate the distance between two towns. First you take a rough look: "The next town is about 10km away." Then you measure your town's mediaeval wall: "It is 30cm thick". Now you conclude: "The distance is 10km + 30cm = 10.0003km."

Likewise, if you have two isolated players who each claim to be 5k and their total game data consists of exactly 1 game between themselves, you cannot connect that information to a huge pool of 5k players elsewhere.

Chant ten times: Strongly disconnected data should not be compared. :)

Quote:
Every additional game adds more information, and it cannot make a system less accurate.


The problem lies in the system itself. If it is not good enough, then it does not interpret sparse data correctly. One must not overinterpret such a system by also feeding it the sparse data.

 Post subject: Re: Whole History Rating
Post #27 Posted: Tue Jun 22, 2010 2:49 am 
Lives in sente

Posts: 914
Liked others: 391
Was liked: 162
Rank: German 2 dan
Robert, you seem to presume a weakness of the system before you have even looked at it.

In my as yet rough understanding, each game result is a data point. If only a few data points are directly connected to a player, then that player's resulting rating graph will be easily moved by further (even indirect) data. Game results against this player will therefore naturally have little impact on the rating graph of a player with more games.

I understand that you have had bad experiences with Elo-like systems. My impression is that this kind of problem is naturally covered by a WHR-like approach.

We shall keep this potential problem in mind, but I would like to move on to a more detailed look at the algorithm now.

_________________
A good system naturally covers all corner cases without further effort.

 Post subject: Re: Whole History Rating
Post #28 Posted: Tue Jun 22, 2010 3:03 am 
Judan

Posts: 6139
Liked others: 0
Was liked: 786
I have not referred to only one particular rating system but to rating systems in general.

 Post subject: Re: Whole History Rating
Post #29 Posted: Tue Jun 22, 2010 4:31 am 
Lives in gote

Posts: 643
Location: Munich, Germany
Liked others: 115
Was liked: 102
Rank: KGS 3k
KGS: LiKao / Loki
One problem with most ranking systems is how to anchor them. On online servers you can anchor on bots and on some players who are very active but don't improve much. Anchoring real-life (RL) systems is much harder.
Perhaps define some percentiles for rank intervals?

_________________
Sanity is for the weak.

 Post subject: Re: Whole History Rating
Post #30 Posted: Tue Jun 22, 2010 8:39 pm 
Lives with ko

Posts: 129
Location: Turku, Finland
Liked others: 12
Was liked: 21
Rank: EGF 1989 KGS 2d
Li Kao wrote:
One problem with most ranking systems is how to anchor them. On online servers you can anchor on bots and on some players who are very active but don't improve much. Anchoring real-life (RL) systems is much harder.
Perhaps define some percentiles for rank intervals?


Anchoring is not necessary because we can just let the system float freely. A mathematical rating system should not have any direct and fixed relationship with kyuu-dan ranks (which are subjective honorary titles). If we try to force that relationship, it will just decrease the reliability of the mathematical system. (We play handicap games in tournaments only when we are beginner double-digit kyuus!)

And the good thing about plain and simple Elo is that even though we cannot deduce from it the exact probability of beating a specific opponent, we can always put players in a very specific order within a certain subpopulation. (This is the reason why GoR works like magic!) And there is always enough traffic between subpopulations (e.g. via the EGC) that we can calibrate them to roughly match each other if necessary.

But I agree that the history approach has its merits. The best way is to calculate both normal Elo and a rating that includes enough history (a year or so into the past) and put both figures on the same graph.

 Post subject: Re: Whole History Rating
Post #31 Posted: Tue Jun 22, 2010 8:46 pm 
Lives in gote

Posts: 653
Location: Austin, Texas, USA
Liked others: 54
Was liked: 216
RobertJasiek wrote:
My first statistics book had a nice example: Estimate the distance between two towns. First you take a rough look: "The next town is about 10km away." Then you measure your town's mediaeval wall: "It is 30cm thick". Now you conclude: "The distance is 10km + 30cm = 10.0003km."


I would say it this way: The next town is 10km +/- 2km away. Then you measure the wall: 30cm. Now: The next town is 10.0003km +/- 2km away. Mathematically it works just fine.
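The arithmetic here can be checked with a small sketch. It assumes the two measurements are independent, so that their uncertainties add in quadrature; the 1cm uncertainty on the wall is an invented figure for illustration, and `add_measurements` is a hypothetical helper, not part of any rating system.

```python
import math


def add_measurements(a, sigma_a, b, sigma_b):
    """Sum two independent measurements; their uncertainties add in quadrature."""
    return a + b, math.sqrt(sigma_a ** 2 + sigma_b ** 2)


# All values in km. The 2 km uncertainty comes from the post above;
# the 1 cm (0.00001 km) uncertainty on the wall is an assumption.
total, sigma = add_measurements(10.0, 2.0, 0.0003, 0.00001)
print(f"{total:.4f} km +/- {sigma:.4f} km")
```

The precise wall measurement shifts the estimate by 30cm and leaves the uncertainty essentially where it was, which is the point: the extra data does no harm as long as the uncertainty is carried along.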

Turning to examples of go ratings, if a player has only played 2 games, an even game win against a 30k, and an even game loss to a 2k, then the rating system can say he is 16k +/- 14 ranks.

If a different player played 1000 games, all even games against 16k players, and won 50% and lost 50%, then the rating system can say he is 16k +/- 0.2 ranks.

So no games are thrown out. Of course the system will have less confidence in the ratings of players with fewer games. AGA's system publishes a number related to the confidence for all players.
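A rough way to see how the width of the estimate depends on the amount of data is to compute a posterior over a grid of possible ranks. This is only a sketch under stated assumptions: a logistic win probability with an invented slope `K` per rank, a flat prior over an arbitrary rank range, and opponents whose ranks are taken as exactly known. None of this is the AGA's or any other real system's method, and the widths it prints will not match the illustrative +/- figures above.

```python
import math

K = 0.8  # assumed logistic slope per rank; not taken from any real rating system


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def posterior_mean_std(results, lo=-46.0, hi=14.0, steps=6000):
    """Posterior mean and standard deviation of a player's rank.

    results: list of (opponent_rank, wins, losses) for even games.
    Ranks are plain numbers with kyu as negatives, e.g. 16k -> -16.
    A flat prior on [lo, hi] is assumed.
    """
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    log_post = []
    for r in grid:
        ll = 0.0
        for opp, wins, losses in results:
            z = K * (r - opp)
            ll += wins * log_sigmoid(z) + losses * log_sigmoid(-z)
        log_post.append(ll)
    top = max(log_post)  # subtract the maximum to avoid underflow
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    mean = sum(w * r for w, r in zip(weights, grid)) / total
    var = sum(w * (r - mean) ** 2 for w, r in zip(weights, grid)) / total
    return mean, math.sqrt(var)


# Two games: a win against a 30k and a loss to a 2k.
few_games = posterior_mean_std([(-30, 1, 0), (-2, 0, 1)])
# 1000 games against 16k opponents, half won and half lost.
many_games = posterior_mean_std([(-16, 500, 500)])
print(few_games, many_games)
```

Both estimates are centred on 16k, but the standard deviation for the two-game player is many ranks wide, while for the 1000-game player it is a small fraction of a rank. No games are discarded in either case.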

 Post subject: Re: Whole History Rating
Post #32 Posted: Tue Jun 22, 2010 10:09 pm 
Judan

Posts: 6139
Liked others: 0
Was liked: 786
Quote:
the rating system can say he is 16k +/- 14 ranks.


If the rating systems did say it (in terms of rating points), that would be an improvement. Strange confidence values instead say too little to the reader.

 Post subject: Re: Whole History Rating
Post #33 Posted: Wed Jun 23, 2010 2:33 am 
Lives in sente

Posts: 914
Liked others: 391
Was liked: 162
Rank: German 2 dan
Liisa wrote:
A mathematical rating system should not have any direct and fixed relationship with kyuu-dan ranks (which are subjective honorary titles). If we try to force that relationship, it will just decrease the reliability of the mathematical system.


The ranks are just labels attached to certain values of the model. They do not change the model.

Quote:
(We play handicap games in tournaments only when we are beginner double digit kyuus!)


I think that this is a very lamentable recent development.

Quote:
And the good thing about plain and simple Elo is that even though we cannot deduce from it the exact probability of beating a specific opponent, we can always put players in a very specific order within a certain subpopulation. (This is the reason why GoR works like magic!)


You describe a use and an outcome and that should be a reason? Magic indeed...

Quote:
And there is always enough traffic between subpopulations (e.g. via the EGC) that we can calibrate them to roughly match each other if necessary.


What is the average rating improvement of Finnish players at the London Open? If calibration were so fast, it should approach 0, no? The problem is that calibration can only propagate through later games. If a population is 40 Elo points underrated and 5% of them go to a big foreign tournament, they bring home 40 points each. In theory, this should mean that the population is afterwards only 38 points underrated on average, but in order to actually distribute these points, a lot of games have to be played (and the players bringing the points home will naturally not be very inclined to do so).
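The 38-point figure is just an average over the population, as this small sketch shows. It assumes that the travelling players come home fully corrected and that nobody else's rating moves; `average_offset_after_trip` is a hypothetical name used only for illustration.

```python
def average_offset_after_trip(offset, travelling_fraction):
    """Average underrating left in the population after the travellers return.

    Assumes the travellers are fully corrected (offset 0) and everyone
    else still carries the full offset.
    """
    return (1.0 - travelling_fraction) * offset


# 40 Elo points underrated, 5% of the population travels.
print(average_offset_after_trip(40.0, 0.05))
```

The average drops by only two points, and those two points sit entirely with the 5% who travelled. The other 95% are still 40 points underrated until enough local games have been played.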

Quote:
But I agree that history approach has its merits. The best way is to calculate simultaneously normal Elo and rating that includes enough history (a year or so to the past) and put both figures to the same graph.


I think that you have not yet looked at the idea of the algorithm in question. It is not about "including some history".

I guess that I cannot expect everyone who wants to discuss this to read that paper, so an explanation will have to be given in this thread. I shall look into that.

_________________
A good system naturally covers all corner cases without further effort.

 Post subject: Re: Whole History Rating
Post #34 Posted: Wed Jun 23, 2010 6:53 am 
Lives with ko

Posts: 129
Location: Turku, Finland
Liked others: 12
Was liked: 21
Rank: EGF 1989 KGS 2d
Harleqin wrote:
Quote:
And there is always enough traffic between subpopulations (e.g. via the EGC) that we can calibrate them to roughly match each other if necessary.


What is the average rating improvement of Finnish players at the London Open? If calibration were so fast, it should approach 0, no? The problem is that calibration can only propagate through later games. If a population is 40 Elo points underrated and 5% of them go to a big foreign tournament, they bring home 40 points each. In theory, this should mean that the population is afterwards only 38 points underrated on average, but in order to actually distribute these points, a lot of games have to be played (and the players bringing the points home will naturally not be very inclined to do so).


If we see that a subpopulation's rating is off by 38 points after the LOGC, then we can add 38 Elo points to the entire active subpopulation. In practice we can add 12 points (30%) and then see how much the subpopulation is still underrated the following year. If this kind of comparison is applied once a year, soon enough we will get acceptable differences between subpopulations, or better yet, the subpopulations will stay in sync. This is not that hard.
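How quickly such a yearly correction would close the gap can be sketched as follows. This assumes one reading of the proposal: each year the remaining offset is measured exactly and 30% of it is added to everyone, with no other drift in between. The function name and defaults are hypothetical.

```python
def yearly_offsets(initial_offset, fraction=0.3, years=5):
    """Remaining underrating after each yearly partial correction.

    Assumes the offset is measured exactly each year and nothing else changes.
    """
    offsets = [initial_offset]
    for _ in range(years):
        offsets.append(offsets[-1] * (1.0 - fraction))
    return offsets


for year, offset in enumerate(yearly_offsets(38.0)):
    print(year, round(offset, 1))
```

Under these assumptions the offset shrinks by a factor of 0.7 per year, so it takes about two years to halve. How to measure the offset in the first place is left open here.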

Quote:
I guess that I cannot expect everyone who wants to discuss this to read that paper, so an explanation will have to be given in this thread. I shall look into that.


That would be nice, because it is difficult to understand the key points of WHR from the paper. What is WHR about in practice? Exactly how many games or months from the past would you like to take into consideration? Examples resembling the real world are always nice.

 Post subject: Re: Whole History Rating
Post #35 Posted: Wed Jun 23, 2010 7:44 am 
Gosei

Posts: 2116
Location: Silicon Valley
Liked others: 152
Was liked: 330
Rank: 2d AGA
GD Posts: 1193
KGS: lavalamp
Tygem: imapenguin
IGS: lavalamp
OGS: daniel_the_smith
Liisa wrote:
If we see that a subpopulation's rating is off by 38 points after the LOGC, then we can add 38 Elo points to the entire active subpopulation. In practice we can add 12 points (30%) and then see how much the subpopulation is still underrated the following year. If this kind of comparison is applied once a year, soon enough we will get acceptable differences between subpopulations, or better yet, the subpopulations will stay in sync. This is not that hard.


Do you... plan to do that by hand? For each subpopulation? And how do you even identify a subpopulation? How can you tell how over/underrated a subpopulation is? And how could you keep such a system impartial? I concede it may be possible, but I don't see how you can claim it isn't hard...

Besides, WHR basically does that for you, only in a much better way than arbitrarily giving subpopulations 30% bonuses...

_________________
That which can be destroyed by the truth should be.
--
My (sadly neglected, but not forgotten) project: http://dailyjoseki.com

 Post subject: Re: Whole History Rating
Post #36 Posted: Wed Jun 23, 2010 12:16 pm 
Lives with ko

Posts: 129
Location: Turku, Finland
Liked others: 12
Was liked: 21
Rank: EGF 1989 KGS 2d
daniel_the_smith wrote:
Besides, WHR basically does that for you


how?

 Post subject: Re: Whole History Rating
Post #37 Posted: Wed Jun 23, 2010 12:29 pm 
Gosei

Posts: 2116
Location: Silicon Valley
Liked others: 152
Was liked: 330
Rank: 2d AGA
GD Posts: 1193
KGS: lavalamp
Tygem: imapenguin
IGS: lavalamp
OGS: daniel_the_smith
As I understand it, WHR works backwards as well as forwards in time. So it should distribute those rating points back amongst the isolated pool retroactively, with no further games necessary.

Although, now that I think about it more, I don't know that it would distribute significantly more points than those few players won (as you were suggesting).
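A toy sketch of the "backwards in time" idea, as far as I understand it: each player has one rating per day, game results follow a Bradley-Terry model, and a Wiener-process prior ties a player's ratings on neighbouring days together. The code below is not the paper's algorithm. It fits a single player by plain gradient ascent, treats opponent ratings as fixed and known, and uses an invented variance rate `w2`; all names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def map_ratings(games_by_day, days, w2=0.1, steps=20000, lr=0.01):
    """MAP natural ratings of one player, one value per day with games.

    games_by_day[i] is a list of (opponent_rating, won) played on days[i].
    Prior: r[i+1] - r[i] ~ Normal(0, w2 * (days[i+1] - days[i])).
    """
    r = [0.0] * len(days)
    for _ in range(steps):
        grad = [0.0] * len(days)
        # Bradley-Terry likelihood: P(win) = sigmoid(own rating - opponent rating)
        for i, games in enumerate(games_by_day):
            for opp, won in games:
                grad[i] += (1.0 if won else 0.0) - sigmoid(r[i] - opp)
        # Wiener-process prior linking consecutive days
        for i in range(len(days) - 1):
            pull = (r[i + 1] - r[i]) / (w2 * (days[i + 1] - days[i]))
            grad[i] += pull
            grad[i + 1] -= pull
        r = [ri + lr * g for ri, g in zip(r, grad)]
    return r


day0 = [(0.0, True), (0.0, False)]  # one win, one loss against a 0-rated opponent
day10 = [(1.0, True)] * 3           # later: three wins against a stronger opponent

alone = map_ratings([day0], [0])
joint = map_ratings([day0, day10], [0, 10])
print(alone, joint)
```

With only the day-0 games the estimate stays at 0. Once the later wins are included, the estimate for day 0 itself rises, because the prior makes a large jump in strength within ten days unlikely. Nothing is handed from one player to another here; the later results simply change what the model believes about the earlier date.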

_________________
That which can be destroyed by the truth should be.
--
My (sadly neglected, but not forgotten) project: http://dailyjoseki.com

 Post subject: Re: Whole History Rating
Post #38 Posted: Wed Jun 23, 2010 3:06 pm 
Lives in sente

Posts: 914
Liked others: 391
Was liked: 162
Rank: German 2 dan
WHR does not distribute points.

_________________
A good system naturally covers all corner cases without further effort.

 Post subject: Re: Whole History Rating
Post #39 Posted: Wed Jun 23, 2010 3:08 pm 
Gosei

Posts: 2116
Location: Silicon Valley
Liked others: 152
Was liked: 330
Rank: 2d AGA
GD Posts: 1193
KGS: lavalamp
Tygem: imapenguin
IGS: lavalamp
OGS: daniel_the_smith
Maybe I should read the paper again. I read it a very long time ago... :)

_________________
That which can be destroyed by the truth should be.
--
My (sadly neglected, but not forgotten) project: http://dailyjoseki.com

 Post subject: Re: Whole History Rating
Post #40 Posted: Wed Jun 23, 2010 5:11 pm 
Lives in gote

Posts: 653
Location: Austin, Texas, USA
Liked others: 54
Was liked: 216
RobertJasiek wrote:
Quote:
the rating system can say he is 16k +/- 14 ranks.


If the rating systems did say it (in terms of rating points), that would be an improvement. Strange confidence values instead say too little to the reader.


AGA already does something like this. My AGA rating is 3.354528, with a sigma of 0.276734.
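To turn such a pair of numbers into a "+/-" statement, one could do something like the following. This assumes the published sigma can be read as the standard deviation of a roughly normal estimate, which I have not checked against the AGA's documentation; `rating_interval` is a hypothetical helper.

```python
def rating_interval(rating, sigma, z=2.0):
    """Interval of +/- z standard deviations around the rating.

    z = 2 covers roughly 95% if the estimate is approximately normal (an assumption).
    """
    return rating - z * sigma, rating + z * sigma


low, high = rating_interval(3.354528, 0.276734)
print(f"{low:.2f} to {high:.2f}")
```

Read this way, the rating above would be 3.35 plus or minus about 0.55.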
