Re: Whole History Rating open source implementation.
Posted: Tue May 29, 2012 11:32 am
Rémi wrote: This would be unlike KGS, where the best way to increase one's rating is to stop playing.
Can you explain/elaborate on this?
hyperpape wrote: Can you explain/elaborate on this?
With the KGS rating system (not WHR), if you stop playing, your rating improves like your past opponents. So if you want your rating to make fast progress you can select opponents who are making fast progress, and then stop playing.
Rémi
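Code:
require 'whole_history_rating'

@whr = WholeHistoryRating::Base.new

# Arguments as used in this thread: black, white, winner, day, handicap (in Elo).

# Day 1: 20 even games (handicap 0), split 50/50 between the two players.
10.times do
  @whr.create_game("anchor", "player", "B", 1, 0)
  @whr.create_game("anchor", "player", "W", 1, 0)
end

# Day 180: 20 more games, still split 50/50, but "anchor" (Black) now gets a 600 Elo handicap.
10.times do
  @whr.create_game("anchor", "player", "B", 180, 600)
  @whr.create_game("anchor", "player", "W", 180, 600)
end

# Run 10 rounds of 10 iterations and print both rating histories after each round.
10.times do
  @whr.iterate(10)
  print @whr.ratings_for_player("anchor"), " "
  print @whr.ratings_for_player("player"), "\n"
end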
/var/lib/gems/1.9.1/gems/whole_history_rating-0.1.2/lib/whole_history_rating/player.rb:149:in `block in update_by_ndim_newton': uninitialized constant WholeHistoryRating::Player::WHR (NameError)
yoyoma wrote: player gives anchor 600 Elo advantage (maybe 3 stones, depends on your model), 50% wins
Yes, giving it crazy input can produce crazy output. I should have documented the handicap parameter better, but I'm not sure what the exact range is.
Rémi wrote: With the KGS rating system (not WHR), if you stop playing, your rating improves like your past opponents. So if you want your rating to make fast progress you can select opponents who are making fast progress, and then stop playing.
So you're not asserting this is true of ordinary players who have (previously) played opponents selected more or less at random?
pete wrote: Yes, giving it crazy input can produce crazy output. I should have documented the handicap parameter better, but I'm not sure what the exact range is.
On GoShrine, handicap values for a single stone range roughly between 30-60 elo, depending on the strength of the players. So 600 elo is quite a bit. I've yet to see it go unstable with real data.
I'm guessing what is happening is that we're running into floating point precision issues with certain params. If stability does end up being a problem for real data, I'd definitely take a deeper look at it, though I might need some help from Remi.
-Pete
60 elo for a handicap stone? That is far too low! KGS uses 148 per rank for 30k-5k, and 226 per rank for 2d+. (The constants are given in a different form here: http://senseis.xmp.net/?KGSRatingMath. To convert to Elo form, log10(e^0.85)*400 ≈ 148.) EGF uses similar numbers. Besides, even using your too low value of 60, this is just a 10 rank improvement over 6 months. It's very easy to go from 25kyu to 15kyu in 6 months.
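The conversion arithmetic in this exchange can be checked with a few lines of Ruby. This is only a sketch: the helper names `elo_per_rank` and `ranks_spanned` are made up for illustration, and the logarithm is assumed to be base 10, which is what the quoted result of 148 implies.

```ruby
# Elo points per rank for a KGS-style constant k (assumes a base-10 logarithm).
def elo_per_rank(k)
  400.0 * Math.log10(Math.exp(k))
end

# How many ranks (or stones) a given Elo gap covers at a given Elo-per-rank value.
def ranks_spanned(elo_gap, per_rank)
  elo_gap / per_rank.to_f
end

puts elo_per_rank(0.85).round      # 148, the 30k-5k figure quoted above
puts ranks_spanned(600, 60)        # 10.0 ranks at 60 Elo per stone
puts ranks_spanned(600, 148).round # about 4 ranks at 148 Elo per rank
```

The same 600 Elo gap is therefore either an extreme jump or a modest one, depending on which Elo-per-rank scale is assumed.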
yoyoma wrote: I found what look like some numerical stability problems. I had similar problems when I implemented this as well with Newton's method failing or oscillating.
Newton's method is very efficient but tricky. In order to guarantee it works, it is necessary to check that the Newton iteration brings an improvement in the log-likelihood. If it does not, a fallback method should be used (such as a line search in the gradient direction).
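A minimal sketch of that safeguard, on a deliberately simplified problem: one player's rating (on the natural-log scale) is fitted against opponents of fixed, known rating. It is not the gem's code or the WHR update itself. The function names and the one-dimensional setup are assumptions made for illustration.

```ruby
# games: array of [opponent_rating, won] pairs, ratings on the natural-log scale.
def log_likelihood(r, games)
  games.sum do |opp, won|
    d = won ? r - opp : opp - r
    -Math.log(1.0 + Math.exp(-d))
  end
end

# First and second derivative of the log-likelihood with respect to r.
def gradient_and_hessian(r, games)
  g = 0.0
  h = 0.0
  games.each do |opp, won|
    prob = 1.0 / (1.0 + Math.exp(opp - r)) # predicted win probability
    g += (won ? 1.0 : 0.0) - prob
    h -= prob * (1.0 - prob)
  end
  [g, h]
end

def safeguarded_newton(games, r = 0.0, max_iter: 50, tol: 1e-9)
  max_iter.times do
    g, h = gradient_and_hessian(r, games)
    break if g.abs < tol
    current = log_likelihood(r, games)
    candidate = r - g / h # plain Newton step
    # Accept the Newton step only if it improves the log-likelihood
    # (this comparison is also false for NaN or -Infinity).
    unless log_likelihood(candidate, games) > current
      # Fallback: backtracking line search along the gradient direction.
      step = 1.0
      candidate = r + step * g
      while log_likelihood(candidate, games) <= current && step > 1e-12
        step /= 2.0
        candidate = r + step * g
      end
    end
    r = candidate
  end
  r
end

games = [[0.0, true], [0.0, false], [0.0, true]]
puts safeguarded_newton(games, 10.0) # close to Math.log(2), about 0.693
```

Starting from 10.0 the curvature is nearly flat, so the raw Newton step overshoots by thousands of rating units; the check rejects it and the gradient fallback walks back toward the optimum until Newton steps become safe again.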
hyperpape wrote: So you're not asserting this is true of ordinary players who have (previously) played opponents selected more or less at random?
If you don't play on KGS, your rating will improve like your opponents.
yoyoma wrote: 60 elo for a handicap stone? That is far too low! KGS uses 148 per rank for 30k-5k, and 226 per rank for 2d+ (The constants are given in a different form here http://senseis.xmp.net/?KGSRatingMath log(e^0.85)*400=148 to convert to Elo form). EGF uses similar numbers. Besides, even using your too low value of 60, this is just a 10 rank improvement over 6 months. It's very easy to go from 25kyu to 15kyu in 6 months.
KGS and WHR have a different elo scale, I believe. The total spread of ranks from my 40k games on GoShrine is ~2000 elo which, if spread evenly, is 50 elo per rank/stone.
pete wrote: KGS and WHR have a different elo scale, I believe. The total spread of ranks from my 40k games on GoShrine is ~2000 elo which, if spread evenly, is 50 elo per rank/stone.
Again, I'd be interested to know if you see this with real data. The example you propose is not completely impossible in general usage, but is one I would certainly not see on GoShrine (600 elo is a 15 stone handicap at 25 kyu).
-Pete
How did you select the volatility meta-parameter of WHR? And the handicap values?
quantumf wrote: So after 5 games (3 wins 2 losses) I still don't have a rank. This is somewhat frustrating and not encouraging me to carry on trying. In general I prefer servers that allow one to self-select a starting rank, and find KGS quite annoying, but even KGS gives me a rank after 2 games. Kind of off-topic, but relevant in the sense that there are usability considerations that override perfection/accuracy in ranking systems.
Thanks for the feedback, quantum. I'm leaning towards implementing what Remi suggested about using the lower confidence bound as the rating, which would give you a rank much sooner (though probably lower than your actual rank).
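To make the lower-confidence-bound idea concrete, here is a small sketch. The function name, the choice of two standard deviations, and the sample numbers are all hypothetical; nothing here reflects how GoShrine or the gem actually computes a displayed rank.

```ruby
# Displayed rating = estimate minus z standard deviations (z is an assumed tuning knob).
def conservative_rating(elo, std_dev, z: 2.0)
  elo - z * std_dev
end

# Hypothetical newcomer: same estimate, shrinking uncertainty as games accumulate.
puts conservative_rating(1500, 300) # 900.0 after a handful of games
puts conservative_rating(1500, 80)  # 1340.0 once the estimate has tightened
```

A player can be shown a rank right away, and the displayed value climbs toward the estimate as the uncertainty shrinks, which matches the "probably lower than your actual rank" caveat above.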
Rémi wrote: How did you select the volatility meta-parameter of WHR? handicap values?
I did some optimization runs, and came up with 300 Elo^2/day, somehow. You can configure the library like this:
Code:
@whr = WholeHistoryRating::Base.new(:w2 => 17)
Rémi wrote: In my experiments, it was very clear that the handicap value changes a lot with player strength, and also volatility. When choosing the volatility in order to optimize prediction quality over the KGS database, it was too low (14 Elo^2/Day) for beginners, so it produced very "compressed" ratings.
For a rating system to properly understand the variations of strength in a pool of players that mixes beginners and experts, it is really necessary to consider that the strengths of beginners change faster than the strengths of experts.
Rémi
Are you working on a new version of WHR that takes this into consideration?
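For a sense of what these volatility values mean, the sketch below assumes the WHR model in which the rating change between two dates is Gaussian with variance equal to the volatility times the number of elapsed days. The helper name `rating_drift_sd` is invented for this example.

```ruby
# Standard deviation (in Elo) of the rating change the prior allows over `days` days,
# given a volatility w2 in Elo^2/day.
def rating_drift_sd(w2, days)
  Math.sqrt(w2 * days)
end

puts rating_drift_sd(300, 180).round(1) # 232.4
puts rating_drift_sd(14, 180).round(1)  # 50.2
```

Over the 180 days used in the test case earlier in the thread, 300 Elo^2/day allows a typical drift of about 232 Elo, while 14 Elo^2/day allows only about 50, which is one way to see why a low volatility compresses the ratings of fast-improving beginners.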
pete wrote: KGS and WHR have a different elo scale, I believe. The total spread of ranks from my 40k games on GoShrine is ~2000 elo which, if spread evenly, is 50 elo per rank/stone.
Again, I'd be interested to know if you see this with real data. The example you propose is not completely impossible in general usage, but is one I would certainly not see on GoShrine (600 elo is a 15 stone handicap at 25 kyu),
-Pete
I like playing around with rating math, sorry for the tldr text.
Code:
| | KGS | EGF | EGF | KGS | EGF | EGF |
| | exp. | exp. | obs. | exp. | exp. | obs. |
| even game | win % | win % | win % | elo | elo | elo |
|-----------|-------|-------|-------|-------|-------|-------|
| 10k vs 9k | 30.0 | 33.9 | 44.8 | 148 | 116 | 36 |
| 5d vs 6d | 21.4 | 20.1 | 27.8 | 226 | 232 | 166 |
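The win % and elo columns in the table above are related by the standard Elo logistic curve. A short sketch of that conversion, with illustrative function names and the weaker player's win rate as input:

```ruby
# Expected score for a player whose rating is elo_diff above the opponent's.
def expected_score(elo_diff)
  1.0 / (1.0 + 10.0**(-elo_diff / 400.0))
end

# Elo gap implied by the weaker player's win rate p (0 < p < 1).
def elo_gap_from_win_rate(p)
  400.0 * Math.log10((1.0 - p) / p)
end

puts elo_gap_from_win_rate(0.448).round # 36, the observed EGF 10k vs 9k row
puts elo_gap_from_win_rate(0.278).round # 166, the observed EGF 5d vs 6d row
puts expected_score(-148)               # roughly 0.30, the KGS 10k vs 9k expectation
```

Running the observed win rates through the formula reproduces the "obs. elo" column, which is how the gap between the expected and observed values can be read directly in Elo.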