Whole History Rating open source implementation.
Posted: Sun May 20, 2012 10:28 am
by pete
The core of the rating engine used by
GoShrine is now open source. It's an implementation of Rémi Coulom's
Whole History Rating method with support for handicaps.
https://github.com/goshrine/whole_history_rating
If anyone ends up using it anywhere, I'd love to hear about it.
-Pete
Re: Whole History Rating open source implementation.
Posted: Sun May 20, 2012 12:02 pm
by coffeeimam
I believe this is the repo itself, for those who are looking:
https://github.com/goshrine/whole_history_rating
Re: Whole History Rating open source implementation.
Posted: Sun May 20, 2012 2:50 pm
by pete
Sheesh. Thanks coffeeimam! I updated the post.
Re: Whole History Rating open source implementation.
Posted: Thu May 24, 2012 10:52 am
by Kaya.gs
pete wrote:Sheesh. Thanks coffeeimam! I updated the post.
Hey pete. I took a very brief look at the gem, but I'm not entirely sure about the whole solution.
I would consider WHR for Kaya. Although I'm quite content with Glicko so far, there is talk of having two separate ratings (blitz / serious games), and for that, different rating systems may be a good solution.
As I had understood, WHR requires a massive recalculation with each result, right? Have you benchmarked your solution?
Re: Whole History Rating open source implementation.
Posted: Thu May 24, 2012 11:32 am
by pete
Kaya.gs wrote:As I had understood, WHR requires a massive recalculation with each result, right? Have you benchmarked your solution?
I have about 40k games I'm putting in it right now. Updating the two involved players' ratings after a new game takes a tiny fraction of a second, and the cost grows only with the number of games each player has played. Periodically you should also run iterations over the entire set of players and games. A full iteration over all 40,000 games and 3,000 players still takes only a few seconds.
So overall, it's very fast, thanks to Rémi's sophisticated math techniques. And this is running in Ruby, not the fastest language on the block.

-Pete
Re: Whole History Rating open source implementation.
Posted: Thu May 24, 2012 12:00 pm
by Rémi
Hi Pete,
I am glad you implemented my algorithm. I wonder how you deal with handicap. In my early experiments, I found that the value of a one-stone handicap is much higher for stronger players.
Rémi
Re: Whole History Rating open source implementation.
Posted: Thu May 24, 2012 12:13 pm
by pete
Rémi wrote:
Hi Pete,
I am glad you implemented my algorithm. I wonder how you deal with handicap. In my early experiments, I found that the value of a one-stone handicap is much higher for stronger players.
Rémi
The Ruby gem primarily supports a handicap that is a fixed elo amount for a given game. I didn't want to tie the gem to Go, so the strength-based handicap system is a bit more complex, and not shown in the example.
To support handicaps that vary with player strength, the library supports passing in a Proc (a Ruby callback of sorts) for the handicap attribute, which will be called with the game object as an argument. The Proc can then return a handicap value that is based on the game attributes (such as komi) and the current players' strengths.
GoShrine uses this callback mechanism to implement a handicap that increases the value of each stone relative to the current (pre-iteration) strength of the white player, and also factors in komi.
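To make the callback idea concrete, here is a rough, self-contained sketch. It is not the gem's internals and not GoShrine's actual function: the `Game` struct, `handicap_elo`, `POINTS_PER_STONE`, and every constant in the curve are made up for illustration.

```ruby
# Hypothetical stand-in for a game record; NOT the gem's own Game class.
Game = Struct.new(:white_elo, :black_elo, :stones, :komi, :handicap) do
  # The handicap attribute may be a plain number (a fixed elo amount)
  # or a Proc that receives the game and returns an elo amount.
  def handicap_elo
    handicap.respond_to?(:call) ? handicap.call(self) : handicap
  end
end

# Assumption: komi is converted to a fraction of a stone at this rate.
POINTS_PER_STONE = 13.0

# Assumed curve: each stone is worth more elo as White gets stronger.
# The constants 50.0 and 0.02 are placeholders, not fitted values.
STRENGTH_BASED = proc do |game|
  elo_per_stone = 50.0 + [game.white_elo, 0].max * 0.02
  effective_stones = game.stones - game.komi / POINTS_PER_STONE
  elo_per_stone * effective_stones
end

fixed    = Game.new(1500, 1200, 3, 0.5, 150)
variable = Game.new(1500, 1200, 3, 0.5, STRENGTH_BASED)

puts fixed.handicap_elo     # the fixed elo amount, returned as-is
puts variable.handicap_elo  # computed from the game when it is asked for
```

Because the Proc is only called when the handicap is needed, it can read whatever the players' strengths are at that moment rather than a value frozen when the game was recorded.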
-Pete
Re: Whole History Rating open source implementation.
Posted: Thu May 24, 2012 12:23 pm
by palapiku
I suppose the value of each handicap combination should itself be determined by Bayesian methods...
Re: Whole History Rating open source implementation.
Posted: Thu May 24, 2012 12:38 pm
by pete
palapiku wrote:I suppose the value of each handicap combination should itself be determined by Bayesian methods...
I did do some analysis based on game results to help determine the handicap weighting function in GoShrine. It's also constrained by the fact that ranks (non-elo, but kyu-dan ranks) themselves are determined by this same function (players one rank apart should play an even game with a one-stone handicap). There is probably another curve to determine the change in elo as you add more handicap stones: is the value of the first handicap stone the same as the 9th? Probably not, but I'm treating them as equal.
In the end I came up with a curve that both maps the handicaps in a way that fits the 1-stone = 1-rank idea, and also optimizes the prediction rates in my training data set.
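The fitting procedure itself isn't spelled out here. One simple way to approximate "optimizing the prediction rate" is a grid search over a single elo-per-stone constant, sketched below. Everything in it is an assumption for illustration: the training rows are invented, all stones are valued equally, and the real curve also varies with player strength.

```ruby
# Invented training rows: [black_elo, white_elo, stones, winner].
TRAINING = [
  [1250, 1500, 3, "B"],
  [1300, 1500, 2, "W"],
  [1000, 1500, 5, "B"],
  [1380, 1500, 1, "W"],
  [1100, 1500, 4, "B"],
  [1420, 1500, 1, "B"]
].freeze

# Probability that Black wins when each handicap stone adds a fixed elo
# bonus to Black (standard logistic elo curve, 400-point scale).
def black_win_probability(black_elo, white_elo, stones, elo_per_stone)
  diff = black_elo + stones * elo_per_stone - white_elo
  1.0 / (1.0 + 10.0**(-diff / 400.0))
end

# Fraction of games whose winner matches the more likely predicted side.
def prediction_rate(games, elo_per_stone)
  correct = games.count do |black, white, stones, winner|
    predicted = black_win_probability(black, white, stones, elo_per_stone) >= 0.5 ? "B" : "W"
    predicted == winner
  end
  correct.to_f / games.size
end

# Try a coarse grid of candidate values and keep the best-scoring one.
candidates = (20..200).step(20).to_a
best = candidates.max_by { |c| prediction_rate(TRAINING, c) }

puts "best elo per stone: #{best}"
puts "prediction rate:    #{prediction_rate(TRAINING, best)}"
```

With real data you would hold out part of the games for validation, and the 1-stone = 1-rank constraint would narrow the set of acceptable curves further.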
-Pete
Re: Whole History Rating open source implementation.
Posted: Mon May 28, 2012 10:36 am
by Rémi
I noticed there is a bit of rating drift. With WHR, you can set bots to have a constant rating. That should prevent drift.
Also, I played a few games (account Remi) and still don't have a rating. Do games against bots count?
I like your go server, it makes me feel strong

Rémi
Re: Whole History Rating open source implementation.
Posted: Mon May 28, 2012 9:36 pm
by pete
Rémi wrote:I noticed there is a bit of rating drift. With WHR, you can set bots to have a constant rating. That should prevent drift.
Also, I played a few games (account Remi) and still don't have a rating. Do games against bots count?
RE: rating drift, it seems that on the whole, the mean & spread of all ranks is fairly stable, but yes, bots do go up and down, when in theory they should be at a constant strength. I'm not sure how much of a problem this is.
19x19 games against bots do count, even with a handicap. You still have a ways to go in terms of bringing your rating's confidence above the threshold, which is currently a std deviation of about 120 elo. (Yours is at 168.) Do those numbers seem reasonable based on the games you played? Computing the variance was a tricky part of the algorithm.
Thanks for adding links to GoShrine and the ruby library on the WHR page.
Re: Whole History Rating open source implementation.
Posted: Tue May 29, 2012 2:07 am
by Rémi
pete wrote:19x19 games against bots do count, even with a handicap. You still have a ways to go in terms of bringing your rating's confidence above the threshold, which is currently a std deviation of about 120 elo. (Yours is at 168.) Do those numbers seem reasonable based on the games you played?
Yes, but I would not use such a threshold, because the games I played already should give a rough indication of my level. If you compute confidence intervals, it would be nice to show/plot them. If you are concerned that some players may reach the top rank if they are lucky in their first games, you could simply sort them according to the lower bound of their confidence interval instead.
BTW, this is what I would do in a server: give the lower bound of the confidence interval as player rating. It has the advantage of giving players an incentive to play more, as the bound would slowly go down when they stop playing. This would be unlike KGS, where the best way to increase one's rating is to stop playing.
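A minimal sketch of that ranking rule, assuming a Gaussian posterior and a 95% interval (z = 1.96). The `Player` struct, the names, and the elo values are made up for illustration; only the 168 standard deviation echoes the figure mentioned above.

```ruby
# Hypothetical player record: posterior mean (elo) and standard deviation.
Player = Struct.new(:name, :elo, :std_dev) do
  # Lower end of an approximate confidence interval around the mean.
  # z = 1.96 (about 95% for a Gaussian) is an assumed choice.
  def lower_bound(z = 1.96)
    elo - z * std_dev
  end
end

players = [
  Player.new("newcomer", 2100, 168), # few games, wide interval
  Player.new("regular",  1900, 40)   # many games, narrow interval
]

# Rank by the lower bound rather than by the mean, highest first.
ranked = players.sort_by { |player| -player.lower_bound }

ranked.each do |player|
  puts format("%-10s mean %6.1f  lower bound %7.1f", player.name, player.elo, player.lower_bound)
end
```

Here the newcomer has the higher mean, but the wide interval puts the lower bound below the regular's, so a lucky start does not jump straight to the top of the list.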
Rémi
Re: Whole History Rating open source implementation.
Posted: Tue May 29, 2012 7:11 am
by pete
Rémi wrote:BTW, this is what I would do in a server: give the lower bound of the confidence interval as player rating. It has the advantage of giving players an incentive to play more, as the bound would slowly go down when they stop playing. This would be unlike KGS, where the best way to increase one's rating is to stop playing.
Rémi
Interesting idea. Are there any other servers that do this? It may have some weird side effects, though. Your rank would almost always go up after playing a game if it's been a while since your last game, even if you lost. And if you won an improbable game, would it be possible that your rank could go down, by spreading the variance more than increasing the mean?
Re: Whole History Rating open source implementation.
Posted: Tue May 29, 2012 7:39 am
by Rémi
pete wrote:Rémi wrote:BTW, this is what I would do in a server: give the lower bound of the confidence interval as player rating. It has the advantage of giving players an incentive to play more, as the bound would slowly go down when they stop playing. This would be unlike KGS, where the best way to increase one's rating is to stop playing.
Rémi
Interesting idea. Are there any other servers that do this? It may have some weird side effects, though. Your rank would almost always go up after playing a game if it's been a while since your last game, even if you lost. And if you won an improbable game, would it be possible that your rank could go down, by spreading the variance more than increasing the mean?
I don't think it is possible. Playing a game will always reduce the variance, so a win will always increase the rank.
In some really extreme cases, when the Gaussian approximation of the posterior is wrong, it may be that a loss will increase the lower confidence bound. But I don't expect that would happen much. And if that happens, the lower bound in question would be much lower than the real level of the player (because the rating would be extremely uncertain), so it should be no problem.
Nobody will complain if their rating increases more than they expect, anyway. What's important, is that it gives an incentive for playing.
Rémi
Re: Whole History Rating open source implementation.
Posted: Tue May 29, 2012 11:06 am
by RobertJasiek
Rémi wrote:Nobody will complain if their rating increases more than they expect
Wrong. (If you don't remember, I have complained about such earlier.)