hyperpape wrote: P.S. Surely there's a timing problem with citing your data. Both goratings and Dr. Bae Taeil's data show Iyama getting even stronger in the past year or two. But his international record reflects more than 5 years of play.

That is true. Also note that 12-21 is not very statistically significant in terms of proving inferiority. Assuming a uniform prior over the win rate, the posterior probability that a player with 21 wins is stronger than a player with 12 wins is only 94%.
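The 94% figure can be reproduced with a short calculation. This is a sketch, assuming the 12-21 record is modeled as independent games with a fixed win rate p and a uniform Beta(1, 1) prior, so that the posterior is Beta(13, 22). The function name is made up for illustration. It relies on the identity between the Beta CDF and a binomial tail, so only the standard library is needed.

```python
from math import comb

def prob_weaker(wins, losses):
    """Posterior P(p < 0.5) for win rate p, given a uniform prior.

    The posterior is Beta(wins + 1, losses + 1). For integer parameters a, b,
    the Beta(a, b) CDF at x equals P(X >= a) with X ~ Binomial(a + b - 1, x).
    """
    a, b = wins + 1, losses + 1
    n = a + b - 1
    # At x = 0.5 every outcome of n fair coin flips has probability 2**-n.
    return sum(comb(n, k) for k in range(a, n + 1)) / 2 ** n

# About 0.94 for the 12-21 record discussed above.
print(f"{prob_weaker(12, 21):.3f}")
```

With no games at all, `prob_weaker(0, 0)` returns 0.5, which is what a uniform prior should give.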
http://www.goratings.org/ now has historical ratings lists
-
Rémi
- Lives with ko
- Posts: 170
- Joined: Sat Jan 14, 2012 4:11 pm
- Rank: KGS 4 kyu
- GD Posts: 0
- Has thanked: 32 times
- Been thanked: 119 times
- Contact:
Re: http://www.goratings.org/ now has historical ratings lis
Today, Iyama reached #3. Almost 4 months without a loss, 20 victories in a row. Japan is too easy for him:
http://www.goratings.org/players/601.html
Attachment: iiyama.png (Iyama reaches #3)
-
Sennahoj
- Dies with sente
- Posts: 103
- Joined: Fri Jun 20, 2014 5:45 am
- Rank: Tygem 5d
- GD Posts: 0
- Has thanked: 3 times
- Been thanked: 37 times
Re: http://www.goratings.org/ now has historical ratings lis
Remi, have you checked how well calibrated the ex-ante probabilities of this model are?
That is, in games where the model predicts around an X% winning chance for black, what fraction of those games are actually won by black?
-
Uberdude
- Judan
- Posts: 6727
- Joined: Thu Nov 24, 2011 11:35 am
- Rank: UK 4 dan
- GD Posts: 0
- KGS: Uberdude 4d
- OGS: Uberdude 7d
- Location: Cambridge, UK
- Has thanked: 436 times
- Been thanked: 3718 times
Re: http://www.goratings.org/ now has historical ratings lis
Sennahoj, I don't know about these pro games, but I recall WHR did better than KGS's rating system in predicting the win rates of KGS games when that was analyzed some time ago.
-
Rémi
- Lives with ko
- Posts: 170
- Joined: Sat Jan 14, 2012 4:11 pm
- Rank: KGS 4 kyu
- GD Posts: 0
- Has thanked: 32 times
- Been thanked: 119 times
- Contact:
Re: http://www.goratings.org/ now has historical ratings lis
Sennahoj wrote: Remi, have you checked how well calibrated the ex-ante probabilities of this model are?
That is, in games where the model predicts around an X% winning chance for black, what fraction of those games are actually won by black?

The details of my method are described in the WHR paper:
http://www.remi-coulom.fr/WHR/WHR.pdf
What I try to optimize is how frequently the winner of a game had the higher rating before the game.
-
Sennahoj
- Dies with sente
- Posts: 103
- Joined: Fri Jun 20, 2014 5:45 am
- Rank: Tygem 5d
- GD Posts: 0
- Has thanked: 3 times
- Been thanked: 37 times
Re: http://www.goratings.org/ now has historical ratings lis
Thanks, I understand. It would still be interesting to see if the ex-ante winning probabilities are well calibrated (in that sense --- that black wins about X% when the ex-ante winning probability is X%). It is not obvious that this will be the case, even if the model is super good at "getting the sign right", i.e. predicting which player will win.
Just to be clear, this is not a criticism of the methodology or anything like that.
I would be happy to do some analysis if you'd give me some data, PM me if you're interested.
The reason this is interesting is that if the ex-ante probabilities are sufficiently well calibrated, it is meaningful to look at those numbers --- otherwise it's not, and they should just be thought of as some "intermediate" quantities inside the model, which are used for calculating the final output.
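The check described above could look roughly like this. It is a minimal sketch with made-up inputs: `calibration_table`, the bin count, and the example data are all hypothetical, and nothing here comes from the goratings database.

```python
def calibration_table(predicted, black_won, n_bins=10):
    """Group games by predicted probability and compare with observed frequency.

    predicted: iterable of model probabilities that black wins (0..1).
    black_won: iterable of booleans, True where black actually won.
    Returns (mean predicted, observed fraction, game count) per non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, won in zip(predicted, black_won):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, won))
    table = []
    for games in bins:
        if games:
            mean_p = sum(p for p, _ in games) / len(games)
            observed = sum(1 for _, won in games if won) / len(games)
            table.append((mean_p, observed, len(games)))
    return table

# Made-up example data, not real game records.
predicted = [0.55, 0.58, 0.52, 0.57, 0.91, 0.93, 0.95, 0.92]
black_won = [True, False, True, False, True, True, True, False]
for mean_p, observed, count in calibration_table(predicted, black_won):
    print(f"predicted {mean_p:.2f}  observed {observed:.2f}  games {count}")
```

A well-calibrated model would show the first two columns close to each other in every bin that holds enough games; with only a handful of games per bin the observed fraction is mostly noise.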
-
hyperpape
- Tengen
- Posts: 4382
- Joined: Thu May 06, 2010 3:24 pm
- Rank: AGA 3k
- GD Posts: 65
- OGS: Hyperpape 4k
- Location: Caldas da Rainha, Portugal
- Has thanked: 499 times
- Been thanked: 727 times
Re: http://www.goratings.org/ now has historical ratings lis
Amusingly, the US's Andy Liu is currently rated 175th--the power of winning four straight games with your only loss being in 2011. He'll surely drop a little once the Kansai tournament is over (and you can't read much into Western players' ratings anyway, because of the lack of games and possibility of selective inclusion).
-
handa711
- Dies with sente
- Posts: 109
- Joined: Wed Oct 10, 2012 9:50 pm
- Rank: KGS 2 kyu
- GD Posts: 0
- KGS: HandA
- Tygem: NhaTrang11
- IGS: Nagi
- Wbaduk: handa711
- OGS: hoanganh2357
- Has thanked: 13 times
- Been thanked: 9 times
- Contact:
Re: http://www.goratings.org/ now has historical ratings lis
I am baffled by Lee Changho's domination. Noob question: where is Go Seigen?
-
hyperpape
- Tengen
- Posts: 4382
- Joined: Thu May 06, 2010 3:24 pm
- Rank: AGA 3k
- GD Posts: 65
- OGS: Hyperpape 4k
- Location: Caldas da Rainha, Portugal
- Has thanked: 499 times
- Been thanked: 727 times
Re: http://www.goratings.org/ now has historical ratings lis
It appears he's not in the database. There's spotty coverage for the earlier years--I can find Kitani Minoru, but with only 17 games.
As for Lee Changho, why are you baffled? His results were the best in the world by a substantial amount for a decade.
-
handa711
- Dies with sente
- Posts: 109
- Joined: Wed Oct 10, 2012 9:50 pm
- Rank: KGS 2 kyu
- GD Posts: 0
- KGS: HandA
- Tygem: NhaTrang11
- IGS: Nagi
- Wbaduk: handa711
- OGS: hoanganh2357
- Has thanked: 13 times
- Been thanked: 9 times
- Contact:
Re: http://www.goratings.org/ now has historical ratings lis
hyperpape wrote: It appears he's not in the database. There's spotty coverage for the earlier years--I can find Kitani Minoru, but with only 17 games.
As for Lee Changho, why are you baffled? His results were the best in the world by a substantial amount for a decade.

I'm baffled by the length of his dominance. Possibly because I wasn't even born back then.
-
hyperpape
- Tengen
- Posts: 4382
- Joined: Thu May 06, 2010 3:24 pm
- Rank: AGA 3k
- GD Posts: 65
- OGS: Hyperpape 4k
- Location: Caldas da Rainha, Portugal
- Has thanked: 499 times
- Been thanked: 727 times
Re: http://www.goratings.org/ now has historical ratings lis
It is amazing. The only 20th century player I know to compare it to is Go Seigen[1]. One difference is that Lee Changho was eventually surpassed as he aged and Lee Sedol came into his prime. Would Sakata have been able to surpass Go in the 1960s if the latter's career hadn't been cut short by an accident? We'll never be able to do more than guess (but check out this interview: http://www.andromeda.com/people/ddyer/a ... unter.html).
P.S. I found Go Seigen in the goratings page (http://www.goratings.org/players/856.html), but he only has five games from after his peak.
[1] Another player you might ask about is Shusai, but there I have no idea.
-
Jhyn
- Lives with ko
- Posts: 202
- Joined: Thu Sep 26, 2013 3:03 am
- Rank: EGF 1d
- GD Posts: 0
- Universal go server handle: Jhyn
- Location: Santiago, Chile
- Has thanked: 39 times
- Been thanked: 44 times
Re: http://www.goratings.org/ now has historical ratings lis
I see in your paper that you applied your algorithm to the KGS database. If you don't mind, I had a question about the evaluation of the results.
In your article, you say that "WHR significantly outperforms the other algorithms". The previous table shows a prediction gain of 0.6% compared to a simple Elo system. It looks like a small gain to my untrained eye; nevertheless I understand that the prediction rate cannot increase too much over 50% (if a game is a true 50/50 coin flip, no algorithm can do better than 50%). As a consequence the prediction rate seems to depend on the database (many lopsided matchups, e.g. 90%/10%, would increase the prediction rate of all systems). Therefore it seems hard to me to get a good idea of how significant the prediction gain actually is.
I thought about the following: if we took a large "fake" database with win ratios actually distributed as a Gaussian centered on 0.5 and some well-chosen variance (corresponding to the apparent variance in the KGS database), what would be the prediction rate of the "perfect algorithm" (that predicts the correct win probability every time)? It seems to me this would be the best theoretically possible prediction rate. I hope my question makes sense.
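One way to make the question above concrete: if a game's true win probability for one side is p, a predictor that always picks the favorite is right with probability max(p, 1 - p), so the ceiling on the prediction rate is the average of that quantity over all games. The sketch below estimates it by simulation, assuming win probabilities drawn from a Gaussian centered on 0.5 and clipped to [0, 1]. The function name and the sigma values are made up for illustration and are not fitted to the KGS data.

```python
import random

def perfect_prediction_rate(sigma, n_games=100_000, seed=0):
    """Monte Carlo estimate of E[max(p, 1 - p)] for p ~ N(0.5, sigma), clipped to [0, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_games):
        p = min(1.0, max(0.0, rng.gauss(0.5, sigma)))
        # A predictor that knows p picks the favorite and is right with this probability.
        total += max(p, 1.0 - p)
    return total / n_games

for sigma in (0.05, 0.1, 0.2):  # hypothetical spreads of the true win probability
    print(f"sigma={sigma}: ceiling ~ {perfect_prediction_rate(sigma):.3f}")
```

The wider the spread of true win probabilities, the higher the ceiling, which is the dependence on the database described above.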
Victory is an accident, defeat a necessity.
-
hyperpape
- Tengen
- Posts: 4382
- Joined: Thu May 06, 2010 3:24 pm
- Rank: AGA 3k
- GD Posts: 65
- OGS: Hyperpape 4k
- Location: Caldas da Rainha, Portugal
- Has thanked: 499 times
- Been thanked: 727 times
Re: http://www.goratings.org/ now has historical ratings lis
I was intrigued by the discussion with macelee, so I grabbed Iyama's games against the current top twenty players. If I have not made any transcription errors, the correct list is below.
His record is 12-21.
By year:
2002: 0-1
2005: 0-1
2006: 0-1
2007: 1-1
2008: 1-2
2009: 0-3
2010: 1-3
2011: 5-3
2012: 0-0!
2013: 1-1
2014: 1-2
2015: 2-3
You can see that he's doing a bit better as time goes on. Still, his international results from 2013-2015 are mediocre relative to his current rating. If you split the games into 2002-2010 and 2011-2015, you get records of 3-12 and 9-9 respectively, though there's no principled reason why you should include 2011 and not 2010 (which gives you 10-12). 2010 is notable because it's the first year he plays someone rated lower than himself.
I'd like to calculate how he's performing relative to the model's predictions, but I actually have to look up something to do that.
Date,W/L,Opponent,Iyama's Rating,Opponent's Rating
2002-12-05,L,Chen Yaoye,3236,3313
2005-11-08,L,Chen Yaoye,3289,3378
2006-01-10,L,Gu Li,3296,3472
2007-12-12,L,Chen Yaoye,3385,3403
2007-12-12,W,Zhou Ruiyang,3385,3405
2008-04-14,L,Lee Sedol,3392,3541
2008-04-25,L,Kang Dongyun,3393,3457
2008-11-18,W,Chen Yaoye,3398,3415
2009-04-13,L,Kang Dongyun,3407,3451
2009-11-30,L,Chen Yaoye,3414,3449
2009-12-01,L,Kim Jiseok,3412,3417
2010-06-07,L,Lian Xiao,3420,3378
2010-07-25,L,Gu Li,3425,3451
2010-10-19,L,Lee Sedol,3434,3538
2010-11-25,W,Lee Sedol,3455,3516
2011-05-16,W,Lee Sedol,3455,3516
2011-05-18,W,Gu Li,3455,3459
2011-08-10,W,Gu Li,3461,3463
2011-08-11,W,Choi Cheolhan,3461,3478
2011-08-13,L,Park Jungwhan,3461,3506
2011-08-14,W,Jiang Weijie,3461,3426
2011-08-17,L,Park Yeonghun,3461,3418
2011-08-19,L,Jiang Weijie,3461,3426
2013-06-30,W,Park Jungwhan,3479,3554
2013-11-11,L,Chen Yaoye,3478,3492
2014-03-28,L,Zhou Ruiyang,3478,3481
2014-08-17,L,Lee Sedol,3482,3506
2014-12-03,W,Park Jungwhan,3489,3571
2015-01-05,W,Park Yeonghun,3493,3448
2015-01-08,L,Chen Yaoye,3494,3480
2015-03-03,W,Mi Yuting,3501,3476
2015-03-04,L,Kim Jiseok,3501,3476
2015-03-14,L,Ke Jie,3502,3581
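The comparison with the model's predictions mentioned above can be sketched from this table. The code below assumes the ratings are on the usual Elo scale, where a 400-point gap corresponds to 10-to-1 odds; that scale and the helper names are assumptions for illustration, and it uses the pre-game ratings exactly as transcribed rather than anything WHR computes internally.

```python
import csv
import io

GAMES_CSV = """Date,W/L,Opponent,Iyama's Rating,Opponent's Rating
2002-12-05,L,Chen Yaoye,3236,3313
2005-11-08,L,Chen Yaoye,3289,3378
2006-01-10,L,Gu Li,3296,3472
2007-12-12,L,Chen Yaoye,3385,3403
2007-12-12,W,Zhou Ruiyang,3385,3405
2008-04-14,L,Lee Sedol,3392,3541
2008-04-25,L,Kang Dongyun,3393,3457
2008-11-18,W,Chen Yaoye,3398,3415
2009-04-13,L,Kang Dongyun,3407,3451
2009-11-30,L,Chen Yaoye,3414,3449
2009-12-01,L,Kim Jiseok,3412,3417
2010-06-07,L,Lian Xiao,3420,3378
2010-07-25,L,Gu Li,3425,3451
2010-10-19,L,Lee Sedol,3434,3538
2010-11-25,W,Lee Sedol,3455,3516
2011-05-16,W,Lee Sedol,3455,3516
2011-05-18,W,Gu Li,3455,3459
2011-08-10,W,Gu Li,3461,3463
2011-08-11,W,Choi Cheolhan,3461,3478
2011-08-13,L,Park Jungwhan,3461,3506
2011-08-14,W,Jiang Weijie,3461,3426
2011-08-17,L,Park Yeonghun,3461,3418
2011-08-19,L,Jiang Weijie,3461,3426
2013-06-30,W,Park Jungwhan,3479,3554
2013-11-11,L,Chen Yaoye,3478,3492
2014-03-28,L,Zhou Ruiyang,3478,3481
2014-08-17,L,Lee Sedol,3482,3506
2014-12-03,W,Park Jungwhan,3489,3571
2015-01-05,W,Park Yeonghun,3493,3448
2015-01-08,L,Chen Yaoye,3494,3480
2015-03-03,W,Mi Yuting,3501,3476
2015-03-04,L,Kim Jiseok,3501,3476
2015-03-14,L,Ke Jie,3502,3581
"""

def win_probability(rating, opponent_rating):
    """Elo-style expected score (assumed scale: 400 points per factor of 10 in odds)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))

def summarize(csv_text):
    """Return (games played, actual wins, expected wins under the assumed model)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    actual = sum(1 for row in rows if row["W/L"] == "W")
    expected = sum(
        win_probability(int(row["Iyama's Rating"]), int(row["Opponent's Rating"]))
        for row in rows
    )
    return len(rows), actual, expected

games, actual, expected = summarize(GAMES_CSV)
print(f"{games} games: {actual} actual wins, {expected:.1f} expected wins")
```

The gap between the two win counts is only a rough indicator, since 33 games is a small sample and the ratings listed here were themselves estimated from data that includes these games.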
-
Rémi
- Lives with ko
- Posts: 170
- Joined: Sat Jan 14, 2012 4:11 pm
- Rank: KGS 4 kyu
- GD Posts: 0
- Has thanked: 32 times
- Been thanked: 119 times
- Contact:
Re: http://www.goratings.org/ now has historical ratings lis
Jhyn wrote: I thought about the following: if we took a large "fake" database with win ratios actually distributed as a Gaussian centered on 0.5 and some well-chosen variance (corresponding to the apparent variance in the KGS database), what would be the prediction rate of the "perfect algorithm" (that predicts the correct win probability every time)? It seems to me this would be the best theoretically possible prediction rate. I hope my question makes sense.

Rating algorithms must be tested on real data. You can generate artificial data based on some model, and then the best rating system would be the rating system that assumes this model. But the fact that an algorithm is the best to predict the artificial data does not imply that it will be the best to predict the real data. The only way to measure the ability of an algorithm to predict real game outcomes is to measure how well it predicts real game outcomes.
Rémi