It is currently Wed Mar 20, 2019 2:06 pm

All times are UTC - 8 hours [ DST ]

Post new topic Reply to topic  [ 101 posts ]  Go to page Previous  1, 2, 3, 4, 5, 6
Author Message
 Post subject: Re: Iyama's world ranking
Post #101 Posted: Sat Feb 16, 2019 2:06 am 
Dies with sente

Posts: 81
Liked others: 8
Was liked: 7
Rank: 4K
jlt wrote:
Park Yeonghun has recently beaten strong opponents like Park Junghwan (3630) and Gu Zihao (3593), but has also lost to lower-rated players like Kim Hyenchan (3246) and Song Gyusang (3329).

I wanted to check for myself whether Iyama's rating reflects his performance against Chinese and Korean players. I considered a player with initial rating 3470, made him play the same 17 matches as Iyama since January 2018 with the same results, and repeated that 10000 times. After each of the 17000 matches, the rating is incremented by c·(match_result − 1/(10^(D/100) + 1)), where c = 20 is an arbitrary constant and D is the difference between the opponent's rating and the player's current rating.

The player ended with a rating of... 3339 points, which puts him in 98th place on goratings.

The constant c=20 is arbitrary, but changing it to other reasonable values (between 1 and 40) doesn't change the conclusion much.

I repeated the same experiment with the 14 matches in 2017 against Chinese and Korean opponents. The final rating is 3524, which corresponds to the 15th place.

If we take into account all 31 matches since January 2017, we get a rating of 3410 (54th place).

My conclusion is that maybe Iyama is a bit overrated, but the variations are so wild that it's impossible to get a reliable rating by taking into account only his games against non-Japanese opponents.

Here is a Scilab code, for those who would like to check. The vectors opp and res must be filled in from Iyama's game record:

opp = [...];                  // opponent ratings for the 17 matches
res = [...];                  // match results, 1 = win, 0 = loss
n = length(opp);
c = 20;                       // arbitrary update constant
rating = 3470;                // initial rating
for number_of_loops = 1:10000
    for i = 1:n
        D = opp(i) - rating;
        rating = rating + c*(res(i) - 1/(10^(D/100) + 1));
    end
end
disp(rating)

By the way, is it possible to do the same analysis for Japanese players collectively? I wonder, no, I suspect that goratings overestimates Japanese players collectively. I know nothing about statistics, but here's what a Reddit user who does had to say about this matter:
Hi Remi, I don't know too much about Taeil Bai's method. The rating system I usually follow is mamumamu's. I agree with you that incremental rating systems don't work well, and that your system is far superior to those.

My criticism of your system comes from comparing it to mamumamu's ratings, which use the Glicko-2 system for matches between strongly connected players and MLE in combination with Glicko-2 for international matches.

First, please allow me to confess that my understanding of statistics is probably elementary compared to yours. So I would appreciate patience in enlightening me if my criticism stems from my own ignorance.

Looking at Iyama Yuta's rating: he is currently 3546, compared to Ke Jie at 3668. This suggests Iyama Yuta has a 33% chance of winning against Ke Jie. On mamumamu it is 9.79 for Iyama Yuta and 10.867 for Ke Jie, which suggests a 22% win rate. So mamumamu and goratings disagree by about 100 Elo for Iyama Yuta.
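The 33% figure from goratings follows from the standard Elo logistic curve with a 400-point scale (an assumption on my part; goratings' WHR produces Elo-like numbers but the exact curve may differ, and the helper name below is my own). A quick Python check:

```python
def elo_win_prob(r_player, r_opponent, scale=400):
    """Expected score for r_player under the standard Elo logistic curve."""
    return 1 / (1 + 10 ** ((r_opponent - r_player) / scale))

# Iyama Yuta (3546) vs Ke Jie (3668) on goratings
p = elo_win_prob(3546, 3668)
print(round(p, 3))  # roughly 0.33
```

A 122-point gap on the 400 scale indeed lands close to a one-in-three win chance.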

Compiling the last 19 games (all games since 1/1/2015) for Iyama Yuta vs Chinese/Korean professionals, he has 8 wins where I calculate 9.8 wins expected based on goratings. This suggests his Elo is about 70 points higher than his results support. It is not statistically significant, as the deviation is less than 1 sigma, but it is hard to get a statistically significant result with only 19 samples.

Looking at the next highest-ranked Japanese player, Ichiriki Ryo: he has played 40 games against Chinese/Korean pros since 1/1/2015. Based on ratings from goratings I expect 16.8 wins, but there were only 10. With Ichiriki Ryo, the deviation is just over 2 sigma.

Combining the data for Iyama and Ichiriki, the deviation is well over 2 sigma, but still under 3 sigma. Adding in more players from Japan, or taking the whole dataset of Japan vs China/Korea matches, should further support this point.
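These sigma figures can be reproduced with a crude binomial model: treat each record as independent coin flips with a flat per-game win probability equal to expected wins divided by games played. That is an approximation (the true probability differs by opponent), and the helper name is my own. A Python sketch:

```python
import math

def deviation_sigma(games, expected_wins, actual_wins):
    """Shortfall in wins, in units of the binomial standard deviation,
    assuming a flat per-game win probability of expected_wins/games."""
    p = expected_wins / games
    sigma = math.sqrt(games * p * (1 - p))
    return (expected_wins - actual_wins) / sigma

print(round(deviation_sigma(19, 9.8, 8), 2))    # Iyama: under 1 sigma
print(round(deviation_sigma(40, 16.8, 10), 2))  # Ichiriki: just over 2 sigma

# Combined record: pool wins and variances of the two binomials
expected, actual = 9.8 + 16.8, 8 + 10
var = 19 * (9.8/19) * (1 - 9.8/19) + 40 * (16.8/40) * (1 - 16.8/40)
print(round((expected - actual) / math.sqrt(var), 2))  # between 2 and 3 sigma
```

With these inputs the pooled shortfall comes out at roughly 2.3 sigma, matching the "well over 2, still under 3" description above.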

I understand that I am only looking at part of the data, and ignoring the passage of time. However, I'm systematically picking out games that fit certain neutral criteria, calculating the expected number of wins, comparing it to the actual number of wins, and finding a statistically significant deviation. Since ratings are used to predict future win probability, this suggests to me a flaw in the methodology used to produce the ratings.

My theory is that, although WHR does work better than incremental methods for strongly connected groups of players, it does not work well enough. My theory is further supported if you do a similar analysis for Taiwan vs China/Korea and NA/EU vs China/Korea. China and Korea play many games against each other, but there are far fewer opportunities for Japan/Taiwan/NA/EU to play against China/Korea.

FYI, this is the description of mamumamu's methodology: ... le004.html. He uses Glicko-2 first to calculate ratings for each country, then uses MLE to adjust the ratings for regional differences, accounting for the fact that Glicko-2 alone is insufficient for this adjustment.
