Life In 19x19

Posted: **Wed Nov 20, 2013 7:11 am**

Inspired by ez4u's previous thread (http://www.lifein19x19.com/forum/viewto ... =10&t=9217), where he calculated the probability of a player repeating as champion under various tournament formats, I sat down to look at the round-robin tournament format. I've been thinking about such things for a while, but Dave's work prompted me to sit down and actually do it.

This work considers round-robin tournaments with 6, 8, 10 or 12 players. Each player's playing strength is modelled with an Elo rating, with each player set a fixed number of Elo points ("the player rating difference") above the next weaker competitor. This gives a ladder of ratings, rather than a single player dominating a group of roughly comparable competitors. For reference, a 100 Elo-point difference corresponds to a 64% chance of the stronger player winning, and a difference of 200 Elo points gives a 76% winning probability. For each combination of tournament size and player rating difference I simulated 1,000,000 fictitious tournaments and calculated the final standings of the players. Standings were calculated using only the number of wins; no tiebreaks were used.
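As a rough illustration of the rating model described above, here is a minimal Python sketch. It assumes the standard logistic Elo formula with a 400-point scale, which reproduces the 64% and 76% figures quoted. The function names `elo_win_probability` and `ladder_ratings` are made up for this example and are not taken from the original simulation.

```python
def elo_win_probability(rating_diff):
    """Chance that the higher-rated player wins, given a lead of
    `rating_diff` Elo points (assumed: logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def ladder_ratings(num_players, step, base=0.0):
    """Ratings listed from strongest to weakest, `step` Elo points apart.
    The absolute level `base` is arbitrary; only differences matter."""
    return [base + step * (num_players - 1 - i) for i in range(num_players)]


if __name__ == "__main__":
    for diff in (50, 100, 200):
        print(f"{diff:>3} Elo points: {elo_win_probability(diff):.2f}")
    print(ladder_ratings(6, 100))
```

Only rating differences enter the formula, so the ladder can start from any base rating without changing the results.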

Initially we are interested in two situations: the top player wins the tournament outright or is tied for first place. We start by considering the probability of winning the tournament outright, shown in the first graph below. There appear to be three regimes of interest. In the first, for a player rating difference larger than ~125 Elo points, the size of the tournament has very little impact on the winning chance of the strongest player. Presumably this is because the players at the bottom of the field are so weak that their chance of staging an upset against a top-ranked player is minimal; adding a bunch of kyu players to the Meijin tournament is not going to affect how Cho U does.

A second regime, below ~25 Elo points, has the strongest player doing worse the larger the tournament. This is because the players are so close in strength that upsets occur across the entire field. The winner of the tournament is chosen approximately at random, and the chance of being selected improves with a smaller field. Finally, in the transition regime between ~25 and ~125 Elo points we see the effects of both player strength and tournament size asserting themselves, although there are diminishing returns to larger tournaments.

The probability that the top player is involved in a tie for first place is shown below. As before, in the regime above ~125 Elo points the effect of tournament size is negligible. The regime below ~25 Elo points, however, completely vanishes, and the probability of being involved in a tie actually reaches a maximum between 75 and 100 Elo points. This came as a surprise given that the top player is noticeably stronger than the rest of the field.

Taken together, the probability that the top player is in contention (outright win or tied) is shown below. For a rating difference on the order of 50 Elo points, which corresponds to a winning percentage of ~57% for the stronger player, the top player is in contention roughly half the time. In go terms, most tournaments are going to involve players within a stone of each other, roughly less than 100 Elo points. The top player is not necessarily going to find it easy going, which helps keep the go scene interesting for the fans.
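The original simulation code is not shown in this thread, so the following is only a sketch of how such an experiment could be set up in Python. It assumes the standard logistic Elo formula, independent games, and standings by win count alone; the names `simulate_round_robin` and `top_player_chances` are hypothetical, and the trial count is far smaller than the 1,000,000 used for the graphs.

```python
import random


def win_probability(rating_a, rating_b):
    """Chance that player A beats player B (assumed logistic Elo formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def simulate_round_robin(ratings, rng):
    """Play one single round robin and return each player's win count."""
    wins = [0] * len(ratings)
    for i in range(len(ratings)):
        for j in range(i + 1, len(ratings)):
            if rng.random() < win_probability(ratings[i], ratings[j]):
                wins[i] += 1
            else:
                wins[j] += 1
    return wins


def top_player_chances(num_players, step, trials=20_000, seed=0):
    """Estimate (outright win, tie for first) frequencies for the top player."""
    rng = random.Random(seed)
    # Ladder of ratings: player 0 is strongest, each next one `step` weaker.
    ratings = [step * (num_players - 1 - i) for i in range(num_players)]
    outright = tied = 0
    for _ in range(trials):
        wins = simulate_round_robin(ratings, rng)
        best = max(wins)
        if wins[0] == best:
            if wins.count(best) == 1:
                outright += 1
            else:
                tied += 1
    return outright / trials, tied / trials


if __name__ == "__main__":
    for size in (6, 8, 10, 12):
        outright, tied = top_player_chances(size, step=50)
        print(f"{size:>2} players: outright {outright:.3f}, "
              f"tied {tied:.3f}, in contention {outright + tied:.3f}")
```

Being "in contention" is simply the sum of the two frequencies, since an outright win and a tie for first are mutually exclusive outcomes.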

Posted: **Thu Nov 21, 2013 7:31 am**

This is really just a consequence of "small numbers", not the format per se.

Try rerunning your simulation under the condition that the "round robin" consists of each player playing every other player 20 times (we do a double round robin 10 times). What is now the effect of a modest advantage for the best player?

Now compare with it being done 200 times.
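A sketch of that experiment in Python might look like the following. It is only an illustration under stated assumptions: the standard logistic Elo formula, independent games, and a ladder of ratings with a fixed step. The function name `outright_win_rate` and the parameter `rounds` (games played per pairing) are invented for this example.

```python
import random


def win_probability(rating_a, rating_b):
    """Chance that player A beats player B (assumed logistic Elo formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def outright_win_rate(num_players, step, rounds, trials=1_000, seed=0):
    """Fraction of tournaments the top player wins outright when every
    pairing is played `rounds` times."""
    rng = random.Random(seed)
    ratings = [step * (num_players - 1 - i) for i in range(num_players)]
    # Precompute each pairing once, with the chance that player i wins.
    pairings = [
        (i, j, win_probability(ratings[i], ratings[j]))
        for i in range(num_players)
        for j in range(i + 1, num_players)
    ]
    outright = 0
    for _trial in range(trials):
        wins = [0] * num_players
        for i, j, p in pairings:
            for _game in range(rounds):
                if rng.random() < p:
                    wins[i] += 1
                else:
                    wins[j] += 1
        best = max(wins)
        if wins[0] == best and wins.count(best) == 1:
            outright += 1
    return outright / trials


if __name__ == "__main__":
    for rounds in (1, 20, 200):
        rate = outright_win_rate(num_players=8, step=25, rounds=rounds, trials=300)
        print(f"{rounds:>3} game(s) per pairing: {rate:.2f}")
```

The trial count is kept small here so the 200-game case finishes quickly in pure Python; a larger count would give smoother estimates.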

Posted: **Thu Nov 21, 2013 8:15 am**

Mike Novack wrote: This is really just a consequence of "small numbers", not the format per se.

Try rerunning your simulation under the condition that the "round robin" consists of each player playing every other player 20 times (we do a double round robin 10 times). What is now the effect of a modest advantage for the best player?

Now compare with it being done 200 times.

I don't think increasing the number of games is really realistic, for several reasons.

1) Tournaments need to finish in a reasonable amount of time. I.e., players can only play one game per day; if there are sixteen players each playing every opponent 20 times, that is 15 x 20 = 300 games per player, which would take about 300 days.

2) Repeated games aren't necessarily independent: the players' strengths will actually change from game to game as they figure out certain things, especially if the tournament lasts a year and the strongest player is 18 or 19 years old.

3) This changes what is being measured, in the sense that you start measuring how well a player can learn other people's styles, rather than how strong they are against a previously unseen opponent.

Posted: **Thu Nov 21, 2013 8:39 am**

Mike Novack wrote: Try rerunning your simulation under the condition that the "round robin" consists of each player playing every other player 20 times (we do a double round robin 10 times). What is now the effect of a modest advantage for the best player?

The work here deals with the tournament structure that is found in the real world. A go tournament held in a round-robin format will have a few players (eight for the Meijin and Honinbo leagues) playing only single-game matches.

More data is always better, but as SmoothOper points out it's not always practical.

Posted: **Thu Nov 21, 2013 9:49 am**

pwaldron wrote:
Mike Novack wrote: Try rerunning your simulation under the condition that the "round robin" consists of each player playing every other player 20 times (we do a double round robin 10 times). What is now the effect of a modest advantage for the best player?
The work here deals with the tournament structure that is found in the real world. A go tournament held in a round-robin format will have a few players (eight for the Meijin and Honinbo leagues) playing only single game matches.

More data is always better, but as SmoothOper points out it's not always practical.

So long as you are comparing a limited number of games, there's no way you will get rid of the variance, because there isn't enough data to even it out. As a result, I'm not sure what the point is in saying that 'tournament structures are flawed' when the only way to change the variance is to increase the number of games.

Posted: **Thu Nov 21, 2013 10:11 am**

Like any statistical test, tournament structures are not all equally good/bad at determining a result. That's why there are arguments about them.

Sometimes, the things we'd like a tournament to do are not precisely measurable or can't be weighed against each other (tradition, etc). That's why the arguments are hard.

Now that that's settled, let's have some arguments!

Posted: **Thu Nov 21, 2013 10:12 am**

hyperpape wrote: Like any statistical test, tournament structures are not all equally good/bad at determining a result. That's why there are arguments about them.

And of course the result is less important than making the sponsor happy.

Posted: **Fri Nov 22, 2013 8:39 am**

I was working on some analysis of leagues as well, but had not gotten through it yet. The interesting question to me is the difference between 8- and 9-person leagues (the Honinbo and Meijin tournaments in Japan) and the double league structure (two 6-person leagues with a 1-game playoff between the two winners) used in the Kisei. There is no doubt in my mind that the Yomiuri Shinbun (Kisei sponsor) thought that as the most prestigious tournament in Japan it needed something bigger and better than the Meijin. However, even a 10-person round-robin league involves 66 games. I expect that is too many for newspaper coverage. So they invented a new structure that only requires 31 games (15 x 2 plus the playoff). Just from the look of it, however, I expect that it is less likely to 'correctly' select the strongest challenger.

Edit: It is a 12-player league that requires 66 games, not a 10-player one. A 10-player league requires 45, which is still too many to cover at one game per week in a newspaper while leaving enough time to cover the final match as well.
6-player = 15 games (6 x 5 / 2)
8-player = 28 games (8 x 7 / 2)
9-player = 36 games (9 x 8 / 2)
10-player = 45 games (10 x 9 / 2)
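The game counts above follow from the usual n x (n - 1) / 2 formula for a single round robin. A small Python sketch makes the arithmetic easy to check, including the 31 games of the double league structure; the helper names `round_robin_games` and `double_league_games` are invented for this example.

```python
def round_robin_games(num_players, rounds=1):
    """Games in a round robin where every pairing is played `rounds` times."""
    return rounds * num_players * (num_players - 1) // 2


def double_league_games(league_size, playoff_games=1):
    """Two separate leagues of equal size plus a playoff between the winners."""
    return 2 * round_robin_games(league_size) + playoff_games


if __name__ == "__main__":
    for size in (6, 8, 9, 10, 12):
        print(f"{size}-player = {round_robin_games(size)} games")
    print(f"two 6-player leagues + playoff = {double_league_games(6)} games")
```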

Posted: **Fri Nov 22, 2013 11:14 am**

Looks like Kobayashi did a tour de force on ranking

http://www.stat.t.u-tokyo.ac.jp/~takemu ... R-0316.pdf

It definitely seems like the Round Robin isn't efficient.

For those who are interested: the extreme value distribution, used in the paper, is used to calculate probabilities for the hypothesis that a set of events in a given period of time is or is not the maximum, similar to calculating the probability that a value is the median, only for the maximum. I doubt something like that would be implemented, though.

Posted: **Fri Nov 22, 2013 11:31 am**

SmoothOper wrote: Looks like Kobayashi did a tour de force on ranking

This is an interesting paper, although on a quick glance through it I didn't see round robins directly mentioned. It's clear that round robins are sub-optimal from an efficiency standpoint, though, because towards the end of the tournament some games no longer have any effect on the selection of the strongest player.

Round-robin tournaments have received a surprisingly large amount of research because they are applicable more generally to areas like product testing.

Posted: **Fri Nov 22, 2013 11:36 am**

ez4u wrote: I was working on some analysis of leagues as well, but had not gotten through it yet. The interesting question to me is the difference between 8- and 9-person leagues (the Honinbo and Meijin tournaments in Japan) and the double league structure (two 6-person leagues with a 1-game playoff between the two winners) used in the Kisei. There is no doubt in my mind that the Yomiuri Shinbun (Kisei sponsor) thought that as the most prestigious tournament in Japan it needed something bigger and better than the Meijin. However, even a 10-person round-robin league involves 66 games. I expect that is too many for newspaper coverage. So they invented a new structure that only requires 31 games (15 x 2 plus the playoff). Just from the look of it, however, I expect that it is less likely to 'correctly' select the strongest challenger.

I clearly didn't do all my background research. When I was laying out this work, I deliberately discounted the idea of a tournament with an odd number of players. Little did I suspect a counterexample was so readily available.

With regard to the Kisei, the chance of 'correctly' selecting the strongest player would be the probability that the strongest player wins his league multiplied by the probability of him winning the playoff.

To a first approximation, the probability of winning the playoff will be ~50%, so the dual league system would seem to create quite a penalty.
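One way to put rough numbers on this is a simulation along the following lines. This is a sketch only, with several assumptions that are not from the thread: the standard logistic Elo formula, twelve players on a rating ladder, a hypothetical split that alternates players between the two leagues by rank, and ties for first broken at random so that each league always produces a single winner. The names `league_winner` and `compare_formats` are invented here.

```python
import random


def win_probability(rating_a, rating_b):
    """Chance that player A beats player B (assumed logistic Elo formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def league_winner(players, ratings, rng):
    """Play a single round robin among `players` (indices into `ratings`).
    Ties for first are broken at random, an assumption of this sketch."""
    wins = {p: 0 for p in players}
    for pos, a in enumerate(players):
        for b in players[pos + 1:]:
            if rng.random() < win_probability(ratings[a], ratings[b]):
                wins[a] += 1
            else:
                wins[b] += 1
    best = max(wins.values())
    return rng.choice([p for p in players if wins[p] == best])


def compare_formats(step, trials=5_000, seed=0):
    """Estimate how often the strongest of 12 players is selected by
    (a) one 12-player league and (b) two 6-player leagues plus a playoff."""
    rng = random.Random(seed)
    ratings = [step * (11 - i) for i in range(12)]  # player 0 is strongest
    everyone = list(range(12))
    # Hypothetical split: alternate players between the leagues by rank.
    league_a, league_b = everyone[0::2], everyone[1::2]
    single = double = 0
    for _ in range(trials):
        if league_winner(everyone, ratings, rng) == 0:
            single += 1
        a = league_winner(league_a, ratings, rng)
        b = league_winner(league_b, ratings, rng)
        # The top player must win league A and then the one-game playoff.
        if a == 0 and rng.random() < win_probability(ratings[a], ratings[b]):
            double += 1
    return single / trials, double / trials


if __name__ == "__main__":
    for step in (25, 50, 100):
        single, double = compare_formats(step)
        print(f"step {step:>3}: single league {single:.3f}, "
              f"double league {double:.3f}")
```

The size of any penalty depends on how the players are divided between the leagues and on the rating step, so the split used here is worth varying before drawing conclusions.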

Posted: **Fri Nov 22, 2013 11:49 am**

pwaldron wrote:
SmoothOper wrote: Looks like Kobayashi did a tour de force on ranking
This is an interesting paper, although my quick glance through it didn't see round robins directly mentioned. It's clear that round-robins are sub-optimal from an efficiency standpoint, though, because towards the end of the tournament not all games have any effect on the selection of the strongest player.

Round-robin tournaments have received a surprisingly large amount of research because they are applicable more generally to areas like product testing.

I think the point of the paper was to confidently find the optimal ranking with the least amount of effort, and I gather they didn't even consider the round robin. I don't know if finding last place (which should require as many games as finding first) is as important as finding first, so there may be subsequent improvements. I haven't read the paper thoroughly, so they may have mentioned that.

Mickey Mouse Round Robins