 All times are UTC - 8 hours [ DST ]

 Page 1 of 1 [ 7 posts ]
 Post subject: Strength as error distribution #1 Posted: Fri Feb 22, 2019 2:10 pm
 Lives with ko

Posts: 226
Liked others: 0
Was liked: 27
Rank: 2d
This came up a few times recently - some random thoughts:

The basic idea is that a player's strength can be described by the errors he makes. For simplicity I'd define an error as a move that loses points compared to the minimax solution (a bit doubtful *1). Such errors should be somewhat normal-ish (many small errors, fewer large errors *2), and after 100-200 moves the sum of these errors may be even more so (central limit theorem).
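The central limit remark can be checked with a small simulation. This is only a sketch: the exponential shape of the per-move error, the mean of 0.5 points per move and the function names are assumptions made up for illustration, not measured data.

```python
import random
import statistics


def skewness(xs):
    """Sample skewness: 0 for a symmetric (e.g. normal) distribution."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - m) / sd) ** 3 for x in xs) / len(xs)


def simulate(moves=150, games=2000, mean_error=0.5, seed=1):
    """Compare the skew of single-move errors with the skew of per-game totals."""
    rng = random.Random(seed)
    # Assumption: each move loses an exponentially distributed number of points
    # (many small errors, few large ones).
    single = [rng.expovariate(1 / mean_error) for _ in range(games)]
    totals = [
        sum(rng.expovariate(1 / mean_error) for _ in range(moves))
        for _ in range(games)
    ]
    return skewness(single), skewness(totals)


if __name__ == "__main__":
    skew_single, skew_total = simulate()
    print(f"skew of one move's error: {skew_single:.2f}")
    print(f"skew of a game's total:   {skew_total:.2f}")
```

The single-move errors are strongly skewed, while the per-game totals should come out much closer to symmetric, which is all the normal approximation of the total needs.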

Overall I think assigning a mean and a deviation to a player's per-game error total could offer a decent model. This is actually not much different from the fundamentals of Elo (performance = -errors, and the deviation may even be guessable from the mean). Except that in go there is a more tangible meaning behind these numbers: when two players play, the winning margin is the opponent's actual error sum minus the player's actual error sum (assuming correct komi).

So for each game we have two distributions similar to this plot. The player wins if his "random sample" turns out to be higher than the opponent's (= he gives up fewer points in the game than the opponent).
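A direct way to see this "two random samples" picture is to simulate it. The sketch below assumes normal distributions for both players; the function name and the numbers used with it are hypothetical.

```python
import random


def simulate_win_rate(a_ev, a_sd, b_ev, b_sd, games=100_000, seed=0):
    """Fraction of simulated games won by A, assuming normal performances."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(games):
        a_perf = rng.gauss(a_ev, a_sd)  # performance = -(A's error total)
        b_perf = rng.gauss(b_ev, b_sd)  # performance = -(B's error total)
        margin = a_perf - b_perf        # = B's errors minus A's errors
        if margin > 0:
            wins += 1
    return wins / games


if __name__ == "__main__":
    # Hypothetical players: same deviation, means 14 points (about 1 stone) apart.
    print(simulate_win_rate(-100, 10, -114, 10))
```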

Player A has a distribution described by [Aev,Asd] and the opponent has [Bev,Bsd]. For simple cases the distribution of the difference can be constructed directly, but a more general way of getting A's winning probability is this: for each point on A's distribution, take its density multiplied by B's cumulative distribution from -infinity to that point (the cases where B made more errors than A's error point in question), then integrate these products over all points.

Since only the relative width and position matter, B's distribution can be normalized, so that only A's shifted and scaled distribution is needed afterwards: A becomes [Aev',Asd'] and B is [0,1]. This means A's numbers are expressed using B's original deviation as the unit: we are only interested in where our distribution lies relative to the opponent's, and how its shape aligns with his (how much wider/narrower it is).

So Aev'=(Aev-Bev)/Bsd and Asd'=Asd/Bsd. With these, the winning probability can be approximated (*2):

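$$P(\text{A wins}) = \frac{1}{\sqrt{2\pi\,\mathrm{Asd'}^2}} \int_{-\infty}^{\infty} e^{-\frac{(x-\mathrm{Aev'})^2}{2\,\mathrm{Asd'}^2}} \cdot \frac{1}{2}\left(1+\operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right) dx$$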

Here is a Wolfram example for calculating such win probabilities (variable substitution would make it too complex for the free version, so the occurrences of Aev' and Asd' inside the square brackets need to be replaced manually).
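For those who prefer to run it locally, the integral above can be sketched in a few lines of Python. The function names, the trapezoid rule and the integration range of +/-10 deviations are my own choices. The closed form at the end is not from the post: it is the standard result that the difference of two independent normals is again normal, with variance Asd^2 + Bsd^2, so it only applies to the normal case.

```python
import math


def normal_pdf(x, mean, sd):
    return math.exp(-((x - mean) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))


def std_normal_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def win_probability(a_ev, a_sd, b_ev, b_sd, steps=20_000):
    """A's winning probability by numerically integrating pdf_A(x) * cdf_B(x)."""
    ev = (a_ev - b_ev) / b_sd  # Aev'
    sd = a_sd / b_sd           # Asd'
    lo, hi = ev - 10 * sd, ev + 10 * sd  # A's density is negligible outside
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid rule end points
        total += weight * normal_pdf(x, ev, sd) * std_normal_cdf(x)
    return total * h


def win_probability_closed_form(a_ev, a_sd, b_ev, b_sd):
    """Same probability via the normal distribution of the difference."""
    return std_normal_cdf((a_ev - b_ev) / math.hypot(a_sd, b_sd))


if __name__ == "__main__":
    # Hypothetical players, in points.
    print(win_probability(-100, 10, -110, 15))
    print(win_probability_closed_form(-100, 10, -110, 15))
```

The numerical version is the one to keep if other distribution shapes are tried later, since only `normal_pdf` and `std_normal_cdf` would need to be swapped out.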

Although the absolute position of a distribution doesn't really matter, a very rough guess is that pro level is somewhere around -50 (komi = 7, 1 stone = 2*komi, so 3-5 stones to perfect play). Two players are 1 stone apart if their ev difference is roughly 14 (supposedly 50% winrate with 1 extra stone or with reverse komi).

More interesting is the question of deviation. There is a known problem in translating Elo-like ratings to stones: the EGF win% table predicts that the winrate against opponents who are 1 stone stronger is ~33% at 9k, ~25% at 1d, and only ~20% at 7d. Using the above function in reverse hints that at 1d the deviation may be a bit less than 1 stone (<14 points). For stronger levels the deviation decreases - making fewer and smaller errors not only means a higher ev, but less absolute variance as well.

These rank-dependent winrate differences are handled by EGF using an extra (deviation-like) variable term. This approach offers a natural explanation, which follows from A's distribution being shifted and scaled against B's normalised one. For stronger players, the distribution of an opponent who is one stone (14 points) stronger lies significantly farther away in relative (scaled) terms, since the deviations are smaller. I think this is the real reason behind those differences observed in practice.
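The scaling effect is easy to see with numbers. A minimal sketch, assuming normal distributions and (as a simplification) that both players share the same deviation; the deviations 20, 15 and 10 are hypothetical values picked only to show the trend.

```python
from math import sqrt
from statistics import NormalDist

STONE_POINTS = 14  # 1 stone = 2 * komi, as estimated above


def weaker_player_winrate(sd, gap=STONE_POINTS):
    """Winrate of the weaker player when both players have deviation `sd`."""
    # The difference of two independent normals with equal sd has
    # deviation sd * sqrt(2).
    return NormalDist().cdf(-gap / (sd * sqrt(2)))


if __name__ == "__main__":
    for sd in (20, 15, 10):  # hypothetical deviations, in points
        rate = weaker_player_winrate(sd)
        print(f"sd = {sd:2d} points -> {rate:.1%} against a 1 stone stronger opponent")
```

The gap stays at 14 points throughout; only the deviation shrinks, and the weaker side's winrate drops with it.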

*1 This ignores that a deliberate safety move that trades points for consolidation of a winning position is not the same kind of error as points lost on misplaying a local fight for example.

*2 In go the actual error values and sums are integer, so something like a binomial distribution would probably be best. But approximating with other distributions like normal or logistic should also be ok, except maybe at near-perfect play (no positive values / side).

 This post by moha was liked by 3 people: Joaz Banbeck, marvin, Waylon

 Post subject: Re: Strength as error distribution #2 Posted: Fri Feb 22, 2019 5:38 pm
 Judan

Posts: 5172
Location: Banbeck Vale
Liked others: 873
Was liked: 1258
Rank: 1D AGA
GD Posts: 1512
Kaya handle: Test
moha wrote:
...errors should be somewhat normal-ish (many small errors, fewer large errors ...

I'm suspicious of this assumption. The availability of errors of different sizes varies throughout the game. ( The largest error that can be made on the first move should be no more than komi*2, and in the last few moves it is usually a point or two. But in the middle game a bad move can sometimes throw away 100+ points )

I suspect that it will not be a normal distribution: that small errors will be over-represented.

_________________
'I have often wondered how it is that every man loves himself more than all the rest of men, but yet sets less value on his own opinions of himself than on the opinions of others." -Marcus Aurelius

 Post subject: Re: Strength as error distribution #3 Posted: Fri Feb 22, 2019 6:44 pm
 Honinbo

Posts: 8186
Liked others: 2344
Was liked: 2855
To me the fact that errors are non-negative integers suggests a Poisson distribution.

_________________
There is one human race.
----------------------------------------------------

At some point, doesn't thinking have to go on?

 Post subject: Re: Strength as error distribution #4 Posted: Fri Feb 22, 2019 6:46 pm
 Honinbo

Posts: 8186
Liked others: 2344
Was liked: 2855
Joaz Banbeck wrote:
moha wrote:
...errors should be somewhat normal-ish (many small errors, fewer large errors ...

I'm suspicious of this assumption. The availability of errors of different sizes varies throughout the game. ( The largest error that can be made on the first move should be no more than komi*2, and in the last few moves it is usually a point or two. But in the middle game a bad move can sometimes throw away 100+ points )

I suspect that it will not be a normal distribution: that small errors will be over-represented.

For many amateurs the error distribution may be bimodal, with better amateurs making fewer large errors.

_________________
There is one human race.
----------------------------------------------------

At some point, doesn't thinking have to go on?

 Post subject: Re: Strength as error distribution #5 Posted: Fri Feb 22, 2019 9:49 pm
 Lives with ko

Posts: 226
Liked others: 0
Was liked: 27
Rank: 2d
Joaz Banbeck wrote:
The availability of errors of different sizes varies throughout the game. ( The largest error that can be made on the first move should be no more than komi*2, and in the last few moves it is usually a point or two. But in the middle game a bad move can sometimes throw away 100+ points )
Right, the scales of individual errors likely correlate with temperature changes throughout the game. And a large per-game error total may have more to do with a middlegame blunder than with dozens of smaller errors, for example.

This in itself doesn't exclude normality for the total though (e.g. the sum of a few normals is still normal, even if one of them is orders of magnitude larger in scale). But the normality of individual errors is even more questionable, of course.

Another possible consequence, verifiable from actual data on results: if the largest errors come from the middlegame, the deviation of the total can depend significantly on the character of the player as well (so it is not guessable from the mean, as EGF tries to do). Someone who has a strong middlegame likely makes fewer errors there, so he likely has a smaller deviation for his total than others with the same rank (mean). This still leaves him with 50% against them, but should have noticeable and consistent effects on his chances against opponents who are 1 stone stronger (similar to the 9k-1d-7d anomalies above).
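To put rough numbers on this, here is a sketch with two made-up players of equal mean but different deviation, facing the same one stone stronger opponent. All three [ev, sd] pairs are hypothetical, and normal distributions are assumed.

```python
from math import hypot
from statistics import NormalDist


def win_probability(a, b):
    """P(A beats B) for normal performances given as (ev, sd) pairs."""
    a_ev, a_sd = a
    b_ev, b_sd = b
    return NormalDist().cdf((a_ev - b_ev) / hypot(a_sd, b_sd))


steady = (-100, 8)    # hypothetical: strong middlegame, few blunders
erratic = (-100, 16)  # hypothetical: same mean, twice the deviation
stronger = (-86, 12)  # hypothetical opponent, one stone (14 points) stronger

if __name__ == "__main__":
    print(f"steady vs erratic:   {win_probability(steady, erratic):.1%}")
    print(f"steady vs stronger:  {win_probability(steady, stronger):.1%}")
    print(f"erratic vs stronger: {win_probability(erratic, stronger):.1%}")
```

With these numbers the two equal-rank players are even against each other, yet the steadier one scores lower against the stronger opponent, which is the kind of consistent difference that could be looked for in real results.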

Quote:
I suspect that it will not be a normal distribution: that small errors will be over-represented.
This is why the longer route with the double integral seems preferable: it works for a wider range of distributions.

Bill Spight wrote:
To me the fact that errors are non-negative integers suggests a Poisson distribution.
In a few years the newer bots (with multi-komi NNs or the SAI fork) may be able to provide actual data on this.
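Until such data exists, the general density-times-cumulative approach can at least be tried with an integer-valued distribution. The sketch below uses Poisson error totals purely as an example of the idea; the function names, the truncation at 400 points and the means used in the demo are assumptions. Note that a Poisson total ties the deviation to the mean (sd = sqrt(mean)), so it is only one possible choice.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of exactly k, computed in log space to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def win_tie_probabilities(pmf_a, pmf_b, max_errors=400):
    """Return (P(A wins), P(tie)) for integer per-game error totals.

    pmf_a(k) and pmf_b(k) give the probability that a player's error
    total is exactly k points. A wins when B loses more points than A.
    """
    win = tie = 0.0
    cum_b = 0.0
    for k in range(max_errors + 1):
        pa, pb = pmf_a(k), pmf_b(k)
        cum_b += pb                  # P(B's total <= k)
        win += pa * (1.0 - cum_b)    # A lost k points, B lost more than k
        tie += pa * pb               # equal totals: jigo under correct komi
    return win, tie


if __name__ == "__main__":
    # Hypothetical error totals: A averages 50 points, B averages 64.
    win, tie = win_tie_probabilities(
        lambda k: poisson_pmf(k, 50),
        lambda k: poisson_pmf(k, 64),
    )
    print(f"A wins {win:.1%}, tie {tie:.1%}")
```

Any other integer distribution (binomial, or one measured from bot analysis) can be passed in the same way.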

 Post subject: Re: Strength as error distribution #6 Posted: Mon Feb 25, 2019 6:20 pm
 Lives with ko

Posts: 226
Liked others: 0
Was liked: 27
Rank: 2d
Some further thoughts in comparison to 1-dimensional (mean only) systems:

When two players play, the side with the higher mean always has the upper hand. How large his advantage is, however, depends on the deviations nearly as much as on the means.

Balancing a matchup to 50% winrate with handicap or komi needs means only. This basically shifts means to be identical, then deviations don't matter anymore.

Partially non-transitive situations are possible, rather practical even (no special correlations, players showing the same performance against all opponents). For example, A is [-100,10], B is [-110,15], and C is [-115,30]. Then A>B (71%), B>C (56%), but A wins less against C (68%) than against B.
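These percentages can be reproduced with a few lines, assuming normal distributions (so the difference of two performances is normal with variance Asd^2 + Bsd^2). The function name is mine; the [ev, sd] pairs are the ones from the example.

```python
from math import hypot
from statistics import NormalDist


def win_probability(a, b):
    """P(A beats B) for normal performances given as (ev, sd) pairs."""
    a_ev, a_sd = a
    b_ev, b_sd = b
    return NormalDist().cdf((a_ev - b_ev) / hypot(a_sd, b_sd))


A, B, C = (-100, 10), (-110, 15), (-115, 30)

if __name__ == "__main__":
    print(f"A vs B: {win_probability(A, B):.0%}")
    print(f"B vs C: {win_probability(B, C):.0%}")
    print(f"A vs C: {win_probability(A, C):.0%}")
    # Handicap shifts the means together; the deviations then stop mattering.
    print(f"equal means: {win_probability((-100, 10), (-100, 30)):.0%}")
```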

So it may be better to exclude non-handicapped games between players of different ranks from 1-dimensional systems. Otherwise deviations may get measured/smeared into the ratings (which should approximate means only - rating C higher than B would be incorrect).

 This post by moha was liked by: Bill Spight

 Post subject: Re: Strength as error distribution #7 Posted: Sun Apr 14, 2019 3:30 pm
 Lives with ko

Posts: 226
Liked others: 0
Was liked: 27
Rank: 2d
Out of curiosity I tried to use this approach on the relation between points (early mistakes / advantages) and winning probabilities.

This is well defined if we know the shapes of players' distributions (or a good approximation), and we have at least a single data point to establish the scale (the distance between the two distributions in deviations). So if we know the percentage value of X points, we can calculate Y points and so on (by shifting the distributions).

And we do have one data point: a whole stone. This is if one player passes his first move - or if there is 1 stone strength difference between the players. And we can guess the point value of this is twice komi - roughly 14 points.

For human ranks I took the winrates against a 1 stone stronger opponent from the above EGF table (adjusting down by half a rank, and with some guessing for pro levels / 9d, since it only goes up to 7d). I experimented with bots as well, but their winrate gains fluctuate wildly and sometimes inconsistently (even for smaller mistakes), so I could only roughly conclude that one move for LZ is worth about a 35-40% gain. This is not much different from my 9d approximation, so I made no column for it. Instead I include the idea of 2pt=10% - this can also be used as an anchor.

So, using the above wolfram calculator in both directions, I get the following values:
Code:
                               1 dan   7 dan   9 dan2pt=10%?
------------------------------------------------------------
winrate vs 1 extra stone       27.4%   20.2%    ~16%   3.7%?
equiv. distance in sd-s         0.85    1.18    1.41
1 point distance in sd-s       .0607   .0843   .1007   .1800
sd in points                   16.47   11.86    9.93    5.56
------------------------------------------------------------
winrate gain for 1 pt (%)       1.71    2.38    2.84    5.06
winrate gain for 2 pts (%)      3.42    4.74    5.66    10.0
winrate gain for 3 pts (%)      5.12    7.10    8.46    14.9
winrate gain for 5 pts (%)      8.50    11.7    13.9    23.8
winrate gain for 7 pts (%)      11.8    16.2    19.1    31.4
winrate gain for 14 pts (%)     22.6    29.8    34.0    46.3

This is for the early game only, of course, and similar results can be obtained by an odds-based approach as well.
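The table values can be recomputed without the Wolfram calculator. This is a sketch under the assumptions that seem to be behind the table: normal distributions, both players sharing the same deviation (Asd' = 1), and one stone = 14 points. The function names are mine.

```python
from math import sqrt
from statistics import NormalDist

STONE_POINTS = 14   # one stone = 2 * komi
STD = NormalDist()  # standard normal


def sd_in_points(winrate_vs_one_stone):
    """Deviation (in points) implied by the winrate against a 1 stone stronger opponent."""
    # With equal deviations the difference has deviation sd * sqrt(2).
    distance_in_sd = -sqrt(2) * STD.inv_cdf(winrate_vs_one_stone)
    return STONE_POINTS / distance_in_sd


def sd_from_gain(points, gain):
    """Deviation implied by an anchor such as '2 points = 10% winrate gain'."""
    return points / (sqrt(2) * STD.inv_cdf(0.5 + gain))


def winrate_gain(points, sd):
    """Winrate gained over 50% by an early advantage of `points`."""
    return STD.cdf(points / (sd * sqrt(2))) - 0.5


if __name__ == "__main__":
    for label, sd in (
        ("1 dan", sd_in_points(0.274)),
        ("7 dan", sd_in_points(0.202)),
        ("2pt=10%", sd_from_gain(2, 0.10)),
    ):
        gains = ", ".join(f"{100 * winrate_gain(p, sd):.1f}" for p in (1, 2, 3, 5, 7, 14))
        print(f"{label:8s} sd = {sd:5.2f} pts, gains (%) for 1/2/3/5/7/14 pts: {gains}")
```

Small differences in the last digit against the table are expected, since the table rounds the distance in sd-s before dividing.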
