 All times are UTC - 8 hours [ DST ]

 Page 1 of 1 [ 8 posts ]
 Post subject: A little experiment with 400 games #1 Posted: Wed Dec 26, 2018 4:49 am
 Lives with ko

Posts: 238
Liked others: 4
Was liked: 57
I was very surprised to see that two 400-game matches could differ by 6 points in winrate (the winrates were 33% and 39%, with the same parameters).

To check that the results of these two 400-game matches were within reasonable probabilistic limits, I set up a 100,000-throw match of "Heads or Tails". The result comes in seconds, so this "heads or tails" experiment can be run several times.

Bottom line is :

For a 100000 throws game,
After 400 throws the winrate is anywhere between ~44% and ~56%
After 1000 throws the winrate is usually between ~46% and ~54%
After 10000 throws, it's within 1-2 % of 50%
and at 100000 throws, it's usually very close to 50%

It's really a surprise to me that 400 games give so little information!
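A minimal sketch of this kind of coin-flip experiment in Python, for anyone who wants to repeat it. This is not the program used for the graphs; the function name `running_winrates` and the seeds are made up for illustration.

```python
import random

def running_winrates(total_throws, checkpoints, seed=None):
    """Flip a fair coin total_throws times and return the heads rate
    observed at each checkpoint (e.g. after 400, 1000, ... throws)."""
    rng = random.Random(seed)
    wanted = set(checkpoints)
    heads = 0
    rates = {}
    for i in range(1, total_throws + 1):
        heads += rng.random() < 0.5  # True counts as 1
        if i in wanted:
            rates[i] = heads / i
    return rates

if __name__ == "__main__":
    # Three independent "games"; each line shows one game at four checkpoints.
    for run in range(3):
        rates = running_winrates(100_000, [400, 1_000, 10_000, 100_000], seed=run)
        print(", ".join(f"{n}: {r:.1%}" for n, r in rates.items()))
```

Changing the seed (or passing `seed=None`) gives a different game each time, which makes it easy to see how much the 400-throw figure moves around compared with the 100000-throw one.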

Below are two graphs; in each, the 4 pictures describe the same game after 400, 1000, 10000 and 100000 throws.
Attachment: 4gr2.gif
Attachment: 4gr3.gif
PS. I'm running a third (and last) 400 game match between #196 and 92297ff

 This post by Vargo was liked by: And

 Post subject: Re: A little experiment with 400 games #2 Posted: Wed Dec 26, 2018 5:10 am
 Gosei

Posts: 1406
Liked others: 695
Was liked: 459
Rank: AGA 3k KGS 1k Fox 1d
GD Posts: 61
KGS: dfan
The variance of a binomial distribution (summing up the results of n trials where each one has a probability of success of p) is np(1-p), and the standard deviation is the square root of that. When n is large enough for the normal approximation to hold, about 68% of results lie within one standard deviation of the mean and about 95% lie within two.

For this case, that means that if we assume that the engines are exactly equal in strength, the standard deviation of a 400-game experiment is sqrt(400 * .5 * .5) = 10. So we expect 68% of our experiments to end with a result in the range of 200 ± 10 wins for the first engine (47.5% to 52.5% win rate) and 95% to end with 200 ± 20 wins (45% to 55%).
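The arithmetic above can be checked with a few lines of Python. This is only a sketch of the sqrt(np(1-p)) rule; the helper name `binomial_winrate_ranges` is hypothetical.

```python
import math

def binomial_winrate_ranges(n_games, p):
    """Return (sd in wins, one-sd win-rate range, two-sd win-rate range)
    for n_games independent games with win probability p."""
    mean = n_games * p
    sd = math.sqrt(n_games * p * (1 - p))
    one_sd = ((mean - sd) / n_games, (mean + sd) / n_games)
    two_sd = ((mean - 2 * sd) / n_games, (mean + 2 * sd) / n_games)
    return sd, one_sd, two_sd

sd, one_sd, two_sd = binomial_winrate_ranges(400, 0.5)
print(f"sd = {sd:.1f} wins")
print(f"~68% range: {one_sd[0]:.1%} to {one_sd[1]:.1%}")
print(f"~95% range: {two_sd[0]:.1%} to {two_sd[1]:.1%}")
```

Because the standard deviation grows like the square root of the number of games, halving the width of these ranges takes four times as many games.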

 This post by dfan was liked by 2 people: And, Bill Spight

 Post subject: Re: A little experiment with 400 games #3 Posted: Wed Dec 26, 2018 6:42 am
 Lives with ko

Posts: 297
Liked others: 58
Was liked: 246
Rank: maybe 2d
Yep, the fact that you need so many games to reliably test strength differences (unless they're extreme) becomes painfully apparent if you're a bot developer.


 Post subject: Re: A little experiment with 400 games #4 Posted: Wed Dec 26, 2018 7:00 am
 Lives with ko

Posts: 238
Liked others: 4
Was liked: 57
dfan wrote:
So we expect 68% of our experiments to end with a result in the range of 200 ± 10 wins for the first engine (47.5% to 52.5% win rate)

You're right, and it also means that when you run 400-game matches between evenly matched contestants, you'll get winrates that are off by about 3% or more (...46%, 47% or 53%, 54%...) in ~30% of the matches!
That's a lot...
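How often a 400-game match between equal engines misses 50% by a given margin can also be computed exactly from the binomial distribution rather than read off the 68% rule. A rough sketch follows; `prob_winrate_off_by` is a made-up helper, and the exact figure depends on where the threshold is drawn (2.5 points is one standard deviation here, 3 points is a bit more).

```python
import math

def prob_winrate_off_by(n_games, p, min_gap):
    """Exact probability that the observed win rate differs from p
    by at least min_gap (a fraction, e.g. 0.03 for 3 points).
    Intended for a few hundred games; very large n would overflow."""
    total = 0.0
    for wins in range(n_games + 1):
        # Small tolerance so that e.g. 212/400 counts as "off by 0.03".
        if abs(wins / n_games - p) >= min_gap - 1e-12:
            total += math.comb(n_games, wins) * p**wins * (1 - p) ** (n_games - wins)
    return total

for gap in (0.025, 0.03):
    print(f"off by >= {gap:.1%}: {prob_winrate_off_by(400, 0.5, gap):.1%}")
```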


 Post subject: Re: A little experiment with 400 games #5 Posted: Wed Dec 26, 2018 7:44 am
 Lives in sente

Posts: 935
Liked others: 0
Was liked: 159
But back to Vargo's original question. What has been shown for the binomial is the special case where p is 0.5 (the engines are actually of equal strength).

BUT -- the whole purpose of the experiment was to determine whether one engine was in fact stronger than the other (p NOT 0.5).

Here the results were 0.39 and 0.33, and the question being asked was whether the difference between them is meaningful.

Try doing the calculation again with a value for p of something like p = 0.36, and then see whether experimental results of 0.39 and 0.33 are unlikely or not.


 Post subject: Re: A little experiment with 400 games #6 Posted: Wed Dec 26, 2018 10:21 am
 Gosei

Posts: 1406
Liked others: 695
Was liked: 459
Rank: AGA 3k KGS 1k Fox 1d
GD Posts: 61
KGS: dfan
Yes, I was just providing the analysis for the experiment described in this thread, partially because the full parameters of that experiment were known (the coin was fair) and the parameters of the original experiment are not (we don't know what the "real" strength difference between the two engines is).

That said, if we assume a coin that comes up heads 36% of the time, the standard deviation of a 400-game result is now 9.6 instead of 10, which means that ~68% of the time the result will lie between .36 * 400 - 9.6 and .36 * 400 + 9.6 wins, which comes out to a winning percentage range of 33.6% to 38.4%. 95% of the time the observed result of a 400-game match will lie within 31.2% and 40.8%.
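To see where the two observed results sit inside those ranges, one can express each as a number of standard deviations from the assumed mean. The sketch below assumes p = 0.36 as in the thread and converts the rounded 33% and 39% into 132 and 156 wins, which is itself an assumption; `z_score_of_result` is a hypothetical helper and the tail probability uses the normal approximation.

```python
import math

def z_score_of_result(wins, n_games, p):
    """Distance of an observed win count from the expected n*p,
    in units of the binomial standard deviation, plus the two-sided
    normal-approximation probability of a result at least that far out."""
    sd = math.sqrt(n_games * p * (1 - p))
    z = (wins - n_games * p) / sd
    two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, two_sided

for wins in (132, 156):  # 33% and 39% of 400 games
    z, tail = z_score_of_result(wins, 400, 0.36)
    print(f"{wins} wins: z = {z:+.2f}, P(at least this far from the mean) ~ {tail:.2f}")
```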


 Post subject: Re: A little experiment with 400 games #7 Posted: Wed Dec 26, 2018 6:43 pm
 Lives with ko

Posts: 241
Liked others: 52
Was liked: 72
KGS: lepore
I think in the case of two runs of 400 flips, with one giving 33% heads and one giving 39% heads, it makes sense to do a straightforward test of proportions. The variance would get pooled.
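A pooled two-proportion z-test of that kind could look roughly like this in plain Python. The win counts 156 and 132 are assumed from the rounded 39% and 33%, the function name is made up, and the p-value relies on the normal approximation.

```python
import math

def two_proportion_z_test(wins_a, n_a, wins_b, n_b):
    """Pooled two-proportion z-test.
    Returns (z, two-sided p-value under the normal approximation)."""
    p_a = wins_a / n_a
    p_b = wins_b / n_b
    # Under the null hypothesis both runs share one win rate,
    # so the variance is estimated from the pooled proportion.
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = two_proportion_z_test(156, 400, 132, 400)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

A p-value above the usual 0.05 threshold would mean the two runs are compatible with a single underlying win rate, which is the question raised earlier in the thread.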


 Post subject: Re: A little experiment with 400 games #8 Posted: Thu Dec 27, 2018 10:43 am
 Lives with ko

Posts: 247
Liked others: 0
Was liked: 31
Rank: 2d
One should be very careful when calculating the expected deviation, particularly because these simple figures assume sample independence, which is not necessarily true. For example, the first few moves and the choice of openings / first josekis can have a large effect on the outcome, and these are often identical within the sample set. So you may not even be getting 400 effective games' worth of samples, which further increases variance.
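A toy simulation can illustrate the effect. Everything in it is an assumption made for illustration, not a model of any real engine: each 400-game match is split into 20 openings of 20 games, and each opening shifts the win probability up or down by a fixed amount.

```python
import random
import statistics

def simulate_wins(n_openings, games_per_opening, opening_effect, rng):
    """One match: every opening is played games_per_opening times, and
    each opening moves the win probability to 0.5 +/- opening_effect."""
    wins = 0
    for _ in range(n_openings):
        p = 0.5 + rng.choice([-opening_effect, opening_effect])
        wins += sum(rng.random() < p for _ in range(games_per_opening))
    return wins

def sd_of_wins(n_matches, n_openings, games_per_opening, opening_effect, seed=0):
    """Standard deviation of the win count over many simulated matches."""
    rng = random.Random(seed)
    results = [
        simulate_wins(n_openings, games_per_opening, opening_effect, rng)
        for _ in range(n_matches)
    ]
    return statistics.pstdev(results)

print("independent games:", round(sd_of_wins(300, 20, 20, 0.0), 1))
print("shared openings:  ", round(sd_of_wins(300, 20, 20, 0.2), 1))
```

With no opening effect the spread should come out near the binomial value of 10 wins; with the (arbitrary) 0.2 shift it is noticeably larger, even though both matches have 400 games.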

