Bob beats Alice ⅞ of the time, with a win/loss ratio of 7/1. Carol beats Bob ⅞ of the time, with a win/loss ratio of 7/1. Carol always beats Alice. If we estimate Carol's win/loss ratio against Alice as (7/1) (7/1) = 49/1, then OC the win/loss ratio is off by infinity. However, the winning percentage (49/50 = 98% estimated, versus 100% actual) is off by only 2%.
dfan wrote: There is no particular reason that winning percentages have to be related in this exact mathematical way.
For example, Alice, Bob and Carol all play the classic game "Whose random number is bigger?". Alice is a beginner and picks integers from 1 to 100 uniformly at random. Bob is more experienced and picks integers from 51 to 150 uniformly at random. Carol is an expert and picks integers from 101 to 200 uniformly at random (she's very good at this game, though you can probably imagine even better strategies).
How often does Bob beat Alice? How often does Carol beat Bob? How often does Carol beat Alice?
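The three questions above can be answered by brute-force enumeration. The following is only a sketch: the function name `outcome_probs` and the decision to count ties separately are my own assumptions, not part of the original puzzle. With integer picks the win rate comes out slightly below the ⅞ you get from treating the picks as continuous, because ties are possible.

```python
from fractions import Fraction
from itertools import product

def outcome_probs(x_range, y_range):
    """Exact (P(X > Y), P(X == Y), P(X < Y)) for independent uniform integer picks."""
    wins = ties = losses = 0
    for x, y in product(x_range, y_range):
        if x > y:
            wins += 1
        elif x == y:
            ties += 1
        else:
            losses += 1
    total = wins + ties + losses
    return Fraction(wins, total), Fraction(ties, total), Fraction(losses, total)

alice = range(1, 101)    # integers 1..100
bob = range(51, 151)     # integers 51..150
carol = range(101, 201)  # integers 101..200

bob_vs_alice = outcome_probs(bob, alice)
carol_vs_bob = outcome_probs(carol, bob)
carol_vs_alice = outcome_probs(carol, alice)

# Chain the win/loss odds of the two adjacent pairings (ties ignored)
# to get the "oddswise" estimate for Carol vs. Alice.
chained_odds = (bob_vs_alice[0] / bob_vs_alice[2]) * (carol_vs_bob[0] / carol_vs_bob[2])
chained_win_prob = chained_odds / (1 + chained_odds)

print("Bob beats Alice:       ", float(bob_vs_alice[0]))
print("Carol beats Bob:       ", float(carol_vs_bob[0]))
print("Carol beats Alice:     ", float(carol_vs_alice[0]))
print("Chained-odds estimate: ", float(chained_win_prob))
```

The chained estimate lands near 98% while the true figure is 100%, so the percentage error is small even though the odds ratio itself is unbounded.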
LZ's progression
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: LZ's progression
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
- drmwc
- Lives in gote
- Posts: 452
- Joined: Sat Dec 01, 2012 2:18 pm
- Rank: 4 Dan European
- GD Posts: 0
- Has thanked: 74 times
- Been thanked: 100 times
Re: LZ's progression
Suppose we play a hold'em game. We have 3 possible starting hands. You choose a hand first, I choose second. We then play out the flop, turn and river with no betting.
We bet an amount on the outcome. Surely you are bound to win, as you choose first?
The 3 possible hands are:
The red 2s
6, 7 of spades
Ace of spades, king of clubs.
-
moha
- Lives in gote
- Posts: 311
- Joined: Wed May 31, 2017 6:49 am
- Rank: 2d
- GD Posts: 0
- Been thanked: 45 times
Re: LZ's progression
I think we talk about the same thing, and your example seems good (edit: or maybe not?).
dfan wrote: I thought it was an example of a "shape of the actual distribution of single game performances" (uniform rather than Gaussian or somesuch) but maybe I was misinterpreting moha's phrase. For one thing, I was interpreting "game performance" as being a function of a single player, and whoever has the better performance wins; perhaps something else was meant.
Bill Spight wrote: How is it an example? Doesn't it depend upon the structure of the game and a presumed definition of expertise at it, rather than the distribution of the game results per se?
Your point that there is no necessary relationship between the win rates of A vs. B, B vs. C, and A vs. C is well taken. But I don't think that is what moha is saying.
About "there is no necessary relationship between the win rates of A vs. B, B vs. C, and A vs. C": it seems to me that, assuming the simplest case (no correlations, players' performances independent, etc.), there is a relationship, which depends on the individual performance distributions (and thus varies by game type).
-
moha
- Lives in gote
- Posts: 311
- Joined: Wed May 31, 2017 6:49 am
- Rank: 2d
- GD Posts: 0
- Been thanked: 45 times
Re: LZ's progression
drmwc: Your example seems to be a second player advantage game, not quite the same as the A>B>C 60%+60% question.
For the latter, a slightly similar and interesting example with pure uncorrelated distributions:
Player A picks number 30 (20% of cases) or 3.
Player B picks number 20 (50% of cases) or 2.
Player C picks number 10 (80% of cases) or 1.
But this still seems an r-p-s like situation, which can be considered a distorting factor (individual distributions differ in shape).
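To make the three-player example concrete, here is a small sketch that works out the pairwise win rates exactly. It assumes, as the post does, that the picks are independent; the helper name `win_prob` is just for illustration.

```python
from fractions import Fraction

# Each player is a list of (number, probability) pairs, taken from the post.
A = [(30, Fraction(1, 5)), (3, Fraction(4, 5))]
B = [(20, Fraction(1, 2)), (2, Fraction(1, 2))]
C = [(10, Fraction(4, 5)), (1, Fraction(1, 5))]

def win_prob(x, y):
    """P(X > Y) for two independent picks (no ties are possible here)."""
    return sum(px * py for vx, px in x for vy, py in y if vx > vy)

a_beats_b = win_prob(A, B)
b_beats_c = win_prob(B, C)
a_beats_c = win_prob(A, C)

# Oddswise estimate for A vs. C from the two adjacent pairings.
odds = (a_beats_b / (1 - a_beats_b)) * (b_beats_c / (1 - b_beats_c))
chained_estimate = odds / (1 + odds)

print("A beats B:", float(a_beats_b))
print("B beats C:", float(b_beats_c))
print("A beats C:", float(a_beats_c))
print("Chained-odds estimate for A beats C:", float(chained_estimate))
```

A beats B 60% of the time and B beats C 60% of the time, yet A beats C only 36% of the time, while chaining the odds would predict about 69%.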
Last edited by moha on Fri May 18, 2018 7:40 am, edited 2 times in total.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: LZ's progression
Apologies to drmwc for butting in, but I take it as an example of non-transitivity.
moha wrote: drmwc: Your example seems to be a second player advantage game, not quite the same as the A>B>C 60%+60% question.
I'm not sure, but your idea of individual distributions seems to be related to what I am calling different strengths and weaknesses of different players (multi-dimensionality), which can also cause non-transitivity.
moha wrote: For the latter, a slightly similar and interesting example with pure uncorrelated distributions:
Player A picks number 30 (20% of cases) or 3.
Player B picks number 20 (50% of cases) or 2.
Player C picks number 10 (80% of cases) or 1.
But this still seems an r-p-s like situation, which can be considered a distorting factor (individual distributions differ in shape).
-
moha
- Lives in gote
- Posts: 311
- Joined: Wed May 31, 2017 6:49 am
- Rank: 2d
- GD Posts: 0
- Been thanked: 45 times
Re: LZ's progression
Yes, but this is similar to rock-paper-scissors. I tried to exclude such distortions, and assumed uncorrelated performances, and that the players only differ in strength (position and variance but not shape of distribution). Then that shape still seems to matter for the accuracy of the oddswise approach.
Bill Spight wrote: I'm not sure, but your idea of individual distributions seems to be related to what I am calling different strengths and weaknesses of different players (multi-dimensionality). Which can also cause non-transitivity.
-
dfan
- Gosei
- Posts: 1598
- Joined: Wed Apr 21, 2010 8:49 am
- Rank: AGA 2k Fox 3d
- GD Posts: 61
- KGS: dfan
- Has thanked: 891 times
- Been thanked: 534 times
- Contact:
Re: LZ's progression
Yes. (You can see the ELF network in the graph in multiple places; it's the gray cross with hash code starting with 62b.)
chut wrote: Just wondering, there is a rather sharp upturn in the strength graph at Elo 10800; does that correspond to introducing ELF OpenGo to train LZ?
-
chut
- Dies in gote
- Posts: 23
- Joined: Sun May 20, 2018 5:47 am
- GD Posts: 0
- Has thanked: 7 times
- Been thanked: 3 times
Re: LZ's progression
In this series of matches with Haylee, there are no 3-3 point invasions even with a 2-stone handicap, so is the network weight used the 'human' one?
https://www.youtube.com/watch?v=hExYHwtsra8
I find the zero human network quite bad with handicap games. Leela 11 would trash me with 4 stones handicap, but Zero would start by invading all the 3,3 points making the game much easier and much less interesting.
I am wondering whether there is a way to tweak the MCTS for handicap games, for example to favor branches that may not be the best but have the highest number of sub-branches that are near optimal.
-
dfan
- Gosei
- Posts: 1598
- Joined: Wed Apr 21, 2010 8:49 am
- Rank: AGA 2k Fox 3d
- GD Posts: 61
- KGS: dfan
- Has thanked: 891 times
- Been thanked: 534 times
- Contact:
Re: LZ's progression
People on the LZ team have been thinking about how to make it play handicap games better: https://github.com/gcp/leela-zero/issues/1313
-
Vargo
- Lives in gote
- Posts: 337
- Joined: Sat Aug 17, 2013 5:28 am
- GD Posts: 0
- Has thanked: 22 times
- Been thanked: 97 times
Re: LZ's progression
TWOGTP matches :
1) LZ's networks
Matches between networks #0, #10, #20, #30, ..., #140.
For each network, two 100-game matches against network #70, which is the reference point.
For example, #0 never wins against #70 (1st run = 0 wins out of 100 games, and 2nd run = 0 wins),
and #140 almost always wins against #70 (1st run = 99 wins out of 100 games, and 2nd run = 99 wins).
twogtp, with LZ015, --visits=51 --noponder
For example, the line "60-70, 29, 17" means network #60 won 29 games out of 100 against #70 in the first match, and 17 games out of 100 in the second 100-game match.
Two odd things:
#20 won 1 game against #70!
For #60, the two results vary a lot (29 and 17).
2) Zen7 vs LZ with networks #...
--visits=3201 --noponder for LZ and
-t 12 -T 1 -s 850 (gtp4zen)
Each match is 20 games (10 as B, 10 as W)
Zen takes about twice as much time as LZ
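On the 29 vs. 17 result for #60: a rough way to judge whether two 100-game runs really disagree is a two-proportion z-test. This is only a sketch. It assumes every game is an independent draw with the same win probability, which may not hold for twogtp matches at low visit counts, and `two_proportion_z` is a name made up for this example.

```python
import math

def two_proportion_z(wins1, n1, wins2, n2):
    """z-score for the difference between two observed win rates (pooled variance)."""
    pooled = (wins1 + wins2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (wins1 / n1 - wins2 / n2) / se

# Network #60 vs. #70: 29 wins out of 100, then 17 wins out of 100.
z = two_proportion_z(29, 100, 17, 100)
print(f"z = {z:.2f}")
```

Under those assumptions the gap is about two standard errors, so it is unusual but not extraordinary for a single pair of 100-game matches.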
-
Uberdude
- Judan
- Posts: 6727
- Joined: Thu Nov 24, 2011 11:35 am
- Rank: UK 4 dan
- GD Posts: 0
- KGS: Uberdude 4d
- OGS: Uberdude 7d
- Location: Cambridge, UK
- Has thanked: 436 times
- Been thanked: 3718 times
Re: LZ's progression
For an idea of what these win ratios mean in terms of (weaker) human rank difference, check https://senseis.xmp.net/?EGFWinningStatistics. E.g. a 3d beats a 6d about 8% of the time, whilst a 4d beats a 7d about 3% of the time.
-
Vargo
- Lives in gote
- Posts: 337
- Joined: Sat Aug 17, 2013 5:28 am
- GD Posts: 0
- Has thanked: 22 times
- Been thanked: 97 times
Re: LZ's progression
Network #144 (9e88) was promoted against #143 (057a) by winning 54.84% of its 403 games.
Is this more or less reproducible? (I think official matches are played with 3201 visits.)
Here are five twogtp matches, 403 games per match (--visits=xxxx, --noponder).
Up to 200 visits, the win% fluctuates wildly (75% at 0 visits, and then less than 50% with a few visits).
Then at 3201 visits, it's 52.35%, which is not bad.
(I won't run many of these 400-game matches with 3200 visits, because it takes a long time, even with a good GPU.)
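As a rough check on reproducibility, here is a sketch of a normal-approximation 95% interval around the promotion result. The win counts are my own back-calculation (54.84% of 403 is about 221 wins, 52.35% is about 211), and the calculation assumes independent games with a fixed win probability.

```python
import math

def normal_interval(wins, games, z=1.96):
    """Approximate 95% confidence interval for a win rate (normal approximation)."""
    p = wins / games
    half_width = z * math.sqrt(p * (1 - p) / games)
    return p - half_width, p + half_width

# Assumed counts: 221/403 is roughly 54.84%, 211/403 is roughly 52.35%.
low, high = normal_interval(221, 403)
rerun_rate = 211 / 403

print(f"promotion match: 95% interval {low:.3f} to {high:.3f}")
print(f"re-run at 3201 visits: {rerun_rate:.3f}")
```

The interval is roughly ±5 percentage points wide, so a 52.35% re-run is well within what a 403-game match can produce for the same pair of networks.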
- Attachment: 9e88.jpg (81.49 KiB)
-
moha
- Lives in gote
- Posts: 311
- Joined: Wed May 31, 2017 6:49 am
- Rank: 2d
- GD Posts: 0
- Been thanked: 45 times
Re: LZ's progression
This may be a good time for buying a lottery ticket.
OC this assumes promotion is rare (it is a kind of survivor bias). And the difference somewhat scales with the number of sims, so the advantage of the stronger net will likely be bigger in deeper searches.
[…] luck will usually be there as that is still an easy way towards promotion. Most networks with >55% winrates will in fact be around 52% or so.
-
Vargo
- Lives in gote
- Posts: 337
- Joined: Sat Aug 17, 2013 5:28 am
- GD Posts: 0
- Has thanked: 22 times
- Been thanked: 97 times
Re: LZ's progression
The 74.68% win rate of network 9e88 against network 057a (with --visits=1) seemed weird…
Was it due to the relatively small sample (403 games) or to --visits=1?
Here are 9 twogtp matches (3x403 games, 3x1000 and 3x10000). I've kept the result reports generated by twogtp; here is one of them. If someone is interested, I can upload the others.
I was expecting the max variation to decrease as the number of games increased… But going from 35% to 66%???
Am I doing something wrong? Has anyone tried something similar?
Parameters :
--gtp --weights=xxx --visits=1 --noponder -r 10 and
-games xxxxx -sgffile C:\... -auto -komi 7.5
Curiously, the overall win% is around...52%
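For comparison, this sketch shows how much spread pure chance would give at each match length. It assumes independent games and a true win rate of 52% (the overall figure quoted above); `standard_error` is an illustrative helper, not anything from twogtp.

```python
import math

def standard_error(p, n):
    """Standard error of an observed win rate over n independent games."""
    return math.sqrt(p * (1 - p) / n)

p = 0.52  # assumed true win rate, taken from the overall figure in the post
for n in (403, 1000, 10000):
    se = standard_error(p, n)
    print(f"{n:>6} games: +/- {1.96 * se * 100:.1f} percentage points (95%)")
```

If the games were independent, the spread should shrink from about ±5 points at 403 games to about ±1 point at 10000 games. Swings between 35% and 66% are far outside that, which may point to the games in a match not being independent of one another rather than to sample size.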