LZ's progression


Post by Bill Spight »

dfan wrote:There is no particular reason that winning percentages have to be related in this exact mathematical way.

For example, Alice, Bob and Carol all play the classic game "Whose random number is bigger?". Alice is a beginner and picks integers from 1 to 100 uniformly at random. Bob is more experienced and picks integers from 51 to 150 uniformly at random. Carol is an expert and picks integers from 101 to 200 uniformly at random (she's very good at this game, though you can probably imagine even better strategies).

How often does Bob beat Alice? How often does Carol beat Bob? How often does Carol beat Alice?


Bob beats Alice about ⅞ of the time, with a win/loss ratio of about 7/1. Carol beats Bob about ⅞ of the time, with a win/loss ratio of about 7/1. Carol always beats Alice. If we estimate Carol's win/loss ratio against Alice as (7/1) (7/1) = 49/1, OC, the win/loss ratio is off by infinity. However, the winning percentage is off by only 2%. :lol:
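
The figures in this example can be checked by brute force. A minimal sketch, assuming ties (possible because the picks are integers) are simply counted separately rather than replayed; the helper name `win_tie_loss` is only illustrative.

```python
from fractions import Fraction
from itertools import product

def win_tie_loss(strong, weak):
    """Exact (win, tie, loss) probabilities for the player drawing from `strong`."""
    pairs = list(product(strong, weak))
    n = len(pairs)
    wins = sum(x > y for x, y in pairs)
    ties = sum(x == y for x, y in pairs)
    return Fraction(wins, n), Fraction(ties, n), Fraction(n - wins - ties, n)

alice = range(1, 101)    # integers 1..100
bob = range(51, 151)     # integers 51..150
carol = range(101, 201)  # integers 101..200

bob_vs_alice = win_tie_loss(bob, alice)      # wins 349/400 = 87.25%, ties 1/200
carol_vs_bob = win_tie_loss(carol, bob)      # same shape, same numbers
carol_vs_alice = win_tie_loss(carol, alice)  # Carol's lowest pick beats Alice's highest

# Chaining the rounded 7/1 odds, as in the post.
chained_odds = Fraction(7, 1) * Fraction(7, 1)
chained_estimate = chained_odds / (1 + chained_odds)  # 49/50 = 98%

print(bob_vs_alice, carol_vs_bob, carol_vs_alice, chained_estimate)
```

With integer picks the win rate comes out at 87.25% rather than exactly ⅞, because a small share of games is tied; the 98% versus 100% gap is the 2% mentioned in the post.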
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

Post by drmwc »

Suppose we play a hold'em game. We have 3 possible starting hands. You choose a hand first, I choose second. We then play out the flop, turn and river with no betting.

We bet an amount on the outcome. Surely you are bound to win, as you choose first?

The 3 possible hands are:
The red 2s
6, 7 of spades
Ace of spades, king of clubs.

If you choose the 2s, I choose 6,7.
If you choose 6,7, I choose AK.
If you choose AK, I choose the 2s.

Winning probabilities are not transitive. I am the favourite by a small amount in each scenario.

Post by moha »

dfan wrote:
Bill Spight wrote:How is it an example? Doesn't it depend upon the structure of the game and a presumed definition of expertise at it, rather than the distribution of the game results per se?
Your point that there is no necessary relationship between the win rates of A vs. B, B vs. C, and A vs. C is well taken. But I don't think that is what moha is saying.
I thought it was an example of a "shape of the actual distribution of single game performances" (uniform rather than Gaussian or somesuch) but maybe I was misinterpreting moha's phrase. For one thing, I was interpreting "game performance" as being a function of a single player, and whoever has the better performance wins; perhaps something else was meant.
I think we are talking about the same thing, and your example seems good (edit: or maybe not? :) Bill's billiard example may be more interesting - exponential? - but the difference is still closer to normal). I assumed that each player has an individual performance distribution (pointwise for simplicity - verifiable in go), and that the game result distribution is a function of those two and can have a different shape (though with normal individual distributions the difference will be normal as well). In the A>B>C 60%+60% situation it may even be possible to design games with individual shapes for either extreme (A beats C in 61% or 99% of games - though this may need distorting factors, since the difference will usually have a somewhat more normal shape, as in dfan's example).

About "there is no necessary relationship between the win rates of A vs. B, B vs. C, and A vs. C": it seems to me that - assuming the simplest case, with no correlations, independent player performances, etc. - there is a relationship, which depends on the individual performance distributions (and thus varies by game type).

Post by moha »

drmwc: Your example seems to be a second player advantage game, not quite the same as the A>B>C 60%+60% question.
For the latter, a slightly similar and interesting example with pure uncorrelated distributions:

Player A picks number 30 (20% of cases) or 3.
Player B picks number 20 (50% of cases) or 2.
Player C picks number 10 (80% of cases) or 1.

But this still seems an r-p-s like situation, which can be considered a distorting factor (individual distributions differ in shape).
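
The pairwise win rates for this three-player example can be worked out exactly. A minimal sketch, assuming the picks are independent and the larger number wins; the helper name `p_beats` is only illustrative.

```python
from fractions import Fraction
from itertools import product

# Each player is a list of (number, probability) pairs taken from the post.
A = [(30, Fraction(1, 5)), (3, Fraction(4, 5))]
B = [(20, Fraction(1, 2)), (2, Fraction(1, 2))]
C = [(10, Fraction(4, 5)), (1, Fraction(1, 5))]

def p_beats(x, y):
    """Probability that player x picks a larger number than player y."""
    return sum(px * py for (vx, px), (vy, py) in product(x, y) if vx > vy)

a_beats_b = p_beats(A, B)  # 3/5
b_beats_c = p_beats(B, C)  # 3/5
a_beats_c = p_beats(A, C)  # 9/25

print(a_beats_b, b_beats_c, a_beats_c)
```

So A beats B 60% of the time and B beats C 60% of the time, yet A beats C only 36% of the time, which is the rock-paper-scissors-like cycle the post refers to.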
Last edited by moha on Fri May 18, 2018 7:40 am, edited 2 times in total.

Post by Bill Spight »

moha wrote:drmwc: Your example seems to be a second player advantage game, not quite the same as the A>B>C 60%+60% question.


Apologies to drmwc for butting in, but I take it as an example of non-transitivity.

For the latter, a slightly similar and interesting example with pure uncorrelated distributions:

Player A picks number 30 (20% of cases) or 3.
Player B picks number 20 (50% of cases) or 2.
Player C picks number 10 (80% of cases) or 1.

But this still seems an r-p-s like situation, which can be considered a distorting factor (individual distributions differ in shape).


I'm not sure, but your idea of individual distributions seems to be related to what I am calling different strengths and weaknesses of different players (multi-dimensionality). Which can also cause non-transitivity.

Post by moha »

Bill Spight wrote:I'm not sure, but your idea of individual distributions seems to be related to what I am calling different strengths and weaknesses of different players (multi-dimensionality). Which can also cause non-transitivity.
Yes, but this is similar to rock-paper-scissors. I tried to exclude such distortions, and assumed uncorrelated performances, and that the players only differ in strength (position and variance but not shape of distribution). Then that shape still seems to matter for the accuracy of the oddswise approach.
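
How much the shape matters can be sketched numerically for the A>B>C 60%+60% case. This is a rough illustration under assumptions that are not in the thread: independent performances, the same spread for all three players, and equal strength gaps A-B and B-C. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def a_vs_c_odds(p):
    """Oddswise (logistic) estimate: multiply the two win/loss ratios."""
    odds = (p / (1 - p)) ** 2
    return odds / (1 + odds)

def a_vs_c_normal(p):
    """Normal performances: the gap is z standard deviations of the difference per step."""
    z = NormalDist().inv_cdf(p)
    return NormalDist().cdf(2 * z)

def a_vs_c_uniform(p):
    """Uniform performances of width 1, shifted by d per step (valid for p >= 0.5).

    For a shift d in [0, 1], P(stronger wins) = 1 - (1 - d)**2 / 2.
    """
    d = 1 - sqrt(2 * (1 - p))
    shift = 2 * d
    if shift >= 1:  # the two ranges no longer overlap
        return 1.0
    return 1 - (1 - shift) ** 2 / 2

for name, f in [("odds", a_vs_c_odds), ("normal", a_vs_c_normal), ("uniform", a_vs_c_uniform)]:
    print(f"{name:8s} {f(0.60):.4f}")
```

At 60%+60% the three shapes land within about half a percentage point of each other (roughly 69%), while at ⅞+⅞ the uniform case reaches 100%, as in the random-number game earlier in the thread.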

Post by chut »

Just wondering: there is a rather sharp upturn in the strength graph at Elo 10800. Does that correspond to introducing ELF OpenGo to train LZ?

Post by dfan »

chut wrote:Just wondering, there is a rather sharp upturn in strength graph at elo 10800, does that correspond to introducing ELF OpenGo to train LZ?

Yes. (You can see the ELF network in the graph in multiple places; it's the gray cross with hash code starting with 62b.)

Post by chut »

In this series of matches with Haylee, there are no 3,3 point invasions even with a 2-stone handicap, so is the network weight used the 'human' one?

https://www.youtube.com/watch?v=hExYHwtsra8

I find the zero human network quite bad with handicap games. Leela 11 would trash me with 4 stones handicap, but Zero would start by invading all the 3,3 points making the game much easier and much less interesting.

I am wondering whether there is a way to tweak the MCTS for handicap games, for example to favor branches that may not be the best, but that have the highest number of near-optimal sub-branches.

Post by dfan »

People on the LZ team have been thinking about how to make it play handicap games better: https://github.com/gcp/leela-zero/issues/1313

Post by Vargo »

TWOGTP matches:

1) LZ's networks
Matches between networks #0, #10, #20, #30, ..., #140.
For each network, two 100-game matches against network #70, which is the reference point.
For example, #0 never wins against #70 (1st run = 0 wins out of 100 games, and 2nd run = 0 wins),
and #140 almost always wins against #70 (1st run = 99 wins out of 100 games, and 2nd run = 99 wins).
Settings: twogtp, with LZ015, --visits=51 --noponder
For example, the line "60-70, 29, 17" means network #60 won 29 games out of 100 against #70 in the first match, and 17 games out of 100 in the second.
Two odd things:
#20 won 1 game against #70!
For #60, the two results vary a lot (29 and 17).
[Attachment: netw.jpg]
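
Whether 29 and 17 wins out of 100 really "vary a lot" can be gauged with a two-proportion z-test. A minimal sketch, assuming the games are independent and using the normal approximation; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(wins1, n1, wins2, n2):
    """z statistic and two-sided p-value for the difference of two win rates."""
    p1, p2 = wins1 / n1, wins2 / n2
    pooled = (wins1 + wins2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = two_proportion_z(29, 100, 17, 100)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

The gap comes out at about two standard errors (p a little over 4%). With more than a dozen pairs of runs in the table, one gap of that size is not very surprising on its own.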



2) Zen7 vs LZ with networks #...
--visits=3201 --noponder for LZ and
-t 12 -T 1 -s 850 (gtp4zen)
Each match is 20 games (10 as B, 10 as W)
Zen takes about twice as much time as LZ
[Attachment: zen.jpg]

Post by Uberdude »

For an idea of what these win ratios mean in terms of (weaker) human rank difference, see the EGF winning statistics at https://senseis.xmp.net/?EGFWinningStatistics (e.g. a 3d beats a 6d about 8% of the time, whilst a 4d beats a 7d about 3%).
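
To put such percentages on the same kind of scale as the strength graph, the standard logistic Elo formula can be used. This is only a sketch: it assumes the usual 400-point logistic convention, whereas EGF ratings use their own scale and LZ's graph is built from self-play matches, so the numbers are not directly comparable to human ratings.

```python
from math import log10

def elo_gap(p):
    """Elo difference implied by a win probability p under the logistic model."""
    return 400 * log10(p / (1 - p))

def expected_score(elo_diff):
    """Win probability implied by an Elo difference (inverse of elo_gap)."""
    return 1 / (1 + 10 ** (-elo_diff / 400))

gap_3d_vs_6d = elo_gap(0.08)  # roughly -424
gap_4d_vs_7d = elo_gap(0.03)  # roughly -604
print(round(gap_3d_vs_6d), round(gap_4d_vs_7d))
```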

Post by Vargo »

Network #144 (9e88) was promoted against #143 (057a) by winning 54.84% of its 403 games.

Is it more or less reproducible? (I think official matches are with 3201 visits.)
Here are five twogtp matches, 403 games per match (--visits=xxxx, --noponder).
Up to 200 visits, the win% fluctuates wildly (75% at 0 visit, and then less than 50% with few visits).
Then at 3201 visits, it's 52.35%, which is not bad.
(I won't run a lot of these 400-game matches with 3200 visits, because it takes a long time, even with a good GPU.)
[Attachment: 9e88.jpg]
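
How far 54.84% and 52.35% are from an even match, and from each other, can be expressed in standard errors. A minimal sketch, assuming independent games and ignoring any sequential stopping rule the official matches may use; the function names are illustrative.

```python
from math import sqrt

def z_vs_even(win_rate, games):
    """Distance of an observed win rate from 50%, in standard errors."""
    se = sqrt(0.25 / games)
    return (win_rate - 0.5) / se

def z_between(rate1, rate2, games):
    """Distance between two observed win rates from equal-sized matches."""
    se_diff = sqrt(2 * 0.25 / games)  # both matches near 50%
    return (rate1 - rate2) / se_diff

z_promotion = z_vs_even(0.5484, 403)    # the promotion match
z_rerun = z_vs_even(0.5235, 403)        # the 3201-visit rerun
z_gap = z_between(0.5484, 0.5235, 403)  # promotion match vs rerun

print(f"{z_promotion:.2f} {z_rerun:.2f} {z_gap:.2f}")
```

One standard error for 403 games is about 2.5 percentage points, so the two results differ by well under one standard error of their difference and are compatible with each other.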

Post by moha »

This may be a good time for buying a lottery ticket. :)
Luck will usually be there, as that is still an easy way towards promotion. Most networks with >55% winrates will in fact be around 52% or so.
OC this assumes promotion is rare (it is a kind of survivor bias). And the difference somewhat scales with the number of sims, so the advantage of the stronger net will likely be bigger in deeper searches.
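
The survivor-bias effect can be illustrated with exact binomial arithmetic. Everything specific here is an assumption made for illustration: a fixed 403-game match, a pass mark of 55%, and a made-up prior in which candidate networks are equally likely to have a true win rate of 46%, 48%, 50%, 52% or 54%.

```python
from math import ceil, comb

GAMES = 403       # match length (assumed fixed)
THRESHOLD = 0.55  # assumed pass mark

def pass_probability(p):
    """P(observed win rate >= THRESHOLD) for a net whose true win rate is p."""
    need = ceil(THRESHOLD * GAMES)
    return sum(comb(GAMES, k) * p**k * (1 - p) ** (GAMES - k)
               for k in range(need, GAMES + 1))

# Hypothetical prior over true win rates, all equally likely.
TRUE_RATES = [0.46, 0.48, 0.50, 0.52, 0.54]

weights = [pass_probability(p) for p in TRUE_RATES]
mean_true_rate_of_promoted = (
    sum(p * w for p, w in zip(TRUE_RATES, weights)) / sum(weights)
)
print(f"{mean_true_rate_of_promoted:.3f}")
```

Under this prior no candidate is truly as strong as 55%, yet some still pass; the networks that do pass average a true win rate below the pass mark, which is the effect described above.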

Post by Vargo »

The 74.68% win rate of network 9e88 against network 057a (with --visits=1) seemed weird…
Was it due to the relatively small sample (403 games) or to --visits=1?

Here are 9 twogtp matches (3x403 games, 3x1000 and 3x10000). I've kept the result reports generated by twogtp; here is one of them.
[Attachment: 9e88_057v1_1.zip]
If someone is interested, I can upload the other ones.

[Attachment: 9e88.jpg]

I was expecting the max variation to decrease as the number of games increased… But going from 35% to 66%???
Am I doing something wrong? Has anyone tried something similar?
Parameters:
--gtp --weights=xxx --visits=1 --noponder -r 10 and
-games xxxxx -sgffile C:\... -auto -komi 7.5

Curiously, the overall win% is around...52% ;-)
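
For reference, the spread that pure sampling noise would produce at each match size can be sketched as follows, assuming independent games with a fixed win probability near 50% (the function name is illustrative).

```python
from math import sqrt

def half_width_95(games, p=0.5):
    """Approximate 95% half-width of an observed win rate over `games` games."""
    return 1.96 * sqrt(p * (1 - p) / games)

for games in (403, 1000, 10000):
    print(f"{games:6d} games: +/- {100 * half_width_95(games):.1f} points")
```

That gives roughly ±4.9, ±3.1 and ±1.0 percentage points. If the 35% to 66% swing also shows up between the larger matches, it is far outside what independent games would produce, so something other than sample size would have to explain it.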