Leela Zero Stuck
-
Uberdude
- Judan
- Posts: 6727
- Joined: Thu Nov 24, 2011 11:35 am
- Rank: UK 4 dan
- GD Posts: 0
- KGS: Uberdude 4d
- OGS: Uberdude 7d
- Location: Cambridge, UK
- Has thanked: 436 times
- Been thanked: 3718 times
-
moha
- Lives in gote
- Posts: 311
- Joined: Wed May 31, 2017 6:49 am
- Rank: 2d
- GD Posts: 0
- Been thanked: 45 times
Re: Leela Zero Stuck
Uberdude wrote: LeelaZero having ladder troubles against a human on OGS
I would expect the network to realize its own vision range problem (from selfplay), and preemptively avoid unclear ladders. It's interesting this does not happen.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Leela Zero Stuck
Uberdude wrote: LeelaZero having ladder troubles against a human on OGS
moha wrote: I would expect the network to realize its own vision range problem (from selfplay), and preemptively avoid unclear ladders. It's interesting this does not happen.
Perhaps mutual blindness? Neither version of LeelaZero knows whether the ladder works, and so it is never played out, and LeelaZero never learns. (Or it will take a long time.) As for avoiding unclear ladders, not much is clear, is it? Especially in the opening.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins
Visualize whirled peas.
Everything with love. Stay safe.
-
moha
- Lives in gote
- Posts: 311
- Joined: Wed May 31, 2017 6:49 am
- Rank: 2d
- GD Posts: 0
- Been thanked: 45 times
Re: Leela Zero Stuck
moha wrote: I would expect the network to realize its own vision range problem (from selfplay), and preemptively avoid unclear ladders. It's interesting this does not happen.
Bill Spight wrote: Perhaps mutual blindness? Neither version of LeelaZero knows whether the ladder works, and so it is never played out, and LeelaZero never learns. (Or it will take a long time.) As for avoiding unclear ladders, not much is clear, is it? Especially in the opening.
There is a point in playing out unread ladders though, for randomization. So during selfplay the side behind may play it.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Leela Zero Stuck
Uberdude wrote: LeelaZero having ladder troubles against a human on OGS
moha wrote: I would expect the network to realize its own vision range problem (from selfplay), and preemptively avoid unclear ladders. It's interesting this does not happen.
Bill Spight wrote: Perhaps mutual blindness? Neither version of LeelaZero knows whether the ladder works, and so it is never played out, and LeelaZero never learns. (Or it will take a long time.) As for avoiding unclear ladders, not much is clear, is it? Especially in the opening.
moha wrote: There is a point in playing out unread ladders though, for randomization. So during selfplay the side behind may play it.
How often will the next step of a ladder occur with quasi-random play?
-
kwhyte
- Dies in gote
- Posts: 34
- Joined: Thu Oct 06, 2011 12:25 am
- Rank: some SDK
- GD Posts: 0
- Universal go server handle: kwhyte
- Been thanked: 3 times
Re: Leela Zero Stuck
I think the net will first learn to trust that all unclear ladders work. It is a much worse mistake to run out of a long ladder that eventually works than to chase stones in a long ladder that eventually fails. It is true that the next step of a ladder is always somewhat likely in quasi-randomized play, since the net learns quickly that you should consider all ataris, so playing the ladder out for a few steps should still happen sometimes. But for ladders long enough to get the net into trouble, that's a lot of coincidence. And all the random trials where it plays the ladder out a third of the way across the board and then plays away are going to end badly. Which will it learn faster: don't play unclear ladders, or don't stop playing a ladder that's already gone a few steps?
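To put a rough number on the "lot of coincidence" point, here is a minimal sketch. It assumes each ladder step is chosen independently with the same probability, which is a simplification and not how Leela Zero actually selects moves. The function name and the 0.8 per-step probability are made up for illustration.

```python
def prob_ladder_played_out(p_step: float, steps: int) -> float:
    """Chance that `steps` consecutive ladder moves are all played.

    Assumption (hypothetical model): every step is played independently
    with the same probability `p_step`.
    """
    if not 0.0 <= p_step <= 1.0:
        raise ValueError("p_step must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return p_step ** steps


if __name__ == "__main__":
    # Illustrative per-step probability only; not a measured value.
    for steps in (3, 10, 30):
        print(steps, prob_ladder_played_out(0.8, steps))
```

Under this toy model, even a fairly high per-step probability shrinks quickly once the ladder needs a few dozen moves, which is the sense in which long ladders would rarely be played to the end by chance.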
-
moha
- Lives in gote
- Posts: 311
- Joined: Wed May 31, 2017 6:49 am
- Rank: 2d
- GD Posts: 0
- Been thanked: 45 times
Re: Leela Zero Stuck
moha wrote: There is a point in playing out unread ladders though, for randomization. So during selfplay the side behind may play it.
Bill Spight wrote: How often will the next step of a ladder occur with quasi-random play?
Why quasi-random play?
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Leela Zero Stuck
With probabilistic reasoning, perhaps there is a threshold, partly dependent upon circumstances, where the ladder is long enough for both sides to play out, because losing the long ladder will lose the game. Playing it out is the only chance to win.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Leela Zero Stuck
moha wrote: There is a point in playing out unread ladders though, for randomization. So during selfplay the side behind may play it.
Bill Spight wrote: How often will the next step of a ladder occur with quasi-random play?
moha wrote: Why quasi-random play?
Because I did not think that the choice of plays was completely random.
-
Uberdude
- Judan
- Posts: 6727
- Joined: Thu Nov 24, 2011 11:35 am
- Rank: UK 4 dan
- GD Posts: 0
- KGS: Uberdude 4d
- OGS: Uberdude 7d
- Location: Cambridge, UK
- Has thanked: 436 times
- Been thanked: 3718 times
Re: Leela Zero Stuck
Is Bill's point that the quasi-random play of Monte Carlo rollouts will not be very good at playing ladders, so it won't learn them well, and moha's counterpoint that Leela Zero doesn't use Monte Carlo rollouts?
-
moha
- Lives in gote
- Posts: 311
- Joined: Wed May 31, 2017 6:49 am
- Rank: 2d
- GD Posts: 0
- Been thanked: 45 times
Re: Leela Zero Stuck
Uberdude wrote: Is Bill's point that the quasi-random play of monte carlo rollouts will not be very good at playing ladders so it won't learn them well, but moha's counter point that Leela Zero doesn't use monte carlo rollouts?
I actually have trouble following the logic myself.
My original idea was that, as seen with AlphaGo vs. humans, a bot can understand something like "danger, unclear", and avoid it (even at some cost) when ahead but seek it when behind. With smallnet Zero selfplay, the bot has no reason to fear being exploited (since the opponent cannot see/read the ladder either), so what remains is "random" probability (e.g. ladders are a bit more likely to work on empty boards than at later stages) with huge variance in the game result.
I guess Bill may be hinting at difficulties for the early network in collecting experience at all, and thus in reaching this point? In the first stages, with close to random play, ladders are not played. DeepMind also noted that, unlike humans, a bot learns about ladders relatively late.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Leela Zero Stuck
Uberdude wrote: Is Bill's point that the quasi-random play of monte carlo rollouts will not be very good at playing ladders so it won't learn them well, but moha's counter point that Leela Zero doesn't use monte carlo rollouts?
No, my point is that if you have two players who share a blindness, it is difficult for them to learn from each other to overcome that blindness. It has nothing to do with AI per se. Humans have the same problem.
It was moha who brought up randomness in the play, and I thought that it was not pure randomness, hence "quasi-random".
Last edited by Bill Spight on Wed Jan 10, 2018 7:05 am, edited 1 time in total.
-
Bill Spight
- Honinbo
- Posts: 10905
- Joined: Wed Apr 21, 2010 1:24 pm
- Has thanked: 3651 times
- Been thanked: 3373 times
Re: Leela Zero Stuck
moha wrote: Deepmind also noted that - unlike humans - a bot learns about ladders relatively late.
Right. Even human beginners can apply logic. (Ladders are logical.) Could bots learn ladders using logic? Of course. But not the bots that are currently in vogue.
-
Uberdude
- Judan
- Posts: 6727
- Joined: Thu Nov 24, 2011 11:35 am
- Rank: UK 4 dan
- GD Posts: 0
- KGS: Uberdude 4d
- OGS: Uberdude 7d
- Location: Cambridge, UK
- Has thanked: 436 times
- Been thanked: 3718 times
Re: Leela Zero Stuck
Something I found interesting to ponder: Leela Zero is about 1 dan strength now, I believe (+/- a few stones), and it likes to play early 3-3 invasions just like AlphaGo (both the regular and Zero versions) does. It's not very strong yet, so it's not some respected oracle like AG that we would emulate. But does this mean it has also discovered some objective truth that early 3-3s are good (which is basically what we have surmised from AG), or is this more like some bias or quirk of how machine learning Go programs work, so that they get stuck playing them? AG never outgrew this trait and got really strong, but is it really strong because of or despite these 3-3 invasions?