Deep learning and ladder blindness

For discussing go computing, software announcements, etc.
MikeKyle
Lives with ko
Posts: 205
Joined: Wed Jul 26, 2017 2:27 am
Rank: EGF 2k
GD Posts: 0
KGS: MKyle
Has thanked: 49 times
Been thanked: 36 times

Deep learning and ladder blindness

Post by MikeKyle »

I was wondering about the implications of the ladder problem for the actual process of learning from self-play. Maybe someone with better expertise or insight can help me understand.

As I understand it, due to limitations of the "zero method", all of these wonderful, strong zero-based bots have a blindness for what will happen at the end of long ladders. But surely they have enough pattern-recognition prowess in their neural nets to recognise the kind of thing that happens at the start of a ladder. In that case, if the neural net's "thought process" could be translated into English (currently impossible, of course), wouldn't the bot be thinking something like: "This variation I'm reading leads to one of those mysterious ladder things. From my self-play I know that playing out one of these leads to either a nearly-won game or a nearly-lost game. If I try this out then I'm deciding the game based on some kind of coin toss I don't understand!"?

If this is the case, then wouldn't bots avoid ladders pretty heavily when they think they have a >50% win rate, and seek them out when they think they have a <50% win rate? In their self-play experience, wouldn't a ladder represent a way to create a gamble out of an unfavourable-looking game, or a really bad thing to go near when the game looks okay? Wouldn't we expect to see quite different joseki/patterns depending on who is ahead or behind by a relatively small amount?
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: Deep learning and ladder blindness

Post by moha »

There are a few different strategies a net can come up with for the ladder problem. First and most important, with sufficient network depth it is possible to understand and predict the likely results of most ladders. This takes a lot of training and is very sensitive to various bugs/biases/defects, though, as the necessary connections are long and complicated. The original AGZ was probably the closest to this approach.

Then it is possible for the net to arrive at the assumption that ladders are dangers that usually favor the opponent, and thus to avoid them. ELF is said to have evolved in this direction.

What you say is also possible, although it is not a good idea to assume the opponent shares your blindness. Playing out an unread ladder is unlikely to give an even chance, so keeping even 40% or 30% can be preferable to it. But since most bots train in self-play, this can create such illusions. This was typical for early LZ, strengthened further by a special bias from its distributed match system: a quick win (typical of early ladders) was worth more than a slow win, because a match was abandoned quite easily if early results favored one side strongly enough. There was even a time when LZ's ladder preference oscillated a bit, probably depending on whether the net in question saw more ladder wins or ladder losses in its training.

Also, since the net does not need to predict good moves, only interesting moves that deserve to be searched, assuming all ladders are favorable can also lead to good results when paired with search. This would lead to all ladders getting "looked at" and played only if favorable. This is said to be the current LZ way (although there is some contradiction here, since training is done towards search results, so towards working ladders only). But note that this has the drawback that only ladders at or near the root position are handled correctly, so the ones that come up deeper during search remain a problem.
chut
Dies in gote
Posts: 23
Joined: Sun May 20, 2018 5:47 am
GD Posts: 0
Has thanked: 7 times
Been thanked: 3 times

Re: Deep learning and ladder blindness

Post by chut »

I have been thinking about this intriguing problem a bit. I think humans have a meta-level logic that governs the tree-search engine. So when a human sees a ladder pattern forming, he/she will direct the tree search to read the ladder to the end as a matter of top priority. The win rate in the middle of a ladder is undefined until the whole ladder is fully read out.

I believe AlphaGo (Lee, Master, Zero) incorporated such a mechanism in its tree-search algorithm; otherwise the whole system would have a very fragile, very glaring loophole. LZ has just such a loophole, and it is very easy to mislead LZ into one with a common 3-3 point invasion joseki.
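The "read the ladder to the end" idea can be sketched as a small depth-first reader: the runner always extends to its single liberty, the chaser tries each atari in turn. This is a toy illustration by the editor, not code from any engine; it ignores captures by the runner, snapbacks, and ko, and all names are made up.

```python
EMPTY, BLACK, WHITE = 0, 1, 2

def neighbors(p, size):
    x, y = p
    for q in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
        if 0 <= q[0] < size and 0 <= q[1] < size:
            yield q

def group_and_liberties(board, p, size):
    """Flood-fill the group containing p; return (stones, liberties)."""
    color = board[p]
    stones, libs, todo = {p}, set(), [p]
    while todo:
        q = todo.pop()
        for n in neighbors(q, size):
            if board.get(n, EMPTY) == EMPTY:
                libs.add(n)
            elif board[n] == color and n not in stones:
                stones.add(n)
                todo.append(n)
    return stones, libs

def ladder_captures(board, prey, size=19, depth=0):
    """True if the group at `prey` (in atari, runner to move) dies in a ladder."""
    if depth > 2 * size * size:          # safety cap on pathological cases
        return False
    _, libs = group_and_liberties(board, prey, size)
    if len(libs) == 0:
        return True
    if len(libs) >= 2:
        return False                     # not (or no longer) in atari
    run = libs.pop()                     # runner extends to its only liberty
    board = dict(board)
    board[run] = board[prey]
    _, libs = group_and_liberties(board, run, size)
    if len(libs) <= 1:
        return True                      # still in atari after extending
    if len(libs) > 2:
        return False                     # too loose: runner breaks free
    chaser = BLACK if board[run] == WHITE else WHITE
    for atari in libs:                   # chaser tries both atari points
        b2 = dict(board)
        b2[atari] = chaser
        _, chaser_libs = group_and_liberties(b2, atari, size)
        if chaser_libs and ladder_captures(b2, run, size, depth + 1):
            return True
    return False
```

Because each wrong atari choice fails within a move or two, the search stays narrow even though both chaser replies are tried at every step, which is exactly why ladders are cheap for a depth-first reader and expensive for a breadth-oriented MCTS.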
Tryss
Lives in gote
Posts: 502
Joined: Tue May 24, 2011 1:07 pm
Rank: KGS 2k
GD Posts: 100
KGS: Tryss
Has thanked: 1 time
Been thanked: 153 times

Re: Deep learning and ladder blindness

Post by Tryss »

chut wrote:I believe AlphaGo (Lee, Master, Zero) incorporated such a mechanism in its tree-search algorithm; otherwise the whole system would have a very fragile, very glaring loophole. LZ has just such a loophole, and it is very easy to mislead LZ into one with a common 3-3 point invasion joseki.
According to the papers the AlphaGo team published, no, they didn't (and they would probably have published it if they had). In addition, that would mean it was no longer a pure "zero" bot, so it's doubtful they would even try to implement something like this in AlphaGo Zero (or in AlphaZero).
chut
Dies in gote
Posts: 23
Joined: Sun May 20, 2018 5:47 am
GD Posts: 0
Has thanked: 7 times
Been thanked: 3 times

Re: Deep learning and ladder blindness

Post by chut »

Maybe AlphaGo's much superior tree search is able to read deep enough to determine the result of a ladder often enough that it can adjust the network accordingly. In a ladder situation the network weights are meaningless without such a deep tree search. Self-play networks learning from each other will be in a blind-leading-the-blind situation and will never learn how to deal with a ladder properly.

I think that with less lavishly endowed training facilities we will need to augment the tree search with such a meta-level algorithm.
chut
Dies in gote
Posts: 23
Joined: Sun May 20, 2018 5:47 am
GD Posts: 0
Has thanked: 7 times
Been thanked: 3 times

Re: Deep learning and ladder blindness

Post by chut »

I think this is a very intriguing problem. It is an all-or-nothing situation: if the MCTS is not deep enough, then certain patterns will never be learned. This would be worthy of a paper by itself. And the meta-level algorithm to guide the MCTS: I think that is a deep learning experiment worth pursuing.
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: Deep learning and ladder blindness

Post by Bill Spight »

chut wrote:I have been thinking about this intriguing problem a bit. I think humans have a meta-level logic that governs the tree-search engine. So when a human sees a ladder pattern forming, he/she will direct the tree search to read the ladder to the end as a matter of top priority. The win rate in the middle of a ladder is undefined until the whole ladder is fully read out.
Neural nets are like human unconscious parallel processing (intuition). Human conscious search is very memory-limited, which means that it is pretty much depth-first. That makes ladders and one-lane roads natural targets for it. Humans also have logic, which can allow us to focus or eliminate search. For instance, in a capturing race we can count liberties, which may be viewed as a kind of depth-first, non-alternating search which, once done, allows us to eliminate other searches, because we know that they are equivalent.
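The liberty-counting shortcut can be made concrete for the simplest case: a race with no shared liberties and no eyes, where the textbook rule is that the side to move wins if it has at least as many liberties as the opponent. The function below encodes only that special-case rule (an editor's illustration; richer semeai need the full theory):

```python
def semeai_winner(my_libs, their_libs, my_move):
    """Winner of the simplest capturing race: no shared liberties, no eyes.

    The side to move wins ties, because it fills the opponent's last
    liberty first. Once the counts are known, no tree search is needed.
    """
    if my_move:
        return "me" if my_libs >= their_libs else "them"
    return "them" if their_libs >= my_libs else "me"
```

This is the sense in which counting "eliminates other searches": two flood fills replace an exponential alternating read-out.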

Adding human style logic to a go playing program may or may not be a good idea. A ladder module might work fine if triggered, but make the program less efficient when it is not. Currently programs are getting better rapidly without such alterations, so why bother?

BTW, I have only looked at a couple of ELF's reviews, but it often seems to perform deep search with little breadth (exploration). Perhaps its pattern recognition is now so good that the benefits of deep search for ladders and semeai are paying off.
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.
John Fairbairn
Oza
Posts: 3724
Joined: Wed Apr 21, 2010 3:09 am
Has thanked: 20 times
Been thanked: 4672 times

Re: Deep learning and ladder blindness

Post by John Fairbairn »

Humans don't read out most ladders. They just apply the rule of six. It seems practicable to replicate that quite efficiently in a computer, and since it would only ever be triggered in an atari situation, it would be triggered relatively rarely and so would not significantly slow the machine down.

There are of course situations where things get messy, with more than one stone in the line of fire on the six lines, but such situations are uncommon, and even if the computer gets those cases wrong (and on average only half would be wrong anyway), no harm is done compared to the current capability.
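A rule-of-six screen like this could be automated as a cheap scan: walk the diagonal the ladder runs along and collect the six-point-wide channel around it, then flag any runner-coloured stones inside it as candidate breakers. This is an editor's sketch; the function names and the exact -2..+3 offsets across the channel are one plausible reading of the rule, not a specification.

```python
def ladder_channel(start, direction, size=19):
    """Points of a six-wide channel along a diagonal ladder path.

    `direction` is the diagonal the ladder runs along, e.g. (1, -1); the
    channel spans six points measured along the perpendicular diagonal.
    """
    dx, dy = direction
    px, py = dx, -dy                     # perpendicular diagonal
    pts = set()
    x, y = start
    while 0 <= x < size and 0 <= y < size:
        for k in range(-2, 4):           # six points across the channel
            q = (x + k * px, y + k * py)
            if 0 <= q[0] < size and 0 <= q[1] < size:
                pts.add(q)
        x, y = x + dx, y + dy
    return pts

def possible_breakers(board, start, direction, runner, size=19):
    """Runner-coloured stones inside the channel: candidate ladder breakers."""
    return [p for p in ladder_channel(start, direction, size)
            if board.get(p) == runner]
```

An empty candidate list means the ladder almost certainly works and nothing more needs to be read; a non-empty list is the uncommon messy case John mentions, where a real read-out (or an accepted error) takes over.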
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: Deep learning and ladder blindness

Post by Bill Spight »

John Fairbairn wrote:Humans don't read out most ladders. They just apply the rule of six.
Yes, human logic has produced the rule of six and other rules to make our conscious processing more efficient. :)
It seems practicable to replicate that quite efficiently in a computer, and since it would only ever be triggered in an atari situation, it would be triggered relatively rarely and so would not significantly slow the machine down.
So one would think. :) Yet even when the programmer is a strong player or the programming team includes a strong player such modules have not been implemented in the top level bots. {shrug}
Tryss
Lives in gote
Posts: 502
Joined: Tue May 24, 2011 1:07 pm
Rank: KGS 2k
GD Posts: 100
KGS: Tryss
Has thanked: 1 time
Been thanked: 153 times

Re: Deep learning and ladder blindness

Post by Tryss »

John Fairbairn wrote:Humans don't read out most ladders. They just apply the rule of six. It seems practicable to replicate that quite efficiently in a computer, and since it would only ever be triggered in an atari situation, it would be triggered relatively rarely and so would not significantly slow the machine down.
But then you would need to apply it to every atari in the search. It's far from obvious that it would increase the strength at time parity.
Uberdude
Judan
Posts: 6727
Joined: Thu Nov 24, 2011 11:35 am
Rank: UK 4 dan
GD Posts: 0
KGS: Uberdude 4d
OGS: Uberdude 7d
Location: Cambridge, UK
Has thanked: 436 times
Been thanked: 3718 times

Re: Deep learning and ladder blindness

Post by Uberdude »

John Fairbairn wrote:Humans don't read out most ladders. They just apply the rule of six.
This human doesn't! (and isn't even clear what said rule is, though imagines it's to do with the width of a channel that affects ladders).
moha
Lives in gote
Posts: 311
Joined: Wed May 31, 2017 6:49 am
Rank: 2d
GD Posts: 0
Been thanked: 45 times

Re: Deep learning and ladder blindness

Post by moha »

As I wrote above, it is possible for a net to guess the ladder result without search. It can build up a logic like "if the closest stone along this diagonal is white and is closer than the black stone along the neighbouring diagonal", etc. (the rule of six?).

It is not easy, and it needs intermediate data/concepts (about "closest along a diagonal"), which in turn need cooperation between several layers to pass that intermediate information around. This is probably near the complexity limit of what simple gradient descent can find with a lot of training, but AGZ is rumored to have achieved this.

A few people experimented with hardcoding just those diagonal scans (as opposed to the "ladder capture / ladder escape" features of the original AlphaGo), and this was also successful.
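The diagonal-scan logic described above can be sketched as follows: from the runner's head, walk a diagonal and report the colour and distance of the first stone met, then compare neighbouring diagonals. Which diagonals to scan and how to combine them are the editor's illustrative assumptions, not the hardcoding those experiments actually used.

```python
EMPTY, BLACK, WHITE = 0, 1, 2

def first_stone_on_diagonal(board, start, direction, size=19):
    """Colour and distance of the nearest stone along `direction`."""
    x, y = start
    dx, dy = direction
    dist = 0
    while True:
        x, y, dist = x + dx, y + dy, dist + 1
        if not (0 <= x < size and 0 <= y < size):
            return EMPTY, float("inf")   # ran off the board: nothing found
        c = board.get((x, y), EMPTY)
        if c != EMPTY:
            return c, dist

def ladder_guess(board, head, runner, size=19):
    """Crude guess that a ladder starting at `head` works for the chaser:
    the first stone met on either forward diagonal must not belong to the
    runner (a runner-coloured stone met first is a likely breaker)."""
    c1, d1 = first_stone_on_diagonal(board, head, (1, -1), size)
    c2, d2 = first_stone_on_diagonal(board, head, (1, 1), size)
    nearest_colour = min((d1, c1), (d2, c2))[1]
    return nearest_colour != runner
```

The point of the sketch is the shape of the computation: each scan is a chain of "pass the nearest-stone-so-far one step further", which is precisely the kind of long, layer-spanning relay that is hard for gradient descent to discover but trivial to hardcode.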
Bill Spight
Honinbo
Posts: 10905
Joined: Wed Apr 21, 2010 1:24 pm
Has thanked: 3651 times
Been thanked: 3373 times

Re: Deep learning and ladder blindness

Post by Bill Spight »

Uberdude wrote:
John Fairbairn wrote:Humans don't read out most ladders. They just apply the rule of six.
This human doesn't! (and isn't even clear what said rule is, though imagines it's to do with the width of a channel that affects ladders).
[go]$$W Ladder path
$$ ---------------
$$ . . . . . . . . |
$$ . . . a X X . . |
$$ . . a S C O X . |
$$ . a S C C X . X |
$$ a S C C S a . . |
$$ S C C S a . . . |[/go]
(Diagram from Sensei's Library, https://senseis.xmp.net/?LadderBreaker )
The linear distance from "a" to "a" inclusive measures 6 points.
jlt
Gosei
Posts: 1786
Joined: Wed Dec 14, 2016 3:59 am
GD Posts: 0
Has thanked: 185 times
Been thanked: 495 times

Re: Deep learning and ladder blindness

Post by jlt »

Kageyama, about ladders:
kageyama.png
dfan
Gosei
Posts: 1598
Joined: Wed Apr 21, 2010 8:49 am
Rank: AGA 2k Fox 3d
GD Posts: 61
KGS: dfan
Has thanked: 891 times
Been thanked: 534 times

Re: Deep learning and ladder blindness

Post by dfan »

Bill Spight wrote:
John Fairbairn wrote:It seems practicable to replicate that quite efficiently in a computer, and since it would only ever be triggered in an atari situation, it would be triggered relatively rarely and so would not significantly slow the machine down.
So one would think. :) Yet even when the programmer is a strong player or the programming team includes a strong player such modules have not been implemented in the top level bots. {shrug}
AlphaGo had two input features that checked for the presence of ladders in this way. AlphaGo Zero removed those features because DeepMind wanted to show that it was possible to achieve top-level play without the assistance of hand-crafted features. Projects such as Leela Zero and ELF OpenGo happened to follow AlphaGo Zero in this respect, I believe partly because they were trying to confirm that DeepMind's work could be duplicated. I agree that adding ladder features back in would improve these systems' strength, and I wouldn't be surprised if one of the non-open systems has already done so.
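Such ladder input features are just binary planes stacked alongside the stone-position planes. A sketch of the plumbing, with the actual ladder reader injected as a predicate (the trivial stand-ins used below are for demonstration only; the real semantics come from the AlphaGo paper's feature table, not from this code):

```python
def ladder_planes(board, to_play, reads_as_capture, reads_as_escape, size=19):
    """Two size*size binary planes as nested lists: for each empty point,
    whether playing there starts a working ladder capture, and whether it
    escapes a ladder. The two predicates do the actual reading."""
    cap = [[0] * size for _ in range(size)]
    esc = [[0] * size for _ in range(size)]
    for x in range(size):
        for y in range(size):
            if board.get((x, y)) is not None:
                continue                 # only empty points are candidate moves
            if reads_as_capture(board, (x, y), to_play):
                cap[x][y] = 1
            if reads_as_escape(board, (x, y), to_play):
                esc[x][y] = 1
    return cap, esc
```

With a real reader plugged in, these planes hand the net the one global, long-range fact it struggles to learn, at the cost of one ladder read per empty point per position.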