All times are UTC - 8 hours [ DST ]

 Post subject: How LZ reads out ladders
Post #1 Posted: Sun Mar 01, 2020 12:59 am 
Lives in gote

Posts: 430
Location: Adelaide, South Australia
Liked others: 173
Was liked: 220
Rank: Australian 2 dan
GD Posts: 200
Carrying on from the other thread, now I'm using my modified LZ version to explore how LZ "understands" ladders. It's really interesting to look at older and newer LZ nets and see how they treat the same position.

In theory, there should be three things going on:
  • Policy network: does LZ think the next move in the ladder is an obvious move to explore? Can it "see at a glance" whether a ladder capture is good or bad?
  • Playouts: it can take about 60 moves to read a ladder that goes all the way across the board. Does that mean LZ needs a minimum of 60 playouts to read out a ladder? Or more playouts if it's reading out other variations along the way?
  • Net evaluation: Once a few moves of a ladder appear on the board, can LZ recognise the position as good for white or good for black? Can it accurately measure the cost of playing out a bad ladder?
These factors interact with each other. If a ladder move is a low policy move then it won't get many playouts. If the value net can recognise that ten moves of a ladder turns into a disaster, then it also shouldn't need to play out the full ladder.
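
To make that interaction concrete, here is a toy version of the PUCT selection rule used by AlphaGo Zero-style engines. It is only a sketch: the function name, the exploration constant `c_puct`, the "first play urgency" value `fpu` and the two-move setup are all my own assumptions for illustration, not Leela Zero's actual code or tuning.

```python
import math

def allocate_playouts(priors, values, playouts, c_puct=1.5, fpu=0.5):
    """Toy PUCT loop over the children of a single node.

    priors: policy probability for each move.
    values: the (fixed) evaluation that a visit to each move returns.
    fpu:    value assumed for a move that has not been visited yet.
    All constants are illustrative assumptions, not Leela Zero's settings.
    """
    visits = {move: 0 for move in priors}
    for _ in range(playouts):
        sqrt_total = math.sqrt(sum(visits.values()) + 1)
        scores = {}
        for move, prior in priors.items():
            q = values[move] if visits[move] else fpu
            scores[move] = q + c_puct * prior * sqrt_total / (1 + visits[move])
        visits[max(scores, key=scores.get)] += 1
    return visits

priors = {"tenuki": 0.99, "pull out of atari": 0.01}
# Case 1: pulling out evaluates no better than the alternative.
same = allocate_playouts(priors, {"tenuki": 0.5, "pull out of atari": 0.5}, 2000)
# Case 2: pulling out turns out to be much better once it is finally tried.
better = allocate_playouts(priors, {"tenuki": 0.5, "pull out of atari": 0.9}, 2000)
print(same)
print(better)
```

In the first case the 1% move only collects a handful of the 2,000 playouts. In the second case it takes over the search, but only after the exploration term has grown enough for it to be tried once.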

I'd expect that smaller networks (5 or 6 blocks) will need to play out pretty much the whole ladder, because they can't "see all the way across the board", while a 20 or 40 block network should be able to "take in the position at a glance" and understand the ladder status without playing out the moves.

So, on to some tests. Below are some taisha positions where both sides have made mistakes, and now white has the chance to start a ladder. I want to look at four scenarios:
  • Test position 1A: good ladder, attacker's perspective. White's best move is to atari the black stone and start the ladder.
  • Test position 1B: good ladder, defender's perspective. Black shouldn't pull out of atari, but should play elsewhere.
  • Test position 2A: bad ladder, attacker's perspective. Here, for white to give atari is a mistake.
  • Test position 2B: bad ladder, defender's perspective. White has made a mistake and started the ladder: now black should pull out of atari.

[go]$$Wc Test position 1.
$$ ---------------------------------------
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . O . . . . |
$$ | . . . X . . . . . , . . . . . , O d . |
$$ | . . . . . . . . . . . . . . . . e . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . X O O b . . . . . . . . . . . . . |
$$ | . . X X O X a . . , . . . . . X . . . |
$$ | . . . O X O . . . . . . . c . . . . . |
$$ | . . . X X O . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------[/go]

Test position 1A (above): white must play at a; any other move is a mistake.
Test position 1B (above): after white a, black should tenuki -- there are several possible moves, for example c, d, e, but b would be a bad mistake.

[go]$$Wc Test position 2
$$ ---------------------------------------
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . X . . |
$$ | . . . O . . . . . , . . . . . X . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . c c . . . . . . . . . . . . . . |
$$ | . . c . . . . . . . . . . . . . . . . |
$$ | . . X O O b . . . . . . . . . . . . . |
$$ | . . X X O X a c . , . . . . . O . . . |
$$ | . . c O X O . d . . . . . . . . . . . |
$$ | . . . X X O . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------[/go]

Test position 2A: white a is a mistake. Any of the points marked c are not too bad, although d is probably white's best option.
Test position 2B: after white a, black must play b.

I tested with seven different networks (28 permutations of test position + network), with 2,000 playouts each time. The networks were number 45 (5 blocks), 57 (also 5 blocks), 91 (6 blocks), 116 (10 blocks), 157 (15 blocks), 173 (20 blocks) and 258 (40 blocks).

Summary of results:
  • For test positions 1A and 1B, LZ-45 fails: it tries to read out the ladder, but doesn't really "get" how ladders work -- many of the variations are ataris from the wrong side. For 1A, it wants to play C3 instead of G4. For 1B, it wants to pull out of atari. Remember that this network is already based on a million and a half self-play games and can challenge dan-level amateurs. Apparently, that's how well you can play based on good judgement and local shape, without being able to read well!
  • All other networks get test positions 1A and 1B correct, with various amounts of reading. In the next few posts I'll give more details of how the different networks analysed the positions.
  • For test positions 2A and 2B, even LZ-45 got it right, but LZ-116, 157 and 173 have an interesting blind spot here! They seem to fall into a local minimum where the policy network is sharp enough to make reading a lot more efficient, but not sharp enough to actually get the right answer. They read out a few steps of the ladder, not the whole ladder, then think they've got it and stop reading. On the other hand, LZ-258 gets the right answer without reading at all.

Overall, there are three things that really caught my attention.

First is the interplay between network (policy and eval) and playouts for the medium sized nets. They need to read a few steps to evaluate the ladder correctly, but they don't need to read right to the end of the ladder. Of course, "LZ-157 can understand a ladder in 20 playouts" doesn't mean that it never makes ladder mistakes. If the ladder position is a few moves deep in a variation, then that specific position may not get enough playouts, so the ladder can still be "over the horizon" leading to a mistake.

Second is the fact that a 20-block network still isn't quite big enough to make an accurate assessment of the full board. I guess the first five or ten blocks are about understanding basic shapes, then the later blocks start to take in bigger chunks.

Third, it looks as though the 40-block network really can see the ladder status without having to read it out at all, at least for this position. We'd need to test on a bunch more positions to be sure. But I recall many of the "LZ can't do ladders" complaints happening around the time of moving from 15 to 20 blocks. Is it possible that the problem is solved simply by moving to a bigger network?

Finally, the attached GTP log includes all 28 tests, for people who like going through lots of data, showing the number of playouts, policy value, winrate and principal variations for each move. In many cases, reading out the ladder isn't the PV, it's buried amongst the other variations. Over the next few posts I'll show you a few examples.


Attachments:
ladder_tests_log.txt [82.81 KiB]
 Post subject: Re: How LZ reads out ladders
Post #2 Posted: Sun Mar 01, 2020 3:15 am 
Judan

Posts: 6487
Location: Cambridge, UK
Liked others: 387
Was liked: 3565
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
Interesting xela, thanks. Back around LZ 157 days I remember noticing that LZ would tend to assume ladders work so would make mistakes in positions they didn't, whilst Elf would tend to assume ladders don't work, so make mistakes in positions they did.

Also, to test that LZ 40 block can really "read the ladder at a glance" rather than "ladder from lower left to top right is good for black with a black stone in top right" being baked into the policy I would suggest moving the stone(s) left by one space at a time until they stop being ladder breakers and see if LZ actually notices and how sharply.
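
A rough sketch of how that series of positions could be generated, in case it saves someone some clicking. The stone lists are my hand transcription of test position 2, the helper names are invented for this example, and the output is just plain GTP `boardsize` / `clear_board` / `play` commands that could be pasted into an engine. How far the stones can move before they stop being ladder breakers is left for the engine (or a human) to decide.

```python
COLS = "ABCDEFGHJKLMNOPQRST"  # Go coordinates skip the letter I

def shift_left(coord, steps):
    """Move a coordinate such as 'R17' left by `steps` columns."""
    col = COLS.index(coord[0]) - steps
    if col < 0:
        raise ValueError(f"{coord} cannot move {steps} columns left")
    return f"{COLS[col]}{coord[1:]}"

def gtp_setup(black, white):
    """Return GTP commands that rebuild a position from an empty board."""
    cmds = ["boardsize 19", "clear_board"]
    cmds += [f"play black {c}" for c in black]
    cmds += [f"play white {c}" for c in white]
    return cmds

# Test position 2, transcribed by hand from the diagram (an assumption).
FIXED_BLACK = ["C5", "C4", "D4", "F4", "E3", "D2", "E2"]
WHITE = ["D16", "Q4", "D5", "E5", "E4", "D3", "F3", "F2"]
BREAKERS = ["R17", "Q16"]  # the black stones in the top right

for steps in range(8):
    moved = [shift_left(c, steps) for c in BREAKERS]
    print(f"# breakers shifted {steps} to the left: {moved}")
    print("\n".join(gtp_setup(FIXED_BLACK + moved, WHITE)))
```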

 Post subject: Re: How LZ reads out ladders
Post #3 Posted: Sun Mar 01, 2020 3:48 am 
Lives in gote

Posts: 430
Location: Adelaide, South Australia
Liked others: 173
Was liked: 220
Rank: Australian 2 dan
GD Posts: 200
Uberdude wrote:
Also, to test that LZ 40 block can really "read the ladder at a glance" rather than "ladder from lower left to top right is good for black with a black stone in top right" being baked into the policy I would suggest moving the stone(s) left by one space at a time until they stop being ladder breakers and see if LZ actually notices and how sharply.

Good idea! First I'll post the things I've already looked at (finding time to write things down is a bit of a challenge right now, but I'm gradually getting there). Then I'll try this. And I haven't forgotten the other ladder game you suggested...

 Post subject: Re: How LZ reads out ladders
Post #4 Posted: Sun Mar 01, 2020 5:29 am 
Oza

Posts: 2263
Location: Tokyo, Japan
Liked others: 2136
Was liked: 1272
Rank: Jp 6 dan
KGS: ez4u
xela wrote:
Carrying on from the other thread, now I'm using my modified LZ version to explore how LZ "understands" ladders. It's really interesting to look at older and newer LZ nets and see how they treat the same position.

In theory, there should be three things going on:
  • Policy network: does LZ think the next move in the ladder is an obvious move to explore? Can it "see at a glance" whether a ladder capture is good or bad?
  • Playouts: it can take about 60 moves to read a ladder that goes all the way across the board. Does that mean LZ needs a minimum of 60 playouts to read out a ladder? Or more playouts if it's reading out other variations along the way?
  • Net evaluation: Once a few moves of a ladder appear on the board, can LZ recognise the position as good for white or good for black? Can it accurately measure the cost of playing out a bad ladder?
These factors interact with each other. If a ladder move is a low policy move then it won't get many playouts. If the value net can recognise that ten moves of a ladder turns into a disaster, then it also shouldn't need to play out the full ladder.

I'd expect that smaller networks (5 or 6 blocks) will need to play out pretty much the whole ladder, because they can't "see all the way across the board", while a 20 or 40 block network should be able to "take in the position at a glance" and understand the ladder status without playing out the moves.

So, on to some tests. Below are some taisha positions where both sides have made mistakes, and now white has the chance to start a ladder. I want to look at four scenarios:
  • Test position 1A: good ladder, attacker's perspective. White's best move is to atari the black stone and start the ladder.
  • Test position 1B: good ladder, defender's perspective. Black shouldn't pull out of atari, but should play elsewhere.
    ...

[go]$$Wc Test position 1.
$$ ---------------------------------------
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . O . . . . |
$$ | . . . X . . . . . , . . . . . , O d . |
$$ | . . . . . . . . . . . . . . . . e . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . X O O b . . . . . . . . . . . . . |
$$ | . . X X O X a . . , . . . . . X . . . |
$$ | . . . O X O . . . . . . . c . . . . . |
$$ | . . . X X O . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------[/go]

Test position 1A (above): white must play at a, any other move is a mistake.
Test position 1B (above): after white a, black should tenuki -- there are several possible moves, for example c, d, e, but b would be a bad mistake.

...

With the current 266 net, White's blue (top-choice) move is immediately the ladder play at :w1:. After 1, pulling out the Black stone is not rejected; it is never tested (at least within the first 100K playouts). Most of the replies that are tested are in the upper right. The :b2: shown below is not heavily tested (again, at least not up to 100K when I ran this). It has a low policy number compared to nearby points. However, note that it is on the line of White's laddering stones (the line of "a"s) and is therefore a strong ladder breaker that cannot easily be countered from behind. If we play this 2 on the board, blue becomes the :w3: shown below. This move does not re-establish the ladder. LZ only considers local replies by Black in calculating its results. I ran Lizzie over dinner and some Sunday night television, which gave me 1.3 million playouts. In the early going blue switched back and forth between 3 and 4 below. However, by about 100K, 3 dominates. See the three screenshots below the diagram for the rest of the story.
[go]$$Wc Test position 1.
$$ ---------------------------------------
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . O . . . . |
$$ | . . . X . . . . . , . . . . . , O . . |
$$ | . . . . . . . . . . . . . . 2 . . . . |
$$ | . . . . . . . . . . . . . . . 3 . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . a . . . . . . . . . . . |
$$ | . . . . . . a . . . . . . . . . . . . |
$$ | . . . . . a . . . . . . . . . . . . . |
$$ | . . X O O 4 . . . . . . . . . . . . . |
$$ | . . X X O X 1 . . , . . . . . X . . . |
$$ | . . . O X O . . . . . . . . . . . . . |
$$ | . . . X X O . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------[/go]

Below is a screenshot after 1.3 million playouts. Blue is :w3: above. LZ only calculates using local replies to this :w3:.
Attachment: Blind spot LZ 266.jpg [ 261.64 KiB ]

Checking the bottom left after 1.3 million playouts for 3, we can see that only three playouts test Black pulling out the laddered stone.
Attachment: Blind spot LZ 266 2.jpg [ 199.18 KiB ]

But when we add :b4:, we see LZ finally making the calculations and White's win rate dropping. This is after 2K playouts.
Attachment: Blind spot LZ 266 3.jpg [ 261.54 KiB ]

_________________
Dave Sigaty
"Short-lived are both the praiser and the praised, and rememberer and the remembered..."
- Marcus Aurelius; Meditations, VIII 21


 Post subject: Re: How LZ reads out ladders
Post #5 Posted: Sun Mar 01, 2020 5:58 am 
Oza

Posts: 2263
Location: Tokyo, Japan
Liked others: 2136
Was liked: 1272
Rank: Jp 6 dan
KGS: ez4u
Following up on my previous post. Here is how KataGo 1.3.3 with the b15 net handled the same situation in 252 playouts! :rambo:
What to do after :b2: in my original diagram.
Attachment: Blind spot Katago b15 1.jpg [ 198 KiB ]

What it calculated for :w3: with 3(!) playouts.
Attachment: Blind spot Katago b15 2.jpg [ 193.76 KiB ]

 Post subject: Re: How LZ reads out ladders
Post #6 Posted: Sun Mar 01, 2020 6:13 am 
Lives in gote

Posts: 430
Location: Adelaide, South Australia
Liked others: 173
Was liked: 220
Rank: Australian 2 dan
GD Posts: 200
Edit: Dave posted while I was drafting this. Thanks for taking such a close look! Yes, the bigger nets are frighteningly efficient in rejecting some moves, for better or for worse.

--------

Let's look more closely at test 1A, white to play and capture a stone in a ladder.
[go]$$Wc Test position 1A, good ladder
$$ | . . . . . . . .
$$ | . . . . . . . .
$$ | . . X O O . . .
$$ | . . X X O X a .
$$ | . . . O X O . .
$$ | . . . X X O . .
$$ | . . . . . . . .
$$ ----------------[/go]

For all networks a was the first choice move (highest policy value), but this choice gets more clear-cut for bigger networks:
Code:
network G4 policy
45      28%
57      54%
91      53%
116     50%
157     88%
173     78%
258     97%

Network 45 has trouble reading out the ladder. It comes up with this fantastic sequence as the principal variation, allowing the F4 stone to escape in exchange for the corner:
[go]$$Wc LZ-45's PV on 2000po
$$ | . . . . . . . .
$$ | . . . . . 3 4 .
$$ | . . X O O 2 5 .
$$ | . . X X O X 6 .
$$ | . . 1 O X O . .
$$ | . . 7 X X O . .
$$ | . . . . . . . .
$$ ----------------[/go]

According to LZ-45, this has a winrate of 43% for white, which is better than the 39% it can get from having a go at playing out the ladder and messing up:

[go]$$Wc How not to play the ladder
$$ | . . . . . . . .
$$ | . . . . . 3 5 .
$$ | . . X O O 2 4 a
$$ | . . X X O X 1 .
$$ | . . b O X O . .
$$ | . . . X X O . .
$$ | . . . . . . . .
$$ ----------------[/go]

Actually, it does try 5 at a first, but then comes back and looks at this variation too. With this and similar distractions along the way, the 219 playouts given to G4 aren't enough to read the ladder to the end. To start with, this position --
[go]$$Wc Test position 1A
$$ | . . . . . . . .
$$ | . . . . . . . .
$$ | . . X O O . . .
$$ | . . X X O X 1 .
$$ | . . b O X O . .
$$ | . . . X X O . .
$$ | . . . . . . . .
$$ ----------------[/go]

has a neural net evaluation of 37% in white's favour. This number goes up with more playouts, but doesn't ever go up enough to beat the principal variation above. Playout number 236 gets to this position:
[go]$$Wc W+47% says LZ-45
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . O . . . . |
$$ | . . . X . . . . . . . . . . . . O . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . O . . . . . . |
$$ | . . . . . . . . . . . O X X O . . . . |
$$ | . . . . . . . . . . O X X O . . . . . |
$$ | . . . . . . . . . O X X O . . . . . . |
$$ | . . . . . . . . O X X O . . . . . . . |
$$ | . . . . . . . O X X O . . . . . . . . |
$$ | . . . . . . O X X O . . . . . . . . . |
$$ | . . . . . O X X O . . . . . . . . . . |
$$ | . . X O O X X O . . . . . . . . . . . |
$$ | . . X X O X O . . . . . . . . X . . . |
$$ | . . . O X O . . . . . . . . . . . . . |
$$ | . . . X X O . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]

but then LZ doesn't give any more playouts to the ladder, so the positive evaluation doesn't get to filter back up the tree and bump up G4 significantly. (I decided to let it run for a million playouts -- with such a small network, this still takes less than ten minutes -- and that still wasn't enough to put G4 in first place, although it closed the gap a little. It kept exploring the C3 variation, with 967,909 playouts given to that move, leading to a winrate of 37.5%, and G4 got a mere 13,831 playouts, winrate 35.4%.)
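
A quick back-of-envelope for why one good evaluation at the end of the ladder barely moves the number for G4. This assumes the usual AlphaGo Zero-style backup, where a node's winrate is simply the mean of all evaluations that have passed through it. The function name is mine and the figures are rounded from the ones quoted above, so treat them as illustrative rather than exact.

```python
def backed_up_winrate(old_mean, old_visits, new_eval):
    """Mean evaluation of a node after one more playout is averaged in."""
    return (old_mean * old_visits + new_eval) / (old_visits + 1)

# Illustrative numbers loosely based on the post: G4 sits at about 39% after
# a couple of hundred playouts, then a single playout finally reaches the
# position that LZ-45 scores as W+47%.
before = 0.39
after = backed_up_winrate(before, 218, 0.47)
print(f"G4 winrate: {before:.4f} -> {after:.4f}")
```

One playout out of a couple of hundred shifts the average by well under a tenth of a percentage point, so unless the search keeps coming back to the ladder, the good news never outweighs the earlier pessimistic evaluations.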

Networks 57 and 91 behave pretty similarly: they read the ladder to the end with fewer "distractions" along the way, so there are enough playouts for the evaluations to filter up, and G4 turns out to be the best move. But actually the playouts don't matter that much: even on playout number 1, G4 is evaluated as better than any other move, so it would get the right answer on smaller numbers of playouts.

Here's something interesting with LZ-91:
[go]$$Wc LZ-91 has a moment of indecision
$$ | . . . c b . . .
$$ | . . . . . a . .
$$ | . . X O O 2 . .
$$ | . . X X O X 1 .
$$ | . . b O X O . .
$$ | . . . X X O . .
$$ | . . . . . . . .
$$ ----------------[/go]

Both :w1: and :b2: are the first choice moves (highest policy values). But then LZ wants to follow up not with a (13% policy), but with b (32% policy) or c (26% policy). A handful of playouts are enough to check that both those moves lead to nothing good for white, then it gives 255 playouts to a.

Another interesting moment: when LZ-91 reads nearly to the end of the ladder:
[go]$$Wc19
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . O . . . . |
$$ | . . . X . . . . . . . . . . . O O . . |
$$ | . . . . . . . . . . . . . B O X a . . |
$$ | . . . . . . . . . . . . . O X X O . . |
$$ | . . . . . . . . . . . . O X X O . . . |
$$ | . . . . . . . . . . . O X X O . . . . |
$$ | . . . . . . . . . . O X X O . . . . . |
$$ | . . . . . . . . . O X X O . . . . . . |
$$ | . . . . . . . . O X X O . . . . . . . |
$$ | . . . . . . . O X X O . . . . . . . . |
$$ | . . . . . . O X X O . . . . . . . . . |
$$ | . . . . . O X X O . . . . . . . . . . |
$$ | . . X O O X X O . . . . . . . . . . . |
$$ | . . X X O X O . . . . . . . . X . . . |
$$ | . . . O X O . . . . . . . . . . . . . |
$$ | . . . X X O . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]

after playing :bc: but before playing white a, the neural net evaluation of the position is still B+69% (going down to B+17% after 4 playouts). But then when white a appears on the board, the value changes to W+100.0% (to one decimal place).

Here's the summary of all the variations that LZ-91 explores:


LZ-116 and LZ-157 only give a few playouts to the ladder, because for black to pull out of atari is a low policy option (around 1%), far from the first choice to be explored. LZ-157 gets this far on playout number 1270, but doesn't come back to this position:
[go]$$Wc
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . O . . . . |
$$ | . . . X . . . . . . . . . . . . O . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . O . . . . . . . . . . . |
$$ | . . . . . . O X X . . . . . . . . . . |
$$ | . . . . . O X X O . . . . . . . . . . |
$$ | . . X O O X X O . . . . . . . . . . . |
$$ | . . X X O X O . . . . . . . . X . . . |
$$ | . . . O X O . . . . . . . . . . . . . |
$$ | . . . X X O . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]


Here's the summary of variations for LZ-157:


Finally, networks 173 and 258 don't read any ladder variations at all. Here's LZ-258:


Attachments:
summary_good-net258.sgf [313.4 KiB]
summary_good-net157.sgf [317.84 KiB]
summary_good-net91.sgf [325.43 KiB]
 Post subject: Re: How LZ reads out ladders
Post #7 Posted: Sun Mar 01, 2020 6:17 am 
Lives in gote

Posts: 430
Location: Adelaide, South Australia
Liked others: 173
Was liked: 220
Rank: Australian 2 dan
GD Posts: 200
Summary: so far it looks as though the policy net overrides everything else. If there's a blind spot in the policy, then in theory a large enough number of playouts together with accurate evaluations should fix it, but it really does take a massive number of playouts.
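
To put a rough number on "massive", here is a lower bound under a simplified PUCT rule, score = Q + c_puct * P * sqrt(N) / (1 + n), where N is the parent's playout count and n the move's own. An unvisited move can only be picked once its exploration bonus c_puct * P * sqrt(N) exceeds the gap between the current favourite's winrate and whatever value the engine assumes for untried moves. The function name, the value of `c_puct` and the 10% gap are assumptions for illustration only, not Leela Zero's real parameters.

```python
def min_playouts_before_first_visit(value_gap, prior, c_puct=1.5):
    """Lower bound on parent playouts before a low-prior move is tried at all.

    value_gap: how far the favourite's winrate exceeds the value assumed
               for unvisited moves (an illustrative assumption).
    prior:     the policy probability of the neglected move.
    """
    return (value_gap / (c_puct * prior)) ** 2

for prior in (0.01, 0.001, 0.0001):
    n = min_playouts_before_first_visit(0.10, prior)
    print(f"prior {prior:.2%}: more than {n:,.0f} playouts")
```

The bound grows with the inverse square of the prior, so each factor of ten lost in policy costs roughly a factor of a hundred in playouts before the move even gets its first look.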

To be continued...


 Post subject: Re: How LZ reads out ladders
Post #8 Posted: Tue Mar 03, 2020 5:00 pm 
Lives in gote

Posts: 430
Location: Adelaide, South Australia
Liked others: 173
Was liked: 220
Rank: Australian 2 dan
GD Posts: 200
Now for test 1B, black to play and figure out that trying to escape from the ladder is a bad idea.

[go]$$Bc Test position 1B, good ladder
$$ | . . . . . . . .
$$ | . . . . . . . .
$$ | . . X O O a . .
$$ | . . X X O X O .
$$ | . . . O X O . .
$$ | . . . X X O . .
$$ | . . . . . . . .
$$ ----------------[/go]


For small networks, F5 is the "first instinct" move, and they have to read a little bit to figure out that it doesn't work. As per the previous post, LZ-45 just didn't get it. Other networks up to LZ-116 read out the full ladder. LZ-157 and 173 read out a few steps but are able to evaluate the position at an earlier stage. LZ-258 literally doesn't look at pulling out (at least, not within the first 2,000 playouts).

Code:
network F5 policy F5 playouts best move best policy
45      60%       1977        F5
57      31%       66          Q16       2%
91      59%       189         R15       13%
116     19%       64          R15       8%
157     1%        14          O3        20%
173     5%        29          O3        52%
258     -         0           R15       46%


Examples of the variations explored, if you're interested.

LZ-116 reading out the whole ladder

LZ-173 reading out part of the ladder

LZ-258 operating on a higher level


Something interesting is that even when black pulling out of atari the first time is a low policy move, continuing the ladder in the variations is still often the top policy move, and the only move looked at. I guess this reflects a bias in the self-play games: once LZ has reached a certain strength, it generally won't start bad ladders, meaning that if you have pulled out of atari once then continuing the ladder is probably the right thing to do?


Attachments:
summary_good+1-net116.sgf [317.78 KiB]
summary_good+1-net173.sgf [320.71 KiB]
summary_good+1-net258.sgf [306.27 KiB]
 Post subject: Re: How LZ reads out ladders
Post #9 Posted: Tue Mar 03, 2020 7:00 pm 
Honinbo

Posts: 9943
Liked others: 3219
Was liked: 3254
FWIW, I found two examples of this position on Waltheri, 1941-00-00e, Sekiyama Riichi, 6 dan (W) vs. Nabeshima Ichiro, 4 dan, and 2001-09-16g, Zhu Songli, 5 dan (W) vs. Zhou Heyang, 9 dan.

[go]$$Wcm16 Test position 1a.
$$ ---------------------------------------
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . O . . . . |
$$ | . . . X . . . . . , . . . . . , O . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . 2 . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . X O O 3 . . . . . . . . . . . . . |
$$ | . . X X O X 1 . . , . . . . . X . . . |
$$ | . . . O X O . . . . . . . . . . . . . |
$$ | . . . , X a b . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------[/go]


Play continued as above in both games. In the Elf commentaries, for :b17: Elf recommends Ba - :w18:, Bb.

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Banana Republic. It's not just a store anymore.

Everything with love. Stay safe.

 Post subject: Re: How LZ reads out ladders
Post #10 Posted: Tue Mar 03, 2020 7:32 pm 
Lives in gote

Posts: 430
Location: Adelaide, South Australia
Liked others: 173
Was liked: 220
Rank: Australian 2 dan
GD Posts: 200
Bill Spight wrote:
FWIW, I found two examples of this position on Waltheri, 1941-00-00e, Sekiyama Riichi, 6 dan (W) vs. Nabeshima Ichiro, 4 dan, and 2001-09-16g, Zhu Songli, 5 dan (W) vs. Zhou Heyang, 9 dan.

Nice!

For "research purposes", I've been adding a white stone at a (and a corresponding black stone at D2), because otherwise LZ keeps thinking about playing a as a forcing move in the middle of reading out the ladder. It doesn't seem to change the overall conclusions, but without the extra stones the process of tracing the variations is much messier.


 Post subject: Re: How LZ reads out ladders
Post #11 Posted: Wed Mar 04, 2020 4:16 pm 
Honinbo

Posts: 9943
Liked others: 3219
Was liked: 3254
Your mission, Mr. Phelps, should you choose to accept it. ;)



From Common Sense in Go (Kubomatsu, 1929, in Japanese).

 Post subject: Re: How LZ reads out ladders
Post #12 Posted: Thu Mar 05, 2020 4:58 am 
Lives in gote

Posts: 430
Location: Adelaide, South Australia
Liked others: 173
Was liked: 220
Rank: Australian 2 dan
GD Posts: 200
Now on to test 2A, where the ladder is broken. Remember this is the one where the 10, 15 and 20 block networks have a blind spot.

[go]$$Wc Test position 2A, bad ladder
$$ | . . . . . . . . .
$$ | . . d . . . . . .
$$ | . . X O O . . . .
$$ | . . X X O X a . .
$$ | . . c O X O . b .
$$ | . . . X X O . . .
$$ | . . . . . . . . .
$$ ------------------[/go]

If the ladder doesn't work, then a is a mistake. There are a few reasonable alternatives, with b probably being the best move. LZ-258 finds b, but the weaker networks prefer c or d.

Code:
network G4 policy G4 playouts best move best policy best playouts
45      29%       239         C3        27%         1699
57      57%       494         C3        3%          1242
91      54%       494         C6        13%         894
116     47%       1852        G4
157     93%       1983        G4
173     82%       1943        G4
258     3%        4           H3        28%         1780


We see a similar pattern to before, where the weaker networks need to read out the ladder (it takes a little over 500 playouts before LZ-57 recognises C3 as being better than G4), but LZ-258 can see the status "at a glance". But this time there's a blip in the middle, where LZ-116, 157, 173 give a lot of playouts to G4 without managing to notice that the ladder is broken! What's going on here?
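
The blip is easy to see if you turn the G4 playout counts from the table into a share of the 2,000-playout budget. The numbers below are copied straight from the table; the 50% threshold for "sticks with G4" is just an arbitrary cut-off chosen for this sketch.

```python
# (network: G4 playouts) copied from the table for test position 2A.
# Each search was capped at 2,000 playouts.
G4_PLAYOUTS = {45: 239, 57: 494, 91: 494, 116: 1852, 157: 1983, 173: 1943, 258: 4}
TOTAL = 2000

def g4_share(playouts, total=TOTAL):
    """Fraction of the playout budget spent on G4."""
    return playouts / total

# Arbitrary cut-off: a network "sticks with G4" if it spends over half its budget there.
blind_spot = [net for net, po in G4_PLAYOUTS.items() if g4_share(po) > 0.5]

for net, po in G4_PLAYOUTS.items():
    flag = "  <- sticks with the broken ladder" if net in blind_spot else ""
    print(f"LZ-{net:<4} {g4_share(po):6.1%}{flag}")
```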

For LZ-116, here's the principal variation:

[go]$$Wc Test position 2A, LZ-116's PV
$$ ---------------------------------------
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . 0 . . . 8 . . . . . . . . . . . . . |
$$ | . 9 2 4 6 5 . . . . . . . . . . X . . |
$$ | . . 3 O 7 . . . . , . . . . . X . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . X O O . . . . . . . . . . . . . . |
$$ | . . X X O X 1 . . , . . . . . O . . . |
$$ | . . . O X O . . . . . . . . . . . . . |
$$ | . . . X X O . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ ---------------------------------------[/go]

So LZ-116's opinion is that after :w1:, black won't pull out of atari.

It does actually read the ladder as a sub-variation:
[go]$$Wc Ladder as third choice
$$ | . . . . . . . . .
$$ | . . . . . . . . .
$$ | . . X O O 2 . . .
$$ | . . X X O X 1 . .
$$ | . . . O X O . . .
$$ | . . . X X O . . .
$$ | . . . . . . . . .
$$ ------------------[/go]

Here, :b2: is the third choice move at 6% policy (after 40% for 3-3 invasion in the lower right, or 31% for 3-3 at top left as above), so it gets a total of 34 playouts. So we get the ladder up to this position:

[go]$$Bc LZ-116 says W+79%
$$ | . . . . . . . . . . . . .
$$ | . . . , . . . . . O . . .
$$ | . . . . . . . . O X X O .
$$ | . . . . . . . O X X O . .
$$ | . . . . . . O X X O . . .
$$ | . . . . . O X X O . . . .
$$ | . . X O O X X O . . . . .
$$ | . . X X O X O . . , . . .
$$ | . . . O X O . . . . . . .
$$ | . . . X X O . . . . . . .
$$ | . . . . . . . . . . . . .
$$ --------------------------[/go]

At this point, LZ-116 still can't see that the ladder is broken, evaluates the position as strongly in white's favour, and stops reading because clearly black isn't going to persist with such a hopeless variation!

My interpretation: LZ-116 is stuck at a local maximum: the policy is sharp enough to quickly eliminate unpromising moves, but not sophisticated enough to notice the ladder-breaker. At the risk of personifying these networks too much, we could say that LZ-116 is overconfident, leaping to conclusions and lacking LZ-57's patience in reading out the whole ladder. LZ-258 is also very confident, but now the confidence is backed up by enough experience.

Detailed traces if you care to explore further:

LZ-57


LZ-116


LZ-258


Attachments:
summary_bad-net57.sgf [358.95 KiB]
summary_bad-net116.sgf [370.76 KiB]
summary_bad-net258.sgf [314.53 KiB]