 Post subject: O Meien on AlphaGo Zero
Post #1 Posted: Thu Jan 25, 2018 11:38 am 
Oza

Posts: 3655
Liked others: 20
Was liked: 4629
I have just been reading O Meien's views on AlphaGo Zero. He has long had an interest in computers and is of course a top pro, so his views should carry some weight.

The main thing he has noticed is that AGZ is "good at living." He observes that it makes many ordinary moves when attacked but has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space. These unusual moves, says O, are moves that normally look bad. (He doesn't spell it out, but the inference seems to be that human pros have blind spots about such moves.)

Because of this ability, it can invade the opponent's sphere of influence with impunity. This makes it different from pre-Zero AG, which tended to favour large-scale surrounding attacks and only rarely got itself into shinogi situations.

There were plenty of other things he commented on, but this particular point seems especially significant to me because it suggests AG is improving by becoming more and more of a tactics-calculating machine. Maybe it hasn't got that much more to tell us about go strategy?


This post by John Fairbairn was liked by 2 people: Bill Spight, sorin
 Post subject: Re: O Meien on AlphaGo Zero
Post #2 Posted: Thu Jan 25, 2018 4:00 pm 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
I have the feeling the influence <> territory balance changes with strength, a bit like komi. The stronger the player is, the less value is in large moyos, since the (same strength) opponent will be able to reduce more effectively. AGZ probably takes this to the extreme, that's why it is more territorial than Master. And since it no longer needs rollouts, it can search much faster, thus presumably deeper as well. The few published games show incredible accuracy, the good invasion and living techniques seem a direct consequence of that.

 Post subject: Re: O Meien on AlphaGo Zero
Post #3 Posted: Thu Jan 25, 2018 4:53 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
I am leery of generalizing from a single instance. There is no guarantee that another Alpha Zero neural net, with a different training history -- perforce! -- would have the same characteristics as the current AlphaGo Zero. Also, in a few years we will have other bots who are as strong, and we shall see how they play, as well. :)

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.


This post by Bill Spight was liked by: lightvector
 Post subject: Re: O Meien on AlphaGo Zero
Post #4 Posted: Thu Jan 25, 2018 10:32 pm 
Lives in sente

Posts: 902
Location: Fort Collins, CO
Liked others: 319
Was liked: 287
Rank: AGA 3k
Universal go server handle: jeromie
I feel like strategy and tactics are so closely intertwined that an advance in one will necessarily lead to a change in the other. While we may not be able to learn the sort of principles that can be communicated via an aphorism, if humans can address the tactical blind spots revealed by AlphaGo's play strategic changes will eventually follow. Staking out a large territory isn't viable if your opponent can destroy it.

And I do think we will learn something from seeing precise (or even merely odd to us) tactical play. While humans may never play with the same unwavering acuity as a computer, even knowing that tactical advances are possible will encourage top players to stretch their limits ever farther.


This post by jeromie was liked by 2 people: johnsmith, sorin
 Post subject: Re: O Meien on AlphaGo Zero
Post #5 Posted: Fri Jan 26, 2018 9:04 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
Just a quick comment on styles of strong bots: Zen (the non-released version is top pro level) hasn't started doing early 3-3 invasions the way AlphaGo and, later, FineArt (iirc), DolBaram and LeelaZero do. When its opponents play them against it, Zen is often happy to keep extending as the opponent crawls on the 2nd line and to make the gote wall (though it does sometimes jump instead of hane). Zen also likes to split the opponent's sides (e.g. between a 4-4 and a shimari), which AlphaGo noticeably dislikes. So it seems to still play in a more traditionally human style. In some ways it actually seems more human now with the neural networks than it did as pure MCTS before AlphaGo, when it was famous for liking the centre a lot, constructing large moyos with weird central moves and then making spectacular kills. Sometimes it will still go for centre moyos, but it feels to me less biased towards them, and it adapts to the circumstances, happy to go for territory if that's a good way too. It does still like shoulder hits though, as does AG.


This post by Uberdude was liked by 4 people: Bill Spight, Gomoto, sorin, Waylon
 Post subject: Re: O Meien on AlphaGo Zero
Post #6 Posted: Fri Jan 26, 2018 9:51 am 
Lives in gote

Posts: 388
Liked others: 416
Was liked: 198
John Fairbairn wrote:
I have just been reading O Meien's views on AlphaGo Zero. He has long had an interest in computers and is of course a top pro, so his views should carry some weight.

The main thing he has noticed is that AGZ is "good at living." He observes that it makes many ordinary moves when attacked but has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space. These unusual moves, says O, are moves that normally look bad. (He doesn't spell it out, but the inference seems to be that human pros have blind spots about such moves.)

Because of this ability, it can invade the opponent's sphere of influence with impunity. This makes it different from pre-Zero AG, which tended to favour large-scale surrounding attacks and only rarely got itself into shinogi situations.

There were plenty of other things he commented on, but this particular point seems especially significant to me because it suggests AG is improving by becoming more and more of a tactics-calculating machine. Maybe it hasn't got that much more to tell us about go strategy?


Very interesting! I wish we could see some of the precise examples that O Meien had in mind when he said that. I guess they come from the games that DeepMind published between AGZ and older AG versions? Although, I do remember such a case also from the game AG played against the Chinese team ("consultation go") in Wuzhen, specifically move 60: http://www.alphago-games.com/view/event ... /3/move/60 - more of a bad-shape tesuji than a life-and-death situation, but nevertheless it led to a surprisingly quick escape from what seemed at first a severe attack from the humans' team.

As for "Go strategy" - what is strategy? Is it not just humans attaching words to situations that seem mysterious just because they are way beyond our reading/tactical ability?
If it turns out that one can live in much tighter spaces than pros currently think they can, the general way to play early in the game ("strategy") will change a lot, so AG is teaching us a lot about strategy, I think.

_________________
Sorin - 361points.com


This post by sorin was liked by 3 people: Bill Spight, johnsmith, Waylon
 Post subject: Re: O Meien on AlphaGo Zero
Post #7 Posted: Fri Jan 26, 2018 10:44 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
sorin wrote:
I wish we could see some of the precise examples that O Meien had in mind when he said that. I guess they come from the games that DeepMind published between AGZ and older AG versions?

This doesn't quite fit "has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space" but the sequence from move 48-60 in this AGZ 20-block vs AG Lee game is one of my favourites http://www.alphago-games.com/view/event ... /3/move/48

[go]$$B Black moyo?
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . O O O . . . . . . . . . . . . |
$$ | . O O O X X X X . . . . . . . . . . . |
$$ | . X X X . . . . . , . . . . . O . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . , . . . . . , . . . . . , . . . |
$$ | . . . . . . . X . . . . . . . . . . . |
$$ | . . X . X X . . . O . . . . . . . . . |
$$ | . X . O . O X . O . . . . . . . . . . |
$$ | . . . O . O . . . . X . . . . O . . . |
$$ | . . X X . . . . O . . . . . . . . . . |
$$ | . . X O O . O . . X . . . X . O . . . |
$$ | . . X X O . . X . . X . . X O . . . . |
$$ | . . . . . . . . . . . O . O . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]


[go]$$B Moyo schmoyo, white territory!
$$ +---------------------------------------+
$$ | . . . . . . . . . . . . . . . . . . . |
$$ | . . . . O O O . . . . . . . . . . . . |
$$ | . O O O X X X X . . . . . . . . . . . |
$$ | . X X X X . . . . , . . . . . O . . . |
$$ | . . . . O . O X . . . . . . . . . . . |
$$ | . . O . . . . X . . . . . . . . . . . |
$$ | . . . . . . O X . . . . . . . . . . . |
$$ | . . O . . . . . . . . . . . . . . . . |
$$ | . . . . . . O X . . . . . . . . . . . |
$$ | . . . O . . . . . , . . . . . , . . . |
$$ | . . . . . . . X . . . . . . . . . . . |
$$ | . . X X X X . . . O . . . . . . . . . |
$$ | . X . O . O X . O . . . . . . . . . . |
$$ | . . . O . O . . . . X . . . . O . . . |
$$ | . . X X . . . . O . . . . . . . . . . |
$$ | . . X O O . O . . X . . . X . O . . . |
$$ | . . X X O . . X . . X . . X O . . . . |
$$ | . . . . . . . . . . . O . O . . . . . |
$$ | . . . . . . . . . . . . . . . . . . . |
$$ +---------------------------------------+[/go]


P.S. Something I wonder: should Black 45 have connected solidly against the peep? That would allow White to jump to the 2nd line, which separates the corner and destroys some territory, but the corner is safe and White makes no eyes. That is thicker and takes away one sente move White used to live inside, so living would be harder. Would White still go in so deep, or reduce more gently?


This post by Uberdude was liked by 2 people: Elom, sorin
 Post subject: Re: O Meien on AlphaGo Zero
Post #8 Posted: Fri Jan 26, 2018 11:40 am 
Lives in gote

Posts: 388
Liked others: 416
Was liked: 198
Uberdude wrote:
sorin wrote:
I wish we could see some of the precise examples that O Meien had in mind when he said that. I guess they come from the games that DeepMind published between AGZ and older AG versions?

This doesn't quite fit "has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space" but the sequence from move 48-60 in this AGZ 20-block vs AG Lee game is one of my favourites http://www.alphago-games.com/view/event ... /3/move/48


This sequence is absolutely amazing indeed - it almost looks like it is made up :-)

_________________
Sorin - 361points.com

 Post subject: Re: O Meien on AlphaGo Zero
Post #9 Posted: Fri Jan 26, 2018 1:50 pm 
Lives with ko

Posts: 136
Liked others: 47
Was liked: 21
Rank: KGS 6 dan
sorin wrote:
Uberdude wrote:
sorin wrote:
I wish we could see some of the precise examples that O Meien had in mind when he said that. I guess they come from the games that DeepMind published between AGZ and older AG versions?

This doesn't quite fit "has the capacity to make life (shinogi) by making unusual eye-making moves inside its own space" but the sequence from move 48-60 in this AGZ 20-block vs AG Lee game is one of my favourites http://www.alphago-games.com/view/event ... /3/move/48
This sequence is absolutely amazing indeed - it almost looks like it is made up :-)
One of my favorites as well! There seems to be missing a black stone at h3 :)

 Post subject: Re: O Meien on AlphaGo Zero
Post #10 Posted: Fri Jan 26, 2018 2:11 pm 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
johnsmith wrote:
There seems to be missing a black stone at h3 :)

Fixed, thanks.

 Post subject: tautologies
Post #11 Posted: Fri Jan 26, 2018 4:56 pm 
Lives in gote

Posts: 392
Liked others: 23
Was liked: 43
Rank: NR
Bill Spight wrote:
I am leery of generalizing from a single instance. There is no guarantee that another Alpha Zero neural net, with a different training history -- perforce! -- would have the same characteristics as the current AlphaGo Zero.
As Alexandre Dumas remarked, any generalisation is dangerous, so it is dangerous to generalise that generalising from a single instance offers no guarantee of same characteristics; in this case, for one simple reason:
1. Alfie0 has put Monte-Carlo where it belongs, in the wastepaper basket, and with no random element to her, and no change to the microworld of Go, there is no reason to believe that she wouldn't tread the same path and end up looking the same if you were to crank her up again from the beginning, digging another hole in the same place and expecting a different result.

Because the world of Go is so well-defined, and so restricted, there is every reason to believe that Go is axiomatisable - the broad approach of Swim - that's to say, there are universal truths of Go that can be established by logical deduction within a domain model - the sort of thing that Russell and Whitehead tried to do for arithmetic.

Alfie0's behaviour is so markedly similar to that of the ancient greats that there's a fair chance both she and they have started to uncover what those truths are, one of which would be that Alfie "Master"(sic) is a Sorcerer's Apprentice, wrong about almost everything almost all of the time :)

Of course, a different DCNN configuration (eg with more layers and/or an improved learning algorithm) would indeed tread a different path and might end up on top. Even so, my own gutfeel is that it wouldn't have a different style to Alfie0, just superior reading.

Thumbs up for Alfie0.

_________________
i shrink, therefore i swarm

 Post subject: Re: O Meien on AlphaGo Zero
Post #12 Posted: Fri Jan 26, 2018 6:26 pm 
Lives in gote

Posts: 502
Liked others: 1
Was liked: 153
Rank: KGS 2k
GD Posts: 100
KGS: Tryss
Quote:
1. Alfie0 has put Monte-Carlo where it belongs, in the wastepaper basket, and with no random element to her, and no change to the microworld of Go, there is no reason to believe that she wouldn't tread the same path and end up looking the same if you were to crank her up again from the beginning, digging another hole in the same place and expecting a different result.


I hope that you're aware that Alpha Zero uses Monte-Carlo...

 Post subject: Re: O Meien on AlphaGo Zero
Post #13 Posted: Fri Jan 26, 2018 7:30 pm 
Lives in gote

Posts: 392
Liked others: 23
Was liked: 43
Rank: NR
Tryss wrote:
I hope that you're aware that Alpha Zero uses Monte-Carlo...
You hope in vain. Their paper says - albeit not in black and white - that it doesn't.

"AlphaGo Zero is the program described in this paper. It learns from self-play reinforcement learning, starting from random initial weights, without using rollouts." (my emphasis)

However, it's easy to be confused, as they go on to say:
"AlphaGo Zero is provided with perfect knowledge of the game rules. These are used during MCTS, to simulate the positions resulting from a sequence of moves, and to score any simulations that reach a terminal state".

So as its senior author doesn't know the difference between tree search and random tree search, they should maybe have given the job of final proofreading to a different member of the team who does.

The reason Monte-Carlo is called Monte-Carlo is that it is based on roulette-like random state transitions. An upper confidence bound probabilistic search based solely on a heuristic move generator (the policy net) which embodies no random element has no random character and hence is not Monte-Carlo.

However, i overlooked that they say it starts out with random initial weights (which seems to me to be wholly unnecessary), so i take it all back.
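
For readers who want the selection rule discussed above in concrete form, here is a minimal sketch in Python. It assumes the PUCT-style rule given in the AlphaGo Zero paper, where each candidate move is scored as Q + U with U proportional to the policy prior and to sqrt(parent visits) / (1 + child visits). The function names and the exploration constant `c_puct=1.5` are illustrative assumptions, not DeepMind's code or settings. Given the same statistics the choice is deterministic, which is the point made above; note, though, that the paper also describes noise added at the root and sampled move selection during self-play, so training as a whole is not free of randomness.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Score one move as Q + U (c_puct is an assumed, illustrative constant)."""
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u


def select_move(children, c_puct=1.5):
    """Pick the move with the highest Q + U.

    children maps a move label to (q, prior, visits). No random numbers
    are used anywhere, so identical inputs always give the same move.
    """
    parent_visits = sum(visits for _, _, visits in children.values())
    return max(
        children,
        key=lambda m: puct_score(
            children[m][0], children[m][1], parent_visits, children[m][2], c_puct
        ),
    )


# Hypothetical statistics for three candidate moves: (mean value, prior, visits).
stats = {"a": (0.1, 0.5, 10), "b": (0.0, 0.3, 1), "c": (0.0, 0.2, 0)}
chosen = select_move(stats)
```

With these made-up numbers the unvisited move "c" is chosen, because its exploration term outweighs the modest value estimate of the heavily visited move "a".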

_________________
i shrink, therefore i swarm

 Post subject: Re: O Meien on AlphaGo Zero
Post #14 Posted: Fri Jan 26, 2018 9:50 pm 
Lives in gote

Posts: 502
Liked others: 1
Was liked: 153
Rank: KGS 2k
GD Posts: 100
KGS: Tryss
Page 14 of the Alpha Zero preprint, under "Configuration": "During training, each MCTS used 800 simulations."

 Post subject: Re: O Meien on AlphaGo Zero
Post #15 Posted: Fri Jan 26, 2018 10:34 pm 
Lives in gote

Posts: 392
Liked others: 23
Was liked: 43
Rank: NR
yes, the paper refers to MCTS in many places, one of which i quoted.

The thing is, Alfie0 doesn't do random rollouts - which i regard as a significant technological development, one which takes her away from making weird moves like Master et al, and takes her away from going "On Tilt", and makes her overall behaviour much closer to the received wisdom of sages down the ages.

I regard this as deeply significant; i see Alfie0 as qualitatively different and a huge step forward from Alfie Master. They are chalk and cheese. Sure, it's a small step for her programmer, but a giant leap forward for heuristic search, and for Go theory - even if it's also a (sensible!) step back to the good old days before Monte Python :).

Random rollouts are, to me, the hallmark of MCTS; its very essence.

Alfie0 differs from Alfie Fan and Alfie etc in that it doesn't do them.

Maybe PHTS (Probabilistic Heuristic Tree Search) would be a more accurate name than MCTS for the kind of search Alfie0 does.

p2 wrote:
Our program, AlphaGo Zero, differs from AlphaGo Fan and AlphaGo Lee [12] in several important aspects. First and foremost, it is trained solely by self-play reinforcement learning, starting from random play, without any supervision or use of human data. Second, it only uses the black and white stones from the board as input features. Third, it uses a single neural network, rather than separate policy and value networks. Finally, it uses a simpler tree search that relies upon this single neural network to evaluate positions and sample moves, without performing any Monte-Carlo rollouts. To achieve these results, we introduce a new reinforcement learning algorithm that incorporates lookahead search inside the training loop, resulting in rapid improvement and precise and stable learning.

_________________
i shrink, therefore i swarm


Last edited by djhbrown on Sat Jan 27, 2018 2:10 am, edited 1 time in total.

This post by djhbrown was liked by 2 people: Gomoto, Waylon
 Post subject: Re: O Meien on AlphaGo Zero
Post #16 Posted: Sat Jan 27, 2018 1:54 am 
Lives in sente

Posts: 827
Location: UK
Liked others: 568
Was liked: 84
Rank: OGS 9kyu
Universal go server handle: WindnWater, Elom
From what I have been able to ascertain, upon the advent of computer chess programs, chess players began to copy the computers' 'cynical', somewhat materialistic, conservative style, along with many draws. After all, the computers were beating them, so this must be the way to play chess.

Hold down the forward button to AlphaZero's teaching games with Stockfish. It seems to play like a romantic (from our human perspective), opposite in many ways to a normal chess engine, using the 'soft' move selection that was discarded many years ago in favour of clever brute search, and implying a 'whole-board' positional strategy... Strong chess players slightly adjusted their style to match that of the best engines, and now it turns out that the best engines up until now may have been playing chess completely wrong (from our human perspective).

May we tread with caution in the wake of strong Go-playing engines, but I admit that Alpha Zero's strength margin over the best humans is far above what any traditional chess engine could ever dream of achieving, so...

_________________
On Go proverbs:
"A fine Gotation is a diamond in the hand of a dan of wit and a pebble in the hand of a kyu" —Joseph Raux misquoted.


This post by Elom was liked by: djhbrown
 Post subject: Re: O Meien on AlphaGo Zero
Post #17 Posted: Sat Jan 27, 2018 6:36 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
djhbrown wrote:
Random rollouts are, to me, the hallmark of MCTS; its very essence.

Alfie0 differs from Alfie Fan and Alfie etc in that it doesn't do them.

Maybe PHTS (Probabilistic Heuristic Tree Search) would be a more accurate name than MCTS for the kind of search Alfie0 does.

Yes, the MCTS name is rather unfortunate now, as it's possible to do it without the (semi-)random rollouts to a terminal game state. My understanding is that when Rémi Coulom coined the name you couldn't do it without the rollouts, as the games it was used for didn't have a decent evaluation function, so you had to use rollouts. Now that we have decent neural-network-based evaluation functions for non-terminal game positions, you can still do the tree search UCT exploration algorithm, aka MCTS, but without rollouts.

However....
djhbrown wrote:
The thing is, Alfie0 doesn't do random rollouts - which i regard as a significant technological development, one which takes her away from making weird moves like Master et al, and takes her away from going "On Tilt", and makes her overall behaviour much closer to the received wisdom of sages down the ages.

If by tilt you mean doing things like stupid sente moves when losing, the hypothesised rationale being the random rollouts give rise to the "Oh, the rollouts means maybe they don't answer this obvious sente move and then I reverse the game!" idea, then it's certainly an appealing idea, but unfortunately doesn't seem to be true. We only have a few games where AG0 is losing, but it does do 'on tilt' stupid sentes. See http://www.alphago-games.com/view/event ... 0/move/182 and next few moves until it resigns.
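
The rollout-free search described earlier in this post can be sketched roughly as follows. This is a toy illustration under stated assumptions, not AlphaGo Zero's implementation: the game is a tiny Nim variant (take 1 to 3 stones, whoever takes the last stone wins) instead of Go, `evaluate` is a hypothetical stub standing in for the neural network (uniform priors, value 0 for every unfinished position), and `C_PUCT` is an arbitrary constant. Each simulation selects down the tree, expands one leaf, evaluates it without playing the game out, and backs the value up; only terminal positions are scored by the rules.

```python
import math

C_PUCT = 1.5  # assumed exploration constant, chosen only for illustration


class Node:
    def __init__(self, prior):
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0  # from the viewpoint of the player who moved here
        self.children = {}    # move (stones taken) -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


def evaluate(stones):
    """Hypothetical stand-in for the network: uniform priors, neutral value."""
    moves = range(1, min(3, stones) + 1)
    priors = {m: 1.0 / len(moves) for m in moves}
    return priors, 0.0


def select_child(node):
    """Deterministic Q + U selection; no random numbers involved."""
    sqrt_n = math.sqrt(node.visits)

    def score(item):
        child = item[1]
        return child.q() + C_PUCT * child.prior * sqrt_n / (1 + child.visits)

    return max(node.children.items(), key=score)


def search(stones, simulations):
    """Return the most-visited move from a position with stones >= 1."""
    root = Node(1.0)
    for _ in range(simulations):
        node, remaining, path = root, stones, [root]
        # Selection: walk down until reaching a node with no children.
        while node.children:
            move, node = select_child(node)
            remaining -= move
            path.append(node)
        if remaining == 0:
            # Terminal: the player to move lost (opponent took the last stone).
            value = -1.0
        else:
            # Expansion and evaluation: no rollout to the end of the game.
            priors, value = evaluate(remaining)
            for m, p in priors.items():
                node.children[m] = Node(p)
        # Backup: flip the sign at every level on the way up.
        for visited in reversed(path):
            value = -value
            visited.visits += 1
            visited.value_sum += value
    return max(root.children, key=lambda m: root.children[m].visits)


best_from_three = search(3, 200)
```

Calling `search(5, 2000)` explores a slightly deeper tree in the same way. Because nothing here is random, repeated calls with the same arguments return the same move, which is the sense in which this kind of search differs from rollout-based MCTS.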


This post by Uberdude was liked by: djhbrown
 Post subject: Re: O Meien on AlphaGo Zero
Post #18 Posted: Sat Jan 27, 2018 7:43 am 
Lives in gote

Posts: 392
Liked others: 23
Was liked: 43
Rank: NR
Elom wrote:
chess ... positional strategy... tread with caution
it's very instructive that the experiences of top chess players precisely mirror those of top Go players - this suggests to me that whereas treading with caution is, with hindsight, something both should have done before, it looks to me that with Alfie0, they can now throw off any reluctance and dive right into what she has to say, despite the fact that she also gets desperate when behind, so my comment about her avoiding Tilt was unjustified - although, maybe if her programmers had set too high a threshold (too low a win%) for resigning, it is rational of her to become desperate when there's no hope!

In particular, whereas i felt at the time just before Alfie Fan came along that the apparent superior positional judgement of MCTS was, in fact, an inferior positional judgement compensated for by the combination of surprise element plus exhausting (albeit not exhaustive) tactical reading (and was roundly chastised for daring to utter such a heresy), i see Alfie0's PHTS+DCNN as almost the exact opposite of MCTS+DCNN, and i think Alfie0 IS endowed with superior positional judgement (gained through extensive reading), simply because, unlike her predecessors, she really does separate the wheat from the chaff because she doesn't go wandering off into the maze of random rollouts. One evidence for this is that she seems to understand moyos better than Master, and can see when she can live inside one, something that Swim had a good look at in the context of making sense of Jue Yi's New Move:
https://www.youtube.com/watch?v=KSVi8n4c87A&list=PL4y5WtsvtduqNW0AKlSsOdea3Hl1X_v-S&index=27

So i reckon Go scholars have, in Alfie0, the diamond they dreamed of when clutching at the straw pebble of MCTS (sorry for all the mixed metaphors and allusions).

Because it is such a tiny microworld, Go doesn't have much to tell about intelligence, artificial or natural; but on the other hand, every journey of 1000 miles begins with a single step, and it could well be that the playpen prowess of Alfie0's PHTS could be a precious first step, alongside those of Simon and Minsky and McCarthy and Hofstadter et al:
https://www.youtube.com/watch?v=Ezz_lhYvTW4&list=PL4y5WtsvtduqNW0AKlSsOdea3Hl1X_v-S&index=20

Alfie0 would be as lost as any of us without her parallel hardware which enables her to read both right to the end and wide enough to (mostly) not overlook anything important, but it could be that PHTS would be pretty effective with the help of a static evaluation function in domains where there is no end in sight, not even for an army of Tensor Flow machines.

A domain like macroeconomics, for example...

_________________
i shrink, therefore i swarm
