
All times are UTC - 8 hours [ DST ]




 Post subject: it's not just tenuki
Post #1 Posted: Thu Sep 22, 2016 6:12 pm 
Lives in gote
User avatar

Posts: 392
Liked others: 23
Was liked: 43
Rank: NR
MCTS bots play the percentages - that's what statistical sampling means.
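A minimal, runnable sketch of what "playing the percentages" by statistical sampling means. A toy Nim game (take 1-3 stones; taking the last stone wins) stands in for a go position here, since a full board engine would run to pages; real MCTS bots add a search tree on top, but the sampling core looks like this:

Code:
import random

def random_playout(pile, mover_is_me):
    """Finish the game with uniformly random moves.
    Returns True if 'I' win (take the last stone)."""
    while True:
        pile -= random.randint(1, min(3, pile))
        if pile == 0:
            return mover_is_me
        mover_is_me = not mover_is_me

def winrate(pile_after_my_move, playouts=2000):
    """Estimate my win percentage by random sampling."""
    if pile_after_my_move == 0:
        return 1.0  # taking the last stone wins outright
    wins = sum(random_playout(pile_after_my_move, mover_is_me=False)
               for _ in range(playouts))
    return wins / playouts

def choose_move(pile):
    # "Play the percentages": no plan and no notion of a forced
    # sequence, just the move with the highest sampled winrate.
    return max(range(1, min(3, pile) + 1),
               key=lambda take: winrate(pile - take))

print(choose_move(10))  # tends to print 2, leaving a multiple of 4

Note what is absent: any plan. The bot simply maximizes an estimated win percentage, which is why its choices can look inexplicable to us.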


Attachments:
a.sgf [1.44 KiB]
Downloaded 466 times


Last edited by djhbrown on Tue May 02, 2017 12:35 am, edited 1 time in total.
 Post subject: Re: it's not just tenuki
Post #2 Posted: Fri Sep 23, 2016 7:56 am 
Lives in sente

Posts: 1037
Liked others: 0
Was liked: 180
djhbrown wrote:
MCTS bots play the percentages - that's what statistical sampling means ... because no opponent with half a brain would be so dumb as to tenuki during a forced sequence ... they don't have emotional reactions like we humans - it's only that it looks like that to us. Whereas, in fact, they are Quixotic all the time, even when playing at their best ...

And it's not just tenuki, as i found out the hard way this morning against Hirabot33 (and Lee Sedol found out the hard way in game 2 against Alphago).

At move 32 in the above game, Hirabot33 played what looked to me to be an unbelievably stupid move, and followed it up with the even more banal 34. What on earth was going on in its little bot-mind??


It is difficult to understand why you sometimes seem to grasp that the bots aren't "thinking" like we do, while at other times you seem to imagine that human-style thinking is the only way to go.

You shouldn't assume that "statistical sampling" (by including lines a "thinking human" wouldn't try) is necessarily bad. The bots might have different strengths. They don't need a "plan" but analyze each position afresh. That means they might be better at making use of little bits of aji scattered over the board, none of which individually appears to offer very much (and so the human can't plan around them) but which collectively add up to an advantage that will eventually materialize.

Those odd moves and odd tenukis might be good moves, just ones too difficult for a human to see the point of because the benefit is remote. There isn't a SPECIFIC plan that the move affects.

Maybe the way to look at this is to think back to when you would have had a hard time understanding a correct (good) tenuki. Say you are playing out a joseki sequence and all of a sudden the opponent tenukis. At some point you learned to look at that (odd) move and recognize that if you didn't respond you would suffer a disadvantage at that location, but that the move was also a ladder breaker affecting the way you were playing the joseki. As a human player you were able to recognize the "plan" involved. In other words, the human opponent could conceive of that particular tenuki, and you could recognize why it would work.

But now suppose it was one of these bots doing that. The reason might not be a potential ladder in the area you are now playing, but the likelihood of several other ladders in areas not currently being played in (and the collective value of those might be more than the local loss in the area being played in).


This post by Mike Novack was liked by: daal
 Post subject: Re: it's not just tenuki
Post #3 Posted: Fri Sep 23, 2016 8:39 am 
Oza
User avatar

Posts: 2777
Location: Seattle, WA
Liked others: 251
Was liked: 549
KGS: oren
Tygem: oren740, orenl
IGS: oren
Wbaduk: oren
djhbrown wrote:
At move 32 in the above game, Hirabot33 played what looked to me to be an unbelievably stupid move, and followed it up with the even more banal 34. What on earth was going on in its little bot-mind??


32 looked like an obvious move to me. 34 requires a little bit of reading. I'm not sure you can compare this at all to Lee Sedol vs AlphaGo.

 Post subject: Re: it's not just tenuki
Post #4 Posted: Fri Sep 23, 2016 9:54 am 
Lives in sente

Posts: 902
Location: Fort Collins, CO
Liked others: 319
Was liked: 287
Rank: AGA 3k
Universal go server handle: jeromie
While I'm only about your level, and it's difficult to tell for sure, it looks to me like HiraBot gets a good result even if you play correctly, because there are forcing moves that build white's outside wall. The bot has "decided" that the solidification of black's territory is worth the outside gain. Of course, if black makes a mistake the program may as well take the profit that is offered.

I think this is similar to many of AlphaGo's moves: the software has calculated that a small local loss is worth the global gain. This is what has made professionals and amateurs alike inspect the games in great detail: AlphaGo evaluates positions differently than most humans do, and that means we have an opportunity to learn.

Remember that the neural networks that restrict the moves AlphaGo considers were developed through many, many iterations of self-play. Since the ability of white and black to follow complex lines would be entirely equal, trick moves would be unlikely to show favorable results under these conditions.
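A schematic sketch of that restriction step; the move names and prior probabilities below are invented for illustration, and the real policy network scores every point of the board rather than a handful of labelled moves:

Code:
def candidate_moves(legal_moves, prior, k=3):
    """Keep only the k moves the policy network rates most
    plausible; the search never even examines the rest."""
    return sorted(legal_moves, key=prior.get, reverse=True)[:k]

# Invented priors: self-play training has taught the net which
# shapes usually work, so "trick moves" get vanishingly small priors.
prior = {"hane": 0.41, "extend": 0.30, "tenuki": 0.12,
         "trick move": 0.01, "first-line descent": 0.002}
print(candidate_moves(list(prior), prior, k=3))  # ['hane', 'extend', 'tenuki']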

I do think that many of the problems you are describing have been a part of existing bots, especially when the outcome of the game is mostly decided. For the most part, the addition of neural networks has limited this problem when the game is still competitive. But we must tread lightly as we begin studying the play of professional level bots. We shouldn't accept every move just because the bot played it (perhaps it is displaying some of the problems you highlight!), but neither should we reject moves because we don't immediately understand them (perhaps the move is right after all). Amateurs who play stronger players are familiar with this tension every time they play!


This post by jeromie was liked by: Bill Spight
 Post subject: Re: it's not just tenuki
Post #5 Posted: Fri Sep 23, 2016 5:36 pm 
Lives in gote
User avatar

Posts: 392
Liked others: 23
Was liked: 43
Rank: NR
jeromie wrote:
it looks to me like HiraBot gets a good result even if you play correctly because there are forcing moves to build white's outside wall.



My interest is this: Where does AI go from here?


Attachments:
b.sgf [1.1 KiB]
Downloaded 412 times


Last edited by djhbrown on Tue May 02, 2017 12:37 am, edited 1 time in total.
 Post subject: Re: it's not just tenuki
Post #6 Posted: Fri Sep 23, 2016 9:33 pm 
Honinbo

Posts: 9545
Liked others: 1600
Was liked: 1711
KGS: Kirby
Tygem: 커비라고해
I thought AlphaGo was still improving to this day through its self play.

_________________
be immersed

 Post subject:
Post #7 Posted: Fri Sep 23, 2016 9:54 pm 
Honinbo
User avatar

Posts: 8859
Location: Santa Barbara, CA
Liked others: 349
Was liked: 2076
GD Posts: 312
Quote:
I thought AlphaGo was still improving to this day through its self play.
A reasonable assumption; likewise that DM continues to add architectural improvements to it.

 Post subject: Re: it's not just tenuki
Post #8 Posted: Fri Sep 23, 2016 10:57 pm 
Lives in sente

Posts: 727
Liked others: 44
Was liked: 218
GD Posts: 10
djhbrown wrote:
One obvious way to improve Alphago is to add yet more processors to increase the size of the samples, and maybe a few more gerzillion self-play RL exercises (although i feel that RL's hill-climbing levels out pretty quickly). Alphago uses about 2000 parallel processors, whereas Zen and others are limited to about 4 or so. That's an increase of nearly three orders of magnitude (2000/4 = 500) and may be worth 2 or even 3 stones at their level. Or it may not. We won't know until they play each other.


In its commercial version, Zen is limited to 8 cores (the 2013 version; the deep-learning version can parallelize across even more processors), and Crazy Stone is limited to 64 cores (the deep-learning version; Remi answered this himself, it was 32 cores in the 2015 version). For the experimental versions, Zen uses two Xeon E5-2623 v3 CPUs and four GeForce GTX TITAN X GPUs, while Crazy Stone uses a Xeon with 18 cores and 36 threads at 2.9 GHz. But I agree that this hardware can't even compare to the single-machine version of AlphaGo (48 CPUs, 8 GPUs).

djhbrown wrote:
in http://papers.ssrn.com/sol3/papers.cfm? ... id=2818149 i showed that (2) just a little commonsense would have guided Alphago to finding a workable defence to Lee's magic wedge in game 4.


I think you already know that DeepMind eradicated the game 4 bug by training AlphaGo even more; what do you think about this method? They said the bug was a 'horizontal effect' but did not elaborate on that term. It's as if they're not quite sure either.

 Post subject: Re: it's not just tenuki
Post #9 Posted: Sat Sep 24, 2016 12:11 am 
Lives in gote
User avatar

Posts: 392
Liked others: 23
Was liked: 43
Rank: NR
i would imagine that DM are currently more focussed on producing something useful in image analysis for differential diagnosis of medical conditions.


Last edited by djhbrown on Tue May 02, 2017 12:39 am, edited 1 time in total.
 Post subject: Re: it's not just tenuki
Post #10 Posted: Sat Sep 24, 2016 12:57 am 
Lives in sente

Posts: 727
Liked others: 44
Was liked: 218
GD Posts: 10
djhbrown wrote:
pookpooi wrote:
I think you already know that DeepMind eradicated the game 4 bug by training AlphaGo even more; what do you think about this method? They said the bug was a 'horizontal effect' but did not elaborate on that term. It's as if they're not quite sure either.

i didn't know that; if you know of a public statement to that effect, please share it. as to "horizontal effect", i agree with them. as i said before, it's a kind of "horizon effect" - but a horizon of width rather than of depth. in the case of game 4 black 79, Alphago hadn't looked wide enough.


Here https://www.reddit.com/r/baduk/comments ... _is_fixed/

and here https://www.youtube.com/watch?v=LX8Knl0g0LE it's the last question of the Q&A section, so it's near the end of the video
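As an aside on the "horizon of width" idea quoted above: a depth horizon means the search stopped too early along a line; a width horizon means the winning move was never among the candidates at all. A toy illustration, in which every move name and number is invented:

Code:
# The true best reply exists, but the policy prior ranks it so low
# that a search restricted to the top-k candidates never tries it.
true_value = {"block": -0.3, "atari": -0.5, "wedge": 0.9}
prior      = {"block": 0.55, "atari": 0.40, "wedge": 0.0001}

def search_value(k):
    considered = sorted(true_value, key=prior.get, reverse=True)[:k]
    return max(true_value[m] for m in considered)

print(search_value(2))  # -0.3: the wedge lies beyond the width horizon
print(search_value(3))  #  0.9: widen the candidate set and it appears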

 Post subject: Re: it's not just tenuki
Post #11 Posted: Sat Sep 24, 2016 2:57 am 
Lives in gote
User avatar

Posts: 392
Liked others: 23
Was liked: 43
Rank: NR
thanks for the links, pookpooi.

re: fixing the bug

i enjoyed Fan Hui's anecdote about imagining that they wanted to wire him up to probe his brain while he was playing Go. :)


Last edited by djhbrown on Tue May 02, 2017 12:39 am, edited 1 time in total.
 Post subject: Re: it's not just tenuki
Post #12 Posted: Sat Sep 24, 2016 4:25 pm 
Oza
User avatar

Posts: 2777
Location: Seattle, WA
Liked others: 251
Was liked: 549
KGS: oren
Tygem: oren740, orenl
IGS: oren
Wbaduk: oren
For fun, I ran the game through to see what Crazy Stone and Zen thought. They also looked at Hirabot's moves 32 and 34 early on and then started moving away from them. So shape-wise, they are good moves to consider on a first pass, but the stronger bots decide not to play them.

 Post subject: Re: it's not just tenuki
Post #13 Posted: Sat Sep 24, 2016 9:18 pm 
Honinbo

Posts: 9545
Liked others: 1600
Was liked: 1711
KGS: Kirby
Tygem: 커비라고해
djhbrown wrote:
as to "fixing the bug", it is conceivable that more RL trials would improve performance, but there is evidence that RL tails off asymptotically [1], so i guess they found a different way, by simply presenting the position after white 78 to the policy network, telling it that Kim's move of L10 is the correct reply. And telling the value network that the position after black L10 is a win for black.


Actually, I think Aja said that the "bug" was fixed simply by continuing self play. They didn't explicitly give information tailored to the situation, and let it just keep improving itself. Later, they presented the same board position, and the new version of AlphaGo found the correct answer.

RL may tail off at an asymptote, but I don't think AlphaGo has reached that point yet. So far, it appears to have continued improvement simply through self play.
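The two fixes under discussion differ mainly in where the training examples come from. A toy sketch, in which a three-weight logistic model stands in for the value network and all features and labels are invented:

Code:
import math, random

def predict(w, x):  # toy "value network": probability black wins
    return 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

def sgd_step(w, x, target, lr=0.1):
    """One gradient step nudging predict(w, x) toward target."""
    err = predict(w, x) - target
    return [wi - lr * err * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0, 0.0]

# Targeted patch (djhbrown's suggestion): hand-label the one known-bad
# position (invented features) as a win for black and train on it.
game4_position = [1.0, -0.5, 0.3]
for _ in range(100):
    w = sgd_step(w, game4_position, target=1.0)

# What Aja described: no hand labels at all; keep generating self-play
# games, train on their outcomes, and let the fix emerge on its own.
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(3)]  # stand-in position
    w = sgd_step(w, x, target=1.0 if sum(x) > 0 else 0.0)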

_________________
be immersed

 Post subject: Re: it's not just tenuki
Post #14 Posted: Sat Sep 24, 2016 10:35 pm 
Lives in gote
User avatar

Posts: 392
Liked others: 23
Was liked: 43
Rank: NR
If you were in charge of training Alpha, and had new data from 5 games against one of the world's best players, it would be rather remiss of you not to tell Alpha to learn from that experience and instead just hope she would learn enough solely through self-play.


Last edited by djhbrown on Tue May 02, 2017 12:41 am, edited 1 time in total.
 Post subject: Re: it's not just tenuki
Post #15 Posted: Sat Sep 24, 2016 10:48 pm 
Lives in sente

Posts: 727
Liked others: 44
Was liked: 218
GD Posts: 10
If I were in charge of AlphaGo, then I'd do what you recommend: directly feed it the correct positions and force AlphaGo to learn.
But the real question is, is it that easy? Which is more convenient for the programmers: doing that, or letting AlphaGo's self-play correct it? DeepMind knows; I don't.

 Post subject: Re: it's not just tenuki
Post #16 Posted: Sun Sep 25, 2016 12:01 pm 
Honinbo

Posts: 9545
Liked others: 1600
Was liked: 1711
KGS: Kirby
Tygem: 커비라고해
Sounds to me like you are just speculating, djhbrown. Of course I am, too, but at least it's an opinion based on what Aja said. To me, the power behind their approach is the limited domain knowledge. I don't see a reason to stray from that philosophy.

Besides, if the current version of AlphaGo is really as strong as they say, self-play provides better-quality games than games against Lee Sedol.

_________________
be immersed

 Post subject: Re: it's not just tenuki
Post #17 Posted: Sun Sep 25, 2016 2:15 pm 
Lives in sente

Posts: 1037
Liked others: 0
Was liked: 180
djhbrown wrote:


but that would be the DCNN equivalent of patching the code to fix a single case; it would not remedy the systemic underlying design flaw, which i perceive to be a lack of focus due to a lack of a conceptual overview - a lack of positional judgement!


I think you might be helped by understanding the difference between a neural net not (yet) getting something right and a bug in its implementation. That a neural net cannot yet do something (cannot "correctly" evaluate the function) for an input it has not yet been trained on is not a "bug". Nor does correcting this one case fix ONLY that one case. Were that the situation, neural nets wouldn't be good for very much.

In the beginning (before training) a neural net can't do anything. It is then trained (its weights adjusted; for the moment ignore how) so that for each input from its training set it produces the correct output. Again ignoring the process, except to point out that the adjustments must not only get the new input/result pair correct but must also not mess up all the previous input/result pairs. What happens (what a neural net is good for) is that the net will not only give the correct results for the input/result pairs it has been trained on; it becomes likely that, given an input it has never seen before (one it has NOT been trained on), it will also give the correct result.

So fixing this one KNOWN "error" is actually likely to fix other errors not yet encountered.
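A concrete toy version of that point, with a single shared weight standing in for the whole net and all numbers invented: because training adjusts shared weights rather than memorizing the case, correcting one known error also corrects similar inputs the net has never seen.

Code:
import math

def predict(w, x):          # toy one-weight "network"
    return 1 / (1 + math.exp(-w * x))

w = -2.0                    # starts out wrong for positive inputs
known_error, unseen_similar = 1.0, 1.2
print(predict(w, known_error), predict(w, unseen_similar))  # both ~0.1

for _ in range(200):        # train ONLY on the known error (target 1.0)
    w -= 0.5 * (predict(w, known_error) - 1.0) * known_error

# The shared weight moved, so the never-trained input is fixed too.
print(predict(w, known_error), predict(w, unseen_similar))  # both ~0.99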
