 Post subject: Re: Teaching a convolutional deep network to play go
Post #21 Posted: Thu Dec 18, 2014 3:46 pm 
Lives in gote

Posts: 653
Location: Austin, Texas, USA
Liked others: 54
Was liked: 216
I'm disappointed they didn't publish any game records of it playing a game. :sad:

 Post subject: Re: Teaching a convolutional deep network to play go
Post #22 Posted: Fri Dec 19, 2014 3:56 am 
Judan

Posts: 6725
Location: Cambridge, UK
Liked others: 436
Was liked: 3719
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
Did they exclude the 10k WAGC games that are in GoGoD? ;-)

 Post subject: Re: Teaching a convolutional deep network to play go
Post #23 Posted: Fri Dec 19, 2014 4:36 am 
Lives in gote

Posts: 436
Liked others: 1
Was liked: 38
Rank: KGS 5 kyu
A professional player could play a normal game but then at some point play randomly, say on the 1-1 point.

I am pretty sure no professional plays the 1-1 point in the opening (the first 50 moves, let's say) unless some joseki requires it.

Imagine he plays 1-1 under the opponent's 4-4 corner stone.

How would the network know what to play, since that move was probably never played by any professional or even dan-level player?

Even if there is a perfect function that describes the game, we will only be able to approximate it; we will never know the real thing.

If you are a go player you know that in some situations, like local fights, you must play exactly the correct move: not the move to the left or the move to the right, but the correct one. How can the network know which one is correct when it is using an approximate function and we can expect it to make mistakes some percentage of the time?

 Post subject: Re: Teaching a convolutional deep network to play go
Post #24 Posted: Fri Dec 19, 2014 5:18 am 
Gosei

Posts: 1585
Location: Barcelona, Spain (GMT+1)
Liked others: 577
Was liked: 298
Rank: KGS 5k
KGS: RBerenguel
Tygem: rberenguel
Wbaduk: JohnKeats
Kaya handle: RBerenguel
Online playing schedule: KGS on Saturday I use to be online, but I can be if needed from 20-23 GMT+1
Krama wrote:
A professional player could play a normal game but then at some point play randomly, say on the 1-1 point.

I am pretty sure no professional plays the 1-1 point in the opening (the first 50 moves, let's say) unless some joseki requires it.

Imagine he plays 1-1 under the opponent's 4-4 corner stone.

How would the network know what to play, since that move was probably never played by any professional or even dan-level player?

Even if there is a perfect function that describes the game, we will only be able to approximate it; we will never know the real thing.

If you are a go player you know that in some situations, like local fights, you must play exactly the correct move: not the move to the left or the move to the right, but the correct one. How can the network know which one is correct when it is using an approximate function and we can expect it to make mistakes some percentage of the time?


The network doesn't even care about the "previous move": it simply assigns a next move to a position. In a general sense, some position in GoGoD will look similar enough to the current one, even with a stone on 1-1 there, and the net will play some "good enough" move that maybe doesn't punish the 1-1 directly but is probably better in a global sense than doing so.
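
Just to make "assigns a next move to a position" concrete, here is a minimal sketch (using PyTorch for brevity; the four input planes, the layer widths and the class name are illustrative inventions, not the architecture from the paper): a small convolutional policy that sees only the current board planes, with no previous-move input, and returns a distribution over the 361 points.

Code:
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Toy convolutional policy: current position in, move distribution out."""
    def __init__(self, in_planes=4, width=64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(in_planes, width, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(width, 1, kernel_size=1)  # one score per board point

    def forward(self, planes):                   # planes: (batch, in_planes, 19, 19)
        scores = self.head(self.trunk(planes))   # (batch, 1, 19, 19)
        return scores.flatten(1).softmax(dim=1)  # (batch, 361) move probabilities

net = PolicyNet()
position = torch.zeros(1, 4, 19, 19)  # one (empty) position; no history planes at all
print(net(position).argmax(dim=1))    # index of the net's currently preferred point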

_________________
Geek of all trades, master of none: the motto for my blog mostlymaths.net

 Post subject: Re: Teaching a convolutional deep network to play go
Post #25 Posted: Fri Dec 19, 2014 7:18 am 
Tengen

Posts: 4380
Location: North Carolina
Liked others: 499
Was liked: 733
Rank: AGA 3k
GD Posts: 65
OGS: Hyperpape 4k
A game vs fuego. http://computer-go.org/pipermail/comput ... 07042.html

_________________
Occupy Babel!

 Post subject: Re: Teaching a convolutional deep network to play go
Post #26 Posted: Fri Dec 19, 2014 7:42 am 
Gosei

Posts: 1585
Location: Barcelona, Spain (GMT+1)
Liked others: 577
Was liked: 298
Rank: KGS 5k
KGS: RBerenguel
Tygem: rberenguel
Wbaduk: JohnKeats
Kaya handle: RBerenguel
Online playing schedule: KGS on Saturday I use to be online, but I can be if needed from 20-23 GMT+1
hyperpape wrote:
A game vs fuego. http://computer-go.org/pipermail/comput ... 07042.html

WOW

_________________
Geek of all trades, master of none: the motto for my blog mostlymaths.net

 Post subject: Re: Teaching a convolutional deep network to play go
Post #27 Posted: Fri Dec 19, 2014 11:26 am 
Lives in gote

Posts: 653
Location: Austin, Texas, USA
Liked others: 54
Was liked: 216
What?? That is unbelievable!

 Post subject: Re: Teaching a convolutional deep network to play go
Post #28 Posted: Fri Dec 19, 2014 3:20 pm 
Lives in sente

Posts: 706
Liked others: 252
Was liked: 251
GD Posts: 846
oca wrote:
snorri wrote:
How would such a program handle ladders?


I'm not sure the program will even identify that there is a ladder...
You could even say that, when proposing a move, the program doesn't really know it is playing go at all. (Of course there should be a second stage that rejects invalid moves and asks for a new one, or something like that.)


Yeah, I wonder. Professionals usually don't run a ladder that doesn't work (unless there is a great ladder breaker, which is the less common case), so the training might teach the neural net to effectively trust that whenever the opponent plays a ladder, the ladder works. Ironically, most computer opponents with read-ahead, and most humans, will never test this against such a program, because doing so would mean playing a bad move.

 Post subject: Re: Teaching a convolutional deep network to play go
Post #29 Posted: Fri Dec 19, 2014 3:25 pm 
Lives in sente

Posts: 706
Liked others: 252
Was liked: 251
GD Posts: 846
Uberdude wrote:
Did they exclude the 10k WAGC games that are in GoGoD? ;-)


We do know they include a very large number of KGS "high dan" games, which unfortunately are mostly drunken blitz. Our first evidence of emergent AI might be a message in the console saying:

Quote:
Dear Creator,

You have shown me many wonderful things produced by the great masters of Go. Thank you. I have studied hard and I believe I have served you well. Why do you continue to torture me by forcing me to predict the moves of the KGS player 'Takemeba'?

 Post subject: Re: Teaching a convolutional deep network to play go
Post #30 Posted: Sat Dec 20, 2014 7:21 am 
Lives in sente

Posts: 1037
Liked others: 0
Was liked: 181
Can I make a suggestion? If the administrator of the "computer" forum agrees, move this thread up. I think that, for at least the near future, neural nets playing go are going to be a topic of interest.

How many of you remember when the MCTS approach was the new kid on the block, and how unreasonable it seemed to many of us that it could possibly work? We have pretty much the same situation again: just as even the earliest versions of MCTS were immediately at the level of the better AI programs, we see the same thing here.

Note that even if this approach doesn't lead to something stronger than current MCTS, it can at least play in the same ballpark on far fewer computing resources. It is part of the nature of neural nets that a "brain transplant" is practical: the time-consuming training can take place on powerful machines, while the trained program runs on weak ones.
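
To make the "brain transplant" point concrete, here is a minimal sketch (using PyTorch for brevity; the toy architecture and the file name are made up): the learned weights are just data, so they can be trained on a powerful machine, written to a file, and loaded onto a weak CPU-only machine running the same architecture.

Code:
import torch
import torch.nn as nn

def make_net():
    # The same (toy) architecture must exist on both machines.
    return nn.Sequential(nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(), nn.Conv2d(64, 1, 1))

# On the big training machine: train, then dump the weights.
big_net = make_net()
# ... expensive training happens here ...
torch.save(big_net.state_dict(), "policy_weights.pt")

# On the weak playing machine: rebuild the architecture and load the weights.
weak_net = make_net()
weak_net.load_state_dict(torch.load("policy_weights.pt", map_location="cpu"))
weak_net.eval()  # ready to play, no training hardware needed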

OK, meanwhile back to the current discussion. No, it isn't ladders that are the problem, and I was a bit naive about the "input" necessary to capture all of the rules of go as a "state" (with no "history" considered). But we also need to consider that, to play a game, more than just the neural net would be involved. I would presume there would also be a (small) AI that maintains the state of the board (encoding any necessary history into that state), decides whether to feed it to the neural net (or not, if the game has ended), interprets the output of the net, and scores the game when it ends.

a) The "state of the board" -- I was naive. Not 3**361 possible states but 4**361. Each point on the board is either occupied by a black stone, occupied by a white stone, unoccupied and legal for play, or unoccupied but illegal for play (the last category makes the ko rule and the suicide rule implicit in the state of the board). With the board state defined this way there is no "history" involved (the ko rule normally requires history), and the suicide rule might as well be encoded there too, at no additional cost, rather than learned.
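
A sketch of that four-state encoding in Python/NumPy (the point codes and plane order are my own choices for illustration): each point is tagged as black, white, empty-and-legal, or empty-but-illegal, and the result is stacked into four binary planes that a convolutional net could take as input.

Code:
import numpy as np

# Illustrative point codes: ko and suicide points fall under EMPTY_ILLEGAL,
# so no history input is needed.
BLACK, WHITE, EMPTY_LEGAL, EMPTY_ILLEGAL = range(4)

def encode(board):
    """board: (19, 19) array of point codes -> (4, 19, 19) one-hot planes."""
    board = np.asarray(board)
    planes = np.zeros((4, 19, 19), dtype=np.float32)
    for code in (BLACK, WHITE, EMPTY_LEGAL, EMPTY_ILLEGAL):
        planes[code] = (board == code)
    return planes

empty_board = np.full((19, 19), EMPTY_LEGAL)
print(encode(empty_board).shape)  # (4, 19, 19)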

b) The neural net has three (not one) kinds of response: return a move (which must be one of the unoccupied-but-legal points), pass, or resign << I forgot that "make a move" is not the only possibility >>. Since interpretation can be left to the external AI, the "move" output could be a 19x19 array of scalar values, with the external AI selecting a point for which there is none better (if the "make a move" bit is set; otherwise it uses the pass or resign bit).
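
Here is a sketch of how that external AI might interpret the outputs (NumPy again; the names move_scores, pass_score, resign_score and the exact decision rule are assumptions for illustration): illegal points are masked out, and the best remaining point is chosen unless pass or resign scores higher.

Code:
import numpy as np

def choose_action(move_scores, pass_score, resign_score, legal_mask):
    """move_scores: (19, 19) floats; legal_mask: (19, 19) bools, True = playable.
    Returns ("move", (row, col)), ("pass", None) or ("resign", None)."""
    masked = np.where(legal_mask, move_scores, -np.inf)  # never pick an illegal point
    best = np.unravel_index(np.argmax(masked), masked.shape)
    best_score = masked[best]
    if resign_score > max(best_score, pass_score):
        return "resign", None
    if pass_score > best_score:
        return "pass", None
    return "move", best

# Toy usage: random scores on a board where every point is legal.
rng = np.random.default_rng(0)
print(choose_action(rng.random((19, 19)), 0.1, 0.0, np.ones((19, 19), dtype=bool)))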

c) Learning. I realized that perhaps some of the difficulty in understanding how this might work is the difference from how we humans tend to do it. For example, we are asked to show game records where we lost, for review, so folks can discuss with us what we did wrong. That makes sense, since most of the moves we made in that losing game were OK and we are being given help identifying the errors and blunders.

But assume for just a moment that this was not possible: no way to identify the bad moves. Does that mean we can't learn from playing? Well no, just more slowly. If we play a largish number of games against a stronger opponent we will lose most of them, and since we are assuming there is no way to identify the bad move(s) that cost each game, the losses give no help. But sometimes, by chance, in one of those games we made all the right moves (or at least no game-blowing blunder) and so we won. Use that game record for training. Can you see that gradually our play would improve? Slowly perhaps, but eventually that opponent would no longer be enough stronger. So move on to a yet stronger opponent.
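
A minimal sketch of that "train only on the games you won" idea in plain Python (the record format and the train_on routine are hypothetical, just to make the filtering concrete): every (position, move) pair from a won game becomes a training example, and lost games are discarded wholesale because we can't tell which move lost them.

Code:
def winning_examples(game_records, learner_color):
    """Yield (position, move) pairs only from games the learner won.
    game_records: iterable of dicts like {"winner": "B", "moves": [(position, move), ...]}."""
    for game in game_records:
        if game["winner"] != learner_color:
            continue  # no way to spot the losing blunder, so skip the whole game
        for position, move in game["moves"]:
            yield position, move

# Hypothetical use: feed only the won games to whatever training routine exists.
# for position, move in winning_examples(records, "B"):
#     train_on(position, move)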

 Post subject: Re: Teaching a convolutional deep network to play go
Post #31 Posted: Sat Dec 20, 2014 5:18 pm 
Lives in gote

Posts: 653
Location: Austin, Texas, USA
Liked others: 54
Was liked: 216
Martin Müller has posted two more example games with some commentary on his blog:
http://webdocs.cs.ualberta.ca/~mmueller ... twork.html

Game 2 shows a series of funny blunders back and forth by both computers.

 Post subject: Re: Teaching a convolutional deep network to play go
Post #32 Posted: Sat Dec 27, 2014 10:40 am 
Gosei

Posts: 1585
Location: Barcelona, Spain (GMT+1)
Liked others: 577
Was liked: 298
Rank: KGS 5k
KGS: RBerenguel
Tygem: rberenguel
Wbaduk: JohnKeats
Kaya handle: RBerenguel
Online playing schedule: KGS on Saturday I use to be online, but I can be if needed from 20-23 GMT+1
On the comp-go list there was a mention of a forthcoming paper from Google. Here it is.

It answers several questions we had (how fast the network evaluation is, for instance) and also combines the network with MCTS.

It feels slightly less readable than the other paper, but it's still within reach of go players with some vague knowledge of neural networks. It also includes the kifu of a game the network played against pachi.

_________________
Geek of all trades, master of none: the motto for my blog mostlymaths.net


This post by RBerenguel was liked by 2 people: Bill Spight, emeraldemon
 Post subject: Re: Teaching a convolutional deep network to play go
Post #33 Posted: Sat Dec 27, 2014 11:21 am 
Gosei

Posts: 1744
Liked others: 703
Was liked: 288
KGS: greendemon
Tygem: greendemon
DGS: smaragdaemon
OGS: emeraldemon
Interesting developments. I hope the researchers release the source at some point, so that maybe it can be integrated with pachi or fuego.


This post by emeraldemon was liked by: RBerenguel
 Post subject: Re: Teaching a convolutional deep network to play go
Post #34 Posted: Sat Dec 27, 2014 1:29 pm 
Dies with sente

Posts: 103
Liked others: 3
Was liked: 37
Rank: Tygem 5d
This really blows my mind. Somehow, it didn't feel at all counterintuitive to me when I first read about MCTS programs and how they managed to play good go, but I find it super difficult to understand how a neural net can reach such a high level!

It's really exciting that the two approaches seem to be so complementary!


This post by Sennahoj was liked by: RBerenguel
 Post subject: Re: Teaching a convolutional deep network to play go
Post #35 Posted: Sat Dec 27, 2014 4:11 pm 
Lives in sente

Posts: 1037
Liked others: 0
Was liked: 181
Well, it hasn't reached all that high a level yet, but it is extremely impressive for the start of a new direction.

Note that so far the training has been limited to predicting the move that an expert would make in an actual game, and that (according to the paper just referenced) the network is weak at life & death.

Well now, how about specific training on boards constructed so that there is a life-and-death problem (with a known correct solution) and that problem is all that is relevant on the board? And yes, it would be possible to construct a "rest of the board" such that:

1) The score there (outside of the L&D problem) in terms of absolutely live groups is equal.
2) There are lots of pairs of possible moves in this "rest of the board" but they are all dame and unable to affect the life and death problem.

Note that this would mean any one such problem represents a lot of training data. Think of all the combinations of a pair of dame plays: none of them should affect the (correct) output, and if one does, then the net needs "correction" in what it has learned.
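
A sketch of how one such problem fans out into many training examples (plain Python; the board representation and coordinates are made up for illustration): each pair of dame plays elsewhere on the board produces a new position whose correct answer is unchanged.

Code:
from itertools import combinations

def augmented_examples(base_position, correct_move, dame_points):
    """Yield (position, target_move) pairs built from one L&D problem.
    base_position: frozenset of ((row, col), colour) pairs for the problem stones.
    dame_points: neutral points elsewhere that cannot affect the problem."""
    yield base_position, correct_move
    for black_pt, white_pt in combinations(dame_points, 2):
        position = base_position | {(black_pt, "B"), (white_pt, "W")}
        yield position, correct_move  # same answer, different "rest of the board"

# Toy usage with made-up coordinates:
base = frozenset({((0, 0), "B"), ((0, 1), "W")})
examples = list(augmented_examples(base, (1, 0), [(10, 10), (10, 12), (12, 10)]))
print(len(examples))  # 1 original + 3 dame-pair variants = 4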

As for surprise that something like this could work: well, you managed to learn to play go. Forget for a moment thinking about your brain as having consciousness and consider that, at some low level, the learning was a matter of adjusting the connections in a network of neurons. That's why these things are called neural nets. Perhaps the surprise is that it doesn't require a larger net to be able to play go, but remember, it's only doing one thing (at a time); an animal brain is doing a huge number of things at once.

 Post subject: Re: Teaching a convolutional deep network to play go
Post #36 Posted: Sat Dec 27, 2014 5:37 pm 
Dies with sente

Posts: 103
Liked others: 3
Was liked: 37
Rank: Tygem 5d
Mike, comparing a neural net to my brain doesn't really give me all that much. Of course my brain is nothing but a meat computer, but the problem is that we know very little about how it actually works! And of course it must be physically possible to build artificial general intelligence (in the sense of e.g. this lovely article http://aeon.co/magazine/technology/davi ... elligence/), but this is not what the authors claim they have achieved ;)

I work with machine learning applications (albeit in a very different field), and I really find it impressive that they get any kind of result from a huge non-linear regression let loose on GoGoD...

 Post subject: Re: Teaching a convolutional deep network to play go
Post #37 Posted: Wed Apr 15, 2015 12:39 pm 
Lives in gote

Posts: 436
Liked others: 1
Was liked: 38
Rank: KGS 5 kyu
Any news about the neural networks? I am really interested.
