can someone explain how computer go works to me?

often · #1

can someone explain from a high level perspective how most computer algorithims use monte carlo to figure out what to play? i'm not sure how probability figures into figuring out the correct move.

RBerenguel · #2

Monte Carlo algorithms are a broad family of namesakes in many fields. For example, a Monte Carlo algorithm can be used to smooth ray tracing images: instead of sending a ray in each pixel you send "several" (throw a dice, average the end result.)

In the specific case of go, instead of using heuristics (extend here because it is good to have more liberties, kill that group because we want points) we just simulate a random game (some constraints can be put to make it as efficient as possible, but anyway.) Do it for a lot of moves (in depth, because the game has to get to a scoring phase, ideally) and in width for the next move (you need to check as many possible moves as viable) and then you pick the next move just based on how many of these sample games ended in a win for you.

For example, a MC go program may be playing black in 5x5. He'll read ahead (for instance, I'm making up numbers) 100 games per initial move, and will check only non-border plays. So he checks 3-3 (which is the tengen of 5x5 heh) and all other moves around it (2-2,2-3,2-4, etc). In these simulations, 2-2 ends with 20 wins for W and 80 for B, 2-3 with 30 wins for W and 70 for B (again, making up the numbers) and finally, 5-5 has 45 wins for W and 55 for B. So most games checked (all with random moves!) end in a win for 3-3, so the bot will play it, since it is quite likely it's the best move currently. Repeat ad-nauseam

I hope it made some sense, I'm sure Mike can come up with a better explanation. Some day I'll get some free time and write a 30k go bot

wineandgolover · #3

Can we assume that most monte carlo simulations use heuristics to determine possible best moves? You know, to trim the tree? Or are they using pure brute force?

Or instead of heuristics, maybe just memory? (I've seen a shape like this before, and can quickly refute the cut. This branch is terminated)

RBerenguel · #4

wineandgolover wrote:

Can we assume that most monte carlo simulations use heuristics to determine possible best moves? You know, to trim the tree? Or are they using pure brute force?

Or instead of heuristics, maybe just memory? (I've seen a shape like this before, and can quickly refute the cut. This branch is terminated)

Although I know nothing abotu specifics of go AIs, I'm 100% sure all at the same time. These kind of optimisations are the "lowest hanging fruit" and as soon as you want more speed or depth you have to pick them up. The heuristics to choose moves are probably based on locality-hotness-tactics as well as opening databases. For shape algorithms, there are shape databases being used in pachi or fuego, I don't remember. Actually you can get it to analyse your game looking for shapes common/uncommon in pro games. It's kind of fun, once I tried to find bad shape in my games using this (by looking for shapes pro's in my database didn't play.) It gave some interesting results, but nothing so useful as to improve my bad shapes

Bill Spight · #5

This may be more than you are looking for: http://www.math-info.univ-paris5.fr/~bo ... -Bouzy.pdf

It is the slides for a presentation. I have skimmed it and it looks pretty up to date. Bouzy, I have found, makes very clear slides.

I was searching for something he did a few years ago. This is more detailed and includes recent advances.

Mike Novack · #6

I'll try to give a conceptual description of the algorithm (ignoring the details).

The problem being solved --- given a set of potential moves, select the best.

The approach -- have two exactly equal players play out a largish number of pseudo random games beginning with each of these moves. Select the move where the percentage of wins is highest.

The devil is in the details I am leaving out. How is that initial set of potential moves created and what governs the moves made by these modestly bad equal opponents because go is played with time constraints.

can someone explain how computer go works to me?

Who is online