Webcam + Go

quantumf · Post by **quantumf** » Wed Aug 13, 2014 12:12 pm

pasky wrote: Hi! Thanks for the feedback, nice testcases. Indeed, a lot of improvement will probably be needed for many uses, mainly more sophisticated stone detection I guess. (Then again, for some uses like automatic tournament game transmission where you can set things up a bit, I think Imago might be almost good enough as it is now.)

I'm not sure if Tomas has any plans for making a commercial offering out of this; I certainly don't. But maybe having free, working codebase as a starting point might help someone to build one.

One initial worry is that it's quite slow, despite the C optimization - takes a good few seconds to parse an image on an i7 machine. That may be an issue in certain situations. Not sure exactly what you mean by automatic game transmission, but moves are often played faster than a few seconds apart.

Sure, having this code as a starting point is great. I do recall how exciting I found the idea of automatic board recognition and game capturing as a beginner, but once I learnt to recall my own games, it became a lot less interesting.

Some other technical comments for those wanting to try this out:

1. numpy seems to be required
2. PIL appears to be problematic - it was only after I uninstalled it that I got it to work. python-imaging seems to be sufficient, although I do admit that I tried quite a few other things along the way before I got it working. stackoverflow was very helpful.

pasky · Post by **pasky** » Wed Aug 13, 2014 12:30 pm

quantumf wrote:One initial worry is that it's quite slow, despite the C optimization - takes a good few seconds to parse an image on an i7 machine. That may be an issue in certain situations. Not sure exactly what you mean by automatic game transmission, but moves are often played faster than a few seconds apart.

As far as I understand (not 100% certain though), an order-of-magnitude speedup might not be too difficult to pull off. Plus, if you have a sequence of images shot from the same/similar position, it could be sped up even more, as the vast majority of the time is spent in the grid search.

Thanks for the technical notes you mentioned! For me, it worked pretty much out of the box on my Debian system after installing python-numpy and python-pygame packages.

GregA · Post by **GregA** » Wed Aug 13, 2014 3:36 pm

Nice work! For improving tolerance for glare, you could make use of the fact that stones are generally ellipses on intersections, with the gaps in the center between grid lines being bare board. If you cut out a square from the perspective-adjusted image, the center is the stone (or lack of a stone), the corners are the bare board.

You could either use that information locally - get some median color in the center and corners and compare. If the center is about the same as the corners, it's probably glare, not a white stone. Or you could sample the gaps, run a low-pass filter of some sort, and then use the resulting image as a gradient map to correct the whole image.
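The local comparison could be sketched roughly like this. It is only an illustration in Python 3, not code from Imago: the function name `centre_matches_corners`, the patch layout (a square list of rows of grey levels) and the `tolerance` of 20 grey levels are all assumptions.

```python
from statistics import median


def centre_matches_corners(patch, tolerance=20):
    """Return True when the centre of the patch looks like its corners.

    patch: square list of rows of grey levels (0-255), cut from the
    perspective-corrected image and centred on one intersection, so that
    the corners fall on bare board between the grid lines.
    tolerance: assumed maximum grey-level difference that still counts
    as "about the same".
    """
    n = len(patch)
    k = max(1, n // 4)          # side of each sampled sub-square (assumed)
    lo = (n - k) // 2
    centre = [patch[r][c]
              for r in range(lo, lo + k)
              for c in range(lo, lo + k)]
    edge_idx = list(range(k)) + list(range(n - k, n))
    corners = [patch[r][c] for r in edge_idx for c in edge_idx]
    return abs(median(centre) - median(corners)) <= tolerance


# A uniformly bright patch: centre and corners agree, so a bright reading
# here is more likely glare than a white stone.
glare = [[240] * 8 for _ in range(8)]
# Bare board (150) with a bright blob in the middle: probably a white stone.
stone = [[245 if 2 <= r <= 5 and 2 <= c <= 5 else 150 for c in range(8)]
         for r in range(8)]
```

With these two hand-made patches, `centre_matches_corners(glare)` is true and `centre_matches_corners(stone)` is false. A real threshold would have to be tuned on actual photos.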

It might also be useful to run a hough circle detector on the perspective-corrected image. Stones won't be perfect circles, depending on camera angle, but you should get some kind of blurry peak if there was a stone. If there's no peak and the center is not brighter than the corners, it's probably glare rather than a white stone.

Actually, dumping that image into gimp and running edge detect on it, the grid lines are very clear and the circles are generally pretty clear, too. You could run a Hough line filter and a Hough circle filter on the square around where you expect the stone. If there's a stone, you'd expect a circle and only very weak lines crossing through the center (if any). If there's no stone, you'd expect no circle detected and two strong lines. You could run both, compare the results, and let the stronger one decide stone or no stone. Then, once you've determined it's a stone, you could check the brightness to tell white from black.

quantumf · Post by **quantumf** » Thu Aug 14, 2014 1:49 am

pasky wrote:
quantumf wrote:One initial worry is that it's quite slow, despite the C optimization - takes a good few seconds to parse an image on an i7 machine. That may be an issue in certain situations. Not sure exactly what you mean by automatic game transmission, but moves are often played faster than a few seconds apart.
As far as I understand (not 100% certain though), an order-of-magnitude speedup might not be too difficult to pull off. Plus, if you have a sequence of images shot from the same/similar position, it could be sped up even more, as the vast majority of the time is spent in the grid search.

Thanks for the technical notes you mentioned! For me, it worked pretty much out of the box on my Debian system after installing python-numpy and python-pygame packages.

After sending my last comment I realized that an obvious solution was to just take lots of photos and queue them up, real-time conversion is not really necessary. Re-using the information from one board to the next would clearly help too.

However, I'm pretty certain the completely automated game transmission is a fantasy - too often players will play moves almost simultaneously, and you'll have multiple moves played while hands/arms are obscuring critical parts of the board. You'll still need sufficiently skillful game recorders to transcribe the correct move order in fast ko fights, which I presume is the bane of actual game recorders.

More technical comments: my out-of-the-box xubuntu (14.04) already had python-pil (the pillow fork) installed, so actually installing PIL confused my system. This was resolved by uninstalling PIL. pygame is not installed on my system, not sure what it's required for - imago seems to work fine without it. I initially tried to get it to compile on Windows...what a pain, I gave up after a few hours. Partly because Visual Studio is utterly messed up on my system, but even when I tried on another Windows machine, Windows and python seemed unhappy, insisting that initpcf wasn't there, when it clearly is. More competent people than me can probably get it right, though.

The license text that Tomáš chose is a bit odd - he's used the BSD 3 paragraph license, aka "BSD-new" aka "Modified BSD License", which is fine, but (a) he's not called it that, which is not the recommended way, and (b) he's at pains to say that he loathes the GNU GPL, even though I gather (from Wikipedia) that the FSF have endorsed the BSD-new license as compatible with the GNU GPL

GregA · Post by **GregA** » Thu Aug 14, 2014 9:58 am

You could do hough line filter and hough circle filter on the square around where you expect the stone. If there's a stone, you'd expect a circle and only very weak lines crossing through the center (if any). If there's no stone, you'd expect no circle detected and two strong lines. You could run both and compare the results and let the bigger one decide stone or no stone.

Actually, it's easier than that. If you take the edge-detect image from when you found the grid, do a perspective warp on it based on the corners, and look at the rectangle around a single intersection... the line and circle hough transforms get really simple. Loop over the edge pixels in the rectangle - count how many, A, are within a doughnut of some radius from the center and count how many, B, are within some tolerance of being on the center vertical or center horizontal line. Ignore pixels on both. Then get a ratio A/(A+B). Close to 0 means empty intersection, close to 1 means a stone. Find a reasonable threshold in the middle. And if A+B is close to 0, that means no edges - so you can't really say.
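A minimal sketch of that ratio test, in Python 3. It is not taken from Imago: the function name `stone_ratio`, the representation of edge pixels as a list of `(x, y)` tuples, and the two tolerance values are assumptions made for illustration.

```python
import math


def stone_ratio(edge_points, centre, radius, ring_tol=1.5, line_tol=1.5):
    """Return A / (A + B) for one intersection, or None if there are no edges.

    A counts edge pixels lying on a doughnut of the given radius around the
    centre; B counts edge pixels close to the centre vertical or centre
    horizontal line. Pixels that satisfy both are ignored.
    """
    cx, cy = centre
    a = 0
    b = 0
    for px, py in edge_points:
        on_ring = abs(math.hypot(px - cx, py - cy) - radius) <= ring_tol
        on_line = abs(px - cx) <= line_tol or abs(py - cy) <= line_tol
        if on_ring and on_line:
            continue            # ambiguous pixel, skip it
        if on_ring:
            a += 1
        elif on_line:
            b += 1
    if a + b == 0:
        return None             # no usable edges, so no verdict
    return float(a) / (a + b)


# Synthetic edge pixels: a circle of radius 10 and a cross, both centred
# on (12, 12).
circle = [(12 + 10 * math.cos(math.radians(t)),
           12 + 10 * math.sin(math.radians(t))) for t in range(0, 360, 10)]
cross = [(x, 12) for x in range(25)] + [(12, y) for y in range(25)]
```

On this synthetic data the circle gives a ratio of 1.0 and the cross gives 0.0; real edge images will land somewhere in between, which is where the threshold comes in.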

thouis · Post by **thouis** » Tue Aug 19, 2014 4:45 am

Are PIL/Pillow being used for more than loading/saving images? If not, then imread is a far simpler solution. (I do a lot of image processing in python).

pasky · Post by **pasky** » Wed Aug 20, 2014 7:02 am

quantumf: I think that Computer Go based approaches could actually produce pretty good sequence disambiguators, maybe better than a lot of human scribes! (...who can, though, ask other observers for help.) Also, even if a sequence is not perfect, if the end result is correct, some rate of mistakes may be acceptable even in the real world.

GregA: Cool ideas! Now can you please transform that to actual code...?

(I think when Tomáš tried some similar approaches, it turned out that stones are in fact often quite unaligned with the grid.)

Jujube · Post by **Jujube** » Wed Aug 27, 2014 1:47 pm

http://go-tracer.appspot.com/

Anyone seen this one?

levelonedev · Post by **levelonedev** » Wed Aug 27, 2014 8:00 pm

Tested it with a picture of my go board. This is the result.

[Attachment: oops.JPG. Caption: It didn't work out so nicely.]

GregA · Post by **GregA** » Sun Oct 19, 2014 3:56 pm

pasky wrote:GregA: Cool ideas! Now can you please transform that to actual code...? (I think when Tomáš tried some similar approaches, it turned out that stones are in fact often quite unaligned with the grid.)

I'll see what I can do. Has anyone got this running under Windows? I'm looking for the shortest path to get the code up and running. This isn't a reflection on Tomáš, more the general problems of open-source development in a Windows environment, but I see 3 paths to getting it going and none seem particularly easy or fun:

1. Try to compile pcf.c with the exact same compiler and settings as the python executable was built with. Thinkable, but I've gone down that route before and it's many hours of work just to get things compiling and running.
2. Install linux and develop in there. It sounds like that is especially difficult with Windows 8, probably no easier than the above.
3. Rewrite pcf.c in python. I don't care about the speed too much at the moment. If Tomáš has an old pre-optimization copy lying around, that would be great. Otherwise, it's probably still easier than the above, unfortunately.

For the stones not being aligned, one of the papers out there mentions that once you find the grid and perspective transform, the stones themselves have thickness. Rather than pretending that they are in the plane of the board, you should assume they are a few millimeters above the grid. This different perspective transform fixes the misalignment bias so that all you're left with is how sloppily the stones are placed on the board, which should give better results.

Anyway, yes, I'm looking forward to diving into the code, as I'm better at and enjoy adjusting algorithms far more than troubleshooting development environments to get a codebase running.

EDIT: Update... I hacked up a quick pcf.py that is a drop-in replacement for pcf.c (fortunately pcf.c was still very close to the original python code). With that, and installing a few things (I installed the win32 python 2.7, and also the python 2.7 win32 versions of pygame, PIL, and numpy from http://www.lfd.uci.edu/~gohlke/pythonlibs/), I was able to run the test image through and it worked! That's all for tonight, but I'll see about digging in to the meat of the code another night.
Also, btw, to get the full debugging output that shows visualizations along the way, I also installed these packages from the same site: matplotlib, six, dateutil, and pyparsing.

quantumf · Post by **quantumf** » Sun Oct 19, 2014 11:34 pm

GregA wrote: Install linux and develop in there. It sounds like that is especially difficult with windows 8 - probably no easier than the above.

That's what I ended up doing. Wrestled with Visual Studio and eventually gave up. Admittedly my Windows box needs a reinstall, too many versions of Visual Studio have been installed, and it's virtually unusable now, at least from the command line.

I used VirtualBox, using Windows as the host, which works fine. I'm not aware of any issues with Windows 8 as the host.

GregA wrote: EDIT: Update... I hacked up a quick pcf.py that is a drop-in replacement for pcf.c (fortunately pcf.c was still very close to the original python code). With that, and installing a few things (I installed the win32 python 2.7, and also the python 2.7 versions of pygame, PIL, and numpy - all compiled for win32) I was able to run the test image through and it worked! That's all for tonight, but I'll see about digging in to the meat of the code another night.

Did you upload it to github? If not, can you share it some way?

Rémi · Post by **Rémi** » Mon Oct 20, 2014 11:44 pm

GregA wrote: I see 3 paths to getting it going and none seem particularly easy or fun:

It may be too late now, but just in case: cygwin is a really good linux-like environment for Windows.

Rémi

GregA · Post by **GregA** » Tue Oct 21, 2014 8:17 am

quantumf wrote:Did you upload it to github? If not, can you share it some way?

Python is notoriously picky about spacing, so you might need to fix up all the indenting, but this might work:

Code:

"""Unoptimized pcf module - rewritten to avoid having to compile pcf.c.

Images are passed as Python 2 byte strings, one character per pixel,
stored row by row.
"""

import math


def combine(im_bg, im_fg):
    """Mean of the im_bg pixels at positions where im_fg is non-zero."""
    total = 0
    area = 0
    for i in range(len(im_bg)):
        if im_fg[i] != chr(0):
            total += ord(im_bg[i])
            area += 1
    # Raises ZeroDivisionError when im_fg is entirely zero.
    return float(total) / float(area)


def edge(size, image):
    """5x5 edge filter: sum of the 24 neighbours minus 24 times the centre
    pixel, clamped to 0..255. A two-pixel border is left at zero."""
    x, y = size
    n_image = len(image) * [chr(0)]
    for i in range(2, x - 2):
        for j in range(2, y - 2):
            # Summing all 25 pixels and subtracting 25 * centre equals
            # (24 neighbours) - 24 * centre.
            total = -25 * ord(image[x * j + i])
            for dj in range(-2, 3):
                for di in range(-2, 3):
                    total += ord(image[x * (j + dj) + i + di])
            n_image[x * j + i] = chr(min(255, max(0, total)))
    return "".join(n_image)


def hough(size, image, init_angle, dt):
    """Hough transform of an edge image.

    Row a of the result corresponds to the angle init_angle + a * dt,
    the column to the distance from the image centre (offset by x / 2).
    """
    x, y = size
    matrix = len(image) * [0]
    d = x // 2
    for i in range(x):
        b = i - x // 2
        for j in range(y):
            c = j - y // 2
            if image[j * x + i] != chr(0):
                # Every edge pixel votes once per angle.
                for a in range(y):
                    th = a * dt + init_angle
                    distance = b * math.sin(th) - c * math.cos(th) + d
                    column = int(distance + 0.5)
                    if 0 <= column < x:
                        matrix[a * x + column] += 1
    # Rescale the vote counts to 0..255.
    minimum = min(matrix)
    maximum = max(matrix) - minimum + 1
    n_image = [chr(int(float(m - minimum) / float(maximum) * 256.0))
               for m in matrix]
    return "".join(n_image)

And by the way, I've been thinking about how machine learning might be used to improve grid finding. I tried out some random photos from the web and the line finding worked great, but the grid was often pretty far off. My thought is that clever algorithms can only get you so far, but machine learning with a lot of data can go farther. The big first question is always how to get a lot of training data. Here's a plan I'm going to try out when I get a chance:

1. Get an SGF file of a game and print out the move order.
2. Have a helper place the stones one by one in order.
3. For each stone, first take a very clear top-down photo with the board nearly filling the frame.
4. Then take 9 other photos from different perspectives, within 45 degrees of pitch and yaw.
5. Number the photos something like Board-XXX-YYY-Z (X is SGF #, Y is move #, Z is photo # for that move).
6. Then you can automate a process: run the existing algorithm to find lines, then do a search to find corners consistent with the known stone positions, and save these off.
7. Verify corner positions: run through all images and their saved-off corner positions, display zoomed in on the corners and iterate through all images. The user hits Y to accept as good and N to reject as wrong.

At the end of it, you'll have about 1000 sample images with known grid corners and known stone positions. Once you've done it a couple of times, you could probably gather 1000 images with 30 minutes of human effort. A few hours would give you 10,000 samples - and you'd have the lines found, the corners, and the stone positions, everything you need to train for both grid finding and stone recognizing.
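The Board-XXX-YYY-Z numbering from the plan above could be handled with a pair of small helpers. This is just a sketch in Python 3; the helper names, the three-digit zero padding and the single-digit photo number (0 for the top-down shot, 1-9 for the angled ones) are assumptions, since the plan only says "something like" that pattern.

```python
import re

# Board-<sgf #>-<move #>-<photo # within the move>, without file extension.
PHOTO_NAME = re.compile(r"^Board-(\d{3})-(\d{3})-(\d)$")


def photo_name(sgf, move, shot):
    """Build a photo name such as 'Board-001-042-0'."""
    return "Board-%03d-%03d-%d" % (sgf, move, shot)


def parse_photo_name(name):
    """Split a photo name back into (sgf, move, shot) integers."""
    match = PHOTO_NAME.match(name)
    if match is None:
        raise ValueError("not a Board-XXX-YYY-Z name: %r" % (name,))
    return tuple(int(group) for group in match.groups())


# All ten names for move 42 of SGF 1: one top-down shot plus nine angled.
names_for_move = [photo_name(1, 42, shot) for shot in range(10)]
```

Being able to parse the names back is what lets the later automated steps look up the known stone positions for each photo from the SGF and the move number.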

Rémi · Post by **Rémi** » Sat Oct 25, 2014 3:48 am

GregA wrote:And by the way, I've been thinking about how machine learning might be used to improve grid finding.

If you are interested, I posted my dataset to the computer-go mailing list:
https://groups.google.com/d/msg/compute ... Y7zQwVKtwJ
I agree that machine learning helps a lot for this kind of task. It helped me for kifu-snap.

Rémi

GregA · Post by **GregA** » Sun Nov 02, 2014 10:17 am

GregA wrote:At the end of it, you'll have about 1000 sample images with known grid corners and known stone positions. Once you've done it a couple of times, you could probably gather 1000 images with 30 minutes of human effort. A few hours would give you 10,000 samples - and you'd have the lines found, the corners, and the stone positions, everything you need to train for both grid finding and stone recognizing.

FYI, I took some time to do something like the above. I ended up with 2400 images for a 240 move game from a book. Some suggestions for anyone who wants to give it a shot:

1. Set your camera's resolution low. I left mine at full-res, which probably added an extra second per image just for the camera to snap it. When processing with imago, the first thing it does is cut it down to 640x480, I think, so that is mostly wasted. I filled a 16 GB SD card, and just copying to my computer took a long time.
2. Keeping track of move number and number of images per move is tricky when going fast. I got confused a few times, so sometimes I got only 8 images per move and sometimes 15.
3. It's very helpful to start each 10 photo batch with a nice clear full-frame top-down shot. With that, I found it very easy to open an explorer window with large icons and select / move each batch of 10 photos to a separate move-NNN folder.
4. I misplayed some stones. A helper would have prevented that, I think. As is, it isn't really possible to get a single coherent sgf file for the whole game I captured, as there are 2-3 times where I misplayed, and then a few turns later moved a stone or two. One fixup solution could be to just find those moves in my image set and redo them, but it would be nice to not have to worry about it.
5. Using a tablet with an SGF viewer to click through the moves would probably have been a better choice than using a book.
6. It probably took me 3 hours of snapping photos. Taking into account some of the above, I could probably do it in 1 hour next time.

I'm about halfway through running imago through all the images. My first impressions are:

1. As expected, nearly empty boards get very nice hough transforms, nearly full boards get messy hough transforms. I think stone edge detection at an early stage will be necessary for highly accurate grid detection with nearly full boards.
2. Line detection could probably be improved. It seemed like some useful information in the hough transforms was lost. I bet an algorithm that knows a bit more about go board hough transforms and where the important information generally is would be able to tease out more useful information, even in messier end-game images.
3. That said, in nearly all cases, the outer detected lines give you a quadrilateral that fully encloses the grid, though it often includes the outer board edges and sometimes nearby table edges. I bet this could be useful as a rough area to do stone edge detection within, and as a sanity check for detected grids that are heavily skewed or drastically smaller.
4. There are a lot of errors in grid detection where it misjudges the grid spacing. It looks like perspective isn't being taken into account. If you look at the hough transforms, you can see one of the dotted lines starting with narrow gaps and the gaps growing as you move along. The end result is a half-size grid that is otherwise aligned to the board. This might be an opportunity for machine learning, using the hough images as input and grid spacing as output. Or maybe just some simple tweaks to the algorithm.
5. When a correct grid is found, it is highly accurate. I'll get some stats eventually, but aside from mistaking the board edges for grid boundaries, you rarely get a grid that is close enough to look right yet wrong enough to cause problems down the pipeline.
6. Stone color detection needs work. Probably the two things that would help the most are using stone edge detection to improve classifying board vs stone, and shifting the grid vertically to account for the perspective shift of stones in the back corners. Once you know the skewed grid, you can compute the perspective shift: transform a vector that points up a few millimeters into pixel space to find a better offset for the stone centers. Also, this is a great candidate for machine learning. I have 2400 images, which is almost 1 million images of individual intersections (2400 x 361 = 866,400). Once my data is fully marked up, I would imagine doing the perspective warp and vertical shift for all the images, resulting in a whole bunch of small rectangular images with known white/black/board classification. As these images would be very small, roughly 25x25 pixels, this would be pretty straightforward to feed into a neural network. I figure the inputs would be 3x25x25 for the HSV components of the warped original image, plus 1x25x25 for the warped edge detect image. The output would be a three-choice classifier: white stone, black stone, board. Of all the components, this seems like it would be the most straightforward to apply machine learning to.
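To make the proposed network input concrete, here is a rough Python 3 sketch of packing one intersection into the 3x25x25 HSV planes plus the 1x25x25 edge plane. Nothing here comes from Imago: the function name `patch_features`, the nested-list patch layout and the scaling of every value to the range 0..1 are assumptions, and only the standard-library `colorsys` module is used.

```python
import colorsys

PATCH = 25                              # patch side in pixels, as suggested
CLASSES = ("white", "black", "board")   # the three output choices


def patch_features(rgb_patch, edge_patch):
    """Flatten one intersection into 4 * 25 * 25 = 2500 input values.

    rgb_patch: PATCH rows of PATCH (r, g, b) tuples, each channel 0-255,
               cut from the perspective-warped original image.
    edge_patch: PATCH rows of PATCH grey levels (0-255) from the warped
                edge-detect image.
    The result is laid out plane by plane: all H, then S, then V, then edge.
    """
    h_plane, s_plane, v_plane, e_plane = [], [], [], []
    for rgb_row, edge_row in zip(rgb_patch, edge_patch):
        for (r, g, b), e in zip(rgb_row, edge_row):
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            h_plane.append(h)
            s_plane.append(s)
            v_plane.append(v)
            e_plane.append(e / 255.0)
    return h_plane + s_plane + v_plane + e_plane


# A pure white patch with every pixel marked as an edge, just to check sizes.
white = [[(255, 255, 255)] * PATCH for _ in range(PATCH)]
edges = [[255] * PATCH for _ in range(PATCH)]
features = patch_features(white, edges)
```

Each marked-up intersection would then become one 2500-value input vector paired with one of the three labels in `CLASSES`.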

Anyway, once I get the data cleaned and marked up a bit, I'd be happy to share it, and even happier if someone were to take a stab at getting another 2000+ images. I don't have a good place to drop the ~1 Gig of data, though. If someone really plans on working on this, I'd be happy to burn a DVD and mail it. Also, if someone wants to try taking some photos, I'd be happy to give more detailed advice.

Life In 19x19
