Being interested in linguistic corpora, I was looking at some search algorithms, such as Rabin-Karp. These generally involve hash functions, and that got me thinking about go position searches when I moved on to my next task, something go-related. The only one I know about is Kombilo, and I got to wondering how it works.
Using hashing for chess positions is, if not exactly easy in practical terms, not too hard to follow. The limited board size and the convenient marriage of 64 squares and 64 bits make the hashing principle there reasonably straightforward. I imagine most of the cleverness of the top programmers goes into things like bit magic. I used to work with some on shogi, and it was fascinating to see how they grappled with the upgrade to 9x9 squares. The first question was: has anybody got a 9-bit chip?
But go is not only many times bigger - the task seems quite different. If it were just a question of searching for matches of whole-board positions using a cached, pre-hashed databank, that would not seem specially interesting. But Kombilo (a magnificent beast) allows you to make searches of rectangles of any size within a cache built (as I infer) from 19x19 hashing. I can't get my head round that at all except by imagining hashing and caching every possible rectangle, which is clearly nowhere near feasible. As a reference point, on my machine Kombilo seems to use about 2G on a 110,000-game database that occupies about 15MB. That seems insanely economical.
Can anyone lift the veil on the magic?
Making a hash of it
-
John Fairbairn
- Oza
- Posts: 3724
- Joined: Wed Apr 21, 2010 3:09 am
- Has thanked: 20 times
- Been thanked: 4672 times
-
bernds
- Lives with ko
- Posts: 259
- Joined: Sun Apr 30, 2017 11:18 pm
- Rank: 2d
- GD Posts: 0
- Has thanked: 46 times
- Been thanked: 116 times
Re: Making a hash of it
Kombilo (and I've used essentially the same algorithm in q5go) stores two bitmaps per game. These hold the positions where white and black stones appeared at any point during the game. It's not quite the same as the final position, since captured stones are also recorded here. I'm not sure I'd use the term "hashes" for these bitmaps, but I suppose one could.John Fairbairn wrote:But go is not only many times bigger - the task seems quite different. If it were just a question of searching for matches of whole-board positions using a cached, pre-hashed databank, that would not seem specially interesting. But Kombilo (a magnificent beast) allows you to make searches of rectangles of any size within a cache built (as I infer) from 19x19 hashing. I can't get my head round that at all except by imagining hashing and caching every possible rectangle, which is clearly nowhere near feasible. As a reference point, on my machine Kombilo seems to use about 2G on a 110,000-game database that occupies about 15MB. That seems insanely economical.
Can anyone lift the veil on the magic?
The search is then a two-pass affair. First, for every game, you try to match the pattern against these bitmaps.
Let's say you're looking for
Code: Select all
XO
OXCode: Select all
OO XX
.O X.I believe Kombilo also uses something else for each game that's more conventionally a hash value. These are used as (hopefully) unique IDs and IIRC they are used to identify games and print annotations such as "Commentary in Go World 72".
-
John Tilley
- Dies with sente
- Posts: 83
- Joined: Tue Mar 01, 2016 2:28 pm
- Rank: now 1kyu-ish
- GD Posts: 0
- Has thanked: 6 times
- Been thanked: 65 times
Re: Making a hash of it
John - Kombilo uses a Dyer signature to identify each game - that's the position of the six moves 20, 40, 60, 31, 51, 71 in SGF format.
There was some discussion on Sensei's Library about hashing and searching at:
https://senseis.xmp.net/?SearchAlgorithms
You can use the Kombilo database in your own Python scripts to mine your SGF files - worth looking at before you rush into any coding. The PDF in the Kombilo Documentation covers this.
Take Care - John
There was some discussion on Sensei's Library about hashing and searching at:
https://senseis.xmp.net/?SearchAlgorithms
You can use the Kombilo database in your own Python scripts to mine your SGF files - worth looking at before you rush into any coding. The PDF in the Kombilo Documentation covers this.
Take Care - John
-
John Fairbairn
- Oza
- Posts: 3724
- Joined: Wed Apr 21, 2010 3:09 am
- Has thanked: 20 times
- Been thanked: 4672 times
Re: Making a hash of it
Thanks for replies. It seems I made a rash assumption that hashing was involved. I did this partly because Ulrich mentions hashing a lot and even offers a function within the program for corner searches with or without hashing. But furtehr scrutiny does suggest he never exactly says he used hashing (hidden pun not intended!).
I also assumed that doing a hash look-up on cached data was the only way to achieve the sort of speed he gets, though he did once say he used C for a portion of the program to achieve speed.
I'm not looking to code anything myself. After all, I have Kombilo already, and have written a workaround to get over the problem that it crashes with date of 2000 and over. But the SL pages mentioned, supplementing the remarks above, look as if they will give me the "tourist" view I was looking for - though the main Ulrich link on the SL page is now dead
. Is that because it's 2021?
I also assumed that doing a hash look-up on cached data was the only way to achieve the sort of speed he gets, though he did once say he used C for a portion of the program to achieve speed.
I'm not looking to code anything myself. After all, I have Kombilo already, and have written a workaround to get over the problem that it crashes with date of 2000 and over. But the SL pages mentioned, supplementing the remarks above, look as if they will give me the "tourist" view I was looking for - though the main Ulrich link on the SL page is now dead
-
bernds
- Lives with ko
- Posts: 259
- Joined: Sun Apr 30, 2017 11:18 pm
- Rank: 2d
- GD Posts: 0
- Has thanked: 46 times
- Been thanked: 116 times
Re: Making a hash of it
I looked at it for a while but failed. I'm not enough of a Python export, and had no idea how the integration of Python and C++ code needed to be dealt with. I found it easier to just reimplement the search algorithm.Marcel Grünauer wrote:Speaking of kombilo, has anyone found a way to get it to run under Python 3? The developer says it's incompatible with Python 3 but I'm wondering how much effort would be involved to make it compatible. (I'm neither a Python nor a C++ expert.)
-
bernds
- Lives with ko
- Posts: 259
- Joined: Sun Apr 30, 2017 11:18 pm
- Rank: 2d
- GD Posts: 0
- Has thanked: 46 times
- Been thanked: 116 times
Re: Making a hash of it
Oh yeah, now that you mention it, I kind of remember seeing something like this. But I never looked deeply into it since the general case search seems to be fast enough, especially for corner patterns, so I felt that a special case wasn't needed.John Fairbairn wrote:Thanks for replies. It seems I made a rash assumption that hashing was involved. I did this partly because Ulrich mentions hashing a lot and even offers a function within the program for corner searches with or without hashing.
-
bernds
- Lives with ko
- Posts: 259
- Joined: Sun Apr 30, 2017 11:18 pm
- Rank: 2d
- GD Posts: 0
- Has thanked: 46 times
- Been thanked: 116 times
Re: Making a hash of it
No. What exactly would you need?Marcel Grünauer wrote:I see that q5Go has a pattern matcher; is there a standalone version like libkombilo that can be used in scripts?bernds wrote:I looked at it for a while but failed. I'm not enough of a Python export, and had no idea how the integration of Python and C++ code needed to be dealt with. I found it easier to just reimplement the search algorithm.
-
aeb
- Dies with sente
- Posts: 101
- Joined: Wed Dec 04, 2013 7:08 pm
- GD Posts: 0
- Has thanked: 5 times
- Been thanked: 36 times
Re: Making a hash of it
sgfutils has sgfinfo that can select games from a game collection given some partial metainformation or some moves or some board pattern. See sgfinfo.html. It can be used in scripts. It was written for a Linux environment.Marcel Grünauer wrote:I see that q5Go has a pattern matcher; is there a standalone version like libkombilo that can be used in scripts?