
All times are UTC - 8 hours [ DST ]




 Post subject: Re: Extending SGF?
Post #21 Posted: Thu May 02, 2019 5:43 am 
Dies in gote

Posts: 38
Liked others: 4
Was liked: 20
https://www.red-bean.com/sgf/sgf4.html

Quote:
"SGF is a text-only format (not a binary format). It contains game trees, with all their nodes and properties, and nothing more."

[...]

Only one of each property is allowed per node, e.g. one cannot have two comments in one node:
... ; C[comment1] B [dg] C[comment2] ; ...


There's some terminology confusion that maybe I'm contributing to, but property names are what I'm calling keys.

You cannot do ;AB[dd]AB[pp] -- same key twice in a node is disallowed. Each key in a node should be unique.

But you can do ;AB[dd][pp] -- keys can have more than 1 value. Any handicap game will prove this. It's also used for markup, e.g. TR, MA, SQ, CR, AR, and that sort of thing which indicate triangles, crosses, etc. If you want multiple triangles in a node you'll need a key (property name) attached to multiple values.

Many SGF parsers (including mine) will tolerate the first form and treat it like the second.

As a final note, the SGF specs are annoying and will tell you that [dd][pp] is specifying a single value of type "list". While you can think about it that way, it is much, much simpler to consider this form as indicating multiple values. You can dispose of the notion of values having types (every value is a string, really) until you need to interpret them. This is pretty normal to do, e.g. Sabaki's sgf API considers a property to be a key which retrieves an array of strings (i.e. the values).
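The "one key, many string values" view described above can be sketched as a tiny reader for a single node. This is only an illustration, not the code of any existing parser: `parseNodeProperties` is a hypothetical name, it expects the text of one node without the leading `;`, it assumes uppercase keys, and it merges a repeated key into the earlier one, as the tolerant parsers mentioned above do.

```typescript
// Hypothetical helper: read the properties of ONE SGF node into a map
// from key (property name) to a list of raw string values.
function parseNodeProperties(node: string): Map<string, string[]> {
  const props = new Map<string, string[]>();
  let i = 0;
  while (i < node.length) {
    // Skip whitespace between properties.
    if (/\s/.test(node.charAt(i))) { i++; continue; }
    // Read the key: a run of uppercase letters (assumption of this sketch).
    let key = "";
    while (i < node.length && /[A-Z]/.test(node.charAt(i))) {
      key += node.charAt(i);
      i++;
    }
    if (key === "") throw new Error(`unexpected character at offset ${i}`);
    // A repeated key reuses the list created earlier, so AB[dd]AB[pp]
    // ends up identical to AB[dd][pp].
    const values = props.get(key) ?? [];
    for (;;) {
      // Whitespace is tolerated before each '['.
      let j = i;
      while (j < node.length && /\s/.test(node.charAt(j))) j++;
      if (node.charAt(j) !== "[") break;
      i = j + 1;
      let value = "";
      while (i < node.length && node.charAt(i) !== "]") {
        // A backslash makes the next character part of the value.
        if (node.charAt(i) === "\\" && i + 1 < node.length) i++;
        value += node.charAt(i);
        i++;
      }
      if (i >= node.length) throw new Error("unterminated property value");
      i++; // skip the closing ']'
      values.push(value);
    }
    props.set(key, values);
  }
  return props;
}
```

Every value stays a plain string here; interpreting `dd` as a point or `6.5` as a number is left to whoever reads the map.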

 Post subject: Re: Extending SGF?
Post #22 Posted: Fri May 17, 2019 5:23 am 
Lives with ko

Posts: 151
Location: Belgium
Liked others: 11
Was liked: 48
Rank: 2d
KGS: LordVader
Personally, I think it may be a good time to drop SGF and introduce a new format.

Here is what ZBaduk uses internally: https://gist.github.com/bvandenbon/56e3811d279c5c6a7a1a91d7d902d6dc
It's almost a direct translation of SGF to JSON.

One minor tweak in this format is that it groups the traditional properties into objects.
(This is just a quick copy from the TypeScript file.)
Code:
  public moveProperties: MoveProperties = new MoveProperties();
  public setupProperties: SetupProperties = new SetupProperties();
  public nodeAnnotationProperties: NodeAnnotationProperties = new NodeAnnotationProperties();
  public moveAnnotationProperties: MoveAnnotationProperties = new MoveAnnotationProperties();
  public markupProperties: MarkupProperties = new MarkupProperties();
  public timingProperties: TimingProperties = new TimingProperties();
  public miscProperties: MiscProperties = new MiscProperties();
  public scoreEstimateProperties: ScoreEstimateProperties = new ScoreEstimateProperties();


In fact, I could provide TypeScript interfaces (definitions) for the entire thing.
That could perhaps be the start of an open standard, as an alternative to hard-to-parse SGF.
On top of that, I could provide an SGF parser that loads SGF files into the same structure,
making it 100% compatible.

As for AI properties, I propose this addition:
We could just add a statisticsProperties object to the list.
That object could contain properties like: bot, version, stats (which is an array).
A single stat should contain properties like: winrate, playouts, visits, ...

And that's why I think these properties don't belong in SGF, but should only exist in this new format.
It's just too hard to create layers (e.g. collections) in the traditional SGF format.
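As a rough idea of what such a statisticsProperties definition might look like, here is a minimal sketch. Only the names bot, version, stats, winrate, playouts and visits come from the post; the interface names, the `move` field, the open-ended extra fields and the `isStatisticsProperties` check are assumptions made for illustration and are not taken from ZBaduk.

```typescript
// Hypothetical shape of one entry in the stats array.
interface MoveStat {
  move: string;        // assumption: the move the numbers refer to
  winrate?: number;
  playouts?: number;
  visits?: number;
  // Engine-specific extras are allowed, so the format is not tied to one bot.
  [extra: string]: unknown;
}

// Hypothetical shape of the proposed statisticsProperties object.
interface StatisticsProperties {
  bot: string;
  version: string;
  stats: MoveStat[];
}

// Shallow runtime check for data loaded from a file.
function isStatisticsProperties(x: unknown): x is StatisticsProperties {
  if (typeof x !== "object" || x === null) return false;
  const o = x as Record<string, unknown>;
  return (
    typeof o["bot"] === "string" &&
    typeof o["version"] === "string" &&
    Array.isArray(o["stats"])
  );
}
```

Keeping every statistic optional means a bot that only reports a winrate and a bot that reports visits, playouts and more can share the same structure.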

_________________
Enjoy LeeLaZero and KataGo from your webbrowser, without installing anything !
https://www.zbaduk.com

 Post subject: Re: Extending SGF?
Post #23 Posted: Fri May 17, 2019 5:36 am 
Lives in sente

Posts: 757
Liked others: 114
Was liked: 916
Rank: maybe 2d
If there is going to be a new standard, I hope it will stop using the silly alphabetic encoding for coordinates that only permits board sizes up to 25x25, and just start using zero-indexed or one-indexed integers. That part of the SGF spec confuses me, it specifically reduces the futureproofness and flexibility of the format, increases parsing difficulty (you have to skip the letter 'i'!), and all that for at best a slight gain in human readability, when end users rarely try to read an SGF file to the point of mentally parsing coordinates anyway.

While we're at it, explicit support for non-square board sizes in the format would also be nice, specifying width and height separately.

 Post subject: Re: Extending SGF?
Post #24 Posted: Fri May 17, 2019 6:34 am 
Lives with ko

Posts: 151
Location: Belgium
Liked others: 11
Was liked: 48
Rank: 2d
KGS: LordVader
My personal goal and motivation:
I would like a file format that can be used by Lizzie, ZBaduk, Sabaki, Go Review Partner, and all those AI viewers.

And I have the impression that what holds us back is all about the syntax: ";(X[])".

There have been many attempts to come up with XML formats, but I've never seen a successful one. That's why I don't want to make it too innovative either.
Personally, I just want to change the structure, not the tags or element names per se.

I could live with numeric coordinates though.

TL;DR:
But if we do make the coordinates numeric, then I propose a 0-based numeric format,
reserving -1 ; -1 for a pass. I think that would be reasonable.
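A 0-based numeric scheme with -1 ; -1 as a pass could be bridged from existing SGF coordinates roughly as follows. This is a sketch under assumptions: `Point`, `PASS`, `sgfToPoint` and `isPass` are hypothetical names, x is the column from the left and y the row from the top as in SGF, and only the FF[4] empty value (as in `B[]`) is treated as a pass.

```typescript
interface Point { x: number; y: number; }

// The proposed pass marker: -1 ; -1.
const PASS: Point = { x: -1, y: -1 };

// 'a'..'z' map to 0..25 and 'A'..'Z' to 26..51; anything else is rejected.
function letterToIndex(ch: string): number {
  const code = ch.charCodeAt(0);
  if (code >= 97 && code <= 122) return code - 97;       // a-z
  if (code >= 65 && code <= 90) return code - 65 + 26;   // A-Z
  throw new Error(`invalid SGF coordinate letter: ${ch}`);
}

// Convert an SGF move value such as "dd" into a 0-based numeric point.
function sgfToPoint(value: string): Point {
  if (value === "") return PASS; // empty value means pass in FF[4]
  if (value.length !== 2) throw new Error(`invalid SGF point: ${value}`);
  return { x: letterToIndex(value.charAt(0)), y: letterToIndex(value.charAt(1)) };
}

function isPass(p: Point): boolean {
  return p.x === -1 && p.y === -1;
}
```

With this, `sgfToPoint("dd")` gives x 3, y 3, and a consumer can test `isPass` instead of special-casing an empty string.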

lightvector wrote:
While we're at it, explicit support for non-square board sizes in the format would also be nice, specifying width and height separately.


My fear is that once you start messing with the shape of the board, less than 1% of software will implement it. The same goes for 3-color go, circular boards, boards without edges, not to mention 3D-shaped boards, ...
April 1st comes only one day a year, so it's a lot of effort for something that will rarely ever be used.

 Post subject: Re: Extending SGF?
Post #25 Posted: Fri May 17, 2019 4:36 pm 
Dies in gote

Posts: 38
Liked others: 4
Was liked: 20
lightvector wrote:
If there is going to be a new standard, I hope it will stop using the silly alphabetic encoding for coordinates that only permits board sizes up to 25x25, and just start using zero-indexed or one-indexed integers. That part of the SGF spec confuses me, it specifically reduces the futureproofness and flexibility of the format, increases parsing difficulty (you have to skip the letter 'i'!)


This isn't so. Firstly, SGF supports up to size 52, e.g. try (;SZ[52];B[XX];W[dW];B[cc];W[Wd]) in a good viewer, e.g. Sabaki.

Secondly, one does not "skip the i" when parsing. The viewer itself may or may not display coordinates like that, but the SGF format has no concept whatsoever that i is skipped. Try (;SZ[19];B[ii])

Is it possible you're talking about GTP rather than SGF? The two aren't related.
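To make the distinction concrete: the "skip the letter I" rule and the 25-column ceiling belong to GTP vertices such as `Q16`, not to SGF. A minimal sketch of reading a GTP vertex is below; `gtpVertexToPoint` and `GTP_COLUMNS` are hypothetical names, the result is 0-based with y counted from the top row (an assumption chosen to match SGF's orientation), and `null` stands for a pass.

```typescript
// GTP column letters: A..Z without I, which leaves 25 columns.
const GTP_COLUMNS = "ABCDEFGHJKLMNOPQRSTUVWXYZ";

// Convert a GTP vertex such as "Q16" into a 0-based point, or null for "pass".
function gtpVertexToPoint(
  vertex: string,
  boardSize: number
): { x: number; y: number } | null {
  const v = vertex.toUpperCase();
  if (v === "PASS") return null;
  // One column letter (never I) followed by a one- or two-digit row number.
  if (!/^[A-HJ-Z][0-9]{1,2}$/.test(v)) {
    throw new Error(`not a GTP vertex: ${vertex}`);
  }
  const x = GTP_COLUMNS.indexOf(v.charAt(0));
  const row = Number.parseInt(v.slice(1), 10);
  if (x >= boardSize || row < 1 || row > boardSize) {
    throw new Error(`vertex outside a ${boardSize}x${boardSize} board: ${vertex}`);
  }
  // GTP rows count upward from the bottom edge; flip so that y = 0 is the top row.
  return { x, y: boardSize - row };
}
```

On a 19x19 board, `Q16` comes out as x 15, y 3, which is the same point SGF writes as `pd`.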

As for spook's ideas, certainly JSON is far nicer to deal with than XML but I'm not exactly seeing the need. SGF is not "hard to parse", it is easy to parse. It is not hard to create collections, it is trivial.


Last edited by Amtiskaw on Fri May 17, 2019 5:08 pm, edited 1 time in total.
 Post subject: Re: Extending SGF?
Post #26 Posted: Fri May 17, 2019 5:03 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
lightvector wrote:
While we're at it, explicit support for non-square board sizes in the format would also be nice, specifying width and height separately.



(;FF[4]ST[2]GM[1]CA[UTF-8]AP[GOWrite:3.0.15]SZ[5:7]PM[2]FG[259:]PB[ ]PW[ ]GN[ ]
)

5x7 board. :)
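Reading that `SZ` value takes only a few lines. The sketch below assumes the FF[4] convention that a single number means a square board and `columns:rows` means a rectangular one; `parseBoardSize` is a hypothetical name, and the 1 to 52 range check reflects the letter encoding discussed earlier in the thread.

```typescript
// Parse an SZ value: "19" (square) or "5:7" (columns:rows).
function parseBoardSize(sz: string): { columns: number; rows: number } {
  const m = /^([0-9]+)(?::([0-9]+))?$/.exec(sz.trim());
  if (m === null) throw new Error(`invalid SZ value: ${sz}`);
  const columns = Number(m[1]);
  // The second group is absent for square boards.
  const rows = m[2] === undefined ? columns : Number(m[2]);
  for (const n of [columns, rows]) {
    if (n < 1 || n > 52) throw new Error(`board size out of range: ${sz}`);
  }
  return { columns, rows };
}
```

For the record above, `parseBoardSize("5:7")` yields 5 columns and 7 rows.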

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

 Post subject: Re: Extending SGF?
Post #27 Posted: Sat May 18, 2019 12:17 am 
Lives in sente

Posts: 757
Liked others: 114
Was liked: 916
Rank: maybe 2d
Amtiskaw wrote:
lightvector wrote:
If there is going to be a new standard, I hope it will stop using the silly alphabetic encoding for coordinates that only permits board sizes up to 25x25, and just start using zero-indexed or one-indexed integers. That part of the SGF spec confuses me, it specifically reduces the futureproofness and flexibility of the format, increases parsing difficulty (you have to skip the letter 'i'!)


This isn't so. Firstly, SGF supports up to size 52, e.g. try (;SZ[52];B[XX];W[dW];B[cc];W[Wd]) in a good viewer, e.g. Sabaki.

Secondly, one does not "skip the i" when parsing. The viewer itself may or may not display coordinates like that, but the SGF format has no concept whatsoever that i is skipped. Try (;SZ[19];B[ii])

Is it possible you're talking about GTP rather than SGF? The two aren't related.

As for spook's ideas, certainly JSON is far nicer to deal with than XML but I'm not exactly seeing the need. SGF is not "hard to parse", it is easy to parse. It is not hard to create collections, it is trivial.


Bill Spight wrote:
lightvector wrote:
While we're at it, explicit support for non-square board sizes in the format would also be nice, specifying width and height separately.



(;FF[4]ST[2]GM[1]CA[UTF-8]AP[GOWrite:3.0.15]SZ[5:7]PM[2]FG[259:]PB[ ]PW[ ]GN[ ]
)

5x7 board. :)


Oh, sorry, yep that would be GTP rather than SGF. But yeah, the fact that GTP has these encoding restrictions is really annoying. Looks like SGF is actually a bit nicer, so ignore my previous post. :)

 Post subject: Re: Extending SGF?
Post #28 Posted: Sat May 18, 2019 8:37 am 
Lives with ko

Posts: 151
Location: Belgium
Liked others: 11
Was liked: 48
Rank: 2d
KGS: LordVader
For sure, SGF is simple and is made to store sequences. And it may appear like SGF is just what we need.

Amtiskaw wrote:
As for spook's ideas, certainly JSON is far nicer to deal with than XML but I'm not exactly seeing the need. SGF is not "hard to parse", it is easy to parse. It is not hard to create collections, it is trivial.


If we get a little more technical it may become obvious.
What Leela Zero returns is something like this:

Code:
info move Q16 visits 33 winrate 4346 prior 1673 lcb 4299 order 0 pv Q16 D4 D16 Q4 C6 C14 R6 R14 O3 info move D16 visits 33 winrate 4347 prior 1662 lcb 4298 order 1 pv D16 Q4 Q16 D4 C6 C14 R6 R14 O3 info move Q4 visits 33 winrate 4349 prior 1663 lcb 4293 order 2 pv Q4 D16 D4 Q16 R14 R6 C14 C6 O17 info move D4 visits 29 winrate 4340 prior 1644 lcb 4280 order 3 pv D4 Q16 D16 Q4 O17 F17 O3 F3 R6


This data is an array in itself.
Each element of this array has the following properties: move, visits, winrate, prior (named priority in the proposal below), lcb, order and a sequence (pv), which is itself a list of moves.
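Turning such a line into that array could look roughly like the sketch below. It assumes exactly the layout of the sample above (alternating key and value tokens, with `pv` last in each `info` block) and keeps the numbers as emitted, without rescaling; `LzMoveInfo` and `parseLzAnalyze` are hypothetical names.

```typescript
// One "info ..." block of the analysis line, with raw (unscaled) numbers.
interface LzMoveInfo {
  move: string;
  visits: number;
  winrate: number;
  prior: number;
  lcb: number;
  order: number;
  pv: string[];
}

function parseLzAnalyze(line: string): LzMoveInfo[] {
  return line
    .split(/\binfo\b/)              // one chunk per candidate move
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)
    .map((chunk) => {
      const tokens = chunk.split(/\s+/);
      const fields = new Map<string, string>();
      let pv: string[] = [];
      for (let i = 0; i < tokens.length; i += 2) {
        const key = tokens[i] ?? "";
        if (key === "pv") {
          pv = tokens.slice(i + 1); // everything after "pv" is the sequence
          break;
        }
        fields.set(key, tokens[i + 1] ?? "");
      }
      const num = (key: string): number => {
        const value = Number(fields.get(key));
        if (Number.isNaN(value)) throw new Error(`missing or non-numeric field: ${key}`);
        return value;
      };
      const move = fields.get("move");
      if (move === undefined || move === "") throw new Error("missing field: move");
      return {
        move,
        visits: num("visits"),
        winrate: num("winrate"),
        prior: num("prior"),
        lcb: num("lcb"),
        order: num("order"),
        pv,
      };
    });
}
```

A parser meant for several engines would probably keep unknown keys instead of throwing on missing ones, which is the flexibility argued for in this post.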

We have to keep in mind that other AIs will have overlap, but may have more or less properties. I don't think you want to define a standard and dedicate it to 1 bot.
So, it should be very flexible.

So, what I propose in JSON is:

Code:
"stats": [
{
  "move": "Q16",
  "visits": 33,
  "winrate": 43.46,
  "priority": 1673,
  "lcb": 42.99,
  "order": 0,
  "prediction": ["Q16", "D4", "D16", "Q4", "C6", "C14", "R6", "R14", "O3"]
},
{
  "move": "D16",
  "visits": 33,
  "winrate": 43.47,
  "priority": 1662,
  "lcb": 42.98,
  "order": 1,
  "prediction": ["D16", "Q4", "Q16", "D4", "C6", "C14", "R6", "R14", "O3"]
},
...
]


Let's assume that we want to store information about a different kind of bot. (e.g. AlphaGo)
If it only mentions winrates, it could look like this:

Code:
"stats": [
{
  "move": "Q16",
  "winrate": 43.46
},
{
  "move": "D16",
  "winrate": 43.47
},
...
]


Now, let's continue and make things just a little more complicated.
In the future, you may want to go one step further and store the statistics of multiple bots inside the same file, while still keeping them separate.
So, for each move you would have:

Code:
"botStats": [
{
  "bot": "Leela Zero",
  "version": "0.16 weightXyZ",
  "stats": [ ... ]
},
{
  "bot": "AlphaGo",
  "version": "Master",
  "stats": [ ... ]
}
]



There is nothing in SGF that resembles this even a little. This is a totally new kind of structure. On top of that, it would be hard to keep SGF backwards compatible. Software developers aren't supposed to write their own XML or JSON parser. Nevertheless, each Baduk-related software project has its own SGF parser, and as a result there are over 100 implementations of SGF parsers. The problem is that each one of these has small variations and trade-offs in how they handle ";[]\/()" characters in comments. So, if you try to create a new structure, there is a reasonable chance that you will break existing software. It's a minefield.

 Post subject: Re: Extending SGF?
Post #29 Posted: Sat May 18, 2019 9:02 am 
Lives with ko

Posts: 151
Location: Belgium
Liked others: 11
Was liked: 48
Rank: 2d
KGS: LordVader
... or basically what John says: :bow:

John Fairbairn wrote:
Not all sgf editors read sgf files correctly. In fact, I'm told that very, very few do. Even the Eidogo one used on this forum seems to choke on quite a few things.

So extending the format seems to be a recipe for more confusion. A fresh start using e.g. xml may be wiser.

One simple solution for the OP seems to be to use the C[ ] property. The info he wants can be considered a kind of comment anyway, but he can also bracket it in some coded way within the C[] text so that the info can be used or extracted programmatically.


Last year I might have gone for XML.
But now I would go for JSON. :)

(PS: If you only need it programmatically, encoding with base64 is also a good trick to avoid conflicts.)

 Post subject: Re: Extending SGF?
Post #30 Posted: Sat May 18, 2019 2:59 pm 
Dies in gote

Posts: 38
Liked others: 4
Was liked: 20
spook wrote:
each one of these has small variations and trade-offs in how they handle ";[]\/()" characters in comments.


When writing, only ] and \ need to be escaped. When reading, just accept whatever follows a \ character as being part of the comment.
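The two rules above fit in a few lines. This is a sketch of exactly what the post states and nothing more: `escapeSgfText` and `unescapeSgfText` are hypothetical names, and refinements from the FF[4] spec, such as a backslash before a line break acting as a soft line break, are deliberately left out.

```typescript
// Writing: put a backslash in front of every ']' and every '\'.
function escapeSgfText(text: string): string {
  return text.replace(/[\]\\]/g, (ch) => "\\" + ch);
}

// Reading: whatever follows a backslash is taken literally.
function unescapeSgfText(raw: string): string {
  let out = "";
  for (let i = 0; i < raw.length; i++) {
    // Skip the backslash itself and keep the character after it.
    if (raw.charAt(i) === "\\" && i + 1 < raw.length) i++;
    out += raw.charAt(i);
  }
  return out;
}
```

A comment containing all of the characters `;[]\/()` survives a round trip through these two functions unchanged, because only `]` and `\` ever gain a backslash.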

Of course there are additional problems when the data isn't UTF-8, but mandating UTF-8 would fix SGF's biggest problems.

Still, I agree SGF has no nice way to record bot analysis, especially from 2 or more bots at once.

 Post subject: Re: Extending SGF?
Post #31 Posted: Sun Jun 16, 2019 3:35 am 
Lives with ko

Posts: 284
Liked others: 94
Was liked: 153
Rank: OGS 7 kyu
I like spook's proposal; I think JSON is a good fit for this job.

spook wrote:
We have to keep in mind that other AIs will have overlap, but may have more or less properties. I don't think you want to define a standard and dedicate it to 1 bot.
So, it should be very flexible.
Very good point. We have no idea what state-of-the-art Go bots will look like 10 years from now. Maybe they will use a technology different from neural networks or MCTS, and won't talk in terms of winrate or playouts.

Amtiskaw wrote:
Of course there are additional problems when the data isn't UTF-8, but mandating UTF-8 would fix SGF's biggest problems.
Yes, this please :) The different encodings are a legacy of the past, and there is no longer any reason to use them today. Enforcing UTF-8 would help so much. As a matter of fact, GoReviewPartner only outputs UTF-8 SGF or RSGF now. It immediately converts from any other encoding it encounters, and tries to use the file anyway when the encoding is not specified.

As a note for later, we may have to specify somewhere the units of some of the values we use in the JSON file. For example, a winrate like the one below:
Code:
"stats": [
{
  "move": "Q16",
  "winrate": 0.46
},
{
  "move": "D16",
  "winrate": 0.47
},
...
]
Would that mean 46% or 0.46%? Would that be the winrate for the player making that move, or the winrate for Black (as in the AlphaGo teaching tool)? I always use a format like "45.3%/56.5%" in GoReviewPartner to avoid any ambiguity, but that solution is not really satisfying.

As another note for later, it would be good to have support for a simple, standard markup language for the comments. Something simple that can be used to add hyperlinks, bullet lists, bold/italic... something like Markdown :)

_________________
I am the author of GoReviewPartner, a small software aimed at assisting reviewing a game of Go. Give it a try!

 Post subject: Re: Extending SGF?
Post #32 Posted: Mon Jun 17, 2019 6:07 am 
Lives in sente

Posts: 757
Liked others: 114
Was liked: 916
Rank: maybe 2d
pnprog wrote:
spook wrote:
We have to keep in mind that other AIs will have overlap, but may have more or less properties. I don't think you want to define a standard and dedicate it to 1 bot.
So, it should be very flexible.
Very good point. We have no idea what state-of-the-art Go bots will look like 10 years from now. Maybe they will use a technology different from neural networks or MCTS, and won't talk in terms of winrate or playouts.


And some modern bots already report not just a winrate but also the average expected score, as well as a standard deviation value that measures uncertainty about the score. ;-)

pnprog wrote:
As a note for later, we may have to specify somewhere the units of some of the values we use in the JSON file. For example, a winrate like the one below:
Code:
"stats": [
{
  "move": "Q16",
  "winrate": 0.46
},
{
  "move": "D16",
  "winrate": 0.47
},
...
]
Would that mean 46% or 0.46%? Would that be the winrate for the player making that move, or the winrate for Black (as in the AlphaGo teaching tool)? I always use a format like "45.3%/56.5%" in GoReviewPartner to avoid any ambiguity, but that solution is not really satisfying.


I agree with thinking about units. Leela Zero's lz-analyze currently multiplies all probability values by 10000 and then rounds them - but if we're talking file formats, 10000 seems like a pretty arbitrary constant. I would vote not multiplying at all - fields intended to be probabilities should be floats between 0 and 1, predicted score should be in units of points rather than points-times-some-constant, if a bot wants to report signed utility (e.g. version of winrate that is positive if a player is ahead and negative if behind, possibly blending in a term for greater score), that could be around the scale of -1 to 1, etc.

Mandating values from the view of a consistent player (e.g. black) rather than alternating by side to move would make it much easier to write tools that graph the winrate or other values, or that scan game records for large differences between consecutive moves to look for mistakes: consistent-view values can be used as-is for both applications, while side-to-move values need to be inverted every other move. There's also precedent in chess: my impression is that chess analysis more commonly reports values for a consistent player (the first player) than by side to move.
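Both suggestions, unscaled probabilities and a fixed point of view, are cheap to apply when converting engine output. The sketch below assumes the incoming value is from the side to move and scaled by 10000, as described for lz-analyze above; `toBlackWinrate` and `largeSwings` are hypothetical names, and the swing threshold is an arbitrary illustration.

```typescript
type Color = "B" | "W";

// Convert a scaled, side-to-move winrate into a 0..1 probability for Black.
function toBlackWinrate(raw: number, sideToMove: Color, scale: number = 10000): number {
  const p = raw / scale;
  if (p < 0 || p > 1) throw new Error(`winrate out of range: ${raw}`);
  return sideToMove === "B" ? p : 1 - p;
}

// With consistent-view values, spotting big swings is a plain scan over
// consecutive entries; no per-move inversion is needed.
function largeSwings(blackWinrates: number[], threshold: number): number[] {
  const indices: number[] = [];
  let previous: number | undefined;
  blackWinrates.forEach((current, index) => {
    if (previous !== undefined && Math.abs(current - previous) > threshold) {
      indices.push(index);
    }
    previous = current;
  });
  return indices;
}
```

For example, a raw value of 4346 with White to move becomes 0.5654 from Black's view, and `largeSwings([0.5, 0.52, 0.3, 0.31], 0.1)` flags index 2.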

 Post subject: Re: Extending SGF?
Post #33 Posted: Sun Dec 29, 2019 5:01 pm 
Lives in gote

Posts: 580
Location: Adelaide, South Australia
Liked others: 207
Was liked: 264
Rank: Australian 2 dan
GD Posts: 200
So did this ever go anywhere? It looks like https://www.red-bean.com/sgf/ hasn't been updated in a very long time.

My 2 cents:
  • SGF is close to human-readable. I do sometimes open up SGF in a text editor to fix broken files or reformat stuff, or use grep on a file of SGFs to find things. JSON is also more or less human-readable. XML isn't, despite what the fans say: it's just too verbose. I'd prefer to stay with SGF so as to not throw out all my old software. But I can see the benefits of a JSON alternative.
  • As well as bot evaluations, the other thing I'd love to see added to SGF (or a new format) is node labels plus the ability to hyperlink from a comment to another (labelled) node. This would be great for commented games and for SGF joseki dictionaries: the comments could say things like "compare this variation with (link)".

 Post subject: Re: Extending SGF?
Post #34 Posted: Mon Dec 30, 2019 1:34 am 
Gosei

Posts: 1494
Liked others: 111
Was liked: 315
Can we not just go back to the Ishi format?

_________________
North Lecale

 Post subject: Re: Extending SGF?
Post #35 Posted: Mon Dec 30, 2019 5:28 pm 
Lives in sente

Posts: 914
Liked others: 391
Was liked: 162
Rank: German 2 dan
You have two options: keep SGF compatibility, or create a new format. I think that SGF is not too bad, and it has the big advantage that there is already a large(ish) community of programs and people who know how to work with it. On the other hand, SGF has made some restricting decisions that you might want to get rid of.

If you keep to extending SGF, there are two kinds of extensions: add more properties just to nodes (e. g. bot evaluations), and add more structure (e. g. links). The latter is more interesting. The former is restricted by the decision to have only short, unstructured property names.

For more properties, I'd propose to add just one new property name, and make its value arbitrarily structured. This might look like:

Code:
Z[{bots {leela ({version "157" black-win-rate 57.32 prediction (ca if ig hi)})}}]


For links, I think the idea of labelling and then referring is not too bad. For example, mark with MK[some-label]. The referring is not so easy, because it has to extend/restrict the syntax inside e. g. a comment. Maybe C[See the {ref some-label with "other position"}].

(In the above, I avoided the use of square brackets and colons so that there is less need for escaping. It's just a quick draft.)

If you think about a new format, I believe that one should think about whether a tree is the right model. Maybe a general property graph can offer new opportunities, e. g. unifying variations that arrive at the same position, or representing not only moves, but also links or ko restrictions as edges. This is hard to do in a human readable way, but maybe that's rather liberating. Moving to a binary format could also simplify the implementation, since human readability is not a concern anymore and you can put exact type and length tags in — no escaping needed.

P. S.: no XML.

_________________
A good system naturally covers all corner cases without further effort.


This post by Harleqin was liked by: xela