All times are UTC - 8 hours [ DST ]




 Post subject: Re: Extending SGF?
Post #21 Posted: Wed May 01, 2019 4:24 pm 
Beginner

Posts: 17
Liked others: 0
Was liked: 4
Amtiskaw wrote:
SGF properties can have multiple values attached to them, e.g. AB[dd][dp][pd][pp] so if one wants to have a winrate for a node by multiple bots, you could have something like:

VR[48.6:1600:7.5:Leela Zero][43.6:900:7.5:Elf]

e.g. Leela thinks Black is 48.6% to win, Elf thinks Black is 43.6% to win...


What if AB[dd][dp][pd][pp] are taken as separate nodes AB[dd]AB[dp]AB[pd]AB[pp]?

 Post subject: Re: Extending SGF?
Post #22 Posted: Thu May 02, 2019 2:44 am 
Dies in gote

Posts: 35
Liked others: 4
Was liked: 19
deungsan wrote:
What if AB[dd][dp][pd][pp] are taken as separate nodes AB[dd]AB[dp]AB[pd]AB[pp]?


A semi-colon ; is how SGF specifies a new node has started.

A node with 4 AB values is written like ;AB[dd][dp][pd][pp]

4 nodes, with 1 AB value each, are written like ;AB[dd];AB[dp];AB[pd];AB[pp] (notice the 4 semi-colons)

Finally, ;AB[dd]AB[dp]AB[pd]AB[pp] is actually invalid because of duplicated keys within a node, although many SGF readers will accept it as being 1 node with 4 values.

All of this is just a description of how SGF works already, not a proposal. The AB property for instance is used to set handicap stones.
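The node and value distinction above can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: `parse_nodes` is a hypothetical name, variations in parentheses are ignored rather than built into a tree, and the soft line break rule for escaped newlines is not handled.

```python
def parse_nodes(sgf):
    """Split a variation-free SGF string into nodes.

    Each node is a list of (property_name, [values]) pairs, kept in file
    order so that repeated names stay visible to the caller.
    """
    nodes = []
    i, n = 0, len(sgf)
    while i < n:
        ch = sgf[i]
        if ch == ";":
            nodes.append([])          # a semicolon starts a new node
            i += 1
        elif ch.isupper() and nodes:
            start = i
            while i < n and sgf[i].isupper():
                i += 1
            name = sgf[start:i]
            values = []
            while True:               # read every [value] that follows the name
                j = i
                while j < n and sgf[j].isspace():
                    j += 1
                if j >= n or sgf[j] != "[":
                    break
                j += 1
                buf = []
                while j < n and sgf[j] != "]":
                    if sgf[j] == "\\" and j + 1 < n:
                        j += 1        # keep the escaped character verbatim
                    buf.append(sgf[j])
                    j += 1
                values.append("".join(buf))
                i = j + 1
            nodes[-1].append((name, values))
        else:
            i += 1                    # parentheses, whitespace, anything else
    return nodes


print(len(parse_nodes(";AB[dd][dp][pd][pp]")))           # one node, four values
print(len(parse_nodes(";AB[dd];AB[dp];AB[pd];AB[pp]")))  # four nodes
```

The third form, `;AB[dd]AB[dp]AB[pd]AB[pp]`, comes back as one node containing four separate `AB` pairs, which leaves the caller free to reject it or merge it.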

 Post subject: Re: Extending SGF?
Post #23 Posted: Thu May 02, 2019 4:59 am 
Beginner

Posts: 17
Liked others: 0
Was liked: 4
Amtiskaw wrote:
A semi-colon ; is how SGF specifies a new node has started.


There are lots of semicolons in an SGF file, but there is no node in SGF. Perhaps the concept of nodes is a product of some programmers, used in their SGF parsers.

Amtiskaw wrote:
Finally, ;AB[dd]AB[dp]AB[pd]AB[pp] is actually invalid because of duplicated keys within a node, although many SGF readers will accept it as being 1 node with 4 values.


AB or AW is not a key, but a property name in SGF. In my understanding, SGF is a collection of property name and property value pairs. SGF can be parsed without using nodes or keys. In my program, AB[dd]AB[dp]AB[pd]AB[pp] is not "invalid". By the way, on what basis do you call this valid or invalid?

 Post subject: Re: Extending SGF?
Post #24 Posted: Thu May 02, 2019 5:43 am 
Dies in gote

Posts: 35
Liked others: 4
Was liked: 19
https://www.red-bean.com/sgf/sgf4.html

Quote:
"SGF is a text-only format (not a binary format). It contains game trees, with all their nodes and properties, and nothing more."

[...]

Only one of each property is allowed per node, e.g. one cannot have two comments in one node:
... ; C[comment1] B [dg] C[comment2] ; ...


There's some terminology confusion that maybe I'm contributing to, but property names are what I'm calling keys.

You cannot do ;AB[dd]AB[pp] -- same key twice in a node is disallowed. Each key in a node should be unique.

But you can do ;AB[dd][pp] -- keys can have more than 1 value. Any handicap game will prove this. It's also used for markup, e.g. TR, MA, SQ, CR, AR, and that sort of thing which indicate triangles, crosses, etc. If you want multiple triangles in a node you'll need a key (property name) attached to multiple values.

Many SGF parsers (including mine) will tolerate the first form and treat it like the second.

As a final note, the SGF specs are annoying and will tell you that [dd][pp] is specifying a single value of type "list". While you can think about it that way, it is much, much simpler to consider this form as indicating multiple values. You can dispose of the notion of values having types (every value is a string, really) until you need to interpret them. This is pretty normal to do, e.g. Sabaki's sgf API considers a property to be a key which retrieves an array of strings (i.e. the values).
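The "key maps to a list of strings" view, together with the strict and tolerant treatment of a repeated key, might look like the sketch below. `merge_properties` and its `strict` flag are hypothetical names chosen for illustration; this is not Sabaki's API.

```python
def merge_properties(pairs, strict=False):
    """Turn (name, [values]) pairs from one node into {name: [values]}.

    strict=True rejects a repeated name, as the spec requires.
    strict=False merges the values, as many tolerant readers do.
    """
    merged = {}
    for name, values in pairs:
        if name in merged:
            if strict:
                raise ValueError(f"duplicate property {name!r} in node")
            merged[name].extend(values)
        else:
            merged[name] = list(values)
    return merged


node = [("AB", ["dd"]), ("AB", ["pp"]), ("C", ["handicap"])]
print(merge_properties(node))   # {'AB': ['dd', 'pp'], 'C': ['handicap']}
```

Every value stays a plain string here; interpreting `dd` as a point is left to whoever reads the property.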

 Post subject: Re: Extending SGF?
Post #25 Posted: Fri May 17, 2019 5:23 am 
Dies with sente

Posts: 70
Location: Belgium
Liked others: 5
Was liked: 21
Rank: 2d
KGS: LordVader
I personally think it's perhaps a good time to drop SGF, and to introduce a new format.

Here is what ZBaduk uses internally : https://gist.github.com/bvandenbon/56e3811d279c5c6a7a1a91d7d902d6dc
It's almost a direct translation of SGF to JSON.

One minor tweak in this format, is that it groups those traditional properties in objects.
(This is just a quick copy of the TypeScript file)
Code:
  public moveProperties: MoveProperties = new MoveProperties();
  public setupProperties: SetupProperties = new SetupProperties();
  public nodeAnnotationProperties: NodeAnnotationProperties = new NodeAnnotationProperties();
  public moveAnnotationProperties: MoveAnnotationProperties = new MoveAnnotationProperties();
  public markupProperties: MarkupProperties = new MarkupProperties();
  public timingProperties: TimingProperties = new TimingProperties();
  public miscProperties: MiscProperties = new MiscProperties();
  public scoreEstimateProperties: ScoreEstimateProperties = new ScoreEstimateProperties();


In fact, I could provide TypeScript interfaces (definition) for the entire thing.
And perhaps that could be the start of an open standard, as an alternative to hard-to-parse SGF.
On top of that, I could provide an SGF parser that loads SGF files to the same structure,
making it 100% compatible.

As for AI properties, I propose this addition:
We could just add a statisticsProperties to the list.
And that object could contain properties like: bot, version, stats (which is an array).
A single stat should contain properties like: winrate, playouts, visits, ...

And that's where I think, these properties don't belong in SGF, but should only exist in this new format.
It's just too hard to create layers (e.g. collections) in the traditional SGF format.

_________________
Enjoy LeeLa Zero from your webbrowser, without installing anything !
https://www.zbaduk.com

 Post subject: Re: Extending SGF?
Post #26 Posted: Fri May 17, 2019 5:36 am 
Lives in gote

Posts: 308
Liked others: 61
Was liked: 270
Rank: maybe 2d
If there is going to be a new standard, I hope it will stop using the silly alphabetic encoding for coordinates that only permits board sizes up to 25x25, and just start using zero-indexed or one-indexed integers. That part of the SGF spec confuses me: it reduces the future-proofing and flexibility of the format and increases parsing difficulty (you have to skip the letter 'i'!), for at best a slight gain in human readability, when end-users rarely read an SGF to the point of mentally parsing coordinates anyway.

While we're at it, explicit support for non-square board sizes in the format would also be nice, specifying width and height separately.

 Post subject: Re: Extending SGF?
Post #27 Posted: Fri May 17, 2019 6:34 am 
Dies with sente

Posts: 70
Location: Belgium
Liked others: 5
Was liked: 21
Rank: 2d
KGS: LordVader
My personal goal and motivation:
I would like a file format that can be used by Lizzie, ZBaduk, Sabaki, Go Review Partner, and all those AI viewers.

And I have the impression that what holds us back is all about the syntax: ";(X[])".

There have been many attempts to come up with XML formats, but I've never seen a successful one. That's why I don't want to make it too innovative either.
I personally, just want to change the structure, not the tags or element names per se.

I could live with numeric coordinates though.

TL;DR:
But if we do make the coordinates numeric, then I propose a 0-based numeric format.
And then reserving -1 ; -1 for a pass. - I think that would be reasonable.
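A conversion from SGF move values to the 0-based pairs proposed above could be sketched as follows. `sgf_to_xy` and `PASS` are hypothetical names. The sketch assumes the usual SGF conventions: the first letter is the column, the second is the row counted from the top, an empty value is a pass, and `tt` is also a pass on boards up to 19x19.

```python
import string

PASS = (-1, -1)                  # reserved pair for a pass, as proposed above
LETTERS = string.ascii_letters   # 'a'..'z' then 'A'..'Z', indices 0..51


def sgf_to_xy(value, board_size=19):
    """Convert an SGF move value such as 'dd' to a 0-based (x, y) pair."""
    if value == "" or (value == "tt" and board_size <= 19):
        return PASS
    return LETTERS.index(value[0]), LETTERS.index(value[1])


print(sgf_to_xy("dd"))   # (3, 3)
print(sgf_to_xy(""))     # (-1, -1)
```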

lightvector wrote:
While we're at it, explicit support for non-square board sizes in the format would also be nice, specifying width and height separately.


My fear is, that once you start messing with the shape of the board, that <1% of software will implement it. The same goes for 3-color-go, circular boards, boards without edges, not to mention 3D shaped boards, ...
It's only April 1st, one day a year. So, it's a lot of effort for something that will rarely ever be used.


 Post subject: Re: Extending SGF?
Post #28 Posted: Fri May 17, 2019 4:36 pm 
Dies in gote

Posts: 35
Liked others: 4
Was liked: 19
lightvector wrote:
If there is going to be a new standard, I hope it will stop using the silly alphabetic encoding for coordinates that only permits board sizes up to 25x25, and just start using zero-indexed or one-indexed integers. That part of the SGF spec confuses me, it specifically reduces the futureproofness and flexibility of the format, increases parsing difficulty (you have to skip the letter 'i'!)


This isn't so. Firstly, SGF supports up to size 52, e.g. try (;SZ[52];B[XX];W[dW];B[cc];W[Wd]) in a good viewer, e.g. Sabaki.

Secondly, one does not "skip the i" when parsing. The viewer itself may or may not display coordinates like that, but the SGF format has no concept whatsoever that i is skipped. Try (;SZ[19];B[ii])

Is it possible you're talking about GTP rather than SGF? The two aren't related.
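The difference between the two encodings can be made concrete with a short sketch. The function names are hypothetical. The SGF side uses all 52 letters with no gap, while the GTP side uses the letters A to Z without I and numbers rows from the bottom.

```python
import string

SGF_LETTERS = string.ascii_letters            # a-z then A-Z: 52 symbols, 'i' included
GTP_LETTERS = "ABCDEFGHJKLMNOPQRSTUVWXYZ"     # no 'I': 25 symbols


def sgf_point(value):
    """SGF point such as 'ii' -> 0-based (col, row), row counted from the top."""
    return SGF_LETTERS.index(value[0]), SGF_LETTERS.index(value[1])


def gtp_vertex(col, row, board_size):
    """0-based (col, row) from the top-left -> GTP vertex such as 'Q16'."""
    return f"{GTP_LETTERS[col]}{board_size - row}"


col, row = sgf_point("ii")
print(col, row)                    # 8 8: 'i' is simply the ninth letter
print(gtp_vertex(col, row, 19))    # J11: GTP is where the letter I is skipped
```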

As for spook's ideas, certainly JSON is far nicer to deal with than XML but I'm not exactly seeing the need. SGF is not "hard to parse", it is easy to parse. It is not hard to create collections, it is trivial.


Last edited by Amtiskaw on Fri May 17, 2019 5:08 pm, edited 1 time in total.
 Post subject: Re: Extending SGF?
Post #29 Posted: Fri May 17, 2019 5:03 pm 
Honinbo

Posts: 9027
Liked others: 2751
Was liked: 3072
lightvector wrote:
While we're at it, explicit support for non-square board sizes in the format would also be nice, specifying width and height separately.



(;FF[4]ST[2]GM[1]CA[UTF-8]AP[GOWrite:3.0.15]SZ[5:7]PM[2]FG[259:]PB[ ]PW[ ]GN[ ]
)

5x7 board. :)
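Reading the SZ value shown above takes only a few lines. In FF[4] the value is either a single number (square board) or `columns:rows`. `parse_size` is a hypothetical helper name.

```python
def parse_size(value):
    """Parse an SZ property value into (columns, rows)."""
    if ":" in value:
        cols, rows = value.split(":")
        return int(cols), int(rows)
    n = int(value)
    return n, n


print(parse_size("5:7"))   # (5, 7)
print(parse_size("19"))    # (19, 19)
```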

_________________
The Adkins Principle:

At some point, doesn't thinking have to go on?

— Winona Adkins

Everything with love.

 Post subject: Re: Extending SGF?
Post #30 Posted: Sat May 18, 2019 12:17 am 
Lives in gote

Posts: 308
Liked others: 61
Was liked: 270
Rank: maybe 2d
Amtiskaw wrote:
lightvector wrote:
If there is going to be a new standard, I hope it will stop using the silly alphabetic encoding for coordinates that only permits board sizes up to 25x25, and just start using zero-indexed or one-indexed integers. That part of the SGF spec confuses me, it specifically reduces the futureproofness and flexibility of the format, increases parsing difficulty (you have to skip the letter 'i'!)


This isn't so. Firstly, SGF supports up to size 52, e.g. try (;SZ[52];B[XX];W[dW];B[cc];W[Wd]) in a good viewer, e.g. Sabaki.

Secondly, one does not "skip the i" when parsing. The viewer itself may or may not display coordinates like that, but the SGF format has no concept whatsoever that i is skipped. Try (;SZ[19];B[ii])

Is it possible you're talking about GTP rather than SGF? The two aren't related.

As for spook's ideas, certainly JSON is far nicer to deal with than XML but I'm not exactly seeing the need. SGF is not "hard to parse", it is easy to parse. It is not hard to create collections, it is trivial.


Bill Spight wrote:
lightvector wrote:
While we're at it, explicit support for non-square board sizes in the format would also be nice, specifying width and height separately.



(;FF[4]ST[2]GM[1]CA[UTF-8]AP[GOWrite:3.0.15]SZ[5:7]PM[2]FG[259:]PB[ ]PW[ ]GN[ ]
)

5x7 board. :)


Oh, sorry, yep that would be GTP rather than SGF. But yeah, the fact that GTP has these encoding restrictions is really annoying. Looks like SGF is actually a bit nicer, so ignore my previous post. :)

 Post subject: Re: Extending SGF?
Post #31 Posted: Sat May 18, 2019 8:37 am 
Dies with sente

Posts: 70
Location: Belgium
Liked others: 5
Was liked: 21
Rank: 2d
KGS: LordVader
For sure, SGF is simple and is made to store sequences. And it may appear like SGF is just what we need.

Amtiskaw wrote:
As for spook's ideas, certainly JSON is far nicer to deal with than XML but I'm not exactly seeing the need. SGF is not "hard to parse", it is easy to parse. It is not hard to create collections, it is trivial.


If we get a little more technical it may become obvious.
What LeeLa Zero returns is something like this:

Code:
info move Q16 visits 33 winrate 4346 prior 1673 lcb 4299 order 0 pv Q16 D4 D16 Q4 C6 C14 R6 R14 O3 info move D16 visits 33 winrate 4347 prior 1662 lcb 4298 order 1 pv D16 Q4 Q16 D4 C6 C14 R6 R14 O3 info move Q4 visits 33 winrate 4349 prior 1663 lcb 4293 order 2 pv Q4 D16 D4 Q16 R14 R6 C14 C6 O17 info move D4 visits 29 winrate 4340 prior 1644 lcb 4280 order 3 pv D4 Q16 D16 Q4 O17 F17 O3 F3 R6


This data is an array in itself.
Each element of this array has the following properties: move, visits, winrate, prior (named priority in the proposal below), lcb, order and pv, a predicted sequence that is itself a list of moves.
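Turning that single line of output into an array of records could be sketched like this. `parse_lz_analyze` is a hypothetical name, and the sketch assumes the layout shown in the sample above: each record starts with `info`, fields come as key and value pairs, and `pv` is the last key, followed by the move sequence.

```python
def parse_lz_analyze(line):
    """Split one analysis line into a list of dicts, one per candidate move."""
    entries = []
    for chunk in line.split("info ")[1:]:
        tokens = chunk.split()
        entry = {}
        i = 0
        while i + 1 < len(tokens):
            key = tokens[i]
            if key == "pv":
                entry["pv"] = tokens[i + 1:]   # the rest is the predicted sequence
                break
            value = tokens[i + 1]
            entry[key] = int(value) if value.isdigit() else value
            i += 2
        entries.append(entry)
    return entries


sample = ("info move Q16 visits 33 winrate 4346 prior 1673 lcb 4299 order 0 "
          "pv Q16 D4 D16 info move D16 visits 33 winrate 4347 prior 1662 "
          "lcb 4298 order 1 pv D16 Q4")
for entry in parse_lz_analyze(sample):
    print(entry["move"], entry["winrate"], entry["pv"])
```

The values are kept as the raw integers the engine printed; scaling them is a separate decision.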

We have to keep in mind that other AIs will have overlap, but may have more or less properties. I don't think you want to define a standard and dedicate it to 1 bot.
So, it should be very flexible.

So, what I propose in JSON is:

Code:
"stats": [
{
  "move": "Q16",
  "visits": 33,
  "winrate": 43.46,
  "priority": 1673,
  "lcb": 42.99,
  "order": 0,
  "prediction": ["Q16", "D4", "D16", "Q4", "C6", "C14", "R6", "R14", "O3"]
},
{
  "move": "D16",
  "visits": 33,
  "winrate": 43.47,
  "priority": 1662,
  "lcb": 42.98,
  "order": 1,
  "prediction": ["D16", "Q4", "Q16", "D4", "C6", "C14", "R6", "R14", "O3"]
},
...
]


Let's assume that we want to store information about a different kind of bot. (e.g. AlphaGo)
If it only mentions winrates, it could look like this:

Code:
"stats": [
{
  "move": "Q16",
  "winrate": 43.46
},
{
  "move": "D16",
  "winrate": 43.47
},
...
]


Now, let's continue and make things just a little more complicated.
In the future, you may want to go one step further and store statistics of multiple bots inside the same file, while still keeping them separate.
So, for each move you would have:

Code:
"botStats": [
{
  "bot": "LeeLa Zero",
  "version": "0.16 weightXyZ",
  "stats": [ ... ]
},
{
  "bot": "AlphaGo",
  "version": "Master",
  "stats": [ ... ]
}
]



There is nothing in SGF that resembles this even a little. This is a totally new kind of structure. On top of that, it would be hard to keep SGF backwards compatible. Software developers aren't supposed to write their own XML or JSON parser. Nevertheless, each Baduk related software project has its own SGF parser. And as a result there are over 100 implementations of SGF parsers. The problem being: each one of these has small variations and trade-offs in how they handle ";[]\/()" characters in comments. So, if you try to create a new structure, there is a reasonable chance that you will break existing software. It's a minefield.


 Post subject: Re: Extending SGF?
Post #32 Posted: Sat May 18, 2019 9:02 am 
Dies with sente

Posts: 70
Location: Belgium
Liked others: 5
Was liked: 21
Rank: 2d
KGS: LordVader
... or basically what John says: :bow:

John Fairbairn wrote:
Not all sgf editors read sgf files correctly. In fact, I'm told that very, very few do. Even the Eidogo one used on this forum seems to choke on quite a few things.

So extending the format seems to be a recipe for more confusion. A fresh start using e.g. xml may be wiser.

One simple solution for the OP seems to be to use the C[ ] property. The info he wants can be considered a kind of comment anyway, but he can also bracket it in some coded way within the C[] text so that the info can be used or extracted programmatically.


Last year I might have gone for XML.
But now I would go for JSON. :)

(PS: If you only need it programmatically, encoding with base64 is also a good trick to avoid conflicts.)
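The two ideas above, carrying the data inside C[ ] and encoding it with base64, can be combined as in the sketch below. The `botstats:` marker and both function names are made up for illustration; nothing here is an existing convention. Base64 output never contains `]` or `\`, so no SGF escaping is needed.

```python
import base64
import json

MARKER = "botstats:"   # hypothetical tag so a reader can recognise the payload


def to_comment(stats):
    """Serialise stats as JSON, base64-encode it, and wrap it in a C[] property."""
    payload = json.dumps(stats, separators=(",", ":")).encode("utf-8")
    return "C[" + MARKER + base64.b64encode(payload).decode("ascii") + "]"


def from_comment(prop):
    """Recover the stats from a C[] property, or None if it is an ordinary comment."""
    text = prop[2:-1]
    if not text.startswith(MARKER):
        return None
    return json.loads(base64.b64decode(text[len(MARKER):]))


stats = [{"bot": "LeeLa Zero", "stats": [{"move": "Q16", "winrate": 43.46}]}]
prop = to_comment(stats)
print(prop)
print(from_comment(prop) == stats)
```

The cost is that the comment is no longer readable by a human in an ordinary SGF viewer.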


 Post subject: Re: Extending SGF?
Post #33 Posted: Sat May 18, 2019 2:59 pm 
Dies in gote

Posts: 35
Liked others: 4
Was liked: 19
spook wrote:
each one of these has small variations and trade-offs in how they handle ";[]\/()" characters in comments.


When writing, only ] and \ need to be escaped. When reading, just accept whatever follows a \ character as being part of the comment.
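That rule fits in two small functions. The names are hypothetical, and the sketch leaves out the spec's soft line break (a backslash followed by a newline), which a complete reader would drop.

```python
def escape_text(text):
    """Escape a comment for writing: only the backslash and ']' need it."""
    return text.replace("\\", "\\\\").replace("]", "\\]")


def unescape_text(raw):
    """Read an escaped comment: whatever follows a backslash is taken literally."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            i += 1                # skip the backslash, keep the next character
        out.append(raw[i])
        i += 1
    return "".join(out)


tricky = ";[]\\/()"
print(escape_text(tricky))
print(unescape_text(escape_text(tricky)) == tricky)
```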

Of course there are additional problems when the data isn't UTF-8, but mandating UTF-8 would fix SGF's biggest problems.

Still, I agree SGF has no nice way to record bot analysis, especially from 2 or more bots at once.

 Post subject: Re: Extending SGF?
Post #34 Posted: Sun Jun 16, 2019 3:35 am 
Lives with ko

Posts: 282
Liked others: 93
Was liked: 148
Rank: OGS 7 kyu
I like spook's proposal; I think JSON is a good fit for this job.

spook wrote:
We have to keep in mind that other AIs will have overlap, but may have more or less properties. I don't think you want to define a standard and dedicate it to 1 bot.
So, it should be very flexible.
Very good point. We have no idea what state-of-the-art go bots will look like 10 years from now. Maybe they will use a technology different from neural networks or MCTS, and won't talk in terms of winrate or playouts.

Amtiskaw wrote:
Of course there are additional problems when the data isn't UTF-8, but mandating UTF-8 would fix SGF's biggest problems.
Yes, this please :) The different encodings are a legacy from the past, and there is no longer any reason to use them today. Enforcing UTF-8 would help so much. As a matter of fact, GoReviewPartner only outputs UTF-8 SGF or RSGF now. It immediately converts from any other encoding it encounters, and tries to use the file anyway when the encoding is not specified.
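A conversion step of that kind could be sketched as below. This is not GoReviewPartner's code: `to_utf8` is a hypothetical helper, the `latin-1` fallback is an assumption, and the simple pattern used to find the CA property could be fooled by the same text appearing inside a comment.

```python
import re


def to_utf8(data, fallback="latin-1"):
    """Decode SGF bytes using the declared CA charset and re-encode as UTF-8."""
    match = re.search(rb"CA\[([^\]]+)\]", data)
    declared = match.group(1).decode("ascii", "replace").strip() if match else None
    for encoding in (declared, "utf-8", fallback):
        if not encoding:
            continue
        try:
            text = data.decode(encoding)
            break
        except (LookupError, UnicodeDecodeError):
            continue              # unknown charset name or invalid bytes
    else:
        raise ValueError("could not decode SGF data")
    # Update an existing CA property so the file describes itself correctly.
    text = re.sub(r"CA\[[^\]]+\]", "CA[UTF-8]", text, count=1)
    return text.encode("utf-8")


legacy = "(;FF[4]CA[ISO-8859-1]C[caf\u00e9])".encode("latin-1")
print(to_utf8(legacy).decode("utf-8"))
```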

As a note for later, we may have to specify somewhere the units of some of the values we use in the JSON file. For example, winrate like below:
Code:
"stats": [
{
  "move": "Q16",
  "winrate": 0.46
},
{
  "move": "D16",
  "winrate": 0.47
},
...
]
Would that mean 46% or 0.46%? Would that be the winrate for the player making that move? Or is that the winrate for Black (like in the AlphaGo teaching tool)? I always use a format like "45.3%/56.5%" in GoReviewPartner to avoid any ambiguity, but that solution is not really satisfying.

As another note for later, it would be good to have support for a simple, standard markup language for the comments. Something simple that can be used to add hyperlinks, bullet lists, bold/italic... something like Markdown :)

_________________
I am the author of GoReviewPartner, a small software aimed at assisting reviewing a game of Go. Give it a try!

 Post subject: Re: Extending SGF?
Post #35 Posted: Mon Jun 17, 2019 6:07 am 
Lives in gote

Posts: 308
Liked others: 61
Was liked: 270
Rank: maybe 2d
pnprog wrote:
spook wrote:
We have to keep in mind that other AIs will have overlap, but may have more or less properties. I don't think you want to define a standard and dedicate it to 1 bot.
So, it should be very flexible.
Very good point. We have no idea what state-of-the-art go bots will look like 10 years from now. Maybe they will use a technology different from neural networks or MCTS, and won't talk in terms of winrate or playouts.


And some modern bots already report not just winrate, but also the average expected score, as well as a standard deviation value that measures uncertainty about the score. ;-)

pnprog wrote:
As a note for later, we may have to specify somewhere the units of some of the values we use in the JSON file. For example, winrate like below:
Code:
"stats": [
{
  "move": "Q16",
  "winrate": 0.46
},
{
  "move": "D16",
  "winrate": 0.47
},
...
]
Would that mean 46% or 0.46%? Would that be the winrate for the player making that move? Or is that the winrate for Black (like in the AlphaGo teaching tool)? I always use a format like "45.3%/56.5%" in GoReviewPartner to avoid any ambiguity, but that solution is not really satisfying.


I agree with thinking about units. Leela Zero's lz-analyze currently multiplies all probability values by 10000 and then rounds them - but if we're talking file formats, 10000 seems like a pretty arbitrary constant. I would vote not multiplying at all - fields intended to be probabilities should be floats between 0 and 1, predicted score should be in units of points rather than points-times-some-constant, if a bot wants to report signed utility (e.g. version of winrate that is positive if a player is ahead and negative if behind, possibly blending in a term for greater score), that could be around the scale of -1 to 1, etc.
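As a rough sketch of the "no multiplier" convention, the scaled integers from lz-analyze can be divided back into probabilities between 0 and 1. `normalise` and the list of scaled keys are assumptions based on the sample output quoted earlier in the thread.

```python
LZ_SCALE = 10000                        # lz-analyze multiplies probabilities by this
PROBABILITY_KEYS = ("winrate", "prior", "lcb")


def normalise(entry):
    """Return a copy of one analysis record with probabilities as floats in [0, 1]."""
    out = dict(entry)
    for key in PROBABILITY_KEYS:
        if key in out:
            out[key] = out[key] / LZ_SCALE
    return out


raw = {"move": "Q16", "visits": 33, "winrate": 4346, "prior": 1673, "lcb": 4299}
print(normalise(raw))
```

Counts such as `visits` are left untouched, since they are not probabilities.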

Mandating values from the view of a consistent player (e.g. black) rather than alternating by side to move would make it much easier to write tools that graph the winrate or other values, or scan game records for large differences between consecutive moves looking for mistakes and such, since for both of those applications consistent-view values can be used as-is while side-to-move values need to be inverted every other move. There's also precedent in Chess: my impression is that in Chess analysis it's somewhat more common to use a consistent player (the first player) rather than to show values by side to move.
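The inversion step and the mistake scan can be sketched together. Both function names are hypothetical. The sketch assumes winrates are probabilities between 0 and 1, that position 0 has Black to move, and that the sides alternate, which does not hold for handicap games or passes handled differently.

```python
def to_black_view(side_to_move_winrates):
    """Convert side-to-move winrates into Black's winrate at every position."""
    return [w if i % 2 == 0 else 1.0 - w
            for i, w in enumerate(side_to_move_winrates)]


def large_swings(black_view, threshold=0.10):
    """List (position, change) where Black's winrate moved by at least threshold."""
    swings = []
    for i in range(1, len(black_view)):
        delta = black_view[i] - black_view[i - 1]
        if abs(delta) >= threshold:
            swings.append((i, delta))
    return swings


black = to_black_view([0.50, 0.52, 0.25])
print(black)
print(large_swings(black))
```

With consistent-view values stored in the file, `to_black_view` disappears and only the scan remains.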
