 Post subject: Good territory scoring rules for training computers?
Post #1 Posted: Wed Dec 13, 2017 1:24 pm 
Lives in sente

Posts: 757
Liked others: 114
Was liked: 916
Rank: maybe 2d
As far as I know, all major attempts to train neural nets for Go have done so in area-scoring rules. For example, AlphaGo used a variation on Tromp-Taylor rules. The critical factor is that area-scoring rules allow one to play out to capture all dead stones and fully resolve the board position without affecting the final result. This makes it easy to completely automate the end-of-game to determine the status of all groups and territories for computer training. But of course, this causes problems when applying the neural net to play with territory scoring rules. For example, Deep Zen, I seem to recall, had problems multiple times in close games in Japanese rules due to this, since its value net was only trained under area scoring rules, causing Deep Zen to screw up in what would have been a 0.5-margin territory game. On DeepMind's side, they completely ignored the problem by just never trying to use AlphaGo for any territory scoring rules.
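To make the "fully resolved board" point concrete, here is a minimal sketch of Tromp-Taylor style area counting once all dead stones have been captured. The board encoding (a list of strings with 'B', 'W' and '.') and the function name are assumptions made for illustration, not taken from any particular engine.

```python
def tromp_taylor_area(board):
    """Return (black_area, white_area) for a fully played-out position.

    board is a list of equal-length strings using 'B', 'W' and '.'.
    A stone counts for its colour; an empty region counts for a colour
    only if every stone bordering that region has that colour.
    """
    rows, cols = len(board), len(board[0])
    score = {'B': 0, 'W': 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1
                continue
            if (r, c) in seen:
                continue
            # Flood-fill this empty region and collect the colours it touches.
            size, borders, stack = 0, set(), [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols:
                        neighbour = board[ny][nx]
                        if neighbour == '.':
                            if (ny, nx) not in seen:
                                seen.add((ny, nx))
                                stack.append((ny, nx))
                        else:
                            borders.add(neighbour)
            if len(borders) == 1:
                score[borders.pop()] += size
    return score['B'], score['W']
```

No life-and-death judgement appears anywhere in it, which is why this kind of scoring is so easy to automate for self-play.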

Are there good rules that still allow playout to resolve all statuses on the board but where correct play prior to that cleanup and the final result match what correct play and the final result would be under most territory scoring rules?

Something like Tromp-Taylor rules except with a button ("Button Go") seems promising. Except that there are still simple or common cases where correct play in Button Go diverges from what it would be in true territory scoring rules, often involving a final ko, right? This seems to make it not ideal for this purpose, since presumably that divergence would still cause problems.

How about instead of a button, one used Tromp-Taylor rules with the following modifications?
* Every time a player makes a move on the board, that player also loses 1 point (passes do not lose points).
* After two consecutive passes, the game does not end but enters a cleanup phase, where moves no longer cause players to lose points.
* After two consecutive passes in the cleanup phase, the game ends immediately. A player's final score is the Tromp-Taylor score minus all points lost by making moves, plus any komi set for that player.
* (optionally also disallow suicide)
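The bookkeeping these modifications need can be sketched roughly as follows. The class and field names are hypothetical, the area counts are assumed to come from an ordinary Tromp-Taylor scorer, and resetting the pass counter on entering cleanup is my reading of the rules rather than something stated explicitly.

```python
from dataclasses import dataclass


@dataclass
class ChilledGame:
    """Per-move penalty tracking for the proposed rules (illustrative names)."""
    komi_white: float = 7.5
    in_cleanup: bool = False
    consecutive_passes: int = 0
    finished: bool = False
    penalty_black: int = 0
    penalty_white: int = 0

    def record(self, player: str, is_pass: bool) -> None:
        """Record one turn by player 'B' or 'W'."""
        if self.finished:
            raise ValueError("game is already over")
        if is_pass:
            self.consecutive_passes += 1
            if self.consecutive_passes == 2:
                if self.in_cleanup:
                    self.finished = True   # second double pass ends the game
                else:
                    self.in_cleanup = True  # first double pass starts cleanup
                    self.consecutive_passes = 0
            return
        self.consecutive_passes = 0
        if not self.in_cleanup:
            # Board moves cost 1 point, but only before the cleanup phase.
            if player == 'B':
                self.penalty_black += 1
            else:
                self.penalty_white += 1

    def final_margin(self, area_black: int, area_white: int) -> float:
        """Black's score minus White's, given Tromp-Taylor area counts."""
        black = area_black - self.penalty_black
        white = area_white - self.penalty_white + self.komi_white
        return black - white
```

Since cleanup moves are free, capturing dead stones after the first double pass leaves the margin unchanged, which is the property the proposal is after.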

Are there any simple pathologies with these rules that could make correct play or the final result differ significantly from what it would be under most territory-scoring rules?

 Post subject: Re: Good territory scoring rules for training computers?
Post #2 Posted: Wed Dec 13, 2017 4:39 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
lightvector wrote:
As far as I know, all major attempts to train neural nets for Go have done so in area-scoring rules. For example, AlphaGo used a variation on Tromp-Taylor rules. The critical factor is that area-scoring rules allow one to play out to capture all dead stones and fully resolve the board position without affecting the final result. This makes it easy to completely automate the end-of-game to determine the status of all groups and territories for computer training. But of course, this causes problems when applying the neural net to play with territory scoring rules. For example, Deep Zen, I seem to recall, had problems multiple times in close games in Japanese rules due to this, since its value net was only trained under area scoring rules, causing Deep Zen to screw up in what would have been a 0.5-margin territory game. On DeepMind's side, they completely ignored the problem by just never trying to use AlphaGo for any territory scoring rules.


I have not looked anything up, but as I recall, the problem for Zen was the 6.5 komi instead of the 7.5 komi. Even though it used the 6.5 komi to score its rollouts. (I hope that I misunderstood.)

Quote:
Are there good rules that still allow playout to resolve all statuses on the board but where correct play prior to that cleanup and the final result match what correct play and the final result would be under most territory scoring rules?


Back in the '90s I anticipated this problem, and proposed some rules on a mailing list. They did not catch on. ;)

Quote:
Something like Tromp-Taylor rules except with a button ("Button Go") seem promising. Except that there are still simple or common cases where correct play in Button Go diverges from what it would be in true territory scoring rules, often involving a final ko, right? This seems to make it not ideal for this purpose, since presumably that divergence would still cause problems.


Yeah, button go is a hybrid of area and territory scoring.

Quote:
How about instead of a button, one used Tromp-Taylor rules with the following modifications?
* Every time a player makes a move on the board, that player also loses 1 point (passes do not lose points).


If a board play loses 1 point you have chilled go, which is fine in theory, but is worse as regards ko than territory scoring.

Edit: Oh, I see what you are doing. You are chilling area scoring to get territory scoring. ;)

I may still have my suggestion around somewhere, but here is the thing. For "true" territory scoring you want ko fights to end at temperature zero. For that to happen you want, not a button that loses ½ pt., but a large number of buttons that do not affect the score; i.e., virtual dame. Like dame they should lift ko bans, so that kos are resolved at temperature zero. That still leaves questions like scoring Bent Four in the Corner and Three Points without Capturing, but these can be resolved in an encore such as Lasker-Maas or Spight rules have. That is not exactly like Japanese or Korean scoring, since you may be able to score some points in seki or have a ko fight at temperature -1. But you should be able to program such rules fairly easily. :) And they would be "true" territory rules.

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.


Last edited by Bill Spight on Thu Dec 14, 2017 1:41 am, edited 2 times in total.

This post by Bill Spight was liked by: lightvector
 Post subject: Re: Good territory scoring rules for training computers?
Post #3 Posted: Thu Dec 14, 2017 1:31 am 
Judan

Posts: 6087
Liked others: 0
Was liked: 786
Simply speaking, good territory scoring rules do not and cannot exist, because the regular and playout phases need different rules, and pass stones, as an attempt to keep the same rules for both, fail by leading to pass-fights. Button go changes go so much that proponents of territory scoring reject it. The Simplified Japanese Rules are IMO the best candidate, but they do need different rules for the two phases, and hardcore proponents would still object to the remaining changes. They want at least 99.99% of the exceptional cases to behave as in the illogical official Japanese or Korean rules (whose behaviours differ). This conflicts with the programs' need for logical rules that can be implemented within reasonable time and applied with reasonably low computational complexity, and the conflict can never be resolved. From a rules POV, things are solved as the approximations above. However, it is a political question whether to force programs to lose because their programmers do not waste a great deal of time on implementing an approximation to illogical rules, about as much time as is needed to create a strong program for the regular phase.

 Post subject: Re: Good territory scoring rules for training computers?
Post #4 Posted: Thu Dec 14, 2017 11:36 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
As we know, it is not easy to program Japanese and Korean rules. The Japanese '89 rules are an attempt at a rationalized rule set, but they seem ambiguous to me. I have not seen the latest Korean rules, but the earlier rules that I saw seemed to me to require judgement to implement. Humans can resolve ambiguity and exercise judgement, but computer programs, neural nets aside, are more like the Good Soldier Schweik. Clarity and precision are important. Simplicity is also important, if you are going to play thousands of games per second. :)

lightvector wrote:
Are there good rules that still allow playout to resolve all statuses on the board but where correct play prior to that cleanup and the final result match what correct play and the final result would be under most territory scoring rules?


Since the main territory rules are Japanese and Korean, my guess is no. :( However, there are other territory rules without the special cases of Japanese and Korean rules. (BTW, by referring to special cases I do not mean to criticize those rules, just to say that they are not the most general or simple territory rules. :))

Quote:
Something like Tromp-Taylor rules except with a button ("Button Go") seem promising. Except that there are still simple or common cases where correct play in Button Go diverges from what it would be in true territory scoring rules, often involving a final ko, right? This seems to make it not ideal for this purpose, since presumably that divergence would still cause problems.


I agree. Under "true" territory rules kos should be resolved at temperature 0. With Button Go they may be resolved at lower temperatures.

As I said earlier, to have kos resolved at temperature zero you need some temperature zero plays if and when you run out of dame. I called these plays virtual dame. Like actual dame, they must lift any ko and superko bans. Tromp-Taylor rules are area rules, and so simply providing virtual dame is not enough, because you also have to make it so that it does not matter who gets the last dame. The ½ pt. button solves that problem. :) Both the virtual dame and the button may be implemented as passes. Below is how that might be done with area counting.

With area counting the button gains ½ pt., and the virtual dame each gain 1 pt. There are two phases to the game; in phase one the players play regular territory go; in phase two they play regular area go to eliminate any dead stones, and they perhaps take one way dame. Playing area go to eliminate dead stones is consistent with territory scoring. Doing so will not alter the territory score. Taking one way dame will affect the territory score, but there you go. ;) Taking the button separates the two phases.

What does it mean to take the button? Since early passes, as virtual dame, lift ko bans, how do you end phase one? One way would be with three consecutive passes, as the first pass lifts any ko ban, the second and third passes show that neither player "wants" to play in a ko or superko, or make any other board play, so three consecutive passes could end phase one. However, one player might play something like Sending Two Returning One every time the opponent passed, so that there is no sequence of consecutive passes. Hence my rule: Phase one ends when the same player passes a second time in the same board position. This phase ending pass is equivalent to the button, and so gains only ½ pt. After this, the second phase continues with passes gaining nothing. My preference is for passes to lift ko and superko bans, and to end this phase the same way, but most people seem to prefer ending play with two consecutive passes. The difference matters only in very rare cases. You can also count no territory in seki, to approach Japanese and Korean rules. :)
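A rough sketch of how the pass accounting described above might be coded under area counting. The class name and the `position_key` argument (any hashable encoding of the board, such as a Zobrist hash) are assumptions for illustration; lifting ko and superko bans on a pass is left to the move generator and is not shown.

```python
from fractions import Fraction


class PassLedger:
    """Credits passes as virtual dame and detects the button pass."""

    def __init__(self):
        self.seen = set()  # (player, position_key) pairs already passed in
        self.points = {'B': Fraction(0), 'W': Fraction(0)}
        self.phase = 1

    def record_pass(self, player, position_key):
        """Credit one pass and return the points it gained."""
        if self.phase == 2:
            return Fraction(0)  # passes gain nothing in phase two
        key = (player, position_key)
        if key in self.seen:
            # Same player, same position, second pass: this is the button.
            gain = Fraction(1, 2)
            self.phase = 2
        else:
            self.seen.add(key)
            gain = Fraction(1)  # an ordinary pass is a virtual dame
        self.points[player] += gain
        return gain
```

With three passes in a row in one position (Black, White, Black), Black's second pass is the button, so Black collects 1½ pts. from passes and White 1 pt.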

Edit: It may be possible, with these rules, for the players to collaborate to produce a never-ending game. You can alter the rules to take care of that, but for training purposes why bother?

 Post subject: Re: Good territory scoring rules for training computers?
Post #5 Posted: Thu Dec 14, 2017 11:49 am 
Judan

Posts: 6087
Liked others: 0
Was liked: 786
I have a copy of Korean Rules a few years old. They are still very illogical.

 Post subject: Re: Good territory scoring rules for training computers?
Post #6 Posted: Thu Dec 14, 2017 4:17 pm 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
lightvector wrote:
* Every time a player makes a move on the board, that player also loses 1 point (passes do not lose points).
Nice and interesting idea, kind of reverse AGA. Besides minor things like onesided dame it may actually get close to what you want in theory. Could it ever be advantageous to play first in the 2nd phase?

OTOH, in practice I wonder if this would appeal to bot authors. The two phases need different strategies, so either two NNs or at least an extra phase bit or feature plane (and even worse, some play/prisoner count). And that identical board positions need different policy distributions (or value estimates) may even reduce bot strength. But who knows, Zen authors may still prefer this to their hacks. :)

For the same purpose, real Japanese-style rules would obviously be much less practical but theoretically may still be possible. For example use two phases, and define the score as the territory score of the board position after the first two passes, with dead stones defined as strings with all stones on what is the opponent's pass-alive area after the second two passes. (So the bot would play a mandatory cleanup/dispute phase internally.)
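The dead-stone definition in that last paragraph could be sketched like this. It assumes the strings are taken from the position after the first two passes and that each side's pass-alive area after the cleanup phase has already been computed elsewhere (for example with Benson's algorithm); the function name and data layout are made up for illustration.

```python
def dead_strings(strings, pass_alive_area):
    """Return the strings that lie entirely inside the opponent's pass-alive area.

    strings: dict mapping 'B'/'W' to a list of sets of (row, col) points,
             taken from the position after the first two passes.
    pass_alive_area: dict mapping 'B'/'W' to the set of points in that
             colour's pass-alive area after the second two passes.
    """
    opponent = {'B': 'W', 'W': 'B'}
    dead = []
    for colour, groups in strings.items():
        enemy_area = pass_alive_area[opponent[colour]]
        for group in groups:
            if group <= enemy_area:  # every stone sits on enemy pass-alive area
                dead.append((colour, group))
    return dead
```

The territory score would then be counted on the first position with these strings removed as prisoners.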

 Post subject: Re: Good territory scoring rules for training computers?
Post #7 Posted: Thu Dec 14, 2017 9:11 pm 
Judan

Posts: 6087
Liked others: 0
Was liked: 786
moha wrote:
lightvector wrote:
* Every time a player makes a move on the board, that player also loses 1 point (passes do not lose points).
Nice and interesting idea, kind of reverse AGA. Besides minor things like onesided dame it may actually get close to what you want in theory.


I think such a rule (without also using the area scoring defining rule to let White make the last move) leads to pass-fights. If so, it is NOT reverse AGA and is NOT what one wants and is NOT just minor things being different.

Quote:
real Japanese-style rules would obviously be much less practical but theoretically may still be possible.


"Real" Japanese-style rules are computationally arbitrarily complex because of demanding perfect playout play. Sampling approximations are not good enough. Programs need (mathematical) proof play or complete checking of all variations to verify statuses. There are my Japanese 2003 Rules, which can be worked out to an algorithm, so theoretically possible - yes. Computationally possible for the general position? No. Proof play (not to mention complete checking of all variations) is too complex in most positions. Note that ANY position can be a scoring position and it must be possible in practice to score it WITHOUT APPROXIMATION, because that is what "real" Japanese-style rules demand.

 Post subject: Re: Good territory scoring rules for training computers?
Post #8 Posted: Thu Dec 14, 2017 9:27 pm 
Lives in sente

Posts: 727
Liked others: 44
Was liked: 218
GD Posts: 10
It might be easier to train an AI to be super good at playing White with 0.5 komi under Chinese rules and use that for even games under Japanese rules when playing White, and to train it playing Black at 13.5 komi.

 Post subject: Re: Good territory scoring rules for training computers?
Post #9 Posted: Fri Dec 15, 2017 1:50 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
I thought it might be interesting to play out the following position according to the rules I have suggested here. It is moha's example against button go in this post: viewtopic.php?p=224689#p224689

[go]$$cm37 Straightforward play
$$ - - - - - - - - - -
$$ | 4 O 2 . . . . . . |
$$ | W X O O O O O . . |
$$ | 3 1 X X X X O . . |
$$ | X X X . X O O . . |
$$ | . . . X X O . O . |
$$ | . . . X O O . . . |
$$ | . . . . X O O O O |
$$ | . . . X . X X X X |
$$ | . . . . . . . . . |
$$ - - - - - - - - - -[/go]


:b41:, :w42:, :b43: = pass

:b43: is the button pass. There is no need for an encore, as there are no dead stones to capture. Further plays or passes would not alter the score.

White gets 37 pts. on the board plus 1 pt. for the pass, :w42:, plus 7 pts. for komi, for a total of 45 pts.
Black gets 44 pts. on the board plus 1 pt. for the pass, :b41:, plus ½ pt. for the "button" pass, :b43:, for a total of 45½ pts. Black wins by ½ pt.

Under Japanese and Korean rules with 6½ komi Black gets 24 pts. and White gets 23½ pts. Black wins by ½ pt.

Now let's look at possible play when White prolongs the ko fight.

[go]$$cm37 Ko fight
$$ - - - - - - - - - -
$$ | 5 O 4 . . . . . . |
$$ | W X O O O O O . . |
$$ | 3 1 X X X X O . . |
$$ | X X X . X O O . . |
$$ | . . . X X O . O . |
$$ | . . . X O O . . . |
$$ | . . 7 6 X O O O O |
$$ | . . . X . X X X X |
$$ | . . . . . . . . . |
$$ - - - - - - - - - -[/go]


:w38: = pass, :w44: = ko, :b45:, :w46: = pass

[go]$$cm47 Ko fight
$$ - - - - - - - - - -
$$ | 1 O O . . . . . . |
$$ | O X O O O O O . . |
$$ | X X X X X X O . . |
$$ | X X X . X O O . . |
$$ | . . . X X O . O . |
$$ | . . . X O O . . . |
$$ | . . X . X O O O O |
$$ | . . . X 2 X X X X |
$$ | . . . . 3 . . . . |
$$ - - - - - - - - - -[/go]


:w50: = ko, :b51: = pass, :w52: = ko, :b53:, :w54:, :b55: = pass

White gets 37 pts. on the board plus 3 pts. for passes, plus 7 pts. komi, for a total of 47 pts.
Black gets 44 pts. on the board plus 3½ pts. for passes, for a total of 47½ pts. The ko is resolved at temperature zero by territory scoring (temperature one by area scoring), and then Black takes the button for ½ pt.
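As a quick check of the arithmetic for both lines of play above, the totals can be recomputed with exact fractions. The helper name is hypothetical; the board counts, pass counts and the 7 pt. area komi are the figures quoted in this post.

```python
from fractions import Fraction


def area_total(board, full_passes, button=False, komi=0):
    """Board area + 1 pt. per ordinary pass + 1/2 pt. for the button + komi."""
    bonus = Fraction(1, 2) if button else Fraction(0)
    return Fraction(board) + full_passes + bonus + komi


# Straightforward play: each side passes once, then Black takes the button.
straight_margin = area_total(44, 1, button=True) - area_total(37, 1, komi=7)

# Prolonged ko fight: three ordinary passes each, then Black takes the button.
ko_margin = area_total(44, 3, button=True) - area_total(37, 3, komi=7)

# Territory count with 6 1/2 komi: Black 24, White 17 + 6 1/2 = 23 1/2.
territory_margin = Fraction(24) - (17 + Fraction(13, 2))

print(straight_margin, ko_margin, territory_margin)
```

All three margins come out to ½ pt. for Black, so prolonging the ko fight gains White nothing and the result agrees with the territory count.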

 Post subject: Re: Good territory scoring rules for training computers?
Post #10 Posted: Fri Dec 15, 2017 7:21 am 
Lives in sente

Posts: 757
Liked others: 114
Was liked: 916
Rank: maybe 2d
RobertJasiek wrote:
moha wrote:
lightvector wrote:
* Every time a player makes a move on the board, that player also loses 1 point (passes do not lose points).
Nice and interesting idea, kind of reverse AGA. Besides minor things like onesided dame it may actually get close to what you want in theory.


I think such a rule (without also using the area scoring defining rule to let White make the last move) leads to pass-fights. If so, it is NOT reverse AGA and is NOT what one wants and is NOT just minor things being different.



I'm curious now what you had in mind. Robert, could you give an example of a such a pass fight using these rules?

moha wrote:
OTOH, in practice I wonder if this would appeal to bot authors. The two phases need different strategies, so either two NNs or at least an extra phase bit or feature plane (and even worse, some play/prisoner count). And that identical board positions need different policy distributions (or value estimates) may even reduce bot strength. But who knows, Zen authors may still prefer this to their hacks. :)


Personally, I'd want my value net to be adaptable to a reasonable range of komi, and would randomly pick various komi in a certain range when generating training data (such as via self-play) if I were trying to make a strong Go bot. The komi would need to be an input to the neural net, so it's no trouble to also merge the play/prisoner count difference into that. But yeah, it does make things a bit more complex.
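Something like the following is what I have in mind, as a minimal sketch. The komi range, the function names and the idea of feeding one scalar to the net are assumptions for illustration, not a description of any existing bot.

```python
import random


def sample_komi(rng, low=5.5, high=9.5):
    """Pick a half-integer komi uniformly from low, low + 1, ..., high."""
    steps = int(high - low)
    return low + rng.randint(0, steps)


def effective_komi(komi, moves_black, moves_white):
    """Fold the per-move penalties into a single scalar input for the net.

    Each board move before cleanup costs its player 1 point, so from
    White's side the handicap is komi plus Black's extra moves.
    """
    return komi + (moves_black - moves_white)


rng = random.Random(0)
komi = sample_komi(rng)                  # drawn once per self-play game
net_input = effective_komi(komi, 30, 28)
```

The net then sees a single number that already accounts for the move-count difference, so no separate feature is needed for it.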

 Post subject: Re: Good territory scoring rules for training computers?
Post #11 Posted: Fri Dec 15, 2017 7:38 am 
Judan

Posts: 6087
Liked others: 0
Was liked: 786
I am too busy for checking pass-fight examples but you can check the standard examples on Sensei's.

 Post subject: Re: Good territory scoring rules for training computers?
Post #12 Posted: Fri Dec 15, 2017 9:13 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
lightvector wrote:
Personally, I'd want my value net to be adaptable to a reasonable range of komi, and would randomly pick various komi in a certain range when generating training data (such as via self-play) if I were trying to make a strong Go bot.


Yes, and some handicap games, too, against earlier, weaker versions of itself. :)

Also, I came up with my "same player passes twice in the same position" rule in the '90s for straight area scoring. With the rules I propose here, Sending Two Returning One produces the same area position, but the subsequent pass gains one point, so the three consecutive pass rule to end phase one works. :) Phase two can follow Tromp-Taylor rules.

Edit: It has been a while. Sending Two Returning One is not the only way one player might defeat the three pass rule. So pass by the same player in the same position seems to be the way to go.



Last edited by Bill Spight on Sat Dec 16, 2017 12:31 pm, edited 1 time in total.
 Post subject: Re: Good territory scoring rules for training computers?
Post #13 Posted: Sat Dec 16, 2017 10:05 am 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
If this is just about rules for bot training (in light of Zen's problems with komi hacks), what would be an acceptable minimum? Would eliminating the differences with dame parity AND endgame ko temperature be enough for a bot to be safe? (onesided dame aside)

For example, it may be possible to trick a bot playing under such emulation in close games (if the game is actually Japanese). By getting it to play all dame in the 2nd phase, it can be made to think it now wins by several points. Then it may answer some tricky threat in its territory in a point-losing way (a sure win is better than a big win, right? :))

lightvector wrote:
moha wrote:
And that identical board positions need different policy distributions (or value estimates) may even reduce bot strength. But who knows, Zen authors may still prefer this to their hacks. :)
Personally, I'd want my value net to be adaptable to a reasonable range of komi, and would randomly pick various komi in a certain range when generating training data (such as via self-play) if I were trying to make a strong Go bot. The komi would need to be an input to the neural net, so it's no trouble to also merge the play/prisoner count difference into that.
Beyond komi, the correct moves (policy net) are different in the two phases / temperatures, this would be the bigger hurdle I think.

Btw this also has some connection to my earlier idea of button variant ("less stones played (before first two passes) win ties, B wins if still tie"). For ties (no adjustment of bigger differences), rewarding less stones played is like penalizing more stones played.

I also think this goes past bots. Since Japanese-style rules with a single phase CAN easily be used for most human games (with nice properties like no dame fill), two-phase rules that leave this first phase intact would be of real value (most players will be unaware of extra rules anyway, so they are best only applied to additional phases / problem cases). For example, could something (like either of your or Bill's idea) be worked out in a way that it would only apply IF either player resumes the game after the first double pass (normal Japanese scoring on agreement otherwise), AND correct play at the end of the 1st phase remains unchanged?

RobertJasiek wrote:
moha wrote:
real Japanese-style rules would obviously be much less practical but theoretically may still be possible.
"Real" Japanese-style rules are computationally arbitrarily complex because of demanding perfect playout play.
I think they also work reasonably well with using the players' playouts. Like my idea above: "use two phases, and define the score as the territory score of the board position after the first two passes, with dead stones defined as strings with all stones on what is the opponent's pass-alive area after the second two passes". Not perfect OC, but the bigger problem is the same dual strategy issue as above.

 Post subject: Re: Good territory scoring rules for training computers?
Post #14 Posted: Sat Dec 16, 2017 12:57 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
moha wrote:
If this is just about rules for bot training (in light of Zen's problems with komi hacks), what would be an acceptable minimum? Would eliminating the differences with dame parity AND endgame ko temperature be enough for a bot to be safe? (onesided dame aside)


I suspect that for bot training, button go would be good enough. The questions about ko, one way dame, etc., occur infrequently enough that I suspect that they would have little effect on training. OC, that is an empirical question.

Quote:
For example, it may be possible to trick a bot playing under such emulation in close games (if the game is actually Japanese). By getting it to play all dame in the 2nd phase, it can be made to think it now wins by several points. Then it may answer some tricky threat in its territory in a point-losing way (a sure win is better than a big win, right? :))


Playing neutral dame in the second phase is an error, as a rule. But, as we know, bots make larger endgame errors. So, sure, a bot could be fooled. Since nobody knows how to eliminate the larger errors, I don't think that minor changes to the rules would make much difference.

moha wrote:
For ties (no adjustment of bigger differences), rewarding less stones played is like penalizing more stones played.


Good point.

Quote:
I also think this goes past bots. Since Japanese-style rules with a single phase CAN easily be used for most human games (with nice properties like no dame fill), two-phase rules that leave this first phase intact would be of real value (most players will be unaware of extra rules anyway, so they are best only applied to additional phases / problem cases). For example, could something (like either of your or Bill's idea) be worked out in a way that it would only apply IF either player resumes the game after the first double pass (normal Japanese scoring on agreement otherwise), AND correct play at the end of the 1st phase remains unchanged?


Back in the '90s, Lasker-Maas rules, Berlekamp's rules, and my rules, all of which have an encore (second phase, possibly optional) were devised to be played by humans. Back in the '70s I also wrote some rules to be used by humans. In the '60s Ikeda devised a number of territory rule sets with encores, also for human use. :)

 Post subject: Re: Good territory scoring rules for training computers?
Post #15 Posted: Sat Dec 16, 2017 2:25 pm 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
Bill Spight wrote:
Quote:
For example, it may be possible to trick a bot playing under such emulation in close games (if the game is actually Japanese). By getting it to play all dame in the 2nd phase, it can be made to think it now wins by several points. Then it may answer some tricky threat in its territory in a point-losing way (a sure win is better than a big win, right? :))
Playing neutral dame in the second phase is an error, as a rule. But, as we know, bots make larger endgame errors.
The bot may pass on odd dame in the 1st phase, which can be seen as correct since it has higher winrate (allows an opponent to err by passing as well). Then playing dame in 2nd is also correct.

Quote:
Quote:
I also think this goes past bots. Since Japanese-style rules with a single phase CAN easily be used for most human games (with nice properties like no dame fill), two-phase rules that leave this first phase intact would be of real value (most players will be unaware of extra rules anyway, so they are best only applied to additional phases / problem cases). For example, could something (like either of your or Bill's idea) be worked out in a way that it would only apply IF either player resumes the game after the first double pass (normal Japanese scoring on agreement otherwise), AND correct play at the end of the 1st phase remains unchanged?

Back in the '90s, Lasker-Maas rules, Berlekamp's rules, and my rules, all of which have an encore (second phase, possibly optional) were devised to be played by humans. Back in the '70s I also wrote some rules to be used by humans. In the '60s Ikeda devised a number of territory rule sets with encores, also for human use. :)
Sure, but I wonder if any of these fit? Most of them change correct play (dame needs to be played, some even have further artifacts), and yours change the 1st phase (doesn't stop on two passes). :) Actually I'm not sure if it's even possible to meet those conditions.

 Post subject: Re: Good territory scoring rules for training computers?
Post #16 Posted: Sat Dec 16, 2017 2:51 pm 
Lives in gote

Posts: 340
Location: Spain
Liked others: 181
Was liked: 41
Rank: Low
Sounds like playing Go under true Japanese rules would make for a good Turing test. :)


This post by luigi was liked by: Bill Spight
 Post subject: Re: Good territory scoring rules for training computers?
Post #17 Posted: Sat Dec 16, 2017 4:28 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
moha wrote:
Bill Spight wrote:
Back in the '90s, Lasker-Maas rules, Berlekamp's rules, and my rules, all of which have an encore (second phase, possibly optional) were devised to be played by humans. Back in the '70s I also wrote some rules to be used by humans. In the '60s Ikeda devised a number of territory rule sets with encores, also for human use. :)
Sure, but I wonder if any of these fit? Most of them change correct play (dame needs to be played, some even have further artifacts), and yours change the 1st phase (doesn't stop on two passes). :) Actually I'm not sure if it's even possible to meet those conditions.


Not playing dame in the first phase under Japanese rules used to be the custom, and it was allowed by a lenient reading of the '89 rules, but after some end of game problems occurred, the custom has changed to filling the dame, I understand. :) Certainly, playing all the neutral points in the first phase is, except for rare anomalies that depend upon the strange seki rule, correct play. The only way that the J89 rules changed my game was to fill all the dame before passing, to avoid strange sekis. As far as I can tell, the changes in the rule sets I mentioned to correct play in the first phase depend upon the different scoring in the second phase, or in the changes in the ko rules. If you count points in seki, for instance, that can have a large difference in correct play in the first phase.

The third pass in my rules is a result of having passes lift ko or superko bans, which is to have kos resolved in the first phase. The Japanese '49 rules simply decreed that kos be resolved; the Japanese '89 rules let unresolved kos be played anew in hypothetical play, so that leaving them unresolved would be a bad idea, as a rule. I know of no rule set that reproduces the Japanese '49 rules with an encore, hypothetical or not. And I know of no rule set that reproduces the Japanese '89 rules with an actual encore.

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

 Post subject: Re: Good territory scoring rules for training computers?
Post #18 Posted: Sat Dec 16, 2017 9:36 pm 
Judan

Posts: 6087
Liked others: 0
Was liked: 786
moha wrote:
Since Japanese-style rules with a single phase CAN easily be used for most human games


Sigh. Such a claim has no meaning. "Can be used" overlooks the possibility of pass-fights, which mean that a single phase does NOT reproduce human games - in almost all human games. If, however, you design rules with a single phase, with territory-like scoring and without pass-fights, they are not Japanese-style rules. Button rules are not single phase rules because there is the button. Territory scoring rules with a single phase and playout alternation are known to have frequent pass-fights. See Sensei's for examples. Modifying such rules to prohibit pass-fights creates highly complicated two-phase rules.

What is possible (see the Simplified Japanese Rules) is Japanese-style rules with a SINGLE PLAYOUT ALTERNATION (the second phase). Such rules behave like almost all human games under Japanese rules and can be modified by adding exceptions for sekis etc. to be even closer. Computationally, AI programs can be trained to play well under such rules, but we do not get 100% perfect play because AI can play suboptimally during playout; this is not a problem of the rules but a problem of better AI training.

Quote:
"use two phases, and define the score as the territory score of the board position after the first two passes, with dead stones defined as strings with all stones on what is the opponent's pass-alive area after the second two passes".


Such would need to be worked out.
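For what it is worth, the dead-stone part of the quoted definition can at least be written down as a plain set test. The following is only a sketch in Python under stated assumptions: the names `dead_strings`, `strings_after_phase1` and `pass_alive_area_after_phase2` are made up for illustration, and the pass-alive areas are assumed to be computed elsewhere (for example by Benson's algorithm). It does not address sekis, kos or the scoring itself, which is exactly what would still need to be worked out.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Point = Tuple[int, int]
String = Tuple[str, FrozenSet[Point]]  # (color, stones), color is "B" or "W"


def opponent(color: str) -> str:
    """Return the other color."""
    return "W" if color == "B" else "B"


def dead_strings(
    strings_after_phase1: List[String],
    pass_alive_area_after_phase2: Dict[str, Set[Point]],
) -> List[String]:
    """Strings from the position after the first two passes whose stones ALL
    lie on the opponent's pass-alive area after the second two passes.

    Assumption: pass_alive_area_after_phase2[color] is the set of points
    (stones and enclosed points) belonging to that color's pass-alive area.
    """
    dead = []
    for color, stones in strings_after_phase1:
        if stones <= pass_alive_area_after_phase2[opponent(color)]:
            dead.append((color, stones))
    return dead


# Hypothetical toy data: one white string fully inside Black's pass-alive
# area, one white string only partly inside it, and one black string.
strings = [
    ("W", frozenset({(0, 0), (0, 1)})),
    ("W", frozenset({(2, 2), (2, 3)})),
    ("B", frozenset({(5, 5)})),
]
areas = {"B": {(0, 0), (0, 1), (1, 0), (2, 2)}, "W": set()}
dead = dead_strings(strings, areas)
```

With this toy data only the first white string counts as dead; the second has a stone outside Black's pass-alive area and so stays on the board for scoring.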

 Post subject: Re: Good territory scoring rules for training computers?
Post #19 Posted: Sat Dec 16, 2017 10:56 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
RobertJasiek wrote:
Button rules are not single phase rules because there is the button.


That is a matter of definition. Button go is a kind of coupon or token go with only one or two tokens of a certain size. You do not have to define token go as having more than one phase. :)

Quote:
Territory scoring rules with a single phase and playout alternation are known to have frequent pass-fights.


Robert has a broad definition of pass fight.

Quote:
Modifying such rules to prohibit pass-fights creates highly complicated two-phase rules.


Original pass fights were of the following kind. Under AGA rules two consecutive passes end play, but a third pass is needed with territory counting if Black passes last, so that White makes the last pass. (The player who passes hands over a pass stone.) This produces area scoring with territory counting, because each player has the same number of stones on the board when the score is counted, so the difference in territory is the same as the difference in area. Some people thought that they could modify AGA rules to produce territory scoring by simply not requiring the third pass by White. This had the effect of players avoiding the second consecutive pass, if possible. They could do that by playing a sente move after an initial pass and then passing after the opponent's response. Then they would pass and force the opponent to make the last pass unless he could also interpose a sente play. Making the last pass at the cost of a pass stone is a disadvantage.
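The arithmetic behind that equivalence can be checked with a small sketch. The Python below is illustrative only: the `SideCount` class and the numbers are invented, handicap stones and komi are ignored, and "stones lost" is assumed to cover captures, dead stones removed at the end, and pass stones handed over. When both sides have made the same number of moves, the territory-count margin equals the area margin; when White skips the extra pass, the two differ by one point.

```python
from dataclasses import dataclass


@dataclass
class SideCount:
    territory: int        # empty points surrounded by this side
    stones_on_board: int  # this side's stones left on the board at counting
    stones_lost: int      # this side's stones held by the opponent:
                          # captures, removed dead stones, pass stones


def area_score(s: SideCount) -> int:
    """Area scoring: territory plus own stones on the board."""
    return s.territory + s.stones_on_board


def territory_count(s: SideCount) -> int:
    """Territory counting: territory minus own stones held by the opponent."""
    return s.territory - s.stones_lost


def moves_made(s: SideCount) -> int:
    """Every move (board play or pass) costs exactly one stone."""
    return s.stones_on_board + s.stones_lost


# Invented 19x19 position: 45 + 140 + 40 + 136 = 361 points.
black = SideCount(territory=45, stones_on_board=140, stones_lost=10)

# White makes the extra pass, so both sides have made 150 moves.
white_equal = SideCount(territory=40, stones_on_board=136, stones_lost=14)
area_margin = area_score(black) - area_score(white_equal)
count_margin = territory_count(black) - territory_count(white_equal)

# Naive variant: White does not make the extra pass (one pass stone fewer).
white_naive = SideCount(territory=40, stones_on_board=136, stones_lost=13)
naive_margin = territory_count(black) - territory_count(white_naive)
```

Here `area_margin` and `count_margin` agree, while `naive_margin` is one point worse for Black, the player who passed last. That one-point swing is what the players in a pass fight are fighting over.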

First, these rules are not simply territory rules, but naive territory rules. Second, the disadvantage of the last pass may be eliminated by making the last pass cost only ½ point instead of 1 pt. Then it does not matter who makes the last pass, and there is no reason for this kind of pass fight. That is a property of Double Button Go and one of Berlekamp's suggested rules. It is not especially complicated.

Edit: Not that having the last pass cost only ½ pt. makes AGA scoring territory scoring. It is still area scoring, but worth ½ pt. less for Black. ;) But there is no pass fight.

Edit 2: That being the case, is it right to call these rules naive territory rules, or are they not territory rules at all?

 Post subject: Re: Good territory scoring rules for training computers?
Post #20 Posted: Sun Dec 17, 2017 6:36 am 
Lives in gote

Posts: 311
Liked others: 0
Was liked: 45
Rank: 2d
Bill Spight wrote:
moha wrote:
Sure, but I wonder if any of these fit? Most of them change correct play (dame needs to be played, some even have further artifacts), and yours changes the 1st phase (doesn't stop on two passes). :) Actually I'm not sure if it's even possible to meet those conditions.
Not playing dame in the first phase under Japanese rules used to be the custom, and it was allowed by a lenient reading of the '89 rules, but after some end of game problems occurred, the custom has changed to filling the dame, I understand. :) Certainly, playing all the neutral points in the first phase is, except for rare anomalies that depend upon the strange seki rule, correct play. The only way that the J89 rules changed my game was to fill all the dame before passing, to avoid strange sekis. As far as I can tell, the changes in the rule sets I mentioned to correct play in the first phase depend upon the different scoring in the second phase, or in the changes in the ko rules. If you count points in seki, for instance, that can have a large difference in correct play in the first phase.
By not changing correct play I meant that the players don't know in advance, or prepare for the possibility, that there will be a dispute or extra phase. They play on the assumption that the game can end the usual way, with agreement. So the extra rules must be suitable to be applied unexpectedly, after a normal game stop, without this being any disadvantage for a player. Moves played assuming agreement must be ok even if a dispute happens - hence dame fill is out.

Ikeda notes that between the 1st and 2nd phase there is a point where playing is worth nothing for either player, and even letting the opponent move twice is no problem (that's why two passes happen). I think this needs to be taken further. The players consider all plays worthless on the above assumption of agreement. So when they realize there is no agreement, their opinion and strategic choice of moves may change.

I now think this is obviously possible: the rules only need three phases (with the 2nd still played in territory mode). The slight complication (in case of dispute) is a tiny price for allowing 99.9% of cases to go without ANY complication or (knowledge of) extra rules, and without any rules tampering with the first phase. And in the remaining cases the players can even be reminded of the procedure (by an online server for example - it's not too late to learn about the rules at this point! :)).

There is also the minor case where the first pass happened before retaking a ko, for lack of a threat or dame (I guess this is one reason behind your stopping rule - I agree that passes lift bans, this is obvious), but I'm not sure if this needs special treatment (having the extra territory phase may already mitigate this to an extent). Even if necessary, such complication seems better reserved for the extra phases only, I think. The first phase is best left untouched.
