AI RYUSEI Tournament 2017, December 9-10

pookpooi · Post by **pookpooi** » Sat Dec 09, 2017 7:03 pm

johnsmith wrote:Do you know what are time settings

http://www.igoshogi.net/ai_ryusei/01/en/rules.html

pookpooi · Post by **pookpooi** » Sat Dec 09, 2017 10:27 pm

As expected (no pun intended), DeepZen meets FineArt in the final. This is basically an Alien vs. Predator scenario, because both demolish the other programs with ease. Good thing FineArt blundered yesterday (Tianrang too), so the programmers could fix it; today there's no room for error. FineArt is the favorite to win, but can DeepZen pull off an upset, just as it did when it won the World AI Go Championship in China?

Update: FineArt has a very high chance of winning, according to DeepZenGo. But no matter who wins, we lose.

pookpooi · Post by **pookpooi** » Sun Dec 10, 2017 12:25 am

All second-day game records are here: http://52.198.104.180/secondday.html

EdLee · Post by **EdLee** » Sun Dec 10, 2017 2:01 am

Does anyone happen to know how the developers in Japan (e.g. DeepZen) and outside of Japan (e.g. AlphaZero, FineArt, etc.) feel about implementing area scoring (e.g. AGA's) versus territory scoring ?

( How does Leela do scoring? )

Waylon · Post by **Waylon** » Sun Dec 10, 2017 8:54 am

EdLee wrote:Does anyone happen to know how the developers in Japan (e.g. DeepZen) and outside of Japan (e.g. AlphaZero, FineArt, etc.) feel about implementing area scoring (e.g. AGA's) versus territory scoring ?

( How does Leela do scoring? )

It seems that Leela uses area scoring.

I used SmartGo to create the four attached game files. One uses Japanese rules and komi 6.5, the other three are with 7.5 komi and Chinese, AGA territory and AGA area rules.

SmartGo's scoring function gives the expected results:
Jap = B+0.5, Chi = B+1.5, AGAter = B+1.5, AGAarea = B+1.5

Leela scores differently only with Japanese rules:
Jap = B+2.5, Chi = B+1.5, AGAter = B+1.5, AGAarea = B+1.5

Zenith Go 7 seems to always use territory scoring, regardless of the RU parameter in the sgf file:
Jap = B+0.5, Chi = W+0.5, AGAter = W+0.5, AGAarea = W+0.5

Crazy Stone Deep Learning gives the correct score for Japanese and Chinese rules, but seems to be wrong with AGA rules:
Jap = B+0.5, Chi = B+1.5, AGAter = W+0.5, AGAarea = W+0.5

(It seems I can only attach three files. You can edit the RU parameter in the sgf file with a text editor)
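The pattern in these numbers can be checked with a little arithmetic. This is only a sketch: the raw margins below are inferred from SmartGo's results (B+0.5 at komi 6.5 implies Black is 7 ahead by territory, B+1.5 at komi 7.5 implies 9 ahead by area), and `result` is a hypothetical helper, not anything taken from the programs themselves.

```python
def result(raw_margin, komi):
    """Format Black's raw margin minus komi as an SGF-style result string."""
    diff = raw_margin - komi
    if diff == 0:
        return "0"  # drawn game (jigo)
    winner = "B" if diff > 0 else "W"
    return f"{winner}+{abs(diff)}"

# Assumptions inferred from the SmartGo results, not read from the game files:
TERRITORY_MARGIN = 7.0  # Black's lead by territory scoring, before komi
AREA_MARGIN = 9.0       # Black's lead by area scoring, before komi

print(result(TERRITORY_MARGIN, 6.5))  # B+0.5: territory scoring, Japanese file
print(result(AREA_MARGIN, 7.5))       # B+1.5: area scoring, Chinese/AGA files
print(result(AREA_MARGIN, 6.5))       # B+2.5: area scoring applied to the Japanese file
print(result(TERRITORY_MARGIN, 7.5))  # W+0.5: territory scoring applied to a 7.5 komi file
```

If those margins are right, Leela's B+2.5 is what area scoring gives with the 6.5 komi of the Japanese file, and the W+0.5 results are what territory scoring gives with 7.5 komi.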

moha · Post by **moha** » Sun Dec 10, 2017 9:12 am

EdLee wrote:how the developers in Japan (e.g. DeepZen) and outside of Japan (e.g. AlphaZero, FineArt, etc.) feel about implementing area scoring (e.g. AGA's) versus territory scoring ?

AFAIK most programs use area scoring internally, and I recall Zen authors mentioning that for Japanese rules they set an internal komi to ensure that Zen targets a board score that wins in all cases. Still there were some problems with this IIRC, cases where Zen played the usual nonsense endgame, throwing away points thinking it could still win by 0.5, where in fact it lost. Don't remember what went wrong though (OC this approach won't work with one-sided dame for example, but that wasn't the problem in practice).

Waylon wrote:I used SmartGo to create the four attached game files. One uses Japanese rules and komi 6.5, the other three are with 7.5 komi and Chinese, AGA territory and AGA area rules.

The number of B and W stones differs here, with no captures indicated; this further complicates things.

Uberdude · Post by **Uberdude** » Sun Dec 10, 2017 9:55 am

I presumed Ed's question was about how do the various go playing engines score the board internally as part of their algorithms for deciding what move to play, and I believe the answer for pretty much every bot is area scoring as that's easier and allows them to fill in their own territory at no cost. The user interfaces associated with various bots implement territory scoring because that's what their human users like.

I seem to recall that when DeepZen messed up in the WGC, it was blamed on its value network being trained on Chinese/area counting/komi while the tournament was Japanese, so it gave away too much. Also, didn't the AlphaGo Zero paper say they used Tromp-Taylor for counting?

Edit. Yup. From p22.

2. AlphaGo Zero uses Tromp-Taylor scoring during MCTS simulations and self-play training. This is because human scores (Chinese, Japanese or Korean rules) are not well-defined if the game terminates before territorial boundaries are resolved. However, all tournament and evaluation games were scored using Chinese rules.
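For anyone who hasn't seen it, here is a minimal sketch of Tromp-Taylor style area counting: a point counts for a color if it holds a stone of that color, or if it is empty and its empty region touches only that color. The board format (a list of strings with `B`, `W` and `.`) and the function name are my own assumptions for illustration, not anything from the paper, and komi is left out.

```python
from collections import deque

def tromp_taylor_score(board):
    """Return (black, white) area counts for a board given as a list of strings."""
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # every stone on the board counts for its color
            elif (r, c) not in seen:
                # Flood-fill this empty region and note which colors it touches.
                size, borders = 0, set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbor = board[ny][nx]
                            if neighbor != ".":
                                borders.add(neighbor)
                            elif (ny, nx) not in seen:
                                seen.add((ny, nx))
                                queue.append((ny, nx))
                # The region scores only if it reaches exactly one color.
                if len(borders) == 1:
                    score[borders.pop()] += size
    return score["B"], score["W"]

settled = [".B.W."] * 5  # two walls; the middle column touches both colors
unsettled = [".....", ".B...", ".....", "...W.", "....."]

print(tromp_taylor_score(settled))    # (10, 10)
print(tromp_taylor_score(unsettled))  # (1, 1)
```

Note that no dead-stone removal happens here: stones left on the board count as alive, which is why it is easy for a program to apply without any human agreement phase.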

Bill Spight · Post by **Bill Spight** » Sun Dec 10, 2017 11:28 am

Uberdude wrote:I presumed Ed's question was about how do the various go playing engines score the board internally as part of their algorithms for deciding what move to play, and I believe the answer for pretty much every bot is area scoring as that's easier and allows them to fill in their own territory at no cost. The user interfaces associated with various bots implement territory scoring because that's what their human users like.

I seem to recall that when DeepZen messed up in the WGC, it was blamed on its value network being trained on Chinese/area counting/komi while the tournament was Japanese, so it gave away too much. Also, didn't the AlphaGo Zero paper say they used Tromp-Taylor for counting?

Edit. Yup. From p22.
2. AlphaGo Zero uses Tromp-Taylor scoring during MCTS simulations and self-play training. This is because human scores (Chinese, Japanese or Korean rules) are not well-defined if the game terminates before territorial boundaries are resolved. However, all tournament and evaluation games were scored using Chinese rules.

Using Tromp-Taylor scoring, which counts only empty points surrounded by one color as territory, might explain a persistent bias towards taking territory fairly early. If the game ends before territorial boundaries are resolved, which will often happen in the early stages of the development of the program, empty points in areas that are not completely surrounded count for naught. Proximity scoring, OTOH, counts empty points according to the color of the closest stone. The score for games that end prematurely then rewards influence over territory. Using proximity scoring might produce a persistent bias towards making influence early.
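To make the contrast concrete, here is a rough sketch of proximity scoring. The details are assumptions on my part: distance is measured as Manhattan distance, and an empty point equally close to both colors counts for neither. The board format and function name are hypothetical as well.

```python
def proximity_score(board):
    """Return (black, white): stones plus empty points closer to that color."""
    stones = {"B": [], "W": []}
    empties = []
    for r, row in enumerate(board):
        for c, cell in enumerate(row):
            if cell in stones:
                stones[cell].append((r, c))
            else:
                empties.append((r, c))

    score = {color: len(points) for color, points in stones.items()}
    for r, c in empties:
        # Manhattan distance to the nearest stone of each color (assumed metric).
        nearest = {
            color: min((abs(r - y) + abs(c - x) for y, x in points),
                       default=float("inf"))
            for color, points in stones.items()
        }
        if nearest["B"] < nearest["W"]:
            score["B"] += 1
        elif nearest["W"] < nearest["B"]:
            score["W"] += 1
        # Equal distance: the point counts for neither side.
    return score["B"], score["W"]

# One Black stone in the center against one White stone in the corner.
center_vs_corner = ["W....", ".....", "..B..", ".....", "....."]
print(proximity_score(center_vs_corner))  # (15, 3)
```

Under Tromp-Taylor counting the same position would score 1 to 1, since the single empty region touches both colors, so a program stopped at this point sees no difference between the center stone and the corner stone.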

EdLee · Post by **EdLee** » Sun Dec 10, 2017 1:04 pm

Actually, the operative word is feel:

How do the developers in and out of Japan feel about the different rules sets ?

The question is about the humans, not the algorithms.

pookpooi · Post by **pookpooi** » Mon Dec 11, 2017 4:16 am

Uberdude wrote:So for which bot was 'define' a test account?

Define is Tianrang?????

https://twitter.com/ohashihirofumi/stat ... 0502737920

Uberdude · Post by **Uberdude** » Mon Dec 11, 2017 8:36 am

pookpooi wrote:Define is Tianrang?????

Well, it certainly liked the 3-3!

Life In 19x19
