All times are UTC - 8 hours [ DST ]




 Post subject: Bots trained for possibility of ties?
Post #1 Posted: Fri Dec 13, 2019 10:18 am 
Lives with ko

Posts: 237
Location: Pasadena, USA
Liked others: 79
Was liked: 12
Rank: OGS 9 kyu
OGS: Maharani
Are any bots, such as KataGo, trained with ties as a possible outcome for games played with integer komi? I know that komi can be set to 7 in KataGo, but does KataGo actually understand what a tie is? How about other bots?

If the answer is currently no, I think it would be highly interesting to experiment with this.

 Post subject: Re: Bots trained for possibility of ties?
Post #2 Posted: Fri Dec 13, 2019 10:41 am 
Lives in sente

Posts: 757
Liked others: 114
Was liked: 916
Rank: maybe 2d
KataGo models a tie as half a win and half a loss (this is configurable, though!) and behaves accordingly; the reported winrate reflects this.

I've had one user request explicit modeling of the probability of a tie. Unfortunately, I never got around to doing this: tracking it separately from the winrate would take some work and add complexity to the code, so it's just folded into the winrate. But aside from not being able to explicitly visualize the predicted probability of a tie, it should be handled correctly.
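As a rough illustration of what "folded into the winrate" means, here is a minimal Python sketch. It is not KataGo's code: the function name `folded_winrate` and the parameter `draw_value` are hypothetical, and it assumes the three outcome probabilities are available separately, which is exactly what the post says is not tracked.

```python
def folded_winrate(p_win: float, p_loss: float, p_draw: float,
                   draw_value: float = 0.5) -> float:
    """Collapse (win, loss, draw) probabilities into one winrate-like number.

    draw_value is the fraction of a win that a draw is worth
    (0.5 means half a win and half a loss). Hypothetical parameter name.
    """
    total = p_win + p_loss + p_draw
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # A loss contributes nothing; a draw contributes draw_value of a win.
    return p_win + draw_value * p_draw


if __name__ == "__main__":
    # 40% win, 35% loss, 25% draw: the draw mass is split evenly.
    print(folded_winrate(0.40, 0.35, 0.25))
    # Valuing a draw as a full win shifts all of the draw mass to the winrate.
    print(folded_winrate(0.40, 0.35, 0.25, draw_value=1.0))
```

Once the numbers are folded like this, the draw probability can no longer be recovered from the winrate alone, which is why it cannot be visualized separately.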

 Post subject: Re: Bots trained for possibility of ties?
Post #3 Posted: Fri Dec 13, 2019 10:46 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
Maharani wrote:
Are any bots, such as KataGo, trained with ties as a possible outcome for games played with integer komi? I know that komi can be set to 7 in KataGo, but does KataGo actually understand what a tie is? How about other bots?

If the answer is currently no, I think it would be highly interesting to experiment with this.


What do you do with ties during training? If you are trying to decide which program is stronger, then ignoring ties is preferable. But if you train for that, then the bot may not learn to prefer a tie to a loss. So it may be better to count ties, one way or another. Maybe the old program should get to count a tie as a win, I dunno.

_________________
The Adkins Principle:
At some point, doesn't thinking have to go on?
— Winona Adkins

Visualize whirled peas.

Everything with love. Stay safe.

 Post subject: Re: Bots trained for possibility of ties?
Post #4 Posted: Fri Dec 13, 2019 10:47 am 
Lives with ko

Posts: 237
Location: Pasadena, USA
Liked others: 79
Was liked: 12
Rank: OGS 9 kyu
OGS: Maharani
Fascinating - thank you for the swift reply! :) Yeah, would be very interesting to know what the probability of a tie is for an empty board with 7 komi, but I understand that this would be too complex to implement.

Bill Spight wrote:
Maybe the old program should get to count a tie as a win, I dunno.


The points you raised make sense to me, but this seems like a great solution :)

 Post subject: Re: Bots trained for possibility of ties?
Post #5 Posted: Fri Dec 13, 2019 12:13 pm 
Lives in gote

Posts: 445
Liked others: 0
Was liked: 37
Bill Spight wrote:
What do you do with ties during training? If you are trying to decide which program is stronger, then ignoring ties is preferable.

I think ties just pull the value net for the given game towards 0, the correct output. I also doubt ignoring ties is preferable in any case (in test matches they should pull the strength difference towards 0 as well; otherwise, if A wins 1 and ties 9, you would think it is much stronger).
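The "A wins 1 and ties 9" case can be worked through in a few lines. This is only a sketch under stated assumptions: value targets are taken to be +1 / 0 / -1 for win / tie / loss, and the function names are made up for illustration.

```python
def mean_value_target(wins: int, ties: int, losses: int) -> float:
    """Average value target when win=+1, tie=0, loss=-1 (assumed encoding)."""
    games = wins + ties + losses
    return (wins - losses) / games


def score_fraction(wins: int, ties: int, losses: int,
                   count_ties: bool = True) -> float:
    """Match score for A, either counting ties as half a point or dropping them."""
    if count_ties:
        return (wins + 0.5 * ties) / (wins + ties + losses)
    decisive = wins + losses
    # With no decisive games there is no evidence either way.
    return 0.5 if decisive == 0 else wins / decisive


if __name__ == "__main__":
    # A wins 1, ties 9, loses 0.
    print(mean_value_target(1, 9, 0))                  # close to 0, not +1
    print(score_fraction(1, 9, 0, count_ties=True))    # barely above even
    print(score_fraction(1, 9, 0, count_ties=False))   # looks like total dominance
```

Counting the ties keeps the estimate near even, while dropping them makes a single win look like a perfect record.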

 Post subject: Re: Bots trained for possibility of ties?
Post #6 Posted: Fri Dec 13, 2019 6:32 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
jann wrote:
Bill Spight wrote:
What do you do with ties during training? If you are trying to decide which program is stronger, then ignoring ties is preferable.

I think ties just pull the value net for the given game towards 0, the correct output. I also doubt ignoring ties is preferable in any case (in test matches they should pull the strength difference towards 0 as well; otherwise, if A wins 1 and ties 9, you would think it is much stronger).


If the question is which program is stronger you ignore ties. I learned that in my first research class. :) How much stronger a program is is a different question.

As for what is correct for the value net I couldn't say. :)
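One standard way to "ignore ties" when asking only which program is stronger is the sign test: drop the tied games and check whether the split of decisive games is consistent with a fair coin. Whether this is the test meant here is an assumption, not something the post states; the function name is hypothetical.

```python
from math import comb


def sign_test_p_value(wins_a: int, wins_b: int) -> float:
    """Two-sided sign test on decisive games only (ties already dropped).

    Returns the probability of a split at least this lopsided if both
    programs were equally likely to win each decisive game.
    """
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # no decisive games, no evidence
    k = max(wins_a, wins_b)
    # One-sided tail P(X >= k) for X ~ Binomial(n, 1/2), then doubled.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # 1 win and 9 ties: only one decisive game, so no evidence at all.
    print(sign_test_p_value(1, 0))   # 1.0
    # 9 wins to 1 over ten decisive games is much stronger evidence.
    print(sign_test_p_value(9, 1))   # about 0.021
```

On this reading, the test answers "which is stronger?" and says nothing about "how much stronger?", which matches the distinction drawn in the post.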

 Post subject: Re: Bots trained for possibility of ties?
Post #7 Posted: Fri Dec 13, 2019 6:44 pm 
Lives in gote

Posts: 577
Liked others: 22
Was liked: 36
Rank: Fox Tygem 6d
KGS: emerus
Tygem: emerus
OGS: emerus
A new FineArt model might be trained for ties. It plays on FoxGo with 2 handicap and 0 komi.

 Post subject: Re: Bots trained for possibility of ties?
Post #8 Posted: Fri Dec 13, 2019 6:51 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
As I was out today, I wondered about training chess engines, since draws are a large part of high level chess. I think I would use two game matches for self play, with each player playing Black in one game and White in the other. (Since the question is which program is better, we ignore ties of the two game match, where each player wins a game or both games are draws.) I expect that most decisive matches will be won by 1 point, one win and one tie. That illustrates the value of ties in two game matches. You don't have to win both games, you can tie one of them. :)

Whether such two game matches pull the value net in each game towards zero, I couldn't say.
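The two game match idea can be enumerated directly. The sketch below assumes the usual chess scoring (1 for a win, 0.5 for a draw, 0 for a loss) and uses hypothetical names; it only classifies match outcomes and says nothing about how a training pipeline would use them.

```python
from itertools import product

WIN, DRAW, LOSS = 1.0, 0.5, 0.0


def match_outcome(a_game1: float, a_game2: float) -> int:
    """Outcome of a two game match from player A's points in each game.

    Returns +1 if A wins the match, -1 if B wins, 0 if the match is tied.
    """
    total_a = a_game1 + a_game2  # B's total is 2 - total_a
    if total_a > 1.0:
        return 1
    if total_a < 1.0:
        return -1
    return 0


def decisive_matches():
    """All (game1, game2, outcome) combinations where the match is not tied."""
    results = []
    for g1, g2 in product((WIN, DRAW, LOSS), repeat=2):
        outcome = match_outcome(g1, g2)
        if outcome != 0:
            results.append((g1, g2, outcome))
    return results


if __name__ == "__main__":
    for g1, g2, outcome in decisive_matches():
        print(g1, g2, "A wins" if outcome > 0 else "B wins")
```

Of the nine combinations, three are tied matches (a win each, or two draws) and six are decisive; four of those six involve a draw in one game, which is the sense in which a single-game tie has value.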

----

Since emerus brings up Fine Art playing with two stones, I must say I also like the idea of two game matches, switching colors, for training handicap play. :)

 Post subject: Re: Bots trained for possibility of ties?
Post #9 Posted: Sat Dec 14, 2019 4:19 am 
Lives in gote

Posts: 445
Liked others: 0
Was liked: 37
Bill Spight wrote:
I think I would use two game matches for self play, with each player playing Black in one game and White in the other. (Since the question is which program is better

I'm not sure exactly what you mean here. Training and self-play for NN bots serve two purposes: creating targets for policy training (since search results are of higher quality than the raw policy), and creating target data for value training (the eventual game outcome is a bit of information about the prospects of the positions that occurred in that particular game). How would twin matches fit here (other than as two individual matches)?

 Post subject: Re: Bots trained for possibility of ties?
Post #10 Posted: Sat Dec 14, 2019 5:39 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
jann wrote:
Bill Spight wrote:
I think I would use two game matches for self play, with each player playing Black in one game and White in the other. (Since the question is which program is better

I'm not sure exactly what you mean here. Training and self-play for NN bots serve two purposes: creating targets for policy training (since search results are of higher quality than the raw policy), and creating target data for value training (the eventual game outcome is a bit of information about the prospects of the positions that occurred in that particular game). How would twin matches fit here (other than as two individual matches)?


Well, first, I am asking a different question: which bot is better? To answer that question, ties do not matter, since they give no information about which one is better. Of course, self-play is a slight misnomer, since the bots will differ to some extent. The two-game match does make a tie in a single game desirable to some extent, as a bot can win the match with a tie and a win.

Speaking in general, you want to reinforce correct decisions. In fact, you want to reinforce better decisions, even if they are not correct, or not known to be correct. Winning a two game match is evidence of making better decisions, even when those decisions result in a tie in one of the games. Thus, a two game match can reinforce decisions that a single game would not.

Now, it is possible to reinforce decisions that do not lead to a win. For instance, in SOAR subgoals are created and decisions that lead to reaching a subgoal are reinforced. In playing go, a subgoal might be to read a ladder out to resolution. Reading the ladder out may not win a game, but the decisions made to read it out correctly may still be reinforced. Or a goal may be to predict the result of the game. Even if the game is lost, decisions that led to a correct prediction may still be reinforced. Another, related goal may be to predict the result of the two game match. This decision may be made with or without the knowledge of the result of the other game. How these decisions are reinforced is a matter of implementation. :)

Oh, I meant to mention. In a two game match it is possible to reinforce decisions that win one of the games, as well as the two game match. Whether that's a good idea or not is an empirical question. For instance, if the bots have a long series of two game matches where the first player wins, resulting in tied matches, do we really want to reinforce the decisions by the first player which led to those wins? My guess is that it may be better to reinforce decisions by the second player that lead to a tie in a single game. :)

 Post subject: Re: Bots trained for possibility of ties?
Post #11 Posted: Sat Dec 14, 2019 6:06 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
One thing I have suggested for go is a two game match decided on total points. If the games are played without knowledge of the result of the other game, then we can reinforce the decisions that led to the greater win in one game, and the lesser loss in the other. Now, with KataGo we can set the komi, right? In that case say that Black wins a three stone game by 39 points sans komi. For the second game we could set the komi to 39 and reinforce decisions on that basis. Based upon empirical results, we could come up with a komi for the first game, as well. :)
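One reading of the komi suggestion is that setting the second game's komi to the first game's margin turns "who wins game two" into "who wins on total points". The sketch below checks that equivalence under that reading; the function names are hypothetical, margins are Black's margin before komi, and nothing here touches KataGo itself.

```python
def total_points_winner(a_margin_as_black: float,
                        b_margin_as_black: float) -> int:
    """Two game match on total points: each player takes Black once.

    Returns +1 if A did better as Black than B did, -1 if worse, 0 if equal.
    """
    diff = a_margin_as_black - b_margin_as_black
    return (diff > 0) - (diff < 0)


def komi_game_winner(b_margin_as_black: float, komi: float) -> int:
    """Second game alone: B is Black, A is White and receives the komi.

    Returns +1 if A (White) wins after komi, -1 if B wins, 0 for a tie.
    """
    adjusted = b_margin_as_black - komi
    return (adjusted < 0) - (adjusted > 0)


if __name__ == "__main__":
    first_game_margin = 39  # A, as Black with three stones, wins by 39 sans komi
    for b_margin in (30, 39, 45):
        by_total = total_points_winner(first_game_margin, b_margin)
        by_komi = komi_game_winner(b_margin, komi=first_game_margin)
        print(b_margin, by_total, by_komi)  # the two columns agree
```

With an integer komi such as 39, an exact tie is possible in the second game, which is where the thread's original question about ties comes back in.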

 Post subject: Re: Bots trained for possibility of ties?
Post #12 Posted: Sat Dec 14, 2019 9:41 am 
Lives in sente

Posts: 757
Liked others: 114
Was liked: 916
Rank: maybe 2d
And what makes you think that KataGo is not already doing some or all of these things? :razz:


This post by lightvector was liked by: Bill Spight
 Post subject: Re: Bots trained for possibility of ties?
Post #13 Posted: Sat Dec 14, 2019 10:19 am 
Lives with ko

Posts: 237
Location: Pasadena, USA
Liked others: 79
Was liked: 12
Rank: OGS 9 kyu
OGS: Maharani
lightvector wrote:
And what makes you think that KataGo is not already doing some or all of these things? :razz:


Please do elaborate... :3

 Post subject: Re: Bots trained for possibility of ties?
Post #14 Posted: Sun Dec 15, 2019 8:15 am 
Gosei

Posts: 1590
Liked others: 886
Was liked: 528
Rank: AGA 3k Fox 3d
GD Posts: 61
KGS: dfan
Bill Spight wrote:
One thing I have suggested for go is a two game match decided on total points. If the games are played without knowledge of the result of the other game, then we can reinforce the decisions that led to the greater win in one game, and the lesser loss in the other.

Yes. In fact, you can play O(n^2) "virtual two-game matches" with only n games if the games are played without knowledge of the other game's result; pretend games 1 and 2 are a match, games 1 and 3, games 1 and 4, etc. The "value" of a game result ends up being what fraction of the other game results it is superior to, which for those with a probability background is known as a cumulative distribution function. I have a paper about this that should go up on arXiv.org this coming week. (It learned well, but I only tried it on a simple game.)
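The virtual-match idea can be sketched as an empirical CDF over game results. This is an illustration of the description above, not the paper's implementation: the function name is hypothetical, results are assumed to be score margins from one side's perspective, and giving equal results half credit is an added assumption.

```python
from typing import List


def virtual_match_values(scores: List[float]) -> List[float]:
    """Value of each game as the fraction of the other games it beats.

    Pairs every game with every other game as a virtual two-game match,
    so n games yield n * (n - 1) ordered comparisons. Equal results get
    half credit (an assumption made for this sketch).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two games to form a virtual match")
    values = []
    for i, s in enumerate(scores):
        better = sum(1 for j, t in enumerate(scores) if j != i and s > t)
        equal = sum(1 for j, t in enumerate(scores) if j != i and s == t)
        values.append((better + 0.5 * equal) / (n - 1))
    return values


if __name__ == "__main__":
    # Four self-play games, margins from Black's point of view.
    margins = [3, -1, 3, 10]
    for margin, value in zip(margins, virtual_match_values(margins)):
        print(margin, value)
```

The double loop makes the O(n^2) pairing explicit; sorting the scores once would give the same values in O(n log n), which is one way to see why the CDF form is cheaper than playing the matches out.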


This post by dfan was liked by: Bill Spight
 Post subject: Re: Bots trained for possibility of ties?
Post #15 Posted: Sun Dec 15, 2019 8:58 am 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
dfan wrote:
Bill Spight wrote:
One thing I have suggested for go is a two game match decided on total points. If the games are played without knowledge of the result of the other game, then we can reinforce the decisions that led to the greater win in one game, and the lesser loss in the other.
Yes. In fact, you can play O(n^2) "virtual two-game matches" with only n games if the games are played without knowledge of the other game's result; pretend games 1 and 2 are a match, games 1 and 3, games 1 and 4, etc. The "value" of a game result ends up being what fraction of the other game results it is superior to, which for those with a probability background is known as a cumulative distribution function. I have a paper about this that should go up on arXiv.org this coming week. (It learned well, but I only tried it on a simple game.)


However, to satisfy the requirement of switching sides, it should be the odd numbered games vs. the even numbered games, assuming that's how you number them. :) Then you get some number of virtual matches less than N*N/4, since you eliminate virtual ties when deciding which player is better.

 Post subject: Re: Bots trained for possibility of ties?
Post #16 Posted: Sun Dec 15, 2019 11:26 am 
Gosei

Posts: 1590
Liked others: 886
Was liked: 528
Rank: AGA 3k Fox 3d
GD Posts: 61
KGS: dfan
Bill Spight wrote:
dfan wrote:
Bill Spight wrote:
One thing I have suggested for go is a two game match decided on total points. If the games are played without knowledge of the result of the other game, then we can reinforce the decisions that led to the greater win in one game, and the lesser loss in the other.
Yes. In fact, you can play O(n^2) "virtual two-game matches" with only n games if the games are played without knowledge of the other game's result; pretend games 1 and 2 are a match, games 1 and 3, games 1 and 4, etc. The "value" of a game result ends up being what fraction of the other game results it is superior to, which for those with a probability background is known as a cumulative distribution function. I have a paper about this that should go up on arXiv.org this coming week. (It learned well, but I only tried it on a simple game.)

However, to satisfy the requirement of switching sides, it should be the odd numbered games vs. the even numbered games, assuming that's how you number them. :) Then you get some number of virtual matches less than N*N/4, since you eliminate virtual ties when deciding which player is better.

In my setup, the games are all training games, where both players are the same bot, so the games really are all comparable. (For example, I can pretend I was White in game n and Black in game m, which really means that I "win" if my performance as White in game n is better than a clone's performance as White in game m, and this can be done for any value of m ≠ n.)

 Post subject: Re: Bots trained for possibility of ties?
Post #17 Posted: Sun Dec 15, 2019 12:52 pm 
Honinbo

Posts: 10905
Liked others: 3651
Was liked: 3374
Bill Spight wrote:
dfan wrote:
Bill Spight wrote:
One thing I have suggested for go is a two game match decided on total points. If the games are played without knowledge of the result of the other game, then we can reinforce the decisions that led to the greater win in one game, and the lesser loss in the other.
Yes. In fact, you can play O(n^2) "virtual two-game matches" with only n games if the games are played without knowledge of the other game's result; pretend games 1 and 2 are a match, games 1 and 3, games 1 and 4, etc. The "value" of a game result ends up being what fraction of the other game results it is superior to, which for those with a probability background is known as a cumulative distribution function. I have a paper about this that should go up on arXiv.org this coming week. (It learned well, but I only tried it on a simple game.)

However, to satisfy the requirement of switching sides, it should be the odd numbered games vs. the even numbered games, assuming that's how you number them. :) Then you get some number of virtual matches less than N*N/4, since you eliminate virtual ties when deciding which player is better.

dfan wrote:
In my setup, the games are all training games, where both players are the same bot, so the games really are all comparable. (For example, I can pretend I was White in game n and Black in game m, which really means that I "win" if my performance as White in game n is better than a clone's performance as White in game m, and this can be done for any value of m ≠ n.)


I have some questions, which your paper may answer. Depends on your audience, I suppose. :) Anyway, rank statistics are nice for a Bayesian approach.

 Post subject: Re: Bots trained for possibility of ties?
Post #18 Posted: Mon Dec 16, 2019 6:38 pm 
Gosei

Posts: 1590
Liked others: 886
Was liked: 528
Rank: AGA 3k Fox 3d
GD Posts: 61
KGS: dfan
dfan wrote:
Bill Spight wrote:
One thing I have suggested for go is a two game match decided on total points. If the games are played without knowledge of the result of the other game, then we can reinforce the decisions that led to the greater win in one game, and the lesser loss in the other.
Yes. In fact, you can play O(n^2) "virtual two-game matches" with only n games if the games are played without knowledge of the other game's result; pretend games 1 and 2 are a match, games 1 and 3, games 1 and 4, etc. The "value" of a game result ends up being what fraction of the other game results it is superior to, which for those with a probability background is known as a cumulative distribution function. I have a paper about this that should go up on arXiv.org this coming week. (It learned well, but I only tried it on a simple game.)

Here it is: "Self-Play Learning Without a Reward Metric". The presentation in the paper starts with CDF-based rewards and then derives the virtual-match approach from it, but in fact the history of the idea is the other way around; we started with two-game matches as you describe (independent evolution; if you mentioned it here I didn't see it) and then realized that CDF-based rewards modeled the same thing in the end and converged much more quickly.


This post by dfan was liked by: Bill Spight