Developing weak AIs in KaTrain

Limeztone · **#41**

Bill Spight wrote:

Limeztone wrote:

Bill Spight wrote:

Most life and death problems are not as big as capturing 25 stones vs. letting them escape.

I think you are missing the point!

Perhaps you are missing my point, I dunno.

Quote:

This is not playing and it's not choosing between two different moves.

It's a comparison between the size of two different moves, right?

No it is definitely not!

(it has nothing to do with the size of a any move)

And thats where you are missing my point.

PS Yeah it should be means (sorry for my bad english)

Bill Spight · **#42**

Bill Spight wrote:

Limeztone wrote:

Bill Spight wrote:

Most life and death problems are not as big as capturing 25 stones vs. letting them escape.

I think you are missing the point!

Perhaps you are missing my point, I dunno.

Quote:

This is not playing and it's not choosing between two different moves.

It's a comparison between the size of two different moves, right?

Limeztone wrote:

No it is definitely not!

(it has nothing to do with the size of a any move)

And thats where you are missing my point.

PS Yeah it should be means (sorry for my bad english)

Well, it is a comparison, right? What are you comparing?

Why 25 stones and not 15 stones or 10 stones?

Limeztone · **#43**

Bill Spight wrote:

Well, it is a comparison, right? What are you comparing?

Why 25 stones and not 15 stones or 10 stones?

It was an arbitrary choice.
I could have typed 15 or 10 it does not matter.
It was just a way I hoped to clarify the usage, obviously I failed :-)

And NO there is not really a comparison (at least not in playing value)!!

The net will ONLY tell how probable it is that a human would make that move, nothing else!
That's what it will be trained for. Like a net that is teached recognise a cat in a picture.
It's not a tricky thing to do.

PS If one would care to teach such a net I am quite sure it will be very helpful in making the weakening more humanlike.
PS2 The only reason to use a small net (like 6 block or 10 block) is to reduce memory usage, the running time will have no impact.

Bill Spight · **#44**

So, based upon what you are saying we can edit the following statement, right?

Limeztone wrote:

It should easily be able to distinct between a 25 stone atari and a slightly more difficult life and death problem.

How's this?

Quote:

The smaller net should easily be able to distinguish between some atari and some life and death problem.

I actually think that you meant to say more than that, however.

Limeztone · **#45**

Bill Spight wrote:

So, based upon what you are saying we can edit the following statement, right?

Limeztone wrote:

It should easily be able to distinct between a 25 stone atari and a slightly more difficult life and death problem.

How's this?

Quote:

The smaller net should easily be able to distinguish between some atari and some life and death problem.

I actually think that you meant to say more than that, however.

I give up!

I think you are trolling me! (If not reread my posts)

Limeztone · **#46**

Sanderl, I hope you understand what I am proposing.

Train a small net (as big as it takes to get a reasonable result) on human games at all levels.
Only train the net on human games (no self play at all).
A good starting point could be games played by players around 9 kyu.
(one could experiment on what level of games one should include vs not include)
But don't train the net to play! Train it to tell how likely moves are to be played by an human.
(Earlier versions of AlphaGo (train on pro games) could give this type of information)

Then when weakening KataGo. Don't let it skip moves that the helping net find it likely for a human to make. Ultimate would to train multiple net on different human levels. But I don't think that will be necessary, as a single net would still be very useful.

Obviously the net will find responding to an atari on a big group to be likely for an human to play.
While the net will find playing the correct move in a life and death position will be less likely for the human to play. Specially as the net will be trained on "mostly" "weak" human play.

NOTE: the net will NOT say anything about how big a move is, that will be KataGos job.
The net output will be in the form of: there is a 87% percent chance that a human play there.

Sanderl, Unless you have questions about this, I will post no more.
(Personally, I am convinced that such a helping net would be very useful)

Bill Spight · **#47**

Limeztone wrote:

Bill Spight wrote:

So, based upon what you are saying we can edit the following statement, right?

Limeztone wrote:

It should easily be able to distinct between a 25 stone atari and a slightly more difficult life and death problem.

How's this?

Quote:

The smaller net should easily be able to distinguish between some atari and some life and death problem.

I actually think that you meant to say more than that, however.

I give up!

I think you are trolling me! (If not reread my posts)

I am not trolling you. I got all along what you were saying about the smaller net not playing go but learning to predict human play. What I did not, and still do not understand is what you were trying to say with that specific sentence. But when I asked about it specifically you thought I did not understand you more generally. It has taken some time to finally focus on that one statement.

Now, you may not particularly care, since that statement is not necessary to your main point. That's OK. We can drop the discussion here. But if you do care, we can continue. I really am trying to understand what you are trying to say with it.

Tryss · **#48**

Quote:

I am not trolling you. I got all along what you were saying about the smaller net not playing go but learning to predict human play. What I did not, and still do not understand is what you were trying to say with that specific sentence. But when I asked about it specifically you thought I did not understand you more generally. It has taken some time to finally focus on that one statement.

Simple : the goal is that the program avoid playing stupid/artificial mistakes, but still make big/natural mistakes.

Missing a 25 stones in atari, or the life and dead of some 25pts group is mostly the same for a bot, but one is a mistake you expect a 10k to make, the other isn't.

A move with a high "9-kyu policy" but bad winrate is probably a natural error, the opposite probably isn't

Bill Spight · **#49**

Tryss wrote:

Quote:

I am not trolling you. I got all along what you were saying about the smaller net not playing go but learning to predict human play. What I did not, and still do not understand is what you were trying to say with that specific sentence. But when I asked about it specifically you thought I did not understand you more generally. It has taken some time to finally focus on that one statement.

Simple : the goal is that the program avoid playing stupid/artificial mistakes, but still make big/natural mistakes.

Missing a 25 stones in atari, or the life and dead of some 25pts group is mostly the same for a bot, but one is a mistake you expect a 10k to make, the other isn't.

A move with a high "9-kyu policy" but bad winrate is probably a natural error, the opposite probably isn't

So the ability of the small network to distinguish between these two is not really the issue. It is that it will predict different plays under the two different circumstances, based, not upon its skill at go, but upon what it thinks a presumably weak human would play. And we think that weak humans are more likely to make a mistake in a life and death situation, where there is only one correct play but many possible mistakes, than in a situation where there is only one possible mistake. That makes sense, even for weak bots, not just for weak humans.

Therefore, having seen many more mistakes by the weak human in the life and death situation, the net will predict a human mistake more often that it will in the simple situation. Again, this has nothing to do with the net learning to distinguish between the two situations.

Edit: And, BTW, that assumption is not necessarily the case. Some versions of Crazy Stone, for instance, have been trained on weak human play and yet they do not judge simple situations like the weak humans do, possibly because those situations do not arise often enough to have been learned. You don't have millions of weak human games to learn from.

But the proof is in the pudding. It's worth a try.

Tryss · **#50**

Quote:

It is that it will predict different plays under the two different circumstances, based, not upon its skill at go, but upon what it thinks a presumably weak human would play.

And that's good if your goal is to create an AI that play like a weak human...

Bill Spight · **#51**

Tryss wrote:

Quote:

It is that it will predict different plays under the two different circumstances, based, not upon its skill at go, but upon what it thinks a presumably weak human would play.

And that's good if your goal is to create an AI that play like a weak human...

Sure, but Limeztone's statement was about the ability of the net to distinguish between the two different circumstances, not about it making different predictions under them. That confused the issue.

sanderl · **#52**

Even if you did all the work you propose and trained the net as you propose in a way to be compatible with katago, and it worked I couldn't guarantee it would make it into the program.
The potential instability of running two instances of katago etc might not be worth it given the rather small improvement over the range of AIs already present.

Limeztone · **#53**

sanderl wrote:

Even if you did all the work you propose and trained the net as you propose in a way to be compatible with katago, and it worked I couldn't guarantee it would make it into the program.
The potential instability of running two instances of katago etc might not be worth it given the rather small improvement over the range of AIs already present.

I do see your point. Even though one would not need a full katago/leela but a much simpler beast only using the cpu it could give some issues (memory should not be a problem). The solution that would work would be to have it built into katago as a simple policy net. But I don't think lightvector would se any benefit from that in his rechearch.

Love your work sanderl, even if one is not interested in the weakening. Making KataGo available to people not mastering computers is great. I have helped far to many to install KataGo on their computers. Now I don't have to. I just give them KaTrain :-)

go4thewin · **#54**

playing against the calibrated rank bot, will setting max visits to 2 instead of 1 make a difference in the moves the bot plays/bot strength? I only ask b/c with leela zero i notice when set to -t1 -p1 it uses 2 visits. in katrain, setting to 1 visit gives me an instaneous move, while 2 has a slight delay on my old laptop (pentium). Thanks for the time in answering!

put another way, does the policy network mean 1 visit or 1 playout with a possible 2 visits?

willemvdb42 · **#55**

A bit late perhaps, but I was wondering how the Simple Style AI was trained, and how strong it is. Is it also a special version of Katago? Was it trained differently in some way?

sanderl · **#56**

willemvdb42 wrote:

A bit late perhaps, but I was wondering how the Simple Style AI was trained, and how strong it is. Is it also a special version of Katago? Was it trained differently in some way?

None of the AIs are trained, which would need other networks, increasing download size etc. I also don't have the computational capacity to do this.

Simple style is based on getting the ownership prediction for each candidate move suggested by katago and does the following:

Discard moves with low visits (default <3) or which lose too many points
Define settledness(player) := sum(|ownership|) for all squares with ownership sign = player sign
define negUtility := pointsLost + a isAttachment + b isTenuki - c (settledness(player) + d settledness(opponent))
Play the move with the lowest negUtility

Basically it likes territory being clear, and dislikes miai and dame.

How strong it is depends entirely on your settings. Increasing visits or root noise will both result in more candidates, but also often more certainty about them. This makes it hard to give an indication about rank, but https://online-go.com/game/27299624 suggests it can retain some of its style while still being really strong.

sanderl · **#57**

go4thewin wrote:

playing against the calibrated rank bot, will setting max visits to 2 instead of 1 make a difference in the moves the bot plays/bot strength? I only ask b/c with leela zero i notice when set to -t1 -p1 it uses 2 visits. in katrain, setting to 1 visit gives me an instaneous move, while 2 has a slight delay on my old laptop (pentium). Thanks for the time in answering!

put another way, does the policy network mean 1 visit or 1 playout with a possible 2 visits?

Calibrated rank only uses the policy, so 1 visit is sufficient. I am not sure what you mean by playout other than in the context of older MCTS bots.

hzamir · **#58**

@sanderl

very much enjoying katrain, which I only just discovered today --enough so that I would like to ask guidelines for non-high-end macs that would run it the best. I have 2018 Macbook Pro. Since in any event I may upgrade to some other Mac, either laptop or iMac, do you recommend something that will be particularly good for running Katago?

Developing weak AIs in KaTrain

Who is online