lightvector wrote: So, my thought is to try to make KataGo estimate B instead. And, I could also continue estimating A too, but it would be extra overhead in the search to carry both around, so my inclination is to just not have A once we have B. Unless people think it should keep reporting both? Thoughts?
I think it's worth calculating and exposing both values at least for a little while, because the comparison between them in different sorts of situations could provide some insight. I do agree that B seems more meaningful and useful, though.
Can We Stop Calling Kata "scoreMean" Points?
-
dfan
- Gosei
- Posts: 1598
- Joined: Wed Apr 21, 2010 8:49 am
- Rank: AGA 2k Fox 3d
- GD Posts: 61
- KGS: dfan
- Has thanked: 891 times
- Been thanked: 534 times
- Contact:
Re: Can We Stop Calling Kata "scoreMean" Points?
-
Yakago
- Dies in gote
- Posts: 53
- Joined: Tue Jan 16, 2018 10:39 am
- GD Posts: 0
- Has thanked: 2 times
- Been thanked: 12 times
Re: Can We Stop Calling Kata "scoreMean" Points?
I would say that it's a bit 'bloaty' to have two score estimates, even if it could provide insight in some situations.
I think the 'B' version is to be preferred, and would 'solve' this issue up to the inaccuracy of the network.
I think it should be understandable that the 'points' we see are based on the preferred line of play, and during analysis we would be able to see that the two lines of play differ in winrate and points.
-
TelegraphGo
- Lives with ko
- Posts: 131
- Joined: Sat Oct 05, 2019 12:32 am
- Rank: AGA 4 dan
- GD Posts: 0
- Universal go server handle: telegraphgo
- Has thanked: 1 time
- Been thanked: 18 times
Re: Can We Stop Calling Kata "scoreMean" Points?
Marcel Grünauer wrote:
lightvector wrote: Suppose you play the bot against itself 100 times and you find that on average it loses by 20 points in some position (winning a few games barely, losing most games by a lot). Suppose that 20 points was precisely what the bot had given as its "final score difference estimate" in that position. Great, right?
Suppose you dig further into the example and determine that actually, if the bot had just played move X, it would lose only by about 4 points - the resulting endgame is stable, and although it's not clear how to play it exactly optimally, it's highly clear that it's not going to vary by more than +/- 1 point under any reasonable lines of play. If you had 4 more points, then you'd have 50-50 winning chances playing move X. And the bot also agrees. The *reason* why the bot did not play move X and instead chose Y was that X led to an easy and predictable loss, whereas move Y is a complex and uncertain move that gives some slim winning chances instead of zero, but on average seems to lead to a much bigger loss.
Doesn't that mean that a score estimate should be qualified with a probability?
In the example, it would mean "move Y loses the game by 4 points with 100% certainty" (i.e., winrate 0%) and "move X loses the game by 20 points with 50% certainty" and "move X wins by 1 point ('barely') maybe 5% of the time".
Statistics is not my strong suit so I'm sure my example is flawed, but I hope it conveys what I mean.
If you want an AI's opinion for which move is easy for AI to handle in an AI v. AI match, then you shouldn't be looking at KataGo scores. That's literally exactly the metric that percentages are designed to give. ELF, Leela-Zero, and maybe some other AI are (I believe) a little stronger than KataGo, and thus probably better at giving percentages. You should be keeping in mind that none of these AI can tell us how easy a move is for humans to handle.
The way that AI complicates games is different than the way humans complicate games - AI is much more confident in its ability (and thus its opponent's ability) to invade than the typical human, for example. If you want to learn how to create complications that are hard for humans to deal with while losing slightly, KataGo by itself is probably not the way.
KataGo's purpose is to give useful score estimates. I see no need to dilute that; just let KataGo do KataGo's job well. I'm very excited to see the B-style network, and very impressed that lightvector seems to think it won't be that hard to create.
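To make lightvector's example above concrete, here is a minimal sketch in Python that follows his labelling (X is the stable move, Y the complicated one). The game results are invented numbers chosen only to match the shape of the example (about 4 points lost after X, about 20 on average after Y with a few narrow wins); they are not KataGo output, and the function names are hypothetical.

```python
# Hypothetical results of 100 self-play games after each move, as final score
# margins from the bot's point of view (negative = the bot loses).
# The numbers are invented to match the shape of the example, not measured.
GAMES_AFTER_X = [-5] * 25 + [-4] * 50 + [-3] * 25   # stable endgame, always a small loss
GAMES_AFTER_Y = [1] * 5 + [-15] * 37 + [-25] * 58   # complicated, a few narrow wins


def mean_score(margins):
    """Average final margin over all games."""
    return sum(margins) / len(margins)


def winrate(margins):
    """Fraction of games the bot wins (margin above zero)."""
    return sum(1 for m in margins if m > 0) / len(margins)


def best_move(candidates, key):
    """Name of the candidate whose games maximize the given statistic."""
    return max(candidates, key=lambda name: key(candidates[name]))


candidates = {"X": GAMES_AFTER_X, "Y": GAMES_AFTER_Y}
for name, games in candidates.items():
    print(name, "mean score:", mean_score(games), "winrate:", winrate(games))

print("best by winrate:", best_move(candidates, winrate))
print("best by mean score:", best_move(candidates, mean_score))
```

With these made-up numbers, a winrate-maximizing player picks Y while the average margin is far better after X, which is the gap between "what the bot plays" and "how many points behind it is" that the thread is about.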
- spook
- Lives with ko
- Posts: 151
- Joined: Thu Jul 24, 2014 1:34 pm
- Rank: 2d
- GD Posts: 0
- KGS: LordVader
- Location: Belgium
- Has thanked: 11 times
- Been thanked: 48 times
- Contact:
Re: Can We Stop Calling Kata "scoreMean" Points?
lightvector wrote: I think B is more useful.
I agree.
lightvector wrote: So, my thought is to try to make KataGo estimate B instead. And, I could also continue estimating A too, but it would be extra overhead in the search to carry both around, so my inclination is to just not have A once we have B. Unless people think it should keep reporting both? Thoughts?
Out with the old, in with the new.
xela wrote: What software did you use to make these graphs?
It is a preview of the next ZBaduk release. For brevity (to reduce spam here): https://github.com/lightvector/KataGo/issues/57.
Enjoy LeeLaZero and KataGo from your web browser, without installing anything!
https://www.zbaduk.com
-
lightvector
- Lives in sente
- Posts: 759
- Joined: Sat Jun 19, 2010 10:11 pm
- Rank: maybe 2d
- GD Posts: 0
- Has thanked: 114 times
- Been thanked: 916 times
Re: Can We Stop Calling Kata "scoreMean" Points?
I'm going to keep both internally, since I'm a bit nervous that simply swapping one for the other would break some mathematical principledness in the formulation of winloss utility + score utility. So the old value will continue to be used in the utility computation (utility is the name for what KataGo aims to maximize, which blends winning and score).
But I'm going to outright replace the "scoreMean" value which is what different GUIs are showing to the user. The old value will be hanging around in an extra new field of kata-analyze if some GUI app really really wants to show it.
The computation of the old value is actually also changing nontrivially due to some architectural changes in the neural net's outputs. In the latest test run of KataGo, I actually found the value to *underestimate* differences, rather than overestimate them! (Which I guess supports the point that this value is not very stable between different versions.)
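As a rough illustration of what "blends winning and score" could look like, here is a minimal sketch in Python. The function name, the weights, the scale and the tanh squashing are all assumptions made up for this example; the post does not give KataGo's actual formula, and this should not be read as one.

```python
import math


def utility(winrate, score_mean, win_loss_weight=1.0, score_weight=0.1, score_scale=20.0):
    """Blend winning chances and expected score into one number to maximize.

    The weights, the scale and the tanh squashing are illustrative
    assumptions, not KataGo's real parameters or formula.
    """
    # Winning term: maps winrate 0..1 onto -1..+1.
    win_loss_utility = win_loss_weight * (2.0 * winrate - 1.0)
    # Score term: saturates, so huge margins cannot outweigh winning.
    score_utility = score_weight * math.tanh(score_mean / score_scale)
    return win_loss_utility + score_utility


# A sure 4-point loss versus a 5% chance to win with a 20-point average loss.
print("stable move: ", utility(winrate=0.0, score_mean=-4.0))
print("complex move:", utility(winrate=0.05, score_mean=-20.0))
```

With these assumed weights the winning term dominates, so the complicated move scores higher despite its worse average margin; a larger `score_weight` would tip the choice the other way, which is why the score value fed into the blend matters.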
-
Gomoto
- Gosei
- Posts: 1733
- Joined: Sun Nov 06, 2016 6:56 am
- GD Posts: 0
- Location: Earth
- Has thanked: 621 times
- Been thanked: 310 times
Re: Can We Stop Calling Kata "scoreMean" Points?
lightvector, it is great that we have you around in this forum and that you give us some views on the inside of your work.
- spook
- Lives with ko
- Posts: 151
- Joined: Thu Jul 24, 2014 1:34 pm
- Rank: 2d
- GD Posts: 0
- KGS: LordVader
- Location: Belgium
- Has thanked: 11 times
- Been thanked: 48 times
- Contact:
Re: Can We Stop Calling Kata "scoreMean" Points?
lightvector wrote: But I'm going to outright replace the "scoreMean" value which is what different GUIs are showing to the user.
Does it also have an indirect influence on the calculation of the stddev field?