Life In 19x19 http://lifein19x19.com/ |
|
KataGo stares at the empty board for a long time http://lifein19x19.com/viewtopic.php?f=18&t=17270 |
Page 1 of 1 |
Author: | xela [ Mon Feb 17, 2020 5:18 am ] |
Post subject: | KataGo stares at the empty board for a long time |
Just to humour Maharani, I left my computer on while I was out for a few hours. Specifically, I gave it 90 minutes to stare at an empty board with 7.5 komi, and another 90 minutes with 7 points (whole number) komi. Here are the log files. 7.5 komi 7 points komi I'm not exactly sure what all the numbers mean, but there are some hints in this github issue. For something like: Code: Q16 : T -14.38c W -14.09c S -0.29c ( -1.2 L -0.8) LCB -14.39c P 5.74% WF 0.45% PSV 1820124 N 1820124 I think T=-14.38c is "total utility: 14.38 cents favouring white", and I'm guessing that cents run from -100 to +100, so to convert it to a winrate it's 50-(14.38/2) = 42.81% for black. And either -0.3 or -0.2 would be the score median, I'm not sure which one. N=1820124 would be the number of playouts for this move, and P=5.74% is the policy network value. So 3 million playouts isn't enough for KataGo to stop preferring 4-4 points as the first move. It's interesting to see that the policy values are actually higher for 3-4 than for 4-4, which must mean that 4-4 gets preferred entirely on the basis of the winrates being higher. |
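The cents-to-winrate arithmetic in the post above can be written out as a small sketch. It assumes, as the post guesses, that utilities are reported in "cents" running from -100 to +100 and that negative values favour White; the function name is made up for illustration and is not part of KataGo.

```python
def utility_cents_to_winrate(cents: float) -> float:
    """Map a utility in 'cents' (assumed range -100..+100) to a
    winrate percentage (0..100) for the player the sign refers to."""
    return 50.0 + cents / 2.0


# T = -14.38c from the Q16 line quoted in the post.
black_winrate = utility_cents_to_winrate(-14.38)
print(f"Black winrate from T: {black_winrate:.2f}%")
```

Note that T is the total utility, which also includes the score utility S, so feeding W (the winrate utility) into the same function is arguably the closer match to a pure winrate.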
Author: | Jæja [ Mon Feb 17, 2020 6:32 am ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
TLDR: It's an interesting experiment to allow for many playouts, but improvements to the network could theoretically lead to radically different results and we'll never know for sure. Perhaps KataGo will start preferring another opening move when it has played enough games against itself and further improved the (value and policy) neural networks? If I understand correctly, the playouts are generating moves and their win-rates by alternating between a policy network, which generates points of interest, and a value network, which estimates the value of the board (the win rate). Both these networks could theoretically change a lot if the training procedure ever gets KataGo out of a possible local optimum. A local optimum is a place in the search space of network configurations where all solutions in the immediate surroundings are worse than the current one, so the search keeps preferring solutions close to the current one. Think of it as standing on a hilltop, but because of fog, you're unable to see whether there are even higher hilltops around you. It could happen that the policy network will prefer different points for opening, thus selecting different targets for deep analysis. Also, the value network could be updated in such a way that different patterns and therefore different openings are preferred. It's actually impossible to know whether a different, more optimal solution can be achieved. What's more, it's impossible to know whether we're currently in a global optimum. All we can do is keep improving and see what happens. However, it could be that we're searching around Mont Blanc and we're unable to see Mount Everest. |
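The hilltop-in-fog picture can be illustrated with a toy greedy hill climb. This is only a sketch of the general idea of a local optimum on a made-up one-dimensional landscape; it says nothing about how KataGo's training actually works, and all names and numbers are invented for the example.

```python
def hill_climb(heights, start):
    """Greedy ascent: step to the higher neighbour until no neighbour is higher."""
    pos = start
    while True:
        # Only the immediate neighbours are visible (the "fog").
        neighbours = [i for i in (pos - 1, pos + 1) if 0 <= i < len(heights)]
        if not neighbours:
            return pos
        best = max(neighbours, key=lambda i: heights[i])
        if heights[best] <= heights[pos]:
            return pos  # no neighbour is higher: a (possibly only local) optimum
        pos = best


# Toy landscape: a small hill peaking at index 2, a higher one peaking at index 7.
landscape = [1, 3, 5, 4, 2, 6, 9, 12, 8]

print(hill_climb(landscape, start=0))  # stops on the small hill
print(hill_climb(landscape, start=4))  # reaches the higher hill
```

Starting on the left, the climber stops on the lower peak and has no way of telling that a higher one exists a few steps away.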
Author: | lightvector [ Mon Feb 17, 2020 8:05 am ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
You guessed right with respect to the output. Code: <move> : T <total utility> W <winrate utility from -1 to +1> S <score utility from -something to +something> ( <selfplay score estimate> L <lead estimate>) LCB <lcb> P <policy> WF <slight weighting factor for move> PSV <value used to select move> N <visit count>
|
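For anyone who wants to pull these fields out of a log, here is a minimal parsing sketch. The regular expression is fitted only to the single Q16 line quoted earlier in the thread, so it may well need adjusting for other log lines; the dictionary keys are labels chosen here for readability, not names used by KataGo.

```python
import re

# Pattern fitted to the one sample line in this thread; other lines may differ.
LINE_RE = re.compile(
    r"(?P<move>\S+)\s*:\s*"
    r"T\s*(?P<total_utility>-?[\d.]+)c\s+"
    r"W\s*(?P<winrate_utility>-?[\d.]+)c\s+"
    r"S\s*(?P<score_utility>-?[\d.]+)c\s+"
    r"\(\s*(?P<selfplay_score>-?[\d.]+)\s+L\s*(?P<lead>-?[\d.]+)\)\s+"
    r"LCB\s*(?P<lcb>-?[\d.]+)c\s+"
    r"P\s*(?P<policy>[\d.]+)%\s+"
    r"WF\s*(?P<weight_factor>[\d.]+)%?\s+"
    r"PSV\s*(?P<psv>\d+)\s+"
    r"N\s*(?P<visits>\d+)"
)


def parse_move_line(line):
    """Return the fields of one per-move log line as a dict."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    fields = match.groupdict()
    parsed = {"move": fields.pop("move")}
    for key in ("psv", "visits"):
        parsed[key] = int(fields.pop(key))
    for key, value in fields.items():
        parsed[key] = float(value)
    return parsed


sample = ("Q16 : T -14.38c W -14.09c S -0.29c ( -1.2 L -0.8) "
          "LCB -14.39c P 5.74% WF 0.45% PSV 1820124 N 1820124")
parsed = parse_move_line(sample)
print(parsed["move"], parsed["lead"], parsed["visits"])
```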
Author: | xela [ Mon Feb 17, 2020 6:26 pm ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
Thanks! What's the difference between score estimate and lead estimate? If I had to take another guess :-) I'd guess that the first one is mean and the second is median? PSV looks similar but not identical to visit count (sometimes but not always the same number). Is it some combination of visit count and utility? |
Author: | lightvector [ Mon Feb 17, 2020 6:43 pm ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
It's the same thing that I announced with the 1.3 release of KataGo. ScoreSelfPlay is the estimated average score that would result from self-play, with self-play itself being affected by Kata not being entirely score-maximizing, sometimes taking risks or playing safe and giving up bits of points doing so. Lead is the estimated average score adjustment that would be needed to make the winrate 50%. It's not the median of anything directly; it's trained to be "how much do I need to adjust the komi to make myself (older versions of myself that generated my training data, plus a little bit of search) say 50%?" Since 1.3, this is the number that gets shown in GUIs. |
Author: | Schachus [ Tue Feb 18, 2020 12:16 am ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
One question about the lead: what does it mean, if it is a fractional number? Is it linearly interpolated between .5 of a whole number? |
Author: | lightvector [ Tue Feb 18, 2020 7:12 am ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
Yeah, sorta. There's some fiddly stuff that happens due to discreteness, particularly with Chinese rules. I'm going off memory, so some of the below is perhaps not *quite* correct, since implementation was tricky. Imagine you're playing black and that the neural net's best guess is that you're 40% to win on the board by 9 points, and 60% to win on the board by 7 points, and no other outcome is possible with bot-quality play. What is the "fair" komi? Well, a komi of 7 would mean you get 70% equity (40% win + 60% draw), so that's not fair. A komi of 8 would mean you get 40% equity (40% win + 60% loss). So that's not fair either. The most fair blend would be if komi were 7 a third of the time, and komi were 8 two-thirds of the time, giving you 50% equity. So the neural net will be trained to try to say something like 7.666666... as fair in this case. In that case, if you're playing 7.5 komi, the neural net might say the lead is +0.16666666... Yes, that's a little weird, given that at 7.5 komi it would be saying the lead is +0.1666666, and your winrate would only be 40%. If komi were 8 or 8.5, your winrate would stay at 40%, but it would now be saying -0.33333333 or -0.833333333. This weirdness goes away in Japanese rules, since only in Chinese rules do you tend to have discreteness in chunks of 2 points. Also, the neural net itself is a little noisy, so take all the above and add noise. And then MCTS will average across all these fractional values too, the same way it averages across winrates (since averaging is much cheaper than medianing), so there's a little bit of average-like behavior going on from the search itself. So it's a bit messy. For most of the game, things are all smooth enough that you can interpret the number intuitively. Like, +0.7 means KataGo's opinion is that it's leading on average by 0.7 points, and if it goes down to +0.6, then KataGo's opinion changed downward by 0.1 point. 
That doesn't mean its opinion is correct, or that its opinion changed for a good reason - as Bill would say, the "margin of error" is certainly more than 0.1 points - and if the misunderstanding is major, such as a group on the board that it doesn't realize the status of due to a blind spot, it could be off by a ton more. But it would be accurate to think of its *opinion* as having changed slightly, even if just due to noise. Near the end of the game though, the tiny fractional differences are going to be basically averaging out the discreteness in game results and possible komi values, given KataGo's remaining uncertainty about the position. |
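The fair-komi arithmetic in the post above can be reproduced with a short sketch. It only replays the worked example (40% to win by 9, 60% to win by 7); the network is trained towards such a value rather than computing it this way, and the post itself warns that the details are from memory. The function names and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction


def equity(outcomes, komi):
    """Black's equity at a given komi: a win counts 1, a draw 1/2, a loss 0.
    `outcomes` maps Black's board margin (int) to its probability."""
    total = Fraction(0)
    for margin, prob in outcomes.items():
        if margin > komi:
            total += prob
        elif margin == komi:
            total += prob / 2
    return total


def fair_komi(outcomes):
    """Blend two adjacent whole-number komi values so that equity is exactly 50%."""
    half = Fraction(1, 2)
    for k in range(min(outcomes) - 1, max(outcomes) + 1):
        e_lo, e_hi = equity(outcomes, k), equity(outcomes, k + 1)
        if e_lo >= half >= e_hi:
            if e_lo == e_hi:
                return Fraction(2 * k + 1, 2)  # both already fair: take the midpoint
            weight_lo = (half - e_hi) / (e_lo - e_hi)  # share of the time komi is k
            return weight_lo * k + (1 - weight_lo) * (k + 1)
    raise ValueError("no fair komi found")


outcomes = {9: Fraction(2, 5), 7: Fraction(3, 5)}
fair = fair_komi(outcomes)
print("fair komi:", float(fair))
for komi in (Fraction(15, 2), Fraction(8), Fraction(17, 2)):
    print("komi", float(komi), "lead", float(fair - komi),
          "winrate", float(equity(outcomes, komi)))
```

Exact fractions are used so that the thirds and sixths in the example come out without floating-point noise.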
Author: | YeGO [ Tue Feb 18, 2020 7:47 am ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
lightvector wrote: Well, a komi of 7 would mean you get 70% equity (40% win + 60% draw), so that's not fair. A komi of 8 would mean you get 40% equity (40% win + 60% loss). So that's not fair either. The most fair blend would be if komi were 7 a third of the time, and komi were 8 two-thirds of the time, giving you 50% equity. I am a bit confused by this part. What does "equity" mean here? How is it computed? Also, does the estimated win rate just mean black's chance to win or the chance that either player may win? Couldn't the bot also estimate that both black and white have some chance of winning? |
Author: | lightvector [ Tue Feb 18, 2020 8:00 am ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
If you define a draw to be as good as 50% win and 50% of a loss, then 40% chance to win and 60% chance to draw: 40% + 0.5 * 60% = 70%. |
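As a one-function sketch of that definition (the function name is invented here, and a draw is valued at half a win, as stated in the post):

```python
def equity(p_win: float, p_draw: float) -> float:
    """Equity when a draw is worth half a win; losses contribute nothing."""
    return p_win + 0.5 * p_draw


komi_7 = equity(p_win=0.40, p_draw=0.60)  # komi 7: 40% win, 60% draw
komi_8 = equity(p_win=0.40, p_draw=0.00)  # komi 8: 40% win, 60% loss
print(komi_7, komi_8)
```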
Author: | Maharani [ Sun Feb 23, 2020 1:05 pm ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
I've finally managed to get to enough playouts (1.2 million) within sixty minutes on ZBaduk to "fill in" this picture. Every point on the third or fourth line of the board received at least eleven playouts. Only once this had happened did KataGo first consider points (i.e. give them more than ten playouts) that didn't have a third- or fourth-line coordinate (namely, the 5-5 points). https://i.ibb.co/61C9cD0/Screen-Shot-20 ... -21-AM.png |
Author: | Bill Spight [ Sun Feb 23, 2020 1:47 pm ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
Maharani wrote: I've finally managed to get to enough playouts (1.2 million) within sixty minutes on ZBaduk to "fill in" this picture. Every point on the third or fourth line of the board received at least eleven playouts. Only once this had happened did KataGo first consider points (i.e. give them more than ten playouts) that didn't have a third- or fourth-line coordinate (namely, the 5-5 points). https://i.ibb.co/61C9cD0/Screen-Shot-20 ... -21-AM.png Thanks. Hmmmm. Should we consider the 3-4 pt. to have gotten 606k rollouts, and the 4-4 pt. to have gotten 567k rollouts? |
Author: | xela [ Mon Feb 24, 2020 12:53 am ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
Bill Spight wrote: Should we consider the 3-4 pt. to have gotten 606k rollouts, and the 4-4 pt. to have gotten 567k rollouts? Sorry, no. The rollouts in each corner are mostly duplicating the same information. |
Author: | xela [ Mon Feb 24, 2020 12:58 am ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
Maharani wrote: I've finally managed to get to enough playouts (1.2 million) within sixty minutes on ZBaduk to "fill in" this picture. Every point on the third or fourth line of the board received at least eleven playouts. Only once this had happened did KataGo first consider points (i.e. give them more than ten playouts) that didn't have a third- or fourth-line coordinate (namely, the 5-5 points). For this sort of thing, it's good to look at the network policy values, to give you a handle on how long it's going to take. Unfortunately I'm not sure how you do that with ZBaduk+KataGo. Room for future enhancements? With Lizzie, the "show policy" button will show you policy values, but only for moves that have already got at least one playout. For LZ, you can see policy values for unexplored moves by running LZ from the command line and using the "heatmap" command. It would be nice to see this integrated into Lizzie some time. I don't think KataGo yet has an equivalent to LZ's heatmap. |
Author: | lightvector [ Mon Feb 24, 2020 4:54 am ] |
Post subject: | Re: KataGo stares at the empty board for a long time |
KataGo has it! See "kata-raw-nn" command documented at: https://github.com/lightvector/KataGo/b ... ensions.md Only in master branch for now (so needs custom compile), not part of a release yet. Will get included in next release, of course. |
All times are UTC - 8 hours [ DST ] |