It is currently Thu Jan 20, 2022 4:02 pm

All times are UTC - 8 hours [ DST ]




Post new topic Reply to topic  [ 11 posts ] 
Author Message
Offline
 Post subject: Quality of random in MCTS ?
Post #1 Posted: Mon Dec 27, 2021 6:19 am 
Lives in gote

Posts: 493
Liked others: 115
Was liked: 88
Rank: OGS ddk
KGS: Ferran
IGS: Ferran
OGS: Ferran
Question,

how good does the random generator have to be in a MCTS program? Specially one attached to a Neural Network? Does anyone have an idea?

Thanks. Take care.

_________________
玄 之 玄

Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #2 Posted: Mon Dec 27, 2021 8:05 am 
Lives in gote

Posts: 693
Liked others: 111
Was liked: 848
Rank: maybe 2d
Not sure what you mean. Typically modern forms of MCTS that uses a neural network for policy and value are fundamentally deterministic. So they don't need a random number generator at all.

They get some nondeterminism from multithreading and GPUs, because threads don't always run in the same order and GPUs sometimes vary in their results depending on batching, but that's not core to the algorithm and doesn't involve any built-in random generators, that's just a quirk of computer architecture.

And you can explicitly add in randomness if you want, separate from the algorithm. For example at the very end, entirely separately from search, if you wish to randomize between moves that are evaluated as nearly equal so that it doesn't always play the same game. Or if you want to randomize the orientation of a neural net or whatever when you evaluate a node within the search. But none of that is important or necessary at all for the algorithm, which itself has no randomness - these are just extra things you can do on the side depending on what you want.

MCTS used to be based on monte-carlo rollouts but those got replaced with neural nets, and DeepMind unfortunately didn't consider to change the name of the algorithm once they did that replacement, and it's too-widely published now to change the name, so we're stuck with "monte-carlo" being part of the name of an essentially deterministic algorithm.

The heart of MCTS is to recursively treat each node as a multi-armed bandit optimization (https://en.wikipedia.org/wiki/Multi-armed_bandit), and we now understand that whether it uses any randomness has nothing to do with the core of the algorithm, it entirely depends on what addons you combine that core with (and whether those add-ons themselves happen to have any randomness). The bandit-based part has been enduring though, so way back in 2006-2007, if the original authors had been more prescient, they might have called it "Bandit-based tree search" rather than "Monte-carlo tree search".


This post by lightvector was liked by: ez4u
Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #3 Posted: Mon Dec 27, 2021 11:49 am 
Lives in gote

Posts: 493
Liked others: 115
Was liked: 88
Rank: OGS ddk
KGS: Ferran
IGS: Ferran
OGS: Ferran
I was under the impression that you still needed to randomly choose which leaves to explore to the end, though. There's some previous selection, sure, but there's a point when you need to pick a smaller subset randomly, isn't there?

Take care.

_________________
玄 之 玄

Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #4 Posted: Mon Dec 27, 2021 4:16 pm 
Lives in gote

Posts: 693
Liked others: 111
Was liked: 848
Rank: maybe 2d
No, there is a direct formula that says exactly which node to explore (given the stats so far). You apply this formula repeatedly this until you hit a node not in the tree yet, add the node, do one neural net eval, and then repeat. So there isn't any random selection.

Note that the stopping condition is hitting a node not in the tree rather than hitting a node that is the end of the game. So ever since neural nets being used for everything, you also don't roll out to the end of the game, you just incrementally read each branch a little further each time.


This post by lightvector was liked by 2 people: Ferran, Harleqin
Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #5 Posted: Tue Dec 28, 2021 2:16 pm 
Lives in sente
User avatar

Posts: 845
Liked others: 338
Was liked: 141
Rank: German 2 dan
Thanks, I was wondering about that, too.

So, what does the »number of playouts« mean now? Number of evaluated nodes? And blind spots mean that some relevant nodes were not given high enough priority? By the policy net?

_________________
A good system naturally covers all corner cases without further effort.

Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #6 Posted: Tue Dec 28, 2021 4:01 pm 
Lives in gote

Posts: 693
Liked others: 111
Was liked: 848
Rank: maybe 2d
Yes, playouts means number of nodes in the tree now - it's the number of times you repeated the process of descending the tree down by following that formula that tells you the "most useful" path to explore next, hit the end of the tree, and added and evaluated one new node.

Blind spot is a phrase that has lost a lot of meaning, because people tend to just call most mistakes by a bot a blind spot now, regardless of what kind of mistake it is. Yes, it used to refer to mistakes due to having extremely low policy prior on a move, rather than value misevaluation or lack of reasonable amount of compute time.

The reason why people stopped using the term correctly is because while it may have been clearer when bots were slightly weaker "why" a particular mistake happens, as bots have gotten better it has gotten slightly harder, and also it seems like fewer people are bothering to understand the cause of any particular mistake. When there is a blind spot, it's often *not* the first move you find that is correct but that the bot doesn't want to play. Usually the bot sees that move fine and it's something deeper, and many iterations of interactivity are needed to find it. For example: https://imgur.com/a/H4D6t7k


This post by lightvector was liked by 4 people: CDavis7M, ez4u, MikeKyle, Uberdude
Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #7 Posted: Tue Dec 28, 2021 5:03 pm 
Judan

Posts: 6719
Location: Cambridge, UK
Liked others: 436
Was liked: 3696
Rank: UK 4 dan
KGS: Uberdude 4d
OGS: Uberdude 7d
Great worked example of finding the blind spot, thanks!

Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #8 Posted: Wed Dec 29, 2021 2:32 am 
Lives in gote

Posts: 493
Liked others: 115
Was liked: 88
Rank: OGS ddk
KGS: Ferran
IGS: Ferran
OGS: Ferran
@lightvector

Thanks. I've been checking a few things on my off time these days, and I think I have a path. And that I need to check the pseudocode for UCB better.

Happy New Year

_________________
玄 之 玄

Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #9 Posted: Wed Dec 29, 2021 6:06 am 
Lives with ko

Posts: 193
Liked others: 15
Was liked: 59
Rank: KGS 1k EGF 2k
KGS: Schachus12
So, I understand correctly, for the first „playout“ you just take the move with the highest policy and evaluate the value-net? How does the second playout work? How is it determined from policy-values and (and maybe the value of that 1 playout?) whether to explore that first branch deeper or whether to go into the second highest policy move?

Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #10 Posted: Wed Dec 29, 2021 11:58 am 
Lives in gote

Posts: 693
Liked others: 111
Was liked: 848
Rank: maybe 2d
If you want the actual formula, look at:
https://ai.stackexchange.com/questions/ ... -root-node
or search online yourself for many other tutorials and explanations. Note that the handling of what Q should be initialized to for a node with 0 playouts so far varies between bots, and there are some other messy formulas that go into that as well. (That post assumes it's initialized to 0, but that's not a given, you can initialize to other things, different bots will tune to whatever parameters maximize their strength).

Top
 Profile  
 
Offline
 Post subject: Re: Quality of random in MCTS ?
Post #11 Posted: Wed Dec 29, 2021 12:19 pm 
Honinbo

Posts: 9301
Liked others: 1563
Was liked: 1616
KGS: Kirby
Tygem: 커비라고해
lightvector wrote:


This example is awesome.

_________________
be immersed

Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 11 posts ] 

All times are UTC - 8 hours [ DST ]


Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group