124. Chew (3k) vs Fuego (Bot)
- perceval
- Lives in gote
- Posts: 312
- Joined: Thu Aug 05, 2010 3:35 am
- Rank: 7K KGS
- GD Posts: 0
- KGS: tictac
- Has thanked: 52 times
- Been thanked: 41 times
Re: 124. Chew (3k) vs Fuego (Bot)
Here is what I understand of Fuego's process, which might be really wrong:
- It uses the UCT algorithm, i.e. it generates random playouts, with a preference for exploring moves that seem to be better.
ez4u asked why the minimum number of playouts seemed to be 23.
My guess is that it is a minimum value needed to get a rough estimate of the winning percentage of each node.
However, 23 playouts gives a huge error bar on the winning ratio (about 10%).
Increasing it to, say, 2500 would take little time (300*2500 = 750000 playouts, much less than the total number played) and would decrease the error bar to 1%.
I would like to know whether changing this value would change the exploration rates and the final move.
It should not, of course, but I wonder how the search tree would be modified.
emeraldemon, if you know how to change this value, would you have time to run the two searches, with 23 and 2500 as the minimum playouts, for the next move to see if there is a difference?
It might be part of the horizon "effect", i.e. the fact that increasing the thinking time does not linearly increase the playing strength, because a good move might be discarded early through bad luck in early exploration (or more complicated effects, such as a move that is good but only along a very narrow tactical path that a few random playouts are unlikely to find).
Would you find this experiment interesting?
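The 10% and 1% error bars can be checked with a quick calculation. This is only a sketch: it assumes the playouts are independent and that the true win rate is near 50%, which is the worst case for the error. The function name is made up for illustration and has nothing to do with Fuego's code.

```python
import math

def win_rate_std_error(playouts: int, win_rate: float = 0.5) -> float:
    """Standard error of a win rate estimated from independent playouts."""
    return math.sqrt(win_rate * (1.0 - win_rate) / playouts)

# Compare the two minimum-playout settings discussed in the post.
for n in (23, 2500):
    print(n, round(win_rate_std_error(n), 3))
# prints:
# 23 0.104
# 2500 0.01
```

So one standard error is roughly 10 percentage points at 23 playouts and 1 point at 2500, under those assumptions.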
In theory, there is no difference between theory and practice. In practice, there is.
- flOvermind
- Lives with ko
- Posts: 295
- Joined: Wed Apr 21, 2010 3:19 am
- Rank: EGF 4 kyu
- GD Posts: 627
- Location: Linz, Austria
- Has thanked: 21 times
- Been thanked: 43 times
Re: 124. Chew (3k) vs Fuego (Bot)
The problem with increasing the minimum playouts of each move is that this involves an exponential decrease in the statistical coverage of the followup moves.
Making 1500 playouts once is not that much, but 1500 playouts for each move at each expanded node is quite a lot. With normal tree traversal that would translate to an exponential increase in time, but the nature of the UCT algorithm's playouts means effectively that the total amount of playouts choosing "good" nodes decreases, while the time remains constant. That means the child nodes will get expanded later (on average), and the maximum tree depth decreases.
Effectively, you're making the search tree more balanced. That makes the horizon effect worse instead of better.
Of course it would be interesting to find a good balance between discarding bad nodes early and increasing the accuracy when deciding which nodes are bad. But 1500 feels far too high.
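To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes about 300 candidate moves per position (the figure used earlier in the thread), a fixed budget of one million playouts (an invented number, not from Fuego's logs), and that every child of every expanded node must reach the minimum. Real UCT is more selective than this, so treat it as a crude bound rather than a model of Fuego.

```python
def forced_playouts_per_node(candidate_moves: int, min_playouts: int) -> int:
    """Playouts spent just meeting the minimum for every child of one expanded node."""
    return candidate_moves * min_playouts

def nodes_affordable(total_playouts: int, candidate_moves: int, min_playouts: int) -> int:
    """How many expanded nodes a fixed budget covers if each child must reach the minimum."""
    return total_playouts // forced_playouts_per_node(candidate_moves, min_playouts)

TOTAL_PLAYOUTS = 1_000_000  # assumed budget, not a figure from the thread
CANDIDATE_MOVES = 300       # rough number of legal moves used earlier in the thread

for minimum in (23, 1500, 2500):
    print(minimum,
          forced_playouts_per_node(CANDIDATE_MOVES, minimum),
          nodes_affordable(TOTAL_PLAYOUTS, CANDIDATE_MOVES, minimum))
# prints:
# 23 6900 144
# 1500 450000 2
# 2500 750000 1
```

With the same budget, a minimum of 23 leaves room to expand on the order of a hundred nodes, while 1500 or 2500 leaves room for one or two, which is the loss of depth described above.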
- emeraldemon
- Gosei
- Posts: 1744
- Joined: Sun May 02, 2010 1:33 pm
- GD Posts: 0
- KGS: greendemon
- Tygem: greendemon
- DGS: smaragdaemon
- OGS: emeraldemon
- Has thanked: 697 times
- Been thanked: 287 times
Re: 124. Chew (3k) vs Fuego (Bot)
perceval wrote:
Increasing to, say 2500 would take little time (300*2500 = 750000 playouts, much less than the total number played) and decrease the error bar to 1%
The trouble is that the UCT algorithm doesn't treat the root node differently from any other node in the search tree. So it wouldn't be +750000 playouts, it would be +750000 playouts per move (which is why flOvermind said the increase is exponential). The response from Martin Mueller suggested the possibility of changing the algorithm to treat the root node differently, but it would be a nontrivial change to the source code.
That said, there is a specific exploration/exploitation tradeoff parameter I can tweak and play with. I will try some different values for the next move, see what it does.
However, as topazg pointed out, it's easy to see if the program plays differently, but not so easy to tell if it plays better.
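For readers wondering what an exploration/exploitation parameter does, here is a minimal sketch of textbook UCB1 child selection, the rule plain UCT is built on. It is not Fuego's actual code, and Fuego's real parameter and its name are not shown in this thread; `exploration`, `ucb1_score` and `select_child` are hypothetical names used only for illustration.

```python
import math

def ucb1_score(wins: int, visits: int, parent_visits: int, exploration: float) -> float:
    """Textbook UCB1: observed win rate plus a bonus for rarely visited moves."""
    if visits == 0:
        return math.inf  # unvisited moves are tried first
    return wins / visits + exploration * math.sqrt(math.log(parent_visits) / visits)

def select_child(children: dict, exploration: float) -> str:
    """Pick the move with the highest UCB1 score; children maps move -> (wins, visits)."""
    parent_visits = sum(visits for _, visits in children.values())
    return max(children,
               key=lambda move: ucb1_score(*children[move], parent_visits, exploration))

# Move A looks better so far; move B has been tried far less often.
children = {"A": (60, 100), "B": (5, 10)}
print(select_child(children, 0.0))  # prints: A
print(select_child(children, 1.0))  # prints: B
```

A small constant keeps sending playouts to the move that already looks best, while a larger one spreads them toward moves with few visits, so changing it changes the shape of the tree without saying anything about which setting plays better.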
- perceval
- Lives in gote
- Posts: 312
- Joined: Thu Aug 05, 2010 3:35 am
- Rank: 7K KGS
- GD Posts: 0
- KGS: tictac
- Has thanked: 52 times
- Been thanked: 41 times
Re: 124. Chew (3k) vs Fuego (Bot)
Oh OK, I thought there was something special about the first node because of those 23 playouts for all nodes.
In theory, there is no difference between theory and practice. In practice, there is.
- Mike Novack
- Lives in sente
- Posts: 1046
- Joined: Mon Aug 09, 2010 9:36 am
- GD Posts: 0
- Been thanked: 184 times
Re: 124. Chew (3k) vs Fuego (Bot)
perceval wrote:
ez4u asked why the minimum number of playouts seemed to be 23.
My guess is that it is a minimum value needed to get a rough estimate of the winning percentage of each node.
However, 23 playouts gives a huge error bar on the winning ratio (10%).
Ah, but does an error bar of that size hurt anything? At this point all that is being done is pruning the worst nodes from the tree, not yet selecting the best. All that is needed is that the right node (the move ultimately decided upon) not get pruned early.
For example, if that first pruning process didn't remove anything with a score over half of the highest score, then that possible 10% error wouldn't matter much.
- Chew Terr
- Gosei
- Posts: 2060
- Joined: Mon Apr 19, 2010 12:45 pm
- Rank: KGS 3k
- GD Posts: 264
- KGS: Chew
- Location: Texas
- Has thanked: 546 times
- Been thanked: 172 times
- Contact:
Re: 124. Chew (3k) vs Fuego (Bot)
Yeah, I've been leaving this group hanging for a bit too long. Really wanted to slide into the corner more for eyespace instead, but didn't like how much white could threaten to sever connections, while undercutting my base.
Someday I want to be strong enough to earn KGS[-].
- Chew Terr
- Gosei
- Posts: 2060
- Joined: Mon Apr 19, 2010 12:45 pm
- Rank: KGS 3k
- GD Posts: 264
- KGS: Chew
- Location: Texas
- Has thanked: 546 times
- Been thanked: 172 times
- Contact:
Re: 124. Chew (3k) vs Fuego (Bot)
Don't want to get cut here. Next, I'll either slide towards the corner or go up and out.
Someday I want to be strong enough to earn KGS[-].
- Loons
- Gosei
- Posts: 1378
- Joined: Tue Apr 20, 2010 4:17 am
- GD Posts: 0
- Location: wHam!lton, Aotearoa
- Has thanked: 253 times
- Been thanked: 105 times
Re: 124. Chew (3k) vs Fuego (Bot)
Wait, I'm confused about how Fuego thinks, again.
So the numbers on the board are the number of MC playouts it started there, every victorious playout adds weight to that playout's starting move, and highly placed moves (2nd, 3rd place, etc.) are tried earlier in later playouts (as they become established)?
Could you give us some samples of the finished MC playouts Fuego reaches? I am curious how crazy they look. (Sorry if I just overlooked them earlier.) Would it be useful to post a board of White's move values for the first move after Fuego's chosen move, or can we just infer that from Fuego's predicted sequences?