Re: MuZero beats AlphaZero
Posted: Fri Nov 22, 2019 9:17 pm
Since DeepMind isn't going to provide exact Elo values anyway, I'll do this for fun. I tried to read the Elo values off the graphs, assuming the graphs are drawn to an accurate scale.
We'll start with the exact numbers the paper mentions (from the AlphaGo Zero paper):
3,144 for AlphaGo Fan
3,739 for AlphaGo Lee
4,858 for AlphaGo Master
5,185 for AlphaGo Zero (40 blocks/ 40 days)
Now the estimated numbers:
AlphaGo Zero (20 blocks/ 3 days) 4,884 (from AlphaZero paper)
AlphaZero (20 blocks/ 13 days) 4,987 (from MuZero paper), 4,980 (from AlphaZero paper). The numbers are very similar across the two papers, so I think the graphs are drawn to an accurate scale.
MuZero (16 blocks/ 12 hours?) 5,161 (from MuZero paper)
There is one very BIG caveat, though: the match conditions differ. In the MuZero paper the condition is 800 simulations per move, and another graph shows MuZero outperforming AlphaZero from 0.1 to 20 seconds per move, while at 20 to 50 seconds per move AlphaZero outperforms MuZero. We don't know what happens at even longer thinking times.
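For a rough sense of what these gaps mean in games, here is a minimal sketch that turns an Elo difference into an expected score. It assumes the standard logistic Elo formula with a 400-point scale; the papers may compute their ratings differently, and the function and dictionary names are just mine for illustration. The ratings are the numbers listed above (the graph-read ones are estimates, not official values), and the caveat about differing match conditions applies here too.

```python
def expected_score(elo_a: float, elo_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo model.

    Assumption: 400-point scale, i.e. a 400-point lead gives about 10:1 odds.
    """
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))


# Ratings taken from the list above; entries marked "est." were read off graphs.
ratings = {
    "AlphaGo Fan": 3144,
    "AlphaGo Lee": 3739,
    "AlphaGo Master": 4858,
    "AlphaGo Zero (40 blocks/40 days)": 5185,
    "AlphaGo Zero (20 blocks/3 days, est.)": 4884,
    "AlphaZero (20 blocks/13 days, est.)": 4987,
    "MuZero (16 blocks, est.)": 5161,
}

muzero = ratings["MuZero (16 blocks, est.)"]
for name, elo in ratings.items():
    if elo == muzero:
        continue
    # Positive diff means MuZero is rated higher than the opponent.
    diff = muzero - elo
    print(f"MuZero vs {name}: diff {diff:+d}, expected score {expected_score(muzero, elo):.2f}")
```

With the estimated 174-point gap, MuZero's expected score against AlphaZero comes out at roughly 0.73 under this formula, but only if both ratings were measured under comparable conditions.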