DeepMind reveals latest AlphaGo's shocking strength graph

pookpooi · Post by **pookpooi** » Tue May 23, 2017 10:41 pm

From Future of Go Summit AI conference today
Source https://twitter.com/webigojp/status/867228171174903808
Video https://www.facebook.com/GOking2007/vid ... 096921048/

Comment from Seigenblues on Reddit

AG Master used 10x less compute, trained in weeks vs months. Single machine. (Not 5? Not sure). Main idea behind AlphaGo Master: only use the best data. Best data is all AG's data, i.e. only trained on AG games.
Using training data (self play) to train new policy network. They train the policy network to produce the same result as the whole system. Ditto for revising the value network. Repeat. Iterated "many times".
Results: AG Lee beat AG Fan at 3 stones. AG Master beat AG Lee at three stones! Chart stops there, no hint at how much stronger AG Ke is or if it's the same as AG Master
Strong caveat here from the researchers: bot vs bot handicap margins aren't predictive of human strength, especially given its tendency to take its foot off the gas when it's ahead
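
To make the "train the policy to match the whole system, then repeat" idea above concrete, here is a toy sketch. It is not DeepMind's code: the three-move position, the `TRUE_VALUE` win rates, and the `search` and `distill` helpers are all made up for illustration, and the value-network half of the loop is left out entirely.

```python
import math

# Hypothetical win rates for three candidate moves in one toy position.
TRUE_VALUE = [0.2, 0.5, 0.8]

def search(policy, temperature=0.5):
    """Stand-in for the 'whole system': reweight the prior by move value."""
    weights = [p * math.exp(v / temperature) for p, v in zip(policy, TRUE_VALUE)]
    total = sum(weights)
    return [w / total for w in weights]

def distill(policy, target, lr=0.5):
    """Move the raw policy part of the way toward the search output."""
    return [(1 - lr) * p + lr * t for p, t in zip(policy, target)]

def iterate(policy, rounds):
    """Repeat: run the stronger system, then train the policy to imitate it."""
    for _ in range(rounds):
        policy = distill(policy, search(policy))
    return policy

start = [1 / 3, 1 / 3, 1 / 3]
final = iterate(start, 20)
print(final)
```

Each round the policy puts a bit more weight on the move the search prefers, so the next search starts from a better prior. That is the only point of the sketch.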

xiayun · Post by **xiayun** » Tue May 23, 2017 10:46 pm

So apparently the Elo rating of the latest version is over 4,500.

pookpooi · Post by **pookpooi** » Tue May 23, 2017 10:52 pm

xiayun wrote:So apparently the Elo rating of the latest version is over 4,500.

They put AlphaGo Lee (nice name) at 3700 Elo and Master at 4800 Elo (eyeballing the graph).
So 1100 / 3 stones is about 367 Elo per stone. That may make sense at a high level, since each handicap stone is worth more and more Elo there than at lower levels. KGS uses 230 Elo per stone at 2 dan and above, though, so by the KGS standard Master would be around 4400 Elo.
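
The arithmetic above can be checked with a short sketch. The 3700 and 4800 figures are the values read off the graph by eye, and 230 Elo per stone is the KGS figure quoted in this post. The function names are just for illustration.

```python
def elo_per_stone(elo_gap, stones):
    """Average Elo difference represented by one handicap stone."""
    return elo_gap / stones

def rating_from_stones(base_elo, stones, per_stone):
    """Rating implied by being `stones` stronger than `base_elo`."""
    return base_elo + stones * per_stone

LEE_ELO = 3700           # read off the graph, approximate
MASTER_ELO = 4800        # read off the graph, approximate
STONES = 3               # Master gave Lee three stones
KGS_ELO_PER_STONE = 230  # KGS figure for 2 dan and above, as quoted above

per_stone = elo_per_stone(MASTER_ELO - LEE_ELO, STONES)
kgs_estimate = rating_from_stones(LEE_ELO, STONES, KGS_ELO_PER_STONE)

print(round(per_stone, 1))  # 366.7
print(kgs_estimate)         # 4390
```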

Krama · Post by **Krama** » Wed May 24, 2017 4:40 am

This is absurd. First they claim AG can't play handicap games then they say it's 3 stones above AG Lee. Why didn't they do a handicap match with Ke Jie then?
For example since Ke lost a game the next game AG should give him 2 stones.

pookpooi · Post by **pookpooi** » Wed May 24, 2017 5:09 am

Krama wrote:This is absurd. First they claim AG can't play handicap games then they say it's 3 stones above AG Lee. Why didn't they do a handicap match with Ke Jie then?
For example since Ke lost a game the next game AG should give him 2 stones.

I haven't seen DeepMind claim AlphaGo can't play handicap games. In fact, it always has, and this time it's even more impressive, as David Silver says these were no-komi handicap games (in the Nature paper it had to retain the komi of 7.5, so it gave 4 stones but received 7.5 points as komi, thus making it effectively a 3-stone handicap game instead). I think the problem is on the human pro side instead.
This is why I suggested the jubango idea to DeepMind: the next stage is Japan (more press freedom than China for sure), and the human representative is Iyama Yuta. The format is ten games, but whichever side loses a game has to receive a handicap stone in the next game.

pookpooi · Post by **pookpooi** » Wed May 24, 2017 6:18 am

USGO wrote a nice article on this

http://www.usgo.org/news/2017/05/new-ve ... efficient/

Kirby · Post by **Kirby** » Wed May 24, 2017 9:37 am

Thanks for the link, pookpooi. From the article:

usgo article wrote: The version of AlphaGo that defeated Ke Jie 9p in the first round of the three game challenge match yesterday was trained entirely on the self-play games of previous versions of AlphaGo, a Google DeepMind engineer told an audience in China.

Based on the assumption that Master is truly 3-stones stronger than last year's version of AlphaGo, there are at least two potential explanations for AlphaGo's jump in strength:

1. AlphaGo's continued training and self-play.
2. The changed methodology of feeding the policy network with self-play games.

The argument for #1 is understandable; after all, there was about a 3-stone jump in strength between AlphaGo Fan and AlphaGo Lee, apparently due primarily to the continued self-play. Why not expect similar results by continuing the same method of self-play?

But #2 has more potential to be inspiring. Namely, it suggests that feeding one's policy network with higher quality input can result in significant gains in strength. If this were true, what does it mean for human player improvement? I suppose it would imply the, perhaps obvious, idea that considering better moves results in better play. Practically speaking, maybe that means it'd be better to spend less time kibitzing amateur games, and more time absorbing top pro games - or AlphaGo games, perhaps. Having this exposure seems analogous to feeding our policy networks with better input data.

All of this being said, there's no reason for me to believe that #2, above, had a more significant impact on AlphaGo Master's strength compared to simply doing more self-play games. #2 probably helped, but #1 alone was enough for AlphaGo Lee to jump 3 stones from AlphaGo Fan...

Nonetheless, it's something to think about.

ewan1971 · Post by **ewan1971** » Wed May 24, 2017 4:02 pm

pookpooi wrote:
Krama wrote:This is absurd. First they claim AG can't play handicap games then they say it's 3 stones above AG Lee. Why didn't they do a handicap match with Ke Jie then?
For example since Ke lost a game the next game AG should give him 2 stones.
I haven't seen DeepMind claim AlphaGo can't play handicap games. In fact, it always has, and this time it's even more impressive, as David Silver says these were no-komi handicap games (in the Nature paper it had to retain the komi of 7.5, so it gave 4 stones but received 7.5 points as komi, thus making it effectively a 3-stone handicap game instead). I think the problem is on the human pro side instead.
This is why I suggested the jubango idea to DeepMind: the next stage is Japan (more press freedom than China for sure), and the human representative is Iyama Yuta. The format is ten games, but whichever side loses a game has to receive a handicap stone in the next game.

Playing in Japan would be the next logical move. Although AlphaGo's hardware might require radiation shielding, as do Google's representatives.

djhbrown · Post by **djhbrown** » Wed May 24, 2017 4:37 pm

delete constant, insert variable, turn on, qed.
as for radiation shielding, it's not those that radiate bulldust that need shielding, it's the poor saps that soak it up and genuflect, genuflect.

ewan1971 · Post by **ewan1971** » Wed May 24, 2017 9:27 pm

djhbrown wrote:delete constant, insert variable, turn on, qed.
as for radiation shielding, it's not those that radiate bulldust that need shielding, it's the poor saps that soak it up and genuflect, genuflect.

LOL!

AlphaGo will do to Iyama what the atomic bombs did to... But seriously, the Japanese have been lying through their teeth about the true radiation levels from Fukushima. I'd seriously worry about Google team's health if they decided to host their next event in Japan.

djhbrown · Post by **djhbrown** » Wed May 24, 2017 11:55 pm

David looks so much like Dustin Hoffman in Marathon Man, it feels like someone will ask him "Is it safe?"

Solomon · Post by **Solomon** » Thu May 25, 2017 12:03 am

ewan1971 wrote: LOL!

AlphaGo will do to Iyama what the atomic bombs did to... But seriously, the Japanese have been lying through their teeth about the true radiation levels from Fukushima. I'd seriously worry about Google team's health if they decided to host their next event in Japan.

djhbrown wrote:David looks so much like Dustin Hoffman in Marathon Man, it feels like someone will ask him "Is it safe?"

This sort of politics is against forum rules, ewan1971. You have already been warned for violating forum rules, so this will be a suspension. djhbrown, the same goes for you. I should also note that you have received numerous reports for your negative and off-putting posts, and I have even received PMs requesting information on how to ignore your posts. As a result, your account will be suspended as well.

TheCannyOnion · Post by **TheCannyOnion** » Thu May 25, 2017 12:41 am

Solomon wrote:This sort of politics is against forum rules...

Just thought it appropriate to point out that user pookpooi too has a fondness for injecting irrelevant political remarks into his posts. Specifically, I've noticed that he has a habit of making snide remarks toward China and Chinese politics. This latest eruption from ewan1971 seems to have been a reaction to an earlier post from pookpooi.

Here's a sample of pookpooi's posts I could find off-hand that are completely unnecessary and against forum rules prohibiting political discussions outside the off-topic forum.

pookpooi wrote:A little update: the DeepMind team will stay in China for 3 weeks and go back to London in June. And yes, they're very excited about working under 'The Great Firewall' for the first time...

pookpooi wrote:This is why I suggested the jubango idea to DeepMind: the next stage is Japan (more press freedom than China for sure), and the human representative is Iyama Yuta...

I'd like to appeal to pookpooi to cease injecting politics into his remarks. This forum is not a place to express personal prejudice or grandstand; it's a place to discuss Go.

Thank you.

Uberdude · Post by **Uberdude** » Thu May 25, 2017 1:42 am

TheCannyOnion wrote:Just thought it appropriate to point out that user pookpooi too has a rather nasty tendency of injecting irrelevant political remarks into his posts.

I've not noticed that. There was quite some discussion on reddit, but less here, about how China blocked live broadcast of the match within China just a few days before the event, probably because they don't like Google. And before he went Aja said his wife would be posting updates for him on facebook as it's blocked there.

TheCannyOnion · Post by **TheCannyOnion** » Thu May 25, 2017 2:04 am

Uberdude wrote:
TheCannyOnion wrote:Just thought it appropriate to point out that user pookpooi too has a rather nasty tendency of injecting irrelevant political remarks into his posts.
I've not noticed that. There was quite some discussion on reddit, but less here, about how China blocked live broadcast of the match within China just a few days before the event, probably because they don't like Google. And before he went Aja said his wife would be posting updates for him on facebook as it's blocked there.

Google is not the press, and it's not a beacon of freedom either. It's an internet giant working closely with the NSA, DHS, CIA, FBI and other spy agencies. Ooops, see what I did there?

I simply asked that pookpooi stop his blatant tendency of injecting political commentary into his posts. Besides, if we take pookpooi's suggestion that AlphaGo next play in Japan, then are we not condoning Japan's unequal treatment of women in workplaces, its continued commercial slaughtering of whales, and its inability to apologize for Pearl Harbor and committing mass atrocities in World War II? Ooops, see what I just did again?

Or can we please all stop injecting politics into discussions on Go?
