AlphaGo second paper released: AlphaGo Zero

Posted: Wed Oct 18, 2017 10:08 am
by pookpooi
DeepMind makes a revolution again!
https://deepmind.com/blog/alphago-zero- ... g-scratch/
The follow-up Nature paper, "Mastering the game of Go without human knowledge", is freely available at https://deepmind.com/documents/119/agz_ ... nature.pdf

AlphaGo second paper media coverage

Nature BBC The Verge Wired Science Magazine MIT Technology Review

Some interesting bits I found in various news sources:

DeepMind said that it’s not releasing the code as it might for other projects. Hassabis says outside researchers will likely be able to replicate parts of it from the Nature paper.

The team says they don’t know AlphaGo Zero’s upper limit—it got so strong that it didn’t seem worth training it anymore.

“Its games look a lot like human play but it also feels more free, perhaps because it is not limited by our knowledge,” Fan Hui says. He’s already christened one tactic it came up with the “zero move,” such is its striking power in the early stages of a game. “We have never seen a move like this, even from AlphaGo,” he says.

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Wed Oct 18, 2017 10:32 am
by dfan
pookpooi wrote:I'm trying to see if there's a free copy available right now
Here it is (linked to from their blog post so legit): https://deepmind.com/documents/119/agz_ ... nature.pdf

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Wed Oct 18, 2017 10:34 am
by pookpooi
dfan wrote:
pookpooi wrote:I'm trying to see if there's a free copy available right now
Here it is (linked to from their blog post so legit): https://deepmind.com/documents/119/agz_ ... nature.pdf
Thank you, will edit now.

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Wed Oct 18, 2017 10:41 am
by dfan

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Wed Oct 18, 2017 11:26 am
by pookpooi
dfan wrote:and records of self-play games: http://www.nature.com/nature/journal/v5 ... 270-s2.zip
Too bad it only has 20 games for each figure. In the Zero self-play games I count 14 wins for White and 6 for Black. That's to be expected considering Chinese rules favor White, but I'd still like to know the results of all 100 games.
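
How much a 14-6 split in 20 games can tell us is easy to check with a binomial tail calculation. This is only a rough sketch: it assumes every game is an independent 50/50 coin flip when neither colour has an advantage, which self-play games from related networks may not satisfy, and `tail_at_least` is just an illustrative helper name.

```python
from math import comb


def tail_at_least(wins: int, games: int, p: float = 0.5) -> float:
    """Probability of seeing `wins` or more successes in `games` trials,
    assuming independent trials that each succeed with probability p."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


# Chance of White winning 14 or more of 20 games if colour made no difference.
one_sided = tail_at_least(14, 20)
print(f"{one_sided:.4f}")  # about 0.058
```

So a result at least this lopsided would show up by chance roughly 6% of the time even with no colour advantage, which is suggestive but not much to go on from 20 games.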

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Wed Oct 18, 2017 12:31 pm
by luigi
pookpooi wrote:
dfan wrote:and records of self-play games: http://www.nature.com/nature/journal/v5 ... 270-s2.zip
Too bad it only has 20 games for each figure. In the Zero self-play games I count 14 wins for White and 6 for Black. That's to be expected considering Chinese rules favor White, but I'd still like to know the results of all 100 games.
Ask them in the reddit AMA. :)

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Wed Oct 18, 2017 12:35 pm
by pookpooi
luigi wrote: Ask them in the reddit AMA. :)
People are already asking questions specifically about komi and bias.

I'm still in disbelief at how strong AlphaGo Zero is. In 2014 no one could have imagined that within three years computer go would get 10 stones stronger...

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Wed Oct 18, 2017 3:23 pm
by kris
From www.nature.com:

Nature 550, 354–359 (19 October 2017) doi:10.1038/nature24270
Received 07 April 2017 Accepted 13 September 2017 Published online 18 October 2017

It's interesting that the paper was received before the Future of Go Summit, which took place in May.

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Wed Oct 18, 2017 7:47 pm
by pookpooi
Regarding the black/white win rate: I read the paper carefully again and concluded that we can't infer anything on this topic, because the self-play games were not all played by Zero at full strength. They are divided into 20 periods, with only the 20th period being the strongest version.

This is an information-overload moment, but I'll keep reading until my understanding of Zero crystallizes.

AI strength in KGS dan (my own conversion)
Crazy Stone 2015 5d
AlphaGo Fan 9d
AlphaGo Lee 11d (DeepZen/FineArt ± 1 stone)
AlphaGo Master 14d
AlphaGo Zero 15d

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Thu Oct 19, 2017 2:47 am
by pookpooi
Chinese Weiqi master Ke Jie commented on the remarkable accomplishments of the new program via his Weibo account, "A pure self-learning AlphaGo is the strongest, humans seem redundant in front of its self-improvement."

Source: http://www.ecns.cn/m/2017/10-19/277691.shtml

In response to the reports, Lee Se-dol, the only human player to date that has won against AlphaGo, said, “The previous version of AlphaGo wasn’t perfect, and I believe that’s why AlphaGo Zero was made.”

Mok Jin-seok, who directs the South Korean national Go team, said the Go world has already been imitating the playing styles of previous versions of AlphaGo and creating new ideas from them, and he is hopeful that new ideas will come out from AlphaGo Zero.

Mok also added that general trends in the Go world are now being influenced by AlphaGo’s playing style.

“At first, it was hard to understand and I almost felt like I was playing against an alien. However, having had a great amount of experience, I’ve become used to it,” Mok said.

“We are now past the point where we debate the gap between the capability of AlphaGo and humans. It’s now between computers.”

Mok has reportedly already begun analyzing the playing style of AlphaGo Zero along with players from the national team.

“Though having watched only a few matches, we received the impression that AlphaGo Zero plays more like a human than its predecessors,” Mok said.

http://koreabizwire.com/go-players-exci ... zero/98282

Gu Li then quoted Ke Jie and said he's sad about human progress, because a human can't cram 20 years of go knowledge into 3 days (laughing and crying).
Tang Weixing said he doesn't know what to say. The previous version took many years to make, while the version that uses no human knowledge took only 40 days. He then began to wonder about the future of humanity: if what has been dragging its feet (in development) is really we humans ourselves, then as a small part of God, we're all for nothing.
Gu Li then quoted Tang Weixing and jokingly said we are all dragging our feet.
Google Translate from https://sports.sina.cn/others/qipai/201 ... l?from=wap
Moving on to Japan, Hideki Kato, co-developer of DeepZenGo, tweeted that he had been asked to give a lecture about this paper at 6 AM today, but since the paper was released during the night in Japan (around 2 AM), he hadn't read or heard anything yet.
He commented that the paper should specify whether the TPUs used are version 1 or 2, because there's a significant difference between them.
https://twitter.com/gghideki_katoh

Any other professional reactions?

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Thu Oct 19, 2017 5:37 am
by John Fairbairn
Cargo cult alert: "People tend to overestimate the effect of a technology in the short run and underestimate it in the long run."

This is a famous quotation though I don't know who said it. It seems to be working here. Some people seem to be assuming AlphaGo Zero has reached some exorbitant level such as 20-dan.

Maybe it has. But if we assume the real strength of AI programs is not their go "knowledge" but the fact that they make far fewer mistakes than humans, a mistake-free program that beats a 9-dan not-quite-mistake-free program (or human) 100-0 is not necessarily 20-dan or whatever. It eliminates luck, so it might just be 9.1-dan.

There is an adage that photographs never lie, but of course they do. In the same way, while it is so often assumed numbers never lie, in ratings systems they too "never had sexual relations with that woman."

Of course my remarks are built on the assumption that the calorie-free version of AG has eliminated mistakes. We have no way of knowing. But I do wonder whether what we have seen is AG Master being trained on human data and so having some mistakes built in, whereas tabula rasa AG Zero has essentially eliminated the GIGO effect.

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Thu Oct 19, 2017 5:53 am
by pookpooi
John Fairbairn wrote:Of course my remarks are built on the assumption that the calorie-free version of AG has eliminated mistakes. We have no way of knowing. But I do wonder whether what we have seen is AG Master being trained on human data and so having some mistakes built in, whereas tabula rasa AG Zero has essentially eliminated the GIGO effect.
Your opinion might resonate with psychology professor Gary Marcus.

"Marcus is generally critical of what he sees as a general bias in the AI field toward tabula rasa programming. He argues that "in biology, actual human brains are not tabula rasa ... I don't see the principal theoretical reason why you should do that, why you should abandon lots of knowledge that we have about the world.""

From http://www.npr.org/sections/thetwo-way/ ... -knowledge

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Thu Oct 19, 2017 6:17 am
by vier
John Fairbairn wrote:If we assume the real strength of AI programs is not their go "knowledge" but the fact that they make far fewer mistakes than humans, a mistake-free program that beats a 9-dan not-quite-mistake-free program (or human) 100-0 is not necessarily 20-dan or whatever. It eliminates luck, so it might just be 9.1-dan.
I don't understand. Difference in rating is defined by winning probability.
John Fairbairn wrote:Of course my remarks are built on the assumption that the calorie-free version of AG has eliminated mistakes. We have no way of knowing.
AlphaGo Zero beat AlphaGo Master 89-11. (Not 100-0.)
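
To put a number on "rating difference is defined by winning probability": here is a minimal sketch that turns a match score into a rating gap. It assumes the standard logistic Elo formula with a 400-point scale; the function names are made up for illustration, and it says nothing about how the ratings in the paper itself were computed or how Elo points map to dan ranks.

```python
import math


def expected_score(gap: float) -> float:
    """Expected score of the stronger side, given an Elo gap in points."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


def elo_gap(win_rate: float) -> float:
    """Elo gap implied by a win rate strictly between 0 and 1."""
    if not 0.0 < win_rate < 1.0:
        # A 100-0 (or 0-100) result has no finite gap under this model.
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# The 89-11 result against AlphaGo Master.
print(round(elo_gap(89 / 100)))  # roughly 363
```

Note that a 100-0 score gives no finite gap under this model, only a lower bound, so a perfect record by itself cannot tell "slightly stronger and never slips" apart from "many stones stronger".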

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Thu Oct 19, 2017 6:32 am
by HKA
vier wrote: I don't understand. Difference in rating is defined by winning probability.
If a rating system is based on winning probability, then of course it is defined by such.

However, I think what John means is that the program is so precise, it can win against humans all the time. To such a rating program, that might lead eventually to a rating of 20 dan - but in another sense - could it give a 9 dan 11 stones?

To me it is like comparing Go Seigen and Lee Changho in their primes. It certainly looked as if Go Seigen was 11 dan - he could probably hold his own against 9 dans at 2 stones - certainly he proved he could at one stone. Lee Changho, on the other hand, seemed precisely one or two points better than everyone else - not ranks but points.

John is questioning whether the program is an uber Go Seigen, or simply an infallible Lee Changho.

Re: AlphaGo second paper released: AlphaGo Zero

Posted: Thu Oct 19, 2017 7:52 am
by pookpooi
Lee Hajin reviews AlphaGo Zero, take a look!

https://www.youtube.com/watch?v=QprlFINq9co