(Yicai Global) Oct. 19 -- AlphaGo Zero, the latest evolution of an artificial intelligence program developed by Google's DeepMind, learned the ancient Chinese board game Go from scratch without any human input and beat its forerunner -- AlphaGo Lee -- 100 to 0, its developer said in a paper published in the journal Nature.
After playing several million games against itself, the new AI program discovered intricacies of Go that took humans thousands of years to understand, the article said. Zero came up with original strategies, producing insights into the ancient game.
AlphaGo Lee ran on 48 Tensor Processing Units (TPUs) and beat South Korea's nine-dan professional Go player Lee Sedol in four out of five games in March last year, after studying established Go move sequences (josekis) and playing against itself about 30 million times over several months.
AlphaGo Zero runs on four TPUs and learned to play without facing humans. It took the new version three days and some 4.9 million self-training games to best AlphaGo Lee 100 bouts in a row.
The program's development has taken reinforcement learning algorithms to a new level.
The evolution of reinforcement learning has gone through three stages: early algorithms in the early 1990s, 'Q learning,' and deep reinforcement learning, which started a decade ago. As the development of Zero shows, combining reinforcement learning with a look-ahead mechanism (similar to reconnaissance in military operations) from tree traversal theory has created a more efficient deep reinforcement learning model.
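The 'Q learning' stage mentioned above can be illustrated with a minimal tabular sketch. The toy environment, reward values, and hyperparameters below are illustrative assumptions for this example only, not DeepMind's code:

```python
import random

# Toy task (assumed for illustration): walk right along states 0..4;
# reaching state 4 ends the episode with reward 1.
N_STATES = 5
ACTIONS = [-1, +1]            # move left or move right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
for episode in range(200):
    s, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the current estimates, sometimes explore.
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        nxt, r, done = step(s, a)
        # Q-learning update: bootstrap from the best value in the next state.
        best_next = max(Q[(nxt, act)] for act in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = nxt
```

After training, the learned values prefer moving right in every non-terminal state; deep reinforcement learning replaces the table `Q` with a neural network, and AlphaGo Zero further couples such learning with tree search.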
As a result, Zero did not rely on existing human knowledge as its predecessor did, and it can invent better Go strategies through self-training, said Xu Lei, a chair professor at Shanghai Jiao Tong University and head of the Centre for Cognitive Machines and Computational Health (CMaCH).
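The look-ahead mechanism from tree traversal theory can be sketched with a much simpler game than Go. The take-1-or-2-stones game below is an assumed example of exhaustive tree search in general; AlphaGo Zero itself uses a guided Monte Carlo variant rather than full traversal:

```python
from functools import lru_cache

# Toy game (assumed for illustration): players alternately take 1 or 2
# stones from a pile; whoever takes the last stone wins.

@lru_cache(maxsize=None)
def best_value(stones):
    """Return +1 if the player to move can force a win, else -1."""
    if stones == 0:
        return -1  # the previous player took the last stone and won
    # Look ahead: traverse every legal move and keep the best outcome,
    # negating because the opponent moves next.
    return max(-best_value(stones - take) for take in (1, 2) if take <= stones)

print([best_value(n) for n in range(1, 7)])  # → [1, 1, -1, 1, 1, -1]
```

The traversal reveals the game's structure (piles divisible by three are lost) without being told any strategy, which is the sense in which look-ahead lets a program discover knowledge rather than memorize it.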
Compared with its predecessors, AlphaGo Zero's algorithms are simpler and smarter. Instead of learning from human-generated big data, it discovered knowledge by applying learning rules set by its human developers, and it 'knows' how to rectify mistakes made by humans. It acquired such abilities with amazing efficiency. Interestingly, the AI cannot explain how it achieved this and can only provide demonstrations, said Zhang Zheng, a computer science professor at New York University Shanghai.
AlphaGo Zero's algorithms and programs are like a black box that can improve itself as the number of self-training sessions increases, and it 'inherits' optimized algorithms by copying certain code, but people cannot look inside the algorithms, said Wei Hui, a professor at Fudan University's School of Computer Science and Technology.
It is unclear if Zero and other AI programs and computers have explored all the possible moves of the board game, but AI is definitely faster than humans and will bring new discoveries -- or rather new josekis, Zhang said.