AlphaGo is a computer program developed by Google DeepMind to play the board game Go. AlphaGo became the first computer program to defeat a professional human Go player and the first to defeat a Go world champion. DeepMind has since developed a more advanced artificial intelligence (AI) program for playing Go, called AlphaGo Zero. The techniques behind AlphaGo Zero have been generalized into another AI system called AlphaZero, capable of playing Go, chess, and shogi.
Go originated in China over 2,500 years ago and is played by more than 40 million people worldwide. The game requires multiple levels of strategic thinking to win. Two players (one playing as white and one playing as black) take turns placing stones on a 19 by 19 board. The aim of the game is to surround and capture the opponent's stones or strategically create spaces of territory. After all possible moves have been played, the players count one point for each vacant point inside their own territory, and one point for every stone they have captured. The player with the larger total of territory plus prisoners is the winner. Go has 10^170 possible board configurations, more than the number of atoms in the known universe. Go's level of complexity makes it challenging for computers to play, and before AlphaGo, leading programs could only play Go as well as amateurs.
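To make the scoring rule concrete, here is a minimal sketch in Python; the function and the point totals are hypothetical, for illustration only.

```python
def final_score(territory: int, prisoners: int) -> int:
    """Territory scoring as described above: one point per vacant
    point inside a player's territory plus one per captured stone."""
    return territory + prisoners

# Hypothetical end-of-game totals, for illustration only.
black = final_score(territory=44, prisoners=6)   # 50 points
white = final_score(territory=38, prisoners=9)   # 47 points
print("black wins" if black > white else "white wins")
```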
Standard AI methods, which test all possible moves and positions using a search tree, cannot handle the sheer number of possible Go moves or evaluate the strength of each possible board position. DeepMind developed a new approach when creating AlphaGo, combining an advanced tree search with deep neural networks. A description of the Go board is fed into these neural networks, which process it through a number of different layers containing millions of neuron-like connections. One neural network, called the "policy network," selects the next move to play. Another neural network, called the "value network," predicts the winner of the game.
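The division of labor between the two networks can be sketched in code. The following is a minimal illustration in PyTorch, not DeepMind's implementation: the real networks were far deeper convolutional networks, and the input encoding, layer sizes, and class names here are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

BOARD = 19  # a 19 by 19 Go board has 361 points

class PolicyNet(nn.Module):
    """Maps an encoded board to a probability for each of the 361 moves.
    Layer count and channel sizes are illustrative, not AlphaGo's."""
    def __init__(self, in_planes: int = 4, channels: int = 64):
        super().__init__()
        self.conv1 = nn.Conv2d(in_planes, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.head = nn.Conv2d(channels, 1, 1)  # one logit per board point

    def forward(self, board: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.conv1(board))
        x = F.relu(self.conv2(x))
        logits = self.head(x).flatten(1)   # (batch, 361)
        return F.softmax(logits, dim=1)    # probability of each move

class ValueNet(nn.Module):
    """Maps an encoded board to a scalar in [-1, 1]: the predicted
    winner of the game from the current player's perspective."""
    def __init__(self, in_planes: int = 4, channels: int = 64):
        super().__init__()
        self.conv = nn.Conv2d(in_planes, channels, 3, padding=1)
        self.fc1 = nn.Linear(channels * BOARD * BOARD, 256)
        self.fc2 = nn.Linear(256, 1)

    def forward(self, board: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.conv(board))
        x = F.relu(self.fc1(x.flatten(1)))
        return torch.tanh(self.fc2(x))     # predicted game outcome

board = torch.zeros(1, 4, BOARD, BOARD)  # one position, 4 feature planes
move_probs = PolicyNet()(board)          # guides which moves the search explores
win_estimate = ValueNet()(board)         # evaluates a position without playing it out
```

During search, the policy network narrows the moves the tree considers, while the value network scores the resulting positions, which is what lets AlphaGo avoid exhaustively exploring every continuation.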
These deep neural networks are trained by a combination of supervised learning from human expert games and reinforcement learning from games of self-play. DeepMind first trained the neural networks on 30 million moves from games played by human experts, until they could predict the human move 57% of the time. Next, AlphaGo learned to discover new strategies for itself by playing thousands of games between its neural networks and adjusting the connections through trial and error (reinforcement learning). Training AlphaGo required significant computing power and made use of Google's cloud platform.
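The two training phases can be illustrated with a toy loop. This is a hedged sketch, not DeepMind's pipeline: the tiny stand-in network, the random tensors standing in for the 30 million expert moves, and the plain REINFORCE-style update are all assumptions for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# A tiny stand-in policy network keeps the example self-contained.
policy = nn.Sequential(nn.Flatten(), nn.Linear(4 * 19 * 19, 361))
opt = torch.optim.SGD(policy.parameters(), lr=0.01)

# Phase 1: supervised learning. Random tensors stand in for encoded
# expert positions and the moves the humans actually chose.
positions = torch.randn(32, 4, 19, 19)       # batch of encoded boards
expert_moves = torch.randint(0, 361, (32,))  # index of each human move

loss = F.cross_entropy(policy(positions), expert_moves)
opt.zero_grad()
loss.backward()
opt.step()

# Phase 2: reinforcement learning from self-play (REINFORCE-style).
# Moves from games the network won are reinforced; moves from games
# it lost are discouraged. One fake "game" illustrates the update.
game_positions = torch.randn(8, 4, 19, 19)   # positions from one game
chosen_moves = torch.randint(0, 361, (8,))   # moves actually played
outcome = 1.0                                # +1 for a win, -1 for a loss

log_pi = F.log_softmax(policy(game_positions), dim=1)
chosen_log_pi = log_pi[torch.arange(8), chosen_moves]
rl_loss = -(outcome * chosen_log_pi).mean()  # policy-gradient loss
opt.zero_grad()
rl_loss.backward()
opt.step()
```

Repeating the second phase over many self-played games is the trial-and-error process the paragraph describes.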
AlphaGo is described in detail in a 2016 Nature paper titled "Mastering the game of Go with deep neural networks and tree search."
AlphaGo was first tested in a tournament against the leading Go-playing computer programs, losing only one game out of 500. In October 2015, DeepMind invited the reigning three-time European Go Champion, Fan Hui, to a closed-door match, which AlphaGo won 5-0. In March 2016, AlphaGo defeated Lee Sedol, winner of eighteen world titles, 4-1 in a five-game match watched by over 200 million people worldwide. The victory over Lee Sedol earned AlphaGo a 9-dan professional ranking, the highest certification, making it the first computer Go player to receive the accolade. During the games, AlphaGo played several inventive and surprising winning moves that have since been studied extensively.
In January 2017, DeepMind revealed an improved online version of AlphaGo called Master, which achieved sixty straight wins in time-control games against top international players. In May 2017, AlphaGo took part in the Future of Go Summit in China. The summit included various game formats, such as pair Go, team Go, and a match with the world's number one player, Ke Jie. After the summit, DeepMind unveiled AlphaGo's successor, AlphaGo Zero.