AlphaZero is a generalization of the computer program AlphaGo Zero: a single system that teaches itself how to play chess, shogi (Japanese chess), and Go. While traditional chess engines rely on rules and heuristics handcrafted by human players, AlphaZero replaces these handcrafted rules with a deep neural network and general-purpose algorithms that know nothing about the game beyond the basic rules. AlphaZero was developed by DeepMind Technologies, which introduced it in a preprint in December 2017 and published the full results in 2018.
To learn each game, an untrained neural network plays millions of games against itself via a process of trial and error called reinforcement learning. At first, it plays completely randomly, but over time the system learns from wins, losses, and draws to adjust the parameters of the neural network, making it more likely to choose advantageous moves in the future. The amount of training the network needs depends on the style and complexity of the game, taking approximately 9 hours for chess, 12 hours for shogi, and 13 days for Go.
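As a rough illustration of the self-play loop described above, the sketch below trains on a toy game rather than chess, shogi, or Go. Everything in it is an assumption made for illustration: the game (a single pile of stones, each turn removes one to three, and taking the last stone wins), a lookup table `q` standing in for the neural network, epsilon-greedy exploration, and the names `train`, `self_play_game`, and `best_move`. It is not AlphaZero's training procedure, only the same learn-from-outcomes idea at miniature scale.

```python
import random

# Toy game (an assumption for illustration, not one of AlphaZero's games):
# one pile of stones, a move removes 1..MAX_TAKE, taking the last stone wins.
PILE = 10
MAX_TAKE = 3


def legal_moves(stones):
    return list(range(1, min(MAX_TAKE, stones) + 1))


def choose_move(q, stones, epsilon, rng):
    """Epsilon-greedy choice: usually the best-known move, sometimes a random one."""
    moves = legal_moves(stones)
    if rng.random() < epsilon:
        return rng.choice(moves)
    best = max(q.get((stones, m), 0.0) for m in moves)
    return rng.choice([m for m in moves if q.get((stones, m), 0.0) == best])


def self_play_game(q, epsilon, rng):
    """Play one game in which both sides share the same table `q`."""
    stones, history = PILE, []
    while stones > 0:
        move = choose_move(q, stones, epsilon, rng)
        history.append((stones, move))
        stones -= move
    return history


def train(episodes=5000, epsilon=0.2, lr=0.1, seed=0):
    """Nudge each visited (position, move) estimate toward the game's outcome."""
    rng = random.Random(seed)
    q = {}  # hypothetical stand-in for the neural network's parameters
    for _ in range(episodes):
        history = self_play_game(q, epsilon, rng)
        outcome = 1.0  # whoever made the last move took the last stone and won
        for stones, move in reversed(history):
            old = q.get((stones, move), 0.0)
            q[(stones, move)] = old + lr * (outcome - old)
            outcome = -outcome  # the previous move belonged to the other player
    return q


def best_move(q, stones):
    """Greedy move under the learned estimates."""
    return max(legal_moves(stones), key=lambda m: q.get((stones, m), 0.0))


if __name__ == "__main__":
    table = train()
    for stones in range(1, PILE + 1):
        print(stones, "stones -> take", best_move(table, stones))
```

The table starts empty, so early games are effectively random. As wins and losses accumulate, moves that led to wins get higher estimates and are chosen more often, which mirrors the progression from random play to deliberate play described above.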
The trained network is used to guide a search algorithm – known as Monte-Carlo Tree Search (MCTS) – to select the most promising moves in games. For each move, AlphaZero searches only a small fraction of the positions considered by traditional chess engines.
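To make the idea of a network-guided search concrete, here is a minimal sketch of Monte Carlo tree search with a PUCT-style selection rule, run on a toy take-away game (remove one to three stones, and taking the last stone wins). The game, the `stub_network` function (uniform move priors and a neutral value in place of a trained network), the constant `c_puct`, and all other names are assumptions for illustration; this is not DeepMind's implementation.

```python
import math

MAX_TAKE = 3  # toy game (assumption): remove 1..3 stones, last stone wins


def legal_moves(stones):
    return list(range(1, min(MAX_TAKE, stones) + 1))


def stub_network(stones):
    """Hypothetical placeholder for a trained network: uniform priors, value 0."""
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}, 0.0


class Node:
    def __init__(self, stones, prior):
        self.stones = stones    # stones left for the player to move here
        self.prior = prior      # prior probability of the move leading here
        self.visits = 0
        self.value_sum = 0.0    # from the view of the player who moved into this node
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node, c_puct):
    """Pick the child maximizing mean value plus a prior-weighted exploration bonus."""
    def score(child):
        bonus = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.q() + bonus
    return max(node.children.values(), key=score)


def search(stones, simulations=2000, c_puct=1.5, network=stub_network):
    """Run the search and return the visit count of each move at the root."""
    root = Node(stones, prior=1.0)
    for _ in range(simulations):
        node, path = root, [root]
        # Selection: walk down the tree until reaching an unexpanded node.
        while node.children:
            node = select_child(node, c_puct)
            path.append(node)
        # Expansion and evaluation, from the view of the player to move.
        if node.stones == 0:
            value = -1.0  # the opponent just took the last stone
        else:
            priors, value = network(node.stones)
            for move, p in priors.items():
                node.children[move] = Node(node.stones - move, p)
        # Backup: flip the sign at every level because the players alternate.
        for visited in reversed(path):
            value = -value
            visited.value_sum += value
            visited.visits += 1
    return {move: child.visits for move, child in root.children.items()}


if __name__ == "__main__":
    counts = search(5)
    print(counts, "-> play", max(counts, key=counts.get))
```

The move with the most visits is the one played. Because the priors steer simulations toward promising moves, most of the search budget goes to a few lines of play rather than being spread evenly, which is the sense in which a guided search can examine far fewer positions than an exhaustive one. With the uninformative stub used here, the guidance comes only from terminal outcomes; a trained network would supply sharper priors and values.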