It used to be that we humans could feel a little superior to machines, knowing that there were still some things we could do better: recognize emotions, dream and play strategic games like chess.
But in the last few decades, that human edge is slowly but surely being eroded: IBM’s supercomputer Deep Blue beat chess world champion Garry Kasparov back in 1997, and Watson defeated top human contestants on the quiz show Jeopardy! back in 2011.
And now, in what is being called a breakthrough for artificial intelligence research, a program developed by Google scientists has bested a professional human player for the first time in the ancient and complex board game known as Go.
With the aim of pitting new machine learning research against the best human players, Google’s “AlphaGo” program is being spearheaded by researchers at DeepMind, a British artificial intelligence company that was acquired by Google in 2014.
Artificial intelligence researcher Demis Hassabis of Google DeepMind explains the significance of this victory: “Go is considered to be the pinnacle of game AI research. It’s been the grand challenge, or holy grail if you like, of AI since Deep Blue beat Kasparov at chess. Go is a very beautiful game with extremely simple rules that lead to profound complexity. In fact, Go is probably the most complex game ever devised by humans.”
Neural Nets with Human-like “Imagination”
Go, also known by the names “Igo”, “Weiqi” and “Baduk”, is a popular game in China, Japan and South Korea, and dates back at least 2,500 years. The game involves two players strategically positioning black or white pieces over a gridded board, with the aim of taking more territory than one’s opponent.
Go is a vastly more complex game than chess, which has on average about 35 possible moves per turn; Go has about 250, and because the game tree grows exponentially, the number of possible board configurations is, by some estimates, greater than the number of atoms in the universe.
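To get a feel for those numbers, here is a back-of-the-envelope sketch in Python comparing uniform game trees built from the average branching factors cited above (a deliberate simplification: real game trees are not uniform, and these are coarse averages).

```python
# Rough comparison of game-tree sizes for chess vs. Go, using the
# average branching factors cited above.
CHESS_BRANCHING = 35   # average legal moves per turn in chess
GO_BRANCHING = 250     # average legal moves per turn in Go

def tree_size(branching: int, depth: int) -> int:
    """Number of move sequences in a uniform tree of the given depth."""
    return branching ** depth

# Even at a shallow lookahead of 10 half-moves, the gap is enormous:
chess_10 = tree_size(CHESS_BRANCHING, 10)
go_10 = tree_size(GO_BRANCHING, 10)
print(f"chess, 10 plies: {chess_10:.2e}")   # ~2.76e15
print(f"go, 10 plies:    {go_10:.2e}")      # ~9.54e23
```

At ten half-moves, Go’s tree is already hundreds of millions of times larger than chess’s, which is why exhaustive search alone cannot cope.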
There were already computers that could play amateur-level Go, but it is this very complexity that had many AI researchers predicting (as recently as last year) that it would take yet another decade to develop a machine that would beat a top-level human in the game.
The problem was that traditional “brute force” AI methods, which generate exhaustive “search tree” sequences of possible moves, work fine for chess computations but are woefully inadequate for the vastly larger space of possibilities presented by Go.
As the researchers explain in this post and in a paper published in Nature, the key was to reduce the vast depth of the “search space” presented by all these possible moves in Go by training two deep neural networks to work in tandem.
With this approach, scientists say, AlphaGo’s algorithms are much more human-like than previous models, even akin to a kind of artificial “imagination.” One neural network, designated the “policy network,” anticipates the handful of next moves most likely to help it win.
The other, designated the “value network,” evaluates who is winning in each resulting position; but instead of searching ahead to the end of the game, as traditional “brute force” AI methods do, it plays out only a modest number of moves ahead.
In leveraging these self-generated simulations, AlphaGo can anticipate better and devise more efficient strategies: the policy network proposes the most advantageous maneuvers to execute, while the value network judges which of those moves is most likely to help it win.
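One way to picture the division of labor between the two networks is the following minimal sketch. This is not DeepMind’s code: `policy_net`, `value_net` and `apply_move` are hypothetical stand-ins (the first two would be trained deep networks in AlphaGo), and the search is a bare-bones lookahead rather than the Monte Carlo tree search the real system uses.

```python
import random

def policy_net(position):
    """Stub: return a handful of candidate moves with prior probabilities."""
    moves = [f"move_{i}" for i in range(5)]
    return {m: 1.0 / len(moves) for m in moves}

def value_net(position):
    """Stub: estimate the win probability for the player to move,
    without playing the game out to the end."""
    return random.random()

def apply_move(position, move):
    """Stub: return the position reached by playing `move`."""
    return position + (move,)

def choose_move(position, depth=2):
    """Pick the candidate whose resulting position the value net likes best,
    searching only `depth` plies ahead instead of to the end of the game."""
    def evaluate(pos, d):
        if d == 0:
            return value_net(pos)
        # The opponent moves next, so our value is 1 minus their best value.
        return max(1.0 - evaluate(apply_move(pos, m), d - 1)
                   for m in policy_net(pos))
    return max(policy_net(position),
               key=lambda m: 1.0 - evaluate(apply_move(position, m), depth - 1))

best = choose_move(("start",))
```

The design point the sketch tries to capture: the policy network keeps the tree narrow (only a few candidates per position), while the value network keeps it shallow (no need to play to the end), which together make the otherwise astronomical search tractable.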
Distributed Computing for a Stronger AlphaGo
The inherent intricacies of Go required a different method of training the neural networks—and a lot of data.
To do this, Google DeepMind researchers trained the policy network with a database of 30 million moves from games played by human Go masters. AlphaGo then built on this foundation by using its neural networks to play games against itself, developing new strategies through an iterative, trial-and-error method of machine learning called reinforcement learning.
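The self-play loop can be illustrated with a toy (and decidedly non-AlphaGo) example: two copies of a shared policy play a trivial number-picking game, and the moves chosen by each game’s winner are reinforced, so winning behavior gradually dominates.

```python
import random

random.seed(0)

# Toy "game": each side picks 3 numbers; the higher total wins.
MOVES = [1, 2, 3, 4, 5]

def play_game(weights):
    """Both players sample moves from the shared policy;
    return (winner, picks_by_player)."""
    picks = {}
    for player in (0, 1):
        picks[player] = random.choices(
            MOVES, weights=[weights[m] for m in MOVES], k=3)
    winner = 0 if sum(picks[0]) >= sum(picks[1]) else 1
    return winner, picks

def train(episodes=2000, lr=0.1):
    """Reinforcement learning via self-play: nudge the weights of the
    winner's moves upward after every game."""
    weights = {m: 1.0 for m in MOVES}
    for _ in range(episodes):
        winner, picks = play_game(weights)
        for m in picks[winner]:
            weights[m] += lr
    return weights

w = train()
# High moves win more often, get reinforced more, and so are picked more:
# a feedback loop that is the essence of learning from self-play.
```

AlphaGo’s version of this idea is of course far more sophisticated (gradient updates to deep networks rather than a weight table), but the feedback loop is the same: play, observe the outcome, reinforce what worked.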
The result was a self-training system that would not only beat the best artificial Go-playing programs, but also the best human players.
What’s interesting here is the role that distributed computing plays in making an even more intelligent AlphaGo, capable of destroying the competition even with a 4-move handicap. “Of course, all of this requires a huge amount of compute power,” say the researchers. “So we made extensive use of Google Cloud Platform, which enables researchers working on AI and Machine Learning to access elastic compute, storage and networking capacity on demand. In addition, new open source libraries for numerical computation using data flow graphs, such as TensorFlow, allow researchers to efficiently deploy the computation needed for deep learning algorithms across multiple CPUs or GPUs.”
But perhaps what is most significant is that the team didn’t use specialized rules to create an “expert system”; rather, general machine learning techniques were used to create a robust system that is able to learn on its own, almost like a human, but within structured conditions.
Games are just the beginning; the researchers envision that similar approaches could be applied to serious real-world problems, from robotics and climate change mitigation to analyzing complex diseases, developing personalized medical treatments, and building smarter digital personal assistants that will make our daily lives more productive.
Of course, some may see this breakthrough as a possible future disaster. Some observers, like Tesla’s Elon Musk and physicist Stephen Hawking, are warning about the possible dangers of super-intelligent, autonomous machines running amok and wreaking destruction upon humanity. But Hassabis remains decidedly less alarmist, saying, “We’re still talking about a game here.”
Yet Google DeepMind researcher David Silver is more candid: “Humans have weaknesses: they get tired, they make mistakes, they are not able to make the precise, tree-based computations that a computer can perform. And perhaps most importantly, humans are limited in the number of Go games they can play and process in a lifetime, while AlphaGo can play millions of games every single day. It’s at least conceivable that, as a result, AlphaGo—given enough processing, training and search power—could reach a level that’s beyond any human.”
The New Stack is a wholly owned subsidiary of Insight Partners.