It’s becoming soberingly clear that machines have the advantage (or are at least catching up rapidly) when it comes to purely cognitive tasks such as beating humans at chess or Go, or learning how to bluff at poker. While a well-rounded, human-like artificial general intelligence that can accomplish a diverse range of tasks is still some way off, there are nevertheless other horizons for intelligent machines to conquer, such as mastering Jenga.
To that end, researchers from MIT have now developed a robot that can play Jenga using a human-like, hierarchical model of learning. While it might not seem all that complicated to us, Jenga is actually pretty hard for machines to figure out. Named for the Swahili word kujenga, meaning “to build,” Jenga is a game that requires players to process a lot of tactile information, as they attempt to carefully rearrange pieces by progressively extracting and stacking them on top, without toppling an increasingly unstable structure. It’s a fun challenge that most people can learn to play reasonably well within a few games, but watch how the team’s robotic arm performs:
MIT’s robot makes it look easy, but there’s actually a lot going on under the hood here. For most robots, learning to play Jenga well is difficult as they aren’t particularly proficient in what is known as tactile reasoning — the ability to perform a task using clues gleaned from physically touching and interacting with objects.
“Unlike in more purely cognitive tasks or games such as chess or Go, playing the game of Jenga also requires mastery of physical skills such as probing, pushing, pulling, placing, and aligning pieces,” explained MIT engineering professor Alberto Rodriguez, who is also a co-author of the study published in Science Robotics. “It requires interactive perception and manipulation, where you have to go and touch the tower to learn how and when to move blocks. This is very difficult to simulate, so the robot has to learn in the real world, by interacting with the real Jenga tower. The key challenge is to learn from a relatively small number of experiments by exploiting common sense about objects and physics.”
Combining Visual and Tactile Reasoning
To build a robot with a more developed sense of tactile reasoning, the team created an AI model that emulates how humans might approach such a feat — by first learning the game through a short initial period of trial-and-error, and then using tactile and visual data from its previous attempts to infer how future actions might influence the behavior of blocks.
More specifically, the team paired an ABB IRB 120 robotic arm with a temporal hierarchical Bayesian model. Each time the robotic limb attempts to push and relocate a block, the system records the resulting visual and tactile measurements, along with whether the attempt succeeded. The system then adjusts its behavior based on its current actions and the inferences it makes. In contrast to conventional AI models that might need to be trained through tens of thousands of block-manipulation attempts, the team’s model needed only about 300 attempts, thanks to its ability to “cluster” data on certain block behaviors, as well as the related physical information.
According to the team, this data-clustering technique increases the robot’s efficiency in learning the game by allowing it to group possible outcomes based on previous experience, therefore enabling it to predict whether it will be able to successfully move a particular block, using current visual and tactile data.
“The robot builds clusters and then learns models for each of these clusters, instead of learning a model that captures absolutely everything that could happen,” noted MIT graduate student and lead author Nima Fazeli.
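To make the cluster-then-model idea concrete, here is a minimal Python sketch. It is not the team’s method: the article gives no implementation details, so this swaps the hierarchical Bayesian model for plain k-means clustering followed by a simple Beta-Bernoulli success estimate per cluster. The feature names (force, displacement), the example numbers, and every function name are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    force: float         # hypothetical: resistance felt while pushing (arbitrary units)
    displacement: float  # hypothetical: how far the block moved (arbitrary units)
    success: bool        # whether the block was moved without toppling the tower


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: dist2(point, centers[i]))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def fit(attempts, k=2):
    """Cluster the attempts, then learn one small success model per cluster."""
    points = [(a.force, a.displacement) for a in attempts]
    centers = kmeans(points, k)
    # Beta(1, 1) prior per cluster, stored as [successes + 1, failures + 1].
    counts = [[1, 1] for _ in range(k)]
    for a, p in zip(attempts, points):
        counts[nearest(p, centers)][0 if a.success else 1] += 1
    return centers, counts


def predict_success(force, displacement, centers, counts):
    """Posterior mean success probability of the cluster a new reading falls into."""
    s, f = counts[nearest((force, displacement), centers)]
    return s / (s + f)


# Made-up data: loose blocks (low force, large movement) and stuck blocks.
loose = [Attempt(0.10 + 0.01 * i, 0.90 - 0.01 * i, True) for i in range(10)]
stuck = [Attempt(0.90 - 0.01 * i, 0.10 + 0.01 * i, i == 0) for i in range(10)]
centers, counts = fit(loose + stuck)

p_loose = predict_success(0.15, 0.85, centers, counts)
p_stuck = predict_success(0.80, 0.20, centers, counts)
print(f"loose-looking block: {p_loose:.2f}, stuck-looking block: {p_stuck:.2f}")
```

Because each cluster only has to summarize one kind of block behavior, a handful of attempts per cluster is enough to produce a usable estimate, which is the intuition behind learning from a few hundred pushes rather than tens of thousands.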
When testing their model against other machine learning algorithms in simulations, the team discovered that other models would require “orders of magnitude more towers” to successfully learn the game. The team even pitted their system against human volunteers and found that the robotic arm fared almost as well as a human might, though there’s still some room for improvement before their robot can compete strategically against human Jenga champions.
Nevertheless, a robot that can not only see but also “feel” the results of its actions will be indispensable in applications beyond mere games. For instance, an army of such dextrous bots would be well-suited to assembling tiny electronic parts or assisting in surgical procedures, although such implementations will no doubt hasten the robot takeover that’s already underway.
Read the paper over at Science Robotics.