Open Source AI ‘Gym’ Helps Robots Evolve

While the idea of biological organisms evolving is familiar to us, the idea of machines “evolving” may be less so. Roboticists are constantly looking for new ways to optimize their robot designs. More often than not, however, this happens in a fragmentary way, with researchers improving either a robot’s mechanical body or its “brain,” also known as the controller. It is rarely simple to optimize both at the same time, or in an automated way.
But that gap may soon be filled, thanks to a team of researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), who are now proposing a simple, open source platform that uses AI algorithms to optimize both robot brain and body. This method of simultaneous optimization is also known as “co-design.” While some robot co-design software already exists, these tools generally require a lot of time and computational resources.
In contrast, the MIT team’s platform, Evolution Gym, aims to offer other researchers a simple method of co-optimizing the body structure and controller of robots, as well as a standardized way to test them.
“While optimal control is well studied in the machine learning and robotics community, less attention is placed on finding the optimal robot design,” wrote the team in their paper, which was recently presented at the Conference on Neural Information Processing Systems. “This is mainly because co-optimizing design and control in robotics is characterized as a challenging problem, and more importantly, a comprehensive evaluation benchmark for co-optimization does not exist. Evolution Gym [is] the first large-scale benchmark for co-optimizing the design and control of soft robots.”
Generations of Machine Evolution
The Evolution Gym system is essentially a simple and fast simulator that generates different types of soft robots using a library of “voxels,” building blocks that can be soft, rigid, or actuators that act horizontally or vertically. Within the two-dimensional interface of Evolution Gym, each robot appears as an assemblage of colored squares, with each color representing a different type of voxel. These components can be arranged in different layouts, and various algorithms are used to automatically determine the best design for the task at hand.
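To make the voxel idea concrete, here is a minimal sketch of how such a body could be stored as a small grid of typed cells. It does not use the Evolution Gym library itself: the `Voxel` names, their integer codes, and the `render` and `count_actuators` helpers are assumptions made up for this illustration.

```python
from enum import IntEnum


class Voxel(IntEnum):
    # Hypothetical integer codes, chosen for this sketch only.
    EMPTY = 0
    RIGID = 1
    SOFT = 2
    H_ACTUATOR = 3  # expands and contracts horizontally
    V_ACTUATOR = 4  # expands and contracts vertically


# One text symbol per voxel type, standing in for the colored squares.
SYMBOLS = {
    Voxel.EMPTY: ".",
    Voxel.RIGID: "#",
    Voxel.SOFT: "o",
    Voxel.H_ACTUATOR: "-",
    Voxel.V_ACTUATOR: "|",
}

# A 3x3 body: a rigid top row sitting on two soft, actuated "legs".
robot = [
    [Voxel.RIGID, Voxel.RIGID, Voxel.RIGID],
    [Voxel.SOFT, Voxel.EMPTY, Voxel.SOFT],
    [Voxel.V_ACTUATOR, Voxel.EMPTY, Voxel.V_ACTUATOR],
]


def render(body):
    """Return a text picture of the body, one character per voxel."""
    return "\n".join("".join(SYMBOLS[v] for v in row) for row in body)


def count_actuators(body):
    """Count the voxels that a controller could drive."""
    actuators = (Voxel.H_ACTUATOR, Voxel.V_ACTUATOR)
    return sum(v in actuators for row in body for v in row)


print(render(robot))
print("actuators:", count_actuators(robot))
```

In a representation like this, changing the design means changing entries in the grid, which is what makes the body easy for an algorithm to search over.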
These simulated soft robots are then put to the test in 30 benchmark environments that measure how they perform across various terrains and in locomotion and manipulation tasks. Depending on the type of terrain or motion required, tasks are rated as “easy,” “medium” or “hard.” For instance, walking across a flat surface is ranked as “easy,” while sliding over that same flat surface and under a beam is considered “hard.”
Rather than designing and tweaking individual robots, the team’s approach mimics biological evolution: AI algorithms generate successive populations of robots, each with slightly different designs, which are then tested in these benchmark environments. In particular, the team’s method uses two interdependent levels of optimization: an “outer loop,” in which a design optimization method evolves the physical structures of the robots, and an “inner loop” that optimizes the robot’s controller for that particular structure.
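The two nested loops can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team’s code: `rollout_reward` is a made-up formula standing in for a simulated episode, designs and controllers are plain lists of numbers, and the inner loop uses simple random-search hill climbing rather than the reinforcement learning method the researchers used.

```python
import random


def rollout_reward(design, controller):
    """Toy stand-in for one simulated episode (hypothetical formula).

    The reward depends on both the body and the controller: for each
    component it peaks when the controller value is half the design value.
    """
    return sum(d * c - c * c for d, c in zip(design, controller))


def optimize_controller(design, steps, rng):
    """Inner loop: tune a controller for one fixed design by hill climbing."""
    best = [0.0] * len(design)
    best_reward = rollout_reward(design, best)
    for _ in range(steps):
        candidate = [c + rng.gauss(0, 0.1) for c in best]
        reward = rollout_reward(design, candidate)
        if reward > best_reward:  # keep only improvements
            best, best_reward = candidate, reward
    return best, best_reward


def co_design(candidate_designs, inner_steps=200, seed=0):
    """Outer loop: score each design by the reward of its own tuned controller."""
    rng = random.Random(seed)
    scored = []
    for design in candidate_designs:
        controller, reward = optimize_controller(design, inner_steps, rng)
        scored.append((reward, design, controller))
    return max(scored, key=lambda item: item[0])


reward, design, controller = co_design([[0.1, 0.1], [1.0, 1.0]])
print("best design:", design, "reward:", round(reward, 3))
```

The point of the structure is that a design is never judged on its own: its score is whatever its best available controller can achieve, so every outer-loop evaluation costs a full inner-loop training run.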
“We [developed] several robot co-evolution algorithms by combining state-of-the-art design optimization methods and deep reinforcement learning techniques,” explained the team. “Evaluating the algorithms on our benchmark platform, we [observed] robots exhibiting increasingly complex behaviors as evolution progresses, with the best-evolved designs solving many of our proposed tasks.”
For the design optimization loop, the team used a variety of methods, including a genetic algorithm (GA), Bayesian optimization (BO), and a Compositional Pattern Producing Network (CPPN). For the control optimization loop used to train the robot’s controller, the team applied a type of reinforcement learning (RL) algorithm known as Proximal Policy Optimization (PPO).
With the design and control optimizers working in tandem, a kind of evolutionary process occurs. The design optimizer proposes a structure for a new robot and passes it to the control optimizer, which then produces a controller for that structure after some interactions with Evolution Gym, tailored to maximize the reward the robot achieves in whatever benchmark task it is subjected to. Design aspects of robots that successfully perform tasks and maximize rewards are kept, reiterated and improved upon in subsequent generations, so that the population automatically evolves to retain its greatest advantages.
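The “keep what works, vary it, try again” step can be illustrated with a small genetic-algorithm sketch. Everything here is an assumption made for illustration: designs are flattened into lists of hypothetical voxel codes, `toy_fitness` simply counts actuator voxels in place of the reward from a trained controller, and the survivor count and mutation rate are arbitrary. A real system would also need to reject invalid bodies, such as disconnected ones, which this sketch does not check.

```python
import random

# Hypothetical voxel codes: empty, rigid, soft, and two actuator kinds.
VOXEL_TYPES = [0, 1, 2, 3, 4]


def mutate(design, rate, rng):
    """Copy a design, re-rolling each voxel with probability `rate`."""
    return [rng.choice(VOXEL_TYPES) if rng.random() < rate else v for v in design]


def next_generation(population, fitness, survivors, rate, rng):
    """Keep the top designs unchanged and refill with mutated copies of them."""
    ranked = sorted(population, key=fitness, reverse=True)
    kept = ranked[:survivors]
    n_children = len(population) - survivors
    children = [mutate(rng.choice(kept), rate, rng) for _ in range(n_children)]
    return kept + children


def evolve(fitness, pop_size=12, length=9, generations=30, seed=0):
    """Run several generations from a random population; return the best design."""
    rng = random.Random(seed)
    population = [
        [rng.choice(VOXEL_TYPES) for _ in range(length)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        population = next_generation(population, fitness, 4, 0.2, rng)
    return max(population, key=fitness)


def toy_fitness(design):
    """Placeholder for the trained controller's reward: counts actuator voxels."""
    return sum(v in (3, 4) for v in design)


best = evolve(toy_fitness)
print(best, toy_fitness(best))
```

Because the survivors are carried over unchanged, the best score in the population can never drop from one generation to the next, which is the sense in which successful design traits are retained.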
The algorithmically designed robots generally performed much better than their hand-designed counterparts, with the system coming up with complex designs that a human designer would be unlikely to conjure up, along with some designs that were strikingly animal-like, despite the system having no information about the animal world. However, some tasks were too difficult for both the human- and machine-generated robots to accomplish.
“The experiment results demonstrate that intelligent robot designs can be evolved fully autonomously while outperforming hand-designed robots in easier tasks, which reaffirms the necessity of jointly optimizing for both robot structure and control,” said the team. “However, none of the baseline algorithms are capable enough to successfully find robots that complete the task in our hardest environments. Such insufficiency of the existing algorithms suggests the demand for more advanced robot co-design techniques, and we believe our proposed Evolution Gym provides a comprehensive evaluation testbed for robot co-design and unlocks future research in this direction.”
Ultimately, the researchers hope that making this simple but versatile platform open source and accessible to everyone will help this nascent field of study develop further. With a comprehensive platform where robot designs are generated and tested against standard benchmarks, developing optimized robots could become an easier task for a wider segment of the discipline, rather than an insurmountable challenge that can only be solved by those with the most resources.
“In this way, Evolution Gym provides an easy-to-use platform for co-design algorithms to evolve both robot structure and control to optimize for robots’ task performances,” noted the team. “Evolution Gym is designed to be the first comprehensive testbed for benchmarking and comparing different co-design algorithms with the hope to facilitate the development of more novel and powerful algorithms in the co-design field.”
Read more in the paper, and download the code on GitHub.
Images: MIT CSAIL