Solo.io’s Autopilot: A Framework for Automating Service Mesh Operations

Portworx sponsored The New Stack’s coverage of KubeCon + CloudNativeCon North America 2019.
A new project from service mesh provider Solo.io demonstrates the possibilities of automating service mesh operations, using the telemetry created by the service mesh itself.
The project, Autopilot, is a framework for creating an automated series of operations that can be triggered by conditions set by the administrator. Not only could it be used to maintain an actively running service mesh, but it could also automate tasks such as canary deployments, adaptive security, and chaos testing through a set of declarative steps requiring no user intervention.
While service meshes provide invaluable help in routing messages across different microservices, they are still configured and managed largely by hand, noted Idit Levine, founder and CEO of Solo.io, and co-author of Autopilot, in an interview with The New Stack.
An “adaptive mesh,” as Levine calls it, could “change its configuration” based on external requirements. Like an airplane’s autopilot, this Autopilot watches the current state of the service mesh and can adjust settings to maintain the desired state, and even run automated operations. The workflow rules that guide the operation are defined in a Custom Resource Definition (CRD), which configures the controller that watches the current state of the service mesh.
The company debuted the project, to a largely favorable response, at ServiceMeshCon, a one-day Cloud Native Computing Foundation event co-located with KubeCon + CloudNativeCon North America 2019, being held this week in San Diego.
Still Manual
Service meshes quickly became a necessity after organizations started adopting the Kubernetes container orchestration engine, and microservice architectures in general. Applications were broken down into smaller containerized services, which all had to communicate with one another, and the outside world, both securely and safely. The service mesh abstracted all of these components into sidecars, which could be attached to the application or microservices and could communicate with all the other sidecars on a network, thereby setting up a routing network across all the microservices. And their use generated a lot of useful operational data around traffic patterns and system loads.
Building on the concept of programmable infrastructure, Autopilot takes the next step and harnesses the service mesh’s telemetry so it can actually drive operations, without the need for admin input, beyond the setting of initial conditions. Webhooks could be used to bring in additional data.
The technology uses the Kubernetes Operator, a templating pattern originally created to streamline the deployment of complex applications, to automate actions. Using scaffolding, Autopilot builds and deploys Operators that run against a Kubernetes cluster on which a service mesh is installed. It generates the primitives, code, and helper functions needed to control the service mesh. A YAML-based file is created for every desired state of the mesh, and that desired state is continually reconciled with the mesh’s current state.
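To make the idea of reconciling desired and current state concrete, here is a minimal sketch in Go. It is not Autopilot code: the `MeshState` type, the `Reconcile` function, and the traffic-weight model are all hypothetical, chosen only to illustrate how a controller might compute and apply the difference between the two states.

```go
package main

import (
	"fmt"
	"sort"
)

// MeshState is a hypothetical, simplified view of mesh configuration:
// the traffic weight (in percent) assigned to each service version.
type MeshState map[string]int

// Reconcile compares the desired state with the current state, applies
// the differences to current, and returns a description of each change.
// Calling it again once the states match returns no changes.
func Reconcile(desired, current MeshState) []string {
	var changes []string

	// Visit desired entries in sorted order so the output is deterministic.
	keys := make([]string, 0, len(desired))
	for k := range desired {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		want := desired[k]
		if have, ok := current[k]; !ok || have != want {
			current[k] = want
			changes = append(changes, fmt.Sprintf("set %s weight to %d", k, want))
		}
	}

	// Remove entries that exist in the mesh but are no longer desired.
	var stale []string
	for k := range current {
		if _, ok := desired[k]; !ok {
			stale = append(stale, k)
		}
	}
	sort.Strings(stale)
	for _, k := range stale {
		delete(current, k)
		changes = append(changes, "remove "+k)
	}
	return changes
}

func main() {
	desired := MeshState{"v1": 90, "v2": 10}
	current := MeshState{"v1": 100}

	for _, c := range Reconcile(desired, current) {
		fmt.Println(c)
	}
	// A second pass finds nothing left to do.
	fmt.Println("pending changes:", len(Reconcile(desired, current)))
}
```

In a real Operator the current state would be read from the cluster on every pass rather than held in memory, but the shape of the loop (observe, diff, apply) would be similar.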
According to the project’s GitHub page, “Autopilot provides a more opinionated control loop via a generated scheduler that implements the Controller-Runtime Reconciler interface, for which users write stateless Work functions for various states of their top-level [custom resource definition]. State information is stored on the status of the CRD, promoting a stateless design for Autopilot operators.”
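The pattern described in that passage, stateless work functions chosen by the phase recorded on the resource’s status, might look roughly like the following Go sketch. It uses only the standard library and does not call Autopilot or Controller-Runtime; the `CanaryDeployment` type, the phase names, the `Metrics` struct, and the `Schedule` function are all assumptions made for illustration.

```go
package main

import "fmt"

// Phase is the state recorded on the resource's status (hypothetical names).
type Phase string

const (
	Initializing Phase = "Initializing"
	Evaluating   Phase = "Evaluating"
	Promoted     Phase = "Promoted"   // terminal
	RolledBack   Phase = "RolledBack" // terminal
)

// Spec holds the conditions set by the administrator.
type Spec struct {
	MaxErrorRate    float64 // roll back if the canary exceeds this
	RequiredSamples int     // healthy observations needed to promote
}

// Status carries all state between passes, so work functions stay stateless.
type Status struct {
	Phase   Phase
	Healthy int
}

// CanaryDeployment stands in for a top-level custom resource.
type CanaryDeployment struct {
	Spec   Spec
	Status Status
}

// Metrics stands in for telemetry reported by the mesh.
type Metrics struct{ ErrorRate float64 }

// WorkFunc computes the next status from its inputs alone.
type WorkFunc func(spec Spec, status Status, m Metrics) Status

func initialize(_ Spec, _ Status, _ Metrics) Status {
	return Status{Phase: Evaluating}
}

func evaluate(spec Spec, status Status, m Metrics) Status {
	if m.ErrorRate > spec.MaxErrorRate {
		return Status{Phase: RolledBack, Healthy: status.Healthy}
	}
	status.Healthy++
	if status.Healthy >= spec.RequiredSamples {
		status.Phase = Promoted
	}
	return status
}

// workers maps each non-terminal phase to its work function.
var workers = map[Phase]WorkFunc{
	Initializing: initialize,
	Evaluating:   evaluate,
}

// Schedule runs one pass: pick the work function for the current phase
// and store its result back on the status. Terminal phases are no-ops.
func Schedule(cr *CanaryDeployment, m Metrics) {
	if cr.Status.Phase == "" {
		cr.Status.Phase = Initializing
	}
	if work, ok := workers[cr.Status.Phase]; ok {
		cr.Status = work(cr.Spec, cr.Status, m)
	}
}

func main() {
	cr := &CanaryDeployment{Spec: Spec{MaxErrorRate: 0.05, RequiredSamples: 2}}
	for _, rate := range []float64{0.01, 0.02, 0.01} {
		Schedule(cr, Metrics{ErrorRate: rate})
		fmt.Println("phase:", cr.Status.Phase)
	}
}
```

Because each work function depends only on the spec, the status, and the observed metrics, the controller process can restart at any point and resume from whatever phase is stored on the resource.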
Initially, Autopilot works with the Istio service mesh, though over time it will work with all service meshes that support the Service Mesh Interface (SMI) standard. Autopilot currently requires Kubernetes to run, though this may not be a dependency in the future, Levine said. The software is written in Go.
The Cloud Native Computing Foundation and KubeCon+CloudNativeCon are sponsors of The New Stack.