In KubeCon this December, we are going to dive in resiliency and fault tolerance features of the CNCF’s Istio service mesh architecture. Before going there, let’s discuss the importance of the service mesh in a cloud-native environment.
Microservices and containers have changed application design and deployment patterns. They’ve also brought with them some new challenges, such as service discovery, routing, failure handling, and visibility. And while PaaS platforms like Cloud Foundry are great for deploying microservices, they were created with a view of simplifying application deployment across multiple runtimes. Similarly, CNCF’s Kubernetes container orchestration software can handle multiple container-based workloads, including microservices, but when it comes to more sophisticated features like traffic management, failure handling, and resiliency, both the platforms leave a lot to be desired.
Imagine an application that is broken down into multiple microservices; each microservice has multiple instances, and each deployed instance has multiple versions. Typically, even a simple application deployment with this kind of model can span hundreds of microservices. When an application deployment gets this large, distributed, and complex, the result is often failure. But you need to fail fast and recover quickly. You need a mechanism that is fault-tolerant, one that provides more visibility and control into the complex network of microservices and ensures reliable, secure, and timely communication between them.
For this deployment model, we need to keep track of the traffic flow between microservices, route traffic for microservices based on request content or traffic origination point, and handle failures in a graceful manner when a number of microservices are not reachable. We also need to enforce strong identity assertion between services and limit the entities that can access a service. Most importantly, we want to do all this without changing the application code. Service mesh architecture was created to handle these requirements. Think of a service mesh as a network of interconnected devices with routers and switches, except in this case the network exists at the application layer (layer 7 of the OSI stack), nodes are services, and routing, delivery, and other tasks are off-loaded to the service mesh. The goal is to get a request in a reliable, secure and timely manner across this mesh of microservices from origination to destination microservice.
Typically, this is achieved by using “proxies” to intercept all incoming and outgoing network traffic. Proxies in a service mesh architecture are implemented using the sidecar pattern: A sidecar is conceptually attached to the main (or parent) application and complements that parent by providing platform features. With this kind of model, your microservice can use the sidecar either as a set of processes inside the same microservice container or as a sidecar in its own container to leverage platform capabilities such as routing, load balancing, resiliency, in-depth monitoring, and access control.
Istio: A Service Mesh Architecture Implementation
Istio is a service mesh created through a collaboration between IBM, Google and Lyft. It uses the sidecar pattern, where sidecars are enabled by the Envoy proxy and are based on containers. By injecting Envoy proxy servers into the network path between services, Istio provides sophisticated traffic management controls, such as load-balancing and fine-grained routing. This routing mesh also enables you to extract a wealth of metrics about traffic behavior, which can be used to enforce policy decisions such as fine-grained access control and rate limits that operators can configure. Those same metrics are also sent to monitoring systems. Istio achieves this by deploying:
- A control plane that manages the overall network infrastructure and enforces the policy and traffic rules
- A data plane which includes sidecars implemented using Envoy, an open source edge proxy.
Apart from Envoy proxy, key components of Istio are:
- Istio Pilot (for traffic management): In addition to providing content and policy-based load balancing and routing, Pilot also maintains a canonical representation of services in the mesh.
- Istio Auth (for access control): Istio Auth controls access to the microservices based on traffic origination points and users, and also provides a key management system to manage keys and certificates.
- Istio Mixer (for monitoring, reporting, and quota management): Istio Mixer provides in-depth monitoring and logs data collection for microservices, as well as a collection of request traces. It uses Prometheus, Grafana, and Zipkin to provide some of these in-depth metrics.
Learn more about “Enable your Microservices with Advanced Resiliency and Fault Tolerance Leveraging Istio” from Animesh Singh and Tommy Li of IBM at KubeCon and CloudNativeCon North America, Dec. 6-8, 2017 in Austin, TX. Register here.