
Chart a Course for a Service Mesh Future: Lifting Off with Istio

A primer for how a service mesh works and why you'd need one in a microservices environment.
Sep 24th, 2019 10:00am by Karen Bruner
Karen Bruner is a Principal DevOps Engineer for StackRox, where she drives automation and advocates for operationalizing the product. Previously, Karen has held DevOps and site reliability engineering roles at Clari, Ooyala, LinkedIn, and Yahoo. She started her career working in Hollywood in the digital effects industry and has a film credit in “Babe” for Internet Bandit. She spends her spare time rendering puns in yarn, learning obscure fiber crafts, and tripping over cats.

The maturation of cloud native architectures and technologies, such as containers and Kubernetes, is driving the emergence and adoption of service mesh architectures. While cloud native environments promise a wealth of benefits for organizations that deploy them, complexity has emerged as a significant challenge for those responsible for architecting, developing, operating and securing such systems, including DevOps practitioners, infrastructure engineers, software developers, and network operators as well as CIOs, CTOs, and other organizational technology leaders.

The ability to consolidate on a consistent service and network management experience for applications across cloud native environments and how that aligns with and speeds DevOps practices together propel the development of service mesh frameworks. As cloud native adoption continues to accelerate, it’s critical for the engineering teams that own cloud native applications to familiarize themselves with service mesh capabilities now to determine if that technology will provide value to their organization in the future.

What Is a Service Mesh?

A service mesh allows you to connect, secure, control, and observe services running on an orchestration platform. The term “service mesh” refers either to the set of overlapping network connections between the services in a distributed application or to the set of tools used to manage that group of connected services. If you have two microservices that interact over the network, you have a service mesh. The simplest example is a mesh with just two services:

More likely, as the number of microservices in your environment grows, your service mesh will start to look something like the following:

As cloud environments expand into hybrid and multicloud deployments, developers use microservices to speed development and to ensure portability among the many containers and distributed cloud resources used by an organization — a vast and complex network of data and applications. As the complexity of a microservices ecosystem grows, so does the need to manage it effectively and intelligently, to get insights into how the microservices interact, and to secure communications between the microservices.

What Is Istio?

If you’ve heard about service meshes, you’ve almost certainly heard about Istio in conjunction. Istio is an open source service mesh that can be deployed alongside existing cloud native applications. It also has platform-like features that allow it to be integrated with logging platforms and with telemetry or policy systems. The policy integration lets Istio act as a security tool, creating a uniform method to secure, connect, and monitor microservices within a given environment. “The Istio service mesh” usually refers to the Istio toolset, whereas “an Istio service mesh” usually denotes a specific application cluster managed by an Istio installation. Istio’s many Custom Resource Definitions (CRDs) enable programmatic configuration, via the Kubernetes API, of the behavior of the application network layer, where the application is the set of interdependent microservices. Istio, more or less, is the Kleenex of service meshes in today’s cloud native stack: it’s the most feature-rich and standardized option.
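As a sketch of what that programmatic configuration looks like, the following VirtualService custom resource routes all HTTP traffic for a service to a single subset. The resource kind and API group are real Istio CRDs; the service name and subset are illustrative:

```yaml
# Hypothetical VirtualService: send all HTTP traffic for the
# "reviews" service to its "v1" subset. Names are illustrative;
# the subset would be defined in a companion DestinationRule.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
```

Applying a resource like this with `kubectl apply` is all it takes to reprogram the mesh’s routing behavior, with no changes to the application itself.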

Do I Need a Service Mesh?

Although service mesh adoption is likely to continue to spread quickly, especially as the feature sets and manageability of tools like Istio improve, not every cloud native environment needs one. So how do you know if a service mesh is right for your organization and environment? You should consider deploying a service mesh if you need a solution to one or more of the requirements or problems outlined below:

  • You have performance issues in a distributed microservice-based application
  • You need to gather and deliver consistent request and connection metrics for all microservices
  • You want to make over-the-wire encryption the default without having to manage TLS certificates directly
  • You require service-to-service control at a finer-grained resolution than vanilla Kubernetes can provide with Network Policies
  • You want to enable release automation with canary rollouts and application API multi-version support
  • You desire to add user authentication and authorization without modifying the application
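The canary-rollout requirement above, for instance, is typically met with weighted routing. This hypothetical VirtualService (service name, subsets, and weights are illustrative) sends 90% of requests to the stable version and 10% to a canary:

```yaml
# Hypothetical canary rollout: shift 10% of traffic to v2.
# Subsets v1 and v2 would be defined in a DestinationRule
# keyed on pod labels (e.g., version: v1 / version: v2).
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp
  http:
  - route:
    - destination:
        host: myapp
        subset: v1
      weight: 90
    - destination:
        host: myapp
        subset: v2
      weight: 10
```

Promoting the canary is then just a matter of adjusting the weights, which lends itself to release automation.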

On the other hand, using a service mesh comes with some trade-offs if it’s not needed in your stack. Deploying a service mesh (Istio included) will require a significant level of migration effort and operational overhead, given how complex these environments can be. If you don’t expect the number of microservices deployed to grow, if other solutions meet your internal HTTP request routing needs, or if you already have manageable, effective solutions to the key requirements listed above, then a service mesh probably isn’t the best solution for your environment at this time.

But as service mesh adoption continues to surge, the ecosystem of features developed to support it will inevitably continue to expand. This increase will improve manageability and functionality so that in the future, as organizational maturity demands it, DevOps teams will have easier access to a more robust set of service mesh tools without the anxiety associated with deploying a fresh layer of infrastructure into the cloud native stack.

How Istio Works

Istio components can be broken down into two groups — the control plane and the data plane. The control plane refers to the services that manage configurations and monitor the data plane. The data plane consists of intelligent proxies deployed as sidecars in application pods, the smallest deployable object in the Kubernetes object model. These Istio proxies facilitate controlling and monitoring the network connections between microservices. Routing and policy rules are received from the control plane. The data plane then reports back connection handling telemetry.
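A brief sketch of how those sidecars get into application pods: with automatic injection enabled, labeling a namespace `istio-injection=enabled` tells Istio’s admission webhook to add the istio-proxy container to every new pod in that namespace. The namespace name here is illustrative:

```yaml
# Hypothetical namespace manifest. The istio-injection label
# triggers automatic istio-proxy sidecar injection for all
# pods created in this namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    istio-injection: enabled
```

Existing pods are not modified retroactively; they pick up the sidecar the next time they are recreated.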

Istio service meshes are configured through the creation of Kubernetes resources. The Istio project provides many Kubernetes Custom Resource Definitions, each mapping to an element of Istio’s functionality. We discuss the control and data planes in more detail below, but first we want to raise a few points about Istio’s potential, as well as its potential pitfalls.

The Potential and Pitfalls

Istio offers a range of features for handling and controlling network connections through its mesh of dynamically configurable proxies. But this functionality comes with a steep learning curve and a heavy configuration load. Several common issues also arise when migrating existing applications into an Istio architecture, even if those applications are already Kubernetes-native microservices.

Ironically, Istio lacks visibility into how it translates user-supplied configurations into Envoy routes. Envoy is the high-performance proxy that serves as the intermediary for inbound and outbound traffic for the services in a service mesh; it was created by developers at the ride-share company Lyft to help transition from a monolithic architecture to microservices. Other adoption issues can include the learning curve required to understand requirements for deployment and service resource configuration, addressing Kubernetes readiness and liveness probes that break when mTLS is turned on, and working with headless services (Kubernetes services with no ClusterIP) or services that otherwise bypass the normal Kubernetes service discovery flow.

The upside is that Istio is rapidly evolving with frequent releases and engaged working groups that actively solicit user feedback. Many limitations come from the Envoy proxy, which is also being actively developed and improved as Istio continues to drive its usefulness.

Configuring the Control Plane

A typical Istio deployment in a Kubernetes cluster should have the following services:

  • The Pilot service, which aggregates the traffic management specifications configured in Istio networking custom resources and delivers them to the istio-proxy sidecars.
  • The Mixer service, which handles telemetry for request metrics generated by proxy sidecars to send them to configured backends and acts as an authorization policy enforcer. If policy checks are turned on (Istio 1.1 turns them off by default), the proxy sidecars will connect to the Mixer to confirm that the connection is allowed. This approach, unfortunately, comes at the slight cost of additional network latency.
  • The Citadel service, Istio’s Public Key Infrastructure (PKI) service, which issues, rotates, and revokes the client TLS certificates generated for each service in a mesh and used for peer-to-peer authentication.
  • The Galley service, which is the Kubernetes controller for most of the Istio Custom Resource Definitions; it watches for user changes to those custom resources and distributes the contents to the other Istio services.
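To see Citadel’s certificates put to use, here is roughly what enabling mutual TLS mesh-wide looks like in the Istio 1.1-era authentication API. This is a sketch, not a drop-in config; requiring mTLS everywhere can break plaintext clients and health probes if rolled out carelessly:

```yaml
# Istio 1.1-era mesh-wide authentication policy: require mutual
# TLS, using the client certificates Citadel issues, for all
# service-to-service connections. A MeshPolicy must be named
# "default".
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
  - mtls: {}
```

This is what makes over-the-wire encryption the default without the application teams ever touching a TLS certificate directly.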

The Data Plane

The data plane is powered by the Envoy service proxy, built with extensions for Istio. The proxy intercepts incoming traffic to the pod’s service ports and, by default, all outgoing TCP traffic from the pod’s other containers. In most cases, the proxy sidecar can run in a pod without requiring any changes to the application code and with only minor changes to the application’s Kubernetes Deployment and Service resource specifications. The configuration of the proxy sidecars is managed dynamically by the services in the Istio control plane.
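Because injection is automatic per namespace, the occasional workload that cannot tolerate the proxy can opt out with a pod-template annotation. The Deployment below is hypothetical (a stand-in legacy batch job); the `sidecar.istio.io/inject` annotation is real:

```yaml
# Hypothetical Deployment that opts its pods out of automatic
# istio-proxy sidecar injection.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-batch
spec:
  replicas: 1
  selector:
    matchLabels:
      app: legacy-batch
  template:
    metadata:
      labels:
        app: legacy-batch
      annotations:
        sidecar.istio.io/inject: "false"  # skip the istio-proxy sidecar
    spec:
      containers:
      - name: worker
        image: example/worker:1.0  # illustrative image
```

Note that pods opted out this way lose mTLS and the mesh’s telemetry, so the escape hatch is best reserved for workloads that genuinely conflict with the proxy.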

Ultimately, a time may come when you need to deploy a service mesh to keep your cloud native environment fully functional and amply secured. Familiarizing yourself with the fundamentals today will prepare you to recognize when that time arrives. With visibility into Istio’s design and functionality, and into how it reduces the inherent complexity of containerized microservices and cloud native environments, engineers planning to scale on Kubernetes and other container platforms can take comfort in knowing that a highly functional and rapidly improving solution exists and is actively evolving to enhance scalability, security, and ease of management.

As organizations continue on their journey towards the adoption of cloud native and distributed architectures, Istio’s service mesh capabilities, along with lower-level infrastructure network controls and general Kubernetes security best practices, will unburden DevOps organizations from the pressures associated with scaling and managing application infrastructure.

Feature image via Pixabay.
