Microsoft Debuts Open Service Mesh into a Crowded, Contentious Ecosystem

6 Aug 2020 2:26pm

Although Istio has been in the headlines, Microsoft has released its new Open Service Mesh project into an increasingly crowded and often confusing ecosystem.

There’s already Linkerd (the project that coined the term), plus (in no particular order) Kuma, Maesh, Mesher, SOFAMesh, Cilium, Consul Connect, AWS App Mesh, Citrix Service Mesh, F5’s Aspen Mesh and VMware’s Tanzu (three commercial Istio distributions with enterprise management), as well as load balancers like NGINX that support service mesh functionality. There are even projects like Network Service Mesh that use the term even though they are not service meshes. Tools like Meshery, Service Mesh Hub and Weave Flagger let you manage or deploy across multiple service meshes, in case one project you adopt has an affinity for a different service mesh than the one you’ve chosen for other applications. So why yet another service mesh?

Microsoft has been collaborating for some time on APIs for common service mesh functionality with Linkerd, HashiCorp, Red Hat, Rancher, Docker, Aspen Mesh, VMware and others through the Service Mesh Interface (SMI) specification, which is hosted by the Cloud Native Computing Foundation. Like Linkerd but unlike Istio, Open Service Mesh can be configured through SMI, and it may even serve as a reference implementation.

Open Service Mesh activity on GitHub started at the end of 2019. That means it has been in progress for too long to be an immediate reaction to Google’s decision to set up a new organization to shepherd the Istio trademark rather than donating the project to the CNCF, as IBM maintains Google had committed to do.

Instead, it seems more likely to be a response to Microsoft customers (and others) who want to standardize on Envoy. Envoy is a proxy rather than a service mesh, so OSM follows the common pattern of injecting an Envoy proxy as a sidecar container next to each instance of an application, where it handles access control rules and routing and collects metrics for observability.
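To make the sidecar pattern concrete, here is a minimal Python sketch of what "injecting" a proxy amounts to: a pod definition goes in, and the same pod comes out with one extra container. This is not OSM's injector. The function name, the image tag and the port number are illustrative assumptions, and a real mesh does this through a Kubernetes admission webhook and adds traffic-redirection setup that is omitted here.

```python
import copy

# Hypothetical sidecar definition; the name, image tag and port are illustrative only.
ENVOY_SIDECAR = {
    "name": "envoy",
    "image": "envoyproxy/envoy:v1.15.0",
    "ports": [{"containerPort": 15001}],
}


def inject_sidecar(pod):
    """Return a copy of a pod manifest with the proxy container added exactly once."""
    patched = copy.deepcopy(pod)  # leave the caller's manifest untouched
    containers = patched.setdefault("spec", {}).setdefault("containers", [])
    # Skip the append if a container with the sidecar's name is already present.
    if not any(c.get("name") == ENVOY_SIDECAR["name"] for c in containers):
        containers.append(copy.deepcopy(ENVOY_SIDECAR))
    return patched


# A made-up application pod with a single container.
pod = {
    "metadata": {"name": "bookstore"},
    "spec": {"containers": [{"name": "app", "image": "example/app:1.0"}]},
}
patched = inject_sidecar(pod)
print([c["name"] for c in patched["spec"]["containers"]])
```

Because the application container is left as it was, the application itself needs no code changes; the proxy sits beside it and the mesh's control plane decides how that proxy is configured.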

Using the SMI specification offers the promise of moving more easily to another SMI-configured service mesh (something that will likely be useful inside Microsoft, where several different service mesh projects are in use). However, the OSM configuration is split: SMI covers the more straightforward options, while the Envoy xDS APIs surfaced inside OSM handle more advanced integrations. The latter could be harder to move if you pick a service mesh like Linkerd, which doesn’t use Envoy.
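As an illustration of the kind of straightforward option SMI covers, the sketch below builds an SMI TrafficSplit resource as a plain Python dictionary and prints it as JSON. The helper name and the service names are made up for the example, and the `split.smi-spec.io/v1alpha2` version string (with integer weights) is an assumption about the SMI spec revision in use; it says nothing about which versions OSM or any other mesh accepts.

```python
import json


def traffic_split(name, root_service, weights):
    """Build an SMI TrafficSplit manifest as a plain dict.

    `weights` maps backend service names to integer weights.
    """
    return {
        "apiVersion": "split.smi-spec.io/v1alpha2",  # assumed spec version
        "kind": "TrafficSplit",
        "metadata": {"name": name},
        "spec": {
            "service": root_service,
            "backends": [
                {"service": svc, "weight": weight}
                for svc, weight in weights.items()
            ],
        },
    }


# Hypothetical services: send most traffic to v1 and a small share to v2.
manifest = traffic_split(
    "bookstore-split",
    "bookstore",
    {"bookstore-v1": 90, "bookstore-v2": 10},
)
print(json.dumps(manifest, indent=2))
```

Nothing in this resource mentions Envoy, which is the portability argument: a definition like this could in principle be handed to any mesh that implements the same SMI API, whereas configuration written against the xDS APIs assumes an Envoy data plane.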

Microsoft committed to bringing OSM to the CNCF and filed a proposal to add it as a Sandbox CNCF project shortly after announcing it; voting on the next round of Sandbox projects is due to happen in September.

The project was greeted with some interest within the wider Kubernetes community: Knative co-founder Matt Moore talked about building Knative integration, and the arkade Kubernetes CLI quickly added OSM support. There’s already been discussion with the Kiali project (a management console for observability and configuration with Istio) about adding SMI Traffic Metrics so Kiali can offer observability for OSM.

Kubernetes co-founder Joe Beda raised the question of “how service meshes (and things that are service mesh-like) can talk to each other securely and with high fidelity” since both OSM and the Dapr runtime for assembling platform-agnostic applications out of microservices and APIs use Envoy sidecars and mTLS (although at different layers).

As Envoy creator Matt Klein told The New Stack previously, “Envoy is becoming a building block; people are building different things on top and we’re happy to have Envoy be that building block that gets everywhere. We don’t have a goal of being the one control plane or the one way people do things; we want to be a toolkit that people can use to build systems.”

But there was also some controversy when Linkerd product lead Oliver Gould noted that the OSM repo included “health checking” code that appeared to have been copied from Linkerd. Microsoft apologized for the lack of attribution, removed the code completely and promised to improve the review process for the project going forward.

The question that remains is how Open Service Mesh will stand out in the crowded and still evolving service mesh space.

The Cloud Native Computing Foundation, HashiCorp, Aspen Mesh, Red Hat and VMware are sponsors of The New Stack.
