Microservices emerged 10 years ago as an example of the convergent evolution that sometimes happens in software. While the term can be credited to James Lewis and Martin Fowler at Thoughtworks, the global software consultancy, similar ideas were being discussed by Adrian Cockcroft, then at Netflix, and at many other Silicon Valley firms. Amazon, Google and eBay were among the companies that independently arrived at more or less the same architectural pattern at roughly the same time.
In the decade since the term was coined, we’ve seen the rise of Kubernetes, service mesh and the beginnings of serverless, and we’re starting to see the influence of microservices being applied to the front end as well. Alongside horizontal scaling practices, microservices allow developers to deploy code more rapidly, favoring the replaceability of components over their maintainability.
For better or worse, microservices have become, for many, the default architectural choice. For organizations with autonomous teams and loosely coupled systems, microservices can work well, but they bring the complexity inherent in working with any distributed system.
“I make the case strongly for the significant benefits of public cloud over private cloud and data centers, which I think are very clear cut. In many cases it’s fear holding people back in those situations,” Sam Newman, the independent tech consultant who is seeing the second edition of his book “Building Microservices” published in August, told The New Stack. “But with microservices, the world is much, much more complex than that.”
With this in mind, a decade into the microservices era it is interesting to think about where we’ve got to, and what issues we still need to resolve.
Taking Stock: Deployment and Runtimes
Good monitoring options abound. The emergence of OpenTelemetry is particularly significant. Formed through the merger of OpenTracing and OpenCensus, it has wide vendor and language support and standardizes what distributed telemetry data looks like. This means that developers need only instrument their code once, and can then swap and change monitoring tools, comparing competing solutions and even running multiple different monitoring solutions in production for different needs.
The picture gets a little murkier, however, when we look at deployment and runtimes. Kubernetes, which has become more-or-less synonymous with microservices, suffers from burgeoning complexity, prompting Adrian Mouat, chief scientist at Container Solutions, a cloud native consultancy, to speculate that we will see competitors arise to it.
“It’s worth noting that the complexity isn’t just hidden under the hood. It’s spilling out into the interface and impacting users,” Mouat said. “It’s still fairly easy to hack at kubectl run and get a demo up and running. But running production apps and figuring out how to expose them securely requires understanding a wealth of different features that inevitably result in YAML files longer than most microservice source code.”
Newman summed up an essential challenge: “Kubernetes is not developer friendly. It stuns me that we still don’t have a good, reliable, Heroku-like abstraction that is widely used on top of Kubernetes.”
Spotify’s director of engineering, Pia Nilsson, has talked about the average 60 days it took for the rapidly scaling company’s new engineers to merge their 10th pull request. In response, the company spun up a developer portal, Backstage, now a sandbox project at the Cloud Native Computing Foundation.
Netflix has put a great deal of emphasis on DevEx — the company’s “paved road” for developers — using it to help accelerate the adoption of new technologies such as GraphQL. Likewise we’ve seen the rise of developer platforms both built in-house and via vendors like Humanitec. Ambassador Labs has the related concept of a developer control plane — which, its website claims, “enables developers to control and configure the entire cloud development loop in order to ship software faster.”
Daniel Bryant, director of developer relations at Ambassador Labs, told The New Stack, “If you look at what companies like Airbnb, Shopify and Lunar are doing, there is a clear commonality between them. They are creating a Heroku-like CLI for their developers, so that a command like ‘create new microservice’ spins up some scaffolding, plugs into CI, plugs into pipelines, plugs into observability. The question is, what is the abstraction you expose to developers so they get the visibility they need and also make the requirements they need clear as well?”
Stepping up a level, Bryant continued, “Developers need to specify certain operational characteristics: this is a memory heavy service; this service needs low latency; this service needs to be very close to that service. At the moment you do this by spinning up Kubernetes and writing lots of YAML. The abstraction isn’t quite right there, particularly as you bring in other mechanisms for deployment such as serverless and low code/no code.
“I wonder if the winner will be whoever has the right abstractions exposed via the platform, and then leaves it up to the engineers how to package their code — but the way they package it is the same, and the platform exposes some properties which traditionally have been operational properties.”
Open Application Model
A couple of other initiatives regarding Kubernetes are worth tracking. Jointly created by Microsoft and Alibaba Cloud, the Open Application Model (OAM) is a specification for describing applications that separates the application definition from the operational details of the cluster. It thereby enables application developers to focus on the key elements of their application rather than the operational details of where it deploys.
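To make that separation concrete, here is a minimal Python sketch. It is not the OAM schema or any real API: the `Component`, `Traits` and `render` names, their fields and the sample image name are all hypothetical, loosely borrowing the specification's vocabulary of components and traits. The idea it illustrates is only that the developer describes the service once, while operational settings are supplied separately for each environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Developer-owned: what the service is (hypothetical fields)."""
    name: str
    image: str
    port: int


@dataclass(frozen=True)
class Traits:
    """Operator-owned: how the service is run (hypothetical fields)."""
    replicas: int = 1
    memory_mb: int = 256


def render(component: Component, traits: Traits) -> dict:
    """Merge the two halves into one deployable description."""
    return {
        "name": component.name,
        "image": component.image,
        "port": component.port,
        "replicas": traits.replicas,
        "memory_mb": traits.memory_mb,
    }


# One application definition, two sets of operational details.
orders = Component(name="orders", image="example/orders:1.0", port=8080)
staging = render(orders, Traits())
production = render(orders, Traits(replicas=3, memory_mb=1024))
```

The same `orders` definition is rendered for both environments; only the traits differ, which is the division of responsibility the specification is aiming for.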
Crossplane is the Kubernetes-specific implementation of the OAM. It can be used by organizations to build and operate an internal platform-as-a-service (PaaS) across a variety of infrastructures and cloud vendors, making it particularly useful in multicloud environments, which are increasingly found in large enterprises as a result of mergers and acquisitions.
Whilst OAM seeks to separate out the responsibility for deployment details from writing service code, service meshes aim to shift the responsibility for interservice communication away from individual developers via a dedicated infrastructure layer that manages the communication between services using a proxy. Unfortunately, they too suffer from complexity, and can introduce considerable performance overhead.
As a result, many of the successful production implementations of service meshes to date have been in very tech-savvy startups. In a 2020 InfoQ podcast with Wes Reisz, Newman suggested waiting six months before selecting one, and he told The New Stack he is still giving the same advice.
“The realities of them are just horrendous in terms of the weight of the stack, the management, the impacts, the performance implications of this stuff,” Newman said. “There are some organizations that say they couldn’t have done what they did without them, Monzo are a great example of that, and in an organization where you’ve got a heterogeneous technology stack and you need to do things like mutual [transport layer security] at scale, I can see the value of it. But it still feels to me like ‘great concept, poor execution.’ And we could be saying the same thing for a while yet I think.”
Hiding the Service Mesh
One thing that may happen, at least for enterprise customers where performance concerns tend not to be that acute, is that the service mesh gets pushed deeper into the platform and largely hidden from developers. Red Hat OpenShift, for example, integrates Istio under the covers, and there are multiple similar initiatives to integrate service meshes more tightly with public cloud platforms, such as AWS App Mesh and Google Cloud Platform Traffic Director.
Work is also under way to reduce the networking overhead introduced by a service mesh. Among the most promising is the work by the Cilium team, which utilizes the eBPF functionality in the Linux kernel for what it calls “very efficient networking, policy enforcement and load balancing functionality.”
Another possibility, though, is that we may shift to a different runtime altogether. Simon Wardley, an advisor to the Leading Edge Forum, has suggested that Function-as-a-Service (FaaS)/Serverless will ultimately replace Kubernetes as the de facto standard runtime for distributed applications. We are seeing some real-world production examples of this, such as the BBC, which has gone directly to Lambda on Amazon Web Services from its previous LAMP stack for the majority of its online architecture.
“I think FaaS is a great abstraction for managing the deployment,” Newman said. “As a developer-friendly abstraction for deploying software, it is the best thing we’ve had since Heroku. I do think the current implementations are poor, but they’ll improve. But they’re only dealing with the execution of one thing at one location. That’s not solving the problems of the abstractions of the larger network system.”
As an example, Newman cited Microsoft Azure’s Durable Functions, which offer something analogous to continuations via reactive extensions, allowing developers to build stateful workflows and functions in a serverless environment. But whilst the deployment abstractions may improve, it would be naive to imagine you can entirely abstract away the complexities of writing distributed systems.
“You can’t assume that the things you’re going to talk to are there,” Newman said. “You can’t assume that data is going to magically beam instantaneously from one point in time to another. Because it isn’t. And no amount of abstractions are going to solve that fundamental issue.”
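One common way of coding around that uncertainty is to retry a failed remote call with exponential backoff and to surface the failure if the dependency never answers. The Python sketch below is illustrative only and is not drawn from Newman's book: `ServiceUnavailable`, `call_with_retries` and `make_flaky` are hypothetical names, and the "remote call" is a stand-in function rather than a real network client.

```python
import time


class ServiceUnavailable(Exception):
    """Raised by a call when the remote service cannot be reached."""


def call_with_retries(call, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Invoke call(), retrying with exponential backoff on ServiceUnavailable.

    Re-raises the last error if every attempt fails, so the caller still
    has to decide what to do when the dependency really is gone.
    """
    for attempt in range(attempts):
        try:
            return call()
        except ServiceUnavailable:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, then 4x, ... between attempts.
            sleep(base_delay * 2 ** attempt)


def make_flaky(failures):
    """Return a stand-in remote call that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def call():
        if state["left"] > 0:
            state["left"] -= 1
            raise ServiceUnavailable("no response")
        return "ok"

    return call
```

Calling `call_with_retries(make_flaky(2))` succeeds on the third attempt, while a stand-in that fails more often than `attempts` allows still raises `ServiceUnavailable`. Retries paper over brief outages but do not remove the underlying problem, which is the point being made above.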
Architecture for Autonomous Teams
Another area that remains challenging has to do with the overall system architecture, and the related issues around team organization and structure. As Holly Cummins, worldwide developer leader at IBM, has pointed out, “even with properly autonomous teams, system-level considerations don’t go away.”
Eric Evans’ Domain-Driven Design, a cornerstone of the microservices movement, should be read by any software architect, Bryant said. But he went a step further:
“I think now we need DDD for the rest of us,” he told The New Stack. “Because even folks who are regular developers rather than architects need to have some understanding of how to scope entities and boundaries, a lot of which comes back to good API design. Once you get that understanding of the importance of coupling and cohesion, separation of concerns and boundaries, you naturally jump into that gear whatever abstraction (module, class, service, application) you are dealing with.”
The forthcoming second edition of Newman’s book Building Microservices introduces a lot of these concepts with the next generation of services in mind.
In updating the book, Newman told The New Stack, “I wanted to talk a bit more about coupling. I wanted to talk a bit more about cohesion. I wanted to talk a lot more about information hiding, which for me is the big thing now.
“I think even if people get to grips with the distributed systems side of things, they don’t get to grips with the fact that, fundamentally, microservices are just a form of modular architecture. And yet a lot of the people creating microservices have no concept as to what modular architectures are or the concepts of how you do modularization.”
Newman’s updated book also brings in some of the changes in organizational thinking that have emerged since the first edition was published in 2014. He cites, in particular, Matthew Skelton and Manuel Pais’ hugely influential work Team Topologies, on how to organize business and technology teams for fast flow, and the Accelerate book from Nicole Forsgren, Jez Humble and Gene Kim, which explores the science behind lean management and DevOps principles.
The revision process revealed not only how much new knowledge about microservices there is to share, but how that knowledge keeps building.
“I wanted my book to be the book you read to get your breadth understanding of what microservices are and the impact it has on software development,” Newman said. “And I found that I was recommending to people, oh, you should read chapter four in that book. Nowadays I’d say this, rather than that. I didn’t want to keep equivocating over recommending my own book. That’s why I wrote a second edition: because I wanted it to be good and accurate.”
The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Ambassador Labs.
Amazon Web Services, the Cloud Native Computing Foundation, IBM and Red Hat are sponsors of The New Stack.