The Post-Amazon Challenge and the New Stack Model
We are seeing a Cambrian explosion of cloud-native software: containers, orchestration and developer platforms. The New Stack (TNS) is the journal of record for this story of an amazing industry coming to life around us. But something has been missing: a clear, coherent picture of how all these moving parts fit together, a new stack … And so, with an abject lack of humility, I shall attempt to provide one.
Why Do We Need a New Stack?
We need a new stack in order to understand which technology to choose and why.
In 2015 everyone needs to build new applications just to stay competitive. But there seem to be a thousand tools to choose from, all of which appeared yesterday. It’s as if we were thrown back to the early 1990s, before websites were fully understood, except we are now trying to build a large number of custom software applications and APIs. As a result, we are seeing distracting debates about “VMs vs containers,” “opinionated vs un-opinionated,” and “structured vs unstructured” that presuppose a common model without providing one.
We need a model to make sense of all this: a new stack. It serves two purposes:
- The new stack model gives you a context to understand what current and past offerings are trying to do. In turn, this helps you to pick the right tools for your application and to make build versus buy decisions.
- The new stack provides a clear layering to cleanly tease apart functionality, one where the upper layers truly abstract details of the lower layers.
This matters because the ability to create software applications in the right way is becoming revenue critical. Picking the right stack becomes necessary in order to attain the best economic outcome. This stack should deliver better “agility” than ad hoc solutions, and solve a key new stack problem, which we’ll describe next.
The New Stack Problem
For the last five years the industry has been looking for answers to two key questions. These questions amount to the new stack problem:
- How do I control costs? Can my business create applications that enjoy the economic benefits of a cloud, without having to use Amazon’s cloud for every deployment?
- How do I go faster? Can my business create an increasing number of these cloud-native applications, while simultaneously reducing the total cost of managing them?
To really solve the new stack problem, it must be possible to run the same breadth of use cases (apps, services, etc.) that you can run on Amazon, but anywhere, and in a systematic way, so that management is consistent and therefore cheaper.
We believe that there is now a clear answer. Developers and businesses should adopt the cloud-native stack we describe below. This is the right way to think about how developers build cloud-native apps, including microservices, data services and more.
Layers of the New Stack
Here is a simple three-layer cloud-native stack, 2015 edition. From top to bottom, the layers are domain-specific platforms and components (including PaaS), the container service, and automated infrastructure. Central to this story is the creation of a new standard container management layer in the stack. We call this the container service. Up until 2013 this layer was hidden inside larger offerings, such as PaaS. We believe that PaaS is important, but it emerges as one of many products that can run on top of containers, which become the design center. We’ll go into more detail in the rest of the article.
Key Takeaways Here
- Each layer is independent. Most people will prefer to choose best of breed at each layer, based on their business goal. Understanding this stack should help with that.
- Higher layers can abstract away details of the lower layers. For example, if you get the container service right, the layer below (automated infrastructure) should be invisible to all except the group tasked with looking after it. That group might be an in-house ops team or a cloud provider.
- Each layer represents an architectural commitment which should be understood alongside any commercial considerations.
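As a rough illustration of the layering described above, here is a minimal Python sketch in which each layer talks only to the layer directly beneath it. Every class and method name (`AutomatedInfrastructure`, `ContainerService`, `WebPlatform`, `provision_host`, and so on) is hypothetical and invented for this example; none of them corresponds to a real product's API.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class AutomatedInfrastructure(ABC):
    """Bottom layer: hands out hosts on demand (hypothetical interface)."""

    @abstractmethod
    def provision_host(self) -> str:
        """Return an identifier for a newly provisioned host."""


class InMemoryInfrastructure(AutomatedInfrastructure):
    """Stand-in for an ops team or cloud provider; it only numbers hosts."""

    def __init__(self) -> None:
        self._count = 0

    def provision_host(self) -> str:
        self._count += 1
        return f"host-{self._count}"


class ContainerService:
    """Middle layer: runs containers and hides the hosts they land on."""

    def __init__(self, infra: AutomatedInfrastructure) -> None:
        self._infra = infra
        self._placements: dict[str, str] = {}  # container name -> host id

    def run(self, image: str) -> str:
        host = self._infra.provision_host()
        name = f"{image}-{len(self._placements) + 1}"
        self._placements[name] = host
        return name  # the caller never sees the host id

    def running(self) -> list[str]:
        return sorted(self._placements)


class WebPlatform:
    """Top layer: an opinionated product built only on the container service."""

    def __init__(self, containers: ContainerService) -> None:
        self._containers = containers

    def deploy(self, app: str, instances: int) -> list[str]:
        return [self._containers.run(app) for _ in range(instances)]


def demo() -> list[str]:
    platform = WebPlatform(ContainerService(InMemoryInfrastructure()))
    return platform.deploy("shop", 2)
```

In this sketch `WebPlatform` holds no reference to the infrastructure at all, so swapping `InMemoryInfrastructure` for another implementation would not change any code in the top layer.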
The Role of Weave
We need to shift our thinking from “Who is going to win?” to “How do I navigate the stack?” In the future, apps will have more moving parts than before: containers, platforms, microservices, data services, etc. Weave started out making container networks easy, and Weave Net alone now sees several thousand downloads a day. But our product now goes much further, using insight into the network to provide monitoring and management. Given any application (and any network), Weave Scope automatically builds a map of your app, so that you can understand it and diagnose where issues arise in the stack or in your app. Weave Run adds service discovery, load balancing and more. For example, if Scope detects that your system is unhealthy, traffic will be redirected to patched nodes.
At Weaveworks, our aim is to make this easy and convenient for all, without forcing you to prefer one platform over another. And we do mean for all. We provide simple-to-use developer-friendly tools suitable for anyone who wants to focus just on containerized apps.
Like Docker, all of Weave has the critical advantage that it is super easy to deploy and run on any automated infrastructure, with any application, without app changes or a custom OS kernel.
In the Beginning Was the Cloud
Let’s now tell the story of how we got to cloud native, explaining the layers in more detail as we go.
In this story, developers are empowered by cloud technologies. Enabling developers to work fast and well has become central to how a business wins customer hearts and minds. This is a new kind of economics of convenience. Convenience changes people’s behaviors and this is what moves the market. It began with Amazon, was discovered by start-ups and then expanded to existing businesses. This is the story of cloud-native applications.
Cast your mind back to 2006. With EC2 and S3, Amazon introduced self-service compute and storage. Let’s say you are a developer. Suddenly and for the first time, you can create and deploy a real software application without talking to anyone else. You do not have to get hardware from an ops team. You do not have to discuss a capacity plan with your boss. You just code and deploy and go.
And so you can be agile. You can try out ideas — back the winners, shut down the losers. Because you can do all this without having to ask permission, you can anticipate business needs yourself and make changes to your application in real time. So begins a new world of continuous deployment and integration, microservices and automation.
Back in 2006, using cloud infrastructure meant “use EC2.” But since 2011 it has meant “use an integrated set of tools and cloud services.” You need a whole agile infrastructure, with support for automation in three major areas:
- This is all you really need. It is perfectly possible to deploy and run large-scale apps using only automated infrastructure. People have been doing it on the public cloud since 2006, and the “private” infrastructure options are only going to get better.
- You can do this in two ways. Vendors will sell you an integrated version of this layer. For example, with Mesosphere DCOS you get Mesos plus some tooling. Or you can get good results by picking your own pieces.
- Whatever you decide, cloud-native applications must be run on agile infrastructure, i.e., automated and self-service. Other choices impede developers from using cloud-native patterns and practices, such as continuous deployment and architecting for resilience and uptime.
What Happened After AWS Appeared
Between 2010 and 2012, a new generation of “cloud first” startups is turning whole industries upside down by moving quickly. For businesses, the economics of convenience is compelling and obvious. We enter the unicorn era: Airbnb, Square, Stripe, Uber. These leaders combine cloud with innovations in mobile and social to reframe the entire customer experience. Most often they use Amazon Web Services. Even the federal government wants in.
Around this time, people begin using “cloud native” to talk about the broad set of technology, practices, and deployment patterns that make developers most effective at rapidly delivering new functionality. A host of new tools begins to appear from Netflix, Heroku and others. This represents a recognition that standard tools are required to organize cloud applications.
Enter the PaaS
If you are a developer, Amazon gives you freedom. But then you find that freedom can mean doing everything yourself. And that is hard. And operations is even harder. Maybe freedom isn’t such a great idea in all things. By 2010 a new solution had emerged — the Heroku PaaS.
In the Heroku PaaS model, you get “everything you need” to create an app on Amazon, provided you accept the constraints imposed by the platform. But we don’t call these constraints. Instead, we say that the PaaS is “opinionated.” The Heroku approach is to focus on a specific type of web app, evangelized as the “12 factor app.”
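To make one such “opinion” concrete: the twelve-factor methodology asks that configuration be read from the environment rather than from files shipped with the code. The Python sketch below shows that single constraint. The `Settings` class and `load_settings` function are hypothetical names, and the `PORT` and `DATABASE_URL` variables are assumed purely for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    port: int
    database_url: str


def load_settings(env=None) -> Settings:
    """Build settings from environment variables only (illustrative sketch)."""
    env = os.environ if env is None else env
    return Settings(
        port=int(env.get("PORT", "5000")),   # optional, with a default
        database_url=env["DATABASE_URL"],    # required: raises KeyError if unset
    )
```

Passing a plain dictionary, for example `load_settings({"DATABASE_URL": "postgres://example/db"})`, makes the function easy to exercise without touching the real process environment.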
Although quite popular, the initial Heroku model remains a niche solution, even on Amazon. This is evidence that many apps don’t fit into the constraints of the 12-factor app model.
The Modern PaaS
More recently, the PaaS model has evolved. After Heroku, it was taken further, for example, by Cloud Foundry and OpenShift. Post Docker, even more PaaS platforms have appeared.
The modern PaaS has three main properties:
- It does not need to run only on Amazon.
- It standardizes on one application lifecycle.
- It is usually open source.
The argument in favor of the modern PaaS is that customers would like to have a platform, like Heroku, but one that solves the new stack problem. This is intended to help customers deliver apps that run anywhere and are “cloud native,” without creating cost proliferation. Resources can be offered to developers through this platform. Depending on the developer, they may get the freedom to choose how many resources they need. Still, it is managed. Anyone can turn on a faucet, but it does not mean they have access to their community’s entire water supply.
Pros and Cons of PaaS
The issues with the PaaS model are twofold. First, the machinery which delivers all the promised value of a “complete platform” can be unwieldy; and second, customers may wish to deliver application types, e.g., data services, that are not supported by the platform.
PaaS certainly helps. The value of a PaaS comes from its ability to support one set of opinions about a use case, and to automate work supporting that use case. But there are as many opinions as there are developers. And at the business level, there is a trade-off between the velocity gained by standardizing on a specific app architecture and the cost of being constrained when you find your business needs more. This usually first manifests in something simple, like integrating your own monitoring tools.
We believe that these potential shortcomings are addressed by the cloud native stack described earlier. To dig into this, let’s examine the range of use cases.
Domain-Specific Use Cases
To solve the new stack problem, we want to run the same breadth of use cases that you can run on EC2, but anywhere, and in a systematic way. Today we see a vast number of different components and frameworks that amount to domain-specific use cases. Some apps will even combine components from multiple domains. Here are some examples:
- 12 Factor web app frameworks.
- Continuous integration and deployment services.
- Microservices and supporting tools, e.g. circuit breaker, API endpoint management.
- Many types of PaaS, both large-scale “platforms” and simpler, leaner products.
- Batch processing frameworks.
- Big data apps and stream processing.
- Transaction processing.
- Some legacy migration (encapsulation of snowflakes).
- Integration and cloud services business process composition.
- Data services like queues, NoSQL and SQL databases; storage management.
- Staging and application lifecycle support — integrated with CI, GitHub, etc.
- Application-level management, logging, monitoring.
- Billing, quotas and usage planning.
- Security, encryption, authentication, identity, provenance.
All these use cases can be implemented on Amazon, in an ad hoc way. Moving them to other infrastructures can often be done by hand. Some of these use cases can be supported as “opinions” in a PaaS, which enables a degree of consistency and portability. But no PaaS has been able to support “all opinions.” Instead, we need a new un-opinionated platform model. Much of the excitement around containers stems from the belief that they are a basis for this. A consistent platform would drive costs down and liberate apps from specific infrastructure.
This is the emerging reality: apps are built out of domain-specific components running in containers, managed using a widely accepted container service, and running on automated infrastructure that is by and large invisible to the developer. Opinionated PaaS and other frameworks increase value and agility for specific domains. All these run on top of standard container services, which provide containers “on demand,” i.e., a cloud / runtime for containers.
Solving for the New Stack
The container service has emerged because of a need for:
- Portability: A runtime layer that is cross-platform, and runs on any automated infrastructure.
- Management: The layer includes support and APIs for managing containers, and thus components and applications, in a systematic way which is therefore cheaper.
Portability and management add business value, so let’s step through them carefully.
The first role of a container service is to be a common virtual layer for any cloud, one that is efficient and practical for developers. It is “just there,” and developers draw upon it when needed. Ops teams manage it. Devs manage it, too, just like any resource.
This layer runs containers, and it wires in core execution services that containerized apps need, such as networking. This is analogous to Amazon or OpenStack, which support VMs. Containers and VMs are similar: you are running applications in them. But there are differences. Containers can run anywhere, whereas VMs are limited to specific clouds, and containers are more resource efficient, leading to capacity and velocity gains.
The second role of a container service is to provide a way to execute and manage containerized apps. These apps have a wide range of architectures: from traditional N-tier applications through to distributed applications, big data and streams. The more domain-specific components that you can support, the more useful your container service layer. As a developer, you should ask suppliers what range of components they can support and manage.
Cost efficiency is due to the entire system — every service — being as portable as the containers themselves. By using the container service, applications can share one set of operator APIs, simplifying changes and lowering costs.
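The idea of one set of operator APIs shared across component types and infrastructures can be sketched as follows. This is a toy model under stated assumptions, not any vendor's interface: `Component`, `ContainerService`, `roll_out`, and the infrastructure labels are all hypothetical names chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    name: str
    image: str
    kind: str  # free-form label such as "web", "queue" or "batch"


@dataclass
class ContainerService:
    """One management API, whatever infrastructure sits underneath."""

    infrastructure: str  # label only, e.g. "public-cloud" or "on-premises"
    _state: dict[str, str] = field(default_factory=dict, init=False)

    def start(self, component: Component) -> None:
        self._state[component.name] = "running"

    def stop(self, name: str) -> None:
        self._state[name] = "stopped"

    def status(self) -> dict[str, str]:
        return dict(self._state)  # copy, so callers cannot mutate internals


def roll_out(service: ContainerService, components: list[Component]) -> dict[str, str]:
    """Operator script written once and reused against every backend."""
    for component in components:
        service.start(component)
    return service.status()
```

Because `roll_out` depends only on the `ContainerService` methods, the same script and the same list of components can be pointed at a service labeled "public-cloud" or "on-premises" and report the same status, which is the consistency the text credits with lowering management cost.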
From Container Service to Cloud Native
A container service provides a complete environment that can be run anywhere. And so, the “invisible and unbounded” infrastructure provided by Amazon can be delivered without lock-in.
Now you have a completely new application execution layer that supports “any app architecture, anywhere.” So, overall, this is a huge step forward in terms of enabling the economics of convenience. And so it completes our cloud-native stack. We believe that this architecture will increasingly become the norm in modern enterprises because it will help customers layer different products according to need, for example, Mesosphere DCOS and Kubernetes, or Amazon ECS, Weave and Docker.
People are already running large-scale production applications using cloud-native technology. As the world moves to containers and microservices, there is a war of the platforms (see these slides from GOTO London). Platform vendors whose offerings are vertically integrated (aka “structured PaaS”) may claim to be better than platform vendors who specialize in one layer, and vice versa. And at each layer, vendors will compete. They are all potentially “cloud native application platforms”; it is a maturing space. If you want more detail, we recommend Brian Gracely at Wikibon, Stephen O’Grady at Redmonk, and Joe Beda, formerly of Google.
What about the future? Let’s wrap up with predictions.
There are going to be a lot more developers. In the future, every business will employ significant development teams. So we need better app deployment and management models. CI/CD support is going to be bigger than ever.
Developers will pick the easiest tools to understand and use. Therefore, the three layers of the cloud-native stack will standardize. This favors specialist container service projects such as Docker and Kubernetes, and related offerings from CoreOS, Rancher and HashiCorp. Docker Compose will be a lingua franca here.
For Weaveworks, specifically, monitoring and management will need to become portable to all automated infrastructure, and adapt to the new stack, meaning interoperating with any container service — including Amazon ECS as well as Kubernetes, et al., at the network level.
This is not a winner-take-all market. Customers operate at different scales and budgets. There are foundations being formed to standardize and grow the open-source base, including the new Cloud Native Computing Foundation, which is focused on the container service layer.
We’ll see many more value-add offerings that focus on the top layer of the cloud-native stack. Obvious growth areas include mobile, and microservices using portable Java. The number of “lean” PaaS will grow, as the container service becomes standard. Already we are seeing Deis, Flynn, Tutum, Apollo, Cloud66, Convox, Empire, Giant Swarm and Magnetic build on top of Docker and Kubernetes.
Fortune 100 enterprises may prefer vertically integrated “structured PaaS” like Apcera Continuum or Pivotal Cloud Foundry. CF embeds its own container service, Diego. But Red Hat OpenShift is using Kubernetes.
Amazon will keep doing very well. All vendors of automated infrastructure, such as VMware and Microsoft, will interoperate with and integrate the leading container services offerings, to win more cloud-native app business.
This is the story of what happened after one company, Amazon, made computing convenient. With the advent of AWS we saw the sudden availability of almost unending compute. Apps could be deployed without the worry of getting a seven-figure bill. Today, instead, we worry about how to make those “unlimited” resources more business oriented.
Amazon empowered developers and made them more agile. But, in spite of the economics of convenience, there was still a new stack problem. Initially, people were worried about being tied to Amazon: developers were free to consume as needed, but only from a monopoly supplier. The bigger problem is that every app is different. Rapid proliferation of cloud architectures led to a vast and unmanageable array of tools and services. Modern PaaS came about as a response to this problem, and has now evolved into a new cloud-native stack.