Kubernetes and PaaS: The Force of Developer Experience and Workflow
A recent discussion within the Datawire team about what the term “PaaS” really means, and how it relates to developer experience (DevEx) and workflow, triggered lots of internal debate that I believe is worth sharing. I know from working with customers and from chatting to people at conferences that other teams deploying applications onto Kubernetes (and similar platforms) are also somewhat unsure about the relationship between the “platform” and workflow. I hope to provide some clarity, or at the very least to learn something as I get trolled in a constructive manner!
Infrastructure, Platform, Workflow: Three Things Essential, They Are
Starting with first principles, I’m fairly confident in saying that all modern web-based software development involves working with three layers:
- Infrastructure: This layer is the abstraction that provides raw compute resources like bare metal, VMs, OS, network, storage etc, which will ultimately be responsible for processing code and data associated with your application. I’m avoiding using the term “physical” here, because although everything ultimately runs on hardware we are increasingly seeing the abstraction of infrastructure shift to Software-Defined-Everything (SDx).
- Platform: This layer provides a coarse-grained system-level building block that may be run-time specific — such as an instance of compute with an integrated JVM or CLR — and also datastores, middleware, IAM, auditing etc. Note I’m also avoiding using the term PaaS here, which I cover later. It is also worth mentioning that I believe that you always deploy your application onto some form of platform, even if you don’t consciously assemble one.
- Workflow: This layer is the summation of how you design, build, test, deploy, operate and observe your applications. Every developer has a workflow (even if this is implicit), from the one-person indie website builder, to the thousand-strong team working on complex enterprise systems.
When I talk about this model with friends and clients, I generally get some form of agreement on the concepts and structure. The disagreement begins when we start discussing the coupling between the concepts, particularly in relation to a platform-as-a-service (PaaS).
These Aren’t the Platforms You’re Looking for
I often hear that “a PaaS always has a built-in workflow,” or “if you are running a PaaS, then you don’t need to know about infrastructure.” I don’t particularly agree with these statements, and in an effort to build shared understanding, I typically start by explaining my mental model of the generations of PaaS:
- First Generation: The original Heroku, and friends. This type of PaaS hid away the underlying cloud infrastructure (via deployable slugs and execution “Dynos”), and presented a very opinionated workflow that was coupled to the platform (“git push heroku master”). For simple monolithic Ruby applications, this was fantastic — you could become proficient in the deployment workflow within minutes, and all your knowledge was transferrable to the next project.
- Second Generation: Cloud Foundry (DEA edition) and friends. This type of PaaS could be deployed on your own infrastructure and was slightly less opinionated in that you could bring your own buildpacks and runtime. The workflow was still integrated (“cf push my-awesome-app -b java_buildpack”), but the platform was beginning to enable multi-service applications (and workflows) through concepts like context path routing.
- Third Generation: Cloud Foundry (Diego edition) and the current versions of Google App Engine and AWS Elastic Beanstalk (both of which evolved from First and Second Generation PaaS, respectively). These PaaS are even less opinionated about infrastructure (you can bring your own container), and their documentation is clear about the restrictions on the runtime and compute environment. The workflow here is moving to support distributed systems (microservices), and developers are encouraged to follow the current “recommended practice” of assembling their own continuous delivery pipeline, perhaps using a tool like Concourse or Spinnaker.
- Fourth Generation: Kubernetes, Docker Swarm, Mesos, HashiCorp Nomad, AWS ECS and friends (and I appreciate that these aren’t really PaaS, but bear with me a moment). This type of platform is infrastructure agnostic (who hasn’t been to a conference and seen Kubernetes running on a Raspberry Pi cluster?), and you are nearly always exposed to the underlying compute and networking infrastructure that makes up the cluster (with the potential exception of AWS Fargate). You can also deploy any kind of container or process you like. This type of platform emerged out of the desire to build “cloud native” applications, which are distributed by nature. The “best practice” workflow here is as yet undefined (or, more correctly, is still co-evolving with the technology), and we are taking our current understanding of CI/CD and evolving it. Accordingly, this is an area where interesting open source and commercial tooling is emerging: think Datawire’s Forge and Telepresence for workflow and Ambassador for traffic shifting/deployment; Weaveworks’ Flux and Scope; Heptio and Bitnami’s ksonnet; Microsoft’s Draft and Helm; and Containous’ Traefik load balancer/API gateway.
I believe the confusion about the coupling between platform and workflow arises because many of us (including me) have been happily deploying our coarse-grained (monolithic?) applications in situations where either the workflow is tightly coupled with the platform (i.e. First and Second Generation PaaS), or the workflow is loosely coupled but so well defined that we don’t notice the friction (i.e. building and deploying language-specific artifacts onto traditional infrastructure or Third Generation PaaS).
The challenge emerges with the Fourth Generation platforms, where the technologies are still maturing, and the architectural and related organizational best practices are also still co-evolving (think microservices and serverless, and the inverse Conway maneuver, respectively). There is also increasing pressure from the business for speed, stability and visibility, and this is often being realized by decoupling business processes (forming cross-functional business units) and devolving decision making to teams working at the front lines, as typified by business movements such as holacracy and Teal organizations.
Looking back to my software development layers above, a key point to note is that our infrastructure and platforms are becoming increasingly distributed and decoupled, but they are enabled by a centralizing force that defines common protocols, standards and interfaces: for example, AWS in the case of infrastructure, and the Kubernetes (Apps SIG) community in the case of a platform. Our workflows are also becoming distributed, but I don’t think we have centralizing forces here yet, i.e. a common descriptive language, set of tools, and pre-defined (pluggable) workflow steps.
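To make the idea of a common descriptive language with pluggable steps a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the step names, the `STEP_REGISTRY` dictionary, and the `run_workflow` function are illustrative assumptions of mine, not part of Kubernetes or of any existing tool. The point is only that the workflow is described as data (a list of step names), while the steps themselves are registered separately and can be swapped out.

```python
from typing import Callable, Dict, List

# Shared state passed from one step to the next (hypothetical shape).
Context = Dict[str, str]
Step = Callable[[Context], Context]

# Hypothetical registry of pre-defined, pluggable workflow steps.
STEP_REGISTRY: Dict[str, Step] = {}


def step(name: str) -> Callable[[Step], Step]:
    """Register a function as a named workflow step."""
    def decorator(func: Step) -> Step:
        STEP_REGISTRY[name] = func
        return func
    return decorator


@step("build")
def build(ctx: Context) -> Context:
    # Pretend to build a container image tagged with the service version.
    return {**ctx, "image": f"{ctx['service']}:{ctx['version']}"}


@step("test")
def run_tests(ctx: Context) -> Context:
    # Pretend the test suite ran and passed.
    return {**ctx, "tested": "true"}


@step("deploy")
def deploy(ctx: Context) -> Context:
    # Refuse to deploy anything that has not been through the test step.
    if ctx.get("tested") != "true":
        raise RuntimeError("refusing to deploy an untested image")
    return {**ctx, "deployed": ctx["image"]}


def run_workflow(steps: List[str], ctx: Context) -> Context:
    """Run the named steps in order, threading the context through each."""
    for name in steps:
        if name not in STEP_REGISTRY:
            raise KeyError(f"unknown workflow step: {name}")
        ctx = STEP_REGISTRY[name](ctx)
    return ctx


# The workflow itself is just data, so teams could share and review it.
WORKFLOW = ["build", "test", "deploy"]
result = run_workflow(WORKFLOW, {"service": "orders", "version": "1.0.0"})
print(result["deployed"])  # prints "orders:1.0.0"
```

A team that needs an extra step (say, a security scan) would register one more function and add its name to the list, rather than rewriting the whole pipeline.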
Building (Snowflake) Bespoke Workflows: The Greatest Teacher, Failure Is
As organizations embrace Kubernetes, I am increasingly seeing teams create bespoke developer experience workflows, often differing across teams within a single organization, which leads to fragile solutions and limited shared learning. Often these teams try to codify their workflow in such a way that they, in effect, end up building a platform on top of Kubernetes. Don’t get me wrong: deploying onto a platform is essential, but the concepts of a platform and the workflow should be thought about and designed separately. Tightly coupling the platform and workflow leads to inflexible developer workflows.
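One way to picture designing the two separately is to have the workflow depend only on a narrow interface that any platform can satisfy. The Python sketch below is an illustration under my own assumptions: the `Platform` protocol, the two fake platform classes, and the `release` function are hypothetical names, and the "deployments" only return strings rather than talking to a real cluster.

```python
from typing import List, Protocol


class Platform(Protocol):
    """What the workflow needs from a platform, and nothing more."""

    def deploy(self, image: str) -> str:
        ...


class FakeKubernetes:
    """Stand-in for a Kubernetes cluster (no real API calls are made)."""

    def deploy(self, image: str) -> str:
        return f"kubernetes: applied Deployment for {image}"


class FakeFirstGenPaaS:
    """Stand-in for a push-to-deploy style PaaS."""

    def deploy(self, image: str) -> str:
        return f"paas: pushed {image}"


def release(service: str, version: str, platform: Platform) -> List[str]:
    """A workflow that knows about build, test and deploy, not about the platform."""
    image = f"{service}:{version}"
    log = [f"built {image}", f"tested {image}"]
    log.append(platform.deploy(image))
    return log


# The same workflow runs unchanged against either platform.
for target in (FakeKubernetes(), FakeFirstGenPaaS()):
    print(release("orders", "1.0.0", target))
```

Because `release` never reaches past the `deploy` method, changing the platform does not force a change to the workflow, and changing the workflow steps does not require touching the platform code.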
I believe the Kubernetes “platform” is becoming somewhat clearer as we enter 2018. Kubernetes itself is maturing nicely, and service meshes powered by the likes of Envoy, Conduit and Cilium appear to be filling in some of the missing parts of a platform. However, there is still much thinking to be done around developer experience. We are seeing best practices within the operational space being codified within methodologies like Weaveworks’ GitOps and Atlassian’s BDDA (Build-Diff, Deploy-Apply), and I believe something analogous will emerge in the application development (AppDev) space.