Platform Engineering Needs to Manage Infrastructure, Too
In today’s world, application developers no longer only write code; they need to deploy and operate their applications as well. Only in extremely large, mission-critical deployments or organizations can they turn over operations and troubleshooting to a team of site reliability engineers (SREs).
That’s a big reason why platform engineering is so critical to the next phase of DevOps and why it needs to include managing infrastructure, not just the app stack. Let’s consider, for example, the issues that developers struggle with in the three large phases of the typical DevOps life cycle: develop, deploy, operate.
During these phases, developers need to focus on translating business logic into code, building and testing that code, and ensuring it is written to properly scale and run on the target production infrastructure. Tools and approaches such as GitOps, Jenkins and Docker make the continuous cycles of the develop phase much easier on the developer. Infrastructure targets such as Kubernetes allow developers to test the scalability and resiliency of their applications, thanks to the inherent automation and orchestration that Kubernetes provides.
Yet one big challenge is that in addition to addressing the app stack, developers now need to be “infrastructure aware” during a DevOps life cycle. That’s less of an issue when an application relies only on core Kubernetes capabilities, which behave consistently across managed platforms such as Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE) and OpenShift, because Infrastructure as Code (IaC) allows for predetermined, repeatable and reliable infrastructure configuration. But what happens when the application requires more robust features around network, storage or security?
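The “predetermined, repeatable” promise of IaC comes from generating configuration declaratively rather than editing it by hand. A minimal sketch in Python (the function name and parameters are illustrative, not from any particular tool) shows the idea: the same inputs always produce the same Kubernetes manifest.

```python
import json

def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Build a declarative Kubernetes Deployment spec.

    Identical inputs always yield an identical manifest -- the core
    IaC promise of predetermined, repeatable configuration.
    """
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }

# Two independent calls with the same inputs serialize to the same bytes.
a = json.dumps(deployment_manifest("web", "nginx:1.25", 3), sort_keys=True)
b = json.dumps(deployment_manifest("web", "nginx:1.25", 3), sort_keys=True)
assert a == b
```

Because the manifest is derived from code, it can be versioned in Git and reviewed like any other change, which is what makes the configuration reliable across environments.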
Scaling the application for many users without repetitive toil is a key requirement during the deploy phase. Infrastructure as Code and API-based tools are useful but not a complete answer. While they create a programmatic interface, they do not reduce the repetitive toil of managing the infrastructure itself.
The idea is to reduce toil by ensuring application mobility without intervention or code changes specific to the target cloud where possible. If using core Kubernetes concepts and capabilities, again, this is typically not a problem, but when you require extended functionality at the infrastructure layer (think of reducing noisy neighbor problems, auto-growth of persistent disks or disaster recovery of your application), there could be multiple solutions for each specific cloud a developer is deploying to.
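One common way to keep application manifests free of cloud-specific code is to confine the per-cloud detail to a single lookup, such as a StorageClass name, while the application’s own claim stays identical everywhere. A minimal sketch, assuming a hypothetical per-cloud mapping table (the class names shown are illustrative defaults, not a recommendation):

```python
# Hypothetical per-cloud mapping: only this table changes per target cloud,
# never the application's own manifests.
STORAGE_CLASS_BY_CLOUD = {
    "aws": "gp3",
    "azure": "managed-csi",
    "gcp": "standard-rwo",
}

def pvc_manifest(app: str, size_gi: int, cloud: str) -> dict:
    """PersistentVolumeClaim that stays cloud-agnostic.

    The cloud-specific detail is isolated in the StorageClass lookup,
    so the app moves between clouds without code changes.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": f"{app}-data"},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": STORAGE_CLASS_BY_CLOUD[cloud],
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }
```

The extended features the article mentions, such as auto-growing disks or cross-region disaster recovery, are exactly where this clean separation breaks down, because each cloud exposes them differently.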
During the operate phase, the developer needs to ensure scaling is working properly, the application is performant and usable, and that the application is behaving in an expected manner.
Developers also gather feedback via observability and monitoring tools so that they can plan for application improvements, enhancements and modifications during the next development cycle. Again, when extended functionality beyond the basics is required, the effort multiplies. For each cloud, there may be optimizations that need to be done differently; enhancements can behave differently on different clouds; and modifying and maintaining specific fixes on a per-cloud basis can lead to severe technical debt and a spaghetti bowl of different codebases depending on the target cloud.
Platform Engineering to the Rescue
Wearing a DevOps cape, the platform engineer is the modern organizational antidote to these developers’ needs. As a practice, platform engineering has grown rapidly and incorporates developer tools and direct ways to link a deployment platform for apps to Git and the GitOps pipeline. It can also support blue-green deployments for easy rollbacks to a prior version, canary-style staged deployments and a plethora of namespace- and LDAP-based tools used in security compliance and policy-based access for different user groups.
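The easy-rollback property of blue-green deployments comes from the fact that cutover is a routing change, not a redeploy. In Kubernetes terms, a Service’s label selector decides whether the “blue” or “green” pods receive traffic. A minimal sketch (the `version` label convention here is an assumption, one common way to model the two colors):

```python
import copy

def cut_over(service: dict, color: str) -> dict:
    """Return a copy of a Kubernetes Service pointed at the given color.

    Promoting the new version and rolling back to the old one are the
    same one-line selector change -- no pods are rebuilt either way.
    """
    updated = copy.deepcopy(service)
    updated["spec"]["selector"]["version"] = color
    return updated

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {
        "selector": {"app": "web", "version": "blue"},
        "ports": [{"port": 80}],
    },
}

live = cut_over(service, "green")        # promote the green deployment
rolled_back = cut_over(live, "blue")     # instant rollback to blue
```

Both deployments stay running during the transition, which is what makes the rollback instantaneous compared with re-rolling a single deployment.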
Infrastructure and data management are key elements of the internal developer platform (IDP), which enables developer self-service with tools to manage compute, networking, storage and data services elastically. Guardrails for each use case and namespace are essential to ensure high availability, disaster recovery and active rollback, using snapshots to restore previous app and infrastructure configurations and data in case of issues with upgrades to any part of the app stack.
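The snapshot-based rollback guardrail can be sketched with the Kubernetes CSI snapshot API: take a VolumeSnapshot of a claim before an upgrade, and restore by creating a new claim whose `dataSource` points back at that snapshot. The manifest shapes below follow the `snapshot.storage.k8s.io/v1` API; the helper functions and names are illustrative.

```python
def snapshot_manifest(pvc: str, name: str) -> dict:
    """VolumeSnapshot of a PVC, taken as a pre-upgrade guardrail."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {"source": {"persistentVolumeClaimName": pvc}},
    }

def restore_manifest(snapshot: str, name: str, size_gi: int) -> dict:
    """New PVC restored from a snapshot, used to roll back app data."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot,
            },
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }
```

An IDP can generate and apply these automatically around every upgrade, so rollback of data is as routine as rollback of code.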
In all of this, to be successful, infrastructure and data services need to be brought together under Kubernetes control in the IDP. It’s another important way that Kubernetes can help your organization maintain operational excellence and simplicity.