Platform Engineering: Infrastructure Meets Dev Experience
“Platform engineering” is a concept you’ve probably heard of recently, particularly if you’re working in the cloud native software space. Is it just hype — or is there something substantial behind the growing buzz? And where has this trend toward developer-focused platforms come from?
After hearing rumblings about platform engineering and its role in shaping and supporting the developer experience, I started to make connections between the work I do and the discussions I’ve seen taking place in the wider cloud native community.
There’s a shift toward developers owning their code, supported by “paved paths” that reduce cognitive load and complexity. Exploring the landscape more deeply, I pulled together a range of resources and insights that address a few key questions: What is platform engineering, what does it aim to achieve, and why are cloud native organizations converging on centralizing their developers’ experience?
“Platform Engineering” is rapidly becoming the new DevOps or SRE. Almost every day we hear about another org building an internal developer platform or control plane.
Want to know what platform engineering is, where the trends are going, and why you should care?
Read on 🧵👇
— Daniel Bryant (@danielbryantuk) February 18, 2022
This tweet thread quickly went viral and continues to get attention, largely because the topic appears to have hit a nerve. Cloud native engineering organizations are growing fast, onboarding new developers and building increasingly complex software. Everyone is looking for practical solutions to the complexity and cognitive load issues, which can, if left unchecked, slow progress and create unnecessary toil. A consensus is emerging around building developer platforms to address these challenges. For example, a 2021 Humanitec poll of 1,850 engineering organizations indicated that the majority are already building, or plan to build, their own internal developer platforms.
What Is Platform Engineering, and Why Does It Matter?
Platform engineering takes the idea of a platform or control plane (everything seems to run on a platform of some kind) and applies it to giving developers a centralized, self-service “paved path” to shipping and running apps quickly and safely.
My thoughts on the subject were kickstarted by Netflix’s groundbreaking work in creating its own internal developer platforms and supporting toolchains for its developer teams. I certainly was not alone in seeing the sense in Netflix’s approach to creating “full cycle developers.” This meant that developers at Netflix became responsible for the end-to-end life cycle of their software: coding, testing, shipping and running their own applications.
The “you build it, you run it” motion has, to some degree, been cited across cloud native engineering organizations as a target model and has triggered a flourishing ecosystem of different types of developer platforms or control planes. These are largely designed around the needs and goals of the organization in which a developer works.
There is no “one-size-fits-all” paved path. The Googles of the world may want their developers to focus on their core work, coding, without thinking about the shipping and running of the corresponding software. Thinking about or being responsible for infrastructure in such cases is more of a distraction than a value-add activity. Other cloud native organizations may assemble a platform engineering team and ask them to collaborate with a site reliability engineering team working to empower developer teams.
Three key things are happening here, even if there are countless variations on the implementation of an internal developer platform.
- Silos are being broken down: With the push toward centralized control planes, platform engineering, site reliability engineering (SRE) and developer experience/enablement (DX) teams, united by a common mission (as outlined in the Google SRE book), are talking to each other and working together more than ever.
- Paved-road platform engineering builds understanding: Regardless of the end game, such as developer responsibility for the full software life cycle, organizations increasingly expect developers to understand that full life cycle. Even where developers are never required to ship software, it is becoming more common to foster their empathy for the roles of SREs and DevOps and their understanding of infrastructure.
- Everyone is building platforms, but that doesn’t mean it’s easy: Most developer platforms aim to reduce complexity and give developers self-service access to everything they need to do their work. But the balance in paving this path is different for every organization, particularly when an organization considers different kinds of developers — the 99% of developers who build most of the real-world applications and rely on legacy software versus the vocal minority/developer influencers. What should a platform that serves both look like?
The New DevOps and SRE, or a Key to Organizational Culture?
The platform a company creates depends on a number of factors, many of which have little to do with the internal developer population. Topline considerations frequently focus on existing organizational culture, key business goals and a company’s level of cloud maturity. Many companies have designed their platform around business goals and invested in developing the platform because they expected a certain return on investment, such as in developer productivity or achieving cost savings. The composition of teams and their roles follows.
These business and cultural elements can drive, or at least exert tremendous influence on, the resulting platform, what teams look like and how much freedom and responsibility developers are asked to take on.
“Some organizations are set up to empower developers to take on as much as they want; others are siloed and prefer to ‘contain’ developers. Whether developers have full freedom to own the full software life cycle or are more constrained by organizational or platform restrictions, getting to a point where developers are empowered to take on increasing levels of responsibility can contribute to better software and better teams,” according to Nicki Watt, the CEO/CTO of technical consultancy OpenCredo.
Another example is Alan Barr, internal developer and security platform product owner at Veterans United Home Loans, who secured internal buy-in for a centralized developer control plane within the conservative world of financial services. He tackled the need for a developer platform by building a business case around unlocking developer efficiency, focusing on customer needs, controlling costs, automating processes and ensuring security as well as freeing up DevOps and SREs from firefighting duties.
Does the drive to give developers more responsibility and a self-service platform eliminate DevOps or SRE responsibilities? No. Platform engineering is an evolution of these functions rather than a replacement.
Bo Daley, DevOps platform engineer from Zipcar, described his experience evolving his organization’s thinking on delivering developer support. He said it moved from a more traditional DevOps approach toward building a centralized platform that would give developers a paved path from coding locally to delivering in production more efficiently.
CartaX’s Mario Loria echoed Daley’s sentiments: “SREs play a key role in guiding developers through the learning curve toward comprehensive self-service and service ownership. But it should not be up to me as an SRE to define how your application gets deployed or at what point it needs to be rolled back, or at what point it needs to be changed, or when its health check should be modified. Developers should be capable of, and empowered to make, these determinations.”
Most leaders I’ve spoken to in the cloud native space agree that while the platform may differ from organization to organization, and the supporting roles played by DevOps and SRE shift, it helps developers deliver on what the organization expects. The platform serves not only as a basic “paved path” but also as a jumping-off point: a central self-service hub that provides a standard experience and a springboard for exploring beyond the paved path, as long as developers who do so also assume responsibility for the outcome.
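To make the idea of a paved path with room to step off it more concrete, here is a minimal sketch in Python. It is purely illustrative and does not describe any platform mentioned in this article: the `ServiceConfig` fields, the default values and the `resolve_config` helper are all hypothetical names chosen for the example. The sketch assumes a platform team publishes sensible defaults for how a service is deployed, health-checked and rolled back, while developers may override any of them and the platform records which settings have left the paved path.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ServiceConfig:
    """Settings a hypothetical platform applies to one service."""
    deploy_strategy: str = "rolling"
    health_check_path: str = "/healthz"
    rollback_error_rate: float = 0.05  # assumed default: roll back above 5% errors


# The paved path: defaults the (hypothetical) platform team maintains.
PAVED_PATH = ServiceConfig()


def resolve_config(overrides: dict) -> tuple[ServiceConfig, list[str]]:
    """Merge developer overrides onto the paved-path defaults.

    Returns the effective config plus the names of the settings that differ
    from the defaults, which the owning team is accountable for.
    """
    # replace() raises TypeError if an override names an unknown setting.
    config = replace(PAVED_PATH, **overrides)
    off_path = sorted(
        name for name, value in overrides.items()
        if getattr(PAVED_PATH, name) != value
    )
    return config, off_path


config, off_path = resolve_config(
    {"deploy_strategy": "canary", "health_check_path": "/healthz"}
)
print(config.deploy_strategy)  # canary
print(off_path)                # ['deploy_strategy']
```

In this sketch, a team that changes nothing gets the standard experience, and a team that opts into a canary rollout still uses the same self-service entry point but is visibly the owner of that choice.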
How Do Platforms Support the Developer Experience?
I’ve argued that the platform engineer’s role is to examine the entire software development life cycle from source to production and build a workflow that enables application developers to rapidly code and ship software. While this is broadly true, there are clearly many ways to get there. Understanding the varied real-world developer experiences is critical to building the right platform. The developer experience is no more monolithic than the microservices architectures these platforms aim to simplify, and the platforms will support the developer experience only to the extent that they are adopted and used in the real world.
An organization can try to do everything right: Focus on the business case, get internal buy-in and budget, and believe wholeheartedly in platform engineering as the path to efficiency. But despite it all, if an organization has not considered the needs and input of the developers using the platform, it is destined to fail. Developers are unlikely to adopt a platform that has not included their feedback and preferred workflows.
Platforms can support the developer experience by seeking developer perspective and building the eventual platform around this insight. Just as software is built for an end user, the platform customer (developer) needs to believe that the tool was built for them with their challenges in mind. That is, build with empathy.
Best Practices for Building a Developer Platform
Creating effective developer platforms requires much more than a technical framework for serving developer needs. It is more than shifting around the roles of developers, DevOps and SREs. It is more than understanding business goals, although this, too, is a critical ingredient. Instead, the best practices underpinning successful platform engineering endeavors focus on cultural and psychological factors.
- Listen to your audience (developers) and develop practically: If you’re building a platform to reduce toil, increase productivity and lessen cognitive load for your developers, focus first on the 99% of real-world developers who are responsible for the stability of many critical applications the organization relies on.
- Be human, encourage empathy: Kelsey Hightower has spoken extensively about empathy for the end user and elevating it as part of the software development process. The developers in your organization, and by extension the developer experience, demand empathy for the humans who work on your organization’s software.
- Use platforms to empower developers to do their best work: Extending Hightower’s empathetic engineering concept, Alan Barr also highlighted Twilio’s “hospitality” focus as an example of creating a developer-focused platform. Remove challenges that stand in the way of developers doing their best work.
- Build a two-way understanding: The tacit agreement between an organization and its developers is that they work toward understanding each other. If the organization provides a good developer experience, the developer agrees to bring mechanical sympathy to their work and a willingness to be responsible for their code and its potential downstream effects.
With these four points in mind, it is worth reiterating that the majority of platforms are successfully adopted from the bottom up, driven by the needs of development teams. The days of executives buying a platform while out on the golf course and mandating its use in the office are long gone. Today, developers are the kingmakers and queenmakers of the platform space.