Platform Engineering Demands a Product Mindset
LONDON — If you build it, they won’t come. Successful platform engineers keep beating this drum. Less successful platform engineers are stuck thinking they know best — after all, they are engineers so surely they must know what their fellow engineers want better than they do.
Platform engineering — the discipline dedicated to removing friction and frustration across the software development lifecycle in order to improve developer experience — demands a platform-as-a-product mindset. This is where your internal developer platform is built not only with your internal developers “in [often back of] mind” but with demos and prototypes and continuous feedback throughout. Basically, your developers become your customers. And you want to make dang sure you’re building what they want because otherwise they won’t use it, and you’ll be back where you started, except having wasted everyone’s time, money and trust.
At CIVO Navigate, Syntasso’s Principal Engineer Abigail Bangser reflected on what it really means to adopt a platform-as-a-product mindset, and on where she has fallen short over her years in platform engineering roles.
Platform as a Team Topology
In math, a topology is a structure that holds strong despite being stretched and squashed, even under constant pressure and change. Recognizing that software development teams can be laid out in different ways that are more sustainable and flexible to change under pressure, Matthew Skelton and Manuel Pais created a series of principles and patterns they call team topologies.
“The shapes of teams look a little bit different throughout an organization and how they interact is a little bit different so they codified patterns that we’ve seen throughout the years,” Bangser explained.
The shape of teams may vary, but most organizations share common types of teams. What team topologies call stream-aligned teams, usually referred to as application development teams, focus on products and features that deliver value to the end user.
“In an ideal world, 100% of engineers would be working on that thing, because that’s what customers pay for,” she said. “But the reality is that you can’t have 100% of people working on customer-facing features because there’s a lot of underlying requirements that they have to depend on.” And in an increasingly complex cloud native landscape, your dependencies have dependencies, which is why other teams exist to support those value-driving app teams.
Most organizations have enabling teams that bring specialization in areas like testing, agile coaching or databases. There are also complicated-subsystem teams that bring specialization in internal features like AI or security. Finally, there’s the platform team, which provides the underlying services that most if not all app teams use to release that value.
In the age of DevOps and continuous delivery pipelines, feature teams have come to rely more and more on tooling created by these supporting, cross-functional teams in order to release their own code to production.
All three of these teams work to reduce the cognitive load of stream-aligned teams, with platform teams often abstracting and combining the work of the enabling and complicated-subsystem teams into easier-to-consume workflows and toolchains.
Collaboration vs. Flow State
Team topologies also highlight three ways in which the platform team interacts with its app team customers:
- Collaboration: This is person-to-person, like asking a specific person for access or permission to do something like provision a container. When you’re a baby startup with everyone sitting next to each other, this mode of interaction can be good enough. No matter what your company size, it’s also good for brainstorming and testing out new ideas. However, collaboration does not scale. It can also be a major source of friction, often slowing down deployment and interrupting flow states.
- Facilitation: This follows a one-to-many human interaction pattern where documentation, training and communities of practice spread information and knowledge. This scales, but only at a human scale. And developers notoriously put off writing documentation and often don’t have time for extra training.
- X-as-a-Service: This is a shift away from human-to-human interaction, where your developers interact with an internal platform via an API. When done with your development team customer in mind, this scales far beyond what human-to-human interaction allows.
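To make the X-as-a-Service mode more concrete, here is a minimal Python sketch of a self-service request that is handled without a human in the loop. Everything in it is hypothetical: the `PlatformAPI` class, the `request_service` method and the service names are illustrative assumptions, not something described in the talk.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ServiceRequest:
    request_id: int
    team: str
    service: str
    status: str


@dataclass
class PlatformAPI:
    """Hypothetical in-memory stand-in for an internal platform API."""

    # Services the platform team has chosen to offer (assumed names).
    catalog: frozenset = frozenset({"object-storage", "virtual-server"})
    requests: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def request_service(self, team: str, service: str) -> ServiceRequest:
        # Catalog services are fulfilled immediately. Anything else is
        # flagged for a human conversation instead of being rejected.
        if service in self.catalog:
            status = "provisioned"
        else:
            status = "needs-conversation"
        req = ServiceRequest(next(self._ids), team, service, status)
        self.requests.append(req)
        return req


platform = PlatformAPI()
print(platform.request_service("payments", "object-storage").status)
print(platform.request_service("payments", "gpu-cluster").status)
```

The fallback status is a design choice in this sketch: requests outside the catalog fall back to one of the human modes above rather than being automated away.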
“As a topology, platform teams are often operating all three of these modes at different points in their time,” Bangser said. All three ways are important and, in the world of platform engineering, not everything should be automated, lest you become too distanced from your customers.
“I think that platform engineering has a lot of potential but I don’t think we always take advantage of that potential.” In the end, Bangser contends that platform engineering initiatives most often falter when platform teams don’t see themselves as product teams.
The Trial and Error of Platform Engineering
Bangser reflected on her eight years of building platforms across a dozen companies. At one scale-up, not uncommonly, there were several app teams dedicated to delivering business value, which were supported by other teams like customer support, finance and marketing. Perhaps more unusually, each different feature delivery team had a different Amazon Web Services account.
“When we were a small company, we wanted to allow people autonomy and to move fast and easy, but we also wanted to make sure that we didn’t risk permissions issues and stomping on each other. The fastest and easiest way to get people going was accounts,” Bangser said. As teams grew and split, new AWS accounts were opened.
Provisioning of new AWS accounts kicked off with a request via Jira. The platform team picked up the ticket, consulted the runbook in Confluence, and then returned to the feature team with a properly configured account. This shadow IT process was “good enough” right up until finance started asking why it was getting expense reports and requests to reimburse personal credit cards for AWS charges. And we all know how cloud costs can add up.
Bangser’s team went down the Jira data rabbit hole. As her colleague at Syntasso Paula Kennedy previously told The New Stack, Jira is a great first step for platform teams to uncover repeated work, widespread pain and lengthy bottlenecks.
They realized not only was it less than ideal financial tracking, but that it was taking two weeks to get these teams up and running with their new AWS accounts.
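One rough way to surface this kind of delay is to compute lead times from exported ticket data. The Python sketch below assumes a hypothetical export in which each ticket has a request type plus created and resolved dates. The field names and sample records are made up for illustration and are not the team's actual data.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical ticket export; field names and values are illustrative only.
tickets = [
    {"type": "new-aws-account", "created": date(2023, 1, 2), "resolved": date(2023, 1, 16)},
    {"type": "new-aws-account", "created": date(2023, 2, 1), "resolved": date(2023, 2, 13)},
    {"type": "dns-change", "created": date(2023, 1, 5), "resolved": date(2023, 1, 6)},
]


def median_lead_time_days(tickets):
    """Return {request type: median days from creation to resolution}."""
    durations = defaultdict(list)
    for ticket in tickets:
        elapsed = ticket["resolved"] - ticket["created"]
        durations[ticket["type"]].append(elapsed.days)
    return {kind: median(days) for kind, days in durations.items()}


def slowest_first(tickets):
    """Return (request type, median days) pairs, slowest type first."""
    lead_times = median_lead_time_days(tickets)
    return sorted(lead_times.items(), key=lambda item: item[1], reverse=True)


for kind, days in slowest_first(tickets):
    print(f"{kind}: {days} days")
```

A ranking like this only shows where time goes. As the rest of the story makes clear, it says nothing about why people file the request in the first place.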
“So we looked at our process, and we realized that the thing that was causing the issue was this manual runbook,” Bangser said. “It’s not just that it was manual. It was tiring for us to do it because people weren’t confident with it. And it was a pain in the butt and it took a lot of time and so people avoided it,” adding to the platform team’s backlog.
So they automated the runbook, which initially seemed to limit unexpected expenses, pleasing finance. It also removed a frustrating, repetitive and manual task for the platform team, which meant less toil and a lower risk of errors.
“We’re pretty happy with ourselves because now, when a ticket comes into Jira, we click one button, run one script and we’re back out the door,” Bangser said. “It reduced our time to market a lot as an individual team.”
Three months later, finance returned to the problem, realizing it hadn’t really been solved for them. They were still getting unexpected expense reports.
“The problem was, we had fixed a problem, not the problem,” she said. “Instead of a technical implementation point of view (how can we speed things up?), we needed to look at this from a customer point of view: What is it that the customers who are trying to create accounts need and how can we deliver that as a product team?”
They were correct that this two-week lag time was a problem. But they were centering the solution on the platform engineer’s experience, not the software developer’s motivations.
The platform team went on a journey of discovery following the Jobs-to-be-Done Theory, the outcome-driven innovation framework for defining, categorizing, capturing, and organizing customer needs. Remember, for platform teams, your colleagues are your customers.
“What the Jobs-to-be-Done Theory says is: No matter how great your data is about people, if you don’t understand what motivates them and what they need to complete by using your product, you’re going to be insufficient in solving problems,” Bangser explained.
This strategy, developed by Tony Ulwick at the turn of the century, argues that demographics are not the most important information about your prospects. What matters is answering: What job are they trying to do?
There are four characteristics to jobs, explained Bangser:
- Solution agnostic — there can be many ways to complete that job.
- You need to complete the job — progress must be made.
- A job is stable over time — you can innovate to do the job better, not for the sake of innovation.
- No need is just functional — there are social and emotional aspects too. Indeed, platform engineering is always a sociotechnical endeavor.
Jobs vary, as she explained using everyday life examples, and can be:
- One-off or unexpected — breaking a bone.
- Regular, repeated, or expected — tax season.
- Small — making dinner.
- Big — moving house.
An app team will (hopefully) adopt an internal developer platform to get the job of operations done.
“When it comes to internal platforms, we need to learn about what jobs our customers — these application developers — need to achieve and…how they want to build and operate their software,” Bangser said. The team needed to ask: Why are you creating an AWS account?
Like all good customer relationships, this kicks off with a conversation, taking maybe 15 minutes out of your day.
“You’re building a relationship. Show that you care about what they’re trying to do, and actually care about what they’re trying to do because they are your customers, right? You don’t want to tell them they’re right or wrong. You don’t want to problem-solve with them. You just want to hear from them.” — Abigail Bangser, Syntasso
They realized that different app teams had different jobs to get done which led them to open an AWS account. They could be splitting teams as they scaled beyond two-pizza-sized. Some projects wanted to get to production more quickly. Some wanted to duplicate a project to launch in a new country. Others wanted to support more authentication options across all products.
Bangser and her team realized that an AWS account was a means to an end: “They didn’t really know why they needed an account, when all they actually needed was a source of documents in S3 [AWS cloud object storage], or all they actually needed was access to a server and that was all they wanted.”
They also realized that teams were still circumventing the platform-led path to cloud access because Jira intimidated them. It was easier to copy an old ticket, to put the charges on a personal credit card and file for reimbursement, or to reach out directly to a friend on the platform team for help getting the job done.
“They needed services, not accounts.” She explained that these app developers, “weren’t pros at using a cloud. They wanted to have this [process] much more approachable and much more usable for their use cases.”
Prototyping without Judgment
The platform team started brainstorming solutions.
“We had lots of big visions and ideas for where the platform would go,” Bangser said, “but finance was on a time budget and we had to get a solution ASAP.”
They settled on eight possible solutions:
- Simplify Jira.
- Create a Slackbot interface.
- Create a buddy system where existing confident users become mentors to newer users.
- Platform team offers the services.
- Change pull request services.
- Pair programming with app teams and platform engineers.
- Build configuration templates.
- Absorb accounts.
Then they compared the solutions, considering cost versus value in terms of improving the platform offering. They settled on simplifying Jira as a way to balance the need to quickly appease finance with the need to invest in clearer interfaces for the application teams.
“Except Jira can be really hard to automate,” she said, to a knowingly chuckling audience. “We ran into the same problem as we run into with any code, which is that we want fast feedback but that’s not always easy to get.” That is because code is expensive and takes time to write, test, implement and maintain, she explained, which is especially costly if you’re not sure it’s what your customers want.
“You have to give people something tangible to look at to get realistic feedback. If you tell them an idea, they either are sort of tuned out or they don’t really know how to respond to you,” she continued.
Rapid Feedback De-Risks Decisions
So they made a Jira prototype. And the feedback was that, in their bid to simplify, they were actually creating more stress. The prototype simply asked developers to identify an account type, but it didn’t explain what that even meant and the app teams — who knew both the organizational constraints and the tool’s complexity — didn’t even believe it was an accurate depiction of reality.
They went back to their ideation finalists, accepting that they’d have to go with something that would cost more time and money. They went for the chatbot. As Bangser clarified in an interview, “This allowed for more distance from the existing interfaces that caused friction as well as a more interactive feedback loop for users when making requests.”
This second idea got positive feedback, so they went ahead and implemented the chatbot-based platform engineering solution.
“Talk to your customers and try and get faster feedback loops,” she reminded the CIVO Navigate audience. “Platforms are viewed as mandatory, even if they shouldn’t be. Products don’t get feedback if they’re mandatory.”