CyberArk Decreases Cognitive Load with Platform Engineering
When a software company goes from an on-premise product to a cloud-based service, it’s not just the customers that have a different experience. The internal developer experience completely changes, too. Suddenly, on top of their domain-specific duties, your developers have to learn cloud computing. This adds to the cognitive load on already overworked, burned-out teams.
Platform engineering is an emerging discipline formed to serve those developer teams: to develop best practices that help them in their move to the cloud, and to abstract away the rest. These engineers work with and learn from their developer customers and build a platform that helps these developers do their jobs more efficiently.
The New Stack talked to Ran Isenberg, principal software architect in the platform engineering group at identity security SaaS provider CyberArk, about how one company built a platform engineering team and what they learned along the way.
How to Start a Platform Engineering Team
In the early days, if one of CyberArk’s services wanted to move from an on-premise offering to Software as a Service (SaaS), its team had to independently research cloud computing and invent its own pathway. This, Isenberg explained, “created organizational waste and different solutions and different onboarding and logins for customer experience.” It also meant a lot of extra work for those newly minted SaaS teams.
This simply could not scale alongside the growth expected from releasing a self-service SaaS.
Suddenly there was a demand for a consistent user experience across CyberArk not just from external customers but from internal customers as well. “We started to create SaaS applications and we realized we wanted to have a better, more unified experience for the customers and also internally,” he said.
This wasn’t about gaining control over developers’ day jobs, but about reducing the cognitive load of having to learn things that have nothing to do with their day-to-day business domain, like:
- Cloud migration
- Cloud security
- Tenant management
- Testing methodologies in the cloud
- CI/CD practices
Therefore the new platform engineering team decided to prioritize anything that involved duplicate work or created a different experience for the end users, the internal users or the support staff.
Whenever there were two CyberArk services developing the same capability to achieve their business goals, the platform engineering team would initiate a conversation with those stakeholders to see how it could help.
One emergent platform engineering use case was storing, encrypting and fetching service audit trails. Audits represent a log of service events such as “user X connected to machine Y and performed action Z.” All CyberArk services send audits, so instead of each service creating its own internal audit service, they can all use the central, platform-maintained audit service.
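To make the idea of a shared audit trail concrete, here is a minimal Python sketch of what an audit event and a central client might look like. Everything in it is an assumption for illustration: the names `AuditEvent` and `InMemoryAuditClient`, the field names, and the in-memory list that stands in for encrypted storage are hypothetical and are not CyberArk's actual API.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


def _utc_now() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit record: "user X connected to machine Y and performed action Z"."""
    user: str      # X
    target: str    # Y
    action: str    # Z
    service: str   # which service emitted the event (assumed field)
    timestamp: str = field(default_factory=_utc_now)


class InMemoryAuditClient:
    """Stand-in for a central audit service.

    A real service would encrypt and persist events; this sketch only
    keeps serialized events in a list so the flow can be followed.
    """

    def __init__(self) -> None:
        self._events: List[str] = []

    def send(self, event: AuditEvent) -> str:
        """Serialize the event to JSON and store it."""
        payload = json.dumps(asdict(event), sort_keys=True)
        self._events.append(payload)
        return payload

    def fetch(self, service: str) -> List[dict]:
        """Return all stored events emitted by the given service."""
        decoded = (json.loads(raw) for raw in self._events)
        return [event for event in decoded if event["service"] == service]


if __name__ == "__main__":
    client = InMemoryAuditClient()
    client.send(AuditEvent(user="X", target="machine-Y", action="Z", service="vault"))
    print(client.fetch("vault"))
```

The point of the sketch is the shape of the arrangement: every service calls the same `send` with the same event structure, so storage, encryption and retrieval only have to be solved once.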
CyberArk’s platform engineering team looks at their colleagues, the internal developers, as their primary customers, building for them a SaaS experience across the internal services they use.
Platform engineering duties include providing:
- Seamless login
- Unified organization and discoverability of services
- Software development kits (SDKs)
- Tenant isolation
The platform team became the de facto cloud engineering team, defining the cloud best practices for the whole company and setting up self-service training. This cloud platform engineering team set up a self-service template that allows developers to create their own service with all the tools they need, including a CI/CD pipeline and logging, all with the platform SDKs.
“It saves them a lot of time, reduces cognitive load,” Isenberg said. “We just solve for them all the issues they don’t have to worry about.”
Platform Engineers Have to Prove Themselves
“We had to prove to the org that we are going to provide good value and you should integrate in our work,” Isenberg said. In the beginning, the platform engineering group was an experiment unto itself, made of just 15 teammates.
As a way to garner early feedback, some of the developers on the platform team were tasked not with building the platform, but with building a service on top of the new platform — “to prove it, to have the first customer as part of the group,” he explained. After about 18 months, those developers moved away from the platform team and became their own service.
Now the platform team has grown to more than 40 engineers, serving more than 700 developers, but Isenberg admitted that it wasn’t always smooth sailing. “It took a lot of time to get approval and a lot of bumpy rides to integrate with the other solutions,” but eventually, by treating their developer colleagues as customers, they became the golden path — the way for services to be built moving forward.
“At some point, we got the golden path certificate and got the approval that people use us. Now, every new service goes through us.” But, he emphasized, “It’s something that we still need to maintain,” asking for continuous feedback, sending questionnaires, and working side by side with development teams. “If people don’t use the platform, I’m not just not doing my job, they will just [build] their own version.”
A Golden Path with Guardrails
As we are hearing from a lot of platform engineering teams, this so-called golden pathway is lined more with guardrails than gates. You’re led on a path, but you have the freedom to diverge.
CyberArk’s internal platform is paved in Python, which Isenberg contends is great for the cloud. “You can still use other languages — you wouldn’t be able to use all the internal SDKs but you could use some. Platform engineering provides you with a lot of freedom. But you benefit more if you’re using Python because we’re more focused on it. But if you decide one day for Java or Go, we can create adaptations for other languages,” he said.
Similarly, the internal platform includes a tenant management service, which, in order to access, you are required to implement an API and create a tenant, “but I have no idea what it means for you to create a tenant, it’s a black box from my perspective, you can use serverless technology or something else. As long as you implement an API that you are required to, that’s an abstracted way you have more freedom there,” Isenberg further explained, saying his internal customers can even use a publish-subscribe pattern if they want.
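The "implement the required API, and the rest is a black box" arrangement can be sketched in Python as an abstract interface. This is only an illustration under stated assumptions: `TenantAware`, `TenantManager`, `ReportingService` and the `create_tenant` signature are hypothetical names, not the actual contract of CyberArk's tenant management service.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class TenantAware(ABC):
    """Hypothetical contract a service implements to work with tenant management."""

    @abstractmethod
    def create_tenant(self, tenant_id: str) -> None:
        """Provision whatever this service needs for a new tenant."""


class TenantManager:
    """Stand-in for the platform's tenant management service.

    It only knows that each registered service exposes create_tenant();
    how the service provisions the tenant is a black box to it.
    """

    def __init__(self) -> None:
        self._services: List[TenantAware] = []

    def register(self, service: TenantAware) -> None:
        self._services.append(service)

    def onboard(self, tenant_id: str) -> None:
        """Ask every registered service to create the tenant."""
        for service in self._services:
            service.create_tenant(tenant_id)


class ReportingService(TenantAware):
    """Example service; its internals are free-form (serverless or anything else)."""

    def __init__(self) -> None:
        self.tenants: Dict[str, dict] = {}

    def create_tenant(self, tenant_id: str) -> None:
        self.tenants[tenant_id] = {"reports": []}


if __name__ == "__main__":
    manager = TenantManager()
    reporting = ReportingService()
    manager.register(reporting)
    manager.onboard("acme")
    print(sorted(reporting.tenants))
```

In this sketch the manager calls the services directly. A publish-subscribe variant, which Isenberg mentions as an option, would have the manager publish a "tenant created" event that each service subscribes to.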
The platform is built using serverless on Amazon Web Services, with a defined tech stack and a CI/CD pipeline, and comes with a recommendation of what tech internal developers should use. “From there, you can do pretty much what you want,” he explained. If a team fancies experimenting with Kubernetes, they are welcome to, but they are responsible for it. If two teams wanted to use Kubernetes in a similar way, however, the platform team might come in to create a consistent experience that the whole company can build on.
The CyberArk team builds services for its external SaaS customers in the same way. As a cybersecurity company, CyberArk is legally required to offer audit services that are encrypted and maintained for seven years. No matter which service a customer is accessing, the user interface for accessing the audits is the same.
The Challenges of a Platform Engineer
“You need to have the service mentality for internal customers who are much harsher than external customers,” warned Isenberg. “These are people we see in the office in person so it’s more extreme.”
Therein lies the rub of the platform engineering team — your colleagues are your customers. On the upside, it removes a lot of barriers for communication and shortens the feedback cycle.
“Providing good developer experience is not trivial,” he observed, remarking that the team is still working it out. In the beginning, that meant ReadMe pages on GitHub, but, upon observing colleagues trying to use the code samples, the team realized the samples were missing lines, which led to support needs. That has since evolved into fully styled documentation housed on GitHub Pages.
The CyberArk platform engineering team aims to treat its internal customers like a successful open source project would. “We treat our inner code, SDKs, inner services, as internal open source,” he elaborated, with versioning, release notes and a lot of documentation. “That helped us to gain the trust of developers.”
Isenberg’s own role has evolved over the last 20 months into an internal developer advocate.
Another challenge of the platform engineering team is to avoid becoming a bottleneck “because everyone needs something from us and they needed it yesterday,” Isenberg commented. After all, as the internal platform adoption grew, so did the requests. Not being able to do it all, they’ve begun accepting internal code contributions. Once merged, the platform team maintains it — “You create it, you contribute it, we review it and maintain it,” he explained.
They’ve also adopted inner sourcing. If there is a new service to be built, that team can join the platform team for a few months to build it platform native, after which the platform engineering team maintains it.
Overall, platform engineering influences an important cultural change that facilitates accelerated knowledge sharing. “The managers know that when they do that they help themselves but also the rest of CyberArk,” Isenberg said.