How Platform Engineering Supports SRE
When people in the tech industry discuss platform engineering, they often focus on the benefits for developers. Speed up build and delivery! Lighten that cognitive load! Simplify and standardize tooling choices! Eliminate the need to delve deep into the mysteries of Kubernetes!
But creating an internal developer platform (IDP) can improve the work lives not only of frontend developers but also of engineers who work on the backend. By automating repetitive tasks, the platform team frees both itself and its organization’s site reliability engineers (SREs) from that work.
“The aim is, the operations team will spend less time troubleshooting infrastructure issues, and allow them to concentrate on optimizing how their application uses that platform to improve the reliability, security and performance,” said Martin Parker, a solutions architect for UST, a global digital transformation solutions provider, in this episode of The New Stack Makers podcast.
When an IDP is deployed well, Parker told our audience, “A platform engineer can support the SRE teams … you’ll see a reduction in incidents and tickets.”
SREs, then, will have more time to maintain systems and look for improvements rather than putting out fires, which adds value to their organizations.
A key element of an IDP is a standardized observability and monitoring solution for that platform, said Parker: one that follows best practices and should align with feedback from the organization’s SREs.
Like a company’s developers, SREs are internal customers of the platform engineering team.
For instance, Parker said, “Say, we’re building a Kafka service or a Kubernetes service. The platform engineering team is basically providing the service level indicators, the SLAs,” which enable the SREs to easily perform their role.
The SRE team, he said, “can build on top of those SLAs and [do] what they need to do to provide that reliability — and the KPIs around that platform reliability.”
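To make the idea of building reliability targets on top of platform-provided indicators concrete, here is a minimal sketch in Python. It is not drawn from the podcast: the names `ServiceLevelObjective`, `availability_sli` and `error_budget_remaining`, the 99.9% target and the request counts are all assumptions chosen for illustration. It assumes the platform exposes counts of successful and total requests, and that the SRE team expresses its target as a fraction of successful requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevelObjective:
    """Hypothetical reliability target an SRE team sets on top of a platform indicator."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of requests should succeed


def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of successful requests; treated as 1.0 when there is no traffic."""
    if total_requests == 0:
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(
    slo: ServiceLevelObjective, good_requests: int, total_requests: int
) -> float:
    """Share of the error budget left: 1.0 is untouched, 0.0 or below is exhausted."""
    if total_requests == 0:
        return 1.0
    actual_failures = total_requests - good_requests
    allowed_failures = (1.0 - slo.target) * total_requests
    if allowed_failures == 0:
        # A 100% target leaves no budget: any failure exhausts it.
        return 1.0 if actual_failures == 0 else 0.0
    return 1.0 - actual_failures / allowed_failures


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from any real service.
    slo = ServiceLevelObjective(name="kafka-produce-availability", target=0.999)
    sli = availability_sli(good_requests=999_500, total_requests=1_000_000)
    budget = error_budget_remaining(slo, good_requests=999_500, total_requests=1_000_000)
    print(f"{slo.name}: SLI={sli:.4%}, error budget remaining={budget:.0%}")
```

In a setup like this, the platform team would own the request counters and the SRE team would own the target and any alerting on the remaining budget. That division of responsibility is one plausible reading of Parker's description, not a prescribed design.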
One of the biggest challenges SREs face, Parker said, is getting the right telemetry from their observability tools. Without standards, each SRE team may end up building its own solution in its own silo.
Platform engineering, he said, helps them avoid that scenario by “establishing standardized practices, automating operational tasks, and implementing the tools for effective monitoring and alerting, and also incident response.”
Check out the full episode to hear about best practices for supporting SRE with platform engineering.