How Platform Engineering Can Help Keep Cloud Costs in Check

Unlock long-term cloud sustainability with a proactive approach that includes developers.
Sep 11th, 2023 10:18am by
Featured image from Facets Cloud.

This is the fifth part in a series. 

Picture being in a never-ending cycle of cloud costs that keep piling up, no matter what you do. You’re in good company. Most businesses are stuck in the same loop, using the same old audits and tools as quick fixes. But let’s face it, that’s just putting a Band-Aid on a problem that needs surgery.

Now, we all know audits and quick reviews are essential; they’re like the routine checkups we need to stay healthy. But when it comes to cloud costs, those checkups aren’t enough. They might spot the immediate problems, but they rarely dig deep to find the root cause. It’s time to think longer term.

Instead of just putting out fires, why not prevent them in the first place? A more sustainable approach to managing cloud costs is to focus on building an efficient system from the ground up. This isn’t about quick fixes; it’s about laying a strong foundation that prevents issues down the road.

Good news: Platform engineering is still a young discipline, which gives the people building platforms an opportunity to help you do exactly that. Think of it as designing your new toolkit for smarter, more efficient cloud management. With platform engineering, your team gets access to high-level tools that go beyond patching holes. They help you map out a well-planned route through the confusing world of cloud costs.

Attempted Solutions and Reactive Approaches

The moment the cloud cost alarm bells start ringing, specialized centralized teams or “war rooms” are often created to manage the problem. These teams look closely at cost reports, figure out which department is spending too much, and then tell that department to cut back. Here’s how it typically goes down:

  • By audit: Relying on audits to identify areas of excessive spending. Continuous audit cycles are used to understand and potentially optimize cloud costs. It’s often seen as a never-ending process.
  • Manual oversight: The centralized team is responsible for scrutinizing cost dashboards, identifying responsible teams for various infrastructure parts and informing them to take corrective action.
  • Project tracker: A project tracker is created to monitor the cost-reducing activities and to keep all stakeholders updated.
  • Tools and anomaly detection: Specialized tools that offer better analysis and anomaly detection capabilities are deployed, with some even allowing automated actions.
  • Ops team responsibility: Typically, the operations team carries the burden of cost management, but it is often lean and already overburdened with other critical tasks.

The problem? All of these steps are more reactive than proactive, and prone to toil. They focus on trimming existing costs — often described as cutting the fat — rather than building a cost-efficient system from the start. The result is a strategy that’s more about short-term gains than long-term sustainability.

Further, in the world of cloud native apps, Ops teams alone can take optimizations only so far. Service and architectural enhancements by application developers deliver the biggest results in the long run. But the system today isn’t inclusive enough to bring developers into the process.

So, how do we break this cycle? By shifting the focus from immediate cost-cutting to long-term financial health. That means adopting strategies that don’t just react to problems as they arise but prevent them from happening in the first place.

Platform Engineering: The Linchpin

This is where platform engineering comes in. The platform engineering team is responsible for laying down a path that not only lets developers own their costs but also keeps costs under control by design. Here’s how platform engineering contributes to cloud cost sustenance:

Sharing ownership and accountability: Platform engineers need to let go of sole control over costs and instead create a collaborative experience in which developers share ownership.

Building cost-efficient golden paths: The platform engineering team’s first order of business is to lay down golden paths engineered to be cost-efficient from the start. This becomes the playground for developers to experiment and build, but cost control isn’t just nice to have; it’s a must-have.

Providing developer-friendly cost breakdowns: The platform gives developers the tools to see costs broken down in a language they understand. The platform should present a zoomed-in view that allows each development team to see only the costs related to the resources they’re directly managing. This focus helps teams zero in on costs specific to their own projects or services.
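
As a rough illustration of that zoomed-in view, here is a minimal sketch that filters billing line items down to one team and groups them by service and resource. The record layout (`team`, `service`, `resource`, `cost_usd`) and the sample data are assumptions made up for this example, not any cloud provider’s export format.

```python
from collections import defaultdict

# Hypothetical billing line items. The field names are assumptions for
# illustration and do not mirror a real provider's billing export.
LINE_ITEMS = [
    {"team": "checkout", "service": "api", "resource": "compute", "cost_usd": 420.0},
    {"team": "checkout", "service": "api", "resource": "database", "cost_usd": 310.5},
    {"team": "checkout", "service": "worker", "resource": "compute", "cost_usd": 95.25},
    {"team": "search", "service": "indexer", "resource": "compute", "cost_usd": 780.0},
]


def team_cost_breakdown(line_items, team):
    """Return {service: {resource: cost}} using only the given team's items."""
    breakdown = defaultdict(lambda: defaultdict(float))
    for item in line_items:
        if item["team"] != team:
            continue  # other teams' costs stay out of this view
        breakdown[item["service"]][item["resource"]] += item["cost_usd"]
    # Convert nested defaultdicts to plain dicts for easy printing and comparison.
    return {service: dict(resources) for service, resources in breakdown.items()}


if __name__ == "__main__":
    for service, resources in team_cost_breakdown(LINE_ITEMS, "checkout").items():
        print(service, resources)
```

In practice the team attribution would likely come from resource tags or labels; the sketch simply assumes that attribution has already been done.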

Providing smart cost correlation: Understanding the “why” behind the costs is as crucial as knowing the “what.” The platform lets developers tie costs to specific runtime metrics like “utilization” or business metrics like “number of transactions,” paving the way for smarter decision-making.
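
To make the correlation idea concrete, the sketch below ties a monthly bill to one business metric (transactions) and one runtime metric (average utilization). The function names and all figures are hypothetical; a real platform would pull these numbers from its billing and monitoring systems.

```python
def cost_per_transaction(cost_usd, transactions):
    """Unit cost of one business transaction, or None when there was no traffic."""
    if transactions <= 0:
        return None
    return cost_usd / transactions


def idle_spend(cost_usd, avg_utilization):
    """Portion of the spend that paid for capacity that sat unused."""
    if not 0.0 <= avg_utilization <= 1.0:
        raise ValueError("avg_utilization must be between 0 and 1")
    return cost_usd * (1.0 - avg_utilization)


# Hypothetical monthly figures for a single service.
monthly_cost_usd = 1200.0
monthly_transactions = 400_000
avg_cpu_utilization = 0.25

unit_cost = cost_per_transaction(monthly_cost_usd, monthly_transactions)
wasted = idle_spend(monthly_cost_usd, avg_cpu_utilization)

if __name__ == "__main__":
    print(f"cost per transaction: ${unit_cost:.4f}")
    print(f"spend on idle capacity: ${wasted:.2f}")
```

Treating cost multiplied by idle fraction as "wasted" is a deliberate simplification: some headroom is usually intentional, so the number is a conversation starter rather than a verdict.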

Assigning budgets: Setting a budget shouldn’t feel like walking a tightrope. The platform allows teams to set up budgetary guardrails for different resources and activities. If you’re about to go over budget, consider yourself notified or even restricted — keeping costs in check.
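
One possible shape for such a guardrail is sketched below: a check that returns allow, notify or restrict depending on how projected spend compares with the budget. The `Budget` class, the 80% notification threshold and the three-way outcome are assumptions chosen for illustration, not a description of any particular platform.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    NOTIFY = "notify"
    RESTRICT = "restrict"


@dataclass(frozen=True)
class Budget:
    limit_usd: float
    notify_ratio: float = 0.8  # assumed threshold: warn at 80% of the limit


def check_budget(budget, spent_usd, requested_usd=0.0):
    """Decide what to do given current spend plus a newly requested spend."""
    projected = spent_usd + requested_usd
    if projected > budget.limit_usd:
        return Action.RESTRICT  # the request would push the team over budget
    if projected >= budget.limit_usd * budget.notify_ratio:
        return Action.NOTIFY  # still within budget, but close enough to warn
    return Action.ALLOW


if __name__ == "__main__":
    team_budget = Budget(limit_usd=1000.0)
    print(check_budget(team_budget, spent_usd=500.0))
    print(check_budget(team_budget, spent_usd=850.0))
    print(check_budget(team_budget, spent_usd=950.0, requested_usd=100.0))
```

Whether going over budget blocks a deployment or only raises an alert is a policy decision; the sketch separates the decision from its enforcement so either choice can be wired in.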

Preventing leaks: Unused or underutilized resources are the silent budget killers. The platform should be designed to catch these so-called “leakages” early in the software development life cycle, before they start draining your budget.
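
A leak check can be as simple as a scan that flags resources that are unattached or running far below capacity. The sketch below assumes a hypothetical inventory format (`id`, `attached`, `avg_utilization`) and an arbitrary 10% utilization floor; both are illustrative choices rather than recommended values.

```python
def find_leaks(resources, min_utilization=0.10):
    """Return (resource_id, reason) pairs for likely leaks.

    A resource is flagged when it is not attached to anything, or when its
    average utilization falls below min_utilization (an assumed threshold).
    """
    leaks = []
    for resource in resources:
        if not resource["attached"]:
            leaks.append((resource["id"], "unattached"))
        elif resource["avg_utilization"] < min_utilization:
            leaks.append((resource["id"], "underutilized"))
    return leaks


# Hypothetical inventory snapshot; the field names are assumptions.
INVENTORY = [
    {"id": "disk-001", "attached": False, "avg_utilization": 0.0},
    {"id": "vm-017", "attached": True, "avg_utilization": 0.04},
    {"id": "vm-018", "attached": True, "avg_utilization": 0.62},
]

if __name__ == "__main__":
    for resource_id, reason in find_leaks(INVENTORY):
        print(f"{resource_id}: {reason}")
```

Running a check like this in a pre-deployment pipeline or on a schedule is one way to surface leaks before they accumulate on the bill.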

In essence, platform engineering aims to create a symbiotic relationship between developers and their cloud environment. It’s not just about empowering developers; it’s about making them conscientious stewards of their resources. This fosters a culture where cost efficiency and developer freedom coexist, setting clear guidelines for how to manage both effectively.

Developer Responsibilities

In a world powered by platform engineering, treating cost as an afterthought just won’t cut it. Developers need to elevate cost to the VIP status of “first-class citizen” in their sprints, right next to other big-league players like performance and availability.

Be your own landlord: Owning cloud infrastructure, including services and resources, isn’t just a responsibility, it’s a necessity. With ownership comes the imperative of constant vigilance: Developers need to be on top of monitoring both costs and resource use, around the clock.

Budget mastery: Staying within the lines of a coloring book is basic; doing the same with budgets is an art. Developers must stick to the budget frameworks laid out by the platform engineering team, while making sure cost-optimization tasks don’t get pushed to the back burner during sprints.

Business-metrics harmony: Translating cloud costs into business speak is a win-win. Developers should align their resource utilization metrics with tangible business outcomes. Want to know the cost of a single business transaction or operation? That’s the kind of clarity this alignment can offer.

Resource optimization: Don’t let resource “leakages” turn into resource “floods.” Developers should break down the attributed cost to pinpoint and plug these leakages, and to fine-tune the overall resource landscape for optimal efficiency.

Innovation: Many cost-optimization projects are tweaks to a service’s performance or architecture, and even small ones can deliver outsized results.

Keep the dialogue going: A fruitful partnership with the platform engineering team isn’t a one-off event; it’s an ongoing conversation. Developers should keep the lines of communication open to continuously refine tools, metrics and best practices for sustainable cloud management.

By taking ownership of these responsibilities, developers aren’t just lightening the load on the Ops team; they’re stepping up as co-pilots in navigating the cloud cost landscape. It’s a team effort aimed at achieving a leaner, more efficient cloud without compromising on performance or possibilities.

In a Nutshell

Criteria | Cloud Optimization by Audit | Cloud Sustenance
Objective | To reduce immediate costs through audits and one-time actions. | To maintain a sustainable, cost-effective architecture by design.
Methodology | Audit-based, reactionary measures taken after costs have escalated. | Planning and a set of practices and mechanisms for long-term sustainability.
Primary Responsibility | Centralized team or Ops team usually handles this through audits and dashboards. | Both platform engineering teams and development teams are responsible for cost management.
Impact | Short-term cost reduction. | Long-term efficiency and cost-effectiveness.
Continuity | Generally a recurring but isolated exercise. | Integrated into development sprints and long-term planning.

While audit-based cloud optimization might offer a rapid-fire way to trim costs, let’s be honest — it’s a reactive, temporary solution mostly overseen by operations teams. And because it often sprawls across the entire cloud, pinpointing who’s responsible for what in the cost-saving equation can get muddled.

On the flip side, cloud sustenance is a proactive, long-game approach that zeroes in on specific projects, distributing cost responsibilities across developers, platform engineers and operations.

While the journey toward sustainable cloud management needs everyone on board, the upfront time and resources invested pay off big time. We’re talking about a cloud ecosystem that’s built for long-term efficiency and resilience. So why not invest a little more now for peace of mind later?

This is part of a series on platform engineering.

TNS owner Insight Partners is an investor in: Pragma.