
Engineer’s Guide to Cloud Cost Optimization: Prioritize Cloud Rate Optimization

FinOps can get complicated fast. To help teams prioritize, it’s a good idea to break down cloud cost optimization into two different universes: engineering and finance.
Sep 27th, 2023 8:00am
Feature image by Alexander Stein from Pixabay.
The following is the third part in a three-part series on cloud cost optimization. Read part one here, and part two here.

More than 60% of Amazon Web Services users’ cloud bills come from compute spend, via resources in Elastic Compute Cloud (EC2), Lambda and/or Fargate. So it makes sense to prioritize optimization in compute — this is where you’ll realize savings most efficiently.

It’s possible to reduce compute costs by more than 40%, while reducing the overall AWS bill by 25%, using Reserved Instances (RIs) and Savings Plans. The trick is to secure the appropriate level of commitment. Over-committing means paying for discounts that go unused; under-committing leaves usage exposed to on-demand rates. Both produce suboptimal savings.

Commitment Management Is Complex; Focus on Effective Savings Rate

Even though discount instruments are designed to produce savings, companies need to choose the right instrument for workloads to fully realize those savings.

All discount instruments have benefits, tradeoffs and specific rules. This chart illustrates the elements:

How do these elements relate to cloud savings? They can be easy to misinterpret. That’s why optimizing for Effective Savings Rate (ESR) is a recommended best practice. ESR simplifies rate optimization; it focuses FinOps teams on one metric that reveals the savings outcome.

ESR is the percentage of discount being received. It is calculated by dividing the savings realized through discounts (like RIs and Savings Plans) by the amount that would have been spent via on-demand pricing.

Because utilization, coverage and discount rate are part of the calculation, it produces a consistent measure of savings performance and a reliable benchmarking metric.
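As a minimal sketch of the calculation described above, the snippet below computes ESR as savings divided by on-demand equivalent spend. The function name and the dollar figures are hypothetical, chosen only for illustration.

```python
def effective_savings_rate(actual_spend: float, on_demand_equivalent: float) -> float:
    """Return ESR as a fraction: savings divided by on-demand equivalent spend.

    actual_spend should include the cost of any unused commitment, which is
    why poor utilization lowers ESR (and can even make it negative).
    """
    if on_demand_equivalent <= 0:
        raise ValueError("on_demand_equivalent must be positive")
    savings = on_demand_equivalent - actual_spend
    return savings / on_demand_equivalent


# Hypothetical month: usage that would have cost $100,000 on demand
# actually cost $73,000 after discounts.
esr = effective_savings_rate(actual_spend=73_000, on_demand_equivalent=100_000)
print(f"ESR: {esr:.1%}")
```

Because the numerator is total savings rather than the discount on covered usage alone, uncovered usage and unused commitments both pull the number down.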

Best Practice for Cost Optimization: Rate First, then Resource

Because cloud cost optimization is complicated, it’s helpful to organize it in these buckets and track:

  1. How much you are spending by monitoring cloud spend per month
  2. Savings potential by tracking Effective Savings Rate
  3. Waste reduction and other resource optimization strategies, such as re-architecting, tracking untagged or unknown spend, rightsizing, and removing unused or unattached resources

Optimize Rate and Resource at the Same Time with Autonomous Discount Management

Discount instruments contain many moving parts and they are complex to manage manually.

Using automation in an autonomous approach, however, creates a more efficient rate optimization experience. It enables “hands-free” management of cloud discount instruments. With cloud rate optimization being managed in a way that produces consistent incremental savings, engineering teams can focus on innovation and resource optimization synchronously.

Optimize Cloud Discount Rates and Engineering Resources at the Same Time

While prioritizing cloud cost optimization around discount rates is a way to jump-start cloud savings, it’s possible — and preferable — to optimize discount rates and engineering resources at the same time.

When discount rates are managed using algorithms, FinOps gains intra-team efficiency. Not only are the challenges of manually managing discount instruments mitigated, but engineering teams are also freed up to focus on strategic projects and resource optimization. It’s a scenario that maximizes cloud savings.

Why Is an Autonomous Approach Necessary to Optimize Discount Rates?

There are certain jobs that complex algorithms perform better than humans. Cloud cost management is one of them. More specifically, calculations performed by sophisticated algorithms enable a more efficient, accurate and responsive approach: the autonomous management of discount instruments.

The concept of saving money using AWS Savings Plans and Reserved Instances (RIs) appears simple. However, it is challenging to manage and exchange these instruments in a way that provides coverage flexibility. Each instrument contains benefits, limitations and tradeoffs, along with inherent challenges due to infrastructure volatility, commitments and terms.

Automation in the form of algorithmic calculations handles these intricacies efficiently.

Infrastructure Volatility: a Wild Card

Resource usage is dynamic in the cloud, and that movement creates unpredictable patterns whether it manifests as:

  • Increasing and decreasing usage
  • Moving from EC2 instances to Spot
  • Switching between instance families
  • Converting from EC2 to Fargate
  • Moving to various containers

These types of engineering optimizations create volatility in company infrastructure and are challenging to match at scale, particularly when rigid discount commitment terms are in place.

Discount Commitment Rules and Coverage Planning

Two things that might not be obvious about working with discount instruments: optimization efforts can be hindered by commitment rules and challenges in discount coverage planning.

Compute Savings Plans, for example, are applied in a specific order: first, to the resource that will receive the greatest discount in the account where Savings Plans are purchased.

The discount benefit can next float to other accounts within an organization, but Savings Plans are not transferable once deployed in an account. In order to maximize benefits to an organization and centralize discount management, it is a best practice to purchase Savings Plans in an account isolated from resource usage.
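The application order described above can be sketched as a simple greedy allocation. This is a simplified illustration, not AWS's billing logic: the `Usage` record, the `apply_commitment` function, the account names and all numbers are hypothetical, and real Savings Plans follow additional rules not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    account: str
    resource: str
    on_demand_cost: float  # hourly on-demand cost of this usage, in dollars
    discount: float        # Savings Plan discount for this usage, e.g. 0.40


def apply_commitment(usages, commitment, purchasing_account):
    """Greedily spend an hourly commitment (in post-discount dollars).

    Order: usage in the purchasing account first, then other accounts;
    within each group, highest discount first.
    Returns (list of (usage, on_demand_dollars_covered), unused_commitment).
    """
    ordered = sorted(
        usages,
        key=lambda u: (u.account != purchasing_account, -u.discount),
    )
    remaining = commitment
    covered = []
    for u in ordered:
        if remaining <= 0:
            break
        discounted_cost = u.on_demand_cost * (1 - u.discount)
        spend = min(remaining, discounted_cost)
        covered.append((u, spend / (1 - u.discount)))
        remaining -= spend
    return covered, remaining


usages = [
    Usage("workloads", "fargate", 10.0, 0.20),
    Usage("billing", "ec2-c5", 5.0, 0.40),
    Usage("workloads", "ec2-m5", 8.0, 0.50),
]
covered, unused = apply_commitment(usages, commitment=6.0, purchasing_account="billing")
for u, amount in covered:
    print(f"{u.account}/{u.resource}: ${amount:.2f} of on-demand usage covered")
print(f"Unused commitment: ${unused:.2f}")
```

In this made-up example the purchasing account's own usage absorbs part of the commitment before anything floats to the other account. If the purchasing account had no usage of its own, the whole commitment would go to the highest-discount usage across the organization, which is the motivation for buying in an isolated account.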

Coverage commitment planning for Savings Plans, too, is tricky because commitments are made in post-discount dollars — an abstract concept. FinOps teams must quantify upcoming needs (using post-discount dollars) for resources that contain variable discounts. It’s comparable to estimating a gift card amount for products with varying discount rates that will be bought, exchanged or returned.
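To make "post-discount dollars" concrete, here is a small sketch under assumed numbers. The usage labels, hourly costs and discount rates are hypothetical; the point is only that the same on-demand spend can require different commitments depending on the mix of discounts.

```python
def commitment_for(usage):
    """Hourly commitment, in post-discount dollars, needed to cover the usage.

    usage is a list of (label, hourly_on_demand_cost, discount) tuples.
    """
    return sum(cost * (1 - discount) for _, cost, discount in usage)


# Hypothetical steady-state usage with variable discounts.
expected_usage = [
    ("ec2-m5", 8.00, 0.50),
    ("ec2-c5", 5.00, 0.40),
    ("fargate", 10.00, 0.20),
]

# The same on-demand dollars, but all on the most heavily discounted resource.
shifted_usage = [("ec2-m5", 23.00, 0.50)]

for name, usage in (("expected", expected_usage), ("shifted", shifted_usage)):
    on_demand_total = sum(cost for _, cost, _ in usage)
    print(
        f"{name}: on-demand ${on_demand_total:.2f}/hour, "
        f"commitment ${commitment_for(usage):.2f}/hour"
    )
```

Both scenarios represent the same on-demand equivalent spend, yet the commitment that fully covers them differs. That is why a commitment sized for one resource mix can end up too large or too small once engineering changes the mix.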

Rigid Terms and Lock-In Risk

Companies can get locked into commitment terms that end up creating more risk than benefit.

AWS discount instruments, for example, are procured in 12-month or 36-month commitment terms. While Convertible RIs can be exchanged and Standard RIs can be sold on the Reserved Instance Marketplace, Savings Plans are immutable. Once made, these terms cannot be modified. They must be maintained through the end of the term.

Most companies respond to these constraints by under-committing their coverage. In an effort to be conservative and avoid risk, they incur on-demand rates, which are much higher.
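The tradeoff between under-committing and over-committing can be illustrated with a toy model. Everything here is an assumption for illustration: a single flat discount rate, a short made-up series of hourly usage, and a hypothetical `esr_for_commitment` helper. Real portfolios mix instruments with different rates and terms.

```python
def esr_for_commitment(hourly_usage, commitment, discount):
    """ESR for a fixed hourly commitment against variable usage.

    hourly_usage: on-demand equivalent dollars per hour.
    commitment:   post-discount dollars committed per hour (paid even if unused).
    discount:     flat discount rate applied to covered usage, e.g. 0.40.
    """
    # On-demand dollars of usage the commitment can cover each hour.
    capacity = commitment / (1 - discount)
    # Each hour costs the full commitment plus on-demand rates for any overflow.
    actual = sum(commitment + max(0.0, u - capacity) for u in hourly_usage)
    on_demand = sum(hourly_usage)
    return 1 - actual / on_demand


hourly_usage = [60, 80, 100, 120, 100, 80]  # hypothetical $/hour at on-demand rates
discount = 0.40
candidates = (24, 36, 48, 60, 72)

for commitment in candidates:
    esr = esr_for_commitment(hourly_usage, commitment, discount)
    print(f"commitment ${commitment}/hour -> ESR {esr:.1%}")

best = max(candidates, key=lambda c: esr_for_commitment(hourly_usage, c, discount))
print(f"Best of these candidates: ${best}/hour")
```

With these made-up numbers, the lowest commitment leaves most usage at on-demand rates, the highest pays for capacity that sits idle in quieter hours, and the best ESR among the candidates comes from the middle level.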

AWS Discount Instrument Profiles.

Manual Management of Discount Instruments Produces Suboptimal Outcomes

It’s nearly impossible to execute all of these moving parts in a timely manner without a technical assist.

Companies that try to manage discount instruments manually, or via a pure play RI broker, generally wind up with:

  • Discount mismanagement
  • Missed savings opportunities
  • Overcommitment

One media company, for example, was locked into one-year discount rates and did not pursue more favorable three-year rates, so it missed higher discounts. Because discount changes were being performed manually (and therefore at a slower pace), the company also paid for commitments that went unused. With suboptimal coverage and discounts, it paid higher prices for cloud services, missing out on $3 million in potential savings.

This dynamic is very common.

Automation, however, is only part of the solution.

Autonomous Discount Management: Algorithms Enable Hands-Free Optimization

Many automated tools simplify steps or entire process sequences. Most will provide recommendations or a list of actions that require human intervention to implement. While that has value, particularly in saving time, cloud rate optimization (or discount management) is achieved holistically, with automation that performs algorithmic calculations and uses real-time telemetry to:

  1. Recognize resource usage patterns and scale up and down to cover them
  2. Autonomously manage and deploy discounts using a blended portfolio of Savings Plans, Standard Reserved Instances and Convertible Reserved Instances, taking into account the benefits and risks of each instrument
  3. Optimize for savings performance using Effective Savings Rate (ESR) and not for coverage or utilization alone.

Autonomous Discount Management is a hands-free experience that enables synchronous rate and resource optimization. FinOps teams can let the autonomous solution act for them (optimizing cloud costs) while they pursue other strategic tasks; engineering teams can optimize resources at the same time.

How Drift Optimizes Rate and Resources Synchronously

Like most companies, Drift had established internal methods to understand AWS costs and optimization performance. Savings, however, were still elusive because Drift lacked visibility into cost drivers and appropriate tools, and its discount instruments were being managed manually.

Consequently, Drift was receiving a 27% discount with only 57% coverage of discountable resources.

With a tool from CloudZero that provides cost visibility and attribution, and another from ProsperOps that executes autonomous discount management, Drift was able to optimize rates and resources synchronously. Drift’s ESR nearly doubled and more than $2.9 million in savings has been returned to its cloud budget in only a few months.

Drift can now:

  • Identify optimization opportunities and proactively respond to anomalies in real time
  • Review reports in minutes, not hours, with data about costs, usage, coverage, discounts and overall cloud savings performance
  • Understand ESR improvements over time (savings performance and ROI) and their drivers
  • Continue to realize incremental savings

Engineering and finance teams also have more efficient communication and coordination. This is just one example of what is possible when rate and resource optimization is synchronized.

Synchronous Optimization Addresses Top FinOps Challenges

The right tools and a FinOps-supportive culture are key to achieving synchronous rate and resource optimization, and results like Drift’s. It’s a “better together” approach that helps teams resolve key FinOps challenges depicted in The State of FinOps 2023 report, including:

  • Empowering engineers to take action on optimization
  • Getting to unit economics
  • Organizational adoption of FinOps
  • Reducing waste or unused resources
  • Enabling automation

The approach produces greater visibility into costs across the organization, a fast way to reduce costs and create savings, more time for engineering projects and improved intra-team communication.

Start a More Efficient Cloud Cost Optimization Journey with ProsperOps

We believe cloud cost optimization can be best addressed by first implementing autonomous rate optimization and working with respected vendors for specific resource optimization support.

A free savings analysis is the first step in the cost optimization journey; it reveals savings potential, results being achieved with the current strategy and optimization opportunities. To chart a course for maximized, consistent, long-term cloud savings, register for your savings analysis today!

TNS owner Insight Partners is an investor in: Enable, Pragma.