FinOps: How Kubernetes Teams Can Best Work with Finance
VALENCIA, Spain — Around 70% of organizations either only estimate their Kubernetes costs or don’t monitor them at all, while 86% of respondents to the same Cloud Native Computing Foundation FinOps survey said their cloud costs are going up. And when the cloud bills come down the business line, they arrive in Excel.
“You have engineers producing cost by the push of a button, and you still have the old traditional processes” for accounting and governance. That’s how Vanessa Kantner introduced the demand for FinOps — the tech industry’s hottest portmanteau, this time breaking the silos between DevOps and finance teams. She and her colleague Manuela Latz spoke on Day Two of KubeCon + CloudNativeCon Europe 2022, sharing their experiences both at the FinOps and cloud consultancy Liquid Reply and as members of the CNCF’s FinOps Foundation.
Kubernetes hasn’t just created a technical abstraction layer, it’s created one across business and financial transparency. “Kubernetes creates a huge knowledge gap between technical and not technical,” Latz explained, because the finance side “can’t comprehend how the cost dynamics of Kubernetes and the cloud works. But we [engineering teams] need their buy-in.”
While all departments somehow have to tie back to cost, FinOps has evolved as a synonym for cloud financial management. Its goal is to help engineers make data-driven spending decisions in a way that increases the value of Kubernetes for the entire company, Latz explained.
So what is FinOps for Kubernetes, and how can DevOps teams get started with FinOps? Read on.
What Is FinOps for Kubernetes?
Pre-cloud, product managers would manage and approve costs around server allocation. Now engineers are producing the costs. “So you have a lot more people producing the costs, just by the push of a button,” Kantner reminded the KubeConEU audience, “but at the same time, you have the old traditional processes that apply.”
Latz tasked the in-person audience of about 200 to identify the least efficient workload on a cloud bill. Only one attendee could. It was proof that Kubernetes users can’t get around FinOps, because that Excel spreadsheet cloud bill is no longer sufficient for allocating budgets or governing project costs. And, Kantner explained, most of the time the business side doesn’t really know how the cloud works.
Then, Kubernetes creates another abstraction layer on top, furthering the gap between the technical and non-technical colleagues.
“We have the opportunity to build and orchestrate high-performing FinOps teams, and bring DevOps teams from ‘Uck’ about cost management to pretty excited,” Latz said to describe the purpose of her job, because FinOps can’t be achieved without the collaboration of DevOps teams.
The purpose of FinOps becomes to enable data-driven decision-making and better resource utilization in the cloud. Most importantly, it’s about creating transparency and a common language between tech and business, educating the business side on the cost variability of the cloud and Kubernetes.
The cloud financial management practice of FinOps aims to:
- Break silos between DevOps and finance teams.
- Increase business value of the cloud and Kubernetes for the whole company.
- Enable data-based spending decisions.
The first step toward FinOps, Kantner said, is engineering teams to ask the right questions:
- How efficient (or inefficient) is a product or application?
- What are the spending drivers per environment, cluster, namespace or pod?
- Which project has the highest spend?
- What is used when, by what, and is it needed?
When these are answered, then you get to focus on the what, where and with whom of optimizing. “Then that money can be freed up for new innovative features or maybe a new work colleague,” she said.
Monitoring and Labeling for Cost Transparency
Monitoring, Latz explained, is not just about performance, but anything that can be linked to cost metrics. This is a different data source than usual for engineering teams, including:
- Cloud provider bills
- Agreed-upon discounts
- The number of workloads running on your nodes and clusters
“It’s about scenario-building. How much would something cost if I do it on demand? How much — if we were talking about AWS — would the Savings Plan cost?” Latz explained that “Your monitoring should be able to link those metrics because then you can measure not only the efficiency but you can tell how much it costs. This brings you one step closer to transparency.”
There needs to be a link between technical and cost metrics. Technical metrics include:
- Running hours
- Data transfers
- Logs, APIs
- Error messages
Monitoring those things is important, but it doesn’t help answer the key questions, like project spend, until the technical metrics are tied to cost metrics like:
- Total costs
- Reserved coverage
- Unused reservations
- Potential costs
- Pricing models and purchase options
This is where in-code labeling comes in to link the two groups of metrics.
“If you don’t label at the right spot in your configuration template, you cannot monitor costs,” Latz said. She then shared how she labeled the deployment incorrectly, so she had to start all over.
Key-value pairs have to target cost management. The FinOps Foundation offers the following commonly agreed-upon cost keys:
- Business unit
- Cost center
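As a minimal sketch of what that looks like in practice, here is a Kubernetes Deployment carrying those two cost keys as labels. The application name, label values and exact key spellings are illustrative assumptions, not a FinOps Foundation standard:

```yaml
# Illustrative Deployment with cost-allocation labels.
# Key names and values below are assumptions for this sketch;
# agree on your own standardized list with business and finance.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-service
  labels:
    business-unit: ecommerce
    cost-center: cc-1234
spec:
  replicas: 2
  selector:
    matchLabels:
      app: checkout-service
  template:
    metadata:
      labels:
        app: checkout-service
        # Repeat the cost labels on the pod template: the pods are
        # where resource consumption (and thus cost) actually accrues.
        business-unit: ecommerce
        cost-center: cc-1234
    spec:
      containers:
      - name: checkout
        image: registry.example.com/checkout:1.0
```

Cost-monitoring tools can then group spend by these label keys, which is why labeling at the right spot — the pod template, not just the Deployment — matters.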
Just remember, “Cost monitoring, and the cost reporting that comes out of it, is for non-technical people, so you have to keep in mind when you are writing your values that they are comprehensible for non-technical people,” Latz said.
“Because, even if you set up the perfect labeling and you have it running with policies and everything, if no one besides you understands it, then you have the same issue: you’re the only one who knows what runs on your clusters,” she continued.
FinOps, at its crux, comes down to a common understanding across roles of why it matters. Labeling and monitoring create that common language and understanding, while one-off custom labels per team simply block that communication. That’s why the Liquid Reply team suggests keeping a list of labels, with standardized spelling, approved by business and finance, that you include in your processes and documentation.
And when it comes to naming conventions of clusters, whatever monitoring tool you’re using, be sure to not use the same name for two different things. It’s not just about your team.
It’s important to make a habit out of monitoring and labeling, so it’s done automatically as soon as you create something new.
Rightsizing and Waste Management for Cost Control
Cost transparency is important, but it means little without cost optimization. Rightsizing is setting the right amount of resources, like memory or CPU, for your cluster, nodes or workloads, so you can scale without waste. Rightsizing starts with setting resource requests and limits.
By default, Kubernetes doesn’t set any limits, so, as Kantner said, your pods can consume any amount of resources they want. That makes it the task of engineers to set those limits in order to enable autoscaling.
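As a sketch, requests and limits are set per container in the pod spec. The values here are placeholder assumptions; rightsizing means tuning them to observed usage rather than guessing:

```yaml
# Illustrative pod with resource requests and limits.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0
    resources:
      requests:         # what the scheduler reserves for this container
        cpu: 250m       # 0.25 CPU core
        memory: 256Mi
      limits:           # hard ceiling; a container exceeding its memory
        cpu: 500m       # limit is OOM-killed
        memory: 512Mi
```

Requests drive scheduling and utilization-based autoscaling; limits cap how far a single workload can eat into the rest of the node.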
There are three common types of autoscalers to use:
- Vertical Pod Autoscaler – VPA monitors the actual usage of the pod and suggests new values for resource requests, specifically for stateful workloads that just need a bit more for a limited amount of time.
- Horizontal Pod Autoscaler – HPA also monitors actual usage and then adds or removes pod replicas based on target usage, including creating new pods to hit the CPU targets.
- Cluster Autoscaler – CA adds and removes nodes, based not on usage but on the scheduling status of pods. If a pod can’t be scheduled due to resource constraints on your nodes, CA brings up a new node.
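To make the second of those concrete, here is a minimal HorizontalPodAutoscaler sketch targeting 70% average CPU utilization. The Deployment name, replica bounds and target percentage are assumptions for illustration:

```yaml
# Illustrative HPA: adds or removes replicas of a Deployment to hold
# average CPU utilization (relative to the pods' requests) near 70%.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-service   # assumed target workload
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Note that utilization here is measured against the pods’ CPU requests, which is why the rightsizing step above is a prerequisite for sensible autoscaling.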
However, eliminating waste goes beyond autoscalers. A straightforward way to cut costs is adding policies that shut down environments and non-critical workloads when you don’t need them, like developer and test environments during off-hours and weekends. Kantner shared an example showing this isn’t a huge cost-cutter at the individual level, but at $50 a dev environment, it adds up.
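One way to sketch such an off-hours policy natively in Kubernetes is a CronJob that scales a dev Deployment to zero in the evening (with a mirror job scaling it back up in the morning). The schedule, names and image are assumptions, and the RBAC the service account needs is omitted:

```yaml
# Illustrative CronJob: scales a dev Deployment to zero at 19:00
# on weekdays. A second CronJob with "--replicas=1" and a morning
# schedule would bring it back up.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dev-shutdown
spec:
  schedule: "0 19 * * 1-5"        # 19:00, Monday through Friday
  jobTemplate:
    spec:
      template:
        spec:
          # Assumed service account with permission to patch deployments.
          serviceAccountName: scaler
          restartPolicy: OnFailure
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest   # assumed kubectl image
            command: ["kubectl", "scale", "deployment/dev-app", "--replicas=0"]
```

Teams also do this with cloud scheduler services or dedicated tools; the point is that the shutdown is automated policy, not a manual chore.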
“FinOps is getting out of the bubble and seeing the big picture, because when you’re working for an organization, you’re not the only account, you’re not the only project, you’re not the only developer,” Latz said. It’s about the cumulative waste.
The first step, she said, is acknowledging that FinOps is a necessity, and understanding that it’s about enabling data-driven decisions.
The next step, once you’ve established cost transparency and control for individuals and teams, is to identify shared costs and keep improving via automation.
The Liquid Reply team said that there still is no standardization yet, but that FinOps starts by bringing people together, making allies and coming up with a FinOps strategy.
The Rapid Growth of FinOps
FinOps is a rapidly growing discipline in tech organizations of all sizes. When The New Stack spoke to Steve Trask, vice president of marketing at the FinOps Foundation, on the KubeCon floor, he said membership is growing fast, with over 5,600 members and more than 2,000 certified practitioners, drawn from companies including Atlassian, Spotify, Google, VMware and Target.
The FinOps Foundation looks to create standards like the labeling practices. At its core, the foundation is about “coming together as a community to develop the best practices around FinOps because everyone is striving to do better and get more value out of the cloud spend.” Trask went on to explain that the foundation’s job is to help make the needed interactions and connections via community, training and best practices.
Of course, cutting cost doesn’t just save money. It’s proven that cutting your cloud budget also cuts carbon dioxide emissions. However, Kantner warned, not all of FinOps is GreenOps because of the pricing discounts you get even without rightsizing. But rightsizing and autoscaling will always be greener too.