Grafana Wants to Help You Avoid Getting Dinged by Kubernetes Costs
Grafana aims to help prevent cloud native and Kubernetes-related cost overruns, which often result from unused resources, commonly referred to as “zombie resources.”
This issue is largely due to a lack of visibility and observability, especially in complex and distributed Kubernetes environments and clusters. The challenge becomes even more significant with third-party cloud offerings for Kubernetes, which can strain the budgets of startups. Managing resources with established cloud vendors like Amazon, Microsoft and Google Cloud can be a daunting task in a Kubernetes environment. After all, cloud costs keep rising, many users have complained about feature removals and price increases for cloud features, and in many instances organizations have moved resources off the cloud to save costs.
Needless to say, we need better cost-utilization and resource management strategies. Grafana has, to that end, introduced a solution by consolidating various resources into a single panel within the renowned Grafana interface, labeled “Kubernetes Monitoring.” This platform simplifies the monitoring of Kubernetes cluster usage, infrastructure, and hardware, offering detailed time-series metrics on historical usage and future projections.
Grafana is adding this capability while adhering to its practice of integrating popular open source tools; Prometheus, for example, plays a huge role in Grafana usage. Grafana’s Kubernetes Monitoring relies heavily on the open source project OpenCost, which was created to provide enhanced visibility and resource allocation for Kubernetes clusters. This includes the analysis and monitoring of historical usage, a practice that Grafana implements with several other open source tools. Grafana uses OpenCost to integrate these features and additional functionalities into its Kubernetes Monitoring panel as part of its Grafana Cloud offering.
“Amazon, Microsoft and Google Cloud are not all too motivated to help organizations optimize their cloud spend,” Torsten Volk, an analyst at Enterprise Management Associates, said. “The integration of OpenCost allows Grafana to offer the visibility and insights needed to optimally size Kubernetes clusters based on current and projected resource requirements while also taking into account current contracts, fixed fees, billing granularity, dependency on other resources, data transfer cost, and discounts, e.g. for the use of spot instances for certain workloads. Combined with its Prometheus integration, the integration of OpenCost allows Grafana to deliver a near turn-key experience for Kubernetes monitoring and visibility.”
Grafana clearly considers OpenCost an important project, which explains why it recently joined the OpenCost community as a contributing partner. Since its first commit to the project late last year, Grafana has regularly contributed upstream to OpenCost, in parallel with the improvements it has made to the K8s Monitoring solution, Richard Lam, Grafana’s director of product management, told The New Stack.
“So rather than going out and build something completely proprietary that’s locked in, just to look like any other competitor out there — why not deliver on top of something that’s already amazing, like OpenCost?” Lam said. “It’s a proven solution that we ourselves use internally to monitor our own costs, especially in the earlier days before we rolled out the K8s Monitoring solution for Grafana Cloud.”
While Grafana’s Kubernetes Monitoring solution is exclusive to Grafana Cloud (including the free tier), there are also other ways “we can give back to the community,” Lam said. “This means that users can still get a taste of what we offer via Grafana Cloud, if they just want to roll their own stack, like with Mimir, Loki, and Tempo,” he added.
Less Scale-out Fear
One of the most challenging aspects of scaling with Kubernetes, especially when working with cloud vendors, is cost prediction. Grafana Kubernetes Monitoring addresses this issue by providing cost estimation capabilities through cluster provider information and customer-specific data — before you hit that deploy button.
“Everyone who has been burned by much higher than expected costs for the EKS, AKS, or GKE clusters knows that estimating Kubernetes cost on the ‘back of an envelope’ is risky business,” Volk said. “This is because these Kubernetes cluster vending machines are not at all optimized to dispense turnkey clusters at a fixed rate, but come with lots of dependencies on compute, storage, and networking resources that are all billed separately. By providing DevOps teams with ‘all inclusive’ cost estimates for their specific use cases, Grafana’s Kubernetes management dashboard might take out a lot of the cost risk that comes with adopting Kubernetes at scale.”
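To make Volk's point about separately billed resources concrete, here is a minimal Python sketch of an "all inclusive" monthly estimate. It is only an illustration, not how OpenCost or Grafana computes costs. The `ClusterPlan` fields, the `monthly_cost_breakdown` helper and every price shown are hypothetical placeholders rather than real vendor rates.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


@dataclass
class ClusterPlan:
    """Hypothetical inputs for a rough estimate; all rates are placeholders."""
    node_count: int
    node_hourly_usd: float
    control_plane_hourly_usd: float
    storage_gb: float
    storage_gb_month_usd: float
    egress_gb: float
    egress_gb_usd: float
    spot_fraction: float = 0.0   # share of nodes running as spot instances
    spot_discount: float = 0.0   # assumed discount on those nodes (0.6 = 60% off)


def monthly_cost_breakdown(plan: ClusterPlan) -> dict[str, float]:
    """Sum the separately billed components into one monthly estimate."""
    on_demand_nodes = plan.node_count * (1 - plan.spot_fraction)
    spot_nodes = plan.node_count * plan.spot_fraction
    # Spot nodes are billed at the discounted hourly rate.
    effective_nodes = on_demand_nodes + spot_nodes * (1 - plan.spot_discount)
    breakdown = {
        "compute": effective_nodes * plan.node_hourly_usd * HOURS_PER_MONTH,
        "control_plane": plan.control_plane_hourly_usd * HOURS_PER_MONTH,
        "storage": plan.storage_gb * plan.storage_gb_month_usd,
        "egress": plan.egress_gb * plan.egress_gb_usd,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


if __name__ == "__main__":
    plan = ClusterPlan(
        node_count=3, node_hourly_usd=0.10, control_plane_hourly_usd=0.10,
        storage_gb=100, storage_gb_month_usd=0.10, egress_gb=50, egress_gb_usd=0.09,
    )
    for item, usd in monthly_cost_breakdown(plan).items():
        print(f"{item:>14}: ${usd:,.2f}")
```

Even in this toy example, the control plane, storage and data transfer lines add a noticeable amount on top of compute, which is the kind of item a back-of-the-envelope estimate tends to miss.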
According to Grafana, Kubernetes Monitoring uses Grafana’s machine-learning plugin to offer the following predictive capabilities:
CPU and memory prediction to help ensure resources are available during spikes in resource usage and help you decrease the amount of unused resources due to over-provisioning.
Predict Mem Usage: Shows a predictive graph for memory usage one week in the future. Calculations are based on metrics from the previous week.
Predict CPU: Shows a predictive graph for CPU usage one week in the future. Calculations are based on metrics from the previous week.
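The article does not say which model the machine-learning plugin uses. As a rough illustration of forecasting one week ahead from the previous week's metrics, the Python sketch below uses a seasonal-naive forecast, which simply assumes next week repeats last week's hourly pattern. This technique is swapped in for illustration and is not Grafana's method. The function names, the 20% headroom default and the synthetic data are all assumptions.

```python
from statistics import mean

HOURS_PER_WEEK = 7 * 24


def forecast_next_week(last_week: list[float]) -> list[float]:
    """Seasonal-naive forecast: next week repeats last week's hourly samples."""
    if len(last_week) != HOURS_PER_WEEK:
        raise ValueError(
            f"expected {HOURS_PER_WEEK} hourly samples, got {len(last_week)}"
        )
    return list(last_week)


def suggested_capacity(forecast: list[float], headroom: float = 0.2) -> float:
    """Capacity covering the forecast peak plus a safety margin for spikes."""
    return max(forecast) * (1 + headroom)


def unused_fraction(provisioned: float, forecast: list[float]) -> float:
    """Share of provisioned capacity expected to sit idle on average."""
    return 1 - mean(forecast) / provisioned


if __name__ == "__main__":
    # Synthetic CPU usage: 6 cores during a daily 8-hour busy window, 2 cores otherwise.
    last_week = [6.0 if 9 <= hour % 24 < 17 else 2.0 for hour in range(HOURS_PER_WEEK)]
    forecast = forecast_next_week(last_week)
    print(f"suggested capacity: {suggested_capacity(forecast):.1f} cores")
    print(f"idle share at 16 provisioned cores: {unused_fraction(16, forecast):.0%}")
```

Comparing the suggested capacity with what is actually provisioned shows both sides of the trade-off described above: enough room for spikes, and a visible number for how much capacity sits unused.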
“The ability to predict costs before deployment can help avoid over-provisioning or over-licensing and save you not just money, but time and resources. Essentially, it gives you the opportunity to make better data-driven decisions around resource allocation, scaling strategies, and technology investments,” Grafana’s Lam said.
While ReveCom has yet to test it, Grafana’s ambition with the Kubernetes Monitoring tool is to offer a comprehensive solution to manage and monitor Kubernetes resources effectively, addressing the complexities of cost management in cloud native environments.
“Cloud native cost management is so tricky due to the distributed character of microservices applications and public cloud infrastructure,” Volk said. “I know from my own experience how easy it is to forget to add a significant cost factor to an estimate and then being surprised when the monthly bill rolls around. If Grafana makes these estimates reliable, this would be a big deal.”
The lines between what’s important to monitor are starting to “blur as time goes on,” Lam said. “In traditional observability, we think a lot about system health, performance and reliability. While all of this is important, topics like cost have started to rise to the top in meteoric fashion, especially due to the recent macroeconomics and a greater emphasis on tightening our budgets,” he said. “As a result, we see a lot more individuals and groups at companies, beyond just R&D, who want to monitor how services are doing and to track where all of the money is going. Questions asked include: What’s your most expensive service to run? Are you over-provisioning a ton of K8s clusters that don’t necessarily need to be that large or up for that long? How can we get smarter about how we’re using our resources, and effectively, our money?”