Why StormForge Says ML Can Save Kubernetes Cloud Costs

23 Feb 2022 12:07pm, by

StormForge, a provider of artificial intelligence (AI) solutions, says it offers a practical answer to an enormously complex problem: organizations are challenged not only by managing Kubernetes environments at scale, but also by interpreting the massive amounts of observability data they must process to make key decisions. With Optimize Live, released today, StormForge says DevOps teams can apply its machine learning (ML) tool both to make their operations on Kubernetes clusters more productive through better use of observability data and to reduce cloud-service and other costs as a result.

Optimize Live does this by using ML to analyze existing observability data and recommend real-time configuration changes that reduce resource usage and cost while ensuring application performance. Observability data typically provides DevOps teams with “a lot of data and a lot of visibility into what’s going on, which is really important, but it doesn’t necessarily give them actionable insights” of the kind StormForge Optimize Live offers to improve application performance and resource allocation, Rich Bentley, vice president of product marketing for StormForge, told The New Stack.

“Once you put the code in production, Optimize Live just constantly watches the observability data, detects when it’s time to make changes, makes the recommendations and then applies them,” Bentley said.

Deluged with Observability Data

Given the complexity of managing Kubernetes clusters, often at scale, so that the environment remains operational and applications run as they should, ensuring that resources are allocated properly can sometimes take a backseat to improving infrastructure performance. Meanwhile, DevOps teams can be deluged with observability data. In theory, that data could be used to better manage the utilization of CPU, memory and other resources, but organizations very often lack the resources and know-how to interpret it.

Optimize Live’s ML was designed to process this data and supply the support and know-how that DevOps teams typically lack in-house, communicating the “actionable results” needed to make decisions in a straightforward way, Bentley explained during a briefing with The New Stack.

Optimize Live’s ML algorithms process observability data from Kubernetes environments and provide actionable results on an ongoing basis, since resource allocation and other decisions will change as operations are altered or scaled. “Even if you optimize your environment for today, that doesn’t mean it’s going to stay optimized nine months from now. This is where, by leveraging tools like StormForge, you’re able to not only reduce risk because you’re able to ensure your apps get what they need, but you’re able to also not overly burden your people,” Scott Sinclair, an analyst for ESG Global, told The New Stack. “You’re also able to say ‘we’re reducing the amount of budget taken up by our existing production apps and that frees up more opportunity for us to expand and grow and do new digital initiatives, which in the digital economy, is going to drive more revenue and make everybody more money.’”

Optimize Live is more nuanced than an ML tool that simply tells teams which selections to make when managing operations for Kubernetes environments. Instead, the tool presents a range of choices on a chart that weighs performance against cost for each configuration. The options range from configurations that are cheap to maintain but perform poorly to those that are high-cost and high-performance.
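To make the idea of a performance-versus-cost chart concrete, here is a minimal Python sketch of how such a set of choices could be derived from candidate configurations. It is an illustration only, not StormForge's method: the `Candidate` type, the metric names and the sample numbers are all hypothetical. It keeps only the configurations that no other candidate beats on both cost and latency, which are the points that would sit along the trade-off curve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical configuration with two metrics; lower is better for both."""
    name: str
    monthly_cost_usd: float
    p95_latency_ms: float


def pareto_front(candidates):
    """Return the candidates that are not dominated on cost and latency."""
    front = []
    for c in candidates:
        dominated = any(
            o.monthly_cost_usd <= c.monthly_cost_usd
            and o.p95_latency_ms <= c.p95_latency_ms
            and (
                o.monthly_cost_usd < c.monthly_cost_usd
                or o.p95_latency_ms < c.p95_latency_ms
            )
            for o in candidates
        )
        if not dominated:
            front.append(c)
    # Sort from cheapest to most expensive, as on a cost axis.
    return sorted(front, key=lambda c: c.monthly_cost_usd)


# Made-up sample data for illustration.
candidates = [
    Candidate("small", 100.0, 400.0),
    Candidate("medium", 220.0, 180.0),
    Candidate("wasteful", 300.0, 200.0),  # costs more and is slower than "medium"
    Candidate("large", 450.0, 120.0),
]

front = pareto_front(candidates)
for c in front:
    print(f"{c.name}: ${c.monthly_cost_usd:.0f}/month, p95 {c.p95_latency_ms:.0f} ms")
```

In this sketch the "wasteful" configuration drops out because "medium" is both cheaper and faster, leaving a team to pick among the remaining options according to how much performance it is willing to pay for.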

“We are abstracting away the finer details as we try to walk the line between the balance of power and abstraction,” John Platt, vice president of machine learning for StormForge, told The New Stack. “This also enables you to get stuff done with flexibility by being able to make it work in any environment.”

Acquia’s Usage

Acquia, a platform provider for Drupal applications, has used Optimize Live as an early adopter to cut its Amazon Web Services (AWS) costs through better allocation of CPU, memory and other resources for the services it offers customers.

The actionable options Optimize Live offers have given Acquia “a smart starting point for the class of services we provide so that we can allocate enough but not too much,” Charley Dublin, vice president of product management for Acquia, told The New Stack. “Optimize Live then allows us to stay in tune to the services and monitor and manage them” once the service is deployed.

Integrations and Other Features

StormForge can be integrated with Datadog and Prometheus, and will shortly support Dynatrace observability tools. Other observability tools will be supported in the future as well, Bentley said.

In addition to ML-driven continuous optimization of production environments, which analyzes observability data to recommend resource-allocation settings, StormForge Optimize Live’s features include:

  • Improved recommendations versus those provided by the Vertical Pod Autoscaler (VPA), because the ML learns from each customer’s specific environment.
  • Configurable policies for flexibility (for example, auto or manual approval of recommendations by namespace).
  • The ability to use the StormForge platform with common user management, SSO, RBAC, release automation and other tools.
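As a rough illustration of what auto or manual approval by namespace could look like in practice, the Python sketch below routes recommendations according to a per-namespace policy. Everything in it is an assumption made for illustration: the `Recommendation` fields, the `APPROVAL_POLICY` table and the `route` function are hypothetical and do not reflect StormForge's actual configuration format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    """A hypothetical resource recommendation for one workload."""
    namespace: str
    workload: str
    cpu_request_millicores: int
    memory_request_mib: int


# Hypothetical policy table: which namespaces apply changes automatically.
APPROVAL_POLICY = {"staging": "auto", "production": "manual"}
DEFAULT_MODE = "manual"  # unknown namespaces require a human to approve


def route(recommendations, policy=APPROVAL_POLICY, default=DEFAULT_MODE):
    """Split recommendations into those applied now and those awaiting review."""
    apply_now, needs_review = [], []
    for rec in recommendations:
        mode = policy.get(rec.namespace, default)
        if mode == "auto":
            apply_now.append(rec)
        else:
            needs_review.append(rec)
    return apply_now, needs_review


# Made-up sample recommendations.
recs = [
    Recommendation("staging", "web", 250, 512),
    Recommendation("production", "web", 500, 1024),
    Recommendation("sandbox", "batch", 100, 256),
]

apply_now, needs_review = route(recs)
print("auto-applied:", [r.namespace for r in apply_now])
print("awaiting approval:", [r.namespace for r in needs_review])
```

Defaulting unlisted namespaces to manual approval is a conservative design choice in this sketch, so that nothing is changed automatically unless a team has opted in.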

The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: StormForge.