ScaleOps Dynamically Right-Sizes Containers at Runtime

With continuous automation, ScaleOps aims to eliminate manual workload tuning and Kubernetes resource adjustments, saving companies money.
Jan 3rd, 2024 8:05am

For many, the complexity and eye-watering expense of cloud computing has led companies like 37signals, the maker of Basecamp, to hop off entirely, saying cloud computing is for the birds.

While the FinOps movement aims to get all the business and tech folks focused on gaining the most value from their cloud bucks, even shifting financial accountability down to developers, too often that hasn’t been the reality, according to Yodar Shafrir, co-founder and CEO at ScaleOps, an Israeli startup focused on dynamic resource allocation of containers at runtime.

While working with clients at GPU optimizer Run:AI, he found that DevOps teams’ greatest frustration was having to size all kinds of resources, and that companies ended up paying for more than they actually used. Those DevOps teams had to chase the application teams and developers to adjust and configure workloads to reduce costs, yet DevOps didn’t have ownership of the workload configurations.

Engineers have to set a scaling strategy for each application container and end up devoting hours trying to predict demand, running load tests and tweaking the configuration files for each one, which becomes impossible to manage manually when running thousands of containers.

Shafrir’s co-founder, Guy Baron, previously head of research and development at website builder Wix, pointed out that engineering teams waste a lot of time manually tuning workloads and adjusting resource allocation when developers really don’t care about that — they want to focus on building the core product.

So Shafrir and Baron set out to reduce the friction between teams over costly and manual resource configurations by making the allocations totally automated and dynamic.

Right-Sizing Containers Individually

Configuration management in Kubernetes in particular “is a multidimensional chess game, and one that DevOps and IT teams are losing too often,” Ofer Idan, previously at StormForge and now at Capital One, wrote in The New Stack, in advocating for machine learning-based approaches instead.

Those ML-powered tools “are proving themselves by intelligently analyzing and managing hundreds of interrelated variables with millions of potential combinations to automatically select the optimal settings for each application,” he wrote.

Rather than Software as a Service, ScaleOps takes an agent-based approach to apply machine learning to workloads, right-sizing each container individually for CPU and memory at runtime. It consolidates pods into the optimal number of nodes, removing excess. Once the platform is installed, everything starts with a dashboard showing the savings potential from the optimization.

The ScaleOps dashboard tracks how much money organizations are spending — and saving — on computing resources.
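The consolidation step described above, packing pods onto as few nodes as possible, resembles the classic bin-packing problem. ScaleOps has not disclosed how it does this, so the following is only a minimal sketch using the first-fit-decreasing heuristic, which is simple but not guaranteed to be optimal. The function name, pod names, and node sizes are hypothetical.

```python
def consolidate(pods, node_cpu, node_mem):
    """Pack pods onto nodes with the first-fit-decreasing heuristic.

    pods maps a pod name to (cpu_millicores, mem_mib). All nodes are
    assumed to be identical, which is a simplification for this sketch.
    Returns a list of nodes, each a list of the pod names placed on it.
    """
    nodes = []  # each entry tracks remaining capacity and placed pods
    # Place the largest pods first (ordered by CPU, then memory).
    ordered = sorted(pods.items(), key=lambda item: item[1], reverse=True)
    for name, (cpu, mem) in ordered:
        if cpu > node_cpu or mem > node_mem:
            raise ValueError(f"pod {name!r} does not fit on an empty node")
        for node in nodes:
            if node["cpu"] >= cpu and node["mem"] >= mem:
                break  # first node with enough room wins
        else:
            node = {"cpu": node_cpu, "mem": node_mem, "pods": []}
            nodes.append(node)
        node["cpu"] -= cpu
        node["mem"] -= mem
        node["pods"].append(name)
    return [node["pods"] for node in nodes]


# Hypothetical workloads: (CPU in millicores, memory in MiB).
example_pods = {
    "api": (1500, 2048),
    "worker": (1000, 1024),
    "cache": (500, 3072),
    "cron": (250, 256),
    "web": (750, 512),
}
placement = consolidate(example_pods, node_cpu=2000, node_mem=4096)
print(placement)  # [['api', 'cron'], ['worker', 'web'], ['cache']]
```

In this made-up example, five pods that might otherwise sit on five nodes fit on three. A real scheduler would also have to respect affinity rules, disruption budgets and heterogeneous node types, none of which this sketch models.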

“Because we eventually do all the optimizations during runtime, [clients] need to trust us,” Shafrir said. “So we designed the platform in a way that everything would start in read only. They would see the potential, they would see the simulation … and how ScaleOps would act. Then with a single click of a button, they can just start automating.”

“They can trust that they can start with the granularity of a single workload and then continue to expand the automation according to their appetite, once they see if that makes sense.”

You can click on a specific workload and select a time period — say, 12 hours — to look at resource usage and potential CPU spikes. Rather than a static allocation, ScaleOps adjusts according to the dynamic needs of that container.

The yellow line represents a static configuration. The green line shows how ScaleOps would provision resources to account for spikes.
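To make the static-versus-dynamic contrast concrete, here is a minimal Python sketch of one common right-sizing heuristic: take a high percentile of recent usage and add some headroom. ScaleOps has not published its algorithm, so this is not the company's method. The function names, the three-sample window, the 20% headroom and the usage numbers are all illustrative assumptions.

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of integer samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def recommend_request(samples, pct=95, headroom_pct=15):
    """Recommend a request: the chosen percentile plus headroom, rounded up."""
    base = percentile(samples, pct)
    return -(-base * (100 + headroom_pct) // 100)


def rolling_requests(samples, window, pct=95, headroom_pct=15):
    """Recompute the recommendation at each step over a sliding window."""
    return [
        recommend_request(samples[max(0, i - window + 1): i + 1], pct, headroom_pct)
        for i in range(len(samples))
    ]


# Made-up CPU usage in millicores, with a spike in the middle.
cpu_usage = [120, 130, 110, 125, 600, 580, 140, 115]
static_request = 1000  # a fixed allocation sized for the worst case

dynamic = rolling_requests(cpu_usage, window=3, pct=100, headroom_pct=20)
print(dynamic)  # [144, 156, 156, 156, 720, 720, 720, 696]
print(sum(dynamic), "vs", static_request * len(cpu_usage))
```

In this toy data the dynamic request stays well under the static 1,000 millicores outside the spike. Note that a purely reactive window like this one only raises the request after usage has already jumped, which is one reason a production system would need something more sophisticated.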

You can set different policies from those included or create your own. The “dynamic” policy, for instance, is more cost-focused than, say, “high availability.”

“We didn’t want companies to worry about adjusting the policies, so out of the box, once you install us, we scrape all the workloads that you currently have, all the parameters you use,” Shafrir said. “And we know to automatically assign the best policy for every workload. So [your] team will not need to do any configuration. It’s a context-aware platform.”

It integrates with the open source autoscalers Karpenter, Cluster Autoscaler, HorizontalPodAutoscaler (HPA) and KEDA to determine the optimal number of replicas for every workload.

Fully Automated

A much-requested feature, the In-Place Update of Pod Resources (KEP-1287), is in alpha in Kubernetes 1.29 “Mandala.” It will give operators the ability to adjust CPU and memory resource configurations on the fly, without having to restart the container.
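As a rough illustration of what such an in-place adjustment looks like, the sketch below builds the body of a patch that changes one container's resources on a running pod. It assumes a cluster with the alpha `InPlacePodVerticalScaling` feature gate enabled; the helper name, container name and values are hypothetical, and the code only constructs and prints the JSON rather than calling any cluster API.

```python
import json


def build_resize_patch(container_name, requests, limits=None):
    """Build a pod patch body that updates one container's resources.

    requests and limits are dicts such as {"cpu": "800m"}. Only the
    fields passed in appear in the patch; everything else on the pod
    is left for the API server to keep unchanged.
    """
    resources = {"requests": dict(requests)}
    if limits is not None:
        resources["limits"] = dict(limits)
    return {
        "spec": {
            "containers": [
                {"name": container_name, "resources": resources}
            ]
        }
    }


# Hypothetical values: raise the CPU of a container named "app".
patch = build_resize_patch("app", {"cpu": "800m"}, {"cpu": "800m"})
print(json.dumps(patch, indent=2))
```

With the feature gate on, a body like this could be sent as a patch to the pod, for instance with `kubectl patch`, and the container would be resized without a restart where its resize policy allows it. Whether a given change is accepted still depends on the cluster, so treat this as a starting point rather than a recipe.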

The part about “giving operators the ability to adjust” is the key, according to Shafrir, who says ScaleOps already has automated that. ScaleOps bills itself as “the first fully automated platform that continuously optimizes and manages cloud native resources during runtime.”

He maintains that all the competitors on the market give static recommendations, usually before runtime. For instance, Harness Cloud Cost Management provides the ability to run what-if analyses for container cost management and to attribute costs to specific teams. Kubecost is an open source example. Those initial recommendations then have to be adjusted manually in production, according to Shafrir, a claim that no doubt competitors, including StormForge and Cast AI, would have something to say about.

In fact, StormForge has offered a fully automated platform that continuously optimizes cloud native resources during runtime for quite some time now, according to Yasmin Rajabi, head of product at StormForge.

StormForge, a TNS sponsor, has written about its real-time Kubernetes SaaS operator Optimize Live. Cast AI, meanwhile, automatically optimizes within clusters every few seconds, recommending the most advantageous machines to begin with as well as tackling other decision-making headaches.

Looking Ahead

Though ScaleOps has been focused on right-sizing containers and node optimization, it will be tackling horizontal pod autoscaling in the future as well, Shafrir said.

It works in any Kubernetes environment, including the major cloud platforms like Amazon Web Services (AWS), Microsoft Azure and Google Cloud, as well as on-premises and air-gapped servers.

Its clients include security firms Wiz, Orca Security and Salt Security, insurance vendor At-Bay and Netherlands-based payment service PayU.

Founded in 2022, Tel Aviv-based ScaleOps recently landed $21.5 million in investment across two rounds, seed and Series A.

“What ultimately convinced us was seeing firsthand how ScaleOps automatically manages critical applications during runtime in million-dollar production environments of industry leaders,” said David Gussarsky, partner at investor Lightspeed Venture Partners. “ScaleOps isn’t just one step ahead; it’s a giant leap into the future.”

TNS owner Insight Partners is an investor in: StormForge, Wiz.