While there is still plenty of work being done on the fundamentals of Kubernetes, it has also reached a level of maturity where much of the innovation is in the surrounding ecosystem, whether that’s extensions and operators, or ways to improve the developer experience.
KEDA, a Kubernetes Event-Driven Autoscaler — which has just reached version 2.0 — helps make Kubernetes more suitable for serverless and event-driven computing. As Kubernetes co-founder and Microsoft Corporate Vice President Brendan Burns pointed out in a recent AMA, it gives enterprises a way of removing some of the complexity of Kubernetes from developers without going as far as creating their own platform architecture. “We see [some] patterns where people are actually taking advantage of the ecosystem, where it’s not a platform architecture team, but they’re installing a function system service runtime like KEDA that gives them more developer-centric app on top of their Kubernetes cluster.”
He expects this kind of approach to become more common. “People are going through a lot of pain right now to get their apps into Kubernetes and if we can help with that, that’s great. I think in a few years, people are going to actually ban using the raw Kubernetes APIs for most developers. We’re going to get into a place where there’s going to be rules in these companies. We’re already seeing that where people are saying, ‘No, that’s too low level. It’s like using assembly language; you can’t do it, you have to use these higher-level primitives.’”
Scaling the Scalers
As Jeff Hollan, principal product manager for both Azure Functions and serverless technology on Kubernetes at Microsoft, explained to The New Stack at KubeCon last year, KEDA is a custom controller that augments the Horizontal Pod Autoscaler in Kubernetes.
“By default Kubernetes can really only do resource-based scaling looking at CPU and memory and in many ways, that’s like treating the symptom, and not the cause. Yes, a bunch of messages will eventually cause the CPU usage to rise but what I really want to be able to scale on is if there’s a million messages in the queue that need to be processed.”
Scaling based on resource consumption will always be reactive; KEDA can make scaling proactive so the workload is ready when the tasks arrive.
KEDA lets you set custom metrics, such as the number of messages or events in a queue or topic lag in Kafka, to scale containers up before demand affects performance. As events drop off, it scales them back down, all the way to zero if necessary when the number of events queued for processing remains low (a trigger can specify, say, no more than five messages in 15 minutes).
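The scaling behavior described here can be approximated in a few lines. What follows is a minimal sketch, not KEDA's actual implementation: it assumes the replica count is the queue length divided by a per-replica target, rounded up and clamped to configured bounds, and that an empty queue falls back to the minimum (which may be zero). The function and parameter names are hypothetical.

```python
import math


def desired_replicas(queue_length: int, target_per_replica: int,
                     min_replicas: int = 0, max_replicas: int = 100) -> int:
    """Return how many replicas are needed to work through the queue.

    Hypothetical helper: one replica per `target_per_replica` pending
    messages, rounded up, then clamped to [min_replicas, max_replicas].
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if queue_length <= 0:
        # Nothing to process: fall back to the floor, which may be zero.
        return min_replicas
    wanted = math.ceil(queue_length / target_per_replica)
    return max(min_replicas, min(wanted, max_replicas))


if __name__ == "__main__":
    for depth in (0, 5, 42, 1_000_000):
        print(depth, "->", desired_replicas(depth, target_per_replica=5))
```

The point of the sketch is that the input is the backlog itself rather than CPU or memory, so the replica count moves as soon as messages arrive instead of after resource usage climbs.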
KEDA uses scalers to handle specific event sources; it already had key sources like RabbitMQ, Kafka and NATS Streaming. Version 2.0 adds new scalers for a wide range of tools and services, many of them contributed by the community. “We are building upon the platforms that we already had but broadening our spectrum with new providers, extensibility and platform unification,” KEDA maintainer Tom Kerkhove from Codit told The New Stack. “We have introduced a ton of new scalers with a focus on different areas.”
That includes adding new platforms, such as IBM MQ; expanding support for existing platforms, such as Azure Log Analytics, to make it easier for those customers to use what they need; and improving existing scalers by introducing other ways to authenticate, such as Managed Identity support for the Azure Monitor scaler.
The new Metrics API scaler and external push options (instead of the current pull-based model) improve extensibility: “users can extend KEDA by scaling on metrics [for] systems that we don’t support yet but they have all the controls they need,” Kerkhove explained. “Another reason for this is to stimulate the community to build more scalers and see which ones are getting most traction so we can potentially merge them in our core. We want our users to scale what they need to scale, not only what we support, so we give them the tools for that.”
If the event source has a REST API, KEDA 2.0 can now consume it through the Metrics API scaler without needing a custom scaler. Workloads could scale based on custom metrics from in-house APIs on production systems, or cloud services that automate processes, such as Dynamics and Salesforce, could trigger the workload that handles the process as part of an event-driven flow.
“We allow customers to integrate with existing metric systems over HTTP so they can scale on them. This can be to use external systems that we don’t support yet or simply based on an in-house API that is already available and re-uses that information,” Kerkhove explained. There’s also a Go client library so developers can work with the KEDA API directly from applications.
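To illustrate the idea of scaling on a value published by an existing HTTP API, here is a small sketch of the step a metrics scaler has to perform: locate one number inside a JSON response. The dot-separated path syntax, the `extract_metric` helper and the sample payload are assumptions made for illustration, not KEDA's own code or path format.

```python
import json


def extract_metric(body: str, value_location: str) -> float:
    """Pull a numeric metric out of a JSON response body.

    `value_location` is a dot-separated path such as "queue.pending".
    This is a simplified stand-in for whatever path syntax a real
    scaler supports.
    """
    node = json.loads(body)
    for key in value_location.split("."):
        if isinstance(node, list):
            node = node[int(key)]  # numeric segment indexes into a list
        else:
            node = node[key]
    if isinstance(node, bool) or not isinstance(node, (int, float)):
        raise TypeError(f"{value_location!r} is not numeric: {node!r}")
    return float(node)


# Example payload from a hypothetical in-house orders API.
SAMPLE_RESPONSE = '{"service": "orders", "queue": {"pending": 120, "failed": 3}}'

if __name__ == "__main__":
    print(extract_metric(SAMPLE_RESPONSE, "queue.pending"))
```

Once the number is extracted, it can be compared against a target value in the same way as a queue length from a built-in scaler.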
KEDA 2.0 also uses the Metrics API to expose more information about how KEDA itself is behaving, with Liveness and Readiness probes on the Operator and Metrics server pods, and Prometheus metrics for each scaler in use. “With the probes and Prometheus metrics we aim at improving the operability of KEDA runtime for our users allowing them to check if it’s still up and running and gaining insights on how their scaled objects are doing,” Kerkhove told us.
The Prometheus metrics are new, KEDA maintainer Zbynek Roubalik from Red Hat added. “We have just started exposing them and we will probably extend this in subsequent releases. We are exposing data on what was scaled, number of errors and so on; users can then scrape these metrics to see what was scaled and how.” Initially, the metrics cover ScaledObject; metrics for ScaledJobs will be in a future release.
Unified Granular Scaling
To handle long-running processes, KEDA has the granularity to scale jobs as well as entire deployments, so it can avoid scaling down an instance that would kill off long-running executions that only need a few more minutes of processing. KEDA 2.0 makes that more flexible, Kerkhove said.
“We wanted to provide people the capability to scale their workloads based on their needs. This is why we’ve introduced ScaledJobs next to ScaledObject because they have different scaling behavior. Jobs require an instance per message, for example, and run to completion while ScaledObject is more a daemon constantly running and we just fan out/in based on the metrics. This used to be the same CRD but we’ve noticed that it was confusing so we decided to split them.”
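The difference Kerkhove describes can be sketched as a small calculation. This is an illustrative simplification rather than KEDA's actual scaling strategy: it assumes the pending count includes messages that running jobs are already working on, and every name in it is hypothetical. The key property is that the result is never negative, so jobs already in flight are left to run to completion instead of being scaled away.

```python
import math


def jobs_to_start(pending_messages: int, running_jobs: int,
                  max_jobs: int, messages_per_job: int = 1) -> int:
    """Number of new run-to-completion jobs to launch (never negative)."""
    if messages_per_job <= 0:
        raise ValueError("messages_per_job must be positive")
    needed = math.ceil(pending_messages / messages_per_job)
    capped = min(needed, max_jobs)
    # Only ever add jobs; existing ones finish on their own.
    return max(capped - running_jobs, 0)


if __name__ == "__main__":
    print(jobs_to_start(pending_messages=10, running_jobs=4, max_jobs=8))
    print(jobs_to_start(pending_messages=2, running_jobs=5, max_jobs=8))
```

A long-running deployment, by contrast, would have its replica count moved both up and down as the metric changes.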
It can also use multiple triggers for autoscaling, with different scale rules for each trigger.
“Up until 2.0 this was not possible with one ScaledObject while this comes back a few times, for example, we want to scale on queue depth and CPU. Another example is if one container is processing multiple queues, we need to monitor both queues and scale accordingly.”
“Longer-term we can build upon this feature to do more intelligent autoscaling. For example, today we can scale our workload if the queue is piling up, but if we see that our database is already drowning we would make it even worse [by scaling up the workload on Kubernetes], so we would verify that the usage is below x%, for example.”
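One way to picture multiple triggers on a single workload: compute a replica count for each trigger independently and take the largest, which is how the Horizontal Pod Autoscaler combines multiple metrics. The sketch below assumes each trigger reports a total value and a per-replica target; the `Trigger` class, the queue names and the numbers are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Trigger:
    name: str
    current_value: float  # metric total across the workload
    target_value: float   # desired value per replica


def replicas_for(trigger: Trigger) -> int:
    """Replicas this trigger alone would ask for."""
    if trigger.target_value <= 0:
        raise ValueError("target_value must be positive")
    return math.ceil(trigger.current_value / trigger.target_value)


def combined_replicas(triggers: Iterable[Trigger], max_replicas: int) -> int:
    """Take the most demanding trigger, capped at max_replicas."""
    wanted = max((replicas_for(t) for t in triggers), default=0)
    return min(wanted, max_replicas)


if __name__ == "__main__":
    triggers = [
        Trigger("orders-queue", current_value=45, target_value=10),
        Trigger("invoices-queue", current_value=12, target_value=10),
    ]
    print(combined_replicas(triggers, max_replicas=20))
```

With one container draining two queues, the busier queue decides the replica count, so neither backlog is starved.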
KEDA 2.0 has even more granularity for how resources scale, with its own CPU and Memory scaler — which means you can use KEDA for all your scaling, rather than needing to mix KEDA and HPA scaling.
“We strive to make autoscaling apps dead simple but to do that we need to become a unified autoscaling platform,” Kerkhove explained. “That’s why we’ve introduced a CPU and Memory scaler so that you can use KEDA to scale everything and no longer have to mix it with HPAs anymore for some aspects. As part of that, we’ve opened up the HPA configuration, so more advanced users can tweak the underlying HPA itself.”
But perhaps the biggest change is that it can now scale not just deployments but anything that implements the Kubernetes /scale subresource; that includes StatefulSets and any other custom resources.
“Now that people can scale anything with /scale subresource it allows other projects to scale their own components. For example, we have ArgoCD scaling their own resources based on this,” Kerkhove said.
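As a rough picture of how these pieces fit together, the sketch below builds a ScaledObject that targets a StatefulSet and combines a Kafka lag trigger with a CPU trigger. It is written as a Python dictionary and serialized to JSON, which Kubernetes accepts alongside YAML. The field names reflect the KEDA 2.0 ScaledObject specification as understood here and should be checked against the KEDA documentation before use; the resource names, broker address, topic and thresholds are all hypothetical.

```python
import json

# Hypothetical ScaledObject: names, addresses and thresholds are made up.
scaled_object = {
    "apiVersion": "keda.sh/v1alpha1",
    "kind": "ScaledObject",
    "metadata": {"name": "orders-consumer-scaler"},
    "spec": {
        # Any resource exposing the /scale subresource can be targeted.
        "scaleTargetRef": {
            "apiVersion": "apps/v1",
            "kind": "StatefulSet",
            "name": "orders-consumer",
        },
        "minReplicaCount": 1,
        "maxReplicaCount": 20,
        "triggers": [
            {
                "type": "kafka",
                "metadata": {
                    "bootstrapServers": "kafka.example.internal:9092",
                    "consumerGroup": "orders-consumer",
                    "topic": "orders",
                    "lagThreshold": "50",
                },
            },
            {
                "type": "cpu",
                "metadata": {"type": "Utilization", "value": "60"},
            },
        ],
    },
}

manifest_json = json.dumps(scaled_object, indent=2)

if __name__ == "__main__":
    print(manifest_json)
```

Generating the manifest from code is only a convenience for this example; in practice the same structure would usually be kept as a YAML file next to the workload it scales.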
Argo uses KEDA for Rollouts. Knative has been experimenting with using KEDA for autoscaling Knative Event Sources, and eventually Brokers and Channels, KEDA maintainer Zbynek Roubalik from Red Hat told us. “There is intention to add /scale subresource on Knative Eventing components (not Serving). This way KEDA could scale these components in Knative Eventing, such as Sources, Brokers and Channels.” There’s a proof of concept project in the Knative sandbox with support for using KEDA 2.0 with Kafka and AWS SQS as sources, plus experimental support for RabbitMQ Broker and Redis Stream Source.
Knative isn’t the only serverless platform looking at KEDA: Fission is building a catalog of ready-made KEDA connectors that have been written to scale its serverless functions on Kubernetes. And Dapr, the event-driven distributed application runtime that’s just reached its 1.0 milestone, uses KEDA for autoscaling on Kubernetes.
Alibaba Cloud is using KEDA to autoscale its Enterprise Distributed Application Service in production and several other projects have also adopted KEDA for scaling. Apache Airflow and Astronomer use KEDA to autoscale workflows based on SQL queries (the pending, queued and running tasks in Airflow are stored in a meta-database and Polidea and the Astronomer team contributed Postgres and MySQL scalers).
Originally developed by Microsoft and Red Hat, KEDA became a Cloud Native Computing Foundation Sandbox project earlier this year and the project is hoping to graduate to incubation at the end of 2020 or early next year. There have also been discussions about migrating some or all of the functionality into the Kubernetes scaler to make this more broadly available. In the meantime, the project is improving its security posture and there’s a public roadmap for upcoming features. That includes plans for several new scalers but also more ambitious ideas like using historical data and predictive analytics to scale even more proactively.
The Cloud Native Computing Foundation is a sponsor of The New Stack.