YuniKorn: Alternative to Kubernetes’ Default Scheduler for Big Data, ML

26 May 2022 3:00am

The big data and machine learning resource scheduler YuniKorn has attained top-level status within the Apache Software Foundation and released its 1.0 version.

YuniKorn is focused on working with Kubernetes, and YuniKorn Project Management Committee member Wilfred Spiegelenburg maintains that the young Apache project must work closely with the Cloud Native Computing Foundation and the Kubernetes community to attain the required level of integration.

I caught up with Spiegelenburg at KubeCon+CloudNativeCon EU last week to learn more about YuniKorn and the direction it’s taking.

YuniKorn is a resource scheduler for machine learning and big data applications. It originated at Cloudera in 2019 and entered the Apache Incubator in January 2020.

It was created to achieve efficient, fine-grained resource sharing for various workloads in a large-scale, multitenant, cloud native environment. YuniKorn offers unified, cross-platform scheduling for mixed workloads that consist of stateless batch workloads and long-running stateful services in extensive distributed systems.

Optimized to run Apache Spark on Kubernetes, Apache YuniKorn performs well enough to serve as an optional replacement for the Kubernetes default scheduler.

Its features include:

    • It runs on-premises and in a variety of public cloud environments and maximizes resource elasticity with better throughput.
    • Hierarchical resource queues — It efficiently manages cluster resources and provides the ability to control resource consumption for each tenant.
    • Application-aware scheduling — It recognizes users, applications and queues, scheduling according to submission order, priority, resource usage and more.
    • Job ordering — It offers robust built-in scheduling capabilities, supporting fairness-based cross-queue preemption, hierarchies, pluggable node sorting policies and more.
    • Central management console monitors performance across different tenants. The dashboard tracks resource utilization for managed nodes, clusters, applications and queues.
    • Efficiency — It reduces resource fragmentation and proactively triggers up-scaling. Cloud elasticity lowers overall operational costs.
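
To make the hierarchical queue idea concrete, here is a minimal sketch in Python. It is a toy model, not YuniKorn's implementation or API: the `Queue` class, the `try_allocate` method and the queue names are all hypothetical, and the only rule it illustrates is that a request is admitted when it fits within its own queue's cap and within every ancestor's cap.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

Resources = Dict[str, int]  # e.g. {"cpu": 4}; resource names are illustrative


@dataclass
class Queue:
    """Toy hierarchical queue (hypothetical, not a YuniKorn type)."""
    name: str
    max_resources: Optional[Resources] = None  # None means no cap at this level
    parent: Optional["Queue"] = None
    used: Resources = field(default_factory=dict)

    def _fits(self, request: Resources) -> bool:
        # Only resources that have a cap at this level are checked;
        # resources without a cap are treated as unlimited here.
        if self.max_resources is None:
            return True
        return all(
            self.used.get(res, 0) + amount <= self.max_resources[res]
            for res, amount in request.items()
            if res in self.max_resources
        )

    def try_allocate(self, request: Resources) -> bool:
        """Admit the request only if this queue and every ancestor have room."""
        chain = []
        node: Optional["Queue"] = self
        while node is not None:
            if not node._fits(request):
                return False  # nothing has been charged yet, so no rollback
            chain.append(node)
            node = node.parent
        # Charge usage to the whole chain so a parent's cap bounds all children.
        for queue in chain:
            for res, amount in request.items():
                queue.used[res] = queue.used.get(res, 0) + amount
        return True


root = Queue("root", max_resources={"cpu": 10})
team_a = Queue("root.team-a", max_resources={"cpu": 8}, parent=root)
team_b = Queue("root.team-b", max_resources={"cpu": 8}, parent=root)

results = [
    team_a.try_allocate({"cpu": 6}),  # fits team-a (6 <= 8) and root (6 <= 10)
    team_b.try_allocate({"cpu": 6}),  # fits team-b alone, but root would reach 12
    team_b.try_allocate({"cpu": 4}),  # fits team-b (4 <= 8) and root (10 <= 10)
]
print(results)  # [True, False, True]
```

The second request is rejected even though team-b has room of its own, because the shared parent cap is what limits the two tenants together.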

YuniKorn architecture diagram.

Features of the recently released 1.0 version, the fifth update since entering the Apache Incubator, include:

  • Decreased memory and CPU usage
  • Extended metrics and diagnostics information
  • A new deployment model supporting future upgrades
  • Technical preview of the plugin deployment mode

Users include Alibaba, Apple, Cloudera, Lyft, Visa and Zillow.

Melbourne, Australia-based Spiegelenburg had flown to KubeCon in Valencia, Spain, to tout the project and boost interest from the Kubernetes community. (This conversation has been edited for length and clarity.)

The New Stack: Tell me how this project got started.

So we got started about 3 1/2 years ago inside Cloudera with a small team of a couple of people, and it was looking to provide a really good fix for the problem that we saw around scheduling and scheduling on Kubernetes. And we also thought at that point in time that we needed one type of scheduling for Yarn and Kubernetes both combined. … Around three years ago, we open sourced that from Cloudera, and just listed it as a generic open source. That was the point that we had a workable proof-of-concept product out there. And about six months after that we donated the whole project to Apache to get it incubating.

So why would a company choose YuniKorn over the default scheduler in Kubernetes?

A company should choose YuniKorn when they have mixed workloads. When a cluster runs mixed workloads, services and batch/HPC, advanced scheduling options like workload queueing or shared quotas are commonly required. Using YuniKorn gives you all these options and would help improve the user experience. Companies have also seen cost savings due to better resource usage.

Since it is a drop-in replacement for the default scheduler, it is simple to try out. No long setup is required: just deploy YuniKorn via the provided Helm charts and start using it. Stopping YuniKorn will revert the cluster back to its old state.

Why did you decide to put the project in Apache?

So we looked at CNCF and Apache with the background that we had within Cloudera at that point in time. We’ve done Hadoop and all that stuff for years and years and years. So we were really comfortable doing that and knowing how to work, knowing how we would go about growing community. So at that point, it was “Let’s give it to Apache, build a community around the product and see where we go from there.” … Since that time, we’ve grown from a handful of people just within Cloudera to about 30 or 35 regular committers. The PMC is about 20 to 25 people.

So what have you accomplished technically since you’ve been in incubation?

A lot. It has changed a lot. So we focused purely on the Kubernetes side of things. We left the Yarn thing to the side because we saw more demand and more interest in the Kubernetes side of things. Because Yarn has a good scheduler for batch jobs and things like that. And that was, especially around that time, completely missing in Kubernetes.

Now, that’s why I’m here too. We see the interest also coming from the Kubernetes perspective. But two, three years ago, it wasn’t there. So that’s been a big change.

During the incubation time, we’ve grown the community. We’ve stabilized the product, made a number of large changes, big new features being added. New ways of scheduling have been added during that time. And around the time (of coming out of incubation), we also started looking at the time of doing our first major release. We’ve always had a zero-point-something release, and we were ready for a first major release. …

We postponed our first major release until after we graduated because it had a number of impacts on the way we build and a number of other things that showed up. … So graduation happened, then two weeks later, we released 1.0.

So tell me about some of these new ways of scheduling that you’ve added.

What we saw from our users was they wanted to do gang scheduling, and different ways of sorting. People are using it in the cloud, but also on-premises. … So we implemented the gang scheduling based on the demand that we saw. We changed the way that we allow nodes or infrastructure to be sorted, and while implementing that we’ve made sure that the end user has got simple control over the way that we configure things. We line it up with the Kubernetes way of doing things because that’s the area that we’re now mostly focused on.

And all of that we did in a number of steps. So before we did the full gang scheduling, we first said, “OK, let’s make it simple.” One application starts up and then the next step application starts. So we order it a bit better. And then we said, “OK, that’s not good enough. Customers are asking for more uses. The community says how can we do this? Can we do this?” And we said, “OK, if we fold all of that underneath the gang scheduling umbrella, then we can do that.” So we took a release, to work on all that. Put that out there. And that’s a highly used feature now within the system.
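
For readers unfamiliar with the term, gang scheduling means a group of pods is placed all together or not at all, so a job never starts with only part of its workers. The sketch below illustrates that all-or-nothing idea in Python. It is an assumption-laden toy, not how YuniKorn implements the feature: the `gang_schedule` function, the CPU-only resource model and the first-fit heuristic are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional


def gang_schedule(
    pod_cpu_requests: List[int], node_free_cpu: Dict[str, int]
) -> Optional[Dict[int, str]]:
    """Place every pod in the gang, or place none of them.

    Returns a mapping of pod index -> node name, or None when the gang
    does not fit as a whole. Uses a simple first-fit heuristic with the
    largest pods first, so it can miss packings an exhaustive search
    would find.
    """
    free = dict(node_free_cpu)  # work on a copy; the caller's view is untouched
    placement: Dict[int, str] = {}
    # Largest requests first, which tends to pack better than arrival order.
    order = sorted(range(len(pod_cpu_requests)), key=lambda i: -pod_cpu_requests[i])
    for i in order:
        need = pod_cpu_requests[i]
        target = next((name for name, cpu in free.items() if cpu >= need), None)
        if target is None:
            return None  # one member cannot be placed, so the whole gang waits
        free[target] -= need
        placement[i] = target
    return placement


nodes = {"node-1": 4, "node-2": 4}
print(gang_schedule([2, 2, 2, 2], nodes))  # all four pods fit across two nodes
print(gang_schedule([3, 3, 3], nodes))     # None: the third pod has nowhere to go
```

In the second call two of the three pods could run on their own, but the function returns `None` rather than a partial placement, which is the behavior that separates gang scheduling from scheduling pods one at a time.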

So what’s going to happen with the project as a top-level project?

The main thing is, again, to grow the community. We’ve seen with all the things that are going on in Kubernetes around scheduling. We see more adoption coming. … We want as many people using our product, and we want to do a number of new things within the product. Go further.

What kind of new things?

We were looking at providing more business-like setups like recovery and restarts, making sure that upgrades can be done on the fly without any impact. So if a business (needs) a new version, they can just push the button and it happens without having any impact on that infrastructure. That means that we have to do also disaster recovery, to make sure that that works 100% of the time.

The other thing is preemption. People change their configurations all the time and when we change the configuration, that could mean that some of the workloads that run now shouldn’t run after that anymore. So we need to be able to kill that, to automate that.

Some side cases that we’ve seen, coming as feedback from users, and specifically targeted at Kubernetes again, is daemon sets. The scheduling didn’t happen nicely. Because of the way that Kubernetes has changed under us over the years — we’ve got a moving target running under us. So they changed some things, and we adapted a little bit and we saw that (it) wasn’t good enough. So we need to change and adapt further.

A big change is the new Kubernetes scheduling framework that they brought out about a year ago. In our major release, 1.0, we also put a tech preview of the new deployment type that we’ve gotten based on the scheduling framework. And (our preview) needs to mature over the next couple of releases. People can start using it, but we don’t guarantee that’s 100% production-ready yet.

But the 1.0 is production-ready?

Yes, it is production-ready. We’ve got a number of customers running it in production with the standard modes for deployment.

So what else?

We have a project that hangs between two things. We have Apache, but we are building on top of the CNCF, the cloud native stuff. I think that we’re the first project within Apache that has got that close of an integration in the system. And we’ll see where we get to, so that’s partially why I’m here because we see some problems on the integration side of things with the CNCF provider code. We need to cooperate, we need to integrate, we need to talk to each other.

So and then luckily, in the last two days that I have been here, the Kubernetes people see the same problem. They understand, and they have heard about the same problem from their users. They see YuniKorn as a fix for some of it, but they say they want to pull things out of YuniKorn and put it in Kubernetes. I said, “If you can provide us with that, that’s fantastic. Then we don’t have to do it, we don’t have to maintain it and we just do our thing on top of what you already provide.” It makes it easier for us. So that’s going to be a challenge because we still want to innovate. And we can’t wait for what Kubernetes does and on the other hand, Kubernetes does things that could hamper us innovating.

We need to get aligned. That’s going to be a tricky part.