Amazon Web Services Gears Elastic Kubernetes Service for Batch Work

Cloud giant Amazon Web Services has launched AWS Batch for its Amazon Elastic Kubernetes Service (Amazon EKS). AWS Batch is ideal for developers looking for a simpler workflow for managing the Kubernetes clusters and pods that run their batch jobs, according to a Tuesday blog post from Steve Roberts, AWS developer advocate.
AWS announced this new service at KubeCon+CloudNativeCon North America, being held this week in Detroit.
Kubernetes “requires you to invest significant time in custom configuration and management to fine-tune a suitable solution,” wrote Roberts of setting up and running a batch job. AWS Batch for Amazon EKS is a fully managed service that runs batch workloads on Amazon Elastic Compute Cloud (Amazon EC2) and handles the installation and management of complex, custom batch workloads. A built-in scheduler, a serverless queue, and integrations with other services let users create batch jobs in fewer steps. The service itself is free of charge; the only costs are for the resources the job consumes.
Kubernetes is geared more toward microservice architectures, so running batch jobs on its pods and clusters makes for a heavier workflow. That makes sense: microservices and batch jobs are very different, so a technology that suits one well needs a lot of work to deliver full functionality for the other.
Third-party frameworks for running batch jobs on Kubernetes do exist. But there are gaps, and inside those gaps is the heavy lifting of building, configuring, maintaining, scheduling, and scaling custom batch solutions, Roberts wrote.
Before looking at what Amazon aims to solve, consider some fundamental differences between microservices and batch jobs:
Microservice applications tend to run continuously and stay available, whereas batch jobs have defined starts and ends. Microservice workloads are expected to respond to requests within milliseconds. Batch jobs don’t require availability on nearly as tight a timeline.
Finally, microservice workloads scale in more predictable patterns, usually linearly as the load increases and decreases. Batch jobs scale far more abruptly: they tend to start at zero, ramp up sharply, then go back to zero.
Introducing AWS Batch for Amazon EKS
This fully managed service runs batch workloads on clusters hosted on Amazon Elastic Compute Cloud (Amazon EC2) and handles the installation and management of complex, custom batch solutions.
Some features of AWS Batch for Amazon EKS are:
- Scheduler: This controls and runs high-volume batch jobs, together with an orchestration component that evaluates when, where, and how to place jobs submitted to a queue. Rather than coordinating the job themselves, users now just submit requests to the queue.
- Serverless queue: The serverless queue handles job queueing, dependency tracking, retries, prioritization, compute resource provisioning for Amazon EC2 and Amazon EC2 Spot instances, and pod submission.
- Integration: AWS Batch for Amazon EKS integrates with other services such as AWS Identity and Access Management (IAM), Amazon EventBridge, and AWS Step Functions, as well as with partners and tools in the Kubernetes ecosystem.
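To make the queue's responsibilities more concrete, here is a minimal in-memory sketch of prioritization, dependency tracking, and retries. It is not AWS Batch code and does not reflect how the service is implemented; the `Job` and `ToyJobQueue` names, the job names, and the `flaky_runner` function are all hypothetical and exist only for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    priority: int = 0                              # higher priority runs first
    depends_on: list = field(default_factory=list)
    max_attempts: int = 1                          # total tries, including the first
    attempts: int = 0


class ToyJobQueue:
    """In-memory stand-in for a managed job queue (illustrative only)."""

    def __init__(self):
        self._jobs = {}
        self._done = set()
        self._counter = 0  # submission order, used to break priority ties

    def submit(self, job):
        self._jobs[job.name] = (self._counter, job)
        self._counter += 1

    def _runnable(self):
        # A job is runnable once every job it depends on has succeeded.
        heap = []
        for order, job in self._jobs.values():
            if all(dep in self._done for dep in job.depends_on):
                heapq.heappush(heap, (-job.priority, order, job.name))
        return heap

    def run_all(self, runner):
        """Run jobs until none are runnable; runner(job) returns True on success."""
        history = []
        while True:
            heap = self._runnable()
            if not heap:
                break
            _, _, name = heapq.heappop(heap)
            _, job = self._jobs[name]
            job.attempts += 1
            history.append((name, job.attempts))
            if runner(job):
                self._done.add(name)
                del self._jobs[name]
            elif job.attempts >= job.max_attempts:
                del self._jobs[name]  # out of retries; dependents never run
        return history


def flaky_runner(job):
    # Hypothetical runner: "transform" fails on its first attempt only.
    return not (job.name == "transform" and job.attempts == 1)


queue = ToyJobQueue()
queue.submit(Job("report", priority=1, depends_on=["transform"]))
queue.submit(Job("transform", priority=5, depends_on=["extract"], max_attempts=2))
queue.submit(Job("extract", priority=1))
queue.submit(Job("cleanup", priority=9))
history = queue.run_all(flaky_runner)
print(history)
```

In this toy run, the high-priority `cleanup` job goes first, `transform` waits for `extract` despite its higher priority, and the failed first attempt of `transform` is retried before `report` is allowed to start.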
How It Works
AWS Batch is the main entry point to submit workload requests when running batch jobs on Amazon EKS clusters. Based on the queued jobs, AWS Batch then launches worker nodes into the cluster to process them. These nodes are kept separate in a distinct namespace from your other node groups in Amazon EKS. Similarly, nodes in your other node groups are isolated from those used with AWS Batch.
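The article notes further down that labels and taints drive pod placement. As a rough illustration of how a taint can keep unrelated pods off a dedicated node, the sketch below models a simplified version of Kubernetes taint and toleration matching in plain Python. The taint key `example.com/batch-node` is a made-up placeholder, not the key AWS Batch uses, and the matching rules omit some cases the real scheduler handles (such as an empty toleration key).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str  # e.g. "NoSchedule"


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty string matches any effect


def tolerates(toleration, taint):
    """Simplified match: same key, compatible effect, and value if 'Equal'."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    return toleration.value == taint.value


def can_schedule(pod_tolerations, node_taints):
    """A pod fits only if every NoSchedule taint on the node is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in pod_tolerations)
        for taint in node_taints
        if taint.effect == "NoSchedule"
    )


# Hypothetical taint placed on a node reserved for batch work.
batch_taint = Taint("example.com/batch-node", "true", "NoSchedule")
batch_pod = [Toleration("example.com/batch-node", "Equal", "true", "NoSchedule")]
web_pod = []  # an ordinary pod with no tolerations

batch_fits = can_schedule(batch_pod, [batch_taint])
web_fits = can_schedule(web_pod, [batch_taint])
web_on_plain_node = can_schedule(web_pod, [])
print(batch_fits, web_fits, web_on_plain_node)
```

Taints only repel pods that lack a matching toleration. Steering batch pods toward the dedicated nodes, and away from everything else, is the part that node labels with selectors or affinity rules would cover.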
AWS Batch uses managed Amazon EKS clusters, and these need to be registered with AWS Batch and have permissions set. Instructions for launching a managed cluster and for setting permissions are available in the AWS documentation.
Users can submit jobs to the queue. After a job is submitted, the following actions are triggered:
- Once the job is received, the queue dispatches a request to the configured compute environment for resources. An AWS Batch managed scaling group is created if it does not already exist. AWS Batch then starts launching Amazon EC2 instances in the group. The new instances are then added to the AWS Batch Kubernetes namespace of the cluster.
- The Kubernetes scheduler places any configured DaemonSet on the node.
- Once the node is ready, AWS Batch starts sending pod placement requests to the cluster, using labels and taints to make the placement choices for the pods and bypassing much of the logic of the Kubernetes scheduler.
- The process is repeated, scaling as needed across more EC2 instances in the scaling group. This continues until the maximum configured capacity is reached.
- If the job queue has another compute environment defined, such as Spot instances, additional nodes will launch in that compute environment.
- AWS Batch removes the nodes from the cluster and terminates the instance once the work is complete.
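The scale-out and scale-in behavior in the steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not AWS Batch's actual algorithm: time advances in ticks, every placed pod finishes within its tick, each node holds a fixed number of pods, and compute environments are filled in the order listed. The function name `simulate`, the environment names, and all the numbers are hypothetical.

```python
import math


def simulate(arrivals, pods_per_node, environments):
    """Toy scaling model.

    arrivals: number of pods submitted at each tick.
    environments: ordered list of (name, max_nodes) pairs.
    Returns one (nodes_per_environment, pods_placed, pods_waiting) per tick.
    """
    waiting = 0
    timeline = []
    for arriving in arrivals:
        queued = waiting + arriving
        remaining = queued
        nodes = {}
        for name, max_nodes in environments:
            # Launch only as many nodes as the backlog needs, up to the cap.
            needed = math.ceil(remaining / pods_per_node)
            nodes[name] = min(needed, max_nodes)
            remaining = max(remaining - nodes[name] * pods_per_node, 0)
        waiting = remaining  # pods that did not fit stay queued
        timeline.append((nodes, queued - waiting, waiting))
    return timeline


timeline = simulate(
    arrivals=[0, 10, 0, 0],
    pods_per_node=2,
    environments=[("on-demand", 3), ("spot", 1)],
)
for tick, (nodes, placed, still_waiting) in enumerate(timeline):
    print(tick, nodes, placed, still_waiting)
```

With these made-up numbers, the cluster starts with no batch nodes, fills the first environment to its cap of three nodes, overflows one node into the second environment, leaves two pods queued for the next tick, and then drops back to zero nodes once the queue is empty.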
AWS Batch is available today at no charge. The only cost associated with this service is the resources the job consumes. AWS created a self-guided workshop to help developers get started. There is also the Getting Started with Amazon EKS topic in the AWS Batch User Guide.