CAST AI sponsored this post.
Spot instances offer an attractive discount, but there’s a catch: AWS can pull the plug at any time, giving you only two minutes to move your Kubernetes workloads somewhere else. Used smartly, spot instances can lead to dramatic cost savings. However, they aren’t a silver bullet, and they come with limitations.
If your industry or setup prevents you from running workloads on spot instances, don’t worry. There are still ways to cut your cloud bill, potentially even in half. The best way to get started is with cloud instance rightsizing.
In essence, rightsizing means:
- Analyzing the utilization and performance metrics of your instances
- Understanding whether they’re running efficiently for the price
- Improving the performance-to-cost ratio of your infrastructure by upgrading, downgrading or terminating instances
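As a sketch, the decision logic behind these three steps can be as simple as comparing average utilization against a few thresholds. The thresholds and actions below are illustrative assumptions, not any provider’s actual rightsizing algorithm:

```python
# Hypothetical rightsizing sketch. Thresholds are illustrative assumptions,
# not CAST AI's or AWS's actual logic.

def rightsize(cpu_util: float, mem_util: float) -> str:
    """Suggest an action from average CPU/memory utilization (0.0-1.0)."""
    if cpu_util < 0.05 and mem_util < 0.05:
        return "terminate"   # effectively idle: stop paying for it
    if cpu_util > 0.80 or mem_util > 0.80:
        return "upgrade"     # risk of throttling or out-of-memory kills
    if cpu_util < 0.30 and mem_util < 0.30:
        return "downgrade"   # paying for headroom that's never used
    return "keep"

print(rightsize(0.72, 0.55))  # keep
print(rightsize(0.12, 0.20))  # downgrade
```

In practice you would feed in percentile metrics collected over days or weeks rather than a single average, so that short spikes don’t trigger a downgrade.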
Here’s a step-by-step guide to choosing the best Amazon EC2 instance for your Kubernetes workload.
1. Define Your Requirements
You might be tempted to pick the cheapest instance, but that’s a risk: it might run into performance issues once you deploy a memory-intensive application on it.
Consider the requirements of your workload first. To keep cloud costs at bay, order only what you need across these key compute dimensions:
- CPU count
- CPU architecture
- SSD storage
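Once the requirements are written down, narrowing the field is a simple filter. The mini-catalog below is illustrative; the specs are simplified stand-ins, not authoritative AWS data:

```python
# Tiny illustrative catalog. Specs are simplified stand-ins for real
# instance data, not authoritative AWS figures.
CATALOG = {
    "t3.medium":  {"vcpus": 2, "arch": "x86_64", "local_ssd": False},
    "c5.xlarge":  {"vcpus": 4, "arch": "x86_64", "local_ssd": False},
    "c5d.xlarge": {"vcpus": 4, "arch": "x86_64", "local_ssd": True},
    "c6g.xlarge": {"vcpus": 4, "arch": "arm64",  "local_ssd": False},
}

def matching_types(min_vcpus: int, arch: str, need_ssd: bool) -> list:
    """Return catalog entries that satisfy all three compute dimensions."""
    return sorted(
        name for name, spec in CATALOG.items()
        if spec["vcpus"] >= min_vcpus
        and spec["arch"] == arch
        and (spec["local_ssd"] or not need_ssd)
    )

print(matching_types(4, "x86_64", need_ssd=True))  # ['c5d.xlarge']
```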
So, you’ve identified a set of matching instance types. Before selecting one, choose between CPU- and GPU-dense instances. If you’re building a machine learning application, pick a GPU instance: it will train models much faster. AWS has also introduced an instance type designed specifically for inference (read more in the next section).
2. Select an Instance Type that Fits
AWS offers several types of EC2 instances that match different use cases and offer different parameter ratios. They’re also scalable, so you can always move to a larger size if your workload demands it.
General Purpose
This type comes with a balanced ratio of CPU to memory. It’s a good fit for general-purpose applications that use CPU and memory in equal measure, such as web servers with low to medium traffic or smaller databases.
Compute Optimized
These instances are optimized for CPU-intensive workloads and have a high ratio of CPU to memory. Pick them for use cases such as web servers with medium traffic, batch processing or application servers.
Memory Optimized
This type offers a high memory-to-CPU ratio and works well for production workloads like database servers, analytics or larger in-memory caches.
Storage Optimized
A storage-optimized instance is a good choice for workloads that need heavy read/write operations and low latency. It’s great for big data, SQL and NoSQL databases, and data warehousing.
Accelerated Computing
These instances use hardware accelerators to carry out tasks such as data pattern matching, graphics processing and floating-point calculations much faster than software running on CPUs. Select them for machine learning and high-performance computing (HPC).
EC2 Inf1
AWS introduced this type to support machine learning inference. EC2 Inf1 offers up to 30% higher throughput and up to 45% lower cost per inference than EC2 G4 instances.
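As a quick sanity check on those figures: cost per inference equals hourly price divided by throughput, so the two quoted ratios together imply a relative hourly price. Only the ratios come from the quote above; the derivation is ours:

```python
# Cost per inference = hourly price / throughput, so the two quoted
# ratios together imply Inf1's hourly price relative to G4.
throughput_ratio = 1.30           # Inf1 vs. G4 throughput (up to 30% higher)
cost_per_inference_ratio = 0.55   # Inf1 vs. G4 cost/inference (45% lower)

hourly_price_ratio = cost_per_inference_ratio * throughput_ratio
print(round(hourly_price_ratio, 3))  # → 0.715, i.e. roughly 28.5% cheaper per hour
```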
3. Consider the Matter of Chips and Processors
Cloud providers run their compute services on many generations of hardware, and the chips in those machines have different performance characteristics.
Here’s an example scenario: one instance has an older-generation processor that is slightly slower, while another has a newer-generation processor that is slightly faster. Without checking, you might end up paying for performance characteristics you don’t actually need.
How to Verify Real Performance Across Various Instances
The best method for choosing the right VM for the job is benchmarking: run the same workload on different instance types and compare the results. We did this at CAST AI when we started over a year ago. Here’s what we learned.
Example Insight: Differences in Cloud Endurance
To understand instance performance, we developed a metric called “Endurance Coefficient.” Here’s how we calculated it:
- We measured how much work a given instance type can do in 12 hours and how variable its performance is.
- Note that for a sustained baseload, your goal is stability. For a bursty workload, lower stability works just fine.
- In our case study, instances with stable performance score closer to 1 (100%), and instances with erratic performance score closer to 0.
Making a decision here might be difficult, since it’s not clear how much stability you’re getting for your money when using different instance types.
In our case study, the DigitalOcean s1_1 machine achieved an endurance coefficient of 0.97107 (97%). The AWS t3_medium_st achieved 0.43152 (43%).
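For illustration, here is one simple way to turn throughput samples into a 0-to-1 stability score: one minus the coefficient of variation, clamped to zero. This is a simplified stand-in, not the exact formula behind the numbers above:

```python
# One plausible stability score (an assumption, not CAST AI's exact
# formula): 1 minus the coefficient of variation of per-interval
# throughput, clamped so it never goes below 0.
from statistics import mean, pstdev

def endurance(throughput_samples: list) -> float:
    m = mean(throughput_samples)
    cv = pstdev(throughput_samples) / m   # relative variability
    return max(0.0, 1.0 - cv)

stable = [100, 101, 99, 100, 100, 100]   # steady performer
erratic = [100, 35, 160, 20, 150, 40]    # noisy performer

print(round(endurance(stable), 3))   # close to 1.0
print(round(endurance(erratic), 3))  # much lower
```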
Note: Consider ARM-Powered Instances
AWS already offers ARM-based instances: the EC2 A1 family uses the first-generation AWS Graviton processor, while the newer M6g, C6g and R6g families use Graviton2. ARM chips are cheaper to run and cool because they consume less energy, so cloud providers are likely to charge less for them.
If you’d like to use them, you might have to re-architect your delivery pipeline to compile your application for ARM. On the other hand, if you’re running an interpreted stack (think Python, Ruby or Node.js), your applications will probably run without changes.
4. Take Advantage of CPU Bursting
Burstable performance instances were designed to offer businesses a baseline level of CPU performance with the extra option of bursting to a higher level when the workload requirements change suddenly.
Such instances work well with low-latency interactive applications, microservices or small and medium databases.
Note that the number of accumulated CPU credits depends on the instance type you pick: in general, larger instances collect more credits per hour. If your workload needs to burst for four hours or more per day on average, a burstable instance may not be the most economical choice. If you run an e-commerce site that gets a surge of visitors after each marketing campaign, however, a burstable instance is a good fit.
Consider This: CPU Capacity Has Its Limits
By testing burstable instances, we discovered that compute capacity usually grows linearly during the first four hours and becomes much more limited after that. Here’s a chart for a t2_2xlarge instance.
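A toy credit model reproduces that shape. The constants below are illustrative placeholders, not official AWS credit rates; the point is the mechanic: credits accrue at a fixed baseline rate, burn faster under full load, and bursting stops when the balance runs out:

```python
# Simplified burstable-credit model with illustrative constants
# (placeholders, not real AWS credit rates for any instance type).
EARN_PER_MIN = 1.35    # credits earned per minute at the baseline rate
BURN_PER_MIN = 8.0     # credits burned per minute at 100% CPU on 8 vCPUs
START_BALANCE = 1600   # credit balance at launch

balance = START_BALANCE
minutes_at_full_speed = 0
while balance + EARN_PER_MIN >= BURN_PER_MIN:
    balance += EARN_PER_MIN - BURN_PER_MIN
    minutes_at_full_speed += 1

print(minutes_at_full_speed / 60)  # → 4.0 hours with these constants
```

With these placeholder numbers the balance lasts about four hours of sustained full-speed work, after which the instance is throttled back to its baseline, which matches the linear-then-limited shape we observed.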
5. Double-Check Storage Transfer Limitations
Another significant cost generator is data storage. AWS EC2 instances use Elastic Block Store (EBS) volumes for disk storage. When choosing an instance type, make sure it supports the storage throughput your application requires, and avoid expensive drive options such as premium SSDs unless you expect to use them to the fullest.
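A pre-launch sanity check can be as simple as comparing the application’s required throughput against the candidate type’s EBS ceiling. The type names and limits below are placeholders, not official AWS figures; look up the real per-type limits before relying on them:

```python
# Placeholder EBS throughput ceilings (Mbps). These names and numbers are
# invented for illustration -- check AWS's per-instance-type limits.
EBS_LIMIT_MBPS = {"small.example": 650, "large.example": 4750}

def throughput_ok(instance_type: str, required_mbps: float) -> bool:
    """True if the instance type's EBS ceiling covers the requirement."""
    return EBS_LIMIT_MBPS[instance_type] >= required_mbps

print(throughput_ok("small.example", 1000))  # False: pick a bigger type
print(throughput_ok("large.example", 1000))  # True
```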
6. Don’t Forget About Network Bandwidth
If you’re facing a huge data migration or a high volume of traffic, take a look at the size of the network connection between your instance and the consumers assigned to it.
You might find instances that advertise 10 or 20 Gbps of transfer speed. Remember, though, that only the larger instances within a family can sustain this level of network bandwidth.
Wrap Up: Automated Rightsizing
Instead of rightsizing cloud instances manually, you can get an automated solution that does the job for you.
We built CAST AI to do this with an AI-driven instance-selection algorithm. It selects the best instance type that meets the workload requirements and applies changes whenever your cluster needs extra nodes. Your workloads will always be running at maximum performance and minimum cost.
Sign up to CAST AI and analyze your cluster for free to start saving from 50% to 90% on your cloud bill.
The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Real.
Amazon Web Services (AWS) is a sponsor of The New Stack.
Featured image via Pixabay.