ScyllaDB Challenges DynamoDB on Latency and Pricing
ScyllaDB is a distributed database that operates at scale and is architected for data-intensive applications that need high performance and low latency. The creators consider the database system to be a close competitor to Amazon Web Services’ DynamoDB NoSQL database service. They are so confident in this claim that they have released ScyllaDB Alternator, which offers A/B testing between Scylla and Dynamo with just a few scripts and zero downtime.
At the ScyllaDB Summit 2023, ScyllaDB Vice President of Product Tzach Livyatan made this case in the talk “Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More Affordable, All at Once.”
Livyatan put ScyllaDB and Amazon’s DynamoDB in a side-by-side matchup in two categories: price and latency. In his testing, ScyllaDB had lower latency and lower pricing. ScyllaDB’s Alternator is a DynamoDB-compatible API, which makes any application that uses DynamoDB compatible with ScyllaDB as well.
Any vendor’s self-reported numbers should be taken with more than one grain of salt. Nonetheless, there is a lot of insight to be gained from reviewing this work.
Round One: Latency Testing
The Scylla team had the home-court advantage, with a deep understanding of ScyllaDB and how to manipulate it in testing. DynamoDB, by contrast, was a “black box” whose underlying technology was unknown to them. To test the two, they had to go through the same trial and error, mishaps and learnings as everyone else.
Here are the specs for latency testing:
- Yahoo! Cloud Serving Benchmark (YCSB) 0.18.0+, the “standard for NoSQL databases” per Livyatan
- Scylla’s “latest and greatest” version, Scylla Enterprise 2022.2
- A three-node Scylla cluster of i4i.2xlarge instances, split across us-east-1 zones b, c and d. Scylla Cloud defaults to three zones, which gives higher reliability at the cost of potentially slightly higher latency.
- The loaders were eight i4i.2xlarge nodes, with each machine running three instances of YCSB. The run used 40 threads per process across 18 processes, for a total parallelism of 720. A test with 50 threads showed no performance gains.
- They found the maximum throughput and then ran the tests at 70% of it, because latency suffers at maximum throughput.
- Testing was done with Uniform, Zipfian and Hotspot distributions, but this article only references Hotspot. Hotspot mimics the real-world scenario of hot partitions: there are many partitions, but only a few receive the bulk of the traffic.
- 1TB of storage since latency was being tested.
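To make the load setup above more concrete, here is a minimal Python sketch of three ideas from the list: total parallelism (40 threads times 18 processes), running at 70% of the measured maximum throughput, and a hotspot-style key chooser. The function names are hypothetical, and the 20%/80% hot-set split is an illustrative assumption rather than a figure from the talk. This is not YCSB's own code.

```python
import random

THREADS_PER_PROCESS = 40  # from the talk
YCSB_PROCESSES = 18       # from the talk
TARGET_FRACTION = 0.70    # run at 70% of the measured maximum


def total_parallelism(threads=THREADS_PER_PROCESS, processes=YCSB_PROCESSES):
    """Number of concurrent client threads across all loader processes."""
    return threads * processes


def target_throughput(max_ops_per_sec, fraction=TARGET_FRACTION):
    """Throughput to hold during the latency run, below the saturation point."""
    return max_ops_per_sec * fraction


def hotspot_key(rng, key_count, hot_data_fraction=0.2, hot_op_fraction=0.8):
    """Pick a key index so that most requests land on a small hot set.

    The 0.2 / 0.8 defaults are assumptions for illustration only.
    """
    hot_keys = max(1, int(key_count * hot_data_fraction))
    if rng.random() < hot_op_fraction or hot_keys >= key_count:
        return rng.randrange(hot_keys)  # a key from the hot set
    return hot_keys + rng.randrange(key_count - hot_keys)  # a cold key


if __name__ == "__main__":
    rng = random.Random(42)
    sample = [hotspot_key(rng, 1_000) for _ in range(10_000)]
    hot_share = sum(k < 200 for k in sample) / len(sample)
    print("parallelism:", total_parallelism())
    print("share of requests hitting the hot 20% of keys:", round(hot_share, 2))
```

Backing off from the maximum matters because request queues build up as a system nears saturation, so latency measured at peak throughput says more about queuing than about the database's normal behavior.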
Here are the results for DynamoDB:
And for ScyllaDB:
The graphs above show that latencies are higher in DynamoDB.
Here is the use case comparison (including yearly cost):
The chart above is the overall use case comparison, which takes the results from both tests and compares them against one another. ScyllaDB has lower latencies and significant cost savings, in ScyllaDB’s estimation.
Similar comparisons were done for provisioned tables, which also came out in ScyllaDB’s favor, according to the company. ScyllaDB provisions by cluster rather than by table, so if a cluster has spare allocation and one table is spiking, that table can sweep up the spare allocation. This allows capacity to be provisioned for the entire database rather than for individual tables.
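The difference between per-table and per-cluster provisioning can be illustrated with a toy Python model. All table names and numbers below are hypothetical, and the model deliberately ignores real-world details such as burst capacity. It only shows why a shared pool can absorb a spike that a fixed per-table limit cannot.

```python
def throttled_per_table(demand, provisioned):
    """Excess requests per table when each table can use only its own allocation."""
    return {table: max(0, demand[table] - provisioned[table]) for table in demand}


def throttled_per_cluster(demand, cluster_capacity):
    """Excess requests when all tables draw from one shared pool."""
    return max(0, sum(demand.values()) - cluster_capacity)


# Hypothetical ops/sec: "orders" is spiking, the other tables are quiet.
demand = {"orders": 1500, "users": 200, "sessions": 300}
# Hypothetical per-table limits, adding up to 3000 ops/sec overall.
provisioned = {"orders": 1000, "users": 1000, "sessions": 1000}

per_table = throttled_per_table(demand, provisioned)
pooled = throttled_per_cluster(demand, sum(provisioned.values()))

print(per_table)  # the spiking table exceeds its own limit
print(pooled)     # the same total capacity, pooled, covers the spike
```

With the same total capacity, the per-table model rejects part of the traffic to the spiking table, while the pooled model does not, because the quiet tables leave headroom.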
Zero Downtime Migration
Alternator, an open source part of ScyllaDB, offers a DynamoDB-compatible API over REST/HTTP. The migration itself is handled by a separate tool, the Scylla Migrator, which is built on Apache Spark. Alternator lets applications that use DynamoDB run against ScyllaDB as a drop-in replacement: Scylla is compatible with the same applications, SDKs, data modeling, queries and other features. There’s an in-depth demo starting at 17:17 of the video of the presentation.
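Separately from the demo, the drop-in idea can be sketched in Python with boto3, the AWS SDK: the application code stays the same and only the client's endpoint changes. The host name, port and placeholder credentials below are assumptions for illustration. What a real deployment needs depends on how Alternator is configured, so treat this as a sketch rather than a migration recipe.

```python
def alternator_kwargs(endpoint_url):
    """Client arguments for sending DynamoDB API calls to another endpoint.

    The region and credential values are placeholders (assumptions). Whether
    real credentials are required depends on the Alternator configuration.
    """
    return {
        "endpoint_url": endpoint_url,
        "region_name": "us-east-1",
        "aws_access_key_id": "placeholder",
        "aws_secret_access_key": "placeholder",
    }


def dynamodb_resource(endpoint_url=None):
    """Return a boto3 DynamoDB resource for AWS, or for a custom endpoint."""
    import boto3  # third-party AWS SDK, imported lazily

    if endpoint_url is None:
        return boto3.resource("dynamodb")  # regular AWS DynamoDB
    return boto3.resource("dynamodb", **alternator_kwargs(endpoint_url))


def read_item(resource, table_name, key):
    """Application code: identical regardless of which backend serves it."""
    response = resource.Table(table_name).get_item(Key=key)
    return response.get("Item")


# Usage (hypothetical host, port and table; requires boto3 and a running cluster):
# scylla = dynamodb_resource("http://scylla-node.example:8000")
# print(read_item(scylla, "users", {"user_id": "42"}))
```

Because `read_item` only sees the resource object, the same function could be called against both backends, which is the general shape of the side-by-side comparison described in the talk.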