How to Size Your MongoDB Clusters
Overprovisioning remains a rampant problem, which companies such as Qubole aim to solve through automation. It can be costly when using public clouds if companies pay for more capacity than they need to run their workloads. Underprovisioning, however, can lead to dire performance problems and customer churn.
At MondoDB World 2017 recently, Jay Runkel, principal solutions architect at MongoDB, demonstrated how to apply a little math to get a pretty close guesstimate of the resources needed to run your database workload.
It can be key to a go/no-go decision on building an app at all — if this app can be built for $10,000 a month in Atlas, it’s a go, but not if could cost $100,000/month, for instance.
He cautioned, however, that his method is still a best guess.
“If you require precision, you have to build a prototype with actual data, actual queries on the class of server you plan to use in production,” he warned.
It involves building a spreadsheet, which is available on GitHub.
During the design processes, you might brainstorm lots of different schemas you might want to use. Depending on the schema you pick, that might have implications on the type of hardware you need or the number of shards. So you might do a sizing exercise for each of the top candidate schemas you come up with, he said.
This exercise is to size replica sets and/or sharded clusters, so the solution will consist of some number of shards, as well as how many CPUs, storage in available disk space, IOPS, memory and sometimes network.
Figuring out the needed IOPS — how quickly the drive can randomly find random bits of information on the data store, whether on hard drive, SDD or hard disk — is the hardest part, he said.
MongoDB’s new storage engine, WiredTiger, offers improved throughput and compression. It has a cache that will include a file for each collection and index. Roughly 50 percent of the RAM in the server will be dedicated to this cache.
The working set is the indexes plus frequently accessed documents. If you can get all your indexes and frequently accessed documents into memory, you’re going to do a lot less IO, Runkel said. You want to make sure RAM is greater than the working set, but also less than the data size.
Just to simplify things in this exercise, he assumes that all indexes are in memory and all our documents are in the file system. Here is how to break down IOPS performance in this approach:
If the system has to go to disk for documents, a find query for 100 documents requires 100 IOPS. If we’re doing 10 per second, then we need 1,000 IOPS on the server to process those 10 queries in a second.
But if your find indexes aren’t good, then this whole assumption gets blown out of the water, he said.
If you’re doing an insert, you have to write the document to disk, so that’s one IOP. And to update each index file, that’s one IOP for each index. If you know how many indexes we’re going to do per second, we can estimate that.
Similarly with deletes. Using indexes to find the document and mark it, then update each index file requires one IOP plus the number of index calls.
Updates are almost the same. In MongoDB, updates are really a delete plus an insert, so it’s navigating memory indexes, marking the document as deleted, inserting the new version of the document and updating the indexes. So it’s two plus the number of indexes.
This provides an estimate of the IOPS you need.
“That’s the good news. The bad news is that I simplified a bunch of stuff and you’ll way overestimate the number of IOPS you have. Once you have that baseline, we need to apply some modifications to remove that overcounting,” Runkel said.
Those things include:
Document size: IO systems work on blocks, not documents, so if you’re doing lots of inserts, it’s possible small documents will fit on one disk block. So every insert of a disk block will include many documents and will involve fewer IOPS.
Indexed arrays: Rather than affecting one part of a document file, an insert will affect many, so you have to account for that.
Journal, log: This model ignores the overhead from journaling and logging, which get executed every time you do an update. If you’re following MongoDB best practices, you put the journal and the log on separate IO systems so they don’t affect your workload. But if you’re not architecting your server like that, you may need to account for those as well, he said.
Checkpoints: In MongoDB, when you do a write, the document on disk is not updated immediately. It’s updated in RAM, then written to the journal so that write is persisted, and periodically a checkpoint occurs, typically at 60 seconds or 2GBs of data, whichever occurs first. That checkpoint is a background process that writes the data to disk. If you do lots of writes on a document in a short period of time — you may change that document 18 times — but it’s only written to disk once at the checkpoint. So we’re going to way over overcount the IOPS if we don’t take that into account.
The sizing process involves:
- Figuring out the data size and index size of collections,
- Determining the working set,
- Calculating the required IOPS,
- Adjusting assumptions based on things like working set and checkpoints,
- Calculating the number of required shards.
For each collection, you should estimate the number of documents and their size, which gives an estimated total size of each collection. An analysis of the indexes can produce an index size. And you can look at how well this is going to compress. All of this will vary by application.
There are lots of tools in MongoDB to give you average size of your documents, he said, but if it’s the first day of your application, you don’t have any documents. It’s better to build some rather than to guess their size. There are tools on the internet to help you build data sets to get to 5 to 10 percent of your estimated data size to create the measurements you need.
Once you have a collection, you can use db.collection.stats method. It will show you the count, the size of the collection, average document size and how it compresses. And it tells how big the indexes are. You can extrapolate out from there.
You just figured out the index size. For frequently accessed documents, you have to know your application’s “hot” data set and do an estimate.
Now, to figure out the IOPS for queries, do the following:
- Find queries equals documents returned per second,
- For updates per second, figure one plus the number of indexes,
- For indexes, one plus the number of indexes,
- For deletes, two plus the number of indexes.
Total all that out, then subtract out if there will a lot of updates in a short period.
For the find queries, you have to get some feel for how the working set will cover your workload. Usually, you don’t have to worry about the number of CPUs needed because the RAM requirements you ask for will come with plenty of CPUs. The only exception is if you’re doing lots of aggregations.
In terms of estimating the amount of disk space you’ll need: If you know storage capacity, working set size, IOPS estimate, you should have some idea of what type of machine to deploy on.
If the total requires 9TB of data and the WiredTiger compression says it requires 3TB of disk, you’ll need two shards if each server provides 2 terabytes of data. He recommends adding some cushion because some Mongo processes will temporarily expand the storage size.
The same analysis can be done for estimating RAM needs. If the working set needs 428GB and the servers have 128, it requires four shards.
In terms of IOPS: If we need 50,000 IOPS and AWS instance has 20,000, you need three shards.
So you look at all those and pick the biggest number, he said. If you said you need nine shards for disk, but three shards for RAM, you need nine shards.