Cosmos DB: Microsoft Azure’s All-in-One Distributed Database Service
Microsoft’s recently released Cosmos DB is a globally distributed NoSQL database service that lets you pick and mix your favorite data model and database APIs and still get the consistency offered by a standard transactional database.
Cosmos DB, which debuted at the Microsoft Build user conference earlier this year, is an upgrade of the Azure DocumentDB service, pressing beyond that service’s original roots as a JSON document store.
NoSQL systems often scale well but don’t have a rich query experience; schema-based databases have rich query options but don’t scale well. The combination of data models and APIs in Cosmos DB means you can pick a middle ground, taking away the burden of dealing with schema but not at the expense of queries.
Cosmos DB offers a globally distributed database with elastic scale, petabytes of storage, guaranteed single-digit millisecond latencies, and no need for schema or index management. It can handle multiple data models and data APIs and offers multiple consistency models that give you some entirely new ways to build a distributed system.
“SQL servers were optimized for reads and queries, for the workloads of the last twenty years, but the world has changed,” said Microsoft distinguished engineer and founder of Cosmos DB Dharma Shukla. Internet of Things “devices require a rapid velocity of data; there’s lots of data being generated at a high rate and you need an engine that can sustain large, rapid writes and still serve queries.”
Cloud computing sets the stage for globally distributed apps. Many organizations, however, will distribute the front-end, but leave the back-end database in one location. With Cosmos DB, data is distributed in sync with the app.
Cosmos DB is one of the fastest growing Azure services, although Microsoft can’t name some of the largest customers using it. Microsoft is also a little cagey about how it uses the service itself, but all Microsoft billing, including all the Store transactions, goes through it.
Any Schema, Any Model, Any API
Cosmos DB has a write-optimized, latch-free database engine with automatic indexing. “We can keep the database in sync at all times and we can deploy it worldwide without worrying about schema versions; because it’s schema agnostic, there is no schema version,” Shukla said. With no versioning to take care of, developers can iterate their apps rapidly.
The database engine in Cosmos DB supports multiple data models and APIs. “No data is born relational,” Microsoft Cosmos DB architect Rimma Nehme told us; “it’s born dirty and messy, in whatever shape or structure it’s created.”
Future updates will add support for Cassandra and perhaps Amazon Web Services’ DynamoDB and other database stores, Nehme suggested. “We’re not dogmatic about APIs or data models.”
The same underlying database engine handles all these data models, Shukla told us. “Instead of hosting different database engines [for different models], we created one engine that’s schema agnostic, that’s write optimized, that can support multiple data models — and that’s API neutral.”
The same is true of graph support, which is based on Apache TinkerPop. “The graph layer is very generalized. Gremlin is one query language but we’re also going to add graph operators to DocumentDB SQL, MongoDB has graph operators so we will support those. Graph is very interesting; there are a lot of IoT scenarios. There’s a lot of momentum around graphs inside Microsoft and customers have been asking for it.”
NoSQL isn’t the only possibility; Shukla suggested that ‘significant proportions’ of the full ANSI SQL grammar could also be mapped to the Cosmos DB data model.
Choose Your Consistency
The exciting thing about Cosmos DB is that the consistency models it gives you are genuinely new. Most users choose either strong or eventual consistency, the standard models other distributed databases offer. The rest pick one of the three consistency models in between those two extremes: bounded staleness, session and consistent prefix.
Nehme described the options as “a slider that allows Cosmos DB to behave like a relational database or a NoSQL database.”
“Implementing any distributed system involves a trade-off between, on the one hand, the degree of consistency it provides to users, and on the other its availability and response time,” Nehme said. Some customers need consistency enough that they’re willing to pay for that in performance; the other extreme has been NoSQL speed but no consistency guarantees.
Bounded staleness means that reads can lag behind writes, but only by a fixed amount (in seconds or numbers of operations), and write order is guaranteed; that’s a good match for gaming or where you need to monitor a sensor and take action if a problem reading occurs. Session-based consistency guarantees monotonic reads and writes, in sequence, within a session; latency is better than with bounded staleness, but the global ordering guarantees may be weaker.
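To make the bounded staleness rule concrete, here is a minimal Python sketch of the check a replica might pass before serving a read. It is a toy model, not Cosmos DB code: the names StalenessBound and within_bound are hypothetical, and requiring both the operation bound and the time bound to hold is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StalenessBound:
    """Hypothetical bound on how far a read may trail the latest write."""
    max_lag_ops: int        # most writes a replica may be behind
    max_lag_seconds: float  # most seconds a replica may be behind


def within_bound(bound, latest_write_seq, replica_seq,
                 latest_write_time, replica_time):
    """Return True if the replica is fresh enough to serve a read.

    Assumption: both limits must hold. The sequence numbers and
    timestamps are illustrative inputs, not values from a real service.
    """
    ops_behind = latest_write_seq - replica_seq
    seconds_behind = latest_write_time - replica_time
    return (ops_behind <= bound.max_lag_ops
            and seconds_behind <= bound.max_lag_seconds)
```

With a bound of 100 operations and 5 seconds, a replica that is 50 writes and 3 seconds behind would pass the check, while one that is 200 writes behind would not.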
“If you have devices with data that’s cached locally, session consistency can go a long way to solve the problem of having unique data at the edge [of the network]; when the device gets online that cache has to be converged,” Shukla explained. This is the most popular model for developers using the service. “The reason session is a sweet spot is because it enables all these scenarios without you having to choose.”
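One common way to get session guarantees is for the client to carry a token recording the newest write it has seen, and to accept reads only from replicas that have caught up to that token. The Python sketch below illustrates that general idea; the Replica and Session classes are hypothetical and say nothing about how Cosmos DB implements sessions internally.

```python
class Replica:
    """Toy replica: a key-value store plus the last write sequence applied."""

    def __init__(self):
        self.applied_seq = 0
        self.data = {}

    def apply(self, seq, key, value):
        self.data[key] = value
        self.applied_seq = seq


class Session:
    """Tracks the newest write sequence this client has written or read."""

    def __init__(self):
        self.token = 0

    def note_write(self, seq):
        # Remember our own write so later reads must reflect it.
        self.token = max(self.token, seq)

    def read(self, replicas, key):
        # Accept only a replica that has caught up with this session.
        for replica in replicas:
            if replica.applied_seq >= self.token:
                # Advance the token so later reads never go backwards.
                self.token = replica.applied_seq
                return replica.data.get(key)
        raise RuntimeError("no replica has caught up with this session yet")
```

A session that has just written skips a lagging replica and reads its own write, while a different session with no history can still be served by the lagging replica at lower latency.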
The newest model is consistent prefix, which guarantees you won’t get gaps. “If you’re operating on a record, version by version, you won’t get gaps in those versions. If you see version one of a record, then you’ll see versions two, three and four in that order; you won’t see version four arriving and then it went back to version one and then on to version seven,” explained Shukla. “This is a much stricter guarantee than eventual consistency gives, but you get high availability and low latency.”
Consistent prefix is “very good for building messaging and queuing systems,” Shukla said.
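The no-gaps guarantee Shukla describes can be sketched as a replica that buffers updates arriving out of order and exposes them only once every earlier version has been applied. This Python example is a toy illustration under that assumption; PrefixReplica is a hypothetical name, and versions are assumed to start at 1.

```python
class PrefixReplica:
    """Toy replica that only ever exposes a gap-free prefix of versions."""

    def __init__(self):
        self.applied = []    # values applied so far, in version order
        self._pending = {}   # versions that arrived early, keyed by version

    def receive(self, version, value):
        """Accept an update that may arrive in any order."""
        if version <= len(self.applied):
            return  # duplicate of a version already applied; ignore it
        self._pending[version] = value
        # Apply buffered versions for as long as the next one is available.
        while len(self.applied) + 1 in self._pending:
            next_version = len(self.applied) + 1
            self.applied.append(self._pending.pop(next_version))
```

If version four arrives right after version one, a reader still sees only version one until versions two and three arrive; then all four become visible in order, which is the ordering a queue or messaging system relies on.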
Cosmos DB is not the only distributed database service with variable consistency levels.
Google Spanner recently added bounded and exact staleness, with a maximum staleness of one hour. Spanner’s consistency models, however, currently cover only a single region. This will increase to three data centers later in 2017. Azure, in contrast, offers all consistency models across all 38 of its data centers.
Cosmos DB is a ‘ring 0’ service in Azure so as new Azure regions are rolled out, Cosmos DB will always be on the list of available services, according to Microsoft. You can start with one region or many, and you can add and remove the regions you want your data to be available in, without any downtime. You can also use policy to geofence data into specific regions if you’re covered by regulations.
The team used the TLA+ specification language created by Turing Award winner Leslie Lamport to specify the different consistency models.
“Consistency is no longer a theoretical thing; it’s a reality that developers are facing,” Shukla pointed out. If you’ve got customers around the world and you want fast performance, you have to distribute your database and that means handling consistency.
Shukla compares defining these consistency models for distributed data storage to the codified isolation levels that relational database models created; now you know what trade-offs you’re making. And he noted, “you can save a lot of money by choosing any of those three compared to strong consistency.”
Performance and Promises
The Cosmos DB service level agreements (SLAs) put a price on delivering the promised latency, throughput and availability. “Developers want predictable performance for unpredictable needs,” explained Nehme.
The SLA says 1KB reads from Cosmos DB in the same Azure region will take less than 10ms and indexed writes less than 15ms — but that’s the worst case scenario and the median results are actually less than 2ms and 6ms, respectively, according to Nehme, and that’s with data encrypted at rest. And because you can distribute your data into multiple regions, you can get that low latency wherever in the world you need it. The SLA for distributing your database into another Azure region is 30 minutes, but again, the data movement happens faster than that on the service.
Cosmos DB can elastically scale not only storage, which Shukla called a relatively easy problem “because it takes time for a table to grow to a petabyte or for data to be deleted,” but also throughput. You can change the number of transactions per second in your code, or set different regions to have different throughput rates based on the time of day. “Dealing with latency and throughput forces you to create a resource governed stack. It’s a very difficult distributed systems problem.”
Currently, Cosmos DB charges you by the hour. But if you’re provisioning for peak usage the day you launch a new online service or game, you want more granularity than that, so Microsoft is introducing request units per minute. You can choose how many requests per minute you want your database to handle, and rather than dropping requests if they exceed the limit you’ve set, Cosmos DB can borrow from your budget for the hour to handle sudden bursts.
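The borrowing idea can be pictured as a per-minute allowance backed by a shared hourly reserve. The Python sketch below is a rough model only: the article does not describe the actual accounting, so the BurstBudget class, its parameters and the rule that overflow is charged one request at a time against the reserve are all assumptions.

```python
class BurstBudget:
    """Toy model: a per-minute request limit with an hourly reserve for bursts."""

    def __init__(self, per_minute_limit, hourly_reserve):
        self.per_minute_limit = per_minute_limit
        self.hourly_reserve = hourly_reserve  # extra requests available this hour
        self.used_this_minute = 0

    def start_new_minute(self):
        # The per-minute allowance resets; the hourly reserve does not.
        self.used_this_minute = 0

    def try_request(self, cost=1):
        """Return True if the request is admitted, False if it is rejected."""
        if self.used_this_minute + cost <= self.per_minute_limit:
            self.used_this_minute += cost
            return True
        if cost <= self.hourly_reserve:
            # Over the minute's limit: borrow from the hour's budget instead.
            self.hourly_reserve -= cost
            return True
        return False
```

In this model a burst above the per-minute limit succeeds until the hourly reserve runs out, after which further requests in that minute are rejected.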
Alongside those SLAs are manual and automatic failover options — because the way to see if consistency works is to see what happens when the inevitable network failure happens and partitions your database. As you create Cosmos DB regions in the portal and populate them with data, you can set the priorities of the different regions and if a failover happens it follows that order.
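Priority-ordered failover amounts to walking the list of regions in the order you set and taking the first one that is still reachable. A minimal Python sketch follows; the function name and the region names in the usage note are illustrative, not taken from the Azure portal or any SDK.

```python
def pick_failover_region(priorities, healthy):
    """Return the highest-priority region that is still healthy, or None.

    priorities: region names, highest priority first (as you would order them).
    healthy: set of region names currently reachable.
    """
    for region in priorities:
        if region in healthy:
            return region
    return None
```

For example, with priorities of West US, North Europe and Southeast Asia, an outage in West US would send traffic to North Europe, the next region in the list.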
As well as local persistence and replication within and across regions, Cosmos DB takes periodic backups; if you accidentally delete data you can contact Azure support to get it restored (and there’s an API coming to let you restore the data yourself).