Microsoft Acquires Citus Data for Postgres Scale

Microsoft announced Thursday that it has purchased Citus Data, a San Francisco-based startup focused on making Postgres scale horizontally.
As an open source PostgreSQL extension, the Citus software essentially turns Postgres into a distributed database. Citus Data provides the horizontal scalability of a NoSQL database with the transactional consistency and functionality of a relational database.
The company was founded eight years ago, when much of the tech world was focused on Hadoop and NoSQL databases, which were perceived to offer more options for data management than a traditional relational database, according to a Citus Data blog post.
The company, however, chose to build on top of the nearly 30-year-old Postgres. Despite its age, Postgres was named DBMS of the Year for 2018 by DB-Engines, based on its continued popularity.
At a high level, Citus distributes the data across a cluster of commodity servers. Incoming SQL queries are then processed in parallel across these servers.
It uses an architecture similar to Hadoop: One master node uses metadata about the shards and parcels out fragments of incoming queries to worker nodes that actually run the query pieces in parallel.
It provides:
- Massively parallel processing for SQL analytics.
- Real-time inserts/updates on distributed database tables.
- Dynamic scalability on commodity hardware with the ability to easily add or remove nodes.
- JSON and structured data in one database.
- The expressiveness and familiarity of Postgres.
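The master-and-workers design described above can be illustrated with a toy scatter-gather sketch in Python. This is not Citus code, and it makes several assumptions for illustration: the names `Coordinator`, `shard_for` and `partial_sum` are hypothetical, in-memory lists stand in for shards on worker nodes, threads stand in for the workers themselves, and CRC32 is an arbitrary stand-in for whatever hash function a real system would use.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def shard_for(key: str, num_shards: int) -> int:
    """Map a distribution-key value to a shard index with a stable hash."""
    return crc32(key.encode("utf-8")) % num_shards


def partial_sum(rows):
    """Query fragment run against one shard: sum the amounts it holds."""
    return sum(amount for _, amount in rows)


class Coordinator:
    """Hypothetical master node: routes rows to shards, fans out queries."""

    def __init__(self, num_shards: int = 4):
        # Each list stands in for one shard living on a worker node.
        self.shards = [[] for _ in range(num_shards)]

    def insert(self, tenant_id: str, amount: int) -> None:
        # The hash of the distribution key decides where the row lives.
        index = shard_for(tenant_id, len(self.shards))
        self.shards[index].append((tenant_id, amount))

    def total_amount(self) -> int:
        # Scatter: run the same fragment on every shard in parallel.
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = list(pool.map(partial_sum, self.shards))
        # Gather: merge the per-shard results into one answer.
        return sum(partials)


if __name__ == "__main__":
    coordinator = Coordinator(num_shards=4)
    for i in range(100):
        coordinator.insert(f"tenant-{i % 10}", i)
    print(coordinator.total_amount())
```

In this sketch every row for a given tenant lands on the same shard, so an aggregate can be computed shard by shard and merged by the coordinator, which is the general idea behind parceling out query fragments to workers.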
Stanford alums Ozgun Erdogan, Sumedh Pathak and Umur Cubukcu founded the company in 2010 and went through Y Combinator in summer 2011. The company open sourced its technology in March 2016. It has raised more than $13 million.
“Together, Microsoft and Citus Data will further unlock the power of data, enabling customers to scale complex multi-tenant SaaS applications and accelerate the time to insight with real-time analytics over billions of rows, all with the familiar PostgreSQL tools developers know and love,” Rohan Kumar, corporate vice president for Azure Data, wrote in a blog post announcing the acquisition. Terms of the deal were not disclosed.
Microsoft is touting the acquisition as part of its continuing support for open source databases including MySQL, PostgreSQL and MariaDB, as well as its investments in SQL Server on Linux, the multimodel NoSQL database Azure Cosmos DB, and the open source analytics projects Hadoop and Spark.
Microsoft is not expected to significantly alter Citus Data’s business, but Kumar added the acquisition will “accelerate the delivery of key, enterprise-ready features from Azure to PostgreSQL and enable critical PostgreSQL workloads to run on Azure with confidence.”
Citus is available as a fully managed database as a service, as enterprise software, and as a free open source download.
The Citus co-founders add that “as part of Microsoft, we will stay focused on building an amazing database on top of PostgreSQL that gives our users the game-changing scale, performance, and resilience they need. We will continue to drive innovation in this space.”
Citus is not the only startup basing its business on Postgres. Postgres is also the basis for the time-series database Timescale, and Crunchy Data is building on the concept of operators for its Postgres-based database, as is PingCAP for its MySQL-compatible TiDB.