You Don’t Need a Blockchain, You Need a Time-Series Database

21 Oct 2020 10:13am, by

Nicolas Hourcard
Nicolas is a lifelong technology enthusiast and co-founder of QuestDB. He previously leveraged time-series data for companies like Rothschild and Nasdaq. He splits his time between London and San Francisco and enjoys tennis, skiing and climbing new mountain peaks each year.

The use of blockchains for enterprise applications has proliferated in recent years; indeed, IBM X-Force Red has said that “organizations are seeing real efficiencies and cost savings from its use.” Forward-looking enterprises can choose between R3’s Corda, Consensys’ Quorum, or even Hyperledger (hosted by the Linux Foundation) to support their applications. Such blockchains would in theory provide cross-industry support, from real estate to financial services, and from healthcare to supply chain management. However, the reality is different: thus far, only about 14% of proofs of concept (PoCs) make it to production, according to HFS Research.

This brings up an important question: do you really need an enterprise blockchain for your application?

At QuestDB, after spending our careers working on both blockchain and database technologies, we think that for many applications what you really need is a highly performant time-series database instead. But before we argue our case, let’s start with a quick reminder of what blockchains are all about, and what differentiates public blockchains — such as Bitcoin — from private, closed, enterprise ones.

First things first: a blockchain is a database that stores information over time. The “Bitcoin blockchain” was widely touted as a revolutionary technology because it introduced a fully decentralized consensus mechanism to bypass intermediaries. For example, a financial transaction can be validated between participants without recourse to an independent third party. It is the network, which anybody can join, that decides whether the transaction is valid. Once validated, the transaction is added to a new block of transactions, and each new block is then added to the blockchain. This elegantly solves the “double-spending” problem, in which the same digital currency could be spent more than once. Such a network is called permissionless because anybody can participate and start validating transactions. Public, permissionless blockchains offer new perspectives in terms of decentralization.
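To make the "blocks added to a chain" idea concrete, here is a minimal Python sketch of hash-linked blocks. It is an illustration only: the function names (`block_hash`, `add_block`, `is_valid`) and the block layout are assumptions made for this example, and it leaves out consensus, signatures and mining entirely.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's full contents (hypothetical layout, serialized as sorted JSON)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain, transactions, timestamp):
    """Append a new block that records the hash of the previous block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {
        "index": len(chain),
        "timestamp": timestamp,
        "transactions": transactions,
        "prev_hash": prev_hash,
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Check that every block still points at the actual hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
add_block(chain, [{"from": "alice", "to": "bob", "amount": 5}], timestamp=1)
add_block(chain, [{"from": "bob", "to": "carol", "amount": 2}], timestamp=2)
```

Because each block stores the hash of the one before it, editing an old transaction changes that block's hash and `is_valid(chain)` stops returning `True`. That linkage is what makes history hard to rewrite; deciding who is allowed to append is a separate problem, handled by the consensus mechanism.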

A peer-to-peer framework open to everyone may be revolutionary for digital currencies, but it is less suited to enterprise needs, which may include keeping sensitive records out of public view and allowing only a few trusted users to approve transactions. Enter the world of private blockchains, which are permissioned, meaning that only a selected number of users can approve transactions, and which also offer a specific set of rules around the visibility of transactions, governance, and more.

Decentralization in the form of a blockchain comes with significant trade-offs for enterprises. The primary drawback of blockchains is their lack of scalability. The Ethereum blockchain, which underpins some of the enterprise blockchains, can only handle about 15 transactions per second on average. This is a far cry from enterprise requirements: a single database could be ingesting several million data points per second. The second challenge is that blockchains rely on resource-intensive consensus mechanisms, leading to more processing overhead and higher energy consumption.
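A quick back-of-the-envelope calculation shows the size of that gap. The sketch below uses the 15 transactions per second quoted above and, as an assumption for illustration, a round figure of one million data points per second for the database; the helper name `seconds_to_ingest` is made up for this example.

```python
def seconds_to_ingest(n_records, records_per_second):
    """Time needed to write n_records at a constant throughput."""
    return n_records / records_per_second


N_RECORDS = 1_000_000  # illustrative workload size

# Figure quoted in the article for Ethereum.
chain_hours = seconds_to_ingest(N_RECORDS, 15) / 3600

# Assumed round number standing in for "several million points per second".
db_seconds = seconds_to_ingest(N_RECORDS, 1_000_000)

print(f"blockchain: {chain_hours:.1f} hours, database: {db_seconds:.1f} seconds")
```

Under these assumptions, the same million records take roughly 18.5 hours at 15 per second and about one second at a million per second.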

It turns out that a specialized type of database, known as time-series (“TSDB”), shares many properties with enterprise blockchains.

Let’s compare properties between private blockchains and TSDBs:

  • Time as the primary axis: Blocks are added to the blockchain at regular time intervals. For each block of data, there is an associated timestamp. Time-series databases are optimized to efficiently ingest and retrieve data points associated with timestamps. Think stock prices changing every microsecond.
  • Immutability: Once a block is added to the blockchain, it cannot be changed. In the database world, this is akin to allowing “INSERT” without the ability to “DELETE” or “UPDATE.” Instead of updating a record, one just appends a new one, which de facto becomes the most up-to-date reading. Time-series databases are in general append-only and share those characteristics.
  • Long 256 format: This is the format of crypto public addresses. At QuestDB we have built a data type that is better than a string to efficiently write and read Long 256 blockchain addresses.
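The first two properties in the list above can be sketched in a few lines of Python. This is a toy in-memory model, not how QuestDB or any particular TSDB is implemented; the class and method names (`AppendOnlySeries`, `append`, `latest`, `replay`) are hypothetical.

```python
import bisect
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Reading:
    """One immutable data point: a timestamp, a series key and a value."""
    timestamp: int  # e.g. microseconds since the epoch
    key: str
    value: Any


class AppendOnlySeries:
    """Toy store: rows can be appended, never updated or deleted."""

    def __init__(self):
        self._rows = []  # kept sorted by (timestamp, arrival order)
        self._seq = 0

    def append(self, timestamp, key, value):
        # The sequence number breaks ties between equal timestamps.
        self._seq += 1
        row = (timestamp, self._seq, Reading(timestamp, key, value))
        bisect.insort(self._rows, row)

    def latest(self, key):
        """The most recent reading for a key stands in for an UPDATE."""
        for _, _, reading in reversed(self._rows):
            if reading.key == key:
                return reading
        return None

    def replay(self):
        """Full history of all readings, ordered by time."""
        return [reading for _, _, reading in self._rows]


balances = AppendOnlySeries()
balances.append(1000, "acct-1", 50)
balances.append(3000, "acct-1", 70)  # supersedes the earlier reading
balances.append(2000, "acct-2", 10)  # out-of-order arrival is still sorted
```

Here `balances.latest("acct-1")` returns the reading written at timestamp 3000, while `balances.replay()` still yields all three rows in time order, so nothing is lost when a value changes.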

Consequently, one can use a TSDB to replay the full history of all individual transactions ordered by time; this is how blockchain nodes work. Other similarities include:

  • Data replication: Each node in the blockchain holds the entire history of transactions. If one node is compromised, we rely on the others to provide the full history. Again, this concept has been in effect for decades with traditional databases: if one database fails, we may want another as a backup.
  • Consensus: Blockchains require multiple parties (i.e., nodes) to agree on a specific transaction. Traditional distributed databases use consensus algorithms such as Raft and Paxos, which act as a kind of voting mechanism.
  • Sharding: Instead of having all nodes compute all operations to validate transactions and execute smart contracts, nodes are assigned to process only certain computations. Database sharding breaks up large databases into smaller chunks (known as shards) to facilitate horizontal scaling across multiple servers.
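Two of these ideas reduce to very small building blocks, sketched below in Python. The function names `shard_for` and `has_quorum` are hypothetical, and the vote count is only the majority rule that protocols such as Raft and Paxos build on, not an implementation of either.

```python
import hashlib


def shard_for(key, n_shards):
    """Map a key to a shard with a stable hash, so every server agrees on placement."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


def has_quorum(votes):
    """True when a strict majority of nodes voted to accept."""
    yes_votes = sum(1 for vote in votes if vote)
    return yes_votes > len(votes) // 2


# Route three hypothetical account keys across four shards.
placement = {key: shard_for(key, 4) for key in ("acct-1", "acct-2", "acct-3")}

# Three replicas vote on a write; two acceptances are enough.
accepted = has_quorum([True, True, False])
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would send the same key to different shards on different servers.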

Other aspects of blockchain functionality can be handled in the application. For example, a private ledger allows data to be shared and seen by different parties who do not need to trust each other because of the software’s rules; this could be a bank and a regulator sharing access to customer trade data. I would argue that such data access could be handled by a business logic layer sitting on top of the database layer, which provides data to outside parties.
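As a rough sketch of what such a business logic layer might look like, the Python below filters trade records per party before they leave the application. The party names, field names and the `trades_for` function are all assumptions for illustration; a real deployment would sit behind proper authentication.

```python
# Hypothetical visibility rules: which trade fields each party may read.
VISIBLE_FIELDS = {
    "bank": {"trade_id", "timestamp", "customer", "symbol", "quantity", "price"},
    "regulator": {"trade_id", "timestamp", "symbol", "quantity", "price"},
}


def trades_for(party, trades):
    """Return the trades with only the fields this party is allowed to see."""
    try:
        allowed = VISIBLE_FIELDS[party]
    except KeyError:
        raise PermissionError(f"unknown party: {party!r}") from None
    return [
        {name: value for name, value in trade.items() if name in allowed}
        for trade in trades
    ]


# Rows as they might come back from the database layer.
trades = [
    {"trade_id": 1, "timestamp": 1000, "customer": "C-42",
     "symbol": "ABC", "quantity": 10, "price": 99.5},
]

regulator_view = trades_for("regulator", trades)
bank_view = trades_for("bank", trades)
```

In this sketch the regulator sees every trade but not the customer identifier, the bank sees everything, and any other caller gets a `PermissionError`. The rules live in one place in the application rather than in a ledger protocol.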

For censorship-resistant use cases such as digital currencies that governments cannot influence, there are public permissionless blockchains such as Bitcoin. Most enterprise applications do not need decentralization in the first place and are best served by a centralized database with a single point of truth. If time is your primary axis, time-series databases are your best bet.

TNS Analyst Lawrence Hecht contributed to this post.
