How Blockchain Startups Think about Databases and dApp Efficiency
For applications to gain traction in today’s market, they need two fundamental attributes: low latency and high security standards. The database contributes significantly to both, which makes choosing one an important decision in early-stage application architectures. There has recently been a significant increase in the number of applications built on blockchains like Ethereum as the underlying storage mechanism. While these applications use the blockchain to store transactions on the public ledger, they need to leverage traditional off-chain database solutions to achieve low latency and high security standards.
As teams tackle the underlying application architecture, they also face familiar early-stage challenges: identifying product-market fit, working with smaller teams than their enterprise counterparts, and ensuring they’re building with tools that are dev-friendly and have their users in mind.
When applications are built on top of a blockchain, these applications are inherently decentralized — hence referred to as dApps (decentralized applications). Most dApps today leverage a Layer 1 (L1) blockchain technology like Ethereum as their primary form of storage for transactions.
There are two primary ways that dApps interact with the underlying blockchain: reads and writes. As an example, consider an NFT and gaming dApp in which gamers win coins that they can then use to purchase NFTs. Writes are performed to an L1 chain whenever a gamer wins and coins are added to their wallet; reads are performed when a gamer logs into the game and the dApp needs to pull the NFT metadata associated with their game character (think stats, ranking, etc.). For an early-stage dApp building the game described above, writing directly to Ethereum is prohibitive because of slow performance (which hurts latency) and high cost (which is hard for an early-stage company to absorb).
To help developers in the dApp ecosystem, sidechains and Layer 2 (L2) solutions like Polygon improve performance and reduce the high cost associated with writing directly to Ethereum. They do so by storing transactions in a sidechain and, every so often, submitting an aggregation of those transactions to the primary chain. While Polygon helps with performant and cost-effective writes, solutions like The Graph have made it simple for developers to query data from L1 or L2 chains using GraphQL within an application.
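The batching idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how Polygon or any specific sidechain works: the `SidechainBatcher` class, its `batch_size` parameter, and the use of a single SHA-256 digest as the "aggregation" are all hypothetical, and the `submitted` list merely stands in for transactions sent to the primary chain.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class SidechainBatcher:
    """Hypothetical batcher: buffers writes and emits one digest per batch."""

    batch_size: int = 3
    pending: list = field(default_factory=list)
    submitted: list = field(default_factory=list)  # stand-in for L1 submissions

    def record(self, tx: dict) -> None:
        # Cheap, fast write to the sidechain-side buffer.
        self.pending.append(tx)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        # Canonical JSON so the same batch always hashes to the same digest.
        payload = json.dumps(self.pending, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(payload).hexdigest()
        # A real system would send this digest in a single L1 transaction.
        self.submitted.append({"count": len(self.pending), "digest": digest})
        self.pending = []


batcher = SidechainBatcher(batch_size=3)
for i in range(7):
    batcher.record({"player": f"player-{i % 2}", "coins": 10})
batcher.flush()  # submit whatever is left over
```

With seven game wins and a batch size of three, only three submissions reach the primary chain instead of seven, which is where the cost savings come from.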
While blockchains were created for the storage of transactions, and L2 chains and index accelerators have made working directly with on-chain data simpler, some data within a dApp can still benefit from a centralized database. This is referred to as off-chain data.
Off-Chain Data Requirements for dApps
When building an NFT and gaming dApp like the one described above, there are two main data requirements: reading and writing data shouldn’t significantly impact application latency, and personal user information must be kept private. The latter is especially relevant since one of the main tenets of blockchain is to keep users anonymous. Off-chain databases help on both counts by storing game metadata close to users and keeping user information private. To take full advantage of the security and performance benefits of an off-chain database, dApps have to ensure that the database is interoperable with their on-chain data.
dApps must maintain a high bar for security across their data storage systems to avoid exposing private data and to maintain anonymity for their users. When choosing an off-chain database for user information and metadata, dApp teams should look for security capabilities, such as attribute-based access control (ABAC), so they can control exactly who can access sensitive user data. This ensures that private data is not exposed on a public ledger, while still allowing individuals to log in to dApps with individual profile information.
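To make ABAC concrete, here is a minimal sketch of an attribute-based check in Python. The `Request` type, the attribute names (`user_id`, `owner_id`, `role`, `sensitive`), and the policy itself are hypothetical and chosen only for illustration; a real off-chain database would express such rules in its own policy language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    subject: dict   # attributes of the caller, e.g. user_id, role
    resource: dict  # attributes of the record, e.g. owner_id, sensitive
    action: str     # "read" or "write"


def is_allowed(req: Request) -> bool:
    """Hypothetical policy: owners may read and write their own records;
    support staff may only read records that are not marked sensitive."""
    owner = req.resource.get("owner_id")
    if owner is not None and req.subject.get("user_id") == owner:
        return req.action in {"read", "write"}
    if req.subject.get("role") == "support":
        # Treat records as sensitive unless explicitly marked otherwise.
        return req.action == "read" and not req.resource.get("sensitive", True)
    return False


profile = {"owner_id": "u1", "sensitive": True}
leaderboard_entry = {"owner_id": "u1", "sensitive": False}

owner_can_write = is_allowed(Request({"user_id": "u1"}, profile, "write"))
stranger_can_read = is_allowed(Request({"user_id": "u2"}, profile, "read"))
support_reads_profile = is_allowed(Request({"role": "support"}, profile, "read"))
support_reads_board = is_allowed(Request({"role": "support"}, leaderboard_entry, "read"))
```

The decision depends on attributes of both the caller and the record rather than on a fixed role list, which is what lets a dApp keep a player's profile private while still exposing non-sensitive game data.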
Interoperability with On-Chain Data
When choosing an off-chain database, it’s critical that it fits into a dApp’s architecture and allows easy communication between off-chain and on-chain data. At a minimum, off-chain databases need simple APIs that dApps can use when making data queries. For the simplest interoperability, a GraphQL interface lets dApp developers work with an API query language that they’re familiar with and that is used by other data solutions within the blockchain ecosystem (e.g., an index accelerator like The Graph).
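As a rough illustration of why a shared query language helps, the sketch below builds a GraphQL request body in Python. The JSON shape with `query` and `variables` keys follows the common GraphQL-over-HTTP convention; the `player` field, its `wallet` argument, the selected fields, and the placeholder wallet value are a hypothetical schema, not one taken from any real database or subgraph.

```python
import json

# Hypothetical schema: a player looked up by wallet address.
PLAYER_QUERY = """
query PlayerProfile($wallet: String!) {
  player(wallet: $wallet) {
    rank
    stats { wins losses }
  }
}
"""


def build_graphql_request(query: str, variables: dict) -> bytes:
    """Serialize a GraphQL POST body as JSON bytes."""
    return json.dumps({"query": query, "variables": variables}).encode("utf-8")


# "0xabc" is a placeholder wallet address used only for this example.
body = build_graphql_request(PLAYER_QUERY, {"wallet": "0xabc"})
decoded = json.loads(body)
```

The same helper could build requests for an off-chain database endpoint and for an on-chain indexer, so the application code that talks to both data sources stays nearly identical.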
It’s critical for applications to maintain low latency so they can deliver a good user experience. To review, a dApp has two choices for its data architecture: store all data on-chain, or store some data on-chain and some off-chain. While it’s important to continue to transact directly on-chain, it’s not reasonable to also store application metadata, such as rankings or user statistics in a gaming dApp, on-chain, because doing so significantly increases latency. Off-chain databases can not only store this data securely, but they can also store it in a globally distributed manner to keep the data as close to the user as possible.
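The split described above can be sketched as a simple routing rule. Everything here is an assumption made for illustration: the data kinds, the region names, and the latency figures are hypothetical, and a real globally distributed database would typically handle replica selection itself.

```python
# Hypothetical classification of the gaming dApp's data.
ON_CHAIN_KINDS = {"coin_transfer", "nft_purchase"}
OFF_CHAIN_KINDS = {"ranking", "player_stats", "user_profile"}


def storage_target(kind: str) -> str:
    """Decide where a piece of data should live."""
    if kind in ON_CHAIN_KINDS:
        return "on-chain"
    if kind in OFF_CHAIN_KINDS:
        return "off-chain"
    raise ValueError(f"unknown data kind: {kind}")


def nearest_region(latencies_ms: dict) -> str:
    """Pick the replica region with the lowest measured latency."""
    return min(latencies_ms, key=latencies_ms.get)


# Hypothetical round-trip times measured from one player's client.
measured = {"us-east": 82.0, "eu-west": 19.5, "ap-south": 140.0}

target = storage_target("ranking")
region = nearest_region(measured)
```

Transactions that need the public ledger still go on-chain, while frequently read metadata is served from whichever off-chain replica is closest to the player.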
As the blockchain ecosystem continues to grow, different architecture choices will emerge for effectively building and scaling dApps. In the database layer of any application, it’s important to focus on performance and security. Today, dApps can choose from options that range from purely decentralized solutions to a mix of centralized and decentralized ones. The right choice depends on how important performance is to the dApp and how much sensitive user data it will collect. The sooner startups are able to identify and make these decisions, the easier it will be to onboard users and scale a growing application.