Startup pgEdge Tackles the Distributed Edge with Postgres
Denis Lussier and Phillip Merrick, who worked together creating EnterpriseDB starting in 2005, are debuting a new project called pgEdge to address some of the toughest database challenges organizations face, such as data latency, ultra-high availability and data residency restrictions.
With EnterpriseDB, which they call Door No. 1, they sought to add an Oracle compatibility layer on top of open source PostgreSQL. They call their latest venture Door No. 2, bringing a distributed Postgres database closer to the edge.
What Is the Edge?
Companies like Cloudflare, Akamai and Fastly have been handling compute closer to the edge, but Glauber Costa, founder and CEO of ChiselStrike, in a post called “What the Heck is Edge?” points out that bringing compute to the edge only solves half the problem if you’re making calls to a centralized database somewhere far away.
While there are many facets to the edge, including IoT devices, 5G and smart devices, edge development platforms like Cloudflare Workers, Fastly, Netlify or Vercel are focused on making more traditional web applications and enterprise applications faster by putting more of the application components closer to the edge. So pgEdge is bringing the database closer as well.
“For the web application developer who’s trying to shave milliseconds off the performance of the frontend of their application, if you’ve got a round trip to the database, which is probably someplace like Amazon us-east-1 that’s going to take you 100 milliseconds each time you need to access the database, then whatever work you’re doing on the frontend to optimize things is going to get swamped by that data latency problem,” Merrick explained.
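To put rough numbers on that point, here is a back-of-the-envelope sketch. Only the 100-millisecond round trip comes from Merrick's quote; the number of queries per page and the 10-millisecond round trip to a nearby node are assumptions chosen for illustration, not measurements.

```python
# Illustrative arithmetic only. The 100 ms figure is from the quote above;
# the query count and the nearby-node round trip are assumed values.

def total_db_latency_ms(round_trips: int, round_trip_ms: float) -> float:
    """Time spent waiting on sequential database round trips."""
    return round_trips * round_trip_ms

QUERIES_PER_PAGE = 5      # assumed: sequential queries needed to render a page
CENTRAL_RTT_MS = 100.0    # from the quote: round trip to a distant region
NEARBY_RTT_MS = 10.0      # assumed: round trip to a node near the user

central = total_db_latency_ms(QUERIES_PER_PAGE, CENTRAL_RTT_MS)
nearby = total_db_latency_ms(QUERIES_PER_PAGE, NEARBY_RTT_MS)
print(f"central: {central:.0f} ms, nearby: {nearby:.0f} ms")
```

Under these assumed numbers, the database wait alone drops from half a second to a twentieth of a second, which is the kind of gap that frontend tuning cannot make up.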
The solution at pgEdge is reducing that distance with a distributed Postgres database.
“You’re not going to place 400 Postgres database nodes around the network. That would be a little excessive, and also unnecessary. So instead, you’re going to strategically place a handful of pgEdge nodes around the network, and each one of those nodes can take read or write traffic. And if you’re going to be using them in conjunction with edge platforms, like say Cloudflare Workers or Fastly, which is what we’ve designed this for, you’re going to be able to get really great response times out of the database layer,” Merrick said.
Pure Postgres and Its Standard Extension
Having learned from their experience with EnterpriseDB, they wanted to start with a strong foundation layer. More than 35 years old, Postgres has been undergoing a resurgence lately.
“We haven’t rewritten the database engine underneath. It is Postgres. It’s not compatible with Postgres. It is Postgres, and that’s an important distinction,” Merrick said.
It also uses the Postgres standard extension mechanism. Its extension is called Spock, a reference to the logical Star Trek character. It’s a derivative of pglogical that provides bidirectional multi-active (also known as multi-master) logical replication.
Nodes in a pgEdge cluster can span multiple cloud regions and data centers for high-level resiliency, availability and performance.
What that means for developers, especially those using the JAMStack architecture, according to Merrick:
“Yes, you can measure the page load times, but it’s also a less costly and easier route to getting very highly available database applications. Because if you’ve got multiple read/write nodes around the network, you can lose some number of them, depending on how many you have, and everything will just keep going independently. The traffic can fail over to the surviving nodes. So it’s a low-latency, ultra-high availability, distributed database.”
You can see a demo here.
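The failover idea Merrick describes can be pictured with a small sketch. This is not pgEdge's routing mechanism, which the article does not detail; the node names, latencies and the pick_node helper are all hypothetical. It simply picks the lowest-latency node that is still healthy, so traffic moves to a surviving node when the nearest one is lost.

```python
from typing import Dict, Optional, Set

# Hypothetical node names and latencies (ms) as seen from one edge location.
NODE_LATENCY_MS: Dict[str, float] = {
    "us-east": 12.0,
    "eu-west": 85.0,
    "ap-south": 190.0,
}

def pick_node(latency_ms: Dict[str, float], healthy: Set[str]) -> Optional[str]:
    """Return the lowest-latency healthy node, or None if none survive."""
    candidates = [name for name in latency_ms if name in healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda name: latency_ms[name])

# All nodes up: the nearest node takes the traffic.
print(pick_node(NODE_LATENCY_MS, {"us-east", "eu-west", "ap-south"}))
# Nearest node lost: traffic fails over to the next-closest survivor.
print(pick_node(NODE_LATENCY_MS, {"eu-west", "ap-south"}))
```

Because every node accepts reads and writes, any surviving node is a valid target in this model; the only cost of failover is the extra distance.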
Each node runs standard PostgreSQL (v15) and is kept updated via asynchronous logical replication with configurable conflict resolution and avoidance. It offers delta conflict avoidance to maintain consistency for data fields that keep running sums like financial balances or inventory counts.
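Why running sums need special handling is easier to see with a toy example. The sketch below is an assumption about the general technique, not pgEdge's or Spock's actual code: it compares a last-write-wins rule with applying each node's change as a delta. The Update class and both functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Update:
    old_value: int  # value the node read before making its change
    new_value: int  # value the node wrote

def last_write_wins(updates: List[Update]) -> int:
    """Keep only the final write; earlier concurrent changes are lost."""
    return updates[-1].new_value

def apply_deltas(start: int, updates: List[Update]) -> int:
    """Apply each node's change (new - old) to the starting value."""
    return start + sum(u.new_value - u.old_value for u in updates)

start = 100                                   # inventory count on every node
node_a = Update(old_value=100, new_value=90)  # node A records a sale of 10
node_b = Update(old_value=100, new_value=95)  # node B records a sale of 5

print(last_write_wins([node_a, node_b]))      # node A's sale disappears
print(apply_deltas(start, [node_a, node_b]))  # both sales are counted
```

With last-write-wins the count ends at 95 and one sale is silently dropped. Applying deltas yields 85, which reflects both sales regardless of the order in which the changes arrive.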
It also offers geo-sharding to help users comply with regulations like GDPR that require certain data to remain within particular geographic boundaries. It breaks out tables on a location field so some data can be kept locally while the rest is shared globally.
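As a rough illustration of splitting on a location field, the sketch below decides which nodes should hold a given row. The node names, region codes and the target_nodes helper are hypothetical, and the rule (restricted regions stay local, everything else replicates everywhere) is an assumed policy rather than pgEdge's documented behavior.

```python
from typing import Dict, List, Set

# Hypothetical cluster: node name -> region where the node is hosted.
NODE_REGION: Dict[str, str] = {
    "node-frankfurt": "eu",
    "node-virginia": "us",
    "node-mumbai": "in",
}

# Assumed policy: rows from these regions must not leave the region.
RESTRICTED_REGIONS: Set[str] = {"eu"}

def target_nodes(row_region: str) -> List[str]:
    """Return the nodes that should store a row with this location field."""
    if row_region in RESTRICTED_REGIONS:
        # Keep the row only on nodes inside its own region.
        return sorted(n for n, r in NODE_REGION.items() if r == row_region)
    # Unrestricted rows are shared with every node in the cluster.
    return sorted(NODE_REGION)

print(target_nodes("eu"))  # stays on the Frankfurt node only
print(target_nodes("us"))  # replicated to all three nodes
```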
Existing Postgres Applications, Few if Any Changes
The company emerged from stealth earlier this month and announced a $9 million seed funding round. The technology is still in beta, with a handful of customers nearing production with it. The company is looking at Q3 for general availability.
“For leading e-commerce companies like ours, pgEdge is a game changer and will give us fast page loads and a smooth customer experience regardless of where our customer is located,” said David Ting, CTO of Zenni Optical, adding that pgEdge can also “give our engineers a solid foundation for building a new generation of e-commerce applications by allowing the developers to develop on native Postgres yet have the advantages of a modern distributed data store.”
Postgres users should be able to use their existing Postgres applications with few to no changes, Merrick said.
“… you get to use all the tooling that works with Postgres, and many of the extensions that work with Postgres. So, [developers can] use it in exactly the fashion that they’re using Postgres today, whatever way their applications access it, whatever tools they like to use, for managing schemas and things like that. That’ll just work because it is standard Postgres.”
Bill Mitchell, CTO of PublicRelay, a SaaS media analytics company, said, “We have been using pgEdge for a few months now, and I am very impressed with its performance and stability. Having their multi-master replication feature has allowed me to easily manage and scale my PostgreSQL databases without any issues. The product is easy to manage and that has saved me a lot of time and effort.”
All Open Source
Others are also trying to move the database closer to the edge. Costa’s company ChiselStrike, for instance, is attacking the latency problem with Turso, a SQLite-compatible embedded database. EnterpriseDB also has a product similar to pgEdge called EDB Postgres Distributed (PGD).
Rivals like Cockroach, Yugabyte and others, in approaching the distributed edge problem, made the design decision to use a different database, then put a compatibility layer on top, according to Merrick.
“And we know from our work with Oracle compatibility, that will only get you 90% of the way there, and that last 10% is a bit of a killer,” he said.
A second differentiator is that the source code for pgEdge is all open source and available on GitHub.
The company offers its entire PostgreSQL distribution as a self-hosted and self-managed version called pgEdge Platform that’s free and available for download. It also offers a managed database as a service called pgEdge Cloud on AWS and Azure. You can join the private beta here.