Q&A: How EnterpriseDB Brings Five Nines to PostgreSQL
Marc Linster is the chief technology officer of EnterpriseDB (EDB), which offers a high-performance supported version of the open source PostgreSQL (or Postgres) relational database system.
He recently sat down with The New Stack to discuss all things Postgres, including a new release of the EDB platform that includes increased availability (to 99.999%) and Transparent Data Encryption (TDE).
TNS: What is EDB?
Linster: Think about EDB and Postgres the way you think about Red Hat and Linux. That’s the simplest way of saying who we are. Red Hat, SUSE, and others are companies that make Linux commercially accessible and commercially successful. “Commercially” in this context doesn’t just mean money; it also means providing everything the enterprise customer needs to put a piece of software to use at scale and responsibly.
The Postgres community is inherently independent, but since 2004 EDB has been very actively involved with Postgres and is a very significant part of the Postgres community. Of the approximately 114 companies that are actively contributing with more than one resource to the Postgres project, we are the largest provider or supporter of Postgres in terms of active contributors. But we’re not the majority, as it’s important that Postgres, just like Linux, is an independent, truly open source community project.
Can you explain the necessity of a commercially supported version?
Postgres is a really, really good database, a really great open source project, and we provide the wrapping around that: tools for high availability and migration from Oracle, for example, plus customer support, roadmap enhancements, and other functionality that customers would otherwise expect from a commercial entity or a commercially supported piece of software.
With these tools, Postgres becomes something that the enterprise can use at scale and responsibly. At scale, meaning they can use a lot of it, and it’s manageable. But it also refers to the high performance and reliability of mission-critical areas. Responsibly, because now there’s a company to call at three o’clock in the morning on the last day of the quarter and say, “Oh my God, we have a problem.” EDB is available to not only provide a responsible SLA but to also be there if something goes wrong or if you have a question or if you need services. There’s someone there you can work with.
One of those new tools was the TDE. Can you tell me a little more about that?
Our biggest business segment is what we call BFSI: banking, financial services, and insurance. In that space, encryption on disk, also known as Transparent Data Encryption or TDE, is an absolute requirement. Because of legal or compliance requirements, there are a whole lot of applications that are not accessible to open source Postgres if you can’t guarantee that the data is encrypted on disk.
So when we say our mission is to help the enterprise use Postgres at scale and responsibly, well, that’s exactly what we do, right? We’ve made those applications accessible now that we’ve added TDE.
How did your team build out this groundbreaking tech?
TDE is an extension to Postgres. We don’t go deep into Postgres to change it; rather, we’re adding something around it to make Postgres usable for applications that have personally identifiable information, PCI information, and other applications where the user has to make sure that the data on disk is encrypted.
This has been stewing for a good while. It’s similar to when a very experienced mechanic fixes your car by doing something really simple: it doesn’t take them long, but it took them 20 years of experience to know exactly what to do.
We’re fortunate to have many Postgres contributors and people who have been with the Postgres project since its inception. The actual work on it may have taken less than a year, but thinking about it and figuring out how to do it the right way took many years.
We expect that there will be other capabilities coming over time. Customers are very happy but I’m sure that when they start putting it to use, they’ll want this bell here and this whistle there. So this is our first release of TDE, not the last release, but it’s not a beta — it’s definitely the real thing.
EDB’s other recent announcement is 99.999% availability. How important is such high availability?
Five nines of availability means you can have up to six minutes of unplanned downtime a year. That’s a long time. Think about it this way: say you’re an e-commerce operator, a big one, the average gadget sold is $50, and your site does a million transactions. You can figure out what a half hour of downtime will translate to in losses. Plus, customers aren’t likely to come back if they can’t complete the transaction.
Credit card companies run payment gateways on our software. Payment gateways are the backbone, the livelihood, of credit card companies. It’s no joke; if that service is down for a few seconds we get a call. On its face, six minutes may not seem like a lot but it’s a lot. If you can’t process credit card transactions, if the credit card authorization stops for six minutes it’s an incredibly big deal.
It was never as critical years ago because very few businesses were global. E-commerce wasn’t the way things were done, right? Before the digital transformation, a lot more happened in human interaction and over the phone, but now everything is on the computer. We have customers that run our system for Single Sign On [SSO] solutions, the login for the company, and they have to run that with five nines of availability because when SSO stops, everything stops. When the computer stops, everything stops.
What was EDB’s previous availability offering and what practices were implemented to offer higher availability?
So before this, we were able to get up to four nines of availability. Four nines of availability means you can have about one hour of unplanned downtime a year. In the old days, high availability and these nines were targeted to protect against hardware failure — disk drive failures, power supply failures, network card failures… Those were the weak points. Hardware has become extremely reliable; it’s not a problem anymore. What is the problem today is maintenance — applying a security patch, applying an upgrade. Those are the things that you want to do without taking the system down.
So when we say five nines of availability, we include maintenance operations. In the past, we were talking about this to protect against hardware failure, and then still needed the maintenance window. Today, our customers come and say, “No, no, no, no, no. This thing does not go down for any reason.” That includes database maintenance and re-indexing. Maybe slightly degraded performance is acceptable but you can’t just stop and say, “Hey, everybody go home for a half hour. I got to do maintenance.” That doesn’t work.
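The downtime figures quoted for four and five nines can be checked with a little arithmetic. The following is a minimal sketch that assumes a 365-day year; the function name `downtime_budget_minutes` is purely illustrative and does not come from EDB. It works out to roughly 52.6 minutes a year for four nines and roughly 5.3 minutes for five nines, in line with the rounded figures mentioned in the conversation.

```python
# Minutes in a non-leap year: 365 days * 24 hours * 60 minutes.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(nines: int) -> float:
    """Return the yearly downtime allowed at an availability of `nines` nines.

    Hypothetical helper for illustration: four nines means 99.99% uptime,
    so the allowed unavailability is 10 ** -4 of the year.
    """
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for nines in (4, 5):
        minutes = downtime_budget_minutes(nines)
        print(f"{nines} nines: about {minutes:.1f} minutes of downtime per year")
```

Each additional nine cuts the yearly budget by a factor of ten, which is why a maintenance window that was tolerable at four nines no longer fits at five.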
What changes were made to maintenance operations to reduce downtime?
The key challenge is handling the upgrades. Let’s say you go from Postgres version 14 to version 15. Version 15 will have new features and a slightly different layout in the database. You need a technology that does high availability, which usually means replication of data, and that doesn’t care about these small differences. This is called logical replication. We used to use physical replication, which basically moves disk blocks back and forth. Physical replication faces challenges when a major version has a different disk layout, but logical replication can handle it.
Logical replication looks like this: let’s say A, B, and C are members of a cluster on version 14 about to get upgraded to version 15. First A is upgraded to version 15. B & C are still replicating amongst each other. A is on 15, B & C are on 14. Now all the traffic is rerouted to A while B is brought up to speed and then switched back into the cluster. Transactions are still going on. Now C is brought up to speed and rejoins the network.
This is an example of a major maintenance operation taking place without shutting the server down. This is what we want: it makes it possible to apply security patches, do version upgrades, and perform general maintenance without taking the service down. The trick behind that is to move from physical replication to logical replication. Physical replication got us to four nines and logical replication got us to five nines.
That’s the reality of where we’re bringing Postgres. Certain applications weren’t accessible to Postgres because we couldn’t say you can use Postgres with four nines or five nines of high availability. There’s a whole niche that the CIO had to carve out and say, “Okay, that is still solidly in the realm of my very expensive commercial database.” And we’re saying, “Yeah, we’re pushing this envelope. You can now use more Postgres.” It can get into this realm of four nines, five nines of availability.
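The A, B, and C walkthrough above can be sketched as a small simulation. This is only an illustration of the ordering of a rolling upgrade, not EDB’s implementation: the `Node` class and `rolling_upgrade` function are hypothetical names, no real Postgres or EDB API is called, and the sketch simply assumes that replication tolerates nodes on mixed major versions, as logical replication is described as doing.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A cluster member in this toy model (hypothetical, for illustration)."""
    name: str
    version: int


def rolling_upgrade(nodes, target_version):
    """Upgrade one node at a time and record who serves traffic meanwhile.

    Returns a list of (node_being_upgraded, names_of_serving_nodes) tuples.
    """
    steps = []
    for node in nodes:
        others = [n for n in nodes if n is not node]
        upgraded = [n for n in others if n.version == target_version]
        # Once any node is on the new version, route traffic to the
        # upgraded nodes; before that, the remaining old nodes serve.
        serving = upgraded or others
        if not serving:
            raise RuntimeError("no node left to serve traffic")
        steps.append((node.name, [n.name for n in serving]))
        # Assumption: the upgraded node can rejoin a mixed-version cluster.
        node.version = target_version
    return steps


if __name__ == "__main__":
    cluster = [Node("A", 14), Node("B", 14), Node("C", 14)]
    for upgrading, serving in rolling_upgrade(cluster, 15):
        print(f"upgrading {upgrading}; traffic served by {', '.join(serving)}")
```

At every step at least one node keeps serving traffic, which is the property that lets the service as a whole stay up through a major version upgrade.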
Is there a delay between new Postgres versions and EDB’s automatic updates?
We’re a significant part of the Postgres community: code reviewers, contributors, and committers. There are other people as well, but everything is public, so when Postgres releases, we’ve seen the new features coming for a long time and helped with them. All our ancillary tooling around it is ready when Postgres is ready.
What do you want to say to anyone thinking of moving to or further adopting Postgres from their commercial database?
Everyone wants to innovate quickly, and at the heart of quick innovation is the developer. If you’re not creating an environment that attracts developers, where they use the tools that they want to use, you don’t get the good talent; you get the talent that doesn’t get another job.
Postgres, in the Stack Overflow survey, is the number one most loved, most used, and most wanted in all categories. It was always highly rated in the different categories but for the first time, Postgres is the leader in every single database category. It’s also the number one recommended database by the Cloud Native Computing Foundation. This shows that Postgres is at the heart of rapid innovation.
But then companies also have the work they did over the last 20-30 years on Oracle that cost a fortune. CIOs come to us and say, “Hey, Postgres is so much cheaper, and it’s so good. I can use it now for so many things, including TDE. It has the same high availability as Oracle RAC, but it does it for less than 20% of the cost.” For the last 15 years now, we’ve been developing native compatibility with Oracle so you can take an application that was developed for Oracle and run it on EDB Postgres with very few changes and get there very quickly.
There’s nothing more expensive in the CIO’s budget on the software side than the database spend, so if they can take that number one expense line and squeeze it by 80%, imagine what they can do: take the cost out, and also get a lot more flexibility because Postgres runs on every cloud, every operating system, every piece of hardware, and not just on, let’s say, proprietary engineered hardware that costs a fortune.
When I started in this business 10 years ago, the CEO [Ed Boyajian] and I had to evangelize for Postgres. We don’t need to do that anymore. We optimize for more use of Postgres, but I have not come across a company or an organization in years that did not have Postgres. It’s everywhere. The problem is just that it’s in a small box. Our job is to show them they can use it on a larger scale, and that there’s a responsible way of doing it with a partner like EDB.