PGQ: Queuing for Long-Running Jobs in Go Written Atop Postgres
When Dataddo was maxing out the capabilities of RabbitMQ, it discovered it already had a solution hiding in plain sight: PostgreSQL.
Long-running jobs with RabbitMQ were leading to heartbeat timeouts and reconnects, without full observability into why the problems were happening. Running RabbitMQ as a hosted service on AWS meant the data integration company couldn’t configure it quite the way it wanted, but it also didn’t have the engineering capacity to manage the open source message broker in-house.
Working with some Postgres contributors on other projects, the global data integration company found that Postgres, the tried-and-true workhorse database, could handle those long-running jobs quite nicely and provide better insight into any potential problems. Thus the queuing mechanism PGQ, short for Postgres queue, was born and made open source.
Written in Go and built on top of a Postgres database, PGQ lets developers add simple but reliable message queues to their services using infrastructure they’re likely already familiar with.
“A lot of people are interested in this topic. …[They] already use Postgres in their company or on the project, and they are facing the same troubles or they are using Postgres for everything, and are satisfied with that,” said Dataddo CTO Tomáš Sedláček, adding that using RabbitMQ, Kafka or another tool simply adds another technology that developers have to learn and maintain. From a hiring standpoint, it’s easier to find engineers who just know Postgres, he said.
A Regular Postgres Table
The queue in PGQ is just a regular Postgres table, so anyone with some experience in standard SQL can inspect it, insert new rows and so on. PGQ uses a publisher-consumer model in which publishers add events to the queue and consumers process them asynchronously. With a high volume of tasks distributed across multiple workers, this also enables jobs to execute in parallel. PGQ is designed to be resilient even amid temporary failures, with mechanisms to handle errors and retries.
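The article doesn’t show PGQ’s actual schema, but the general Postgres-as-a-queue pattern it builds on can be sketched in plain SQL. The table and column names below (`jobs`, `payload`, `locked_until`, `processed_at`) are illustrative assumptions, not PGQ’s real definitions:

```sql
-- A queue is just a table: publishers INSERT, consumers claim rows.
CREATE TABLE jobs (
    id           BIGSERIAL PRIMARY KEY,
    payload      JSONB       NOT NULL,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    locked_until TIMESTAMPTZ,           -- lease held by a worker
    processed_at TIMESTAMPTZ,           -- set when the job completes
    error_detail TEXT                   -- kept for later inspection
);

-- Publisher: enqueue an event.
INSERT INTO jobs (payload) VALUES ('{"task": "send_email"}');

-- Consumer: claim one unprocessed job without blocking other workers.
-- FOR UPDATE SKIP LOCKED lets many consumers pull jobs in parallel.
UPDATE jobs
SET    locked_until = now() + interval '5 minutes'
WHERE  id = (
    SELECT id
    FROM   jobs
    WHERE  processed_at IS NULL
      AND (locked_until IS NULL OR locked_until < now())
    ORDER  BY created_at
    FOR UPDATE SKIP LOCKED
    LIMIT  1
)
RETURNING id, payload;
```

A lease-style `locked_until` column is one common way to survive temporary failures: if a worker crashes mid-job, its lease expires and another worker can pick the row up again.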
Improved visibility is one big plus, according to Sedláček.
Dataddo found observability limited with RabbitMQ: it could only see what was waiting to be processed, not what was being processed or what had already been processed.
In Postgres, everything is written to disk rather than held in memory, which eliminates the risk of losing data and means there’s a record of everything that happens, whether processing completed or not. You can easily track metrics such as queue depth, processing rate and error rate, and customize them to your needs.
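Because completed and failed jobs stay in the table until deleted, those metrics are ordinary queries. Assuming a hypothetical `jobs` table with `processed_at` and `error_detail` columns (not PGQ’s actual schema), queue depth and the last day’s outcomes might look like:

```sql
-- Queue depth: jobs still waiting to be processed.
SELECT count(*) FROM jobs WHERE processed_at IS NULL;

-- What happened yesterday: completed vs. failed jobs.
SELECT count(*) FILTER (WHERE error_detail IS NULL)     AS succeeded,
       count(*) FILTER (WHERE error_detail IS NOT NULL) AS failed
FROM   jobs
WHERE  processed_at >= now() - interval '1 day';
```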
“When using PGQ, you have nice observability about what is happening in the queue; [errors] are mitigated by default … Like what happened yesterday in the queue? … It’s already stored there until you delete it,” he said.
The company maintains that PGQ works well for companies that already use Postgres, don’t need to optimize for speed and don’t want to deal with the learning curve and maintenance of yet another technology. Because it’s writing everything to hard disk, PGQ will be a tad slower than Kafka, according to Sedláček, but not all that much.
But it’s not well suited for companies with highly advanced requirements for message routing or those dealing with extreme volumes and needing to optimize for throughput.
PGQ currently supports only Go applications, though a PHP version is in the works.
How Dataddo Uses PGQ Internally
Founded in 2018, Dataddo offers a fully managed, no-code data integration platform that provides ETL (extract, transform, load), ELT (extract, load, transform) and reverse ETL services along with more than 250 connectors to securely send data between cloud-based applications and business intelligence tools, data warehouses and data lakes.
Its customers include X (formerly Twitter), Ogilvy, Uber Eats, international financial services provider Allianz and Microsoft.
Dataddo uses PGQ internally for more than 200,000 long-running jobs daily, as well as short jobs like sending emails or saving logs, asynchronous app communication between Go, PHP and Node.js, and monitoring the performance of its own platform.