Why Time Series Is Upending the Database Market
InfluxData sponsored this podcast.
In this week’s The New Stack Makers podcast interview in advance of AWS re:Invent, The New Stack founder and publisher Alex Williams caught up with InfluxData vice president of product Tim Hall to discuss why time-series databases are gaining in popularity with developers and how they differ from other databases.
Whether you’re talking about handling the data streaming out of Internet of Things (IoT) devices or the data used to monitor complex virtualized infrastructure and distributed applications, time-series databases are increasingly being seen as the tool of choice. As Hall explains, time-series data is simply any data that comes with a timestamp, and InfluxData is building a platform specifically to handle that type of data.
“Our focus is really on developers who are building applications systems for observability and other things that deal with timestamp data. I’m just trying to make their experience as awesome as possible,” said Hall. “That means allowing them to solve their problems quickly and provide them a system of record for all of the metrics, events, log data that they need to do their work.”
Building a database for time-series data differs from building other kinds of databases in some distinct ways. For example, when handling time-series data, you need to decide how much data to keep, at what resolution and for what length of time. The system also needs to be capable of ingesting that data at a high rate, as well as turning it around to make it available to the end user as soon as possible. To that end, explains Hall, InfluxData's platform is purpose-built to handle the specific requirements of time-series data.
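To make the retention and resolution trade-off concrete, here is a minimal Python sketch. It is not InfluxData's implementation or API; the function names `downsample` and `apply_retention`, the 60-second bucket width and the sample points are all assumptions chosen for illustration. It shows the two decisions described above: rolling raw points up to a coarser resolution, and discarding points older than a chosen retention window.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Hypothetical helper: lowers the resolution of the data.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


def apply_retention(points, now, max_age_seconds):
    """Keep only points no older than max_age_seconds.

    Hypothetical helper: enforces how long data is kept.
    """
    return [(ts, v) for ts, v in points if now - ts <= max_age_seconds]


# Illustrative data: (unix-style timestamp in seconds, measured value).
raw = [(0, 1.0), (10, 3.0), (60, 5.0), (70, 7.0), (130, 9.0)]

rollup = downsample(raw, bucket_seconds=60)
recent = apply_retention(raw, now=130, max_age_seconds=60)

print(rollup)  # [(0, 2.0), (60, 6.0), (120, 9.0)]
print(recent)  # [(70, 7.0), (130, 9.0)]
```

A common pattern, under these assumptions, is to keep the raw points for a short window and the coarser rollup for much longer, which trades detail for storage.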
“Time series is a unique category of data. It’s different from relational. It’s different from key-value pairs. It’s different from that sort of semi-structured big data sort of stuff that’s running around. From our perspective, we’re disrupting all of those markets to a certain degree,” said Hall. “As people attempt to pour their time series data into any of those sources, they’re running into limitations and challenges. And some of those limitations are how quickly can those technologies ingest the data? How well is it compressed on disk to save on storage costs? And then are those engines capable of querying the data back at a reasonable rate?”
The answer to all of those questions, in Hall's view, is that other databases, from Postgres to MySQL, cannot handle time-series data at the same rate or with the same efficiency as a time-series database built from the ground up to meet those requirements.
In another key point of the discussion, Hall examines InfluxData’s open source foundation, noting that it not only offers a level of feedback and testing that they might be unable to provide without the open source community, but also a roadmap for direction and innovation. Flux, for example, is a new query language developed by InfluxData that came out of the open source community and now provides users with the ability to write a single query and bring back results from multiple sources.
As for AWS re:Invent, the InfluxData team is there to showcase its availability on AWS, which it launched in September. On AWS, InfluxData runs as a serverless time-series database, which Hall explains “means you don’t need to care about the number of cores that you’re running on the memory footprint” and offers a billing structure in terms of data, time, and storage – much like serverless itself.
Feature image by nile from Pixabay.