InfluxDB Moves to Cloud-Native Architecture

22 Jun 2018 4:00pm, by

InfluxData, a company that offers the InfluxDB time-series database, has focused on bringing its TICK Stack (Telegraf, InfluxDB, Chronograf, and Kapacitor) to a cloud-native architecture, one that runs on ephemeral containers.

The change was necessitated by customers using TICK as a service, said Paul Dix, InfluxData chief technology officer and co-founder. This gave rise to the need for multi-tenant systems in which one user’s workloads are isolated from another’s. The ability to set utilization limits, including memory utilization limits, is what allows workload isolation across multiple tenants.

“A containerized architecture fits that model really, really well,” Dix said in a phone interview.

InfluxDB’s version 1.x looks like a monolithic system, which is how pretty much all databases are implemented, he said. But the design team designed 2.0 from the ground up to be containerized, using Kubernetes for the infrastructure.

The stack runs on Amazon Web Services, said Dix. Kubernetes is the base layer, etcd stores the metadata, and Kafka is used to write the data. Everything else runs as separate microservices within Kubernetes. All the features are going to be open source, with the exception of the code for high availability and for scaling out clusters, which will be available commercially.

What’s a TICK?

InfluxDB was created in 2012 as a SaaS application for real-time metrics and monitoring. But users were more interested in the infrastructure, so the company open sourced the codebase and built the other components into the stack over time.

The database system is one component of the TICK Stack, a platform for working with time-series data that is designed to solve some common data problems, said Dix: collect it, store and query it, process and monitor it, and visualize it.

The components are:

T = Telegraf, a time-series data collector agent for collecting and reporting metrics and data. It’s a binary program that runs across all servers.

I = InfluxDB, the time-series database piece.

C = Chronograf, the visualization piece for monitoring and dashboarding. It has pieces for working with other components of the stack, including monitoring and alerting rules.

K = Kapacitor, the real-time streaming data processing engine. It offers basic ETL, anomaly detection, monitoring and alerting. It works for both stream and batch mode monitoring and sends alerts to about 20 different systems, including Slack and PagerDuty.

De-Coupling is Key

Decoupling the components in the architecture is key, Dix told TNS in a podcast last year. The design team created different services for different pieces of the stack and decoupled the write pipeline from the indexing pipeline, the query-processing pipeline, and the monitoring pipeline, allowing each one to scale independently. User interfaces require high velocity; bug searches, not so much.

The wide variety of uses for InfluxDB comes with a corresponding variety of workload requirements. Query processing is one example: a lot of users have real-time dashboards up while they’re at work, but what happens at the end of the day? It’s wasteful to keep servers up through the downtime. “Decoupling the processing tier from the data storage tier means you can have cheap data storage and very flexible ephemeral processing,” he explained.

Two Kinds of Data

There are two kinds of time-series data, Dix explained. Regular data arrives at fixed intervals, like sensor data, server monitoring data, or CPU readings. Irregular data is event-driven, for example individual requests to an API or trades in a stock market. “We want our stack to work well for both regular and irregular data,” said Dix.
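The distinction between the two kinds of data can be made concrete with a small Python sketch. This is only an illustration, not anything from InfluxDB itself: the function name is_regular, the sample values, and the tolerance parameter are all assumptions made for the example. It treats a series as regular when the gaps between consecutive timestamps are all equal.

```python
from datetime import datetime, timedelta


def is_regular(timestamps, tolerance=timedelta(0)):
    """Return True if consecutive timestamps are evenly spaced.

    Hypothetical helper for illustration; `tolerance` allows for small jitter.
    """
    if len(timestamps) < 3:
        # Fewer than two gaps: nothing to compare, treat as regular.
        return True
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return all(abs(gap - gaps[0]) <= tolerance for gap in gaps)


start = datetime(2018, 6, 22, 16, 0, 0)

# Regular data: a made-up CPU reading sampled every 10 seconds.
cpu_samples = [start + timedelta(seconds=10 * i) for i in range(6)]

# Irregular data: made-up API requests arriving whenever a client calls.
api_requests = [start + timedelta(seconds=s) for s in (0.4, 1.1, 7.9, 8.0, 31.5)]

print(is_regular(cpu_samples))
print(is_regular(api_requests))
```

With these sample values, the polled CPU readings count as regular and the event-driven API requests do not.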

There’s also another dimension to consider: hot and cold data. Hot data is data that’s easily accessible for queries in memory and on fast but expensive solid state drives. Cold data, on the other hand, is less likely to be accessed, and so can be stored on less expensive media, such as the Amazon Web Services S3 object store.

The InfluxDB API seamlessly pulls data from both hot and cold storage. If you’re pulling data from cold storage, it may take a little longer, Dix said, but you don’t have to write extra code to get to it.
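The idea of one query interface spanning both tiers can be sketched in a few lines of Python. This is a toy model under stated assumptions, not InfluxDB’s implementation: the TieredStore class, its method names, and the use of plain dictionaries to stand in for fast and cheap storage are all hypothetical.

```python
class TieredStore:
    """Toy two-tier time-series store (hypothetical, for illustration only)."""

    def __init__(self):
        self.hot = {}   # stands in for memory or fast SSD storage
        self.cold = {}  # stands in for cheaper storage such as an object store

    def write(self, ts, value):
        # New points always land in the hot tier.
        self.hot[ts] = value

    def age_out(self, cutoff):
        """Move every point with a timestamp older than `cutoff` to the cold tier."""
        for ts in [t for t in self.hot if t < cutoff]:
            self.cold[ts] = self.hot.pop(ts)

    def query(self, start, end):
        """Return (timestamp, value) pairs in [start, end), whichever tier holds them."""
        points = []
        for tier in (self.hot, self.cold):
            points.extend((ts, v) for ts, v in tier.items() if start <= ts < end)
        return sorted(points)


store = TieredStore()
for ts in range(10):
    store.write(ts, ts * 1.5)

store.age_out(cutoff=5)       # timestamps 0..4 move to the cold tier
result = store.query(3, 8)    # the range straddles both tiers
print(result)
```

The caller issues a single query for the range 3 to 8 and gets back points from both tiers in time order, without needing to know which tier each point came from.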

Flux: New Query Language

The traditional database platform is not the only thing that InfluxData has been rethinking. Dix has spent the last year writing Flux, a new query language to work with InfluxDB. “I don’t believe that the best possible language for working with data was invented in the 70s and [that] there’s nothing better that can be produced beyond SQL,” said Dix.

It takes time to learn a new language, he admitted, but the tradeoff is developer productivity. He specifically wrote Flux to manage time-series data. SQL is notorious for how much code is necessary to return the simplest time-driven data.

“There are things we can do in the language that actually make it more elegant and easy to work with than it would [be] if you are trying to write a bunch of SQL queries,” Dix said.

The structure of Flux makes it easy for developers to add new functions to the language, he said.

He wants to see more query workloads done in the database itself. For example, it’s very common for a data scientist to query data from the database, pull it back to a local machine, do some work on it, then load the data back. Having an actual scripting language is a step in simplifying that process.

InfluxData is a sponsor of The New Stack.

Feature image by Colin Carter on Unsplash.
