Kentik Is a Data Engine Modeled after Google Dremel

14 Sep 2016 11:06am, by

San Francisco-based startup Kentik takes a big data approach to network traffic intelligence in a SaaS model for companies such as Box, Dailymotion and Yelp.

It relies on what it calls its “post-Hadoop big data engine” to deliver sub-minute responses to simultaneous queries from multiple users across billions of records, providing insight into network performance and security, specifically DDoS attacks.

Its customers tend to be web and SaaS providers themselves, so Kentik provides visibility into the “digital supply chain” of dependencies they have.

“Our customers are looking for something ‘modern,’ which means online, full-resolution, open and not siloed,” said CEO Avi Freedman.

“The traditional way was to aggregate the data, then throw the data away. They don’t want that. If they have an attack, they want to see what happened with that traffic. And they want other groups [within their organization] to be able to use it.”

Its typical customer has possibly hundreds of devices, each sending hundreds to thousands of records per second. And while some have tried building their own open source systems for this (using Impala, Drill, Kudu on top of Hadoop stacks, Elastic, Spark), at scale those systems tend to perform poorly and running them can be overwhelming, he said.

Machine-to-Machine Data

The company wrote its own columnar-store database called Kentik Data Engine modeled after Google Dremel and designed to deal with machine-to-machine data.

It ingests billions of NetFlow, sFlow, IPFIX, BGP, and SNMP data records from multiple sources, then enriches the data with things like routing information, geography and threat intelligence before it goes into the system.
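As a rough illustration of what enrichment at ingest can look like, the sketch below tags a flow record with geography, routing and threat information before it is stored. It is not Kentik's implementation: the `FlowRecord` fields, the lookup tables and the `enrich` function are all hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FlowRecord:
    """A simplified flow summary; field names are illustrative only."""
    src_ip: str
    dst_ip: str
    byte_count: int
    packet_count: int
    src_country: Optional[str] = None
    src_asn: Optional[int] = None
    threat: bool = False


# Hypothetical lookup tables standing in for geo, routing and threat feeds.
GEO_PREFIXES = {
    ipaddress.ip_network("203.0.113.0/24"): "AU",
    ipaddress.ip_network("198.51.100.0/24"): "US",
}
ROUTE_PREFIXES = {
    ipaddress.ip_network("203.0.113.0/24"): 64500,
    ipaddress.ip_network("203.0.113.0/28"): 64501,
}
THREAT_IPS = {"203.0.113.7"}


def longest_match(table, ip):
    """Return the value for the most specific prefix containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]


def enrich(record):
    """Return a copy of the record with geo, routing and threat fields filled in."""
    return replace(
        record,
        src_country=longest_match(GEO_PREFIXES, record.src_ip),
        src_asn=longest_match(ROUTE_PREFIXES, record.src_ip),
        threat=record.src_ip in THREAT_IPS,
    )


enriched = enrich(FlowRecord("203.0.113.7", "192.0.2.1", 1500, 1))
print(enriched)
```

Doing the lookups once at ingest means later queries can filter or group on fields such as country or origin network without joining against external tables.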

It’s available for ad-hoc analyses, alerting and dashboarding within seconds, and the full data set is kept for 90 days. Data can be encrypted in transit and each customer’s instance is isolated from all others.

It uses Postgres as the API and has its own query and storage layers.

Users can query across 60 traffic fields and set up their own features to slice and dice the data as they choose. They can write custom alerts, and visualization and filtering tools help dig deeper into the origin of attacks.

The service also can be deployed on-premise using agents to send server infrastructure data.

Staying Focused

Freedman describes the company’s leadership team as “people who come from the network side, but have experience running infrastructure at scale.”

Co-founders Principal Engineer Ian Pye and Chief Architect Ian Applegate came from content delivery network and security provider CloudFlare; co-founder and Sales Vice President Justin Biegel from internet infrastructure provider Internap. Freedman spent more than a decade at CDN and cloud services provider Akamai.

The startup has raised $38.2 million, most recently a $23 million Series B announced in August.

As a 2 ½-year-old startup, the company is trying to stay tightly focused, Freedman said.

“Our focus is on inside the enterprise out to the end user or business partner. Rather than doing synthetic analytics like Keynote, Thousand Eyes and others that do ‘what if’ scenarios, we look at actual traffic. So when we see problems, we can point to actually where.

“If there’s an alert — today, we’re not trying to debug and don’t have the data to — inside the browser, inside the app stack or even to some extent the return path. The data we see is 100 percent of the outbound path. We don’t see the return path as much. You usually want to start with the traffic, then use those other types of synthetic analytics tools for debugging. Often we’ll be co-existing with other systems that do that synthetic monitoring. It’s data from routers and switches, hosts, hypervisors, load balancers, things that can send summaries of traffic, then we can take other infrastructure data — SNMP and such — and show people the path the traffic is taking.”

Its visualizations include time-series line, bar and stacked line charts, comparison bar charts, and traffic flow charts. Users can analyze traffic based on metrics such as bits, packets and flows per second; endpoints, such as unique source and destination IP addresses; and network performance, such as TCP retransmits and jitter.
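To make the traffic metrics concrete, here is a minimal sketch of how bits, packets and flows per second could be derived from raw flow records by grouping them into fixed time buckets. The tuple layout and the `rates_per_second` function are assumptions made for illustration, and the sketch ignores packet sampling, which real flow exports often use.

```python
from collections import defaultdict


def rates_per_second(flows, bucket_seconds=60):
    """Aggregate (timestamp_s, byte_count, packet_count) tuples into per-bucket rates.

    Returns a dict keyed by bucket start time, with average bits, packets
    and flows per second over each bucket.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # bits, packets, flows
    for timestamp, byte_count, packet_count in flows:
        bucket = int(timestamp) // bucket_seconds * bucket_seconds
        entry = totals[bucket]
        entry[0] += byte_count * 8
        entry[1] += packet_count
        entry[2] += 1
    return {
        bucket: {
            "bits_per_second": entry[0] / bucket_seconds,
            "packets_per_second": entry[1] / bucket_seconds,
            "flows_per_second": entry[2] / bucket_seconds,
        }
        for bucket, entry in sorted(totals.items())
    }


sample = [(0, 750, 10), (30, 750, 20), (60, 1500, 6)]
for start, rates in rates_per_second(sample).items():
    print(start, rates)
```

Because the per-record data is retained rather than pre-aggregated, the same records could be re-bucketed at a different interval or grouped by another field after the fact.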

All the data is available online through the portal API.

“A lot of people integrate around it to bring insights from the ELK (Elasticsearch, Logstash, Kibana) layer out to people doing application or security analytics,” he said.

Kentik generally is used along with another service such as metrics-based monitoring — New Relic, SignalFx, Wavefront, Datadog; a classic application performance management solution; or what Freedman refers to as a “synthetic monitoring” option, such as Catchpoint or Thousand Eyes.

Its main competitor is Arbor Networks on the operational side and, to some extent, Lancope, which Cisco bought, on security use cases such as keeping the data for forensic access, though that’s not Kentik’s main use case. Other than that, it’s DIY: Elastic or any of the Hadoop stack, where people are trying to effectively harness operational data. Open source NFDump and pmacct are the single-machine competitors. In the vendor space, it’s competing with appliances.

“With over 100 million average monthly unique visitors coming to Yelp, we see traffic reaching many gigabits per second, so it’s critical for us to be able to look deeply into all network traffic in real-time and gain real insight,” said Sam Eaton, Yelp vice president for engineering, operations and infrastructure.

“That’s where Kentik has helped. We can see things we simply couldn’t see before.”

Feature Image: “Traffic,” by Andreas Levers, licensed under CC BY-SA 2.0.

Cisco and New Relic are sponsors of The New Stack.

