In a timely example of applying data to a pressing problem, Austin, Texas-based Eventador used its streaming data platform to build a real-time geolocation system that helps get aid to victims of Hurricane Harvey, which caused untold damage across Texas and other southern U.S. states late last month.
The company’s processing system captures calls for help on Twitter in which people include their addresses, then plots those locations on maps for citizen responders and others. The system filters out duplicates, maps each address and provides a way for people to be marked “safe.” And it does it all in seconds.
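As a rough illustration of the duplicate handling and “safe” marking described above, here is a minimal Python sketch. It is not Eventador’s code: the names `normalize_address`, `HelpRequest` and `RequestBoard` are hypothetical, and it assumes duplicates can be caught by comparing lightly normalized address strings, where a real system would more likely geocode each address first.

```python
import re
from dataclasses import dataclass, field


def normalize_address(raw: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace so near-identical addresses match."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", raw.lower())
    return " ".join(cleaned.split())


@dataclass
class HelpRequest:
    address: str                       # address as first reported
    tweet_ids: list = field(default_factory=list)
    safe: bool = False


class RequestBoard:
    """Hypothetical in-memory store: one entry per distinct (normalized) address."""

    def __init__(self):
        self._requests = {}

    def report(self, tweet_id: str, raw_address: str) -> HelpRequest:
        # Repeated reports for the same address attach to the existing entry.
        key = normalize_address(raw_address)
        request = self._requests.setdefault(key, HelpRequest(address=raw_address))
        if tweet_id not in request.tweet_ids:
            request.tweet_ids.append(tweet_id)
        return request

    def mark_safe(self, raw_address: str) -> bool:
        # Returns False when no request is known for that address.
        request = self._requests.get(normalize_address(raw_address))
        if request is None:
            return False
        request.safe = True
        return True

    def open_requests(self) -> list:
        # Only requests not yet marked safe would be plotted for responders.
        return [r for r in self._requests.values() if not r.safe]
```

In this sketch, two tweets that spell the same address slightly differently collapse into one open request, and marking that address safe removes it from the list responders would see.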
“Basically we were pissed off that people were stuck without help and wondered what we could do,” explained co-founder and CEO Kenny Gorman.
“We worked on it for a few days, only to stop and actually go down there and help people using chainsaws and hand tools. To be fair, it’s really proof-of-concept quality at this point, but it does work. It just needs more energy to make it something people depend on. … It’s planet-scale, and a great example of the power of streaming data platforms like ours.”
The company plans to open source some of the components, such as the map generator, soon.
“We had this epiphany around mid-2015 when we were still at ObjectRocket and Rackspace when we saw customers trying and failing to deliver data quickly to their clients, whether that be by iOS apps, desktop apps, dashboards, IoT or sensors,” Gorman said.
So Gorman and co-founder Erik Beebe created Eventador last year, based on the idea that they could provide a better way to run the real-time streaming workloads that enterprises increasingly rely on. The two had worked together at PayPal and eBay back in the early 2000s. ObjectRocket provides managed MongoDB and Elasticsearch; Rackspace acquired it in 2013.
“We cut our teeth on these giant workloads,” Gorman said. Beebe brought storage expertise and Gorman database knowledge to the partnership.
Gorman describes an “aha” moment behind their venture:
“I was talking to a group of CTOs in London about data problems. I thought [the problems] would be Hadoop doesn’t have this. Or this database doesn’t have that.
“But the problem was that they knew their competitors were building systems that would deliver data in more real time than they could. They were worried that the [competitors’] applications would be more compelling to customers because of that,” he said.
“It was all about time of delivery, not about how much data you had or are you storing all the data coming from your logs or whatever it might be. It was how fast can you get that meaningful data percolated down to build a more compelling product than your competition.”
They decided early on that Apache Kafka, the technology that LinkedIn made open source, had to be the backbone of their service.
“Kafka is designed from the ground up to deal with millions of firehose-style events generated in rapid succession. It guarantees low latency, ‘at-least-once’ delivery of messages to consumers. Kafka also supports retention of data for offline consumers, which means that the data can be processed either in real-time or in offline mode.”
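To make “at-least-once” delivery and retention concrete, here is a small, self-contained Python simulation. It does not use Kafka or any Kafka client library; the `Log` and `Consumer` classes and the `crash_before_commit_at` parameter are hypothetical stand-ins, assuming a consumer that records its position only after handling each message.

```python
class Log:
    """Append-only log that retains every message, so consumers can read from any offset."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the new message

    def read_from(self, offset):
        # Return (offset, message) pairs starting at the requested offset.
        return [(i, m) for i, m in enumerate(self._messages) if i >= offset]


class Consumer:
    """Handles messages first and commits the offset afterwards (at-least-once)."""

    def __init__(self, log):
        self.log = log
        self.committed = 0    # next offset to read after a restart
        self.processed = []   # every message handed to the application

    def poll(self, crash_before_commit_at=None):
        for offset, message in self.log.read_from(self.committed):
            self.processed.append(message)
            if offset == crash_before_commit_at:
                return  # simulated crash: message handled, offset never committed
            self.committed = offset + 1
```

If the consumer “crashes” after handling a message but before committing its offset, the next poll delivers that message again. Nothing is lost, but duplicates are possible, which is why downstream processing under at-least-once delivery usually needs to tolerate repeats. Because the log retains its messages, a consumer that starts later can still read everything from offset 0.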
Kafka can serve everyone from people just tinkering with it to systems handling millions of messages, Gorman said.
“If you take a packet-inspection-type workload or manufacturing workloads, to pull data off sensors, those things might run at 10 Hertz, literally thousands of messages per second. Our goal is to provide a more scalable and robust enterprise-grade platform. We want to cater to folks building big, big, big data infrastructure.”
RedMonk noted Kafka’s rising popularity around the same time:
“With new workloads in areas such as IoT, mobile and gaming generating massive, and ever increasing, streams of data, developers have been looking for a mechanism to easily consume the data in a consistent and coherent manner. Which is exactly where Kafka fits in.”
But Gorman concedes that Kafka isn’t the total answer. So the company is adding Apache Flink, a real-time processing technology, for compute. Flink allows customers to build applications that process, filter and aggregate data in real time. That aspect is still in beta, with general availability expected soon.
Eventador offers a managed service — Gorman says the five-person team is highly focused on support — “but it’s more than just installing software and saying, ‘If you have a problem, call us,’” he said. A cloud service must offer customers more than if they just installed it themselves.
The service enables users to build and deploy pipelines to AWS with just a few clicks. Plans are in the works to add Azure and Google clouds as well. Jupyter Notebooks allow easy analysis and experimentation, and the service uses the Presto distributed SQL engine for real-time analysis, aggregations, filtering and reporting.
Eventador offers a full metrics and monitoring infrastructure, security enhancements beyond what Kafka offers, and cloud and scalability enhancements, he said.
Flink has been wrapped with a “cool” GitHub integration, Gorman said.
He describes the Projects component, which allows developers to easily integrate existing software development workflows into a Flink project via GitHub. Eventador handles all the complexity of the build process (typically Maven) and the deploy process.
Gorman admits his tiny startup faces stiff competition. There’s the whole Hadoop/Spark ecosystem backed by Databricks and IBM. There are Cloudera and Hortonworks.
There’s also Amazon’s Kinesis. And don’t forget Confluent, from some of the folks who originally wrote Kafka. Confluent is both a partner and competitor, Gorman said, though Eventador is adding technology beyond Kafka.