
Landoop Lenses Promises to Ease Application Development for Kafka Streams

1 Mar 2018 2:00pm

Take a quick gander at Landoop’s GitHub account, and you can easily see the company’s primary focus: making data from the Apache Kafka stream processing platform usable by enterprises. The company, founded by CEO Antonios Chalkiopoulos and Chief Product Officer Christina Daskalaki, grew out of the pair’s multiple years of developing add-ons and tooling around Apache Kafka.

Those tools covered a range of enterprise must-haves, from stream-reactor, an Extract, Transform and Load (ETL) reference architecture, to fast-data-dev, a project that bundles ZooKeeper, Kafka, a Schema Registry, and over 20 connectors for the platform.

Now, they’re taking the company into the product market with their Lenses Platform. This commercial enterprise Kafka streaming platform consolidates many of the capabilities Landoop built into its individual tools. The goal of the platform is to make it easier for developers and analysts to use Kafka Streams for real-time analysis of data.

In Lenses, this takes the form of a web-based tool that gives users drag-and-drop, Excel-like access to their entire range of Kafka streams and topics. Users can also register SQL stream processors to perform ETL and other data work on information while it’s in transit. This gives even business intelligence workers direct access to the streaming data, and offers them the chance to run analysis during ETL work.
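Conceptually, a registered SQL stream processor applies a continuous filter-and-project transformation to records while they are in transit between topics. As a rough illustration of that idea only (plain Python, not Lenses’ actual engine or API; the payment records are invented for the example), the behavior can be sketched as:

```python
# Conceptual sketch of what a streaming SQL processor does:
# continuously filter (WHERE) and reshape (SELECT) records flowing
# through a stream. This is plain Python, not Lenses' engine.

def sql_like_processor(records, where, select):
    """Apply a WHERE-style predicate and a SELECT-style projection
    to an iterator of records, yielding matching rows lazily."""
    for record in records:
        if where(record):
            yield {field: record[field] for field in select}

# A hypothetical stream of payment events (in Lenses these would be
# Kafka messages read continuously from a topic).
payments = iter([
    {"id": 1, "currency": "USD", "amount": 120.0},
    {"id": 2, "currency": "EUR", "amount": 80.0},
    {"id": 3, "currency": "USD", "amount": 45.5},
])

# Roughly: SELECT id, amount FROM payments WHERE currency = 'USD'
usd_only = sql_like_processor(
    payments,
    where=lambda r: r["currency"] == "USD",
    select=["id", "amount"],
)

results = list(usd_only)
print(results)
```

Because the processor consumes an iterator lazily, the same shape works whether the input is a finite batch or an unbounded stream of messages.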

“We started working with Kafka a long while ago, and back then Kafka didn’t have the capabilities it has now. A few years ago we started building a lot of open source tools around Kafka. We wrote 35 open source components. Over time, we came up with the idea of building something much bigger and better to make it easier for a number of stakeholders,” Chalkiopoulos said.

Those stakeholders can be just about anywhere in the company, and do not require knowledge beyond SQL, he added. “The modern data team has a data engineer, data scientists, the business analysts, operations, and even high-level management,” said Daskalaki. “We’ve seen people want to see data flowing into their browser. One of the first components we made was for the visualization of data. That is powered by our SQL engine sitting on top of this layer. That’s not only a Web interface, but we provide libraries that wrap our APIs that are accessible for developers as well.”

In December, Landoop released Redux, a JavaScript library for Kafka development, which grew out of the needs of one customer. Combined with built-in support for Kubernetes and automated scaling, Lenses provides a quick and widely accessible way to interact with Kafka streams. Users can browse topics inside the data pipelines, and even assign transformations to those pipelines, eliminating the need for a separate data normalization step.

“Big data, over the past decade, was dominated by big data being introduced to a lot of enterprises, but the value those systems delivered in some cases was questionable,” Chalkiopoulos said. “We think streaming data is going to be the future. We think it’s a growing domain. Alongside data warehouse, microservices, and mobile there is going to be a new vertical for streaming data where people anticipate results immediately.”

The power of Lenses is also in its ability to connect external systems to the Kafka streams. “We’ve done quite a lot of work connecting Kafka with third-party systems. We have a big collection of Kafka connectors, and they are all open source. This allows people to build streaming ETL with only configuration, without coding anything. As long as the code is there, it understands Kafka and the target systems; whether it’s a warehouse, NoSQL, or a relational database. This takes care of the first and last letters of ETL: Extraction and loading,” said Chalkiopoulos.
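A pipeline built this way is declared entirely in connector configuration rather than code. A hypothetical sink connector configuration written in the style of Landoop’s open source stream-reactor connectors might look like the following; the class name, property keys, topic, and KCQL statement here are illustrative assumptions, not details taken from the article:

```properties
# Illustrative Kafka Connect sink configuration in the style of
# stream-reactor; consult the connector's own docs for exact keys.
name=payments-cassandra-sink
connector.class=com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkConnector
tasks.max=1
topics=payments
connect.cassandra.contact.points=cassandra-host
connect.cassandra.key.space=finance
# KCQL routes and filters the topic into a table with no custom code,
# covering the "E" and "L" of ETL through configuration alone.
connect.cassandra.kcql=INSERT INTO payments_table SELECT id, amount FROM payments
```

The KCQL line is where the SQL engine reaches into the connector itself, letting the pipeline select and filter fields at ingestion time.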

Daskalaki admitted that SQL on Kafka Streams is par for the course, these days, but she said that Landoop’s offerings are significantly different from the others available. “You can view and query data in real-time and batch data in Kafka. You can create real-time streaming precision with small rules. We also include some aspects of the SQL engine into our connectors as well, so you can query and filter at ingestion time as part of the ETL process. We have been developing this SQL engine since 2016, and it is widely adopted. Also, the biggest factor is it natively integrates with Kubernetes. For SQL processors, it comes with a Web interface that visualizes the line of the SQL query so you can interact with the nodes.”

For Panos Papadopoulos, partner at Marathon Venture Capital and an investor in Landoop, choosing to put his money into the company was easy. “For me it was natural,” said Papadopoulos. “My previous job was at Splunk, where I was a product manager. In 2012 there were no good stream processors. I saw customers who wanted to move data, had a pipeline, and wanted to send the data to another system, like Splunk or Hadoop.”

Papadopoulos sees Landoop as providing a much-needed service to these companies. “People wanted to create this common pipeline of data, but Kafka itself was lower level infrastructure, so then they have to use other tools to go in and analyze. Landoop was the kind of thing that can connect the world of ops with the world of app development or analytics. Their interaction is mind-blowing. You can have data flowing through your operation and your infrastructure, and it’s something that can be used by ops. They can be many things for different people without having to go into a different set of tools.”

Feature image via Pixabay.
