StreamNative Integrates Kafka into Apache Pulsar-Based Cloud

StreamNative, maker of a cloud native event streaming platform powered by Apache Pulsar and Apache BookKeeper, recently announced the beta release of StreamNative Cloud for Kafka, a cloud offering that integrates Apache Kafka with Pulsar. StreamNative briefed The New Stack, explaining how the two seemingly competing streaming platforms can be melded together.
StreamNative was founded in 2019 by former Yahoo colleagues Sijie Guo and Matteo Merli, Apache Pulsar’s original developers. The two have been working on the technology since its inception, approximately 10 years ago. Back then, the goal was to build a messaging platform that would centralize all Yahoo services in one system and allow scalable queueing of workloads. The messaging systems available at the time were not meeting Yahoo’s requirements, which included handling a plethora of messaging topics across workloads and providing a multitenant architecture to ease infrastructure management. These very requirements became the defining features of Apache Pulsar.
What’s Inside
Pulsar is an open source distributed messaging system suited to a broad range of use cases. It scales to move large amounts of data by decoupling the resources that store the underlying message data from the compute resources that handle message distribution.
Pulsar also includes a management component that supports built-in multitenancy and geo-replication, which copies message data across clusters. This layered architecture makes Pulsar especially well suited to cloud and containerized environments, both of which hold separation of compute and storage as a key tenet.
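To make the multitenancy point concrete: Pulsar topic names carry the tenant and namespace in the name itself, in the form persistent://tenant/namespace/topic. The sketch below is a simplified illustration of that hierarchy, not Pulsar's own parser. The function names and the example tenants (acme, globex) are hypothetical, and the older naming form that includes a cluster segment is deliberately not handled.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str      # top-level unit of isolation
    namespace: str   # grouping of topics within a tenant
    topic: str       # local topic name


def parse_topic(name: str) -> TopicName:
    """Split a fully qualified topic name into its four parts (simplified)."""
    domain, sep, rest = name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unsupported topic domain in {name!r}")
    parts = rest.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {name!r}")
    return TopicName(domain, *parts)


def topics_by_tenant(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group topics by tenant, showing how one cluster can host many teams."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        parsed = parse_topic(name)
        grouped[parsed.tenant].append(f"{parsed.namespace}/{parsed.topic}")
    return dict(grouped)


# Hypothetical topics belonging to two tenants on the same cluster.
example = topics_by_tenant([
    "persistent://acme/billing/invoices",
    "persistent://globex/ml/features",
    "non-persistent://acme/ops/alerts",
])
```

Because the tenant is part of every topic name, policies and quotas can be attached at the tenant or namespace level rather than per cluster.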
Pulsar vs. Apache Kafka: Do We Have to Choose?
StreamNative Cloud for Kafka brings these capabilities, along with support for large numbers of topics, to organizations that have significant investments in Apache Kafka, the near-ubiquitous distributed event store and stream-processing platform. StreamNative Cloud for Kafka lets users keep using Kafka’s APIs, wire protocol and even its connectors, while taking full advantage of Apache Pulsar’s capabilities on the back end.
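In practice, wire-protocol compatibility means an existing Kafka client should need little more than a new broker address. The following is a minimal sketch of that idea, assuming a client configured with the standard Kafka properties bootstrap.servers, group.id and auto.offset.reset. Both hostnames and ports are placeholders, the retarget helper is hypothetical, and a real managed endpoint would very likely also require authentication and TLS settings that the article does not describe.

```python
from typing import Dict


def retarget(config: Dict[str, str], bootstrap_servers: str) -> Dict[str, str]:
    """Return a copy of a Kafka client config pointing at a different endpoint."""
    updated = dict(config)  # leave the original config untouched
    updated["bootstrap.servers"] = bootstrap_servers
    return updated


# Existing application settings (hostname is a placeholder).
existing = {
    "bootstrap.servers": "kafka.internal.example:9092",
    "group.id": "billing-service",
    "auto.offset.reset": "earliest",
}

# Hypothetical Kafka-compatible endpoint backed by Pulsar.
migrated = retarget(existing, "kafka-endpoint.cloud.example:9093")

# Only the endpoint differs; application-level settings are unchanged.
changed = sorted(k for k in existing if existing[k] != migrated[k])
```

The application code that produces and consumes messages stays as it is, which is the investment the product aims to preserve.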
StreamNative planted the seeds of StreamNative Cloud for Kafka about two years ago with Kafka on Pulsar (KoP), a precursor feature for open source Pulsar. Now, StreamNative Cloud for Kafka integrates this capability into StreamNative’s cloud platform as well.
Kafkaesque Challenges
StreamNative works with many organizations that are invested in Apache Kafka, and says those organizations have seen challenges with it. For example, StreamNative says it can be difficult to implement multitenancy natively within Kafka since different organizations or teams may use completely distinct Kafka clusters, which in turn imposes significant management overhead.
Addison Higham, StreamNative’s Chief Architect, commented that “in the microservices world, where you have lots of different applications, [customers are] going to have various, broad ranges of use cases; for example, where they’ll need support for large numbers of topics,” for compliance and other reasons. Higham asserts that, with Kafka, “if you add more topics, it can cause degrading performance” and that “…Pulsar… can support millions of topics.” StreamNative also explained that Pulsar offers a work queue to which users can connect as many consumers as they want, while Kafka requires consumers to connect to specific partitions or topics, which makes complex computations difficult in general and imposes significant limitations in machine learning scenarios in particular.
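The difference in consumption models can be illustrated with a small simulation. This is a toy model rather than either system's actual scheduler: it assumes a Kafka consumer group in which each partition is owned by exactly one consumer, and a Pulsar-style shared work queue that hands messages to consumers in round-robin order. All function and consumer names are hypothetical.

```python
from itertools import cycle
from typing import Dict, List, Sequence


def assign_partitions(num_partitions: int, consumers: Sequence[str]) -> Dict[str, List[int]]:
    """Kafka-style model: every partition is owned by exactly one group member."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def idle_consumers(assignment: Dict[str, List[int]]) -> List[str]:
    """Consumers that own no partition and therefore receive no messages."""
    return [c for c, partitions in assignment.items() if not partitions]


def shared_dispatch(messages: Sequence[int], consumers: Sequence[str]) -> Dict[str, List[int]]:
    """Work-queue model: messages are spread across all consumers round-robin."""
    delivered: Dict[str, List[int]] = {c: [] for c in consumers}
    for message, consumer in zip(messages, cycle(consumers)):
        delivered[consumer].append(message)
    return delivered


workers = ["c0", "c1", "c2", "c3"]

# Two partitions, four consumers: parallelism is capped at the partition count.
kafka_view = assign_partitions(2, workers)
unused = idle_consumers(kafka_view)

# Shared queue: all four consumers receive work regardless of partitioning.
queue_view = shared_dispatch(list(range(8)), workers)
```

In the partition-bound model, adding consumers beyond the partition count adds no throughput, whereas the work-queue model lets slow, compute-heavy jobs such as model scoring scale by adding consumers.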
StreamNative believes there are many reasons that its new Kafka/Pulsar offering may appeal to organizations already invested in Kafka. Higham says, “for some organizations, it is more about the capabilities, others it is more about the management aspects, and for others, it is part of their move toward a more cloud native strategy. In all cases, we’ve seen Pulsar being a technology that solves their problems and now the ability for us to support Kafka helps ease that migration process.”
Simplifying the Process
StreamNative sees the Cloud for Kafka product as the next step in bringing open source technology into its fully managed cloud service and offering a more complete solution. It views the product as reducing time-to-value with Pulsar, as it enables customers to use their existing applications, avoiding loss of investment in the Kafka ecosystem.
While the original requirements that inspired the creation of Apache Pulsar are less pressing than they were ten years ago, many of Pulsar’s core value propositions are still relevant — for some organizations perhaps even more so than before. Not surprisingly, StreamNative believes the streaming data market will see an accelerating trend towards Pulsar adoption.
While increasing share for Pulsar may or may not come to pass, one thing is clear: even when Kafka is not used as the native messaging back end, its APIs, protocols and connector ecosystem together constitute a de facto industry standard. It’s a standard that an increasing number of seemingly competing technologies and companies may come to support. Major cloud providers have already done so, and now StreamNative has, too.