This Week in Numbers: Apache Kafka’s Metamorphosis
The Apache Kafka distributed streaming platform is not changing, but its typical use cases are. Commercial Kafka distributor Confluent issued its third annual Apache Kafka Report, which surveyed over 600 users of the technology, a 71 percent increase compared to last year’s sample. What appears to be a broadening user community has a better idea of Kafka’s business value. A more nuanced definition of the technology and increased adoption of microservices result in a different outlook on Kafka’s association with the term “stream processing.”
Data pipelines and messaging are now the top two uses of Kafka, followed by microservices/event processing and stream processing. The addition of several answer choices partially explains significant drops in the use of Kafka for stream processing (66 percent in 2017 to 48 percent in 2018) and data integration (60 percent to 46 percent). The chart below shows that people distinguish between the use of Kafka capabilities for event processing and streaming ETL, resulting in the broad stream processing category getting fewer responses.
Another explanation for the changes is that Kafka is no longer viewed only as a stream processing backend that competes with Apache Spark and Storm and connects to Hadoop. Instead, the technologies can be complementary: Spark may be used for real-time analysis while Kafka is the source of the real-time data.
The data gets more interesting when people are asked not about the broad categories, but instead about the specific Kafka streaming capabilities being used. The share of respondents using Kafka stream processing capabilities with asynchronous applications jumped from 25 percent in 2017 (when the term was “Kafka Streams API”) to 44 percent in 2018. Its use for backend analytics also rose, from 22 percent to 30 percent. As adoption of the microservices architecture increases, so does the focus on Kafka’s event-driven aspects as opposed to its real-time capabilities.
At least in the Kafka world, adoption of microservices architecture has jumped, from 50 percent of respondents in 2017 to 78 percent in 2018. While actual use of microservices has probably not jumped that much in the last year, the data indicates that Kafka is becoming an integral part of existing microservice architectures. Looking only at those that have microservice architectures, Kafka is more than twice as likely to be used to preserve and communicate state: in 2017, 29 percent did so, and in 2018 that figure jumped to 63 percent.
By communicating through Kafka with events, microservices create a shared log that preserves the state they share. As Kafka becomes a backbone for microservices, Confluent believes organizations become more agile.
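To make the shared-log idea concrete, here is a minimal sketch in Python. It does not use Kafka or any Kafka client library; `EventLog` is a hypothetical in-memory stand-in for a single append-only topic, and `OrderView` is a made-up service that rebuilds its state by replaying events. The order keys and status values are illustrative assumptions, not data from the survey.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class EventLog:
    """Hypothetical in-memory stand-in for one append-only topic (not Kafka itself)."""
    events: List[Tuple[str, str]] = field(default_factory=list)

    def append(self, key: str, value: str) -> int:
        """Append an event and return its offset in the log."""
        self.events.append((key, value))
        return len(self.events) - 1

    def read_from(self, offset: int) -> List[Tuple[str, str]]:
        """Return every event at or after the given offset."""
        return self.events[offset:]


class OrderView:
    """Hypothetical service that derives its state purely from the log."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {}
        self.next_offset = 0  # position of the next unread event

    def catch_up(self, log: EventLog) -> None:
        """Replay unread events; the latest value per key wins."""
        for key, value in log.read_from(self.next_offset):
            self.state[key] = value
            self.next_offset += 1


log = EventLog()
log.append("order-1", "created")
log.append("order-2", "created")

billing = OrderView()
billing.catch_up(log)  # billing has seen the first two events

log.append("order-1", "shipped")
billing.catch_up(log)  # billing picks up only the new event

shipping = OrderView()  # a service that joins later
shipping.catch_up(log)  # replays the full log from offset 0
```

Because both services read the same ordered log, the late-joining `shipping` service ends up with the same view of each order as `billing`, without either service calling the other directly. A real deployment would add concerns this sketch ignores, such as partitioning, retention, and delivery guarantees.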
If so, then Kafka is addressing what its users believe is the technology’s top business value. When asked broadly about the business value of streaming platforms like Kafka, 55 percent said increased agility, with another 48 percent saying it unlocks new use cases. The top responses in 2017, “more accurate and/or faster decisions” and “reduced operating costs,” dropped 11 and 9 points, respectively, as the “agility” and “new use cases” options were added to the question in 2018. The updated choices show that businesses care about the ability to make quick, agile changes in the infrastructure as opposed to making fast, real-time decisions based on analytics.
Finally, although Kafka allows for new use cases, that does not mean it is only supplementing existing technologies. In fact, 62 percent said that it is replacing a technology, with messaging and pub/sub being the areas most likely to be affected.
Featured image via Pexels.