How to Make Kafka Cloud Native
Raygun sponsored this podcast.
With over 60% of the Fortune 100 relying on Apache Kafka, the service has become both popular and entrenched. But as cloud technology is expanding, some fundamental changes were necessary to make Apache Kafka truly cloud native.
Kafka sits above the operations layer and below the application layer in the stack. It lives in the data infrastructure layer alongside relational databases, modern NoSQL databases, and data warehouses. It provides a new foundation for data that can bring data from all of those different services into one place so you can consume it at large scale, Narkhede said.
“What we know about enterprises is they want to buy the whole car,” said Narkhede. So the company’s engineers took a year or so to build a cloud native, fully managed service that developers can use in the public cloud. They added security around Kafka workloads so you can deploy it in real enterprises for real workloads, she explained. Next, they created a whole ecosystem of tools, from connectors to different systems to a stream processing layer, to a way to manage your schemas.
The Kafka architecture also needed fundamental changes to leverage economies of scale and provide the elasticity that is essential to being fully cloud native.
The first big change was to solve for multitenancy and add the ability to manage quotas in the system. With providers like Amazon Web Services sharing containers and services, Confluent Cloud built security around a multitenant architecture and instituted elastic quotas. This, she said, is “so you can scale and still maintain several nines of uptime.”
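To make the idea of per-tenant quotas concrete, here is a minimal sketch of a throughput quota implemented as a token bucket. It is an illustration only, not how Kafka or Confluent Cloud enforce quotas; the class name `TenantQuota`, the tenant names, and the byte rates are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TenantQuota:
    """Token bucket limiting one tenant's throughput (hypothetical sketch)."""
    rate_bytes_per_sec: float  # sustained quota; the value is an assumption
    burst_bytes: float         # maximum burst the bucket can hold
    tokens: float = 0.0
    last_refill: float = 0.0

    def __post_init__(self):
        self.tokens = self.burst_bytes  # start with a full bucket

    def try_consume(self, nbytes: int, now: float) -> bool:
        """Return True if a request of nbytes fits the quota at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last_refill)
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_sec)
        self.last_refill = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False  # a real system would throttle or delay this tenant


# Each tenant gets its own bucket, so one noisy tenant cannot starve another.
quotas = {
    "tenant-a": TenantQuota(rate_bytes_per_sec=1_000_000, burst_bytes=1_000_000),
    "tenant-b": TenantQuota(rate_bytes_per_sec=1_000_000, burst_bytes=1_000_000),
}

a_first = quotas["tenant-a"].try_consume(800_000, now=0.0)   # fits in the burst
a_second = quotas["tenant-a"].try_consume(800_000, now=0.1)  # over quota
b_first = quotas["tenant-b"].try_consume(800_000, now=0.1)   # unaffected by tenant-a
a_later = quotas["tenant-a"].try_consume(800_000, now=1.0)   # bucket has refilled
```

Making the quota "elastic" would amount to adjusting `rate_bytes_per_sec` per tenant as demand changes, rather than fixing it when a cluster is sized.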
Next, elastic scaling takes away the need to size clusters to accommodate spikes in service. Confluent Cloud can scale elastically up to 100 MB/second (reads and writes) without your having to plan anything or talk to anyone, essentially providing a “no cluster” experience.
The third fundamental change didn’t really have to do with Kafka at all.
When you start work on elastic scaling, Narkhede said, you constantly run into limits that the cloud infrastructure places on cloud abstractions, e.g., the number of connections that can be made at one time. So Confluent built a control plane around Kafka to automate limit handling.
One of the harder problems they faced was building a true “pay for what you stream” experience. All of the backend systems have to account for your actual usage, so that you are billed for what you truly use rather than for clusters.
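As a rough illustration of usage-based billing, the sketch below meters bytes written and read per tenant and charges only for that traffic. The prices, the `UsageMeter` class, and the tenant names are hypothetical and chosen for this example; they do not reflect Confluent's actual pricing or metering system.

```python
from collections import defaultdict

# Hypothetical prices for illustration only (not Confluent's pricing).
PRICE_PER_GB_IN = 0.10
PRICE_PER_GB_OUT = 0.10
GB = 1_000_000_000


class UsageMeter:
    """Tracks bytes produced and consumed per tenant (hypothetical sketch)."""

    def __init__(self):
        self.bytes_in = defaultdict(int)
        self.bytes_out = defaultdict(int)

    def record_produce(self, tenant: str, nbytes: int) -> None:
        self.bytes_in[tenant] += nbytes

    def record_consume(self, tenant: str, nbytes: int) -> None:
        self.bytes_out[tenant] += nbytes

    def bill(self, tenant: str) -> float:
        """Charge for data actually streamed; an idle tenant pays nothing."""
        cost = (self.bytes_in[tenant] / GB) * PRICE_PER_GB_IN \
             + (self.bytes_out[tenant] / GB) * PRICE_PER_GB_OUT
        return round(cost, 2)


meter = UsageMeter()
meter.record_produce("tenant-a", 50 * GB)   # 50 GB written this billing period
meter.record_consume("tenant-a", 150 * GB)  # read by three consumers, 150 GB total

active_bill = meter.bill("tenant-a")
idle_bill = meter.bill("tenant-idle")
```

Under cluster-based pricing, the idle tenant would still pay for provisioned capacity; here its bill is zero because nothing was streamed.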
“What users care about is data,” Narkhede said. “They want to create topics, and they want to send and receive messages from the topic. They want to connect those topics to external systems like S3 or just write some SQL queries to understand what’s in that topic.”
Listen in to hear Narkhede talk about moving away from cluster provisioning, what you should be learning next, and some interesting things Confluent customers are doing.
In this Edition:
1:16: A high level overview of the services and product Confluent provides.
7:07: How to change Kafka to take advantage of the services that are now available in the cloud.
10:33: The financial gain/loss of using containers, and building for actual use.
11:48: If you’re not provisioning from a cluster perspective, what perspective are you using?
14:47: Moving away from viewing data as a static store that needs to be processed once a day, to a continuously viewed stream of information.
18:15: Advice for engineers to stabilize their careers