Beyond the Quickstart: Running Apache Kafka as a Service on Kubernetes
Running Apache Kafka as a service on Kubernetes provides a number of benefits to different environments, especially to organizations that are focused on becoming cloud native. After a few years of experience running Kafka on Kubernetes, we’ve learned that it doesn’t have to be complicated.
To clarify what we mean by Kafka as a Service, think of it as a 24×7 fully managed service that does not require internal Kafka expertise and that automates provisioning, management and operations. Most importantly, developers can focus on what truly matters: coding. As organizations continue to adopt a cloud-centric approach, Kafka on Kubernetes goes hand in hand with cloud native development, allowing better configuration for next-generation application development and management.
Running Kafka on Kubernetes makes Kafka more elastic and scaling becomes faster and easier. Upgrades that used to take months can now be completed in just days — talk about business value. Using Kafka operators, you can automate Kafka deployment and deploy it quickly on Kubernetes from scratch. By abstracting the infrastructure, having a user-specified “what” and a provisioner-specified “who,” the whole process of configuration and automation is done once and can be run anywhere.
Even Kelsey Hightower, a principal engineer at Google Cloud and a highly influential voice in the Kubernetes space, was doubtful that stateful workloads could be run elegantly on Kubernetes. We’re here to tell you it’s not only possible, it’s imperative. Later in this article, you can read about three of the most common issues with setting up Kafka on Kubernetes, along with our tips for how to avoid these problems.
To make sure that networking works for both internal and external clients, you need some form of health check. This can be done with a Kafka producer and consumer running inside or outside of your Kubernetes cluster (or both). Additionally, the network implementation may vary based on the cloud provider being used, and different implementations of the same resource can pose significant challenges. It pays to build up the networking expertise needed to manage issues as they arise, and to put proper monitoring and telemetry in place so that you know immediately if and when a problem occurs.
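As one possible starting point, the sketch below checks plain TCP reachability of each broker address using only the Python standard library. It is deliberately weaker than the producer-and-consumer round trip described above, since it only proves that a socket can be opened. The function names and any addresses passed to them are hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures and timeouts.
        return False


def check_brokers(addresses, timeout=3.0):
    """Map each "host:port" string to its reachability result."""
    results = {}
    for address in addresses:
        host, _, port = address.rpartition(":")
        results[address] = can_connect(host, int(port), timeout)
    return results
```

Running check_brokers once from a pod inside the cluster and once from a machine outside it gives a quick view of both network paths. A full health check would still produce and consume a test message.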
How Do You Get It Right?
Kubernetes provides many networking options such as node ports, ingress, load balancers and, with Red Hat OpenShift, routes as well. Kafka requires the producers and consumers to talk to individual brokers based on the placement of partitions and partition leaders. Based on the different networking options, you have to configure your network correctly so that the producers and consumers are able to individually address the brokers.
Kafka exposes the “advertised.listeners” option in the broker configuration, which sets the addresses each broker hands back to clients so that they can connect to that broker directly. When configuring the Kubernetes services that allow access to the brokers, you must also set “advertised.listeners” on each broker to the address exposed for it, so that producers and consumers are able to connect to the individual brokers.
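To make this concrete, here is a minimal sketch that renders the listener settings for a single broker. The listener names INTERNAL and EXTERNAL, the ports and the host names are assumptions chosen for illustration. The property keys themselves are standard Kafka broker settings.

```python
def listener_config(internal_host, external_host, external_port):
    """Build listener-related broker properties for a single broker.

    Listener names (INTERNAL, EXTERNAL) and container ports (9092, 9094)
    are illustrative assumptions, not values prescribed by Kafka.
    """
    return {
        # Addresses the broker process binds to inside the pod.
        "listeners": "INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9094",
        # Addresses returned to clients in metadata responses.
        "advertised.listeners": (
            f"INTERNAL://{internal_host}:9092,"
            f"EXTERNAL://{external_host}:{external_port}"
        ),
        "listener.security.protocol.map": "INTERNAL:PLAINTEXT,EXTERNAL:SSL",
        "inter.broker.listener.name": "INTERNAL",
    }


def to_properties(config):
    """Render a dict as key=value lines, as used in server.properties."""
    return "\n".join(f"{key}={value}" for key, value in config.items())


# Hypothetical broker 0, reachable from outside through a node port.
print(to_properties(listener_config(
    "kafka-0.kafka-headless.kafka.svc.cluster.local", "203.0.113.10", 31090)))
```

Each broker needs its own advertised values, which is why the function takes the host names and port as parameters rather than hard-coding them.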
Kubernetes abstracts infrastructure, following an interface pattern wherein third-party providers can create their own plugins that follow a standard interface definition. So you could also build your own routing layer to make sure you are able to address the brokers. Kubernetes allows you to do this via ingress resources.
If you are connecting to Kafka from within the same Kubernetes cluster, you can solve this in a much easier way by creating a headless service that resolves to any of the brokers and using it as your bootstrap endpoint. At the same time, you can address the pods individually because Kubernetes gives each pod a specific DNS name (as long as “advertised.listeners” is configured appropriately as well).
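For pods of a StatefulSet behind a headless service, those DNS names follow the pattern pod-name.service-name.namespace.svc.cluster-domain. The sketch below builds them. The StatefulSet name kafka, the service name kafka-headless and the namespace kafka are hypothetical, and cluster.local is only the default cluster domain, so yours may differ.

```python
def broker_dns_names(statefulset, service, namespace, replicas,
                     cluster_domain="cluster.local"):
    """Return the stable per-pod DNS names of a StatefulSet's pods."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


def bootstrap_endpoint(service, namespace, port=9092,
                       cluster_domain="cluster.local"):
    """Return the headless service address, which resolves to the broker pods."""
    return f"{service}.{namespace}.svc.{cluster_domain}:{port}"


# Hypothetical three-broker cluster.
for name in broker_dns_names("kafka", "kafka-headless", "kafka", 3):
    print(name)
print(bootstrap_endpoint("kafka-headless", "kafka"))
```

Clients use the bootstrap endpoint only for the first connection. After that they connect to the per-pod names that the brokers advertise.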
If you are connecting to Kafka from outside its Kubernetes cluster with one of the many networking options such as node ports, ingress, etc., you still need to make sure that the requirement is met to address the individual brokers and that Kafka is configured appropriately.
There are many different layers of security: authentication, authorization and encryption of the network channel, as well as the encryption of data at rest. You need to secure the communication between the brokers and ZooKeeper (thanks to KIP-500, this is going away) as well as between the producers and consumers and the brokers. The Kubernetes ecosystem is evolving so quickly that service meshes, such as Istio (an open source service mesh platform designed to run on Kubernetes), reduce the pressure when it comes to securing communication. Additionally, when a mesh is configured properly for Kafka, users can ensure internal and external communications are properly secured with mTLS (also known as mutual authentication).
With different Kubernetes integrations, however, come different problems. With Istio, for example, whenever a broker shuts down, it still needs to communicate with other brokers in the cluster as part of the shutdown process. When a pod is killed, the Envoy sidecar container is terminated nondeterministically, which can close the network connection between that broker and the other brokers that it needs to replicate partitions onto.
How Do You Get It Right?
To properly secure your data with Kafka on Kubernetes, producers and consumers need to send data in an encrypted fashion. To do this, create SSL certificates and configure the brokers with them so that the communication between the producers and consumers and the brokers is encrypted. Istio, for example, can help encrypt these communications. Integrations with Kubernetes are progressing quickly and are now better able to support many of the Kafka features and the larger ecosystem.
mTLS or SASL (available in different mechanisms such as SASL/PLAIN or SASL/SCRAM) can help with authenticating clients to the brokers. Once clients are authenticated, make sure that you configure ACLs to allow for fine-grained authorization.
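As an illustration of the client side, the following sketch assembles the properties a Java-based Kafka client would need for SASL/SCRAM over TLS. The user name, password and truststore path are placeholders, and the choice of SCRAM-SHA-512 is an assumption. In a real deployment the credentials would come from a Kubernetes Secret rather than from source code.

```python
def scram_client_properties(username, password, truststore_path):
    """Build Kafka client properties for SASL/SCRAM authentication over TLS."""
    jaas = (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        # SASL authentication carried over a TLS-encrypted channel.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "SCRAM-SHA-512",
        "sasl.jaas.config": jaas,
        # Truststore holding the CA that signed the broker certificates.
        "ssl.truststore.location": truststore_path,
    }


# Placeholder values for illustration only.
props = scram_client_properties("orders-app", "change-me",
                                "/etc/kafka/secrets/truststore.jks")
for key, value in props.items():
    print(f"{key}={value}")
```

The authenticated user name is the principal that ACLs are then written against.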
After the data gets to Kafka, data storage is the next stage. With tiered storage and persistent volumes in Kubernetes, you can encrypt data at rest to make sure the data cannot be read if unauthorized users gain access to the Kafka logs or log segments. Tiered storage that targets cloud provider storage such as S3 can be encrypted at rest when you configure the bucket, and it integrates easily with AWS KMS.
Kubernetes, on the other hand, allows you to encrypt dynamically provisioned persistent volumes by configuring the storage class with the encryption options that its provisioner supports. Static persistent volumes, however, are pre-created, so you have to make sure yourself that they are encrypted if required.
Depending on where you run Kafka on Kubernetes, you may have to handle storage based on provisioners available in Kubernetes. The three major cloud providers (AWS, Google Cloud and Microsoft Azure) abstract the network-attached storage, and Kubernetes supports a wide variety of storage systems, including NFS, CIFS, Ceph, GlusterFS and more.
Storage volumes need to be created in a zone-aware manner if you’re running Kafka across different zones, which Kubernetes provides in a native fashion using topology-aware volume provisioning.
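The sketch below builds a StorageClass manifest that combines encryption with topology-aware provisioning. It assumes the AWS EBS CSI driver, whose provisioner name is ebs.csi.aws.com and which accepts type and encrypted parameters. Other provisioners use different parameters, and the class name kafka-encrypted is hypothetical.

```python
import json


def encrypted_storage_class(name):
    """Build a StorageClass manifest for encrypted, zone-aware volumes.

    Assumes the AWS EBS CSI driver; parameters differ per provisioner.
    """
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",
        "parameters": {"type": "gp3", "encrypted": "true"},
        # Delay volume creation until the pod is scheduled, so the volume
        # is provisioned in the same zone as the node running the pod.
        "volumeBindingMode": "WaitForFirstConsumer",
        # Keep the underlying volume if the claim is deleted.
        "reclaimPolicy": "Retain",
        "allowVolumeExpansion": True,
    }


# kubectl accepts JSON manifests as well as YAML.
print(json.dumps(encrypted_storage_class("kafka-encrypted"), indent=2))
```

The WaitForFirstConsumer binding mode is what makes provisioning follow the pod's zone instead of picking a zone before the pod is scheduled.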
How Do You Get It Right?
Kafka is a stateful service, which means a persistent volume (PV) is required to prevent data loss from pod failure. As a result, you want to use pod-mounted storage so that when a pod needs to move, Kubernetes can attach that storage to a different node and start the pod there successfully. PV claims, along with storage classes, can help with provisioning pod-mounted storage.
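In a StatefulSet, such claims are usually declared through volumeClaimTemplates, which give every broker pod its own claim. A minimal sketch of one template entry follows. The claim name data, the storage class name and the size are hypothetical values.

```python
import json


def volume_claim_template(name, storage_class, size):
    """Build one entry for a StatefulSet's spec.volumeClaimTemplates list."""
    return {
        "metadata": {"name": name},
        "spec": {
            # A broker's volume is mounted by one node at a time.
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical values: each broker pod gets its own 500Gi claim.
print(json.dumps(volume_claim_template("data", "kafka-encrypted", "500Gi"),
                 indent=2))
```

Because the claim outlives the pod, a rescheduled broker reattaches to the same data instead of starting empty.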
Kafka provides tiered storage as well, which makes storing huge volumes of data in Kafka manageable and reduces operational burden and cost. The fundamental idea is to separate the concerns of data storage from the concerns of data processing, allowing each to scale independently. Kafka provides tiered storage to AWS S3, GCS and Pure Storage FlashBlade, all of which support encryption at rest. This allows data to be encrypted from the moment it leaves your client, stored at rest, and kept encrypted all the way back to your consumer.
While running Kafka on Kubernetes requires a significant investment and a solid understanding of the different pieces, the benefits of combining Kafka and Kubernetes are immense. The two together bring flexibility and automation, provided you understand the initial hurdles, properly manage internal and external communication between nodes and brokers, maintain proper security protocols and are mindful of storage provisioning. Once you invest in the initial ramp-up, you can run numerous applications without rebuilding tools specific to each one.
On the other hand, if you don’t want to manage any of this complexity, you can try Confluent for free, so you can spend your time creating and building rather than being bogged down by operations.