Confluent Platform 7.0: Data Streaming Across Multiclouds
The challenge is clear: how to offer real-time or near-real-time access to data that is continually refreshed across a number of different distributed environments. With different types of data streaming from various sources such as multicloud and on-premises environments, the data, often held in shared digital layers such as so-called digital information hubs (DIHs), must be updated asynchronously to maintain a consistent user experience.
To that end, data streaming platform provider Confluent’s 7.0 release features what the company calls Cluster Linking, which mirrors data between Confluent clusters across multicloud and on-premises environments. Because the platform is built on the open source data streaming tool Apache Kafka, hundreds of different applications and data systems can use it to migrate to the cloud or share data between their data center and the public cloud, Confluent says.
Traditionally, syncing data between multiple clouds or between on-premises and the cloud was “like a bad game of telephone,” Luke Knepper, a product manager for Confluent, told The New Stack. “You have these point-to-point connections and batch ETL jobs that slowly deliver outdated information. Now, hybrid architects can use Cluster Linking, which is like a laser beam that syncs data in real-time to and from wherever you need it,” Knepper said. “You no longer have to try to manage and audit the tangled web of telephone wires, which saves tons of time and resources, and empowers your application teams to deliver rich real-time user experiences, business-critical applications, and analytics.”
Not to be confused with Kubernetes clusters, Confluent clusters provide a platform to set data in motion whereas Kubernetes clusters provide a platform to run applications and other platforms, Knepper explained. “A Confluent cluster can move data between many different applications and regions. A Kubernetes cluster can host those applications — and can even host a Confluent cluster,” Knepper said. “Our customers who choose to self-host their Confluent clusters can leverage Confluent for Kubernetes to run Confluent Platform on top of their Kubernetes clusters.”
The release includes Apache Kafka 3.0, which offers a developer preview of Apache Kafka Raft Metadata mode (KRaft). KRaft is a new consensus mechanism built into Kafka that doesn’t require Apache ZooKeeper, Knepper noted. “KRaft makes deploying Kafka easier, allows it to scale to millions of partitions and makes recovering from broker failures ten times faster,” Knepper said.
Indeed, KRaft was introduced to remove Apache Kafka’s dependency on ZooKeeper for metadata management. “Replacing external metadata management with KRaft greatly simplifies Kafka’s architecture by consolidating responsibility for metadata into Kafka itself, rather than splitting it between two different systems: ZooKeeper and Kafka,” Hasan Jilani, a senior product manager for Confluent, wrote in a blog post. “This improves stability, simplifies the software, and makes it easier to monitor, administer and support Kafka. It also allows Kafka to have a single security model for the whole system, along with enabling clusters to scale to millions of partitions and achieve up to a tenfold improvement in recovery times.”
Other new features in Confluent Platform 7.0, according to Confluent, include:
- APIs in Confluent for Kubernetes in ksqlDB: This feature, Knepper noted, “will be familiar to SQL developers for powerful queries.” There is now support for ksqlDB 0.22, which includes foreign-key joins. In previous releases of ksqlDB, users could join tables only on each table’s primary key.
- Management Mode for Confluent Control Center: This is intended to help customers leverage Confluent’s Health+ offering and “offload expensive metrics and monitoring” to the cloud, Knepper said.
- Confluent for Kubernetes 2.2: This follows the release of Confluent for Kubernetes in May, which allows private-cloud Kafka services to be created by using a declarative API to deploy and operate Confluent. Kubernetes distributions such as VMware Tanzu Kubernetes Grid (TKG) and Red Hat OpenShift, or any distribution meeting the Cloud Native Computing Foundation’s (CNCF) conformance standards, are supported.
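
To make the ksqlDB item above more concrete, here is a minimal Python sketch of the difference between a primary-key join and a foreign-key join. It is not ksqlDB syntax and does not reflect how ksqlDB is implemented: plain dictionaries stand in for tables, and the table names, column names and helper functions (`orders`, `customers`, `customer_id`, `primary_key_join`, `foreign_key_join`) are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: dictionaries keyed by primary key stand in for tables.
# All table, column and function names here are hypothetical.

customers = {
    "c1": {"name": "Alice"},
    "c2": {"name": "Bob"},
}

orders = {
    "o1": {"customer_id": "c1", "total": 30},
    "o2": {"customer_id": "c2", "total": 15},
    "o3": {"customer_id": "c1", "total": 20},
    "o4": {"customer_id": "c9", "total": 5},  # refers to a customer that does not exist
}


def primary_key_join(left, right):
    """Join rows only where both tables contain the same primary key."""
    return {key: {**left[key], **right[key]} for key in left if key in right}


def foreign_key_join(left, right, fk_column):
    """Inner join: match a column of each left row against right's primary key."""
    result = {}
    for key, row in left.items():
        match = right.get(row[fk_column])
        if match is not None:
            result[key] = {**row, **match}
    return result


# Order IDs and customer IDs never coincide, so a primary-key join finds nothing.
pk_result = primary_key_join(orders, customers)

# Joining on the customer_id column enriches each order with its customer.
fk_result = foreign_key_join(orders, customers, "customer_id")

print(pk_result)
print(fk_result)
```

In this toy data the two tables have different primary keys, so only the foreign-key variant can attach a customer name to each order. A streaming engine would also have to keep such a result up to date as either table changes, which this one-shot sketch does not attempt.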