Confluent Kafka Cloud Gets Apache Flink Instant Analytics

Confluent is extending its Kafka-based message brokering cloud service with real-time analytics capabilities, through a managed version of Apache Flink. The service aims to eliminate the burden of setting up analytics on the enterprise backend, and its serverless scale-up model should cut provisioning costs as well, the company promises.
The serverless-based billing is “based on usage, not on allocated compute or allocated capacity,” said James Rowland-Jones, Confluent stream processing product leader, in an interview with TNS. “And so what users end up paying for is actual usage not provisioned capacity.”
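The difference between the two billing models is easy to see with some arithmetic. The Python sketch below is only an illustration: the rate, the capacity units and the hourly workload are made-up numbers, not Confluent pricing, and it assumes for simplicity that both models charge the same rate per unit-hour.

```python
# All figures below are hypothetical, chosen only to illustrate the idea.
RATE_PER_UNIT_HOUR = 2           # assumed price of one compute unit for one hour
HOURLY_USAGE = [1, 1, 8, 2]      # assumed units actually consumed in each hour


def provisioned_cost(hourly_usage, rate):
    """Pay for peak capacity during every hour, whether it is used or not."""
    peak = max(hourly_usage)
    return peak * len(hourly_usage) * rate


def usage_cost(hourly_usage, rate):
    """Pay only for the units actually consumed in each hour."""
    return sum(hourly_usage) * rate


provisioned = provisioned_cost(HOURLY_USAGE, RATE_PER_UNIT_HOUR)
metered = usage_cost(HOURLY_USAGE, RATE_PER_UNIT_HOUR)
print(provisioned, metered)
```

With a spiky workload like this one, capacity sized for the peak sits idle most of the time, which is where any savings from usage-based billing would come from.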
More details will be forthcoming at the company’s Current 2023 user conference, being held this week in San Jose, California.
In addition to the inclusion of Flink, the company also unveiled a new self-service data portal, which will provide a graphical user interface for running data streams. The company has also forged a number of partnerships with AI companies to bring AI capabilities to data streams. It also has a new package, called Enterprise Cluster, that allows users of Confluent Cloud to set up private clusters accessible over VPC. It is built on the Kora Engine.
The use of real-time data analysis is on the rise, as more businesses compete in crowded marketplaces. The 2023 Data Streaming report estimated that 72% of organizations use it to power mission-critical systems.
While Kafka can manage large-scale data streams coming in from a source, additional analysis may be needed to make sense of the data. Over the past few years, Apache Flink has risen to prominence, often in conjunction with Kafka, as an open source platform for high-throughput, low-latency stream processing. Data processing in this context, according to Confluent, can mean tasks such as matching drivers and riders for a ride-sharing company, ferreting out fraudulent activity for financial companies, and detecting unusual activity for security companies.
Initially, the interface for Flink will be SQL, meaning developers can write SQL queries to interrogate the data. If someone creates a topic and a schema within Kafka, Flink will create a SQL table that can be queried in Flink, eliminating the need to set up a table separately. “And so for you as a user, you don’t have to duplicate the metadata, it’s already there,” Rowland-Jones said. Next year, the company will open Flink to more programmatic analysis through a set of APIs for Python and Java.
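To give a rough sense of what deriving a table from existing metadata means, here is a minimal Python sketch. It is not Confluent's implementation and does not use Flink: Python's built-in sqlite3 module stands in for the SQL engine, and the topic name, schema and records are all hypothetical. The point is only that the table definition comes from the topic and schema, so the user never declares it a second time.

```python
import sqlite3

# Hypothetical topic name and schema, standing in for what a Kafka topic
# and its registered schema would already describe.
TOPIC = "orders"
SCHEMA = {"order_id": "INTEGER", "customer": "TEXT", "amount": "REAL"}

# Made-up records as they might arrive on the topic.
RECORDS = [
    {"order_id": 1, "customer": "ana", "amount": 20.0},
    {"order_id": 2, "customer": "bo", "amount": 5.5},
    {"order_id": 3, "customer": "ana", "amount": 12.5},
]


def table_ddl(topic, schema):
    """Derive a CREATE TABLE statement from the topic name and schema."""
    columns = ", ".join(f"{name} {sql_type}" for name, sql_type in schema.items())
    return f"CREATE TABLE {topic} ({columns})"


def load_topic(conn, topic, schema, records):
    """Create the table from the existing metadata and load the records."""
    conn.execute(table_ddl(topic, schema))
    placeholders = ", ".join(f":{name}" for name in schema)
    conn.executemany(f"INSERT INTO {topic} VALUES ({placeholders})", records)


def total_per_customer(conn):
    """Query the topic-backed table with ordinary SQL."""
    query = (
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
load_topic(conn, TOPIC, SCHEMA, RECORDS)
totals = total_per_customer(conn)
print(totals)
```

In a streaming engine the query would keep updating as new records arrive, whereas this stand-in runs once over a fixed set of rows.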
Apache Flink will be available as an open preview for current Confluent Cloud customers using AWS in select regions for testing and experimentation purposes. General availability is coming soon.
More highlights from the conference:
Stream processing was considered niche, with a number of inherent limitations compared to traditional batch, including issues in transactional correctness, lack of tools and poor scalability. But these limitations were not insurmountable — @jaykreps #current23 @confluentinc pic.twitter.com/M8vT7auMUw
— Joab Jackson (@Joab_Jackson) September 26, 2023
Warner Brothers launched a new video streaming service this year, called Max, built on @confluentinc’s managed #Kafka service. WB SVP of engineering Girish Rad cited the service’s elasticity and log management #current23 pic.twitter.com/KqbeRReZxL
— Joab Jackson (@Joab_Jackson) September 26, 2023
.@NotionIQ uses #Kafka-based data streaming for #AI to auto-fill summaries of everything entered into the productivity app, in real time — Daniel Sternberg, Notion Head of Data, #Current23 @confluentinc pic.twitter.com/uW72APsLkA
— Joab Jackson (@Joab_Jackson) September 26, 2023
Most data pipelines can be expressed as SQL statements—@confluentinc’s David Anderson on the values of #ApacheFlink… #Current23 pic.twitter.com/bzZs99Jkx8
— Joab Jackson (@Joab_Jackson) September 26, 2023
Businesses building on batch systems are on the decline — @jaykreps @confluentinc, on how most organizations are operating in stream processing environments now, press lunch #current23 pic.twitter.com/8tkeEtaRNU
— Joab Jackson (@Joab_Jackson) September 26, 2023
Business data is *always* streaming. It is either bounded (“batch”) or unbounded. #ApacheFlink handles both in the same way, making it easier to query both historical and live data — @confluentinc’s Martijn Visser, #Current23 pic.twitter.com/tVjlsIALg3
— Joab Jackson (@Joab_Jackson) September 26, 2023
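Visser's point about bounded and unbounded data can be sketched in a few lines of plain Python. This is an analogy, not Flink code: a generator stands in for the stream processor, a finite list plays the role of historical ("batch") data, and an endless counter plays the role of a live stream. The function name and the sample values are invented for the illustration.

```python
import itertools


def running_total(events):
    """Yield the cumulative sum after each event, for any iterable."""
    total = 0
    for amount in events:
        total += amount
        yield total


# Bounded input ("batch"): a finite list of historical values.
historical = [5, 10, 15]
batch_result = list(running_total(historical))

# Unbounded input: an endless generator standing in for a live stream.
live = itertools.count(start=1)  # 1, 2, 3, ... without end
# Take only the first four results so the example terminates.
stream_result = list(itertools.islice(running_total(live), 4))

print(batch_result)
print(stream_result)
```

The same `running_total` logic runs unchanged over both inputs; the only difference is whether the input ever ends.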