Confluent ‘Proactive Support’ Aims to Speed Resolution of Kafka Streaming Data Issues

11 Nov 2020 3:00am

Confluent says its new Proactive Support service will further reduce the time involved in troubleshooting the company’s commercial Kafka platform for managing streaming data.

Proactive Support is “based on the lessons we learned managing thousands of clusters in Confluent Cloud,” Dan Rosanova, head of product management, Confluent Cloud, told The New Stack. “Through Confluent Cloud, we are able to see how a huge range of cluster sizes and different Kafka environments function. We codified that knowledge into Proactive Support and are able to bring this SaaS benefit to customers who are managing Kafka on-premise.”

The service provides ongoing, real-time analysis of performance and configuration data and sends notifications to alert users to potential issues in their environments. Using Proactive Support with Confluent Platform or Confluent Cloud to manage Kafka deployments reduces the time required to resolve support tickets by up to 25%, the company claims.

This add-on lets organizations extend Confluent Platform 6.0 with troubleshooting and alerting capabilities aimed at operations teams that manage Kafka clusters. The Kafka-based Confluent Platform 6.0, for example, was designed to help DevOps team members automate operations tasks. With Proactive Support, DevOps teams can expect to resolve issues faster when Kafka clusters fail, Confluent said.

Proactive Support, a subset of the capabilities in Confluent Platform 6.0, falls under Confluent’s Project Metamorphosis, a likely reference to Franz Kafka’s classic novella “The Metamorphosis.” Project Metamorphosis is Confluent’s umbrella effort to help organizations realize the full potential of their event streaming clusters, which are often deployed in multicloud and on-premises environments around the world. Confluent Platform 6.0, itself part of the initiative, was created to solve several issues operations teams face when managing Kafka clusters.

In Apache Kafka environments, troubleshooting once job tickets are issued can be especially time-consuming. “It’s a manual process given its highly distributed, complex nature,” Rosanova said. “There are two buckets Kafka issues fall into, slow builds and sudden bursts. With Proactive Support’s 24/7 analysis of cluster metadata, it helps quickly resolve both.”

Proactive Support is configured to issue alerts for slow builds “when you are heading into a potential danger zone for an issue,” Rosanova said. “For this initial launch, we’ll be able to alert you when your disks or networking are getting overloaded: two very common and costly issues organizations run into when they are scaling Kafka.”
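The idea of warning before a disk reaches a “danger zone” can be illustrated with a small trend projection. The Python sketch below is not Confluent’s implementation, which the company has not described in detail. It is a minimal example that assumes periodic disk-usage readings, and its names (`Sample`, `hours_until_threshold`, `should_warn`), the 85% threshold and the 24-hour horizon are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    hour: float           # time of the reading, in hours (hypothetical unit)
    used_fraction: float  # disk usage between 0.0 and 1.0


def hours_until_threshold(samples: List[Sample],
                          threshold: float = 0.85) -> Optional[float]:
    """Estimate hours until usage crosses `threshold`.

    Fits a least-squares line through the readings. Returns 0.0 if the
    latest reading is already at or over the threshold, and None if there
    is too little data or usage is flat or falling.
    """
    if not samples:
        return None
    latest = samples[-1]
    if latest.used_fraction >= threshold:
        return 0.0
    if len(samples) < 2:
        return None

    n = len(samples)
    mean_t = sum(s.hour for s in samples) / n
    mean_u = sum(s.used_fraction for s in samples) / n
    var_t = sum((s.hour - mean_t) ** 2 for s in samples)
    if var_t == 0:
        return None  # all readings share one timestamp, no trend to fit

    # Slope of the fitted line: growth in used fraction per hour.
    slope = sum((s.hour - mean_t) * (s.used_fraction - mean_u)
                for s in samples) / var_t
    if slope <= 0:
        return None  # usage is not growing
    return (threshold - latest.used_fraction) / slope


def should_warn(samples: List[Sample],
                threshold: float = 0.85,
                horizon_hours: float = 24.0) -> bool:
    """Warn if the projected crossing falls within the horizon."""
    eta = hours_until_threshold(samples, threshold)
    return eta is not None and eta <= horizon_hours


if __name__ == "__main__":
    readings = [Sample(0, 0.50), Sample(1, 0.55), Sample(2, 0.60), Sample(3, 0.65)]
    print(hours_until_threshold(readings), should_warn(readings))
```

With readings that grow by five percentage points per hour from 50% to 65%, the projection puts the 85% mark about four hours away, so a 24-hour horizon would trigger a warning. A production system would likely need to handle noisy or bursty data, which a single straight-line fit does not.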

Much of the emphasis is on issuing alerts ahead of time, before crashes occur, along with relevant cluster performance metrics. Because DevOps teams are alerted while a threat can still be mitigated efficiently, rather than after they are already putting out fires, they have more time for engineering and innovation on their Kafka data streaming deployments, the company said.

Rosanova also described the case of sudden issues that “are only noticeable right when it happens.” In such a case, Proactive Support sends an alert via email or Slack so the user can connect with Confluent’s support team. “We’ll already be equipped with real-time health metrics of your environment, significantly cutting down the number of questions needed to be asked before identifying the problem,” Rosanova said.

Confluent’s claim that Proactive Support helps resolve Confluent Platform support tickets up to 25% faster rests largely on automating much of the diagnostic information flow, which eliminates many of the manual steps otherwise needed to identify a problem in self-managed Apache Kafka environments, Rosanova said.

“During a typical support call, we would need to ask for people to click through their systems to figure out things like Kafka version numbers, disk and network configuration details complete with historical data,” Rosanova said. “With Proactive Support we already have real-time health metrics at hand, so the process becomes more context-driven and efficient.”
