Confluent Expands Stream Governance Capabilities
“Customers need more controls and advanced capabilities and easier means of expanding,” said Confluent Senior Product Marketing Manager Greg Murphy of the creation of Stream Governance Advanced, which was unveiled at the company’s annual user conference Current 2022 Summit.
“Building on the suite of features initially introduced with Stream Governance Essentials, Stream Governance Advanced delivers more ways to easily discover, understand, and trust data in motion. With scalable quality controls in place, organizations can democratize access to data streams across teams while achieving always-on data integrity and regulatory compliance,” a Confluent press release notes.
Stream Governance Advanced differs from Stream Governance Essentials primarily in three areas: new point-in-time playbacks for Stream Lineage, the ability to add business metadata to the Stream Catalog, and the global availability of Schema Registry.
Jon Fancey, Confluent principal product manager, sat down with The New Stack for a one-on-one interview ahead of the launch to discuss Stream Governance and share additional details on why an enterprise client would choose one tier over the other. Explaining the name and the products, he said, “Governance is a way of making sure your data is clean and of high quality. Stream governance provided the essentials. Advanced takes that to the next level.”
The whole governance suite provides deep integration with schemas and Schema Registry. It harnesses the power that comes with describing data: once you understand the shape of the data, you get more value out of it. The original Stream Governance, now titled Stream Governance Essentials, will be free to all Confluent Cloud clients.
Stream Lineage — Looking Back in Time
Fancey asked, “What happened to the pipeline on Wednesday at 4 p.m. when support tickets started arriving?” And Stream Lineage answers. Stream Lineage provides the ability to “drill in and understand [or help find] what contributed or caused [a] particular problem by looking at how the data flows changed previously, to what’s happening now,” explained Fancey.
Stream Lineage was also available in the original Stream Governance, but the look-back window was only ten minutes. Stream Governance Advanced extends that to up to one week in hourly chunks, or the last 24 hours for detailed analysis. Gone is the need to wait for an issue to recur because it “happened three days ago.”
A new feature of Stream Lineage is the ability to search across lineage graphs for specific objects such as client IDs or topics. Clients can take a bird’s-eye view or drill in on specific details.
Adding Business Metadata and Why That Matters…
There’s a solid argument to be made that most people don’t want their Social Security number in anyone’s data catalog, but sometimes it’s in there, and for the right reasons. So let’s keep it safe. Stream Governance already had Personally Identifiable Information (PII) tags available in its schemas, but the addition of business metadata goes one step further.
Because Social Security numbers are already tagged as PII, they require special clearance and the data isn’t visible. But there are instances where developers need to look deeper into something and need access to the events or topics. Now they can see the producers, consumers, and schema. And while the sensitive data isn’t available, the data owner or contact is, which saves valuable time and makes the right people easier to reach.
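As a rough illustration of that idea, the Python sketch below models a catalog entry whose PII-tagged fields stay hidden while the owner contact from the business metadata remains visible. It is not Confluent's data model or API: `SchemaField`, `TopicEntry`, `catalog_view`, and every field name and value are hypothetical, chosen only to show the principle.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaField:
    # Hypothetical schema field carrying a set of tags such as "PII".
    name: str
    tags: set = field(default_factory=set)


@dataclass
class TopicEntry:
    # Hypothetical catalog entry: schema fields plus free-form business metadata.
    name: str
    fields: list
    business_metadata: dict = field(default_factory=dict)


def catalog_view(entry, event):
    """Return what a developer without PII clearance would see for one event."""
    visible = {}
    for f in entry.fields:
        # Values of PII-tagged fields are masked; everything else passes through.
        visible[f.name] = "<hidden>" if "PII" in f.tags else event.get(f.name)
    return {
        "topic": entry.name,
        "event": visible,
        "contact": dict(entry.business_metadata),
    }


orders = TopicEntry(
    name="orders",
    fields=[SchemaField("order_id"), SchemaField("ssn", {"PII"})],
    business_metadata={"owner": "payments-team", "email": "payments@example.com"},
)
view = catalog_view(orders, {"order_id": 42, "ssn": "123-45-6789"})
print(view)
```

The developer never sees the Social Security number, but does learn whom to ask about it.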
Clients can now search the catalog with a GraphQL API, the same API that Confluent is built on. GraphQL allows for more declarative searches and helps with the problem of over- and under-fetching data.
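To make the over- and under-fetching point concrete, here is a minimal Python sketch. It does not call Confluent's API or use a GraphQL library; the `TOPIC` record, its field names, and the `select` helper are hypothetical stand-ins that mimic how a GraphQL-style request names exactly the fields it wants, including nested ones, in a single call.

```python
# Hypothetical catalog record; the field names are illustrative only.
TOPIC = {
    "name": "orders",
    "owner": {"team": "payments", "email": "payments@example.com"},
    "tags": ["PII"],
    "partitions": 12,
    "schema": {"type": "record", "fields": ["order_id", "ssn"]},
}


def select(record, selection):
    """Return only the fields named in `selection`, recursing into nested dicts.

    `selection` maps a field name to True (take the value as is) or to a
    nested selection dict (take only part of a nested object).
    """
    result = {}
    for name, sub in selection.items():
        value = record[name]
        result[name] = select(value, sub) if isinstance(sub, dict) else value
    return result


# Declarative request: the topic name and the owner's email, nothing else.
wanted = {"name": True, "owner": {"email": True}}
print(select(TOPIC, wanted))
# prints {'name': 'orders', 'owner': {'email': 'payments@example.com'}}
```

A fixed endpoint that always returned the full record would over-fetch here, while one that returned only top-level fields would under-fetch and force a second request for the owner.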
Schema Registry Is Now Available Worldwide
Schema Registry is now available in 28 regions, more than double the previous number. It is also more resilient, with a 99.95% uptime Service Level Agreement (SLA). In short, this means applications can run in more regions at a higher SLA.
Fancey explains that, “If you’re opinionated for regulatory reasons [and] have to run in a certain region on certain clouds, we provide more choice than ever now, to be able to do that.”