Cloud native monitoring company Sysdig has expanded its flagship Secure DevOps Platform to incorporate full compatibility with Prometheus, making the platform, according to the company, “the only enterprise monitoring solution to be fully compatible” with the open source cloud native monitoring tool.
Although widely used for monitoring Kubernetes-based cloud native deployments, Prometheus itself cannot scale beyond a single server. “Prometheus servers are good for a small environment,” said Payal Chakravarty, vice president of product management at Sysdig.
“Today, when developers and DevOps teams get started with Prometheus, it’s easy to manage it when it’s just one or two apps or a handful of clusters, but as they try to standardize on Prometheus across their enterprise, it’s a completely different ballgame. You need significant resources to understand how to federate manageability of that and then how to scale it for long term data retention because Prometheus servers can handle only a few weeks of data,” Chakravarty said.
The Sysdig platform, by contrast, can handle more than 100 million metrics per second and retain 13 months of data, capabilities that it already provides for IBM Cloud Platform. Part of Sysdig’s value proposition to users is to combine monitoring and security into a single platform. Chakravarty explained that Prometheus support adds to the data available and to what users can do with it. “The agent collects these metrics and remotely writes to our metric store, which is a time-series database that collects all of this. You don’t need a Prometheus server at all,” said Chakravarty.
With the addition of support for PromQL, the Prometheus query language, the Sysdig platform lets users create dashboards, alerts, and metric analytics within the platform, use Sysdig’s out-of-the-box Kubernetes dashboards, and continue using the dashboards and alerts they had already created with PromQL.
“It provides a lot more possibilities of insights that we can create on top of the metrics that we are gathering, whether it’s Kubernetes capacity management or even anomaly detection based on Prometheus metrics. These are some things that we would consider to do in the future as we get that data,” said Chakravarty. “Today you can actually combine Prometheus metrics with the syscall derived metrics and you can do the math on metrics. So, if you had to compute certain trends or combine metrics that were gathered from different sources to compare or mishmash or slice and dice, you would be able to do that.”
To be sure, Sysdig is not the only organization attempting to scale Prometheus beyond a single server.
Weaveworks introduced horizontal scaling to Prometheus some years back with Cortex, as did the Cloud Native Computing Foundation’s Thanos project. Sysdig contends that its release is the first commercial offering to provide cloud-scale scalability.
“From everything that we know, the Cortex offering is not the level of scale that could support a whole cloud-scale implementation. It doesn’t reliably go up to that level of metrics,” said Sysdig Chief Marketing Officer Janet Matsuda. “The reality that we’re hearing from our customers, even though there may be Cortex or these things out there that are making claims that they can scale, the reality of it is that it’s not working for people.”
In addition to Prometheus and PromQL support, Sysdig also launched PromCat, “a curated repository of vetted Prometheus exporters, dashboard, and alerts to monitor any infrastructure, application, and service running in the cloud,” which at launch hosts integrations for the Kubernetes control plane, Istio, and a growing list of AWS services.
The Cloud Native Computing Foundation is a sponsor of The New Stack.