The Great Grafana Mimir and Cortex Split
Grafana Labs’ Mimir release in March generated discussion about why Grafana decided to forgo its support for Cortex and launch an alternative time series database for storing and querying Prometheus metrics. Some of the speculation was unfounded, such as the claim that Grafana Labs yanked its support because it was worried that Amazon Web Services (AWS) would make an Elasticsearch-like move and create a competing fork aimed at Grafana’s customers. There was also some uncertainty about how much Cortex and Mimir would overlap in functionality.
A few months after the release, some clarification is in order about why Mimir and Cortex are now very different animals and how Grafana is holding true to its promise to further differentiate Mimir as what Grafana executives have referred to as “the Grafana backend for metrics,” with the much-touted capability to scale to 1 billion metrics.
“We got rid of so much old stuff from Cortex: we ripped out the complete legacy storage engine and redid the complete configuration. We also set a lot of new defaults that just make it easy to operate and get up to speed,” Richard (RichiH) Hartmann, director of community at Grafana Labs and a CNCF TAG Observability chair, told The New Stack. “So, we invested insane amounts of time into usability: you can literally just run a single binary version on your laptop and you have a fully running Mimir in just a few minutes.”
This is not to say that Mimir does not still share its foundational code base with Cortex. Created to help scale and improve upon the use of Prometheus for metrics, initially as a SaaS offering, Cortex was accepted as a sandbox project by the CNCF in 2018 before becoming an Incubating Project in 2020 as Grafana’s famous dashboards and Loki’s and Tempo’s popularity gained momentum.
The fact that Grafana Labs had become the largest contributor to the Cortex project eventually raised concerns that some cloud vendors were benefiting commercially from Cortex without contributing as much to the project. Grafana Labs had also begun to contribute more to Grafana Enterprise Metrics (GEM), shifting resources away from open source Cortex. It was then decided that a new project, Mimir, would become Grafana Labs’ open source project for metrics, and that it would share capabilities from GEM. Grafana Labs also thought it was wise for Mimir to have a more restrictive license than Cortex.
“And we ended up putting more and more code into the closed-source portion — and we didn’t like that,” Hartmann said. “We believe in open source and open source comes first.”
As a result, Mimir was released under the AGPLv3 license rather than Cortex’s Apache 2.0 license. The licensing change was also meant to encourage users of Mimir to contribute back more robustly than users of Cortex had. (Grafana, Tempo and Loki also use the AGPLv3 license.)
In a blog post, Tom Wilkie, vice president of product at Grafana Labs, who is also a Prometheus maintainer and a Loki and Cortex co-creator, wrote: “Mimir combines the best of what we built in Cortex with features we developed to run GEM and Grafana Cloud at massive scale, all under the AGPLv3 license. Included with Mimir are previously commercial features, including unlimited cardinality using a horizontally scalable, ‘split’ compactor and blazing fast, high cardinality queries through a sharded query engine.”
The shared capabilities between Mimir, Cortex and GEM are listed here:
When Mimir was released in March, Grafana promised to extend its reach beyond Prometheus metrics to include metrics from InfluxDB, Graphite and Datadog. To that end, Grafana Labs has begun to open source three write proxies for Mimir that handle metrics from Graphite, Datadog and InfluxDB (Mimir already supports OpenTelemetry “natively,” Hartmann says). “These proxies, which are labeled in the open source project, allow quick and simple ingestion of metrics using existing monitoring infrastructure, and lay the foundation for Mimir to ingest metrics from any system,” Alex Greenbank, a senior software engineer for Grafana Labs, wrote in a blog post.
The write proxies accept metrics in the native Graphite and Datadog formats and in Influx line protocol, which extends the list of sources Mimir can ingest from to Prometheus, Grafana Agent, OpenMetrics, InfluxDB, Datadog and Graphite. “By adding the proxy as an additional endpoint for the collection agent, any metrics will be translated to Prometheus time series and sent in Prometheus remote write format to be stored within Mimir,” Greenbank wrote.
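To make the translation step Greenbank describes more concrete, here is a minimal Python sketch of how a Graphite plaintext line or an Influx line protocol line might be turned into Prometheus-style samples. It is not the code of Grafana's write proxies. The `Sample` type, the function names and the naming scheme (dots to underscores, measurement joined to field name) are assumptions made for illustration, and the final step of encoding the samples in Prometheus remote write format is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One Prometheus-style sample (hypothetical type, for illustration only)."""
    name: str
    labels: tuple  # sorted (key, value) pairs
    value: float
    timestamp_ms: int


def from_graphite(line: str) -> Sample:
    """Convert a Graphite plaintext line: '<dotted.path> <value> <unix_seconds>'."""
    path, value, ts = line.split()
    # Assumed mapping: dots become underscores and no labels are extracted.
    return Sample(path.replace(".", "_"), (), float(value), int(ts) * 1000)


def from_influx(line: str) -> list:
    """Convert a simple Influx line protocol line.

    Assumes no escaped characters, float-valued fields only, and an
    explicit nanosecond timestamp.
    """
    head, fields, ts = line.split(" ")
    measurement, *tag_parts = head.split(",")
    # Influx tags map naturally onto Prometheus labels.
    labels = tuple(sorted(tuple(t.split("=", 1)) for t in tag_parts))
    ts_ms = int(ts) // 1_000_000  # nanoseconds to milliseconds
    samples = []
    for field in fields.split(","):
        key, raw = field.split("=", 1)
        # Assumed naming: one series per field, named <measurement>_<field>.
        samples.append(Sample(f"{measurement}_{key}", labels, float(raw), ts_ms))
    return samples


if __name__ == "__main__":
    print(from_graphite("servers.web01.cpu.load 0.75 1650000000"))
    for s in from_influx(
        "cpu,host=web01,region=eu usage_idle=93.2,usage_user=4.1 1650000000000000000"
    ):
        print(s)
```

In this sketch each Influx field becomes its own series, because a Prometheus time series carries a single value per timestamp; a real proxy may choose different naming and label rules.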