Infinity for Mesosphere: A Data Stack with Akka, Cassandra, Kafka and Spark
Mesosphere has unveiled a data stack for its data center operating system (DCOS). Called Infinity, the new Mesosphere DCOS stack is a high-volume, full-stack platform, assembled with the help of Cisco, that leverages open source technologies to provide a backend model for real-time data processing.
“Almost any company, whether they’re making flavored soda water or automobiles, have products that they want to connect back to their data center or cloud,” states Matt Trifiro, Mesosphere’s senior vice president, in an interview with The New Stack. “And that creates a pretty dramatic challenge: How do you process all that data in real time?
“Prior to Mesosphere Infinity, the way you would do that is, your IT team would have to set up a pretty complex software stack, with all these different open source tools,” Trifiro continues. “Setting each of those individual services up, usually in their own clusters, you have a Cassandra cluster, a Kafka cluster, and a Spark cluster. And these are all siloed. They’re not elastic, they’re not sharing resources, and they require their own teams of experts to not only stand them up, which could take months, but actually to operate them. And to do dev and staging and production, they’re going to have three times that.”
Infinity does not reinvent the wheel at any point, but rather gives development shops a cast-iron mold with which to reproduce the wheel on a massive scale.
It includes open source technologies that have already been proven with DCOS: Apache Spark (which is already a friend to Mesos) for real-time analytics, Apache Kafka as a scalable publish/subscribe messaging system, Typesafe’s Akka as a fault-tolerant middleware framework for parallel applications, and Apache Cassandra as its NoSQL database. Marathon serves as Mesosphere’s preferred orchestrator, though with DCOS, it also offers Kubernetes as an alternative.
All of these have seen considerable action on DCOS, and certainly in some cases, they’ve all been seen together. But their presentation here in Infinity is well-tuned for a new and emerging use case: real-time, parallel data gathering on an IP network.
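To make the division of labor concrete, here is a minimal sketch of how data might flow through such a stack. It does not use Kafka, Spark or Cassandra themselves. The `Topic`, `Store` and `process_stream` names are hypothetical, in-memory stand-ins invented for illustration, and the device names and readings are made up. The point is only the shape of the pipeline: events are published, processed as they arrive, and written to a keyed store.

```python
from collections import deque


class Topic:
    """Hypothetical in-memory stand-in for a publish/subscribe topic (the role Kafka plays)."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def poll(self):
        """Yield and remove every message currently waiting in the topic."""
        while self._messages:
            yield self._messages.popleft()


class Store:
    """Hypothetical in-memory stand-in for a keyed datastore (the role Cassandra plays)."""

    def __init__(self):
        self.rows = {}

    def upsert(self, key, value):
        # Keep a running count and total per key so averages are always current.
        row = self.rows.setdefault(key, {"count": 0, "total": 0.0})
        row["count"] += 1
        row["total"] += value

    def average(self, key):
        row = self.rows[key]
        return row["total"] / row["count"]


def process_stream(topic, store):
    """Stand-in for the analytics step (the role Spark plays).

    Updates per-device aggregates as each event is consumed, rather than
    waiting for an overnight batch job. Returns the number of events handled.
    """
    handled = 0
    for event in topic.poll():
        store.upsert(event["device"], event["reading"])
        handled += 1
    return handled


# Made-up sample events from two devices.
topic = Topic()
store = Store()
for device, reading in [("phone-1", 10.0), ("phone-2", 4.0), ("phone-1", 20.0)]:
    topic.publish({"device": device, "reading": reading})

handled = process_stream(topic, store)
print(handled, store.average("phone-1"))
```

In a real deployment each stand-in would be a distributed service with its own cluster, which is precisely the operational burden Trifiro describes.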
The One Thing That Matters
“Smartphones are things on the Internet,” noted Mesosphere’s Trifiro with his characteristic gift for observation. Signals, he goes on, are how smartphones send data back to data centers, and all of these signals are happening in parallel.
“It’s an almost inconceivable amount of data hitting the data center in real time, and which requires highly scalable, distributed systems in order to be processed. It used to be, it was OK to think about streaming data, sticking it in a database, and then overnight running some analytics on it and drawing some conclusions.
“But applications today (whether I’m ordering a taxi or a valet driver, or I’m looking for a restaurant) have to know everything about me at that moment, so that they can deliver the best possible service. They have to know what kinds of restaurants I like, what restaurants are available, and to deliver a recommendation to me based on the street corner that I just walked by. That requires real-time data processing across millions of devices.”
Could this emerging backend model for real-time data processing, if it delivers everything Mesosphere has promised, improve the architectures of Internet-of-Things data-gathering systems and enable permutations that hadn’t been considered before?
That’s not the concern of Mesosphere, as CEO Florian Leibert tells The New Stack. Infinity will be centered on the backend, while partners such as Cisco will deal with the problem of how much of “everything” it can stuff into a router.
“Think of Infinity as everything that happens once the sensor data hits an endpoint, or is extracted from the sensor,” says Leibert.
While Infinity will be composed of multiple components from various sources, Trifiro tells us that, for the purposes of SLAs, Mesosphere will be customers’ single point of contact for support. This is an extremely important point as containerization and orchestration evolve beyond the needs of the cloud, to a scale that even Google may not have envisioned.
It’s About the Software
There should be a revolution in telecommunications. The technology for one certainly exists.
IP-based networking would seek to replace the world’s circuit-switched communications system with packet-based Internet voice and data exchanges that would surpass telephone quality by orders of magnitude. Billions of modern smartphones would become all-purpose, data-consuming client consoles. And suddenly, there’s your “Internet of Things.”
Amazing as this may seem, today’s servers are not fast enough to handle such a transition, if it were ever to come. The hardware may not be the problem. It’s conventional software architecture, which simply cannot scale up to this level. Microservices may be the only way to handle all that bandwidth, and platforms like Infinity are looking like a way to do it.
Cisco is a sponsor of The New Stack.