Riak’s NoSQL Database and its New Fit with Apache Mesos
Basho Technologies has collaborated with Cisco to enable the Riak KV NoSQL database to run on the Apache Mesos resource manager that automates a customer’s underlying data center infrastructure.
With Riak KV managing the data tier and Mesos managing the underlying infrastructure, the integration also allows for “push button” scaling up and down as Mesos aggregates resources for the Riak nodes, according to the company.
Mesos is the resource manager that Twitter and Airbnb use to manage their massive infrastructure. It’s a different reality for these kinds of Internet-scale companies.
But increasingly, the enterprise is finding that managing monolithic apps is not what it’s all about. Scaling is important and innovation requires more than optimizing a server with virtual machines.
Now there’s this new world of containers and sophisticated resource managers that companies all over the world are playing with to get some understanding of just exactly what they wish to do.
In part, that’s why companies like Basho are seeing the importance of the Mesos ecosystem. They are far from the only ones looking to platforms like Mesos as a way to reach the changing enterprise customer base.
Cassandra, for example, is a scalable NoSQL database and a competitor to Riak. According to the Mesosphere blog, it is well suited to run on Mesos due to its peer-to-peer architecture. It has horizontal scalability, no single point of failure and a simple query language (CQL). Mesosphere is the main company backing Mesos.
And then there are companies like Crate, whose distributed data store handles synchronization, sharding, scaling, and replication. It is designed to self-heal and automatically rebalance a cluster.
Crate’s integration with Mesos allows for management across Crate instances without requiring explicit knowledge of how many instances there are or of their specifications. An overview of the integration is available on the Crate site.
The Riak integration grew out of Basho’s recently released data platform, according to CEO Adam Wray. That platform is aimed at developers who want to build applications using different open source databases, analytics tools or search offerings. To that end, it offers integrations with Apache Spark, as well as Redis and Apache Solr, the open source search technology.
“We wanted to be able to handle many data models, keep them indexed, synced, clustered, highly available with highly accurate data. The goal was to simplify the management of large distributed datasets for enterprise. What we’re doing with Cisco is moving that into the next layer,” Wray said.
“What we’re looking to deliver are even more simplified solutions and data sets for the enterprise and make the data center portion almost auto-scaling in nature with the resource scheduler from Mesos being part of our entire offering.”
Basho is developing an open source integration with Mesos, but also will offer a supported enterprise version. Wray said there will be more work with Cisco and its Intercloud federated public cloud to further offerings in this area.
Cisco is one of Basho’s customers, Wray said.
“Cisco was keen to handle the conundrum of large distributed datasets. Working with us, they felt we had the best platform, core Riak, to deliver on that promise, but they also needed the resource component added. They needed it for their own services long-haul; we knew this had value to all large enterprise clients providing services to their own users,” he said.
Basho had been trying to decide whether to build its own resource scheduler, but large-scale enterprise clients including Cisco were using Mesos.
“The consistent theme was: Here was a thoughtful, well-proven, effectively resourced scheduler for the distributed environment. … We started working with Mesos directly and realized it would be easier to integrate the open source version of Mesos rather than go out and recreate the wheel.”
Basho will demonstrate the beta framework for Riak KV, running on Mesos on top of Intercloud, at MesosCon in both the Basho booth and the Cisco booth. Mesosphere has certified the integration with its Mesosphere Datacenter Operating System, Wray said.
Basho is riding a wave of popularity for NoSQL databases in the enterprise market – Wray said the company has grown 226 percent quarter over quarter – but faces new competition from the likes of Aerospike and MemSQL.
“We have clients like Telefonic, China Payment, Cisco, Comcast, who as they collect more data and use it as a strategic enabler for their applications, the concept of having a rigid structure, a relational database, does not work. What I’ve seen in the past 12 months — and I think this is really escalating — is companies have gone from playing with NoSQL, and when they were playing with it, the ease of use for the developer was the only thing that mattered. But when you transition from playing with it to be able to use large amounts of data to be distributed with a lot of latency around it, to expect it to always be available, but to always be accurate, the expectation levels go up big-time,” he said.
“At Basho, we feel like we’ve got a point of view that trumps the industry because the things we’ve always worked on are availability, accuracy of the data and operational scale and simplicity. Using Mesos with Cisco, this is another thing we’re trying to simplify so people can use data in their applications.”
In addition to its work with Cisco, Wray said Basho will announce other big collaborations around the end of the year and other product components in 30 to 60 days.
Michael Franklin, chair of computer science and director of the Algorithms, Machines and People Laboratory (AMPLab) at UC Berkeley, said there were a couple of interesting points in Basho’s Mesos integration:
“First, IoT applications are inherently widely distributed and highly dynamic, so a scalable, geo-distributed solution for IoT data management makes a lot of sense. The combination of cloud compute and storage plus dynamic data and resource management software seems like the right approach for deploying and evolving data-intensive IoT applications,” he said.
“Second, the innovation in Big Data is largely happening at companies and labs with strong ties to the open source software community. Cisco has been a long-time sponsor of the UC Berkeley AMPLab and RADLab projects, where Apache Mesos was originally developed. Cisco’s adoption of Apache Mesos and Riak is further evidence of the continued rapid evolution of enterprise software being driven by the fast moving open source Big Data community.”
Basho and Cisco are sponsors of The New Stack.