Qumulo: It’s not About the Storage, It’s About the Data
Scaling out app development and management gets a lot of attention. But the lower layers of the new stack also need rethinking, because data itself now matters more than the client-server use cases that storage and networking technologies were originally built to serve.
Qumulo, which launches today after three years in stealth mode, is releasing a first product that doesn’t scale the storage hardware. Instead, the company has developed Qumulo Core, a scalable data analytics platform that gives enterprises a view of their data and storage resources. It runs on commodity hardware, with software that uses common RESTful programming techniques.
Qumulo’s three co-founders, Peter Godman, Neal Fachan and Aaron Passey, were the primary inventors of OneFS, Isilon Systems’ scale-out NAS file system. EMC acquired Isilon for $2.5 billion in 2010.
They set out to build a new storage solution from scratch and have raised $67 million. The most recent round was led by Kleiner Perkins Caufield & Byers, joined by existing investors Highland Capital, Madrona Venture Group and Valhalla Partners.
They conducted more than 600 interviews with companies to learn about their storage pain points and how to address them. Focusing on flash, they quietly began shipping the product to customers last August and incorporating their feedback. They’ve applied agile development methods to roll out new iterations every two weeks.
“Buy more storage” has been the mantra of storage vendors in the past as the solution to companies’ ballooning storage needs, but that’s a no-win strategy these days, the company says.
“The story in the past has been about scaling storage, but now it’s about scaling data,” CEO Peter Godman says.
Despite the massive amounts of data being stored, the place where that data lives is strangely silent about it, he says.
It’s all about questions, Godman says: What do I actually have? How are we growing over time? Can I break that down by application or user? Which users are using what? How much performance are they consuming? What do I need to archive? What do I need to back up?
Qumulo set out to make the storage invisible, but the data visible. At the same time, customers want something fast like NetApp but scalable like Isilon, he said.
They’re basically doing Isilon 2.0, according to Arun Taneja, founder, president and consulting analyst of the Taneja Group, who also consulted with Isilon when it created OneFS.
“These guys learned a lot and realized that when we talk about big data today, it’s a very different world. Petabyte was not a word used 12 years ago [when OneFS was being created],” Taneja said.
“There’s an impression — and one perpetuated in the vendor community — that the only way you can deal not only with millions of files, but billions of files, is with object-based design. Qumulo just busted through that myth,” according to Taneja, who listed the system’s scalability, analytics, global namespace and two-week update cycles as “paradigm shifting” in the industry.
Qumulo Core involves a hardware layer that can use existing commodity hardware, virtual machines, a cloud instance or Qumulo’s appliance. A SaaS software layer, Qumulo Scalable File System (QSFS), builds a database into the file system itself to answer questions about the data that affect capacity and performance.
A dashboard breaks the file system down into areas such as build and test and shows the number of files in each. With a click, a user can see how much performance is being consumed in each area of the filesystem tree. When a client generates a heavy load, another click reveals which files and directories that client is accessing.
The analytics, in particular, set Qumulo apart, Taneja said, adding that he knows of only two other companies, Data Gravity and Tarmin, offering data-aware storage. He said each of the three companies is going a slightly different way, though, with Qumulo focused on extreme scalability, such as that needed by media and entertainment companies.
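To illustrate why a per-directory view like this can be cheap to serve, here is a minimal sketch of one general technique: keeping running totals on every directory so that a question such as "how many files are under build?" needs no tree walk. It is an illustration only, not Qumulo's implementation; the `DirNode` class and `add_file` function are hypothetical names invented for this example.

```python
from typing import Dict, Optional


class DirNode:
    """A directory that carries running totals for its whole subtree (hypothetical)."""

    def __init__(self, name: str, parent: Optional["DirNode"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: Dict[str, "DirNode"] = {}
        self.file_count = 0   # files anywhere below this directory
        self.total_bytes = 0  # bytes anywhere below this directory

    def child(self, name: str) -> "DirNode":
        # Create the subdirectory on first use.
        if name not in self.children:
            self.children[name] = DirNode(name, parent=self)
        return self.children[name]


def add_file(root: DirNode, path: str, size: int) -> None:
    """Record a file and update the totals of every ancestor directory."""
    parts = [p for p in path.split("/") if p]
    node = root
    for part in parts[:-1]:  # the last component is the file name
        node = node.child(part)
    while node is not None:  # walk back up to the root
        node.file_count += 1
        node.total_bytes += size
        node = node.parent


root = DirNode("/")
add_file(root, "/build/a.o", 100)
add_file(root, "/build/b.o", 300)
add_file(root, "/test/logs/run.log", 50)
print(root.children["build"].file_count, root.children["build"].total_bytes)
```

The cost is paid at write time, one update per ancestor, so reading the totals for any directory is a single lookup regardless of how many files sit beneath it.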
The intelligence needed for other verticals, such as e-discovery, might come later, Godman said.
As Taneja explained it:
“More intelligence can be added later because the architecture is the important part. If you didn’t architect the system to deliver intelligence, you couldn’t add it later if you tried.
That’s the reason the EMCs, the IBMs and HPs can’t jump up and say, ‘I had dumb storage yesterday, but I just made it intelligent.’ No, no, no, no. You can’t create intelligence unless you redesign the system and I think that’s Qumulo’s time advantage.”
Taneja Group also does software testing, and its report on Qumulo, based on a pre-production version, explains:
“Qumulo has an ‘API first’ methodology that enables [the software-defined] storage controller to be programmatically accessed via an API or accessed via a web-based GUI … All features of the storage controller can be accessed via API calls using common RESTful programming techniques … Communications and data are passed through a 10Gb network …
“The data is initially written and read from the SSD device, and then data that is accessed less frequently is written to HDD devices and striped as wide as practical. As the controller is truly software-defined, it does not use any hardware-based RAID products for data protection, which not only allows a greater flexibility in data placement but also simple porting of the product to other physical and virtual platforms.”
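The tiering behavior the report describes, where new data lands on SSD and less frequently accessed data moves to HDD, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the product's actual placement logic: the `TieredStore` class is hypothetical, the demotion policy is assumed to be least-recently-used, capacity is counted in items rather than bytes, and reads from the HDD tier are assumed to be served in place.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: a small fast tier backed by a large slow tier."""

    def __init__(self, ssd_capacity: int) -> None:
        self.ssd_capacity = ssd_capacity
        self.ssd: "OrderedDict[str, bytes]" = OrderedDict()  # least recently used first
        self.hdd: dict = {}

    def write(self, key: str, data: bytes) -> None:
        # New and rewritten data always lands on the fast tier first.
        self.hdd.pop(key, None)
        self.ssd[key] = data
        self.ssd.move_to_end(key)
        self._demote()

    def read(self, key: str) -> bytes:
        if key in self.ssd:
            self.ssd.move_to_end(key)  # mark as recently used
            return self.ssd[key]
        return self.hdd[key]  # assumption: no promotion back to SSD

    def _demote(self) -> None:
        # Move the least recently used items to the slow tier when SSD is full.
        while len(self.ssd) > self.ssd_capacity:
            key, data = self.ssd.popitem(last=False)
            self.hdd[key] = data


store = TieredStore(ssd_capacity=2)
store.write("a", b"1")
store.write("b", b"2")
store.read("a")          # "a" is now more recently used than "b"
store.write("c", b"3")   # SSD is over capacity, so "b" is demoted
print(sorted(store.ssd), sorted(store.hdd))
```

A real system would also stripe the demoted data across many drives and protect it in software, as the report notes; the sketch leaves both out.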
Qumulo tested the system on four billion files with 300,000 directories, and reports it took 17 hours to rebuild a 6TB hard drive.
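For a rough sense of scale, the reported rebuild time can be turned into an average rate. The calculation below assumes decimal terabytes (10^12 bytes) and that the full 6TB was rebuilt; neither detail is stated in the company's figures.

```python
# Average rebuild rate implied by "17 hours to rebuild a 6TB hard drive".
# Assumptions: 1 TB = 10**12 bytes, and the whole drive was rebuilt.
drive_bytes = 6 * 10**12
rebuild_seconds = 17 * 3600

rate_mb_per_s = drive_bytes / rebuild_seconds / 10**6
print(f"{rate_mb_per_s:.1f} MB/s")  # about 98 MB/s
```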
Qumulo’s early customers include Ant Farm, Blind Studios, Sinclair Oil, Sportvision, TELUS Studios and the University of Utah Scientific Computing and Imaging Institute.
Entry pricing for a four-node Qumulo Q0626 hybrid storage cluster with 100TB of raw capacity begins at $50,000.
Sportvision, a company that has produced a number of technology enhancements for sporting events such as the yellow first-down line used on broadcasts of football games, began testing Qumulo last October, according to IT manager Grant Turner.
For a baseball game that can last two to three hours, it might set up four to six HD cameras. That video has to be stored at the ballpark, then again at the company’s Fremont, California, site, where developers use it to create the company’s products.
“We were only able to put a small amount [of content] online, then were pushing the rest onto large LTO tapes. It was just a pain, because to bring up a certain game or event, sometimes turnaround would be up to a week just to find the correct tape, and they would store a couple of terabytes of data,” he said.
With Qumulo, developers are able to play back video directly from the system; previously they had an external hard drive plugged directly into their development system. “That was a pretty big improvement for us,” he said.
Sportvision has also been consolidating its storage into a single system that everyone can access.
And Sportvision has been excited about Qumulo’s analytics.
“Now we can identify data that gets used often and we can tell what hasn’t been touched for several weeks. We can move that off to our existing tape system in an archival fashion,” Turner said.
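The kind of check Turner describes, finding data that has not been touched for several weeks, can be sketched as a simple filter over last-access times. This is a hypothetical illustration, not Qumulo's API: the `find_cold_paths` function, the three-week threshold and the sample paths are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def find_cold_paths(
    last_access: Dict[str, datetime],
    now: datetime,
    max_idle: timedelta = timedelta(weeks=3),  # assumed threshold
) -> List[str]:
    """Return the paths whose last access is older than max_idle, sorted by name."""
    return sorted(path for path, seen in last_access.items() if now - seen > max_idle)


# Made-up sample data for illustration.
now = datetime(2015, 3, 16)
last_access = {
    "/games/2014-10-01": datetime(2014, 12, 1),
    "/games/2015-03-10": datetime(2015, 3, 14),
    "/games/2015-01-20": datetime(2015, 2, 1),
}
archive_candidates = find_cold_paths(last_access, now)
print(archive_candidates)
```

The resulting list would be the candidates to move to tape, while recently used data stays online.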
Beyond the original four-node demo system, Sportvision has added three more nodes. It’s moved a couple of production items onto Qumulo and plans to keep moving in that direction, Turner said.
Feature image via Flickr Creative Commons.