
Dell EMC’s ScaleIO Provides Scalable Stateful Storage for Orchestrated Containers

26 May 2017 3:00am

Container schedulers and software-based storage make a great combo that can ease operational complexity for running persistent applications in the cloud, asserted David vonThenen, open source engineer with Dell EMC’s {code} labs, during ApacheCon North America last week.

To illustrate his point, vonThenen demonstrated a framework he created for Apache Mesos that deploys ScaleIO, Dell EMC’s software-defined storage platform.

The {code} by Dell EMC team enables different orchestration platforms to consume storage from different sources on the back end. ScaleIO is available as a free download.

In his presentation, vonThenen pointed out a feature in Mesos called Frameworks, which allow you to schedule a task based on your specific application’s needs. Combining the offer-accept model in Mesos and software-based storage enables deployment of managed tasks while maintaining high availability, scale-out growth and automation.

A Mesos framework consists of a scheduler and an executor. The scheduler can accept or decline the resources it is offered. When it accepts, the task is deployed as a container, which is the executor. The framework is tightly coupled to the application, and can enable things such as health checks and monitoring beyond just configuring and deploying applications.
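The offer-accept model described above can be sketched in a few lines of Python. This is an illustration only, not the Mesos scheduler API: the `Offer` and `Task` classes, the `handle_offers` function and the agent names are all hypothetical, and a real framework would receive offers through the Mesos scheduler interface.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A resource offer from one agent (hypothetical, simplified shape)."""
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    """The resources one application task needs (hypothetical)."""
    name: str
    cpus: float
    mem_mb: int


def handle_offers(offers, task):
    """Accept the first offer large enough for the task; decline the rest."""
    accepted = None
    declined = []
    for offer in offers:
        fits = offer.cpus >= task.cpus and offer.mem_mb >= task.mem_mb
        if accepted is None and fits:
            accepted = offer.agent
        else:
            declined.append(offer.agent)
    return accepted, declined


offers = [
    Offer("agent-1", cpus=0.5, mem_mb=512),
    Offer("agent-2", cpus=2.0, mem_mb=4096),
    Offer("agent-3", cpus=4.0, mem_mb=8192),
]
accepted, declined = handle_offers(offers, Task("postgres", cpus=1.0, mem_mb=2048))
print(accepted, declined)
```

Because the acceptance rule lives in the scheduler, an application-specific framework can apply whatever placement logic suits the application.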

The framework, released in September 2016, is on version 0.3.1. ScaleIO itself is scale-out block storage. It’s all software: You install RPMs or debs on however many nodes you choose. You can do it in a hyper-converged configuration or a two-tier configuration, he explained.

As you add nodes, because it’s all software-based, the metadata manager automatically knows that it needs to rebalance the data. If you take a node out for maintenance or because of a hardware failure, any data that was on that node is automatically rebalanced.

“All the maintenance operations are completely taken care of for you,” he explained, adding that you don’t have to even think about all the operational tasks associated with node failure because they’re completed automatically.

It also has an elastic architecture. If you need more IOPS, you can add nodes. Rather than going through one controller, as in a traditional array, a consumer of a ScaleIO volume has its data striped across many nodes at once.

Rancher has a competing product, he pointed out, adding that the functions he described could be done on any software-based platform.

If you had a Mesos highly available three-node cluster, and various pieces of compute underneath, when you deploy the framework, it will imprint every agent node with the ability to provision and consume storage from the software-based storage platform.

If it’s imprinted on every agent node, you can provision storage from one and also have that storage in case of failover on another piece of compute. You could reattach that volume to another node so you have all the data. That’s high availability for containers.

And when you bring a new Mesos agent node into the cluster, once that agent registers its resources with Mesos, those resources are offered to the storage framework, which immediately imprints the node with the ability to access the storage platform.

Because everything’s software-based — in this case, ScaleIO, where all the operations tasks are handled for you and it can scale linearly — you don’t have the operational complexity of having a storage array where you have to worry about failed disks and other problems.

If you have a hardware failure, take the node out. The data will automatically rebalance. Then you fix the node and introduce it back in. Because it’s all RPM- and deb-based, you can deploy this anywhere. Because ScaleIO is designed to run on bare metal, he said, it will run on VMs or any public cloud, including Azure, because it supports Windows.

Going Stateful

Though containers originally were stateless, he pointed out that 10 of the 20 most popular containers on Docker Hub are stateful, including Postgres, MongoDB, Elasticsearch, and others.

Traditionally, if you’ve spun up Postgres, you write data within the container. Yet if Postgres goes down, you’ve lost your database.

The orchestrators — Docker Swarm, Kubernetes, Mesos — realized that you’re running this container on a particular piece of compute, vonThenen said. There’s usually a direct-attached disk to that compute.

“So let’s just reroute the data so when we’re writing to our Postgres database, we write to a local disk instead of within the container itself, so when you bring the container down, then bring it back up, it can reattach a local disk mount to your container and have all your data. That means you’re writing all your data to local disk,” he said. But what if you have a hard-drive failure or the motherboard on that system goes out? Because it’s all local to that piece of compute, you’ve lost all your data.

“What we’ve learned, going way back, is if you want that data to be highly available, it needs to live on some piece of external storage,” he said.

ScaleIO and VMware’s vSAN basically do the same thing: They take your direct-attached disks and contribute them to a global pool. The data is striped across that pool. You provision volumes out of the pool and can move them from one node to the next. Because the pool is accessible from every node, it looks like external storage, even though local, aggregated disks are the back end of the storage platform.
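To make the pooling idea concrete, here is a minimal sketch of round-robin striping across nodes. It is not ScaleIO’s or vSAN’s actual placement algorithm, which the talk did not detail; the `stripe` function and the node names are assumptions made for illustration.

```python
def stripe(volume_chunks, nodes):
    """Place chunk i on node i % len(nodes), a simple round-robin stripe."""
    if not nodes:
        raise ValueError("the pool needs at least one node")
    placement = {node: [] for node in nodes}
    for i in range(volume_chunks):
        placement[nodes[i % len(nodes)]].append(i)
    return placement


# Three nodes contribute their local disks to one pool.
before = stripe(6, ["node-a", "node-b", "node-c"])
print(before)

# Adding a fourth node spreads the same chunks more thinly.
after = stripe(6, ["node-a", "node-b", "node-c", "node-d"])
print(after)
```

Every chunk remains addressable through the pool regardless of which node holds it, which is why the volume looks like external storage to its consumer.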

As to where this is going, he said: “If we have a framework that configures and deploys an application and APIs to monitor and manage the application, we should, through this framework, be able to do health and remediation on the storage platform.”

Using a software-based storage platform with a cloud platform driven by APIs, applications should be able to do things like auto-scale instances, dial in IOPS for disk and provision new hard drives.

“If you have a framework specific to your application, it gives your application the ability to do things otherwise not intended,” he said.

In his demo, he showed how he could roll out software-based storage on all the nodes and add data to make that storage platform look like it’s 98 percent full, typically a nightmare scenario. But because the software-based storage framework does the auto-balancing, the framework itself can add new disks and provision more storage to expand the capacity of the storage pool so it’s no longer full.
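The remediation logic in the demo might look something like the sketch below. The 90 percent threshold, the 100 GB disk size and the `remediate` function are hypothetical; the talk did not say what trigger or disk size the framework actually uses.

```python
def remediate(used_gb, disks_gb, threshold=0.90, new_disk_gb=100):
    """Add disks until pool utilization falls below the threshold.

    Returns the expanded list of disk sizes. All values are illustrative.
    """
    if not disks_gb or new_disk_gb <= 0:
        raise ValueError("need at least one disk and a positive new disk size")
    disks = list(disks_gb)
    while used_gb / sum(disks) >= threshold:
        disks.append(new_disk_gb)  # stand-in for provisioning a cloud disk
    return disks


pool = [100, 100, 100]
used = 294  # 98 percent of the 300 GB pool
expanded = remediate(used, pool)
print(len(expanded), used / sum(expanded))
```

In a real deployment the `append` call would be replaced by a request to the cloud provider’s API, followed by adding the new device to the storage pool.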

