CoreOS Launches a Kubernetes-Friendly Storage System: Torus

Continuing to innovate at a rapid clip within the container orchestration space, CoreOS has launched an open source project aimed at giving Kubernetes users a proper storage system for their pods.
The goal behind Torus is to provide scalable storage to container clusters orchestrated by Kubernetes.
“We’ve heard from Kubernetes users that working with existing storage systems can be difficult,” said Wei Dang, CoreOS head of product, in an interview with TNS. “We made it really easy to run Torus. It runs within a container and can be deployed with Kubernetes. It provides a single distributed storage cluster to manage microservices workloads.”
Torus tackles one of the thorniest problems in container and microservices management, namely working with persistent storage, according to a blog post introducing the new technology from CoreOS software engineer Barak Michener. The application containers themselves are frequently started, stopped, upgraded, and migrated between nodes, though the data an application needs must be accessible in a consistent location.
“If you stand up a bunch of microservices in containers that have individual data stores, managing each of those individual data stores can become quite challenging, particularly if you’re leveraging existing storage systems that weren’t designed to handle all these containers,” Dang said.
On behalf of Torus, Kubernetes keeps track of all the different resources within the cluster. Torus could be used to host a database system that can be called from other microservices, no matter how often those microservices, or the database itself, change location within the namespace.
Of course, there is a whole mature product set for running distributed storage systems, which offer the advantage of spreading large storage pools across multiple servers. Red Hat engineers have been working hard on both GlusterFS and Ceph open source file systems, both of which can be harnessed to offer easily scaled-out distributed storage.
They can be tricky to use, however, and unidentified errors can propagate alarmingly quickly, Michener charged.
“It primarily goes back to simplicity,” Dang said. “To get those solutions up and running and managed is quite difficult. They were not really designed for large-scale container infrastructure. They were primarily designed for small clusters of very large machines.”
Given all the problems we've seen with Ceph in production, I'm pumped for this: https://t.co/Bjiy4JTLJx /cc @coreoslinux
— Gabe Monroy (@gabe_monroy) June 1, 2016
As it did for its etcd distributed key-value database, the company built out Torus following the GIFEE (Google Infrastructure for Everyone Else) approach, which advocates highly scalable distributed infrastructure for enterprise use. The company also offers a commercially supported version of Google Kubernetes, called Tectonic, for container orchestration.
Torus is well-suited for distributed workloads, in no small part because it relies on CoreOS’ etcd key-value store to coordinate file or object metadata. The etcd database also provides solid support for the consensus algorithms needed to keep track of moving resources across different servers. Torus is written in the Go programming language and uses the Google gRPC protocol, both of which Michener hopes will provide easy extensibility for building out third-party Torus clients.
In Operation
Torus can manage all the disks under its purview as a single storage pool, and can scale to hundreds of nodes, Dang said.
Once in operation, Torus allows Kubernetes users to dynamically attach volumes to pods being deployed. “To an app running in a pod, Torus appears as a traditional filesystem,” Michener wrote. Kubernetes itself provides the means for deploying Torus, by way of Kubernetes manifests, allowing administrators to run Torus as a Kubernetes-managed application.
Currently, Torus supports block-oriented storage via a Network Block Device (NBD), though in the future, it may support file storage as well. Data can be encrypted, and the software provides many of the modern features built into today’s file systems, including hashing, replication, garbage collection, and pool rebalancing.
“If you add a new node, Torus will automatically figure out how the data has been placed and replicated to accommodate that,” Dang said. “Torus computes data placement across all the nodes across the cluster automatically.”
“At its core, Torus is a library with an interface that appears as a traditional file, allowing for storage manipulation through well-understood basic file operations,” Michener explained. “Coordinated and checkpointed through etcd’s consensus process, this distributed file can be exposed to user applications in multiple ways.”
Torus has been awhile in the making. Major thanks to @coreoslinux for giving me the opportunity, and @packethost for collab. and my testbed
— Barak Michener (@barakmich) June 1, 2016
CoreOS is not alone in addressing the emerging storage needs of container lifecycles. Docker has partnered with both Hedvig and BlockBridge to extend native storage capabilities for its Docker Datacenter. EMC offers REX-Ray, designed to provide persistent storage access for Docker and Mesos-based container runtimes. And IT consultancy OpenCredo recently posted KubeFuse, a Kubernetes-friendly file system that allows administrators to carry out such handy tasks as editing services and replication controllers.
CoreOS hopes others will contribute to this open source project. For those in the San Francisco area, the company will do a deep-dive into the technology at the next CoreOS meetup, June 16.
TNS Research Analyst Lawrence Hecht contributed to this story.
CoreOS, Docker and Red Hat are sponsors of The New Stack.
Feature image via Pixabay.