STORK: Open Source Storage Intelligence for Kubernetes
A newly released open source storage project for Kubernetes from Portworx, called STORK (STorage Orchestrator Runtime for Kubernetes), focuses on easing management of stateful applications, such as databases, queues and key-value stores, that run in containers.
Storage generated a buzz at KubeCon + CloudNativeCon 2017 in December, primarily around the Container Storage Interface (CSI), a cross-industry standards initiative that was released as an alpha feature in Kubernetes 1.9. CSI will enable users to plug in and swap storage for the Kubernetes cluster.
The focus on CSI largely has been about provisioning storage in the first place. STORK is more about facilitating orchestration of the application itself without using labels, according to Goutham “Gou” Rao, Portworx co-founder and CTO.
“We came about this project from a lot of lessons learned about deploying these databases at large scale,” he said. “Some of the ways customers were going about it, like adding labels or manually adding constraints to facilitate deployment of these applications, it sort of became challenging in a production environment.”
In a production environment, he explained, there are a lot of dynamic changes — machines crashing, disks failing, storage getting over-utilized. “You could have a lot of wear and tear on the storage fabric. This sort of intelligence needed to be fed back to Kubernetes so it could be even more effective at managing distributed, stateful applications,” Rao explained.
Labels essentially pin applications to a particular type of node. But when you have hundreds of servers and thousands of containers, it gets very cluttered. You lose track of where your labels are and what they’re doing, he said. A label becomes one more thing you have to manage. STORK is able to provide these benefits in an automated, self-driven way.
In its initial form, STORK uses the open storage framework that Portworx uses, but the company plans to work with other vendors to provide an array of storage options. Users also can add their own drivers by writing their own plug-ins.
STORK is a natural extension of Kubernetes: it sits on the Kubernetes cluster and helps make storage-based decisions about how best to schedule and manage the application. Where should the application run? What should happen if the server on which the application is running has a problem with storage?
“The goal behind this is to provide the operator, the system administrator, a hands-free environment; they don’t have to worry about the implications of durability or wear and tear on their storage environment. STORK sort of goes on autopilot and lets your application run, regardless of the underlying infrastructure,” Rao said.
STORK is aware of your cloud environment, or of your physical servers if you are running on-prem.
STORK v1.0 offers three main features:
- Hyperconvergence: co-locating pods with their data. STORK schedules each container so that its data is local to the node it runs on.
- Storage health monitoring: the scheduler can detect that something is going wrong with storage and relocate containers to other nodes where storage is local.
- Volume snapshots: taking snapshots and restoring them to other Persistent Volume Claims (PVCs), all through Kubernetes.
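The first two features can be pictured as a filter step followed by a prioritize step. The sketch below is only an illustration of that idea, not STORK's actual code: the `Node` class, its `storage_healthy` flag, the `local_volumes` set and the `rank_nodes` function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    storage_healthy: bool     # hypothetical health flag reported by a storage driver
    local_volumes: frozenset  # hypothetical: names of volumes with data on this node


def rank_nodes(nodes, pod_volumes):
    """Drop nodes with unhealthy storage, then put the nodes that hold
    the most of the pod's volumes locally first."""
    wanted = set(pod_volumes)
    healthy = [n for n in nodes if n.storage_healthy]
    return sorted(healthy, key=lambda n: len(wanted & n.local_volumes), reverse=True)


nodes = [
    Node("node-a", True, frozenset({"pg-data"})),
    Node("node-b", True, frozenset()),
    Node("node-c", False, frozenset({"pg-data"})),  # holds the data, but storage is failing
]

ranked = rank_nodes(nodes, ["pg-data"])
print([n.name for n in ranked])  # ['node-a', 'node-b']
```

Here `node-c` is excluded even though it holds the data, because its storage is reported unhealthy, while `node-a` is preferred over `node-b` because the pod's volume is local to it.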
In addition to expanding the range of storage options, planned enhancements include support for availability zones, so that the prioritization of nodes takes into account the zone and rack on which data is located, and support for taking backups to the cloud using the same interface as snapshots.
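One way to think about zone- and rack-aware prioritization is as a tiered score: the closer a candidate node is to the data, the higher it ranks. The following is a minimal sketch under that assumption. The `Location` type, the `locality_score` function and the numeric weights are hypothetical and are not taken from STORK.

```python
from typing import NamedTuple


class Location(NamedTuple):
    node: str
    rack: str
    zone: str


def locality_score(candidate: Location, data: Location) -> int:
    """Score a candidate node by how close it is to where the data lives.
    The weights are arbitrary illustrative values."""
    if candidate.node == data.node:
        return 3  # data is on the same node
    if candidate.zone == data.zone and candidate.rack == data.rack:
        return 2  # same rack in the same zone
    if candidate.zone == data.zone:
        return 1  # same zone, different rack
    return 0      # different zone


data_at = Location("n1", "rack-1", "zone-a")
candidates = [
    Location("n4", "rack-9", "zone-b"),
    Location("n3", "rack-2", "zone-a"),
    Location("n2", "rack-1", "zone-a"),
    Location("n1", "rack-1", "zone-a"),
]

best_first = sorted(candidates, key=lambda c: locality_score(c, data_at), reverse=True)
print([c.node for c in best_first])  # ['n1', 'n2', 'n3', 'n4']
```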