Kasten’s Kubestr: An ‘Easy Button’ for In-Cluster Storage Validation

Stateful applications running on Kubernetes have evolved to such an extent that their deployment and management, while still challenging, have become simpler to implement. Storage is one stateful component that falls under this category. Still missing, however, are tools that show which storage solutions are available for a particular Kubernetes cluster and how well they are performing.
To that end, Kasten by Veeam created Kubestr to offer DevOps teams an “easy button” to “identify, validate and evaluate” storage systems running in cloud native environments, Michael Cade, a senior global technologist for Veeam, said during a presentation.
Kubestr does this by automating the identification and validation of storage options running on Kubernetes-integrated platforms and by providing benchmarks to gauge the performance of those storage options once implemented.
“Kubestr is really focused on making sure that your production storage is validated and evaluated to whatever your workload needs to look like,” Cade said. “For any operator who is deploying Kubernetes with a mind around stateful workloads and leveraging storage, this tool could give some insight into how to do that.”
This functionality is useful because “predictability and consistent storage performance is near the top of most Kubernetes operators’ wish list, especially when gradually migrating and modernizing traditional enterprise applications,” Torsten Volk, an analyst for Enterprise Management Associates (EMA), told The New Stack. In this usage scenario, for example, Kubernetes’ strong point, its application scalability and portability achieved through abstraction from the underlying infrastructure resources, “can come back to bite app ops,” especially when running IOPS-intensive workloads such as streaming data analytics, database operations or machine learning model training and inference, Volk said.
According to EMA research, maintaining persistent volume performance in cloud native environments is a major challenge.
“Therefore, Kubestr is more than welcome,” Volk said. “It provides DevOps guys with a simple way of determining where an application can safely be placed, moved to or scaled.”
Open Source Magic
Following Kopia, a backup solution, and Kanister, a framework for Kubernetes data management, Kasten’s release of Kubestr is its third major open source project for supporting stateful applications running in Kubernetes environments and, more particularly, for validating and evaluating storage.
“We understand that open source is critical to enable our customers and the wider community. Cloud native applications typically consist of many services, many of which are open source-based,” Cade told The New Stack. “While providing solutions for cloud native workloads, Kasten is also using cloud native approaches and architectures (i.e. drinking our own champagne). In doing so, we both use open source and give back to the community by contributing.”
The “easy button” Cade referred to is meant to give DevOps teams a better understanding of what works best for their operations. They face a wide range of storage choices, which are themselves a subset of the very large number of stateful application and tool choices for cloud native environments.
It is now possible, for example, to run established enterprise workloads such as SAP, SQL Server, Oracle and Hadoop on Kubernetes, reflecting the wide range of cloud platform choices on offer.
Kanister, the Kasten-founded open source project mentioned above, offers application-level data management within Kubernetes. Kanister allows DevOps teams to capture application-specific data management tasks in blueprints, which can be easily shared and extended, Cade said.
For storage, the explosion of choices for Kubernetes environments is largely due to the creation of the Container Storage Interface (CSI), allowing for a decoupling of storage solutions for Kubernetes deployments. This means that storage systems for data stores for microservices such as Cassandra, Redis, MongoDB, and other NoSQL, MySQL or PostgreSQL databases do not require configuration for Kubernetes clusters.
In addition to managing the database, the performance, scalability and resiliency within a specific application must be taken into account in these complex cloud native environments, Volk said. “Now that we no longer control exactly where the application gets deployed to, we need to be very careful in terms of enabling operators to safely match performance requirements with potential deployment targets.”
The wide choice of platform and storage options, as well as the different user and application requirements, can thus have a very obvious effect on storage performance. Kubestr, as described above, was created to help DevOps teams make more informed decisions about storage management in Kubernetes environments. “We want to make it easy for the operators and the developers to choose the right storage for their platform,” Cade said.
Identify, Validate, Evaluate and Love
Storage choice for cloud native environments typically involves first assessing the different applications running in Kubernetes environments, as well as a subset of users and their requirements. Kubestr, with its “easy button” activated, first identifies the storage nodes in the Kubernetes cluster. It then validates the configuration with the storage in place. In doing so, it automates the process of gauging the storage performance that is available and discovering wasted storage resources. For evaluation, a series of benchmarking tests is run.
“There are many different options with Kubestr when it comes to benchmarking,” Cade told The New Stack. Kubestr runs Flexible IO Tester (FIO) with a default test, but “you can bring your own FIO configuration to perform workload-specific benchmarking,” Cade said.
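As a rough illustration of what bringing your own FIO configuration might look like, the Python sketch below renders a small FIO job file and assembles a Kubestr command line that points at it. This is not taken from Cade's presentation. The job parameters, the `workload` section name, the `standard` storage class and the `workload.fio` file name are all assumptions chosen for the example. The `-s` (storage class) and `-f` (FIO file) flags reflect the Kubestr CLI as we understand it and should be checked against `kubestr fio --help` for the version in use.

```python
from __future__ import annotations

import configparser
import io


def build_fio_job(block_size: str = "4k", read_write: str = "randread",
                  size: str = "1G", runtime_s: int = 30) -> str:
    """Render a small FIO job file as text (FIO job files use INI-style sections)."""
    job = configparser.ConfigParser()
    job.optionxform = str  # keep option names exactly as written
    # Settings shared by every job in the file (illustrative values only).
    job["global"] = {
        "ioengine": "libaio",
        "direct": "1",
        "runtime": str(runtime_s),
        "size": size,
    }
    # One hypothetical job describing the access pattern of the workload.
    job["workload"] = {"rw": read_write, "bs": block_size, "iodepth": "16"}
    buf = io.StringIO()
    job.write(buf, space_around_delimiters=False)
    return buf.getvalue()


def kubestr_fio_command(storage_class: str, fio_file: str) -> list[str]:
    """Assemble (but do not run) a Kubestr FIO invocation; flags are assumed."""
    return ["kubestr", "fio", "-s", storage_class, "-f", fio_file]


if __name__ == "__main__":
    print(build_fio_job(block_size="8k", read_write="randwrite"))
    print(" ".join(kubestr_fio_command("standard", "workload.fio")))
```

The sketch only builds the file contents and the argument list. Running the benchmark would still require saving the job text to disk and executing the command against a cluster where the named storage class exists.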
“I really want to really highlight the simplicity of Kubestr,” Cade said. “It’s really about making life simple for both the developers and also the operators that are being pulled into these storage decisions, these environments that they may not be so familiar with. This gives us easy buttons to be able to do that.”