Data on Kubernetes: Operators, Tools Need Standardization
When it was introduced to the world, Kubernetes showed off its ability to easily juggle stateless workloads — those workloads that did not need to interact with some form of permanently stored data. Over time, however, the open source container orchestration engine offered hooks for working with databases and other sources of persistent data.
But while there are a plethora of “solutions” for running stateful workloads on Kubernetes today, users need more standardization across tools in order to facilitate large-scale production usage, according to a report released today by the recently-formed Data on Kubernetes Community (DoKC).
Today’s most advanced K8s users “see these really massive productivity gains, so they want to standardize,” said Melissa Logan, DoKC director, in an interview with The New Stack. “They’re trying to kind of figure out how to make all these things work together.”
The group will hold a meetup at this year’s KubeCon+CloudNativeCon on Tuesday, starting at 8:45 a.m. Pacific time. The meetup will be live-streamed, and the group will discuss these findings there in greater detail.
Stateful Workloads on Kubernetes
The report, Data on Kubernetes 2021, is the result of a survey of 500 Kubernetes users about the types and volume of data-intensive workloads being deployed in Kubernetes.
Overall, it found that the large majority are running stateful workloads, and they want to run even more. But they need to get a handle on managing all the resources first.
They will require greater integration and interoperability with the current set of tools. They also need skilled staff, better Kubernetes operators, and more trusted vendors, the report found.
This report was completed in conjunction with ClearPath Strategies. The study interviewed 502 executives and technical practitioners across a variety of industries. They represented organizations that ranged from 100 employees to more than 1,000.
In the survey, half of the respondents reported running 50% or more of their production workloads on Kubernetes, with advanced users reporting productivity gains of 2x or greater. About 90% believe Kubernetes is ready for stateful workloads, and a large majority (70%) are running them in production, with databases topping the list. Companies cite significant benefits in standardization, consistency, and management as key drivers.
Database operators are a particular challenge for K8s users. Operators are user-defined extensions that can use custom resources to manage applications. They have been widely used to help manage databases under K8s. Most orgs deploy more than one database, so it would make sense they would accrue multiple operators.
“They specifically call out a difficulty of maintaining interoperability with other operators,” Logan said.
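To make the operator idea concrete, here is a minimal sketch of the pattern in plain Python: a declared desired state, standing in for a custom resource, is compared against what is actually running, and the difference becomes a list of actions. It does not use the Kubernetes API, and the names `DatabaseSpec` and `reconcile` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseSpec:
    """Hypothetical stand-in for a custom resource: the state a user declares."""
    name: str
    replicas: int


def reconcile(desired: DatabaseSpec, running_replicas: int) -> list[str]:
    """Return the actions needed to move the observed state toward the desired one."""
    diff = desired.replicas - running_replicas
    if diff > 0:
        return [f"start replica of {desired.name}"] * diff
    if diff < 0:
        return [f"stop replica of {desired.name}"] * (-diff)
    return []  # observed state already matches the declared state


if __name__ == "__main__":
    spec = DatabaseSpec(name="orders-db", replicas=3)
    for action in reconcile(spec, running_replicas=1):
        print(action)
```

A real operator would run this comparison continuously for its own resource type. An organization running several databases could end up with several such loops from different vendors, which is where the interoperability concern comes in.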
Kubernetes also has to address other trends in the industry. Databases were the chief issue for the users, though other sources of data were also identified, such as object storage, streaming messaging, backup and archival, analytics, and machine learning. Users are also calling for more standards in data management, particularly around declarative programming, the survey found.
Data on Kubernetes: A New Group
Founded in June, the Data on Kubernetes Community aims to help Kubernetes practitioners work with data sources, identifying and aiding the development of tools for the job. The group has over 4,000 members and has held over 100 meetups worldwide. MayaData, and then DataStax, sponsored the group.
“The intention was always to kind of bring more people together to solve these challenges,” Logan said. “We’re trying to bring everybody to the table and put some things down on paper. So we have a common understanding of what we’re doing here.”