DevSecOps Teams Need Application-Consistent Backups for Kubernetes Workloads

14 Oct 2020 10:07am, by

Dell Technologies sponsored this post.

Nivas Iyer
Nivas is a Dell Technologies product manager and strategic partnerships leader known for delivering innovative products. He is passionate about understanding customer needs and delivering solutions that address them. With exceptional market insight, Nivas is able to extract key information and quickly develop precise product responses.

When you think about the DevSecOps transformation in the modern applications world, start with the developer’s point of view, back when development across an organization had its own challenges. At any time, you could have hundreds of developers contributing to one build, and whenever a developer checked in changes, a conflict could break something. Since those days, the underlying platform has matured from physical machines to virtual machines, and now to containers. In this software development paradigm, microservices allow each team to be independent: teams develop and deploy their own versions of code, which effectively isolates their work from the rest of the organization. Development teams can now use plug-in compatibility, via a software module or a service using a standard interface, to merge changes.

Think of this as using a USB or HDMI cable to plug into a “box”: it doesn’t matter what is at the other end of the connection. Both ends need to use a common standard to be able to work together, and this is similar to the way software needs to be built today. Different applications need to work together through a common interface, or a common language, to produce a shared result. In addition, developers need to rely on a back-end tier to write to the database, but mapping object-oriented constructs to a relational database isn’t always easy. That’s why NoSQL databases came along, plus relational database options (for example PostgreSQL and DB2) that are easier to deploy inside Kubernetes clusters. The result is a shared-nothing architecture, which removes single points of failure but requires each development team to manage its own data services inside its own microservice and instance of the software.

Wait, what? Confused yet? What it comes down to is that these software architectures give individual developers flexibility, since they can now use their own programming languages and databases of choice, but they also create new and different operational challenges.

Previously the IT Ops team centrally managed the database, but now each developer and/or DevOps team must do it, and this results in additional overhead, not to mention toe-stepping and frustration. As important as data protection, backup and restore are to DevSecOps, the way the data is backed up and made available matters just as much. This is where application-consistent backups become a critical advantage to the end user. Typically, crash-consistent, block-level backups are good for point-in-time recovery of VMs, but for Kubernetes workloads in particular they aren’t as reliable. Developers need application consistency to meet stringent Recovery Point and Recovery Time Objectives for their respective applications and services.

Let’s look a little more into application consistency. Traditionally, databases write data in memory, which gives high performance for write operations. They also typically write continuously to a log file. Any time you want to save the in-memory data to a persistent storage layer, you need to go through a process called checkpointing: you go through all the logs and commit them to disk, which gives you a consistent view of the data, known as a “point-in-time” version, on the persistent data disk. After that, a snapshot is taken, and it will be available when the database restarts. If there is data corruption or a hardware failure, the database can restart quickly from that snapshot; this is known as application consistency.
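To make the checkpointing idea concrete, here is a minimal Python sketch. It is a toy model, not how any particular database is implemented: the `ToyDatabase` class, its `data_file` and `log` fields, and the `restore` helper are all hypothetical names chosen for illustration. It shows that a snapshot taken right after a checkpoint needs no log replay on restart, while a snapshot taken mid-stream does.

```python
class ToyDatabase:
    """Toy key-value store (hypothetical): writes land in memory and in a log."""

    def __init__(self):
        self.memory = {}      # fast in-memory state
        self.log = []         # stands in for the log file on persistent storage
        self.data_file = {}   # stands in for the data file on persistent storage

    def write(self, key, value):
        # Every write is logged first, then applied in memory.
        self.log.append((key, value))
        self.memory[key] = value

    def checkpoint(self):
        # Commit every logged write to the data file, then truncate the log.
        for key, value in self.log:
            self.data_file[key] = value
        self.log.clear()

    def snapshot(self):
        # A disk snapshot sees only what is on persistent storage.
        return {"data_file": dict(self.data_file), "log": list(self.log)}


def restore(snapshot):
    """Rebuild a database from a snapshot; return it with the replay count."""
    db = ToyDatabase()
    db.data_file = dict(snapshot["data_file"])
    replayed = 0
    for key, value in snapshot["log"]:
        # Recovery work: replay writes that were never checkpointed.
        db.data_file[key] = value
        replayed += 1
    db.memory = dict(db.data_file)
    return db, replayed


db = ToyDatabase()
db.write("order:1", "paid")
db.checkpoint()
db.write("order:2", "pending")

crash_consistent = db.snapshot()   # taken without a checkpoint
db.checkpoint()
app_consistent = db.snapshot()     # taken right after a checkpoint

_, crash_replays = restore(crash_consistent)
_, app_replays = restore(app_consistent)
print("replays after crash-consistent restore:", crash_replays)
print("replays after application-consistent restore:", app_replays)
```

In this toy both restores recover the same data, but only the application-consistent snapshot starts without recovery work. Real databases have many more moving parts, so treat this purely as an illustration of the principle.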

Now enter the Kubernetes challenge. With Kubernetes, developers have multiple databases and many choices, such as MySQL, MongoDB, Postgres, Cassandra, etc. Therefore, savvy UI architects need to produce an agentless mechanism that requires no installation in the database layer and offers a simple, extensible interface to back up all of this data, from any source.
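One way to picture such an extensible, agentless interface is a registry that maps each database type to the statements an external backup controller would send over an ordinary client connection. The Python sketch below is an assumption about how this could be organized, not a description of any product: `HookTemplate`, `TEMPLATES`, `register_template`, and `template_for` are hypothetical names, and the statements listed (`db.fsyncLock()`, `db.fsyncUnlock()`, and `CHECKPOINT;`) are shown only as examples of what a template might contain, not as a vetted backup recipe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HookTemplate:
    """Statements sent to the database from outside (nothing installed in it)."""
    pre_hook: tuple = ()    # run before the snapshot
    post_hook: tuple = ()   # run after the snapshot


# Hypothetical registry keyed by database type. Entries are illustrative only.
TEMPLATES = {
    "mongodb": HookTemplate(
        pre_hook=("db.fsyncLock()",),
        post_hook=("db.fsyncUnlock()",),
    ),
    "postgres": HookTemplate(pre_hook=("CHECKPOINT;",)),
}


def register_template(db_type, template):
    """Extend the registry with a new database type."""
    TEMPLATES[db_type.lower()] = template


def template_for(db_type):
    """Look up the template for a database type, case-insensitively."""
    try:
        return TEMPLATES[db_type.lower()]
    except KeyError:
        raise LookupError(f"no backup template registered for {db_type!r}") from None


# Supporting another database is a data change, not a code change.
# "exampledb" and its commands are placeholders, not real commands.
register_template(
    "exampledb",
    HookTemplate(pre_hook=("<pause writes>",), post_hook=("<resume writes>",)),
)
print(template_for("MongoDB").pre_hook)
print(template_for("exampledb").post_hook)
```

The design choice worth noting is that adding a database means adding a registry entry, which is what keeps the interface simple for teams running different data services.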

To make this happen, a template can be created, and with this template, every time a backup is created, it also runs something called a pre-hook and a post-hook. The pre-hook does exactly what was described earlier: it pauses the database so that there are no more write operations and flushes all of the data in memory to disk, so that we get a clean point-in-time copy. The backup then essentially takes a snapshot of the disk, and the post-hook lets the database resume normal writes. When you restart the database from that snapshot, recovery will be much more efficient.
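The pre-hook, snapshot, post-hook sequence can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: `FakeDatabase`, `take_snapshot`, and `run_backup` are hypothetical stand-ins, and a real backup tool would call the database and the storage layer rather than append to a list. The point it shows is the ordering, plus the use of `try`/`finally` so the database is released even if the snapshot fails.

```python
class FakeDatabase:
    """Stand-in for a real database (hypothetical); records what happened."""

    def __init__(self):
        self.accepting_writes = True
        self.events = []

    def quiesce(self):
        # Pre-hook: stop new writes and flush in-memory data to disk.
        self.accepting_writes = False
        self.events.append("pre-hook: writes paused, memory flushed")

    def resume(self):
        # Post-hook: let the database accept writes again.
        self.accepting_writes = True
        self.events.append("post-hook: writes resumed")


def take_snapshot(db):
    """Pretend to snapshot the disk; refuse if the database is still writing."""
    if db.accepting_writes:
        raise RuntimeError("database must be quiesced before the snapshot")
    db.events.append("snapshot taken")
    return "snapshot-001"   # placeholder identifier


def run_backup(db, pre_hook, snapshot, post_hook):
    """Run pre-hook, snapshot, post-hook; always run the post-hook."""
    pre_hook()
    try:
        return snapshot(db)
    finally:
        post_hook()


db = FakeDatabase()
snapshot_id = run_backup(db, db.quiesce, take_snapshot, db.resume)
for event in db.events:
    print(event)
```

Keeping the hooks as plain callables means the same `run_backup` flow could be reused for any database whose template supplies a matching pair of hooks.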

All these operations used to be done by the IT teams, but can now be managed by the individual teams responsible for their own data services. This balances the need for IT operations policy control with developer self-service. Additionally, this can be automated as part of a continuous integration and continuous delivery/deployment pipeline, in accordance with DevSecOps best practices.