Companies adopt hybrid and multicloud strategies for various reasons, including reliability, geographic necessity, and vendor independence. With this comes complexity, because no two private or hybrid clouds are alike. Although the industry has standardized on Kubernetes, it presents itself differently on every cloud. Amazon’s Elastic Kubernetes Service (EKS) requires different configuration than Microsoft’s Azure Kubernetes Service (AKS) or Google Kubernetes Engine (GKE). Most private clouds are built on platforms like VMware’s Tanzu/vSphere or Nutanix, where the installs and deployed capabilities are often highly customized.
Any systems engineer who has tried to maintain multicloud or hybrid cloud infrastructure knows that this variability makes management very difficult. It is not just the Kubernetes configurations that differ; the scripts needed to log in and apply changes are also unique to each environment. Even with an infrastructure-as-code approach, managing those scripts across a set of git repositories and branches, and ensuring the right configs are applied to the right runtime environments, remains a heavy lift. Say, for example, that some new regulatory-related config needs to be added to deployments across AWS and vSphere. The SRE first needs to add the appropriate config to both the AWS and the vSphere branches in git, and then make sure to apply the former to the EKS implementation and the latter to the Tanzu cluster. All too often, these config changes are applied by hand (say, using the AWS CLI or the vSphere console), and mistakes will be made.
Now imagine a global environment with multiregional or even multinational infrastructure. Add a few edge networks to this setup. The resulting complex DevOps workflow is not sustainable — and yet it is precisely what many engineers are doing right now. We need a better way.
Automation is essential. At scale, automation needs to be done in a particular way.
GitOps makes multicloud and hybrid cloud feasible when deployment topologies get more complicated. Cloud, Kubernetes and application configurations are stored declaratively in Git, allowing changes to be submitted as pull requests, approved by a maintainer, and automatically applied by software agents. Rather than managing the rollout of configuration to the multitude of different targets in the hybrid environment, GitOps allows engineers to program automation that ensures the right configs are applied to the right environments. A series of agents notice when the running configuration diverges from git and self-heal the difference. And, while there are always differences in config across the hybrid topology, with GitOps we specify a common base configuration, which we then manage only once across all environments, and kustomizations (yes, spelled with a ‘k’ ;-)) with the specializations. All of this is to say that GitOps provides a standardized operational model across that heterogeneous landscape.
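To make the base-plus-specializations idea concrete, here is a minimal Python sketch. It is not how Kustomize itself applies patches; it is a simplified recursive merge that only illustrates the principle. The config keys, the image name, and the environment names "eks" and "tanzu" are hypothetical, chosen to echo the regulatory example earlier.

```python
from copy import deepcopy


def apply_overlay(base: dict, overlay: dict) -> dict:
    """Return a new dict: the base with overlay values merged in recursively."""
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = apply_overlay(merged[key], value)
        else:
            # Scalars, lists, and new keys: the overlay wins outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base configuration, maintained once for every environment.
BASE = {
    "replicas": 2,
    "image": "example/app:1.4.0",
    "labels": {"app": "example", "compliance": "enabled"},
}

# Hypothetical per-environment specializations.
OVERLAYS = {
    "eks": {"replicas": 6, "labels": {"cloud": "aws"}},
    "tanzu": {"labels": {"cloud": "vsphere"}},
}


def render_all(base: dict, overlays: dict) -> dict:
    """Produce the effective config for each environment."""
    return {env: apply_overlay(base, overlay) for env, overlay in overlays.items()}


if __name__ == "__main__":
    for env, config in render_all(BASE, OVERLAYS).items():
        print(env, config)
```

In this sketch the "compliance" label lives only in the base, so every environment inherits it; each overlay carries only what actually differs.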
Have Containers Will Travel — Securely
Imagine that there is a significant service outage in AWS-East hosted in Northern Virginia. Because much of the config is common, and you’ve already provided the specializations for GCP, you need only enable that configuration for the GCP us-east1-b zone, and that standardized GitOps process will produce a deployment to compensate for the outage. Or consider that your company expands into China or another country that AWS does not reach, so you will run on Azure or your own infrastructure. Yup, you maintain one base configuration and a set of environment-specific specializations in git, and the right combinations are applied wherever they are needed.
If maintaining multiple levels of divergent configuration and application scripts is a logistical pain, it is nothing compared to the security problem it may cause. Benjamin Franklin said, “Three may keep a secret if two of them are dead.” Yet, today’s DevOps strategies often manage secrets for the hybrid environment in such a way that a few individuals have access to every system. GitOps allows for secrets to be separated from the config management process. The config management process proceeds in git (someone makes a change to config, submits a pull request, and an authorized maintainer merges it), while the application of that config is handled in a separate, very secure manner. Instead of multiple people logging into various pieces of infrastructure to apply changes, a GitOps process with very localized security access notices the difference and applies it. No credentials are spilled in the name of DevOps.
Yeats said it best, “Things fall apart. The center cannot hold.” Someday, somewhere, somehow, something will break, and more often than not the breakage will have been a result of some configuration change. To protect ourselves from a repeat occurrence we will thoroughly investigate, looking at all changes that were applied and determining how they contributed to the outage. With GitOps, auditing changes is simple. Git automatically logs the changes, providing a level of transparency and accuracy that greatly aids such an investigation. No longer are operators logging into servers and misapplying a script with no traceability; all changes are recorded and it is just a matter of understanding what changes were merged, by whom, and when.
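As a rough illustration of that kind of audit, the Python sketch below asks git for the history of one path and turns it into records of what changed, by whom, and when. It assumes git is installed and uses standard `git log` options; the `Change` type, the function names, and the example path are hypothetical.

```python
import subprocess
from typing import NamedTuple

# Commit hash, author name, strict ISO 8601 author date, and subject,
# separated by the 0x1f byte so that subjects may contain any punctuation.
LOG_FORMAT = "%H%x1f%an%x1f%aI%x1f%s"


class Change(NamedTuple):
    commit: str
    author: str
    date: str
    subject: str


def parse_log(output: str) -> list[Change]:
    """Parse `git log --pretty=format:LOG_FORMAT` output into Change records."""
    changes = []
    for line in output.split("\n"):
        if not line.strip():
            continue
        commit, author, date, subject = line.split("\x1f", 3)
        changes.append(Change(commit, author, date, subject))
    return changes


def changes_for_path(repo_dir: str, path: str, since: str) -> list[Change]:
    """List commits touching `path` since a date, newest first."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", f"--since={since}",
         f"--pretty=format:{LOG_FORMAT}", "--", path],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)


if __name__ == "__main__":
    # Hypothetical usage: what changed under clusters/eks in the last week?
    for change in changes_for_path(".", "clusters/eks", "1 week ago"):
        print(change.date, change.author, change.commit[:8], change.subject)
```

The author field shows who wrote a change; if you need to know who merged it, the committer placeholders in git's pretty formats would be the place to look.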
Hand-Crafted, Automated, or in Flux?
Ultimately it all comes down to how much automation is in place, and the particular nature of said automation. If a development organization uses CI/CD pipelines, is delivering constant changes, and is responsible for infrastructure at scale, hand-applied imperative scripts are not an inefficiency they can afford. Necessity is why some of the first adopters of GitOps are the people laying out infrastructure at a massive scale. They are managing 5G networks across tens of thousands of nodes or are engineers in financial services managing systems that run millions of transactions 24/7 and process real money.
No one thinks that multicloud is as simple as deploying in a single region on a single cloud provider. The most secure, most straightforward, and most elegant way to manage this complexity is using GitOps technologies and methodologies. Luckily, these tools are readily available to you right now. The Cloud Native Computing Foundation recently promoted Flux into its incubation stage. Flux is an open source toolkit for implementing GitOps.
Using Flux, engineers can create a single, auditable, versioned source of truth that describes exactly how a cluster and workloads will run. That “truth” can then spread to other clusters, other regions, other clouds, and even other continents. The configuration it applies is in the vernacular that cloud developers and engineers are used to speaking: Flux talks git and YAML and Kubernetes.
Ultimately, doing GitOps for a large and complicated environment requires visualization and monitoring tools that allow system engineers to understand what is happening in real-time, and to explore topologies, especially as they change. Weaveworks, which coined the term “GitOps” and originally created and donated Flux to the CNCF, extends Flux to do just that.
Weave Cloud provides teams with a fully supported service offering that enables GitOps with just a few point-and-click operations, increases observability, and simplifies managing the GitOps process and overall topology. The Weave Kubernetes Platform extends this further by managing the entire development and deployment lifecycle and by adding further observability capabilities.
GitOps Is Not Just a Technology
In exploring GitOps, one should neither under- nor overestimate the importance of tools like Git, Flux, or even Kubernetes. GitOps is not only a technology but also a set of best practices; it’s a paradigm. It fully embraces, even depends on, the tenet that systems management is done through declarative configuration that is constantly reconciled with what is actually running. That is, system administration becomes a process of expressing the desired state of a system, managing that expression in a systematic way (in git, for example) and leveraging the right type of automation to deliver that config and automatically operate deployments. It treats system administration the way you treat a code regression. If something breaks, you roll it back and let the GitOps process take care of correcting the system.
Getting started with GitOps is fairly simple, and there is value, such as the ability to audit and roll back changes, even if you are not managing massive, complex systems. However, if you are a larger team running complex infrastructure on multiple clouds, GitOps may be not just the simplest way to manage that complexity; it is likely the only sustainable one.
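The reconciliation idea can be sketched in a few lines of Python. This is a toy, in-memory model and not Flux’s implementation: the desired state stands in for what is committed to git, the actual state stands in for what is running, and the workload names are hypothetical.

```python
def diff_state(desired: dict, actual: dict) -> dict:
    """Return the action needed per workload to make actual match desired."""
    changes = {}
    for name, spec in desired.items():
        if name not in actual:
            changes[name] = ("create", spec)
        elif actual[name] != spec:
            changes[name] = ("update", spec)
    for name in actual:
        if name not in desired:
            # Running but no longer declared: remove it.
            changes[name] = ("delete", None)
    return changes


def reconcile(desired: dict, actual: dict) -> list:
    """Apply the diff to `actual` in place; return the actions taken, by name."""
    actions = []
    for name, (action, spec) in sorted(diff_state(desired, actual).items()):
        if action == "delete":
            del actual[name]
        else:
            actual[name] = dict(spec)
        actions.append((action, name))
    return actions


if __name__ == "__main__":
    # Hypothetical desired state (from git) and a drifted running state.
    desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
    actual = {"web": {"replicas": 5}, "debug": {"replicas": 1}}
    print(reconcile(desired, actual))  # actions needed to remove the drift
    print(reconcile(desired, actual))  # a second pass finds nothing to do
```

A rollback fits the same loop: revert the commit so the desired state returns to its earlier value, and the next reconciliation pass converges the running system back to it.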
Amazon Web Services, The Cloud Native Computing Foundation and VMware are sponsors of The New Stack.
Feature image via Pixabay.