
CNCF Working Group Sets Some Standards for ‘GitOps’

18 Nov 2021 6:00am

Engineers from GitHub, Microsoft, CodeFresh, Weaveworks, Red Hat and other cloud native-savvy companies have banded together to assemble a set of definitions and conformance specifications for GitOps.

The work was all done through the Application Delivery Technical Advisory Group of the Cloud Native Computing Foundation. Last month, the group released version 1.0 of the specification.

“A lot of people think that they’re doing GitOps because they’re using git and they’re doing pull requests and pushing changes out,” said Leonardo Murillo, a co-chair of the GitOps working group. “We want the community to start to see that GitOps is not just CI/CD with git. There is a lot more.”

They started by defining the core principles of GitOps, which vendors are then free to interpret in their own ways.

Leonardo Murillo, a co-chair of the CNCF GitOps working group, sits down with TNS to discuss the OpenGitOps specification


GitOps must meet these four requirements, according to the group:

  1. Declarative: A system managed by GitOps must have its desired state expressed declaratively. “You’re no longer giving instructions, you’re describing state,” Murillo said.
  2. Versioned and Immutable: Desired state is stored in a way that enforces immutability and versioning, and that retains a complete version history. “The only way for you to introduce change in your system is by creating a new version of your desired state,” Murillo added.
  3. Pulled Automatically: Software agents within the system automatically pull the desired state declarations from the source, such as a repository.
  4. Continuously Reconciled: Software agents continuously observe the actual system state and attempt to apply the desired state. “The desired state [of the system or software] is continually reconciled,” Murillo said.

GitOps differs from a standard continuous delivery (CD) system in that a CD pipeline is defined imperatively. The user must spell out every action needed for the system to reach the desired state. After that, there is no guarantee that the system remains in that state, and the pipeline does not know what actions to take to repair a system that is out of sorts.

With GitOps, the agents inside the system are always checking to ensure the system is in the desired state. If there is “drift” in a system, the “reconciliation loop” will automatically return it to the desired state by pulling the original artifacts from the repository.
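The reconciliation loop described above can be illustrated with a minimal Python sketch. It is not taken from the OpenGitOps specification or from any particular tool: the `State` type, the `plan` and `reconcile` functions and the resource names are all hypothetical, and real agents would talk to a repository and a live system rather than compare in-memory dictionaries.

```python
from typing import Dict

# Hypothetical model: resource name -> declared version.
State = Dict[str, str]


def plan(desired: State, actual: State) -> Dict[str, str]:
    """Compare desired and actual state and name the action each resource needs."""
    actions: Dict[str, str] = {}
    for name, spec in desired.items():
        if name not in actual:
            actions[name] = "create"
        elif actual[name] != spec:
            actions[name] = "update"
    for name in actual:
        if name not in desired:
            actions[name] = "delete"
    return actions


def reconcile(desired: State, actual: State) -> State:
    """Run one reconciliation pass and return the resulting actual state."""
    result = dict(actual)
    for name, action in plan(desired, actual).items():
        if action == "delete":
            del result[name]
        else:
            result[name] = desired[name]
    return result


desired = {"web": "v2", "db": "v1"}
actual = {"web": "v1", "cache": "v1"}  # the live system has drifted

print(plan(desired, actual))  # {'web': 'update', 'db': 'create', 'cache': 'delete'}
actual = reconcile(desired, actual)
print(actual == desired)      # True
```

In this sketch the user only declares `desired`. The decision about what to create, update or delete lives in the agent, which is the point Murillo makes about delegating logic to operators.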

“You’re not writing pipelines where you have if-then actions and whatnot, you’re delegating that responsibility, encapsulating all that complexity into the decision-making logic of your operators,” Murillo said.

This approach gives the user full traceability: Each change results in a new version of the system, which can be easily rolled back to an earlier version if needed. Another advantage of this approach is that developers don’t need access to the clusters their applications are using.
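One way to picture versioned, immutable desired state is an append-only history, sketched below in Python. Everything here is an assumption made for illustration: the `DesiredStateStore` class, its method names and the choice to record a rollback as a new version (similar in spirit to a git revert) are hypothetical, not something the specification prescribes.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)  # frozen: a stored version cannot be modified
class Version:
    number: int
    digest: str
    state: Tuple[Tuple[str, str], ...]  # immutable snapshot as sorted pairs
    note: str


class DesiredStateStore:
    """Hypothetical append-only store: change happens only by adding versions."""

    def __init__(self) -> None:
        self._history: List[Version] = []

    def commit(self, state: Dict[str, str], note: str) -> Version:
        snapshot = tuple(sorted(state.items()))
        digest = hashlib.sha256(json.dumps(snapshot).encode()).hexdigest()[:12]
        version = Version(len(self._history) + 1, digest, snapshot, note)
        self._history.append(version)
        return version

    def rollback(self, number: int) -> Version:
        # Assumption: a rollback re-commits old content as a NEW version,
        # so the history stays complete.
        old = self._history[number - 1]
        return self.commit(dict(old.state), f"rollback to v{number}")

    def head(self) -> Version:
        return self._history[-1]

    def log(self) -> List[Tuple[int, str]]:
        return [(v.number, v.note) for v in self._history]


store = DesiredStateStore()
v1 = store.commit({"web": "v1"}, "initial release")
v2 = store.commit({"web": "v2"}, "upgrade web")
v3 = store.rollback(1)

print(store.log())
print(v3.digest == v1.digest)  # same content, new version number
```

Because every entry stays in the log, the history itself is the audit trail: the rollback is visible as version 3 rather than as a rewritten version 2.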

The GitOps approach is not limited to application maintenance. It can also be used for maintaining a system itself, enabling Infrastructure as Code (IaC). With Kubernetes, it can be used just to manage configurations or services. But it can also be used to manage legacy applications.

The guidance is vendor- and implementation-agnostic. It is up to the system integrator to decide which tools to use and where the desired state is kept. Even the git repository, upon which the name GitOps is built, is not a required dependency.

Turning Pets into Cattle

The GitOps model could be far-reaching, extending beyond application management to uniform management of clusters and even entire operating environments.

The GitOps model can “be extended to all sorts of areas of your architecture and environment that you would not have considered initially,” Murillo said.

One area where GitOps can be of high value, Murillo said, is in fleet management, where hundreds or even thousands of clusters need to be managed. It’s more work to set up a pipeline to push out the configurations and software updates to all these clusters, compared to having all these clusters autonomously update themselves.
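The fleet scenario can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any real tool: `Source`, `ClusterAgent`, the revision counter and the cluster names are all hypothetical. The point it illustrates is that the source never needs a list of clusters to push to, because each agent checks the source on its own.

```python
from typing import Dict, List


class Source:
    """Hypothetical single source of truth holding the desired state."""

    def __init__(self, state: Dict[str, str]) -> None:
        self.revision = 1
        self.state = dict(state)

    def publish(self, state: Dict[str, str]) -> None:
        # Publishing knows nothing about how many clusters exist.
        self.revision += 1
        self.state = dict(state)


class ClusterAgent:
    """Hypothetical in-cluster agent that pulls from the source."""

    def __init__(self, name: str, source: Source) -> None:
        self.name = name
        self.source = source
        self.applied_revision = 0
        self.state: Dict[str, str] = {}

    def sync(self) -> bool:
        """Pull and apply the desired state; return True if anything changed."""
        if self.applied_revision == self.source.revision:
            return False
        self.state = dict(self.source.state)
        self.applied_revision = self.source.revision
        return True


source = Source({"ingress": "v1"})
fleet: List[ClusterAgent] = [ClusterAgent(f"cluster-{i}", source) for i in range(1000)]

for agent in fleet:
    agent.sync()  # in practice each agent would run on its own schedule

source.publish({"ingress": "v2"})
changed = sum(agent.sync() for agent in fleet)
print(changed)  # 1000
```

Adding a cluster to this fleet means starting one more agent. No pipeline definition has to change.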

The cloud native community likes to refer to treating containers as “cattle rather than pets,” insofar as they should be managed en masse, rather than on a case-by-case basis. Yet, operating environments themselves are still treated as pets, Murillo said.

“I think we still treat environments as pets. We give them names, ‘production’ and ‘staging.’ They’re still our pets,” he said. The GitOps model could enable organizations to move towards more uniformity and extensibility of these environments.

“I think you can start thinking about ephemeral environments — just multiplying, creating and destroying environments altogether,” Murillo said.

When Things Go Wrong

Now that version 1.0 has been finalized, the group is looking to extend the standards to define additional functionality, such as how incident management is handled in a GitOps environment. The same goes for security management, credential management and fleet management.

“If your cluster is malfunctioning… if your app is malfunctioning, what are the procedures that one should follow in GitOps?” Murillo asked.

With programmable infrastructure, for instance, an administrator might simply SSH into a router to make a small change, such as fixing a mistaken IP address. This long-held practice is frowned upon under the ideals of configuration management because it throws the deployment out of its “desired state” (and can be a security hazard).

So the group is looking at what would be the proper mechanisms to handle simple incident management activities. Do you suspend continuous reconciliation in such cases? How do you handle hotfixes?

The group hopes to answer “what the proper principles are in a GitOps environment,” Murillo said.