Bridging the Gap Between Infrastructure as Code and GitOps
Infrastructure as Code has been one of the greatest evolutions that’s happened to the computing space in the last decade. IaC has established a new normal where the change process that’s already ingrained in your software team can extend to the infrastructure that you run your software on as well. Request a change, review, approve, apply.
Our team has been on a mission to provide the automation behind both the provisioning (Day 1) and day-to-day (Day 2) sides of infrastructure management to power our instant Kubernetes platforms. The challenge is that Day 1 and Day 2 IaC automation needs can be very different depending on the infrastructure governance you need in place for your Day 2 workflows. Are you OK letting your IaC tool decide that your Kubernetes cluster needs to be destroyed, or do you require a human to be involved in that decision?
A few months ago, a really neat combination of technologies fell out of a research spike: we discovered a way to use Terraform, Crossplane and Atlantis together that provides the best qualities of each while still giving you the flexibility to implement the strict Day 2 governance that your organization might require.
Our Job: Providing Free Automated IaC to the Masses
Because we’re trying to enable an automated IaC solution for the masses, we have some careful considerations to make about the IaC technologies we choose for the internal developer platform you’ll receive. We carefully weigh platform components based on their popularity, free-tier value, open source license status, stability, scalability, approachability, documentation, ease of maintenance and cohesion with the other platform tools.
There have been three technologies in this IaC space that have captured our imaginations for years — Terraform, Atlantis and Crossplane. They each solve major portions of the IaC automation problem. Let’s briefly discuss each and highlight their special skills and drawbacks.
HashiCorp Terraform — Command Line IaC
Terraform has become the de facto standard for IaC across the enterprise. We’ve loved Terraform for many years. It has a stable product, a simple language and a bustling marketplace of vendor-agnostic providers to allow you to configure anything: clouds, users, secrets, git repos — anything at all.
Terraform is a command line tool that runs in a directory of files that represent your desired state in its HCL language. When you run a terraform plan, it’ll compare your desired state to the actual state and tell you what would change if you were to apply. When you run a terraform apply, it makes the actual change or tells you why it couldn’t.
Because Terraform is a command line tool, many organizations are using it … wait for it … as a command line tool. Shocker, I know. Cloud engineers can apply changes to their cloud infrastructure directly from their local machines. Site reliability engineers and platform teams quiver at this opaque type of setup. Their disciplines get a lot easier when you know what changed, how, when and by whom, so command line tools in the dark aren’t ideal. This is especially true when alerts wake you up in the middle of the night.
Special skills:
- Unmatched provider support: almost everything has a Terraform provider.
- Stable and dependable.
- Command line tool means you don’t need preexisting infrastructure to run it, which is great for starting from scratch.
Drawbacks:
- Running Terraform as part of GitOps sequencing requires either a stop in that orchestration or Terraform execution from a custom pod.
- No native control plane for automating the plan/apply execution (unless you pay for the SaaS offering).
Atlantis — Terraform Workflow Automation
Atlantis has long been our go-to technology to integrate the process of making Terraform changes with the natural flow of change in a software shop. Most folks will keep their Terraform in a git repository, so when you want to change IaC code, you open a pull request, seek approval, then apply the change.
Atlantis hooks into this flow so that when the pull request is opened, the Terraform plan will automatically run and report the result of the plan directly on the pull request as a comment.
If you like the plan after review, you can comment atlantis apply directly on the pull request, and Atlantis will attempt to apply the Terraform change and report back with the result, merging and closing the pull request automatically when successful.
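As a rough sketch of how this flow is typically wired up, a repo-level atlantis.yaml might look like the following. The project name and directory are hypothetical, and the apply_requirements setting only takes effect if the Atlantis server configuration allows repositories to override it.

```yaml
# atlantis.yaml at the root of the repository (repo-level config, version 3)
version: 3
# Merge the pull request automatically once every plan in it has been applied
automerge: true
projects:
  - name: development            # hypothetical project name
    dir: terraform/development   # hypothetical Terraform entry point
    autoplan:
      enabled: true
      # Re-run the plan when Terraform files in this directory change
      when_modified: ["*.tf", "*.tfvars"]
    # Require an approving review before an "atlantis apply" comment is honored.
    # Assumption: the server-side repo config lists apply_requirements
    # under allowed_overrides for this repository.
    apply_requirements: [approved]
```

With a file like this in place, opening a pull request that touches the listed directory triggers the plan, and the apply comment is accepted only after someone approves the pull request.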
We have Atlantis running in Kubernetes with a service account that has permissions to the resources it’s managing. With this setup, you can allow your developers to contribute to your IaC without actually granting the individual developers permissions to make the changes themselves.
Special skills:
- Provides visibility to Terraform plans and applies.
- Provides centralized audit log of all infrastructure changes.
- Meets developers where they want to be: in git.
- Developers don’t need cloud access to contribute to infrastructure.
Drawbacks:
- Only works with Terraform IaC.
Crossplane — IaC that Fits GitOps Perfectly
We’re a GitOps shop through and through. There might be no greater power in the Kubernetes space than binding your Kubernetes engine to a desired state in git. You get to define what you want, and your GitOps engine will either make it so or tell you why it couldn’t.
Crossplane is very similar to Terraform in purpose — they each have a vendor-agnostic marketplace of providers, an extremely valuable free tier if you’re willing to self-manage, and when you define your desired state, Crossplane will try to apply it.
The mechanics of executing Crossplane are quite different, however. It’s not a command line tool; it’s a control plane that runs in Kubernetes. You also don’t typically write code in Crossplane; it asks you to define your desired state as Kubernetes custom resources. When you use GitOps sync waves to orchestrate complex provisioning operations, this is extremely advantageous, as you can include the IaC steps as part of your GitOps sequencing without interrupting the GitOps flow.
When we provision new clusters at Kubefirst, GitOps is how all of the applications get installed in our clusters, and all this orchestration is defined in our gitops repository. Here’s an example of what this looks like in our upstream template repository. In each YAML file, you’ll find an annotation that looks like this:
That annotation defines the wave in which the GitOps content is applied and allows us to control the order of installations so we can do things like install Vault first, and then install the
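Assuming Argo CD is the GitOps engine, sync waves are set with the argocd.argoproj.io/sync-wave annotation, and lower-numbered waves are applied before higher-numbered ones. The Application below is a minimal sketch of placing one component in an early wave. The repository URL, path and wave number are placeholders, not values taken from the kubefirst templates.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vault
  namespace: argocd
  annotations:
    # Lower waves sync first; anything that depends on Vault
    # would carry a higher number than this one.
    argocd.argoproj.io/sync-wave: "10"
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/gitops.git   # placeholder repository
    path: registry/vault                                 # placeholder path
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: vault
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```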
Special skills:
- GitOps-ready technology allows for more frictionless, automated creation/destruction of IaC resources.
- Alternate provider options if Terraform’s new Business Source License is an issue for your organization.
Drawbacks:
- Provider support is good but not quite up to Terraform’s, as it’s a newer technology (however, you can create Crossplane providers from Terraform providers, which provides a trustworthy path forward).
- Requires a Kubernetes cluster to run, which is problematic when creating initial Kubernetes infrastructure.
IaC Automation Governance
GitOps is great when it adheres to your governance philosophies and can be a little dangerous when it doesn’t.
For apps, GitOps is clearly a powerful step forward. Want a new app version? Just set the file in your GitOps repo to the new version, and when it touches main, that’s your app version. Want the old version back? Just set the file in your GitOps repo to the old version, and that becomes your app version.
This is a great workflow for apps, and it significantly streamlines asset management in Kubernetes and improves your disaster recovery posture by a ton. But as we discussed above, Kubernetes can manage more than just apps. Now we’re talking about infrastructure too.
IaC runs with a three-step plan-review-apply sequence as a classic command line tool or as a two-step review-apply as a control plane tool, and they’re both useful in different circumstances. If we’re talking about your production clusters, you may not want your GitOps engine to decide it’s OK to delete your production cluster without a human approving the plan.
When provisioning platforms for a living, as many platform engineering teams do, combining GitOps and Crossplane for platform provisioning operations is incredible. You basically just get to run IaC whenever you need to in your GitOps sync wave orchestration with no stopping for some awkward checklist of steps to conduct. But if it’s a production cluster that you’ve just provisioned, should it be managed as GitOps or with a higher level of governance like Atlantis provides?
Thought experiment: The remainder of this article will describe a comfortable world with Atlantis integrating with your pull request. Imagine, as an alternate approach, a world where you are governing Crossplane IaC resources and their deletion policy in your GitOps repo, based on a policy engine like Kyverno and some custom resources that can declare that the production cluster is not able to be deleted.
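To make the thought experiment a little more concrete, here is one way such a guardrail could be sketched with Kyverno. Everything specific in it is an assumption for illustration: the example.com/protected label is hypothetical, the Workspace kind assumes Crossplane’s Terraform provider under the tf.upbound.io/v1beta1 API group, and field names may differ across Kyverno versions.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: block-protected-workspace-deletion   # hypothetical policy name
spec:
  # Reject the request instead of only reporting it
  validationFailureAction: Enforce
  background: false
  rules:
    - name: deny-delete-of-protected-workspaces
      match:
        any:
          - resources:
              kinds:
                # Assumed group/version/kind for Crossplane Terraform workspaces
                - tf.upbound.io/v1beta1/Workspace
              selector:
                matchLabels:
                  # Hypothetical label marking infrastructure as undeletable
                  example.com/protected: "true"
      validate:
        message: "Protected infrastructure cannot be deleted through GitOps."
        deny:
          conditions:
            any:
              # Only DELETE requests are denied; creates and updates pass through
              - key: "{{ request.operation }}"
                operator: Equals
                value: DELETE
```

Under a policy like this, removing a labeled production resource from the GitOps repo would cause the delete request to be rejected at admission, leaving a human to lift the label deliberately before the cluster could be torn down.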
Day 1 GitOps with Day 2 Governance
We recently discovered that you can run Terraform in Crossplane using Crossplane’s Terraform provider. This allows you to leverage GitOps to run your Terraform in all the right spots when you’re creating new cluster infrastructure. However, at the end of the run, you may not want the cluster managed by GitOps any longer and want to shift its governance to Atlantis so you can review the plans as a human from that point forward.
When you set deletionPolicy: Orphan on your Crossplane resources, it tells Crossplane not to delete the physical infrastructure when the object is removed from GitOps. So if you orphan the resource and wait for it to sync in Argo CD, you can then remove the Terraform from your GitOps sequencing, and the infrastructure will stay intact.
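A minimal sketch of what an orphaned Terraform workspace might look like is shown below. It assumes the Workspace resource from Crossplane’s Terraform provider; the API version can differ between provider releases, and the resource name, module URL and variable are hypothetical.

```yaml
apiVersion: tf.upbound.io/v1beta1   # assumption: may differ by provider release
kind: Workspace
metadata:
  name: development-cluster         # hypothetical name
spec:
  # Orphan: removing this object leaves the provisioned infrastructure in place.
  # The default, Delete, would destroy it along with the object.
  deletionPolicy: Orphan
  providerConfigRef:
    name: default
  forProvider:
    # Remote: fetch the Terraform module from a repository rather than inline HCL
    source: Remote
    module: git::https://github.com/example-org/gitops.git//terraform/development?ref=main   # placeholder
    vars:
      - key: cluster_name           # hypothetical Terraform variable
        value: development
```

The order matters here: the deletionPolicy change has to be synced to the cluster before the manifest is removed from the GitOps repo, otherwise the removal is processed under the previous policy.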
Next, you can take the same Terraform that you had in Crossplane and copy it over to your Atlantis-managed directory via pull request. The plan will show no changes, assuming you retain the same state store. When you apply the no-op change, the pull request will merge, and the Terraform will be managed by Atlantis moving forward.
This technique gives you the GitOps speed and power that your platform team wants on Day 1 while giving your organization the governance and control it requires for Day 2 and beyond.
Explore Crossplane-Wrapped Terraform on Kubefirst
Kubefirst just announced its introduction of cluster life-cycle management to its instant GitOps open source platform. With a simple kubefirst launch up command, you’ll be met with a provisioning app that can create a multicluster ecosystem of GitOps clusters bound to all the most popular tools working seamlessly together for free.
You’ll get four new clusters when you install kubefirst by default:
production. Management will house your Atlantis instance, your Crossplane control plane and a kubefirst UI that can generate cluster definitions in your GitOps repository. The other three clusters are built from those commits, and you can create as many clusters as you want. There’s a templates directory that defines how the clusters get created, and you can adjust their components as you wish.
You’ll also have a Terraform directory in your GitOps repository that defines the Terraform entry points that are managed by Atlantis. If you use a pull request to change any of those directories, you’ll see a Terraform plan kick off in your pull request, which you can apply with an atlantis apply comment in the pull request.
See if you can provision a suite of kubefirst clusters and orphan the Crossplane workspace for your development cluster. Then, move it to Atlantis’ ownership with the details provided above. It’s an incredibly powerful way to provision complex infrastructure in GitOps. Hand the controls back over to the humans for Day 2 governance of your production clusters.
If you have any trouble, we have a community with hundreds of engineers who all want to be using the most popular cloud native tools together. Our free platform is comprehensive, portable, extensible and open source. All of our opinions are yours to change. We welcome contributors and hope to earn your business with our user interface. Our pro tier is free while we roll out physical and virtual cluster life-cycle management to all of our supported clouds and receive feedback on our template-driven GitOps approach. We hope you’ll join our mission.