Why GitLab’s Move to Continuous Delivery Couldn’t Wait for Kubernetes
The narrative in the ecosystem today is about continuous integration/continuous delivery (CI/CD). Companies are under pressure to adopt the snazziest technologies quickly to chase after the elusive DevOps culture and minimal cycle time. This is GitLab’s story of doing the opposite, and why.
At GitLab, we issue releases on the 22nd day of every month to help our large, self-managed product maintain consistency. We’re in the process of moving to Kubernetes, and we need to deploy even more frequently than we have in the past as we increase the velocity of feature development. But instead of modernizing completely with Kubernetes and then starting CD, we have opted to push our existing CI/CD system to the limit by using our preexisting legacy tools — and a lot of smarts. This is the story of how we found success, both human and byte-sized, by using the most cost-effective, boring solutions to start CD.
“Cycle time compression may be the most underestimated force in determining winners and losers in tech,” said Marc Andreessen, a World Wide Web Hall of Fame inductee and one of Silicon Valley’s most prominent entrepreneurs.
Engineering teams deliver customer value in the form of features. To stay competitive, businesses require software be delivered as quickly and efficiently as possible. By shifting the DevOps lifecycle left, engineering teams can compress cycle time and accelerate the pace of innovation.
At GitLab, we go beyond delivering the Minimum Viable Product (MVP) or even Minimum Viable Feature (MVF) in order to always focus on creating the Minimum Viable Change (MVC). We operate with a low level of shame and a high level of efficiency by concentrating on what value we can deliver to our customers, no matter how minimal. Because our engineers are tasked with always creating iterative innovation, we often pass on shiny, new technologies to consider more established, stable solutions that could deliver real business value to our customers. By shipping the MVC, engineering teams can collect numerous benefits:
- Get feedback and learn faster.
- Risk from any one change is much smaller.
- Ability to back out of a small change is much easier.
- Learn and improve faster.
Marin Jankovski, engineering manager of the delivery team at GitLab, recognized that we needed to move GitLab to CD. But rather than waiting to implement new tooling, the delivery team elected to stress our available resources and get our engineers accustomed to working with a CD mindset.
How Moving Our System to CD Created Immediate Results
By rapidly scaling to CD, we moved from deploying weekly to daily, to being able to ship code to canary within two hours. For folks that are used to weekly deploys, being able to go from commit to canary within two hours has completely changed their world for the better.
Before CD, any change that was merged could take anywhere from one day to a few days to get into production. There was essentially a one-day minimum to get any change into the staging environment. By using our current system of legacy CD tools, we went from four deploys in May 2019 to 24 deploys in July 2019, a sixfold increase, speeding up our delivery time without having to wait until Kubernetes is fully implemented to see results.
We’ve hit the limits of our system, but we’re already reaping the benefits of CD. While we continue to update components of the CD system with Kubernetes, our engineering team is still working in tandem to begin the process of migrating, scaling and managing the shift to CD.
How We Moved To CD Without Using Kubernetes
GitLab runs on GitLab and uses Ansible scripts for CD on virtual machines (VMs). As noted above, we have not yet implemented Kubernetes for all of our CD needs. Instead, we are taking the unusual step of pushing the capabilities of our legacy tools to the brink. Here’s how:
- We are using VMs for all of our environments, and all have omnibus packages.
- The Ansible scripts Kubernetes gave us are used to orchestrate all of the VMs.
- All the details Kubernetes needs to work are inside the Ansible playbooks.
- We still manually deploy to the production environment because of compliance requirements.
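The flow in the list above can be sketched in Python as a promotion through environments with a manual gate on production. This is only an illustration, not GitLab’s actual tooling: the environment order (staging, canary, production), the `Environment` class, and the `promote` function are hypothetical names assumed for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Environment:
    """A deploy target. All names here are hypothetical."""
    name: str
    requires_manual_approval: bool


# Assumed order of environments; only production is gated by a human.
PIPELINE = [
    Environment("staging", requires_manual_approval=False),
    Environment("canary", requires_manual_approval=False),
    Environment("production", requires_manual_approval=True),
]


def promote(package_version: str, approved_by: Optional[str] = None) -> List[str]:
    """Return the environments a package would reach, in order.

    Promotion stops at the first environment that needs manual
    approval when no approver has been supplied.
    """
    reached = []
    for env in PIPELINE:
        if env.requires_manual_approval and approved_by is None:
            break
        reached.append(env.name)
    return reached
```

Calling `promote("some-version")` stops before production, while passing `approved_by="release-manager"` lets the same package continue, which mirrors the compliance-driven manual step described above.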
Granted, we are already using some Kubernetes for review apps (for developers to see the impact of their code changes, similar to a local host except not local) and feature flags as part of our move towards CD.
The components of Kubernetes we do use on the current CD system allow engineers to review any changes to their code live, without having to wait for deployment. The pipeline automatically creates a review instance of the application and deploys the app using Kubernetes. Now the developer can review the code changes and conduct quality assessments without having to wait until the code is deployed.
Establishing Cultural Changes in Our Engineering Team
The biggest benefit of pushing our legacy CD system to the limit has been the cultural change that followed. When the time comes to fully implement Kubernetes for CD, our engineering teams will have already worked through the growing pains of implementing CD.
Here are some of the results:
- QA is king. Now, quality is the number one priority for all engineers. Our review apps help developers see exactly what they pushed. While there is no production data in there, any developer, even a frontend engineer, can see what might be wrong right away. Approvers also have significantly more responsibility. Now that it is easier to identify problems, approvers are expected to flag them.
- No more hot patching. Well, almost never. The only time we hot patch (that is, log into the VMs and change production code with a diff) is if something is a p1 *and* an s1. A p1 is a priority 1 for *some users* and an s1 is severity 1 because it affects a large portion of users or very critical customers.
- All developers are on the on-call rotation now. After migrating to CD, we achieved in three weeks what we had been talking about for three years. Because of the stringency of the QA rules, developers are responsible for their code and carry pagers.
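The hot-patch rule in the list above reduces to a single condition, sketched here in Python. The function name `hot_patch_allowed` and the integer encoding of priority and severity are assumptions made for illustration, not part of any GitLab tool.

```python
def hot_patch_allowed(priority: int, severity: int) -> bool:
    """Hypothetical check: hot patching is permitted only for p1 AND s1.

    priority and severity are assumed to be integers where 1 is the
    most urgent level, matching the p1/s1 labels in the text.
    """
    return priority == 1 and severity == 1
```

Under this sketch, a p1/s2 issue or a p2/s1 issue would go through the normal deploy path rather than a hot patch.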
Anyone running an engineering team or organization will recognize that these are big cultural shifts, and they certainly were not easy to implement. But our approach of starting with our legacy CD system and maxing it to capacity worked, and our engineering team has shifted to a CD mindset as well. Next, we plan to invest our budgeted time in completing our adoption of Kubernetes.
Follow along with GitLab as we share our scaling process, challenges and successes with implementing Kubernetes. We are sure there will be more exciting developments to come.
For more case study discussions and CI/CD best practices, attend GitLab Commit Brooklyn on September 17th. GitLab’s inaugural user event will showcase the power of DevOps in action through strategy and technology discussions, lessons learned, behind-the-scenes looks at the development lifecycle, and more.