How Containerized CI/CD Pipelines Work with Kubernetes and GitLab
It is hardly surprising that Kubernetes’ popularity continued to grow in 2019, and this trend will likely continue in 2020.
However, while it offers many advantages, Kubernetes adoption has also revealed new difficulties that have to be addressed and fixed. One of them is how we automatically deploy and manage our applications. In the examples below, I will share useful tips and tricks on how to enhance your Kubernetes CI/CD pipelines with the help of GitLab and open source technologies.
Move Your Pipeline Workload into Your Cluster
As a first example, let’s examine CI/CD pipelines in general and why I think containerized pipelines can help solve many issues.
A pipeline, as we know it, is divided into a chain of links called stages. Each stage can contain one or more jobs. A job describes the commands that need to be executed to achieve the desired outcome. A command can be a single binary or a complex toolchain. Regardless of complexity, the tools and their dependencies need to be available on the pipeline worker nodes. Depending on your project, you may also need to select the right version and path when multiple versions of a tool are installed.
Containerized pipelines offer the following benefits:
- Isolation between pipeline jobs
- No dependency issues between pipeline jobs
- Immutability, with every pipeline job runtime exactly the same
- Easy scalability
In a containerized pipeline, every job runs in a container based on an image that includes all of the dependencies and version particularities of the toolchains a project needs. One of the many advantages of containerized pipelines is that there are no conflicts between different jobs in a project, or between different project pipelines running on the same node. You can also run a given pipeline job on any of your pipeline worker nodes because all the needed dependencies are baked into the container image.
With containerized pipelines, you are then able to move your pipeline workload into an existing Kubernetes cluster. This helps you make better use of your existing compute resources by running your pipeline workload next to your applications. You can also scale out your pipelines almost without limit by relying on the scaling capabilities Kubernetes already provides.
Sounds good, right? Let’s talk about the details. If you have used GitLab CI/CD, you are aware of the GitLab Runner — a binary that schedules and manages pipeline workloads on worker nodes. The worker nodes are the machines (it doesn’t matter if they are virtualized or bare metal) that provide the compute to run pipeline workloads.
The GitLab Runner also provides a so-called Kubernetes executor. This allows us to move containerized pipelines into our Kubernetes cluster. The Kubernetes executor, first of all, runs as a pod within the Kubernetes cluster. Every time a pipeline job needs to be scheduled, the Kubernetes executor talks to the Kubernetes API and schedules a pod based on the defined container image. It also manages housekeeping tasks such as code checkout, caching and artifact management. The Kubernetes executor divides the build into multiple steps:
- Prepare: creates the pod with build and service containers.
- Pre-build: clones the repository, restores the cache and downloads artifacts.
- Build: runs the user-defined build steps.
- Post-build: creates caches and uploads artifacts.
In addition to all of the housekeeping steps, which happen automatically in the background, the Kubernetes executor runs the pipeline steps we defined in .gitlab-ci.yml during the third step (Build).
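To make the four steps more concrete, here is a minimal sketch of a job in .gitlab-ci.yml, annotated with the step that would handle each key. The job name, images and paths are hypothetical and chosen only for illustration; the only assumption taken from the text is that the runner is registered with a kubernetes tag.

```yaml
# Hypothetical job; names, images and paths are illustrative assumptions.
stages:
  - build

example-build:
  stage: build
  tags:
    - kubernetes            # routes the job to the runner using the Kubernetes executor
  image: alpine:3.11        # Prepare: becomes the build container of the job pod
  services:
    - redis:5               # Prepare: runs as a service container in the same pod
  cache:
    paths:
      - .cache/             # Pre-build: restored, Post-build: saved again
  script:                   # Build: the user-defined steps
    - mkdir -p build .cache
    - echo "built on $(date)" > build/output.txt
  artifacts:
    paths:
      - build/              # Post-build: uploaded back to GitLab
```

Only the script section is written by hand; everything else is declared and the executor takes care of it around the Build step.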
Let’s look at an example customized pipeline (you can review the whole example here):
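The original listing is not reproduced in this text, so what follows is only a sketch of what such a job might look like, based on the description in this section. The image name, release name, chart path and environment name are hypothetical placeholders, and the use of KUBE_NAMESPACE assumes that GitLab's Kubernetes integration is configured for the project.

```yaml
# Sketch of a deploy job; image, release, chart path and environment are placeholders.
stages:
  - deploy

app-deploy:
  stage: deploy
  tags:
    - kubernetes                       # must match the tag of the GitLab Runner
  image:
    name: registry.example.com/tools/helm-kubectl:latest   # hypothetical Alpine-based image with helm and kubectl
  environment:
    name: production                   # assumed; cluster credentials are passed to jobs that define an environment
  script:
    # Install the release if it does not exist yet, otherwise upgrade it.
    - helm upgrade --install my-app ./chart --namespace "$KUBE_NAMESPACE"
```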
The pipeline example above details the deploy stage. The deploy stage contains a single job called app-deploy. The job is scheduled by the Kubernetes executor via the kubernetes tag (which needs to match the tag defined in the GitLab Runner definition). The image name parameter defines the container image used to execute the commands in the script section. In our example, we run an application deployment using Helm. The commands are executed in an Alpine-based container providing the Helm and kubectl CLIs (details on the used container image are available here).
The Kubernetes authentication details needed to successfully execute the application installation are provided automatically through the GitLab Runner and are available in the container while the job is running. This is handled by GitLab's Kubernetes integration, which is available at the project or group level.
Run Container Builds within Your Cluster
Moving our deployment pipeline workload into our Kubernetes cluster brings the advantages and possibilities I mentioned above. But wouldn’t it be great to also build our applications and other container images within our Kubernetes cluster?
You may have heard about a technique called Docker-in-Docker (DinD), which allows us to build and run container images inside of containers. In my opinion, however, Docker-in-Docker brings some disadvantages, including:
- exposing the host’s Docker socket into the container.
- mounting /var/lib/docker into the container.
- running a privileged container to be able to run a Docker daemon inside the container.
A way to avoid Docker-in-Docker is to use an open source project called Kaniko. Kaniko is a project introduced by Google, which allows users to build container images based on a Dockerfile inside a container or Kubernetes cluster. Kaniko doesn’t depend on a Docker daemon or any other external dependency or privileges. This enables users to build container images in environments that can’t easily or securely run a Docker daemon, such as a managed Kubernetes cluster. Kaniko is designed to run as a container based on the official container image (gcr.io/kaniko-project/executor:latest). This is an example of how to run Kaniko as a pod:
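The original manifest is not reproduced in this text; a minimal sketch of such a pod, modeled on the three arguments discussed in this section, might look like the following. The bucket, registry and image names are hypothetical, and a real run would also need credentials for the bucket and the registry.

```yaml
# Sketch of a Kaniko build pod; bucket and registry names are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: kaniko
spec:
  restartPolicy: Never                  # a build runs once and should not be restarted
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:latest
      args:
        - "--dockerfile=Dockerfile"                                 # path to the Dockerfile within the context
        - "--context=gs://example-bucket/context.tar.gz"            # hypothetical GCS bucket holding the build context
        - "--destination=registry.example.com/example/app:1.0.0"    # hypothetical target image
```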
The above example runs a pod based on a container running the latest Executor image. The container is started with the following arguments:
- --dockerfile defining the path to the Dockerfile
- --context defining the context root, which could be a mounted volume, an S3/GCS bucket or Azure Blob storage
- --destination defining the target registry, image name and tag.
It is possible to also mount a Kubernetes secret, which can be used to authenticate against a registry or context root location if needed.
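As a sketch of the registry case, the pod below mounts a secret as Kaniko's Docker configuration. It assumes a secret named regcred of type kubernetes.io/dockerconfigjson already exists in the namespace; that secret name, the bucket and the registry are hypothetical.

```yaml
# Sketch only; "regcred", the bucket and the registry are assumed names.
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-with-auth
spec:
  restartPolicy: Never
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:latest
      args:
        - "--dockerfile=Dockerfile"
        - "--context=gs://example-bucket/context.tar.gz"
        - "--destination=registry.example.com/example/app:1.0.0"
      volumeMounts:
        - name: registry-credentials
          mountPath: /kaniko/.docker     # Kaniko reads config.json from this directory
  volumes:
    - name: registry-credentials
      secret:
        secretName: regcred              # assumed to exist already
        items:
          - key: .dockerconfigjson
            path: config.json            # expose the secret under the file name Kaniko expects
```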
Of course, you can also use Kaniko in your GitLab CI pipeline to build any kind of container images. An example containerized build pipeline based on Kaniko looks like this (you can review the whole example here):
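The original listing is not reproduced in this text. The following is a sketch of such a job under a few assumptions: the job name is hypothetical, the Dockerfile is assumed to sit in the repository root, the image is tagged with the short commit SHA, and gitlab-ci-token is used as the user name that goes with CI_JOB_TOKEN.

```yaml
# Sketch of a Kaniko build job; job name, Dockerfile location and tag scheme are assumptions.
stages:
  - build

image-build:
  stage: build
  tags:
    - kubernetes
  image:
    name: gcr.io/kaniko-project/executor:debug   # the debug tag ships a shell
    entrypoint: [""]                             # override the entry point so the script can run
  before_script:
    # Write the registry credentials where Kaniko looks for them.
    - mkdir -p /kaniko/.docker
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"gitlab-ci-token\",\"password\":\"$CI_JOB_TOKEN\"}}}" > /kaniko/.docker/config.json
  script:
    - /kaniko/executor --context "$CI_PROJECT_DIR" --dockerfile "$CI_PROJECT_DIR/Dockerfile" --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```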
Once again, we have a single stage with a single job scheduled to run on a GitLab Runner holding the kubernetes tag. The image is defined to use the debug tag, which adds the shell that GitLab Runner needs in order to work. We also need to override the container image entry point to be able to execute the configured commands. In the before_script section, we create a file called config.json, which defines the registry authentication details.
In our example, we push the image into the GitLab project registry, which we can authenticate against using the CI_JOB_TOKEN environment variable provided by the GitLab Runner. In the script section, we then run the Kaniko executor binary and define the context root, Dockerfile path and destination details.
Nico’s talk at GitLab Commit in Brooklyn 2019 provides more details on containerized pipelines and Kaniko.
Gain Further Insights
Some words about GitLab Commit taking place on Jan. 14 in San Francisco: The inaugural GitLab Commit will bring together GitLab users for a day of learning, networking, and inspiration. Experts in continuous integration, continuous delivery, Kubernetes, DataOps, and security will provide cutting-edge insight into the latest DevOps technologies. Expect to go home with new tricks, fresh solutions to age-old problems and some new friends.