Strengthening Your DevOps Pipeline with CI Observability
The Accelerate State of DevOps 2019 Report shows that teams and organizations that adopt DevOps-related practices are more successful at product development. This is because the goal of DevOps is to increase velocity while maintaining stability. As a result, when DevOps is practiced effectively, organizations reap major benefits. For example, the chart below illustrates the development gap between those who embrace DevOps and those who lag behind.
We must also remember that DevOps practices are not a defined set of operations but rather a model of development to aspire to. Therefore, changing the development culture and implementing the required tools and infrastructure is a difficult process. Having all these efforts go to waste could lead a company to retreat from progressive changes, dooming it to archaic development practices and eventually to slow and sub-standard product development.
This is even more pertinent considering the technical debt involved in changing current development practices. Companies are reluctant to change because of the financial costs, the time required, and data migration needs. Hence, once we start adopting DevOps-related practices, it is crucial that we build a solid base on which we can keep improving our development practices.
However, there is a vulnerability in any team’s DevOps pipeline, and that is the failure of effective CI/CD. In this piece, we will explore why the CI/CD stage is so crucial to the overall development practice. We will also consider the cultural and tooling remediation required to insulate the pipeline from this vulnerability.
The Backbone of DevOps
To reiterate, the goal of DevOps has always been to increase velocity while maintaining the stability and availability of your application systems. A lot of this is achieved through two principles. The first is the breaking down of silos and the second is automation. Various DevOps tools, practices and cultures revolve around promoting these two principles.
For example, monitoring and incident management are aimed at making developers aware of the state of their application systems, and hence at breaking down silos. Similarly, automated testing tools integrated with the developer’s IDE help increase velocity by boosting automation.
However, no tool or practice is as crucially positioned as Continuous Integration and Continuous Delivery. This is because CI/CD is the key moment in the development cycle where “code is traditionally thrown over the wall”: the transition from dev-centric domains to ops-centric domains. As a result, this stage holds potential for both breaking down silos and automation.
Consequently, CI/CD becomes the backbone of DevOps, where automation in the form of automated tests and automated deployments falls under the domains of Continuous Integration and Continuous Delivery. A failed or weak CI/CD stage means that getting code to production could be arduously slow, and it could even raise the likelihood of incidents and disruptions because the stage failed to catch buggy code. Both velocity and stability are directly impacted by a weak CI/CD practice in the DevOps pipeline.
Unfortunately, this stage is also fragile. When going through the CI and CD process, issues may arise and things may fail, either as test failures or as pipeline execution failures. One such case that is increasingly becoming an issue among developers is flaky tests: tests in the CI phase that randomly succeed or fail without any actual change in the code.
The reason behind a flaky test is usually obscure. Many times we do not know why the test is failing, and it is common practice to ignore the test and override the warning signs to continue with deployment. This is a slippery slope, because it degrades the trust that we have in the CI process. We are really getting ourselves into a “boy who cried wolf” situation.
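As a rough illustration of the definition above, a flaky test can be spotted in CI history as one that both passed and failed on the same commit, since the code did not change between those runs. The following is a minimal sketch under that assumption; the `RunRecord` type and `find_flaky_tests` function are hypothetical names for illustration and do not come from any particular CI tool.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class RunRecord(NamedTuple):
    """One execution of one test in CI (hypothetical record shape)."""
    test_name: str
    commit_sha: str
    passed: bool


def find_flaky_tests(results: Iterable[RunRecord]) -> dict[str, float]:
    """Return {test_name: overall failure rate} for tests that both
    passed and failed on at least one identical commit."""
    results = list(results)  # allow generators; we iterate more than once

    # Group outcomes by (test, commit) so we only compare runs of the same code.
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for record in results:
        outcomes[(record.test_name, record.commit_sha)].add(record.passed)

    # Mixed outcomes on one commit mean the result changed without a code change.
    flaky = {name for (name, _sha), seen in outcomes.items() if len(seen) > 1}

    rates: dict[str, float] = {}
    for name in sorted(flaky):
        runs = [r.passed for r in results if r.test_name == name]
        rates[name] = runs.count(False) / len(runs)
    return rates
```

A test that fails on one commit and passes on a later one is not reported, because the change in outcome may be explained by the change in code. Surfacing the failure rate alongside the test name gives a team something concrete to prioritize instead of silently overriding the failure.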
Strengthening the Backbone with Observability
Considering the importance of CI to the DevOps pipeline, it is imperative that we strengthen it. To understand what this means, let us remind ourselves what actually happens within the CI stage.
CI allows for a single source of truth by providing a version control system and artifact repository. It is at this stage that we manage merges, along with their associated commits, to the trunk branch of the source code. It is this source code that is then packaged and sent off for deployment. Before it can be sent to deployment, CI also runs the necessary tests, some of which may turn out to be flaky, as mentioned in the previous section.
As can be seen, there are various steps at this stage, each playing a pivotal role in the success of the overall system. Therefore, we must not be left in the dark while going through this stage of the development journey. This is where observability comes in, by providing the right insights into the current state of your CI servers.
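One simple way to stop being in the dark is to record how long each CI step takes and whether it succeeded. The sketch below assumes a pipeline driven from Python and uses only the standard library; `PipelineTrace` and `StepRecord` are hypothetical names for illustration, not part of any CI product.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class StepRecord:
    name: str
    duration_s: float
    succeeded: bool


@dataclass
class PipelineTrace:
    """Collects one record per CI step (hypothetical helper)."""
    records: list[StepRecord] = field(default_factory=list)

    @contextmanager
    def step(self, name: str):
        start = time.perf_counter()
        succeeded = True
        try:
            yield
        except Exception:
            succeeded = False
            raise  # do not hide the failure from the pipeline
        finally:
            # Runs on success and on failure, so every step is recorded.
            elapsed = time.perf_counter() - start
            self.records.append(StepRecord(name, elapsed, succeeded))


trace = PipelineTrace()
with trace.step("build"):
    pass  # placeholder for the real build command
try:
    with trace.step("unit-tests"):
        raise RuntimeError("simulated test failure")
except RuntimeError:
    pass  # a real pipeline would stop or report here

for record in trace.records:
    print(record)
```

In practice these records would be shipped to whichever observability backend the team already uses; the point of the sketch is only that each step leaves behind a name, a duration, and an outcome.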
Observability and monitoring are not new concepts and are, in fact, a crucial part of the DevOps pipeline. However, they have traditionally been thought of as Ops domains. As you may have noticed, throughout this piece we have continuously distinguished between dev domains and ops domains across the development story and DevOps pipeline. This, of course, does not sit well with the philosophy of DevOps, which aims to break down all silos and, in more fanatical terms, even the division of domains.
As a result, it can be expected that the observability and monitoring initially applied to understand the state of the application in production are now being applied to understand the state of versioning and tests. By tracking metrics such as quality and time-based metrics, while leveraging metrics, traces, and logs in testing and debugging scenarios, we can effectively do away with the woes of traditional CI.
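To make the idea of quality and time-based metrics concrete, here is a minimal sketch that summarizes a set of pipeline runs. It assumes failure rate as the quality metric and run duration as the time-based metric; the article does not prescribe specific metrics, and `PipelineRun` and `summarize` are hypothetical names.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class PipelineRun:
    duration_s: float  # wall-clock time of the whole run, in seconds
    succeeded: bool


def summarize(runs: list[PipelineRun]) -> dict[str, float]:
    """Compute one quality metric and two time-based metrics."""
    if not runs:
        raise ValueError("need at least one pipeline run")
    durations = [run.duration_s for run in runs]
    failures = sum(1 for run in runs if not run.succeeded)
    return {
        "failure_rate": failures / len(runs),      # quality
        "mean_duration_s": mean(durations),        # time-based
        "median_duration_s": median(durations),    # time-based, robust to outliers
    }
```

Tracked over time, a rising failure rate or a drifting median duration is the kind of signal that tells a team the CI stage needs attention before trust in it erodes.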
Therefore, with these metrics, we can list the major benefits:
- Building trust in the CI/CD stage across teams with metrics that provide a ground reality status and understanding.
- Providing insights crucial to the resolution of failed and flaky tests.
- Reducing the risk of incidents and disruptions in production by providing an added layer of “debugging”.
- Building resilience in the CI/CD stage and the overall DevOps pipeline.
Overall, by using observability to remediate the vulnerabilities of the CI/CD domain, we effectively strengthen the backbone of the DevOps pipeline. With CI observability, we close the gaps in understanding our application throughout its development life cycle. In production, we already have sophisticated observability and monitoring tools.
In development, we can leverage debugging strategies to provide the necessary insights. However, for a long time the stage between development and production has been a blind spot. With the rise of CI observability, we are finally shedding some light on this blind spot.
Since Patrick Debois coined the term in 2009, DevOps has been growing in popularity. However, amid this hype, overly eager teams and companies often dive into a concept that they do not fully understand or are not prepared for. Moreover, this change cannot happen overnight, so it is also not immediately apparent when a team or organization is going down the wrong path.
As for what the right path is, there is no single right answer. Each organization and each team is unique, so everyone’s DevOps journey will be different. However, there are best practices to take into consideration, and CI observability is one of them.
This is because, as discussed in this piece, CI/CD makes up the bulk of a team’s effort with regard to breaking down silos and promoting automation. It is imperative that this stage in the DevOps pipeline be strengthened to ensure a successful DevOps practice overall.