CI/CD / DevOps / Tools / Sponsored / Contributed

Fixing Tests in CI/CD: Why are Your Tests Failing?

24 Aug 2021 4:00am, by Ismail Egilmez
Ismail is the business development manager at Thundra. He is enthusiastic about sales, marketing and growth strategies.

Automated testing is a best practice in today’s software development world. Every developer knows they should run multiple tests on their code. Even so, most information about how to run tests is based on opinion.

Should you write integration tests first and then write unit tests for specific bugs to keep regressions in check? Do you need 100% coverage? If not, how much coverage is acceptable?

The questions are constant, and there are many ways to test applications. Even if you get the answer you need, you may not know what to do next. For instance, what will you do when a test fails, and what’s the fastest way to fix it?

In general, it’s good when a test shows you something is wrong with your code. It means the bug won’t reach your users directly, but it still has a cost. A failing test slows your delivery process because you need to stop everything and fix it. Your users have to wait for new features, and if they have to wait too long, they might be inclined to look at what your competitors have to offer.

In this article, we’ll talk about fixing tests that somehow only failed on your CI server, even though they worked on your local computer. We’ll also cover why it’s important to take failing tests seriously, and what you can do to reduce the effort involved in fixing them.

Why We Ignore Failing Tests

It’s an endless struggle with tests: They tell you when you’ve failed, which is helpful, but may also indicate other issues. They also create extra coding work on top of the actual application code you have to write. As a result, small, seemingly unimportant tests that fail will be ignored. Often this goes on until there is a mass of failed tests that nobody knows how to get under control. If your backlog is so big that people avoid it, it will become a graveyard for tasks that no one wants to do.

Though small, these obstacles lead to test neglect. Besides, if a test fails, a developer may have more pressing things on their mind than fixing the test, because the new feature needs to be delivered by the end of the week.

Don’t think this is a late-process problem, however. A demoralizing glut of failed tests usually comes from tests being ignored right from the start of a project. The harder a test error is to reproduce, the less motivated a developer will be to debug and fix it. If you didn’t plan processes to ease the pain of debugging early on, these problems grow over time.

Why Should We Take Failing Tests Seriously?

There are multiple reasons to take failing tests seriously. First, failed tests can block the delivery of new releases, because the CI/CD pipeline will only deploy when every test succeeds. Alternatively, the pipeline can be configured to allow more flexibility and deploy a release with failed tests. Neither scenario is ideal. Either your delivery halts, or you start delivering software with sub-par quality that your customers are sure to notice.
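The two pipeline policies can be sketched as a small gate function. This is only an illustration, not the configuration of any particular CI system; `CaseResult`, `should_deploy` and the `max_allowed_failures` parameter are hypothetical names, and the test names are sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseResult:
    """Outcome of a single test case (hypothetical structure)."""
    name: str
    passed: bool


def should_deploy(results, max_allowed_failures=0):
    """Return True when the number of failed tests is within the limit.

    max_allowed_failures=0 models the strict pipeline that deploys only
    when every test succeeds; a higher value models the flexible one.
    """
    failures = sum(1 for r in results if not r.passed)
    return failures <= max_allowed_failures


results = [
    CaseResult("test_login", True),
    CaseResult("test_checkout", False),
]

# Strict policy: one failure is enough to halt delivery.
strict = should_deploy(results)
# Flexible policy: the release ships with a known failing test.
flexible = should_deploy(results, max_allowed_failures=1)
```

With the same results, the strict gate blocks the release and the flexible gate lets it through, which is exactly the trade-off between halted delivery and reduced quality.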

Because solving your test issues won’t get easier over time, the first corrective measures should be taken early on. Technical debt doesn’t fix itself. Ultimately, if not taken seriously, failing tests can grind your development team down and lead to a freeze on CI/CD progress.

Tests are meant to help you in your development process. If your tests don’t have value and aren’t worth fixing, you shouldn’t have written them in the first place. A system without automated tests to check its functionality is bad. A system full of unhelpful failing tests is even worse; they use resources, and you don’t gain anything from their results.

The Best Methods for Fixing Failing Tests

In a worst-case scenario, fixing a failed test requires a developer to check out an older version of their code and replicate the continuous integration environment on another machine. There may be additional steps to find the bug, depending on the level of automation. In a traditional environment, a developer may have to manually install multiple parts of the system.

This is what developers picture when they see their commit fail in CI. Each of these steps is a possible excuse for not debugging, and each one can be circumvented with a little planning. Here are some strategies:

Minimize Upfront Work

Managers and developers should minimize the upfront work necessary to start debugging. It’s important to notify the creator of the failing commit immediately, and tell them everything you know about the issue, instead of only communicating the news that their CI run failed.
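As a rough sketch of what a more informative notification might contain, the snippet below formats a message from the details of a failed run. The `FailedRun` fields, the `build_failure_message` function and all sample values are assumptions made for illustration; a real pipeline would fill them from its own CI metadata and send the text through whatever channel the team uses.

```python
from dataclasses import dataclass, field


@dataclass
class FailedRun:
    """Details of a failed CI run (hypothetical structure)."""
    commit_sha: str
    author: str
    branch: str
    failed_tests: list
    log_tail: list = field(default_factory=list)


def build_failure_message(run, max_log_lines=5):
    """Format a notification that carries context, not just 'CI failed'."""
    lines = [
        f"CI failed for commit {run.commit_sha[:8]} on branch {run.branch}",
        f"Author: {run.author}",
        f"Failed tests ({len(run.failed_tests)}):",
    ]
    lines += [f"  - {name}" for name in run.failed_tests]
    if run.log_tail:
        lines.append("Last log lines:")
        # Keep only the end of the log, where the error usually appears.
        lines += [f"  {line}" for line in run.log_tail[-max_log_lines:]]
    return "\n".join(lines)


# Sample values for illustration only.
run = FailedRun(
    commit_sha="3f2a9c1d7b6e4f00",
    author="dev@example.com",
    branch="feature/checkout",
    failed_tests=["test_checkout_total"],
    log_tail=["AssertionError: expected 42.0, got 41.5"],
)
message = build_failure_message(run)
print(message)
```

The point of the design is that the author can see which test failed and why without opening the CI dashboard first.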

If you’re using infrastructure-as-code in your application, leverage that. IaC can accelerate setting up a development environment and allows developers to debug immediately with a system that resembles the CI environment. This saves time researching and creating an environment that’s a perfect match.

Debug Tests Where They Take Place

Even better, enable developers to debug the system right where it was deployed, inside the CI environment. This removes a lot of boring upfront work and saves you from waiting until a development system is deployed.

You can also instrument your CI and test code so you know exactly what happened on that remote server. How long did the test take? What code was executed? Did it hit any resource limits? Which services were involved in the test scenario? These metrics and traces might be enough to give a frustrated developer an aha moment, without the need to replicate anything.
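A minimal sketch of this kind of instrumentation, using only the Python standard library, might look like the following. The `record_test_metrics` helper and the record fields are hypothetical, and `tracemalloc` only tracks memory allocated by Python itself, so treat the memory figure as a rough signal and not a full resource report.

```python
import json
import time
import tracemalloc
from contextlib import contextmanager


@contextmanager
def record_test_metrics(test_name, sink):
    """Measure wall-clock time and peak Python memory for one test.

    `sink` is any list-like object; a real setup would ship the record
    to a log or metrics backend instead.
    """
    tracemalloc.start()
    started = time.perf_counter()
    outcome = "passed"
    try:
        yield
    except Exception:
        outcome = "failed"
        raise  # keep the failure visible to the test runner
    finally:
        duration = time.perf_counter() - started
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        sink.append({
            "test": test_name,
            "outcome": outcome,
            "duration_seconds": round(duration, 4),
            "peak_memory_bytes": peak_bytes,
        })


records = []
with record_test_metrics("test_sum", records):
    assert sum(range(1000)) == 499500

# One JSON line per test is easy to collect from CI logs afterwards.
print(json.dumps(records[0]))
```

Because the record is written in the `finally` block, a failing test still leaves its duration and memory figures behind, which is when they are most useful.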

Employ the Right Tools

Thundra’s newest product, Foresight, is the perfect tool for making your CI pipeline more observable. Foresight takes the observability know-how we’ve gathered from years of building monitoring solutions for cloud applications and applies it to the CI/CD pipeline. Now, you can get insights such as metrics, traces and logs before the system is even deployed to production.

With Foresight’s new record-and-replay feature, you can go through your test step by step. This allows you to replay what actually happened, just like you would when replicating the system on your local machine or a dedicated development environment in the cloud.

This feature has two huge benefits:

  1. It eliminates all the work that needs to be done before a developer can start debugging.
  2. It runs on the same hardware that executed your test in the first place, so you don’t have to worry that your replica won’t match the CI environment.

Conclusion

When things go wrong, developers are inclined to re-run tests on their local machine because that’s where they have control. Yet in the age of cloud infrastructure and managed services, this replication isn’t always possible. While running automated tests is a good practice, when they’re executed outside the CI/CD pipeline, their errors quickly become confusing.

This is why we need to bring the control of running tests and debugging locally to the cloud and, in turn, the remote CI/CD pipeline.

Many test-monitoring solutions tell developers about an error, but they don’t give them actionable insights that help them fix it. Often, the best they get are some incomplete logs that don’t even have the most crucial information. This leads developers to dread fixing tests, and in the end, either software quality or delivery velocity suffers.

Thundra Foresight was created to keep the delivery process smooth, including the inconvenient task of fixing broken builds. Explore Thundra’s open source projects and see just how easy fixing failed tests can be.

The New Stack is a wholly owned subsidiary of Insight Partners. TNS owner Insight Partners is an investor in the following companies: Velocity.

