How Canva Uses Data to Prioritize QA Testing
The quality assistance engineers behind Canva’s media design service optimize their quality assistance model by focusing engineering resources on troubleshooting risks and edge cases up front for the most-used features. These data-driven decisions help prevent mission-critical errors while customers are using the service.
Canva’s QA engineers assist every team member, including project managers, designers, developers, and data analysts, in preventing errors early on, explained Zi Yang Pang, a quality assistance engineer at Canva, in a recent blog post.
The challenge is a common one: resources are limited. With many projects needing to hit strict deadlines, decisions about where to focus engineers and QA engineers are important; otherwise, the team risks stretching itself too thin.
Under that constraint, Canva optimized its quality assistance model by making data-driven decisions wherever possible. These decisions allow Canva to prioritize testing the most-used features first, on the hypothesis that any bugs that slip through to production will tend to land in lesser-used features.
“There’s no software out there with zero bugs. What’s important is how often your users encounter them and how critical they are,” Pang wrote.
Code Coverage and Unit Testing
Canva’s code lives in a monorepo, and an owners file identifies who owns each area of the code. For code coverage, Canva focused on unit tests because they “give the most detailed coverage information compared to integration tests, which are much more difficult to assess.”
Using unit tests, Canva’s engineers looked at the code coverage of the code each team owns. This raised visibility into features with very little coverage, so teams could focus their efforts on areas with few automated tests.
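The post doesn’t detail Canva’s internal tooling, but the idea of rolling per-file coverage up to the owning team can be sketched as follows. The `OWNERS` mapping and per-file coverage numbers here are hypothetical stand-ins for an owners file and a coverage tool’s report:

```python
# Sketch: aggregate per-file unit-test coverage by owning team.
# The OWNERS mapping and COVERAGE report below are hypothetical;
# Canva's actual monorepo tooling isn't described in the post.

# Map of path prefix -> owning team (stand-in for an owners file).
OWNERS = {
    "media/editor/": "media-editor",
    "media/upload/": "media-upload",
    "billing/": "billing",
}

# Per-file coverage: (covered lines, total lines), as a coverage tool might emit.
COVERAGE = {
    "media/editor/crop.ts": (120, 400),
    "media/editor/resize.ts": (50, 300),
    "media/upload/queue.ts": (280, 300),
    "billing/invoice.ts": (90, 100),
}

def team_for(path: str) -> str:
    """Find the owning team via the longest matching path prefix."""
    matches = [p for p in OWNERS if path.startswith(p)]
    return OWNERS[max(matches, key=len)] if matches else "unowned"

def coverage_by_team(coverage: dict) -> dict:
    """Roll per-file line coverage up to a percentage per team."""
    totals: dict = {}
    for path, (covered, total) in coverage.items():
        team = team_for(path)
        c, t = totals.get(team, (0, 0))
        totals[team] = (c + covered, t + total)
    return {team: round(100 * c / t, 1) for team, (c, t) in totals.items()}

print(coverage_by_team(COVERAGE))
# The "media-editor" team surfaces with the lowest coverage, making
# it the obvious candidate for more automated tests.
```

A report like this is what gives a team the visibility the article describes: low percentages point directly at the features with the fewest automated tests.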
Here were some common first reactions Canva heard when these numbers were presented to development teams:
“Oh but I’ve written integration tests for that.”
Pang says this is “the most common answer you will get, and that’s fine.” The purpose is to raise awareness of the number of tests needed per shipped feature. He suggests creating a team dashboard showing which integration tests cover specific user scenarios.
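A minimal version of that dashboard is just a mapping from user scenarios to the integration tests that cover them, with uncovered scenarios flagged. The scenario and test names below are hypothetical illustrations:

```python
# Sketch: a minimal "scenario coverage" view for a team dashboard,
# mapping user scenarios to the integration tests that cover them.
# Scenario and test names are hypothetical.

SCENARIO_TESTS = {
    "admin invites teammate from homepage": ["test_invite_homepage"],
    "admin invites teammate after creating a design": ["test_invite_post_design"],
    "invited user accepts via email link": [],  # no integration test yet
}

# Any scenario with an empty test list is a coverage gap to surface.
uncovered = [s for s, tests in SCENARIO_TESTS.items() if not tests]
for scenario in uncovered:
    print(f"MISSING: {scenario}")
```

Even this simple view turns “I’ve written integration tests for that” into something verifiable: either a scenario has a test mapped to it, or it shows up as a gap.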
“Oh yeah, we kinda know about it, but we don’t have time or space to work on it.”
The challenge behind this comment is technical debt. There’s a risk that the team is only working on new product features and lacks the time to pay down existing technical debt or do engineering foundation work. This can lead to more technical debt accruing over time, hindering the ability to scale in the future.
Canva’s suggestion here is to pick up at least one technical debt or engineering foundation ticket from the backlog every sprint to minimize the accrual of technical debt. Pang says, “You want to be paying off your technical debt faster than accruing it.”
“Code coverage itself isn’t useful because we don’t know what we are looking at.”
To this, Canva’s answer came as a question: “And that’s a fair call because those numbers don’t really mean anything without context. So, how do we get around this?”
A Deep Understanding of the User Experience
“A quality product doesn’t only mean it’s free of bugs. It also has to be easy for users to discover and use a new feature well,” explains Pang. Most users are more likely to work around an issue than to create a support ticket, which makes it all the more important for Canva’s QA engineers to “pick up moments like this to help reduce our users’ friction points as much as possible.”
The image below shows the area in the user interface where an administrator can invite someone to join their team from the homepage sidebar.
This next image shows the area in the user interface where an administrator can invite someone to join their team after the design is made.
In this example, understanding how administrators prefer to invite people to their team provides useful data for making these decisions:
- Is desktop or mobile more popular?
- Is inviting from the homepage or after the design is made more widely used?
These data points can be broken down further, helping to focus testing on the highest-traffic areas and to determine the severity and priority of bugs accurately. Should any bugs slip through, Canva now has a way to estimate the number of users affected by any specific feature bug. This also helps reduce alert fatigue, since not every bug alert that comes in carries the same priority.
These analyses led to better tests for the areas with the highest feature usage and the lowest code coverage. They also help mitigate the risk of a component breaking in the future and affecting many users.
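The post doesn’t give a formula, but the “highest usage, lowest coverage” heuristic can be sketched as a simple risk score that ranks features by the share of traffic hitting untested code. All feature names and numbers below are made up:

```python
# Sketch: rank features for testing attention by combining usage with
# coverage -- high-traffic, low-coverage features rise to the top.
# Feature names, traffic shares, and coverage figures are hypothetical.

FEATURES = [
    # (feature, share of user traffic, unit-test coverage)
    ("invite-from-homepage", 0.45, 0.30),
    ("invite-after-design", 0.25, 0.80),
    ("invite-from-mobile", 0.20, 0.20),
    ("invite-via-link", 0.10, 0.90),
]

def risk(usage: float, coverage: float) -> float:
    """Simple risk score: the share of traffic exercising untested code."""
    return usage * (1.0 - coverage)

# Highest risk first: these are the features to test next.
ranked = sorted(FEATURES, key=lambda f: risk(f[1], f[2]), reverse=True)
for name, usage, cov in ranked:
    print(f"{name}: risk={risk(usage, cov):.2f}")
```

With these numbers, the heavily used but lightly tested homepage invite flow ranks first, while the well-covered link invite ranks last, which is exactly the prioritization the article describes.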
A Shift Towards Quality
The team started with easy-to-achieve goals and regularly reported metric improvements. Over time, their scope grew. It took time, but they started to see better test coverage, less code rework, and greater confidence in the team when releasing new features. Pang says, “In the past three months, we saw zero incidents across three teams that I work with (and hope to keep it that way!)”
Since there’s no way to reach 100% unit test coverage, and unit testing isn’t perfect on its own, the team supplemented it with other QA activities such as QA kickoffs, testing parties, and reviewing designs before engineering work starts.
The team constantly balances investing time in engineering foundation work against shipping features, and it now has a better understanding of the current health of its code, with an objective measure to work toward.
All of this results in a better user experience.