Earlier this year, LinkedIn engineer Nikolai Avteniev gave a talk at the Software Craftsmanship conference in New York City about the modern code review process and how it’s practiced at LinkedIn. One of the key lessons he imparted was that software must have an “owner” — software that doesn’t have a clearly defined owner will have more bugs.
As part of the LinkedIn Flagship Product Engineering team, Avteniev is intimately familiar with the code review process at the business social networking giant. So, we caught up with Avteniev to learn more about software testing at LinkedIn, as well as to discuss how could review fits in with the overall goals of DevOps-driven software development.
What does the LinkedIn Flagship Product Engineering Team do?
The Flagship team builds our Flagship applications, so the LinkedIn mobile applications for iOS and Android, the LinkedIn website and all the services that power those experiences for our members including messaging, the feed, the notifications that you get on the site, etc.. That is all part of what we call Flagship engineering. And video of course. Video is another part of the ecosystem. And I talk about that one because that’s the team I work on.
Could you describe the usual code review process for this team?
Usually, a developer, let’s say me, for example, is working on a feature and that feature requires me to make a change to a particular service or a particular part of the code base. Say I want to make a change in the Android application; I go ahead, write the tests, write the change to meet those tests to make sure that the change I’m making is well tested. And then I create a code review submission. We use a tool called Review Board. We also have a little command line tool that lets us create a code review from our local git repository, Git workspace.
That creates a Review Board ticket and also notifies people who are responsible for maintaining the particular part of the code base I’m changing. Those authors then look at Review Board, look at the changes to the code change, and possibly offer suggestions. This might happen in a single iteration. Let’s say I got everything perfect and it just looks great and the maintainers of the code base look at it and say, “Looks great, thank you very much,” they would then give it a sign-off for a contribution to the code base.
I would then use another command line tool to submit the change. It goes through a battery of automated tests, then it’s committed to the code base and it’s ready for our continuous deployment pipeline to build it and make it deployable for the next release of our Android application.
Could you tell us more about the code review process at LinkedIn and its benefits?
Code review used to mean a very formal process where a group of engineers would get together in a conference room and go through the entire code change that’s being proposed line-by-line and dive deeply into the implementation details. You would have to get everybody to look at the same code base and capture the feedback — and it’s a large investment of time.
More recently a more asynchronous process has been developed where people review and read the code but they don’t necessarily do it together in the same room. They usually do it on their own time.
It’s usually facilitated with a tool, something like Review Board, or the GitHub pull request process where an engineer wants to make a contribution, creates a pull request and then the maintainers of the project use the pull request as a way to have a conversation about the code and what changes if any need to be made by the contributor before the code can be merged into the main branch of the service.
For the modern code review process, the pull request is the atomic unit or it’s the thing that triggers it?
Yeah. Somebody wants to make a change to the service, or the library for the project and it’s a way to get more eyes on the contribution before it goes into the main developer branch.
For the contributor, it’s a way to get input from folks who are really familiar with the code base and have experience with the code, know what patterns work well in this code base and what patterns need to be avoided.
For the maintainers of the code — the folks who are responsible for the long-term health of the project — it’s a way for them to keep track of all the different changes that are happening in the project and a way for them to know what’s going on, where there might be trouble spots coming up or where things are working really well.
And for, let’s say, a neutral observer, somebody who’s trying to find out how to contribute to the code or learn about the code base, it’s an opportunity to observe the interactions, understand how the code changes are being made to understand what patterns are working well and what patterns aren’t supported by the code base. And just generally get a good sense of who to submit contributions to in a way that will be acceptable.
What are some of the things that a code review looks at?
That’s an interesting question. There was a survey study done at Microsoft where developers were asked, “What do you think is the purpose of the code review?” And developers gave several reasons. The top one was finding defects in the code. And they also had a similar question for engineering managers and they agreed that finding defects in the code is the primary reason for a code review.
And that might be historically the reason for it. That’s what the formal code inspections were meant to find.
But when they actually looked at the type of comments that people provide, they’re not always about finding bugs. They are more about the evolution of the code base, maintainability of the code base, what patterns to use one when making changes.
When somebody’s reviewing the code, they are looking from the perspective of being able to understand the code base, being able to see how this change fits in with the rest of the application, and they are looking at it from the point of view of longer-term maintenance and evolution of the code base rather than just their defects.
Code reviews are a good way for an individual to learn best practices and for senior coders to share the knowledge of what are good practices in the code base that will make it really easy to support moving forward.
The folks who are maintaining the code base are the ones familiar with those practices and understand how they work; which is why you want them engaged and involved here.
It also enables other people who are not as familiar with the code base to start making contributions and improving the code because they’re actually making changes and adding new features while learning how to keep the code base in a healthy way.
This ties back to another study by Microsoft that found that code that had lots of contributors who make small contributions were more likely to have defects in them.
Just because you had more people contributing and there’s less of coherence tying it together?
They didn’t really explain why. The idea is that, “Yes those folks who are contributing just once in awhile are not familiar with the patterns of work or might not be familiar with certain things that they need to do to make the change safe.” And you want to get more experienced engineers to give them feedback on the change to help identify potential issues.
Is there a natural tension between the speed at which you want to bring new features to market and this code review process?
Code reviews definitely take some time. In published empirical research it’s about an hour before you get the first feedback to your code review submission. Most contributions get feedback within a day. And it’s about a day before somebody says, “Yes,” and signs off and it says, “This contribution is good.” And a few days for most changes that go in.
There are some changes that take a little longer to review and that depends on the size of the code change, the familiarity of the contributor with the code change being made.
At LinkedIn, we found for our own research that we have similar timing, so about a day to get a change signed off and commit it to a code base.
And it’s hard to say what the total contribution is to get this feature out to members because you need to consider what the tradeoffs would be if you didn’t have a code review process. You’ll likely have to put a process in place to find potential issues or enable the long-term health and maintenance of the code base.
Can you give me an example of something you’ve learned during a code review process?
I have lots of learnings, both as a reviewer and as a contributor. But most recently I’ve been on the contributor end so I’ll point out one in particular. I’m relatively new to Android development. I’ve been a Java developer for a long time but Android is a new platform for me.
Sometime last year I was working on a little feature … well, it probably seems like a little feature for somebody who’s an Android expert. For me, it looked like a huge undertaking where I needed to monitor the battery health of the device to then modify playback of the video on the Android device.
And when I first created the change I got feedback from folks who are experienced reviewers on Android and they pointed out a way for me to make this change reusable for other components in the Android application, not just for the video component.
And they also pointed out the need to move it into a separate library so that I can implement the change faster, test better and then integrate it into the main Flagship application in a way that would be reusable across many other applications at LinkedIn that use Android for building a system.
Even though I am not an expert in Android, I was able to contribute this change in a way that somebody who is an expert and understands the operating system much better would have contributed if they were doing one of their own.
It shows me the power of this multiplying force of the code review process where the experts can help others make really great changes that are leverage-able in a way that if they tried to do it on their own they might not have come up with the same answer.
How does the code review process work with pair-programming?
Pair-programming is another collective code ownership technique that’s used sometimes by smaller teams to do a similar thing. And I think these interplay together. When the team is relatively small and you do have a continuous pair-programming practice, that’s very much like a code review process but it’s happening in real time as you’re writing the code.
As your team gets bigger and, for example, gets geographically distributed and you might not have the ability to always pair on all the changes. That might be a good time to start introducing a code review process.
If I want to become a code reviewer I wouldn’t necessarily have expertise in that particular code base, as long as I was familiar with…
You definitely want some people who are familiar with the code base to be code reviewers.
But you shouldn’t decline participation if you’re not an expert yet. It’s a great way to learn. And after a few code reviews, you’ll get pretty familiar with the code base and your contributions will be pretty good when giving feedback on code reviews.