What We Mean by ‘Feature Flags’

“You keep using that word. I do not think it means what you think it means.” — Inigo Montoya, The Princess Bride.
Tweaked slightly, that famous line from the 1987 classic film could easily apply to the term “feature flags.”
It’s a line that often comes to mind at industry conferences and meetups, where I ask people, “Are you using feature flags?” Usually, plenty of people answer yes. At one time, I thought this meant that feature flags had become universally known in the developer world. But when I started asking follow-up questions, I realized that was not the case at all. Confusion around the term persists.
It turns out that when many people hear the term feature flags, they fixate on the word “flag” and think of something much older: the other flags in software engineering. They mean a compile-time flag, a server configuration flag, or maybe a setting in a server configuration file. While those are indeed flags, what they all have in common is that they are global: they affect every user passing through that piece of software.
But when I say feature flags, sometimes called feature toggles or ops-toggles, I’m talking about a very different thing. I’m referring to making a dynamic decision in my code — live. I’m deciding which way I’m going to send a user, without having to push new code and without having to change a config file. It’s not static, like those other examples of flags. It’s a user-by-user, session-by-session decision.
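To make the contrast concrete, here is a minimal sketch in Python. Every name in it (NEW_CHECKOUT_ENABLED, EARLY_ACCESS_USERS, is_enabled, checkout) is hypothetical and chosen for illustration; it does not represent any particular product's API. The constant stands in for the older, global kind of flag, while the function makes its decision per user at the moment of the request.

```python
from dataclasses import dataclass

# The older, "global" kind of flag: fixed when the code is built or deployed,
# and the same answer for every user. (Hypothetical constant, for contrast only.)
NEW_CHECKOUT_ENABLED = False


@dataclass(frozen=True)
class User:
    user_id: str
    plan: str


# Hypothetical runtime state. In a real system this would live outside the
# code, so it could change without a deploy.
EARLY_ACCESS_USERS = {"alice"}


def is_enabled(flag_name: str, user: User) -> bool:
    """Decide per user, at request time, whether the flag is on."""
    if flag_name == "new_checkout":
        return user.user_id in EARLY_ACCESS_USERS
    return False  # unknown flags default to off


def checkout(user: User) -> str:
    # The flag check sits in the code path and routes each user individually.
    if is_enabled("new_checkout", user):
        return "new checkout flow"
    return "old checkout flow"
```

Two users hitting the same deployed code can get different answers, and adding a user to EARLY_ACCESS_USERS changes the outcome for that user without touching the code.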
Directing Traffic
One simple way to think of a feature flag is as a traffic cop directing users. Maybe that traffic cop sends 10 percent of the population to a new feature and routes the rest around it; in other words, they do not get that feature. This is what you do in a gradual rollout based on percentages of users. You can also roll out features to internal users, beta testers, free-tier users, people who have opted in to get early updates, or any other segment defined by attributes of the target population.
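One common way to build that traffic cop is to hash each user into a stable bucket, so the same user always gets the same answer for a given flag. The sketch below assumes that approach; the function names, the "group" attribute, and the choice of SHA-256 are my own illustrative assumptions, not a description of how any specific tool works.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket from 0 to 99 for this flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(flag_name, user_id, percentage, attributes=None,
               allowed_groups=frozenset()):
    """Return True if this user should be sent to the new feature."""
    attributes = attributes or {}
    # Attribute-based targeting: internal users, beta testers, and so on.
    if attributes.get("group") in allowed_groups:
        return True
    # Percentage-based targeting: buckets below the threshold get the feature.
    return bucket(flag_name, user_id) < percentage
```

Because the bucket depends only on the flag name and the user ID, raising the percentage from 10 to 50 keeps everyone who already had the feature and adds new users on top. Including the flag name in the hash means different flags slice the population differently.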
In a feature rollout, you put that traffic cop — the feature flag — into your code. Once you’ve done that, you control who goes where dynamically using rules that are edited externally. Deployed code isn’t necessarily released code. And when the code is released, it can be turned off without a rollback or a new deployment.
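Here is a rough sketch of what “rules that are edited externally” can look like. I am assuming the rules arrive as JSON from somewhere outside the deployed code (a file, a database, a flag service); the rule format and the names load_rules and is_released are hypothetical.

```python
import json


def load_rules(raw_json: str) -> dict:
    """Parse rules that were edited outside the deployed code."""
    return json.loads(raw_json)


def is_released(rules: dict, flag_name: str, user_group: str) -> bool:
    """Check the externally managed rule for this flag and user group."""
    rule = rules.get(flag_name)
    # A missing rule, or "enabled": false, acts as a kill switch.
    if rule is None or not rule.get("enabled", False):
        return False
    return user_group in rule.get("groups", [])


# The same deployed code, evaluated against two versions of the rules.
rules_v1 = '{"new_search": {"enabled": true, "groups": ["internal", "beta"]}}'
rules_v2 = '{"new_search": {"enabled": false, "groups": ["internal", "beta"]}}'
```

With rules_v1 the feature is deployed and released to internal and beta users only. Switching to rules_v2 turns it off for everyone, and neither change requires a rollback or a new deployment.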
That’s the key with feature flags: they allow you to separate deployment from release. This unlocks a lot of power for you in terms of being able to gradually roll out individual pieces of your code — and do it without “canary releasing” to separate servers and routing traffic to them. It also enables you to conduct experiments. In both cases, you are simply routing some users one way and some users another way, and then observing the differences between the two groups.
Taking It a Step Further
Now, there are plenty of folks I meet who are indeed doing the kind of feature flagging I’ve described above. Yet, many are doing it in a limited way. People naturally value the idea of having control, as it allows them to fix things quickly if something goes wrong. And feature flags provide that control. But what many are missing is the ability to observe and measure, which allows them to make smarter decisions. That’s what metrics provide.
When you combine the two, feature flags for control and metrics for observability, you can get into true experimentation, and that is far more powerful than feature flags alone. It’s akin to the difference between a faint echo and a vivid Technicolor image.
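As a small illustration of pairing the two, the sketch below records which treatment each user saw and whether that user later converted, then compares the groups. The Experiment class and its method names are hypothetical, and this is only the bookkeeping: a real experiment would also need a statistical test before drawing conclusions from the difference.

```python
from collections import defaultdict


class Experiment:
    """Track exposures and conversions per treatment (illustrative only)."""

    def __init__(self):
        self.exposures = defaultdict(set)    # treatment -> users who saw it
        self.conversions = defaultdict(set)  # treatment -> users who converted

    def record_exposure(self, treatment: str, user_id: str) -> None:
        # Called at the point where the flag decides which way the user goes.
        self.exposures[treatment].add(user_id)

    def record_conversion(self, treatment: str, user_id: str) -> None:
        # Only count conversions from users who were actually exposed.
        if user_id in self.exposures[treatment]:
            self.conversions[treatment].add(user_id)

    def conversion_rate(self, treatment: str) -> float:
        exposed = len(self.exposures[treatment])
        if exposed == 0:
            return 0.0
        return len(self.conversions[treatment]) / exposed
```

The flag supplies the control (who sees what), and the recorded metric supplies the observation (what happened next). Comparing conversion_rate("on") with conversion_rate("off") is the starting point for an experiment.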
Still, you have to walk before you can run, and understanding what feature flags are and how they work is an important first step.