
How Product Experimentation Goes Wrong

15 Jun 2018 11:08am, by Jon Noronha
As Director of Product Management at Optimizely, Jon Noronha is responsible for leading a team of Product Managers with a goal to discover new ways for companies to experiment more across websites, apps and every level of the stack. Prior to joining Optimizely, Jon coordinated engineering teams across Seattle and Beijing to rethink visual search at Microsoft. As part of Bing’s Image Search team at Microsoft, Jon developed cutting-edge technology in machine learning, distributed systems, and image processing and combined it with great design based on usability studies, constant A/B testing, and quantitative analysis. Jon has a bachelor’s degree in Computer Science from Harvard University.

One of the biggest trends in software development today is product experimentation. Instead of designing a product, building it, launching and hoping that it will work, teams are increasingly looking for ways to iterate gradually and validate their ideas with data. Techniques like A/B testing, feature flagging and gradual rollouts are quickly going from niche to mainstream.
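As a concrete illustration of a gradual rollout, here is a minimal sketch of deterministic hash bucketing, the common technique behind percentage-based feature flags. This is an assumption-laden example, not Optimizely's implementation; the function and feature names are hypothetical.

```python
import hashlib

def in_rollout(user_id: str, feature: str, rollout_pct: float) -> bool:
    """Deterministically bucket a user into a gradual rollout.

    Hashing user_id together with the feature name gives each user a
    stable bucket in [0.0, 100.0), so the same user always sees the
    same variant as the rollout percentage is dialed up.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10000 / 100.0  # value in [0.0, 100.0)
    return bucket < rollout_pct

# Example: expose a hypothetical feature to 10% of users.
enabled = in_rollout("user-42", "new-checkout", 10.0)
```

Because the bucket is derived from a hash rather than stored state, the check is stateless and consistent across servers, which is what makes it safe to widen the rollout percentage over time.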

But like any trend, product experimentation is a good idea that can easily go wrong. For every game-changing A/B test, there’s a trail of testing mistakes that leads well-meaning teams down the wrong path. Armed with the knowledge of what to avoid when building successful experiments, teams can stay on the straight and narrow.

Picking the Wrong Metrics. Experimentation is the most powerful tool we have for moving our metrics. As long as you choose the right numbers to track and measure, you'll see transformative success; choose the wrong ones, and you'll guide your product in exactly the wrong direction. A/B testing is easy when you have a single, simple conversion goal, like getting more leads from a landing page. It gets much harder when you're optimizing for a subtler goal, like driving long-term retention or delivering the best user experience. As you design your product experiments, ask yourself one question: if this metric went up and everything else stayed flat, would you be happy? Or perhaps another: if your users knew you were optimizing this metric, how would they feel? If it feels like your metrics are leading you the wrong way, don't be afraid to rethink them.

Making Your Sample Size Too Small. Our teams at Optimizely run hundreds of experiments on our landing pages and sign-up flow, but as a B2B application, we have to be careful to choose the right sample sizes and run tests with enough statistical power to detect real effects. Experimenting inside a product can be much harder than experimenting in marketing or ecommerce, because products often get less traffic, and therefore less data from which to glean insights. How much traffic is enough? Use an online sample-size calculator to get a feel for the numbers you'll need. A general guideline: if you have more than 10,000 users a month, you should absolutely be experimenting. Any less and you're better off doing qualitative research.
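To get a feel for where those sample-size numbers come from, here is a sketch of the standard two-proportion formula that calculators of this kind typically implement. The function name and defaults (95 percent confidence, 80 percent power) are my assumptions, not a specific product's API.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline: float, mde: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Visitors needed per variant for a two-sided test of proportions.

    baseline: current conversion rate, e.g. 0.05 for 5%
    mde: minimum detectable effect as an absolute lift, e.g. 0.01
    """
    p1, p2 = baseline, baseline + mde
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for confidence
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

# Detecting a lift from 5% to 6% at 95% confidence and 80% power
# takes roughly eight thousand visitors in each variant.
n = sample_size_per_variant(0.05, 0.01)
```

The key intuition: the required sample grows with the square of the inverse of the effect size, which is why low-traffic products struggle to detect small lifts.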

Getting Tricked by Statistics. Misinterpreting an experiment is typically worse than not running it at all. False alarms are incredibly costly and can quickly negate the entire value of experimentation by injecting uncertainty and fear into a process that’s meant to provide clarity and confidence. I’ve seen teams spend weeks frantically retooling their features to make an experiment “pass”, all without realizing they’d been tricked by a false positive. Watch out for the multiple comparisons problem: if you test 100 metrics with 95 percent confidence, you should expect five of them to come out as false positives! And be wary of “peeking” at results too early. With a standard t-test, it’s easy to game the system by refreshing until you get the results you want. To avoid these risks, make sure your experimentation platform uses modern approaches like sequential testing and false discovery rate control.

Not Questioning Your Results. Use Twyman's Law as a guide: any result that looks unusually extreme or out of the ordinary is probably not a breakthrough; it's probably wrong. Similarly, if your metric moves for no apparent reason, don't take it at face value. And if a winner seems too good to be true, retest it! Experimentation doesn't replace intuition; it supplements it. It's good to be data-driven until your data drives you off a cliff.

Operating in a Silo. Experimentation data is not immune to the problem of business silos. Any given data set can be useful to other teams running experiments, and even to those who use data for other purposes. Centralize your data and reporting, and improve governance by sharing everything with the entire company. Don't find gold and forget to share the location with your team. Data silos are rarely intentional; they simply emerge in organizations that are moving quickly.

In the last few years, experimentation has gained popularity as an effective way for businesses to confidently test new products and ideas before rolling them out to the market, and adoption continues to rise. Done right, experimentation gives you the confidence to move quickly. You can reduce the danger of a risky rollout, eliminate the uncertainty that comes from years of planning without customer feedback, and prove the impact of your work to executives and the rest of the business. Experimentation is the gateway to great customer experiences and lasting business success. Don't let these pitfalls hold you back from using this powerful technique: just experiment responsibly.

Feature image via Pixabay.
