A/B Testing – Big change, not big bang

Vito Covalucci

August 21, 2018 | 5 min read

Even in the age of digital experimentation, big bang redesigns continue to find traction with brands across the world, despite ample evidence there are better ways to create positive change. Can we use A/B testing to bring about significant UX change or is it a tool for incremental optimization only? That’s the question we’re here to discuss today.

Redesign vs. optimize

One of the challenges optimizers face is balancing optimization of the current user experience against implementing a redesigned one. Moving to a redesigned experience is a great unknown: the new experience may solve for observed customer needs but may also disrupt key flows in your business funnel. So how do you confidently move into a new experience without disrupting these paths? By iteratively and sequentially A/B testing.

A/B testing is not limited to small scale changes

A/B testing is not antithetical to a redesign; rather, it is essential to a successful one. There is a common misperception that A/B testing is inadequate for making big changes and is best suited to optimizing within the world you already live in.

This isn’t correct. A/B testing will tell you whether the hypotheses you are testing, inside or outside of your existing framework, are empirically better than your control. When done properly, you will have the data to inform the ‘why’ and not simply the ‘what’.

Big bang redesigns often don’t work

Redesigns are often well-intentioned and grounded in research, but the fact is that the majority of tested changes to a user experience fail. The web is filled with examples of large, untested redesigns gone awry. You protect yourself from this risk by testing changes!

It’s important to note that A/B testing the changes all at once is no better than a big bang redesign. Baking more than one hypothesis into a single test is bad science, as it makes disentangling the results from the changes infeasible. The only thing you learn is whether the sum of all changes surpassed the control. You are left without understanding which changes are driving the performance and why.
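To make the confounding concrete, here is a minimal sketch with made-up numbers. It assumes two hypothetical changes (the names new_menu and new_search are illustrative only) whose effects on conversion rate simply add together, which real changes need not do.

```python
# Illustrative only: every number and change name here is made up.
# Assumes per-change effects are additive, which real changes may not be.
BASELINE = 10.0  # control conversion rate, in percent


def bundled_rate(effects):
    """Conversion rate of a variant that ships every change at once."""
    return BASELINE + sum(effects.values())


# Two very different realities behind the same bundled variant
# (effects are in percentage points).
scenario_a = {"new_menu": +2.0, "new_search": -1.0}  # one winner, one loser
scenario_b = {"new_menu": +0.5, "new_search": +0.5}  # two small winners

rate_a = bundled_rate(scenario_a)
rate_b = bundled_rate(scenario_b)

# The bundled test sees the same lift in both scenarios, so it cannot
# reveal that new_search is hurting performance in scenario A.
print(rate_a, rate_b, rate_a == rate_b)
```

Both scenarios produce the same bundled result, yet they call for opposite follow-up decisions about the second change. Testing each change on its own is what separates them.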

A/B testing is the right way to redesign

The scientific way to reach a redesigned vision is to A/B test your hypotheses one at a time in sequential tests. There are advanced methods for experimentation with a higher degree of variation, but they are rarely necessary and require a higher order of precision from both the data and the experiment design.

A/B testing diagram using clusters of yellow, orange, grey, and green dots and black arrows

As shown above, you may find that an individual test is a loser versus the control. Learning this is a good outcome, as you are able to go back to your previous optimum. Learning this in a controlled environment allows you to cut losses in underperforming areas and iterate back into winning experiences.
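The loop described above can be sketched in a few lines. This is one possible reading, not the method used in the article: it assumes each test is a fixed-size experiment judged with a pooled two-proportion z-test at a 5% significance level, and the test names and counts are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def run_sequence(tests, alpha=0.05):
    """Evaluate single-hypothesis tests in order.

    Each test is a tuple of (name, control_conversions, control_visitors,
    variant_conversions, variant_visitors). A variant is adopted only when
    it beats its control with p < alpha; otherwise we stay on the previous
    optimum and move on to the next hypothesis.
    """
    adopted = []
    for name, conv_c, n_c, conv_v, n_v in tests:
        lift = conv_v / n_v - conv_c / n_c
        p_value = z_test_p_value(conv_c, n_c, conv_v, n_v)
        if lift > 0 and p_value < alpha:
            adopted.append(name)  # variant becomes the new control
        # else: loser or inconclusive, so keep the current control
    return adopted


# Hypothetical results; after a win, the next control reflects the new rate.
tests = [
    ("simplified_menu", 1000, 10000, 1100, 10000),    # clear winner
    ("larger_search_box", 1100, 10000, 1050, 10000),  # loser, rolled back
    ("shorter_labels", 1100, 10000, 1110, 10000),     # inconclusive
]
print(run_sequence(tests))
```

With these invented counts only the first change is adopted. The losing and inconclusive tests cost nothing beyond their own run time, because the control they were measured against is still in place.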

Big redesigns suffer when this happens as your options are either to roll back, attempt to iterate in areas of uncertainty, or live with a bad loss. Parsing through data from a complex redesign experiment often leads to further assumptions that reduce confidence in the next variation.

So how do you put iterative A/B testing into practice to drive a redesign (and learn from your mistakes)?

Capital One homepage navigation redesign example

At Capital One, we faced this challenge when experimenting on our website navigation redesign, shown below. If we had simply tested the dozens of proposed changes in a single test, we’d have gotten a result with no clear explanation of why, and no pathway to deeper optimization.

Screenshot of the Capital One website homepage with a grey menu and a blue arrow pointing to the new homepage with red squares

Going to market with the complete concept via a big bang test would have left many questions unanswered and little direction for future tests. Instead of moving forward as a single redesign effort, we resolved to decompose the design into intermediate tests to learn the contribution effects of intermediate changes and to identify areas where further optimization was warranted.

The ideal situation, of course, would have been to compose iterative experiments based on observations with a north star user experience in mind, rather than a pre-packaged destination. With these tests behind us, our team is now focused on testing more deliberately and iteratively in the future.

Looking at the redesigned experience, our tech, design and product team started by noting every change the redesign proposed (there were a lot!) and bucketed the changes into hypotheses we could test with our existing design framework. We added new instrumentation to better track the changes and user flows, to further enhance our reads.

At the end of these tests, we had a few actionable insights, but largely saw performance in line with our existing navigation. This built confidence that the larger concepts were not a significant risk. When testing the complete concept head-to-head versus our control, we were rewarded with an inconclusive result.

Follow-up tests are currently underway, with the first aimed at improving user pathing to the Pre-Qualification page for some of our card products. From our insights, we hypothesized that improving the visibility of the link would lead to improvements in our core metrics, including improved visitor satisfaction. To quickly test this hypothesis, we built and shipped a test prominently displaying “See if You’re Pre-Qualified” in a call to action. Results from this test are still pending.

Screenshot of the Capital One credit card menu with blue, grey, orange, and green credit cards in a line from top to bottom

Putting insights to use

So how can you take data insights from your experiments and rapidly turn them into strong follow-up tests?

Collect KPI and anonymous page interaction data from previous tests.
Identify areas and interactions where control and test vary.
Assess the opportunity of closing, or exceeding, the gap: is this worth doing?
Formulate a hypothesis.
Figure out how to test the hypothesis quickly and cheaply to learn if the hypothesis is valid.
Figure out how many people need to see the test.
Build and ship the test. (ABS: Always Be Shipping)
Collect data and conduct analysis.
Iterate and continue.
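The "how many people need to see the test" step can be estimated before anything is built. The sketch below uses the standard normal-approximation formula for comparing two conversion rates. The 5% significance level, 80% power, and the example rates are assumptions chosen for illustration, not values from the tests described here.

```python
from math import ceil, sqrt
from statistics import NormalDist


def visitors_per_arm(baseline_rate, min_detectable_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed in EACH arm of a two-sided A/B test.

    baseline_rate: control conversion rate, e.g. 0.10 for 10%
    min_detectable_lift: smallest absolute lift worth detecting, e.g. 0.01
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_lift
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical example: 10% baseline, detect a 1 percentage point lift.
print(visitors_per_arm(0.10, 0.01))
# Halving the detectable lift roughly quadruples the traffic required.
print(visitors_per_arm(0.10, 0.005))
```

Running this before the build step also feeds the "is this worth doing" assessment: if the required traffic would take months to collect, a bolder hypothesis or a higher-traffic page may be the cheaper test.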

I’d like to leave you with a powerful reminder that data is both the catalyst and the foundation of your experimentation program. Observe, measure, hypothesize, win.

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” - Sherlock Holmes

Vito Covalucci, Digital Product Manager

Vito Covalucci is a digital product management leader at Capital One. He's passionate about digital experimentation and the role of machine learning in helping solve financial problems.