Change a button, gain 20%—the scientific way

February 2, 2008

Websites are living entities, and they must change to reflect their users’ needs and expectations. The problem is—how do you tell what your users need and expect?

The easiest way is, obviously, to ask them, but this has a number of drawbacks. First, a large bias is introduced by preferences that users are not conscious of. For example, certain colours are more attractive than others, and the position of a particular element can make it stand out more on the page in which it resides. Under these circumstances, users won’t be able to tell you why they like something, let alone how to make it better. Second, conducting surveys is very expensive and slow and, at the very best, can only indicate what changes might be beneficial; you will still have to implement those changes and hope that your analysis is correct.

Blue elephants are tough sales
Given that a direct question won’t work, the next best idea is to turn your website into a lab and conduct experiments that analyze user behaviour. To avoid affecting the outcome, your users should not know that they are participating in an experiment. Ideally, a good research technique should have these characteristics:

  • Be simple to implement and easy to automate
  • Perform many experiments at once and be capable of measuring their individual effects as well as how they interact with the other experiments
  • Incorporate research and deployment in a single system: once your experiment is complete, you already have a working solution, as opposed to ideas that still need validation

Selling Blue Elephants, by H. Moskowitz and A. Gofman, is a great book that explains the basics of Rule Developing Experimentation, a statistical research methodology that incorporates precisely these three characteristics. RDE works by combining a series of individual dynamic elements and presenting them in a variety of ways to a focus group of users. It then correlates the success of each element, individually and in groups, with the group’s participants to extrapolate a “winning combination” that achieves a given goal.

For example, you could be a pizza sauce manufacturer trying to determine the recipe that best targets your customer base. An RDE expert will probably start by analyzing the individual ingredients and their effect on the pizza sauce from the point of view of a customer: whether they increase the spiciness, affect the overall taste, change the texture, and so on. Given any number of these parameters, an RDE system will generate a number of statistically determined combinations and then distribute them among a group of testers. At the end of the process, each tester is asked to fill in a questionnaire and provide his or her opinion on each combination.

The RDE analyst will then be able to extrapolate the influence of each variable from the data and come up with a “best” combination that targets a specific demographic aspect of the intended audience (I am oversimplifying here—if you want more detail, buy the book!). For example, it would be possible to tell that a “spicy” combination of cumin and paprika appeals to a population that is predominantly male and between 18 and 35 years of age. Perhaps more interestingly, the end result is ready for production, because you don’t just know that young men like spicy sauces—you know that they like a particular sauce.
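To make the idea of “extrapolating the influence of each variable” a little more concrete, here is a deliberately crude sketch in Python. It is not the method described in the book; it only compares the average rating of combinations that contain an ingredient with the average of those that do not. The ratings and the `main_effect` helper are made up for illustration.

```python
from statistics import mean

# Hypothetical tester ratings (1 to 9) for four sauce combinations.
# Each entry is (set of ingredients in the sauce, rating). Illustrative data only.
ratings = [
    ({"cumin", "paprika"}, 8),
    ({"cumin"}, 6),
    ({"paprika"}, 7),
    (set(), 4),
]

def main_effect(ingredient, ratings):
    """Average rating with the ingredient minus average rating without it."""
    with_it = [score for combo, score in ratings if ingredient in combo]
    without = [score for combo, score in ratings if ingredient not in combo]
    return mean(with_it) - mean(without)

for name in ("cumin", "paprika"):
    print(name, main_effect(name, ratings))
```

A real analysis would also break the ratings down by demographic group and look at how ingredients interact, which is where a proper statistical model earns its keep.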

Web users don’t have taste buds
It’s a little difficult to make web users fill in meaningful questionnaires, but a website gives you the best possible way to tell whether a particular change is successful: you can measure its effect on user behaviour.

For example, suppose your site sells shoes. How do you know if your product pages are “good” at turning your users into customers? Simple: you start by measuring the effectiveness of your current page, then introduce an alternative that is served randomly to your users until you have enough information to decide which one works best. Your conversion goals could be things like the user reaching a given page, or making a purchase, or remaining on your site for a minimum amount of time, and so on.
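If you were to build this by hand, the core of it might look like the Python sketch below. It makes one assumption that goes beyond the description above: instead of picking a page at random on every request, it hashes a user identifier, so the split is still roughly even but a returning visitor keeps seeing the same version. The function names and the `user_id` parameter are hypothetical.

```python
import hashlib

def assign_variant(user_id: str, variants=("A", "B")) -> str:
    """Pick a page variant for a user.

    Hashing the id (an assumption of this sketch) spreads users roughly
    evenly across variants while keeping each user's assignment stable.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def conversion_rate(visitors: int, conversions: int) -> float:
    """Fraction of visitors who reached the conversion goal."""
    return conversions / visitors if visitors else 0.0

# Usage: decide which product page to serve, then compare rates later.
page = assign_variant("visitor-1234")
print(page, conversion_rate(visitors=200, conversions=10))
```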

If you prefer, you can test more granular changes—for example, move an image, change a title, and so on, and measure the effect that these individual differences have on the page’s overall performance.

These two types of tests are not directly related to RDE, but can be just as useful. The first kind—in which a whole page is replaced—is called A/B testing, while the second kind, in which individual elements are mixed up in various ways to find the best possible combination, is called multivariate testing.
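To get a feel for how quickly a multivariate test grows, the following Python sketch lists every combination of a few page elements. The element names and their alternatives are invented for the example, and a testing tool may well choose to run only a subset of these combinations.

```python
from itertools import product

# Hypothetical page elements and their alternatives (illustrative only).
elements = {
    "title": ["original", "short"],
    "image_position": ["left", "right"],
    "button_colour": ["red", "green", "blue"],
}

def all_combinations(elements):
    """Return every possible page as a dict mapping element name to chosen value."""
    names = list(elements)
    return [
        dict(zip(names, values))
        for values in product(*(elements[name] for name in names))
    ]

combos = all_combinations(elements)
print(len(combos))  # 2 * 2 * 3 = 12 distinct pages
```

Every extra element multiplies the number of pages, and each page needs enough visitors to produce a meaningful conversion rate, so it pays to test only a handful of elements at a time.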

Google is your friend
Getting either type of testing working on your site sounds rather complicated, and would probably be if you had to do it on your own. Luckily, that’s not the case: if you are a customer of Google’s AdWords program, you have access to a nifty little tool called Google Website Optimizer (GWO).

GWO allows you to create either A/B or multivariate tests and automate their execution on your site. You can run any number of different tests at the same time, and all you need is a modicum of patience and a bit of dedicated JavaScript code on your site.

To give you a very simple example, we recently ran a multivariate test on our magazine landing page to determine how effective our “add to basket” and “buy now” links are. To test the effectiveness of various possible solutions, we made the test replace our traditional, text-only links with graphical buttons of various colours—red, green, blue and grey.

Without giving away too much information, you can see that the various combinations yielded wildly varying results (which, of course, shows the importance both of the test itself and of the fact that you will never find out these details until you test them):

You’re probably wondering what changes in the winning “Combination 5” could improve our ability to sell magazines by 25%. Believe it or not, the only difference on that page is that the two text links have been replaced by two red buttons. That’s it: nothing more.
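If you want to sanity-check a figure like this yourself, the arithmetic is simple. The Python sketch below computes the relative improvement between two conversion rates and runs a standard two-proportion z-test to estimate whether the difference could be down to chance. The visitor and conversion counts are made up for the example; they are not our actual numbers, and this is not necessarily the calculation GWO performs.

```python
from math import erf, sqrt

def relative_lift(control_rate: float, variant_rate: float) -> float:
    """Relative improvement of the variant over the control (0.25 means 25%)."""
    return (variant_rate - control_rate) / control_rate

def two_proportion_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:  # no conversions at all, or everyone converted
        return 0.0, 1.0
    z = (p_b - p_a) / se
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - normal_cdf)

# Hypothetical counts: 40 sales from 1,000 visitors vs 50 from 1,000.
lift = relative_lift(40 / 1000, 50 / 1000)
z, p = two_proportion_test(40, 1000, 50, 1000)
print(f"lift={lift:.0%} z={z:.2f} p={p:.2f}")
```

With these invented counts the lift is 25%, yet the p-value comes out well above the usual 0.05 threshold. That is a useful reminder that a test needs to run long enough for the difference to be trustworthy.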

To be sure, setting up the test takes some time and, therefore, some money. In this particular case, let’s assume it took one hour to set up the HTML code of the pages so that the test could run, and maybe another hour (in reality, just a few minutes) to create the various buttons. The results, however, are well worth it—particularly if you consider how tiny and simple the changes are.