Death of Pop A/B Testing


TL;DR: A/B testing only works for really large platforms with dedicated experimentation teams and statisticians on board. If you are a small fish in the pond, you are better off with web personalization and established UX industry best practices.
In this blog post I want to talk about two things – scientism and A/B testing.

Scientism sounds like something that will bore you to death. But because it’s crucial to understanding what is going on in the web experimentation (aka A/B testing) world, I am gonna go ahead with a definition from Wikipedia:

  • Scientism is the ideology of science. The term scientism generally points to the facile application of science in unwarranted situations not amenable to application of the scientific method.

A/B testing, at least as it’s marketed and performed in most situations, exemplifies scientism in the business world. People in marketing departments, with good intentions, try their hand at empirical verification methods.

But A/B testing is almost dead everywhere except at really large tech companies and commercial platforms.

The biggest player in the industry, Optimizely, is pivoting to web personalization after retiring its free plan. Report after report is coming out detailing how people delude themselves by mismanaging and misinterpreting A/B testing outcomes, whether intentionally or through confirmation bias.

VentureBeat summarized it best:

  • In other words, A/B testing is a rich person’s game. It requires enough traffic, experience, and information to test quickly and with high confidence. If you don’t have the necessary resources to set up A/B tests correctly, you are better off avoiding the risk of operating under false assumptions.

This is how Google Optimize, a powerful new free tool from Google (and a threat to Optimizely, VWO etc.), explains itself to people:

[Image: Google Optimize]

See that heading? Test, adapt, personalize. Not too long ago, experimentation tools focused only on the testing side of the equation, i.e. CRO, A/B testing etc.

But after the disillusionment with A/B testing, the world is quietly moving away from pure testing toward personalization or a mix of both.

But if A/B testing was pitched as a scientific method of achieving the best conversion rates, what went wrong?

What is A/B testing?

In case you are unfamiliar with A/B testing, here’s how Optimizely, one of the biggest evangelists of A/B testing, defines it:

  • A/B testing (also known as split testing or bucket testing) is a method of comparing two versions of a webpage or app against each other to determine which one performs better. AB testing is essentially an experiment where two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal.

So basically, you present two or more versions of a page on your website to users and measure which one performs better. The better variation, the one your users like most, produces more of the results you want, i.e. more purchases, more sign-ups etc.

And the variation that performs best goes live on the production site. This hypothesis-based testing comes in many flavors: A/B testing, A/B/n testing, multivariate testing etc.

And A/B testing works perfectly in theory, but in the real world there are too many obstacles in the way of successful A/B runs. Here are some of them:

  • Not having a proper hypothesis before starting tests
  • Not having enough traffic, i.e. an inadequate sample size
  • Stopping tests before you have statistically significant results
  • Managers’ inability to understand advanced statistical methods
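
The third obstacle, stopping too early, is easy to demonstrate. What follows is a minimal, hypothetical simulation and not something taken from any of the tools named in this post. It runs A/A tests, where both variants share the same true conversion rate, so every "winner" is a false positive. It then compares checking the result once at the end with peeking after every batch of visitors. The function names, the 5% conversion rate, the batch size and the seed are all assumptions picked for illustration.

```python
import random
from statistics import NormalDist


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 1.0  # no conversions at all yet, nothing to compare
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_peeking(runs=400, peeks=10, batch=200, rate=0.05,
                     alpha=0.05, seed=1):
    """Return (false positive rate with peeking, rate with one final check).

    Both variants convert at the same `rate`, so any significant
    result is a false positive by construction.
    """
    rng = random.Random(seed)
    peeking_hits = 0
    final_hits = 0
    for _ in range(runs):
        conv_a = conv_b = visitors = 0
        saw_winner = False
        last_p = 1.0
        for _ in range(peeks):
            # Add one more batch of simulated visitors to each variant.
            conv_a += sum(rng.random() < rate for _ in range(batch))
            conv_b += sum(rng.random() < rate for _ in range(batch))
            visitors += batch
            last_p = z_test_p_value(conv_a, visitors, conv_b, visitors)
            if last_p < alpha:
                saw_winner = True  # a peeker would have stopped here
        if saw_winner:
            peeking_hits += 1
        if last_p < alpha:
            final_hits += 1
    return peeking_hits / runs, final_hits / runs


if __name__ == "__main__":
    peeking, fixed = simulate_peeking()
    print(f"false positives when peeking: {peeking:.1%}")
    print(f"false positives with one final check: {fixed:.1%}")
```

With a single check at the end, the false positive rate should sit near the nominal 5%. Declaring a winner the first time any peek looks significant pushes it well above that, even though nothing about the page changed.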

Let’s be honest here: A/B testing was never meant to be viable for everyone, and an experienced statistician would laugh at the way these tests are conducted inside companies’ marketing departments.

Before web testing was commercialized and marketed to businesses by firms like Optimizely and VWO, web experiments like these were confined to data teams inside large technology firms.

Even when you think you have a large enough sample size, i.e. enough website visitors, you are going to need serious statistical prowess to really nail down causality. And statisticians who specialize in running these kinds of experiments are too expensive for most businesses to afford.

Which brings us to the core of the problem – statistical misunderstanding and manipulation.

Misleading with statistics and getting misled with statistics

  • There are lies, damned lies and statistics. – popularized by Mark Twain

Statistical manipulation and the complexity of experimentation have long been well known in the business and government worlds. But the marketing world is just coming to the realization that these types of empirical experiments are way more complex than they seem.

[Image: A/B test sample size calculator]

As you can see from the above image, for a baseline conversion rate of 1%, which is typical of many real-world business websites, you are going to need at least 200,000 monthly hits just to be sure of a 10% relative change in the rate.

Take into account Optimizely’s incentive to persuade people to sign up for its service, and you can easily arrive at a number that is plainly out of reach for most small and medium business websites.
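
If you want to sanity-check a number like that yourself, here is a rough sketch using the textbook normal-approximation formula for comparing two proportions. It is not the formula behind the calculator in the image, which may well use different statistics. The function name, the 5% two-sided significance level and the 80% power are assumptions on my part.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Visitors needed in EACH variant to detect the given relative lift.

    Textbook two-proportion approximation; alpha is two-sided.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # power requirement
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # 1% baseline conversion rate, 10% relative lift (1.0% -> 1.1%)
    n = sample_size_per_variant(baseline=0.01, relative_lift=0.10)
    print(f"visitors needed per variant: {n:,}")
    print(f"visitors needed for an A/B pair: {2 * n:,}")
```

Under these assumptions the answer comes out at roughly 160,000 visitors per variant, so over 300,000 for a simple two-variant test. That is the same ballpark as the calculator's figure, and it grows quickly if you want to detect a smaller lift.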

So, when people run tests on their website, all they get is an illusion of getting something useful out of the results. In his whitepaper “Most A/B Tests Are Illusory”, Martin Goodson, research lead at Qubit, states the problem with the most common results people get running A/B tests:

  • In this article I’ll show that badly performed A/B tests can produce winning results which are more likely to be false than true. At best, this leads to the needless modification of websites; at worst, to modification which damages profits.
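
To see how a "winning" result can be more likely false than true, a bit of arithmetic helps. The sketch below is my own illustration of the general idea, not a reproduction of Goodson's calculations. The function name and the input numbers (10% of tested ideas really work, 30% power for an underpowered test, 5% significance level) are assumptions chosen to show the mechanism.

```python
def share_of_false_winners(prior_true, power, alpha=0.05):
    """Fraction of significant 'winners' that are actually false positives.

    prior_true: share of tested ideas that have a real effect
    power:      chance a test detects a real effect when there is one
    alpha:      chance a test flags a winner when there is no effect
    """
    true_winners = prior_true * power
    false_winners = (1 - prior_true) * alpha
    return false_winners / (true_winners + false_winners)


if __name__ == "__main__":
    # Underpowered test (assumed 30% power): most winners are false.
    print(f"{share_of_false_winners(0.10, 0.30):.0%} of winners are false")
    # Properly powered test (assumed 80% power): still a large share.
    print(f"{share_of_false_winners(0.10, 0.80):.0%} of winners are false")
```

With the assumed inputs, the underpowered case gives 60% false winners and the well-powered case 36%. The less traffic you have, the lower your power and the worse this ratio gets.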

And the really bad problems arise when people start testing multiple goals with multiple variants. Every extra comparison is another chance for a false positive, so the errors multiply.

To put it in plain English, unless you have more than 1M in monthly traffic and a really good team of data scientists and statisticians, you are probably deluding yourself with the results of the A/B tests that you are running.
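
How quickly the errors pile up can be sketched with the standard family-wise error rate formula. This assumes the comparisons are independent, which real goals on the same page usually are not, so treat it as a rough guide only. The function names are mine, and the 3 variants by 4 goals setup is a made-up example.

```python
def familywise_error_rate(comparisons, alpha=0.05):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** comparisons


def bonferroni_alpha(comparisons, alpha=0.05):
    """Stricter per-comparison threshold that keeps the overall rate near alpha."""
    return alpha / comparisons


if __name__ == "__main__":
    comparisons = 3 * 4  # hypothetical: 3 variants, each checked on 4 goals
    rate = familywise_error_rate(comparisons)
    print(f"chance of at least one false winner: {rate:.0%}")
    print(f"Bonferroni threshold per comparison: {bonferroni_alpha(comparisons):.4f}")
```

With twelve comparisons at the usual 5% level, the chance of at least one bogus winner is about 46%. The Bonferroni correction is the simplest fix, but the stricter threshold means you need even more traffic.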

The future of web experimentation

  • “If you torture the data long enough, it will confess.” – Ronald H. Coase

No matter what the marketing people over at these A/B testing tool firms tell you, A/B testing is just not a viable option for most businesses. Web personalization is a better option for most small and medium-sized businesses.

And it makes more sense if you really think about it. Why show every user the same homepage when everyone has a different psychology and context? Here’s how Optimizely defines web personalization:

  • Website Personalization is the process of creating customized experiences for visitors to a website. Rather than providing a single, broad experience, website personalization allows companies to present visitors with unique experiences tailored to their needs and desires.

Personalization is the primary reason behind the enormous success of social media platforms. When we are presented with something that is really tailored for us, we feel more connected.

There are really great service providers like Unless and Qubit that make it a breeze to set up personalization for your web pages.

I think that eventually all current A/B service providers will shift their focus toward personalization, and there will be even more awareness of the perils of scientism in the marketing industry.

