Written by Marijn de Römph and Behdad Khoshkhu on 7 April 2015

Cross-domain A/B testing kept simple

Our goal was to provide our customers with a consistent experience during their journey across our domains while they are part of an A/B test. We achieved this in a simple way by using our unique Coolblue-Session key.

For a few years now we have been using A/B testing to test all kinds of things on our websites. But until recently there was one thing bothering us: our third-party testing tool has trouble following our customers when they visit multiple domains.

As a result, a customer could be assigned to variation A while browsing www.consoleshop.be and variation B while on www.tvstore.be. This sudden change in behavior from our platform caused an unpleasant experience for our customers and influenced the test results, as they were effectively forced to interact with multiple variations during the very same visit.

After some brainstorming in search of a scalable solution that could be implemented quickly, we ran some experiments with our Coolblue-Session key to verify that it carries the attributes we were looking for.

To be more precise, we were looking for an identifier that is unique per visitor/device. It should also be totally random and evenly distributed. The even distribution was the key to our approach, since we wanted to use the identifier as the seed for assigning our customers to the different variations of multiple running A/B tests.

After doing enough research to convince ourselves that we had finally found what we were looking for, we decided to run a simple test against the Coolblue-Session key to confirm that it is evenly distributed.

We converted this unique hexadecimal value to a decimal number and then divided it by two to observe its remainder. The result was promising: out of 10,000 experiments against our candidate key, 50.1% of the remainders were 0 and the rest were 1. This simple experiment showed that it is evenly distributed.
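As a rough illustration of that check, the sketch below generates random hex strings as stand-ins for real Coolblue-Session keys (an assumption, since the actual key format isn't shown here) and counts how often the value is even.

```typescript
import { randomBytes } from 'crypto';

// Count how often a hex key, interpreted as a number, is even.
// randomBytes(16) is only a stand-in for real Coolblue-Session keys.
function remainderDistribution(samples = 10_000): number {
  let zeros = 0;
  for (let i = 0; i < samples; i++) {
    const sessionKey = randomBytes(16).toString('hex');
    if (BigInt('0x' + sessionKey) % 2n === 0n) {
      zeros++;
    }
  }
  return zeros / samples;
}

console.log(remainderDistribution()); // should hover around 0.5 for an even distribution
```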

Since we wanted to use this key as the seed for multiple A/B tests, we split the decimal number into chunks of three digits. That way we have enough seeds, one for each A/B test, and can expect them to behave randomly.

Each of these seeds corresponds to one test. Using modulo arithmetic we attach our customer to a single variation of each test, and this assignment won't change unless the user's key changes. We divide the seed by the number of variations in the test, and the remainder indicates which variation the user is attached to.

An example with two variations:
a % 2 = 0 → Control
a % 2 = 1 → Variation A

And one with three variations:
a % 3 = 0 → Control
a % 3 = 1 → Variation A
a % 3 = 2 → Variation B

etc. etc. etc.
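A minimal sketch of that assignment logic could look like the following; the example key value and the chunking helper are illustrative assumptions, not the exact production code.

```typescript
// Deterministically assign a visitor to a variation of one test,
// based only on their session key.
function variationFor(sessionKey: string, testIndex: number, variations: number): number {
  // Convert the hexadecimal key to its decimal representation.
  const decimal = BigInt('0x' + sessionKey).toString(10);

  // Split the decimal string into three-digit seeds, one per running test.
  const seeds: number[] = [];
  for (let i = 0; i + 3 <= decimal.length; i += 3) {
    seeds.push(Number(decimal.slice(i, i + 3)));
  }

  // The remainder picks the variation: 0 = Control, 1 = Variation A, 2 = Variation B, ...
  return seeds[testIndex] % variations;
}

// The same key always lands in the same variation for a given test.
const key = '1f0e9c4ab27d4c59';
console.log(variationFor(key, 0, 2)); // test 1, two variations
console.log(variationFor(key, 1, 3)); // test 2, three variations
```

Because the assignment depends only on the session key, every domain that sees the same Coolblue-Session key computes the same variation.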

By using Google Analytics custom dimensions we keep track of the assigned variations. The benefit of this is that our commercial teams can use the test variations as a segment in their Google Analytics reports.
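In browser code this boils down to setting a custom dimension before sending the hit. The dimension index and the label below are assumptions, since they depend on how the Google Analytics property is configured.

```typescript
// Assumes the standard analytics.js (Universal Analytics) snippet is already loaded,
// and that custom dimension 1 has been created for A/B variations in the GA property.
declare function ga(...args: unknown[]): void;

const variationLabel = 'Variation A'; // e.g. derived from the assignment above

// Attach the assigned variation to the pageview so it can be used as a segment.
ga('set', 'dimension1', variationLabel);
ga('send', 'pageview');
```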

As the number of concurrently running tests is at the moment limited to five, we are investigating a more future-proof solution. To remove this constraint, we need to persist the test variations for each Coolblue-Session in a data store. In the next couple of weeks we will try a range of different data stores to save the assignments. We are looking into a range of highly available, scalable and persistent key-value stores such as Redis and MongoDB.
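As a rough sketch of what that persistence could look like with Redis (using the node-redis client; the key layout and TTL are purely illustrative assumptions), each assignment would be written once and read back on every visit, regardless of domain.

```typescript
import { createClient } from 'redis';

// Store and read back a variation assignment per session key and test.
// A fresh client per call keeps the sketch self-contained; a real service
// would reuse a single connection.
async function persistAssignment(sessionKey: string, testId: string, variation: number): Promise<void> {
  const redis = createClient();
  await redis.connect();
  // Keep the assignment for 30 days (illustrative TTL).
  await redis.set(`ab:${sessionKey}:${testId}`, String(variation), { EX: 60 * 60 * 24 * 30 });
  await redis.quit();
}

async function readAssignment(sessionKey: string, testId: string): Promise<number | null> {
  const redis = createClient();
  await redis.connect();
  const stored = await redis.get(`ab:${sessionKey}:${testId}`);
  await redis.quit();
  return stored === null ? null : Number(stored);
}
```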

We will try to share the results of these tests on the blog as well.

Marijn de Römph and Behdad Khoshkhu

Replies (2)

  1. 2 years ago

    Gijs,

    Certainly every running test (to a greater or lesser extent, depending on the domain it has been introduced on) has an influence on the entire user experience.

    At the moment our “cross-domain” tests are not directly connected to user accounts; this is a feature on our backlog that requires integration with our identity service, so we cannot “directly” measure the influence of our tests.

    However, there are other approaches; for example, by monitoring our metrics we can easily correlate the influence of a running test with offline behavior.

    Behdad.

  2. 2 years ago

    Very interesting approach.
    Question: do you believe that there’s a different propensity to contact your contact center in different variations?

    If yes, how is “offline” behavior taken into account when assessing the outcome of an online test?
