Did you ever realize that your raw data is lying to you?
In the age of big data, site optimization is a key objective for nearly all businesses and organizations with an online presence. Data can and should be used to find the best-performing versions of websites. Companies can also leverage data to create personalized user experiences, which benefits their audience as well as themselves. For example, a brand may want to address its female clientele in Brazil with one version of its site while creating another version for its male audience in Canada.
These ideas are great in theory but not so easy to put into practice. This is especially true when the data in question has strong bias. This bias can come in many shapes and forms: temporal changes in the system, exposure to different users, or products simply displayed differently (for instance, above or below the fold).
At Wix, for example, we optimize multiple lists with numerous elements. Many of these elements appear in different parts of the screen, often in locations that are known to strongly bias user interactions.
How to Tackle the Bias Problem
There are different ways to address the issue of bias. One of the most powerful is the “Multi-Armed Bandit Model”. This model is based on a strong randomization element, which helps to neutralize bias. The Bandit Model is effective because it relies on a mix of:
Exploring: Giving all elements an equal chance of being served.
Exploiting the data: Serving the element that is currently considered optimal.
Using the Bandit Model can help you test numerous variations of the site. As data accumulates, the model allocates more traffic to the best elements, while keeping a small portion of traffic random so that it can keep learning.
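The explore/exploit split described above can be sketched in a few lines. This is a minimal illustration using the epsilon-greedy rule, one simple bandit strategy, and not necessarily the scheme used in production. The names choose_variant, simulate and epsilon, as well as the click rates in the simulation, are assumptions made up for the example.

```python
import random

def choose_variant(clicks, views, epsilon=0.1, rng=random):
    """Return the index of the variant to serve next."""
    if rng.random() < epsilon:
        # Explore: every variant gets an equal chance.
        return rng.randrange(len(views))
    # Exploit: serve the variant with the best observed click rate so far.
    # Variants that were never shown count as 0.0 until they get data.
    rates = [c / v if v else 0.0 for c, v in zip(clicks, views)]
    return max(range(len(rates)), key=rates.__getitem__)

def simulate(true_rates, rounds=10_000, epsilon=0.1, seed=0):
    """Simulate traffic against hypothetical true click rates."""
    rng = random.Random(seed)
    clicks = [0] * len(true_rates)
    views = [0] * len(true_rates)
    for _ in range(rounds):
        i = choose_variant(clicks, views, epsilon, rng)
        views[i] += 1
        if rng.random() < true_rates[i]:
            clicks[i] += 1
    return clicks, views

if __name__ == "__main__":
    # Hypothetical click rates for three variants of a page element.
    clicks, views = simulate([0.05, 0.10, 0.20])
    print("views per variant:", views)
```

With epsilon set to 0.1, roughly 10% of requests stay random, which corresponds to the small portion of traffic reserved for learning. The remaining requests go to whichever variant currently looks best.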
There is another benefit to applying this model. Building live learning systems into the code keeps the site updated, dynamic and fresh. A new product will receive an initial amount of exposure. Based on its performance, it will then be placed high (above the fold) or buried at the bottom of the list.
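Traffic allocation under the Bandit Model can be driven by a method called Thompson Sampling. This method keeps a probability distribution over the reward of each element, draws one random sample from each distribution, and serves the element whose sample is highest. Elements with little data have wide distributions, so they still win a draw now and then, while elements that are clearly better win most of the draws.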
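A minimal sketch of Thompson Sampling for click / no-click rewards follows. It assumes a uniform Beta(1, 1) prior for every element, which is a common choice but an assumption here, and the function names and simulated click rates are hypothetical.

```python
import random

def thompson_choose(successes, failures, rng=random):
    """Sample a plausible click rate per element and serve the highest."""
    # Beta(successes + 1, failures + 1) is the posterior under a Beta(1, 1) prior.
    samples = [rng.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

def simulate(true_rates, rounds=5000, seed=0):
    """Run the sampler against hypothetical true click rates."""
    rng = random.Random(seed)
    n = len(true_rates)
    successes, failures = [0] * n, [0] * n
    for _ in range(rounds):
        arm = thompson_choose(successes, failures, rng)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures

if __name__ == "__main__":
    s, f = simulate([0.05, 0.10, 0.20])
    print("times served:", [a + b for a, b in zip(s, f)])
```

No explicit exploration rate is needed in this version: the randomness of the draws does the exploring, and it shrinks on its own as each element collects data.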
One of the top benefits the Bandit Model has over other machine learning models is that it does not assume an answer a priori. Instead, it listens to and learns from the live data. The Bandit Model does not presume to know which version better suits the Brazilian or the Canadian user. It does allow for two distinct versions, though, and it gradually converges towards them.
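One way to let each audience converge on its own version is to keep separate statistics per segment. The sketch below assumes the segments are known in advance and uses Thompson Sampling with a Beta(1, 1) prior inside each one. The class name SegmentedBandit and the segment labels are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

class SegmentedBandit:
    """One independent bandit per user segment (hypothetical helper)."""

    def __init__(self, n_variants, seed=None):
        self.n_variants = n_variants
        self.rng = random.Random(seed)
        # counts[segment][variant] = [successes, failures]
        self.counts = defaultdict(
            lambda: [[0, 0] for _ in range(n_variants)]
        )

    def choose(self, segment):
        """Pick a variant for this segment by sampling each posterior."""
        stats = self.counts[segment]
        samples = [self.rng.betavariate(s + 1, f + 1) for s, f in stats]
        return max(range(self.n_variants), key=samples.__getitem__)

    def update(self, segment, variant, converted):
        """Record whether the served variant led to a conversion."""
        self.counts[segment][variant][0 if converted else 1] += 1

if __name__ == "__main__":
    bandit = SegmentedBandit(n_variants=2, seed=0)
    variant = bandit.choose("brazil-female")
    bandit.update("brazil-female", variant, converted=True)
    print(dict(bandit.counts))
```

Nothing in the code says which variant belongs to which segment. Each segment starts from the same prior and drifts towards whatever its own users respond to.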
Posted by Doron Bar Tov