Causality is not overrated

In yesterday's post, my colleague Lars Trieloff introduced the Causality Trap and how it can lead marketers to systematically spend money on the wrong customers: those who are most valuable on their own, regardless of whether they receive advertising or not.


Blue Yonder founder Prof. Dr. Michael Feindt


This problem has recently been addressed by so-called uplift modeling, as advocated, for example, by Eric Siegel, Ph.D., founder of Predictive Analytics World.

From Wikipedia (Uplift modeling): Traditional response modeling typically takes a group of treated customers and attempts to build a predictive model that separates the likely responders from the non-responders through the use of one of a number of predictive modeling techniques. Typically this would use decision trees or regression analysis. This model would only use the treated customers to build the model. In contrast uplift modeling uses both the treated and control customers to build a predictive model that focuses on the incremental response. To understand this type of model it is proposed that there is a fundamental segmentation that separates customers into the following groups:

  • The Persuadables: customers who only respond to the marketing action because they were targeted
  • The Sure Things: customers who would have responded whether they were targeted or not
  • The Lost Causes: customers who will not respond irrespective of whether or not they are targeted
  • The Do Not Disturbs or Sleeping Dogs: customers who are less likely to respond because they were targeted

The only segment that provides true incremental responses is the Persuadables.

Uplift modeling provides a scoring technique that can separate customers into the groups described above.

Traditional response modeling often targets the Sure Things being unable to distinguish them from the Persuadables.
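To make the four segments concrete, here is a minimal sketch in Python. It assumes you already have two predicted response probabilities per customer, one if targeted and one if not targeted. The function name `classify_segment` and both thresholds are hypothetical choices for illustration; they come neither from the Wikipedia article nor from any particular product.

```python
def classify_segment(p_treated, p_control, min_uplift=0.05, high_response=0.5):
    """Assign one of the four uplift segments to a customer.

    p_treated: predicted response probability if the customer is targeted.
    p_control: predicted response probability if the customer is not targeted.
    min_uplift and high_response are illustrative thresholds (assumptions).
    """
    uplift = p_treated - p_control
    if uplift > min_uplift:
        return "Persuadable"      # responds noticeably more when targeted
    if uplift < -min_uplift:
        return "Sleeping Dog"     # responds noticeably less when targeted
    if p_control >= high_response:
        return "Sure Thing"       # responds either way
    return "Lost Cause"           # does not respond either way


if __name__ == "__main__":
    for probabilities in [(0.6, 0.2), (0.9, 0.9), (0.05, 0.04), (0.1, 0.4)]:
        print(probabilities, classify_segment(*probabilities))
```

A traditional response model only sees something like `p_treated`, which is why it cannot tell the Sure Things from the Persuadables: both have a high response probability when targeted.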

To discover the uplift, i.e. the incremental effect of a marketing measure, from the data, A/B tests must be conducted. Two groups are chosen at random, with one group receiving the treatment and the other not. The difference between the two groups in the prediction target (e.g. revenue) is then a measure of the causal effect of the marketing measure. If you want to add segmentation to the mix (e.g. younger vs. older customers), you end up with more than two groups and a more complicated validation. A/B testing always comes at a cost, since at least one group is handled non-optimally, which produces cost and/or loss. These costs have to be balanced against the advantages of better knowledge and the prospect of more effective targeting.
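The comparison described above can be sketched in a few lines of Python. This is a simplified illustration, assuming each group is given as a list of per-customer target values (e.g. revenue) and that assignment to the groups was random. The function names and the record layout are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def estimate_uplift(treated, control):
    """Return (uplift, standard_error) for two randomly assigned groups.

    Each group needs at least two observations for the variance estimate.
    """
    uplift = mean(treated) - mean(control)
    standard_error = sqrt(
        variance(treated) / len(treated) + variance(control) / len(control)
    )
    return uplift, standard_error


def estimate_uplift_by_segment(records):
    """records: iterable of (segment, was_treated, outcome) tuples."""
    groups = {}
    for segment, was_treated, outcome in records:
        treated, control = groups.setdefault(segment, ([], []))
        (treated if was_treated else control).append(outcome)
    # One treated/control comparison per segment, hence more groups to validate.
    return {seg: estimate_uplift(t, c) for seg, (t, c) in groups.items()}
```

Splitting by segment divides the same customers into smaller groups, so each per-segment standard error grows. That is one way to see why segmentation makes the validation more demanding.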

Uplift modeling ideas even helped swing the 2012 Obama campaign and ensure his re-election by concentrating efforts on the Persuadables instead of the Sure Things.

Uplift modeling is considered state-of-the-art. However, the work my team and I have conducted goes far beyond this. A new algorithm enables us to separate causal effects from purely accidental correlation in historical data. The algorithm is not universal, which means certain conditions must apply. For example, a high number of input parameters and a sufficiently long history of actions and responses need to be recorded. No unknown effects are allowed that influence both the action and the target simultaneously. And there must be some variation in the historical action selection. Fortunately, this is often the case in realistic database marketing data.
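To illustrate why these conditions matter, here is a sketch of a standard textbook technique, inverse propensity weighting. It is not the Blue Yonder causality algorithm, whose details are not described here. The sketch assumes that a single recorded attribute (a hypothetical `stratum` label) captures everything that influenced both the targeting decision and the outcome, and that every stratum contains both targeted and untargeted customers.

```python
from collections import defaultdict


def ipw_uplift(records):
    """Estimate average uplift from historical (non-randomized) data.

    records: list of (stratum, was_treated, outcome) tuples.
    Assumes no unrecorded effect drives both targeting and outcome.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_treated, n_total]
    for stratum, was_treated, _ in records:
        counts[stratum][0] += int(was_treated)
        counts[stratum][1] += 1

    treated_sum = control_sum = 0.0
    for stratum, was_treated, outcome in records:
        n_treated, n_total = counts[stratum]
        propensity = n_treated / n_total  # estimated chance of being targeted
        if not 0.0 < propensity < 1.0:
            # Without variation in past action selection there is nothing to compare.
            raise ValueError(f"no variation in targeting for stratum {stratum!r}")
        if was_treated:
            treated_sum += outcome / propensity
        else:
            control_sum += outcome / (1.0 - propensity)
    return (treated_sum - control_sum) / len(records)


def naive_difference(records):
    """Plain treated-minus-untargeted average, ignoring who was targeted and why."""
    treated = [y for _, t, y in records if t]
    control = [y for _, t, y in records if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)
```

If past campaigns mostly targeted customers who spend a lot anyway, `naive_difference` credits the advertising with spending it did not cause, while the weighted estimate corrects for the targeting pattern as long as the stated assumptions hold.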

This means that the true uplift of a marketing activity can be discovered (and predicted) even if the data only shows whether customers have been targeted (i.e. have been sent a catalog or email, or shown a display ad) and their target value.

One of the biggest advantages is that an A/B test is not strictly required and customers can benefit from increased ROI right from the start. However, partial random exploration may help to further improve targeting efforts and A/B tests might be required to win over skeptics.

In the A/B tests we have conducted with customers (generally retailers) using our causality algorithm for improved ad targeting, pilot projects have demonstrated extremely promising results, enabling customers to cut distribution (or ad spend) in half while keeping results the same. You could say we've really found out which half of your advertising budget is wasted.

Reducing cost alone is not enough for a growing company. Our pilot projects have shown that it is virtually always possible to find a point where results (whether engagement, revenue or profit) improve while cost stays constant or even falls.

Our causality algorithm relies heavily on properties of Blue Yonder's NeuroBayes™ algorithm. These include Bayesian statistics, the ability to deal with high dimensionality and arbitrarily correlated input, the ability to handle weighted events, and the ability to produce conditional probability densities. The causality algorithm also relies on the Blue Yonder Platform's ability to process massive data sets during training and to make a trained expert available for integration into an automated marketing process.

With large amounts of training data (including lots of conversion history) and high-dimensional input (many customer and behavior properties), the algorithm allows for multivariate individualization, which means it can learn the complicated behavior of individual customers and determine its impact on conversion.

With the introduction of causal predictive modeling, a whole new set of possibilities is emerging when it comes to the data's ability to tell our customers why things happen the way they do. Just think: better targeting for marketers plus more effective and individualized offers for customers - and that's just the beginning!

If you want to learn more about this topic, please write us an e-mail. We'll show you how causality can be used in your business.

Prof. Dr. Michael Feindt

is the mind behind Blue Yonder. In the course of his many years of scientific research activity at CERN, he developed the NeuroBayes algorithm. Michael Feindt is a professor at the Karlsruhe Institute of Technology (KIT), Germany, and a lecturer at the Data Science Academy.