The Role of Simplicity in Analytics

William of Ockham (c. 1287–1347) was an English Franciscan friar and scholastic philosopher who stated the basic principle of successful analytics as “entia non sunt multiplicanda praeter necessitatem,” literally translated as “entities should not be multiplied unless absolutely necessary.” During the Middle Ages, multiplication was considered to be a complex operation, and it was only done when strictly necessary. Nowadays, multiplication is simple and easy with the availability of computers, but the level of complex operations has substantially increased.

Translated to an analytical setting, Ockham’s principle basically states that analytical models should be as simple as possible, free of any unnecessary complexities and/or assumptions. Put in other terms, analytical models should be parsimonious and compact, giving a clear insight into the underlying patterns in the data. Ockham’s principle is also often referred to as Ockham’s razor, since the razor is used to shave away unnecessary complexities and/or assumptions. Today, the relevance of Ockham’s razor endures in analytics.

Analytical techniques come in many flavors, from complex mathematical models to simple, easy-to-understand, parsimonious models. An example of a complex model in analytics could be the following:

Probability(churn) = 2.34 - 3.58 * (age of customer)^2 * (income) - 6.22 * (days since last purchase)^3 - ...

The above model could have been built using a very complex analytical technique (e.g., neural networks). Such a model is very much a black box and thus offers little insight. A more comprehensible model could be based on a set of if–then business rules, such as:

IF income > 2000 AND customer is older than 45
THEN customer will not churn
IF days since last purchase is less than 30 AND gender = female
THEN customer will not churn
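
As a rough illustration, the two rules above could be written down as a few lines of Python. This is only a sketch: the Customer class, its field names, and the predict_churn function are hypothetical names chosen for this example, and returning None when no rule fires is an assumption, since the rules themselves say nothing about that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    # Hypothetical fields, named after the quantities used in the rules.
    age: int
    income: float
    days_since_last_purchase: int
    gender: str


def predict_churn(customer: Customer) -> Optional[bool]:
    """Apply the two if-then rules in order.

    Returns False ("will not churn") when a rule fires, and None when
    neither rule applies (an assumption made for this sketch).
    """
    # Rule 1: income > 2000 AND customer is older than 45
    if customer.income > 2000 and customer.age > 45:
        return False
    # Rule 2: fewer than 30 days since last purchase AND gender is female
    if customer.days_since_last_purchase < 30 and customer.gender == "female":
        return False
    # No rule fired: the rule set makes no statement about this customer.
    return None
```

Each condition maps one-to-one onto a line a business user can read and challenge, which is the kind of transparency the rule set is meant to offer.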

These rules are clearly a lot more comprehensible. In analytics, interpretable models are highly preferred because they give a thorough understanding of the underlying problem. An important reason for this is that analytics should be actionable, in the sense that the models should give us insight into what actions could be undertaken to improve customer retention. Interpretability often involves a trade-off with model performance. Consider the complex mathematical formula depicted above. It may well be that this formula predicts customer churn more accurately than a set of simple if–then rules. This is a trade-off that needs to be evaluated, typically by the decision maker in collaboration with the data scientist. The experience and education of both will play a crucial role in resolving this trade-off. It is important to note that although simplicity is an important focus, it should not become a blind obsession. In other words, when reality is complex, the analytical model should be given enough room to take this into account appropriately.

Continuing with the pursuit of simplicity and understandability, analytics should focus on solutions rather than techniques. Far too many software tools and consulting offerings available in the industry nowadays provide a whole range of complex analytical techniques for which it is not immediately clear how they can be applied to solve practical problems. In order to bridge the gap toward decision makers and increase the chances of success, software should be oriented toward solutions, such as managing churn or fraud or dealing with credit risk, rather than techniques.

Finally, in order to manage the successful deployment of analytical models and safeguard their simplicity, corporate governance and management oversight are needed. Appropriate organizational procedures should be put in place to safeguard simplicity and thus comprehensibility. A very useful idea here is the concept of an analytical model board, which is a group of people who closely follow the development of the analytical model. It includes all business users who will end up using the analytical models, because if people don’t understand a model, they simply will not use it. By including business users from the outset of the analytical model development, we can ensure they clearly understand the model (and its limitations), and improve our chances of successful analytical model deployment.

To summarize:

  • Ockham’s razor is key to successful analytics. Models should be kept as simple as possible, but no simpler.
  • Analytical software and consulting solutions should be solution-oriented and not technique-oriented.
  • Corporate governance and management oversight are critical to the success of analytics. One way to achieve this is by using the concept of model boards.



Posted November 12, 2015

