Understanding the LIFT CHART

The lift chart is synonymous with evaluating data mining model performance and comparing the predictive power of one model against another. Often, in presentations and training sessions, it is suggested that the chart indicates the model's ability to accurately predict within a training population. For example, the following explanation is provided:

“the lift chart shows that this model is good because it only needs to evaluate 30% of data in order to correctly predict all the target outcomes”

This type of statement is simply not true – it is INCORRECT, WRONG, MISLEADING and shows a lack of understanding about what the chart represents. This post explains the chart by examining how it is created – seeking to remove some of the misconceptions about its use.

Consider the following example. In this chart it would be argued that an ideal model (the red line) would only need ~55% of the population in order to predict all the target states. Without a model, we would need to use the entire population, and so our model (being somewhat useful) lies between the best model and a random guess. These values can be determined by the point at which each model's curve reaches 100% on the Y axis (note that at 55% of the population, the best model achieves 100% accuracy).

Another common question arising from the interpretation of this chart occurs when we know that the target (predicted) state is found in only 55% of the population. The question is: “why do we show 100% accuracy when only 55% of the population can exist in the predicted state – surely the Y axis should have a maximum of 55%?”

For my own analysis, I shall ask a question of the reader so that the construction of the chart can better be understood. The question is simple.

If my model does not predict the correct result 100% of the time, how could my accuracy ever achieve 100%? Let’s be realistic, it would have to be a pretty impressive model to never be wrong – and yet this is what the chart always shows: 100% accuracy!

Now let’s look at construction

In order to create a lift chart (also referred to as a cumulative gains chart), the data mining model needs to be able to report the probability of its prediction. For example, we predict a state and the probability of that state. Within SSAS, this is achieved with the PredictProbability function.
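To make this concrete outside of SSAS, here is a minimal Python sketch. It uses scikit-learn purely as a stand-in for the mining model, with predict_proba playing the role of PredictProbability; the dataset, features and model are invented for illustration (roughly 3,220 test cases, to mirror the numbers used later in this post).

```python
# A stand-in for "predict a state and the probability of that state".
# scikit-learn's predict_proba is used here only as an analogue of PredictProbability.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=6440, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=3220, random_state=0)          # ~3,220 test cases

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

predicted_state = model.predict(X_test)             # the predicted state
predicted_prob = model.predict_proba(X_test)[:, 1]  # probability of the target state
```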

Now, since we can include the probability of our predictions, we would naturally order the test data by predicted probability, using it as the likelihood that a test case is in the predicted target state. Put another way, if I could only choose 10 test cases (that is, observations from the testing data) as candidates for the predicted value, I would choose the top 10 cases based on their predicted probability – after all, the model is suggesting that these cases have the highest probability of being in the predicted state.
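Continuing the sketch above (the variable names are carried over and remain illustrative), ordering the test cases by their predicted probability and picking the most confident ones looks like this:

```python
# Order test cases from most to least confident; the top of the list is where
# the model says the target state is most likely.
order = np.argsort(predicted_prob)[::-1]  # indices, descending by probability
top_10 = order[:10]                       # the 10 cases I would choose first
```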

As we move through the test data (and the predicted probability decreases), it is natural to expect the model to become less accurate – it will make more false predictions. So let’s summarise this test data. For convenience, I have grouped the data into 10 bins, with each bin holding ~320 cases (the red line below). Working with the assumption that the predictive power of my model decreases with probability, the number of correct predictions also decreases as we move through more of the data. This is clearly visible in the chart and data below – the first bin has high predictive power (275 correct predictions) while the last bin has only 96 correct predictions.
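In the same sketch, the binning step might look like the following (10 equal bins over the probability-ordered test cases; the counts will differ from the 275 and 96 figures above, which come from my own model):

```python
# Split the probability-ordered test cases into 10 bins (~322 cases each) and
# count, per bin, how many cases really are the target state – i.e. how many
# of the model's target predictions in that bin are correct.
bins = np.array_split(order, 10)
cases_per_bin = [len(b) for b in bins]
correct_per_bin = [int((y_test[b] == 1).sum()) for b in bins]
```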

If I focus on the model's ability to correctly predict values, I will notice that it can predict 1,764 correct results – but now let’s turn our attention to the cumulative power of the model. If, from my sample data, I could only choose 322 cases (coincidentally, this is the number of cases in bin 1), I would choose all cases from bin 1 and get 275 correct (or 16% of the possible correct values). If I had to choose 645 cases, I would choose the cases from bins 1 & 2 and get 531 correct (30% of the correct possibilities). This continues as I make more predictions and is summarised in the following table.
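The running totals behind that table can be computed directly from the per-bin counts in the sketch (the 275 → 16% and 531 → 30% figures are this post's own example numbers, not what the toy model would produce):

```python
# Running (cumulative) correct counts and their share of all correct predictions.
total_correct = sum(correct_per_bin)          # 1,764 in this post's example
running_correct = np.cumsum(correct_per_bin)  # 275, 531, ... in this post's example
pct_running_correct = running_correct / total_correct * 100
```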

This data is transposed onto the lift chart – the bin on the X axis (representing the % of the population) and the running percent correct on the Y axis (representing the percentage of correct predictions captured so far). As we can see, the chart indicates the model's ability to quickly make accurate predictions rather than its overall predictive ability.
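Plotting that running percentage against the percentage of the population examined gives the model's curve on the lift chart (matplotlib is assumed here purely for illustration):

```python
# The model's lift / cumulative-gains curve: % of population examined (X)
# versus % of correct predictions captured so far (Y).
import matplotlib.pyplot as plt

pct_population = np.cumsum(cases_per_bin) / len(y_test) * 100
plt.plot(pct_population, pct_running_correct, label="model")
plt.xlabel("% of population")
plt.ylabel("% of correct predictions captured")
plt.legend()
plt.show()
```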

Best and Random Cases

The chart also includes best and random cases as guidelines for prediction – let’s focus on these. These lines are theoretical – really ‘what if’ scenarios.

Suppose that I had an ideal model. If this were the case, my model would predict 322 correct in bin 1, 323 in bin 2 and so on – it must, because we have ordered the data by PredictProbability and in a perfect world we would get them all correct! However, the model can only predict 1,764 correct values – we know this from the actual results. Because of this, we would only need up to bin 6 to capture all our correct values (see the column ‘Running Correct Best Case’ in the following table). Just as we did for the model's predictions, we can convert this to a percent of the total correct and chart it with the model.
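Following the same logic in the sketch, the best-case line simply assumes every case chosen is correct until the total number of correct values is exhausted:

```python
# "Ideal model" line: each bin contributes its full size until all of the
# correct values have been used up (reached at roughly bin 6 in this post's example).
best_case = np.minimum(np.cumsum(cases_per_bin), total_correct)
pct_best_case = best_case / total_correct * 100
```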

Now for the random guess – again this is theoretical. I know that I can only predict 1,764 correct values so, if these were evenly distributed amongst my bins, I would have ~176 correct predictions in each bin. This is then added to the chart.
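And the random-guess line spreads those same correct values evenly across the bins (~176 per bin in this post's example):

```python
# Random-guess line: the correct values are assumed to fall evenly across the
# 10 bins, so each bin contributes one tenth of the total correct.
random_guess = np.cumsum([total_correct / 10] * 10)
pct_random = random_guess / total_correct * 100
```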

What is the Problem?

Now we can see that the chart is essentially just a view of how quickly the model makes accurate predictions. Perhaps there is nothing wrong with that, but what happens when we compare models? Well, in this case, the comparison is relative. These steps are reproduced for each model, and what you essentially see is relative comparative performance. Thus, the comparison of two models in these charts gives NO indication of predictive accuracy – after all, how could it, when each curve plots the relative percentage of its own correct predictions?

For this reason, relying on this chart as the sole measure of accuracy is dangerous and really shows very little about the overall accuracy of the model.

Conclusion

Understanding how the lift chart is constructed helps in understanding how to interpret it. We can see that it indicates the cumulative power of the model to give predictions – or, perhaps more correctly, how quickly the model accumulates its correct predictions.

Presenting Sessions on Data Mining

I am currently preparing a few presentations on using Data Mining in business intelligence.  These will be at the Brisbane SQL Server user group (this month) and SQL Rally Amsterdam (in Nov).  I am especially looking forward to Amsterdam because this will be my first trip to the Netherlands.

The application of data mining within an organisation is an interesting topic for me, one which I liken to a milestone in organisational maturity. When most people discuss business intelligence (and analytics), they are talking about the production of a system so that the end user can either get canned reports quickly or (more importantly) interrogate data in a real-time manner. After all, this was the intent of OLAP! – the idea that the end user can drive data in real time so that once they think of a problem and a potential solution, they can verify their thoughts against data.

However, as good as this type of system is (compared to the starting base), stopping the BI journey here has its shortcomings. That is, the user needs to think of a solution to a problem and then verify it against the data. But what do we verify against? – and when do we stop investigating? This is one of the major benefits of data mining. It allows a scientific analysis of a targeted problem without the limitations of our dimensional thought (after all, we can only think in a relatively small number of dimensions & attributes at a time).

SQL Stream @ BBBT

Most presentations to the BBBT focus on delivering better and more scalable reporting & analytical solutions. While this may be a gross oversimplification of the vendors that present, a general summary is that most focus on the delivery of a full platform (cloud or on-prem) or an improved method of analytics and reporting under an existing architecture (the more traditional BI implementation of stage, transform and storage, which includes those dreaded latency issues).

These are very valid applications for BI (and, to be truthful, account for 99.9% of requirements); however, sometimes a technology is presented that is so completely different that it is worthy of a note – enter SQL Stream. This offering seems truly remarkable: a technology that streams big data so that it can be queried using SQL. The outcome – real-time reporting and alerting over blended big data (yes, combining different sources) – sounds cool, eh? Well it is!

Finally, I must also give my apologies to all the other vendors that have presented great products to the BBBT and have not been mentioned (on this blog). As stated, these products cover the traditional implementations of BI that are encountered on a day-to-day basis. Check out the BBBT website if you are interested in what the BBBT is, its meetings, podcasts and membership.

thenakedleaf blog

I have long been a fan of the Jedox product for its write-back and text capabilities in OLAP. Its ability to publish Excel pages to the web (server) allows reports (and input forms) to be created quickly. There are some very impressive methods for write-back and it can be a great piece of technology (in the right environment, of course)!

Now, thanks to Chris Mentor there is a new practical blog about using Jedox, with tips, tricks and explanations. If you're interested in the product or want to extend your understanding of the toolset, then the site should be on your reading list.