OTT Churn modeling – From statistics to big data and artificial intelligence

For decades, highly accurate predictive results have been obtained by studying representative samples of a population. Specifically, with a random sample of only about 1,100 people (largely regardless of the size of the total population), the results hold at a 95% confidence level with a margin of error of only about 3%. Is this magic? No. It’s mathematics, specifically, statistics.

Statistics therefore makes the following kinds of predictions possible, at relatively low cost, by “simply” taking a random sample of about 1,000 people and running a well-designed survey:

  • What percentage of the population will vote for a specific political party, thus forecasting the election results;
  • What percentage of people will buy a particular product, and therefore how many units we should manufacture;
  • What percentage of people will watch a specific movie, giving us insight into the number of copies we should make and distribute.
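The sample sizes quoted above can be checked with the standard margin-of-error formula for a proportion estimated from a simple random sample. The sketch below is only an illustration: it assumes a large population, the worst-case proportion p = 0.5, and the usual z-score of 1.96 for a 95% confidence level. The function names are chosen here for the example and do not come from any particular library.

```python
import math

Z_95 = 1.96  # z-score for a two-sided 95% confidence level


def margin_of_error(n, p=0.5, z=Z_95):
    """Half-width of the confidence interval for a proportion, sample size n."""
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(margin, p=0.5, z=Z_95):
    """Smallest sample size whose margin of error does not exceed `margin`."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


print(required_sample_size(0.03))        # 1068
print(round(margin_of_error(1000), 4))   # about 0.031
print(round(margin_of_error(1100), 4))   # about 0.0295
```

Under these assumptions, a 3% margin needs a little under 1,100 respondents, and a sample of 1,000 gives a margin of roughly 3.1%, which is why both figures are commonly quoted.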

The preceding examples are sufficient when we are interested only in the headline numbers. Yet if we need to break these numbers down to gain insight into the people behind them, statistics alone won’t get the job done.

The advent of affordable, cloud-based data storage, processing, and computing capacity has engendered other mathematical models that go further than mere predictive statistics.

Mathematical algorithms, specifically machine learning, help extract information and knowledge from the data itself. This knowledge would otherwise remain hidden in the universe of data.

This approach is not new; there are well-known feedback models that, depending on the results, automatically tweak themselves until the results coincide with the defined goal. These models are widely used across a number of industries.

The following diagram will serve as a reminder that there is nothing new about the way we use these algorithms:

Figure 1. Feedback control model

For instructional purposes, and with significant simplification, the model can be explained by following the flowchart in Figure 1. The controlled variable (the current video service churn rate) is measured and compared to the setpoint (the target churn rate). An error variable is generated, representing the difference between the two (actual churn rate minus target churn rate). From this error variable, the mathematical model generates a new output, for example a change to one or more of the factors that influence a customer’s decision to unsubscribe. This in turn changes the inputs to the algorithm on the next pass, and therefore its outputs. The cycle repeats until the error is zero, in other words until the current churn rate matches the target.

The model also takes into account disturbances that may occur in the video ecosystem. For example, the boundary conditions might change with the launch of a new internal marketing campaign or a competitor’s campaign.
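The loop described above can be sketched in a few lines of Python. This is a toy illustration, not a real churn model: it assumes that churn responds linearly to a single hypothetical "retention effort" lever, and that a disturbance (for example a competitor’s campaign) adds a fixed amount to churn from a given step onwards. All names and numbers are invented for the example.

```python
def simulate_feedback_loop(target_churn, base_churn, gain=0.5, sensitivity=1.0,
                           disturbance=0.0, disturbance_at=None, steps=30):
    """Run a simple feedback loop and return (step, churn, error) tuples.

    Assumption: churn = base_churn + disturbance - sensitivity * effort.
    """
    effort = 0.0  # the factor the model is allowed to adjust
    history = []
    for step in range(steps):
        shock = 0.0
        if disturbance_at is not None and step >= disturbance_at:
            shock = disturbance
        churn = base_churn + shock - sensitivity * effort  # measure controlled variable
        error = churn - target_churn                       # compare with the setpoint
        history.append((step, churn, error))
        effort += gain * error                             # adjust the output
    return history


history = simulate_feedback_loop(target_churn=0.03, base_churn=0.05,
                                 disturbance=0.01, disturbance_at=10)
for step, churn, error in history[::5]:
    print(f"step {step:2d}  churn {churn:.4f}  error {error:+.4f}")
```

With these settings the error halves on every pass, jumps when the disturbance arrives at step 10, and is then driven back towards zero as the loop keeps adjusting the effort.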

New approaches to churn modeling (like JUMP Retention) go further by identifying the user_ID of each video service customer who is at risk of churning in the near future (30, 60, or 90 days). In other words, the overall headline number of customers at risk of churn is broken down to the individual user level. This is huge progress! But new generation models don’t stop there. They also identify the primary factors that make each of these customers an at-risk candidate within the defined time frame.
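To make the idea of per-user risk and per-user risk factors concrete, here is a minimal sketch using a logistic scoring function. It is not how JUMP Retention or any specific product works: the feature names, weights, bias, and user records are all hypothetical, and a real model would learn its parameters from historical data rather than use hand-picked values.

```python
import math

# Hypothetical, hand-picked weights for illustration only.
WEIGHTS = {
    "days_since_last_play": 0.08,
    "playback_errors_last_30d": 0.30,
    "hours_watched_last_30d": -0.10,
}
BIAS = -2.0


def churn_risk(features):
    """Return (probability, per-feature contributions) for one user."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, contributions


def at_risk_report(users, threshold=0.5, top_n=2):
    """List (user_id, probability, top risk factors) for users above the threshold."""
    report = []
    for user_id, features in users.items():
        probability, contributions = churn_risk(features)
        if probability >= threshold:
            # Factors that push the score up the most come first.
            drivers = sorted(contributions, key=contributions.get, reverse=True)[:top_n]
            report.append((user_id, round(probability, 2), drivers))
    return sorted(report, key=lambda row: row[1], reverse=True)


# Synthetic example records, not real customer data.
users = {
    "u001": {"days_since_last_play": 25, "playback_errors_last_30d": 4,
             "hours_watched_last_30d": 1},
    "u002": {"days_since_last_play": 1, "playback_errors_last_30d": 0,
             "hours_watched_last_30d": 40},
}
for row in at_risk_report(users):
    print(row)
```

In this toy data only u001 crosses the threshold, with inactivity and playback errors as its leading factors; u002, a heavy recent viewer, does not appear in the report.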

This new intelligence adds a new dimension to video service churn management, because it lets service providers act on each at-risk customer individually, tailoring their actions to the variables identified as raising that customer’s churn risk. This is where the new generation models reach their full potential: once action has been taken on one or more of these churn variables, the model is rerun with feedback from the result (different input, different output). In effect, it tries to outdo itself by invalidating its own initial prediction, and in doing so reduces the video service churn rate.

Is all this magic? No. It’s data, lots of data! And mathematics, lots of math!