Jan Manďák, Jana Hančlová
Use of Logistic Regression for Understanding and Prediction of Customer Churn in Telecommunications
Issue: 2/2019
Journal: Statistika
Keywords: Customer churn, telecommunications, predictive analytics, logistic regression, sensitivity
Abstract:
Customer churn, the loss of customers who switch to another service provider or do not renew their commitment,
is very common in highly competitive and saturated markets such as telecommunications. Predictive models
are needed to identify customers who are at risk of churning and to discover the key drivers
of churn. The aim of this paper is to use demographic and service usage variables to estimate a logistic regression
model that predicts customer churn at a European telecommunications provider and to find the factors influencing
customer churn. The estimated model yielded interesting findings: younger customers who have been with the company
for a shorter time, who use mobile data and SMS more than traditional calls, who have occasional problems
paying bills, who hold a student account, and whose contract ends in the near future are typical
of customers who tend to leave the company.
Interaction terms added as explanatory variables showed that the effects of data and voice usage vary
depending on the year of birth. The quality of the logistic regression model was assessed by the Hosmer-Lemeshow test and pseudo R-squared measures. An independent testing data set was then used to evaluate the predictive ability of the model by computing performance metrics such as the area under the ROC curve (AUC), sensitivity, and precision. The resulting model identified 94.8% of the customers who actually left the company. The quality of the model was also confirmed by a high AUC value of 0.9759. Logistic regression represents a very useful tool
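
As a rough illustration of the performance metrics named in the abstract (sensitivity, precision, and AUC), the following minimal Python sketch computes them from true churn labels and predicted churn probabilities. The function names, the 0.5 classification threshold, and the toy labels and scores are assumptions made for illustration only; they are not the paper's data, code, or results. The share of actual churners that a model catches, which the abstract reports as 94.8%, corresponds to sensitivity as defined here.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_score: Sequence[float],
                     threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN); scores >= threshold are predicted as churn (1)."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp: int, fn: int) -> float:
    """Share of actual churners that the model flags (also called recall)."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of flagged customers who actually churned."""
    return tp / (tp + fp)


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen churner receives a higher score than a randomly chosen non-churner
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one churner and one non-churner")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): 1 = churned, 0 = stayed.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

tp, fp, fn, tn = confusion_counts(y_true, y_score)
print("sensitivity:", sensitivity(tp, fn))
print("precision:  ", precision(tp, fp))
print("AUC:        ", auc(y_true, y_score))
```

Sensitivity and precision depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report them together.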