How to data model Churn
Real life examples
Quick quiz
•  How many of you are familiar with the churn problem?
•  With machine learning?
Logistic regression, random forests, gradient boosted trees?
(Not the subject here)
•  With SQL?
(We may see some code later)
•  What database technology do you use?
What about EMC Greenplum or Vertica?
Who I am
•  Senior Data Scientist at Dataiku
(worked on churn prediction, fraud detection, bot detection, recommender systems,
graph analytics, smart cities, … )
•  Occasional Kaggle competitor
•  Mostly code in Python and SQL
•  Twitter @prrgutierrez
Churn definition
•  Wikipedia:
“Churn rate (sometimes called attrition rate), in its broadest sense, is a measure of the
number of individuals or items moving out of a collective group over a specific period of
time”
= Customer leaving
Two types of Churn
•  Subscription models:
•  Telco
•  E-gaming (WoW)
•  Ex: Coyote -> 1-year subscription
-> you know when someone leaves
•  Non subscription models:
•  E-Business (Amazon, Price Minister, Vente Privée)
•  E-gaming (Candy Crush, free MMORPG)
-> you have to approximate when someone leaves, e.g. with an inactivity period:
Candy Crush: days / weeks
MMORPG: 2 months (holidays)
Price Minister: months
Two types of Churn
•  Blurred Separation:
•  Ex: T-Mobile: 1-month subscription -> pay per call
•  Ex: WoW: 1-month to 6-month subscriptions
•  Banking?
•  Focus : no subscription:
•  Can be seen as a generalization where you have to approximate the target
•  Bonus : Seller churn
•  Marketplaces
•  Clients who take part in the life of the product
•  Forums (Reddit)
•  E-gaming (Korean competitions, guilds, etc.)
Dealing with churn
•  Motivations :
•  Saturated market
-> cost of acquiring a new client >>> cost of keeping an existing client
•  Ex: http://www.bain.com/publications/articles/breaking-the-back-of-customer-churn.aspx
•  Wireline company: 2% to 2.5% churn rate per month
•  With 5 M customers -> 1.32 M churners per year
•  Reducing churn from 2.5% to 2%: lowest estimate 240 M$ over 18 months
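The order of magnitude of these figures can be checked with a short sketch. It assumes that the monthly churn rate applies each month to the customers who are still there (compounding, with no new sign-ups); that assumption and the function name are illustrative, not taken from the Bain article.

```python
def annual_churners(customers: float, monthly_rate: float, months: int = 12) -> float:
    """Customers lost after `months`, assuming the monthly churn rate applies
    to whoever is still a customer (compounding, no new sign-ups)."""
    remaining = customers * (1.0 - monthly_rate) ** months
    return customers - remaining


if __name__ == "__main__":
    base = 5_000_000  # customer base quoted on the slide
    for rate in (0.02, 0.025):
        lost = annual_churners(base, rate)
        print(f"{rate:.1%} per month -> {lost / 1e6:.2f} M churners per year")
```

Under this assumption, a 2.5% monthly rate on 5 M customers gives roughly 1.3 M churners per year, in line with the figure quoted above.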
Dealing with churn
•  Predict churn :
•  One model for performance <- our focus, short term, more ML
•  One model for understanding <- long term, more Analytics
•  Act on it (short term) :
•  Special offer (telco call, free in-game money, discount coupon…)
•  Does it work? Feedback loop needed!
•  Model the effect of the offer on the probability of leaving. A/B tests. Multi-armed bandit?
•  Significant LTV for activation?
•  Act on it (long term) :
•  Is there a problem in my purchasing funnel?
•  Is the game too hard at some point?
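The feedback loop on short-term actions can start as a plain A/B comparison: churn among customers who received the offer versus a control group that did not. The sketch below uses a pooled two-proportion z-test with the standard library only; the function name and the counts in the usage note are illustrative assumptions, not figures from the talk.

```python
import math


def two_proportion_z(churned_a: int, total_a: int,
                     churned_b: int, total_b: int) -> float:
    """z statistic for the difference in churn rate between group A
    (e.g. control) and group B (e.g. customers who received the offer)."""
    p_a = churned_a / total_a
    p_b = churned_b / total_b
    # Pooled churn rate under the hypothesis that the offer changes nothing.
    pooled = (churned_a + churned_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / total_a + 1.0 / total_b))
    if se == 0.0:
        raise ValueError("no variation in outcomes: the test is undefined")
    return (p_a - p_b) / se
```

For instance, `two_proportion_z(250, 1000, 200, 1000)` compares a hypothetical control group (250 churners out of 1000) with a treated group (200 out of 1000). An absolute value above about 1.96 is the conventional 5% significance threshold.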
Dealing with churn
•  Candy Crush rumor:
•  Change the distribution of probabilities of candies / bombs
•  Change the difficulty of the game
•  Losing a lot makes the game easier
Modelling Churn
•  Machine learning model (classification) -> target:
•  Known in subscription
•  Unknown in general
•  Step 1 : Maintain customer status
•  Do you care only about your best customers?
•  In any case, the churn action won’t be the same for them
•  Has a client churned?
-> target = churner = has not bought / visited since time X
-> best = has bought / visited more than y times since time Y
•  Can be refined (“new customer”, several classes of best or inactive, reactivated…)
•  Storage: keep only the changes in status!
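Step 1 can be sketched as follows. The thresholds (X = 120 days, y = 5 actions, Y = 90 days), the extra "active" label and both function names are hypothetical choices made for illustration; the slide only fixes the shape of the rules.

```python
from datetime import date, timedelta


def customer_status(action_dates, today, x_days=120, y_count=5, y_days=90):
    """Label a customer from the dates of their purchases / visits."""
    since_x = today - timedelta(days=x_days)
    since_y = today - timedelta(days=y_days)
    # Churner: no purchase / visit since time X.
    if not any(d >= since_x for d in action_dates):
        return "churner"
    # Best: more than y purchases / visits since time Y.
    if sum(d >= since_y for d in action_dates) > y_count:
        return "best"
    return "active"  # hypothetical label for everyone else


def status_changes(previous, current):
    """Keep only the customers whose status differs from the last snapshot,
    so that storage grows with the changes rather than with the customer base."""
    return {cid: status for cid, status in current.items()
            if previous.get(cid) != status}
```

Storing the output of `status_changes` at each run, instead of a full snapshot, is one way to read "keep only the changes": the status at any past date can be rebuilt by replaying the changes in order.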
Modelling Churn
•  Machine learning model -> features:
•  Explanatory factors to use as input for the model
•  Step 2 : Maintain customer features
•  Socio-demographic (gender, age, etc.)
•  Behavioral!
•  Utilization / buying rate
•  Trend in utilization / buying rate
•  Ad hoc features :
•  WoW / Social game churn: take into account friend network churn
•  Telco: calls to call centers
•  Beware of time dependence!
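One way to respect the time dependence is to compute every feature "as of" a cutoff date and ignore anything that happened on or after it. The sketch below derives a utilization rate and its trend from activity dates; the 30-day window, the feature names and the function name are assumptions made for illustration.

```python
from datetime import date, timedelta


def behavioral_features(action_dates, cutoff, window_days=30):
    """Utilization rate and trend, using only activity strictly before `cutoff`."""
    recent_start = cutoff - timedelta(days=window_days)
    previous_start = cutoff - timedelta(days=2 * window_days)
    # Events on or after the cutoff are ignored, so no future information leaks in.
    recent = sum(recent_start <= d < cutoff for d in action_dates)
    previous = sum(previous_start <= d < recent_start for d in action_dates)
    return {
        "rate_recent": recent / window_days,         # actions per day, last window
        "rate_previous": previous / window_days,     # actions per day, window before
        "trend": (recent - previous) / window_days,  # change in rate between windows
    }
```

Calling the function with different cutoffs on the same history gives the features a customer had at each point in time, which is what a model trained on past data needs.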
Data Model
Computation Dependency diagram
Ex : Train and predict scheme
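[Timeline diagram: time axis marked with T – 4 months and T (present time)]
•  Before T – 4 months: data is used for feature generation
•  From T – 4 months to T: data is used for target creation (activity during the last 4 months)
•  Train the model using features and target
•  At T: use the model to predict future churn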
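The train and predict scheme can be sketched as a function that builds one labelled row per customer. It approximates 4 months by 120 days and uses a single placeholder feature; the input layout (a dict of activity dates per customer) and all names are hypothetical.

```python
from datetime import date, timedelta

HORIZON = timedelta(days=120)  # roughly 4 months (an approximation)


def training_rows(events, present):
    """events: dict mapping customer id -> list of activity dates."""
    cutoff = present - HORIZON
    rows = []
    for customer, dates in events.items():
        history = [d for d in dates if d < cutoff]
        if not history:
            continue  # not yet a customer at the cutoff: nothing to learn from
        # Target: no activity at all between the cutoff and the present.
        future = [d for d in dates if cutoff <= d <= present]
        rows.append({
            "customer": customer,
            "n_actions_before_cutoff": len(history),  # placeholder feature
            "churned": int(not future),
        })
    return rows
```

To predict future churn, the same feature code would be run with the cutoff set to the present time T, where no target exists yet.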
  
Ex : Train Evaluation and Predict Scheme
[Timeline diagram: time axis marked with T – 8 months, T – 4 months and T (present time)]
•  Training: data before T – 8 months is used for feature generation; data from T – 8 months to T – 4 months is used for target creation (activity during the last 4 months)
•  Validation set: data before T – 4 months is used for feature generation; data from T – 4 months to T is used for target creation (activity during the last 4 months)
•  Evaluate on the target of the validation set
•  At T: use the model to predict future churn
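The three stages of this scheme differ only in where the cutoff sits, which a small helper can make explicit. This is a sketch of my reading of the diagram, with 4 months approximated by 120 days; the function name and dictionary keys are hypothetical.

```python
from datetime import date, timedelta


def scheme_windows(present, horizon_days=120):
    """Cutoffs for the training, validation and prediction stages.

    For each stage, features use data strictly before `features_before`,
    and the target is the activity observed between `target_from` and `target_to`.
    """
    h = timedelta(days=horizon_days)
    return {
        "train": {"features_before": present - 2 * h,
                  "target_from": present - 2 * h,
                  "target_to": present - h},
        "validation": {"features_before": present - h,
                       "target_from": present - h,
                       "target_to": present},
        # The prediction target lies in the future and is not observed yet.
        "predict": {"features_before": present,
                    "target_from": present,
                    "target_to": present + h},
    }
```

Keeping the training target window and the validation target window disjoint means the evaluation mimics the real use of the model: scoring customers on a period it has never seen.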
  
Thank you for your attention !
