REPORT - IBM BUILD-A-THON
Telecom Customer Churn Prediction using Watson Auto AI
presented by
Vignesh K
email-id: [email protected]
in the month of
October 2020
TABLE OF CONTENTS
1 INTRODUCTION
1.1 Overview
1.2 Purpose
2 LITERATURE SURVEY
2.1 Existing Problem
2.2 Proposed Solution
3 THEORETICAL ANALYSIS
3.1 Block Diagram
3.2 Hardware / Software Designing
4 EXPERIMENTAL INVESTIGATIONS
5 FLOWCHART
6 RESULT
7 ADVANTAGES & DISADVANTAGES
8 APPLICATIONS
9 CONCLUSION
10 FUTURE SCOPE
11 BIBLIOGRAPHY
APPENDIX
A. Source Code
INTRODUCTION
1.1 OVERVIEW
Churn prediction is one of the most popular Big Data use cases in business. It consists of detecting customers who are likely to cancel a subscription to a service. This applies to telecom companies, SaaS companies, and any other business that sells a service for a monthly fee.
In the telecom industry, customers can choose from multiple service
providers and actively switch from one operator to another. In this highly
competitive market, the telecommunications industry experiences an
average of 15-25% annual churn rate. Given the fact that it costs 5-10 times
more to acquire a new customer than to retain an existing one, customer
retention has now become even more important than customer acquisition.
For many incumbent operators, retaining highly profitable customers is the number one business goal.
1.2 PURPOSE
Customer churn prediction can help you see which customers are about to leave your service so you can develop a proper strategy to re-engage them before it is too late. It is a vital tool in a business's arsenal when it comes to customer retention. Having the ability to accurately predict future
churn rates is essential because it helps your business gain a better
understanding of future expected revenue. Predicting churn rates can also
help your business identify and improve upon areas where customer
service is lacking.
To reduce customer churn, telecom companies need to predict which
customers are at high risk of churn.
In this project, we use the customer-level data of a leading telecom firm to build predictive models that identify customers who will stay with the company or leave it, based on a set of parameters.
THEORETICAL ANALYSIS
3.1 BLOCK DIAGRAM
The block diagram depicts the workflow of the entire system. Watson Studio acts as the central point of computation and is used for running Python notebooks and for creating, monitoring, and managing deployments. The runtime environment is powered by the Watson Machine Learning service. The UI is designed using HTML, and the backend process is automated using the Flask framework, which also facilitates deployment of the ML models via the scoring endpoint.
3.2 HARDWARE / SOFTWARE DESIGNING
The following are the hardware requirements for standard users
(commodity hardware) of the proposed system:
Processor: Core i5 (quad core)
RAM: 8 GB
The software specification for the proposed system is as follows:
IBM Watson Studio:
Watson ML Package - 'Lite'
Instance Type - 'v2'
Environment Definition - Default Python 3.6XS
Virtual Hardware Configuration - 2 vCPU 8GB RAM
COS Instance Region - 'London'
Python Flask Application:
HTML - 5.0
Flask - 1.1.2
Python Libraries required:
scikit-learn
pandas
numpy
seaborn
json
sklearn.preprocessing
sklearn.model_selection
sklearn.feature_selection
EXPERIMENTAL INVESTIGATIONS
3.1 LOADING THE DATASET
The dataset is provided in the project template. It contains details of a company's customers, and the target variable is a binary variable indicating whether the customer left the company or continues to be a customer.
IBM Watson Studio provides the option to add the dataset as an asset to the project. Project assets can then be inserted directly into the Python notebook through a simple process, and the access code is generated automatically.
The dataset is available here.
The metadata of the dataset is as follows:
● # of rows: 10,000
● # of columns: 14
● # of Input Variables: 13
● # of Output Variable(s): 1 - ["Exited"]
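For illustration, a minimal loading sketch (assuming a local copy of the CSV rather than the Cloud Object Storage access used in the appendix):

import pandas as pd

# Load a local copy of the churn dataset (the notebook itself reads the same
# file from IBM Cloud Object Storage; see the appendix)
customer_data = pd.read_csv("Churn_Modelling.csv")
print(customer_data.shape)   # expected: (10000, 14)
customer_data.head()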
3.2 INFORMATION ABOUT DATASET
The info() function gives some of the basic details about the dataset.
It gives the information about the following:
● Number of entries
● Null Value status
● Datatype
for each attribute of the dataframe.
3.3 CHECKING NULL VALUES
A null value is an entry for which no value exists at a particular position. Null values in a dataset can degrade the performance of a Machine Learning model, so they must be identified and removed (or imputed).
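A minimal sketch of both checks, assuming the dataset has been loaded into a DataFrame named customer_data as in the appendix:

# Entries, column datatypes, and non-null counts in one report
customer_data.info()

# Null-value count per column, and a single True/False summary
print(customer_data.isnull().sum())
print(customer_data.isnull().values.any())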
3.4 DESCRIPTIVE ANALYTICS FOR DATA
Descriptive analytics gives a general view of the historic data, providing a clear, straightforward picture of the company's operations. It is the interpretation of historical data to better understand changes that have occurred in a business, drawing comparisons across a range of historic data.
3.4.1 PRECISION SETUP
The pandas display options can be set so that numeric output is shown with the desired number of decimal places. The describe() function is used to get the descriptive statistics of the dataset. The output of this function includes the following:
● count
● mean
● standard deviation
● minimum
● maximum
● 25%, 50%, 75% values
for each attribute of the dataset.
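A minimal sketch of the precision setup and the describe() call (assuming the customer_data DataFrame from the appendix):

import numpy as np
import pandas as pd

pd.set_option('display.precision', 3)             # show three decimal places
print(customer_data.describe())                   # count, mean, std, min, quartiles, max
print(customer_data.describe(exclude=np.number))  # summary of the non-numeric columns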
3.4.2 IDENTIFYING CORRELATIONS
Correlation is a statistical measure that expresses the extent to which
two variables are linearly related (meaning they change together at a
constant rate). Correlation is a measure of the strength of a linear
relationship between two quantitative variables.
The Pearson coefficient is a type of correlation coefficient that
represents the relationship between two variables that are measured on the
same interval or ratio scale. The Pearson coefficient is a measure of the
strength of the association between two continuous variables.
The results of applying Pearson correlation to our dataset are as follows:
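For reference, a minimal sketch of computing this correlation matrix (again assuming the customer_data DataFrame):

# Pearson correlation over the numeric columns only
corr_matrix = customer_data.select_dtypes(include='number').corr(method='pearson')

# Correlation of every numeric attribute with the target variable
print(corr_matrix['Exited'].sort_values(ascending=False))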
3.5 DATA VISUALIZATION
Data visualization is the discipline of trying to understand data by
placing it in a visual context so that patterns, trends and correlations that
might not otherwise be detected can be exposed. Python offers multiple
great graphing libraries that come packed with lots of different features.
Data visualization is the graphical representation of data in order to
interactively and efficiently convey insights to clients, customers, and
stakeholders in general.
The different types of data visualization techniques used in analysing
the dataset are as follows:
3.5.1 COUNT PLOTS
The countplot() method is used to show the counts of observations in each categorical bin using bars. The countplot() function is applied to the following attributes of the dataset:
● Tenure
● Credit Score
● Geography
The output of a countplot is as follows:
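For illustration, a minimal sketch of generating one of these plots (Tenure), mirroring the appendix code:

import seaborn as sns
import matplotlib.pyplot as plt

# Bar chart of how many customers fall into each Tenure value
fig, ax = plt.subplots(figsize=(20, 10))
sns.countplot(x="Tenure", data=customer_data, ax=ax)
plt.show()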
3.5.2 PAIRGRID GENERATION
A pairgrid is a subplot grid for plotting pairwise relationships in a dataset. This object maps each variable in a dataset onto a column and row in a grid of multiple axes. Different axes-level plotting functions can be used to draw bivariate plots in the upper and lower triangles, and the marginal distribution of each variable can be shown on the diagonal.
The pairgrid shows the relationship between an attribute of the
dataset with any other attribute of the dataset. The dense spots indicate
the strong relationships between the attributes. The sparse spots indicate
the weak relationships between the attributes. The pairgrid that is
generated for the dataset under consideration is as follows:
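A minimal sketch of generating the pairgrid (histograms on the diagonal, scatter plots elsewhere), assuming the customer_data DataFrame:

import seaborn as sns
import matplotlib.pyplot as plt

grid = sns.PairGrid(customer_data)    # one row/column per numeric attribute
grid = grid.map_diag(plt.hist)        # univariate histograms on the diagonal
grid = grid.map_offdiag(plt.scatter)  # pairwise scatter plots off the diagonal
plt.show()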
3.6 DATA DISTRIBUTION
A data distribution is a function or a listing which shows all the
possible values (or intervals) of the data. It also (and this is important) tells
you how often each value occurs. Often, the data in a distribution will be
ordered from smallest to largest, and graphs and charts allow you to easily
see both the values and the frequency with which they appear.
From a distribution you can calculate the probability of any one
particular observation in the sample space, or the likelihood that an
observation will have a value which is less than (or greater than) a point of
interest. The function of a distribution that shows the density of the values
of our data is called a probability density function, and is sometimes
abbreviated pdf. The methods of data distributions used in the project are
as follows:
3.6.1 BOXPLOTS
A box plot (or box-and-whisker plot) shows the distribution of
quantitative data in a way that facilitates comparisons between variables or
across levels of a categorical variable. The box shows the quartiles of the
dataset while the whiskers extend to show the rest of the distribution,
except for points that are determined to be “outliers” using a method that is
a function of the inter-quartile range. The boxplot for the dataset is as
follows:
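A minimal sketch of drawing one vertical box per numeric attribute:

import seaborn as sns
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 6))
# fliersize controls the size of the outlier markers beyond the whiskers
sns.boxplot(data=customer_data.select_dtypes(include='number'),
            orient="v", palette="hls", fliersize=14, ax=ax)
plt.show()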
3.6.2 DISTRIBUTION PLOTS
Seaborn distplot lets you show a histogram with a line on it. We use
seaborn in combination with matplotlib, the Python plotting module. A
distplot plots a univariate distribution of observations. The distplot()
function combines the matplotlib hist function with the seaborn kdeplot()
and rugplot() functions.
The histogram groups the values of an attribute into buckets of data ranges called bins and counts how many values fall into each bin, from which the relative frequency of each bucket can be estimated. For continuous data, a smooth density (PDF) curve is also generated, which shows the distribution of the values as a continuous curve.
A sample output for a distplot is as follows:
The distplots are generated for the following attributes of the dataset:
● Balance
● # of Products
● Estimated Salary
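For illustration, a minimal sketch for one of these attributes (Balance); histplot with kde=True is the current seaborn equivalent of the distplot call used in the appendix:

import seaborn as sns
import matplotlib.pyplot as plt

# Histogram of Balance with a kernel density estimate overlaid
sns.histplot(customer_data["Balance"], kde=True)
plt.show()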
3.7 ONE-HOT ENCODING
Sometimes in datasets we encounter columns containing numbers that have no specific order of preference. The data in such a column usually denotes a category, or the value of a category after it has been label encoded. This can confuse the Machine Learning model, so to avoid it the data in the column should be one-hot encoded. One-hot encoding refers to splitting a column that contains numerical categorical data into many columns, one per category present in the original column. Each new column contains a "0" or "1" indicating whether the row belongs to that category.
For the non-pre-processed data, LabelEncoder() is used to convert the categorical values into integer labels. For the pre-processed data, the pd.get_dummies() function generates the one-hot encoding for the dataset.
An example for one-hot encoding is as follows:
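A minimal sketch of the one-hot encoding step on the independent variables (dropping the target column first, as in the appendix):

import pandas as pd

features = customer_data.drop("Exited", axis=1)   # independent variables only
encoded = pd.get_dummies(features)                # e.g. Geography becomes Geography_France, Geography_Germany, ...
print(encoded.shape)
print(encoded.head())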
3.8 OUTLIER DETECTION
The presence of outliers in a classification or regression dataset can
result in a poor fit and lower predictive modeling performance. Identifying
and removing outliers is challenging with simple statistical methods for
most machine learning datasets given the large number of input variables.
Instead, automatic outlier detection methods can be used in the modeling
pipeline and compared, just like other data preparation transforms that may
be applied to the dataset.
The methods used for outlier detection are as follows:
3.8.1 Z-SCORE
Z score is an important concept in statistics. The Z score is also called the standard score. This score helps to understand whether a data value is greater or smaller than the mean and how far away it is from the mean. More specifically, the Z score tells how many standard deviations away a data point is from the mean: Z = (x − μ) / σ, where μ is the mean and σ is the standard deviation of the attribute. The Z-score method is applied to the 'EstimatedSalary' attribute, and it showed no presence of outliers.
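A minimal sketch of the modified Z-score check used in the appendix, which is based on the median and the median absolute deviation rather than the mean and standard deviation:

import numpy as np

salary = customer_data["EstimatedSalary"].to_numpy()
median = np.median(salary)
mad = np.median(np.abs(salary - median))        # median absolute deviation
modified_z = 0.6745 * (salary - median) / mad   # modified Z-score per observation
print(np.any(np.abs(modified_z) > 3))           # True would indicate at least one outlier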
3.8.2 INTER-QUARTILE RANGE METHOD
In descriptive statistics, the interquartile range, also called the
midspread or middle 50%, or technically H-spread, is a measure of
statistical dispersion, being equal to the difference between 75th and 25th
percentiles, or between the upper and lower quartiles, IQR = Q₃ − Q₁. In our dataset, the IQR (Inter-Quartile Range) method is applied to 'Balance' and it showed no evidence of outliers.
Under this method, values below Q₁ − 1.5 × IQR or above Q₃ + 1.5 × IQR are flagged as outliers.
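A minimal sketch of the IQR check on the Balance attribute:

import numpy as np

q1, q3 = np.percentile(customer_data["Balance"], [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # standard 1.5 x IQR fences
mask = (customer_data["Balance"] < lower) | (customer_data["Balance"] > upper)
print(mask.sum())                               # number of flagged outliers (0 here, per the text)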
3.9 FEATURE ENGINEERING
All machine learning algorithms use some input data to create
outputs. This input data comprises features, which are usually in the form of structured columns. Algorithms require features with specific characteristics to work properly; this is where the need for feature engineering arises. The features we use influence the result more than anything else, and no algorithm alone can supplement the information gain given by correct feature engineering. The method of feature engineering used in our project is "Polynomial Features".
3.9.1 POLYNOMIAL FEATURES
Polynomial features are those features created by raising existing
features to an exponent. For example, if a dataset had one input feature X,
then a polynomial feature would be the addition of a new feature (column)
where values were calculated by squaring the values in X, e.g. X^2. This
process can be repeated for each input variable in the dataset, creating a
transformed version of each. As such, polynomial features are a type of
feature engineering, e.g. the creation of new input features based on the
existing features. The “degree” of the polynomial is used to control the
number of features added, e.g. a degree of 3 will add two new variables for
each input variable. Typically a small degree is used such as 2 or 3.
Feature engineering produced 87 attributes in our dataset, and among them the best 25 were selected.
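A minimal sketch of the interaction-feature generation followed by selection of the best 25 features, mirroring the appendix (X is assumed to hold the one-hot-encoded inputs and y the 0/1 labels):

import pandas as pd
from sklearn.preprocessing import PolynomialFeatures
from sklearn.feature_selection import SelectKBest

poly = PolynomialFeatures(interaction_only=True, include_bias=False)  # pairwise interaction terms
X_inter = pd.DataFrame(poly.fit_transform(X))

select = SelectKBest(k=25)                     # keep the 25 highest-scoring columns
X_selected = select.fit_transform(X_inter, y)
print(X_inter.shape, X_selected.shape)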
3.10 FEATURE SCALING
Feature scaling is a method used to normalize the range of independent variables or features of data. It standardizes the independent features present in the data to a fixed range, and is performed during data pre-processing to handle highly varying magnitudes, values, or units. It is also known as data normalization.
For every feature, the minimum value of that feature gets transformed
into a 0, the maximum value gets transformed into a 1, and every other
value gets transformed into a decimal between 0 and 1. Min-max
normalization has one fairly significant downside: it does not handle
outliers very well.
The output for scaling the independent attributes is as follows:
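A minimal sketch using scikit-learn's MinMaxScaler for the min-max normalization described above (the appendix notebook uses StandardScaler instead, which standardizes to zero mean and unit variance):

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()                       # maps every feature into the [0, 1] range
X_scaled = scaler.fit_transform(X_selected)   # X_selected: the feature matrix from the previous step
print(X_scaled.min(), X_scaled.max())         # 0.0 and 1.0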
3.11 CREATING TRAIN AND TEST DATA
The training data and testing data are created from the pre-processed
dataset. The function of the training data is to train the model and improve
its understanding about the dataset and its attributes, across many epochs
and batches. The function of the test data is to evaluate the model's understanding of the problem.
In this project, we have split the data into training and test sets in a 2:1 ratio. The shapes of the train and test data are as follows:
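A minimal sketch of the split (test_size=0.33 gives the 2:1 ratio; random_state fixes the shuffle for reproducibility, as in the appendix):

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.33, random_state=42)
print(X_train.shape, X_test.shape)   # roughly 6700 and 3300 rows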
3.12 MODEL CREATION
The Machine Learning model is created by invoking appropriate
functions that are available in "scikit-learn" package in Python. There are
various parameters which can be used under different scenarios for
creating the Machine Learning model. In our project, there are 4 different
models taken into consideration. They are as follows:
● Support Vector Classifier on non pre-processed data
● Support Vector Classifier on pre-processed data
● Logistic Regression
● Multi Layer Perceptron (Neural Network)
We can get a description of the model's parameters once we fit the model with the training data. The sample output for creating an ML model is as follows:
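A minimal sketch of fitting the three preprocessed-data models from the list above, assuming the X_train/y_train split from the previous step (max_iter=1000 is added here only to avoid a convergence warning; the appendix uses the scikit-learn defaults):

from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

clf_svc = SVC(random_state=42)              # Support Vector Classifier
clf_lr = LogisticRegression(max_iter=1000)  # Logistic Regression
clf_mlp = MLPClassifier(verbose=0)          # Multi-layer Perceptron

for clf in (clf_svc, clf_lr, clf_mlp):
    clf.fit(X_train, y_train)               # printing the fitted object shows its parameters
    print(clf)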
3.13 AUTO AI
The AutoAI graphical tool in Watson Studio automatically analyzes data and generates candidate model pipelines customized for predictive modeling problems. These model pipelines are created iteratively as AutoAI analyzes your dataset and discovers the data transformations, algorithms, and parameter settings that work best for the problem setting. Results are displayed on a leaderboard, showing the automatically generated model pipelines ranked according to the problem's optimization objective. AutoAI enables end-to-end AI and ML lifecycle management.
The result of AutoAI in our dataset is as follows:
FLOW CHART
The flowchart depicts the sequential implementation of the proposed
system. The flowchart shows the dependencies, the independent and
dependent tasks. The flowchart helps to organize the system
functionalities.
The proposed system is implemented by creating a Watson Machine Learning instance. A Python notebook is essential here, as it is used for processes like dataset preparation, data pre-processing, data visualization, feature engineering, model creation, and model prediction.
Once the Machine Learning model is ready-to-use, the deployment
space is created, in which the deployment model is created and added to
the deployment assets. When the asset is deployed successfully, the
scoring endpoint is generated.
The flask application process consists of HTML form creation, flask
integration, and scoring endpoint integration. Once these processes are
done, the application can be executed and the ML model automation will be
complete.
RESULTS
6.1 MODEL EVALUATION
Model evaluation aims to estimate the generalization accuracy of a
model on future (unseen/out-of-sample) data. It helps to find the best
model that represents our data and how well the chosen model will work in
the future. Model evaluation metrics are used to assess the goodness of fit between model and data, to compare different models in the context of model selection, and to estimate how accurate the predictions of a specific model on a specific data set are expected to be. The three main metrics
used to evaluate a classification model are accuracy, precision, and recall.
Accuracy is defined as the percentage of correct predictions for the
test data. It can be calculated easily by dividing the number of correct
predictions by the number of total predictions.
Precision is defined as the fraction of relevant examples (true
positives) among all of the examples which were predicted to belong in a
certain class.
Recall is defined as the fraction of examples which were predicted to
belong to a class with respect to all of the examples that truly belong in the
class.
The sample output of obtaining the metrics is given as follows:
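A minimal sketch of how these metrics can be obtained with scikit-learn for one of the fitted classifiers (using the X_test/y_test split from earlier):

from sklearn.metrics import accuracy_score, precision_score, recall_score

y_pred = clf_svc.predict(X_test)
print("Accuracy :", accuracy_score(y_test, y_pred))    # correct predictions / total predictions
print("Precision:", precision_score(y_test, y_pred))   # true positives / predicted positives
print("Recall   :", recall_score(y_test, y_pred))      # true positives / actual positives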
The following table shows the readings of the metrics as the output
of model evaluation:
6.3 BUILDING FLASK APPLICATION
IBM provides the option of deploying the ML models created in Watson Studio in real time by exposing dynamic scoring endpoint URLs. This enables users to create models and deploy them effectively.
The scoring endpoint URL can be obtained by creating a deployment model and adding it to the deployment space as an instance. This allows multiple models to be deployed simultaneously. I have deployed the model "Support Vector Classifier on pre-processed data" into a deployment space.
The scoring endpoint URL obtained by deploying the model is:
https://eu-gb.ml.cloud.ibm.com/ml/v4/deployments/a77fd05b-67a5-40d1-8ab1-17e160b261c8/predictions?version=2020-10-20
A flask application is built in order to perform automated deployment
of the ML model. The UI is built using HTML by creating a form to get the
independent variables of the dataset as the user inputs. The output of the
UI is given below:
After the UI is built, the Python script is written and executed. Executing the script deploys the Flask app on the local server (http://127.0.0.1:5000), on port 5000.
Once the user clicks the "Submit" button, the responses are recorded and the independent attributes of the dataset are transformed into the format in which they are sent to the ML model. The payload is created in the pattern of "[fields]:[values]" and is sent along with the URL as a POST request. The model present in the deployment makes the prediction and sends the result back to the local server in JSON format. The prediction of the model is printed on the result page.
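A minimal sketch of the scoring call made by the Flask backend, assuming an IAM access token (iam_token), the list of input column names (field_names), and one row of form values (form_values) have already been prepared; the payload follows the "input_data" fields/values layout expected by the Watson Machine Learning v4 scoring endpoint:

import requests

scoring_url = ("https://eu-gb.ml.cloud.ibm.com/ml/v4/deployments/"
               "a77fd05b-67a5-40d1-8ab1-17e160b261c8/predictions?version=2020-10-20")

headers = {"Content-Type": "application/json",
           "Authorization": "Bearer " + iam_token}     # IAM token obtained beforehand (placeholder)

payload = {"input_data": [{"fields": field_names,      # input column names (placeholder list)
                           "values": [form_values]}]}  # one row of values from the HTML form (placeholder)

response = requests.post(scoring_url, json=payload, headers=headers)
print(response.json())                                 # JSON prediction returned by the deployment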
6.4 AUTO AI PIPELINE DETAILS
The AutoAI experiment is run on IBM Watson Studio by feeding it the dataset. Pipelines are generated from two different algorithms, with different versions obtained by varying critical Machine Learning parameters. All the models are created and run automatically, and the results are provided with an ample set of metrics available for comparison.
The result of pipeline comparison is shown below:
The final list with the AutoAI algorithms into consideration along with
the computed metrics is given below:
ADVANTAGES & DISADVANTAGES
7.1 ADVANTAGES
● The services provided by IBM Cloud can be leveraged to perform
complex tasks of any scale with ease.
● The interface is easy to use, with tours guiding users through all the important aspects of the services.
● Python is easy and handy when it comes to data visualization and analysing data distributions.
● Access to a wide range of software assets which can be incorporated into the project in just a few clicks.
● Access to Auto AI has made a huge impact on the project. It enables even novice users to understand Machine Learning algorithms and many related techniques.
● Production deployments and automation using payload scoring give exposure to handling an end-to-end application.
7.2 DISADVANTAGES
● Exceeding the capacity unit hours (CUH) quota imposes a bottleneck on utilizing the service's capabilities.
APPLICATIONS
The proposed system will be helpful in predicting whether a customer will leave the company or continue with it. This will be beneficial for telecom companies, which suffer significant losses due to customer churn. The rate at which customers leave a company is called the churn rate. Churn prediction helps reduce the churn rate to a minimum, playing an important role in the company's turnover and reputation.
The use of an application to make ad hoc predictions will help the users of the application get instant results for their inputs. The parameters chosen for predicting customer churn are spot-on, and all of them are critical for predicting the churn rate.
Using churn prediction beforehand will enable the company to take countermeasures and try to retain more customers by introducing optimized plans, new offers, etc. It also helps the company avoid unnecessary losses and gain new customers through improved workflow strategies.
CONCLUSION
I would like to extend my gratitude to IBM India Pvt. Ltd. and SmartInternz - by SmartBridge Educational Services, for giving me the opportunity to use the resources, study materials, and tutorials provided by them, and for having me as a part of the IBM Build-a-thon.
I have built a project named "Telecom Churn Prediction using Watson Auto AI" and have been provided free Watson Studio Desktop access for 30 days. I think I have done justice to the opportunity and the resources provided to me.
It has been an enthralling experience working on this project for 3 weeks. I have recorded a video to demonstrate the working of the project. I have also added all the resources from my side to the Git repository.
Scoring Endpoint URL:
https://eu-gb.ml.cloud.ibm.com/ml/v4/deployments/a77fd05b-67a5-40d1-8ab1-17e160b261c8/predictions?version=2020-10-20
GitHub link to my project:
https://github.com/SmartPracticeschool/SPS-5382-Telecom-Customer-Churn-Prediction-using-Watson-Auto-AI
Link to the Project Demonstration Video:
https://drive.google.com/file/d/1zsHlecIcB76JRT8yOepPMlURsCig0nYi/view?usp=sharing
FUTURE SCOPE
The project can be enhanced from different viewpoints, namely:
● Optimized Machine Learning algorithms
● More feature engineering techniques
● Analysing vital parameters for targeted customers
● Flask UI with improved functionalities
● Multiple deployments for different business scenarios
BIBLIOGRAPHY
1. Essam Shaaban, Yehia Helmy, Ayman Khedr, Mona Nasr | International
Journal of Engineering Research and Applications (IJERA) | A Proposed
Churn Prediction Model
2. Sandra Mitrović, Bart Baesens, Wilfried Lemahieu, Jochen De Weerdt |
On the Operational Efficiency of Different Feature Types for Telco Churn
Prediction
3. Veronikha Effendy, Adiwijaya, Z.K.A. Baizal. | 2014 2nd International
Conference on Information and Communication Technology (ICoICT) |
Handling Imbalanced Data in Customer Churn Prediction Using Combined
Sampling and Weighted Random Forest.
4. Yiqing Huang, Fangzhou Zhu, Mingxuan Yuan, Ke Deng, Yanhua Li, Bing
Ni, Wenyuan Dai, Qiang Yang, Jia Zeng | Advancing Computing as a Science
& Profession | Telco Churn Prediction with Big Data.
5. Sandra Mitrović, Bart Baesens, Wilfried Lemahieu, Jochen De Weerdt |
Churn Prediction using Dynamic RFM-Augmented node2vec.
APPENDIX
A. SOURCE CODE
"""# Customer Churn Prediction
## 1. Loading Libraries
"""
import json
import os
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import preprocessing, svm
from itertools import combinations
from sklearn.preprocessing import PolynomialFeatures, LabelEncoder, StandardScaler
import sklearn.feature_selection
from sklearn.model_selection import train_test_split
from collections import defaultdict
from sklearn import metrics
import pickle
"""### The Dataset
From a telecommunications company. It includes information about:
- Customers who left within the last month – the column is called Churn
- Services that each customer has signed up for – phone, multiple lines,
internet, online security, online backup, device protection, tech support, and
streaming TV and movies
- Customer account information – how long they’ve been a customer,
contract, payment method, paperless billing, monthly charges, and total
charges
- Demographic info about customers – gender, age range, and if they have
partners and dependents
### 2. Loading Our Dataset
Click on the cell below to highlight it.
Then go to the `Files` section to the right of this notebook and click `Insert
to code` for the data you have uploaded. Choose `Insert pandas
DataFrame`.
"""
import types
import pandas as pd
from botocore.client import Config
import ibm_boto3
def __iter__(self): return 0
# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share the notebook.
client_b874c30054d441ffacbe02cbcc8859e6 = ibm_boto3.client(
    service_name='s3',
    ibm_api_key_id='***',
    ibm_auth_endpoint="https://iam.cloud.ibm.com/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3.eu-geo.objectstorage.service.networklayer.com')

body = client_b874c30054d441ffacbe02cbcc8859e6.get_object(
    Bucket='telcochurnprediciton-donotdelete-pr-2use5r9izvml7k',
    Key='Churn_Modelling.csv')['Body']

# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"):
    body.__iter__ = types.MethodType(__iter__, body)
df_data_2 = pd.read_csv(body)
df_data_2.head()
customer_data = df_data_2
# Checking that everything is correct
pd.set_option('display.max_columns', 30)
customer_data.head(10)
"""### 3. Get some info about our Dataset and whether we have missing
values"""
# After running this cell we will see that we have no missing values
customer_data.info()
customer_data.shape
# Drop the identifier columns (RowNumber, CustomerId, Surname)
customer_data = customer_data.drop('RowNumber', axis=1)
customer_data = customer_data.drop('CustomerId', axis=1)
customer_data = customer_data.drop('Surname', axis=1)
customer_data.head(5)
# Check if we have any NaN values
customer_data.isnull().values.any()
customer_data.info()
"""### 4. Descriptive analytics for our data"""
# Describe columns with numerical values
pd.set_option('precision', 3)
customer_data.describe()
# Describe columns with objects
customer_data.describe(exclude=np.number)
# Find correlations
customer_data.corr(method='pearson')
"""### 5. Visualize our Data to understand it better
#### Plot Relationships
"""
# Plot Tenure Frequency count
sns.set(style="darkgrid")
sns.set_palette("hls", 3)
fig, ax = plt.subplots(figsize=(20,10))
ax = sns.countplot(x="Tenure", data=customer_data)
# Plot CreditScore Frequency count
sns.set(style="darkgrid")
sns.set_palette("hls", 3)
fig, ax = plt.subplots(figsize=(20,10))
ax = sns.countplot(x="CreditScore", data=customer_data)
# Plot Geography Frequency count
sns.set(style="darkgrid")
sns.set_palette("hls", 3)
fig, ax = plt.subplots(figsize=(20,10))
ax = sns.countplot(x="Geography", data=customer_data)
# Create Grid for pairwise relationships
gr = sns.PairGrid(customer_data, size=5)
gr = gr.map_diag(plt.hist)
gr = gr.map_offdiag(plt.scatter)
gr = gr.add_legend()
"""#### Understand Data Distribution"""
# Set up plot size
fig, ax = plt.subplots(figsize=(6,6))
# Attributes distribution
a = sns.boxplot(orient="v", palette="hls", data=customer_data.iloc[:], fliersize=14)
# Balance data distribution
histogram = sns.distplot(customer_data.iloc[:, 5], hist=True)
plt.show()
# NumOfProducts data distribution
histogram = sns.distplot(customer_data.iloc[:, 6], hist=True)
plt.show()
# HasCrCard data distribution
histogram = sns.distplot(customer_data.iloc[:, 7], hist=True)
plt.show()
customer_data1 = customer_data
customer_data1 = customer_data1.drop('Exited', axis=1)
customer_data1.head(5)
"""### 6. Encode string values in data into numerical values"""
# Use pandas get_dummies
customer_data_encoded = pd.get_dummies(customer_data1)
print(customer_data_encoded.head(10))
customer_data_encoded.shape
"""### 7. Create Training Set and Labels"""
# Create training data for non-preprocessed approach
X_npp = customer_data.iloc[:, :-1].apply(LabelEncoder().fit_transform)
pd.DataFrame(X_npp).head(5)
# Create training data that will undergo preprocessing
X = customer_data_encoded
X.head()
print(X.shape)
# Extract labels
y_unenc = customer_data['Exited']
# Encode the target labels as binary values of 0 or 1 (Exited is already numeric)
le = preprocessing.LabelEncoder()
le.fit(y_unenc)
y_le = le.transform(y_unenc)
pd.DataFrame(y_le)
"""### 8. Detect outliers in numerical values"""
# Calculate the Z-score using median value and median absolute deviation
# for more robust calculations
# Working on EstimatedSalary column
threshold = 3
median = np.median(X['EstimatedSalary'])
median_absolute_deviation = np.median([np.abs(x - median) for x in X['EstimatedSalary']])
modified_z_scores = [0.6745 * (x - median) / median_absolute_deviation for x in X['EstimatedSalary']]
results = np.abs(modified_z_scores) > threshold
print(np.any(results))
# Do the same for Balance column but using the interquartile method
quartile_1, quartile_3 = np.percentile(X['Balance'], [25, 75])
iqr = quartile_3 - quartile_1
lower_bound = quartile_1 - (iqr * 1.5)
upper_bound = quartile_3 + (iqr * 1.5)
print(np.where((X['Balance'] > upper_bound) | (X['Balance'] < lower_bound)))
print(X)
X.shape
# Find interactions between current features and append them to the dataframe
def add_interactions(dataset):
    # Get feature names
    comb = list(combinations(list(dataset.columns), 2))
    col_names = list(dataset.columns) + ['_'.join(x) for x in comb]
    # Find interactions
    poly = PolynomialFeatures(interaction_only=True, include_bias=False)
    dataset = poly.fit_transform(dataset)
    dataset = pd.DataFrame(dataset)
    dataset.columns = col_names
    # Remove interactions with 0 values
    no_inter_indexes = [i for i, x in enumerate(list((dataset == 0).all())) if x]
    dataset = dataset.drop(dataset.columns[no_inter_indexes], axis=1)
    return dataset
print(X)
X.shape
X_inter = add_interactions(X)
X_inter.head(15)
# Select best features
select = sklearn.feature_selection.SelectKBest(k=25)
selected_features = select.fit(X_inter, y_le)
indexes = selected_features.get_support(indices=True)
col_names_selected = [X_inter.columns[i] for i in indexes]
X_selected = X_inter[col_names_selected]
X_selected.head(10)
"""### 10. Split our dataset into train and test datasets
#### Split non-preprocessed data
"""
X_train_npp, X_test_npp, y_train_npp, y_test_npp = train_test_split(
    X_npp, y_unenc, test_size=0.33, random_state=42)
print(X_train_npp.shape, y_train_npp.shape)
print(X_test_npp.shape, y_test_npp.shape)
X_train, X_test, y_train, y_test = train_test_split(
    X, y_unenc, test_size=0.33, random_state=42)
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
X_test.head()
"""#### Trying to send data to the endpoint will return predictions with
probabilities
### 11. Scale our data
"""
# Use StandardScaler
scaler = preprocessing.StandardScaler().fit(X_train, y_train)
X_train_scaled = scaler.transform(X_train)
pd.DataFrame(X_train_scaled, columns=X_train.columns).head()
pd.DataFrame(y_train).head()
"""### 12. Start building a classifier
#### Support Vector Machines on non-preprocessed data
"""
from sklearn.svm import SVC
# Run classifier
clf_svc_npp = svm.SVC(random_state=42)
clf_svc_npp.fit(X_train_npp, y_train_npp)
"""#### Support Vector Machines on preprocessed data"""
from sklearn.linear_model import LogisticRegression
# Run classifier
clf_svc = svm.SVC(random_state=42)
clf_svc.fit(X_train_scaled, y_train)
"""#### Logestic Regression on preprocessed data"""
from sklearn.linear_model import LogisticRegression
clf_lr = LogisticRegression()
model = clf_lr.fit(X_train_scaled, y_train)
model
"""#### Multilayer Perceptron (Neural Network) on preprocessed data"""
from sklearn.neural_network import MLPClassifier
clf_mlp = MLPClassifier(verbose=0)
clf_mlp.fit(X_train_scaled, y_train)
# Note: MLP as a NN can use data without the feature engineering step,
# as the NN will handle that automatically
"""### 13. Evaluate our model"""
# Use the scaler fit on trained data to scale our test data
X_test_scaled = scaler.transform(X_test)
pd.DataFrame(X_test_scaled, columns=X_train.columns).head()
"""#### Evaluate SVC on non-preprocessed data"""
# Predict confidence scores for data
y_score_svc_npp = clf_svc_npp.decision_function(X_test_npp)
pd.DataFrame(y_score_svc_npp)
# Get accuracy score
from sklearn.metrics import accuracy_score
y_pred_svc_npp = clf_svc_npp.predict(X_test_npp)
acc_svc_npp = accuracy_score(y_test_npp, y_pred_svc_npp)
print(acc_svc_npp)
# Get Precision vs. Recall score
from sklearn.metrics import average_precision_score
average_precision_svc_npp = average_precision_score(y_test_npp,
y_score_svc_npp)
print('Average precision-recall score: {0:0.2f}'.format(
average_precision_svc_npp))
"""#### Evaluate SVC on preprocessed data"""
# Get model confidence of predictions
y_score_svc = clf_svc.decision_function(X_test_scaled)
y_score_svc
# Get accuracy score
y_pred_svc = clf_svc.predict(X_test_scaled)
acc_svc = accuracy_score(y_test, y_pred_svc)
print(acc_svc)
# Get Precision vs. Recall score
average_precision_svc = average_precision_score(y_test, y_score_svc)
print('Average precision-recall score: {0:0.2f}'.format(
average_precision_svc))
"""#### Evaluate Logistic Regression on preprocessed data"""
y_score_lr = clf_lr.decision_function(X_test_scaled)
y_score_lr
y_pred_lr = clf_lr.predict(X_test_scaled)
acc_lr = accuracy_score(y_test, y_pred_lr)
print(acc_lr)
average_precision_lr = average_precision_score(y_test, y_score_lr)
print('Average precision-recall score: {0:0.2f}'.format(
average_precision_lr))
"""#### Evaluate MLP on preprocessed data"""
y_score_mlp = clf_mlp.predict_proba(X_test_scaled)[:, 1]
y_score_mlp
y_pred_mlp = clf_mlp.predict(X_test_scaled)
acc_mlp = accuracy_score(y_test, y_pred_mlp)
print(acc_mlp)
average_precision_mlp = average_precision_score(y_test, y_score_mlp)
print('Average precision-recall score: {0:0.2f}'.format(
average_precision_mlp))
"""### 14. ROC Curve and models comparisons"""
# Plot SVC ROC Curve
plt.figure(0, figsize=(20,15)).clf()
fpr_svc_npp, tpr_svc_npp, thresh_svc_npp = metrics.roc_curve(y_test_npp,
y_score_svc_npp)
auc_svc_npp = metrics.roc_auc_score(y_test_npp, y_score_svc_npp)
plt.plot(fpr_svc_npp, tpr_svc_npp, label="SVC Non-Processed, auc=" +
str(auc_svc_npp))
fpr_svc, tpr_svc, thresh_svc = metrics.roc_curve(y_test, y_score_svc)
auc_svc = metrics.roc_auc_score(y_test, y_score_svc)
plt.plot(fpr_svc, tpr_svc, label="SVC Processed, auc=" + str(auc_svc))
fpr_mlp, tpr_mlp, thresh_mlp = metrics.roc_curve(y_test, y_score_mlp)
auc_mlp = metrics.roc_auc_score(y_test, y_score_mlp)
plt.plot(fpr_mlp, tpr_mlp, label="MLP, auc=" + str(auc_mlp))
fpr_lr, tpr_lr, thresh_lr = metrics.roc_curve(y_test, y_score_lr)
auc_lr = metrics.roc_auc_score(y_test, y_score_lr)
plt.plot(fpr_lr, tpr_lr, label="Logistic Regression, auc=" + str(auc_lr))
plt.legend(loc=0)
filename = 'clf_svc.pkl'
pickle.dump(clf_svc, open(filename, 'wb'))
#!mkdir C:\Users\Palani\Downloads\model
!cp clf_svc.pkl C:\Users\Palani\Downloads
!tar -zcvf clf_svc.tar.gz clf_svc.pkl
from ibm_watson_machine_learning import APIClient
wml_credentials = {
    "url": "https://eu-gb.ml.cloud.ibm.com",
    "apikey": "***"
}
client = APIClient(wml_credentials)
metadata = {
    client.spaces.ConfigurationMetaNames.NAME: "Telco Churn DS",
    client.spaces.ConfigurationMetaNames.DESCRIPTION: "To predict customers who exit the company",
    client.spaces.ConfigurationMetaNames.STORAGE: {
        "type": "bmcos_object_storage",
        "resource_crn": "***"
    },
    client.spaces.ConfigurationMetaNames.COMPUTE: {
        "name": "WatsonMachineLearning",
        "crn": "***"
    },
}
space_details = client.spaces.store(meta_props=metadata)
space_details
space_id = space_details["metadata"]["id"]
space_id
#space_id = "***"
client.set.default_space(space_id)
client.software_specifications.list()
import sklearn
sklearn.__version__
spec_id = client.software_specifications.get_id_by_name("scikit-learn_0.20-py3.6")
#spec_id = "***"
model_details = client.repository.store_model(model=clf_svc, meta_props={
    client.repository.ModelMetaNames.NAME: "Churn Prediction",
    client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: spec_id,
    client.repository.ModelMetaNames.TYPE: "scikit-learn_0.20"
})
model_id = model_details["metadata"]["id"]
model_id
#model_id = "***"
deployment_metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "Churn Prediction Deployment",
    client.deployments.ConfigurationMetaNames.ONLINE: {}
}
deployment_details = client.deployments.create(
    artifact_uid=model_id, meta_props=deployment_metadata)
deployment_id = deployment_details["metadata"]["id"]
col = X.columns
col = list(col)
col
score_list = ['589','39','6','163520.37','3','1','0','75238.55','0','1','0','1','0']
payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [{
        "fields": col,
        "values": [score_list],
    }]
}
deployment_details = client.deployments.score(
    deployment_id=deployment_id, meta_props=payload)
deployment_details