SMDM-Project Sample Business Report
Statistical Methods for Decision Making
Project Report
1.2 Data Overview
1.5 Answer Key Questions
1.6 Conclusion and Recommendations
List of Tables
Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
This file is meant for personal use by [email protected] only.
Sharing or publishing the contents in part or full is liable for legal action.
List of Figures
3 Univariate Analysis of Partner
6 Univariate Analysis of Gender
8 Univariate Analysis of Marital Status
9 Univariate Analysis of Education
10 Univariate Analysis of Personal Loan
22 Make vs Profession Plot
29 Make vs Profession Plot (Male)
Project 1
Problem Definition
Context
day-to-day life, people use cars for commuting to work, shopping, visiting
family and friends, etc. Research shows that more than 76% of people
prevent themselves from traveling somewhere if they don't have a car. Most
people tend to buy different types of cars based on their day-to-day
In order to be familiar with the types of cars preferred by the customers and
factors influencing the car purchase behavior in the US market, Austo has
contracted a consulting firm. Based on various market surveys, the
consulting firm has created a dataset of 3 major types of cars that are
extensively used across the US market. They have collected various details
of the car owners which can be analyzed to understand the automobile
market of the US.
Objective
understand the dynamics of a new market. Suppose you are a Data
Scientist working at the consulting firm that has been contracted by Austo.
You are given the task to create buyer's profiles for different types of cars
with the available data as well as a set of recommendations for Austo.
Data Description
austo_automobile.csv: The dataset contains buyers' data corresponding to
different types of products (cars).
Data Dictionary
● Age: Age of the customer
● No_of_dependents: Number of dependents (partner/children/spouse)
of the customer
● Personal_loan: Indicates whether the customer availed a personal
loan or not
● House_loan: Indicates whether the customer availed house loan or
not
● Partner_working: Indicates whether the customer's partner is working
or not
● Salary: Annual Salary of the customer
● Partner_salary: Annual Salary of the customer's partner
● Total_salary: Annual household income (Salary + Partner_salary) of
the customer's family
● Price: Price of the car
● Make: Car type (Hatchback/Sedan/SUV)
Data Overview
Load the required packages, set the working directory, and load the data
file.
The dataset has 1581 rows and 14 columns. It is always good practice to
view a sample of the rows. A simple way to do that is to use the head()
function.
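The loading step described above can be sketched roughly as follows. This is only an illustration: the report does not show its code, pandas is assumed as the library, and the handful of rows below are made-up stand-ins for austo_automobile.csv so that the snippet runs on its own.

```python
import io

import pandas as pd

# Made-up stand-in for austo_automobile.csv (illustrative values only).
# With the real file one would call pd.read_csv("austo_automobile.csv").
sample_csv = io.StringIO(
    "Age,Gender,Salary,Partner_salary,Total_salary,Price,Make\n"
    "53,Male,99300,70700,170000,61000,SUV\n"
    "41,Female,85000,0,85000,55000,SUV\n"
    "35,Male,70000,30000,100000,38000,Sedan\n"
    "29,Male,60000,20000,80000,27000,Hatchback\n"
    "24,Female,52000,0,52000,22000,Hatchback\n"
    "31,Male,64000,25000,89000,33000,Sedan\n"
)

df = pd.read_csv(sample_csv)

print(df.shape)   # (number of rows, number of columns)
print(df.head())  # first five rows by default
```

On the real dataset, df.shape would be expected to report the 1581 rows and 14 columns mentioned above.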
Table 1: Top five rows of the dataset
A quick look at the dataset information tells us that there are 6
numerical and 8 categorical variables. There are a few Null records present in
two variables: Gender and Partner_salary, which will be analyzed in detail in
the next section. There are no duplicate records in the dataset.
Inspecting Null Values -
Handling Nulls -
Nulls are usually handled by the following techniques –
● If the proportion of Null values is more than 60 % of the total number
of records in a column, then drop the column. Here you assume that
the column is uninformative.
For the given data, neither (a) nor (b) is applicable since the proportion of
null values in any column is small and no row contains a large number of
missing observations.
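A minimal sketch of how the null proportions per column and per row, and the duplicate check, might be inspected with pandas. The small DataFrame is made up for illustration; only the column names come from the report.

```python
import numpy as np
import pandas as pd

# Made-up rows; only the column names come from the report.
df = pd.DataFrame({
    "Gender": ["Male", None, "Female", "Male", "Male"],
    "Partner_working": ["Yes", "No", "Yes", "No", "Yes"],
    "Partner_salary": [40000.0, 0.0, np.nan, 0.0, 30000.0],
})

null_counts = df.isnull().sum()           # number of nulls in each column
null_share = df.isnull().mean() * 100     # percentage of nulls in each column
nulls_per_row = df.isnull().sum(axis=1)   # number of nulls in each row
duplicate_rows = df.duplicated().sum()    # number of fully duplicated rows

print(null_counts)
print(null_share.round(1))
print("Max nulls in a single row:", nulls_per_row.max())
print("Duplicate rows:", duplicate_rows)
```

Comparing null_share against a threshold such as the 60% mentioned above is then a one-line filter on the resulting Series.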
Simple rules for imputation:
● For categorical variables, we can impute the Nulls with the majority
class. For the current dataset, Null values in the ‘Gender’ field are
imputed with ‘Male’ (Male being the majority class).
variables are internally related.
The three variables on salary are related to one another:
Total_salary = Salary + Partner_salary
Also, non-zero values in the Partner_salary field are possible only if the
binary variable Partner_working is ‘Yes’. Hence, for this data, we do a
rule-based imputation instead of the mean/median imputation –
● If Partner_working = ‘No’ then Partner_salary = 0
● If Partner_working = ‘Yes’ then Partner_salary = Total_salary - Salary
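The two imputation rules above, together with the majority-class rule for Gender, could be written along these lines. This is a sketch on made-up rows, assuming pandas; it is not the report's own code.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; column names follow the data dictionary.
df = pd.DataFrame({
    "Gender": ["Male", None, "Female", "Male"],
    "Partner_working": ["Yes", "No", "Yes", "No"],
    "Salary": [60000, 50000, 70000, 45000],
    "Partner_salary": [np.nan, np.nan, 30000.0, 0.0],
    "Total_salary": [100000, 50000, 100000, 45000],
})

# Categorical variable: fill nulls with the majority class (the mode).
majority_gender = df["Gender"].mode()[0]
df["Gender"] = df["Gender"].fillna(majority_gender)

# Rule-based imputation for Partner_salary.
missing = df["Partner_salary"].isnull()
not_working = df["Partner_working"] == "No"

# Rule 1: partner not working -> partner salary is 0.
df.loc[missing & not_working, "Partner_salary"] = 0

# Rule 2: partner working -> Partner_salary = Total_salary - Salary.
df.loc[missing & ~not_working, "Partner_salary"] = df["Total_salary"] - df["Salary"]

print(df)
```

Restricting both rules to the rows where Partner_salary is null leaves the recorded values untouched.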
Statistical Summary
Table 3: Numerical summarization of the dataset
Observations:
● The average age of the customers is around 32 years. 75% of the
customers are below 38 years and the minimum age of the customer
is 22. This indicates that most buyers (about 75%) are in the 22-38
age group.
● 50% of the customers have at least 2 dependents.
● At least 25% of the customers' partners are not working. The average
partner salary is around 20000. The mean partner salary is
less than the median, which suggests that the distribution is
left-skewed.
● The average household salary of the customer is around 80000, with
a standard deviation of around 25000. The mean is
approximately equal to the median, which suggests that the
distribution is symmetrical.
● The price of the car lies in the range of 18000 to 70000 with an
average of around 36000. The mean price is greater than the
median, which suggests that the price distribution is a bit
right-skewed.
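The mean-versus-median reasoning used in these observations can be checked numerically. The sketch below uses a made-up Price series (not the report's data) and assumes pandas; describe() gives the kind of summary shown in Table 3, and skew() gives a direct measure of asymmetry.

```python
import pandas as pd

# Made-up prices with a long right tail, for illustration only.
price = pd.Series(
    [18000, 22000, 25000, 27000, 30000, 31000, 36000, 48000, 70000],
    name="Price",
)

summary = price.describe()   # count, mean, std, min, quartiles, max
mean_price = price.mean()
median_price = price.median()
skewness = price.skew()      # positive values indicate right skew

print(summary)
if mean_price > median_price:
    print("Mean > median: the distribution leans right-skewed.")
elif mean_price < median_price:
    print("Mean < median: the distribution leans left-skewed.")
else:
    print("Mean equals median: roughly symmetrical.")
print("Sample skewness:", round(skewness, 2))
```

The mean/median comparison is a rule of thumb; the skewness coefficient or a histogram is a more reliable check on the same question.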
We determine the unique values of each categorical variable to check whether any
junk/garbage values are present. This check can also help us identify
data entry issues.
From the value counts of the Gender variable, we find that there are two
instances of possible data entry issues. The word Female has been
misspelled as ‘Femle’ and ‘Femal’.
For the current dataset, we are confident that the category Female has
been misspelled, so we can go ahead and replace these records with the
correct spelling, i.e. ‘Female’. However, in real-world data, the issues might
not always be this straightforward; they might need thorough inspection
and domain knowledge to rectify.
The rest of the categorical fields seem to be free from any such issues.
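A short sketch of the check and the fix described above, assuming pandas and using a made-up Gender column that contains the two misspellings named in the report.

```python
import pandas as pd

# Made-up values that include the two misspellings mentioned in the report.
df = pd.DataFrame(
    {"Gender": ["Male", "Female", "Femle", "Femal", "Male", "Female"]}
)

# Step 1: list the distinct values and their frequencies to spot junk entries.
print(df["Gender"].value_counts(dropna=False))

# Step 2: map the misspelled categories to the correct spelling.
df["Gender"] = df["Gender"].replace({"Femle": "Female", "Femal": "Female"})

print(df["Gender"].value_counts(dropna=False))
```

Using an explicit mapping keeps the correction visible and easy to review, which helps when the misspellings are less obvious than in this dataset.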
Univariate Analysis
For univariate analysis, we look at boxplots and
histograms to get a better understanding of the distributions.
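One way such a histogram and boxplot pair might be drawn is sketched below. The helper name plot_distribution, the made-up Age values, and the use of matplotlib are all assumptions for illustration; the report does not state which plotting library it used.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere

import matplotlib.pyplot as plt
import pandas as pd


def plot_distribution(series, bins=10):
    """Draw a histogram and a boxplot of one numerical variable side by side.

    Hypothetical helper, not taken from the report.
    """
    values = series.dropna()
    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

    # Histogram with mean and median marked.
    ax_hist.hist(values, bins=bins)
    ax_hist.axvline(values.mean(), linestyle="--", color="green", label="mean")
    ax_hist.axvline(values.median(), linestyle="-", color="black", label="median")
    ax_hist.set_xlabel(series.name)
    ax_hist.set_ylabel("Count")
    ax_hist.legend()

    # Boxplot of the same values; points beyond the whiskers are outliers.
    ax_box.boxplot(values)
    ax_box.set_ylabel(series.name)

    fig.suptitle(f"Univariate analysis of {series.name}")
    return fig


# Made-up ages for illustration only.
age = pd.Series([22, 24, 25, 26, 27, 28, 29, 31, 35, 38, 45, 54], name="Age")
fig = plot_distribution(age)
fig.savefig("univariate_age.png")
```

Marking the mean and median on the histogram makes the direction of skew visible at a glance, which is the comparison the observations in this section rely on.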
Numerical variables
● Observations on Age
Observations:
● The distribution of Age is right skewed.
● From the boxplot we can see that the second quartile (Q2) is less than
30, which means at least 50% of the customers in the dataset are
below the age of 30.
● There are a few outliers in this variable.
● Observations on Salary
Figure 2: Univariate analysis of Salary
Observations:
● The salary of the customer lies between 30,000 and 90,000, with an
average of around 60,000.
● Observations on Partner’s salary
Figure 3: Univariate analysis of Partner_salary
Observations:
● Around 45% of the customers' partners do not work. Hence, their
salary is 0.
● Observations on Total salary
Figure 4: Univariate analysis of Total_salary
Observations:
● The total salary of the customer's household is approximately normally
distributed, with an average of around 80,000.
● Observations on Price
Figure 5: Univariate analysis of Price
Observations:
● Most of the cars are priced in the range 20000-40000.
● The mean price of the cars is greater than the median. This indicates
that the car price is right-skewed.
Categorical variables
● Observations on Gender
Observations:
● There are more male customers (around 79%) than female customers (around 21%).
Figure 6: Univariate analysis of Gender
● Observations on Profession
Observations:
● There are more salaried customers (around 57%) than business
persons (around 43%).
● Observations on Marital Status
Observations:
● 91.3% of the customers are married.
Only 8.7% of the customers are single.
Figure 8: Univariate analysis of Marital Status
● Observations on Education
Observations:
● Around 38% of the customers are graduates, whereas 62% have completed
their post-graduation.
● Observations on Personal Loan
Observations:
● Around 50% of the customers have
a personal loan.
Figure 10: Univariate analysis of Personal Loan
● Observations on Number of dependents
Observations:
● Observations on House loan
Observations:
● Around 33% of the customers
have a house loan.
Figure 12: Univariate analysis of House Loan
● Observations on Partner working
Observations:
● Observations on Make
Observations:
● Sales of the 'Hatchback' type car are higher than those of SUV
and Sedan.
● Only 15% of the customers buy SUVs.
Figure 14: Univariate analysis of Make
Insights
● Sedan is the most preferred purchase, followed by Hatchback and
SUV.
● The number of customers having a working partner is slightly higher
than the number of customers with a nonworking partner or singles. There are a total
of 713 customers with Partner_working variable as ‘No’, out of which
● The majority of the customers in the dataset are Post Graduates.
● From the bar plot of the No_of_dependents variable, we can infer that
the majority of the customers have either 2 or 3 dependents, followed by
1 or 4 dependents. Very few customers have no dependents.
Multivariate Analysis
● Correlation of Numerical Variables
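A correlation matrix of the numerical variables could be computed as sketched below. The values are made up and pandas is assumed; the heatmap in the original report is not reproduced here.

```python
import pandas as pd

# Made-up rows for illustration; column names follow the data dictionary.
df = pd.DataFrame({
    "Age": [24, 29, 31, 35, 41, 53],
    "Salary": [52000, 60000, 64000, 70000, 85000, 99300],
    "Partner_salary": [0, 20000, 25000, 30000, 0, 70700],
    "Price": [22000, 27000, 33000, 38000, 55000, 61000],
    "Make": ["Hatchback", "Hatchback", "Sedan", "Sedan", "SUV", "SUV"],
})
df["Total_salary"] = df["Salary"] + df["Partner_salary"]

# Keep only the numerical columns, then compute pairwise Pearson correlations.
numeric_df = df.select_dtypes(include="number")
corr = numeric_df.corr()

print(corr.round(2))
```

The resulting square matrix is what a heatmap would visualize; values close to 1 or -1 point to strongly related pairs such as Salary and Total_salary.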
Observations:
● Show the relationship between numerical variables
Observations:
● Customers with higher household salaries prefer SUVs and sedans;
whereas customers with lower household salaries prefer Hatchback
cars.
● Customers in the higher age group prefer SUVs; whereas young
customers prefer hatchbacks.
● Let's analyze it further to get more insights.
Find relationship between Numerical and Categorical variables
● Make vs Age
Figure 17: Make vs Age Plot
Observations:
● SUV is preferred by customers in the age group 35-60.
● Sedan is preferred by customers in the age group 30-45.
● Hatchback is preferred by the younger customers in the age group
22-30.
● Make vs Price
Figure 18: Make vs Price
Observations:
● SUV is the costliest type of car among the three car types. The price
range of the SUVs is 50000-70000.
● Sedan is costlier compared to hatchback type cars.
● Hatchback is the most affordable car ranging between 15000-35000.
● Make vs Salary
Figure 19: Make vs Salary
Observations:
● SUV is the costliest type of car among the three car types. Hence,
customers with higher household incomes prefer to buy SUVs.
● Make vs Education
Figure 20: Make vs Education
Observations:
● Customers
● Make vs Number of dependents
Figure 21: Make vs Number of dependents
Observations:
● Customers
● Make vs Profession
Figure 22: Make vs Profession
Observations:
● Salaried customers buy more cars than customers who
own their businesses.
● Sales of the hatchback are almost the same for Business and
Salaried individuals. Hatchback is more popular in both the
professions.
● Make vs Personal loan
Figure 23: Make vs Personal loan
Observations:
● Few
● Make vs House loan
Figure 24: Make vs House loan
Observations:
● SUV customers do not have a house loan.
● More of the hatchback customers have a house loan compared to the
Sedan customers.
● Make vs Gender
Figure 25: Make vs Gender
Observations:
● Females prefer SUV and are least likely to buy a Hatchback
● Males prefer Sedan or hatchback
● SUV is least preferable among males
● Make vs Marital status
Figure 26: Make vs Marital status
Observations:
● Married
Answer Key Questions
Proportion of Males buying SUVs = 0.09 (Number of males who bought
SUV / Total number of males)
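The proportion defined above (males who bought an SUV divided by all males) can be computed as in the sketch below. The rows are made up, so the resulting numbers are not the report's 0.09; pandas is assumed.

```python
import pandas as pd

# Made-up purchases for illustration; these are not the report's counts.
df = pd.DataFrame({
    "Gender": ["Male"] * 5 + ["Female"] * 3,
    "Make": ["SUV", "Sedan", "Hatchback", "Sedan", "Hatchback",
             "SUV", "SUV", "Sedan"],
})

# Number of males who bought an SUV / total number of males.
males = df[df["Gender"] == "Male"]
prop_male_suv = (males["Make"] == "SUV").mean()

# The same ratio for every gender and make at once (each row sums to 1).
by_gender = pd.crosstab(df["Gender"], df["Make"], normalize="index")

print("Proportion of males buying SUVs:", round(prop_male_suv, 2))
print(by_gender.round(2))
```

The row-normalized crosstab puts the male and female proportions side by side, which is what a comparison between genders needs.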
● What is the likelihood of a salaried person buying a Sedan?
Proportion of SUVs purchased = 0.23 (Total SUVs bought by salaried / Total
Cars purchased by salaried)
Using visualization to arrive at the conclusion, we draw a count plot with
Profession on the x-axis and Make as the hue parameter.
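The counts that such a count plot displays can also be tabulated directly. The sketch below uses made-up rows and pandas only; a plotting call (for example a seaborn count plot with x set to Profession and hue set to Make, if that is the library used) would show the same numbers as bars.

```python
import pandas as pd

# Made-up purchases for illustration; these are not the report's counts.
df = pd.DataFrame({
    "Profession": ["Salaried"] * 6 + ["Business"] * 4,
    "Make": ["Sedan", "Sedan", "Sedan", "Hatchback", "Hatchback", "SUV",
             "Hatchback", "Hatchback", "Sedan", "SUV"],
})

# Counts per Profession (rows) and Make (columns): the bar heights of the plot.
counts = pd.crosstab(df["Profession"], df["Make"])

# Share of each Make within a Profession, e.g. Sedans among salaried buyers.
shares = pd.crosstab(df["Profession"], df["Make"], normalize="index")
salaried_sedan_share = shares.loc["Salaried", "Sedan"]

print(counts)
print("Share of salaried buyers choosing a Sedan:", round(salaried_sedan_share, 2))
```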
● What evidence or data supports Sheldon Cooper's claim that a
salaried male is an easier target for an SUV sale over a Sedan sale?
Proportion of SUVs = 90/672 = 0.13 (Total SUV purchased / Total Cars
purchased)
● How does the amount spent on purchasing automobiles vary across
gender?
Females are more likely to buy SUVs and, on average, spend more on cars
than males: 47705 units against 32416 units.
Mean of Price across Gender:
Female = 47705
Male = 32416
Median of Price across Gender:
Female = 49000
Male = 29000
Both the mean and the median Price are higher for Female customers than for Male
customers.
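A grouped summary like the one above might be produced as follows. The prices are made up, so the output will not match the report's figures; pandas is assumed.

```python
import pandas as pd

# Made-up purchases for illustration; these are not the report's data.
df = pd.DataFrame({
    "Gender": ["Female", "Female", "Female", "Male", "Male", "Male", "Male"],
    "Price": [45000, 49000, 52000, 25000, 29000, 31000, 43000],
})

# Mean and median Price for each Gender in a single table.
price_by_gender = df.groupby("Gender")["Price"].agg(["mean", "median"])

print(price_by_gender.round(0))
```

The same pattern applies to the other grouping variables in this section, such as Personal_loan or Partner_working, by changing the column passed to groupby.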
Personal Loan: Yes = 31000
● How does having a working partner influence the purchase of
higher-priced cars?
Mean of Price across Partner_working:
Partner_working: No = 36000
Partner_working: Yes = 35267
Median of Price across Partner_working:
Partner_working: No = 31000
The mean and median prices of the purchased automobiles are almost the same
across the Partner_working categories, thus indicating that partner working
Conclusion and Recommendations
Conclusions
Hatchback:
● An affordable and general-purpose car that can be used by a wide
range of users.
● It can be considered as an entry-level car generally targeted at the
younger population with an average income of 55k.
Sedan:
● Slightly costlier compared to hatchback-type cars
● The product also generally targets customers in their 30s who have a
slightly higher income.
● The product is suitable for single customers.
SUV:
● A costly car that will excite the car-lovers
● It has a higher price point and is more suitable for customers who do
not have any kind of loans on them.
● The buyers in this segment are older and salaried individuals.
Business Recommendations
● Austo should first launch the affordable Hatchback model in the US
market targeting the younger population. This car type can be the
flagship product that brings in profits for the company as most of the
young US customers prefer this model.
● Then, Austo should launch a good and affordable Sedan model. The
company needs to engage in more marketing for this model and
should try to lure the younger age group customers into buying this
model.
● After the successful launch of these models, the company can launch
the SUV model with a competitive pricing strategy to gain more
profits from the US automobile market. SUVs can be targeted at
people in the 35-60 age group, as most of the customers for
SUVs are in this age range.
Project 2- Framing Analytics Problem
Problem Definition
CONTEXT
capital that the bank lends out to customers has historically been the most
significant method of revenue generation. The bank earns profits from the
difference between the interest rates it pays on deposits and other sources
of funds, and the interest rates it charges on the loans it gives out. GODIGT
Bank is a mid-sized private bank that deals in all kinds of banking products,
such as savings accounts, current accounts, and investment products,
among other offerings. The bank also cross-sells asset products to its
existing customers through personal loans, auto loans, business loans, etc.,
and to do so they use various communication methods including cold
calling, e-mails, recommendations on net banking, mobile banking, etc.
GODIGT Bank also has a set of customers who were given credit cards
based on risk policy and customer category class, but due to huge
competition in the credit card market, the bank is observing high attrition in
credit card spending. The bank makes money only if customers spend
more on credit cards. Given the attrition, the Bank wants to revisit its credit
card policy and make sure that the card given to the customer is the right
credit card. The bank will make a profit only through the customers that
Objective
You are a Data Scientist at the company, and the Data Science team has shared
some data with you. You are supposed to find the key variables that have a vital
impact on the analysis, which will help the company improve the
business.
Data Description
Credit Card Data for GODIGT Bank
Data Dictionary:
userid - Unique bank customer-id
card_no - Masked credit card number
card_bin_no - Credit card IIN number
Issuer - Card network issuer
card_type - Credit card type
card_source_data - Credit card sourcing date
high_networth - Customer category based on their net-worth value (A: High to E:
Low)
active_30 - Savings/Current/Salary etc. account activity in last 30 days
active_60 - Savings/Current/Salary etc. account activity in last 60 days
active_90 - Savings/Current/Salary etc. account activity in last 90 days
engagement_products - Number of investment/loan products the customer
holds (FD, RD, Personal loan, auto loan)
annual_income_at_source - Annual income recorded in the credit card
application
other_bank_cc_holding - Whether the customer holds another bank credit card
bank_vintage - Vintage with the bank (in months) as on Tth month
T+1_month_activity - Whether customer uses credit card in T+1 month (future)
T+2_month_activity - Whether customer uses credit card in T+2 month (future)
T+3_month_activity - Whether customer uses credit card in T+3 month (future)
T+6_month_activity - Whether customer uses credit card in T+6 month (future)
T+12_month_activity - Whether customer uses credit card in T+12 month (future)
Transactor_revolver - Revolver: Customer who carries balances over from one
month to the next. Transactor: Customer who pays off their balances in full every
month.
avg_spends_l3m - Average credit card spends in last 3 months
Occupation_at_source - Occupation recorded at the time of credit card
application
cc_limit - Current credit card limit
*All above data has been recorded as on Tth month excluding T+1_month_activity,
T+2_month_activity, T+3_month_activity, T+6_month_activity, T+12_month_activity
Top 5 important variables with justification.
● annual_income_at_source - Annual income plays a big role in the
purchasing power of an individual hence is a vital piece of info.
Income can be used by the banks to make better decisions in areas
such as risk profiling, targeted ads, campaigns, offers, loan limits etc.
● cc_limit – Defining the credit card limit for customers based on their
attributes (such as income, CIBIL Score, etc.) is part of the Risk
Management practice wherein the banks try to minimize the number
of defaulters. The banks seek a quantifiable answer to the query
“How much is too much?”
● cc_active30 – Flag variables such as cc_active30, cc_active60 can be
used to understand how frequently the customer
uses the credit card, whether the account is dormant, or whether the customer is
experiencing any issues leading to reduced usage of the card.
high spend indicates primary account whereas lower spend would
mean secondary account. Campaigns can be rolled out on the basis
of the customer preference, customized offers can be given to lure
customers into using the credit account more frequently.
A few variables are unimportant from an analysis point of view and
are merely customer/account identifiers:
1. userid
2. card_no
3. card_bin_no
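Setting these identifier columns aside before analysis could look like the sketch below. The rows are made up and pandas is assumed; only the column names come from the data dictionary.

```python
import pandas as pd

# Made-up rows for illustration; column names follow the data dictionary.
cards = pd.DataFrame({
    "userid": [101, 102, 103],
    "card_no": ["4384 XXXX XXXX 1001", "5201 XXXX XXXX 2002", "4384 XXXX XXXX 3003"],
    "card_bin_no": [438400, 520100, 438400],
    "annual_income_at_source": [650000, 1200000, 480000],
    "cc_limit": [100000, 300000, 75000],
})

identifier_cols = ["userid", "card_no", "card_bin_no"]

# Count distinct values per identifier column. A count equal to the number of
# rows means the column only labels records and carries no pattern to analyze.
print(cards[identifier_cols].nunique())

# Keep the identifiers out of the analysis frame without altering the original.
features = cards.drop(columns=identifier_cols)

print(features.columns.tolist())
```

Keeping the original frame intact means userid is still available for joining results back to customers after the analysis.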