PDF Notebook

Case study solution

Uploaded by

Krunal Kalariya

09/12/2023, 21:33 Aerofit 9th dec - Jupyter Notebook

In [1]: import pandas as pd

In [2]: df=pd.read_csv('aerofit.csv')

In [3]: df.head(10)

Out[3]:
   Product  Age  Gender  Education  MaritalStatus  Usage  Fitness  Income  Miles
0    KP281   18    Male         14         Single      3        4   29562    112
1    KP281   19    Male         15         Single      2        3   31836     75
2    KP281   19  Female         14      Partnered      4        3   30699     66
3    KP281   19    Male         12         Single      3        3   32973     85
4    KP281   20    Male         13      Partnered      4        2   35247     47
5    KP281   20  Female         14      Partnered      3        3   32973     66
6    KP281   21  Female         14      Partnered      3        3   35247     75
7    KP281   21    Male         13         Single      3        3   32973     85
8    KP281   21    Male         15         Single      5        4   35247    141
9    KP281   21  Female         15      Partnered      2        3   37521     85

In [4]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 180 entries, 0 to 179
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Product        180 non-null    object
 1   Age            180 non-null    int64
 2   Gender         180 non-null    object
 3   Education      180 non-null    int64
 4   MaritalStatus  180 non-null    object
 5   Usage          180 non-null    int64
 6   Fitness        180 non-null    int64
 7   Income         180 non-null    int64
 8   Miles          180 non-null    int64
dtypes: int64(6), object(3)
memory usage: 12.8+ KB

In [5]: df.isnull().sum()

Out[5]: Product 0
Age 0
Gender 0
Education 0
MaritalStatus 0
Usage 0
Fitness 0
Income 0
Miles 0
dtype: int64

In [6]: df['Product'].value_counts()

Out[6]: KP281 80
KP481 60
KP781 40
Name: Product, dtype: int64
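
The counts in Out[6] can also be read as each product's share of all purchases. A minimal sketch, assuming the three counts above are the complete data; `product_counts` and `product_share` are illustrative names that do not appear in the notebook:

```python
import pandas as pd

# Product counts copied from Out[6]; the CSV itself is not read here.
product_counts = pd.Series({"KP281": 80, "KP481": 60, "KP781": 40}, name="Product")

# Share of each product in percent. On the full DataFrame the equivalent call
# would be df['Product'].value_counts(normalize=True) * 100.
product_share = product_counts / product_counts.sum() * 100
print(product_share.round(2))
```

These shares are the marginal probabilities of each product, independent of gender or any other column.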

In [7]: import seaborn as sbn

localhost:8888/notebooks/Desktop/DSML/dsml-case-studies/Aerofit/Aerofit 9th dec.ipynb 1/5



In [8]: sbn.boxplot(x='Product', y='Income', data=df)

Out[8]: <AxesSubplot:xlabel='Product', ylabel='Income'>

In [9]: sbn.boxplot(x='Product', y='Miles', data=df)

Out[9]: <AxesSubplot:xlabel='Product', ylabel='Miles'>

In [10]: sbn.boxplot(x='Product', y='Education', data=df)

Out[10]: <AxesSubplot:xlabel='Product', ylabel='Education'>

In [12]: df.groupby('Product')['Education'].median()

Out[12]: Product
KP281 16.0
KP481 16.0
KP781 18.0
Name: Education, dtype: float64


In [13]: df.groupby('Product')['Education'].describe()

Out[13]:
         count       mean       std   min   25%   50%   75%   max
Product
KP281     80.0  15.037500  1.216383  12.0  14.0  16.0  16.0  18.0
KP481     60.0  15.116667  1.222552  12.0  14.0  16.0  16.0  18.0
KP781     40.0  17.325000  1.639066  14.0  16.0  18.0  18.0  21.0

In [15]: sbn.boxplot(x='Product', y='Miles', data=df)

Out[15]: <AxesSubplot:xlabel='Product', ylabel='Miles'>

In [17]: sbn.scatterplot(x='Miles', y='Income',hue='Product', data=df)

Out[17]: <AxesSubplot:xlabel='Miles', ylabel='Income'>

In [18]: sbn.boxplot(x='Gender', y='Miles', hue='Product', data=df)

Out[18]: <AxesSubplot:xlabel='Gender', ylabel='Miles'>


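In [19]: sbn.heatmap(df.corr(numeric_only=True), annot=True)  # numeric_only=True skips the object columns (Product, Gender, MaritalStatus); pandas >= 2.0 raises an error without it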

Out[19]: <AxesSubplot:>

In [ ]: #insights- visual analysis



In [ ]: # 200 units of KP781: who will buy more, and how many units?

In [ ]: # Out of 100 women, what percentage buy KP281?

In [ ]: # 2300 units of KP281: who will buy more, married or unmarried customers, and by what percentage?
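
The married/unmarried question can be approached with a column-normalized crosstab on MaritalStatus (the data labels these groups Partnered and Single). A minimal sketch, assuming buyers keep splitting in the same proportions as the data; the helper `expected_buyers` is hypothetical, and because the full CSV is not reproduced here it runs only on the ten preview rows from Out[3], so its output says nothing about the full 180-row dataset:

```python
import pandas as pd

# The ten preview rows from Out[3]; only the two columns needed here.
preview = pd.DataFrame({
    "Product": ["KP281"] * 10,
    "MaritalStatus": ["Single", "Single", "Partnered", "Single", "Partnered",
                      "Partnered", "Partnered", "Single", "Single", "Partnered"],
})

def expected_buyers(data, group_col, product, units):
    """Split `units` of `product` across `group_col` in proportion to the data."""
    # normalize='columns' makes each product column sum to 1.
    share = pd.crosstab(index=data[group_col], columns=data["Product"],
                        normalize="columns")[product]
    return share * units

# Preview rows only: these numbers do not describe the full dataset.
print(expected_buyers(preview, "MaritalStatus", "KP281", 2300))
```

On the real data the same call would be `expected_buyers(df, "MaritalStatus", "KP281", 2300)`.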

In [22]: pd.crosstab(index=df['Gender'], columns=df['Product'])

Out[22]:
Product  KP281  KP481  KP781
Gender
Female      40     29      7
Male        40     31     33

In [25]: pd.crosstab(index=df['Gender'], columns=df['Product'], normalize='columns')*100

Out[25]:
Product  KP281      KP481  KP781
Gender
Female    50.0  48.333333   17.5
Male      50.0  51.666667   82.5

In [ ]: # If 1200 units of KP281 are sold, how many are bought by females? 50% of 1200 = 600
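
Scaling questions of this kind follow from the column-normalized table in Out[25]: multiply each gender's share of a product by the number of units sold. A minimal sketch that rebuilds the counts from Out[22] instead of reading aerofit.csv; it assumes future buyers split by gender in the same proportions as the sample, and `counts`, `gender_given_product`, and `expected_units` are names introduced here for illustration:

```python
import pandas as pd

# Gender x Product counts copied from Out[22].
counts = pd.DataFrame(
    {"KP281": [40, 40], "KP481": [29, 31], "KP781": [7, 33]},
    index=pd.Index(["Female", "Male"], name="Gender"),
)
counts.columns.name = "Product"

# P(Gender | Product): divide each column by its total, so columns sum to 1.
# This matches pd.crosstab(..., normalize='columns') on the full DataFrame.
gender_given_product = counts / counts.sum(axis=0)

def expected_units(product, units):
    """Split `units` of `product` between genders using the sample proportions."""
    return gender_given_product[product] * units

print(expected_units("KP781", 200))   # the 200-unit KP781 question
print(expected_units("KP281", 1200))  # the 1200-unit KP281 question
```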

In [26]: pd.crosstab(index=df['Gender'], columns=df['Product'], margins=True)

Out[26]:
Product  KP281  KP481  KP781  All
Gender
Female      40     29      7   76
Male        40     31     33  104
All         80     60     40  180

In [ ]: # Share of females: 76/180 = 42.2% (marginal probability)

# If there are 200 females, how many of them will buy KP281? 40/76 * 200 ≈ 105
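
The 40/76 figure is a conditional probability, P(KP281 | Female), which comes from normalizing the counts by row rather than by column. A minimal sketch, again rebuilding the counts from Out[22]; `product_given_gender` and `p_kp281_given_female` are illustrative names, and the 200-female projection assumes the sample proportions carry over:

```python
import pandas as pd

# Gender x Product counts copied from Out[22].
counts = pd.DataFrame(
    {"KP281": [40, 40], "KP481": [29, 31], "KP781": [7, 33]},
    index=pd.Index(["Female", "Male"], name="Gender"),
)

# P(Product | Gender): divide each row by its total, so rows sum to 1.
# This matches pd.crosstab(..., normalize='index') on the full DataFrame.
product_given_gender = counts.div(counts.sum(axis=1), axis=0)

p_kp281_given_female = product_given_gender.loc["Female", "KP281"]
print(f"P(KP281 | Female) = {p_kp281_given_female:.1%}")
print(f"Expected KP281 buyers among 200 females: {p_kp281_given_female * 200:.1f}")
```

The same row-normalized table also answers the earlier question about what percentage of women buy KP281.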

In [ ]: # Across the whole dataset, what share of purchases are males buying KP481? 31/180 = 17.2% (joint probability)


In [28]: pd.crosstab(index=df['Gender'], columns=df['Product'], margins=True, normalize=True)*100

Out[28]:
Product      KP281      KP481      KP781         All
Gender
Female   22.222222  16.111111   3.888889   42.222222
Male     22.222222  17.222222  18.333333   57.777778
All      44.444444  33.333333  22.222222  100.000000
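
The three kinds of probability used in this notebook are linked: a joint probability equals a conditional probability times the matching marginal. A minimal sketch that checks this for males buying KP481, using the counts from Out[22]; all variable names are introduced here for illustration:

```python
import pandas as pd

# Gender x Product counts copied from Out[22].
counts = pd.DataFrame(
    {"KP281": [40, 40], "KP481": [29, 31], "KP781": [7, 33]},
    index=pd.Index(["Female", "Male"], name="Gender"),
)

total = counts.to_numpy().sum()  # 180 customers

# Joint probabilities: every cell divided by the grand total.
joint = counts / total

# Marginal P(Male) and conditional P(KP481 | Male).
p_male = counts.loc["Male"].sum() / total
p_kp481_given_male = counts.loc["Male", "KP481"] / counts.loc["Male"].sum()

# P(Male and KP481) = P(KP481 | Male) * P(Male)
p_joint = joint.loc["Male", "KP481"]
print(f"joint: {p_joint:.4f}, conditional * marginal: {p_kp481_given_male * p_male:.4f}")
```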

In [ ]: #insights-----
