0% found this document useful (0 votes)

26 views

Exp 01-B Feature Selection and Extraction

The document discusses the process of data preprocessing which involves collecting raw data from various sources, cleaning the data by handling missing values, outliers, and duplicates, exploring and visualizing the data to gain insights, selecting relevant features, and scaling features. The key steps are data collection, cleaning, exploration and visualization, feature selection, and feature scaling. Effective data preprocessing contributes significantly to building accurate machine learning models.

Uploaded by

R J SHARIN URK21CS5022

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

26 views

Exp 01-B Feature Selection and Extraction

Uploaded by

R J SHARIN URK21CS5022

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 12

Exp 01-B Feature Selection and Extraction

January 12, 2024

URK21CS1124
Aim: The main function of data preprocessing is to extract the data sources related to the moni-
toring target based on the mining requirements, check the legality of the data, and to generate the
next waiting core data for analysis
Description:
1. Data Collection:
Gather raw data from various sources, such as databases, files, APIs, or external datasets.
2. Data Cleaning:
Handle missing values: Decide whether to remove instances with missing values or impute
them with techniques like mean, median, or more sophisticated methods. Remove duplicates:
Eliminate identical records to avoid redundancy in the dataset. Handle outliers: Identify and
deal with outliers that might adversely affect model training.
3. Data Exploration and Visualization:
Explore the dataset through statistical analysis and visualizations to gain insights into the
distribution, relationships, and patterns within the data.
4. Feature Selection:
Identify and select relevant features that contribute the most to the prediction task. Remove
irrelevant or redundant features to simplify the model and reduce computational requirements.
5. Feature Scaling:
Standardize or normalize numerical features to bring them to a common scale. This helps
prevent features with larger magnitudes from dominating the learning process.
Data preprocessing is an iterative process, and the choice of techniques depends on the specific
characteristics of the dataset and the requirements of the machine learning task at hand.
Effective data preprocessing contributes significantly to building robust and accurate machine
learning models.

[2]: import pandas as pd

[3]: #1. Read the data

print("URK21CS1124")
df=pd.read_csv("Stores1b.csv")

1
URK21CS1124

[4]: #2. Calculate the % of missing values in the columns

print("URK21CS1124")
missing_values=df.isna().sum()
total=len(df)
percentage=(missing_values/total)*10
print(percentage)

URK21CS1124
Branch 0.784314
City 0.000000
Customer 0.000000
Gender 0.000000
Product line 3.529412
Unit_price 0.000000
Quantity 0.000000
Tax 0.000000
Total 0.000000
Payment 0.000000
cogs 0.000000
Rating 0.000000
Age 0.000000
Quarterly_Tax 0.784314
Price 0.000000
dtype: float64

[5]: #3. Replace missing value with mean for the numerical column, if the % of␣
↪missing value is less than 10%. (Use temporary data frame & inplace=True)

print("URK21CS1124")
missing_values=df.isna().sum()
total=len(df)
percentage=(missing_values/total)*10
temp=df.copy()
for i in df.columns:
if df[i].dtype!="object" and percentage[i]<10:
c_mean=df[i].mean()
temp[i].fillna(c_mean,inplace=True)
df.update(temp)
df.head(10)

URK21CS1124

[5]: Branch City Customer Gender Product line Unit_price \

2
4 A Yangon Normal Male Sports and travel 86.31
5 C Naypyitaw Normal Male Electronic accessories 85.39
6 A Yangon Member Female NaN 68.84
7 C Naypyitaw Normal Female NaN 73.56
8 A Yangon Member Female NaN 36.26
9 B Mandalay Member Female NaN 54.84

Quantity Tax Total Payment cogs Rating Age \

0 7 26.1415 548.9715 Ewallet 522.83 9.1 23
1 5 3.8200 80.2200 Cash 76.40 9.6 23
2 7 16.2155 340.5255 Credit card 324.31 7.4 24
3 8 23.2880 489.0480 Ewallet 465.76 8.4 26
4 60 30.2085 634.3785 Ewallet 604.17 5.3 30
5 7 29.8865 627.6165 Ewallet 597.73 4.1 32
6 6 20.6520 433.6920 Ewallet 413.04 5.8 27
7 10 36.7800 772.3800 Ewallet 735.60 8.0 30
8 2 3.6260 76.1460 Credit card 72.52 7.2 27
9 3 8.2260 172.7460 Credit card 164.52 5.9 23

Quarterly_Tax Price
0 210.0 7469
1 210.0 1528
2 124.0 4633
3 210.0 5822
4 210.0 8631
5 210.0 8539
6 210.0 6884
7 210.0 7356
8 100.0 3626
9 185.0 5484

[6]: #4. Perform the interpolation using nearest method to estimate the missing␣
↪values for the numerical column, if the % of missing value is less than 10%.␣

↪(Use temporary dataframe)

print("URK21CS1124")
missing_values2=df.isna().sum()
total2=len(df)
percentage2=(missing_values2/total2)*10
temp2=df.copy()
for i in df.columns:
if df[i].dtype!="object" and percentage2[i]<10:
temp2[i].interpolate(method="nearest",inplace=True)
temp2.head(10)

URK21CS1124

3
[6]: Branch City Customer Gender Product line Unit_price \
0 A Yangon Member Female Health and beauty 74.69
1 C Naypyitaw Normal Female Electronic accessories 15.28
2 A Yangon Normal Male Home and lifestyle 46.33
3 A Yangon Member Male Health and beauty 58.22
4 A Yangon Normal Male Sports and travel 86.31
5 C Naypyitaw Normal Male Electronic accessories 85.39
6 A Yangon Member Female NaN 68.84
7 C Naypyitaw Normal Female NaN 73.56
8 A Yangon Member Female NaN 36.26
9 B Mandalay Member Female NaN 54.84

Quantity Tax Total Payment cogs Rating Age \

Quarterly_Tax Price
0 210.0 7469
1 210.0 1528
2 124.0 4633
3 210.0 5822
4 210.0 8631
5 210.0 8539
6 210.0 6884
7 210.0 7356
8 100.0 3626
9 185.0 5484

[7]: #5. Perform the mode imputation for a categorical data, if the % of missing␣
↪value is less than 10%. (Use temporary data frame & inplace=True)

print("URK21CS1124")
missing_values3=df.isna().sum()
total3=len(df)
percentage3=(missing_values3/total3)*10
temp3=df.copy()
for i in df.columns:
if df[i].dtype =="object" and percentage3[i]<10:
c_column=df[i].mode().iloc[0]
temp3[i].fillna(c_column,inplace=True)

4
df.update(temp3)
df.head(10)

URK21CS1124

[7]: Branch City Customer Gender Product line Unit_price \

0 A Yangon Member Female Health and beauty 74.69
1 C Naypyitaw Normal Female Electronic accessories 15.28
2 A Yangon Normal Male Home and lifestyle 46.33
3 A Yangon Member Male Health and beauty 58.22
4 A Yangon Normal Male Sports and travel 86.31
5 C Naypyitaw Normal Male Electronic accessories 85.39
6 A Yangon Member Female Electronic accessories 68.84
7 C Naypyitaw Normal Female Electronic accessories 73.56
8 A Yangon Member Female Electronic accessories 36.26
9 B Mandalay Member Female Electronic accessories 54.84

Quantity Tax Total Payment cogs Rating Age \

Quarterly_Tax Price
0 210.0 7469
1 210.0 1528
2 124.0 4633
3 210.0 5822
4 210.0 8631
5 210.0 8539
6 210.0 6884
7 210.0 7356
8 100.0 3626
9 185.0 5484

[8]: #6. Drop the columns with more than 10% missing values and display the size␣
↪(Use temporary data frame & inplace=True)

print("URK21CS1124")
missing_values4=df.isna().sum()
total4=len(df)
percentage4=(missing_values4/total4)*10
temp4=df.copy()

5
for i in df.columns:
if percentage4[i]<10:
temp4.drop(i,axis=1,inplace=True)
print(temp4.head())
print(temp4.shape)

URK21CS1124
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3, 4]
(51, 0)

[9]: #7. Drop the rows with outlier Z-score value > 3 for “Guarantee_Period” and␣
↪display the size. (Use temporary data frame & inplace=True)

print("URK21CS1124")
import numpy as np
temp5=df.copy()
zscore=np.abs((temp5["Quantity"]-temp5["Quantity"].mean())/temp5["Quantity"].
↪std())

temp5.drop(temp5[zscore>3].index,inplace=True)
print(temp5["Quantity"].head(10))
print(temp5.shape)

URK21CS1124
0 7
1 5
2 7
3 8
5 7
6 6
7 10
8 2
9 3
10 4
Name: Quantity, dtype: int64
(47, 15)

[10]: #8. Find the % of duplicate rows with all columns having same value.
print("URK21CS1124")
duplicated=df.duplicated().sum()
total_num=len(df)
perc=(duplicated/total_num)*100
perc

URK21CS1124

[10]: 3.9215686274509802

6
[11]: #9. Find the % of duplicate rows based on some specific columns␣
↪[Price&,Age,Mfg_Month,Fuel_Type] having same value. Drop the duplicates and␣

↪display the size. (Use temporary data framei nplace=True)

print("URK21CS1124")
temp6=df.copy()
specific_column=['Customer', 'Product line', 'Age', 'Gender']
dup=temp6.duplicated(subset=specific_column)
num_dup=dup.sum()
total_num=len(temp6)
percen=(num_dup/total_num)*100
temp6.drop_duplicates(subset=specific_column,inplace=True)
print(temp6.head())
print(temp6.shape)

URK21CS1124
Branch City Customer Gender Product line Unit_price \
0 A Yangon Member Female Health and beauty 74.69
1 C Naypyitaw Normal Female Electronic accessories 15.28
2 A Yangon Normal Male Home and lifestyle 46.33
3 A Yangon Member Male Health and beauty 58.22
4 A Yangon Normal Male Sports and travel 86.31

Quantity Tax Total Payment cogs Rating Age \

Quarterly_Tax Price
0 210.0 7469
1 210.0 1528
2 124.0 4633
3 210.0 5822
4 210.0 8631
(40, 15)

[12]: #10. Perform the min-max normalization for a numerical feature Quarterly_Tax␣
↪using Python code and analyze the values in scatter plot.

print("URK21CS1124")
age_min=df["Age"].min()
age_max=df["Age"].max()
df["normalized"]=(df["Age"]-age_min)/(age_max-age_min)
import matplotlib.pyplot as plt
plt.scatter(df.index,df["normalized"])
plt.xlabel("index")
plt.ylabel("normalized")

7
URK21CS1124

[12]: Text(0, 0.5, 'normalized')

[13]: #11. Perform the Z-score normalization for a numerical feature,Quarterly_Tax␣

↪using Python code and analyze the values in scatter plot.

print("URK21CS1124")
mean=df["Age"].mean()
std=df["Age"].std()
df["normalized_age"]=(df["Age"]-mean)/std
plt.scatter(df.index,df["normalized_age"])
plt.xlabel("index")
plt.ylabel("normalized")

URK21CS1124

[13]: Text(0, 0.5, 'normalized')

8
[14]: #12. Perform the label encoding for a categorical feature ‘Fuel_Type’ using␣
↪Python code

print("URK21CS1124")
from sklearn.preprocessing import LabelEncoder
label_encoder=LabelEncoder()
label_encoder.fit(df["Payment"])
df["payment_encoded"]=label_encoder.transform(df["Payment"])
df[["Payment","payment_encoded"]]

URK21CS1124

[14]: Payment payment_encoded

0 Ewallet 2
1 Cash 0
2 Credit card 1
3 Ewallet 2
4 Ewallet 2
5 Ewallet 2
6 Ewallet 2
7 Ewallet 2
8 Credit card 1

9
9 Credit card 1
10 Ewallet 2
11 Cash 0
12 Ewallet 2
13 Ewallet 2
14 Cash 0
15 Cash 0
16 Credit card 1
17 Credit card 1
18 Credit card 1
19 Ewallet 2
20 Ewallet 2
21 Ewallet 2
22 Credit card 1
23 Ewallet 2
24 Ewallet 2
25 Credit card 1
26 Cash 0
27 Credit card 1
28 Cash 0
29 Cash 0
30 Credit card 1
31 Cash 0
32 Cash 0
33 Credit card 1
34 Ewallet 2
35 Ewallet 2
36 Ewallet 2
37 Ewallet 2
38 Ewallet 2
39 Cash 0
40 Ewallet 2
41 Cash 0
42 Cash 0
43 Cash 0
44 Cash 0
45 Cash 0
46 Credit card 1
47 Ewallet 2
48 Credit card 1
49 Credit card 1
50 Ewallet 2

[15]: #13. Perform the one-hot encoding for a categorical feature ‘Fuel_Type’ using␣
↪Python code

print("URK21CS1124")
dummie=pd.get_dummies(df["Payment"])

10
dummie

URK21CS1124

[15]: Cash Credit card Ewallet

0 0 0 1
1 1 0 0
2 0 1 0
3 0 0 1
4 0 0 1
5 0 0 1
6 0 0 1
7 0 0 1
8 0 1 0
9 0 1 0
10 0 0 1
11 1 0 0
12 0 0 1
13 0 0 1
14 1 0 0
15 1 0 0
16 0 1 0
17 0 1 0
18 0 1 0
19 0 0 1
20 0 0 1
21 0 0 1
22 0 1 0
23 0 0 1
24 0 0 1
25 0 1 0
26 1 0 0
27 0 1 0
28 1 0 0
29 1 0 0
30 0 1 0
31 1 0 0
32 1 0 0
33 0 1 0
34 0 0 1
35 0 0 1
36 0 0 1
37 0 0 1
38 0 0 1
39 1 0 0
40 0 0 1
41 1 0 0

11
42 1 0 0
43 1 0 0
44 1 0 0
45 1 0 0
46 0 1 0
47 0 0 1
48 0 1 0
49 0 1 0
50 0 0 1

Result: The given dataset is being analysed using data pre-processing and output is verifiedsuc-
cessfully

Data Preprocessing in Machine Learning
No ratings yet
Data Preprocessing in Machine Learning
27 pages
A Survey On Sentiment Analysis Methods Applications and Challenges
No ratings yet
A Survey On Sentiment Analysis Methods Applications and Challenges
50 pages
Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection
From Everand
Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection
Bart Baesens
No ratings yet
DP
No ratings yet
DP
9 pages
Lab File
No ratings yet
Lab File
96 pages
Overview of Data Cleaning
No ratings yet
Overview of Data Cleaning
17 pages
Data Exploration Preparation
No ratings yet
Data Exploration Preparation
12 pages
Practicals
No ratings yet
Practicals
42 pages
MLM Report Customer Churn
No ratings yet
MLM Report Customer Churn
17 pages
Handling Missing Values in A Real-Time Dataset During
No ratings yet
Handling Missing Values in A Real-Time Dataset During
5 pages
Data Analytics lab manual
No ratings yet
Data Analytics lab manual
47 pages
Data Cleaning
No ratings yet
Data Cleaning
13 pages
Machine Learning Project Roadmap
No ratings yet
Machine Learning Project Roadmap
4 pages
DAP writeups_merged
No ratings yet
DAP writeups_merged
33 pages
DS Question Bank Unit-1 Part-2
No ratings yet
DS Question Bank Unit-1 Part-2
3 pages
Building Good Training Sets UNIT 1 PART2
No ratings yet
Building Good Training Sets UNIT 1 PART2
46 pages
Cleaning Data in Python
No ratings yet
Cleaning Data in Python
8 pages
Data Mining Lab 03
No ratings yet
Data Mining Lab 03
10 pages
Lab 08 - Data Preprocessing
No ratings yet
Lab 08 - Data Preprocessing
9 pages
Dmdw-Lab Manual
No ratings yet
Dmdw-Lab Manual
61 pages
Machine Learning Project Checklist
No ratings yet
Machine Learning Project Checklist
30 pages
Group A Assignment No2 Writeup
No ratings yet
Group A Assignment No2 Writeup
9 pages
ML (Prac1)
No ratings yet
ML (Prac1)
12 pages
Prac 7
No ratings yet
Prac 7
5 pages
Data Preprocesing JavaPoint
No ratings yet
Data Preprocesing JavaPoint
19 pages
DA lab
No ratings yet
DA lab
27 pages
DS Problem Statements and Codes
No ratings yet
DS Problem Statements and Codes
21 pages
Abhiml ML File
No ratings yet
Abhiml ML File
74 pages
AIDS - DM Using Python - Lab Programs
No ratings yet
AIDS - DM Using Python - Lab Programs
19 pages
Avinash DA 6
No ratings yet
Avinash DA 6
3 pages
DA_Programs
No ratings yet
DA_Programs
44 pages
Advance Python
No ratings yet
Advance Python
5 pages
Data Analytics Lab Manual_250402_095326
No ratings yet
Data Analytics Lab Manual_250402_095326
58 pages
Data Preprocessing Tutorial
No ratings yet
Data Preprocessing Tutorial
39 pages
Phython Example
No ratings yet
Phython Example
12 pages
B Tech-AIML-question bank-2 Answer Key
No ratings yet
B Tech-AIML-question bank-2 Answer Key
9 pages
# (Data Preprocessing) : (Cheatsheet)
No ratings yet
# (Data Preprocessing) : (Cheatsheet)
10 pages
ASSi2 DSBDA
No ratings yet
ASSi2 DSBDA
4 pages
Code explanation for date types
No ratings yet
Code explanation for date types
8 pages
04 DS 2023
No ratings yet
04 DS 2023
63 pages
Experiment No. 5: Objective
No ratings yet
Experiment No. 5: Objective
5 pages
Data Cleaning in Python
No ratings yet
Data Cleaning in Python
6 pages
Python (Unit - 2)
No ratings yet
Python (Unit - 2)
22 pages
Lesson 3. Data Preparation and Structuring 1 Data Cleaning
No ratings yet
Lesson 3. Data Preparation and Structuring 1 Data Cleaning
36 pages
Praktikum Modul 3
No ratings yet
Praktikum Modul 3
5 pages
Week 6 - Data Cleaning
No ratings yet
Week 6 - Data Cleaning
8 pages
1
No ratings yet
1
3 pages
ds
No ratings yet
ds
114 pages
ML SELF UNIT 2
No ratings yet
ML SELF UNIT 2
20 pages
PreProcessing With R
No ratings yet
PreProcessing With R
6 pages
Data Science Tutorial 1686911993
No ratings yet
Data Science Tutorial 1686911993
41 pages
Document (4)
No ratings yet
Document (4)
15 pages
dw lab file
No ratings yet
dw lab file
18 pages
Pandas-1
No ratings yet
Pandas-1
13 pages
Ads Exp2 C35
No ratings yet
Ads Exp2 C35
9 pages
data-mining-lab-manual-CSE-VII-Sem
No ratings yet
data-mining-lab-manual-CSE-VII-Sem
63 pages
Data Wrangling and Preprocessing
100% (1)
Data Wrangling and Preprocessing
41 pages
Data Cleaning and Preprocessing
No ratings yet
Data Cleaning and Preprocessing
4 pages
Lect 2
No ratings yet
Lect 2
54 pages
2777959-Day 8 - Data Wrangling
No ratings yet
2777959-Day 8 - Data Wrangling
2 pages
Wavelet Neural Networks: With Applications in Financial Engineering, Chaos, and Classification
From Everand
Wavelet Neural Networks: With Applications in Financial Engineering, Chaos, and Classification
Antonios K. Alexandridis
No ratings yet
20CS2050L Software Engineering Labmanuel
No ratings yet
20CS2050L Software Engineering Labmanuel
47 pages
Sharin MK 2024
No ratings yet
Sharin MK 2024
1 page
Exp No 03
No ratings yet
Exp No 03
15 pages
Challenges in Internet of Things
No ratings yet
Challenges in Internet of Things
7 pages
Polymorphism in Python
No ratings yet
Polymorphism in Python
17 pages
Polymorphism in Python With EXAMPLES
No ratings yet
Polymorphism in Python With EXAMPLES
7 pages
Statistical Texture Measures Computed From Gray Level Coocurrence Matrices
No ratings yet
Statistical Texture Measures Computed From Gray Level Coocurrence Matrices
14 pages
Rajagopal - Phase 2 - Journal Plagiarism Fully Completed
No ratings yet
Rajagopal - Phase 2 - Journal Plagiarism Fully Completed
27 pages
Descriptors and Their Selection Methods in QSAR Analysis - Paradigm For Drug Design
No ratings yet
Descriptors and Their Selection Methods in QSAR Analysis - Paradigm For Drug Design
5 pages
20IT503 - Big Data Analytics - Unit2
No ratings yet
20IT503 - Big Data Analytics - Unit2
62 pages
Tutorial Galgo R
No ratings yet
Tutorial Galgo R
92 pages
Explainable Prediction of Surface Roughness in Multi-Jet Polishing Based On
No ratings yet
Explainable Prediction of Surface Roughness in Multi-Jet Polishing Based On
12 pages
Data Mining Assignment
No ratings yet
Data Mining Assignment
8 pages
Beating The Odds: Learning To Bet On Soccer Matches Using Historical Data
No ratings yet
Beating The Odds: Learning To Bet On Soccer Matches Using Historical Data
7 pages
Research Proposal for Masters in China 攻读硕
No ratings yet
Research Proposal for Masters in China 攻读硕
7 pages
A Comprehensive Review of Approaches To Building Occupancy Detection
No ratings yet
A Comprehensive Review of Approaches To Building Occupancy Detection
14 pages
Machine Learning Overview
No ratings yet
Machine Learning Overview
103 pages
Android Based Malware Detection Technique Using Machine Learning Algorithms
No ratings yet
Android Based Malware Detection Technique Using Machine Learning Algorithms
6 pages
Csit (r22) 3-2 Machine Learning Digital Notes
No ratings yet
Csit (r22) 3-2 Machine Learning Digital Notes
120 pages
BICA 2023
No ratings yet
BICA 2023
8 pages
paper2
No ratings yet
paper2
19 pages
TSP CMC 41333
No ratings yet
TSP CMC 41333
14 pages
Eature Engineering: Presenter: Prof. Amit Kumar Das
No ratings yet
Eature Engineering: Presenter: Prof. Amit Kumar Das
17 pages
MCNN-LSTM Combining CNN and LSTM to Classify Multi-Class Text in Imbalanced News Data
No ratings yet
MCNN-LSTM Combining CNN and LSTM to Classify Multi-Class Text in Imbalanced News Data
16 pages
Predicting Young's Modulus of Linear Polyurethane and Polyurethane-Polyurea Elastomers
No ratings yet
Predicting Young's Modulus of Linear Polyurethane and Polyurethane-Polyurea Elastomers
14 pages
Report 102 Intro Start
No ratings yet
Report 102 Intro Start
73 pages
A Review of Feature Selection and Its Methods: Cybernetics and Information Technologies March 2019
No ratings yet
A Review of Feature Selection and Its Methods: Cybernetics and Information Technologies March 2019
25 pages
DLG Book
No ratings yet
DLG Book
332 pages
(Chou e JIANG) A Survey On Data-Driven Network Intrusion Detection.
No ratings yet
(Chou e JIANG) A Survey On Data-Driven Network Intrusion Detection.
36 pages
Chi2 Feature Selection and Discretization of Numeric Attributes
No ratings yet
Chi2 Feature Selection and Discretization of Numeric Attributes
4 pages
Unit-2: Logistic Regression
No ratings yet
Unit-2: Logistic Regression
30 pages
MSC Software Engineering Dissertation
100% (2)
MSC Software Engineering Dissertation
4 pages
ICET Presentation 1
No ratings yet
ICET Presentation 1
23 pages
Multi-Objective Optimization Algorithms For Intrusion Detection in IoT Networks A Systematic Review
No ratings yet
Multi-Objective Optimization Algorithms For Intrusion Detection in IoT Networks A Systematic Review
10 pages
Sport Analytics For Cricket Game Results Using Machine Learning - An Experimental Study - Semantic Scholar
No ratings yet
Sport Analytics For Cricket Game Results Using Machine Learning - An Experimental Study - Semantic Scholar
9 pages

Exp 01-B Feature Selection and Extraction

Uploaded by

Exp 01-B Feature Selection and Extraction

Uploaded by

Exp 01-B Feature Selection and Extraction

January 12, 2024

[2]: import pandas as pd

[3]: #1. Read the data

[4]: #2. Calculate the % of missing values in the columns

[5]: Branch City Customer Gender Product line Unit_price \

Quantity Tax Total Payment cogs Rating Age \

↪(Use temporary dataframe)

Quantity Tax Total Payment cogs Rating Age \

[7]: Branch City Customer Gender Product line Unit_price \

Quantity Tax Total Payment cogs Rating Age \

↪display the size. (Use temporary data framei nplace=True)

Quantity Tax Total Payment cogs Rating Age \

[12]: Text(0, 0.5, 'normalized')

[13]: #11. Perform the Z-score normalization for a numerical feature,Quarterly_Tax␣

[13]: Text(0, 0.5, 'normalized')

[14]: Payment payment_encoded

[15]: Cash Credit card Ewallet

You might also like