Cleaning and Preparing Data
TE AIML (Hon)
Athang Joshi
What is data cleaning?
Steps to clean data
Pratibha Sharma 4
Filter unwanted outliers
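One common way to flag outliers is the interquartile-range (IQR) rule. The sketch below is only an illustration under that assumption: the function name, the sample ages, and the multiplier k=1.5 are hypothetical and do not come from the slides. An outlier can also be a legitimate value, so inspect flagged values before dropping them.

```python
import statistics

def filter_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical sample: an age of 240 is almost certainly an entry error.
ages = [23, 25, 27, 29, 31, 33, 35, 240]
cleaned = filter_outliers_iqr(ages)
print(cleaned)
```

A smaller k filters more aggressively, and a larger k keeps more extreme values.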
Fix structural errors
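Structural errors are typically inconsistent spellings, capitalization, or labels for the same category. A minimal sketch of one possible fix is shown here, assuming the known variants can be listed in advance. The CANONICAL mapping and the sample labels are hypothetical.

```python
# Hypothetical mapping from known variants to one canonical label.
CANONICAL = {
    "n/a": "not applicable",
    "na": "not applicable",
}

def normalize_label(raw):
    """Trim, lowercase, and collapse whitespace, then map known variants."""
    key = " ".join(raw.strip().lower().split())
    return CANONICAL.get(key, key)

labels = ["  N/A", "Not Applicable", "not  applicable ", "Mumbai", "MUMBAI "]
normalized = [normalize_label(label) for label in labels]
print(normalized)
```

After normalization, the five raw strings collapse into two categories instead of five.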
Handle missing data
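Two common options for missing values are dropping the affected rows or imputing a replacement value. The sketch below shows both, assuming missing values are stored as None. The rows, field names, and the choice of the median are hypothetical. Either option changes the data, so the choice should be justified for the dataset at hand.

```python
import statistics

# Hypothetical records; None marks a missing score.
rows = [
    {"name": "A", "score": 70.0},
    {"name": "B", "score": None},
    {"name": "C", "score": 90.0},
]

def drop_missing(rows, field):
    """Option 1: discard rows where the field is missing."""
    return [r for r in rows if r[field] is not None]

def impute_median(rows, field):
    """Option 2: fill missing values with the median of the known values."""
    median = statistics.median(r[field] for r in rows if r[field] is not None)
    # Build new dicts so the original rows are left unchanged.
    return [{**r, field: median if r[field] is None else r[field]} for r in rows]

dropped = drop_missing(rows, "score")
imputed = impute_median(rows, "score")
print(len(dropped), imputed[1]["score"])
```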
Validate and QA
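Validation can be partly automated with a few checks that run after cleaning. The sketch below is one possible form, assuming each record is a dictionary. The function qa_report, the field names, and the two checks (required fields present, identifiers unique) are hypothetical examples, not a complete QA procedure.

```python
def qa_report(rows, required, id_field):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Check 1: every required field must be present and non-empty.
        for field in required:
            if row.get(field) in (None, ""):
                problems.append(f"row {i}: missing {field}")
        # Check 2: the identifier must not repeat.
        row_id = row.get(id_field)
        if row_id in seen_ids:
            problems.append(f"row {i}: duplicate {id_field} {row_id}")
        seen_ids.add(row_id)
    return problems

# Hypothetical cleaned data that still contains two problems.
records = [
    {"id": 1, "city": "pune"},
    {"id": 2, "city": ""},
    {"id": 1, "city": "mumbai"},
]
report = qa_report(records, required=["id", "city"], id_field="id")
print(report)
```

An empty report means these particular checks passed. It does not prove the data is correct.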
Advantages and benefits of data cleaning
Characteristics of quality data
• Validity (The degree to which the data conforms to defined rules or constraints)
• Accuracy (The data is close to the true values)
• Completeness (The degree to which all required data is known)
• Consistency (Data is consistent within the same dataset and/or across multiple datasets)
• Uniformity (The degree to which the data is specified using the same unit of measure)
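As an illustration of the uniformity characteristic above, the sketch below converts lengths recorded in mixed units to a single unit. The TO_METRES table, the function name, and the sample measurements are hypothetical, and it assumes every value carries a known unit label.

```python
# Hypothetical conversion factors from each unit to metres.
TO_METRES = {"m": 1.0, "cm": 0.01, "km": 1000.0}

def to_metres(value, unit):
    """Convert a length to metres; raises KeyError for an unknown unit."""
    return value * TO_METRES[unit]

# Hypothetical measurements recorded as (value, unit) pairs.
measurements = [(1.5, "m"), (250, "cm"), (2, "km")]
uniform = [to_metres(value, unit) for value, unit in measurements]
print(uniform)
```

Failing loudly on an unknown unit is a deliberate choice here: a silent guess would hide a validity problem.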
Thank You!