5. Exploratory Data Analysis (EDA) in Data Analytics
Understanding and Visualizing Your Data
What is EDA?
Importance of EDA
Data Collection
• Gather data from relevant sources.
Data Cleaning
• Handle missing, duplicate, or inconsistent data.
Data Transformation
• Normalize or encode variables.
Data Visualization
• Create charts and graphs to identify patterns.
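The cleaning and transformation steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the records, the column names (city, temp_c), and the choice of min-max scaling and one-hot encoding are assumptions made for the example, not part of the original material.

```python
# Toy records invented for illustration; not taken from any real dataset.
raw = [
    {"city": "Paris ", "temp_c": 21.0},
    {"city": "paris", "temp_c": 21.0},   # duplicate once the label is cleaned
    {"city": "Oslo", "temp_c": 12.0},
    {"city": "Cairo", "temp_c": 33.0},
]

# Data cleaning: make labels consistent, then drop exact duplicates.
cleaned, seen = [], set()
for row in raw:
    record = (row["city"].strip().title(), row["temp_c"])
    if record not in seen:
        seen.add(record)
        cleaned.append({"city": record[0], "temp_c": record[1]})

# Data transformation 1: min-max normalization to the range [0, 1].
temps = [r["temp_c"] for r in cleaned]
lo, hi = min(temps), max(temps)
for r in cleaned:
    r["temp_scaled"] = (r["temp_c"] - lo) / (hi - lo)

# Data transformation 2: one-hot encode the categorical column.
categories = sorted({r["city"] for r in cleaned})
for r in cleaned:
    for c in categories:
        r[f"city_{c}"] = 1 if r["city"] == c else 0

for r in cleaned:
    print(r)
```

On real data the same steps are usually done with a dataframe library, but the logic is the same: fix inconsistencies first, then remove duplicates, then rescale and encode.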
Tools for EDA
Descriptive analysis (DA): a set of methods used to summarize and describe the main features of a
dataset, such as its central tendency, variability, and distribution.
Correlation analysis (CA): a statistical method used to measure the strength of the linear
relationship between two variables and quantify their association.
Multivariate Analysis:
Heatmaps, pair plots
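As a rough sketch of these analyses, the example below computes summary statistics for each variable, Pearson's correlation coefficient for each pair, and the pairwise correlation matrix that a heatmap would display. The data values and the helper names describe and pearson are hypothetical, chosen only for illustration; only the Python standard library is used.

```python
import math
import statistics

# Toy measurements invented for illustration.
data = {
    "hours":  [1.0, 2.0, 3.0, 4.0, 5.0],
    "score":  [52.0, 55.0, 61.0, 64.0, 68.0],
    "errors": [9.0, 7.0, 6.0, 4.0, 2.0],
}

def describe(values):
    """Descriptive analysis: central tendency and variability."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),   # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

def pearson(x, y):
    """Correlation analysis: Pearson's r for two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

summary = {name: describe(col) for name, col in data.items()}

# Multivariate view: the pairwise correlation matrix a heatmap would color.
names = list(data)
corr = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

for a in names:
    print(a.ljust(7), " ".join(f"{corr[a][b]:+.2f}" for b in names))
```

Values of r near +1 or -1 indicate a strong linear relationship and values near 0 a weak one. Pearson's r measures linear association only, so a plot of the raw pairs is still worth checking.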
Terms
• Univariate analysis is a statistical method that
examines a single variable in a data set.
• Bivariate analysis examines two
variables (often denoted X and Y) to
determine the empirical relationship between them.
• Multivariate analysis is a statistical method that
analyzes multiple variables at once to identify patterns
and relationships.
Handling Missing Data and Outliers
Techniques for Missing Data: imputation, deletion, interpolation.
Dealing with Outliers: Z-score analysis, IQR method.
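The techniques listed above can be sketched as follows. This is a minimal illustration using only the Python standard library; the sample values, the function names (interpolate, zscore_outliers, iqr_outliers), and the cutoffs are assumptions made for the example. Mean imputation and linear interpolation are just one possible choice for each technique.

```python
import statistics

# Toy series invented for illustration; None marks a missing reading.
series = [10.0, None, 14.0, 16.0, None, 20.0]

# Deletion: drop the missing entries.
deleted = [v for v in series if v is not None]

# Imputation: replace each missing entry with the mean of the observed values.
mean_value = statistics.mean(deleted)
imputed = [mean_value if v is None else v for v in series]

def interpolate(values):
    """Linear interpolation for interior gaps; the ends are assumed observed."""
    out = list(values)
    for i, v in enumerate(out):
        if v is None:
            left = i - 1   # already observed or filled on an earlier pass
            right = next(j for j in range(i + 1, len(out)) if out[j] is not None)
            step = (out[right] - out[left]) / (right - left)
            out[i] = out[left] + step
    return out

interpolated = interpolate(series)

def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

# Toy readings invented for illustration, with one obviously extreme value.
readings = [12.0, 13.0, 12.5, 11.8, 12.2, 13.1, 12.7, 40.0]

# A cutoff of 3 is common; 2 is used here because the toy sample is small.
print(zscore_outliers(readings, threshold=2.0))
print(iqr_outliers(readings))
```

The z-score method depends on the mean and standard deviation, which the outlier itself inflates, so it can miss extreme values in small samples. The IQR method is based on quartiles and is less sensitive to this.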
Large datasets and
computational complexity.
Misinterpreting visualizations.
Case Study/Example
Dataset: Use a popular dataset like Titanic, Iris, or a real-world business dataset.
Steps: Show how EDA was performed with key insights and visualizations.