Ai ML Exp2

The document discusses exploratory data analysis (EDA) of healthcare data. It describes five objectives of EDA: identifying data quality issues; understanding data distribution; exploring relationships; visualizing trends and patterns; and generating hypotheses. The key goals of EDA are outlined as data cleaning, descriptive statistics, data visualization, feature engineering, correlation analysis, data segmentation, and hypothesis generation. The document also differentiates between univariate EDA, which focuses on single variables, and multivariate EDA, which analyzes relationships between multiple variables. Overall, the document provides an overview of EDA techniques and their application to healthcare data.

Uploaded by

Kamat Hrishikesh

EXPERIMENT NO-2

Aim: Perform Exploratory data analysis of Healthcare Data.

Objectives:
Exploratory Data Analysis (EDA) is a critical step in understanding and deriving insights from
healthcare data. Here are five objectives for performing EDA on healthcare data:
• Identify Data Quality Issues: The first objective is to assess the quality of the healthcare
data. This includes checking for missing values, outliers, and inconsistencies, which are
crucial for data integrity and reliable analysis.
• Understand Data Distribution: EDA helps in understanding the distribution of key
healthcare variables such as patient ages, diagnosis codes, and treatment outcomes. This
understanding can reveal trends and patterns within the data.
• Explore Relationships: EDA allows for the exploration of relationships between different
healthcare variables. For example, you can investigate how patient age impacts the likelihood
of specific medical conditions or treatment effectiveness.
• Visualize Trends and Patterns: EDA involves creating visualizations like histograms,
scatter plots, and box plots to highlight trends and patterns within the data. This helps in
making complex healthcare data more interpretable.
• Hypothesis Generation: EDA can lead to the generation of hypotheses for more focused
research. For instance, you may identify associations between certain patient characteristics
and health outcomes, leading to targeted investigations and studies in the healthcare domain.

Theory:

Exploratory Data Analysis (EDA): Exploratory Data Analysis is an approach to analyzing data
sets to summarize their main characteristics, often with the help of graphical representations. EDA
is used to gain a better understanding of the data, detect patterns, anomalies, and relationships, and
to inform subsequent data analysis. EDA is an essential step before conducting more advanced
statistical or machine learning analyses.

• The Foremost Goals of EDA

1. Data Cleaning: EDA involves examining the data for errors, missing values, and
inconsistencies. It includes techniques such as data imputation, handling missing data,
and identifying and removing outliers.

2. Descriptive Statistics: EDA uses descriptive statistics to understand the central tendency,
variability, and distribution of variables. Measures such as mean, median, mode, standard
deviation, range, and percentiles are commonly used.

3. Data Visualization: EDA employs visual techniques to represent the data graphically.
Visualizations such as histograms, box plots, scatter plots, line plots, heatmaps, and bar
charts help identify patterns, trends, and relationships within the data.

4. Feature Engineering: EDA allows for the exploration of variables and their
transformations to create new features or derive meaningful insights. Feature engineering can
involve scaling, normalization, binning, encoding categorical variables, and creating interaction
or derived variables.

5. Correlation and Relationships: EDA helps discover relationships and dependencies between
variables. Techniques such as correlation analysis, scatter plots, and cross-tabulations offer
insights into the strength and direction of relationships between variables.

6. Data Segmentation: EDA can involve dividing the data into meaningful segments based
on certain criteria or characteristics. This segmentation helps gain insights into specific
subgroups within the data and can lead to more focused analysis.

7. Hypothesis Generation: EDA aids in generating hypotheses or research questions based
on the preliminary exploration of the data. It helps form the foundation for further
analysis and model building.

8. Data Quality Assessment: EDA allows for assessing the quality and reliability of the
data. It involves checking for data integrity, consistency, and accuracy to make sure
the data is suitable for analysis.
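Several of these goals (cleaning, feature engineering, segmentation) can be illustrated with pandas. The snippet below is a minimal sketch, assuming a small made-up patient table; the column names `age`, `gender`, and `glucose` and all values are hypothetical and are not taken from the experiment's dataset.

```python
import pandas as pd

# Hypothetical patient records (illustrative values only, not real data).
df = pd.DataFrame({
    "age": [25, 47, 62, 38, 71, 54],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "glucose": [90.0, None, 140.0, 110.0, 160.0, 125.0],
})

# Data cleaning: impute the missing glucose value with the column median.
df["glucose"] = df["glucose"].fillna(df["glucose"].median())

# Feature engineering: min-max scaling of a numeric column.
g_min, g_max = df["glucose"].min(), df["glucose"].max()
df["glucose_scaled"] = (df["glucose"] - g_min) / (g_max - g_min)

# Feature engineering: binning ages into groups.
df["age_group"] = pd.cut(df["age"], bins=[0, 40, 60, 120],
                         labels=["<=40", "41-60", ">60"])

# Feature engineering: one-hot encoding of a categorical variable.
encoded = pd.get_dummies(df, columns=["gender"])

# Data segmentation: mean glucose within each age group.
segment_means = df.groupby("age_group", observed=True)["glucose"].mean()
print(segment_means)
```

Median imputation is only one possible choice here; whether it is appropriate depends on why the values are missing.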

• TYPES OF EDA

1. Univariate Exploratory Data Analysis (EDA): Univariate EDA focuses on the analysis of a
single variable at a time. Its primary goal is to understand and summarize the characteristics
of individual variables, typically using descriptive statistics and visualizations. Univariate
EDA can be further broken down into two main types:

• Descriptive Statistics: This type of univariate EDA involves calculating and examining
summary statistics for a single variable. Common statistics include mean, median, mode,
range, variance, standard deviation, and percentiles. Descriptive statistics provide an overview
of the central tendency, spread, and shape of the variable's distribution.
• Example: Calculating the mean and standard deviation of patient ages in a healthcare dataset.
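A minimal sketch of this example, assuming a handful of made-up ages (the values are hypothetical, not taken from the experiment's dataset):

```python
import pandas as pd

# Hypothetical patient ages (illustrative values only).
ages = pd.Series([25, 47, 62, 38, 71, 54], name="age")

mean_age = ages.mean()
std_age = ages.std()  # sample standard deviation (ddof=1), the pandas default

print(f"mean = {mean_age:.2f}, std = {std_age:.2f}")

# describe() reports count, mean, std, min, quartiles, and max in one call.
print(ages.describe())
```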

• Data Visualization: Univariate EDA also includes creating visual representations of a single
variable's distribution. Common visualizations include histograms, box plots, bar charts, and
density plots. These visualizations help in understanding the shape, spread, and patterns
within the data.
• Example: Creating a histogram to visualize the distribution of patient ages in a healthcare
dataset.
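A histogram of this kind could be drawn with matplotlib roughly as follows. This is a sketch on synthetic ages generated with NumPy; the distribution parameters, bin count, and output file name are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import numpy as np

# Synthetic ages (assumption: roughly normal, clipped to a plausible range).
rng = np.random.default_rng(0)
ages = rng.normal(loc=50, scale=15, size=200).clip(18, 90)

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(ages, bins=10, edgecolor="black")
ax.set_xlabel("Patient age (years)")
ax.set_ylabel("Number of patients")
ax.set_title("Distribution of patient ages (synthetic data)")
fig.savefig("age_histogram.png")
```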

2. Multivariate Exploratory Data Analysis (EDA): Multivariate EDA focuses on the
simultaneous analysis of relationships between multiple variables in a dataset. It aims to
uncover patterns, dependencies, and interactions between variables. Multivariate EDA can be
categorized into several types:

• Scatterplots: Scatterplots are used to visualize the relationship between two continuous
variables. They help identify correlations, trends, and outliers.
• Example: Creating a scatterplot to explore the relationship between patient age and
cholesterol levels in a healthcare dataset.
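A sketch of such a scatterplot, assuming synthetic data in which cholesterol rises mildly with age (the relationship, noise level, and file name are invented for illustration only):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: an assumed mild upward trend plus random noise.
rng = np.random.default_rng(1)
age = rng.uniform(20, 80, size=100)
cholesterol = 150 + 0.8 * age + rng.normal(0, 20, size=100)

fig, ax = plt.subplots()
points = ax.scatter(age, cholesterol, alpha=0.7)
ax.set_xlabel("Patient age (years)")
ax.set_ylabel("Cholesterol (mg/dL)")
ax.set_title("Age vs. cholesterol (synthetic data)")
fig.savefig("age_vs_cholesterol.png")
```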

• Correlation Analysis: Correlation analysis quantifies the strength and direction of the
relationship between pairs of continuous variables. Common correlation coefficients include
Pearson's correlation (linear relationships) and Spearman's rank correlation (monotonic
relationships).
• Example: Calculating the Pearson correlation coefficient between patient weight and blood
pressure in a healthcare dataset.
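Both coefficients can be computed with pandas. The sketch below uses made-up weight and blood pressure values (hypothetical column names `weight_kg` and `systolic_bp`) and obtains Spearman's coefficient as the Pearson correlation of the ranks, which is its definition.

```python
import pandas as pd

# Hypothetical measurements (illustrative values only).
df = pd.DataFrame({
    "weight_kg": [55, 62, 70, 78, 85, 93, 101],
    "systolic_bp": [110, 118, 117, 125, 131, 138, 150],
})

# Pearson correlation (the default method of Series.corr).
pearson_r = df["weight_kg"].corr(df["systolic_bp"])

# Spearman rank correlation: Pearson correlation computed on the ranks.
spearman_rho = df["weight_kg"].rank().corr(df["systolic_bp"].rank())

print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")
```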

• Categorical Data Analysis: Multivariate EDA also involves the analysis of categorical
variables. Techniques like contingency tables and chi-squared tests are used to examine the
relationships between categorical variables.
• Example: Analyzing the association between patient gender and the presence of specific
medical conditions in a healthcare dataset.
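A contingency table and the Pearson chi-squared statistic can be sketched as follows, assuming a made-up table of 20 patients. The statistic is computed by hand from observed and expected counts, without a continuity correction; a p-value would additionally require the chi-squared distribution (for example via `scipy.stats.chi2_contingency`).

```python
import numpy as np
import pandas as pd

# Hypothetical records: gender and presence of a condition (illustrative only).
df = pd.DataFrame({
    "gender": ["F"] * 10 + ["M"] * 10,
    "condition": ["yes"] * 6 + ["no"] * 4 + ["yes"] * 3 + ["no"] * 7,
})

# Contingency table of observed counts.
observed = pd.crosstab(df["gender"], df["condition"])
print(observed)

# Expected counts under independence: (row total * column total) / grand total.
obs = observed.to_numpy()
expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / obs.sum()

# Pearson chi-squared statistic (no continuity correction).
chi2 = ((obs - expected) ** 2 / expected).sum()
print(f"chi-squared = {chi2:.3f}")
```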

• Heatmaps: Heatmaps are used to visualize the relationships between multiple variables by
displaying a matrix of correlations or other measures.
• Example: Creating a heatmap to visualize correlations between various medical test results in
a healthcare dataset.
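A correlation heatmap can be drawn directly with matplotlib. The sketch below assumes four synthetic test results (hypothetical names `glucose`, `insulin`, `bmi`, `blood_pressure`), with insulin deliberately constructed to correlate with glucose so that the matrix is not trivial.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic test results (illustrative only).
rng = np.random.default_rng(2)
n = 150
glucose = rng.normal(120, 30, n)
df = pd.DataFrame({
    "glucose": glucose,
    "insulin": 0.5 * glucose + rng.normal(0, 20, n),
    "bmi": rng.normal(30, 6, n),
    "blood_pressure": rng.normal(70, 12, n),
})

corr = df.corr()  # pairwise Pearson correlations

fig, ax = plt.subplots()
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(image, ax=ax, label="Pearson correlation")
fig.tight_layout()
fig.savefig("correlation_heatmap.png")
```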

Univariate and multivariate EDA are both essential for understanding data and making informed
decisions. While univariate EDA provides insights into individual variables, multivariate EDA
uncovers complex relationships and interactions between variables, offering a more
comprehensive view of the data. These approaches are fundamental for data exploration,
hypothesis generation, and guiding subsequent analyses in a wide range of fields, including
healthcare, finance, and social sciences.

DIAGRAM:

CODE & OUTPUTS

• Loading the dataset and Getting Insights About The Dataset:
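The experiment's own code and outputs for this step are not reproduced in the text. As a stand-in, here is a minimal sketch of loading a table and taking a first look at it with pandas. The inline CSV, its column names, and its values are assumptions made for illustration; with a real file, its path would be passed to `pd.read_csv` instead.

```python
import io
import pandas as pd

# Stand-in for the dataset file: a few made-up rows in CSV form.
csv_text = """Glucose,BloodPressure,BMI,Age,Outcome
150,70,34.0,50,1
90,65,27.0,30,0
180,62,24.0,33,1
95,68,28.5,22,0
140,45,42.0,35,1
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())      # first rows
print(df.shape)       # (rows, columns)
df.info()             # column types and non-null counts
print(df.describe())  # summary statistics for numeric columns
```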

• EDA and more insight into the dataset
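Typical follow-up checks at this stage include missing values, duplicate rows, class balance, and per-group summaries. The sketch below shows these on a small made-up table; the column names and values are hypothetical and the experiment's own code is not reproduced here.

```python
import numpy as np
import pandas as pd

# Hypothetical table with a missing value in two columns and one duplicate row.
df = pd.DataFrame({
    "Glucose": [150, 90, np.nan, 95, 140, 90],
    "BMI": [34.0, 27.0, 24.0, np.nan, 42.0, 27.0],
    "Age": [50, 30, 33, 22, 35, 30],
    "Outcome": [1, 0, 1, 0, 1, 0],
})

missing_per_column = df.isna().sum()           # missing values per column
duplicate_rows = df.duplicated().sum()         # fully duplicated rows
class_counts = df["Outcome"].value_counts()    # balance of the target classes
mean_by_outcome = df.groupby("Outcome").mean() # feature means per class

print(missing_per_column)
print("duplicate rows:", duplicate_rows)
print(class_counts)
print(mean_by_outcome)
```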

• OUTLIERS
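One common way to flag outliers is the interquartile range (IQR) rule, which marks values more than 1.5 × IQR below the first quartile or above the third quartile. The sketch below applies it to made-up readings; this is an assumed technique for illustration, since the method used in the experiment is not stated in the text.

```python
import pandas as pd

# Hypothetical readings with two extreme values (illustrative only).
values = pd.Series([70, 72, 68, 75, 71, 69, 74, 73, 180, 10],
                   name="blood_pressure")

q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Fences of the 1.5 * IQR rule.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
print(f"fences: [{lower}, {upper}]")
print(outliers)
```

A box plot draws the same fences as whiskers, so the flagged points correspond to the individual markers beyond the whiskers.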

CONCLUSION: In this experiment we learned how to obtain insights about a dataset and how
to perform Exploratory Data Analysis (EDA), including univariate EDA (histogram) and
multivariate EDA (scatterplot and heatmap), on a diabetes dataset.
