Outliers Z-Score

The document discusses using Z-scores to identify outliers in a dataset that follows a normal distribution. It explains that Z-scores measure the number of standard deviations an observation is from the mean, with values at or beyond +/-3 commonly treated as outliers. However, outliers can distort Z-scores by inflating the mean and standard deviation. The document then introduces an alternative method that uses the interquartile range to calculate inner and outer fences. Values between the inner and outer fences are classified as minor outliers, and values beyond the outer fences as major outliers.

Uploaded by

Ana Chikovani


Using Z-scores to Detect Outliers

Z-scores can quantify the unusualness of an observation when your data follow the normal distribution.
Z-scores are the number of standard deviations above or below the mean that each value falls. For example, a
Z-score of 2 indicates that an observation is two standard deviations above the mean, while a Z-score of -2
signifies it is two standard deviations below the mean. A Z-score of zero represents a value that equals the
mean.
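
As a minimal sketch of this definition, the snippet below computes Z-scores with Python's standard library. The function name z_scores and the three data values are hypothetical, chosen only for illustration, and the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which form it uses.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return the Z-score of each value: (value - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1 denominator)
    return [(v - m) / s for v in values]

# Hypothetical data, not taken from the document.
data = [2.0, 4.0, 6.0]
scores = z_scores(data)

for value, z in zip(data, scores):
    print(f"value = {value}, Z-score = {z:+.2f}")
```

The middle value equals the mean, so its Z-score is zero, while the other two sit one standard deviation below and above it.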

The further away an observation's Z-score is from zero, the more unusual it is. A standard cut-off for
finding outliers is a Z-score of +/- 3 or further from zero. In a plot of the standard normal distribution,
Z-scores beyond +/- 3 are so extreme that you can barely see the shading under the curve.

In a population that follows the normal distribution, Z-score values more extreme than +/- 3 have a probability
of 0.0027 (2 * 0.00135), which is about 1 in 370 observations. However, if your data don’t follow the normal
distribution, this approach might not be accurate.
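
The 0.0027 figure can be checked with a short calculation. This sketch uses the complementary error function from Python's math module; the function name two_sided_tail_probability is hypothetical.

```python
import math

def two_sided_tail_probability(z):
    """P(|Z| > z) for a standard normal variable Z, with z >= 0."""
    # For the standard normal, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

p = two_sided_tail_probability(3)
print(f"P(|Z| > 3) = {p:.4f}")            # two tails of about 0.00135 each
print(f"about 1 in {1 / p:.0f} observations")
```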

Also, note that an outlier's presence throws off the Z-scores because it inflates the mean and standard
deviation. In our example dataset, all the Z-scores are negative except the outlier's. If we
calculated Z-scores without the outlier, they'd be different! Be aware that if your dataset contains outliers,
Z-scores are biased such that they appear to be less extreme (i.e., closer to zero).
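
The effect described above can be illustrated with a small sketch. The data here are hypothetical (four typical values and one obvious outlier) and are not the document's example dataset. The reference parameter is likewise an assumption introduced for illustration: it lets the mean and standard deviation come from a different set of values than the ones being scored.

```python
from statistics import mean, stdev

def z_scores(values, reference=None):
    """Z-scores of `values`, using the mean and sample standard deviation
    of `reference` (defaults to `values` itself)."""
    ref = values if reference is None else reference
    m, s = mean(ref), stdev(ref)
    return [(v - m) / s for v in values]

# Hypothetical data: four typical values and one obvious outlier.
typical = [10, 11, 12, 13]
data = typical + [100]

with_outlier = z_scores(data)                        # outlier included in mean and SD
against_typical = z_scores(data, reference=typical)  # mean and SD from typical values only

for value, z_all, z_ref in zip(data, with_outlier, against_typical):
    print(f"{value:>5}  z (all data) = {z_all:7.2f}   z (typical only) = {z_ref:7.2f}")
```

In this sketch the value 100 receives a Z-score below 3 when it is included in the mean and standard deviation, so the +/- 3 cut-off would miss it, and every other value gets a negative Z-score.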

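An alternative is the interquartile range (IQR) method, which flags outliers using fences built from the
quartiles. The IQR is the third quartile (Q3) minus the first quartile (Q1). To calculate the outlier fences,
do the following: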

1. Take your IQR and multiply it by 1.5 and 3. We'll use these values to obtain the inner and outer fences.
For our example, the IQR equals 0.222. Consequently, 0.222 * 1.5 = 0.333 and 0.222 * 3 = 0.666. We'll use
0.333 and 0.666 in the following steps.
2. Calculate the inner and outer lower fences. Take the Q1 value and subtract the two values from step 1. The
two results are the lower inner and lower outer fences. For our example, Q1 is 1.714. So, the lower inner
fence = 1.714 - 0.333 = 1.381 and the lower outer fence = 1.714 - 0.666 = 1.048.
3. Calculate the inner and outer upper fences. Take the Q3 value and add the two values from step 1. The two
results are the upper inner and upper outer fences. For our example, Q3 is 1.936. So, the upper inner fence =
1.936 + 0.333 = 2.269 and the upper outer fence = 1.936 + 0.666 = 2.602.
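
The three steps above can be sketched as a small function. The name outlier_fences and the dictionary keys are hypothetical; the Q1 and Q3 inputs are the values from the worked example.

```python
def outlier_fences(q1, q3):
    """Return the inner (1.5 * IQR) and outer (3 * IQR) fences for given quartiles."""
    iqr = q3 - q1
    inner_step = 1.5 * iqr   # step 1: IQR * 1.5
    outer_step = 3.0 * iqr   # step 1: IQR * 3
    return {
        "lower_outer": q1 - outer_step,  # step 2
        "lower_inner": q1 - inner_step,  # step 2
        "upper_inner": q3 + inner_step,  # step 3
        "upper_outer": q3 + outer_step,  # step 3
    }

# Quartiles from the worked example: Q1 = 1.714, Q3 = 1.936 (IQR = 0.222).
fences = outlier_fences(1.714, 1.936)
for name, value in fences.items():
    print(f"{name}: {value:.3f}")
```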

Using the Outlier Fences with Our Example Dataset

For our example dataset, the values for these fences are 1.048, 1.381, 2.269, and 2.602. Almost all of our data
should fall between the inner fences, which are 1.381 and 2.269. At this point, we look at our data values and
determine whether any qualify as major or minor outliers. Values between an inner fence and its outer fence are
minor outliers, and values beyond an outer fence are major outliers. 14 out of the 15 data points fall inside
the inner fences, so they are not outliers. The 15th data point falls outside the upper outer fence, which
makes it a major (extreme) outlier.
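
A minimal sketch of this classification follows, using the four fence values from the example. The function name classify and the three sample values are hypothetical, since the individual data points are not listed here. Treating a value that lies exactly on a fence as inside it is also an assumption.

```python
# Fence values from the worked example.
LOWER_OUTER, LOWER_INNER = 1.048, 1.381
UPPER_INNER, UPPER_OUTER = 2.269, 2.602

def classify(value):
    """Label a value as a major outlier, a minor outlier, or not an outlier."""
    if value < LOWER_OUTER or value > UPPER_OUTER:
        return "major outlier"   # beyond an outer fence
    if value < LOWER_INNER or value > UPPER_INNER:
        return "minor outlier"   # between an inner and an outer fence
    return "not an outlier"      # inside the inner fences

# Hypothetical values, one for each category.
samples = [1.9, 2.4, 2.7]
labels = [classify(v) for v in samples]

for value, label in zip(samples, labels):
    print(f"{value}: {label}")
```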

The IQR method is helpful because it uses percentiles, which do not depend on a specific distribution.
Additionally, percentiles are relatively robust to the presence of outliers compared to the other quantitative
methods. Values that fall inside the two inner fences are not outliers.
