Anscombe's Quartet: Data Sets

The document discusses Anscombe's Quartet, which contains four datasets that have nearly identical simple descriptive statistics (mean and correlation) but appear very different when graphed. It demonstrates that just reporting statistics can be misleading without also examining the actual relationships visually. Each pair is then graphed to show that though the statistics are the same, the actual relationships between the variables in each dataset are completely different.

Uploaded by

Rahul Karna

Manobin Sharma

073/MSMS/859
Assignment 01(Data Science)
Anscombe's Quartet:
Anscombe's quartet is a classic example of the drawback of reporting only a correlation coefficient.
In his 1973 paper in The American Statistician, Francis Anscombe showed how four different pairs
of variables can yield the same correlation coefficient while the relationships within each pair
are completely different. The quartet was constructed to demonstrate both the importance of graphing
data before analyzing it and the effect of outliers on statistical properties.

The quartet contains four datasets that have nearly identical simple descriptive statistics, yet
appear very different when graphed.

He described the article as being intended to counter the impression among statisticians that
"numerical calculations are exact, but graphs are rough."

Data Sets:

           X1          Y1   X2          Y2   X3          Y3   X4          Y4

           10        8.04   10        9.14   10        7.46    8        6.58
            8        6.95    8        8.14    8        6.77    8        5.76
           13        7.58   13        8.74   13       12.74    8        7.71
            9        8.81    9        8.77    9        7.11    8        8.84
           11        8.33   11        9.26   11        7.81    8        8.47
           14        9.96   14        8.10   14        8.84    8        7.04
            6        7.24    6        6.13    6        6.08    8        5.25
            4        4.26    4        3.10    4        5.39   19       12.50
           12       10.84   12        9.13   12        8.15    8        5.56
            7        4.82    7        7.26    7        6.42    8        7.91
            5        5.68    5        4.74    5        5.73    8        6.89
Average:    9  7.50090909    9  7.50090909    9         7.5    9  7.50090909
Mean of X: 9
Mean of Y: 7.50
Linear regression line: y = 3.00 + 0.500x
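The summary figures above can be checked with a few lines of code. The following is a minimal sketch in plain Python; the names `quartet`, `summarize`, and `summaries` are illustrative assumptions and not part of the original assignment. It computes the means, the Pearson correlation, and the least-squares line for each of the four datasets from the table.

```python
from statistics import mean

# Anscombe's four datasets, copied from the table above.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return (mean_x, mean_y, Pearson r, slope, intercept) for one dataset."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx                  # least-squares slope
    intercept = my - slope * mx        # line passes through the point of means
    r = sxy / (sxx * syy) ** 0.5       # Pearson correlation coefficient
    return mx, my, r, slope, intercept


summaries = {name: summarize(x, y) for name, (x, y) in quartet.items()}
for name, (mx, my, r, slope, intercept) in summaries.items():
    print(f"{name}: mean_x={mx:.2f} mean_y={my:.2f} r={r:.3f} "
          f"line: y = {intercept:.2f} + {slope:.3f}x")
```

All four rows of output should agree to about two decimal places, which is the point of the quartet: these numbers alone cannot distinguish the datasets.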
[Figure: scatter plot of X1 vs Y1 with linear trend line, y = 0.5001x + 3.0001, R² = 0.6665]

• The first scatter plot shows a simple linear relationship, consistent with two correlated
variables that satisfy the assumption of normality.

[Figure: scatter plot of X2 vs Y2 with linear trend line, y = 0.5x + 3.0009, R² = 0.6662]

• The second graph is not normally distributed; while a relationship between the two variables is
obvious, it is not linear, so the Pearson correlation coefficient is not a relevant summary. A more
general regression, with its corresponding coefficient of determination, would be more
appropriate.
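As one illustration of a "more general regression", the sketch below fits both a straight line and a quadratic to the second dataset and compares their coefficients of determination. The choice of a quadratic is an assumption made for illustration (the source does not name a specific model), and the helpers `solve`, `polyfit`, and `r_squared` are hypothetical names written here in plain Python rather than taken from a library.

```python
# Dataset II from the table above.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    sol = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (m[i][n] - tail) / m[i][i]
    return sol


def polyfit(x, y, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via the normal equations."""
    n = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    return solve(a, b)


def r_squared(x, y, coeffs):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = sum(y) / len(y)
    pred = [sum(c * xi ** k for k, c in enumerate(coeffs)) for xi in x]
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


line = polyfit(x, y, 1)   # [intercept, slope]
quad = polyfit(x, y, 2)   # [c0, c1, c2]
r2_line = r_squared(x, y, line)
r2_quad = r_squared(x, y, quad)
print(f"linear    R^2 = {r2_line:.4f}")
print(f"quadratic R^2 = {r2_quad:.4f}, coefficients = {quad}")
```

The linear fit reproduces the R² shown on the chart, while the quadratic explains almost all of the variation, which is what the curved shape of the scatter plot suggests.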

[Figure: scatter plot of X3 vs Y3 with linear trend line, y = 0.4997x + 3.0025, R² = 0.6663]
0 5 10 15

• In the third graph, the relationship is linear, but it should have a different regression line (a
robust regression would have been called for). The calculated regression is offset by the one
outlier, which exerts enough influence to lower the correlation coefficient from 1 to 0.816.
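The influence of that single outlier can be shown numerically. The sketch below is not a robust regression; as a simpler stand-in, it drops the point with the largest residual from the ordinary least-squares fit and refits. The function name `fit` and the variable names are illustrative assumptions.

```python
# Dataset III from the table above.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def fit(x, y):
    """Return (slope, intercept, Pearson r) for a simple least-squares line."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / (sxx * syy) ** 0.5


# Fit all eleven points, then locate the point farthest from the fitted line.
slope, intercept, r = fit(x, y)
residuals = [abs(yi - (intercept + slope * xi)) for xi, yi in zip(x, y)]
worst = residuals.index(max(residuals))

# Refit without that single point.
x_trim = x[:worst] + x[worst + 1:]
y_trim = y[:worst] + y[worst + 1:]
slope_trim, intercept_trim, r_trim = fit(x_trim, y_trim)

print(f"all points:      r = {r:.3f}, y = {intercept:.2f} + {slope:.3f}x")
print(f"outlier removed: r = {r_trim:.5f}, y = {intercept_trim:.2f} + {slope_trim:.3f}x")
print(f"removed point:   ({x[worst]}, {y[worst]})")
```

With the outlier removed, the remaining ten points lie almost exactly on a line with a noticeably shallower slope, so the correlation rises to nearly 1.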

[Figure: scatter plot of X4 vs Y4 with linear trend line, y = 0.4999x + 3.0017, R² = 0.6667]
0 5 10 15 20

• Finally, the fourth graph shows an example in which one outlier is enough to produce a high
correlation coefficient, even though the other data points do not indicate any relationship
between the variables.
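To see how completely the fourth dataset's correlation depends on that one point, the sketch below computes the Pearson correlation with and without the observation whose x value lies farthest from the mean. The helper `pearson_r` is a hypothetical name; returning `None` when a variable has zero variance is a convention chosen here for illustration, since the coefficient is undefined in that case.

```python
# Dataset IV from the table above.
x = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]


def pearson_r(x, y):
    """Pearson correlation, or None when either variable has zero variance."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a constant variable
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


# The point whose x value is farthest from the mean of x.
mean_x = sum(x) / len(x)
lever = max(range(len(x)), key=lambda i: abs(x[i] - mean_x))

r_all = pearson_r(x, y)
r_rest = pearson_r(x[:lever] + x[lever + 1:], y[:lever] + y[lever + 1:])

print(f"with point ({x[lever]}, {y[lever]}): r = {r_all:.3f}")
print(f"without it: r = {r_rest}")
```

Once that point is removed, every remaining x value equals 8, so there is no variation in x left to correlate with y.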

It is not known how Anscombe created his datasets. Since the quartet's publication, several methods
have been developed to generate similar datasets with identical statistics and dissimilar graphics.
