Math 1111 Learning Centre

Describing Data

This worksheet focuses on describing data through measuring its centre and variability.
These measurements will give us an idea of what our data set looks like.
CENTRE
There are three ways to describe the centre of a data set: mean, median, and mode.
Mean is the technical term for what most people call an average. In statistics, “average”
is too vague, so you aren’t likely to see it. The mean represents the typical value of a
data set.
To find the mean, we first take the sum of all numbers in the data set. The symbol for
taking the sum of a set of numbers is the capital Greek letter sigma, Σ, so “Σ x” means
“take the sum of all values of x”. Then we divide by n, the number of observations in the
data set. You will see the notation for mean represented two ways: x̄ (pronounced “x
bar”) is used when we are finding the mean of a sample of data, and μ (pronounced
“mew”, the Greek letter mu) represents the mean of a population of data. The
population is all the members of the group of interest (e.g., all ducks) while a sample is
a smaller group, or subset, of the population (e.g., 250 ducks at Trout Lake).
Example 1: Find the mean of the following sample: {3, 5, 4, 9, 8, 5, 7, 8, 9, 12}
Solution: We take the sum of all numbers, and then we divide by n, which is 10:
\[ \sum x_i = 3 + 5 + 4 + 9 + 8 + 5 + 7 + 8 + 9 + 12 = 70 \]
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{70}{10} = 7 \]
The mean of our data set is 7. The formula above can be used to calculate the mean of
any data set.
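As a quick check of Example 1, the same calculation can be sketched in Python. The variable names are illustrative and not part of the worksheet.

```python
# Sample from Example 1
data = [3, 5, 4, 9, 8, 5, 7, 8, 9, 12]

total = sum(data)   # sum of all observations (the "Σ x" step)
n = len(data)       # number of observations
mean = total / n    # sample mean: sum divided by n

print(total, n, mean)  # 70 10 7.0
```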
The median is the middle value of an ordered data set. If there is an odd number of
observations, then the median is the middle number. If there is an even number of
observations, we take the mean of the two middle numbers. We find the position of the
central observation using the formula:
\[ \text{position number} = \frac{n + 1}{2} \]
The median is a more useful measure of central tendency if the data is skewed,
meaning that the data favours high numbers over low numbers, or vice versa. In
graphical form, a skewed curve appears asymmetrical, with one longer tail leading off in
the direction of skew. (So data that is right skewed has a long tail to the right.)

© 2013 Vancouver Community College Learning Centre. Authored by Emily Simpson & Darren Rigby.
Student review only. May not be reproduced for classes.
Example 2: Find the median of the data set in Example 1.
Solution: The first step is to put the data set in order from smallest to largest:
{3, 4, 5, 5, 7, 8, 8, 9, 9, 12}
Next we have to determine which position within the data set is the middle position. Our
data set has 10 observations, so n = 10. The middle observation is number (10+1)/2 =
#5.5, meaning it’s halfway between #5 and #6. We must take the average (the mean!)
of the observations in the 5th and 6th positions. That’s 7 and 8, so the median of our data
set is 7.5. (Note: the median is not 5.5! That’s a position number and has nothing to do
with the observations themselves.)
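The position rule can be sketched as a small Python function. This is one possible way to write it, and the function name `median` is chosen here for illustration. Note that Python lists count positions from 0, so worksheet position #5 is index 4.

```python
def median(values):
    """Median of a data set using the (n + 1) / 2 position rule."""
    ordered = sorted(values)   # step 1: put the data in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: (n + 1) / 2 is a whole position, which is index mid (0-based)
        return ordered[mid]
    # Even n: the position falls halfway between two observations,
    # so take the mean of the observations on either side
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [3, 5, 4, 9, 8, 5, 7, 8, 9, 12]
print(median(data))  # 7.5
```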
The mode of a data set is the value that occurs most frequently. It is possible to have
more than one mode in a data set. In the data set {3, 4, 5, 5, 7}, the number 5 occurs
twice so it is the mode. In the data set {2, 4, 2, 6, 7, 7, 7, 8, 2}, both the numbers 2 and
7 occur three times each. This would be a bimodal data set.
Example 3: Identify the mode(s) in the data set from Example 1, if any exist.
Solution: There are three modes in this data set: 5, 8, and 9 (each value occurs twice).
This is called a multimodal data set.
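Counting how often each value occurs can be sketched in Python as follows. The function name `modes` is illustrative, and returning an empty list when no value repeats is a convention assumed here, not something the worksheet specifies.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs most frequently, in ascending order."""
    counts = Counter(values)          # value -> number of occurrences
    highest = max(counts.values())
    if highest == 1:
        # Assumption: if no value repeats, report that there is no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([3, 4, 5, 5, 7]))                   # [5]
print(modes([2, 4, 2, 6, 7, 7, 7, 8, 2]))       # [2, 7]
print(modes([3, 5, 4, 9, 8, 5, 7, 8, 9, 12]))   # [5, 8, 9]
```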

VARIABILITY
The range is the difference between the highest (maximum) and lowest (minimum)
value in the data set, but it doesn’t tell us much. If a data set has 50 observations, of
which one is 20, another is 100, and the rest are all 60, the range is 100 − 20 = 80, but
that’s not a true measure of how much the data varies.
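The 50-observation example can be built directly to see why the range is misleading here. This is a small illustrative sketch, and the variable names are assumptions.

```python
# One observation of 20, one of 100, and the remaining 48 all equal to 60
data = [20, 100] + [60] * 48

data_range = max(data) - min(data)   # highest value minus lowest value
at_sixty = data.count(60)            # how many observations do not vary at all

print(len(data), data_range, at_sixty)  # 50 80 48
```

The range of 80 is set entirely by two observations, while the other 48 are identical.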
The most common measure of variability of a data set is standard deviation. It reflects
the deviations, or differences, of all values in the data set from the mean. A larger
standard deviation would indicate greater variability for a data set.
If you calculated the mean mark on a class midterm to be 65, that only tells you the
average mark. Did the marks in the class look like {66, 64, 67, 66, 62, 70, …} or like {48,
97, 83, 57, 62, 81, …}? The first set of marks has low standard deviation—most of the
marks are quite close to the mean. The second set has a higher standard deviation as
there is a greater spread of values from the mean. The notation for standard deviation
of a population is σ (sigma again, but this is the lower case form of the Greek letter).
The notation for standard deviation of a sample is s. To calculate standard deviation we
use the following formulas:
Population standard deviation:
\[ \sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{n}} \]
Sample standard deviation:
\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{\sum (x^2) - \frac{(\sum x)^2}{n}}{n - 1}} \]
The rightmost formula for sample standard deviation is the easiest one to use for
calculating s by hand. Variance is another related measure of variability that is simply
the square of the standard deviation (σ² or s²). Variance is less useful as a measure of
variability, since its scale often doesn’t match the spread of the data. (Data that varies
by no more than 10 might have a variance of 20. If so, the 20 is meaningless in the
context of the problem. The standard deviation, on the other hand, would be about √20
≈ 4.5, which would mean you might expect data to vary from the mean by 4.5 on
average.)

Example 4: Calculate the standard deviation of the data set from Example 1.
Solution: We know n = 10 from Example 1. We also know Σ x = 70 from Example 1.
The only term left to figure out is Σ(x²). Σ(x²) is the sum of the squares of all data values:
\[ \sum (x^2) = 3^2 + 5^2 + 4^2 + 9^2 + 8^2 + 5^2 + 7^2 + 8^2 + 9^2 + 12^2 = 558 \]
Now we plug into the formula:
\[ s = \sqrt{\frac{558 - \frac{70^2}{10}}{10 - 1}} = \sqrt{7.555555\ldots} \approx 2.75 \]
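To see that the two forms of the sample standard deviation formula agree, here is a short Python sketch that evaluates both for the Example 1 data. The variable names are illustrative. Python's built-in `statistics.stdev` also computes the sample standard deviation and could be used as a further check.

```python
import math

data = [3, 5, 4, 9, 8, 5, 7, 8, 9, 12]
n = len(data)
mean = sum(data) / n

# Definition form: squared deviations from the mean, summed, divided by n - 1
s_definition = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Shortcut form: needs only the sum of x and the sum of x squared
sum_x = sum(data)
sum_x2 = sum(x ** 2 for x in data)
s_shortcut = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(sum_x2, round(s_definition, 2), round(s_shortcut, 2))  # 558 2.75 2.75
```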
Another way to measure variability is by using quartiles. This is a more accurate
description of variability than the standard deviation if a data set has strong outliers
(values that lie far away from the rest of the data) or is strongly skewed.
The first quartile (Q1) is the data point that lies above ¼ (25%) of all the points of the
data set, and the third quartile (Q3) is the point that lies above ¾ (75%) of all the data
points. The second quartile, which lies above ½ of all data points, is simply the median.
To calculate a first quartile, find the median of the “first half” of the data, and to calculate
a third quartile, find the median of the “last half” of the data. If the data set has an odd
number of observations, then the median is not included in either half of the data for the
purposes of finding quartiles.
The interquartile range (IQR) is the difference between the third quartile and first
quartile: Q3 − Q1. This range will include the middle 50% of the values of the data set.
Example 5: For the data set in Example 1, find the first and third quartiles, and the IQR.
Solution: We know the median is the 5.5th position, so the first half of the data are in
positions #1 – #5, and the last half are in positions #6 – #10. Consider the first half of
the observations and find its median. The position of the first quartile is (5+1)/2 = #3. The
position of the third quartile is (6+10)/2 = #8. Within the ordered data set,

{3, 4, 5, 5, 7, 8, 8, 9, 9, 12}

Q1 sits at position #3 and Q3 at position #8.
Therefore Q1 = 5, and Q3 = 9. The IQR is 9 − 5 = 4.
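The "median of each half" method can be sketched in Python as below. The function names are illustrative, and the sketch assumes at least two observations. Statistical software often uses other quartile conventions (for example, interpolating between positions), so library results may differ slightly from this worksheet's method.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Q1 and Q3 as the medians of the first and last halves of the data."""
    ordered = sorted(values)
    half = len(ordered) // 2
    first_half = ordered[:half]    # lowest half of the observations
    last_half = ordered[-half:]    # highest half; the median is skipped when n is odd
    return median(first_half), median(last_half)

q1, q3 = quartiles([3, 5, 4, 9, 8, 5, 7, 8, 9, 12])
print(q1, q3, q3 - q1)  # 5 9 4
```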
A good summary of a data set is called a five-figure summary. It includes the minimum
value of the set, the first quartile, the median, the third quartile and the maximum value.
The five-figure summary gives a good idea of the spread of data as well as the
presence of any outliers in the data set. In the following example, there’s a serious
outlier at one end of the data, but the description of the centre won’t be affected by it:
Example 6: Give the five-figure summary of: {32, 33, 37, 37, 39, 40, 42, 45, 98}.
Solution: The minimum value of the data set is 32; the maximum is 98. There are 9
observations in the data set, so n = 9. The data is already in numerical order.
The median value is at position (9+1)/2 = #5. The data point in position #5 is 39. This is the
median.
The first half of the data is #1 – #4 and the last half is #6 – #9. Because there is an odd
number of observations, the median is one of the data points, so it’s not included in
either half of the data when finding quartiles. The first quartile is at position (4+1)/2 = #2.5
(so between #2 and #3) and the third quartile is at (6+9)/2 = #7.5 (between #7 and #8). This
means the first quartile is the mean of the data points in positions #2 and #3, which are
33 and 37. (33+37)/2 = 35, so Q1 is 35. The third quartile is the mean of the data points in
positions #7 and #8, which are 42 and 45. (42+45)/2 = 43.5, so Q3 is 43.5. The five-figure
summary is: 32, 35, 39, 43.5, 98.
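A minimal Python sketch of the five-figure summary, using the same "median of each half" rule for quartiles as Example 6. The function names are illustrative, and at least two observations are assumed.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def five_figure_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    half = len(ordered) // 2   # size of each half; excludes the median when n is odd
    return (
        ordered[0],               # minimum
        median(ordered[:half]),   # Q1: median of the first half
        median(ordered),          # median of the whole data set
        median(ordered[-half:]),  # Q3: median of the last half
        ordered[-1],              # maximum
    )

print(five_figure_summary([32, 33, 37, 37, 39, 40, 42, 45, 98]))
# (32, 35.0, 39, 43.5, 98)
```

The outlier 98 shows up only as the maximum; the quartiles and median are unaffected by it.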

EXERCISES
A. For the following sets of data, calculate (a) sample mean, (b) median, (c) mode, (d)
range, (e) standard deviation, (f) first quartile, (g) third quartile, (h) interquartile range,
and (i) the five-figure summary. Round to two decimal places where appropriate.
1) {8, 24, 9, 6, 10, 18, 7, 14, 16, 21, 13, 24}
2) {3, 6, 5, 4, 6, 5, 9, 10, 11, 7, 9}
3) {41, 39, 38, 42, 43, 39, 40, 43, 26, 42, 42, 41, 41, 42, 27, 55, 60}
B. Identify the data set (1, 2, or 3 according to the numbered exercises above) with the
greatest variability based on (a) standard deviation, (b) range and (c) IQR.
C. Explain why the answers to B(a) and B(b) are different from B(c).

SOLUTIONS
A. 1) (a) 14.17 (b) 13.5 (position #6.5) (c) 24 (d) 18 (e) 6.46 (f) 8.5 (position #3.5)
(g) 19.5 (position #9.5) (h) 11 (i) 6, 8.5, 13.5, 19.5, 24
2) (a) 6.82 (b) 6 (position #6) (c) 5, 6 and 9 (d) 8 (e) 2.60 (f) 5 (position #3)
(g) 9 (position #9) (h) 4 (i) 3, 5, 6, 9, 11
3) (a) 41.24 (b) 41 (position #9) (c) 42 (d) 34 (e) 7.93 (f) 39 (position #4.5)
(g) 42.5 (position #13.5) (h) 3.5 (i) 26, 39, 41, 42.5, 60
B. a) Data set #3 has the greatest standard deviation.
b) Data set #3 has the greatest range.
c) Data set #1 has the greatest IQR.
C. Data set #3 has outliers at both ends of the data. Because of this, data set #3 has a
high standard deviation and range. However, the IQR is less sensitive to outliers.
For this reason, the IQR of data set #3 is much lower than that of data set #1.
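The comparison in parts B and C can be checked with a short Python sketch. The helper names are illustrative, `statistics.stdev` is Python's built-in sample standard deviation, and the IQR uses this worksheet's "median of each half" rule.

```python
import statistics

def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def iqr(values):
    """Interquartile range: Q3 minus Q1, each taken as the median of a half."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[-half:]) - median(ordered[:half])

data_sets = {
    1: [8, 24, 9, 6, 10, 18, 7, 14, 16, 21, 13, 24],
    2: [3, 6, 5, 4, 6, 5, 9, 10, 11, 7, 9],
    3: [41, 39, 38, 42, 43, 39, 40, 43, 26, 42, 42, 41, 41, 42, 27, 55, 60],
}

measures = {
    "standard deviation": statistics.stdev,
    "range": lambda v: max(v) - min(v),
    "IQR": iqr,
}

# For each measure, find which data set scores highest
greatest = {}
for name, measure in measures.items():
    scores = {k: measure(v) for k, v in data_sets.items()}
    greatest[name] = max(scores, key=scores.get)
    print(name, {k: round(s, 2) for k, s in scores.items()},
          "-> greatest: data set", greatest[name])
```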
