Lecture 4
Asmaa El-Toony
• standard scores
• percentiles
• quartiles
• outliers
Measures of position are used to locate the relative position of a data value in
the data set. For example, if a value is located at the 80th percentile, 80% of the
values fall below it in the distribution and 20% of the values fall above it. The
median is the value that corresponds to the 50th percentile, since one-half of the
values fall below it and one-half of the values fall above it. This section discusses
these measures of position.
Example 1:
Solution
For Test A:
$z = \frac{X - \bar{X}}{s} = \frac{38 - 40}{5} = -0.4$
For Test B:
$z = \frac{X - \bar{X}}{s} = \frac{94 - 100}{10} = -0.6$
Since -0.4 is greater than -0.6, the score for Test A is relatively higher than the score for Test B.
Example 2:
A student scored 65 on a calculus test that had a mean of 50 and a standard
deviation of 10; she scored 30 on a history test with a mean of 25 and a standard
deviation of 5. Compare her relative positions on the two tests.
Statistical Methods Dr. Asmaa El-Toony
Solution
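First, find the z scores. For calculus the z score is:
$z = \frac{X - \bar{X}}{s} = \frac{65 - 50}{10} = 1.5$
For history the z score is:
$z = \frac{X - \bar{X}}{s} = \frac{30 - 25}{5} = 1.0$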
Since the z score for calculus is larger, her relative position in the calculus class is
higher than her relative position in the history class.
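The comparison above can be checked with a few lines of Python. This is a minimal sketch; the function name `z_score` and its parameter names are illustrative and do not come from the lecture.

```python
def z_score(x, mean, std_dev):
    """Return the standard score z = (x - mean) / std_dev."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Example 2: calculus (score 65, mean 50, s = 10) and history (score 30, mean 25, s = 5)
z_calculus = z_score(65, 50, 10)
z_history = z_score(30, 25, 5)
print(z_calculus, z_history)  # 1.5 1.0

# The larger z score marks the higher relative position.
print("calculus" if z_calculus > z_history else "history")  # calculus
```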
2) Percentiles
Percentiles divide the data set into 100 equal groups. Percentiles are symbolized
by $P_1, P_2, P_3, \ldots, P_{99}$.
Example 3:
A teacher gives a 20-point test to 10 students. The scores are shown here. Find
the percentile rank of a score of 12.
18, 15, 12, 6, 8, 2, 3, 5, 20, 10
Solution
Step 1 Arrange the data in order from lowest to highest:
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
Step 2 Then substitute into the formula
$\text{Percentile} = \dfrac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \times 100$
Since there are six values below a score of 12, the solution is:
$\text{Percentile} = \dfrac{6 + 0.5}{10} \times 100 = 65\text{th percentile}$
Thus, a student whose score was 12 did better than 65% of the class.
Example 4:
Using the data in Example 3, find the percentile rank for a score of 6.
Solution
There are three values below 6. Thus
$\text{Percentile} = \dfrac{3 + 0.5}{10} \times 100 = 35\text{th percentile}$
A student who scored 6 did better than 35% of the class.
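The percentile-rank formula used in Examples 3 and 4 can be sketched in Python as follows. The function name `percentile_rank` is illustrative, and the sketch assumes "values below X" means strictly smaller values.

```python
def percentile_rank(data, x):
    """Percentile rank of x: (number of values below x + 0.5) / n * 100."""
    if not data:
        raise ValueError("data must not be empty")
    below = sum(1 for value in data if value < x)  # strictly smaller values
    return (below + 0.5) * 100 / len(data)


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
print(percentile_rank(scores, 12))  # 65.0
print(percentile_rank(scores, 6))   # 35.0
```

Sorting is not needed here, because only the count of smaller values enters the formula.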
Example 5:
Using the data in Example 3, find the value corresponding to the 25th percentile.
Solution
Step 1 Arrange the data in order from lowest to highest.
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
Step 2 Compute
$C = \frac{n \cdot p}{100}$
where n = total number of values and p = percentile.
Thus,
$C = \frac{10 \cdot 25}{100} = 2.5$
Step 3 If C is not a whole number, round it up to the next whole number; in this
case, C = 3.
Start at the lowest value and count over to the third value, which is 5. Hence, the
value 5 corresponds to the 25th percentile.
Example 5:
Using the data in Example 3, find the value corresponding to the 60th percentile.
Solution
Step 1 Arrange the data in order from lowest to highest.
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
Step 2 Compute
$C = \frac{n \cdot p}{100} = \frac{10 \cdot 60}{100} = 6$
Step 3 If C is a whole number, use the value halfway between the Cth and (C+1)th
values when counting up from the lowest value; in this case, the 6th and 7th values.
$\frac{10 + 12}{2} = 11$
Hence, 11 corresponds to the 60th percentile. Anyone scoring 11 would have
done better than 60% of the class.
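The two cases of this procedure (C not a whole number, C a whole number) can be sketched in Python. The function name `value_at_percentile` is illustrative, and the sketch assumes p is a whole number from 1 to 99 so that integer arithmetic can test whether C is whole.

```python
def value_at_percentile(data, p):
    """Value at the p-th percentile, using C = n * p / 100 as in the lecture."""
    if not data or not 1 <= p <= 99:
        raise ValueError("need non-empty data and a whole-number p from 1 to 99")
    ordered = sorted(data)
    whole, remainder = divmod(len(ordered) * p, 100)  # C = whole + remainder / 100
    if remainder != 0:
        # C is not a whole number: round up and take that value (1-based count).
        position = whole + 1
        return ordered[position - 1]
    # C is a whole number: average the C-th and (C+1)-th values.
    return (ordered[whole - 1] + ordered[whole]) / 2


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
print(value_at_percentile(scores, 25))  # 5
print(value_at_percentile(scores, 60))  # 11.0
```

Library routines such as `numpy.percentile` use interpolation rules that can give slightly different answers on small data sets, so this sketch follows the counting rule from the examples instead.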
3) Quartiles
Quartiles divide the distribution into four groups, separated by Q1, Q2, and Q3.
Note that Q1 is the same as the 25th percentile; Q2 is the same as the 50th
percentile, or the median; and Q3 corresponds to the 75th percentile.
Example 6:
Find Q1, Q2, and Q3 for the data set: 15, 13, 6, 5, 12, 50, 22, 18
Solution
Step 1 Arrange the data in order
5, 6, 12, 13, 15, 18, 22, 50
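Step 2
For Q2 (the median):
$Q_2 = \frac{X_4 + X_5}{2} = \frac{13 + 15}{2} = 14$
For Q1: Find the median of the data values less than 14.
5, 6, 12, 13
$Q_1 = \frac{X_2 + X_3}{2} = \frac{6 + 12}{2} = 9$
For Q3: Find the median of the data values greater than 14.
15, 18, 22, 50
$Q_3 = \frac{X_6 + X_7}{2} = \frac{18 + 22}{2} = 20$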
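The median-of-halves steps in Example 6 can be sketched in Python with the standard library. The function name `quartiles` is illustrative. The sketch assumes the halves are the values strictly below and strictly above the median, as in the example; data sets with values tied to the median may be handled differently by other conventions.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) as the medians of the lower half, all data, and the upper half."""
    ordered = sorted(data)
    q2 = median(ordered)
    lower_half = [value for value in ordered if value < q2]  # values below the median
    upper_half = [value for value in ordered if value > q2]  # values above the median
    # median() raises StatisticsError if a half is empty (e.g. a single-value data set).
    return median(lower_half), q2, median(upper_half)


print(quartiles([15, 13, 6, 5, 12, 50, 22, 18]))  # (9.0, 14.0, 20.0)
```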
4) Outliers
An outlier is an extremely high or an extremely low data value when
compared with the rest of the data values.
An outlier can strongly affect the mean and standard deviation of a variable.
For example, suppose a researcher mistakenly recorded an extremely high
data value. This value would then make the mean and standard deviation of
the variable much larger than they really were. Outliers can have an effect
on other statistics as well. There are several ways to check a data set for
outliers. One method is shown in this Procedure Table.
Example 7:
Check the following data set for outliers.
5, 6, 12, 13, 15, 18, 22, 50.
Solution
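Step 1 Arrange the data in order and find Q1 and Q3. From Example 6, Q1 = 9 and
Q3 = 20.
Step 2 Find the interquartile range (IQR): Q3 - Q1 = 20 - 9 = 11.
Step 3 Multiply this value by 1.5: 1.5 × 11 = 16.5.
Step 4 Subtract the result from Q1 and add it to Q3: 9 - 16.5 = -7.5 and
20 + 16.5 = 36.5.
Step 5 Check the data set for any data values that fall outside the interval
from -7.5 to 36.5. The value 50 is outside this interval; hence, it can be
considered an outlier.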
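The interval from -7.5 to 36.5 matches the common 1.5 × IQR rule applied to Q1 = 9 and Q3 = 20 from Example 6. A minimal Python sketch of that check follows, assuming quartiles are found as medians of the lower and upper halves; the function name `find_outliers` and the parameter `k` are illustrative.

```python
from statistics import median


def find_outliers(data, k=1.5):
    """Return (low, high, outliers) using the fences Q1 - k*IQR and Q3 + k*IQR."""
    ordered = sorted(data)
    q2 = median(ordered)
    q1 = median([value for value in ordered if value < q2])  # median of lower half
    q3 = median([value for value in ordered if value > q2])  # median of upper half
    iqr = q3 - q1                                            # interquartile range
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [value for value in ordered if value < low or value > high]
    return low, high, outliers


print(find_outliers([5, 6, 12, 13, 15, 18, 22, 50]))  # (-7.5, 36.5, [50])
```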
A boxplot can be used to graphically represent the data set. These plots
involve five specific values:
1. The lowest value of the data set (i.e., minimum)
2. Q1
3. Q2 (The median)
4. Q3
5. The highest value of the data set (i.e., maximum)
Example 8:
The number of meteorites found in 10 states of the United States:
89, 47, 164, 296, 30, 215, 138, 78, 48, 39.
Construct a boxplot for the data.
Solution
Step 1 Arrange the data in order:
30, 39, 47, 48, 78, 89, 138, 164, 215, 296
Step 2 Find the median, Q1, Q3
Median = 83.5, Q1 = 47, Q3 = 164
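The five values needed for the boxplot can be computed with a short Python sketch. The function name `five_number_summary` is illustrative, and the quartiles are assumed to be the medians of the values below and above the overall median, which reproduces the figures in this example.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a boxplot."""
    ordered = sorted(data)
    q2 = median(ordered)
    q1 = median([value for value in ordered if value < q2])  # median of lower half
    q3 = median([value for value in ordered if value > q2])  # median of upper half
    return ordered[0], q1, q2, q3, ordered[-1]


meteorites = [89, 47, 164, 296, 30, 215, 138, 78, 48, 39]
print(five_number_summary(meteorites))  # (30, 47, 83.5, 164, 296)
```

The box runs from Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.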