Lect 5

Uploaded by

Fatih
Copyright
© All Rights Reserved

Dispersion

Why Study Dispersion?

• A measure of location, such as the mean or the median, only describes the
center of the data. It is valuable from that standpoint, but it does not tell us
anything about the spread of the data.

• For example, if your nature guide told you that the river ahead averaged 3
feet in depth, would you want to wade across on foot without additional
information? Probably not. You would want to know something about the
variation in the depth.

• A second reason for studying the dispersion in a set of data is to compare
the spread in two or more distributions.
Sample of Dispersions
Measures of Dispersion

• Range

• Variance and Standard Deviation
EXAMPLE – Range

The number of cappuccinos sold at the Starbucks location in a Country Airport
between 4 and 7 p.m. for a sample of 5 days last year was 20, 40, 50, 60, and
80. Determine the range for the number of cappuccinos sold.

Range = Largest value – Smallest value
      = 80 – 20 = 60
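
As a quick check, the range computation can be reproduced in a few lines of Python. This is only an illustrative sketch; the variable names are assumptions and not part of the lecture.

```python
# Cappuccinos sold on the 5 sampled days (values from the example)
cappuccinos = [20, 40, 50, 60, 80]

# Range = largest value - smallest value
data_range = max(cappuccinos) - min(cappuccinos)

print(data_range)  # 60
```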
EXAMPLE – Variance and Standard Deviation

The number of traffic citations issued during the last five months in Beaufort
County, South Carolina, is 38, 26, 13, 41, and 22. What is the population
variance?

𝜇 = (38 + 26 + 13 + 41 + 22) / 5 = 28
𝜎² = Σ(x − 𝜇)² / N = 534 / 5 = 106.8
𝜎 = √106.8 ≈ 10.33
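
The population variance and standard deviation for the citation data can be verified numerically. The sketch below assumes the usual population definition (divide the sum of squared deviations by N); the variable names are illustrative only.

```python
import math
import statistics

# Traffic citations issued in each of the last five months (from the example)
citations = [38, 26, 13, 41, 22]

# Population mean: sum of the values divided by N
mu = sum(citations) / len(citations)

# Population variance: mean of the squared deviations from mu
pop_variance = sum((x - mu) ** 2 for x in citations) / len(citations)

# Population standard deviation: square root of the variance
pop_std = math.sqrt(pop_variance)

print(mu, pop_variance, round(pop_std, 2))

# Cross-check against the standard library's population variance
print(statistics.pvariance(citations))
```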
EXAMPLE – Sample Variance

The hourly wages for a sample of part-time employees at Home Depot are: $12,
$20, $16, $18, and $19. What is the sample variance?

x̄ = (12 + 20 + 16 + 18 + 19) / 5 = 17
s² = Σ(x − x̄)² / (n − 1) = 40 / 4 = 10
s = √10 ≈ 3.1623
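
The sample variance differs from the population variance only in the divisor, n − 1 instead of N. A minimal Python sketch of the wage example, with illustrative variable names, is:

```python
import statistics

# Hourly wages (in dollars) for the sample of part-time employees
wages = [12, 20, 16, 18, 19]

n = len(wages)
sample_mean = sum(wages) / n

# Sample variance: squared deviations divided by n - 1, not n
sample_variance = sum((x - sample_mean) ** 2 for x in wages) / (n - 1)
sample_std = sample_variance ** 0.5

print(sample_mean, sample_variance, round(sample_std, 4))

# Cross-check against the standard library's sample variance
print(statistics.variance(wages))
```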
Variance and Standard Deviation-Grouped Data

Computational Formula
Example

Find the variance and standard deviation for the following data:
Example

Thus, the standard deviation of the number of orders received at the office of
this mail-order company during the past 50 days is 2.75.
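
The frequency table and the formula for this example did not survive in this text, so the sketch below rests on two assumptions. First, it uses the common shortcut formula for grouped sample data, s² = [Σm²f − (Σmf)² / n] / (n − 1), where m is a class midpoint and f its frequency. Second, the class limits and frequencies are assumed values, chosen to total 50 days and to be consistent with the stated result of 2.75; they are not taken from the lecture.

```python
import math

# ASSUMED frequency table (not from the lecture):
# (lower class limit, upper class limit, frequency in days)
classes = [(10, 12, 4), (13, 15, 12), (16, 18, 20), (19, 21, 14)]

freqs = [f for _, _, f in classes]
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]

n = sum(freqs)                                               # total observations
sum_mf = sum(m * f for m, f in zip(midpoints, freqs))        # sum of m*f
sum_m2f = sum(m * m * f for m, f in zip(midpoints, freqs))   # sum of m^2*f

# Shortcut formula for the variance of grouped sample data
variance = (sum_m2f - sum_mf ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

print(n, round(variance, 2), round(std_dev, 2))
```

Because the midpoint stands in for every value in its class, the result is an approximation of the standard deviation of the raw data.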
Measures of Position

• In addition to measures of central tendency and measures of variation, there
are measures of position or location.
• These measures include standard scores, percentiles, and quartiles.
• They are used to locate the relative position of a data value in the data set.
• For example, if a value is located at the 80th percentile, it means that 80% of
the values fall below it in the distribution and 20% of the values fall above
it.
• The median is the value that corresponds to the 50th percentile since one-
half of the values fall below it and one-half of the values fall above it.
Standard Scores

• There is an old saying, “You can’t compare apples and oranges.”
• But with the use of statistics, it can be done to some extent.
• Suppose that a student scored 90 on a music test and 45 on an English
exam.
• Direct comparison of raw scores is impossible since the exams might not be
equivalent in terms of number of questions, value of each question, and so
on.
• However, the scores can be compared on a relative standard common to both.
• This comparison uses the mean and standard deviation and is called a
standard score or z score.
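Standard Scores

z = (value − mean) / standard deviation

For a sample, z = (X − X̄) / s; for a population, z = (X − 𝜇) / 𝜎.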
Example: Test Scores

• A student scored 65 on a calculus test that had a mean of 50 and a standard
deviation of 10; she scored 30 on a history test with a mean of 25 and a
standard deviation of 5. Compare her relative positions on the two tests.

• Since the z score for calculus is larger, her relative position in the calculus
class is higher than her relative position in the history class.
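
The comparison rests on two z scores, each computed as (value − mean) / standard deviation. A short Python sketch makes the arithmetic explicit; the function name z_score is an illustrative assumption.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

# Calculus: score 65, mean 50, standard deviation 10
z_calculus = z_score(65, 50, 10)

# History: score 30, mean 25, standard deviation 5
z_history = z_score(30, 25, 5)

print(z_calculus, z_history)  # 1.5 1.0
```

The calculus score lies 1.5 standard deviations above its class mean, while the history score lies 1.0 standard deviation above its class mean.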
Example: Test Scores

• Note that if the z score is positive, the score is above the mean. If the z
score is 0, the score is the same as the mean. And if the z score is negative,
the score is below the mean.
Percentiles

• Percentiles are position measures used in educational and health-related
fields to indicate the position of an individual in a group.

• Percentiles are used to compare an individual’s test score with the national
norm. For example, tests such as the National Educational Development Test
(NEDT) are taken by students in ninth or tenth grade. A student’s scores are
compared with those of other students locally and nationally by using
percentile ranks.
Percentiles

• Percentiles are not the same as percentages.
• That is, if a student gets 72 correct answers out of a possible 100, she
obtains a percentage score of 72.
• There is no indication of her position with respect to the rest of the class.
She could have scored the highest, the lowest, or somewhere in between.
• On the other hand, if a raw score of 72 corresponds to the 64th percentile,
then she did better than 64% of the students in her class.
Percentiles

Example
• A teacher gives a 20-point test to 10 students. The scores are shown here. Find
the percentile rank of a score of 12.
18, 15, 12, 6, 8, 2, 3, 5, 20, 10
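
The percentile formula itself is not reproduced in this text. The sketch below assumes one common textbook definition, percentile rank = (number of values below X + 0.5) / (total number of values) × 100; other conventions exist and can give slightly different answers. The function name percentile_rank is illustrative.

```python
def percentile_rank(data, x):
    """Percentile rank of x under the assumed '(values below + 0.5) / n * 100' rule."""
    below = sum(1 for value in data if value < x)
    return 100 * (below + 0.5) / len(data)

# Scores of the 10 students on the 20-point test
scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]

rank_12 = percentile_rank(scores, 12)
print(rank_12)  # 65.0
```

Six scores (2, 3, 5, 6, 8, 10) fall below 12, so under this definition a score of 12 sits at the 65th percentile.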
Percentiles

Example
• Using the test scores from the example above, find the percentile rank for a score of 6.
Quartiles

• Quartiles divide the distribution into four groups, separated by 𝑄1, 𝑄2, 𝑄3. Note
that 𝑄1 is the same as the 25𝑡ℎ percentile; 𝑄2 is the same as the 50𝑡ℎ percentile, or
the median; 𝑄3 corresponds to the 75𝑡ℎ percentile, as shown:
Quartiles

Example
• Find 𝑄1 , 𝑄2 , and 𝑄3 for the data set 15, 13, 6, 5, 12, 50, 22, 18.
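
One way to work this example is the median-of-halves method: sort the data, take the median as 𝑄2, then take the medians of the lower and upper halves as 𝑄1 and 𝑄3 (leaving the middle value out of both halves when the count is odd). The Python sketch below assumes that method; statistical software often uses interpolation rules that can give different quartiles for the same data.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method (an assumed convention)."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # For an odd count, the middle value is excluded from both halves
    upper = s[half:] if n % 2 == 0 else s[half + 1:]
    return statistics.median(lower), statistics.median(s), statistics.median(upper)

q1, q2, q3 = quartiles([15, 13, 6, 5, 12, 50, 22, 18])
print(q1, q2, q3)  # 9.0 14.0 20.0
```

The sorted data are 5, 6, 12, 13, 15, 18, 22, 50, so 𝑄2 = (13 + 15) / 2 = 14, 𝑄1 = (6 + 12) / 2 = 9, and 𝑄3 = (18 + 22) / 2 = 20.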
Quartiles

Exercise

Calculate 𝑄1, 𝑄2, and 𝑄3 for each data set:

a. 145, 119, 122, 118, 125, 116
b. 14, 16, 27, 18, 13, 19, 36, 15, 20
Outliers

• A data set should be checked for extremely high or extremely low values. These
values are called outliers.

• An outlier can strongly affect the mean and standard deviation of a variable.
For example, suppose a researcher mistakenly recorded an extremely high data
value.

• This value would then make the mean and standard deviation of the variable
much larger than they really were. Outliers can have an effect on other
statistics as well.
Outliers
• Check the following data set for outliers.
5, 6, 12, 13, 15, 18, 22, 50

Solution
The data value 50 is extremely suspect. These are the steps in checking for an
outlier.
Step 1: Find 𝑄1 and 𝑄3. This was done in the above example; 𝑄1 is 9 and 𝑄3 is
20.
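
The remaining steps are not reproduced in this text. A commonly used rule, assumed here, is to compute the interquartile range IQR = 𝑄3 − 𝑄1 and flag any value below 𝑄1 − 1.5 × IQR or above 𝑄3 + 1.5 × IQR. The Python sketch below applies that rule with median-of-halves quartiles; the function name find_outliers is illustrative.

```python
import statistics

def find_outliers(data):
    """Flag values outside Q1 - 1.5*IQR and Q3 + 1.5*IQR (assumed 1.5 x IQR rule)."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    # Median-of-halves quartiles; the middle value is skipped when n is odd
    q1 = statistics.median(s[:half])
    q3 = statistics.median(s[half:] if n % 2 == 0 else s[half + 1:])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    flagged = [x for x in s if x < low_fence or x > high_fence]
    return flagged, (low_fence, high_fence)

outliers, fences = find_outliers([5, 6, 12, 13, 15, 18, 22, 50])
print(fences)    # (-7.5, 36.5)
print(outliers)  # [50]
```

With 𝑄1 = 9 and 𝑄3 = 20, the IQR is 11 and 1.5 × IQR is 16.5, so the fences are −7.5 and 36.5. The value 50 lies above 36.5 and is flagged under this rule.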
Check if there is an outlier
• 24, 32, 54, 31, 16, 18, 19, 14, 17, 20
• 321, 343, 350, 327, 200
Exploratory Data Analysis

The Five-Number Summary and Boxplots


Exploratory Data Analysis

Example: Number of Meteorites Found


The number of meteorites found in 10 states of the United States is 89, 47, 164, 296, 30, 215,
138, 78, 48, 39. Construct a boxplot for the data.
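
A boxplot is drawn from the five-number summary: the minimum, 𝑄1, the median, 𝑄3, and the maximum. The Python sketch below computes those five values for the meteorite data, assuming the median-of-halves method for the quartiles; it does not draw the plot, and the function name five_number_summary is illustrative.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using median-of-halves quartiles."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # For an odd count, the middle value is excluded from both halves
    upper = s[half:] if n % 2 == 0 else s[half + 1:]
    return (s[0], statistics.median(lower), statistics.median(s),
            statistics.median(upper), s[-1])

meteorites = [89, 47, 164, 296, 30, 215, 138, 78, 48, 39]
summary = five_number_summary(meteorites)
print(summary)  # (30, 47, 83.5, 164, 296)
```

The box then spans 47 to 164 with a line at the median 83.5, and the whiskers extend to 30 and 296.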
Exploratory Data Analysis

Example: Sodium Content of Cheese


• A dietitian is interested in comparing the sodium content of real cheese with the sodium
content of a cheese substitute. The data for two random samples are shown. Compare the
distributions, using boxplots.
Exploratory Data Analysis

• Step 3: Draw the boxplots for each distribution on the same graph.
• Step 4: Compare the plots. It is quite apparent that the distribution for the cheese substitute
data has a higher median than the median for the distribution for the real cheese data. The
variation or spread for the distribution of the real cheese data is larger than the variation for
the distribution of the cheese substitute data.
