Lesson 4 Data Description Measures of Position-1

Uploaded by

alfredojrdavin4


FT T216/DC-EC SP4

Applied Statistics/Descriptive and Inferential Statistics

LESSON 4: Data Description: Measures of Position

Measures of Position
In addition to measures of central tendency and measures of variation, there are measures
of position or location. These measures include standard scores, percentiles, deciles, and quartiles.
They are used to locate the relative position of a data value in the data set. For example, if a value is
located at the 80th percentile, it means that 80% of the values fall below it in the distribution and
20% of the values fall above it. The median is the value that corresponds to the 50th percentile,
since one-half of the values fall below it and one half of the values fall above it.

Standard Scores
There is an old saying, “You can’t compare apples and oranges.” But with the use of
statistics, it can be done to some extent. Suppose that a student scored 90 on a music test and 45 on
an English exam. Direct comparison of raw scores is impossible, since the exams might not be
equivalent in terms of number of questions, value of each question, and so on. However, the two
scores can be compared relative to their own distributions. This comparison uses the mean and
standard deviation of each exam, and the resulting value is called a standard score or z score.
A standard score or z score tells how many standard deviations a data value is above or
below the mean for a specific distribution of values. If a standard score is zero, then the data value is
the same as the mean.

Example:
Test Scores
A student scored 65 on a calculus test that had a mean of 50 and a standard deviation of 10; she
scored 30 on a history test with a mean of 25 and a standard deviation of 5. Compare her relative
positions on the two tests.

Solution
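First, find the z scores by dividing the difference between the value and the mean by the standard
deviation. For calculus the z score is
$$z = \frac{X - \bar{X}}{s} = \frac{65 - 50}{10} = 1.5$$
For history the z score is
$$z = \frac{X - \bar{X}}{s} = \frac{30 - 25}{5} = 1.0$$
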
Since the z score for calculus is larger, her relative position in the calculus class is higher than her
relative position in the history class.

Note that if the z score is positive, the score is above the mean. If the z score is 0, the score is the
same as the mean. And if the z score is negative, the score is below the mean.
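
The comparison in the test-score example can be checked with a few lines of Python. This is a minimal sketch, not part of the lesson itself; the function name `z_score` and its parameter names are chosen here for illustration.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


# Values taken from the calculus/history example in the text.
z_calculus = z_score(65, mean=50, std_dev=10)
z_history = z_score(30, mean=25, std_dev=5)

print(f"calculus z = {z_calculus:.2f}")  # 1.50
print(f"history  z = {z_history:.2f}")   # 1.00
print("higher relative position:", "calculus" if z_calculus > z_history else "history")
```

A positive result means the score is above the mean, zero means it equals the mean, and a negative result means it is below the mean.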

Example:
Test Scores
Find the z score for each test, and state which is higher.

When all data for a variable are transformed into z scores, the resulting distribution will have
a mean of 0 and a standard deviation of 1. A z score, then, is actually the number of standard
deviations each value is from the mean for a specific distribution.
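
The claim that a fully standardized variable has mean 0 and standard deviation 1 can be verified numerically. The sketch below uses a small made-up list of values (an assumption, not data from the lesson) and the population standard deviation from Python's `statistics` module; the same result holds with the sample standard deviation as long as one convention is used throughout.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert every value to its z score using the population standard deviation."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("all values are identical; z scores are undefined")
    return [(x - m) / s for x in values]


# Hypothetical data, used only to illustrate the property.
raw = [4, 8, 15, 16, 23, 42]
z_values = standardize(raw)

print(round(mean(z_values), 10))    # mean of the z scores, approximately 0
print(round(pstdev(z_values), 10))  # standard deviation of the z scores, approximately 1
```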

Percentiles
Percentiles are position measures used in educational and health-related fields to indicate
the position of an individual in a group.
Percentiles divide the data set into 100 equal groups.
In many situations, the graphs and tables showing the percentiles for various measures such
as test scores, heights, or weights have already been completed. Percentiles are also used to
compare an individual’s test score with the national norm.
Percentiles are not the same as percentages. That is, if a student gets 72 correct answers
out of a possible 100, she obtains a percentage score of 72. There is no indication of her position
with respect to the rest of the class. She could have scored the highest, the lowest, or somewhere in
between. On the other hand, if a raw score of 72 corresponds to the 64th percentile, then she did
better than 64% of the students in her class.
Percentiles are symbolized by P1, P2, P3, . . . , P99 and divide the distribution into 100 groups.

Percentile graphs can be constructed as shown below. Percentile graphs use the same values
as the cumulative relative frequency graphs, except that the proportions have been converted to
percents.
Figure 1. Weights of Girls by Age and Percentile Rankings.
Source: Distributed by Mead Johnson Nutritional Division. Reprinted with permission.

Example:
Systolic Blood Pressure
The frequency distribution for the systolic blood pressure readings (in millimeters of mercury, mm
Hg) of 200 randomly selected college students is shown here. Construct a percentile graph.

Solution
Step 1 Find the cumulative frequencies and place them in column C.
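Step 2 Find the cumulative percentages and place them in column D. To do this step, use the
formula
$$\text{cumulative \%} = \frac{\text{cumulative frequency}}{n} \cdot 100$$
where n is the total number of values (here, n = 200).
Step 3 Graph the data, using class boundaries for the x axis and the percentages for the y axis.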
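
Steps 1 and 2 amount to a running total divided by the number of values. The sketch below illustrates this with hypothetical class frequencies (an assumption; they are not the blood pressure data of the example), and the function name `cumulative_percentages` is chosen here for illustration.

```python
from itertools import accumulate


def cumulative_percentages(frequencies):
    """Return the cumulative percentage at the upper boundary of each class."""
    total = sum(frequencies)
    if total == 0:
        raise ValueError("frequencies must not sum to zero")
    # accumulate() yields the cumulative frequencies (column C in the text);
    # dividing by the total and multiplying by 100 gives column D.
    return [100 * cf / total for cf in accumulate(frequencies)]


# Hypothetical frequencies for four classes, for illustration only.
freqs = [5, 15, 20, 10]
cum = cumulative_percentages(freqs)
print(cum)  # [10.0, 40.0, 80.0, 100.0]
```

Plotting these percentages against the upper class boundaries gives the percentile graph described in Step 3.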

Once a percentile graph has been constructed, one can find the approximate corresponding
percentile ranks for given blood pressure values and find approximate blood pressure values for
given percentile ranks. For example, to find the percentile rank of a blood pressure reading of 130,
find 130 on the x axis of the figure above, and draw a vertical line to the graph. Then move
horizontally to the value on the y axis. Note that a blood pressure of 130 corresponds to
approximately the 70th percentile. If the value that corresponds to the 40th percentile is desired,
start on the y axis at 40 and draw a horizontal line to the graph. Then draw a vertical line to the x
axis and read the value.
The 40th percentile corresponds to a value of approximately 118. Thus, if a person has a
blood pressure of 118, he or she is at the 40th percentile. Finding values and the corresponding
percentile ranks by using a graph yields only approximate answers. Several mathematical methods
exist for computing percentiles for data. These methods can be used to find the approximate
percentile rank of a data value or to find a data value corresponding to a given percentile. When the
data set is large (100 or more), these methods yield better results.

Example:
Test Scores
A teacher gives a 20-point test to 10 students. The scores are shown here. Find the percentile rank of
a score of 12.
18, 15, 12, 6, 8, 2, 3, 5, 20, 10

Solution
Arrange the data in order from lowest to highest.
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
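Then substitute into the formula
$$\text{Percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \cdot 100$$
Since there are six values below a score of 12, the solution is
$$\text{Percentile} = \frac{6 + 0.5}{10} \cdot 100 = 65\text{th percentile}$$
Thus, a student whose score was 12 did better than 65% of the class.
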
Note: One assumes that a score of 12 in the example, for instance, means theoretically any value
between 11.5 and 12.5.

Example:
Test Scores
Using the data in the previous example, find the percentile rank for a score of 6.
Solution
There are three values below 6. Thus
$$\text{Percentile} = \frac{3 + 0.5}{10} \cdot 100 = 35\text{th percentile}$$

A student who scored 6 did better than 35% of the class.
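
The percentile-rank calculation can be sketched in Python, assuming the usual textbook formula (number of values below X, plus 0.5, divided by the total number of values, times 100), which is consistent with the 35% result above. The function name `percentile_rank` is chosen here for illustration.

```python
def percentile_rank(data, x):
    """Percentile rank of x: (count of values below x + 0.5) / n * 100."""
    if not data:
        raise ValueError("data must not be empty")
    below = sum(1 for v in data if v < x)
    return 100 * (below + 0.5) / len(data)


# Test scores from the example in the text.
scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]

print(percentile_rank(scores, 12))  # 65.0
print(percentile_rank(scores, 6))   # 35.0
```

The data do not need to be sorted first, because the function only counts how many values fall below the score of interest.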

The next examples show a procedure for finding a value corresponding to a given percentile.
Example:
Test Scores
Using the scores in previous example, find the value corresponding to the 25th percentile.

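Solution:
Step 1 Arrange the data in order from lowest to highest.
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
Step 2 Compute c = (n · p)/100, where n is the total number of values and p is the percentile.
Here c = (10 · 25)/100 = 2.5.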
Step 3 If c is not a whole number, round it up to the next whole number; in this case, c = 3. Start at
the lowest value and count over to the third value, which is 5. Hence, the value 5 corresponds to the
25th percentile.

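If c is a whole number:
Example:
Using the data set in the previous example, find the value that corresponds to the 60th percentile.
Solution
Step 1 Arrange the data in order from smallest to largest.
2, 3, 5, 6, 8, 10, 12, 15, 18, 20
Step 2 Compute c = (n · p)/100 = (10 · 60)/100 = 6.
Step 3 If c is a whole number, use the value halfway between the cth and (c + 1)st values when
counting up from the lowest value. Here the 6th and 7th values are 10 and 12, so the value
(10 + 12)/2 = 11 corresponds to the 60th percentile.
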
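
The procedure for finding the value at a given percentile can be sketched as follows. The sketch assumes c = n · p/100, rounding up when c is not a whole number, and, as is usual in textbook treatments, taking the value halfway between the cth and (c + 1)st values when c is a whole number. The function name `value_at_percentile` is chosen here for illustration, and p is restricted to whole-number percentiles from 1 to 99.

```python
def value_at_percentile(data, p):
    """Return the data value corresponding to the p-th percentile (p an integer, 1..99)."""
    if not 1 <= p <= 99:
        raise ValueError("p must be an integer between 1 and 99")
    ordered = sorted(data)
    n = len(ordered)
    # c = n * p / 100, split into whole part and remainder to avoid float rounding.
    whole, remainder = divmod(n * p, 100)
    if remainder == 0:
        # c is a whole number: average the c-th and (c + 1)-st values (1-based).
        return (ordered[whole - 1] + ordered[whole]) / 2
    # c is not a whole number: round up and take that value (1-based position whole + 1).
    return ordered[whole]


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
print(value_at_percentile(scores, 25))  # 5
print(value_at_percentile(scores, 60))  # 11.0
```

Statistical software often uses interpolation rules that differ from this one, so results for small data sets may not match exactly.
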
Quartiles and Deciles
Quartiles divide the distribution into four groups, separated by Q1, Q2, Q3. Note that Q1 is
the same as the 25th percentile; Q2 is the same as the 50th percentile, or the median; Q3
corresponds to the 75th percentile, as shown:

Example:
Find Q1, Q2, and Q3 for the data set 15, 13, 6, 5, 12, 50, 22, 18.
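
One way to work this example is the median-of-halves convention: Q2 is the median of the whole data set, Q1 is the median of the values below Q2, and Q3 is the median of the values above Q2. The sketch below assumes that convention (other conventions exist and can give slightly different quartiles); the function name `quartiles` is chosen here for illustration.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    # For an odd number of values, the median itself is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


data = [15, 13, 6, 5, 12, 50, 22, 18]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 9.0 14.0 20.0
```

Under this convention, Q1 = 9, Q2 = 14, and Q3 = 20 for the data set above.
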
In addition to dividing the data set into four groups, quartiles can be used as a rough
measurement of variability. The interquartile range (IQR) is defined as the difference Q3 – Q1
and is the range of the middle 50% of the data. The interquartile range is used to identify
outliers, and it is also used as a measure of variability in exploratory data analysis.
Deciles divide the distribution into 10 groups, as shown. They are denoted by D1, D2, etc.

Note that D1 corresponds to P10; D2 corresponds to P20; etc. Deciles can be found by using
the formulas given for percentiles. Taken altogether then, these are the relationships among
percentiles, deciles, and quartiles.
Deciles are denoted by D1, D2, D3, . . . , D9, and they correspond to P10, P20, P30, . . . , P90.
Quartiles are denoted by Q1, Q2, Q3 and they correspond to P25, P50, P75.
The median is the same as P50 or Q2 or D5.

Outliers
A data set should be checked for extremely high or extremely low values. These values are
called outliers. An outlier is an extremely high or an extremely low data value when compared with
the rest of the data values.
An outlier can strongly affect the mean and standard deviation of a variable. For example,
suppose a researcher mistakenly recorded an extremely high data value. This value would then make
the mean and standard deviation of the variable much larger than they really were. Outliers can
have an effect on other statistics as well.

Example:
Check the following data set for outliers.
5, 6, 12, 13, 15, 18, 22, 50
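
A common way to check a data set like this one is the 1.5(IQR) rule: any value below Q1 – 1.5(IQR) or above Q3 + 1.5(IQR) is flagged as an outlier. The sketch below assumes that rule together with median-of-halves quartiles; the function name `find_outliers` and the parameter `k` are chosen here for illustration.

```python
from statistics import median


def find_outliers(data, k=1.5):
    """Return (outliers, (lower_fence, upper_fence)) using the k * IQR rule."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half:] if n % 2 == 0 else ordered[half + 1:])
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
    return outliers, (lower_fence, upper_fence)


data = [5, 6, 12, 13, 15, 18, 22, 50]
outliers, fences = find_outliers(data)
print(fences)    # (-7.5, 36.5)
print(outliers)  # [50]
```

Here Q1 = 9, Q3 = 20, and IQR = 11, so the fences are –7.5 and 36.5, and the value 50 falls outside them.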

Reasons Why Outliers May Occur

1. The data value may have resulted from a measurement or observational error. Perhaps the
researcher measured the variable incorrectly.
2. The data value may have resulted from a recording error. That is, it may have been written
or typed incorrectly.
3. The data value may have been obtained from a subject that is not in the defined population.
For example, suppose test scores were obtained from a seventh-grade class, but a student in
that class was actually in the sixth grade and had special permission to attend the class. This
student might have scored extremely low on that particular exam on that day.
4. The data value might be a legitimate value that occurred by chance (although the probability
is extremely small).

There are no hard-and-fast rules on what to do with outliers, nor is there complete agreement
among statisticians on ways to identify them. Obviously, if they occurred as a result of an error, an
attempt should be made to correct the error or else the data value should be omitted entirely.
When they occur naturally by chance, the statistician must make a decision about whether to
include them in the data set. When a distribution is normal or bell-shaped, data values that are
beyond 3 standard deviations of the mean can be considered suspected outliers.

Exploratory Data Analysis

In traditional statistics, data are organized by using a frequency distribution. From this
distribution various graphs such as the histogram, frequency polygon, and ogive can be constructed
to determine the shape or nature of the distribution. In addition, various statistics such as the mean
and standard deviation can be computed to summarize the data.
The purpose of traditional analysis is to confirm various conjectures about the nature of the
data. For example, from a carefully designed study, a researcher might want to know if the
proportion of Americans who are exercising today has increased from 10 years ago. This study would
contain various assumptions about the population, various definitions such as of exercise, and so on.
In exploratory data analysis (EDA), data can be organized using a stem and leaf plot. The
measure of central tendency used in EDA is the median. The measure of variation used in EDA is the
interquartile range Q3 – Q1. In EDA the data are represented graphically using a boxplot
(sometimes called a box-and-whisker plot). The purpose of exploratory data analysis is to examine
data to find out what information can be discovered about the data such as the center and the
spread. Exploratory data analysis was developed by John Tukey and presented in his book
Exploratory Data Analysis (Addison-Wesley, 1977).

The Five-Number Summary and Boxplots

A boxplot can be used to graphically represent the data set. These plots involve five specific values:
1. The lowest value of the data set (i.e., minimum)
2. Q1
3. The median
4. Q3
5. The highest value of the data set (i.e., maximum)
These values are called a five-number summary of the data set.

Procedure for constructing a boxplot

1. Find the five-number summary for the data values, that is, the maximum and minimum data
values, Q1 and Q3, and the median.
2. Draw a horizontal axis with a scale such that it includes the maximum and minimum data values.
3. Draw a box whose vertical sides go through Q1 and Q3, and draw a vertical line through the
median.
4. Draw a line from the minimum data value to the left side of the box and a line from the maximum
data value to the right side of the box.

Example:
Number of Meteorites Found
The number of meteorites found in 10 states of the United States is 89, 47, 164, 296, 30, 215, 138,
78, 48, 39. Construct a boxplot for the data.
Source: Natural History Museum.
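Solution:
Step 1 Arrange the data in order from lowest to highest.
30, 39, 47, 48, 78, 89, 138, 164, 215, 296
Step 2 Find the median. Median = (78 + 89)/2 = 83.5
Step 3 Find Q1, the median of the values below the median (30, 39, 47, 48, 78). Q1 = 47
Step 4 Find Q3, the median of the values above the median (89, 138, 164, 215, 296). Q3 = 164
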
Step 5 Draw a scale for the data on the x axis.

Step 6 Locate the lowest value, Q1, median, Q3, and the highest value on the scale.
Step 7 Draw a box around Q1 and Q3, draw a vertical line through the median, and connect the
upper value and the lower value to the box.

The distribution is somewhat positively skewed.
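
The five values needed for the meteorite boxplot can be computed with a short sketch. It assumes median-of-halves quartiles, and the function name `five_number_summary` is chosen here for illustration.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using median-of-halves quartiles."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half:] if n % 2 == 0 else ordered[half + 1:])
    return ordered[0], q1, median(ordered), q3, ordered[-1]


# Number of meteorites found in 10 states (data from the example).
meteorites = [89, 47, 164, 296, 30, 215, 138, 78, 48, 39]
summary = five_number_summary(meteorites)
print(summary)  # (30, 47, 83.5, 164, 296)
```

The median (83.5) lies closer to Q1 (47) than to Q3 (164), and the distance from Q3 to the maximum is much larger than the distance from the minimum to Q1, which is consistent with a positively skewed distribution.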

If the boxplots for two or more data sets are graphed on the same axis, the distributions can
be compared. To compare the averages, use the location of the medians. To compare the variability,
use the interquartile range, i.e., the length of the boxes.

Example:
Sodium Content of Cheese
A dietitian is interested in comparing the sodium content of real cheese with the sodium content of
a cheese substitute. The data for two random samples are shown. Compare the distributions, using
boxplots.
Solution:

Step 4 Compare the plots. It is quite apparent that the distribution for the cheese substitute data has
a higher median than the median for the distribution for the real cheese data. The variation or
spread for the distribution of the real cheese data is larger than the variation for the distribution of
the cheese substitute data.

A modified boxplot can be drawn and used to check for outliers. In exploratory data
analysis, hinges are used instead of quartiles to construct boxplots. When the data set consists of an
even number of values, hinges are the same as quartiles. Hinges for a data set with an odd number
of values differ somewhat from quartiles. However, most calculators and computer programs use
quartiles.
Another important point to remember is that the summary statistics (median and
interquartile range) used in exploratory data analysis are said to be resistant statistics. A resistant
statistic is relatively less affected by outliers than a nonresistant statistic. The mean and standard
deviation are nonresistant statistics. Sometimes when a distribution is skewed or contains outliers,
the median and interquartile range may more accurately summarize the data than the mean and
standard deviation, since the mean and standard deviation are more affected in this case.
A modified boxplot can be drawn by placing a box around Q1 and Q3 and then extending the
whiskers to the smallest and largest values that lie within 1.5 times the interquartile range (that is,
Q3 – Q1) of the box, i.e., no lower than Q1 – 1.5(IQR) and no higher than Q3 + 1.5(IQR).
Mild outliers are values that lie between 1.5(IQR) and 3(IQR) below Q1 or above Q3.
Extreme outliers are data values that lie more than 3(IQR) below Q1 or above Q3.
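
The mild/extreme distinction can be sketched as follows, measuring each value's distance outside the box (below Q1 or above Q3) and comparing it with 1.5(IQR) and 3(IQR). The sketch assumes median-of-halves quartiles; the function name `classify_outliers` is chosen here for illustration, and the data are the outlier-check data set from earlier in the lesson.

```python
from statistics import median


def classify_outliers(data):
    """Return (mild, extreme) lists based on distance beyond Q1/Q3 in units of IQR."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half:] if n % 2 == 0 else ordered[half + 1:])
    iqr = q3 - q1
    mild, extreme = [], []
    for x in ordered:
        # Positive only when x lies outside the box; the larger term is the relevant one.
        distance = max(q1 - x, x - q3)
        if distance > 3 * iqr:
            extreme.append(x)
        elif distance > 1.5 * iqr:
            mild.append(x)
    return mild, extreme


data = [5, 6, 12, 13, 15, 18, 22, 50]
mild, extreme = classify_outliers(data)
print(mild)     # [50]
print(extreme)  # []
```

With Q1 = 9, Q3 = 20, and IQR = 11, the value 50 lies 30 units above Q3, which is more than 1.5(IQR) = 16.5 but less than 3(IQR) = 33, so it is a mild outlier.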

Example:
Unhealthful Smog Days
For the data shown here, draw a modified boxplot and identify any mild or extreme outliers. The
data represent the number of unhealthful smog days for a specific year for the highest 10 locations.

Solution:
Learning Task/Activity:

Name: ___________________ Date: _

Course & Year: ________________________ Instructor: Dr. ANJIN PLEIADESS P. CABRERA

Exercise 4b
Data Description
C. Measures of Position
1. Miles per Hour
Using the data below, find the approximate percentile ranks of the following miles per hour (mph).
a. 380 mph
b. 425 mph
c. 455 mph
d. 505 mph
e. 525 mph

2. Test Scores
Find the percentile rank for each test score in the data set.
12, 28, 35, 42, 47, 49, 50

3. Another measure of average is called the midquartile; it is the numerical value halfway between
Q1 and Q3, and the formula is
$$\text{Midquartile} = \frac{Q_1 + Q_3}{2}$$

Using this formula and other formulas, find Q1, Q2, Q3, the midquartile, and the interquartile range
for each data set
a. 5, 12, 16, 25, 32, 38 (Answers: Q1 = 12; Q2 = 20.5; Q3 = 32; midquartile = 22; interquartile range = 20)
b. 53, 62, 78, 94, 96, 99, 103

4. Check each data set for outliers.

a. 16, 18, 22, 19, 3, 21, 17, 20
b. 24, 32, 54, 31, 16, 18, 19, 14, 17, 20
c. 321, 343, 350, 327, 200
d. 88, 72, 97, 84, 86, 85, 100
e. 145, 119, 122, 118, 125, 116
f. 14, 16, 27, 18, 13, 19, 36, 15, 20

5. Driver’s License Exam Scores

The average score on a state CDL license exam is 76 with a standard deviation of 5. Find the
corresponding z score for each raw score.
a. 79
b. 70
c. 88
d. 65
e. 77
D. Exploratory Data Analysis

A. Identify the five-number summary and find the interquartile range.

1. 8, 12, 32, 6, 27, 19, 54
2. 19, 16, 48, 22, 7
3. 362, 589, 437, 316, 192, 188

B. Use each boxplot to identify the maximum value, minimum value, median, first quartile, third
quartile, and interquartile range.
