dddddd2

The document discusses measures of central tendency, including mode, median, and mean, which help describe population attributes. It also covers measures of variability such as range, variance, and standard deviation, which quantify data spread. Additionally, it explains measures of position like percentiles, deciles, and quartiles, which indicate the relative position of data values within a dataset.

Uploaded by

planerpop
Copyright
© All Rights Reserved

7/19/2024

Computing and interpreting measures of central tendency

Measures of Central Tendency

Researchers are often interested in defining a value that best describes some attribute of the population. Often this attribute is a measure of central tendency or a proportion. The three most commonly used measures of central tendency are the mode, the median and the mean.

Mode
• The mode is the most frequently appearing value in the population or sample.
• It is the value with the highest frequency.
• Example: consider five women having the following weights: 100 kg, 100 kg, 130 kg, 140 kg, and 150 kg.
• The value with the highest frequency is 100 kg.
• Thus the mode would equal 100 kg.

Median
• The median is the midpoint of the data, since it splits the data into two piles each containing half the values.
• To find the median, we arrange the observations in order from smallest to largest value.
• If there is an odd number of observations, the median is the middle value.
• If there is an even number of observations, the median is the average of the two middle values.
• Thus, in the sample of five women, the median value would be 130 kg, since 130 kg is the middle weight.
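The mode and median of the five weights can be checked with a short sketch. It uses Python's standard `statistics` module; the four-value list used for the even case is a hypothetical sample added only to illustrate the averaging rule.

```python
import statistics

# Weights (kg) of the five women from the example above.
weights = [100, 100, 130, 140, 150]

mode_weight = statistics.mode(weights)      # most frequently appearing value
median_weight = statistics.median(weights)  # middle value of the sorted data

# Hypothetical even-sized sample: the median is the average
# of the two middle values, (100 + 130) / 2.
even_median = statistics.median([100, 100, 130, 140])

print(mode_weight, median_weight, even_median)  # 100 130 115.0
```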

Mean
• The sample mean is perhaps the most important of the three measures.
• It represents the balance point (or centre of gravity) of a distribution.
• The mean of a sample or a population is computed by adding all of the observations and dividing by the number of observations.
• Returning to the example of the five women, the mean weight would equal (100 + 100 + 130 + 140 + 150)/5 = 620/5 = 124 kg.

Measures of Variability
• Measures of dispersion measure how spread out a set of data is.
• They are important for describing the spread of the data, or its amount of variation around a central value.
• For example, consider a population of four values {5, 5, 5, 5}. Here, all of the values are equal, so there is no variation. The set {3, 5, 5, 7}, on the other hand, has some variation since some of the values are different.
• The three measures that are used to quantify the amount of variation in a set of values are the range, the variance, and the standard deviation.
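As a minimal sketch, the mean of the five weights can be computed directly from its definition. The same calculation also shows why a measure of variability is needed: the sets {5, 5, 5, 5} and {3, 5, 5, 7} share the same mean even though only one of them varies. The variable names are illustrative.

```python
# Weights (kg) of the five women from the example above.
weights = [100, 100, 130, 140, 150]
mean_weight = sum(weights) / len(weights)  # 620 / 5
print(mean_weight)  # 124.0

# The two four-value sets from the variability example.
constant = [5, 5, 5, 5]
varied = [3, 5, 5, 7]

mean_constant = sum(constant) / len(constant)
mean_varied = sum(varied) / len(varied)

# Both means are 5.0, so the mean alone cannot tell the sets apart.
print(mean_constant, mean_varied)  # 5.0 5.0

# A set has variation when it contains more than one distinct value.
print(len(set(constant)) > 1, len(set(varied)) > 1)  # False True
```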

The range
• The range is the simplest measure of variation.
• It is defined as the difference between the largest and smallest sample values.
• Range = Maximum value - Minimum value
• Therefore, the range of the four values {3, 5, 5, 7} would be 7 - 3, or 4.
• As demonstrated, the range depends only on the extreme values and provides no information about how the remaining data are distributed.

Standard deviation and variance
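The range calculation can be sketched as follows. The helper name `value_range` and the second data set {3, 3, 3, 7} are assumptions added for illustration; the second set shows that two differently distributed sets can share the same range.

```python
def value_range(values):
    """Return the range: maximum value minus minimum value."""
    return max(values) - min(values)


# The set from the example above: 7 - 3.
print(value_range([3, 5, 5, 7]))  # 4

# Hypothetical set with the same extremes but different middle values:
# the range is unchanged because it looks only at the extremes.
print(value_range([3, 3, 3, 7]))  # 4
```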

Computing standard deviation and variance 1
• First find the mean and the deviations about the mean.
• Add up these deviations and find out how far on average the scores deviate from the mean.
• Whenever the deviations are added (in order to find the average of the deviations) they will always sum to zero (refer to the table).
• To avoid this, the squared deviations are added to get a measure of overall variability in the distribution.

$X$          $X - \mu$               $(X - \mu)^2$
1            1 - 3 = -2              4
2            2 - 3 = -1              1
3            3 - 3 = 0               0
4            4 - 3 = 1               1
5            5 - 3 = 2               4
$\mu = 3$    $\sum (X - \mu) = 0$    $\sum (X - \mu)^2 = 10$

Computing standard deviation and variance 2
• The sum of the squared deviations is called the sum of squares.
• Divide the sum of squares by the number of observations to get the average squared deviation, or variance. In this example it is 10/5 = 2.
• Take the square root of the variance in order to get the standard deviation.
• The standard deviation measures the typical deviation about the mean. For our example we take the square root of 2 and find that the standard deviation is 1.41.
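The steps in the table can be traced with a short sketch. It follows the population formula used in the example (dividing by the number of observations); the variable names are illustrative.

```python
import math

# The five scores from the table above.
scores = [1, 2, 3, 4, 5]

mu = sum(scores) / len(scores)             # mean: 15 / 5 = 3.0
deviations = [x - mu for x in scores]      # -2, -1, 0, 1, 2
sum_of_deviations = sum(deviations)        # always zero
sum_of_squares = sum(d ** 2 for d in deviations)  # 4 + 1 + 0 + 1 + 4

variance = sum_of_squares / len(scores)    # 10 / 5
std_dev = math.sqrt(variance)              # square root of the variance

print(sum_of_deviations, sum_of_squares, variance)  # 0.0 10.0 2.0
print(round(std_dev, 2))  # 1.41
```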

  2
Population variance and standard deviation computational formulae
• $\sigma^2 = \sum (X_i - \mu)^2 / N$
• Population standard deviation is given by $\sigma = \sqrt{\sigma^2}$
► where $\sigma^2$ is the population variance,
► $\mu$ is the population mean,
► $X_i$ is the ith element from the population,
► and $N$ is the number of elements in the population.
• Thus by definition, the variance of a random variable is the average squared deviation from the population mean.

Sample variance and standard deviation computational formulae
• The variance of a sample is defined by a slightly different formula: the numerator is divided by n – 1 instead of N.
• $s^2 = \sum (x_i - \bar{x})^2 / (n - 1)$
► where $s^2$ is the sample variance,
► $\bar{x}$ is the sample mean,
► $x_i$ is the ith element from the sample,
► and $n$ is the number of elements in the sample.
• Sample standard deviation is given by $s = \sqrt{s^2}$
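The difference between the two formulae can be seen by applying both to the same data. This sketch uses Python's standard `statistics` module, where `pvariance` and `pstdev` divide by N and `variance` and `stdev` divide by n - 1; reusing the scores 1 to 5 from the earlier table is a choice made here for illustration.

```python
import statistics

scores = [1, 2, 3, 4, 5]  # sum of squared deviations is 10

# Population formulae: divide by N = 5.
pop_variance = statistics.pvariance(scores)   # 10 / 5
pop_std_dev = statistics.pstdev(scores)       # square root of 10 / 5

# Sample formulae: divide by n - 1 = 4.
sample_variance = statistics.variance(scores)  # 10 / 4
sample_std_dev = statistics.stdev(scores)      # square root of 10 / 4

print(pop_variance, sample_variance)
print(round(pop_std_dev, 2), round(sample_std_dev, 2))
```

Dividing by n - 1 always gives a slightly larger value than dividing by N, and the gap shrinks as the number of observations grows.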

Measures of position 1
• Measures of position tell where a specific data value falls within the data set or its relative position in comparison with other data values.
• The most common measures of position are percentiles, deciles, and quartiles.
• Quartiles, deciles and percentiles are just a generalization of the median.
• The median is the middle value of the data set (assumed to be arranged in ascending order).

Measures of position 2
• In a similar way we define the quartiles as the quarter values of the data set, deciles as the one-tenth values of the data set, and so on.
• Quartiles, deciles and percentiles (unlike the median, which acts as a measure of central tendency) give us an idea about the skewness of the data set.

Measures of position 3
• Standard Scores
– A standard score or z score is used when direct comparison of raw scores is impossible.
– A standard score or z score for a value is obtained by subtracting the mean from the value and dividing the result by the standard deviation.
• Percentiles
– Percentiles are position measures used in educational and health-related fields to indicate the position of an individual in a group.
– A percentile, P, is an integer between 1 and 99 such that the Pth percentile is a value where P % of the data values are less than or equal to the value and (100 – P) % of the data values are greater than or equal to the value.

Quartiles
• Data (or the distribution) can be divided into FOUR parts and the cut points are called QUARTILES, denoted by Q1, Q2, Q3.
► Q1 is the same as the 25th percentile;
► Q2 is the same as the 50th percentile or the median;
► Q3 corresponds to the 75th percentile.

For example, consider the following 15 numbers:

3 6 7 11 13 22 30 40 44 50 52 61 68 80 94

Q1, Q2 and Q3 mark the 4th, 8th and 12th values: 11, 40 and 61.
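The standard score rule described above (subtract the mean, then divide by the standard deviation) can be sketched as follows. The helper name `z_score` is hypothetical, and using the scores 1 to 5 with their population standard deviation is an assumption made for illustration; the text does not say whether the population or the sample standard deviation is intended.

```python
import statistics


def z_score(value, mean, std_dev):
    """Standard score: distance from the mean in standard deviations."""
    return (value - mean) / std_dev


scores = [1, 2, 3, 4, 5]
mu = statistics.mean(scores)       # 3
sigma = statistics.pstdev(scores)  # population SD, about 1.41

z_high = z_score(5, mu, sigma)  # two units above the mean
z_mid = z_score(3, mu, sigma)   # exactly at the mean

print(round(z_high, 2), z_mid)  # 1.41 0.0
```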

Quartiles
• The first quartile is Q1 = 11. The second quartile is Q2 = 40 (this is also the median). The third quartile is Q3 = 61.
• (To within 1 datum) One quarter of the data are below Q1, two quarters below Q2, three quarters below Q3.
• About one quarter of the data is between Q1 and Q2, etc.

Deciles
• In a data set, deciles are the 9 values that divide the sorted data into 10 equal parts/groups.
• Accordingly they are called the 1st, 2nd, ..., 9th deciles (also denoted as D1, D2, ..., D9). If X1, X2, X3, ..., Xn are the observed values and are assumed to be arranged in ascending (or descending) order, then the corresponding definitions are as follows:
D1 = 10th percentile = X(10)
D2 = 20th percentile = X(20)
D3 = 30th percentile = X(30)
D4 = 40th percentile = X(40)
D5 = 50th percentile = X(50)
D6 = 60th percentile = X(60)
D7 = 70th percentile = X(70)
D8 = 80th percentile = X(80)
D9 = 90th percentile = X(90)
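The quartiles quoted above for the 15 numbers can be reproduced with `statistics.quantiles` from Python's standard library (Python 3.8 or later). This is a sketch under one assumption: several quartile conventions exist and can give different answers, and the function's default "exclusive" method happens to agree with the values in this example. The same call with `n=10` returns the nine decile cut points.

```python
import statistics

data = [3, 6, 7, 11, 13, 22, 30, 40, 44, 50, 52, 61, 68, 80, 94]

# n=4 splits the sorted data into four parts and returns three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)
print(q1, q2, q3)  # 11.0 40.0 61.0

# Q2 is the 50th percentile, i.e. the median.
print(q2 == statistics.median(data))  # True

# n=10 returns the nine deciles D1 ... D9 for the same data.
deciles = statistics.quantiles(data, n=10)
print(len(deciles))  # 9
```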

Deciles
• Let us consider the following example.
• Example 1: Data Set: 1, 2, 4, 6, 7, 9, 10, 12, 14
• Deciles:
D1 = 1, D2 = 2, D3 = 4,
D4 = 6, D5 = 7, D6 = 9,
D7 = 10, D8 = 12, D9 = 14

• Outliers
– An outlier is an extremely high or an extremely low data value when compared with the rest of the data values.
– Outliers can be the result of measurement or observational error.
– When a distribution is normal or bell-shaped, data values that are beyond three standard deviations of the mean can be considered suspected outliers.
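The three-standard-deviation rule for suspected outliers can be sketched as follows. The function name `suspected_outliers`, the data set, and the use of the population standard deviation are all assumptions made for illustration; the rule itself is only a rough screen and presumes a roughly bell-shaped distribution.

```python
import statistics


def suspected_outliers(values, threshold=3.0):
    """Return values lying more than `threshold` SDs from the mean."""
    mean = statistics.mean(values)
    std_dev = statistics.pstdev(values)  # population SD (an assumption)
    if std_dev == 0:
        return []  # no variation, so nothing can be an outlier
    return [x for x in values if abs(x - mean) > threshold * std_dev]


# Hypothetical data: twenty values near 50 plus one extreme value.
data = [48, 49, 50, 51, 52] * 4 + [120]

print(suspected_outliers(data))  # [120]
```

With very small samples no value can lie three standard deviations from the mean, so the rule only becomes informative once there are enough observations.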
