
Module 2

Uploaded by

Ananya Sharma
Copyright
© All Rights Reserved

MODULE II

MEASURES OF CENTRAL TENDENCY:


Central tendency is a statistical measure to determine a single score that defines the center of a
distribution. The goal of central tendency is to find the single score that is most typical or most
representative of the entire group.

MEAN
The mean for a distribution is the sum of the scores divided by the number of scores. It is one of the
most used measures of central tendency and is often referred to as average.

Properties-

 The mean is responsive to the exact position of each score in the distribution.
 The mean is the balance point of a distribution, to use a mechanical analogy.
 The sum of the negative deviations from the mean exactly equals, in absolute value, the sum of
the positive deviations.
 When a measure of central tendency should reflect the total of the scores, the mean is the best
choice because it is the only measure based on this quantity.
 Changing a score: changing the value of any score will change the mean.
 Introducing a new score or removing a score: adding a new score to a distribution, or removing
an existing score, will usually change the mean. The exception is when the new score (or the
removed score) is exactly equal to the mean. It is easy to visualize the effect of adding or
removing a score if you remember that the mean is defined as the balance point for the
distribution.
 Adding or subtracting a constant from each score: if a constant value is added to every score in
a distribution, the same constant will be added to the mean. Similarly, if you subtract a constant
from every score, the same constant will be subtracted from the mean.
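
The properties listed above can be checked numerically. The following is a minimal sketch in Python using the standard `statistics` module; the scores are made up for illustration and are not from these notes.

```python
from statistics import mean

scores = [3, 7, 8, 10, 12]          # hypothetical scores
m = mean(scores)                    # sum of the scores divided by their number

# Balance point: the negative and positive deviations cancel out.
deviations = [x - m for x in scores]
negative_sum = sum(d for d in deviations if d < 0)
positive_sum = sum(d for d in deviations if d > 0)

# Adding a score equal to the mean leaves the mean unchanged;
# adding any other score moves it.
mean_plus_same = mean(scores + [m])
mean_plus_other = mean(scores + [20])

# Adding a constant to every score adds that constant to the mean.
shifted_mean = mean([x + 5 for x in scores])

print(m, negative_sum, positive_sum)
print(mean_plus_same, mean_plus_other, shifted_mean)
```

Here the deviations below the mean sum to the same size as those above it, which is what "balance point" means in practice.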

Advantages of Mean

 The mean is rigidly defined, which is a quality of a good measure of central tendency.
 It is not only easy to understand but also easy to calculate.
 All the scores in the distribution are considered when the mean is computed.
 Further mathematical calculations can be carried out on the basis of the mean.
 Of the three measures, the mean is the least affected by sampling fluctuations.

Limitations of Mean

 Outliers or extreme values can strongly affect the mean.
 When there are open-ended classes, such as "10 and above" or "below 5", the mean cannot be
computed, because the midpoint of such a class cannot be determined for the calculation. In
such cases the median and mode can still be computed.
 If a score in the data is missing, lost or unclear, the mean cannot be computed for the full
data set. It can only be computed for the rest of the data, with the lost score dropped
altogether.
 It is not possible to determine the mean through inspection. Further, it cannot be determined
from a graph.
 It is not suitable for data that are skewed or very asymmetrical, because in such cases the mean
will not adequately represent the data.

MEDIAN
If the scores in a distribution are listed in order from smallest to largest, the median is the midpoint of
the list. More specifically, the median is the point on the measurement scale below which 50% of the
scores in the distribution are located. Median is a point in any distribution below and above which lie
half of the scores.

Properties

 The median is less sensitive than the mean to the presence of a few extreme scores.
 In distributions that are strongly asymmetrical (skewed) or have a few very deviant scores, the
median may be the better choice for measuring the central tendency if we wish to represent
the bulk of the scores and not give undue weight to the relatively few deviant ones.
 In behavioral studies, there are occasions when a researcher cannot record the exact values of
scores at the upper end of a distribution. For example, when a timed trial is ended before the
participant finishes, the true score must be greater than the time recorded, but we do not
know how much greater. Distributions like this one are called open-ended. In open-ended
distributions, we cannot calculate the mean without making assumptions, but we can find the
median.
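
To make the contrast with the mean concrete, here is a short Python sketch using the standard `statistics` module. The data sets are hypothetical; the even-sized case uses the common convention of averaging the two middle scores.

```python
from statistics import mean, median

odd_scores = [2, 4, 7, 9, 11]       # hypothetical, five scores
even_scores = [2, 4, 7, 9]          # hypothetical, four scores

median_odd = median(odd_scores)     # the middle score
median_even = median(even_scores)   # mean of the two middle scores

# Replace the largest score with an extreme value and compare.
typical = [20, 22, 23, 25, 30]
with_outlier = [20, 22, 23, 25, 300]

print(median_odd, median_even)
print(mean(typical), median(typical))
print(mean(with_outlier), median(with_outlier))
```

The extreme score pulls the mean far upward while the median stays where it was, which is why the median is preferred for strongly skewed data.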

Advantages of Median

 The median is rigidly defined, which is a quality of a good measure of central tendency.
 It is easy to understand and calculate.
 It is largely unaffected by outliers or extreme scores in the data.
 Unless the median falls in an open-ended class, it can be computed for grouped data with open
ended classes.
 In certain cases, it is possible to identify median through inspection as well as graphically.

Limitations of Median

 Some statistical procedures using the median are quite complex. Computing the median can be
time consuming for large data sets because the data must be arranged in order before the
median is found.
 The median cannot be located exactly when ungrouped data contain an even number of scores.
In such cases, the median is estimated as the mean of the two middle scores.
 It is not based on each and every score in the distribution.
 It can be affected by sampling fluctuations and is thus less stable than the mean.

MODE
In a frequency distribution, the mode is the score or category that has the greatest frequency. If the
scores in a distribution all differ from one another, however, there may be no mode. On its own, the
mode does not adequately characterize a distribution because it takes into consideration only the most
frequent scores and ignores the rest.

Properties

 The mode is the only measure that can be used for data that have the characteristics of a
nominal scale.
 For quantitative variables, you need to be aware of some concerns about the use of the mode as
a measure of central tendency.
 When quantitative data are grouped, the mode may be strongly affected by the width and
location of class intervals.
 In addition, there may be more than one mode for a particular set of scores.
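
The points above (nominal data, several modes, no mode) can be illustrated with a small Python sketch. The helper `modes` is a hypothetical function written for this example; it assumes non-empty data and treats a data set in which no value repeats as having no mode.

```python
from collections import Counter

def modes(data):
    """Return the most frequent values, or an empty list if no value repeats."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:                # every value occurs once: no mode
        return []
    return [value for value, n in counts.items() if n == highest]

colours = ["red", "blue", "blue", "green", "blue", "red"]   # nominal data
bimodal = [1, 2, 2, 3, 3, 4]                                # two modes
all_different = [5, 6, 7]                                   # no mode

print(modes(colours))
print(modes(bimodal))
print(modes(all_different))
```

The standard library's `statistics.multimode` does something similar but returns every value when all frequencies are equal, so the "no mode" convention here is a choice made for this sketch.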

Advantages of Mode

 It is not only easy to comprehend and calculate but it can also be determined by mere
inspection.
 It can be used with quantitative as well as qualitative data.
 It is not affected by outliers or extreme scores.
 Even if a distribution has one or more than one open ended class(es), mode can easily be
computed.

Limitations of Mode

 It is sometimes possible that the scores in the data vary from each other and in such cases the
data may have no mode.
 Mode cannot be rigidly defined.
 In the case of a bimodal, trimodal or multimodal distribution, interpretation and comparison
become difficult.
 Mode is not based on the whole distribution.
 It may not be possible to compute further mathematical procedures based on mode.
 Sampling fluctuations can have an impact on mode.

MEASURES OF VARIABILITY: Standard Deviation, Quartile Deviation, Average Deviation

Variability provides a quantitative measure of the differences between scores in a distribution and
describes the degree to which the scores are spread out or clustered together.

In general, a good measure of variability serves two purposes:

1. Variability describes the distribution. Specifically, it tells whether the scores are clustered close
together or are spread out over a large distance. Usually, variability is defined in terms of
distance. It tells how much distance to expect between one score and another, or how much
distance to expect between an individual score and the mean. For example, we know that the
heights for most adult males are clustered close together, within 5 or 6 inches of the average.
Although more extreme heights exist, they are relatively rare.

2. Variability measures how well an individual score (or group of scores) represents the entire
distribution. This aspect of variability is very important for inferential statistics, in which
relatively small samples are used to answer questions about populations. For example, suppose
that you selected a sample of one person to represent the entire population. Because most
adult males have heights that are within a few inches of the population average (the distances
are small), there is a very good chance that you would select someone whose height is within 6
inches of the population mean. On the other hand, the scores are much more spread out
(greater distances) in the distribution of weights. In this case, you probably would not obtain
someone whose weight was within 6 pounds of the population mean. Thus, variability provides
information about how much error to expect if you are using a sample to represent a
population.

Functions of Variability

The major functions of dispersion or variability are as follows:

 It is used for calculating other statistics such as analysis of variance, degree of correlation,
regression, etc.
 It is also used for comparing the variability in the data obtained, as in the case of socio-economic
status, income, education, etc.
 It helps to find out whether the average (mean, median or mode) is reliable. If the variation is
small, we can state that the average is reliable; if the variation is too large, the average
may be misleading.
 It indicates whether variability is adversely affecting the data and thus helps in controlling
the variability.

STANDARD DEVIATION
Standard deviation is the square root of the variance and provides a measure of the standard, or
average, distance of scores from the mean.
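
The definition can be followed step by step in code. This is a minimal Python sketch with hypothetical scores; it assumes the population form (dividing by N), since the notes do not say whether the population or sample formula is intended. The standard library's `statistics.pstdev` is used as a cross-check.

```python
import math
from statistics import pstdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores

m = sum(scores) / len(scores)                        # mean
squared_deviations = [(x - m) ** 2 for x in scores]  # (X - M) squared
variance = sum(squared_deviations) / len(scores)     # population variance
sd = math.sqrt(variance)                             # standard deviation

print(m, variance, sd)
print(pstdev(scores))               # library result for comparison
```

For a sample used to estimate a population value, `statistics.stdev` divides by N - 1 instead and would give a slightly larger result.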

Standard deviation shows how much variation there is from the mean. SD is calculated from the mean
only. A low standard deviation means that the data lie close to the mean; a high standard deviation
indicates that the data are spread out over a large range of values. Standard deviation may also serve
as a measure of uncertainty. If you want to decide whether measurements agree with a theoretical
prediction, the standard deviation of those measurements provides the yardstick: if the mean of the
measurements lies many standard deviations away from the predicted value, the theory being tested
probably needs to be revised. A mean with a smaller standard deviation is more reliable than a mean
with a large standard deviation, and a smaller SD indicates more homogeneous data. The value of
standard deviation is based on every observation in a set of data. It is well suited to algebraic
treatment; therefore, SD is used in further statistical analysis.

The main merits of using standard deviation are as follows:

 It is widely used because it is the best measure of variation by virtue of its mathematical
characteristics.
 It is based on all the observations of the data.
 It gives an accurate estimate of population parameter when compared with other measures of
variation.
 SD is the measure of variation least affected by sampling fluctuations.
 A combined SD can be calculated for two or more groups, which is not possible with the other
measures.
 Further statistics can be applied on the basis of SD like, correlation, regression, tests of
significance, etc.
 Coefficient of variation is based on mean and SD. It is the most appropriate method to compare
variability of two or more distributions.
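
As an illustration of the last point, the coefficient of variation expresses the SD as a percentage of the mean, so distributions measured in different units can be compared. The sketch below is in Python; the function name `coefficient_of_variation` and both data sets are hypothetical, and the population SD is assumed.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Return the SD as a percentage of the mean (the mean must be non-zero)."""
    return pstdev(values) / mean(values) * 100

heights_cm = [160, 165, 170, 175, 180]   # hypothetical heights
weights_kg = [50, 60, 70, 80, 90]        # hypothetical weights

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(round(cv_heights, 2), round(cv_weights, 2))
```

In this made-up example the weights are relatively more variable than the heights, even though the two SDs cannot be compared directly because the units differ.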

The limitations of SD are as follows:

 While calculating standard deviation, more weight is given to extreme values and less to those
near the mean. When we calculate SD, we take deviations from the mean (X-M) and square
them. Therefore, large deviations, when squared, become proportionally larger than small
ones. For example, the deviations 2 and 10 are in the ratio 1:5, but their squares, 4 and 100,
are in the ratio 1:25.
 It is difficult to compute as compared to other measures of dispersion.

The uses of standard deviation are as follows:

 SD is used when one requires a more reliable and accurate measure of variability. It is
recommended when the distribution is normal or near normal.
 It is used when further statistics like, correlation, regression, tests of significance, etc. have to be
computed.

AVERAGE DEVIATION (AD) OR MEAN DEVIATION (MD)

The average deviation is the mean of the deviations of all the separate scores in a series taken from
their mean.

An average is a central value, so some deviations will be positive (+) and some negative (-).
Mean deviation ignores the signs of the deviations and treats all of them as positive. This is
necessary because the algebraic sum of all the deviations from the mean equals zero. MD or AD is the
arithmetic mean of the absolute differences of the values from the average. The average is either the
arithmetic mean or the median. It is a measure of variability that takes into account the variations
of all the scores in the data. It is an absolute measure of dispersion and is expressed in the same
unit as the raw scores.
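
The calculation described above can be sketched in Python as follows. The helper name `average_deviation` and the scores are hypothetical; the function accepts either the mean or the median as the central value, as the text allows.

```python
from statistics import mean, median

def average_deviation(values, centre):
    """Mean of the absolute deviations of the values from the given centre."""
    return sum(abs(x - centre) for x in values) / len(values)

scores = [1, 2, 3, 4, 15]           # hypothetical scores

# Signed deviations from the mean always sum to zero,
# which is why the signs have to be ignored.
signed_sum = sum(x - mean(scores) for x in scores)

ad_from_mean = average_deviation(scores, mean(scores))
ad_from_median = average_deviation(scores, median(scores))

print(signed_sum, ad_from_mean, ad_from_median)
```

With these scores the average deviation taken from the median is smaller than the one taken from the mean.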

The main merits of AD are as follows:

 AD or MD is easy to understand and compute.
 It is based on all observations, unlike the range (R) or QD.
 It is an accurate measure of variability since it averages the absolute deviations.
 It is less affected by extreme observations.
 Because it is based on an average, it is a better measure for comparing the form of different
distributions.

The main limitations of average deviation are as follows:

 While calculating average deviation we ignore the plus minus sign and consider all values as
plus. Because of this mathematical property, it is not used in inferential statistics.
 AD cannot be computed for open-end classes.
 It tends to increase with the size of the sample.

Use of Average Deviation

 When it is desired to weight all deviations from the mean according to their size.
 When the standard deviation is unduly influenced by the presence of extreme scores.
 When the distribution of the scores is not near normal.

THE QUARTILE DEVIATION (QD)


The quartile deviation is a measure that depends on the relatively stable central portion of a
distribution. It is half the scale distance between the 75th and 25th percentiles in a frequency
distribution. These quartiles come from dividing the ordered data into four equal parts, each
containing 25% of the values.

The peakedness of a distribution is related to its quartile deviation. The smaller the quartile
deviation, the greater the concentration of scores in the middle of the distribution, giving a
distribution with a high peak and narrow body. Widely dispersed scores indicate a large quartile
deviation and thus a large IQR; such a distribution has a low peak and broad body.

On the basis of above definitions, it can be said that quartile deviation is half the distance between Q1
and Q3.

Inter Quartile Range (IQR): The range computed for the middle 50% of the distribution is the
interquartile range. The upper quartile (Q3) and lower quartile (Q1) are used to compute it:
IQR = Q3 – Q1. IQR is not affected by extreme values.

Semi-Interquartile Range (SIQR) or Quartile Deviation (QD): Half of the IQR is called the semi-interquartile
range. SIQR is also called quartile deviation or QD. Thus, QD is computed as:

QD = (Q3 – Q1) / 2

Thus, quartile deviation is obtained by dividing IQR by 2. Quartile deviation is an absolute measure of
dispersion and is expressed in the same unit as the scores.
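
A short Python sketch of this calculation is given below, using `statistics.quantiles` from the standard library (Python 3.8 or later). The scores are hypothetical, and the quartile values depend on the interpolation convention; the library default is used here, and other textbooks' methods may give slightly different Q1 and Q3.

```python
from statistics import quantiles

def quartile_deviation(values):
    """Return (IQR, QD) using the library's default quartile method."""
    q1, _q2, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return iqr, iqr / 2

scores = [2, 4, 6, 8, 10, 12, 14]         # hypothetical scores
with_extreme = [2, 4, 6, 8, 10, 12, 140]  # largest score made extreme

iqr, qd = quartile_deviation(scores)
iqr_extreme, qd_extreme = quartile_deviation(with_extreme)

range_plain = max(scores) - min(scores)
range_extreme = max(with_extreme) - min(with_extreme)

print(iqr, qd, range_plain)
print(iqr_extreme, qd_extreme, range_extreme)
```

The extreme score changes the range greatly but leaves IQR and QD untouched, because both depend only on the middle half of the data.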

Merits:
 Quartile deviation is a better measure of dispersion than range because it takes into account 50
per cent of the data, unlike the range which is based on two values of the data, that is highest
value and the lowest value.
 Secondly, quartile deviation is not affected by extreme scores since it does not consider 25 per
cent data from the beginning and 25 per cent from the end of the data.
 Lastly, quartile deviation is the only measure of dispersion which can be computed from the
frequency distribution with open-end class.

Demerits:

 The value of quartile deviation is based on the middle 50 percent of values; it is not based on
all the observations. Thus, it is not regarded as a stable measure of variability.
 The value of quartile deviation is affected by sampling fluctuations.
 The value of quartile deviation is not affected by the distribution of the individual values within
the intervals of middle 50 percent observed values.

Uses of Quartile Deviation

 When the distribution contains a few very extreme scores.
 When the median is the measure of central tendency.
 When our primary interest is to determine the concentration around the median.
