Basic Statistics Note.1
Module
January 2021
Business statistics
Chapter one
1. Introduction (4 lecture hours)
1.1 Definition and classification of Statistics
1.2 Stages in statistical investigation
1.3 Definition of some basic terms
1.4 Applications, uses and limitations of Statistics
1.5 Types of variables and measurement scales
Average $= \frac{1}{5}(3645 + 4568 + 5432 + 6751 + 7369) = 5553$, then our work belongs to the
domain of descriptive statistics.
If we say that there was an increase of 724 patients from 1986 to 1990, then again this
belongs to the domain of descriptive statistics.
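As a small illustration of a descriptive computation, the average quoted above can be reproduced in a few lines of Python. This is only a sketch; the variable names are chosen here for illustration.

```python
# Yearly patient counts taken from the averaging example above.
patients = [3645, 4568, 5432, 6751, 7369]

# Descriptive statistics: summarize only the data that were observed.
average = sum(patients) / len(patients)
print(average)
```

Nothing is inferred beyond the five observed values, which is what places the calculation in descriptive statistics.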
2. Inferential Statistics: consists of generalizing from samples to populations,
performing estimations and hypothesis tests, determining relationships among
variables, and making predictions. Statistical techniques based on probability
theory are required.
Example 1: In the above example, if we predict the number of malaria patients in
the year 1995 to be 9917, then our work belongs to the domain of inferential
statistics.
Example 2: Suppose we want to have an idea about the percentage of illiterates in
our country. We take a sample from the population and find the proportion of
illiterates in the sample. This sample proportion, with the help of probability,
enables us to make some inferences about the population proportion. This study
belongs to inferential statistics.
II. Proper collection of data: in order to draw valid conclusions, it is important to have ‘good’
data. Data are gathered with the aim of meeting predetermined objectives. In other words,
the data must provide answers to problems. The data themselves form the foundation of
statistical analyses and hence must be carefully and accurately collected. In
section 1.6 we will see the methods of data collection.
III. Organization and classification of data: in this stage the collected data are organized in
a systematic manner. That means the data must be placed in relation to each other.
The classification or sorting out of data is, by itself, a kind of organization of data.
IV. Presentation of data: The purpose of putting the organized data in graphs, charts
and tables is two-fold. First, it is a visual way to look at the data and see what
happened and make interpretations. Second, it is usually the best way to show the
data to others. Reading lots of numbers in the text puts people to sleep and does
little to convey information.
V. Analysis of data: the process of looking at and summarizing data with the intent
to extract useful information and develop conclusions. Data analysis is closely
related to data mining, but data mining tends to focus on larger data sets, with less
emphasis on making inference, and often uses data that were originally collected for
a different purpose. In this stage different types of inferential statistical methods are
applied, for instance hypothesis tests such as the $\chi^2$ test of association.
VI. Interpretation of data: interpretation means drawing valid conclusions from data
which form the basis of decision making. Correct interpretation requires a high
degree of skill and experience.
Note that: Analyses and interpretation of data are the two sides of the same coin.
Sample: A portion of a population which is selected using some technique of
sampling. A sample must be representative of the population, so it must be selected
using one of the established sampling techniques.
Sampling: Is the process of selecting units (e.g., people, households, organizations)
from a population of interest so that by studying the sample we may fairly generalize
our results back to the population from which they were chosen. There are two types of
sampling techniques namely random sampling technique and non-random sampling
technique.
Random sampling technique or probability sampling technique gives a non-zero
chance for all elements to be included in the sample. In other words, there is no
personal bias regarding the selection. The five common random sampling techniques
are:
Simple Random sampling
Systematic Random sampling
Stratified Random sampling
Cluster Random sampling
Multi-stage sampling
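The first two techniques in the list can be sketched in Python. The sampling frame (labels 1 to 100), the sample size, and the seed are assumptions made only for this illustration. Systematic sampling is implemented here in its usual textbook form: a random start followed by every k-th unit.

```python
import random

# Hypothetical sampling frame: unit labels 1..100 (assumed for illustration).
population = list(range(1, 101))
sample_size = 10

rng = random.Random(2021)  # fixed seed so the sketch is reproducible

# Simple random sampling: every unit has the same chance of being selected.
simple_sample = rng.sample(population, sample_size)

# Systematic random sampling: pick a random start, then take every k-th unit.
k = len(population) // sample_size
start = rng.randrange(k)
systematic_sample = population[start::k][:sample_size]

print(simple_sample)
print(systematic_sample)
```

In both cases every unit has a non-zero chance of selection, which is the defining property of probability sampling stated above.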
Non-random sampling technique is mostly known as non-probability sampling
technique. In this case not all elements of a population have a known chance of
inclusion, or some elements have a zero chance of being selected into the sample. The
most familiar examples of non-random sampling techniques are:
Quota sampling
Convenience sampling
Volunteer sampling
Purposive sampling
Haphazard sampling
Snowball sampling, etc.
Sample size: The number of elements or observations to be included in the sample.
Parameter: Any measure computed from the data of a population.
Example: Population mean and population standard deviation
Statistic: Any measure computed from the sample.
Example: sample mean $\bar{x}$, sample standard deviation $S$
Survey: A collection of quantitative information about members of a population when
no special control is exercised over any of the factors influencing the variable of interest.
Sample survey: A survey that includes only a portion of the population.
Census: A collection of information about every member of a population.
Sample survey has the following advantages over census
Sample survey saves time and cost
Has great accuracy
Avoid wastage of material
Variable: A variable is a characteristic or attribute that can assume different values.
Variables whose values are determined by chance are called random variables.
Variables are often specified according to their type and intended use, and hence
variables can be classified into two types, namely qualitative and quantitative variables.
A quantitative variable is naturally measured as a number for which meaningful
arithmetic operations make sense. Examples: Height, age, crop yield, GPA, salary,
temperature, area, air pollution index (measured in parts per million), etc.
Qualitative variable: Any variable that is not quantitative is qualitative. Qualitative
variables take a value that is one of several possible categories. As naturally
measured, qualitative variables have no numerical meaning. Examples: Hair color,
gender, field of study, marital status, political affiliation, status of disease infection.
Quantitative variables can be classified as discrete and continuous variables. Discrete
variables can assume only certain numerical values. That is, there are gaps between the
possible values, such as 0, 1, 2, ... The set of values may be countably finite or countably
infinite. For example, the number of students in a classroom or the number of children in a family.
A continuous variable can take any value within a specified interval, given a fine enough
measuring device. There are no gaps between possible values. They are obtained by measuring.
For example, height is a continuous variable: consider the heights of two people; no matter
how close they are, we can find another person whose height falls somewhere between
the two.
II. Uses of Statistics
Statistics presents facts in the form of numerical data.
It condenses and summarizes a mass of data into a few presentable and precise
figures.
It facilitates comparison of data.
It helps in formulating and testing hypotheses.
It helps in predicting future trends.
It helps in formulating policies.
III. Limitations of Statistics
Statistics, with all its wide application in every sphere of human activity, has its own
limitations. Some of them are given below.
Statistics is not suitable for the study of qualitative phenomena: Since statistics is
basically a science that deals with sets of numerical data, it is applicable only to the study
of those subjects of enquiry which can be expressed in terms of quantitative
measurements. As a matter of fact, qualitative phenomena like honesty, poverty,
beauty, intelligence, etc. cannot be expressed numerically, and statistical analysis
cannot be directly applied to them. Nevertheless,
statistical techniques may be applied indirectly by first reducing the qualitative
expressions to accurate quantitative terms. For example, the intelligence of a group
of students can be studied on the basis of their marks in a particular examination.
Statistics does not study individuals: Statistics does not give any specific
importance to the individual items; in fact it deals with an aggregate of objects.
Individual items, when they are taken individually do not constitute any statistical
data and do not serve any purpose for any statistical enquiry.
Statistical laws are not exact: It is well known that mathematical and physical
sciences are exact. But statistical laws are not exact and statistical laws are only
approximations. Statistical conclusions are not universally true. They are true only
on an average.
Statistics may be misused: Statistics must be used only by experts; otherwise,
statistical methods are the most dangerous tools in the hands of the inexpert. The
use of statistical tools by inexperienced and untrained persons might lead to
wrong conclusions. Statistics can be easily misused by quoting wrong figures of
data. As King says aptly, ‘statistics are like clay of which one can make a God or
Devil as one pleases.’
Statistics is only one of the methods of studying a problem: Statistical methods
do not provide a complete solution to problems, because problems are to be
studied taking the background of the country's culture, philosophy or religion into
consideration. Thus the statistical study should be supplemented by other
evidence.
Examples:
Letter grades (A, B, C, D, F).
Rating scales (Excellent, Very good, Good, Fair, poor).
Military status.
3. Interval Scales
Interval scales are measurement systems that possess the following properties:
Level of measurement which classifies data that can be ranked and differences
are meaningful. However, there is no meaningful zero, so ratios are meaningless.
All arithmetic operations except division are applicable.
Relational operations are also possible.
Examples:
IQ, temperature in °F.
4. Ratio Scales
Ratio scale measurements possess the following properties: Level of measurement
which classifies data that can be ranked, differences are meaningful, and there is a true
zero. True ratios exist between the different units of measure.
All arithmetic and relational operations are applicable.
Examples:
Weight
Height
Number of students
Age
Use of level of measurements
Helps you decide how to interpret the data from the variable.
Helps you decide what statistical analysis is appropriate on the values that were
assigned. For example, if a measurement is nominal, then you know that you
should never average the data.
Assignment: What is the application of statistics for your field of study (15%)?
2. Methods of Data Collection and Presentation (6 lecture hours)
1.6 Methods of data collection
1.6.1 Sources of data
1.6.2 Types of data
1.6.3 Methods of collection
1.7 Methods of Data Presentation
1.7.1 Motivating examples
1.7.2 Frequency distributions: qualitative, quantitative: absolute, relative,
percentage, cumulative
1.7.3 Tabular presentation of data
1.7.4 Diagrammatic display of data: Bar charts, Pie-chart, Cartograms
1.7.5 Graphical presentation of data: Histogram, Frequency Polygon,
Ogive Curves
1. You have to confirm that the data are on a nominal or ordinal scale of measurement.
2. Make a table as shown below
A B C D
class Tally Frequency Percent
1 3 0 2 4 5 0 5 7 5 1 1
0 2
A. Construct ungrouped frequency distribution
B. How many workers had at least 1 day of sick leave?
C. How many workers had between 3 and 5 days of sick leave?
Solution:
A. Since this data set contains only a relatively small number of distinct or different
values, it is convenient to represent it in a frequency table which presents each distinct
value along with its frequency of occurrence.
B. Since 12 of the 50 workers had no days of sick leave, the answer is 50 - 12 = 38.
C. The answer is the sum of the frequencies for values 3, 4 and 5 that is 4+5+8=17
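The same kind of table and the two follow-up questions can be sketched in Python. The data below are hypothetical values for 12 workers, chosen only to keep the example short; they are not the 50-worker data set of the example.

```python
from collections import Counter

# Hypothetical sick-leave days for 12 workers (illustrative data only).
sick_days = [1, 3, 0, 2, 4, 5, 0, 5, 3, 5, 1, 1]

# Ungrouped frequency distribution: each distinct value with its frequency.
frequency = Counter(sick_days)
for value in sorted(frequency):
    print(value, frequency[value])

# Workers with at least 1 day: total minus those with 0 days.
at_least_one = len(sick_days) - frequency[0]

# Workers with between 3 and 5 days: sum of the frequencies of 3, 4 and 5.
between_3_and_5 = sum(frequency[v] for v in (3, 4, 5))

print(at_least_one, between_3_and_5)
```

Both answers are read off the frequency table rather than the raw list, mirroring the reasoning in parts B and C.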
2.2.2.3. Grouped Frequency Distribution
When the range of the data is large, the data must be grouped into classes, each of which is more than
one unit in width. To construct this frequency distribution, we follow the
steps below.
1. Find the highest and the lowest values.
2. Find the range: $R = \text{Max} - \text{Min}$ or $R = \text{Highest} - \text{Lowest}$.
3. Select the number of classes desired. Here, we have two choices to get the desired number
of classes:
I. Use Sturges' rule. That is, $k = 1 + 3.322 \log_{10} n$, where $k$ is the number of classes and
$n$ is the number of observations.
II. Select the number of classes arbitrarily between 5 and 20. This is a conventional
way. If you fail to calculate by Sturges' rule, this method is more appropriate.
When we choose the number of classes, we have to think about the following criteria
The classes must be mutually exclusive. Mutually exclusive classes have
non-overlapping class limits so that values can't be placed into two classes.
The classes must be continuous. Even if there are no values in a class, the class must
be included in the frequency distribution. There should be no gaps in a frequency
distribution. The only exception occurs when the class with a zero frequency is the
first or last. A class with a zero frequency at either end can be omitted without
affecting the distribution.
The classes must be equal in width. The reason for having classes with equal width is
so that there is not a distorted view of the data. One exception occurs when a
distribution is open-ended, i.e., it has no specific beginning or end values.
4. Find the class width by dividing the range by the number of classes: $w = R/k$.
Note that: Round the answer up to the nearest whole number if there is a remainder. For
instance, $w = 5.5$ is rounded up to $6$.
5. Select the starting point as the lowest class limit. This is usually the lowest score
(observation). Add the width to that score to get the lower class limit of the next class.
Keep adding until you achieve the number of desired class calculated in step 3.
6. Find the upper class limit; subtract unit of measurement from the lower class limit of
the second class in order to get the upper limit of the first class. Then add the width to
each upper class limit to get all upper class limits.
Unit of measurement ($U$): Is the next expected upcoming value. For instance, if the data are 28, 23, 52,
then the unit of measurement is one, because if we take one datum arbitrarily, say 23, then the
next upcoming value will be 24. Therefore, $U = 24 - 23 = 1$. If the data are 24.12, 30,
21.2, then give priority to the datum with more decimal places. Take 24.12 and guess the
next possible value. It is 24.13. Therefore, $U = 24.13 - 24.12 = 0.01$.
Note that: $U = 1$ is the maximum value of the unit of measurement and is the value when
we don't have a clue about the data.
7. Find the class boundaries:
$LCB = LCL - \frac{1}{2}U$ and $UCB = UCL + \frac{1}{2}U$.
8. Tally the data and write the numerical values for tallies in the frequency column
9. Find the cumulative frequency. We have two types of cumulative frequency, namely
less than cumulative frequency and more than cumulative frequency. Less than
cumulative frequency is obtained by adding successively the frequencies of all
the previous classes including the class against which it is written. The cumulate
is started from the lowest to the highest size. More than cumulative frequency is
obtained by finding the cumulate total of frequencies starting from the highest to
the lowest class.
For example, the following frequency distribution table gives the marks obtained
by 40 students:
The above table shows how to find less than cumulative frequency and the table
shown below shows how to find more than cumulative frequency.
Example 2.3: Consider the following set of data and construct the frequency
distribution.
11 29 6 33 14 21 18 17 22 38
31 22 27 19 22 23 26 39 34 27
Steps
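1. Highest value $= 39$, lowest value $= 6$.
2. $R = 39 - 6 = 33$.
3. $k = 1 + 3.322 \log_{10} 20 = 5.32 \approx 6$.
4. $w = R/k = 33/6 = 5.5 \approx 6$.
5. Select starting point. Take the minimum, which is 6, then add the width 6 to it to get
the next class LCL.
6 12 18 24 30 36
6. Upper class limit. Since the unit of measurement is one, $12 - 1 = 11$ is the
UCL of the first class. Therefore, 6-11 is the first class.
Class limit 6-11 12-17 18-23 24-29 30-35 36-41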
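The whole construction for Example 2.3 can be sketched in Python. Two assumptions are made here and marked in the comments: the number of classes from Sturges' rule is rounded up (which gives the six classes listed above), and the unit of measurement is 1 because the data are whole numbers.

```python
import math

# Data of Example 2.3.
data = [11, 29, 6, 33, 14, 21, 18, 17, 22, 38,
        31, 22, 27, 19, 22, 23, 26, 39, 34, 27]

n = len(data)
low, high = min(data), max(data)
data_range = high - low                        # step 2: range
k = math.ceil(1 + 3.322 * math.log10(n))       # step 3: Sturges' rule, rounded up (assumption)
width = math.ceil(data_range / k)              # step 4: class width, rounded up
unit = 1                                       # unit of measurement for whole numbers (assumption)

table = []
cumulative = 0
for i in range(k):
    lcl = low + i * width                      # step 5: lower class limit
    ucl = lcl + width - unit                   # step 6: upper class limit
    lcb, ucb = lcl - unit / 2, ucl + unit / 2  # step 7: class boundaries
    freq = sum(lcl <= x <= ucl for x in data)  # step 8: tally
    cumulative += freq                         # step 9: less-than cumulative frequency
    table.append((lcl, ucl, lcb, ucb, freq, cumulative))

for row in table:
    print(row)
```

Each printed row holds the class limits, the class boundaries, the frequency, and the less-than cumulative frequency of one class.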
2.2.2.4. Relative Frequency Distribution
An important variation of the basic frequency distribution uses relative frequencies, which are
easily found by dividing each class frequency by the total of all frequencies. A relative frequency
distribution includes the same class limits as a frequency distribution, but relative frequencies
are used instead of actual frequencies. The relative frequencies are sometimes expressed as
percents.
Relative frequency distribution enables us to understand the distribution of the data and to
compare different sets of data.
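A minimal sketch of the calculation in Python is given below. The frequencies are those obtained by tallying the data of Example 2.3 into the classes 6-11, 12-17, 18-23, 24-29, 30-35 and 36-41; the variable names are illustrative.

```python
# Class frequencies obtained by tallying the data of Example 2.3.
frequencies = [2, 2, 7, 4, 3, 2]
total = sum(frequencies)

# Relative frequency = class frequency / total of all frequencies.
relative = [f / total for f in frequencies]

# The same values expressed as percentages.
percent = [100 * r for r in relative]

print(relative)
print(percent)
```

The relative frequencies always add up to 1 (or 100 percent), which makes distributions with different totals directly comparable.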
2.2.3. Graphical Presentation of Data
We have discussed the techniques of classification and tabulation that help us in
organizing the collected data in a meaningful fashion. However, this way of
presentation of statistical data does not always prove to be interesting to a layman. Too
many figures are often confusing and fail to convey the message effectively.
One of the most effective and interesting alternative ways in which statistical data may
be presented is through diagrams and graphs. There are several ways in which
statistical data may be displayed pictorially, such as different types of graphs and
diagrams.
General steps in constructing graphs
1. Draw and label the $x$ and $y$ axes
2. Choose a suitable scale for the frequencies or cumulative frequencies and label it
on the $y$ axis.
3. Represent the class boundaries for the histogram or ogive, or the midpoint for
the frequency polygon, on the $x$ axis.
4. Plot the points
5. Draw the bars or lines
2.2.3.1. Diagrammatic display of data: Bar charts, Pie-chart, Cartograms
I. Pie chart
A pie chart can be used to compare the relation between the whole and its components. A pie
chart is a circular diagram in which the areas of the sectors of a circle represent the components.
Circles are drawn with radii proportional to the square root of the quantities because
the area of a circle is $\pi r^2$.
To construct a pie chart (sector diagram), we draw a circle with radius (square root of
the total). The total angle of the circle is $360^\circ$. The angles of each component are
calculated by the formula
$$\text{Angle of sector} = \frac{\text{component part}}{\text{total}} \times 360^\circ$$
These angles are made in the circle by means of a protractor to show different
components. The arrangement of the sectors is usually anti-clockwise.
Example 2.4: The following table gives the details of monthly budget of a family.
Represent these figures by a suitable diagram.
Solution: The necessary computations are given below:
[Pie chart of the monthly family budget: Food 40%, House Rent 27%, Miscellaneous 20%, Fuel and Light 7%, Clothing 6%]
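The sector angles for this budget can be sketched in Python, assuming the usual rule that each angle is the component's share of the total multiplied by 360 degrees. The percentage shares are the ones shown on the chart; the dictionary name is illustrative.

```python
# Percentage shares of the family budget as shown on the chart.
shares = {
    "Food": 40,
    "House Rent": 27,
    "Miscellaneous": 20,
    "Fuel and Light": 7,
    "Clothing": 6,
}

total = sum(shares.values())

# Angle of each sector = (component / total) * 360 degrees.
angles = {item: value / total * 360 for item, value in shares.items()}

for item, angle in angles.items():
    print(item, round(angle, 1))
```

The angles add up to 360 degrees, so the sectors fill the circle exactly.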
Draw two perpendicular lines one horizontally and the other vertically at an
appropriate place of the paper.
Take the basis of classification along horizontal line (X-axis) and the observed
variable along vertical line (Y-axis) or vice versa.
Mark bars of equal breadth for each class and leave a gap of equal breadth, or not less
than half that breadth, between two classes.
Finally, mark the values of the given variable to prepare the required bars.
Example 2.5: Draw simple bar diagram to represent the profits of a bank for 5 years.
B. Multiple bar charts are used when two or more sets of inter-related data are represented
(a multiple bar diagram facilitates comparison between more than one phenomenon).
The technique of simple bar chart is used to draw this diagram but the difference is
that we use different shades, colors, or dots to distinguish between different
phenomena.
Example 2.6: Draw a multiple bar chart to represent the import and export of Canada
(values in $) for the years 1991 to 1995.
C. Stratified (stacked or component) bar chart is used to represent data in which the total
magnitude is divided into different parts or components. In this diagram, first we make simple
bars for each class taking the total magnitude in that class and then divide these simple
bars into parts in the ratio of the various components. This type of diagram shows the
variation in different components within each class as well as between different
classes. A sub-divided bar diagram is also known as a component bar chart or stacked
chart.
Example 2.7: The table below shows the quantity in hundred kgs of Wheat, Barley and
Oats produced on a certain farm during the years 1991 to 1994. Draw a stratified bar
chart.
Solution: To make the component bar chart, first of all we have to take year wise total
production.
2.2.3.2. Graphical presentation of data: Histogram, Frequency Polygon, Ogive Curves
Statistical graphs can be used to describe the data set or to analyze it. Graphs are also
useful in getting the audience’s attention in a publication or a speaking presentation.
They can be used to discuss an issue, reinforce a critical point, or summarize a data set.
They can also be used to discover a trend or pattern in a situation over a period of time.
The three most commonly used graphs in research are
1. The histogram.
2. The frequency polygon.
3. The cumulative frequency graph, or ogive (pronounced o-jive).
(1). Histogram
Histogram is a special type of bar graph in which the horizontal scale represents classes of data
values and the vertical scale represents frequencies. The heights of the bars correspond to the
frequency values, and the bars are drawn adjacent to each other (without gaps).
We can construct a histogram after we have first completed a frequency distribution table for a
data set. The $x$ axis is reserved for the class boundaries.
Example 2.8: Take the data in example 2.3.
[Histogram: class boundaries on the $x$ axis, frequency (0 to 7) on the $y$ axis]
Relative frequency histogram has the same shape and horizontal ($x$) scale as a histogram,
but the vertical ($y$) scale is marked with relative frequencies instead of actual frequencies.
(2). Frequency Polygon
A frequency polygon uses line segments connected to points located directly above class midpoint
values. The heights of the points correspond to the class frequencies, and the line segments are
extended to the left and right so that the graph begins and ends on the horizontal axis, at the
points where the previous and next midpoints would be located.
Example 2.9: Take the data in example 2.3.
[Frequency polygon: class midpoints 2.5, 8.5, 14.5, 20.5, 26.5, 32.5, 38.5, 44.5 on the $x$ axis, frequency on the $y$ axis]
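The points of the frequency polygon for Example 2.3 can be computed with a short Python sketch. The class limits and frequencies are those obtained for Example 2.3; the extra point on each side, one class width away with zero frequency, closes the polygon on the horizontal axis.

```python
# Class limits and frequencies for the data of Example 2.3.
class_limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]
frequencies = [2, 2, 7, 4, 3, 2]
width = 6

# Class midpoint (class mark) = (lower limit + upper limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Add one midpoint before the first class and one after the last class,
# each with zero frequency, so the polygon starts and ends on the x axis.
x_points = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
y_points = [0] + frequencies + [0]

for x, y in zip(x_points, y_points):
    print(x, y)
```

Joining the printed points in order with straight line segments gives the polygon.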
CHAPTER 3
$$\sum_{i=1}^{N} X_i = X_1 + X_2 + \dots + X_N$$
The expression is read, "the sum of X sub i from i equals 1 to N." It means "add up
all the numbers."
Example: Suppose the following were scores made on the first homework
assignment for five students in the class: 5, 7, 7, 6, and 8. In this example set of five
numbers, where N=5, the summation could be written:
$$\sum_{i=1}^{5} X_i = X_1 + X_2 + X_3 + X_4 + X_5 = 5 + 7 + 7 + 6 + 8 = 33$$
The "i=1" in the bottom of the summation notation tells where to begin the
sequence of summation. If the expression were written with "i=2", the summation
would start with the second number in the set.
For example: $$\sum_{i=2}^{5} X_i = X_2 + X_3 + X_4 + X_5 = 7 + 7 + 6 + 8 = 28$$
The "N" in the upper part of the summation notation tells where to end the
sequence of summation. If there were only three scores then the summation and
example would be:
$$\sum_{i=1}^{3} X_i = X_1 + X_2 + X_3 = 5 + 7 + 7 = 19$$
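The three sums above can be checked with a small Python sketch. Python lists are indexed from 0, so the subscript $i$ in the notation corresponds to position $i - 1$ in the list.

```python
# Homework scores from the example: X1..X5.
scores = [5, 7, 7, 6, 8]

total_all = sum(scores)          # sum for i = 1 to 5
total_from_2 = sum(scores[1:])   # sum for i = 2 to 5 (skips X1)
total_first_3 = sum(scores[:3])  # sum for i = 1 to 3

print(total_all, total_from_2, total_first_3)
```

The lower and upper limits of the summation simply select which part of the list is added up.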
PROPERTIES OF SUMMATION
1. $\sum_{i=1}^{n} K = nK$, where $K$ is any constant
2. $\sum_{i=1}^{n} K X_i = K \sum_{i=1}^{n} X_i$, where $K$ is any constant
3. $\sum_{i=1}^{n} (a + bX_i) = na + b \sum_{i=1}^{n} X_i$, where $a$ and $b$ are any constants
4. $\sum_{i=1}^{n} (X_i + Y_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i$
5. $\sum_{i=1}^{N} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \dots + X_N Y_N$
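A quick numerical check of these properties can be sketched in Python. The list X holds the homework scores used earlier; the list Y and the constants K, a and b are hypothetical values chosen only for illustration.

```python
# X: homework scores from the earlier example.
# Y, K, a, b: hypothetical values used only to illustrate the rules.
X = [5, 7, 7, 6, 8]
Y = [6, 7, 8, 7, 8]
K, a, b = 4, 2, 3
n = len(X)

rule1 = sum(K for _ in range(n)) == n * K                 # sum of a constant
rule2 = sum(K * x for x in X) == K * sum(X)               # constant factor
rule3 = sum(a + b * x for x in X) == n * a + b * sum(X)   # linear expression
rule4 = sum(x + y for x, y in zip(X, Y)) == sum(X) + sum(Y)

# Property 5 defines a sum of products, which in general differs
# from the product of the two separate sums.
sum_of_products = sum(x * y for x, y in zip(X, Y))
product_of_sums = sum(X) * sum(Y)

print(rule1, rule2, rule3, rule4)
print(sum_of_products, product_of_sums)
```

A check on one data set is not a proof, but it helps to see that the sum of products and the product of sums are different quantities.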
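a) $\sum_{i=1}^{5} X_i$
b) $\sum_{i=1}^{5} Y_i$
c) $\sum_{i=1}^{5} 11$
d) $\sum_{i=1}^{5} (X_i + Y_i)$
e) $\sum_{i=1}^{5} (X_i - Y_i)$
f) $\sum_{i=1}^{5} X_i^2$
g) $\sum_{i=1}^{5} X_i Y_i$
h) $\left(\sum_{i=1}^{5} X_i\right)\left(\sum_{i=1}^{5} Y_i\right)$
i) $\sum_{i=1}^{5} X_i + \sum_{i=1}^{5} Y_i$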
The choice of these averages depends up on which best fit the property under
discussion.
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \sum_{i=1}^{n} X_i / n$$
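When the data are arranged or given in the form of a frequency distribution, i.e.
there are $k$ variate values such that a value $X_i$ has a frequency $f_i$ ($i = 1, 2, \dots, k$),
then the arithmetic mean will be
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$
where $k$ is the number of classes and $\sum_{i=1}^{k} f_i = n$.
For a grouped frequency distribution, the arithmetic mean is
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i Y_i}{\sum_{i=1}^{k} f_i}$$
where $Y_i$ = the class mark of the $i$th class and $f_i$ = the frequency of the $i$th class.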
Example 3.2:
2) The distribution of age at first marriage of 130 males was as given below
Age in years(X): 18 19 20 21 22 23 24 25 26 27 28 29
No. of males (f): 2 1 4 8 10 12 17 19 18 14 13 12
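As a sketch of the frequency-weighted mean, the table above can be evaluated in Python. The variable names are illustrative; the formula used is the sum of $f_i X_i$ divided by the sum of $f_i$.

```python
# Age at first marriage (X) and number of males (f) from the table above.
ages = [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
males = [2, 1, 4, 8, 10, 12, 17, 19, 18, 14, 13, 12]

n = sum(males)                                     # total frequency
weighted_sum = sum(f * x for f, x in zip(males, ages))
mean_age = weighted_sum / n                        # arithmetic mean

print(n, weighted_sum, round(mean_age, 2))
```

Each age is counted as many times as it occurs, so the result equals the ordinary mean of all 130 individual ages.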
3) Calculate the mean for the following age distribution.
Class frequency
6- 10 35
11- 15 23
16- 20 15
21- 25 12
26- 30 9
31- 35 6
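For grouped data the class marks take the place of the individual values. A minimal Python sketch for the age distribution above follows; the class mark is taken as the midpoint of the class limits.

```python
# (lower class limit, upper class limit, frequency) from the table above.
classes = [(6, 10, 35), (11, 15, 23), (16, 20, 15),
           (21, 25, 12), (26, 30, 9), (31, 35, 6)]

# Class mark Y_i = midpoint of the class limits.
class_marks = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [f for _, _, f in classes]

n = sum(frequencies)
grouped_mean = sum(f * y for f, y in zip(frequencies, class_marks)) / n

print(class_marks)
print(n, grouped_mean)
```

The result is an approximation of the true mean, because every observation in a class is treated as if it were equal to the class mark.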
1) Find the mean of the marks obtained by 51 students with A=48.5 and w=10 of
fi 4 12 15 13 7
Merits:
• It is rigidly defined.
• It is based on all observation.
• It is suitable for further mathematical treatment.
• It is stable average, i.e. it is not affected by fluctuations of sampling to some
extent.
• It is easy to calculate and simple to understand.
Demerits:
• It is affected by extreme observations.
• It cannot be used in the case of open end classes.
• It cannot be determined by the method of inspection.
• It cannot be used when dealing with qualitative characteristics, such as
intelligence, honesty, beauty.
• It can be a number which does not exist in the series.
• Sometimes it leads to wrong conclusions if the details of the data from which
it is obtained are not available.
• It gives high weight to high extreme values and less weight to low extreme
values.
3.6. The Mode
Mode is a value which occurs most frequently in a set of values.
The mode may not exist and even if it does exist, it may not be unique.
In case of discrete distribution the value having the maximum frequency is
the modal value.
If in a set of observed values, all values occur once or an equal number of
times, there is no mode.
Examples:
1. Find the mode of 5, 3, 5, 8, and 9
Mode =5
2. Find the mode of 8, 9, 9, 7, 8, 2, and 5.
It is a bimodal Data: 8 and 9
3. Find the mode of 4, 12, 3, 6, and 7.
No mode for this data.
- The mode of a set of numbers X1, X2, …Xn is usually denoted by X̂ .
If data are given in the shape of continuous frequency distribution, the mode is
defined as:
$$\hat{X} = L_{mod} + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) W$$
Example 3.7: The following is the distribution of the size of certain farms selected
at random from a district. Calculate the mode of the distribution.
Size of farms    No. of farms
5-15             8
15-25            12
25-35            17
35-45            29
45-55            31
55-65            5
65-75            3
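The grouped-data mode formula can be applied to this table with a short Python sketch. It assumes the standard meaning of the symbols: $L_{mod}$ is the lower boundary of the modal class, $\Delta_1$ is the modal frequency minus the frequency of the preceding class, $\Delta_2$ is the modal frequency minus the frequency of the following class, and $W$ is the class width.

```python
# (lower boundary, upper boundary, frequency) from the farm-size table.
classes = [(5, 15, 8), (15, 25, 12), (25, 35, 17), (35, 45, 29),
           (45, 55, 31), (55, 65, 5), (65, 75, 3)]
freqs = [f for _, _, f in classes]

# The modal class is the class with the highest frequency.
m = freqs.index(max(freqs))
l_mod, upper, f_mod = classes[m]
width = upper - l_mod

# Frequencies of the neighbouring classes (0 if there is no neighbour).
f_prev = freqs[m - 1] if m > 0 else 0
f_next = freqs[m + 1] if m < len(freqs) - 1 else 0

delta1 = f_mod - f_prev
delta2 = f_mod - f_next
mode = l_mod + delta1 / (delta1 + delta2) * width

print(l_mod, delta1, delta2, round(mode, 2))
```

Because the class after the modal class has a much smaller frequency than the class before it, the mode lies close to the lower boundary of the modal class.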
2) The export of agricultural products in million dollars from a country during
eight quarters in 1974 and 1975 was, 29.7, 16.6, 2.3, 14.1, 36.6, 18.7, 3.5, 21.3.
Find the median of the given set of values?
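A small Python sketch for this kind of question is shown below. It sorts the values and, because the number of observations is even, averages the two middle values; the standard library function statistics.median is used as a cross-check.

```python
import statistics

# Export values (in million dollars) for the eight quarters.
exports = [29.7, 16.6, 2.3, 14.1, 36.6, 18.7, 3.5, 21.3]

ordered = sorted(exports)
n = len(ordered)

# For an even number of observations the median is the mean of the
# (n/2)-th and (n/2 + 1)-th ordered values.
median_by_hand = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
median_library = statistics.median(exports)

print(ordered)
print(median_by_hand, median_library)
```

The two middle values after sorting are 16.6 and 18.7, so the median lies halfway between them.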
Median for grouped data.
-If data are given in the shape of continuous frequency distribution, the median is
defined as:
$$\tilde{X} = L_{med} + \frac{W}{f_{med}}\left(\frac{n}{2} - f_c\right)$$
Where: $L_{med}$ = lower class boundary of the median class.
$f_{med}$ = the frequency of the median class.
$f_c$ = the cumulative frequency (less than type) preceding the median class.
W=the size of the median class.
n=total number of observation.
Note: The median class is the class with the smallest cumulative frequency (less
than type) greater than or equal to n/2.
Example 3.9: Find the median of the following distribution.
Class Frequency
40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3
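The median formula for grouped data can be applied to this distribution with the following Python sketch. It assumes a unit of measurement of 1, so the lower boundary of a class is its lower limit minus 0.5 and the class size is upper limit minus lower limit plus 1.

```python
# (lower class limit, upper class limit, frequency) from the table above.
classes = [(40, 44, 7), (45, 49, 10), (50, 54, 22), (55, 59, 15),
           (60, 64, 12), (65, 69, 6), (70, 74, 3)]
unit = 1  # unit of measurement (assumption for whole-number limits)

n = sum(f for _, _, f in classes)
half = n / 2

cumulative = 0  # less-than cumulative frequency before the current class
for lcl, ucl, f in classes:
    if cumulative + f >= half:
        # First class whose cumulative frequency reaches n/2: the median class.
        l_med = lcl - unit / 2
        width = ucl - lcl + unit
        f_med, f_c = f, cumulative
        median = l_med + (width / f_med) * (half - f_c)
        break
    cumulative += f

print(n, l_med, f_med, f_c, round(median, 2))
```

The loop stops at the class 50-54, which is the first class whose less-than cumulative frequency is at least n/2.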
Merits and Demerits of Median
Merits:
• Median is a positional average and hence not influenced by extreme
observations.
• Can be calculated in the case of open end intervals.
• Median can be located even if the data are incomplete.
Demerits:
• It is not a good representative of data if the number of items is small.
• It is not amenable to further algebraic treatment.
• It is susceptible to sampling fluctuations.
Empirical relationship between $\bar{X}$, $\hat{X}$, and $\tilde{X}$
$\bar{X} = \hat{X} = \tilde{X}$, for symmetrical distribution
$\bar{X} - \hat{X} = 3(\bar{X} - \tilde{X})$, for unimodal skewed or asymmetrical frequency distribution.
a. Motivating examples
b. Objectives of measures of variation
c. Measures of Dispersion (Variation)
i. Range and Relative Range
ii. Variance, Standard Deviation and Coefficient of Variation
iii. Standard Scores
4.1 Introduction
Consider the following two sets of scores:
Both these sets have the same mean (50), but the second set is a lot more widely
dispersed ("scattered") than the first.
Set 1 Set 2
A measure of central tendency alone does not adequately describe a set of observations
unless all observations are the same. So we need some additional information like
1) The extent to which the items in a particular distribution are scattered around the
central tendency, i.e. a measure of dispersion.
2) The direction of scatteredness, whether more items are attracted towards higher or
lower values, i.e. a measure of skewness.
3) The extent to which the distribution is more peaked or more flat-topped than the
normal distribution, i.e. a measure of kurtosis.
Definition:
The scatter or spread of items of a distribution is known as dispersion or variation.
In other words the degree to which numerical data tend to spread about an average
value is called dispersion or variation of the data.
Measures of dispersions are statistical measures which provide ways of measuring
the extent in which data are dispersed or spread out.
To determine the reliability of an average by pointing out as how far an
average is representative of the entire data.
To determine the nature and cause of variation in order to control the
variation itself.
Enable comparison of two or more distribution with regard to their
variability.
Measuring variability is of great importance to other statistical analysis. E.g.,
it is the basis of statistical quality control.
Furthermore, although the same unit of measurement is used, the two MCT
(means) may be quite different. If we compare the AMD of weights of first grade
children with the AMD of the weights of high school freshmen, we may find that
the latter AMD is numerically larger than the former, because the weights
themselves are larger, not because the AMD is larger.
4.3 Types of Measures of Dispersion
Various measures of dispersions are in use. The most commonly used measures of
dispersions are:
Distribution 1: 32 35 36 36 37 38 40 42 42 43 43 45
Distribution 2: 32 32 33 33 33 34 34 34 34 34 35 45
For this reason, among others, the range is not the most important measure of
variability.
For ungrouped data: $RR = \frac{X_{max} - X_{min}}{X_{max} + X_{min}}$
For grouped data: $RR = \frac{UCB_{last} - LCB_{first}}{UCB_{last} + LCB_{first}}$
Merits and Demerits of range
Merits:
It is rigidly defined.
It is easy to calculate and simple to understand.
Demerits:
It is not based on all observation.
It is highly affected by extreme observations.
It is affected by fluctuation in sampling.
It is not liable to further algebraic treatment.
It can not be computed in the case of open end distribution.
It is very sensitive to the size of the sample.
Example 1
For raw data, 5, 6,8,4,5, 3,9,8,7,3,5,6,8,11
$R = 11 - 3 = 8$
Coefficient of range $= \frac{11 - 3}{11 + 3} = \frac{8}{14} = 0.57$
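The same calculation can be sketched in Python for the raw data of Example 1. The function name is illustrative.

```python
def range_and_relative_range(values):
    """Return the range R and the relative range RR (coefficient of range)."""
    largest, smallest = max(values), min(values)
    r = largest - smallest
    rr = r / (largest + smallest)
    return r, rr

# Raw data of Example 1.
raw = [5, 6, 8, 4, 5, 3, 9, 8, 7, 3, 5, 6, 8, 11]
r, rr = range_and_relative_range(raw)

print(r, round(rr, 2))
```

Only the largest and smallest values enter the calculation, which is why the range is so sensitive to extreme observations.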
Example 2:
Height Number of
(in) Students
Less than 59.5 0
Less than 62.5 5
Less than 65.5 23
Less than 68.5 65
Less than 71.5 92
Less than 74.5 100
$R = 74.5 - 56.5 = 18$
Coefficient of range $= \frac{x_{max} - x_{min}}{x_{max} + x_{min}} = \frac{74.5 - 56.5}{74.5 + 56.5} = \frac{18}{131} \approx 0.137$
Example 4.1: 1) Find R and RR, and then identify which data set is more dispersed.
a) For the monthly income of 10 workers Xi: 347, 420, 500, 600, 696, 710, 835, 850,
and 900.
b) For the following age distribution.
Class Frequency
6- 10 35
11- 15 23
16- 20 15
21- 25 12
26- 30 9
31- 35 6
2. If the range and relative range of a series are 4 and 0.25 respectively, then what is the
value of the:
a) Smallest observation
b) Largest observation
4.3.2 The Variance
Population Variance
If we divide the variation by the number of values in the population, we get something
called the population variance. This variance is the "average squared deviation from the
mean".
Population Variance $= \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}, \quad i = 1, 2, 3, \dots, N$
Sample Variance
One would expect the sample variance to simply be the population variance with the
population mean replaced by the sample mean. However, one of the major uses of
statistics is to estimate the corresponding population parameter, and that formula gives a
biased estimate: on average it underestimates the population variance. To counteract this,
the sum of the squares of the deviations is divided by one less than the sample size.
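Sample Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
I.e. the sample variance, denoted by $s^2$, of a set of n observed values having a mean $\bar{x}$
is the sum of the squared deviations divided by n - 1.
For a frequency distribution with k distinct values or classes and $n = \sum_{i=1}^{k} f_i$:
$s^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n - 1}$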
We usually use the following short cut formula.
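$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}$, for raw data
$s^2 = \frac{\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2}{n - 1}$, for a frequency distribution with k distinct values or classes, where $\sum_{i=1}^{k} f_i = n$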
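As a check on the formulas above, the following Python sketch computes the sample variance both from the definition and from the shortcut formula, and compares them with the standard library's `statistics.variance`. The data are the raw values from Example 1 of the range section; the function names are illustrative.

```python
import math
import statistics

data = [5, 6, 8, 4, 5, 3, 9, 8, 7, 3, 5, 6, 8, 11]


def sample_variance(values):
    """Definition: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Shortcut: (sum of x^2 - n * mean^2) / (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return (sum(x * x for x in values) - n * mean ** 2) / (n - 1)


def population_variance(values):
    """Population form: sum of squared deviations divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


s2 = sample_variance(data)
# Both formulas agree up to floating-point rounding.
print(math.isclose(s2, sample_variance_shortcut(data)))   # True
print(math.isclose(s2, statistics.variance(data)))        # True
# Dividing by N instead of n - 1 always gives the smaller value.
print(population_variance(data) < s2)                      # True
```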
Standard Deviation
There is a problem with variances. Recall that the deviations were squared. That means
that the units were also squared. To get the units back the same as the original data
values, the square root must be taken.
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Solutions:
$C.V_B = \frac{S_B}{\bar{X}_B} \times 100\% = \frac{11}{47.5} \times 100\% = 23.16\%$
Since C.VA < C.VB, in firm B there is greater variability in individual wages.
$Z_i = \frac{X_i - \mu}{\sigma}$, for the population
$Z_i = \frac{X_i - \bar{x}}{s}$, for the sample
Z gives the deviation from the mean in units of standard deviation.
Z gives the number of standard deviations by which a particular observation lies above or
below the mean.
It is used to compare two observations coming from different groups.
Examples:
1. Two sections were given introduction to statistics examinations. The following
information was given.
Value               Section 1    Section 2
Mean                78           90
Stand. deviation    6            5
Student A from section 1 scored 90 and student B from section 2 scored 95. Relatively
speaking who performed better?
Solutions:
Calculate the standard score of both students.
$Z_1 = \frac{X_1 - \bar{X}_1}{S_1} = \frac{90 - 78}{6} = 2$
$Z_2 = \frac{X_2 - \bar{X}_2}{S_2} = \frac{95 - 90}{5} = 1$
Student A performed better relative to his section because the score of student A is
two standard deviations above the mean score of his section, while the score of
student B is only one standard deviation above the mean score of his section.
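The comparison of the two students can be sketched in Python as follows. The helper name `z_score` is an illustrative choice.

```python
def z_score(x, mean, sd):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


z_a = z_score(90, mean=78, sd=6)   # student A, section 1
z_b = z_score(95, mean=90, sd=5)   # student B, section 2

print(z_a, z_b)  # 2.0 1.0
print("A" if z_a > z_b else "B")  # A
```

Student B has the higher raw score (95 against 90), yet student A has the higher standard score, which is why standard scores rather than raw scores are used to compare observations from different groups.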
2. Two groups of people were trained to perform a certain task and tested to find out
which group is faster to learn the task. For the two groups the following information
was given:
Value          Group one    Group two
Mean           10.4 min     11.9 min
Stand. dev.    1.2 min      1.3 min
Relatively speaking:
a) Which group is more consistent in its performance?
b) Suppose person A from group one takes 9.2 minutes while person B from
group two takes 9.3 minutes. Who was faster in performing the task? Why?
Solutions:
a) Use coefficient of variation.
$C.V_1 = \frac{S_1}{\bar{X}_1} \times 100\% = \frac{1.2}{10.4} \times 100\% = 11.54\%$
$C.V_2 = \frac{S_2}{\bar{X}_2} \times 100\% = \frac{1.3}{11.9} \times 100\% = 10.92\%$
Since $C.V_2 < C.V_1$, group 2 is more consistent.
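b) Calculate the standard score of both persons.
$Z_A = \frac{X_A - \bar{X}_1}{S_1} = \frac{9.2 - 10.4}{1.2} = -1$
$Z_B = \frac{X_B - \bar{X}_2}{S_2} = \frac{9.3 - 11.9}{1.3} = -2$
Relative to the group, person B is faster because the time taken by person B is two standard
deviations shorter than the average time taken by group 2, while the time taken by person A
is only one standard deviation shorter than the average time taken by group 1.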
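Both parts of this example can be checked with a short Python sketch. The function names are illustrative, and results are rounded because the inputs are decimal fractions that binary floating point cannot store exactly.

```python
def coefficient_of_variation(sd, mean):
    """C.V = (standard deviation / mean) * 100, expressed in percent."""
    return sd / mean * 100


def z_score(x, mean, sd):
    """Standard score: deviation from the mean in units of standard deviation."""
    return (x - mean) / sd


# a) Consistency: the group with the smaller C.V is more consistent.
cv_1 = coefficient_of_variation(sd=1.2, mean=10.4)
cv_2 = coefficient_of_variation(sd=1.3, mean=11.9)
print(round(cv_1, 2), round(cv_2, 2))  # 11.54 10.92

# b) Relative speed: a more negative Z means a time further below the group mean.
z_person_a = z_score(9.2, mean=10.4, sd=1.2)
z_person_b = z_score(9.3, mean=11.9, sd=1.3)
print(round(z_person_a, 2), round(z_person_b, 2))  # -1.0 -2.0
```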
Chapter 5
5. Elementary probability (4 lecture hours)
5.1. Introduction
• Probability theory is the foundation upon which the logic of inference is built.
• It helps us to cope with uncertainty.
• In general, probability is the chance of an outcome of an experiment. It is the measure of
how likely an outcome is to occur.
5.2. Definitions of some probability terms
1. Experiment: Any process of observation or measurement, or any process which
generates well-defined outcomes.
2. Probability Experiment (Random Experiment): It is an experiment that can be
repeated any number of times under similar conditions and for which it is possible to
enumerate all possible outcomes without being able to predict an individual outcome.
Example: If a fair coin is tossed three times, it is possible to enumerate all
eight possible sequences of heads (H) and tails (T). But it is not possible to predict which
sequence will occur on any occasion.
3. Outcome: The result of a single trial of a random experiment
4. Sample Space(S): Set of all possible outcomes of a probability experiment.
Example 1: The sample space of an experiment consisting of three tosses of a coin is
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Example 2: Recording the gender of children of two-child families.
S = {bb, bg, gb, gg}. An event B may be: B = "children of both genders". Then B = {bg, gb}.
Sample space can be
Countable (finite or infinite)
Uncountable
5. Event: It is a subset of the sample space. It is a statement about one or
more outcomes of a random experiment. Events are denoted by capital letters A, B, C, ....
For example, the event that there are exactly two heads in three tosses of a coin
consists of the three sample points HTH, HHT and THH.
Remark: If the sample space S has n members, then
there are exactly $2^n$ subsets or events.
6. Equally Likely Events: Events which have the same chance of occurring.
7. Complement of an Event: the complement of an event A means the non-occurrence of A
and is denoted by A' or A^c. It contains those points of the sample space which
don't belong to A.
8. Elementary (simple) Event: an event having only a single element or sample point.
9. Mutually Exclusive (Disjoint) Events: Two events which cannot happen at the same
time.
10. Independent Events: Two events are said to be independent if the occurrence of one
does not affect the probability of the other occurring.
11. Dependent Events: Two events are dependent if the occurrence of the first event affects
the occurrence of the second event in such a way that its probability is changed.
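Several of the terms defined above can be illustrated on the three-toss coin experiment. The following Python sketch enumerates the sample space, builds the event "exactly two heads" and its complement, and counts the possible events. The variable names are illustrative.

```python
from itertools import product

# Sample space: every sequence of H and T over three tosses.
sample_space = ["".join(toss) for toss in product("HT", repeat=3)]
print(sample_space)
# ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']

# Event: a subset of the sample space, here "exactly two heads".
exactly_two_heads = [o for o in sample_space if o.count("H") == 2]
print(exactly_two_heads)  # ['HHT', 'HTH', 'THH']

# Complement: the outcomes of the sample space that do not belong to the event.
complement = [o for o in sample_space if o not in exactly_two_heads]
print(len(complement))  # 5

# A sample space with n members has 2**n subsets, i.e. 2**n possible events.
print(2 ** len(sample_space))  # 256
```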
5.3 Counting Techniques
The number of outcomes of the random experiment or number of cases favorable to an event
can be determined by using mathematical methods (multiplication rule, addition rule,
permutation and combinations) without direct enumeration.
Addition rule
If there are k procedures and the ith procedure may be performed in $n_i$ ways, i = 1, 2, ..., k, then
the number of ways in which we may perform procedure 1 or procedure 2 or ... or procedure k is
given by $n_1 + n_2 + \dots + n_k$, assuming that no two procedures may be performed together.
Example:
1. Suppose that we are planning a trip and are deciding between bus or train
transportation. If there are 3 bus routes and 2 train routes, how many different routes
are available for the trip?
Solution: There are 3 bus routes and 2 train routes. Thus 3 + 2 = 5 routes are available for the trip.
Multiplication rule
In a sequence of n events in which the first one has $k_1$ possibilities, the second one has $k_2$, the
third one has $k_3$, and so on, the total number of possibilities for the sequence is $k_1 \cdot k_2 \cdot k_3 \cdots k_n$.
Example:
1. An instructor gives a six-question multiple-choice examination. There are four possible
responses for each question. How many answer keys can be made?
2. A product is assembled in three stages. At the first stage there are 5 assembly lines, at
the second stage there are 6 assembly lines and at the third stage there are 10
assembly lines. In how many different ways may the product be routed through the
assembly process?
Solution:
1. Each of the six questions has four possible responses, so $4 \cdot 4 \cdot 4 \cdot 4 \cdot 4 \cdot 4 = 4^6 = 4096$ answer keys can be made.
2. In total, the product can be routed through the assembly process in $5 \cdot 6 \cdot 10 = 300$ ways.
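The addition and multiplication rules can be checked numerically. The sketch below applies them to the trip, answer-key and assembly-line examples, and confirms the answer-key count by direct enumeration. The variable names are illustrative.

```python
import math
from itertools import product

# Addition rule: the choices are alternatives (bus OR train), so the counts add.
trip_routes = 3 + 2
print(trip_routes)  # 5

# Multiplication rule: the choices are made in sequence, so the counts multiply.
answer_keys = 4 ** 6                      # six questions, four responses each
assembly_routes = math.prod([5, 6, 10])   # three stages
print(answer_keys, assembly_routes)       # 4096 300

# Direct enumeration of all answer keys agrees with the rule.
enumerated_keys = sum(1 for _ in product("ABCD", repeat=6))
print(enumerated_keys == answer_keys)     # True
```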
Permutations
A permutation is an arrangement of n objects in a specific order. The number of arrangements
of n different objects taken all together is given by $n! = n(n-1)(n-2) \cdots 1$.
Example:
1. Suppose that a photographer wants to arrange 4 people in a row for a photograph. In
how many different ways can the arrangement be done?
2. How many different 5-letter permutations can be formed from the letters in the word
DISCOVER?
Solution:
1. The number of arrangements of 4 people in a row is given by $4! = 4 \cdot 3 \cdot 2 \cdot 1 = 24$.
2. $_8P_5 = \frac{8!}{(8 - 5)!} = 6720$
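Both permutation counts can be verified in Python, once with the closed-form functions in the `math` module (`math.perm` requires Python 3.8 or later) and once by direct enumeration. The variable names are illustrative.

```python
import math
from itertools import permutations

# Arranging 4 people in a row: 4! orderings.
row_arrangements = math.factorial(4)
print(row_arrangements)  # 24

# 5-letter permutations of the 8 distinct letters of DISCOVER: 8! / (8 - 5)!.
by_formula = math.factorial(8) // math.factorial(8 - 5)
by_library = math.perm(8, 5)
print(by_formula, by_library)  # 6720 6720

# Direct enumeration gives the same count because all 8 letters are different.
by_enumeration = sum(1 for _ in permutations("DISCOVER", 5))
print(by_enumeration)  # 6720
```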
Combination
A selection of distinct objects without regard to order is called a combination. The difference
between a permutation and a combination is that in a combination, the order or arrangement
of the objects is not important; by contrast, order is important in a permutation.
Example:
1. How many different committees of 3 people can be chosen to work on a special
project?
2. In a club there are 7 women and 5 men. A committee of 3 women and 2 men is to be chosen.
How many different possibilities are there?
Solution:
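For the club committee of 3 women and 2 men, one way to obtain the count is to choose the women and the men separately and combine the two selections with the multiplication rule. The Python sketch below does this with `math.comb` (Python 3.8 or later) and confirms the result by enumeration. The member labels W1...W7 and M1...M5 are hypothetical.

```python
import math
from itertools import combinations

# Order does not matter within the committee, so combinations are used.
ways_to_choose_women = math.comb(7, 3)   # 3 of the 7 women
ways_to_choose_men = math.comb(5, 2)     # 2 of the 5 men
committees = ways_to_choose_women * ways_to_choose_men
print(ways_to_choose_women, ways_to_choose_men, committees)  # 35 10 350

# Enumeration with hypothetical member labels gives the same count.
women = [f"W{i}" for i in range(1, 8)]
men = [f"M{i}" for i in range(1, 6)]
enumerated = sum(1 for w in combinations(women, 3) for m in combinations(men, 2))
print(enumerated)  # 350
```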
Example:
Exercise
A committee of 5 people must be selected from 5 men and 8 women. In how many ways
can the selection be done if there are at least 3 women in the committee?
Axiomatic Approach:
Let E be a random experiment and S be a sample space associated with E. With each event A we
associate a real number P(A), called the probability of A, which satisfies the following
properties, called axioms of probability or postulates of probability.
a) $0 \le P(A) \le 1$
b) $P(S) = 1$
c) If A and B are mutually exclusive events, the probability that one or the other occurs
equals the sum of the two probabilities, i.e. $P(A \cup B) = P(A) + P(B)$
d) For any event A, $P(A) \ge 0$
e) $P(\emptyset) = 0$
f) For any events A and B, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
g) $P(A') = 1 - P(A)$
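These properties can be checked on a small concrete case. The Python sketch below assumes the three-toss experiment with a fair coin, so that all 8 outcomes are equally likely and the probability of an event is the number of favourable outcomes divided by 8. The events A, B and C are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumption: fair coin, so the 8 outcomes are equally likely.
sample_space = {"".join(toss) for toss in product("HT", repeat=3)}


def prob(event):
    """Probability of an event as favourable outcomes over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {o for o in sample_space if o.count("H") == 2}   # exactly two heads
B = {o for o in sample_space if o[0] == "H"}         # first toss is a head
C = {"TTT"}                                          # no heads at all

print(prob(A), prob(B), prob(A | B))  # 3/8 1/2 5/8

print(0 <= prob(A) <= 1)                                  # a) True
print(prob(sample_space) == 1)                            # b) True
print(prob(A | C) == prob(A) + prob(C))                   # c) True, A and C are disjoint
print(prob(set()) == 0)                                   # e) True
print(prob(A | B) == prob(A) + prob(B) - prob(A & B))     # f) True
print(prob(sample_space - A) == 1 - prob(A))              # g) True
```

Exact fractions are used instead of floating-point numbers so that the equalities hold exactly.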
5.5. Conditional Probability and Independence
Conditional Events: If the occurrence of one event has an effect on the occurrence of the
other event, then the two events are conditional or dependent events.
Conditional Probability
Let A and B be two events such that P(A) > 0. Denote by P(B|A) the probability of B given that A
has occurred. Since A is known to have occurred, it becomes the new sample space, replacing
the original S. From this we are led to the definition
$P(B|A) = \frac{P(A \cap B)}{P(A)}$, P(A) > 0, or $P(A \cap B) = P(B|A) \cdot P(A)$
Or $P(A|B) = \frac{P(A \cap B)}{P(B)}$, P(B) > 0, or $P(A \cap B) = P(A|B) \cdot P(B)$
The above definition implies that the probability that both A and B occur is equal to the
probability that A occurs times the probability that B occurs given that A has occurred. We call
P(B|A) the conditional probability of B given A, i.e., the probability that B will occur given that A
has occurred. It is easy to show that conditional probability satisfies the axioms of probability.
Remark:
1) 2) and
3) $P(A'|B) = 1 - P(A|B)$
4) $P(B'|A) = 1 - P(B|A)$
5) if and are mutually exclusive
6) For three events
Examples
1. The probability that it is Friday and that a student is absent is 0.03. Since there are 5 school
days in a week, the probability that it is Friday is 0.2. What is the probability that a student
is absent given that today is Friday?
Solution:
$P(\text{Absent} \mid \text{Friday}) = \frac{P(\text{Friday and Absent})}{P(\text{Friday})} = \frac{0.03}{0.2} = 0.15$
2. A jar contains black and white marbles. Two marbles are chosen without replacement.
The probability of selecting a black marble and then a white marble is 0.34, and the
probability of selecting a black marble on the first draw is 0.47. What is the probability of
selecting white marble on the second draw, given that the first marble drawn was black?
Solution:
$P(\text{White} \mid \text{Black}) = \frac{P(\text{Black and White})}{P(\text{Black})} = \frac{0.34}{0.47} = 0.72$
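Both examples apply the same rule: divide the joint probability by the probability of the given event. A minimal Python sketch, with an illustrative function name and a guard for the case where the given event has probability zero, is:

```python
def conditional_probability(p_joint, p_given):
    """P(B | A) = P(A and B) / P(A), defined only when P(A) > 0."""
    if p_given <= 0:
        raise ValueError("the given event must have positive probability")
    return p_joint / p_given


# Example 1: P(Absent | Friday)
p_absent_given_friday = conditional_probability(p_joint=0.03, p_given=0.2)
print(round(p_absent_given_friday, 2))  # 0.15

# Example 2: P(White on second draw | Black on first draw)
p_white_given_black = conditional_probability(p_joint=0.34, p_given=0.47)
print(round(p_white_given_black, 2))  # 0.72
```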
Assignment:
4. Suppose a study conducted at Wollega University reveals that students who attended class
95% to 100% of the time usually scored an A in the class. Students who attended class
80% to 90% of the time usually scored B or C in the class. Students who attended class less
than 80% of the time usually received D or F or eventually withdrew from the class.
i) Are descriptive, inferential, or both types of statistics used? Why?
ii) What is the population under study?
iii) What are the data in the study?
iv) What are the variables under study?
v) Identify the types of variables.
vi) Which type of scale of measurement is used for those variables?