Statistics
1. Introduction
Definition of Statistics
• Statistics (in the plural sense) refers to systematically collected
numerical facts. Examples: birth rates, death rates, and imports and exports
of goods.
• In this sense, it denotes information expressed as numbers, that is,
numerical data.
• Statistics (in the singular sense) is the science of conducting studies to
collect, organize, present, analyze, and draw conclusions from data.
Classification of Statistics
• The body of knowledge called statistics is sometimes divided into two
main areas, depending on how data are used. The two areas are
1. Descriptive statistics
2. Inferential statistics
1. Descriptive Statistics
• Descriptive statistics consists of the collection, organization,
summarization, and presentation of data.
• In descriptive statistics the statistician tries to describe a situation.
• Consider the national census that the Ethiopian government conducts every
10 years. The results of this census give the average age, income, and other
characteristics of the Ethiopian population.
• To obtain this information, the Census Bureau must have some means to
collect relevant data. Once data are collected, the bureau must organize and
summarize them.
• Finally, the bureau needs a means of presenting the data in some meaningful
form, such as charts, graphs, or tables.
2. Inferential Statistics
• Descriptive statistics describes data (for example, a chart or graph) and inferential statistics allows you
to make predictions (“inferences”) from that data.
• With inferential statistics, you take data from samples and make generalizations about a population.
• It consists of generalizing from samples to populations, performing estimations and hypothesis tests,
determining relationships among variables, and making predictions.
• Inferential statistics uses probability, i.e., the chance of an event occurring. You may be familiar with
the concepts of probability through various forms of gambling. If you play cards, dice, bingo, or
lotteries, you win or lose according to the laws of probability.
• Statisticians also use inferential statistics to determine relationships among variables. For example,
in studying smoking and health, there is a relationship between smoking and lung cancer.
• Finally, by studying past and present data and conditions, statisticians try to make predictions based
on this information. For example, a car supplier may look at past sales records for a specific month to
decide what types of automobiles and how many of each type to order for that month next year.
Class Activities
Determine whether descriptive or inferential statistics were used.
A. The average price of laptops in a recent year was $50.
B. The CSA predicts that the population of Ethiopia in 2030 will be 150 million
people.
C. A medical report stated that taking statins is proven to lower heart attacks, but
some people are at a slightly higher risk of developing diabetes when taking
statins.
D. A survey of 2234 people conducted by research centers found that 55% of the
respondents said that excessive complaining by adults was the most annoying
social media habit.
Basic Terms
• A population consists of all subjects (human or otherwise) that are
being studied.
• A sample is a group of subjects selected from a population.
• A parameter is a summary measure computed to describe a characteristic of
the population. Examples: population mean and population variance.
• A statistic is a summary measure computed to describe a characteristic of
the sample. Sample statistics (the plural of statistic) provide information
about the population. Examples: sample mean and sample variance.
Cont’d
• A variable is a characteristic or attribute that can assume different values.
• Qualitative variables are variables that have distinct categories according to some characteristic or
attribute. Example: Gender, religious preference and geographic locations.
• Quantitative variables are variables that can be counted or measured. Example: Age, heights,
weights, and body temperatures.
• Quantitative variables can be further classified into two groups: discrete and continuous.
• Discrete variables assume values that can be counted. Discrete variables can be assigned values
such as 0, 1, 2, 3 and are said to be countable. Example: the number of children in a family, the
number of students in a classroom, and the number of calls received by a call center each day for a
month.
• Continuous variables can assume an infinite number of values between any two specific values.
They are obtained by measuring. They often include fractions and decimals.
Cont’d
• Data are the values (measurements or observations) that the variables
can assume.
• A collection of data values forms a data set.
• Each value in the data set is called a data value or a datum.
Measurement Scales
• Variables can be classified by how they are categorized, counted, or measured.
• The nominal level of measurement classifies data into mutually exclusive
(nonoverlapping) categories on which no ranking or order can be imposed.
• Examples: political party (Democratic, Republican, independent, etc.), religion
(Christianity, Judaism, Islam, etc.), and marital status (single, married, divorced,
widowed, separated).
• The ordinal level of measurement classifies data into categories that can be ranked;
however, precise differences between the ranks do not exist. Examples: letter grades
(A, B, C, D, F), or guest speakers ranked as superior, average, or poor on student
evaluations.
Cont’d
• The interval level of measurement ranks data, and precise differences
between units of measure do exist; however, there is no meaningful zero.
Examples: IQ and temperature in °F.
• The one property the interval scale lacks is a true zero. For example, IQ
tests do not measure people who have no intelligence, and 0°F does not mean
no heat at all.
• The ratio level of measurement possesses all the characteristics of interval
measurement, and there exists a true zero. In addition, true ratios exist when
the same variable is measured on two different members of the population.
• Examples: height, weight, area, and number of phone calls received.
Summary
Sampling Techniques (Optional)
Probability Sampling
(Random Sampling)
1. Simple Random Sampling
• In a simple random sample, every member of the population has an equal
probability of being included in the sample.
• This sampling method is as easy as assigning numbers to the
individuals in the population and then randomly choosing from those
numbers, for example through an automated process.
• The individuals whose numbers are chosen are the members included
in the sample.
• Simple random sampling can be done using the lottery method, a table
of random numbers, or number-generating software.
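As a rough illustration of the number-generating-software route, the sketch below draws a simple random sample with Python's standard `random` module. The population of 20 numbered individuals, the sample size of 5, and the fixed seed are assumptions made only for this example.

```python
import random

# Hypothetical population: 20 individuals numbered 1..20.
population = list(range(1, 21))

# A fixed seed is used only so the sketch gives the same sample on every run.
rng = random.Random(42)

# random.sample draws without replacement, so every individual has the
# same chance of being included and nobody is chosen twice.
sample = rng.sample(population, k=5)
print(sample)
```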
2. Systematic Sampling
• Researchers use the systematic sampling method to choose the
sample members of a population at regular intervals, i.e. every
kth individual becomes part of the sample.
• For example, when surveying consumers, every fifth
consumer may be selected from the population.
• Let N = population size, n = sample size, and k = N/n = sampling interval.
Choose any number j between 1 and k (1 <= j <= k). The jth unit is
selected first, followed by the (j+k)th, (j+2k)th, and so on,
until the required sample size is reached.
• The advantage of this sampling technique is its simplicity.
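The selection rule j, j+k, j+2k, ... can be sketched in a few lines of Python. The function name `systematic_sample`, the list of 50 consumers, and the use of integer division for k are assumptions of this sketch, not part of the slides.

```python
import random

def systematic_sample(population, n, rng=None):
    """Pick n units at a regular interval k = N // n from a random start j."""
    rng = rng or random.Random()
    N = len(population)
    k = N // n                  # sampling interval (assumes n <= N)
    j = rng.randint(1, k)       # random start with 1 <= j <= k
    # Units j, j+k, j+2k, ... in 1-based positions, until n units are chosen.
    return [population[j - 1 + i * k] for i in range(n)]

consumers = list(range(1, 51))  # hypothetical list of 50 consumers
picked = systematic_sample(consumers, n=10, rng=random.Random(0))
print(picked)
```

With N = 50 and n = 10 the interval is k = 5, so the sketch selects every fifth consumer after the random start.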
3. Stratified Sampling
• In stratified sampling, the researcher divides a large population into smaller
subgroups (called strata) that do not overlap but together represent the entire population.
• A sample is then drawn from each stratum separately.
• A common approach is to classify subjects by sex, age, ethnicity, or similar characteristics, splitting
them into mutually exclusive groups and then using simple random sampling to choose
members from each group.
• Stratified sampling is often used where there is a great deal of variation within a population.
• Elements within the same stratum should be more or less homogeneous, while elements in different
strata should differ.
• Its purpose is to ensure that every stratum is adequately represented.
4. Cluster Sampling
• Cluster sampling is a way to randomly select participants who are spread out geographically. For
example, if you wanted to choose 100 participants from the entire population of Ethiopia, it is
likely impossible to get a complete list of everyone. Instead, the researcher randomly selects
clusters (e.g., cities or regions), and all the sampling units in the selected clusters are
surveyed.
• Clusters are formed so that elements within a cluster are heterogeneous, i.e. observations in
each cluster should be more or less dissimilar.
• Cluster sampling is typically used when each sampling unit is itself a group of elements, for
example a city, a family, or a university. Researchers divide the population into such smaller
sections and then randomly select clusters from among them.
5. Multistage Sampling
• Multistage sampling divides large populations into stages to make the
sampling process more practical.
• A combination of stratified sampling or cluster sampling and simple
random sampling is usually used.
Non-Probability Sampling
(Non-Random Sampling)
1. Convenience Sampling
• Convenience sampling is a non-probability sampling technique where
samples are selected from the population only because they are
conveniently available to the researcher.
• Researchers choose these samples simply because they are easy to recruit,
without considering whether the sample represents the entire population.
2. Judgmental or Purposive Sampling
• In the judgmental sampling method, researchers select the samples
based purely on their own knowledge and judgment.
• In other words, researchers choose only those people whom they
believe are fit to participate in the research study.
• Judgmental or purposive sampling is not a scientific method of
sampling, and its downside is that the preconceived notions of a
researcher can influence the results.
• Thus, this sampling technique involves a high amount of ambiguity.
3. Quota Sampling
• In quota sampling, members are selected according to a pre-set standard.
• Because the sample is formed based on specific attributes, it is intended to
have the same qualities found in the total population. It is a rapid method of collecting
samples.
• For example, suppose a researcher wants to study the career goals of male and female
employees in an organization.
• There are 500 employees in the organization; they form the population.
• To learn about the population, the researcher needs only a sample, not the
entire population.
• Further, the researcher is interested in particular strata within the population.
• This is where quota sampling helps, by dividing the population into strata or groups.
4. Snowball Sampling
• It is a sampling method that researchers apply when the subjects are difficult to
trace or locate. For example, it is extremely challenging to survey homeless people or illegal
immigrants.
• Researchers also use this sampling method when the topic is highly sensitive
and not openly discussed, for example surveys that gather information about HIV/AIDS.
• Not many victims will readily respond to the questions. Still, researchers can contact people they
know, or volunteers associated with the cause, to get in touch with the victims and collect
information.
• This sampling method works like a referral program.
• Researchers use this technique when the pool of potential subjects is small and not easily accessible.
• Once the researchers find suitable subjects, they ask them for help in finding similar subjects to
form a sample of reasonable size.
Methods of Data Collection
• Interview
• Questionnaire
• Schedule through enumerators
• Observation
• Experiment
• Registration methods
• Documentary sources
• Etc
Functions of Statistics
• To present facts in a definite form
• To facilitate comparisons
• To formulate and test hypotheses
• To forecast
• To support policy making
• To measure uncertainty
• To enlarge knowledge
Limitations of Statistics
• It ignores qualitative aspects.
• It does not deal with individual items.
• Statistical laws and results are true only on average.
• If sufficient care is not exercised in collecting, analyzing, and
interpreting the data, statistical results might be misleading.
• Only a person who has expert knowledge of statistics can handle
statistical data efficiently.
• Some errors are possible in statistical decisions. Inferential
statistics in particular involves certain errors, and we do not know whether an
error has been committed or not.
Chapter 2
2. Descriptive Statistics
Definition
• Descriptive statistics, in short, help describe and understand the
features of a data set, by giving short summaries about the measures of
the data.
• Descriptive statistics uses the data to provide descriptions of the
population, either through tables, graphs/diagrams or numerical
calculations.
Frequency Distributions (FD)
• After collecting data, the first task for a researcher is to organize and
simplify the data so that it is possible to get a general overview of the
results.
• This is the goal of descriptive statistical techniques.
• One method for simplifying and organizing data is to construct a
frequency distribution.
• A frequency distribution is the organization of raw data in table form,
using classes and frequencies.
• There are two types: categorical and numerical frequency distributions.
1. Categorical Frequency Distributions
• The categorical frequency distribution is used for data that can be
placed in specific categories, such as nominal- or ordinal-level data.
• For example, data such as political affiliation, religious affiliation, or
major field of study would use categorical frequency distributions.
2. Numerical Frequency Distribution
• It is the organization of data measured at the interval and ratio levels.
• There are two types:
A. Ungrouped Frequency Distribution
B. Grouped Frequency Distribution
A. Ungrouped Frequency Distribution
• Ungrouped data are data given as individual data points.
• Example: Twenty students were asked how many hours they worked
per day. Their responses, in hours, are listed below. Arrange the data in
table form.
5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3
Solution:
Frequency Table of Student Work Hours
Data value (work hours/day)    Frequency (no. of students)
2                              3
3                              5
4                              3
5                              6
6                              2
7                              1
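The tally for the twenty work-hour responses can be reproduced with a short Python sketch using `collections.Counter`; the variable names are illustrative only.

```python
from collections import Counter

# The twenty responses (hours worked per day) from the example.
hours = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

# Counter tallies how many times each distinct value occurs.
frequency = Counter(hours)

# Print the ungrouped frequency table in ascending order of data value.
for value in sorted(frequency):
    print(value, frequency[value])
```

The frequencies add up to 20, the number of students surveyed.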
B. Grouped Frequency Distributions
• When the range of the data is large, the data must be grouped into
classes that are more than one unit in width, in what is called a
grouped frequency distribution.
• Grouped data is data that has been organized into groups from the
raw data.
Basic Terms
• Class limits: the smallest and largest data values that can fall in a
class are called the lower class limit and the upper class limit.
• Class width: the difference between the lower (or upper) class limits
of two consecutive classes. Equivalently, it is the difference between
the upper and lower boundaries of any class.
• Class boundaries are the values that separate adjacent classes.
• A boundary is a dividing line between two classes, whereas a limit
restricts the values that a class may contain.
Cont’d
• Class marks are the midpoints of the classes. They are obtained
by averaging the lower and upper class limits.
• Unit of measurement: the smallest possible difference between any two
values of the given data set.
Constructing a Grouped Frequency
Distribution
1. Determine the unit of measurement.
2. Find the range of the data set.
3. Decide the number of classes (K) using Sturges' rule: $K = 1 + 3.322 \log_{10} N$,
where N is the number of observations.
4. Find the class width by dividing the range by the number of classes and rounding
to the nearest convenient value.
5. Generate the class limits and class boundaries. Usually, the starting point is the
lowest value of the data.
6. Tally the data and find the numerical frequencies from the tallies.
Summary
• There should be between 5 and 20 classes.
• The classes must be continuous.
• The classes must be exhaustive.
• The classes must be mutually exclusive.
• The classes must be equal in width.
Example
These data represent the record high temperatures in degrees Fahrenheit
for each of the 50 states. Construct a grouped frequency distribution for
the data.
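Solution
1. Unit of measurement: $U = 1$
2. Range: $R = 134 - 100 = 34$
3. Number of classes: $K = 1 + 3.322 \log_{10} 50 = 6.64 \approx 7$
4. Class width: $w = R/K = 34/7 = 4.9 \approx 5$
5. Lowest value = lower limit of the first class = 100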
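The steps of this solution can be sketched in Python to generate the class limits and boundaries. The function name `grouped_classes` is hypothetical, and rounding K and the width up to whole numbers is an assumption of this sketch; it agrees with the values 7 and 5 used in the solution.

```python
import math

def grouped_classes(lowest, highest, n_obs, unit=1):
    """Return (K, width, class limits) for a grouped frequency distribution."""
    data_range = highest - lowest
    k = math.ceil(1 + 3.322 * math.log10(n_obs))  # Sturges' rule, rounded up
    width = math.ceil(data_range / k)             # width rounded up
    # Each class starts one width after the previous one and ends one
    # unit of measurement before the next class starts.
    limits = [(lowest + i * width, lowest + (i + 1) * width - unit)
              for i in range(k)]
    return k, width, limits

k, width, limits = grouped_classes(lowest=100, highest=134, n_obs=50)
# Class boundaries lie half a unit of measurement outside the class limits.
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in limits]

for (lo, hi), (lb, ub) in zip(limits, boundaries):
    print(f"{lo}-{hi}  ({lb}-{ub})")
```

For the temperature data this gives seven classes, from 100-104 up to 130-134, with boundaries 99.5-104.5 up to 129.5-134.5.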
Homework
• A distribution has a constant class width and 6 classes, and its second
class mark is 8. If the class mark of the fourth class is 18, find the
class width, the class limits, and the class boundaries of the distribution.
Assume the unit of measurement is 1.
Relative Frequency
• A relative frequency is the ratio (fraction or proportion) of the number
of times a value of the data occurs in the set of all outcomes to the
total number of outcomes.
Cumulative frequency distribution
• Sometimes we may be interested to know the number of observations
less than or more than a specified value.
• A cumulative FD lists each class together with its
corresponding cumulative frequency.
• Two types:
1. Less than cumulative FD
2. More than cumulative FD
Cont’d
• A less than cumulative FD is obtained by successively adding
the frequencies of all the preceding classes, including the class against
which it is written.
• The accumulation runs from the lowest class to the highest.
• A more than cumulative FD is obtained by successively adding
the frequencies of all the succeeding classes, including the class
against which it is written.
• The number of observations less than the upper boundary of a class is
called the "less than type" cumulative frequency of that class.
• The number of observations more than or equal to the lower boundary of a
class is called the "more than type" cumulative frequency of that class.
Example
• Give here practical examples for relative FD, more than and less than
cumulative FD.
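One possible practical example reuses the student work-hours frequency table from the ungrouped frequency distribution section. The Python sketch below computes the relative frequencies and both cumulative frequency columns; treating each data value as its own class is an assumption of this sketch.

```python
from itertools import accumulate

# Student work-hours table: data values and their frequencies.
values = [2, 3, 4, 5, 6, 7]
freq = [3, 5, 3, 6, 2, 1]
n = sum(freq)

# Relative frequency: each frequency divided by the total number of outcomes.
relative = [f / n for f in freq]

# Less than type: running total from the lowest value upward.
less_than = list(accumulate(freq))

# More than type: running total from the highest value downward.
more_than = list(accumulate(reversed(freq)))[::-1]

for v, f, r, lt, mt in zip(values, freq, relative, less_than, more_than):
    print(v, f, round(r, 2), lt, mt)
```

For example, 17 of the 20 students worked 5 hours or fewer (less than type), and 9 students worked 5 hours or more (more than type).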
Measure of Central Tendency
• In the previous sections, we discussed the techniques of classification
and tabulation which help in summarizing the collected data and
presenting them in the form of diagrams and graphs.
• A measure of central tendency is a single value that attempts to
describe and summarize a set of data by identifying the central
position within that set of data.
Cont’d
• A measure of central tendency is also called a summary statistic, a
measure of central location, or an average.
Characteristics
• It should be simple to understand and easy to compute.
• It should be rigidly defined.
• It should be based on all the observations.
• It should be suitable for further algebraic treatments.
• It should not be affected by extreme values.
Mean
Class Activity
The heights of 5 people are 142 cm, 150 cm, 149 cm, 156 cm, and
153 cm. Find the mean height.
$\bar{X} = \frac{142 + 150 + 149 + 156 + 153}{5} = \frac{750}{5} = 150$ cm
Challenging Questions
1. Let the mean of x1, x2, x3, ..., xn be A. What is the mean of:
A. (x1+k), (x2+k), (x3+k), ..., (xn+k)?
B. (x1-k), (x2-k), (x3-k), ..., (xn-k)?
C. kx1, kx2, kx3, ..., kxn?
2. The mean of 5 numbers is 18. If one number is excluded, the mean of the
remaining numbers is 16. Find the excluded number.
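Mean for FD
• The mean for a frequency distribution is computed as
$\bar{X} = \frac{\sum f_i X_i}{\sum f_i}$,
where $X_i$ is a data value (or class mark) and $f_i$ is its frequency.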
Median
▪ The median is the midpoint or halfway point in a data set.
▪ It is located at the $\frac{n+1}{2}$th observation.
▪ Before you can find this point, the data must be arranged in
ascending order.
▪ When the data set is ordered, it is called a data array.
Example
Consider the data: 4, 4, 6, 3, 2. Find the median.
Cont’d
• Consider the data: 50, 67, 24, 34, 78, 43. What is the median?
Solution
• Arranging in ascending order, we get: 24, 34, 43, 50, 67, 78. Here, n
(no. of observations) = 6.
• Since n is even, the median is the average of the two middle values
(the 3rd and 4th).
• Median = (43 + 50)/2 = 46.5
Median for FD
• First find the median class, the class in which the (n/2)th observation lies.
• Specifically,
• $\text{Median} = LCB + \left[\frac{\frac{n}{2} - CF}{f_m}\right] \cdot W$
where LCB = lower class boundary of the median class, W = width of the
median class, CF = cumulative frequency of the class preceding the median
class, and $f_m$ = frequency of the median class.
Class Activity
• Consider the following data and compute the median for the data.
Data Frequency
51-55 2
56-60 7
61-65 8
66-70 4
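Solution
• Here n = 21, so n/2 = 10.5 and the median class is 61-65, with
LCB = 60.5, CF = 9, $f_m$ = 8, and W = 5.
• $\text{Median} = 60.5 + \left(\frac{\frac{21}{2} - 9}{8}\right) \times 5 = 60.5 + 0.9375 = 61.4375$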
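As a check on the arithmetic, the median formula for a frequency distribution can be sketched in Python. The function name `grouped_median` is hypothetical, and the class boundaries are taken as the class limits minus and plus 0.5, as in the solution.

```python
def grouped_median(boundaries, freqs):
    """Median = LCB + ((n/2 - CF) / fm) * W for the class containing n/2."""
    n = sum(freqs)
    cf = 0  # cumulative frequency of the classes before the current one
    for (lcb, ucb), f in zip(boundaries, freqs):
        if cf + f >= n / 2:  # the (n/2)th observation falls in this class
            return lcb + ((n / 2 - cf) / f) * (ucb - lcb)
        cf += f
    raise ValueError("empty frequency distribution")

# Classes 51-55, 56-60, 61-65, 66-70 expressed as class boundaries.
boundaries = [(50.5, 55.5), (55.5, 60.5), (60.5, 65.5), (65.5, 70.5)]
freqs = [2, 7, 8, 4]

median = grouped_median(boundaries, freqs)
print(median)  # 61.4375
```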
Mode
• The value that occurs most often in a data set is called the mode.
• If a data set has two values that occur with the same greatest
frequency, both values are considered to be the mode and the data set
is said to be bimodal.
• If a data set has more than two values that occur with the same greatest
frequency, each value is used as the mode, and the data set is said to be
multimodal.
• When no data value occurs more than once, the data set is said to have
no mode.
• Note: Do not say that the mode is zero. That would be incorrect,
because in some data, such as temperature, zero can be an actual value.
Class Activity 1
• The data show the number of public libraries in a sample of eight
states. Find the mode.
Class Activity 2
• Since the category of soft drinks has the largest frequency, 52, we can
say that the mode or most typical drink is a soft drink.
Mode for FD
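• It can be computed as
$\text{Mode} = LCB + \frac{d_1}{d_1 + d_2} \cdot W$
where
W = width of the modal class
LCB = lower class boundary of the modal class
$d_1 = f_m - f_p$ and $d_2 = f_m - f_s$, where $f_m$ is the frequency of the
modal class, $f_p$ the frequency of the preceding class, and $f_s$ the
frequency of the succeeding class.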
Class Activity 1
Consider the following data and compute the mode for the data.
Solution
• $\text{Mode} = 60.5 + \frac{8 - 7}{(8 - 7) + (8 - 4)} \times 5 = 60.5 + \frac{1}{5} \times 5 = 61.5$
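The mode formula can be sketched the same way. The data below are the classes 51-55, 56-60, 61-65, 66-70 with frequencies 2, 7, 8, 4 from the median class activity; that this activity uses the same table is an assumption, made because the numbers in the solution match it. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as frequency 0 is a choice of this sketch.

```python
def grouped_mode(boundaries, freqs):
    """Mode = LCB + d1 / (d1 + d2) * W for the class with the largest frequency."""
    m = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    fm = freqs[m]
    fp = freqs[m - 1] if m > 0 else 0                  # preceding class frequency
    fs = freqs[m + 1] if m + 1 < len(freqs) else 0     # succeeding class frequency
    d1, d2 = fm - fp, fm - fs
    lcb, ucb = boundaries[m]
    return lcb + d1 / (d1 + d2) * (ucb - lcb)

boundaries = [(50.5, 55.5), (55.5, 60.5), (60.5, 65.5), (65.5, 70.5)]
freqs = [2, 7, 8, 4]

mode = grouped_mode(boundaries, freqs)
print(mode)  # 61.5
```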
Weighted Mean
Example: Weighted Mean
• A student received an A in English Composition I (3 credits), a C in Introduction to Psychology (3 credits), a B
in Biology I (4 credits), and a D in Physical Education (2 credits). Assuming A = 4 grade points, B = 3 grade
points, C = 2 grade points, D = 1 grade point, and F = 0 grade points, find the student’s grade point average.
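A weighted mean multiplies each value by its weight, sums the products, and divides by the sum of the weights. The Python sketch below applies this to the grade point average example, using the credits as weights; the variable names are illustrative only.

```python
# Grade points assigned to each letter grade, as stated in the example.
grade_points = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

# (course, grade, credits) for the four courses in the example.
courses = [
    ("English Composition I", "A", 3),
    ("Introduction to Psychology", "C", 3),
    ("Biology I", "B", 4),
    ("Physical Education", "D", 2),
]

# Weighted mean = sum(weight * value) / sum(weights), with credits as weights.
total_points = sum(grade_points[g] * credits for _, g, credits in courses)
total_credits = sum(credits for _, _, credits in courses)
gpa = total_points / total_credits

print(round(gpa, 2))  # 32 / 12, about 2.67
```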
Measure of Variation/Dispersion
• Dispersion is the measure of the variation of the items.
• Dispersion is a measure of extent to which the individual items vary.
• A measure of variation is used to describe the degree of spread of data.
• A measure of dispersion conveys information regarding the amount of
variability present in a set of data.
• If all the values are the same, there is no dispersion; if they are not all
the same, dispersion is present in the data.
• The amount of dispersion may be small, when the values, though
different, are close together.
Importance/ Purpose of Measuring Variation
• To test the reliability of an average
• To serve as a basis for control of variability
• To compare two or more series with regard to their variability
• To facilitate further statistical analysis
Cont’d
• For the spread or variability of a data set, four measures are
commonly used: range, variance, standard deviation, and coefficient
of variation.
Range
• The range is the highest value minus the lowest value. The symbol R
is used for the range.
• R = highest value − lowest value
Variance
Example
Find the variance and standard deviation for brand A paint data.
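Solution
The mean of the data is µ = 35.
Data (X)    X − µ    (X − µ)²
10          -25      625
60          25       625
50          15       225
30          -5       25
40          5        25
20          -15      225
$\sum (X - \mu)^2 = 1750$
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1750}{6} \approx 291.7$, $\sigma = \sqrt{291.7} \approx 17.1$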
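The brand A result can be checked with Python's standard `statistics` module. The functions `pvariance` and `pstdev` treat the data as a whole population, which matches the division by N = 6 in the solution.

```python
import statistics

# Brand A paint data from the solution table.
brand_a = [10, 60, 50, 30, 40, 20]

mu = statistics.mean(brand_a)                # population mean
variance = statistics.pvariance(brand_a)     # sum of squared deviations / N
std_dev = statistics.pstdev(brand_a)         # square root of the variance

print(mu, round(variance, 1), round(std_dev, 1))
```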
Class Activity
• Find the variance and standard deviation for brand B paint data.
Sample Standard Deviation
Example
• The number of public school teacher strikes in Pennsylvania for a
random sample of school years is shown. Find the sample variance and
the sample standard deviation.
9, 10, 14, 7, 8, 3
Solution
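A sample variance divides the sum of squared deviations from the sample mean by n − 1 rather than n. As a sketch, Python's `statistics.variance` and `statistics.stdev` apply exactly this divisor to the strike data.

```python
import statistics

# Number of teacher strikes for the sampled school years.
strikes = [9, 10, 14, 7, 8, 3]

sample_mean = statistics.mean(strikes)
sample_variance = statistics.variance(strikes)  # divides by n - 1
sample_std = statistics.stdev(strikes)          # square root of the variance

print(sample_mean, round(sample_variance, 1), round(sample_std, 2))
```

Here the mean is 8.5 and the squared deviations sum to 65.5, so the sample variance is 65.5/5 = 13.1.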
Variance and Standard Deviation for Grouped Data
• $\sigma^2 = \frac{\sum f_i (X_i - \mu)^2}{N}$, $\sigma = \sqrt{\sigma^2}$
• $S^2 = \frac{\sum f_i (X_i - \bar{X})^2}{n - 1}$, $S = \sqrt{S^2}$
Example
• Find the sample variance and the sample standard deviation for the
frequency distribution of the data shown. The data represent the
number of miles that 20 runners ran during one week.
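Solution
$\bar{X} = \frac{\sum f_i X_i}{\sum f_i} = \frac{490}{20} = 24.5$
$S^2 = \frac{\sum f_i (X_i - \bar{X})^2}{\sum f_i - 1} = \frac{1(8 - 24.5)^2 + \cdots + 2(38 - 24.5)^2}{20 - 1} = 68.7$
$S = \sqrt{68.7} \approx 8.3$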
Class Activities
• Consider a sample of marks of students randomly selected from a
class. Find
A. The average mark of the students.
B. The sample variance and standard deviation of the students' marks.
Mark     Number of students
0-5      2
5-10     4
10-15    6
15-20    4
20-25    2
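One way to approach this activity is to apply the grouped-data formulas with the class marks standing in for the individual marks. The Python sketch below does this; the variable names are illustrative, and using the midpoint of each class as its class mark is the only assumption.

```python
import math

# Classes and frequencies from the table of student marks.
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]
students = [2, 4, 6, 4, 2]

# Class mark (midpoint) of each class.
marks = [(lo + hi) / 2 for lo, hi in classes]
n = sum(students)

# Grouped mean: sum(f * X) / sum(f).
mean = sum(f * x for f, x in zip(students, marks)) / n

# Grouped sample variance: sum(f * (X - mean)^2) / (n - 1).
variance = sum(f * (x - mean) ** 2 for f, x in zip(students, marks)) / (n - 1)
std_dev = math.sqrt(variance)

print(mean, round(variance, 2), round(std_dev, 2))
```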
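Coefficient of Variation
The coefficient of variation (C.V.) expresses the standard deviation as a
percentage of the mean: $C.V. = \frac{s}{\bar{X}} \times 100\%$.
The distribution with the greater C.V. is considered more variable than the
other; the distribution with the smaller C.V. shows greater consistency,
homogeneity, and uniformity.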
Class Activity 1
The mean of the number of sales of cars over a 3-month period is 87,
and the standard deviation is 5. The mean of the commissions is $5225,
and the standard deviation is $773. Compare the variations of the two.
Class Activity 2
The mean speed for the five fastest wooden roller coasters is 69.16 miles per hour, and the variance
is 2.76. The mean height for the five tallest roller coasters is 177.80 feet, and the variance is 157.70.
Compare the variations of the two data sets.
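Both class activities can be approached by computing the coefficient of variation, the standard deviation divided by the mean and expressed as a percentage. In the Python sketch below the function name `coefficient_of_variation` is hypothetical. Because Class Activity 2 gives variances, the square root is taken first to obtain the standard deviations.

```python
import math

def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

# Class Activity 1: standard deviations are given directly.
cv_sales = coefficient_of_variation(5, 87)
cv_commissions = coefficient_of_variation(773, 5225)

# Class Activity 2: variances are given, so take square roots first.
cv_speed = coefficient_of_variation(math.sqrt(2.76), 69.16)
cv_height = coefficient_of_variation(math.sqrt(157.70), 177.80)

print(round(cv_sales, 1), round(cv_commissions, 1))
print(round(cv_speed, 1), round(cv_height, 1))
```

In each pair, the data set with the larger coefficient of variation is the more variable one: commissions vary more than sales, and heights vary more than speeds.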
Chapter 3
3. Elementary Probability
Definition
• Probability theory originated from the study of games of chance.
• Probability is a mathematical tool used to study randomness.
• It deals with the chance (the likelihood) of an event occurring.
• Even today, many probability problems are based on games of
chance, such as coin tossing, dice throwing, and playing cards.
Examples:
• Meteorologists use weather patterns to predict the probability of
rain.
• In epidemiology, probability theory is used to understand the relationship
between exposures and the risk of health effects.
SOME IMPORTANT TERMS & CONCEPTS
• Random experiment: an experiment whose outcome cannot be
predicted with certainty is called a random experiment.
• Outcome: the result of a single trial of a probability experiment.
• Sample space: the set of all possible outcomes of an experiment.
Cont’d
• Null event: an event having no sample point is called a null event and is denoted
by ∅.
• Exhaustive events: the total number of possible outcomes in any trial is known as
exhaustive events.
• E.g., in throwing a die the possible outcomes are 1, 2, 3, 4, 5, or 6.
Hence we have 6 exhaustive events in throwing a die.
• Two events are mutually exclusive (or disjoint) if they cannot occur at
the same time (i.e., they have no outcomes in common).
• In other words, the occurrence of one excludes
the occurrence of the other: if A and B are mutually exclusive events
and A happens, then B will not happen, and vice versa.
• E.g., in tossing a coin the events head and tail are mutually exclusive, since head
and tail cannot both appear at the same time.
Cont’d
• Equally likely events: two events are said to be equally likely if neither of
them is expected in preference to the other.
• E.g., in tossing a coin, the events head and tail have equal chances of
occurrence.
• Independent and dependent events: two events are said to be independent
when the actual happening of one does not influence in any way the
happening of the other.
• Events which are not independent are called dependent events.
• E.g., if we draw a card from a well-shuffled pack of cards and then draw another card from the rest of the
pack (containing 51 cards), the second draw is dependent on the first. If, on the other hand, we
replace the first card before drawing the second, the second draw is
independent of the first.
Class Activity 1
• Find the sample space for rolling two dice.
Solution
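The sample space for rolling two dice consists of all ordered pairs (first die, second die). A minimal Python sketch, assuming two ordinary six-sided dice, can list them with `itertools.product`.

```python
from itertools import product

# Faces of an ordinary six-sided die.
faces = range(1, 7)

# Every ordered pair (first die, second die) is one outcome.
sample_space = list(product(faces, repeat=2))

print(len(sample_space))  # 36
print(sample_space[:6])   # outcomes with a 1 on the first die
```

There are 6 × 6 = 36 outcomes, from (1, 1) to (6, 6).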
Tree Diagram
Counting Techniques
1. Addition Rule
• If an event $A_1$ can occur in $n_1$ ways, $A_2$ in $n_2$ ways, and so on, then the
event $A_1$ or $A_2$ or ... or $A_k$ can occur in $n_1 + n_2 + \dots + n_k$ ways, provided
the events are pairwise mutually exclusive.
Example
• In a certain class a class representative is to be chosen from 3 female and 4
male students. Count the ways in which a class representative can be
chosen.
Solution
• Here, a female representative can be chosen in 3 ways and a male
representative can be chosen in 4 ways. Therefore, the number of ways in
which a class representative can be chosen is 3 + 4 = 7.
2. Multiplication Rule
Class Activity 1
• A coin is tossed and a die is rolled. Find the number of outcomes for
the sequence of events.
Solution
• Since the coin can land either heads up or tails up and since the die can
land with any one of six numbers showing face up, there are 2 · 6 = 12
possibilities.
Class Activity 2
• There are four blood types, A, B, AB, and O. Blood can also be Rh+
and Rh−. Finally, a blood donor can be classified as either male or
female. How many different ways can a donor have his or her blood
labeled?
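Under the multiplication rule, the number of outcomes of a sequence of events is the product of the number of ways each event can occur. The Python sketch below enumerates the outcomes for both class activities with `itertools.product`; the labels used for the categories are illustrative.

```python
from itertools import product

# Class Activity 1: a coin toss followed by a die roll.
coin_and_die = list(product(["H", "T"], range(1, 7)))

# Class Activity 2: blood type, Rh factor, and sex of the donor.
blood_labels = list(product(["A", "B", "AB", "O"],
                            ["Rh+", "Rh-"],
                            ["male", "female"]))

print(len(coin_and_die))   # 2 * 6 = 12
print(len(blood_labels))   # 4 * 2 * 2 = 16
```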
3. Permutations
Class Activities 1
Solution
• 4!=24 different ways
Example
• It is required to seat 5 men and 4 women in a row so that the women occupy the even
places. How many such arrangements are possible?
Solution: There are 5 men and 4 women,
i.e. there are 9 positions.
The even positions are the 2nd, 4th, 6th, and 8th places.
These four places can be occupied by the 4 women in P(4, 4) = 4! = 4 × 3 × 2 × 1 = 24
ways.
The remaining 5 positions can be occupied by the 5 men in P(5, 5) = 5! = 5 × 4 × 3 × 2 × 1 = 120
ways.
Therefore, by the Fundamental Counting Principle,
the total number of seating arrangements = 24 × 120 = 2880.
• In how many different ways can the letters of the word 'OPTICAL' be
arranged so that the vowels always come together?
The word 'OPTICAL' has 7 letters. It has the vowels 'O', 'I', 'A', and
these 3 vowels should always come together. Hence these three vowels
can be grouped and treated as a single letter: PTCL(OIA).
We can therefore treat the word as having 5 letters, all of which are
different.
Number of ways to arrange these letters
= 5! = 5 × 4 × 3 × 2 × 1 = 120
All 3 vowels (OIA) are different.
Number of ways to arrange these vowels among themselves
= 3! = 3 × 2 × 1 = 6
Hence, the required number of ways
= 120 × 6 = 720
• In how many different ways can the letters of the word 'CORPORATION' be arranged so that the
vowels always come together?
• The word 'CORPORATION' has 11 letters. It has the vowels 'O', 'O', 'A', 'I', 'O', and these 5 vowels
should always come together. Hence these 5 vowels can be grouped and treated as a single
letter: CRPRTN(OOAIO).
We can therefore treat the word as having 7 letters. Among these 7 letters, 'R' occurs 2 times and the
rest are different.
Number of ways to arrange these letters
$= \frac{7!}{2!} = \frac{7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1}{2 \times 1} = 2520$
Among the 5 vowels (OOAIO), 'O' occurs 3 times and the rest are different.
Number of ways to arrange these vowels among themselves
$= \frac{5!}{3!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{3 \times 2 \times 1} = 20$
Hence, the required number of ways
= 2520 × 20 = 50400
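The two vowels-together results (720 for 'OPTICAL' and 50400 for 'CORPORATION') can be checked with a short Python sketch. The helper names `arrangements` and `vowels_together` are hypothetical; the method mirrors the worked solutions by treating the vowel group as one extra symbol.

```python
from collections import Counter
from math import factorial

def arrangements(letters):
    """Distinct orderings of a list with repeats: n! / (n1! * n2! * ...)."""
    total = factorial(len(letters))
    for count in Counter(letters).values():
        total //= factorial(count)
    return total

def vowels_together(word):
    """Arrangements of word in which all vowels stay next to each other."""
    vowels = [c for c in word if c in "AEIOU"]
    consonants = [c for c in word if c not in "AEIOU"]
    # Treat the whole vowel group as one extra symbol "*" among the consonants,
    # then multiply by the orderings of the vowels inside the group.
    return arrangements(consonants + ["*"]) * arrangements(vowels)

print(vowels_together("OPTICAL"))      # 720
print(vowels_together("CORPORATION"))  # 50400
```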
Examples
1. A business owner wishes to rank the top 3 locations selected from 5
locations for a business. How many different ways can she rank
them?
2. How many permutations of the letters can be made from the word
STATISTICS?
Solutions
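As a sketch of how both questions can be approached: ranking 3 of 5 locations is a permutation of 5 objects taken 3 at a time, and arranging the letters of STATISTICS is a permutation with repeated letters, 10! divided by the factorial of each letter's count. The Python below uses `math.perm`, which is available from Python 3.8.

```python
from collections import Counter
from math import factorial, perm

# 1. Ordered selection of 3 locations out of 5: P(5, 3) = 5 * 4 * 3.
rankings = perm(5, 3)

# 2. Arrangements of STATISTICS: S and T occur 3 times each, I occurs twice.
word = "STATISTICS"
distinct = factorial(len(word))
for count in Counter(word).values():
    distinct //= factorial(count)

print(rankings)  # 60
print(distinct)  # 10! / (3! * 3! * 2!) = 50400
```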
4. Combination
• A selection of distinct objects without regard to order is called a
combination.
• Combinations are used when the order or arrangement is not important,
as in the selecting process.
129
Example
130
Example
Approaches to Define Probability
1. Classical Probability
2. Empirical or Relative Frequency Probability
3. Subjective Probability
1. Classical Probability
▪ Classical probability uses sample spaces to determine the numerical
probability that an event will happen.
▪ Classical probability assumes that all outcomes in the sample space are
equally likely to occur.
▪ For example, when a single die is rolled, each outcome has the same
probability of occurring. Since there are six outcomes, each outcome has
a probability of 1/6.
▪ When a card is selected from an ordinary deck of 52 cards, you assume
that the deck has been shuffled, and each card has the same probability of
being selected. In this case, it is 1/52.
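The die and card examples above can be reproduced with a short Python sketch. The function name `classical_probability` is a hypothetical helper; it simply divides the number of outcomes in the event by the number of outcomes in the sample space, which is valid only under the equally-likely assumption stated above.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(E) = n(E) / n(S), assuming every outcome in S is equally likely."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

# A single fair die: six equally likely outcomes.
die = [1, 2, 3, 4, 5, 6]
p_four = classical_probability({4}, die)
p_even = classical_probability({2, 4, 6}, die)

# An ordinary shuffled deck: 13 ranks x 4 suits = 52 equally likely cards.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]
p_one_card = classical_probability({("Q", "hearts")}, deck)

print(p_four, p_even, p_one_card)
```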
2. Empirical Probability
• The difference between classical and empirical probability is that
classical probability assumes that certain outcomes are equally likely
(such as the outcomes when a die is rolled), while empirical
probability relies on actual experience to determine the likelihood of
outcomes.
Class Activity 1
• Suppose, for example, that a researcher for the American Automobile Association (AAA)
asked 50 people who plan to travel over the Thanksgiving holiday how they will get to
their destination. The results can be categorized in a frequency distribution as shown.
Find:
A. The probability of selecting a person who is driving over the Thanksgiving holiday.
B. The probability that the person will travel by train or bus over the Thanksgiving holiday.
Solutions
A. The probability of selecting a person who is driving is 41/50 = 0.82, since
41 out of the 50 people said that they were driving.
B. P(E) = f/n = 3/50 = 0.06, since 3 out of the 50 people said that they would travel by train or bus.
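The relative-frequency rule P(E) = f/n used in these solutions can be sketched in Python. Only the counts 41 (driving) and 3 (train or bus) out of 50 come from the solutions above; the "other" category is an assumed label for the remaining 6 respondents and is included only so the frequencies sum to 50.

```python
from fractions import Fraction

# Frequency distribution of travel method.
# "other" = 50 - 41 - 3 is an inferred remainder, not a category named in the text.
frequencies = {"drive": 41, "train_or_bus": 3, "other": 6}
n = sum(frequencies.values())

def empirical_probability(category):
    """P(E) = f / n, the observed relative frequency of the category."""
    return Fraction(frequencies[category], n)

p_drive = empirical_probability("drive")
p_train_or_bus = empirical_probability("train_or_bus")
print(p_drive, float(p_drive))
print(p_train_or_bus, float(p_train_or_bus))
```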
Class Activity 2
• In a sample of 50 people, 21 had type O blood, 22 had type A blood, 5
had type B blood, and 2 had type AB blood. Set up a frequency
distribution and find the following probabilities.
A. A person has type O blood.
B. A person has type A or type B blood.
C. A person has neither type A nor type O blood.
D. A person does not have type AB blood.
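A possible way to work this activity is sketched below in Python. It builds the frequency distribution from the counts given above and applies P(E) = f/n; the helper name `prob` is hypothetical.

```python
from fractions import Fraction

# Frequency distribution of blood types in the sample of 50 people.
blood = {"O": 21, "A": 22, "B": 5, "AB": 2}
n = sum(blood.values())

def prob(*types):
    """Empirical probability that a person has one of the listed blood types."""
    return Fraction(sum(blood[t] for t in types), n)

p_a = prob("O")            # A. type O
p_b = prob("A", "B")       # B. type A or type B (mutually exclusive, so counts add)
p_c = 1 - prob("A", "O")   # C. neither type A nor type O
p_d = 1 - prob("AB")       # D. not type AB (complement rule)

print(p_a, p_b, p_c, p_d)
```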
Class Activities 3
3. Subjective Probability
• Subjective probability uses a probability value based on an educated guess or estimate,
employing opinions and inexact information.
• In subjective probability, a person or group makes an educated guess at the chance that an
event will occur. This guess is based on the person’s experience and evaluation of a
solution.
• For example, a sportswriter may say that there is a 70% probability that Bahir Dar
Kenema will win the Coffee Club next game. A physician might say that, on the basis of
her diagnosis, there is a 30% chance the patient will need an operation. A seismologist
might say there is an 80% probability that an earthquake will occur in a certain area.
• These are only a few examples of how subjective probability is used in everyday life.
• All three types of probability (classical, empirical, and subjective) are used to solve a
variety of problems in business, engineering, and other fields.
Probability Rules
Cont’d
• The above rules can be extended to three or more events. For three
mutually exclusive events A, B, and C, P(A or B or C) = P(A) + P(B) + P(C).
Class Activity 1
Class Activity 2
1. In the United States there are 59 different species of mammals that
are endangered, 75 different species of birds that are endangered, and
68 species of fish that are endangered. If one animal is selected at
random, find the probability that it is either a mammal or a fish.
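Because an animal cannot be both a mammal and a fish, the two events are mutually exclusive and their probabilities add. A minimal Python sketch of that calculation follows, assuming the animal is drawn from these three groups of endangered species only.

```python
from fractions import Fraction

# Numbers of endangered species, as given in the activity.
species = {"mammal": 59, "bird": 75, "fish": 68}
total = sum(species.values())

# Addition rule for mutually exclusive events: P(A or B) = P(A) + P(B).
p_mammal = Fraction(species["mammal"], total)
p_fish = Fraction(species["fish"], total)
p_mammal_or_fish = p_mammal + p_fish

print(p_mammal_or_fish, round(float(p_mammal_or_fish), 4))
```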
Cont’d
Example
• In a hospital unit there are 8 nurses and 5 physicians; 7 nurses and 3
physicians are females. If a staff person is selected, find the probability
that the subject is a nurse or a male.
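Here the events "nurse" and "male" are not mutually exclusive, since a male nurse belongs to both. The Python sketch below applies the general addition rule P(A or B) = P(A) + P(B) − P(A and B), a standard result assumed here. The counts of male staff are derived from the figures given (8 − 7 male nurses, 5 − 3 male physicians), and the helper name `prob` is hypothetical.

```python
from fractions import Fraction

# Staff counts by (role, sex); male counts are derived by subtraction.
staff = {
    ("nurse", "female"): 7,
    ("nurse", "male"): 1,        # 8 nurses - 7 female nurses
    ("physician", "female"): 3,
    ("physician", "male"): 2,    # 5 physicians - 3 female physicians
}
total = sum(staff.values())

def prob(predicate):
    """Probability that a randomly selected staff member satisfies the predicate."""
    count = sum(n for (role, sex), n in staff.items() if predicate(role, sex))
    return Fraction(count, total)

p_nurse = prob(lambda role, sex: role == "nurse")
p_male = prob(lambda role, sex: sex == "male")
p_nurse_and_male = prob(lambda role, sex: role == "nurse" and sex == "male")

# Subtract the overlap so male nurses are not counted twice.
p_nurse_or_male = p_nurse + p_male - p_nurse_and_male
print(p_nurse_or_male)
```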
Cont’d
Homework
The Multiplication Rules
• The multiplication rules can be used to find the probability of two or more events
that occur in sequence.
• Two events A and B are independent events if the fact that A occurs does not
affect the probability of B occurring.
• For example, if you toss a coin and then roll a die, you can find the probability of
getting a head on the coin and a 4 on the die. These two events are said to be
independent since the outcome of the first event (tossing a coin) does not affect
the probability of the outcome of the second event (rolling a die).
• Here are other examples of independent events:
• Rolling a die and getting a 6, and then rolling a second die and getting a 3.
• Drawing a card from a deck and getting a queen, replacing it, and drawing a
second card and getting a queen.
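For independent events, the multiplication rule is P(A and B) = P(A) · P(B). The Python sketch below applies it to the three examples just listed, assuming a fair coin, fair dice, and an ordinary 52-card deck with 4 queens.

```python
from fractions import Fraction

# Coin toss then die roll: P(head and 4) = P(head) * P(4).
p_head_and_four = Fraction(1, 2) * Fraction(1, 6)

# Two dice: P(6 on the first and 3 on the second).
p_six_then_three = Fraction(1, 6) * Fraction(1, 6)

# Two queens with replacement: the deck is restored before the second draw,
# so the second probability is unchanged.
p_queen = Fraction(4, 52)
p_two_queens = p_queen * p_queen

print(p_head_and_four, p_six_then_three, p_two_queens)
```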
Example
Class Activities 2
Conditional Probability
• When the outcome or occurrence of the first event affects the outcome
or occurrence of the second event in such a way that the probability is
changed, the events are said to be dependent events.
• Here are some examples of dependent events:
1. Drawing a card from a deck, not replacing it, and then drawing a
second card
2. Selecting a ball from an urn, not replacing it, and then selecting a
second ball
3. Having high grades and getting a scholarship
4. Parking in a no-parking zone and getting a parking ticket
• The conditional probability of an event B in relationship to an event A
is the probability that event B occurs after event A has already
occurred.
• The notation for conditional probability is P(B|A). This notation does
not mean that B is divided by A; rather, it means the probability that
event B occurs given that event A has already occurred.
Class Activities 1
Class Activities 2
• A box contains black chips and white chips. A person selects two chips
without replacement. If the probability of selecting a black chip and a
white chip is 15 /56 and the probability of selecting a black chip on the
first draw is 3 /8 , find the probability of selecting the white chip on
the second draw, given that the first chip selected was a black chip.
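This activity can be worked with the standard conditional probability formula P(B|A) = P(A and B) / P(A), which is assumed here. A short Python sketch with the two given probabilities:

```python
from fractions import Fraction

p_black_and_white = Fraction(15, 56)  # P(black first and white second)
p_black_first = Fraction(3, 8)        # P(black on the first draw)

# P(white second | black first) = P(black and white) / P(black first)
p_white_given_black = p_black_and_white / p_black_first
print(p_white_given_black, round(float(p_white_given_black), 3))
```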
Total Probability Theorem
• Bayes’ theorem is a way to figure out conditional probability.
• Conditional probability is the probability of an event happening, given
that it has some relationship to one or more other events.
Cont’d
Bayes Theorem
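• Let B1, B2, …, Bk be a partition of the sample space S, and let A be an event associated with S.
Then we may write,
$$P(B_i \mid A) = \frac{P(B_i)\,P(A \mid B_i)}{\sum_{j=1}^{k} P(B_j)\,P(A \mid B_j)}, \qquad i = 1, 2, \ldots, k.$$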
Example
1. Three machines A, B, and C produce 50%, 30% and 20%,
respectively, of the total output. The percentages of defective output of
these machines are 3%, 4% and 5%, respectively. If an item is selected
at random,
A. What is the probability that the selected item is defective?
B. If the selected item is defective, what is the probability that it was
produced by machine B?
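A sketch of this example in Python follows. Part A uses the total probability theorem, P(D) = Σ P(machine) · P(D | machine). Part B is read here as a conditional question, the probability that the item came from machine B given that it is defective, which is an interpretation of the wording; it is answered with Bayes' theorem. Variable names are illustrative.

```python
from fractions import Fraction

# Share of total output produced by each machine (prior probabilities).
prior = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
# Probability that an item is defective, given the machine that made it.
defect_rate = {"A": Fraction(3, 100), "B": Fraction(4, 100), "C": Fraction(5, 100)}

# Part A: total probability of a defective item.
p_defective = sum(prior[m] * defect_rate[m] for m in prior)

# Part B: Bayes' theorem, P(B | defective) = P(B) * P(D | B) / P(D).
p_b_given_defective = prior["B"] * defect_rate["B"] / p_defective

print(float(p_defective))
print(p_b_given_defective, round(float(p_b_given_defective), 4))
```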
Homework
1. A bag contains 6 white and 4 red balls. Another bag contains 5 white
and 7 red balls. Two balls are transferred from the first bag to the
second bag and then one ball is taken from the second bag. What is
the probability that the ball drawn from the second bag is red?
2. For three persons A, B, and C, the chances of being selected as
manager of the firm are in the ratio 4:1:2, respectively. The
respective probabilities for them to introduce a radical change in
marketing strategy are 0.4, 0.8 and 0.6. If the change does take place,
find the probability that it is due to the appointment of person A.
Chapter 4
4. Probability Distribution
&
Random Variables
Random Variables
• A random variable is a variable whose values are determined by
chance.
• Discrete variables have a finite number of possible values or an
infinite number of values that can be counted. The word counted
means that they can be enumerated using the numbers 1, 2, 3, etc.
• Continuous random variables are obtained from data that can be
measured rather than counted.
• Continuous random variables can assume an infinite number of values
and can be decimal and fractional values.
Cont’d
• A discrete probability distribution consists of the values a random
variable can assume and the corresponding probabilities of the values.
The probabilities are determined theoretically or by observation.
Class Activities
Class Activities 1
Class Activities 2
Homework
Some Discrete Probability Distribution
Binomial Distribution
Class Activities
Class Activities 1
Class Activities 2
Class Activities 3
Class Activities 4
Poisson Distribution
• A discrete probability distribution that is useful when n is large and p
is small and when the independent variables occur over a period of
time is called the Poisson distribution.
Class Activities 1
1. A sales firm receives, on average, 3 calls per hour on its toll-free
number. For any given hour, find the probability that it will receive
the following.
A. At most 3 calls
B. At least 3 calls
C. 5 or more calls
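This activity can be sketched in Python using the standard Poisson formula P(X = x) = e^(−λ) λ^x / x!, assumed here with λ = 3 calls per hour. The helper names `poisson_pmf` and `poisson_cdf` are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), the sum of the pmf from 0 up to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 3  # average number of calls per hour

p_at_most_3 = poisson_cdf(3, lam)        # A. P(X <= 3)
p_at_least_3 = 1 - poisson_cdf(2, lam)   # B. P(X >= 3) = 1 - P(X <= 2)
p_5_or_more = 1 - poisson_cdf(4, lam)    # C. P(X >= 5) = 1 - P(X <= 4)

print(f"{p_at_most_3:.4f} {p_at_least_3:.4f} {p_5_or_more:.4f}")
```

Parts B and C use the complement rule, since summing the upper tail directly would require infinitely many terms.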
Class Activities 2
1. If there are 200 typographical errors randomly distributed in a 500
page manuscript, find the probability that a given page contains
exactly 3 errors.
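A possible Python sketch for this activity is shown below. It assumes the number of errors per page follows a Poisson distribution whose mean is the average number of errors per page, λ = 200/500 = 0.4.

```python
from math import exp, factorial

errors, pages = 200, 500
lam = errors / pages  # mean number of errors per page

# P(X = 3) = e^(-lam) * lam^3 / 3!
x = 3
p_three_errors = exp(-lam) * lam ** x / factorial(x)
print(f"{p_three_errors:.4f}")
```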
Class Activities 3
Normal Distributions
• If a random variable has a probability distribution whose graph is
continuous, bell-shaped, and symmetric, it is called a normal
distribution. The graph is called a normal distribution curve.
Normal and Skewed Distribution
The Standard Normal Distribution
• The standard normal distribution is a normal distribution with a mean
of 0 and a standard deviation of 1.
• All normally distributed variables can be transformed into the standard
normally distributed variable by using the formula for the standard
score: $z = \dfrac{X - \mu}{\sigma}$
Class Activities 1
1. Find the area under the standard normal distribution curve to the left
of z = 1.73.
2. Find the area under the standard normal distribution curve to the
right of z = −1.24.
3. Find the area under the standard normal distribution curve between z
= 1.62 and z = −1.35.
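In place of a printed z table, these areas can be computed with `statistics.NormalDist` from the Python standard library, whose `cdf` method returns the area to the left of a value. A minimal sketch:

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # the standard normal distribution

# 1. Area to the left of z = 1.73.
area_left = std_normal.cdf(1.73)

# 2. Area to the right of z = -1.24 (total area is 1).
area_right = 1 - std_normal.cdf(-1.24)

# 3. Area between z = -1.35 and z = 1.62.
area_between = std_normal.cdf(1.62) - std_normal.cdf(-1.35)

print(f"{area_left:.4f} {area_right:.4f} {area_between:.4f}")
```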
Homework
1. Find the probability for each. (Assume this is a standard normal
distribution.)
A. P(0 < z < 2.53)
B. P(z < 1.73)
C. P(z > 1.98)
Class Activities 2
1. Each month, an American household generates an average of 28
pounds of newspaper for garbage or recycling. Assume the variable
is approximately normally distributed and the standard deviation is 2
pounds. If a household is selected at random, find the probability of
its generating
A. Between 27 and 31 pounds per month
B. More than 30.2 pounds per month
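This activity can be sketched in Python by converting each value to a standard score, z = (X − μ)/σ, and reading the area from the standard normal distribution. The sketch uses `statistics.NormalDist` instead of a z table, so the results may differ from table answers in the last decimal place.

```python
from statistics import NormalDist

mu, sigma = 28, 2  # mean and standard deviation in pounds per month
std_normal = NormalDist()

def z_score(x):
    """Standard score of x for the newspaper distribution."""
    return (x - mu) / sigma

# A. P(27 < X < 31): z runs from -0.5 to 1.5.
p_between = std_normal.cdf(z_score(31)) - std_normal.cdf(z_score(27))

# B. P(X > 30.2): z = 1.1, so take the area to the right.
p_more = 1 - std_normal.cdf(z_score(30.2))

print(f"{p_between:.4f} {p_more:.4f}")
```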
Chapter 5
Introduction
• Researchers are interested in answering many types of questions.
• For example,
✓ Automobile manufacturers are interested in determining whether a new type of seat belt will reduce the
severity of injuries caused by accidents.
✓ A physician might want to know whether a new medication will lower a person’s blood pressure.
✓ An educator might wish to see whether a new teaching technique is better than a traditional one.
✓ A retail merchant might want to know whether the public prefers a certain color in a new line of fashion.
Cont’d
• These types of questions can be addressed through statistical
hypothesis testing.
• It is a decision-making process for evaluating claims about a
population.
• In hypothesis testing, the researcher must
✓ define the population under study,
✓ state the particular hypotheses that will be investigated,
✓ give the significance level,
✓ select a sample from the population, collect the data, perform
the calculations required for the statistical test, and
✓ reach a conclusion.
Cont’d
• We will introduce two forms of statistical inference in this section,
each one representing a different way of using the information
obtained in the sample to draw conclusions about the population.
• These forms are estimation and hypothesis testing.
5.1. Estimation
• Estimation is the process of estimating the value of a parameter from information obtained
from a sample.
• The most fundamental step in estimating a population parameter is
collecting a sample of data from the population using simple random sampling.
• From these simple random samples, we can calculate sample statistics such as the
sample mean, sample variance, and sample proportion to estimate population
parameters such as the population mean, population variance, and population
proportion, respectively.
• A formula that uses sample data to calculate a single number (a sample statistic)
that can be used as an estimate of a population parameter is called an estimator.
• A specific observed value of the statistic is called an estimate, e.g., $\bar{X} = 10$, $S^2 = 20$,
etc.
• There are two types of estimation: Point Estimation and Confidence Interval
Estimation.
1. Point Estimation
• It deals with the problem of obtaining a single sample value to
estimate the population parameter.
• The sample mean ($\bar{X}$) and sample variance ($S^2$) are point estimators of the
population mean ($\mu$) and population variance ($\sigma^2$), respectively.
• Suppose a college president wishes to estimate the average age of
students attending classes this semester.
• The president could select a random sample of 100 students and find the
average age of these students, say, 22.3 years.
• From the sample mean, the president could infer that the average age of
all the students is 22.3 years. This type of estimate is called a point
estimate.
Properties of Good Estimator
1. The estimator should be an unbiased estimator. That is, the expected
value or the mean of the estimates obtained from samples of a given size
is equal to the parameter being estimated.
2. The estimator should be a relatively efficient estimator. That is, of all
the statistics that can be used to estimate a parameter, the relatively
efficient estimator has the smallest variance.
3. The estimator should be consistent. For a consistent estimator, as sample
size increases, the value of the estimator approaches the value of the
parameter estimated.
4. The estimator should be sufficient. That is, it uses all of the sample information
in estimating the population parameter.
2. Confidence Interval Estimation
• The sample mean will be, for the most part, somewhat different from the population mean due to
sampling error.
• Therefore, you might ask a second question: How good is a point estimate? The answer is that there is
no way of knowing how close a particular point estimate is to the population mean.
• In an interval estimate, the parameter is specified as being between two values. For example, an interval
estimate for the average age of all students might be 21.9 < 𝜇 < 22.7, or 22.3 ± 0.4
• An interval estimate of a parameter is an interval or a range of values used to estimate the parameter.
This estimate may or may not contain the value of the parameter being estimated.
• A degree of confidence (usually a percent) must be assigned before an interval estimate is made.
• For instance, you may wish to be 95% confident that the interval contains the true population mean.
• Another question then arises. Why 95%? Why not 99 or 99.5%?
Cont’d
• If you desire to be more confident, such as 99% confident, then you must make the
interval larger. For example, a 99% confidence interval for the mean age of college
students might be 21.7 < 𝜇 < 22.9, or 22.3 ± 0.6 . Hence, a tradeoff occurs.
• To be more confident that the interval contains the true population mean, you must
make the interval wider.
• The confidence level of an interval estimate of a parameter is the probability that the
interval estimate will contain the parameter, assuming that a large number of samples
are selected and that the estimation process on the same parameter is repeated.
• A confidence interval is a specific interval estimate of a parameter determined by
using data obtained from a sample and by using the specific confidence level of the
estimate.
• Intervals constructed in this way are called confidence intervals.
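The repeated-sampling meaning of the confidence level described above can be illustrated with a small simulation. This Python sketch is not from the slides: the population mean of 22.3 echoes the student-age example, while the population standard deviation of 2.0, the sample size of 100, and the number of trials are assumptions chosen for illustration. It draws many samples, builds a 95% interval from each, and reports the proportion of intervals that contain the true mean.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA, N = 22.3, 2.0, 100      # assumed population mean, sd, and sample size
Z = NormalDist().inv_cdf(0.975)    # about 1.96 for 95% confidence
margin = Z * SIGMA / N ** 0.5      # half-width of each interval (sigma known)

trials = 2000
covered = 0
for _ in range(trials):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    # Does this sample's interval contain the true population mean?
    if x_bar - margin < MU < x_bar + margin:
        covered += 1

coverage = covered / trials
print(f"Proportion of intervals containing the mean: {coverage:.3f}")
```

The proportion should come out close to 0.95. Any single interval either contains the mean or does not; the 95% describes the procedure over many samples.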
Cases in Confidence Interval
• When the population standard deviation (𝜎) is known and n is large (n ≥ 30), the population
mean is estimated by:
$$\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
• When the population standard deviation (𝜎) is unknown and n is large (n ≥ 30), the population
mean is estimated by:
$$\bar{X} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
• When the population standard deviation (𝜎) is unknown and n is small (n < 30), the population
mean is estimated by:
$$\bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$
Note: Even if the sample size is small (n < 30), use the Z-distribution when the population is
normally distributed and 𝜎 is known.
Cont’d
Cont’d
• The factors that determine the width of a confidence interval are:
1. The sample size, n.
2. The proportion or variability in the population.
3. The desired level of confidence
Example
• The Dean of the Business School wants to estimate the mean number of hours worked per week by
students. A sample of 49 students showed a mean of 24 hours with a standard deviation of 4 hours. What
is the population mean?
Solution:
• The value of the population mean is not known. Our best estimate of this value is the sample mean of 24.0
hours. This value is called a point estimate. To find the 95 percent confidence interval for the population
mean, we can compute it as
$$\bar{X} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} = 24 \pm 1.96 \cdot \frac{4}{\sqrt{49}} = 24 \pm 1.12$$
• The confidence limits range from 22.88 to 25.12. About 95% of similarly constructed intervals
would include the population parameter.
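The confidence limits in this example can be reproduced with a few lines of Python. The sketch assumes the large-sample interval (sample mean) ± z · s/√n with the sample standard deviation standing in for 𝜎, since n = 49 ≥ 30; the critical value comes from `statistics.NormalDist` and not from a table.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 49, 24.0, 4.0      # sample size, sample mean, sample standard deviation
confidence = 0.95

# Two-tailed critical value: leave (1 - confidence) / 2 in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * s / sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.2f} to {upper:.2f}")
```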
5.2. Hypothesis Testing
• Every hypothesis-testing situation begins with the statement of a
hypothesis.
• A statistical hypothesis is a conjecture about a population parameter.
This conjecture may or may not be true.
A. The null hypothesis (H0) is a statistical hypothesis that states that there
is no difference between a parameter and a specific value, or that there
is no difference between two parameters.
B. The alternative hypothesis (research hypothesis) (H1) is a statistical
hypothesis that states the existence of a difference between a parameter
and a specific value, or states that there is a difference between two
parameters.
Cont’d
The three methods used to test hypotheses are
1. The traditional method
2. The P-value method
3. The confidence interval method
Illustration
• Situation A: A medical researcher is interested in finding out whether a
new medication will have any undesirable side effects. The researcher is
particularly concerned with the pulse rate of the patients who take the
medication. The researcher knows that the mean pulse rate for the
population under study is 82 beats per minute .
• Will the pulse rate increase, decrease, or remain unchanged after a patient
takes the medication?
• The hypotheses for this situation are
H0: μ = 82 and H1: μ ≠ 82
• This test is called a two-tailed test since the possible side effects of the
medicine could be to raise or lower the pulse rate.
• Situation B: A chemist invents an additive to increase the life of an
automobile battery. If the mean lifetime of the automobile battery
without the additive is 36 months, then her hypotheses are
H0: μ = 36 and H1: μ > 36
• In this situation, the chemist is interested only in increasing the lifetime
of the batteries, so her alternative hypothesis is that the mean is greater
than 36 months.
• This test is called right-tailed, since the interest is in an increase only.
• Situation C: A contractor wishes to lower heating bills by using a
special type of insulation in houses. If the average of the monthly
heating bills is $78, her hypotheses about heating costs with the use of
insulation are
H0: μ = $78 and H1: μ < $78
• This test is a left-tailed test, since the contractor is interested only in
lowering heating costs.
Class Activities
Types of Errors
• A type I error occurs if you reject the null hypothesis when it is true.
• A type II error occurs if you do not reject the null hypothesis when it
is false.
Illustration of Hypothesis Testing
• The hypothesis-testing situation can be likened to a jury trial. In a jury
trial, there are four possible outcomes.
• The defendant is either guilty or innocent, and he or she will be
convicted or acquitted.
• Now the hypotheses are
H0: The defendant is innocent.
H1: The defendant is not innocent (i.e., guilty).
• Next, the evidence is presented in court by the prosecutor/attorney, and
based on this evidence, the jury decides the verdict, guilty or not guilty.
• If the defendant is convicted but he or she did not commit the crime,
then a type I error has been committed. See block 1.
• In other words, a type I error means an innocent person is put in jail. On
the other hand, if the defendant is convicted and he or she has
committed the crime, then a correct decision has been made. See block
2.
• If the defendant is acquitted and he or she did not commit the crime, a
correct decision has been made by the jury. See block 3.
• However, if the defendant is acquitted and he or she did commit the
crime, then a type II error has been made. See block 4.
• In other words, a type II error is letting a guilty person go free.
• The decision of the jury does not prove that the defendant did or did not commit
the crime.
• The decision is based on the evidence presented. If the evidence is strong enough,
the defendant will be convicted in most cases. If the evidence is weak, the
defendant will be acquitted in most cases. Nothing is proved absolutely.
• Likewise, the decision to reject or not reject the null hypothesis does not prove
anything.
• The only way to prove anything statistically is to use the entire population, which,
in most cases, is not possible.
• The decision, then, is made on the basis of probabilities. That is, when there is a
large difference between the mean obtained from the sample and the hypothesized
mean, the null hypothesis is probably not true. The question is, How large a
difference is necessary to reject the null hypothesis?
Basic Terms in Hypothesis testing
• A statistical test uses the data obtained from a sample to make a
decision about whether the null hypothesis should be rejected.
• The numerical value obtained from a statistical test is called the test
value or test statistic.
• The critical or rejection region is the range of test values that
indicates that there is a significant difference and that the null
hypothesis should be rejected.
• The noncritical or nonrejection region is the range of test values that
indicates that the difference was probably due to chance and that the
null hypothesis should not be rejected.
• The critical value separates the critical region from the noncritical region.
• A one-tailed test indicates that the null hypothesis should be rejected when the
test value is in the critical region on one side of the mean. A one-tailed test is
either a right-tailed test or a left-tailed test, depending on the direction of the
inequality of the alternative hypothesis.
• In a two-tailed test, the null hypothesis should be rejected when the
test value is in either of the two critical regions.
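The critical values and rejection regions described above can be sketched in Python for a z test. The function names `z_critical` and `in_critical_region` are hypothetical helpers, and the critical values are taken from `statistics.NormalDist` instead of a printed table.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical value(s) of a z test; `tail` is "left", "right", or "two"."""
    std = NormalDist()
    if tail == "two":
        c = std.inv_cdf(1 - alpha / 2)   # alpha is split between both tails
        return (-c, c)
    if tail == "right":
        return std.inv_cdf(1 - alpha)
    if tail == "left":
        return std.inv_cdf(alpha)
    raise ValueError("tail must be 'left', 'right', or 'two'")

def in_critical_region(z, alpha, tail):
    """True if the test value z falls in the rejection region."""
    crit = z_critical(alpha, tail)
    if tail == "two":
        return z < crit[0] or z > crit[1]
    if tail == "right":
        return z > crit
    return z < crit

print(z_critical(0.05, "two"), z_critical(0.05, "right"), z_critical(0.05, "left"))
print(in_critical_region(2.10, 0.05, "two"))    # beyond +1.96, so reject H0
print(in_critical_region(1.70, 0.05, "two"))    # inside the noncritical region
print(in_critical_region(1.70, 0.05, "right"))  # beyond +1.645, so reject H0
```

The same test value of 1.70 leads to different decisions in the two-tailed and right-tailed tests because the one-tailed test places all of α in a single tail.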
Steps in Hypothesis Testing
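1. State the hypotheses and identify the claim.
2. Fix the level of significance (α); usually 5%, 1%, or 2% is taken.
3. Compute the test value using one of
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \quad\text{or}\quad z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \quad\text{or}\quad t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$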
Examples
1. The mean life of a sample of 200 tyres taken from a lot is found to be 40,000 km. Past experience shows
that the standard deviation of the life of tyres in the lot is 3,200 km.
A. Construct a 95% confidence interval within which the mean life of tyres in the lot is expected to lie.
B. Is it reasonable to suppose that the mean life of tyres in the lot is 41,000 km? (Use the 5% level of significance.)
2. A soap manufacturing company was distributing a particular brand through a large number of retail
shops. Before a heavy advertising campaign, the mean sales per week per shop were 140 dozen. After the
campaign, a sample of 49 shops was taken and the mean sales were found to be 147 dozen with a standard deviation of 16.
A. Construct a 95% confidence interval for the mean weekly sales per shop.
B. Can you consider the advertisement effective?
3. An automobile tyre manufacturer claims that the average life of a particular grade of tyres is more than
20,000 km when used under normal driving conditions. A random sample of 16 tyres was tested, and a mean and
standard deviation of 22,000 km and 5,000 km, respectively, were computed.
A. Construct a 95% confidence interval for the average life of this grade of tyre.
B. At the 5% level of significance, decide whether the manufacturer's claim is true.
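Examples 1 and 2 both have large samples, so they can be worked with the z statistic. The Python sketch below is one possible treatment, not the slides' own solution: the helper names `z_interval` and `z_statistic` are hypothetical, Example 1 is treated as a two-tailed test, and Example 2 is treated as a right-tailed test at the 5% level (the example does not state a level, so that choice is an assumption). Example 3 has n = 16 and an unknown population standard deviation, so it calls for the t distribution, which the Python standard library does not provide; it is left out of the sketch.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sd, n, confidence=0.95):
    """Large-sample confidence interval for a mean: x_bar +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return x_bar - margin, x_bar + margin

def z_statistic(x_bar, mu0, sd, n):
    """Test value z = (x_bar - mu0) / (sd / sqrt(n))."""
    return (x_bar - mu0) / (sd / sqrt(n))

# Example 1: H0: mu = 41,000 vs H1: mu != 41,000 (two-tailed, alpha = 0.05).
ci_tyres = z_interval(40_000, 3_200, 200)
z_tyres = z_statistic(40_000, 41_000, 3_200, 200)
reject_tyres = abs(z_tyres) > NormalDist().inv_cdf(0.975)

# Example 2: H0: mu = 140 vs H1: mu > 140 (right-tailed, alpha = 0.05 assumed).
ci_soap = z_interval(147, 16, 49)
z_soap = z_statistic(147, 140, 16, 49)
reject_soap = z_soap > NormalDist().inv_cdf(0.95)

print(ci_tyres, round(z_tyres, 2), reject_tyres)
print(ci_soap, round(z_soap, 2), reject_soap)
```

In Example 1 the hypothesized value of 41,000 km lies outside the 95% interval, which agrees with rejecting H0 in the two-tailed test at the 5% level.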
Homework
1. A researcher believes that the mean age of medical doctors in a large hospital system is older than the
average age of doctors in the United States, which is 46. Assume the population standard deviation is 4.2
years. A random sample of 30 doctors from the system is selected, and the mean age of the sample is
48.6. Test the claim at α = 0.05.
2. The Medical Rehabilitation Education Foundation reports that the average cost of rehabilitation for
stroke victims is $24,672. To see if the average cost of rehabilitation is different at a particular hospital,
a researcher selects a random sample of 35 stroke victims at the hospital and finds that the average cost
of their rehabilitation is $26,343. The standard deviation of the population is $3251. At α = 0.01, can it
be concluded that the average cost of stroke rehabilitation at a particular hospital is different from
$24,672?
3. A researcher claims that the average wind speed in a certain city is 8 miles per hour. A sample of 32 days
has an average wind speed of 8.2 miles per hour. The standard deviation of the population is 0.6 mile per
hour. At α = 0.05, is there enough evidence to reject the claim?