
Course1 STA 112 Notes 2021 Latest

The document provides an introduction to the course STA 112 Probability at the Federal University Lokoja. It outlines the learning objectives which include understanding key probability concepts and computations. It also discusses what statistics is and where it is used, including in government, business, consulting, and various fields like biostatistics, econometrics, and operations research.


STA 112

Probability
Lecture Notes
ADENIYI, Isaac Adeola
Department: Statistics
Faculty of Science
Federal University Lokoja
Number of Credit: 3
Semester: Second (2019/2020)
Lecture time: 11:00am to 1:00 pm Every Tuesday
10:00am to 11:00 am Every Friday
Lecture Venue: LT 1

LEARNING OBJECTIVES
After careful study of this course, students should be able to do the following:

1. Provide a useful summary of the available information


2. Explain key definitions: Population vs. Sample; Primary vs. Secondary Data;
Parameter vs. Statistic and Descriptive vs. Inferential Statistics
3. Construct and interpret visual data displays, including the stem-and-leaf display, the
histogram, and the box plot
4. Construct and interpret normal probability plots
5. Explain how to use box plots and other data displays to visually compare two or more
samples of data
6. Identify types of data and levels of measurement
7. Understand basic concepts in probability such as sample space, events, random
variables, types of random variables, probability distributions, probability density
functions.
8. Carry out basic probability computations

LECTURE 1 – INTRODUCTION TO STATISTICS
INTRODUCTION
We have all been exposed to the popular notion that statistics is about numbers that are deadly-
dull, and perhaps intentionally misleading. You will quickly discover in this course that the
opposite is the case: Statistics is the science of extracting useful (and therefore interesting)
numbers from the world; and the statistician is committed to forcing these numbers to reveal the
truth. Thus, Statistics is the language in which man reads the Universe. It is a language with
numerical vocabulary, a mathematical grammar and, like any language, has its own distinct way
of shaping the speaker’s view of the world.
Statistics, like many other sciences, is a developing discipline. It is not static. It
has developed gradually during the last few centuries. At different times, it has been defined in
different ways. Some definitions of the past look very strange today, but those definitions had
their place in their own time. Defining a subject has always been a difficult task. A good definition
of today may be discarded in future. Thus, it is difficult to define statistics.
Definition (Statistics): Statistics is the science of conducting studies to collect, organize,
analyze, summarize, and draw conclusions from data.
Statistics – what do I care?
The importance of statistics comes from its usefulness. It is almost impossible to think of any
sphere of human activity where statistics does not creep in. In fact, to a very striking degree,
modern culture has become a statistical culture, and the subject of statistics has made such
tremendous progress in the recent past that an elementary knowledge of statistical
methods has become a part of general education in the curricula of many primary schools in
the world.
Where is statistics used?
The Problem: Performance is multidimensional: CPU time, I/O time, Network time, Interactions of
various components etc
(a) Systems are often specialized
Performs great on application type X and Performs lousy on anything else
(b) Potentially a wide range of execution times on one system using different benchmark
programs
(c) Nevertheless, people still want a single number answer!
(d) How to (correctly) summarize a wide range of measurements with a single value?
1. Describing data: you have data on the number of times event X occurs, for example,
Number of cache misses or Number of I/O operations for 10,000 computers in an Excel
worksheet. Question: How do you present this data in a meaningful way?
2. Given a set of values for some variable, we want to be able to describe these to other
people, in some more meaningful way than just listing the raw data. For example, I
converted a Java library for matrix manipulation into JavaScript, and was interested in the
time behaviour of some of the functions. In one test, I generated 100 random matrices of
size 70 x 60 and used the JavaScript Date object to time the calculation of the pseudo-
inverse of each matrix in milliseconds. How should I describe this data? I could just list
the 100 values:
318, 314, 315, 315, 313, 314, 315, 314, 314, 315, 313, 313, 315, 313, 314, 315, 314, 314,
315, 316, 315, 315, 314, 314, 314, 314, 314, 315, 314, 314, 316, 315, 314, 314, 315, 315,
316, 315, 313, 314, 313, 314, 314, 313, 313, 313, 315, 313, 312, 312, 313, 316, 313, 315,
315, 315, 313, 313, 312, 314, 314, 313, 313, 315, 314, 314, 315, 314, 314, 315, 313, 313,
314, 312, 312, 316, 314, 315, 315, 315, 315, 315, 314, 314, 313, 314, 314, 315, 313, 315,
316, 314, 315, 314, 323, 314, 314, 315, 314, 310.
3. A quality control engineer at a plant making disk drives for computers needs to make
sure that no more than 3% of the drives produced are defective. The engineer may
routinely collect random samples of drives and check their quality. Based on the random
samples, the engineer may then draw a conclusion about the proportion of defective items
in the entire population of drives, or state whether the production process is working to
the required standard.
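As a minimal sketch of how the timing values in example 2 above might be described, the Python snippet below takes only the first 18 of the listed values (a subset chosen purely to keep the example short) and reports a frequency table together with a few summary numbers. The variable names are illustrative and are not part of the original notes.

```python
from collections import Counter
from statistics import mean, median, mode

# First 18 timing values (milliseconds) from the list above; a subset for brevity.
times_ms = [318, 314, 315, 315, 313, 314, 315, 314, 314,
            315, 313, 313, 315, 313, 314, 315, 314, 314]

# Frequency table: how many times each distinct value occurs.
freq = Counter(times_ms)
for value in sorted(freq):
    print(value, freq[value])

# A few numbers that describe the data better than the raw list does.
summary = {
    "n": len(times_ms),
    "min": min(times_ms),
    "max": max(times_ms),
    "mean": round(mean(times_ms), 2),
    "median": median(times_ms),
    "mode": mode(times_ms),
}
print(summary)
```

Even for this small subset, the frequency table shows that almost all timings fall between 313 and 315 ms, which is far easier to communicate than the raw list.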
Thus, Statistics has become an essential tool of modern civilization. It is particularly useful since
it is concerned with making inferences about a body of data when only a part of the data is
observed or studied. In general, the importance of Statistics can be summarized as follows:
(i) To provide information that can be useful in formulating plans for developmental
purposes or programs
(ii) To measure progress and guide research
(iii) For decision making and forecasting
(iv) To evaluate the existing conditions

Users of Statistics

The basic users of statistics include everyone who is concerned with making informed decisions
from relevant data. The following among others are users of statistics.
1. Government: for planning e.g. socio-economic and developmental planning
2. Business men/women: They use statistics in the following ways
i. Market research
ii. Risk analysis
iii. Stock control
iv. Quality control
v. Sales trend monitoring etc.
3. Consultants: knowledge of statistics is vital for consultants in areas such as data analysis,
database management, business planning etc.
4. Operational researchers

A number of specialties have evolved to apply statistical theory and methods to various
disciplines. Certain topics have "statistical" in their name but relate to manipulations of
probability distributions rather than to statistical analysis.

• Actuarial Statistics is the discipline that applies mathematical and statistical methods to
assess risk in the insurance and finance industries.

• Biostatistics is a branch of biology that studies biological phenomena and observations
by means of statistical analysis, and includes medical statistics.

• Business analytics is a rapidly developing business process that applies statistical
methods to data sets (often very large) to develop new insights and understanding of
business performance and opportunities.

• Chemometrics is the science of relating measurements made on a chemical system or
process to the state of the system via application of mathematical or statistical methods.

• Demography is the statistical study of all populations. It can be a very general science
that can be applied to any kind of dynamic population, that is, one that changes over time
or space.

• Econometrics is a branch of economics that applies statistical methods to the empirical
study of economic theories and relationships.

• Environmental statistics is the application of statistical methods to environmental
science. Weather, climate, air and water quality are included, as are studies of plant and
animal populations.

• Epidemiology is the study of factors affecting the health and illness of populations, and
serves as the foundation and logic of interventions made in the interest of public health
and preventive medicine.

• Geostatistics is a branch of geography that deals with the analysis of data from
disciplines such as petroleum geology, hydrogeology, hydrology, meteorology,
oceanography, geochemistry, geography.

• Operations research (or Operational Research) is an interdisciplinary branch of applied
mathematics and formal science that uses methods such as mathematical modeling,
statistics, and algorithms to arrive at optimal or near optimal solutions to complex
problems.

• Population ecology is a sub-field of ecology that deals with the dynamics of species
populations and how these populations interact with the environment.

• Psychometrics is the theory and technique of educational and psychological
measurement of knowledge, abilities, attitudes, and personality traits.

• Quality control reviews the factors involved in manufacturing and production; it can
make use of statistical sampling of product items to aid decisions in process control or in
accepting deliveries.

• Quantitative psychology is the science of statistically explaining and changing mental
processes and behaviors in humans.

• Reliability Engineering is the study of the ability of a system or component to perform
its required functions under stated conditions for a specified period of time.

• Statistical mechanics is the application of probability theory, which includes
mathematical tools for dealing with large populations, to the field of mechanics, which is
concerned with the motion of particles or objects when subjected to a force.

• Statistical physics is one of the fundamental theories of physics, and uses methods of
probability theory in solving physical problems.

• Statistical thermodynamics is the study of the microscopic behaviors of thermodynamic
systems using probability theory and provides a molecular level interpretation of
thermodynamic quantities such as work, heat, free energy, and entropy.

Functions of Statistics
Statistics as a discipline is considered indispensable in almost all spheres of human knowledge.
There is hardly any branch of study which does not use statistics. Scientific, social and economic
studies use statistics in one form or another. These disciplines make use of observations, facts
and figures, enquiries and experiments etc. using statistics and statistical methods. Statistics
studies almost all aspects of an enquiry. It mainly aims at simplifying the complexity of
information collected in an enquiry. It presents data in a simplified form so as to make them
intelligible. It analyses data and facilitates the drawing of conclusions. Below are some of the
important functions of statistics.

1. Presents facts in simple form: Statistics presents facts and figures in a definite form.
That makes a statement more logical and convincing than mere description. It condenses the
whole mass of figures into a single figure. This makes the problem intelligible.
2. Reduces the Complexity of data: Statistics simplifies the complexity of data. The raw
data are unintelligible. We make them simple and intelligible by using different
statistical measures. Some such commonly used measures are graphs, averages,
dispersions, skewness, kurtosis, correlation and regression etc. These measures help in
interpretation and drawing inferences. Therefore, statistics helps to enlarge the horizon
of one's knowledge.
3. Facilitates comparison: Comparison between different sets of observations is an
important function of statistics. Comparison is necessary to draw conclusions. As
Professor Boddingtons rightly points out, “the object of statistics is to enable comparison
between past and present results to ascertain the reasons for changes which have taken
place and the effect of such changes in future.” So, to determine the efficiency of any
measure, comparison is necessary. Statistical devices like averages, ratios, coefficients
etc. are used for the purpose of comparison.
4. Testing hypothesis: Formulating and testing of hypothesis is an important function of
statistics. This helps in developing new theories. So statistics examines the truth and
helps in innovating new ideas.
5. Formulation of Policies: Statistics helps in formulating plans and policies in different
fields. Statistical analysis of data forms the beginning of policy formulations. Hence,
statistics is essential for planners, economists, scientists and administrators to prepare
different plans and programmes.
6. Forecasting: The future is uncertain. Statistics helps in forecasting the trend and
tendencies. Statistical techniques are used for predicting the future values of a variable.
For example a producer forecasts his future production on the basis of the present
demand conditions and his past experiences. Similarly, the planners can forecast the
future population etc. considering the present population trends.
7. Derives valid inferences: Statistical methods mainly aim at deriving inferences from an
enquiry. Statistical techniques are often used by scholars, planners and scientists to
evaluate different projects. These techniques are also used to draw inferences regarding
population parameters on the basis of sample information.

Key definitions
1. A population is the collection of all items, objects or things under consideration. The
objects may be people, animals, plants etc.
2. A sample is a subset or portion of the population selected for analysis. The sample needs
both to be representative of the population and to be large enough to contain sufficient
information to answer the questions about the population that are crucial to the
investigation.
3. A parameter is a summary measure that describes a characteristic of the population.
4. A statistic is a summary measure computed from a sample and used to describe (estimate) a
characteristic of the population.
5. A variable is any characteristic of interest that can take different values.
6. A unit or element is a single entity, usually an object or person, whose characteristics
are of interest.
Table 1: Examples of populations, units, and variables
Population | Unit | Variables/Characteristics
All students currently enrolled in school | Student | GPA, number of credits, hours of work per week, major, right/left-handed
All printed circuit boards manufactured during a month | Board | type of defects, number of defects, location of defects
All campus fast food restaurants | Restaurant | number of employees, seating capacity, hiring / not hiring
All books in library | Book | replacement cost, frequency of check-out, repairs needed
We are usually interested in certain characteristics of objects, for example, weight = 58 kg
(a datum); other characteristics could be height, gender, race, marital status, age etc.
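The difference between a parameter and a statistic can be illustrated with a small simulation. The sketch below is hypothetical: the population of 1,000 laptop hard-disk sizes is generated at random purely for illustration, and the sample size of 50 is an arbitrary choice.

```python
import random
from statistics import mean

random.seed(112)  # fixed seed so the sketch is reproducible

# Hypothetical population: hard-disk sizes (GB) of 1,000 laptops.
population = [random.choice([256, 512, 1024]) for _ in range(1000)]

# Parameter: computed from every unit in the population.
population_mean = mean(population)

# Statistic: computed from a sample of 50 units drawn without replacement.
sample = random.sample(population, 50)
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.1f} GB")
print(f"statistic (sample mean):     {sample_mean:.1f} GB")
```

The parameter is a fixed number for a given population, whereas the statistic would change if a different sample were drawn.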
In class exercise #1:

1. Consider the population of all laptop computers owned by FUL students. You want to
know the size of the hard disk.
(a) Specify the population unit. (laptop computers)
(b) Specify the variable of interest. (hard disk size)
(c) Specify the population. (all laptop computers owned by FUL students)
2. Determine if the described data set is a population or a sample. If it is a sample, describe
the population.
(a) The ages at the time of their first marriage of 25 residents in an assisted living facility
that houses 100 residents.
(b) The number of years of employment of all registered computer scientists in a large IT
firm.
3. Indicate whether the following is a parameter or a statistic:
(i) a mean obtained from sampling 2,000 Nigerian adults using random digit dialing
(phone sample) (statistic)
(ii) a mean obtained from sampling 10,000 teenagers on Facebook (statistic)
(iii) the average number of defects obtained when a company manufactures 5,000 electron
microscopes and tests them all. (parameter)
Branches of statistics
1. Descriptive statistics: Collect data (survey or experiment), present data (tables and
graphs). It describes three things about the given data
(1) Central tendency of the given data - where the “centre point” is located and it is
usually described by the mean, median and mode
(2) Dispersion of the given data i.e., the spread from the smallest to the largest values
(how close or how far apart are the given data values from each other)
(3) Distribution (the shape of the spread) i.e., symmetrical, skewed (right or left), bell-
shaped, flat, peaked.
Numerical Summary
• Measures of Location - Arithmetic mean, Median, Mode
• Measures of Spread – Range, Standard deviation, IQR
• Others: Min and max, Quartiles, Correlation coefficient
Graphical Summary
• Categorical Data - Pie chart, Bar plot, Frequency table, Dot plot
• Univariate Data - Stem and leaf plot, Box plot, Histogram
• Bivariate Data - Scatter plot

2. Inferential statistics: Consists of generalizing from samples to populations, Performing
estimations and hypothesis tests, Determining relationships among variables and making
predictions.
TEST YOUR UNDERSTANDING …
1. Indicate whether the following is a parameter or a statistic:
(i) Sample mean
(ii) Population standard deviation
(iii)Sample standard deviation
(iv) Population variance
(v) Population mean
2. If you were interested in understanding the relationship between level of education and
lifetime earnings, what elements would you sample?
3. Determine if the described data set is a population or a sample. If it is a sample, describe
the population.
(a) A survey of every 4th customer leaving a grocery store to determine how many times
per week they shop at a grocery store.
(b) The time it takes each mail carrier in zip code 91210 to complete a Saturday route.
(c) A consumer magazine article asks “How Safe Is the Air in Airplanes?” and goes on to
say that the air quality was measured on 158 different flights for U.S. based airlines.
Let the variable of interest be a numerical measure of staleness.
4. Identify the population, sample and the variable of interest in each of the following
situations.
(a) To learn about starting salaries for Computer Science students graduating from FUL,
twenty graduating seniors are asked to report their starting salary.
(b) For the presidential elections in Nigeria, specify the population to be captured and its
element.

LECTURE 2 - DATA – CLASSIFICATIONS AND COLLECTION METHODS
What are data? The term “data” refers to the kinds of information researchers obtain on the
subjects of their research or the observed values of a variable.
Sources of data: There are two ways of sourcing (or collecting) data, these are:
Primary source
• First-hand information collected, compiled and published by an individual or organization
for some purpose.
• They are the most original data.
• They have not undergone any sort of statistical treatment.
• e.g., the average age of FUL students, collected directly from each student.
Secondary source
• Second-hand information which has already been collected by someone (or some organization) for
some purpose and is available for the present study.
• They are not original in character.
• They have undergone some statistical treatment at least once.
• e.g., the average age of FUL students retrieved from the university records.

Method of data collection


The problem of choosing a collection method does not arise if secondary data are to be used.
However, if primary data are to be collected, a decision has to be taken whether the census
method or the sample technique is to be used for data collection.
• In the census method, we resort to 100% inspection of the population and enumerate each
and every unit of the population.
• In the sample technique, we inspect or study only a selected, representative and adequate
fraction (finite subset) of the population.
We will discuss these two concepts in detail in our next lecture.
Methods of Collecting Primary Data - primary data may be collected by the following methods:
• Experimentation
• Survey (the design of a questionnaire)
• Interview (face to face or telephone)
• Observation etc.
Methods of Collecting Secondary Data – secondary data may be collected from the following sources:
• The publications of the Statistical Division, Ministry of Finance, the Bureaus of
Statistics, Ministries of Agriculture, Industry, Labour etc.
• Research organizations such as universities and other institutions.
• Publications of trade associations, chambers of commerce etc.
• Others, such as magazines, periodicals and newspapers.

TYPES OF DATA

In Class Exercise #1: For each of the following, specify the characteristic of interest and
determine whether the variable is numerical or categorical. If it is numerical, determine whether
it is discrete or continuous.
(a) Number of telephones in your house
(b) Size of desktop computer memory
(c) Ownership of a cell phone
(d) Number of local phone calls you made in a month
(e) Length of longest phone call you made yesterday
(f) Length of your foot
(g) Zip Code
(h) The daily temperature of Malete over the past month
(i) The amount of water consumed daily by a teenager.
(j) The number of freshmen enrolled at FUL this session.
(k) The type of beverage selected on a lunch pre-order.
(l) Birth month of students in this class.
In-class exercise 3: A survey by an electric company contains questions on the following: Age
of household head; Sex of household head; Number of people in household; Use of electric
heating (yes or no); Number of large appliances used daily; Thermostat setting in winter;
Average number of hours heating is on; Average number of heating days; Household income;
Average monthly electric bill; Ranking of this electric company as compared with two previous
electricity suppliers. Describe the variables implicit in these 11 items as numerical or categorical.
TEST YOUR UNDERSTANDING
1. Realtors who help sell condominiums in the Boston area provide prospective buyers with
the information given in Table 1. Which of the variables in the table are numerical and
which are categorical?

Table 1: Boston Condominium Data
Asking Price | No of Bedrooms | No of Bathrooms | Direction Facing | Washer/Dryer? | Doorman?
$709,000 | 2 | 1 | E | Y | Y
812,500 | 2 | 2 | N | N | Y
980,000 | 3 | 3 | N | Y | Y
830,000 | 1 | 2 | W | N | N
850,900 | 2 | 2 | W | Y | N
Source: Boston.condocompany.com, March 2007.
2. Describe each of the following variables as numerical or categorical.
Table 2: The Richest People on Earth 2007
Name | Wealth ($ billion) | Age | Industry | Country of Citizenship
William Gates III | 56 | 51 | Technology | USA
Warren Buffett | 52 | 76 | Investment | USA
Carlos Slim Helú | 49 | 67 | Telecom | Mexico
Ingvar Kamprad | 33 | 80 | Retail | Sweden
Bernard Arnault | 26 | 58 | Luxury goods | France
Source: Forbes, March 26, 2007 (the “billionaires” issue), pp. 104–156.
3. Indicate which of the following are discrete measurements and which are continuous
measurements:
(i) life of a Dell monitor
(ii) waist size of EPL football players
(iii) mileage of Japanese cars
(iv) weight of human brains
(v) Number of left-handed people on basketball teams
(vi) Time to complete the task of assembling a computer by 20 CSC students
(vii) Number of non-indigene students in each statistics class
(viii) Number of bedrooms in your home

LECTURE 3 – MEANING AND RULES OF PROBABILITY

Probability is the theory of uncertainty. It is a necessary concept because the world according to
the scientist is unknowable in its entirety. However, prediction and decisions are obviously
possible. As such, probability theory is a rational means of dealing with an uncertain world.

Probabilities are numbers associated with events that range from zero to one (0-1). A probability
of zero means that the event is impossible. For example, if I were to flip a coin, the probability of
a leg is zero, due to the fact that a coin may have a head or tail, but not a leg. Given a probability
of one, however, the event is certain. For example, if I flip a coin the probability of heads, tails,
or an edge is one, because the coin must take one of these possibilities. The probability (usually
represented by a capital P) of event A is a numerical measure of the likelihood of the event’s
occurring. These events are uncertain and Probability is a measure of these uncertainties. For
instance, business organizations operate in conditions that are far from certain. Economic
circumstances change, customer bases shift, employees move to other jobs. New product
development and investment ventures are typically something of a gamble. We often hear and
use such expressions as:
(a) It probably will rain tomorrow afternoon
(b) It is very likely that the plane will arrive late
(c) The chances are good that he will be able to join us for dinner this evening
The concept of the probability of a particular event of an experiment is subject to various
meanings or interpretations. For instance, if a geologist is quoted as saying that “there is a 60
percent chance of oil in a certain region,” we all probably have some intuitive idea as to what is
being said. Indeed, most of us would probably interpret this statement in one of two possible
ways: either by imagining that
1. The geologist feels that, over the long run, in 60 percent of the regions whose outward
environmental conditions are very similar to the conditions that prevail in the region
under consideration, there will be oil; or, by imagining that
2. The geologist believes that it is more likely that the region will contain oil than it is that it
will not; and in fact 0.6 is a measure of the geologist’s belief in the hypothesis that the
region will contain oil.
Basic Rules for Probability
We want to express how likely it is that an event occurs. To do this we will assign a probability
to each event. The assignment of probabilities to events is in general not an easy task. Since each
event has to be assigned a probability, we speak of a probability function. It has to satisfy the
following basic properties.
(A) The Range of Values
Probability obeys certain rules. The first rule sets the range of values that the probability measure
may take.
For any event A, the probability P(A) satisfies $0 \le P(A) \le 1$. In other words,
probability is a number on a scale that runs from zero to one inclusive, although it can be
expressed as a percentage.
If there is a probability of zero that an outcome will occur it means there is literally no chance
that it will happen. At the other end of the scale, if there is a probability of one that something
will happen, it means that it is absolutely certain to occur. Within the range of values 0 to 1, the
greater the probability, the more confidence we have in the occurrence of the event in question.
A probability of 0.95 implies a very high confidence in the occurrence of the event. A probability
of 0.80 implies a high confidence. When the probability is 0.5, the event is as likely to occur as it
is not to occur (such a probability is described as a fifty–fifty chance). When the probability is
0.2, the event is not very likely to occur. When we assign a probability of 0.05, we believe the
event is unlikely to occur, and so on. The table below is an informal aid in interpreting
probability.
Table 1: The numerical probability range with some notional verbal descriptions of regions
Value of probability | Meaning
0 | Impossible event
0.5 | Event is as likely to occur as not to occur. Equally likely or 50 – 50
0 – 0.25 | Event is not very likely to occur.
0.25 – 0.5 | Event is more likely not to occur than to occur.
0.5 – 0.75 | Event is more likely to occur than not to occur.
0.75 – 1 | Event is very likely to occur.
0 – 0.5 and 0.5 – 1 | Probable or likely
1 | Sure or certain event
Although probability is a measure that goes from 0 to 1, in everyday conversation we often
describe probability in less formal terms. For example, people sometimes talk about odds. If the
odds are 1 to 1, the probability is 1/2; if the odds are 1 to 2, the probability is 1/3; and so on.
Also, people sometimes say, “The probability is 80 percent.” Mathematically, this probability is
0.80.
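The two odds examples above are consistent with reading odds of "a to b" as a favourable outcomes against b unfavourable ones, so that the probability is a/(a + b). The following sketch assumes that reading; the helper name `odds_to_probability` is hypothetical and introduced only for illustration.

```python
from fractions import Fraction

def odds_to_probability(a: int, b: int) -> Fraction:
    """Convert odds of 'a to b' in favour of an event into a probability."""
    if a < 0 or b < 0 or a + b == 0:
        raise ValueError("odds must be non-negative and not both zero")
    return Fraction(a, a + b)

print(odds_to_probability(1, 1))  # 1/2
print(odds_to_probability(1, 2))  # 1/3
print(float(Fraction(80, 100)))   # 0.8, i.e. "80 percent"
```

Exact fractions are used so that the results match the values quoted in the text without rounding.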

In class exercise #47:


Figure out the following probabilities.
1) What is the probability of getting a tail on a coin flip?
2) What is the probability of getting two tails in two flips?
3) What is the probability of rolling a 4 with a die?
11) If there is a 60% chance of rain, what is the probability it will not rain?
12) In a group of 50 people, what is the probability at least two have the same birthday? (Just
guess)

Definitions

Outcome
An outcome is the result of a single trial of a probabilistic (random) experiment.

Trial
A trial is the performance of a single component of a random experiment. A trial means flipping
a coin once, rolling one die once, or the like. When a coin is tossed, there are two possible
outcomes: head or tail. (Note: We exclude the possibility of a coin landing on its edge.) In the
roll of a single die, there are six possible outcomes: 1, 2, 3, 4, 5, or 6.

Sample Point
Each element or outcome in the sample space of a probability experiment is called a sample
point.

Sample Space
The set of all possible outcomes of a random experiment is called the sample space of the
experiment. The sample space is denoted as S. A sample space is often defined based on the
objectives of the analysis. A sample space is discrete if it consists of a finite or countably infinite
set of outcomes. A sample space is continuous if it contains an interval (either finite or infinite)
of real numbers.

Events
Events are subsets of the sample space and we often use letters A, B, C, etc. to label them. A
simple event consists of exactly one outcome and a compound event consists of more than one
outcome.
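The definitions above map naturally onto sets. As a small sketch, the sample space for one roll of a die and two events can be written as Python sets; the events chosen and the helper `kind` are illustrative assumptions, not part of the notes.

```python
# Sample space for one roll of a six-sided die.
S = {1, 2, 3, 4, 5, 6}

A = {4}          # simple event: exactly one outcome
B = {2, 4, 6}    # compound event: "an even number is rolled"

# Every event is a subset of the sample space.
print(A <= S, B <= S)          # True True

def kind(event):
    """Classify a non-empty event as simple or compound."""
    return "simple" if len(event) == 1 else "compound"

print(kind(A), kind(B))        # simple compound
```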

Approaches to Probability
Probabilistic statements can be interpreted in different ways. For example, how would you
interpret the following statement? There is a 40 percent chance of rain today.
Assignment of probability: The axioms of probability define the properties of a probability
measure, which are consistent with our intuitive notions. However, they do not guide us in
assigning probabilities to various events. How can we attach probabilities to events? There are
three approaches, these are:
(A) The Classical Approach
The first and easiest case is an experiment with a finite sample space consisting of $N$ points.
Suppose that, because of the nature of the experiment (e.g. tossing a fair coin), all points are
equiprobable, that is, equally likely, and let $A$ be some event in $S$. Here, we define the
probability of an event $A$, written $P(A)$, as the ratio
$$P(A) = \frac{n(A)}{N} \qquad (2.1)$$
where $n(A)$ denotes the number of points in $A$. Despite its simplicity, formula (2.1) can lead to
non-trivial calculations. In order to use it in a given problem, we need to determine:
(i) The number $N = n(S)$ of all equiprobable outcomes, and
(ii) The number $n(A)$ of all those outcomes leading to the occurrence of $A$.
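To make the two counting steps for formula (2.1) concrete, the sketch below enumerates the sample space for rolling two fair dice and counts the outcomes in one event. The choice of experiment and of the event "the faces sum to 7" is an assumption made only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space S: all ordered pairs from two fair six-sided dice (equiprobable).
S = list(product(range(1, 7), repeat=2))
N = len(S)                      # N = n(S)

# Event A: the two faces sum to 7 (a hypothetical event chosen for illustration).
A = [pair for pair in S if sum(pair) == 7]

P_A = Fraction(len(A), N)       # P(A) = n(A) / N
print(N, len(A), P_A)           # 36 6 1/6
```

Enumeration of this kind is only practical for small sample spaces; for larger ones the counts are usually obtained with combinatorial arguments.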

In many practical situations the different outcomes are not equally likely, for example the
success of a treatment, the chance of dying of a heart attack, or the chance of rainfall tomorrow.
It is not immediately clear how to measure chance in each of these cases.
(B) The Frequentist (Empirical) Approach
To determine a probability, we can repeatedly perform the action and measure the outcomes. A
second case is when a basic experiment can be repeated in exactly the same conditions any
number n of times. We call this situation the case of independent trials under identical
conditions. In this case, we can give a precise meaning to the concept of probability.
In each trial a particular event A may or may not occur. Let 𝑛(𝐴) be the number of trials in
which 𝐴 occurs. The relative frequency of the event 𝐴 in the given series of n trials is defined as
𝑓𝑛(𝐴) = 𝑛(𝐴)/𝑛. It is an empirical fact that the 𝑓𝑛(𝐴) observed for different series of trials are
virtually the same for large 𝑛, clustering about a constant value 𝑃(𝐴), called the probability of A.
Roughly speaking, the probability of 𝐴 equals the fraction of trials leading to the occurrence of A
in a large series of trials.
The Frequency Interpretation of Probability: The probability of an event is the proportion of
time that events of the same kind (repeated independently and under the same conditions) will
occur in the long run. The frequency interpretation of probability is based on the following
theorem.
Theorem (Law of Large Numbers): If a situation, trial, or experiment is repeated again and
again, the proportion of successes will converge to the probability of any one outcome being a
success. As the number of replications increases, the relative frequency will approach the
probability of the event.
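The behaviour described by the Law of Large Numbers can be illustrated with a small simulation. This is only a sketch: the function name `relative_frequency`, the seed and the trial counts are illustrative assumptions, and a pseudo-random generator stands in for real repeated trials.

```python
import random

def relative_frequency(n_trials, p_success=0.5, seed=112):
    """Simulate n_trials independent trials and return the fraction of successes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = sum(rng.random() < p_success for _ in range(n_trials))
    return successes / n_trials

# The relative frequency should settle near p_success as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

For small n the printed values can be far from 0.5; for large n they should cluster closely around it, which is the frequency interpretation in action.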
Example: Agriculture: Cotton. A botanist has developed a new hybrid cotton plant that can
withstand insects better than other cotton plants. However, there is some concern about the
germination of seeds from the new plant. To estimate the probability that a seed from the new
plant will germinate, a random sample of 3000 seeds was planted in warm, moist soil. Of these
seeds, 2,430 germinated. (a) Use relative frequencies to estimate the probability that a seed will
germinate. What is your estimate? (b) Use relative frequencies to estimate the probability that a
seed will not germinate. (c) Either a seed germinates, or it
does not. What is the sample space in this problem? Do the probabilities assigned to the sample
space add up to 1? Should they add up to 1? Explain.
Solution: (a) The relative frequency is f = (number of favourable cases)/(total number of cases) = 2430/3000 =
0.81. Therefore, our estimate of the probability that a seed will germinate is 0.81.
(b) The relative frequency that a seed will not germinate is f = 570/3000 = 0.19. Therefore, our
estimate of the probability that a seed will not germinate is 0.19.

(c) Since each seed either germinates or does not, the sample space in this problem is
{germinates, does not germinate}. The estimated probabilities assigned to these two outcomes are
0.81 and 0.19, which add up to 1. This is what should happen, because the sample space accounts
for all the possible outcomes.

(C) The Subjectivist (Bayesian) Approach


There are certain problems with 'long run frequencies', chiefly verifiability (who can flip
forever?). Moreover, not all events are repeatable. For instance: Will it rain tomorrow? What is the
probability that you will pass this course? Will Mr. Jones, 42, live to 65? Will the parents of
twins give birth to another set of twins? Does Iran have weapons of mass destruction? To all these
questions the answer is either “yes” or “no”, but we are uncertain about the right answer.
Need to quantify our uncertainty about an event: another common but more subjective
approach to probability assignment is that of relative likelihood. When it is not feasible or is
impossible to perform an experiment a large number of times, the probability of an event may be
assigned as a result of subjective judgment. The statement ‘there is a 40% probability of rain
tomorrow’ is an example in this interpretation, where the number 0.4 is assigned on the basis of
available information and professional judgment. The Bayesian interpretation of probability is
that probability measures the personal (subjective) uncertainty about an event. The next basic
quantity to be introduced here is that of a probability function (or of a probability measure).

EXTRA
(D) Axiomatic Approach
This approach is based on the four basic laws (axioms) of probability. Probability problems are
solved, and the correctness of their answers is judged, using an understanding of probability as a
measure of uncertainty.
The Axioms of Probability
Let E denote an event in sample space S such that 𝑃(𝐸) = 𝑃𝑟𝑜𝑏𝑎𝑏𝑖𝑙𝑖𝑡𝑦 𝑜𝑓 𝑒𝑣𝑒𝑛𝑡 𝐸. Then
𝑃(𝐸) is said to be a true probability function if and only if the following axioms are satisfied:
1. 0 ≤ 𝑃(𝐸) ≤ 1
Interpretation: The probability of any event E is a number (either a fraction or decimal)
between and including 0 and 1. The simple implication of this is that probabilities cannot be
negative or greater than 1

2. 𝑷(∅) = 𝟎
Interpretation: If an event E cannot occur (i.e., the event contains no members in the sample
space), its probability is 0.
3. 𝑷(𝑬) = 𝟏 if 𝑬 is a certain event
Interpretation: If an event E is certain to occur, then the probability of E is 1.
4. 𝑷(𝑺) = 𝟏
Interpretation: The sum of the probabilities of all the outcomes in the sample space is 1.

(B) The Rule of Complements


Our second rule for probability defines the probability of the complement of an event in terms of
the probability of the original event. Recall that the complement of set A is denoted by A'.
For any event A, P(A') = 1 − P(A), where P(A) is the probability that A occurs.
As a simple example, if the probability of rain tomorrow is 0.3, then the probability of no rain
tomorrow must be 1 - 0.3 = 0.7. If the probability of drawing an ace is 4/52, then the probability
of the drawn card’s not being an ace is 1 – 4/52 = 48/52.
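The two complement calculations above can be checked with exact fractions. This is a small sketch; the helper name `complement` is an assumption introduced for illustration.

```python
from fractions import Fraction

def complement(p):
    """Probability that an event does NOT occur, given that it occurs with probability p."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 1 - p

p_no_rain = complement(Fraction(3, 10))   # rain tomorrow has probability 0.3
p_not_ace = complement(Fraction(4, 52))   # drawing an ace has probability 4/52
print(p_no_rain)   # 7/10
print(p_not_ace)   # 12/13, which equals 48/52
```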

Counting techniques
After completing this chapter, you should be able to:
• Solve counting problems using the multiplication principle, permutations and combinations.
To calculate probabilities, we need to find the number of outcomes in the required events as well
as the total number of outcomes. In simple experiments such as a roll of two dice or a toss of three
coins, it is easy to determine the sample space. But when an experiment is such that it can be
treated in three or more stages, it becomes tedious to determine the sample space as well as the
number of outcomes in favour of a particular event. Counting techniques are methods
developed to solve such problems. Some of the methods used for counting are: Tree Diagram,
Fundamental Counting Principle, Permutation and Combination. Let’s look at each in detail.
Tree diagram
If you have to investigate the probabilities of sequences of several outcomes it can be difficult to
work out the different combinations of outcomes in your head. The problem of counting the
sample space and the sample points corresponding to various events is simplified by the use of a
tree diagram, especially if the experiment can be treated as not more than three stages. If there
are more than three stages, the tree becomes unmanageable. A tree diagram, which is sometimes
called a probability tree, represents the different sequences of outcomes in the style of a tree that
‘grows’ from left to right. Each branch of the tree leads to a particular outcome. On the right-
hand side of the diagram, at the end of each branch, we can insert the combination of outcomes
that the sequence of branches represents and, using the multiplication rule, the probability that
the sequence of outcomes happens.
Determine the number of outcomes in each of the following experiments:
a. Tossing a coin two times
b. Tossing a coin and rolling a die
In class exercise 1: Each message in a digital communication system is classified as to whether
it is received within the time specified by the system design. If three messages are classified, use
a tree diagram to represent the sample space of possible outcomes.
Each message can either be received on time or late. The possible results for three messages can
be displayed by eight branches in the tree diagram shown in Fig. 2-5.
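The eight branches of such a tree can also be listed programmatically. The sketch below assumes the two labels "on time" and "late" for each message; each tuple printed corresponds to one path through the tree from root to leaf.

```python
from itertools import product

outcomes_per_message = ("on time", "late")

# Every path through a three-level tree with two branches per level.
sample_space = list(product(outcomes_per_message, repeat=3))

for outcome in sample_space:
    print(outcome)
print(len(sample_space))   # 8
```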
In class exercise 2: An automobile manufacturer provides vehicles equipped with selected
options. Each vehicle is ordered: With or without an automatic transmission; with or without air-
conditioning; with one of three choices of a stereo system; with one of four exterior colors. If the
sample space consists of the set of all possible vehicle types, what is the number of outcomes in
the sample space? The sample space contains 48 outcomes. The tree diagram for the different
types of vehicles is displayed in Fig. 2-6.
By doing a few more of these problems, you might begin to see a pattern that would suggest a way
of determining the total number of outcomes without listing and without using a tree diagram.
This will lead us to the Fundamental Counting Principle.
Multiplication principle
Where the tree diagram is not feasible or suitable, the multiplication rule comes in handy. In a
three-stage experiment, if the first stage can occur in 𝑛1 ways, the second stage in 𝑛2 ways
and the third stage in 𝑛3 ways, then the total number of possible outcomes is
𝑛1 × 𝑛2 × 𝑛3. More generally, if there are 𝑟 experiments, where the first has 𝑛1 possible
outcomes, the second 𝑛2 , . . . , and the rth 𝑛𝑟 possible outcomes, there are a total of 𝑛1 × 𝑛2 ×
𝑛3 × … × 𝑛𝑟 possible outcomes for the 𝑟 experiments.
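As a sketch of the multiplication principle, the vehicle example from in class exercise 2 has four stages with 2, 2, 3 and 4 choices. The option labels below (such as "stereo 1" and "colour 1") are placeholders assumed for illustration; only the stage sizes come from the exercise.

```python
from itertools import product
from math import prod

stage_sizes = [2, 2, 3, 4]        # transmission, air-conditioning, stereo, colour
total = prod(stage_sizes)         # n1 x n2 x n3 x n4

# Brute-force check: list every vehicle type and count them.
vehicles = list(product(
    ("automatic", "not automatic"),
    ("air-conditioning", "no air-conditioning"),
    ("stereo 1", "stereo 2", "stereo 3"),
    ("colour 1", "colour 2", "colour 3", "colour 4"),
))
print(total, len(vehicles))       # 48 48
```

Multiplying the stage sizes gives the count directly; the explicit listing is only there to confirm that the two agree.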
In class exercise 3: Suppose Joke has three blouses, two pairs of slacks, and four pairs of shoes.
Assuming no matter what she wears, they all match, how many outfits does she have altogether?
She has three choices for a blouse, two choices for her slacks, and four choices for her shoes.
Using the Fundamental Counting Principle, she has 3 × 2 × 4 or 24 different outfits.
In class exercise 4: Ade, Bayo, and Kale are running a race. In how many ways can they finish?
There are three ways to choose the winner; after that, there are two ways to pick
second place, and one way to pick the third-place finisher. Therefore there are 3 × 2 × 1 or 6
different ways these three boys could finish the race.

Hint: To find the number of ways of making several decisions in a row, multiply the numbers of
choices that can be made in each decision.
In class exercise 5: True or False Tests
How many answer keys are possible for a 2-question True or False exam?
Since there are 2 ways to answer the first question and 2 ways to answer the second question,
there are 2 × 2 = 4 possible answer keys: TT TF FT FF
In class exercise 6: How many answer keys are possible for a 3-question True or False exam?
Since there are 2 ways to answer each of the three questions, there are 2³ = 8 possible answer
keys:
TTT TTF TFT TFF FTT FTF FFT FFF
In class exercise 7: How many answer keys are possible for a 2-question Multiple Choice exam
in which each question has 5 possible answers A, B, C, D, and E? Since there are 5 ways to answer the first
question and 5 ways to answer the second question, there are 5 × 5 = 25 possible answer keys.
In class exercise 8: How many answer keys are possible for a 3-question Multiple Choice exam
with the same 5 possible answers? Since there are 5 ways to answer each question, there are 5³ = 125 possible answer keys.
Permutation
Permutation is an arrangement of objects in which the order matters, without repetition. Order
typically matters when there are positions or awards, such as first place and second place, or when
someone is named president or vice president. The number of permutations of 𝑛 items taken 𝑟 at
a time is denoted 𝑛𝑃𝑟 and is given by 𝑛𝑃𝑟 = 𝑛!/(𝑛 − 𝑟)!.
The number of permutations of 𝑛 distinct items taken all at a time is 𝑛𝑃𝑛 = 𝑛(𝑛 − 1)(𝑛 −
2) … 3 × 2 × 1 = 𝑛! ways; here 𝑟 = 𝑛. The symbol 𝑛! is called 𝑛-factorial and we define 1! =
0! = 1.
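The permutation count can be computed from factorials or cross-checked by listing the arrangements. In this sketch the helper `n_p_r` is a name assumed for illustration, `math.perm` (Python 3.8 or later) is the standard-library equivalent, and the three runners come from in class exercise 4.

```python
from itertools import permutations
from math import factorial, perm

def n_p_r(n, r):
    """Number of permutations of n distinct items taken r at a time: n!/(n - r)!."""
    return factorial(n) // factorial(n - r)

runners = ("Ade", "Bayo", "Kale")
finishing_orders = list(permutations(runners))   # every ordering of the three runners
print(len(finishing_orders), n_p_r(3, 3))        # 6 6
print(n_p_r(5, 3), perm(5, 3))                   # 60 60
```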
In class exercise 9: How many ways can you set up a committee of 4 people from 10 people?
In class exercise 10: Evaluate the following: (i) 6𝑃3/6 and (ii) (6𝑃3/3!) × 3𝑃2

In class exercise 11: Determine whether each of the following is true or false:
(i) 3! + 4! = 7! (ii) 5𝑃3 = 5!/(3!(5 − 3)!) (iii) 13! = 14!/14 (iv) 4! × 3! = 12!

Permutation of 𝒏 items, not all of which are distinct


The number of permutations of 𝑛 things taken all at a time, where 𝑝 of them are alike of one kind,
𝑞 are alike of another kind, 𝑟 are alike of a third kind, etc., is given by 𝑛!/(𝑝! 𝑞! 𝑟! …).

In class exercise 12: In how many ways can the letters of the word STATISTICS be arranged?
Solution: S occurs 3 times, T occurs 3 times and I occurs 2 times (A and C occur once each). So the
number of possible arrangements is 10!/(3! 3! 2!) = 50400.
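The same count can be obtained by tallying the repeated letters automatically. This is a sketch; `distinct_arrangements` is a hypothetical helper name, not something defined in these notes.

```python
from collections import Counter
from math import factorial

def distinct_arrangements(word):
    """n! divided by the factorial of each letter's multiplicity."""
    counts = Counter(word)              # e.g. S: 3, T: 3, I: 2, A: 1, C: 1
    result = factorial(len(word))
    for multiplicity in counts.values():
        result //= factorial(multiplicity)
    return result

print(distinct_arrangements("STATISTICS"))   # 50400
```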

Combination: A combination is a selection of objects in which the order does not matter,
without repetition. This is different from a permutation because the order does not matter. If you
change the order, you don’t change the group; you do not make a new combination. So, a dime,
nickel, and penny is the same combination of coins as a penny, dime, and nickel.
In essence, the number of combinations is the number of permutations divided by the number of
different orderings of each selection.
The number of combinations of 𝑛 things taken 𝑟 at a time is denoted C(𝑛, 𝑟), where C(𝑛, 𝑟) = 𝑛!/(𝑟!(𝑛 − 𝑟)!).
Most problems on selection without replacement can be solved using the combination
approach.
In class exercise 13: From among 12 students trying out for the basketball team, how many
ways can 7 students be selected?
Does the order matter? Is this a permutation or combination? Well, if you were going out for the
team and a list was printed, would it matter if you were listed first or last? All you would care
about is that your name is on the list. The order is not important, therefore this is a
combination problem of 12 students taken 7 at a time, i.e. C(12, 7) = 12!/(7! 5!) = 792. There would be 792
different teams that can be chosen.
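The count of 792 teams can be verified in three ways: from the factorial formula, with the standard-library function `math.comb` (Python 3.8 or later), and by brute-force listing. The helper name `n_c_r` is an assumption made for this sketch.

```python
from itertools import combinations
from math import comb, factorial

def n_c_r(n, r):
    """Number of ways to choose r items from n when order does not matter."""
    return factorial(n) // (factorial(r) * factorial(n - r))

teams = n_c_r(12, 7)
listed = sum(1 for _ in combinations(range(12), 7))   # count every 7-student subset
print(teams, comb(12, 7), listed)                     # 792 792 792
```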
In class exercise 14: Kunle has 6 employees, three of whom must be on duty during the night
shift. In how many ways can he choose who will work? Does order matter? It does not, so by formula we have
C(6, 3) = 6!/(3! 3!) = 20 ways. Doing these problems by hand can be
distracting; you can concentrate on the problem more if you have a calculator
with permutation and combination keys. That way, for C(6, 3), all you need to do is enter
those numbers, press the appropriate buttons, and voilà, you get 20. Don’t you
just love technology!
In class exercise 15: How many ways can a committee be formed consisting of 3 men and 2
women, selected from 7 men and 10 women?
In class exercise 16: In how many ways can a committee of five be selected from amongst 6 males
and 7 females, if the committee must consist of (i) two males and three females (ii) at most three
males?
Solution: There are a total of 13 persons.
(i) The number of ways in which two males can be selected from 6 males is
C(6, 2) = 15 and that of three females from 7 females is C(7, 3) = 35. Therefore, the total
number of combinations is 15 × 35 = 525 ways.

(ii) There could be 0, 1, 2 or a maximum of 3 males. Hence, the total number of
combinations would be C(6, 0)C(7, 5) + C(6, 1)C(7, 4) + C(6, 2)C(7, 3) + C(6, 3)C(7, 2) = 21 + 210 +
525 + 420 = 1176.
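Each term of the "at most three males" sum can be computed separately, which makes the arithmetic easy to audit. The sketch assumes a committee of five members, as implied by part (i), where two males and three females are chosen.

```python
from math import comb

males, females, committee_size = 6, 7, 5   # committee size of 5 is inferred from part (i)

# One term per allowed number of males: 0, 1, 2 or 3.
terms = [comb(males, m) * comb(females, committee_size - m) for m in range(0, 4)]
print(terms)        # [21, 210, 525, 420]
print(sum(terms))   # 1176
```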
In class exercise 17: A box contains 20 balls, all of which are of the same size. 15 of them are
red and the others black. In how many ways can you choose 4 balls if you must have (i) exactly
two black balls (ii) at least one red ball (iii) no black balls?
Permutations or Combinations
Students are often confused as to whether a problem calls for permutations or combinations. The
rule of thumb is: Use a permutation if the order of selection matters and a combination if the
order of selection doesn't matter. Does order matter in the following situations?
(a) Outcome of a race: first, second, or third. Order matters.
(b) Eating 7 courses at a meal. Order matters.
(c) 7 digits selected for a phone number. Order matters. (753-1835 is different from 753-8531)
(d) 3 members of the Math Faculty are selected to form a selection committee for a Finite
Mathematics textbook. Order doesn't matter if each person has equal powers.
(e) Among all KWASU Math Majors, 4 officers must be elected to be the President, Vice-President,
Treasurer, and Secretary of the Math Club. Order matters.

TEST YOURSELF…
1. Ten cars are in a race. How many ways can we have first, second, and third place?
2. How many ways can a True-False test be answered if there are 4 questions? 6 questions?
n questions?
3. In how many ways can six items be chosen from nine items?
4. How many ways can a basketball team of 5 players be selected from a squad of 12
players?
5. A man’s wardrobe consists of 5 sport coats, 3 dress slacks, and 2 pairs of shoes.
Assuming they all match, in how many ways can he select an outfit?
6. Zip™ Disks: Zip™ disks come in two sizes (100MB and 250MB), packaged singly, in
boxes of five, or in boxes of ten. When purchasing singly, you can choose from five
colors; when purchasing in boxes of five or ten you have two choices, black or an
assortment of colors. If you are purchasing Zip disks, how many possibilities do you have
to choose from?

7. Tests: A test requires that you answer either Part A or Part B. Part A consists of 8 true-
false questions, and Part B consists of 5 multiple-choice questions with 1 correct answer
out of 5. How many different completed answer sheets are possible?
8. HTML: Colors in HTML (the language in which many web pages are written) can be
represented by 6-digit hexadecimal codes: sequences of six integers ranging from 0 to 15
(represented as 0, ..., 9, A, B, .., F). (a) How many different colors can be represented?
(b) Some monitors can only display colors encoded with pairs of repeating digits (such as
44DD88). How many colors can these monitors display? (c) Grayscale shades are
represented by sequences xyxyxy, consisting of a repeated pair of digits. How many
grayscale shades are possible? (d) The pure colors are pure red: xy0000; pure green:
00xy00; and pure blue: 0000xy. (xy = FF gives the brightest pure color, while xy = 00
gives the darkest: black). How many pure colors are possible?
ANSWER: (6) (2 × 5) + (2 × 2 × 2) = 18 (7) 2⁸ + 5⁵ = 3,381 (8) (a) 16⁶ = 16,777,216 (b) 16³ =
4,096 (c) 16² = 256 (d) 3 × 16² − 2 = 766

TEST YOURSELF…
1. Meteorology is not an exact science and hence weather forecasts have to be couched in terms
that express that lack of precision. The following is a Meteorological Office forecast for the
United Kingdom covering the period 23 September to 2nd October 2006. “Low pressure is
expected to affect northern and western parts of the UK throughout the period. There
is a risk of some showery rain over south-eastern parts over the first weekend but
otherwise much of eastern England and possibly eastern Scotland should be fine. More
central and western parts of the UK are likely to be rather unsettled with showers and
some spells of rain at times, along with some periods of strong winds too. However, with
a southerly airflow dominating, rather warm conditions are expected, with warm
weather in any sunshine in the east.” Underline all those parts of this report that indicate
lack of certainty.
2. Give 2 examples each of the situations described in table 1 above
3. Assign a reasonable numerical probability to the statement “Rain is very likely tonight.”
4. If a team has an 80% chance of winning a game, describe its chances in words.
5. A machine produces components for use in cellular phones. At any given time, the machine
may be in one, and only one, of three states: operational, out of control, or down. From
experience with this machine, a quality control engineer knows that the probability that the
machine is out of control at any moment is 0.02, and the probability that it is down is 0.015.
What is the relationship between the following events?
(a) A: “machine is out of control” and B: “machine is down”?
(b) Unless the machine is down, it can be used to produce a single item. C: “the machine can
be used to produce a single component right now” and D: “machine is down”
6. How likely is an event that has a 0.65 probability? Describe the probability in words.
7. It is clear that there is a relationship between set theory and probability, when can we have
the probability of an event to be zero in set notation? What is the name given to this
relationship?
8. Tick the appropriate column in the table below
s/n   Events   Impossible event   Possible event   Sure event
A     A person has a heart
B     Omashola winning the ongoing Big Brother Naija show
C     Rainfall tomorrow
D     Getting an eight when a regular die is tossed
E     Chelsea cannot win the ongoing 2019/2020 Champions League tournament
F     Manchester United wins the ongoing 2019/2020 Champions League tournament
G     Michael Jackson (the music star) will be the next president of Nigeria
ANSWER:
(3) 0.85 is a typical “very likely” probability. (4) The team is very likely to win. (5) (a)
Mutually exclusive (b) Complements (7) when we have an empty set or there is no
intersection; mutually exclusive events
(C) The Rule of Unions.
We now state a very important rule, the rule of unions. The rule of unions allows us to write the
probability of the union of two events in terms of the probabilities of the two events and the
probability of their intersection. There are 2 cases:
CASE I: If A and B are mutually exclusive, then P(A ∪ B) = P(A) + P(B)
CASE II: If A and B are not mutually exclusive, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
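The rule of unions can be checked by direct counting on a small sample space. The sketch below assumes a throw of two fair dice and two illustrative events chosen for this example (they are not taken from the notes); `P` is a hypothetical helper that applies the classical definition.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # 36 equally likely outcomes of two dice

def P(event):
    """Classical probability of an event that is a subset of S."""
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] == 6}           # first die shows six
B = {s for s in S if sum(s) >= 10}        # total is at least ten

lhs = P(A | B)                            # probability of the union, by counting
rhs = P(A) + P(B) - P(A & B)              # rule of unions, CASE II
print(lhs, rhs)                           # 1/4 1/4
```

Here A and B overlap (for example the outcome (6, 6)), so the intersection term is needed; without it the three shared outcomes would be counted twice.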

In general, additivity of P implies that the probability of an event is obtained by summing the
probabilities of the outcomes belonging to the event (but based on the conditions above). The
rule can be extended to more than two events. What if we have 3 events?
The rule of unions is especially useful when we do not have the sample space for the union of
events but do have the separate probabilities.
A footnote to the addition rule is that if we have a set of mutually exclusive and collectively
exhaustive outcomes their probabilities must add up to one. A probability of one means certainty,
which reflects the fact that in a situation where there are a set of mutually exclusive and
collectively exhaustive outcomes, one and only one of them is certain to occur.
In class exercise 1: Suppose your chance of being offered a certain job is 0.4, your probability
of getting another job is 0.5, and the probability of being offered both jobs (i.e., the intersection)
is 0.3. What is the probability of being offered at least one of the two jobs? By the rule of unions,
your probability of being offered at least one of the two jobs (their union) is 0.4 + 0.5 - 0.3 = 0.6.
In class exercise 2: For two events A and B, suppose that P(A) = 0.3, P(B) = 0.5 and
P(A ∪ B) = 0.6. Calculate P(A ∩ B).
In class exercise 3: The probability that a customer purchases fuel is 192/500; The probability
that a customer pays by debit card is 78/500 and the probability that a customer purchases fuel
and pays by debit card is 62/500. Calculate the probability that a customer coming into the
service station purchases fuel or pays by debit card.
Applying the addition rule: P (F or D) = P (F) + P(D) – P (F and D). P (F or D) = 192/500 +
78/500 – 62/500
= (192 + 78 – 62)/500 = 208/500 = 0.416 or 41.6 per cent
In class exercise 4: Use the given information to find the indicated probability.
(a) If A ∩ B = Ø, P(A) = 0.3, P(A ∪ B) = 0.4. Find P(B). Answer: 0.1
(b) If A ∩ B = Ø, P(A) = 0.3, P(B) = 0.4. Find P(A ∪ B). Answer: 0.7
(c) A, B and C are mutually exclusive. P(A) = 0.2, P(B) = 0.6, P(C) = 0.1. Find
P(A ∪ B ∪ C). Answer: 0.9
In class exercise 5: Determine whether the information shown is consistent with a probability
distribution. If not, say why.
(a)
Outcome a b c d e
Probability 0 0 0.65 0.3 0.05
where a, b, c, d and e are collectively exhaustive.
(b) P(A) = 0.2, P(B) = 0.1, P(A ∪ B) = 0.4. Answer: No; P(A ∪ B) cannot exceed P(A) + P(B) = 0.3.

In class exercise 6: State with reasons, if the following probability distribution is admissible or
not:
(a)
x 0 1 2
f(x) 0.3 0.2 0.5

(b)
x -2 -1 0 1 2
f(x) 0.3 0.4 -0.2 0.2 0.3

In class exercise 7: In a single throw of two fair dice, find the probability of throwing:
(a) A total of five
(b) A total of eight
(c) A total of five or eight
(d) An odd number on the first die and six on the second
(e) A number greater than four on each die
(f) A total of eleven
(g) A total of nine or eleven
(h) A total greater than eight
In class exercise 8: A committee of 4 persons is to be appointed from 3 officers of the production
department, 4 officers of the purchase department, 2 officers of the sales department and 1
chartered accountant. Find the probability of forming the committee in the following manner:
(1) There must be one from each category
(2) It should have at least one from the purchase department
(3) The chartered accountant must be in the committee
In class exercise 9: Three friends start Higher Education courses at the same time in the same
institution. Angela is studying Accounting, Bashir is studying Business and Charlie is studying
Computing. Seventy per cent of students who begin the Accounting course complete it
successfully, 80 per cent of students who begin the Business course complete it successfully, and
60 per cent of students who begin the Computing course complete it successfully. Construct a
tree diagram and use it to work out:
(a) The probability that all three friends successfully complete their courses.
(b) The probability that two of the friends successfully complete their courses.
(c) The probability that only one of the friends successfully completes their course.

We will use A to represent Angela, B for Bashir and C for Charlie. To indicate someone failing
their course we will use the appropriate letter followed by a prime mark (’), so A’ represents the outcome
that Angela fails her course, whereas A alone represents the outcome that Angela completes her
course successfully.
The completion rate suggests the probability that Angela passes the Accounting course is 0.7,
and given that the ‘pass’ and ‘fail’ outcomes are mutually exclusive and collectively exhaustive,
the probability she fails is 0.3. The probability that Bashir passes the Business course is 0.8, and
the probability that he fails is 0.2. The probability that Charlie passes the Computing course is
0.6 and the probability that he fails is 0.4.
DIAGRAM
The probability that all three pass, that is P (ABC), is the probability at the top on the right-hand
side, 0.336 or 33.6 per cent.
The probability that two of the friends pass is the probability that one of three sequences, either
𝐴𝐵𝐶′ or 𝐴𝐵’𝐶 or 𝐴’𝐵𝐶 occurs. Since these combinations are mutually exclusive we can apply the
simpler form of the addition rule: P (𝐴𝐵𝐶′ or 𝐴𝐵’𝐶 or 𝐴’𝐵𝐶) = 0.224 + 0.084 + 0.144 = 0.452 or
45.2 per cent
The probability that only one of the friends passes is the probability that either 𝐴𝐵′𝐶′ or 𝐴′𝐵𝐶′ or
𝐴’𝐵′𝐶 occurs. These combinations are also mutually exclusive so again we can apply the simpler
form of the addition rule: P (𝐴𝐵′𝐶′ or 𝐴′𝐵𝐶′ or 𝐴’𝐵′𝐶) = 0.056 + 0.096 + 0.036 = 0.188 or 18.8
per cent
A tree diagram should include all possible sequences. One way you can check that it does is to
add up the probabilities on the right-hand side. Because these outcomes are mutually exclusive
and collectively exhaustive their probabilities should add up to one. We can check that this is the
case in this Example.
0.336 + 0.224 + 0.084 + 0.056 + 0.144 + 0.096 + 0.036 + 0.024 = 1
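The eight branches of this tree can be generated and grouped by the number of friends who pass. The sketch below assumes, as the multiplication along each branch does, that the three results are independent; the variable names are illustrative.

```python
from itertools import product

pass_prob = {"A": 0.7, "B": 0.8, "C": 0.6}   # Angela, Bashir, Charlie

# Total probability for each possible number of friends who pass.
by_passes = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0}
for outcome in product((True, False), repeat=3):      # one tuple per branch
    p = 1.0
    for name, passed in zip(pass_prob, outcome):
        p *= pass_prob[name] if passed else 1 - pass_prob[name]
    by_passes[sum(outcome)] += p

for k in (3, 2, 1, 0):
    print(k, round(by_passes[k], 3))
# 3 0.336
# 2 0.452
# 1 0.188
# 0 0.024
```

The four totals add up to 1, which is the same check as summing the right-hand side of the tree.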
In class exercise 10: One weekend a total of 178 prospective homebuyers visit a new housing
development. They are offered a choice of three different types of house: the 2-bedroom
‘Ambience’, the 3-bedroom ‘Bermuda’, or the 4-bedroom ‘Casa’. When invited to select the type
of house they would most like, 43 chose the ‘Ambience’, 61 the ‘Bermuda’, and 29 the ‘Casa’.
Are the choices mutually exclusive? What is the probability that a prospective homebuyer from
this group has chosen the ‘Bermuda’ or the ‘Casa’? The 3 choices of types of homes are
mutually exclusive and exhaustive, yes or no? Why?
We can assume that the choices are mutually exclusive because each prospective homebuyer was
asked to choose only one type of house. We can therefore use the simpler form of the addition
rule.

For convenience we can use the letter A for ‘Ambience’, B for ‘Bermuda’ and C for ‘Casa’.
P (B or C) = P (B) + P (C) = 61/178 + 29/178 = 90/178 = 0.5056 or 50.56 per cent
If you read in class exercise 10 carefully, you can see that although the three choices of types
of house are mutually exclusive, they do not constitute all the alternative outcomes. That is, they
are not collectively exhaustive. As well as choosing one of the three types of house each
prospective homebuyer has a fourth choice, to decline to express a preference. If you subtract the
number of prospective homebuyers expressing a preference from the total number of prospective
homebuyers you will find that 45 of the prospective homebuyers have not chosen one of the
types of house.

TEST YOURSELF…
1. We toss a coin three times. For this experiment we choose the sample space Ω = {HHH,
THH, HTH, HHT, TTH, THT, HTT, TTT} where T stands for tails and H for heads.
(a) Write down the set of outcomes corresponding to each of the following events:
A: “we throw tails exactly two times.”
B: “we throw tails at least two times.”
C: “tails did not appear before a head appeared.”
D: “the first throw results in tails.”
(b) Write down the set of outcomes and probabilities corresponding to each of the
following events: A’, A∪ (C ∩ D), and A ∩ D’.
2. Let S be the set of all outcomes when flipping a fair coin four times, so that all 16
outcomes are equally likely. Define the events A and B by: A = {s ∈ S; s contains more Ts
than Hs}, B = {s ∈ S; any T in s precedes every H in s}. Compute the probabilities P(A),
P(B).
3. Students in a certain college subscribe to three news magazines A, B, and C according to
the following proportions: A: 20%, B: 15%, C: 10%, both A and B: 5%, both A and C:
4%, both B and C: 3%, all three A, B, and C: 2%. If a student is chosen at random, what is
the probability he/she subscribes to none of the news magazines?
4. Use the given information to find the indicated probability.
(a) P(A) = 0.1, P(B) = 0.6, P(A ∩ B) = 0.05. Find P(A ∪ B). Answer: 0.65
(b) P(A ∪ B) = 0.9, P(B) = 0.6, P(A ∩ B) = 0.1. Find P(A). Answer: 0.4
(c) P(A) = 0.22. Find P(A'). Answer: 0.78
(d) A and B are mutually exclusive. P(A) = 0.4, P(B) = 0.4. Find P((A ∪ B)'). Answer: 0.2
(e) P(A ∪ B) = 0.3 and P(A ∩ B) = 0.1. Find P(A) + P(B). Answer: 0.4

5. Determine whether the information shown in (a) and (b) is consistent with a probability
distribution. If not, say why.
(a) P(A) = 0.1, P(B) = 0, P(A ∩ B) = 0. Answer: Yes
6. A number is chosen from each of the two sets {1, 2, 3, 4, 5, 6, 7, 8, 9} and {1, 2, 3, 4, 5, 6, 7, 8, 9}. If
p1 is the probability that the sum of the two numbers is 10 and p2 is the probability that
their sum is 8, find (i) p1 + p2 (ii) p2/p1 (iii) p1p2
7. In the play of two dice, the thrower loses if his first throw is 2, 4 or 12. He wins if his
first throw is a 5 or 11. Find the ratio between his probability of losing and probability of
winning in the first throw.
8. A bag contains eight balls: five are red and three are white. If a man selects two balls at
random from the bag, what is the probability that (i) both are red (ii) both are white (iii)
he gets one ball of each colour?
9. Two events A and B are mutually exclusive, such that P(A) = 1/5, P(B) = 1/3. Find the
probability that (i) Either A or B will occur (ii) Both A and B will occur (iii) Neither A
nor B will occur.
10. There are three Economists and three Sociologists. A committee of three is to be formed
at random. Find the probability that at least one Economist and at least one Sociologist is
in the committee.
11. Two digits are selected at random from the digits 1 through 9. If the sum is even, find the
probability that both are odd.
12. Let A and B be the respective events that two contracts I and II, say, are completed by
certain deadlines, and suppose that: P (at least one contract is completed by its deadline)
= 0.9 and P (both contracts are completed by their deadlines) = 0.5. Calculate the
probability: P (exactly one contract is completed by its deadline).
13. Point out the error in the following statement: “the probability that a student will commit
exactly one mistake during his physics laboratory experiment is 0.08 and the probability
that he will commit at least one mistake is 0.05.” (Answer: Wrong; the latter probability cannot be
less than the former.)
14. Criticize the following statement: “the probability of Lanre passing an examination is 1/3
and the probability of Seyi passing the same examination is 2/3. Therefore, the
probability of either one of them passing in the examination is 1.” (Answer: Wrong, because the
two events are not mutually exclusive.)
15. Give reasons why there must be a mistake in each of the following statements:
(i) “If the probability that an urn contains Uranium is 0.2, the probability that it does
not contain Uranium is 0.62”.
(ii) “A quality control Engineer claims that the probabilities that a large consignment
of glass bricks contains 0, 1, 2, 3, 4 or 5 defectives are 0.11, 0.23, 0.37, 0.16, 0.09
and 0.05”.
(iii) “A company is working on the construction of two shopping centers. The
probability that the larger of the two shopping centers will be completed on time
is 0.35 and the probability that both shopping centers will be completed on time is
0.42”.
16. State, with reasons, whether each of the following probability distributions is admissible:
(a)
x       -1    0     1
f(x)    0.4   0.4   0.3

(b)
x       0     1     2     3
f(x)    0.2   0.3   0.3   0.1

Frequency Distribution and Graphs

Introduction

When conducting a statistical study, the researcher must gather data for the particular variable under
study. For example, if a researcher wishes to study the number of people who were bitten by poisonous
snakes in a specific geographic area over the past several years, he or she has to gather the data from
various doctors, hospitals, or health departments. To describe situations, draw conclusions, or make
inferences about events, the researcher must organize the data in some meaningful way. The most
convenient method of organizing data is to construct a frequency distribution. After organizing the data,
the researcher must present them so they can be understood by those who will benefit from reading the
study. The most useful method of presenting the data is by constructing statistical charts and graphs.
There are many different types of charts and graphs, and each one has a specific purpose.

Summary: When summarizing large masses of raw data, it is often useful to distribute the data
into classes, or categories, and to determine the number of individuals belonging to each class,
called the class frequency. A tabular arrangement of data by classes together with the
corresponding class frequencies is called a Frequency distribution, or frequency table.

Definition: A frequency distribution is the organization of raw data in table form, using classes and
frequencies

Suppose Table 1 is a frequency distribution of the heights (recorded to the nearest inch) of 100
male students at Federal University Lokoja.

Table 1 Heights of 100 male students at Federal University Lokoja

Height (in) Number of Students

60–62 5
63–65 18
66–68 42
69–71 27
72–74 8
Total 100

Data organized and summarized as in the above frequency distribution are often called grouped
data. Although the grouping process generally destroys much of the original detail of the data, an
important advantage is gained in the clear ‘‘overall’’ picture that is obtained and in the vital
relationships that are thereby made evident.

CLASS INTERVALS AND CLASS LIMITS

A symbol defining a class, such as 60–62 in Table 1, is called a class interval. The end numbers,
60 and 62, are called class limits; the smaller number (60) is the lower class limit, and the larger
number (62) is the upper class limit. The terms class and class interval are often used
interchangeably, although the class interval is actually a symbol for the class. A class interval
that, at least theoretically, has either no upper class limit or no lower class limit indicated is
called an open class interval. For example, referring to age groups of individuals, the class
interval ‘‘65 years and over’’ is an open class interval.

CLASS BOUNDARIES

If heights are recorded to the nearest inch, the class interval 60–62 theoretically includes all
measurements from 59.5000 to 62.5000 in. These numbers, indicated briefly by the exact
numbers 59.5 and 62.5, are called class boundaries, or true class limits; the smaller number
(59.5) is the lower class boundary, and the larger number (62.5) is the upper class boundary.

In practice, the class boundaries are obtained by adding the upper limit of one class interval to
the lower limit of the next-higher class interval and dividing by 2. Sometimes, class boundaries
are used to symbolize classes. For example, the various classes in the first column of Table 1
could be indicated by 59.5–62.5, 62.5–65.5, etc. To avoid ambiguity in using such notation, class
boundaries should not coincide with actual observations. Thus if an observation were 62.5, it
would not be possible to decide whether it belonged to the class interval 59.5–62.5 or 62.5–65.5.

THE SIZE, OR WIDTH, OF A CLASS INTERVAL

The size, or width, of a class interval is the difference between the lower and upper class
boundaries and is also referred to as the class width, class size, or class length. If all class
intervals of a frequency distribution have equal widths, this common width is denoted by c. In
such case c is equal to the difference between two successive lower class limits or two successive
upper class limits. For the data of Table 1, for example, the class width is c = 62.5 − 59.5 =
65.5 − 62.5 = 3.

THE CLASS MARK

The class mark is the midpoint of the class interval and is obtained by adding the lower and
upper class limits and dividing by 2. Thus the class mark of the interval 60–62 is (60+62)/2 = 61.
The class mark is also called the class midpoint. For purposes of further mathematical analysis,
all observations belonging to a given class interval are assumed to coincide with the class mark.
Thus all heights in the class interval 60–62 in are considered to be 61 in.
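
The definitions above can be checked on Table 1 with a short Python sketch. The variable and function names are illustrative only; the half-gap of 0.5 follows from heights being recorded to the nearest inch.

```python
# Class limits from Table 1 (heights recorded to the nearest inch).
class_limits = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]

# Half the gap between one upper limit and the next lower limit (0.5 here).
half_gap = (class_limits[1][0] - class_limits[0][1]) / 2

def describe(lower, upper):
    """Return (lower boundary, upper boundary, width, class mark)."""
    lb, ub = lower - half_gap, upper + half_gap
    return lb, ub, ub - lb, (lower + upper) / 2

summary = [describe(lo, hi) for lo, hi in class_limits]
for (lo, hi), (lb, ub, width, mark) in zip(class_limits, summary):
    print(f"{lo}-{hi}: boundaries {lb}-{ub}, width {width}, class mark {mark}")
```

Every class has width 3, and the first class has boundaries 59.5 and 62.5 and class mark 61, in agreement with the text.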

GENERAL RULES FOR FORMING FREQUENCY DISTRIBUTIONS

1. Determine the largest and smallest numbers in the raw data and thus find the range (the
difference between the largest and smallest numbers).

2. Divide the range into a convenient number of class intervals having the same size. If this is not
feasible, use class intervals of different sizes or open class intervals. The number of class
intervals is usually between 5 and 20, depending on the data. Class intervals are also chosen so
that the class marks (or midpoints) coincide with the actually observed data. This tends to lessen
the so-called grouping error involved in further mathematical analysis. However, the class
boundaries should not coincide with the actually observed data.

3. Determine the number of observations falling into each class interval; that is, find the class
frequencies. This is best done by using a tally, or score sheet.

4. For convenience, the number of classes can be calculated from the formula
$K = 1 + 3.3 \log_{10} n$, where K = number of classes and n = sample size.
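
As a rough illustration of rule 4, the sketch below evaluates the formula for the 25 weight-loss values used in Illustration 1. Rounding K up to a whole number is an assumption made here (the notes do not say how to round), and the function name is hypothetical.

```python
import math

def number_of_classes(n):
    """K = 1 + 3.3 * log10(n), rounded up to a whole number (an assumption)."""
    return math.ceil(1 + 3.3 * math.log10(n))

weight_loss = [20.5, 19.5, 15.6, 24.1, 9.9,
               15.4, 12.7, 5.4, 17.0, 28.6,
               16.9, 7.8, 23.3, 11.8, 18.4,
               13.4, 14.3, 19.2, 9.2, 16.8,
               8.8, 22.1, 20.8, 12.6, 15.9]

K = number_of_classes(len(weight_loss))    # 1 + 3.3*log10(25) is about 5.6
R = max(weight_loss) - min(weight_loss)    # range = 28.6 - 5.4
print("classes:", K, "range:", round(R, 1), "range/classes:", round(R / K, 2))
```

With n = 25 the formula suggests about 6 classes, so a class width of roughly 4 would cover the range of 23.2.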

Illustration 1: The table below shows the weight loss of students in Pure and Applied Sciences.

Weight Loss Data


20.5 19.5 15.6 24.1 9.9
15.4 12.7 5.4 17.0 28.6
16.9 7.8 23.3 11.8 18.4
13.4 14.3 19.2 9.2 16.8
8.8 22.1 20.8 12.6 15.9

Provide a useful summary of the available information


Categorical Frequency Distributions

The categorical frequency distribution is used for data that can be placed in specific categories,
such as nominal- or ordinal-level data. For example, data such as political affiliation, religious
affiliation, or major field of study would use categorical frequency distributions.

Illustration 2: Twenty-five army inductees were given a blood test to determine their blood type.
The data set is

A B B AB O

O O B AB B

B B O A O

A O O O AB

AB A O B A

Construct a frequency distribution for the data.
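
One possible way to tally the categories is sketched below in Python, using `collections.Counter`. The percentage column is an extra added for illustration.

```python
from collections import Counter

# The 25 blood types from Illustration 2, row by row.
blood_types = ("A B B AB O  O O B AB B  B B O A O  "
               "A O O O AB  AB A O B A").split()

counts = Counter(blood_types)
total = sum(counts.values())

print("Class  Frequency  Percent")
for blood_type in ("A", "B", "O", "AB"):
    f = counts[blood_type]
    print(f"{blood_type:<6} {f:>9}  {100 * f / total:6.1f}")
print(f"Total  {total:>9}")
```

The tally gives A = 5, B = 7, O = 9 and AB = 4, which add up to the 25 inductees.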

Illustration 3: The data shown are the number of grams per serving of 30 selected brands of cakes.
Construct a frequency distribution using 5 classes.

32 47 51 41 46 30
46 38 34 34 52 48
48 38 43 41 21 24
25 29 33 45 51 32
32 27 23 23 34 35
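
A minimal sketch of one way to build the five classes follows. It assumes the first class starts at the smallest value and that the width is the range divided by 5, taken up to the next whole number; other starting points and widths are equally acceptable.

```python
import math

grams = [32, 47, 51, 41, 46, 30,
         46, 38, 34, 34, 52, 48,
         48, 38, 43, 41, 21, 24,
         25, 29, 33, 45, 51, 32,
         32, 27, 23, 23, 34, 35]

num_classes = 5
data_range = max(grams) - min(grams)                # 52 - 21 = 31
width = math.floor(data_range / num_classes) + 1    # 31/5 = 6.2, so use 7

table = []
for i in range(num_classes):
    lower = min(grams) + i * width
    upper = lower + width - 1       # whole-number data: limits are 1 unit apart
    frequency = sum(lower <= g <= upper for g in grams)
    table.append((lower, upper, frequency))

for lower, upper, frequency in table:
    print(f"{lower}-{upper}: {frequency}")
```

Under these assumptions the classes are 21–27, 28–34, 35–41, 42–48 and 49–55, with frequencies 6, 9, 5, 7 and 3.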

Assignment 2

• The number of passengers (in thousands) for the leading U.S. passenger airlines in 2004
is indicated below. Use the data to construct a grouped frequency distribution and a
cumulative frequency distribution with a reasonable number of classes, and comment on
the shape of the distribution.
91,570 86,755 81,066 70,786 55,373 42,400 40,551 21,119 16,280
14,869 13,659 13,417 13,170 12,632 11,731 10,420 10,024 9,122 7,041
6,954 6,406 6,362 5,930 5,585 5,427
• A survey was taken on how much trust people place in the information they read on the
Internet. Construct a categorical frequency distribution for the data. A = trust in
everything they read, M = trust in most of what they read, H= trust in about one-half of
what they read, S = trust in a small portion of what they read. (Based on information from
the UCLA Internet Report.)

M M M A H M S M H M

S M M M M A M M A M

M M H M M M H M H M

A M M M H M M M M M

• The final grades in mathematics of 80 students at State University are recorded in the
accompanying table.

68 84 75 82 68 90 62 88 76 93

73 79 88 73 60 93 71 59 85 75

61 65 75 87 74 62 95 78 63 72

66 78 82 75 94 77 69 74 68 60

96 78 89 61 75 95 60 79 83 71

79 62 67 97 78 85 76 65 71 75

65 80 73 57 88 78 62 76 53 74

86 67 73 81 72 63 76 75 85 77

With reference to this table, find:

(a) The highest grade.

(b) The lowest grade.

(c) The range.

(e) Summarize the data completely.

The width (W) of the intervals can be decided using the formula

$W = \frac{R}{K}$, where R is the range and K is the number of classes given by the earlier formula.

FREQUENCY GRAPH FOR CONTINUOUS DATA

• Histogram and the Frequency Polygon

A convenient way of displaying a frequency table is by means of a histogram and/or a frequency
polygon. A histogram is a diagram in which:

• The horizontal scale represents the value of the variable marked at interval boundaries.
• The vertical scale represents the frequency or relative frequency in each interval.

A histogram presents us with a graphic picture of the distribution of measurements. This picture
consists of rectangular bars joining each other, one for each interval.

Illustration 4: Construct a histogram to represent the data shown for the record high
temperatures for each of the 50 states.

Class boundaries Frequency


99.5–104.5 2
104.5–109.5 8
109.5–114.5 18
114.5–119.5 13
119.5–124.5 7
124.5–129.5 1
129.5–134.5 1

Solution

Step 1 Draw and label the x and y axes. The x axis is always the horizontal axis, and the y axis is
always the vertical axis.

Step 2 Represent the frequency on the y axis and the class boundaries on the x axis.

Step 3 Using the frequencies as the heights, draw vertical bars for each class.

• The Frequency Polygon

Another way to represent the same data set is by using a frequency polygon.

The frequency polygon is a graph that displays the data by using lines that connect points
plotted for the frequencies at the midpoints of the classes. The frequencies are represented by
the heights of the points.

Solution

Step 1: Find the midpoint of each class. Recall that midpoints are found by adding the upper and
lower boundaries and dividing by 2: (99.5 + 104.5)/2 = 102, (104.5 + 109.5)/2 = 107, and so on. The midpoints are

Class boundaries Midpoints Frequency


99.5–104.5 102 2
104.5–109.5 107 8
109.5–114.5 112 18
114.5–119.5 117 13
119.5–124.5 122 7
124.5–129.5 127 1
129.5–134.5 132 1

Step 2: Draw the x and y axes. Label the x axis with the midpoint of each class, and then use a
suitable scale on the y axis for the frequencies.

Step 3: Using the midpoints for the x values and the frequencies as the y values, plot the points.

Step 4: Connect adjacent points with line segments. Draw a line back to the x axis at the
beginning and end of the graph, at the same distance that the previous and next midpoints would
be located.
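
The points to be plotted in Steps 1 to 4 can be listed with a short Python sketch. It only prepares the coordinates; drawing them is left to graph paper or any plotting tool. The variable names are illustrative.

```python
# Class boundaries and frequencies from Illustration 4.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Step 1: midpoint of each class.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# Step 4: close the polygon on the x axis one class width before the first
# midpoint and one class width after the last midpoint.
width = boundaries[1] - boundaries[0]
points = ([(midpoints[0] - width, 0)]
          + list(zip(midpoints, frequencies))
          + [(midpoints[-1] + width, 0)])

for x, y in points:
    print(x, y)
```

The polygon therefore starts at (97, 0), passes through (102, 2), (107, 8) and so on, and returns to the axis at (137, 0).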

• The Ogive

The third type of graph that can be used represents the cumulative frequencies for the classes.
This type of graph is called the cumulative frequency graph, or ogive. The cumulative frequency
is the sum of the frequencies accumulated up to the upper boundary of a class in the distribution.

The ogive is a graph that represents the cumulative frequencies for the classes in a frequency
distribution.

Construct an ogive for the frequency distribution described in Illustration 4.

Solution

Step 1: Find the cumulative frequency for each class

Cumulative frequency
Less than 99.5 0
Less than 104.5 2
Less than 109.5 10
Less than 114.5 28
Less than 119.5 41
Less than 124.5 48
Less than 129.5 49
Less than 134.5 50

Step 2 Draw the x and y axes. Label the x axis with the class boundaries. Use an appropriate
scale for the y axis to represent the cumulative frequencies. (Depending on the numbers in the
cumulative frequency columns, scales such as 0, 1, 2, 3. . . or 5, 10, 15, 20, . . . , or 1000, 2000,
3000, . . . can be used. Do not label the y axis with the numbers in the cumulative frequency
column.) In this example, a scale of 0, 5, 10, 15, . . . will be used.

Step 3 Plot the cumulative frequency at each upper class boundary, as shown in Figure 2–4.
Upper boundaries are used since the cumulative frequencies represent the number of data values
accumulated up to the upper boundary of each class.

Step 4 Starting with the first upper class boundary, 104.5, connect adjacent points with line
segments, as shown in Figure 2–5. Then extend the graph to the first lower class boundary, 99.5,
on the x axis.
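
The cumulative frequencies in Step 1 and the points plotted in Steps 3 and 4 can be reproduced with a small Python sketch using `itertools.accumulate`. As before, only the coordinates are computed.

```python
from itertools import accumulate

# Class boundaries and frequencies from Illustration 4.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Cumulative frequency "less than" each boundary; 0 at the first lower boundary.
cumulative = [0] + list(accumulate(frequencies))
ogive_points = list(zip(boundaries, cumulative))

for boundary, cum_freq in ogive_points:
    print(f"Less than {boundary}: {cum_freq}")
```

The last point, (134.5, 50), equals the total number of states, which is a quick check on the arithmetic.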

• Relative Frequency Graphs

The relative frequency of a class is the frequency of the class divided by the total frequency of
all classes and is generally expressed as a percentage. To find the relative frequency of a class,
divide the frequency f by the sample size n.

Relative frequency $= \frac{f}{n}$, where f = frequency of the class and n = total number of values.

Illustration 5: Construct a histogram, frequency polygon, and ogive using relative frequencies
for the distribution of the miles that 20 randomly selected runners ran during a given week.

Class boundaries Frequency Midpoints Relative frequency Cumulative Relative frequency

5.5–10.5 1
10.5–15.5 2
15.5–20.5 3
20.5–25.5 5
25.5–30.5 4
30.5–35.5 3
35.5–40.5 2
Total 20

Solution

Step 1 Convert each frequency to a proportion or relative frequency by dividing the frequency
for each class by the total number of observations.

Step 2 Find the cumulative relative frequencies. To do this, add the relative frequency in each class to
the cumulative relative frequency of the preceding class.

Step 3 Draw each graph as shown in Figure 2–7. For the histogram and ogive, use the class
boundaries along the x axis. For the frequency polygon, use the midpoints on the x axis. The
scale on the y axis uses proportions.
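
Steps 1 and 2 for the runners' data can be sketched in Python as follows. The cumulative relative frequencies are obtained here by dividing the cumulative frequencies by n, which gives the same result as adding the relative frequencies one by one while avoiding small rounding errors.

```python
from itertools import accumulate

# Class boundaries and frequencies from Illustration 5.
boundaries = [5.5, 10.5, 15.5, 20.5, 25.5, 30.5, 35.5, 40.5]
frequencies = [1, 2, 3, 5, 4, 3, 2]
n = sum(frequencies)

midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
relative = [f / n for f in frequencies]
cumulative_relative = [c / n for c in accumulate(frequencies)]

for lo, hi, f, m, r, cr in zip(boundaries, boundaries[1:], frequencies,
                               midpoints, relative, cumulative_relative):
    print(f"{lo}-{hi}  f={f}  midpoint={m}  rel={r:.2f}  cum rel={cr:.2f}")
```

The relative frequencies add up to 1, so the last cumulative relative frequency is 1.00.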

Types of frequency polygon curves

1) Symmetrical or bell-shaped curves are characterized by the fact that observations
equidistant from the central maximum have the same frequency. Adult male and adult
female heights have bell-shaped distributions.

Symmetrical or bell-shaped

2) Curves that have tails to the left are said to be skewed to the left. The lifetimes of males
and females are skewed to the left. A few die early in life but most live between 60 and
80 years. Generally, females live about ten years, on the average, longer than males.

Skewed to the left

3) Curves that have tails to the right are said to be skewed to the right. The ages at the time
of marriage of brides and grooms are skewed to the right. Most marry in their twenties
and thirties but a few marry in their forties, fifties, sixties and seventies.

Skewed to the right

4) Curves that have approximately equal frequencies across their values are said to be
uniformly distributed. Certain machines that dispense liquid colas do so uniformly
between 15.9 and 16.1ounces, for example.

Uniform

5) In a J-shaped or reverse J-shaped frequency curve the maximum occurs at one end or the
other.
6) A U-shaped frequency distribution curve has maxima at both ends and a minimum in
between.
7) A bimodal frequency curve has two maxima.
8) A multimodal frequency curve has more than two maxima.

• Lorenz Curve

The Lorenz curve is a graphical method of studying dispersion. It was introduced by Max O. Lorenz,
an economist and statistician, to study the distribution of wealth and income. It is also
used to study the variability in the distribution of profits, wages, revenue, etc.

It is especially used to study the degree of inequality in the distribution of income and wealth
between countries or between different periods. The cumulative percentage of one
variable is plotted against the cumulative percentage of the other variable, and the points are
joined to give the Lorenz curve.

The curve starts from the origin (0, 0) and ends at (100, 100). If the wealth, revenue, land, etc. are
equally distributed among the people of the country, then the Lorenz curve will be the diagonal
of the square. In practice this almost never happens. The deviation of the Lorenz curve from the diagonal
shows how unequally the wealth, revenue, land, etc. are distributed among people.

Illustration 6: The following table gives the profit earned by companies
belonging to two areas A and B. Draw their Lorenz curves in the same diagram and interpret
them.

Profit earned (in thousands)    Number of Companies
                                Area A    Area B
5                               7         13
26                              12        25
65                              14        43
89                              28        57
110                             33        45
155                             25        28
180                             18        13
200                             8         6

Solution:

Profit                        Area A                        Area B
Earned  Cum Profit  Cum %     #companies  Cum #  Cum %      #companies  Cum #  Cum %
5       5           1         7           7      5          13          13     6
26      31          4         12          19     13         25          38     17
65      96          12        14          33     23         43          81     35
89      185         22        28          61     42         57          138    60
110     295         36        33          94     65         45          183    80
155     450         54        25          119    82         28          211    92
180     630         76        18          137    94         13          224    97
200     830         100       8           145    100        6           230    100
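
The cumulative percentages can be checked with a short Python sketch. The eight profit classes and company counts used below are the ones implied by the cumulative columns of the solution table (totals 830, 145 and 230); the function name is illustrative.

```python
from itertools import accumulate

def cumulative_percent(values):
    """Cumulative totals expressed as whole-number percentages of the grand total."""
    total = sum(values)
    return [round(100 * c / total) for c in accumulate(values)]

profit = [5, 26, 65, 89, 110, 155, 180, 200]
area_a = [7, 12, 14, 28, 33, 25, 18, 8]
area_b = [13, 25, 43, 57, 45, 28, 13, 6]

x = cumulative_percent(profit)      # horizontal axis: cumulative % of profit
y_a = cumulative_percent(area_a)    # vertical axis: cumulative % of companies, Area A
y_b = cumulative_percent(area_b)    # vertical axis: cumulative % of companies, Area B

# Both curves start at the origin (0, 0) and end at (100, 100).
curve_a = [(0, 0)] + list(zip(x, y_a))
curve_b = [(0, 0)] + list(zip(x, y_b))
print(curve_a)
print(curve_b)
```

Plotting `curve_a` and `curve_b` together with the diagonal from (0, 0) to (100, 100) shows which area lies further from equal distribution.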

[Figure: Lorenz curves for Area A and Area B]

FREQUENCY GRAPH FOR CATEGORICAL OR QUALITATIVE DATA

Diagrams/Charts: A diagram/chart is a visual form for presentation of statistical data,
highlighting their basic facts and relationships. Diagrams drawn on the basis of the data
collected are easily understood and appreciated by all. They are readily intelligible and save a
considerable amount of time and energy.

Significance of charts/Diagrams

1. They are attractive and impressive.

2. They make data simple and intelligible.

3. They make comparison possible

4. They save time and labour.

5. They have universal utility.

6. They give more information.

7. They have a great memorizing effect.

Types of diagrams:

In practice, a very large variety of diagrams are in use and new ones are constantly being added.
For the sake of convenience and simplicity, they may be divided under the following heads:

1. One-dimensional diagrams

2. Two-dimensional diagrams

3. Three-dimensional diagrams

4. Pictograms and Cartograms

One-dimensional diagrams: In such diagrams, only one dimension, i.e. height,
is used and the width is not considered. These diagrams are in the form of bar or line charts and
can be classified as

1. Line Diagram

2. Simple Bar Diagram

3. Multiple Bar Diagram

4. Sub-divided Bar Diagram

5. Percentage Bar Diagram

Line Diagram: Line diagram is used in case where there are many items to be shown and there
is not much of difference in their values. Such diagram is prepared by drawing a vertical line for
each item according to the scale. The distance between lines is kept uniform. Line diagram
makes comparison easy, but it is less attractive.

Illustration 7: Show the following data by a line chart:

No. of children 0 1 2 3 4 5
Frequency 10 14 9 6 4 2

Solution: Plot the frequency on the y-axis and the number of children on the x-axis, starting from the origin
(0, 0).

Bar Graphs: A bar graph is the analogue of a histogram for categorical data. A bar is displayed
for each level of a factor, with the heights of the bars proportional to the frequencies of
observations falling in the respective categories. A disadvantage of bar graphs drawn with some statistical
software is that the levels are ordered alphabetically by default, which may sometimes obscure patterns in the display.

A bar graph represents the data by using vertical or horizontal bars whose heights or
lengths represent the frequencies of the data.

Simple Bar chart: A simple bar diagram can be drawn on either a horizontal or a vertical base, but
bars on a horizontal base are more common. To make the diagram attractive, the bars can be coloured.

Bar diagrams are used in business and economics. However, an important limitation of such
diagrams is that they can present only one classification or one category of data.

Illustration 8: Represent the following data by a bar diagram.

Year Production (in tonnes)


1991 45
1992 40
1993 42
1994 55
1995 50

Multiple Bar Diagram: Multiple bar diagram is used for comparing two or more sets of
statistical data. Bars are constructed side by side to represent the set of values for comparison. In
order to distinguish bars, they may be either differently coloured or there should be different
types of crossings or dotting, etc. An index is also prepared to identify the meaning of different
colours or dottings.

Illustration 9: Draw a multiple bar diagram for the following data.

Year Profit before tax ( in lakhs of rupees ) Profit after tax ( in lakhs of rupees )
1998 195 80
1999 200 87
2000 165 45
2001 140 32

Sub-divided Bar Diagram: In a sub-divided bar diagram, the bar is sub-divided into various
parts in proportion to the values given in the data and the whole bar represents the total. Such
diagrams are also called Component Bar diagrams. The sub-divisions are distinguished by
different colours or crossings or dottings.

The main defect of such a diagram is that all the parts do not have a common base to enable one
to compare accurately the various components of the data.

Illustration 10: Represent the following data by a sub-divided bar diagram.

Expenditure items Monthly expenditure (in Rs.)


Family A Family B
Food 75 95
Clothing 20 25
Education 15 10
Housing Rent 40 65
Miscellaneous 25 35

Percentage bar diagram: This is another form of component bar diagram. Here the components
are not the actual values but percentages of the whole. The main difference between the sub-
divided bar diagram and percentage bar diagram is that in the former the bars are of different
heights since their totals may be different whereas in the latter the bars are of equal height since
each bar represents 100 percent. In the case of data having sub-division, percentage bar diagram
will be more appealing than sub-divided bar diagram.

Illustration 11: Represent the following data by a percentage bar diagram:

Particular        Factory A    Factory B
Selling Price     400          650
Quantity Sold     240          365
Wages             3500         5000
Materials         2100         3500
Miscellaneous     1400         2100
Solution: Convert the given values into percentages as follows:

Particular        Factory A          Factory B
                  Values    %        Values    %
Selling Price     400                650
Quantity Sold     240                365
Wages             3500               5000
Materials         2100               3500
Miscellaneous     1400               2100
Total             7640               11615
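
The percentage columns can be filled in with a short Python sketch. Rounding to one decimal place is a choice made here for illustration, so a column may add up to 99.9 or 100.1 rather than exactly 100.

```python
items = ["Selling Price", "Quantity Sold", "Wages", "Materials", "Miscellaneous"]
factory_a = [400, 240, 3500, 2100, 1400]
factory_b = [650, 365, 5000, 3500, 2100]

def percentages(values):
    """Each value as a percentage of the column total, to one decimal place."""
    total = sum(values)
    return [round(100 * v / total, 1) for v in values]

pct_a = percentages(factory_a)
pct_b = percentages(factory_b)

for item, a, pa, b, pb in zip(items, factory_a, pct_a, factory_b, pct_b):
    print(f"{item:<14} {a:>5} {pa:>5}  {b:>5} {pb:>5}")
```

Each bar of the percentage bar diagram is then drawn to the same height (100 percent) and divided in these proportions.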

Two-dimensional Diagrams: In one-dimensional diagrams, only length is taken into account.
But in two-dimensional diagrams the area represents the data and so the length and breadth have
both to be taken into account. Such diagrams are also called area diagrams or surface diagrams.
The important types of area diagrams are:

• Rectangles
• Squares
• Pie-diagram. In this course, we limit ourselves to the pie-diagram or pie-chart.

Pie Diagram or Circular Diagram: Another way of preparing a two-dimensional diagram is in
the form of circles. In such diagrams, both the total and the component parts or sectors can be
shown. The area of a circle is proportional to the square of its radius. While making
comparisons, pie diagrams should be used on a percentage basis and not on an absolute basis. In
constructing a pie diagram the first step is to prepare the data so that the various component values
can be transposed into corresponding degrees on the circle.

The second step is to draw a circle of appropriate size with a compass. The size of the radius
depends upon the available space and other factors of presentation. The third step is to measure
points on the circle and mark off the size of each sector with the help of a protractor.

A pie graph is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution.

Degrees $= \frac{f}{n} \times 360^\circ$, Percentage $= \frac{f}{n} \times 100\%$

Illustration 12: Draw a Pie diagram for the following data of production of sugar in quintals of various
countries.

Country Production of Sugar (in quintals)


Cuba 62
Australia 47
India 35
Japan 16
Egypt 6

Solution: The values are expressed in terms of degree as follows

Country      Production of Sugar
             In Quintals    In Degrees
Cuba         62             134
Australia    47
India        35
Japan        16
Egypt        6
Total        166
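
The conversion to degrees, using Degrees = (f/n) × 360, can be sketched in Python as follows. Rounding to the nearest whole degree is an assumption, made so that the angles can be measured with a protractor.

```python
production = {"Cuba": 62, "Australia": 47, "India": 35, "Japan": 16, "Egypt": 6}
total = sum(production.values())

# Angle of each sector, rounded to the nearest degree.
degrees = {country: round(360 * f / total) for country, f in production.items()}
# Share of each sector as a percentage, to one decimal place.
percent = {country: round(100 * f / total, 1) for country, f in production.items()}

for country in production:
    print(f"{country:<10} {production[country]:>4} {degrees[country]:>5} {percent[country]:>6}")
print("Total degrees:", sum(degrees.values()))
```

For Cuba this gives (62/166) × 360 ≈ 134 degrees, matching the first row of the table; the rounded angles add up to 360.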

Pareto Charts: When the variable displayed on the horizontal axis is qualitative or categorical, a
Pareto chart can also be used to represent the data.

Definition:

• A Pareto chart is used to represent a frequency distribution for a categorical variable, and
the frequencies are displayed by the heights of vertical bars, which are arranged in order
from highest to lowest.
• A bar chart used to separate the “vital few” from the “trivial many.” These charts are
based on the Pareto Principle, which states that 20 percent of the problems have 80
percent of the impact. The 20 percent of the problems are the “vital few” and the
remaining problems are the “trivial many.” A Pareto chart can help you:
i. Separate the few major problems from the many possible problems so you can focus
your improvement efforts
ii. Arrange data according to priority or importance
iii. Determine which problems are most important, using data, not perception

Illustration 13: The data shown here consist of the number of homeless people for a sample of
selected cities. Construct and analyze a Pareto chart for the data.

City Number
Atlanta 6832
Baltimore 2904
Chicago 6680
St. Louis 1485
Washington 5518
Solution

Step 1 Arrange the data from the largest to smallest according to frequency.

Step 2 Draw and label the x and y axes.

Step 3 Draw the bars corresponding to the frequencies.

The graph shows that the number of homeless people is about the same for Atlanta and
Chicago and a lot less for Baltimore and St. Louis.
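
Step 1 is the only step that needs computation, and it can be sketched in Python as below. The rough text bars (one `#` per 500 people) are purely illustrative and stand in for the drawn chart.

```python
homeless = {"Atlanta": 6832, "Baltimore": 2904, "Chicago": 6680,
            "St. Louis": 1485, "Washington": 5518}

# Step 1: arrange the categories from largest to smallest frequency.
ordered = sorted(homeless.items(), key=lambda item: item[1], reverse=True)

scale = 500  # one '#' per 500 people (an arbitrary choice for the text display)
for city, number in ordered:
    print(f"{city:<11}{number:>6}  {'#' * round(number / scale)}")
```

The bars are drawn in the order Atlanta, Chicago, Washington, Baltimore, St. Louis.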

Stem-and-Leaf Plots: The stem-and-leaf plot is a simple way to organize and present data in a
histogram-like display. It is an excellent way to begin an analysis.

The stem and leaf plot is a method of organizing data and is a combination of sorting and
graphing. It has the advantage over a grouped frequency distribution of retaining the actual data
while showing them in graphical form.

Illustration 14: At an outpatient testing center, the number of cardiograms performed each day
for 20 days is shown. Construct a stem and leaf plot for the data.

25 31 20 32 13

14 43 02 57 23

36 32 33 32 44

32 52 44 51 45

Solution:

Step 1 Arrange the data in order.

Step 2 Separate the data according to the first digit.

Step 3 A display can be made by using the leading digit as the stem and the trailing digit as the
leaf.
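
The three steps can be sketched in Python as follows, taking the tens digit as the stem and the units digit as the leaf (so the value 02 has stem 0 and leaf 2).

```python
from collections import defaultdict

data = [25, 31, 20, 32, 13,
        14, 43, 2, 57, 23,
        36, 32, 33, 32, 44,
        32, 52, 44, 51, 45]

stems = defaultdict(list)
for value in sorted(data):                     # Step 1: arrange the data in order
    stems[value // 10].append(value % 10)      # Steps 2 and 3: stem = tens, leaf = units

for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")
```

The longest row is stem 3 (leaves 1 2 2 2 2 3 6), so most days had between 31 and 36 cardiograms.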

Illustration 15: The next illustrative example shows how to draw a stem-and-leaf plot for data
that might not immediately lend itself to plotting. Consider the data: 1.47 2.06 2.36 3.43 3.74
3.78 3.94 4.42

The values have 3 significant digits and a decimal point. In such instances, we truncate the data to
include only the first two significant digits. Thus, a value of 1.47 becomes 1.4, a value of 2.06
becomes 2.0, and so on. The stem-and-leaf plot of this truncated data looks like this:

1 | 4
2 | 0 3
3 | 4 7 7 9
4 | 4

• NUMERICAL METHODS
Measures of Central Tendency: The central value is called a measure of central
tendency, an average, or a measure of location. There are five averages. Among them,
the mean, median and mode are called simple averages, and the other two, the geometric
mean and harmonic mean, are called special averages.

The meaning of average is nicely given in the following definitions.

“A measure of central tendency is a typical value around which other figures congregate.”

“An average stands for the whole group of which it forms a part yet represents the whole.”

“One of the most widely used set of summary figures is known as measures of location.”

Characteristics for a good or an ideal average:

An ideal average should possess the following properties.

1. It should be rigidly defined.

2. It should be easy to understand and compute.

3. It should be based on all items in the data.

4. Its definition shall be in the form of a mathematical formula.

5. It should be capable of further algebraic treatment.

6. It should have sampling stability.

7. It should be capable of being used in further statistical computations or processing.

Besides the above requisites, a good average should represent maximum characteristics of the
data; its value should be nearest to the most items of the given series.

• Arithmetic mean or mean: The arithmetic mean, or simply the mean, of a variable is defined
as the sum of the observations divided by the number of observations. If the variable x
assumes n values x1, x2, ..., xn, then the mean, $\bar{X}$, is given by

$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$. This formula is for ungrouped or raw data.

Illustration16: Calculate the mean for 2, 4, 6, 8, 10

Short-Cut method: Under this method an assumed or arbitrary average (indicated by A) is
used as the basis for calculating the deviations of the individual values. The formula is

$\bar{X} = A + \frac{\sum d}{n}$, where A = the assumed mean (any convenient value of x) and d = the deviation of each value
from the assumed mean (d = x − A).

Illustration17: A student’s marks in 5 subjects are 75, 68, 80, 92, 56. Find his average mark.
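
A small Python sketch of the short-cut method for these marks follows. With d = x − A, the average deviation is added back to the assumed mean A; the choice A = 75 is arbitrary, and the direct calculation is included as a check.

```python
marks = [75, 68, 80, 92, 56]
n = len(marks)

A = 75                                   # assumed mean (any convenient value)
deviations = [x - A for x in marks]      # d = x - A: 0, -7, 5, 17, -19
mean_shortcut = A + sum(deviations) / n  # 75 + (-4)/5

mean_direct = sum(marks) / n             # 371 / 5
print(round(mean_shortcut, 2), round(mean_direct, 2))
```

Both routes give an average mark of 74.2.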

Grouped Data: The mean for grouped data is obtained from the following formula

$\bar{X} = \frac{\sum fx}{N}$, where x = the mid-point of an individual class, f = the frequency of an individual class,
and N = the sum of the frequencies (total frequency).

Short-cut method:

$\bar{X} = A + \frac{\sum fd}{N} \times c$, where $d = \frac{x - A}{c}$,

A = the assumed mean (any convenient value of x)

N = total frequency

c = width of the class interval

Illustration18: Given the following frequency distribution, calculate the arithmetic mean

Marks 64 63 62 61 60 59

Number of Students 8 18 12 9 7 6

Solution:

X F fx d=x-A fd
64 8
63 18
62 12
61 9
60 7
59 6
TOTAL

Illustration19: Following is the distribution of persons according to different income groups.
Calculate arithmetic mean.

Income Rs(100) 0-10 10-20 20-30 30-40 40-50 50-60 60-70

Number of persons 6 8 10 12 7 4 3

Solution

Income    Number of Persons (f)    Mid-point (x)    d = (x − A)/c    fd
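
One way to complete the working table is sketched below in Python. The assumed mean A = 35 (the mid-point of the 30-40 class) is an arbitrary choice; with d = (x − A)/c, the mean is A plus c times the average of fd. The direct formula Σfx/N is computed as a check.

```python
lower_limits = [0, 10, 20, 30, 40, 50, 60]
f = [6, 8, 10, 12, 7, 4, 3]
c = 10                                    # width of each class interval

x = [lo + c / 2 for lo in lower_limits]   # mid-points: 5, 15, ..., 65
N = sum(f)                                # total frequency

A = 35                                    # assumed mean (arbitrary choice)
d = [(xi - A) / c for xi in x]            # step deviations: -3, -2, ..., 3
fd = [fi * di for fi, di in zip(f, d)]

mean_shortcut = A + sum(fd) / N * c
mean_direct = sum(fi * xi for fi, xi in zip(f, x)) / N
print(N, sum(fd), mean_shortcut, mean_direct)
```

Here N = 50 and Σfd = −20, so the mean is 35 + (−20/50) × 10 = 31, i.e. an income of Rs 3100 in the units of the table.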

Merits and demerits of Arithmetic mean:

Merits:

1. It is rigidly defined.

2. It is easy to understand and easy to calculate.

3. If the number of items is sufficiently large, it is more accurate and more reliable.

4. It is a calculated value and is not based on its position in the series.

5. It is possible to calculate even if some of the details of the data are lacking.

6. Of all averages, it is affected least by fluctuations of sampling.

7. It provides a good basis for comparison.

Demerits:

1. It cannot be obtained by inspection nor located through a frequency graph.

2. It cannot be used in the study of qualitative phenomena not capable of numerical measurement, e.g.
intelligence, beauty, honesty etc.

3. It can ignore any single item only at the risk of losing its accuracy.

4. It is affected very much by extreme values.

5. It cannot be calculated for open-end classes.

6. It may lead to fallacious conclusions, if the details of the data from which it is computed are
not given.

Weighted Arithmetic mean: An average in which each component item is multiplied by a
certain value known as its “weight”, and the sum of these products is divided
by the total sum of the weights.

If x1, x2, ..., xn are the values of a variable x with respective weights w1, w2, ..., wn assigned to
them, then

$$\text{Weighted A.M } (\bar{X}_w) = \frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i}$$

Uses of the weighted mean

Weighted arithmetic mean is used in:

a. Construction of index numbers.

b. Comparison of results of two or more universities where the numbers of students differ.

c. Computation of standardized death and birth rates.

Illustration16: Calculate weighted average from the following data

Designation         Monthly salary in Rs (x)   Strength of the cadre (w)   wx
Class 1 officers    1500                       10                          15000
Class 2 officers    800                        20                          16000
Subordinate staff   500                        70                          35000
Clerical staff      250                        100                         25000
Lower staff         100                        150                         15000
Total                                          350                         106000

$$\text{Weighted A.M } (\bar{X}_w) = \frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i} = \frac{106000}{350} = 302.86$$
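
The same calculation can be sketched in Python as a check on the table. The variable names are assumptions made for this sketch.

```python
# Monthly salary (x) and strength of the cadre (w) for each designation
salaries = [1500, 800, 500, 250, 100]
weights = [10, 20, 70, 100, 150]

# wx column and its total
wx = [w * x for w, x in zip(weights, salaries)]
total_w = sum(weights)
total_wx = sum(wx)

# Weighted arithmetic mean = (sum of w * x) / (sum of w)
weighted_mean = total_wx / total_w

# For comparison: the simple (unweighted) mean of the five salaries
simple_mean = sum(salaries) / len(salaries)

print("Sum of weights:", total_w)
print("Sum of wx:", total_wx)
print("Weighted mean:", round(weighted_mean, 2))
print("Simple mean:", round(simple_mean, 2))
```

The weighted mean is much lower than the simple mean of the five salaries because the lower-paid cadres have far more members.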

Harmonic mean (H.M):

The harmonic mean of a set of observations is defined as the reciprocal of the arithmetic average of
the reciprocals of the given values. If x1, x2, ..., xn are n observations, then

$$H.M = \frac{n}{\sum_{i=1}^{n} \left(\frac{1}{x_i}\right)}$$

Illustration17: From the given data calculate H.M: 5, 10, 17, 24, 30

x       1/x
5       0.2000
10      0.1000
17      0.0588
24      0.0417
30      0.0333
Total   0.4338

$$H.M = \frac{n}{\sum \left(\frac{1}{x_i}\right)} = \frac{5}{0.4338} = 11.526$$
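
As a check, the harmonic mean can be computed in Python without rounding the reciprocals. The sketch below also compares the result with `statistics.harmonic_mean` from the Python standard library. The variable names are assumptions.

```python
import statistics

data = [5, 10, 17, 24, 30]
n = len(data)

# Reciprocal of each observation (the 1/x column)
reciprocals = [1 / x for x in data]

# H.M = n / (sum of reciprocals)
hm_manual = n / sum(reciprocals)

# Standard-library function, for comparison
hm_library = statistics.harmonic_mean(data)

print("Sum of reciprocals:", round(sum(reciprocals), 4))
print("H.M (manual):", round(hm_manual, 3))
print("H.M (library):", round(hm_library, 3))
```

The unrounded result differs from the hand calculation only in the third decimal place, because the table rounds each reciprocal to four decimal places.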

Geometric mean:

The geometric mean of a series containing n observations is the nth root of the product of the
values. If x1, x2, ..., xn are observations, then

$$G.M = \left(\prod_{i=1}^{n} x_i\right)^{1/n} = (x_1 x_2 \cdots x_n)^{1/n}$$

Alternatively,

$$G.M = \text{Antilog}\left(\frac{\sum \log x_i}{n}\right)$$

Illustration17: Calculate the geometric mean of the following series of monthly income of a
batch of families: 180, 250, 490, 1400, 1050

x       log x
180     2.2553
250     2.3979
490     2.6902
1400    3.1461
1050    3.0212
Total   13.5107

$$G.M = \text{Antilog}\left(\frac{\sum \log x_i}{n}\right) = \text{Antilog}\left(\frac{13.5107}{5}\right) = \text{Antilog}(2.7021) = 503.6$$
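
The two forms of the geometric mean can be compared with a small Python sketch. It assumes base-10 logarithms (as in the table) and uses `math.log10` and `math.prod` from the standard library. The variable names are assumptions.

```python
import math

incomes = [180, 250, 490, 1400, 1050]
n = len(incomes)

# Log method: G.M = antilog( (sum of log x) / n ), with base-10 logs
logs = [math.log10(x) for x in incomes]
gm_logs = 10 ** (sum(logs) / n)

# Root method: G.M = (x1 * x2 * ... * xn) ** (1 / n)
gm_root = math.prod(incomes) ** (1 / n)

print("Sum of logs:", round(sum(logs), 4))
print("G.M (log method):", round(gm_logs, 1))
print("G.M (root method):", round(gm_root, 1))
```

Any small difference from the hand calculation comes from rounding the logarithms to four decimal places.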

• Median: The median of a sample (data set) is the middle number when the measurements
are arranged in ascending order. If n is odd, the median is the middle number; if n is
even, the median is the average of the middle two numbers.

Remarks:

(i) The mean is sensitive to extreme values.

(ii) The median is insensitive to extreme values (because the median is a measure of location
or position).

• Mode: The mode is the value of x (observation) that occurs with the greatest frequency.

Illustration 1.2: The data below represent the number of harvested maize cobs for eight days:

9, 2, 7, 11, 14, 7, 2, 7. Summarize these data numerically using the measures of central tendency.
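
A possible check of this illustration uses Python's standard `statistics` module, which provides `mean`, `median` and `mode`. The variable names are assumptions made for this sketch.

```python
import statistics

cobs = [9, 2, 7, 11, 14, 7, 2, 7]

# Arrange the data in ascending order to see the middle values
ordered = sorted(cobs)

mean_value = statistics.mean(cobs)       # sum of observations / n
median_value = statistics.median(cobs)   # average of the two middle values (n is even)
mode_value = statistics.mode(cobs)       # most frequent observation

print("Ordered data:", ordered)
print("Mean:", mean_value)
print("Median:", median_value)
print("Mode:", mode_value)
```

With n = 8 (even), the median is the average of the 4th and 5th ordered values.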

Merits of Median

1. Median is not influenced by extreme values because it is a positional average.

2. Median can be calculated in case of distribution with open-end intervals.

3. Median can be located even if the data are incomplete.

4. Median can be located even for qualitative factors such as ability, honesty etc.

Demerits of Median

1. A slight change in the series may bring a drastic change in the median value.

2. In the case of an even number of items or a continuous series, the median is an estimated value that
may not be any value in the series.

3. It is not suitable for further mathematical treatment except its use in mean deviation.

4. It does not take into account all the observations.

Merits of Mode

1. It is easy to calculate and in some cases it can be located by mere inspection.

2. Mode is not at all affected by extreme values.

3. It can be calculated for open-end classes.

4. It is usually an actual value of an important part of the series.

5. In some circumstances it is the best representative of data.

Demerits of mode

1. It is not based on all observations.

2. It is not capable of further mathematical treatment.

3. The mode is often ill-defined; in some cases it is not possible to find a mode.

Measures of Variability/dispersion

• Range: largest measurement - smallest measurement

• Mean Absolute Deviation (MAD): $MAD = \frac{\sum |X - \bar{X}|}{n}$

Remarks:

(i) MAD is a good measure of variability.

(ii) It is difficult to manipulate mathematically.

• Sample Variance: $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$

• Standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$

• Standard error $= \frac{\text{Standard deviation}}{\sqrt{n}}$

Illustration 1.3: Using the data in Illustration 1.2, find all the possible measures of variability.
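
The measures listed above can be computed for the maize-cob data with the following Python sketch. It follows the formulas in these notes (divisor n for MAD, divisor n - 1 for the sample variance). The variable names are assumptions.

```python
import math

cobs = [9, 2, 7, 11, 14, 7, 2, 7]
n = len(cobs)
mean = sum(cobs) / n

# Range: largest measurement - smallest measurement
data_range = max(cobs) - min(cobs)

# MAD: average absolute deviation from the mean (divisor n)
mad = sum(abs(x - mean) for x in cobs) / n

# Sample variance (divisor n - 1) and standard deviation
variance = sum((x - mean) ** 2 for x in cobs) / (n - 1)
std_dev = math.sqrt(variance)

# Standard error: standard deviation / sqrt(n)
std_error = std_dev / math.sqrt(n)

print("Range:", data_range)
print("MAD:", round(mad, 4))
print("Sample variance:", round(variance, 4))
print("Standard deviation:", round(std_dev, 4))
print("Standard error:", round(std_error, 4))
```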

Quartiles: The quartiles divide the distribution into four equal parts. There are three
quartiles. The second quartile (Q2) divides the distribution into two halves and therefore
is the same as the median. The first (lower) quartile (Q1) marks off the first one-fourth of the
distribution, and the third (upper) quartile (Q3) marks off the first three-fourths.

Illustration1.4: A random sample of 8 cassava heaps was selected from a
cassava plantation in FUL. The length of each tuber (in cm) was measured as
shown below: 2, 5, 8, 10, 11, 14, 17, 20. Find the upper, second and lower quartiles.
Hence, obtain the inter-quartile range.
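
The notes do not state which rule is used to locate the quartiles, and different textbook conventions give slightly different answers. The Python sketch below assumes the (n + 1)p position rule with linear interpolation, which is the default (`method='exclusive'`) of `statistics.quantiles` in the standard library (Python 3.8 or later). The variable names are assumptions.

```python
import statistics

lengths = [2, 5, 8, 10, 11, 14, 17, 20]   # tuber lengths in cm

# n=4 splits the data into four parts and returns the three cut points.
# The default method places the k-th quartile at position k(n + 1)/4
# in the ordered data and interpolates between neighbouring values.
q1, q2, q3 = statistics.quantiles(lengths, n=4)

# Inter-quartile range = upper quartile - lower quartile
iqr = q3 - q1

print("Lower quartile Q1:", q1)
print("Second quartile Q2 (median):", q2)
print("Upper quartile Q3:", q3)
print("Inter-quartile range:", iqr)
```

Under a different convention, for example taking Q1 and Q3 as the medians of the lower and upper halves of the data, the values would be 6.5 and 15.5 instead, so the rule being used should always be stated.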
