Business and Social Statistics Lecture Manual
Sciences 1
(A Preliminary Guide)
Dr. E. O. Eleje
December, 2015
Chapter One
Introduction to Statistics
1.0 Objectives
By the end of this chapter, students should have refreshed their knowledge on the
following:
Generally, statistics is a branch of mathematics that deals with the analysis and
interpretation of numerical data in terms of samples and populations. As an academic
field of study, statistics is an integral part of the management curriculum and a capstone
integrative course offered to students who have previously been through a set of core
functional area courses. In this sense, statistics is defined as the field of
mathematics that deals with the analysis and interpretation of numerical data to aid
enterprises in decisions that determine the direction of the organization and shape its
future.
In modern times, business and social statistics involve the use of scientific methods to
generate and analyze relevant numerical data to produce information necessary for
effective planning and decision making in both private and public organizations.
Business statistics encompasses decision making in the face of uncertainty, and it is
used in many disciplines including finance, accounting, economics, production and
operations, services improvement, and marketing research, among others.
Business and social statistics is useful to any economy or system in diverse ways.
A few of these are documented here for class discussion:
Data could be primary or secondary in nature. Primary data are raw and uncollated.
They are original data which exist in their natural form. Secondary data on the other
hand are historical data already collated, processed and stored in retrievable form.
N/B: A variable or fact does not always qualify as data or information by itself. What
makes it so is its usage. For this reason, a particular variable may be data to some
persons and at the same time information to others, just as cement is a
finished product to a cement manufacturer and a raw material to an estate developer.
The source of primary data is the field survey. The instruments are the
administration of a self-designed questionnaire, interviews, or direct observation by
the researcher. On the other hand, the sources of secondary data could be
companies’ articles, journals, magazines, dailies, seminar and workshop papers, as
well as unpublished materials in the form of handouts and project works earlier done
in the area. The main instruments for secondary data are the libraries and the
internet.
Chapter Two
Sample and Sampling Procedures
2.0 Objectives
By the end of this chapter, students should:
Case 1: Suppose Ganaja village is an electoral ward that consists of 6000 voters,
of whom 60% are Christians, 30% are Muslims, and 10% belong to a scheduled tribe. As an
investigator, represent in a lucid tabular format the population of Ganaja village
according to category:
Solution:
Total number of voters = 6000
Population of Christians = 60% (6000) = 3600 voters
Population of Muslims = 30% (6000) = 1800 voters
Population of Scheduled tribe = 10% (6000) = 600 voters
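Table 2.1: Population of Ganaja village voters by category
Category            Percentage (%)    Number of Voters
Christians          60                3600
Muslims             30                1800
Scheduled Tribe     10                600
Total               100               6000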
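The Yaro Yamane sample size determination model is:
$$ n = \frac{N}{1 + N e^{2}} $$
where n is the sample size, N is the population size, and e is the tolerable error.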
Case 2: Use the above Yaro Yamane sample size determination model to draw a
sample from the 6000 population of voters in Ganaja village as contained in case 1:
Solution:
Total number of voters (N) = 6000
Sample size (n) = ?
Tolerable error (e) = 5% or 0.05
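The substitution for Case 2 can be sketched in a few lines of Python. This is only an illustration of the Yaro Yamane model n = N / (1 + N e²); the function name `yamane_sample_size` is a hypothetical helper introduced here, not part of the manual.

```python
def yamane_sample_size(population, error):
    """Yaro Yamane model: n = N / (1 + N * e^2)."""
    return population / (1 + population * error ** 2)

# Case 2: N = 6000 voters, tolerable error e = 0.05
n = yamane_sample_size(6000, 0.05)

# 1 + 6000 * 0.0025 = 16, so n = 6000 / 16
sample_size = round(n)
print(sample_size)
```

The result, 375 voters, is the sample size that Case 3 goes on to distribute across the three categories.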
Sampling may be defined as the process of selecting some part of an aggregate or
totality on the basis of which a judgment or inference about the aggregate or totality is
made. In other words, it is the process of obtaining information about an entire
population by examining only a part of it. In most research work and surveys,
the usual approach is to make generalizations or to draw inferences, based
on samples, about the parameters of the population from which the samples are taken.
So we can now define a sample as any number of persons, units or objects selected to
represent the population according to some rule or plan.
The sampling process is different from a census. The census method is the enumeration of
all the members or units of a population to get an idea of the entire population,
whereas sampling is the method of selecting a fraction of the population in such a way
that it represents the entire population.
Sampling Methods/Techniques
In the process of choosing a good sample, two basic methods or techniques are
employed: Probability and Non probability sampling methods.
sample. A lottery method of selecting a student, by drawing while blindfolded from a box
that contains the complete list of students’ names, is the best example of random sampling. It is
an unbiased technique and the best process for selecting a representative
sample. But the major disadvantage is that this technique needs the complete
sampling frame, i.e. the list of all items in the population, which is not always
available. Probability sampling methods are of three types:
a.) Simple random sampling: This is a method where each element has an equal
probability or chance of being selected into the sample. It is bias free. Here, the
population is a single group with common characteristics and not a set of different
groups. Selection is made without replacement; for this reason, each element of the group
can be selected only once and cannot appear twice in the sample.
b.) Stratified random sampling: In stratified random sampling the population is first
divided into different homogeneous groups or strata, which may be based upon a single
criterion such as male or female, or upon a combination of criteria like sex, caste,
level of education and so on. This method is generally applied when different categories
of individuals constitute the population. To get an accurate picture of the standard of
living of a particular population, it is advisable to categorize the population on
the basis of caste, religion or land holding; otherwise some sections may be
under-represented or not represented at all.
Case 3: From table 2.1, proportionately assign the sample size of 375 voters in case
two to the various categories of voters in Ganaja village. Represent the sample
proportion in a lucid tabular format:
Solution:
Total number of voters (N) = 6000
Sample size of all voters (n) = 375
Hence;
Sample size for Christians:
3600/6000 x 375 = 225 voters
Sample size for Muslims:
1800/6000 x 375 = 112.5 ≈ 112 voters
Sample size for Scheduled tribe:
600/6000 x 375 = 37.5 ≈ 38 voters
(The two fractional results are rounded so that the three samples still sum to 375.)
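The proportional allocation in Case 3 can be checked with a short Python sketch. The helper name `proportional_allocation` is hypothetical, and the code deliberately returns the unrounded shares so that the rounding decision stays visible.

```python
def proportional_allocation(strata, sample_size):
    """Share a total sample among strata in proportion to stratum size."""
    total = sum(strata.values())
    return {name: size * sample_size / total for name, size in strata.items()}

# Stratum sizes from Case 1 and the sample size from Case 2
strata = {"Christians": 3600, "Muslims": 1800, "Scheduled tribe": 600}
allocation = proportional_allocation(strata, 375)

for name, share in allocation.items():
    print(name, share)
```

The unrounded shares are 225, 112.5 and 37.5. Any rounding rule used in practice should keep the total at 375.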
c) Cluster sampling: This is another type of probability sampling method, in which
the sampling units are not individual elements of the population; instead, groups of
elements or individuals are selected as the sample. In cluster sampling the total
population is divided into a number of relatively small sub-divisions or groups, which
are themselves clusters, and then some of these clusters are randomly selected for
inclusion in the sample. Suppose an investigator wants to study the functioning of the
mid-day meal service in a district. In that case he can use some schools clustered in a
block or two, without selecting schools scattered all over the district. Cluster
sampling reduces the cost and labour of collecting the data, but it is less
precise than simple random sampling.
2. Non Probability Sampling Methods: In this type of sampling, items for the
sample are selected deliberately by the researcher instead of using the techniques of
random sampling. It is also known as purposive or judgment sampling. Some
important techniques of non-probability sampling methods include:
a) Quota sampling: This method is almost the same as stratified random sampling
described above; the only difference is that the elements are not selected at
random. Instead, the investigator fills a fixed quota for each group.
b) Purposive sampling: This is also a non-random sampling method. Here the
investigator selects, by his own judgment, the sample that he considers important for the
research and believes to be typical and representative of the population. Say an
investigator wants to forecast the chance of a political party coming into power
in a general election; for that purpose he selects some reporters, some teachers and
some elite people of the territory and collects their opinions. He considers these chosen
people to be leading persons on the issue, whose views are relevant to the party's chance of
coming into power. Because it is a purposive method, it is prone to large sampling errors
and may lead to misleading conclusions.
c) Systematic sampling: In this method every nth element is selected from a serially
numbered list of the population. For a large population of 1000 from which
a sample of 100 is required, the investigator selects every nth name, where
n = 1000/100 = 10. This means that the starting name may be any one of the first 10 names
on the list, and every 10th name after it is then selected, in that order. However,
selecting every 10th name may fail to represent the
different strata or groups that may exist in that big population. Moreover, once the
starting number is decided and data collection has begun, it cannot be changed or switched
to another category, since the method is by definition systematic. The list may also
repeat the same category of element while bypassing the others. The method can be biased and
misleading when the list follows a pattern, but it is useful in a homogeneous population.
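The selection rule for systematic sampling can be sketched as follows. This is a minimal illustration, assuming the common convention of a random start within the first interval; the function name `systematic_sample` and the numbered list standing in for names are hypothetical.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start in the first interval, then every k-th element."""
    k = len(population) // sample_size   # sampling interval, e.g. 1000 // 100 = 10
    rng = random.Random(seed)
    start = rng.randrange(k)             # index of the starting element (0 .. k-1)
    return population[start::k][:sample_size]

# A serially numbered list of 1000 names, represented here by their numbers
names = list(range(1, 1001))
sample = systematic_sample(names, 100, seed=1)
print(sample[:5])
```

Every selected serial number is exactly 10 places after the previous one, which is why a list with a repeating pattern can distort the sample.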
e) Double sampling: In this method the sample is drawn twice. First, a
large sample is selected and a questionnaire is mailed to the respondents
(say 500). After receiving back the answered questionnaires (say 300, as not all mailed
questionnaires come back), the investigator again randomly draws the required
number of respondents (say 100) and sends them the modified questionnaire.
This method is time consuming and expensive.
Figure 2.1: Diagrammatical Summary of the Sampling Methods/Techniques
[Diagram: Sampling Techniques. Labels shown: Simple, Stratified, Cluster, Proportionate, Non-Proportionate, Quota, Purposive, Systematic, Snowball, Double Sampling]
Source: Author
Chapter Three
Numerical data presentation could sometimes also involve the use of a table to
summarize data in a distribution. This is necessary when multiple numerical values
occur in a sentence or paragraph. For example, consider this statement:
B. Histogram: This is a set of vertical bars or columns whose areas are proportional
to the frequencies they represent. The bars of a histogram are joined together. The
students’ grade in statistics examination in figure 3.1 is represented in the histogram
below:
Figure 3.2: Histogram showing grades of Students in a Statistics Examination
[Histogram: x-axis Grade (A, B, C, D, E, F); y-axis Frequency]
C. Bar Chart: A bar chart is similar to a histogram. It is also a set of vertical bars
whose areas are proportional to the frequencies they represent. However, in a bar
chart, the bars are not joined together but are evenly spaced apart.
The students’ grades in the statistics examination in figure 3.2 are represented in
the bar chart below:
D. Pie Chart: A pie chart is a graphical representation in the form of a
circular ‘pie’. To prepare a pie chart, the values of a distribution are first converted
into degrees, and the sum of all converted values MUST equal 360
degrees. In figure 3.1, the total number of students’ grades in the statistics examination
(20) makes up the whole pie of 360°. Each piece of the pie is a sector of the circle, as
represented in the pie chart below:
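The conversion of frequencies into degrees can be sketched in Python. The grade counts below are hypothetical, chosen only so that they sum to the 20 students mentioned in the text; the actual frequencies behind the figure are not reproduced in this manual.

```python
# Hypothetical grade frequencies (assumption: they only need to total 20)
grades = {"A": 2, "B": 4, "C": 6, "D": 4, "E": 3, "F": 1}

total = sum(grades.values())

# Each sector angle = frequency / total frequency x 360 degrees
angles = {grade: count * 360 / total for grade, count in grades.items()}

for grade, angle in angles.items():
    print(grade, angle)

angle_sum = sum(angles.values())
```

Whatever the individual frequencies are, the sector angles must add up to 360 degrees, which is a quick check on the arithmetic.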
Figure 3.4:
3.4 Audio-Visual Method of Data Presentation
Recent innovations in computer and information technology have
significantly facilitated data presentation and management. Today, data
can easily be inputted and processed via electronic devices such as personal
computers, film projectors, television sets, mobile phones, etc. The use of
an electronic device to present data for easy and accurate data management is the
audio-visual method of data presentation. It is audio where the electronic
device is capable of producing sound, and visual where data can be viewed on a
screen.
Chapter Four
Statistical Data Analysis and Frequency Distributions
4.0 Objectives
administration, engineering, medicine, among others. From the viewpoint of
management sciences, the basic steps in statistical data analysis will include:
Collecting the data from record or other sources or from sample surveys;
Arranging the data into manageable form;
Analysing and interpreting the figures by means of statistical techniques;
Using the calculated results to make rational/informed decisions.
Non-parametric statistics or test on the other hand is concerned with determining
the type of relationship existing between sets of non-numeric otherwise called
ordinal variables in a population or sample. Examples of non-parametric tests
include chi-square test, Spearman’s rank order correlation, etc.
Case 5: An example of a simple frequency distribution could be the scores of 20
students in a business statistics test, viz: 60, 40, 60, 50, 40, 45, 45, 70, 70, 60, 40,
45, 44, 72, 60, 40, 72, 50, 40, and 60.
To form an array of the distribution, the data are re-arranged in ascending order, viz: 40,
40, 40, 40, 40, 44, 45, 45, 45, 50, 50, 60, 60, 60, 60, 60, 70, 70, 72, 72
Table 4.1:
Case 5: The outputs of 50 operators in Salem University pure water factory were
recorded during a shift as follows:
Table 4.2: Bags of Pure Water Bagged by Workers in Salem Pure
Water Factory
 601    702    876    965   1001
1023    787   1290    548   1196
 845   1321    779   1123    799
 670    789    898    987   1135
 921    902    615   1189   1056
1019   1098    908    876    966
 987    589    890    824    690
1022   1242   1280    800    567
 934   1390    812   1399   1043
1156   1278    912   1479   1485
The data in table 4.2 could be re-arranged in ascending or descending order and
would then be termed an ‘array’. More simply, the values would be grouped into
classes and the frequency of the class entered. Such an arrangement would be as
follows:
Notes:
A grouped frequency table such as table 4.3 is a convenient and
informative method of summarizing the original raw data albeit with some
loss of accuracy.
The above table uses equal class intervals. However, in some occasions,
unequal or open ended class intervals are used.
You will observe that ∑f = 50, i.e. the number of recordings in the original
data.
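The grouping step can be sketched in Python using the raw data of table 4.2. The class limits used here (ten equal classes of width 100, starting at 500) are an assumption, chosen because they reproduce the mean of 968 bags quoted later in the chapter; the helper name `grouped_frequency` is hypothetical.

```python
# Raw outputs of the 50 operators from table 4.2
outputs = [
    601, 702, 876, 965, 1001, 1023, 787, 1290, 548, 1196,
    845, 1321, 779, 1123, 799, 670, 789, 898, 987, 1135,
    921, 902, 615, 1189, 1056, 1019, 1098, 908, 876, 966,
    987, 589, 890, 824, 690, 1022, 1242, 1280, 800, 567,
    934, 1390, 812, 1399, 1043, 1156, 1278, 912, 1479, 1485,
]

def grouped_frequency(values, start, width, n_classes):
    """Count how many values fall in each class [lower, lower + width)."""
    counts = [0] * n_classes
    for value in values:
        counts[(value - start) // width] += 1
    return [(start + i * width, start + (i + 1) * width, count)
            for i, count in enumerate(counts)]

# Assumed classes: 500 to under 600, 600 to under 700, ..., 1400 to under 1500
table = grouped_frequency(outputs, 500, 100, 10)
for lower, upper, count in table:
    print(lower, "to under", upper, count)

frequencies = [count for _, _, count in table]
```

As the notes observe, the class frequencies sum to 50, the number of recordings in the original data.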
Case 6: Chart the above grouped frequency distribution in a concise Histogram
and line graph using a scale of 2cm = 100units on the X axis and 2cm = 1 unit on
the Y axis. Depict the same grouped frequency distribution by a Cumulative
Frequency Curve using a scale of 2cm = 100units on the X axis and 2cm = 5 units
on the Y axis respectively. From your cumulative frequency curve, determine the
output of the median worker.
Solution:
The histogram, line graph and cumulative frequency curve in figures 4.1-4.3 below were
generated from the grouped frequency distribution of bags of pure water bagged
by workers in Salem University pure water factory in table 4.3 above, with the aid
of Excel computer software. Meanwhile, also find attached a manually drawn
histogram, line graph and cumulative frequency curve using the approved standard
statistical graph sheet.
You will remember that a histogram as earlier stated is a set of vertical bars or
columns whose areas are proportional to the frequencies they represent.
Figure 4.1:
Histogram Showing Bags of Pure Water Bagged By Workers in Salem University
Pure Water Factory
Similarly, a line graph or frequency polygon is a set of straight line segments
joining the mid-points of the class intervals, plotted at heights proportional to the frequencies.
Figure 4.2:
Frequency Polygon Showing Bags of Pure Water Bagged By Workers in Salem
University Pure Water Factory
[Frequency polygon: axis marked from 600 to 1500 bags]
Figure 4.3:
Cumulative Frequency Showing Bags of Pure Water Bagged By Workers in Salem
University Pure Water Factory
[Cumulative frequency curve: x-axis marked from 500 to 1500 bags in steps of 100]
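Reading the median off a cumulative frequency curve amounts to linear interpolation inside the class that contains the middle item. The sketch below does this numerically. The class frequencies are an assumption: they are the counts obtained by grouping table 4.2 into classes of width 100 starting at 500, and the helper name `grouped_median` is hypothetical.

```python
# Assumed grouped distribution (lower class limits and frequencies)
lower_limits = [500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400]
frequencies = [3, 4, 5, 8, 9, 7, 5, 4, 3, 2]
width = 100

def grouped_median(lower_limits, frequencies, width):
    """Interpolate the median within the class holding the n/2-th item."""
    half = sum(frequencies) / 2
    cumulative = 0
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("empty distribution")

median = grouped_median(lower_limits, frequencies, width)
print(round(median, 2))
```

Under these assumptions the median worker's output falls in the 900 to 1000 class, at roughly 956 bags. A value read from a hand-drawn curve would only approximate this.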
Averages: That is, the typical size of the distribution otherwise known as the
central tendency. For our purpose, the most important measures are the
arithmetic mean, the median, and the mode.
Dispersion: This is the variation, spread, or scatter of the distribution for
which the most important measures are the variance and the standard
deviation. Other measures of variation are the range (i.e., the difference
between the largest and smallest values); and the semi-interquartile range
(i.e., half the range of the middle 50% of items).
Skewness: This is the lopsidedness or asymmetry of a distribution.
Kurtosis: The peakedness or height of a distribution.
4.5.1 Averages
The most important measure of central tendency or average of a distribution is the
arithmetic mean or simply the mean. It is mathematically denoted by $\bar{x}$.
$$ \bar{x} = \frac{\sum x}{n} $$
Case 7: Using the array of scores of the 20 students in a business statistics test in
case 5 above, determine the mean ($\bar{x}$) of the simple distribution:
Solution:
$$ \bar{x} = \frac{40+40+40+40+40+44+45+45+45+50+50+60+60+60+60+60+70+70+72+72}{20} $$
$$ \bar{x} = \frac{1063}{20} = 53.15 $$
But for a tabulated simple frequency distribution, the mean is calculated thus:
$$ \bar{x} = \frac{\sum fx}{\sum f} $$
Case 8: Using data from the tabulated simple frequency table of scores of the 20
students in a business statistics test above and here represented below, determine
the mean ($\bar{x}$) of the simple distribution:
Table 4.5:
Simple Frequency Distribution of Students Score in a Statistics Examination
Scores of Students (x)    Frequency (f)    fx
40                        5                200
44                        1                44
45                        3                135
50                        2                100
60                        5                300
70                        2                140
72                        2                144
Total                     20               1063
Solution:
$$ \bar{x} = \frac{\sum fx}{\sum f} = \frac{1063}{20} = 53.15 $$
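The same calculation can be sketched in Python from the (x, f) pairs of table 4.5. The variable names are illustrative only.

```python
# (score x, frequency f) pairs from table 4.5
distribution = [(40, 5), (44, 1), (45, 3), (50, 2), (60, 5), (70, 2), (72, 2)]

sum_f = sum(f for _, f in distribution)        # total frequency
sum_fx = sum(x * f for x, f in distribution)   # total of the fx column

mean = sum_fx / sum_f
print(sum_fx, sum_f, mean)
```

The fx column totals 1063 over 20 students, which is the same mean as the one obtained from the raw array in Case 7.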
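For a grouped frequency distribution, the same formula applies with x taken as the class mid-point:
$$ \bar{x} = \frac{\sum fx}{\sum f} $$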
Case 9: Using data from the above grouped frequency distribution table 4.3 of
bags of pure water bagged by workers in Salem University pure water factory,
determine the mean ($\bar{x}$) of the distribution:
Solution:
Hint: Because the distribution is grouped, we will need to compute the x column
first, before the fx column.
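A sketch of the grouped mean calculation follows. The class mid-points and frequencies are an assumption: they come from grouping the raw data of table 4.2 into classes of width 100 starting at 500, which gives the mean of 968 bags that the chapter quotes for this case.

```python
# Assumed class mid-points (x) and frequencies (f) for the grouped distribution
midpoints = [550, 650, 750, 850, 950, 1050, 1150, 1250, 1350, 1450]
frequencies = [3, 4, 5, 8, 9, 7, 5, 4, 3, 2]

# Build the fx column, then apply mean = sum(fx) / sum(f)
fx = [x * f for x, f in zip(midpoints, frequencies)]
sum_f = sum(frequencies)
sum_fx = sum(fx)

mean = sum_fx / sum_f
print(sum_fx, sum_f, mean)
```

Because mid-points stand in for the actual values, the grouped mean is an approximation of the mean of the raw data.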
Note: Where the data in a frequency distribution contain a few very large or very small
values, the median is often considered to be a more representative value
than the mean, although it cannot be used for subsequent calculations as is possible
with the mean.
C. Mode of Frequency Distributions
Simply put, the mode is the number with the highest frequency. More precisely,
the mode of a frequency distribution is the value which occurs most often or the
value around which there is the greatest degree of clustering. Ordinarily, the mode
is only meaningful if there is a marked or significant clustering of values round a
single point else, the mode is not often an issue of concern.
N/B: In a symmetrical distribution with a single peak, the mean, median, and mode have the
same value. But in asymmetric conditions, their values differ (see skewness
below).
4.5.2 Dispersion
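For an ungrouped distribution, the formulas for the variance and standard deviation are
represented thus:
$$ \delta^{2} = \frac{\sum (x-\bar{x})^{2}}{n-1}, \qquad \delta = \sqrt{\frac{\sum (x-\bar{x})^{2}}{n-1}} $$
For a grouped distribution, the formulas for the variance and standard deviation are
represented thus:
$$ \delta^{2} = \frac{\sum f(x-\bar{x})^{2}}{\sum f-1}, \qquad \delta = \sqrt{\frac{\sum f(x-\bar{x})^{2}}{\sum f-1}} $$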
Case 10: Using data from the same grouped frequency distribution table 4.3
above determine the output variance and standard deviation of bags of pure water
produced in the SU factory:
Solution:
Hint: From the formula, we shall need to compute the following columns: x,
$(x-\bar{x})$, $(x-\bar{x})^{2}$ and $f(x-\bar{x})^{2}$ respectively. However, remember that the mean ($\bar{x}$) of the
distribution is 968 bags as already determined in case 9.
$$ \text{Standard Deviation } (\delta) = \sqrt{\frac{\sum f(x-\bar{x})^{2}}{\sum f-1}} = \sqrt{\frac{2{,}713{,}800}{50-1}} \approx 235 $$
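The column-by-column computation can be sketched in Python. The mid-points and frequencies are an assumption (classes of width 100 starting at 500, obtained by grouping table 4.2); they reproduce the totals used in this solution.

```python
from math import sqrt

# Assumed class mid-points (x) and frequencies (f) for the grouped distribution
midpoints = [550, 650, 750, 850, 950, 1050, 1150, 1250, 1350, 1450]
frequencies = [3, 4, 5, 8, 9, 7, 5, 4, 3, 2]
mean = 968  # from case 9

# f(x - mean)^2 column and its total
weighted_sq_dev = [f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies)]
total = sum(weighted_sq_dev)

variance = total / (sum(frequencies) - 1)   # divisor is sum(f) - 1
std_dev = sqrt(variance)
print(total, round(variance, 2), round(std_dev, 2))
```

The total of the last column is 2,713,800, the variance is about 55,384 and its square root is about 235 bags.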
Solution:
$\delta_A = 235$; $\bar{x}_A = 968$
4.5.3 Skewness of Distributions
Skewness occurs where there is a lack of symmetry or evenness in a distribution.
When there is no symmetry in a distribution, such distribution is said to be
asymmetric. The effect of asymmetric distribution is that the mean, median, and
mode will manifest differing values. Distribution could be positively skewed or
negatively skewed. The diagrams below illustrate the two forms of skewedness:
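[Diagrams: two frequency curves with f on the vertical axis, one positively skewed and one negatively skewed]
$$ SK = \frac{3(\text{Mean} - \text{Median})}{\delta} $$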
However, it is important to assert that the above formula still has limited practical
application.
Review question 1:
Estimate the nature of skewness of the two distributions ‘A’ and ‘B’ in Case 11
above assuming the median output of ‘A’ is 520 bags and ‘B’ 450 bags
respectively.
4.5.4 Kurtosis of Distributions
A = Leptokurtic
B = Mesokurtic
C = Platykurtic
Chapter Five
Correlation Statistics
5.0 Objectives
A. Perfect Correlations
A perfect correlation is a relationship in which a straight line can be drawn through all
the points. The coefficient of perfect positive correlation is +1, while that of
perfect negative correlation is -1.
B. Partial Correlations
C. Zero Correlation
Nonsense Correlation: A nonsense correlation occurs in a situation where two
variables produce a high calculated ‘r’ value and yet no causal relationship
exists between the two variables.
$$ R = 1 - \frac{6\sum d^{2}}{n(n^{2}-1)} $$
Case 12: A group of 8 Accounting and Business Administration students are tested in
business mathematics and business statistics tests. Their performance ranking in the
two tests were as shown in table 5.1.
Table 5.1: Positions of Students in Business Statistics and Business Mathematics Tests
Students    Business Statistics    Business Mathematics
A           2                      3
B           7                      6
C           6                      4
D           1                      2
E           4                      5
F           3                      1
G           5                      8
H           8                      7
You are required to determine the nature of association (R) between the two subjects
and interpret your result.
Solution:
Looking at the formula, a new table that captures d and d² emerges as follows:
Table 5.2: Spearman’s Rank Order Coefficient of Correlation Table
Students    Statistics    Mathematics    d     d²
A           2             3              -1    1
B           7             6              1     1
C           6             4              2     4
D           1             2              -1    1
E           4             5              -1    1
F           3             1              2     4
G           5             8              -3    9
H           8             7              1     1
Total                                          ∑d² = 22
$$ R = 1 - \frac{6 \times 22}{8(8^{2}-1)} = 1 - \frac{132}{504} $$
R = +0.74
Interpretation:
The positive R of 0.74 implies a partial but strong positive agreement or correlation
between the students’ performance in the two subjects.
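The rank correlation in Case 12 can be reproduced with a short Python sketch of the formula R = 1 - 6∑d² / n(n² - 1). The function name `spearman_rank` is a hypothetical helper, and it assumes there are no tied ranks.

```python
def spearman_rank(ranks_a, ranks_b):
    """Spearman's rank order coefficient for untied rankings."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Positions of students A to H from table 5.1
statistics = [2, 7, 6, 1, 4, 3, 5, 8]
mathematics = [3, 6, 4, 2, 5, 1, 8, 7]

R = spearman_rank(statistics, mathematics)
print(round(R, 2))
```

With ∑d² = 22 and n = 8 the coefficient works out to about 0.74.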
Tied Rankings
A slight adjustment to the above formula is necessary if some students obtain the same
marks in the test and thus are given the same ranking.
Review Question 2: Assume that students E and F achieved equal marks in statistics
and were given equal third place; determine the coefficient of correlation of the
students’ performance.
5.3.2 Pearson’s Product Moment Coefficient of Correlation (r)
The Pearson’s product moment coefficient of correlation model is of the form:
$$ r = \frac{n\sum xy - \sum x\sum y}{\sqrt{n\sum x^{2} - (\sum x)^{2}} \cdot \sqrt{n\sum y^{2} - (\sum y)^{2}}} $$
Case 13: The following data have been collected in respect of sales and advertising
expenditure in a manufacturing company.
Table 5.3: Relationship between Advertising and Sales Volume of a Firm
Advertising Expenditure (N million) Y    Sales Volume (N million) X
8.5                                      210
9.2                                      250
7.9                                      290
8.6                                      330
9.4                                      370
10.1                                     410
Determine the nature and rate of correlation (r) between advertising expenditure and
sales volume.
Solution:
Looking at the Pearson’s model above, we will first create table that will
accommodate all the variables of the model as follows:
Table 5.4: Pearson’s Coefficient of Correlation Table for Advertising and Sales
Volume of a Firm
Advert Cost in NM (Y)    Sales NM (X)    Y²        X²        XY
8.5                      210             72.25     44100     1785
9.2                      250             84.64     62500     2300
7.9                      290             62.41     84100     2291
8.6                      330             73.96     108900    2838
9.4                      370             88.36     136900    3478
10.1                     410             102.01    168100    4141
Total: 53.7              1,860           483.63    604600    16833
$$ r = \frac{6(16833) - (1860 \times 53.7)}{\sqrt{6(604600) - (1860)^{2}} \times \sqrt{6(483.63) - (53.7)^{2}}} $$
r = 0.64
Interpretation:
The positive r of 0.64 is an indication of partial but semi-strong positive correlation or
agreement between advertising expenditure and the volume of sales of the firm.
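The arithmetic of Case 13 can be checked with a small Python sketch of the product moment formula. The function name `pearson_r` is a hypothetical helper; the data are those of table 5.3.

```python
from math import sqrt

def pearson_r(x, y):
    """Product moment coefficient of correlation from raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

sales = [210, 250, 290, 330, 370, 410]          # X, N million
advertising = [8.5, 9.2, 7.9, 8.6, 9.4, 10.1]   # Y, N million

r = pearson_r(sales, advertising)
print(round(r, 2))
```

The numerator is 1116 and the denominator is about 1743.3, giving r of about 0.64.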
$$ r^{2} = \left\{ \frac{n\sum xy - \sum x\sum y}{\sqrt{n\sum x^{2} - (\sum x)^{2}} \cdot \sqrt{n\sum y^{2} - (\sum y)^{2}}} \right\}^{2} $$
Case 14: Using the data provided in table 5.3 on the relationship between advertising
and sales volume, evaluate the coefficient of determination (r2) and interpret your
result.
Solution:
A cursory look at the coefficient of determination model above shows that the formula
is simply the square of the coefficient of correlation model. Hence;
$r^{2} = (0.64)^{2}$
= 0.41.
Interpretation:
The coefficient of determination value of 0.41 means that approximately 41% of the
variation in sales volume is explained by changes in advertising expenditure,
while 59% (i.e., 1-0.41) is explained by factors other than advertising cost.
Chapter Six
Regression Statistics
6.0 Objectives
After studying this chapter, students will:
Be able to explain the concept of regression
Know the basic types of regression statistics
Determine the equation of a straight line
Estimate the coefficients of regression
Interpret the nature of various regressions results
Regression on the other hand, tells us the exact kind of linear association that exists
between those two variables.
6.3 Estimating the Coefficients of Simple Linear Regression (r)
Coefficient of correlation and regression are both concerned with association or
interrelationship between variables. Hence, the product moment coefficient of
correlation (r) model used in the correlation statistics above is also employed in the
case of coefficient of regression thus.
$$ r = \frac{n\sum xy - \sum x\sum y}{\sqrt{n\sum x^{2} - (\sum x)^{2}} \cdot \sqrt{n\sum y^{2} - (\sum y)^{2}}} $$
Case 15: The table below represents the demand (Y) of a commodity of an enterprise
at various prices (X) within the defined period 2000-2009.
What is the nature of the relationship between X and Y (i.e., determine ‘r’ and
interpret result)
Solution:
Table 6.2: Regression Table for the Relationship between Prices and Demand of
Commodity of a Firm
Year     Price (NM) X    Q. DD (Y)    XY      Y²       X²
2000     5               100          500     10000    25
2001     7               75           525     5625     49
2002     6               80           480     6400     36
2003     6               70           420     4900     36
2004     8               50           400     2500     64
2005     7               65           455     4225     49
2006     5               90           450     8100     25
2007     4               100          400     10000    16
2008     3               110          330     12100    9
2009     9               60           540     3600     81
Total    60              800          4500    67450    390
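$$ \text{Coefficient of Regression } (r) = \frac{n\sum xy - \sum x\sum y}{\sqrt{n\sum x^{2} - (\sum x)^{2}} \cdot \sqrt{n\sum y^{2} - (\sum y)^{2}}} $$
$$ r = \frac{10(4500) - (60 \times 800)}{\sqrt{10(390) - (60)^{2}} \times \sqrt{10(67450) - (800)^{2}}} $$
r = -0.9325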
Interpretation:
A regression coefficient of -0.9325 indicates a very strong negative
association between price and quantity demanded of the firm’s commodity. This also
means that the higher the price, the lower the quantity demanded of the commodity.
$$ a = \frac{\sum y - b\sum x}{n} $$
$$ b = \frac{n\sum xy - \sum x\sum y}{n\sum x^{2} - (\sum x)^{2}} $$
Case 16: State the equation of a straight line. Using data from table 6.1, determine the
values of a and b in the stated equation.
Y = a+bx
Adopting Cramer’s model implies;
$$ b = \frac{n\sum xy - \sum x\sum y}{n\sum x^{2} - (\sum x)^{2}} $$
Where: n = 10
∑xy = 4500
∑x = 60
∑y = 800
∑x² = 390
$$ b = \frac{10(4500) - (60 \times 800)}{(10 \times 390) - (60)^{2}} = \frac{-3000}{300} $$
b = -10
Substituting ‘b’ in Cramer’s equation for ‘a’ implies;
$$ a = \frac{\sum y - b\sum x}{n} = \frac{800 - (-10)(60)}{10} $$
a = 140
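The two coefficients can be recomputed from the price and demand columns with a brief Python sketch. The function name `fit_line` is hypothetical; the formulas are the ones for a and b used in this solution.

```python
def fit_line(x, y):
    """Return (a, b) for the straight line Y = a + bX from raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_x2 = sum(p * p for p in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

price = [5, 7, 6, 6, 8, 7, 5, 4, 3, 9]                  # X
demand = [100, 75, 80, 70, 50, 65, 90, 100, 110, 60]    # Y

a, b = fit_line(price, demand)
print(a, b)

# Estimated demand at a price of 6, using Y = a + bX
estimate = a + b * 6
```

The fitted line is Y = 140 - 10X, so each unit rise in price is associated with a fall of 10 units in quantity demanded.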
Review Question 3: Using data from table 6.1, determine the values of a and b in the
equation of the straight line above.
Where:
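$$ b_1 = \frac{(\sum \Delta Y \Delta X_1)(\sum \Delta X_2^{2}) - (\sum \Delta Y \Delta X_2)(\sum \Delta X_1 \Delta X_2)}{(\sum \Delta X_1^{2})(\sum \Delta X_2^{2}) - (\sum \Delta X_1 \Delta X_2)^{2}} $$
$$ b_2 = \frac{(\sum \Delta Y \Delta X_2)(\sum \Delta X_1^{2}) - (\sum \Delta Y \Delta X_1)(\sum \Delta X_1 \Delta X_2)}{(\sum \Delta X_1^{2})(\sum \Delta X_2^{2}) - (\sum \Delta X_1 \Delta X_2)^{2}} $$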
An investigation into other possible causes of variation in the demand for the above
commodity in table 6.1 reveals that the income level of consumers could be another
variable, as shown in table 6.3 below:
Table 6.3: Demand Prices and Income of Consumer of Commodity of a Firm
Year    Quantity Demanded (in Bags)    Price (N Million)    Income (N000)
2000    100                            5                    10
2001    75                             7                    6
2002    80                             6                    12
2003    70                             6                    5
2004    50                             8                    3
2005    65                             7                    4
2006    90                             5                    13
2007    100                            4                    11
2008    110                            3                    13
2009    60                             9                    3
Review Question 4: Find the relationship existing between Y and X1, X2. That is,
determine the values of a, b1 and b2 in the relation Y = a + b1X1 + b2X2
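As an aid to checking hand calculations, the deviation formulas for b1 and b2 can be sketched in Python. Here ∆ is read as the deviation of each value from its own mean, and the intercept is taken as a = mean(Y) - b1·mean(X1) - b2·mean(X2), the usual least squares result, which is an assumption since the manual's own expression for a is not shown here. The helper names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def deviations(values):
    """Deviation of each value from the mean of its own column."""
    m = mean(values)
    return [v - m for v in values]

def sum_products(u, v):
    return sum(p * q for p, q in zip(u, v))

# Data from table 6.3
y = [100, 75, 80, 70, 50, 65, 90, 100, 110, 60]   # quantity demanded
x1 = [5, 7, 6, 6, 8, 7, 5, 4, 3, 9]               # price
x2 = [10, 6, 12, 5, 3, 4, 13, 11, 13, 3]          # income

dy, d1, d2 = deviations(y), deviations(x1), deviations(x2)
denominator = sum_products(d1, d1) * sum_products(d2, d2) - sum_products(d1, d2) ** 2

b1 = (sum_products(dy, d1) * sum_products(d2, d2)
      - sum_products(dy, d2) * sum_products(d1, d2)) / denominator
b2 = (sum_products(dy, d2) * sum_products(d1, d1)
      - sum_products(dy, d1) * sum_products(d1, d2)) / denominator
a = mean(y) - b1 * mean(x1) - b2 * mean(x2)

# Residuals of the fitted relation Y = a + b1*X1 + b2*X2
residuals = [yi - (a + b1 * p + b2 * q) for yi, p, q in zip(y, x1, x2)]
print(a, b1, b2)
```

A useful self-check is that the residuals of a least squares fit sum to zero and are uncorrelated with each explanatory variable.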
Chapter Seven
Decision Making Analysis
7.0 Objectives
In an ideal scenario, the outcomes of business decisions can be predicted with certainty. But in reality,
certainty conditions are rare in the business world. One can only speak of certainty
conditions whenever the number of possible outcomes from a business activity falls
within a very narrow range of possible values. In that case, there would be only a very
remote possibility of divergence between expected and realized outcomes. Because
certainty is so rare in business decision making, virtually all
business decisions are taken under conditions of either risk or uncertainty. What then
are risk and uncertainty?
Risk and uncertainty are in most cases used interchangeably. However, different
authors have identified slight differences and relationships between them. Literally, the term ‘risk’
means exposure to danger or economic adversity. This is the layman’s viewpoint. The
New Oxford Advanced Learner’s Dictionary (2000) defined risk as the possibility of
meeting danger or of suffering harm or loss. However, a more embracing definition is
found in Okafor (1983), who posits that risk is the exposure to loss arising from
variations between the expected and the actual outcome of investment activities. He
further argues that where the range of possible outcomes is wide, exposure to risk
would be high, but if the range is narrow, exposure to risk would be low. Deducing
from this position, a condition of risk will occur where an investor knows exactly the
range of possible outcomes to expect from a business opportunity and the probability of
occurrence of each outcome as well.
Uncertainty on the other hand refers to a situation where alternative outcomes exist
with unknown probabilities. That is, when the future outcome of event cannot be
predicted with any degree of confidence from a knowledge of past or existing events.
A condition of uncertainty implies a near complete ignorance of the future outcomes
of present decisions. It arises where the decision maker has no dependable information
about the nature of factors which impinge on his investment activities. Uncertainty is a
subjective phenomenon. This means that two or more investors are unlikely to have
identical perception of the outcome of investment decision taken under condition of
uncertainty. Consequently, it is very difficult to generate universally acceptable model
for dealing with uncertainty. A decision maker faced with uncertainty condition would
attempt to generate probability distribution of possible outcomes on ground of his
personal judgment of the situation. By so doing, a condition of uncertainty would at
least conceptually, be reduced to one of risk.
where an event is some occurrence, e.g. spinning a coin to show a head, drawing an
Ace from a pack of cards, etc. Probabilities can thus be regarded as relative
frequencies, as shown in the following examples.
Case 17: What is the probability of drawing an Ace from a shuffled pack of cards?
P (drawing an Ace) = 4/52 = 1/13
The numerator is 4 because there are four Aces in a pack and the denominator is 52
because there are fifty two cards in a pack.
P (throwing a 3) = 1/6
The numerator is 1 because only one side depicts a three and the denominator is 6
because there are six sides on the die.
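Both probabilities can be expressed as relative frequencies using Python's standard `fractions` module, which keeps the results as exact fractions. The helper name `probability` is hypothetical.

```python
from fractions import Fraction

def probability(favourable, possible):
    """Relative frequency: favourable outcomes over all possible outcomes."""
    return Fraction(favourable, possible)

p_ace = probability(4, 52)    # four Aces in a pack of 52 cards
p_three = probability(1, 6)   # one face showing a three on a six-sided die

print(p_ace, p_three)
```

`Fraction` reduces 4/52 to 1/13 automatically, which is about 0.077 as a decimal.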
Where the probability of an event is based on past data and the circumstances are
repeatable by test, the probability is known as statistical or objective probability. For
example, the probability of tossing a coin and a head showing is 50% or ½ or 0.5. This
value can be shown to be correct by repeated trials. In most circumstances objective
probabilities are not available in business, so that subjective probabilities must be
used. These are quantifications of personal judgment, experience and expertise. For
example, the Sales Manager considers that there is a 40% chance (i.e. p = 0.4) of
obtaining the order for which the firm has just quoted. Clearly this value cannot be
tested by repeated trials. In spite of the undoubted shortcomings of subjective
probabilities they are all that are normally available and so they are used to help in the
decision-making process. It should be emphasized that the use of probabilities does
not of itself make the decision. It merely provides more information on which a more
informed decision can be taken.
Mutually Exclusive and Independent Events: Events which cannot happen at the
same time are said to be mutually exclusive events. If one happens, the other(s) cannot
occur. For example, if we are considering the classification of people the events
‘male’ and ‘female’ are mutually exclusive. If a person is ‘female’, this automatically
excludes the possibility of being ‘Male’. But two or more events are independent if
the occurrence or non-occurrence of any one event does not affect the occurrence or
non-occurrence of the others. For example, the outcome of any throw of a die is
independent of the outcome of any preceding or succeeding throws.
Case 18: A restaurant offers a choice of 3 starters, 4 main courses and 3 sweets. How
many different meals are available?
But there will be occasions when selections will be made where the order does not
matter meaning that the arrangement A, B will be same as B, A. This is known as a
combination.
Where n is the total number of items and r is the number of items per arrangement.
The above example works out as follows:
$$ \frac{6!}{2!(6-2)!} = \frac{6 \times 5 \times 4 \times 3 \times 2 \times 1}{(2 \times 1)(4 \times 3 \times 2 \times 1)} = 15 \text{ ways} $$
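The two counting results in this section can be verified with Python's standard `math` module. The restaurant count assumes the multiplication rule, with one starter, one main course and one sweet per meal; that reading of Case 18 is an assumption, since its worked solution is not shown here.

```python
from math import comb, factorial

# Case 18: one choice from each course, so the choices multiply
meals = 3 * 4 * 3

# Combination of 6 items taken 2 at a time: n! / (r! (n - r)!)
n, r = 6, 2
by_formula = factorial(n) // (factorial(r) * factorial(n - r))
by_library = comb(n, r)

print(meals, by_formula, by_library)
```

`math.comb` gives the same 15 ways as the factorial formula, because in a combination the order of selection does not matter.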
Case 19: A company is considering investing in either of two projects, A and B. You
are expected to calculate the expected value of each of the two projects and advise on
which is preferable.
Solution:
                     Project A                        Project B
Possible Outcomes    N       Prob.    Exp. V (N)      N       Prob.    Exp. V (N)
Optimistic           6000    0.2      1200            6500    0.1      650
Most Likely          3500    0.5      1750            4000    0.6      2400
Pessimistic          2500    0.3      750             1000    0.3      300
Project E.V                           3700                             3350
Decision:
On the basis of Expected Value, Project A would be preferred as it has the higher
expected value.
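The expected value comparison can be sketched in Python as the sum of each outcome multiplied by its probability. The function name `expected_value` is a hypothetical helper; the figures are those of Case 19.

```python
def expected_value(outcomes):
    """Sum of value x probability over all possible outcomes."""
    return sum(value * prob for value, prob in outcomes)

# (outcome in N, probability) for optimistic, most likely and pessimistic cases
project_a = [(6000, 0.2), (3500, 0.5), (2500, 0.3)]
project_b = [(6500, 0.1), (4000, 0.6), (1000, 0.3)]

ev_a = expected_value(project_a)
ev_b = expected_value(project_b)

preferred = "A" if ev_a > ev_b else "B"
print(round(ev_a), round(ev_b), preferred)
```

The probabilities for each project sum to 1, and the expected values are N3700 for A and N3350 for B, so the rule selects Project A.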
Notes:
a) Although the EV of A is N3700, strictly this value would only be achieved in the
long run over many similar decisions — extremely unlikely circumstances!
b) If the project was implemented, any of the three outcomes could occur, with the
values stated.
Advantages:
a) Simple to understand and calculate
b) Represents whole distribution by a single figure
c) Arithmetically takes account of the expected variabilities of all outcomes.
Disadvantages:
a) By representing the whole distribution with a single figure it ignores the other
characteristics of the distribution, e.g. the range and skewness.
b) Makes the assumption that the decision maker is risk neutral, i.e. he would rank
equally the following distributions:
Pessimistic Outcome
Most likely Outcome
Optimistic Outcome
It is of course unlikely that any decision maker would rank them equally due to his
personal attitude to risk.
c) Although it appears to be widely used for the purpose, expected value is not
particularly well suited to one-off decisions. Expected value can strictly only be
interpreted as the value that would be obtained if a large number of similar decisions
were taken with the same ranges of outcome and associated probabilities.