Lecture Note (Basic Statistics Acc & Fina)


Basic Statistics

(Stat 2131)

Addis Ababa University


Statistics Department

October, 2023
Course Outline
1. Introduction to Statistics
 What is Statistics?
 Descriptive Versus Inferential Statistics
 Types of Variables and Scales of Measurement
 Statistics in Business Decision
2. Visual Description of Data
 The Frequency Distribution and the Histogram
 The Stem and Leaf Display and the Dot Plot
 Other Methods for Visual Representation of Data
 Bar Chart, Line Graph, Pictogram, Pie Chart
 The Scatter Diagram
 Tabulation and Contingency Table
3. Statistical Description of Data
 Statistical Description: Measures of Central Tendency
 Statistical Description: Measures of Dispersion
 Descriptive Statistics from Grouped Data
 Statistical Measures of Association

4. Probability and Probability Distribution
 Basic Definitions of Probability
 Fundamental Concepts: Experiment, Event, Conditional and Joint Probability
 Discrete Random Variables, Expected Value and Variance of Discrete Random Variables
 The Binomial, Poisson, and Hypergeometric Probability Distributions and Their Applications
 Continuous Probability Distributions: Uniform, Normal, and Exponential Probability Distributions and Their Applications
Suggested Text Book

 Ronald M. Weiers (2011). Introduction to Business Statistics. 7th edition.


Why Study Statistics?
Because you would like to know:
1. How does an instructor grade on a curve?
2. How does a tire manufacturer determine a mileage warranty?
3. How does the FDA verify that a new drug is more effective than the present drug?
4. What does it mean when one says the median home rent in Addis Ababa is 10,000 Birr?
5. How does one select a sample for a survey?
The art of learning from data
Data vs Information

• Data are the raw materials for data processing.


• Information is data that has been processed

1.1 Introduction
Definition of Statistics
 Plural form
 numerical facts and figures collected for a certain purpose
 aggregates of numerically expressed facts (figures) collected in a systematic manner for a predetermined purpose
 Singular form
 systematic collection and interpretation of numerical data to make a decision
 the science of collecting, organizing, presenting, analyzing and interpreting numerical data to make decisions on the basis of such analysis
Classification of Statistics
 Descriptive Statistics
 Mainly concerned with the methods and techniques used in the collection, organization, presentation, and analysis of a set of data without making any conclusions or inferences.
 Gathering data
 Editing and classifying them
 Presenting data in tables
 Drawing diagrams and graphs for them
 Calculating averages and measures of dispersion.
Remark: Descriptive statistics does not go beyond describing the data themselves.
Classification of Statistics …
 Descriptive Statistics (Example)
 The average age of students in this class is 21.
 The sample shows that 40% of year I students have a positive attitude toward the delivery of lectures.
 Drawing graphs that show the difference in the ‘scores’ of pre-engineering male and female students.
Classification of Statistics …
 Inferential Statistics
 Deals with methods of inferring or drawing conclusions about the characteristics of a population based upon the results of a sample
 Uses sample data to make decisions about the entire population
 Inferential Statistics (Example)
 There is a definite relationship between smoking and lung cancer
 Drinking decaffeinated coffee can raise cholesterol levels by 7%.
 Forward soccer players perform better than midfielders
 Senior students are vulnerable to addiction
Definition of Some Basic Statistical Terms
 Data
 a collection of related facts and figures from which conclusions may be drawn
 a scientific term for facts, figures, information and measurements
 Population/target population
 the totality of things, objects, people, etc. about which information is being collected
 Often too large to study in its entirety
 Example: population of athletes fed a certain type of diet
Definition of Some Basic Statistical Terms
 Sample
 part of a population selected to draw conclusions about the population
 Subset of a population
[Figure: a sample shown as a subset of the population]
 Census
 a complete enumeration of the population. In most real problems a census is not feasible, so a sample is taken instead.
Definition of Some Basic Statistical Terms
 Statistic
 A value computed from the sample, used to describe the sample.

 Parameter
 A descriptive measure (value) computed from the population.

 Variable
 is a characteristic or attribute that can assume different values.

 Sampling frame
 A list of people, items or units from which the sample is taken.

Stages in Statistical Investigation

 Statistical data must possess the following properties

 The data must be an aggregate of facts
 They must be affected to a marked extent by a multiplicity of causes
 They must be estimated according to reasonable standards of accuracy
 The data must be collected in a systematic manner for a predefined purpose
 The data should be placed in relation to each other
Stages in Statistical Investigation
1. Data Collection
 The process of measuring, assembling and gathering data
 Data may be collected by the investigator directly using interviews, questionnaires, and observation, or may be available from published or unpublished sources.
 Data gathering is the basis (foundation) of any statistical work.
 Valid conclusions can only result from properly collected data.
Stages in Statistical Investigation …
2. Data Organization
 The stage where we edit and classify the collected data
 The collected data may contain irrelevant figures, incorrect facts, omissions and mistakes, which are corrected by editing
 The edited data are then classified (arranged) according to their common characteristics
3. Data Presentation
 The organized data can now be presented in the form of tables, diagrams and graphs.
 The main purpose of data presentation is to facilitate statistical analysis
Stages in Statistical Investigation …
4. Data Analysis
 Study the data to draw conclusions about the population parameter
 Dig out information useful for decision making
 Calculation of averages, computation of measures of dispersion, regression and correlation analysis
5. Data Interpretation
 Draw valid conclusions from the results obtained through data analysis
 Make inferences about the general population from sample results
Uses and Limitations of Statistics
 Uses of Statistics
 Condenses and summarizes complex data

 Facilitates comparison of data

 Helps to measure variability in data

 Helps to establish relationships between variables
 Helps in predicting future trends
 Influences the policies of government
 Helpful in formulating and testing hypotheses and in developing new theories
Uses and Limitations of Statistics …
 Limitations of Statistics
 Statistics does not deal with single (individual) values; rather, it deals with aggregate values
 Statistics cannot deal with qualitative characteristics
 Statistical conclusions are not universally true
 Statistical interpretations require a high degree of skill and understanding of the subject
 Statistics can be misused
Applications in Business and Economics
 Accounting
 Public accounting firms use statistical sampling procedures when conducting audits for their clients.
 Economics
 Economists use statistical information in making forecasts about the future of the economy or some aspect of it.
 Marketing
 Electronic point-of-sale scanners at retail checkout counters are used to collect data for a variety of marketing research applications.
 Finance
 Financial advisors use price-earnings ratios and dividend yields to guide their investment recommendations.
Scales of Measurement
 A variable in statistics is any characteristic that can take on different values for different elements when data are collected
 Variables can be qualitative or quantitative
 Qualitative variables are nonnumeric and cannot be measured numerically, for example gender and blood type.
 Quantitative variables are numeric and can be quantified
 Quantitative variables can be discrete (taking only whole-number values) or continuous (taking any value, including decimals)
Scales of Measurement
 Measurement “is assigning numbers to objects, events, or abstract concepts according to a known set of rules”
 This permits data to be categorised, quantified and/or analysed so that meaningful conclusions can be drawn.
 Four scales of measurement are identified
 Nominal Scale (lowest level)
 Ordinal Scale
 Interval Scale
 Ratio Scale (highest level)
Scales of Measurement
 Nominal Scales of Measurement
 A measure of identity or category into mutually exclusive classes
 Provides no information regarding either order or magnitude
 Example: Blood type (A, B, AB and O), name of a student, identification number
 Ordinal Scales of Measurement
 A measure of order or rank
 Used to arrange data into series
 Provides no information regarding magnitude
 Example: Ratings (good, very good and excellent), economic status (low, medium and high)
Scales of Measurement …
 Interval Scales of Measurement
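 A measure of order and quantity
 Differences between values can be calculated and are meaningful
 Cannot establish an ‘x-fold’ increase, because the zero point is arbitrary
 Example: Temperature. The difference between 10°C (50°F) and 20°C (68°F) is the same as the difference between 25°C (77°F) and 35°C (95°F), but 20°C is not twice as hot as 10°C.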

 Ratio Scales of Measurement


 Highest level of measurement

 An interval scale with an absolute zero point


 Example: weight, height, income, etc.
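The contrast between interval and ratio scales can be checked numerically. The following is a minimal Python sketch using the temperatures from the interval-scale example; the helper name `c_to_f` is illustrative only.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

low_c, high_c = 10, 20
low_f, high_f = c_to_f(low_c), c_to_f(high_c)

# Equal differences in Celsius remain equal after changing units ...
diff_a = c_to_f(20) - c_to_f(10)
diff_b = c_to_f(35) - c_to_f(25)

# ... but the ratio depends on the arbitrary zero point of each scale.
ratio_c = high_c / low_c
ratio_f = high_f / low_f

print(diff_a, diff_b)      # 18.0 18.0
print(ratio_c, ratio_f)    # 2.0 1.36
```

Because the ratio changes with the unit, "20°C is twice as hot as 10°C" is not a meaningful statement, whereas a ratio-scale variable such as weight keeps the same ratio in any unit.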

Sources of Data
 Primary data
 data measured or collected by the investigator or the user directly from the source
 the data you collect are unique to you and your research and, until you publish, no one else has access to them
 The primary sources of data are the objects or persons from which we collect the figures used as first-hand information.
 Secondary data
 second-hand data or information that was gathered by someone else
 The secondary sources are either published or unpublished materials or records.
 A few sources of secondary data are:
[Figure: sources of secondary data]
Methods of Data Collection
 Planning for data collection requires one to
 Identify the source and elements of the data
 Decide whether to take a sample or a census
 If sampling is preferred, decide on sample size, selection method, etc.
 Decide on the measurement procedure
 Set up the necessary organizational structure
 Collect data using appropriate techniques
Methods of Data Collection

 There are three major methods of data collection.
1) Observation or measurement
2) Interview with questionnaires
a. Face-to-face interview
b. Telephone interview
c. Self-administered questionnaires
3) The use of documentary sources
Methods of Data Collection

Observation or measurement (direct personal observation)
 Data can be obtained through direct observation or measurement.
 This requires training and monitoring of the measurer to ensure the use of standard procedures.
 Provides accurate information but is expensive and inconvenient.
 Example: laboratory tests, clinical measurements, physical examination, etc.
Methods of Data Collection

 Interview with questionnaires:
 Draft a detailed questionnaire.
 Questionnaires are written documents which instruct the reader or listener to answer the questions written on them.
 Respondents (interviewees) are the individuals who answer the questions on the questionnaire.
 Interviewers are the individuals who record the responses given by the respondents.
Methods of Data Collection
a) Face-to-Face Interviews (questionnaires administered by enumerators)
 The interviewer knows exactly who is responding to the questionnaire.
 Advantages
 The interviewer can help the respondent if he/she has difficulty in understanding the questions. The difficulty could be due to language, concentration or limited intellectual capacity.
 There is more flexibility in presenting the items; they can range from closed to open.
 Skip patterns can be used.
 A skip pattern means skipping a question or a group of questions which are not applicable.
Methods of Data Collection
a) Face-to-Face Interviews (questionnaires administered by enumerators)
Disadvantages
▪ It is costly in terms of time and money.
▪ Attributes of the interviewer may affect the responses, due to:
a) bias of the interviewer and
b) his/her social or ethnic characteristics.
▪ An untrained interviewer may distort the meaning of the questions.
Methods of Data Collection
b. Telephone Interviews
Advantages
• It is less expensive in time and money than face-to-face interviews.
• The interviewer is able to help the respondent if he/she does not understand the question (as with face-to-face interviews).
• Broad representative samples can be obtained among those who have telephone lines.
Disadvantages
 Under-representation of groups which do not have telephones.
 The respondent may be substituted by another person.
 People whose telephone numbers are not listed in the directory cannot be reached.
Methods of Data Collection
c. Self-administered questionnaires returned by mail (mailed questionnaire)
Here the questionnaire is mailed to the respondents to be filled in.
Advantages
 These are the cheapest.
 There is no need for a trained interviewer.
 There is no interviewer bias.
Disadvantages
• Low response rate.
• Incomplete questionnaires due to omissions or invalid responses.
• No assurance that the questionnaire was answered by the right person.
• Needs intense follow-up to get a high response rate.


3. The use of documentary sources
✓ Extracting information from existing sources (e.g. hospital records) is much less expensive than the other two methods. It can be an important source of data.
Advantages of secondary data
 Secondary data may help to clarify or redefine the problem as part of the exploratory research process.
 Provide a larger database than primary data
 Save time
 Do not involve collection of data
Disadvantages of secondary data
 It is difficult to get the information needed when records are compiled in an unstandardized manner.
❖ Lack of availability
❖ Inaccurate data
❖ Lack of relevance
❖ Insufficient data
2. Methods of Data Presentation/Visualization

Methods of Data Presentation/Visualization
 The major objectives of data presentation are
 To present data in a visual and more understandable form
 To draw attention to the data
 To facilitate quick comparisons using measures of location and dispersion.
 To enable the reader to determine the shape and nature of the distribution, to make statistical inference, and to facilitate further statistical analysis.
 There are three methods of data presentation
 Tables,
 Diagrams, and
 Graphs
Methods of Data Presentation …
 Tabular presentation of data
 Tables are important for summarizing large volumes of data in a more understandable way.
 Tables can be
 Simple (one-way table): a table which presents one characteristic, for example age distribution.
 Two-way table: presents two characteristics in columns and rows, for example age versus sex.
 Higher-order table: a table which presents three or more characteristics in one table.
Methods of Data Presentation …
 Frequency Distribution
 It is the organization of raw data in table form, using classes and frequencies.

 Frequency is the number of values in a specific class of the distribution.

 There are three basic types of frequency distributions

 Categorical frequency distribution

 Ungrouped frequency distribution

 Grouped frequency distribution

Methods of Data Presentation …
 Categorical Frequency Distribution
 The categorical frequency distribution is used for data which can be placed in specific categories, such as nominal or ordinal level data
 The major components of a categorical frequency distribution are class, tally and frequency (or proportion).
 Percentages can also be used
 Form of a categorical distribution

A        B        C            D
Class    Tally    Frequency    Percent
Methods of Data Presentation …
 Example: Data on smoking status by gender for a sample of 20 health workers in Jimma Hospital, 1986 E.C., are given below. Construct a categorical frequency distribution.

Observation      1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20
Gender           M  F  M  M  F  F  F  M  M  M  F  F  F  F  M  F  M  F  M  M
Smoking status   Y  N  N  Y  N  N  Y  N  N  N  N  N  N  Y  Y  Y  N  N  Y  Y

Characteristics    Tally           Frequency
Gender
  Male             //// ////       10
  Female           //// ////       10
Smoking status
  No               //// //// //    12
  Yes              //// ///        8
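The tallying in this example can be reproduced with a short Python sketch. The function name `frequency_table` is illustrative, and the percent column is added only to show the optional fourth component of a categorical distribution.

```python
from collections import Counter

# Data from the example: gender and smoking status of 20 health workers
gender = "M F M M F F F M M M F F F F M F M F M M".split()
smoking = "Y N N Y N N Y N N N N N N Y Y Y N N Y Y".split()

def frequency_table(values):
    """Return {category: (frequency, percent)} for a list of categories."""
    counts = Counter(values)
    n = len(values)
    return {cat: (f, 100 * f / n) for cat, f in counts.items()}

gender_table = frequency_table(gender)
smoking_table = frequency_table(smoking)

for name, table in (("Gender", gender_table), ("Smoking status", smoking_table)):
    print(name)
    for cat, (f, pct) in sorted(table.items()):
        # One "/" per observation stands in for the hand-drawn tally marks
        print(f"  {cat}: {'/' * f:<12} {f:>2}  {pct:.0f}%")
```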

Methods of Data Presentation …
 Ungrouped Frequency Distribution
 It is a distribution that uses individual data values along with their frequencies.
 Often constructed for a small set of data on a discrete variable (when data are numerical), and when the range of the data is small.
 It is cumbersome to use an ungrouped frequency distribution for a large mass of data; in that case we use a grouped frequency distribution.
 The major components of this type of frequency distribution are class, tally, frequency, and cumulative frequency (less than/more than).
Methods of Data Presentation …
Example: The ages (in years) of 20 women who attended health education at Jimma Health Center in 1986 are given as follows. Construct an ungrouped frequency distribution.

30 25 23 41 39 27 41 24 32 29 29 35 31 36 33 36 42 35 37 41

Age (xj)        23  24  25  27  29  30  31  32  33  35  36  37  39  41  42
Tally           /   /   /   /   //  /   /   /   /   //  //  /   /   /// /
Frequency (f)   1   1   1   1   2   1   1   1   1   2   2   1   1   3   1
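As a rough sketch, the same ungrouped distribution, together with the "less than" and "more than" cumulative frequencies mentioned among its components, can be built in Python as follows. The variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

ages = [30, 25, 23, 41, 39, 27, 41, 24, 32, 29,
        29, 35, 31, 36, 33, 36, 42, 35, 37, 41]

counts = Counter(ages)
values = sorted(counts)                    # distinct ages in ascending order
freqs = [counts[v] for v in values]

# "Less than" type: observations up to and including each age
less_than_cf = list(accumulate(freqs))
# "More than" type: observations at or above each age
more_than_cf = [len(ages) - c + f for c, f in zip(less_than_cf, freqs)]

for v, f, lcf, mcf in zip(values, freqs, less_than_cf, more_than_cf):
    print(f"{v:>3} {'/' * f:<4} {f:>2} {lcf:>3} {mcf:>3}")
```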

Methods of Data Presentation …

 Grouped Frequency Distribution


 It is a frequency distribution in which several values are grouped into one class
 The data must be grouped so that each class is more than one unit in width.
 Used when the range of the data is large, and for data from a continuous variable.
 Sometimes used for large volumes of discrete data
Methods of Data Presentation …
 Guidelines for classes
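 There should be between 5 and 20 classes. The number of classes K can be determined using Sturges' rule, where n is the number of observations and log is the base-10 logarithm:
$$K = 1 + 3.322 \log_{10} n$$
 Classes should be continuous.
 Classes must be mutually exclusive.
 Classes should be exhaustive.
 Classes should have the same width (except open-ended classes):
$$W = \frac{\text{Range}}{\text{Number of classes}} = \frac{R}{K}$$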
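A minimal Python sketch of these two formulas is given below. The function names are illustrative, and rounding the width up to a whole number assumes the unit of measurement is 1.

```python
import math

def sturges_classes(n):
    """Number of classes K from Sturges' rule, rounded up."""
    return math.ceil(1 + 3.322 * math.log10(n))

def class_width(data, k):
    """Class width W = R / K, rounded up to a whole unit."""
    data_range = max(data) - min(data)
    return math.ceil(data_range / k)

for n in (20, 50, 100, 1000):
    print(n, sturges_classes(n))
```

The number of classes grows slowly with the sample size, which is why the rule rarely leaves the recommended range of 5 to 20 classes for data sets of ordinary size.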

Methods of Data Presentation …
 Class limit (CL)
 It separates one class from another.
 The limits could actually appear in the data
 There are gaps between the upper limit of one class and the lower limit of the next class.
 Class boundary (CB)
 Separates one class in a grouped frequency distribution from another.
 The boundary has one more decimal place than the raw data.
 There is no gap between the upper boundary of one class and the lower boundary of the succeeding class.
Methods of Data Presentation …
 Unit of measurement (U)
 This is the smallest possible difference between successive values, e.g. 1, 0.1, 0.01, …
 Class width (W)
 The difference between the upper and lower boundaries of any class.
 The class width is also the difference between the lower limits (or the upper limits) of two consecutive classes.
 Class mark (Midpoint)
 It is found by adding the lower and upper class limits (or boundaries) and dividing the sum by two.
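The definitions above can be illustrated for a single class. This sketch assumes a class with limits 23 and 26 and a unit of measurement of 1; both the values and the function name `class_summary` are chosen only for illustration.

```python
def class_summary(lower_limit, upper_limit, unit=1):
    """Boundaries, width and class mark for one class, given its limits."""
    lcb = lower_limit - unit / 2             # lower class boundary
    ucb = upper_limit + unit / 2             # upper class boundary
    width = ucb - lcb                        # class width
    mark = (lower_limit + upper_limit) / 2   # class mark (midpoint)
    return lcb, ucb, width, mark

print(class_summary(23, 26))   # (22.5, 26.5, 4.0, 24.5)
```

The class mark is the same whether it is computed from the limits or from the boundaries, since the half-unit adjustments cancel.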

Methods of Data Presentation …
 Steps to construct grouped frequency distribution
 Find the smallest (S) and largest (L) values in your data
 Compute the range R = L - S
 Determine the number of classes K using Sturges' rule, and round up
 Determine the class width W = R/K, and round up
 Take the smallest value as the lower class limit of the first class, and add the class width repeatedly to get the lower limits of the following classes
 To get the upper class limit of the first class, subtract the unit of measurement from the lower class limit of the second class; add the class width repeatedly to get the remaining upper class limits
 Subtract half of the unit of measurement from each lower class limit to get the lower class boundary, and add half of the unit of measurement to each upper class limit to get the upper class boundary
 Tally the data
 Find the cumulative frequencies
Methods of Data Presentation …
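Example: The ages (in years) of 20 women who attended health education at Jimma Health Center in 1986 are given below. Construct a grouped frequency distribution.

30 25 23 41 39 27 41 24 32 29 29 35 31 36 33 36 42 35 37 41

n = 20
K = 1 + 3.322 log(20) = 1 + 3.322(1.3010) = 5.32, rounded up to K = 6
W = (42 - 23)/6 = 3.17, rounded up to W = 4
With W = 4 and the first class starting at 23, five classes already cover every observation up to 42, so the sixth class would be empty and is omitted.
The grouped frequency table using Sturges' formula:

Class           23-26   27-30   31-34   35-38   39-42
Frequency (f)   3       4       3       5       5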
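The construction steps can be sketched in Python for this data set. This is one possible implementation, assuming a unit of measurement of 1; it stops creating classes once the largest observation is covered, which is why five classes appear even though Sturges' rule rounds up to six.

```python
import math

ages = [30, 25, 23, 41, 39, 27, 41, 24, 32, 29,
        29, 35, 31, 36, 33, 36, 42, 35, 37, 41]
unit = 1                                            # unit of measurement

k = math.ceil(1 + 3.322 * math.log10(len(ages)))    # Sturges' rule, rounded up
width = math.ceil((max(ages) - min(ages)) / k)      # class width, rounded up

# Each row: lower limit, upper limit, lower boundary, upper boundary,
# frequency, cumulative frequency
table = []
lower = min(ages)
cumulative = 0
while lower <= max(ages):
    upper = lower + width - unit                    # upper class limit
    freq = sum(lower <= a <= upper for a in ages)   # tally the class
    cumulative += freq
    table.append((lower, upper, lower - unit / 2, upper + unit / 2,
                  freq, cumulative))
    lower += width

for lo, hi, lcb, ucb, f, cf in table:
    print(f"{lo}-{hi}  boundaries {lcb} to {ucb}  f={f}  cf={cf}")
```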

Methods of Data Presentation …
 Diagrammatic and Graphic presentation of the data
 One of the most effective and interesting ways in which statistical data may be presented is through diagrams and graphs.
 There are several ways in which statistical data may be displayed pictorially, such as different types of graphs and diagrams.
 Pie chart
 Bar chart
 Pictogram
 Histogram
 Line Graph
 Stem and Leaf Display
 Dot Plot
 The Scatter Diagram
Methods of Data Presentation …
 Pie Chart
 A pie chart is a circular diagram in which the area of each sector represents one component of the total.
 To construct a pie chart (sector diagram), draw a circle (which measures 360°)
 The angle of each component is calculated by the formula
$$\text{Angle of sector} = \frac{\text{Component part}}{\text{Total}} \times 360^\circ$$
 These angles are marked in the circle by means of a protractor to show the different components.
 The sectors are usually arranged anti-clockwise.
Methods of Data Presentation …
 Pie Chart (Example)
 The following table gives the quarterly profit (in millions of dollars) of a sportswear company in the four quarters of a year.

Quarter        Profit ($ millions)
1st quarter    100
2nd quarter    300
3rd quarter    500
4th quarter    600
Total          1500

 Construct a pie chart
Methods of Data Presentation …
 Pie Chart (Example)
Quarter        Profit ($ millions)   Angle of sector (degrees)   Percent (%)
1st quarter    100                   24                          7
2nd quarter    300                   72                          20
3rd quarter    500                   120                         33
4th quarter    600                   144                         40
Total          1500                  360                         100

[Figure: pie chart of quarterly profit: 1st quarter 7%, 2nd quarter 20%, 3rd quarter 33%, 4th quarter 40%]
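The sector angles and percentages in this example follow directly from the angle formula. A short Python sketch, with illustrative variable names:

```python
profits = {"1st quarter": 100, "2nd quarter": 300,
           "3rd quarter": 500, "4th quarter": 600}   # $ millions

total = sum(profits.values())

# Angle of sector = component / total * 360; percent = component / total * 100
angles = {q: p * 360 / total for q, p in profits.items()}
percents = {q: round(p * 100 / total) for q, p in profits.items()}

for q in profits:
    print(f"{q}: {angles[q]:.0f} degrees, {percents[q]}%")
```

The angles always add up to 360 degrees, which is a quick check on the arithmetic before drawing the sectors with a protractor.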

Methods of Data Presentation …
 Bar Chart
 Uses vertical or horizontal bars to represent the frequencies of a distribution.
 When drawing a bar chart, we have to consider the following two points.
 Make the bars the same width
 Make the units on the frequency axis equal in size

 Bar charts can be

 Simple bar chart,

 Multiple bar charts,

 Stratified or stacked bar chart

 Deviation bar chart

Methods of Data Presentation …
 Simple Bar Chart
 Used to represent data involving only one variable classified on a spatial, quantitative or temporal basis
 Make bars of equal width but variable length
 Example (Sports Wear company quarterly sales)
Methods of Data Presentation …
 Multiple Bar Chart
 Used when two or more interrelated series of data are depicted by a bar diagram
 Make bars of equal width but variable length
 Example: Suppose we have export and import figures (in millions) for a company working on minerals over a few years.
[Figure: multiple bar chart of export and import (vertical axis 0 to 80) for 2010, 2011 and 2012]
Methods of Data Presentation …
 Stratified/Stacked Bar Chart
 Used to represent data in which the total magnitude is divided into different parts or components.
 First make simple bars for each class, taking the total magnitude in that class, and then divide these simple bars into parts in the ratio of the various components
 Shows the variation in different components within each class as well as between different classes.
 A stratified bar diagram is also known as a component bar chart.
Methods of Data Presentation …
 Stratified/Stacked Bar Chart
 The table below shows the profit of three companies ($ millions) from the sale of different items in the 1st quarter of the year. Draw a stratified/stacked bar chart.

Company   Shoe   T-shirt   Ball   Total
X         30     50        40     120
Y         33     16        27     76
Z         37     13        37     87

[Figure: stacked bar chart of sales (in $ millions) for companies X, Y and Z, with segments for Shoe, T-shirt and Ball]
Methods of Data Presentation …
 Deviation Bar Chart
 Used when the data contain both positive and negative values, such as data on net profit, net expense, percent change, etc.
 Suppose we have the following data relating to the net profit (percent) of commodities.

Commodity   Net profit
Soap        80
Sugar       -95
Coffee      125

[Figure: deviation bar chart of net profit for Soap, Sugar and Coffee, with a vertical axis from -150 to 150]
Methods of Data Presentation …
 Pictogram
 A pictogram is a figure that represents something using an image or illustration.
 Pictograms are often used in writing and graphic systems in which the characters are to a considerable extent pictorial in appearance.
 Pictograms can also be used in presentations to display data visually.
Methods of Data Presentation …
 Histogram
 A histogram is a special type of bar graph in which the horizontal scale represents classes of data values and the vertical scale represents frequencies.
 The heights of the bars correspond to the frequency values, and the bars are drawn adjacent to each other (without gaps).
 It displays data by using vertical bars of various heights to represent frequencies.
 Class boundaries are placed along the horizontal axis.
Methods of Data Presentation …
 Histogram
 A histogram shows the shape of continuous data, helps check for homogeneity, and suggests possible outliers.
 To construct a histogram, we split the range of the data into equal intervals (“bins”) and count how many observations fall into each bin.
[Figure: histogram of the age in years of 20 women]
Methods of Data Presentation …
 Line Graph

 used to show how the value of something changes over time,

 Compare how several things change over time relative to each other.

 Example: Line graph showing sales (in thousands Birr) over six months

Methods of Data Presentation …
 Stem and Leaf Display
 Uses place value to organize data

 Shows data in an organized way so it can be analyzed easily

 Organizes data so it is easier to find the median, mode, and range

 Stem-and-leaf plot: a convenient method to display every piece of data by showing the digits of each number.
Methods of Data Presentation …
 Stem and Leaf Display
 How to Draw One:
1. Put the first digits of each piece of data in numerical order down the left-hand side
2. Go through each piece of data in turn and put the remaining digits in the proper row
3. Re-draw the diagram, putting the pieces of data in the right order
4. Add a key
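The four steps can be sketched in Python. This example reuses the ages of the 20 women from the frequency distribution examples, and assumes two-digit whole numbers so that the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Map each stem (tens digit) to its sorted leaves (units digits)."""
    plot = defaultdict(list)
    for value in sorted(data):          # sorting puts the leaves in order
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

ages = [30, 25, 23, 41, 39, 27, 41, 24, 32, 29,
        29, 35, 31, 36, 33, 36, 42, 35, 37, 41]

plot = stem_and_leaf(ages)
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
print("Key: 2 | 3 means 23")
```

Sorting the data first combines steps 2 and 3, so the leaves are written in the right order in a single pass.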

Methods of Data Presentation …
 Stem and Leaf Display
 Example: the following are scores of students:

Methods of Data Presentation …

 Dot Plot
 A dot plot is a simple form of data visualization that consists of data points plotted as dots on a graph with an x- and y-axis.
 These types of charts are used to graphically depict certain data trends or groupings.
 A dot plot is similar to a histogram in that it displays the number of data points that fall into each category or value on the axis, thus showing the distribution of a set of data.
 A dot plot is a graphical display of data using dots.
Methods of Data Presentation …

 Dot Plot
 Example: This data set gives pulse rates, in beats per minute, for a group of 30 students.

68 60 76 68 64 80 72 76 92 68 56 72 68 60 84 72 56 88 76 80 68 80 84 64 80 72 64 68 76 72
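A text version of the dot plot for these pulse rates can be sketched in Python, printing one dot per student next to each distinct value. The layout (values down the side rather than along a horizontal axis) is a simplification for plain-text output.

```python
from collections import Counter

pulse = [68, 60, 76, 68, 64, 80, 72, 76, 92, 68, 56, 72, 68, 60, 84,
         72, 56, 88, 76, 80, 68, 80, 84, 64, 80, 72, 64, 68, 76, 72]

counts = Counter(pulse)
for rate in sorted(counts):
    # One dot for each student with this pulse rate
    print(f"{rate:>3} | {'.' * counts[rate]}")
```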

Methods of Data Presentation …
 The scatter Diagram
 A scatter diagram is a graph of paired data that can be used to find out whether there is a relationship between two variables
 Paired data are two separate pieces of data referring to the same thing, for example
 the age and value of a car
 the height and shoe size of a person
 the marks that a person gained in two separate tests.
Methods of Data Presentation …
 The scatter Diagram
 Example: 10 students sat both a Math and a Stat exam; here are their scores:

Subject   Stud 1   Stud 2   Stud 3   Stud 4   Stud 5   Stud 6   Stud 7   Stud 8   Stud 9   Stud 10
Math      56       24       67       70       71       42       48       32       52       80
Stat      65       38       71       72       73       51       56       42       57       82
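Each student contributes one point (Math score, Stat score) to the scatter diagram. The sketch below builds those points and, as an optional numerical check that goes slightly beyond the slide, computes the Pearson correlation coefficient from its usual definition; the function name `pearson_r` is illustrative.

```python
import math

math_scores = [56, 24, 67, 70, 71, 42, 48, 32, 52, 80]
stat_scores = [65, 38, 71, 72, 73, 51, 56, 42, 57, 82]

# One (x, y) point per student, as plotted on the scatter diagram
points = list(zip(math_scores, stat_scores))

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(math_scores, stat_scores)
print(sorted(points))
print(r)
```

Sorting the points by Math score shows the Stat scores rising as well, so the diagram slopes upward and the coefficient is close to 1.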

3. Statistical Description of Data

3.1. Measures of Central Tendency
 A measure of central tendency is a descriptive statistic that describes the average, or typical value, of a set of scores.
 It is also defined as a single value that is used to describe the “center” of the data
[Figure: distribution with its typical value (center of data) marked]
Types of measures of central tendency

 Good properties of a typical average
 Its computation should be based on all the observed values.
 It should be simple to understand and easy to interpret.
 It should be affected as little as possible by fluctuations of sampling.
 It should not be unduly influenced by extreme values.
 It should be rigidly defined, which means that it should have a definite value
Types of measures of central tendency

 There are three common measures of central tendency


 Mean

 Median

 Mode

The Summation Notation
 Also called sigma notation
 Sigma (∑) is a Greek letter meaning “sum”
 Let X be a variable taking the values X_1, X_2, …, X_n. Their sum is written as
$$\sum_{i=1}^{n} X_i$$
where ∑ is the summation notation, i is the index of the summation, i = 1 is the starting point (lower limit of the summation), n is the ending point (upper limit of the summation), and X_i is each term of the sum.
The Summation Notation..
 Properties of summation notation
$$\sum_{i=1}^{n} X_i = X_1 + X_2 + \cdots + X_n$$
$$\sum_{i=1}^{n} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \cdots + X_n Y_n$$
$$\sum_{i=1}^{n} X_i^2 = X_1^2 + X_2^2 + \cdots + X_n^2$$
$$\sum_{i=1}^{n} C X_i = C \sum_{i=1}^{n} X_i = C X_1 + C X_2 + \cdots + C X_n$$
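These properties can be checked on a small example. The values of X, Y and C below are made up for illustration and do not come from the lecture.

```python
X = [2, 4, 6]      # illustrative values only
Y = [1, 3, 5]
C = 10

sum_x = sum(X)                                # X1 + X2 + ... + Xn
sum_xy = sum(x * y for x, y in zip(X, Y))     # X1*Y1 + ... + Xn*Yn
sum_x_sq = sum(x ** 2 for x in X)             # X1^2 + ... + Xn^2
sum_cx = sum(C * x for x in X)                # C*X1 + ... + C*Xn

print(sum_x, sum_xy, sum_x_sq, sum_cx)        # 12 44 56 120
print(sum_cx == C * sum_x)                    # True
print(sum_x_sq == sum_x ** 2)                 # False
```

The last line is a reminder that the sum of squares is not the square of the sum, a common slip when reading the notation.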

The Mean
 The mean is the most commonly used measure of central tendency. There are different types of mean
 Arithmetic mean,
 Weighted mean,
 Geometric mean (GM) and
 Harmonic mean (HM)
 If mentioned without an adjective (as mean), it generally refers to the arithmetic mean.
The Arithmetic Mean
 It is computed by adding all the values in the data set and dividing by the number of observations.
 If we have the raw data, the mean is given by the formula
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
 If we have an ungrouped frequency distribution with k distinct values, the mean is given by the formula
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{n}, \quad \text{where } n = \sum_{i=1}^{k} f_i$$
 If we have a grouped frequency distribution with k classes, the mean is given by the formula
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i m_i}{n}, \quad \text{where } m_i = \frac{LCB_i + UCB_i}{2}$$
and LCB/UCB is the lower/upper class boundary of class i.
The Arithmetic Mean …
 Example 1: The following data are the weights (in kg) of eight youths: 32, 37, 41, 39, 36, 43, 48 and 36. Calculate the arithmetic mean of their weights. (Ans: 312/8 = 39)
 Example 2: The ages of a random sample of patients in a given hospital in Ethiopia are given below. Calculate the mean age. (Ans: 852/53 ≈ 16.08)

Age (xi)   Number of patients (fi)
10         3
12         6
14         10
16         14
18         11
20         5
22         4
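Both answers can be verified with a few lines of Python, applying the raw-data formula to Example 1 and the frequency-distribution formula to Example 2. The variable names are illustrative.

```python
# Example 1: raw data
weights = [32, 37, 41, 39, 36, 43, 48, 36]
mean_weight = sum(weights) / len(weights)

# Example 2: ungrouped frequency distribution
ages = [10, 12, 14, 16, 18, 20, 22]
patients = [3, 6, 10, 14, 11, 5, 4]
n = sum(patients)                                         # total frequency
mean_age = sum(f * x for f, x in zip(patients, ages)) / n

print(mean_weight)             # 39.0
print(n, round(mean_age, 3))   # 53 16.075
```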
The Arithmetic Mean …
 Example 3: The ages (in years) of 20 women who attended health education at Jimma Health Center in 1986 are summarized in the table. What is the mean age of these women? (Ans: 670/20 = 33.5)

Age (years)   Number of women
23-26         3
27-30         4
31-34         3
35-38         5
39-42         5
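A minimal sketch of the grouped-data calculation, assuming a unit of measurement of 1 so that each class boundary lies half a unit beyond the class limit. For comparison it also computes the mean of the 20 raw ages listed in the frequency distribution examples.

```python
classes = [(23, 26), (27, 30), (31, 34), (35, 38), (39, 42)]
freqs = [3, 4, 3, 5, 5]
unit = 1

# Class mark = (lower class boundary + upper class boundary) / 2
marks = [((lo - unit / 2) + (hi + unit / 2)) / 2 for lo, hi in classes]
n = sum(freqs)
grouped_mean = sum(f * m for f, m in zip(freqs, marks)) / n

raw_ages = [30, 25, 23, 41, 39, 27, 41, 24, 32, 29,
            29, 35, 31, 36, 33, 36, 42, 35, 37, 41]
raw_mean = sum(raw_ages) / len(raw_ages)

print(marks)                     # [24.5, 28.5, 32.5, 36.5, 40.5]
print(grouped_mean, raw_mean)    # 33.5 33.3
```

The grouped mean (33.5) is close to, but not equal to, the raw mean (33.3), because grouping replaces every observation in a class by the class mark.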

Properties of Arithmetic Mean …
 It can be computed for any set of numerical data; it always exists and is unique.
 It depends on all observations.
 The sum of deviations of the observations about the mean is zero, i.e. $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$
 It is greatly affected by extreme values.
 It lends itself to further statistical treatment, for instance, combination of means.
 It is relatively reliable, i.e. it is not greatly affected by fluctuations in sampling.
 The sum of squares of deviations of all observations about the mean is the minimum
 If a constant is added to all observations, the new mean is the old mean plus the constant
 If all observations are multiplied by a constant, the new mean is the product of the constant and the old mean
 If a wrong value is recorded and later discovered, the corrected mean is
$$\bar{X}_{corr} = \bar{X}_{wrong} + \frac{X_{corr} - X_{wrong}}{n}$$
where X_corr is the correct value and X_wrong is the wrongly recorded value.
82
Weighted Mean
 Weighted mean is calculated when certain values in a data set are more
important than the others.

 A weight wi is attached to each of the values xi to reflect this importance.

 The weighted mean is computed as

$\bar{X}_w = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i}$

 Example: CGPA of a student (each result is weighted by the credit of the course) [Ans: 2.88]

Geometric Mean
 It is defined as the arithmetic mean of the values taken on a log scale, transformed back to the original scale.

 It is also expressed as the nth root of the product of the n observations:

$GM = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$

 GM is an appropriate measure when values change exponentially and in the case of a skewed distribution that can be made symmetrical by a log transformation.

 Note: The geometric mean is useful in finding the average of percentages, ratios, indexes, or growth rates.

 One important disadvantage of GM is that it cannot be used if any of the values are zero or negative.
Geometric Mean…

Example 1: Find the GM of 4, 8 and 6.

Solution:

Example 2: A man gets three annual raises in his salary. At the end of the first year he gets an increase of 4%, at the end of the second year an increase of 6%, and at the end of the third year an increase of 9%. What is the average percentage increase over the three periods?

Solution:
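
The worked solutions to these two examples did not survive in the slides, so the following is only one plausible way to compute them. It assumes that Example 2 is meant to average the yearly growth factors (1.04, 1.06, 1.09) rather than the raw percentages; the function name `geometric_mean` is an illustrative choice.

```python
import math

def geometric_mean(values):
    """n-th root of the product, computed through logs; needs positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Example 1: GM of 4, 8 and 6, i.e. the cube root of 192
gm_example_1 = geometric_mean([4, 8, 6])

# Example 2 (assumption: average the growth factors, then convert back to a percentage)
average_factor = geometric_mean([1.04, 1.06, 1.09])
average_increase_pct = (average_factor - 1) * 100

print(round(gm_example_1, 2), round(average_increase_pct, 2))
```

Under that assumption the average raise comes out slightly below the arithmetic mean of 4%, 6% and 9%, which is the usual behaviour of the geometric mean.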

Properties of geometric mean

 Its calculation is not as easy as that of the arithmetic mean.

 It involves all observations during computation.

 It may not be defined if even a single observation is negative.

 If the value of one observation is zero, its value becomes zero.
Harmonic Mean
 Another important mean is the harmonic mean, which is suitable measure of
central tendency when the data pertains to speed, rates and price.

 It is the reciprocal of the arithmetic mean of the reciprocals of the observations.

 Let $x_1, x_2, \ldots, x_n$ be the n values in a set of observations; then the simple harmonic mean (SHM) is given by:

$SHM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$

 Note: SHM is used for equal distances, equal costs and equal rates.

Harmonic Mean
Example 1: A motorist travels 480 km on each of three days. On the first day he travels 10 hours at a rate of 48 km/h, on the second day 12 hours at a rate of 40 km/h, and on the third day 15 hours at a rate of 32 km/h. What is the average speed?
Solution: Since the distance covered by the motorist is the same each day (480 km), we use the SHM.

$SHM = \frac{3}{\frac{1}{48} + \frac{1}{40} + \frac{1}{32}} \approx 38.92$

so the required average speed is 38.92 km/h.

We can check this using the familiar formula for average speed from elementary physics.
Check: average speed = total distance / total time = (480 + 480 + 480) / (10 + 12 + 15) = 1440/37 ≈ 38.92 km/h
Weighted harmonic mean (WHM)
 WHM is used for different distances, different costs and different rates.

Example 1: A driver travels for 3 days. On the 1st day he drives for 10 h at a speed of 48 km/h, on the 2nd day for 12 h at 45 km/h and on the 3rd day for 15 h at 40 km/h. What is the average speed?

Solution: Since the distances covered by the driver are not equal, we use the WHM, taking the distances as weights (wi).

$WHM = \frac{\sum w_i}{\sum \frac{w_i}{x_i}} = \frac{480 + 540 + 600}{\frac{480}{48} + \frac{540}{45} + \frac{600}{40}} = \frac{1620}{37} \approx 43.78$ km/h
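
Both harmonic-mean examples can be reproduced in a few lines. The sketch below uses illustrative function names and also confirms that the weighted harmonic mean with distances as weights equals total distance divided by total time.

```python
def harmonic_mean(values):
    """Simple harmonic mean: n divided by the sum of reciprocals."""
    return len(values) / sum(1 / v for v in values)

def weighted_harmonic_mean(values, weights):
    """Weighted harmonic mean: sum(w_i) / sum(w_i / x_i)."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))

# Equal distances each day (480 km): simple harmonic mean of the speeds
shm = harmonic_mean([48, 40, 32])

# Unequal distances: weight each speed by the distance covered at that speed
speeds = [48, 45, 40]            # km/h
hours = [10, 12, 15]
distances = [s * h for s, h in zip(speeds, hours)]   # km per day
whm = weighted_harmonic_mean(speeds, distances)

# Direct check: total distance over total time
direct = sum(distances) / sum(hours)

print(round(shm, 2), round(whm, 2), round(direct, 2))
```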
Properties of harmonic mean
 It is based on all observations in a distribution.

 It gives smaller weight to larger observations and larger weight to smaller observations.

 It is difficult to calculate and understand.

 It is an appropriate measure of central tendency in situations where the data are ratios, speeds or rates.
Relation between AM, GM, and HM

 If all the values in a data set are the same, then all the three means (arithmetic
mean, GM and HM) will be identical.

 As the variability in the data increases, the difference among these means also
increases.

 For positive values, the arithmetic mean is never less than the GM, which in turn is never less than the HM; equality holds only when all the values are the same.
 AM ≥ GM ≥ HM

Median
 If the sample data are arranged in increasing order, the median is
 the middle value, if n is an odd number

 Example: The systolic blood pressures of seven persons were 113, 124, 124, 132, 146, 151, and 170. What is the median systolic blood pressure? (Ans: 132)

 midway between the two middle values, if n is an even number

 Example: Six men with high cholesterol participated in a study to investigate the effects of diet on cholesterol level. At the beginning of the study, their cholesterol levels (mg/dL) were as follows: 366, 327, 274, 292, 274 and 230. What is the median cholesterol level? (Ans: 283)

Median …
 If the data are in an ungrouped frequency distribution, the median is the value whose less-than cumulative frequency is the first to reach or exceed (n + 1)/2, where n is the total number of observations.
 Example: Forty-five students were taken to the field and their performance was evaluated using a 60 m pure speed test. The time was recorded in seconds, and the result is summarized in the table. What is the median performance of these students? (Ans: 19 secs)

Time (in seconds)    Number of students    Less-than cumulative frequency
15                   4                     4
16                   9                     13
18                   8                     21
19                   14                    35
20                   10                    45

Median …
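 If the data are in a grouped frequency distribution, the median is

$\text{Median} = L + \frac{\frac{n}{2} - cf}{f} \times w$

where L is the lower class boundary of the median class (the first class whose cumulative frequency reaches n/2), cf is the cumulative frequency of the classes before it, f is the frequency of the median class, w is the class width and n is the total number of observations.

 Example: Fifty students were taken to the field and their performance was evaluated using a 100 m pure speed test. The time was recorded in seconds, and the result is summarized in the table. What is the median performance of these students? (Ans: 20.81 secs)

Time (in seconds)    Number of students
14-16                6
17-19                12
20-22                16
23-25                9
26-28                7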
Mode
 The most frequent observation (value) in a data

 An observation with the largest frequency


 There can be no mode, e.g. 25, 27, 22, 18

 There can be only one mode (unimodal), e.g. 25, 27, 22, 25, 18

 There can be two modes (bimodal), e.g. 25, 27, 22, 27, 25, 18, 20

 There can be more than two modes (multimodal), e.g. 25, 27, 22, 27, 25, 18, 20, 19, 22, 17

 Mode for a grouped frequency distribution:

$\text{Mode} = L + \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \times w$

 L = lower class boundary of the modal class, w = class width
 f1 = frequency of the modal class
 f0 = frequency of the class preceding the modal class
 f2 = frequency of the class next to the modal class

Mode…
 Example: Twenty-five amateur cyclists were taken to the field and the time each took to complete a given distance was recorded in seconds. The result is summarized in the table. What is the modal time to complete the distance? (Ans: 29.5 secs)

Time (in seconds)    Number of athletes
15.5-21.5            3
21.5-27.5            6
27.5-33.5            8
33.5-39.5            4
39.5-45.5            3
45.5-51.5            1
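
The modal time in the cyclists example can be reproduced with the usual grouped-data interpolation, Mode = L + (f1 − f0) / (2·f1 − f0 − f2) × w, using the f0, f1, f2 notation defined for the modal class. The sketch below assumes a single modal class and equal class widths; the function name is illustrative.

```python
def grouped_mode(boundaries, freqs):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * w for the modal class."""
    k = max(range(len(freqs)), key=lambda i: freqs[i])   # index of the modal class
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                    # class before the modal class
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0       # class after the modal class
    lower, upper = boundaries[k]
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)

boundaries = [(15.5, 21.5), (21.5, 27.5), (27.5, 33.5),
              (33.5, 39.5), (39.5, 45.5), (45.5, 51.5)]
athletes = [3, 6, 8, 4, 3, 1]

modal_time = grouped_mode(boundaries, athletes)
print(round(modal_time, 2))
```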

2.3 Quantiles

 Quartiles are three points which divide an array into four parts in
such a way that each portion contains an equal number of
elements.
 First quartile (Q1) 25% of the observations lies below or equal to it

 Second quartile (Q2) 50 % of the observations lies below or equal to it and

 Third quartile (Q3) 75% of the observations lies below or equal to it

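 The ith quartile for raw data is the value at the following position in the ordered data:

$Q_i = \text{value of the } \frac{i(n + 1)}{4}\text{-th ordered observation}, \quad i = 1, 2, 3$

 If there is an even number of data items, then we need to take the average of the two middle numbers.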
Quantiles
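 Example: Find the median, lower quartile and upper quartile of the following numbers.
a) 12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25

b) 12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25, 65

 Solution: first arrange the data from smallest to largest

a) 5, 7, 12, 14, 15, 22, 25, 30, 36, 42, 53: lower quartile = 12, median = 22, upper quartile = 36

b) 5, 7, 12, 14, 15, 22, 25, 30, 36, 42, 53, 65: lower quartile = 13, median = 23.5, upper quartile = 39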
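Quantiles
 The ith quartile for a grouped frequency distribution is

$Q_i = L + \frac{\frac{i n}{4} - cf}{f} \times w, \quad i = 1, 2, 3$

where L is the lower class boundary of the class containing the quartile, cf is the cumulative frequency of the classes before it, f is its frequency, w is the class width and n is the total number of observations.
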
Quantiles …

 Deciles are nine points which divide an array into 10 parts in such
a way that each part contains equal number of elements.
 The nine deciles are denoted by D1, D2, …, D9

 First decile (D1) 10% of the observations lies below or equal to it

 Second decile (D2) 20% of the observations lies below or equal to it etc

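 The ith decile for a grouped frequency distribution is

$D_i = L + \frac{\frac{i n}{10} - cf}{f} \times w, \quad i = 1, 2, \ldots, 9$

where L is the lower class boundary of the class containing the decile, cf is the cumulative frequency of the classes before it, f is its frequency and w is the class width.
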
Quantiles …

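 Percentiles are 99 points which divide an array into 100 parts in such a way that each part consists of an equal number of elements.
 The ninety-nine percentiles are denoted by P1, P2, …, P99

 First percentile (P1): 1% of the observations lie below or equal to it

 Second percentile (P2): 2% of the observations lie below or equal to it, etc.

 The ith percentile for a grouped frequency distribution is

$P_i = L + \frac{\frac{i n}{100} - cf}{f} \times w, \quad i = 1, 2, \ldots, 99$

where L is the lower class boundary of the class containing the percentile, cf is the cumulative frequency of the classes before it, f is its frequency and w is the class width.
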
Quantiles …
 Example: The following frequency distribution gives the scores of 25 students.

Score    Number of students
25-29    1
30-34    1
35-39    1
40-44    3
45-49    3
50-54    6
55-59    4
60-64    3
65-69    2
70-74    1

Compute the following quantities:
● First quartile (Ans: 44.92)
● Ninth decile (Ans: 65.75)
● Forty-fifth percentile (Ans: 51.38)

Remark:
Q1 = P25
Q2 = D5 = P50 = Median
Q3 = P75
D1 = P10; D2 = P20; …; D9 = P90
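
Quartiles, deciles, percentiles and the grouped median all follow one interpolation pattern: locate the class in which the cumulative frequency reaches the required fraction of n, then interpolate inside that class. The sketch below assumes integer class limits with a gap of 1 between classes, so each boundary lies 0.5 beyond the limit; the function name `grouped_quantile` is illustrative.

```python
def grouped_quantile(boundaries, freqs, fraction):
    """Quantile = L + (fraction * n - cf) / f * w, interpolated in the target class."""
    n = sum(freqs)
    target = fraction * n
    cf = 0                                   # cumulative frequency before the class
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cf + f >= target:
            return lower + (target - cf) / f * (upper - lower)
        cf += f
    raise ValueError("fraction must lie between 0 and 1")

# Scores of 25 students: class limits 25-29, 30-34, ..., 70-74
score_limits = [(25 + 5 * i, 29 + 5 * i) for i in range(10)]
score_bounds = [(lo - 0.5, hi + 0.5) for lo, hi in score_limits]
students = [1, 1, 1, 3, 3, 6, 4, 3, 2, 1]

q1 = grouped_quantile(score_bounds, students, 1 / 4)
d9 = grouped_quantile(score_bounds, students, 9 / 10)
p45 = grouped_quantile(score_bounds, students, 45 / 100)

# The grouped median is the same computation with fraction 1/2 (100 m speed test data)
time_bounds = [(13.5, 16.5), (16.5, 19.5), (19.5, 22.5), (22.5, 25.5), (25.5, 28.5)]
runners = [6, 12, 16, 9, 7]
median_time = grouped_quantile(time_bounds, runners, 1 / 2)

print(q1, d9, p45, median_time)
```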

3.2. Measures of Dispersion

Introduction
 Central tendency measures do not reveal the variability present in the data.

 Dispersion is the scatteredness of the data series around its average.

 Dispersion is the extent to which values in a distribution differ from the average of the distribution.

 A measure of statistical dispersion is a nonnegative real number that is zero if all the data are the same and increases as the data become more diverse.

 Why do we need measures of dispersion?

 To determine the reliability of an average

 To serve as a basis for the control of the variability

 To compare the variability of two or more series and

 To facilitate the use of other statistical measures.

Introduction…
 Properties of a good measure of dispersion
 It should be rigidly defined

 It should be easy to understand and to calculate

 It should be based on all observations of data

 It should be easily subjected to further mathematical treatment

 It should be least affected by sampling fluctuation

 It shouldn’t be unduly affected by extreme values

Introduction…
 There are many types of dispersion measures
 Range /Relative Range (Coefficient of range)

 Inter Quartile Range/ coefficient of quartile deviation

 Mean Absolute Deviation /Coefficient of mean deviation

 Variance/Standard Deviation/ coefficient of variation

 Measures of dispersion can be absolute or relative.

 When measurements are observed with different units, or have different averages, use relative measures of dispersion.

Range (R)
 Range is the difference between two extreme values in a data

 Denoted by R

R = max − min

 Only two values are used in its calculation.

 It is influenced by an extreme value (non-robust).

 It is easy to compute and understand.

Relative Range (RR)

 Relative range is the ratio of the difference to the sum of the two extreme values in a data set

 Denoted by RR/CR

$RR = \frac{\max - \min}{\max + \min}$

 Example: What are the range and relative range of the following data: 4, 8, 1, 6, 6, 2, 9, 3, 6, 9? (Ans: R = 8, RR = 0.8)

Properties of range

 It is the simplest, crudest measure and can be easily understood

 It takes into account only two values, which makes it a poor measure of dispersion

 Very sensitive to extreme observations

 The larger the sample size, the larger the range tends to be


Inter Quartile Range
 Measures the range of the middle 50% of the values only

 Is defined as the difference between the upper and lower quartiles

 Interquartile range = upper quartile - lower quartile

= Q3 - Q1

 The semi-interquartile range (or SIR) is defined as the difference of the first and third quartiles divided by two

SIR = (Q3 - Q1) / 2

 The SIR is often used with skewed data as it is insensitive to the extreme scores

Coefficient of Quartile Deviation
 The ratio of the difference to the sum of the two extreme quartiles (Q1 and Q3) of a data set. Denoted by CQD

$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1}$

 Example: The following data are recorded: 9, 7, 3, 7, 1, 2, 5, 4, 5, 10, 10, 2, 2, 2, 6, 7, 9, 8, 5, 6. What are the SIR and CQD for the recorded data?

 Solution: put in ascending order: 1, 2, 2, 2, 2, 3, 4, 5, 5, 5, 6, 6, 7, 7, 7, 8, 9, 9, 10, 10. (Ans: SIR = 2.5, CQD = 0.5)
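
The stated answers (SIR = 2.5, CQD = 0.5) are reproduced when the quartiles are taken as the medians of the lower and upper halves of the ordered data. That convention is an assumption inferred from the answers; applying the i(n + 1)/4 position rule with interpolation would give slightly different quartiles for this data set. The function name is illustrative.

```python
import statistics

def quartiles_by_halves(data):
    """Q1 and Q3 as medians of the lower and upper halves (middle value excluded if n is odd)."""
    xs = sorted(data)
    half = len(xs) // 2
    q1 = statistics.median(xs[:half])
    q2 = statistics.median(xs)
    q3 = statistics.median(xs[-half:])
    return q1, q2, q3

data = [9, 7, 3, 7, 1, 2, 5, 4, 5, 10, 10, 2, 2, 2, 6, 7, 9, 8, 5, 6]
q1, q2, q3 = quartiles_by_halves(data)

iqr = q3 - q1
sir = iqr / 2                     # semi-interquartile range
cqd = (q3 - q1) / (q3 + q1)       # coefficient of quartile deviation

print(q1, q3, iqr, sir, cqd)
```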

Properties of IQR

 It is a simple and versatile measure


 It encloses the central 50% of the observations
 It is not based on all observations but only on two specific values
 Since it excludes the lowest and highest 25% values, it is not
affected by extreme values
 Less sensitive to the size of the sample
Mean Absolute Deviation (MAD)
 Measures the ‘average’ distance of each observation away from the mean of
the data
 Gives an equal weight to each observation

 Generally more sensitive than the range or interquartile range, since a change in any value will affect it
 The Mean Absolute Deviation of a set of n numbers is

$MAD = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$

 All values are used in the calculation.

 It is not unduly influenced by large or small values (robust)

 The absolute values are difficult to manipulate.

Coefficient of Mean Deviation (CMD)
$CMD = \frac{MAD}{\bar{x}}$
 All values are used in the calculation.

 It is not unduly influenced by large or small values (robust)

 The absolute values are difficult to manipulate.

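 Example: For the following data

52.5, 46.8, 38.8, 37.6, 32.3

 compute the MAD and CMD.

 Solution: (Ans: MAD = 6.44, CMD = 6.44/41.6 ≈ 0.155)

Solution
Observation    x       x − x̄    |x − x̄|
1              52.5    10.9      10.9
2              46.8    5.2       5.2
3              38.8    −2.8      2.8
4              37.6    −4.0      4.0
5              32.3    −9.3      9.3
Total          208     0         32.2
Mean           41.6    0         6.44

Step 1: compute the mean (208/5 = 41.6). Step 2: subtract the mean from each observation. Step 3: take the absolute values. Step 4: sum the absolute deviations (32.2). Step 5: divide by n (32.2/5 = 6.44).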
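
The five steps of this solution translate directly into code. The sketch below recomputes the MAD and the coefficient of mean deviation for the same five observations; variable names are illustrative.

```python
data = [52.5, 46.8, 38.8, 37.6, 32.3]

# Step 1: mean of the observations
mean = sum(data) / len(data)

# Steps 2 and 3: deviations from the mean and their absolute values
abs_deviations = [abs(x - mean) for x in data]

# Steps 4 and 5: sum the absolute deviations and divide by n
mad = sum(abs_deviations) / len(data)

# Coefficient of mean deviation: MAD relative to the mean
cmd = mad / mean

print(round(mean, 2), round(mad, 2), round(cmd, 4))
```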
Properties of mean deviation
 MD removes one main objection to the earlier measures, in that it involves every value

 It is not affected much by extreme values

 Its main drawback is that the algebraic negative signs of the deviations are ignored, which is mathematically unsound
Variance
 Variance is the mean of the squared deviations of the observations from their arithmetic mean

$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$ → population variance

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ → sample variance

 All values are used in the calculation.

 It is strongly influenced by outliers (non-robust).

 The units of variance are awkward: the square of the original units.
 Therefore the standard deviation is more natural, since it recovers the original units.

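 In general, the sample variance is computed by:

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}$ → for raw data.

$s^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{\sum_{i=1}^{k} f_i - 1} = \frac{\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2}{n - 1}$ → for ungrouped data.

$s^2 = \frac{\sum_{i=1}^{k} f_i (m_i - \bar{x})^2}{\sum_{i=1}^{k} f_i - 1} = \frac{\sum_{i=1}^{k} f_i m_i^2 - n\bar{x}^2}{n - 1}$ → for grouped data.

Here $n = \sum_{i=1}^{k} f_i$ for the frequency-distribution forms.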
Standard Deviation
 One of the most useful measures of dispersion is the standard deviation.

 It is based on deviations from the mean of the data.

 The sample standard deviation is found by calculating the square root of the variance.

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

 To calculate the standard deviation, follow these steps:
1. Calculate the mean of the numbers

2. Find the deviations from the mean.

3. Square each deviation

4. Sum the squared deviations

5. Divide the sum in Step 4 by n – 1

6. Take the square root of the quotient in Step 5

Example 1: Compute the variance for the sample: 5, 14, 2, 2 and 17.
Solution: $n = 5$, $\sum_{i=1}^{n} x_i = 40$, $\bar{x} = 8$, $\sum_{i=1}^{n} x_i^2 = 518$.

$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} = \frac{518 - 5 \times 8^2}{5 - 1} = 49.5$, $s = \sqrt{49.5} = 7.04$.

Example 2: Suppose the data given below indicate the time in minutes required to complete a certain laboratory test. Calculate the mean, variance and standard deviation for the following data.

x_i          32     36     40      44     48     Total
f_i          2      5      8       4      1      20
f_i x_i      64     180    320     176    48     788
f_i x_i^2    2048   6480   12800   7744   2304   31376

$\bar{x} = \frac{788}{20} = 39.4$, $s^2 = \frac{31376 - 20 \times (39.4)^2}{19} = 17.31$, $s = \sqrt{17.31} = 4.16$.
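
Both examples can be checked with a direct implementation of the definition (sum of squared deviations divided by n − 1), which avoids the rounding that creeps into the shortcut formula. A minimal sketch, with an illustrative function name and an optional frequency argument for the second example:

```python
import math

def sample_variance(values, freqs=None):
    """Sample variance; optional frequencies treat values as a frequency table."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    squared_dev = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return squared_dev / (n - 1)

# Example 1: raw sample
var_1 = sample_variance([5, 14, 2, 2, 17])
sd_1 = math.sqrt(var_1)

# Example 2: times (minutes) with their frequencies
times = [32, 36, 40, 44, 48]
counts = [2, 5, 8, 4, 1]
var_2 = sample_variance(times, counts)
sd_2 = math.sqrt(var_2)

print(round(var_1, 2), round(sd_1, 2), round(var_2, 2), round(sd_2, 2))
```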
Properties of Variance
 The variance is always non-negative ($s^2 \ge 0$).

 If every element of the data is multiplied by a constant c, then the new variance is $s^2_{new} = c^2 \times s^2_{old}$.

 When a constant is added to all elements of the data, the variance does not change.

 The variance of a constant c measured n times is zero, i.e. var(c) = 0.
Coefficient of Variation
 The Coefficient of Variation (CV) for a data set is defined as the ratio of the standard deviation to the mean

 It shows the extent of variability in relation to the mean of the population.

 It is a normalized measure of dispersion of a probability distribution or frequency distribution.

$CV = \frac{s}{\bar{x}} \times 100\%$

 All values are used in the calculation.

 The actual value of the CV is independent of the unit in which the measurement has been taken, so it is a dimensionless number.

 For comparison between data sets with different units or widely different means, one should use the coefficient of variation instead of the standard deviation.

Coefficient of Variation
Example: Last semester, the students of the Biology and Chemistry Departments took the Stat 273 course. At the end of the semester, the following information was recorded.

Department            Biology    Chemistry
Mean score            79         64
Standard deviation    23         11

Compare the relative dispersions of the two departments’ scores using the appropriate way.
Solution:
Biology Department: $CV = \frac{23}{79} \times 100 = 29.11\%$
Chemistry Department: $CV = \frac{11}{64} \times 100 = 17.19\%$

Since the CV of the Biology Department students is greater than that of the Chemistry Department students, we can say that there is more dispersion in the distribution of the Biology students’ scores than in that of the Chemistry students.

2.5 Standard Score
 If X is a measurement from a distribution with mean $\bar{X}$ and standard deviation S, then its value in standard units is

$Z = \frac{X - \bar{X}}{S}$

 Z gives the deviation from the mean in units of standard deviation

 Z gives the number of standard deviations a particular observation lies above or below the mean.

 It is used to compare two observations coming from different groups

Standard Score
 Example: Two groups of people were trained to perform a certain task and tested to find out which group is faster to learn the task. For the two groups the following information was given:

Value         Group one    Group two
Mean          10.4 min     11.9 min
Stan. dev.    1.2 min      1.3 min

 Relatively speaking:

a) Which group is more consistent in its performance? (Ans: Group 2)

b) Suppose person A from group one takes 9.2 minutes while person B from group two takes 9.3 minutes. Who was faster in performing the task? Why? (Ans: person B is faster)

Solution
Coefficient of variation for group 1: $CV = \frac{S_1}{\bar{x}_1} \times 100\% = \frac{1.2}{10.4} \times 100\% = 11.54\%$

Coefficient of variation for group 2: $CV = \frac{S_2}{\bar{x}_2} \times 100\% = \frac{1.3}{11.9} \times 100\% = 10.92\%$

CV for group 2 < CV for group 1 → group 2 is more consistent

Z-score of person A: $Z = \frac{x_A - \bar{x}_1}{S_1} = \frac{9.2 - 10.4}{1.2} = -1.00$

Z-score of person B: $Z = \frac{x_B - \bar{x}_2}{S_2} = \frac{9.3 - 11.9}{1.3} = -2.00$

Z-score of person B < Z-score of person A → person B is faster than person A
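
The comparison in this solution rests on two small formulas, the coefficient of variation and the standard score. The sketch below recomputes both for the two groups; the function names are illustrative, and a lower (more negative) z-score is read as faster because the measurement is time taken.

```python
def coefficient_of_variation(mean, sd):
    """CV as a percentage: standard deviation relative to the mean."""
    return sd / mean * 100

def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Group summaries in minutes: (mean, standard deviation)
group_1 = (10.4, 1.2)
group_2 = (11.9, 1.3)

cv_1 = coefficient_of_variation(*group_1)
cv_2 = coefficient_of_variation(*group_2)
more_consistent = "group 2" if cv_2 < cv_1 else "group 1"

z_a = z_score(9.2, *group_1)   # person A, from group one
z_b = z_score(9.3, *group_2)   # person B, from group two
relatively_faster = "B" if z_b < z_a else "A"

print(round(cv_1, 2), round(cv_2, 2), more_consistent)
print(round(z_a, 2), round(z_b, 2), relatively_faster)
```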
4. Probability and Probability Distribution
Basic Definitions of Probability
 People use the term probability many times each day.

 For example, a physician says that a patient has a 50-50 chance of surviving a certain operation.

 Another physician may say that she is 95% certain that a patient has a particular disease.

 An economist may say that he is 80% certain that this year's inflation will be higher than last year's.

 Probability is the likelihood of occurrence of an outcome.

Fundamental Concepts
 Experiment: Any process that generates well-defined outcomes.

 Steps involved in an experiment:

 Input: equipment, materials, input data, etc.

 Action to be performed

 Output: list of all results of the experiment

 An experiment can be deterministic or probabilistic (non-deterministic, stochastic)

Fundamental Concepts
 Deterministic Experiment

 A precisely determined input yields a precisely determined output.

 This is an experiment for which the outcome can be predicted in advance and is known prior to its conduct.

 Example: an experiment conducted to determine the economic law of demand: Qt = a + bPt, where Q is the quantity demanded, P is the price and t is time

Fundamental Concepts
 Non-Deterministic/Probabilistic/Stochastic Experiment
 Even exact knowledge of the input and action does not allow exact prediction of the outcome.

 This is an experiment for which the outcome of a given trial cannot be predicted in advance prior to its conduct.

 Usually the result of this experiment is subject to chance, and more than one result is possible.

 All the possible outcomes are known prior to conducting the experiment.

 Example: Rolling a die and observing the number that is rolled is a probability experiment.

Fundamental Concepts
 The result of a single trial in a probability experiment is the outcome.

 Experiment 1: Toss a coin

 Outcomes: {T}, {H}

 Experiment 2: Roll a six sided die

 Outcomes: {1}, {2}, {3}, {4}, {5}, {6}

 The set of all possible outcomes for an experiment is the sample space.
 S={T, H}
 S={1, 2, 3, 4, 5, 6}

Fundamental Concepts
 An event consists of one or more outcomes and is a subset of the sample
space.
 Let event A be getting tail: A={T}

 Let B be getting an even number: B={2, 4,6}

 A simple event is an event that consists of a single outcome.

 Let C be the event of getting the number 6 on the die: C={6}

Classical Probability
 Classical (or theoretical) probability is used when each outcome in a sample space is equally likely to occur. The classical probability for event E is given by

$P(E) = \frac{\text{Number of outcomes in event } E}{\text{Total number of outcomes in sample space}}$

 Example: A die is rolled. Find the probability of Event A: rolling a 5.
 There is one outcome in Event A: {5}

$P(A) = \frac{1}{6} \approx 0.167$

Empirical Probability
 Empirical (or statistical) probability is based on observations obtained from probability experiments. The empirical probability of an event E is the relative frequency of event E.

$P(E) = \frac{\text{Frequency of event } E}{\text{Total frequency}} = \frac{f}{n}$

 Example: A company manufactures light bulbs. Out of 100,000 light bulbs produced, 6 are defective. What is the probability that the next bulb produced will be defective?

Subjective Probability
 Subjective probability results from intuition, educated guesses, and estimates.
 May differ from person to person.
 Example: A business analyst predicts that the probability of a certain union
going on strike is 0.15.

Rules of Probability
 Range of Probabilities Rule
 The probability of an event E is between 0 and 1, inclusive. That is, $0 \le P(E) \le 1$.

(Probability scale: 0 = impossible to occur, 0.5 = even chance, 1 = certain to occur)

 The complement of event E is the set of all outcomes in the sample space that are not included in event E. (Denoted E′ and read “E prime.”)

P(E) + P(E′) = 1    P(E) = 1 – P(E′)    P(E′) = 1 – P(E)
 The probability that event A or B will occur is given by

P(A or B) = P(A) + P(B) – P(A and B).

Conditional Probability
 A conditional probability is the probability of an event occurring,
given that another event has already occurred.
P (B |A) “Probability of B, given A”

 Example: There are 5 red chips, 4 blue chips, and 6 white chips in a basket. Two chips are randomly selected. Find the probability that the second chip is red given that the first chip is blue. (Assume that the first chip is not replaced.)

 Because the first chip is selected and not replaced, there are only 14 chips remaining.

$P(\text{selecting a red chip} \mid \text{first chip is blue}) = \frac{5}{14} \approx 0.357$
Conditional Probability
 Example: 100 college students were surveyed and asked how many hours a
week they spent studying. The results are in the table below. Find the
probability that a student spends more than 10 hours studying given that the
student is a male.

The sample space consists of the 49 male students. Of these 49, 16 spend more than 10
hours a week studying.

$P(\text{more than 10 hours} \mid \text{male}) = \frac{16}{49} \approx 0.327$
Independent Events
 Two events are independent if the occurrence of one of the events does not
affect the probability of the other event. Two events A and B are independent if

P (B |A) = P (B) or if P (A |B) = P (A).

 Events that are not independent are dependent.

Example: Decide if the events are independent or dependent.

• Selecting a diamond from a standard deck of cards (A), putting it back in the deck, and then selecting a spade from the deck (B).

$P(B \mid A) = \frac{13}{52} = \frac{1}{4}$ and $P(B) = \frac{13}{52} = \frac{1}{4}$.

The occurrence of A does not affect the probability of B, so the events are independent.
Mutually Exclusive Events
 Two events, A and B, are mutually exclusive if they cannot occur at the same time.

Event A: Roll a number less than 3 on a die.
Event B: Roll a 4 on a die.
A and B are mutually exclusive.

Event A: Select a Jack from a deck of cards.
Event B: Select a heart from a deck of cards.
A and B are not mutually exclusive (the jack of hearts belongs to both events).

Fundamental Counting Principle

If one event can occur in m ways and a second event can occur in n ways, the number of ways the two events can occur in sequence is m · n. This rule can be extended for any number of events occurring in a sequence.
Example:
A meal consists of a main dish, a side dish, and a dessert. How many different meals can be selected if there are 4 main dishes, 2 side dishes and 5 desserts available?

(# of main dishes) × (# of side dishes) × (# of desserts) = 4 × 2 × 5 = 40
There are 40 meals available.
Fundamental Counting Principle
Example:
Two coins are flipped. How many different outcomes are there? List the sample space.

1st coin tossed: Heads or Tails (2 ways to flip the coin)
2nd coin tossed: Heads or Tails (2 ways to flip the coin)

There are 2 × 2 = 4 different outcomes: {HH, HT, TH, TT}.


Fundamental Counting Principle
Example:

The access code to a house's security system consists of 5 digits. Each digit
can be 0 through 9. How many different codes are available if
a.) each digit can be repeated?
b.) each digit can only be used once and not repeated?

a.) Because each digit can be repeated, there are 10 choices for each of the 5 digits.

10 · 10 · 10 · 10 · 10 = 100,000 codes
b.) Because each digit cannot be repeated, there are 10 choices for the first digit, 9 choices
left for the second digit, 8 for the third, 7 for the fourth and 6 for the fifth.

10 · 9 · 8 · 7 · 6 = 30,240 codes
Permutations
A permutation is an ordered arrangement of objects. The number
of different permutations of n distinct objects is n!.
“n factorial”

n! = n · (n – 1)· (n – 2)· (n – 3)· …· 3· 2· 1

Example:
How many different surveys are required to cover all possible question
arrangements if there are 7 questions in a survey?

7! = 7 · 6 · 5 · 4 · 3 · 2 · 1 = 5040 surveys
Permutation of n Objects Taken r at a Time

The number of permutations of n elements taken r at a time is

$_nP_r = \frac{n!}{(n - r)!}$, where n is the number in the group and r is the number taken from the group.

Example:

You are required to read 5 books from a list of 8. In how many different orders can you do so?

$_nP_r = {}_8P_5 = \frac{8!}{(8 - 5)!} = \frac{8!}{3!} = \frac{8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{3 \cdot 2 \cdot 1} = 6720$ ways
Distinguishable Permutations
The number of distinguishable permutations of n objects, where n1 are of one type, n2 are of another type, and so on, is

$\frac{n!}{n_1! \cdot n_2! \cdot n_3! \cdots n_k!}$, where $n_1 + n_2 + n_3 + \cdots + n_k = n$.

Example:
In how many ways can you order the letters of the word PSSISSIPIP?

$\frac{10!}{3! \, 4! \, 3!} = \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4!}{3! \, 4! \, 3!} = 4{,}200$ different ways to arrange the letters


Combination of n Objects Taken r at a Time
A combination is a selection of r objects from a group of n things when order does not matter. The number of combinations of r objects selected from a group of n objects is

$_nC_r = \frac{n!}{(n - r)! \, r!}$, where n is the number in the collection and r is the number taken from the collection.

Example:
You are required to read 5 books from a list of 8. In how many different ways can you do so if the order doesn’t matter?

$_8C_5 = \frac{8!}{3! \, 5!} = \frac{8 \cdot 7 \cdot 6 \cdot 5!}{3! \, 5!} = 56$ combinations
Application of Counting Principles
Example:
In a state lottery, you must correctly select 6 numbers (in any order)
out of 44 to win the grand prize.
a.) How many ways can 6 numbers be chosen from the 44 numbers?
b.) If you purchase one lottery ticket, what is the probability of
winning the top prize?

a.) $_{44}C_6 = \frac{44!}{6! \, 38!} = 7{,}059{,}052$ combinations
b.) There is only one winning ticket, therefore,
$P(\text{win}) = \frac{1}{7059052} \approx 0.00000014$
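
The counting examples in this part (meals, access codes, surveys, reading lists, letter arrangements and the lottery) can all be verified with Python's standard library, assuming Python 3.8 or later for `math.perm` and `math.comb`. The helper `distinguishable_permutations` is an illustrative name, not something taken from the slides.

```python
import math
from collections import Counter

def distinguishable_permutations(word):
    """n! / (n1! * n2! * ... * nk!) for the repeated letters of a word."""
    result = math.factorial(len(word))
    for count in Counter(word).values():
        result //= math.factorial(count)
    return result

meals = 4 * 2 * 5                       # fundamental counting principle
codes_with_repeats = 10 ** 5            # 5 digits, repetition allowed
codes_without_repeats = math.perm(10, 5)
survey_orders = math.factorial(7)       # 7 questions in every possible order
reading_orders = math.perm(8, 5)        # order matters
reading_sets = math.comb(8, 5)          # order does not matter
arrangements = distinguishable_permutations("PSSISSIPIP")
lottery_combinations = math.comb(44, 6)
p_win = 1 / lottery_combinations

print(meals, codes_with_repeats, codes_without_repeats, survey_orders)
print(reading_orders, reading_sets, arrangements, lottery_combinations, p_win)
```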
Random variables
 A random variable is a numerical quantity that is generated by a
random experiment.
 A random variable is called discrete if it has either a finite or a
countable number of possible values.

 A random variable is called continuous if its possible values


contain a whole interval of numbers.

 The examples in the table are typical in that discrete random variables typically arise from a counting process, whereas continuous random variables typically arise from a measurement.
Discrete random variables
 Example
Experiment                Number X                                      Possible values of X
Roll two fair dice        Sum of the number of dots on the top faces    2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
Flip a fair coin twice    Count number of tails                         0, 1, 2

 Example: Flip a coin three times, let X be the number of heads in three tosses. Construct a probability distribution for X.

Discrete random variables: Probability distribution

 Example: Flip a coin three times, let X be the number of heads in three tosses. Construct a probability distribution for X.

Outcome in S    X(S)
HHH             3
HHT             2
HTH             2
HTT             1
THH             2
THT             1
TTH             1
TTT             0

 Range of X: Rx = {0, 1, 2, 3}

Discrete random variables: Probability distribution
 Example: probability distribution

X 0 1 2 3
P(X=x) 1/8 3/8 3/8 1/8
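
The same table can be built by enumerating the sample space, which makes the link between outcomes and probabilities explicit. The sketch assumes a fair coin, so all eight outcomes are equally likely, and uses exact fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of three tosses: HHH, HHT, ..., TTT (8 equally likely outcomes)
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# X maps each outcome to its number of heads; count how often each value occurs
value_counts = Counter(outcome.count("H") for outcome in outcomes)

# Classical probability: favourable outcomes divided by total outcomes
pmf = {x: Fraction(count, len(outcomes)) for x, count in sorted(value_counts.items())}

for x, p in pmf.items():
    print(x, p)
```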

The Mean and Standard Deviation of a Discrete Random Variable

 The mean (also called the expected value) of a discrete random variable X is the number

$\mu = E(X) = \sum x \, P(X = x)$

 The mean of a random variable may be interpreted as the average of the values assumed by the random variable in repeated trials of the experiment.

The Mean and Standard Deviation of a Discrete Random Variable
 The variance, σ², of a discrete random variable X is the number

$\sigma^2 = \sum (x - \mu)^2 P(X = x) = \sum x^2 P(X = x) - \mu^2$

 The standard deviation, σ, of a discrete random variable X is the square root of its variance, hence is given by the formulas

$\sigma = \sqrt{\sum (x - \mu)^2 P(X = x)} = \sqrt{\sum x^2 P(X = x) - \mu^2}$

 The variance and standard deviation of a discrete random variable X may be interpreted as measures of the variability of the values assumed by the random variable in repeated trials of the experiment.

The Mean and Standard Deviation of a Discrete Random Variable
 Example: A manufacturer receives a certain component from a supplier in shipments of 100 units. Two units in each shipment are selected at random and tested. If either one of the units is defective, the shipment is rejected. Suppose a shipment has 5 defective units.
a) Construct the probability distribution for the number X of defective units in such a sample.

b) Find the probability that such a shipment will be accepted.

c) Determine the mean and standard deviation of the number of defectives.

The Mean and Standard Deviation of a Discrete Random Variable
 Example: A manufacturer receives ….
a) The probability distribution

X         0        1        2
P(X=x)    0.902    0.096    0.002

b) P(shipment is accepted) = P(X = 0) = 0.902

c) Mean = ∑x P(X=x) = (0 × 0.902) + (1 × 0.096) + (2 × 0.002) = 0.10, variance = 0.094 and standard deviation = √0.094 ≈ 0.31
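
The probabilities in this table come from sampling 2 units without replacement from 100 units of which 5 are defective, which is a hypergeometric setting. The sketch below recomputes the distribution, the acceptance probability, and the mean, variance and standard deviation; the function and parameter names are illustrative.

```python
import math

def hypergeom_pmf(x, population, defectives, sample_size):
    """P(X = x) when sampling without replacement."""
    good = population - defectives
    ways = math.comb(defectives, x) * math.comb(good, sample_size - x)
    return ways / math.comb(population, sample_size)

# Shipment of 100 units, 5 defective, 2 units tested
pmf = {x: hypergeom_pmf(x, 100, 5, 2) for x in range(3)}

p_accept = pmf[0]                                   # accepted only if no defective is found
mean = sum(x * p for x, p in pmf.items())
variance = sum(x * x * p for x, p in pmf.items()) - mean ** 2
std_dev = math.sqrt(variance)

print({x: round(p, 3) for x, p in pmf.items()})
print(round(p_accept, 3), round(mean, 2), round(variance, 3), round(std_dev, 3))
```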

The Binomial probability distribution
 Suppose a random experiment has the following characteristics.
 There are n identical and independent trials of a common procedure.

 There are exactly two possible outcomes for each trial, one termed “success”

and the other “failure.”

 The probability of success on any one trial is the same number p.

 Then the discrete random variable X that counts the number of successes
in the n trials is the binomial random variable with parameters n and
p. We also say that X has a binomial distribution with parameters n
and p.

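The Binomial probability distribution
 If X is binomial with parameters n and p, then

$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$

with mean $E(X) = np$ and variance $Var(X) = np(1 - p)$.
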
The Binomial probability distribution
 Example: A corporation has advertised heavily to try to ensure that over half the
adult population recognizes the brand name of its products. In a random sample
of 20 adults, 14 recognized its brand name.
a) What is the probability that 14 or more people in such a sample would recognize its brand name if the actual proportion p of all adults who recognize the brand name were only 0.50?

b) What is the mean number of adults in the sample who recognize the brand name?

c) What is the variance of the number of adults in the sample who recognize the brand name?

 Example: A corporation has …
Let X be the number of adults in the sample who recognize the brand name.
p = 0.5 and n = 20, so X ~ Binomial(20, 0.5)
a) Probability that 14 or more people in the sample recognize the brand name when p = 0.50:
P(X ≥ 14) = 0.0577
b) Mean number of adults who recognize the brand name:

Mean = E(X) = np = 20 × 0.5 = 10
c) Variance of the number of adults who recognize the brand name:

Var(X) = np(1 − p) = 20 × 0.5 × 0.5 = 5
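The tail probability in part (a) is the sum of the binomial probabilities for x = 14, ..., 20. The following is a minimal Python sketch of that sum, using only the standard library. The helper name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.5

# a) P(X >= 14): add the probabilities of 14, 15, ..., 20 successes
p_at_least_14 = sum(binom_pmf(x, n, p) for x in range(14, n + 1))

# b) and c) mean and variance from the closed-form formulas
mean = n * p
variance = n * p * (1 - p)

print(f"P(X >= 14) = {p_at_least_14:.4f}")
print(f"mean = {mean}, variance = {variance}")
```

A probability this small suggests that observing 14 recognitions out of 20 would be unusual if the true proportion were only 0.50.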

The Poisson probability distribution
 The Poisson probability distribution provides a good model for the
probability distribution of the number of “rare events” that occur
randomly in time, distance, or space.

 Assume that an interval is divided into a very large number of subintervals so that the probability of the occurrence of an event in any subinterval is very small.

 Assumptions of a Poisson probability distribution:

 The probability of an occurrence of an event is the same for all subintervals, and occurrences are independent of one another;

 You are counting the number of times a particular event occurs in a unit; and

 As the unit gets smaller, the probability that two or more events will occur in that unit approaches zero.

 The random variable X is said to follow the Poisson probability
distribution if it has the probability function:

$$P(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad \text{for } x = 0, 1, 2, \ldots$$

 where
• P(x) = the probability of x successes over a given period of time or space, given λ

• λ = the expected number of successes per time or space unit; λ > 0

• e = 2.71828 (the base for natural logarithms)

• The mean and variance of the Poisson probability distribution are:

$$\mu_x = E(X) = \lambda \quad \text{and} \quad \sigma_x^2 = E[(X-\mu_x)^2] = \lambda$$

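 Example: If calls to your cell phone are a Poisson process with a
constant rate λ = 2 calls per hour,

 what’s the probability that, if you forget to turn your phone off in a 1.5 hour movie, your phone rings during that time?

Let X be the number of calls in 1.5 hours, so X ~ Poisson(λt) with λt = 2(1.5) = 3.

$$P(X = 0) = \frac{(2 \times 1.5)^{0} e^{-2(1.5)}}{0!} = \frac{3^{0} e^{-3}}{0!} = e^{-3} \approx 0.05$$

The phone rings if at least one call arrives: $P(X \ge 1) = 1 - P(X = 0) \approx 1 - 0.05 = 0.95$

 How many phone calls do you expect to get during the movie?

E(X) = λt = 2(1.5) = 3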
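The phone example can be reproduced numerically. The following is a minimal Python sketch of the Poisson probability function applied to the 1.5 hour movie. The names `poisson_pmf`, `rate_per_hour` and `hours` are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

rate_per_hour = 2.0   # calls per hour, from the example
hours = 1.5           # length of the movie
lam = rate_per_hour * hours   # expected number of calls during the movie

p_no_call = poisson_pmf(0, lam)   # phone stays silent
p_rings = 1 - p_no_call           # at least one call arrives

print(f"expected calls = {lam}")
print(f"P(no call) = {p_no_call:.4f}, P(phone rings) = {p_rings:.4f}")
```

Working through the complement P(X = 0) avoids summing the infinitely many terms P(X = 1) + P(X = 2) + ...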

 Example: A life insurance company insures the lives of 5,000 men
of age 42. Actuarial studies show the probability of any 42-year-old
man dying in a given year to be 0.001.
a) What is the probability that the company will pay exactly 4 claims in a year?

b) What is the mean number of claims per year the company will pay?

c) What is the probability that the company will pay at least 1 claim in a year?

 Example: A life insurance company….

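n = 5000 and p = 0.001, so the number of claims X is binomial. Because n is large and p is small, X is approximately Poisson with λ = np = 5, i.e. X ~ Poisson(5)

a) What is the probability that the company will pay exactly 4 claims in a year?

$$P(X = 4) = \frac{e^{-\lambda}\lambda^{x}}{x!} = \frac{e^{-5}\,5^{4}}{4!} = 0.17547$$

b) What is the mean number of claims per year the company will pay?
Mean = λ = 5

c) What is the probability that the company will pay at least 1 claim in a year?

$$P(X \ge 1) = P(X = 1) + P(X = 2) + \cdots = 1 - P(X = 0) = 1 - \frac{e^{-5}\,5^{0}}{0!} = 0.993262$$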
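The insurance example treats a binomial count with large n and small p as a Poisson count with λ = np. The following is a minimal Python sketch that computes the Poisson answers and, as an added check not in the lecture, compares P(X = 4) with the exact binomial value. The helper names are illustrative.

```python
from math import comb, exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5000, 0.001
lam = n * p   # mean number of claims per year

p_four = poisson_pmf(4, lam)              # a) Poisson approximation
p_four_exact = binom_pmf(4, n, p)         # exact binomial value, for comparison
p_at_least_one = 1 - poisson_pmf(0, lam)  # c) complement of "no claims"

print(f"lambda = {lam}")
print(f"P(X = 4): Poisson = {p_four:.5f}, binomial = {p_four_exact:.5f}")
print(f"P(X >= 1) = {p_at_least_one:.6f}")
```

The two values of P(X = 4) should differ only slightly, which is why the Poisson model is a convenient stand-in here.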
Continuous Probability Distribution
 A continuous random variable is a variable that can assume any value in
an interval

 thickness of an item

 time required to complete a task

 temperature of a solution

 height, in meters

 A continuous random variable has an infinite number of possible
values that can be represented by an interval on the number line.

Example: hours spent studying in a day. The time spent studying can be any number between 0 and 24.

The probability distribution of a continuous random variable is called a continuous probability distribution.

Normal Probability Distribution
 The most important probability distribution in statistics is the normal
distribution.
 Bell shaped

 Symmetrical

 Mean, median and mode are equal

 Location is determined by the mean, μ

 Spread is determined by the standard deviation, σ

 The random variable has an infinite theoretical range: −∞ to +∞

 The total area under the curve is equal to one.

 The normal distribution closely approximates the probability distributions of a
wide range of random variables

 Distributions of sample means approach a normal distribution given a “large” sample size

 Computations of probabilities are direct and elegant

 The normal probability distribution has led to good business decisions for a
number of applications

 The formula for the normal probability density function is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty$$

 For a normal random variable X with mean μ and variance σ², i.e., X ~ N(μ, σ²),
the cumulative distribution function is

$$F(x_0) = P(X \le x_0)$$

[Figure: normal density f(x); the area under the curve to the left of x₀ represents P(X ≤ x₀).]
 There may be thousands of normal distribution curves, each with a different
mean and a different standard deviation.

 Since the shapes are different, the areas under the curves between any two points
are also different.

 To make life easier, all normal distributions can be converted to a standard normal distribution.

 A standard normal distribution has a mean of 0 and a standard deviation of 1.

Standard Normal Probability Distribution
 The letter z is used to designate the standard normal random variable.

[Figure: standard normal curve centered at z = 0 with σ = 1.]

• Converting to the standard normal distribution requires the use of this formula

$$z = \frac{\text{Value} - \text{Mean}}{\text{Standard deviation}} = \frac{x - \mu}{\sigma}$$

• If X is distributed normally with mean of 100 and standard deviation of 50, the z value for X = 200 is z = (200 − 100)/50 = 2.0
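The conversion to a z value is a one-line calculation. The following is a minimal Python sketch, where the function name `z_score` is illustrative.

```python
def z_score(x: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations that x lies above (or below) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

# Values from the example: mean 100, standard deviation 50, X = 200
z = z_score(200, mean=100, std_dev=50)
print(f"z = {z}")
```

A positive z means the value lies above the mean, and a negative z means it lies below.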

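$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) = F\left(\frac{b-\mu}{\sigma}\right) - F\left(\frac{a-\mu}{\sigma}\right)$$

[Figure: the area under f(x) between a and b equals the area under the standard normal curve between (a − μ)/σ and (b − μ)/σ.]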

 Properties of the Standard Normal Distribution
1. The cumulative area is close to 0 for z-scores close to z = −3.49.

2. The cumulative area increases as the z-scores increase.

3. The cumulative area for z = 0 is 0.5000.

4. The cumulative area is close to 1 for z-scores close to z = 3.49

[Figure: standard normal curve; the cumulative area is close to 0 at z = −3.49, is 0.5000 at z = 0, and is close to 1 at z = 3.49.]

 Example: Find the area under the standard normal curve between z = 0 and z = 2.71.

 Find the area by finding 2.7 in the left hand column of the standard normal table, and then moving across the row to the column under 0.01

 The area between z = 0 and z = 2.71 is 0.4966.
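The table lookup can also be reproduced numerically. The following is a minimal Python sketch that builds the standard normal cumulative area from the error function in the standard library. The function name `phi` is illustrative.

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Area between z = 0 and z = 2.71, as read from the table in the example
area_0_to_271 = phi(2.71) - phi(0.0)

print(f"cumulative area at z = 0:     {phi(0.0):.4f}")
print(f"area between 0 and 2.71:      {area_0_to_271:.4f}")
print(f"cumulative area at z = -3.49: {phi(-3.49):.4f}")
print(f"cumulative area at z = 3.49:  {phi(3.49):.4f}")
```

The last three lines illustrate the listed properties: the cumulative area is 0.5 at z = 0, near 0 at z = −3.49 and near 1 at z = 3.49.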



 Example: A personal computer is used for office work at home, research,
communication, personal finances, education, entertainment, social networking,
and a myriad of other things. Suppose that the average number of hours a
household personal computer is used for entertainment is two hours per day.
Assume the times for entertainment are normally distributed and the standard
deviation for the times is half an hour.

a. Find the probability that a household personal computer is used for entertainment between 1.8 and 2.75 hours per day.

b. Find the maximum number of hours per day that the bottom quartile of
households uses a personal computer for entertainment.
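The slides state this example without a worked solution. The following is one possible way to compute both parts, sketched in Python with `statistics.NormalDist` from the standard library (Python 3.8 or later). The variable names are illustrative.

```python
from statistics import NormalDist

# Entertainment time per day: mean 2 hours, standard deviation 0.5 hours
usage = NormalDist(mu=2.0, sigma=0.5)

# a) P(1.8 < X < 2.75) = F(2.75) - F(1.8)
p_between = usage.cdf(2.75) - usage.cdf(1.8)

# b) The bottom quartile ends at the 25th percentile of the distribution
first_quartile = usage.inv_cdf(0.25)

print(f"P(1.8 < X < 2.75) = {p_between:.4f}")
print(f"25th percentile   = {first_quartile:.2f} hours")
```

Part (a) corresponds to z values of (1.8 − 2)/0.5 = −0.4 and (2.75 − 2)/0.5 = 1.5, so the same answer can be read from the standard normal table.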


 Example: The service life of a certain brand of automobile battery is normally
distributed with a mean of 1000 days and a standard deviation of 100 days. The
manufacturer of the battery wants to offer a guarantee, but has not yet decided on the
length of the warranty. It does not want to replace more than 10 percent of the
batteries sold. What should be the length of the warranty?
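The slides leave this question open. One reading is that the warranty length should be the 10th percentile of battery life, so that about 10 percent of batteries fail before it expires. The following is a minimal Python sketch under that assumption, with illustrative variable names.

```python
from statistics import NormalDist

# Battery service life: mean 1000 days, standard deviation 100 days
battery_life = NormalDist(mu=1000, sigma=100)

# Longest warranty for which at most 10% of batteries fail within it
warranty_days = battery_life.inv_cdf(0.10)

# Equivalent z value, for comparison with a standard normal table
z_10 = (warranty_days - battery_life.mean) / battery_life.stdev

print(f"z for the lowest 10% = {z_10:.2f}")
print(f"warranty length      = {warranty_days:.0f} days")
```

Any shorter warranty would also keep replacements under 10 percent; the value computed here is the upper limit.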

Uniform Probability Distribution

 The uniform distribution is a probability distribution that has equal probabilities for all possible outcomes of the random variable

[Figure: uniform probability density function f(x), constant between xmin and xmax; the total area under the density is 1.0.]


 The uniform probability distribution is given as

$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

where

f(x) = value of the density function at any x value

a = minimum value of x

b = maximum value of x
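Because the density is flat, a uniform probability is the area of a rectangle. The following is a minimal Python sketch of the density and of an interval probability. The function names and the values a = 2, b = 6 are hypothetical and chosen only for illustration.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of a uniform random variable on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo <= X <= hi): width of the overlap with [a, b] times the height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)   # clip the interval to the support
    return max(hi - lo, 0.0) / (b - a)

a, b = 2.0, 6.0   # hypothetical minimum and maximum values

print(f"f(3) = {uniform_pdf(3.0, a, b)}")                 # inside the support
print(f"f(7) = {uniform_pdf(7.0, a, b)}")                 # outside the support
print(f"P(3 <= X <= 5) = {uniform_prob(3.0, 5.0, a, b)}")
print(f"total area = {uniform_prob(a, b, a, b)}")
```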

Exponential Probability Distribution
 Used to model the length of time between two occurrences of an event (the time
between arrivals)

 Examples:

 Time between trucks arriving at an unloading dock

 Time between transactions at an ATM Machine

 Time between phone calls to the main operator

 The exponential random variable T (t > 0) has a probability density
function

$$f(t) = \lambda e^{-\lambda t} \quad \text{for } t > 0$$

 Where
 λ is the mean number of occurrences per unit time

 t is the number of time units until the next occurrence

 e = 2.71828

 T is said to follow an exponential probability distribution

 Defined by a single parameter, λ (lambda), the mean number of occurrences per unit time; the mean time between occurrences is 1/λ

 The cumulative distribution function (the probability that an arrival time is less
than some specified time t) is

$$F(t) = 1 - e^{-\lambda t}$$

where

e = mathematical constant approximated by 2.71828

λ = the population mean number of arrivals per unit time

t = any value of the continuous variable where t > 0

 Example: Customers arrive at the service counter at the rate of 15 per hour.
What is the probability that the arrival time between consecutive customers is
less than three minutes?

▪ The mean number of arrivals per hour is 15, so λ = 15

▪ Three minutes is 0.05 hours

▪ P(arrival time < 0.05) = $1 - e^{-\lambda t} = 1 - e^{-(15)(0.05)} = 0.5276$

▪ So there is a 52.76% probability that the arrival time between successive customers is less than three minutes
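The service counter example can be checked with the cumulative distribution function F(t) = 1 − e^(−λt). The following is a minimal Python sketch, where the function name `exponential_cdf` is illustrative. Note that the rate and the time must use the same unit, here hours.

```python
from math import exp

def exponential_cdf(t: float, lam: float) -> float:
    """P(T < t) for an exponential random variable with rate lam."""
    return 1.0 - exp(-lam * t) if t > 0 else 0.0

lam = 15.0          # mean number of arrivals per hour
t = 3 / 60          # three minutes expressed in hours

p_within_3_min = exponential_cdf(t, lam)
mean_gap_minutes = 60 / lam   # mean time between arrivals, 1/lam hours

print(f"P(arrival time < 3 minutes) = {p_within_3_min:.4f}")
print(f"mean time between arrivals  = {mean_gap_minutes} minutes")
```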
