
2nd Software Engineering

The document discusses statistics including definitions, types, methods of data collection and organization, and variables and scales of measurement. It covers descriptive and inferential statistics, qualitative and quantitative data, frequency distributions, and examples of categorical and discrete frequency distributions.

Uploaded by

wakgarielias4
Copyright
© © All Rights Reserved

Contents

1 Introduction and Definition of Statistics

Type and Application of Statistics

Variables and Scale of Measurement


2 Methods of Data Collection and Organization

Type and Sources of Data

Methods of Data Collection

Methods of Data Organization

Methods of Data Presentation


3 Measures of Central Tendency

Types of Measures of Central Tendency


4 Measures of Variation

Types of Measures of Variation


5 Theory of Probability

Definition and Some Basic Concepts of Probability


Definition of Statistics

• The term 'statistics' is derived from the Latin word status, meaning state. Historically, statistics referred to the display of facts and figures relating to the demography of states or countries.

• It is currently defined in both a plural and a singular sense.

• Plural sense: statistics are collections of numerical facts (figures).

• Examples: figures on sales, employment or unemployment, accidents, weather, deaths, education, etc.

• But not all numerical data are statistics.


Definition of Statistics...

For numerical data to qualify as statistics:

• They should be aggregates of facts.

• They should be affected by multiple causes rather than being the outcome of a single cause.

• They should be numerically expressed.

• They should be collected in a systematic manner for a predetermined purpose.

• They should be enumerated or estimated according to a reasonable standard of accuracy.
Singular sense

Statistics is the science that deals with the methods of collecting, organizing, presenting, analyzing and interpreting data. According to this definition, a statistical investigation has five stages:

1 Collection of Data: the process of obtaining data

2 Organization of Data: preparing data for clear understanding by editing, classifying and tabulating

3 Presentation of Data: visualizing organized data with diagrams or graphs

4 Analysis of Data: summarizing data to reach a conclusion about the given problem

5 Interpretation of Data: the process of drawing conclusions from the analyzed results
Classification of Statistics

Based on the scope of decision making, statistics can be:

1 Descriptive Statistics: used to organize and summarize masses of data using summary measures. It does not go beyond summarizing the data at hand.

2 Inferential Statistics: used to make generalizations about a population based on sample results, by applying probability theory.
Application of Statistics

• Statistics is applied in almost all fields of human endeavor.

• In Scientific Research: from the beginning of the design up to the final interpretation of the results

• In Industry: helps to check whether a product satisfies a given standard

• In Business: to forecast future demand and profits of the business

• In Medicine: for drug development and for identifying different health problems through research


Uses of Statistics

• To reduce and summarize masses of data: using summary statistics, diagrams or graphs

• To facilitate comparison: using averages, percentages, ratios, etc.

• To determine functional relationships between two or more phenomena: through measures of association

• To formulate and test hypotheses: based on test statistics and testing procedures

• For forecasting: using statistical models


Limitation of Statistics

• It does not deal with a single observation, because it works with aggregates of facts.

• It is not applicable to purely qualitative opinion, since it focuses on quantification.

• Statistical results are true only on average.

• Statistics are liable to be misused or misinterpreted.


Important terms of Statistics

• Variable: any phenomenon or attribute that can assume different values

• Data: observations or measurements obtained on a given variable

• Population: the totality of all objects under study that possess certain common characteristics

• Sample: a subset (portion) of the population selected for the purpose of investigation

• Sampling: the procedure of obtaining a sample using statistical techniques

• Parameter: a value computed from the population that describes a characteristic of that population

• Statistic: a value computed from a sample that characterizes that sample


Variables and its Types

A variable can be:

• Qualitative variable: a variable that takes category values (not expressed as numeric values)

Example: gender, religion, color of automobile, educational level

• Quantitative variable: a variable that takes numerical values

Example: height, family size, weight, etc.

• A quantitative variable can be either discrete or continuous.

• Discrete variable: assumes distinct, countable values. E.g. family size, number of children in a family, etc.

• Continuous variable: assumes measured values expressed in a given unit of measurement. E.g. height, weight, time, temperature, etc.
Scale of Measurement of Variable

The scale of measurement tells us what information a variable contains: its type, the values it can take and the mathematical operations that are meaningful on it. The four scales of measurement are:

1 Nominal: categories with no ordering. E.g. gender, ethnicity

2 Ordinal: categories with a meaningful ordering of the categorized values. E.g. academic rank, letter grades, economic status, health status

3 Interval: a quantitative scale on which differences are meaningful but there is no true zero. E.g. IQ, temperature measured in degrees Celsius

4 Ratio: a quantitative scale with a true zero, so that ratios are meaningful. E.g. age, weight, height measurements
Data types

Based on its nature, data can be:

• Qualitative: expressed in terms of categories and obtained from a qualitative variable
Examples: data on gender, religion, economic status, ethnicity of the subjects under investigation

• Quantitative: expressed in numeric values and obtained from a quantitative variable
Examples: data on age, weight, temperature, number of children a family has, etc.

• Quantitative data can also be discrete or continuous, depending on whether they are obtained by measuring a discrete or a continuous variable.


Data types

Based on its source data can be:

• Primary data: collected by the investigator for the purpose of a specific inquiry or study

• Secondary data: collected by others and obtained from different secondary sources

Based on time of data collection data can be:

• Cross-sectional data: is a set of observations taken at a point of time.

• Time series data: is a set of observations collected for a sequence of

time usually at equal intervals.


Methods of Data Collection

The first and foremost task in a statistical investigation is data collection. Before beginning data collection the investigator should address the following four questions:

1 Why?: the purpose of data collection

2 What?: the nature of the data to be collected

3 Where?: the source of the data

4 How?: the methods used for the collection

Data may be collected with questionnaires in surveys or by observation in experimental studies.
Questionnaire Methods

The questionnaire methods are based either on personal interviews using different techniques or on self-administered questionnaires.

• Secondary data: can also be extracted from different secondary sources after checking their reliability, suitability and adequacy


Methods of Data Organization

After data collection, the collected data must be organized in some meaningful way. Organization of data involves:

1 Data editing: editing is done to check completeness, consistency, accuracy and homogeneity

2 Data classification: separation of data into groups according to their similar characteristics

3 Tabulation of data: systematic arrangement of data in rows and columns

Tabulated data have different components such as a title, captions, etc.


Frequency Distributions

The most convenient way of organizing numerical data is to construct a frequency distribution. A frequency distribution is the organization of raw data in table form, using classes and frequencies.

1 Categorical Frequency Distribution: used to organize qualitative data, i.e. either nominal or ordinal data

2 Ungrouped Frequency Distribution: used to organize discrete quantitative data. Each class is represented by a single numeric value together with its frequency

3 Grouped (Continuous) Frequency Distribution: several values of a variable are grouped into one class. It has its own procedure of construction
Examples on categorical

• Example: The blood types of 22 students are given below. Construct a categorical frequency distribution.

A B B AB O A O O B AB B A B B O A O AB A O O AB

Class (Blood type)   Frequency (no of students)
A                    5
B                    6
AB                   4
O                    7
Total                22
Examples on Discrete

The number of children for 21 families is:

2 3 5 4 3 3 2 3 1 0 4 3 2 2 1 1 1 4 2 2 2

Construct an ungrouped frequency distribution.

Class (no of children)   Frequency (no of families)
0                        1
1                        4
2                        7
3                        5
4                        3
5                        1
Total                    21
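
The two tallies above can be checked with a short Python sketch. The variable names are illustrative only, and `collections.Counter` is used here simply as a convenient counting tool, not as part of the course material.

```python
from collections import Counter

# Blood types of the 22 students (categorical example).
blood_types = "A B B AB O A O O B AB B A B B O A O AB A O O AB".split()
blood_freq = Counter(blood_types)

# Number of children in the 21 families (discrete example).
children = [int(d) for d in "235433231043221114222"]
children_freq = Counter(children)

# Print each class with its frequency, in ascending order of class value.
for value in sorted(children_freq):
    print(value, children_freq[value])
print("Total", sum(children_freq.values()))
```

Each key of the counter plays the role of a class and each count is its class frequency.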
Grouped Frequency Distributions Construction
Construction of a grouped frequency distribution is based on the number of classes and the class width, with each class having its own class limits and class boundaries.

• Number of classes (k): the number of classes a given FD has

• Class width (w): the size of a given class

• Class limits: the lowest (LCL) and highest (UCL) values of a given class

• Class boundaries: class limits adjusted so that there is no gap between the UCL of one class and the LCL of the next class

• Class mark: the midpoint of a given class

• Class frequency: the number of observations lying in a given class

• Relative frequency: the ratio of the class frequency to the total frequency

• Cumulative frequency: the sum of the frequencies of a given class and all preceding classes (less-than cumulative frequency, LCF) or of a given class and all succeeding classes (more-than cumulative frequency, MCF)

• Unit of measurement (u): the smallest difference between any two distinct observations
Steps of Frequency Distributions Construction
1 Arrange the data in ascending or descending order

2 Find the unit of measurement $u$

3 Find the range: $R = \text{Maximum} - \text{Minimum}$

4 Determine the number of classes: $k = 1 + 3.322 \log_{10}(n)$

5 Determine the class width: $w = \frac{R}{k} = \frac{R}{1 + 3.322 \log_{10}(n)}$

6 Generate the class limits: $LCL_1 = \text{minimum}$, $LCL_i = LCL_{i-1} + w$ and $UCL_i = LCL_i + (w - u)$

7 Generate the class boundaries: $LCB_i = LCL_i - \frac{1}{2}u$ and $UCB_i = UCL_i + \frac{1}{2}u$

8 Determine the class frequency ($f_i$) by counting the number of observations lying in each class

9 Determine the class mark: $x_i = \frac{UCL_i + LCL_i}{2} = \frac{UCB_i + LCB_i}{2}$

10 Determine the relative frequency: $RF_i = \frac{f_i}{\sum_{j=1}^{k} f_j} = \frac{f_i}{n}$

11 Determine the cumulative frequencies: $LCF_i = \sum_{j=1}^{i} f_j$ and $MCF_i = \sum_{j=i}^{k} f_j$
Example

Consider the marks (out of 40) of 50 students: 16 21 26 24 11 17 25 26 13 27 24 26 3 27 23 24 15 22 22 12 22 29 18 22 28 25 7 17 22 28 19 23 23 22 3 19 13 31 23 28 24 9 20 33 30 23 20 8 21 24. Construct the grouped frequency distribution.

• Arranging the data: 3 3 7 8 9 11 12 13 13 15 16 17 17 18 19 19 20 20 21 21 22 22 22 22 22 22 23 23 23 23 23 24 24 24 24 24 25 25 26 26 26 27 27 28 28 28 29 30 31 33

• $u = 8 - 7 = 1$, $R = 33 - 3 = 30$, $k = 1 + 3.322 \log_{10}(50) = 6.64 \approx 7$

• $w = \frac{30}{6.64} = 4.5 \approx 5$, $w - u = 5 - 1 = 4$

• $LCL_1 = 3$, $LCL_2 = 3 + 5 = 8$, ..., $LCL_7 = 33$ and $UCL_1 = 3 + 4 = 7$, $UCL_2 = 8 + 4 = 12$, ..., $UCL_7 = 37$

• $LCB_i = LCL_i - 0.5$ and $UCB_i = UCL_i + 0.5$


Constructed FD Table

Class limits   Class boundaries   xi   fi   RFi     LCFi   MCFi
3-7            2.5-7.5            5    3    3/50    3      50
8-12           7.5-12.5           10   4    4/50    7      47
13-17          12.5-17.5          15   6    6/50    13     43
18-22          17.5-22.5          20   13   13/50   26     37
23-27          22.5-27.5          25   17   17/50   43     24
28-32          27.5-32.5          30   6    6/50    49     7
33-37          32.5-37.5          35   1    1/50    50     1
Total                                  50   1

• $x_i = \frac{UCL_i + LCL_i}{2} = \frac{UCB_i + LCB_i}{2}$, $RF_i = \frac{f_i}{n}$

• $LCF_i$ is the sum of the class frequencies from the first class up to class $i$; $MCF_i$ is the sum of the class frequencies from class $i$ to the last class.
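
The construction steps can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `grouped_frequency_table` is hypothetical, and the class width is rounded up with `math.ceil`, which is an assumption that happens to agree with the rounding 4.5 ≈ 5 used in the example.

```python
import math

marks = [16, 21, 26, 24, 11, 17, 25, 26, 13, 27, 24, 26, 3, 27, 23, 24, 15,
         22, 22, 12, 22, 29, 18, 22, 28, 25, 7, 17, 22, 28, 19, 23, 23, 22,
         3, 19, 13, 31, 23, 28, 24, 9, 20, 33, 30, 23, 20, 8, 21, 24]

def grouped_frequency_table(data, u=1):
    """Build a grouped FD; u is the unit of measurement (assumed known)."""
    n = len(data)
    r = max(data) - min(data)               # range
    k = 1 + 3.322 * math.log10(n)           # number of classes (unrounded)
    w = math.ceil(r / k)                    # class width, rounded up (assumption)
    rows, lcl = [], min(data)
    while lcl <= max(data):
        ucl = lcl + (w - u)
        rows.append({
            "limits": (lcl, ucl),
            "boundaries": (lcl - u / 2, ucl + u / 2),
            "mark": (lcl + ucl) / 2,
            "f": sum(lcl <= x <= ucl for x in data),
        })
        lcl += w
    running = 0
    for row in rows:
        running += row["f"]
        row["rf"] = row["f"] / n            # relative frequency
        row["lcf"] = running                # less-than cumulative frequency
        row["mcf"] = n - running + row["f"] # more-than cumulative frequency
    return rows

table = grouped_frequency_table(marks)
for row in table:
    print(row["limits"], row["boundaries"], row["mark"],
          row["f"], row["lcf"], row["mcf"])
```

For the 50 marks this produces seven classes starting at 3-7, which can be compared row by row with the constructed FD table.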


Methods of Data Presentation

Data presentation methods help to visualize summarized data using different diagrams or graphs, chosen according to the nature of the data.

• Diagrams: used to present qualitative data. Diagrams include pie charts and simple, multiple and component bar charts.

• Pie and simple bar charts: used to present qualitative data on a single variable. A pie chart is a circular presentation of the data.

• Multiple and component bar charts: used to present qualitative data on two or more variables. The difference is that a multiple bar chart uses a separate bar for each category, whereas a component bar chart divides a single bar into segments according to the category frequencies.

• In any bar chart the Y-axis represents the frequency of each category and the X-axis represents the categories.


Examples

• Consider the following data collected from 200 individuals and organized by marital status and gender. Using diagrams, present marital status by sex and marital status alone.

Marital status   Male   Female   Total
Single           90     10       100
Married          30     40       70
Others           5      25       30
Categorical Data Presentation Examples

Figure: Presentation of marital status vs sex and Marital status only


Methods of Quantitative Data Presentation

• Quantitative data organized in ungrouped or grouped frequency distributions are presented using line graphs, histograms, frequency polygons and other graphical techniques.

• Histogram: plots class frequencies against class boundaries

• Frequency polygon: connects the points formed by class marks and class frequencies

• Cumulative frequency polygon (ogive): plots cumulative frequencies against class boundaries

Examples

• Presentation of data collected on the height of 45 students

Figure: Presentation of the height of students


Measures of Central Tendency

Organizing and presenting data is not enough to draw further conclusions about them; we need statistical measures that summarize the data further. Measures of central tendency (MCT) are among these summary measures: they condense a data set by providing its average. MCT also help:

• To condense a mass of data into a single numeric value

• To facilitate comparison among different groups

• To know the center value (average) of a given data set


Properties of Good Measures of Central Tendency

A measure of central tendency is good or satisfactory if it:

• Is based on all observations

• Is not affected by extreme values

• Has a definite value

• Always exists

• Is easy to understand and calculate

• Is capable of further algebraic treatment


Summation notations and its property
Summation notation is needed to calculate the mean and other statistical measures.

• Let x be a variable with successive values $x_1, x_2, ..., x_n$. Then $x_1 + x_2 + ... + x_n = \sum_{i=1}^{n} x_i$

• $x_1^2 + x_2^2 + ... + x_n^2 = \sum_{i=1}^{n} x_i^2$

• $x_1 y_1 + x_2 y_2 + ... + x_n y_n = \sum_{i=1}^{n} x_i y_i$

• $\frac{1}{x_1} + \frac{1}{x_2} + ... + \frac{1}{x_n} = \sum_{i=1}^{n} \frac{1}{x_i}$

Rules of Summation

1 For two variables x and y: $\sum_{i=1}^{n} (x_i \pm y_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i$

2 For any constant k: $\sum_{i=1}^{n} k x_i = k \sum_{i=1}^{n} x_i$

3 $\sum_{i=1}^{n} k = nk$

4 $\sum_{i=1}^{n} (x_i - k)^2 = \sum_{i=1}^{n} x_i^2 - 2k \sum_{i=1}^{n} x_i + nk^2$
Types of Measures of Central Tendency

The three commonly used measures of central tendency are the mean, the median and the mode. Each MCT has its own properties.

• Mean: the average of the observed data

• Median: the middle value of a given data set. It divides the data into two equal parts (50% of the observations lie below the median and the remaining 50% lie above it)

• Mode: the most frequently occurring value in a given data set


Types of Mean

• Arithmetic Mean: the sum of all observed values divided by the number of observations: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, for raw data

• For grouped data: $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$, where $x_i$ and $f_i$ are the class mark and frequency of the $i$th class respectively.

• Weighted Arithmetic Mean: used when each observation has its own weight based on its importance: $\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$, where $w_i$ and $x_i$ are the $i$th weight and observed value respectively

• In the simple arithmetic mean all observations are considered to have equal importance, whereas in the weighted mean each observation has its own weight based on its importance.
Types of Mean...

• Geometric Mean (GM): the $n$th root of the product of all observed values. It gives a good average when the data are ratios, proportions or rates of increase or decrease.

• For ungrouped data: $GM = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \sqrt[n]{\prod_{i=1}^{n} x_i}$

• For grouped data: $GM = \left(\prod_{i=1}^{k} x_i^{f_i}\right)^{1 / \sum_{i=1}^{k} f_i}$

• Harmonic Mean (HM): gives a good average when the observed data are expressed as rates, such as values per unit of time.

• For ungrouped data: $HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + ... + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$

• For grouped data: $HM = \frac{\sum_{i=1}^{k} f_i}{\sum_{i=1}^{k} \frac{f_i}{x_i}}$

• The weighted harmonic mean, when each observation has its own weight, is given by: $HM = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{x_i}}$
Median
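The median is the value located at the center of the ordered data and is considered a measure of location.

• The formula for the median depends on the number of observations. If n is odd, the median is given by: $\tilde{x} = \left(\frac{n+1}{2}\right)^{th}$ observed value.

• If n is even, the median is given by: $\tilde{x} = \frac{\left(\frac{n}{2}\right)^{th} \text{ value} + \left(\frac{n}{2} + 1\right)^{th} \text{ value}}{2}$

• For grouped data: $\tilde{x} = LCB_{\tilde{x}} + \left(\frac{\frac{n}{2} - LCF_{\tilde{x}-1}}{f_{\tilde{x}}}\right) \cdot w$, where $LCF_{\tilde{x}-1}$ is the less-than cumulative frequency of the class preceding the median class and $f_{\tilde{x}}$ is the frequency of the median class.

• This is done after identifying the median class. The median class is the class containing the $\left(\frac{n}{2}\right)^{th}$ observed value.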

Mode

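Mode ($\hat{x}$): the most frequently observed value. For ungrouped data the value with the greatest frequency is the modal value.

• For grouped data: $\hat{x} = LCB_{\hat{x}} + \left(\frac{f_{\hat{x}} - f_{\hat{x}-1}}{(f_{\hat{x}} - f_{\hat{x}-1}) + (f_{\hat{x}} - f_{\hat{x}+1})}\right) \cdot w$, where $f_{\hat{x}-1}$ and $f_{\hat{x}+1}$ are the frequencies of the classes before and after the modal class.

• The modal class is the class with the largest frequency.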
Exercise on MCT
Example 1: The heights of 7 students selected from a class are given below in centimeters: 165, 160, 172, 168, 159, 170, 173. Calculate the simple AM of the heights.
$\bar{x} = \frac{165+160+172+168+159+170+173}{7} = \frac{1167}{7} = 166.71$ cm is the average height of the students.
• Example 2: Calculate the mean yield of maize based on the following grouped data.
Yield (in kg)   No of plots (fi)   Class mark (xi)   fi xi
171-179         3                  175               525
180-188         7                  184               1288
189-197         12                 193               2316
198-206         9                  202               1818
207-215         4                  211               844
216-224         4                  220               880
225-233         1                  229               229
Total           40                                   7900
• $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7900}{40} = 197.5$ kg per plot is the average yield.
Exercise on MCT...

Example 3: A student registered for Stat 281 and Math 261 with four credit hours each, and Math 224, Phil 201 and Comp 201 with three credit hours each. If the student earned a B in Stat 281, Math 261 and Phil 201 and a C in the remaining two courses, find the average score of the student (taking B = 3 and C = 2 grade points).
$\bar{x} = \frac{4 \cdot 3 + 4 \cdot 3 + 3 \cdot 3 + 3 \cdot 2 + 3 \cdot 2}{4+4+3+3+3} = \frac{45}{17} = 2.65$ is the average score the student earns at the end of the semester.

• Example 4: Assume an epidemic spread at rates of 1.5 and 2.67 on two successive days. Find the average spread rate.

$GM = \sqrt{1.5 \cdot 2.67} = \sqrt{4.005} = 2.001$ is the average spread rate of the epidemic over the two days.

• Example 5: A driver travels for 3 days: at 48 km per hr for 10 hrs, at 40 km per hr for 12 hrs and at 32 km per hr for 15 hrs. Find the average speed of the driver over the 3 days.
The driver covers $48 \cdot 10 = 40 \cdot 12 = 32 \cdot 15 = 480$ km each day, so the harmonic mean weighted by distance is
$HM = \frac{480 + 480 + 480}{\frac{480}{48} + \frac{480}{40} + \frac{480}{32}} = \frac{1440}{10 + 12 + 15} = \frac{1440}{37} = 38.92$ km per hr, which is the total distance divided by the total time.
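
The mean formulas used in these examples can be sketched as small Python helpers. The function names are illustrative, grade points B = 3 and C = 2 are taken from the calculation in Example 3, and the speed example is computed with distance weights (480 km per day), which is our reading of the intended weighting.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def weighted_mean(xs, ws):
    # Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)
    return sum(w * x for x, w in zip(xs, ws)) / sum(ws)

def geometric_mean(xs):
    # n-th root of the product of the values (math.prod needs Python 3.8+)
    return math.prod(xs) ** (1 / len(xs))

def weighted_harmonic_mean(xs, ws):
    # sum(w_i) / sum(w_i / x_i)
    return sum(ws) / sum(w / x for x, w in zip(xs, ws))

heights = [165, 160, 172, 168, 159, 170, 173]
mean_height = arithmetic_mean(heights)

# Grade points per course weighted by credit hours.
grade_points = [3, 3, 2, 3, 2]   # Stat 281, Math 261, Math 224, Phil 201, Comp 201
credits = [4, 4, 3, 3, 3]
average_score = weighted_mean(grade_points, credits)

spread_rate = geometric_mean([1.5, 2.67])

# Speeds weighted by the distance covered at each speed.
average_speed = weighted_harmonic_mean([48, 40, 32], [480, 480, 480])

print(round(mean_height, 2), round(average_score, 2),
      round(spread_rate, 3), round(average_speed, 2))
```

Weighting the harmonic mean by distance makes it equal to total distance divided by total time, which is the usual definition of average speed.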
Exercise...

• Example 6: What is the median of 180, 201, 220, 191, 219, 209 and 220?

• Sorted values: 180, 191, 201, 209, 219, 220, 220. Since n = 7 is odd, the median is given by:

$\left(\frac{7+1}{2}\right)^{th}$ value = 4th value = 209

• Find the median and mode of 62, 63, 64, 65, 66, 66, 68 and 78.

• Sorted values: 62, 63, 64, 65, 66, 66, 68, 78. Since n = 8 is even, the median is given by:

$\frac{\left(\frac{n}{2}\right)^{th} \text{ value} + \left(\frac{n}{2}+1\right)^{th} \text{ value}}{2} = \frac{4\text{th value} + 5\text{th value}}{2} = \frac{65+66}{2} = 65.5$

• The modal value is 66.


Exercise...

• Example: Using the data of Example 2, find the median and modal value of the yield.
Yield (in kg)   No of plots (fi)   Class mark (xi)   LCF
171-179         3                  175               3
180-188         7                  184               10
189-197         12                 193               22
198-206         9                  202               31
207-215         4                  211               35
216-224         4                  220               39
225-233         1                  229               40

• The median class is the class in which the $\left(\frac{40}{2}\right)^{th}$ value = 20th value lies, i.e. the third class.
• $\tilde{x} = 188.5 + \left(\frac{20-10}{12}\right) \cdot 9 = 196$
• The modal class is the class with the greatest frequency, which is also the third class.
• $\hat{x} = 188.5 + \left(\frac{12-7}{(12-7)+(12-9)}\right) \cdot 9 = 188.5 + \frac{5}{8} \cdot 9 = 194.1$
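
As a cross-check, the grouped median and mode formulas can be applied programmatically. This is a minimal sketch: the helper names are hypothetical, all classes are assumed to share the same width w, and the lower class boundaries are obtained by subtracting 0.5 from the lower class limits.

```python
def grouped_median(lcbs, freqs, w):
    """LCB + ((n/2 - LCF of previous class) / f of median class) * w."""
    n = sum(freqs)
    cumulative = 0
    for lcb, f in zip(lcbs, freqs):
        if cumulative + f >= n / 2:       # first class reaching the n/2-th value
            return lcb + (n / 2 - cumulative) / f * w
        cumulative += f

def grouped_mode(lcbs, freqs, w):
    """LCB + (d1 / (d1 + d2)) * w for the class with the largest frequency."""
    i = max(range(len(freqs)), key=freqs.__getitem__)
    previous_f = freqs[i - 1] if i > 0 else 0
    next_f = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1, d2 = freqs[i] - previous_f, freqs[i] - next_f
    return lcbs[i] + d1 / (d1 + d2) * w

# Maize yield data: lower class boundaries, frequencies, class width.
lcbs = [170.5, 179.5, 188.5, 197.5, 206.5, 215.5, 224.5]
freqs = [3, 7, 12, 9, 4, 4, 1]

median_yield = grouped_median(lcbs, freqs, w=9)
modal_yield = grouped_mode(lcbs, freqs, w=9)
print(round(median_yield, 1), round(modal_yield, 1))
```

Note that the denominator of the modal formula is the sum of the two differences d1 and d2, not the sum of the raw frequencies.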
Properties of Arithmetic Mean

• If a constant k is added to or subtracted from each observation, then $\bar{x}_{new} = \bar{x}_{old} \pm k$

• If each observation is multiplied by a constant k, then $\bar{x}_{new} = k\bar{x}_{old}$

• The sum of the deviations taken from the mean is zero, i.e. $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$

• If a mean was wrongly computed, the corrected mean can be obtained from the wrong and corrected observations, i.e.

$\bar{x}_{corrected} = \frac{\left(\sum_{i=1}^{n} x_i\right)_{wrong} - \sum x_{wrong} + \sum x_{corrected}}{n}$

• It is also possible to compute a combined mean for different groups, i.e.

$\bar{x}_{combined} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$
Examples

• Example 1: There are 49 students in a certain department. Among these, 7 are seniors with an average weight of 165 lbs, 9 are juniors with an average weight of 160 lbs, 13 are sophomores with an average weight of 152 lbs and 20 are freshmen with an average weight of 150 lbs. Find the average weight of the students in the department.
$\bar{x}_{combined} = \frac{7 \cdot 165 + 9 \cdot 160 + 13 \cdot 152 + 20 \cdot 150}{7+9+13+20} = \frac{7571}{49} = 154.51$ lbs

• Example 2: The mean age of a group of 100 students was found to be 32.02 years. Later it was discovered that an age of 57 had been misread as 27. Find the correct mean.
$\bar{x}_{corrected} = \frac{32.02 \cdot 100 + 57 - 27}{100} = \frac{3232}{100} = 32.32$ years
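
The combined-mean and corrected-mean properties translate directly into code. The sketch below uses hypothetical helper names and the figures from the two examples.

```python
def combined_mean(sizes, means):
    # sum(n_i * mean_i) / sum(n_i)
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

def corrected_mean(wrong_mean, n, wrong_value, correct_value):
    # Recover the wrong total, swap the misread value, divide by n again.
    return (wrong_mean * n - wrong_value + correct_value) / n

department_mean = combined_mean([7, 9, 13, 20], [165, 160, 152, 150])
true_mean_age = corrected_mean(32.02, 100, wrong_value=27, correct_value=57)
print(round(department_mean, 2), round(true_mean_age, 2))
```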
Other Measures of Locations

• These are quantiles (such as quartiles, deciles and percentiles), which divide a given data set into more than two equal parts. They are left for further reading.


Measures of Variation

• Two or more data sets may have the same mean and (or) median and still be quite different. This implies that MCT alone do not provide enough information about the nature of the data.
Score of class A   30   30   30
Score of class B   29   30   31
Score of class C   15   30   45
Score of class D   5    30   55
• All four data sets have a mean of 30 and a median of 30. This does not imply that the data sets are similar, and it does not give a clear picture of the nature of the data.
Objectives of Measures of Variation

• To have an idea about the reliability of the measures of central

tendency

• To compare two or more sets of data with regard to their variability

• To provide information about the structure of the data

• To pave way to the use of other statistical measures


Types of Measures of Variation

• Absolute Measures of Variation: measure the actual amount of variation of the items from a measure of central tendency and are expressed in the same units as the data.

• Relative Measures of Variation: the quotient obtained by dividing an absolute measure by the quantity with respect to which the absolute deviation was computed. They are unit-free and are used for making comparisons between different distributions.

Types of Measures of Variation...

Absolute Measures     Relative Measures
Range                 Coefficient of Range
Mean Deviation        Coefficient of Mean Deviation
Variance              Coefficient of Variation
Standard Deviation    Standard Scores
Range and Mean Deviation

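• Range: based only on the largest (L) and smallest (S) values: $R = L - S$

• Coefficient of Range: $CR = \frac{L - S}{L + S} = \frac{R}{L + S}$

• Mean deviation: the arithmetic mean of the absolute values of the deviations from a measure of central tendency (mean, median or mode).

For raw data: $MD_{\bar{x}} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$, $MD_{\tilde{x}} = \frac{\sum_{i=1}^{n} |x_i - \tilde{x}|}{n}$, $MD_{\hat{x}} = \frac{\sum_{i=1}^{n} |x_i - \hat{x}|}{n}$

For grouped data with k classes and $n = \sum f_i$: $MD_{\bar{x}} = \frac{\sum_{i=1}^{k} f_i |x_i - \bar{x}|}{n}$, $MD_{\tilde{x}} = \frac{\sum_{i=1}^{k} f_i |x_i - \tilde{x}|}{n}$, $MD_{\hat{x}} = \frac{\sum_{i=1}^{k} f_i |x_i - \hat{x}|}{n}$

• Coefficient of Mean Deviation: the ratio of the mean deviation to the MCT about which it was computed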


Variance and Standard Deviation

• Variance: the mean of the squared deviations taken from the arithmetic mean, i.e. $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$ for raw data, or $s^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n-1}$ for grouped data

• Standard deviation: the square root of the variance

• Coefficient of Variation: a relative measure of variation. It is used to judge how heterogeneous or homogeneous the observations are relative to the mean.

• $CV = \frac{s}{\bar{x}}$

• $\%CV = CV \cdot 100\%$
Examples

• Consider a sample with data values 27, 25, 20, 15, 30, 34, 28 and 25. Compute the range, coefficient of range, mean deviation about the mean, mean deviation about the median, coefficient of mean deviation about the mean, coefficient of mean deviation about the median, variance, standard deviation and CV.

• $R = \max - \min = 34 - 15 = 19$, $CR = \frac{\max - \min}{\max + \min} = \frac{34 - 15}{34 + 15} = 0.388$

• $\bar{x} = \frac{27 + ... + 25}{8} = 25.5$, $\tilde{x} = \frac{25 + 27}{2} = 26$

• $MD_{\bar{x}} = \frac{|27-25.5| + ... + |25-25.5|}{8} = \frac{34}{8} = 4.25$, $CMD_{\bar{x}} = \frac{4.25}{25.5} = 0.1667$

• $MD_{\tilde{x}} = \frac{|27-26| + ... + |25-26|}{8} = \frac{34}{8} = 4.25$, $CMD_{\tilde{x}} = \frac{4.25}{26} = 0.163$

• $s^2 = \frac{(27-25.5)^2 + ... + (25-25.5)^2}{7} = \frac{242}{7} = 34.57$, $s = \sqrt{34.57} = 5.88$, $CV = \frac{5.88}{25.5} = 0.231$
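
The measures in this example can be recomputed with Python's standard `statistics` module. This is an illustrative sketch; `statistics.variance` and `statistics.stdev` use the sample (n − 1) divisor, matching the formula above.

```python
import statistics

data = [27, 25, 20, 15, 30, 34, 28, 25]
n = len(data)

mean = statistics.mean(data)
median = statistics.median(data)

data_range = max(data) - min(data)
coefficient_of_range = data_range / (max(data) + min(data))

md_mean = sum(abs(x - mean) for x in data) / n       # mean deviation about mean
md_median = sum(abs(x - median) for x in data) / n   # mean deviation about median

variance = statistics.variance(data)   # sample variance, divisor n - 1
std_dev = statistics.stdev(data)
cv = std_dev / mean

print(data_range, round(coefficient_of_range, 3), md_mean, md_median,
      round(variance, 2), round(std_dev, 2), round(cv, 3))
```

The two mean deviations coincide here because the mean (25.5) lies between the two middle values 25 and 27, where the sum of absolute deviations is constant.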
Examples

• Consider the following grouped data on the scores of students and find the measures of variation, given $\bar{x} = 25.64$ and $\tilde{x} = 26.1$.

class       xi     fi   |xi − x̄|   fi|xi − x̄|   |xi − x̃|   fi|xi − x̃|   fi(xi − x̄)²
10.5-14.5   12.5   4    13.14      52.56        13.6       54.4         690.6384
14.5-18.5   16.5   7    9.14       63.98        9.6        67.2         584.7772
18.5-22.5   20.5   8    5.14       41.12        5.6        44.8         211.3568
22.5-26.5   24.5   10   1.14       11.40        1.6        16.0         12.9960
26.5-30.5   28.5   12   2.86       34.32        2.4        28.8         98.1552
30.5-34.5   32.5   7    6.86       48.02        6.4        44.8         329.4172
34.5-38.5   36.5   8    10.86      86.88        10.4       83.2         943.5168
Total              56              338.28                  339.2        2870.8576

• $R = UCL_{last} - LCL_{first} = 38 - 11 = 27$, $CR = \frac{38 - 11}{38 + 11} = 0.551$

• $MD_{\bar{x}} = \frac{338.28}{56} = 6.04$, $CMD_{\bar{x}} = \frac{6.04}{25.64} = 0.24$

• $MD_{\tilde{x}} = \frac{339.2}{56} = 6.06$, $CMD_{\tilde{x}} = \frac{6.06}{26.1} = 0.23$

• $s^2 = \frac{2870.8576}{55} = 52.20$, $s = \sqrt{52.20} = 7.22$, $CV = \frac{7.22}{25.64} = 0.282$
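
The grouped calculation can be sketched the same way, working from class marks and frequencies. The variable names are illustrative, and the code uses the unrounded mean, so its results agree with the hand calculation (which uses the rounded mean 25.64) only to about two decimal places.

```python
import math

class_marks = [12.5, 16.5, 20.5, 24.5, 28.5, 32.5, 36.5]
freqs = [4, 7, 8, 10, 12, 7, 8]
n = sum(freqs)

grouped_mean = sum(f * x for x, f in zip(class_marks, freqs)) / n

# Mean deviation about the mean: sum(f_i * |x_i - mean|) / n
grouped_md = sum(f * abs(x - grouped_mean)
                 for x, f in zip(class_marks, freqs)) / n

# Sample variance for grouped data: sum(f_i * (x_i - mean)^2) / (n - 1)
grouped_var = sum(f * (x - grouped_mean) ** 2
                  for x, f in zip(class_marks, freqs)) / (n - 1)
grouped_sd = math.sqrt(grouped_var)
grouped_cv = grouped_sd / grouped_mean

print(round(grouped_mean, 2), round(grouped_md, 2),
      round(grouped_var, 2), round(grouped_sd, 2), round(grouped_cv, 3))
```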
Properties of variance

• If a constant is added to (or subtracted from) every observation, the standard deviation and the variance remain the same.

• If every value is multiplied by a nonzero constant k, the standard deviation is multiplied by $|k|$ and the variance is multiplied by $k^2$.
Theory of Probability

• Probability is a numerical description of the uncertainty of a given phenomenon under certain conditions.

• For example, the selection of subjects for a given investigation is assumed to be random.

• We may sample a population at random and make inferences about the population as a whole from the sample by using statistical analysis.

• In general, probability is about the occurrence or non-occurrence of a given event resulting from an experiment.


Review on Set Theory

• Union (Or): the set consisting of all elements in A or B or both is called the union of A and B, i.e. $A \cup B = \{x : x \in A \text{ or } x \in B\}$.

• Intersection (And): the set consisting of all elements in both A and B, i.e. $A \cap B = \{x : x \in A \text{ and } x \in B\}$.

• Complement (Not): the set consisting of all elements of $S$ that are not in A, i.e. $A^c = \{x : x \notin A\}$.

• Disjoint sets: sets A and B are disjoint if $A \cap B = \emptyset$.
Definition and Some Basic Concepts

• Experiment (ξ): any trial that results in well-defined outcomes

• Sample space (S): the set of all possible outcomes of an experiment

• Example: Tossing a coin two times is an experiment with S = {HH, HT, TH, TT}. Rolling a die is an experiment with S = {1, 2, 3, 4, 5, 6}.

• Event: a subset of the possible outcomes of a given experiment that share a defined characteristic

• E.g. the event of getting two heads when a fair coin is thrown twice is E = {HH}. The event of getting an even number when a die is thrown is E = {2, 4, 6}.
Type of events

• Simple Event: an event consisting of a single outcome

• Independent Events: events for which the occurrence or non-occurrence of one has no effect on the other

• Mutually Exclusive Events: events having no outcome in common (empty intersection)

• Complementary Events: two mutually exclusive events whose union is the whole sample space, so that exactly one of them occurs

• Null event: an event containing no outcome of the given experiment

• Exhaustive Events: events whose union forms the sample space
Counting Rules

Counting rules help to find the number of possible outcomes of an experiment, or the number of ways of performing it. The techniques are:

• Addition Rule: If an experiment has k procedures, where the $i$th procedure has $n_i$ alternatives and the procedures cannot be performed at the same time, the total number of possible ways of doing it is given by: $\sum_{i=1}^{k} n_i$

• Example: Suppose a lady wants to make a journey from Harar to Dire Dawa. She can use either a plane, a bus, a cycle or a horse, and there are 3 flights, 4 buses, 2 cycles and 3 horses available. In how many different ways can she make her journey?
From the given problem $n_f = 3$, $n_b = 4$, $n_c = 2$ and $n_h = 3$. So she has $n_f + n_b + n_c + n_h = 3 + 4 + 2 + 3 = 12$ different ways to make her trip from Harar to Dire Dawa.
Counting Rules

• Multiplication Rule: If an experiment consists of k procedures, where the $i$th procedure can be done in $n_i$ possible ways and all the procedures are performed together, then the total number of possible ways of doing the experiment is given by: $\prod_{i=1}^{k} n_i$

• Example: Assume that an individual has 3 pairs of shoes, 4 trousers and 4 t-shirts. In how many possible ways can he dress? $n_s = 3$, $n_t = 4$, $n_{ts} = 4$, so $3 \cdot 4 \cdot 4 = 48$ possible ways

• Permutation: used to find the number of possible arrangements or orderings. The number of ways of arranging n distinct objects is given by $n!$

• Example: Suppose a photographer must arrange 4 persons in a row for a photograph. In how many different ways can the arrangement be done? $n = 4$, $4! = 24$ possible ways
Counting Rules...

• The number of permutations of n objects taken r at a time is given by: $\frac{n!}{(n-r)!}$

• The number of permutations of n objects in which $n_1$ are alike, $n_2$ are alike, ..., $n_r$ are alike is given by: $\frac{n!}{\prod_{i=1}^{r} n_i!}$

• Example: How many different permutations can be made from the letters in the word STATISTICS?

$n_1 = n(s) = 3$, $n_2 = n(t) = 3$, $n_3 = n(a) = 1$, $n_4 = n(i) = 2$ and $n_5 = n(c) = 1$.
Thus, $\frac{10!}{3! \cdot 3! \cdot 1! \cdot 2! \cdot 1!} = 50400$

• Combination: the number of possible ways of making a selection when order does not matter

• The number of ways of selecting r objects from n objects is given by: $\binom{n}{r} = \frac{n!}{r! \cdot (n-r)!}$
Counting Rules...

• For a selection with k procedures, in which $r_1$ objects are selected from $n_1$, $r_2$ from $n_2$, up to $r_k$ from $n_k$, the number of ways is given by: $\prod_{i=1}^{k} \binom{n_i}{r_i}$

• Example 1: In how many ways can a student choose 3 books from a list of 12 different books? $\binom{12}{3} = \frac{12!}{3!(12-3)!} = 220$

• Example 2: Out of 5 male workers and 7 female workers of some factory, a committee consisting of 2 male and 3 female workers is to be formed. In how many ways can this be done if
(a) all workers are eligible: $\binom{5}{2} \cdot \binom{7}{3} = 350$
(b) one particular female must be a member: $\binom{5}{2} \cdot \binom{6}{2} = 150$
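
The counting examples can be verified with Python's `math` module (`math.comb` and `math.perm` require Python 3.8 or later). The helper `permutations_with_repeats` is a hypothetical name introduced only for this sketch.

```python
import math
from collections import Counter

def permutations_with_repeats(word):
    """n! divided by the factorial of each letter's multiplicity."""
    result = math.factorial(len(word))
    for count in Counter(word).values():
        result //= math.factorial(count)
    return result

photo_arrangements = math.factorial(4)            # 4 persons in a row
two_of_four = math.perm(4, 2)                     # n! / (n - r)! with n=4, r=2
statistics_words = permutations_with_repeats("STATISTICS")
book_choices = math.comb(12, 3)                   # 3 books out of 12
committee_all = math.comb(5, 2) * math.comb(7, 3)
committee_fixed_female = math.comb(5, 2) * math.comb(6, 2)

print(photo_arrangements, two_of_four, statistics_words,
      book_choices, committee_all, committee_fixed_female)
```

In case (b) the required member fills one of the three female seats, so only 2 of the remaining 6 female workers are chosen.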
Approaches in Probability Definition

• The Classical Approach: Suppose there are N equally likely possible outcomes in the sample space S of an experiment, of which n are favorable to the event E. Then the probability of the event E is given by: $P(E) = \frac{n}{N}$

• Example 1: Consider an experiment of tossing a die. What is the probability that an odd number occurs? S = {1, 2, 3, 4, 5, 6}, E = {1, 3, 5}, $P(E) = \frac{3}{6} = \frac{1}{2}$

• The Empirical (Frequentist) Approach: based on the relative frequencies of a given frequency distribution. Given a frequency distribution, the probability of an event being in a given class is given by: $P(E_i) = \frac{f_i}{\sum f_i}$

• Subjective Approach: based on an educated guess, experience or evaluation of a problem.
Some Probability Rules or Axioms

Let S be the sample space of an experiment, and let A and B be events defined on the experiment. Then the following rules of probability hold:

1 $0 \le P(A) \le 1$

2 $P(S) = 1$

3 $P(A^c) = 1 - P(A)$

4 $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

5 $P(\emptyset) = 0$, where $\emptyset$ is the impossible or null event

6 If $A_1, A_2, ..., A_n$ are pairwise mutually exclusive events, then $P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$
Exercise

• Example 1: A box of 20 candles consists of 5 defective and 15 non-defective candles. If 4 of these candles are selected at random, what is the probability that
(a) all will be defective? Let A be the event that all 4 candles are defective. $P(A) = \frac{\binom{5}{4}\binom{15}{0}}{\binom{20}{4}} = 0.001032$
(b) 3 will be non-defective? Let B be the event that 3 candles are non-defective. $P(B) = \frac{\binom{5}{1}\binom{15}{3}}{\binom{20}{4}} = 0.4696$
• Example 2: An urn contains 6 white, 4 red and 9 black balls. If 3 balls are drawn at random, find the probability that at least one is white. Let W be the event that at least one drawn ball is white.
$P(W) = \frac{\binom{6}{1}\binom{13}{2}}{\binom{19}{3}} + \frac{\binom{6}{2}\binom{13}{1}}{\binom{19}{3}} + \frac{\binom{6}{3}\binom{13}{0}}{\binom{19}{3}} = \frac{683}{969} \approx 0.7049$
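Both exercises count selections without replacement, so they can be sketched with one helper. The function `hypergeom` below is a hypothetical name introduced for illustration; it uses exact fractions so no rounding is involved, and the urn answer is computed through the complement "no white ball".

```python
from fractions import Fraction
from math import comb


def hypergeom(k_special: int, n_special: int, n_other: int, draws: int) -> Fraction:
    """P(exactly k_special of the drawn items come from the special group)."""
    total = n_special + n_other
    ways = comb(n_special, k_special) * comb(n_other, draws - k_special)
    return Fraction(ways, comb(total, draws))


# Candles: 5 defective, 15 non-defective, 4 drawn.
p_all_defective = hypergeom(4, 5, 15, 4)
p_three_good = hypergeom(1, 5, 15, 4)      # 1 defective, 3 non-defective

# Urn: 6 white, 13 non-white, 3 drawn; complement of "no white".
p_at_least_one_white = 1 - hypergeom(0, 6, 13, 3)

print(float(p_all_defective), float(p_three_good), float(p_at_least_one_white))
```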
Independence Probability

Two events are independent if the occurrence or non-occurrence of one event does not influence the probability of the other.
• Events A and B are said to be independent if $P(A \cap B) = P(A)P(B)$
Conditional Probability

Two events are dependent if the occurrence or non-occurrence of one event influences the probability of the other. The conditional probabilities of events A and B are given by:

• Conditional probability of A given that event B has already occurred: $P(A/B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$

• Conditional probability of B given that event A has already occurred: $P(B/A) = \frac{P(A \cap B)}{P(A)}$, provided $P(A) > 0$

• For mutually exclusive events $A_1$ and $A_2$: $P((A_1 \cup A_2)/B) = P(A_1/B) + P(A_2/B)$

• For pairwise mutually exclusive events $A_i$: $P(\bigcup_{i=1}^{n} A_i / B) = \sum_{i=1}^{n} P(A_i/B)$
Examples

• The probability that a research project will be well planned is 0.6, and the probability that it will be well planned and well executed is 0.54. What is the probability that it will be
(a) well executed given that it is well planned? Let D and E be the events that the research project is well planned and well executed, respectively. Then $P(D) = 0.6$ and $P(D \cap E) = 0.54$, so $P(E/D) = \frac{P(E \cap D)}{P(D)} = \frac{0.54}{0.6} = 0.9$

• (b) not well executed given that it is well planned? $P(E^c/D) = \frac{P(E^c \cap D)}{P(D)} = \frac{P(D) - P(D \cap E)}{P(D)} = 1 - P(E/D) = 0.1$
Multiplicative rule and Law of Total Probability

• Notation: the conditional probability of event A given that event B has occurred is written P(A/B)

• The multiplicative rule of probability: $P(A \cap B) = P(A/B)P(B)$

• The Law of Total Probability: Let $A_1, ..., A_k$ be mutually exclusive and exhaustive events. Then for any event B, $P(B) = \sum_{i=1}^{k} P(B/A_i)P(A_i)$

• The events $A_1, ..., A_k$ are said to be exhaustive if one of them must occur, that is, $A_1 \cup A_2 \cup ... \cup A_k = S$
Bayes’ Theorem

• Let $A_1, ..., A_k$ be mutually exclusive and exhaustive events with $P(A_i) > 0$ for $i = 1, 2, ..., k$, and let B be any event with $P(B) > 0$. Then the probability of the jth event $A_j$ given B is obtained by Bayes' Theorem, which follows from the multiplicative rule and the law of total probability:
• $P(A_j/B) = \frac{P(A_j \cap B)}{P(B)} = \frac{P(B/A_j)P(A_j)}{\sum_{i=1}^{k} P(B/A_i)P(A_i)}$ for $j = 1, 2, ..., k$
Bayes’ Theorem Example

• E.g.: Microchips from a factory are sorted into three separate boxes. Box 1 contains 25 microchips from shift 1, box 2 contains 35 microchips from shift 2, and box 3 contains 40 microchips from shift 3. There are 5, 10 and 5 defective microchips in the first, second and third boxes, respectively. Let A denote the event that a defective microchip is obtained and B1, B2 and B3 be the events of choosing box 1, box 2 and box 3, respectively. Each of the 100 microchips is equally likely to be drawn, so the probability of drawing from a box is proportional to its size. Given $P(B_1) = \frac{25}{100}$, $P(B_2) = \frac{35}{100}$, $P(B_3) = \frac{40}{100}$, $P(A/B_1) = \frac{5}{25}$, $P(A/B_2) = \frac{10}{35}$, $P(A/B_3) = \frac{5}{40}$

• What is the probability of obtaining a defective microchip?
$P(A) = \sum_{i=1}^{3} P(A/B_i)P(B_i) = \frac{5}{25} * \frac{25}{100} + \frac{10}{35} * \frac{35}{100} + \frac{5}{40} * \frac{40}{100} = 0.2$

• If we picked a defective microchip, what is the probability that it is from box 1? $P(B_1/A) = \frac{P(A/B_1)P(B_1)}{P(A)} = \frac{\frac{5}{25} * \frac{25}{100}}{0.2} = 0.25$
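The two steps of the microchip example (total probability, then Bayes' Theorem) can be sketched as follows. The dictionary keys and the function names `total_probability` and `posterior` are illustrative choices, not part of the original slides.

```python
from fractions import Fraction

# P(B_i): probability of drawing from each box.
priors = {
    "box1": Fraction(25, 100),
    "box2": Fraction(35, 100),
    "box3": Fraction(40, 100),
}
# P(A / B_i): probability of a defective chip given the box.
likelihoods = {
    "box1": Fraction(5, 25),
    "box2": Fraction(10, 35),
    "box3": Fraction(5, 40),
}


def total_probability(priors, likelihoods):
    """Law of total probability: P(A) = sum of P(A/B_i) P(B_i)."""
    return sum(likelihoods[b] * priors[b] for b in priors)


def posterior(box, priors, likelihoods):
    """Bayes' Theorem: P(B_j / A)."""
    return likelihoods[box] * priors[box] / total_probability(priors, likelihoods)


p_defective = total_probability(priors, likelihoods)
p_box1_given_defective = posterior("box1", priors, likelihoods)
print(p_defective, p_box1_given_defective)  # 1/5 1/4
```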
One Dimensional Random Variable

The previous section covered the basic concepts of probability as well as methods of calculating the probability of an event. This section focuses on calculating the probability of an event under somewhat more complex conditions.

• It focuses on how to define random variables over a given experiment and summarize their possible values using probability models

• Random Variable: a variable defined over a given random experiment. A variable X which assigns a real number to each outcome of a sample space is called a random variable
Type of One Dimensional Random Variables

The type of a random variable is based on the nature of the possible outcomes of a given experiment

1 Discrete random variable: A variable which assumes a finite or countably infinite number of values from a defined experiment
Example 1: The number of heads from the experiment of tossing a coin two times. S = {HH, HT, TH, TT}, X = {0, 1, 2}

2 Continuous random variable: A variable which can assume any of the infinitely many real values between defined points, based on a defined experiment.
Examples: Life of light bulbs under investigation, time taken for recovery after undergoing surgery, etc.
Probability Distribution

A probability distribution is a probability model for a given random variable: a function defining all possible values of the random variable with their respective probabilities

• The probability distribution of a discrete random variable is known as a probability mass function (PMF)

Let X be a discrete random variable. $p(x_i)$ is said to be the PMF of X if it satisfies the following conditions

1 $0 \le p(x_i) \le 1$

2 $\sum_i p(x_i) = 1$

3 $P(X \le x) = \sum_{x_i \le x} p(x_i)$
Examples

• Example 1: Construct a probability distribution for the number of heads in an experiment of tossing a coin two times.

X             0     1     2
P(X = x_i)    1/4   2/4   1/4

• Based on the given probability mass function find P(X ≤ 1)

• Example 2: The probability distribution of a discrete random variable Y is given by:
$P(Y = y) = cy^2$, y = 0, 1, 2, 3, 4. Find the value of c.
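One way to approach both examples is sketched below: the PMF of Example 1 is built by enumerating the sample space, and the constant in Example 2 comes from the condition that the probabilities sum to 1. All variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Example 1: enumerate the four equally likely outcomes of two tosses.
outcomes = list(product("HT", repeat=2))
head_counts = Counter(outcome.count("H") for outcome in outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(head_counts.items())}

# P(X <= 1) is the sum of the PMF over values not exceeding 1.
p_at_most_one = sum(p for x, p in pmf.items() if x <= 1)

# Example 2: c * (0^2 + 1^2 + 2^2 + 3^2 + 4^2) must equal 1.
c = Fraction(1, sum(y**2 for y in range(5)))

print(pmf, p_at_most_one, c)
```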
Probability Density function
The probability distribution of a continuous random variable is known as a probability density function (PDF)

• Let X be a continuous random variable. A function f(x) is said to be the PDF of X if it satisfies the following conditions

1 $f(x) \ge 0, \forall x$

2 $\int_{-\infty}^{\infty} f(x)\,dx = 1$

3 $P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$

• Example: Let X be a continuous random variable whose pdf is given by: $f(x) = 2x$ for $0 < x < 1$ (and 0 otherwise)

• (a) Verify whether f(x) is a pdf or not: $f(x) \ge 0$ and $\int_0^1 2x\,dx = x^2 \big|_0^1 = 1$, so f(x) is a pdf

• (b) Find $P(0.5 < X < 0.75) = \int_{0.5}^{0.75} 2x\,dx = x^2 \big|_{0.5}^{0.75} = 0.75^2 - 0.5^2 = 0.3125$
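The two integrals in this example can also be checked numerically. The sketch below uses a simple midpoint rule written by hand; `midpoint_integral` and `pdf` are hypothetical helper names, and the midpoint rule is just one convenient choice of numerical method.

```python
def midpoint_integral(f, a: float, b: float, n: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def pdf(x: float) -> float:
    """f(x) = 2x on (0, 1), zero elsewhere."""
    return 2 * x if 0 < x < 1 else 0.0


total_area = midpoint_integral(pdf, 0.0, 1.0)   # should be close to 1
prob = midpoint_integral(pdf, 0.5, 0.75)        # should be close to 0.3125
print(round(total_area, 6), round(prob, 6))
```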
CUMULATIVE DISTRIBUTION FUNCTION

The cumulative distribution function (CDF) accumulates the probability values up to a specified value of a given random variable

• The cumulative distribution of a discrete random variable X is defined as: $F_X(x) = P(X \le x) = \sum_{x_i \le x} p(x_i) \Leftrightarrow p(x_i) = F(x_i) - F(x_{i-1})$ for $i = 2, 3, ...$ and $p(x_1) = F(x_1)$

Properties of cumulative distribution function

1 In the limiting cases, $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$

2 $F_X$ is non-decreasing, that is $a < b \Rightarrow F(a) \le F(b)$

3 For $a < b$, $P(a < X \le b) = F_X(b) - F_X(a)$

CUMULATIVE DISTRIBUTION FUNCTION...

The cumulative distribution function (CDF) accumulates the probability values up to a specified value of a given random variable

• The cumulative distribution of a continuous random variable X is defined as: $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt \Leftrightarrow f(x) = \frac{\partial}{\partial x} F_X(x)$

Properties of cumulative distribution function

1 In the limiting cases, $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$

2 $F_X$ is non-decreasing, that is $a < b \Rightarrow F(a) \le F(b)$

3 For $a < b$, $P(a < X \le b) = F_X(b) - F_X(a)$, and because a single point carries zero probability for a continuous random variable, $P(a < X < b) = P(a < X \le b) = P(a \le X < b) = P(a \le X \le b)$
Expectation and variance of Random Variables

The expected value is the mean of a random variable, a measure of its average, and the variance measures the variability of the random variable's values in relation to their average value.
• For a discrete random variable: $E(X) = \mu = \sum_{\forall x_i} x_i p(x_i)$ and the variance is $Var(X) = \sigma^2 = E(X - \mu)^2 = \sum_{\forall x_i} (x_i - \mu)^2 p(x_i)$
• For a continuous random variable: $E(X) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$, $Var(X) = \sigma^2 = E(X - \mu)^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$

• Standard deviation: the positive square root of the variance


Expectation and variance of Random Variables properties

• For any constant a we have: $E(aX) = aE(X)$, $E(a) = a$, $E(X \pm a) = E(X) \pm a$

• If g(x) is a function of the random variable: $E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$ when X is continuous and $E(g(X)) = \sum_{\forall x_i} g(x_i) p(x_i)$ when X is a discrete random variable

• $Var(X) = \sigma^2 = E((X - E(X))^2) = E(X - \mu)^2 = E(X^2) - \mu^2$

• $E(X^2) = \sum_{\forall x_i} x_i^2 p(x_i)$ for the discrete case and $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$ for the continuous case

• $Var(aX) = a^2 \sigma^2$, $Var(a) = 0$

• $Var(X \pm a) = Var(X)$
Examples

A coin is tossed two times. Let X be the number of heads. Find the mean value and the standard deviation of X.

X             0     1     2
P(X = x_i)    1/4   2/4   1/4

• $E(X) = \sum x_i p(x_i) = 0 * \frac{1}{4} + 1 * \frac{2}{4} + 2 * \frac{1}{4} = 1$

• $E(X^2) = 0^2 * \frac{1}{4} + 1^2 * \frac{2}{4} + 2^2 * \frac{1}{4} = 1.5$, so $Var(X) = E(X^2) - \mu^2 = 1.5 - 1^2 = 0.5$

• $SD(X) = \sigma = \sqrt{\sigma^2} = \sqrt{0.5} = 0.707$
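The same calculation can be sketched for any discrete PMF stored as a dictionary. The helper names `mean` and `variance` are illustrative; the variance is computed through $E(X^2) - \mu^2$, the identity used in the example.

```python
from math import sqrt

# PMF of the number of heads in two tosses of a fair coin.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}


def mean(pmf: dict) -> float:
    """E(X) = sum of x * p(x)."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf: dict) -> float:
    """Var(X) = E(X^2) - (E(X))^2."""
    second_moment = sum(x**2 * p for x, p in pmf.items())
    return second_moment - mean(pmf) ** 2


mu = mean(pmf)
var = variance(pmf)
sd = sqrt(var)
print(mu, var, round(sd, 3))  # 1.0 0.5 0.707
```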
Examples

• Suppose that X is a continuous random variable with pdf

$f(x) = \begin{cases} 1 + x, & \text{if } -1 < x < 0 \\ 1 - x, & \text{if } 0 < x < 1 \end{cases}$

• Find the mean value and variance of X.

• $E(X) = \int_{-1}^{0} x(1 + x)\,dx + \int_{0}^{1} x(1 - x)\,dx = 0$

• $E(X^2) = \int_{-1}^{0} x^2(1 + x)\,dx + \int_{0}^{1} x^2(1 - x)\,dx = \frac{1}{6} \approx 0.167$

• $\sigma^2 = E(X^2) - \mu^2 = 0.167 - 0^2 = 0.167$


Two Dimensional Random variables

There are situations in which two random variables are defined over the same experiment. Suppose X and Y are random variables on the probability space $(\Omega, \mathcal{A}, P(\cdot))$. Their joint probability distribution describes their behavior relative to each other; it is defined over $R^2$, with each variable taking values in R.

• Their joint probability distribution is given by $P(X = x_i, Y = y_i) = P(x_i, y_i)$ when they are discrete random variables and by $f_{X,Y}(x, y)$ when they are continuous

• Their cumulative distribution function is $F(x_i, y_i) = P(X \le x_i, Y \le y_i)$ in the discrete case and $F_{X,Y}(x, y) = P(X \le x, Y \le y)$ in the continuous case


Joint and marginal probability mass function

• Let (X,Y) be a two dimensional discrete random variable. $P(X = x_i, Y = y_i) = P(x_i, y_i)$ is their joint probability distribution iff:

1 $\sum_{x_i} \sum_{y_i} P(X = x_i, Y = y_i) = 1$

2 $0 \le P(X = x_i, Y = y_i) \le 1, \forall x_i, y_i$

The marginal probability distributions of X and Y are given by:
• $P(x_i) = \sum_{y_i} P(X = x_i, Y = y_i)$ is the marginal of X
• $P(y_i) = \sum_{x_i} P(X = x_i, Y = y_i)$ is the marginal of Y
Joint and marginal density functions

• Let (X,Y) be a two dimensional continuous random variable. A function $f_{X,Y}(x, y)$ is their joint probability density function iff:

1 $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$

2 $f(x, y) \ge 0, \forall x, y$

The marginal density functions of X and Y are given by:
• $f(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$ is the marginal of X
• $f(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$ is the marginal of Y
Conditional Distributions

• The conditional probability mass functions of X and Y are given by:
$P(x_i/y_i) = \frac{p(x_i, y_i)}{p(y_i)}$
$P(y_i/x_i) = \frac{p(x_i, y_i)}{p(x_i)}$

• The conditional density functions of X and Y are given by:
$f(x/y) = \frac{f(x, y)}{f(y)}$
$f(y/x) = \frac{f(x, y)}{f(x)}$

• A two dimensional random variable (X,Y) is said to be independent iff:
$P(x_i, y_i) = P(x_i)P(y_i), \forall x_i, y_i$ for the discrete case
$f(x, y) = f(x)f(y), \forall x, y$ for the continuous case
Examples
• A company produces two types of compressors, grade A and grade B. Let X denote the number of grade A compressors produced on a given day. Let Y denote the number of grade B compressors produced on the same day. Suppose that the joint probability mass function is given by:

p(x_i, y_i)    Y = 0    Y = 1    p(x_i)
X = 0          0.1      0.3      0.4
X = 1          0.2      0.1      0.3
X = 2          0.2      0.1      0.3
p(y_i)         0.5      0.5      1

• Find $P(X < 1, Y \le 1) = P(X = 0, Y = 0) + P(X = 0, Y = 1) = 0.1 + 0.3 = 0.4$
• Find $P(X \le 1 / Y < 1) = \frac{P(X \le 1, Y < 1)}{P(Y < 1)} = \frac{P(X = 0, Y = 0) + P(X = 1, Y = 0)}{P(Y = 0)} = \frac{0.1 + 0.2}{0.5} = \frac{3}{5}$

• Find the marginal probability distribution

• Find the conditional distribution

Examples

• Conditional probability mass function of X given Y (each column sums to 1)

p(x_i/y_i)     Y = 0    Y = 1
X = 0          0.2      0.6
X = 1          0.4      0.2
X = 2          0.4      0.2

• Conditional probability mass function of Y given X (each row sums to 1)

p(y_i/x_i)     Y = 0    Y = 1
X = 0          0.25     0.75
X = 1          0.67     0.33
X = 2          0.67     0.33
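The marginal and conditional tables for the compressor example can be reproduced with a short sketch. The joint PMF is stored as a dictionary keyed by (x, y); this layout and all variable names are assumptions made for illustration.

```python
# Joint PMF p(x, y) from the compressor example, keyed by (x, y).
joint = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.1,
    (2, 0): 0.2, (2, 1): 0.1,
}
xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginals: sum the joint PMF over the other variable.
marginal_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
marginal_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

# Conditionals: joint probability divided by the relevant marginal.
cond_x_given_y = {(x, y): joint[(x, y)] / marginal_y[y] for x in xs for y in ys}
cond_y_given_x = {(x, y): joint[(x, y)] / marginal_x[x] for x in xs for y in ys}

for x in xs:
    print(x, [round(cond_y_given_x[(x, y)], 2) for y in ys])
```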
Example 2

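Suppose an electronic circuit contains two transistors. Let X be the time to failure of transistor 1 and let Y be the time to failure of transistor 2, with the following joint probability density function:

$f(x, y) = \begin{cases} 4e^{-2(x+y)}, & \text{if } x > 0,\ y > 0 \\ 0 & \text{otherwise} \end{cases}$

• Find the marginal distributions of X and Y (substituting $u = 2(x + y)$):
$f(x) = \int_{0}^{\infty} 4e^{-2(x+y)}\,dy = \int_{2x}^{\infty} 4e^{-u}\,\frac{du}{2} = 2e^{-2x}$
$f(y) = \int_{0}^{\infty} 4e^{-2(x+y)}\,dx = \int_{2y}^{\infty} 4e^{-u}\,\frac{du}{2} = 2e^{-2y}$

• Find the conditional densities of X and Y:
$f(y/x) = \frac{4e^{-2(x+y)}}{2e^{-2x}} = 2e^{-2y}$
$f(x/y) = \frac{4e^{-2(x+y)}}{2e^{-2y}} = 2e^{-2x}$

• $P(X > 1 / X + Y \le 2) = \frac{P(X > 1, X + Y \le 2)}{P(X + Y \le 2)} = \frac{\int_{1}^{2} \int_{0}^{2-x} 4e^{-2(x+y)}\,dy\,dx}{\int_{0}^{2} \int_{0}^{2-x} 4e^{-2(x+y)}\,dy\,dx} = \frac{e^{-2} - 3e^{-4}}{1 - 5e^{-4}} \approx 0.0885$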
Covariance and Correlation between Random Variables
Covariance is used to determine the direction of the association between two random variables, whereas correlation measures the degree of association between them
• The covariance between two random variables is given by: $Cov(X, Y) = \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$, where $E(XY) = \sum_{x} \sum_{y} x_i y_i P(x_i, y_i)$ and $E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy f(x, y)\,dx\,dy$ for two dimensional discrete and continuous random variables respectively
• Positive covariance implies positive association and negative covariance indicates negative association, whereas zero covariance implies no linear association between the random variables X and Y
• The correlation between random variables X and Y is given by: $\rho_{XY} = \frac{Cov(X, Y)}{\sqrt{Var(X) * Var(Y)}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$

• $-1 \le \rho_{XY} \le 1$

• $\rho_{XY} \approx 0$ indicates weak or no linear association, $\rho_{XY} \approx \pm 0.5$ indicates moderate positive or negative association, and $\rho_{XY} \approx \pm 1$ indicates strong positive or negative association
Examples
• Consider again the company that produces two types of compressors, whose joint probability distribution is given below, and calculate the correlation between the numbers of compressors

p(x_i, y_i)    Y = 0    Y = 1    p(x_i)
X = 0          0.1      0.3      0.4
X = 1          0.2      0.1      0.3
X = 2          0.2      0.1      0.3
p(y_i)         0.5      0.5      1

• E(XY) = 0*0*0.1 + 0*1*0.3 + 1*0*0.2 + 1*1*0.1 + 2*0*0.2 + 2*1*0.1 = 0.3

• $\mu_X$ = 0*0.4 + 1*0.3 + 2*0.3 = 0.9, $\mu_Y$ = 0*0.5 + 1*0.5 = 0.5

• $\sigma_{XY} = E(XY) - \mu_X * \mu_Y$ = 0.3 - 0.9*0.5 = -0.15

• $Var(X) = E(X^2) - \mu_X^2$, $E(X^2) = 0^2 * 0.4 + 1^2 * 0.3 + 2^2 * 0.3 = 1.5$, $\sigma_X^2 = 1.5 - 0.9^2 = 0.69$
• $Var(Y) = \sigma_Y^2 = E(Y^2) - \mu_Y^2$, $E(Y^2) = 0^2 * 0.5 + 1^2 * 0.5 = 0.5$, $\sigma_Y^2 = 0.5 - 0.5^2 = 0.25$
• $\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} = \frac{-0.15}{\sqrt{0.69 * 0.25}} = -0.36$
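The covariance and correlation for this table can be recomputed with the sketch below. The helper `expect` is a hypothetical function that evaluates $E[g(X, Y)]$ over the joint PMF; the dictionary layout is likewise an illustrative choice.

```python
from math import sqrt

# Joint PMF p(x, y) from the compressor example, keyed by (x, y).
joint = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.1,
    (2, 0): 0.2, (2, 1): 0.1,
}


def expect(g, joint: dict) -> float:
    """E[g(X, Y)] as the probability-weighted sum over all (x, y) pairs."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


mu_x = expect(lambda x, y: x, joint)
mu_y = expect(lambda x, y: y, joint)
cov = expect(lambda x, y: x * y, joint) - mu_x * mu_y
var_x = expect(lambda x, y: x**2, joint) - mu_x**2
var_y = expect(lambda x, y: y**2, joint) - mu_y**2
rho = cov / sqrt(var_x * var_y)

print(round(cov, 4), round(var_x, 4), round(var_y, 4), round(rho, 2))
```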
Examples

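• Consider the following joint density function of the random variable (X,Y) and calculate its correlation coefficient

$f(x, y) = \begin{cases} \frac{3}{2}x^2 + y, & \text{if } 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$

• $E(XY) = \int_{0}^{1} \int_{0}^{1} xy(\frac{3}{2}x^2 + y)\,dx\,dy = \frac{34}{96}$

• $E(X) = \int_{0}^{1} x(\frac{3}{2}x^2 + \frac{1}{2})\,dx = \frac{5}{8}$, $E(Y) = \int_{0}^{1} y(\frac{1}{2} + y)\,dy = \frac{7}{12}$

• $\sigma_{XY} = \frac{34}{96} - \frac{5}{8} * \frac{7}{12} = -\frac{1}{96} = -0.01042$

• $E(X^2) = \int_{0}^{1} x^2(\frac{3}{2}x^2 + \frac{1}{2})\,dx = \frac{14}{30}$, $E(Y^2) = \int_{0}^{1} y^2(\frac{1}{2} + y)\,dy = \frac{5}{12}$

• $\sigma_x^2 = \frac{14}{30} - (\frac{5}{8})^2 = 0.0760$, $\sigma_y^2 = \frac{5}{12} - (\frac{7}{12})^2 = 0.0764$

• $\rho_{xy} = \frac{-0.01042}{\sqrt{0.0760 * 0.0764}} = -0.137$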
Common Discrete Probability Distributions
Though there are several discrete probability distributions used to model the probability of a given discrete random variable, this course focuses on the Binomial and Poisson distributions

• Binomial Probability Distribution: a discrete probability distribution used to model a random variable defined over an experiment with a fixed number n of trials, each trial having two outcomes. In general, to apply the Binomial distribution the experiment should be characterized by the following four properties

1 The experiment should have a fixed number of trials (n)

2 The trials are independent

3 Each trial should result in one of two outcomes (success and failure) based on the event of interest

4 The probability of a success (p) is constant for all trials


Binomial Distribution..
The probability distribution of a random variable X defined over such an experiment is given by:
$P(X = x) = \binom{n}{x} p^x (1 - p)^{n-x}$, where x = the number of successes, p = the probability of a success on one trial, q = the probability of failure on one trial (q = 1 − p), and n = the number of trials

• $E(X) = np$, $Var(X) = \sigma^2 = npq$, and the standard deviation is $\sigma = \sqrt{npq}$

• Example: A university found that 10% of its students withdraw without completing the sophomore course. Assume that 20 students registered for the course. Compute the probability that
(a) exactly four will withdraw. Let X be the number of students who will withdraw without completing the course.
$P(X = 4) = \binom{20}{4} 0.1^4\, 0.9^{20-4} = 0.0898$
(b) at most two will withdraw. $P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = \binom{20}{0} 0.1^0\, 0.9^{20} + \binom{20}{1} 0.1^1\, 0.9^{19} + \binom{20}{2} 0.1^2\, 0.9^{18} \approx 0.6769$
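The binomial probabilities in this example can be evaluated with a small sketch. `binom_pmf` is a hypothetical helper written directly from the formula above, using the standard `math.comb`.

```python
from math import comb, sqrt


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a Binomial(n, p) random variable."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 20, 0.1
p_exactly_four = binom_pmf(4, n, p)
p_at_most_two = sum(binom_pmf(x, n, p) for x in range(3))

mean = n * p                   # E(X) = np
sd = sqrt(n * p * (1 - p))     # sqrt(npq)

print(round(p_exactly_four, 4), round(p_at_most_two, 4), mean, round(sd, 3))
```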
Poisson Distribution
There are also cases in which the experiment deals with counts such as the number of accidents or the number of arrivals in a specified period of time, where the number of trials is unknown. In such circumstances we use the Poisson distribution under the following assumptions

1 n is indefinite (very large)

2 The probability of the event happening in a single trial is very small (the event is rare)

3 The trials are independent and there is a known average number of events in a given interval

If X is a random variable defined over a Poisson experiment, its PMF is given by:
$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, $x = 0, 1, 2, ...$, where x counts the number of occurrences of the event and λ is the average number of events in the given interval
$E(X) = Var(X) = \lambda$
Examples on Poisson Distribution

A student finds that the average number of amoeba in 10 ml of pond water is 4. Find the probability that in 10 ml of water from that pond there are

1 exactly 5 amoeba. Let Y be the number of amoeba found in 10 ml of pond water. $P(Y = 5) = \frac{e^{-4} * 4^5}{5!} = 0.156$

2 no amoeba. $P(Y = 0) = \frac{e^{-4} * 4^0}{0!} = 0.0183$

3 at least one amoeba. $P(Y \ge 1) = \sum_{i=1}^{\infty} P(Y = y_i) = 1 - P(Y < 1) = 1 - P(Y = 0) = 0.9817$

4 Find the mean and standard deviation for the number of amoeba in 10 ml of pond water. Since $E(Y) = Var(Y) = \lambda$, the mean is 4 and the standard deviation is $\sqrt{4} = 2$
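The four parts of the amoeba example can be sketched in a few lines. `poisson_pmf` is a hypothetical helper written directly from the PMF given earlier.

```python
from math import exp, factorial, sqrt


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


lam = 4  # average number of amoeba per 10 ml
p_five = poisson_pmf(5, lam)
p_zero = poisson_pmf(0, lam)
p_at_least_one = 1 - p_zero    # complement of "no amoeba"
mean, sd = lam, sqrt(lam)      # E(Y) = Var(Y) = lam

print(round(p_five, 4), round(p_zero, 4), round(p_at_least_one, 4), mean, sd)
```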
Common Continuous Distributions
• Though there are several continuous distributions, this section focuses on the normal distribution, which plays a key role in modeling the probability distribution of continuous random variables and in statistical inference. It is bell shaped and characterized by its mean and variance

Figure: Normal probability distribution Curve


Normal probability distribution

The probability density function of X having a normal distribution is given by:
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, $-\infty < x < \infty$, where µ = mean, σ = standard deviation
• The distribution is characterized by its mean and variance

• It is symmetric about its mean

• It has its maximum point at its mean

• The standard deviation determines the flatness and width of the normal curve.
Normal probability distribution...

To get a probability under the normal density one has to integrate $\int \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$, which has no closed form.
To avoid this difficulty, the normal variable is converted to a standard normal variable so that the standard normal table can be used for finding the probability

• Standardizing means subtracting the mean from the given value and dividing by the standard deviation of the normal density: $z = \frac{x - \mu}{\sigma}$

• After the transformation, the area under the original normal curve between two values equals the area under the standard normal curve between the corresponding z values


Properties of Standardized Normal Distribution

• It has mean zero and standard deviation 1

• The curve of the standardized normal distribution is symmetric about its mean, zero

• The total area under the curve is one

• By symmetry, the area to the left of −z is equal to the area to the right of z

• The area from −∞ to 0 equals the area from 0 to ∞, which is half of the total (0.5)
Properties of Standardized Normal Distribution..

Figure: Standard Normal distribution Curve

• P(0 < z < 2.5) = 0.4938

• P(z > 1) = P(z > 0) − P(0 < z < 1) = 0.5 − 0.3413 = 0.1587

• P(z < 1) = P(z < 0) + P(0 < z < 1) = 0.5 + 0.3413 = 0.8413
Standardized Normal Distribution Table

Figure: Standard Normal distribution Table


Examples

The college boards, which are administered each year to many thousands of high school students, are scored so as to yield a mean of 500 and a standard deviation of 100. These scores are close to being normally distributed. What percentage of the scores can be expected to satisfy each condition?

• Greater than 600. Let X be the score of a student.
$P(X > 600) = P(\frac{X - 500}{100} > \frac{600 - 500}{100}) = P(Z > 1) = P(Z > 0) - P(0 < Z < 1) = 0.5 - 0.3413 = 0.1587$, that is, about 15.87% of the scores

• Between 450 and 600.
$P(450 < X < 600) = P(\frac{450 - 500}{100} < \frac{X - 500}{100} < \frac{600 - 500}{100}) = P(-0.5 < Z < 1) = P(-0.5 < Z < 0) + P(0 < Z < 1) = P(0 < Z < 0.5) + P(0 < Z < 1) = 0.1915 + 0.3413 = 0.5328$, that is, about 53.28% of the scores
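The table lookups above can be cross-checked without a table. The sketch below assumes the standard identity $\Phi(z) = \frac{1}{2}(1 + \mathrm{erf}(z/\sqrt{2}))$ for the standard normal CDF; `phi` and `normal_prob_between` are hypothetical helper names.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_prob_between(lo: float, hi: float, mu: float, sigma: float) -> float:
    """P(lo < X < hi) for X ~ Normal(mu, sigma), by standardizing both ends."""
    return phi((hi - mu) / sigma) - phi((lo - mu) / sigma)


mu, sigma = 500, 100
p_above_600 = 1 - phi((600 - mu) / sigma)
p_450_to_600 = normal_prob_between(450, 600, mu, sigma)

print(round(p_above_600, 4), round(p_450_to_600, 4))
```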
Simple Linear Regression Analysis

• $\hat{\beta}_1 = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - (\sum_{i=1}^{n} x_i)^2}$

• $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$

• The slope tells us the magnitude of the effect of the independent variable on the response, that is, how much the average response changes with a unit change in the independent variable

• Coefficient of Determination: measures the proportion or percentage of the variation in the dependent variable explained by the independent variable
Simple Linear Regression Analysis Examples

• Example: A researcher wants to find out if there is a relationship between the heights of sons and the heights of their fathers. In other words, do taller fathers have taller sons? The researcher took a random sample of 6 fathers and their 6 sons. Their heights in inches are given below in an ordered array.

Father (x)    63    65    66    67    67    68
Son (y)       66    68    65    67    69    70

• $\sum_{i=1}^{n} x_i y_i = 26740$, $\sum_{i=1}^{n} x_i = 396$, $\sum_{i=1}^{n} y_i = 405$, $\sum_{i=1}^{n} y_i^2 = 27355$, $\sum_{i=1}^{n} x_i^2 = 26152$

• $r = \frac{6 * 26740 - 396 * 405}{\sqrt{6 * 26152 - (396)^2} * \sqrt{6 * 27355 - (405)^2}} = 0.598$, $\hat{\beta}_1 = \frac{6 * 26740 - 396 * 405}{6 * 26152 - (396)^2} = 0.625$, $\hat{\beta}_0 = 67.5 - 0.625 * 66 = 26.25$

• Estimated Model: $\hat{y} = 26.25 + 0.625 x_i$
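The slope, intercept and correlation for the father and son data can be recomputed from the raw values. The function `simple_linear_regression` below is a hypothetical helper that follows the sum formulas of this section.

```python
from math import sqrt

fathers = [63, 65, 66, 67, 67, 68]   # x, heights in inches
sons = [66, 68, 65, 67, 69, 70]      # y, heights in inches


def simple_linear_regression(x, y):
    """Return (slope, intercept, r) using the sum formulas."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    slope = (n * sxy - sx * sy) / (n * sxx - sx**2)
    intercept = sy / n - slope * sx / n
    r = (n * sxy - sx * sy) / sqrt((n * sxx - sx**2) * (n * syy - sy**2))
    return slope, intercept, r


slope, intercept, r = simple_linear_regression(fathers, sons)
print(slope, intercept, round(r, 3), round(r**2, 3))
```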


Simple Linear Regression Analysis Examples

• The estimated correlation indicates a moderate positive association between fathers' and sons' heights

• The estimated slope indicates that for a one inch increase in the father's height, the average height of the son increases by 0.625 inches.

• Coefficient of determination: $R^2 = r^2 \approx 0.357$ shows that 35.7% of the variation in the dependent variable (son height) is accounted for by the variation of the independent variable (father height).
