BSAD - Lecture PPTs
Probability and
Distribution Statistics
1/13/2020 1
Importance of Probability
• Statistics develops your logic, whereas
probability develops your analytics.
• Analytics involves tasks such as predicting
the occurrence of an event, testing a
hypothesis, and building models to explain the
variation in a variable of importance to the
business, such as profitability, market share,
demand management, human resources, etc.
Terminology of Probability _
1.Random Experiment
• An experiment (event/occurrence) in
which the outcome is not known
with full certainty.
Examples : Future Demand, Future
Sales, Future Revenue, ROI, Outcome
of a match, Package received, footfall
etc.
2.Sample Space
• Sample Space is a universal set that consists of all the
possible outcomes (elementary events) of an experiment.
• It is denoted by the letter ‘S’. It can be finite or infinite.
• Example :
• Experiment 1– A Cricket Match between India & West
Indies
• Sample Space = S = All possible outcomes of the match for India =
{Win, Loss, Draw, Cancelled}
4.Probability estimation Using Relative
Frequency
• Probability of an event is estimated based on
the relative frequency of occurrence of that
event. Statistically,
• P(X) = No. of Observations in Favor of event X / Total No. of
Observations
• P(X) = n(X) / N
Numerical 1
• A company has 1000 employees in total. If 200
employees quit every year, calculate the annual
attrition rate for that company.
• Solution : (This is a case of Probability estimation
Using Relative Frequency)
• P(X) = No. of Observations in Favor of event X /
Total No. of Observations
• P(X) = n(X) / N = 200/1000 = 0.2 = 20%
Numerical 2
• A website displays 10 advertisements. The revenue generated by
this website depends on the number of visitors clicking on any of
these advertisements. The data collected by the company
revealed that out of a total of 2500 visitors to the website, 30 clicked
on one advertisement, 15 clicked on two advertisements and 5
clicked on three advertisements. The remaining visitors did not click
on any of the advertisements. Calculate,
• 1. The probability that a visitor will click on at least one of the
advertisements.
• 2. The probability that a visitor will click on at least two
advertisements.
• 3. The probability that a visitor will not click on any of the
advertisements.
Numerical 2 _ Solution
• Using the concept of Probability estimation
Using Relative Frequency, we obtain
• P(X) = No. of Observations in Favor of event X / Total
No. of Observations
• 1. The probability that a visitor will click on at least one of the
advertisements = (30 + 15 + 5)/2500 = 50/2500 = 0.02
• 2. The probability that a visitor will click on at least two
advertisements = (15 + 5)/2500 = 20/2500 = 0.008
• 3. The probability that a visitor will not click on any of the
advertisements = (2500 - 50)/2500 = 2450/2500 = 0.98
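The relative-frequency estimate P(X) = n(X) / N for Numerical 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the names `visitors_by_clicks`, `relative_frequency` and `visitors_with_at_least` are made up for this example and are not part of the lecture.

```python
from fractions import Fraction

# Data from Numerical 2: number of ads clicked -> number of visitors.
total_visitors = 2500
visitors_by_clicks = {1: 30, 2: 15, 3: 5}
# Everyone not counted above clicked on no advertisement.
visitors_by_clicks[0] = total_visitors - sum(visitors_by_clicks.values())


def relative_frequency(favourable, total):
    """Estimate P(X) as n(X) / N."""
    return Fraction(favourable, total)


def visitors_with_at_least(min_clicks):
    """Count visitors who clicked on at least `min_clicks` advertisements."""
    return sum(n for clicks, n in visitors_by_clicks.items() if clicks >= min_clicks)


p_at_least_one = relative_frequency(visitors_with_at_least(1), total_visitors)
p_at_least_two = relative_frequency(visitors_with_at_least(2), total_visitors)
p_none = relative_frequency(visitors_by_clicks[0], total_visitors)

print(float(p_at_least_one), float(p_at_least_two), float(p_none))  # 0.02 0.008 0.98
```

Exact fractions are used here so that the three estimates can be compared without floating-point rounding.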
Symbols Frequently Occurring
• X ∪ Y = X Union Y = X or Y
• Meaning event X will happen, or event Y will
happen, or both will happen simultaneously.
• X ∩ Y = X Intersection Y = X and Y
• Meaning event X and event Y will happen
simultaneously.
• Symbol ∪ is read as ‘or’ & symbol ∩ is read as
‘and’.
5. Algebra of events _ Rules of Probability_
1.Commutative Rule
If X and Y are events of a sample space ‘S’, then
P{X∪Y} = P{Y∪X}
P{X∩Y} = P{Y∩X}
Rule 2: Associative Rule
(X ∪ Y) ∪ Z = X ∪ ( Y ∪ Z )
(X ∩ Y ) ∩ Z = X ∩ ( Y ∩ Z )
3. Distributive Rule
• X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z)
• X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z)
4. De Morgan’s Law
(X ∪ Y)^C = X^C ∩ Y^C
(X ∩ Y)^C = X^C ∪ Y^C
where X^C and Y^C are the
complementary events of
X & Y
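The rules of the algebra of events can be checked on a small example with Python's built-in `set` type, where `|` is union and `&` is intersection. The sample space `S` and the events `X`, `Y`, `Z` below are hypothetical values chosen only for illustration. A check on one example does not prove the rules, it only shows them at work.

```python
# Hypothetical sample space and events, chosen only for illustration.
S = set(range(1, 11))
X = {1, 2, 3, 4}
Y = {3, 4, 5, 6}
Z = {4, 6, 8, 10}


def complement(event):
    """All outcomes of S that are not in the event."""
    return S - event


# Commutative rule
assert X | Y == Y | X
assert X & Y == Y & X

# Associative rule
assert (X | Y) | Z == X | (Y | Z)
assert (X & Y) & Z == X & (Y & Z)

# Distributive rule
assert X | (Y & Z) == (X | Y) & (X | Z)
assert X & (Y | Z) == (X & Y) | (X & Z)

# De Morgan's law
assert complement(X | Y) == complement(X) & complement(Y)
assert complement(X & Y) == complement(X) | complement(Y)

print("All four rules hold for this example")
```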
Axioms of Probability
1. Probability of an event E always lies between 0 & 1,
i.e. mathematically 0 <= P(E) <= 1
Contd…
4. For an event X, the probability of the
complementary event is given as
P(X^C) = 1 - P(X)
5. Probability of an empty or impossible event is
always zero: P(Ø) = 0
6. If A ⊂ B, then P(A) <= P(B)
7. The probability that event A occurs, or event B
occurs, or both occur together is
P(A ∪ B) = P(A) + P(B) – P(A ∩ B), i.e.
P(A ∪ B) = P(A) + P(B) – P(Simultaneous Event)
Contd….
8. If A & B are two mutually exclusive events, so
that P(A ∩ B) = 0, then
P(A ∪ B) = P(A) + P(B)
9. The probabilities of the events A1, A2, …, An that
partition the sample space always sum to 1:
Σ P(Ai) = 1, summed over i = 1 to n
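The complement rule, the addition rule, the mutually exclusive case and the partition rule can be tried out on a small example. The sketch below assumes a hypothetical experiment (one roll of a fair six-sided die) that does not appear in the lecture. The events and the helper `prob` are illustrative names only.

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # the roll is even
B = {4, 5, 6}   # the roll is greater than 3
C = {1}         # the roll is 1 (mutually exclusive with A)


def prob(event):
    """Probability of an event when all outcomes of S are equally likely."""
    return Fraction(len(event), len(S))


# Complement rule: P(A^C) = 1 - P(A)
assert prob(S - A) == 1 - prob(A)

# Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Mutually exclusive events: P(A ∩ C) = 0, so P(A ∪ C) = P(A) + P(C)
assert prob(A & C) == 0
assert prob(A | C) == prob(A) + prob(C)

# Partition of S: the probabilities of the parts sum to 1
partition = [{1, 2}, {3, 4}, {5, 6}]
assert sum(prob(part) for part in partition) == 1

print(prob(A | B))  # 2/3
```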
Types of Probability_ 1.Joint Probability
P(A ∩ B) = (No. of observations in A ∩ B) / (Total No. of observations)
Types of Probability_ 2.Marginal Probability
Types of Probability_ 3.Conditional Probability
Types of Probability_ 4.Independent Events
Probability
• Two events A & B are said to be
independent if the occurrence of one
event does not affect the probability
of occurrence of another event.
Mathematically,
P ( A ∩ B ) = P(A) * P(B)
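Joint probability, conditional probability and the independence test P(A ∩ B) = P(A) * P(B) can be illustrated together. The sketch below again assumes a hypothetical fair six-sided die, and uses the standard definition P(A|B) = P(A ∩ B) / P(B). All function and event names are illustrative.

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Probability of an event when all outcomes of S are equally likely."""
    return Fraction(len(event & S), len(S))


def joint(a, b):
    """Joint probability P(A ∩ B)."""
    return prob(a & b)


def conditional(a, b):
    """Conditional probability P(A|B) = P(A ∩ B) / P(B)."""
    return joint(a, b) / prob(b)


def independent(a, b):
    """True when P(A ∩ B) = P(A) * P(B)."""
    return joint(a, b) == prob(a) * prob(b)


even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
at_least_4 = {4, 5, 6}

print(independent(even, at_most_4))    # True
print(independent(even, at_least_4))   # False
print(conditional(even, at_least_4))   # 2/3
```

In the first pair, knowing that the roll is at most 4 leaves the chance of an even roll at 1/2, so the events are independent. In the second pair, knowing that the roll is at least 4 raises that chance to 2/3, so they are not.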
Bayes’ Theorem
• If A & B are two events with P(A) > 0 and P(B) > 0,
where event B has taken place earlier than event A,
then the conditional probabilities of A given B and
of B given A are
• P(A|B) = P(A ∩ B) / P(B) ……(eq.1) & P(B|A) = P(A ∩ B) / P(A) ……(eq.2)
Dividing eq.2 by eq.1 we obtain
P(B|A) / P(A|B) = P(B) / P(A), i.e.
P(B|A) = P(A|B) * P(B) / P(A)
This new equation is known as Bayes’ Theorem.
Numerical 1 _ Bayes’ Theorem
• The black boxes used in aircraft are
manufactured by three companies A, B & C.
75% of the black boxes are manufactured by A, 15%
by B and 10% by company C. The rates of
defective boxes manufactured by A, B &
C are 4%, 6% & 8% respectively. If one black
box is tested randomly and is found defective,
what is the probability that it was manufactured
by company A?
Solution
• Let P(A), P(B) and P(C) be the probabilities of a
black box being manufactured by companies
A, B & C respectively, and let P(D) be the
probability of a defective black box. We need to
calculate P(A|D).
• P(A|D) = P(D|A)*P(A)/P(D) ……(Bayes’ Theorem)
• We know that P(D|A) = 0.04 & P(A) = 0.75, and summing over the three companies,
P(D) = (0.75*0.04)+(0.15*0.06)+(0.10*0.08) = 0.047
Hence P(A|D) = 0.04*0.75/0.047 = 0.6383 = 63.83%
Numerical 2 _ Bayes’ Theorem
• The batteries used in mobile handsets are
manufactured by three companies A, B & C.
34% of the mobile batteries are manufactured by A, 25%
by B and 41% by company C. The rates of defective
batteries manufactured by A, B & C are 2%,
12% & 4% respectively. If one of the mobile
batteries is tested randomly and is found
defective, what is the probability that it was
manufactured by company A? Also determine the
same for B & C.
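Both Bayes’ Theorem numericals follow the same pattern: compute P(D) by summing P(D|company) * P(company) over the companies, then divide each term by P(D). A minimal sketch of that calculation is given below. The function name `posteriors` and the dictionary layout are assumptions made for this example, not part of the lecture.

```python
def posteriors(priors, defect_rates):
    """Return P(company | defective) for every company, using Bayes' theorem."""
    # Total probability of a defect: P(D) = sum of P(D|company) * P(company)
    p_defect = sum(priors[c] * defect_rates[c] for c in priors)
    # Bayes' theorem: P(company|D) = P(D|company) * P(company) / P(D)
    return {c: priors[c] * defect_rates[c] / p_defect for c in priors}


# Numerical 1: black boxes
black_boxes = posteriors(
    priors={"A": 0.75, "B": 0.15, "C": 0.10},
    defect_rates={"A": 0.04, "B": 0.06, "C": 0.08},
)

# Numerical 2: mobile batteries
batteries = posteriors(
    priors={"A": 0.34, "B": 0.25, "C": 0.41},
    defect_rates={"A": 0.02, "B": 0.12, "C": 0.04},
)

for name, result in (("black boxes", black_boxes), ("batteries", batteries)):
    print(name, {company: round(p, 4) for company, p in result.items()})
```

For the batteries, P(D) = 0.0532, and the sketch gives roughly 12.8% for company A, 56.4% for B and 30.8% for C. The three values sum to 1 because a defective battery must come from one of the three companies.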
Probability Distribution
MODULE III….CONTD.
Random Variable
• A function that assigns a real number
to each sample point in a given
sample space.
Discrete Random Variables
• Discrete Random Variable : The variable
takes values from a finite or countable set of
discrete values.
• In the two-outcome case, 0 represents an unfavorable (negative)
event while 1 represents a favorable
(positive) event. Example -
Lose or Win,
Loss or Profit,
Male or Female,
True or False etc.
Continuous Random Variable
• Continuous Random Variable : Measured on a
continuous scale. A CRV may attain any value in
a given measurement range. Examples -
Market Share Range 12% - 18% (possible range 0 to 100%)
Temperature Range 96 degrees - 100 degrees
Attendance Range 0 - 100% etc.
Salary Range - Rs. 50,000 to Rs. 80,000
CAT Percentile - 0 to 100
Probability Mass Function (PMF) for
Discrete Random Variable
• A statistical function that maps each value of a discrete
random variable to a probability, i.e. the PMF gives the
probability that the random variable takes that
discrete value.
• Example – Probability of India winning the
match (DRV=1) is say 80% (or 0.8). This 0.8 is
the value of the PMF at DRV=1, i.e. India emerging
as winner of the match.
Binomial Distribution
• Binomial Distribution is one of the most
important discrete probability distributions. A
random variable is said to follow a binomial
distribution when,
• 1. Each trial has only two outcomes (Success & Failure)
• 2. The objective is to find the probability of getting k
successes out of n independent trials.
• 3. P(Failure) = 1 - P(Success)
• 4. Probability p of success is constant & does not change during
the trials.
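Under the four conditions above, the probability of exactly k successes in n trials is given by the standard binomial formula P(X = k) = C(n, k) * p^k * (1 - p)^(n - k). The sketch below evaluates it for a hypothetical case that is not in the lecture: 10 employees who each leave independently with probability 0.2. The function name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: 10 employees, each leaving with probability 0.2.
n, p = 10, 0.2

for k in range(4):
    print(f"P(exactly {k} employees leave) = {binomial_pmf(k, n, p):.4f}")

# The probabilities over all possible k add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
print(f"Sum over k = 0..{n}: {total:.4f}")
```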
Binomial Distribution _ Examples
• 1. Employee Attrition Vs Employee Retention
• 2. Fraudulent Vs Genuine insurance claims
• 3. Loan repayment default Vs No default
• 4. Purchase Vs No Purchase
• 5. Online transaction Failure Vs transaction
success.
Poisson Distribution
• Poisson distribution is used when we need to
determine the probability of a given number of
occurrences of an event in a fixed interval of time.
• Example :
• 1. Number of order cancellations per day
• 2.Number of reservation cancellations per
day.
• 3. Number of clicks to an advertisement per
hour etc.
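For counts like these, the standard Poisson formula is P(X = k) = e^(-λ) * λ^k / k!, where λ is the average number of occurrences per interval. The sketch below assumes a hypothetical average of 3 order cancellations per day. Neither this rate nor the function name `poisson_pmf` comes from the lecture.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """P(X = k) when events occur at an average of `rate` per interval."""
    return exp(-rate) * rate**k / factorial(k)


# Hypothetical example: an average of 3 order cancellations per day.
rate = 3.0

p_none = poisson_pmf(0, rate)
p_at_most_2 = sum(poisson_pmf(k, rate) for k in range(3))
p_more_than_2 = 1 - p_at_most_2

print(f"P(no cancellations in a day)     = {p_none:.4f}")
print(f"P(at most 2 cancellations)       = {p_at_most_2:.4f}")
print(f"P(more than 2 cancellations)     = {p_more_than_2:.4f}")
```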
Normal Distribution
• All normal distributions have a
symmetrical bell shape around the
mean µ.
• Mean = Mode = Median
• It is a two-parameter distribution, defined by the
mean (µ) and standard deviation (σ).
Specific Values of Normal Distribution
( Z table)
Numerical 5
Solution
Contd…
Numerical _ 6
Solution _ 6
Numerical _ 7
• A market research study was conducted in four cities to determine
the consumer preferences for brand A of soap. The responses
obtained are given below. (Figures in ‘000)
Numerical _ 7 _ Contd…
• 1. Determine the probability that a randomly selected
consumer preferred brand A.
• 2. Determine the probability that a consumer preferred
brand A and was from Chennai.
• 3. Determine the probability that a consumer preferred
brand A and was from Delhi.
• 4. Determine the probability that a consumer preferred
brand A and was from Mumbai.
• 5. Develop a joint probability table for the above
data.
Numerical _ 8
Solution _ 8
Numerical _ 9
• The lifetimes of certain electronic devices have a mean
of 300 hours and a standard deviation of 25 hours.
Assume that the distribution of these lifetimes can be
approximated closely by the normal curve.
• a) Find the probability that any one of these electronic
devices will have a lifetime of more than 350 hours.
• b) What percentage have a lifetime of exactly 300 hours?
• c) What percentage will have lifetimes from 220 to 260
hours?
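• (Given, as areas under the standard normal curve: between z = -2 and
z = +2 is 0.9545; between z = 0 and z = 1.6 is 0.4452; between z = 0
and z = 3.2 is 0.4993)
Answer _ 9
• A) Given: μ = 300, σ = 25 and x = 350 hours
• Z = (x – μ)/σ = (350 – 300)/25 = 2
• The area under the curve between z = -2 and z = +2 is 0.9545,
so the two tails together hold 1 – 0.9545 = 0.0455. By symmetry,
the required probability is the upper tail alone:
P(Z > 2) = 0.0455/2 ≈ 0.0228 = 2.28%
Solution _ Part B
• Z = (x – μ)/σ = (300 – 300)/25 = 0
• Lifetime is a continuous variable, so the probability of a lifetime
of exactly 300 hours is the area over a single point, which is zero.
Therefore the required percentage is 0.
Solution _ Part C
• Given: X1 = 220, X2 = 260, μ = 300, σ = 25. Thus
• Z1 = (220 – 300)/25 = -3.2 and
Z2 = (260 – 300)/25 = -1.6
• From the given data and the symmetry of the curve, the area between
z = -1.6 and z = 0 is 0.4452 and the area between z = -3.2 and z = 0
is 0.4993
• The required probability is
• P(-3.2 < Z < -1.6) = 0.4993 – 0.4452 = 0.0541
• Hence the required percentage is
0.0541 x 100 = 5.41%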
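The areas in Numerical 9 can also be computed directly instead of being read from a Z table. The sketch below builds the normal cumulative distribution function from `math.erf` in the Python standard library. The helper name `normal_cdf` is an assumption for this example. Because the values are computed rather than looked up, they may differ slightly from figures taken from a printed table.

```python
from math import erf, sqrt


def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


mean, sd = 300.0, 25.0  # lifetimes in hours, from Numerical 9

# Part a: probability of a lifetime of more than 350 hours (upper tail).
p_more_than_350 = 1.0 - normal_cdf(350.0, mean, sd)

# Part c: probability of a lifetime between 220 and 260 hours.
p_220_to_260 = normal_cdf(260.0, mean, sd) - normal_cdf(220.0, mean, sd)

print(f"P(X > 350)       = {p_more_than_350:.4f}")
print(f"P(220 < X < 260) = {p_220_to_260:.4f}")
```

Part b needs no computation: for a continuous variable, the probability of any single exact value is zero.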
Numerical _ 10
• A bookseller purchases copies of a
newspaper @ Rs.1.25 per copy and sells them @
Rs.1.50 per copy. The probability distribution of
demand for the newspaper is given below. How
many copies should the bookseller order so
as to gain maximum profit?
• No. of Copies - 15 16 17 18 19 20
Probability - 0.04 0.19 0.33 0.26 0.11 0.07
Solution _ 10
No. of Copies   Probability of   Profit Per Copy   Expected Profit E(P)
Ordered (A)     Demand (B)       in Rs. (C)        = A*B*C, in Rs.
15              0.04             0.25              0.15
16              0.19             0.25              0.76
17              0.33             0.25              1.40
18              0.26             0.25              1.17
19              0.11             0.25              0.52
20              0.07             0.25              0.35
E(P) is highest for 17 copies, so the bookseller should order 17 copies.
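The expected-profit calculation of Solution 10 can be reproduced with a short script. This is a sketch of the slide's own rule E(P) = A*B*C (copies * probability of demand * profit per copy), not a general inventory model, and the variable names are illustrative.

```python
# Data from Numerical 10.
copies = [15, 16, 17, 18, 19, 20]
demand_probability = [0.04, 0.19, 0.33, 0.26, 0.11, 0.07]
profit_per_copy = 1.50 - 1.25  # selling price minus purchase price, in Rs.

# E(P) = A * B * C for every order size, as in the solution table.
expected_profit = {
    n: n * p * profit_per_copy for n, p in zip(copies, demand_probability)
}

for n, value in expected_profit.items():
    print(f"{n} copies: E(P) = Rs. {value:.2f}")

# Order size with the highest expected profit under this rule.
best_order = max(expected_profit, key=expected_profit.get)
print("Order", best_order, "copies")
```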