Bayesian Inference

Bayesian Inference

• What is Bayesian Inference?


• Bayesian inference refers to the application of
Bayes’ Theorem in determining the updated
probability of a hypothesis given new information.
• Bayesian inference allows the posterior probability
(updated probability considering new evidence) to
be calculated given the prior probability of a
hypothesis and a likelihood function.
Bayes Theorem
• Simplistically, Bayes' theorem can be expressed through the following mathematical equation:

  P(A|B) = P(B|A) · P(A) / P(B)

• where A is an event and B is the evidence. So, P(A) is the prior probability of event A and P(B) is the probability of the evidence B.
• Hence, P(B|A) is the likelihood. The denominator P(B) is a normalizing constant.
Bayes Theorem
• So, Bayes’ Theorem gives us the probability of an
event based on our prior knowledge of the
conditions that might be related to the event and
updates that conditional probability when some
new information or evidence comes up.
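As a minimal illustrative sketch (not part of the original slides), the update rule can be written as a small Python function; the names prior, likelihood and evidence below are placeholders for P(A), P(B|A) and P(B), and the numbers are made up:

def bayes_posterior(prior, likelihood, evidence):
    # P(A|B) = P(B|A) * P(A) / P(B)
    return likelihood * prior / evidence

# Example: prior P(A) = 0.5, likelihood P(B|A) = 0.75, evidence P(B) = 0.625
print(bayes_posterior(0.5, 0.75, 0.625))  # prints 0.6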
Bayes Theorem
• There are 3 components of the Bayes’ theorem
• - Prior
• - Likelihood
• - Posterior
• Prior Distribution – This is the key factor in Bayesian inference that allows us to incorporate our personal beliefs or judgements into the decision-making process through a mathematical representation.
• Mathematically speaking, to express our beliefs about an unknown
parameter θ we choose a distribution function called the prior
distribution.
• This distribution is chosen before we see any data or run any
experiment.
Bayes Theorem
• How do we choose a prior?
• Theoretically, we define a cumulative distribution function for
the unknown parameter θ.
• In a basic sense, events with a prior probability of zero will have a posterior probability of zero,
• and events with a prior probability of one will have a posterior probability of one.
• Hence, a good Bayesian framework will not assign a prior probability of exactly 0 or 1 to any event unless it is already known to have occurred or already known not to occur.
• A very handy and widely used technique for choosing priors is to use a family of distribution functions that is sufficiently flexible.
Bayes Theorem
• How do we choose a prior?....

• i. Conjugate Priors – Conjugacy occurs when the posterior distribution belongs to the same family of probability density functions as the prior belief, but with new parameter values that have been updated to reflect the new evidence/information.
• Examples: Beta-Binomial, Gamma-Poisson, and Normal-Normal (see the sketch after this slide).

• ii. Non-conjugate Priors – It is also quite possible that the personal belief cannot be expressed in terms of a suitable conjugate prior; in those cases, simulation tools are applied to approximate the posterior distribution. An example is the Gibbs sampler.
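A minimal sketch of conjugacy, assuming a Beta(α, β) prior on a coin's head probability θ and Binomial data: with k heads in n tosses, the posterior is Beta(α + k, β + n − k). The prior parameters and data below are illustrative, not taken from the slides:

from scipy import stats

# Illustrative Beta prior on theta (the fairness of the coin)
alpha_prior, beta_prior = 2.0, 2.0

# Illustrative data: k heads observed in n tosses
n, k = 10, 7

# Conjugate update: the posterior is again a Beta distribution
alpha_post = alpha_prior + k
beta_post = beta_prior + (n - k)

posterior = stats.beta(alpha_post, beta_post)
print(posterior.mean())  # posterior mean of theta = 9 / 14, about 0.643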
Bayes Theorem
• iii. Un-informative prior – Another approach is
to minimize the amount of information that
goes into the prior function to reduce the bias.
• This is an attempt to have the data have
maximum influence on the posterior.
• These priors are known as uninformative priors, but in these cases the results are often quite similar to those of the frequentist approach.
Bayes Theorem
• Likelihood – This is the probability of all the outcomes, i.e., of the random variables X taking some value x, given a value of the parameter θ.
• It is the probability of observing the actual data that we collected (heads or tails), conditioned on a value of the parameter θ (the fairness of the coin), and can be expressed as follows:

  L(θ) = p(X = x | θ)

• This is the concept of likelihood, which is the density function thought of as a function of θ.
• To maximize the likelihood, we choose the θ that gives us the largest value of the likelihood.
• This is referred to as the maximum likelihood estimate, or MLE.
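As an illustrative sketch (the coin-toss data below are assumed, not from the slides), the MLE of the coin's fairness θ can be found by maximizing the log-likelihood over a grid of θ values; for Bernoulli data it coincides with the sample proportion of heads:

import numpy as np

# Illustrative coin-toss data: 1 = head, 0 = tail
x = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1])

# Log-likelihood of theta for Bernoulli observations
theta_grid = np.linspace(0.001, 0.999, 999)
log_lik = x.sum() * np.log(theta_grid) + (len(x) - x.sum()) * np.log(1 - theta_grid)

theta_mle = theta_grid[np.argmax(log_lik)]
print(theta_mle)   # about 0.7
print(x.mean())    # the closed-form MLE, heads / n = 0.7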
Bayes Theorem
• Posterior Distribution – This is the result or
output of the Bayes’ Theorem.
• A posterior probability is the revised or
updated probability of an event occurring
after taking into consideration new
information.
• We calculate the posterior probability p(θ|X), i.e., how probable our hypothesis about θ is given the observed evidence.
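Continuing the assumed coin example, a minimal sketch of computing p(θ|X) numerically: evaluate prior × likelihood on a grid of θ values and normalize, which approximates the posterior without relying on conjugacy:

import numpy as np

# Same illustrative data as before: k heads in n tosses
n, k = 10, 7

theta = np.linspace(0.001, 0.999, 999)
prior = np.ones_like(theta)                   # flat (uninformative) prior on theta
likelihood = theta**k * (1 - theta)**(n - k)  # Binomial likelihood up to a constant

unnormalized = prior * likelihood
posterior = unnormalized / (unnormalized.sum() * (theta[1] - theta[0]))  # normalize to a density

print(theta[np.argmax(posterior)])  # posterior mode, about 0.7 with a flat prior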
Bayesian Inference
Bayesian inference:
An alternative approach to hypothesis testing is Bayesian inference.
It involves treating the unknown parameters themselves as
random variables.
The analyst starts with a prior distribution for the parameters and
then uses the observed data and Bayes' theorem to get an
updated posterior distribution for the parameters.
Rather than making probability judgments about the tests, you
make probability judgments about the parameters.
Example: when the unknown parameter is a probability, we often
use a prior from the Beta distribution, which puts all its
probability between 0 and 1:

  p(θ) ∝ θ^(α−1) (1 − θ)^(β−1),  0 ≤ θ ≤ 1

Mechanism of Bayesian Inference:
• The Bayesian approach treats probability as a degree of belief about a certain event, given the available evidence.
• In Bayesian learning, the parameter θ is treated as a random variable.
• Let’s understand the Bayesian inference
mechanism a little better with an example.
Bayesian Inference
• Example of Bayesian inference
• Bayesian inference is probably best explained through a
practical example.
• Let’s say that our friend Bob is selecting one marble from
two bowls of marbles. The first bowl has 75 red marbles and
25 blue marbles. The second bowl has 50 red marbles and
50 blue marbles. Given that Bob is equally likely to choose
from either bowl and does not discriminate between the
marbles themselves, Bob in fact chooses a red marble. What
is the probability Bob picked the marble from bowl #1?
• Let’s call the possibility that Bob chose a marble from bowl
#1 H1 (Hypothesis 1) and the possibility he chose a marble
from bowl #2 H2 (Hypothesis 2).
Bayesian Inference
• If we know that Bob believes the bowls are identical, then the probability of hypothesis 1 is equal to the probability of hypothesis 2 ( P(H1) = P(H2) ), and the two probabilities must sum to one (the total probability), making them each 0.5.
• P(H1) = P(H2) = 0.5
Bayesian Inference
• Now we’ll call the observation of a red marble, event E. Given
the distribution of the marbles in each bowl, we know that :
• P(E|H1) = 75/100 = 0.75
• P(E|H2) = 50/100 = 0.50
• Plugging these probabilities into Bayes' formula we get:

  P(H1|E) = P(E|H1) P(H1) / [ P(E|H1) P(H1) + P(E|H2) P(H2) ]
          = (0.75 × 0.5) / (0.75 × 0.5 + 0.50 × 0.5) = 0.375 / 0.625 = 0.6
Bayesian Inference
• Before we used the observational data from Bob’s
choice, the probability that he chose a marble
from bowl #1 (Hypothesis 1) was 0.5 because the
bowls were equal from Bob’s point of view.
• After we observe that he chose a red marble, and
apply Bayes’ theorem, we revise the probability of
Hypothesis 1 to 0.6.
• This is Bayesian inference, using new information
to update a probabilistic model.
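A minimal sketch of the marble example in code (the variable names are just illustrative), confirming that the posterior probability of bowl #1 rises from 0.5 to 0.6 after a red marble is observed:

# Priors: Bob is equally likely to pick from either bowl
p_h1, p_h2 = 0.5, 0.5

# Likelihoods of drawing a red marble from each bowl
p_red_given_h1 = 75 / 100
p_red_given_h2 = 50 / 100

# Total probability of the evidence (a red marble)
p_red = p_red_given_h1 * p_h1 + p_red_given_h2 * p_h2

# Bayes' theorem: posterior probability of bowl #1 given a red marble
p_h1_given_red = p_red_given_h1 * p_h1 / p_red
print(p_h1_given_red)  # 0.6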
Application of Bayesian Inference in financial risk modeling:

• Bayesian inference has found application in various widely used algorithms, e.g., regression, random forests, neural networks, etc.
• Apart from that, it has also gained popularity in several banks' operational risk modelling.
• Banks' operational loss data typically show some loss events with low frequency but high severity. For these low-frequency cases, Bayesian inference turns out to be useful, as it does not require a lot of data.
Application of Bayesian Inference in
financial risk modeling:
• Earlier, frequentist methods were used for operational risk models, but because of their inability to make inferences about parameter uncertainty,
– Bayesian inference was considered more informative, as it has the capacity to combine expert opinion with actual data to derive the posterior distributions of the severity and frequency distribution parameters.
• Generally, for this type of statistical modelling, the bank's internal loss data is divided into several buckets, the loss frequencies of each bucket are determined by expert judgment, and
• the results are then fitted to probability distributions.
Questions
• What is the prior distribution in Bayesian inference? How do you choose a prior? Explain the different ways of doing so.
• What is Bayesian Inference? Explain in brief.
• What are the components of Bayes' Theorem? Explain the concept of Prior in detail.
• What are the components of Bayes' Theorem? Explain the concept of Likelihood in detail.
• What are the components of Bayes' Theorem? Explain the concept of Posterior in detail.
