EN007001 Engineering Research Methodology: Statistical Inference: Bayesian Inference

This document provides an overview of topics in Bayesian inference to be covered in an engineering research methodology course. The topics include the conceptual framework of statistical modeling and analysis, probability, statistical inference, Bayesian statistics, maximum a posteriori estimation, comparison to maximum likelihood estimation, and Bayesian networks. Example problems and programs in Sagemath and Julia are provided to illustrate key concepts in probability density functions and Bayesian inference.

EN007001 Engineering Research

Methodology
Statistical Inference: Bayesian Inference

Assoc. Prof. Bhichate Chiewthanakul

Faculty of Engineering, Khon Kaen University


Slides: https://gear.kku.ac.th/~bhichate/Bayesian inferV5.pdf

EN007001 Engineering Research Methodology 1 / 72


Coursework

Lecture: 3 hours
Text:
Kroese, D. P., & Chan, J. C. (2014). Statistical modeling and
computation. New York: Springer.
Koller, D., & Friedman, N. (2009). Probabilistic graphical
models: principles and techniques. MIT press.
Bolstad, W. M., & Curran, J. M. (2016). Introduction to
Bayesian statistics. John Wiley & Sons.
Leon-Garcia, A. (2017). Probability, statistics, and random
processes for electrical engineering. Pearson Education.



Topic outline

The conceptual framework for statistical modeling and analysis


Review of probability
Statistical inference
Bayesian statistics
Maximum A Posteriori (MAP) Estimation
Comparison to ML Estimation
Bayesian Networks



The conceptual framework for statistical modeling and analysis is
sketched in Fig 1

Fig 1. Statistical modeling and analysis


Data are collected to represent the real-life problem.
A probabilistic model is formulated for the data.
The model is used to carry out calculations and analysis.
Conclusions are drawn about the model.
Conclusions about the model are translated into conclusions about reality.



Definition 1 (Random Variable)
A random variable is a function from the sample space Ω to R, i.e.,

X : Ω → R



Fig 2. Random variable as a mapping from Ω to R.



Definition 2 (Cumulative Distribution Function)
The cumulative distribution function (cdf) of a random variable X
is the function F : R → [0, 1] defined by

F (x) = P(X ≤ x), x∈R



Fig 3. A cumulative distribution function (cdf)



Definition 3 (Discrete Random Variable)
A discrete random variable X is defined as a random variable that
assumes values from a countable set.

Definition 4 (Probability Mass Function)


The probability mass function (pmf) of a discrete random variable
X is defined as:

pX (x) = P (X = x) = P ({ζ : X(ζ) = x}) , x∈R



Definition 5 (Conditional Probability Mass Function)
Let X be a discrete random variable with pmf pX (x), and let C
be an event that has nonzero probability, P (C) > 0.
The conditional probability mass function of X is defined by the
conditional probability:

pX (x | C) = P ({X = x} ∩ C) / P (C)

(Compare with the conditional probability of events, A | B.)

Definition 6 (Continuous Random Variable)
A continuous random variable is defined as a random variable
whose cdf FX (x) is continuous everywhere and can be written
as an integral of some nonnegative function f (x):

FX (x) = ∫_{−∞}^{x} f (t) dt,

where

∫_{−∞}^{∞} f (t) dt = 1.

Definition 7 (Probability Density Function)
The probability density function (pdf) of X, if it exists, is defined
as the derivative of the cdf FX (x):

fX (x) = dFX (x) / dx

Definition 8 (Joint Cumulative Distribution Function)


The joint cumulative distribution function of X and Y is defined
by
FX,Y (x, y) = P (X ≤ x, Y ≤ y)

Definition 9 (Jointly continuous random variables)
Two random variables X and Y are jointly continuous with joint
density fX,Y (x, y) if
P ((X, Y ) ∈ A) = ∫∫_A fX,Y (x, y) dx dy,

where

∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y (x, y) dx dy = 1

Definition 10 (Conditional pdf)

The conditional pdf of X given C is defined by

fX (x | C) = dFX (x | C) / dx

Fact 1
If X and Y are independent then
the joint cdf is

FX,Y (x, y) = FX (x) · FY (y)

the joint pdf is

fX,Y (x, y) = fX (x) · fY (y)

Important Random Variables. The most commonly used random
variables in communications are:
Bernoulli Random Variable.

pX (x) = p for x = 1, and 1 − p for x = 0.

It is a good model for a binary data generator and for channel errors.

Binomial Random Variable. It is a r.v. giving the number of 1’s
in a sequence of n independent Bernoulli trials.

P (X = k) = (n choose k) p^k (1 − p)^{n−k} for 0 ≤ k ≤ n, and 0 otherwise.

This r.v. models, for example, the total number of bits received in
error when a sequence of n bits is transmitted over a channel with
bit-error probability p.
Uniform Random Variable. The pdf is given by

f (x) = 1/(b − a) for a < x < b, and 0 otherwise.

For example, when the phase of a sinusoid is random, it is usually
modeled as a uniform random variable between 0 and 2π.

The most important distribution in the study of statistics: the
normal (or Gaussian) distribution.
Definition 11 (Normal Distribution)
A random variable X is said to have a normal distribution with
parameters µ and σ 2 , N (µ, σ 2 ) if its pdf is given by
fX (x) = (1/(√(2π) σ)) e^{−(x−µ)²/(2σ²)} , x ∈ R

If µ = 0 and σ² = 1, i.e. N (0, 1), then

fX (x) = (1/√(2π)) e^{−x²/2}

is known as the standard normal distribution.

Example 1
Show that
fX,Y (x, y) = (1/(2π)) e^{−(2x² − 2xy + y²)/2}

is a valid joint probability density.
Solution. Since fX,Y (x, y) > 0, all we have to do is show that it
integrates to one. By factoring the exponent, we obtain

fX,Y (x, y) = (e^{−(y−x)²/2} / √(2π)) · (e^{−x²/2} / √(2π)).

Then

∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y (x, y) dx dy
  = ∫_{−∞}^{∞} (e^{−x²/2} / √(2π)) [ ∫_{−∞}^{∞} (e^{−(y−x)²/2} / √(2π)) dy ] dx = 1,

since the inner integrand is the N (x, 1) density and the outer
integrand is the N (0, 1) density.
#
Sagemath Program 1
# Verify that f(x,y) is a valid joint pdf
reset()
x,y = var("x,y")
f(x,y) = 1/2/pi*e^( -(2*x^2-2*x*y+y^2)/2 )

# Check f(x,y) > 0
print(bool(f(x,y) > 0))

# Check that the volume below the surface equals 1
print(integral(integral(f,(x,-oo,oo)),y,-oo,oo))

Julia Program 1

# Verify that the joint pdf is valid


using PyCall
@pyimport sympy as sm
x,y=sm.symbols("x y")
oo=sm.oo

f=sm.Function("f")
f=1/2/(sm.pi)*sm.exp(-(2*x^2-2x*y+y^2)/2)
sm.integrate(f,(x,-oo,oo),(y,-oo,oo))

Julia Program 1 (cont.)
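The program’s output appears only as an image in the original slides and is not reproduced here. As a plain-Python stand-in (an illustrative sketch, not part of the slides), a midpoint Riemann sum over a large box approximates the same double integral:

```python
import math

# Numeric check of Example 1: the joint density integrates to 1.
# Midpoint Riemann sum over [-8, 8]^2; the mass outside is negligible.
def f(x, y):
    return math.exp(-(2 * x**2 - 2 * x * y + y**2) / 2) / (2 * math.pi)

h = 0.05
r = int(8 / h)                      # 160 cells per half-axis
total = sum(
    f((i + 0.5) * h, (j + 0.5) * h) * h * h
    for i in range(-r, r)
    for j in range(-r, r)
)
print(round(total, 3))              # ≈ 1.0
```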

Definition 12 (Random Vector and Random Matrix)
A vector X whose entries are random variables is called a random
vector, and a matrix Y whose entries are random variables is
called a random matrix, i.e.

X = [X1 , X2 , . . . , Xn ]^T ,

    ⎡ Y11 Y12 . . . Y1p ⎤
Y = ⎢ Y21 Y22 . . . Y2p ⎥
    ⎢  ⋮    ⋮   ⋱    ⋮  ⎥
    ⎣ Yn1 Yn2 . . . Ynp ⎦

Definition 13 (Statistical inference)
Statistical inference is a collection of methods that deal with
drawing conclusions about the model on the basis of the observed
data. (See fig 1)

The two main approaches to statistical inference are:


Classical statistics.
Bayesian statistics.

Definition 14 (Classical statistics)
Let x be the outcome of a random vector X described by a
probabilistic model that depends on an unknown parameter θ,
where θ is assumed to be fixed. Classical statistics is then the
methodology for estimating and drawing inferences about the
parameter θ.

The model is specified up to a (multidimensional) parameter θ; that
is, X ∼ f (·; θ).

Definition 15 (Bayesian statistics)
Let x be the outcome of a random vector X described by a
probabilistic model that depends on an unknown parameter θ,
where θ is assumed to be random. Bayesian statistics is then the
methodology for estimating and drawing inferences about the
parameter θ, in which inference about θ is carried out by analyzing
the conditional pdf f (θ | x).

f (θ | x) is called the posterior pdf of the parameter θ.

Example 2 (Biased coin)
We throw a coin 1000 times and observe 570 Heads. Using this
information, what can we say about the “fairness” of the coin?
The data (or better, datum) here is the number x = 570. Suppose
we view x as the outcome of a random variable X which
describes the number of Heads in 1000 tosses. Our statistical
model (See Fig 1) is then

X ∼ Bin(1000, p)

where p ∈ [0, 1] is unknown.

Remark 1
Any statement about the fairness of the coin is expressed in
terms of p and is assessed via this model.
It is important to understand that p will never be known.
A common-sense estimate of p is simply the proportion of
Heads, x/1000 = 0.570.

How accurate is this estimate? Is it possible that the unknown p


could in fact be 0.5? One can make sense of these questions
through detailed analysis of the statistical model. (i.e. by using
Classical statistics)
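One such classical analysis is the normal-approximation confidence interval for a proportion. The following Python sketch (standard textbook material, not from the slides themselves) makes the accuracy question concrete:

```python
import math

# Common-sense accuracy check for the estimate p_hat = x/n
# via the normal approximation to the binomial.
x, n = 570, 1000
p_hat = x / n
se = math.sqrt(p_hat * (1 - p_hat) / n)      # standard error of the proportion
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
print(round(lo, 3), round(hi, 3))            # approximate 95% CI
```

The interval is roughly (0.539, 0.601); it excludes 0.5, suggesting the coin is unlikely to be fair.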

Bayesian statistics is a branch of statistics that is centered around
Bayes’ formula.
Theorem 1 (Bayes’ Rule)
Let A be an event with P (A) > 0 and let B1 , B2 , . . . , Bn be a
partition of Ω. Then,

P (Bj | A) = P (A | Bj )P (Bj ) / Σ_{i=1}^{n} P (A | Bi )P (Bi ).    (1)

Corollary 1.1
For continuous random variables X and Y , Bayes’ theorem is
formulated in terms of densities:

f (y | x) = f (x | y)f (y) / ∫ f (x | y)f (y) dy ∝ f (x | y) · f (y),    (2)

where f (y | x) is the posterior, f (x | y) the likelihood, and f (y) the prior;
here f (y) := fY (y), f (x | y) := fX|Y (x | y), f (y | x) := fY|X (y | x),
and the likelihood function is l(x | y) = f (x | y).

Definition 16 (Prior, Likelihood, and Posterior)
Let x and θ denote the data and parameters in a Bayesian
statistical model.

The pdf of θ, f (θ), is called the prior pdf.
The conditional pdf f (x | θ) is called the Bayesian likelihood
function.
The central object of interest is the posterior pdf f (θ | x)
which, by Bayes’ theorem, is proportional to the product of the
prior and likelihood:

f (θ | x) ∝ f (x | θ)f (θ).

Remark 2 (Bayesian Statistical Inference)

The goal is to draw inferences about an unknown variable Y by
observing a related random variable X. The unknown variable is
modeled as a random variable Y , with prior distribution

fY (y), if Y is continuous,
PY (y), if Y is discrete.

Remark 2 (Cont.)
After observing the value of the random variable X, we find the
posterior distribution of Y . This is the conditional PDF (or PMF)
of Y given X = x,

fY |X (y | x) or PY |X (y | x).

Fact 2
If X ∼ Uniform[a, b], then

fX (x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise.

If X ∼ Geometric(p), then

PX (x) = (1 − p)x−1 · p, x = 1, 2, 3, . . .

Fact 3 (Law of Total Probability)
Let A be an event and let B1 , B2 , . . . , Bn be a partition of Ω.
Then,

P (A) = Σ_{i=1}^{n} P (A | Bi )P (Bi ).

Fig 4. A partition B1 , . . . , B6 of the sample space Ω

Example 3

Let X ∼ Uniform(0, 1). Suppose that we know
(Y | X = x) ∼ Geometric(x). Find the posterior density of X given
Y = 2, fX|Y (x | 2).
Solution. Using Bayes’ rule we have

fX|Y (x | 2) = PY|X (2 | x)fX (x) / PY (2).

Since (Y | X = x) ∼ Geometric(x), we obtain

PY|X (y | x) = x(1 − x)^{y−1} , y = 1, 2, . . .

and

PY|X (2 | x) = x(1 − x).

Example 3 (cont.)
To find PY (2), we can use the law of total probability:

PY (2) = ∫_{−∞}^{∞} PY|X (2 | x)fX (x) dx = ∫_{0}^{1} x(1 − x) · 1 dx = 1/6.

Therefore, we obtain

fX|Y (x | 2) = x(1 − x) · 1 / (1/6) = 6x(1 − x), for 0 ≤ x ≤ 1.
#
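The integral and the resulting posterior can also be checked numerically; this Python sketch (illustrative only, not part of the slides) uses a midpoint Riemann sum:

```python
# Numeric check of Example 3: P_Y(2) = 1/6 and the posterior 6x(1-x).
n = 100_000
h = 1 / n
# midpoint Riemann sum of x(1-x) over [0, 1]
p_y2 = sum((i + 0.5) * h * (1 - (i + 0.5) * h) * h for i in range(n))
# posterior evaluated at x = 0.5; 6 * 0.5 * (1 - 0.5) = 1.5
posterior_at_half = 0.5 * (1 - 0.5) / p_y2
print(round(p_y2, 4), round(posterior_at_half, 2))
```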

The posterior distribution, fX|Y (x | y) (or PX|Y (x | y)), contains
all the knowledge about the unknown quantity X. Therefore, we
can use the posterior distribution to find point or interval estimates
of X.
Definition 17 (MAP)
Let fX|Y (x | y) be a posterior distribution. Then the Maximum A
Posteriori (MAP) estimate is defined as

x̂MAP = arg max_x fX|Y (x | y)    (3)

Since fX|Y (x | y) = fY|X (y | x)fX (x) / fY (y) and fY (y) does not depend on
x, we have

x̂MAP = arg max_x fY|X (y | x)fX (x)    (4)

Find x̂MAP
To find the MAP estimate of X given that we have observed
Y = y, we find the value of x that maximizes

fY|X (y | x)fX (x)

If either X or Y is discrete, we replace its PDF in the above
expression by the corresponding PMF.

Example 4

Let X be a continuous random variable with the following PDF:

fX (x) = 2x for 0 ≤ x ≤ 1, and 0 otherwise.

Also, suppose that (Y | X = x) ∼ Geometric(x). Find the MAP
estimate of X given Y = 3.
Solution. Since (Y | X = x) ∼ Geometric(x), then

PY|X (y | x) = x(1 − x)^{y−1} , for y = 1, 2, . . .

Example 4 (cont.)
For Y = 3, it follows that

PY |X (3 | x) = x(1 − x)2 .

For x ∈ [0, 1] one has

PY|X (3 | x)fX (x) = x(1 − x)² · 2x    (5)

To find the value of x that maximizes Eq. (5), one needs to solve

Example 4 (cont.)
d/dx [x²(1 − x)²] = 2x(1 − x)² − 2(1 − x)x² = 0.    (6)

Solving for x, one obtains

x̂MAP = 1/2.
#

Julia Program 2 (Find the solution of (6) and x̂MAP)
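The listing of Julia Program 2 survives only as an image in the original slides. A plain-Python grid search over Eq. (5) (an illustrative stand-in, not the original program) recovers the same maximizer:

```python
# Grid-search check of Example 4: maximize P(Y=3|x) * f_X(x) on [0, 1].
def score(x):
    return x * (1 - x)**2 * 2 * x        # Eq. (5): 2 x^2 (1-x)^2

x_map = max((i / 10000 for i in range(10001)), key=score)
print(x_map)   # 0.5
```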

We now consider a classical statistical inference method, called the
maximum likelihood method, for finding a point estimator that maximizes the
probability of the observed data Yn = (Y1 , Y2 , . . . , Yn ).
Definition 18 (Likelihood function)

Let yn = (y1 , y2 , . . . , yn ) be the observed values of a random
sample for the random variable Y and let X be the parameter of
interest. Then the likelihood function of the sample is a function of X
defined as follows:

l(yn ; x) = l(y1 , y2 , . . . , yn ; x)
          = PY|X (y1 , y2 , . . . , yn | x), if Y is a discrete r.v.,    (7)
          = fY|X (y1 , y2 , . . . , yn | x), if Y is a continuous r.v.,

Definition 18 (cont.)
where PY |X (y1 , y2 , . . . , yn | x) and fY |X (y1 , y2 , . . . , yn | x) are the
joint pmf and joint pdf evaluated at the observation values if the
parameter value is x.

Definition 19 (MLE: maximum likelihood estimator)

Let yn = (y1 , y2 , . . . , yn ) be the observed values of a random
sample for the random variable Y and let x be the parameter of
interest. Then the maximum likelihood estimator of x, denoted by
x̂ML , is the parameter value that maximizes the likelihood function,
that is,

l(y1 , y2 , . . . , yn ; x̂ML ) = max_x l(y1 , y2 , . . . , yn ; x)

Remark 3

The maximum likelihood estimate (MLE), answers the question:


“For which parameter value of x does the observed data
(y1 , y2 , . . . , yn ) have the biggest probability?”

Example 5

Suppose that a particular gene occurs as one of two alleles (A and
a), where allele A has frequency θ in the population. That is, a
random copy of the gene is A with probability θ and a with
probability 1 − θ. Since a diploid genotype consists of two genes,
the probability of each genotype is given by:

genotype AA Aa aa
probability θ2 2θ(1 − θ) (1 − θ)2

Suppose we test a random sample of people and find that k1 are
AA, k2 are Aa, and k3 are aa. Find the MLE of θ.

Example 5 (cont.)
Solution. The likelihood function is given by

P (k1 , k2 , k3 | θ) = (k1+k2+k3 choose k1)(k2+k3 choose k2)(k3 choose k3)
                       × θ^{2k1} (2θ(1 − θ))^{k2} (1 − θ)^{2k3}.    (8)

So the log-likelihood is given by

constant + 2k1 ln(θ) + k2 ln(2θ) + k2 ln(1 − θ) + 2k3 ln(1 − θ).

Example 5 (cont.)
We set the derivative equal to zero:

(2k1 + k2)/θ − (k2 + 2k3)/(1 − θ) = 0.

Solving for θ, we find the MLE is

θ̂ = (2k1 + k2) / (2k1 + 2k2 + 2k3).
#

Julia Program 3 (Find the MLE θ̂ of (8))
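Julia Program 3’s listing appears only as an image in the original slides. A Python grid-search sketch (with made-up counts k1, k2, k3, for illustration only) checks the closed-form MLE:

```python
import math

# Check Example 5's MLE against the closed form (2k1+k2)/(2(k1+k2+k3)).
k1, k2, k3 = 40, 30, 30        # illustrative counts, not from the slides

def log_lik(t):
    # log-likelihood of (8) up to the constant multinomial term
    return (2 * k1 + k2) * math.log(t) + (k2 + 2 * k3) * math.log(1 - t)

theta_hat = max((i / 100000 for i in range(1, 100000)), key=log_lik)
closed_form = (2 * k1 + k2) / (2 * (k1 + k2 + k3))
print(theta_hat, closed_form)
```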

Example 6
Suppose that the signal X ∼ N (0, σX²) is transmitted over a
communication channel. Assume that the received signal is given
by

Y = X + W,

where W ∼ N (0, σW²) is independent of X.
1 Find the ML estimate of X, given Y = y is observed.
2 Find the MAP estimate of X, given Y = y is observed.

Example 6 (cont.)
Solution. The PDF of the r.v. X is

fX (x) = (1/(√(2π) σX)) e^{−x²/(2σX²)}.

Since (Y | X = x) ∼ N (x, σW²), the conditional PDF is

fY|X (y | x) = (1/(√(2π) σW)) e^{−(y−x)²/(2σW²)}.

Example 6 (cont.)
1 The ML estimate of X, given Y = y, is the value of x that
maximizes

fY|X (y | x) = (1/(√(2π) σW)) e^{−(y−x)²/(2σW²)}.

To maximize the above function, we should minimize
(y − x)². Therefore, we conclude

x̂ML = y.

2 The MAP estimate of X, given Y = y, is the value of x that
maximizes

fY|X (y | x)fX (x) = c exp{ − [ (y − x)²/(2σW²) + x²/(2σX²) ] },
Example 6 (cont.)
where c is a constant. To maximize the above function, we should
minimize

(y − x)²/(2σW²) + x²/(2σX²).    (9)

By differentiation, we obtain the MAP estimate of x as

x̂MAP = ( σX² / (σX² + σW²) ) y.

Julia Program 4 (Find the MAP estimate of x from (9))
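Julia Program 4’s listing appears only as an image in the original slides. This Python sketch (with made-up σ values, for illustration only) minimizes (9) numerically and checks it against the closed form:

```python
# Check Example 6's MAP estimate against the closed form.
sigma_x, sigma_w, y = 2.0, 1.0, 3.0          # illustrative values

def objective(x):
    # Eq. (9): the negative log of f_{Y|X}(y|x) f_X(x), up to constants
    return (y - x)**2 / (2 * sigma_w**2) + x**2 / (2 * sigma_x**2)

x_map = min((i / 1000 for i in range(-5000, 5001)), key=objective)
closed_form = sigma_x**2 / (sigma_x**2 + sigma_w**2) * y
print(x_map, closed_form)
```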

Example 7 (Bayesian Inference for Coin Toss Experiment)

Consider the basic random experiment where we toss a biased coin
n times. Suppose that the outcomes are x1 , . . . , xn , with xi = 1 if
the ith toss is Heads and xi = 0 otherwise, i = 1, . . . , n. Let θ
denote the probability of Heads. We wish to obtain information
about θ from the data x = (x1 , . . . , xn ). For example, we wish to
construct a confidence interval.
Solution.

Example 7 (cont.)
Let the prior pdf f (θ) be the uniform prior f (θ) = 1, 0 ≤ θ ≤ 1.
We assume that, conditional on θ, the {xi } are independent and
Ber(θ) distributed. Thus, the Bayesian likelihood is

f (x | θ) = ∏_{i=1}^{n} θ^{xi} (1 − θ)^{1−xi} = θ^s (1 − θ)^{n−s} ,

where s = x1 + · · · + xn represents the total number of successes.
Using the uniform prior gives the posterior pdf

f (θ | x) = c θ^s (1 − θ)^{n−s} , 0 ≤ θ ≤ 1.

Example 7 (cont.)
This is the pdf of the Beta(s + 1, n − s + 1) distribution. The
normalization constant is c = (n + 1) (n choose s). The graph of the
posterior pdf for n = 100 and s = 1 is given in Fig 5.

Fig 5. Posterior pdf for θ, with n = 100 and s = 1

Example 7 (cont.)
A Bayesian confidence interval, called a credible interval, for θ is
formed by taking the appropriate quantiles of the posterior pdf. As
an example, suppose that n = 100 and s = 1. Then, a left
one-sided 95% credible interval for θ is [0, 0.0461], where 0.0461
is the 0.95 quantile of the Beta(2, 100) distribution.
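The quoted quantile can be reproduced without a statistics library (a numeric sketch, not from the slides), using the standard identity between the Beta CDF with integer parameters and binomial tail probabilities:

```python
# 0.95 quantile of Beta(2, 100) by bisection, using
# P(Beta(s+1, n-s+1) <= t) = P(Bin(n+1, t) >= s+1) with n = 100, s = 1.
def beta_2_100_cdf(t):
    return 1 - (1 - t)**101 - 101 * t * (1 - t)**100

lo, hi = 0.0, 1.0
for _ in range(60):                 # bisect until the bracket is tiny
    mid = (lo + hi) / 2
    if beta_2_100_cdf(mid) < 0.95:
        lo = mid
    else:
        hi = mid
print(round(hi, 4))                 # ≈ 0.0461
```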

Definition 20 (Bayesian Network)
Mathematically, a Bayesian network is a directed acyclic graph,
that is, a collection of vertices (nodes) and arcs (arrows between
nodes) such that arcs, when put head-to-tail, do not create loops.

Fig 6. Graph of Network

Remark 4
The directed graphs in (a) and (b) are acyclic. Graph (c) has a
(directed) cycle and can therefore not represent a Bayesian
network.

Bayesian networks can be used to graphically represent the joint
probability distribution of a collection of random variables. In
particular, consider a Bayesian network with vertices labeled
x1 , . . . , xn . Let Pj denote the set of parents of xj , that is, the
vertices xi for which there exists an arc from xi to xj in the
graph. We can associate with this network the joint pdf

f (x1 , . . . , xn ) = ∏_{j=1}^{n} f (xj | Pj ).

By the product rule, we obtain

f (x1 , . . . , xn ) = f (x1 )f (x2 | x1 ) · · · f (xn | x1 , . . . , xn−1 ).

Example 8

Consider

Fig 7. Left: a classical statistical model. Right: the corresponding Bayesian
model with observed (i.e., fixed) data x1 , . . . , xn , indicated by shaded
nodes.
Example 8 (cont.)
The left panel of Fig. 7 shows a classical statistical model with
random variables x1 , . . . , x5 and fixed parameters θ1 , θ2 :

f (x1 , . . . , x5 ) = f (x1 ; θ1 )f (x2 | x1 ; θ2 )f (x3 | x2 )
                      f (x4 | x2 )f (x5 | x3 , x4 ).

The right panel of Fig. 7 shows the corresponding Bayesian model,
in which θ1 and θ2 are themselves random:

f (θ1 , θ2 , x1 , . . . , x5 ) = f (θ1 )f (θ2 )f (x1 | θ1 )f (x2 | x1 , θ2 )
                                f (x3 | x2 )f (x4 | x2 )f (x5 | x3 , x4 ).

It represents the situation where the “data” x1 , . . . , x5 have been
observed. The aim is to find the posterior pdf of θ1 and θ2 given
the data.
#
Example 9 (Applied Bayesian Networks)
Graph representations:

Fig 8. A sample Bayesian network


Independencies:
Let (X ⊥ Y | Z) denote that the random variable X is independent
of Y given Z. Then

(F ⊥ H | S), (C ⊥ S | F, H), (M ⊥ H, C | F ), (M ⊥ C | F )

Example 9 (cont.)
Factorization:

P (S, F, H, C, M ) = P (S) · P (F | S) · P (H | S) · P (C | F, H) · P (M | F )

Example 10 (Belief Nets)

The purpose of this belief net is to determine if a patient is to be
diagnosed with heart disease, based on several factors and
symptoms. Two important factors in heart disease are smoking and
age, and two main symptoms are chest pains and shortness of
breath. The belief net in Fig. 9 shows the prior probabilities of
smoking and age, the conditional probabilities of heart disease
given age and smoking, and the conditional probabilities of chest
pains and shortness of breath given heart disease.

Example 10 (cont.)

Fig 9. A Bayesian belief net for the diagnosis of heart disease

Example 10 (cont.)
Solution.
Suppose a person experiences chest pains and shortness of
breath, but we do not know her/his age and whether she/he
smokes. How likely is it that she/he has heart disease?

Example 10 (cont.)
Define the variables s (smoking), a (age), h (heart disease), c
(chest pains), and b (shortness of breath). We assume that s and a
are independent. Let “Yes” be denoted by “Y” and “No” by “N”.
We wish to calculate

P (h = Yes | b = Yes, c = Yes) = P (h = Y | b = Y, c = Y).

From the Bayesian network structure, we see that the joint pdf of
s, a, h, c, and b can be written as

f (s, a, h, c, b) = f (s)f (a)f (h | s, a)f (c | h)f (b | h).

It follows that

f (h | b, c) ∝ f (c | h)f (b | h) Σ_{a,s} f (h | s, a)f (s)f (a),

where the sum over a and s equals f (h).

Example 10 (cont.)
We have

P (h = Y) = P (h = Y | s = Y, a ≤ 50) · P (s = Y) · P (a ≤ 50)
          + P (h = Y | s = Y, a > 50) · P (s = Y) · P (a > 50)
          + P (h = Y | s = N, a ≤ 50) · P (s = N) · P (a ≤ 50)
          + P (h = Y | s = N, a > 50) · P (s = N) · P (a > 50)
          = 0.2 × 0.3 × 0.6 + 0.4 × 0.3 × 0.4
          + 0.05 × 0.7 × 0.6 + 0.15 × 0.7 × 0.4 = 0.147.

Example 10 (cont.)
Consequently,

P (h = Y | b = Y, c = Y) = β · P (c = Y | h = Y) · P (b = Y | h = Y) · P (h = Y)
                         = β × 0.2 × 0.3 × 0.147 = β × 0.00882

and

P (h = N | b = Y, c = Y) = β · P (c = Y | h = N) · P (b = Y | h = N) · P (h = N)
                         = β × 0.01 × 0.1 × (1 − 0.147) = β × 0.000853

Example 10 (cont.)
for some normalization constant β. Thus,

P (h = Yes | b = Yes, c = Yes) = 0.00882 / (0.00882 + 0.000853)
                               = 0.911816 ≈ 0.91.
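The whole calculation can be replayed in a few lines of Python, using only the conditional-probability values quoted in this example:

```python
# End-to-end numeric check of Example 10.
p_s, p_a50 = 0.3, 0.6                        # P(s=Y), P(a<=50)
p_h_given = {('Y', '<=50'): 0.2, ('Y', '>50'): 0.4,
             ('N', '<=50'): 0.05, ('N', '>50'): 0.15}

# P(h=Y) by the law of total probability over smoking and age
p_h = sum(p_h_given[(s, a)]
          * (p_s if s == 'Y' else 1 - p_s)
          * (p_a50 if a == '<=50' else 1 - p_a50)
          for s in ('Y', 'N') for a in ('<=50', '>50'))

num_yes = 0.2 * 0.3 * p_h                    # P(c=Y|h=Y) P(b=Y|h=Y) P(h=Y)
num_no = 0.01 * 0.1 * (1 - p_h)              # P(c=Y|h=N) P(b=Y|h=N) P(h=N)
posterior = num_yes / (num_yes + num_no)     # normalization replaces beta
print(round(p_h, 3), round(posterior, 2))    # 0.147 0.91
```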

