Optimization
2023-01-09
Exercise 1
| | $Y_1$ | $Y_2$ | $Y_3$ | $Y_4$ |
|---|---|---|---|---|
| $\pi_j(\theta_1,\theta_2)$ | $\frac{3\theta_1}{8}+\frac{\theta_2}{4}$ | $\frac{1}{2}-\frac{\theta_1}{2}+\frac{1}{2}-\frac{\theta_2}{2}$ | $\frac{\theta_1}{8}$ | $\frac{\theta_2}{4}$ |
| count | 1124 | 8023 | 131 | 722 |
$$\begin{aligned}
\ell_c(\theta_1,\theta_2) &= \sum_{j=1}^{6} Z_j \log[\pi_j^c(\theta_1,\theta_2)] \\
&= Z_{11}(1124,\theta_1)\log\Big(\frac{3\theta_1}{8}\Big) + Z_{12}(1124,\theta_2)\log\Big(\frac{\theta_2}{4}\Big) \\
&\quad + Z_{21}(8023,\theta_1)\log\Big(\frac{1}{2}-\frac{\theta_1}{2}\Big) + Z_{22}(8023,\theta_2)\log\Big(\frac{1}{2}-\frac{\theta_2}{2}\Big) \\
&\quad + 131\log\Big(\frac{\theta_1}{8}\Big) + 722\log\Big(\frac{\theta_2}{4}\Big)
\end{aligned}$$
$$\begin{aligned}
\ell(\theta_1,\theta_2) &= \sum_{j=1}^{4} Y_j \log[\pi_j(\theta_1,\theta_2)] \\
&= 1124\log\Big(\frac{3\theta_1}{8}+\frac{\theta_2}{4}\Big) + 8023\log\Big(\frac{1}{2}-\frac{\theta_1}{2}+\frac{1}{2}-\frac{\theta_2}{2}\Big) + 131\log\Big(\frac{\theta_1}{8}\Big) + 722\log\Big(\frac{\theta_2}{4}\Big)
\end{aligned}$$
First let us derive the partial derivative with respect to $\theta_1$ and set it to 0:
$$\frac{\partial\ell}{\partial\theta_1} = \frac{3y_1}{3\theta_1+2\theta_2} + \frac{y_2}{\theta_1+\theta_2-2} + \frac{y_3}{\theta_1} = 0$$
Plugging in the values:
$$\frac{\partial\ell}{\partial\theta_1} = \frac{3372}{3\theta_1+2\theta_2} + \frac{8023}{\theta_1+\theta_2-2} + \frac{131}{\theta_1} = 0$$
Now let us derive the partial derivative with respect to $\theta_2$ and set it to 0:
$$\frac{\partial\ell}{\partial\theta_2} = \frac{2y_1}{3\theta_1+2\theta_2} + \frac{y_2}{\theta_1+\theta_2-2} + \frac{y_4}{\theta_2} = 0$$
Plugging in the values:
$$\frac{\partial\ell}{\partial\theta_2} = \frac{2248}{3\theta_1+2\theta_2} + \frac{8023}{\theta_1+\theta_2-2} + \frac{722}{\theta_2} = 0$$
Now, solving these two equations as a system, we get two possibilities:
$$\theta_1 = \frac{-451-17\sqrt{7873}}{10000} \approx -0.19594087642280523424, \qquad \theta_2 = \frac{4405+17\sqrt{7873}}{10000} \approx 0.59134087642280523424$$
OR
$$\theta_1 = \frac{-451+17\sqrt{7873}}{10000} \approx 0.10574087642280523424, \qquad \theta_2 = \frac{4405-17\sqrt{7873}}{10000} \approx 0.28965912357719476576$$
Naturally we choose the second set of results: first, because if $\theta_1$ were negative we would get an undefined result for the $131\log(\theta_1/8)$ term in the log-likelihood function; and secondly, because we ran a reality check and the maximum of the log-likelihood is indeed at $\theta_1 \approx 0.1057$ and $\theta_2 \approx 0.2896$, with a value of $-6689.525$.
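A minimal sketch of that reality check in R (the function name, the optimizer settings and the starting values here are just illustrative):

loglik <- function(th) {                        #observed-data log-likelihood from above
  1124*log(3*th[1]/8 + th[2]/4) + 8023*log(1 - th[1]/2 - th[2]/2) +
   131*log(th[1]/8) + 722*log(th[2]/4)
}
fit <- optim(c(0.2, 0.2), function(th) -loglik(th), method = "L-BFGS-B",
             lower = c(1e-6, 1e-6), upper = c(1 - 1e-6, 1 - 1e-6))
fit$par     #roughly (0.1057, 0.2897)
-fit$value  #roughly -6689.5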
Now, as we have both a $\theta_1$ and a $\theta_2$, we have to use a more general version of the Newton-Raphson method (as opposed to the single-parameter one). In essence, what we want is to solve a system of $n = 2$ equations in $n = 2$ unknowns. The system we have is
$$\begin{cases}
f_1 = \dfrac{\partial\ell}{\partial\theta_1} = \dfrac{3372}{3\theta_1+2\theta_2} + \dfrac{8023}{\theta_1+\theta_2-2} + \dfrac{131}{\theta_1} = 0 \\[2ex]
f_2 = \dfrac{\partial\ell}{\partial\theta_2} = \dfrac{2248}{3\theta_1+2\theta_2} + \dfrac{8023}{\theta_1+\theta_2-2} + \dfrac{722}{\theta_2} = 0
\end{cases}$$
From here we would want to calculate the inverse of the Jacobian of this system, i.e. $[J_f(\theta_1,\theta_2)]^{-1}$. However, this is a tedious process and the result does not fit on the page; you can see it in closed form in the appendix. Nevertheless, once we have the inverse and starting values $\begin{pmatrix}\theta_1^{(0)}\\ \theta_2^{(0)}\end{pmatrix}$ we can specify the Newton-Raphson update as follows:
" # " #
(k+1) (k)
θ1 θ1 −1 (k) (k)
(k+1) = (k) − [J f (θ1 , θ2 )] f (θ1 , θ2 )
θ2 θ2
3372 8023
+ 131
" # " #
(k) + (k)
(k)
θ1 A B (k)
3θ1 +2θ2
(k)
θ1 +θ2 −2
(k)
θ1
= (k) − 2248 8023
θ2 C D (k) (k) + (k) (k) + 722
(k)
3θ1 +2θ2 θ1 +θ2 −2 θ2
A B
where is the inverse Jacobian and can be seen in the appendix
C D
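A compact numerical sketch of this update (the numDeriv package and all names below are my own illustrative choices; the analytic Jacobian version is in the appendix):

library(numDeriv)
f <- function(th) c(3372/(3*th[1]+2*th[2]) + 8023/(th[1]+th[2]-2) + 131/th[1],
                    2248/(3*th[1]+2*th[2]) + 8023/(th[1]+th[2]-2) + 722/th[2])
theta <- c(0.2, 0.2)                                     #starting values (illustrative)
for (k in 1:10) {
  theta <- theta - solve(jacobian(f, theta), f(theta))   #theta^(k+1) = theta^(k) - J^-1 f(theta^(k))
}
theta                                                    #converges to about (0.1057, 0.2897)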
For the EM algorithm, the E-step expectations of the latent counts, given the current estimates $\theta_1^{(t)}, \theta_2^{(t)}$, are:
$$\begin{aligned}
E(Z_{11}\mid\theta_1^{(t)},\theta_2^{(t)},Y_1) &= 1124\,\frac{\frac{3\theta_1^{(t)}}{8}}{\frac{3\theta_1^{(t)}}{8}+\frac{\theta_2^{(t)}}{4}} = \frac{3372\,\theta_1^{(t)}}{3\theta_1^{(t)}+2\theta_2^{(t)}} \\
E(Z_{12}\mid\theta_1^{(t)},\theta_2^{(t)},Y_1) &= 1124 - \frac{3372\,\theta_1^{(t)}}{3\theta_1^{(t)}+2\theta_2^{(t)}} = \frac{2248\,\theta_2^{(t)}}{3\theta_1^{(t)}+2\theta_2^{(t)}} \\
E(Z_{21}\mid\theta_1^{(t)},\theta_2^{(t)},Y_2) &= 8023\,\frac{\frac{1}{2}-\frac{\theta_1^{(t)}}{2}}{\frac{1}{2}-\frac{\theta_1^{(t)}}{2}+\frac{1}{2}-\frac{\theta_2^{(t)}}{2}} = \frac{8023\,(\theta_1^{(t)}-1)}{\theta_1^{(t)}+\theta_2^{(t)}-2} \\
E(Z_{22}\mid\theta_1^{(t)},\theta_2^{(t)},Y_2) &= 8023 - \frac{8023\,(\theta_1^{(t)}-1)}{\theta_1^{(t)}+\theta_2^{(t)}-2} = \frac{8023\,(\theta_2^{(t)}-1)}{\theta_1^{(t)}+\theta_2^{(t)}-2}
\end{aligned}$$
First, taking the complete log-likelihood function from above and plugging in these expectations, the objective function is
$$Q(\theta_1,\theta_2\mid\theta_1^{(t)},\theta_2^{(t)}) = Z_{11}\log\Big(\frac{3\theta_1}{8}\Big) + Z_{12}\log\Big(\frac{\theta_2}{4}\Big) + Z_{21}\log\Big(\frac{1}{2}-\frac{\theta_1}{2}\Big) + Z_{22}\log\Big(\frac{1}{2}-\frac{\theta_2}{2}\Big) + 131\log\Big(\frac{\theta_1}{8}\Big) + 722\log\Big(\frac{\theta_2}{4}\Big),$$
where the $Z$'s denote the expectations above. From here we can take the first derivatives with respect to $\theta_1$ and $\theta_2$ respectively and set them to zero.
$$\frac{\partial Q(\theta_1,\theta_2\mid\theta_1^{(t)},\theta_2^{(t)})}{\partial\theta_1} = \frac{\theta_1(Z_{11}+Z_{21}+131)-Z_{11}-131}{(\theta_1-1)\theta_1} = \frac{3372\,\theta_1^{(t)}}{\theta_1(3\theta_1^{(t)}+2\theta_2^{(t)})} - \frac{8023\,(\theta_1^{(t)}-1)}{2\big(\frac{1}{2}-\frac{\theta_1}{2}\big)(\theta_1^{(t)}+\theta_2^{(t)}-2)} + \frac{131}{\theta_1} = 0$$

$$\frac{\partial Q(\theta_1,\theta_2\mid\theta_1^{(t)},\theta_2^{(t)})}{\partial\theta_2} = \frac{\theta_2(Z_{12}+Z_{22}+722)-Z_{12}-722}{(\theta_2-1)\theta_2} = \frac{2248\,\theta_2^{(t)}}{\theta_2(3\theta_1^{(t)}+2\theta_2^{(t)})} - \frac{8023\,(\theta_2^{(t)}-1)}{2\big(\frac{1}{2}-\frac{\theta_2}{2}\big)(\theta_1^{(t)}+\theta_2^{(t)}-2)} + \frac{722}{\theta_2} = 0$$
From here we can find an expression for $\theta_1$ and $\theta_2$ respectively. They are
$$\theta_1 = \frac{Z_{11}+131}{Z_{11}+Z_{21}+131} = \frac{3765(\theta_1^{(t)})^2 + \theta_1^{(t)}(4027\theta_2^{(t)}-7530) + 262(\theta_2^{(t)}-2)\theta_2^{(t)}}{27834(\theta_1^{(t)})^2 + 3\theta_1^{(t)}(6691\theta_2^{(t)}-10533) + 2\theta_2^{(t)}(131\theta_2^{(t)}-8285)}$$
$$\theta_2 = \frac{Z_{12}+722}{Z_{12}+Z_{22}+722} = \frac{2166(\theta_1^{(t)})^2 + \theta_1^{(t)}(5858\theta_2^{(t)}-4332) + 3692(\theta_2^{(t)}-2)\theta_2^{(t)}}{2166(\theta_1^{(t)})^2 + \theta_1^{(t)}(29927\theta_2^{(t)}-28401) + 142\theta_2^{(t)}(139\theta_2^{(t)}-165)}$$
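A minimal sketch of the resulting EM iteration (the function name and starting values are illustrative; the full implementation with the iteration history is in the appendix):

em_step <- function(th) {                        #one E-step followed by one M-step
  a <- th[1]; b <- th[2]
  z11 <- 3372*a/(3*a + 2*b);  z21 <- 8023*(a - 1)/(a + b - 2)
  z12 <- 2248*b/(3*a + 2*b);  z22 <- 8023*(b - 1)/(a + b - 2)
  c((z11 + 131)/(z11 + z21 + 131), (z12 + 722)/(z12 + z22 + 722))
}
theta <- c(0.2, 0.2)                             #starting values (illustrative)
for (k in 1:50) theta <- em_step(theta)
theta                                            #converges to about (0.1057, 0.2897)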
Starting from $(0.2, 0.2)$, the Newton-Raphson iterations converge in only a handful of steps:

## Theta1 Theta2
## 1 0.20000000 0.2000000
## 2 0.09673894 0.2725738
## 3 0.10505359 0.2889658
## 4 0.10573591 0.2896598
## 5 0.10574088 0.2896591
## 6 0.10574088 0.2896591
## 7 0.10574088 0.2896591
## 8 0.10574088 0.2896591
## 9 0.10574088 0.2896591
## 10 0.10574088 0.2896591
The EM iterations, started from the same point, converge to the same values but much more slowly:

## Teta1 Teta2
## 1 0.2000000 0.2000000
## 2 0.1672030 0.2260423
## 3 0.1479708 0.2451093
## 4 0.1352150 0.2581961
## 5 0.1265147 0.2673104
## [1] "..."
## Teta1 Teta2
## 40 0.1057411 0.2896589
## 41 0.1057410 0.2896590
## 42 0.1057410 0.2896590
## 43 0.1057410 0.2896590
## 44 0.1057409 0.2896591
## 45 0.1057409 0.2896591
Exercise 2
a) Find the normalizing constant of q(θ) using two methods for numerical integration.
In the code below I implement Riemann and trapezoidal integral approximation, along with my personal choice, stochastic (Monte Carlo) integration, to calculate the integral above. The code is available in the appendix.
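A minimal sketch of the three approximations for a generic integrand f on [a, b] (the function names here are illustrative; the versions actually used are in the appendix):

riemann <- function(f, a, b, n) {         #left Riemann sum with n subintervals
  h <- (b - a)/n
  sum(f(seq(a, b - h, by = h))) * h
}
trapezoid <- function(f, a, b, n) {       #trapezoidal rule on the same grid
  h <- (b - a)/n
  x <- seq(a, b, by = h)
  (sum(f(x)) - 0.5*(f(a) + f(b))) * h
}
mc_int <- function(f, a, b, n) {          #stochastic (Monte Carlo) approximation
  mean(f(runif(n, a, b))) * (b - a)
}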
## [1] "RIEMAN"
## [1] 0.07333786
## [1] "TRAPEZOID"
## [1] 0.07333786
## [1] "STOCHASTIC"
## [1] 0.073371
If we however assume that we are looking for the normalizing constant for the Bayes problem, then we need to calculate it from
$$p(\theta\mid y) \propto p(y\mid\theta)\,p(\theta).$$
Now, given that $p(\theta\mid y)$ is a probability, the integral over all values of $\theta$ must be equal to 1, which in turn means that
$$p(\theta\mid y) = \frac{p(y\mid\theta)\,p(\theta)}{p(y)} = \frac{p(y\mid\theta)\,p(\theta)}{\int_0^1 p(y\mid\theta)\,p(\theta)\,d\theta} = \frac{p(y\mid\theta)\,q(\theta)}{\int_0^1 p(y\mid\theta)\,q(\theta)\,d\theta}.$$
From here we have that the normalizing constant is the integral $\int_0^1 p(y\mid\theta)\,p(\theta)\,d\theta$. Now, the question gives us 10 Bernoulli observations and we know that 7 of those were successful. So in essence the question is asking us to use two numerical methods to find
$$\int_0^1 \binom{10}{7}\theta^7(1-\theta)^{10-7}\,\frac{\exp\big(-4(\theta-0.5)^2-0.1\cos^2(12\pi\theta)\big)}{0.7108697}\,d\theta,$$
where the first part is the Binomial likelihood and the second part is the prior. In the code below (available in the appendix) I calculate this integral again using the Riemann, trapezoidal and stochastic integration methods.
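As a quick cross-check of this integral with R's built-in quadrature (the helper name is illustrative):

post_integrand <- function(th) {    #binomial likelihood times the normalised prior q(theta)/0.7108697
  choose(10, 7) * th^7 * (1 - th)^3 *
    exp(-4*(th - 0.5)^2 - 0.1*cos(12*pi*th)^2)/0.7108697
}
integrate(post_integrand, 0, 1)$value   #close to the values reported below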
## [1] "RIEMAN"
## [1] 0.1031664
## [1] "TRAPEZOID"
## [1] 0.1031664
## [1] "STOCHASTIC"
## [1] 0.1031175
b) Implement the probability integral transform method to simulate M = 10000 draws from p(θ).
[Figure: CDF of p(theta), plotted over Theta in (0, 1)]

[Figure: histogram (with density overlay) of observations drawn from p(theta) using inverse CDF]
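The draws behind the histogram above come from inverting a numerically computed CDF; a minimal sketch of that step (the grid-based CDF and the approx() interpolation are just one way to do it; the appendix inverts the CDF with uniroot() instead):

p_theta <- function(x) exp(-4*(x - 0.5)^2 - 0.1*cos(12*pi*x)^2)/0.7108697   #target density
grid <- seq(0, 1, length.out = 10001)
cdf_vals <- cumsum(p_theta(grid))*(grid[2] - grid[1])   #numerical CDF on the grid
cdf_vals <- cdf_vals/max(cdf_vals)                      #force the CDF to end exactly at 1
u <- runif(10000)
draws <- approx(x = cdf_vals, y = grid, xout = u, rule = 2)$y   #invert the CDF by interpolation
hist(draws, breaks = 50, freq = FALSE)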
### c) Implement a rejection algorithm to simulate M = 10000 draws from p(θ).
Below I will implement the accept reject sampling method. The histogram shows draws from the same
density as above but with many more draws, so that the resolution is better.
[Figure: histogram (with density overlay) of observations drawn from p(theta) using accept reject]
### ci) Write the (closed form) formula of the envelope.
For the envelope I have used a Normal distribution with $\mu = 0.5$ and $\sigma^2 = 0.25$. It was scaled by $M \approx 1.756$, which I obtained from the maximum value of $\frac{p(\theta)}{\phi(\theta)}$, where $\phi(\theta)$ is the pdf of the normal distribution. The closed form is the following:
$$\phi(\theta) = \frac{1}{0.5\sqrt{2\pi}}\exp\Big(-\frac{1}{2}\Big(\frac{\theta-0.5}{0.5}\Big)^2\Big), \qquad \text{Envelope}(\theta) = 1.756 \times \phi(\theta)$$
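A quick sketch of how this scaling constant can be obtained numerically (the object names are illustrative; the appendix does the same over a wider grid):

p_theta <- function(x) exp(-4*(x - 0.5)^2 - 0.1*cos(12*pi*x)^2)/0.7108697   #target density
phi     <- function(x) dnorm(x, mean = 0.5, sd = 0.5)                       #normal envelope density
grid    <- seq(0, 1, length.out = 10000)
max(p_theta(grid)/phi(grid))                                                #roughly 1.756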
Below you can see how the two distributions look. I have extended the x axis so that it looks better, but for the accept reject step I limit the draws to the interval (0, 1).
[Figure: Envelope visualisation: density of p(theta) and the scaled normal envelope, plotted for Theta from −2 to 2]
### cii) Estimate the acceptance probability.
We can derive the acceptance probability as follows.
$$\begin{aligned}
P\!\left(U \le \frac{p(\Theta)}{M g(\Theta)}\right) &= E\!\left[\mathbf{1}_{U \le \frac{p(\Theta)}{M g(\Theta)}}\right] \\
&= E\!\left[E\!\left[\mathbf{1}_{U \le \frac{p(\Theta)}{M g(\Theta)}} \,\middle|\, \Theta\right]\right] && \text{by the tower property} \\
&= E\!\left[P\!\left(U \le \frac{p(\Theta)}{M g(\Theta)} \,\middle|\, \Theta\right)\right] \\
&= E\!\left[\frac{p(\Theta)}{M g(\Theta)}\right] && \text{because } P(U < u) = u \text{ when } U \text{ is uniform on } (0,1) \\
&= \int_{\theta:\, g(\theta)>0} \frac{p(\theta)}{M g(\theta)}\, g(\theta)\, d\theta \\
&= \frac{1}{M}\int_{\theta:\, g(\theta)>0} p(\theta)\, d\theta \\
&= \frac{1}{M}
\end{aligned}$$
and indeed in the above histogram we have drawn 1000000 samples from $p(\theta)$ out of a total of 1757819 candidates, which gives us the ratio
$$\frac{1000000}{1757819} \approx \frac{1}{1.7578} = \frac{1}{M}$$
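Using the counters kept by the sampler in the appendix (f_x holds the accepted draws and count the total number of proposals), this rate can be checked directly:

length(f_x)/count   #empirical acceptance rate
1/M                 #theoretical acceptance for this envelope, roughly 0.57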
ciii) Compare the estimated acceptance probability to the theoretical value
We are given that the theoretical acceptance probability is $A = \frac{I_0(1/20)}{20\sqrt{e}} = 0.0492$, which is a generally poor acceptance probability compared to the $\approx 57\%$ we get when using the normal distribution as an envelope. Additionally, as far as I understand the accept reject method, the theoretical acceptance probability depends entirely on the envelope, and the envelope is not specified in the question, so I assume it can be chosen freely; I could not find anything in the lectures or the notes indicating that there is a fixed theoretical answer. My intuition says that if such a value exists it comes from some kind of Taylor approximation, but I could not find any mention of a theoretical acceptance rate.
If we nevertheless had to find a density and an $M$ such that we get an acceptance probability of $A = \frac{I_0(1/20)}{20\sqrt{e}} = 0.0492$, we could use the same normal distribution as in our methodology but with $M = 29.77225$. Alternatively, we could use a uniform distribution on $(0, 14.54649)$ with $M = 20.32$ and then run the accept reject sampling; we would get that acceptance probability.
For the posterior density we have
$$\begin{aligned}
p(\theta\mid y) &= \frac{p(y\mid\theta)\,p(\theta)}{p(y)} = \frac{p(y\mid\theta)\,p(\theta)}{\int_0^1 p(y\mid\theta)\,p(\theta)\,d\theta} \\
&= \frac{\binom{10}{7}\theta^7(1-\theta)^{10-7}\,\dfrac{\exp\big(-4(\theta-0.5)^2-0.1\cos^2(12\pi\theta)\big)}{0.7108697}}{\int_0^1 \binom{10}{7}\theta^7(1-\theta)^{10-7}\,\dfrac{\exp\big(-4(\theta-0.5)^2-0.1\cos^2(12\pi\theta)\big)}{0.7108697}\,d\theta}
\end{aligned}$$
where I assume that for $y$ I only have the 10 Bernoulli draws, of which 7 were successful. Obviously Prof. Wierdo has been doing this experiment for longer and has more observations, but we will nevertheless limit this analysis to just the 10 mentioned in the question. We get the following density.
[Figure: Bernoulli likelihood p_bern(theta), plotted over theta in (0, 1)]
As we can see, the maximum is at 0.7, which makes sense. Now, if we take a look below, we can see what the density of p(θ|y) looks like (code in the appendix).
[Figure: density of p(theta|y), plotted over theta in (0, 1)]
e) Apply the Metropolis sampler to sample from p(θ|y). Simulate M = 10000 draws.
Now what we need to do is use the Metropolis-Hastings algorithm to draw 10000 samples from p(θ|y). The MH algorithm is a powerful way of approximating a distribution using Markov chain Monte Carlo. Here we can use the fact that all MH needs is an expression that is proportional to the density we are looking to sample from. Even though we could use the non-normalized densities, I have already derived the normalized ones and shall therefore use them. In the code below I am using a normal distribution as the proposal density with µ = 0.5 and σ² = 0.25. The code is available in the appendix.
[Figure: histogram (with density overlay) of observations drawn from p(theta|y) using Metropolis-Hastings]
APPENDIX
We calculate the inverse of the Jacobian with the help of https://matrixcalc.org/, which kindly provides the results in LaTeX.
$$[J_f(\theta_1,\theta_2)]^{-1} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$
with $A = N_A/\Delta$ and $C = N_C/\Delta$, where
$$\begin{aligned}
N_A ={}& -157904\theta_2^8 - 952544\theta_1\theta_2^7 + 118144\theta_2^7 - 2205448\theta_1^2\theta_2^6 + 611200\theta_1\theta_2^6 - 118144\theta_2^6 \\
& - 2405016\theta_1^3\theta_2^5 + 1278720\theta_1^2\theta_2^5 - 493056\theta_1\theta_2^5 - 1216665\theta_1^4\theta_2^4 + 1409472\theta_1^3\theta_2^4 - 785664\theta_1^2\theta_2^4 \\
& - 272916\theta_1^5\theta_2^3 + 857736\theta_1^4\theta_2^3 - 623808\theta_1^3\theta_2^3 - 58482\theta_1^6\theta_2^2 + 233928\theta_1^5\theta_2^2 - 233928\theta_1^4\theta_2^2, \\[1ex]
N_C ={}& 155464\theta_2^8 + 905688\theta_1\theta_2^7 - 108384\theta_2^7 + 1983606\theta_1^2\theta_2^6 - 433536\theta_1\theta_2^6 + 108384\theta_2^6 \\
& + 1936188\theta_1^3\theta_2^5 - 569016\theta_1^2\theta_2^5 + 325152\theta_1\theta_2^5 + 710829\theta_1^4\theta_2^4 - 243864\theta_1^3\theta_2^4 + 243864\theta_1^2\theta_2^4, \\[1ex]
\Delta ={}& 7661142\theta_1^6 - 30644568\theta_1^5 + 30644568\theta_1^4 + 177488456\theta_2^6 + 931869784\theta_1\theta_2^5 - 131527616\theta_2^5 \\
& + 1889147546\theta_1^2\theta_2^4 - 546698048\theta_1\theta_2^4 + 131527616\theta_2^4 + 1785372552\theta_1^3\theta_2^3 - 781027488\theta_1^2\theta_2^3 + 415170432\theta_1\theta_2^3 \\
& + 694317969\theta_1^4\theta_2^2 - 447575904\theta_1^3\theta_2^2 + 365857056\theta_1^2\theta_2^2 + 35751996\theta_1^5\theta_2 - 112363416\theta_1^4\theta_2 + 81718848\theta_1^3\theta_2.
\end{aligned}$$
###NEWTON-RAPHSON IMPLEMENTATION
#Package for inverting the Jacobian, I could use the
#formula in the appendix if packages are not allowed
library(matlib)
jacob<-function(x,y) {
a<- -(10116/(3*x+2*y)^2)-(8023/(x+y-2)^2)-(131/(x^2)) #second partial derivative of the log-likelihood wrt theta1
b<- -(6744/(3*x+2*y)^2)-(8023/(x+y-2)^2) #mixed second partial derivative of the log-likelihood
c<- -(6744/(3*x+2*y)^2)-(8023/(x+y-2)^2) #mixed second partial derivative (symmetric to b)
d<- -(4496/(3*x+2*y)^2)-(8023/(x+y-2)^2)-(722/y^2) #second partial derivative wrt theta2
#invert the Jacobian, I could use the formula in the appendix but this is more elegant
IA<-inv(matrix(data = c(a,b,c,d),ncol = 2,nrow = 2))
return(IA)
}
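A minimal driver for jacob() (the gradient function and the starting values below are a sketch of my own; the starting values match the iteration table above):

grad<-function(x,y) {   #first partial derivatives of the log-likelihood
  c(3372/(3*x+2*y)+8023/(x+y-2)+131/x,
    2248/(3*x+2*y)+8023/(x+y-2)+722/y)
}
theta<-c(0.2,0.2)       #starting values
for (k in 1:10) {
  theta<-theta-drop(jacob(theta[1],theta[2])%*%grad(theta[1],theta[2]))  #Newton-Raphson update
}
theta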
b1) Implement your EM Algorithm
grid<-data.frame("Teta1"=rep(NA,15),"Teta2"=rep(NA,15))
grid[1,]<-c(0.01,0.01)
EMfunc1<- function(params) {
a<-params[1]
b<-params[2]
c<-(3372*a/(3*a+2*b))     #E[Z11]
d<-(2248*b)/(3*a+2*b)     #E[Z12]
e<-(8023*(a-1))/(a+b-2)   #E[Z21]
f<-(8023*(b-1))/(a+b-2)   #E[Z22]
#if we do not want to calculate the expectation every time we can use this form
#x<- (3765*a^2+a*(4027*b-7530)+262*(b-2)*b)/(27834*a^2+3*a*(6691*b-10533)+2*b*(131*b-8285))
#y<- (2166*a^2+a*(5858*b-4332)+3692*(b-2)*b)/(2166*a^2+a*(29927*b-28401)+142*b*(139*b-165))
x<-(c+131)/(c+e+131)      #M-step update for theta1
y<- (d+722)/(d+f+722)     #M-step update for theta2
z<-c(x,y)
return(z)
}
for (i in 1:(nrow(grid)-1)) {
grid[(i+1),]<-EMfunc1(c(grid[i,1],grid[i,2]))
}
#grid
#Numerical integration of q(theta) for Exercise 2 a); the integrand, the function headers
#and stch below are assumed reconstructions, matching the calls that follow
int_q_theta<-function(x){
y<-exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2)
return(y)}
rie<-function(n,a,b){   #Riemann approximation
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b-h[j],h[j])
grid_int=int_q_theta(grid1)
y[j]=sum(grid_int)*h[j]}
return(y)}
trp<-function(n,a,b){   #Trapezoid approximation
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b,h[j])
grid_int=int_q_theta(grid1)
y[j]=(sum(grid_int)-0.5*(grid_int[1]+grid_int[length(grid1)]))*h[j]}
return(y)}
stch<-function(n,a,b){  #stochastic (Monte Carlo) approximation (assumed; stch is called below)
mean(int_q_theta(runif(n,a,b)))*(b-a)}
print("RIEMAN")
## [1] "RIEMAN"
rie(10000000,0,1) #RIEMAN
## [1] 0.7108697
print("TRAPEZOID")
## [1] "TRAPEZOID"
trp(10000000,0,1) #TRAPEZOID
## [1] 0.7108697
print("STOCHASTIC")
## [1] "STOCHASTIC"
stch(10000000,0,1) #STOCHASTIC
## [1] 0.7107604
#Define the function: integrand for the Bayes normalizing constant (likelihood times prior)
int_q_theta<-function(x){
y<-(choose(10,7)*((x)^7)*((1-x)^3))*((exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2))/0.7108697)
return(y)}
rie<-function(n,a,b){   #Riemann approximation (header reconstructed)
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b-h[j],h[j])
grid_int=int_q_theta(grid1)
y[j]=sum(grid_int)*h[j]}
return(y)}
trp<-function(n,a,b){   #Trapezoid approximation (header reconstructed)
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b,h[j])
grid_int=int_q_theta(grid1)
y[j]=(sum(grid_int)-0.5*(grid_int[1]+grid_int[length(grid1)]))*h[j]}
return(y)}
print("RIEMAN")
## [1] "RIEMAN"
rie(10000000,0,1) #RIEMAN
## [1] 0.1031664
print("TRAPEZOID")
## [1] "TRAPEZOID"
trp(10000000,0,1) #TRAPEZOID
## [1] 0.1031664
print("STOCHASTIC")
## [1] "STOCHASTIC"
stch(10000000,0,1) #STOCHASTIC
## [1] 0.1031679
library(ggplot2)
p_theta<-function(x){
y<-exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2)/0.7108697
return(y)}
u<-seq(from=0, to=1,length.out=100000)
rie_p<-function(n,a,b){   #Riemann approximation of the integral of p_theta over (a,b)
if (b<=a) return(0)       #guard so that the CDF is 0 at the left endpoint
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b-h[j],h[j])
grid_int=p_theta(grid1)
y[j]=sum(grid_int)*h[j]}
return(y)}
values<-c()
for (i in 1:length(u)) {
values[i]<-rie_p(1000,0,u[i])
}
df<-data.frame("Values"=values,"Sequence"=u)
cdf_p_theta<-ggplot(data = df,aes(x=Sequence,y=Values))+
geom_line()+labs(title = "CDF of p(theta)",x="Theta",y="p(theta)")+theme_bw()
#cdf_p_theta
# Use a root finding function to invert the cdf
cdf <- function(x) rie_p(1000, 0, x)   #numerical CDF of p(theta); helper reconstructed from rie_p above
invcdf <- function(q){
uniroot(function(x){cdf(x) - q}, c(0,1))$root
}
p_theta_draws<-rep(0,10000)
for (i in 1:length(p_theta_draws)) {
u<-runif(1,0,1)
p_theta_draws[i]<-invcdf(u)
}
df2<-data.frame("p_theta"=p_theta_draws)
p_draws<-ggplot(df2, aes(x = p_theta)) +   #histogram of the draws; plot definition reconstructed analogously to acc_rej below
geom_histogram(bins=100,aes(y = ..density..), colour = 1, fill = "lightblue") +
labs(title = "Observations drawn from p(theta) using inverse CDF")+theme_bw()
p_draws
[Figure: histogram of observations drawn from p(theta) using the inverse CDF]
g<- function(x) {
y<-dnorm(x,mean=0.5,sd=0.5)
return(y)
}
range<-seq(-2,2,length.out=1000)
M<-max(p_theta(range)/g(range))
f_x<-rep(0,1000000)
count<-0
for (i in 1:length(f_x)) {
repeat {
count<-count+1
x<-rnorm(1,mean=0.5,sd=0.5)
u<-runif(1,0,1)
if(u< p_theta(x)/(M*g(x)) & 0<x & x < 1) {break}
}
f_x[i]<-x
}
df3<-data.frame("p_theta"=f_x)
acc_rej<-ggplot(df3, aes(x = p_theta)) +
geom_histogram(bins=100,aes(y = ..density..),
colour = 1, fill = "lightblue") +
geom_density(lwd = 1.2,
linetype = 2,
colour = "pink")+
labs(title = "Observations drawn from p(theta) using accept reject")+
scale_x_continuous(breaks=c(seq(from=0,to=max(df2$p_theta)+0.1,by=0.1)))+
theme_bw()
#acc_rej
range<-seq(-2,2,length.out=1000)
M<-max(p_theta(range)/g(range))
df_dist<-data.frame("p_theta"=p_theta(range),"phi_theta"=M*g(range),"range"=range)
#ggplot(df_dist) +
#geom_line(aes(x=range,y = p_theta),color="red")+
#geom_line(aes(x=range,y = phi_theta),color="blue")+
#labs(title = "Envelope Visualisation",y="Density",x="Theta")+
#theme_bw()
p_bern<-function(x) {
y<-(choose(10,7)*((x)^7)*((1-x)^(10-7)))
return(y)
}
range<-seq(0,1,length.out=1000)
#plot(range,p_bern(range),type = "l",main = "Bernoulli likelihood")
int_q_theta<-function(x){
y<-(choose(10,7)*((x)^7)*((1-x)^3))*((exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2))/0.7108697)
return(y)}
p_theta_y<-function(x) {   #posterior density of theta given y; the truncated denominator is completed with integrate()
y<-((choose(10,7)*((x)^7)*((1-x)^3))*((exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2))/0.7108697))/integrate(int_q_theta, 0, 1)$value
return(y)
}
range<-seq(0,1,length.out=1000)
#plot(range,p_theta_y(range),type = "l",main = "Density of p(theta|y)")
#print("Check if it integrates to 1")
#integrate(p_theta_y, 0, 1)
2 e) Apply the Metropolis sampler to sample from p(θ|y). Simulate M = 10000 draws.
mh_sampler <- function(nreps, theta0, prop_sd, dens){   #wrapper reconstructed; names and the call below are assumed
theta <- numeric(nreps); theta[1] <- theta0
for (i in 2:nreps){
theta_star <- rnorm(1, mean = theta[i - 1], sd = prop_sd)      #random-walk proposal
alpha = dens(theta_star) / dens(theta[i - 1])                  #acceptance ratio
theta[i] <- if (runif(1) < alpha) theta_star else theta[i - 1] #accept or keep the current value
}
return(theta)
}
mh_sample <- mh_sampler(10000, 0.5, 0.5, p_theta_y)   #starting value and proposal sd assumed
df4<-data.frame("p_theta_y"=mh_sample)
#p_y_draws