Optimization
2023-01-09
Exercise 1
| | $Y_1$ | $Y_2$ | $Y_3$ | $Y_4$ |
|---|---|---|---|---|
| $\pi_j(\theta_1,\theta_2)$ | $\frac{3\theta_1}{8}+\frac{\theta_2}{4}$ | $\frac{1}{2}-\frac{\theta_1}{2}+\frac{1}{2}-\frac{\theta_2}{2}$ | $\frac{\theta_1}{8}$ | $\frac{\theta_2}{4}$ |
| count | 1124 | 8023 | 131 | 722 |
$$\begin{aligned}
\ell_c(\theta_1,\theta_2) &= \sum_{j=1}^{6} Z_j \log[\pi_j^c(\theta_1,\theta_2)] \\
&= Z_{11}(1124,\theta_1)\log\Big(\frac{3\theta_1}{8}\Big) + Z_{12}(1124,\theta_2)\log\Big(\frac{\theta_2}{4}\Big) \\
&\quad + Z_{21}(8023,\theta_1)\log\Big(\frac{1}{2}-\frac{\theta_1}{2}\Big) + Z_{22}(8023,\theta_2)\log\Big(\frac{1}{2}-\frac{\theta_2}{2}\Big) \\
&\quad + 131\log\Big(\frac{\theta_1}{8}\Big) + 722\log\Big(\frac{\theta_2}{4}\Big)
\end{aligned}$$
$$\begin{aligned}
\ell(\theta_1,\theta_2) &= \sum_{j=1}^{4} Y_j \log[\pi_j(\theta_1,\theta_2)] \\
&= 1124\log\Big(\frac{3\theta_1}{8}+\frac{\theta_2}{4}\Big) + 8023\log\Big(\frac{1}{2}-\frac{\theta_1}{2}+\frac{1}{2}-\frac{\theta_2}{2}\Big) + 131\log\Big(\frac{\theta_1}{8}\Big) + 722\log\Big(\frac{\theta_2}{4}\Big)
\end{aligned}$$
First let us derive the partial derivative with respect to $\theta_1$ and set it to 0:
$$\frac{\partial\ell}{\partial\theta_1} = \frac{3y_1}{3\theta_1+2\theta_2} + \frac{y_2}{\theta_1+\theta_2-2} + \frac{y_3}{\theta_1} = 0$$
Plugging in the values:
$$\frac{\partial\ell}{\partial\theta_1} = \frac{3372}{3\theta_1+2\theta_2} + \frac{8023}{\theta_1+\theta_2-2} + \frac{131}{\theta_1} = 0$$
Now let us derive the partial derivative with respect to $\theta_2$ and set it to 0:
$$\frac{\partial\ell}{\partial\theta_2} = \frac{2y_1}{3\theta_1+2\theta_2} + \frac{y_2}{\theta_1+\theta_2-2} + \frac{y_4}{\theta_2} = 0$$
Plugging in the values:
$$\frac{\partial\ell}{\partial\theta_2} = \frac{2248}{3\theta_1+2\theta_2} + \frac{8023}{\theta_1+\theta_2-2} + \frac{722}{\theta_2} = 0$$
Now, solving these two equations as a system, we get two possibilities:
$$\theta_1 = \frac{-451-17\sqrt{7873}}{10000} \approx -0.19594087642280523424, \qquad \theta_2 = \frac{4405+17\sqrt{7873}}{10000} \approx 0.59134087642280523424$$
OR
$$\theta_1 = \frac{-451+17\sqrt{7873}}{10000} \approx 0.10574087642280523424, \qquad \theta_2 = \frac{4405-17\sqrt{7873}}{10000} \approx 0.28965912357719476576$$
Naturally we choose the second set of results: first, because if $\theta_1$ were negative we would get an undefined result for the $131\log(\theta_1/8)$ term in the log-likelihood function; and secondly, because we ran a reality check and the maximum of the log-likelihood is indeed at $\theta_1 \approx 0.1057$ and $\theta_2 \approx 0.2896$, with a value of $-6689.525$.
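A minimal sketch of that reality check in R (the function name, the optimizer settings and the starting values here are just illustrative):

loglik <- function(th) {                        #observed-data log-likelihood from above
  1124*log(3*th[1]/8 + th[2]/4) + 8023*log(1 - th[1]/2 - th[2]/2) +
   131*log(th[1]/8) + 722*log(th[2]/4)
}
fit <- optim(c(0.2, 0.2), function(th) -loglik(th), method = "L-BFGS-B",
             lower = c(1e-6, 1e-6), upper = c(1 - 1e-6, 1 - 1e-6))
fit$par     #roughly (0.1057, 0.2897)
-fit$value  #roughly -6689.5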
Now, as we have both a $\theta_1$ and a $\theta_2$, we have to use a more general version of the Newton-Raphson method (as opposed to the single-parameter one). In essence, what we want is to solve a system of $n = 2$ equations in $n = 2$ unknowns. The system we have is
$$\begin{cases}
f_1 = \dfrac{\partial\ell}{\partial\theta_1} = \dfrac{3372}{3\theta_1+2\theta_2} + \dfrac{8023}{\theta_1+\theta_2-2} + \dfrac{131}{\theta_1} = 0 \\[2ex]
f_2 = \dfrac{\partial\ell}{\partial\theta_2} = \dfrac{2248}{3\theta_1+2\theta_2} + \dfrac{8023}{\theta_1+\theta_2-2} + \dfrac{722}{\theta_2} = 0
\end{cases}$$
From here we would want to calculate the inverse of the Jacobian of this system, i.e. $[J_f(\theta_1,\theta_2)]^{-1}$. However, this is a tedious process and the result does not fit on the page; you can see it in closed form in the appendix. Nevertheless, once we have the inverse and starting values $\begin{pmatrix}\theta_1^{(0)}\\ \theta_2^{(0)}\end{pmatrix}$ we can specify the Newton-Raphson update as follows:
" # " #
(k+1) (k)
θ1 θ1 −1 (k) (k)
(k+1) = (k) − [J f (θ1 , θ2 )] f (θ1 , θ2 )
θ2 θ2
3372 8023
+ 131
" # " #
(k) + (k)
(k)
θ1 A B (k)
3θ1 +2θ2
(k)
θ1 +θ2 −2
(k)
θ1
= (k) − 2248 8023
θ2 C D (k) (k) + (k) (k) + 722
(k)
3θ1 +2θ2 θ1 +θ2 −2 θ2
A B
where is the inverse Jacobian and can be seen in the appendix
C D
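A compact numerical sketch of this update (the numDeriv package and all names below are my own illustrative choices; the analytic Jacobian version is in the appendix):

library(numDeriv)
f <- function(th) c(3372/(3*th[1]+2*th[2]) + 8023/(th[1]+th[2]-2) + 131/th[1],
                    2248/(3*th[1]+2*th[2]) + 8023/(th[1]+th[2]-2) + 722/th[2])
theta <- c(0.2, 0.2)                                     #starting values (illustrative)
for (k in 1:10) {
  theta <- theta - solve(jacobian(f, theta), f(theta))   #theta^(k+1) = theta^(k) - J^-1 f(theta^(k))
}
theta                                                    #converges to about (0.1057, 0.2897)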
For the EM algorithm, the E-step expectations of the latent counts, given the current estimates $\theta_1^{(t)}, \theta_2^{(t)}$, are:
$$\begin{aligned}
E(Z_{11}\mid\theta_1^{(t)},\theta_2^{(t)},Y_1) &= 1124\,\frac{\frac{3\theta_1^{(t)}}{8}}{\frac{3\theta_1^{(t)}}{8}+\frac{\theta_2^{(t)}}{4}} = \frac{3372\,\theta_1^{(t)}}{3\theta_1^{(t)}+2\theta_2^{(t)}} \\
E(Z_{12}\mid\theta_1^{(t)},\theta_2^{(t)},Y_1) &= 1124 - \frac{3372\,\theta_1^{(t)}}{3\theta_1^{(t)}+2\theta_2^{(t)}} = \frac{2248\,\theta_2^{(t)}}{3\theta_1^{(t)}+2\theta_2^{(t)}} \\
E(Z_{21}\mid\theta_1^{(t)},\theta_2^{(t)},Y_2) &= 8023\,\frac{\frac{1}{2}-\frac{\theta_1^{(t)}}{2}}{\frac{1}{2}-\frac{\theta_1^{(t)}}{2}+\frac{1}{2}-\frac{\theta_2^{(t)}}{2}} = \frac{8023\,(\theta_1^{(t)}-1)}{\theta_1^{(t)}+\theta_2^{(t)}-2} \\
E(Z_{22}\mid\theta_1^{(t)},\theta_2^{(t)},Y_2) &= 8023 - \frac{8023\,(\theta_1^{(t)}-1)}{\theta_1^{(t)}+\theta_2^{(t)}-2} = \frac{8023\,(\theta_2^{(t)}-1)}{\theta_1^{(t)}+\theta_2^{(t)}-2}
\end{aligned}$$
First, taking the complete log-likelihood function from above and plugging in these expectations, the objective function is
$$Q(\theta_1,\theta_2\mid\theta_1^{(t)},\theta_2^{(t)}) = Z_{11}\log\Big(\frac{3\theta_1}{8}\Big) + Z_{12}\log\Big(\frac{\theta_2}{4}\Big) + Z_{21}\log\Big(\frac{1}{2}-\frac{\theta_1}{2}\Big) + Z_{22}\log\Big(\frac{1}{2}-\frac{\theta_2}{2}\Big) + 131\log\Big(\frac{\theta_1}{8}\Big) + 722\log\Big(\frac{\theta_2}{4}\Big),$$
where the $Z$'s denote the expectations above. From here we can take the first derivatives with respect to $\theta_1$ and $\theta_2$ respectively and set them to zero.
$$\frac{\partial Q(\theta_1,\theta_2\mid\theta_1^{(t)},\theta_2^{(t)})}{\partial\theta_1} = \frac{\theta_1(Z_{11}+Z_{21}+131)-Z_{11}-131}{(\theta_1-1)\theta_1} = \frac{3372\,\theta_1^{(t)}}{\theta_1(3\theta_1^{(t)}+2\theta_2^{(t)})} - \frac{8023\,(\theta_1^{(t)}-1)}{2\big(\frac{1}{2}-\frac{\theta_1}{2}\big)(\theta_1^{(t)}+\theta_2^{(t)}-2)} + \frac{131}{\theta_1} = 0$$

$$\frac{\partial Q(\theta_1,\theta_2\mid\theta_1^{(t)},\theta_2^{(t)})}{\partial\theta_2} = \frac{\theta_2(Z_{12}+Z_{22}+722)-Z_{12}-722}{(\theta_2-1)\theta_2} = \frac{2248\,\theta_2^{(t)}}{\theta_2(3\theta_1^{(t)}+2\theta_2^{(t)})} - \frac{8023\,(\theta_2^{(t)}-1)}{2\big(\frac{1}{2}-\frac{\theta_2}{2}\big)(\theta_1^{(t)}+\theta_2^{(t)}-2)} + \frac{722}{\theta_2} = 0$$
From here we can find an expression for $\theta_1$ and $\theta_2$ respectively. They are
$$\theta_1 = \frac{Z_{11}+131}{Z_{11}+Z_{21}+131} = \frac{3765(\theta_1^{(t)})^2 + \theta_1^{(t)}(4027\theta_2^{(t)}-7530) + 262(\theta_2^{(t)}-2)\theta_2^{(t)}}{27834(\theta_1^{(t)})^2 + 3\theta_1^{(t)}(6691\theta_2^{(t)}-10533) + 2\theta_2^{(t)}(131\theta_2^{(t)}-8285)}$$
$$\theta_2 = \frac{Z_{12}+722}{Z_{12}+Z_{22}+722} = \frac{2166(\theta_1^{(t)})^2 + \theta_1^{(t)}(5858\theta_2^{(t)}-4332) + 3692(\theta_2^{(t)}-2)\theta_2^{(t)}}{2166(\theta_1^{(t)})^2 + \theta_1^{(t)}(29927\theta_2^{(t)}-28401) + 142\theta_2^{(t)}(139\theta_2^{(t)}-165)}$$
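A minimal sketch of the resulting EM iteration (the function name and starting values are illustrative; the full implementation with the iteration history is in the appendix):

em_step <- function(th) {                        #one E-step followed by one M-step
  a <- th[1]; b <- th[2]
  z11 <- 3372*a/(3*a + 2*b);  z21 <- 8023*(a - 1)/(a + b - 2)
  z12 <- 2248*b/(3*a + 2*b);  z22 <- 8023*(b - 1)/(a + b - 2)
  c((z11 + 131)/(z11 + z21 + 131), (z12 + 722)/(z12 + z22 + 722))
}
theta <- c(0.2, 0.2)                             #starting values (illustrative)
for (k in 1:50) theta <- em_step(theta)
theta                                            #converges to about (0.1057, 0.2897)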
Starting from $(0.2, 0.2)$, the Newton-Raphson iterations converge in only a handful of steps:

## Theta1 Theta2
## 1 0.20000000 0.2000000
## 2 0.09673894 0.2725738
## 3 0.10505359 0.2889658
## 4 0.10573591 0.2896598
## 5 0.10574088 0.2896591
## 6 0.10574088 0.2896591
## 7 0.10574088 0.2896591
## 8 0.10574088 0.2896591
## 9 0.10574088 0.2896591
## 10 0.10574088 0.2896591
The EM iterations, started from the same point, converge to the same values but much more slowly:

## Teta1 Teta2
## 1 0.2000000 0.2000000
## 2 0.1672030 0.2260423
## 3 0.1479708 0.2451093
## 4 0.1352150 0.2581961
## 5 0.1265147 0.2673104
## [1] "..."
## Teta1 Teta2
## 40 0.1057411 0.2896589
## 41 0.1057410 0.2896590
## 42 0.1057410 0.2896590
## 43 0.1057410 0.2896590
## 44 0.1057409 0.2896591
## 45 0.1057409 0.2896591
Exercise 2
a) Find the normalizing constant of q(θ) using two methods for numerical integration.
In the code below I implement Riemann and trapezoidal integral approximation, along with my personal choice, stochastic (Monte Carlo) integration, to calculate the integral above. The code is available in the appendix.
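A minimal sketch of the three approximations for a generic integrand f on [a, b] (the function names here are illustrative; the versions actually used are in the appendix):

riemann <- function(f, a, b, n) {         #left Riemann sum with n subintervals
  h <- (b - a)/n
  sum(f(seq(a, b - h, by = h))) * h
}
trapezoid <- function(f, a, b, n) {       #trapezoidal rule on the same grid
  h <- (b - a)/n
  x <- seq(a, b, by = h)
  (sum(f(x)) - 0.5*(f(a) + f(b))) * h
}
mc_int <- function(f, a, b, n) {          #stochastic (Monte Carlo) approximation
  mean(f(runif(n, a, b))) * (b - a)
}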
## [1] "RIEMAN"
## [1] 0.07333786
## [1] "TRAPEZOID"
## [1] 0.07333786
## [1] "STOCHASTIC"
## [1] 0.073371
If we however assume that we are looking for the normalizing constant for the Bayes problem, then we need to calculate it from
$$p(\theta\mid y) \propto p(y\mid\theta)\,p(\theta).$$
Now, given that $p(\theta\mid y)$ is a probability, the integral over all values of $\theta$ must be equal to 1, which in turn means that
$$p(\theta\mid y) = \frac{p(y\mid\theta)\,p(\theta)}{p(y)} = \frac{p(y\mid\theta)\,p(\theta)}{\int_0^1 p(y\mid\theta)\,p(\theta)\,d\theta} = \frac{p(y\mid\theta)\,q(\theta)}{\int_0^1 p(y\mid\theta)\,q(\theta)\,d\theta}.$$
From here we have that the normalizing constant is the integral $\int_0^1 p(y\mid\theta)\,p(\theta)\,d\theta$. Now, the question gives us 10 Bernoulli observations and we know that 7 of those were successful. So in essence the question is asking us to use two numerical methods to find
$$\int_0^1 \binom{10}{7}\theta^7(1-\theta)^{10-7}\,\frac{\exp\big(-4(\theta-0.5)^2-0.1\cos^2(12\pi\theta)\big)}{0.7108697}\,d\theta,$$
where the first part is the Binomial likelihood and the second part is the prior. In the code below (available in the appendix) I calculate this integral again using the Riemann, trapezoidal and stochastic integration methods.
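As a quick cross-check of this integral with R's built-in quadrature (the helper name is illustrative):

post_integrand <- function(th) {    #binomial likelihood times the normalised prior q(theta)/0.7108697
  choose(10, 7) * th^7 * (1 - th)^3 *
    exp(-4*(th - 0.5)^2 - 0.1*cos(12*pi*th)^2)/0.7108697
}
integrate(post_integrand, 0, 1)$value   #close to the values reported below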
## [1] "RIEMAN"
## [1] 0.1031664
## [1] "TRAPEZOID"
## [1] 0.1031664
## [1] "STOCHASTIC"
## [1] 0.1031175
b) Implement the probability integral transform method to simulate M = 10000 draws from p(θ).
[Figure: CDF of p(theta), plotted over Theta in (0, 1)]

[Figure: histogram (with density overlay) of observations drawn from p(theta) using inverse CDF]
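The draws behind the histogram above come from inverting a numerically computed CDF; a minimal sketch of that step (the grid-based CDF and the approx() interpolation are just one way to do it; the appendix inverts the CDF with uniroot() instead):

p_theta <- function(x) exp(-4*(x - 0.5)^2 - 0.1*cos(12*pi*x)^2)/0.7108697   #target density
grid <- seq(0, 1, length.out = 10001)
cdf_vals <- cumsum(p_theta(grid))*(grid[2] - grid[1])   #numerical CDF on the grid
cdf_vals <- cdf_vals/max(cdf_vals)                      #force the CDF to end exactly at 1
u <- runif(10000)
draws <- approx(x = cdf_vals, y = grid, xout = u, rule = 2)$y   #invert the CDF by interpolation
hist(draws, breaks = 50, freq = FALSE)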
### c) Implement a rejection algorithm to simulate M = 10000 draws from p(θ).
Below I will implement the accept reject sampling method. The histogram shows draws from the same
density as above but with many more draws, so that the resolution is better.
[Figure: histogram (with density overlay) of observations drawn from p(theta) using accept reject]
### ci) Write the (closed form) formula of the envelope.
For the envelope I have used a Normal distribution with $\mu = 0.5$ and $\sigma^2 = 0.25$. It was scaled by $M \approx 1.756$, which I obtained from the maximum value of $\frac{p(\theta)}{\phi(\theta)}$, where $\phi(\theta)$ is the pdf of the normal distribution. The closed form is the following:
$$\phi(\theta) = \frac{1}{0.5\sqrt{2\pi}}\exp\Big(-\frac{1}{2}\Big(\frac{\theta-0.5}{0.5}\Big)^2\Big), \qquad \text{Envelope}(\theta) = 1.756 \times \phi(\theta)$$
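A quick sketch of how this scaling constant can be obtained numerically (the object names are illustrative; the appendix does the same over a wider grid):

p_theta <- function(x) exp(-4*(x - 0.5)^2 - 0.1*cos(12*pi*x)^2)/0.7108697   #target density
phi     <- function(x) dnorm(x, mean = 0.5, sd = 0.5)                       #normal envelope density
grid    <- seq(0, 1, length.out = 10000)
max(p_theta(grid)/phi(grid))                                                #roughly 1.756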
Below you can see how the two distributions look. I have extended the x axis so that it looks better, but for the accept reject step I limit the draws to the interval (0, 1).
[Figure: Envelope visualisation: density of p(theta) and the scaled normal envelope, plotted for Theta from −2 to 2]
### cii) Estimate the acceptance probability.
We can derive the acceptance probability as follows.
$$\begin{aligned}
P\!\left(U \le \frac{p(\Theta)}{M g(\Theta)}\right) &= E\!\left[\mathbf{1}_{U \le \frac{p(\Theta)}{M g(\Theta)}}\right] \\
&= E\!\left[E\!\left[\mathbf{1}_{U \le \frac{p(\Theta)}{M g(\Theta)}} \,\middle|\, \Theta\right]\right] && \text{by the tower property} \\
&= E\!\left[P\!\left(U \le \frac{p(\Theta)}{M g(\Theta)} \,\middle|\, \Theta\right)\right] \\
&= E\!\left[\frac{p(\Theta)}{M g(\Theta)}\right] && \text{because } P(U < u) = u \text{ when } U \text{ is uniform on } (0,1) \\
&= \int_{\theta:\, g(\theta)>0} \frac{p(\theta)}{M g(\theta)}\, g(\theta)\, d\theta \\
&= \frac{1}{M}\int_{\theta:\, g(\theta)>0} p(\theta)\, d\theta \\
&= \frac{1}{M}
\end{aligned}$$
and indeed in the above histogram we have drawn 1000000 samples from $p(\theta)$ out of a total of 1757819 candidates, which gives us the ratio
$$\frac{1000000}{1757819} \approx \frac{1}{1.7578} = \frac{1}{M}$$
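Using the counters kept by the sampler in the appendix (f_x holds the accepted draws and count the total number of proposals), this rate can be checked directly:

length(f_x)/count   #empirical acceptance rate
1/M                 #theoretical acceptance for this envelope, roughly 0.57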
ciii) Compare the estimated acceptance probability to the theoretical value
We are given that the theoretical acceptance probability is $A = \frac{I_0(1/20)}{20\sqrt{e}} = 0.0492$, which is a generally poor acceptance probability compared to the $\approx 57\%$ we get when using the normal distribution as an envelope. Additionally, as far as I understand the accept reject method, the theoretical acceptance probability depends entirely on the envelope, and the envelope is not specified in the question, so I assume it can be chosen freely; I could not find anything in the lectures or the notes indicating that there is a fixed theoretical answer. My intuition says that if such a value exists it comes from some kind of Taylor approximation, but I could not find any mention of a theoretical acceptance rate.
If we nevertheless had to find a density and an $M$ such that we get an acceptance probability of $A = \frac{I_0(1/20)}{20\sqrt{e}} = 0.0492$, we could use the same normal distribution as in our methodology but with $M = 29.77225$. Alternatively, we could use a uniform distribution on $(0, 14.54649)$ with $M = 20.32$ and then run the accept reject sampling; we would get that acceptance probability.
For the posterior density we have
$$\begin{aligned}
p(\theta\mid y) &= \frac{p(y\mid\theta)\,p(\theta)}{p(y)} = \frac{p(y\mid\theta)\,p(\theta)}{\int_0^1 p(y\mid\theta)\,p(\theta)\,d\theta} \\
&= \frac{\binom{10}{7}\theta^7(1-\theta)^{10-7}\,\dfrac{\exp\big(-4(\theta-0.5)^2-0.1\cos^2(12\pi\theta)\big)}{0.7108697}}{\int_0^1 \binom{10}{7}\theta^7(1-\theta)^{10-7}\,\dfrac{\exp\big(-4(\theta-0.5)^2-0.1\cos^2(12\pi\theta)\big)}{0.7108697}\,d\theta}
\end{aligned}$$
where I assume that for $y$ I only have the 10 Bernoulli draws, of which 7 were successful. Obviously Prof. Wierdo has been doing this experiment for longer and has more observations, but we will nevertheless limit this analysis to just the 10 mentioned in the question. We get the following density.
[Figure: Bernoulli likelihood p_bern(theta), plotted over theta in (0, 1)]
As we can see, the maximum is at 0.7, which makes sense. Now, if we take a look below, we can see what the density of p(θ|y) looks like (code in the appendix).
[Figure: density of p(theta|y), plotted over theta in (0, 1)]
e) Apply the Metropolis sampler to sample from p(θ|y). Simulate M = 10000 draws.
Now what we need to do is use the Metropolis-Hastings algorithm to draw 10000 samples from p(θ|y). The MH algorithm is a powerful way of approximating a distribution using Markov chain Monte Carlo. Here we can use the fact that all MH needs is an expression that is proportional to the density we are looking to sample from. Even though we could use the non-normalized densities, I have already derived the normalized ones and shall therefore use them. In the code below I am using a normal distribution as the proposal density with µ = 0.5 and σ² = 0.25. The code is available in the appendix.
[Figure: histogram (with density overlay) of observations drawn from p(theta|y) using Metropolis-Hastings]
APPENDIX
We calculate the inverse of the Jacobian with the help of https://matrixcalc.org/, which kindly provides the results in LaTeX.
$$[J_f(\theta_1,\theta_2)]^{-1} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$
with $A = N_A/\Delta$ and $C = N_C/\Delta$, where
$$\begin{aligned}
N_A ={}& -157904\theta_2^8 - 952544\theta_1\theta_2^7 + 118144\theta_2^7 - 2205448\theta_1^2\theta_2^6 + 611200\theta_1\theta_2^6 - 118144\theta_2^6 \\
& - 2405016\theta_1^3\theta_2^5 + 1278720\theta_1^2\theta_2^5 - 493056\theta_1\theta_2^5 - 1216665\theta_1^4\theta_2^4 + 1409472\theta_1^3\theta_2^4 - 785664\theta_1^2\theta_2^4 \\
& - 272916\theta_1^5\theta_2^3 + 857736\theta_1^4\theta_2^3 - 623808\theta_1^3\theta_2^3 - 58482\theta_1^6\theta_2^2 + 233928\theta_1^5\theta_2^2 - 233928\theta_1^4\theta_2^2, \\[1ex]
N_C ={}& 155464\theta_2^8 + 905688\theta_1\theta_2^7 - 108384\theta_2^7 + 1983606\theta_1^2\theta_2^6 - 433536\theta_1\theta_2^6 + 108384\theta_2^6 \\
& + 1936188\theta_1^3\theta_2^5 - 569016\theta_1^2\theta_2^5 + 325152\theta_1\theta_2^5 + 710829\theta_1^4\theta_2^4 - 243864\theta_1^3\theta_2^4 + 243864\theta_1^2\theta_2^4, \\[1ex]
\Delta ={}& 7661142\theta_1^6 - 30644568\theta_1^5 + 30644568\theta_1^4 + 177488456\theta_2^6 + 931869784\theta_1\theta_2^5 - 131527616\theta_2^5 \\
& + 1889147546\theta_1^2\theta_2^4 - 546698048\theta_1\theta_2^4 + 131527616\theta_2^4 + 1785372552\theta_1^3\theta_2^3 - 781027488\theta_1^2\theta_2^3 + 415170432\theta_1\theta_2^3 \\
& + 694317969\theta_1^4\theta_2^2 - 447575904\theta_1^3\theta_2^2 + 365857056\theta_1^2\theta_2^2 + 35751996\theta_1^5\theta_2 - 112363416\theta_1^4\theta_2 + 81718848\theta_1^3\theta_2.
\end{aligned}$$
###NEWTON-RAPHSON IMPLEMENTATION
#Package for inverting the Jacobian, I could use the
#formula in the appendix if packages are not allowed
library(matlib)
jacob<-function(x,y) {
a<- -(10116/(3*x+2*y)^2)-(8023/(x+y-2)^2)-(131/(x^2)) #second partial derivative of the log-likelihood wrt theta1
b<- -(6744/(3*x+2*y)^2)-(8023/(x+y-2)^2) #mixed second partial derivative of the log-likelihood
c<- -(6744/(3*x+2*y)^2)-(8023/(x+y-2)^2) #mixed second partial derivative (symmetric to b)
d<- -(4496/(3*x+2*y)^2)-(8023/(x+y-2)^2)-(722/y^2) #second partial derivative wrt theta2
#invert the Jacobian, I could use the formula in the appendix but this is more elegant
IA<-inv(matrix(data = c(a,b,c,d),ncol = 2,nrow = 2))
return(IA)
}
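A minimal driver for jacob() (the gradient function and the starting values below are a sketch of my own; the starting values match the iteration table above):

grad<-function(x,y) {   #first partial derivatives of the log-likelihood
  c(3372/(3*x+2*y)+8023/(x+y-2)+131/x,
    2248/(3*x+2*y)+8023/(x+y-2)+722/y)
}
theta<-c(0.2,0.2)       #starting values
for (k in 1:10) {
  theta<-theta-drop(jacob(theta[1],theta[2])%*%grad(theta[1],theta[2]))  #Newton-Raphson update
}
theta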
b1) Implement your EM Algorithm
grid<-data.frame("Teta1"=rep(NA,15),"Teta2"=rep(NA,15))
grid[1,]<-c(0.01,0.01)
EMfunc1<- function(params) {
a<-params[1]
b<-params[2]
c<-(3372*a/(3*a+2*b))     #E[Z11]
d<-(2248*b)/(3*a+2*b)     #E[Z12]
e<-(8023*(a-1))/(a+b-2)   #E[Z21]
f<-(8023*(b-1))/(a+b-2)   #E[Z22]
#if we do not want to calculate the expectation every time we can use this form
#x<- (3765*a^2+a*(4027*b-7530)+262*(b-2)*b)/(27834*a^2+3*a*(6691*b-10533)+2*b*(131*b-8285))
#y<- (2166*a^2+a*(5858*b-4332)+3692*(b-2)*b)/(2166*a^2+a*(29927*b-28401)+142*b*(139*b-165))
x<-(c+131)/(c+e+131)      #M-step update for theta1
y<- (d+722)/(d+f+722)     #M-step update for theta2
z<-c(x,y)
return(z)
}
for (i in 1:(nrow(grid)-1)) {
grid[(i+1),]<-EMfunc1(c(grid[i,1],grid[i,2]))
}
#grid
#Numerical integration of q(theta) for Exercise 2 a); the integrand, the function headers
#and stch below are assumed reconstructions, matching the calls that follow
int_q_theta<-function(x){
y<-exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2)
return(y)}
rie<-function(n,a,b){   #Riemann approximation
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b-h[j],h[j])
grid_int=int_q_theta(grid1)
y[j]=sum(grid_int)*h[j]}
return(y)}
trp<-function(n,a,b){   #Trapezoid approximation
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b,h[j])
grid_int=int_q_theta(grid1)
y[j]=(sum(grid_int)-0.5*(grid_int[1]+grid_int[length(grid1)]))*h[j]}
return(y)}
stch<-function(n,a,b){  #stochastic (Monte Carlo) approximation (assumed; stch is called below)
mean(int_q_theta(runif(n,a,b)))*(b-a)}
print("RIEMAN")
## [1] "RIEMAN"
rie(10000000,0,1) #RIEMAN
## [1] 0.7108697
print("TRAPEZOID")
## [1] "TRAPEZOID"
trp(10000000,0,1) #TRAPEZOID
## [1] 0.7108697
print("STOCHASTIC")
## [1] "STOCHASTIC"
stch(10000000,0,1) #STOCHASTIC
## [1] 0.7107604
#Define the function: integrand for the Bayes normalizing constant (likelihood times prior)
int_q_theta<-function(x){
y<-(choose(10,7)*((x)^7)*((1-x)^3))*((exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2))/0.7108697)
return(y)}
rie<-function(n,a,b){   #Riemann approximation (header reconstructed)
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b-h[j],h[j])
grid_int=int_q_theta(grid1)
y[j]=sum(grid_int)*h[j]}
return(y)}
trp<-function(n,a,b){   #Trapezoid approximation (header reconstructed)
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b,h[j])
grid_int=int_q_theta(grid1)
y[j]=(sum(grid_int)-0.5*(grid_int[1]+grid_int[length(grid1)]))*h[j]}
return(y)}
print("RIEMAN")
## [1] "RIEMAN"
rie(10000000,0,1) #RIEMAN
## [1] 0.1031664
print("TRAPEZOID")
## [1] "TRAPEZOID"
trp(10000000,0,1) #TRAPEZOID
## [1] 0.1031664
print("STOCHASTIC")
## [1] "STOCHASTIC"
stch(10000000,0,1) #STOCHASTIC
## [1] 0.1031679
library(ggplot2)
p_theta<-function(x){
y<-exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2)/0.7108697
return(y)}
u<-seq(from=0, to=1,length.out=100000)
rie_p<-function(n,a,b){   #Riemann approximation of the integral of p_theta over (a,b)
if (b<=a) return(0)       #guard so that the CDF is 0 at the left endpoint
h=(b-a)/n
y=rep(0,length(n))
for (j in 1:length(n)){
grid1=seq(a,b-h[j],h[j])
grid_int=p_theta(grid1)
y[j]=sum(grid_int)*h[j]}
return(y)}
values<-c()
for (i in 1:length(u)) {
values[i]<-rie_p(1000,0,u[i])
}
df<-data.frame("Values"=values,"Sequence"=u)
cdf_p_theta<-ggplot(data = df,aes(x=Sequence,y=Values))+
geom_line()+labs(title = "CDF of p(theta)",x="Theta",y="p(theta)")+theme_bw()
#cdf_p_theta
# Use a root finding function to invert the cdf
cdf <- function(x) rie_p(1000, 0, x)   #numerical CDF of p(theta); helper reconstructed from rie_p above
invcdf <- function(q){
uniroot(function(x){cdf(x) - q}, c(0,1))$root
}
p_theta_draws<-rep(0,10000)
for (i in 1:length(p_theta_draws)) {
u<-runif(1,0,1)
p_theta_draws[i]<-invcdf(u)
}
df2<-data.frame("p_theta"=p_theta_draws)
p_draws<-ggplot(df2, aes(x = p_theta)) +   #histogram of the draws; plot definition reconstructed analogously to acc_rej below
geom_histogram(bins=100,aes(y = ..density..), colour = 1, fill = "lightblue") +
labs(title = "Observations drawn from p(theta) using inverse CDF")+theme_bw()
p_draws
[Figure: histogram of observations drawn from p(theta) using the inverse CDF]
g<- function(x) {
y<-dnorm(x,mean=0.5,sd=0.5)
return(y)
}
range<-seq(-2,2,length.out=1000)
M<-max(p_theta(range)/g(range))
f_x<-rep(0,1000000)
count<-0
for (i in 1:length(f_x)) {
repeat {
count<-count+1
x<-rnorm(1,mean=0.5,sd=0.5)
u<-runif(1,0,1)
if(u< p_theta(x)/(M*g(x)) & 0<x & x < 1) {break}
}
f_x[i]<-x
}
df3<-data.frame("p_theta"=f_x)
acc_rej<-ggplot(df3, aes(x = p_theta)) +
geom_histogram(bins=100,aes(y = ..density..),
colour = 1, fill = "lightblue") +
geom_density(lwd = 1.2,
linetype = 2,
colour = "pink")+
labs(title = "Observations drawn from p(theta) using accept reject")+
scale_x_continuous(breaks=c(seq(from=0,to=max(df2$p_theta)+0.1,by=0.1)))+
theme_bw()
#acc_rej
range<-seq(-2,2,length.out=1000)
M<-max(p_theta(range)/g(range))
df_dist<-data.frame("p_theta"=p_theta(range),"phi_theta"=M*g(range),"range"=range)
#ggplot(df_dist) +
#geom_line(aes(x=range,y = p_theta),color="red")+
#geom_line(aes(x=range,y = phi_theta),color="blue")+
#labs(title = "Envelope Visualisation",y="Density",x="Theta")+
#theme_bw()
p_bern<-function(x) {
y<-(choose(10,7)*((x)^7)*((1-x)^(10-7)))
return(y)
}
range<-seq(0,1,length.out=1000)
#plot(range,p_bern(range),type = "l",main = "Bernoulli likelihood")
int_q_theta<-function(x){
y<-(choose(10,7)*((x)^7)*((1-x)^3))*((exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2))/0.7108697)
return(y)}
p_theta_y<-function(x) {   #posterior density of theta given y; the truncated denominator is completed with integrate()
y<-((choose(10,7)*((x)^7)*((1-x)^3))*((exp(-4*((x-0.5)^2)-0.1*cos(12*pi*x)^2))/0.7108697))/integrate(int_q_theta, 0, 1)$value
return(y)
}
range<-seq(0,1,length.out=1000)
#plot(range,p_theta_y(range),type = "l",main = "Density of p(theta|y)")
#print("Check if it integrates to 1")
#integrate(p_theta_y, 0, 1)
2 e) Apply the Metropolis sampler to sample from p(θ|y). Simulate M = 10000 draws.
mh_sampler <- function(nreps, theta0, prop_sd, dens){   #wrapper reconstructed; names and the call below are assumed
theta <- numeric(nreps); theta[1] <- theta0
for (i in 2:nreps){
theta_star <- rnorm(1, mean = theta[i - 1], sd = prop_sd)      #random-walk proposal
alpha = dens(theta_star) / dens(theta[i - 1])                  #acceptance ratio
theta[i] <- if (runif(1) < alpha) theta_star else theta[i - 1] #accept or keep the current value
}
return(theta)
}
mh_sample <- mh_sampler(10000, 0.5, 0.5, p_theta_y)   #starting value and proposal sd assumed
df4<-data.frame("p_theta_y"=mh_sample)
#p_y_draws