HW2P
1.1 Q1: Bayesian Sequential Estimation and the Effects of Priors [8 points]
In this problem, we will consider sequential Bayesian estimation of a coin’s bias-weighting
for heads. Beginning with a Beta (Bishop Equation 2.13) prior distribution on the bias
𝜇, you will simulate observing successive coin flips, and plot the posterior distribution of the bias
𝜇 as determined by applying Bayes’ Theorem (Bishop Equation 1.43). This material is covered in
Bishop, pages 71-74.
To explore the effect that different initial prior distributions have, you will perform this sequential
estimation process for three initial priors in parallel:
• (a = 1, b = 1)
• (a = 0.5, b = 0.5)
• (a = 50, b = 50)
You will simulate flipping a coin with a 1/4 probability of coming up heads on each toss. This
question is divided into two parts. The first will deal with the behavior of the three distributions
for the first few sample flips. The second will deal with the behavior of the three distributions as
we see several thousand flips. For each part, you will create a collection of subplots to visualize the
evolution of the posterior distributions over time.
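The conjugacy at the heart of this exercise is worth stating up front: with a Beta(a, b) prior on 𝜇 and Bernoulli flips, the posterior after observing h heads and t tails is simply Beta(a + h, b + t). A minimal sketch (the helper name update_beta is our own, not part of the assignment):

```python
def update_beta(a, b, heads, tails):
    """Posterior Beta hyperparameters after observing the given counts."""
    # Beta(a, b) prior + h heads, t tails -> Beta(a + h, b + t) posterior
    return a + heads, b + tails

# Starting from a uniform Beta(1, 1) prior, two heads and three tails
# yield a Beta(3, 4) posterior, whose mean is a / (a + b) = 3/7.
a, b = update_beta(1, 1, 2, 3)
print(a, b)
```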
First, in the cell below, we define the helper function plotbetapdfs, which is used to plot the Beta
distributions for our prior and posterior estimates.
[2]: def plotbetapdfs(ab, sp_idx, tally):
         """
         Inputs:
           ab: 3-by-2 matrix containing the a,b parameters for the priors/posteriors.
               Initial entries in the matrix:
                 ab[0,:] = [1, 1]
                 ab[1,:] = [.5, .5]
                 ab[2,:] = [50, 50]
           sp_idx: (nrows, ncols, index) tuple selecting the subplot to draw into
           tally: [#heads, #tails] observed so far
         """
         plt.subplot(*sp_idx)
         num_rows = ab.shape[0]
         xs = np.arange(.001, 1, .001)
         mark = ['-', ':', '--']
         # one curve per prior/posterior row of ab
         for row in range(num_rows):
             a, b = ab[row]
             vals = beta.pdf(xs, a, b)
             plt.plot(xs, vals, mark[row])
         axes = plt.gca()
         axes.set_xlim([0, 1])
         axes.set_ylim([0, 20])
         plt.title('{:d} h, {:d} t'.format(*tally))
         plt.xlabel(r'Bias weighting for heads $\mu$')
         plt.ylabel(r'$p(\mu|\{data\},I)$')
         plt.legend([r'$\alpha$={:g}, $\beta$={:g}'.format(a, b) for a, b in ab],
                    loc="upper right", prop={'size': 6})
1.1.1 Part 1: Behavior for first 5 flips [5 points]
In this part, you will perform a total of 5 flips. You will produce a 3-by-2 grid of subplots: one
for the prior and one for the posterior after each flip.
First:
1. Initialize 3-by-2 matrix ab, which first contains the initial hyperparameters of the beta distri-
butions, as a numpy array
2. Initialize numpy array tally to store counts of heads and tails as [#heads, #tails]. HINT:
Initially there are no heads or tails counted
[3]: #initializing initial hyperparameters ab as a matrix
ab = np.array(...)
Now we will simulate 5 flips with bias of 𝜇 = 0.25 and update the tally. To do this we must:
1. Initialize probability matrix p to contain the probabilities of heads and tails as [P(heads),
P(tails)].
2. Initialize flips_tally as a 5-by-2 numpy array of zeros to store the CUMULATIVE
heads/tails counts at each flip.
3. Simulate 5 coin flips using np.random.choice and update flips_tally accordingly.
[6]: # Initialize probability distribution
p = ...
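For reference, with the stated bias of 𝜇 = 0.25, these two initializations could look like the sketch below (one possible form, not the only acceptable one):

```python
import numpy as np

# [P(heads), P(tails)] for a coin with a 1/4 chance of heads
p = np.array([0.25, 0.75])

# cumulative [#heads, #tails] after each of the 5 flips, initially all zero
flips_tally = np.zeros((5, 2), dtype=int)
print(p, flips_tally.shape)
```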
Run the loop below ONCE to simulate 5 coin flips and print the updated flips_tally.
[8]: # Simulate 5 coin flips while updating flips_tally.
np.random.seed(1)
for i in range(5):
    coin_flip_outcome = np.random.choice([1, 0], p=p)
    if coin_flip_outcome == 1:
        flips_tally[i, 0] = 1
    else:
        flips_tally[i, 1] = 1
    # add the previous cumulative counts (for i = 0, row -1 is still all zeros)
    flips_tally[i] += flips_tally[i - 1]
# Print flips_tally after the coin flips
print(flips_tally)
[[0 1]
[0 2]
[1 2]
[1 3]
[2 3]]
Now complete the loop below to update and plot the posterior distributions after each flip. After
each subsequent flip, store the updated posterior hyperparameters ab in the variable updated_ab:
[9]: sp_idx
[9]: (3, 2, 1)
# For each flip, update ab using the Bayesian update rule, then use
# plotbetapdfs with sp_idx to plot the distributions:
[[ 1.   2. ]
 [ 0.5  1.5]
 [50.  51. ]]
[[ 1.   3. ]
 [ 0.5  2.5]
 [50.  52. ]]
[[ 2.   3. ]
 [ 1.5  2.5]
 [51.  52. ]]
[[ 2.   4. ]
 [ 1.5  3.5]
 [51.  53. ]]
[[ 3.   4. ]
 [ 2.5  3.5]
 [52.  53. ]]
[11]: assert updated_ab.shape == (3, 2)
### BEGIN HIDDEN TESTS
assert (updated_ab == np.array([[ 3. ,  4. ], [ 2.5,  3.5], [52. , 53. ]])).all()
### END HIDDEN TESTS
# Plot initial prior distributions using plotbetapdfs with the above sp_idx.
# Fill out and uncomment the below line
# plotbetapdfs(...)
flips_tally = ...
tally = np.array([0, 0])
# Simulate coin flips
np.random.seed(1)
for i in range(len(intervals)):
    # calculate number of flips as difference between successive elements in
    # intervals (the first round is simply intervals[0] flips)
    num_flips = intervals[i] - intervals[i - 1] if i > 0 else intervals[i]
    # simulate num_flips coin flips and accumulate num heads and tails
    flips = np.random.choice([1, 0], size=num_flips, p=p)
    heads = np.sum(flips)
    tails = num_flips - heads
    tally += np.array([heads, tails])
    # update ab by adding the observed counts to each prior's hyperparameters
    updated_ab = ab + np.expand_dims(tally, axis=0)
    # plot
    plotbetapdfs(updated_ab, sp_idx, tally)
[13]: assert intervals[0] == 2 and intervals[-1] == 2048 and len(intervals) == 11 and sum(intervals) == 4094
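The assert above pins down one natural choice for intervals: the powers of two from 2 up to 2048, read as cumulative flip counts. A sketch:

```python
# cumulative flip counts: 2, 4, 8, ..., 2048 (eleven powers of two)
intervals = [2 ** k for k in range(1, 12)]
print(intervals[0], intervals[-1], len(intervals), sum(intervals))
```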
3. What does having a=b in our priors mean? [0.8 pts]
A. I don’t know
B. It means that we do not believe that heads or tails are more likely than each other
[18]: ans_3 = ... # 'A', 'B'
### BEGIN SOLUTION
ans_3 = "B"
### END SOLUTION
4. If we believed that heads was more likely than tails, how should you set a and b? [0.8 pts]
A. a > b
B. a < b
C. a = b
[20]: ans_4 = ... # 'A', 'B', 'C'
### BEGIN SOLUTION
ans_4 = "A"
### END SOLUTION
5. Why do the plots of the posterior distributions after several thousand flips (for the different
priors) look so similar? [0.8 pts]
A. Because with lots of data, the prior becomes less important
B. Because we made a coding error
[22]: ans_5 = ... # 'A', 'B'
### BEGIN SOLUTION
ans_5 = "A"
### END SOLUTION
file scaledfaithful.txt. Your task is to provide code for the functions called by runKMeans:
• calcSqDistances (Possibly helpful functions: np.dot, np.sum)
• determineRnk (Possibly helpful functions: np.argmin, np.eye, .shape)
• recalcMus (Possibly helpful functions: np.dot, np.divide, np.sum)
Note: If you find that you have written much more than 10 lines for any of the above functions,
you should try to rethink your approach.
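As one possible vectorized approach (a sketch, not necessarily the intended solution), each helper fits in a few lines:

```python
import numpy as np

def calcSqDistances(X, Kmus):
    # N-by-K matrix of squared distances, via the expansion
    # ||x - mu||^2 = ||x||^2 - 2 x.mu + ||mu||^2
    return (np.sum(X ** 2, axis=1)[:, None]
            - 2 * X.dot(Kmus.T)
            + np.sum(Kmus ** 2, axis=1)[None, :])

def determineRnk(sqDmat):
    # one-hot assignment of each point to its nearest mean
    return np.eye(sqDmat.shape[1])[np.argmin(sqDmat, axis=1)]

def recalcMus(X, Rnk):
    # each new mean is the average of the points assigned to it
    return np.divide(Rnk.T.dot(X), np.sum(Rnk, axis=0)[:, None])

# tiny check: two well-separated pairs of points and two means
X = np.array([[0., 0.], [0., 1.], [10., 10.], [10., 11.]])
Kmus = np.array([[0., 0.], [10., 10.]])
Rnk = determineRnk(calcSqDistances(X, Kmus))
print(recalcMus(X, Rnk))
```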
[24]: # Defining the plotting helper function
def plotCurrent(X, Rnk, Kmus):
    N, D = np.shape(X)
    K = np.shape(Kmus)[0]
    # InitColorMat (defined elsewhere in the notebook) holds one RGB row per cluster
    KColorMat = InitColorMat[0:K]
    # color each data point by its assigned cluster, and each mean by its own color
    colorVec = Rnk.dot(KColorMat)
    muColorVec = np.eye(K).dot(KColorMat)
    plt.scatter(X[:, 0], X[:, 1], c=colorVec)
    plt.scatter(Kmus[:, 0], Kmus[:, 1], c=muColorVec, marker='D', s=80)
[28]: def runKMeans(K, fileString):
    # load the N-by-D data matrix (one data vector per row)
    X = np.loadtxt(fileString)
    # initialize the K mu vectors to K randomly chosen data vectors
    Kmus = X[np.random.choice(X.shape[0], K, replace=False)]
    fig = plt.gcf()
    for _ in range(1000):
        # sqDmat will be an N-by-K matrix with the n,k entry as specified above:
        # the squared distance from the nth data vector to the kth mu vector
        sqDmat = calcSqDistances(X, Kmus)
        # Rnk: N-by-K one-hot matrix assigning each point to its nearest mean
        Rnk = determineRnk(sqDmat)
        KmusOld = Kmus
        plotCurrent(X, Rnk, Kmus)
        plt.show()
        # recompute each mean from its assigned points
        Kmus = recalcMus(X, Rnk)
        # stop once the means have converged
        if sum(abs(KmusOld.flatten() - Kmus.flatten())) < 1e-6:
            break
    plotCurrent(X, Rnk, Kmus)
    return Kmus
[30]: assert Kmus.shape == (4,2)