Maths Assignment
\[ \mathbb{E}[L_i] = 1 \cdot p + 0 \cdot (1 - p) = p. \qquad (1) \]
Indeed, by definition, the expected value is the sum of the possible outcomes weighted by their
probabilities. Since $B_n$ is the sum of $n$ independent and identically distributed (i.i.d.)
Bernoulli RVs $L_i$, we can derive its expected value from (1) by linearity:
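\[
\begin{aligned}
B_n = \sum_{i=1}^{n} L_i \;\Rightarrow\; \mathbb{E}[B_n] &= \mathbb{E}\left[\sum_{i=1}^{n} L_i\right] = \sum_{i=1}^{n} \mathbb{E}[L_i] \quad \text{[from the linearity of expectation]} \\
&= \sum_{i=1}^{n} p = np.
\end{aligned}
\]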
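A quick numerical sanity check of $\mathbb{E}[B_n] = np$ can be sketched as follows. The values of n, p, and num_trials are arbitrary choices made for illustration and are not part of the assignment.

```matlab
% Monte Carlo check of E[B_n] = n*p (illustrative values, not from the assignment).
n = 20;             % number of Bernoulli trials per binomial sample (assumed)
p = 0.25;           % success probability (assumed)
num_trials = 1e5;   % number of simulated binomial samples (assumed)

L = rand(num_trials, n) < p;   % each entry is 1 with probability p, else 0
B = sum(L, 2);                 % each row sum is one draw of B_n

fprintf('sample mean = %.4f, n*p = %.4f\n', mean(B), n * p);
```

The sample mean should land close to n*p, with the gap shrinking as num_trials grows.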
We can proceed in a similar way for the variance. First, we compute some second-order moments
of Bernoulli RVs. Explicitly,
\[ \mathbb{E}[L_i^2] = 1^2 \cdot p + 0^2 \cdot (1 - p) = p, \]
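The remaining steps toward the variance can be sketched as follows. This assumes the standard identity $\operatorname{Var}[X] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$ and uses the independence of the $L_i$, so that the variance of the sum is the sum of the variances. The single-trial result agrees with the value $p(1 - p)$ quoted in part D.

```latex
\begin{align*}
\operatorname{Var}[L_i] &= \mathbb{E}[L_i^2] - \left(\mathbb{E}[L_i]\right)^2 = p - p^2 = p(1 - p), \\
\operatorname{Var}[B_n] &= \sum_{i=1}^{n} \operatorname{Var}[L_i] = n p (1 - p).
\end{align*}
```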
A MATLAB script that computes the values of the binomial pmf is presented below.
% Takes vector k and scalars n and p and returns a vector of the same size
% as k, whose entries represent the probability of a binomial RV with
% parameters n and p at the points of the elements of k.

function pmf = binomial_pmf(k, n, p)

pmf = zeros(size(k)); % Initialize output

for i = 1:length(k)
    % Check that the element of k is in the support of the binomial(n,p).
    % Otherwise, pmf = 0.
    if (k(i) >= 0) && (k(i) <= n)
        pmf(i) = nchoosek(n,k(i)) * p^k(i) * (1-p)^(n-k(i));
    end
end

end
You could write a similar function that replaces nchoosek with the function factorial, but be aware
that factorial is only accurate for arguments up to 21. This is because MATLAB represents numbers in
double precision, which has roughly 15 “usable” digits. The function nchoosek uses different
approximations for large n.
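One way to sidestep large factorials altogether is to work in the log domain. The sketch below is an alternative and not the approach used in binomial_pmf.m: it relies on MATLAB's gammaln, for which gammaln(m + 1) = log(m!), and assumes 0 < p < 1. The values of n and p are illustrative.

```matlab
% Sketch: binomial pmf in the log domain vs. a factorial-based version.
n = 50;      % illustrative value (assumed)
p = 0.1;     % illustrative value (assumed), must satisfy 0 < p < 1
k = 0:n;

% log of n! / ((n-k)! k!) computed without forming any factorial explicitly
log_coeff = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1);
pmf_log = exp(log_coeff + k .* log(p) + (n - k) .* log(1 - p));

% Same quantity with factorial, which is inexact for arguments above 21
pmf_fact = factorial(n) ./ (factorial(k) .* factorial(n - k)) ...
    .* p.^k .* (1 - p).^(n - k);

fprintf('sum of log-domain pmf = %.12f\n', sum(pmf_log));
fprintf('largest absolute difference = %.3e\n', max(abs(pmf_log - pmf_fact)));
```

For moderate n the two versions should agree closely; the log-domain form is mainly useful when n is large enough for the factorials to overflow.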
We can use binomial_pmf.m to plot the pmfs and cdfs requested in part A. Notice that we use
stem to show the pmf and stairs for the cdf, since these are discrete RVs.
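A plotting script along these lines might look like the sketch below. The values n = 6, 10, 20, 50 and p = 5/n come from the captions of Figures 1 and 2; the range k = 0, ..., 50 and the 2 × 2 subplot layout are assumptions. The anonymous function single_pmf is a hypothetical inline stand-in for binomial_pmf.m so that the sketch runs on its own.

```matlab
% Sketch: binomial pmfs (stem) and cdfs (stairs) for part A.
lambda = 5;                  % p = lambda/n, as in the captions of Figures 1 and 2
n_vector = [6 10 20 50];
k = 0:50;                    % plotting range (assumed)

% Hypothetical inline stand-in for binomial_pmf.m (scalar kk only).
single_pmf = @(kk, n, p) nchoosek(n, kk) * p^kk * (1 - p)^(n - kk);

for j = 1:numel(n_vector)
    n = n_vector(j);
    p = lambda / n;

    pmf = zeros(size(k));
    in_support = (k <= n);   % k >= 0 holds by construction
    pmf(in_support) = arrayfun(@(kk) single_pmf(kk, n, p), k(in_support));
    binomial_cdf = cumsum(pmf);   % cdf at integer k is the running sum of the pmf

    figure(1); subplot(2, 2, j);
    stem(k, pmf);
    title(sprintf('n = %d', n)); xlabel('k'); ylabel('pmf');

    figure(2); subplot(2, 2, j);
    stairs(k, binomial_cdf);
    title(sprintf('n = %d', n)); xlabel('k'); ylabel('cdf');
end
```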
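B Binomial and Poisson distributions. The expected value of a Poisson RV $P$ with parameter $\lambda$ is given by
\[
\begin{aligned}
\mathbb{E}[P] &= \sum_{k=0}^{\infty} k \cdot \frac{e^{-\lambda} \lambda^{k}}{k!} = \lambda e^{-\lambda} \sum_{k=0}^{\infty} \frac{k \lambda^{k-1}}{k!} = \lambda e^{-\lambda} \frac{d}{d\lambda} \left( \sum_{k=0}^{\infty} \frac{\lambda^{k}}{k!} \right) \\
&= \lambda e^{-\lambda} \frac{d}{d\lambda} e^{\lambda} = \lambda e^{-\lambda} e^{\lambda} = \lambda.
\end{aligned}
\]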
Notice that the first equality in the second line comes from recognizing that the infinite sum is
the Taylor series of the exponential.
The MATLAB code to plot the Poisson distribution is shown below.
The result can be found in Figure 3. Note the similarity with Figure 1 for large n.
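A minimal sketch of a script that plots the Poisson pmf is given here, assuming λ = 5 and k = 0, ..., 50 as in the caption of Figure 3. It evaluates the pmf through gammaln rather than factorial, which is a choice made for this sketch and not necessarily what the assignment's script does.

```matlab
% Sketch: pmf of a Poisson RV with lambda = 5 on k = 0, ..., 50.
lambda = 5;
k = 0:50;

% Evaluate exp(-lambda) * lambda^k / k! in the log domain; gammaln(k + 1) = log(k!).
poisson_pmf = exp(-lambda + k .* log(lambda) - gammaln(k + 1));

stem(k, poisson_pmf);
xlabel('k');
ylabel('pmf');
grid on;

% The truncated sums should be close to 1 and to lambda, respectively.
fprintf('sum of pmf = %.6f, mean = %.6f\n', ...
    sum(poisson_pmf), sum(k .* poisson_pmf));
```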
To calculate the MSE, we first check a few values of the Poisson pmf to find the cut-off point for
the exercise. Simple trial and error shows that we need only consider values of k ∈ [3, 9]. Hence,
we can use the following code to evaluate and plot the MSE.
plot(n_vector, mse, 'x', 'LineWidth', 2);
title('MSE between Poisson and binomial pmfs')
xlabel('n');
ylabel('MSE');
grid;


disp(mse);


%%% Export figure %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
set(gcf,'Color','w');
export_fig('-q101', '-pdf', 'HW2_B2.pdf');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
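The plotting commands above expect the variables for the n values and the MSE to be defined already. A minimal sketch of how they might be computed follows, assuming λ = 5, k = 3, ..., 9 as stated above, and taking the MSE to be the average squared difference between the two pmfs over those k. The exact definition used in the exercise is not restated here, so the numbers need not coincide with Table 1.

```matlab
% Sketch: MSE between the Poisson(lambda) pmf and binomial(n, lambda/n) pmfs.
lambda = 5;
n_vector = [6 10 20 50];
k = 3:9;                     % range of k taken from the text

% Poisson pmf on k, evaluated in the log domain (gammaln(k + 1) = log(k!)).
poisson_pmf = exp(-lambda + k .* log(lambda) - gammaln(k + 1));

mse = zeros(size(n_vector));
for j = 1:numel(n_vector)
    n = n_vector(j);
    p = lambda / n;

    % Binomial pmf on k; entries with k > n stay at zero.
    binomial_values = zeros(size(k));
    in_support = (k <= n);
    binomial_values(in_support) = arrayfun( ...
        @(kk) nchoosek(n, kk) * p^kk * (1 - p)^(n - kk), k(in_support));

    % Assumed definition: mean of the squared pmf differences over k.
    mse(j) = mean((poisson_pmf - binomial_values).^2);
end

disp(mse);
```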
Table 1: MSE between the pmf of a Poisson with λ = 5 and binomials with parameters (n, λ/n)
for n = 6, 10, 20, 50.
n MSE
6 0.01684
10 0.00168
20 0.00025
50 0.00003
The table above shows that the MSE between the binomial and Poisson distributions rapidly
decreases to zero as n increases, indicating that the distributions become increasingly similar for
large n.
C Binomial and Poisson distributions again. We start by writing out the pmf of Bn .
Explicitly,
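\[ \mathbb{P}[B_n = k] = \frac{n!}{(n-k)!\,k!}\, p^{k} (1 - p)^{n-k} = \frac{n!}{(n-k)!\,k!} \left( \frac{\lambda}{n} \right)^{k} \left( 1 - \frac{\lambda}{n} \right)^{n-k}. \]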
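Then, notice that we can rearrange the first two terms to read
\[ \mathbb{P}[B_n = k] = \frac{\lambda^{k}}{k!} \cdot \frac{n}{n} \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdots \frac{n-k+1}{n} \left( 1 - \frac{\lambda}{n} \right)^{n-k}. \]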
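Finally, we can take the limit as n → ∞. Recall that the limit of a product is the product of the
limits as long as all limits exist, which is the case here:
\[
\begin{aligned}
\lim_{n \to \infty} \mathbb{P}[B_n = k] &= \frac{\lambda^{k}}{k!} \lim_{n \to \infty} \frac{n}{n} \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdots \frac{n-k+1}{n} \left( 1 - \frac{\lambda}{n} \right)^{n-k} \\
&= \frac{\lambda^{k}}{k!} \, (1 \times 1 \times \cdots \times 1) \lim_{n \to \infty} \left( 1 - \frac{\lambda}{n} \right)^{n} \lim_{n \to \infty} \left( 1 - \frac{\lambda}{n} \right)^{-k}.
\end{aligned}
\]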
The last limit is simply $1^{-k} = 1$. Also, by definition, we have
\[ \lim_{n \to \infty} \left( 1 - \frac{\lambda}{n} \right)^{n} = e^{-\lambda}. \]
Thus,
\[ \lim_{n \to \infty} \mathbb{P}[B_n = k] = \frac{\lambda^{k}}{k!} e^{-\lambda} = \mathbb{P}[P = k]. \]
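The limit can also be observed numerically. The sketch below fixes k and evaluates the binomial pmf with p = λ/n for growing n; the choices λ = 5, k = 3, and the particular values of n are assumptions made for illustration.

```matlab
% Sketch: binomial(n, lambda/n) pmf at a fixed k approaches the Poisson pmf.
lambda = 5;                        % assumed, matching part B
k = 3;                             % arbitrary fixed point (assumed)
n_vector = [10 100 1000 10000];    % increasing n (assumed)

poisson_value = exp(-lambda) * lambda^k / factorial(k);

abs_error = zeros(size(n_vector));
for j = 1:numel(n_vector)
    n = n_vector(j);
    p = lambda / n;
    binomial_value = nchoosek(n, k) * p^k * (1 - p)^(n - k);
    abs_error(j) = abs(binomial_value - poisson_value);
    fprintf('n = %5d: binomial = %.6f, Poisson = %.6f, |difference| = %.2e\n', ...
        n, binomial_value, poisson_value, abs_error(j));
end
```

The printed differences should shrink as n grows, in line with the limit derived above.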
D Binomial and normal distributions. If $Z_n$ is a normal RV with zero mean and unit
variance (also known as a standard normal), then from
\[ Z_n = \frac{\sum_{i=1}^{n} Y_i - n\mu}{\sigma \sqrt{n}}, \]
it holds that $Y_n = \sum_{i=1}^{n} Y_i$ is also normally distributed but with mean $n\mu = np$ and variance $n\sigma^2$,
where $\mu = p$ and $\sigma^2 = p(1 - p)$ are the mean and variance of a single Bernoulli $Y_i$. The MATLAB code to display this
approximation is given below.
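A minimal sketch of such a comparison follows. The values n = 10, 20, 50 are the panel titles of Figure 5; the success probability p = 0.5 is a placeholder assumed for illustration, since p is not stated in this part. The normal cdf is written with erf so that no toolbox function is needed.

```matlab
% Sketch: binomial cdf vs. normal cdf with matching mean and variance (part D).
p = 0.5;                 % success probability: assumed, not given in this part
n_vector = [10 20 50];   % panel values shown in Figure 5

for j = 1:numel(n_vector)
    n = n_vector(j);
    k = 0:n;

    % Binomial cdf as the running sum of the pmf.
    pmf = arrayfun(@(kk) nchoosek(n, kk) * p^kk * (1 - p)^(n - kk), k);
    binomial_cdf = cumsum(pmf);

    % Normal cdf with mean n*p and variance n*p*(1-p).
    mu = n * p;
    sigma = sqrt(n * p * (1 - p));
    x = linspace(0, n, 500);
    normal_cdf = 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))));

    subplot(3, 1, j);
    stairs(k, binomial_cdf); hold on;
    plot(x, normal_cdf); hold off;
    title(sprintf('n=%d', n)); xlabel('k'); ylabel('cdf');
    legend('Binomial', 'Normal', 'Location', 'southeast');
end
```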
E Normal and Poisson approximations. I’ll provide you with a hint for this part: the
Poisson limit theorem is about counting a large number of increasingly improbable events. In
particular, note that for the distribution of a sum of i.i.d. Bernoulli RVs (i.e., a binomial RV) to
converge to a Poisson distribution with mean λ, the probability of success of each Bernoulli trial
must be p = λ/n, which goes to zero as n → ∞. On the other hand, p is fixed for the CLT. Hence,
the CLT and the Poisson limit theorem address fundamentally different limits. I leave further
contemplation on this matter to you!
[Plot: four panels titled n = 6, n = 10, n = 20, n = 50; horizontal axis k, vertical axis pmf.]
Figure 1: Binomial pmf for n = 6, 10, 20, 50 and p = 5/n (part A).
[Plot: four panels titled n = 6, n = 10, n = 20, n = 50; horizontal axis k, vertical axis cdf.]
Figure 2: Binomial cdf for n = 6, 10, 20, 50 and p = 5/n (part A).
[Plot: horizontal axis k (0 to 50), vertical axis pmf.]
Figure 3: The pmf of a Poisson RV with λ = 5 (part B). Note that the support of the Poisson
distribution is the whole non-negative integer line. However, we display only the first 50 points.
[Plot titled "MSE between Poisson and binomial pmfs": horizontal axis n, vertical axis MSE.]
Figure 4: MSE between the pmf of a Poisson with λ = 5 and binomials with parameters (n, λ/n)
for n = 6, 10, 20, 50 (part B).
[Plot: three panels titled n=10, n=20, n=50, each showing a Binomial curve and a Normal curve; horizontal axis k, vertical axis cdf.]
Figure 5: Cumulative distribution function of the binomial RV Yn for n = 10, 20, 50 and its
approximation by a normal cdf (part D).