
Chapter 2 Concentration of sums of independent random variables

Desh Raj

June 2, 2018

Exercise 2.1.4.

Solution Let X ∼ N(0, 1) and let g denote its density, g(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}. Differentiating gives g'(t) = -t g(t), so an antiderivative of x g(x) is -g(x). Then

E X^2 \mathbf{1}_{X > t} = \int_t^\infty x^2 g(x)\,dx
    = \int_t^\infty x \cdot (x g(x))\,dx
    = \bigl[-x g(x)\bigr]_t^\infty + \int_t^\infty g(x)\,dx    (Integration by parts)
    = t g(t) + P(X > t).    (The boundary term at \infty vanishes since x g(x) \to 0)

This is the desired result.
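
As a quick numerical sanity check (a minimal sketch in Python, assuming numpy and scipy are available), both sides of the identity can be evaluated directly:

```python
import numpy as np
from scipy import integrate, stats

t = 1.3  # an arbitrary threshold
lhs, _ = integrate.quad(lambda x: x**2 * stats.norm.pdf(x), t, np.inf)  # E[X^2 1_{X>t}]
rhs = t * stats.norm.pdf(t) + stats.norm.sf(t)                          # t g(t) + P(X > t)
print(lhs, rhs)  # the two values agree to numerical precision
```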

Exercise 2.2.3. Bounding the hyperbolic cosine

Solution We have

\cosh(x) = \frac{e^x + e^{-x}}{2}
    = \frac{1}{2}\left(\sum_{i=0}^\infty \frac{x^i}{i!} + \sum_{i=0}^\infty \frac{(-x)^i}{i!}\right)    (Using the Taylor series expansion)
    = \sum_{i=0}^\infty \frac{x^{2i}}{(2i)!}
    \le \sum_{i=0}^\infty \frac{x^{2i}}{2^i\, i!}    (Since (2i)! \ge 2^i\, i!: after dividing out i!, each of the i remaining factors of (2i)! is at least 2)
    = \exp\!\left(\frac{x^2}{2}\right).    (Again using the Taylor series expansion)
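
A one-line numerical check of the bound (a sketch assuming numpy is available):

```python
import numpy as np

x = np.linspace(-10, 10, 2001)
assert np.all(np.cosh(x) <= np.exp(x**2 / 2))  # cosh(x) <= exp(x^2/2) on this grid
```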

Exercise 2.2.7.

Solution We are given independent random variables X_1, \dots, X_N with X_i \in [m_i, M_i]. Let S_N = X_1 + \dots + X_N. Then, for t > 0 and any s > 0, we have

P\{S_N - E S_N \ge t\} = P\{e^{s(S_N - E S_N)} \ge e^{st}\}
    \le \frac{E\, e^{s(S_N - E S_N)}}{e^{st}}    (Using Markov's inequality)
    = e^{-st} \prod_{i=1}^N E\, e^{s(X_i - E X_i)}    (By independence)
    \le e^{-st} \prod_{i=1}^N \exp\!\left(\frac{s^2 (M_i - m_i)^2}{8}\right)    (Using Hoeffding's lemma, since E(X_i - E X_i) = 0)
    = \exp\!\left(-st + \frac{s^2}{8}\sum_{i=1}^N (M_i - m_i)^2\right).

Let g(s) = \exp\!\left(-st + \frac{s^2}{8}\sum_{i=1}^N (M_i - m_i)^2\right). Then g achieves its minimum at s = \frac{4t}{\sum_{i=1}^N (M_i - m_i)^2}. Substituting this value into the inequality, we get

P\{S_N - E S_N \ge t\} \le \exp\!\left(-\frac{2t^2}{\sum_{i=1}^N (M_i - m_i)^2}\right).
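
A Monte Carlo illustration of the bound (a sketch assuming numpy is available), with X_i uniform on [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials, t = 50, 200_000, 5.0
m, M = np.zeros(N), np.ones(N)              # X_i ~ Uniform[0, 1], so m_i = 0, M_i = 1
S = rng.uniform(m, M, size=(trials, N)).sum(axis=1)
empirical = np.mean(S - N * 0.5 >= t)       # P{S_N - E S_N >= t}, estimated
bound = np.exp(-2 * t**2 / np.sum((M - m) ** 2))
print(empirical, bound)                     # the empirical tail should sit below the bound
```
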
Exercise 2.2.8.

Solution Let X_i be the indicator (Bernoulli) random variable of the event that the answer is wrong in the i-th run. Then E X_i = \frac{1}{2} - \delta and X_i \in [0, 1]. We need the final (majority-vote) answer to be wrong with probability at most \varepsilon, i.e., P(\text{majority of decisions are wrong}) \le \varepsilon. From Hoeffding's inequality, we have

P\left(\sum_{i=1}^N (X_i - E X_i) \ge t\right) \le \exp\!\left(-\frac{2t^2}{\sum_{i=1}^N (M_i - m_i)^2}\right) = \exp\!\left(-\frac{2t^2}{N}\right).    (Since M_i = 1 and m_i = 0)

For the final answer to be wrong, at least half of the individual answers must be wrong, i.e.,

\sum_{i=1}^N X_i \ge \frac{N}{2}
\quad\Longrightarrow\quad \sum_{i=1}^N \left(X_i - \left(\frac{1}{2} - \delta\right)\right) \ge \delta N.

Plugging t = \delta N into the earlier inequality, we get

P\left(\sum_{i=1}^N \left(X_i - \left(\frac{1}{2} - \delta\right)\right) \ge \delta N\right) \le \exp(-2N\delta^2).

For this probability to be bounded by \varepsilon, we require

\exp(-2N\delta^2) \le \varepsilon
\quad\Longrightarrow\quad -2N\delta^2 \le \log(\varepsilon)    (Taking logs on both sides)
\quad\Longrightarrow\quad N \ge \frac{\log(\varepsilon^{-1})}{2\delta^2}.
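
A small simulation of the boosting scheme (a sketch assuming numpy is available), checking that N = \lceil\log(1/\varepsilon)/(2\delta^2)\rceil repetitions drive the failure rate below \varepsilon:

```python
import numpy as np

rng = np.random.default_rng(0)
delta, eps = 0.1, 0.05
N = int(np.ceil(np.log(1 / eps) / (2 * delta**2)))  # repetitions suggested by the bound
trials = 100_000
wrong = rng.random((trials, N)) < (0.5 - delta)      # X_i = 1 iff the i-th run is wrong
majority_wrong = wrong.sum(axis=1) >= N / 2
print(N, majority_wrong.mean(), eps)                 # empirical failure rate vs. target eps
```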

Exercise 2.2.9.
Solution Part 1
We want to bound the probability that the sample mean deviates from the true mean by more than \varepsilon. Using Chebyshev's inequality, we have

P\left(\left|\frac{1}{N}\sum_{i=1}^N X_i - \mu\right| \ge \varepsilon\right) \le \frac{\mathrm{var}\!\left(\frac{1}{N}\sum_{i=1}^N X_i\right)}{\varepsilon^2}.    (1)

Since the samples are drawn i.i.d., we have

\mathrm{var}(X_1 + \dots + X_N) = \mathrm{var}(X_1) + \dots + \mathrm{var}(X_N) = N\sigma^2    (Since \mathrm{var}(X_i) = \sigma^2)
\quad\Longrightarrow\quad \frac{1}{N^2}\,\mathrm{var}(X_1 + \dots + X_N) = \frac{\sigma^2}{N}    (Dividing both sides by N^2)
\quad\Longrightarrow\quad \mathrm{var}\!\left(\frac{1}{N}\sum_{i=1}^N X_i\right) = \frac{\sigma^2}{N}.

Plugging this value into (1), we get

P\left(\left|\frac{1}{N}\sum_{i=1}^N X_i - \mu\right| \ge \varepsilon\right) \le \frac{\sigma^2}{N\varepsilon^2}.

We want an \varepsilon-accurate estimate with probability at least 3/4, i.e., the error probability should be at most 1/4. Using this bound, we need \frac{\sigma^2}{N\varepsilon^2} \le \frac{1}{4}, which gives N = O\!\left(\frac{\sigma^2}{\varepsilon^2}\right).
Part 2
Using Theorem 2.2.6 with X_i \in [-\sigma, \sigma], we get

P\left(\frac{1}{N}\sum_{i=1}^N X_i - \mu \ge \varepsilon\right) = P\left(\sum_{i=1}^N (X_i - \mu) \ge N\varepsilon\right)    (Multiplying both sides by N)
    \le \exp\!\left(-\frac{2N^2\varepsilon^2}{\sum_{i=1}^N (2\sigma)^2}\right)    (From Theorem 2.2.6)
    = \exp\!\left(-\frac{2N^2\varepsilon^2}{4N\sigma^2}\right)
    = \exp\!\left(-\frac{N\varepsilon^2}{2\sigma^2}\right).

We want an \varepsilon-accurate estimate with probability at least 1 - \delta, i.e., the error probability should be at most \delta. This means that

\exp\!\left(-\frac{N\varepsilon^2}{2\sigma^2}\right) \le \delta
\quad\Longrightarrow\quad -\frac{N\varepsilon^2}{2\sigma^2} \le \log(\delta)    (Taking logs on both sides)
\quad\Longrightarrow\quad N = O\!\left(\log(\delta^{-1})\,\frac{\sigma^2}{\varepsilon^2}\right),

which is the desired result.
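
A quick simulation of the Part 2 bound for variables bounded in [-\sigma, \sigma] (a sketch assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, eps, delta = 1.0, 0.2, 0.01
N = int(np.ceil(2 * sigma**2 * np.log(1 / delta) / eps**2))  # sample size from the bound
trials = 50_000
X = rng.uniform(-sigma, sigma, size=(trials, N))             # mean 0, bounded in [-sigma, sigma]
failure = np.mean(X.mean(axis=1) >= eps)                     # one-sided deviation probability
print(N, failure, delta)                                     # empirical failure rate vs. delta
```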

Exercise 2.2.10.

Solution Part 1
Let f_{X_i} denote the probability density function of X_i. Since X_i is non-negative, its MGF evaluated at -t is

M_{X_i}(-t) = \int_0^\infty e^{-tx} f_{X_i}(x)\,dx.

Since e^{-tx} \ge 0 and f_{X_i}(x) \le \max f_{X_i} = 1, we have e^{-tx} f_{X_i}(x) \le e^{-tx}. Using this in the above equation, we get

M_{X_i}(-t) \le \int_0^\infty e^{-tx}\,dx = \frac{1}{t}.

Part 2
We have

P\left(\sum_{i=1}^N X_i \le \varepsilon N\right) = P\left(\sum_{i=1}^N \left(-\frac{X_i}{\varepsilon}\right) \ge -N\right)    (Dividing by -\varepsilon)
    = P\left(\exp\!\left(-\sum_{i=1}^N \frac{X_i}{\varepsilon}\right) \ge \exp(-N)\right)    (Since exp is monotonically increasing)
    \le e^N\, E\,\exp\!\left(-\sum_{i=1}^N \frac{X_i}{\varepsilon}\right)    (Using Markov's inequality)
    = e^N \prod_{i=1}^N E\,\exp\!\left(-\frac{X_i}{\varepsilon}\right)    (By independence)
    \le e^N \prod_{i=1}^N \varepsilon    (Using the result of Part 1 with t = 1/\varepsilon)
    = (e\varepsilon)^N.
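
A numerical illustration (a sketch assuming numpy is available) with Uniform[0, 1] variables, whose density is bounded by 1:

```python
import numpy as np

rng = np.random.default_rng(0)
N, eps = 30, 0.35
X = rng.random((200_000, N))                    # Uniform[0, 1]: density bounded by 1
empirical = np.mean(X.sum(axis=1) <= eps * N)   # P{sum X_i <= eps * N}, estimated
bound = (np.e * eps) ** N
print(empirical, bound)                          # empirical probability vs. (e*eps)^N
```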

Exercise 2.3.2.
Solution Since t < \mu, the function f(x) = \left(\frac{t}{\mu}\right)^x is monotonically decreasing, so we can write

P(S_N \le t) = P\left(\left(\frac{t}{\mu}\right)^{S_N} \ge \left(\frac{t}{\mu}\right)^t\right)
    \le \frac{E\left(\frac{t}{\mu}\right)^{S_N}}{\left(\frac{t}{\mu}\right)^t}.    (By Markov's inequality)

For any \alpha > 0 we have

E\,\alpha^{S_N} = E\,\alpha^{\sum_{i=1}^N X_i} = \prod_{i=1}^N E\,\alpha^{X_i}
    = \prod_{i=1}^N \left(p_i\,\alpha + (1 - p_i)\,\alpha^0\right)    (Since X_i takes only the values 1 and 0, with probabilities p_i and 1 - p_i)
    = \prod_{i=1}^N \bigl(1 + (\alpha - 1) p_i\bigr)
    \le \prod_{i=1}^N \exp\bigl((\alpha - 1) p_i\bigr)    (Using 1 + x \le e^x)
    = \exp\!\left(\sum_{i=1}^N (\alpha - 1) p_i\right)
    = \exp\bigl((\alpha - 1)\mu\bigr).

Using this inequality with \alpha = t/\mu in the bound above, we get

P(S_N \le t) \le \exp(t - \mu)\left(\frac{\mu}{t}\right)^t = e^{-\mu}\left(\frac{e\mu}{t}\right)^t,

which is the desired result.
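
A Monte Carlo check of the lower-tail bound for a Binomial sum (a sketch assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, t = 200, 0.3, 40.0           # S_N ~ Binomial(N, p), mu = N*p = 60, and t < mu
mu = N * p
S = rng.binomial(N, p, size=200_000)
empirical = np.mean(S <= t)
bound = np.exp(-mu) * (np.e * mu / t) ** t
print(empirical, bound)            # the empirical lower tail should sit below the bound
```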

Exercise 2.3.3.

Solution Since X ∼ Pois(λ), the Poisson limit theorem lets us approximate X (in distribution) by a sum of N independent Bernoulli random variables X_1, \dots, X_N with parameters p_i = \lambda/N, for large N, so that \sum_i E X_i = \lambda. We can therefore apply the Chernoff concentration bound for sums of Bernoulli random variables to X, which gives the desired result.

Exercise 2.3.5.

Solution First, using t = (1 + \delta)\mu in the Chernoff bound for the upper tail, we get

P\{X - \mu \ge \delta\mu\} \le e^{-\mu}\left(\frac{e}{1+\delta}\right)^{(1+\delta)\mu}.    (2)

Now, using t = (1 - \delta)\mu in the Chernoff bound for the lower tail (proved in the previous problem), we get

P\{X - \mu \le -\delta\mu\} \le e^{-\mu}\left(\frac{e}{1-\delta}\right)^{(1-\delta)\mu}.    (3)

Adding (2) and (3), we get

P\{|X - \mu| \ge \delta\mu\} \le e^{-\mu}\left[\left(\frac{e}{1+\delta}\right)^{(1+\delta)\mu} + \left(\frac{e}{1-\delta}\right)^{(1-\delta)\mu}\right].

We can bound the two terms inside the brackets as follows:

\left(\frac{e}{1+\delta}\right)^{(1+\delta)\mu} = \exp\!\left[(1+\delta)\mu \log\frac{e}{1+\delta}\right]
    = \exp\!\left[\mu(1+\delta)\bigl(1 - \log(1+\delta)\bigr)\right]
    \le \exp\!\left[\mu(1+\delta)\left(1 - \frac{\delta}{1+\delta/2}\right)\right]    (Since \log(1+x) \ge \frac{x}{1+x/2})
    = \exp\!\left[\mu(1+\delta)\,\frac{1-\delta/2}{1+\delta/2}\right]
    = \exp\!\left[\mu\left(1 - \frac{\delta^2}{2+\delta}\right)\right]
    \le \exp\!\left[\mu\left(1 - \frac{\delta^2}{3}\right)\right]    (Since 2 + \delta \le 3 for \delta \le 1)

and

\left(\frac{e}{1-\delta}\right)^{(1-\delta)\mu} = \exp\!\left[(1-\delta)\mu \log\frac{e}{1-\delta}\right]
    = \exp\!\left[\mu(1-\delta)\bigl(1 - \log(1-\delta)\bigr)\right]
    \le \exp\!\left[\mu(1-\delta)\left(1 + \frac{\delta}{\sqrt{1-\delta}}\right)\right]    (Since \log(1-x) \ge \frac{-x}{\sqrt{1-x}})
    = \exp\!\left[\mu\bigl(1 - \delta + \delta\sqrt{1-\delta}\bigr)\right]
    \le \exp\!\left[\mu\left(1 - \frac{\delta^2}{3}\right)\right].    (Since \sqrt{1-\delta} \le 1 - \delta/2)

Using these bounds together with the prefactor e^{-\mu}, we get

P\{|X - \mu| \ge \delta\mu\} \le 2\,e^{-\mu}\exp\!\left[\mu\left(1 - \frac{\delta^2}{3}\right)\right] = 2\exp\!\left(-\frac{\mu\delta^2}{3}\right) \le 2 e^{-c\mu\delta^2},

with c = 1/3.
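
A Monte Carlo check of the two-sided bound (a sketch assuming numpy is available), with X a Binomial sum:

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, delta = 500, 0.2, 0.3
mu = N * p
S = rng.binomial(N, p, size=200_000)
empirical = np.mean(np.abs(S - mu) >= delta * mu)
bound = 2 * np.exp(-mu * delta**2 / 3)   # c = 1/3 from the derivation above
print(empirical, bound)
```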

Exercise 2.3.6.

Solution Since X ∼ Pois(λ), the Poisson limit theorem again lets us approximate X by a sum of N independent Bernoulli random variables X_1, \dots, X_N with parameters p_i = \lambda/N, for large N. Using the result of Exercise 2.3.5, we get

P\{|X - \lambda| \ge \delta\lambda\} \le 2\exp(-c\lambda\delta^2).    (4)

Now substituting t = \delta\lambda in (4), we get the desired result.

Exercise 2.3.8.

Solution We will first show that a sum of independent Poisson random variables is again Poisson. Let X and Y be independent Poisson random variables with means \lambda_1 and \lambda_2, respectively. Their probability generating functions are E\,s^X = e^{-\lambda_1(1-s)} and E\,s^Y = e^{-\lambda_2(1-s)}. By independence,

E\,s^{X+Y} = (E\,s^X)(E\,s^Y) = e^{-(\lambda_1+\lambda_2)(1-s)}.

Hence X + Y ∼ Pois(\lambda_1 + \lambda_2).
It is given that X ∼ Pois(\lambda). By the above result, X can therefore be written (for integer \lambda) as a sum of \lambda independent Pois(1) random variables. Now consider Z = \frac{X - \lambda}{\sqrt{\lambda}}. Since E X = \lambda and \mathrm{var}(X) = \lambda, we can apply the Central Limit Theorem to this sum as \lambda \to \infty, which gives the desired result.
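
A quick numerical illustration of the normal approximation (a sketch assuming numpy and scipy are available):

```python
import numpy as np
from scipy import stats

lam = 400
rng = np.random.default_rng(0)
Z = (rng.poisson(lam, size=200_000) - lam) / np.sqrt(lam)
for z in (0.5, 1.0, 2.0):
    print(z, np.mean(Z >= z), stats.norm.sf(z))  # empirical tail vs. standard normal tail
```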

Exercise 2.4.2.

Solution Consider a fixed vertex i. Using Markov's inequality, we can bound the probability that its degree d_i is larger than O(\log n) as

P\{d_i \ge O(\log n)\} \le \frac{E\,d_i}{O(\log n)} = c.

We can now use the union bound to bound the probability that such a vertex exists:

P\{\exists i : d_i \ge O(\log n)\} \le \sum_{i=1}^n P\{d_i \ge O(\log n)\} \le cn.

We can choose the constant in the O(\log n) threshold so that this probability is very small, and therefore the complementary event holds with high probability.

Exercise 2.4.3.

Solution As in the previous exercise, we can show that if the expected degree is d = O(1), then with high probability all vertices of G have degree O(1). Furthermore, O(1) \le O\!\left(\frac{\log n}{\log\log n}\right), so the stated bound follows.

Exercise 2.4.4.

Solution Since the degrees are not independent, let us define d'_i as the out-degree of the i-th vertex in the corresponding directed graph. Since the degree of a vertex is always at least its out-degree, it suffices to show that, with high probability, there exists an i such that d'_i \ge 10d. For this, we can assume that in the limit of large n the degree distribution (which is in fact Binomial) is approximated by a Poisson distribution. With this assumption, we can use the Poisson tail bound for an exact value (2.9).
Note: I have not been able to derive the exact result for this problem.

Exercise 2.4.5.

Solution Since the degrees approximately follow a Poisson distribution for large n, and the expected degrees are O(1), we have

P\{d_i = k\} = e^{-1}\,\frac{1^k}{k!} \ge \frac{e^{-1}}{k^k},    (5)

where the last inequality follows from Stirling's approximation.
For the given k = \frac{\log n}{\log\log n}, we have

\log k^k = k\log k = \frac{\log n}{\log\log n}\,(\log\log n - \log\log\log n) \approx \log n    (Since \log\log n \gg \log\log\log n)
\quad\Longrightarrow\quad k^k \approx n.

Using this approximation in (5) and taking a union bound over all vertices, we see that the required probability is at least of order e^{-1}, which is sufficiently large.

Exercise 2.5.1.

Solution We have, for even p,

E\,X^p = \int_{-\infty}^{\infty} x^p\,\frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx
    = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} x^{p-1}\,(x e^{-x^2/2})\,dx
    = \frac{1}{\sqrt{2\pi}}\left(\left[-x^{p-1}e^{-x^2/2}\right]_{-\infty}^{\infty} + (p-1)\int_{-\infty}^{\infty} x^{p-2} e^{-x^2/2}\,dx\right)    (Integration by parts)
    = \frac{p-1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} x^{p-2} e^{-x^2/2}\,dx    (Since the boundary term is 0)
    = (p-1)\,E\,X^{p-2}
    = (p-1)(p-3)\cdots(3)(1)    (Iterating, since E\,X^0 = 1)
    = \frac{p!}{2 \cdot 4 \cdots p}
    = \frac{p!}{2^{p/2}\,(p/2)!}.

Therefore

\|X\|_p = (E\,X^p)^{1/p} = \frac{1}{\sqrt{2}}\left(\frac{p!}{(p/2)!}\right)^{1/p},

and the equivalence to the desired result can be seen by using the definition \Gamma(z) = (z-1)!.

We can further write

\|X\|_p = (E\,X^p)^{1/p} = \frac{1}{\sqrt{2}}\left(\frac{p!}{(p/2)!}\right)^{1/p}
    = \frac{1}{\sqrt{2}}\left(p(p-1)\cdots\left(\frac{p}{2}+1\right)\right)^{1/p}
    \le \frac{1}{\sqrt{2}}\left(p^{p/2}\right)^{1/p}    (Since each of the p/2 factors in the product is at most p)
    = O(\sqrt{p})

as p \to \infty.
For the moment-generating function, we have

E\,\exp(\lambda X) = \int_{-\infty}^{\infty} e^{\lambda x}\,\frac{e^{-x^2/2}}{\sqrt{2\pi}}\,dx
    = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{1}{2}x^2 + \lambda x}\,dx
    = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(x^2 - 2\lambda x + \lambda^2) + \frac{1}{2}\lambda^2}\,dx    (Adding and subtracting \frac{1}{2}\lambda^2)
    = \frac{e^{\lambda^2/2}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{(x-\lambda)^2}{2}}\,dx
    = e^{\lambda^2/2},    (The remaining integrand is the density of N(\lambda, 1), which integrates to 1)

which is the desired result.
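
A numerical sanity check of the moment formula and the MGF (a sketch assuming numpy and scipy are available):

```python
import numpy as np
from scipy import integrate, special, stats

# Even moment: E X^p = p! / (2^(p/2) (p/2)!)
p = 6
moment, _ = integrate.quad(lambda x: x**p * stats.norm.pdf(x), -np.inf, np.inf)
print(moment, special.factorial(p) / (2 ** (p // 2) * special.factorial(p // 2)))  # both equal 15

# MGF: E exp(lambda X) = exp(lambda^2 / 2)
lam = 0.7
mgf, _ = integrate.quad(lambda x: np.exp(lam * x) * stats.norm.pdf(x), -np.inf, np.inf)
print(mgf, np.exp(lam**2 / 2))
```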

Exercise 2.5.4.

Solution Suppose that E\,\exp(\lambda X) \le \exp(K^2\lambda^2) holds for all \lambda \in R. Expanding both sides around \lambda = 0,

E\,\exp(\lambda X) = 1 + \lambda\,E X + O(\lambda^2) \quad\text{and}\quad \exp(K^2\lambda^2) = 1 + O(\lambda^2),

so the assumed inequality forces \lambda\,E X \le O(\lambda^2) for all small \lambda of either sign. Dividing by \lambda and letting \lambda \to 0^+ and \lambda \to 0^- gives E X \le 0 and E X \ge 0, respectively. Hence this inequality can only hold for all \lambda \in R if E X = 0.

Exercise 2.5.5.

Solution Part 1
We can calculate the MGF of X^2 for X ∼ N(0, 1) as follows. For \lambda < \frac{1}{2},

E\,\exp(\lambda X^2) = \int_{-\infty}^{\infty} e^{\lambda x^2}\,\frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\,dx
    = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{(1-2\lambda)x^2}{2}}\,dx
    = \frac{1}{\sqrt{1-2\lambda}},    (Gaussian integral with variance (1-2\lambda)^{-1})

while for \lambda \ge \frac{1}{2} the integral diverges. Hence the MGF of X^2 is finite only in a bounded neighborhood of 0.
Part 2
We have

E\,\exp(\lambda^2 X^2) \le \exp(K\lambda^2)
\quad\Longrightarrow\quad E\left(1 + \frac{\lambda^2 X^2}{1!} + \frac{(\lambda^2 X^2)^2}{2!} + \dots\right) \le 1 + \frac{K\lambda^2}{1!} + \frac{(K\lambda^2)^2}{2!} + \dots    (Using the Taylor expansion)

Comparing the coefficients of \lambda^{2i} term by term (all terms are non-negative), we get, for every i = 0, 1, \dots,

E\,X^{2i} \le K^i
\quad\Longrightarrow\quad (E\,X^{2i})^{1/(2i)} \le \sqrt{K}
\quad\Longrightarrow\quad \|X\|_\infty = \lim_{i\to\infty}\,(E\,X^{2i})^{1/(2i)} \le \sqrt{K}.

Exercise 2.5.7.

Solution To prove that a function p is a norm, it needs to satisfy the following three properties:

1. p(u + v) ≤ p(u) + p(v)

2. p(av) = |a| p(v)

3. If p(v) = 0, then v = 0

We will now show each of these properties for the sub-gaussian norm \|\cdot\|_{\psi_2}, where \|X\|_{\psi_2} = \inf\{t > 0 : E\,\exp(X^2/t^2) \le 2\}.

1. Let f(x) = e^{x^2}, which is convex and increasing on [0, \infty), and write a = \|X\|_{\psi_2}, b = \|Y\|_{\psi_2}. Since \frac{|X|+|Y|}{a+b} is the convex combination \frac{a}{a+b}\cdot\frac{|X|}{a} + \frac{b}{a+b}\cdot\frac{|Y|}{b}, we have

f\!\left(\frac{|X+Y|}{a+b}\right) \le f\!\left(\frac{|X|+|Y|}{a+b}\right) \le \frac{a}{a+b}\,f\!\left(\frac{|X|}{a}\right) + \frac{b}{a+b}\,f\!\left(\frac{|Y|}{b}\right).    (Jensen's inequality for convex functions)

Taking expectations on both sides,

E\,f\!\left(\frac{X+Y}{a+b}\right) \le \frac{a}{a+b}\,E\,f\!\left(\frac{X}{a}\right) + \frac{b}{a+b}\,E\,f\!\left(\frac{Y}{b}\right) \le \frac{a}{a+b}\cdot 2 + \frac{b}{a+b}\cdot 2 = 2.    (Taking a = \|X\|_{\psi_2} and b = \|Y\|_{\psi_2})

So \|X\|_{\psi_2} + \|Y\|_{\psi_2} belongs to the set \{t > 0 : E\,\exp((X+Y)^2/t^2) \le 2\}, hence \|X+Y\|_{\psi_2} \le \|X\|_{\psi_2} + \|Y\|_{\psi_2}, and the proof is complete.

2. We have

\|aX\|_{\psi_2} = \inf\{t > 0 : E\,\exp((aX)^2/t^2) \le 2\}
    = \inf\{|a|u > 0 : E\,\exp(X^2/u^2) \le 2\}    (Substituting t = |a|u)
    = |a|\,\|X\|_{\psi_2}.

3. We have

\|X\|_{\psi_2} = 0
\quad\Longrightarrow\quad \inf\{t > 0 : E\,\exp(X^2/t^2) \le 2\} = 0
\quad\Longrightarrow\quad E\,\exp(X^2/t^2) \le 2 \text{ for every } t > 0
\quad\Longrightarrow\quad \exp(E\,X^2/t^2) \le 2 \text{ for every } t > 0    (By Jensen's inequality)
\quad\Longrightarrow\quad E\,X^2 \le t^2\log 2 \text{ for every } t > 0
\quad\Longrightarrow\quad E\,X^2 \le \lim_{t\to 0} t^2\log 2 = 0    (Taking the limit t \to 0)
\quad\Longrightarrow\quad X = 0 \text{ almost surely}.
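
As a concrete illustration of the definition of \|\cdot\|_{\psi_2} (a numerical sketch, assuming numpy and scipy are available, and using the chi-square MGF from Exercise 2.5.5 Part 1): for X ∼ N(0, 1) we have E\,\exp(X^2/t^2) = (1 - 2/t^2)^{-1/2} for t^2 > 2, so \|X\|_{\psi_2} is the value of t at which this expression equals 2, namely \sqrt{8/3}.

```python
import numpy as np
from scipy import optimize

# E exp(X^2 / t^2) for X ~ N(0, 1), valid for t^2 > 2
mgf = lambda t: (1 - 2 / t**2) ** (-0.5)

# ||X||_psi2 is the value of t at which this equals 2
t_star = optimize.brentq(lambda t: mgf(t) - 2, 1.5, 3.0)
print(t_star, np.sqrt(8 / 3))  # both are about 1.633
```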

Exercise 2.5.9.

Solution (i) X ∼ Pois(λ). We have

P\{X \ge t\} \ge P\{X = t\} = e^{-\lambda}\,\frac{\lambda^t}{t!}.

Exercise 2.5.10.

Solution

Exercise 2.5.11.

Solution
