note2
Nan Jiang
This note introduces the basics of concentration inequalities and examples of their applications (often with the union bound), which will be useful for the rest of this course.
1 Hoeffding’s Inequality
Theorem 1. Let $X_1, \ldots, X_n$ be independent random variables on $\mathbb{R}$ such that $X_i$ is bounded in the interval $[a_i, b_i]$. Let $S_n = \sum_{i=1}^n X_i$. Then for all $t > 0$,
$$\Pr[S_n - \mathbb{E}[S_n] \ge t] \le e^{-2t^2 / \sum_{i=1}^n (b_i - a_i)^2}, \tag{1}$$
$$\Pr[S_n - \mathbb{E}[S_n] \le -t] \le e^{-2t^2 / \sum_{i=1}^n (b_i - a_i)^2}. \tag{2}$$
Remarks:
• By the union bound, we have $\Pr[|S_n - \mathbb{E}[S_n]| \ge t] \le 2e^{-2t^2 / \sum_{i=1}^n (b_i - a_i)^2}$.
• We often care about the convergence of the empirical mean to the true average, so we can divide $S_n$ by $n$: $\Pr\left[\left|\frac{S_n}{n} - \frac{\mathbb{E}[S_n]}{n}\right| \ge t\right] \le 2e^{-2n^2 t^2 / \sum_{i=1}^n (b_i - a_i)^2}$.
• A useful rephrasing of the result when all variables share the same support $[a, b]$: with probability at least $1 - \delta$, $\left|\frac{S_n}{n} - \frac{\mathbb{E}[S_n]}{n}\right| \le (b - a)\sqrt{\frac{1}{2n} \ln \frac{2}{\delta}}$.
• The number of variables, n, is a constant in the theorem statement. When n is a random variable
itself, for Hoeffding’s inequality to apply, n cannot depend on the realization of X1 , . . . , Xn .
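As a sanity check on the averaged form above, the bound can be compared against a simulation; the Bernoulli(0.5) variables and the specific $n$ and $t$ below are illustrative choices, not part of the notes:

```python
import math
import random

def hoeffding_bound(n, t, a=0.0, b=1.0):
    """Two-sided bound on Pr[|S_n/n - E[S_n]/n| >= t] for n i.i.d.
    variables supported on [a, b]."""
    return 2.0 * math.exp(-2.0 * n * t ** 2 / (b - a) ** 2)

def deviation_freq(n, t, trials=2000, seed=0):
    """Empirical frequency of the deviation event for Bernoulli(0.5) draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() < 0.5 for _ in range(n)) / n
        hits += abs(mean - 0.5) >= t
    return hits / trials

# The observed deviation frequency should sit below the Hoeffding bound.
print(deviation_freq(200, 0.1), "<=", hoeffding_bound(200, 0.1))
```

The bound is typically loose: the empirical frequency is well below $2e^{-2nt^2}$.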
Example: Consider the following Markov chain:
[Figure: a Markov chain on states $s_1, s_2, s_3, s_4$; the two transitions out of $s_1$ have probabilities $p$ and $1 - p$.]
Say we start at $s_1$ and sample a path of length $T$ ($T$ is a constant). Let $n$ be the number of times we visit $s_1$; we can then use the transitions from $s_1$ to estimate $p$.
1. Can we directly apply Hoeffding’s inequality here with n as the number of coin tosses? If
you want to derive a concentration bound for this problem, look up Azuma’s inequality.
2. What if we sample a path until we visit s1 N times for some constant N ? Can we apply
Hoeffding’s inequality with N as the number of random variables?
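A small simulation of question 1; the figure only determines $s_1$'s outgoing probabilities, so the deterministic return paths $s_2 \to s_3 \to s_1$ and $s_4 \to s_1$ used below are hypothetical completions of the chain:

```python
import random

def estimate_p(p, T, seed=0):
    """Walk T steps from s1 and estimate p from the transitions taken at s1.

    Note that n, the number of visits to s1, is itself random given T,
    which is exactly why Hoeffding's inequality does not apply directly.
    """
    rng = random.Random(seed)
    state, n, to_s2 = "s1", 0, 0
    for _ in range(T):
        if state == "s1":
            n += 1
            if rng.random() < p:  # transition labeled p in the figure
                to_s2 += 1
                state = "s2"
            else:                 # transition labeled 1 - p
                state = "s4"
        elif state == "s2":       # hypothetical return path s2 -> s3 -> s1
            state = "s3"
        else:                     # s3 and s4 return to s1 (hypothetical)
            state = "s1"
    return to_s2 / n if n > 0 else float("nan")

print(estimate_p(0.3, T=10000))  # close to 0.3 for large T
```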
A popular objective for MAB is the pseudo-regret, which poses the exploration-exploitation challenge:
$$\mathrm{Regret}_T = \sum_{t=1}^{T} (\mu^\star - \mu_{i_t}).$$
Another objective is the simple regret,
$$\mu^\star - \mu_{\hat i},$$
where $\hat i$ is the arm that the learner picks after $T$ rounds of interaction. This poses the “pure exploration” challenge, since all that matters is making a good final guess; the regret incurred within the $T$ rounds does not matter. A related objective is called Best-Arm Identification, which asks whether $\hat i \in \arg\max_{i \in [K]} \mu_i$; Best-Arm Identification results often require additional gap conditions.
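The two notions can be contrasted in a small simulation; the round-robin exploration strategy and the Bernoulli arm means below are illustrative assumptions, not a method from the notes:

```python
import random

def explore_then_guess(mus, T, seed=0):
    """Pull arms round-robin for T rounds, then guess the empirically best arm.

    Returns (pseudo-regret over the T rounds, simple regret of the guess).
    Arms are Bernoulli with means `mus` (an illustrative assumption).
    """
    rng = random.Random(seed)
    K = len(mus)
    mu_star = max(mus)
    pulls, wins = [0] * K, [0] * K
    pseudo_regret = 0.0
    for t in range(T):
        i = t % K                       # round-robin exploration
        pulls[i] += 1
        wins[i] += rng.random() < mus[i]
        pseudo_regret += mu_star - mus[i]
    i_hat = max(range(K), key=lambda i: wins[i] / pulls[i])
    return pseudo_regret, mu_star - mus[i_hat]

pr, sr = explore_then_guess([0.2, 0.5, 0.8], T=3000)
print(pr, sr)
```

Pure exploration grows the pseudo-regret linearly in $T$, yet with enough samples the final guess is correct and the simple regret is zero, which is the tension between the two objectives.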
Now we want accurate estimation for all arms simultaneously. That is, we want to bound the probability of the event that any $\hat\mu_i$ deviates from $\mu_i$ too much. This is where the union bound is useful:
$$\Pr\left[\bigcup_{i=1}^{K} \left\{ |\hat\mu_i - \mu_i| \ge \epsilon \right\}\right] \quad \text{(the event that estimation is $\epsilon$-inaccurate for at least 1 arm)}$$
$$\le \sum_{i=1}^{K} \Pr\left[ |\hat\mu_i - \mu_i| \ge \epsilon \right] \le 2K e^{-2T\epsilon^2 / K}. \quad \text{(union bound, then Hoeffding's inequality)}$$
To rephrase this result: with probability at least $1 - \delta$, $|\hat\mu_i - \mu_i| \le \sqrt{\frac{K}{2T} \ln \frac{2K}{\delta}}$ holds for all $i$ simultaneously.
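A quick empirical check of this uniform bound, assuming (as the $T/K$ in the exponent suggests) that each of the $K$ arms is pulled $T/K$ times; the Bernoulli arms below are an illustrative choice:

```python
import math
import random

def max_deviation(mus, per_arm, seed=0):
    """Pull each Bernoulli arm `per_arm` times; return max_i |mu_hat_i - mu_i|."""
    rng = random.Random(seed)
    dev = 0.0
    for mu in mus:
        mu_hat = sum(rng.random() < mu for _ in range(per_arm)) / per_arm
        dev = max(dev, abs(mu_hat - mu))
    return dev

K, T, delta = 5, 5000, 0.05
mus = [0.1, 0.3, 0.5, 0.7, 0.9]
bound = math.sqrt(K / (2 * T) * math.log(2 * K / delta))
dev = max_deviation(mus, T // K)
# With probability at least 1 - delta, dev stays below bound.
print(dev, "<=", bound)
```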
Finally, we use the estimation error to bound the decision loss: recall that $\hat i = \arg\max_{i \in [K]} \hat\mu_i$, and let $i^\star = \arg\max_{i \in [K]} \mu_i$.
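The decomposition itself (a standard step, spelled out here) works on the event that every estimate is $\epsilon$-accurate:

```latex
\mu_{i^\star} - \mu_{\hat i}
  = \underbrace{(\mu_{i^\star} - \hat\mu_{i^\star})}_{\le \epsilon}
  + \underbrace{(\hat\mu_{i^\star} - \hat\mu_{\hat i})}_{\le 0}
  + \underbrace{(\hat\mu_{\hat i} - \mu_{\hat i})}_{\le \epsilon}
  \le 2\epsilon,
```

where the middle term is nonpositive because $\hat i$ maximizes the empirical means, and the outer terms are bounded by the uniform estimation guarantee.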
The theorem itself is stated as a best-arm identification lower bound, but it is also a lower bound for simple regret minimization. This is because all arms except the best one are $\epsilon$ worse than $\mu^\star$, so missing the optimal arm means a simple regret of at least $\epsilon$.
See the proof in [1] (Theorem 2); the technique is due to [2] and can also be used to prove the lower bound on the regret of MAB.
where $\mathbb{E}[\cdot]$ is w.r.t. $P_{X,Y}$. Given only a finite sample, one natural thing to do is empirical risk minimization, i.e., find the classifier that has the lowest training error rate on the data:
$$\hat f = \arg\min_{f \in \mathcal{F}} \widehat{\mathbb{E}}\left[\mathbb{I}[f(X) \ne Y]\right] := \frac{1}{n} \sum_{i=1}^{n} \mathbb{I}[f(X_i) \ne Y_i].$$
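A minimal sketch of empirical risk minimization over a finite class; the threshold classifiers, the 10% label noise, and the data-generating process below are all illustrative assumptions:

```python
import random

def erm(hypotheses, data):
    """Return the hypothesis in a finite class with the lowest training
    error under 0-1 loss."""
    def train_error(f):
        return sum(f(x) != y for x, y in data) / len(data)
    return min(hypotheses, key=train_error)

# Hypothetical finite class F: threshold classifiers x >= c.
hypotheses = [lambda x, c=c: x >= c for c in [k / 10 for k in range(11)]]

# Synthetic data: true threshold 0.5, labels flipped with probability 0.1.
rng = random.Random(0)
xs = [rng.random() for _ in range(500)]
data = [(x, (x >= 0.5) != (rng.random() < 0.1)) for x in xs]

f_hat = erm(hypotheses, data)
```

With enough data, `f_hat` recovers a threshold near 0.5 despite the label noise, which is the kind of guarantee the analysis below quantifies.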
The question is, can we give any guarantee on how good the learned classifier $\hat f$ is compared to the optimal one $f^\star$, as a function of $n$? In other words, we want to bound
We provide the analysis below, which mainly uses Hoeffding's inequality and the union bound. First of all,
References
[1] Akshay Krishnamurthy, Alekh Agarwal, and John Langford. PAC reinforcement learning with
rich observations. In Advances in Neural Information Processing Systems, pages 1840–1848, 2016.
[2] Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit
problem. Machine learning, 47(2-3):235–256, 2002.