Week-12
● Hypothesis class H:
● Practical issue: ERM may not find the true minimizer (e.g., due to
optimization challenges).
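To make the ERM idea concrete, here is a minimal Python sketch (illustrative, not from the slides) that performs empirical risk minimization over a small finite hypothesis class of 1-D threshold classifiers; all names and data here are hypothetical.

```python
# Minimal ERM sketch: pick the hypothesis in a finite class H with the
# lowest empirical risk on the training sample.
thresholds = [0.0, 0.5, 1.0, 1.5, 2.0]   # hypothesis class H (finite, for illustration)

def h(theta, x):
    """Threshold classifier: predict +1 if x >= theta, else -1."""
    return 1 if x >= theta else -1

# Toy training sample (x, y) with labels in {+1, -1}.
data = [(0.2, -1), (0.7, -1), (1.2, 1), (1.8, 1)]

def empirical_risk(theta):
    """Fraction of training points the hypothesis misclassifies."""
    return sum(h(theta, x) != y for x, y in data) / len(data)

# ERM: choose the hypothesis in H that minimizes the empirical risk.
erm_theta = min(thresholds, key=empirical_risk)
print(erm_theta, empirical_risk(erm_theta))   # theta = 1.0 achieves zero training error
```

For a finite class this search is exact; the "practical issue" above arises because, for rich classes, the minimization has to be done by numerical optimization that may not find the true minimizer.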
Key Theoretical Tools
Uniform Convergence
Probabilistic Guarantee
Bound for ERM Solution
Sample Complexity
VC Dimension
● Hypothesis Space
● This is the set of all possible functions (or classifiers) that a model can learn. For
example, in linear classifiers, it's all possible straight lines (in 2D) that can divide the
space into two classes.
● Shattering
● A hypothesis space shatters a set of points if it can correctly classify all possible labelings
of those points.
○ For n points, there are 2^n possible ways to label them (each point can be +1 or -1).
○ If your model can classify all 2^n labelings correctly using some function in the hypothesis space, then it shatters those n points.
● VC Dimension
● The VC dimension of a hypothesis class is the maximum number of points that can be
shattered by the hypothesis class.
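Shattering can be checked by brute force for small point sets. The sketch below (illustrative, not from the slides) enumerates all 2^n labelings and tests each for linear separability with a small feasibility LP via scipy.optimize.linprog; for 2-D linear classifiers it confirms that 3 points in general position are shattered while 4 points are not, consistent with VC dimension 3.

```python
# Brute-force shattering check for 2-D linear classifiers sign(w.x + b).
from itertools import product
import numpy as np
from scipy.optimize import linprog   # assumes SciPy is available

def linearly_separable(points, labels):
    """Feasibility LP: does some (w, b) satisfy y_i * (w.x_i + b) >= 1 for all i?"""
    X = np.asarray(points, dtype=float)
    y = np.asarray(labels, dtype=float)
    # Variables are (w1, w2, b); each row encodes -y_i * (x_i1, x_i2, 1) . v <= -1.
    A_ub = -y[:, None] * np.hstack([X, np.ones((len(X), 1))])
    b_ub = -np.ones(len(X))
    res = linprog(c=[0, 0, 0], A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * 3)
    return res.success

def shattered(points):
    """True if every one of the 2^n labelings of `points` is linearly separable."""
    return all(linearly_separable(points, labels)
               for labels in product([+1, -1], repeat=len(points)))

print(shattered([(0, 0), (1, 0), (0, 1)]))            # 3 points in general position: True
print(shattered([(0, 0), (1, 0), (0, 1), (1, 1)]))    # 4 points: False (the XOR labeling fails)
```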
Examples -1
Examples -2
Why Does VC Dimension Matter?
What is PAC Learning?
● Correct: Refers to how well the model generalizes on unseen data (not
just training data).
Formal Definition
Relationship with VC Dimension
Examples of PAC Learnable Classes
Intuition: What Makes a Class PAC-Learnable?
● Supervised Learning: You’re given labeled data - inputs with corresponding outputs (e.g., image → “cat”). The model learns to map inputs to outputs. Examples: Classification, Regression.
● Unsupervised Learning: You’re given unlabeled data, with no clear output. The goal is to find structure or patterns (e.g., clustering, association rules).
● Reinforcement Learning: You’re not told the right answer. You interact with an environment, try actions, and observe the outcomes (rewards/punishments). Learning is driven by experience, not direct supervision.
Illustrative Example: Learning to Ride a Bicycle
● RL is inspired by behavioral psychology, like Pavlov’s classical conditioning.
● The modern field was kickstarted by Sutton & Barto, whose 1983 work laid the foundation for today's algorithms and techniques.
Games as an RL Metaphor
● Rewards:
● Key Points:
Question-1
a) 2
b) 3
c) 4
d) None of the above
Question-1 - Correct answer
a) 2
b) 3
c) 4
d) None of the above
Correct options: (b) Any 3 points can be classified using a linear decision
boundary
Question-2
a) Linear regression
b) Logistic regression
c) Decision trees
d) Support Vector Machines
Question-2 - Explanation
SVM is said to perform structural risk minimization because it has an additional objective: apart from the empirical risk, it also tries to minimize the size of the solution. It minimizes the norm of the weight vector, which gives rise to a different kind of minimization, one that controls model complexity.
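For reference (standard formulation, not reproduced from the slides), the soft-margin SVM objective makes this explicit: it jointly minimizes the squared norm of the weight vector and the empirical hinge loss, with C the usual trade-off hyperparameter:

```latex
\min_{w,\,b}\;\; \frac{1}{2}\lVert w \rVert^{2}
\;+\; C \sum_{i=1}^{m} \max\!\bigl(0,\; 1 - y_{i}\,(w^{\top} x_{i} + b)\bigr)
```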
Correct options: (d) Support Vector Machines
a) Statement 1 is true. Statement 2 is true. Statement 2 is the correct reason for statement 1
b) Statement 1 is true. Statement 2 is true. Statement 2 is not the correct reason for statement
1
c) Statement 1 is true. Statement 2 is false
d) Both statements are false
Correct options: (b) The +5 reward for reaching the target encourages goal achievement, while the -0.1 penalty for each second promotes finding the shortest path. Rewards for hitting walls are omitted, since the question says nothing about walls.
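A minimal sketch of this reward scheme as a per-second reward function; the function and argument names here are hypothetical, not from the slides:

```python
def step_reward(reached_target: bool) -> float:
    """Per-second reward from the answer above: -0.1 for every second that
    passes, plus +5 when the target is reached; nothing special for walls."""
    reward = -0.1          # time penalty: promotes finding the shortest path
    if reached_target:
        reward += 5.0      # goal bonus: encourages reaching the target
    return reward
```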
Question-6-8
For the rest of the questions, we will follow a simplistic game and see how a
Reinforcement Learning agent can learn to behave optimally in it.
This is our game:
a) At the start of the game, the agent is on the Start state and can choose to move left or right at each turn.
b) If it reaches the right end (RE), it wins, and if it reaches the left end (LE), it loses.
c) Because we love maths so much, instead of saying the agent wins or loses, we will say that the agent gets a reward of +1 at RE and a reward of -1 at LE. Then the objective of the agent is simply to maximize the reward it obtains!
Question-6
For each state, we define a variable that will store its value. The value of the state will help the
agent determine how to behave later. First we will learn this value. Let V be the mapping from
state to its value.
Initially,
V(LE) = -1
V(X1) = V(X2) = V(X3) = V(X4) = V(Start) = 0
V(RE) = +1
For each state S ∈ {X1, X2, X3, X4, Start}, with S_L being the state to its immediate left and S_R the state to its immediate right, repeat
V(S) = 0.9 × max(V(S_L), V(S_R))
till V converges (does not change for any state).
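This update rule can be run directly as code. Below is a minimal Python sketch of the procedure; since the game's figure is not reproduced in these notes, the ordering of the seven states along the line (LE, X1, X2, Start, X3, X4, RE) is an assumption, and the variable names are illustrative.

```python
# Value-update sketch for the chain game described above.
# Assumed state ordering (the figure is not reproduced here):
chain = ["LE", "X1", "X2", "Start", "X3", "X4", "RE"]

# Initial values: -1 at the losing end, +1 at the winning end, 0 elsewhere.
V = {s: 0.0 for s in chain}
V["LE"], V["RE"] = -1.0, +1.0

changed = True
while changed:                      # repeat till V stops changing
    changed = False
    for i, s in enumerate(chain):
        if s in ("LE", "RE"):       # terminal values stay fixed
            continue
        left, right = chain[i - 1], chain[i + 1]
        new_v = 0.9 * max(V[left], V[right])   # V(S) = 0.9 * max(V(S_L), V(S_R))
        if new_v != V[s]:
            V[s], changed = new_v, True

print(V)   # e.g. V["X4"] becomes 0.9 after its first update, as worked out in Question-6
```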
a) 1
b) 0.9
c) 0.81
d) 0
Question-6 - Correct answer
a) 1
b) 0.9
c) 0.81
d) 0
Correct options: (b) 0.9, since V(X4) = 0.9 × max(V(X3), V(RE)) = 0.9 × max(0, +1) = 0.9
Question-7
a) -1
b) -0.9
c) -0.81
d) 0
Question-7 - Correct answer
a) -1
b) -0.9
c) -0.81
d) 0
Correct options: (d) 0, since V(X1) = 0.9 × max(V(LE), V(X2)) = 0.9 × max(−1, 0) = 0
Question-8
a) 0.59
b) -0.9
c) 0.63
d) 0
Question-8 - Correct answer
Next Session:
Wednesday:
16-Apr-2025
6:00 - 8:00 PM