Winter Semester 2023-24 - CSE4037 - ETH - AP2023246000594 - 2024-01-05 - Reference-Material-I

Reinforcement Learning

Dr. Ch. Balaram Murthy
What is Reinforcement Learning (RL)?

To understand Reinforcement Learning, let’s start with the big picture.

The idea behind Reinforcement Learning is that an agent (an AI) will
learn from the environment by interacting with it (through trial and
error) and receiving rewards (negative or positive) as feedback for
performing actions.

Learning from interactions with the environment comes from our natural experiences.
Main points in Reinforcement learning :

Input: The input should be an initial state from which the model
will start.

Output: There are many possible outputs, as there are a variety of solutions to a particular problem.

Training: The training is based upon the input. The model will return a state, and the user will decide whether to reward or punish the model based on its output.

The model continues to learn.

The best solution is decided based on the maximum reward.
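To make this input/output/training loop concrete, here is a minimal Python sketch of the agent-environment interaction cycle. The GridEnvironment class, its reset/step interface, and the random agent are hypothetical stand-ins (loosely modeled on the common Gymnasium-style API), not part of this material:

import random

class GridEnvironment:
    """Toy stand-in environment: move along a line, reward at the goal."""
    def __init__(self, size=5):
        self.size = size
        self.state = 0

    def reset(self):
        self.state = 0          # Input: the initial state the model starts from
        return self.state

    def step(self, action):
        # action: -1 (move left) or +1 (move right)
        self.state = max(0, min(self.size - 1, self.state + action))
        done = self.state == self.size - 1
        reward = 1.0 if done else -0.1   # positive feedback at the goal, small penalty otherwise
        return self.state, reward, done

env = GridEnvironment()
state = env.reset()
done = False
while not done:
    action = random.choice([-1, 1])          # trial and error
    state, reward, done = env.step(action)   # environment returns the next state and a reward

A learning agent would use the reward signal to prefer actions that lead to the maximum cumulative reward, rather than choosing at random.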


First, we need to define the problem setting. RL algorithms require us to define the following properties:
Terms used in Reinforcement Learning
Agent: An entity that can perceive/explore the environment and act upon it.
Environment: A situation in which an agent is present or by which it is surrounded. In RL, we assume a stochastic environment, which means it is random in nature.

Action: The moves taken by an agent within the environment.
State: The situation returned by the environment after each action taken by the agent.

Reward: Feedback returned to the agent from the environment to evaluate the agent's action.

Policy: The strategy the agent applies to decide the next action based on the current state.

Value: The expected long-term return with discounting, as opposed to the short-term reward.

Q-value: Similar to the value, but it takes one additional parameter, the current action (a).
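In standard notation (added here for reference; the symbols γ for the discount factor and r_t for the reward at step t are conventional and not defined in the original slides), the value and Q-value are the expected discounted long-term returns:

V^{\pi}(s) = \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \mid s_{0} = s \right]

Q^{\pi}(s, a) = \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \mid s_{0} = s,\ a_{0} = a \right]

where 0 ≤ γ < 1 is the discount factor that weights short-term rewards against long-term ones.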
Examples of Reinforcement Learning

Any real-world problem where an agent must interact with an uncertain environment to meet a specific goal is a potential application of RL.

Robotics. Robots with pre-programmed behavior are useful in structured environments, such as the assembly line of an automobile manufacturing plant, where the task is repetitive in nature. In the real world, where the response of the environment to the behavior of the robot is uncertain, pre-programming accurate actions is nearly impossible. In such scenarios, RL provides an efficient way to build general-purpose robots.
Examples of Reinforcement Learning

AlphaGo. One of the most complex strategic games is a 3,000-year-old Chinese board game called Go. Its complexity stems from the fact that there are 10^270 possible board combinations, several orders of magnitude more than the game of chess.

In 2016, an RL-based Go agent called AlphaGo defeated the greatest human Go player. Much like a human player, it learned by experience, playing thousands of games with professional players.

The latest RL-based Go agent has the capability to learn by playing against itself, an advantage that the human player doesn’t have.
Examples of Reinforcement Learning

Autonomous Driving. An autonomous driving system must perform multiple perception and planning tasks in an uncertain environment.

Some specific tasks where RL finds application include vehicle path planning and motion prediction.

Vehicle path planning requires several low- and high-level policies to make decisions over varying temporal and spatial scales.

Motion prediction is the task of predicting the movement of pedestrians and other vehicles, to understand how the situation might develop based on the current state of the environment.
Benefits of Reinforcement Learning

RL is applicable to a wide range of complex problems that cannot be tackled with other ML algorithms.

RL is closer to artificial general intelligence (AGI), as it possesses the ability to seek a long-term goal while exploring various possibilities autonomously.
Benefits of Reinforcement Learning
Some of the benefits of RL include:

Focuses on the problem as a whole. Conventional ML algorithms are designed to excel at specific subtasks, without a notion of the big picture.

RL, on the other hand, doesn’t divide the problem into sub-problems; it directly works to maximize the long-term reward. It has an obvious purpose, understands the goal, and is capable of trading off short-term rewards for long-term benefits.
Benefits of Reinforcement Learning
Does not need a separate data collection step.
In RL, training data is obtained via the direct interaction of
the agent with the environment.

Training data is the learning agent’s experience, not a separate collection of data that has to be fed to the algorithm.

This significantly reduces the burden on the supervisor in charge of the training process.
Benefits of Reinforcement Learning
Works in dynamic, uncertain environments.
RL algorithms are inherently adaptive and built to respond to
changes in the environment.

In RL, time matters and the experience that the agent collects is not
independently and identically distributed (i.i.d.), unlike conventional
ML algorithms.

Since the dimension of time is deeply buried in the mechanics of RL, the learning is inherently adaptive.
Challenges with Reinforcement Learning
RL agent needs extensive experience.

RL methods autonomously generate training data by interacting with the environment. Thus, the rate of data collection is limited by the dynamics of the environment. Environments with high latency slow down the learning curve.

Furthermore, in complex environments with high-dimensional state spaces, extensive exploration is needed before a good solution can be found.
Challenges with Reinforcement Learning
Delayed rewards.
The learning agent can trade off short-term rewards for long-term
gains. While this foundational principle makes RL useful, it also
makes it difficult for the agent to discover the optimal policy.

This is especially true in environments where the outcome is unknown until a large number of sequential actions are taken. In this scenario, assigning credit to a previous action for the final outcome is challenging and can introduce large variance during training.

The game of chess is a relevant example here, where the outcome of the game is unknown until both players have made all their moves.
Challenges with Reinforcement Learning
Lack of interpretability.

Once an RL agent has learned the optimal policy and is deployed in the environment, it takes actions based on its experience. To an external observer, the reason for these actions might not be obvious.

This lack of interpretability interferes with the development of trust between the agent and the observer. If an observer could explain the actions that the RL agent takes, it would help them understand the problem better and discover the limitations of the model, especially in high-risk environments.
Characteristics of Reinforcement Learning

Important characteristics of reinforcement learning.

• There is no supervisor, only a real number or reward signal.
• Sequential decision making.
• Time plays a crucial role in reinforcement learning problems.
• Feedback is always delayed, not instantaneous.
• Agent’s actions determine the subsequent data it receives.
Elements of Reinforcement Learning

There are four main elements of Reinforcement Learning:

• Policy
• Reward Signal
• Value Function
• Model of the environment
Approaches to implement Reinforcement Learning

There are mainly three ways to implement reinforcement learning in ML:

1. Value-based:

The value-based approach tries to find the optimal value function V(s), which is the maximum value achievable at a state under any policy. In this method, the agent expects a long-term return of the current states under policy π.
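As a concrete illustration of the value-based approach, below is a minimal value-iteration sketch on a toy MDP. The transition table and reward numbers are invented purely for illustration; only the Bellman backup itself is the standard technique:

# Toy MDP: states 0..2, deterministic transitions, made-up rewards.
transitions = {
    (0, 'right'): (1, 0.0),   # (state, action) -> (next_state, reward)
    (0, 'left'):  (0, 0.0),
    (1, 'right'): (2, 1.0),
    (1, 'left'):  (0, 0.0),
    (2, 'right'): (2, 0.0),
    (2, 'left'):  (1, 0.0),
}
gamma = 0.9                           # discount factor
V = {s: 0.0 for s in range(3)}

for _ in range(100):                  # repeat Bellman backups until values settle
    for s in range(3):
        V[s] = max(reward + gamma * V[nxt]
                   for (state, action), (nxt, reward) in transitions.items()
                   if state == s)

print(V)  # V(s) approximates the maximum discounted return achievable from s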
Approaches to implement Reinforcement Learning

2. Policy-based:

In a policy-based RL method, you try to come up with a policy such that the action performed in every state helps you gain maximum reward in the future.

The policy-based approach has mainly two types of policy:

Deterministic: The same action is produced by the policy (π) at any given state.
Stochastic: In this policy, probability determines the action produced.
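A minimal Python sketch of the distinction between the two policy types (the states, actions, and probabilities below are hypothetical, chosen only for illustration):

import random

# Deterministic policy: a fixed lookup from state to action.
deterministic_policy = {'s0': 'right', 's1': 'left'}
action = deterministic_policy['s0']             # always 'right' in state s0

# Stochastic policy: a probability distribution over actions per state.
stochastic_policy = {'s0': {'right': 0.8, 'left': 0.2}}
dist = stochastic_policy['s0']
action = random.choices(list(dist), weights=dist.values())[0]  # sampled action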
Approaches to implement Reinforcement Learning

3. Model-based: In the model-based approach, a virtual model is created for the environment, and the agent explores that environment to learn it. There is no particular solution or algorithm for this approach because the model representation is different for each environment.
Types of Reinforcement Learning

There are two types of Reinforcement:

Positive: Positive reinforcement is defined as an event that, occurring because of a particular behavior, increases the strength and frequency of that behavior. In other words, it has a positive effect on behavior. Advantages of positive reinforcement are:
 Maximizes performance
 Sustains change for a long period of time
However, too much reinforcement can lead to an overload of states, which can diminish the results.
Types of Reinforcement Learning

Negative: Negative reinforcement is defined as the strengthening of a behavior that occurs because a negative condition is stopped or avoided.

It helps you to define the minimum standard of performance.

However, the drawback of this method is that it provides only enough to meet the minimum behavior.
Advantages of Reinforcement learning
 It is used to solve very complex problems that cannot be solved by
conventional techniques.
 The model can correct the errors that occurred during the training process.
 In RL, training data is obtained via the direct interaction of the agent with the environment.
 RL can handle environments that are non-deterministic, meaning that the
outcomes of actions are not always predictable. This is useful in real-world
applications where the environment may change over time or is uncertain.
 RL can be used to solve a wide range of problems, including those that involve
decision making, control, and optimization.
 RL is a flexible approach that can be combined with other ML techniques, such
as deep learning, to improve performance.
Disadvantages of Reinforcement learning
 RL is not preferable for solving simple problems.
 RL needs a lot of data and a lot of computation.
 RL is highly dependent on the quality of the reward function. If the reward
function is poorly designed, the agent may not learn the desired behavior.
 RL can be difficult to debug and interpret. It is not always clear why the agent
is behaving in a certain way, which can make it difficult to diagnose and fix
problems.
Learning Models of Reinforcement

There are two important learning models in RL:


• Markov Decision Process (MDP)
• Q learning
Markov Decision Process
The following parameters are used to get a solution:
Set of actions - A
Set of states - S
Reward - R
Policy - π
Value - V

The mathematical approach for mapping a solution in RL is known as a Markov Decision Process (MDP).
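Q-learning, the second learning model listed earlier, estimates Q(s, a) directly from experience using the update Q(s, a) ← Q(s, a) + α[r + γ max_a' Q(s', a') − Q(s, a)]. Below is a minimal tabular sketch on the toy line-world from the interaction-loop example; the environment, learning rate, and episode count are illustrative assumptions, not part of this material:

import random

SIZE, ACTIONS = 5, [-1, 1]               # toy line-world: goal is state 4
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration rate
Q = {(s, a): 0.0 for s in range(SIZE) for a in ACTIONS}

def step(s, a):
    s2 = max(0, min(SIZE - 1, s + a))
    done = s2 == SIZE - 1
    return s2, (1.0 if done else -0.1), done

for _ in range(500):                     # episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        best_next = 0.0 if done else max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2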
Reinforcement Learning vs. Supervised Learning
Reinforcement Learning Applications
• Robot navigation, Robo-soccer, walking, juggling, etc.
• Robots that pick goods and put them in containers using deep reinforcement learning
• Optimizing chemical reactions
• Business strategy planning
