ML-10
ColleGPT
Reinforcement Learning
Prepared and Edited by: Mayank Yadav | Designed by: Kussh Prajapati
www.collegpt.com | [email protected]
Reinforcement Learning
● In Reinforcement Learning, the agent learns automatically using feedback without any labeled
data, unlike supervised learning.
● Since there is no labeled data, the agent is bound to learn by its experience only.
● RL solves a specific type of problem where decision making is sequential, and the goal is
long-term, such as game-playing, robotics, etc.
● The agent interacts with the environment and explores it by itself. The primary goal of an agent
in reinforcement learning is to improve the performance by getting the maximum positive
rewards.
● The agent learns through a process of trial and error, and based on that experience, it learns to
perform the task in a better way. Hence, we can say that "Reinforcement learning is a type of
machine learning method where an intelligent agent (a computer program) interacts with the
environment and learns how to act within it."
● It is a core part of Artificial Intelligence, and many AI agents work on the concept of reinforcement
learning. Here we do not need to pre-program the agent, as it learns from its own experience
without any human intervention.
● The agent keeps doing these three things (take an action, change state or remain in the same
state, and get feedback), and by repeating them, it learns and explores the environment.
Key Terms in Reinforcement Learning:
● Agent(): An entity that can perceive/explore the environment and act upon it.
● Environment(): The situation in which the agent is present or by which it is surrounded. In RL, we
typically assume a stochastic environment, which means it is random in nature.
● Action(): Actions are the moves taken by an agent within the environment.
● State(): State is a situation returned by the environment after each action taken by the agent.
● Reward(): A feedback returned to the agent from the environment to evaluate the action of the
agent.
● Policy(): Policy is a strategy applied by the agent for the next action based on the current state.
● Value(): The expected long-term return from a state, including the discount factor, as opposed to
the short-term reward.
● Q-value(): Similar to the value, but it takes one additional parameter: the current action (a).
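For clarity, the Value and Q-value can be written as expected discounted returns. This standard notation is added here for reference and is not part of the original notes:

    V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r_{t+1} \;\middle|\; s_{0}=s \right]
    \qquad
    Q^{\pi}(s,a) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r_{t+1} \;\middle|\; s_{0}=s,\ a_{0}=a \right]

where \gamma \in [0,1) is the discount factor, so rewards received sooner count more than rewards received later.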
Approaches to Implement Reinforcement Learning:
● Value-Based – The main goal of this method is to maximize a value function: the agent, acting
through a policy, learns the long-term return it can expect from the current state.
● Policy-Based – In policy-based methods, you learn a strategy that helps gain maximum reward in
the future through the possible actions performed in each state. The two types of policy-based
methods are deterministic and stochastic (a small sketch follows this list).
● Model-Based – In this method, a virtual model of the environment is created, and the agent learns
to perform within that specific environment.
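A minimal sketch of the deterministic vs. stochastic policies mentioned under the policy-based approach. The states, actions, and probabilities below are made up purely for illustration:

    import random

    # Deterministic policy: each state maps to exactly one action.
    deterministic_policy = {"s0": "left", "s1": "right"}

    # Stochastic policy: each state maps to a probability distribution over actions.
    stochastic_policy = {
        "s0": {"left": 0.8, "right": 0.2},
        "s1": {"left": 0.3, "right": 0.7},
    }

    def sample_action(policy, state):
        # Draw an action according to the probabilities assigned by a stochastic policy.
        actions, probs = zip(*policy[state].items())
        return random.choices(actions, weights=probs, k=1)[0]

Usage: deterministic_policy["s0"] always gives "left", while sample_action(stochastic_policy, "s0") gives "left" only about 80% of the time.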
Common RL Techniques:
● Q-Learning: A value-based method where the agent learns a Q-value for each state-action pair,
representing the expected future reward for taking an action in a given state (see the sketch after this list).
● Policy Gradient Methods: Focus on directly improving the policy by estimating the gradient of
the expected reward with respect to the policy parameters.
● Deep Reinforcement Learning: Combines RL with deep neural networks for powerful agents
capable of handling complex environments and high-dimensional states.
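To make the Q-learning bullet concrete, here is a minimal tabular sketch. The state/action counts and hyperparameters are assumptions chosen only for illustration:

    # Tabular Q-learning update (illustrative sketch).
    n_states, n_actions = 16, 4
    alpha, gamma = 0.1, 0.99          # learning rate and discount factor (assumed values)

    # Q-table: expected future reward for each (state, action) pair, initialised to zero.
    Q = [[0.0] * n_actions for _ in range(n_states)]

    def q_update(state, action, reward, next_state):
        # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
        best_next = max(Q[next_state])
        Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

Usage: after observing a transition (s, a, r, s'), call q_update(s, a, r, s') to move the estimate toward the observed return.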
Rewards
-> Positive reinforcement: Positive reinforcement occurs when an event, produced by a specific
behavior, increases the strength and frequency of that behavior. It has a positive impact on behavior.
Advantages:
● Maximizes the performance of an action
● Sustains change for a longer period
Disadvantage:
● Excess reinforcement can lead to an overload of states, which can diminish the results.
Example: In a game of Pong, the agent receives a reward for hitting the ball past the opponent's paddle.
-> Shaping rewards: Complex tasks might require breaking them down into smaller sub-goals with
associated rewards. These intermediate rewards can guide the agent's learning towards the final goal.
● Example: In a robot learning to walk, initial rewards might be given for taking a step, then for
maintaining balance, and finally for achieving forward movement.
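A hedged sketch of the shaped rewards described above for the walking-robot example. The state fields and weights are hypothetical, chosen only to show how intermediate sub-goals can be rewarded:

    def shaped_reward(state):
        # Intermediate rewards guide learning toward the final goal of walking forward.
        reward = 0.0
        if state["took_step"]:
            reward += 1.0                               # sub-goal: taking a step
        if state["is_balanced"]:
            reward += 2.0                               # sub-goal: maintaining balance
        reward += 5.0 * state["forward_velocity"]       # final goal: forward movement
        return reward

    # Usage: shaped_reward({"took_step": True, "is_balanced": True, "forward_velocity": 0.3}) -> 4.5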
Penalties
Advantages:
● Helps maximize the desired behavior
● Establishes at least a minimum standard of performance
Disadvantage:
● It only motivates the agent to meet that minimum standard of behavior
Example: In a maze navigation task, the agent receives a penalty for hitting a wall.
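A tiny sketch of how such a penalty could be combined with rewards in the maze example. The numeric values are assumptions for illustration:

    def maze_reward(reached_goal, hit_wall):
        # Combine a goal reward with penalties so the agent learns to avoid walls.
        if reached_goal:
            return 10.0      # large positive reward for solving the maze
        if hit_wall:
            return -1.0      # penalty for hitting a wall
        return -0.01         # small per-step cost to discourage aimless wandering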
● Magnitude: The magnitude of rewards and penalties can influence the learning speed and
effectiveness.
○ Larger rewards can encourage faster learning towards desired behavior, but smaller,
more frequent rewards can provide more continuous feedback.
● Sparsity: Rewards and penalties might not be received after every action, especially in complex
environments. The agent needs to learn to deal with delayed rewards and sparse feedback.
● Exploration vs. Exploitation: A balance needs to be struck between exploration (trying new
actions to learn about the environment) and exploitation (repeating actions known to yield
rewards); see the epsilon-greedy sketch after this list.
● Reward Engineering: Defining appropriate reward signals is crucial for effective RL. Rewards
should be clear, consistent, and aligned with the desired goals.
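The exploration/exploitation balance is often handled with an epsilon-greedy rule. A minimal sketch, where the decay schedule and values are assumptions:

    import random

    epsilon, epsilon_min, epsilon_decay = 1.0, 0.05, 0.995   # start fully exploratory, decay over time

    def epsilon_greedy(q_values):
        # q_values: estimated returns for each action in the current state.
        if random.random() < epsilon:
            return random.randrange(len(q_values))                       # explore: random action
        return max(range(len(q_values)), key=lambda a: q_values[a])      # exploit: best-known action

    def decay_epsilon():
        # Gradually shift the agent from exploration toward exploitation.
        global epsilon
        epsilon = max(epsilon_min, epsilon * epsilon_decay)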
The Reinforcement Learning (RL) framework provides a structured approach for training agents to learn
and make decisions in an interactive environment. Unlike supervised learning with labeled data, RL
agents learn by trial and error, receiving rewards for desired actions and penalties for undesirable ones.
Core Elements:
● Agent: The learning entity that interacts with the environment and aims to maximize its
long-term reward.
● Environment: The system or world the agent operates in. It provides the agent with observations
(state) and rewards based on its actions.
● State: The representation of the environment relevant to the current situation (e.g., game board
configuration, robot's sensor readings). It captures the information necessary for the agent to
make decisions.
● Action: The choices the agent can make in a given state. These actions influence the
environment and the agent's future state.
● Reward: A signal indicating the desirability of an action. Positive rewards encourage repetition,
while negative rewards (penalties) discourage it. Rewards provide feedback to the agent about
the consequences of its actions.
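A hedged sketch of how these core elements map onto code for a toy environment. The class name, method names, and reward values are illustrative assumptions, loosely following common RL conventions rather than any specific library:

    class GridEnvironment:
        # Toy environment: positions 0..4 on a line; the goal is position 4.
        def __init__(self):
            self.state = 0

        def reset(self):
            # Start a new episode and return the initial observation (state).
            self.state = 0
            return self.state

        def step(self, action):
            # action: 0 = move left, 1 = move right.
            self.state = max(0, min(4, self.state + (1 if action == 1 else -1)))
            reward = 1.0 if self.state == 4 else -0.1   # reward signal from the environment
            done = self.state == 4                      # episode ends at the goal
            return self.state, reward, done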
The RL Loop:
● Perception: The agent observes the current state of the environment through sensors or other
information sources.
● Decision-Making: Based on the perceived state, the agent selects an action using its policy
(strategy). This policy defines the mapping between states and actions.
● Action: The agent takes the chosen action in the environment, potentially changing the
environment's state.
● Reward: The environment provides a reward signal based on the outcome of the action. This
reward reflects the desirability of the chosen action.
● Update: The agent updates its policy (learning) based on the observed reward and the transition
from the previous state to the current state. The goal is to adjust the policy to favor actions that
lead to higher rewards in the long run.
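Putting these steps together as a minimal sketch. This continues the GridEnvironment sketched under Core Elements; the hyperparameters and the tabular epsilon-greedy Q-update are assumptions used only to illustrate the loop:

    import random

    env = GridEnvironment()                     # toy environment sketched earlier (illustrative)
    Q = [[0.0, 0.0] for _ in range(5)]          # Q-values for 5 states x 2 actions
    alpha, gamma, epsilon = 0.1, 0.99, 0.1      # assumed hyperparameters

    for episode in range(50):
        state = env.reset()                                       # Perception: observe the initial state
        done = False
        while not done:
            if random.random() < epsilon:                         # Decision-Making: epsilon-greedy policy
                action = random.randrange(2)
            else:
                action = max(range(2), key=lambda a: Q[state][a])
            next_state, reward, done = env.step(action)           # Action taken, Reward returned
            Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])  # Update
            state = next_state                                    # move on to the new state

Over repeated episodes the updates favor actions that lead to higher long-run reward, which is exactly the goal of the Update step above.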
Visit: www.collegpt.com