
UNIT 1: REINFORCEMENT LEARNING

TOPIC 2: ELEMENTS OF REINFORCEMENT LEARNING, LIMITATIONS AND SCOPE OF REINFORCEMENT LEARNING

ELEMENTS OF REINFORCEMENT LEARNING:


1. Policy

2. Reward Signal

3. Value Function

4. Model of the environment

1) Policy: A policy defines how an agent behaves at a given time. It maps the perceived states of the environment to the actions to be taken in those states. The policy is the core element of RL, as it alone is sufficient to define the agent's behavior. In some cases it may be a simple function or a lookup table, whereas in others it may involve extensive computation, such as a search process. A policy can be deterministic or stochastic:

For a deterministic policy: a = π(s)

For a stochastic policy: π(a | s) = P[At = a | St = s]
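The two policy types above can be sketched in a few lines of Python. The states, actions, and probabilities here are invented purely for illustration:

```python
import random

def deterministic_policy(state):
    # a = pi(s): each state maps to exactly one action (a lookup table)
    lookup = {"s0": "left", "s1": "right"}
    return lookup[state]

def stochastic_policy(state):
    # pi(a|s) = P[At = a | St = s]: sample an action from a
    # probability distribution conditioned on the current state
    probs = {"s0": {"left": 0.8, "right": 0.2},
             "s1": {"left": 0.3, "right": 0.7}}
    actions, weights = zip(*probs[state].items())
    return random.choices(actions, weights=weights)[0]

print(deterministic_policy("s0"))  # always "left"
print(stochastic_policy("s0"))     # "left" about 80% of the time
```

The deterministic policy always returns the same action for a state, while the stochastic policy returns different actions on different calls, according to the stated probabilities.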

2) Reward Signal: The goal of reinforcement learning is defined by the reward signal. At each time step, the environment sends an immediate numeric signal to the learning agent, known as the reward. Rewards are given according to the good and bad actions taken by the agent, and the agent's main objective is to maximize the total reward it receives over time. The reward signal can cause the policy to change: if an action selected by the agent leads to a low reward, the policy may change to select other actions in that situation in the future.
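A minimal sketch of the idea, using a made-up reward rule: the environment returns +1 for a "good" action and -1 otherwise, and the agent's objective is the cumulative reward over an episode:

```python
def reward_signal(state, action):
    # hypothetical reward rule, for illustration only
    return 1 if action == "good" else -1

total_reward = 0
for step in range(5):
    action = "good"  # an agent that has learned the better action
    total_reward += reward_signal(None, action)

print(total_reward)  # 5
```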

3) Value Function: The value function indicates how good a state (or a state-action pair) is in the long run, i.e., how much reward an agent can expect to accumulate starting from it. Whereas the reward signal indicates what is good immediately, the value function specifies what is good in the future. Values depend on rewards: without rewards there could be no values, and the only purpose of estimating values is to obtain more reward.
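One common way to make "how much reward an agent can expect" concrete is the discounted return, the sum of future rewards weighted by a discount factor gamma. The reward sequence below is invented for illustration:

```python
GAMMA = 0.9  # discount factor (an assumed value)

def discounted_return(rewards, gamma=GAMMA):
    # sum of r_t * gamma**t over the reward sequence
    return sum(r * gamma**t for t, r in enumerate(rewards))

# The immediate reward is 0, but a later reward is high, so the
# state still has high value: value looks at the future, not just
# the immediate signal.
print(discounted_return([0, 0, 10]))  # approximately 8.1
```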
4) Model: The last element of reinforcement learning is the model, which mimics the behavior of the environment. With the help of the model, one can make inferences about how the environment will behave: given a state and an action, the model can predict the next state and the next reward.

The model is used for planning, i.e., deciding on a course of action by considering possible future situations before actually experiencing them. Approaches that solve RL problems with the help of a model are termed model-based approaches; by contrast, approaches that work without a model are called model-free approaches.
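A tabular model can be sketched as a dictionary mapping (state, action) to (next state, reward). The transitions below are invented for illustration; in practice a model would be learned from experience:

```python
# Hypothetical environment model: (state, action) -> (next_state, reward)
model = {
    ("s0", "right"): ("s1", 0),
    ("s1", "right"): ("goal", 10),
}

def plan(start, actions):
    """Simulate a course of action with the model (planning),
    without actually experiencing the environment."""
    state, total = start, 0
    for a in actions:
        state, r = model[(state, a)]
        total += r
    return state, total

print(plan("s0", ["right", "right"]))  # ('goal', 10)
```

This is the essence of the model-based approach: candidate action sequences are evaluated against the model first, and only the chosen one is executed in the real environment.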
Reinforcement Learning Applications

1. Robotics: RL is used in robot navigation, robo-soccer, walking, juggling, etc.

2. Control: RL can be used for adaptive control, such as factory processes, admission control in telecommunications, and helicopter piloting.

3. Game Playing: RL can be used in game playing, such as tic-tac-toe, chess, etc.

4. Chemistry: RL can be used for optimizing chemical reactions.

5. Business: RL is now used for business strategy planning.

6. Manufacturing: In various automobile manufacturing companies, robots use deep reinforcement learning to pick goods and put them into containers.

7. Finance Sector: RL is currently used in the finance sector for evaluating trading strategies.

LIMITATIONS OF REINFORCEMENT LEARNING:

 Too much reinforcement can lead to an overload of states, which can diminish the results.

 RL is suited to solving complex problems, not simple ones.

 It requires plenty of data and involves heavy computation.

 Maintenance costs are high.


CHALLENGES IN REINFORCEMENT LEARNING

1. Large Datasets: Since reinforcement learning models are complex, they need massive amounts of data to make better decisions.

2. Environment Dependency: Reinforcement learning models learn from the agent's interactions with the environment. Because the agent learns from the current state of the environment, a constantly changing environment makes it difficult for the agent to be trained.

3. Design of Reward Structure: For any real-world use case of RL, one needs to analyze the problem statement and devise an appropriate reward structure: when should the model be rewarded, and when should it be penalized? This remains a problem that researchers constantly face.
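The reward-design challenge can be made concrete with a toy comparison: the same task (reach a goal quickly) under two reward structures. The functions and numbers here are assumptions, not a prescribed scheme:

```python
def sparse_reward(reached_goal):
    # reward only at the end of the episode; the agent gets no
    # feedback along the way, which makes learning harder
    return 10 if reached_goal else 0

def shaped_reward(reached_goal, steps_taken):
    # a small per-step penalty nudges the agent toward short paths
    return (10 if reached_goal else 0) - 0.1 * steps_taken

print(shaped_reward(True, 20))  # 8.0: goal bonus minus step penalty
```

Both designs reward reaching the goal, but the shaped version also penalizes slow solutions; choosing between such structures for a real problem is exactly the design question described above.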


SCOPE OF REINFORCEMENT LEARNING

 Reinforcement learning closely mimics human learning patterns: observation and trial and error. For example, consider a game of chess. Our agent begins to play the game with a pure trial-and-error approach. Every time it wins, it is rewarded, and every time it loses, it is penalized accordingly.

 It is an excellent fit for situations in which the goal the problem statement is trying to accomplish is clear, but the way of getting there is not.

 The real-world applications of RL are still limited and not yet part of our daily lives. With tenacious and committed researchers constantly digging deeper into the field, RL may break through the challenges current studies face and help revolutionize the artificial intelligence sphere.
