
Data-Driven Control for Autonomous Systems Using Reinforcement Learning

Authors: Dhruv Kumar, Anirudh Pratap Singh

Supervised by: Dr. Sandeep Kumar Soni

Abstract

This paper investigates a data-driven control approach for autonomous systems using reinforcement learning (RL). The project focuses on stabilizing an inverted pendulum and a robotic arm through Q-learning, a model-free RL algorithm. The system dynamics, including torque and angular velocities, are represented discretely to enable optimal control policy learning. Simulations demonstrate the successful stabilization of both systems, showing the effectiveness of RL in control tasks that involve nonlinear dynamics and uncertain environments.

Keywords

Reinforcement learning, Q-learning, data-driven control, inverted pendulum, robotic arm, autonomous systems

I. Introduction

Optimal control has been a fundamental topic in control theory for decades, with applications in fields such as robotics, missile guidance, and energy systems. For complex nonlinear systems, however, traditional model-based methods often fall short. This paper presents a reinforcement learning approach to stabilizing two classic systems: an inverted pendulum and a robotic arm. Using Q-learning, the agent learns to apply optimal torques to stabilize both systems, demonstrating the potential of RL for handling uncertain and nonlinear dynamics.

II. Related Work

Reinforcement learning, particularly Q-learning, has seen extensive application in control systems because it can learn optimal policies without requiring a model of the environment. Previous studies have demonstrated its effectiveness in stabilizing a variety of control systems, including data-driven consensus and event-triggered designs [1], [2]. This work builds on that line of research by applying Q-learning to autonomous systems with unknown dynamics.

III. System Model

The systems under consideration are:

1. Inverted Pendulum: a classic example of a nonlinear system, where the goal is to stabilize the pendulum in the upright position.

2. Robotic Arm: a multi-joint system, where the objective is to apply optimal torques to reach a desired joint configuration.

In both systems, the state is represented by the angular displacement and angular velocity, which are discretized for use with Q-learning. The torque is treated as a discrete action, and the reward function penalizes deviations from the desired state.
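
The paper does not give the simulation equations. A minimal sketch of a standard frictionless inverted-pendulum model that matches this state/action description is shown below; the mass, rod length, and time step are illustrative assumptions, not values from the paper.

    import numpy as np

    # Illustrative constants; the paper does not specify these values.
    G, M, L_ROD, DT = 9.81, 1.0, 1.0, 0.05

    def pendulum_step(theta, omega, tau):
        # One explicit-Euler step of a standard frictionless inverted
        # pendulum: theta'' = (g / l) * sin(theta) + tau / (m * l^2),
        # with theta measured from the upright position.
        alpha = (G / L_ROD) * np.sin(theta) + tau / (M * L_ROD ** 2)
        omega_next = omega + alpha * DT
        theta_next = theta + omega_next * DT
        return theta_next, omega_next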

IV. Methodology

A. Q-Learning Framework

Q-learning is a model-free reinforcement learning algorithm in which the agent learns the optimal policy by updating a Q-table from observed rewards. The algorithm operates in discrete time steps: at each step, the agent selects an action, observes the resulting reward and next state, and updates its estimate of the Q-values.

1. State Representation: the continuous state space (angular displacement and angular velocity) is discretized into bins, enabling the use of tabular Q-learning.

2. Action Representation: the torque applied to the pendulum or robotic arm is discretized into multiple levels to control the system's behavior.

3. Reward Function: a custom reward function penalizes large deviations from equilibrium; a sketch of one possible implementation follows this list.
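
The paper does not list the bin counts, torque levels, or reward formula. The following is a minimal sketch, assuming 31 bins per state dimension, 5 torque levels, and a quadratic penalty on deviation from the upright equilibrium; all of these are assumptions for illustration.

    import numpy as np

    # Assumed discretization; the paper does not give exact values.
    N_BINS = 31
    THETA_BINS = np.linspace(-np.pi, np.pi, N_BINS)
    OMEGA_BINS = np.linspace(-8.0, 8.0, N_BINS)
    ACTIONS = np.linspace(-2.0, 2.0, 5)   # discrete torque levels (N*m)

    def discretize(theta, omega):
        # Map the continuous state to a pair of bin indices.
        return np.digitize(theta, THETA_BINS), np.digitize(omega, OMEGA_BINS)

    def reward(theta, omega):
        # Quadratic penalty on deviation from the upright equilibrium.
        return -(theta ** 2 + 0.1 * omega ** 2)

    # Q-table indexed by (theta bin, omega bin, action index).
    Q = np.zeros((N_BINS + 1, N_BINS + 1, len(ACTIONS)))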
B. Training Process

The agent is trained for 10,000 episodes, during which it explores new actions and exploits those already learned. The exploration-exploitation trade-off is controlled by an epsilon-greedy strategy whose epsilon value decays over time. The Q-values are updated with the Q-learning form of the Bellman equation, written out below.
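
Written out, the standard tabular update is

    Q(s_t, a_t) <- Q(s_t, a_t) + alpha * [ r_t + gamma * max_a' Q(s_{t+1}, a') - Q(s_t, a_t) ]

where alpha is the learning rate and gamma the discount factor; the paper does not state either value, so the constants below are assumptions. A minimal epsilon-greedy training loop, reusing pendulum_step, discretize, reward, ACTIONS, and Q from the sketches above, might look like this:

    ALPHA, GAMMA = 0.1, 0.99           # assumed learning rate and discount factor
    EPS, EPS_DECAY = 1.0, 0.9995       # assumed epsilon-greedy schedule
    N_EPISODES, N_STEPS = 10_000, 200  # 10,000 episodes, as stated in the paper

    for episode in range(N_EPISODES):
        theta, omega = np.random.uniform(-0.5, 0.5, size=2)  # random start near upright
        for _ in range(N_STEPS):
            i, j = discretize(theta, omega)
            if np.random.rand() < EPS:                 # explore
                a = np.random.randint(len(ACTIONS))
            else:                                      # exploit
                a = int(np.argmax(Q[i, j]))
            theta, omega = pendulum_step(theta, omega, ACTIONS[a])
            i2, j2 = discretize(theta, omega)
            # Tabular Q-learning (Bellman) update.
            Q[i, j, a] += ALPHA * (reward(theta, omega)
                                   + GAMMA * np.max(Q[i2, j2]) - Q[i, j, a])
        EPS *= EPS_DECAY                               # decay exploration over time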

V. Results

A. Pendulum Stabilization

The Q-learning agent successfully stabilized the inverted pendulum, bringing both the angular displacement and the angular velocity to zero. The agent was able to adapt to various initial conditions, showcasing the robustness of the RL approach.

B. Robotic Arm Stabilization

Similarly, the RL agent stabilized the robotic arm, applying torques to the joints to bring them to the desired positions.

VI. Conclusion

This paper demonstrates the applicability of reinforcement learning, specifically Q-learning, to autonomous systems such as the inverted pendulum and the robotic arm. The results indicate that RL can effectively stabilize nonlinear and uncertain systems, providing a foundation for real-world applications in robotics and automation.

VII. Future Work

Future research will focus on:

- Exploring adaptive learning rates to improve convergence.
- Applying more advanced RL algorithms such as Deep Q-Networks (DQN).
- Testing the trained policies on real-world systems to evaluate their performance in physical environments.

References

1. X. Jia, X. Zhang, S. Zhu, F. Deng, and B. Zhu, "Data-driven adaptive consensus control for heterogeneous nonlinear multi-agent systems using online reinforcement learning," IEEE Transactions on Cybernetics, 2020.

2. T. Wang, G. Zong, X. Zhao, and N. Xu, "Data-driven-based sliding-mode dynamic event-triggered control of unknown nonlinear systems via reinforcement learning," IEEE Transactions on Control Systems Technology, 2020.

3. GeeksforGeeks, "What is Reinforcement Learning?" Available: https://www.geeksforgeeks.org/what-is-reinforcement-learning/. [Accessed: Dec. 2024].

4. GeeksforGeeks, "Q-Learning in Python," Available: https://www.geeksforgeeks.org/q-learning-in-python/. [Accessed: Dec. 2024].

5. MathWorks, "Train DDPG Agent to Swing-Up and Balance Pendulum," Available: https://in.mathworks.com/help/reinforcement-learning/ug/train-ddpg-agent-to-swing-up-and-balance-pendulum.html. [Accessed: Dec. 2024].
