Correct equations for the dynamics of the cart-pole system

Răzvan V. Florian
Center for Cognitive and Neural Studies (Coneural)
Str. Saturn 24, 400504 Cluj-Napoca, Romania
[email protected]
July 11, 2005; updated February 10, 2007

Abstract

The problem of balancing a pole on a moving cart is a widely used benchmark problem for testing reinforcement learning algorithms. The classic papers that introduced this problem contain mistakes in the equations that govern the dynamics of the cart-pole system, and these mistakes propagated to other studies that used the same problem as a benchmark. Here we provide the equations that correctly describe the dynamics of the system.

Introduction

The control of a cart-pole system is widely used as a benchmark problem for testing the efficiency of reinforcement learning algorithms. It seems to have been first used as a test problem in adaptive control by Michie and Chambers (1968a,b), and it became better known after its use in the paper of Barto et al. (1983). Google Scholar¹ reports about 500 papers citing this paper, and about 100 papers containing the words "cart pole" or "cartpole". There are, however, two mistakes in the equations from Barto et al. (1983) that describe the dynamics of the cart pole. One mistake introduces a difference between the reported equations and the equations describing a correct physical model, and the other mistake is probably a typo. The mistakes propagated to other papers that followed the original paper of Barto et al. (e.g., Anderson, 1986; Schmidhuber, 1990; Si and Wang, 2001). The existence of these mistakes does not affect the validity of the reinforcement learning algorithms presented in the papers using them, because the problematic equations still describe a complex dynamical system. However, we believe that it is useful to correct them, for the sake of scientific rigor. Here we provide the equations that correctly describe, from a physical point of view, the dynamics of the system.

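Figure 1: The cart-pole system. (Figure not reproduced; its labels are the laboratory unit vectors u_x, u_y, u_z, the primed unit vectors u'_x, u'_y, and the forces F, F_f, G_c, G_p, N, -N, N_c.)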

The system

The studied system is a cart to which a rigid pole is hinged (see Figure 1). The cart is free to move within the bounds of a one-dimensional track. The pole can move in the vertical plane parallel to the track. The controller can apply a force $F$ to the cart, parallel to the track. The cart has mass $m_c$; the pole has mass $m_p$ and length $2l$. We denote by $x$ the position of the cart along the track. The angle between the pole and the vertical is $\theta$. The friction coefficient between the cart and the track is $\mu_c$; there is also friction in the articulation connecting the pole to the cart, which leads to a torque $\mu_p \dot\theta$.

The wrong equations and the mistakes

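The equations reported in Barto et al. (1983) are:

$$\ddot\theta = \frac{g \sin\theta + \cos\theta \left[ \dfrac{-F - m_p l \dot\theta^2 \sin\theta + \mu_c \operatorname{sgn}(\dot x)}{m_c + m_p} \right] - \dfrac{\mu_p \dot\theta}{m_p l}}{l \left[ \dfrac{4}{3} - \dfrac{m_p \cos^2\theta}{m_c + m_p} \right]} \tag{1}$$

$$\ddot x = \frac{F + m_p l \left[ \dot\theta^2 \sin\theta - \ddot\theta \cos\theta \right] - \mu_c \operatorname{sgn}(\dot x)}{m_c + m_p} \tag{2}$$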

The mistake inherent in these equations concerns the friction force between the cart and the track, which is taken to be $\mu_c \operatorname{sgn}(\dot x)$. The mistake is apparent even from dimensional analysis: $\mu_c \operatorname{sgn}(\dot x)$ is dimensionless instead of having the dimension of a force. In fact, the friction force is the product of the friction coefficient and the magnitude of the force normal to the track, and the force normal to the track is not constant, because the movement of the pole changes its magnitude.

A second mistake in Barto et al. (1983) is to consider the gravitational acceleration $g$ in these equations to be negative (the paper specifies $g = -9.8\ \mathrm{m/s^2}$). In fact, these equations are consistent with a positive value of $g$. Otherwise, the term $g \sin\theta$ would produce a negative $\ddot\theta$ for a small positive $\theta$, according to Eq. 1, meaning that gravity pushes the pole towards the vertical position; this is, of course, wrong.

¹ http://scholar.google.com/
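The sign argument can be checked numerically. The sketch below evaluates the pole angular acceleration of Eq. 1 with both friction terms neglected and no applied force. The function name `theta_ddot_frictionless` and the parameter values are illustrative assumptions, not values taken from the text.

```python
import math

def theta_ddot_frictionless(theta, theta_dot, force, g,
                            m_c=1.0, m_p=0.1, l=0.5):
    """Pole angular acceleration from Eq. 1 with both friction terms set to zero.

    The default masses and half-length are illustrative assumptions.
    """
    total_mass = m_c + m_p
    numerator = g * math.sin(theta) + math.cos(theta) * (
        (-force - m_p * l * theta_dot ** 2 * math.sin(theta)) / total_mass
    )
    denominator = l * (4.0 / 3.0 - m_p * math.cos(theta) ** 2 / total_mass)
    return numerator / denominator

small_angle = 0.05  # rad, pole tilted slightly away from the vertical
acc_positive_g = theta_ddot_frictionless(small_angle, 0.0, 0.0, g=9.8)
acc_negative_g = theta_ddot_frictionless(small_angle, 0.0, 0.0, g=-9.8)
```

With a positive `g` the tilted pole accelerates further away from the vertical, as expected for an inverted pendulum; with a negative `g` the same formula would pull it back upright.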

Correct equations

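We consider that the cart acts with a reaction force $\mathbf{N}$ on the pole, at the articulation. According to the law of action and reaction, the pole acts on the cart with a force $-\mathbf{N}$. By applying Newton's second law to the cart we get:

$$\mathbf{F} + \mathbf{F}_f + \mathbf{G}_c - \mathbf{N} + \mathbf{N}_c = m_c \mathbf{a}_c, \tag{3}$$

where $\mathbf{F}_f$ is the friction force between the cart and the track that acts on the cart, and $\mathbf{a}_c$ is the acceleration of the cart. We have $\mathbf{F} = F \mathbf{u}_x$; $\mathbf{F}_f = -F_f \mathbf{u}_x$; $\mathbf{G}_c = m_c g \mathbf{u}_y$; $\mathbf{N} = N_x \mathbf{u}_x - N_y \mathbf{u}_y$; $\mathbf{N}_c = -N_c \mathbf{u}_y$; $\mathbf{a}_c = \ddot x \mathbf{u}_x$; $\mathbf{u}_x$, $\mathbf{u}_y$ and $\mathbf{u}_z$ are the unit vectors of the laboratory frame of reference (see Figure 1; since $g$ is positive, $\mathbf{u}_y$ points downward). Decomposing the previous equation on the $x$ and $y$ axes we get

$$F - F_f - N_x = m_c \ddot x \tag{4}$$

$$m_c g + N_y - N_c = 0. \tag{5}$$

According to the Coulomb model of friction, and assuming that the track limits the movement of the cart both downwards and upwards, the friction force is

$$F_f = \mu_c |N_c| \operatorname{sgn}(\dot x) = \mu_c N_c \operatorname{sgn}(N_c \dot x). \tag{6}$$

By applying Newton's second law to the linear movement of the pole we get:

$$\mathbf{N} + \mathbf{G}_p = m_p \mathbf{a}_p, \tag{7}$$

where $\mathbf{G}_p = m_p g \mathbf{u}_y$. The acceleration $\mathbf{a}_p$ of the center of mass of the pole is due to the combined effects of the acceleration of the cart it is attached to, and of the rotation of the pole with angular velocity $\boldsymbol{\omega} = \dot\theta \mathbf{u}_z$ and angular acceleration $\boldsymbol{\varepsilon} = \ddot\theta \mathbf{u}_z$:

$$\mathbf{a}_p = \mathbf{a}_c + \boldsymbol{\varepsilon} \times \mathbf{r}_p + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_p), \tag{8}$$

where $\mathbf{r}_p = l (\sin\theta\, \mathbf{u}_x - \cos\theta\, \mathbf{u}_y)$ is the vector representing the position of the center of mass of the pole relative to the articulation around which the pole rotates. Thus, we get

$$\mathbf{a}_p = \ddot x \mathbf{u}_x + l \ddot\theta\, \mathbf{u}_z \times (\sin\theta\, \mathbf{u}_x - \cos\theta\, \mathbf{u}_y) + l \dot\theta^2\, \mathbf{u}_z \times [\mathbf{u}_z \times (\sin\theta\, \mathbf{u}_x - \cos\theta\, \mathbf{u}_y)]. \tag{9}$$

We have $\mathbf{u}_z \times \mathbf{u}_x = \mathbf{u}_y$ and $\mathbf{u}_z \times \mathbf{u}_y = -\mathbf{u}_x$. Hence,

$$\mathbf{a}_p = \ddot x \mathbf{u}_x + l \ddot\theta (\sin\theta\, \mathbf{u}_y + \cos\theta\, \mathbf{u}_x) - l \dot\theta^2 (\sin\theta\, \mathbf{u}_x - \cos\theta\, \mathbf{u}_y). \tag{10}$$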
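The expansion from Eq. 8 to Eq. 10 can be checked numerically by evaluating the cross products directly. The following is a minimal sketch in plain Python; the helper names `cross`, `add`, `pole_acceleration_eq8` and `pole_acceleration_eq10` are hypothetical and introduced only for this check.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(*vectors):
    """Component-wise sum of 3-vectors."""
    return tuple(sum(components) for components in zip(*vectors))

def pole_acceleration_eq8(x_ddot, theta, theta_dot, theta_ddot, l):
    """a_p = a_c + eps x r_p + omega x (omega x r_p), as in Eq. 8."""
    a_c = (x_ddot, 0.0, 0.0)
    omega = (0.0, 0.0, theta_dot)    # angular velocity along u_z
    eps = (0.0, 0.0, theta_ddot)     # angular acceleration along u_z
    r_p = (l * math.sin(theta), -l * math.cos(theta), 0.0)
    return add(a_c, cross(eps, r_p), cross(omega, cross(omega, r_p)))

def pole_acceleration_eq10(x_ddot, theta, theta_dot, theta_ddot, l):
    """Expanded x and y components of a_p, as in Eq. 10."""
    s, c = math.sin(theta), math.cos(theta)
    a_x = x_ddot + l * theta_ddot * c - l * theta_dot ** 2 * s
    a_y = l * theta_ddot * s + l * theta_dot ** 2 * c
    return (a_x, a_y, 0.0)
```

Both functions should return the same vector for any choice of arguments, which confirms the signs of the two cross-product identities used in the derivation.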

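An alternative way to get to the previous equation is to compute directly the acceleration component due to the angular velocity (of magnitude $l \dot\theta^2$, along $\mathbf{u}'_x$) and the acceleration component due to the angular acceleration (of magnitude $l \ddot\theta$, along $\mathbf{u}'_y$), and then to express the unit vectors $\mathbf{u}'_x$ and $\mathbf{u}'_y$ of the frame of reference rotating with the pole in the laboratory frame of reference. By introducing Eq. 10 into Eq. 7 and decomposing on the $x$ and $y$ axes we get

$$N_x = m_p (\ddot x + l \ddot\theta \cos\theta - l \dot\theta^2 \sin\theta) \tag{11}$$

$$m_p g - N_y = m_p (l \ddot\theta \sin\theta + l \dot\theta^2 \cos\theta). \tag{12}$$

By applying Newton's second law to the rotational movement of the pole around the articulation (which moves with acceleration $\mathbf{a}_c$ relative to the laboratory frame of reference) we get:

$$\mathbf{M} = I \boldsymbol{\varepsilon} + m_p\, \mathbf{r}_p \times \mathbf{a}_c, \tag{13}$$

where $\mathbf{M} = \mathbf{r}_p \times \mathbf{G}_p - \mu_p \dot\theta\, \mathbf{u}_z$ is the sum of the non-inertial torques acting on the pole relative to the articulation, $I = \tfrac{4}{3} m_p l^2$ is the moment of inertia of the pole relative to the articulation, and $m_p\, \mathbf{r}_p \times \mathbf{a}_c$ can be interpreted as the torque generated by the inertial force caused by the acceleration of the cart. Hence, we get

$$m_p g l \sin\theta - \mu_p \dot\theta = \tfrac{4}{3} m_p l^2 \ddot\theta + m_p \ddot x\, l \cos\theta. \tag{14}$$

From Eqs. 4 and 11 we get:

$$\ddot x = \frac{F + m_p l (\dot\theta^2 \sin\theta - \ddot\theta \cos\theta) - F_f}{m_c + m_p}, \tag{15}$$

and by introducing this into Eq. 14 we get

$$\ddot\theta = \frac{g \sin\theta + \cos\theta \left[ \dfrac{-F - m_p l \dot\theta^2 \sin\theta + F_f}{m_c + m_p} \right] - \dfrac{\mu_p \dot\theta}{m_p l}}{l \left[ \dfrac{4}{3} - \dfrac{m_p \cos^2\theta}{m_c + m_p} \right]}. \tag{16}$$

We see that the last two equations are the same as Barto's equations 1 and 2, with the difference that Barto used a form of the friction force $F_f$ that is wrong. Indeed, from Eqs. 6, 5 and 12 we get

$$N_c = (m_c + m_p) g - m_p l (\ddot\theta \sin\theta + \dot\theta^2 \cos\theta) \tag{17}$$

$$F_f = \mu_c \left[ (m_c + m_p) g - m_p l (\ddot\theta \sin\theta + \dot\theta^2 \cos\theta) \right] \operatorname{sgn}(N_c \dot x). \tag{18}$$

The friction force of Eq. 18 depends on $\ddot\theta$ through $N_c$, so after introducing it in Eq. 15 and then in Eq. 16 the terms in $\ddot\theta$ must be collected on the left-hand side. We get

$$\ddot\theta = \frac{g \sin\theta + \cos\theta \left\{ \dfrac{-F - m_p l \dot\theta^2 \left[ \sin\theta + \mu_c \operatorname{sgn}(N_c \dot x) \cos\theta \right]}{m_c + m_p} + \mu_c g \operatorname{sgn}(N_c \dot x) \right\} - \dfrac{\mu_p \dot\theta}{m_p l}}{l \left\{ \dfrac{4}{3} - \dfrac{m_p \cos\theta}{m_c + m_p} \left[ \cos\theta - \mu_c \operatorname{sgn}(N_c \dot x) \sin\theta \right] \right\}}. \tag{19}$$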
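The last step involves collecting the terms in $\ddot\theta$, because the friction force of Eq. 18 itself depends on $\ddot\theta$ through $N_c$. As a sketch of that step, write $s = \operatorname{sgn}(N_c \dot x)$ and $M = m_c + m_p$ (shorthand introduced here only for brevity, not used elsewhere in the paper):

```latex
\begin{align*}
\frac{F_f}{M} &= \mu_c s \left[ g - \frac{m_p l \left( \ddot\theta \sin\theta + \dot\theta^2 \cos\theta \right)}{M} \right], \\
l \ddot\theta \left[ \frac{4}{3} - \frac{m_p \cos^2\theta}{M} \right]
  &= g \sin\theta + \cos\theta \left[ \frac{-F - m_p l \dot\theta^2 \sin\theta}{M} + \frac{F_f}{M} \right] - \frac{\mu_p \dot\theta}{m_p l}, \\
l \ddot\theta \left[ \frac{4}{3} - \frac{m_p \cos\theta}{M} \left( \cos\theta - \mu_c s \sin\theta \right) \right]
  &= g \sin\theta + \cos\theta \left[ \frac{-F - m_p l \dot\theta^2 \left( \sin\theta + \mu_c s \cos\theta \right)}{M} + \mu_c g s \right] - \frac{\mu_p \dot\theta}{m_p l}.
\end{align*}
```

The second line is Eq. 16 multiplied by its denominator; the third line follows by inserting the first and moving the term $-\mu_c s\, m_p l\, \ddot\theta \sin\theta \cos\theta / M$ to the left-hand side. At $\theta = 0$ this term vanishes, which agrees with Eq. 17: for a vertical pole, $N_c$ does not depend on $\ddot\theta$.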

Conclusion

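In conclusion, we have provided dynamical equations for the cart-pole system that are correct from a physical point of view. They are:

$$N_c = (m_c + m_p) g - m_p l (\ddot\theta \sin\theta + \dot\theta^2 \cos\theta) \tag{20}$$

$$\ddot\theta = \frac{g \sin\theta + \cos\theta \left\{ \dfrac{-F - m_p l \dot\theta^2 \left[ \sin\theta + \mu_c \operatorname{sgn}(N_c \dot x) \cos\theta \right]}{m_c + m_p} + \mu_c g \operatorname{sgn}(N_c \dot x) \right\} - \dfrac{\mu_p \dot\theta}{m_p l}}{l \left\{ \dfrac{4}{3} - \dfrac{m_p \cos\theta}{m_c + m_p} \left[ \cos\theta - \mu_c \operatorname{sgn}(N_c \dot x) \sin\theta \right] \right\}} \tag{21}$$

$$\ddot x = \frac{F + m_p l (\dot\theta^2 \sin\theta - \ddot\theta \cos\theta) - \mu_c N_c \operatorname{sgn}(N_c \dot x)}{m_c + m_p}. \tag{22}$$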

During a simulation of the system, at each timestep, we may assume that $N_c$ has the same sign as at the previous timestep (we may consider it to be positive at the beginning of the simulation) and compute $\ddot\theta$ according to Eq. 21. We then compute $N_c$ using the value of $\ddot\theta$ that we obtained, according to Eq. 20; if $N_c$ changes sign, we compute $\ddot\theta$ again, taking into account the new sign. Finally, we compute $\ddot x$ according to Eq. 22. Usually, for common choices of the parameters, $N_c$ will always be positive, as the cart should not try to jump off the track.

If we neglect friction, the equations are

$$\ddot\theta = \frac{g \sin\theta + \cos\theta \left[ \dfrac{-F - m_p l \dot\theta^2 \sin\theta}{m_c + m_p} \right]}{l \left[ \dfrac{4}{3} - \dfrac{m_p \cos^2\theta}{m_c + m_p} \right]} \tag{23}$$

$$\ddot x = \frac{F + m_p l (\dot\theta^2 \sin\theta - \ddot\theta \cos\theta)}{m_c + m_p}. \tag{24}$$

In these equations, g is positive, and not negative, as mistakenly indicated in the paper of Barto.
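The simulation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names `CartPoleParams`, `accelerations` and `euler_step`, the default parameter values, and the use of explicit Euler integration are all assumptions made for this example. The denominator of the angular acceleration includes the term $\mu_c \operatorname{sgn}(N_c \dot x) \sin\theta$ that results from substituting the friction force $\mu_c N_c \operatorname{sgn}(N_c \dot x)$, with $N_c$ from Eq. 20, into the rotational equation of the pole.

```python
import math
from dataclasses import dataclass

@dataclass
class CartPoleParams:
    """Physical parameters; the default values are illustrative assumptions."""
    m_c: float = 1.0     # cart mass [kg]
    m_p: float = 0.1     # pole mass [kg]
    l: float = 0.5       # half-length of the pole [m]
    mu_c: float = 0.05   # cart-track friction coefficient
    mu_p: float = 0.002  # friction coefficient at the articulation
    g: float = 9.8       # gravitational acceleration [m/s^2], positive

def sign(value):
    """Return -1.0, 0.0 or 1.0 according to the sign of value."""
    return float((value > 0) - (value < 0))

def accelerations(x_dot, theta, theta_dot, force, p, nc_sign=1.0):
    """Return (x_ddot, theta_ddot, n_c) following Eqs. 20-22.

    nc_sign is the sign of N_c assumed from the previous timestep.
    """
    total_mass = p.m_c + p.m_p
    sin_t, cos_t = math.sin(theta), math.cos(theta)

    def theta_ddot_for(assumed_sign):
        s = sign(assumed_sign * x_dot)  # sgn(N_c * x_dot)
        numerator = (
            p.g * sin_t
            + cos_t * ((-force - p.m_p * p.l * theta_dot ** 2
                        * (sin_t + p.mu_c * s * cos_t)) / total_mass
                       + p.mu_c * p.g * s)
            - p.mu_p * theta_dot / (p.m_p * p.l)
        )
        denominator = p.l * (4.0 / 3.0 - p.m_p * cos_t / total_mass
                             * (cos_t - p.mu_c * s * sin_t))
        return numerator / denominator

    def normal_force(theta_ddot):
        # Eq. 20
        return total_mass * p.g - p.m_p * p.l * (
            theta_ddot * sin_t + theta_dot ** 2 * cos_t)

    theta_ddot = theta_ddot_for(nc_sign)
    n_c = normal_force(theta_ddot)
    if sign(n_c) != nc_sign:
        # N_c changed sign: recompute theta_ddot with the new sign.
        theta_ddot = theta_ddot_for(sign(n_c))
        n_c = normal_force(theta_ddot)

    # Eq. 22
    x_ddot = (force
              + p.m_p * p.l * (theta_dot ** 2 * sin_t - theta_ddot * cos_t)
              - p.mu_c * n_c * sign(n_c * x_dot)) / total_mass
    return x_ddot, theta_ddot, n_c

def euler_step(state, force, p, dt=0.001, nc_sign=1.0):
    """Advance (x, x_dot, theta, theta_dot) by one explicit Euler step.

    Returns the new state and the sign of N_c to pass to the next step.
    """
    x, x_dot, theta, theta_dot = state
    x_ddot, theta_ddot, n_c = accelerations(
        x_dot, theta, theta_dot, force, p, nc_sign)
    new_state = (x + dt * x_dot, x_dot + dt * x_ddot,
                 theta + dt * theta_dot, theta_dot + dt * theta_ddot)
    return new_state, sign(n_c)
```

A simulation loop would call `euler_step` repeatedly, feeding the returned sign back in as `nc_sign`. Setting `mu_c` and `mu_p` to zero reduces the computation to the frictionless Eqs. 23 and 24. Explicit Euler is chosen only for brevity; any other integration scheme could be used with the same `accelerations` function.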

Acknowledgements

I thank James Knight for spotting a typo in a previous version of this paper.

References
Anderson, C. W. (1986), Learning and Problem Solving with Multilayer Connectionist Systems, PhD thesis, University of Massachusetts, Department of Computer and Information Science. http://www.cs.colostate.edu/anderson/res/rl/chuck-diss.pdf

Barto, A. G., Sutton, R. S. and Anderson, C. (1983), Neuron-like adaptive elements that can solve difficult learning control problems, IEEE Transactions on Systems, Man, and Cybernetics 13, 834-846. http://www.cs.ualberta.ca/sutton/papers/barto-sutton-anderson-83.pdf.gz

Michie, D. and Chambers, R. A. (1968a), BOXES: An experiment in adaptive control, in E. Dale and D. Michie, eds, Machine Intelligence 2, Oliver and Boyd, Edinburgh, pp. 137-152.

Michie, D. and Chambers, R. A. (1968b), Boxes as a model of pattern formation, in C. H. Waddington, ed., Towards a theoretical biology, Vol. 1, Edinburgh University Press, Edinburgh, pp. 206-215.

Schmidhuber, J. (1990), Networks adjusting networks, in J. Kindermann and A. Linden, eds, Proceedings of Distributed Adaptive Neural Information Processing, Oldenbourg, pp. 197-208. Extended version: TR FKI-125-90 (revised), Institut für Informatik, TUM. ftp://ftp.idsia.ch/pub/juergen/fki125.ps.gz

Si, J. and Wang, Y.-T. (2001), On-line learning control by association and reinforcement, IEEE Transactions on Neural Networks 12(2), 264-276. http://ebrains.la.asu.edu/jennie/siandwang01.pdf
