
ECE 551 LECTURE 3

Preface

This lecture continues our discussion of the method of dynamic programming. We proceed to generalize
the particular solution obtained from the example in the prior lecture to encompass all optimal control
problems involving an nth order differential equation with both state and control input constraints. This
treatment gives rise to a general mathematical statement of the dynamic programming algorithm when
applied to optimal control problems involving time-invariant dynamical systems.

We next consider whether dynamic programming can be used to solve problems involving continuous-
time systems without converting those systems to a discrete-time representation. This leads to the
classic Hamilton-Jacobi-Bellman (HJB) partial differential equation.

Optimal Control Application

We will begin with a statement of the problem as follows:

Consider an nth order time-invariant system described by the following state equation:

$\dot{x}(t) = a\big(x(t),\, u(t)\big)$    (4.10)

We seek the control law that minimizes the performance measure

$J = h\big(x(t_f)\big) + \int_{0}^{t_f} g\big(x(t),\, u(t)\big)\, dt$    (4.11)

where the final time $t_f$ is assumed to be fixed, and the admissible controls are constrained to lie within a set $U$, i.e. $u \in U$.

We begin our solution by converting equation (4.10) to a difference equation. We assume there are $N$ equally spaced time increments in $[0, t_f]$, and, as before, we approximate the time derivative as follows:

$\dot{x}(t) \approx \dfrac{x(t+\Delta t) - x(t)}{\Delta t}$

And so, we have that

$x(k+1) = x(k) + \Delta t\, a\big(x(k),\, u(k)\big)$

Which we denote as follows:

$x(k+1) = a_D\big(x(k),\, u(k)\big)$    (4.12)


Proceeding in a similar manner with the performance measure, we obtain

$J = h\big(x(N)\big) + \Delta t \sum_{k=0}^{N-1} g\big(x(k),\, u(k)\big)$

Which we denote as follows:

$J = h\big(x(N)\big) + \sum_{k=0}^{N-1} g_D\big(x(k),\, u(k)\big)$    (4.13)
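
As a concrete illustration (using an assumed scalar plant, not data from the lecture): for $\dot{x}(t) = -x(t) + u(t)$ and $g\big(x(t), u(t)\big) = x^{2}(t) + u^{2}(t)$, the Euler discretization above gives

$a_D\big(x(k), u(k)\big) = (1-\Delta t)\, x(k) + \Delta t\, u(k), \qquad g_D\big(x(k), u(k)\big) = \Delta t\, \big[x^{2}(k) + u^{2}(k)\big]$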

Now, let's define the following quantity, which describes the cost of reaching the final state value $x(N)$, i.e.

$J_{N,N}\big(x(N)\big) = h\big(x(N)\big)$    (4.14)

Next, we define the following

$J_{N-1,N}\big(x(N-1),\, u(N-1)\big) = J_{N,N}\big(x(N)\big) + g_D\big(x(N-1),\, u(N-1)\big)$    (4.15)

Note that equation (4.15) represents a one-stage process, i.e. the cost of operation during the interval $(N-1)\Delta t \le t \le N\Delta t$. Also note that $J_{N-1,N}$ depends only on $x(N-1)$ and $u(N-1)$, since $x(N)$ is related to both $x(N-1)$ and $u(N-1)$ through equation (4.12). Thus, we may write

$J_{N-1,N}\big(x(N-1),\, u(N-1)\big) = g_D\big(x(N-1),\, u(N-1)\big) + J_{N,N}\Big(a_D\big(x(N-1),\, u(N-1)\big)\Big)$    (4.16)

And so, the optimal cost is given by

$J^{*}_{N-1,N}\big(x(N-1)\big) = \min_{u(N-1)} \Big\{ g_D\big(x(N-1),\, u(N-1)\big) + J_{N,N}\Big(a_D\big(x(N-1),\, u(N-1)\big)\Big) \Big\}$    (4.17)

We know that the optimal choice $u^{*}(N-1)$ will depend on $x(N-1)$; thus we may denote it by $u^{*}\big(x(N-1),\, N-1\big)$.
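
As a brief illustration (continuing the assumed scalar example from above, with terminal cost $h\big(x(N)\big) = s\, x^{2}(N)$): the bracketed quantity in (4.17) becomes $\Delta t\,\big[x^{2}(N-1) + u^{2}(N-1)\big] + s\,\big[(1-\Delta t)\, x(N-1) + \Delta t\, u(N-1)\big]^{2}$, a quadratic in $u(N-1)$; setting its derivative with respect to $u(N-1)$ to zero gives

$u^{*}\big(x(N-1),\, N-1\big) = -\dfrac{s\,(1-\Delta t)}{1 + s\,\Delta t}\; x(N-1),$

which is indeed a function of $x(N-1)$ alone.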

Now, let's consider the last two intervals, i.e.


$J_{N-2,N}\big(x(N-2),\, u(N-2),\, u(N-1)\big) = g_D\big(x(N-2),\, u(N-2)\big) + g_D\big(x(N-1),\, u(N-1)\big) + h\big(x(N)\big)$
$\qquad = g_D\big(x(N-2),\, u(N-2)\big) + J_{N-1,N}\big(x(N-1),\, u(N-1)\big)$    (4.18)

Hence, the optimal policy for the last two intervals is given by

$J^{*}_{N-2,N}\big(x(N-2)\big) = \min_{u(N-2),\, u(N-1)} \Big\{ g_D\big(x(N-2),\, u(N-2)\big) + J_{N-1,N}\big(x(N-1),\, u(N-1)\big) \Big\}$    (4.19)

By the principle of optimality, the inner minimization over $u(N-1)$ yields $J^{*}_{N-1,N}\big(x(N-1)\big)$; and since $x(N-1)$ is related to $x(N-2)$ and $u(N-2)$ through equation (4.12), $J^{*}_{N-2,N}$ depends only on $x(N-2)$. Hence,

$J^{*}_{N-2,N}\big(x(N-2)\big) = \min_{u(N-2)} \Big\{ g_D\big(x(N-2),\, u(N-2)\big) + J^{*}_{N-1,N}\Big(a_D\big(x(N-2),\, u(N-2)\big)\Big) \Big\}$    (4.20)

We can continue backwards in this same manner, and if we do so, we obtain the following result for a $k$-stage process:

$J^{*}_{N-k,N}\big(x(N-k)\big) = \min_{u(N-k)} \Big\{ g_D\big(x(N-k),\, u(N-k)\big) + J^{*}_{N-k+1,N}\Big(a_D\big(x(N-k),\, u(N-k)\big)\Big) \Big\}$    (4.21)


Note that equation (4.21) is a recurrence relation: if we know $J^{*}_{N-k+1,N}$, then we can use equation (4.21) to find $J^{*}_{N-k,N}$, and so on. Consequently, if we start with $J_{N,N}$, then we can work backwards over all $N$ stages. As long as $k \le N$, the cost of the $k$-stage process is imbedded in that of the $N$-stage process; this is the so-called imbedding principle.
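
To make the recurrence concrete, the following is a minimal numerical sketch of equation (4.21) in Python for an assumed scalar example; the dynamics $a(x,u)$, running cost $g(x,u)$, terminal cost $h(x)$, grids, and step size are all illustrative choices, not taken from the lecture. The state and control are restricted to finite grids, and the cost-to-go at the successor state is obtained by linear interpolation.

import numpy as np

dt, N = 0.1, 20                          # time step and number of stages
x_grid = np.linspace(-2.0, 2.0, 81)      # admissible state values
u_grid = np.linspace(-1.0, 1.0, 41)      # admissible controls (the set U)

a = lambda x, u: -x + u                  # assumed plant: x_dot = a(x, u)
g = lambda x, u: x**2 + u**2             # assumed running cost
h = lambda x: 5.0 * x**2                 # assumed terminal cost

a_D = lambda x, u: x + dt * a(x, u)      # discretized dynamics, equation (4.12)
g_D = lambda x, u: dt * g(x, u)          # discretized running cost, equation (4.13)

J = h(x_grid)                            # J*_{N,N}(x(N)) = h(x(N)), equation (4.14)
policy = np.zeros((N, x_grid.size))      # u*(x, k) for each stage k = 0, ..., N-1

for k in range(1, N + 1):                # work backward over the N stages
    X, U = np.meshgrid(x_grid, u_grid, indexing="ij")
    # Candidate costs g_D(x, u) + J*_{N-k+1,N}(a_D(x, u)) from equation (4.21);
    # np.interp evaluates the previously computed cost at the successor state.
    candidates = g_D(X, U) + np.interp(a_D(X, U), x_grid, J)
    best = candidates.argmin(axis=1)     # minimizing control index at each grid state
    policy[N - k] = u_grid[best]
    J = candidates[np.arange(x_grid.size), best]   # this is now J*_{N-k,N}

print("J*_{0,N}(x = 1.0) ~", np.interp(1.0, x_grid, J))
print("u*(x = 1.0, k = 0) ~", np.interp(1.0, x_grid, policy[0]))

Working backwards in this way produces both the optimal cost $J^{*}_{0,N}$ and a tabulated optimal control $u^{*}(x, k)$ for every grid state and stage, which reflects the closed-loop character of the dynamic programming solution.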

Hamilton-Jacobi-Bellman (HJB) Equation

We have approximated continuous-time systems with discrete-time representations in our initial treatment of dynamic programming, and we have seen that this approach leads to a recurrence relation well suited to solution by a digital computer. We will now consider an alternative approach, in which dynamic programming is applied to the continuous-time system directly; this results in a non-linear partial differential equation, the so-called HJB equation. We start by considering the following continuous-time state equation:


$\dot{x}(t) = a\big(x(t),\, u(t),\, t\big)$    (6.7)

And, the performance measure

$J = h\big(x(t_f),\, t_f\big) + \int_{t_0}^{t_f} g\big(x(\tau),\, u(\tau),\, \tau\big)\, d\tau$    (6.8)

where $h$ and $g$ are specified functions, $t_0$ and $t_f$ are fixed, and $\tau$ is a dummy variable of integration.

Let us now apply the imbedding principle, i.e.

$J\big(x(t),\, u(\tau),\, t\big) = h\big(x(t_f),\, t_f\big) + \int_{t}^{t_f} g\big(x(\tau),\, u(\tau),\, \tau\big)\, d\tau, \qquad t \le \tau \le t_f$    (6.9)

where $t$ can be any value less than or equal to $t_f$, and $x(t)$ can be any admissible state value. Now let's try to determine the controls that minimize equation (6.9) for all $t \le t_f$ and all admissible $x(t)$.

The minimum cost is given by

$J^{*}\big(x(t),\, t\big) = \min_{\substack{u(\tau) \\ t \le \tau \le t_f}} \left\{ \int_{t}^{t_f} g\big(x(\tau),\, u(\tau),\, \tau\big)\, d\tau + h\big(x(t_f),\, t_f\big) \right\}$    (6.10)


Now, let's subdivide the time interval as follows:

$J^{*}\big(x(t),\, t\big) = \min_{\substack{u(\tau) \\ t \le \tau \le t_f}} \left\{ \int_{t}^{t+\Delta t} g\big(x(\tau),\, u(\tau),\, \tau\big)\, d\tau + \int_{t+\Delta t}^{t_f} g\big(x(\tau),\, u(\tau),\, \tau\big)\, d\tau + h\big(x(t_f),\, t_f\big) \right\}$    (6.11)

Recall that the principle of optimality requires that

$J^{*}\big(x(t),\, t\big) = \min_{\substack{u(\tau) \\ t \le \tau \le t+\Delta t}} \left\{ \int_{t}^{t+\Delta t} g\big(x(\tau),\, u(\tau),\, \tau\big)\, d\tau + J^{*}\big(x(t+\Delta t),\, t+\Delta t\big) \right\}$    (6.12)

where $J^{*}\big(x(t+\Delta t),\, t+\Delta t\big)$ is the minimum cost for the time interval $t+\Delta t \le \tau \le t_f$ with the initial state equal to $x(t+\Delta t)$. If we assume that the second partial derivatives of $J^{*}$ exist and are bounded, then we can expand $J^{*}\big(x(t+\Delta t),\, t+\Delta t\big)$ in a Taylor series about the point $\big(x(t),\, t\big)$ as follows:

$J^{*}\big(x(t),\, t\big) = \min_{\substack{u(\tau) \\ t \le \tau \le t+\Delta t}} \Big\{ \int_{t}^{t+\Delta t} g\big(x(\tau),\, u(\tau),\, \tau\big)\, d\tau + J^{*}\big(x(t),\, t\big) + \dfrac{\partial J^{*}}{\partial t}\big(x(t),\, t\big)\, \Delta t + \Big[ \dfrac{\partial J^{*}}{\partial x}\big(x(t),\, t\big) \Big]^{T} \big[ x(t+\Delta t) - x(t) \big] + \text{higher-order terms} \Big\}$    (6.13)

For small $\Delta t$, we have


$J^{*}\big(x(t),\, t\big) = \min_{u(t)} \Big\{ g\big(x(t),\, u(t),\, t\big)\, \Delta t + J^{*}\big(x(t),\, t\big) + J^{*}_{t}\big(x(t),\, t\big)\, \Delta t + J^{*T}_{x}\big(x(t),\, t\big)\, \big[ a\big(x(t),\, u(t),\, t\big) \big]\, \Delta t + o(\Delta t) \Big\}$    (6.14)

where we note the following:

1. $o(\Delta t)$ denotes terms containing $(\Delta t)^{2}$ and higher powers of $\Delta t$.

2. $J^{*}_{t} \equiv \dfrac{\partial J^{*}}{\partial t}$

3. $J^{*}_{x} \equiv \dfrac{\partial J^{*}}{\partial x} = \left[ \dfrac{\partial J^{*}}{\partial x_1} \;\; \dfrac{\partial J^{*}}{\partial x_2} \;\; \cdots \;\; \dfrac{\partial J^{*}}{\partial x_n} \right]^{T}$


We can remove the terms $J^{*}\big(x(t),\, t\big)$ and $J^{*}_{t}\big(x(t),\, t\big)\, \Delta t$ in equation (6.14) from the minimization, since they do not depend on $u(t)$; cancelling $J^{*}\big(x(t),\, t\big)$ from both sides then gives

$0 = J^{*}_{t}\big(x(t),\, t\big)\, \Delta t + \min_{u(t)} \Big\{ g\big(x(t),\, u(t),\, t\big)\, \Delta t + J^{*T}_{x}\big(x(t),\, t\big)\, \big[ a\big(x(t),\, u(t),\, t\big) \big]\, \Delta t + o(\Delta t) \Big\}$    (6.15)

Now, let's divide throughout by $\Delta t$ and take the limit as $\Delta t \to 0$, i.e.


$0 = J^{*}_{t}\big(x(t),\, t\big) + \min_{u(t)} \Big\{ g\big(x(t),\, u(t),\, t\big) + J^{*T}_{x}\big(x(t),\, t\big)\, \big[ a\big(x(t),\, u(t),\, t\big) \big] \Big\}$    (6.16)

In order to find the boundary condition for equation (6.16), we set $t = t_f$ in equation (6.10), which yields the following expression:

$J^{*}\big(x(t_f),\, t_f\big) = h\big(x(t_f),\, t_f\big)$    (6.17)

Let's define a new quantity, the Hamiltonian, as follows:

$H\big(x(t),\, u(t),\, J^{*}_{x},\, t\big) = g\big(x(t),\, u(t),\, t\big) + J^{*T}_{x}\big(x(t),\, t\big)\, \big[ a\big(x(t),\, u(t),\, t\big) \big]$    (6.18)

And so, we have that

$H\big(x(t),\, u^{*}\big(x(t),\, J^{*}_{x},\, t\big),\, J^{*}_{x},\, t\big) = \min_{u(t)} H\big(x(t),\, u(t),\, J^{*}_{x},\, t\big)$    (6.19)
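
As a brief illustration of this minimization (using an assumed scalar linear-quadratic form, not data from the lecture): if $g = \tfrac{1}{2}\big(q\, x^{2}(t) + r\, u^{2}(t)\big)$ and $a = \alpha\, x(t) + \beta\, u(t)$, then $\partial H / \partial u = r\, u(t) + \beta\, J^{*}_{x} = 0$ gives the minimizing control $u^{*}\big(x(t),\, J^{*}_{x},\, t\big) = -\dfrac{\beta}{r}\, J^{*}_{x}$.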

Using these definitions, i.e. equations (6.18) and (6.19), we arrive at the HJB equation

$0 = J^{*}_{t}\big(x(t),\, t\big) + H\big(x(t),\, u^{*}\big(x(t),\, J^{*}_{x},\, t\big),\, J^{*}_{x},\, t\big)$    (6.20)


Note that equation (6.20) is, in general, a non-linear partial differential equation; it is the result of applying dynamic programming to the solution of an optimal control problem involving a continuous-time system.
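
To see the HJB equation in action, here is a minimal numerical sanity check in Python for an assumed scalar linear-quadratic problem; every quantity below (the names alpha, beta, q, r, s, the horizon, and the grids) is an illustrative assumption, not data from the lecture. For such a problem the optimal cost has the form $J^{*}(x, t) = \tfrac{1}{2} P(t)\, x^{2}$, where $P(t)$ satisfies a scalar Riccati differential equation integrated backward from $P(t_f) = s$; the script evaluates the right-hand side of equation (6.20) by finite differences and a grid search over $u$, and checks that it is approximately zero.

import numpy as np

# Assumed problem data: dynamics x_dot = alpha*x + beta*u and cost
# J = 0.5*s*x(tf)^2 + integral over [t0, tf] of 0.5*(q*x^2 + r*u^2) dt.
alpha, beta = -1.0, 1.0
q, r, s = 1.0, 0.5, 2.0
t0, tf, dt = 0.0, 2.0, 1.0e-4

# For this problem J*(x, t) = 0.5*P(t)*x^2, where P(t) solves the Riccati ODE
#   dP/dt = -2*alpha*P + (beta**2/r)*P**2 - q,   P(tf) = s,
# integrated here backward in time with simple Euler steps.
times = np.arange(tf, t0 - dt, -dt)
P = np.empty_like(times)
P[0] = s
for i in range(times.size - 1):
    dP = -2.0 * alpha * P[i] + (beta**2 / r) * P[i]**2 - q
    P[i + 1] = P[i] - dt * dP            # stepping backward in time

def J_star(x, t):
    """Candidate optimal cost J*(x, t) = 0.5 * P(t) * x^2."""
    Pt = np.interp(t, times[::-1], P[::-1])
    return 0.5 * Pt * x**2

# Check 0 = J*_t + min_u { g(x, u, t) + J*_x * a(x, u, t) }, equation (6.20),
# at a few sample points, using central differences for J*_t and J*_x and a
# coarse grid for the minimization over u.
u_grid = np.linspace(-10.0, 10.0, 2001)
eps = 1.0e-5
for x, t in [(0.5, 0.3), (-1.0, 1.0), (2.0, 1.7)]:
    Jt = (J_star(x, t + eps) - J_star(x, t - eps)) / (2 * eps)
    Jx = (J_star(x + eps, t) - J_star(x - eps, t)) / (2 * eps)
    hamiltonian = 0.5 * (q * x**2 + r * u_grid**2) + Jx * (alpha * x + beta * u_grid)
    print(f"x = {x:5.2f}, t = {t:4.2f}, HJB residual = {Jt + hamiltonian.min():.2e}")

The residuals are small but not exactly zero because of the Euler integration of $P(t)$, the finite-difference derivatives, and the finite control grid; refining any of these reduces them further.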
