
Model-Based Output-Difference Feedback Optimal Control

1 Introduction
This document investigates a model-based method to design the optimal Output-Difference Feedback Controller (ODFC). We begin by assuming the presence of an observer that provides an unbiased estimate of the state, represented mathematically as:

x̂_k = x_k + ϵ_k,    ϵ_k ∼ N(0, Σ_ϵ)

2 Theorem 3.1: Optimal Control Problem


Consider the optimal control problem defined by equations (2)–(5). The optimal state-feedback controller gain K^* is given by:

K^* = (R̄ + B^T P^* B)^{-1} (B^T P^* A + N̄^T)

where P^* > 0 is the solution to the Algebraic Riccati Equation (ARE):

A^T P^* A − P^* − (A^T P^* B + N̄)(R̄ + B^T P^* B)^{-1} (B^T P^* A + N̄^T) + Q̄ = 0

Here, Q̄ = Ā^T Q_x Ā, R̄ = B^T Q_x B + R, N̄ = Ā^T Q_x B, and Ā = A − I.
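To make the construction concrete, the sketch below forms Q̄, R̄, N̄ for a hypothetical plant and solves the ARE numerically. It relies on scipy.linalg.solve_discrete_are, whose optional s argument accepts exactly the cross term N̄ appearing above; all system matrices and weights here are illustrative placeholders, not values from this document.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Illustrative placeholders for the plant and weights (not from the paper).
A = np.array([[0.95, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Qx = np.eye(2)          # state weight Q_x
R = np.array([[0.5]])   # input weight R

# Transformed matrices from Theorem 3.1.
Abar = A - np.eye(A.shape[0])   # Abar = A - I
Qbar = Abar.T @ Qx @ Abar       # Qbar = Abar' Qx Abar
Rbar = B.T @ Qx @ B + R         # Rbar = B' Qx B + R
Nbar = Abar.T @ Qx @ B          # Nbar = Abar' Qx B

# scipy's DARE solver accepts the cross term through `s`:
#   A'PA - P - (A'PB + s)(r + B'PB)^{-1}(B'PA + s') + q = 0
P_star = solve_discrete_are(A, B, Qbar, Rbar, s=Nbar)

# Optimal gain K* = (Rbar + B'P*B)^{-1}(B'P*A + Nbar').
K_star = np.linalg.solve(Rbar + B.T @ P_star @ B, B.T @ P_star @ A + Nbar.T)
print(K_star)
```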

3 Average Cost
The average cost associated with K^* is given by:

λ_{K^*} = Tr(A_eff^T Q_x A_eff Σ_ϵ) + Tr(Q_x W_w) + 2 Tr(Q_y W_v) + Tr(K^{*T} B^T P^* K^* W_v) + Tr(P^* (W_w + Σ_ϵ)) − Tr((A − BK^*)^T P^* (A − BK^*) Σ_ϵ)

3.1 Deriving Each Term

1. State cost: Tr(A_eff^T Q_x A_eff Σ_ϵ) captures the cost associated with the state estimation error.

2. Process-noise cost: Tr(Q_x W_w) reflects the cost of the process noise affecting the state.

3. Output-noise cost: 2 Tr(Q_y W_v) represents the cost linked to the output noise.

4. Feedback-gain cost: Tr(K^{*T} B^T P^* K^* W_v) captures the cost incurred by the control action based on the feedback gain K^*.

5. Covariance cost: Tr(P^* (W_w + Σ_ϵ)) accounts for the combined effect of the process-noise covariance and the estimation-error covariance.

6. Adjustment for feedback: −Tr((A − BK^*)^T P^* (A − BK^*) Σ_ϵ) adjusts for the effect of the feedback control on the state dynamics.
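For readers who want to evaluate these terms numerically, the following sketch simply transcribes the trace formula. Every input (the gain, the Riccati solution, the effective matrix A_eff, and the covariances Σ_ϵ, W_w, W_v) is assumed to be precomputed and dimensionally conformable exactly as the formula is written above.

```python
import numpy as np

def average_cost(A, B, K, P, Qx, Qy, A_eff, Sigma_eps, Ww, Wv):
    """Transcription of the average-cost formula for a given gain K.

    All arguments are assumed precomputed and conformable as written in
    the formula above (e.g., a square system, so every product is defined).
    """
    Acl = A - B @ K  # closed-loop matrix A - B K
    return (
        np.trace(A_eff.T @ Qx @ A_eff @ Sigma_eps)  # state estimation error
        + np.trace(Qx @ Ww)                         # process noise
        + 2.0 * np.trace(Qy @ Wv)                   # output noise
        + np.trace(K.T @ B.T @ P @ K @ Wv)          # feedback-gain term
        + np.trace(P @ (Ww + Sigma_eps))            # covariance term
        - np.trace(Acl.T @ P @ Acl @ Sigma_eps)     # feedback adjustment
    )
```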

4 Proof Overview
The proof resembles results for linear stochastic systems with state-dependent quadratic costs, following procedures similar to those found in [?]. The optimal feedback gain K^* is derived by minimizing the Bellman equation, leading to the satisfaction of equations (9) and (10).

5 Theorem 3.2: Iterative Algorithm


Let K_0 be any stabilizing state-feedback controller gain and let P_i > 0 be the solution of the Lyapunov equation:

A_i^T P_i A_i − P_i + Q̄ + K_i^T R̄ K_i − K_i^T N̄^T − N̄ K_i = 0

where i = 0, 1, 2, … and A_i = A − BK_i. For K_{i+1} calculated as:

K_{i+1} = (R̄ + B^T P_i B)^{-1} (B^T P_i A + N̄^T)

the following holds:

• A − BK_{i+1} is Schur.

• P^* ≤ P_{i+1} ≤ P_i.

• lim_{i→∞} P_i = P^* and lim_{i→∞} K_i = K^*.
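The iteration translates directly into code. Below is a minimal sketch, assuming the transformed matrices Q̄, R̄, N̄ from Theorem 3.1 are already available and the supplied K0 is stabilizing; each policy-evaluation step is handled by scipy's discrete Lyapunov solver.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def policy_iteration(A, B, Qbar, Rbar, Nbar, K0, tol=1e-10, max_iter=500):
    """Iterates the Lyapunov/gain updates of Theorem 3.2 until P converges."""
    K, P_prev = K0, None
    for _ in range(max_iter):
        Ai = A - B @ K
        # Policy evaluation: Ai'P Ai - P + Qbar + K'Rbar K - K'Nbar' - Nbar K = 0.
        Qi = Qbar + K.T @ Rbar @ K - K.T @ Nbar.T - Nbar @ K
        P = solve_discrete_lyapunov(Ai.T, Qi)
        # Policy improvement: K_{i+1} = (Rbar + B'P B)^{-1}(B'P A + Nbar').
        K = np.linalg.solve(Rbar + B.T @ P @ B, B.T @ P @ A + Nbar.T)
        if P_prev is not None and np.linalg.norm(P - P_prev, "fro") < tol:
            break
        P_prev = P
    return K, P
```

Since the theorem guarantees P^* ≤ P_{i+1} ≤ P_i, the iterates decrease monotonically and the Frobenius-norm stopping rule is safe once they stabilize.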

6 Proof Overview
The proof follows arguments similar to those in [?] (Theorem 3.1) and is therefore omitted here.

7 Theorem 3.3: Parameterized Observer
A parameterized observer is introduced to estimate the system state x_k from the output-difference measurements. The observer can be combined with (8) to provide a solution for the optimal control problem.
The state parametrization is given as:

x̄_k = Γ_u α_k + Γ_y β_k

For an observable system, x̄_k converges exponentially in mean to the state x_k as k → ∞. The estimation error is given by:

x̃_k ≡ x_k − x̄_k ∼ N(0, Σ_ϵ)

where Σ_ϵ is a bounded error covariance matrix.

7.1 Matrices and Updates


The matrices Γ_u and Γ_y contain system-dependent transfer-function coefficients. The updates for α_k and β_k are defined as follows:

α_{k+1}^i = A α_k^i + B u_k^i,    ∀i = 1, 2, …, m

β_k^i = C σ_k^i + D (y_k^i − y_{k−1}^i),    ∀i = 1, 2, …, p

where u^i and y^i are the i-th input and output, respectively.

7.2 Existence of the Observer


The existence of the parametrization is equivalent to the existence of the difference-feedback state observer:

x̄_{k+1} = (A − LCA + LC) x̄_k + (B − LCB) u_k + L (y_{k+1} − y_k)

where L is the observer gain. The mean and covariance of the estimation error can be determined from this formulation.
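A single update of this observer is easy to implement. The sketch below transcribes the recursion; the gain L is a hypothetical placeholder that would in practice be designed so that the error dynamics are stable.

```python
import numpy as np

def observer_step(x_bar, u, y, y_next, A, B, C, L):
    """One step of the difference-feedback observer:
    x_bar_{k+1} = (A - LCA + LC) x_bar_k + (B - LCB) u_k + L (y_{k+1} - y_k)
    """
    A_obs = A - L @ C @ A + L @ C   # observer state matrix
    B_obs = B - L @ C @ B           # observer input matrix
    return A_obs @ x_bar + B_obs @ u + L @ (y_next - y)
```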

8 Derivation of Discrete-Time ARE


The Algebraic Riccati Equation (ARE) is a fundamental equation in optimal
control theory, particularly for discrete-time linear systems. Below, we derive
the discrete-time ARE from the principles of optimal control.

8.1 Discrete-Time Linear System
Consider a discrete-time linear system described by:

x_{k+1} = A x_k + B u_k

where:

• x_k is the state vector at time k,

• u_k is the control input,

• A is the state transition matrix,

• B is the input matrix.

8.2 Cost Function


We want to minimize a quadratic cost function of the form:

J = Σ_{k=0}^{∞} ( x_k^T Q x_k + u_k^T R u_k + 2 x_k^T N u_k )

where:

• Q is a positive semi-definite matrix,


• R is a positive definite matrix,
• N is a matrix that captures the coupling between the state and control
inputs.
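As a sanity check on the derivation that follows, one can simulate the closed loop under a linear policy u_k = −K x_k and compare the accumulated cost with the quadratic value x_0^T P x_0 predicted by the Bellman analysis below. A minimal sketch with hypothetical matrices:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical system and weights, purely for illustration.
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
N = np.zeros((2, 1))  # no state-input coupling in this toy example

P = solve_discrete_are(A, B, Q, R, s=N)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T)

# Accumulate the stage costs along a long closed-loop trajectory.
x0 = np.array([1.0, -1.0])
x, J = x0.copy(), 0.0
for _ in range(2000):
    u = -K @ x
    J += x @ Q @ x + u @ R @ u + 2.0 * (x @ N @ u)
    x = A @ x + B @ u

print(J, x0 @ P @ x0)  # the two numbers should agree closely
```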

8.3 Bellman Equation


The optimal control problem can be formulated using the Bellman equation. The value function V(x) represents the minimum cost-to-go from state x:

V(x) = min_u { x^T Q x + u^T R u + 2 x^T N u + V(Ax + Bu) }

Assuming a quadratic form for the value function:

V(x) = x^T P x

where P is a positive semi-definite matrix, we can write:

V(Ax + Bu) = (Ax + Bu)^T P (Ax + Bu)

8.4 Substituting into the Bellman Equation
Substituting back into the Bellman equation, we have:

V(x) = min_u { x^T Q x + u^T R u + 2 x^T N u + x^T A^T P A x + x^T A^T P B u + u^T B^T P A x + u^T B^T P B u }

Grouping terms, we get:

V(x) = min_u { x^T (Q + A^T P A) x + u^T (R + B^T P B) u + 2 x^T (A^T P B + N) u }

8.5 Minimizing the Cost Function


To minimize this quadratic expression with respect to u, we take the derivative and set it to zero:

∂V/∂u = 2 (R + B^T P B) u + 2 (B^T P A + N^T) x = 0

Solving for u gives:

u^* = −(R + B^T P B)^{-1} (B^T P A + N^T) x
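To double-check this step numerically, one can compare u^* against a general-purpose minimizer applied to the bracketed quadratic. The matrices below are arbitrary placeholders; scipy.optimize.minimize serves only as an independent reference.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Arbitrary placeholder matrices (P symmetric PSD, R PD).
A = rng.standard_normal((3, 3)) * 0.5
B = rng.standard_normal((3, 2))
M = rng.standard_normal((3, 3))
P = M @ M.T                      # symmetric positive semi-definite
Q = np.eye(3)
R = 2.0 * np.eye(2)
N = rng.standard_normal((3, 2)) * 0.1
x = rng.standard_normal(3)

def bellman_rhs(u):
    """The expression minimized over u in the Bellman equation."""
    z = A @ x + B @ u
    return x @ Q @ x + u @ R @ u + 2.0 * (x @ N @ u) + z @ P @ z

# Closed-form minimizer derived above.
u_star = -np.linalg.solve(R + B.T @ P @ B, (B.T @ P @ A + N.T) @ x)

# Independent numerical check.
u_num = minimize(bellman_rhs, np.zeros(2)).x
print(np.allclose(u_star, u_num, atol=1e-5))  # expect True
```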


8.6 Substituting Back into the Cost Function


Substituting u^* back into the Bellman equation yields:

V(x) = x^T (Q + A^T P A) x − x^T (A^T P B + N) (R + B^T P B)^{-1} (B^T P A + N^T) x

which can be written as:

V(x) = x^T ( Q + A^T P A − (A^T P B + N) (R + B^T P B)^{-1} (B^T P A + N^T) ) x


Since V(x) = x^T P x must hold for every x, the matrix in parentheses must equal P:

Q + A^T P A − (A^T P B + N) (R + B^T P B)^{-1} (B^T P A + N^T) = P

8.7 Algebraic Riccati Equation


Rearranging gives us the discrete-time Algebraic Riccati Equation (ARE):

A^T P A − P − (A^T P B + N) (R + B^T P B)^{-1} (B^T P A + N^T) + Q = 0

9 Conclusion
The discrete-time ARE is a key result in optimal control, allowing us to compute the optimal feedback gain matrix K^* using:

K^* = (R + B^T P B)^{-1} (B^T P A + N^T)


The solution P can be found using various numerical methods, such as iterative algorithms or matrix factorizations.
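As a closing sanity check, the snippet below (with hypothetical matrices) confirms that a numerically computed P satisfies the ARE and that the resulting gain renders A − BK Schur:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical matrices, for illustration only.
A = np.array([[1.1, 0.3], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
N = np.array([[0.1], [0.0]])

P = solve_discrete_are(A, B, Q, R, s=N)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T)

# ARE residual should vanish; A - BK should be Schur (all |eigenvalues| < 1).
residual = A.T @ P @ A - P - (A.T @ P @ B + N) @ K + Q
print(np.max(np.abs(residual)))                      # ~ 0
print(np.max(np.abs(np.linalg.eigvals(A - B @ K))))  # < 1
```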
