Model Based Output Difference Feedback Optimal Control
1 Introduction
This document investigates a model-based method to design the optimal Output-Difference Feedback Controller (ODFC). We begin by assuming the presence of an observer that provides an unbiased estimate of the state, represented mathematically as:

$$\hat{x}_k = x_k + \epsilon_k, \qquad \epsilon_k \sim \mathcal{N}(0, \Sigma_\epsilon)$$
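As a quick numerical illustration of this assumption (the state vector and noise covariance below are arbitrary, illustrative choices, not values from the text), the sample mean of many such estimates recovers the true state because the noise $\epsilon_k$ is zero-mean:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, -2.0])              # true state x_k (illustrative)
Sigma_eps = 0.1 * np.eye(2)            # estimation-error covariance (illustrative)

# draw many unbiased estimates x_hat = x + eps, eps ~ N(0, Sigma_eps)
eps = rng.multivariate_normal(np.zeros(2), Sigma_eps, size=20000)
x_hat = x + eps

# empirical bias of the estimator; should be near zero
bias = np.abs(x_hat.mean(axis=0) - x).max()
```

The empirical bias shrinks at the usual $1/\sqrt{n}$ rate in the number of samples.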
$$K^* = (R + B^\top P^* B)^{-1}(B^\top P^* A + N^\top)$$
where P ∗ > 0 is the solution to the Algebraic Riccati Equation (ARE):
$$A^\top P^* A - P^* - (A^\top P^* B + N)(R + B^\top P^* B)^{-1}(B^\top P^* A + N^\top) + Q = 0$$
Here, $\bar{Q} = \bar{A}^\top Q_x \bar{A}$, $\bar{R} = B^\top Q_x B + R$, $\bar{N} = \bar{A}^\top Q_x B$, and $\bar{A} = A - I$; with a slight abuse of notation, the bars are dropped in the equations above, so $Q$, $R$, $N$, and $A$ there denote these transformed quantities.
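The ARE above can be solved numerically by fixed-point (value) iteration on its right-hand side. The sketch below assumes NumPy, and the system matrices are hypothetical stand-ins for the transformed quantities; it is an illustration of the computation, not the paper's algorithm:

```python
import numpy as np

def solve_dare(A, B, Q, R, N, iters=1000, tol=1e-12):
    """Fixed-point iteration on
    P <- A'PA - (A'PB + N)(R + B'PB)^{-1}(B'PA + N') + Q."""
    P = Q.copy()
    for _ in range(iters):
        gain = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T)
        P_next = A.T @ P @ A - (A.T @ P @ B + N) @ gain + Q
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    return P

# illustrative (hypothetical) matrices
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
N = np.zeros((2, 1))

P_star = solve_dare(A, B, Q, R, N)
K_star = np.linalg.solve(R + B.T @ P_star @ B, B.T @ P_star @ A + N.T)
# the closed loop A - B K* should be Schur (spectral radius < 1)
rho = np.max(np.abs(np.linalg.eigvals(A - B @ K_star)))
```

Convergence of this simple iteration relies on the usual stabilizability and detectability conditions; for ill-conditioned problems a Schur-decomposition-based solver is preferable.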
3 Average Cost
The average cost associated with K ∗ is given by:
2. **Control Cost**: $\mathrm{Tr}(Q_x W_w)$ reflects the cost related to the process noise affecting the state.
3. **Output Cost**: $2\,\mathrm{Tr}(Q_y W_v)$ represents the cost linked to the output noise.
4. **Feedback Gain Cost**: $\mathrm{Tr}(K^{*\top} B^\top P^* K^* W_v)$ captures the cost incurred due to the control action based on the feedback gain $K^*$.
5. **Covariance Cost**: $\mathrm{Tr}(P^*(W_w + \Sigma_\epsilon))$ accounts for the combined effect of the process-noise covariance and the estimation-error covariance.
6. **Adjustment for Feedback**: $-\mathrm{Tr}((A - BK^*)^\top P^* (A - BK^*)\,\Sigma_\epsilon)$ adjusts for the effect of the feedback control on the state dynamics.
4 Proof Overview
The proof resembles results for linear stochastic systems with state-dependent
quadratic costs, following similar procedures to those found in [?]. The optimal
feedback gain K ∗ is derived from minimizing the Bellman equation, leading to
the satisfaction of equations (9) and (10).
$$K_{i+1} = (R + B^\top P_i B)^{-1}(B^\top P_i A + N^\top)$$
The following holds:
• $A - BK_{i+1}$ is Schur.
• $P^* \le P_{i+1} \le P_i$
• $\lim_{i \to \infty} P_i = P^*$ and $\lim_{i \to \infty} K_i = K^*$
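The iteration above can be sketched as a standard policy-iteration loop: evaluate the current gain $K_i$ by solving a discrete Lyapunov equation for $P_i$, then update the gain via the formula above. The system matrices and initial gain below are illustrative assumptions, not values from the text:

```python
import numpy as np

def dlyap(F, W, iters=500):
    """Solve P = F' P F + W by summing the convergent series (requires F Schur)."""
    P = W.copy()
    term = W.copy()
    for _ in range(iters):
        term = F.T @ term @ F
        P += term
    return P

# illustrative (hypothetical) matrices
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
N = np.zeros((2, 1))

K = np.zeros((1, 2))   # initial stabilizing gain (A itself is Schur here)
for _ in range(50):
    F = A - B @ K
    # stage cost under u = -Kx: x'(Q - NK - K'N' + K'RK)x
    W = Q - N @ K - K.T @ N.T + K.T @ R @ K
    P = dlyap(F, W)                                         # policy evaluation
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T)  # policy improvement
```

Starting from a stabilizing $K_0$ (here $K_0 = 0$ works because the chosen $A$ is already Schur), the iterates $P_i$ decrease monotonically toward $P^*$, consistent with the properties listed above.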
6 Proof Overview
The proof follows arguments similar to those in [?] (Theorem 3.1) and is there-
fore omitted here.
7 Theorem 3.3: Parameterized Observer
A parameterized observer is introduced to estimate the system state xk from
the output difference measurement. The observer can be combined with (8) to
provide a solution for the optimal control problem.
The state parametrization is given as:
$$\bar{x}_k = \Gamma_u \alpha_k + \Gamma_y \beta_k$$
For an observable system, this parametrization converges exponentially in mean to the state $x_k$ as $k \to \infty$. The estimation error is given by:
8.1 Discrete-Time Linear System
Consider a discrete-time linear system described by:

$$x_{k+1} = A x_k + B u_k$$

where $x_k$ is the state vector, $u_k$ is the control input, and $A$, $B$ are the system matrices.
Assuming a quadratic value function of the form

$$V(x) = x^\top P x$$

where $P$ is a positive semi-definite matrix, we can write:
8.4 Substituting into the Bellman Equation
Substituting back into the Bellman equation, we have:
$$V(x) = x^\top (Q + A^\top P A) x + u^\top (R + B^\top P B) u + 2 x^\top (A^\top P B + N) u$$
Minimizing over $u$ yields $u^* = -(R + B^\top P B)^{-1}(B^\top P A + N^\top)\,x$; substituting this minimizer back gives:

$$J^* = x^\top (Q + A^\top P A) x - x^\top (A^\top P B + N)(R + B^\top P B)^{-1}(B^\top P A + N^\top) x$$
$$J^* = x^\top \left( Q + A^\top P A - (A^\top P B + N)(R + B^\top P B)^{-1}(B^\top P A + N^\top) \right) x$$
For $J^*$ to be consistent with $V(x) = x^\top P x$ for all $x$, the term in parentheses must equal $P$; rearranging gives:
$$A^\top P A - P - (A^\top P B + N)(R + B^\top P B)^{-1}(B^\top P A + N^\top) + Q = 0$$
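As a numerical sanity check of this derivation (again with arbitrary illustrative matrices, not values from the text), one can obtain $P$ by iterating the right-hand side of this equation to a fixed point and then verify that $x^\top P x$ equals the one-step cost plus the value at the next state under the minimizing input $u^*$:

```python
import numpy as np

# illustrative (hypothetical) matrices
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
N = np.zeros((2, 1))

# fixed-point iteration on the Riccati map
P = Q.copy()
for _ in range(1000):
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T)
    P = A.T @ P @ A - (A.T @ P @ B + N) @ G + Q

rng = np.random.default_rng(1)
x = rng.standard_normal(2)
u = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T) @ x   # minimizer u*
x_next = A @ x + B @ u

lhs = x @ P @ x                                                # V(x)
rhs = x @ Q @ x + u @ R @ u + 2 * x @ N @ u + x_next @ P @ x_next
```

At the fixed point the two sides agree to numerical precision, confirming that $V(x) = x^\top P x$ satisfies the Bellman equation.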
9 Conclusion
The discrete-time ARE is a key result in optimal control, allowing us to compute
the optimal feedback gain matrix K ∗ using:
$$K^* = (R + B^\top P B)^{-1}(B^\top P A + N^\top)$$
The solution P can be found using various numerical methods, such as iter-
ative algorithms or matrix factorizations.
References