[Lecture Notes in Control and Information Sciences 488] Zhong-Ping Jiang, Christophe Prieur, Alessandro Astolfi (eds.), Trends in Nonlinear and Adaptive Control: A Tribute to Laurent Praly for his 65th Birthday

This document is a collection of contributions celebrating the 65th birthday of Laurent Praly, highlighting his significant impact on nonlinear and adaptive control over a 40-year career. It features nine chapters authored by various researchers, covering foundational areas in control theory influenced by Praly's work. The volume serves as a tribute to his pioneering research and mentorship in the field of control and information sciences.

Lecture Notes in Control and Information Sciences 488

Zhong-Ping Jiang
Christophe Prieur
Alessandro Astolfi Editors

Trends in Nonlinear and Adaptive Control
A Tribute to Laurent Praly for his 65th Birthday
Lecture Notes in Control and Information Sciences

Volume 488

Series Editors
Frank Allgöwer, Institute for Systems Theory and Automatic Control,
Universität Stuttgart, Stuttgart, Germany
Manfred Morari, Department of Electrical and Systems Engineering,
University of Pennsylvania, Philadelphia, USA

Advisory Editors
P. Fleming, University of Sheffield, UK
P. Kokotovic, University of California, Santa Barbara, CA, USA
A. B. Kurzhanski, Moscow State University, Moscow, Russia
H. Kwakernaak, University of Twente, Enschede, The Netherlands
A. Rantzer, Lund Institute of Technology, Lund, Sweden
J. N. Tsitsiklis, MIT, Cambridge, MA, USA
This series reports new developments in the fields of control and information
sciences—quickly, informally and at a high level. The type of material considered
for publication includes:
1. Preliminary drafts of monographs and advanced textbooks
2. Lectures on a new field, or presenting a new angle on a classical field
3. Research reports
4. Reports of meetings, provided they are
(a) of exceptional interest and
(b) devoted to a specific topic. The timeliness of subject material is very
important.
Indexed by EI-Compendex, SCOPUS, Ulrich's, MathSciNet, Current Index to
Statistics, Current Mathematical Publications, Mathematical Reviews,
IngentaConnect, MetaPress and Springerlink.

More information about this series at http://www.springer.com/series/642


Zhong-Ping Jiang • Christophe Prieur • Alessandro Astolfi
Editors

Trends in Nonlinear and Adaptive Control
A Tribute to Laurent Praly for his 65th Birthday
Editors
Zhong-Ping Jiang
Department of Electrical and Computer Engineering
New York University
Brooklyn, NY, USA

Christophe Prieur
Automatic Control, CNRS
Saint Martin d'Hères, France

Alessandro Astolfi
Department of Electrical and Computer Engineering
Imperial College London
London, UK

ISSN 0170-8643 ISSN 1610-7411 (electronic)


Lecture Notes in Control and Information Sciences
ISBN 978-3-030-74627-8 ISBN 978-3-030-74628-5 (eBook)
https://doi.org/10.1007/978-3-030-74628-5
MATLAB is a registered trademark of The MathWorks, Inc. See https://www.mathworks.com/
trademarks for a list of additional trademarks.

Mathematics Subject Classification: 34H05, 34K35, 37N35, 49L20, 49N90, 93C10, 93C20, 93C35,
93C40, 93C55, 93C73, 93D05, 93D25

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
to Laurent,
a friend and a continuous source
of inspiration
Preface

This book is a tribute to

Laurent Praly

on the occasion of his 65th birthday. Throughout his 40-year career Laurent has
contributed ground-breaking results, has initiated research directions, has laid the
foundations of adaptive control, nonlinear stabilization, nonlinear observer design,
and network systems, and has motivated, guided, and forged students, junior
researchers, and colleagues. In addition, he has been a driving force for the intel-
lectual and cultural growth of the systems and control community worldwide.
The volume collects nine contributions written by a total of seventeen
researchers. The leading author of each contribution has been selected among the
researchers who have worked or interacted with Laurent, have been influenced by
his research activity, or have had the privilege and honor of being his Ph.D. stu-
dents. The contributions focus on two foundational areas of control theory: non-
linear control and adaptive control, in which Laurent has been an undisputed top
player for four decades. The diversity of the areas covered and the depth of the
results are tangible evidence of Laurent’s impact on the way control problems are
currently studied and results are developed. Control would be a very different
discipline without Laurent’s vision and without his ability to push the boundaries of
what is known and achievable. Laurent’s papers are timeless: the results therein are
fundamental and are never superseded by more advanced or newer results. They
constitute cornerstones upon which generations of control theorists will build.
Similarly, practitioners and industrialists have greatly benefited from Laurent’s
engineering ingenuity and tools.
As anticipated, the contributions in the book reflect important areas which have
been pioneered and influenced by Dr. L. Praly, as detailed hereafter.
It has long been known that invertible MIMO nonlinear systems can
be input–output linearized via dynamic state feedback. However, the techniques
originally developed to achieve this design goal are fragile, as they require the
availability of an accurate model of the plant and access to the full state. Very
recently, a robust version of these techniques has been developed, by means of
which a linear input–output behavior can be approximated up to any arbitrarily
fixed degree of accuracy. As a byproduct, for a strongly minimum phase invertible
MIMO system, these techniques provide a robust stabilization paradigm, which can
be also used in wider contexts, for instance, to simplify the solution of a problem of
output regulation. The chapter “Almost Feedback Linearization via Dynamic
Extension: a Paradigm for Robust Semiglobal Stabilization of Nonlinear MIMO
Systems,” by A. Isidori and Y. Wu, reviews the techniques in question and their
application to the design of output regulators.
The chapter “Continuous-Time Implementation of Reset Control Systems,” by
A. Teel, considers using a differential inclusion, instead of a hybrid system, to
effectuate a linear control system with resets. In particular, it establishes global
exponential stability for the differential inclusion when the hybrid version of the
reset control system admits a strongly convex Lyapunov function that establishes
stability.
The problem of establishing the stability of the feedback interconnection of two
systems is perhaps the most fundamental and well-studied problem in control
theory. In their seminal 1997 paper, Megretski and Rantzer extended the classical
multiplier approach to this problem by using a homotopy argument to circumvent
the standard requirement that the multipliers admit certain factorizations. In “On the
Role of Well-Posedness in Homotopy Methods for the Stability Analysis of
Nonlinear Feedback Systems,” R. A. Freeman shows how to relax their assumption
that the feedback interconnection is well-posed along the entire homotopy path.
In “Design of Heterogeneous Multi-agent System for Distributed Computation,”
J. G. Lee and H. Shim study the design aspect of heterogeneous multi-agent systems
by a tool set based on singular perturbation analysis. A few applications illustrate
how the tool is employed for generating multi-agent systems or algorithms.
The chapter “Contributions to the Problem of High-Gain Observer Design for
Hyperbolic Systems,” by C. Kitsos, G. Besançon, and C. Prieur, extends classical
results on high-gain observers to quasilinear hyperbolic partial differential equations
(PDE). Assuming that the first coordinate of the state defines the output, two
different observer design methods are given: firstly a direct method, assuming that
there is only one nonlinear functional velocity in the PDE, giving a natural
extension to what is known for finite-dimensional nonlinear systems; secondly an
indirect method, where a suitable state transformation is used to deal with distinct
functional velocities.
The adaptive attenuation of unknown periodic or approximately periodic output
disturbances in the presence of broadband noise and modeling uncertainties is an
important practical problem with a wide range of applications. The chapter “Robust
Adaptive Disturbance Attenuation,” by S. Jafari and P. Ioannou, proposes several
feedback adaptive control techniques which are shown analytically and demon-
strated via simulations to reject periodic disturbances without attenuating the output
noise and exciting unmodeled dynamics. Some of the novel techniques used
include over-parametrization that provides the structural flexibility to meet multiple
control objectives and robust adaptive laws for parameter estimation. The results
have also been extended to the case of minimum phase and possibly unstable plant
models with unknown parameters where the objectives of control, disturbance
rejection, and robustness with respect to output noise and modeling errors are met.
The proposed techniques cover continuous and discrete-time plants as well as
MIMO systems.
In “Delay-Adaptive Observer-Based Control for Linear Systems with Unknown
Input Delays,” M. Krstic and Y. Zhu present a tutorial retrospective of advances,
over the last ten years, in adaptive control of linear systems with input delays,
enabled with a parameter-adaptive certainty-equivalence version of PDE back-
stepping. In addition to unknown plant parameters and unmeasured plant states,
they address delay-specific challenges like unknown delays (delay-adaptive
designs) and systems with distributed delays, where the delay kernels are
unknown functional parameters, estimated with infinite-dimensional update laws.
In “Adaptive Control for Systems with Time-Varying Parameters—A Survey,”
K. Chen and A. Astolfi survey the so-called congelation of variables method for
adaptive control. This method allows recasting an adaptive control problem with
time-varying parameter into an adaptive control problem with constant parameter
and a robust control problem with time-varying perturbation. This allows applying
classical adaptive control results to systems with time-varying parameters. Both
state-feedback design and output-feedback design are presented. Boundedness of
closed-loop signals and convergence of output/state are guaranteed without any
restrictions on the rates of parameter variations.
In “Robust Reinforcement Learning for Stochastic Linear Quadratic Control
with Multiplicative Noise,” B. Pang and Z. P. Jiang focus on the development of
robust reinforcement learning algorithms for how to learn adaptive optimal con-
trollers from limited data. The chapter first shows that the well-known policy
iteration algorithm is inherently robust in the sense of small-disturbance
input-to-state stability and then presents a novel off-policy reinforcement learning
algorithm for data-driven adaptive and stochastic LQR with multiplicative noise.
We complete the preface with some personal considerations. As control theorists
we have been blessed to share time, as Ph.D. students and collaborators, with
Laurent. It is very difficult to describe the magic that occurs in Laurent’s office,
while writing on the board, or in the nearby forest, while searching for chestnuts or
mushrooms. It is, however, this magic that makes Laurent unique and special, as a
researcher, as a teacher, and as a friend, and that has attracted us and many col-
leagues to learn, work and interact with him. Without the “Fontainebleau experi-
ence” our life would have been different, and for this gift we are grateful to Laurent.
It is for us a great honor to celebrate Laurent’s contributions to science and to
our life.

New York Zhong-Ping Jiang


Grenoble Christophe Prieur
London Alessandro Astolfi
January 2021
Contents

1 Almost Feedback Linearization via Dynamic Extension: a Paradigm for Robust Semiglobal Stabilization of Nonlinear MIMO Systems . . . . 1
Alberto Isidori and Yuanqing Wu

2 Continuous-Time Implementation of Reset Control Systems . . . . 27
Andrew R. Teel

3 On the Role of Well-Posedness in Homotopy Methods for the Stability Analysis of Nonlinear Feedback Systems . . . . 43
Randy A. Freeman

4 Design of Heterogeneous Multi-agent System for Distributed Computation . . . . 83
Jin Gyu Lee and Hyungbo Shim

5 Contributions to the Problem of High-Gain Observer Design for Hyperbolic Systems . . . . 109
Constantinos Kitsos, Gildas Besançon, and Christophe Prieur

6 Robust Adaptive Disturbance Attenuation . . . . 135
Saeid Jafari and Petros Ioannou

7 Delay-Adaptive Observer-Based Control for Linear Systems with Unknown Input Delays . . . . 189
Miroslav Krstic and Yang Zhu

8 Adaptive Control for Systems with Time-Varying Parameters—A Survey . . . . 217
Kaiwen Chen and Alessandro Astolfi

9 Robust Reinforcement Learning for Stochastic Linear Quadratic Control with Multiplicative Noise . . . . 249
Bo Pang and Zhong-Ping Jiang

Index . . . . 279
Chapter 1
Almost Feedback Linearization via Dynamic Extension: a Paradigm for Robust Semiglobal Stabilization of Nonlinear MIMO Systems

Alberto Isidori and Yuanqing Wu

Abstract It is well known that invertible MIMO nonlinear systems can be input–
output linearized via dynamic state feedback (augmentation of the dynamics and
memoryless state feedback from the augmented state). The procedures for the design
of such feedback, developed in the late 1980s for nonlinear systems, typically are
recursive procedures that involve state-dependent transformations in the input space
and cancelation of nonlinear terms. As such, they are fragile. In a recent work of Wu-
Isidori-Lu-Khalil, a method has been proposed, consisting of interlaced design of
dynamic extensions and extended observers, that provides a robust version of those
feedback-linearizing procedures. The method in question can be used as a systematic
tool for robust semiglobal stabilization of invertible and strongly minimum-phase
MIMO nonlinear systems. The present paper provides a review of the method in
question, with an application to the design of a robust output regulator.

1.1 Foreword

This paper is dedicated to Laurent Praly on the occasion of his 65th birthday. It is a real
honor to have been invited to prepare a paper in honor of one of the most influential
and respected authors in our community. Over the years, the first author of this paper
has been deeply influenced, in his own work, by ideas, style and mathematical rigor of
Laurent, since the first time he met him—on the sands of a secluded beach in Belle Ile
while catching and eating fresh palourdes—in the course of a workshop held there in
1982 under the aegis of the French CNRS. Working with Laurent, an opportunity that

A. Isidori (B)
Department of Computer, Control and Management Engineering,
University of Rome “Sapienza”, Rome, Italy
e-mail: [email protected]
Y. Wu
School of Automation, Guangdong University of Technology, Guangzhou, China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5_1

we wish had occurred much more frequently, was a real privilege and every time an
incredible chance to learn new technical skills and broaden one's knowledge.
Among the various areas in which Laurent produced seminal results stand, of course,
those of feedback stabilization and output regulation of nonlinear systems. It is for
this reason that, in this paper, we have chosen to report our modest new contributions
to a design problem that sits in between these two areas.

1.2 Invertibility and Feedback Linearization

Feedback linearization has been one of the most popular, but also frequently despised,
approaches for the design of feedback laws for nonlinear systems. The idea originated
in 1978 with a work of R. Brockett who, in [1], while investigating the effect of
feedback from the state on an input-affine nonlinear system, showed that the joint effect
of a feedback and a change of coordinates could yield a system modeled by linear
equations. Independently, a similar approach was pursued by G. Meyer and coauthors at
NASA, in the design of autopilots for helicopters. The idea soon attracted the
attention of various other authors and, in 1980, B. Jakubczyk and W. Respondek provided
a complete solution to the problem of determining, for a SISO input-affine system,
conditions for the existence of feedback laws and changes of coordinates yielding
a system modeled by linear equations [2]. Since then, the problem gained a lot of
popularity, in view of its intuitive appeal. However, the fragility of such design
method was also immediately pointed out, because the method involves cancelation
of nonlinear terms (and hence questionable in the presence of model uncertainties)
and access to all components of the state (and hence again questionable if only lim-
ited measurements are available for feedback). A somewhat less ambitious version
of this approach is that of forcing a linear input–output behavior via state feedback.
Such design requires substantially weaker assumptions, but the above-mentioned
criticisms of lack of robustness still persist. It was only relatively recently, in 2008,
that a robust alternative was proposed, by L. Freidovich and H. Khalil [3], who
showed how it is possible to robustly control an input-affine SISO nonlinear system
so as to recover—up to any arbitrarily fixed degree of accuracy—the performance
that would have been obtained by means of the classical input–output feedback lin-
earization design, had the parameters of the system been accurately known and had
the full state been available. Essentially, in the terminology introduced earlier by J.
C. Willems [4] in the analysis of the problem of disturbance decoupling for linear
systems, the authors of [3] have proven how to achieve almost input–output feedback
linearization, by means of a robust controller.
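The flavour of this "almost" linearization can be conveyed by a schematic simulation in the spirit of the extended high-gain observer idea (this is only our toy sketch, not the exact construction of [3]; the plant, all gains, and the saturation level are made-up numbers). The controller knows only a nominal input gain b₀; the unknown drift a(·) and the gain mismatch are lumped into a "total perturbation" σ = a(x) + (b − b₀)u, which the observer estimates and the control cancels:

```python
# Toy SISO plant y'' = a(x) + b*u, with a(.) and b unknown to the controller.
# An extended high-gain observer estimates y, y' and the lumped perturbation
# sigma = a(x) + (b - b0)*u; cancelling the estimate recovers, approximately,
# the feedback-linearized behaviour y'' = -k1*y - k2*y'.

def a(x1, x2):                      # "unknown" drift, plant side only
    return -x1 ** 3 - 0.5 * x2

b, b0 = 1.2, 1.0                    # true vs nominal input gain
k1, k2 = 1.0, 2.0                   # target dynamics (s + 1)^2
eps = 0.05                          # observer time-scale parameter
dt, steps = 1e-4, 100_000           # forward-Euler integration up to t = 10

x1, x2 = 0.5, 0.0                   # plant state, output y = x1
z1, z2, z3 = 0.0, 0.0, 0.0          # estimates of y, y', sigma

for _ in range(steps):
    e = x1 - z1                                    # output injection error
    u = (-z3 - k1 * z1 - k2 * z2) / b0             # cancel estimated sigma
    u = max(-5.0, min(5.0, u))                     # saturate against peaking
    nx1 = x1 + dt * x2                             # plant step
    nx2 = x2 + dt * (a(x1, x2) + b * u)
    z1 += dt * (z2 + (3 / eps) * e)                # observer step
    z2 += dt * (z3 + b0 * u + (3 / eps ** 2) * e)
    z3 += dt * ((1 / eps ** 3) * e)
    x1, x2 = nx1, nx2

print(x1, x2)    # both small: the output has been regulated to zero
```

Shrinking eps tightens the approximation of the ideal linearized behaviour, at the price of a larger (saturated) peaking transient.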
In the early 1980s, various authors had also addressed the problem of controlling
an input-affine MIMO nonlinear system so as to obtain a linear input–output behav-
ior. In particular, J. Descusse and C. Moog, in a 1985 paper [5], showed that any
invertible system can be forced to have a linear input–output behavior if a dynamic
feedback is used, a control consisting of the addition of extra state variables and
of a feedback from the augmented state. This result was acknowledged to be quite

powerful, because one cannot think of an assumption weaker than invertibility, but
again the underlying design is not robust, as it relies upon exact cancelations and avail-
ability of the full state of the controlled plant. In particular, the method in question is
based on a recursive design procedure that, at each step, requires a state-dependent
change of coordinates in the input space, an intrinsically non-robust operation. Very
recently, in [6], taking advantage of some developments concerning the structure of
the normal forms of invertible nonlinear systems [7], a robust version of the method
of [5] has been proposed. Specifically, the results of [6] have shown how it is pos-
sible to design a robust controller that solves the problem of almost input–output
feedback linearization, for a reasonably general class of uniformly invertible MIMO
systems. The purpose of the present paper is to summarize the highlights of the main
results of this work, in the more general context of systems possessing a nontrivial
zero dynamics, and to show how they can provide a useful paradigm for robust sta-
bilization of MIMO systems: as an application, it is also shown how the method in
question can be profitably used in the solution of a problem of output regulation.

1.3 Normal Forms of Uniformly Invertible Nonlinear Systems

We consider in this paper input-affine nonlinear systems modeled by equations of
the form

x̄˙ = f¯(x̄) + ḡ(x̄)u
y = h̄(x̄)

with state x̄ ∈ Rn , input u ∈ Rm and output y ∈ Rm , in which f¯(·) and the m columns
of ḡ(·) are smooth vector fields, while the m entries of h̄(·) are smooth functions. It is
also assumed that x̄ = 0 is an equilibrium of the unforced system, i.e., that f¯(0) = 0,
and, without loss of generality, that h̄(0) = 0.

1.3.1 Normal Forms

It is known (see [8, 9], [10, pp. 251–280]) that if a MIMO nonlinear system having
the same number m of input and output components is uniformly invertible in the
sense of Singh [11], and if certain vector fields are complete (see in particular [8]),
there exists a globally defined change of coordinates by means of which the system
can be expressed, in normal form, as

ż = f₀(z, x) + g₀(z, x)u

ẋ_{i,1} = x_{i,2}
  ⋮
ẋ_{i,r₁} = x_{i,r₁+1} + δ¹_{i,r₁+1}(z, x)(a₁(z, x) + b₁(z, x)u)
  ⋮
ẋ_{i,r₂−1} = x_{i,r₂} + δ¹_{i,r₂}(z, x)(a₁(z, x) + b₁(z, x)u)
ẋ_{i,r₂} = x_{i,r₂+1} + Σ_{j=1}^{2} δ^j_{i,r₂+1}(z, x)(a_j(z, x) + b_j(z, x)u)
  ⋮                                                                  (1.1)
ẋ_{i,r_{i−1}} = x_{i,r_{i−1}+1} + Σ_{j=1}^{i−1} δ^j_{i,r_{i−1}+1}(z, x)(a_j(z, x) + b_j(z, x)u)
  ⋮
ẋ_{i,r_i−1} = x_{i,r_i} + Σ_{j=1}^{i−1} δ^j_{i,r_i}(z, x)(a_j(z, x) + b_j(z, x)u)
ẋ_{i,r_i} = a_i(z, x) + b_i(z, x)u
y_i = x_{i,1},   i = 1, …, m,

where (z, x) ∈ R^{n−r} × R^r, with x = col(x₁, …, x_m), x_i = col(x_{i,1}, …, x_{i,r_i}) for i = 1, …, m, and y = col(y₁, …, y_m).¹ Note that, as a consequence of the assumption that f¯(0) = 0,

f₀(0, 0) = 0 and a_i(0, 0) = 0 ∀i.

As a consequence of the property of uniform invertibility, certain parameters that characterize the normal form (1.1), namely the vectors b_i(z, x) that pre-multiply the input u and the “multipliers” δ^j_{i,k}(z, x), have special properties. To describe such
properties and their consequences, set
A(z, x) = col(a₁(z, x), …, a_m(z, x)),   B(z, x) = col(b₁(z, x), …, b_m(z, x)),

in which case the last equations of each block of (1.1) can be rewritten together in
compact form as

col(ẋ_{1,r₁}, …, ẋ_{m,r_m}) = A(z, x) + B(z, x)u .

A consequence of uniform invertibility (see, e.g., [10, p. 274]) is that the m × m
matrix B(z, x) is invertible for all (z, x). Such property, which extends to the present
general setting the classical (but quite restrictive) property of having a vector relative

¹ For convenience, it is assumed that dim(y_i) = 1 for all i = 1, …, m and r₁ < r₂ < … < r_m. In general, one should consider y split into q blocks y₁, …, y_q, with dim(y_i) = m_i ≥ 1 and Σ_{i=1}^{q} m_i = m. The structure of the equations remains the same.

degree, can be naturally exploited in the design of state-feedback control laws. For
instance, because of such property, one could think of choosing the control u as

u = B^{−1}(z, x)[−A(z, x) + ū]    (1.2)

changing this way the system into a system of the simpler form

ż = f₀(z, x) + g₀(z, x)B^{−1}(z, x)[−A(z, x) + ū]

ẋ_{i,1} = x_{i,2}
  ⋮
ẋ_{i,r₁} = x_{i,r₁+1} + δ¹_{i,r₁+1}(z, x)ū₁
  ⋮
ẋ_{i,r₂−1} = x_{i,r₂} + δ¹_{i,r₂}(z, x)ū₁
ẋ_{i,r₂} = x_{i,r₂+1} + Σ_{j=1}^{2} δ^j_{i,r₂+1}(z, x)ū_j
  ⋮
ẋ_{i,r_{i−1}} = x_{i,r_{i−1}+1} + Σ_{j=1}^{i−1} δ^j_{i,r_{i−1}+1}(z, x)ū_j
  ⋮
ẋ_{i,r_i−1} = x_{i,r_i} + Σ_{j=1}^{i−1} δ^j_{i,r_i}(z, x)ū_j
ẋ_{i,r_i} = ū_i
y_i = x_{i,1},   i = 1, …, m.

The property that B(z, x) is invertible, on the other hand, is also useful in the char-
acterization of the so-called zero dynamics of the system (see below) and of its
asymptotic properties.
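A minimal numerical sketch of the decoupling law (1.2), in the simplest situation in which all multipliers are zero and the system has vector relative degree (r₁, r₂) = (1, 2) with no z-dynamics (the functions A(x), B(x) below are our own toy data, not from the chapter):

```python
import math

# Decoupling law (1.2) on a made-up two-output example: after the feedback
# u = B(x)^{-1}(-A(x) + v), the outputs obey y1' = v1 and y2'' = v2 exactly.

def A(x):                       # drift terms a1, a2 (hypothetical)
    return [math.sin(x[0]) + x[2], x[0] * x[1] - x[2] ** 3]

def B(x):                       # lower triangular, invertible for all x
    return [[2.0 + math.cos(x[1]), 0.0],
            [x[0], 1.0 + x[2] ** 2]]

def decouple(x, v):             # u = B(x)^{-1} (-A(x) + v), forward subst.
    a, b = A(x), B(x)
    u1 = (-a[0] + v[0]) / b[0][0]
    u2 = (-a[1] + v[1] - b[1][0] * u1) / b[1][1]
    return [u1, u2]

# state x = (x_{1,1}, x_{2,1}, x_{2,2}); outputs y1 = x[0], y2 = x[1]
x = [1.0, -0.5, 0.3]
dt = 1e-4
for _ in range(60_000):                      # Euler-integrate to t = 6
    v = [-x[0], -x[1] - 2.0 * x[2]]          # poles of both chains at -1
    u = decouple(x, v)
    a, b = A(x), B(x)
    dx = [a[0] + b[0][0] * u[0],                   # equals v[0]
          x[2],
          a[1] + b[1][0] * u[0] + b[1][1] * u[1]]  # equals v[1]
    x = [xi + dt * d for xi, d in zip(x, dx)]

print(x)    # all components decay: the closed loop is linear and stable
```

The forward substitution in `decouple` is exactly where the lower-triangular structure of B(z, x) (cf. Assumption 1.2 below) pays off: no matrix inversion is needed.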
It is worth stressing that, if the “multipliers” δ^j_{i,k}(z, x) in (1.1) were all independent of (z, x), the state-feedback law (1.2) would have induced a linear input–output behavior.² In general, though, the multipliers δ^j_{i,k}(z, x) are not constant. However—as a consequence of the property of invertibility—they can only depend on the individual components of x in a special way, that can be described as follows. For any sequence of real variables x_{ij}, with 1 ≤ j ≤ r, let x_i and x̄_{ik} denote the strings

x_i = (x_{i1}, x_{i2}, …, x_{ir})
x̄_{ik} = (x_{i1}, x_{i2}, …, x_{ik}),   k < r.

It has been shown (see [12]) that, if m = 2 and dim(z) = 0, a consequence of the property of uniform invertibility is that the multipliers in question, which in this case reduce to δ¹_{2,r₁+1}(x), …, δ¹_{2,r₂}(x), depend on the components of x₂ in a “triangular” fashion, i.e.,

² Note that, if any of such multipliers is nonzero, the system fails to have a vector relative degree. Nevertheless input–output linearization is achieved (see [10, pp. 280–287]).

δ¹_{2,k+1}(x) = δ¹_{2,k+1}(x₁, x̄_{2k}),   k = r₁, …, r₂ − 1.

Motivated by such property, we consider in what follows nonlinear systems having m > 2 in which a similar property of “triangular” dependence, introduced earlier in [7] in the study of the problem of stabilization by output feedback, holds. Specifically, we assume that³:

Assumption 1.1 The multipliers δ^j_{i,k+1}(z, x) in (1.1) are independent of z and depend on the components of x in a “triangular” fashion, as in

δ^j_{i,k+1}(x₁, …, x_{i−1}, x̄_{i,k}, …, x̄_{m,k}),   r_{i−1} ≤ k ≤ r_i − 1,   1 ≤ j ≤ i − 1 .

Remark 1.1 An interesting fallout of this Assumption, highlighted in [7], is that
all components of x can be expressed as functions of the components of y and of
a suitable number of their higher order derivatives with respect to time. Thus, a
uniformly invertible nonlinear system in which dim(z) = 0, if Assumption 1.1 holds,
is uniformly observable.

1.3.2 Strongly Minimum-Phase Systems

A nonlinear system is said to be globally minimum-phase if the internal dynamics
arising when the control is chosen to force the output to be identically zero are
globally asymptotically stable. In the case of the normal form (1.1), it is seen that if
y(t) is identically zero, then so is x(t) and, necessarily,

u(t) = [B(z(t), 0)]^{−1}[−A(z(t), 0)] .

As a consequence, the zero dynamics are those of

ż = f₀(z, 0) + g₀(z, 0)[B(z, 0)]^{−1}[−A(z, 0)]    (1.3)

and the system is said to be globally minimum-phase if the equilibrium z = 0 of the
latter is globally asymptotically stable.
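A toy instance of this computation (the system below is our own example): for ż = −z³ + z·x₁, ẋ₁ = x₂, ẋ₂ = a(z, x) + b(z, x)u with y = x₁, forcing y ≡ 0 pins x ≡ 0 and leaves the zero dynamics ż = −z³, whose origin is globally asymptotically stable, so the example is globally minimum-phase (though only polynomially, not exponentially, so):

```python
# Zero dynamics z' = -z**3 of the toy example above: solutions obey
# z(t) = z0 / sqrt(1 + 2*z0**2*t), so they converge to 0 for every z0.

z, dt = 2.0, 1e-3
for _ in range(50_000):          # Euler-integrate z' = -z**3 up to t = 50
    z += dt * (-z ** 3)
print(z)                         # ~ 2/sqrt(401), i.e. about 0.1
```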
It has been stressed in [13, 14] that, in the design of feedback stabilizing laws,
it is appropriate to look at a stronger notion of “minimum-phase,” that—roughly
speaking—requires the dynamics of the inverse system to be input-to-state stable.4

3 It is easy to show that the Assumption in question is compatible with the assumption of uniform
invertibility, i.e., that if in a normal form like (1.1) such Assumption holds, and the matrix B(z, x)
is nonsingular, the system is uniformly invertible in the sense of Singh. However, it must be stressed
that the necessity of such triangular dependence has been proven only for systems having m = 2
and a trivial dynamics of z.
⁴ A property that implies, but is not implied by, the property that the system is globally minimum-phase.

In the present context of systems modeled in normal form, the property in question
considers, instead of (1.3), the forced dynamics

ż = f₀(z, x) + g₀(z, x)[B(z, x)]^{−1}[−A(z, x) + χ],    (1.4)

seen as a system with state z and inputs (x, χ ) and requires the latter to be input-to-
state stable.
With the results of [15, 16] in mind, we formally define such property as follows.

Definition 1.1 System (1.1) is strongly minimum-phase (SMP) if there exist a C¹
function V : R^{n−r} → R and four class K∞ functions α̲(·), ᾱ(·), α(·), σ(·) such that

α̲(|z|) ≤ V(z) ≤ ᾱ(|z|)   for all z ∈ R^{n−r}

and

(∂V/∂z)[ f₀(z, x) + g₀(z, x)[B(z, x)]^{−1}[−A(z, x) + χ] ] ≤ −α(|z|) + σ(|x|) + σ(|χ|)    (1.5)

for all (z, x, χ) ∈ R^{n−r} × R^r × R^m. The system is strongly—and also locally
exponentially—minimum-phase (eSMP) if the inequalities above hold with α̲(·), ᾱ(·),
α(·), σ(·) that are locally quadratic near the origin.
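As a simple illustration of Definition 1.1 (the scalar inverse dynamics below are our own example, not taken from the chapter), inequality (1.5) can be verified explicitly with a quadratic V:

```latex
% Illustrative scalar inverse dynamics  \dot z = -z + x + \chi,
% candidate  V(z) = z^2,  so  \underline{\alpha}(s) = \overline{\alpha}(s) = s^2.
\begin{align*}
\frac{\partial V}{\partial z}\,(-z + x + \chi)
   &= -2z^2 + 2zx + 2z\chi\\
   &\le -2z^2 + \Bigl(\tfrac{1}{2}z^2 + 2x^2\Bigr)
              + \Bigl(\tfrac{1}{2}z^2 + 2\chi^2\Bigr)
      && \text{(Young's inequality)}\\
   &= -z^2 + 2x^2 + 2\chi^2,
\end{align*}
% so (1.5) holds with \alpha(s) = s^2 and \sigma(s) = 2s^2.  All four
% comparison functions are quadratic, hence this example is eSMP, not just SMP.
```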

1.4 Robust (Semiglobal) Stabilization via Almost Feedback Linearization

1.4.1 Standing Assumptions

System (1.1), if the matrix B(z, x) is invertible and Assumption 1.1 holds, is
uniformly invertible. Hence, as has long been known (see, e.g., [5], [17, pp.
249–263]), it can be input–output linearized by means of a control consisting of an
augmentation of the dynamics and of a state feedback from the augmented state. In
general, methods for feedback linearization require exact cancelation of nonlinear
terms and availability of the full state of the system: as such they cannot be regarded
as robust design methods. In the case of MIMO systems, the issue of robustness is
further aggravated by the fact that all known (recursive) methods for achieving feed-
back linearization via dynamic extension require, at each stage, a state-dependent
change of coordinates in the input space, which is intrinsically non-robust. In [6] it
has been shown how such methods can be made robust, by means of a technique
based on interlaced use of dynamic extensions and robust observers, extending in this
way to MIMO systems the seminal results of [3], in which—for a SISO system—
input–output linearization is achieved up to any arbitrarily fixed degree of accuracy
by means of a robust controller.
8 A. Isidori and Y. Wu

The method of [6] considers the case of a system in normal form (1.1), supposed
to satisfy Assumption 1.1, and in which the matrix B(z, x) has the property indicated
in the following additional assumption.
Assumption 1.2 The matrix B(z, x) is lower triangular and there exist numbers
bmin, bmax such that

0 < bmin ≤ bii(z, x) ∀i,  ‖B(z, x)‖ ≤ bmax  ∀(z, x).   (1.6)

Note that, as a consequence of this assumption, there exist a number b0 and a number
0 < δ0 < 1 such that

|bii(z, x) − b0| / b0 ≤ δ0 < 1  ∀i, ∀(z, x).   (1.7)
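One explicit choice satisfying (1.7) is the midpoint b0 = (bmin + bmax)/2, which gives δ0 = (bmax − bmin)/(bmax + bmin) < 1; the numeric bounds in the sketch below are illustrative.

```python
import numpy as np

# Constructive check of the remark after Assumption 1.2: the midpoint
# choice of b0 satisfies (1.7) for every b_ii(z, x) in [b_min, b_max].
# The bounds below are arbitrary illustrative values.
b_min, b_max = 0.4, 1.6
b0 = (b_min + b_max) / 2
delta0 = (b_max - b_min) / (b_max + b_min)

assert 0 < delta0 < 1
# any diagonal entry b_ii(z, x) in [b_min, b_max] satisfies (1.7):
for bii in np.linspace(b_min, b_max, 101):
    assert abs(bii - b0) / b0 <= delta0 + 1e-12
```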

The method described in [6] addresses the case in which the dynamics of z are
trivial. If this is not the case, the following extra assumption is needed.
Assumption 1.3 The controlled plant (1.1) is eSMP.

1.4.2 The Nominal Linearizing Feedback

The (recursive) procedure for exact input–output linearization via state augmentation
and feedback can be summarized as follows. First of all, the dynamics of (1.1) are
augmented, by means of a dynamic extension defined as

ζ̇1 = S1 ζ1 + T1 v1
ζ̇2 = S2 ζ2 + T2 v2
···                                             (1.8)
ζ̇_{m−1} = S_{m−1} ζ_{m−1} + T_{m−1} v_{m−1}

in which, for i = 1, …, m − 1, ζi ∈ R^{rm−ri} are additional states and vi ∈ R are additional
inputs, and

S_i = \begin{pmatrix} 0 & 1 & \cdots & 0 \\ \cdots & & & \cdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \end{pmatrix}, \qquad T_i = \begin{pmatrix} 0 \\ \cdots \\ 0 \\ 1 \end{pmatrix}.
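Concretely, each Si is a shift matrix and each Ti selects the last component, so every ζi is a chain of integrators driven by vi. A minimal construction (the dimension 3 is illustrative):

```python
import numpy as np

def shift_pair(n):
    """Build the n x n shift matrix S_i and input vector T_i of (1.8):
    a chain of n integrators driven from the last component."""
    S = np.eye(n, k=1)                    # ones on the first superdiagonal
    T = np.zeros((n, 1)); T[-1, 0] = 1.0  # input enters the last row
    return S, T

# e.g. with r_m - r_i = 3, the extension zeta_i is a triple integrator:
S, T = shift_pair(3)
assert np.allclose(np.linalg.matrix_power(S, 3), 0)   # S is nilpotent
# zeta_dot = S zeta + T v integrates v up the chain
```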

Then a state feedback (from the full state of the augmented system) is determined,
by means of a recursive design procedure, that consists in the following steps.
Step 1: Set x1 = col(ξ1, ζ1) with ξ1 ∈ R^{r1} defined as ξ_{1j} = x_{1j} for 1 ≤ j ≤ r1.
Indeed, ξ̇_{1,j} = ξ_{1,j+1} for 1 ≤ j ≤ r1 − 1 and ξ̇_{1,r1} = a1(z, x) + b1(z, x)u. Let u be
such that
a1(z, x) + b1(z, x)u = ζ11.   (1.9)

Pick

v1 = v1∗(ξ1, ζ1) + ū1 = −Σ_{j=1}^{r1} d_{j−1} ξ_{1j} − Σ_{j=r1+1}^{rm} d_{j−1} ζ_{1,j−r1} + ū1 = −K̂ x1 + ū1,
(1.10)
where
K̂ = (d0  d1  ···  d_{rm−1}).   (1.11)

By construction, ẋ1 = (Â − B̂K̂)x1 + B̂ū1, in which Â ∈ R^{rm×rm} and B̂ ∈ R^{rm×1} are
matrices of the form

Â = \begin{pmatrix} 0 & 1 & \cdots & 0 \\ \cdots & & & \cdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \end{pmatrix}, \qquad B̂ = \begin{pmatrix} 0 \\ \cdots \\ 0 \\ 1 \end{pmatrix}.   (1.12)

Step 2: Assume (1.9) holds. Set x2 = col(ξ2, ζ2) with ξ2 ∈ R^{r2} defined as

ξ_{2j} = x_{2j},  1 ≤ j ≤ r1
ξ_{2j} = x_{2j} + ψ_{2j}(x1, x_{2,j−1}, …, x_{m,j−1}, ζ_{1,j−r1}),  r1 + 1 ≤ j ≤ r2,

where the ψ2 j (·) are such that ξ̇2, j = ξ2, j+1 for 1 ≤ j ≤ r2 − 1. It can be checked
that
ξ̇2,r2 = a2 (z, x) + b2 (z, x)u + c2 (x, ζ )

in which c2 (x, ζ ) is a function that vanishes at (x, ζ ) = (0, 0).


Let u be such that

a2 (z, x) + b2 (z, x)u + c2 (x, ζ ) = ζ21 . (1.13)

Pick

v2 = v2∗(ξ2, ζ2) + ū2 = −Σ_{j=1}^{r2} d_{j−1} ξ_{2j} − Σ_{j=r2+1}^{rm} d_{j−1} ζ_{2,j−r2} + ū2 = −K̂ x2 + ū2.
(1.14)
By construction, ẋ2 = (Â − B̂K̂)x2 + B̂ū2.
Step 3: Assume (1.9) and (1.13) hold. Set x3 = col(ξ3, ζ3) with ξ3 ∈ R^{r3} defined
as

ξ_{3j} = x_{3j},  1 ≤ j ≤ r1
ξ_{3j} = x_{3j} + ψ_{3j}(x1, x_{2,j−1}, …, x_{m,j−1}, ζ_{1,j−r1}),  r1 + 1 ≤ j ≤ r2
ξ_{3j} = x_{3j} + ψ_{3j}(x1, x2, x_{3,j−1}, …, x_{m,j−1}, ζ_{1,j−r1}, ζ_{2,j−r2}),  r2 + 1 ≤ j ≤ r3,

where the ψ3 j (·) are such that ξ̇3, j = ξ3, j+1 for 1 ≤ j ≤ r3 − 1. It can be checked
that
ξ̇3,r3 = a3 (z, x) + b3 (z, x)u + c3 (x, ζ )

in which c3 (x, ζ ) is a function that vanishes at (x, ζ ) = (0, 0). Let u be such that

a3 (z, x) + b3 (z, x)u + c3 (x, ζ ) = ζ31 . (1.15)

Pick

v3 = v3∗(ξ3, ζ3) + ū3 = −Σ_{j=1}^{r3} d_{j−1} ξ_{3j} − Σ_{j=r3+1}^{rm} d_{j−1} ζ_{3,j−r3} + ū3 = −K̂ x3 + ū3.
(1.16)
By construction, ẋ3 = (Â − B̂K̂)x3 + B̂ū3.
Step m−1: Assume (1.9), (1.13), … hold. Set x_{m−1} = col(ξ_{m−1}, ζ_{m−1}) with
ξ_{m−1} ∈ R^{r_{m−1}} defined in such a way that ξ̇_{m−1,j} = ξ_{m−1,j+1} for 1 ≤ j ≤ r_{m−1} − 1.
It can be checked that

ξ̇m−1,rm−1 = am−1 (z, x) + bm−1 (z, x)u + cm−1 (x, ζ )

in which cm−1 (x, ζ ) is a function that vanishes at (x, ζ ) = (0, 0). Let u be such that

am−1 (z, x) + bm−1 (z, x)u + cm−1 (x, ζ ) = ζm−1,1 . (1.17)

Pick

v_{m−1} = v∗_{m−1}(ξ_{m−1}, ζ_{m−1}) + ū_{m−1}
  = −Σ_{j=1}^{r_{m−1}} d_{j−1} ξ_{m−1,j} − Σ_{j=r_{m−1}+1}^{rm} d_{j−1} ζ_{m−1,j−r_{m−1}} + ū_{m−1} = −K̂ x_{m−1} + ū_{m−1}.
(1.18)
By construction, ẋ_{m−1} = (Â − B̂K̂)x_{m−1} + B̂ū_{m−1}.
Step m: Assume (1.9), (1.13), …, (1.17) hold. Set xm = ξm with ξm ∈ R^{rm} defined
as

ξ_{mj} = x_{mj},  1 ≤ j ≤ r1
ξ_{mj} = x_{mj} + ψ_{mj}(x1, x_{2,j−1}, …, x_{m,j−1}, ζ_{1,j−r1}),  r1 + 1 ≤ j ≤ r2
ξ_{mj} = x_{mj} + ψ_{mj}(x1, x2, x_{3,j−1}, …, x_{m,j−1}, ζ_{1,j−r1}, ζ_{2,j−r2}),  r2 + 1 ≤ j ≤ r3
···
ξ_{mj} = x_{mj} + ψ_{mj}(x1, x2, …, x_{m−1}, x_{m,j−1}, ζ_{1,j−r1}, ζ_{2,j−r2}, …, ζ_{m−1,j−r_{m−1}}),
  r_{m−1} + 1 ≤ j ≤ rm,

where the ψ_{mj}(·) are such that ξ̇_{m,j} = ξ_{m,j+1} for 1 ≤ j ≤ rm − 1. It can be checked
that
ξ̇_{m,rm} = am(z, x) + bm(z, x)u + cm(x, ζ) − Σ_{i=1}^{m−1} γi(x, ζ)vi

in which cm(x, ζ) is a function that vanishes at (x, ζ) = (0, 0) and the γi(x, ζ)'s,
for i = 1, …, m − 1, are appropriately defined functions. Define, for convenience,
γm(x, ζ) = 1.
Let u be such that

am(z, x) + bm(z, x)u + cm(x, ζ)
  = Σ_{i=1}^{m−1} γi(x, ζ)vi + vm∗(ξm) + ūm
  = Σ_{i=1}^{m−1} γi(x, ζ)vi∗(ξi, ζi) + Σ_{i=1}^{m−1} γi(x, ζ)ūi + vm∗(ξm) + ūm   (1.19)
  = Σ_{i=1}^{m} γi(x, ζ)vi∗(ξi, ζi) + Σ_{i=1}^{m} γi(x, ζ)ūi,

in which

vm∗(ξm) = −Σ_{j=1}^{rm} d_{j−1} ξ_{mj} = −K̂ xm.   (1.20)

By construction, ẋm = (Â − B̂K̂)xm + B̂ūm.


Formulas (1.9), (1.13), (1.15), …, (1.17), (1.19), that implicitly define the control
u, can be expressed in compact form as follows. Observe that, for each i = 1, . . . , m,
the vector xi ∈ Rrm is a function of (x, ζ ), that will be written as xi = Ψi (x, ζ ). Set
⎛ ⎞ ⎛ ⎞
x1 Ψ1 (x, ζ )
x = ⎝· · ·⎠ Ψ (x, ζ ) = ⎝ · · · ⎠ .
xm Ψm (x, ζ )

It can be checked that the map

Rm·rm → Rm·rm
(x, ζ ) → x = Ψ (x, ζ )

is a globally defined diffeomorphism that preserves the origin. Set


⎛ ⎞ ⎛

−ζ11 0
⎜ c2 (x, ζ ) − ζ21 ⎟ ⎜0⎟
⎜ ⎟ ⎜ ⎟
C(x, ζ ) = ⎜
⎜ ··· ⎟
⎟ D=⎜ ⎟
⎜· · ·⎟
⎝cm−1 (x, ζ ) − ζm−1,1 ⎠ ⎝0⎠
cm (x, ζ ) 1

Γ (x, ζ ) = γ1 (x, ζ ) . . . γm−1 (x, ζ ) γm (x, ζ ) ,

and observe that C(x, ζ ) vanishes at (x, ζ ) = (0, 0) because so do all the ci (x, ζ )’s.
Then, (1.9), (1.13), (1.15), (1.19) altogether can be expressed as

A(z, x) + B(z, x)u + C(x, ζ ) = −DΓ (x, ζ )(Im ⊗ K̂ )Ψ (x, ζ ) + DΓ (x, ζ )ū

in which ū = col{ū 1 , . . . , ū m }.

It is seen from all of the above that if the controls vi of the dynamic extension
(1.8) are chosen as

vi = vi∗ (ξi , ζi ) + ū i = − K̂ Ψi (x, ζ ) + ū i , i = 1, . . . , m − 1, (1.21)

in which K̂ ∈ R1×rm is a row-vector of design parameters, and the control u is chosen


as

u = [B(z, x)]−1 [−A(z, x) − C(x, ζ ) − DΓ (x, ζ )(Im ⊗ K̂ )Ψ (x, ζ ) + DΓ (x, ζ )ū],


(1.22)
a closed-loop system is obtained described by equations of the form

ż = f 0 (z, x) + g0 (z, x)[B(z, x)]−1


×[−A(z, x) − C(x, ζ ) − DΓ (x, ζ )(Im ⊗ K̂ )Ψ (x, ζ ) + DΓ (x, ζ )ū]
ẋi = ( Â − B̂ K̂ )xi + B̂ ū i
yi = Ĉxi i = 1, . . . , m
(1.23)
in which Â ∈ R^{rm×rm} and B̂ ∈ R^{rm×1} are matrices of the form (1.12) and Ĉ ∈ R^{1×rm}
is a matrix of the form
Ĉ = (1  0  ···  0).   (1.24)

It is seen from this that the indicated choice of u and of the vi ’s has rendered the
system input–output linear (and also non-interactive). Moreover, if the matrix K̂ of
free design parameters is such that the polynomial

d(λ) = d0 + d1 λ + · · · + drm −1 λrm −1 + λrm (1.25)

is Hurwitz, the m lower subsystems of (1.23) are all globally asymptotically stable.
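Since (Â, B̂) in (1.12) is a controllable canonical pair, choosing the dj's amounts to picking the roots of d(λ) in (1.25). The sketch below (with illustrative roots and rm = 4) recovers K̂ from a desired Hurwitz polynomial and verifies the spectrum of Â − B̂K̂.

```python
import numpy as np

# The last row of A_hat - B_hat K_hat carries -d_0, ..., -d_{rm-1}, so the
# closed-loop characteristic polynomial is exactly d(lambda) of (1.25).
rm = 4
roots = [-1.0, -2.0, -3.0, -4.0]          # illustrative desired poles
coeffs = np.poly(roots)                    # [1, d_{rm-1}, ..., d_0]
d = coeffs[1:][::-1]                       # (d_0, ..., d_{rm-1})

A = np.eye(rm, k=1)                        # shift matrix A_hat of (1.12)
B = np.zeros((rm, 1)); B[-1, 0] = 1.0      # B_hat
K = d.reshape(1, rm)                       # K_hat of (1.11)

eigs = np.linalg.eigvals(A - B @ K)
assert np.allclose(sorted(eigs.real), sorted(roots), atol=1e-6)
assert np.all(eigs.real < 0)               # A_hat - B_hat K_hat is Hurwitz
```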
A consequence of the assumption of strong minimum phase is that the closed-
loop system (1.23), viewed as a system with input ū and state (z, x), is input-to-state
stable.
Proposition 1.1 Let Assumptions 1.1, 1.2, 1.3 be fulfilled and let K̂ be chosen so that
Â − B̂K̂ is Hurwitz. Then, system (1.23), viewed as a system with input ū and state
(z, x), is input-to-state stable. If ū = 0, the equilibrium (z, x) = (0, 0) is globally
and also locally exponentially stable.
Proof Recall that C(x, ζ) and Γ(x, ζ) are smooth functions, with C(x, ζ) vanish-
ing at (x, ζ) = (0, 0). Recall also that the map (x, ζ) = Ψ⁻¹(x) is a smooth map
vanishing at x = 0. Then, there exists a class K function α1(·), locally linear near
the origin, such that
‖x‖ ≤ α1(‖x‖),   (1.26)

a class K function α2(·), locally linear near the origin, such that

‖C(x, ζ) − DΓ(x, ζ)(Im ⊗ K̂)Ψ(x, ζ)‖ ≤ α2(‖x‖),   (1.27)



and two class K functions α3(·), α4(·), locally linear near the origin, such that

‖2DΓ(x, ζ)ū‖ ≤ α3(‖x‖) + α4(‖ū‖).   (1.28)

Combining the estimate (1.5) with the estimates (1.26)–(1.27)–(1.28), we con-
clude the existence of two class K∞ functions σ̄(·) and σ̂(·), locally quadratic near
the origin, such that

(∂V/∂z)( f0(z, x) + g0(z, x)[B(z, x)]⁻¹[−A(z, x) − C(x, ζ) − DΓ(x, ζ)(Im ⊗ K̂)Ψ(x, ζ) + DΓ(x, ζ)ū] )
≤ −α(‖z‖) + σ(‖x‖) + σ(‖C(x, ζ) − DΓ(x, ζ)(Im ⊗ K̂)Ψ(x, ζ) + DΓ(x, ζ)ū‖)
≤ −α(‖z‖) + σ(α1(‖x‖)) + σ(2α2(‖x‖)) + σ(‖2DΓ(x, ζ)ū‖)
≤ −α(‖z‖) + σ(α1(‖x‖)) + σ(2α2(‖x‖)) + σ(2α3(‖x‖)) + σ(2α4(‖ū‖))
≤ −α(‖z‖) + σ̄(‖x‖) + σ̂(‖ū‖).

Since x is a state of a linear input-to-state stable system, the claim follows from
standard results.

1.4.3 Robust Feedback Design

The nominal input–output linearizing control provided by (1.8)–(1.21)–(1.22) is


fragile, because the states (z, x) are not available and the entries of A(z, x), B(z, x),
C(x, ζ ), Ψ (x, ζ ), Γ (x, ζ ) might be uncertain. It was shown in [6], though, that the
linearizing (and stabilizing) effect of a similar feedback law can be recovered up to
any desired degree of accuracy by means of an implementable (extended-observer-
based) control law.
As suggested in [6], the controls u1, …, um and the additional inputs v1, …, v_{m−1}
can be chosen as follows. Let g_ℓ(·) be a (smooth) saturation function5 and let ϕ(σ, ς)
be the function defined as6

ϕ(σ, ς) = (1/b0)(−σ + ς).

The controls ui, for i = 1, …, m, are functions defined as

u1 = g_ℓ(ϕ(σ1, ζ11))
···                                             (1.29)
u_{m−1} = g_ℓ(ϕ(σ_{m−1}, ζ_{m−1,1}))
um = g_ℓ(ϕ(σm, −Σ_{j=1}^{rm} d_{j−1} ξ̂_{mj} + ūm))

5 A smooth function, characterized as follows: sat_ℓ(s) = s if |s| ≤ ℓ, sat_ℓ(s) is odd and monotoni-
cally increasing, with 0 < sat′_ℓ(s) ≤ 1, and lim_{s→∞} sat_ℓ(s) = ℓ(1 + c) with 0 < c ≪ 1.
6 The number b0 here is any number for which condition (1.7) holds.
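A concrete C¹ function with all the properties listed in footnote 5 can be built in many ways; the tanh-based blend and the constants below are illustrative choices, not prescribed by the text.

```python
import numpy as np

def sat(s, ell, c=0.1):
    """A C^1 saturation with the properties of footnote 5: identity on
    [-ell, ell], odd, strictly increasing, slope in (0, 1], and tending to
    ell*(1 + c) as s -> infinity. The tanh blend is one possible choice."""
    s = np.asarray(s, dtype=float)
    return np.where(
        np.abs(s) <= ell,
        s,
        np.sign(s) * (ell + c * ell * np.tanh((np.abs(s) - ell) / (c * ell))),
    )

ell = 2.0
assert sat(1.5, ell) == 1.5                    # identity inside the band
assert abs(sat(1e6, ell) - ell * 1.1) < 1e-9   # limit ell*(1 + c)
assert sat(-1e6, ell) == -sat(1e6, ell)        # odd
```

The derivative of the saturated branch equals 1 at |s| = ell, so the two branches join with a continuous first derivative.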

in which the d j ’s are the entries of the matrix (1.11), chosen in such a way that the
polynomial (1.25) is Hurwitz, and σ1 , . . . , σm−1 , σm and ξ̂m = (ξ̂m1 , . . . , ξ̂m,rm ) are
states of extended observers that are defined below.
Moreover, the controls vi , for i = 1, . . . , m − 1, are functions defined as
v1 = g_ℓ(−Σ_{j=1}^{r1} d_{j−1} ξ̂_{1j}) − Σ_{j=r1+1}^{rm} d_{j−1} ζ_{1,j−r1} + ū1
v2 = g_ℓ(−Σ_{j=1}^{r2} d_{j−1} ξ̂_{2j}) − Σ_{j=r2+1}^{rm} d_{j−1} ζ_{2,j−r2} + ū2
···                                             (1.30)
v_{m−1} = g_ℓ(−Σ_{j=1}^{r_{m−1}} d_{j−1} ξ̂_{m−1,j}) − Σ_{j=r_{m−1}+1}^{rm} d_{j−1} ζ_{m−1,j−r_{m−1}} + ū_{m−1}

in which ξ̂1 = (ξ̂_{11}, …, ξ̂_{1,r1}), ξ̂2 = (ξ̂_{21}, …, ξ̂_{2,r2}), …, ξ̂_{m−1} = (ξ̂_{m−1,1}, …,
ξ̂_{m−1,r_{m−1}}) are states of the extended observers that are defined below.
The extended observers that generate the variables σi and ξ̂i, for i = 1, 2, …, m,
are defined as

ξ̂˙_{i,1} = ξ̂_{i,2} + κi c_{i,ri}(yi − ξ̂_{i,1})
ξ̂˙_{i,2} = ξ̂_{i,3} + κi² c_{i,ri−1}(yi − ξ̂_{i,1})
···                                             (1.31)
ξ̂˙_{i,ri−1} = ξ̂_{i,ri} + κi^{ri−1} c_{i,2}(yi − ξ̂_{i,1})
ξ̂˙_{i,ri} = σi + b0 ui(·) + κi^{ri} c_{i,1}(yi − ξ̂_{i,1})
σ̇i = κi^{ri+1} c_{i,0}(yi − ξ̂_{i,1})

with ξ̂i = (ξ̂i,1 , ξ̂i,2 , . . . , ξ̂i,ri ) in which the ci, j ’s and the gain κi are design parameters.
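To see the mechanism of (1.31) at work, a single channel can be exercised in simulation. The sketch below (ri = 2, forward-Euler integration, all numeric values illustrative) estimates the states of a double integrator together with an unknown constant "total disturbance" a, which the extra observer state σ reconstructs.

```python
import numpy as np

# One channel of (1.31) with r_i = 2; c_{i,j} chosen so that
# lambda^3 + 6 lambda^2 + 11 lambda + 6 = (l+1)(l+2)(l+3) is Hurwitz.
kappa, (c2, c1, c0) = 20.0, (6.0, 11.0, 6.0)
b0, a = 1.0, 0.7                    # a = unknown constant disturbance
dt, T = 1e-4, 4.0

x1, x2 = 0.3, -0.2                  # plant: x1_dot = x2, x2_dot = a + b0*u
xh1, xh2, sig = 0.0, 0.0, 0.0       # observer states xi_hat_1, xi_hat_2, sigma
u = 0.0                             # open loop is enough to test estimation
for _ in range(int(T / dt)):
    e = x1 - xh1                    # output injection error y - xi_hat_1
    xh1 += dt * (xh2 + kappa * c2 * e)
    xh2 += dt * (sig + b0 * u + kappa**2 * c1 * e)
    sig += dt * (kappa**3 * c0 * e)
    x1 += dt * x2
    x2 += dt * (a + b0 * u)

assert abs(sig - a) < 1e-2          # sigma estimates the disturbance
assert abs(xh1 - x1) < 1e-3 and abs(xh2 - x2) < 1e-2
```

The estimation-error dynamics have eigenvalues −κ, −2κ, −3κ, so the errors decay quickly once the initial peaking transient has passed.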
Note that the components of ζ1 , . . . , ζm−1 appearing in (1.29), (1.30) are states
of the dynamical extension and, as such, available for feedback. Overall, the control
defined by (1.29)–(1.30)–(1.31) is a dynamical system, with internal state

ξ̂ = col(ξ̂1 , . . . , ξ̂m ), σ = col(σ1 , . . . , σm ),

driven by the inputs y and ζ = col(ζ1 , . . . , ζm−1 ), that generates the controls u and v.
It can be shown that the closed-loop system obtained by controlling (1.1) by means of
(1.8)–(1.29)–(1.30) can be seen as a perturbed version of the system (1.23) obtained
by means of the nominal linearizing control. In particular, with x = Ψ(x, ζ) defined
as before, the dynamics of x can be expressed in the form

ẋ = Ax + Bū + G(w, z, x, ξ̂, σ, ū),

in which A = Im ⊗ (Â − B̂K̂), B = Im ⊗ B̂ and G(w, z, x, ξ̂, σ, ū) is a perturbation
term.

As shown7 in [6], if the design parameters d0, …, d_{rm−1}, c_{i,0}, …, c_{i,ri} for i =
1, …, m, ℓ and κ1, …, κm are appropriately tuned, it is possible by means of such a
control to render the perturbation term arbitrarily small, on a time interval of the
form [T0, ∞), in which T0 can be taken arbitrarily small as well. Hence, the input–
output behavior can be made arbitrarily close to that of the ideally input–output
linearized system (1.23).
For the purpose of expressing this claim in precise terms and comparing the results
obtained under the nominal input–output linearizing control with those obtainable
under the control introduced in this section, let x_L(t) denote the state-response of the
system obtained under the nominal input–output linearizing control, which is

x_L(t) = e^{At} x_L(0) + ∫₀ᵗ e^{A(t−τ)} B ū(τ) dτ.

Then, the methods of [6] make it possible to prove the following claim.

Theorem 1.1 Consider system (1.1), augmented with (1.8), and controlled by
u defined as in (1.29) and the vi's defined as in (1.30), in which (ξ̂i, σi), for
i = 1, …, m, are states of the extended observers (1.31). Let the di's be such that
the polynomial (1.25) is Hurwitz. Let the c_{ij}'s be chosen in such a way that the
polynomials pi(λ) = λ^{ri+1} + c_{i,ri}λ^{ri} + ··· + c_{i,1}λ + c_{i,0} have all real and negative roots.
Suppose initial conditions are taken in a fixed (but otherwise arbitrary) compact
set C. Suppose the input ū(·) satisfies ‖ū(t)‖ ≤ U for all t ≥ 0, with U a fixed (but
otherwise arbitrary) number. Then, there is a choice of the saturation level ℓ such
that, given any ε > 0, there is a value κm∗ and, for every κm ≥ κm∗ a value κ∗_{m−1}(κm)
and, for every κ_{m−1} ≥ κ∗_{m−1}(κm) a value κ∗_{m−2}(κ_{m−1}, κm), and so on, such that, if
κm ≥ κm∗, κ_{m−1} ≥ κ∗_{m−1}(κm), κ_{m−2} ≥ κ∗_{m−2}(κ_{m−1}, κm), …, κ1 ≥ κ1∗(κ2, …, κm), the
trajectories of the closed-loop system obtained by controlling (1.1) via (1.8)–(1.29)–
(1.30)–(1.31) remain bounded and

‖x(t) − x_L(t)‖ ≤ ε  ∀t ∈ [0, ∞).   (1.32)

If, in addition, ū(t) = 0 for all t ≥ 0, then

lim_{t→∞} z(t) = 0,  lim_{t→∞} x(t) = 0,  lim_{t→∞} ζ(t) = 0,  lim_{t→∞} ξ̂(t) = 0,  lim_{t→∞} σ(t) = 0.

Thus, the proposed robust controller is able to obtain almost feedback linearization
and, in particular, semiglobal asymptotic stability.

7 The proof provided in the reference [6] addresses the case in which the dynamics of z are trivial.
If this is not the case, appropriate modifications are needed, taking into account the assumption of
strong minimum-phase.

1.5 Application to the Problem of Output Regulation

The method for robust semiglobal asymptotic stabilization described in the previous
section can be fruitfully applied in the design of a robust stabilizer for the solution
of a problem of output regulation. We assume, in what follows, that the reader
is familiar with the fundamentals of the theory of output regulation for nonlinear
systems (see, in this respect, [18–20]). Usually, a controller that solves the problem
of output regulation requires two ingredients: an internal model, whose purpose
is to generate—in steady state—a "feedforward" control that keeps the regulated
output at zero, and a stabilizer, whose purpose is to make trajectories converge
to the desired steady state. The method for robust stabilization described in the
previous section provides a simple and straightforward procedure for the design
of such a stabilizer.
In this section, we address a problem of output regulation for a system having a
normal form with a structure identical to that of (1.1) in which the outputs y1 , . . . , ym
are replaced by the components e1 , . . . , em of the regulated variables and in which
the various nonlinear functions/maps are affected by an exogenous input w. To avoid
duplications we do not rewrite such normal form explicitly,8 but we limit ourselves to
stress that, in a structure identical to that of (1.1), f0(z, x) is replaced by f0(w, z, x),
g0(z, x) is replaced by g0(w, z, x), ai(z, x) is replaced by ai(w, z, x) and bi(z, x)
is replaced by bi(w, z, x), for i = 1, …, m; the multipliers δ^j_{ik}(·) are allowed to
depend on w but, as in Assumption 1.1, are assumed to be independent of z and
dependent on the individual components of x as in

δ^j_{i,k+1}(w, x1, …, x_{i−1}, x_{i,k}, …, x_{m,k}),  r_{i−1} ≤ k ≤ ri − 1, 1 ≤ j ≤ i − 1.

Finally, a property identical to that indicated in Assumption 1.2 is taken. For con-
venience, in what follows we will refer to such assumptions as to the “equivalent
versions” of Assumptions 1.1 and 1.2.
The exogenous input w is any solution of the autonomous o.d.e.

ẇ = s(w)

(usually known as the exosystem) with initial conditions ranging on a compact and
invariant set W . The problem of output regulation consists in finding a feedback law,
driven by the regulated variables e1 , . . . , em , so that—in the resulting closed-loop
system—all trajectories are bounded and limt→∞ e(t) = 0.
The first step in the solution of this problem is the characterization of a solution of
the so-called regulator equations that identify—in the state space of the composite
system plant-exosystem—a manifold that is rendered invariant via feedback and on
which the regulated variables vanish. The manifold in question is characterized by a
pair of maps z = π0(w) and x = πx(w), and u = ψ(w) is the control that renders it
invariant. Simple calculations show that, since the regulated variables must vanish

8 See [21] for a more detailed presentation.



on such manifold, necessarily πx(w) = 0. Moreover, π0(w) is characterized by the
p.d.e.

(∂π0/∂w) s(w) = f0(w, π0(w), 0) + g0(w, π0(w), 0)ψ(w),

in which
ψ(w) = [B(w, π0(w), 0)]⁻¹[−A(w, π0(w), 0)].

Having assumed that such π0 (w) exists, to proceed in the analysis it is convenient
to scale the variable z as
z̃ = z − π0 (w) .

As a consequence, the dynamics of z are replaced by those of

z̃˙ = f0(w, z̃ + π0(w), x) + g0(w, z̃ + π0(w), x)[ψ(w) − ψ(w) + u] − (∂π0/∂w) s(w)
   = f̃0(w, z̃, x) + g̃0(w, z̃, x)[−ψ(w) + u]

in which, by construction, f˜0 (w, z̃, x) vanishes at (z̃, x) = (0, 0). The dynamics of
xi,ri are also affected by such scaling and change into

ẋi,ri = ãi (w, z̃, x) + b̃i (w, z̃, x)[−ψ(w) + u]

in which, by construction, ãi (w, z̃, x) vanishes at (z̃, x) = (0, 0). Consistently, set

Ã(w, z̃, x) = col(ã1 (w, z̃, x), . . . , ãm (w, z̃, x))
B̃(w, z̃, x) = col(b̃1 (w, z̃, x), . . . , b̃m (w, z̃, x)).

On the rescaled system it is easy to characterize the property of strong minimum-phase.
Consistently with the definition given earlier in the paper, the latter considers
the dynamics

z̃˙ = f˜0 (w, z̃, x) + g̃0 (w, z̃, x)[ B̃(w, z̃, x)]−1 [− Ã(w, z̃, x) + χ ], (1.33)

seen as a system with state z̃ and inputs (x, χ ) and requires it to be input-to-state
stable. With the results of [15, 16] in mind, the property in question can be expressed
as follows.
Definition 1.2 The system is strongly minimum-phase if there exist a C¹ function
V : R^{n−r} → R and four class K∞ functions α̲(·), ᾱ(·), α(·), σ(·) such that

α̲(‖z̃‖) ≤ V(z̃) ≤ ᾱ(‖z̃‖)  for all z̃ ∈ R^{n−r}

and

(∂V/∂z̃)( f̃0(w, z̃, x) + g̃0(w, z̃, x)[B̃(w, z̃, x)]⁻¹[−Ã(w, z̃, x) + χ] )
≤ −α(‖z̃‖) + σ(‖x‖) + σ(‖χ‖)

for all (w, z̃, x, χ) ∈ W × R^{n−r} × R^r × R^m. The system is strongly—and also
locally exponentially—minimum-phase (eSMP) if the inequalities above hold with
α̲(·), ᾱ(·), α(·), σ(·) that are locally quadratic near the origin.

In what follows, it is assumed that the property indicated in this Definition holds,
and we will refer to it as the "equivalent version" of Assumption 1.3.
The key ingredient in the solution of a problem of output regulation is the design
of an internal model, a device able to generate, in steady state, the control input
u = ψ(w) that renders the point (z̃, x) = (0, 0) an equilibrium point. To this end, an
assumption on ψ(w) is convenient.

Assumption 1.4 For each i = 1, …, m there exist an integer di and a globally
Lipschitz smooth function φi : R^{di} → R such that the i-th component ψi(w) of
ψ(w) satisfies

L_s^{di} ψi(w) = φi(ψi(w), L_s ψi(w), …, L_s^{di−1} ψi(w))  ∀w ∈ W.
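For a harmonic exosystem this immersion property is immediate; the check below (an illustrative instance, with Ω = 3) verifies that ψ(w) = Qw satisfies L_s²ψ = −Ω²ψ, so that di = 2 and φi is linear, hence globally Lipschitz.

```python
import numpy as np

# Illustrative instance of Assumption 1.4: w_dot = S w with
# S = [[0, 1], [-Om^2, 0]] and psi(w) = Q w, Q = (1 0). Lie derivatives
# along s(w) = S w are L_s^k psi = Q S^k w, and Q S^2 = -Om^2 Q, i.e.
# L_s^2 psi = -Om^2 psi: here d_i = 2 and phi(psi, L_s psi) = -Om^2 psi.
Om = 3.0
S = np.array([[0.0, 1.0], [-Om**2, 0.0]])
Q = np.array([[1.0, 0.0]])

assert np.allclose(Q @ S @ S, -Om**2 * Q)   # holds for every w
```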

The functions φ1(·), …, φm(·) determine the construction of the internal model,
the aggregate of m SISO systems of the form

η̇i = Âi ηi + B̂i φi(ηi) + Gi ūi
ui = Ĉi ηi + ūi

in which ηi ∈ R^{di}, and Âi, B̂i, Ĉi are matrices of the form (1.12)–(1.24). The vec-
tors Gi ∈ R^{di} are vectors of design parameters. The controls ūi will be used for
stabilization purposes.
By construction (see [22]), for each i = 1, …, m, the map

ϑi(w) = col{ψi(w), L_s ψi(w), …, L_s^{di−1} ψi(w)}

satisfies

(∂ϑi/∂w) s(w) = Âi ϑi(w) + B̂i φi(ϑi(w)),
ψi(w) = Ĉi ϑi(w),
for all w ∈ W.

Altogether, the various subsystems that characterize the internal model can be put in
the form
η̇ = Âη + B̂φ(η) + Gū
u = Ĉη + ū

in which Â, B̂, Ĉ, G are block-diagonal matrices, whose i-th diagonal blocks are
Âi, B̂i, Ĉi, Gi, and
η = col{η1, …, ηm}
ū = col{ū1, …, ūm}
φ(η) = col{φ1(η1), …, φm(ηm)}.

If η is scaled as
η̃ = η − ϑ(w),

with ϑ(w) = col{ϑ1 (w), . . . , ϑm (w)}, a simple calculation yields

η̃˙ = F(w, η̃) + G[Ĉ η̃ + ū]

in which, by construction,

F(w, η̃) = [ Â − G Ĉ]η̃ + B̂[φ(η̃ + ϑ(w)) − φ(ϑ(w))]

vanishes at η̃ = 0. Such scaling also affects the dynamics of z̃ and xi,ri , that change
as
z̃˙ = f˜0 (w, z̃, x) + g̃0 (w, z̃, x)[Ĉ η̃ + ū]
ẋi,ri = ãi (w, z̃, x) + b̃i (w, z̃, x)[Ĉ η̃ + ū].

We are now ready to write the normal form of the so-called "augmented system,"
namely, the system obtained by preprocessing the plant by means of the internal model.
The normal form in question is

ẇ = s(w)
z̃˙ = f̃0(w, z̃, x) + g̃0(w, z̃, x)[Ĉη̃ + ū]
η̃˙ = F(w, η̃) + G[Ĉη̃ + ū]
ẋ_{i,1} = x_{i,2}
···
ẋ_{i,r1} = x_{i,r1+1} + δ^1_{i,r1+1}(w, x)(ã1(w, z̃, x) + b̃1(w, z̃, x)[Ĉη̃ + ū])        (1.34)
···
ẋ_{i,ri−1} = x_{i,ri} + Σ_{j=1}^{i−1} δ^j_{i,ri}(w, x)(ãj(w, z̃, x) + b̃j(w, z̃, x)[Ĉη̃ + ū])
ẋ_{i,ri} = ãi(w, z̃, x) + b̃i(w, z̃, x)[Ĉη̃ + ū]
ei = x_{i,1},  i = 1, …, m.

Since f̃0(w, 0, 0) = 0, ãi(w, 0, 0) = 0, F(w, 0) = 0, the point (z̃, η̃, x) = (0, 0, 0)
is an equilibrium point when ū = 0, for every value of w, and the regulated error
vanishes at such point. Hence, if such equilibrium is stabilized, the problem of output
regulation is solved.
System (1.34) has a structure similar to that of system (1.1). Hence, it is reasonable
to expect that if assumptions corresponding to those considered in the previous
section hold, semiglobal stability can be obtained by means of a robust controller.
Equivalent versions of Assumptions 1.1 and 1.2 have already been claimed; hence, it
remains to check the property of strong minimum-phase for system (1.34). According
to Definition 1.2, we should look at the system

z̃˙ = f˜0 (w, z̃, x) + g̃0 (w, z̃, x)[ B̃(w, z̃, x)]−1 [− Ã(w, z̃, x) + χ ]
(1.35)
η̃˙ = F(w, η̃) + G[ B̃(w, z̃, x)]−1 [− Ã(w, z̃, x) + χ ]

and check that the latter, seen as a system with state (z̃, η̃) and input (x, χ), has the
properties indicated in such Definition. The system in question is the cascade of
two systems: the upper subsystem—if the plant satisfies the equivalent version of
Assumption 1.3—already has the desired properties. Thus, we only have to make sure
that the lower subsystem has properties that imply, for (1.35), the fulfillment of the
conditions indicated in Definition 1.2. This is actually possible, thanks to the following
result (whose proof is an extension of a proof given in [22]).

Proposition 1.2 There is a choice of G, a positive definite symmetric matrix P and
a real number a > 0 such that the quadratic function U(η̃) = η̃ᵀPη̃ satisfies

(∂U/∂η̃)[F(w, η̃) + Gû] ≤ −a‖η̃‖² + ‖û‖²  for all (w, η̃, û).

With this result in mind, appealing to standard results concerning the input-to-
state stability properties of composite systems (see [23]), it is possible to prove that,
if the equivalent version of Assumption 1.1 holds, an appropriate choice of G makes
system (1.34) strongly minimum phase.
Having checked the appropriate assumptions, we can conclude that semiglobal
asymptotic stability of the equilibrium (z̃, η̃, x) = (0, 0, 0) can be enforced by means
of the robust controller described in the previous section. Note that the controller is in
the present case identical to the controller consisting of (1.8)–(1.29)–(1.30)–(1.31),
because the structure of such controller is determined only by the integers r1 , . . . , rm .
Only a slight change of notation is needed. The “extra input” ū in (1.29)–(1.30) must
be suppressed (because only asymptotic stability is sought) and the variable u in
(1.29)–(1.31) should be replaced by ū, to make it consistent with the present setting,
where the control used for stabilization purposes has been denoted by ū.

1.6 An Illustrative Example

Consider the 2-input 2-output system modeled by the equations

ż = z + x̄2 + u1
x̄˙1 = 2z + x̄2 + u1
x̄˙2 = x̄3 + x̄1[2z + x̄2 + u1]
x̄˙3 = z² + u2
y1 = x̄1
y2 = x̄2.

This system does not have a vector relative degree, because

\begin{pmatrix} ẏ1 \\ ẏ2 \end{pmatrix} = \begin{pmatrix} 2z + x̄2 \\ x̄3 + x̄1[2z + x̄2] \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ x̄1 & 0 \end{pmatrix} \begin{pmatrix} u1 \\ u2 \end{pmatrix}.

However, this system is uniformly invertible (as a simple check shows).


Let the outputs y1 , y2 be required to asymptotically track two different harmonic
signals. We cast this as a problem of output regulation, defining tracking errors

ei = yi − qi (wi ) i = 1, 2

in which qi(wi) = Qi wi and ẇi = Si wi, where

wi = \begin{pmatrix} w_{i1} \\ w_{i2} \end{pmatrix},  Si = \begin{pmatrix} 0 & 1 \\ −Ωi² & 0 \end{pmatrix},  Qi = (1  0),  i = 1, 2

and Ω1 ≠ Ω2. The design procedure presented in the paper can be implemented in
various steps, as follows.
Step 1: First of all, the system with input u and output e is put in normal form.
To this end, we define

x11 = x̄1 − q1 (w1 ) = x̄1 − w11


x21 = x̄2 − q2 (w2 ) = x̄2 − w21
x22 = x̄3 − q̇2 (w2 ) + x̄1 q̇1 (w1 ) = x̄3 − w22 + x̄1 w12 .

The resulting normal form is

ż = z + x21 + w21 + u1
ẋ11 = 2z − w12 + x21 + w21 + u1
ẋ21 = x22 + (x11 + w11)[2z − w12 + x21 + w21 + u1]
ẋ22 = z² + u2 + Ω2² w21 + (2z + x21 + w21 + u1)w12 − (x11 + w11)Ω1² w11
e1 = x11
e2 = x21.

This is the desired normal form (1.1), with r1 = 1, r2 = 2,

f0(z, x, w) = z + x21 + w21,  g0(z, x, w) = (1  0),  δ^1_{2,2}(w, x) = x11 + w11,

A(w, z, x) = \begin{pmatrix} 2z − w12 + x21 + w21 \\ z² + Ω2² w21 + (2z + x21 + w21)w12 − (x11 + w11)Ω1² w11 \end{pmatrix},

B(w, z, x) = \begin{pmatrix} 1 & 0 \\ w12 & 1 \end{pmatrix}.

Step 2: Next, we look at the nonlinear regulator equations. The function ψ(w) is
given by

ψ(w) = [B(w, π0(w), 0)]⁻¹[−A(w, π0(w), 0)] = \begin{pmatrix} −2π0(w) + w12 − w21 \\ −w12² − [π0(w)]² − Ω2² w21 + Ω1² w11² \end{pmatrix}.

Replacing ψ(w) into the p.d.e. that defines π0(w) (note that only ψ1(w) is involved)
we get

(∂π0(w)/∂w) s(w) = π0(w) + w21 − 2π0(w) + w12 − w21 = −π0(w) + w12.

It is seen from this that π0(w) is a linear form in w1. Setting π0(w) = c1 w11 + c2 w12,
the p.d.e. reduces to a Sylvester equation (c1 c2)S1 = −(c1 c2) + (0 1) that has a
unique solution. Looking now at the expression of ψ(w), it is realized that ψ1(w) is
a linear form in (w1 , w2 ), while ψ2 (w) is the sum of a quadratic form in w1 and of a
linear form in w2 . In other words, we can conclude that there are vectors Γ1 ∈ R1×4
and Γ2 ∈ R1×5 such that9
ψ1(w) = Γ1 \begin{pmatrix} w1 \\ w2 \end{pmatrix},  ψ2(w) = Γ2 \begin{pmatrix} w1^{[2]} \\ w2 \end{pmatrix},  where  w1^{[2]} = \begin{pmatrix} w11² \\ w11 w12 \\ w12² \end{pmatrix}.
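The Sylvester equation that determines π0 is just a 2 × 2 linear system. The sketch below solves it for the value Ω1 = 1 adopted in the simulation at the end of the section, and checks the resulting π0 against the p.d.e.

```python
import numpy as np

# Solve (c1 c2) S1 = -(c1 c2) + (0 1), i.e. c (S1 + I) = (0 1),
# for the illustrative value Omega_1 = 1 used later in the simulation.
Om1 = 1.0
S1 = np.array([[0.0, 1.0], [-Om1**2, 0.0]])
c = np.linalg.solve((S1 + np.eye(2)).T, np.array([0.0, 1.0]))  # row-vector eq.

# closed form: c = (Om1^2, 1)/(1 + Om1^2), so
# pi_0(w) = (Om1^2 w11 + w12)/(1 + Om1^2)
assert np.allclose(c, np.array([Om1**2, 1.0]) / (1 + Om1**2))

# verify the p.d.e.  (d pi0 / d w1) S1 w1 = -pi0(w) + w12 on random samples
rng = np.random.default_rng(1)
for _ in range(100):
    w1 = rng.standard_normal(2)
    assert abs(c @ (S1 @ w1) - (-(c @ w1) + w1[1])) < 1e-12
```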

Step 3: We now check that Assumption 1.3 of strong minimum-phase is
fulfilled. Scaling z as z̃ = z − π0(w) we get

z̃˙ = z̃ + x21 + [u1 − ψ1(w)].

In this expression, we have to replace [u1 − ψ1(w)] by the first component of
[B̃(w, z̃, x)]⁻¹[−Ã(w, z̃, x) + χ], which is

−ã1(w, z̃, x) + χ1 = −2z̃ − x21 + χ1.

This yields
z̃˙ = −z̃ + χ1,

which, seen as a system with input χ1, is trivially input-to-state stable.


Step 4: Having checked that Assumptions 1.1–1.3 are fulfilled, we proceed
now to check that also Assumption 1.4 is fulfilled and we determine the functions
φ1(·), φ2(·). For the function φ1(·), using Cayley–Hamilton's Theorem, it is seen that

L_s⁴ ψ1(w) = −(Ω1² + Ω2²) L_s² ψ1(w) − Ω1² Ω2² ψ1(w).

Thus η1 ∈ R⁴ and

φ1(η1) = −(Ω1² Ω2²)η11 − (Ω1² + Ω2²)η13 := Φ1 η1.

9 Note that the actual values of Γ1 and Γ2 are not needed in the sequel.

For the function φ2(·), observe that (see [19]) (d/dt) w1^{[2]} = S1^{[2]} w1^{[2]}, where S1^{[2]} is a
matrix having characteristic polynomial p1^{[2]}(λ) = λ³ + 4Ω1² λ. Since the character-
istic polynomial of S2 is p2(λ) = λ² + Ω2², using Cayley–Hamilton's Theorem, we get

L_s⁵ ψ2(w) = −(4Ω1² + Ω2²) L_s³ ψ2(w) − (4Ω1² Ω2²) L_s ψ2(w).

Thus η2 ∈ R⁵ and

φ2(η2) = −(4Ω1² Ω2²)η22 − (4Ω1² + Ω2²)η24 := Φ2 η2.
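Both Cayley–Hamilton identities can be confirmed numerically; the sketch below uses the simulation values Ω1 = 1, Ω2 = 2 and compares characteristic polynomials computed with numpy.poly against the stated coefficients.

```python
import numpy as np

# psi_1 is linear in (w1, w2), so it obeys the characteristic polynomial of
# blkdiag(S1, S2); psi_2 is linear in (w1^[2], w2), so it obeys that of
# blkdiag(S1^[2], S2). Values Omega_1 = 1, Omega_2 = 2 as in the simulation.
O1, O2 = 1.0, 2.0
S1 = np.array([[0, 1], [-O1**2, 0]], float)
S2 = np.array([[0, 1], [-O2**2, 0]], float)
# dynamics of w1^[2] = (w11^2, w11 w12, w12^2):
S1_2 = np.array([[0, 2, 0], [-O1**2, 0, 1], [0, -2 * O1**2, 0]], float)

blk1 = np.block([[S1, np.zeros((2, 2))], [np.zeros((2, 2)), S2]])
# lambda^4 + (O1^2 + O2^2) lambda^2 + O1^2 O2^2
assert np.allclose(np.poly(blk1), [1, 0, O1**2 + O2**2, 0, O1**2 * O2**2])

blk2 = np.block([[S1_2, np.zeros((3, 2))], [np.zeros((2, 3)), S2]])
# lambda^5 + (4 O1^2 + O2^2) lambda^3 + 4 O1^2 O2^2 lambda
assert np.allclose(np.poly(blk2),
                   [1, 0, 4 * O1**2 + O2**2, 0, 4 * O1**2 * O2**2, 0])
```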

Step 5: To complete the design of the internal model, we have to fix the vectors
G1 and G2. Since we are dealing with linear φi(·)'s the issue is trivial. The function
F(w, η̃) is

F(w, η̃) = \begin{pmatrix} F1 & 0 \\ 0 & F2 \end{pmatrix} \begin{pmatrix} η̃1 \\ η̃2 \end{pmatrix}

in which the Fi's have the form Fi = Âi + B̂i Φi − Gi Ĉi with (Âi + B̂i Φi, Ĉi)
observable. Hence the Gi's can be determined by standard methods.
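One such standard method is sketched below for G1, with the pole pattern det(sI − F1) = (s+1)(s+2)(s+3)(s+4) used later in the example. The linear-solve route (one choice among many, e.g. observer canonical forms or dedicated pole-placement routines) exploits the fact that, for a single-output pair, the coefficients of det(sI − (A − GC)) are affine in G.

```python
import numpy as np

O1, O2 = 1.0, 2.0
n = 4
A = np.eye(n, k=1)                       # A_hat_1 (shift matrix)
A[-1, :] = [-O1**2 * O2**2, 0.0, -(O1**2 + O2**2), 0.0]   # + B_hat_1 Phi_1
C = np.zeros((1, n)); C[0, 0] = 1.0      # C_hat_1 = e_1^T

p_des = np.poly([-1.0, -2.0, -3.0, -4.0])  # target characteristic polynomial
p0 = np.poly(A)
# columns of M: sensitivity of the coefficients to each component of G
M = np.column_stack(
    [np.poly(A - np.eye(n)[:, [k]] @ C) - p0 for k in range(n)]
)[1:]                                    # drop the (constant) leading 1
G1 = np.linalg.solve(M, (p_des - p0)[1:]).reshape(n, 1)

F1 = A - G1 @ C                          # F_1 = A_hat_1 + B_hat_1 Phi_1 - G_1 C_hat_1
assert np.allclose(sorted(np.linalg.eigvals(F1).real), [-4, -3, -2, -1], atol=1e-6)
```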
Step 6: At this stage, we choose the dynamic extension. Since r1 = 1 and r2 = 2,
only a 1-dimensional extension ζ̇1 = v1 is needed.
Step 7: Finally, we add the appropriate extended observers (1.31) and define the
controls v1 and ū1, ū2. This yields

v1 = g_ℓ(−d0 ξ̂11) − d1 ζ1
ū1 = b0 g_ℓ((1/b0)(−σ1 + ζ1))
ū2 = b0 g_ℓ((1/b0)(−σ2 − d0 ξ̂21 − d1 ξ̂22))

in which ζ1, ξ̂11, ξ̂21, ξ̂22, σ1, σ2 are states of

ζ̇1 = g_ℓ(−d0 ξ̂11) − d1 ζ1
ξ̂˙11 = σ1 + b0 g_ℓ((1/b0)(−σ1 + ζ1)) + κ1 c11(e1 − ξ̂11)
σ̇1 = κ1² c10(e1 − ξ̂11)
ξ̂˙21 = ξ̂22 + κ2 c22(e2 − ξ̂21)
ξ̂˙22 = σ2 + b0 g_ℓ((1/b0)(−σ2 − d0 ξ̂21 − d1 ξ̂22)) + κ2² c21(e2 − ξ̂21)
σ̇2 = κ2³ c20(e2 − ξ̂21).

The control designed in this way was implemented to solve a problem of tracking
the sinusoidal references

y1,ref = sin t , y2,ref = sin 2t ,



Fig. 1.1 The errors e1(t) and e2(t)

Fig. 1.2 The variable z(t)

in which case the reference trajectory in the (y1, y2) plane is the classical Lissajous
"figure eight." We choose G1 and G2 so as to have

det(sI − F1) = (s + 1)(s + 2)(s + 3)(s + 4)
det(sI − F2) = (s + 1)(s + 2)(s + 3)(s + 4)(s + 5)

and
b0 = 0.6, d0 = 1, d1 = 2,
c11 = 3, c10 = 2, c22 = 6, c21 = 11, c20 = 6,
ℓ = 100, κ1 = 10, κ2 = 20.

We then ran a simulation assuming x11(0) = 0.5 and x21(0) = x22(0) =
z(0) = 0, for the state of the plant, and η1(0), η2(0), ζ1(0), ξ̂1(0), ξ̂2(0), σ1(0), σ2(0)
all zero for the state of the controller. The results of the simulation are shown in the
following figures (Figs. 1.1, 1.2, 1.3, and 1.4).

Fig. 1.3 The observer states ξ̂ and σ

Fig. 1.4 The outputs in the (y1(t), y2(t)) plane

Chapter 2
Continuous-Time Implementation of
Reset Control Systems

Andrew R. Teel

Abstract This chapter presents a continuous-time implementation of a reset control
system that has a linear flow map, a linear jump map, and a quadratic function that
describes the jump set. The implementation is a homogeneous differential inclusion
that depends only on the data of the reset system and matches the flows of the reset sys-
tem in the flow set. Assuming that the reset control system admits a strongly convex
Lyapunov function that establishes stability of its origin, the continuous-time imple-
mentation has the origin globally exponentially stable. In particular, the continuous-
time implementation eliminates purely discrete-time solutions of the reset system that
do not converge. The behavior of the continuous-time implementation is illustrated
through multiple examples.

2.1 Introduction

This chapter is dedicated to Laurent Praly, who has impacted the problems I have
worked on and the solutions I have found ever since my brief, but indelible, post-
doctoral visit during the last six months of 1992. During the virtual workshop held
to celebrate his 65th birthday, as part of the 59th IEEE Conference on Decision and
Control, Laurent mentioned how Petar Kokotović encouraged him to stop working
on discrete-time systems and move to continuous-time systems, where Petar envi-
sioned that Laurent could conceive the most intricate and successful of Lyapunov
approaches to nonlinear control problems. While Laurent has not suggested that
I stop working on hybrid systems, for this particular work, I have chosen to shift
back to continuous time, where Laurent is the dominant player, showing how a reset
control system, which is a particular type of hybrid system, can be implemented
in continuous time. Fittingly, from my point of view, the continuous-time system
is a differential inclusion, which I began to understand more deeply while working

A. R. Teel (B)
University of California, Santa Barbara, CA 93106-9560, USA
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 27
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5_2
28 A. R. Teel

with Laurent on disturbance attenuation problems and converse Lyapunov theorems
roughly twenty years ago. I am grateful for all I have learned from the brilliant
Laurent Praly.
In this chapter, a continuous-time implementation of a class of reset control sys-
tems is presented. Reset control systems originated with the work of J.C. Clegg in
1958 [5], who was motivated to develop a nonlinear integrating circuit with less
phase lag than a standard linear integrator. Clegg’s integrator caught the attention of
Krishnan and Horowitz [16] in the 1970s who described it as “an integrator which
resets to zero at zero crossings of the input and is an ordinary integrator between zero
crossings.” Shortly thereafter, Horowitz and Rosenbaum [12] developed the more
general “First-Order Reset Elements” (FOREs), “whose output ... also resets to zero
at zero crossings of the input,” to address perceived “shortcomings” of the Clegg
integrator. Stability analysis for control loops with Clegg integrators and FOREs
began in earnest with the work of Hollot and coauthors in the late 1990s [2, 4, 10,
11, 13, 14]. Significant additional progress was made on the analysis of feedback
loops with Clegg integrators and FOREs when the zero-crossing-triggered interpre-
tation of the resetting mechanism was replaced by a sector-triggered interpretation
and piecewise-quadratic Lyapunov functions were used for the analysis [20–22].
Since that time, there has been an explosion of papers on reset control systems; at
the writing of this chapter, nearly 390 of the 430 references to the original paper by
Clegg (90%) have appeared since 2005.
The paper [21] was one of the first works to cast a closed-loop reset control system as a
hybrid system explicitly. It used the hybrid systems framework of [6–9] to model the
closed-loop system, showed how to use temporal regularization to eliminate purely
discrete-time solutions that do not converge, and demonstrated the advantages of
pursuing non-quadratic Lyapunov functions for stability analysis.
This chapter explores implementing a reset control system using a differential
inclusion that depends on the data of the hybrid system rather than implementing the
hybrid system itself. One reason for pursuing such a result is that (discontinuous)
differential equations are more familiar, compared to hybrid systems, to the dynam-
ical systems and control engineering communities. Perhaps more significantly, the
differential inclusion implementation obviates the need for any temporal regulariza-
tion in the hybrid system to avoid purely discrete-time solutions that do not converge.
Equivalences between certain classes of hybrid systems and differential inclusions
arising as projected dynamical systems have appeared in the literature; see [3] and
the references therein. The connection between projected dynamical systems and the
differential inclusion used in this chapter is not explored here.
The main result of this chapter, Theorem 2.1, establishes conditions under which
the origin of the prescribed differential inclusion is globally exponentially stable.
The main assumption is that the hybrid reset control system admits a continuously
differentiable, strongly convex, homogeneous Lyapunov function that establishes sta-
bility, though not necessarily asymptotic stability, of the origin; see Assumption 2.1.
Assuming the existence of a strongly convex, homogeneous Lyapunov function is
not outlandish: results on the existence of convex, homogeneous Lyapunov functions
in switched, but not hybrid, linear settings can be found in [18]. Moreover, some
2 Continuous-Time Implementation of Reset Control Systems 29

reset control systems admit positive definite, quadratic Lyapunov functions, which
are necessarily homogeneous and strongly convex. The final section of the chapter
considers several examples, each of which corresponds to a reset control system
admitting a positive definite, quadratic Lyapunov function that establishes stability.
The example in Sect. 2.4.4 considers a setting that is the genesis for the general
results in this chapter: a differential inclusion for accelerated convex optimization
[1, 17] that is inspired by a hybrid algorithm for accelerated convex optimization as
in [24] and the references therein.

2.2 Objective and Primary Assumption

The objective of this chapter is to find a continuous-time implementation of the
hybrid, reset control system
 
x ∈ C := x ∈ Rn : x T M x ≤ 0 ẋ = Ax (2.1a)
  +
x ∈ D := x ∈ Rn : x T M x ≥ 0 x = Rx (2.1b)

such that the origin of the implementation is globally exponentially stable when the
following conditions hold for the data of the reset control system:

Assumption 2.1 The matrices A, R, M = M T ∈ Rn×n are such that


1. there exist ε > 0 and a continuously differentiable, strongly convex, homogeneous
of degree two, positive definite function V : Rn → R≥0 such that, with the
definition Cε := { x ∈ Rn : x T M x ≤ ε x T x }, the following inequalities hold:

x ∈ Cε =⇒ ⟨∇V (x), Ax⟩ ≤ 0, (2.2a)
x ∈ D =⇒ V (Rx) − V (x) ≤ 0. (2.2b)

Here, “homogeneous of degree two” means that V (λx) = λ2 V (x) for all λ > 0
and x ∈ Rn ; “strongly convex” means that there exists μ > 0 such that, for all
(x, y) ∈ Rn × Rn ,

V (y) ≥ V (x) + ⟨∇V (x), y − x⟩ + μ|y − x|₂². (2.3)

2. x ∈ D implies Rx ∈ C, where C and D are defined in (2.1).


3. There is no solution of (2.1a) with an unbounded time domain that keeps the
function V of item 1 equal to a nonzero constant.

While item 1 of Assumption 2.1 is enough to guarantee that the origin of (2.1) is
stable, the totality of Assumption 2.1 is not strong enough to guarantee exponential
stability of the origin for the reset system (2.1). Indeed, for any x◦ ∈ Rn \ {0} such
that Rx◦ = x◦ (there exists such x◦ whenever R − I is singular, which is the case

for most of the examples considered in Sect. 2.4), there is the solution x(0, j) = x◦
for all j ∈ Z≥0 . More generally, it is common for reset control systems to have
purely discrete-time solutions that do not converge to zero and that must therefore
be removed with some type of temporal or space regularization. This fact is part of
the motivation for pursuing a continuous-time implementation of (2.1).
The conditions of Assumption 2.1 are also not strong enough for exponential
stability of the origin for (2.1) even if the focus is on solutions with time domains
that are unbounded in the ordinary time direction, as illustrated by the following
example.
Example 2.1 Consider
    

A := [ 0 1 ; −1 0 ],  R := [ 0 −1 ; 1 0 ],  M := [ 0 1 ; 1 0 ]. (2.4)

With these definitions, the flow set C is the union of the second and fourth quadrants in
the plane while the jump set is the union of the first and third quadrants. The reset map
R corresponds to a rotation of 90 degrees in the counterclockwise direction and the
flows move along circles in the clockwise direction at constant speed. Consequently,
items 2 and 3 of Assumption 2.1 hold. Finally, item 1 of Assumption 2.1 holds
with V (x) := x T x for all x ∈ R2 . This implies that the origin of (2.1) is stable.
The origin of (2.1) is not exponentially stable since each circle is forward invariant
under flows and jumps. The results of this chapter will show that a continuous-time
implementation of this reset control system transforms the origin from being stable
but not attractive to being exponentially stable. Simulations of the continuous-time
implementation of this system appear in Sect. 2.4.1. □
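The ingredients of this example can be confirmed numerically. The sketch below (Python/NumPy, an illustration and not part of the chapter) checks that A is skew-symmetric (so flows preserve V(x) = x T x), that R is orthogonal (so jumps preserve V), and that R T M R = −M (so x ∈ D implies Rx ∈ C, i.e., item 2 of Assumption 2.1):

```python
import numpy as np

# Hedged numerical check of the ingredients of Example 2.1 (data (2.4)):
# flows and jumps both preserve V(x) = x^T x, and R maps the jump set D
# into the flow set C.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
R = np.array([[0.0, -1.0], [1.0, 0.0]])
M = np.array([[0.0, 1.0], [1.0, 0.0]])

# <grad V(x), Ax> = 2 x^T A x = 0 for all x, since A is skew-symmetric.
skew = np.allclose(A + A.T, 0.0)

# V(Rx) = V(x) for all x, since R (a 90-degree rotation) is orthogonal.
orthogonal = np.allclose(R.T @ R, np.eye(2))

# (Rx)^T M (Rx) = x^T (R^T M R) x = -x^T M x, so x in D implies Rx in C.
maps_D_to_C = np.allclose(R.T @ M @ R, -M)

print(skew, orthogonal, maps_D_to_C)
```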

2.3 Continuous-Time Implementation and Main Result

To implement the system (2.1) in continuous time, while achieving exponential sta-
bility of the origin, consider the differential inclusion
 
ẋ ∈ Ax + γ · (SGN(x T M x) + 1)(Rx − x) =: F(x), (2.5)

where γ > 0 is sufficiently large. The set-valued mapping SGN : R ⇒ R is equal
to the sign of its argument except at zero, where it is equal to the interval [−1, 1].
The resulting set-valued mapping F is outer semicontinuous (that is, its graph is
closed) and locally bounded with convex values. If the state x comprises a plant state
x p ∈ Rn p and a compensator state xc ∈ Rn c , i.e., x := (x Tp , xcT )T ∈ Rn p +n c , and the
reset map does not change the plant state, i.e.,
Rx − x = [ 0_{n_p} ; ⋆ ]  ∀x ∈ Rn, (2.6)


then the solutions of the differential inclusion (2.5) satisfy


 
ẋ p = [ I 0 ] ẋ = [ I 0 ] Ax. (2.7)

In other words, no modification of the plant dynamics is needed to implement (2.5)
for control systems with plant states that do not reset. Other than this feature, the
solutions of (2.5) may not bear much resemblance to the solutions of (2.1). For an
elaboration on this point, see Sect. 2.4.1. However, the following result holds:
Theorem 2.1 Under Assumption 2.1, the origin of (2.5) is globally exponentially
stable for γ > 0 sufficiently large.
Proof It is straightforward to see that F(λx) = λF(x) for all λ > 0 and x ∈ Rn .
Hence global exponential stability of the origin is equivalent to asymptotic stability
of the origin; see [19, Theorem 11], for example. Asymptotic stability of the origin is
established now. For the origin of (2.5), consider the Lyapunov candidate V , which
can be seen to be positive definite and radially unbounded using (2.3) with x = 0
and y ∈ Rn arbitrary.
Step 1: Bounding ⟨∇V (x), f₂⟩, f₂ ∈ (SGN(x T M x) + 1)(Rx − x).
Combining (2.3) with y = Rx, (2.2b), and the definition of D in (2.1b), it follows
that

x T M x ≥ 0 =⇒ ⟨∇V (x), Rx − x⟩ ≤ −μ|Rx − x|₂². (2.8)

In turn, from the definition of SGN, it follows that

s ∈ SGN(x T M x) =⇒ ⟨∇V (x), (s + 1)(Rx − x)⟩ ≤ −(s + 1)μ|Rx − x|₂². (2.9)

Letting σ > 0 satisfy |M(Rx + x)|₂ ≤ σ|x|₂ for all x ∈ Rn and then using the
Cauchy–Schwarz inequality, item 2 in Assumption 2.1, and M = M T, it follows
that

x T M x ≥ 0, x ≠ 0 =⇒ |Rx − x|₂ ≥ −(Rx − x)T M(Rx + x) / (σ|x|₂)
                                 = (x T M x − x T R T M Rx) / (σ|x|₂)
                                 ≥ (x T M x) / (σ|x|₂). (2.10)

Combining (2.9) and (2.10) results in

x ≠ 0, s ∈ SGN(x T M x) =⇒
⟨∇V (x), (s + 1)(Rx − x)⟩ ≤ −2μ max{0, x T M x} · (x T M x)/(σ²|x|₂²). (2.11)

Step 2: Bounding ⟨∇V (x), Ax⟩.
Due to item 1 of Assumption 2.1, this quantity is not positive when x T M x ≤ ε|x|₂².
For x T M x ≥ ε|x|₂², using the homogeneity of degree two for V, and hence the
homogeneity of degree one for ∇V due to Euler's homogeneous function theorem,
it follows that there exists κ > 0 such that

x T M x ≥ ε|x|₂² > 0 =⇒ ⟨∇V (x), Ax⟩ ≤ κ|x|₂²
                                    ≤ (κ/ε) x T M x (2.12)
                                    ≤ (κσ²/ε²) · max{0, x T M x} · (x T M x)/(σ²|x|₂²).

Step 3: Combining the previous steps and analyzing solutions.


It follows from (2.2a) and (2.12) together with (2.11) that, for each ν > 0 there exists
γ > 0 sufficiently large, such that

x ≠ 0, s ∈ SGN(x T M x) =⇒
⟨∇V (x), Ax + γ · (s + 1)(Rx − x)⟩ ≤ −ν max{0, x T M x} · (x T M x)/|x|₂² ≤ 0. (2.13)

It follows that the origin of (2.5) is stable for γ > 0 sufficiently large and, by the
invariance principle for differential inclusions (see [23, Theorem 1], for example),
which applies due to the properties of F listed below (2.5), the origin is asymptotically
stable if and only if there is no solution x : R≥0 → Rn and c > 0 such that V (x(t)) =
c for all t ≥ 0. Being a solution of (2.5), x(·) satisfies, for almost all t,

ẋ(t) = Ax(t) + γ · (s(t) + 1)(Rx(t) − x(t)), (2.14a)
s(t) ∈ SGN(x(t)T M x(t)). (2.14b)

Assuming that t → V (x(t)) is a nonzero constant, by the chain rule, for almost all
t,

0 = ⟨∇V (x(t)), Ax(t) + γ · (s(t) + 1)(Rx(t) − x(t))⟩. (2.15)

According to (2.13), such a solution requires x T (t)M x(t) ≤ 0 for all t ≥ 0. In turn,
it follows from (2.2a) and (2.9) and the positivity of γ that, for almost all t,

0 = ⟨∇V (x(t)), Ax(t)⟩, (2.16a)
0 = ⟨∇V (x(t)), γ · (s(t) + 1)(Rx(t) − x(t))⟩. (2.16b)

Again with (2.9) and the positivity of γ and μ, it follows that, for almost all t,

(s(t) + 1)|Rx(t) − x(t)|2 = 0. (2.17)

It then follows from (2.14a) that x(·) is also a solution of (2.1a). In turn, it follows
from item 3 of Assumption 2.1 that x(·) does not keep V equal to a nonzero constant.
That is, V (x(t)) = c > 0 for all t ∈ R≥0 is impossible. □

Remark 2.1 It follows from the proof that, when condition (2.2a) is strengthened to
⟨∇V (x), Ax⟩ ≤ 0 for all x ∈ Rn, the global exponential stability result holds even
if γ > 0 is not large, V is not homogeneous, and V is just strictly convex. □

2.4 Examples and Simulations

In this section, the behavior of the differential inclusion in (2.5) is illustrated for
several examples. For simplicity, examples where the reset control system admits a
quadratic Lyapunov function are used.

2.4.1 Example 2.1 Revisited

Consider the matrices A, R, M of (2.4). As indicated previously in Example 2.1, the
origin of (2.1) is not exponentially stable, as each circle is invariant. On the other hand,
Assumption 2.1 holds with V (x) = x T x, so the result of Theorem 2.1 applies to the
system (2.5) for this A, R, M. In contrast to the subsequent examples, the simulations
for (2.5) for this example are especially “stiff” numerically for γ > 0 large, since
the second term in the differential inclusion causes sliding along the vertical axis
toward the origin. Indeed, at the boundary of the third and fourth quadrants, Ax
points directly to the left (into the jump set) while Rx − x points to the right (into
the flow set) and up. Similarly, at the boundary of the first and second quadrants, Ax
points directly to the right (into the jump set) while Rx − x points to the left (into the
flow set) and down. A simulation of the state trajectory for the system (2.5), for the
given A, R, M, using a fixed, small (0.0001) step size and γ = 100 from the initial
condition (1/√2, −1/√2)T is shown in Fig. 2.1. The resulting trajectory bears little
resemblance to the trajectories of the reset control system (2.1) for this A, R, M.
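The simulation just described can be reproduced with a simple forward-Euler scheme. The sketch below (Python/NumPy, an illustration rather than the chapter's own code) uses the single-valued selection s = sign(x T M x) of SGN, which induces the chattering/sliding behavior along the vertical axis and the resulting exponential decay:

```python
import numpy as np

# Hedged sketch: forward-Euler integration of the differential inclusion
# (2.5) for Example 2.1 revisited, with the selection s = sign(x^T M x).
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
R = np.array([[0.0, -1.0], [1.0, 0.0]])
M = np.array([[0.0, 1.0], [1.0, 0.0]])

gamma, dt, T = 100.0, 1e-4, 10.0
x = np.array([1.0, -1.0]) / np.sqrt(2.0)   # x0 = (1/sqrt(2), -1/sqrt(2))
x0_norm = np.linalg.norm(x)

for _ in range(int(T / dt)):
    s = np.sign(x @ M @ x)                 # a selection of SGN(x^T M x)
    x = x + dt * (A @ x + gamma * (s + 1.0) * (R @ x - x))

final_norm = np.linalg.norm(x)
print(final_norm < 0.1 * x0_norm)          # decay toward the origin
```

On the sliding segment along the x2-axis, the Filippov convex combination of the flow-side and jump-side vector fields gives ẋ2 = −x2, which is the exponential decay visible in Fig. 2.1.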

Fig. 2.1 The values of x1(t) and x2(t) as a function of time t for Example 2.1 revisited, implemented with the differential inclusion (2.5) using γ = 100 and x◦ = (1/√2, −1/√2)T

2.4.2 A Clegg Integrator Controlling a Single Integrator System

Consider the data


     
A := [ 0 1 ; −1 0 ],  R := [ 1 0 ; 0 0 ],  M := [ 0 1 ; 1 0 ], (2.18)

which corresponds to a Clegg integrator [5] controlling a single integrator plant
using negative feedback. The origin of ẋ = Ax is stable but not exponentially stable.
The origin of the reset control system (2.1) is globally exponentially stable with
convergence to the origin in finite time. Regarding Assumption 2.1, item 1 holds
with V (x) = x T x, item 2 holds since (Rx)T M Rx = 0, and item 3 holds since the
flows oscillate and hence always leave the flow set, unless starting at the origin. For
a simulation of the system (2.5) with A, R, M as in (2.18) from the initial condition
(1, 0)T using γ = 100, Fig. 2.2 shows the evolution of the state while Fig. 2.3 shows
the evolution of the Lyapunov function plotted on a log scale. The behavior of the
state is quite similar to the behavior of the state for the reset system, until just after
the first jump time. Indeed, the flow is identical until the first jump time of the reset
control system, which occurs at π/2 seconds. At that time, the reset system state
jumps to the origin exactly; the plant state x1 reaches zero at that time due to
the flows while the controller state x2 is reset to zero at that time. In the continuous-
time implementation, the state x2 is quickly driven close to zero by the extra term in

Fig. 2.2 The values of x1 (t) and x2 (t) as a function of time t for a Clegg integrator controlling a single integrator plant using negative feedback, implemented with the differential inclusion (2.5) using γ = 100 and x◦ = (1, 0)T

the differential inclusion. Subsequently, this pattern repeats itself continually but at
smaller and smaller scales.
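A forward-Euler sketch of this example (Python/NumPy, again with the selection s = sign(x T M x); an illustration, not the chapter's code) reproduces the repeated rotate-then-collapse pattern and the rapid decay of V(x) = x T x seen in Figs. 2.2 and 2.3:

```python
import numpy as np

# Hedged sketch: forward-Euler simulation of (2.5) for the Clegg integrator /
# single-integrator loop (2.18), monitoring V(x) = x^T x.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
R = np.array([[1.0, 0.0], [0.0, 0.0]])   # "resets" the controller state x2
M = np.array([[0.0, 1.0], [1.0, 0.0]])

gamma, dt, T = 100.0, 1e-4, 10.0
x = np.array([1.0, 0.0])
V0 = float(x @ x)

for _ in range(int(T / dt)):
    s = np.sign(x @ M @ x)               # a selection of SGN(x^T M x)
    x = x + dt * (A @ x + gamma * (s + 1.0) * (R @ x - x))

Vf = float(x @ x)
print(Vf)   # V decays by many orders of magnitude over 10 s
```

Each pass through the jump set shrinks the state norm by roughly a factor 1/(2γ) before the next quarter-turn of flow, which matches the staircase decay of log10 V in Fig. 2.3.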

2.4.3 A Bank of Clegg Integrators Controlling a Strictly Passive System

Let n p and n u be positive integers, let A p ∈ Rn p ×n p , B p ∈ Rn p ×n u and C p ∈ Rn u ×n p


be such that, for some P = P T > 0,

A Tp P + P A p < 0 , P B p = C Tp . (2.19)

Due to these conditions, the linear system (A p , B p , C p ) is state strictly passive; see
[15, Sect. 6.3] for example. Let n := n p + n u , and let A, R, M ∈ Rn×n be defined
as
     
A := [ A_p B_p ; −C_p 0 ],  R := [ I_{n_p} 0 ; 0 0 ],  M := [ 0 C_p^T ; C_p 0 ]. (2.20)

Using the Lyapunov function candidate



Fig. 2.3 The value of log10 (V (x(t))) as a function of time t for a Clegg integrator controlling a single integrator plant using negative feedback, implemented with the differential inclusion (2.5) using γ = 100 and x◦ = (1, 0)T

V (x) := x T diag(P, I )x, (2.21)

it follows that the origin of ẋ = Ax is stable. It follows from the invariance principle
[15, Sect. 4.2] that the origin of ẋ = Ax is exponentially stable if and only if the
null space of B p is the origin, i.e., B p has full column rank. The reset control system
(2.1) has purely discrete-time solutions that do not converge to the origin from any
nonzero point x◦ such that the last n u components of x◦ are zero. To reiterate, this
behavior is one of the primary motivations for pursuing the results in this chapter.
Regarding Assumption 2.1, item 1 holds with V defined in (2.21), item 2 holds since
(Rx)T M Rx = 0, and item 3 holds if and only if B p has full column rank. To simulate
an example, let n p = 10, n u = 3, and generate random matrices A p (Hurwitz) and B p
(with full column rank) of appropriate dimension using the MATLAB command “rss”
and then rounding to two decimal places to facilitate repeatability. The simulations
reported here use
Ap = [ −1.25  0.73   0     −0.27  −0.31  0.70   0.32  −0.04  0.34  0.35
        0.67 −1.14  −0.16   0.02   0.40 −0.60  −0.37  −0.48 −0.21 −0.34
        0.16 −0.33  −0.73   0.44   0.28  0.34  −0.34  −0.05 −0.12 −0.20
       −0.57  0.34   0.42  −1.27   0.52 −0.58   0.51  −0.39 −0.24  0.43
       −0.13  0.22   0.25   0.59  −3.38 −0.50  −0.27  −0.44  1.02  0.43
        0.22 −0.18   0.56  −1.03  −0.34 −2.24   0.17  −0.98 −0.03  0.23
       −0.02 −0.10  −0.09   0.03  −0.05 −0.09  −0.72   0.22  0.69 −0.91
        0.01 −0.53  −0.06  −0.37  −0.44 −1.02   0.16  −2.06 −0.38  0.49
        0.77 −0.60  −0.29   0.12   0.90 −0.11   0.40  −0.41 −1.59  0.48
       −0.25  0.18   0.14  −0.24   0.70  0.06  −0.70   0.56  0.74 −1.50 ],

Bp = [  0.02  0.52 −0.29
       −0.26 −0.02 −0.85
        0     0    −1.12
       −0.29  0     2.53
       −0.83  1.02  1.66
       −0.98 −0.13  0.31
       −1.16 −0.71 −1.26
       −0.53  1.35 −0.87
       −2.00 −0.22 −0.18
        0    −0.59  0.79 ].

It can be verified numerically that the matrix A p is Hurwitz and the matrix B p has
full column rank. Then generate P = P T > 0 and C p via

A Tp P + P A p = −I , C p := B pT P.
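The construction of P and C_p can also be reproduced outside MATLAB. The sketch below (Python/NumPy, using a small illustrative Hurwitz A_p and B_p rather than the 10 × 10 data above) solves the Lyapunov equation A_p^T P + P A_p = −I via a Kronecker-product linear solve:

```python
import numpy as np

# Hedged sketch: build P = P^T > 0 and C_p as in the text, by solving the
# Lyapunov equation A_p^T P + P A_p = -I with a Kronecker-product linear
# solve (NumPy only). A_p is a small, illustrative Hurwitz matrix, not the
# 10 x 10 matrix from the chapter.
Ap = np.array([[-1.0, 0.5],
               [0.0, -2.0]])
Bp = np.array([[1.0],
               [0.5]])
n = Ap.shape[0]

# vec(A^T P + P A) = (kron(A^T, I) + kron(I, A^T)) vec(P)
L = np.kron(Ap.T, np.eye(n)) + np.kron(np.eye(n), Ap.T)
P = np.linalg.solve(L, (-np.eye(n)).flatten()).reshape(n, n)
P = 0.5 * (P + P.T)                    # symmetrize against round-off
Cp = Bp.T @ P                          # so that P B_p = C_p^T

residual = Ap.T @ P + P @ Ap + np.eye(n)
print(np.allclose(residual, 0.0), np.all(np.linalg.eigvalsh(P) > 0))
```

Since A_p is Hurwitz, the Kronecker operator is invertible and the unique solution P is symmetric positive definite, exactly the pair (P, C_p) required by (2.19).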

Finally, pick an initial condition x◦ ∈ Rn p +n c using the MATLAB command "randn"
and then rounding to two decimal places. The simulations reported here use

x◦ := [ −0.65 1.19 −1.61 −0.02 −1.95 1.02 0.86 0 −0.07 −2.49 0.58 −2.19 −2.32 ]^T.

Figure 2.4 shows the evolution of t → V (x(t)), on a log scale, for the linear system
ẋ = Ax (dashed curve), which is the same as the differential inclusion (2.5) with
γ = 0, and for the differential inclusion (2.5) with γ = 100 (solid curve). The speed
of convergence in the latter case compared to the former case is a potential advantage
of using a (continuous-time implementation of a) reset control system.

Fig. 2.4 The values of log10 (V (x(t))) as a function of time t for a bank of Clegg integrators controlling a strictly passive system, implemented with the differential inclusion (2.5) using γ = 100 (solid curve) and γ = 0 (dashed curve)

2.4.4 A Bank of Stable FOREs Controlling a Detectable Passive System

Let n p and n u be positive integers, let A p ∈ Rn p ×n p , B p ∈ Rn p ×n u and C p ∈ Rn u ×n p


be such that (C p , A p ) is detectable and, for some P = P T > 0,

A Tp P + P A p ≤ 0 , P B p = C Tp . (2.22)

Due to these conditions, the linear system (A p , B p , C p ) is passive; see [15, Sect. 6.3]
for example. Let n := n p + n u , let σ > 0, and let A, R, M ∈ Rn×n be defined as
     
A := [ A_p B_p ; −C_p −σI ],  R := [ I_{n_p} 0 ; 0 0 ],  M := [ 0 C_p^T ; C_p 0 ]. (2.23)

Using the Lyapunov function candidate

V (x) := x T diag(P, I )x, (2.24)

it follows that the origin of ẋ = Ax is stable. It follows from the invariance principle
[15, Sect. 4.2] that the origin of ẋ = Ax is exponentially stable if and only if (C p , A p )
is detectable. The reset control system (2.1) has purely discrete-time solutions that do
not converge to the origin from any nonzero point x◦ such that the last n u components

of x◦ are zero. Regarding Assumption 2.1, item 1 holds with V defined in (2.24),
item 2 holds since (Rx)T M Rx = 0, and item 3 holds since (C p , A p ) is detectable.
Consider an example that is related to a particular convex optimization approach
using acceleration methods [1, 17]. Take n p = 12, n u = 12, A p = 0, B p = I , and
C p a random, symmetric, positive definite matrix with entries rounded to one decimal
place to facilitate repeatability. The simulations reported here use
C p := [  4.8 −3.6 −4.8 −2.4  2.5 −1.3  0.5 −4.0 −0.4  0.2  3.0 −3.2
         −3.6 10.0  4.4  7.9  0.3 −4.8 −3.4  1.0 −0.9  0.7 −0.9  6.8
         −4.8  4.4 11.3  2.7 −1.8 −1.2 −1.7  4.1  2.0 −0.5 −2.2  4.5
         −2.4  7.9  2.7 18.0  1.4 −6.5 −1.2 −2.4 −4.3  1.4 −3.7 10.0
          2.5  0.3 −1.8  1.4 10.2 −0.7 −4.6 −0.7 −3.8  5.6  2.3  1.3
         −1.3 −4.8 −1.2 −6.5 −0.7  7.3  0.7  2.4  0.1 −0.5 −1.1 −2.1
          0.5 −3.4 −1.7 −1.2 −4.6  0.7 13.3 −0.8  1.0 −2.2  0.8  0.4
         −4.0  1.0  4.1 −2.4 −0.7  2.4 −0.8  9.6  3.6  1.8 −2.2 −2.1
         −0.4 −0.9  2.0 −4.3 −3.8  0.1  1.0  3.6  8.9 −3.8 −2.3 −5.0
          0.2  0.7 −0.5  1.4  5.6 −0.5 −2.2  1.8 −3.8  8.2 −0.7 −2.0
          3.0 −0.9 −2.2 −3.7  2.3 −1.1  0.8 −2.2 −2.3 −0.7  8.2  1.7
         −3.2  6.8  4.5 10.0  1.3 −2.1  0.4 −2.1 −5.0 −2.0  1.7 14.0 ],

whose eigenvalues range from about 0.04 to about 38.12, for a condition number
close to 1000. Pick an initial condition randomly with entries rounded to one decimal
place. The simulations reported here use

x◦ := [ −0.8 1.5 0 1.6 −0.4 0.6 −0.1 −2.0 −1.0 0.6 −0.1 −1.1
        −0.6 0.2 −1.0 1.0 −0.6 1.8 −1.1 0.2 −1.5 −0.7 −0.6 0.4 ]^T.

Use σ = 0.1. Figure 2.5 shows the evolution of t → V (x(t)), on a log scale, for the
linear system ẋ = Ax (dashed curve), which is the same as the differential inclusion
(2.5) with γ = 0, and for the differential inclusion (2.5) with γ = 100 (solid curve).

2.5 Conclusion

Under mild assumptions, including strong convexity of a Lyapunov function, it is
possible to implement a reset control system, whose origin may be stable but not
exponentially stable, using a differential inclusion whose origin is globally expo-
nentially stable. The behavior of the proposed inclusion has been demonstrated in
several different settings, including situations that correspond to reset control using a
Clegg integrator or a more general first-order reset element (FORE). This differential
inclusion implementation has the potential to make “reset” control systems easier to
construct, certify, and employ than their hybrid systems counterparts.

Fig. 2.5 The values of log10 (V (x(t))) as a function of time t for a bank of stable FOREs controlling a detectable, passive system, implemented with the differential inclusion (2.5) using γ = 100 (solid curve) and γ = 0 (dashed curve)

Acknowledgements Research supported in part by the Air Force Office of Scientific Research
under grant AFOSR FA9550-18-1-0246.

References

1. Baradaran, M., Le, J.H., Teel, A.R.: Analyzing the persistent asset switches in continuous hybrid
optimization algorithms. In: Submitted to the 2021 American Control Conference (2020)
2. Beker, O., Hollot, C.V., Chait, Y., Han, H.: Fundamental properties of reset control systems.
Automatica 40(6), 905–915 (2004)
3. Brogliato, B., Daniilidis, A., Lemaréchal, C., Acary, V.: On the equivalence between com-
plementarity systems, projected systems and differential inclusions. Syst. Control Lett. 55(1),
45–51 (2006)
4. Chen, Q., Hollot, C.V., Chait, Y.: Stability and asymptotic performance analysis of a class of
reset control systems. In: Proceedings of the 39th IEEE Conference on Decision and Control,
pp. 251–256 (2000)
5. Clegg, J.C.: A nonlinear integrator for servomechanisms. Trans. A.I.E.E. 77(Part II), 41–42
(1958)
6. Goebel, R., Hespanha, J., Teel, A.R., Cai, C., Sanfelice, R.: Hybrid systems: generalized solu-
tions and robust stability. In: IFAC Symposium on Nonlinear Control Systems, Stuttgart, Ger-
many, pp. –12 (2004)
7. Goebel, R., Sanfelice, R.G., Teel, A.R.: Hybrid dynamical systems. IEEE Control Syst. Mag.
29(2), 28–93 (2009). April
8. Goebel, R., Sanfelice, R.G., Teel, A.R.: Hybrid Dynamical Systems: Modeling, Stability, and
Robustness. Princeton University Press, Princeton (2012)

9. Goebel, R., Teel, A.R.: Solutions to hybrid inclusions via set and graphical convergence with
stability theory applications. Automatica 42, 573–587 (2006)
10. Hollot, C.V., Zheng, Y., Chait, Y.: Stability analysis for control systems with reset integrators.
In: Proceedings of the 36th IEEE Conference on Decision and Control, vol. 2, pp. 1717–1719
(1997)
11. Hollot, C.V.: Revisiting Clegg integrators: periodicity, stability and IQCs. IFAC Proc. 30, 31–38
(1997)
12. Horowitz, I.M., Rosenbaum, P.: Non-linear design for cost of feedback reduction in systems
with large parameter uncertainty. Int. J. Control 21(6), 977–1001 (1975)
13. Hu, H., Zheng, Y., Chait, Y., Hollot, C.V.: On the zero-input stability of control systems with
Clegg integrators. In: Proceedings of the 1997 American Control Conference, vol. 1, pp. 408–
410 (1997)
14. Hu, H., Zheng, Y., Hollot, C.V., Chait, Y.: On the stability of control systems having Clegg
integrators. Topics in Control and its Applications. Springer, London (1999)
15. Khalil, H.K.: Nonlinear Systems, 3rd edn. Prentice-Hall (2002)
16. Krishnan, K.R., Horowitz, I.M.: Synthesis of a non-linear feedback system with significant
plant-ignorance for prescribed system tolerances. Int. J. Control 19(4), 689–706 (1974)
17. Le, J.H., Teel, A.R.: Hybrid heavy-ball systems: reset methods for optimization with uncer-
tainty. In: Submitted to the 2021 American Control Conference (2020)
18. Molchanov, A.P., Pyatnitskiy, Ye.S.: Criteria of asymptotic stability of differential and differ-
ence inclusions encountered in control theory. Syst. Control Lett. 13, 59–64 (1989)
19. Nakamura, H., Yamashita, Y., Nishitani, H.: Smooth Lyapunov functions for homogeneous
differential inclusions. In: Proceedings of the 41st SICE Annual Conference, vol. 3, pp. 1974–
1979 (2002)
20. Nesic, D., Teel, A.R., Zaccarian, L.: Stability and performance of SISO control systems with
first-order reset elements. IEEE Trans. Autom. Control 56(11), 2567–2582 (2011)
21. Nesic, D., Zaccarian, L., Teel, A.R.: Stability properties of reset systems. In: Proceedings of
16th IFAC World Congress, vol. 38, pp. 67–72 (2005)
22. Nesic, D., Zaccarian, L., Teel, A.R.: Stability properties of reset systems. Automatica 44(8),
2019–2026 (2008)
23. Ryan, E.P.: A universal adaptive stabilizer for a class of nonlinear systems. Syst. Control Lett.
16(3), 209–218 (1991)
24. Teel, A.R., Poveda, J.I., Le, J.: First-order optimization algorithms with resets and Hamiltonian
flows. In: 58th IEEE Conference on Decision and Control, pp. 5838–5843 (2019)
Chapter 3
On the Role of Well-Posedness in
Homotopy Methods for the Stability
Analysis of Nonlinear Feedback Systems

Randy A. Freeman

Abstract We consider the problem of determining the input/output stability of the


feedback interconnection of two systems. Dissipativity and graph separation tech-
niques are two related and popular approaches to this problem, and they include well-
known passivity and small-gain methods. The use of block diagram transformations
with dynamic multipliers can greatly reduce the conservativeness of such approaches,
but for the stability of the transformed system to imply that of the original one, these
multipliers should admit appropriate factorizations. An alternative approach which
circumvents the need to factorize multipliers was provided by Megretski and Rantzer
in their seminal 1997 paper on integral quadratic constraints. Their approach is based
on homotopy: one constructs a continuous transformation of a trivially stable sys-
tem into the target system of interest, and by satisfying certain conditions along the
homotopy path one guarantees that the target system is also stable. This method
assumes that the feedback interconnection is well-posed along the homotopy path,
namely, that the feedback equations have solutions for all possible exogenous inputs
and that the mapping from these inputs to the solutions is causal. In this chapter we
will explore the role of well-posedness in this homotopy method. In so doing we
demonstrate that what suffices for the homotopy analysis is a property significantly
weaker than well-posedness, one which involves a certain lower hemicontinuity of
the feedback interconnection along with a certain controllability of its domain. More-
over, we show that these methods can be applied to general signal spaces, including
extended Sobolev spaces, spaces of smooth functions, and spaces of distributions.

R. A. Freeman (B)
Department of Electrical and Computer Engineering, Northwestern University,
2145 Sheridan Rd., Evanston, IL, USA
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 43
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5_3
44 R. A. Freeman

3.1 Introduction

We consider the feedback interconnection of two systems G and Δ as illustrated in


Fig. 3.1. We wish to determine the stability of the feedback loop, in the sense that the
outputs (y1, y2) are bounded by the inputs (u1, u2) in some appropriate sense. Here
the outputs represent the internal (or endogenous) signals and the inputs represent
the external (or exogenous) signals. In many cases G represents a known system and
Δ represents uncertainty, or G represents the linear part of the system and Δ the
nonlinear part. For example, in the classical absolute stability problem of Lur'e and
Postnikov [19], G is a known linear and time-invariant system and Δ is a nonlinear
system known only to satisfy certain sector conditions.
This stability problem has a long history; for example, the 2006 survey paper
[17] on absolute stability contains over 470 references. Classical approaches to this
problem include the circle criterion, the Popov criterion, the small-gain theorem,
and the passivity theorem (details of which can be found in textbooks like [7, 15],
among others). The classical approaches are made less conservative through the use
of multipliers, or linear systems M inserted into the feedback loop together with their
inverses so that the interconnection in Fig. 3.1 becomes the one in Fig. 3.2. If we can
find a class of multipliers that preserve, say, the passivity of the lower path, then we
can perform a search over this class to find one that allows the transformed system
to satisfy the conditions of an appropriate stability theorem.
O’Shea was apparently the first to propose the use of noncausal multipliers to
further reduce the conservativeness of the stability tests [22]. Indeed, because the
transformed system of Fig. 3.2 is artificial and exists solely for the purpose of stability
analysis, there is no reason to impose physical restrictions on it (such as causality).
Nonetheless, for this approach to work, the stability of the artificial, transformed
system in Fig. 3.2 should imply the stability of the original system in Fig. 3.1. One
can guarantee that this is true when the multipliers admit certain factorizations [7].

Fig. 3.1 The classical feedback interconnection of systems G and Δ

Fig. 3.2 The classical feedback interconnection with multiplier M
3 On the Role of Well-Posedness in Homotopy Methods … 45

As an alternative to the factorization approach, Megretski and Rantzer pioneered a


homotopy approach in [21, 23]. Their homotopy approach does not employ multipli-
ers as they appear in Fig. 3.2; instead, the multipliers are embedded inside of integral
quadratic constraints (IQCs) on the component systems G and Δ. Such IQCs cannot
guarantee stability directly, however, because the feedback loop can be unstable even
when the IQCs are satisfied. Instead, Megretski and Rantzer begin with a system that
they know to be stable and continuously deform it into the target system of Fig. 3.1.
Their main result is that if the IQCs are satisfied along the entire homotopy path,
then the target system is also stable. An advantage of the homotopy approach is that
it circumvents the requirement that the multipliers admit factorizations. This does
come at a price: paraphrasing their words in [21],
The price paid for [circumventing the factorization requirement] is the very mild assumption
that the feedback loop is well-posed [along the entire homotopy path].

What is this “very mild” well-posedness assumption? As defined in [21, 23], the
feedback loop in Fig. 3.1 is well-posed when it satisfies the following two conditions:
first, there should exist outputs (y1, y2) for all possible choices of the inputs (u1, u2),
and second, the resulting mapping from the inputs to the outputs should be causal.
Note that this definition of well-posedness is not as strong as other definitions in the
literature, such as the one in [28], but it suffices for their homotopy argument.
In this chapter we investigate the role of this well-posedness assumption in the
homotopy method of [21, 23]. In particular, we make the following contributions:
1. We show that the homotopy approach to stability analysis works in settings much
more general than the classical setting of extended L2 spaces considered in [21,
23]. This includes settings in which signals belong to extended Sobolev spaces or
spaces of distributions. In fact, any locally convex Hausdorff topological vector
space can serve as the space of signals.
2. We relax the requirement that outputs exist for all possible inputs. Instead, we
require only that the domain of the feedback interconnection have a certain con-
trollability property.
3. We relax the requirement that the mapping from the inputs to the outputs in
Fig. 3.1 is causal. Instead, we require only that this mapping have a certain lower
hemicontinuity property together with an assumed limit on signal growth.
4. We extend the homotopy method to certain interconnections more general than
the classical input-additive interconnection of Fig. 3.1.
One might make the valid point that ill-posed feedback interconnections are poor
models of physical systems, so there is no reason to relax well-posedness assump-
tions. Keep in mind, however, that all systems along the homotopy path except the
target system are artificial and thus need not be physically meaningful. Moreover, in
some applications signals are constrained (e.g., they must take on positive values),
so requiring outputs to exist for all possible inputs might be unduly restrictive.
The results of this chapter apply mainly to systems satisfying so-called soft or
conditional IQCs [20, 21]. This is because when systems satisfy so-called hard or
complete IQCs instead, then we can often establish stability using classical dissipa-
tivity or graph separation techniques and thus homotopy is not needed (see [4] and
the references therein).
The chapter is organized as follows. In Sect. 3.2 we define a general notion of a
signal space which constitutes the setting for our results. This notion goes beyond the
classical extended L p spaces and instead emulates the treatment in [8] which does
not rely on the existence of truncation operators. In Sect. 3.3 we define notions of
controllability and causality adapted from versions in [29]; in particular, we will see
that the perspective on controllability in [29] (which makes sense even for systems
without input or state) is the natural one for our setting. In Sect. 3.4 we present our
notions of stability, including a “look-ahead” version of classical finite-gain stability
that is well suited to noncausal systems. We also present our main homotopy results
in this section. Finally, in Sect. 3.5 we apply our results to the stability analysis
of interconnections, including the one in Fig. 3.1. All proofs can be found in the
appendix.
Notation We let N denote the set of natural numbers including zero. We let id
denote the identity relation on a set, seen either as a map from the set to itself or
as the graph of this map. If A is a set, we let 1 A denote the indicator function of A
(having its domain clear from context), and we let P(A) denote the power set of A.
We let F denote either of the fields R or C, depending on whether we wish to work
with real or complex signal spaces.

3.2 Signal Spaces

A classical choice for a signal space in continuous time is the extended L^p space
for p ∈ [1, ∞], i.e., the vector space L^p_loc of all locally p-integrable functions on the
time axis T = ℝ≥0 taking values in F [34]. Each time t ∈ T defines a seminorm
‖·‖t on L^p_loc given by the truncation ‖x‖t = ‖x · 1_[0,t]‖p, where ‖·‖p denotes the
usual p-norm for F-valued functions on T. We let S = {‖·‖t}t∈T denote this family
of seminorms, and we note that S defines a locally convex topology on L^p_loc, namely,
the coarsest topology under which each seminorm in S is continuous (thus turning
L^p_loc into a Fréchet space). Furthermore, we can use these seminorms to recover L^p
itself:

L^p = { x ∈ L^p_loc : sup_{t∈T} ‖x‖t < ∞ }.   (3.1)

In summary, we see that the family S of seminorms on L^p_loc provides us with three
essential elements of a signal space: a time axis (the index set T of the family S), a
small-signal subspace via the construction (3.1), and a locally convex topology.
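As a quick numerical illustration of these truncation seminorms and the recovery of L^p in (3.1) (a sketch only; the function and signal names are ours, and the integrals are approximated by Riemann sums on a uniform grid):

```python
import numpy as np

def truncation_seminorm(x, dt, t, p):
    """Approximate ||x||_t = ||x * 1_[0,t]||_p for a sampled signal x
    on a uniform grid with step dt (Riemann-sum approximation)."""
    n = int(round(t / dt))
    head = np.abs(x[:n])
    if p == np.inf:
        return head.max() if n > 0 else 0.0
    return (np.sum(head ** p) * dt) ** (1.0 / p)

dt = 1e-3
grid = np.arange(0.0, 50.0, dt)
decaying = np.exp(-grid)        # lies in L^2: sup_t ||x||_t = sqrt(1/2)
growing = np.exp(0.1 * grid)    # lies in L^2_loc but not in L^2

# Every truncation seminorm is finite for both signals (membership in
# L^2_loc), but only the decaying signal satisfies the uniform bound (3.1).
assert truncation_seminorm(decaying, dt, 50.0, 2) < 0.75
assert truncation_seminorm(growing, dt, 50.0, 2) > 100.0
```

Here the growing exponential has every seminorm finite yet no uniform bound, so it belongs to the extended space but not to the small-signal subspace.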
Extending this approach, we proceed to define a signal space as a vector space
together with a family of seminorms, where the index set of the family has the
structure of a "time axis." Let X be a vector space over F, and let S = {‖·‖t}t∈T be a
family of seminorms on X over an index set T. Recall that the family S is separated
when for every nonzero x ∈ X there exists t ∈ T such that ‖x‖t > 0. Every family
S induces a natural preorder ≼ on its index set T as follows: given s, t ∈ T, we say
s ≼ t when there exists C > 0 such that ‖·‖s ≤ C‖·‖t. The resulting equivalence
relation ∼ on T (defined as s ∼ t when both s ≼ t and t ≼ s) corresponds to the
usual equivalence relation for seminorms. Note that ≼ defines a partial order on
T precisely when s ∼ t implies s = t. Next, recall that ≼ directs T when for any
s, t ∈ T there exists r ∈ T such that s ≼ r and t ≼ r. We say that the family S is
temporal when it is separated and ≼ is a partial order on T that directs T.
space is a vector space X together with an index set T (called the time axis) and
a temporal family S of seminorms on X indexed over T . We will refer to such a
space as (X, T , S), or simply as X when T and S are either clear from context or
not explicitly named. Elements of X are signals.
The time axis of a signal space carries a natural notion of order (the partial order
≼) and direction (any two time instants have a common future time instant). The
requirement that ≼ be a partial order on T ensures that the time axis T cannot circle
back on itself. Note that in this setting there is no concept of the value of a signal
at a particular time, only of its size. Also, we make no assumption that the mapping
t → ‖x‖t is monotone in t for any fixed x ∈ X, a departure from many classical
definitions of signal spaces [7, 12, 28, 34].
The family of seminorms S for a signal space (X, T , S) provides a natural topol-
ogy for the space, namely, the coarsest topology under which each seminorm in
S is continuous. We will call this the seminorm topology for X, and the require-
ment that the family S be separated ensures that this topology is Hausdorff. Thus by
construction, any signal space is a locally convex Hausdorff space in its seminorm
topology. The next lemma shows that the converse holds, namely, that every locally
convex Hausdorff space X admits a temporal family of seminorms that generates its
topology, turning X into a signal space.
Lemma 3.1 Let X be a locally convex Hausdorff space. Then there exists a temporal
family S of seminorms on X indexed over a set T such that the resulting seminorm
topology on the signal space (X, T , S) coincides with the given topology.
Thus we can regard all locally convex Hausdorff spaces as signal spaces, including
dual spaces like spaces of distributions. As we shall see, however, a signal space is
more than a locally convex space—the particular choice of seminorms defining the
topology plays a crucial role in the theory. In Sect. 3.2.1 we provide some examples
of signal spaces with specific choices for their families of seminorms.
Unless otherwise specified, all topological notions (such as open sets) for a signal
space (X, T, S) will be with respect to the seminorm topology. Because ≼ directs
T, a local base for this topology at any x̄ ∈ X is the collection of all balls of the form

Bt,ε(x̄) = { x ∈ X : ‖x − x̄‖t < ε }   (3.2)

for t ∈ T and ε > 0. We will also need balls having zero radius, i.e., balls of the
form

Bt(x̄) = { x ∈ X : ‖x − x̄‖t = 0 } = ⋂_{ε>0} Bt,ε(x̄)   (3.3)

for t ∈ T and x̄ ∈ X. Abusing notation slightly, we will occasionally refer to these


zero-radius balls as Bt,0 , keeping in mind that Bt,0 means Bt in (3.3) and does not
mean plugging ε = 0 into (3.2) (which would result in empty balls). The collection
of zero-radius balls Bt defines another topology on X:
Lemma 3.2 The collection of balls in (3.3) for all x̄ ∈ X and t ∈ T is a base for a
topology on X. Moreover, for each x̄ ∈ X the collection of balls (3.3) for all t ∈ T
is a local base at x̄.
The topology given in Lemma 3.2 is clearly finer than the seminorm topology, and
hence we call it the fine topology. Vector addition is continuous in the fine topology
because Bt(x̄1) + Bt(x̄2) = Bt(x̄1 + x̄2). In particular, if U ⊆ X is finely open then
x + U is finely open for any x ∈ X. However, the mapping (c, x) → cx from F × X
to X need not be jointly continuous in the fine topology, and so X with the fine
topology is generally not a topological vector space. Nevertheless, for any fixed scalar
c ∈ F the mapping x → cx is continuous in the fine topology because cBt(x̄) ⊆
Bt(cx̄) (with equality when c ≠ 0). In particular, if U ⊆ X is finely open and c ∈ F
is nonzero, then cU is finely open.
Recall that in (3.1) we recovered L^p from its extension L^p_loc by selecting "small"
signals, namely, those signals whose seminorms are uniformly bounded. This concept
generalizes in a natural way for a signal space (X, T, S). Adopting terminology from
[12], we define the small-signal subspace Xs as

Xs = { x ∈ X : sup_{t∈T} ‖x‖t < ∞ },   (3.4)

and we call elements of Xs small signals. Here the superscript s stands for "small."
We equip the small-signal subspace Xs with a norm ‖·‖s given by

‖x‖s = sup_{t∈T} ‖x‖t   (3.5)

for x ∈ Xs. In general, the associated norm topology on Xs is finer than the seminorm
topology that Xs inherits as a subset of X. Also, we emphasize that the small-signal
subspace is a property of the particular choice of the family S of seminorms defining
the signal space, so that two different signal spaces sharing the same underlying
vector space X can have the same seminorm topology but different small-signal
subspaces. Finally, we note that Xs is generally not a finely closed subspace of X
(and hence not closed either), and in many cases it is a finely dense subspace of X
(see Sect. 3.3.1).

3.2.1 Examples of Signal Spaces

According to Lemma 3.1, we can turn any locally convex Hausdorff space into a
signal space. In this section we provide some examples with specific choices for the
temporal families of seminorms.
Example 3.1 (Normed spaces) Let X be a normed space, and suppose S consists
solely of its norm · so that T is a singleton. Then S is a temporal family and thus
(X, T , S) is a signal space. Moreover, all signals are small signals, that is, Xs = X.
The following example is similar to the treatment in [8].
Example 3.2 (Extended spaces) Let X be a vector space, let {Xt }t∈T be a collection
of normed spaces, and let {Rt }t∈T be a collection of linear operators Rt : X → Xt
such that

(a) ⋂_{t∈T} ker(Rt) = {0},
(b) ker(Rs ) = ker(Rt ) implies s = t, and
(c) for all s, t ∈ T there exist r ∈ T and bounded linear operators Bs : Xr → Xs
and Bt : Xr → Xt such that both Rs = Bs ◦ Rr and Rt = Bt ◦ Rr .
Let S = {‖·‖t}t∈T be the family of seminorms on X given by ‖x‖t = ‖Rt x‖. It is
straightforward to show that (a) implies that S is separated, (b) implies that ≼ is a
partial order on T, and (c) implies that ≼ directs T. We conclude that S is a temporal
family and thus (X, T , S) is a signal space.
We now use Example 3.2 to define extended L p spaces of functions on a general
measure space (Example 3.3) and extended Sobolev spaces of functions on an open
subset of Rn (Example 3.4). We say that a collection C of distinct nonempty subsets
of a set S is a directed cover of another set A ⊆ S when C covers A and is directed
by inclusion (meaning that the union of any two members of C is contained in some
member of C).
Example 3.3 (Extended L p spaces) Let μ be a measure on a σ -algebra of subsets of
a nonempty set E. Let C = {At }t∈T be a collection of measurable subsets of E that
is a directed cover of E, that has a countable subcover, and is such that the symmetric
difference of two distinct members of C has positive measure. Let p ∈ [1, ∞], let V
be a Banach space over F , and let X be the vector space of all Bochner measurable
functions x : E → V such that the truncated signal x · 1_{At} has a finite p-norm for
every t ∈ T (and as usual we make no distinction between two functions that agree
almost everywhere). Let S = {‖·‖t}t∈T be the family of seminorms on X given by
‖x‖t = ‖x · 1_{At}‖p. Then S is a temporal family (Lemma 3.20 in the appendix) and
thus (X, T, S) is a signal space. The small-signal subspace is Xs = L^p(E).
The countable subcover condition in Example 3.3 guarantees that S is separated.
Without this condition, S need not be separated as demonstrated by Example 3.12
in the appendix. Example 3.3 includes as a special case the extended space L^p_loc we
described at the beginning of this section: we take μ to be the Lebesgue measure on
E = T = ℝ≥0 and C to be the collection of all real intervals of the form At = [0, t]
for t ∈ T . For an analogous construction in discrete time, we take μ to be the counting
measure on E = T = N and C to be the collection of all integer intervals of the form
At = [0, t] for t ∈ T .
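In the discrete-time special case just described, the seminorms are simply finite ℓ^p sums over the sets A_t = {0, …, t}, which makes the construction easy to check numerically (a sketch; the helper name is ours):

```python
import numpy as np

def lp_seminorm(x, t, p):
    """||x||_t = ||x * 1_{A_t}||_p with A_t = {0, ..., t} under the
    counting measure on E = T = N."""
    head = np.abs(np.asarray(x[:t + 1], dtype=float))
    return head.max() if p == np.inf else (head ** p).sum() ** (1.0 / p)

x = [2.0 ** (-k) for k in range(200)]     # geometric sequence, lies in l^1

# The seminorms increase toward sup_t ||x||_t = 2, so x is a small signal
# with small-signal norm ||x||_s = 2 (up to rounding).
assert lp_seminorm(x, 0, 1) == 1.0
assert abs(lp_seminorm(x, 199, 1) - 2.0) < 1e-12
assert lp_seminorm(x, 199, np.inf) == 1.0
```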
Example 3.4 (Extended Sobolev spaces) Let Ω be a nonempty open subset of Rn,
let k ∈ N, and let p ∈ [1, ∞]. Let C = {At}t∈T be a collection of open subsets of Ω
that is a directed cover of Ω. Let X be the vector space of all functions x : Ω → R
such that for every t ∈ T the restriction x|At belongs to the Sobolev space W^{k,p}(At).
Let S = {‖·‖t}t∈T be the family of seminorms on X given by ‖x‖t = ‖x|At‖_{W^{k,p}(At)}.
This is a special case of Example 3.2: for each t ∈ T we let Xt = W^{k,p}(At), and we
define Rt : X → Xt to be the restriction Rt x = x|At. It is straightforward to show
that properties (a)–(c) hold, and we conclude that (X, T, S) is a signal space. The
small-signal subspace is Xs = W^{k,p}(Ω).
Example 3.5 (Smooth spaces) Let Ω be a nonempty open subset of Rn, let X be the
vector space C∞(Ω), let C be a directed cover of Ω whose members are all compact,
let T = N × C, and let S = {‖·‖t}t∈T be the family of seminorms on X given by

‖x‖(N,K) = max_{|α|≤N} max_{τ∈K} |∂^α x(τ)|   (3.6)

for each (N, K) ∈ T, where α ∈ Nn is a multi-index and ∂^α denotes the corresponding
mixed partial derivative operator. Then S is a temporal family (Lemma 3.21 in
the appendix) and thus (X, T, S) is a signal space. Note that the time axis T is not
linearly ordered in this example. Also, small signals must be real analytic, and when
n = 1 they include all sinusoids having radian frequency strictly less than one.
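For n = 1 this claim about sinusoids can be checked directly, since the k-th derivative of sin(ωτ) is ω^k sin(ωτ + kπ/2), so the seminorms (3.6) are bounded by powers of ω (a numerical sketch; the function name and the particular compact set are ours):

```python
import numpy as np

def smooth_seminorm_sin(omega, N, K):
    """Seminorm (3.6) for x(tau) = sin(omega*tau): the maximum of
    |d^k/dtau^k x| over orders k = 0..N and points tau in the compact set K."""
    K = np.asarray(K, dtype=float)
    return max(np.max(np.abs(omega ** k * np.sin(omega * K + k * np.pi / 2)))
               for k in range(N + 1))

K = np.linspace(0.0, 10.0, 2001)   # dense sample of the compact set [0, 10]

# omega < 1: derivatives shrink like omega^k, so every seminorm is <= 1
# and sin(omega*tau) is a small signal.
assert smooth_seminorm_sin(0.9, 60, K) <= 1.0 + 1e-9
# omega > 1: the seminorms blow up as N grows, so the signal is not small.
assert smooth_seminorm_sin(1.5, 60, K) > 1e6
```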
Example 3.6 (Weighted spaces) Let (X, T, S) be a signal space, and let w : T →
(0, ∞) be a positive weight function. Then for each t ∈ T the seminorm w(t)‖·‖t
is equivalent to the seminorm ‖·‖t, and we conclude that (X, T, {w(t)‖·‖t}t∈T) is
another signal space. This new space is a weighted version of the original space.
Note that we make no assumption on the monotonicity of the weight function w. If
we take the extended L^p space from Example 3.3 and choose the weight function
w(t) = μ(At)^{−1/p} whenever μ(At) ≠ 0, then the weighted space is a "power signal"
space in which the seminorm w(t)‖·‖t measures the "power" or "average L^p energy"
of the signal at time t. In this case L∞(E) ⊂ Xs, that is, all bounded signals are small.
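A short numerical sketch of this "power signal" construction (our helper names; the integrals are again Riemann sums): a bounded sinusoid has truncation seminorms that grow without bound, yet its power seminorms stay below one, so it is small in the weighted space.

```python
import numpy as np

dt = 1e-3

def trunc(x, t, p=2):
    """Plain truncation seminorm ||x||_t on [0, t]."""
    n = int(round(t / dt))
    return (np.sum(np.abs(x[:n]) ** p) * dt) ** (1.0 / p)

def power(x, t, p=2):
    """Weighted seminorm w(t)*||x||_t with w(t) = mu([0,t])^(-1/p) = t^(-1/p)."""
    return trunc(x, t, p) / t ** (1.0 / p)

grid = np.arange(0.0, 200.0, dt)
bounded = np.sin(grid)    # bounded: small in the power space, not in L^2

assert trunc(bounded, 200.0) > 5.0          # plain seminorms grow with t
assert all(power(bounded, t) <= 1.0 for t in (1.0, 10.0, 100.0, 200.0))
```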

3.2.2 Composite Signals

Signals are often composite, made up of different parts we label as inputs, outputs,
states, etc., each having a particular role in the overall model. Thus we should consider
the Cartesian product of signal spaces, or in fact their direct sum as we want the
product itself to be a vector space. One complication in forming such a direct sum
is that we need also to form a corresponding temporal family of seminorms on the


sum in a way that preserves the signal space structure of its components.
To this end, suppose we have a finite collection of signal spaces {(Xi, Ti, Si)}_{i=1}^N
over a common field F. Let a : T → T1 × ··· × TN be a mapping from an index set
T to the Cartesian product of the individual time axes Ti, and let ai : T → Ti denote
the i-th component of a. We assume that a has the following three properties: (a) it is
injective, (b) its image is cofinal with respect to the product order on its codomain,
and (c) each component ai is surjective. Using a, we define a family S = {‖·‖t}t∈T
of seminorms on the direct sum X = X1 ⊕ ··· ⊕ XN as follows:

‖(x1, …, xN)‖t = ∑_{i=1}^N ‖xi‖_{ai(t)},   (3.7)

where the i-th norm in the sum is from the family Si. The partial order ≼ induced on
T by this family S is such that s ≼ t if and only if ai(s) ≼ ai(t) for each i (namely, a
is monotone with respect to the product order on its codomain). It is straightforward
to show that under the assumptions on a listed above, S is a temporal family and thus
(X, T, S) is a signal space. Also, because each component ai is surjective we have

sup_{t∈T} ‖xi‖_{ai(t)} = sup_{ti∈Ti} ‖xi‖_{ti}   (3.8)

for each i, and it follows from (3.5) that

max_{i=1,…,N} sup_{ti∈Ti} ‖xi‖_{ti} ≤ ‖(x1, …, xN)‖s ≤ ∑_{i=1}^N sup_{ti∈Ti} ‖xi‖_{ti}   (3.9)

for all (x1, …, xN) ∈ Xs. Hence the small-signal subspace Xs coincides with the
direct sum X1s ⊕ ··· ⊕ XNs, and thus we can write (3.9) as

max_{i=1,…,N} ‖xi‖s ≤ ‖(x1, …, xN)‖s ≤ ∑_{i=1}^N ‖xi‖s   (3.10)

for all (x1, …, xN) ∈ Xs. Moreover, both the seminorm and the fine topologies on
X coincide with the corresponding product topologies on the underlying Cartesian
product X1 × ··· × XN.
As an example of this direct sum, suppose each component space (Xi , Ti , Si ) is an
extended L p space from Example 3.3, and suppose they all share the same underlying
measure μ, set E, collection C, and time axis Ti = T . Then the natural way to define
their direct sum is to take each component function ai to be the identity map on
T . This choice is clearly injective with surjective components, and the cofinality
property follows from the fact that the collection C is a directed cover of E. As a
second example, suppose (X1 , T1 , S1 ) is a normed space as in Example 3.1 so that
T1 = {t1 } is a singleton, and let (X2 , T2 , S2 ) be another signal space. Then the choice
a(t) = (t1 , t) with T = T2 has the desired properties (and is the only such choice up
to an isomorphism).
In what follows, whenever we talk about a direct sum of signal spaces, we will
implicitly assume that it carries the seminorm structure in (3.7) for an appropriate
choice for the mapping a. Finally, we will use π to denote canonical projections onto
component spaces, e.g., πXi : X → Xi .
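The sandwich inequality (3.10) is easy to sanity-check numerically, and with non-monotone seminorms the two bounds are generally not attained simultaneously (a sketch with our own choice of component seminorms; here each ‖·‖t is a running average over {0, …, t} and each ai is the identity):

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_seminorm(x, t):
    """A (non-monotone) seminorm: average magnitude over {0, ..., t}."""
    return float(np.mean(np.abs(x[:t + 1])))

T, N = 40, 3
for _ in range(200):
    parts = [rng.normal(size=T) for _ in range(N)]
    # Small-signal norm of the direct sum, via (3.7) and (3.5):
    sum_norm = max(sum(avg_seminorm(x, t) for x in parts) for t in range(T))
    # Component small-signal norms:
    comp = [max(avg_seminorm(x, t) for t in range(T)) for x in parts]
    # Inequality (3.10):
    assert max(comp) <= sum_norm + 1e-12
    assert sum_norm <= sum(comp) + 1e-12
```

The right-hand bound is typically strict here because the component seminorms peak at different times t.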

3.3 Systems, Controllability, and Causality

A system is a pair (X, 𝓑) where X is a signal space and 𝓑 ⊆ X is a collection of
signals. This is the viewpoint taken in [34] and refined in [29], and as in [29] we
call 𝓑 the behavior of the system. For convenience, we will often use the behavior 𝓑
as shorthand for the system (X, 𝓑) itself when the signal space X is clear from
context. A system is linear when its behavior is a linear subspace of its signal space.

There are two particular subsystems associated with the small signals in any
system (X, 𝓑). The first is the small-signal subsystem (Xs, 𝓑s) with behavior
𝓑s = 𝓑 ∩ Xs, where Xs denotes the small-signal subspace of X. To define the
second, recall that the small-signal subspace Xs carries a norm ‖·‖s. We can use
this norm on Xs to create a signal space as in Example 3.1 by letting the time axis
be a singleton and letting the family of seminorms contain just the norm ‖·‖s. We
will use the notation Xn to refer to this normed signal space. Here the superscript n
stands for "normed." The normed subsystem associated with (X, 𝓑) is the system
(Xn, 𝓑n) with behavior 𝓑n = 𝓑s, so that 𝓑n is also the collection of all small
signals in 𝓑. Even though 𝓑s and 𝓑n contain the same signals, their seminorm
structures are different: 𝓑s inherits a seminorm ‖·‖t from X for each t ∈ T, whereas
𝓑n has only one seminorm ‖·‖s (which is actually a norm). In other words, a signal
in 𝓑s carries the notion of time t ∈ T it inherits from X, but the same signal in 𝓑n
carries no notion of time. We summarize these distinctions in Table 3.1.

Table 3.1 The subsystems 𝓑s and 𝓑n associated with any system 𝓑

System | Name | Behavior | Signal space | Time axis
𝓑 |  | 𝓑 | (X, T, S) | T
𝓑s | Small-signal subsystem | 𝓑 ∩ Xs | (Xs, T, S) | T
𝓑n | Normed subsystem | 𝓑 ∩ Xs | (Xn, {0}, {‖·‖s}) | Singleton

3.3.1 Controllability

In the behavioral approach of [29], controllability is a property of a system that makes


sense even for systems without inputs. The following definition of controllability is
an approximate version of “exact controllability” from [29, Definition V.1]:
Definition 3.1 A system (X, 𝓑) is controllable to a trajectory x ∈ X when for every
x̄ ∈ 𝓑 and every open neighborhood U ⊆ X of x̄ there exists x̂ ∈ U ∩ 𝓑 such that
x̂ − x ∈ Xs. It is finely controllable to x when the same holds for every finely open
neighborhood U.
Note that by this definition, a system can be controllable to a trajectory that is not
part of its behavior (in other words, x need not belong to 𝓑). Recalling that (3.2) and
(3.3) provide respective bases for the seminorm and fine topologies, controllability
to x essentially means that given any system trajectory x̄ and any time t, there exists
another system trajectory x̂ that is close to x̄ up to time t but that is ultimately close
to x (in the sense that the difference x̂ − x is a small signal). Or, in the words of Jan
Willems in [29]:
…for a controllable system the past of a trajectory will have no lasting influence on the far
future, since sooner or later any other trajectory can be joined.
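In the extended spaces of Example 3.3 this joining is accomplished by truncation (a numerical sketch in discrete time; the names are ours): given any trajectory x̄ and any time t, the truncated signal x̂ = x̄ · 1_{A_t} agrees with x̄ up to time t and is a small signal.

```python
import numpy as np

def trunc_norm(x, t, p=2):
    """Truncation seminorm ||x||_t on the extended l^p space (A_t = {0..t})."""
    return float(np.linalg.norm(x[:t + 1], p))

n = 1000
x_bar = np.exp(0.05 * np.arange(n))    # a growing trajectory, not small
t = 100
x_hat = np.where(np.arange(n) <= t, x_bar, 0.0)   # truncation at time t

# x_hat lies in the zero-radius ball B_t(x_bar): it matches x_bar up to t.
assert trunc_norm(x_hat - x_bar, t) == 0.0
# x_hat is small, with ||x_hat||_s = ||x_bar||_t (so K = 1 and b = 0 in the
# uniform version of controllability given in Definition 3.2 below).
s_norm = max(trunc_norm(x_hat, s) for s in range(n))
assert abs(s_norm - trunc_norm(x_bar, t)) < 1e-9
```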

We will be particularly interested in systems that are controllable to zero (i.e., to the
zero trajectory). Note that a system is (finely) controllable to zero if and only if its
small-signal subsystem 𝓑s is (finely) dense in 𝓑. If the system (X, X) containing all
possible trajectories is (finely) controllable to zero, then we say that the signal space
X itself is (finely) controllable to zero. For example, the signal space of Example 3.3
is finely controllable to zero because all truncated signals of the form x · 1_{At} for t ∈ T
are small signals. Likewise, the signal space of Example 3.5 is controllable to zero,
and it is finely controllable to zero if and only if all sets in C are finite (Lemma 3.22 in
the appendix). If X is controllable to zero, then every system (X, 𝓑) whose behavior
is contained in the closure of its interior is controllable to zero. Likewise, if X is
finely controllable to zero, then every system (X, 𝓑) whose behavior is contained
in the fine closure of its fine interior is finely controllable to zero. Finally, if X is a
direct sum X = X1 ⊕ ··· ⊕ XN as in Sect. 3.2.2, then X is controllable to zero (or
finely controllable to zero) if and only if each component Xi is.
We also need the following stronger version of controllability to zero:
Definition 3.2 A system (X, 𝓑) is uniformly controllable to zero when there exist
K, b ≥ 0 such that for all x̄ ∈ 𝓑, all t ∈ T, and all ε > 0 there exists x̂ ∈ Bt,ε(x̄) ∩
𝓑s such that ‖x̂‖s ≤ K‖x̄‖t + b + ε. It is uniformly finely controllable to zero when
this holds with ε = 0. The constants K and b are the controllability constant and
controllability bias, respectively.
As before, we can apply this definition to entire signal spaces. For example, the
signal space of Example 3.3 is uniformly finely controllable to zero with K = 1
and b = 0 because ‖x · 1_{At}‖s = ‖x‖t for all signals x. Likewise, it follows from the
results in [14] that if the boundaries of the sets At in Example 3.4 are sufficiently well
behaved, then the signal space of Example 3.4 is uniformly finely controllable to zero
with b = 0 and a value of K that depends on n, k, p, and the boundary parameters.
Note that the property of a signal space being uniformly finely controllable to zero,
which appears as Assumption 4.1(d) in [8] for the case of zero controllability bias,
is essentially the property that signals admit "soft" truncations.
Example 3.7 (Linear time-invariant systems) Let X be the signal space L²_loc on the
time axis T = R_{≥0} with signals x ∈ X having vector values x_t ∈ R^n at times t ∈ T.
Let (X, Σ) be a linear system, and suppose Σ has a state-space representation

ξ̇_t = Aξ_t + B E x_t ,   ξ_0 = 0        (3.11)
0 = Cξ_t + D x_t        (3.12)

for matrices A, B, C, D, E of appropriate dimensions. In other words, x ∈ Σ if and
only if (3.12) holds for almost all t ∈ T, where ξ is the unique absolutely continuous
solution to (3.11). As we will show in Lemma 3.23 in the appendix, if (A, B) is
stabilizable, (A, C) is detectable, and the stacked matrix [E; D] is right-invertible,
then Σ is uniformly finely controllable to zero (with zero controllability bias). An
analogous result holds in discrete time.
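To see the mechanism behind such results, the following sketch (not from the chapter; the scalar system, the truncation time, and the deadbeat feedback are all hypothetical choices) illustrates in discrete time how a trajectory can be "softly truncated": follow it up to a time t0, then switch to a stabilizing input that steers the state, and hence the remaining output, to zero.

```python
# Sketch: a scalar discrete-time analog of Example 3.7's controllability property.
# Behavior: xi[t+1] = a*xi[t] + u[t].  Given any trajectory, we truncate it at t0
# by switching to the deadbeat feedback u = -a*xi, which zeros the state in one
# step.  The truncated trajectory agrees with the original up to t0 and is a
# small signal, with norm controlled by the norm of the original up to t0.

def run(a, u, T):
    """Simulate xi[t+1] = a*xi[t] + u(t, xi[t]) for T steps; return the xi list."""
    xi = [0.0]
    for t in range(T):
        xi.append(a * xi[-1] + u(t, xi[-1]))
    return xi

a = 2.0      # an unstable pole; (A, B) = (a, 1) is still trivially stabilizable
t0 = 5       # truncation time

def soft_trunc(t, x):
    return 1.0 if t < t0 else -a * x   # original input, then deadbeat feedback

xi = run(a, soft_trunc, 20)
assert all(v == 0.0 for v in xi[t0 + 1:])   # zero after the soft truncation
```

The point is that stabilizability (here trivially satisfied) is what permits the continuation to be steered to zero while matching the original signal up to the truncation time.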

3.3.2 Input/Output Systems, Causality, and Hemicontinuity

An input/output (IO) system is a system (X, Σ) whose signal space X is a direct sum
X = I ⊕ O of an input space I and an output space O. We will write such a system as
the triple (I, O, Σ). Hence an IO system is merely a system for which we label some
signals as inputs and the rest as outputs. Unlike [29], we impose no requirement
that the inputs are "free" or that the outputs "process" the inputs. Instead, we will
take an IO stability point of view: the outputs will be those signals that we wish to
be small whenever the signals we have labeled as inputs are small.
Similarly, an IO system with latent variables is a system (X, Σ) whose signal
space X is a direct sum X = I ⊕ O ⊕ L of an input space I, an output space O, and
a third space L of latent variables that are neither inputs nor outputs (such as state
variables) [29]. Every IO system with latent variables generates an associated IO
system without latent variables via projection onto I ⊕ O. In the parlance of [29],
the original behavior Σ ⊆ I ⊕ O ⊕ L is the full behavior and its projection π_{I⊕O}(Σ)
is the manifest behavior. For convenience, we will assume that our IO systems have
no latent variables, i.e., that any latent variables have been projected out to produce
the manifest behavior. Note that the system (3.11)–(3.12) in Example 3.7 becomes
a familiar IO system when X = I ⊕ O so that x ∈ X has components x = (u, y);
indeed, setting E = [I 0] and D = [D_1 −I] for some matrix D_1 leads to the system
ξ̇_t = Aξ_t + Bu_t and y_t = Cξ_t + D_1 u_t.
3 On the Role of Well-Posedness in Homotopy Methods … 55

Let (I, O, Σ) be an IO system. The domain of Σ is the projection dom(Σ) = π_I(Σ).
For all u ∈ I we define the cross section Σ[u] ⊆ O as

Σ[u] = { y ∈ O : (u, y) ∈ Σ } ,        (3.13)

and for a set S ⊆ I we define Σ[S] = ∪_{u∈S} Σ[u]. If Σ[u] is a singleton, then we
equate it with its sole member, e.g., we can write y = Σ[u] instead of y ∈ Σ[u]. We
thus regard the map u ↦ Σ[u] as a set-valued map whose graph is Σ. We say that
Σ is univalent when Σ[u] is a singleton for each u ∈ dom(Σ). We say that Σ is an
operator when dom(Σ) = I, that is, when its inputs can be freely chosen [23]. Note
that if Σ is a univalent operator, then Σ[·] is an ordinary single-valued function from
I to O. The inverse of an IO system (I, O, Σ) is the IO system (O, I, Σ⁻¹) with
behavior

Σ⁻¹ = { (y, u) ∈ O ⊕ I : (u, y) ∈ Σ } .        (3.14)

In particular, the inverse image of a set Y ⊆ O is

Σ⁻¹[Y] = { u ∈ I : Σ[u] ∩ Y ≠ ∅ } .        (3.15)

Note that every system has an inverse in this set-valued sense.
In the classical setting, causality is defined using truncation operators on the
signal spaces [7, 28, 35]. More appropriate for our setting is the following notion of
causality adapted from [29]:
Definition 3.3 An IO system (I, O, Σ) is causal when for every (ū, ȳ) ∈ Σ, every
t ∈ T, and every u ∈ B_t(ū) ∩ dom(Σ) there exists y ∈ B_t(ȳ) ∩ Σ[u].
In other words, Σ is causal when for every (ū, ȳ) ∈ Σ, every t ∈ T, and every
u ∈ dom(Σ) such that ‖u − ū‖_t = 0 there exists y ∈ Σ[u] such that ‖y − ȳ‖_t = 0.
Arguably more accurate terms than "causal" are nonanticipative [7, 35] or
nonanticipating [29], as they avoid the implication that the inputs somehow "cause"
the outputs. Nevertheless, we use the more familiar term as it is less cumbersome.
We next show that causality is a particular form of uniform continuity of an IO
system (I, O, Σ) regarded as a set-valued map from the input space I to the output
space O. Recall that a set-valued map is lower hemicontinuous when the inverse
image of every open set is open.1 We will slightly weaken this standard definition by
requiring only that the inverse image of every open set is open relative to its domain.
Making specific choices for the topologies on I and O leads to the following:
Definition 3.4 An IO system (I, O, Σ) is lower hemicontinuous (lhc) when the
inverse image Σ⁻¹[Y] of every open set Y ⊆ O is open in I relative to dom(Σ).
It is finely lower hemicontinuous (flhc) when the inverse image of every finely open
set is finely open in I relative to dom(Σ). It is weakly lower hemicontinuous (wlhc)
when the inverse image of every open set is finely open in I relative to dom(Σ).

1 Many authors use “semicontinuous” rather than “hemicontinuous,” e.g., [3].



Fig. 3.3 An illustration of ℓ-uniform fine lower hemicontinuity: given (ū, ȳ) ∈ Σ and t ∈ T, for
any u ∈ dom(Σ) that agrees with ū up to time ℓ(t) there exists y ∈ Σ[u] that agrees with ȳ up to
time t. This is the same as causality when ℓ = id, namely, when ℓ(t) = t for all t ∈ T

We could also consider a strong version of lower hemicontinuity using the seminorm
topology on I and the fine topology on O, but we will make no use of such a version
here. Also, neither lhc nor flhc is a stronger property than the other in general, but
they are both stronger than wlhc.
To define uniform versions of lower hemicontinuity, we tie the open sets in O in
Definition 3.4 to particular open sets in I using the balls in (3.2) and (3.3) together
with a mapping of the time axis T. Given a signal space (X, T, S), we say that a
function ℓ : T → T is a look-ahead map for X when there exists a constant L > 0
such that ‖·‖_t ≤ L‖·‖_{ℓ(t)} for all t ∈ T. We call L a look-ahead constant for ℓ. Note
that by the definition of the partial order ≤ on T, we have t ≤ ℓ(t) for all t ∈ T
(which means ℓ(t) indeed "looks ahead" in time). Also, the identity map ℓ = id on
T is always a look-ahead map with L = 1. As another example, suppose T = R_{≥0}
(continuous time) or T = N (discrete time) and suppose the mapping t ↦ ‖x‖_t is
monotone in t for any fixed x ∈ X (as in most classical settings); then ℓ(t) = t + τ
for a fixed positive τ ∈ T is a look-ahead map with L = 1. More generally, if there
exists a constant a ∈ R such that the mapping t ↦ e^{at}‖x‖_t is monotone in t for any
fixed x ∈ X, then ℓ(t) = t + τ is a look-ahead map with L = e^{aτ}.
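As a concrete check of the monotone case (a hypothetical finite-horizon sketch, not from the chapter), with running sup-seminorms the map t ↦ ‖x‖_t is nondecreasing, so ℓ(t) = t + 1 is a look-ahead map with constant L = 1:

```python
# Sketch: for running sup-seminorms ||x||_t = max_{s<=t} |x[s]|, the map
# t -> ||x||_t is nondecreasing, so ||x||_t <= ||x||_{t+1}; i.e., ell(t) = t + 1
# is a look-ahead map with look-ahead constant L = 1.

def seminorm(x, t):
    return max(abs(v) for v in x[:t + 1])

x = [0.3, -2.0, 1.0, 0.5]
assert all(seminorm(x, t) <= seminorm(x, t + 1) for t in range(len(x) - 1))
```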

Definition 3.5 Let ℓ be a look-ahead map for I ⊕ O. An IO system (I, O, Σ) is
ℓ-uniformly lower hemicontinuous (ℓ-lhc) when for every (ū, ȳ) ∈ Σ, t ∈ T, and ε > 0
there exists δ > 0 such that B_{ℓ(t),δ}(ū) ∩ dom(Σ) ⊆ Σ⁻¹[B_{t,ε}(ȳ)]. It is ℓ-uniformly
finely lower hemicontinuous (ℓ-flhc) when B_{ℓ(t)}(ū) ∩ dom(Σ) ⊆ Σ⁻¹[B_t(ȳ)] for all
(ū, ȳ) ∈ Σ and t ∈ T. It is ℓ-uniformly weakly lower hemicontinuous (ℓ-wlhc) when
B_{ℓ(t)}(ū) ∩ dom(Σ) ⊆ Σ⁻¹[B_{t,ε}(ȳ)] for all (ū, ȳ) ∈ Σ, t ∈ T, and ε > 0.

In particular, we see that causality is just the special case of ℓ-flhc with ℓ = id. We
illustrate the notion of ℓ-flhc in Fig. 3.3. As before, neither ℓ-lhc nor ℓ-flhc is a
stronger property than the other in general, but they are both stronger than ℓ-wlhc.

3.4 Stability and Gain of IO Systems

An IO system (I, O, Σ) is called "input–output stable" in [12] when small inputs
produce small outputs. This definition makes sense in the context of univalent systems
in which the cross section Σ[u] is a singleton for every u ∈ dom(Σ). When Σ[u] is
a set, however, there are different notions of what it means for it to be "small." We
adapt one such notion from [25, 31, 32] as follows:
Definition 3.6 An IO system (I, O, Σ) is minimally stable when for each u ∈
dom(Σ)_s, the cross section Σ[u] is controllable to zero.
Thus in a minimally stable system, the set of outputs for any small input has the set
of small outputs as a dense subset. This is weaker than the property that small inputs
produce only small outputs.

3.4.1 Finite-Gain Stability

To deal with inputs that are not necessarily small, we will introduce the notion of
the gain of an IO system. In the classical setting, an IO system (I, O, Σ) has a finite
gain when there exist constants γ, β ≥ 0 such that

‖y‖_t ≤ γ·‖u‖_t + β        (3.16)

for all (u, y) ∈ Σ and all t ∈ T. If the input is small, then we can take the supremum
of both sides of (3.16) over t to conclude that all associated outputs must also be
small; in particular, all IO systems having finite gain are minimally stable. We next
generalize this definition by allowing different values for time on the left- and right-
hand sides of the inequality (3.16).
Given β ≥ 0 and a look-ahead map ℓ for I ⊕ O, the look-ahead gain with bias β
of the IO system (I, O, Σ) is the nonnegative extended real number g_β^ℓ(Σ) defined
as

g_β^ℓ(Σ) = sup_{ε>0, (u,y)∈Σ, t∈T} (‖y‖_t − β) / (ε + ‖u‖_{ℓ(t)})   when Σ ≠ ∅,
g_β^ℓ(Σ) = 0   when Σ = ∅.        (3.17)

Note that g_β^ℓ(Σ) ≥ 0 because if the numerator in (3.17) is negative, then taking
ε → ∞ will make the supremum zero. Also, if g_β^ℓ(Σ) < ∞ then

‖y‖_t ≤ g_β^ℓ(Σ)·‖u‖_{ℓ(t)} + β   ∀(u, y) ∈ Σ, ∀t ∈ T.        (3.18)

We see from (3.17) that gβ ( ) is nonincreasing in β, so it has a limit as β → ∞


which we define to be the look-ahead gain g ( ) of :

g ( ) = inf gβ ( ) = lim gβ ( ) . (3.19)


β0 β→∞

We say that the system is -stable when g ( ) < ∞, and we say that it is -stable
with zero bias when g0( ) < ∞. These gains are identical for linear systems:
Lemma 3.3 If is a linear IO system then g ( ) = g0( ).
A related result is the following:
Lemma 3.4 Every -stable linear IO system is univalent and -flhc.
Note that supt∈T u (t)  supt∈T ut , so if is -stable and u is small then we
can take the supremum of both sides of (3.18) over t ∈ T to obtain ys  gβ ( ) ·
us + β for all y ∈ [u]. Hence if is -stable then small inputs produce only
small outputs, and in particular all -stable systems are minimally stable.
We next show that ℓ-stability is preserved under compositions. Given IO sys-
tems (I, X, Σ) and (X, O, Λ), their composition is the IO system (I, O, Λ ∘ Σ) with
behavior

Λ ∘ Σ = { (u, y) ∈ I ⊕ O : ∃x ∈ X such that (u, x) ∈ Σ and (x, y) ∈ Λ } .        (3.20)

Lemma 3.5 Suppose (I, X, Σ) and (X, O, Λ) are IO systems. If Σ is ℓ_1-stable and
Λ is ℓ_2-stable, then Λ ∘ Σ is (ℓ_1 ∘ ℓ_2)-stable with g^{ℓ_1∘ℓ_2}(Λ ∘ Σ) ≤ g^{ℓ_2}(Λ) g^{ℓ_1}(Σ). If
in addition the biases for Σ and Λ are zero then g_0^{ℓ_1∘ℓ_2}(Λ ∘ Σ) ≤ g_0^{ℓ_2}(Λ) g_0^{ℓ_1}(Σ).
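The multiplicative gain bound for compositions can be checked numerically on static maps (a toy sketch, not from the chapter; the two scalar maps and their gains are hypothetical choices, with ℓ_1 = ℓ_2 = id and zero bias):

```python
# Sketch: the composition gain bound of Lemma 3.5 for static scalar systems
# with ell_1 = ell_2 = id and zero bias: the composed gain is at most the
# product of the individual gains.
import math

Sigma = lambda u: [2.0 * v for v in u]             # static map with gain 2
Lam   = lambda x: [3.0 * math.sin(v) for v in x]   # gain <= 3 since |sin v| <= |v|

u = [0.5, -1.0, 0.25]
y = Lam(Sigma(u))
# Composed gain bound: g(Lam o Sigma) <= g(Lam) * g(Sigma) = 6
assert all(abs(yv) <= 6.0 * abs(uv) for yv, uv in zip(y, u))
```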
Note that unlike [23], we do not use the term "bounded" to describe the finite-
gain property. This is because the notion of a bounded operator between topological
vector spaces has a standard meaning, namely, that (von Neumann) bounded sets
are mapped to bounded sets. Also, note that ℓ-stability is stronger than continuity,
even for linear univalent operators. Indeed, it is clear that a linear map T : I → O
is continuous (with respect to the seminorm topologies) when for every s ∈ T there
exist t ∈ T and C > 0 such that ‖Tu‖_s ≤ C‖u‖_t for all u ∈ I; for ℓ-stability we
require further that t = ℓ(s) and that C is independent of s.
Because the special case ℓ = id is important, we highlight it by defining g_β(Σ) and
g(Σ) (without the superscript ℓ) to be the gains in (3.17) and (3.19) for the specific
choice ℓ = id. In this special case the inequality (3.18) reduces to the classical finite-
gain inequality (3.16), and we say that Σ is stable when g(Σ) < ∞ and stable
with zero bias when g_0(Σ) < ∞. Note that if L is a look-ahead constant for ℓ then
g_β^ℓ(Σ) ≤ L g_β(Σ) for any β ≥ 0. In particular, stable systems are ℓ-stable for any
look-ahead map ℓ (but not conversely in general). Also, Lemma 3.4 with ℓ = id
states that every stable linear IO system is univalent and causal. However, there exist
stable nonlinear IO systems that are not causal: consider the discrete-time system
with I = O = L^p_loc given by the difference equation y_t = u_t tanh(u_{t+1}) for t ∈ N.
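This noncausal-but-stable example is easy to verify on finite sequences (a sketch with hypothetical input signals; the final sample is padded so the finite-horizon map is well defined):

```python
# Sketch: y[t] = u[t]*tanh(u[t+1]) is stable (|tanh| < 1, so |y[t]| <= |u[t]|,
# giving gain <= 1) yet noncausal: the output at time t depends on u at t+1.
import math

def Gamma(u):
    return [u[t] * math.tanh(u[t + 1]) for t in range(len(u) - 1)]

u1 = [1.0, 0.0, 0.0]
u2 = [1.0, 5.0, 0.0]          # agrees with u1 only at t = 0
y1, y2 = Gamma(u1), Gamma(u2)

assert u1[0] == u2[0] and y1[0] != y2[0]   # violation of causality at t = 0
assert all(abs(y) <= abs(u) for y, u in zip(y1, u1))   # finite gain <= 1
```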

3.4.2 Relationships Between Gain, Small-Signal Gain, and Norm Gain

We can also apply these gain concepts to the subsystems Σ_s and Σ_n in Table 3.1
to obtain the small-signal gain g^ℓ(Σ_s) and the norm gain g(Σ_n). Note that the
finiteness of either of these gains does not imply that small inputs produce small
outputs because by definition, the subsystems Σ_s and Σ_n contain only small signals.
Indeed, in the classical extended L² setting in continuous time, a linear time-invariant
system described by a proper transfer function with no poles on the imaginary axis
has a finite norm gain equal to the peak value of the magnitude portion of its Bode
plot, even if the system is unstable. As we will see in Sect. 3.5.3, soft IQCs by
themselves generally provide bounds on the norm gain only and thus do not ensure
stability without additional assumptions.
The small-signal and norm gains of a system are related to the look-ahead gain
g^ℓ(Σ) through the following inequalities:
g(Σ_n) ≤ g^ℓ(Σ_s) ≤ g^ℓ(Σ)        (3.21)
g_β(Σ_n) ≤ g_β^ℓ(Σ_s) ≤ g_β^ℓ(Σ) ,        (3.22)

where (3.22) holds for any β ≥ 0 and any look-ahead map ℓ, and (3.21) follows from
(3.22) by taking the limit as β → ∞. Note that by setting ℓ = id in (3.21) we obtain
the inequality g(Σ_n) ≤ g(Σ). What we need for our stability analysis, however, is
the reverse inequality (at least up to a constant factor). One way to achieve such a
reverse inequality is to assume that Σ is a minimally stable causal operator:
Lemma 3.6 Let (I, O, Σ) be a minimally stable causal operator, and suppose the
input space I is uniformly finely controllable to zero with controllability constant K.
Then g(Σ) ≤ K g(Σ_n).
Lemma 3.6 is used implicitly in the stability analysis of [21, 23], and is related to
[7, Exercise 8c] and [25, Proposition 6]. We will not prove Lemma 3.6 separately as
it is a direct corollary of the following lemma:
Lemma 3.7 Let (I, O, Σ) be a minimally stable IO system. If Σ is ℓ-lhc and dom(Σ)
is uniformly controllable to zero (or if Σ is ℓ-wlhc and dom(Σ) is uniformly finely
controllable to zero), then g_{β̄}^ℓ(Σ) ≤ K g_β(Σ_n) for all β ≥ 0 such that g_β(Σ_n) < ∞,
where β̄ = β + b g_β(Σ_n) and K and b are the controllability constant and bias for
dom(Σ). In particular g^ℓ(Σ) ≤ K g(Σ_n).
We obtain Lemma 3.6 by setting ℓ = id in Lemma 3.7 and recognizing that, in this
case, causality is then the same as ℓ-flhc (which in turn implies ℓ-wlhc). The following
version of Lemma 3.7 has weaker assumptions but involves the larger small-signal
gain rather than the norm gain:
Lemma 3.8 Let (I, O, Σ) be a minimally stable IO system. If Σ is lhc and dom(Σ)
is controllable to zero (or if Σ is wlhc and dom(Σ) is finely controllable to zero),
then g_β^ℓ(Σ_s) = g_β^ℓ(Σ) for all β ≥ 0, and in particular g^ℓ(Σ_s) = g^ℓ(Σ).

3.4.3 Stability Robustness in the Gap Topology

Are stability, ℓ-stability, or minimal stability preserved under small perturbations? In
this section we answer this question for perturbations in a certain topology on the set
of systems known as a "gap topology" (see [6] and the references therein). We will use
a version of the gap topology described in [10, 23]. This topology comes from a type
of normalized Hausdorff set distance, defined in the following manner. Let (X, T, S)
be a signal space, and consider the distance functions d, q : X × X → [0, ∞] given
by

d(x, y) = sup_{ε>0, t∈T} ‖x − y‖_t / (ε + ‖x‖_t)        (3.23)

q(x, y) = ln(1 + d(x, y)) ,        (3.24)

with the convention ln ∞ = ∞.
Lemma 3.9 The distance function q is an extended quasimetric on X.
This quasimetric q generates a topology on X in the usual manner [5]: a set U ⊆ X
is open in the q-topology when for every x ∈ U there exists r > 0 such that y ∈ X
and q(x, y) < r imply y ∈ U. Thus a sequence {x_n}_{n∈N} in X converges to a point
x ∈ X in the q-topology if and only if q(x, x_n) → 0 as n → ∞. The q-topology is
finer than the seminorm topology, and X with the q-topology is not a topological
vector space (unless X is trivial and contains only the zero signal). Indeed, 0 is an
isolated point in the q-topology because q(0, x) = ∞ for all x ≠ 0. As a result,
neither scalar multiplication nor vector addition is continuous in the q-topology
when X contains a nonzero signal x: the sequence x/n does not converge to zero
as n → ∞; moreover, the sequences (1 + 1/n)x and (−1 + 1/n)x have q-limits
x and −x (respectively), but their sum is 2x/n, which does not converge to zero.
A consequence of the following lemma is that the conjugate quasimetric q̄ defined
as q̄(x, y) = q(y, x) is topologically equivalent to q, that is, it also generates the
q-topology:

Lemma 3.10 If d(x, y) < 1 then d(y, x) ≤ d(x, y) / (1 − d(x, y)).
In particular, the pointwise maximum of q and q̄ is an extended metric on X which
generates the q-topology.
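The asymmetry bound of Lemma 3.10 can be checked numerically for finite sequences under running sup-seminorms (a hypothetical sketch, not from the chapter; the sup over ε > 0 in (3.23) is taken in the limit ε → 0, which here reduces to dividing by ‖x‖_t since the seminorms are positive):

```python
# Sketch: verify Lemma 3.10's bound d(y,x) <= d(x,y)/(1 - d(x,y)) for the
# distance (3.23), using running sup-seminorms on finite sequences.

def seminorm(x, t):               # ||x||_t = max_{s<=t} |x[s]|
    return max(abs(v) for v in x[:t + 1])

def gap_d(x, y):                  # d(x, y) from (3.23), with eps -> 0
    return max(seminorm([a - b for a, b in zip(x, y)], t) / seminorm(x, t)
               for t in range(len(x)))

x = [1.0, 2.0, -1.5, 0.5]
y = [1.1, 1.7, -1.0, 0.4]
dxy, dyx = gap_d(x, y), gap_d(y, x)
assert dxy < 1 and dyx <= dxy / (1 - dxy) + 1e-12   # Lemma 3.10
```

Note the asymmetry: d(x, y) and d(y, x) differ because the denominator normalizes by the first argument only, which is why q is a quasimetric rather than a metric.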
We next use d in (3.23) to define a Hausdorff-like distance d between sets A, B ⊆
X:

d(A, B) = max{ d⃗(A, B), d⃗(B, A) } ,        (3.25)

where d⃗ : P(X) × P(X) → [0, ∞] is defined as

d⃗(A, B) = sup_{a∈A} inf_{b∈B} d(a, b)   if A ≠ ∅,
d⃗(A, B) = 0   if A = B = ∅,        (3.26)
d⃗(A, B) = ∞   if A = ∅ and B ≠ ∅.

These distance functions d and d⃗ are called the gap and the directed gap (respec-
tively), and they are versions of the gaps defined in [10, 23]. The logarithm
ln(1 + d(·,·)) is an extended pseudometric on the power set P(X) (as shown in
[11], for example), and the associated pseudometric topology on P(X) is called the
gap topology.
The following lemma shows that the set of stable systems is open in the gap
topology; this is essentially [23, Lemma 1] but without the causality assumptions.
Lemma 3.11 Let (I, O, Σ) and (I, O, Λ) be IO systems. If Σ is stable and
d⃗(Σ, Λ) < (2g(Σ) + 2)⁻¹ then g(Λ) ≤ 2g(Σ) + 1. If Σ is stable with zero bias
and d⃗(Σ, Λ) < (2g_0(Σ) + 2)⁻¹ then g_0(Λ) ≤ 2g_0(Σ) + 1.
This lemma also shows that the smaller the gain of Σ, the more it can be perturbed
while preserving stability. In general, however, neither the set of minimally stable
systems nor the set of ℓ-stable systems is open in the gap topology, as the following
example shows:
Example 3.8 Let ℓ be a look-ahead map, and define the parameterized IO system

Σ_α = { (u, y) ∈ I ⊕ O : ‖y‖_t ≤ ‖u‖_t + α‖y‖_{ℓ(t)} ∀t ∈ T }        (3.27)

for each parameter α ∈ R. Clearly g_0(Σ_0) = 1, so Σ_0 is ℓ-stable. Also, the mapping
α ↦ Σ_α is continuous in the gap topology; indeed, suppose (u, y) ∈ Σ_α and define
ū = u + (α − ᾱ)y and ȳ = y for some ᾱ ∈ R. Then (ū, ȳ) ∈ Σ_ᾱ and

‖u − ū‖_t + ‖y − ȳ‖_t = |α − ᾱ| · ‖y‖_t        (3.28)

for all t ∈ T. It follows that d⃗(Σ_α, Σ_ᾱ) ≤ |α − ᾱ|, and by reversing the roles of α and
ᾱ we obtain d(Σ_α, Σ_ᾱ) ≤ |α − ᾱ|. Thus α ↦ Σ_α is actually uniformly continuous.
Next we examine the cross section Σ_α[0], which is the set

Σ_α[0] = { y ∈ O : ‖y‖_t ≤ α‖y‖_{ℓ(t)} ∀t ∈ T } .        (3.29)

Now suppose we are in discrete time with I = O = L^p_loc and 1 ≤ p < ∞, suppose
ℓ(t) = t + 1, and suppose 0 < |α| < 1. Let y_t denote the value of the signal y at
time t. Then ‖y‖_{t+1}^p = |y_{t+1}|^p + ‖y‖_t^p for all y ∈ O and all t ∈ T, which means

Σ_α[0] = { y ∈ O : |y_{t+1}|^p ≥ (|α|^{−p} − 1) · ‖y‖_t^p ∀t ∈ T } .        (3.30)

Thus we see that if y ∈ Σ_α[0] is nonzero, then ‖y‖_s > 0 for some s, which means
|y_t| cannot converge to zero as t → ∞. Hence the only small signal in Σ_α[0] is the

zero signal, and because α [0] also contains nonzero signals we conclude that α
is not minimally stable (and in particular it is not -stable).
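The blow-up in the cross section (3.30) is easy to observe numerically (a hypothetical sketch with p = 2 and α = 0.5; it builds the smallest admissible continuation of a nonzero member and watches it diverge):

```python
# Sketch: the membership condition (3.30) with p = 2, alpha = 0.5 forces
# |y[t+1]|^2 >= 3 * ||y||_t^2 for members of the cross section, so any nonzero
# member must blow up; the only small signal in the cross section is zero.
p, alpha = 2, 0.5
c = abs(alpha) ** (-p) - 1        # = 3

y = [1.0]
pnorm = 1.0                       # ||y||_t^p accumulated so far
for _ in range(10):               # smallest admissible continuation at each step
    nxt = (c * pnorm) ** (1 / p)
    y.append(nxt)
    pnorm += nxt ** p

assert y[-1] > y[0] and y[-1] > 100.0   # |y_t| diverges even at minimal growth
```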
The main reason we lack stability robustness in Example 3.8 is that nonzero
outputs in the perturbed system blow up very quickly when α is near zero. If we limit
the growth of such outputs a priori, then we can indeed preserve minimal stability
under perturbations in the gap topology, at least under some additional controllability
and lower hemicontinuity assumptions. To this end, given a look-ahead map ℓ and a
parameter μ ≥ 1 we define the following set of output signals:

G_μ^ℓ = { y ∈ O : ∃χ ≥ 0 such that ∀t ∈ T, ‖y‖_{ℓ(t)} ≤ μ‖y‖_t + χ } .        (3.31)

This set G_μ^ℓ represents those outputs that do not blow up too quickly, as measured by
ℓ and μ. For example, if T = R_{≥0} and ℓ(t) = t + τ for some positive constant τ ∈ T,
then G_μ^ℓ represents the set of all signals in O that exhibit exponential growth no faster
than μ^{t/τ}. We say that an IO system (I, O, Σ) is (ℓ, μ)-limited when G_μ^ℓ is dense in
Σ[u] for all small inputs u ∈ dom(Σ). This basically means that small inputs to Σ
produce outputs having limited growth. In many cases linear growth conditions on
differential or difference equations can guarantee this property. Note that for ℓ = id
and μ = 1 we have G_1^id = O, which means every IO system is (id, 1)-limited.
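Membership in G_μ^ℓ can be probed numerically (a hypothetical finite-horizon sketch with ℓ(t) = t + 1 and running sup-seminorms): the smallest workable bias χ is the supremum of the slack ‖y‖_{ℓ(t)} − μ‖y‖_t, and it stays bounded exactly for signals whose growth rate does not exceed μ.

```python
# Sketch: for ell(t) = t+1 and running sup-seminorms, y is in G_mu iff
# ||y||_{t+1} <= mu*||y||_t + chi for some fixed chi >= 0; the smallest
# workable chi is the sup of the slack over the horizon.

def required_chi(y, mu):
    norms, m = [], 0.0
    for v in y:
        m = max(m, abs(v))
        norms.append(m)
    return max(norms[t + 1] - mu * norms[t] for t in range(len(y) - 1))

slow = [2.0 ** t for t in range(20)]   # growth rate 2
fast = [3.0 ** t for t in range(20)]   # growth rate 3

assert required_chi(slow, 2.0) <= 0.0  # rate 2 <= mu: chi = 0 suffices
assert required_chi(fast, 2.0) > 1e8   # rate 3 > mu: chi grows without bound
```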
The following is a version of Lemma 3.11 that uses the small-signal gain g^ℓ(Σ_s)
to measure how far a system can be perturbed while preserving minimal stability:
Lemma 3.12 Let (I, O, Σ) be a minimally stable IO system such that g^ℓ(Σ_s) < ∞.
Suppose that either Σ is ℓ-lhc and dom(Σ) is controllable to zero, or that Σ is
ℓ-wlhc and dom(Σ) is finely controllable to zero. Let (I, O, Λ) be an (ℓ, μ)-limited
IO system, and let L be a look-ahead constant for ℓ. If

d⃗(Σ, Λ) < (μ g^ℓ(Σ_s) + μL)⁻¹        (3.32)

then Λ is minimally stable.

Note that we can use either the small-signal gain g^ℓ(Σ_s) or the gain g^ℓ(Σ) in
Lemma 3.12, because in this case they are equal by Lemma 3.8.

3.4.4 Stability via Homotopy

The main idea behind the stability proofs in [21, 23] is to start with a simple system
Σ_0 which we know to be stable, and then continuously deform it through a mapping
α ↦ Σ_α for α ∈ [0, 1] until we reach the target system of interest Σ_1. If Σ_α satisfies
certain conditions along the homotopy path, then we conclude that the target system
Σ_1 is stable as well. Most of the effort in the proof comes from Lemma 3.12, which
along with either Lemma 3.7 or Lemma 3.8 leads to the following stability theorems.
The first one makes use of Lemma 3.8 along with the small-signal gain:

Theorem 3.1 Let {(I, O, Σ_α)} be a family of IO systems with parameter α ∈ [0, 1],
and let ℓ be a look-ahead map for I ⊕ O. Suppose
(i) the mapping α ↦ Σ_α is continuous in the gap topology,
(ii) there exists μ ≥ 1 such that Σ_α is (ℓ, μ)-limited for all α ∈ [0, 1],
(iii) for each α ∈ [0, 1], either Σ_α is ℓ-lhc and dom(Σ_α) is controllable to zero, or
Σ_α is ℓ-wlhc and dom(Σ_α) is finely controllable to zero,
(iv) there exists γ ≥ 0 such that g^ℓ((Σ_α)_s) ≤ γ for all α ∈ [0, 1], and
(v) Σ_0 is minimally stable.
Then Σ_1 is ℓ-stable with g^ℓ(Σ_1) ≤ γ.
Note that we cannot conclude g^ℓ(Σ_1) ≤ γ directly from (iv) and Lemma 3.8, because
without the homotopy argument we do not know that Σ_1 is minimally stable. The
second theorem uses Lemma 3.7 along with the weaker (and easier to verify) norm
gain, but it requires uniform controllability rather than mere controllability:
Theorem 3.2 Let {(I, O, Σ_α)} be a family of IO systems with parameter α ∈ [0, 1],
and let ℓ be a look-ahead map for I ⊕ O. Suppose (i), (ii), and (v) of Theorem 3.1
hold, but instead of (iii) and (iv) suppose
(iii') there exists K ≥ 0 such that for each α ∈ [0, 1], Σ_α is ℓ-lhc and dom(Σ_α)
is uniformly controllable to zero (or Σ_α is ℓ-wlhc and dom(Σ_α) is uniformly
finely controllable to zero) with controllability constant K, and
(iv') there exists γ ≥ 0 such that g((Σ_α)_n) ≤ γ for all α ∈ [0, 1].
Then Σ_1 is ℓ-stable with g^ℓ(Σ_1) ≤ Kγ.
If the target system Σ_1 is a causal operator, then we can use Lemma 3.6 to conclude
from Theorem 3.1 or 3.2 that it is stable (rather than merely ℓ-stable) even though
Σ_α might not be a causal operator along the homotopy path. If each Σ_α is in fact a
causal operator, then we obtain the following version of [23, Theorem 2]:
Corollary 3.1 Let {(I, O, Σ_α)} be a family of IO systems with parameter α ∈ [0, 1],
and suppose the input signal space I is uniformly finely controllable to zero with
controllability constant K. Suppose (i) and (v) of Theorem 3.1 hold, suppose (iv') of
Theorem 3.2 holds, and suppose Σ_α is a causal operator for every α ∈ [0, 1]. Then
Σ_1 is stable with g(Σ_1) ≤ Kγ.

Note that [23, Theorem 2] does not assume (iv') directly, but rather it uses a sufficient
condition for (iv') stated in terms of IQCs. We will show how to do this in Sect. 3.5.3.
For us to conclude that the gain results in Theorems 3.1 and 3.2 and Corollary 3.1
hold with zero bias, we simply assume that (iv) or (iv') is satisfied with g_0 instead of
g and that the uniform controllability conditions hold with zero controllability bias.
Another simple extension of Theorems 3.1 and 3.2 is to allow ℓ to depend on α; in
this case all we need is a single look-ahead constant valid for all ℓ_α, plus the
following monotonicity property: if α ≤ ᾱ then G_μ^{ℓ_α} ⊆ G_μ^{ℓ_ᾱ}.
The following example shows that if all of the conditions of Theorem 3.2 are
satisfied except that there is no single value of the controllability constant K in (iii’)
that works for every α, then the target system can be unstable. The parameterized
system in this example is even causal and linear for each α, but this does not help.
Example 3.9 Let I = O = L²_loc ⊕ L²_loc on the time axis T = R_{≥0}, so that each input
and output signal has two components u = (u_1, u_2) and y = (y_1, y_2). Let h : T → R
be the signal h(t) = 2eᵗ, and consider the family {(I, O, Σ_α)} of linear IO systems
with parameterized behavior

Σ_α = { (u, y) ∈ I ⊕ O : y_1 = u_1, y_2 = u_2 + h ∗ u_2, and y_1 = (1 − α)y_2 }        (3.33)

for α ∈ [0, 1], where ∗ denotes the convolution of one-sided signals, i.e., signals
supported on T. This family satisfies almost all of the conditions of Theorem 3.2
with ℓ = id. We will show in Lemma 3.24 in the appendix that the mapping α ↦ Σ_α
is uniformly continuous. The growth condition in (ii) is trivially satisfied because ℓ =
id. The unstable linear map u_2 ↦ u_2 + h ∗ u_2 has the all-pass, unity-gain transfer
function (s + 1)/(s − 1), and it follows that (iv') holds with γ = 1. The initial system
Σ_0 satisfies y_1 = y_2 = u_1 and is thus minimally stable. All that is left is (iii'). Each
Σ_α is causal and thus also ℓ-wlhc. Its domain is

dom(Σ_α) = { (u_1, u_2) ∈ I : u_1 = (1 − α)(u_2 + h ∗ u_2) } .        (3.34)

If α = 1 then dom(Σ_α) = {0} × L²_loc, which is uniformly finely controllable to zero
with controllability constant K = 1. If α < 1 then we can write dom(Σ_α) in the
form (3.11)–(3.12) in Example 3.7 and thus conclude that it is also uniformly finely
controllable to zero. However, there is no single value of K that works for all α ∈
[0, 1], so (iii') is not satisfied (one can show that the value of K must grow like
1/(1 − α) as α → 1 before it jumps down to K = 1 at α = 1). As a result, the
conclusion of Theorem 3.2 fails to hold, and indeed the target system Σ_1 is unstable.
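The all-pass claim used in this example is easy to confirm numerically (a sketch, not from the chapter): the transfer function of u_2 ↦ u_2 + h ∗ u_2 with h(t) = 2eᵗ is 1 + 2/(s − 1) = (s + 1)/(s − 1), whose magnitude on the imaginary axis is identically 1 despite the unstable pole at s = 1.

```python
# Sketch: H(s) = (s+1)/(s-1) is all-pass, i.e. |H(jw)| = 1 for all real w,
# even though H has an unstable pole at s = 1.  This is why the norm gain
# (the Bode magnitude peak) can equal 1 while the system itself is unstable.

H = lambda s: (s + 1) / (s - 1)

for w in (0.1, 1.0, 10.0, 100.0):
    assert abs(abs(H(1j * w)) - 1.0) < 1e-12   # unit magnitude on the axis
```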

3.5 Stability of Interconnections

We typically apply the homotopy methods of Theorems 3.1 and 3.2 to interconnec-
tions of systems. In this section we show how we can use properties of individual
subsystems (e.g., continuity with respect to parameters and controllability) to deduce
the analogous properties of their interconnection as needed in these theorems.
Let (I, O, Σ) be an IO system. Given a system (O, Φ), we define the interconnec-
tion of Σ and Φ to be the IO system (I, O, [Σ, Φ]) with behavior

[Σ, Φ] = { (u, y) ∈ Σ : y ∈ Φ } .        (3.35)

Figure 3.4 illustrates this interconnection. Note that Φ is not necessarily an IO system
in this context, and we need not regard the signal y in Fig. 3.4 as "entering" or

Fig. 3.4 The interconnection [Σ, Φ] of Σ and Φ

Fig. 3.5 The classical interconnection [G⁺, Φ] with an additive input u

Fig. 3.6 The classical interconnection [G⁺, Φ] of Fig. 3.5 when G and Φ are IO systems

"leaving" Φ. Instead, Φ represents a constraint on the outputs of Σ that results in
the new IO system [Σ, Φ]. This perspective on interconnections is from [29], and Φ
typically represents the nonlinear, time-varying, or uncertain part of the system.
There are various ways of defining the stability of the interconnection [Σ, Φ]. In
[30], the signal u is treated as an internal state, and stability means the convergence of
u to zero in forward time. We will instead take an IO point of view in which stability
means the finite-gain stability of [Σ, Φ] as defined in Sect. 3.4.1. In this approach
u represents the exogenous signal, y represents the endogenous signal, and stability
implies that y cannot be large unless u is also large.
An important special case of the interconnections in Fig. 3.4 involves a system
(O, G) defined on the output space O. Let us first compose G with addition on O to
obtain the system (O ⊕ O, G⁺):

G⁺ = { (u, y) ∈ O ⊕ O : u + y ∈ G } .        (3.36)

If I = O, then we can form the interconnection [G⁺, Φ] as illustrated in Fig. 3.5. We
regard this special case as a classical interconnection, because if G and Φ are both
IO systems then we can split u and y into components to obtain the familiar feedback
connection in Fig. 3.6 (note that u_2 enters the summing junction in Fig. 3.6 with a
negative sign so that Fig. 3.6 is indeed a form of Fig. 3.5, but this has no bearing on
stability). The system G is linear in much of the classical stability literature. The
following lemmas show that the mapping from G to G⁺ is continuous in the gap
topology and preserves controllability to zero:

Lemma 3.13 The mapping (·)⁺ : P(O) → P(O ⊕ O) is uniformly continuous.

Lemma 3.14 If both (O, G) and the signal space O itself are controllable (resp.
finely controllable) to zero, then so is G⁺.

3.5.1 Well-Posed Interconnections

Recall the definition of well-posedness from [21, 23]: we say that the interconnection
[Σ, Φ] in Fig. 3.4 is well-posed when it is a causal operator. The methods
of [21, 23] assume that the interconnection is well-posed along the entire homotopy
path. We have already seen in Theorems 3.1 and 3.2 how we can relax this well-
posedness assumption. In particular, we can replace the requirement that the inter-
connection is an operator with a weaker controllability requirement on its domain.
Likewise, we can replace the causality requirement with a weaker lower hemicon-
tinuity requirement together with a growth condition. Thus even if the target system
is well-posed, the other artificial systems along the homotopy path need not be. In
fact, it is possible that the only well-posed system on the homotopy path is the target
system itself, because as the next examples show, the set of well-posed systems is
not open in the gap topology.
Example 3.10 We consider the interconnection of Fig. 3.6 with I = O = L^p_loc ⊕ L^p_loc
on T = R_{≥0}, so that each input and output signal has two components u = (u_1, u_2)
and y = (y_1, y_2). Let G ⊂ O be the linear system G = {(0, 0)}, that is, the system
whose behavior contains solely the zero vector in O. For each parameter α ∈ [0, 1],
we define the system Φ_α as

Φ_α = { (y_1, y_2) ∈ O : ∀t ∈ T, y_{1t} ≥ (1 − α)y_{2t} or y_{1t} ≤ 0 } ,        (3.37)

where y_{1t} and y_{2t} represent the values of the signals y_1 and y_2 at time t. It follows
that the interconnection Σ_α = [G⁺, Φ_α] is given by

Σ_α = { (u, y) ∈ I ⊕ O : y = −u and −u ∈ Φ_α } .        (3.38)

We will show in Lemma 3.25 in the appendix that the mapping α ↦ Σ_α is uniformly
continuous. Moreover, the target system Σ_1 is well-posed because when α = 1 we
have Φ_α = O, and thus Σ_α reduces to a constant gain of −1 (which is clearly a causal
operator). When α < 1, however, the inputs to Σ_α cannot be freely chosen, which
means Σ_α is causal but no longer an operator and is thus no longer well-posed.
Example 3.11 We again consider the interconnection of Fig. 3.6 with I and O as in
Example 3.10. Let G ⊂ O be the linear system G = {(w_1, w_2) ∈ O : w_1 = 0}. For
each parameter α ∈ [0, 1], we define the system

Φ_α = { (y_1, y_2) ∈ O : ∀t ∈ T, y_{2t} = (1 − α)y_{1t} · sech(y_{1,t+1}) } ,        (3.39)

where y1t and y2t are as in Example 3.10 and y1,t+1 represents the value of the signal y1 at time t + 1. It follows that the interconnection Φα = [G+, Δα] is given by

Φα = { (u, y) ∈ I ⊕ O : y1 = −u1 and ∀t ∈ T, y2t = (α − 1)u1t · sech(u1,t+1) } .   (3.40)
We will show in Lemma 3.26 in the appendix that the mapping α ↦ Φα is uniformly continuous. Moreover, the target system is Φ1 = {(u, y) : y1 = −u1 and y2 = 0}, which is clearly a causal operator. When α < 1, however, the system is a noncausal operator and is thus no longer well-posed.
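The loss of causality can likewise be seen numerically. The sketch below (again a discrete-time stand-in, an assumption) evaluates the map u1 ↦ y2 of (3.40) and shows that the output at time t changes when only the future value u1[t+1] changes:

```python
import numpy as np

def y2_from_u1(u1, alpha):
    """y2[t] = (alpha - 1) * u1[t] * sech(u1[t+1]), a discrete analogue of (3.40)."""
    u1 = np.asarray(u1, float)
    return (alpha - 1.0) * u1[:-1] / np.cosh(u1[1:])  # uses u1 one step ahead

# Two inputs that agree at t = 0 but differ at t = 1:
ua, ub = [1.0, 0.0, 0.0], [1.0, 3.0, 0.0]
# For alpha = 1 the map is identically zero, hence trivially causal:
assert np.allclose(y2_from_u1(ua, 1.0), 0.0)
# For alpha < 1 the outputs differ already at t = 0: the operator is noncausal.
assert y2_from_u1(ua, 0.5)[0] != y2_from_u1(ub, 0.5)[0]
```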

To summarize these examples, perturbing a well-posed system can cause it to remain causal but no longer be an operator, or to remain an operator but no longer be causal. It is also possible to lose both properties (e.g., just take an appropriate Cartesian product of the above two examples).

3.5.2 Regular Systems

The homotopy method for stability analysis relies on the continuity (in the gap
topology) of the interconnection mapping [· , ·]. In this section we show that such
continuity holds when the first argument is regular as defined below. As an example,
we will see that the system G + in the classical interconnection of Fig. 3.5 is always
regular. As argued in [30], however, this classical interconnection may not be the best
way to incorporate  as model uncertainty; indeed, it seems better suited to the case
in which  is a controller and u represents additive actuator and sensor noise. The
following definition of a regular system allows us to extend the homotopy approach
to certain more general interconnections of the type shown in Fig. 3.4.
Definition 3.7 An IO system (I, O, Σ) is r-regular when for each (u, y) ∈ Σ there exists an IO system (O, I, Ψ) such that id ⊆ Σ ∘ Ψ and g0(Ψ − (y, u)) ≤ r. We let Regr(I, O) denote the set of all such r-regular systems, and we say that Σ is regular when it is r-regular for some r ≥ 0.
In this definition, each system Ψ is a “right inverse” of Σ. Thus a regular system is one for which there is a “stable right inverse” centered at each point in its graph. In particular, if Σ is regular and nonempty, then at least one Ψ exists, and thus Δ ≠ ∅ implies [Σ, Δ] ≠ ∅. Note that G+ in (3.36) is 1-regular for any system G; indeed, given (u, y) ∈ G+ choose Ψ = {(ȳ, ū) : u + y = ū + ȳ}. The following is a characterization of regular linear systems, which are essentially those having stable right inverses:
Lemma 3.15 If (I, O, Σ) is linear and there exists a stable univalent linear system (O, I, Ψ) such that id ⊆ Σ ∘ Ψ, then Σ is regular with r = g(Ψ).
68 R. A. Freeman

Lemma 3.16 Let I and O be signal spaces. Then for each r ≥ 0, the map [· , ·] is uniformly continuous on Regr(I, O) × P(O).
Note that [23, Lemma 2], which states that the classical interconnection shown in
Fig. 3.5 is continuous in G and Δ, follows from Lemmas 3.13 and 3.16 together with
the fact that G + is always 1-regular. We next show that regularity also provides a
way to verify that the interconnection [Σ, Δ] is controllable to zero.
Lemma 3.17 If Σ is regular and if Σ and Δ are controllable (resp. finely controllable) to zero, then [Σ, Δ] is controllable (resp. finely controllable) to zero.

3.5.3 Integral Quadratic Constraints

When Φα = [Gα+, Δα] for parameterized systems Gα and Δα, we can use integral quadratic constraints (IQCs) to verify condition (iv’) in Theorem 3.2. Let (X, T, S) be a signal space, and let ‖·‖s denote the norm (3.5) on its small-signal subspace Xs. Following [23], we say that a functional σ : Xs → R is quadratically continuous when for every ε > 0 there exists C > 0 such that

σ(y) ≤ σ(x) + ε‖x‖s² + C‖x − y‖s²   (3.41)

for all x, y ∈ Xs. A typical choice for σ is the following:


Lemma 3.18 Let U and V be normed spaces over R or C, let A : Xs → U and B : Xs → V be bounded R-linear operators, and let ⟨· , ·⟩ : U × V → R be a bounded R-bilinear functional. Then σ given by σ(x) = ⟨Ax, Bx⟩ is quadratically continuous.
Suppose T = R or T = R≥0 (for continuous time), and suppose U and V are both equal to the space of L2 functions from T to Fⁿ. Then a particular choice for the bilinear functional ⟨· , ·⟩ in Lemma 3.18 is the symmetric one given by

⟨u, v⟩ = Re ∫_{−∞}^{∞} û*(ω) Π(ω) v̂(ω) dω ,   (3.42)

where û and v̂ denote the Fourier transforms of u and v, Π is a C^{n×n} Hermitian-valued function with L∞ entries, and Re denotes the real part. If the small-signal subspace Xs is also L2, then we can take A = B = id in Lemma 3.18 so that

σ(x) = ∫_{−∞}^{∞} x̂*(ω) Π(ω) x̂(ω) dω .   (3.43)

This choice for σ leads to the IQCs considered in [21]. Another choice for A and
B is for both to be of the form x → (x, xδ ) where xδ is the result of delaying the
signal x by the amount δ. This leads to the delay-IQCs discussed in [1, 2]. A third
choice, explored in [8], is for when Xs is the Sobolev space H k and A and B are
given by x → (x, Dx, . . . , D k x), where D denotes the weak derivative operator.
Other choices are possible as well.
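For intuition, (3.43) can be approximated with a DFT in place of the continuous Fourier transform (an approximation and an assumption; the scalar multiplier and normalization below are illustrative, not from the chapter). With the constant multiplier Π(ω) ≡ 1 the quadratic form reduces, by Parseval's identity, to the squared norm of the signal:

```python
import numpy as np

def sigma(x, Pi):
    """Discrete stand-in for (3.43): sum over DFT bins of x_hat* Pi x_hat / N."""
    xhat = np.fft.fft(x)
    return float(np.real(np.sum(np.conj(xhat) * Pi * xhat)) / len(x))

x = np.random.default_rng(1).normal(size=256)
# Parseval check: with Pi identically 1, sigma(x) equals the squared l2 norm of x.
assert np.isclose(sigma(x, np.ones(256)), np.sum(x**2))
```

Frequency-dependent multipliers Π(ω_k) then weight different bands of x̂, which is how the IQCs of [21] encode uncertainty classes.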
The following lemma comes from the proof of [23, Theorem 2].
Lemma 3.19 (Integral quadratic constraints) Let (O, G) and (O, Δ) be systems on a signal space O. Suppose there exist constants ε > 0 and d ≥ 0 and a quadratically continuous map σ : Os → R such that
(i) σ(w) ≤ −2ε‖w‖s² + d for all w ∈ Gs, and
(ii) σ(y) ≥ −d for all y ∈ Δs.
Then gβ([G+, Δ]s) ≤ γ, where β = √(4d/ε) and γ = √(2(1 + C/ε)) with C from (3.41).
Condition (ii) is an IQC for the system Δ, and the constant d is called its defect in
[25]. Condition (i) is called an “inverse graph” IQC in [4]. Indeed, we can consider
Lemma 3.19 as an example of a graph separation result (for example, see [24, 27]
and the references therein).

3.6 Summary

In this chapter we investigated the role of the well-posedness assumption in the homotopy methods of [21, 23]. In particular, we showed that the feedback interconnection need not be defined for all possible inputs; instead, it is sufficient that its domain has a certain controllability property. We also showed how to relax the causality assumption by replacing it with a lower hemicontinuity assumption together with a growth condition. The notions of controllability and causality from [29] were particularly useful in this context.
Many have argued that the finite-gain type of stability we considered here is less
appropriate for nonlinear systems, and that the use of gain functions instead [13, 26,
27] can provide more flexibility. For this reason it is of interest to extend the results
of this chapter to that more general setting.

3.7 Appendix

Proof (Lemma 3.1) Because X is a locally convex Hausdorff space, there exists a separated family S0 = {‖·‖α}α∈I of seminorms on X over an index set I that generates the topology. Let F ⊆ P(I) denote the set of all finite subsets of I, and for each F ∈ F define the seminorm ‖·‖F = Σα∈F ‖·‖α. Then the family S+ = {‖·‖F}F∈F induces the same topology on X that S0 does, and its index set F is directed by the natural preorder ≼. Let T ⊆ F be any transversal of the quotient F/∼ (where ∼ is the equivalence relation defined by ≼), that is, let T be any
set consisting of a single member from each equivalence class in F/∼. Then the subfamily S = {‖·‖t}t∈T of S+ is temporal and induces the same topology on X that S0 does. □

Proof (Lemma 3.2) Suppose that x̄ ∈ Bs(ȳ) ∩ Bt(z̄) for some ȳ, z̄ ∈ X and s, t ∈ T. Choose r ∈ T such that both s ≼ r and t ≼ r, namely, such that ‖·‖s ≤ Cs‖·‖r and ‖·‖t ≤ Ct‖·‖r for some positive constants Cs and Ct. Suppose that x ∈ Br(x̄). Because ‖x̄ − ȳ‖s = ‖x̄ − z̄‖t = 0 we have ‖x − ȳ‖s ≤ ‖x − x̄‖s ≤ Cs‖x − x̄‖r = 0 and ‖x − z̄‖t ≤ ‖x − x̄‖t ≤ Ct‖x − x̄‖r = 0, and hence x ∈ Bs(ȳ) ∩ Bt(z̄). Thus Br(x̄) ⊆ Bs(ȳ) ∩ Bt(z̄), and we conclude that the collection of sets in (3.3) is a base for a topology on X. Moreover, it is straightforward to show that if x̄ ∈ Bt(ȳ) then Bt(x̄) ⊆ Bt(ȳ), which means the sets (3.3) for t ∈ T are a local base at x̄. □

Lemma 3.20 The family of seminorms in Example 3.3 is temporal and thus (X, T , S)
is a signal space.

Proof This is a special case of Example 3.2: if x ∈ X and t ∈ T then the restriction x|At belongs to the normed space Xt = Lᵖ(At) with ‖x‖t = ‖x · 1At‖p = ‖x|At‖, so we define Rt : X → Xt as Rt x = x|At. Property (b) of Example 3.2 holds because the symmetric difference of two distinct members of C has positive measure. Property (c) holds as well: given s, t ∈ T we choose r ∈ T such that As ∪ At ⊆ Ar, and we let Bs and Bt be the restrictions to As and At, respectively. We have left to show that property (a) holds. Suppose x ∈ X is nonzero; then there exists a measurable set A ⊆ E such that μ(A) > 0 and x is never zero on A. Let {Ati}i∈N be a countable subcover of C. Then E = ∪i∈N Ati, which implies 0 < μ(A) ≤ Σi∈N μ(A ∩ Ati). It follows that μ(A ∩ Ati) > 0 for some i ∈ N and thus ‖x‖ti = ‖x · 1Ati‖p ≥ ‖x · 1A∩Ati‖p > 0. In other words, x ∉ ker(Rti). □

Example 3.12 (The countable subcover condition in Example 3.3 cannot be removed) Let I be the real interval [0, 1], let E1 be I together with the Lebesgue
measure, and let E 2 be the set I ∪ {2} together with the counting measure (and having
all sets measurable). Let E = E 1 × E 2 together with the product measure μ such
that the measure μ(A) of a measurable set A is the sum of the Lebesgue measures
of its horizontal sections. Let T be the set of all nonempty finite subsets of I , and
for each t ∈ T define
   
At = ∪c∈t ( {(c, 2)} ∪ ([0, 1] × {c}) ) .   (3.44)

Then C = {At }t∈T satisfies all of the conditions listed in Example 3.3 except for the
countable subcover condition. Let A = I × {2}; then μ(A) = 1, which means 1A is not equivalent to the zero function, but ‖1A‖t = 0 for every t ∈ T, which means S is not separated.

Lemma 3.21 The family of seminorms in Example 3.5 is temporal and thus (X, T , S)
is a signal space.
Proof The family S is separated because if x ∈ X is such that x(τ) ≠ 0 for some τ ∈ Ω, then ‖x‖(0,K) > 0 for any K ∈ C containing τ. Next we show that ≼ is a partial order on T. Suppose (N1, K1), (N2, K2) ∈ T with (N1, K1) ≠ (N2, K2). If K1 ≠ K2, then because K1 and K2 are distinct closed sets there exists x ∈ X that is equal to zero on a neighborhood of one of them (Ki) but takes on a nonzero value on the other one (Kj). Hence ‖x‖(Ni,Ki) = 0 but ‖x‖(Nj,Kj) > 0, and we conclude that these two norms are not equivalent. If K1 = K2 = K but N1 ≠ N2, then choose σ ∈ K and consider the signal x defined as x(τ) = sin(ω(τ1 − σ1) + π/4) for ω > 1, where τ1 and σ1 denote the respective first components of the n-vectors τ and σ.
Then we have

|∂^α x(σ)| = (√2/2) ω^i if α = (i, 0, . . . , 0) for some i ∈ N, and 0 otherwise,   (3.45)

and it follows from (3.6) that

Σ_{i=0}^{N} (√2/2) ω^i ≤ ‖x‖(N,K) ≤ Σ_{i=0}^{N} ω^i   (3.46)

for all N ∈ N. Plugging in N1 and N2 and taking ratios yields

(√2/2) · (ω^{N1+1} − 1)/(ω^{N2+1} − 1) ≤ ‖x‖(N1,K) / ‖x‖(N2,K) ≤ √2 · (ω^{N1+1} − 1)/(ω^{N2+1} − 1) .   (3.47)

If the two norms in (3.47) were equivalent, then their ratio would be bounded both from above and below by positive constants not depending on ω, but we see from (3.47) that this is not the case as ω → ∞ when N1 ≠ N2. Finally, it is clear that ≼ directs T because C is directed by inclusion and if N1 ≤ N2 and K1 ⊆ K2 then ‖·‖(N1,K1) ≤ ‖·‖(N2,K2). □

Lemma 3.22 The signal space in Example 3.5 is controllable to zero. Moreover, it
is finely controllable to zero if and only if all sets in C are finite.

Proof We first identify a class of small signals on this space. Let ĝ be a smooth,
even, real-valued function on Rn that has support within the box B = [−1, 1]n with
an integral over B equal to (2π )n . Then its inverse Fourier transform, given by

g(τ) = (1/(2π)ⁿ) ∫_{Rⁿ} ĝ(ω) e^{jω·τ} dω   (3.48)

for τ ∈ Rn , is a real analytic even Schwartz function with g(0) = 1. For each ε  0 we
define the scaled version gε as gε (τ ) = g(ετ ) for τ ∈ Rn , and we note that g0 ≡ 1. If
ε > 0, then by using the differentiation and scaling properties of the Fourier transform
we obtain
∂^α(τ^β gε(τ))  —F→  (j^{|α|+|β|} ω^α / ε^{|β|+n}) · (∂^β ĝ)(ω/ε)   (3.49)

for any multi-indices α, β ∈ Nn . For each β, let Cβ denote the maximum of |∂ β ĝ|
over B, and note that

∫_{εB} |ω^α| dω = 2ⁿ ε^{|α|+n} / ((α1 + 1) · · · (αn + 1)) ≤ 2ⁿ ε^{|α|+n} .   (3.50)

Thus the inverse Fourier transform formula yields


|∂^α(τ^β gε(τ))| ≤ Cβ ε^{|α|} / (πⁿ ε^{|β|})   (3.51)

for all τ ∈ Rn and α, β ∈ Nn . If ε ∈ (0, 1) then we can take the supremum over τ
and sum over α to obtain
  
Σ_{α∈Nⁿ} sup_{τ∈Rⁿ} |∂^α(τ^β gε(τ))| ≤ Cβ / (πⁿ (1 − ε)ⁿ ε^{|β|})   (3.52)

for all β ∈ Nⁿ. It follows from (3.6) and (3.4) that gε h|Ω (that is, the product gε h restricted to Ω) is a small signal for every polynomial signal h and every parameter ε ∈ (0, 1).
Next fix any x ∈ X, N ∈ N, K ∈ C, and γ > 0. Our goal is to find a polynomial h and a parameter ε ∈ (0, 1) such that ‖x − gε h|Ω‖(N,K) < γ. Because K is compact, we can extend x to all of Rⁿ while preserving its value and the values of all of its partial derivatives on K; in other words, there exists z ∈ C∞(Rⁿ) such that z ≡ x on a neighborhood of K. The seminorm (3.6) extends to Rⁿ as well, so it suffices to find h and ε such that ‖z − gε h‖(N,K) < γ. It follows from [33, Theorem 4] that there
exists a sequence {h k }k∈N of scaled and shifted Bernstein polynomials such that
 
lim_{k→∞} max_{τ∈K} |∂^α z(τ) − ∂^α hk(τ)| = 0   (3.53)

for each α ∈ Nⁿ. Therefore we can choose k sufficiently large so that ‖z − hk‖(N,K) < γ/2. Next, it follows from [9, Proposition 2.9] that the mapping f : R≥0 → R≥0 given by f(ε) = ‖hk − gε hk‖(N,K) is continuous, so because f(0) = 0 there exists ε ∈ (0, 1) such that ‖hk − gε hk‖(N,K) < γ/2. Thus ‖z − gε hk‖(N,K) < γ, and we conclude that X is controllable to zero.
Next suppose all sets in C are finite, and choose x ∈ X, N ∈ N, and K ∈ C as before. Because g defined above is real analytic and g(0) = 1, the set Eτ = {ε ∈ (0, 1) : gε(τ) = 0} is discrete for any τ ∈ Rⁿ. The union ∪τ∈K Eτ is also discrete because K is finite, which means there exists ε ∈ (0, 1) not in this union, i.e., such that gε is nonzero on K. It follows from [18, Theorem 19] that there exists a Hermite interpolating polynomial h that agrees with x/gε and all of its derivatives up to order
N on K. Hence ‖x − gε h|Ω‖(N,K) = 0, and we conclude that X is finely controllable to zero.
Conversely, suppose C contains an infinite set K; then because K is compact it has an accumulation point σ ∈ K. Let Ωσ denote the connected component of Ω that contains σ. Let x ∈ X be the real analytic signal given by x(τ) = e^{τ1} for τ = (τ1, . . . , τn) ∈ Ω, which satisfies x(σ) > 0 and ‖x‖(N,K) ≥ (N + 1)x(σ) for all N ∈ N. Now suppose y ∈ Xs is such that ‖x − y‖(0,K) = 0; then y is real analytic and agrees with x on K, which means y ≡ x on Ωσ. Hence ‖y‖(N,K) ≥ (N + 1)x(σ) for all N ∈ N, but this contradicts the assumption that y is small. We conclude that X is not finely controllable to zero. □
Lemma 3.23 If (A, B) is stabilizable, (A, C) is detectable, and the stacked matrix [E; D] is right-invertible, then the system in Example 3.7 is uniformly finely controllable to zero (with zero controllability bias).
Proof Fix a trajectory x̄ of the system and let ξ be such that ξ̇t = Aξt + B E x̄t, ξ0 = 0, and 0 = Cξt + D x̄t for almost all t ∈ T. We proceed with an idea from [16]. Let the matrix L be
such that A + LC is Hurwitz, and consider the observer given by the equation

ż t = Az t + B E x̄t + L(C z t + D x̄t ) , z 0 = 0 , (3.54)

where z represents the observer state. The error e = ξ − z satisfies ėt = (A + LC)et
with e0 = 0, and therefore et = 0 for all t ∈ T . It follows that

ξ̇t = (A + LC)ξt + (B E + L D)x̄t , ξ0 = 0 (3.55)

for almost all t ∈ T , which means


ξt = ∫₀ᵗ e^{(A+LC)(t−τ)} (B E + L D) x̄τ dτ   (3.56)

for all t ∈ T. Now because A + LC is Hurwitz, there exists a constant c independent of t such that |ξt| ≤ c‖x̄‖t for all t ∈ T. Next, let the matrix K be such that A + BK is Hurwitz and fix t ∈ T. Let F be a right inverse of [E; D], and define the signal x̂ as

x̂τ = x̄τ when τ ≤ t, and x̂τ = F [K; −C] e^{(A+BK)(τ−t)} ξt when τ > t .   (3.57)

It is straightforward to verify that x̂ is again a trajectory of the system. Because A + BK is Hurwitz, there exists a constant κ independent of t such that the L2 norm of x̂ on the interval [t, ∞) is bounded from above by κ|ξt|. It follows that ‖x̂‖s ≤ (1 + cκ)‖x̄‖t, and we conclude that the system is uniformly finely controllable to zero with controllability constant 1 + cκ. □
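A minimal numerical sketch of the tail of the steering signal (3.57), with assumed illustrative data (not from the chapter): scalar A = 0.5, B = C = 1, and a two-component x̂ with E = [1, 0] and D = [0, 1], so that the stacked matrix [E; D] is the identity and its right inverse F is also the identity:

```python
# Assumed illustrative data; the feedback gain K makes A + B*K Hurwitz.
A, B, C, K = 0.5, 1.0, 1.0, -1.5      # A + B*K = -1
dt, steps = 1e-3, 10_000
xi = 2.0                              # state reached at the switching time t
energy = 0.0
for _ in range(steps):                # Euler integration over tau > t
    x1, x2 = K * xi, -C * xi          # tail of (3.57): x_hat = F [K; -C] xi(tau)
    assert abs(C * xi + x2) < 1e-12   # output constraint 0 = C*xi + D*x_hat holds
    energy += dt * (x1**2 + x2**2)    # accumulates the squared L2 norm of the tail
    xi += dt * (A * xi + B * x1)      # xi' = A*xi + B*E*x_hat = (A + B*K)*xi
assert abs(xi) < 1e-3                 # exponential decay, so x_hat is L2 on [t, inf)
```

The decay rate A + BK = −1 makes the tail's L2 norm proportional to |ξt|, which is the constant κ used in the proof above.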
Proof (Lemma 3.3) The result holds trivially when Σ is either empty or not ϕ-stable, so we assume that Σ is nonempty and ϕ-stable. Let β be such that gβ(Σ) < ∞, and
suppose (u, y) ∈ Σ. Then (cu, cy) ∈ Σ for all c ∈ F, and thus (3.18) yields

‖y‖t ≤ gβ(Σ) · ‖u‖ϕ(t) + β/|c|   (3.58)

for all nonzero c ∈ F and all t ∈ T. Taking |c| → ∞ gives ‖y‖t ≤ gβ(Σ) · ‖u‖ϕ(t) for all t ∈ T and (u, y) ∈ Σ. Thus from the definition (3.17) we obtain g0(Σ) ≤ gβ(Σ). □

Proof (Lemma 3.4) Suppose Σ is a ϕ-stable linear IO system, and let β ≥ 0 be such that gβ(Σ) < ∞. We first show that Σ is ϕ-flhc. Pick any (ū, ȳ) ∈ Σ and t ∈ T, and let u ∈ dom(Σ) be such that ‖u − ū‖ϕ(t) = 0. Choose any y ∈ Σ[u]; then by the linearity of Σ we have (cu − cū, cy − cȳ) ∈ Σ for any c ∈ F, and we conclude from (3.18) that |c| · ‖y − ȳ‖t ≤ β for all c ∈ F. Hence ‖y − ȳ‖t = 0. The same argument with u = ū shows that Σ is univalent. □

Proof (Lemma 3.5) Suppose Σ is ϕ1-stable and Λ is ϕ2-stable, and let β be large enough that g^{ϕ1}_β(Σ) and g^{ϕ2}_β(Λ) are both finite. Choose (u, y) ∈ Λ ∘ Σ; then from (3.20) there exists x ∈ X such that (u, x) ∈ Σ and (x, y) ∈ Λ. It follows from (3.18) that

‖y‖t ≤ g^{ϕ2}_β(Λ) · ‖x‖ϕ2(t) + β
 ≤ g^{ϕ2}_β(Λ) g^{ϕ1}_β(Σ) · ‖u‖ϕ1(ϕ2(t)) + g^{ϕ2}_β(Λ)β + β   (3.59)

for all t ∈ T. If we define β̄ = g^{ϕ2}_β(Λ)β + β then (3.59) implies

g^{ϕ1∘ϕ2}(Λ ∘ Σ) ≤ g^{ϕ1∘ϕ2}_{β̄}(Λ ∘ Σ) ≤ g^{ϕ2}_β(Λ) g^{ϕ1}_β(Σ) .   (3.60)

Thus Λ ∘ Σ is (ϕ1 ∘ ϕ2)-stable, and because (3.60) holds for all β large enough we obtain g^{ϕ1∘ϕ2}(Λ ∘ Σ) ≤ g^{ϕ2}(Λ) g^{ϕ1}(Σ). Setting β = 0 throughout yields the result for g0. □

Proof (Lemma 3.7) Pick any (ū, ȳ) ∈ Σ, t ∈ T, and ε > 0, and let δ > 0 be as in Definition 3.5. We can choose δ ≤ ε without loss of generality. Because dom(Σ) is uniformly controllable to zero, there exists û ∈ Bϕ(t),δ(ū) ∩ dom(Σ)s such that ‖û‖s ≤ K‖ū‖ϕ(t) + b + ε. Because Σ is ϕ-lhc there exists ŷ0 ∈ Bt,ε(ȳ) ∩ Σ[û]. The same statements hold when Σ is only ϕ-wlhc but dom(Σ) is uniformly finely controllable to zero if we take δ = 0. In either case, because Σ is minimally stable there exists ŷ ∈ Bt,ε(ȳ) ∩ Σ[û]s. Hence

‖ȳ‖t ≤ ‖ŷ‖t + ‖ŷ − ȳ‖t ≤ gβ(Σs) · ‖û‖s + β + ε
 ≤ K gβ(Σs) · ‖ū‖ϕ(t) + β + b gβ(Σs) + ε gβ(Σs) + ε .   (3.61)

This holds for every ε > 0, which means ‖ȳ‖t ≤ K gβ(Σs) · ‖ū‖ϕ(t) + β̄. Because (ū, ȳ) ∈ Σ and t ∈ T were arbitrary, we conclude that gβ̄(Σ) ≤ K gβ(Σs). □
Proof (Lemma 3.8) Pick any (ū, ȳ) ∈ Σ, t ∈ T, and ε > 0, and let Y = Bt,ε(ȳ). Because Σ is lhc, there exists an open neighborhood U of ū such that U ∩ dom(Σ) ⊆ Σ−1[Y]. Because dom(Σ) is controllable to zero, there exists u ∈ Is ∩ U ∩ dom(Σ) ∩ Bϕ(t),ε(ū). The same statements hold when Σ is only wlhc but dom(Σ) is finely controllable to zero if we take U to be finely open instead. In either case we have u ∈ Σ−1[Y], namely, there exists ŷ ∈ Y ∩ Σ[u], and because Σ is minimally stable there exists y ∈ Y ∩ Σ[u]s. Pick any β ≥ 0. If gβ(Σs) = ∞ then also gβ(Σ) = ∞, so suppose gβ(Σs) < ∞. Then (3.18) implies ‖y‖t ≤ gβ(Σs) · ‖u‖ϕ(t) + β, and thus

‖ȳ‖t ≤ ‖y‖t + ‖y − ȳ‖t ≤ gβ(Σs) · ‖u‖ϕ(t) + β + ε
 ≤ gβ(Σs) · ‖ū‖ϕ(t) + gβ(Σs) · ‖u − ū‖ϕ(t) + β + ε
 ≤ gβ(Σs) · ‖ū‖ϕ(t) + ε gβ(Σs) + β + ε .   (3.62)

This holds for all ε > 0, which means ‖ȳ‖t ≤ gβ(Σs) · ‖ū‖ϕ(t) + β. Because this is true for all (ū, ȳ) ∈ Σ and t ∈ T, we conclude from (3.17) that gβ(Σ) ≤ gβ(Σs). □

Proof (Lemma 3.9) Because S is separated we see that q(x, y) = 0 if and only if x = y, so we have left to prove that the triangle inequality holds. For any x, y, z ∈ X, ε > 0, and t ∈ T we have

‖x − z‖t/(ε + ‖x‖t) ≤ ‖x − y‖t/(ε + ‖x‖t) + ((ε + ‖y‖t)/(ε + ‖x‖t)) · (‖y − z‖t/(ε + ‖y‖t)) .   (3.63)

We also have
(ε + ‖y‖t)/(ε + ‖x‖t) ≤ (ε + ‖x − y‖t + ‖x‖t)/(ε + ‖x‖t) = 1 + ‖x − y‖t/(ε + ‖x‖t) .   (3.64)

We add 1 to both sides of (3.63) and use (3.64) to obtain

1 + ‖x − z‖t/(ε + ‖x‖t) ≤ (1 + ‖x − y‖t/(ε + ‖x‖t)) · (1 + ‖y − z‖t/(ε + ‖y‖t)) .   (3.65)

We take the supremum of both sides over ε and t to obtain


  
1 + d(x, z) ≤ (1 + d(x, y)) · (1 + d(y, z)) .   (3.66)

Finally, we take the logarithm of both sides of (3.66) to obtain q(x, z) ≤ q(x, y) + q(y, z) as desired. □
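The logarithm trick above can be spot-checked numerically in a finite-dimensional stand-in (an assumption: a single norm replaces the family of seminorms), where d(x, y) = ‖x − y‖/‖x‖ is the ε → 0 limit of ‖x − y‖/(ε + ‖x‖):

```python
import numpy as np

rng = np.random.default_rng(2)

def d(x, y):
    # asymmetric gap-like quotient; a quasimetric, not a metric
    return np.linalg.norm(x - y) / np.linalg.norm(x)

def q(x, y):
    return np.log1p(d(x, y))          # q = log(1 + d)

for _ in range(1000):
    x, y, z = rng.normal(size=(3, 4))
    # the multiplicative inequality (3.66) ...
    assert 1 + d(x, z) <= (1 + d(x, y)) * (1 + d(y, z)) + 1e-12
    # ... becomes the ordinary triangle inequality after taking logarithms
    assert q(x, z) <= q(x, y) + q(y, z) + 1e-12
```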

Proof (Lemma 3.10) It follows from (3.23) that

‖x − y‖t ≤ d(x, y) · ‖x‖t   (3.67)


for all t ∈ T . Hence

‖x‖t ≤ ‖x − y‖t + ‖y‖t ≤ d(x, y) · ‖x‖t + ‖y‖t   (3.68)

and therefore

(1 − d(x, y)) · ‖x‖t ≤ ‖y‖t   (3.69)

for all t ∈ T . Because 1 − d(x, y) > 0, it follows from (3.67) and (3.69) that

‖x − y‖t ≤ (d(x, y)/(1 − d(x, y))) · ‖y‖t ≤ (d(x, y)/(1 − d(x, y))) · (ε + ‖y‖t)   (3.70)

for all ε > 0 and t ∈ T , and the result follows. 

Proof (Lemma 3.11) When Δ = ∅ we have g(Δ) = g0(Δ) = 0 and thus the result holds. Hence we assume that Δ is nonempty. Let β ≥ 0 be large enough that d(Σ, Δ) < δ, where δ = (2gβ(Σ) + 2)^{−1}. Let (u, y) ∈ Δ; then there exists (ū, ȳ) ∈ Σ such that

‖u − ū‖t + ‖y − ȳ‖t ≤ δ‖u‖t + δ‖y‖t   (3.71)

for all t ∈ T. It follows that

‖y‖t ≤ ‖ȳ‖t + ‖y − ȳ‖t ≤ gβ(Σ) · ‖ū‖t + β + ‖y − ȳ‖t
 ≤ gβ(Σ) · ‖u‖t + (gβ(Σ) + 1)(‖u − ū‖t + ‖y − ȳ‖t) + β
 ≤ (gβ(Σ) + 1/2) · ‖u‖t + (1/2)‖y‖t + β   (3.72)

for all t ∈ T, and solving for ‖y‖t gives ‖y‖t ≤ (2gβ(Σ) + 1) · ‖u‖t + 2β for all t ∈ T. This holds for all (u, y) ∈ Δ and t ∈ T, and we conclude that g2β(Δ) ≤ 2gβ(Σ) + 1. Because this holds for all β sufficiently large, we also have g(Δ) ≤ 2g(Σ) + 1. The same argument with β = 0 yields the result for g0. □

Proof (Lemma 3.12) Let β be large enough such that both gβ(Σs) < ∞ and d(Σ, Δ) < ρ^{−1}, where ρ = μgβ(Σs) + μL. Let (u, y) ∈ Δ be such that u ∈ Is, and let Y ⊆ O be any open neighborhood of y. By assumption Gμ is dense in Δ[u], which means there exists y̆ ∈ Y ∩ Δ[u] ∩ Gμ. If we choose d such that d(Σ, Δ) < d < ρ^{−1}, then by the definition of d there exists (ū, ȳ) ∈ Σ such that

‖u − ū‖s + ‖y̆ − ȳ‖s ≤ d‖u‖s + d‖y̆‖s   (3.73)

for all s ∈ T. Fix t ∈ T and ε > 0, and let δ > 0 be as in Definition 3.5. We can choose δ ≤ ε without loss of generality. Because dom(Σ) is controllable to zero, there exists û ∈ Bϕ(t),δ(ū) ∩ dom(Σ)s, and because Σ is ϕ-lhc there exists ŷ0 ∈ Bt,ε(ȳ) ∩ Σ[û].
The same statements hold when Σ is only ϕ-wlhc but dom(Σ) is finely controllable to zero if we take δ = 0. In either case, because Σ is minimally stable there exists ŷ ∈ Bt,ε(ȳ) ∩ Σ[û]s. We are now ready to bound ‖y̆‖ϕ(t). Because y̆ ∈ Gμ, there exists χ ≥ 0 independent of t such that

‖y̆‖ϕ(t) ≤ μ‖y̆‖t + χ ≤ μ‖ŷ‖t + μ‖y̆ − ȳ‖t + μ‖ŷ − ȳ‖t + χ
 ≤ μgβ(Σs) · ‖û‖ϕ(t) + μL‖y̆ − ȳ‖ϕ(t) + μ(ε + β) + χ
 ≤ μgβ(Σs) · ‖u‖ϕ(t) + μgβ(Σs) · ‖û − ū‖ϕ(t) + μgβ(Σs) · ‖u − ū‖ϕ(t)
  + μL‖y̆ − ȳ‖ϕ(t) + μ(ε + β) + χ
 ≤ (μgβ(Σs) + ρd) · ‖u‖ϕ(t) + ρd‖y̆‖ϕ(t) + μ(εgβ(Σs) + ε + β) + χ ,   (3.74)

where we have used (3.73) with s = ϕ(t). Because ρd < 1 we can solve for ‖y̆‖ϕ(t):

‖y̆‖ϕ(t) ≤ (1 − ρd)^{−1} [ (μgβ(Σs) + ρd) · ‖u‖ϕ(t) + μ(εgβ(Σs) + ε + β) + χ ] .   (3.75)

Now ‖y̆‖t ≤ L‖y̆‖ϕ(t) and ‖u‖ϕ(t) ≤ ‖u‖s, so we have

‖y̆‖t ≤ L(1 − ρd)^{−1} [ (μgβ(Σs) + ρd) · ‖u‖s + μ(εgβ(Σs) + ε + β) + χ ] .   (3.76)

The right-hand side of (3.76) is independent of t, and it follows that y̆ ∈ Os. Thus for every u ∈ dom(Δ)s, every y ∈ Δ[u], and every open neighborhood Y of y we have found y̆ ∈ Y ∩ Δ[u]s. In other words, Δ is minimally stable. □

Proof (Theorem 3.1) Let η = (μγ + μL)^{−1}, where L is a look-ahead constant for Δ. Because the interval [0, 1] is compact, it follows from (i) that the map α ↦ Φα is uniformly continuous; hence there exists δ > 0 such that |α − ᾱ| < δ implies d(Φα, Φᾱ) < η. Choose an integer N > 1/δ and define αi = i/N for i = 0, 1, . . . , N. Note that α0 = 0, so Φα0 is minimally stable from (v). It follows from (ii)–(iv) and Lemma 3.12 that the minimal stability of Φαi implies the minimal stability of Φαi+1, so by induction we conclude that Φ1 is minimally stable. We then apply Lemma 3.8. □

Proof (Theorem 3.2) We follow the proof of Theorem 3.1 but with η = (μγK + μL)^{−1}. For the induction step, it follows from (ii), (iii’)–(iv’), and Lemmas 3.7 and 3.12 that the minimal stability of Φαi implies the minimal stability of Φαi+1. We conclude that Φ1 is minimally stable, and then we apply Lemma 3.7. □

Lemma 3.24 The mapping α ↦ Φα in Example 3.9 is uniformly continuous.

Proof Note that Φα = [Σ, Δα], where Σ and Δα are the systems

Σ = { (u, y) ∈ I ⊕ O : y1 = u1 and y2 = u2 + h ∗ u2 }   (3.77)
Δα = { (y1, y2) ∈ O : y1 = (1 − α)y2 } .   (3.78)

We first show that the mapping α ↦ Δα is uniformly continuous. Pick α, ᾱ ∈ [0, 1], let (y1, y2) ∈ Δα, and define ȳ2 = y2 and ȳ1 = (1 − ᾱ)y2. Then (ȳ1, ȳ2) ∈ Δᾱ and ‖y1 − ȳ1‖t = |α − ᾱ| · ‖y2‖t for all t ∈ T, and it follows that d⃗(Δα, Δᾱ) ≤ |α − ᾱ|. By reversing the roles of α and ᾱ we obtain d(Δα, Δᾱ) ≤ |α − ᾱ|. The uniform continuity of the mapping α ↦ Φα then follows from Lemmas 3.15 and 3.16. □
Proof (Lemma 3.13) Choose γ > 0 and let G1, G2 ⊆ O be such that d(G1, G2) < γ̄, where γ̄ = γ/2. If either Gi is empty, then both are empty and thus d(G1+, G2+) = d(∅, ∅) = 0 < γ. Hence we assume that the Gi are both nonempty, which implies that the Gi+ are also both nonempty. Fix (ū, ȳ) ∈ G1+ and define w̄ = ū + ȳ so that w̄ ∈ G1. Because d(G1, G2) < γ̄, there exists w ∈ G2 such that ‖w − w̄‖t ≤ γ̄‖w̄‖t for all t ∈ T. Define u = w − ȳ so that (u, ȳ) ∈ G2+ and ‖(u, ȳ) − (ū, ȳ)‖t = ‖u − ū‖t = ‖w − w̄‖t for all t ∈ T. Then we have ‖(u, ȳ) − (ū, ȳ)‖t ≤ γ̄‖(ū, ȳ)‖t for all t ∈ T, which means d⃗(G1+, G2+) ≤ γ̄ < γ. Reversing the roles of the Gi gives d⃗(G2+, G1+) < γ, and we conclude that d(G1+, G2+) < γ. □
Proof (Lemma 3.14) We first suppose that G and O are controllable to zero. Let (ū, ȳ) ∈ G+ so that w̄ ∈ G, where w̄ = ū + ȳ. Fix t ∈ T and ε > 0, and let δ = ε/4. Because G and O are both controllable to zero, there exist w ∈ Gs and u ∈ Os such that ‖w − w̄‖t ≤ δ and ‖u − ū‖t ≤ δ. Define y = w − u so that (u, y) ∈ (G+)s. Then ‖y − ȳ‖t ≤ ‖w − w̄‖t + ‖u − ū‖t ≤ 2δ, and it follows that ‖(u, y) − (ū, ȳ)‖t ≤ 3δ < ε. Thus (u, y) ∈ Bt,ε(ū, ȳ), and because t ∈ T and ε > 0 were arbitrary we conclude that G+ is controllable to zero. If G and O are both finely controllable to zero, then a similar argument with δ = 0 shows that G+ is finely controllable to zero as well. □
Lemma 3.25 The mapping α ↦ Φα in Example 3.10 is uniformly continuous.
Proof We first show that the mapping α ↦ Δα is uniformly continuous. Pick α, ᾱ ∈ [0, 1], let (y1, y2) ∈ Δα, and define ȳ2 = y2 and

ȳ1t = (1 − ᾱ)y2t when 0 < y1t < (1 − ᾱ)y2t, and ȳ1t = y1t otherwise,   (3.79)

for all t ∈ T. Then (ȳ1, ȳ2) ∈ Δᾱ and

|y1t − ȳ1t| = (1 − ᾱ)y2t − y1t when 0 < y1t < (1 − ᾱ)y2t, and |y1t − ȳ1t| = 0 otherwise,   (3.80)

for all t ∈ T. It follows from (3.37) that

(1 − ᾱ)y2t − y1t = (1 − α)y2t − y1t + (α − ᾱ)y2t ≤ (α − ᾱ)y2t   (3.81)
for all t ∈ T such that y1t > 0, and thus from (3.80) we obtain |y1t − ȳ1t| ≤ |α − ᾱ| · |y2t| for all t ∈ T. Therefore ‖y1 − ȳ1‖t + ‖y2 − ȳ2‖t ≤ |α − ᾱ| · ‖y2‖t for all t ∈ T, and it follows that d⃗(Δα, Δᾱ) ≤ |α − ᾱ|. By reversing the roles of α and ᾱ we obtain d(Δα, Δᾱ) ≤ |α − ᾱ|. The uniform continuity of the mapping α ↦ Φα then follows from Lemmas 3.13 and 3.16 together with the fact that G+ is 1-regular. □
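The projection (3.79) and the bound (3.80)–(3.81) can be exercised numerically (a discrete-time, finite-horizon stand-in — an assumption — with illustrative parameter values):

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, alpha_bar = 0.7, 0.4           # assumed parameter values with alpha > alpha_bar

def in_delta(y1, y2, a):
    # membership test from (3.37): y1[t] >= (1-a)*y2[t] or y1[t] <= 0, for all t
    return bool(np.all((y1 >= (1 - a) * y2) | (y1 <= 0)))

def project(y1, y2, a_bar):
    # the modification (3.79): raise y1[t] to (1-a_bar)*y2[t] where it falls short
    mask = (y1 > 0) & (y1 < (1 - a_bar) * y2)
    return np.where(mask, (1 - a_bar) * y2, y1)

for _ in range(200):
    y1, y2 = rng.normal(size=(2, 8))
    # force (y1, y2) to satisfy the alpha-constraint first:
    y1 = np.where((y1 > 0) & (y1 < (1 - alpha) * y2), (1 - alpha) * y2, y1)
    assert in_delta(y1, y2, alpha)
    y1_bar = project(y1, y2, alpha_bar)
    assert in_delta(y1_bar, y2, alpha_bar)   # the projected pair satisfies the new constraint
    # the pointwise estimate from (3.80)-(3.81):
    assert np.all(np.abs(y1 - y1_bar) <= abs(alpha - alpha_bar) * np.abs(y2) + 1e-12)
```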

Lemma 3.26 The mapping α ↦ Φα in Example 3.11 is uniformly continuous.

Proof We first show that the mapping α ↦ Δα is uniformly continuous. Pick α, ᾱ ∈ [0, 1], let (y1, y2) ∈ Δα, and define ȳ1 = y1 and

ȳ2t = (1 − ᾱ)y1t · sech(y1,t+1 ) (3.82)

for all t ∈ T. Then (ȳ1, ȳ2) ∈ Δᾱ and

|y2t − ȳ2t| = |α − ᾱ| · |y1t| · sech(y1,t+1) ≤ |α − ᾱ| · |y1t|   (3.83)

for all t ∈ T . The rest of the proof is similar to the end of the proof of
Lemma 3.25. 

Proof (Lemma 3.15) Because id ⊆ Σ ∘ Ψ we see that Ψ must be an operator. Given (u, y) ∈ Σ let Θ = Ψ + (y, u). Pick ȳ ∈ O and let ū = u + Ψ[ȳ − y] so that (ȳ, ū) ∈ Θ. Now id ⊆ Σ ∘ Ψ so there exists x such that (ȳ − y, x) ∈ Ψ and (x, ȳ − y) ∈ Σ. But (ȳ − y, ū − u) ∈ Ψ and Ψ is univalent so x = ū − u. Hence (ū − u, ȳ − y) ∈ Σ and (u, y) ∈ Σ, and because Σ is linear we have (ū, ȳ) ∈ Σ. This means (ȳ, ȳ) ∈ Σ ∘ Θ. □

Proof (Lemma 3.16) Fix γ > 0 and let Σ1, Σ2 ∈ Regr(I, O) and Δ1, Δ2 ⊆ O be such that both d(Σ1, Σ2) < γ̄ and d(Δ1, Δ2) < γ̄, where γ̄ = γ/(2r + 3). If either Σi is empty, then both are empty and thus d([Σ1, Δ1], [Σ2, Δ2]) = d(∅, ∅) = 0 < γ. Likewise, if either Δi is empty then again d([Σ1, Δ1], [Σ2, Δ2]) = 0 < γ. Therefore we assume that the Σi and Δi are all nonempty, which means [Σ1, Δ1] and [Σ2, Δ2] are also nonempty.
Fix (ū, ȳ) ∈ [Σ1, Δ1]. Because d⃗(Σ1, Σ2) < γ̄ there exists (u, y) ∈ Σ2 such that

‖u − ū‖t + ‖y − ȳ‖t ≤ γ̄‖ū‖t + γ̄‖ȳ‖t   (3.84)

for all t ∈ T. Likewise, because d⃗(Δ1, Δ2) < γ̄ there exists ŷ ∈ Δ2 such that

‖ŷ − ȳ‖t ≤ γ̄‖ȳ‖t   (3.85)

for all t ∈ T. By assumption Σ2 ∈ Regr(I, O), and it follows from Definition 3.7 that there exists an IO system (O, I, Ψ2) such that id ⊆ Σ2 ∘ Ψ2 and g0(Ψ2 − (y, u)) ≤ r. Therefore (ŷ, ŷ) ∈ Σ2 ∘ Ψ2, which means there exists û ∈ I such that (ŷ, û) ∈ Ψ2 and (û, ŷ) ∈ Σ2, and in particular (û, ŷ) ∈ [Σ2, Δ2]. We also have (ŷ − y, û − u) ∈ Ψ2 − (y, u), and it follows that
‖û − u‖t ≤ r‖ŷ − y‖t ≤ r‖ŷ − ȳ‖t + r‖y − ȳ‖t ≤ γ̄r‖ȳ‖t + r‖y − ȳ‖t   (3.86)

for all t ∈ T. Therefore

‖û − ū‖t + ‖ŷ − ȳ‖t ≤ ‖û − u‖t + ‖u − ū‖t + γ̄‖ȳ‖t
 ≤ γ̄(r + 1)‖ȳ‖t + ‖u − ū‖t + r‖y − ȳ‖t
 ≤ 2γ̄(r + 1)(‖ū‖t + ‖ȳ‖t)   (3.87)

for all t ∈ T. Thus d⃗([Σ1, Δ1], [Σ2, Δ2]) ≤ 2γ̄(r + 1) < γ, and reversing the roles of the Σi and Δi gives d⃗([Σ2, Δ2], [Σ1, Δ1]) < γ. Therefore d([Σ1, Δ1], [Σ2, Δ2]) < γ. □

Proof (Lemma 3.17) Suppose Σ and Δ are both controllable to zero. Let (ū, ȳ) ∈ [Σ, Δ], let r ≥ 0 be such that Σ is r-regular, fix t ∈ T and ε > 0, and let δ = ε/(2r + 3). Because Σ is controllable to zero there exists (u, y) ∈ Σs such that ‖u − ū‖t + ‖y − ȳ‖t ≤ δ. Likewise, because Δ is controllable to zero there exists ŷ ∈ Δs such that ‖ŷ − ȳ‖t ≤ δ. Let Ψ be as in Definition 3.7; then because id ⊆ Σ ∘ Ψ there exists û ∈ Ψ[ŷ] such that (û, ŷ) ∈ Σ, and because g0(Ψ − (y, u)) ≤ r we have

‖û − u‖s ≤ r‖ŷ − y‖s   (3.88)

for all s ∈ T. Taking the supremum of both sides of (3.88) over s gives ‖û − u‖s ≤ r‖ŷ − y‖s. It follows that û ∈ Is and thus (û, ŷ) ∈ [Σ, Δ]s. Using (3.88) with s = t gives

‖(û, ŷ) − (ū, ȳ)‖t ≤ ‖u − ū‖t + ‖û − u‖t + ‖ŷ − ȳ‖t
 ≤ ‖u − ū‖t + r‖ŷ − y‖t + ‖ŷ − ȳ‖t
 ≤ ‖u − ū‖t + (r + 1)‖ŷ − ȳ‖t + r‖y − ȳ‖t
 ≤ 2δ(r + 1) < ε .   (3.89)

Thus (û, ŷ) ∈ Bt,ε(ū, ȳ), and because t ∈ T and ε > 0 were arbitrary we conclude that [Σ, Δ] is controllable to zero. If Σ and Δ are both finely controllable to zero, then a similar argument with δ = 0 shows that [Σ, Δ] is finely controllable to zero. □


Proof (Lemma 3.18) We let ‖·‖ denote the operator norms for A and B, and because ⟨· , ·⟩ is bounded there exists M ≥ 0 such that ⟨Ax, By⟩ ≤ M‖A‖ · ‖B‖ · ‖x‖s · ‖y‖s for all x, y ∈ Xs. Therefore
σ(y) − σ(x) = ⟨Ay, By⟩ − ⟨Ax, Bx⟩
 = ⟨A(y − x), By⟩ + ⟨Ax, B(y − x)⟩
 = ⟨A(y − x), B(y − x)⟩ + ⟨A(y − x), Bx⟩ + ⟨Ax, B(y − x)⟩
 ≤ M‖A‖ · ‖B‖ · ‖x − y‖s² + 2M‖A‖ · ‖B‖ · ‖x‖s · ‖x − y‖s
 ≤ M‖A‖ · ‖B‖ · ‖x − y‖s² + (1/ε)M²‖A‖² · ‖B‖² · ‖x − y‖s² + ε‖x‖s² ,   (3.90)

from which we obtain (3.41). □

Proof (Lemma 3.19) From (3.41) there exists C > 0 such that

σ(y) ≤ σ(u + y) + ε‖u + y‖s² + C‖u‖s²   (3.91)

for any u, y ∈ Os. Then from (i)–(ii) we obtain

−d ≤ −ε‖u + y‖s² + d + C‖u‖s²   (3.92)

for any (u, y) ∈ [G+, Δ]s. Therefore

‖y‖s² ≤ 2‖u + y‖s² + 2‖u‖s² ≤ 2(1 + C/ε)‖u‖s² + 4d/ε   (3.93)

for any (u, y) ∈ [G+, Δ]s, and the result follows. □
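The arithmetic in (3.92)–(3.93) can be spot-checked numerically (a finite-dimensional stand-in with assumed constants — an illustration, not the chapter's construction): sample pairs (u, y) whose sum w = u + y satisfies the constraint of (3.92), and verify the resulting gain bound with β = √(4d/ε) and γ = √(2(1 + C/ε)):

```python
import numpy as np

rng = np.random.default_rng(4)
eps, C, d = 0.5, 2.0, 1.0             # assumed illustrative constants
beta = (4 * d / eps) ** 0.5
gamma = (2 * (1 + C / eps)) ** 0.5

checked = 0
for _ in range(1000):
    u, w = rng.normal(size=(2, 6))    # w plays the role of u + y
    if eps * (w @ w) > 2 * d + C * (u @ u):
        continue                      # pair violates (3.92); skip it
    y = w - u
    # the gain bound implied by (3.93): ||y|| <= gamma*||u|| + beta
    assert np.linalg.norm(y) <= gamma * np.linalg.norm(u) + beta + 1e-12
    checked += 1
assert checked > 0                    # the constraint set was actually sampled
```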

References

1. Altshuller, D.: Delay-integral-quadratic constraints and stability multipliers for systems with
MIMO nonlinearities. IEEE Trans. Automat. Contr. 56(4), 738–747 (2011)
2. Altshuller, D.: Frequency Domain Criteria for Absolute Stability. Lecture Notes in Control and
Information Sciences, vol. 432. Springer, Berlin (2013)
3. Aubin, J.P., Frankowska, H.: Set-Valued Analysis. Birkhäuser, Boston (1990)
4. Carrasco, J., Seiler, P.: Conditions for the equivalence between IQC and graph separation
stability results. Int. J. Control 92(12), 2899–2906 (2019)
5. Cobzaş, Ş: Functional Analysis in Asymmetric Normed Spaces. Frontiers in Mathematics,
Birkhäuser (2013)
6. de Does, J., Schumacher, J.M.: Interpretations of the gap topology: a survey. Kybernetika 30(2),
105–120 (1994)
7. Desoer, C.A., Vidyasagar, M.: Feedback Systems: Input-Output Properties. Academic, New
York (1975)
8. Fetzer, M.: From classical absolute stability tests towards a comprehensive robustness analysis.
Ph.D. thesis, Universität Stuttgart (2017)
9. Freeman, R.A., Kokotović, P.V.: Robust Nonlinear Control Design. Modern Birkhäuser Clas-
sics, Birkhäuser (2008)
10. Georgiou, T.T., Smith, M.C.: Robustness analysis of nonlinear feedback systems: An input-
output approach. IEEE Trans. Automat. Contr. 42(9), 1200–1221 (1997)
82 R. A. Freeman

11. Greshnov, A.V.: Distance functions between sets in (q1 , q2 )-quasimetric spaces. Sib. Math. J.
61(3), 417–425 (2020)
12. Hill, D.J., Moylan, P.J.: Dissipative dynamical systems: Basic input-output and state properties.
J. of the Franklin Institute 309(5), 327–357 (1980)
13. Jiang, Z.P., Teel, A.R., Praly, L.: Small-gain theorem for ISS systems and applications. Math.
Control Signals Systems 7, 95–120 (1994)
14. Jones, P.W.: Quasiconformal mappings and extendability of functions in Sobolev spaces. Acta
Math. 147, 71–88 (1981)
15. Khalil, H.K.: Nonlinear Systems, 3rd edn. Prentice Hall, Upper Saddle River (2002)
16. Krichman, M., Sontag, E.D., Wang, Y.: Input-output-to-state stability. SIAM J. Control. Optim.
39(6), 1874–1928 (2001)
17. Liberzon, M.R.: Essays on the absolute stability theory. Autom. Remote. Control. 67(10),
1610–1644 (2006)
18. Lorentz, R.A.: Multivariate Hermite interpolation by algebraic polynomials: A survey. J. Com-
put. Appl. Math. 122, 167–201 (2000)
19. Lur’e, A.I., Postnikov, V.N.: On the theory of stability of control systems. J. Appl. Math. Mech.
8(3), 246–248 (1944)
20. Megretski, A.: KYP lemma for non-strict inequalities and the associated minimax theorem
(2010)
21. Megretski, A., Rantzer, A.: System analysis via integral quadratic constraints. IEEE Trans.
Automat. Contr. 42(6), 819–830 (1997)
22. O’Shea, R.P.: An improved frequency time domain stability criterion for autonomous contin-
uous systems. IEEE Trans. Automat. Contr. 12(6), 725–731 (1967)
23. Rantzer, A., Megretski, A.: System analysis via integral quadratic constraints: Part II. Technical
report ISRN LUTFD2/TFRT–7559–SE, Department of Automatic Control, Lund Institute of
Technology, Sweden (1997)
24. Safonov, M.G.: Stability and Robustness of Multivariable Feedback Systems. MIT Press, Cam-
bridge (1980)
25. Shiriaev, A.S.: Some remarks on “System analysis via integral quadratic constraints”. IEEE
Trans. Automat. Contr. 45(8), 1527–1532 (2000)
26. Sontag, E.D.: Smooth stabilization implies coprime factorization. IEEE Trans. Automat. Contr.
34(4), 435–443 (1989)
27. Teel, A.R.: On graphs, conic relations, and input-output stability of nonlinear feedback systems.
IEEE Trans. Automat. Contr. 41(5), 702–709 (1996)
28. Willems, J.C.: The Analysis of Feedback Systems. The MIT Press, Cambridge (1971)
29. Willems, J.C.: Paradigms and puzzles in the theory of dynamical systems. IEEE Trans. Automat.
Contr. 36(3), 259–294 (1991)
30. Willems, J.C., Takaba, K.: Dissipativity and stability of interconnections. Int. J. Robust Non-
linear Contr. 17(5–6), 563–586 (2007)
31. Yakubovich, V.A.: Necessity in quadratic criterion for absolute stability. Int. J. Robust Nonlinear
Contr. 10(11–12), 889–907 (2000)
32. Yakubovich, V.A.: Popov’s method and its subsequent development. Eur. J. Control. 8(3),
200–208 (2002)
33. Veretennikov, A.Y, Veretennikova, E.V.: On partial derivatives of multivariate Bernstein poly-
nomials. Siberian Adv. Math. 26(4), 294–305 (2016)
34. Zames, G.: On the input-output stability of time-varying nonlinear feedback systems. Part I:
Conditions using concepts of loop gain, conicity, and positivity. IEEE Trans. Automat. Contr.
11, 228–238 (1966)
35. Zames, G., Falb, P.L.: Stability conditions for systems with monotone and slope-restricted
nonlinearities. SIAM J. Control 6(1), 89–108 (1968)
Chapter 4
Design of Heterogeneous Multi-agent
System for Distributed Computation

Jin Gyu Lee and Hyungbo Shim

Abstract A group behavior of heterogeneous multi-agent systems is studied which obeys an "average of individual vector fields" under strong couplings among the agents. Under stability of the averaged dynamics (without asking stability of individual agents), the behavior of the heterogeneous multi-agent system can be estimated by the solution to the averaged dynamics. The idea that follows is to "design" each individual agent's dynamics such that the averaged dynamics performs a desired task. A few applications are discussed, including estimation of the number of agents in a network, distributed least-squares and median solvers, distributed optimization, distributed state estimation, and robust synchronization of coupled oscillators. Since stability of the averaged dynamics makes the initial conditions forgotten as time goes on, these algorithms are initialization-free and suitable for plug-and-play operation. Finally, nonlinear couplings are also considered, which suggests that enforced synchronization gives rise to an emergent behavior of a heterogeneous multi-agent system.

4.1 Introduction

During the last decade, synchronization and collective behavior of multi-agent systems have been actively studied because of numerous applications in diverse areas, e.g., biology, physics, and engineering. Initial studies considered identical agents [1–4], but the interest soon shifted to the heterogeneous case because

J. G. Lee
Department of Engineering, University of Cambridge, Control Group,
Trumpington Street, CB2 1PZ Cambridge, United Kingdom
e-mail: [email protected]
H. Shim (B)
ASRI, Department of Electrical and Computer Engineering, Seoul National University,
Gwanak-ro 1, Gwanak-gu, 08826 Seoul, Korea
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 83


Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5_4

uncertainty, disturbance, and noise are prevalent in practice. In this regard, heterogeneity was mostly considered harmful, something to be suppressed or compensated for. To achieve synchronization, or at least approximate synchronization (with arbitrary precision if possible), in spite of heterogeneity, various methods such as output regulation [5–10], backstepping [11], high-gain feedback [12–15], adaptive control [16], and optimal control [17] have been applied. However, heterogeneity can also be a means to achieve a task collaboratively by different agents playing different roles. From this viewpoint, heterogeneity is something we should design, or, at least, an outcome of distributing a complex computation over individual agents.
This chapter is devoted to investigating the design possibility of heterogeneity.
After presenting a few basic theorems which describe the collective behavior of
multi-agent systems, we exhibit several design examples by employing the theorems
as a toolkit. A feature of the toolkit is that the vector field of the collective behavior
can be assembled from the individual vector fields of each agent when the coupling
strength among the agents is sufficiently large. This process is explained by the
singular perturbation theory. In fact, the assembled vector field is nothing but an
average of the agents’ vector fields, and it appears as the quasi-steady-state subsystem
(or the slow subsystem) when the inverse of the coupling gain is treated as the
singular perturbation parameter. We call the quasi-steady-state subsystem as blended
dynamics for convenience. The behavior of the blended dynamics is an emergent
one if none of the agents has such a vector field. For instance, we will see that we
can construct a heterogeneous network that individuals can estimate the number of
agents in the network without using any global information. Since individuals cannot
access the global information N , this collective behavior cannot be obtained by the
individuals alone. On the other hand, appearance of the emergent behavior when
we enforce synchronization seems intrinsic. We will demonstrate this fact when we
consider nonlinear coupling laws in the later section. Finally, the proposed tool leads
to the claim that the network of a large number of agents is robust against the variation
of individual agents. We will demonstrate it in the case of coupled oscillators.
There are two notions to be considered when a multi-agent system is designed. The plug-and-play operation (or initialization-free operation) is said to be guaranteed for a multi-agent system if the system maintains its task, without resetting all agents, whenever an agent joins or leaves the network. On the other hand, if a new agent joining the network can construct its own dynamics without global information such as the graph structure, other agents' dynamics, and so on, the decentralized design is said to be achieved. It will be seen that the plug-and-play operation is guaranteed for the design examples in this chapter. This is due to the fact that the group behavior of the agents is governed by the blended dynamics, and therefore, as long as the blended dynamics remains stable, individual initial conditions of the agents are forgotten as time goes on. The property of decentralized design is harder to achieve in general. However, for the presented examples, this property is guaranteed to some extent; more specifically, it is achieved except for the coupling gain, which is global information.

4.2 Strong Diffusive State Coupling

We begin with the simplest case of heterogeneous multi-agent systems given by

    ẋ_i = f_i(t, x_i) + k ∑_{j∈N_i} α_ij(x_j − x_i) ∈ R^n,   i ∈ N,        (4.1)

where N := {1, . . . , N } is the set of agent indices with the number of agents, N ,
and Ni is a subset of N whose elements are the indices of the agents that send the
information to agent i. The coefficient αi j is the i j-th element of the adjacency matrix
that represents the interconnection graph. We assume the graph is undirected and
connected in this chapter. The vector field f i is assumed to be piecewise continuous
in t, continuously differentiable with respect to xi , locally Lipschitz with respect
to xi uniformly in t, and f i (t, 0) is uniformly bounded for t. The summation term
in (4.1) is called diffusive coupling, in particular, diffusive state coupling because
states are exchanged among agents through this term. The diffusive state coupling term vanishes when state synchronization is achieved (i.e., x_i(t) = x_j(t), ∀i, j). The coupling strength, or coupling gain, is represented by the positive constant k.
It is immediately seen from (4.1) that synchronization of xi (t)’s to a common
trajectory s(t) is hopeless in general due to the heterogeneity of f i unless it holds
that ṡ(t) = f i (t, s(t)) for all i ∈ N . Instead, with sufficiently large coupling gain
k, we can enforce approximate synchronization. To see this, let us introduce a linear
coordinate change

    s = (1/N) ∑_{i=1}^N x_i ∈ R^n,
    z̃ = (R^T ⊗ I_n) col(x_1, . . . , x_N) ∈ R^{(N−1)n},                    (4.2)

where R ∈ R^{N×(N−1)} is any matrix that satisfies

    ⎡(1/N)1_N^T⎤                  ⎡0  0⎤
    ⎣   R^T    ⎦ L [1_N   R]  =  ⎣0  Λ⎦

with a positive definite matrix Λ ∈ R^{(N−1)×(N−1)}, where 1_N := [1, . . . , 1]^T ∈ R^N and L := D − A is the Laplacian matrix of the graph, in which A = [α_ij] and D = diag(d_i) with d_i = ∑_{j∈N_i} α_ij. By the coordinate change, the multi-agent system is converted into the standard singular perturbation form

    ṡ = (1/N) ∑_{i=1}^N f_i(t, s + (R_i ⊗ I_n)z̃),
    (1/k) dz̃/dt = −(Λ ⊗ I_n)z̃ + (1/k)(R^T ⊗ I_n) col(f_1(t, x_1), . . . , f_N(t, x_N)),     (4.3)

where Ri implies the i-th row of R. From this, it is seen that z̃ quickly becomes
arbitrarily small with arbitrarily large k, and the quasi-steady-state subsystem of
(4.3) is given by
    ṡ = (1/N) ∑_{i=1}^N f_i(t, s),                                        (4.4)

which we call blended dynamics.1 By noting that xi = s + (Ri ⊗ In )z̃, it is seen that
the behavior of the multi-agent system (4.3) can be approximated by the blended
dynamics with some kind of stability of the blended dynamics (and with sufficiently
large k) as follows:
Theorem 4.1 ([14, 19]) Assume that the blended dynamics (4.4) is contractive.² Then, for any compact set K ⊂ R^{nN} and for any η > 0, there exists k* > 0 such that, for each k > k* and col(x_1(t_0), . . . , x_N(t_0)) ∈ K, the solution to (4.1) exists for all t ≥ t_0 and satisfies

    lim sup_{t→∞} ‖x_i(t) − s(t)‖ ≤ η,   ∀i ∈ N.

Theorem 4.2 ([15, 19]) Assume that there is a nonempty compact set A_b ⊂ R^n that is uniformly asymptotically stable for the blended dynamics (4.4). Let D_b ⊃ A_b be an open subset of the domain of attraction of A_b, and let³

    D_x := { 1_N ⊗ s + w : s ∈ D_b, w ∈ R^{nN} such that (1_N^T ⊗ I_n)w = 0 }.

Then, for any compact set K ⊂ D_x ⊂ R^{nN} and for any η > 0, there exists k* > 0 such that, for each k > k* and col(x_1(t_0), . . . , x_N(t_0)) ∈ K, the solution to (4.1) exists for all t ≥ t_0 and satisfies

    lim sup_{t→∞} ‖x_i(t) − x_j(t)‖ ≤ η  and  lim sup_{t→∞} ‖x_i(t)‖_{A_b} ≤ η,   ∀i, j ∈ N.    (4.5)

If, in addition, A_b is locally exponentially stable for the blended dynamics (4.4) and f_i(t, s) = f_j(t, s), ∀i, j ∈ N, for each s ∈ A_b and t, then we have more than (4.5):

    lim_{t→∞} ‖x_i(t) − x_j(t)‖ = 0  and  lim_{t→∞} ‖x_i(t)‖_{A_b} = 0,   ∀i, j ∈ N.

We emphasize that the required stability in the above theorems is only for the blended dynamics (4.4), not for the individual agent dynamics ẋ_i = f_i(t, x_i). A group of unstable and stable agents may end up with a stable blended dynamics so that the

¹ A more appropriate name could be "averaged dynamics," which may, however, confuse the reader with the averaged dynamics in the well-known averaging theory [18] that deals with time averages.
² ẋ = f(t, x) is contractive if there exists Θ > 0 such that Θ(∂f/∂x)(t, x) + (∂f/∂x)(t, x)^T Θ ≤ −I for all x and t [20].
³ The condition for w in D_x can be understood by recalling that col(x_1, . . . , x_N) = 1_N ⊗ s + (R ⊗ I_n)z̃.

above theorems can be applied. In this case, it can be interpreted that the stability is
traded throughout the network with strong couplings.
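This stability trading can be checked numerically. The following sketch (not from the chapter; the all-to-all graph with α_ij = 1, the gains, and the forward-Euler discretization are our own illustrative assumptions) simulates (4.1) with scalar linear agents ẋ_i = a_i x_i, where agent 1 alone is unstable but the average of the a_i is negative:

```python
# Euler simulation of (4.1) with scalar agents x_i' = a_i x_i + coupling.
# Agent 1 is unstable (a_1 = 1), but the blended dynamics s' = mean(a_i) s
# has mean(a_i) = -1 < 0, so with a large enough k every state decays.

def simulate(a, k, x0, dt=1e-3, T=10.0):
    """All-to-all graph with alpha_ij = 1; forward-Euler integration."""
    x = list(x0)
    for _ in range(int(T / dt)):
        # the comprehension reads the old state x throughout one step
        x = [
            xi + dt * (a[i] * xi + k * sum(xj - xi for xj in x))
            for i, xi in enumerate(x)
        ]
    return x

a = [1.0, -2.0, -2.0]            # one unstable agent; average is -1
x = simulate(a, k=20.0, x0=[1.0, -2.0, 3.0])
print(max(abs(v) for v in x))    # should be near zero
```

Without the coupling, agent 1's state would grow like e^t; here the strong coupling lets the stable agents absorb that instability, exactly as the theorems predict.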
The blended dynamics (4.4) shows an emergent behavior of the multi-agent system
in the sense that s(t) is governed by the new vector field that is assembled from
the individual vector fields participating in the network. From now on, we list a few examples of designing multi-agent systems (or, simply, "networks") whose tasks are represented by the emergent behavior of (4.4), so that the whole network exhibits the emergent collective behavior with a sufficiently large coupling gain k.

4.2.1 Finding the Number of Agents Participating in the Network

When constructing a distributed network, sometimes there is a need for each agent
to know global information such as the number of agents in the network without
resorting to a centralized unit. In such circumstances, Theorem 4.1 can be employed to
design a distributed network that estimates the number of participating agents, under
the assumption that there is one agent (whose index is 1 without loss of generality)
who always takes part in the network. Suppose that agent 1 integrates the following
scalar dynamics:

    ẋ_1 = −x_1 + 1 + k ∑_{j∈N_1} α_1j(x_j − x_1),                          (4.6)

while all others integrate



    ẋ_i = 1 + k ∑_{j∈N_i} α_ij(x_j − x_i),   i = 2, . . . , N,             (4.7)

where N is unknown to the agents. Then, the blended dynamics (4.4) is obtained as

    ṡ = −(1/N)s + 1.                                                      (4.8)

This implies that the resulting emergent motion s(t) converges to N as time goes to
infinity. Then, it follows from Theorem 4.1 that each state xi (t) approaches arbitrarily
close to N with a sufficiently large k. Hence, by increasing k such that the estimation
error is less than 0.5, and by rounding xi (t) to the nearest integer, each agent gets to
know the number N as time goes on.
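A minimal sketch of the counting network (4.6)–(4.7) follows; the ring graph, gains, horizon, and Euler integrator are illustrative assumptions of ours, not prescriptions from the text:

```python
# Counting network: agent 1 runs x1' = -x1 + 1 + coupling, the others run
# xi' = 1 + coupling.  The blended dynamics s' = -s/N + 1 converges to N,
# so every rounded state should report the number of agents.

def count_agents(N, k=100.0, dt=1e-3, T=40.0):
    x = [0.0] * N
    for _ in range(int(T / dt)):
        new = []
        for i in range(N):
            # ring graph: neighbors are i-1 and i+1 (indices mod N)
            coup = k * ((x[(i - 1) % N] - x[i]) + (x[(i + 1) % N] - x[i]))
            drift = (-x[i] + 1.0) if i == 0 else 1.0
            new.append(x[i] + dt * (drift + coup))
        x = new
    return [round(v) for v in x]

counts = count_agents(7)
print(counts)    # every agent's estimate of N
```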
By resorting to a heterogeneous network, we were able to impose a stable emergent collective behavior that makes it possible for individuals to estimate the number of agents in the network. Note that the initial conditions do not affect the final value of
xi (t) because they are forgotten as time tends to infinity due to the stability of the
blended dynamics. This is in sharp contrast to other approaches such as [21] where
the average consensus algorithm is employed, which yields the average of individual

initial conditions, to estimate N . While their approach requires resetting the initial
conditions whenever some agents join or leave the network during the operation,
the estimation of the proposed algorithm remains valid (after some transient) in
such cases because the blended dynamics (4.8) remains contractive for any N ≥ 1.
Therefore, the proposed algorithm achieves the plug-and-play operation. Moreover,
when the maximum number Nmax of agents is known, the decentralized design is
also achieved. Further details are found in [22].

Remark 4.1 A slight variation of the idea yields an algorithm to identify the agents attending the network. Let the number 1 in (4.6) and (4.7) be replaced by 2^{i−1}, where i is the unique ID of the agent in {1, 2, . . . , N_max}. Then the blended dynamics (4.8) becomes ṡ = −(1/N)s + ∑_{j∈N_a} 2^{j−1}/N, where N_a is the index set of the attending agents and N is the cardinality of N_a. Since the resulting emergent behavior is s(t) → ∑_{j∈N_a} 2^{j−1}, each agent can figure out the integer value ∑_{j∈N_a} 2^{j−1}, which contains the binary information of the attending agents.
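Remark 4.1 can be sketched the same way (all-to-all graph, gains, and horizon are again our illustrative assumptions):

```python
# Attendance network of Remark 4.1: attending agent with ID i uses the
# drift 2^(i-1); the synchronized value approaches sum of 2^(i-1) over the
# attendees, whose binary digits reveal who is attending.

def attendance(ids, k=100.0, dt=1e-3, T=40.0):
    n = len(ids)
    x = [0.0] * n
    for _ in range(int(T / dt)):
        x = [
            x[i] + dt * (
                (-x[i] if i == 0 else 0.0)                 # agent 1 keeps -x1
                + 2.0 ** (ids[i] - 1)
                + k * sum(x[j] - x[i] for j in range(n))   # all-to-all graph
            )
            for i in range(n)
        ]
    total = round(x[0])
    # decode the binary digits of the synchronized integer value
    return sorted(j + 1 for j in range(max(ids)) if (total >> j) & 1)

found = attendance([1, 3, 4])
print(found)    # the attending IDs recovered from the emergent value 13
```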

4.2.2 Distributed Least-Squares Solver

Distributed algorithms have been developed in various fields of study so as to divide


a large computational problem into small-scale computations. In this regard, finding
a solution of a given large linear equation in a distributed manner has been tackled
in recent years [23–25]. Let the equation be given by

    Ax = b ∈ R^M,                                                          (4.9)

where A ∈ R^{M×n} has full column rank, x ∈ R^n, and b ∈ R^M. We suppose that the total of M equations are grouped into N equation banks, the i-th bank consisting of m_i equations so that ∑_{i=1}^N m_i = M. In particular, we write the i-th equation bank as

    A_i x = b_i ∈ R^{m_i},   i = 1, 2, . . . , N,                          (4.10)

where A_i ∈ R^{m_i×n} is the i-th block row of the matrix A and b_i ∈ R^{m_i} is the i-th block element of b. The problem of finding a solution to (4.10) (in the least-squares sense when there is no exact solution) is dealt with in [26–29]. Most notable among them are [27, 28], which proposed the distributed algorithm

    ẋ_i = −A_i^T(A_i x_i − b_i) + k ∑_{j∈N_i} α_ij(x_j − x_i).             (4.11)

Here, we analyze (4.11) in terms of Theorem 4.2. In particular, the blended dynam-
ics of the network (4.11) is obtained as

    ṡ = −(1/N) A^T(As − b),                                                (4.12)
which is equivalent to the gradient descent algorithm for the optimization problem

    minimize_x ‖Ax − b‖²                                                   (4.13)
that has the unique minimizer (A^T A)^{−1}A^T b, the least-squares solution of (4.9). Thus, Theorem 4.2 asserts that each state x_i approximates the least-squares solution with a sufficiently large k, and the error can be made arbitrarily small by increasing k.
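The network (4.11) can be sketched on a tiny problem; the sizes, the all-to-all graph, and the gains below are our own illustrative choices:

```python
# Distributed least squares: each agent i holds one row (A_i, b_i) of an
# overdetermined system and runs
#   x_i' = -A_i^T (A_i x_i - b_i) + k * sum_j (x_j - x_i),
# so every x_i should approach the central least-squares solution.

# three scalar equations in two unknowns: A x = b
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 2.0]

# central solution via the 2x2 normal equations A^T A x = A^T b
ata = [[2.0, 1.0], [1.0, 2.0]]
atb = [3.0, 4.0]
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
x_star = [
    (ata[1][1] * atb[0] - ata[0][1] * atb[1]) / det,   # = 2/3
    (ata[0][0] * atb[1] - ata[1][0] * atb[0]) / det,   # = 5/3
]

k, dt = 50.0, 1e-3
xs = [[0.0, 0.0] for _ in range(3)]            # one 2-vector per agent
for _ in range(20000):                         # 20 s of forward Euler
    new = []
    for i, (ai, bi) in enumerate(zip(A, b)):
        r = ai[0] * xs[i][0] + ai[1] * xs[i][1] - bi   # A_i x_i - b_i
        new.append([
            xs[i][c]
            - dt * ai[c] * r
            + dt * k * sum(xs[j][c] - xs[i][c] for j in range(3))
            for c in (0, 1)
        ])
    xs = new
print(x_star, xs[0])    # agent states close to the least-squares solution
```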

Remark 4.2 Even in the case that A T A is not invertible, the network (4.11) still
solves the least-squares problem because s of (4.12) converges to one of the mini-
mizers. Further details are found in [30].

4.2.3 Distributed Median Solver

The idea of designing a network based on the gradient descent algorithm of an


optimization problem, as in the previous subsection, can be used for most other
distributed optimization problems. Among them, a particularly interesting example
is the problem of finding a median, which is useful, for instance, in rejecting outliers
under redundancy.
For a collection R of real numbers r_i, i = 1, 2, . . . , N, their median is defined as a real number that belongs to the set

    M_R := { r^s_{(N+1)/2} }            if N is odd,
    M_R := [ r^s_{N/2}, r^s_{N/2+1} ]   if N is even,

where the r^s_i are the elements of the set R with their indices rearranged (sorted) such that r^s_1 ≤ r^s_2 ≤ · · · ≤ r^s_N. With the help of this relaxed definition of the median, finding a median s of R becomes solving the optimization problem

    minimize_s ∑_{i=1}^N |r_i − s|.

Then, a gradient descent algorithm given by

    ṡ = (1/N) ∑_{i=1}^N sgn(r_i − s)                                       (4.14)

will solve this minimization problem, where sgn(s) is 1 if s > 0, −1 if s < 0, and 0
if s = 0. In particular, the solution s satisfies

    lim_{t→∞} ‖s(t)‖_{M_R} = 0.

Motivated by this, we propose a distributed median solver, in which the individual dynamics of agent i uses the information of r_i only:

    ẋ_i = sgn(r_i − x_i) + k ∑_{j∈N_i} α_ij(x_j − x_i),   i ∈ N.           (4.15)

Now, the algorithm (4.15) finds a median approximately by exchanging the states x_i only (not r_i). Further details can be found in [31].
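A sketch of (4.15) on a ring (the data, graph, and gains are illustrative assumptions of ours) shows the outlier-rejection property:

```python
# Distributed median solver: each agent knows only its own r_i, applies
# sgn(r_i - x_i), and exchanges states over a ring; every x_i should settle
# near the median of the r_i, which is robust to the outlier 100.

def sgn(v):
    return 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)

r = [1.0, 2.0, 3.0, 9.0, 100.0]    # median is 3
n, k, dt = len(r), 100.0, 1e-4
x = [0.0] * n
for _ in range(int(20.0 / dt)):
    x = [
        x[i] + dt * (
            sgn(r[i] - x[i])
            + k * ((x[(i - 1) % n] - x[i]) + (x[(i + 1) % n] - x[i]))
        )
        for i in range(n)
    ]
print(x)    # all entries close to 3, despite the outlier
```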

4.2.4 Distributed Optimization: Optimal Power Dispatch

As another application of distributed optimization, let us consider the optimization problem

    minimize_{λ_1,...,λ_N} ∑_{i=1}^N J_i(λ_i)
    subject to ∑_{i=1}^N λ_i = ∑_{i=1}^N d_i,   λ̲_i ≤ λ_i ≤ λ̄_i,   i ∈ N,    (4.16)

where λ_i ∈ R is the decision variable, J_i is a strictly convex C² function, and λ̲_i, λ̄_i, and d_i are given constants. A practical example is the economic dispatch problem of electric power, in which d_i represents the demand of node i, λ_i is the power generated at node i with minimum λ̲_i and maximum λ̄_i, and J_i is the generation cost.
A centralized solution is easily obtained using the Lagrangian and the Lagrange dual function. Indeed, it can be shown that the optimal value is attained at λ*_i = θ_i(s*), where

    θ_i(s) := (dJ_i/dλ_i)^{−1}( sat( s, (dJ_i/dλ_i)(λ̲_i), (dJ_i/dλ_i)(λ̄_i) ) ),

in which (dJ_i/dλ_i)^{−1}(·) is the inverse function of (dJ_i/dλ_i)(·), and sat(s, a, b) is s if a ≤ s ≤ b, b if b < s, and a if s < a. The optimal s* maximizes the dual function g(s) := ∑_{i=1}^N J_i(θ_i(s)) + s(d_i − θ_i(s)), which is concave, so that s* can be asymptotically
obtained by the gradient algorithm:

    ṡ = (dg/ds)(s) = ∑_{i=1}^N (d_i − θ_i(s)).                             (4.17)

A distributed algorithm to solve the optimization problem approximately is to integrate

    ẋ_i = d_i − θ_i(x_i) + k ∑_{j∈N_i} α_ij(x_j − x_i),   i ∈ N,           (4.18)

because the blended dynamics of (4.18) is given by

    ṡ = (1/N) ∑_{i=1}^N (d_i − θ_i(s)) = (1/N)(dg/ds)(s).                  (4.19)

Obviously, (4.19) is the same as the centralized solver (4.17) except the scaling of
1/N , which can be compensated by scaling (4.18). By Theorem 4.2, the state xi (t)
of each node approaches arbitrarily close to s ∗ with a sufficiently large k, and so,
we obtain λi∗ approximately by θi (xi (t)) whose error can be made arbitrarily small.
Readers are referred to [32], which also describes the behavior of the proposed algorithm when the problem is infeasible, so that each agent can detect that infeasibility occurs. It is again emphasized that the initial conditions are forgotten, and so the plug-and-play operation is guaranteed. Moreover, each agent can design its own dynamics (4.18) with only local information, so that decentralized design is achieved except for the global information k. In particular, the function θ_i can be computed within agent i from local information such as J_i, λ̲_i, and λ̄_i. Therefore, the proposed solver (4.18) does not exchange the private information of each agent (except the dual variable x_i).
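The dispatch solver (4.18) can be sketched for quadratic costs J_i(λ) = a_i λ²/2, where dJ_i/dλ = a_i λ has a closed-form inverse; the bounds below are wide enough that the saturation stays inactive, and all numbers are illustrative assumptions of ours:

```python
# Distributed dispatch: with J_i(l) = a_i l^2 / 2, theta_i clamps s between
# the marginal costs at the bounds and then inverts dJ_i/dl, i.e.
# theta_i(s) = s / a_i when the bounds are inactive.  The consensus value
# s* then solves sum_i (d_i - s*/a_i) = 0.

def make_theta(a, lo, hi):
    def theta(s):
        s = min(max(s, a * lo), a * hi)   # clamp between marginal costs
        return s / a                      # invert dJ/dl = a * l
    return theta

a = [1.0, 2.0]            # cost curvatures
d = [1.0, 2.0]            # local demands (total 3)
thetas = [make_theta(a[i], -10.0, 10.0) for i in (0, 1)]
s_star = sum(d) / sum(1.0 / ai for ai in a)    # closed-form optimum: 2.0

k, dt = 50.0, 1e-3
x = [0.0, 0.0]
for _ in range(20000):                          # 20 s of forward Euler
    x = [
        x[i] + dt * (d[i] - thetas[i](x[i]) + k * (x[1 - i] - x[i]))
        for i in (0, 1)
    ]
dispatch = [thetas[i](x[i]) for i in (0, 1)]
print(x, dispatch)    # x near s* = 2; dispatched power sums to the demand 3
```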

4.3 Strong Diffusive Output Coupling

Now, let us consider a slightly more complex network, a heterogeneous multi-agent system under the diffusive output coupling law⁴

    ż_i = g_i(t, z_i, y_i) ∈ R^{m_i},
    ẏ_i = h_i(t, y_i, z_i) + kΛ ∑_{j∈N_i} α_ij(y_j − y_i) ∈ R^n,   i ∈ N,   (4.20)

where the matrix Λ is positive definite. The vector fields gi and h i are assumed to
be piecewise continuous in t, continuously differentiable with respect to z i and yi ,

⁴ A particular case of (4.20) is

    ẋ_i = f_i(t, x_i) + kB ∑_{j∈N_i} α_ij(x_j − x_i),   i ∈ N,

where the matrix B is positive semi-definite, which can always be converted into (4.20) by a linear coordinate change.

locally Lipschitz with respect to z i and yi uniformly in t, and gi (t, 0, 0), h i (t, 0, 0)
are uniformly bounded for t.
For this network, under the same coordinate change as (4.2) with x_i replaced by y_i, it can be seen that the quasi-steady-state subsystem (or the blended dynamics) becomes

    dẑ_i/dt = g_i(t, ẑ_i, s),   i ∈ N,
    ṡ = (1/N) ∑_{i=1}^N h_i(t, s, ẑ_i).                                    (4.21)

This can also be seen by treating z i (t) as external inputs of h i in (4.20).

Theorem 4.3 ([19]) Assume that the blended dynamics (4.21) is contractive. Then, for any compact set K and for any η > 0, there exists k* > 0 such that, for each k > k* and col(z_1(t_0), y_1(t_0), . . . , z_N(t_0), y_N(t_0)) ∈ K, the solution to (4.20) exists for all t ≥ t_0 and satisfies

    lim sup_{t→∞} ‖z_i(t) − ẑ_i(t)‖ ≤ η  and  lim sup_{t→∞} ‖y_i(t) − s(t)‖ ≤ η,   ∀i ∈ N.

Theorem 4.4 ([15, 19]) Assume that there is a nonempty compact set A_b that is uniformly asymptotically stable for the blended dynamics (4.21). Let D_b ⊃ A_b be an open subset of the domain of attraction of A_b, and let

    A_x := { col(ẑ_1, s, ẑ_2, s, . . . , ẑ_N, s) : col(ẑ_1, . . . , ẑ_N, s) ∈ A_b },
    D_x := { col(ẑ_1, s_1, . . . , ẑ_N, s_N) : col(ẑ_1, . . . , ẑ_N, s) ∈ D_b such that (1/N) ∑_{i=1}^N s_i = s }.

Then, for any compact set K ⊂ D_x and for any η > 0, there exists k* > 0 such that, for each k > k* and col(z_1(t_0), y_1(t_0), . . . , z_N(t_0), y_N(t_0)) ∈ K, the solution to (4.20) exists for all t ≥ t_0 and satisfies

    lim sup_{t→∞} ‖col(z_1(t), y_1(t), . . . , z_N(t), y_N(t))‖_{A_x} ≤ η.    (4.22)

If, in addition, A_b is locally exponentially stable for the blended dynamics (4.21) and h_i(t, y_i, z_i) = h_j(t, y_j, z_j), ∀i, j ∈ N, for each col(z_1, y_1, . . . , z_N, y_N) ∈ A_x and t, then we have more than (4.22):

    lim_{t→∞} ‖col(z_1(t), y_1(t), . . . , z_N(t), y_N(t))‖_{A_x} = 0.

With the extended results, two more examples follow.



4.3.1 Synchronization of Heterogeneous Liénard Systems

Consider a network of heterogeneous Liénard systems modeled as

z̈ i + f i (z i )ż i + gi (z i ) = u i , i = 1, . . . , N , (4.23)

where f i (·) and gi (·) are locally Lipschitz. Suppose that the output and the diffusive
coupling input are given by

    o_i = a z_i + ż_i,   a > 0,   and   u_i = k ∑_{j∈N_i} α_ij(o_j − o_i).   (4.24)

For (4.23) with (4.24), we claim that a synchronous and oscillatory behavior is obtained with a sufficiently large k if the averaged Liénard system given by

    z̈ + f̂(z)ż + ĝ(z) := z̈ + ( (1/N) ∑_{i=1}^N f_i(z) ) ż + (1/N) ∑_{i=1}^N g_i(z) = 0    (4.25)

has a stable limit cycle. This condition may be interpreted as the blended version
of the condition for a stand-alone Liénard system z̈ + f (z)ż + g(z) = 0 to have a
stable limit cycle. Note that this condition implies that, even when some particular
agents z̈ i + f i (z i )ż i + gi (z i ) = 0 do not yield a stable limit cycle, the network still
can exhibit oscillatory behavior as long as the average of ( f i , gi ) yields a stable limit
cycle. It is seen that stability of individual agents can be traded among agents in this
way so that some malfunctioning oscillators can oscillate in the oscillating network
as long as there is a majority of good neighbors. The frequency and the shape of the synchronous oscillation are also determined by the average of (f_i, g_i).
To justify the claim, we first realize (4.23) and (4.24) with yi := az i + ż i as

ż i = −az i + yi

ẏi = −a 2 z i + ayi − f i (z i )yi + a f i (z i )z i − gi (z i ) + k αi j (y j − yi ),
j∈N i

and compute its blended dynamics (4.21) as

    dẑ_i/dt = −a ẑ_i + s,   i ∈ N,                                        (4.26)
    ṡ = −a² ( (1/N) ∑_{i=1}^N ẑ_i ) + a s − ( (1/N) ∑_{i=1}^N f_i(ẑ_i) ) s
        + a ( (1/N) ∑_{i=1}^N f_i(ẑ_i) ẑ_i ) − (1/N) ∑_{i=1}^N g_i(ẑ_i).

To see whether this (N + 1)-th order blended dynamics has a stable limit cycle, we
observe that, with a > 0, all ẑ i (t) converge exponentially to a common trajectory
ẑ(t) as time goes on. Therefore, if the blended dynamics has a stable limit cycle,
which is an invariant set, it has to be on the synchronization manifold S defined as

    S := { col(ẑ, . . . , ẑ, s̄) ∈ R^{N+1} : col(ẑ, s̄) ∈ R² }.

Projecting the blended dynamics (4.26) to the synchronization manifold S , i.e.,


replacing ẑ i with ẑ in (4.26) for all i ∈ N , we obtain a second-order system

    dẑ/dt = −a ẑ + s,
    ṡ = −a² ẑ + a s − f̂(ẑ)s + a f̂(ẑ)ẑ − ĝ(ẑ).                              (4.27)

Therefore, (4.27) should have a stable limit cycle if the blended dynamics has a stable
limit cycle. It turns out that (4.27) is a realization of (4.25) by s = az + ż, and thus,
existence of a stable limit cycle for (4.25) is a necessary condition for the blended
dynamics (4.26) to have a stable limit cycle. Further analysis, given in [33], proves
that the converse is also true. Then, Theorem 4.4 holds with the limit cycle of (4.26)
as the compact set Ab , and thus, with a sufficiently large k, all the vectors (z i (t), ż i (t))
stay close to each other, and oscillate near the limit cycle of the averaged Liénard
system (4.25). This property has been coined as “phase cohesiveness” in [34].
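The claim can be sketched numerically with f_i(z) = μ_i(z² − 1) and g_i(z) = z, a van der Pol-type choice of ours (the gains, graph, and μ_i are illustrative assumptions, not from the text): with μ = (1.5, 1.5, 0), agent 3 alone is a harmonic oscillator with no limit cycle, yet the average μ̄ = 1 yields a van der Pol oscillator, so the coupled network should oscillate near its limit cycle (amplitude close to 2).

```python
# Three Lienard agents z_i'' + mu_i (z_i^2 - 1) z_i' + z_i = u_i with the
# diffusive output coupling u_i = k * sum_j (o_j - o_i), o_i = a z_i + z_i'.

mu, a, k, dt, n = [1.5, 1.5, 0.0], 1.0, 50.0, 5e-4, 3
z = [0.5, -0.3, 1.0]
v = [0.0, 0.0, 0.0]            # v_i = dz_i/dt
amp = 0.0
steps = int(40.0 / dt)
for step in range(steps):
    o = [a * z[i] + v[i] for i in range(n)]
    u = [k * sum(o[j] - o[i] for j in range(n)) for i in range(n)]
    z = [z[i] + dt * v[i] for i in range(n)]
    v = [
        v[i] + dt * (-mu[i] * (z[i] ** 2 - 1.0) * v[i] - z[i] + u[i])
        for i in range(n)
    ]
    if step > int(30.0 / dt):   # record amplitude after transients
        amp = max(amp, abs(z[0]))
print(amp)    # near the van der Pol limit-cycle amplitude
```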

4.3.2 Distributed State Estimation

Consider a linear system

    χ̇ = Sχ ∈ R^n,   o = col(o_1, . . . , o_N) = col(G_1, . . . , G_N)χ + col(n_1, . . . , n_N) =: Gχ + n,   o_i ∈ R^{q_i},

where χ ∈ R^n is the state to be estimated, o is the measurement output, and n is the measurement noise. It is supposed that there are N distributed agents, and each agent
i can access the measurement oi ∈ Rqi only (where often qi = 1). We assume that
the pair (G, S) is detectable, while each pair (G i , S) is not necessarily detectable as
in [35, 36]. Each agent is allowed to communicate its internal state to its neighboring
nodes. The question is how to construct a dynamic system for each node that estimates
χ (t). See, e.g., [37, 38] for more details on this distributed state estimation problem.
To solve the problem, we first employ the detectability decomposition for each
node, that is, for each pair (G i , S). With pi being the dimension of the unde-
tectable subspace of the pair (G i , S), let [Z i , Wi ] be an orthogonal matrix, where
Z i ∈ Rn×(n− pi ) and Wi ∈ Rn× pi , such that
   
    ⎡Z_i^T⎤                   ⎡S̄_i  0⎤
    ⎣W_i^T⎦ S [Z_i   W_i]  =  ⎣ ∗   ∗⎦ ,      G_i [Z_i   W_i] = [Ḡ_i   0],

and the pair (Ḡ i , S̄i ) is detectable. Then, pick a matrix Ūi ∈ R(n− pi )×qi such that
S̄i − Ūi Ḡ i is Hurwitz. Now, individual agent i can construct a local partial state

observer, for instance, as

    ḃ_i = S̄_i b_i − Ū_i(Ḡ_i b_i − o_i) ∈ R^{n−p_i}                          (4.28)

to collect as much information about the state χ as possible from the available measurement o_i only; each agent i can obtain partial information b_i about χ in the sense that

    b_i = Z_i^T χ + z_i ∈ R^{n−p_i},   i ∈ N,                               (4.29)

where z_i denotes the estimation error, which converges to zero by (4.28) if n_i = 0.


When we collect (4.29) and write them as

    b = col(b_1, . . . , b_N) = col(Z_1^T, . . . , Z_N^T)χ + col(z_1, . . . , z_N) =: Aχ + z,    (4.30)

detectability of (G, S) implies that col(Z_1^T, . . . , Z_N^T) = A has full column rank.


Therefore, the least-squares solution χ̂(t) of Aχ̂(t) = b(t) can generate a unique estimate of χ(t). This reminds us of the problem in Sect. 4.2.2: finding the least-squares solution in a distributed manner.
Based on the discussion above, we propose a distributed state estimator for the
given linear system as

    dχ̂_i/dt = S χ̂_i − κ Z_i(Z_i^T χ̂_i − b_i) + k ∑_{j∈N_i} α_ij(χ̂_j − χ̂_i) ∈ R^n,   i ∈ N,   (4.31)

where b_i comes from (4.28), and both κ and k are design parameters. Note that the
least-squares solution χ̂(t) of Aχ̂(t) = b(t) is time-varying, and so, in order to achieve
asymptotic convergence of χ̂_i(t) to χ(t) (when there is no noise n), we had to embed
the generating model of χ(t) in (4.31), inspired by the internal model principle.
To justify the proposed distributed state estimator (4.28) and (4.31), denote the
state estimation error by y_i := χ̂_i − χ. Then the error dynamics for the partial state
observer (4.28) and the distributed observer (4.31) read

$$\dot z_i = (\bar S_i - \bar U_i \bar G_i)\, z_i + \bar U_i n_i$$
$$\dot y_i = S y_i - \kappa Z_i (Z_i^T y_i - z_i) + k \sum_{j \in \mathcal N_i} \alpha_{ij} (y_j - y_i), \quad i \in \mathcal N. \tag{4.32}$$

96 J. G. Lee and H. Shim

The blended dynamics (4.21) is obtained as

$$\dot{\hat z}_i = (\bar S_i - \bar U_i \bar G_i)\hat z_i + \bar U_i n_i$$
$$\dot s = S s - \frac{\kappa}{N}\sum_{i=1}^{N} Z_i (Z_i^T s - \hat z_i) = \Big(S - \frac{\kappa}{N} A^T A\Big) s + \frac{\kappa}{N} A^T \begin{bmatrix} \hat z_1 \\ \vdots \\ \hat z_N \end{bmatrix}. \tag{4.33}$$

For a sufficiently large gain κ, the blended dynamics (4.33) becomes contractive, and
thus, Theorem 4.3 guarantees that the error variables (z_i(t), y_i(t)) of (4.32) behave
like (ẑ_i(t), s(t)) of (4.33). Moreover, if there is no noise n, Theorem 4.4 asserts that
all the estimation errors (z_i(t), y_i(t)), i ∈ N, converge to zero because A_x = {0}.
Even with the noise n, the proposed observer achieves an almost best possible estimate;
details can be found in [30].
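For a skew-symmetric S (e.g., harmonic-oscillator internal models), the contraction mechanism in (4.33) is transparent: x^T(S − (κ/N)A^TA)x = −(κ/N)‖Ax‖² < 0 for x ≠ 0 whenever A has full column rank, so the system matrix of the s-subsystem is Hurwitz in this special case for any κ > 0. A quick numerical check with assumed matrices:

```python
import numpy as np

# Assumed S: two undamped oscillators (skew-symmetric, marginally stable).
S = np.zeros((4, 4))
S[:2, :2] = [[0.0, 1.0], [-1.0, 0.0]]
S[2:, 2:] = [[0.0, 2.0], [-2.0, 0.0]]

# Assumed stacked bases col(Z_1^T, Z_2^T) with overlapping coverage of R^4.
Z1 = np.eye(4)[:, :3]
Z2 = np.eye(4)[:, 1:]
A = np.vstack([Z1.T, Z2.T])
assert np.linalg.matrix_rank(A) == 4           # detectability of (G, S)

N, kappa = 2, 1.0
M = S - (kappa / N) * A.T @ A                  # s-subsystem matrix in (4.33)
print("max Re eig of S:", np.linalg.eigvals(S).real.max())   # ~0 (marginal)
print("max Re eig of M:", np.linalg.eigvals(M).real.max())   # strictly < 0
```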

4.4 General Description of Blended Dynamics

Now, we extend our approach to the most general setting: a heterogeneous multi-agent
system under a rank-deficient diffusive coupling law given by

$$\dot x_i = f_i(t, x_i) + k B_i \sum_{j\in\mathcal N_i} \alpha_{ij}\,(x_j - x_i), \quad i \in \mathcal N, \tag{4.34}$$

where the matrix B_i is positive semi-definite for each i ∈ N. For this network, by
increasing the coupling gain k, we can enforce synchronization of the part of the states
that corresponds to the subspace

$$R_B := \bigcap_{i=1}^{N} \operatorname{im}(B_i) \subset \mathbb R^n. \tag{4.35}$$

In order to find the part of the individual states that synchronizes, let us follow this
procedure:
1. Find W_i ∈ R^{n×p_i} and Z_i ∈ R^{n×(n−p_i)}, where p_i is the rank of B_i, such that [W_i Z_i]
is an orthogonal matrix and

$$\begin{bmatrix} W_i^T \\ Z_i^T \end{bmatrix} B_i \begin{bmatrix} W_i & Z_i \end{bmatrix} = \begin{bmatrix} \Lambda_i^2 & 0 \\ 0 & 0 \end{bmatrix} \tag{4.36}$$

where Λ_i ∈ R^{p_i×p_i} is positive definite. Let W_net := diag(W_1, …, W_N), Z_net := diag(Z_1, …, Z_N), and Λ_net := diag(Λ_1, …, Λ_N).
2. Find V_i ∈ R^{p_i×p_s} such that, with $\bar p := \sum_{i=1}^{N} p_i$ and V := col(V_1, …, V_N) ∈ R^{p̄×p_s}, the columns of V are orthonormal vectors satisfying

$$(L \otimes I_n)\, W_{\rm net} \Lambda_{\rm net} V = 0_{nN \times p_s} \tag{4.37}$$

where p_s is the dimension of ker((L ⊗ I_n)W_net Λ_net), and L is the graph Laplacian
matrix.
3. Find V̄ ∈ R^{p̄×(p̄−p_s)} such that [V V̄] ∈ R^{p̄×p̄} is an orthogonal matrix.

Proposition 4.1 ([19])

(i) p_s ≤ min{p_1, …, p_N} ≤ n.
(ii) All matrices W_iΛ_iV_i (i = 1, …, N) are the same; so let us denote it by M ∈ R^{n×p_s}. Then rank(M) = p_s, im(M) = R_B, and dim(R_B) = p_s.
(iii) Define

$$Q := \bar V^T \Lambda_{\rm net} W_{\rm net}^T (L \otimes I_n) W_{\rm net} \Lambda_{\rm net} \bar V \in \mathbb R^{(\bar p - p_s)\times(\bar p - p_s)}.$$

Then, Q is positive definite.
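The three-step procedure and Proposition 4.1 can be traced on a small assumed example (two nodes on a path graph, n = 2, with B_1 of rank 1 and B_2 of rank 2, so that R_B = im(B_1) ∩ im(B_2) is one-dimensional). All numbers below are illustrative choices:

```python
import numpy as np

Lap = np.array([[1.0, -1.0], [-1.0, 1.0]])     # graph Laplacian, N = 2
n, N = 2, 2
B = [np.diag([1.0, 0.0]), np.eye(2)]           # B_1 rank 1, B_2 rank 2

# Step 1: eigendecomposition of each B_i gives W_i and Lambda_i.
W, Lam = [], []
for Bi in B:
    w, v = np.linalg.eigh(Bi)
    pos = w > 1e-9
    W.append(v[:, pos])
    Lam.append(np.diag(np.sqrt(w[pos])))       # B_i = W_i Lambda_i^2 W_i^T
p = [Wi.shape[1] for Wi in W]
WL = np.zeros((n * N, sum(p)))                 # block-diagonal W_net Lambda_net
c = 0
for i in range(N):
    WL[i * n:(i + 1) * n, c:c + p[i]] = W[i] @ Lam[i]
    c += p[i]

# Step 2: V spans ker((L (x) I_n) W_net Lambda_net), found via SVD.
K = np.kron(Lap, np.eye(n)) @ WL
_, sv, Vt = np.linalg.svd(K)
ps = sum(p) - int(np.sum(sv > 1e-9))
V = Vt[len(sv) - ps:].T if ps else np.zeros((sum(p), 0))
Vbar = Vt[:len(sv) - ps].T                     # step 3: orthogonal complement

# Proposition 4.1: all W_i Lam_i V_i coincide (call it M), and Q > 0.
M1 = W[0] @ Lam[0] @ V[:p[0]]
M2 = W[1] @ Lam[1] @ V[p[0]:]
Q = Vbar.T @ WL.T @ np.kron(Lap, np.eye(n)) @ WL @ Vbar
print("p_s =", ps)                             # dim of synchronized subspace
print("M columns agree:", np.allclose(M1, M2))
print("Q positive definite:", np.linalg.eigvalsh(Q).min() > 0)
```

In this example im(M) is the first coordinate axis, which is exactly im(B_1) ∩ im(B_2), as item (ii) predicts.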

Now, we introduce a linear coordinate change by which the state x_i of each individual
agent is split into Z_i^T x_i and W_i^T x_i. In particular, the sub-state z_i := Z_i^T x_i is the
component of x_i projected onto im(Z_i), and it has no direct interconnection with the
neighbors, as its dynamics is given by

$$\dot z_i = Z_i^T f_i(t, x_i), \quad i \in \mathcal N. \tag{4.38}$$

On the other hand, the sub-state W_i^T x_i is split once more into $s_i := V_i^T \Lambda_i^{-1} W_i^T x_i \in \mathbb R^{p_s}$ and the remaining part. (In fact, s_i determines the behavior of the individual agent
in the subspace R_B in the sense that M s_i ∈ R_B.) With a sufficiently large k, these
s_i are enforced to synchronize to $s := (1/N)\sum_{i=1}^{N} s_i = (1/N)\sum_{i=1}^{N} V_i^T \Lambda_i^{-1} W_i^T x_i$,
which is governed by

$$\dot s = \frac{1}{N}\sum_{i=1}^{N} V_i^T \Lambda_i^{-1} W_i^T f_i(t, x_i). \tag{4.39}$$

To see this, let us consider a coordinate change for the whole multi-agent system
(4.34):

$$\begin{bmatrix} z \\ s \\ w \end{bmatrix} = \begin{bmatrix} Z_{\rm net}^T \\ (1/N)\,V^T \Lambda_{\rm net}^{-1} W_{\rm net}^T \\ Q^{-1}\,\bar V^T \Lambda_{\rm net} W_{\rm net}^T (L \otimes I_n) \end{bmatrix}\begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix} \tag{4.40}$$

where $w \in \mathbb{R}^{(N-1)p_s + \sum_{i=1}^{N}(p_i - p_s)}$ collects all the components both in col(s_1, …, s_N)
that are left after taking $s = (1/N)(\mathbf 1_N^T \otimes I_{p_s})\,\mathrm{col}(s_1, \dots, s_N) \in \mathbb R^{p_s}$, and in W_i^T x_i that are
left after taking s_i = V_i^T Λ_i^{-1} W_i^T x_i. It turns out that the governing equation for w is

$$\frac{1}{k}\dot w = -Q w + \frac{1}{k}\,Q^{-1}\bar V^T \Lambda_{\rm net} W_{\rm net}^T (L \otimes I_n)\begin{bmatrix} f_1(t, x_1) \\ \vdots \\ f_N(t, x_N)\end{bmatrix}. \tag{4.41}$$

Then, it is clear that the system (4.38), (4.39), and (4.41) is in the standard form of
singular perturbation. Since the inverse of (4.40) is given (in [19]) by

$$\begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix} = (Z_{\rm net} - W_{\rm net}\Lambda_{\rm net} L)\, z + N\, W_{\rm net}\Lambda_{\rm net} V s + W_{\rm net}\Lambda_{\rm net} \bar V w$$

where $L \in \mathbb R^{\bar p \times (nN - \bar p)}$ is defined as

$$L = \operatorname{col}(L_1, \dots, L_N) := \bar V Q^{-1} \bar V^T \Lambda_{\rm net} W_{\rm net}^T (L \otimes I_n) Z_{\rm net}$$

with $L_i \in \mathbb R^{p_i \times (nN - \bar p)}$ (this matrix L should not be confused with the graph Laplacian appearing in L ⊗ I_n), the quasi-steady-state subsystem (that is, the blended dynamics)
becomes

$$\dot{\hat z}_i = Z_i^T f_i\big(t,\ Z_i \hat z_i - W_i \Lambda_i L_i \hat z + N M s\big), \quad i \in \mathcal N$$
$$\dot s = \frac{1}{N}\sum_{i=1}^{N} V_i^T \Lambda_i^{-1} W_i^T f_i\big(t,\ Z_i \hat z_i - W_i \Lambda_i L_i \hat z + N M s\big) \tag{4.42}$$

where ẑ = col(ẑ_1, …, ẑ_N).

Theorem 4.5 ([19]) Assume that the blended dynamics (4.42) is contractive. Then,
for any compact set K and for any η > 0, there exists k* > 0 such that, for each
k > k* and col(x_1(t_0), …, x_N(t_0)) ∈ K, the solution to (4.34) exists for all t ≥ t_0
and satisfies

$$\limsup_{t\to\infty}\, \big\| x_i(t) - \big(Z_i \hat z_i(t) - W_i \Lambda_i L_i \hat z(t) + N M s(t)\big) \big\| \le \eta, \quad \forall i \in \mathcal N.$$

Theorem 4.6 ([19]) Assume that there is a nonempty compact set A_b that is uniformly
asymptotically stable for the blended dynamics (4.42). Let D_b ⊃ A_b be an
open subset of the domain of attraction of A_b, and let

$$A_x := \big\{ (Z_{\rm net} - W_{\rm net}\Lambda_{\rm net} L)\hat z + N(\mathbf 1_N \otimes M)s \;:\; \operatorname{col}(\hat z, s) \in A_b \big\},$$
$$D_x := \big\{ (Z_{\rm net} - W_{\rm net}\Lambda_{\rm net} L)\hat z + N(\mathbf 1_N \otimes M)s + W_{\rm net}\Lambda_{\rm net}\bar V w \;:\; \operatorname{col}(\hat z, s) \in D_b,\; w \in \mathbb R^{\bar p - p_s} \big\}.$$

Then, for any compact set K ⊂ D_x and for any η > 0, there exists k* > 0 such that,
for each k > k* and col(x_1(t_0), …, x_N(t_0)) ∈ K, the solution to (4.34) exists for all
t ≥ t_0 and satisfies

$$\limsup_{t\to\infty}\, \big\| \operatorname{col}(x_1(t), \dots, x_N(t)) \big\|_{A_x} \le \eta. \tag{4.43}$$
If, in addition, A_b is locally exponentially stable for the blended dynamics (4.42)
and

$$\bar V^T \Lambda_{\rm net} W_{\rm net}^T (L \otimes I_n)\begin{bmatrix} f_1(t, x_1) \\ \vdots \\ f_N(t, x_N)\end{bmatrix} = 0 \tag{4.44}$$

for all col(x_1, …, x_N) ∈ A_x and all t, then (4.43) is strengthened to

$$\lim_{t\to\infty} \big\| \operatorname{col}(x_1(t), \dots, x_N(t)) \big\|_{A_x} = 0.$$

4.4.1 Distributed State Observer with Rank-Deficient Coupling

We revisit the distributed state estimation problem discussed in Sect. 4.3.2 with the
following agent dynamics, which has lower dimension than the combination of (4.28) and (4.31):

$$\dot{\hat\chi}_i = S\hat\chi_i + U_i(o_i - G_i\hat\chi_i) + k\,W_i W_i^T \sum_{j=1}^{N} \alpha_{ij}(\hat\chi_j - \hat\chi_i) \tag{4.45}$$

where U_i := Z_iŪ_i and k is sufficiently large. Here, the first two terms on the right-hand
side look like a typical state observer but, due to the lack of detectability of
(G_i, S), they cannot by themselves yield stable error dynamics. Therefore, the diffusive
coupling in the third term exchanges the internal state with the neighbors, compensating for the
lack of information on the undetectable parts. Recalling that W_i^T χ represents the
components of χ that are undetectable from o_i in the decomposition given in Sect. 4.3.2,
note that the coupling term compensates only the undetectable portion in the
observer. As a result, the coupling matrix W_iW_i^T is rank-deficient in general. This
point is in sharp contrast to previous results such as [38], where the coupling
term is nonsingular and the design is more complicated.
With x_i := χ̂_i − χ and B_i = W_iW_i^T, the error dynamics becomes

$$\dot x_i = (S - U_i G_i)\, x_i + U_i n_i + k B_i \sum_{j=1}^{N} \alpha_{ij}(x_j - x_i), \quad i \in \mathcal N.$$

This is precisely the multi-agent system (4.34), where in this case the matrices Z_i
and W_i carry the meaning of the detectability decomposition. In particular, from
the detectability of the pair (G, S), it is seen that $\bigcap_{i=1}^{N}\operatorname{im}(W_i) = \bigcap_{i=1}^{N}\ker(Z_i^T) = \{0\}$
(which corresponds to R_B in (4.35)), by the fact that ker(Z_i^T) = im(W_i) is the undetectable
subspace of the pair (G_i, S). This implies that p_s = 0, V is null, and thus V̄ can be
chosen as the identity matrix. With them, the blended dynamics (4.42) is given by,
with the state s being null,

$$\dot{\hat z}_i = Z_i^T (S - U_i G_i)(Z_i \hat z_i - W_i \Lambda_i L_i \hat z) + Z_i^T U_i n_i = (Z_i^T S - \bar U_i \bar G_i Z_i^T)(Z_i \hat z_i - W_i \Lambda_i L_i \hat z) + \bar U_i n_i = (\bar S_i - \bar U_i \bar G_i)\hat z_i + \bar U_i n_i, \quad i \in \mathcal N.$$

Since S̄_i − Ū_iḠ_i is Hurwitz for all i, the blended dynamics is contractive, and Theorem
4.5 asserts that the estimation error x_i(t) behaves like Z_iẑ_i(t) for a sufficiently
large k. Moreover, if there is no measurement noise, then the set A_b = {0} ⊂ R^{nN−p̄}
is globally exponentially stable for the blended dynamics. Then, Theorem 4.6 asserts
that the proposed distributed state observer exponentially finds the correct estimate
for a sufficiently large k, because (4.44) holds with A_x = {0} ⊂ R^{nN}.
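As a sanity check of (4.45), consider an assumed example: S consists of two undamped oscillators, agent 1 measures only the first one and agent 2 only the second, so neither pair (G_i, S) is detectable while (G, S) is. The gains, k, and horizon below are illustrative choices, not values from the chapter:

```python
import numpy as np

# Assumed plant: two decoupled oscillators; chi' = S chi, o_i = G_i chi.
S = np.zeros((4, 4))
S[:2, :2] = [[0.0, 1.0], [-1.0, 0.0]]
S[2:, 2:] = [[0.0, 2.0], [-2.0, 0.0]]
G = [np.array([[1.0, 0.0, 0.0, 0.0]]), np.array([[0.0, 0.0, 1.0, 0.0]])]

# Detectability decomposition, hand-picked for this block structure.
Z = [np.eye(4)[:, :2], np.eye(4)[:, 2:]]       # detectable subspaces
W = [np.eye(4)[:, 2:], np.eye(4)[:, :2]]       # undetectable subspaces
Ubar = np.array([[2.0], [0.0]])                # makes S_bar - Ubar G_bar Hurwitz
U = [Z[0] @ Ubar, Z[1] @ Ubar]                 # U_i = Z_i Ubar_i

k, dt, T = 20.0, 1e-3, 15.0
chi = np.array([1.0, 0.0, 1.0, 0.0])
hat = [np.zeros(4), np.zeros(4)]               # observers start from zero

for _ in range(int(T / dt)):
    new = []
    for i in range(2):
        j = 1 - i                              # single neighbor, alpha_ij = 1
        innov = U[i] @ (G[i] @ (chi - hat[i]))       # U_i (o_i - G_i hat_chi_i)
        coupling = k * W[i] @ W[i].T @ (hat[j] - hat[i])
        new.append(hat[i] + dt * (S @ hat[i] + innov + coupling))
    hat = new
    chi = chi + dt * (S @ chi)                 # Euler step of the plant

err = max(np.linalg.norm(hat[i] - chi) for i in range(2))
print("final estimation error:", err)          # decays toward zero
```

Here the coupling matrix W_iW_i^T has rank 2 out of 4, and the error still converges because each agent's undetectable part is exactly what its neighbor estimates well.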

4.5 Robustness of Emergent Collective Behavior

When a product is manufactured in a factory, or cells and organs are produced in
an organism, a certain level of variance is inevitable due to imperfections of the
production process. In this case, how can the variance of the outcomes be reduced
when improving the process itself is not easy, or even impossible?

As we have seen throughout the chapter, the emergent collective behavior of the network
involves averaging the vector fields of the individual agents; that is, the network
behavior is governed by the blended dynamics if the coupling strength is sufficiently
large. Therefore, even if the individual agents are created with relatively large variance
from their reference model, the blended dynamics can have a smaller variance
because of the averaging effect. Then, when the coupling gain is large, all the agents,
which were created with large variance, can behave like an agent created with
small variance.
In this section, we illustrate this point. In particular, we simulate a network of
pacemaker cells under a single conduction line. The nominal behavior of a pacemaker
cell is given in [39] as

$$\ddot z + (1.45 z^2 - 2.465 z - 0.551)\,\dot z + z = 0 \tag{4.46}$$

which is a Liénard system of the type considered in Sect. 4.3.1 and has a stable limit cycle. Now,
suppose that a group of pacemaker cells is produced with some uncertainty, so that
the cells are represented as

$$\ddot z_i + f_i(z_i)\,\dot z_i + g_i(z_i) = u_i, \quad i = 1, \dots, N,$$

where

$$f_i(z_i) = 0.1\Delta_i^1 z_i^3 + (1.45 + \Delta_i^2) z_i^2 - (2.465 + \Delta_i^3) z_i - (0.551 + \Delta_i^4)$$
$$g_i(z_i) = (1 + \Delta_i^5) z_i + 0.1\Delta_i^6 z_i^2$$

in which all the Δ_i^l are randomly chosen from a distribution of zero mean and unit
variance. With $u_i = k\sum_{j\in\mathcal N_i}(\dot z_j + z_j - \dot z_i - z_i)$, the blended dynamics of the group
of pacemakers is given as the averaged Liénard system (4.25) with

$$\hat f(z) = 0.1\bar\Delta^1 z^3 + (1.45 + \bar\Delta^2) z^2 - (2.465 + \bar\Delta^3) z - (0.551 + \bar\Delta^4)$$
$$\hat g(z) = (1 + \bar\Delta^5) z + 0.1\bar\Delta^6 z^2$$

where $\bar\Delta^l = (1/N)\sum_{i=1}^{N} \Delta_i^l$, whose expectation is zero and whose variance is 1/N. By
Chebyshev's theorem in probability, it is seen that the behavior of the blended
dynamics recovers that of (4.46) almost surely as N tends to infinity. It is emphasized
that some agents may not have a stable limit cycle, depending on their random realization
of Δ_i^l, but the network can still exhibit oscillatory behavior, and the frequency and
the shape of the oscillation become more robust as the number of agents grows.
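The variance reduction behind this argument is elementary to verify numerically (a quick illustration, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(1)
var_of_mean = {}
for N in (10, 100, 1000):
    # 4000 independent networks: each row is one realization of a
    # coefficient Delta_i^l across the N agents of a network.
    Delta = rng.standard_normal((4000, N))     # zero mean, unit variance
    var_of_mean[N] = Delta.mean(axis=1).var()  # variance of the averaged coeff.
    print(N, var_of_mean[N])                   # close to 1/N
```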
Figure 4.1 shows the simulation results of the pacemaker network when the number
of agents is 10, 100, and 1000, respectively. For example, we randomly generated
the network for N = 10 three times independently, and plotted the resulting behaviors in
Fig. 4.1a–c, respectively. It is seen that the variation is large in this case, and Fig. 4.1b
even shows a case in which no stable limit cycle exists. On the other hand, when
N = 1000, the randomly generated networks exhibit rather uniform behavior,
as in Fig. 4.1g–i. For the simulation, the initial condition is z_i(0) = ż_i(0) = 1, the
coupling gain is k = 50, and the graph has all-to-all connections.
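The experiment can be reproduced with a direct Euler simulation. The sketch below uses smaller illustrative values (N = 50, k = 20, and a short horizon) than the figure to keep the run fast; the qualitative picture (bounded, synchronized oscillation) is the same:

```python
import numpy as np

rng = np.random.default_rng(2)
N, k, dt, T = 50, 20.0, 2e-4, 20.0
D = rng.standard_normal((N, 6))               # Delta_i^1 ... Delta_i^6 per cell

def f(z):                                     # uncertain damping coefficients
    return (0.1 * D[:, 0] * z**3 + (1.45 + D[:, 1]) * z**2
            - (2.465 + D[:, 2]) * z - (0.551 + D[:, 3]))

def g(z):                                     # uncertain restoring terms
    return (1 + D[:, 4]) * z + 0.1 * D[:, 5] * z**2

z, v = np.ones(N), np.ones(N)                 # z_i(0) = zdot_i(0) = 1
amp = 0.0
for _ in range(int(T / dt)):
    # all-to-all coupling u_i = k * sum_j ((zdot_j + z_j) - (zdot_i + z_i))
    u = k * N * ((v + z).mean() - (v + z))
    z, v = z + dt * v, v + dt * (-f(z) * v - g(z) + u)
    amp = max(amp, np.abs(z).max())

print("peak |z|:", amp, " final spread:", np.ptp(z))
```

Despite agent-level coefficients that individually may destroy the limit cycle, the strongly coupled network oscillates with a small spread across agents, as the blended-dynamics argument predicts.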
We refer the reader to [14] for more discussions in this direction.

4.6 More than Linear Coupling

Until now, we have considered linear diffusive couplings with a constant strength k.
In this section, let us consider two particular nonlinear couplings: edge-wise and
node-wise funnel couplings, whose coupling strength varies as a nonlinear function
of time and of the differences between the states.

4.6.1 Edge-Wise Funnel Coupling

The coupling law to be considered is inspired by the so-called funnel controller [40].
For the multi-agent system

$$\dot x_i = f_i(t, x_i) + u_i \in \mathbb R, \quad i \in \mathcal N,$$

Fig. 4.1 Simulation results of randomly generated pacemaker networks (a–c: N = 10; d–f: N = 100; g–i: N = 1000). Initial condition is (1, 1) for all cases

let us consider the following edge-wise funnel coupling law, with ν_ij := x_j − x_i,

$$u_i\big(t, \{\nu_{ij},\, j \in \mathcal N_i\}\big) := \sum_{j\in\mathcal N_i} \gamma_{ij}\!\left(\frac{|\nu_{ij}|}{\psi_{ij}(t)}\right)\frac{\nu_{ij}}{\psi_{ij}(t)} \tag{4.47}$$

where each function ψ_ij : [t_0, ∞) → R_{>0} is positive, bounded, and differentiable
with bounded derivative, and each gain function γ_ij : [0, 1) → R_{≥0} is strictly
increasing and unbounded as s → 1. We assume symmetry of the coupling functions;
that is, ψ_ij = ψ_ji and γ_ij = γ_ji for all i ∈ N and j ∈ N_i (or, equivalently, for all
j ∈ N and i ∈ N_j, by the symmetry of the graph). A possible choice for γ_ij
and ψ_ij is
$$\gamma_{ij}(s) = \frac{1}{1-s} \quad\text{and}\quad \psi_{ij}(t) = (\bar\psi - \eta)\,e^{-\lambda(t-t_0)} + \eta,$$

where ψ̄, λ, η > 0.

Fig. 4.2 The state difference ν_ij evolves within the funnel F_{ψ_ij}, so that the synchronization error can be prescribed by the shape of the funnel
With the funnel coupling (4.47), it is shown in [41] that, under the assumption
that no finite escape time exists, the state difference ν_ij(t) evolves within the funnel

$$\mathcal F_{\psi_{ij}} := \{(t, \nu_{ij}) : |\nu_{ij}| < \psi_{ij}(t)\}$$

if |ν_ij(t_0)| < ψ_ij(t_0), ∀i ∈ N, j ∈ N_i, as illustrated in Fig. 4.2. Therefore, approximate
synchronization of arbitrary precision can be achieved with an arbitrarily small
η > 0 such that lim sup_{t→∞} ψ_ij(t) ≤ η. Indeed, due to the connectivity of the graph,
it follows from lim sup_{t→∞} |ν_ij(t)| ≤ η, ∀i ∈ N, j ∈ N_i, that

$$\limsup_{t\to\infty} |x_j(t) - x_i(t)| \le d\eta, \quad \forall i, j \in \mathcal N \tag{4.48}$$

where d is the diameter of the graph. For the complete graph, we have d = 1.
Here, we note that, by the symmetry of ψ_ij and γ_ij and by the symmetry of the
graph, it holds that

$$\sum_{i=1}^{N} u_i = \sum_{i=1}^{N}\sum_{j\in\mathcal N_i} \gamma_{ij}\!\left(\frac{|\nu_{ij}|}{\psi_{ij}(t)}\right)\frac{\nu_{ij}}{\psi_{ij}(t)} = -\sum_{i=1}^{N}\sum_{j\in\mathcal N_i} \gamma_{ji}\!\left(\frac{|\nu_{ji}|}{\psi_{ji}(t)}\right)\frac{\nu_{ji}}{\psi_{ji}(t)} = -\sum_{j=1}^{N}\sum_{i\in\mathcal N_j} \gamma_{ji}\!\left(\frac{|\nu_{ji}|}{\psi_{ji}(t)}\right)\frac{\nu_{ji}}{\psi_{ji}(t)} = -\sum_{j=1}^{N} u_j.$$

Therefore, we have that

$$0 = \sum_{i=1}^{N} u_i = \sum_{i=1}^{N} \big(\dot x_i(t) - f_i(t, x_i(t))\big) \tag{4.49}$$
which holds regardless of whether synchronization is achieved or not. If all the x_i
happen to synchronize to a common trajectory s(t), i.e., x_i(t) = s(t), then
ẋ_i = f_i(t, s) + u_i = ṡ for all i ∈ N; that is, u_i(t) compensates for the term f_i(t, s(t))
so that all the ẋ_i(t) coincide with ṡ. Hence, (4.49) implies that

$$\dot s = \frac{1}{N}\sum_{i=1}^{N} f_i(t, s) =: f_s(t, s). \tag{4.50}$$

In other words, enforcing synchronization under the condition (4.49) yields an emergent
behavior for x_i(t) = s(t), governed by the blended dynamics (4.50). In practice,
the funnel coupling (4.47) enforces approximate synchronization as in (4.48), and
thus the behavior of the network is not exactly the same as (4.50), but it can be shown
to be close to it. More details are found in [41].
A utility of the edge-wise funnel coupling is decentralized design. This is
because the common gain k, whose threshold k* contains all the information about
the graph and the individual vector fields of the agents, is not used. Therefore, each
individual agent can self-construct its own dynamics when joining the network.
(For example, if it is used for the distributed least-squares solver in Sect. 4.2.2,
then the agent dynamics (4.11) can be constructed without any global information.)
Indeed, when an agent joins the network, it can handshake with the agents to
be connected, and communicate to set the function ψ_ij(t) so that the state difference
ν_ij at the moment of joining resides inside the funnel.
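A minimal simulation of the edge-wise funnel coupling (4.47) for two scalar agents with constant drifts (all numbers are illustrative assumptions) shows the state difference staying inside the shrinking funnel, while the cancellation Σ_i u_i = 0 of (4.49) holds exactly along the trajectory:

```python
import math

# gamma(s) = 1/(1 - s), psi(t) = (psi0 - eta) * exp(-lam * t) + eta
psi0, eta, lam = 2.0, 0.1, 1.0
f = [1.0, -1.0]                               # constant drifts f_1, f_2
x = [0.0, 1.0]                                # |nu(0)| = 1 < psi(0) = 2
dt, T = 1e-4, 6.0
inside = True
for step in range(int(T / dt)):
    t = step * dt
    psi = (psi0 - eta) * math.exp(-lam * t) + eta
    nu12, nu21 = x[1] - x[0], x[0] - x[1]     # nu_ij = x_j - x_i
    u1 = (1.0 / (1.0 - abs(nu12) / psi)) * nu12 / psi
    u2 = (1.0 / (1.0 - abs(nu21) / psi)) * nu21 / psi
    assert abs(u1 + u2) < 1e-12               # sum of inputs vanishes, cf. (4.49)
    x = [x[0] + dt * (f[0] + u1), x[1] + dt * (f[1] + u2)]
    inside = inside and abs(x[1] - x[0]) < psi
print("stayed in funnel:", inside, " final |nu|:", abs(x[1] - x[0]))
```

Note that the gain blows up only as the difference approaches the funnel boundary; here it settles near |ν| ≈ ψ/2, so the explicit Euler step stays well behaved.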

4.6.2 Node-Wise Funnel Coupling

Motivated by the observation in the previous subsection that the synchronization
enforced under the condition (4.49) gives rise to the emergent behavior (4.50), let us
illustrate how different nonlinear couplings may yield different emergent behaviors.
As a particular example, we consider the node-wise funnel coupling given by

$$u_i(t, \nu_i) := \gamma_i\!\left(\frac{|\nu_i|}{\psi_i(t)}\right)\frac{\nu_i}{\psi_i(t)}, \quad\text{where}\quad \nu_i = \sum_{j\in\mathcal N_i} \alpha_{ij}(x_j - x_i) \tag{4.51}$$

where each function ψ_i : [t_0, ∞) → R_{>0} is positive, bounded, and differentiable with
bounded derivative, and each gain function γ_i : [0, 1) → R_{≥0} is strictly increasing
and unbounded as s → 1. A possible choice for γ_i and ψ_i is

$$\gamma_i(s) = \begin{cases} \dfrac{2\delta}{\pi}\,\dfrac{\tan(\pi s/2)}{s}, & s > 0 \\[4pt] \delta, & s = 0 \end{cases} \qquad\text{and}\qquad \psi_i(t) = (\bar\psi - \eta)\,e^{-\lambda(t-t_0)} + \eta \tag{4.52}$$

where δ, ψ̄, λ, η > 0.
With the funnel coupling (4.51), it is shown in [42] that, under the assumption
that no finite escape time exists, the quantity ν_i(t) evolves within the funnel
$\mathcal F_{\psi_i} := \{(t, \nu_i) : |\nu_i| < \psi_i(t)\}$ if |ν_i(t_0)| < ψ_i(t_0), ∀i ∈ N. Therefore, approximate
synchronization of arbitrary precision can be achieved with an arbitrarily small η > 0
such that lim sup_{t→∞} ψ_i(t) ≤ η. Indeed, due to the connectivity of the graph, it follows
that

$$\limsup_{t\to\infty} |x_j(t) - x_i(t)| \le \frac{2\sqrt N}{\lambda_2}\,\eta, \quad \forall i, j \in \mathcal N \tag{4.53}$$

where λ_2 is the second smallest eigenvalue of L.


Unlike the case of edge-wise funnel coupling, the symmetry is now lacking, so
the equality (4.49) does not hold. However, assuming that the map ν_i ↦ u_i(t, ν_i) is
invertible (which is the case for (4.52), for example), so that there is a
function V_i such that

$$\nu_i = V_i(t, u_i(t, \nu_i)), \quad \forall t, \nu_i,$$

we can instead make use of the symmetry in ν_i as


$$\sum_{i=1}^{N}\sum_{j\in\mathcal N_i} \alpha_{ij}(x_j - x_i) = -\sum_{i=1}^{N}\sum_{j\in\mathcal N_i} \alpha_{ji}(x_i - x_j) = -\sum_{j=1}^{N}\sum_{i\in\mathcal N_j} \alpha_{ji}(x_i - x_j),$$

which leads to

$$0 = \sum_{i=1}^{N} \nu_i = \sum_{i=1}^{N} V_i(t, u_i(t, \nu_i)) = \sum_{i=1}^{N} V_i\big(t,\ \dot x_i(t) - f_i(t, x_i(t))\big). \tag{4.54}$$

This holds regardless of whether synchronization is achieved or not. If all the x_i
happen to synchronize to a common trajectory s(t), i.e., x_i(t) = s(t), then
ẋ_i(t) = f_i(t, s) + u_i(t) = ṡ for all i ∈ N; that is, u_i(t) compensates for the
term f_i(t, s) so that all the ẋ_i coincide with ṡ, which can be denoted by f_s(t, s) =
f_i(t, s) + u_i(t). Hence, (4.54) implies that

$$\sum_{i=1}^{N} V_i\big(t,\ f_s(t, s) - f_i(t, s)\big) = 0. \tag{4.55}$$

In other words, (4.55) defines f_s(t, s) implicitly, which yields the emergent behavior
governed by

$$\dot s = f_s(t, s). \tag{4.56}$$

In practice, the funnel coupling (4.51) enforces approximate synchronization as in
(4.53), and the behavior of the network is not exactly the same as (4.56), but it is
shown in [42] to be close to (4.56).

In order to illustrate that different emergent behaviors may arise from various
nonlinear couplings, let us consider the example of (4.52), for which the function V_i
is given by

$$V_i(t, u_i) = \frac{2\psi_i(t)}{\pi}\tan^{-1}\!\left(\frac{u_i}{\delta}\right).$$

Assuming that all ψi ’s are the same, the emergent behavior ṡ = f s (t, s) can be found
with f s (t, s) being the solution to


N
f s (t, s) − f i (t, s)
0= tan−1 .
i=1
δ

If we let δ → 0, then the above equality shares the solution with


N
0= sgn ( f s (t, s) − f i (t, s)) .
i=1

Recalling the discussions in Sect. 4.2.3, it can be shown that f s (t, s) takes the median
of all the individual vector fields f i (t, s), i = 1, . . . , N . Since taking median is a sim-
ple and effective way to reject outliers, this observation may find further applications
in practice.
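Since the left-hand side of the implicit equation above is strictly increasing in f_s, it can be solved by bisection, and its root approaches the median as δ → 0. A quick check with assumed values f_i ∈ {0, 1, 10}, where the outlier 10 is rejected:

```python
import math

f_vals = [0.0, 1.0, 10.0]                     # one outlier at 10; median is 1

def root(delta):
    # Solve 0 = sum_i atan((x - f_i) / delta) by bisection; the left-hand
    # side is strictly increasing in x, so the root is unique.
    lo, hi = min(f_vals) - 1.0, max(f_vals) + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(math.atan((mid - fi) / delta) for fi in f_vals) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for delta in (1.0, 0.1, 1e-3):
    print(delta, root(delta))                 # approaches the median 1.0
```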

Acknowledgements This work was supported by the National Research Foundation of Korea
Grant funded by the Korean Government (Ministry of Science and ICT) under No. NRF-2017R1E1A1A03070342 and No. 2019R1A6A3A12032482.

References

1. Olfati-Saber, R., Murray, R.M.: Consensus problems in networks of agents with switching
topology and time-delays. IEEE Trans. Autom. Control 49(9), 1520–1533 (2004)
2. Moreau, L.: Stability of continuous-time distributed consensus algorithms. In: Proceedings of
43rd IEEE Conference on Decision and Control, pp. 3998–4003 (2004)
3. Ren, W., Beard, R.W.: Consensus seeking in multiagent systems under dynamically changing
interaction topologies. IEEE Trans. Autom. Control 50(5), 655–661 (2005)
4. Seo, J.H., Shim, H., Back, J.: Consensus of high-order linear systems using dynamic output
feedback compensator: low gain approach. Automatica 45(11), 2659–2664 (2009)
5. Kim, H., Shim, H., Seo, J.H.: Output consensus of heterogeneous uncertain linear multi-agent
systems. IEEE Trans. Autom. Control 56(1), 200–206 (2011)
6. De Persis, C., Jayawardhana, B.: On the internal model principle in formation control and
in output synchronization of nonlinear systems. In: Proceedings of 51st IEEE Conference on
Decision and Control, pp. 4894–4899 (2012)
7. Isidori, A., Marconi, L., Casadei, G.: Robust output synchronization of a network of heteroge-
neous nonlinear agents via nonlinear regulation theory. IEEE Trans. Autom. Control 59(10),
2680–2691 (2014)
8. Su, Y., Huang, J.: Cooperative semi-global robust output regulation for a class of nonlinear
uncertain multi-agent systems. Automatica 50(4), 1053–1065 (2014)

9. Casadei, G., Astolfi, D.: Multipattern output consensus in networks of heterogeneous nonlinear
agents with uncertain leader: a nonlinear regression approach. IEEE Trans. Autom. Control
63(8), 2581–2587 (2017)
10. Su, Y.: Semi-global output feedback cooperative control for nonlinear multi-agent systems via
internal model approach. Automatica 103, 200–207 (2019)
11. Zhang, M., Saberi, A., Stoorvogel, A.A., Grip, H.F.: Almost output synchronization for het-
erogeneous time-varying networks for a class of non-introspective, nonlinear agents without
exchange of controller states. Int. J. Robust Nonlinear Control 26(17), 3883–3899 (2016)
12. DeLellis, P., Di Bernardo, M., Liuzza, D.: Convergence and synchronization in heterogeneous
networks of smooth and piecewise smooth systems. Automatica 56, 1–11 (2015)
13. Montenbruck, J.M., Bürger, M., Allgöwer, F.: Practical synchronization with diffusive cou-
plings. Automatica 53, 235–243 (2015)
14. Kim, J., Yang, J., Shim, H., Kim, J.-S., Seo, J.H.: Robustness of synchronization of heteroge-
neous agents by strong coupling and a large number of agents. IEEE Trans. Autom. Control
61(10), 3096–3102 (2016)
15. Panteley, E., Loría, A.: Synchronization and dynamic consensus of heterogeneous networked
systems. IEEE Trans. Autom. Control 62(8), 3758–3773 (2017)
16. Lee, S., Yun, H., Shim, H.: Practical synchronization of heterogeneous multi-agent system
using adaptive law for coupling gains. In: Proceedings of American Control Conference, pp.
454–459 (2018)
17. Modares, H., Lewis, F.L., Kang, W., Davoudi, A.: Optimal synchronization of heterogeneous
nonlinear systems with unknown dynamics. IEEE Trans. Autom. Control 63(1), 117–131
(2017)
18. Sanders, J.A., Verhulst, F.: Averaging Methods in Nonlinear Dynamical Systems. Springer,
Berlin (1985)
19. Lee, J.G., Shim, H.: A tool for analysis and synthesis of heterogeneous multi-agent systems
under rank-deficient coupling. Automatica 117 (2020)
20. Lohmiller, W., Slotine, J.J.E.: On contraction analysis for non-linear systems. Automatica
34(6), 683–696 (1998)
21. Shames, I., Charalambous, T., Hadjicostis, C.N., Johansson, M.: Distributed network size esti-
mation and average degree estimation and control in networks isomorphic to directed graphs.
In: Proceedings of 50th Annual Allerton Conference on Communication, Control, and Com-
puting, pp. 1885–1892 (2012)
22. Lee, D., Lee, S., Kim, T., Shim, H.: Distributed algorithm for the network size estimation:
blended dynamics approach. In: Proceedings of 57th IEEE Conference on Decision and Control,
pp. 4577–4582 (2018)
23. Mou, S., Morse, A.S.: A fixed-neighbor, distributed algorithm for solving a linear algebraic
equation. In: Proceedings of 12th European Control Conference, pp. 2269–2273 (2013)
24. Mou, S., Liu, J., Morse, A.S.: A distributed algorithm for solving a linear algebraic equation.
IEEE Trans. Autom. Control 60(11), 2863–2878 (2015)
25. Anderson, B.D.O., Mou, S., Morse, A.S., Helmke, U.: Decentralized gradient algorithm for
solution of a linear equation. Numer. Algebra Control Optim. 6(3), 319–328 (2016)
26. Wang, X., Zhou, J., Mou, S., Corless, M.J.: A distributed linear equation solver for least square
solutions. In: Proceedings of 56th IEEE Conference on Decision and Control, pp. 5955–5960
(2017)
27. Shi, G., Anderson, B.D.O.: Distributed network flows solving linear algebraic equations. In:
Proceedings of American Control Conference, pp. 2864–2869 (2016)
28. Shi, G., Anderson, B.D.O., Helmke, U.: Network flows that solve linear equations. IEEE Trans.
Autom. Control 62(6), 2659–2674 (2017)
29. Liu, Y., Lou, Y., Anderson, B.D.O., Shi, G.: Network flows as least squares solvers for linear
equations. In: Proceedings of 56th IEEE Conference on Decision and Control, pp. 1046–1051
(2017)
30. Lee, J.G., Shim, H.: A distributed algorithm that finds almost best possible estimate under
non-vanishing and time-varying measurement noise. IEEE Control Syst. Lett. 4(1), 229–234
(2020)

31. Lee, J.G., Kim, J., Shim, H.: Fully distributed resilient state estimation based on distributed
median solver. IEEE Trans. Autom. Control 65(9), 3935–3942 (2020)
32. Yun, H., Shim, H., Ahn, H.-S.: Initialization-free privacy-guaranteed distributed algorithm for
economic dispatch problem. Automatica 102, 86–93 (2019)
33. Lee, J.G., Shim, H.: Behavior of a network of heterogeneous Liénard systems under strong
output coupling. In: Proceedings of 11th IFAC Symposium on Nonlinear Control Systems, pp.
342–347 (2019)
34. Dörfler, F., Bullo, F.: Synchronization in complex networks of phase oscillators: a survey.
Automatica 50(6), 1539–1564 (2014)
35. Bai, H., Freeman, R.A., Lynch, K.M.: Distributed Kalman filtering using the internal model
average consensus estimator. In: Proceedings of American Control Conference, pp. 1500–1505
(2011)
36. Kim, J., Shim, H., Wu, J.: On distributed optimal Kalman-Bucy filtering by averaging dynamics
of heterogeneous agents. In: Proceedings of 55th IEEE Conference on Decision and Control,
pp. 6309–6314 (2016)
37. Mitra, A., Sundaram, S.: An approach for distributed state estimation of LTI systems. In:
Proceedings of 54th Annual Allerton Conference on Communication, Control, and Computing,
pp. 1088–1093 (2016)
38. Kim, T., Shim, H., Cho, D.D.: Distributed Luenberger observer design. In: Proceedings of 55th
IEEE Conference on Decision and Control, pp. 6928–6933 (2016)
39. dos Santos, A.M., Lopes, S.R., Viana, R.L.: Rhythm synchronization and chaotic modulation
of coupled Van der Pol oscillators in a model for the heartbeat. Physica A: Stat. Mech. Appl.
338(3–4), 335–355 (2004)
40. Ilchmann, A., Ryan, E.P., Sangwin, C.J.: Tracking with prescribed transient behaviour. ESAIM:
Control, Optim. Calc. Var. 7, 471–493 (2002)
41. Lee, J.G., Berger, T., Trenn, S., Shim, H.: Utility of edge-wise funnel coupling for asymptot-
ically solving distributed consensus optimization. In: Proceedings of 19th European Control
Conference, pp. 911–916 (2020)
42. Lee, J.G., Trenn, S., Shim, H.: Synchronization with prescribed transient behavior: heterogeneous multi-agent systems under funnel coupling. Under review (2020). https://arxiv.org/abs/2012.14580
Chapter 5
Contributions to the Problem of High-Gain Observer Design for Hyperbolic Systems

Constantinos Kitsos, Gildas Besançon, and Christophe Prieur

Abstract This chapter proposes some non-trivial extensions of the classical high-gain
observer designs for finite-dimensional nonlinear systems to some classes of
infinite-dimensional ones, written as triangular systems of coupled first-order hyperbolic
Partial Differential Equations (PDEs), where an observation of only one coordinate
of the state is considered as the system's output. These forms may include
some epidemic models and tubular chemical reactors. To deal with this problem,
depending on the number of distinct velocities of the hyperbolic system, direct and
indirect observer designs are proposed. We first show intuitively how a direct observer
design can be applied to quasilinear partial integrodifferential hyperbolic systems of
balance laws with a single velocity, as a natural extension of the finite-dimensional
case. We then introduce an indirect approach for systems with distinct velocities (up
to three velocities), where an infinite-dimensional state transformation first maps the
system into suitable systems of PDEs, and the convergence of the observer is subsequently
established in appropriate norms. This indirect approach leads to the use of
spatial derivatives of the output in the observer dynamics.
Dedication
Control Theory owes a lot to Laurent Praly, and his scientific legacy has inspired the
development of various new areas in Control. His
long-term and still ongoing contributions in adaptive robust control and stabilization,
forwarding and backstepping methods, nonlinear observers, and output feedback
control are deeply motivating, and it is a pleasure to propose the present chapter as a

C. Kitsos (B)
LAAS-CNRS, University of Toulouse, CNRS, 31400 Toulouse, France
e-mail: [email protected]
C. Kitsos · G. Besançon · C. Prieur
Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France
e-mail: [email protected]
C. Prieur
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 109
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5_5
110 C. Kitsos et al.

tribute to his work, the quality of presentation of his results, the rigor of his theorems,
and his continuous pursuit of the greatest possible generality of approaches.

5.1 Introduction

Our chapter deals with solutions to a problem of High-Gain Observer design (H-
GODP) for hyperbolic systems. This problem for the case of finite-dimensional
systems has already been addressed (see [12, 13]) and has gained significant con-
sideration in the last decades [17, 18]. These observers rely on a tuning parameter
(gain), chosen large enough, in order to compensate for the nonlinear terms and
ensure arbitrary convergence rate. This chapter aims at presenting some extensions
of this design to infinite-dimensional systems, namely hyperbolic systems of balance
laws, obeying to some triangular structure, similarly to the observability form in the
finite dimensions, see [4] while considering one observation only. There exist some
studies on observer design for infinite-dimensional systems in the literature, mainly
considering the full state vector on the boundaries as measurement. Among others,
one can refer to [2, 11, 14, 29] for Lyapunov-based analysis and backstepping, and
to [26] for optimization methods. The case of state estimation for nonlinear infinite-
dimensional systems, which is significantly more complicated, has been addressed in
[5, 6, 15, 25, 28, 30]. Unlike these approaches, the present chapter provides solutions
to this H-GODP, where a part of the state is fully unknown (including at the bound-
aries). The known part is however distributed and the explored observers strongly rely
on high gain, extending techniques and performances of finite-dimensional cases.
In general, the problem of control/observer design with a reduced number of
controls/observations, less than the number of the states, is a difficult problem. To
the best of our knowledge, observer design for systems with reduced number of
observations, whose error equations cannot achieve dissipativity in their boundary
conditions, has not been considered. Somehow dual problems of controllability for
cascade systems of PDEs with a reduced number of internal controls have already
been considered (see in particular [1]). In [24], observability for coupled systems
of linear PDEs with a reduced number of observations is studied. In this work, we
reveal some links to these works, stemming from our assumption of stronger regularity of the system. In addition, for hyperbolic systems, arbitrary convergence, a feature of high-gain observers, would be desirable, since the boundary observers proposed in the literature, for instance [6], experience a limitation with respect to convergence speed (transport phenomena). The minimum-time control problem, see for instance [10], suggests that an observer faster than a boundary one would be desirable in some cases. While dealing with the H-GODP in infinite dimensions, the assumed triangularity of the source terms, similar to the finite-dimensional case, is not enough, and several difficulties come from the properties of the hyperbolic operator. This might not allow designs for an arbitrarily large number of states. Also, the presence of nonlocal terms in the dynamics, the generality of the boundary conditions, and the types of nonlinearities increase the complexity of the design.
5 High-Gain Observer Design for Systems of PDEs 111

The main contribution here is the proof of solvability of the H-GODP, first for n × n quasilinear hyperbolic triangular systems with nonlocal terms, i.e., systems of Partial Integrodifferential Equations (PIDEs), considering only a single velocity function. Then, in the case of distinct velocities, the nonexistence of diagonal Lyapunov functionals that would yield a proof of observer convergence leads us to adopt an indirect strategy for the case of 2 × 2 and 3 × 3 systems. For this, we introduce a nonlinear infinite-dimensional state transformation, in order to map the initial system into a new system of PDEs. The problem comes from the lack of a commutation property, which is nonetheless needed in the Lyapunov stability analysis. Note that constraints on the source term can be found in some studies of stability problems, as in [3, 9, 27], which allow a similar commutation, while this is not the case here. The methodology proposed here requires stronger regularity of the system's dynamics; spatial derivatives of the output up to order 2 are then injected into the high-gain observer dynamics, in addition to the classical output correction terms. The presence of nonlinearities in the velocity functions, which might also be of nonlocal nature, the presence of nonlocal and nonlinear components in the source terms, and the generality of the couplings on the boundaries are treated explicitly. The proposed approach relies on Lyapunov analysis in function spaces of stronger regularity and on the introduction of an infinite-dimensional state transformation. The direct observer design has already partially appeared in [20], without considering velocity functions of nonlocal nature. The indirect one is inspired by our previous work on semilinear parabolic systems [21]; for quasilinear strictly hyperbolic systems, such as the ones considered here, it has not appeared before.
In Sect. 5.2, we introduce the considered system, some examples from epidemiology and chemical reactions, and then the main observer design problem, the H-GODP, along with its complications and some proposed solutions. In Sect. 5.3, we present a direct approach to the H-GODP for a hyperbolic system with one velocity via a Lyapunov-based methodology. In Sect. 5.4, we show indirect solvability of the H-GODP for systems with distinct velocities, via a suitable infinite-dimensional state transformation, which maps the system into appropriate target systems for observer design.
Notation For a given w in R^n, |w| denotes its usual Euclidean norm. For a given constant matrix A in R^{n×n}, A^⊤ denotes its transpose, |A| := sup {|Aw|, |w| = 1} is its induced norm, and Sym(A) = (A + A^⊤)/2 stands for its symmetric part. By eig(A), we denote the minimum eigenvalue of a symmetric matrix A. By I_n, we denote the identity matrix of dimension n. For given ξ : [0, +∞) × [0, L] → R^n and time t ≥ 0, we use the notation ξ(t)(x) := ξ(t, x), for all x in [0, L], to refer to the profile at a certain time, and by ξt or ∂t ξ (resp. ξx or ∂x ξ), we refer to its partial derivative with respect to t (resp. x). By dt (resp. dx), we refer to the total derivative with respect to t (resp. x). For a continuous (C^0) map [0, L] ∋ x ↦ ξ(x) ∈ R^n, we adopt the notation ‖ξ‖_0 := max{|ξ(x)|, x ∈ [0, L]} for its sup-norm. If this mapping is q-times continuously differentiable (C^q), we adopt the notation ‖ξ‖_q := Σ_{i=0}^{q} ‖∂_x^i ξ‖_0 for the q-norm. We use the difference operator given by Δ_ξ̂[F](ξ) := F[ξ̂] − F[ξ], parametrized by ξ̂, where F denotes any chosen operator acting on ξ. By Df, we denote the Jacobian of a differentiable mapping R^n ∋ u ↦ f(u) ∈ R^m. For a Fréchet differentiable mapping F, by ⟨DF[u], h⟩ we denote its Fréchet derivative w.r.t. u acting on h. For a locally Lipschitz mapping F, F ∈ Lip_loc(X, ‖·‖_X) means that for every R > 0, there exists L_R > 0, such that, for every w, ŵ ∈ X with ‖w‖_X, ‖ŵ‖_X ≤ R, it holds that ‖F[w] − F[ŵ]‖_X ≤ L_R ‖w − ŵ‖_X. For a globally Lipschitz mapping F, i.e., F ∈ Lip(X, ‖·‖_X), the previous holds for all w, ŵ ∈ X. By sgn(x) we denote the signum function sgn(x) = (d/dx)|x| for x ≠ 0, with sgn(0) = 0.

5.2 Problem Description and Solutions

In this section, we introduce the hyperbolic system written in a triangular form, which allows the observer design proposed in this chapter. It might be quasilinear and may contain both velocity functions of nonlocal nature and nonlocal source terms, making it a system of PIDEs. We illustrate some examples of systems having this triangular form, and then we introduce the main observer problem and its solutions.

5.2.1 Triangular Form for Observer Design

We are concerned with one-dimensional hyperbolic systems of balance laws, described by the following equations on a strip Π := [0, +∞) × [0, L]:

ξt(t, x) + Λ[ξ1(t)](x) ξx(t, x) = Aξ(t, x) + F[ξ(t)](x),   (5.1a)

where ξ = (ξ1 · · · ξn)^⊤. The matrix function Λ[·] contains the velocity functions of
the balance law, each of which assumed strictly positive, and is diagonal of the form

Λ[ξ1 ] := diag {λ1 [ξ1 ], . . . , λn [ξ1 ]} .

We assume a specific structure of the source terms, which provides an internal coupling of the n equations in a triangular fashion. More specifically, the matrix A contains 1s on its superdiagonal and 0s elsewhere, i.e., it performs the operation

Aξ = (ξ2 ξ3 · · · ξn 0)^⊤,

and the nonlinear source term F[·] has the following form:

F[ξ] = (F1[ξ1] F2[ξ1, ξ2] · · · Fn[ξ1, . . . , ξn])^⊤.

We assume that the mappings Λ might include local terms of the form λi(ξ1(t, x)) or nonlocal terms of the form λi(∫_0^x ξ1(t, s) ds) or ∫_0^x λi(ξ1(t, s)) ds, for instance. The same holds for the nonlinear source term, which might include local terms of the form f(x, ξ(t, x)), integral terms of Volterra type of the form ∫_0^x f(s, ξ(t, s)) ds, and possibly boundary terms of the form f(ξ(t, l)), with l = 0, L.
Consider, also, a distributed measurement, available at the output, written as follows:

y(t, x) = Cξ(t, x),   (5.1b)

where

C = (1 0 · · · 0).
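To make the triangular structure concrete, the following sketch (an illustrative numerical aside; the dimension n = 4 is an assumption, not taken from the chapter) builds A and C as defined above and checks that the pair (A, C) is observable, which is what later permits choosing a gain K such that A + KC is Hurwitz:

```python
import numpy as np

n = 4  # illustrative dimension (an assumption for this sketch)

# A has 1s on the superdiagonal: A @ xi = (xi_2, ..., xi_n, 0)
A = np.diag(np.ones(n - 1), k=1)
# C selects the first (measured) component: C @ xi = xi_1
C = np.zeros((1, n))
C[0, 0] = 1.0

xi = np.arange(1.0, n + 1.0)          # xi = (1, 2, 3, 4)
assert np.allclose(A @ xi, [2.0, 3.0, 4.0, 0.0])
assert np.allclose(C @ xi, [1.0])

# Kalman observability matrix [C; CA; ...; CA^{n-1}] has full rank n,
# so (A, C) is observable (here the stacked rows are simply the identity).
O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
print(np.linalg.matrix_rank(O))  # -> 4
```

The observability matrix here is the identity, reflecting that each state ξi is reached from the single measurement ξ1 through i − 1 successive superdiagonal shifts.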

To complete the definition of the class of systems, let us consider initial condition
(in general unknown) ξ 0 and boundary conditions as follows:

ξ(0, x) = ξ^0(x), x ∈ [0, L],   (5.2a)
ξ(t, 0) = H(ξ(t, L)), t ∈ [0, +∞),   (5.2b)

where H is a nonlinear mapping coupling the incoming with the outgoing information on the boundaries.
More about the regularity of the dynamics and the properties of the system will be provided in the forthcoming sections. Note that the above system has, up to the hyperbolic operator, the same structure as the finite-dimensional nonlinear triangular systems appropriate for observer design, see [17].
We provide here some examples of dynamic phenomena, coming from epidemiology and chemical reactions, that can be described by triangular systems of hyperbolic PDEs as the ones given above. Note that some distributed Lotka–Volterra systems might also take this triangular form, as was shown in [21], though obeying parabolic equations.

• SIR epidemic models


For infectious diseases, a fundamental model was formulated by Kermack and McKendrick (see [3, Chap. 1] for more details). In this model, the population is classified into three groups: (i) the individuals who are uninfected and susceptible (S) of catching the disease, (ii) the individuals who are infected (I) by the concerned pathogen, (iii) the recovered (R) individuals, who have acquired a permanent immunity to the disease. Assuming that the age of patients is taken into account, S(t, x), I(t, x), R(t, x) represent the age distribution of the population of each group at time t. As a result, the integral from x1 to x2 of S, I, and R is the number of individuals of each group with ages between x1 and x2.
The dynamics of the disease propagation in the population are then described by
the following set of hyperbolic PIDEs on Π
114 C. Kitsos et al.

St (t, x) + Sx (t, x) + μ(x)S(t, x) + G [S(t), I (t)](x) = 0,


It (t, x) + Ix (t, x) + (γ (x) + μ(x)) I (t, x) − G [S(t), I (t)](x) = 0, (5.3)
Rt (t, x) + Rx (t, x) + μ(x)R(t, x) − γ (x)I (t, x) = 0,
where G[S(t), I(t)](x) := β(x)S(t, x) ∫_0^L I(t, s) ds stands for the disease transmission rate by contact between susceptible and infected individuals, which is
assumed to be proportional to the sizes of both groups, with β(x) > 0 being
the age-dependent transmission coefficient between all infected individuals and
susceptibles having age x. The maximal life duration in the considered popu-
lation is denoted by L and, thus, S(t, L) = I (t, L) = R(t, L) = 0. Parameter
μ(x) > 0 denotes the natural age-dependent per capita death rate in the pop-
ulation and γ (x) > 0 is the age-dependent rate at which infected individuals
recover from the disease. We also assume some boundary conditions of the form
S(t, 0) = B(t), I(t, 0) = 0, R(t, 0) = 0, where B(t) stands for the inflow of newborn individuals in the susceptible part of the population at time t. Assume that we are able to measure the number of people in the group R of recovered patients between ages 0 and x, for every age x ∈ [0, L] and time t ≥ 0, i.e., the system's output is given by the quantity ∫_0^x R(t, s) ds.
System (5.3) is written in the form (5.1a)-(5.1b)-(5.2) by applying a nonlocal transformation of the following form:

ξ1(t, x) = ∫_0^x R(t, s) ds,
ξ2(t, x) = ∫_0^x γ(s)I(t, s) ds,   (5.4)
ξ3(t, x) = ∫_0^x β(s)γ(s)S(t, s) ds ∫_0^L I(t, s) ds.

Then, in the new coordinates, the system is written in the general form (5.1a) considered here, with constant velocities, namely Λ[ξ1] = I_n, and with its nonlinear source term having the form F[ξ(t)](x) := F(x, ξ(t)(x)), containing also nonlinear nonlocal terms, more explicitly, some integral terms of Volterra type and boundary terms. For the exact form of these mappings, derived after the transformation (5.4), the reader can refer to [20].
Such a problem, where the hyperbolic operator has a single velocity, is investigated in Sect. 5.3, and it allows a direct observer design. Note also that, due to the nonlocal nature of the transformation (5.4), one needs to prove the convergence of a candidate observer for the system in the new coordinates ξ in the spatial 1-norm (and not in the spatial sup-norm), in order to be able to return to the original coordinates of the SIR model.
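As a rough numerical illustration of the nonlocal transformation (5.4) (a sketch only: the grid, the profiles S, I, R, and the coefficient functions β, γ are synthetic assumptions, not data from the model):

```python
import numpy as np

L = 1.0
x = np.linspace(0.0, L, 201)

# Synthetic (assumed) age profiles at a fixed time t and assumed coefficients
S = np.exp(-x)            # susceptible density
I = x * np.exp(-2.0 * x)  # infected density
R = 1.0 - np.exp(-x)      # recovered density
gamma = 0.5 + 0.1 * x     # recovery rate gamma(x) > 0
beta = 0.2 + 0.05 * x     # transmission coefficient beta(x) > 0

def cumtrapz0(f, x):
    """Cumulative trapezoidal integral of f over [0, x], vanishing at x = 0."""
    out = np.zeros_like(f)
    out[1:] = np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(x))
    return out

# Transformation (5.4): xi1, xi2 are Volterra-type integrals in age; xi3 also
# carries the nonlocal factor int_0^L I(t, s) ds.
xi1 = cumtrapz0(R, x)
xi2 = cumtrapz0(gamma * I, x)
I_total = cumtrapz0(I, x)[-1]                 # int_0^L I(t, s) ds
xi3 = cumtrapz0(beta * gamma * S, x) * I_total

# Consistent with the integral form of (5.4), the new states vanish at x = 0
assert xi1[0] == xi2[0] == xi3[0] == 0.0
```

Since R, γI, and βγS are nonnegative here, each new state is nondecreasing in x, which is one quick sanity check on the implementation.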
• Tubular chemical reactors
Control and observer designs for chemical reactors in the context of distributed
parameter systems have been widely investigated, see for instance [7]. We present

here a model of a parallel plug flow chemical reactor (see [3, Chap. 5.1]). A plug
flow chemical reactor is a tubular reactor where a liquid reaction mixture circulates.
The reaction proceeds as the reactants travel through the reactor. We consider the
case of a horizontal reactor, where a simple mono-molecular reaction takes place
between A and B, where A is the reactant species and B is the desired product.
The reaction is supposed to be exothermic and a jacket is used to cool the reactor.
The cooling fluid flows around the wall of the tubular reactor. The dynamics are
described by the following hyperbolic equations on Π :

∂t Tc + Vc ∂x Tc − k0 (Tc − Tr) = 0,
∂t Tr + Vr ∂x Tr + k0 (Tc − Tr) − k1 r(Tr, CA) = 0,   (5.5)
∂t CA + Vr ∂x CA + r(Tr, CA) = 0,

where Vc is the coolant velocity in the jacket, Vr is the reactive fluid velocity in the reactor, k0 and k1 are some positive constants, Tc(t, x) is the coolant temperature, Tr(t, x) is the reactor temperature, and CA(t, x) is the concentration of the chemical A in the reaction medium. The function r(Tr, CA) stands for the reaction rate and is given by r(Tr, CA) := ((a + b)CA − b CA^in) exp(−E/(R Tr)), where we have assumed that the sum of concentrations CA + CB is constant, equal to CA(t, x) + CB(t, x) = CA^in, as it is simply described by a delay equation. Also, a, b are rate constants, CA^in is the concentration at the left endpoint, E is the activation energy, and R is the Boltzmann constant. We consider boundary conditions of the form Tr(t, 0) = Tr^in, Tc(t, 0) = Tc^in, CA(t, 0) = CA^in, CB(t, 0) = 0. Assuming that the
measured quantity is the coolant temperature Tc, we can transform system (5.5) into the form (5.1a)-(5.1b)-(5.2) by applying the invertible transformation

ξ1 = Tc,  ξ2 = Tr,  ξ3 = k1 ((a + b)CA − b CA^in) exp(−E/(R Tr)).

In this example, the hyperbolic operator has distinct velocities. Such a problem is
investigated in Sect. 5.4.
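A minimal sketch of this change of coordinates and its inverse (the parameter values below are illustrative assumptions; only CA is actually transformed, and invertibility follows from a + b > 0 together with the strictly positive exponential factor):

```python
import numpy as np

# Illustrative (assumed) parameter values; only their positivity matters here
a, b, k1 = 1.0, 0.5, 2.0
E, Rgas = 5.0e4, 8.314
CA_in = 1.0

def to_xi(Tc, Tr, CA):
    """Change of coordinates (Tc, Tr, CA) -> (xi1, xi2, xi3) used for (5.5)."""
    xi3 = k1 * ((a + b) * CA - b * CA_in) * np.exp(-E / (Rgas * Tr))
    return Tc, Tr, xi3

def from_xi(xi1, xi2, xi3):
    """Inverse transformation, solving for CA (well defined since a + b > 0)."""
    CA = (xi3 * np.exp(E / (Rgas * xi2)) / k1 + b * CA_in) / (a + b)
    return xi1, xi2, CA

Tc, Tr, CA = 310.0, 350.0, 0.8
assert np.allclose(from_xi(*to_xi(Tc, Tr, CA)), (Tc, Tr, CA))
```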

5.2.2 The High-Gain Observer Design Problem

We present here the main problem this chapter deals with and some proposed solutions.

Definition 5.1 (H-GODP) The High-Gain Observer Design Problem is solvable for a system written in the form (5.1a)–(5.2) with output (5.1b), while spatial derivatives of the output of order at most n − 1 might also be available, if there exists a well-posed observer system of PDEs, which estimates the state of the initial system with a convergence speed that can be arbitrarily tuned via a single parameter (high-gain constant) θ. More precisely, for every κ > 0, there exists θ0 > 1, such that for every θ ≥ θ0, solutions to (5.1a)–(5.2) satisfy

‖ξ̂(t, ·) − ξ(t, ·)‖_{X1} ≤ ℓ e^{−κt} ‖ξ̂^0(·) − ξ^0(·)‖_{X2}   (5.6)

for some ℓ > 0 polynomial in θ, where ξ̂, ξ̂^0 represent the observer state and its initial condition, respectively, and by X1, X2 we denote some function spaces, whose accurate choice depends on the number of distinct velocities.

A feature of this observer design problem is the considered internal measurement of a part of the state, without any other knowledge of the remaining states. Furthermore, another feature indicated in the definition of the H-GODP is the required stronger regularity of the solutions to the initial system, since the observer dynamics may include spatial derivatives of the output. This requirement reveals some links to previous studies on internal controllability for cascade systems with a reduced number of controls, see [1]. We note here that, although boundary observers with full-state measurement are preferred for practical reasons, see for instance [6], in the present formulation distributed measurement of a part of the state might be available in many cases, for instance, via thermal cameras for chemical reactors, or via approximations with distributed measurements within the domain. Additionally, the required spatial derivatives of the output can be available in real time, since they follow from causal measurements, contrary to the time derivatives of the output, which are strictly excluded from observer designs, as knowledge of them is non-causal. Although this requirement of the availability of spatial derivatives of the output might seem restrictive, approximations via kernel convolutions might be an alternative realization.

Remark 5.1 Solvability of the H-GODP suggests that a high-gain observer would
be arbitrarily fast, without any limitation in the convergence speed. H-GODP is
not solvable in the case of boundary measurement, instead of internal measurement
as in (5.1b). First, arbitrary convergence condition would not be fulfilled since a
boundary observer for hyperbolic systems would experience a limitation with respect
to convergence speed. The rate of convergence is limited by a minimal observation
time which depends on the size of the domain and the velocities in that case (see [23]
for minimum time of observability due to transport phenomena). Second, following a
boundary observer design methodology as in [6], in the presence of a general form of
boundary conditions, where a nonlinear law couples the incoming with the outgoing
information on the boundaries, boundary measurement of the whole state vector
would be required, instead of just the first state, for the boundary observer to be
feasible. In [8], control design is achieved for a 2 × 2 hyperbolic system with some
specific boundary conditions, via boundary control on one end of only one state.
Here, however, where we consider the dual problem of observer design with one observation, such an approach would not be feasible, because for general boundary conditions, with just one observation, we cannot achieve a dissipativity of the boundary conditions that would lead to stability of the observation error system (see [3] about linking dissipativity of boundary conditions with stability).

The main problem appearing when dealing with the solvability of the H-GODP comes from the hyperbolic operator. More particularly, the form and also the domain of the hyperbolic operator might be general, including distinct velocities and also very general couplings of the incoming with the outgoing information on the boundaries. In stability analysis of hyperbolic systems, diagonal Lyapunov functionals are usually chosen, see [3], since taking their time derivative requires an integration by parts, which can be simplified if the Lyapunov matrix commutes with the diagonal matrix of the velocities Λ[·]. In our case, the Lyapunov matrix cannot be diagonal, since it must solve a quadratic Lyapunov equation for the non-diagonally stabilizable matrix A. Section 5.3 deals with a solution to this H-GODP for a system with one velocity, where an extension from finite dimensions is direct, since the aforementioned commutation property is met. In Sect. 5.4, we elaborate an indirect design, where the general hyperbolic operator with distinct velocities is decomposed into a new hyperbolic operator with one velocity, plus some mappings acting only on the measured first state, and a bilinear triangular mapping between the measured first state and the second one. In addition to these complications, note that for the solvability of the H-GODP, difficulties also come from the presence of nonlocal terms, which require a stability proof in the sup-norm, and also from the quasilinearity of the system, i.e., the dependence of Λ[·] on the state.

5.3 Observer Design for Systems with a Single Velocity

In this section, we show the solvability of the H-GODP for a system with a single velocity, which constitutes a direct extension of observer designs in finite dimensions.

5.3.1 Problem Statement and Requirements

Consider the general hyperbolic system (5.1a)–(5.2), with output (5.1b) and with the same triangularity of its mappings given therein. We assume in this section that the system has only one velocity, i.e., the matrix of velocities is of the form

Λ[ξ1] := λ[ξ1] I_n,

where the velocity λ : C^1([0, L]; R) → C^1([0, L]; R) is Fréchet differentiable, possibly nonlocal, and positive (namely, λ[y] > 0, for all y ∈ C^0([0, L]; R) (nonlocal case), or y ∈ R (local case)), and the nonlinear mapping F[ξ(t)](x) := F(x, ξ(t)(x)), with F : [0, L] × C^1([0, L]; R^n) → C^1([0, L]; R^n), is continuously differentiable with respect to its first argument and Fréchet differentiable with respect to its second argument. It can possibly contain nonlocal terms (integral terms of Volterra type and boundary terms). We further assume that DF ∈ Lip_loc(C^0([0, L]; R^n), ‖·‖_0). Also, the initial condition ξ^0 ∈ C^1([0, L]; R^n) satisfies zero-order and one-order compatibility conditions (see [3, App. B] for a precise definition of compatibility conditions), and the nonlinear mapping H coupling the incoming with the outgoing information is in C^1(R^n; R^n), while its gradient is locally Lipschitz continuous, i.e., DH ∈ Lip_loc(R^n, |·|).
The assumption that follows is essential to assert the well-posedness of the considered system, along with an observer design requirement of forward completeness. Furthermore, it imposes global boundedness of classical solutions in the 1-norm. The latter requirement is due to the quasilinearity of the system (the dependence of λ on ξ1) and can be dropped for the case of semilinear systems, but then a stronger assumption on the nonlinear source terms would be imposed instead. For a more detailed presentation of the nature of the following assumption, the reader can refer to [3] and references therein, where sufficient conditions for the well-posedness and existence of classical solutions for hyperbolic systems are given. In the case of nonlocal conservation laws, i.e., where the velocity λ : C^0([0, L]; R) → C^1([0, L]; R) might be of the form λ[ξ1(t)](x) := λ(∫_0^x ξ1(t, s) ds), this assumption can be met more easily, see for instance [16] and other works of these authors.

Assumption 5.1 Consider a set M ⊂ C^1([0, L]; R) nonempty and bounded, consisting of functions satisfying zero-order and one-order compatibility conditions for problem (5.1a)–(5.2). Then for any initial condition ξ^0 in M, problem (5.1a)–(5.2) admits a unique classical solution in C^1([0, +∞) × [0, L]; R^n). Moreover, there exists δ > 0, such that for all ξ^0 in M, we have ξ ∈ B_δ^1 := {u ∈ C^1([0, L]; R^n) : ‖u‖_1 ≤ δ}.

With these assumptions, we are in a position to introduce our candidate observer dynamics and its boundary conditions on Π for system (5.1)–(5.2), as follows:

ξ̂t(t, x) + λ[y(t)](x) ξ̂x(t, x) = Aξ̂(t, x) + F[sδ(ξ̂(t))](x) − ΘK(y(t, x) − Cξ̂(t, x)),   (5.7a)
ξ̂(t, 0) = H(sδ(ξ̂(t, L))),   (5.7b)

where the function sδ : R^n ∋ ζ ↦ sδ(ζ) = (sδ1(ζ1), . . . , sδn(ζn))^⊤ is parametrized by δ (given in Assumption 5.1) and satisfies the following properties:
1. it is uniformly bounded and continuously differentiable;
2. its first derivative is uniformly bounded;
3. its derivative function Dsδ(·) is in Lip(R^n, |·|);
4. for every δ > 0 and v, w in Rn , such that |w| ≤ δ, there exists ωδ > 0, such that
the following inequality is satisfied

|sδ (v) − w| ≤ ωδ |v − w|. (5.8)

Note that a saturation-like function of the form

sδi(ζi) = ζi, if |ζi| ≤ δ,
sδi(ζi) = sgn(ζi)((|ζi| − δ) e^{−|ζi|+δ} + δ), if |ζi| > δ,   (5.9)

satisfies all these properties and, particularly, (5.8) with ωδ = √n max{1, e^{−1} + δ} and with Lipschitz constant of Dsδ(·) equal to √n e^{−3}.
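A direct implementation of (5.9) can be sketched as follows (the choice δ = 1 is an assumption; the checks illustrate the identity on [−δ, δ], continuity at |ζ| = δ, and uniform boundedness by δ + e^{−1}, which follows from x e^{−x} ≤ e^{−1}):

```python
import numpy as np

def sat(z, delta):
    """Componentwise saturation-like function (5.9): identity on [-delta, delta],
    smoothly and boundedly extended outside (|sat(z)| <= delta + 1/e)."""
    z = np.asarray(z, dtype=float)
    return np.where(
        np.abs(z) <= delta,
        z,
        np.sign(z) * ((np.abs(z) - delta) * np.exp(-np.abs(z) + delta) + delta),
    )

delta = 1.0
# Identity inside the ball of radius delta
assert sat(0.7, delta) == 0.7
# Continuity at |z| = delta (both branches give delta there)
assert np.isclose(sat(delta + 1e-9, delta), delta, atol=1e-6)
# Uniform boundedness: x * exp(-x) <= 1/e, hence |sat| <= delta + exp(-1)
z = np.linspace(-50, 50, 10001)
assert np.all(np.abs(sat(z, delta)) <= delta + np.exp(-1) + 1e-12)
```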
Also, Θ, appearing in the output correction term of the observer, is a diagonal matrix given by

Θ := diag{θ, θ^2, . . . , θ^n},   (5.10)

where θ > 1 is the candidate high-gain constant of the observer, which will be
selected precisely later, and K ∈ Rn is chosen such that A + K C is Hurwitz (we
can always find such a K , due to the observability of the pair (A, C)). Note that for
such a K , one can find a symmetric and positive definite n × n matrix P satisfying
a quadratic Lyapunov equation of the following form:

2Sym (P (A + K C)) = −In . (5.11)

Let us remark that a P satisfying (5.11) cannot be diagonal, since the matrix A fails by its definition to be diagonally stabilizable. The matrix P will be used as the Lyapunov matrix in the Lyapunov functional employed in the proof of the observer convergence. However, in stability analysis of general hyperbolic systems, see for instance [3], the chosen Lyapunov functionals are diagonal, in order to commute with the matrix of the velocities. In the present case, we assume only one velocity and, thus, we do not need P to be diagonal.
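For illustration, the following sketch (with the assumed dimension n = 3 and an assumed gain K placing all eigenvalues of A + KC at −1) computes a matrix P solving (5.11) numerically and confirms that it is symmetric, positive definite, and necessarily non-diagonal:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

n = 3
A = np.diag(np.ones(n - 1), k=1)   # 1s on the superdiagonal
C = np.zeros((1, n)); C[0, 0] = 1.0

# With C = e1^T, det(sI - (A + KC)) = s^3 - k1 s^2 - k2 s - k3;
# the (assumed) gain below yields (s + 1)^3, so A + KC is Hurwitz.
K = np.array([[-3.0], [-3.0], [-1.0]])
M = A + K @ C
assert np.all(np.linalg.eigvals(M).real < -0.9)

# Quadratic Lyapunov equation (5.11): 2 Sym(P M) = P M + M^T P = -I_n
P = solve_continuous_lyapunov(M.T, -np.eye(n))
assert np.allclose(P, P.T)                        # symmetric
assert np.all(np.linalg.eigvalsh(P) > 0)          # positive definite
assert np.allclose(P @ M + M.T @ P, -np.eye(n))   # satisfies (5.11)
assert not np.allclose(P, np.diag(np.diag(P)))    # P is not diagonal
```

The last check reflects the remark above: since A is not diagonally stabilizable, any P solving (5.11) must carry nonzero off-diagonal entries.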

5.3.2 Direct Solvability of the H-GODP

We are in a position to present our main result on the solvability of the H-GODP via
observer system (5.7).
Theorem 5.1 Consider system (5.1a)–(5.2), with a single velocity λ and output (5.1b), and suppose that Assumption 5.1 holds for an initial condition ξ^0 in M. Let also K in R^n be chosen in such a way that A + KC is Hurwitz. Then, the H-GODP for system (5.1a)–(5.2) is solvable by system (5.7) for θ > 1 as a high gain and initial condition ξ̂^0 in C^1([0, L]; R^n), with ξ̂^0(x) = ξ̂(0, x), satisfying zero-order and one-order compatibility conditions. More precisely, for every κ > 0, there exists θ0 ≥ 1, such that for every θ > θ0, the following inequality holds
θ0 ≥ 1, such that for every θ > θ0 , the following inequality holds

‖ξ̂(t, ·) − ξ(t, ·)‖_1 ≤ ℓ e^{−κt} ‖ξ̂^0(·) − ξ^0(·)‖_1, ∀t ≥ 0   (5.12)

for some ℓ > 1, polynomial in θ.


Note that Theorem 5.1 shows solvability of the H-GODP of Definition 5.1 with no use of spatial derivatives of the output. This is the reason why we call this approach direct. This result slightly generalizes [19, 20] in the sense that it also considers the case of a velocity function of nonlocal nature.

Proof Prior to the observer convergence, existence and uniqueness of global classical solutions to the observer system, which is a semilinear hyperbolic system with possibly nonlocal terms, must be proven. The reader can refer to [22, Theorem 2.1] and, similarly to that work, we can follow a fixed-point methodology, taking into account the sufficient regularity of the dynamics, the global boundedness of the system's solutions (and, thus, of the output y) coming from Assumption 5.1, and also the fact that the nonlinearities appearing in the observer system are globally Lipschitz. More details can be found in [20, Appendix A].
Consider the following linearly transformed observer error:

ε := Θ^{−1}(ξ̂ − ξ),

for which we derive the following hyperbolic equations on Π:

εt(t, x) + λ[y(t)](x) εx(t, x) = θ(A + KC) ε(t, x) + Θ^{−1} Δ_{z(t)}[F](ξ(t))(x),   (5.13a)
ε(t, 0) = Θ^{−1} Δ_{z(t)}[H](ξ(t))(L),   (5.13b)

where z := sδ(ξ̂). Notice that in the above internal dynamics, θ(A + KC)ε will be
used as a damping term leading to the exponential stabilization of the error dynamics.
To prove exponential stability of the error system at the origin in the 1-norm, we
adopt a Lyapunov approach inspired by methodologies presented in [3]. The proof
is included in the Appendix. A slightly different proof for a local velocity function
has appeared in [20].
Following the proof in the Appendix, the H-GODP for (5.1a), (5.1b), (5.2) is solved by designing a high-gain observer, exponentially convergent in the 1-norm, with an adjustable convergence rate depending on the selection of θ.
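The role of θ can be illustrated on the finite-dimensional analog of the internal error dynamics (5.13a), dropping the transport and perturbation terms (an ODE sketch, not the PDE proof; the dimension, gain, and time horizon are assumptions): ε̇ = θ(A + KC)ε decays faster as θ grows.

```python
import numpy as np
from scipy.linalg import expm

n = 3
A = np.diag(np.ones(n - 1), k=1)
C = np.zeros((1, n)); C[0, 0] = 1.0
K = np.array([[-3.0], [-3.0], [-1.0]])  # places all eigenvalues of A + KC at -1
M = A + K @ C

# ODE analog of (5.13a) without transport/perturbation: eps' = theta * M * eps,
# hence eps(t) = expm(theta * M * t) @ eps(0); larger theta -> faster decay.
eps0 = np.ones((n, 1))
t = 3.0
norms = {th: float(np.linalg.norm(expm(th * M * t) @ eps0)) for th in (1.0, 2.0, 5.0)}
assert norms[5.0] < norms[2.0] < norms[1.0]
```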

5.4 Observer Design for Systems with Distinct Velocities

In this section, we employ an indirect strategy, in order to show solvability of the H-GODP when the velocities are not identical. To this end, we first map the system into an appropriate system of PDEs via an introduced nonlinear infinite-dimensional state transformation, noting however that for nonlinear systems with more than three states, accompanied by more than three distinct velocities, such a state transformation-based approach is difficult to employ.

5.4.1 System Requirements and Main Approach

Consider again system (5.1)–(5.2), but with the restriction of up to three states, namely

n ∈ {2, 3}.

To provide the properties and appropriate regularity of the dynamics of the considered system, let us first define a, roughly speaking, "index of strict hyperbolicity" as follows:

q := min {i : λi ≡ λj, ∀ j = i, . . . , n},

where we used the equivalence relation λi ≡ λj ⇔ λi[ξ1] = λj[ξ1], ∀ξ1 ∈ C^0([0, L]; R). By this definition, we have q ∈ {1, 2, 3}, and in the case of a strictly hyperbolic system, we have q = n. The case q = 1 (a single velocity) was addressed in the previous section. We further define

q0 := max {1, q − 1} .

We assume that the velocities λi : C^{q0}([0, L]; R) → C^{q0}([0, L]; R), i = 1, . . . , n, are q0-times Fréchet differentiable and positive (namely λi[y] > 0, for all y ∈ C^0([0, L]; R) (nonlocal case), or y ∈ R (local case)), the nonlinear mapping F : C^{q0}([0, L]; R^n) → C^{q0}([0, L]; R^n) is q0-times Fréchet differentiable, and for q0 = 1 we further assume that DF ∈ Lip_loc(C^0([0, L]; R^n), ‖·‖_0). For the initial condition ξ^0 ∈ C^{q0}([0, L]; R^n), we assume that it satisfies compatibility conditions of order q0 (see [3, Chap. 4.5.2] for the definition of compatibility conditions of any order), and the mapping H is of class C^{q0}(R^n; R^n), while for q0 = 1, we additionally have DH ∈ Lip_loc(R^n, |·|).
As in the previous section, we make an assumption on the existence and uniqueness of solutions of possibly stronger regularity, depending on the index of strict hyperbolicity q; this assumption might be met more easily in the case of nonlocal conservation laws, see for instance [16].
Assumption 5.2 Consider M ⊂ C^{q0}([0, L]; R) nonempty and bounded, consisting of functions satisfying compatibility conditions of order q0 for problem (5.1a)–(5.2). Then for any initial condition ξ^0 in M, problem (5.1a)–(5.2) admits a unique solution in C^{q0}([0, +∞) × [0, L]; R^n). Moreover, there exists δ > 0, such that for all ξ^0 in M, we have ξ ∈ B_δ^{q0} := {u ∈ C^{q0}([0, L]; R^n) : ‖u‖_{q0} ≤ δ}.
Let us define a Banach space by

X := C^{q0}([0, L]; R) × C^1([0, L]; R^{n−1}),

equipped with the norm ‖·‖_X := ‖·‖_1 when n = 2, and ‖ξ‖_X := ‖ξ1‖_{q0} + ‖ξ2‖_1 + ‖ξ3‖_1 when n = 3.
To deal with the generality of the considered hyperbolic operator, i.e., the presence of distinct velocities, we need to employ a different strategy than in the previous section. The problem comes from the fact that the balance laws in (5.1a) do not allow the choice of a diagonal Lyapunov functional to be used in the stability analysis of the observer error equations. A non-diagonal Lyapunov functional does not permit an integration by parts when taking its time derivative, since the Lyapunov matrix
and the matrix of velocities do not commute. To address this problem, we perform a transformation including spatial derivatives of the state up to order q − 2, in order to write the system in an appropriate form for which a Lyapunov approach is feasible. Then, for the obtained target system, we design the high-gain observer and, finally, returning to the initial coordinates, solvability of the H-GODP is guaranteed. The increased difficulties with respect to the presence of distinct velocities appear in the somehow dual problems of internal controllability with a reduced number of controls (see comments on algebraic solvability in [1]).
We shall show the existence of a nonlinear transformation T ∈ C^0(B_δ^{q0}; B(X)), invertible, with T^{−1} ∈ C^0(B_δ^{q0}; B(X)) (where B(X) denotes the space of bounded linear operators from X to X), which maps system (5.1a)–(5.2) into a target system in coordinates ζ, as follows:

ζ = T[ξ1]ξ;   (5.14)

with ζ1 = ξ1.

Assume also that this transformation is written in the form

T [·] = In + T˜ [·]C, (5.15)

for some column operator T˜ .


The desired target system (T) of PDEs satisfies the following equations on Π:

ζ_t(t, x) + λ_n[ζ1(t)](x) ζ_x(t, x) = Aζ(t, x) + F(ζ(t) − T̃[ζ1(t)]Cζ(t))(x)
    + N1[ζ1(t)](x) + N2[ζ1(t), ζ2(t)](x),
ζ(t, 0) = H(ζ(t, L) − T̃[ζ1(t)]Cζ(t)(L)) + N3[ζ1(t)](0),      (T)
Y(t, x) = y(t, x) = Cζ(t, x),
with initial condition ζ(0, x) := ζ^0(x) = T[ξ1^0]ξ^0(x), where the nonlinear operators N1: C^{q0}([0, L]; ℝ) → C^0([0, L]; ℝ^n) and N3: C^{q0}([0, L]; ℝ) → ℝ^n act on the measured state ζ1, N2: C^{q0}([0, L]; ℝ) × C^0([0, L]; ℝ) → C^{q0−1}([0, L]; ℝ^n) is a bilinear triangular mapping, to be determined in the sequel, depending on the choice of T, and Y is the target system's output, which remains equal to the original system's output y.
In this target system of PDEs, the hyperbolic operator of the system has been decomposed into the sum of a hyperbolic operator with only one velocity, the last one λ_n, plus a nonlinear differential operator N1 acting only on the measured first state and a bilinear mapping N2 of the state, while a nonlinear differential operator N3 acting on the first state appears at the boundaries. Thus,
5 High-Gain Observer Design for Systems of PDEs 123
observer design is possible for target system (T), as we now meet the desired property of a single velocity that we imposed in the previous section, while we can simultaneously cancel the unwanted terms of the transformed system, represented by the nonlinear operators N1, N2, N3 acting on the measured state ζ1. The proposed
high-gain observer for target system (T) satisfies the following equations on Π:

ζ̂_t(t, x) + λ_n[y(t)](x) ζ̂_x(t, x) = Aζ̂(t, x) + F(s_δ(ζ̂(t)) − T̃[y(t)]y(t))(x)
    + N1[y(t)](x) + N2[y(t), ζ̂2(t)](x) − ΘK(y(t, x) − Cζ̂(t, x)),   (5.16a)
ζ̂(t, 0) = H(s_δ(ζ̂(t, L)) − T̃[y(t)]y(t)(L)) + N3[y(t)](0),   (5.16b)

with initial condition ζ̂^0(x) := ζ̂(0, x) (for a function ζ̂^0 in X), where again, as in the previous section, Θ is the diagonal matrix containing the increasing powers of the high-gain constant θ > 1 as in (5.10), s_δ is a saturating function satisfying the properties of items 1–4 as in Sect. 5.3.1, and K is a constant vector gain rendering the matrix A + KC Hurwitz, as in the previous section.
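The finite-dimensional ingredients of this observer can be sketched numerically. The block below is an illustration, not the chapter's construction: it assumes the triangular-form pair (A, C) with A a shift matrix and C picking the first component, uses made-up pole locations, and builds Θ as the diagonal of increasing powers of θ together with a gain K rendering A + KC Hurwitz:

```python
import numpy as np
from scipy.signal import place_poles

# Triangular-form pair (A, C) for n = 3: A is the shift matrix and the
# output is the first state component (assumed shapes, for illustration).
n = 3
A = np.diag(np.ones(n - 1), k=1)        # ones on the superdiagonal
C = np.zeros((1, n))
C[0, 0] = 1.0

# Choose K so that A + K C is Hurwitz: place the eigenvalues of the dual
# pair (A.T, C.T) at made-up stable locations and transpose back.
F = place_poles(A.T, C.T, [-1.0, -2.0, -3.0]).gain_matrix
K = -F.T                                # A + K C = A - F.T C is Hurwitz

# Theta: diagonal matrix of increasing powers of the high gain theta > 1.
theta = 5.0
Theta = np.diag([theta ** (i + 1) for i in range(n)])
```

With this scaling one gets Θ⁻¹(A + ΘKC)Θ = θ(A + KC), which is consistent with how the factor θ(A + KC) appears in the error dynamics (5.18a) below.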
In the next subsection, we determine the transformation T and we show the
solvability of the H-GODP.
5.4.2 Indirect Solvability of the H-GODP
We are now in a position to state the main result of this section, which includes both the existence of an infinite-dimensional transformation (5.14) and the convergence of observer (5.16) to the transformed system (T), implying that, by inverting the observer state via T, we eventually establish convergence to the actual state ξ. This leads to an indirect solvability of the observer design problem.
Theorem 5.2 Assume that Assumption 5.2 holds for initial condition ξ^0 ∈ M. Then the H-GODP is solvable for system (5.1a)–(5.2), with output (5.1b) and n, q ∈ {2, 3}, by T^{-1}[y]ζ̂ (where ζ̂ is the unique solution to (5.16)), for θ > 1 as a high gain and initial condition T^{-1}[y(0)]ζ̂^0(x), with ζ̂^0 satisfying zero- and first-order compatibility conditions. More precisely, for every κ > 0, there exists θ_0 ≥ 1, such that for every θ > θ_0, the following holds for all t ≥ 0:

‖T^{-1}[y(t)]ζ̂(t)(·) − ξ(t, ·)‖_{2−q0} ≤ ℓ e^{−κt} ‖T^{-1}[y(0)]ζ̂^0(·) − ξ^0(·)‖_X,   (5.17)

with ℓ > 0 a polynomial in θ.
We note here that in the study of internal controllability for underactuated systems, the phenomenon of loss of derivatives appears, as the regularity of the dynamics is stronger than the regularity of the control laws whenever the velocities are distinct (see [1, Theorem 3.1]). In the present framework, aiming at the solvability of the H-GODP, we note that for n = q = 3 the regularity of the system's dynamics needs to be stronger (of order q0 = 2) than the regularity of the space in the norm of which the asymptotic convergence of the observer is exhibited (sup-norm).
The above theorem constitutes a generalization of our previous works, see for instance [19], where the cases of linear hyperbolic and semilinear parabolic systems are treated via a linear state transformation. We introduce here a transformation-based approach, inspired by these works, but using a nonlinear state transformation in order to deal with the quasilinearity of the system.
Proof Let us choose T̃ in (5.14), (5.15) as

T̃[ξ1] := 0, when n = 2;   T̃[ξ1] := (0, τ[ξ1] d_x, 0)^⊤, when n = 3;

where τ[ξ1] := λ2[ξ1] − λ3[ξ1].
Obviously, the transformation T, with T̃ given as above, meets the specifications of the previous subsection independently of the boundary conditions, and its inverse is given by

T^{-1}[·] = I_n − T̃[·]C.
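The inverse has this one-line form because the column operator T̃ has zero first component while C extracts the first component, so CT̃ = 0 and (I_n + T̃C)(I_n − T̃C) = I_n − T̃(CT̃)C = I_n. A finite-dimensional sketch of this cancellation (matrices with made-up entries standing in for the operators):

```python
import numpy as np

# Stand-in for the column operator T_tilde: first (and last) entry zero,
# so that C @ T_tilde = 0; C extracts the first (measured) component.
n = 3
T_tilde = np.array([[0.0], [0.7], [0.0]])   # made-up middle entry
C = np.zeros((1, n))
C[0, 0] = 1.0

T = np.eye(n) + T_tilde @ C                 # T = I_n + T_tilde C
T_inv = np.eye(n) - T_tilde @ C             # claimed inverse

# Because C @ T_tilde = 0, the cross terms in T @ T_inv cancel exactly.
```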
Applying the transformation chosen above to system (5.1a)–(5.2), we obtain target system (T) of the previous subsection, with

N1[ζ1] := ((λ2[ζ1] − λ1[ζ1]) ∂_x ζ1, 0)^⊤, when n = 2;

N1[ζ1] := ( (λ3[ζ1] − λ1[ζ1] − τ[ζ1]) ∂_x ζ1,
    ⟨Dτ[ζ1], −(λ1[ζ1] + τ[ζ1]) ∂_x ζ1 + F1[ζ1]⟩ ∂_x ζ1 − τ[ζ1] d_x(λ1[ζ1] ∂_x ζ1 − F1[ζ1]) + λ3[ζ1] d_x(τ[ζ1] ∂_x ζ1),
    0 )^⊤, when n = 3;

N2[ζ1, ζ2] := 0, when n = 2;   N2[ζ1, ζ2] := (0, ⟨Dτ[ζ1], ζ2⟩ ∂_x ζ1, 0)^⊤, when n = 3;

N3[ζ1] := 0, when n = 2;   N3[ζ1] := (0, τ[ζ1] ∂_x ζ1, 0)^⊤, when n = 3.
We are now in a position to prove that solutions to observer (5.16) converge exponentially to the solutions to the transformed system (T). First, the well-posedness of the observer system, i.e., the global existence of unique classical solutions of regularity C^1, follows from classical arguments that one can find, for instance, in [3] or in our previous works [21] (details are left to the reader); it relies on Assumption 5.2, on the existence and boundedness of system solutions in the C^{q0}-norm, and on the fact that the observer nonlinearities are globally Lipschitz. We now focus on the stability analysis.
Let us define the observer error by

ε := Θ^{-1}(ζ̂ − ζ),

which satisfies the following hyperbolic equations:

ε_t(t, x) + λ_n[y(t)](x) ε_x(t, x) = θ(A + KC) ε(t, x)
    + Θ^{-1}( Δ_{z(t)}[F](ζ(t) − T̃[y(t)]y(t))(x) + N2(y(t), ζ̂2(t) − ζ2(t))(x) ),   (5.18a)
ε(t, 0) = Θ^{-1} Δ_{z(t)}[H](ζ(t) − T̃[y(t)]y(t))(L),   (5.18b)

where z := s_δ(ζ̂) − T̃[y]y.
For the proof of the exponential stability of solutions to (5.18), the reader can refer to the Appendix of the present chapter. Following the proof in the Appendix, where an appropriate Lyapunov functional is chosen, we obtain an exponential stability result for the transformed system in the 1-norm as follows:

‖ζ̂(t, ·) − ζ(t, ·)‖_1 ≤ ℓ̄ e^{−κt} ‖ζ̂^0(·) − ζ^0(·)‖_1,   (5.19)

where κ > 0 is adjustable by choosing the high-gain constant θ large enough, and ℓ̄ > 0 is a polynomial in θ.
Now, to return to the original coordinates, we notice that T[Cξ]: X → X is bounded for ξ ∈ B_δ^{q0}, that X is continuously embedded in C^1([0, L]; ℝ^n), that the extension of T^{-1}[Cξ] on C^0([0, L]; ℝ^n) for ξ ∈ B_δ^{q0} is bounded in C^0([0, L]; ℝ^n), and that C^1([0, L]; ℝ^n) is continuously embedded in C^0([0, L]; ℝ^n). Thereby, by (5.19), we can calculate a constant ℓ, polynomial again in θ, such that (5.17) is satisfied.
The proof of Theorem 5.2 is complete. □

Remark 5.2 Although in this section we considered a reduced number of states (up to 3), as the presence of an increased number of distinct velocities imposes extra difficulties on the problem, we note that the H-GODP is solvable even for more than three states, but with the restriction that the system is linear and spatially L-periodic. In this case, we consider a state transformation that includes higher order differentiations in its domain than the ones in this section and, to determine it, we solve an operator Sylvester equation. We have included this generalization in previous works, see [19], where some links with problems of controllability of coupled hyperbolic PDEs as in [1] were revealed.
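As a loose finite-dimensional analogy of the operator Sylvester equation mentioned above, a matrix Sylvester equation AX + XB = Q can be solved numerically; the matrices below are arbitrary stand-ins (chosen with eig(A) and eig(−B) disjoint so that a unique solution exists), not objects from the chapter:

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Made-up triangular matrices with eig(A) = {2, 3, 4} and eig(B) = {1, 2, 3};
# since eig(A) and eig(-B) are disjoint, A X + X B = Q is uniquely solvable.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 4.0]])
B = np.array([[1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])
Q = np.arange(9.0).reshape(3, 3)

X = solve_sylvester(A, B, Q)          # solves A X + X B = Q
```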
5.5 Conclusion

Solutions to a high-gain observer design problem for a class of quasilinear hyperbolic systems, containing also nonlocal source terms and velocities, and written in
a triangular form were presented in this chapter. A part of the state was considered
as measurement (one observation). First, this problem was solved for systems with
n equations and only one velocity, as a direct extension of the finite-dimensional
approach. Then, sufficient conditions were provided for the solvability of such a
problem for the case of 2 or 3 distinct velocities. This required the introduction of a
nonlinear infinite-dimensional state transformation, which led to the injection of out-
put spatial derivatives in the observer dynamics. The extension of this methodology
to wider classes of infinite-dimensional systems, ISS properties of such observers,
and the investigation of output feedback laws via such observers with applications
to real systems, will be topics of our future works.
Appendix
Observer Convergence Proofs
In this section, we prove the Lyapunov stability part of both Theorems 5.1 and 5.2,
appearing in Sects. 5.3.2 and 5.4.2, respectively.
In particular, the observer error systems appearing in Theorems 5.1 and 5.2 are given by (5.13) and (5.18), respectively. We prove here the Lyapunov stability result for
error system (5.18) in Theorem 5.2 only, which is more complicated. Then, the
Lyapunov stability in Theorem 5.1 follows, as error system (5.13) therein is a simpler
version of (5.18), with ζ substituted by ξ , ζ̂ substituted by ξ̂ , λn substituted by λ,
and T˜ = 0, N2 = 0.
To prove the exponential stability of the solution to error system (5.18) at the origin in Theorem 5.2, let us define a Lyapunov functional W_p: C^1([0, L]; ℝ^n) → ℝ by

W_p[ε] := ( ∫_0^L π(x) exp(pμ_{θ,δ} x) G_p[ε](x) dx )^{1/p};   (5.20)
G_p[ε] := (ε^⊤ Pε + ρ0 ε_t^⊤ Pε_t)^p,

where ε := Θ^{-1}(ζ̂ − ζ) is the observer error for the system transformed via T, ρ0 ∈ (0, 1) is a constant (to be chosen appropriately), p ∈ ℕ, P ∈ ℝ^{n×n} is positive definite and symmetric, satisfying (5.11), and π: [0, L] → ℝ is given by

π(x) := (x/L)(π̄ − 1) + 1;   π̄ := sup_{‖ζ‖_0 ≤ δ} λ_n[Cζ] / inf_{‖ζ‖_0 ≤ δ} λ_n[Cζ],   (5.21)
with π(x) ∈ [1, π̄], and the constant μ_{θ,δ} is given by

μ_{θ,δ} := (1/L) ln(μ_δ θ^{2n−2}),   (5.22a)

where

μ_δ := 2 (|P|/eig(P)) max{ γ_{1,δ}^2, γ_{2,δ}^2, γ_{3,δ}^2 δ_1^2, γ_{1,δ} γ_{2,δ} δ_1 };   (5.22b)

γ_{1,δ} := sup_{|ζ(L)|≤δ, y∈C^1([0,L];ℝ), ζ̂(L)∈ℝ^n, ε(L)≠0} |Θ^{-1} Δ_z[H](ζ − T̃[y]y)(L)| / (θ^{n−1} |ε(L)|),

γ_{2,δ} := θ^{1−n} sup_{ζ̂(L)∈ℝ^n, ‖y‖_1≤δ} |Θ^{-1} ⟨DH[z](L), Ds_δ(ζ̂(L))⟩ Θ|,

γ_{3,δ} := sup_{|ζ(L)|≤δ, ζ̂(L)∈ℝ^n, ‖y‖_1≤δ, ε(L)≠0} (1/(θ^{n−1} |ε(L)|))
    × |Θ^{-1}( Δ_z[DH](ζ − T̃[y]y)(L) Ds_δ(ζ̂) + DH(z(L)) Δ_{ζ̂}[Ds_δ](ζ)(L) )|,

with z := s_δ(ζ̂) − T̃[y]y,
and also

δ_1 := sup_{ζ∈B_{δ_1}, ‖y‖_{q0}≤δ} ( ‖ζ_t‖_0 + ‖d_t(T̃[y]y)‖_0 ) = sup_{ξ∈B_δ^{q0}} ( ‖d_t(T[Cξ]ξ)‖_0 + ‖d_t(T̃[Cξ]Cξ)‖_0 ),

which is proven to be finite after substituting ξ_t by −Λ[Cξ]ξ_x + Aξ + F[ξ] by (5.1a) and invoking the regularity of Λ and the boundedness of the mapping T in X.
Note here that the above constants with subscript δ do not have any dependence on the observer gain θ, but only on the global bound δ of the system solutions (coming from Assumption 5.2); otherwise, such a dependence is made explicit by use of a subscript θ. We also took into account the implication

ξ ∈ B_δ^{q0} (coming from Assumption 5.2) ⇒ T[Cξ]ξ = ζ ∈ B_{δ_1}.
By invoking the global existence of solutions to observer system (5.16) and Assumption 5.2, which establishes global unique classical solutions for system (5.1a)–(5.2), together with the boundedness of the mapping T in X, we are now in a position to define G_p, W_p: [0, +∞) → ℝ by

G_p(t) := G_p[ε](t),   W_p(t) := W_p[ε](t),   ∀t ≥ 0.   (5.23)
Before taking the time derivative of the Lyapunov function, we temporarily assume that ε is of class C², so that we can derive the hyperbolic equations satisfied by ε_t (details are left to the reader). Calculating the time derivative Ẇ_p along the classical solutions of the hyperbolic equations (5.18) for ε and of the corresponding hyperbolic equations for ε_t, we get

Ẇ_p = (1/p) W_p^{1−p} ∫_0^L p π(x) exp(pμ_{θ,δ} x) G^{p−1}(x)
    × ( ε_t^⊤(x) Pε(x) + ε^⊤(x) Pε_t(x) + ρ0 ε_{tt}^⊤(x) Pε_t(x) + ρ0 ε_t^⊤(x) Pε_{tt}(x) ) dx,

where, after substituting the dynamics of ε, ε_t and performing an integration by parts, we obtain

Ẇ_p = W_p^{1−p} ( (1/p) T_{1,p} + (1/p) T_{2,p} + T_{3,p} + T_{4,p} ),   (5.24)
where

T_{1,p} := −π(L) λ_n[y](L) exp(pμ_{θ,δ} L) G_p(L) + π(0) λ_n[y](0) G_p(0),

T_{2,p} := ∫_0^L d_x( π(x) λ_n[y](x) exp(pμ_{θ,δ} x) ) G_p(x) dx,

T_{3,p} := 2 ∫_0^L π(x) exp(pμ_{θ,δ} x) G^{p−1}(x) [ ε^⊤(x) PΘ^{-1}( Δ_z[F](ζ − T̃[y]y)(x) + N2(y, ζ̂2 − ζ2)(x) )
    + ρ0 ε_t^⊤(x) Sym(PK[ζ](x)) ε_t(x) + ρ0 ε_t^⊤(x) P( K^{ζ̂}[ζ](x)
    + Θ^{-1}⟨DF[z], Ds_δ(ζ̂)⟩ Θ ε_t(x) + Θ^{-1} d_t N2(y, ζ̂2 − ζ2)(x) ) ] dx,

T_{4,p} := θ ∫_0^L π(x) exp(pμ_{θ,δ} x) G^{p−1}(x)
    × [ 2ε^⊤ Sym(P(A + KC))ε + 2ρ0 ε_t^⊤ Sym(P(A + KC))ε_t
    − ρ0 ε_t^⊤ PK[ζ](A + KC)ε − ρ0 ε^⊤ (A + KC)^⊤ K^⊤[ζ] Pε_t ] dx,

where K: B_{δ_1} → C^0([0, L]; ℝ^{n×n}) is a bounded mapping defined by

K[ζ] := (λ_n[ζ1])^{-1} ⟨Dλ_n[ζ1], Cζ_t⟩ I_n   (5.26)

and K^{ζ̂}: B_{δ_1} → C^0([0, L]; ℝ^n), parametrized by ζ̂ ∈ C^0([0, L]; ℝ^n), is given by
K^{ζ̂}[ζ] := −K[ζ] Θ^{-1}( Δ_z[F](ζ − T̃[ζ1]ζ1) + N2(ζ1, ζ̂2 − ζ2) )
    + Θ^{-1} ⟨Δ_z[DF](ζ − T̃[ζ1]ζ1), ζ_t − d_t(T̃[ζ1]ζ1)⟩
    + ⟨DF[z], Δ_{ζ̂}[Ds_δ](ζ) ζ_t⟩.   (5.27)
After substituting the boundary conditions for ε, ε_t in T_{1,p} and by virtue of (5.21) and (5.22b), we obtain the following inequality:

T_{1,p} ≤ sup_{ζ∈B_{δ_1}}(λ_n[Cζ]) G_p(L) ( −exp(pμ_{θ,δ} L) + (θ^{2n−2} μ_δ)^p )

and, subsequently, by (5.22a), we get

T_{1,p} ≤ 0.   (5.28)
For T_{2,p}, we can easily derive the following bound:

T_{2,p} ≤ (ω_{1,δ} + p|μ_{θ,δ}| ω_{2,δ}) W_p^p,   (5.29)

where

ω_{1,δ} := |P| δ sup_{ζ∈B_{δ_1}} |d_x λ_n[Cζ]| / eig(P),   ω_{2,δ} := |P| sup_{ζ∈B_{δ_1}} λ_n[Cζ] / eig(P).
By taking into account that the dynamics are locally Lipschitz, we obtain

T_{3,p} ≤ ω_{3,δ} ‖G_1(·)‖_0 W_{p−1}^{p−1},   (5.30)

where

ω_{3,δ} := 2 (|P|/eig(P)) max{ γ_{4,δ}, γ_{5,δ}, γ_{6,δ}, (3/2) sup_{ζ∈B_{δ_1}} |K[ζ]|, γ_{7,δ} };

γ_{4,δ} := sup_{ζ∈B_{δ_1}, y∈C^1([0,L];ℝ), ζ̂∈C^0([0,L];ℝ^n), ε≠0} (1/‖ε‖_0)
    × ‖Θ^{-1}( Δ_z[F](ζ − T̃[y]y) + N2(y, ζ̂2 − ζ2) )‖_0,

γ_{5,δ} := sup_{ζ∈B_{δ_1}, ζ̂∈C^0([0,L];ℝ^n), ε≠0} ‖K^{ζ̂}[ζ]‖_0 / ‖ε‖_0,

γ_{6,δ} := sup_{ζ̂∈C^0([0,L];ℝ^n), ‖y‖_1≤δ, ε_t∈C^0([0,L];ℝ^n), ε_t≠0} ‖Θ^{-1}⟨DF[z], Ds_δ(ζ̂)⟩Θ ε_t‖_0 / ‖ε_t‖_0,

γ_{7,δ} := sup_{ζ∈B_{δ_1}, ‖y‖_{q0}≤δ, ζ̂2∈C^0([0,L];ℝ), ε,ε_t≠0} ‖Θ^{-1} d_t N2(y, ζ̂2 − ζ2)‖_0 / (‖ε‖_0 + ‖ε_t‖_0),
where we can easily see that γ_{7,δ} is finite, since the quantity d_t N2(y, ζ̂2 − ζ2) is given by

d_t N2(y, ζ̂2 − ζ2) = N2(y, θ² ∂_t ε_2) + (0, y_{tx} ⟨Dτ[y], θ² ε_2⟩, 0)^⊤ + (0, y_x ⟨D²τ[y], θ² y_t ε_2⟩, 0)^⊤

and y_t, y_{tx} are uniformly bounded whenever ξ ∈ B_δ^{q0} (see Assumption 5.2), as seen from the hyperbolic dynamics (5.1a).
The term T_{4,p}, which will lead to the negativity of the Lyapunov derivative, can be rewritten in the following way:

T_{4,p} := −θ ∫_0^L π(x) exp(pμ_{θ,δ} x) G^{p−1}(x) E^⊤(x) Σ[ζ](x) E(x) dx,

where E := (ε^⊤, ε_t^⊤)^⊤ and, after utilizing (5.11), Σ: B_{δ_1} → C^0([0, L]; ℝ^{2n×2n}) is given by

Σ[ζ] := [ I_n, −ρ0 (A + KC)^⊤ K^⊤[ζ] P ; −ρ0 PK[ζ](A + KC), ρ0 I_n ].
Now, we can easily verify (using a Schur complement) that for all w ∈ ℝ^{2n}\{0} we have

inf_{ζ∈B_{δ_1}} w^⊤ Σ[ζ] w / |w|² > 0, if

ρ0 < min{ 1 / ( |P|² |A + KC|² sup_{ζ∈B_{δ_1}} |K[ζ]|² ), 1 }.

It turns out that for every choice of matrices P and K satisfying (5.11), there always exists a ρ0 such that (5.5) is satisfied, and this fact renders Σ positive definite. Consequently, there exists σ_δ > 0 such that

T_{4,p} ≤ −θ (σ_δ/|P|) W_p^p.   (5.31)
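The Schur-complement condition above can be checked numerically on illustrative data (P, A + KC, and K[ζ] below are arbitrary finite-dimensional stand-ins): with M := PK[ζ](A + KC), the matrix Σ is positive definite whenever ρ0 > 0 and I − ρ0 MᵀM ≻ 0, which the stated bound on ρ0 guarantees since |M| ≤ |P||A + KC||K[ζ]|:

```python
import numpy as np

n = 3
P = np.eye(n)                                       # stand-in for P (SPD)
AKC = np.diag(np.ones(n - 1), 1) - 2.0 * np.eye(n)  # stand-in for A + K C
Kz = np.array([[0.5, -1.0, 0.2],
               [1.3,  0.4, -0.7],
               [-0.2, 0.9,  1.1]])                  # stand-in for K[zeta]

M = P @ Kz @ AKC
bound = min(1.0 / (np.linalg.norm(P, 2) ** 2
                   * np.linalg.norm(AKC, 2) ** 2
                   * np.linalg.norm(Kz, 2) ** 2), 1.0)
rho0 = 0.9 * bound                                  # any rho0 below the bound

Sigma = np.block([[np.eye(n),  -rho0 * M.T],
                  [-rho0 * M,  rho0 * np.eye(n)]])
# rho0 * |M|^2 <= 0.9 < 1, so the Schur complement I - rho0 * M.T @ M
# is positive definite, and hence so is Sigma.
```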
We note here that all the previously defined constants with subscript δ (for instance, γ_{i,δ}, i = 4, …, 7) have no dependence on the observer gain constant θ; this is a consequence of the triangularity of the involved nonlinear mappings, as in the classical high-gain observer designs [13]. This property turns out to be sufficient for the solvability of the H-GODP. More precisely, while bounding the Lyapunov derivative from above, the independence of these parameters of θ does not add positive terms with linear (or higher order) dependence on θ. On the other hand, negative terms depending linearly on θ will appear as a direct consequence of the assumed observability of the pair (A, C). This renders the negativity of the Lyapunov derivative feasible.
Now, combining (5.28)–(5.30) and (5.31) with (5.24), we obtain

Ẇ_p ≤ (−θω_{4,δ} + ω_{5,δ} ln(θ) + ω_{6,δ}) W_p + ω_{3,δ} W_p^{1−p} W_{p−1}^{p−1} ‖G_1(·)‖_0,   (5.32)

where ω_{4,δ} := σ_δ/|P|, ω_{5,δ} := ω_{2,δ}(2n−2)/L, ω_{6,δ} := (ω_{2,δ}/L)|ln μ_δ|. Now, using Hölder's inequality, one can obtain

W_{p−1}^{p−1} ≤ W_p^{p−1} ‖π(·)‖_0^{1/p}.

Utilizing the above inequality, (5.32) gives

Ẇ_p ≤ (−θω_{4,δ} + ω_{5,δ} ln(θ) + ω_{6,δ}) W_p + ω_{3,δ} π̄^{1/p} ‖G_1(·)‖_0.   (5.33)
We obtained the estimate (5.33) of Ẇ_p for ε of class C², but, by invoking density arguments, the result remains valid with ε only of class C¹ (see [9] for further details on analogous statements). Taking the limit as p → +∞ of both sides of (5.33), we get, in the sense of distributions on (0, +∞),

(d/dt)‖G_1(t, ·)‖_0 ≤ (−θω_{4,δ} + ω_{5,δ} ln(θ) + ω_{7,δ}) ‖G_1(t, ·)‖_0,   (5.34)

where ω_{7,δ} := ω_{3,δ} + ω_{6,δ}.
Now, one can select the high gain θ such that θ > θ_0, where θ_0 ≥ 1 satisfies

−θω_{4,δ} + ω_{5,δ} ln(θ) + ω_{7,δ} ≤ −2κ_δ, ∀θ > θ_0,   (5.35)

for some κ_δ > 0. One can easily check that for any κ_δ > 0, there always exists a θ_0 ≥ 1, dependent on the involved constants, such that the previous inequality is satisfied.
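Since the left-hand side of (5.35) is strictly decreasing in θ for θ > ω_{5,δ}/ω_{4,δ} (the linear term dominates the logarithm), such a θ_0 can be found by a simple numerical scan; a sketch with made-up constants:

```python
import math

# Made-up values for omega_{4,delta}, omega_{5,delta}, omega_{7,delta}, kappa_delta.
w4, w5, w7, kappa = 0.5, 2.0, 3.0, 1.0

def lhs(theta):
    # left-hand side of (5.35)
    return -theta * w4 + w5 * math.log(theta) + w7

theta0 = max(1.0, w5 / w4)      # past this point lhs is strictly decreasing
while lhs(theta0) > -2.0 * kappa:
    theta0 += 0.1               # coarse scan; lhs -> -infinity, so this terminates
```

Because the left-hand side keeps decreasing beyond θ_0, the inequality then holds for every θ > θ_0, as required.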
Subsequently, (5.34) yields the following differential inequality in the sense of distributions on (0, +∞):

(d/dt)‖G_1(t, ·)‖_0 ≤ −2κ_δ ‖G_1(t, ·)‖_0

and by the comparison lemma, we get

‖G_1(t, ·)‖_0 ≤ e^{−2κ_δ t} ‖G_1(0, ·)‖_0, ∀t ≥ 0.   (5.36)
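The comparison-lemma step can be illustrated numerically: any nonnegative g with ġ ≤ −2κg stays below e^{−2κt} g(0). A sketch with a made-up nonnegative extra decay rate r(t):

```python
import numpy as np

kappa, g0, dt, T = 1.0, 5.0, 1e-4, 3.0   # made-up constants

t, g = 0.0, g0
while t < T:
    r = 0.3 * (1.0 + np.sin(t) ** 2)     # r(t) >= 0, so g' <= -2*kappa*g
    g *= 1.0 - dt * (2.0 * kappa + r)    # explicit Euler step for g
    t += dt

bound = np.exp(-2.0 * kappa * t) * g0    # comparison-lemma bound at time t
```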
Now, by the dynamics (5.18), we can obtain the following inequality:

inf_{ζ∈B_{δ_1}} λ_n[Cζ] ‖ε_x‖_0 − s_{δ,θ} ‖ε‖_0 ≤ ‖ε_t‖_0 ≤ sup_{ζ∈B_{δ_1}} λ_n[Cζ] ‖ε_x‖_0 + s_{δ,θ} ‖ε‖_0,

where s_{δ,θ} := θ|A + KC| + γ_{4,δ}. Invoking these inequalities, also (5.22a), estimate (5.36), and the following inequality,
ρ0 μθ,δ −|μθ,δ | L μθ,δ +|μθ,δ |


e 2 eig(P) ( ε 0 + εt 0)
2
≤ G 1 (·) 0 ≤e 2 L
|P| ( ε 0 + εt 0)
2
2
we obtain
−κδ t
ε 1 ≤ θ,δ e ε0 1 , ∀t ≥ 0, (5.37)

where ε0 (x) := ε(0, x) and


'
|P| 1 n−1
θ,δ := (μδ ) 2L θ L
ρ0 eig(P)
! !
1
× max sδ,θ + 1, max 1 + 2sδ,θ , 2 sup λn [Cζ ] .
inf ζ ∈B δ1 λn [Cζ ] ζ ∈B δ1

By (5.37), we derive the following estimate, which holds for every t ≥ 0:

‖ζ̂(t, ·) − ζ(t, ·)‖_1 ≤ ℓ̄_{θ,δ} e^{−κ_δ t} ‖ζ̂^0 − ζ^0‖_1,   (5.38)

where ℓ̄_{θ,δ} := θ^{n−1} ℓ_{θ,δ}. Note that the polynomial dependence of ℓ̄_{θ,δ} on θ is a phenomenon appearing also in high-gain observer designs in finite dimensions.
The proof of the exponential convergence of ζ̂ to ζ in the 1-norm is complete.
References

1. Alabau-Boussouira, F., Coron, J.-M., Olive, G.: Internal controllability of first order quasilinear
hyperbolic systems with a reduced number of controls. SIAM J. Control Optim. 55(1), 300–323
(2017)
2. Anfinsen, H., Diagne, M., Aamo, O.M., Krstić, M.: An adaptive observer design for n +
1 coupled linear hyperbolic PDEs based on swapping. IEEE Trans. Autom. Control 61(12),
3979–3990 (2016)
3. Bastin, G., Coron, J.-M.: Stability and boundary stabilization of 1-D hyperbolic systems. In:
Progress in Nonlinear Differential Equations and Their Applications. Springer International
Publishing (2016)
4. Besançon, G.: Nonlinear Observers and Applications. Springer, New York (2007)
5. Bounit, H., Hammouri, H.: Observer design for distributed parameter dissipative bilinear sys-
tems. Appl. Math. Comput. Sci. 8, 381–402 (1998)
6. Castillo, F., Witrant, E., Prieur, C., Dugard, L.: Boundary observers for linear and quasi-linear
hyperbolic systems with application to flow control. Automatica 49(11), 3180–3188 (2013)
7. Christofides, P.D., Daoutidis, P.: Feedback control of hyperbolic PDE systems. AIChE J. 42(11),
3063–3086 (1996)
8. Coron, J.-M., Vazquez, R., Krstić, M., Bastin, G.: Local exponential stabilization of a 2 × 2
quasilinear hyperbolic system using backstepping. SIAM J. Control Optim. 51(3), 2005–2035
(2013)
9. Coron, J.-M., Bastin, G.: Dissipative boundary conditions for one-dimensional quasi-linear
hyperbolic systems: Lyapunov stability for the C1-norm. SIAM J. Control Optim. 53(3), 1464–
1483 (2015)
10. Coron, J.-M., Nguyen, H.M.: Optimal time for the controllability of linear hyperbolic systems
in one dimensional space. SIAM J. Control Optim. 57(2), 1127–1156 (2019)
11. Di Meglio, F., Vazquez, R., Krstić, M.: Stabilization of a system of n + 1 coupled first-order
hyperbolic linear PDEs with a single boundary input. IEEE Trans. Autom. Control. 58, 3097–
3111 (2013)
12. Gauthier, J.P., Bornard, G.: Observability for any u(t) of a class of nonlinear systems. IEEE
Trans. Autom. Control 26(4), 922–926 (1981)
13. Gauthier, J.P., Hammouri, H., Othman, S.: A simple observer for nonlinear systems: applica-
tions to bioreactors. IEEE Trans. Autom. Control 37(6), 875–880 (1992)
14. Hasan, A., Aamo, O.M., Krstić, M.: Boundary observer design for hyperbolic PDE-ODE
cascade systems. Automatica 68, 75–86 (2016)
15. Karafyllis, I., Ahmed-Ali, T., Giri, F.: Sampled-data observers for 1-D parabolic PDEs with
non-local outputs. Syst. Control Lett. 133 (2019)
16. Keimer, A., Pflug, L., Spinola, M.: Existence, uniqueness and regularity of multidimensional
nonlocal balance laws with damping. J. Math. Anal. Appl. 466, 18–55 (2018)
17. Khalil, H.K.: High-gain observers in nonlinear feedback control. Advances in Design and
Control, SIAM (2017)
18. Khalil, H.K., Praly, L.: High-gain observers in nonlinear feedback control. Int. J. Robust Non-
linear Control 24(6), 993–1015 (2014)
19. Kitsos, C.: High-gain observer design for systems of PDEs. Ph.D. Thesis. Univ. Grenoble Alpes
(2020)
20. Kitsos, C., Besançon, G., Prieur, C.: High-gain observer design for a class of quasilinear
integro-differential hyperbolic systems – application to an epidemic model. IEEE Trans.
Autom. Control (2022). https://doi.org/10.1109/TAC.2021.3063368
21. Kitsos, C., Besançon, G., Prieur, C.: High-gain observer design for some semilinear reaction-
diffusion systems: a transformation-based approach. IEEE Control Syst. Lett. 5(2), 629–634
(2021)
22. Kmit, I.: Classical solvability of nonlinear initial-boundary problems for first-order hyperbolic
systems. Int. J. Dyn. Syst. Differ. Equ. 1(3), 191–195 (2008)
23. Li, T.-T.: Exact boundary observability for quasilinear hyperbolic systems. ESAIM: Control,
Optim. Calc. Var. 14(4), 759–766 (2008)
24. Lissy, P., Zuazua, E.: Internal observability for coupled systems of linear partial differential
equations. SIAM J. Control Optim. Soc. Ind. Appl. Math. 57(2), 832–853 (2019)
25. Meurer, T.: On the extended Luenberger-type observer for semilinear distributed-parameter
systems. IEEE Trans. Autom. Control 58(7), 1732–1743 (2013)
26. Nguyen, V., Georges, D., Besançon, G.: State and parameter estimation in 1-d hyperbolic PDEs
based on an adjoint method. Automatica 67, 185–191 (2016)
27. Prieur, C., Girard, A., Witrant, E.: Stability of switched linear hyperbolic systems by Lyapunov
techniques. IEEE Trans. Autom. Control 59(8), 2196–2202 (2014)
28. Schaum, A., Moreno, J.A., Alvarez, J., Meurer, T.: A simple observer scheme for a class of 1-D
semi-linear parabolic distributed parameter systems. European Control Conf, Linz, Austria, pp.
49–54 (2015)
29. Vazquez, R., Krstić, M.: Boundary observer for output-feedback stabilization of thermal-fluid
convection loop. IEEE Trans. Control Syst. Technol. 18(4), 789–797 (2010)
30. Xu, C., Ligarius, P., Gauthier, J.P.: An observer for infinite-dimensional dissipative bilinear
systems. Comput. Math. Appl. 29(7), 13–21 (1995)
Chapter 6
Robust Adaptive Disturbance Attenuation

Saeid Jafari and Petros Ioannou
Abstract Effective attenuation of unwanted sound and vibrations dominated by a number of harmonics is a key enabling technology in a vast array of industrial control
applications. This chapter focuses on output feedback robust adaptive suppression of
unknown disturbances acting on dynamical systems with uncertain models. Under
certain assumptions on the open-loop plant characteristics, the robust stability and
performance of a proposed adaptive disturbance-rejecting feedback control scheme
are examined. It is shown that over-parameterization of the controller together with a
suitable pre-compensation of the open-loop plant can effectively attenuate the distur-
bance harmonics without amplifying the output broadband noise, while at the same
time guaranteeing stability and robustness with respect to unmodeled dynamics.
Analytical results are provided for single-input single-output as well as multi-input
multi-output systems in both continuous- and discrete-time domains and practical
design considerations are presented together with numerical simulations to demon-
strate the effectiveness of the proposed scheme.

This chapter is partially reprinted from [1, 2] with the following copyright and permission notices:
• © 2015 IEEE. Reprinted, with permission, from S. Jafari, P. Ioannou, B. Fitzpatrick, and Y. Wang,
Robustness and performance of adaptive suppression of unknown periodic disturbances, IEEE
Transactions on Automatic Control, vol. 60, pages 2166–2171 (2015).
• Reprinted from Automatica, Vol 70, S. Jafari and P. Ioannou, Robust adaptive attenuation of
unknown periodic disturbances in uncertain multi-input multi-output systems, Pages 32–42, Copy-
right (2016), with permission from Elsevier.

S. Jafari (B)
Aurora Flight Sciences – A Boeing Company, Manassas, Virginia 20110, United States
e-mail: [email protected]
P. Ioannou
Department of Electrical and Computer Engineering, University of Southern California,
Los Angeles, California 90089, United States
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 135
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5_6
136 S. Jafari and P. Ioannou

6.1 Introduction

The problem of attenuation of unwanted sound and vibrations is very important in many systems in precision engineering, as the system's performance is significantly
affected by various sources of vibrational disturbances. This problem arises in many
applications such as laser beam pointing, structural vibration control, active acoustic
noise control, disturbance reduction in hard disk drives, disturbance suppression
in wind turbines, and active acoustic noise control in magnetic resonance imaging
systems [3–13].
The applications of laser systems have grown in recent decades, including areas
such as communications [14, 15], detection and ranging [16], imaging [17], medicine
[18], and defense [19, 20]. Of particular interest in many laser applications are the
problems of beam control, ranging from pointing/tracking to wavefront control. The
performance of laser beam control systems is often adversely affected by difficult-
to-characterize disturbances that arise from the medium of propagation, structural
vibrations in the platform, or other external factors [21]. In particular, much of the
jitter in laser beam control is due to periodic disturbances whose frequencies and
amplitudes are unknown and could vary with time.
Isolation of sensitive equipment from the vibrations of a base structure is another
application of adaptive vibrational control. In some cases, the sensitive equipment
may be supported by a structure that vibrates due to unknown oscillatory forces. A set
of actuators and sensors connected by a feedback loop is often employed to minimize
the effects of vibrations and to prevent the propagation of vibrational disturbances
to sensitive components [8, 22]. Examples of such applications include (i) vibration
reduction in helicopters, where the main objectives are to improve the comfort of
the pilot/passengers, to reduce fatigue in the rotor and structure of the helicopter,
and to protect onboard equipment from damage [23–26]; and (ii) vibration control in
high-performance spacecraft, wherein some components generate uncertain periodic
disturbances which are detrimental to performance; for example, disturbances caused
by gyroscopes and cryogenic coolers [27]. Identifying and attenuating the effect of such disturbances is essential to improve the performance of instruments.
The control of acoustic noise in a wide class of systems has been the subject of
a lot of engineering research over recent decades. In many industrial applications,
undesired sound can be classified as periodic or quasi-periodic disturbances which
are mainly caused by components such as electric motors, compressors, engines,
cooling systems, fans, propellers, and air-conditioning systems, or can be generated by resonance coils in magnetic resonance imaging systems [13, 28–32]. In active noise
control, the main objective is to minimize the noise level in an environment by producing anti-noise at the sensor positions, i.e., generating sound from speakers (control actuators) such that the noise level at the microphone (error sensor) positions is made as small as possible. Active noise control primarily deals with low-frequency
acoustical noise, typically less than 500 Hz; high-frequency components can be
effectively suppressed by using passive sound absorbers [33, 34]. Some successful
6 Robust Adaptive Disturbance Attenuation 137

applications including reduction of propeller-induced cabin noise in aircraft, engine


and road noise in cars, and noise in acoustic ducts can be found in [35–37].
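The superposition principle behind anti-noise generation can be sketched for a single tone (all numbers made up); even a few percent of amplitude or phase mismatch in the anti-noise leaves a finite residual, which is one reason adaptation is needed:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 10001)
w = 2.0 * np.pi * 120.0                       # one made-up 120 Hz tone
noise = np.sin(w * t)                         # tonal noise at the microphone

anti_perfect = -np.sin(w * t)                 # ideal anti-noise
anti_mismatch = -1.05 * np.sin(w * t + 0.05)  # 5% amplitude, 0.05 rad phase error

res_perfect = np.max(np.abs(noise + anti_perfect))    # exact cancellation
res_mismatch = np.max(np.abs(noise + anti_mismatch))  # residual ~ 0.07
```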
In most research efforts of the past two decades, the problem of attenuation of
unknown periodic disturbances is formulated as follows. A dynamical system with
LTI model is considered in the presence of additive input and/or output disturbances.
The nominal model of the system is often assumed to be known and stable. Inherent in this assumption is that, if the system were unstable but known, a robust LTI controller could be designed so as to end up with a configuration involving a stable LTI system with a known nominal model. In many control applications, the plant has
been already stabilized with a fixed-gain controller, and what is referred to as the
controlled process for the disturbance rejection problem is an augmentation of the
baseline original control design [7, 9, 38]. The dominant part of additive disturbances
can be often modeled as a summation of a finite number of sinusoidal terms with
unknown amplitude, frequency, and phase. Another formulation of the disturbance terms is to view them as the output of a filter with poles on the imaginary axis (in continuous-time formulations) or on the unit circle (in discrete-time formulations), with a Dirac impulse as the input of the filter.
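The two formulations are equivalent; a sketch with a single made-up sinusoid, generated both directly and as the initial-condition (impulse) response of a marginally stable filter whose poles ±jω lie on the imaginary axis:

```python
import numpy as np
from scipy.linalg import expm

t = np.linspace(0.0, 2.0, 2001)
a, w, phi = 1.5, 2.0 * np.pi * 3.0, 0.7   # made-up amplitude, frequency, phase

d_sum = a * np.sin(w * t + phi)           # sum-of-sinusoids form (one term)

# Filter form: x' = A x, d = x[0], with poles at +/- j*w; the initial
# condition (equivalently, a Dirac impulse) encodes amplitude and phase.
A = np.array([[0.0, 1.0], [-w ** 2, 0.0]])
x = np.array([a * np.sin(phi), a * w * np.cos(phi)])
Phi = expm(A * (t[1] - t[0]))             # exact one-step state transition
d_filt = np.empty_like(t)
for k in range(t.size):
    d_filt[k] = x[0]
    x = Phi @ x
```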
The control objective is to find a control input u that will reject the unknown
periodic components of the disturbance from the output and make the magnitude
of the plant output as small as possible. This formulation under the aforementioned
assumptions has been extensively studied in the control literature [7, 9, 12, 37,
39–46]. In [41], properties and limitations of different approaches for rejection of
unknown periodic disturbances have been discussed. Most of the proposed techniques
are based on the internal model principle and are divided into two main classes:
indirect and direct adaptive control schemes. In indirect methods, an on-line estimator
provides an estimate of the parameters of the disturbance internal model which
are used to calculate the controller parameters and generate the control action. In
direct methods, however, under a certain parameterization, e.g., the Youla–Kučera or other parameterizations [47, 48], the controller parameters are directly calculated on-line
without intermediate calculations.
Also, a state-space approach involving the design of adaptive observers together
with on-line frequency identifiers is used in [49]. In [41, 50], a method based on
the concept of a phase-locked loop is proposed. The problem is also studied using
harmonic steady-state methods and averaging in which the plant is approximated by
its steady-state sinusoidal response [26, 51]. In [52, 53], an adaptive pole placement
control has been used for rejection of unknown disturbances acting on unknown LTI
systems. It is assumed that the plant order and the number of distinct frequencies of the
disturbance are known, but the plant and disturbance parameters are unknown. The
global stability and convergence of the algorithm have been established without the
requirement of persistently exciting signals. The method, however, has some major
drawbacks impairing its usability in practical situations: in addition to difficulties
in extracting the estimate of the internal model of the disturbance from a
composite polynomial, especially for high-order cases, the procedure is susceptible
to numerical problems due to possible division by zero; also, no effective procedure
has been proposed for the construction of some design parameters; moreover, some
unrealistic assumptions are made which limit its practical use.
138 S. Jafari and P. Ioannou
In [54], an
adaptive harmonic steady-state algorithm for rejection of sinusoidal disturbances
acting on unknown linear systems has been proposed, but the disturbance frequencies
are assumed to be known, and in [55], the idea has been extended for unknown
disturbances. In both cases, the local stability of the closed-loop system has been
established, but no analysis for the size of the region of attraction has been provided.
A state-derivative feedback adaptive controller has been proposed in [56] for the
cancellation of unknown sinusoidal disturbances acting on a class of continuous-
time LTI systems with unknown parameters; and in [57], an LTI plant model in
the controllable canonical form with unknown parameters and measurable full state
is considered, and a state-feedback adaptive control scheme is proposed for the
rejection of an unknown sinusoidal disturbance term. It is, however, not clear that
this approach can be practically implemented as the robustness properties of the
scheme in the presence of unmodeled dynamics has not been studied; moreover, it
has not been addressed how the proposed controller may perform for rejection of
disturbances with multiple frequencies. In fact, the case of unknown plant model
and unknown disturbance remains an open problem as no practical solution with
guaranteed global stability has been yet proposed.
Despite the considerable number of publications in this area, problems of high
practical importance need to be addressed.
◦ In practice, the plant transfer function is never perfectly known, let alone LTI.
Analysis of the effect of inevitable plant modeling uncertainties on any control
scheme is of great practical importance. In most publications, the robustness with
respect to plant unmodeled dynamics and noise has been taken for granted under
the argument that persistence of excitation (PE) of the regressor in the adaptive
law is guaranteed due to sufficient excitation by the periodic disturbance terms.
The PE property guarantees exponential convergence of the estimated distur-
bance model parameters close to their real values, which in turn guarantees a
level of robustness. This assumption, however, is based on a parameterization that
uses the exact number of the unknown frequencies in the disturbance. Assum-
ing an upper bound for the number of frequencies which is the practical thing
to do, given that the number of frequencies may change with time, leads to an
over-parameterization in which case the regressor cannot be PE. In the absence
of PE, most if not all of the adaptive laws proposed in the literature and even
experimented with are non-robust as small disturbances can lead to parameter
drift and instability as pointed out using simple examples in [58]. This serious
practical problem has been completely overlooked in the literature on the adap-
tive rejection of periodic vibrations. In this chapter, we address this problem and
present practical solutions supported by analysis.
◦ In most publications and applications, the focus is to reject the periodic distur-
bances. In the absence of noise, this objective can be achieved exactly provided
that the LTI plant is exactly known and stable. In the presence of noise, however,
a control scheme that in the absence of noise perfectly rejects the periodic dis-
turbances may drastically amplify the noise, leading to worse performance than
without the feedback.
6 Robust Adaptive Disturbance Attenuation 139
A practical feedback control scheme should have enough
structural flexibility to reject the periodic disturbances without amplifying the
noise. This structural flexibility should be supported analytically as part of the
overall design. The amplification of noise often occurs when, among other cases,
the zeros of the internal model of the disturbance are close to the zeros of the
plant. Addressing and understanding these issues and finding appropriate solutions
are critical. One of the objectives of this chapter is to
provide solutions to these problems and show possible limitations.
◦ The assumption that the plant is exactly known, stable, and LTI is fundamental to
all approaches proposed for adaptive vibration control. The justification behind
this assumption is that off-line identification is used to estimate the parameters
of a dominant plant model and a fixed robust controller is designed to stabi-
lize it. While this assumption may be valid under normal operations, changes
in the plant parameters over time due to wear and tear or due to some component
failures may easily lead to the failure of the fixed controller. Designing a
robust adaptive control scheme that can simultaneously stabilize a plant with
unknown or changing parameters in addition to rejecting periodic disturbances
is recognized to be an open problem and of practical importance.
This chapter aims to address the above issues and propose practical solutions
for the problem together with guidelines for the selection of design parameters for
practical implementation.

6.2 Problem Formulation and Objectives

Consider an LTI system with n u inputs and n y outputs, whose output is corrupted
by an additive disturbance as shown in Fig. 6.1. The input–output relationship of the
system is described as

y(t) = G(q)[u(t)] + d(t) = (I + Δm (q))G 0 (q)[u(t)] + d(t), (6.1)

Fig. 6.1 A system with modeled dynamics G0(q) and multiplicative unmodeled dynamics Δm(q),
whose output y is corrupted with the unknown disturbance d; u is the input
Fig. 6.2 The modeled part of the plant and disturbance for control design purposes
where y(t) ∈ Rn y is the measurable output, u(t) ∈ Rn u is the control input, d(t) ∈
Rn y is an unknown bounded disturbance which is not directly measurable. The trans-
fer function of the system is denoted by G(q) = (I + Δm (q))G 0 (q), where G 0 (q)
is the modeled part of the system and Δm (q) is an unknown multiplicative modeling
uncertainty term, and q is either the Laplace-transform variable (i.e., q = s in
continuous-time systems) or the z-transform variable (i.e., q = z in discrete-time
systems).
In many applications, unwanted sound and vibrational disturbances are often mod-
eled as a summation of a finite number of sinusoidal terms with unknown frequency,
magnitude, and phase corrupted by broadband noise as presented by the following
equation for each output channel:


d(t) = ds(t) + v(t) = ∑_{i=1}^{n_f} a_i sin(ω_i t + ϕ_i) + v(t),                    (6.2)

where ds (t) is the dominant part of the disturbance (modeled disturbance) with n f
distinct frequencies, and v(t) is a zero-mean bounded random noise. The parameters
of the modeled part of the disturbance, i.e., ai , ωi , ϕi , and n f are all unknown.
A known upper bound for n_f is often assumed to be available for control design.
Note that the disturbances applied to different output channels may have completely
different characteristics. As shown in Fig. 6.2, the modeled part of the disturbance,
ds (t), may be viewed as the response of an unknown LTI system with transfer function
matrix G d (q), to a Dirac impulse δ(t), where G d (q) is of order 2n f with all poles on
the unit circle for the discrete-time model or on the jω-axis for the continuous-time
formulation.
The control objective is to design the control signal u as a function of plant output
y to make the effect of d on output y as small as possible. Such a feedback control
law must provide an acceptable level of performance and robustness when applied to
the actual plant with modeling uncertainties. The analysis of the trade-offs between
performance and robustness helps in the selection of design parameters to achieve a
good practical design.
The design of a robust adaptive control law is done under certain assumptions on
the properties of the open-loop plant model. The following two cases are considered:
Fig. 6.3 General structure of the control law for Case 1, when the parameters of the plant model
G0(q) are known; K(q, θ) is an adaptive filter and F(q) is an LTI compensator
Fig. 6.4 General structure of the control law for Case 2, when the parameters of the plant model
G0(q) are unknown; Ky(q, θy) and Ku(q, θu) are adaptive filters
Case 1: The modeled part of the open-loop plant, G 0 (q), is known and stable, with
possibly unstable zeros. The following scenarios are studied for this case:
◦ Discrete-time SISO Systems
◦ Continuous-time SISO Systems
◦ Discrete-time MIMO Systems
◦ Continuous-time MIMO Systems
Case 2: The modeled part of the open-loop plant, G 0 (q), is unknown and possibly
unstable, but has stable zeros. The following scenario is studied for this case:
◦ Discrete-time SISO Systems
For Case 1, we consider the control architecture shown in Fig. 6.3, wherein the
adaptive filter K (q, θ ) and the LTI compensator F(q) are to be designed to meet the
control objective [1, 2, 59, 60]. The knowledge of the parameters and stability of the
modeled part of the plant, G0(q), is utilized for the design of the control law. For Case
2, however, a fully adaptive control law as shown in Fig. 6.4 is assumed, wherein
the adaptive filters K y (q, θ y ) and K u (q, θu ), in addition to unknown disturbance
rejection, must stabilize a possibly unstable open-loop plant. In the second case,
the minimum-phase property of G 0 (q) is employed to meet the control objective. It
should be noted that the architecture in Fig. 6.4 is equivalent to that of the classical
model-reference adaptive control with zero reference signal [58, Sect. 6.3], [61].

6.2.1 Preliminaries and Notation

Throughout this chapter, the following notation and definitions are used. For an
n u -input, n y -output finite-dimensional LTI system with real-rational transfer func-
tion matrix H (q), with input u(t), and output y(t), the input–output relationship is
expressed as y(t) = H(q)[u(t)], where the variable q is either z or s, for discrete-time
and continuous-time systems, respectively. For a signal x, the same notation x(t)
is used in both continuous and discrete-time domains, where in the continuous-time
case, the independent variable t takes continuous values, while in the discrete-time
case, t is the dimensionless time index taking only integer values (assuming a fixed
constant sampling period). With this notation, for a sinusoidal signal x(t) = sin(ω0 t),
the frequency ω0 is in rad/s in continuous-time domain and is in rad/sample in
discrete-time domain.
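With this convention, converting between the two units is a single multiplication by the sampling period. A quick check, using the sampling period and frequency that appear later in Example 6.1:

```python
# Frequency-unit conversion between the two domains:
# omega[rad/sample] = omega[rad/s] * Ts, with Ts the sampling period in seconds.
Ts = 1.0 / 480.0           # sampling period used later in Example 6.1
omega_c = 25.0             # continuous-time frequency, rad/s
omega_d = omega_c * Ts     # discrete-time frequency, rad/sample
print(round(omega_d, 4))   # → 0.0521
```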
Definition 6.1 The H∞ -norm and the L1 -norm of an LTI system with transfer
function matrix H (q) are defined as follows [58, 62, 63]:
• Discrete-time systems (q = z): The H∞-norm is defined as

‖H(z)‖∞ = max_{θ∈[0,π]} σmax(H(e^{iθ})),

where σmax denotes the maximum singular value, and the L1-norm is defined as

‖H(z)‖₁ = max_{1≤i≤n_y} ∑_{j=1}^{n_u} ‖h_ij‖₁,

where h_ij is the impulse response of the ij-element of H(z) and ‖h_ij‖₁ = ∑_{τ=0}^{∞} |h_ij(τ)|.
• Continuous-time systems (q = s): The H∞-norm is defined as

‖H(s)‖∞ = max_ω σmax(H(jω)),

where σmax denotes the maximum singular value, and the L1-norm is defined as

‖H(s)‖₁ = max_{1≤i≤n_y} ∑_{j=1}^{n_u} ‖h_ij‖₁,

where h_ij is the impulse response of the ij-element of H(s) and ‖h_ij‖₁ = ∫_0^∞ |h_ij(τ)| dτ.

Remark 6.1 The H∞-norm and the L1-norm are induced norms and satisfy the
multiplicative property. For an LTI system with transfer function matrix H(q), the
two norms are related as [64]:

‖H‖∞ ≤ √n_y ‖H‖₁ ≤ √(n_u n_y) (2n + 1) ‖H‖∞,

where n is the degree of a minimal realization of H(q).
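For a SISO discrete-time system, both norms are easy to approximate numerically: the H∞-norm by gridding θ ∈ [0, π], and the L1-norm by summing the impulse response. A minimal NumPy sketch with the illustrative system H(z) = 1/(z − 0.5) (our choice, not from the text), for which both norms equal 2; note that in the SISO case the relation of Remark 6.1 reduces to ‖H‖∞ ≤ ‖H‖₁:

```python
import numpy as np

# Illustrative SISO system H(z) = 1/(z - 0.5): stable, with impulse response
# h(t) = 0.5^(t-1) for t >= 1. Both norms equal 2 for this example.
theta_grid = np.linspace(0.0, np.pi, 20001)
H = 1.0 / (np.exp(1j * theta_grid) - 0.5)
hinf = np.max(np.abs(H))      # H-infinity norm: peak gain, attained at theta = 0

t = np.arange(1, 2000)
h = 0.5 ** (t - 1.0)          # impulse response, truncated once negligible
l1 = np.sum(np.abs(h))        # L1 norm: geometric series, sums to 2

print(hinf, l1)               # both ≈ 2 (up to grid/truncation error)
```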


Definition 6.2 We say that the signal x is μ-small in the mean square sense and
write x ∈ S (μ), for a given constant μ ≥ 0, if [58, 62]:
• Discrete-time systems (q = z): For any t, T and some constants c1, c2 ≥ 0:

∑_{τ=t}^{t+T−1} x⊤(τ) x(τ) ≤ c1 μT + c2.

• Continuous-time systems (q = s): For any t, T and some constants c1, c2 ≥ 0:

∫_t^{t+T} x⊤(τ) x(τ) dτ ≤ c1 μT + c2.

6.3 Known Stable Plants: SISO Systems

In this section, we consider the plant model shown in Fig. 6.1 and the control structure
in Fig. 6.3 and propose a robust adaptive scheme for rejection of unknown periodic
components of the disturbance acting on the plant output. It is assumed that the
nominal plant model, G 0 (q), is known and stable, possibly with unstable zeros.
The stability and performance properties of the closed-loop system are analyzed for
both discrete-time and continuous-time SISO systems. We first consider the ideal
scenario (non-adaptive) when complete information about the characteristics of the
disturbance is available. The analysis of the known frequency case allows us to
identify what is the best performance that can be achieved and set the reference
performance and robustness levels to be compared with those in the more realistic
case where the frequencies are unknown. We show that the rejection of periodic terms
may lead to amplification of output unmodeled disturbance, v(t). A way to avoid such
undesirable noise amplification is to increase the order of the feedback adaptive filter
in order to have the flexibility to achieve rejection of the periodic disturbance terms
while minimizing the effect of the noise on the plant output. The increased filter order
leads to an over-parameterized scheme where the persistence of excitation is no longer
possible, and this shortcoming makes the use of robust adaptation essential. With this
important insight in mind, the coefficients of the feedback filter whose size is over-
parameterized are adapted using a robust adaptive law. We show analytically that
the proposed robust adaptive control scheme guarantees performance and stability
robustness with respect to unmodeled dynamics and disturbance.

6.3.1 Discrete-Time Systems

Consider the plant model (6.1) and assume G 0 (z) is a known stable nominal plant
transfer function (possibly non-minimum-phase) and the unknown multiplicative
modeling uncertainty term Δm (z) is such that Δm (z)G 0 (z) is proper with stable
poles.
6.3.1.1 Non-adaptive Case: Known Disturbance Frequencies

Let the filter K in Fig. 6.3 be an FIR filter of order N of the form K(z, θ) = Θ(z)/z^N,
where Θ(z) = θ0 + θ1 z + … + θ_{N−1} z^{N−1}, which can be written as
K(z, θ) = Θ(z)/z^N = ∑_{i=0}^{N−1} θ_i z^{i−N} = θ⊤ α(z),
α(z) = [z^{−N}, z^{1−N}, …, z^{−1}]⊤,                                              (6.3)

where θ = [θ0, θ1, …, θ_{N−1}]⊤ ∈ R^N. The control objective is to find the parameter
vector θ and to design a stable compensator F(z) (if needed) such that the magnitude
of y is minimized.
If the frequencies of ds (t) in (6.2) are known, then its internal model

Ds(z) = ∏_{i=1}^{n_f} (z² − 2 cos(ωi) z + 1)                                        (6.4)
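The coefficients of Ds(z) can be assembled directly from the known frequencies by multiplying the quadratic factors. A short NumPy check with two illustrative frequencies confirms that all roots of Ds(z) lie on the unit circle and that Ds(e^{jωi}) = 0:

```python
import numpy as np

# Internal model D_s(z) = prod_i (z^2 - 2*cos(w_i)*z + 1) for known frequencies.
freqs = [0.3, 1.1]                    # illustrative frequencies, rad/sample
Ds = np.array([1.0])
for w in freqs:
    Ds = np.convolve(Ds, [1.0, -2.0 * np.cos(w), 1.0])  # multiply quadratic factors

roots = np.roots(Ds)                  # the 2*nf roots e^{±j*w_i}
vals = [np.polyval(Ds, np.exp(1j * w)) for w in freqs]
print(np.abs(roots))                  # all 1: the roots lie on the unit circle
```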

is known. From Figs. 6.1 and 6.3, the sensitivity transfer function from d(t) to y(t)
is given by
y(t) = S(z, θ)[d(t)] = [(1 − G0(z)F(z)K(z, θ)) / (1 + Δm(z)G0(z)F(z)K(z, θ))] [d(t)].   (6.5)

It follows from (6.5) that the effect of periodic components of d(t) on y(t) is com-
pletely rejected if S(z, θ ) has zeros on the unit circle at the disturbance frequencies;
in other words, if S(z, θ ) has the internal model of the sinusoidal components Ds (z)
as a factor. In addition, the filters K (z, θ ) and F(z) should be chosen such that S(z, θ )
remain stable for any admissible Δm (z). Assuming that we have perfect knowledge
on the frequencies in ds(t), we show that with the control architecture of Fig. 6.3 and
filter (6.3), the control objective can be met. In particular, we discuss how the
design parameters may affect robust stability and performance.

Theorem 6.1 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with
disturbance model (6.2) and filter K (z, θ ) of the form (6.3). Let ω1 , . . . , ωn f be the
distinct frequencies of the sinusoidal disturbance ds (t) and Ds (z) in (6.4) be the
internal model of ds (t). Then, there exists a θ ∗ such that with K (z, θ ∗ ), the control
law of Fig. 6.3 completely rejects the periodic components of the disturbances if and
only if G0(zi)F(zi) ≠ 0, zi = exp(±jωi), for i = 1, 2, …, n_f, i.e., G0(z)F(z) has
no zero at the roots of Ds(z), and N ≥ 2n_f, provided the stability condition

‖Δm(z)G0(z)F(z)K(z, θ*)‖∞ < 1                                                     (6.6)

is satisfied. The choice of θ* is unique if N = 2n_f. In addition, all signals in the
closed-loop system are guaranteed to be uniformly bounded and the plant output
satisfies
lim_{t→∞} sup_{τ≥t} |y(τ)| ≤ ‖S(z, θ*)‖₁ v0
                           ≤ c (1 + ‖G0(z)F(z)K(z, θ*)‖₁) v0,                     (6.7)

where v0 = sup_τ |v(τ)| and c = ‖1/(1 + Δm(z)G0(z)F(z)K(z, θ*))‖₁ is a finite
positive constant.
Moreover, in the absence of unmodeled noise (i.e., if v(t) = 0), y(t) converges to
zero exponentially fast.

Proof The proof is given in [1]. 

Theorem 6.1 gives conditions under which we can completely reject the sinusoidal
components of the disturbance when the frequencies in ds (t) are known. It also shows
that if the plant uncertainty satisfies a norm-bound condition, the output will be of
the order of the broadband random noise level at the steady state. In the presence
of noise v(t), however, a large magnitude of the sensitivity function S(z, θ ∗ ) may
lead to noise amplification, especially at high frequencies. The situation may be
worse if the plant has a very small gain at the frequency range of ds (t) with a larger
gain at high frequencies. In such cases, the design of a pre-compensator F(z) to
shape the frequency response of the plant will be a possible remedy to achieve a
good compromise between performance and robustness. It should be noted that with
N = 2n f , there exists a unique θ ∗ for which complete rejection of the sinusoidal
terms of the disturbance is possible. Such a unique parameter vector, however, does
not provide any flexibility to improve performance and/or robust stability margins.
For N > 2n f , however, there exists an infinite number of vectors θ ∗ that guarantee the
results of Theorem 6.1. In such cases, one may choose a θ ∗ that in addition to rejecting
the sinusoidal disturbance terms, minimizes the magnitude of G 0 (z)F(z)K (z, θ ∗ )
and therefore limits the possible amplification of the output noise. The existence of
a minimizer is the subject of the following lemma.

Lemma 6.1 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with distur-
bance model (6.2) and filter K (z, θ ) of the form (6.3). If the conditions in Theorem 6.1
are satisfied, then there exists a θ* with ‖θ*‖ ≤ r0, for some r0 > 0, that solves the
following constrained convex optimization problem:

θ* = arg min_{θ∈Ω} ‖G0(z)F(z)K(z, θ)‖∞,                                           (6.8)

where

Ω = {θ | ‖θ‖ ≤ r0, G0(zi)F(zi)K(zi, θ) = 1, zi = exp(±jωi), i = 1, …, n_f}.       (6.9)

Proof The proof is given in [1]. 

Remark 6.2 When N = 2n_f, the set Ω in (6.9) is a singleton; hence, the cost in
(6.8) is a fixed constant and cannot be reduced.
Remark 6.3 As shown in [1], the constraint G0(zi)F(zi)K(zi, θ) = 1 is a polynomial
equation which can be expressed as a Sylvester-type matrix equation, i.e., the
constraint is equivalent to a system of linear algebraic equations.
The following simple example illustrates the above results.
Example 6.1 Consider the following open-loop plant model and disturbance:

G0(z) = −0.00146 (z − 0.1438)(z − 1) / [(z − 0.7096)(z² − 0.04369 z + 0.01392)],   (6.10)

with a sampling period of 1/480 s, and assume

d(t) = sin(ω0 t) + v(t), (6.11)

where ω0 = 0.0521 rad/sample (= 25 rad/s) and v(t) is a zero-mean Gaussian noise
with standard deviation 0.02. Let us suppose Δm(z) = 0 and choose F(z) = 1.
Figure 6.5 shows the performance of the off-line designed controllers obtained from
(6.8) for complete rejection of the periodic term of the disturbance. The outputs for
controllers of different orders N are shown with different colors in Fig. 6.5.
After closing the loop at t = 20 s, the sinusoidal disturbance is completely rejected,
but the noise part is amplified for low-order filters. Increasing the order of filter
K (z, θ ) provides the flexibility to reduce noise amplification.
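The noise-free part of this example can be reproduced in a few lines: solve the rejection constraint G0(e^{jω0})K(e^{jω0}, θ) = 1 for a θ of minimum norm (a stand-in for the H∞-optimal design of (6.8)) and pass the sinusoidal disturbance through 1 − G0(z)K(z, θ). A SciPy sketch with v(t) = 0 and Δm(z) = 0:

```python
import numpy as np
from scipy.signal import lfilter

# Plant of Example 6.1, as polynomials in z (descending powers).
num = -0.00146 * np.convolve([1.0, -0.1438], [1.0, -1.0])
den = np.convolve([1.0, -0.7096], [1.0, -0.04369, 0.01392])
G0 = lambda z: np.polyval(num, z) / np.polyval(den, z)

w0, N = 0.0521, 8                    # known frequency; FIR order N > 2*nf = 2
z0 = np.exp(1j * w0)
alpha = z0 ** np.arange(-N, 0)       # alpha(z0) = [z0^-N, ..., z0^-1]
row = G0(z0) * alpha                 # complex constraint G0(z0)*K(z0,theta) = 1,
A = np.vstack([row.real, row.imag])  # ... split into two real linear equations
theta, *_ = np.linalg.lstsq(A, np.array([1.0, 0.0]), rcond=None)  # min-norm theta

# Noise-free loop with Delta_m = 0: zeta = d, u = -K(z,theta)[d], y = G0[u] + d,
# i.e., y = (1 - G0*K)[d], whose frequency response vanishes exactly at e^{±j*w0}.
t = np.arange(6000)
d = np.sin(w0 * t)
bK = np.concatenate(([0.0], theta[::-1]))               # K(z) = sum_i theta_i z^(i-N)
u = -lfilter(bK, [1.0], d)
y = lfilter(np.concatenate(([0.0], num)), den, u) + d   # strictly proper G0
print(np.max(np.abs(y[-500:])))      # ~0: sinusoid at w0 completely rejected
```

With the Gaussian noise v(t) added, the residual output would instead be governed by the sensitivity magnitude, which motivates the filter-order discussion above.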

Fig. 6.5 Control performance for Example 6.1, where the disturbance frequency ω0 is known. The
controller is turned on at time 20 s. The closed-loop plant outputs for three different values of the
filter order N show the performance improvement achieved by increasing the filter order N. ©2015
IEEE. Reprinted, with permission, from S. Jafari, P. Ioannou, B. Fitzpatrick, and Y. Wang, Robustness
and performance of adaptive suppression of unknown periodic disturbances, IEEE Transactions on
Automatic Control, vol. 60, pages 2166–2171 (2015)
The control design and analysis of the known-frequency case are very useful,
as they establish analytically that, for better performance, the order of the filter
K(z, θ) has to be much larger than 2n_f in order to provide the structural flexibility to
design the coefficients of the filter to simultaneously reject the periodic components
of the disturbance and minimize the effect of noise disturbances on the plant output.
The non-adaptive analysis provides the form and dimension of the filter K (z, θ ) in
the adaptive case where the frequencies of the disturbances are completely unknown.
We treat this case in the following subsection.

6.3.1.2 Adaptive Case: Unknown Disturbance

In order to attenuate disturbances with unknown characteristics, the same control
architecture as shown in Fig. 6.3 is employed, wherein the unknown controller parameters
are replaced with their estimates. That is, based on the certainty equivalence
principle [62], a robust parameter identifier is designed to calculate the controller
parameters at each time step.
The adaptive filter in Fig. 6.3 is given by

K(z, θ̂(t)) = ∑_{i=0}^{N−1} θ̂_i(t) z^{i−N} = θ̂⊤(t) α(z),
α(z) = [z^{−N}, z^{1−N}, …, z^{−1}]⊤,                                             (6.12)

where θ̂(t) = [θ̂0(t), θ̂1(t), …, θ̂_{N−1}(t)]⊤ ∈ R^N is the estimate of the unknown θ*
at time t. The control law is the same as in the known parameter case except that the
unknown vector θ ∗ is replaced with its estimate as

u(t) = −F(z)[K(z, θ̂(t − 1))[ζ(t)]],
ζ(t) = y(t) − G0(z)[u(t)],                                                        (6.13)

where θ̂(t − 1) is the most recent estimate of θ* available to generate the control
action at time t.
In order to design a parameter estimator to generate θ̂(t) at each time t, we express
θ ∗ in an appropriate parametric form. The following lemma presents a parametric
model for the closed-loop plant that is used for online parameter estimation.

Lemma 6.2 The closed-loop system (6.1),(6.13) is parameterized as

ζ(t) = φ⊤(t) θ* + η(t),                                                           (6.14)

where θ* = [θ0*, θ1*, …, θ*_{N−1}]⊤ ∈ R^N is the unknown desired parameter vector to be
identified, ζ(t) = y(t) − G0(z)[u(t)] is a measurable scalar signal, and the regressor
vector is given by
φ(t) = G0(z)F(z)α(z)[ζ(t)],                                                       (6.15)

and the unknown error term η(t) depends on the unmodeled noise v(t) and the plant
unmodeled dynamics Δm (z) and is given by

η(t) = (1 − G0(z)F(z)K(z, θ*)) [v(t) + Δm(z)G0(z)F(z)[u(t)]] + εs(t),             (6.16)

where εs(t) = (1 − G0(z)F(z)K(z, θ*))[ds(t)] is a term that decays exponentially to
zero.
Proof The proof is given in [1]. 
Remark 6.4 It follows from Lemma 6.2 that in the absence of noise and modeling
error (i.e., if Δm(z) = 0 and v(t) = 0), the error term η(t) is just a signal that decays
exponentially to zero.
Remark 6.5 The definition of the regressor (6.15) implies that the stable compen-
sator F(z) provides the flexibility to manipulate the regressor excitation level by
shaping the spectrum of the open-loop plant G 0 (z).
Now, using the parametric model (6.14) and employing the parameter estimation
techniques discussed in [62], we design a robust adaptive law to estimate the unknown
parameter vector θ ∗ as follows.
Let θ̂ (t − 1) be the most recent estimate of θ ∗ , then the predicted value of the
signal ζ (t) based on θ̂ (t − 1) is generated as

ζ̂(t) = φ⊤(t) θ̂(t − 1).                                                           (6.17)

The normalized estimation error is defined as

ε(t) = (ζ(t) − ζ̂(t)) / m²(t),                                                     (6.18)

where m²(t) = 1 + γ0 φ⊤(t)φ(t), with γ0 > 0, is a normalizing signal. To generate
θ̂(t), we consider the robust pure least-squares algorithm [62]:

P(t) = P(t − 1) − [P(t − 1)φ(t)φ⊤(t)P(t − 1)] / [m²(t) + φ⊤(t)P(t − 1)φ(t)],
θ̂(t) = proj(θ̂(t − 1) + P(t)φ(t)ε(t)),                                            (6.19)

where P(0) = P⊤(0) > 0 and the projection operator proj(·) is used to guarantee that
θ̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^N | θ⊤θ ≤ θ²_max},                                                     (6.20)

where θmax > 0 is such that the desired parameter vector of the optimum filter, θ ∗ ,
belongs to S. Since we do not have a priori knowledge on the norm of θ ∗ , the
upper bound θmax must be chosen sufficiently large. The projection of the estimated
parameter vector into S may be implemented as [62, 65]:

χ(t) = θ̂(t − 1) + P(t)ε(t)φ(t),
ρ̄(t) = P^{−1/2}(t)χ(t),
ρ(t) = ρ̄(t) if ρ̄(t) ∈ S̄, or the orthogonal projection of ρ̄(t) onto S̄ if ρ̄(t) ∉ S̄,
θ̂(t) = P^{1/2}(t)ρ(t),                                                            (6.21)

where P^{−1} is decomposed into (P^{−1/2})⊤(P^{−1/2}) at each time step t to ensure all the
properties of the corresponding recursive least-squares algorithm without projection.
In (6.21), the set S is transformed into S̄ such that if χ ∈ S, then P −1/2 χ ∈ S̄. Since S
is chosen to be convex and P −1/2 χ is a linear transformation, then S̄ is also a convex
set [65]. An alternative to projection is a fixed σ -modification wherein no bounds for
|θ ∗ | are required [58]. In implementing (6.19), we also need to use modifications such
as covariance resetting [62] which monitors the covariance matrix P(t) to make sure
it does not become small in some directions, i.e., its minimum eigenvalue is always
greater than a small positive constant. Such modifications are presented and analyzed
in [62] and other references on discrete-time robust adaptive control schemes.
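A compact simulation of the scheme (6.13), (6.17)-(6.19) is sketched below for an illustrative stable plant with F(z) = 1, Δm(z) = 0, and v(t) = 0 (none of these choices come from the text). Two simplifications are made for brevity: a plain norm-ball projection replaces the P^{1/2}-transformed projection (6.21), and covariance resetting is omitted; the sketch therefore illustrates the structure of the algorithm, not the full robust implementation. Since Δm = 0 implies ζ(t) = d(t), the regressor can be precomputed by filtering ζ through G0(z):

```python
import numpy as np
from scipy.signal import lfilter

# Illustrative setup: stable SISO plant G0(z) = 0.5/(z - 0.6), F(z) = 1,
# Delta_m(z) = 0, and no broadband noise v(t).
b0, a0 = [0.0, 0.5], [1.0, -0.6]

N = 6                 # over-parameterized FIR order (single unknown frequency)
gamma0 = 1.0          # normalization gain in m^2(t) = 1 + gamma0*phi'phi
theta_max = 50.0      # radius of the projection set S
w0, T = 0.9, 4000     # disturbance frequency (unknown to the controller), horizon

d = np.sin(w0 * np.arange(T))   # periodic disturbance
zeta = d                        # with Delta_m = 0: zeta(t) = y(t) - G0[u](t) = d(t)
zf = lfilter(b0, a0, zeta)      # G0(z)[zeta]; phi(t) = [zf(t-N), ..., zf(t-1)]

theta = np.zeros(N)             # theta_hat
P = 100.0 * np.eye(N)
u = np.zeros(T)
for t in range(N, T):
    u[t] = -(theta @ zeta[t - N:t])          # control law (6.13), theta_hat(t-1)
    phi = zf[t - N:t]                        # regressor (6.15) with F = 1
    m2 = 1.0 + gamma0 * (phi @ phi)
    eps = (zeta[t] - phi @ theta) / m2       # normalized estimation error (6.18)
    P = P - np.outer(P @ phi, phi @ P) / (m2 + phi @ P @ phi)   # covariance (6.19)
    theta = theta + P @ phi * eps
    nrm = np.linalg.norm(theta)              # simple norm-ball projection onto S,
    if nrm > theta_max:                      # standing in for (6.21)
        theta *= theta_max / nrm

y = lfilter(b0, a0, u) + d                   # plant output
print(np.mean(np.abs(y[-500:])))             # small: disturbance attenuated
```

The printed steady-state mean of |y| should be small compared with the unit disturbance amplitude, while the early output essentially equals the disturbance before the estimator adapts.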
The following theorem summarizes the properties of the adaptive control law.

Theorem 6.2 Consider the closed-loop system (6.1), (6.13), (6.17)–(6.21), and
choose N > 2n̄ f , where n̄ f is a known upper bound for the number of distinct
frequencies in the disturbance. Assume that the plant modeling error satisfies

c0 ‖Δm(z)G0(z)F(z)‖₁ < 1,                                                         (6.22)

where c0 is a finite positive constant independent of Δm(z) which depends on known
parameters. Then, all signals in the closed-loop system are uniformly bounded and
the plant output satisfies

lim sup_{T→∞} (1/T) ∑_{τ=t}^{t+T−1} |y(τ)|² ≤ c (μ_Δ² + v0²),                      (6.23)

for any t ≥ 0 and some finite positive constant c which is independent of t, T , Δm (z),
and v(t), where μΔ is a constant proportional to the size of the plant unmodeled
dynamics Δm (z), and v0 = supτ |v(τ )|. In addition, in the absence of modeling error
and noise (i.e., if Δm (z) = 0 and v(t) = 0), the adaptive control law guarantees the
convergence of y(t) to zero.

Proof The proof is given in [1]. 

Remark 6.6 The condition (6.22) is a sufficient condition to ensure the uniform
boundedness of all signals in the closed-loop system. Such norm-bound conditions
state that the closed-loop system can tolerate nonzero small-size modeling errors.
This condition also indicates the role of the compensator F(z) in improving the
stability robustness with respect to unmodeled dynamics.

6.3.2 Continuous-Time Systems

In this section, we extend the result of Sect. 6.3.1 to continuous-time systems. Con-
sider the plant model (6.1) and assume G 0 (s) is a known stable nominal plant transfer
function (possibly non-minimum phase) and the unknown multiplicative modeling
uncertainty term Δm (s) is such that Δm (s)G 0 (s) is proper with stable poles.

6.3.2.1 Non-adaptive Case: Known Disturbance Frequencies

Let the filter K in Fig. 6.3 be of the form

K(s, θ) = ∑_{k=0}^{N−1} θ_k λ^{N−k}/(s + λ)^{N−k} = θ⊤ Λ(s),
Λ(s) = [λ^N/(s + λ)^N, …, λ/(s + λ)]⊤,                                            (6.24)

where θ = [θ0, θ1, …, θ_{N−1}]⊤ ∈ R^N is the controller parameter vector, and the scalar
λ > 0 and the integer N are design parameters. The control objective is to find the
parameter vector θ and to design a stable compensator F(s) (if needed) such that the
magnitude of y is minimized.
If the frequencies of ds (t) in (6.2) are known, then its internal model


Ds(s) = ∏_{i=1}^{n_f} (s² + ωi²)                                                   (6.25)

is known. From Figs. 6.1 and 6.3, the sensitivity transfer function from d(t) to y(t)
is given by

y(t) = S(s, θ)[d(t)] = [(1 − G0(s)F(s)K(s, θ)) / (1 + Δm(s)G0(s)F(s)K(s, θ))] [d(t)].   (6.26)

It follows from (6.26) that the effect of periodic components of d(t) on y(t) is com-
pletely rejected if S(s, θ) has zeros on the imaginary axis at the disturbance frequencies;
in other words, if S(s, θ) has the internal model of the sinusoidal components Ds(s)
as a factor. In addition, the filters K (s, θ ) and F(s) should be chosen such that S(s, θ )
remain stable for any admissible Δm (s). Assuming that we have perfect knowledge
on the frequencies in ds (t), we show that with the control architecture Fig. 6.3 and
filter (6.24), the control objective can be met. In particular, we discuss how the design
parameters may affect robust stability and performance.

Theorem 6.3 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with
disturbance model (6.2) and filter K (s, θ ) of the form (6.24). Let ω1 , . . . , ωn f be
the distinct frequencies of sinusoidal disturbance ds (t) and Ds (s) in (6.25) be the
internal model of ds (t). Then, there exists a θ ∗ such that with K (s, θ ∗ ), the control
law of Fig. 6.3 completely rejects the periodic components of the disturbances if and
only if G0(si)F(si) ≠ 0, si = jωi, for i = 1, 2, …, n_f, i.e., G0(s)F(s) has no zero
at the roots of Ds(s), and N ≥ 2n_f, provided the stability condition
‖Δm(s)G0(s)F(s)K(s, θ*)‖∞ < 1                                                     (6.27)

is satisfied. The choice of θ* is unique if N = 2n_f. In addition, all signals in the
closed-loop system are guaranteed to be uniformly bounded and the plant output
satisfies
lim_{t→∞} sup_{τ≥t} |y(τ)| ≤ ‖S(s, θ*)‖₁ v0
                           ≤ c (1 + ‖G0(s)F(s)K(s, θ*)‖₁) v0,                     (6.28)

where v0 = sup_τ |v(τ)| and c = ‖1/(1 + Δm(s)G0(s)F(s)K(s, θ*))‖₁ is a finite
positive constant.
Moreover, in the absence of unmodeled noise (i.e., if v(t) = 0), y(t) converges to
zero exponentially fast.

Proof The proof is given in [60]. 

Theorem 6.3 gives conditions under which we can completely reject the sinusoidal
components of the disturbance when the frequencies in ds (t) are known. It also
shows that if the plant uncertainty satisfies a norm-bound condition, the output is
of the order of the broadband random noise level at steady state. In the presence
of noise v(t), however, a large magnitude of the sensitivity function S(s, θ ∗ ) may
lead to noise amplification, especially at high frequencies. The situation may be
worse if the plant has a very small gain in the frequency range of ds(t) and a larger
gain at high frequencies. In such cases, the design of a pre-compensator F(s) to
shape the frequency response of the plant is a possible remedy to achieve a good
compromise between performance and robustness. It should be noted that with N =
2n f , there exists a unique θ ∗ for which complete rejection of the sinusoidal terms
of the disturbance is possible. Such a unique parameter vector, however, does not
provide any flexibility to improve performance and/or robust stability margins. For
N > 2n f , however, there exists an infinite number of vectors θ ∗ that guarantee the
results of Theorem 6.3. In such cases, one may choose a θ ∗ that in addition to rejecting
the sinusoidal disturbance terms, minimizes the magnitude of G 0 (s)F(s)K (s, θ ∗ )
and therefore limits the possible amplification of the output noise. The existence of
a minimizer is the subject of the following lemma.
152 S. Jafari and P. Ioannou

Lemma 6.3 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with dis-
turbance model (6.2) and filter K (s, θ ) of the form (6.24). If the conditions in The-
orem 6.3 are satisfied, then there exists a θ∗ with ‖θ∗‖ ≤ r0, for some r0 > 0, that solves the following constrained convex optimization problem:

θ∗ = arg min_{θ∈Ω} ‖G0(s)F(s)K(s, θ)‖∞ (6.29)

where

Ω = {θ | ‖θ‖ ≤ r0, G0(si)F(si)K(si, θ) = 1, si = ±jωi, i = 1, . . . , n_f}. (6.30)

Proof The proof is given in [60]. 

Remark 6.7 When N = 2n_f, the set Ω in (6.30) is a singleton; hence, the cost in (6.29) is a fixed constant and cannot be reduced.

Remark 6.8 As shown in [60], the constraint G 0 (si )F(si )K (si , θ ) = 1 is a polyno-
mial equation which can be expressed as a Sylvester-type matrix equation, i.e., the
constraint is equivalent to a system of algebraic linear equations.
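As Remark 6.8 suggests, the constraint is linear in θ because K(s, θ) = θ⊤Λ(s) is linear in θ. The sketch below (a minimal illustration, not from the chapter) assumes a hypothetical compensated plant G0(s)F(s) = 1/(s + 1), a single known disturbance frequency ω1 = 1 rad/s, and the illustrative values λ = 2, N = 6; it assembles the constraint G0(jω1)F(jω1)K(jω1, θ) = 1 as two real linear equations and takes the minimum-norm feasible θ. The optimization (6.29) would then search this affine feasible set for the θ minimizing ‖G0(s)F(s)K(s, θ)‖∞.

```python
import numpy as np

lam, N = 2.0, 6            # filter pole and order (illustrative choices)
w1 = 1.0                   # single known disturbance frequency (assumed)

def H(s):                  # hypothetical compensated plant G0(s)F(s)
    return 1.0 / (s + 1.0)

def Lam(s):                # Lambda(s) = [lam^N/(s+lam)^N, ..., lam/(s+lam)]^T
    return np.array([lam ** (N - k) / (s + lam) ** (N - k) for k in range(N)])

# The constraint G0(j w1) F(j w1) K(j w1, theta) = 1 is linear in theta:
# split it into real and imaginary parts to get two real equations.
row = H(1j * w1) * Lam(1j * w1)
A = np.vstack([row.real, row.imag])        # 2 x N real constraint matrix
b = np.array([1.0, 0.0])

theta, *_ = np.linalg.lstsq(A, b, rcond=None)   # minimum-norm feasible theta

loop_gain = lambda s: H(s) * (theta @ Lam(s))   # G0 F K evaluated at s
print(abs(loop_gain(1j * w1) - 1.0))            # constraint residual, ~0
```

Any θ returned here satisfies the rejection constraint exactly; with N = 6 > 2n_f = 2, four degrees of freedom remain for the H∞ minimization.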

From (6.27), the LTI filter F(s) can help to improve the stability robustness with
respect to modeling errors. In the adaptive case, in addition to stability margin, F(s)
affects the level of excitation of the regressor at the disturbance frequencies. Let
us further explain the objective of introducing this filter. Since the plant unmodeled dynamics are often dominant at high frequencies and the control bandwidth for disturbance rejection is in the low-frequency range, we choose F(s) to make
the magnitude of G 0 (s)F(s) sufficiently large over the control bandwidth and suffi-
ciently small at high frequencies in order to limit the excitation of the high-frequency
unmodeled dynamics.
A simple procedure for the design of F(s) for open-loop plants with low gain
over the control bandwidth is given below.
1. Let G̃ 0 (s) = G 0 (s).
2. If G̃ 0 (s) has a zero on the jω-axis at s = ± jω0 , change it to s = −δ0 ± jω0 ,
where δ0 > 0 is some small positive constant.
3. If G̃ 0 (s) has an unstable zero at s = σ0 ± jω0 with σ0 > 0, change it to s =
−σ0 ± jω0 , i.e., reflect it across the jω-axis and make it stable.
4. Let

F(s) = κ0 α0^m/(s + α0)^m · G̃0^{-1}(s), (6.31)

where m > 0 is an integer greater than the relative degree of G 0 (s) and κ0 , α0 > 0
are design constants that are chosen such that the low-pass filter κ0 α0m /(s + α0 )m
has a large enough gain over the control bandwidth.
The example below clarifies the above procedure.

Fig. 6.6 The magnitude plot of (a) G 0 (s) and (b) G 0 (s)F(s). The plant nominal model G 0 (s) has
zeros on the jω axis at ω = 0 and 2 rad/s. The filter F(s) flattens the magnitude plot of G 0 (s) over
the frequency range of interest and improves disturbance rejection

Example 6.2 Consider the following nominal plant model, which has three zeros on the jω-axis:

G0(s) = s(s² + 4)(s − 0.8)(s + 1.4) / ((s + 0.5)³(s + 2)²(s + 3)),

and assume that the disturbance frequencies are in the range [0, 300] rad/s. Following the above procedure, the filter F(s) can be chosen as

F(s) = κ0 α²(s + 0.5)³(s + 2)²(s + 3) / ((s + α)²(s + δ0)((s + δ0)² + 4)(s + 0.8)(s + 1.4)),

where κ0 = 1, δ0 = 0.01, and α = 300. Figure 6.6 shows the magnitude plot of
G0(s) and G0(s)F(s). The compensated plant G0(s)F(s) has unity gain over most of the control bandwidth except at and around ω = 0 and 2 rad/s, as G0(s) has
zeros on the jω axis at these two frequencies. It should be noted that according to
Theorem 6.3, for this plant model, rejection of disturbances at ω = 0 and 2 rad/s is
impossible as the plant has zero gain at these two frequencies.
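As a numerical sanity check of Example 6.2 (a verification sketch, not part of the original text), the compensated plant can be evaluated directly on the jω-axis using the example's own data (κ0 = 1, δ0 = 0.01, α = 300):

```python
import numpy as np

kappa0, delta0, alpha = 1.0, 0.01, 300.0   # design constants from Example 6.2

def G0(s):                                 # nominal plant of Example 6.2
    return (s * (s**2 + 4) * (s - 0.8) * (s + 1.4)
            / ((s + 0.5)**3 * (s + 2)**2 * (s + 3)))

def F(s):                                  # the pre-compensator chosen above
    return (kappa0 * alpha**2 * (s + 0.5)**3 * (s + 2)**2 * (s + 3)
            / ((s + alpha)**2 * (s + delta0) * ((s + delta0)**2 + 4)
               * (s + 0.8) * (s + 1.4)))

for w in (0.5, 1.0, 10.0, 100.0):          # frequencies inside the bandwidth
    print(w, abs(G0(1j * w) * F(1j * w)))  # magnitudes close to one
print(2.0, abs(G0(2j) * F(2j)))            # zero: G0 has a zero at 2 rad/s
```

The printed magnitudes are close to one inside the control bandwidth and exactly zero at ω = 2 rad/s, where G0 itself has a zero, consistent with Fig. 6.6.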

6.3.2.2 Adaptive Case: Unknown Disturbance

In order to reject unknown disturbances, we design and analyze an adaptive version of the filter (6.24) based on the certainty equivalence principle [58]. The adaptive filter in Fig. 6.3 is given by

K(s, θ̂(t)) = Σ_{k=0}^{N−1} θ̂k(t) λ^{N−k}/(s + λ)^{N−k} = θ̂(t)⊤Λ(s),
Λ(s) = [λ^N/(s + λ)^N, . . . , λ/(s + λ)]⊤, (6.32)

where θ̂(t) = [θ̂0(t), θ̂1(t), . . . , θ̂_{N−1}(t)]⊤ ∈ R^N and λ > 0 is a design constant.
Then, the control law can be expressed as

u(t) = −F(s)[K(s, θ̂(t))[ζ(t)]],
ζ(t) = y(t) − G0(s)[u(t)], (6.33)

where θ̂ (t) is the estimate of θ ∗ at time t.


In order to design a parameter estimator to generate θ̂(t) at each time t, we express
θ ∗ in an appropriate parametric form. The following lemma presents a parametric
model for the closed-loop plant that is used for online parameter estimation.
Lemma 6.4 The closed-loop system (6.1), (6.33) is parameterized as

ζ(t) = φ(t)⊤θ∗ + η(t), (6.34)

where θ∗ = [θ0∗, θ1∗, . . . , θ∗_{N−1}]⊤ ∈ R^N is the unknown desired parameter vector to be identified, ζ(t) = y(t) − G0(s)[u(t)] is a measurable scalar signal, and the regressor vector is given by

φ(t) = G0(s)F(s)Λ(s)[ζ(t)], (6.35)

where the unknown error term η(t) depends on the unmodeled noise v(t) and the
plant unmodeled dynamics Δm (s) and is given by

η(t) = (1 − G0(s)F(s)K(s, θ∗))[v(t) + Δm(s)G0(s)F(s)[u(t)]] + εs(t), (6.36)

where εs(t) = (1 − G0(s)F(s)K(s, θ∗))[ds(t)] is a term that decays exponentially to zero.
Proof The proof is given in [60]. 
Remark 6.9 It follows from Lemma 6.4 that in the absence of noise and modeling error (i.e., if Δm(s) = 0 and v(t) = 0), the error term η(t) simply decays exponentially to zero.
Remark 6.10 The definition of the regressor (6.35) implies that the stable compen-
sator F(s) provides the flexibility to manipulate the regressor excitation level by
shaping the spectrum of the open-loop plant G 0 (s).
Now, using the parametric model (6.34) and employing the parameter estimation
techniques discussed in [58], we design a robust adaptive law to estimate the unknown
parameter vector θ ∗ as follows.
The predicted value of the signal ζ (t) based on θ̂ (t) is generated as

ζ̂(t) = φ(t)⊤θ̂(t). (6.37)

The normalized estimation error is defined as



ε(t) = (ζ(t) − ζ̂(t))/m²(t), (6.38)

where m²(t) = 1 + γ0 φ(t)⊤φ(t), with γ0 > 0, is a normalizing signal. To generate θ̂(t), we consider the robust pure least-squares algorithm [58]:

Ṗ(t) = −P(t) (φ(t)φ(t)⊤/m²(t)) P(t),
θ̂˙(t) = proj(P(t)φ(t)ε(t)), (6.39)

where P(0) = P(0)⊤ > 0 and the projection operator proj(·) is used to guarantee that θ̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^N | θ⊤θ ≤ θ²max}, (6.40)

where θmax > 0 is such that the desired parameter vector of the optimum filter, θ ∗ ,
belongs to S. Since we do not have a priori knowledge on the norm of θ ∗ , the
upper bound θmax must be chosen sufficiently large. The projection of the estimated
parameter vector into S may be implemented as [58]:

θ̂˙(t) = P(t)φ(t)ε(t), if θ̂(t) ∈ S0, or if θ̂(t) ∈ δS and (P(t)φ(t)ε(t))⊤θ̂(t) ≤ 0;
θ̂˙(t) = P(t)φ(t)ε(t) − P(t) (θ̂(t)θ̂(t)⊤/(θ̂(t)⊤P(t)θ̂(t))) P(t)φ(t)ε(t), otherwise, (6.41)

where S0 = {θ ∈ R^N | θ⊤θ < θ²max} and δS = {θ ∈ R^N | θ⊤θ = θ²max} denote the interior and the boundary of S, respectively.
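A minimal numerical sketch of the estimator (6.38)–(6.41) may help fix ideas. It uses a forward-Euler discretization of the P(t) and θ̂(t) differential equations with a hand-picked sinusoidal regressor; all numerical values (dimensions, gains, step size) are illustrative assumptions, not from the chapter. The true parameter vector is deliberately placed outside S so that the projection activates; a radial clipping step guards against the small outward drift introduced by the Euler discretization.

```python
import numpy as np

N, theta_max, gamma0, h = 2, 0.8, 1.0, 1e-3   # sizes and gains (illustrative)
theta_star = np.array([1.0, 0.5])             # "true" parameters; note that
                                              # ||theta_star|| > theta_max
theta = np.zeros(N)                           # estimate, starts inside S
P = 100.0 * np.eye(N)                         # covariance, P(0) = P(0)^T > 0

for k in range(50_000):
    t = k * h
    phi = np.array([np.sin(t), np.cos(t)])    # persistently exciting regressor
    zeta = phi @ theta_star                   # noise-free measurement, cf. (6.34)
    m2 = 1.0 + gamma0 * phi @ phi             # normalizing signal
    eps = (zeta - phi @ theta) / m2           # normalized error (6.38)
    upd = P @ phi * eps                       # unconstrained update direction
    on_boundary = theta @ theta >= theta_max**2 - 1e-12
    if on_boundary and upd @ theta > 0:       # (6.41): remove outward component
        Ptheta = P @ theta
        upd = upd - Ptheta * (theta @ upd) / (theta @ Ptheta)
    theta = theta + h * upd                   # Euler step for theta_hat
    P = P - h * (P @ np.outer(phi, phi) @ P) / m2   # Euler step for (6.39)
    nrm = np.linalg.norm(theta)               # clip the O(h^2) outward drift
    if nrm > theta_max:
        theta *= theta_max / nrm

print(np.linalg.norm(theta))   # stays at or below theta_max
```

On the boundary the projected update is tangential to δS, so θ⊤θ̇ = 0 and the estimate never leaves S.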
The following theorem summarizes the properties of the adaptive control law.

Theorem 6.4 Consider the closed-loop system (6.1), (6.33), (6.37)–(6.41), and
choose N > 2n̄ f , where n̄ f is a known upper bound for the number of distinct
frequencies in the disturbance. Assume that the plant modeling error satisfies

c0‖Δm(s)G0(s)F(s)‖1 < 1, (6.42)

where c0 is a finite positive constant independent of Δm (s) which depends on known


parameters. Then, all signals in the closed-loop system are uniformly bounded and
the plant output satisfies
lim sup_{T→∞} (1/T) ∫_t^{t+T} |y(τ)|² dτ ≤ c(μΔ² + v0²), (6.43)

for any t ≥ 0 and some finite positive constant c which is independent of t, T , Δm (s),
and v(t), where μΔ is a constant proportional to the size of the plant unmodeled
dynamics Δm (s), and v0 = supτ |v(τ )|. In addition, in the absence of modeling error
and noise (i.e., if Δm (s) = 0 and v(t) = 0), the adaptive control law guarantees the
convergence of y(t) to zero.

Proof The proof is given in [60]. 

Theorem 6.4 states that, with the proposed adaptive control law, if the modeling uncertainty satisfies the norm-bound condition (6.42), the energy of the plant output y(t) is of the order of the broadband noise level and the size of the plant unmodeled dynamics.

6.4 Known Stable Plants: MIMO Systems

In this section, the results of Sect. 6.3 are generalized to MIMO systems. We con-
sider the plant model shown in Fig. 6.1 and the control structure in Fig. 6.3 and
propose a robust adaptive scheme for rejection of unknown periodic components of
the disturbance acting on the output channels of MIMO systems. It is assumed that
the nominal plant model, G 0 (q), is known and stable, possibly with unstable zeros.
The stability and performance properties of the closed-loop system are analyzed for
both discrete-time and continuous-time MIMO systems. We first consider the ideal
scenario (non-adaptive) when complete information about the characteristics of the
disturbance is available. Subsequently, we design a robust adaptive control scheme
and analytically show that the proposed control law guarantees performance and
stability robustness with respect to unmodeled dynamics and disturbance. It should
be noted that for MIMO systems, for each output channel, the additive output disturbance is assumed to be of the form (6.2); that is, for the j-th channel, we have

d_j(t) = d_{s_j}(t) + v_j(t) = Σ_{i=1}^{n_{f_j}} a_{ij} sin(ω_{ij} t + ϕ_{ij}) + v_j(t). (6.44)

That is, in MIMO systems, the disturbances applied to different output channels may
have completely different characteristics.

6.4.1 Discrete-Time Systems

Let G 0 (z) be the nominal plant transfer function with n u inputs and n y outputs and
assume that the unknown multiplicative modeling uncertainty term Δm (z) is such
that Δm (z)G 0 (z) is proper with stable poles.

6.4.1.1 Non-adaptive Case: Known Disturbance Frequencies

Let the filter K in Fig. 6.3 be an n_u × n_y matrix whose elements are FIR filters of order N of the form

K(z, θ) = [K_ij(z, θ_ij)]_{n_u × n_y},
K_ij(z, θ_ij) = θ_ij⊤ α(z), (6.45)
α(z) ≜ [z^{−N}, z^{1−N}, . . . , z^{−1}]⊤,

where θ_ij ∈ R^N is the parameter vector of the ij-th element of the filter and θ = [θ11⊤, θ12⊤, . . . , θ⊤_{n_u n_y}]⊤ ∈ R^{N n_u n_y} is the concatenation of the θ_ij's.
Considering the filter structure in (6.45), we examine the conditions under which
there exists a desired parameter vector θ ∗ for which the periodic terms of the distur-
bance can be completely rejected without amplifying the output noise. In addition,
we examine how the stable LTI filter F(z) in Fig. 6.3 can be chosen to further improve
robustness and performance.
If for each output channel, the frequencies of the additive disturbance are known,
then we have the list of all distinct frequencies in the disturbance vector d(t). We let n_f denote the total number of distinct frequencies in d(t) and the distinct frequencies be denoted by ω1, . . . , ω_{n_f}; since some output channels may have disturbance terms at the same frequencies, n_f ≤ Σ_{j=1}^{n_y} n_{f_j}. Then, the internal model of the sinusoidal disturbances is given by

Ds(z) = Π_{i=1}^{n_f} (z² − 2 cos(ωi)z + 1). (6.46)

From Figs. 6.1 and 6.3, the sensitivity transfer function from d(t) to y(t) is given by

y(t) = S(z, θ)[d(t)] = (I − G0(z)F(z)K(z, θ))(I + Δm(z)G0(z)F(z)K(z, θ))⁻¹[d(t)]. (6.47)

The existence of a disturbance rejecting parameter vector θ ∗ is the subject of the


following theorem.

Theorem 6.5 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with
disturbance model (6.2) and filter K (z, θ ) of the form (6.45). Let ω1 , . . . , ωn f be
the distinct frequencies of the sinusoidal disturbance terms. Then, there exists a θ ∗
such that with K (z, θ ∗ ), the control law of Fig. 6.3 completely rejects the periodic
components of the disturbances if and only if n y ≤ n u , rank(G 0 (z i )F(z i )) = n y ,
z i = exp(± jωi ), for i = 1, 2, . . . , n f , and N ≥ 2n f , provided the stability condition

‖Δm(z)G0(z)F(z)K(z, θ∗)‖∞ < 1 (6.48)



is satisfied. The choice of θ∗ is unique if N = 2n_f. In addition, all signals in the closed-loop system are guaranteed to be uniformly bounded and the plant output satisfies

lim_{t→∞} sup_{τ≥t} ‖y(τ)‖∞ ≤ ‖S(z, θ∗)‖1 v0 ≤ c(1 + ‖G0(z)F(z)K(z, θ∗)‖1) v0, (6.49)

where v0 = sup_τ |v(τ)| and c = ‖(I + Δm(z)G0(z)F(z)K(z, θ∗))⁻¹‖1 is a finite positive constant. Moreover, in the absence of unmodeled noise (i.e., if v(t) = 0), y(t) converges to zero exponentially fast.

Proof The proof is given in [2]. 

For plants with fewer inputs than outputs, for degenerate plants (i.e., rank(G0(z)F(z)) < min{n_y, n_u} for all z), and for plants with transmission zeros at the frequencies of the disturbance, complete rejection of disturbances in all directions is impossible. It should be noted that Theorem 6.5 provides necessary and sufficient conditions for rejection of periodic disturbance signals in any direction; for a given disturbance vector, however, these conditions are only sufficient, since for a specific direction we just need S(z, θ) to have zero gain in the direction of the disturbance vector, not necessarily in all directions. In a MIMO system, a signal vector u with frequency ω0 can pass through G0(z) even if G0(z) has a transmission zero at this frequency; this is the case when the input vector u is not in the zero-gain directions of G0. The following two examples show that the conditions given in Theorem 6.5 are not necessary for a specific direction of the disturbance vector.

Example 6.3 (A system with zero gain at a single frequency) Consider the following plant model (the sampling period is 0.001 s), and assume F(z) = I and Δm(z) = 0:

G0(z) = (1/(z² − 0.5z + 0.5)) [ z   2 cos(0.1)z − 1 ; 1   z ],

which has a pair of zeros on the unit circle at ω0 = 0.1 rad/sample (100 rad/s). From Theorem 6.5, complete rejection of periodic disturbances with frequency ω0 in all directions is impossible for this plant, because rank(G0(exp(±jω0))) = 1 < 2 and the system has zero gain in some direction. However, for some disturbance vectors, say ds(t) = [sin(0.1t + 0.1), sin(0.1t)]⊤, we can find a filter of the form (6.45) that completely rejects ds(t), even though the conditions in Theorem 6.5 are not satisfied. For example, for

K(z, θ) = (1/z) [ 0.2469   −0.4994 ; 0.9906   0.2469 ],

the sensitivity transfer function matrix S(z, θ) cancels the effect of ds(t) on y(t), despite the fact that the conditions of Theorem 6.5 are not satisfied (rank(G0(exp(±0.1j))) = 1 < 2 and N = 1 < 2).
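The claim of Example 6.3 can be checked numerically (a verification sketch, not part of the original text): G0(z) is rank deficient at z0 = exp(j0.1), yet the given K makes the sensitivity function annihilate the complex phasor of ds(t).

```python
import numpy as np

w0 = 0.1                                    # rad/sample
z0 = np.exp(1j * w0)

def G0(z):                                  # plant of Example 6.3
    return np.array([[z, 2 * np.cos(0.1) * z - 1],
                     [1, z]]) / (z**2 - 0.5 * z + 0.5)

def K(z):                                   # the rejecting filter given above
    return np.array([[0.2469, -0.4994],
                     [0.9906, 0.2469]]) / z

print(abs(np.linalg.det(G0(z0))))           # ~0: G0 loses rank at omega0
v = np.array([np.exp(1j * 0.1), 1.0])       # complex phasor of ds(t)
S = np.eye(2) - G0(z0) @ K(z0)              # sensitivity at z0 (F = I, Dm = 0)
print(np.linalg.norm(S @ v))                # ~0: ds rejected in its direction
```

The residual is nonzero only because the entries of K are rounded to four digits.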

Example 6.4 (A degenerate system) Consider the following plant model (the sampling period is 0.001 s), and assume F(z) = I and Δm(z) = 0:

G0(z) = (1/z) [ 1   1 ; 1   1 ].

The largest possible rank of G0(z) is one, the minimum singular value of the system is zero at all frequencies, and the input direction corresponding to the zero of the system is [1, −1]⊤. Therefore, for this system we cannot reject periodic disturbances in every direction. However, for some disturbances, e.g., ds(t) = [sin(0.1t), sin(0.1t)]⊤, there exists a filter of the form (6.45) for which ds(t) is completely rejected. For example, for

K(z, θ) = ((0.74z − 0.4975)/z²) [ 1   1 ; 1   1 ],

the sensitivity transfer function matrix S(z, θ) has zero gain at ω0 = 0.1 rad/sample (100 rad/s) in the direction of the disturbance vector ds(t), and therefore the disturbance is rejected despite the fact that one of the conditions of Theorem 6.5 is not satisfied (rank(G0(exp(±0.1j))) = 1 < 2).
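Example 6.4 can be verified in the same way (again a sketch, not part of the original text): the phasor of ds(t) lies in the non-zero-gain direction [1, 1]⊤ and is rejected, while the zero-gain direction [1, −1]⊤ passes through unattenuated.

```python
import numpy as np

z0 = np.exp(1j * 0.1)                             # omega0 = 0.1 rad/sample
G0 = np.array([[1.0, 1.0], [1.0, 1.0]]) / z0      # degenerate plant at z0
K = (0.74 * z0 - 0.4975) / z0**2 * np.array([[1.0, 1.0], [1.0, 1.0]])
S = np.eye(2) - G0 @ K                            # sensitivity (F = I, Dm = 0)

print(np.linalg.norm(S @ np.array([1.0, 1.0])))   # ~0: phasor of ds rejected
print(np.linalg.norm(S @ np.array([1.0, -1.0])))  # zero-gain direction unattenuated
```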

A disturbance-rejecting parameter vector θ∗ in the ideal case (i.e., when v(t) = 0 and Δm(z) = 0) leads to perfect performance and guarantees that the plant output
converges to zero exponentially fast. However, in the presence of noise and modeling
error, a large gain of G 0 (z)F(z)K (z, θ ∗ ) may drastically amplify the noise part
of the disturbance. Moreover, it may significantly reduce the stability margin or
lead to instability. The situation may get worse if the disturbance has some modes
z i = exp(± jωi ) near the transmission zeros of G 0 (z)F(z). In such cases, G 0 (z)F(z)
is close to becoming rank deficient at the frequencies of the disturbance ωi ’s. Indeed,
in order to cancel the periodic disturbance terms, the controller has to generate
periodic terms of the same frequencies which after going through G 0 (z)F(z) result
in periodic terms identical to that of the disturbance but of opposite sign. Clearly,
when the plant has a very low gain at the frequencies of the disturbance, the filter gain
must be large enough to make the gain of G 0 (z)F(z)K (z, θ ) in the direction of the
disturbance vector close to one. This, however, may increase ‖G0(z)F(z)K(z, θ)‖∞, leading to noise amplification.

Lemma 6.5 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with dis-
turbance model (6.2) and filter K (z, θ ) of the form (6.45). If the conditions in The-
orem 6.5 are satisfied, then there exists a θ∗ with ‖θ∗‖ ≤ r0, for some r0 > 0, that solves the following constrained convex optimization problem:

θ∗ = arg min_{θ∈Ω} ‖G0(z)F(z)K(z, θ)‖∞ (6.50)

where

Ω = {θ | ‖θ‖ ≤ r0, G0(zi)F(zi)K(zi, θ) = I, zi = exp(±jωi), i = 1, . . . , n_f}. (6.51)

Proof The proof is given in [2]. 

Remark 6.11 When N = 2n_f, the set Ω in (6.51) is a singleton; hence, the cost in (6.50) is a fixed constant and cannot be reduced.

Remark 6.12 As shown in [2], the constraint G 0 (z i )F(z i )K (z i , θ ) = I is a set of


polynomial equations which can be expressed as a Sylvester-type matrix equation,
i.e., the constraint is equivalent to a system of algebraic linear equations.

To design a filter with a satisfactory performance, one may use either or both of
the following two remedies:
(i) Increasing the order of the filter: Increasing the value of N provides more flexi-
bility in selecting the best parameter vector θ ∗ for which the periodic components
of the disturbance are rejected and the H∞ -norm of sensitivity function of the
output with respect to the random noise in the disturbance is minimized.
(ii) Pre-filtering modification (design of the stable filter F(z)): If the plant has a
small gain at the frequencies of the disturbance and comes close to losing rank at these frequencies, we may have very poor performance because of the possibly large value of the sensitivity transfer function S(z, θ) at other frequencies, which
may lead to noise amplification. Moreover, in the presence of high-frequency
unmodeled dynamics, if the plant has relatively large gain at high frequencies,
with a controller of the form (6.45), the closed-loop system may have a small
stability margin or may become unstable. Since the plant model G 0 (z) is known,
by proper shaping of the singular values of G0(z), both performance and robust
stability may be improved. That is, a pre-compensator F(z) is designed such that
G 0 (z)F(z) has a large enough gain in all directions over the frequency range of
the disturbances (if possible), and has a sufficiently small gain at high frequencies
where the modeling error is often dominant. We discuss this modification and
the trade-offs between performance improvement and robustness with respect
to plant modeling uncertainties.
The design of the filter F(z) requires the following a priori information: (i) the frequency range where the modeled part of the plant G0(z) has high enough accuracy (which is typically at low frequencies); (ii) an upper bound for the maximum singular value of the unmodeled dynamics Δm(z); (iii) the expected frequency range of the dominant part of the disturbance ds(t).
contains periodic terms with frequencies at the high range where the unmodeled
dynamics are dominant, their rejection may excite the unmodeled dynamics and
adversely affect stability. In practice, however, most high-frequency periodic dis-
turbances have a small amplitude and it may be better to ignore them and consider
them as part of unmodeled disturbance v(t) rather than try to attenuate them. This is
one of the trade-offs of performance versus robustness that is well known in robust
control and robust adaptive control [58].

In classical robust control design for MIMO systems, prior to the design of a
controller we may need to shape the singular values of the nominal plant in order to
be able to meet the desired specifications. The desired shape of the open-loop plant
is typically as follows: large enough gain at low frequencies in all directions, low
enough roll-off rate at the desired bandwidth (about 20 dB/decade) and higher rate
at high frequencies, and very small gain at high frequencies [63, 66, 67]. It is also desired that the maximum and minimum gains of the shaped plant be almost the same, i.e., that the singular values be aligned so that the plant has almost the same gain in all directions at each frequency [63, 68]. For an ill-conditioned system (a system with a large condition number), however, aligning the singular values is not recommended, as it may lead to poor performance and robustness [69]. Several algorithms and procedures have been proposed for singular value shaping, which mainly require some trial and error [66, 70]. In [71], a more systematic algorithm has been proposed which guarantees that the singular values of the loop-shaped plant and the condition number of the shaping weights lie in a pre-specified region.
System decomposition techniques such as inner–outer factorization can also be used to design a frequency weighting matrix for well-conditioned square plants. The inner–outer factorization is used in solving problems related to optimal, robust, and H∞ control design, and several algorithms have been proposed to calculate an inner–outer factorization of a proper rational matrix [72–74]. Consider a square plant matrix G0(z) with stable poles and without any transmission zero on the unit circle. By using the algorithm proposed in [75], the inner–outer factors of G0(z) can be computed. If the plant has some zeros on the unit circle, one can perturb the zeros and move them slightly away from the unit circle and then apply the decomposition procedure. Let G_out(z) be the outer factor of G0(z) (or of the perturbed version of G0(z)); then G0(z)G_out⁻¹(z) is a proper stable all-pass (or almost all-pass) matrix. Let F(z) = κ0 f(z)G_out⁻¹(z), where f(z) is a scalar low-pass filter with dc-gain of one and desired bandwidth and roll-off rate at high frequencies, and κ0 > 0 is a design constant. Then, G0(z)F(z) has a gain of κ0 over the desired low-frequency range and a small gain at high frequencies, with aligned singular values.

6.4.1.2 Adaptive Case: Unknown Disturbance

In order to counteract the effect of unknown disturbances on the plant output, an


adaptive filter is required to adjust its parameters in the direction of minimizing the
norm of the plant output. The structure of the closed-loop system with an adaptive
filter is shown in Fig. 6.3, where the adaptive filter parameter vector is updated at each time step.
The adaptive version of (6.45) is given by

K(z, θ̂(t)) = [K_ij(z, θ̂_ij(t))]_{n_u × n_y},
K_ij(z, θ̂_ij(t)) = θ̂_ij(t)⊤ α(z), (6.52)
α(z) = [z^{−N}, z^{1−N}, . . . , z^{−1}]⊤,

where θ̂_ij(t) ∈ R^N is the parameter vector of the ij-th element of the filter and θ̂(t) = [θ̂11(t)⊤, θ̂12(t)⊤, . . . , θ̂_{n_u n_y}(t)⊤]⊤ ∈ R^{N n_u n_y}. Then, the control law can be expressed as

u(t) = −F(z)[K(z, θ̂(t − 1))[ζ(t)]],
ζ(t) = y(t) − G0(z)[u(t)], (6.53)

where θ̂ (t − 1) is the most recent estimate of θ ∗ available to generate control action


at time t. In order to design an online estimator that generates the estimate of θ∗, we express θ∗ in an appropriate parametric form. The following
lemma presents a parametric model for the closed-loop plant to be used for parameter
estimation.

Lemma 6.6 The closed-loop system (6.1), (6.53) is parameterized as

ζ(t) = Φ(t)⊤θ∗ + η(t), (6.54)

where θ∗ = [θ11∗⊤, θ12∗⊤, . . . , θ∗⊤_{n_u n_y}]⊤ ∈ R^{N n_u n_y} is the unknown desired parameter vector to be identified, ζ(t) = y(t) − G0(z)[u(t)] ∈ R^{n_y} is a measurable vector signal, and the regressor matrix is given by

Φ(t) = G0(z)F(z)[W(t)], (6.55)

where

W(t) = blkdiag(w(t), . . . , w(t)) ∈ R^{N n_u n_y × n_u} (6.56)

is block-diagonal with n_u copies of w(t) = [(α(z)[ζ1(t)])⊤, . . . , (α(z)[ζ_{n_y}(t)])⊤]⊤ ∈ R^{N n_y}, and the unknown error term η(t) depends on the unmodeled noise v(t) and the plant unmodeled dynamics Δm(z) and is given by

η(t) = (I − G0(z)F(z)K(z, θ∗))[v(t) + Δm(z)G0(z)F(z)[u(t)]] + εs(t), (6.57)

where εs(t) = (I − G0(z)F(z)K(z, θ∗))[ds(t)] is a term that decays exponentially to zero.

Proof The proof is given in [2]. 



Remark 6.13 It follows from Lemma 6.6 that in the absence of noise and modeling error (i.e., if Δm(z) = 0 and v(t) = 0), the error term η(t) simply decays exponentially to zero.

Remark 6.14 The definition of the regressor (6.55) implies that the stable com-
pensator F(z) provides the flexibility to manipulate the regressor excitation level
by shaping the spectrum of the open-loop plant G 0 (z). The matrix W (t) in (6.55)
contains the frequency components of the disturbance, and the regressor Φ(t) is
obtained by passing W (t) through G 0 (z)F(z). If G 0 (z) has very low gains at some
frequencies of the disturbance, it may severely attenuate those frequency compo-
nents, so the regressor will carry almost no information on those frequencies with
the consequence of making their identification and therefore their rejection difficult if
at all possible in some cases. Introducing the stable filter F(z) to shape the frequency
response of G 0 (z) helps to improve parameter estimation.

After shaping the singular values of the nominal plant G 0 (z), we design a param-
eter estimator for the adaptive filter. Based on the derived parametric model (6.54), a
robust parameter estimator can be developed using the techniques discussed in [62]
to guarantee stability and robustness independent of the excitation properties of the
regressor Φ(t) in the presence of non-zero error term η(t). It should be noted that
the number of distinct frequencies, n f , of the disturbance vector is unknown and
an upper bound, n̄ f , for it is assumed to be known; moreover, as discussed earlier
even if n f is known, we often choose N > 2n f to achieve a better performance. This
choice of N leads to an over-parameterization of the controller and gives a regressor
Φ(t) which cannot be persistently exciting. The lack of persistence of excitation
makes the adaptive law susceptible to parameter drift [58] and possible instability.
Therefore the adaptive law for estimating θ ∗ must be robust; otherwise, the presence
of the nonzero error term η(t) may cause parameter drift and lead to instability. In
the absence of a persistently exciting regressor, several modifications have been pro-
posed in the literature to avoid parameter drift in the adaptive law [58]. In the present
chapter, we use parameter projection to directly restrict the estimate of the unknown
parameter vector from drifting to infinity.
Let θ̂ (t − 1) be the most recent estimate of θ ∗ , then the predicted value of the
signal ζ (t) based on θ̂ (t − 1) is generated as

ζ̂(t) = Φ(t)⊤θ̂(t − 1). (6.58)

The normalized estimation error vector is defined as

ε(t) = (ζ(t) − ζ̂(t))/m²(t), (6.59)

where m²(t) = 1 + γ0 trace(Φ(t)⊤Φ(t)), with γ0 > 0, is a normalizing signal. To generate θ̂(t), we consider the robust pure least-squares algorithm [62]:

P⁻¹(t) = P⁻¹(t − 1) + Φ(t)Φ(t)⊤/m²(t),
θ̂(t) = proj(θ̂(t − 1) + P(t)Φ(t)ε(t)), (6.60)

where P⁻¹(0) = P⁻⊤(0) > 0 and the projection operator proj(·) is used to guarantee that θ̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^{N n_u n_y} | θ⊤θ ≤ θ²max}, (6.61)

where θmax > 0 is such that the desired parameter vector of the optimum filter, θ ∗ ,
belongs to S. Since we do not have a priori knowledge on the norm of θ ∗ , the upper
bound θmax must be chosen sufficiently large. To prevent P⁻¹(t) from growing without
bound, covariance resetting modification [58, 62] can be used, i.e., set P(tr ) = β0 I ,
where tr is the sample time at which λmin (P) ≤ β1 , and β0 > β1 > 0 are design
constants. One may also use the modified least-squares algorithm with forgetting
factor [58].
The projection of the estimated parameter vector into S may be implemented as [62, 65]:

χ(t) = θ̂(t − 1) + P(t)Φ(t)ε(t),
ρ̄(t) = P^{-1/2}(t)χ(t),
ρ(t) = ρ̄(t) if ρ̄(t) ∈ S̄; otherwise, ρ(t) is the orthogonal projection of ρ̄(t) onto S̄, (6.62)
θ̂(t) = P^{1/2}(t)ρ(t),

where P⁻¹ is decomposed into (P^{-1/2})⊤(P^{-1/2}) at each time step t to ensure all the properties of the corresponding recursive least-squares algorithm without projection. In (6.62), the set S is transformed into S̄ such that if χ ∈ S, then P^{-1/2}χ ∈ S̄. Since S is chosen to be convex and P^{-1/2}χ is a linear transformation, S̄ is also a convex set [65]. An alternative to projection is a fixed σ-modification, wherein no bounds on ‖θ∗‖ are required [58].
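The discrete-time estimator (6.58)–(6.61) can be sketched in a few lines, with the P⁻¹ recursion implemented in covariance form via the matrix inversion lemma. The regressor, dimensions, and bound θmax are illustrative assumptions; for brevity, the projection is a simple radial scaling onto S rather than the P-weighted transformation of (6.62), and the data are noise-free so that θ̂ converges.

```python
import numpy as np

N, gamma0, theta_max = 2, 1.0, 10.0           # sizes and bound (illustrative)
theta_star = np.array([0.6, -0.4])            # unknown parameters, inside S
theta = np.zeros(N)                           # estimate theta_hat
P = 100.0 * np.eye(N)                         # P(0) = P(0)^T > 0

for t in range(1, 3000):
    Phi = np.array([np.sin(0.7 * t), 1.0])    # exciting regressor (N = 2)
    zeta = Phi @ theta_star                   # noise-free data, cf. (6.54)
    m2 = 1.0 + gamma0 * Phi @ Phi             # normalizing signal
    eps = (zeta - Phi @ theta) / m2           # estimation error (6.59)
    # P^{-1}(t) = P^{-1}(t-1) + Phi Phi^T / m2, written in covariance form
    # via the matrix inversion lemma to avoid an explicit inverse:
    P = P - (P @ np.outer(Phi, Phi) @ P) / (m2 + Phi @ P @ Phi)
    theta = theta + P @ Phi * eps             # least-squares update (6.60)
    nrm = np.linalg.norm(theta)               # simplified radial projection
    if nrm > theta_max:                       # onto S = {||theta|| <= theta_max}
        theta *= theta_max / nrm

print(theta)   # converges toward theta_star
```

With a persistently exciting regressor and η(t) = 0, the pure least-squares estimate converges to θ∗; the projection here never activates because θ∗ lies well inside S.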
The following theorem summarizes the properties of the adaptive control law.
Theorem 6.6 Consider the closed-loop system (6.1), (6.53), (6.58)–(6.62), and
choose N > 2n̄ f , where n̄ f is a known upper bound for the number of distinct
frequencies in the disturbance vector. Assume that the plant modeling error satisfies

c0‖Δm(z)G0(z)F(z)‖1 < 1, (6.63)

where c0 is a finite positive constant independent of Δm (z) which depends on known


parameters. Then, all signals in the closed-loop system are uniformly bounded and
the plant output satisfies

lim sup_{T→∞} (1/T) Σ_{τ=t}^{t+T−1} ‖y(τ)‖₂² ≤ c(μΔ² + v0²), (6.64)

for any t ≥ 0 and some finite positive constant c which is independent of t, T , Δm (z),
and v(t), where μΔ is a constant proportional to the size of the plant unmodeled
dynamics Δm (z), and v0 = supτ |v(τ )|. In addition, in the absence of modeling error
and noise (i.e., if Δm (z) = 0 and v(t) = 0), the adaptive control law guarantees the
convergence of y(t) to zero.
Proof The proof is given in [2]. 
The stability condition (6.63) indicates that the stability margin with respect to
modeling uncertainties can be improved by proper shaping of the singular values of
the plant model G 0 (z) by an appropriate choice of filter F(z).

6.4.2 Continuous-Time Systems

In this section, we extend the result of Sect. 6.4.1 to continuous-time systems. Fol-
lowing the same procedure, by considering a structure for the filter K in Fig. 6.3,
we first examine the solvability of the problem and then design and analyze a robust
adaptive control scheme.

6.4.2.1 Non-adaptive Case: Known Disturbance Frequencies

Let the filter K in Fig. 6.3 be an n u × n y matrix whose elements are FIR filters of
order N of the form
 
K(s, θ) = [K_ij(s, θ_ij)]_{n_u × n_y},

K_ij(s, θ_ij) = ∑_{k=0}^{N−1} (θ_ij)_k λ^{N−k}/(s + λ)^{N−k} = θ_ijᵀ Λ(s), (6.65)

Λ(s) = [λ^N/(s + λ)^N, …, λ/(s + λ)]ᵀ,

where θ_ij ∈ R^N is the parameter vector of the ij-th element of the filter, θ =
[θ₁₁ᵀ, θ₁₂ᵀ, …, θ_{n_u n_y}ᵀ]ᵀ ∈ R^{N n_u n_y} is the concatenation of the θ_ij's, and λ > 0 is a design
parameter.
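The vector Λ(s) acts as a bank of low-pass filters of decreasing order; at s = 0 every entry equals 1, so K_ij(0) is simply the sum of the entries of θ_ij. A small sketch evaluating these frequency responses (the helper names are ours):

```python
import numpy as np

def Lambda_jw(omega, lam, N):
    """Entries lam^(N-k)/(s+lam)^(N-k), k = 0..N-1, evaluated at s = j*omega."""
    s = 1j*omega
    return np.array([(lam/(s + lam))**(N - k) for k in range(N)])

def K_element(omega, theta_ij, lam):
    """K_ij(j*omega) = theta_ij^T Lambda(j*omega) for one filter element."""
    return theta_ij @ Lambda_jw(omega, lam, len(theta_ij))
```

Since |λ/(jω + λ)| ≤ 1, every entry of Λ(jω) has magnitude at most one, with the higher powers rolling off faster at high frequency.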
Considering the filter structure in (6.65), we examine the conditions under which
there exists a desired parameter vector θ ∗ for which the periodic terms of the distur-
bance can be completely rejected without amplifying the output noise. In addition,
we examine how the stable LTI filter F(s) in Fig. 6.3 can be designed to further
improve robustness and performance.
166 S. Jafari and P. Ioannou

Let n_f denote the total number of distinct frequencies in d(t) and let the distinct
frequencies be denoted by ω₁, …, ω_{n_f}; since some output channels may have
disturbance terms at the same frequencies, n_f ≤ ∑_{j=1}^{n_y} n_{f_j}. Then, the internal
model of the sinusoidal disturbances is given by

Ds(s) = ∏_{i=1}^{n_f} (s² + ωᵢ²). (6.66)

From Figs. 6.1 and 6.3, the sensitivity transfer function from d(t) to y(t) is given by

y(t) = S(s, θ)[d(t)]
     = (I − G₀(s)F(s)K(s, θ))(I + Δm(s)G₀(s)F(s)K(s, θ))⁻¹ [d(t)]. (6.67)

The following theorem gives a necessary and sufficient condition under which the
periodic components of the disturbance vector can be rejected.

Theorem 6.7 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with
disturbance model (6.2) and filter K (s, θ ) of the form (6.65). Let ω1 , . . . , ωn f be
the distinct frequencies of the sinusoidal disturbance terms. Then, there exists a θ ∗
such that with K (s, θ ∗ ), the control law of Fig. 6.3 completely rejects the periodic
components of the disturbances if and only if n y ≤ n u , rank(G 0 (si )F(si )) = n y ,
si = ± jωi , for i = 1, 2, . . . , n f , and N ≥ 2n f , provided the stability condition

‖Δm(s)G₀(s)F(s)K(s, θ*)‖∞ < 1 (6.68)

is satisfied. The choice of θ* is unique if N = 2n_f. In addition, all signals in the
closed-loop system are guaranteed to be uniformly bounded and the plant output
satisfies
lim_{t→∞} sup_{τ≥t} ‖y(τ)‖∞ ≤ ‖S(s, θ*)‖₁ v₀ ≤ c(1 + ‖G₀(s)F(s)K(s, θ*)‖₁) v₀, (6.69)

where v₀ = sup_τ |v(τ)| and c = ‖(I + Δm(s)G₀(s)F(s)K(s, θ*))⁻¹‖₁ is a finite
positive constant. Moreover, in the absence of unmodeled noise (i.e., if v(t) = 0),
y(t) converges to zero exponentially fast.

Proof The proof is given in [59]. 

Theorem 6.7 implies that the minimization of the magnitude of G₀(s)F(s)K(s, θ)
improves the stability margin and prevents the unnecessary amplification
of v(t). Since for N > 2n f , there is an infinite number of vectors θ which guarantee
rejection of the periodic disturbance terms, we can select the θ which also improves
the stability margin and prevents the broadband output noise amplification to the
extent possible. When partial knowledge about the frequency range of the distur-
bances is available, the filter F(s) should be chosen such that G 0 (s)F(s) has large

enough gains in all directions (if possible) over the expected disturbance frequency
range, and very small gains at high frequencies where the unmodeled dynamics may
be dominant. Large enough gains of G 0 (s)F(s) at the frequencies of the disturbance
increase the excitation level of the regressor at those frequencies and can significantly
improve the performance; moreover, small gains of G 0 (s)F(s) at high frequencies
reduce the level of excitation of the high-frequency unmodeled dynamics. To design
F(s), one may use singular-value shaping techniques [66] or system decomposition
approaches such as inner–outer factorization [75].

Lemma 6.7 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with dis-
turbance model (6.2) and filter K(s, θ) of the form (6.65). If the conditions in The-
orem 6.7 are satisfied, then there exists a θ* with ‖θ*‖ ≤ r₀, for some r₀ > 0, that
solves the following constrained convex optimization problem:

θ* = arg min_{θ∈Ω} ‖G₀(s)F(s)K(s, θ)‖∞, (6.70)

where

Ω = {θ | ‖θ‖ ≤ r₀, G₀(sᵢ)F(sᵢ)K(sᵢ, θ) = I, sᵢ = ±jωᵢ, i = 1, …, n_f}. (6.71)

Proof The proof is given in [59]. 

Remark 6.15 As shown in [59], the constraint G 0 (si )F(si )K (si , θ ) = I is a set of
polynomial equations which can be expressed as a Sylvester-type matrix equation,
i.e., the constraint is equivalent to a system of algebraic linear equations.
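For a SISO instance of this remark, the rejection constraint at each frequency is linear in θ: writing K(jω, θ) = θᵀΛ(jω), the condition G₀(jω)F(jω)K(jω, θ) = 1 splits into two real equations (real and imaginary parts). A hedged sketch — the helper name and the minimum-norm solve via `lstsq` are our own illustrative choices; [59] formulates the general MIMO case as a Sylvester-type matrix equation:

```python
import numpy as np

def rejection_system(theta_dim, lam, freqs, GF):
    """Stack real/imaginary parts of G0(jw)F(jw) Lambda(jw)^T theta = 1
    at each disturbance frequency w; GF(w) returns the complex gain
    G0(jw)F(jw) (SISO case)."""
    rows, rhs = [], []
    for w in freqs:
        s = 1j*w
        Lam = np.array([(lam/(s + lam))**(theta_dim - k) for k in range(theta_dim)])
        row = GF(w)*Lam               # complex row of the constraint
        rows += [row.real, row.imag]  # two real equations per frequency
        rhs += [1.0, 0.0]
    return np.array(rows), np.array(rhs)

# Minimum-norm solution when N > 2 n_f (underdetermined, consistent system);
# the plant/filter gain used here is a toy example, not from the chapter.
A_mat, b = rejection_system(6, 10.0, [1.0, 2.5], lambda w: 1.0/(1j*w + 1.0))
theta_star, *_ = np.linalg.lstsq(A_mat, b, rcond=None)
```

With N = 6 and two frequencies there are 4 equations in 6 unknowns, so `lstsq` returns the minimum-norm θ that satisfies the rejection constraints exactly.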

6.4.2.2 Adaptive Case: Unknown Disturbance

In order to suppress unknown periodic disturbances, we propose the adaptive version
of (6.65) given by

K(s, θ̂(t)) = [K_ij(s, θ̂_ij(t))]_{n_u × n_y},

K_ij(s, θ̂_ij(t)) = ∑_{k=0}^{N−1} (θ̂_ij(t))_k λ^{N−k}/(s + λ)^{N−k} = θ̂_ij(t)ᵀ Λ(s), (6.72)

Λ(s) = [λ^N/(s + λ)^N, …, λ/(s + λ)]ᵀ,

where θ̂_ij(t) ∈ R^N is the parameter vector of the ij-th element of the filter and
θ̂(t) = [θ̂₁₁(t)ᵀ, θ̂₁₂(t)ᵀ, …, θ̂_{n_u n_y}(t)ᵀ]ᵀ ∈ R^{N n_u n_y}. Then, the control law can be
expressed as

u(t) = −F(s)[K(s, θ̂(t))[ζ(t)]],
ζ(t) = y(t) − G₀(s)[u(t)], (6.73)

where θ̂(t) is the estimate of the unknown parameter vector θ* at time t.


In order to design an online estimator that generates the estimate of θ*, we express
θ* in an appropriate parametric model, described by the following lemma.
Lemma 6.8 The closed-loop system (6.1),(6.73) is parameterized as

ζ(t) = Φ(t)ᵀ θ* + η(t), (6.74)

where θ* = [θ₁₁*ᵀ, θ₁₂*ᵀ, …, θ_{n_u n_y}*ᵀ]ᵀ ∈ R^{N n_u n_y} is the unknown desired parameter vec-
tor to be identified, ζ(t) = y(t) − G₀(s)[u(t)] ∈ R^{n_y} is a measurable vector signal,
and the regressor matrix is given by

Φ(t) = G₀(s)F(s)[W(t)], (6.75)

where

W(t) = blockdiag(w(t), …, w(t)) ∈ R^{N n_u n_y × n_u}, (6.76)

with w(t) = [Λ(s)ᵀ[ζ₁(t)], …, Λ(s)ᵀ[ζ_{n_y}(t)]]ᵀ ∈ R^{N n_y}, and the unknown error
term η(t) depends on the unmodeled noise v(t) and the plant unmodeled dynamics
Δm(s) and is given by

η(t) = (I − G₀(s)F(s)K(s, θ*))[v(t) + Δm(s)G₀(s)F(s)[u(t)]] + ε_s(t), (6.77)

where ε_s(t) = (I − G₀(s)F(s)K(s, θ*))[d_s(t)] is a term that decays to zero exponentially.
Proof The proof is given in [59]. 
The parametric model (6.74) together with the parameter estimation techniques
discussed in [58] is used to design a robust adaptive law to estimate the unknown
parameter vector θ ∗ as follows.
The predicted value of the signal ζ (t) based on θ̂ (t) is generated as

ζ̂(t) = Φ(t)ᵀ θ̂(t). (6.78)

The normalized estimation error vector is defined as

ε(t) = (ζ(t) − ζ̂(t))/m²(t), (6.79)

where m²(t) = 1 + γ₀ trace(Φ(t)ᵀΦ(t)), with γ₀ > 0, is a normalizing signal. To
generate θ̂(t), we consider the robust pure least-squares algorithm [58]:

Ṗ(t) = −P(t) (Φ(t)Φ(t)ᵀ/m²(t)) P(t),
θ̂˙(t) = proj(P(t)Φ(t)ε(t)), (6.80)

where P(0) = P(0)ᵀ > 0 and the projection operator proj(·) is used to guarantee that
θ̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^{N n_u n_y} | θᵀθ ≤ θ²_max}, (6.81)

where θ_max > 0 is such that the desired parameter vector of the optimum filter, θ*,
belongs to S. The projection of the estimated parameter vector into S may be imple-
mented as [58]:

θ̂˙(t) = P(t)Φ(t)ε(t)  if θ̂(t) ∈ S₀, or if θ̂(t) ∈ δS and (P(t)Φ(t)ε(t))ᵀθ̂(t) ≤ 0;

θ̂˙(t) = P(t)Φ(t)ε(t) − P(t) (θ̂(t)θ̂(t)ᵀ)/(θ̂(t)ᵀP(t)θ̂(t)) P(t)Φ(t)ε(t)  otherwise, (6.82)

where S₀ = {θ ∈ R^{N n_u n_y} | θᵀθ < θ²_max} and δS = {θ ∈ R^{N n_u n_y} | θᵀθ = θ²_max} denote
the interior and the boundary of S, respectively.
Some alternatives to projection and other robust modifications can be found in
[58]; for example, one may use the fixed σ-modification, which requires no bound
on the set to which the unknown parameters belong; however, it destroys the ideal
convergence properties of the adaptive algorithm [58]. Also, to prevent the covariance
matrix P(t) from becoming nearly singular, covariance resetting [58] is used
to keep the minimum eigenvalue of P(t) greater than a pre-specified small positive
constant ρ₁ at all times. This modification guarantees that P(t) is positive definite
for all t ≥ 0.
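One simple way to realize such a safeguard — a hedged variant of our own; [58] describes the exact resetting rule — is to monitor the minimum eigenvalue of P(t) and lift the spectrum back up when it falls below ρ₁:

```python
import numpy as np

def covariance_reset(P, rho1):
    """If lambda_min(P) < rho1, shift the spectrum of P up so that its
    minimum eigenvalue equals rho1; P stays symmetric positive definite.
    (An illustrative variant, not necessarily the rule used in [58].)"""
    lam_min = np.linalg.eigvalsh(P).min()
    if lam_min < rho1:
        P = P + (rho1 - lam_min)*np.eye(P.shape[0])
    return P
```

Adding a multiple of the identity preserves the eigenvectors of P(t) while enforcing the eigenvalue floor.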
The following theorem summarizes the properties of the adaptive control law.
Theorem 6.8 Consider the closed-loop system (6.1), (6.73), (6.78)–(6.82), and
choose N > 2n̄ f , where n̄ f is a known upper bound for the number of distinct
frequencies in the disturbance. Assume that the plant modeling error satisfies

c₀ ‖Δm(s)G₀(s)F(s)‖₁ < 1, (6.83)

where c₀ is a finite positive constant independent of Δm(s). Then, all signals in the
closed-loop system are uniformly bounded and the plant output satisfies

lim sup_{T→∞} (1/T) ∫_t^{t+T} ‖y(τ)‖₂² dτ ≤ c(μΔ² + v₀²), (6.84)

for any t ≥ 0 and some finite positive constant c which is independent of t, T , Δm (s),
and v(t), where μΔ is a constant proportional to the size of the plant unmodeled
dynamics Δm (s), and v0 = supτ |v(τ )|. In addition, in the absence of modeling error
and noise (i.e., if Δm (s) = 0 and v(t) = 0), the adaptive control law guarantees the
convergence of y(t) to zero.

Proof The proof is given in [59]. 

The proposed control scheme provides a satisfactory performance and stability


margin if the design parameters are chosen properly. In any practical control design,
different types of uncertainties and modeling errors demand some sort of trade-off
between robust stability and performance which can be achieved using any avail-
able a priori knowledge regarding bounds on uncertainties to choose certain design
parameters. We summarize the main design parameters that contribute to the per-
formance and robustness with respect to unmodeled dynamics as follows: The filter
F(s), λ, N , γ0 , and P(0) are the design parameters affecting the excitation level of the
regressor and the rate of adaptation; and the margin of stability depends upon F(s)
and the adaptive filter order N . As explained earlier, we choose the design parameters
based on a priori knowledge of the maximum number of distinct frequencies of the
disturbance, frequency range of the disturbance, and the frequency range over which
the unmodeled dynamics may be dominant.

6.5 Unknown Minimum-Phase Plants: SISO Systems

In this section, we relax the assumption of a known stable plant and consider the case
where the plant model G₀(q) in Fig. 6.1 can be unstable with unknown parameters.
We do assume, however, that G₀(q) has stable zeros. We use the Model Reference
Adaptive Control (MRAC) structure [58, 62] to meet the objective of unknown
periodic disturbance rejection without amplifying the effect of broadband noise in the
output. The cost of achieving these objectives is the use of an over-parameterization
which adds to the number of computations. By focusing on discrete-time SISO
systems, we consider the plant model shown in Fig. 6.1 and the control structure in
Fig. 6.4, and show how the problem of rejecting unknown periodic disturbances can
be solved for unstable plants with unknown parameters as long as they are minimum
phase.
We consider the plant model (6.1) and assume

G₀(z) = k₀Z₀(z)/R₀(z) (6.85)

is an unknown minimum-phase nominal plant transfer function (possibly unstable).

Assumption 6.1 The following assumptions are made about G₀(z):

• Z₀(z) is a monic Hurwitz polynomial with unknown coefficients, and R₀(z) is an
arbitrary unknown monic polynomial which is allowed to be non-Hurwitz.
• The polynomial degrees n_p = deg(R₀) and n_z = deg(Z₀) are unknown, yet an
upper bound, n̄_p, for n_p is known, and the relative degree of G₀(z), n* = n_p − n_z > 0,
is known.
• The gain k₀ is unknown, yet its sign, sign(k₀), and an upper bound, k̄₀, for |k₀| are
known.

The above assumptions are the same as those in the classical MRAC [58, 62].
It should be noted that the knowledge of the sign of k0 significantly simplifies the
structure of the control scheme for which the control objective can be met; it can be
relaxed at the expense of a more complex control law [76].
The uncertain nature of the plant and disturbance models necessitates the use of a
robust adaptive control scheme to achieve the control objective with high accuracy.
We first consider the case where the parameters of G 0 (z) and disturbance frequen-
cies are perfectly known. A non-adaptive controller is then designed and conditions
under which the control objective is achievable as well as limitations imposed by
the controller structure are studied. The analysis of the non-adaptive closed-loop
system is crucial as it provides insights that can be used to deal with the unknown
parameter case. Subsequently, the structure of the non-adaptive controller along with
the certainty equivalence principle [58, 62] are used to design an adaptive control
algorithm to handle the unknown plant unknown disturbance case.
Consider the plant model (6.1) and disturbance (6.2) and the closed-loop system
architecture shown in Figs. 6.1 and 6.4. We solve the problem for discrete-time SISO
systems. The following assumption is made about the plant unmodeled dynamics
Δm(z).

Assumption 6.2 The multiplicative uncertainty Δm(z) is analytic in |z| ≥ ρ₀, for
some known ρ₀ ∈ [0, 1), and z^{−n*}Δm(z) is a proper transfer function.

6.5.1 Non-adaptive Case: Known Plant and Known Disturbance Frequencies

Let us suppose that the disturbance frequencies ω1 , . . . , ωn f and the plant model
G 0 (z) are perfectly known, and consider the structure of the classical MRAC scheme
[62] as shown in Fig. 6.4, where

u(t) = K_u(z, θ_u)[u(t)] + K_y(z, θ_y)[y(t)],

K_u(z, θ_u) = θ₁ᵀ α(z)/Λ(z), θ_u = θ₁ ∈ R^N,

K_y(z, θ_y) = θ₂ᵀ α(z)/Λ(z) + θ₃, θ_y = [θ₂ᵀ, θ₃]ᵀ ∈ R^{N+1}, (6.86)

α(z) = [z^{N−1}, z^{N−2}, …, z, 1]ᵀ, Λ(z) = z^N,

where N is the order of the controller. We can express the control law (6.86) as

u(t) = (θ₂ᵀα(z) + θ₃Λ(z))/(Λ(z) − θ₁ᵀα(z)) [y(t)]. (6.87)

By substituting (6.87) into (6.1), we obtain

y(t) = R₀(z)(Λ(z) − θ₁ᵀα(z)) / [R₀(z)(Λ(z) − θ₁ᵀα(z)) − k₀Z₀(z)(θ₂ᵀα(z) + θ₃Λ(z))(1 + Δm(z))] [d(t)]. (6.88)
To achieve the disturbance rejection objective, the controller parameters θ1 , θ2 , θ3
are to be chosen such that the sensitivity transfer function from d(t) to y(t) in (6.88)
is stable and has zero gain at the disturbance frequencies. This can be achieved if
the following matching equations are satisfied (the existence of solutions to these
equations will be investigated subsequently):

R₀(z)Ds(z)A(z) + B(z) = z^{n*}Λ(z), (6.89a)
θ₁ᵀα(z) = Λ(z) − Z₀(z)Ds(z)A(z), (6.89b)
θ₂ᵀα(z) + θ₃Λ(z) = −B(z)/k₀, (6.89c)

where n ∗ is the relative degree of G 0 (z) and Ds (z) is the internal model (generating
polynomial) of sinusoidal disturbances defined as


Ds(z) = ∏_{i=1}^{n_f} (z² − 2 cos(ωᵢ)z + 1), (6.90)

where the ωᵢ's, i = 1, 2, …, n_f, are the distinct frequencies in the modeled part of the
disturbances (in rad/sample), n_f is the total number of distinct frequencies, and
the polynomials A(z) (monic and of degree N − deg(Z₀) − deg(Ds)) and B(z) (of
degree at most N) are to be determined by solving (6.89a).
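The coefficients of Ds(z) can be built by convolving the second-order factors, and its defining property — Ds vanishes on the unit circle at each ωᵢ — is easy to check numerically (the function names are ours):

```python
import numpy as np

def internal_model(freqs):
    """Coefficients (highest power of z first) of
    Ds(z) = prod_i (z^2 - 2 cos(w_i) z + 1), with w_i in rad/sample."""
    Ds = np.array([1.0])
    for w in freqs:
        # each factor has roots exp(+j w) and exp(-j w) on the unit circle
        Ds = np.convolve(Ds, [1.0, -2.0*np.cos(w), 1.0])
    return Ds

def gain_on_circle(coeffs, w):
    """|p(e^{jw})| for a z-polynomial given highest power first."""
    return abs(np.polyval(coeffs, np.exp(1j*w)))
```

Because each factor contributes a conjugate pair of unit-circle roots, Ds(z) is monic of degree 2 n_f, matching the degree bookkeeping used in Lemma 6.9.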

Remark 6.16 In the absence of additive disturbances (i.e., when d(t) = 0), the gen-
erating polynomial of disturbances is Ds (z) = 1; hence, the control law (6.86), (6.89)
with N = deg(R0 ) − 1 = n p − 1 reduces to that of the classical MRAC scheme [62,
Sect. 7.2.2].

From (6.88) and (6.89), the sensitivity transfer function from d(t) to y(t) can be
expressed in terms of polynomials A(z) and B(z) as
 
y(t) = (R₀(z)A(z)Ds(z))/(z^{n*}Λ(z)) · (1 + Δm(z) B(z)/(z^{n*}Λ(z)))⁻¹ [d(t)], (6.91)

where Ds(z) vanishes at zᵢ = exp(±jωᵢ), i = 1, 2, …, n_f. The solvability of
(6.89) and the existence of polynomials A(z) and B(z) are the subject of the following
lemma.
Lemma 6.9 Consider the matching equations (6.89) and let A(z) be a monic poly-
nomial of degree N − deg(Z₀) − deg(Ds) and B(z) be a polynomial of degree at
most N. If N ≥ deg(R₀Ds) − 1 = n_p + 2n_f − 1, then for any given G₀(z), the sys-
tem of equations (6.89) is solvable. That is, the existence of polynomials A(z) and
B(z), and hence of controller parameters θ₁, θ₂, θ₃ satisfying (6.89), is guaranteed. In
addition, if N = n_p + 2n_f − 1, the solution is unique.
Proof The proof is given in [61]. 
Now, we can summarize the properties of the closed-loop system by the following
theorem.
Theorem 6.9 Consider the closed-loop system shown in Figs. 6.1 and 6.4 with
disturbance model (6.2), under Assumption 6.1, with (6.86) and (6.89). If N ≥ n p +
2n f − 1, and the stability condition
 
 
‖Δm(z) B(z)/(z^{n*}Λ(z))‖∞ < 1 (6.92)

is satisfied, then all signals in the closed-loop system are uniformly bounded, the
effect of the sinusoidal disturbances on the plant output is completely rejected, and
the plant output satisfies
   
lim_{t→∞} sup_{τ≥t} |y(τ)| ≤ c (1 + ‖B(z)/(z^{n*}Λ(z))‖₁) v₀, (6.93)

where v₀ = sup_τ |v(τ)| and c = ‖(1 + Δm(z) B(z)/(z^{n*}Λ(z)))⁻¹‖₁ is a finite positive
constant. Moreover, in the absence of unmodeled noise (i.e., if v(t) = 0), y(t) converges
to zero exponentially fast.
Proof The proof is given in [61]. 
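Small-gain conditions such as (6.92) can be checked numerically by gridding the unit circle; this only yields a lower-bound approximation of the H∞-norm, which is usually adequate for quick design iteration (the grid size is an arbitrary choice of ours):

```python
import numpy as np

def hinf_grid(num, den, n_grid=4096):
    """Approximate ||num(z)/den(z)||_inf for a stable discrete-time
    transfer function by sampling z = e^{jw} on w in [0, pi]."""
    w = np.linspace(0.0, np.pi, n_grid)
    z = np.exp(1j*w)
    return np.abs(np.polyval(num, z)/np.polyval(den, z)).max()
```

For instance, H(z) = 1/(z − 0.5) attains its peak gain 2 at z = 1, which the grid recovers exactly since w = 0 is a grid point.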
Let us examine how the solution of the matching equation (6.89) may affect the
performance and the relative stability of the closed-loop system. From (6.89a), the
following relation holds

R₀(z)A(z)Ds(z)/(z^{n*}Λ(z)) + B(z)/(z^{n*}Λ(z)) = 1; (6.94)

then, since Ds(z) equals zero at the disturbance frequencies ωᵢ,

B(z)/(z^{n*}Λ(z)) |_{ω=ωᵢ} = 1. (6.95)

Equation (6.94) also implies that at any frequency

| |R₀(z)A(z)Ds(z)/(z^{n*}Λ(z))| − |B(z)/(z^{n*}Λ(z))| | ≤ |R₀(z)A(z)Ds(z)/(z^{n*}Λ(z)) + B(z)/(z^{n*}Λ(z))| = 1, (6.96)

that is, the magnitudes of R₀(z)A(z)Ds(z)/(z^{n*}Λ(z)) and B(z)/(z^{n*}Λ(z)) differ
at most by one at any frequency. The constraints (6.95) and (6.96), however, do
not limit the magnitude of each transfer function. Indeed, imposing the constraint
(6.95) at the disturbance frequencies may significantly increase the magnitudes of
R₀(z)A(z)Ds(z)/(z^{n*}Λ(z)) and B(z)/(z^{n*}Λ(z)) at other frequencies ω ≠ ωᵢ. In
other words, a controller that completely rejects sinusoidal disturbances may drasti-
cally amplify the effect of broadband noise v(t) on the plant output y(t), and hence
destroy the tracking performance. Moreover, in the presence of plant unmodeled
dynamics, a polynomial B(z) satisfying (6.95) may violate the stability condition
(6.92) or may significantly reduce the margin of stability.
One way to address this issue is to over-parameterize the controller. From
Lemma 6.9, if N = n p + 2n f − 1, there is a unique solution to (6.89a); how-
ever, for N > n p + 2n f − 1, the polynomial equation (6.89a) has infinitely many
solutions, among which we can choose one that minimizes the peak magnitude of
B(z)/(z^{n*}Λ(z)), whereby the stability margin and the tracking performance can be
improved. The existence of the best parameter vector θ = [θ₁ᵀ, θ₂ᵀ, θ₃]ᵀ is the subject
of the following lemma.
Lemma 6.10 Consider the closed-loop system shown in Figs. 6.1 and 6.4 with dis-
turbance model (6.2), under Assumption 6.1, with (6.86) and (6.89). If the conditions
in Theorem 6.9 are satisfied, then there exists a θ* with ‖θ*‖ ≤ r₀, for some r₀ > 0,
that solves the following constrained convex optimization problem:

θ* = arg min_{θ∈Ω} ‖B(z)/(z^{n*}Λ(z))‖∞, Ω = {θ | ‖θ‖ ≤ r₀, (6.89) holds}. (6.97)

Proof The proof is given in [61]. 


Remark 6.17 As shown in [61], if N ≥ n p + 2n f − 1, then the set Ω is non-empty,
compact, and convex, hence the cost function in (6.97) attains its minimum on Ω.
Also, the optimization problem can be formulated as a linear matrix inequality (LMI)
feasibility problem and can be solved using efficient LMI solvers.
The following simple example illustrates the above results.
Example 6.5 Consider the following minimum-phase unstable open-loop plant
model:
G₀(z) = k₀Z₀(z)/R₀(z) = 2.63(z − 0.13)/(z² − 1.91z + 1.44), (6.98)

with sampling period of 0.001 s, and suppose that the additive output distur-
bances have two sinusoidal components at frequencies ω1 = 0.2416 rad/sample


Table 6.1 The H∞-norm ‖B(z)/(z^{n*}Λ(z))‖∞ from the solution of (6.97)

N     ‖B(z)/(z^{n*}Λ(z))‖∞
5     60.89 (= 35.69 dB)
7     15.67 (= 23.90 dB)
9     8.00 (= 18.07 dB)
11    5.15 (= 14.23 dB)
13    3.71 (= 11.38 dB)
15    2.86 (= 9.13 dB)
17    2.31 (= 7.24 dB)
19    1.94 (= 5.75 dB)

(= 38.45 Hz) and ω₂ = 0.6357 rad/sample (= 101.17 Hz). From Theorem 6.9, if
the controller order satisfies N ≥ 5, then the sinusoidal disturbances are completely
rejected. Table 6.1 shows the peak value of |B(z)/(z^{n*}Λ(z))| from the solution of
the optimization problem (6.97) for different values of the controller order N. Although
for any N ≥ 5 the sinusoidal terms of the disturbance are perfectly rejected, for
small values of N the transfer function B(z)/(z^{n*}Λ(z)) has large gains at high
frequencies, making the closed-loop system very sensitive to high-frequency unmod-
eled dynamics (see the stability condition (6.92)); moreover, it may drastically
amplify the effect of high-frequency noise on the plant output and therefore
destroy the tracking performance. Figure 6.7 shows the magnitude Bode plots of
B(z)/(z^{n*}Λ(z)) and R₀(z)A(z)Ds(z)/(z^{n*}Λ(z)) based on the solution of (6.97), for
three different values of N. It should be noted that, according to (6.95), at the dis-
turbance frequencies we have B(z)/(z^{n*}Λ(z)) = 1, and ‖B(z)/(z^{n*}Λ(z))‖∞ cannot
be made less than one. Also, the inequality (6.96) implies that at any frequency,
|R₀(z)A(z)Ds(z)/(z^{n*}Λ(z))| and |B(z)/(z^{n*}Λ(z))| differ at most by one.

Fig. 6.7 Magnitude Bode plots of B(z)/(z^{n*}Λ(z)) and R₀(z)A(z)Ds(z)/(z^{n*}Λ(z)) from the solution
of (6.97), for different values of N. The controller is designed to reject the disturbances at frequencies
ω₁ = 38.45 Hz and ω₂ = 101.17 Hz. Increasing the controller order N reduces the H∞-norm of
B(z)/(z^{n*}Λ(z)) and thereby improves performance and robustness
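The rad/sample and Hz figures quoted in Example 6.5 are related through the sampling period Ts by f = ω/(2πTs); a quick check (the function name is ours):

```python
import numpy as np

def rad_per_sample_to_hz(omega, Ts):
    """Convert a discrete-time frequency (rad/sample) to Hz for a
    sampling period Ts in seconds: f = omega/(2*pi*Ts)."""
    return omega/(2.0*np.pi*Ts)
```

With Ts = 0.001 s, ω₁ = 0.2416 rad/sample gives 38.45 Hz and ω₂ = 0.6357 rad/sample gives 101.17 Hz, matching the values in the example.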
The above results reveal the properties of the control law (6.86) and provide insight
into the achievable level of performance in the ideal case when the parameters of the
plant model and the disturbance frequencies are perfectly known.

6.5.2 Adaptive Case: Unknown Plant and Unknown Disturbance

In this section, we design an adaptive version of the control law (6.86) to solve
the problem of rejecting unknown disturbances acting on unknown minimum-phase
systems.
The adaptive filters in Fig. 6.4 are given by

u(t) = K_u(z, θ̂_u(t))[u(t)] + K_y(z, θ̂_y(t))[y(t)],

K_u(z, θ̂_u(t)) = θ̂₁(t)ᵀ α(z)/Λ(z), θ̂_u(t) = θ̂₁(t) ∈ R^N,

K_y(z, θ̂_y(t)) = θ̂₂(t)ᵀ α(z)/Λ(z) + θ̂₃(t), θ̂_y(t) = [θ̂₂(t)ᵀ, θ̂₃(t)]ᵀ ∈ R^{N+1}, (6.99)

α(z) = [z^{N−1}, z^{N−2}, …, z, 1]ᵀ, Λ(z) = z^N.

In order to design a robust parameter estimator, we first develop a suitable
parameterization of the closed-loop system in terms of the unknown controller
parameters, based on which an estimate of the controller parameter vector θ̂(t) =
[θ̂₁(t)ᵀ, θ̂₂(t)ᵀ, θ̂₃(t)]ᵀ is generated at each time t. The following lemma describes
the parametric model to be used for parameter estimation.
Lemma 6.11 The closed-loop system (6.1),(6.99) is parameterized as

ζ(t) = φ(t)ᵀ θ̄* + η(t), (6.100)

where θ̄* = [θ₁*ᵀ, θ₂*ᵀ, θ₃*, 1/k₀]ᵀ ∈ R^{2N+2} is the unknown parameter vector to be
identified, ζ(t) and φ(t) are known measurable signals, and η(t) is an unknown
function representing the modeling error and noise terms, where

ζ(t) = z^{−n*}[u(t)],

φ(t) = [α(z)ᵀ/(z^{n*}Λ(z)) [u(t)], α(z)ᵀ/(z^{n*}Λ(z)) [y(t)], z^{−n*}[y(t)], y(t)]ᵀ,

η(t) = −Δm(z) (Z₀(z)A(z)Ds(z))/(z^{n*}Λ(z)) [u(t)] − (R₀(z)A(z)Ds(z))/(k₀z^{n*}Λ(z)) [v(t)] − ε_s(t), (6.101)

where ε_s(t) = (R₀(z)A(z)Ds(z))/(k₀z^{n*}Λ(z)) [d_s(t)] is an exponentially decaying term.

Proof The proof is given in [61]. 

Based on the affine parametric model (6.100) a wide class of robust parameter
estimators can be employed to generate an estimate of the unknown parameter vector
θ̄ ∗ at each time t. Let θ̄ˆ (t − 1) be the most recent estimate of θ̄ ∗ , then the predicted
value of the signal ζ (t) based on θ̄ˆ (t − 1) is generated as

ζ̂(t) = φ(t)ᵀ θ̄̂(t − 1). (6.102)

The normalized estimation error is defined as

ε(t) = (ζ(t) − ζ̂(t))/m²(t), (6.103)

where m²(t) = 1 + γ₀ φ(t)ᵀφ(t), with γ₀ > 0, is a normalizing signal. To generate
θ̄̂(t), we consider the robust pure least-squares algorithm [62]:

P(t) = P(t − 1) − (P(t − 1)φ(t)φ(t)ᵀP(t − 1))/(m²(t) + φ(t)ᵀP(t − 1)φ(t)),
θ̄̂(t) = proj(θ̄̂(t − 1) + P(t)φ(t)ε(t)), (6.104)

where P(0) = P(0)ᵀ > 0 and the projection operator proj(·) is used to guarantee that
θ̄̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^{2N+2} | θᵀθ ≤ θ²_max}, (6.105)

where θ_max > 0 is such that the desired parameter vector of the optimum filter, θ̄*,
belongs to S. The projection of the estimated parameter vector into S may be imple-
mented as [62, 65]:

χ(t) = θ̄̂(t − 1) + P(t)φ(t)ε(t),
ρ̄(t) = P^{−1/2}(t)χ(t),
ρ(t) = ρ̄(t) if ρ̄(t) ∈ S̄, or the ⊥ projection of ρ̄(t) on S̄ if ρ̄(t) ∉ S̄, (6.106)
θ̄̂(t) = P^{1/2}(t)ρ(t),

where the set S is transformed into S̄ such that if χ ∈ S, then P^{−1/2}χ ∈ S̄ [65].
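One step of the recursion (6.104) can be sketched as follows; for brevity we omit the projection of (6.106), which is a simplification of the scheme above, and the function name is ours:

```python
import numpy as np

def rls_step(theta, P, phi, zeta, gamma0=1.0):
    """One normalized pure least-squares update: covariance update
    followed by the parameter update driven by the normalized
    estimation error.  The projection step of (6.106) is omitted."""
    m2 = 1.0 + gamma0*float(phi @ phi)            # m^2(t)
    eps = (zeta - float(phi @ theta))/m2          # normalized error (6.103)
    Pphi = P @ phi
    P = P - np.outer(Pphi, Pphi)/(m2 + float(phi @ Pphi))
    theta = theta + (P @ phi)*eps
    return theta, P
```

On noiseless data ζ(t) = φ(t)ᵀθ̄* with a persistently exciting regressor, the estimate converges toward θ̄* while P(t) shrinks, which is the behavior the analysis above relies on.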

Remark 6.18 The last element of θ̄̂(t) is the estimate of 1/k₀ (the reciprocal of the
plant model gain k₀ in (6.85)). The adaptive law provides the estimate of 1/k₀ at
each time, but this value is discarded, as the controller (6.99) depends only on the
first 2N + 1 elements of θ̄̂(t).

The following theorem summarizes the properties of the adaptive control law.

Theorem 6.10 Consider the closed-loop system (6.1), (6.99), (6.102)–(6.106), under
Assumptions 6.1 and 6.2, and choose N ≥ n̄ p + 2n̄ f − 1, where n̄ p is an upper
bound for deg(R0 ) and n̄ f is an upper bound for the number of distinct frequencies
in the disturbance. If Δm (z) satisfies a norm-bound condition, then all signals in the
closed-loop system are uniformly bounded and the plant output satisfies

lim sup_{T→∞} (1/T) ∑_{τ=t}^{t+T−1} |y(τ)|² ≤ c(μΔ² + v₀²), (6.107)

for any t ≥ 0 and some finite positive constant c which is independent of t, T , Δm (z),
and v(t), where μΔ is a constant proportional to the size of the plant unmodeled
dynamics Δm (z), and v0 = supτ |v(τ )|. In addition, in the absence of modeling error
and noise (i.e., if Δm (z) = 0 and v(t) = 0), the adaptive control law guarantees the
convergence of y(t) to zero.

Proof The proof is obtained by following the same steps as those in the proof of
Theorem 7.2 and Theorem 7.3 in [62, 77]. 

Remark 6.19 The minimum-phase assumption on the plant model is the main limitation
of the MRAC scheme. The location of the zeros of a discretized system critically
depends on the type of sampling as well as on the sampling period. A minimum-
phase continuous-time system may become a non-minimum-phase discrete-time
system after discretization with a sample-and-hold device, particularly when the rel-
ative degree of the continuous-time system is greater than two and the sampling
period is sufficiently small [78, Sect. 2]. It is also possible for a non-minimum-phase
continuous-time system to become a minimum-phase discrete-time system [78, Sect.
2]. There are some criteria under which all zeros of a sampled transfer function are
stable; these criteria, however, are restrictive [79].

6.6 Numerical Simulation

In this section, we demonstrate the performance of the adaptive control schemes
proposed in Sects. 6.3, 6.4, and 6.5.

6.6.1 SISO Discrete-Time Systems with Known Plant Model

Consider the open-loop plant model shown in Fig. 6.1 and the controller
structure in Fig. 6.3 with the disturbance model (6.2). Suppose

G₀(z) = −0.00146(z − 0.1438)(z − 1) / ((z − 0.7096)(z² − 0.04369z + 0.01392)), (6.108)

with sampling period of 1/480 s, is the known stable modeled part of the plant, and

Δm(z) = −0.0001/(z + 0.99)²,

is the unknown plant multiplicative uncertainty, which is dominant at high frequencies
and negligible at low frequencies, and

d(t) = 0.7 sin(ω1 t + π/3) + 0.5 sin(ω2 t + π/4) + v(t) (6.109)

is an unknown additive output disturbance, where ω₁ = 0.0521 rad/sample (= 25 rad/s),
ω₂ = 0.4688 rad/sample (= 225 rad/s), and v(t) is a zero-mean Gaussian
noise with standard deviation 0.02.
The following design parameters are assumed for the adaptive law (6.13), (6.17)–
(6.21): N = 50, γ0 = 1, P(0) = 20I , θ̂(0) = 0, F(z) = 100. Fig. 6.8 shows the per-
formance of the adaptive control scheme, where at t = 10 s the feedback is switched
on and the control input u(t) is applied. In order to demonstrate the behavior of the

Fig. 6.8 Simulation results for a SISO discrete-time system with known stable plant model and
the adaptive control scheme (6.13), (6.17)–(6.21). The control input is applied at t = 10 s. New
unknown sinusoidal terms are abruptly added to the existing disturbance at t = 30 s; the adaptive
controller updates its parameters at this time to suppress the effect of new terms on the plant output

adaptive controller when the disturbance characteristics change, at t = 30 s two
new sinusoidal terms, 0.6 sin(ω₃t − π/6) with ω₃ = 0.1771 rad/sample (= 85 rad/s)
and 0.4 sin(ω₄t + π/2) with ω₄ = 0.2604 rad/sample (= 125 rad/s), are abruptly added
to the existing disturbance (6.109). At this time, the controller re-adjusts its parameters
to reject the new sinusoidal terms. Such performance is achieved because the controller
order N is chosen large enough to handle multiple disturbance frequencies.

6.6.2 SISO Continuous-Time Systems with Known Plant Model

Consider the open-loop plant model shown in Fig. 6.1 and the controller
structure in Fig. 6.3 with the disturbance model (6.2). Suppose

G₀(s) = 0.5(s − 0.2)/(s² + s + 1.25)

is the known stable modeled part of the plant, and Δm (s) = −0.001s is the unknown
unmodeled dynamics with small magnitude at low frequencies and large magnitude
at high frequencies. We also assume that

d(t) = 0.6 sin(ω1 t + π/4) + 0.7 sin(ω2 t + π/2) + v(t),

is an unknown additive output disturbance, where ω₁ = 70 rad/s, ω₂ = 187 rad/s,
and v(t) is a zero-mean Gaussian noise with standard deviation 0.02.
Let us suppose the following partial knowledge about the unknown disturbance
is given: the disturbance is dominated by at most 5 distinct frequencies, and the largest
frequency of the disturbance is less than 600 rad/s. This information helps in the selection
of some design parameters. In order to show the effect of open-loop plant pre-filtering
by a suitable LTI filter F(s), we compare the performance of the adaptive law (6.33),
(6.37)–(6.41), with and without a compensator F(s).
The magnitude of the open-loop plant G 0 (s) is relatively small at the expected
frequency range of the disturbance, which may drastically slow down the adaptation
and adversely affect the performance of the proposed adaptive control scheme. In
order to increase the plant gain over the frequency range of interest of the disturbance,
we use the procedure proposed in Sect. 6.3.2 and design a stable filter F(s) such that
the compensated plant model G 0 (s)F(s) has a large enough gain over the expected
frequency range of disturbance. The filter F(s) designed for this plant is given by

F(s) = α_0^2 (s^2 + s + 1.25)/((s + α_0)^2 (s + 0.2)),   α_0 = 500.   (6.110)

Figure 6.9 shows the magnitude Bode plots of the original uncompensated open-loop
plant G_0(s) and of the compensated version G_0(s)F(s).
6 Robust Adaptive Disturbance Attenuation 181

Fig. 6.9 The magnitude plots of G 0 (s) and G 0 (s)F(s). The filter F(s) increases the open-loop
plant gain over the expected range of disturbance frequencies. Open-loop plant filtering increases
the excitation level of the regressor (6.35) and improves the disturbance rejection performance of
the adaptive controller
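The gain increase provided by the filter can be checked directly from the two transfer functions; the following sketch (pure Python, not part of the chapter) evaluates |G0(jω)| and |G0(jω)F(jω)| at the two disturbance frequencies:

```python
# Evaluate the frequency responses of G0(s) and of the compensated plant
# G0(s)F(s), with F(s) from (6.110), at w1 = 70 rad/s and w2 = 187 rad/s.
def G0(s):
    return 0.5 * (s - 0.2) / (s**2 + s + 1.25)

def F(s, a0=500.0):  # the filter (6.110)
    return a0**2 * (s**2 + s + 1.25) / ((s + a0)**2 * (s + 0.2))

for w in (70.0, 187.0):
    s = 1j * w
    print(f"w = {w:3.0f} rad/s: |G0| = {abs(G0(s)):.4f}, |G0*F| = {abs(G0(s) * F(s)):.4f}")
```

At both frequencies the compensated gain is well over an order of magnitude larger than the uncompensated one, which is what raises the excitation level of the regressor and speeds up the adaptation.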

The following design parameters are assumed for the adaptive law (6.33), (6.37)–
(6.41): N = 20, γ0 = 1, λ = 500, P(0) = 500I , and θ̂ (0) = 0. Figure 6.10 shows the
plant output y(t), where the control input u(t) is applied at t = 5 s. Figure 6.10a shows
the performance of the proposed adaptive control scheme without filter F(s) (i.e.,
F(s) = 1). The rate of adaptation in this case is very small as the plant model G 0 (s)
has a very small gain at the frequencies of the disturbance ω1 = 70 and ω2 = 187 rad/s
(see Fig. 6.9). It is clear that the adaptive controller with F(s) = 1 and N = 20 is
pretty slow and does not provide significant improvement in performance. We should
note that by increasing the size of the adaptive filter, N , the performance shown in
Fig. 6.10a can be improved but not as effectively as with filter F(s) in (6.110) for
the same filter order. As shown in Fig. 6.10b, the periodic terms have been quickly
rejected when the control input is applied at t = 5 s with filter (6.110) in the loop.

6.6.3 MIMO Discrete-Time Systems with Known Plant Model

Consider the following open-loop plant model shown in Fig. 6.1 and the controller
structure in Fig. 6.3 with disturbance model (6.2). Suppose

G_0(z) = 0.01(z − 1)/((z − 0.75)(z^2 + 1.3z + 0.8)) · [ z − 2,  z + 0.5 ;  z − 2,  z + 1 ]

with sampling period of 0.001 s, is the known stable modeled part of the plant, and
182 S. Jafari and P. Ioannou

Fig. 6.10 Simulation results for a SISO continuous-time system with known stable plant model
and the adaptive control scheme (6.33), (6.37)–(6.41). The performance of the proposed scheme
for N = 20 a without filter F(s) (i.e., F(s) = 1), the speed of adaptation is pretty low; b with
filter F(s) in (6.110), much better performance is achieved. In both cases, the control signal u(t) is
applied at t = 5 s

Δ_m(z) = (−10^{−5}/(z + 0.999)^2) I,

is the unknown plant multiplicative uncertainty, which has a negligible size at low
frequencies and a relatively large size near the Nyquist frequency, and the unknown
additive output disturbances applied to the two output channels are

d1 (t) = 0.6 sin(ω11 t) + 0.6 sin(ω12 t + π/8) + 0.6 sin(ω13 t + π/6) + v1 (t),
d2 (t) = 0.5 sin(ω21 t + π/4) + 0.5 sin(ω22 t + π/2) + 0.5 sin(ω23 t + π/5) + v2 (t),

where ω11 = 0.03 rad/sample (= 30 rad/s), ω12 = 0.095 rad/sample (= 95 rad/s),


ω13 = 0.18 rad/sample (= 180 rad/s), ω21 = 0.02 rad/sample (= 20 rad/s), ω22 =
0.11 rad/sample (= 110 rad/s), ω23 = 0.21 rad/sample (= 210 rad/s), and v1 (t) and
v2 (t) are zero-mean Gaussian with standard deviation 0.02.
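The paired rad/sample and rad/s values above are consistent with the stated 0.001 s sampling period, since ω[rad/s] = ω[rad/sample]/Ts; a quick arithmetic check:

```python
# Discrete frequency (rad/sample) divided by the sampling period Ts gives the
# continuous-time frequency in rad/s.
Ts = 0.001
pairs = [(0.03, 30.0), (0.095, 95.0), (0.18, 180.0),
         (0.02, 20.0), (0.11, 110.0), (0.21, 210.0)]
for w_d, w_c in pairs:
    assert abs(w_d / Ts - w_c) < 1e-6
print("all frequency pairs consistent")
```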
The sigma plot of the open-loop plant G 0 (z) shows large gains of G 0 (z) at high
frequencies and low gains at low frequencies. With such a system, the closed-loop
system is vulnerable to high-frequency plant unmodeled dynamics and has poor
disturbance rejection performance. A suitable filter F(z) is therefore needed to properly
shape the singular values of the plant model.
For design of filter F(z), we use the inner-outer factorization of G 0 (z). Since
the plant has zeros on the unit circle, we apply the algorithm proposed in [75] to
the perturbed plant model G̃ 0 (z) obtained by scaling the zeros by factor 0.99 to

Fig. 6.11 The maximum and minimum singular values of the original uncompensated plant model
G 0 (z) and those of the compensated plant G 0 (z)F(z). The singular values of G 0 (z)F(z) are almost
aligned

move them slightly away from the unit circle. Then G̃_0(z) = G̃_in(z)G̃_out(z), where
the inner factor G̃_in(z) is a stable proper all-pass filter and the outer factor G̃_out(z) is
stable and proper with a stable right inverse. Then, we choose F(z) = κ0 f(z) G̃_out^{−1}(z),
where κ0 = 0.1 and f (z) is a scalar low-pass filter with DC gain of one designed to
compensate the effect of high-frequency modeling uncertainties. We assume f (z) is
a third-order Butterworth low-pass filter with cutoff frequency of 630 rad/s, given by

f(z) = 0.018099(z + 1)^3/((z − 0.5095)(z^2 − 1.251z + 0.5457)).

Then, the compensated plant G_0(z)F(z) has a gain of κ0 = 0.1 (−20 dB) in
every direction over the expected frequency range of the disturbances. The selection of
the bandwidth of f(z) and the gain κ0 is based on partial knowledge of the size
and the frequency range where the modeling error may be dominant. Large values of
these two parameters can adversely affect the stability margin. Figure 6.11 shows the
maximum and minimum singular values of the original uncompensated plant model
G 0 (z) and those of the compensated plant G 0 (z)F(z).
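As a quick numerical check on the quoted coefficients of f(z) (a sketch, not part of the chapter's design procedure; the probe frequency of 2 rad/sample is an arbitrary choice above the cutoff): the filter should have unit DC gain, a transmission zero at the Nyquist frequency z = −1, and strong attenuation well above the 630 rad/s cutoff:

```python
import cmath

def f(z):  # the third-order Butterworth low-pass filter quoted above
    return 0.018099 * (z + 1)**3 / ((z - 0.5095) * (z**2 - 1.251*z + 0.5457))

print(abs(f(1.0)))              # DC gain (z = e^{j0}): approximately 1
print(abs(f(-1.0)))             # Nyquist frequency (z = e^{j*pi}): exactly 0
print(abs(f(cmath.exp(2.0j))))  # 2 rad/sample = 2000 rad/s at Ts = 1 ms: small
```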
The following design parameters are assumed for the adaptive law (6.53), (6.58)–
(6.62): N = 60, γ0 = 1, P^{−1}(0) = 0.01I, and θ̂(0) = 0. To demonstrate the performance
of the adaptive law when new disturbance terms are abruptly added to the plant
output, we assume that new unknown disturbance terms 0.9 sin(ω14 t + π/7), ω14 =
0.103 rad/sample (= 103 rad/s) and 0.9 sin(ω24 t + π/3), ω24 = 0.128 rad/sample
(= 128 rad/s), are added to the output of channel 1 and 2, respectively, at time
t = 25 s. Figure 6.12 shows the performance of the proposed adaptive control scheme
with the above filter F(z). After closing the feedback loop at t = 10 s, the controller
quickly adjusts its parameters to reject the sinusoidal components of the disturbance.

Fig. 6.12 Simulation results for a MIMO discrete-time system with known stable plant model and
the adaptive control scheme (6.53), (6.58)–(6.62). The control input is applied at t = 10 s. At time
t = 25 s, new unknown sinusoidal terms are abruptly added to the existing disturbance

At t = 25 s, the controller re-adjusts its parameters to counteract the effect of new


disturbance terms.

6.6.4 SISO Discrete-Time Systems with Unknown Plant Model

Consider the following open-loop plant model shown in Fig. 6.1 and the controller
structure in Fig. 6.4 with disturbance model (6.2). Suppose

G_0(z) = k_0 Z_0(z)/R_0(z) = 2.63(z − 0.13)/(z^2 − 1.91z + 1.44),

with sampling period of 0.001 s, is the unknown unstable minimum-phase modeled


part of the plant, and

d(t) = 1.2 sin(ω1 t + π/3) + 0.8 sin(ω2 t − π/4) + v(t)

is an unknown additive output disturbance, where ω1 = 0.127 rad/sample
(= 127 rad/s), ω2 = 0.225 rad/sample (= 225 rad/s), and v(t) is a zero-mean Gaussian
noise with standard deviation 0.02.
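That this modeled plant is unstable yet minimum phase can be read off from its pole and zero locations; a short check (quadratic formula, pure Python, not from the chapter):

```python
import cmath

# Poles of G0(z): roots of z^2 - 1.91 z + 1.44; zero of G0(z): z = 0.13.
disc = cmath.sqrt(1.91**2 - 4.0 * 1.44)           # negative discriminant: complex pair
poles = [(1.91 + disc) / 2.0, (1.91 - disc) / 2.0]
print([abs(p) for p in poles])                    # both magnitudes equal sqrt(1.44) = 1.2 > 1
print(abs(0.13) < 1.0)                            # zero inside the unit circle: True
```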
The following design parameters are assumed for the adaptive law (6.99), (6.102)–
(6.106): N = 30, γ0 = 1, P(0) = 20I , θ̂i (0) = 0, for i = 1 : 2N + 1, θ̂2N +2 (0) = 5.
The control input is applied at t = 10 s, and at time t = 30 s a new sinusoidal

Fig. 6.13 Simulation results for a SISO discrete-time system with unknown unstable minimum-
phase plant model and the adaptive control scheme (6.99), (6.102)–(6.106). The control input is
applied at t = 10 s and at time t = 30 s, a new unknown sinusoidal term is abruptly added to the
existing disturbance

disturbance term 1.5 sin(ω3 t − π/5), where ω3 = 0.325 rad/sample (= 325 rad/s),
is abruptly added to the existing output disturbance (Fig. 6.13).

6.7 Conclusion

The problem of attenuating unknown narrowband disturbances in the presence of
broadband random noise and plant unmodeled dynamics for SISO and MIMO plants
is examined in both continuous- and discrete-time formulations. We showed that by
using proper plant pre-filtering, an over-parameterization of the controller parameters,
and a robust adaptive law for parameter estimation, we can achieve the following:
◦ guaranteed stability, provided the unmodeled dynamics are small in the low-frequency range;
◦ attenuation of the periodic components of the disturbance despite the presence
of noise, unmodeled dynamics, and time-varying frequencies of the periodic
disturbance terms;
◦ improved performance as well as stability margin, especially in cases where the
zeros of the plant are close to the zeros of the internal model of some of the
disturbance terms.
Suppression of unknown additive sinusoidal disturbances acting on a class of
discrete-time SISO systems with unknown parameters in the presence of unstructured
modeling uncertainties is also studied. It is shown that an over-parameterized
version of the classical robust MRAC can be employed for rejection of the unknown
periodic disturbance components in the plant output without amplifying the noise.

This capability is achieved by increasing the number of the parameters of the con-
troller without changing the structure of the control law. The use of the reference
model structure, however, restricts the class of dominant plant models to those that
are minimum phase. Numerical simulations are presented to demonstrate the perfor-
mance of the proposed schemes.

References

1. Jafari, S., Ioannou, P., Fitzpatrick, B., Wang, Y.: IEEE Trans. Autom. Control 60(8), 2166
(2015)
2. Jafari, S., Ioannou, P.: Automatica 70, 32 (2016)
3. Chen, X., Tomizuka, M.: IEEE Trans. Control Syst. Technol. 20(2), 408 (2012)
4. Kim, W., Chen, X., Lee, Y., Chung, C.C., Tomizuka, M.: Mech. Syst. Signal Process. 104, 436
(2018)
5. Gan, W.C., Qiu, L.: IEEE/ASME Trans. Mechatron. 9(2), 436 (2004)
6. Houtzager, I., van Wingerden, J.W., Verhaegen, M.: IEEE Trans. Control Syst. Technol. 21(2),
347 (2013)
7. Perez-Arancibia, N.O., Gibson, J.S., Tsao, T.C.: IEEE/ASME Trans. Mechatron. 14(3), 337
(2009)
8. Preumont, A.: Vibration Control of Active Structures. Springer, Berlin (2011)
9. Orzechowski, P.K., Chen, N.Y., Gibson, J.S., Tsao, T.C.: IEEE Trans. Control Syst. Technol.
16(2), 255 (2008)
10. Silva, A.C., Landau, I.D., Ioannou, P.: IEEE Trans. Control Syst. Technol. (2015)
11. Landau, I.D., Silva, A.C., Airimitoaie, T.B., Buche, G., Noe, M.: Eur. J. Control. 19(4), 237
(2013)
12. Landau, I.D., Alam, M., Martinez, J.J., Buche, G.: IEEE Trans. Control Syst. Technol. 19(6),
1327 (2011)
13. Rudd, B.W., Lim, T.C., Li, M., Lee, J.H.: J. Vib. Acoust. 134(1), 1 (2012)
14. Uchida, A.: Optical Communication with Chaotic Lasers: Applications of Nonlinear Dynamics
and Synchronization. Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim (2012)
15. Andrews, L.C., Phillips, R.L.: Laser Beam Propagation through Random Media. SPIE (1998)
16. Fujii, T., Fukuchi, T.: Laser Remote Sensing. CRC, Boca Raton (2005)
17. Roggemann, M.C., Welsh, B.: Imaging Through Turbulence. CRC Press, Boca Raton (1996)
18. Vij, D.R., Mahesh, K.: Medical Applications of Lasers. Springer, Berlin (2002)
19. Baranec, C.: Astronomical adaptive optics using multiple laser guide stars, Ph.D Dissertation,
University of Arizona (2007)
20. Tyson, R.: Principles of Adaptive Optics. CRC Press, Boca Raton (2010)
21. Watkins, R.J., Agrawal, B., Shin, Y., Chen, H.J.: In: 22nd AIAA International Communications
Satellite Systems Conference (2004)
22. Ulbrich, H., Gunthner, W.: IUTAM Symposium on Vibration Control of Nonlinear Mechanisms
and Structures. Springer, Netherlands (2005)
23. Roesch, P., Allongue, M., Achache, M.: In: Proceedings of the 9th European Rotorcraft Forum,
pp. 1–19. Cernobbio, Italy (1993)
24. Bittanti, S., Moiraghi, L.: IEEE Trans. Control Syst. Technol. 2(4), 343 (1994)
25. Friedmann, P.P., Millot, T.A.: J. Guidance Control Dyn. 18, 664 (1995)
26. Patt, D., Liu, L., Chandrasekar, J., Bernstein, D.S., Friedmann, P.P.: J. Guidance Control Dyn.
28(5), 918 (2005)
27. Lau, J., Joshi, S.S., Agrawal, B.N., Kim, J.W.: J. Guidance Control Dyn. 29(4), 792 (2006)
28. Nelson, P.A., Elliott, J.C.: Active Control of Sound. Academic, London (1992)

29. Benesty, J., Sondhi, M.M., Huang, Y.: Springer Handbook of Speech Processing. Springer,
Berlin (2008)
30. Emborg, U., Ross, C.F.: In: Proceedings of the Recent Advances in Active control Sound and
Vibrations. Blacksburg (1993)
31. Eriksson, L.J.: Sound Vibra. 22, 2 (1988)
32. Kuo, S.M., Morgan, D.R.: Active Noise Control Systems, Algorithms and DSP Implementa-
tions. Wiley-Interscience, New York (1996)
33. Elliott, S.J., Nelson, P.A.: Electron. Commun. Eng. J. 2(4), 127 (1990)
34. Elliott, J.C., Nelson, P.A.: IEEE Signal Process. Mag. 10(4), 12 (1993)
35. Elliott, S.J., Nelson, P.A., Stothers, I.M., Boucher, C.C.: J. Sound Vib. 140(2), 219 (1990)
36. Sutton, T.J., Elliott, S.J., McDonald, A.M., Saunders, T.J.: Noise Control Eng. J. 42(4), 137
(1994)
37. Amara, F.B., Kabamba, P., Ulsoy, A.: J. Dyn. Syst. Meas. Contr. 121(4), 655 (1999)
38. Wang, Y., Liu, L., Fitzpatrick, B.G., Herrick, D.: In: Proceedings of the Directed Energy System
Symposium (2007)
39. Pulido, G.O., Toledo, B.C., Loukianov, A.G.: In: Proceedings of the 44th IEEE Conference on
Decision and Control, pp. 4821–4826 (2005)
40. Kinney, C.E., de Callafon, R.A.: Int. J. Adapt. Cont. Sig. Process. 25, 1006 (2011)
41. Bodson, M.: Int. J. Adapt. Cont. Sig. Process. 19, 67 (2005)
42. Kim, W., Kim, H., Chung, C.C., Tomizuka, M.: IEEE Trans. Control Syst. Technol. 19(5),
1296 (2011)
43. Amara, F.B., Kabamba, P., Ulsoy, A.: J. Dyn. Syst. Meas. Contr. 121(4), 648 (1999)
44. Aranovskiy, S., Freidovich, L.B.: Eur. J. Control. 19(4), 253 (2013)
45. Marino, R., Tomei, P.: Automatica 49(5), 1494 (2013)
46. Landau, I.D., Alma, M., Constantinescu, A., Martinez, J.J., Noe, M.: Control. Eng. Pract.
19(10), 1168 (2011)
47. Youla, D., Bongiorno, J., Jabr, H.: IEEE Trans. Autom. Control 21(1), 3 (1976)
48. Landau, I.D.: Int. J. Control 93(2), 204 (2020)
49. Marino, R., Santosuosso, G.L.: IEEE Trans. Autom. Control 52(2), 352 (2007)
50. Bodson, M., Douglas, S.: Automatica 33, 2213 (1997)
51. Chanderasekar, J., Liu, L., Patt, D., Friedmann, P.P., Bernstein, D.S.: IEEE Trans. Control Syst.
Technol. 14(6), 993 (2006)
52. Feng, G., Palaniswami, M.: IEEE Trans. Autom. Control 37(8), 1220 (1992)
53. Palaniswami, M.: IEE Proc. D Control Theory Appl. 140(1), 51 (1993)
54. Pigg, S., Bodson, M.: IEEE Trans. Autom. Control 18(4), 822 (2010)
55. Pigg, S., Bodson, M.: Asian J. Control 15, 1 (2013)
56. Basturk, H.I., Krstic, M.: Automatica 50(10), 2539 (2014)
57. Basturk, H.I., Krstic, M.: Automatica 58, 131 (2015)
58. Ioannou, P.A., Sun, J.: Robust Adaptive Control. Prentice-Hall, Upper Saddle River (1996)
59. Jafari, S., Ioannou, P.: Int. J. Adapt. Control Signal Process 30(12), 1674 (2016)
60. Jafari, S., Ioannou, P., Rudd, L.: J. Vib. Control 23(4), 526 (2017)
61. Jafari, S., Ioannou, P.: Int. J. Adapt. Control Signal Process 33(1), 196 (2019)
62. Ioannou, P.A., Fidan, B.: Adaptive Control Tutorial. SIAM (2006)
63. Skogestad, S., Postlethwaite, I.: Multivariable Feedback Control: Analysis and Design. Wiley,
New York (2005)
64. Dahleh, M., Diaz-Bobillo, I.J.: Control of Uncertain Systems: A Linear Programming
Approach. Prentice-Hall, Englewood Cliffs (1995)
65. Goodwin, G.C., Sin, K.S.: Adaptive Filtering Prediction and Control. Prentice-Hall, Englewood
Cliffs (1984)
66. McFarlane, D., Glover, K.: Robust Controller Design Using Normalized Coprime Factor Plant
Descriptions. Lecture Notes in Control and Information Sciences, vol. 138. Springer, Berlin
(1990)
67. Papageorgiou, G., Glover, K.: In: The AIAA Conference on Guidance, Navigation and Control,
pp. 1–14 (1999)

68. McFarlane, D., Glover, K.: IEEE Trans. Autom. Control 37(6), 759 (1992)
69. Hyde, R.A.: The Application of Robust Control to VSTOL Aircraft, Ph.D thesis, University of
Cambridge (1991)
70. Hyde, R.A.: H∞ Aerospace Control Design – A VSTOL Flight Application. Advances in
Industrial Control. Springer, London (1995)
71. Lanzon, A.: Automatica 41(7), 1201 (2005)
72. Francis, B.A.: A Course in H∞ Control Theory. Lecture Notes in Control and Information
Science. Springer, Berlin (1987)
73. Weiss, M.: IEEE Trans. Autom. Control 39(3), 677 (1994)
74. Varga, A.: IEEE Trans. Autom. Control 43(5), 684 (1998)
75. Chen, B.M., Lin, Z., Shamash, Y.: Linear Systems Theory: A Structural Decomposition
Approach. Birkhauser, Boston (2004)
76. Lee, T.H., Narendra, K.: IEEE Trans. Autom. Control 31(5), 477 (1986)
77. Ioannou, P.A., Fidan, B.: Adaptive control tutorial – supplement to chapter 7. https://archive.siam.org/books/dc11/Ioannou-Web-Ch7.pdf
78. Astrom, K.J., Wittenmark, B.: Computer-Controlled Systems: Theory and Design. Dover Pub-
lication, New York (2011)
79. Astrom, K., Hagander, P., Sternby, J.: Automatica 20(1), 31 (1984)
Chapter 7
Delay-Adaptive Observer-Based Control
for Linear Systems with Unknown Input
Delays

Miroslav Krstic and Yang Zhu

Abstract Laurent Praly’s contributions to adaptive control and to state and param-
eter estimation are inestimable. Inspired by them, over the last several years we have
developed adaptive and observer-based control designs for the stabilization of linear
systems that have large and unknown delays at their inputs. In this chapter, we pro-
vide a tutorial introduction to this collection of results by presenting several of the
most basic ones among them. Among the problems considered are some with mea-
sured and some with unmeasured states, some with known and some with unknown
plant parameters, some with known and some with unknown delays, and some with
measured and some with unmeasured actuator state under unknown delays. We have
carefully chosen, for this chapter, several combinations among these challenges, in
which estimation of a state (of the plant or of the actuator) and/or estimation of a
parameter (of the plant or the delay) is being conducted and such estimates fed into
a certainty-equivalence observer-based adaptive control law. The exposition pro-
gresses from designs that are relatively easy to those that are rather challenging. All
the designs and stability analyses are Lyapunov based. The delay compensation is
based on the predictor approach and the Lyapunov functionals are constructed using
backstepping transformations and the underlying Volterra integral operators. The
stability achieved is global, except when the delay is unknown and the actuator state
is unmeasured, in which case stability is local.

M. Krstic (B)
Department of Mechanical and Aerospace Engineering, University of California,
San Diego, USA
e-mail: [email protected]
Y. Zhu
State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control,
Zhejiang University, Hangzhou, China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 189
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5_7

7.1 Introduction

7.1.1 Adaptive Control for Time-Delay Systems and PDEs

Actuator and sensor delays are among the most common dynamic phenomena in
engineering practice, and when disregarded, they render controlled systems unstable
[11]. Over the past 60 years, predictor feedback has been a key tool for compen-
sating such delays, but conventional predictor feedback algorithms assume that the
delays and other parameters of a given system are known [2, 5, 6, 24, 26]. When
incorrect parameter values are used in the predictor, the resulting controller may be
as destabilizing as without the delay compensation [6, 13, 14].
Adaptive control—a simultaneous real-time combination of system identification
(the estimation of the model parameters) and feedback control—is one of the most
important areas of control theory and engineering [1, 3, 12, 18, 25]. In adaptive
control, actuator or sensor delays have traditionally been viewed as perturbations—
effects that may challenge the robustness of the adaptive controllers and whose
ignored presence should be studied with the tools for robustness analysis and redesign
[10, 13, 14, 22, 23].
The simultaneous presence of unknown plant parameters and actuator/sensor
delays poses challenges, which arise straight out of the control practice, and which are
not addressed by conventional predictor and adaptive control methods. The turning
point for the development of the ability to simultaneously tackle delays and unknown
parameters was the introduction of the Partial Differential Equation (PDE) backstep-
ping framework for parabolic PDEs and the resulting backstepping interpretation
of the classical predictor feedback [1, 15, 16, 19–21, 25]. Similar to the role that
finite-dimensional adaptive backstepping had played in the development of adaptive
control in the 1990s, PDE backstepping of the 2000s furnished the Lyapunov func-
tionals needed for the study of stability of delay systems under predictor feedback
laws. It also enabled the further design of adaptive controllers for systems with delays
and unknown parameters, from the late 2000s onward. Adaptive control problems
for single-input linear systems with unknown discrete input delay are addressed in
[7–9, 29, 35], and then extended to multi-input systems with distinct discrete delays
by [30, 32, 33]. In publications [4, 27, 31, 34], delay-adaptive control for linear
systems with uncertain distributed delays is studied.

7.1.2 Results in This Chapter: Adaptive Control for Uncertain Linear Systems with Input Delays

In general, linear systems with input delays usually come with the following five
types of uncertainties [28]:

• unknown actuator delays,



Table 7.1 Uncertainty collections of linear systems with input delays


Section Delay Delay kernel Parameter Plant state Actuator state
Sect. 7.2.1 known – known unknown known
Sect. 7.2.2 unknown – known known known
Sect. 7.2.3 unknown – known known unknown
Sect. 7.3 unknown – unknown unknown known
Sect. 7.4 unknown unknown known known known

• unknown actuator delay kernels,


• unknown plant parameters,
• unmeasurable finite-dimensional plant state, and
• unmeasurable infinite-dimensional actuator state.

In this chapter, we provide a tutorial introduction to the delay-adaptive control
approach for handling a collection of the above basic uncertainties. We have carefully
chosen, for this chapter, several combinations among these challenges, in which
estimation of a state (of the plant or of the actuator) and/or estimation of a parameter
(of the plant or the delay) is being conducted and such estimates fed into a certainty-
equivalence observer-based adaptive control law. All the designs and stability analy-
ses are Lyapunov based. The delay compensation is based on the predictor approach
and the Lyapunov functionals are constructed using backstepping transformations
and the underlying Volterra integral operators.
To clearly describe the chapter’s organization, different combinations of uncer-
tainties considered in later sections are summarized in Table 7.1, from which the
interested readers and the practitioners can make their own selections to address a
vast class of relevant problems.

7.2 Adaptive Control for Linear Systems with Discrete Input Delays

Consider linear systems with discrete input delays as follows:

Ẋ (t) = A(θ )X (t) + B(θ )U (t − D) (7.1)


Y (t) = C X (t), (7.2)

where X (t) ∈ Rn is the plant state, Y (t) ∈ Rq is the output, and U (t) ∈ R is the
control input with a constant delay D ∈ R+ . The matrices A(θ ) and B(θ ) dependent
upon constant parameter vector θ ∈ R p are linearly parameterized such that


A(θ) = A_0 + Σ_{i=1}^{p} θ_i A_i,   B(θ) = B_0 + Σ_{i=1}^{p} θ_i B_i,   (7.3)

where θi is the ith element of θ and A0 , Ai , B0 , and Bi for i = 1, ..., p are known
matrices. To stabilize the potentially unstable system (7.1)–(7.2), we have following
assumptions:
Assumption 7.1 The actuator delay satisfies

0 < D̲ ≤ D ≤ D̄,   (7.4)

where D̲ and D̄ are known lower and upper bounds of D, respectively. The plant
parameter vector belongs to

Θ = {θ ∈ R p |P(θ ) ≤ 0}, (7.5)

where P(·) : R p → R is a known, convex, and smooth function and Θ is a convex


set with a smooth boundary ∂Θ.
 
Assumption 7.2 The pair (A(θ), B(θ)) is stabilizable. In other words, there exists
a matrix K(θ) such that A(θ) + B(θ)K(θ) is Hurwitz and

(A + B K )T (θ )P(θ ) + P(θ )(A + B K )(θ ) = −Q(θ ) (7.6)

with P(θ ) = P(θ )T > 0 and Q(θ ) = Q(θ )T > 0.

The system (7.1)–(7.2) is equivalent to the ODE–PDE cascade system

Ẋ (t) = A(θ )X (t) + B(θ )u(0, t) (7.7)


Y (t) = C X (t) (7.8)
Du t (x, t) = u x (x, t), x ∈ [0, 1] (7.9)
u(1, t) = U (t), (7.10)

where the PDE solution is

u(x, t) = U (t + D(x − 1)), x ∈ [0, 1]. (7.11)
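This equivalence can be illustrated numerically: an upwind discretization of the transport equation (7.9)–(7.10) with unit CFL number (Δt = D Δx) propagates the boundary input to x = 0 in exactly D seconds, reproducing the pure input delay. A sketch (the delay value, grid size, and test input are illustrative choices, not from the chapter):

```python
import math

D = 0.5                          # input delay in seconds, illustrative
M = 100                          # number of grid cells on x in [0, 1]
dt = D * (1.0 / M)               # unit CFL: the upwind scheme is exact

def U(t):                        # test input fed at the boundary x = 1
    return math.sin(3.0 * t)

u = [0.0] * (M + 1)              # u[i] approximates u(i/M, t); u[M] holds U(t)
errs = []
for n in range(1, 3 * M + 1):
    t = n * dt
    u = u[1:] + [U(t)]           # upwind step u_i^{n+1} = u_{i+1}^n with u(1, t) = U(t)
    if n > M:                    # after one delay period the domain is filled
        errs.append(abs(u[0] - U(t - D)))
print(max(errs))                 # machine-precision agreement with u(0, t) = U(t - D)
```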

The ODE–PDE cascade system (7.7)–(7.10) representing the linear system with
input delay may come with the following four types of basic uncertainties:
• unknown actuator delay D,
• unknown plant parameter θ ,
• unmeasurable finite-dimensional plant state X (t), and
• unmeasurable infinite-dimensional actuator state u(x, t).

Fig. 7.1 Adaptive control for uncertain linear systems with input delays

The basic idea of certainty-equivalence-based adaptive control is to use an estimator
(a parameter estimator or a state estimator) to replace the unknown variables in the
control law, as shown in Fig. 7.1.

7.2.1 Global Stabilization Under Uncertain Plant State

From this section onward, different combinations of the four uncertainties above are
taken into account. Accordingly, for each kind of uncertainty combination, a unique
control scheme is proposed. First of all, we consider observer-based stabilization when
the finite-dimensional plant state X(t) is unmeasurable. Since θ is known in this
section, for the sake of brevity, we denote A = A(θ), B = B(θ). We have the following
assumption:
Assumption 7.3 The pair (A, C) is detectable, namely, there exists a matrix L ∈
R^{n×q} such that A − LC is Hurwitz and

(A − LC)T PL + PL (A − LC) = −Q L (7.12)

with PLT = PL > 0 and Q TL = Q L > 0.


The control scheme is summarized in Table 7.2.

Theorem 7.1 Consider the closed-loop system consisting of the ODE–PDE cascade
plant (7.13)–(7.16), the ODE-state observer (7.17), and the predictor feedback law
(7.18). The origin is exponentially stable in the sense of the norm
( |X(t)|^2 + |X̂(t)|^2 + ∫_0^1 u(x, t)^2 dx )^{1/2}.   (7.25)

Table 7.2 Global stabilization under uncertain plant state

The ODE-PDE cascade system:

Ẋ (t) = AX (t) + Bu(0, t) (7.13)


Y (t) = C X (t) (7.14)
Du t (x, t) = u x (x, t), x ∈ [0, 1] (7.15)
u(1, t) = U (t) (7.16)

The ODE-state observer:


 
X̂˙(t) = A X̂(t) + B u(0, t) + L( Y(t) − C X̂(t) )   (7.17)

The predictor feedback law:


U(t) = u(1, t) = K[ e^{AD} X̂(t) + D ∫_0^1 e^{AD(1−y)} B u(y, t) dy ]   (7.18)

The invertible backstepping transformation:


w(x, t) = u(x, t) − K[ e^{ADx} X̂(t) + D ∫_0^x e^{AD(x−y)} B u(y, t) dy ]   (7.19)
u(x, t) = w(x, t) + K[ e^{(A+BK)Dx} X̂(t) + D ∫_0^x e^{(A+BK)D(x−y)} B w(y, t) dy ]   (7.20)

The closed-loop target system:

X̂˙ (t) = (A + B K ) X̂ (t) + Bw(0, t) + LC X̃ (t), (7.21)


X̃˙ (t) = (A − LC) X̃ (t), (7.22)
Dwt (x, t) = wx (x, t) − D K e ADx LC X̃ (t), x ∈ [0, 1] (7.23)
w(1, t) = 0 (7.24)
where X̃ (t) = X (t) − X̂ (t).

Proof The proof is found in [15, Chap. 3].
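To build intuition for the predictor feedback law (7.18), consider its scalar specialization with a measured state (so X̂ = X): for ẋ = ax + bu(t − D), the control U(t) = k[ e^{aD} x(t) + ∫_{t−D}^{t} e^{a(t−s)} b U(s) ds ] places the delayed closed loop at a + bk. A simulation sketch (all numerical values are hypothetical):

```python
import math
from collections import deque

a, b, k, D = 1.0, 1.0, -2.0, 0.5   # open loop unstable; target pole a + b*k = -1
dt = 2e-3
N = int(D / dt)                     # input-history samples spanning one delay
# Rectangle-rule weights e^{a(t-s)} * b * dt for the predictor integral.
weights = [math.exp(a * (D - j * dt)) * b * dt for j in range(N)]
hist = deque([0.0] * N)             # hist[0] = U(t - D), ..., hist[-1] = U(t - dt)
x = 1.0
for _ in range(int(8.0 / dt)):
    pred = math.exp(a * D) * x + sum(w * uj for w, uj in zip(weights, hist))
    u_new = k * pred                # certainty-equivalence predictor feedback
    x += dt * (a * x + b * hist[0])  # the plant sees the D-delayed input
    hist.popleft()
    hist.append(u_new)
print(abs(x))                       # grows until t = D, then decays like e^{-(t - D)}
```

With u ≡ 0 the plant diverges; with the predictor in the loop the state is driven to zero despite the half-second input delay.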

7.2.2 Global Stabilization Under Uncertain Delay

In this section, we consider the adaptive stabilization when the input delay D is
unknown. The control scheme is summarized in Table 7.3.

Table 7.3 Global stabilization under uncertain input delay

The ODE-PDE cascade system:

Ẋ (t) = AX (t) + Bu(0, t) (7.26)


Du t (x, t) = u x (x, t), x ∈ [0, 1] (7.27)
u(1, t) = U (t) (7.28)

The predictor feedback law:


U(t) = u(1, t) = K[ e^{A D̂(t)} X(t) + D̂(t) ∫_0^1 e^{A D̂(t)(1−y)} B u(y, t) dy ]   (7.29)

The invertible backstepping transformation:


w(x, t) = u(x, t) − K[ e^{A D̂(t)x} X(t) + D̂(t) ∫_0^x e^{A D̂(t)(x−y)} B u(y, t) dy ]   (7.30)
u(x, t) = w(x, t) + K[ e^{(A+BK) D̂(t)x} X(t) + D̂(t) ∫_0^x e^{(A+BK) D̂(t)(x−y)} B w(y, t) dy ]   (7.31)

The delay update law:

dD̂(t)/dt = γ_D Proj_{[D̲, D̄]}{ τ_D(t) },   D̂(0) ∈ [D̲, D̄],   (7.32)

τ_D(t) = − [ ∫_0^1 (1 + x) w(x, t) K e^{A D̂(t)x} dx ( AX(t) + B u(0, t) ) ] / [ 1 + X(t)^T P X(t) + g ∫_0^1 (1 + x) w(x, t)^2 dx ]   (7.33)

The closed-loop target system:

Ẋ(t) = (A + BK) X(t) + B w(0, t),   (7.34)
D w_t(x, t) = w_x(x, t) − D̃(t) p(x, t) − D (dD̂(t)/dt) q(x, t),   x ∈ [0, 1]   (7.35)
w(1, t) = 0   (7.36)
where D̃(t) = D − D̂(t),
p(x, t) = K e^{A D̂(t)x} ( AX(t) + B u(0, t) ) = K e^{A D̂(t)x} ( (A + BK) X(t) + B w(0, t) )   (7.37)
q(x, t) = K A x e^{A D̂(t)x} X(t) + ∫_0^x K ( I + A D̂(t)(x − y) ) e^{A D̂(t)(x−y)} B u(y, t) dy   (7.38)

Theorem 7.2 Consider the closed-loop system consisting of the ODE–PDE cascade
plant (7.26)–(7.28), the predictor feedback law (7.29), and the delay update law
(7.32), (7.33). The zero solution is stable in the sense of the norm
( |X(t)|^2 + ∫_0^1 u(x, t)^2 dx + |D − D̂(t)|^2 )^{1/2}   (7.39)

and the convergence lim_{t→∞} X(t) = 0 and lim_{t→∞} U(t) = 0 is achieved.

Proof The proof is found in [15, Chap. 7].
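The projection operator in the update law (7.32) simply freezes the adaptation whenever the estimate sits at a boundary of the known delay interval and the update points outward. A minimal sketch of this standard interval projection (illustrative values, not the chapter's):

```python
def proj(est, tau, lo, hi):
    """Directional projection onto [lo, hi]: pass tau through in the interior,
    zero it when it would push the estimate out of the interval."""
    if est >= hi and tau > 0.0:
        return 0.0
    if est <= lo and tau < 0.0:
        return 0.0
    return tau

# Euler integration of dD_hat/dt = gamma_D * Proj{tau}: the estimate never
# leaves [D_lo, D_hi] even though tau keeps pushing upward.
D_lo, D_hi, gamma_D, dt = 0.2, 1.0, 5.0, 0.01
D_hat = 0.9
for _ in range(1000):
    tau = 1.0                            # hypothetical update signal
    D_hat += dt * gamma_D * proj(D_hat, tau, D_lo, D_hi)
print(D_hat)                             # saturates at the upper bound D_hi = 1.0
```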

7.2.3 Local Stabilization Under Uncertain Delay and Actuator State

In this section, we consider the adaptive stabilization when the input delay D is
unknown and the actuator state u(x, t) is unmeasurable. The delay-adaptive prob-
lem without the measurement of u(x, t) is unsolvable globally because it cannot
be formulated as linearly parameterized in the unknown delay D. That is to say,
when the controller uses an estimate of u(x, t), not only do the initial values of the
plant state and the actuator state have to be small, but the initial value of the delay
estimation error also has to be small (the delay value is allowed to be large but the
initial value of its estimate has to be close to the true value of the delay). The control
scheme is summarized in Table 7.4.

Theorem 7.3 Consider the closed-loop system consisting of the ODE–PDE cascade
plant (7.40)–(7.42), the PDE-state observer (7.43), (7.44), the predictor feedback law
(7.45), and the delay update law (7.48), (7.49). The zero solution is stable in the sense
of the norm

( |X(t)|^2 + ∫_0^1 u(x, t)^2 dx + ∫_0^1 û(x, t)^2 dx + ∫_0^1 û_x(x, t)^2 dx + |D − D̂(t)|^2 )^{1/2}   (7.55)

and the convergence lim_{t→∞} X(t) = 0 and lim_{t→∞} U(t) = 0 is achieved, if there
exists M > 0 such that the initial condition

( |X(0)|^2 + ∫_0^1 u(x, 0)^2 dx + ∫_0^1 û(x, 0)^2 dx + ∫_0^1 û_x(x, 0)^2 dx + |D − D̂(0)|^2 )^{1/2} ≤ M   (7.56)

is satisfied.

Proof The proof is found in [15, Chap. 8] and [9].



Table 7.4 Local stabilization under uncertain input delay and actuator state

The ODE-PDE cascade system:

Ẋ (t) = AX (t) + Bu(0, t) (7.40)


Du t (x, t) = u x (x, t), x ∈ [0, 1] (7.41)
u(1, t) = U (t) (7.42)

The PDE-state observer:


D̂(t) û_t(x, t) = û_x(x, t) + (dD̂(t)/dt)(x − 1) û_x(x, t),   x ∈ [0, 1]   (7.43)
û(1, t) = U(t)   (7.44)

where û(x, t) = U( t + D̂(t)(x − 1) ),  x ∈ [0, 1]
The predictor feedback law:
U(t) = û(1, t) = K[ e^{A D̂(t)} X(t) + D̂(t) ∫_0^1 e^{A D̂(t)(1−y)} B û(y, t) dy ]   (7.45)

The invertible backstepping transformation:


ŵ(x, t) = û(x, t) − K[ e^{A D̂(t)x} X(t) + D̂(t) ∫_0^x e^{A D̂(t)(x−y)} B û(y, t) dy ]   (7.46)
û(x, t) = ŵ(x, t) + K[ e^{(A+BK) D̂(t)x} X(t) + D̂(t) ∫_0^x e^{(A+BK) D̂(t)(x−y)} B ŵ(y, t) dy ]   (7.47)

The delay update law:

dD̂(t)/dt = γ_D Proj_{[D̲, D̄]}{ τ_D(t) },   D̂(0) ∈ [D̲, D̄],   (7.48)
τ_D(t) = − ∫_0^1 (1 + x) ŵ(x, t) K e^{A D̂(t)x} dx ( AX(t) + B û(0, t) )   (7.49)

The closed-loop target system:

Ẋ (t) = (A + B K )X (t) + B ŵ(0, t) + B ũ(0, t) (7.50)


˙
D ũ t (x, t) = ũ x (x, t) − D̃(t)r (x, t) − D D̂(t)(x − 1)r (x, t) (7.51)
ũ(1, t) = 0 (7.52)
˙
D̂(t)ŵt (x, t) = ŵx (x, t) − D̂(t) D̂(t)s(x, t) − D̂(t)K e A D̂(t)x B ũ(0, t) (7.53)
ŵ(1, t) = 0 (7.54)
where D̃(t) = D − D̂(t), ũ(x, t) = u(x, t) − û(x, t).
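For orientation, the known-delay, known-parameter special case of this predictor feedback can be simulated in a few lines. The sketch below (an assumed scalar plant and assumed gains, not taken from the chapter) implements U(t) = K(e^{aD}X(t) + D∫₀¹ e^{aD(1−y)} u(y, t) dy), storing the actuator state u as a buffer of past inputs:

```python
# Known-delay baseline of the predictor feedback (7.45), scalar case,
# with assumed numbers: the closed loop behaves like X' = (a + K) X.
import math

a, D, K = 1.0, 0.5, -2.0        # unstable plant X' = a X + U(t - D)
dt = 1e-3
N = int(D / dt)                 # buffer of past inputs: buf[0] = U(t - D)
# quadrature weights for D * \int_0^1 e^{aD(1-y)} u(y,t) dy
w = [math.exp(a * (D - (j + 1) * dt)) * dt for j in range(N)]
buf = [0.0] * N                 # zero initial actuator state
X = 1.0
for _ in range(int(8.0 / dt)):
    # predictor P ~ X(t + D), assembled from X(t) and the input history
    P = math.exp(a * D) * X + sum(wj * bj for wj, bj in zip(w, buf))
    U = K * P
    X += dt * (a * X + buf[0])  # the plant sees the D-delayed input
    buf = buf[1:] + [U]
print(abs(X))                   # small: the predictor compensates the delay
```

After one delay interval the state decays exponentially, despite the plant being open-loop unstable; replacing D by an estimate D̂(t) and adding the update law (7.48), (7.49) gives the adaptive scheme of the table.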
198 M. Krstic and Y. Zhu

7.3 Observer-Based Adaptive Control for Linear Systems


with Discrete Input Delays
 
In this section, we deal with a more challenging uncertainty collection {D, X(t), θ}.
When the state X (t) and the parameter θ in the finite-dimensional plant are unknown
simultaneously, the relative degree plays an important role in the output-feedback
problem. Consider single-input single-output uncertain linear systems with input
delay
Y(s) = ( b_m sᵐ + ⋯ + b₁ s + b₀ ) / ( sⁿ + a_{n−1} s^{n−1} + ⋯ + a₁ s + a₀ ) · e^{−Ds} U(s)   (7.57)

which is of the observer canonical form:



Ẋ(t) = AX(t) − a Y(t) + [0_{(ρ−1)×1} ; b] U(t − D)
Y(t) = e₁ᵀ X(t),   (7.58)

where

A = [ 0_{(n−1)×1}  I_{n−1} ; 0  0_{1×(n−1)} ],  a = [a_{n−1}; ⋮; a₀],  b = [b_m; ⋮; b₀]   (7.59)

and e_i for i = 1, 2, ... is the ith coordinate vector; ρ = n − m denotes the relative degree; X(t) = [X₁(t), X₂(t), ⋯, X_n(t)]ᵀ ∈ Rⁿ is the plant state, which is unavailable for measurement; Y(t) ∈ R is the measurable output; U(t − D) ∈ R is the control input with an unknown constant time delay D; and a_{n−1}, ⋯, a₀ and b_m, ⋯, b₀ are unknown constant plant parameters and control coefficients, respectively. The system (7.58) is written compactly as
 T
Ẋ (t) = AX (t) + F U (t − D), Y (t) θ
Y (t) = e1T X (t), (7.60)

where the p = (n + m + 1)-dimensional parameter vector θ is defined by



θ = [b ; a]   (7.61)

and

F(U(t − D), Y(t))ᵀ = [ [0_{(ρ−1)×(m+1)} ; I_{m+1}] U(t − D),  −I_n Y(t) ].   (7.62)

Several assumptions concerning the system (7.57)–(7.58) are given.



Assumption 7.4 The plant (7.57) is minimum-phase, i.e., the polynomial B(s) =
bm s m + · · · + b1 s + b0 is Hurwitz.

Assumption 7.5 There exist two known constants D̲ > 0 and D̄ > 0 such that D ∈ [D̲, D̄]. The sign of the high-frequency gain, sgn(b_m), is known, and a constant b̲_m is known such that |b_m| ≥ b̲_m > 0. Furthermore, θ belongs to a convex compact set Θ = {θ ∈ Rᵖ | P(θ) ≤ 0}, where P : Rᵖ → R is a smooth convex function.

The control objective is to make the output Y(t) asymptotically track a time-varying reference signal Yr(t), which satisfies the assumption given below.
Assumption 7.6 In the case of known θ, given a known, bounded, and smooth time-varying reference output trajectory Yr(t), there exist a known reference state signal Xʳ(t, θ) and a known reference input signal Uʳ(t, θ), bounded in t and continuously differentiable in the argument θ, that satisfy

Ẋʳ(t, θ) = AXʳ(t, θ) + F( Uʳ(t, θ), Yr(t) )ᵀ θ
Yr(t) = e₁ᵀ Xʳ(t, θ).   (7.63)

We introduce the distributed input

u(x, t) = U (t + D(x − 1)), x ∈ [0, 1], (7.64)

where the measurable actuator state is governed by the transport PDE

Du t (x, t) = u x (x, t) (7.65)


u(1, t) = U (t). (7.66)
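A quick numerical sanity check of this transport-PDE representation (a sketch with assumed values, not from the chapter): an upwind discretization of (7.65), (7.66) carries the boundary input from x = 1 to x = 0 in exactly D seconds, so u(0, t) = U(t − D).

```python
# Transport-PDE model of an input delay: with u(x, t) = U(t + D*(x - 1))
# on [0, 1], the PDE D*u_t = u_x with u(1, t) = U(t) transports the
# input across the domain in time D, hence u(0, t) = U(t - D).
D = 2.0                      # delay value (assumed for this sketch)
N = 200                      # spatial grid points on [0, 1]
dt = D / N                   # CFL-matched step: one cell per time step

def U(t):                    # ramp input, zero before t = 0
    return t if t >= 0.0 else 0.0

u = [0.0] * (N + 1)          # zero initial actuator state
t = 0.0
for _ in range(500):
    t += dt
    # exact upwind update for D*u_t = u_x: shift left, refill boundary
    u = u[1:] + [U(t)]

delay_output, expected = u[0], U(t - D)
print(delay_output, expected)   # equal up to float rounding
```

Because the step dt = D/N moves the wave exactly one cell per step, the scheme is an exact shift register: the transport PDE is literally a delay line, which is the point of the representation.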

To estimate the unmeasurable ODE state, we employ the Kreisselmeier filters (K-
filters) as follows:

η̇(t) = A₀ η(t) + e_n Y(t)   (7.67)
λ̇(t) = A₀ λ(t) + e_n U(t − D)   (7.68)
ξ(t) = −A₀ⁿ η(t)   (7.69)
Ξ(t) = −[A₀^{n−1} η(t), ⋯, A₀ η(t), η(t)]   (7.70)
υ_j(t) = A₀ʲ λ(t),  j = 0, 1, ..., m   (7.71)
Ω(t)ᵀ = [υ_m(t), ⋯, υ₁(t), υ₀(t), Ξ(t)]   (7.72)

where k = [k1 , k2 , · · · , kn ]T is chosen so that the matrix A0 = A − ke1T is Hurwitz,


i.e., A0T P + P A0 = −I , P = P T > 0. The unmeasurable ODE state X (t) is virtu-
ally estimated as X̂ (t) = ξ(t) + Ω(t)T θ and the estimation error ε(t) = X (t) − X̂ (t)
vanishes exponentially as
ε̇(t) = A0 ε(t). (7.73)
200 M. Krstic and Y. Zhu

Thus, we get the static relationship

X(t) = ξ(t) + Ω(t)ᵀ θ + ε(t)   (7.74)
 = −A(A₀) η(t) + B(A₀) λ(t) + ε(t),   (7.75)

where A(A₀) = A₀ⁿ + Σ_{i=0}^{n−1} a_i A₀ⁱ and B(A₀) = Σ_{i=0}^{m} b_i A₀ⁱ.
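The K-filter construction can be exercised numerically. The sketch below (an assumed second-order example with n = 2, m = 0, so θ = [b₀; a₁; a₀]; not from the chapter) integrates the plant and the filters (7.67), (7.68) and checks that the virtual estimate X̂(t) = ξ(t) + Ω(t)ᵀθ converges to X(t), as predicted by (7.73), even though the filters themselves never use θ:

```python
# Kreisselmeier filters for n = 2, m = 0: the estimation error
# X - (xi + Omega^T theta) obeys eps' = A0 eps and decays.
import math

a1, a0, b0, D = 1.5, -1.0, 2.0, 0.3    # "unknown" plant data (assumed)
k1, k2 = 2.0, 1.0                      # A0 = A - k e1^T Hurwitz (poles -1, -1)
dt, T = 1e-3, 12.0

def A0v(v):                            # A0 = [[-k1, 1], [-k2, 0]]
    return [-k1 * v[0] + v[1], -k2 * v[0]]

X, eta, lam = [1.0, -0.5], [0.0, 0.0], [0.0, 0.0]
for i in range(int(T / dt)):
    t = i * dt
    Ud = math.sin(t - D) if t >= D else 0.0    # U(t) = sin t applied
    Y = X[0]
    # plant: Xdot = A X - a Y + [0, b0]^T U(t - D), A the shift matrix
    dX = [X[1] - a1 * Y, -a0 * Y + b0 * Ud]
    deta = A0v(eta); deta[1] += Y              # eta' = A0 eta + e2 Y
    dlam = A0v(lam); dlam[1] += Ud             # lam' = A0 lam + e2 U(t-D)
    X = [X[j] + dt * dX[j] for j in range(2)]
    eta = [eta[j] + dt * deta[j] for j in range(2)]
    lam = [lam[j] + dt * dlam[j] for j in range(2)]

# xi = -A0^2 eta, Xi = -[A0 eta, eta], Omega^T = [lam, Xi]
xi = [-v for v in A0v(A0v(eta))]
A0eta = A0v(eta)
Xhat = [xi[j] + b0 * lam[j] - a1 * A0eta[j] - a0 * eta[j] for j in range(2)]
err = max(abs(X[0] - Xhat[0]), abs(X[1] - Xhat[1]))
print(err)   # decays like exp(A0 t); small after 12 s
```

The filters are driven only by the measured Y and the known input, which is what makes X̂ = ξ + Ωᵀθ usable as a linear-in-θ parameterization for the adaptive design.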
According to (7.63), when θ is known, we can use reference K-filters ηʳ(t) and λʳ(t, θ) to produce the reference output Yr(t). When θ is unknown, by the certainty-equivalence principle, we have

η̇ʳ(t) = A₀ ηʳ(t) + e_n Yr(t)   (7.76)
λ̇ʳ(t, θ̂) = A₀ λʳ(t, θ̂) + e_n Uʳ(t, θ̂) + (∂λʳ(t, θ̂)/∂θ̂) θ̂˙   (7.77)
ξʳ(t) = −A₀ⁿ ηʳ(t)   (7.78)
Ξʳ(t) = −[A₀^{n−1} ηʳ(t), ⋯, A₀ ηʳ(t), ηʳ(t)]   (7.79)
υ_jʳ(t, θ̂) = A₀ʲ λʳ(t, θ̂),  j = 0, 1, ..., m   (7.80)
Ωʳ(t, θ̂)ᵀ = [υ_mʳ(t, θ̂), ⋯, υ₁ʳ(t, θ̂), υ₀ʳ(t, θ̂), Ξʳ(t)]   (7.81)
Xʳ(t, θ̂) = ξʳ(t) + Ωʳ(t, θ̂)ᵀ θ̂   (7.82)
 = −Â(A₀) ηʳ(t) + B̂(A₀) λʳ(t, θ̂)   (7.83)
Yr(t) = e₁ᵀ Xʳ(t, θ̂),   (7.84)

where θ̂(t) is the estimate of the unknown parameter θ, θ̃(t) = θ − θ̂(t), Â(A₀) = A₀ⁿ + Σ_{i=0}^{n−1} â_i A₀ⁱ, and B̂(A₀) = Σ_{i=0}^{m} b̂_i A₀ⁱ. A few error variables are defined as follows:

z 1 (t) = Y (t) − Yr (t) (7.85)


Ũ (t − D) = U (t − D) − U r (t, θ̂) (7.86)
η̃(t) = η(t) − ηr (t) (7.87)
λ̃(t) = λ(t) − λr (t, θ̂ ) (7.88)
ξ̃ (t) = ξ(t) − ξ r (t) (7.89)
Ξ̃ (t) = Ξ (t) − Ξ r (t) (7.90)
υ̃ j (t) = υ j (t) − υ rj (t, θ̂ ), j = 0, 1, ..., m (7.91)
Ω̃(t) = Ω(t) − Ω (t, θ̂ ) ,
T T r T
(7.92)

which are governed by the following dynamic equations:



η̃˙(t) = A₀ η̃(t) + e_n z₁(t)   (7.93)
λ̃˙(t) = A₀ λ̃(t) + e_n Ũ(t − D) − (∂λʳ(t, θ̂)/∂θ̂) θ̂˙   (7.94)
ż₁(t) = ξ̃₂(t) + ω̃(t)ᵀθ̂ + ε₂(t) + ω(t)ᵀθ̃   (7.95)
 = b̂_m υ̃_{m,2}(t) + ξ̃₂(t) + ω̃¯(t)ᵀθ̂ + ε₂(t) + ω(t)ᵀθ̃,   (7.96)

where

ω̃(t) = [υ̃_{m,2}(t), υ̃_{m−1,2}(t), ⋯, υ̃_{0,2}(t), Ξ̃₂(t) − z₁(t)e₁ᵀ]ᵀ   (7.97)
ω̃¯(t) = [0, υ̃_{m−1,2}(t), ⋯, υ̃_{0,2}(t), Ξ̃₂(t) − z₁(t)e₁ᵀ]ᵀ   (7.98)
ω(t) = [υ_{m,2}(t), υ_{m−1,2}(t), ⋯, υ_{0,2}(t), Ξ₂(t) − Y(t)e₁ᵀ]ᵀ.   (7.99)

Then a couple of new variables are further defined as

χ(t) = [Y(t); η(t); λ(t)],  χʳ(t, θ̂) = [Yr(t); ηʳ(t); λʳ(t, θ̂)]   (7.100)

Thus, a new error variable is derived,

χ̃(t) = χ(t) − χʳ(t, θ̂) = [z₁(t); η̃(t); λ̃(t)] ∈ R^{2n+1}   (7.101)

which is driven by

χ̃˙(t) = A_χ̃(θ̂) χ̃(t) + e_{2n+1} Ũ(t − D) + e₁ ( ε₂(t) + ω(t)ᵀθ̃ ) − (∂χʳ(t, θ̂)/∂θ̂) θ̂˙,   (7.102)

where

A_χ̃(θ̂) = [ −â_{n−1}  −e₂ᵀÂ(A₀)  e₂ᵀB̂(A₀) ; e_n  A₀  0_{n×n} ; 0_{n×1}  0_{n×n}  A₀ ],  ∂χʳ(t, θ̂)/∂θ̂ = [ 0_{(1+n)×p} ; ∂λʳ(t, θ̂)/∂θ̂ ].   (7.103)

Next, similarly to the transformation in [18, pp. 435–436], we introduce the dynamic equation for the m-dimensional inverse transformation ζ(t) = T X(t) of (7.58) and its reference signal ζʳ(t) as

ζ̇(t) = A_b ζ(t) + b_b Y(t)   (7.104)
ζ̇ʳ(t) = A_b ζʳ(t) + b_b Yr(t),   (7.105)

where

A_b = ⎡ −b_{m−1}/b_m            ⎤
      ⎢      ⋮         I_{m−1}  ⎥ ,  b_b = T ( A^ρ [0_{(ρ−1)×1}; b] / b_m − a ),  T = [ A_b^ρ e₁, ⋯, A_b e₁, I_m ],   (7.106)
      ⎣ −b₀/b_m      0  ⋯  0   ⎦
thus the error state

ζ̃(t) = ζ(t) − ζʳ(t)   (7.107)

is driven by

ζ̃˙(t) = A_b ζ̃(t) + b_b z₁(t).   (7.108)

Under Assumption 7.4, A_b is Hurwitz, i.e., P_b A_b + A_bᵀ P_b = −I with P_b = P_bᵀ > 0.
Aiming at the system

ż₁ = b̂_m υ̃_{m,2} + ξ̃₂ + ω̃¯ᵀθ̂ + ε₂ + ωᵀθ̃   (7.109)
υ̃˙_{m,i} = υ̃_{m,i+1} − k_i υ̃_{m,1} − e_iᵀ A₀ᵐ (∂λʳ(t, θ̂)/∂θ̂) θ̂˙,  i = 2, 3, ..., ρ − 1   (7.110)
υ̃˙_{m,ρ} = Ũ(t − D) + υ̃_{m,ρ+1} − k_ρ υ̃_{m,1} − e_ρᵀ A₀ᵐ (∂λʳ(t, θ̂)/∂θ̂) θ̂˙   (7.111)

we present the adaptive backstepping recursive control design below.


Coordinate Transformation:

z 1 = Y − Yr (7.112)
z i = υ̃m,i − αi−1 , i = 2, 3, ..., ρ. (7.113)

Stabilizing Functions:

α₁ = (1/b̂_m) ( −(c₁ + d₁) z₁ − ξ̃₂ − ω̃¯ᵀθ̂ )   (7.114)
α₂ = −b̂_m z₁ − ( c₂ + d₂ (∂α₁/∂z₁)² ) z₂ + β₂   (7.115)
α_i = −z_{i−1} − ( c_i + d_i (∂α_{i−1}/∂z₁)² ) z_i + β_i,  i = 3, ..., ρ   (7.116)
β_i = k_i υ̃_{m,1} + (∂α_{i−1}/∂z₁)( ξ̃₂ + ω̃ᵀθ̂ ) + (∂α_{i−1}/∂η̃)( A₀ η̃ + e_n z₁ ) + Σ_{j=1}^{m+i−1} (∂α_{i−1}/∂λ̃_j)( λ̃_{j+1} − k_j λ̃₁ ),  i = 2, ..., ρ,   (7.117)

where ci > 0, di > 0 for i = 1, 2, ..., ρ are design parameters.


Adaptive Control Law:

Ũ (t − D) = −υ̃m,ρ+1 + αρ . (7.118)

Note that αi for i = 1, 2, ..., ρ are linear in z 1 , η̃, λ̃ but nonlinear in θ̂ . Thus, through
a recursive but straightforward calculation, we show the following equalities:

z₂ = K_{2,z₁}(θ̂) z₁ + K_{2,η̃}(θ̂) η̃ + K_{2,λ̃}(θ̂) λ̃   (7.119)
z₃ = K_{3,z₁}(θ̂) z₁ + K_{3,η̃}(θ̂) η̃ + K_{3,λ̃}(θ̂) λ̃   (7.120)
z_{i+1} = K_{i+1,z₁}(θ̂) z₁ + K_{i+1,η̃}(θ̂) η̃ + K_{i+1,λ̃}(θ̂) λ̃,  i = 3, ..., ρ − 1   (7.121)
Ũ(t − D) = K_{z₁}(θ̂) z₁(t) + K_{η̃}(θ̂) η̃(t) + K_{λ̃}(θ̂) λ̃(t) = K_χ̃(θ̂) χ̃(t),   (7.122)

where the explicit expressions of K_{i,z₁}(θ̂), K_{i,η̃}(θ̂), K_{i,λ̃}(θ̂) for i = 2, ..., ρ and K_{z₁}(θ̂), K_{η̃}(θ̂), K_{λ̃}(θ̂) are as follows:

K_{2,z₁}(θ̂) = (1/b̂_m) ( c₁ + d₁ − â_{n−1} )   (7.123)
K_{2,η̃}(θ̂) = −(1/b̂_m) e₂ᵀ Â(A₀)   (7.124)
K_{2,λ̃}(θ̂) = (1/b̂_m) e₂ᵀ B̂(A₀)   (7.125)
K_{3,z₁}(θ̂) = b̂_m + ( c₂ + d₂ (∂α₁/∂z₁)² ) K_{2,z₁}(θ̂) + (∂α₁/∂z₁) â_{n−1} − (∂α₁/∂η̃) e_n   (7.126)
K_{3,η̃}(θ̂) = ( c₂ + d₂ (∂α₁/∂z₁)² ) K_{2,η̃}(θ̂) + (∂α₁/∂z₁) e₂ᵀ Â(A₀) − (∂α₁/∂η̃) A₀   (7.127)
K_{3,λ̃}(θ̂) = ( c₂ + d₂ (∂α₁/∂z₁)² ) K_{2,λ̃}(θ̂) + e₂ᵀ A₀^{m+1} − (∂α₁/∂z₁) e₂ᵀ B̂(A₀) − Σ_{j=1}^{m+1} (∂α₁/∂λ̃) e_j e_jᵀ A₀   (7.128)
K_{i+1,z₁}(θ̂) = K_{i−1,z₁}(θ̂) + ( c_i + d_i (∂α_{i−1}/∂z₁)² ) K_{i,z₁}(θ̂) + (∂α_{i−1}/∂z₁) â_{n−1} − (∂α_{i−1}/∂η̃) e_n   (7.129)
K_{i+1,η̃}(θ̂) = K_{i−1,η̃}(θ̂) + ( c_i + d_i (∂α_{i−1}/∂z₁)² ) K_{i,η̃}(θ̂) + (∂α_{i−1}/∂z₁) e₂ᵀ Â(A₀) − (∂α_{i−1}/∂η̃) A₀   (7.130)
K_{i+1,λ̃}(θ̂) = K_{i−1,λ̃}(θ̂) + ( c_i + d_i (∂α_{i−1}/∂z₁)² ) K_{i,λ̃}(θ̂) + e_iᵀ A₀^{m+1} − (∂α_{i−1}/∂z₁) e₂ᵀ B̂(A₀) − Σ_{j=1}^{m+i−1} (∂α_{i−1}/∂λ̃) e_j e_jᵀ A₀,  i = 3, 4, ..., ρ − 1   (7.131)
K_{z₁}(θ̂) = −[ K_{ρ−1,z₁}(θ̂) + ( c_ρ + d_ρ (∂α_{ρ−1}/∂z₁)² ) K_{ρ,z₁}(θ̂) + (∂α_{ρ−1}/∂z₁) â_{n−1} − (∂α_{ρ−1}/∂η̃) e_n ]   (7.132)
K_{η̃}(θ̂) = −[ K_{ρ−1,η̃}(θ̂) + ( c_ρ + d_ρ (∂α_{ρ−1}/∂z₁)² ) K_{ρ,η̃}(θ̂) + (∂α_{ρ−1}/∂z₁) e₂ᵀ Â(A₀) − (∂α_{ρ−1}/∂η̃) A₀ ]   (7.133)
K_{λ̃}(θ̂) = −[ K_{ρ−1,λ̃}(θ̂) + ( c_ρ + d_ρ (∂α_{ρ−1}/∂z₁)² ) K_{ρ,λ̃}(θ̂) + e_ρᵀ A₀^{m+1} − (∂α_{ρ−1}/∂z₁) e₂ᵀ B̂(A₀) − Σ_{j=1}^{m+ρ−1} (∂α_{ρ−1}/∂λ̃) e_j e_jᵀ A₀ ]   (7.134)

and

K_χ̃(θ̂) = [K_{z₁}(θ̂), K_{η̃}(θ̂), K_{λ̃}(θ̂)] ∈ R^{1×(2n+1)}.   (7.135)

If the parameter θ and the time delay D are known, replacing θ̂ by θ and bearing (7.64) and (7.122) in mind, one can prove that the following prediction-based control law

U(t) = Uʳ(t + D, θ) − K_χ̃(θ) χʳ(t + D, θ) + K_χ̃(θ) e^{A_χ̃(θ) D} χ(t) + D ∫₀¹ K_χ̃(θ) e^{A_χ̃(θ) D (1−y)} e_{2n+1} u(y, t) dy   (7.136)

achieves our control objective for system (7.58). To deal with the unknown plant parameters and the unknown actuator delay, using the certainty-equivalence principle, we bring in the reference transport PDE uʳ(x, t, θ̂) = Uʳ(t + Dx, θ̂), x ∈ [0, 1], and the corresponding PDE error variable ũ(x, t) = u(x, t) − uʳ(x, t, θ̂) satisfying

D ũ_t(x, t) = ũ_x(x, t) − D (∂uʳ(x, t, θ̂)/∂θ̂) θ̂˙,  x ∈ [0, 1]   (7.137)
ũ(1, t) = Ũ(t) = U(t) − Uʳ(t + D̂, θ̂),   (7.138)

where D̂(t) is an estimate of D and D̃(t) = D − D̂(t). Here we further bring in a backstepping transformation similar to that presented in [8] and [15, Sect. 2.2],

w(x, t) = ũ(x, t) − K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} χ̃(t) − D̂ ∫₀ˣ K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ũ(y, t) dy   (7.139)
ũ(x, t) = w(x, t) + K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ x} χ̃(t) + D̂ ∫₀ˣ K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ (x−y)} e_{2n+1} w(y, t) dy   (7.140)

with which systems (7.73), (7.93), (7.108), (7.109)–(7.111), and (7.137), (7.138) are
transformed into the closed-loop target error systems as follows:

ż = A_z(θ̂) z + W_ε(θ̂)( ε₂ + ωᵀθ̃ ) + Q(z, t)ᵀ θ̂˙ + Qʳ(t, θ̂)ᵀ θ̂˙ + e_ρ w(0, t)   (7.141)
η̃˙ = A₀ η̃ + e_n z₁   (7.142)
ζ̃˙ = A_b ζ̃ + b_b z₁   (7.143)
ε̇ = A₀ ε   (7.144)
D w_t(x, t) = w_x(x, t) − D ( ε₂ + ωᵀθ̃ ) r₀(x, t) − D̃ p₀(x, t) − D D̂˙ q₀(x, t) − D θ̂˙ᵀ q(x, t)   (7.145)
w(1, t) = 0,   (7.146)

where A z (θ̂ ), Wε (θ̂), Q(z, t)T , Q r (t, θ̂ )T , r0 (x, t), p0 (x, t), q0 (x, t), and q(x, t) are
listed as follows:
A_z(θ̂) =
⎡ −(c₁ + d₁)                    b̂_m                0     ⋯     0 ⎤
⎢ −b̂_m        −( c₂ + d₂ (∂α₁/∂z₁)² )             1     ⋱     ⋮ ⎥
⎢     0                         −1                 ⋱     ⋱     0 ⎥
⎢     ⋮                          ⋱                 ⋱     ⋱     1 ⎥
⎣     0         ⋯                0                −1   −( c_ρ + d_ρ (∂α_{ρ−1}/∂z₁)² ) ⎦   (7.147)

W_ε(θ̂) = [ 1 ; −∂α₁/∂z₁ ; ⋮ ; −∂α_{ρ−1}/∂z₁ ],   Q(z, t)ᵀ = [ 0 ; −∂α₁/∂θ̂ ; ⋮ ; −∂α_{ρ−1}/∂θ̂ ],

Qʳ(t, θ̂)ᵀ = [ 0 ; −( e₂ᵀ A₀ᵐ − Σ_{j=1}^{m+1} (∂α₁/∂λ̃_j) e_jᵀ ) ∂λʳ(t, θ̂)/∂θ̂ ; ⋮ ; −( e_ρᵀ A₀ᵐ − Σ_{j=1}^{m+ρ−1} (∂α_{ρ−1}/∂λ̃_j) e_jᵀ ) ∂λʳ(t, θ̂)/∂θ̂ ]   (7.148)

r₀(x, t) = K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} e₁   (7.149)
p₀(x, t) = K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} ( A_χ̃(θ̂) χ̃(t) + e_{2n+1} ũ(0, t) )   (7.150)
 = K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} ( (A_χ̃ + e_{2n+1} K_χ̃)(θ̂) χ̃(t) + e_{2n+1} w(0, t) )   (7.151)
q₀(x, t) = K_χ̃(θ̂) A_χ̃(θ̂) x e^{A_χ̃(θ̂) D̂ x} χ̃(t) + ∫₀ˣ K_χ̃(θ̂) ( I + A_χ̃(θ̂) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ũ(y, t) dy   (7.152)
 = [ K_χ̃(θ̂) A_χ̃(θ̂) x e^{A_χ̃(θ̂) D̂ x} + ∫₀ˣ K_χ̃(θ̂) ( I + A_χ̃(θ̂) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ y} dy ] χ̃(t)
   + ∫₀ˣ w(y, t) [ K_χ̃(θ̂) ( I + A_χ̃(θ̂) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} + D̂ ∫_yˣ K_χ̃(θ̂) ( I + A_χ̃(θ̂) D̂ (x − s) ) e^{A_χ̃(θ̂) D̂ (x−s)} e_{2n+1} K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ (s−y)} e_{2n+1} ds ] dy   (7.153)
q_i(x, t) = ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ x ) e^{A_χ̃(θ̂) D̂ x} χ̃(t)
   + D̂ ∫₀ˣ ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ũ(y, t) dy
   − K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} ∂χʳ(t, θ̂)/∂θ̂_i + ∂uʳ(x, t, θ̂)/∂θ̂_i − D̂ ∫₀ˣ K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ∂uʳ(y, t, θ̂)/∂θ̂_i dy   (7.154)
 = [ ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ x ) e^{A_χ̃(θ̂) D̂ x} + D̂ ∫₀ˣ ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ y} dy ] χ̃(t)
   + D̂ ∫₀ˣ w(y, t) [ ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} + D̂ ∫_yˣ ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ (x − s) ) e^{A_χ̃(θ̂) D̂ (x−s)} e_{2n+1} K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ (s−y)} e_{2n+1} ds ] dy
   − K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} ∂χʳ(t, θ̂)/∂θ̂_i + ∂uʳ(x, t, θ̂)/∂θ̂_i − D̂ ∫₀ˣ K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ∂uʳ(y, t, θ̂)/∂θ̂_i dy.   (7.155)
0 ∂ θ̂i

As a consequence, we design our control law as follows to ensure that (7.146) holds:

U(t) = Uʳ(t + D̂, θ̂) − K_χ̃(θ̂) χʳ(t + D̂, θ̂) + K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂} χ(t) + D̂ ∫₀¹ K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ (1−y)} e_{2n+1} u(y, t) dy.   (7.156)

Two Lyapunov-based update laws, estimating the unknown plant parameters and the unknown actuator delay, are given as follows:

D̂˙ = γ_D Proj_{[D̲, D̄]}{τ_D},  γ_D > 0   (7.157)
τ_D = −∫₀¹ (1 + x) w(x, t) p₀(x, t) dx   (7.158)
θ̂˙ = γ_θ Proj_Θ{τ_θ},  γ_θ > 0,  b̂_m(0) sgn(b_m) ≥ b̲_m   (7.159)
τ_θ = ω W_ε(θ̂)ᵀ z − (1/(2g)) ω ∫₀¹ (1 + x) w(x, t) r₀(x, t) dx,   (7.160)

where g > 0, Proj_{[D̲, D̄]}{·} is a standard projection operator defined on the interval [D̲, D̄], and Proj_Θ{·} is a standard projection algorithm defined on the set Θ that guarantees |b̂_m| ≥ b̲_m > 0.
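The interval projection used in (7.157) simply zeroes the update whenever the estimate sits on a boundary and the update points outward. A minimal sketch in Python (the gains, bounds, and the constant update value are assumed for illustration):

```python
# Standard interval projection: pass tau through unchanged except when
# the estimate is at a boundary of [Dmin, Dmax] and tau points outward.
def proj_interval(Dhat, tau, Dmin, Dmax):
    if Dhat <= Dmin and tau < 0.0:
        return 0.0
    if Dhat >= Dmax and tau > 0.0:
        return 0.0
    return tau

# The estimate cannot escape [Dmin, Dmax] (up to one Euler step of
# overshoot, which the projection then freezes).
Dhat, gamma, dt = 1.0, 5.0, 0.001
for _ in range(1000):
    tau = -3.0                       # an update persistently pushing down
    Dhat += dt * gamma * proj_interval(Dhat, tau, 0.5, 1.5)
print(Dhat)   # frozen near the lower bound 0.5
```

The functional projection (7.196) for the kernel estimates plays the same role on the ball of L² radius b̄_i around b_i*, subtracting the outward radial component of the update instead of zeroing it.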
The adaptive controller is shown in Fig. 7.2. Finally, the stability of the ODE–PDE cascade system is summarized in the main theorem below.

Fig. 7.2 Observer-based adaptive control for linear systems with input delays

Theorem 7.4 Consider the closed-loop system consisting of the plant (7.58), the K-
filters (7.67)–(7.72), the adaptive controller (7.156), the time-delay identifier (7.157),
(7.158), and the parameter identifier (7.159)–(7.160). There exists a constant M > 0
such that if the initial error state satisfies the condition

|z(0)|² + |η̃(0)|² + |ζ̃(0)|² + |θ̃(0)|² + |D̃(0)|² + |ε(0)|² + ∫₀¹ w(x, 0)² dx ≤ M   (7.161)

then all the signals of the closed-loop system are bounded and the asymptotic tracking
is achieved, i.e.,
lim z 1 (t) = lim (Y (t) − Yr (t)) = 0. (7.162)
t→∞ t→∞

Proof The proof is found in [28, Chap. 5] and [29].

Based on the equalities (7.75), (7.85)–(7.88), (7.107), (7.119)–(7.121), (7.139)–(7.140), θ̃ = θ − θ̂, and D̃ = D − D̂, the initial condition on the error state in (7.161) can be translated into initial conditions on the states of the actual plant, the filters, the identifiers, and the transport PDE, namely X(0), η(0), λ(0), θ̂(0), D̂(0), and u(x, 0).
We illustrate next, through simulations, the proposed scheme by applying it numer-
ically to the following three-dimensional plant with relative degree two:

Ẋ 1 (t) = X 2 (t) − a2 Y (t)


Ẋ 2 (t) = X 3 (t) − a1 Y (t) + b1 U (t − D)
Ẋ 3 (t) = −a0 Y (t) + b0 U (t − D)
Y (t) = X 1 (t), (7.163)

Fig. 7.3 Output and input tracking (top: output trajectory Y versus reference Yr; bottom: control input U versus Ur; over 0–60 s)

Fig. 7.4 Delay and parameter estimate (top: delay estimate over 0–60 s; bottom: parameter estimate â₂ over 0–60 s)



where X₂(t), X₃(t) are the unmeasurable states, Y(t) is the measured output, D = 1 is the unknown constant time delay in the control input, and a₂ = −3 is the unknown constant system parameter. To show the effectiveness of the developed control algorithm, without loss of generality, we simplify the simulation by assuming that a₁ = a₀ = 0 and b₁ = b₀ = 1 are known and that the PDE state u(x, t) is measurable. It is easy to check that this plant is open-loop unstable, since it has poles with nonnegative real parts. The prior information D̲ = 0.5, D̄ = 1.5, a̲₂ = −4, ā₂ = −2 is known to the designer. Yr(t) = sin t is the known reference signal to track. The boundedness and asymptotic tracking of the system output and control input, as well as the estimates of θ and D, are shown in Figs. 7.3 and 7.4.
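The instability claim can be verified directly from the plant data (a plain-Python sketch; the Faddeev–LeVerrier recursion is used here only because it needs no external libraries). The characteristic polynomial of the observer canonical state matrix of (7.163) is s³ + a₂s² + a₁s + a₀ = s³ − 3s², with roots {0, 0, 3}:

```python
# Characteristic polynomial of the plant (7.163) in observer canonical
# form, with a2 = -3 and a1 = a0 = 0 (the values used above).
a2, a1, a0 = -3.0, 0.0, 0.0
A = [[-a2, 1.0, 0.0],
     [-a1, 0.0, 1.0],
     [-a0, 0.0, 0.0]]

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(P):
    return sum(P[i][i] for i in range(len(P)))

# Faddeev-LeVerrier: p(s) = s^3 + c[0] s^2 + c[1] s + c[2]
n, M, c = 3, [row[:] for row in A], []
c.append(-trace(M))
for k in range(2, n + 1):
    Mc = [[M[i][j] + (c[-1] if i == j else 0.0) for j in range(n)]
          for i in range(n)]
    M = matmul(A, Mc)
    c.append(-trace(M) / k)
print(c)   # [-3.0, 0.0, 0.0], i.e. [a2, a1, a0]
```

The recovered coefficients equal [a₂, a₁, a₀], confirming both the canonical-form structure and the poles {0, 0, 3} with nonnegative real parts.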

7.4 Adaptive Control for Linear Systems with Distributed


Input Delays

In this section, we deal with the adaptive control problem for linear systems with
distributed input delays. The plant model is as follows:
Ẋ(t) = AX(t) + ∫₀ᴰ B(D − σ) U(t − σ) dσ   (7.164)
Y(t) = C X(t)   (7.165)

which is equivalent to the following ODE–PDE cascade system:


Ẋ(t) = AX(t) + D ∫₀¹ B(Dx) u(x, t) dx   (7.166)
Y(t) = C X(t)   (7.167)
D u_t(x, t) = u_x(x, t),  x ∈ [0, 1]   (7.168)
u(1, t) = U(t).   (7.169)
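The equivalence of (7.164) and (7.166) is just the change of variables σ = D(1 − x), under which u(x, t) = U(t + D(x − 1)) becomes U(t − σ). A numerical sanity check (a sketch with an assumed kernel and input, not from the chapter):

```python
# Check: \int_0^D B(D - sigma) U(t - sigma) d sigma
#      = D \int_0^1 B(D x) u(x, t) dx  with  u(x, t) = U(t + D (x - 1)).
import math

D = 1.3
B = lambda s: s * s          # assumed smooth delay kernel
U = lambda t: math.sin(t)    # assumed input signal
t, N = 2.0, 4000

# left side, midpoint rule in sigma
lhs = sum(B(D - (i + 0.5) * D / N) * U(t - (i + 0.5) * D / N)
          for i in range(N)) * D / N
# right side, midpoint rule in x
rhs = sum(B(D * (i + 0.5) / N) * U(t + D * ((i + 0.5) / N - 1.0))
          for i in range(N)) * D / N
print(abs(lhs - rhs))        # ~ 0: the substitution maps one sum to the other
```

With the midpoint grids chosen this way, the two sums sample exactly the same points (in reverse order), so they agree to floating-point rounding.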

Focusing on (7.164)–(7.165) and its transformation (7.166)–(7.169), a linear plant with distributed actuator delay involves the following five types of basic uncertainties:
• unknown delay D,
• unknown delay kernel B(Dx),
• unknown parameters in the system matrix A,
• unmeasurable finite-dimensional plant state X (t), and
• unmeasurable infinite-dimensional actuator state u(x, t).
This section addresses the most relevant problem, where the delay and the delay kernel are unknown. The n-dimensional input vector B(Dx), x ∈ [0, 1], is a continuous function of Dx,

B(Dx) = [ρ₁(Dx); ρ₂(Dx); ⋮; ρ_n(Dx)],   (7.170)

where ρ_i(Dx) for i = 1, ..., n are unknown components of the vector-valued function B(Dx). On the basis of (7.166), we further denote


B(x) = D B(Dx) = Σ_{i=1}^{n} D ρ_i(Dx) B_i = Σ_{i=1}^{n} b_i(x) B_i,   (7.171)

where b_i(x) = D ρ_i(Dx) for i = 1, ..., n are unknown scalar continuous functions of x, and B_i ∈ Rⁿ for i = 1, ..., n are the corresponding unit coordinate vectors. The system (7.166)–(7.169) is rewritten as

Ẋ(t) = AX(t) + ∫₀¹ B(x) u(x, t) dx   (7.172)
Y(t) = C X(t)   (7.173)
D u_t(x, t) = u_x(x, t),  x ∈ [0, 1]   (7.174)
u(1, t) = U(t).   (7.175)

A few assumptions are now made.


Assumption 7.7 There exist known constants D̲, D̄, b̄_i and known continuous functions b_i*(x) such that

0 < D̲ ≤ D ≤ D̄,  0 < ∫₀¹ ( b_i(x) − b_i*(x) )² dx ≤ b̄_i   (7.176)

for i = 1, ..., n.

Assumption 7.8 The pair (A, β) is stabilizable where


 1
β= e−AD(1−x) B(x)d x. (7.177)
0

Namely, there exists a vector K(β) making A + βK(β) Hurwitz, such that

(A + β K (β))T P(β) + P(β) ( A + β K (β)) = −Q(β), (7.178)

where P(β) = P(β)T > 0 and Q(β) = Q(β)T > 0.
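The vector β in (7.177) can be evaluated by quadrature once A, D, and the kernel are fixed. A sketch with an assumed nilpotent A (so the matrix exponential has a closed form), kernel B(x) = [0, 1 + x]ᵀ, and D = 1:

```python
# Quadrature for beta = \int_0^1 e^{-A D (1-x)} B(x) dx, with the
# assumed toy pair A = [[0, 1], [0, 0]] (e^{A t} = [[1, t], [0, 1]]).
D = 1.0

def expmA(t):                 # e^{A t} for the nilpotent A above
    return [[1.0, t], [0.0, 1.0]]

def B(x):                     # assumed kernel
    return [0.0, 1.0 + x]

N = 2000
beta = [0.0, 0.0]
for i in range(N):            # midpoint rule on [0, 1]
    x = (i + 0.5) / N
    E = expmA(-D * (1.0 - x))
    b = B(x)
    beta[0] += (E[0][0] * b[0] + E[0][1] * b[1]) / N
    beta[1] += (E[1][0] * b[0] + E[1][1] * b[1]) / N
print(beta)   # analytically [-2/3, 3/2]
```

For this example, β = [−2/3, 3/2]ᵀ analytically, and (A, β) is controllable, so a stabilizing K(β) exists; in the adaptive scheme below, the same quadrature with D̂(t) and B̂(x, t) produces β̂(t).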


Denote D̂(t) and b̂i (x, t) as the estimates of D and bi (x) (for i = 1, ..., n) with
estimation errors satisfying

D̃(t) = D − D̂(t) (7.179)


b̃i (x, t) = bi (x) − b̂i (x, t) (7.180)

and

B̂(x, t) = Σ_{i=1}^{n} b̂_i(x, t) B_i.   (7.181)

The delay-adaptive control scheme is designed as follows:


The control law is

U (t) = u(1, t) = K (β̂(t))Z (t), (7.182)

where K(β̂(t)) is chosen to make

A_cl(β̂(t)) = A + ( ∫₀¹ e^{−A D̂(t)(1−x)} B̂(x, t) dx ) K(β̂(t)) = A + ( ∫₀¹ e^{−A D̂(t)(1−x)} Σ_{i=1}^{n} b̂_i(x, t) B_i dx ) K(β̂(t))   (7.183)

Hurwitz, where the bracketed integral defines β̂(t), and

Z(t) = X(t) + D̂(t) ∫₀¹ ∫₀ˣ e^{−A D̂(t)(x−y)} B̂(y, t) dy u(x, t) dx.   (7.184)

The update laws are

D̂˙(t) = γ_D Proj_{[D̲, D̄]}{τ_D(t)},  γ_D > 0   (7.185)
τ_D(t) = [ (1/g) Zᵀ(t) P(β̂(t)) f_D(t) − ∫₀¹ (1 + x) w(x, t) h_D(x, t) dx ] / ( 1 + Ξ(t) )   (7.186)
b̂˙_i(x, t) = γ_b Proj{τ_{bi}(x, t)},  γ_b > 0   (7.187)
τ_{bi}(x, t) = [ (1/g) Zᵀ(t) P(β̂(t)) f_{bi}(x, t) − ∫₀¹ (1 + y) w(y, t) h_{bi}(y, t) dy f_{bi}(x, t) ] / ( 1 + Ξ(t) ),   (7.188)

where P(β̂(t)) satisfies (7.178) and g > 0 is a design coefficient,

w(x, t) = u(x, t) − K(β̂(t)) e^{A_cl(β̂(t)) D̂(t)(x−1)} Z(t)   (7.189)
Ξ(t) = Zᵀ(t) P(β̂(t)) Z(t) + g ∫₀¹ (1 + x) w(x, t)² dx   (7.190)
f_D(t) = ∫₀¹ B̂(x, t) u(x, t) dx − ∫₀¹ e^{−A D̂(t)(1−x)} B̂(x, t) dx u(1, t) − D̂(t) ∫₀¹ ∫₀ˣ A e^{−A D̂(t)(x−y)} B̂(y, t) dy u(x, t) dx   (7.191)
h_D(x, t) = K(β̂(t)) e^{A_cl(β̂(t)) D̂(t)(x−1)} ( f_D(t) + A_cl(β̂(t)) Z(t) )   (7.192)
f_{bi}(x, t) = B_i u(x, t)   (7.193)
h_{bi}(x, t) = K(β̂(t)) e^{A_cl(β̂(t)) D̂(t)(x−1)}   (7.194)

for i = 1, ..., n, and the projection operators are defined as

Proj_{[D̲, D̄]}{τ} = 0 if D̂(t) = D̲ and τ < 0;  0 if D̂(t) = D̄ and τ > 0;  τ otherwise   (7.195)

Proj{τ(x)} = τ(x) − ( b̂_i(x) − b_i*(x) ) [ ∫₀¹ ( b̂_i(x) − b_i*(x) ) τ(x) dx ] / [ ∫₀¹ ( b̂_i(x) − b_i*(x) )² dx ]
  if ∫₀¹ ( b̂_i(x) − b_i*(x) )² dx = b̄_i and ∫₀¹ ( b̂_i(x) − b_i*(x) ) τ(x) dx > 0;
Proj{τ(x)} = τ(x) otherwise.   (7.196)

Theorem 7.5 Consider the closed-loop system consisting of the plant (7.172)–(7.175) and the adaptive controller (7.182)–(7.196). All the states X(t), u(x, t), D̂(t), b̂_i(x, t) of the closed-loop system are globally bounded, and the regulation of X(t) and U(t), i.e., lim_{t→∞} X(t) = lim_{t→∞} U(t) = 0, is achieved.

Proof The proof is found in [28, Chap. 11] and [31].

7.5 Beyond the Results Given Here

Both adaptive control and control of PDEs are challenging subjects. In adaptive control, the challenge comes from the need to design feedback for a plant whose dynamics may be highly uncertain (due to largely unknown plant parameters) and open-loop unstable, so that control and learning must be conducted at the same time.

In control of PDEs, the challenge lies in the infinite-dimensional nature of the system dynamics. The adaptive control problem for PDEs [1, 25] is a task whose difficulties are even greater than the sum of the difficulties of its two components. In particular, conventional adaptive control methods for ODEs [3, 12, 18] cannot be applied directly to uncertain delay systems, whose dynamics are infinite dimensional.
By modeling the actuator state under input delay as a transport PDE and regarding
the propagation speed (delay dependent) as a parameter in the infinite-dimensional
part of the ODE–PDE cascade system, an introductory exposition of adaptive control
of delay systems has been given in this chapter. As aforementioned, linear systems
with input delays usually have the following five types of uncertainties:
• unknown input delays,
• unknown delay kernels,
• unknown plant parameters,
• unmeasurable finite-dimensional plant state, and
• unmeasurable infinite-dimensional actuator state.
Different uncertainty combinations result in different control designs. For a tutorial introduction, in this chapter we chose a subset of the most basic results.
More results are available in the articles [4, 7–9, 17, 27, 29–35] and are summa-
rized in the books [15, 28]. The first adaptive control design for an ODE system with
a large discrete input delay of unknown length was developed by Bresch-Pietri and
Krstic [8, 9, 17]. An introduction where the delay is unknown but the ODE plants
are known and the transport state is measurable is available in [17] and [15, Chap. 7].
The publications [8] and [15, Chap. 9] generalize the design to the situation where,
besides the unknown delay value, the ODE also has unknown parameters. The ref-
erences [9] and [15, Chap. 8] solve the problem of adaptive stabilization when both
the delay value and the actuator state are unavailable. In the case where the delay
state is not available for measurement, only local stability is obtained as the problem
is not linearly parameterized, which means that the initial delay estimate needs to
be sufficiently close to the true delay. More uncertainty combinations without the
knowledge of the actuator state are taken into account in [7].
In [29, 35] and [28, Chaps. 4–5], we deal with the observer-based delay-adaptive
control problems in the presence of uncertain parameters and state of the ODE plant,
where the modeling of the observer canonical form and the Kreisselmeier filters is
employed for the ODE-state estimation. On the basis of uncertainty-free case [26],
we consider adaptive control for multi-input linear systems with distinct discrete
input delays in [30, 32, 33] and [28, Chaps. 6–9]. Since the delay lengths in the input channels are not identical, the multi-input plant significantly complicates the prediction design, as it requires computing future state values on different time horizons (which seems non-causal at first sight).
In [5], the PDE-backstepping predictor method is extended to compensate for another
large family of delays: distributed input delays. Since linear systems with distributed
delays consisting of the finite-dimensional plant state and the infinite-dimensional
actuator state are not in the strict-feedback form, a novel forwarding-backstepping
transformation is introduced to transform the system to an exponentially stable target

system [5]. The paper [4], for the first time, addresses the adaptive stabilization
problem of linear systems with unknown parameters and known distributed input
delays. Delay-adaptive control for linear systems with unknown distributed input
delays are further studied in [31, 34], [28, Chap. 11] (single-input case) and in [27],
[28, Chap. 12] (multi-input case with distinct delays).
Sizable opportunities exist for further development of the subject of delay-adaptive
control, including systems with simultaneous input and state delays, and PDE systems
with delays.

References

1. Anfinsen, H., Aamo, O.-M.: Adaptive Control of Hyperbolic PDEs. Springer, Berlin (2019)
2. Artstein, Z.: Linear systems with delayed controls: a reduction. IEEE Trans. Autom. Control
27(4), 869–879 (1982)
3. Astrom, K.J., Wittenmark, B.: Adaptive Control. Courier Corporation, Chelmsford (2013)
4. Bekiaris-Liberis, N., Jankovic, M., Krstic, M.: Adaptive stabilization of LTI systems with
distributed input delay. Int. J. Adapt. Control Signal Proc. 27, 47–65 (2013)
5. Bekiaris-Liberis, N., Krstic, M.: Lyapunov stability of linear predictor feedback for distributed
input delays. IEEE Trans. Autom. Control 56, 655–660 (2011)
6. Bekiaris-Liberis, N., Krstic, M.: Nonlinear Control Under Nonconstant Delays. Society for
Industrial and Applied Mathematics, Philadelphia (2013)
7. Bresch-Pietri, D., Chauvin, J., Petit, N.: Adaptive control scheme for uncertain time-delay
systems. Automatica 48, 1536–1552 (2012)
8. Bresch-Pietri, D., Krstic, M.: Adaptive trajectory tracking despite unknown input delay and
plant parameters. Automatica 45, 2074–2081 (2009)
9. Bresch-Pietri, D., Krstic, M.: Delay-adaptive predictor feedback for systems with unknown
long actuator delay. IEEE Trans. Autom. Control 55(9), 2106–2112 (2010)
10. Evesque, S., Annaswamy, A.M., Niculescu, S., Dowling, A.P.: Adaptive control of a class of
time-delay systems. ASME Trans. Dynam. Syst. Meas. Control 125, 186–193 (2003)
11. Fridman, E.: Introduction to Time-Delay Systems: Analysis and Control. Birkhauser (2014)
12. Goodwin, G.C., Sin, K.S.: Adaptive Filtering, Prediction and Control. Courier Corporation,
Chelmsford (2014)
13. Karafyllis, I., Krstic, M.: Delay-robustness of linear predictor feedback without restriction on
delay rate. Automatica 49, 1761–1767 (2013)
14. Krstic, M.: Lyapunov tools for predictor feedbacks for delay systems: inverse optimality and
robustness to delay mismatch. Automatica 44, 2930–2935 (2008)
15. Krstic, M.: Delay Compensation for Nonlinear, Adaptive, and PDE Systems. Birkhauser, Berlin
(2009)
16. Krstic, M.: Compensation of infinite-dimensional actuator and sensor dynamics: nonlinear and
delay-adaptive systems. IEEE Control Syst. Mag. 30, 22–41 (2010)
17. Krstic, M., Bresch-Pietri, D.: Delay-adaptive full-state predictor feedback for systems with
unknown long actuator delay. In: Proceedings of 2009 American Control Conference, Hyatt
Regency Riverfront, St. Louis, MO, USA, June 10–12 (2009)
18. Krstic, M., Kanellakopoulos, I., Kokotovic, P.V.: Nonlinear and Adaptive Control Design.
Wiley, New York (1995)
19. Krstic, M., Smyshlyaev, A.: Backstepping boundary control for first-order hyperbolic PDEs
and application to systems with actuator and sensor delays. Syst. Control Lett. 57(9), 750–758
(2008)
20. Krstic, M., Smyshlyaev, A.: Boundary Control of PDEs: A Course on Backstepping Designs.
SIAM, Philadelphia (2008)
216 M. Krstic and Y. Zhu

21. Liu, W.-J., Krstic, M.: Adaptive control of Burgers’ equation with unknown viscosity. Int. J.
Adapt. Control Signal Process. 15, 745–766 (2001)
22. Niculescu, S.-I., Annaswamy, A.M.: An adaptive Smith controller for time-delay systems with
relative degree n* ≤ 2. Syst. Control Lett. 49, 347–358 (2003)
23. Ortega, R., Lozano, R.: Globally stable adaptive controller for systems with delay. Int. J. Control
47, 17–23 (1988)
24. Smith, O.J.M.: A controller to overcome dead time. ISA 6, 28–33 (1959)
25. Smyshlyaev, A., Krstic, M.: Adaptive Control of Parabolic PDEs. Princeton University Press,
Princeton (2010)
26. Tsubakino, D., Krstic, M., Oliveira, T.R.: Exact predictor feedbacks for multi-input LTI systems
with distinct input delays. Automatica 71, 143–150 (2016)
27. Zhu, Y., Krstic, M.: Adaptive and robust predictors for multi-input linear systems with dis-
tributed delays. SIAM J. Control Optim. 58, 3457–3485 (2020)
28. Zhu, Y., Krstic, M.: Delay-adaptive Linear Control. Princeton University Press, Princeton
(2020)
29. Zhu, Y., Krstic, M., Su, H.: Adaptive output feedback control for uncertain linear time-delay
systems. IEEE Trans. Autom. Control 62, 545–560 (2017)
30. Zhu, Y., Krstic, M., Su, H.: Adaptive global stabilization of uncertain multi-input linear time-
delay systems by PDE full-state feedback. Automatica 96, 270–279 (2018)
31. Zhu, Y., Krstic, M., Su, H.: Delay-adaptive control for linear systems with distributed input
delays. Automatica 116, 108902 (2020)
32. Zhu, Y., Krstic, M., Su, H.: PDE boundary control of multi-input LTI systems with distinct and
uncertain input delays. IEEE Trans. Autom. Control 63, 4270–4277 (2018)
33. Zhu, Y., Krstic, M., Su, H.: PDE output feedback control of LTI systems with uncertain multi-
input delays, plant parameters and ODE state. Syst. Control Lett. 123, 1–7 (2019)
34. Zhu, Y., Krstic, M., Su, H.: Predictor feedback for uncertain linear systems with distributed
input delays. IEEE Trans. Autom. Control 64, 5344–5351 (2020)
35. Zhu, Y., Su, H., Krstic, M.: Adaptive backstepping control of uncertain linear systems under
unknown actuator delay. Automatica 54, 256–265 (2015)
Chapter 8
Adaptive Control for Systems with
Time-Varying Parameters—A Survey

Kaiwen Chen and Alessandro Astolfi

Dedicated to Laurent Praly

Abstract Adaptive control was originally proposed to control systems, the model
of which changes over time. However, traditionally, classical adaptive control has
been developed for systems with constant parameters. This chapter surveys the so-
called congelation of variables method to overcome the obstacle of time-varying
parameters. Two examples, illustrating how to deal with time-varying parameters
in the feedback path and in the input path, respectively, are first presented. Then,
n-dimensional lower-triangular systems are discussed to show how to combine the
congelation of variables method with adaptive backstepping. Finally, we study how
to control a class of nonlinear systems via output feedback: this is a problem that
cannot be solved directly due to the coupling between the input and the time-varying
perturbation. It turns out that if we assume a strong minimum-phase property, namely,
ISS of the inverse dynamics, such a coupling is converted into a coupling between
the output and the time-varying perturbation. Then, a small-gain-like analysis, which
takes all subsystems into account, yields a controller that achieves output regulation

∗ This chapter is partially reprinted from [8] with the following copyright and permission notice:
©2021 IEEE. Reprinted, with permission, from Chen, K., Astolfi, A.: Adaptive control for systems
with time-varying parameters. IEEE Transactions on Automatic Control 66(5), 1986–2001 (2021).

K. Chen (B) · A. Astolfi


Imperial College London, SW7 2AZ London, UK
e-mail: [email protected]
A. Astolfi
e-mail: [email protected]
A. Astolfi
Università di Roma “Tor Vergata”, 00133 Rome, Italy
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5_8

and boundedness of all closed-loop signals. Simulation results are presented to
demonstrate that the proposed controller achieves asymptotic output regulation and
outperforms the classical adaptive controller in the presence of time-varying parameters.

8.1 Introduction

Since the publication of the seminal paper [10] that proposes an adaptive control
scheme with theoretically guaranteed stability properties, adaptive control has under-
gone extensive research (see, e.g., [2, 13, 17, 26, 29]) typically under the assumption
that the system parameters are constant. This, however, somehow deviates from the
original intention that one could use adaptive control to cope with plants or environ-
ments that change with time. One would expect that if the time-varying parameters
can be estimated exactly, their effects could also be exactly cancelled by a certainty-
equivalence design. Therefore, early works on adaptive control for time-varying
systems (see, e.g., [12]) exploit persistence of excitation to guarantee stability by
ensuring that parameter estimates converge to the true parameters. The restriction of
persistence of excitation is relaxed by subsequent works (see, e.g., [16, 25]) which
only require bounded and slow (in an average sense) parameter variations.
More recent contributions can be mainly categorized into two trends, both of which
exploit techniques from robust adaptive control to confine the parameter estimates.
One of the trends is based on the so-called switching σ -modification (see, e.g., [13]),
a mechanism which adds leakage to the parameter update integrator, if the parameter
estimates drift out of a pre-specified “reasonable” region, to guarantee boundedness
of the parameter estimates. This approach achieves asymptotic tracking when the
parameters are constant, otherwise the tracking error is nonzero and related to the
rates of the parameter variations, see [30]. Such a result can be further improved, as
shown in [32, 33], if one could model the parameter variations in two parts: known
parameter variations and unknown variations, in which case the residual tracking
error only depends on the rates of the unknown parameter variations.
The other trend exploits the projection operation (see, e.g., [11, 28]), which con-
fines the parameter estimates within a pre-specified compact set to guarantee bound-
edness of the parameter estimates, and the so-called filtered transformation, which
is essentially an adaptive observer described via a change of coordinates, see [20,
21, 23]. These methods guarantee asymptotic tracking provided that the parameters
are bounded in a compact set, their derivatives are L1 , and the disturbance on the
state evolution is additive and L2 . Moreover, a priori knowledge on parameter vari-
ations is not needed and the residual tracking error is independent of the rates of the
parameter variations.
The methods mentioned above cannot guarantee zero-error regulation when the
unknown parameters are persistently varying. To achieve asymptotic state/output
regulation when the time-varying parameters are neither known nor asymptotically
constant, in [3, 4] a method called the congelation of variables has been proposed
and developed on the basis of the adaptive backstepping approach and the adaptive

immersion and invariance (I&I) approach, respectively. In the spirit of the conge-
lation of variables method, each unknown time-varying parameter is treated as a
nominal unknown constant parameter perturbed by the difference between the true
parameter and the nominal parameter, which causes a time-varying perturbation term.
The controller design is then divided into a classical adaptive control design, with
constant unknown parameters, and a damping design via dominance to counteract the
time-varying perturbation terms. This method is compatible with most adaptive con-
trol schemes using parameter estimates, as it does not change the original parameter
update law designed for time-invariant systems.
Since full-state feedback is not always implementable, most practical scenarios
require an output-feedback adaptive control scheme. In the output-feedback design
with the congelation of variables method, the major difficulty is caused by the
coupling between the input and the time-varying perturbation. In this case, sim-
ply strengthening damping terms in the controller alters the input (as well as the
perturbation itself) and therefore causes a chicken-and-egg dilemma, which pre-
vents stabilization via dominance. In [5, 6], a special output-feedback case is solved
on the basis of adaptive backstepping and adaptive I&I, respectively, by exploit-
ing a modified minimum-phase property for time-varying systems and decomposing
the coupling between the input and the time-varying perturbation into a coupling
between some output-related nonlinearities and some “new” time-varying perturba-
tions, which enables the use of the dominance design again, though it is still restricted
by a relative degree condition. This restriction is relaxed in [8] by using the nonlinear
negative feedback control law proposed in [7] and by performing a stability analysis
that takes all the filter subsystems into account.
This chapter summarizes the ideas and results of [3, 5, 7, 8] and gives some
extensions. The chapter is organized as follows. In Sect. 8.2, two scalar systems to
illustrate the use of the congelation of variables method are presented, and an n-
dimensional lower triangular system with unmatched uncertainties controlled by an
adaptive state-feedback controller is discussed to elaborate on the combination of
the congelation of variables method with adaptive backstepping. With these design
tools, in Sect. 8.3, we recall the results developed in [5–8] on the decomposition of
the perturbation coupled with the input, on the input and output filters design, and on
a small-gain-like controller design. In Sect. 8.4, a numerical example to highlight the
performance improvement achievable with the proposed scheme is presented. For
conciseness, most of the technical proofs in this chapter are omitted. All the proofs
can be found in [8].

Notation This chapter uses standard notation unless stated otherwise. For an
n-dimensional vector v ∈ Rⁿ, |v| denotes the Euclidean 2-norm and |v|_M = √(v⊤Mv),
with M = M⊤ ≻ 0, denotes the weighted 2-norm with weight M; vi ∈ Rⁱ, 1 ≤ i ≤ n,
denotes the vector composed of the first i elements of v. ei denotes the i-th unit
vector of proper dimension. For an n × m matrix M, (M)i denotes the i-th column,
(M⊤)i denotes the i-th row, (M)ij denotes the i-th element on the j-th column, tr(M)
denotes the trace, and |M|_F = √(Σ_{i=1}^{n} Σ_{j=1}^{m} (M)ij²) denotes the Frobenius norm. I

Fig. 8.1 Graphical illustration of the role of Θ0, ℓθ, θ(t), and δθ. ©2021 IEEE.
Reprinted, with permission, from Chen, K., Astolfi, A.: Adaptive control for systems
with time-varying parameters. IEEE Transactions on Automatic Control 66(5),
1986–2001 (2021)

and S denote the identity matrix and the upper shift matrix of proper dimension,
respectively. For an n-dimensional time-varying signal s : R → Rⁿ, the image of
which is contained in a compact set S, Δs : R → Rⁿ denotes the deviation of s from
a constant value ℓs, i.e., Δs(t) = s(t) − ℓs, and δs ∈ R denotes the supremum of the
2-norm of Δs, i.e., δs = sup_{t≥0} |Δs(t)| ≥ 0. (·)^(n) = dⁿ/dtⁿ denotes the n-th time derivative
operator.
In this chapter, the unknown time-varying system parameters1 θ : R → Rq and
bm : R → R may verify one of the assumptions below.
Assumption 8.1 The parameter θ is piecewise continuous and θ(t) ∈ Θ0, for all
t ≥ 0, where Θ0 is a compact set. The “radius” of Θ0, i.e., δθ, is assumed to be
known, while Θ0 can be unknown (see Fig. 8.1).

Assumption 8.2 The parameter θ is smooth, that is, θ^(i)(t) ∈ Θi, for i ≥ 0, for all
t ≥ 0, respectively, where Θi are compact sets, possibly unknown. δθ is assumed to
be known.

Assumption 8.3 The parameter bm(t) is bounded away from 0 in the sense that there
exists a constant ℓbm such that sgn(ℓbm) = sgn(bm(t)) ≠ 0 and 0 < |ℓbm| ≤ |bm(t)|,
for all t ≥ 0. The sign of ℓbm and bm(t), for all t ≥ 0, is known and does not change.

8.2 Motivating Examples and Preliminary Result

In this section, we first briefly discuss the core idea which allows us to cope with
time-varying parameters, the so-called congelation of variables method, by demonstrating

1 All parameters, e.g., θ, are time-varying unless stated otherwise. To highlight this fact, the time
argument is explicitly used, e.g., θ(t), although this may be dropped for conciseness as long as no
confusion arises.

the adaptive controller design on two scalar systems. Then, an n-dimensional lower
triangular system is considered to generalize the proposed method.

8.2.1 Parameter in the Feedback Path

To begin with consider a scalar nonlinear system described by the equation

ẋ = θ (t)x 2 + u, (8.1)

where x(t) ∈ R is the state, u(t) ∈ R is the input, and θ (t) ∈ R is an unknown
time-varying parameter satisfying Assumption 8.1. In the spirit of the certainty-
equivalence principle, we can substitute an “estimate” θ̂ for the true parameter θ (t),
and rewrite (8.1) as

ẋ = θ̂ x 2 + u + (θ − θ̂ )x 2 . (8.2)

In the classical direct adaptive control scheme for time-invariant systems, considering
a Lyapunov function candidate of the form

V(x, θ̂, θ) = (1/2)x² + (1/(2γθ))(θ − θ̂)²   (8.3)

(and assuming differentiability of θ for the time being) yields

V̇ = θ̂x³ + ux + (θ − θ̂)x³ − (θ − θ̂)(θ̂˙/γθ) + (θ − θ̂)(θ̇/γθ),   (8.4)

which allows using the parameter update law

θ̂˙ = γθ x³   (8.5)

to cancel the unknown (θ − θ̂)x³ term in V̇, where the constant γθ > 0 is known as
the adaptation gain. Since it is typically assumed that θ is constant, the indefinite
term (θ − θ̂)(θ̇/γθ) disappears, and the selection of the control law

u = −kx − θ̂ x 2 , (8.6)

with k > 0, yields V̇ = −kx² ≤ 0. We can then conclude, by invoking Barbalat’s
lemma, that x and θ̂ are bounded, and lim_{t→+∞} x(t) = 0. However, if θ is not a constant
parameter, special treatment is needed for the indefinite term (θ − θ̂)(θ̇/γθ). One popular
modification to cope with this indefinite term is the so-called projection operation

(see, e.g., [11, 28]), which confines the parameter θ̂ inside a convex compact set by
adding an additional term to (8.5) so that it “projects” θ̂ back to a “reasonable” set
when θ̂ drifts out of the set, and therefore guarantees boundedness of (θ − θ̂ ). It fol-
lows that boundedness of θ̇ guarantees boundedness of x (either exact boundedness,
e.g., in [34] or boundedness in an average sense, e.g., in [25]), and θ̇ ∈ L1 guarantees
that lim_{t→+∞} x(t) = 0 (e.g., in [21–23]). Alternatively, one may exploit a soft version
of the projection operation, commonly referred to as switching σ-modification, to
guarantee boundedness of θ̂ , which adds some leakage to the integrator (8.5) if
the parameter estimate drifts outside a “reasonable” region, see, e.g., [30, 32, 33].
All these schemes share the similarity that they treat θ̇ as a disturbance. Therefore,
designing in the spirit of disturbance attenuation, one could guarantee that bounded
θ̇ causes bounded state/output regulation/tracking error, and sufficiently fast con-
verging θ̇ , which means that θ becomes constant sufficiently fast, guarantees the
convergence of the error to 0. As a result, none of these methods can guarantee zero-
error regulation/tracking when the unknown parameter is persistently time varying,
in which case θ̇ is non-vanishing.
To overcome this limitation, first note that the time derivative θ̇ is introduced by
taking the time derivative of the θ − θ̂ term in (8.3) along the solutions of the system.
Also note that the role of the θ − θ̂ term is only to guarantee boundedness of θ̂, yet
by no means guaranteeing convergence of θ̂ to θ, no matter if θ is time varying or
constant. In fact, replacing θ with a constant ℓθ, to be determined, can guarantee the
same properties without introducing θ̇ when taking the time derivative. In the light
of this, consider the modified Lyapunov function candidate

V(x, θ̂, ℓθ) = (1/2)x² + (1/(2γθ))(ℓθ − θ̂)².   (8.7)

Taking the time derivative of V along the trajectories of (8.2) yields

V̇ = θ̂x³ + ux + (ℓθ − θ̂)x³ − (ℓθ − θ̂)(θ̂˙/γθ) + Δθ x³,   (8.8)

where Δθ = θ − ℓθ. Comparing (8.8) with (8.4) we see that the substitution of ℓθ
for θ eliminates the θ̇ term, at the cost of adding a perturbation term Δθ x³ due to the
inconsistency between ℓθ and θ. Considering the same parameter update law as in
(8.5) and the new control law

u = −(k + (1/(2εθ))δθ)x − (εθ/2)δθ x³ − θ̂ x²,   (8.9)

where εθ > 0 is a constant, to balance the linear and the nonlinear terms, yields

V̇ = −(k + (1/(2εθ))δθ)x² − (εθ/2)δθ x⁴ + Δθ x³ ≤ −kx² ≤ 0.   (8.10)
8 Adaptive Control for Systems with Time-Varying Parameters—A Survey 223

Therefore, we can conclude boundedness of all trajectories of the closed-loop system,
as well as lim_{t→+∞} x(t) = 0, using the same argument as the one used in the classical
constant parameter problem, without requiring a small or vanishing θ̇, which shows
that such a result can be achieved even for systems with fast and persistently varying
parameters. The method of substituting the constant ℓθ for the time-varying θ to avoid
unnecessary time derivatives is called congelation of variables [3].2 The following
remarks are given to further facilitate understanding the proposed scheme.
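Before turning to the remarks, a brief numerical sanity check (not part of the original text) may help: the closed loop formed by the plant (8.1), the update law (8.5), and the control law (8.9) can be simulated with forward-Euler integration. All variable names and numerical values below (gains, the trajectory θ(t) = 1 + 0.5 sin 5t, the bound δθ = 0.5) are illustrative choices, not taken from the chapter.

```python
import math

# Illustrative simulation of plant (8.1) under the congelation-based
# controller (8.9) with update law (8.5); gains are example choices.
k, gamma_theta, eps_theta, delta_theta = 1.0, 1.0, 1.0, 0.5
dt, T = 1e-3, 20.0

x, theta_hat = 1.0, 0.0
for step in range(int(T / dt)):
    t = step * dt
    theta = 1.0 + 0.5 * math.sin(5.0 * t)    # persistently varying parameter
    # control law (8.9): linear damping + nonlinear damping + certainty equivalence
    u = -(k + delta_theta / (2 * eps_theta)) * x \
        - (eps_theta / 2) * delta_theta * x**3 - theta_hat * x**2
    x_dot = theta * x**2 + u                 # plant (8.1)
    theta_hat_dot = gamma_theta * x**3       # update law (8.5)
    x += dt * x_dot
    theta_hat += dt * theta_hat_dot

print(abs(x), theta_hat)  # x is driven toward 0 while theta_hat stays bounded
```

Despite θ(t) varying persistently and its rate never vanishing, x(t) is regulated to zero while θ̂ remains bounded, consistent with (8.10).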
Remark 8.1 The control law (8.9) and the parameter update law (8.5) do not contain
the unknown constant ℓθ explicitly, which can be regarded as an analogy of the fact
that classical adaptive controllers do not contain the unknown constant θ explicitly.
This fact shows that the proposed scheme is “adaptive” and distinguishes it from
a robust scheme with static high gain, which requires the knowledge of ℓθ. One
can interpret the proposed controller as a combination of an adaptive controller, to
cope with the unknown parameter ℓθ representing the average of θ(t), and a robust
controller, to cope with the time-varying perturbation Δθ(t), the 2-norm of which is
upper bounded by δθ, the “radius” of the compact set Θ0 containing θ, as shown in
Fig. 8.1. It is also worth noting that the control law (8.9) contains the same certainty-equivalence
term, that is, the term −θ̂x², as in the classical control law (8.6), and
therefore when θ is a constant, one could select ℓθ = θ, hence δθ = 0, and the control
law (8.9) would be reduced to the classical control law (8.6). This fact distinguishes
the proposed scheme from the dynamic high-gain scheme in, e.g., [19, 31], in which
the adaptive term is not certainty equivalence but used for dominance.
Remark 8.2 The control law (8.9) explicitly contains δθ , which is assumed to be
known by Assumption 8.1. Such an assumption can be relaxed by introducing an
“estimate” for δθ and replacing the nonlinear damping term that contains δθ with a
certainty-equivalence term. This is feasible since δθ is a constant and the control law
is linearly parameterized, thus yielding a problem which can be effectively solved
by a classical adaptive control scheme. See Remark 2 of [7] for a brief example.
Remark 8.3 It is worth introducing a convention to clarify the spirit in which we
treat unknown quantities. If an unknown indefinite term in the time derivative of the
Lyapunov function vanishes as the system parameters become constant, then this
term is to be dominated by a static damping design, like the Δθ-term in this case,
and we do not aim at estimating δθ, the bound of Δθ; if an unknown indefinite term
is not vanishing even when all system parameters are constant, like the ℓθ-term in
this case, then this term is to be compensated by a dynamically updated “estimate,”
which is θ̂ in this case. The reasons for this convention of design are, first, that we do
not want to over-extend the dimension of the closed-loop system by adding too many

2 Some works predating [3] exploit similar ideas to avoid involving θ̇ in the analysis. For example,
in [1], the unknown time-varying controller parameter in the Lyapunov function is replaced with
a constant (0, as a matter of fact). In other works, one first derives a constant parameter controller
via dominance design (instead of directly using a time-varying parameter controller that cancels
the time-varying parameter) and then estimates the constant parameter of the dominance controller,
see, e.g., [19, 31].

dynamic estimates, and second, that we need the static damping terms to counteract
fast parameter variations for better transient performance (for the same reason one
can use nonlinear damping techniques even for systems with constant parameters).
Remark 8.4 Consider a classical adaptive control problem for system (8.1) in which
θ is constant. The closed-loop dynamics can be described via a negative feedback
loop consisting of two passive subsystems, namely,

Σ1 : ẋ1 = −kx1 + x1²u1,  y1 = x1³,
Σ2 : ẋ2 = γθ u2,  y2 = x2,   (8.11)

where x1 = x, x2 = θ̂ − θ, u1 = −y2, u2 = y1. The storage functions are S1 = (1/2)x1²
and S2 = (1/(2γθ))x2², respectively. Although θ̂ is called the parameter estimate by convention,
it is well known that the parameter update law (8.5), in general, cannot
guarantee that lim_{t→+∞} (θ̂(t) − θ) = 0. The selection of the update law is to guarantee
that the signal θ̂ − θ can be used to create a passive interconnection. When θ is time
varying, the dynamics of Σ2 are described by

Σ2 : ẋ2 = γθ u2 − θ̇,  y2 = x2,   (8.12)

which causes the loss of passivity from u2 to y2. The congelation of variables method
can therefore be interpreted as selecting a new signal θ̂ − ℓθ that can yield a passive
interconnection, while maintaining the passivity of Σ1 via nonlinear damping.
Combining the adaptive controller described by (8.5) and (8.9) with the open-loop
system (8.1), the two interconnected passive subsystems are described by

Σ1 : ẋ1 = −a(x1, t)x1 + x1²u1,  y1 = x1³,   (8.13)
Σ2 : ẋ2 = γθ u2,  y2 = x2,   (8.14)

where x1 = x, x2 = θ̂ − ℓθ, u1 = −y2, u2 = y1, and a(x1, t) = k + (1/(2εθ))δθ +
(εθ/2)δθ x1² − Δθ x1 ≥ k > 0.
In the same spirit, one could apply the congelation of variables method to other
parameter-based adaptive schemes to cope with time-varying parameters. For example,
consider system (8.1) again, but with an adaptive I&I controller defined by the
equations

u = −(k + 1/εz + 1/(2εθ))x − (εz + εθ/2)δθ² x³ − (θ̂ + β(x))x²,   (8.15)

θ̂˙ = −(∂β/∂x)((θ̂ + β(x))x² + u),   (8.16)

where β(x) = γθ x³/3 and εz > 0, εθ > 0. Defining the error variable z = θ̂ − ℓθ + β(x) in
the spirit of the congelation of variables yields the closed-loop system dynamics
described by

ẋ = −(k + 1/εz + 1/(2εθ))x − (εz + εθ/2)δθ² x³ − x²(z − Δθ),   (8.17)

ż = −γθ x⁴(z − Δθ).   (8.18)

Let Vx(x) = (1/2)x², Vz(z) = (1/(2γθ))z², and note that their time derivatives along the solutions
of (8.17) and (8.18), respectively, yield

V̇x = −(k + 1/εz + 1/(2εθ))x² − (εz + εθ/2)δθ² x⁴ − x³(z − Δθ)
  ≤ −kx² − εz δθ² x⁴ + (εz/4)x⁴z²,   (8.19)

V̇z = −x⁴z² + x⁴zΔθ ≤ −(3/4)x⁴z² + δθ² x⁴.   (8.20)

Finally, setting V(x, z) = Vx(x) + εz Vz(z) yields

V̇ = V̇x + εz V̇z ≤ −kx² − (εz/2)x⁴z² ≤ 0.   (8.21)

This guarantees that lim_{t→+∞} x(t) = 0 and x²z ∈ L2.
Similar to Remark 8.4, the substitution of ℓθ for θ removes θ̇ from the z-dynamics
yet, differently, it makes the x-subsystem and the z-subsystem finite-gain L2 stable
instead of making them passive, which can be seen from (8.19) and (8.20). The
nonlinear damping term proportional to x³ in (8.15) renders the loop gain of the interconnected
system sufficiently small, so that stability properties hold. It is worth comparing this
result with the classical scenario (see Sect. 3.2 of [2]) in which the parameter θ is
constant, and the property x²z ∈ L2 can be concluded directly from (8.18) without
the presence of Δθ. This leads to two cascaded subsystems instead of two subsystems
interconnected in a loop.
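The I&I loop can also be exercised numerically (this sketch is not part of the original text): the damping coefficients below follow one possible Young's-inequality bookkeeping and, together with all gains and the parameter trajectory, are illustrative assumptions rather than the chapter's exact constants.

```python
import math

# Illustrative I&I simulation for x' = theta(t) x^2 + u with
# beta(x) = gamma * x^3 / 3; constants are example choices.
k, gamma, eps_z, eps_th, delta = 1.0, 1.0, 1.0, 1.0, 0.5
ell = 1.0                                   # congealed value, used only to monitor z
dt, T = 1e-3, 15.0

x, theta_hat = 1.0, 0.0
for step in range(int(T / dt)):
    t = step * dt
    theta = ell + delta * math.sin(3.0 * t)
    beta = gamma * x**3 / 3.0
    dbeta_dx = gamma * x**2
    u = -(k + 1.0 / eps_z + 1.0 / (2 * eps_th)) * x \
        - (eps_z + eps_th / 2) * delta**2 * x**3 - (theta_hat + beta) * x**2
    x_dot = theta * x**2 + u
    theta_hat_dot = -dbeta_dx * ((theta_hat + beta) * x**2 + u)
    x += dt * x_dot
    theta_hat += dt * theta_hat_dot

z = theta_hat + gamma * x**3 / 3.0 - ell    # off-the-manifold coordinate
print(abs(x), z)
```

As the analysis predicts, x converges to zero even under persistent parameter variation, while z merely stays bounded: the I&I error is not required to vanish.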
The two adaptive schemes discussed above show the compatibility of the congelation
of variables method with different adaptive schemes. This is due to the fact
that the substitution of ℓθ for θ only introduces Δθ, whereas the role of ℓθ is exactly
the same as that of θ in the classical constant parameter scenario, which allows using
the same parameter update law. For simplicity, in what follows, we only demonstrate
the congelation of variables method with the passivity-based scheme, which
is also referred to as direct adaptive control or Lyapunov-based adaptive control in
the literature.

8.2.2 Parameter in the Input Path

In this subsection, we show how to extend the idea of congelation of variables to


systems with a time-varying parameter coupled with the input that is commonly
referred to as the high-frequency gain. To this end, consider the scalar system

ẋ = θ (t)x 2 + b(t)u, (8.22)

where θ(t) ∈ R satisfies Assumption 8.1 and b(t) ∈ R satisfies Assumptions 8.1
and 8.3. In the spirit of the congelation of variables method, (8.22) can be
rewritten as

ẋ = θ̂x² + ū + Δθ x² + Δb ς̂ū + (ℓθ − θ̂)x² − ℓb(1/ℓb − ς̂)ū,   (8.23)

where Δb(t) = b(t) − ℓb, ς̂ is an “estimate” of 1/ℓb, and u = ς̂ū. From classical adaptive
control theory (see, e.g., [17]) it is known that the parameter estimation error
terms in (8.23) can be cancelled by selecting the parameter update laws (8.5) and

ς̂˙ = −γς sgn(ℓb) ūx,   (8.24)

and considering the Lyapunov function candidate V(x, θ̂, ς̂) = (1/2)x² + (1/(2γθ))(ℓθ − θ̂)² +
(|ℓb|/(2γς))(1/ℓb − ς̂)², the time derivative of which along the trajectories of (8.23) satisfies

V̇ = θ̂x³ + ūx + Δθ x³ + Δb ς̂ūx.   (8.25)

Note that the perturbation term Δb ς̂ūx depends on ū explicitly, which means that we
cannot dominate this term by simply adding damping terms to ū, as doing this would
also alter the perturbation term itself. Instead, we need to make Δb ς̂ūx non-positive
by designing ū and selecting ℓb. Similar to the selection of ℓθ in Sect. 8.2.1, such a
selection is only made for analysis rather than implementation, i.e., ℓb does not need
to be known. Let ū be defined as

ū = −(k + (1/(2εθ))δθ + 1/(2εθ̂) + ((εθ/2)δθ + (εθ̂/2)θ̂²)x²)x = −κ(x, θ̂)x,   (8.26)

with εθ̂ > 0 and κ(x, θ̂) > 0. Substituting (8.26) into (8.24) yields ς̂˙ = γς sgn(ℓb)κx².
By Assumption 8.3, we only need to consider two cases. In the first case, there exists
a constant ℓb such that 0 < ℓb ≤ b(t), for all t ≥ 0, and therefore ℓb > 0, ς̂˙ ≥ 0,
which means that any initialization of ς̂ such that ς̂(0) > 0 guarantees that ς̂(t) > 0,
for all t ≥ 0, and therefore Δb ς̂ūx = −Δb ς̂κx² ≤ 0, for all t ≥ 0. Alternatively,
b(t) ≤ ℓb < 0, and therefore ℓb < 0, ς̂˙ ≤ 0. Then selecting ς̂(0) < 0 guarantees
that ς̂(t) < 0, for all t ≥ 0, and Δb ς̂ūx ≤ 0. Recalling (8.25), (8.26), and noting that

Δb ς̂ūx ≤ 0 yields

V̇ ≤ −kx² − ((εθ̂/2)θ̂²x⁴ − θ̂x³ + (1/(2εθ̂))x²) − ((εθ/2)δθ x⁴ − Δθ x³ + (1/(2εθ))δθ x²)
  ≤ −kx² ≤ 0.   (8.27)

Exploiting the same stability argument as before, boundedness of the system trajec-
tories and convergence of x to zero follows.
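A short forward-Euler simulation (not part of the original text) illustrates this construction; here `s_hat` plays the role of the estimate of the reciprocal congealed input gain, and all gains and parameter trajectories are illustrative assumptions.

```python
import math

# Illustrative simulation of x' = theta(t) x^2 + b(t) u with u = s_hat * u_bar,
# following the sign-definiteness construction of this subsection.
k, g_theta, g_s = 1.0, 1.0, 1.0
eps_th, eps_thh, delta = 1.0, 1.0, 0.5
dt, T = 1e-3, 20.0

x, theta_hat, s_hat = 1.0, 0.0, 1.0        # s_hat(0) > 0 since b(t) > 0
for step in range(int(T / dt)):
    t = step * dt
    theta = 1.0 + delta * math.sin(3.0 * t)
    b = 2.0 + 0.5 * math.sin(2.0 * t)      # positive, bounded away from 0
    # nonlinear gain of the control law (8.26)
    kappa = k + delta / (2 * eps_th) + 1.0 / (2 * eps_thh) \
        + ((eps_th / 2) * delta + (eps_thh / 2) * theta_hat**2) * x**2
    u_bar = -kappa * x
    u = s_hat * u_bar
    x += dt * (theta * x**2 + b * u)
    theta_hat += dt * (g_theta * x**3)     # update law (8.5)
    s_hat += dt * (-g_s * u_bar * x)       # update law (8.24), sign of b known (+)
print(abs(x), s_hat)
```

Note that `s_hat` is monotonically non-decreasing here (its update equals a non-negative quantity), yet it remains bounded, in line with the Lyapunov argument above.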
Remark 8.5 This example highlights the flexibility of the congelation of variables
method: the congealed parameter ℓ(·) can be selected according to the specific usage.
It can be a nominal value for robust design as in Sect. 8.2.1, or an “extreme” value to
create sign-definiteness as in Sect. 8.2.2, as long as the resulting perturbation Δ(·) is
considered consistently. One can even make ℓ(·) a time-varying parameter subject to
some of the assumptions used in the literature (e.g., ℓ̇(·) ∈ L∞, ℓ̇(·) ∈ L1, see, e.g.,
[21, 25]), and use the congelation of variables method to relax these assumptions.
This is the reason why the proposed method is named “congelation”3 and not “freeze.”

Remark 8.6 Similar to what is discussed in Remark 8.4, the selection of ℓb makes
ς̂ − 1/ℓb a passivating input/output signal. In addition, note that the closed-loop system
described by (8.23), (8.5), (8.24), and (8.26) is passive from −Δb ς̂κx to x (see Fig.
8.2 for a schematic representation). Our selection of ℓb always guarantees that −Δb ς̂κ
is negative and therefore yields a negative feedback “control” (if regarding −Δb ς̂κx
as the control law), which is well known to possess an arbitrarily large gain margin
and to be robust against the variation of Δb ς̂κ.

The examples discussed above are simple, yet illustrate the core ideas put forward
in the chapter, no matter whether the time-varying parameters appear in the feedback
path or in the input path. These ideas allow us to proceed with more complicated scenarios.

8.2.3 Preliminary Result: State-Feedback Design for Unmatched Parameters

The scalar systems considered in Sects. 8.2.1 and 8.2.2 satisfy the so-called matching
condition, that is, the unknown parameter θ enters the system dynamics via the same
integrator from which the input u enters. For a more general class of systems in
which the unknown parameters are separated from the input by integrators, adaptive
backstepping design [17] is needed. Consider an n-dimensional nonlinear system in
the so-called parametric strict-feedback form, namely,

3 The word “congelation” is polysemous: it means both “coagulation” and “freeze/solidification” [24].

Fig. 8.2 Schematic representation of system (8.23), (8.5), and (8.24) as the interconnection of
passive subsystems. Each of the subsystems in the round-rectangular blocks is passive from its
input to its output. ©2021 IEEE. Reprinted, with permission, from Chen, K., Astolfi, A.: Adaptive
control for systems with time-varying parameters. IEEE Transactions on Automatic Control 66(5),
1986–2001 (2021)

ẋ1 = φ1⊤(x1)θ(t) + x2,
  ⋮
ẋi = φi⊤(xi)θ(t) + xi+1,   (8.28)
  ⋮
ẋn = φn⊤(x)θ(t) + b(t)u,

where i = 2, . . . , n − 1; x(t) = [x1(t), . . . , xn(t)]⊤ ∈ Rⁿ is the state; u(t) ∈ R is the
input; θ(t) ∈ Rq is the vector of unknown parameters satisfying Assumption 8.1;
and b(t) ∈ R is an unknown parameter satisfying Assumptions 8.1 and 8.3. The
regressors φi : Rⁱ → Rq, i = 1, . . . , n, are smooth mappings and satisfy φi(0) = 0.

Remark 8.7 By Hadamard’s lemma [27], the condition φi(0) = 0 implies that
φi(xi) = Φ̄i(xi)xi, for some smooth mappings Φ̄i. This also means that φi⊤(0)θ(t) =
0, allowing zero control effort at x = 0 regardless of θ(t). One can easily see that
if φi(0) ≠ 0, φi⊤(0)θ(t) becomes an unknown time-varying disturbance, yielding a
disturbance rejection/attenuation problem not discussed here.
We directly give the results below and omit the derivation of the adaptive back-
stepping procedures.4 For each step i, i = 1, . . . , n, define the error variables

4The classical procedures of adaptive backstepping, on which the following procedures are based,
can be found in Chap. 4 of [17].

z0 = 0,   (8.29)
zi = xi − αi−1,   (8.30)

the new regressor vectors

wi(xi, θ̂) = φi − Σ_{j=1}^{i−1} (∂αi−1/∂xj) φj,   (8.31)

the tuning functions

τi(xi, θ̂) = τi−1 + wi zi = Σ_{j=1}^{i} wj zj,   (8.32)

and the virtual control laws

α0 = 0,   (8.33)
αi(xi, θ̂) = −zi−1 − (ci + ζi)zi − wi⊤θ̂ + (∂αi−1/∂θ̂)Γθ τi
     + Σ_{j=1}^{i−1} (∂αi−1/∂xj) xj+1 + Σ_{j=2}^{i−1} (∂αj−1/∂θ̂)Γθ wi zj,   i = 1, . . . , n − 1,   (8.34)
αn = ς̂ ᾱn = −ς̂ κ(x, θ̂)zn,   (8.35)

where ci > 0 are constant feedback gains; ζi(xi, θ̂) are nonlinear feedback gains
to be defined; Γθ = Γθ⊤ ≻ 0 is the adaptation gain; κ(x, θ̂) is a positive nonlinear
feedback gain to be defined, similar to the one in (8.26). To proceed with the analysis,
select the control law and the parameter update laws as

u = αn,   (8.36)
θ̂˙ = Γθ τn,   (8.37)
ς̂˙ = −γς sgn(ℓb) ᾱn zn,   (8.38)

respectively, and consider the Lyapunov function candidate V(z, θ̂, ς̂) = (1/2)|z|² +
(1/2)|ℓθ − θ̂|²_{Γθ⁻¹} + (|ℓb|/(2γς))(1/ℓb − ς̂)², where z = [z1, . . . , zn]⊤. Taking the time derivative of
V yields
V yields

V̇ = −Σ_{i=1}^{n} (ci + ζi)zi² + zn ᾱn + Δ + zn ψ
   + (ℓθ − θ̂)⊤(Σ_{i=1}^{n−1} wi zi − Γθ⁻¹ θ̂˙) + ℓb (1/ℓb − ς̂)(−ᾱn zn − ς̂˙/γς),   (8.39)

where

Δ = Σ_{i=1}^{n−1} zi wi⊤ Δθ + Δb ς̂ ᾱn zn,   (8.40)

ψ = z_{n−1} + wn⊤ θ̂ − Σ_{j=1}^{n−1} (∂αn−1/∂xj) xj+1 − (∂αn−1/∂θ̂)Γθ τn − Σ_{j=2}^{n−1} (∂αj−1/∂θ̂)Γθ wn zj.   (8.41)

Remark 8.8 Recalling Remark 8.7 and implementing (8.29)–(8.35) recursively, it is
not hard to see that zi(xi, θ̂), wi(xi, θ̂), τi(xi, θ̂), αi(xi, θ̂) are smooth and zi(0, θ̂) =
0, wi(0, θ̂) = 0, τi(0, θ̂) = 0, αi(0, θ̂) = 0. In addition, the θ̂-dependent change of
coordinates between zi and xi is smooth, invertible, and xi = 0 implies and is implied
by zi = 0, thus we can directly express wi as wi = W̄i(xi, θ̂)zi, with W̄i smooth, and
similarly, ψ as ψ = ψ̄⊤(x, θ̂)z, with ψ̄ smooth.
The parameter estimation error terms in (8.39) are eliminated by the parameter
update laws (8.37) and (8.38), and the non-positivity of Δb ς̂ ᾱn zn can be established
in the same way as in Sect. 8.2.2, thanks to the form5 of ᾱn. The rest of the problem
is to determine the nonlinear damping gains ζi(xi, θ̂) and κ(x, θ̂) to dominate the
Δθ-terms.
Proposition 8.1 Consider system (8.28) and the control law (8.36) with the nonlinear
damping gains

ζi(xi, θ̂) = (n − i + 1)((1/(2εθ))δθ + (εθ/2)δθ |W̄i|_F²) + 1/(2εψ̄),   (8.42)

κ(x, θ̂) = cn + ζn + (εψ̄/2)|ψ̄|²,   (8.43)

with cn > 0 and ε(·) > 0, and the parameter update laws (8.37) and (8.38) with
sgn(ς̂(0)) = sgn(b). Then all closed-loop signals are bounded and lim_{t→+∞} x(t) = 0.
Although state feedback is, in general, not available in practice, the result pre-
sented above indicates how to combine the congelation of variables method and
backstepping to cope with unmatched time-varying parameters, which is essential in
the output-feedback design.

5 This form of ᾱn is inspired by [18], which also proposes a control law with a nonlinear negative
feedback gain, albeit to achieve inverse optimality.
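To make the recursion concrete, a minimal two-dimensional instance of (8.28) can be sketched numerically (this is not part of the original text): φ1(x1) = x1, φ2 = 0, and b ≡ 1 known, so the ς̂-loop of (8.35) and (8.38) is not needed. Constant damping gains, sized for the parameter range simulated here rather than computed via (8.42)–(8.43), are an illustrative simplification.

```python
import math

# Minimal n = 2 instance of the tuning-function design (8.29)-(8.37):
#   x1' = theta(t) x1 + x2,   x2' = u,
# with phi1(x1) = x1, phi2 = 0, b = 1 known; gains are example choices.
c1, c2, zeta1, zeta2, gam = 2.0, 2.0, 2.0, 2.0, 0.5
dt, T = 5e-4, 20.0

x1, x2, th = 0.5, 0.0, 0.0                    # th = theta_hat
for step in range(int(T / dt)):
    t = step * dt
    theta = 1.0 + 0.5 * math.sin(2.0 * t)
    z1 = x1
    w1 = x1                                   # (8.31): w1 = phi1
    alpha1 = -(c1 + zeta1 + th) * x1          # (8.34) for i = 1
    z2 = x2 - alpha1                          # (8.30)
    da1_dx1 = -(c1 + zeta1 + th)
    da1_dth = -x1
    w2 = -da1_dx1 * x1                        # (8.31): w2 = -(d alpha1/d x1) phi1
    tau2 = w1 * z1 + w2 * z2                  # (8.32)
    # final step of (8.34), used as u per (8.36):
    u = -z1 - (c2 + zeta2) * z2 - w2 * th + da1_dth * gam * tau2 + da1_dx1 * x2
    x1 += dt * (theta * x1 + x2)
    x2 += dt * u
    th += dt * gam * tau2                     # (8.37)
print(abs(x1), abs(x2))
```

Even though the unmatched parameter θ(t) never settles, both state components are regulated to zero, illustrating the claim of Proposition 8.1 on this simplified instance.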

8.3 Output-Feedback Design

Consider now an n-dimensional system in output-feedback form, with relative degree ρ, described by the equations

ẋ_1 = x_2 + φ_{0,1}(y) + ∑_{j=1}^{q} φ_{1,j}(y) a_j(t),
  ⋮
ẋ_ρ = x_{ρ+1} + φ_{0,ρ}(y) + ∑_{j=1}^{q} φ_{ρ,j}(y) a_j(t) + b_m(t) g(y) u,
  ⋮
ẋ_n = φ_{0,n}(y) + ∑_{j=1}^{q} φ_{n,j}(y) a_j(t) + b_0(t) g(y) u,
y = x_1,   (8.44)

or, in compact form, by the equations

ẋ = S x + φ_0(y) + F^⊤(y, u) θ,
y = e_1^⊤ x,   (8.45)

where x(t) = [x_1(t), …, x_n(t)]^⊤ ∈ ℝⁿ is the state, u(t) ∈ ℝ is the input, y(t) ∈ ℝ is the output, θ(t) = [b^⊤(t), a^⊤(t)]^⊤ is the vector of unknown time-varying parameters, a(t) = [a_1(t), …, a_q(t)]^⊤ ∈ ℝ^q, b(t) = [b_m(t), …, b_0(t)]^⊤ ∈ ℝ^{m+1}, m = n − ρ,

F^⊤(y, u) = [ [0_{(ρ−1)×(m+1)}; I_{m+1}] g(y) u,  Φ(y) ],   (8.46)

(Φ(y))_{ij} = φ_{i,j}(y), and g : ℝ → ℝ is a smooth mapping such that g(y) ≠ 0 for all y ∈ ℝ. In addition, θ satisfies Assumption 8.2 and, in particular, b_m also satisfies Assumption 8.3. The mappings φ_{0,i} : ℝ → ℝ and φ_{i,j} : ℝ → ℝ, i = 1, …, n, j = 1, …, q, are smooth and such that φ_{0,i}(0) = 0, φ_{i,j}(0) = 0.

Remark 8.9 Similar to what is discussed in Remark 8.7, there exist smooth map-
pings φ̄0,i and φ̄i, j such that φ0,i (y) = φ̄0,i (y)y, φi, j (y) = φ̄i, j (y)y.
232 K. Chen and A. Astolfi

8.3.1 System Reparameterization

Due to the presence of unmeasured state variables we use Kreisselmeier filters (K-
filters) [15] to reparameterize the system with the filter state variables (which are
known) into a new form that is favorable for the adaptive backstepping design [17].
The filters are given by the equations

ξ̇ = A_k ξ + k y + φ_0(y),   (8.47)
Ξ̇ = A_k Ξ + Φ(y),   (8.48)
λ̇ = A_k λ + e_n g(y) u,   (8.49)

where A_k = S − k e_1^⊤ and k ∈ ℝⁿ is the vector of filter gains. These filters are equivalent, see [17], to the filters

ξ̇ = A_k ξ + k y + φ_0(y),   (8.50)
Ω̇^⊤ = A_k Ω^⊤ + F^⊤(y, u),   (8.51)

where

Ω^⊤ = [v_m, …, v_0, Ξ],   (8.52)
v_i = A_k^i λ,  i = 0, …, m.   (8.53)

Define now the non-implementable state estimate

x̂ = ξ + Ω^⊤ ℓ_θ.   (8.54)

The state estimation error dynamics are then described by the equation

ε̇ = A_k ε + F^⊤(y, u) Δ_θ = A_k ε + Φ(y) Δ_a + [0_{(ρ−1)×1}; Δ_b] g(y) u,   (8.55)

where ε = x − x̂. We now show that, after using the K-filters (8.47)–(8.49) together with the congelation of variables method, the original n-dimensional system with time-varying parameters can be reparameterized as a ρ-dimensional system with constant parameters ℓ_θ, together with some auxiliary systems to be defined. The substitution of ℓ_θ for θ prevents θ̇ from appearing in the ε-dynamics. For ρ > 1, one has the problem described by the equations

ẏ = ω_0 + ω̄^⊤ ℓ_θ + ε_2 + ℓ_{b_m} v_{m,2},
v̇_{m,i} = −k_i v_{m,1} + v_{m,i+1},  i = 2, …, ρ − 1,   (8.56)
v̇_{m,ρ} = −k_ρ v_{m,1} + v_{m,ρ+1} + g(y) u,

and, for ρ = 1, one has

ẏ = ω_0 + ω^⊤ ℓ_θ + ε_2 + ℓ_{b_m} g(y) u,   (8.57)

where ω_0 = φ_{0,1} + ξ_2, ω̄ = [0, v_{m−1,2}, …, v_{0,2}, (Φ)_1 + (Ξ)_2]^⊤, and ω = ω̄ + e_1 v_{m,2}.
Similar to the classical adaptive backstepping scheme, we consider the ρth-order system (8.56) (or (8.57) if ρ = 1) to exploit its lower triangular form; yet (8.56) and (8.57) are useful only if the estimation error ε_2 converges to 0. In classical schemes this is not a problem, since there are no Δ_a(t) or Δ_b(t) terms and ε converges to 0 exponentially provided that A_k is Hurwitz. The effect of Δ_a can be dominated via a strengthened damping design, as proposed in [3]. However, the dominance method cannot be directly applied to (8.55), since Δ_b is coupled with the input u. To solve this issue, in the next section we revisit the ideas of [5] and [6] to see how to decouple Δ_b and u with the help of the inverse dynamics of system (8.44).

8.3.2 Inverse Dynamics

To study the inverse dynamics of (8.44), pretend that the system is “driven” by y, φ_{0,i}(y), φ_i(y), and their time derivatives. Then one could write

x_2 = y^{(1)} − (φ_1^⊤ a + φ_{0,1}),
  ⋮   (8.58)
x_ρ = y^{(ρ−1)} − (φ_1^⊤ a + φ_{0,1})^{(ρ−2)} − ⋯ − (φ_{ρ−1}^⊤ a + φ_{0,ρ−1}).

Setting y_i = φ_i^⊤ a + φ_{0,i}, i = 1, …, n, and u_g = g(y) u yields

u_g = (1/b_m) ( −x_{ρ+1} + y^{(ρ)} − y_1^{(ρ−1)} − ⋯ − y_ρ ).   (8.59)

The resulting inverse dynamics are then described by

ẋ_{ρ+1} = −(b_{m−1}/b_m) x_{ρ+1} + x_{ρ+2} + y_{ρ+1} + (b_{m−1}/b_m) ( y^{(ρ)} − y_1^{(ρ−1)} − ⋯ − y_ρ ),
  ⋮   (8.60)
ẋ_n = −(b_0/b_m) x_{ρ+1} + y_n + (b_0/b_m) ( y^{(ρ)} − y_1^{(ρ−1)} − ⋯ − y_ρ ).

Algorithm 8.1 Change of coordinates x_{ρ+1}, …, x_n.
Require: x_{ρ+1}, …, x_n, ẋ_{ρ+1}, …, ẋ_n.
Ensure: x̄_{ρ+1}, …, x̄_n, x̄˙_{ρ+1}, …, x̄˙_n.
1: while time derivatives of y appear in the expression of ẋ_{ρ+1}, …, ẋ_n do  ▷ This while-loop iterates ρ times, as it reduces the order of y^{(ρ)} by one at each iteration.
2:   for i = n → ρ + 2 do
3:     Update x̄_i and x̄˙_i using (8.61).
4:     Rewrite x_i in terms of x̄_i in the expression of ẋ_{i−1} and leave the feedback term −(b_{n−i}/b_m) x_{ρ+1} unchanged.
5:   end for
6:   Update x̄_{ρ+1} and x̄˙_{ρ+1} using (8.61).
7:   Rewrite x_{ρ+1} in terms of x̄_{ρ+1} in the expressions of x̄˙_{ρ+1}, …, x̄˙_n, respectively.  ▷ This brings back the time derivatives of y, y_1, …, y_ρ, but with the order reduced by one.
8:   x_{ρ+1} ← x̄_{ρ+1}, …, x_n ← x̄_n, ẋ_{ρ+1} ← x̄˙_{ρ+1}, …, ẋ_n ← x̄˙_n.  ▷ Update the old coordinates before the next iteration.
9: end while
©2021 IEEE. Reprinted, with permission, from Chen, K., Astolfi, A.: Adaptive control for systems with time-varying parameters. IEEE Transactions on Automatic Control 66(5), 1986–2001 (2021)

Since it is difficult to use backstepping techniques to establish convergence properties for the time derivatives of y or y_i, we need to perform a change of coordinates to remove the derivative terms from the inverse dynamics. Note that for any pair of smooth signals s_1 and s_2 the identity

s_1 s_2^{(i)} = (−1)^i s_1^{(i)} s_2 + ( ∑_{j=0}^{i−1} (−1)^j s_1^{(j)} s_2^{(i−1−j)} )^{(1)}   (8.61)

holds. With this fact, the change of coordinates

x̄_n = x_n − ∑_{j=0}^{ρ−1} (−1)^j (b_0/b_m)^{(j)} y^{(ρ−1−j)} + ∑_{i=1}^{ρ−1} ∑_{j=0}^{ρ−i−1} (−1)^j (b_0/b_m)^{(j)} y_i^{(ρ−i−1−j)}   (8.62)

yields

x̄˙_n = −(b_0/b_m) x_{ρ+1} + y_n + (−1)^ρ (b_0/b_m)^{(ρ)} y − ∑_{i=1}^{ρ} (−1)^{ρ−i} (b_0/b_m)^{(ρ−i)} y_i,   (8.63)

which does not contain time derivatives of y and yi . In the same spirit, applying the
change of coordinates specified by Algorithm 8.1, we are able to remove the terms
containing the time derivatives of y and yi in each equation of the inverse dynamics.
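The repeated integration-by-parts identity (8.61) that drives these manipulations can be checked directly. The sketch below verifies it on polynomial test signals (arbitrary coefficients), for which differentiation is exact:

```python
# Polynomials are coefficient lists in ascending order of degree.
def pderiv(p, k=1):
    # k-th derivative of a polynomial
    for _ in range(k):
        p = [c * e for e, c in enumerate(p)][1:] or [0]
    return p

def pmul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def pscale(p, c):
    return [c * a for a in p]

s1 = [1, 2, 0, 3, 1]      # s1(t) = 1 + 2t + 3t^3 + t^4 (arbitrary test signal)
s2 = [2, -1, 4, 0, 5, 1]  # s2(t) = 2 - t + 4t^2 + 5t^4 + t^5

# Check s1 s2^(i) = (-1)^i s1^(i) s2 + d/dt[ sum_{j=0}^{i-1} (-1)^j s1^(j) s2^(i-1-j) ]
for i in range(1, 5):
    lhs = pmul(s1, pderiv(s2, i))
    acc = [0]
    for j in range(i):
        acc = padd(acc, pscale(pmul(pderiv(s1, j), pderiv(s2, i - 1 - j)), (-1) ** j))
    rhs = padd(pscale(pmul(pderiv(s1, i), s2), (-1) ** i), pderiv(acc))
    diff = padd(lhs, pscale(rhs, -1))
    assert all(c == 0 for c in diff), (i, diff)
print("identity (8.61) holds for i = 1..4 on polynomial test signals")
```

The check is exact (integer arithmetic), so the identity holds coefficient by coefficient.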
The resulting inverse dynamics in the new coordinates (we use x̄i , i = ρ + 1, . . . , n

with a slight abuse of notation) are described by the equations


x̄˙ = A_b̄(t) x̄ + b_{x̄y}(t) y + ∑_{i=1}^{n} b_{x̄φ,0,i}(t) φ_{0,i}(y) + ∑_{i=1}^{n} ∑_{j=1}^{q} b_{x̄φ,i,j}(t) φ_{i,j}(y),   (8.64)

u_g = (1/b_m(t)) ( −x̄_{ρ+1} + y^{(ρ)} + ∑_{j=0}^{ρ−1} a_{u_g y^{(j)}}(t) y^{(j)} + ∑_{i=1}^{ρ} ∑_{j=0}^{ρ−i} a_{u_g y_i^{(j)}}(t) y_i^{(j)} ),   (8.65)

where x̄(t) = [x̄_{ρ+1}(t), …, x̄_n(t)]^⊤ ∈ ℝ^m, A_b̄(t) = S − b̄(t) e_1^⊤, and b̄(t) = (1/b_m(t)) [b_{m−1}(t), …, b_0(t)]^⊤ ∈ ℝ^m.

Remark 8.10 The time-varying vectors b_{x̄y}, b_{x̄φ,0,i}, b_{x̄φ,i,j} and the time-varying scalars a_{u_g y^{(j)}}, a_{u_g y_i^{(j)}} are unknown, as they depend on the unknown θ. However, as a consequence of Assumption 8.2, they are bounded.

Assumption 8.4 The time-varying system (8.44) has a strong minimum-phase property, in the sense that the inverse dynamics (8.64) are input-to-state stable (ISS) with respect to the inputs y, φ_{0,i}(y), φ_{i,j}(y), i = 1, …, n, j = 1, …, q. Moreover, there exists an ISS Lyapunov function V_x̄, with γ_x̄ |x̄|² ≤ V_x̄(x̄, t) ≤ γ̄_x̄ |x̄|², 0 < γ_x̄ ≤ γ̄_x̄, such that the time derivative of V_x̄ along the trajectories of the inverse dynamics satisfies the inequality

V̇_x̄ ≤ −|x̄|² + σ_{x̄y} y² + σ_{x̄φ_0} |φ_0(y)|² + σ_{x̄Φ} |Φ(y)|²_F,   (8.66)

for some constants σ_(·) > 0.

Remark 8.11 Assumption 8.4 is verified if x̄ = 0 is a globally exponentially stable


equilibrium of the zero dynamics described by x̄˙ = Ab̄ (t)x̄, see, e.g., Lemma 4.6 in
[14]. Some works (e.g., [30] and [23]) exploit this exponential stability property as a
substitute for the classical minimum-phase assumption. Note, finally, that Assump-
tion 8.4 is not more restrictive than the classical minimum-phase assumption because
for time-invariant systems Assumption 8.4 reduces to minimum-phaseness.

8.3.3 Filter Design

Consider now the state estimation error dynamics (8.55) with u_g given by (8.65), which yields

ε̇ = A_k ε + Φ(y) Δ_a + [0_{(ρ−1)×1}; Δ_b] (1/b_m) ( −x̄_{ρ+1} + y^{(ρ)} + ∑_{j=0}^{ρ−1} a_{u_g y^{(j)}}(t) y^{(j)} + ∑_{i=1}^{ρ} ∑_{j=0}^{ρ−i} a_{u_g y_i^{(j)}}(t) y_i^{(j)} ).   (8.67)

Similar to what is done in Sect. 8.3.2, we need a change of coordinates to remove the time derivative terms brought in by u_g. Implementing a change of coordinates in the same spirit of Algorithm 8.1, the state estimation error dynamics in the new coordinates ε̄ are described by the equation

ε̄˙ = A_k ε̄ − Δ̄_b x̄_{ρ+1} + b_{ε̄y}(t) y + ∑_{i=1}^{n} b_{ε̄φ,0,i}(t) φ_{0,i}(y) + ∑_{i=1}^{n} ∑_{j=1}^{q} b_{ε̄φ,i,j}(t) φ_{i,j}(y),   (8.68)

where Δ̄_b = [0_{1×(ρ−1)}, Δ_b^⊤]^⊤ (1/b_m).
Remark 8.12 The time derivative terms are injected into the ε-dynamics via the vector of gains Δ_b. Similar to Remark 8.10, the time-varying vectors Δ̄_b, b_{ε̄y}, b_{ε̄φ,0,i}, b_{ε̄φ,i,j} are unknown, yet bounded, due to Assumption 8.2. We will see that, as long as these parameters are bounded, they do not affect the controller design. In particular, when b is constant, Δ_b(t) = 0 for all t ≥ 0, provided ℓ_b = b, and thus Δ̄_b, b_{ε̄y}, b_{ε̄φ,0,i}, b_{ε̄φ,i,j} are all identically 0 and ε̄ = ε, which yields ε̇ = A_k ε + Φ(y) Δ_a, a simplified case that has been dealt with in [3].
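A minimal numerical illustration of this simplified case (hypothetical numbers): with constant parameters, ℓ_b = b, and Δ_a ≡ 0, the error obeys ε̇ = A_k ε and decays exponentially whenever A_k = S − k e_1^⊤ is Hurwitz:

```python
import numpy as np

# A_k = S - k e1' is the companion matrix of s^3 + k1 s^2 + k2 s + k3;
# choosing (s + 1)^3 gives k = [3, 3, 1].
n = 3
S = np.diag(np.ones(n - 1), 1)            # upper shift matrix
k = np.array([3.0, 3.0, 1.0])
A_k = S - np.outer(k, np.eye(n)[0])       # subtract k from the first column
assert np.all(np.linalg.eigvals(A_k).real < 0)  # A_k is Hurwitz

eps = np.array([1.0, -1.0, 0.5])
dt, T = 1e-3, 15.0
for _ in range(int(T / dt)):              # forward-Euler integration of eps_dot = A_k eps
    eps = eps + dt * (A_k @ eps)
print(np.linalg.norm(eps))                # decays toward 0
```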
Similar to the description of the ISS inverse dynamics, we want the state estimation
error dynamics to be ISS, but in this case, rather than assuming it, we can guarantee
such a property by designing the K-filters.
Proposition 8.2 The state estimation error dynamics are ISS with respect to the inputs x̄_{ρ+1}, y, φ_{0,i}(y), φ_{i,j}(y), i = 1, …, n, j = 1, …, q, if the vector of filter gains is given by k = (1/2) X_ε̄ e_1, X_ε̄ = X_ε̄^⊤ ≻ 0, and X_ε̄ satisfies the Riccati inequality⁶

S X_ε̄ + X_ε̄ S^⊤ − X_ε̄ (e_1 e_1^⊤ − γ_ε̄^{−1} I) X_ε̄ + Q_ε̄ ⪯ 0,   (8.69)

where

Q_ε̄ = ( δ_{Δ̄_b}/ε_{Δ̄_b} + δ_{b_{ε̄y}}/ε_{b_{ε̄y}} + ∑_{i=1}^{n} δ_{b_{ε̄φ,0,i}}/ε_{b_{ε̄φ,0,i}} + ∑_{i=1}^{n} ∑_{j=1}^{q} δ_{b_{ε̄φ,i,j}}/ε_{b_{ε̄φ,i,j}} ) I.   (8.70)

Moreover, there exists an ISS Lyapunov function V_ε̄ = γ_ε̄ |ε̄|²_{P_ε̄}, with P_ε̄ = X_ε̄^{−1}, and the time derivative of V_ε̄ along the trajectories of the state estimation error dynamics satisfies the inequality

6 The solvability of (8.69) has been discussed in [3].


8 Adaptive Control for Systems with Time-Varying Parameters—A Survey 237

V̇_ε̄ ≤ −|ε̄|² + ε_{b_{ε̄y}} δ_{b_{ε̄y}} y² + ∑_{i=1}^{n} ε_{b_{ε̄φ,0,i}} δ_{b_{ε̄φ,0,i}} φ_{0,i}(y)²
      + ∑_{i=1}^{n} ∑_{j=1}^{q} ε_{b_{ε̄φ,i,j}} δ_{b_{ε̄φ,i,j}} φ_{i,j}(y)² + ε_{Δ̄_b} δ_{Δ̄_b} x̄_{ρ+1}²,   (8.71)

where ε_(·) > 0, or in a more compact (yet more conservative) form,

V̇_ε̄ ≤ −|ε̄|² + σ_{ε̄y} y² + σ_{ε̄φ_0} |φ_0(y)|² + σ_{ε̄Φ} |Φ(y)|²_F + σ_{ε̄x̄_{ρ+1}} x̄_{ρ+1}²,   (8.72)

for some constants σ_(·) > 0.
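One hedged way to obtain an X_ε̄ satisfying (8.69) numerically (all numbers hypothetical) is to solve the standard filtering ARE with an extra margin c·I and then check that the margin absorbs the indefinite term X²/γ_ε̄; at the ARE solution S X + X S^⊤ − X e_1 e_1^⊤ X = −(Q + cI), so the left-hand side of (8.69) reduces to −cI + X²/γ_ε̄:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

n, gamma = 3, 1e4
S = np.diag(np.ones(n - 1), 1)            # upper shift matrix
e1 = np.eye(n)[:, [0]]
Q = 10.0 * np.eye(n)

for c in [1.0, 10.0, 100.0]:
    # A' X + X A - X B R^{-1} B' X + Q~ = 0 with A = S', B = e1, R = 1
    X = solve_continuous_are(S.T, e1, Q + c * np.eye(n), np.eye(1))
    lhs = S @ X + X @ S.T - X @ (e1 @ e1.T - np.eye(n) / gamma) @ X + Q
    if np.max(np.linalg.eigvalsh(lhs)) <= 0.0:
        break                             # Riccati inequality (8.69) verified

k = 0.5 * X @ e1[:, 0]                    # filter gain from Proposition 8.2
print(c, np.max(np.linalg.eigvalsh(lhs)))
```

This is only a sketch of one feasible construction; [3] discusses solvability of (8.69) in general.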

8.3.4 Controller Design

In Sects. 8.3.2 and 8.3.3, we have established the ISS of the inverse dynamics and the
state estimation error dynamics. However, before proceeding to design the controller,
we have to consider (8.56) in the new coordinates. Note that ε2 can be written as

ε_2 = ε̄_2 + a_{ε_2 y^{(1)}}(t) ẏ + Y_{ε_2}(y),   (8.73)

where Y_{ε_2}(y) = a_{ε_2 y}(t) y + ∑_{i=1}^{n} a_{ε_2 φ,0,i}(t) φ_{0,i}(y) + ∑_{i=1}^{n} ∑_{j=1}^{q} a_{ε_2 φ,i,j}(t) φ_{i,j}(y) and a_{ε_2 y^{(1)}}(t) = Δ_{b_m}(t)/b_m(t). Two special cases, in which either ρ = 1, or ρ ≥ 2 and b_m is constant, and therefore a_{ε_2 y^{(1)}}(t) = 0 for all t ≥ 0, have been discussed in [5]. In general, a_{ε_2 y^{(1)}}(t) ≠ 0 and, as a result, ε_2 contains ẏ. Substituting (8.73) into the first equation of (8.56) yields

(1 − a_{ε_2 y^{(1)}}) ẏ = ω_0 + ω̄^⊤ ℓ_θ + ℓ_{b_m} v_{m,2} + ε̄_2 + Y_{ε_2}.   (8.74)

Noting that 1/(1 − a_{ε_2 y^{(1)}}) = b_m/(b_m − Δ_{b_m}) = b_m/ℓ_{b_m}, we can write the dynamics of y as

ẏ = (b_m(t)/ℓ_{b_m}) (ω_0 + Y_{ε_2} + ε̄_2) + (b_m(t)/ℓ_{b_m}) ω̄^⊤ ℓ_θ + b_m(t) v_{m,2}.   (8.75)
bm bm

Observe that the effect of the a_{ε_2 y^{(1)}}(t) ẏ term is to bring the time-varying parameters back into the dynamics of y, which requires the congelation of variables method again. To this end, we first need to augment system (8.56) with the ξ-, Ξ-, and v-dynamics, which are not needed in the classical constant-parameter scenario but are necessary in the current setup. It turns out that the extended system is in the so-called parametric block-strict-feedback form [17], described by the equations

ξ̇ = Ak ξ + ky + φ0 (y), (8.76)

Ξ̇ = A_k Ξ + Φ(y),   (8.77)
ẏ = (b_m(t)/ℓ_{b_m}) (ω_0 + Y_{ε_2} + ε̄_2) + (b_m(t)/ℓ_{b_m}) ω̄^⊤ ℓ_θ + b_m(t) v_{m,2},
v̇_{0,2} = v_{1,2},
  ⋮   (8.78)
v̇_{m−1,2} = v_{m,2},
v̇_{m,2} = −k_1 v_{m,1} + v_{m,3},
  ⋮   (8.79)
v̇_{m,ρ−1} = −k_{ρ−1} v_{m,1} + v_{m,ρ},
v̇_{m,ρ} = −k_ρ v_{m,1} + v_{m,ρ+1} + g(y) u.

In these equations, (8.76) and (8.77) describe the state evolution of the filters of the regressors; Eq. (8.79) gives the integrator-chain structure used for backstepping; and Eq. (8.78) is the key part of the design, as it contains the dynamics of the output y. Recall that ω_0 = φ_{0,1} + ξ_2 and ω̄ = [0, v_{m−1,2}, …, v_{0,2}, (Φ)_1 + (Ξ)_2]^⊤. The congelation of variables method requires an ISS-like property of the state variables coupled with the time-varying parameters. It turns out that we need to first establish ISS properties for (8.76) and (8.77), and for the zero dynamics of (8.78), before developing the backstepping design. For the subsystems described by (8.76) and (8.77), we have the following result.

Lemma 8.1 Let the filter gain k be as in Proposition 8.2. Then system (8.76) is ISS with respect to the inputs y, φ_{0,i}(y), and system (8.77) is ISS with respect to the inputs φ_{i,j}(y), where i = 1, …, n, j = 1, …, q. Moreover, there exist two ISS Lyapunov functions V_ξ = |ξ|²_{P_ξ} and V_Ξ = tr(Ξ^⊤ P_Ξ Ξ), with P_ξ = P_Ξ = γ_ε̄ P_ε̄ ≻ 0, such that the time derivative of V_ξ along the trajectories of (8.76) satisfies

V̇_ξ ≤ −|ξ|² + σ_{ξy} y² + σ_{ξφ_0} |φ_0(y)|²,   (8.80)

and the time derivative of V_Ξ along the trajectories of (8.77) satisfies

V̇_Ξ ≤ −|Ξ|²_F + σ_{ΞΦ} |Φ(y)|²_F,   (8.81)

for some constants σ_(·) > 0.

The remaining work is to investigate whether ISS holds for the inverse dynamics of (8.78). To do this, first let

v_{m,2} = (1/b_m) ẏ − (1/ℓ_{b_m}) (ω_0 + Y_{ε_2} + ε̄_2) − (1/ℓ_{b_m}) ω̄^⊤ ℓ_θ
        = −(ℓ_{b_{m−1}}/ℓ_{b_m}) v_{m−1,2} − ⋯ − (ℓ_{b_0}/ℓ_{b_m}) v_{0,2} + (1/b_m) ẏ   (8.82)
          − (1/ℓ_{b_m}) ((Φ)_1 + (Ξ)_2) ℓ_a − (1/ℓ_{b_m}) (ω_0 + Y_{ε_2} + ε̄_2),

and then define the change of coordinates v̄_{0,2} = v_{0,2}, …, v̄_{m−2,2} = v_{m−2,2}, v̄_{m−1,2} = v_{m−1,2} − (1/b_m) y. The inverse dynamics of (8.78) are then described by

v̄˙ = A_{ℓ_b̄} v̄ + g_v̄(y, ξ, Ξ, ε̄_2, t),   (8.83)

where A_{ℓ_b̄} = S − e_m ℓ_b̄^⊤, ℓ_b̄ = [ℓ_{b_0}/ℓ_{b_m}, …, ℓ_{b_{m−1}}/ℓ_{b_m}]^⊤, and

g_v̄(y, ξ, Ξ, ε̄_2, t) = [0, …, 0, (1/b_m) y, −(ℓ_{b_{m−1}}/(ℓ_{b_m} b_m) + (1/b_m)^{(1)}) y − (1/ℓ_{b_m}) ((Φ)_1 + (Ξ)_2) ℓ_a − (1/ℓ_{b_m}) (ω_0 + Y_{ε_2} + ε̄_2)]^⊤.

Exploiting the flexibility of the congelation of variables method, we can always select ℓ_b to construct a Hurwitz A_{ℓ_b̄}, and therefore ISS of system (8.83) can be established, as shown in the lemma that follows.
Lemma 8.2 Suppose ℓ_b = [ℓ_{b_m}, …, ℓ_{b_0}]^⊤ is such that the polynomial ℓ_{b_m} s^m + ℓ_{b_{m−1}} s^{m−1} + ⋯ + ℓ_{b_0} is Hurwitz. Then system (8.83) is ISS with respect to the inputs y, φ_{0,i}(y), φ_{i,j}(y), ξ_2, (Ξ)_{2j}, and ε̄_2, where i = 1, …, n, j = 1, …, q. Moreover, there is an ISS Lyapunov function V_v̄ = |v̄|²_{P_v̄}, with P_v̄ = P_v̄^⊤ ≻ 0, such that the time derivative of V_v̄ along the trajectories of (8.83) satisfies

V̇_v̄ ≤ −|v̄|² + σ_{v̄y} y² + σ_{v̄φ_0} |φ_0(y)|² + σ_{v̄Φ} |Φ(y)|²_F + σ_{v̄ξ_2} ξ_2² + σ_{v̄(Ξ)_2} |(Ξ)_2|² + σ_{v̄ε̄_2} ε̄_2²,   (8.84)

where the constants σ_(·) > 0.
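Checking the Hurwitz condition of Lemma 8.2 for a candidate ℓ_b is a one-line computation. The sketch below (hypothetical coefficients) tests the roots of ℓ_{b_m} s^m + ⋯ + ℓ_{b_0}:

```python
import numpy as np

# Candidate congealed vector l_b = [l_bm, ..., l_b0], coefficients from s^m down to s^0.
# Here (s + 1)^3 = s^3 + 3 s^2 + 3 s + 1, which is Hurwitz by construction.
l_b = [1.0, 3.0, 3.0, 1.0]
roots = np.roots(l_b)
is_hurwitz = bool(np.all(roots.real < 0))
print(roots, is_hurwitz)
```

Any ℓ_b passing this test makes A_{ℓ_b̄} in (8.83) Hurwitz, which is all the lemma requires.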


Having established the ISS properties of (8.76), (8.77) and the zero dynamics
of (8.78), we proceed to the backstepping design on the chain of integrators (8.79).
Define the error variables

z 1 = y, (8.85)
z i = vm,i − αi−1 , i = 2, . . . , ρ, (8.86)

the tuning functions

τ_1 = (ω − ℓ̂ ᾱ_1 e_1) z_1,   (8.87)
τ_i = τ_{i−1} − (∂α_{i−1}/∂y) ω z_i,  i = 2, …, ρ,   (8.88)

the virtual control laws

α_1 = ℓ̂ ᾱ_1 = −ℓ̂ κ z_1,   (8.89)
α_2 = −b̂_m z_1 − (c_2 + ζ_2) z_2 + β_2 + (∂α_1/∂θ̂) Γ_θ τ_2,   (8.90)
α_i = −z_{i−1} − (c_i + ζ_i) z_i + β_i + (∂α_{i−1}/∂θ̂) Γ_θ τ_i − ∑_{j=2}^{i−1} (∂α_{j−1}/∂θ̂) Γ_θ (∂α_{i−1}/∂y) ω z_j,  i = 3, …, ρ,   (8.91)

with

κ = c_1 + ε_θ̂ |θ̂|² + ζ̂_y + ζ̂_{φ_0} |φ̄_0(y)|² + ζ̂_Φ |Φ̄(y)|²_F,   (8.92)
ζ_2 = ρ δ_{b_m}/(2 ε_θ̂) + (1/2) ε_{b_m} δ_{b_m} (ℓ̂ κ + 1)² + (1/2) (ε_θ̄ δ_θ̄ + ε_{Y_{ε_2}} + ε_{ε̄_2}) (∂α_1/∂y)²,   (8.93)
ζ_i = (1/2) ε_{b_m} δ_{b_m} (ℓ̂ κ + 1)² + (1/2) (ε_θ̄ δ_θ̄ + ε_{Y_{ε_2}} + ε_{ε̄_2}) (∂α_{i−1}/∂y)²,  i = 3, …, ρ,   (8.94)
β_i = k_i v_{m,1} + (∂α_{i−1}/∂y) (ω_0 + ω^⊤ θ̂) + (∂α_{i−1}/∂ξ) (A_k ξ + k y + φ_0)
      + ∑_{j=1}^{q} (∂α_{i−1}/∂(Ξ)_j) (A_k (Ξ)_j + (Φ)_j) + ∑_{j=1}^{m+i−1} (∂α_{i−1}/∂λ_j) (−k_j λ_1 + λ_{j+1})   (8.95)
      + (∂α_{i−1}/∂ℓ̂) ℓ̂˙ + (∂α_{i−1}/∂ζ̂_y) ζ̂˙_y + (∂α_{i−1}/∂ζ̂_{φ_0}) ζ̂˙_{φ_0} + (∂α_{i−1}/∂ζ̂_Φ) ζ̂˙_Φ,  i = 2, …, ρ,

the control law

u = (1/g(y)) (α_ρ − v_{m,ρ+1}),   (8.96)

and the parameter update laws

ℓ̂˙ = γ sgn(ℓ_{b_m}) κ z_1²,   (8.97)
ζ̂˙_y = γ_{ζ_y} z_1²,  ζ̂˙_{φ_0} = γ_{ζ_{φ_0}} |φ_0|²,  ζ̂˙_Φ = γ_{ζ_Φ} |Φ|²_F,   (8.98)
θ̂˙ = Γ_θ τ_ρ,   (8.99)

where c_i > 0, i = 1, …, ρ, ε_(·) > 0, γ_(·) > 0, Γ_θ = Γ_θ^⊤ ≻ 0, θ̄(t) = (b_m(t)/ℓ_{b_m}) ℓ_θ, and Δ_θ̄ = θ̄(t) − ℓ_θ. In the definition of κ, φ̄_0(y) and Φ̄(y) are defined such that φ_0(y) = φ̄_0(y) y, Φ(y) = Φ̄(y) y, which is feasible due to Remark 8.9. Moreover, the initial values of the parameter estimates are selected such that ℓ̂(0) > 0, ζ̂_(·)(0) > 0.

Fig. 8.3 A schematic


representation of the
closed-loop system as the
interconnection of the z, x̄, ε̄,
ξ , , and v̄ subsystems. By
convention, an active node is
denoted by a green solid
circle, and an inactive node
is denoted by a red dashed
circle

Proposition 8.3 Consider the adaptive controller described by Eqs. (8.85)–(8.99) for the system described by Eqs. (8.76)–(8.79). Then the closed-loop signals z, x̄, ε̄, ξ, Ξ, v̄, θ̂, ℓ̂, and ζ̂_(·) are bounded.

Remark 8.13 We use dynamically updated “estimates” ζ̂_(·) as the coefficients of the additional damping terms, following the convention mentioned in Remark 8.3, since the required damping coefficients are, in general, hard to compute and do not vanish even when all system parameters are constant. Meanwhile, we do not need to know δ_θ̄, as it can be dominated by these adaptive damping terms with the help of the balancing constant ε_θ̄.
Another advantage provided by the dynamic estimates ζ̂_(·) is that the L₂ gains of the input–output maps of the z-subsystem with outputs y, φ_0(y), and Φ(y) are arbitrarily and adaptively adjustable. In [9], such a subsystem in a network system is called an active node, and if each directed cycle of the network contains at least one active node, the overall dissipation inequality can be made negative by adjusting the L₂ gains. Since the z-subsystem is contained in each directed cycle, as shown in Fig. 8.3, one could prove Proposition 8.3 with some network analysis. This serves as an alternative interpretation of the proof in [8].

We should not forget that the invariance-like analysis of asymptotic output regula-
tion requires boundedness of ε. In Proposition 8.3, we have established boundedness
of ε̄ after the change of coordinates described in Algorithm 8.1. However, it is not
easy to directly infer boundedness of ε since Algorithm 8.1 involves the time deriva-
tives of y, φ0,i (y), and φi, j (y), i = 1, . . . , n, j = 1, . . . , q, boundedness of which is
difficult to establish. Recall that these time derivatives are present because u has to
be decomposed at the design stage with the help of the inverse dynamics. Now that
we have completed the design, it is more convenient to directly use boundedness of
u for concluding boundedness of ε, which leads to the following proposition.

Proposition 8.4 Consider the system described by Eqs. (8.76)–(8.79) and the adaptive controller described by Eqs. (8.85)–(8.99). Then all closed-loop signals are bounded and lim_{t→+∞} y(t) = 0, that is, asymptotic output regulation to 0 is achieved.

In addition, using the fact that lim_{t→+∞} z(t) = 0, one can establish that lim_{t→+∞} ξ(t) = 0, lim_{t→+∞} Ξ(t) = 0, lim_{t→+∞} λ(t) = 0, lim_{t→+∞} ε(t) = 0, and lim_{t→+∞} x(t) = 0, by exploiting the converging-input converging-output property of the corresponding subsystems or the dependency on converging signals.

8.4 Simulations

To compare the proposed controller with the classical adaptive controller, consider
the nonlinear system described by the equations

ẋ_1 = a_1(t) x_1² + x_2,
ẋ_2 = a_2(t) x_1² + x_3 + b_1(t) u,
ẋ_3 = a_3(t) x_1² + b_0(t) u,   (8.100)
y = x_1,

where b_1 is a periodic signal switching between 0.6 and 1.4 at frequency 8 rad/s; b_0 is a periodic signal switching between 4 and 6 at frequency 15 rad/s; and a is defined as

a(t) = [2, 3, 1]^⊤ − 10 sgn( (∂α_1/∂y) z_2 ) (ω̄)_{3:5}/|(ω̄)_{3:5}|,   (8.101)

with (ω̄)_{3:5} = [(ω̄)_3, (ω̄)_4, (ω̄)_5]^⊤. Each of these parameters comprises a constant
nominal part and a time-varying (a is also state-dependent) part designed to “destabilize” the system. It is not difficult to verify that Assumption 8.4 is satisfied, since b_0/b_1 ≥ 4/1.4 > 0. Consider now two controllers: Controller 1 is the classical adaptive backstepping controller, and Controller 2 is the controller proposed in this chapter. For a fair comparison, set the common controller parameters as c_1 = c_2 = 1, Γ_θ = I, γ = 1, and the initial conditions θ̂(0) = 0, ℓ̂(0) = 1, for both controllers. Each of the controllers uses an identical set of K-filters given by (8.47)–(8.49). The filter gains are obtained by solving the algebraic Riccati equation (8.69) with Q_ε̄ = 10I and γ_ε̄ = 100, and the filter states are initialized to 0. For the parameters solely used in Controller 2, set γ_(·) = 1, ε_(·) = 1, δ_{b_m} = 0.2, ε_θ̄ δ_θ̄ = 1 (note that one does not need to know δ_θ̄, as mentioned in Remark 8.13), and set the initial conditions to ζ̂_y(0) = 2, ζ̂_Φ(0) = 1 (nonzero initial conditions provide additional damping from the beginning to counteract the parameter variations). The initial condition for the system state is set to x(0) = [1, 0, 0]^⊤. Two scenarios are explored: in the first
scenario, each controller is applied to a separate yet identical system while the state-
dependent time-varying parameters of both systems are generated by the closed-loop
system controlled by Controller 1, and the second scenario has the same setting as
the first scenario except that the state-dependent time-varying parameters are generated by the closed-loop system controlled by Controller 2.

Fig. 8.4 Scenario 1: time-varying parameters generated by the closed-loop system controlled by Controller 1

Fig. 8.5 Scenario 1: time histories of the system state and control effort driven by different controllers and the parameters shown in Fig. 8.4

Fig. 8.6 Scenario 2: time-varying parameters generated by the closed-loop system controlled by Controller 2

Fig. 8.7 Scenario 2: time histories of the system state and control effort driven by different controllers and the parameters shown in Fig. 8.6

In both scenarios, the


“Baseline” results describe the responses of the closed-loop system with constant
nominal parameters controlled by Controller 1, which demonstrate the performance
of the classical controller in the case of constant parameters. The responses of the
system state variables in each scenario are plotted in Figs. 8.5 and 8.7, respectively,
and the parameters used in each scenario are shown in Figs. 8.4 and 8.6, respec-
tively. These results show that the proposed controller (Controller 2) outperforms
the classical controller (Controller 1) in the presence of time-varying parameters and
effectively prevents the oscillations caused by parameter variations. Note that the parameter variations shown in Figs. 8.4 and 8.6 contain discontinuities and therefore do not satisfy Assumption 8.2; nevertheless, the proposed controller proves to be effective also in this operating condition.

8.5 Conclusions

This chapter surveys a new adaptive control scheme, based on the so-called congelation of variables method, developed to cope with time-varying parameters. Several full-state feedback examples are considered to illustrate the main ideas, including scalar systems with time-varying parameters in the feedback path and in the input path, and n-dimensional systems with unmatched time-varying parameters. The output regulation problem for a more general class of nonlinear systems, to which the previous results are not directly applicable due to the coupling between the input and the time-varying perturbation, is then discussed. To solve this problem, ISS of the inverse dynamics, a counterpart of minimum-phaseness in classical adaptive control schemes, is exploited to convert the coupling between the input and the time-varying perturbation into a coupling between the output and the time-varying perturbation. A set of K-filters that guarantees ISS state estimation error dynamics is also designed to replace the unmeasured state variables. Finally, a controller with adaptively updated damping terms is designed to guarantee convergence of the output to zero and boundedness of all closed-loop signals, via a small-gain-like analysis. The simulation results show the performance improvement resulting from the use of the proposed controller, compared with the classical adaptive controller, in the presence of time-varying parameters.

References

1. Annaswamy, A.M., Narendra, K.S.: Adaptive control of simple time-varying systems. In: Pro-
ceedings of the 28th IEEE Conference on Decision and Control, pp. 1014–1018. IEEE (1989)
2. Astolfi, A., Karagiannis, D., Ortega, R.: Nonlinear and Adaptive Control with Applications.
Springer Science & Business Media, Berlin (2007)
3. Chen, K., Astolfi, A.: Adaptive control of linear systems with time-varying parameters. In:
Proceedings of the 2018 American Control Conference, pp. 80–85. IEEE (2018)

4. Chen, K., Astolfi, A.: I&I adaptive control for systems with varying parameters. In: Proceedings
of the 57th IEEE Conference on Decision and Control, pp. 2205–2210. IEEE (2018)
5. Chen, K., Astolfi, A.: Output-feedback adaptive control for systems with time-varying param-
eters. IFAC-PapersOnLine 52(16), 586–591 (2019)
6. Chen, K., Astolfi, A.: Output-feedback I&I adaptive control for linear systems with time-
varying parameters. In: Proceedings of the 58th IEEE Conference on Decision and Control,
pp. 1965–1970. IEEE (2019)
7. Chen, K., Astolfi, A.: Adaptive control for nonlinear systems with time-varying parameters
and control coefficient. IFAC-PapersOnLine 53(2), 3829–3834 (2020)
8. Chen, K., Astolfi, A.: Adaptive control for systems with time-varying parameters. IEEE Trans.
Autom. Control 66(5), 1986–2001 (2021)
9. Chen, K., Astolfi, A.: On the active nodes of network systems. In: Proceedings 59th IEEE
Conference on Decision and Control, pp. 5561–5566. IEEE (2020)
10. Goodwin, G., Ramadge, P., Caines, P.: Discrete-time multivariable adaptive control. IEEE
Trans. Autom. Control 25(3), 449–456 (1980)
11. Goodwin, G.C., Mayne, D.Q.: A parameter estimation perspective of continuous time model
reference adaptive control. Automatica 23(1), 57–70 (1987)
12. Goodwin, G.C., Teoh, E.K.: Adaptive control of a class of linear time varying systems. IFAC
Proceedings Volumes 16(9), 1–6 (1983)
13. Ioannou, P.A., Sun, J.: Robust Adaptive Control. PTR Prentice-Hall Upper Saddle River, NJ
(1996)
14. Khalil, H.K.: Nonlinear Systems. Prentice Hall (2002)
15. Kreisselmeier, G.: Adaptive observers with exponential rate of convergence. IEEE Trans.
Autom. Control 22(1), 2–8 (1977)
16. Kreisselmeier, G.: Adaptive control of a class of slowly time-varying plants. Syst. Control Lett.
8(2), 97–103 (1986)
17. Krstic, M., Kokotovic, P.V., Kanellakopoulos, I.: Nonlinear and Adaptive Control Design, 1st
edn. Wiley, New York, NY, USA (1995)
18. Li, Z., Krstic, M.: Optimal-design of adaptive tracking controllers for nonlinear-systems. Auto-
matica 33(8), 1459–1473 (1997)
19. Lin, W., Qian, C.: Adaptive control of nonlinearly parameterized systems: the smooth feedback
case. IEEE Trans. Autom. Control 47(8), 1249–1266 (2002)
20. Marino, R., Tomei, P.: Global adaptive output-feedback control of nonlinear systems. I. Linear
parameterization. IEEE Trans. Autom. Control 38(1), 17–32 (1993)
21. Marino, R., Tomei, P.: An adaptive output feedback control for a class of nonlinear systems
with time-varying parameters. IEEE Trans. Autom. Control 44(11), 2190–2194 (1999)
22. Marino, R., Tomei, P.: Robust adaptive regulation of linear time-varying systems. IEEE Trans.
Autom. Control 45(7), 1301–1311 (2000)
23. Marino, R., Tomei, P.: Adaptive control of linear time-varying systems. Automatica 39(4),
651–659 (2003)
24. Merriam-Webster Staff: Merriam-Webster’s Collegiate Dictionary. Merriam-Webster (2004)
25. Middleton, R.H., Goodwin, G.C.: Adaptive control of time-varying linear systems. IEEE Trans.
Autom. Control 33(2), 150–155 (1988)
26. Narendra, K.S., Annaswamy, A.M.: Stable Adaptive Systems. Prentice Hall (1989)
27. Nestruev, J.: Smooth Manifolds and Observables. Springer Science & Business Media (2006)
28. Pomet, J.B., Praly, L.: Adaptive nonlinear regulation: Estimation from the Lyapunov equation.
IEEE Trans. Autom. Control 37(6), 729–740 (1992)
29. Tao, G.: Adaptive Control Design and Analysis. Wiley (2003)
30. Tsakalis, K., Ioannou, P.A.: Adaptive control of linear time-varying plants. Automatica 23(4),
459–468 (1987)
31. Wang, C., Guo, L.: Adaptive cooperative tracking control for a class of nonlinear time-varying
multi-agent systems. J. Frankl. Inst. 354(15), 6766–6782 (2017)
32. Zhang, Y., Fidan, B., Ioannou, P.A.: Backstepping control of linear time-varying systems with
known and unknown parameters. IEEE Trans. Autom. Control 48(11), 1908–1925 (2003)

33. Zhang, Y., Ioannou, P.A.: Adaptive control of linear time varying systems. In: Proceedings of
35th IEEE Conference on Decision and Control, vol. 1, pp. 837–842. IEEE (1996)
34. Zhou, J., Wen, C.: Adaptive Backstepping Control of Uncertain Systems: Nonsmooth Nonlinearities, Interactions or Time-Variations. Springer (2008)
Chapter 9
Robust Reinforcement Learning for
Stochastic Linear Quadratic Control
with Multiplicative Noise

Bo Pang and Zhong-Ping Jiang

Dedicated to Laurent Praly, a beautiful mind

Abstract This chapter studies the robustness of reinforcement learning for discrete-
time linear stochastic systems with multiplicative noise evolving in continuous state
and action spaces. As one of the popular methods in reinforcement learning, the
robustness of policy iteration is a longstanding open problem for the stochastic lin-
ear quadratic regulator (LQR) problem with multiplicative noise. A solution in the
spirit of input-to-state stability is given, guaranteeing that the solutions of the policy
iteration algorithm are bounded and enter a small neighborhood of the optimal solu-
tion, whenever the error in each iteration is bounded and small. In addition, a novel
off-policy multiple-trajectory optimistic least-squares policy iteration algorithm is
proposed, to learn a near-optimal solution of the stochastic LQR problem directly
from online input/state data, without explicitly identifying the system matrices. The
efficacy of the proposed algorithm is supported by rigorous convergence analysis
and numerical results on a second-order example.

B. Pang (B) · Z.-P. Jiang


Control and Networks Lab, Department of Electrical and Computer Engineering, Tandon School
of Engineering, New York University, 370 Jay Street, Brooklyn, NY 11201, USA
e-mail: [email protected]
Z.-P. Jiang
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 249
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5_9
250 B. Pang and Z.-P. Jiang

9.1 Introduction

Since the huge successes in the game of Go and in Atari video games [50], reinforcement learning (RL) has been extensively studied by both researchers in academia and practitioners in industry. Optimal control [11] is a branch of control
theory that discusses the synthesis of feedback controllers to achieve optimality prop-
erties for dynamical control systems, but often requires the knowledge of the system
dynamics. Adaptive control [48] is a field that deals with dynamical control systems
with unknown parameters, but usually ignores the optimality of the control systems
(with a few exceptions [9, 19, 46]). RL combines the advantages of these two con-
trol methods [51], and searches for adaptive optimal controllers with respect to some
performance index through interactions between the controller and the dynamical
system, without the complete model knowledge. Over the past decades, numerous
RL methods have been proposed for different optimal control problems with various
kinds of dynamical systems, see books [6, 28, 31, 50] and recent surveys [12, 29,
33, 37, 45] for details.
A class of important and popular methods in RL is policy iteration. Policy iteration
involves two steps, policy evaluation and policy improvement. In policy evaluation,
a given policy is evaluated based on a scalar performance index. Then this perfor-
mance index is utilized to generate a new control policy in policy improvement.
These two steps are iterated in turn, to find the solution of the RL problem at hand.
If implemented perfectly, policy iteration is proved to converge to the optimal solu-
tion. However in reality, policy evaluation or policy improvement can hardly be
implemented precisely, because of the existence of various errors induced by func-
tion approximation, state estimation, sensor noise, external disturbance, and so on.
Therefore, a natural question to ask is: when is a policy iteration algorithm robust to
the exogenous errors? That is, under what conditions on the errors, does the policy
iteration still converge to (a neighborhood of) the optimal solution? In spite of the
popularity and empirical successes of policy iteration, its robustness issue has not
been fully investigated yet in theory [5], especially for RL problems of physical
systems where the state and action spaces are unbounded and continuous, such as
robotics and autonomous cars [38].
Regarding the policy iteration as a dynamical system, and utilizing the concepts
of input-to-state stability in control theory [49], the robustness of policy iteration for
the classic continuous-time and discrete-time LQR problems is analyzed in [43] and
[44], respectively. It is shown that the policy iteration with errors for the LQR is small-
disturbance input-to-state stable, if the errors are regarded as the disturbance input.
In this chapter, we generalize this robustness result to the policy iteration for LQR
of discrete-time linear systems perturbed by stochastic state- and input-dependent
multiplicative noises. Stochastic multiplicative noises are important in modeling the
random perturbation in system parameters and coefficients, and are widely found in
modern control systems such as networked control systems with noisy communica-
tion channels [24], modern power networks [23], neuronal brain networks [10], and
human sensorimotor control [8, 27, 53]. We first prove that the optimal solution

of this stochastic LQR problem is a locally exponentially stable equilibrium of the


exact policy iteration. Then based on this observation, we show that if the policy
iteration starts from an initial solution close to the optimal solution, and the errors
are small and bounded, the discrepancies between the solutions generated by the
policy iteration and the optimal solution will also be small and bounded, in the spirit
of Sontag’s input-to-state stability [49]. Thirdly, we demonstrate that for any initial
stabilizing control gain, as long as the errors are small, the approximate solution
given by policy iteration will eventually enter a small neighborhood of the optimal
solution. Finally, a novel off-policy model-free RL algorithm, named multi-trajectory
optimistic least-squares policy iteration (MO-LSPI), is proposed to find near-optimal
solutions of the stochastic LQR problem directly from online input/state data when
all the system matrices are unknown. Our robustness result is applied to show the con-
vergence of this off-policy MO-LSPI. Experiments on a numerical example validate
our results. In the presence of stochastic multiplicative noise, the Lyapunov equation
in policy evaluation for the classic deterministic LQR [44] becomes the generalized
Lyapunov equation in policy evaluation for the stochastic LQR, while the algebraic
Riccati equation for the classic deterministic LQR becomes the generalized algebraic
Riccati equation for the stochastic LQR. Thus, although the robustness analysis of
policy iteration for stochastic LQR is parallel to that for its deterministic counterpart,
the derivations and proofs are inevitably distinct.
The optimal control of linear systems with stochastic multiplicative noise has been
studied for a long time [3, 4, 15–17, 47, 52, 56]. With the popularization of low-cost,
more powerful computational resources and data-acquisition equipment, the study of
stochastic LQR with multiplicative noises is re-emerging in the context of data-driven
control and learning-based control. Data-driven methods are proposed in [13, 14] to
find near-optimal solutions of stochastic LQR, assuming that the distribution of the
stochastic multiplicative noise is unknown. Model-free RL algorithms are proposed
to find near-optimal solutions in [20] using policy gradient, in [35] using policy
iteration for stochastic LQR, and in [22] using policy iteration for linear quadratic
games. A system identification method is proposed in [55] to explicitly estimate
the system matrices from multiple-trajectory data for subsequent LQR design. The
algorithms proposed in these papers either assume the knowledge of the system
dynamics [13, 14], or do not have proofs of convergence [22, 35], or belong to the
class of on-policy methods [20], or lead to indirect adaptive control [55]. In contrast,
the MO-LSPI algorithm proposed in this chapter is off-policy, and learns near-optimal
solutions of the stochastic LQR directly from input/state data without the precise
knowledge of any system matrices, and with provable convergence analysis. It is
also worth emphasizing that MO-LSPI proposed in this chapter is based on policy
iteration, while the Q-learning algorithm proposed in [18] for stochastic LQR with
random parameters is based on value iteration. The robustness studied in this book
chapter differs from [14, 21] in that the learning process (policy iteration) itself is
viewed as a dynamical system (similar to [7] where robustness is studied for value
iteration), while in [14, 21] the robustness analysis is developed for the closed-loop
system comprised of the environment and the policy.

The rest of this chapter is organized as follows. Section 9.2 introduces the stochastic
LQR with multiplicative noise and its associated policy iteration. Section 9.3 con-
ducts the robustness analysis of policy iteration. Section 9.4 presents the MO-LSPI
algorithm and its convergence analysis. Section 9.5 validates the proposed robust RL
algorithm by means of an elementary example. Section 9.6 closes the chapter with
some concluding remarks.
Notation R is the set of all real numbers; Z+ denotes the set of nonnegative integers; S^n is the set of all real symmetric matrices of order n; ⊗ denotes the Kronecker product; I_n denotes the identity matrix of order n; ‖·‖_F is the Frobenius norm; ‖·‖_2 is the Euclidean norm for vectors and the spectral norm for matrices; for a function u : F → R^{n×m}, ‖u‖_∞ denotes its l∞-norm when F = Z+, and its L∞-norm when F = R. For matrices X ∈ R^{m×n}, Y ∈ S^m, and vector v ∈ R^n, define

vec(X) = [X_1^T, X_2^T, …, X_n^T]^T,  ṽ = svec(vv^T),
svec(Y) = [y_11, √2 y_12, …, √2 y_1m, y_22, √2 y_23, …, √2 y_{m−1,m}, y_mm]^T,

where X_i is the ith column of X. vec^{-1}(·) and svec^{-1}(·) are the operations such that X = vec^{-1}(vec(X)) and Y = svec^{-1}(svec(Y)). For Z ∈ R^{m×n}, define B_r(Z) = {X ∈ R^{m×n} | ‖X − Z‖_F < r} and B̄_r(Z) as the closure of B_r(Z). Z^† is the Moore–Penrose pseudoinverse of matrix Z.
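The scaled svec(·)/vec(·) operations above can be sketched as follows; this is a minimal illustration, and the example matrix is arbitrary.

```python
import numpy as np

def vec(X):
    # Column-stacking vectorization: vec(X) = [X_1^T, ..., X_n^T]^T.
    return X.flatten(order="F")

def svec(Y):
    # Symmetric vectorization with sqrt(2)-scaled off-diagonal entries,
    # ordered row by row: [y11, sqrt(2) y12, ..., y22, sqrt(2) y23, ...].
    m = Y.shape[0]
    out = []
    for i in range(m):
        out.append(Y[i, i])
        for j in range(i + 1, m):
            out.append(np.sqrt(2.0) * Y[i, j])
    return np.array(out)

def svec_inv(s):
    # Inverse of svec: rebuild the symmetric matrix from its packed form.
    m = int((np.sqrt(8 * len(s) + 1) - 1) / 2)
    Y = np.zeros((m, m))
    k = 0
    for i in range(m):
        Y[i, i] = s[k]; k += 1
        for j in range(i + 1, m):
            Y[i, j] = Y[j, i] = s[k] / np.sqrt(2.0); k += 1
    return Y

# The sqrt(2) scaling makes svec an isometry: ||svec(Y)||_2 = ||Y||_F.
Y = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.isclose(np.linalg.norm(svec(Y)), np.linalg.norm(Y, "fro"))
assert np.allclose(svec_inv(svec(Y)), Y)
```

The isometry property is the reason for the √2 factors: Frobenius inner products of symmetric matrices become ordinary dot products of their svec images.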

9.2 Problem Formulation and Preliminaries

Consider linear systems with state- and input-dependent multiplicative noises,


x_{t+1} = (A_0 + Σ_{j=1}^{q1} w_{t,j} A_j) x_t + (B_0 + Σ_{k=1}^{q2} ŵ_{t,k} B_k) u_t,   (9.1)

where x ∈ Rn is the system state; u ∈ Rm is the control input; the initial state x0 ∈ Rn
is given and deterministic; w_{t,j} ∈ R and ŵ_{t,k} ∈ R are stochastic noises; and A_0, B_0, {A_j}_{j=1}^{q1}, {B_k}_{k=1}^{q2} are system matrices of compatible dimensions. The w_{t,j} and ŵ_{t,k} are mutually independent random variables, independent and identically distributed over time. For all t ∈ Z+, j = 1, …, q1, k = 1, …, q2,

E[w_{t,j}] = E[ŵ_{t,k}] = 0,  E[w_{t,j}^2] = E[ŵ_{t,k}^2] = 1.

Definition 9.1 The unforced system (9.1), i.e., system (9.1) with u t = 0 for all
t ∈ Z+ , is said to be mean-square stable if for any x0 ∈ Rn ,

lim_{t→∞} E[x_t x_t^T] = 0.

System (9.1) is said to be mean-square stabilizable, if there exists a matrix K ∈ Rm×n ,


such that the closed-loop system (9.1) with control law u t = −K xt is mean-square
stable. In this case, the gain K is said to be mean-square stabilizing.
The following lemma gives conditions under which K is mean-square stabilizing.
Lemma 9.1 For a control gain K ∈ Rm×n , the following statements are equivalent:
(i) K is mean-square stabilizing.
(ii) There exists P > 0 such that L_K(P) < 0, where

L_K(P) ≜ Λ(P) − P − K^T B_0^T P A_0 − A_0^T P B_0 K + K^T Π(P) K,

Λ(P) = Σ_{j=0}^{q1} A_j^T P A_j, and Π(P) = Σ_{k=0}^{q2} B_k^T P B_k.
(iii) ρ(A(K) + I_n ⊗ I_n) < 1, where

A(K) = Σ_{j=0}^{q1} A_j^T ⊗ A_j^T − I_n ⊗ I_n − A_0^T ⊗ (K^T B_0^T)
       − (K^T B_0^T) ⊗ A_0^T + Σ_{k=0}^{q2} K^T B_k^T ⊗ K^T B_k^T.

Proof (i) ⟺ (ii) is obtained by a direct application of [42, Lemma 1]. The proof
of (i) ⟺ (iii) can be found in [20, Lemma 2.1].
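To make the operator A(K) and the stability test in Lemma 9.1(iii) concrete, here is a small numerical sketch; the system matrices below are invented for illustration (n = 2, m = 1, q1 = q2 = 1) and are not from the chapter. vec(·) is the column-stacking vectorization.

```python
import numpy as np

# Hypothetical system data, chosen only for illustration.
A0 = np.array([[0.3, 0.1], [0.0, 0.2]]); A1 = 0.05 * np.eye(2)
B0 = np.array([[1.0], [0.5]]);           B1 = np.array([[0.02], [0.01]])
n, m = 2, 1

def vec(X):  # column-stacking vectorization
    return X.flatten(order="F")

def L_K(P, K):
    # L_K(P) = Lam(P) - P - K^T B0^T P A0 - A0^T P B0 K + K^T Pi(P) K
    Lam = A0.T @ P @ A0 + A1.T @ P @ A1
    Pi = B0.T @ P @ B0 + B1.T @ P @ B1
    return Lam - P - K.T @ B0.T @ P @ A0 - A0.T @ P @ B0 @ K + K.T @ Pi @ K

def A_cal(K):
    # Matrix representation of L_K: vec(L_K(P)) = A(K) vec(P), cf. (9.11).
    M = sum(np.kron(Aj.T, Aj.T) for Aj in (A0, A1)) - np.eye(n * n)
    M -= np.kron(A0.T, K.T @ B0.T) + np.kron(K.T @ B0.T, A0.T)
    M += sum(np.kron(K.T @ Bk.T, K.T @ Bk.T) for Bk in (B0, B1))
    return M

K = np.zeros((m, n))
# Condition (iii): K is mean-square stabilizing iff rho(A(K) + I ⊗ I) < 1.
rho = max(abs(np.linalg.eigvals(A_cal(K) + np.eye(n * n))))
assert rho < 1.0

# Sanity check of the identity vec(L_K(P)) = A(K) vec(P).
P = np.array([[2.0, 0.3], [0.3, 1.0]])
assert np.allclose(vec(L_K(P, K)), A_cal(K) @ vec(P))
```

The Kronecker factors follow from vec(AXB) = (B^T ⊗ A) vec(X) applied term by term to L_K(P).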
Assuming that system (9.1) is mean-square stabilizable, in stochastic LQR we want
to find a controller u minimizing the cost functional


J(x_0, u) = E[Σ_{t=0}^∞ (x_t^T S x_t + u_t^T R u_t)],   (9.2)

where weighting matrices S ∈ Sn and R ∈ Sm are positive definite. From dynamic


programming theory (e.g., from [14, Proposition 3]), it follows that the optimal
controller for this stochastic LQR problem is u_t* = −K* x_t, and the optimal cost is J(x_0, u*) = x_0^T P* x_0, where

K* = (R + Π(P*))^{-1} B_0^T P* A_0   (9.3)

is mean-square stabilizing and P* ∈ S^n is the unique positive definite solution of the generalized algebraic Riccati equation (GARE)

P = S + Λ(P) − A_0^T P B_0 (R + Π(P))^{-1} B_0^T P A_0.   (9.4)

For a mean-square stabilizing control gain K ∈ R^{m×n}, the induced cost is J(x_0, −Kx) = x_0^T P_K x_0, where P_K ∈ S^n is the unique positive definite solution of the generalized Lyapunov equation

L K (PK ) + S + K T R K = 0. (9.5)

Define
   
G(P_K) = [ [G(P_K)]_xx, [G(P_K)]_ux^T ; [G(P_K)]_ux, [G(P_K)]_uu ]
       = [ S + Λ(P_K) − P_K, A_0^T P_K B_0 ; B_0^T P_K A_0, R + Π(P_K) ].

Then (9.5) can be equivalently rewritten as

H(G(PK ), K ) = 0, (9.6)

where

H(G(P_K), K) = [I_n, −K^T] G(P_K) [I_n, −K^T]^T.

The policy iteration for stochastic LQR is described in the following procedure:
Procedure 9.1 (Exact Policy Iteration)
1) Choose a mean-square stabilizing control gain K 1 , and let i = 1.
2) (Policy evaluation) Evaluate the performance of control gain K i , by solving

H(G i , K i ) = 0 (9.7)

for Pi ∈ Sn , where G i = G(Pi ).


3) (Policy improvement) Obtain an improved policy

K_{i+1} = [G_i]_{uu}^{-1} [G_i]_{ux}.   (9.8)

4) Set i ← i + 1 and go back to Step 2.


The following convergence result of Procedure 9.1 is similar to that of the policy
iteration for the classic LQR in [25]; its proof is given in Appendix 2.
Theorem 9.1 In Procedure 9.1, we have
i) K i is mean-square stabilizing for all i = 1, 2, · · · .
ii) P1 ≥ P2 ≥ P3 ≥ · · · ≥ P ∗ .
iii) limi→∞ Pi = P ∗ , limi→∞ K i = K ∗ .
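Procedure 9.1 can be sketched numerically on a toy system (with invented matrices, n = 2, m = 1, q1 = q2 = 1): policy evaluation (9.7) amounts, via (9.11) and (9.5), to solving the linear system A(K_i) vec(P_i) = −vec(S + K_i^T R K_i), and policy improvement is (9.8). At convergence the iterate satisfies the GARE (9.4).

```python
import numpy as np

# Toy system (illustrative values, not from the chapter).
A0 = np.array([[0.3, 0.1], [0.0, 0.2]]); A1 = 0.05 * np.eye(2)
B0 = np.array([[1.0], [0.5]]);           B1 = np.array([[0.02], [0.01]])
S, R = np.eye(2), np.array([[1.0]])
n = 2

def A_cal(K):
    # vec(L_K(P)) = A(K) vec(P), cf. Lemma 9.1(iii).
    M = sum(np.kron(A.T, A.T) for A in (A0, A1)) - np.eye(n * n)
    M -= np.kron(A0.T, K.T @ B0.T) + np.kron(K.T @ B0.T, A0.T)
    M += sum(np.kron(K.T @ B.T, K.T @ B.T) for B in (B0, B1))
    return M

def evaluate(K):
    # Policy evaluation: L_K(P) + S + K^T R K = 0.
    p = np.linalg.solve(A_cal(K), -(S + K.T @ R @ K).flatten(order="F"))
    return p.reshape(n, n, order="F")

def improve(P):
    # Policy improvement (9.8): K = (R + Pi(P))^{-1} B0^T P A0.
    Pi = B0.T @ P @ B0 + B1.T @ P @ B1
    return np.linalg.solve(R + Pi, B0.T @ P @ A0)

K = np.zeros((1, n))               # mean-square stabilizing initial gain
for _ in range(20):                # Procedure 9.1
    P = evaluate(K)
    K = improve(P)

# At convergence P solves the GARE (9.4).
Lam = A0.T @ P @ A0 + A1.T @ P @ A1
Pi = B0.T @ P @ B0 + B1.T @ P @ B1
gare = S + Lam - A0.T @ P @ B0 @ np.linalg.solve(R + Pi, B0.T @ P @ A0)
assert np.linalg.norm(gare - P) < 1e-9
```

Because the convergence is locally quadratic (Lemma 9.2 below), a handful of iterations already drives the GARE residual to machine precision on this small example.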
Theorem 9.1 guarantees that the optimal solution will be found by Procedure 9.1 in the limit. However, exact knowledge of {A_j}_{j=0}^{q1} and {B_k}_{k=0}^{q2} is required in Procedure 9.1, as the solution to (9.7) relies upon {A_j}_{j=0}^{q1} and {B_k}_{k=0}^{q2}. In practice, we often only have access to incomplete information, so each policy evaluation step will yield an inaccurate estimate. Thus, we are interested in studying the following problem.
Problem 9.1 If G i is replaced by an approximated matrix Ĝ i , will the conclusions
in Theorem 9.1 still hold?

The difference between Ĝ_i and G_i can be attributed to errors from various sources, including but not limited to: estimation errors of {A_j}_{j=0}^{q1} and {B_k}_{k=0}^{q2} in indirect adaptive control [48], system identification [39], and model-based reinforcement learning [54]; approximate values of S and R in inverse optimal control/imitation learning, due to the absence of exact knowledge of the cost function [36, 41]; and imprecise values of P_i in model-free reinforcement learning [1, 54].
In Sect. 9.3, using the concepts of exponential stability and input-to-state stability from control theory, we provide an answer to Problem 9.1. In Sect. 9.4, we present the model-free RL algorithm MO-LSPI, to find near-optimal solutions of the stochastic LQR directly from a set of input/state data collected along the trajectories of the control system, when {A_j}_{j=0}^{q1} and {B_k}_{k=0}^{q2} are unknown. The answer to Problem 9.1 derived in Sect. 9.3 is used to analyze the convergence of the proposed MO-LSPI algorithm.

9.3 Robust Policy Iteration

Consider the policy iteration in the presence of errors.


Procedure 9.2 (Inexact Policy Iteration)
(1) Choose a mean-square stabilizing control gain K̂ 1 , and let i = 1.
(2) (Inexact policy evaluation) Obtain Ĝ_i = G̃_i + ΔG_i, where ΔG_i ∈ S^{m+n} is a disturbance, G̃_i ≜ G(P̃_i), and P̃_i ∈ S^n satisfies

H(G̃_i, K̂_i) = 0,   (9.9)

and J (x0 , − K̂ i x) = x0T P̃i x0 is the true cost induced by control gain K̂ i .
(3) (Policy update) Construct a new control gain

K̂_{i+1} = [Ĝ_i]_{uu}^{-1} [Ĝ_i]_{ux}.   (9.10)

(4) Set i ← i + 1 and go back to Step 2.

Remark 9.1 The requirement that Ĝ_i ∈ S^{m+n} in Procedure 9.2 is not restrictive, since for any X ∈ R^{(n+m)×(n+m)}, x^T X x = ½ x^T(X + X^T)x, where ½(X + X^T) is symmetric.

We first show that the exact policy iteration Procedure 9.1, viewed as a dynamical system, is locally exponentially stable at P*. Then, based on this result, we show that the inexact policy iteration Procedure 9.2, viewed as a dynamical system with ΔG_i as the input, is locally input-to-state stable.
For X ∈ R^{m×n}, Y ∈ R^{n×n}, define

K(Y) = R(Y)^{-1} B_0^T Y A_0,  R(Y) = R + Π(Y).

Note that

vec(L_X(Y)) = A(X) vec(Y),   (9.11)

where L_X(·) and A(·) are defined in Lemma 9.1, with the gain X in place of K. By (iii) in Lemma 9.1, if X is mean-square stabilizing, then A(X) is invertible, and (9.11) implies that the inverse operator L_X^{-1}(·) exists on R^{n×n}.
.
In Procedure 9.1, suppose K_1 = K(P_0), where P_0 ∈ S^n is chosen such that K_1 is mean-square stabilizing. Such a P_0 always exists; for example, since K* is mean-square stabilizing, one can choose P_0 close to P* by continuity. Then from (9.7) and (9.8), the sequence {P_i}_{i=0}^∞ generated by Procedure 9.1 satisfies

P_{i+1} = L_{K(P_i)}^{-1}(−S − K(P_i)^T R K(P_i)).   (9.12)

If Pi is regarded as the state, and the iteration index i is regarded as time, then (9.12)
is a discrete-time dynamical system and P ∗ is an equilibrium by Theorem 9.1. The
next lemma shows that P ∗ is actually a locally exponentially stable equilibrium,
whose proof is given in Appendix 3.
Lemma 9.2 For any 0 < σ < 1, there exists a δ_0(σ) > 0, such that for any P_i ∈ B_{δ_0}(P*), R(P_i) is invertible, K(P_i) is mean-square stabilizing, and

‖P_{i+1} − P*‖_F ≤ σ‖P_i − P*‖_F.

In Procedure 9.2, suppose K̂_1 = K(P̃_0) and ΔG_0 = 0, where P̃_0 ∈ S^n is chosen such that K̂_1 is mean-square stabilizing. If K̂_i is mean-square stabilizing and [Ĝ_i]_{uu} is invertible for all i ∈ Z+, i > 0 (this is possible under certain conditions, see Appendix 4), the sequence {P̃_i}_{i=0}^∞ generated by Procedure 9.2 satisfies

P̃_{i+1} = L_{K(P̃_i)}^{-1}(−S − K(P̃_i)^T R K(P̃_i)) + E(G̃_i, ΔG_i),   (9.13)

where

E(G̃_i, ΔG_i) = L_{K̂_{i+1}}^{-1}(−S − K̂_{i+1}^T R K̂_{i+1}) − L_{K(P̃_i)}^{-1}(−S − K(P̃_i)^T R K(P̃_i)).


Regarding {G i }i=0 as the disturbance input, the next theorem shows that dynamical
system (9.13) is locally input-to-state stable [30, Definition 2.1], whose proof can be
found in Appendix 4.
Lemma 9.3 For σ and its associated δ0 in Lemma 9.2, there exists δ1 (δ0 ) > 0, such
that if G∞ < δ1 , P̃0 ∈ Bδ0 (P ∗ ),
(i) [Ĝ i ]uu is invertible and K̂ i is mean-square stabilizing, ∀i ∈ Z+ , i > 0;
(ii) (9.13) is locally input-to-state stable:

 P̃i − P ∗  F ≤ β( P̃0 − P ∗  F , i) + γ (G∞ ), ∀i ∈ Z+ ,



where

β(y, i) = σ^i y,  γ(y) = (c_3/(1 − σ)) y,  y ∈ R,

and c_3(δ_0) > 0;
(iii) ‖K̂_i‖_F < κ_1 for some κ_1(δ_0) ∈ R+, ∀i ∈ Z+, i > 0;
(iv) lim_{i→∞} ‖ΔG_i‖_F = 0 implies lim_{i→∞} ‖P̃_i − P*‖_F = 0.
Intuitively, Lemma 9.3 implies that in Procedure 9.2, if P̃_0 is near P* (and thus K̂_1 is near K*), and the disturbance input ΔG is bounded and not too large, then the cost of the generated control policy K̂_i is also bounded, and will ultimately be no larger than a constant proportional to the l∞-norm of the disturbance. The smaller the disturbance, the better the ultimately generated policy. In other words, the algorithm described in Procedure 9.2 is not sensitive to small disturbances when the initial condition is in a neighborhood of the optimal solution.
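A small experiment in this spirit, with an invented system and artificial symmetric disturbances ΔG_i injected before each policy update, suggests the bounded-error behavior; the noise level, iteration count, and error bound below are arbitrary choices made for the sketch.

```python
import numpy as np

# Illustrative system (not from the chapter): n = 2, m = 1, q1 = q2 = 1.
A0 = np.array([[0.3, 0.1], [0.0, 0.2]]); A1 = 0.05 * np.eye(2)
B0 = np.array([[1.0], [0.5]]);           B1 = np.array([[0.02], [0.01]])
S, R, n = np.eye(2), np.array([[1.0]]), 2

def A_cal(K):
    M = sum(np.kron(A.T, A.T) for A in (A0, A1)) - np.eye(n * n)
    M -= np.kron(A0.T, K.T @ B0.T) + np.kron(K.T @ B0.T, A0.T)
    M += sum(np.kron(K.T @ B.T, K.T @ B.T) for B in (B0, B1))
    return M

def evaluate(K):  # exact policy evaluation, cf. (9.9)
    p = np.linalg.solve(A_cal(K), -(S + K.T @ R @ K).flatten(order="F"))
    return p.reshape(n, n, order="F")

def G_of(P):      # the matrix G(P) from Sect. 9.2
    Lam = A0.T @ P @ A0 + A1.T @ P @ A1
    Pi = B0.T @ P @ B0 + B1.T @ P @ B1
    top = np.hstack([S + Lam - P, A0.T @ P @ B0])
    bot = np.hstack([B0.T @ P @ A0, R + Pi])
    return np.vstack([top, bot])

def run_pi(noise, iters=30, seed=0):
    rng = np.random.default_rng(seed)
    K = np.zeros((1, n))
    for _ in range(iters):
        G = G_of(evaluate(K))
        D = rng.standard_normal(G.shape); D = noise * (D + D.T) / 2  # symmetric dG_i
        Gh = G + D
        K = np.linalg.solve(Gh[n:, n:], Gh[n:, :n])                  # (9.10)
    return evaluate(K)

P_star = run_pi(0.0)        # exact PI recovers P*
P_noisy = run_pi(1e-3)      # inexact PI with small, bounded disturbances
err = np.linalg.norm(P_noisy - P_star)
assert err < 1e-2           # error stays small, in the spirit of Lemma 9.3
```

With the disturbance switched off the iteration recovers P*; with small ΔG_i the final discrepancy remains of the order of the disturbance, as the ISS-style bound predicts.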
The requirement in Lemma 9.3 that the initial condition P̃_0 be in a neighborhood of P* can be removed, as stated in the following theorem, whose proof is given in Appendix 5.
Theorem 9.2 For any given mean-square stabilizing control gain K̂_1 and any ε > 0, if S > 0, there exist δ_2(ε, K̂_1) > 0, α(δ_2) > 0, and κ(δ_2) > 0, such that as long as ‖ΔG‖_∞ < δ_2, [Ĝ_i]_{uu} is invertible, K̂_i is mean-square stabilizing, ‖P̃_i‖_F < α, ‖K̂_i‖_F < κ, ∀i ∈ Z+, i > 0, and

lim sup_{i→∞} ‖P̃_i − P*‖_F < ε.

If in addition lim_{i→∞} ‖ΔG_i‖_F = 0, then lim_{i→∞} ‖P̃_i − P*‖_F = 0.


In Theorem 9.2, K̂ 1 can be any mean-square stabilizing control gain, which is dif-
ferent from that of Lemma 9.3. When there is no disturbance, Theorem 9.2 implies
the convergence result of Procedure 9.1, i.e., Theorem 9.1.

9.4 Multi-trajectory Optimistic Least-Squares Policy Iteration
q q
In this section, the system matrices {A_j}_{j=0}^{q1} and {B_k}_{k=0}^{q2} of system (9.1) are assumed unknown, and the MO-LSPI algorithm is proposed to find near-optimal solutions of the stochastic LQR directly from input/state data. The following lemma provides an alternative way to implement the policy evaluation step in Procedure 9.1.
Lemma 9.4 For any given mean-square stabilizing control gain K ∈ R^{m×n}, its associated P_K ∈ S^n satisfying (9.6) is the unique stable equilibrium of the following iteration:

P_{K,j+1} = H(Q(P_{K,j}), K),  P_{K,0} ∈ R^{n×n},   (9.14)

where

Q(P_{K,j}) = G(P_{K,j}) + [P_{K,j}, 0; 0, 0].

Proof Vectorizing (9.14) and using (9.11), we have

vec(PK , j+1 ) = (A (K ) + In ⊗ In ) vec(PK , j ) + vec(S + K T R K ). (9.15)

By Lemma 9.1, ρ(A (K ) + In ⊗ In ) < 1. This implies that (9.14) admits a unique
stable equilibrium and it must be PK . 
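A minimal sketch of the fixed-point iteration in its vectorized form (9.15), assuming a toy system and an assumed mean-square stabilizing gain; the iterate is compared against the direct solution of the generalized Lyapunov equation.

```python
import numpy as np

# Illustrative system (n = 2, m = 1); values are assumptions for the sketch.
A0 = np.array([[0.3, 0.1], [0.0, 0.2]]); A1 = 0.05 * np.eye(2)
B0 = np.array([[1.0], [0.5]]);           B1 = np.array([[0.02], [0.01]])
S, R, n = np.eye(2), np.array([[1.0]]), 2
K = np.array([[0.1, 0.05]])              # assumed mean-square stabilizing gain

def A_cal(K):
    M = sum(np.kron(A.T, A.T) for A in (A0, A1)) - np.eye(n * n)
    M -= np.kron(A0.T, K.T @ B0.T) + np.kron(K.T @ B0.T, A0.T)
    M += sum(np.kron(K.T @ B.T, K.T @ B.T) for B in (B0, B1))
    return M

T = A_cal(K) + np.eye(n * n)               # iteration matrix of (9.15)
assert max(abs(np.linalg.eigvals(T))) < 1  # Lemma 9.1(iii): K is stabilizing

# Fixed-point iteration (9.15): vec(P_{j+1}) = T vec(P_j) + vec(S + K^T R K).
c = (S + K.T @ R @ K).flatten(order="F")
p = np.zeros(n * n)                        # arbitrary initialization P_{K,0}
for _ in range(200):
    p = T @ p + c

# The unique stable equilibrium solves A(K) vec(P) = -vec(S + K^T R K).
p_direct = np.linalg.solve(A_cal(K), -c)
assert np.linalg.norm(p - p_direct) < 1e-10
```

Since ρ(T) < 1, the iteration is a contraction in the vectorized coordinates, which is exactly the mechanism behind Lemma 9.4.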

Lemma 9.4 means that instead of solving the algebraic equation (9.7), we can use iteration (9.14) for policy evaluation. We now explain how (9.14) is combined with the least-squares method to form an estimate Ĝ_i in Procedure 9.2 directly from input/state data.
Suppose the following control law is applied to system (9.1)

u t = − K̂ 1 xt + vt , (9.16)

where K̂_1 is mean-square stabilizing, v_t is independently drawn from a multivariate Gaussian distribution with mean V_t and covariance I_m, and the sequence {V_t}_{t=0}^∞ is a realization of a discrete-time white Gaussian noise process. Then for any P ∈ S^n, we have

E[x_{t+1}^T P x_{t+1}] = E[x_t^T Λ(P) x_t + 2 x_t^T A_0^T P B_0 u_t + u_t^T Π(P) u_t]
                       = E[z_t^T Q(P) z_t − x_t^T S x_t − u_t^T R u_t],

where z t = [xtT , u tT ]T . Vectorizing the above equation yields

Z_t^T svec(Q(P)) = X_{t+1}^T svec(P) + X_t^T svec(S) + U_t^T svec(R),   (9.17)

where Z_t = E[z̃_t], X_t = E[x̃_t], and U_t = E[ũ_t]. For M ∈ Z+, stacking equations (9.17) for time indices t = 0 to t = M − 1 into one equation yields

Φ_M svec(Q(P)) = Ψ_M^2 svec(P) + r_M,   (9.18)

where

r_M = Ψ_M^1 svec(S) + Υ_M svec(R),
Φ_M = [Z_0, Z_1, …, Z_{M−1}]^T,  Ψ_M^1 = [X_0, X_1, …, X_{M−1}]^T,
Ψ_M^2 = [X_1, X_2, …, X_M]^T,  Υ_M = [U_0, U_1, …, U_{M−1}]^T.

The following assumption is made on the matrix Φ_M.

Assumption 9.1 The matrix Φ_M has full column rank, i.e.,

rank(Φ_M) = (m + n)(m + n + 1)/2.
Under Assumption 9.1, (9.18) can be rewritten as

svec(Q(P)) = Φ_M^†(Ψ_M^2 svec(P) + r_M).   (9.19)

Then (9.14) is equivalent to

P_{K,j+1} = H(svec^{-1}(Φ_M^†(Ψ_M^2 svec(P_{K,j}) + r_M)), K).   (9.20)
Note that (9.20) does not depend on any of the system matrices {A_j}_{j=0}^{q1} and {B_k}_{k=0}^{q2}. However, the matrices Φ_M, Ψ_M^1, Ψ_M^2, and Υ_M are not known either. This issue is overcome by averaging over multiple input/state trajectories of system (9.1). Concretely, suppose in total N trajectories are collected by running system (9.1) independently N times. Let x_t^p and u_t^p denote the state and input at time t of the pth trajectory. Then using x_t^p and u_t^p we can construct the iteration

P̂_{K,j+1} = H(Q̂_{K,j}, K),
Q̂_{K,j} = svec^{-1}(Φ̂_{N,M}^†(Ψ̂_{N,M}^2 svec(P̂_{K,j}) + r̂_{N,M})),   (9.21)

where

r̂_{N,M} = Ψ̂_{N,M}^1 svec(S) + Υ̂_{N,M} svec(R),
Φ̂_{N,M} = [Ẑ_{N,0}, Ẑ_{N,1}, …, Ẑ_{N,M−1}]^T,  Ψ̂_{N,M}^1 = [X̂_{N,0}, X̂_{N,1}, …, X̂_{N,M−1}]^T,
Ψ̂_{N,M}^2 = [X̂_{N,1}, X̂_{N,2}, …, X̂_{N,M}]^T,  Υ̂_{N,M} = [Û_{N,0}, Û_{N,1}, …, Û_{N,M−1}]^T,

and

Ẑ_{N,t} = (1/N) Σ_{p=1}^N z̃_t^p,  X̂_{N,t} = (1/N) Σ_{p=1}^N x̃_t^p,  Û_{N,t} = (1/N) Σ_{p=1}^N ũ_t^p.

Since any two trajectories are independent, by the strong law of large numbers, almost
surely
lim_{N→∞} Ẑ_{N,t} = Z_t,  lim_{N→∞} X̂_{N,t} = X_t,  lim_{N→∞} Û_{N,t} = U_t.

Then for each M ∈ Z+ ,

lim_{N→∞} Φ̂_{N,M} = Φ_M,  lim_{N→∞} Ψ̂_{N,M}^1 = Ψ_M^1,
lim_{N→∞} Ψ̂_{N,M}^2 = Ψ_M^2,  lim_{N→∞} Υ̂_{N,M} = Υ_M.   (9.22)

Thus for large values of N, it is expected that P̂_{K,j} and Q̂_{K,j} generated by (9.21) are good approximations of P_{K,j} and Q(P_{K,j}) generated by (9.20), respectively, for all j ∈ Z+, provided P̂_{K,0} = P_{K,0}. Since (9.14) is equivalent to (9.20), by Lemma 9.4, P̂_{K,j} and Q̂_{K,j} are then expected to be close to P_K and Q(P_K), respectively, for j large enough. In this way, the policy evaluation step is implemented directly from the input/state data. The proposed MO-LSPI is presented in Algorithm 9.1. Based on Theorem 9.2, the convergence of Algorithm 9.1 is derived as the following theorem, whose proof is postponed to Appendix 6.
Theorem 9.3 For each mean-square stabilizing control gain K̂_1, each realization {V_t} of the discrete-time white Gaussian noise process, each M ≥ (m + n)(m + n + 1)/2, and any ε > 0, if Assumption 9.1 is satisfied, then there exist L̄_0 > 0 and N_0 > 0 such that for any L̄ ≥ L̄_0 and N ≥ N_0, almost surely

lim sup_{Ī→∞} ‖P̃_Ī − P*‖_F < ε

and K̂_i is mean-square stabilizing for all i = 1, …, Ī, where P̃_Ī is the unique solution of (9.9) for K̂_Ī.
Remark 9.2 To satisfy Assumption 9.1, the noise v_t is added to the controller (9.16). If the mean V_t ≡ 0, then E[x_t u_t^T] = −E[x_t x_t^T] K̂_1^T, and by definition Φ_M cannot have full column rank. This is why we require {V_t}_{t=0}^∞ to be a realization of the discrete-time white Gaussian noise process.

Algorithm 9.1: MO-LSPI

Input: Initial control gain K̂_1, number of policy iterations Ī, length of policy evaluation L̄, length of rollout M, number of rollouts N.
1  for t = 0, …, M − 1 do
2    Generate V_t independently from the standard multivariate Gaussian distribution;
3  end
4  for p = 1, …, N do
5    Generate trajectories {x_t^p}_{t=0}^M and {u_t^p}_{t=0}^{M−1} by applying control law (9.16) to system (9.1);
6  end
7  Compute the data matrices Φ̂_{N,M}, Ψ̂_{N,M}^1, Ψ̂_{N,M}^2, and Υ̂_{N,M};
8  for i = 1, …, Ī − 1 do
9    P̂_{i,0} ← 0;
10   for j = 0, …, L̄ − 1 do
11     Q̂_{i,j} = svec^{-1}(Φ̂_{N,M}^†(Ψ̂_{N,M}^2 svec(P̂_{i,j}) + r̂_{N,M}));
12     P̂_{i,j+1} = H(Q̂_{i,j}, K̂_i);
13   end
14   Q̂_i = svec^{-1}(Φ̂_{N,M}^†(Ψ̂_{N,M}^2 svec(P̂_{i,L̄}) + r̂_{N,M}));
15   K̂_{i+1} ← [Q̂_i]_{uu}^{-1}[Q̂_i]_{ux};
16 end
17 return K̂_Ī.
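The data-collection and least-squares steps of Algorithm 9.1 can be sketched in simulation as follows. The system, the modest rollout numbers, and the data-matrix names (Φ̂, Ψ̂¹, Ψ̂², Υ̂, following the notation of Sect. 9.4) are illustrative assumptions; only the rank condition of Assumption 9.1 and one policy-evaluation sweep of (9.21) are checked.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy system (assumed values): n = 2, m = 1, q1 = q2 = 1.
A0 = np.array([[0.3, 0.1], [0.0, 0.2]]); A1 = 0.05 * np.eye(2)
B0 = np.array([[1.0], [0.5]]);           B1 = np.array([[0.02], [0.01]])
S, R, n, m = np.eye(2), np.array([[1.0]]), 2, 1

def svec(Y):
    d = Y.shape[0]
    return np.array([Y[i, i] if i == j else np.sqrt(2) * Y[i, j]
                     for i in range(d) for j in range(i, d)])

K1 = np.zeros((m, n))                 # initial mean-square stabilizing gain
M, N = 8, 4000                        # M >= (m+n)(m+n+1)/2 = 6 steps, N rollouts
V = rng.standard_normal((M, m))       # exploration means V_t (Line 2)

Zbar = np.zeros((M, 6)); Xbar = np.zeros((M + 1, 3)); Ubar = np.zeros((M, 1))
for _ in range(N):                    # N independent rollouts under (9.16)
    x = np.array([1.0, -1.0])
    for t in range(M):
        u = -K1 @ x + V[t] + rng.standard_normal(m)
        z = np.concatenate([x, u])
        Zbar[t] += svec(np.outer(z, z))
        Xbar[t] += svec(np.outer(x, x))
        Ubar[t] += svec(np.outer(u, u))
        w, wh = rng.standard_normal(), rng.standard_normal()
        x = (A0 + w * A1) @ x + (B0 + wh * B1) @ u
    Xbar[M] += svec(np.outer(x, x))
Zbar /= N; Xbar /= N; Ubar /= N

Phi, Psi1, Psi2, Ups = Zbar, Xbar[:M], Xbar[1:], Ubar
assert np.linalg.matrix_rank(Phi) == 6          # Assumption 9.1
r = Psi1 @ svec(S) + Ups @ svec(R)
# One policy-evaluation sweep of (9.21): svec(Q) = pinv(Phi)(Psi2 svec(P) + r).
sQ = np.linalg.pinv(Phi) @ (Psi2 @ svec(np.eye(n)) + r)
assert sQ.shape == (6,)
```

Note how the exploration means V_t make the rank condition attainable, in line with Remark 9.2: without them the rows of Φ̂ would be linearly dependent.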

9.5 An Illustrative Example

The proposed MO-LSPI Algorithm 9.1 is applied to the second-order system with
multiplicative noises studied in [55], which is described by (9.1) with system matrices
     
A_0 = [−0.2, 0.3; −0.4, 0.8],  A_1 = (1/100)[0.08, 1.69; −0.25, 0],  A_2 = (1/100)[−7.37, −0.58; −1.61, 0],
A_3 = [0, 0; 0, 0.08],  A_4 = [−0.037, 0.022; 0.1617, 0],  B_0 = [−1.8; −0.8],
B_1 = (1/100)[−4.7; −0.62],  B_2 = (1/100)[20.09; −2.63].

The stochastic noises w_{t,j} and ŵ_{t,k} are random variables independently drawn from the standard normal distribution for each t, j, and k. The initial control gain is chosen as K̂_1 = [0, 0], which is mean-square stabilizing, as verified by checking that ρ(A([0, 0]) + I_2 ⊗ I_2) < 1. In the simulation, we set the weighting matrices S = I_2 and R = 1, the number of policy iterations Ī = 11, the length of rollout M = 7, the number of rollouts N = 10^6, and the length of policy evaluation L̄ = 1000. Algorithm 9.1 is run 200 times, with the same realization of the discrete-time white Gaussian noise process {V_t}. In other words, for each t, the random seed in Line 2 of Algorithm 9.1 is fixed over the 200 runs. In each run, the relative errors ‖P̃_i − P*‖_F/‖P*‖_F for i = 1, 2, …, Ī are computed and recorded.

Fig. 9.1 Experimental results of the second-order system (top: sample average of the relative error; bottom: sample variance of the relative error; horizontal axis: iteration index 1–11)

The sample average and sample variance of the relative error for each iteration index i are plotted in Fig. 9.1. This validates Theorem 9.3. Since
Theorem 9.3 is based on Theorem 9.2, our robustness results are also verified.

9.6 Conclusions

This chapter analyzes the robustness of policy iteration for stochastic LQR with
multiplicative noises. It is proved that starting from any mean-square stabilizing
initial policy, the solutions generated by policy iteration with errors are bounded and
ultimately enter and stay in a neighborhood of the optimal solution, as long as the
errors are small and bounded. This result is employed to prove the convergence of the
multiple-trajectory optimistic least-squares policy iteration (MO-LSPI), a novel off-
policy model-free RL algorithm for discrete-time LQR with stochastic multiplicative
noises in the model. The theoretical results are validated by the experiments on a
numerical example.

Acknowledgements Confucius once said, "Virtue is not left to stand alone. He who practices it
will have neighbors." Laurent Praly, the former PhD advisor of the second-named author, is such a
beautiful mind. His vision about and seminal contributions to control theory, especially nonlinear
and adaptive control, have influenced generations of students including the authors of this chapter.
ZPJ is privileged to have Laurent as the PhD advisor during 1989–1993 and is very grateful to
Laurent for introducing him to the field of nonlinear control. It is under Laurent’s close guidance
that ZPJ started, in 1991, working on the stability and control of interconnected nonlinear systems
that has paved the foundation for nonlinear small-gain theory. The research findings presented here
are just a reflection of Laurent’s vision about the relationships between control and learning. We
also thank the U.S. National Science Foundation for its continuous financial support.

Appendix 1

The following lemma provides the relationship between operations vec(·) and svec(·).

Lemma 9.5 ([40, Page 57]) For X ∈ S^n, there exists a unique matrix D_n ∈ R^{n² × n(n+1)/2} with full column rank, such that

vec(X) = D_n svec(X),  svec(X) = D_n^† vec(X).

Dn is called the duplication matrix.
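With the √2-scaled svec(·) of the Notation section, D_n can be constructed column by column; under that scaling its columns are orthonormal, so D_n^† = D_n^T. A sketch, assuming the svec ordering matches the Notation section:

```python
import numpy as np

def svec(Y):
    d = Y.shape[0]
    return np.array([Y[i, i] if i == j else np.sqrt(2) * Y[i, j]
                     for i in range(d) for j in range(i, d)])

def duplication(n):
    # D_n in R^{n^2 x n(n+1)/2} with vec(X) = D_n svec(X) for symmetric X.
    cols = []
    for i in range(n):
        for j in range(i, n):
            E = np.zeros((n, n))
            if i == j:
                E[i, i] = 1.0
            else:
                E[i, j] = E[j, i] = 1.0 / np.sqrt(2)  # undoes the sqrt(2) scaling
            cols.append(E.flatten(order="F"))          # column-stacking vec
    return np.array(cols).T

n = 3
Dn = duplication(n)
X = np.array([[1.0, 2.0, 0.5], [2.0, -1.0, 0.3], [0.5, 0.3, 4.0]])
assert np.allclose(Dn @ svec(X), X.flatten(order="F"))
# Columns are orthonormal, so the pseudoinverse is just the transpose:
assert np.allclose(Dn.T @ Dn, np.eye(n * (n + 1) // 2))
assert np.allclose(Dn.T @ X.flatten(order="F"), svec(X))
```

This differs from the classical (unscaled) duplication matrix acting on vech; the √2 convention is what makes D_n^† collapse to a transpose here.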


Lemma 9.6 ([44, Lemma A.3]) Let O be a compact set such that ρ(O) < 1 for any O ∈ O; then there exist an a_0 > 0 and a 0 < b_0 < 1, such that

‖O^k‖_2 ≤ a_0 b_0^k,  ∀k ∈ Z+

for any O ∈ O.

For X ∈ R^{n×n}, Y ∈ R^{n×m}, X + ΔX ∈ R^{n×n}, Y + ΔY ∈ R^{n×m}, supposing X and X + ΔX are invertible, the following inequality is used repeatedly:

‖X^{-1}Y − (X + ΔX)^{-1}(Y + ΔY)‖_F
  = ‖X^{-1}Y − X^{-1}(Y + ΔY) + X^{-1}(Y + ΔY) − (X + ΔX)^{-1}(Y + ΔY)‖_F
  = ‖−X^{-1}ΔY + X^{-1}ΔX(X + ΔX)^{-1}(Y + ΔY)‖_F   (9.23)
  ≤ ‖X^{-1}‖_F ‖ΔY‖_F + ‖X^{-1}‖_F ‖(X + ΔX)^{-1}‖_F ‖Y + ΔY‖_F ‖ΔX‖_F.

Appendix 2

The following property of L_K(·) is useful.

Lemma 9.7 If K is mean-square stabilizing, then L_K(Y_1) ≤ L_K(Y_2) ⟹ Y_1 ≥ Y_2, where Y_1, Y_2 ∈ S^n.

Proof Let {x_t}_{t=0}^∞ be the solution of the closed-loop system (9.1) with controller u = −Kx. Then for any t ∈ Z+,

E[x_{t+1}^T Y_1 x_{t+1} − x_t^T Y_1 x_t] = E[x_t^T L_K(Y_1) x_t]
  ≤ E[x_t^T L_K(Y_2) x_t] = E[x_{t+1}^T Y_2 x_{t+1} − x_t^T Y_2 x_t].

Since K is mean-square stabilizing,

−x_0^T Y_1 x_0 = Σ_{t=0}^∞ E[x_t^T L_K(Y_1) x_t] ≤ Σ_{t=0}^∞ E[x_t^T L_K(Y_2) x_t] = −x_0^T Y_2 x_0.

The proof is complete because x0 is arbitrary. 

Now we are ready to prove Theorem 9.1.


Proof (Theorem 9.1) By (9.7) and (9.8), for any x ∈ R^n,

K_2 ∈ arg min_{K ∈ R^{m×n}} {x^T H(G(P_1), K) x}.

Thus H(G(P_1), K_2) ≤ 0. By definition, P_1 > 0 and

L K 2 (P1 ) ≤ −S − K 2T R K 2 < 0.

Then Lemma 9.1 implies that K 2 is mean-square stabilizing. Inserting (9.7) into the
above inequality yields L K 2 (P1 ) ≤ L K 2 (P2 ). This implies P1 ≥ P2 by Lemma 9.7.
An application of mathematical induction proves the first two items. For the last item,
by a theorem on the convergence of a monotone sequence of self-adjoint operators

(see [32, Pages 189–190]), limi→∞ Pi and limi→∞ K i exist. Letting i → ∞ in (9.7)
and (9.8), and eliminating K ∞ in (9.7) using (9.8), we have

P_∞ = S + Λ(P_∞) − A_0^T P_∞ B_0 (R + Π(P_∞))^{-1} B_0^T P_∞ A_0.

The proof is complete by the uniqueness of P ∗ . 

Appendix 3

Proof (Lemma 9.2) Since K(P*) is mean-square stabilizing, by continuity there always exists a δ̄_0 > 0, such that R(P_i) is invertible and K(P_i) is mean-square stabilizing for all P_i ∈ B̄_{δ̄_0}(P*). Suppose P_i ∈ B̄_{δ̄_0}(P*). Subtracting

K(P_i)^T B_0^T P* A_0 + A_0^T P* B_0 K(P_i) − K(P_i)^T R(P*) K(P_i)

from both sides of the GARE (9.4) yields

L_{K(P_i)}(P*) = −S − K(P_i)^T R K(P_i) + (K(P_i) − K(P*))^T R(P*)(K(P_i) − K(P*)).   (9.24)

Subtracting (9.24) from (9.12), we have

P_{i+1} − P* = −L_{K(P_i)}^{-1}((K(P_i) − K(P*))^T R(P*)(K(P_i) − K(P*))).

Taking norms on both sides of the above equation, (9.11) yields

‖P_{i+1} − P*‖_F ≤ ‖A(K(P_i))^{-1}‖_2 ‖R(P*)‖_F ‖K(P_i) − K(P*)‖_F².

Since K(·) is locally Lipschitz continuous at P*, by continuity of the matrix norm and the matrix inverse, there exists a c_1 > 0, such that

‖P_{i+1} − P*‖_F ≤ c_1 ‖P_i − P*‖_F²,  ∀P_i ∈ B̄_{δ̄_0}(P*).

So for any 0 < σ < 1, there exists a δ_0 with 0 < δ_0 ≤ δ̄_0 and c_1 δ_0 ≤ σ. This completes the proof.

Appendix 4

Before the proof of Lemma 9.3, some auxiliary lemmas are first proved. Procedure 9.2 will exhibit a singularity if [Ĝ_i]_{uu} in (9.10) is singular, or if the cost (9.2) of K̂_{i+1} is infinite. The following lemma shows that if ΔG_i is small, no singularity will occur. Let δ̄_0 be the one defined in the proof of Lemma 9.2; then δ_0 ≤ δ̄_0.

Lemma 9.8 For any P̃_i ∈ B_{δ_0}(P*), there exists a d(δ_0) > 0, independent of P̃_i, such that K̂_{i+1} is mean-square stabilizing and [Ĝ_i]_{uu} is invertible, if ‖ΔG_i‖_F ≤ d.

Proof Since B̄_{δ̄₀}(P∗) is compact and A(K(·)) is a continuous function, the set

S = {A(K(P̃_i)) | P̃_i ∈ B̄_{δ̄₀}(P∗)}

is also compact. By continuity and Lemma 9.1, for each X ∈ S there exists r(X) >
0 such that ρ(Y + I_n ⊗ I_n) < 1 for any Y ∈ B_{r(X)}(X). The compactness of S implies
the existence of r > 0 such that ρ(Y + I_n ⊗ I_n) < 1 for each Y ∈ B_r(X) and
all X ∈ S. Similarly, there exists d₁ > 0 such that [Ĝ_i]_uu is invertible for all P̃_i ∈
B̄_{δ̄₀}(P∗), if ‖ΔG_i‖_F ≤ d₁. Note that in the policy improvement step of Procedure 9.1
(the policy update step in Procedure 9.2), the improved policy K̃_{i+1} = [G̃_i]⁻¹_uu[G̃_i]_ux
(the updated policy K̂_{i+1}) is a continuous function of G̃_i (Ĝ_i), and there exists 0 <
d₂ ≤ d₁ such that A(K̂_{i+1}) ∈ B_r(A(K(P̃_i))) for all P̃_i ∈ B̄_{δ̄₀}(P∗), if ‖ΔG_i‖_F ≤
d₂. Thus, Lemma 9.1 implies that K̂_{i+1} is mean-square stabilizing. Setting d = d₂
completes the proof. □
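The Lemma 9.1-style condition invoked above is a spectral-radius test on a vectorized (Kronecker) closed-loop matrix. In the noise-free case that matrix is A_K ⊗ A_K, whose spectral radius is exactly ρ(A_K)², so the mean-square condition collapses to ordinary Schur stability; a quick check with illustrative matrices (assumptions, not the chapter's data):

```python
import numpy as np

A = np.array([[0.9, 0.2], [0.0, 0.8]])   # illustrative matrices only
B = np.array([[0.0], [1.0]])
K = np.array([[0.1, 0.3]])
Ak = A - B @ K

# Noise-free version of the vectorized stability test: the Kronecker
# closed-loop matrix is Ak (x) Ak, and rho(Ak (x) Ak) = rho(Ak)^2.
rho_Ak = max(abs(np.linalg.eigvals(Ak)))
rho_kron = max(abs(np.linalg.eigvals(np.kron(Ak, Ak))))
print(np.isclose(rho_kron, rho_Ak ** 2), rho_kron < 1)
```

Continuity of eigenvalues in the matrix entries is what makes the open-ball/compactness argument in the proof work.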

By Lemma 9.8, if ‖ΔG_i‖_F ≤ d, the sequence {P̃_i}^∞_{i=0} satisfies (9.13). For simplicity,
we denote E(G̃_i, ΔG_i) in (9.13) by E_i. The following lemma gives an upper bound
on ‖E_i‖_F in terms of ‖ΔG_i‖_F.

Lemma 9.9 For any P̃_i ∈ B_{δ₀}(P∗) and any c₂ > 0, there exists 0 < δ₁₁(δ₀, c₂) ≤ d,
independent of P̃_i, where d is defined in Lemma 9.8, such that

‖E_i‖_F ≤ c₃‖ΔG_i‖_F < c₂,

if ‖ΔG_i‖_F < δ₁₁, where c₃(δ₀) > 0.

Proof For any P̃_i ∈ B̄_{δ₀}(P∗) with ‖ΔG_i‖_F ≤ d, we have from (9.23)

‖K(P̃_i) − K̂_{i+1}‖_F ≤ ‖[G̃_i]⁻¹_uu‖_F (1 + ‖[Ĝ_i]⁻¹_uu‖_F ‖[Ĝ_i]_ux‖_F) ‖ΔG_i‖_F
≤ c₄(δ₀, d)‖ΔG_i‖_F,    (9.25)

where the last inequality comes from the continuity of the matrix inverse and the
extreme value theorem. Define

P̌_i = L⁻¹_{K̂_{i+1}}(−S − K̂_{i+1}ᵀ R K̂_{i+1}),  P̊_i = L⁻¹_{K(P̃_i)}(−S − K(P̃_i)ᵀ R K(P̃_i)).

Then, by (9.11) and (9.13),

‖E_i‖_F = ‖vec(P̌_i − P̊_i)‖₂,
vec(P̌_i) = A⁻¹(K̂_{i+1}) vec(−S − K̂_{i+1}ᵀ R K̂_{i+1}),
vec(P̊_i) = A⁻¹(K(P̃_i)) vec(−S − K(P̃_i)ᵀ R K(P̃_i)).

Define

ΔA_i = A(K(P̃_i)) − A(K̂_{i+1}),  Δb_i = vec(K(P̃_i)ᵀ R K(P̃_i) − K̂_{i+1}ᵀ R K̂_{i+1}).

Using (9.25), it is easy to check that ‖ΔA_i‖_F ≤ c₅‖ΔG_i‖_F and ‖Δb_i‖₂ ≤ c₆‖ΔG_i‖_F
for some c₅(δ₀, d) > 0, c₆(δ₀, d) > 0. Then by (9.23)

‖E_i‖_F ≤ ‖A⁻¹(K̂_{i+1})‖_F (c₆ + c₅ ‖A⁻¹(K(P̃_i))‖_F ‖S + K(P̃_i)ᵀ R K(P̃_i)‖_F) ‖ΔG_i‖_F
≤ c₃(δ₀)‖ΔG_i‖_F,

where the last inequality comes from the continuity of the matrix inverse and Lemma
9.8. Choosing 0 < δ₁₁ ≤ d such that c₃δ₁₁ < c₂ completes the proof. □

Now we are ready to prove Lemma 9.3.


Proof (Lemma 9.3) Let c₂ = (1 − σ)δ₀ in Lemma 9.9, and let δ₁ be the δ₁₁
associated with this c₂. For any i ∈ Z₊, if P̃_i ∈ B_{δ₀}(P∗), then [Ĝ_i]_uu is invertible, K̂_{i+1}
is mean-square stabilizing, and

‖P̃_{i+1} − P∗‖_F ≤ ‖E_i‖_F + ‖L⁻¹_{K(P̃_i)}(−S − K(P̃_i)ᵀ R K(P̃_i)) − P∗‖_F
≤ σ‖P̃_i − P∗‖_F + c₃‖ΔG_i‖_F    (9.26)
≤ σ‖P̃_i − P∗‖_F + c₃‖ΔG‖_∞    (9.27)
< σδ₀ + c₃δ₁ < σδ₀ + c₂ = δ₀,    (9.28)

where (9.26) and (9.28) are due to Lemmas 9.2 and 9.9. By induction, (9.26)–(9.28)
hold for all i ∈ Z₊; thus, by (9.27),

‖P̃_i − P∗‖_F ≤ σ²‖P̃_{i−2} − P∗‖_F + (σ + 1)c₃‖ΔG‖_∞
≤ ⋯ ≤ σⁱ‖P̃_0 − P∗‖_F + (1 + ⋯ + σ^{i−1})c₃‖ΔG‖_∞
< σⁱ‖P̃_0 − P∗‖_F + (c₃/(1 − σ))‖ΔG‖_∞,

which proves (i) and (ii) in Lemma 9.3. Then (9.25) implies (iii) in Lemma 9.3.
In terms of (iv) in Lemma 9.3, for any ε > 0, there exists i₁ ∈ Z₊ such that

sup{‖ΔG_i‖_F}^∞_{i=i₁} < γ⁻¹(ε/2). Take i₂ ≥ i₁. For i ≥ i₂ we have, by (ii) in Lemma
9.3,

‖P̃_i − P∗‖_F ≤ β(‖P̃_{i₂} − P∗‖_F, i − i₂) + ε/2 ≤ β(c₇, i − i₂) + ε/2,

where the second inequality is due to the boundedness of P̃_i. Since lim_{i→∞} β(c₇, i −
i₂) = 0, there is i₃ ≥ i₂ such that β(c₇, i − i₂) < ε/2 for all i ≥ i₃, which completes
the proof. □
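Lemma 9.3 is an ISS-type statement: exact policy iteration contracts with rate σ, and the disturbance leaves an error floor of order ‖ΔG‖_∞/(1 − σ). A toy noise-free sketch (hypothetical system, with a perturbation injected directly into the evaluation step to stand in for E_i) reproduces this floor:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # hypothetical system
B = np.array([[0.005], [0.1]])
S, R = np.eye(2), np.array([[1.0]])
n = 2

def lyap(K):
    # exact policy evaluation via the vectorized Lyapunov equation
    Ak = A - B @ K
    rhs = (S + K.T @ R @ K).reshape(-1, order="F")
    return np.linalg.solve(np.eye(n * n) - np.kron(Ak.T, Ak.T), rhs).reshape(n, n, order="F")

def inexact_pi(delta, iters=30):
    # policy iteration whose evaluation is corrupted by a symmetric
    # perturbation of size O(delta), a stand-in for the residual E_i
    K = np.array([[1.0, 1.5]])           # assumed stabilizing initial gain
    for _ in range(iters):
        E = delta * rng.standard_normal((n, n))
        P = lyap(K) + (E + E.T) / 2
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P

P_star = inexact_pi(0.0)                 # exact run converges to P*
err = np.linalg.norm(inexact_pi(1e-6) - P_star)
print(err < 1e-2)                        # error settles at a floor of order delta
```

Shrinking `delta` drives the floor down proportionally, which is exactly the small-disturbance part of the bound.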

Appendix 5

Notice that all the conclusions of Theorem 9.2 are implied by Lemma 9.3 if

δ₂ < min(γ⁻¹(ε), δ₁),  P̃_1 ∈ B_{δ₀}(P∗)

for Procedure 9.2. Thus, the proof of Theorem 9.2 reduces to the proof of the following lemma.

Lemma 9.10 Given a mean-square stabilizing K̂_1, there exist 0 < δ₂ < min(γ⁻¹(ε), δ₁), ī ∈ Z₊, α₂ > 0, and κ₂ > 0, such that [Ĝ_i]_uu is invertible, K̂_i is mean-square stabilizing, ‖P̃_i‖_F < α₂, and ‖K̂_i‖_F < κ₂ for i = 1, ⋯, ī, and P̃_ī ∈ B_{δ₀}(P∗), as long as ‖ΔG‖_∞ < δ₂.

The next two lemmas state that, under certain conditions on ‖ΔG_i‖_F, each element
in {K̂_i}^ī_{i=1} is mean-square stabilizing, each element in {[Ĝ_i]_uu}^ī_{i=1} is invertible, and
{P̃_i}^ī_{i=1} is bounded. For simplicity, in the following we assume S > I_n and R > I_m.
All the proofs still work for any S > 0 and R > 0, by suitable rescaling.

Lemma 9.11 If K̂_i is mean-square stabilizing, then [Ĝ_i]_uu is nonsingular and K̂_{i+1}
is mean-square stabilizing, as long as ‖ΔG_i‖_F < a_i, where

a_i = (m(√n + ‖K̂_i‖₂)² + m(√n + ‖K̂_{i+1}‖₂)²)⁻¹.

Furthermore,
‖K̂_{i+1}‖_F ≤ 2‖R⁻¹‖_F (1 + ‖Bᵀ P̃_i A‖_F).    (9.29)

Proof By definition,

‖[G̃_i]⁻¹_uu([Ĝ_i]_uu − [G̃_i]_uu)‖_F < a_i ‖[G̃_i]⁻¹_uu‖_F.

Since R > I_m, the eigenvalues λ_j([G̃_i]⁻¹_uu) ∈ (0, 1] for all 1 ≤ j ≤ m. Then, by the
fact that for any X ∈ S^m

‖X‖_F = ‖Λ_X‖_F,  Λ_X = diag{λ₁(X), ⋯, λ_m(X)},

we have

‖[G̃_i]⁻¹_uu([Ĝ_i]_uu − [G̃_i]_uu)‖_F < a_i √m < 0.5.    (9.30)

Thus, by [26, Section 5.8], [Ĝ_i]_uu is invertible.
For any x ∈ Rⁿ on the unit ball, define

X_{K̂_i} = [I; −K̂_i] x xᵀ [I, −K̂_iᵀ].

From (9.9) and (9.10) we have

xᵀ H(G̃_i, K̂_i) x = tr(G̃_i X_{K̂_i}) = 0,

and
tr(Ĝ_i X_{K̂_{i+1}}) = min_{K ∈ R^{m×n}} tr(Ĝ_i X_K).

Then

tr(G̃_i X_{K̂_{i+1}}) ≤ tr(Ĝ_i X_{K̂_{i+1}}) + ‖ΔG_i‖_F 1ᵀ|X_{K̂_{i+1}}|_abs 1
≤ tr(Ĝ_i X_{K̂_i}) + ‖ΔG_i‖_F 1ᵀ|X_{K̂_{i+1}}|_abs 1
≤ tr(G̃_i X_{K̂_i}) + ‖ΔG_i‖_F 1ᵀ(|X_{K̂_i}|_abs + |X_{K̂_{i+1}}|_abs) 1
≤ ‖ΔG_i‖_F 1ᵀ(|X_{K̂_i}|_abs + |X_{K̂_{i+1}}|_abs) 1,    (9.31)

where |X_{K̂_i}|_abs denotes the matrix obtained from X_{K̂_i} by taking the absolute value
of each entry. Thus, by (9.31) and the definition of G̃_i, we have

xᵀ L_{K̂_{i+1}}(P̃_i) x + Δ₁ ≤ 0,    (9.32)

where

Δ₁ = xᵀ(S + K̂_{i+1}ᵀ R K̂_{i+1}) x − ‖ΔG_i‖_F 1ᵀ(|X_{K̂_i}|_abs + |X_{K̂_{i+1}}|_abs) 1.

For any x on the unit ball, |1ᵀx| ≤ √n. Similarly, for any K ∈ R^{m×n}, by the
definition of the induced matrix norm, |1ᵀKx| ≤ ‖K‖₂√m. This implies

|1ᵀ [I; −K] x| ≤ |1ᵀx| + |1ᵀKx| ≤ √m(√n + ‖K‖₂),

which means 1ᵀ|X_K|_abs 1 ≤ m(√n + ‖K‖₂)². Thus

‖ΔG_i‖_F 1ᵀ(|X_{K̂_i}|_abs + |X_{K̂_{i+1}}|_abs) 1 < 1.

Then S > I_n leads to

xᵀ L_{K̂_{i+1}}(P̃_i) x < 0

for all x on the unit ball. So K̂_{i+1} is mean-square stabilizing by Lemma 9.1.
By definition,

‖K̂_{i+1}‖_F ≤ ‖[Ĝ_i]⁻¹_uu‖_F (1 + ‖Bᵀ P̃_i A‖_F)
≤ ‖[G̃_i]⁻¹_uu‖_F (1 − ‖[G̃_i]⁻¹_uu([Ĝ_i]_uu − [G̃_i]_uu)‖_F)⁻¹ (1 + ‖Bᵀ P̃_i A‖_F)
≤ 2‖R⁻¹‖_F (1 + ‖Bᵀ P̃_i A‖_F),    (9.33)

where the second inequality comes from [26, Inequality (5.8.2)], and the last inequality is due to (9.30). This completes the proof. □
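The invertibility step relies on the standard Neumann-series perturbation bound cited from [26]: if ‖M⁻¹(M̂ − M)‖ < 1, then M̂ is invertible and ‖M̂⁻¹‖ ≤ ‖M⁻¹‖/(1 − ‖M⁻¹(M̂ − M)‖). A small numerical check with arbitrary matrices (pure illustration):

```python
import numpy as np

M = np.array([[2.0, 0.3, 0.0],      # arbitrary well-conditioned matrix
              [0.1, 1.5, 0.2],
              [0.0, 0.2, 1.8]])
E = np.array([[0.05, -0.02, 0.01],  # small perturbation, Mhat = M + E
              [0.00, 0.04, -0.03],
              [0.02, 0.01, 0.05]])
Mhat = M + E

t = np.linalg.norm(np.linalg.inv(M) @ E, 2)
inv_bound = np.linalg.norm(np.linalg.inv(M), 2) / (1 - t)   # Neumann-series bound
print(t < 0.5, np.linalg.norm(np.linalg.inv(Mhat), 2) <= inv_bound + 1e-12)
```

In the proof, the role of M is played by [G̃_i]_uu and the factor (1 − t)⁻¹ < 2 under (9.30) is what produces the constant 2 in (9.33).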

Lemma 9.12 For any ī ∈ Z₊, ī > 0, if

‖ΔG_i‖_F < (1 + i²)⁻¹ a_i,  i = 1, ⋯, ī,    (9.34)

where a_i is defined in Lemma 9.11, then

‖P̃_i‖_F ≤ 6‖P̃_1‖_F,  ‖K̂_i‖_F ≤ C₀,

for i = 1, ⋯, ī, where

C₀ = max{‖K̂_1‖_F, 2‖R⁻¹‖_F (1 + 6‖Bᵀ‖_F ‖P̃_1‖_F ‖A‖_F)}.

Proof Inequality (9.32) yields

L_{K̂_{i+1}}(P̃_i) + (S + K̂_{i+1}ᵀ R K̂_{i+1}) − Δ_{2,i} I < 0,    (9.35)

where
Δ_{2,i} = ‖ΔG_i‖_F 1ᵀ(|X_{K̂_i}|_abs + |X_{K̂_{i+1}}|_abs) 1 < 1.

Inserting (9.9) into the above inequality and using Lemma 9.7, we have

P̃_{i+1} < P̃_i + Δ_{2,i} L⁻¹_{K̂_{i+1}}(−I).    (9.36)

With S > I_n, (9.35) yields

L_{K̂_{i+1}}(P̃_i) + (1 − Δ_{2,i}) I < 0.

Similar to (9.36), we have

L⁻¹_{K̂_{i+1}}(−I) < (1/(1 − Δ_{2,i})) P̃_i.    (9.37)

From (9.36) and (9.37), we obtain

P̃_{i+1} < (1 + Δ_{2,i}/(1 − Δ_{2,i})) P̃_i.

By the definition of Δ_{2,i} and condition (9.34),

Δ_{2,i}/(1 − Δ_{2,i}) ≤ 1/i²,  i = 1, ⋯, ī.

Then [34, §28, Theorem 3] yields

P̃_i ≤ 6 P̃_1,  i = 1, ⋯, ī.

An application of (9.29) completes the proof. □
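The constant 6 above comes from the convergent infinite product ∏_{i≥1}(1 + 1/i²) handled by the cited theorem in [34]; the product equals sinh(π)/π ≈ 3.676, comfortably below 6. A quick numerical check (standard library only):

```python
import math

prod = 1.0
for i in range(1, 200000):
    prod *= 1.0 + 1.0 / (i * i)   # partial product of prod_{i>=1} (1 + 1/i^2)

# The product converges (since sum 1/i^2 is finite) to sinh(pi)/pi < 6,
# which is where the uniform bound P_i <= 6 P_1 gets its constant.
print(prod < 6, abs(prod - math.sinh(math.pi) / math.pi) < 1e-3)
```

Any fixed constant above sinh(π)/π would work; 6 is simply a convenient round bound.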


Now we are ready to prove Lemma 9.10.

Proof (Lemma 9.10) Consider Procedure 9.2 confined to the first ī iterations, where
ī is a sufficiently large integer to be determined later in this proof. Suppose

‖ΔG_i‖_F < b_ī := (1/(2m(1 + ī²)))(√n + C₀)⁻².    (9.38)

Condition (9.38) implies condition (9.34). Thus K̂_i is mean-square stabilizing, [Ĝ_i]_uu
is invertible, and ‖P̃_i‖_F and ‖K̂_i‖_F are bounded. By (9.9) we have

L_{K̂_{i+1}}(P̃_{i+1} − P̃_i) = −S − K̂_{i+1}ᵀ R K̂_{i+1} − L_{K̂_{i+1}}(P̃_i).

Letting E_i = K̂_{i+1} − K(P̃_i), the above equation can be rewritten as

P̃_{i+1} = P̃_i − N(P̃_i) + L⁻¹_{K(P̃_i)}(𝓔_i),    (9.39)

where N(P̃_i) = L⁻¹_{K(P̃_i)} ∘ R(P̃_i), and

R(Y) = Π₁(Y) − Y − A₀ᵀ Y B₀ (R + Π₂(Y))⁻¹ B₀ᵀ Y A₀ + S,
𝓔_i = −E_iᵀ R(P̃_{i+1}) E_i + E_iᵀ R(P̃_{i+1})(K(P̃_{i+1}) − K(P̃_i)) + (K(P̃_{i+1}) − K(P̃_i))ᵀ R(P̃_{i+1}) E_i.

Given K̂_1, let M_ī denote the set of all possible P̃_i generated by (9.39) under condition
(9.38). By definition, {M_j}^∞_{j=1} is a nondecreasing sequence of sets, i.e., M₁ ⊂ M₂ ⊂
⋯. Define M = ∪^∞_{j=1} M_j and D = {P ∈ Sⁿ | ‖P‖_F ≤ 6‖P̃_1‖_F}. Then, by Lemma 9.12
and Theorem 9.1, M ⊂ D; M̄ is compact; K(P) is stable for any P ∈ M̄.
Now we prove that N(·) is Lipschitz continuous on M̄. Using (9.11), we have

‖N(P¹) − N(P²)‖_F = ‖A⁻¹(K(P¹)) vec(R(P¹)) − A⁻¹(K(P²)) vec(R(P²))‖₂
≤ ‖A⁻¹(K(P¹))‖₂ ‖R(P¹) − R(P²)‖_F + ‖R(P²)‖_F ‖A⁻¹(K(P¹)) − A⁻¹(K(P²))‖₂
≤ L‖P¹ − P²‖_F,    (9.40)

where the last inequality is due to the fact that the matrix inverse, A(·), K(·), and R(·)
are locally Lipschitz, thus Lipschitz on the compact set M̄ with some Lipschitz constant
L > 0.
Define {P_{k|i}}^∞_{k=0} as the sequence generated by (9.12) with P_{0|i} = P̃_i. Similar to
(9.39), we have
P_{k+1|i} = P_{k|i} − N(P_{k|i}),  k ∈ Z₊.    (9.41)

By Theorem 9.1 and the fact that M̄ is compact, there exists k₀ ∈ Z₊ such that

‖P_{k₀|i} − P∗‖_F < δ₀/2,  ∀P_{0|i} ∈ M̄.    (9.42)

Suppose
‖L⁻¹_{K(P̃_{i+j})}(𝓔_{i+j})‖_F < μ,  j = 0, ⋯, ī − i.    (9.43)

We find an upper bound on ‖P_{k|i} − P̃_{i+k}‖_F. Notice that, from (9.39) and (9.41),

P_{k|i} = P_{0|i} − Σ_{j=0}^{k−1} N(P_{j|i}),  P̃_{i+k} = P̃_i − Σ_{j=0}^{k−1} N(P̃_{i+j}) + Σ_{j=0}^{k−1} L⁻¹_{K(P̃_{i+j})}(𝓔_{i+j}).

Then (9.40) and (9.43) yield

‖P_{k|i} − P̃_{i+k}‖_F ≤ kμ + L Σ_{j=0}^{k−1} ‖P_{j|i} − P̃_{i+j}‖_F.

An application of the Gronwall inequality [2, Theorem 4.1.1] to the above inequality
implies

‖P_{k|i} − P̃_{i+k}‖_F ≤ kμ + Lμ Σ_{j=0}^{k−1} j(1 + L)^{k−j−1}.    (9.44)

By (9.11), the error term in (9.39) satisfies

‖L⁻¹_{K(P̃_i)}(𝓔_i)‖_F = ‖A⁻¹(K(P̃_i)) vec(𝓔_i)‖₂ ≤ C₁‖𝓔_i‖_F,    (9.45)

where C₁ is a constant and the inequality is due to the continuity of the matrix inverse.
Let ī > k₀, and set k = k₀, i = ī − k₀ in (9.44). Then, by condition (9.38), Lemma
9.12, (9.43), (9.44), and (9.45), there exists i₀ ∈ Z₊, i₀ > k₀, such that ‖P_{k₀|ī−k₀} −
P̃_ī‖_F < δ₀/2 for all ī ≥ i₀. Setting i = ī − k₀ in (9.42), the triangle inequality
yields P̃_ī ∈ B_{δ₀}(P∗) for ī ≥ i₀. Then, in (9.38), choosing ī ≥ i₀ such that δ₂ = b_ī <
min(γ⁻¹(ε), δ₁) completes the proof. □
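The discrete Gronwall step used above can be sanity-checked numerically: the sequence defined by equality in the hypothesis reproduces the closed-form bound of (9.44) exactly, so the bound is attained by the extremal sequence. A small standard-library sketch with arbitrary illustrative constants:

```python
# Equality case of the discrete Gronwall inequality: if
#   x_k = k*mu + L * sum_{j<k} x_j,  x_0 = 0,
# then x_k coincides with the closed-form bound
#   k*mu + L*mu * sum_{j<k} j*(1+L)^(k-j-1).
mu, L, N = 0.3, 0.7, 12              # arbitrary illustrative constants

x = [0.0]
for k in range(1, N + 1):
    x.append(k * mu + L * sum(x))

bound = [k * mu + L * mu * sum(j * (1 + L) ** (k - j - 1) for j in range(k))
         for k in range(N + 1)]
print(all(abs(a - b) < 1e-9 * max(1.0, b) for a, b in zip(x, bound)))
```

Any sequence satisfying the hypothesis with inequality is dominated by this extremal one, which is the content of the cited theorem in [2].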

Appendix 6

For a given K̂_1, let K denote the set of control gains (including K̂_1) generated by
Procedure 9.2 with all possible {ΔG_i}^∞_{i=1} satisfying ‖ΔG‖_∞ < δ₂, where δ₂ is the
one in Theorem 9.2. The following result is derived first.

Lemma 9.13 Under the conditions in Theorem 9.3, there exist L̄₀ > 0 and N₀ > 0
such that for any L̄ ≥ L̄₀ and N ≥ N₀, K̂_i ∈ K implies ‖ΔG_i‖_F < δ₂, almost surely.

Proof By definition, in the context of Algorithm 9.1,

‖ΔG_i‖_F ≤ ‖Q̂_i − Q(P̂_{i,L̄})‖_F + ‖Q(P̂_{i,L̄}) − Q(P̃_i)‖_F + ‖P̃_i − P̂_{i,L̄}‖_F,

where P̃_i is the unique solution of (9.9) with K = K̂_i. Thus, the task is to prove that
each term on the right-hand side of the above inequality is less than δ₂/3. To this end,
we first study ‖P̃_i − P̂_{i,L̄}‖_F. Define p̂_{i,j} = vec(P̂_{i,j}); by Lemma 9.5, Lines 11 and
12 in Algorithm 9.1 can be rewritten as

p̂_{i,j+1} = T¹(Φ̂†_{N,M}, Φ̂²_{N,M}, K̂_i) p̂_{i,j} + T²(Φ̂†_{N,M}, r̂_{N,M}, K̂_i),    (9.46)

where p̂_{i,0} ∈ R^{n²} and

T¹(Φ̂†_{N,M}, Φ̂²_{N,M}, K̂_i) = ([I_n, −K̂_iᵀ] ⊗ [I_n, −K̂_iᵀ]) D_{(m+n)(m+n+1)/2} Φ̂†_{N,M} Φ̂²_{N,M} D_n†,
T²(Φ̂†_{N,M}, r̂_{N,M}, K̂_i) = ([I_n, −K̂_iᵀ] ⊗ [I_n, −K̂_iᵀ]) D_{(m+n)(m+n+1)/2} Φ̂†_{N,M} r̂_{N,M}.

Similar derivations applied to (9.20) with K = K̂_i yield

p̄_{i,j+1} = T¹(Φ†_M, Φ²_M, K̂_i) p̄_{i,j} + T²(Φ†_M, r_M, K̂_i),  p̄_{i,0} ∈ R^{n²}.    (9.47)

Since (9.20) is identical to (9.14), (9.47) is identical to (9.15) with K and vec(P_{K,j})
replaced by K̂_i and p̄_{i,j}, respectively, and

T¹(Φ†_M, Φ²_M, K̂_i) = A(K̂_i) + I_n ⊗ I_n,  T²(Φ†_M, r_M, K̂_i) = vec(S + K̂_iᵀ R K̂_i).    (9.48)
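Equations (9.46)–(9.48) are affine fixed-point iterations on vec(P). A minimal sketch of this mechanism (illustrative matrices only, not the chapter's data): in the noise-free case one convenient realization of T¹ = A(K) + I_n ⊗ I_n is A_Kᵀ ⊗ A_Kᵀ, and the iterates converge to the closed-form fixed point (I − T¹)⁻¹T² whenever ρ(T¹) < 1:

```python
import numpy as np

A = np.array([[0.9, 0.2], [0.0, 0.8]])   # illustrative matrices only
B = np.array([[0.0], [1.0]])
K = np.array([[0.1, 0.3]])
S, R = np.eye(2), np.array([[1.0]])
n = 2

Ak = A - B @ K
T1 = np.kron(Ak.T, Ak.T)                          # spectral radius < 1 here
T2 = (S + K.T @ R @ K).reshape(-1, order="F")

p = np.zeros(n * n)
for _ in range(3000):
    p = T1 @ p + T2                                # iterate p_{j+1} = T1 p_j + T2

p_fix = np.linalg.solve(np.eye(n * n) - T1, T2)    # closed-form fixed point
print(np.allclose(p, p_fix))
```

The proof below exploits exactly this: the data-driven iteration (9.46) and the exact iteration (9.47) have the same affine structure, so their fixed points stay close when their coefficients do.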

Since A(K̂_i), K̂_i ∈ K, is mean-square stabilizing, by Lemma 9.4

lim_{j→∞} P̄_{i,j} = P̃_i,    (9.49)

where P̄_{i,j} = vec⁻¹(p̄_{i,j}). By definition and Theorem 9.2, K̄ is bounded, thus compact. Let V be the set of the unique solutions of (9.5) with K ∈ K. Then, by
Theorem 9.2, V is bounded. So A(K) is mean-square stable for all K ∈ K̄; otherwise, by (9.11) and Lemma 9.1, the boundedness of V would be contradicted. Define
K₁ = {A(K) + I_n ⊗ I_n | K ∈ K̄}. Then ρ(X) < 1 for any X ∈ K₁, and by continuity K₁ is a compact set. This implies the existence of δ₃ > 0 such that ρ(X) < 1
for any X ∈ K̄₂, where

K₂ = {X | X ∈ B_{δ₃}(Y), Y ∈ K₁}.

Define

ΔT¹_{N,M,i} = T¹(Φ†_M, Φ²_M, K̂_i) − T¹(Φ̂†_{N,M}, Φ̂²_{N,M}, K̂_i),
ΔT²_{N,M,i} = T²(Φ†_M, r_M, K̂_i) − T²(Φ̂†_{N,M}, r̂_{N,M}, K̂_i).

The boundedness of K, (9.22), and (9.48) imply the existence of N₁ > 0 such that
for any N ≥ N₁ and any K̂_i ∈ K, almost surely

T¹(Φ̂†_{N,M}, Φ̂²_{N,M}, K̂_i) ∈ K̄₂,  ‖T²(Φ̂†_{N,M}, r̂_{N,M}, K̂_i)‖₂ < C₉,    (9.50)

where C₉ > 0 is a constant. Then

ρ(T¹(Φ̂†_{N,M}, Φ̂²_{N,M}, K̂_i)) < 1

and (9.46) admits a unique stable equilibrium, that is,

lim_{j→∞} P̂_{i,j} = P̊_i    (9.51)

for some P̊_i ∈ Sⁿ. From (9.46), (9.47), (9.49), and (9.51), we have

vec(P̃_i) = (I_{n²} − T¹(Φ†_M, Φ²_M, K̂_i))⁻¹ T²(Φ†_M, r_M, K̂_i),
vec(P̊_i) = (I_{n²} − T¹(Φ̂†_{N,M}, Φ̂²_{N,M}, K̂_i))⁻¹ T²(Φ̂†_{N,M}, r̂_{N,M}, K̂_i).

Thus, by (9.23), for any N ≥ N₁ and any K̂_i ∈ K, almost surely

‖P̊_i − P̃_i‖_F ≤ ‖(I_{n²} − T¹(Φ†_M, Φ²_M, K̂_i))⁻¹‖_F ‖ΔT²_{N,M,i}‖_F +
‖(I_{n²} − T¹(Φ̂†_{N,M}, Φ̂²_{N,M}, K̂_i))⁻¹‖_F ‖T²(Φ̂†_{N,M}, r̂_{N,M}, K̂_i)‖₂ ‖ΔT¹_{N,M,i}‖_F
≤ C₁₀‖ΔT²_{N,M,i}‖_F + C₁₁‖ΔT¹_{N,M,i}‖_F,

where C₁₀ and C₁₁ are some positive constants, and the last inequality is due to
(9.48), (9.50), and the fact that K₁ and K̄₂ are compact sets. Then, for any ε₁ > 0, the
boundedness of K and (9.22) imply the existence of N₂ ≥ N₁ such that for any
N ≥ N₂, almost surely
‖P̊_i − P̃_i‖_F < ε₁/2,    (9.52)

as long as K̂_i ∈ K. By Lemma 9.6 and (9.52), for any N ≥ N₂ and any K̂_i ∈ K,

‖P̊_i − P̂_{i,j}‖_F ≤ a₀ b₀ʲ ‖P̊_i‖_F ≤ a₁ b₀ʲ,

for some a₀ > 0, 1 > b₀ > 0, and a₁ > 0. Therefore there exists L̄₁ > 0 such that
for any L̄ ≥ L̄₁ and any N ≥ N₂, almost surely

‖P̂_{i,L̄} − P̊_i‖_F < ε₁/2,    (9.53)

as long as K̂_i ∈ K. With (9.52) and (9.53), we obtain

‖P̂_{i,L̄} − P̃_i‖_F < ε₁,    (9.54)

almost surely for any L̄ ≥ L̄₁ and any N ≥ N₂, as long as K̂_i ∈ K. Since ε₁ is arbitrary,
we can choose ε₁ such that almost surely

‖P̂_{i,L̄} − P̃_i‖_F < δ₂/3

for any L̄ ≥ L̄₁ and any N ≥ N₂, as long as K̂_i ∈ K.
Secondly, by definition and (9.54), there exist L̄₂ ≥ L̄₁ and N₃ ≥ N₂ such that

‖Q(P̂_{i,L̄}) − Q(P̃_i)‖_F < δ₂/3

for any L̄ ≥ L̄₂ and any N ≥ N₃, as long as K̂_i ∈ K.
Thirdly, since V is bounded, P̂_{i,L̄} is also almost surely bounded by (9.54). Thus,
from Line 14 in Algorithm 9.1 and (9.22), there exists N₄ ≥ N₃ such that

‖Q̂_i − Q(P̂_{i,L̄})‖_F < δ₂/3

for any N ≥ N₄ and any L̄ ≥ L̄₂, as long as K̂_i ∈ K.
Setting N₀ = N₄ and L̄₀ = L̄₂ yields ‖ΔG_i‖_F < δ₂. □

Now we are ready to prove the convergence of Algorithm 9.1.

Proof (Theorem 9.3) Since K̂_1 ∈ K, Lemma 9.13 implies ‖ΔG_1‖_F < δ₂ almost
surely. By definition, K̂_2 ∈ K. Thus ‖ΔG_i‖_F < δ₂, i = 1, 2, ⋯, almost surely by
mathematical induction. Then Theorem 9.2 completes the proof. □

References

1. Abbasi-Yadkori, Y., Lazic, N., Szepesvari, C.: Model-free linear quadratic control via reduc-
tion to expert prediction. In: International Conference on Artificial Intelligence and Statistics
(AISTATS) (2019)
2. Agarwal, R.P.: Difference Equations and Inequalities: Theory, Methods, and Applications, 2nd
edn. Marcel Dekker Inc, New York (2000)
3. Athans, M., Ku, R., Gershwin, S.: The uncertainty threshold principle: some fundamental
limitations of optimal decision making under dynamic uncertainty. IEEE Trans. Autom. Control
22(3), 491–495 (1977)
4. Beghi, A., D’Alessandro, D.: Discrete-time optimal control with control-dependent noise and
generalized Riccati difference equations. Automatica 34(8), 1031–1034 (1998)
5. Bertsekas, D.P.: Approximate policy iteration: A survey and some new methods. J. Control
Theory Appl. 9(3), 310–335 (2011)
6. Bertsekas, D.P.: Reinforcement Learning and Optimal Control. Athena Scientific, Belmont,
Massachusetts (2019)
7. Bian, T., Jiang, Z.P.: Continuous-time robust dynamic programming. SIAM J. Control Optim.
57(6), 4150–4174 (2019)
8. Bian, T., Wolpert, D.M., Jiang, Z.P.: Model-free robust optimal feedback mechanisms of bio-
logical motor control. Neural Comput. 32(3), 562–595 (2020)
9. Bitmead, R.R., Gevers, M., Wertz, V.: Adaptive Optimal Control: The Thinking Man’s GPC.
Prentice-Hall, Englewood Cliffs, New Jersey (1990)
10. Breakspear, M.: Dynamic models of large-scale brain activity. Nat. Neurosci. 20(3), 340–352
(2017)
11. Bryson, A.E., Ho, Y.C.: Applied Optimal Control: Optimization, Estimation and Control. Taylor
& Francis (1975)
12. Buşoniu, L., de Bruin, T., Tolić, D., Kober, J., Palunko, I.: Reinforcement learning for control:
Performance, stability, and deep approximators. Annu. Rev. Control 46, 8–28 (2018)
13. Coppens, P., Patrinos, P.: Sample complexity of data-driven stochastic LQR with multiplicative
uncertainty. In: The 59th IEEE Conference on Decision and Control (CDC), pp. 6210–6215
(2020)
14. Coppens, P., Schuurmans, M., Patrinos, P.: Data-driven distributionally robust LQR with mul-
tiplicative noise. In: Learning for Dynamics and Control (L4DC), pp. 521–530. PMLR (2020)
15. De Koning, W.L.: Infinite horizon optimal control of linear discrete time systems with stochastic
parameters. Automatica 18(4), 443–453 (1982)
16. De Koning, W.L.: Compensatability and optimal compensation of systems with white param-
eters. IEEE Trans. Autom. Control 37(5), 579–588 (1992)
17. Drenick, R., Shaw, L.: Optimal control of linear plants with random parameters. IEEE Trans.
Autom. Control 9(3), 236–244 (1964)
18. Du, K., Meng, Q., Zhang, F.: A Q-learning algorithm for discrete-time linear-quadratic control
with random parameters of unknown distribution: convergence and stabilization. arXiv preprint
arXiv:2011.04970 (2020)
19. Duncan, T.E., Guo, L., Pasik-Duncan, B.: Adaptive continuous-time linear quadratic gaussian
control. IEEE Trans. Autom. Control 44(9), 1653–1662 (1999)

20. Gravell, B., Esfahani, P.M., Summers, T.: Learning robust controllers for linear quadratic
systems with multiplicative noise via policy gradient. IEEE Trans. Autom. Control (2019)
21. Gravell, B., Esfahani, P.M., Summers, T.: Robust control design for linear systems via multi-
plicative noise. arXiv preprint arXiv:2004.08019 (2020)
22. Gravell, B., Ganapathy, K., Summers, T.: Policy iteration for linear quadratic games with
stochastic parameters. IEEE Control Syst. Lett. 5(1), 307–312 (2020)
23. Guo, Y., Summers, T.H.: A performance and stability analysis of low-inertia power grids with
stochastic system inertia. In: American Control Conference (ACC), pp. 1965–1970 (2019)
24. Hespanha, J.P., Naghshtabrizi, P., Xu, Y.: A survey of recent results in networked control
systems. Proceedings of the IEEE 95(1), 138–162 (2007)
25. Hewer, G.: An iterative technique for the computation of the steady state gains for the discrete
optimal regulator. IEEE Trans. Autom. Control (1971)
26. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, New York (2012)
27. Jiang, Y., Jiang, Z.P.: Adaptive dynamic programming as a theory of sensorimotor control.
Biolog. Cybern. 108(4), 459–473 (2014)
28. Jiang, Y., Jiang, Z.P.: Robust Adaptive Dynamic Programming. Wiley, Hoboken, New Jersey
(2017)
29. Jiang, Z.P., Bian, T., Gao, W.: Learning-based control: A tutorial and some recent results.
Found. Trends Syst. Control. 8(3), 176–284 (2020)
30. Jiang, Z.P., Lin, Y., Wang, Y.: Nonlinear small-gain theorems for discrete-time feedback systems
and applications. Automatica 40(12), 2129–2136 (2004)
31. Kamalapurkar, R., Walters, P., Rosenfeld, J., Dixon, W.: Reinforcement learning for optimal
feedback control: A Lyapunov-based approach. Springer (2018)
32. Kantorovich, L.V., Akilov, G.P.: Functional Analysis in Normed Spaces. Macmillan, New York
(1964)
33. Kiumarsi, B., Vamvoudakis, K.G., Modares, H., Lewis, F.L.: Optimal and autonomous control
using reinforcement learning: A survey. IEEE Trans. Neural Netw. Learn. Syst. 29(6), 2042–
2062 (2018)
34. Knopp, K.: Theory and Application of Infinite Series, 2nd edn. Dover Publications, New York
(1990)
35. Lai, J., Xiong, J., Shu, Z.: Model-free optimal control of discrete-time systems with additive
and multiplicative noises. arXiv preprint arXiv:2008.08734 (2020)
36. Levine, S., Koltun, V.: Continuous inverse optimal control with locally optimal examples. In:
International Conference on Machine Learning (ICML) (2012)
37. Levine, S., Kumar, A., Tucker, G., Fu, J.: Offline reinforcement learning: Tutorial, review, and
perspectives on open problems. arXiv preprint arXiv:2005.01643 (2020)
38. Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.:
Continuous control with deep reinforcement learning. In: International Conference on Learning
Representations (ICLR) (2016)
39. Ljung, L.: System Identification: Theory for the User, 2nd edn. Prentice Hall PTR, Upper Saddle
River (1999)
40. Magnus, J.R., Neudecker, H.: Matrix Differential Calculus with Applications in Statistics and
Econometrics. Wiley, New York (2007)
41. Monfort, M., Liu, A., Ziebart, B.D.: Intent prediction and trajectory forecasting via predictive
inverse linear-quadratic regulation. In: AAAI Conference on Artificial Intelligence (AAAI)
(2015)
42. Morozan, T.: Stabilization of some stochastic discrete-time control systems. Stoch. Anal. Appl.
1(1), 89–116 (1983)
43. Pang, B., Bian, T., Jiang, Z.-P.: Robust policy iteration for continuous-time linear quadratic
regulation. IEEE Trans. Autom. Control (2020)
44. Pang, B., Jiang, Z.-P.: Robust reinforcement learning: A case study in linear quadratic regula-
tion. In: AAAI Conference on Artificial Intelligence (AAAI) (2020)
45. Powell, W.B.: From reinforcement learning to optimal control: A unified framework for sequen-
tial decisions. arXiv preprint arXiv:1912.03513 (2019)

46. Praly, L., Lin, S.-F., Kumar, P.R.: A robust adaptive minimum variance controller. SIAM J.
Control Optim. 27(2), 235–266 (1989)
47. Rami, M.A., Chen, X., Zhou, X.Y.: Discrete-time indefinite LQ control with state and control
dependent noises. J. Glob. Optim. 23(3), 245–265 (2002)
48. Åström, K.J., Wittenmark, B.: Adaptive Control, 2nd edn. Addison-Wesley, Reading, Mas-
sachusetts (1995)
49. Sontag, E.D.: Input to state stability: basic concepts and results. In: Nonlinear and Optimal
Control Theory, vol. 1932, pp. 163–220. Springer, Berlin (2008)
50. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn. MIT Press,
Cambridge, Massachusetts (2018)
51. Sutton, R.S., Barto, A.G., Williams, R.J.: Reinforcement learning is direct adaptive optimal
control. IEEE Control Syst. Mag. 12(2), 19–22 (1992)
52. Tiedemann, A., De Koning, W.: The equivalent discrete-time optimal control problem for
continuous-time systems with stochastic parameters. Int. J. Control 40(3), 449–466 (1984)
53. Todorov, E.: Stochastic optimal control and estimation methods adapted to the noise charac-
teristics of the sensorimotor system. Neural Comput. 17(5), 1084–1108 (2005)
54. Tu, S., Recht, B.: The gap between model-based and model-free methods on the linear quadratic
regulator: An asymptotic viewpoint. In: Annual Conference on Learning Theory (COLT) (2019)
55. Xing, Y., Gravell, B., He, X., Johansson, K.H., Summers, T.: Linear system identification under
multiplicative noise from multiple trajectory data. In: American Control Conference (ACC),
pp. 5157–5261 (2020)
56. Huang, Y., Zhang, W., Zhang, H.: Infinite horizon LQ optimal control for discrete-time stochas-
tic systems. In: The 6th World Congress on Intelligent Control and Automation (WCICA), vol.
1, pp. 252–256 (2006)
Index

A
Active node, 241
Active noise control, 136
Actuator, 67, 136, 189–192, 196, 197, 199, 204, 207, 210, 214
Adaptive backstepping, 190, 202, 217–219, 227, 228, 232, 233, 242
Adaptive control, 137, 138, 141, 149, 155, 156, 164, 165, 169–171, 178–185
Algebraic Riccati equation (ARE), 242, 251
Almost feedback linearization, 1, 7, 15
Asymptotic stability, 15, 20, 28, 31, 39
Asymptotic tracking, 208, 210, 218
Averaged dynamics, 83, 86

B
Backstepping, 84, 109, 110, 189, 190, 214, 230, 234, 238, 239
Backstepping transformation, 189, 191, 194, 195, 197, 205, 214
Behavior, 2, 5, 15, 27, 33, 34, 36, 39, 52–55, 58, 64, 66, 83, 84, 86–88, 91, 93, 97, 100, 101, 104–106, 179
Broadband noise, 135, 140, 156, 170, 174

C
Causal system, 55
Certainty-equivalence principle, 200, 204, 221
Composite signals, 50
Congelation of variables, 217–220, 223–227, 230, 232, 237–239, 245
Continuous-time system, 27, 140, 142, 143, 150, 165, 178, 180, 182
Controllability bias, 53, 54, 63, 73
Controllability constant, 53, 59, 63, 64, 73
Controllable, 53, 57, 59, 62, 63, 66, 68, 71, 72, 75, 76, 78, 80, 138
Covariance resetting, 149, 164, 169
Cross section, 55, 57, 61

D
Decentralized design, 84, 88, 91, 104
Degenerate system, 159
Delay, 68, 115, 189–198, 204, 207–210, 212, 214, 215
Delay kernel, 191, 210, 214
Differential inclusion, 27–40
Directed gap, 61
Direct sum, 50–54
Discrete delay, 190
Discrete-time system, 27, 58, 140, 142, 143, 156, 178, 179, 181, 184, 185
Distributed delay, 190, 214
Distributed optimization, 83, 89, 90
Distributed state estimation, 83, 94, 99
Disturbance rejection, 137, 228
Domain of a system, 55
Dynamical system, 14, 28, 135, 137, 250, 251, 255, 256
Dynamic extension, 1, 7, 8, 12, 23
Dynamic feedback, 2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer
Nature Switzerland AG 2022
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control,
Lecture Notes in Control and Information Sciences 488,
https://doi.org/10.1007/978-3-030-74628-5

E
Estimator, 95, 137, 147, 154, 162, 163, 168, 176, 177, 193, 207
Exponential stability, 29–31, 33, 120, 125, 126, 235, 255
Extended space, 49, 50

F
Feedback linearization, 2, 3, 7
Finely controllable, 53, 59, 63, 66, 68, 71, 73, 75, 77, 78, 80
Finely lower hemicontinuous, 55
Finely uniformly controllable, 53, 54, 59, 63, 64, 73, 74
Fine topology, 48, 51, 53, 56
Finite-dimensional, 109, 110, 113, 126, 141, 190–193, 198, 210, 214
Finite-gain stability, 46, 57, 58, 65
FIR filter, 144, 157, 165
Flhc. see finely lower hemicontinuous
Funnel coupling, edge-wise, 101, 102, 104, 105
Funnel coupling, node-wise, 101, 104

G
Gap, 61
Gap topology, 60–63, 65–67
Generalized Algebraic Riccati equation (GARE), 253, 264
Generating polynomial, 172
Global stabilization, 193–195

H
High-gain observer, 109–111, 115, 116, 120, 122, 123, 126, 131, 132
Homotopy, 43, 45, 46, 62–64, 66, 67, 69
Hybrid system, 27, 28, 39

I
Ill-conditioned system, 161
Immersion and invariance, 219, 224
Infinite-dimensional, 109–111, 120, 123, 126, 191, 192, 210, 214
Inner-outer factorization, 161, 167, 182
Input-output system, 20, 54, 136
Input-to-state stability, 20, 24, 249–251, 255
Integral quadratic constraint, 43, 45, 46, 59, 63, 68, 69
Interconnection, 43–46, 64–68, 85, 97, 224, 228, 241
Internal model principle, 95, 137
Inverse dynamics, 217, 233–239, 241, 245
Inverse of a system, 55
Inverse system, 6
Invertibility, 2–6
IO system. see input-output system
IQC. see integral quadratic constraint

K
Kreisselmeier filter (K-filter), 199, 200, 208, 214, 232, 236, 242, 245

L
-flhc. see finely lower hemicontinuous
-lhc. see lower hemicontinuous
(, μ)-limited system, 62
-stable system, 58
-wlhc. see weakly lower hemicontinuous
Least-squares algorithm, 148, 149, 155, 163, 164, 169, 177, 249
Lhc. see lower hemicontinuous
Linear Matrix Inequality (LMI), 174
Linear Quadratic Regulator (LQR), 249–255, 257, 262
Linear system, 2, 35, 37–39, 44, 52, 58, 66, 67, 94, 95, 138, 189–193, 198, 208, 210, 214, 215, 250–252
Liénard system, 93, 100, 101
Look-ahead constant, 56, 58, 63
Look-ahead gain, 57–59
Look-ahead map, 56–59, 61–63
Lower hemicontinuous, 55
Lower triangular system, 217, 219, 221
Lyapunov function, 27–29, 33–35, 38, 39, 128, 221–223, 226, 229, 235, 236, 238, 239
Lyapunov functional, 111, 117, 119, 122, 125, 126, 189–191

M
Mean-square stabilizable, 253
Mean-square stabilizing, 253–258, 260–267, 269, 270, 273
Mean-square stable, 252, 253, 273
Minimally stable, 57–59, 61–64, 74, 77
Minimum phase, 6, 12, 20, 22, 170, 178, 186, 199
Model Reference Adaptive Control (MRAC), 170–172, 178, 185

Multi-agent system, 83–87, 91, 96, 97, 99, 101
Multi-Input Multi-Output (MIMO) system, 1, 3, 7, 135, 141, 156, 158, 161, 181
Multiplicative noise, 249–252, 261, 262
Multiplier, 4–6, 16, 43–45

N
Noise amplification, 143, 145, 146, 151, 159, 160, 166
Nonlinear control, 27, 262
Nonlinear damping, 223–225, 230
Nonlinear observer, 109
Nonlinear system, 1–3, 6, 16, 24, 39, 44, 69, 109, 120, 217, 221, 227, 242, 245, 262
Norm gain, 59, 63
Normal form, 3, 4, 6–8, 16, 19, 21
Normalized estimation error, 148, 154, 163, 168, 177
Normalizing signal, 148, 155, 163, 169, 177
Normed subsystem, 52

O
Observable, 6, 23
Observer, 1, 7, 14, 15, 23, 73, 95, 96, 99, 100, 109–120, 122–127, 130, 137, 189, 191, 193, 194, 196–198, 208, 218
Observer canonical form, 198, 214
ODE–PDE cascade, 192–197, 207, 210, 214
Off-policy, 249, 251, 262
Operator, 46, 49, 50, 55, 58, 59, 63, 66–69, 79, 80, 110, 111, 113–115, 117, 121–123, 125, 148, 155, 164, 169, 177, 189, 191, 207, 213, 220, 256, 263
Optimal control, 84, 250, 251, 255
Output feedback, 2, 6, 109, 126, 217, 231
Output regulation, 2, 3, 16, 18, 19, 21, 84, 217, 218, 222, 241, 245
Output regulator, 1
Over-parameterization, 135, 138, 163, 170, 185

P
Parameter drift, 138, 163
Parametric model, 147, 148, 154, 162, 163, 168, 176, 177
Partial differential equation, 109–111, 113, 115, 120, 122, 125, 190, 192, 199, 204, 208, 210, 213–215
Persistence of Excitation (PE), 138, 143, 163, 218
Plant parameter, 139, 189–192, 198, 204, 207, 213, 214
Plant state, 30, 31, 34, 191–194, 196, 198, 210, 214
Plug-and-play operation, 83, 84, 88, 91
Policy, 250, 251, 254, 255, 257, 262, 265
Policy evaluation, 250, 251, 254, 255, 257, 258, 260, 261
Policy improvement, 250, 254, 265
Policy iteration, 249–252, 254, 255, 257, 260–262
Pre-filtering modification, 160
Predictor, 189–191, 193–197, 214
Projection, 52, 54, 55, 148, 149, 155, 163, 164, 169, 177, 207, 218, 221, 222

Q
Quadratically continuous, 68, 69

R
Rate of adaptation, 170, 181
Regular system, 67
Regulator equation, 16, 21
Reinforcement Learning (RL), 249–252, 255, 262
Reset control system, 27–30, 33, 34, 36–39
Robust adaptive control, 139, 140, 143, 149, 156, 160, 165, 171, 218
Robust reinforcement learning, 250
Robustness, 2, 7, 60, 62, 100, 135, 138, 140, 143, 145, 146, 150–152, 156, 157, 160, 161, 163, 165, 170, 175, 190, 249–252, 262

S
Semiglobal stabilization, 1
Seminorm topology, 47, 48, 56, 58, 60
Sensor, 67, 136, 190, 250
Shaping of the singular values, 160, 165
Signals, 21, 24, 44, 45, 47–50, 52–54, 59–62, 64–68, 71–73, 137, 140–142, 144, 147–149, 151, 154, 155, 158, 162–164, 166, 168, 169, 173, 176–178, 182, 199, 201, 208, 210, 218, 220, 224, 227, 230, 234, 241, 242, 245
Signal space, 43, 46–56, 60, 63, 66, 68–71
Single-Input Single-Output (SISO) system, 2, 7, 18, 135, 141, 143, 170, 171, 185, 198

Small-gain, 43, 44, 82, 217, 219, 245, 262, 276
Small-signal gain, 59, 62
Small-signal subspace, 46, 48–52, 68
Small-signal subsystem, 52, 53
Stability, 19, 27–29, 43–46, 54, 58–62, 64, 65, 67, 69, 77, 83, 86, 87, 93, 111, 116, 117, 119, 122, 125, 126, 135, 137, 138, 141, 143–145, 150–152, 156, 157, 159, 160, 163, 165, 166, 170, 173–175, 183, 185, 189–191, 207, 214, 218, 219, 225, 227, 262
Stabilization, 2, 3, 6, 7, 16, 18, 20, 24, 109, 120, 189, 193, 194, 196, 197, 214, 215, 219
Stable system, 13, 24, 43, 58, 61
State feedback, 2, 7, 8, 219, 245
Stochastic system, 249
Strongly minimum-phase, 1, 6–8, 17, 20
Sylvester-type matrix equation, 146, 152, 160, 167
System, 52

T
Temporal family of seminorms, 47, 51
Time axis, 47
Time-varying system, 218–220, 235
Transmission zero, 158, 159, 161

U
Uniformly controllable, 53, 63, 74
Uniformly finely lower hemicontinuous, 56
Uniformly lower hemicontinuous, 56
Uniformly weakly lower hemicontinuous, 56
Univalent system, 55, 57
Unknown minimum-phase plant, 170
Unknown periodic disturbance, 136, 137, 146, 167, 170, 185
Unmodeled dynamics, 135, 138, 139, 143, 148–150, 152, 154, 156, 160, 162, 165, 167, 168, 170, 171, 174, 175, 178, 180, 182, 185
Update law, 195–197, 212, 219, 221–226, 229, 230, 240

W
Weakly lower hemicontinuous, 55
Well-posed, 43, 45, 66, 67, 115
Wlhc. see weakly lower hemicontinuous

Z
Zero dynamics, 3, 5, 6, 235, 238, 239
