Ultimately Bounded Output Feedback Control for Networked Nonlinear Systems With Unreliable Communication Channel: A Buffer-Aided Strategy
IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 11, NO. 7, JULY 2024
Authorized licensed use limited to: Anhui University of Technology. Downloaded on March 19,2025 at 03:14:19 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: ULTIMATELY BOUNDED OUTPUT FEEDBACK CONTROL FOR NETWORKED NONLINEAR SYSTEMS 1567
In recent years, adaptive dynamic programming (ADP) has gradually gained much research attention. Leveraging actor/critic neural networks (NNs), known for their superior approximation capabilities, ADP has been extensively employed to tackle optimal control problems with both known and unknown nonlinear dynamics [6]−[9]. ADP-based algorithms have attracted considerable interest, and numerous notable results can be found in [10]−[12]. Although much of the research on ADP-based control has centered on state feedback, practical engineering often limits access to the full state information of systems. This limitation, caused either by budget constraints or complex external environments, has steered engineers towards ADP-based output-feedback control strategies [13].
Networked control systems (NCSs) are dynamical systems in which distinct system components communicate through a network characterized by limited bandwidth [14]−[17]. Over the past two decades, rapid advancements in network-based communication technology have significantly expanded the potential of NCSs [18]−[20]. Enhanced data transmission rates, improved error correction methods, and the rise of machine learning techniques for network optimization have all combined to elevate the capabilities of these systems. As a result, NCSs have permeated a myriad of practical engineering fields including spacecraft, smart grids, mobile robots, and unmanned underwater vehicles [21]−[24]. Each of these applications underscores the versatility and transformative potential of NCSs in modern engineering landscapes.
In the deployment of NCSs, the reliability of signal transmissions is significantly impacted by pervasive communication constraints. Such constraints are often manifested as limited bandwidth or finite bit rates [25]−[29]. Limited communication capacity can cause issues such as congestion or packet dropping. As a result, the reliability of signal transmissions can be substantially compromised, leading to degraded or even destroyed estimation/control performance [30]. Due to these challenges, control problems associated with NCSs operating over unreliable communication channels have attracted attention from both the control and signal processing communities, and numerous research outcomes have been documented [31], [32].
In response to the challenges posed by unreliable communication channels, the buffer-aided strategy has gained widespread acceptance in practical applications. This strategy aims to enhance the transmission of measurement signals during specific transmission instants. Initially, newly generated signals are stored in the buffer and, following this, all the stored signals (i.e., both current and historical signals) are transmitted to the receiver (e.g., an observer) simultaneously at the designated transmission instant (often, the present moment). Once the transmission is completed, the buffer is cleared to create space for measurement signals generated in the ensuing instants [33]. Leveraging this method, a greater number of measurement signals can be harnessed by the observer for the estimation procedure. The buffer-aided strategy not only ensures a more judicious use of resources but also facilitates the attainment of the desired estimation outcomes [9]. Unfortunately, even with its profound engineering ramifications and broad application prospects, the control problems of NCSs using a buffer-aided strategy over unreliable communication channels have yet to receive the research attention they deserve.
Motivated by the aforementioned considerations, our objective is to investigate the ultimately bounded output-feedback control problem, which holds both theoretical and practical significance, for nonlinear NCSs that employ a buffer-aided strategy over unreliable communication channels. The output-feedback control problem under investigation presents three anticipated yet foundational challenges: 1) How to quantify the effects of transmission unreliability and the buffer-aided strategy? 2) How to design the tuning laws for the neural-network weights (NNWs) of networked nonlinear systems that use a buffer-aided strategy over unreliable communication channels? and 3) How to analyze the bounded stability of the considered networked nonlinear systems with a buffer-aided strategy to counteract the limited communication capacity? The primary motivation of this research is, therefore, to address these challenges through a comprehensive examination.
The primary contributions are enumerated as follows.
1) The ultimately bounded output-feedback control problem is considered for the first time for networked nonlinear systems under a buffer-aided strategy over unreliable communication channels.
2) An ADP-based output-feedback control scheme is introduced to address system dynamics constrained by limited communication capacity and the buffer-aided strategy.
3) An adaptive tuning law is designed for the controller.
4) The ultimate boundedness affected by unreliable communication channels and the buffer-aided strategy is rigorously analyzed.

II. Problem Formulation and Preliminaries
In this paper, a nonlinear NCS is examined in which sensor-to-controller transmission takes place over an unreliable communication network. A buffer-aided strategy is integrated with the aim of making more efficient use of measurement data by archiving historical measurements while the communication channel is inaccessible. This section describes the nonlinear NCS, the transmission behavior, and the control methodology employed.

A. System Model and Signal Transmissions
Consider the following nonlinear system:
$$
\begin{cases}
x_{k+1} = A x_k + f(x_k) + B u_k + E\omega_k \\
y_k = C x_k + D\omega_k
\end{cases}
\tag{1}
$$
where $x_k \in \mathbb{R}^{n_x}$, $y_k \in \mathbb{R}^{n_y}$ and $u_k \in \mathbb{R}^{n_u}$ represent, respectively, the system state, the measurement signal and the control input. $f(\cdot)$ is an unknown but bounded smooth nonlinear function on a compact set $\Omega \subset \mathbb{R}^{n_x}$. $\omega_k \in \mathbb{R}^{n_\omega}$ denotes bounded stochastic noise with zero mean and known variance $\bar{Q} = \tilde{Q}\tilde{Q}^T$. Matrices $A$, $B$, $C$, $D$ and $E$ are known.
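To make the model concrete, the following Python sketch simulates system (1) together with the buffer mechanism described in the introduction (store each new measurement, send the whole buffer when the channel is available, then clear it). It is only an illustration under stated assumptions: the matrices, the nonlinearity `f`, the uniform noise, the Bernoulli channel with success probability `p_success`, the buffer capacity `buffer_size`, and the drop-oldest policy are hypothetical choices that do not come from the paper.

```python
import math
import random


def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vec_add(*vs):
    """Element-wise sum of equally sized vectors."""
    return [sum(parts) for parts in zip(*vs)]


def simulate(steps, p_success, buffer_size, seed=0):
    """Simulate system (1) with zero input and a buffer-aided sensor.

    Returns a list of (k, packet) pairs, where packet is the list of
    (time, measurement) pairs delivered at transmission instant k.
    """
    rng = random.Random(seed)
    # Hypothetical system data (not taken from the paper).
    A = [[0.9, 0.1], [0.0, 0.8]]
    B = [[0.0], [1.0]]
    C = [[1.0, 0.0]]
    D = [[0.1]]
    E = [[0.05], [0.05]]

    def f(x):
        # Bounded, smooth nonlinearity chosen for illustration only.
        return [0.1 * math.sin(x[1]), 0.1 * math.tanh(x[0])]

    x = [0.5, -0.5]
    buffer, received = [], []
    for k in range(steps):
        w = [rng.uniform(-1.0, 1.0)]            # bounded, zero-mean noise
        y = vec_add(mat_vec(C, x), mat_vec(D, w))
        buffer.append((k, y))                    # store the newest measurement
        if len(buffer) > buffer_size:            # assumed policy: drop the oldest
            buffer.pop(0)
        if rng.random() < p_success:             # assumed Bernoulli channel
            received.append((k, list(buffer)))   # send everything stored
            buffer.clear()                       # free the buffer afterwards
        u = [0.0]                                # open loop, for illustration
        x = vec_add(mat_vec(A, x), f(x), mat_vec(B, u), mat_vec(E, w))
    return received
```

For example, `simulate(50, 0.3, 4)` returns the delivered packets; each packet ends with the measurement taken at the transmission instant and contains at most four consecutive measurements.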
state estimates with $\vec{x}_{t(i)-q(i)+1} = \hat{x}_{t(i)-q(i)+1}$, $\hat{x}_k$ and $\hat{W}_{f,k}$ are the estimates of $x_k$ and $W_f$, respectively. $\vec{W}_{f,j}$ is the reorganized estimate of $W_f$. Here, $L_{h(i)}$ is the observer gain.
The adaptive tuning law is
$$
\begin{aligned}
&\text{Case 1: If } \{i \mid k = t(i),\ i \ge 0\} = \emptyset \\
&\quad \hat{W}_{f,k+1} = \hat{W}_{f,k} - \alpha_1\alpha_2\hat{W}_{f,k} \\
&\text{Case 2: If } \{i \mid k = t(i),\ i \ge 0\} \ne \emptyset \\
&\quad \vec{W}_{f,j+1} = \vec{W}_{f,j} + \alpha_1\Big(C^T\big(y_{j+1} - C\vec{x}_{j+1}\big)\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_2\vec{W}_{f,j}\Big), \quad t(i)-q(i)+1 \le j \le t(i) \\
&\quad \hat{W}_{f,k+1} = \vec{W}_{f,k+1}
\end{aligned}
\tag{5}
$$
where $\vec{W}_{f,t(i)-q(i)+1} = \hat{W}_{f,t(i)-q(i)+1}$, $\alpha_1$ and $\alpha_2$ are two positive tuning scalars, and $\tilde{\varphi}_f(\vec{x}_k) \triangleq \varphi_f(\vec{x}_k)/\big(\|1 + \varphi_f^T(\vec{x}_k)\varphi_f(\vec{x}_k)\|\,\|C^T C\|\big)$.
Now, we are ready to consider the observer-based control strategy based on $\hat{x}_k$. The desired control input is calculated by minimizing $J(x_k)$ (i.e., $u_k = \arg\min\{J(x_k)\}$), where
$$
J(x_k) \triangleq \sum_{j=k}^{\infty} l(x_j, u_j)
\tag{6}
$$
with the utility function $l(x_k, u_k) \triangleq x_k^T M x_k + u_k^T R u_k$, $l(0, 0) = 0$, and $l(x_k, u_k) \ge 0$ for any $x_k$ and $u_k$. This paper aims to design a suboptimal control strategy to optimize (6). Unfortunately, such a minimization problem is quite difficult to solve since the value of $x_k$ is unknown. An alternative method is to generate the desired control input by minimizing an approximated cost function $\hat{J}(\hat{x}_k)$.
According to the universal approximation property of the NN, $J(x_k)$ can be approximated by an NN (namely, the critic NN)
$$
J(x_k) = W_J^T\varphi_J(x_k) + \zeta_{J,k}
\tag{7}
$$
where $W_J$ is the ideal weight, $\varphi_J(x_k)$ is the corresponding activation function, and $\zeta_{J,k}$ is the bounded approximation error. Similarly, the ideal control input (i.e., $u_k = \arg\min\{J(x_k)\}$) can also be approximated by an NN (namely, the actor NN)
$$
u(x_k) = W_u^T\varphi_u(x_k) + \zeta_{u,k}
\tag{8}
$$
where $W_u$ is the ideal weight matrix for the actor NN, $\varphi_u(x_k)$ is the corresponding activation function, and $\zeta_{u,k}$ is the bounded approximation error.
Since the plant state is inaccessible, the developed control strategy is based on the state estimates $\hat{x}_k$. Accordingly, the approximated cost function $\hat{J}(\hat{x}_k)$ and control input are
$$
\hat{J}(\hat{x}_k) = \hat{W}_{J,k}^T\varphi_J(\hat{x}_k)
\tag{9}
$$
and
$$
\hat{u}(\hat{x}_k) = \hat{W}_{u,k}^T\varphi_u(\hat{x}_k)
\tag{10}
$$
where $\hat{W}_{J,k}$ and $\hat{W}_{u,k}$ denote the estimates of $W_J$ and $W_u$, respectively. The detailed design procedure for the parameters $\hat{W}_{J,k}$ and $\hat{W}_{u,k}$ will be introduced in Section III-B.
Remark 2: In linear cases, HJB equations can be reduced to Riccati equations, which are straightforward to solve. However, for nonlinear systems, finding a solution to the HJB equation often proves challenging due to the presence of intricate nonlinearities within the system. In response, the ADP algorithm, leveraging the actor/critic NNs, has been introduced as an optimal control solution for these nonlinear systems. A detailed description of the control strategy design will be provided subsequently.
Before proceeding further, we introduce some performance requirements on exponential ultimate boundedness in mean square.
Definition 1 [36]: The discrete nonlinear system (1) is said to be exponentially ultimately bounded (EUB) in mean square if there exist constants $\vartheta > 0$, $0 \le \varrho < 1$ and $\varsigma > 0$ such that, for any solution $x_k$ with the initial condition $x_0$, the following is true:
$$
\mathbb{E}[\|x_k\|^2] \le \vartheta\|x_0\|^2\varrho^k + \varsigma, \quad k \ge 0
$$
where $\varsigma$ is an asymptotic upper bound in mean square of (1).
The objectives are twofold.
1) Design the observer parameter $L_{h(i)}$ such that the estimation error (i.e., $x_k - \hat{x}_k$) is EUB in mean square.
2) Design the weight update laws and analyze ultimate boundedness.

III. Main Results

A. Observer Design
Utilizing a buffer-aided strategy, an NN-based observer will be constructed to address unreliable signal transmission scenarios. Since the suboptimal control strategy $u_k$ is derived based on $\hat{x}_k$, the error dynamics are crucial for achieving precise control. Subsequently, a joint analysis of the EUB of the estimation errors for both the state and the NNW will be undertaken.
Defining $\tilde{W}_{f,k} \triangleq W_f - \hat{W}_{f,k}$ and $\check{W}_{f,k} \triangleq W_f - \vec{W}_{f,k}$ as the estimation error and the reorganized estimation error of the nonlinear NNW, respectively, the error dynamics is
$$
\begin{aligned}
&\text{Case 1: If } \{i \mid k = t(i),\ i \ge 0\} = \emptyset \\
&\quad \tilde{W}_{f,k+1} = (1-\alpha_1\alpha_2)\tilde{W}_{f,k} + \alpha_1\alpha_2 W_f \\
&\text{Case 2: If } \{i \mid k = t(i),\ i \ge 0\} \ne \emptyset \\
&\quad \check{W}_{f,j+1} = (1-\alpha_1\alpha_2)\check{W}_{f,j} - \alpha_1 C^T C\check{W}_{f,j}\varphi_f(\vec{x}_j)\tilde{\varphi}_f^T(\vec{x}_j) \\
&\qquad - \alpha_1 C^T D\omega_{j+1}\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T C\check{\zeta}_{f,j}\tilde{\varphi}_f^T(\vec{x}_j) \\
&\qquad + \alpha_1\alpha_2 W_f - \alpha_1 C^T C\bar{E}_{h(i)}\omega_j\tilde{\varphi}_f^T(\vec{x}_j) \\
&\qquad - \alpha_1 C^T C\bar{A}_{h(i)}\check{x}_j\tilde{\varphi}_f^T(\vec{x}_j), \quad t(i)-q(i)+1 \le j \le t(i) \\
&\quad \tilde{W}_{f,k+1} = \check{W}_{f,k+1}
\end{aligned}
\tag{11}
$$
with $\tilde{W}_{f,t(i)-q(i)+1} = \check{W}_{f,t(i)-q(i)+1}$, $\bar{A}_{h(i)} \triangleq A - L_{h(i)}C$, $\bar{E}_{h(i)} \triangleq E - L_{h(i)}D$ and $\check{\zeta}_{f,k} \triangleq W_f\big(\varphi_f(x_k) - \varphi_f(\vec{x}_k)\big) + \zeta_{f,k}$.
Let the estimation error and the reorganized estimation error be $\tilde{x}_k \triangleq x_k - \hat{x}_k$ and $\check{x}_k \triangleq x_k - \vec{x}_k$. The error dynamics is governed by (12).
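The two branches of the adaptive tuning law (5) can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the weight estimate is assumed to be stored as an $n_x \times n_\varphi$ nested list so that $C^T(\cdot)\tilde{\varphi}_f^T$ has matching dimensions, and the norm $\|C^T C\|$ is assumed to be supplied by the caller as `ctc_norm`.

```python
def normalized_phi(phi, ctc_norm):
    """Return phi / (|1 + phi^T phi| * ||C^T C||), the normalized regressor."""
    scale = abs(1.0 + sum(p * p for p in phi)) * ctc_norm
    return [p / scale for p in phi]


def update_without_packet(W, alpha1, alpha2):
    """Case 1 of (5): no transmission, so the weights only decay."""
    return [[(1.0 - alpha1 * alpha2) * w for w in row] for row in W]


def update_with_packet(W, alpha1, alpha2, C, y_next, x_next_est, phi, ctc_norm):
    """Case 2 of (5): one step over a buffered measurement y_{j+1}."""
    n_y, n_x = len(C), len(C[0])
    # Output residual y_{j+1} - C x_{j+1}, using the reorganized state estimate.
    resid = [y_next[r] - sum(C[r][c] * x_next_est[c] for c in range(n_x))
             for r in range(n_y)]
    # Back-project the residual with C^T.
    g = [sum(C[r][c] * resid[r] for r in range(n_y)) for c in range(n_x)]
    phi_t = normalized_phi(phi, ctc_norm)
    return [[W[i][j] + alpha1 * (g[i] * phi_t[j] - alpha2 * W[i][j])
             for j in range(len(phi_t))]
            for i in range(n_x)]
```

When the output residual is zero, the Case 2 step reduces to the pure decay of Case 1, which mirrors the structure of the error dynamics (11): the measurement-driven terms all enter through $y_{j+1} - C\vec{x}_{j+1}$.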
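$$
C^T C P C^T C - \sigma_3\|C^T C\|^2 P \le 0
\tag{17}
$$
$$
\sum_{s=M+1}^{H} \bar{p}_s(1+\mu_1)^{s-M}(1-\mu_2)^{M} + \sum_{s=1}^{M} \bar{p}_s(1-\mu_2)^{s} < 1
\tag{18}
$$
$$
\bar{p}_s \triangleq
\begin{cases}
p_s, & \text{if } s \in H_a \\
1 - \sum_{\iota \in H_a} \bar{p}_\iota, & \text{if } s \in H_b
\end{cases}
\tag{19}
$$
where
$$
\Pi_1 \triangleq \begin{bmatrix} \Pi_1^{11} & \Pi_1^{12} \\ * & \Pi_1^{22} \end{bmatrix}, \quad
\Pi_2 \triangleq \begin{bmatrix} \Pi_2^{11} & 0 & 0 \\ * & \Pi_2^{22} & 0 \\ * & * & \Pi_2^{33} \end{bmatrix}
$$
$$
\Xi_1 \triangleq \begin{bmatrix} \Xi_1^{11} & \Xi_1^{12} & 0 & \Xi_1^{14} \\ * & \Xi_1^{22} & 0 & \Xi_1^{24} \\ * & * & \Xi_1^{33} & 0 \\ * & * & * & \Xi_1^{44} \end{bmatrix}, \quad
\Xi_2 \triangleq \begin{bmatrix} \Xi_2^{11} & \Xi_2^{12} & 0 & \Xi_2^{14} \\ * & \Xi_2^{22} & 0 & \Xi_2^{24} \\ * & * & \Xi_2^{33} & 0 \\ * & * & * & \Xi_2^{44} \end{bmatrix}
$$
$$
\begin{aligned}
&\varepsilon_1 \triangleq \delta\tilde{\alpha}\bar{\varepsilon}, \quad \bar{\varepsilon} \triangleq 1 - \alpha_1\alpha_2 + 4\alpha_1 + \alpha_1\sigma_4 + \alpha_1\alpha_2\sigma_5 \\
&\varepsilon_4 \triangleq 1/\|C^T C\|^2, \quad \bar{\alpha} \triangleq 1 - \alpha_1\alpha_2 + 4\alpha_1, \quad \varepsilon_5 \triangleq 1 + \alpha_1^2 \\
&\varepsilon_6 \triangleq \delta\alpha_1\sigma_3\sigma_4^{-1}\bar{\alpha} + 2\sigma_3\alpha_1^2, \quad \varepsilon_2 \triangleq \varepsilon_3 \triangleq \delta\alpha_1\sigma_3\bar{\varepsilon} \\
&\varepsilon_7 \triangleq \delta\alpha_1\alpha_2\sigma_5^{-1}\bar{\alpha} + 2\alpha_1^2\alpha_2^2
\end{aligned}
$$
$$
V_{1,k} \triangleq \tilde{x}_k^T P\tilde{x}_k, \quad V_{2,k} \triangleq \delta\,\mathrm{tr}\{\tilde{W}_{f,k}^T P\tilde{W}_{f,k}\}.
$$
Since the observer has no measurement signal to utilize when $k \ne t(i)$, the error dynamics (11) and (12) would undergo an increment. Fortunately, at $t(i)$, the buffered signal packet would be transmitted to the observer. With the aid of the signal packet, the estimates of the system state and nonlinear NNW from $t(i)-q(i)+2$ to $t(i)$ would be regenerated, and then those regenerated estimates would be utilized to generate the state estimate at $t(i)+1$ (as seen in (4) and (5)). In this way, the increment would be compensated by the decrement, and the overall error dynamics (for both state and NNW estimation) would be EUB in mean square. Therefore, the following analysis of the error dynamics of state and NNW estimation is implemented based on (11) and (12). Consider two cases.
Case 1: $\{i \mid k = t(i),\ i \ge 0\} = \emptyset$
In this case, there exists a positive integer $i$ satisfying $t(i) < k \le t(i+1) - q(i)$. Denote $\Delta V_k$ as the difference between $V_{k+1}$ and $V_k$, i.e.,
$$
\Delta V_k = \sum_{r=1}^{2}\Delta V_{r,k} = \sum_{r=1}^{2}(V_{r,k+1} - V_{r,k}).
\tag{21}
$$
According to the estimation error dynamics (12), by calculating the mathematical expectation of $\Delta V_k - \mu_1 V_k$, we can easily obtain that
$$
\mathbb{E}\{\Delta V_k - \mu_1 V_k\} = \mathbb{E}\{V_{1,k+1} - (1+\mu_1)V_{1,k} + V_{2,k+1} - (1+\mu_1)V_{2,k}\}
\tag{22}
$$
where
$$
\begin{aligned}
\mathbb{E}\{V_{1,k+1} - (1+\mu_1)V_{1,k}\}
= \mathbb{E}\big\{ & 2\tilde{x}_k^T A^T P\tilde{W}_{f,k}\varphi_f(\hat{x}_k) + 2\tilde{x}_k^T A^T P\check{\zeta}_{f,k} + \tilde{x}_k^T A^T P A\tilde{x}_k \\
& + 2\varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^T P\check{\zeta}_{f,k} + \varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^T P\tilde{W}_{f,k}\varphi_f(\hat{x}_k) + \cdots
\end{aligned}
$$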
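and
$$
\begin{aligned}
\mathbb{E}\{V_{2,k+1} - (1+\mu_1)V_{2,k}\}
= {} & \delta\,\mathrm{tr}\big\{\mathbb{E}\big\{(1-\alpha_1\alpha_2)^2\tilde{W}_{f,k}^T P\tilde{W}_{f,k} + 2(1-\alpha_1\alpha_2)\alpha_1\alpha_2\tilde{W}_{f,k}^T P W_f \\
& - (1+\mu_1)\tilde{W}_{f,k}^T P\tilde{W}_{f,k}\big\}\big\} + W_f^T(\delta\alpha_1^2\alpha_2^2 P - \Phi_3)W_f + W_f^T\Phi_3 W_f.
\end{aligned}
\tag{24}
$$
Subsequently, by means of
$$
\sigma_1\varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^T\tilde{W}_{f,k}\varphi_f(\hat{x}_k) - \sigma_1\bar{\varphi}_f^2\,\mathrm{tr}\{\tilde{W}_{f,k}^T\tilde{W}_{f,k}\} \le 0
\tag{25}
$$
and considering (22) to (25), we have
$$
\begin{aligned}
\mathbb{E}\{\Delta V_k - \mu_1 V_k\}
\le {} & \mathbb{E}\big\{2\tilde{x}_k^T A^T P\tilde{W}_{f,k}\varphi_f(\hat{x}_k) + 2\tilde{x}_k^T A^T P\check{\zeta}_{f,k} \\
& + 2\varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^T P\check{\zeta}_{f,k} + \tilde{x}_k^T A^T P A\tilde{x}_k \\
& + \varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^T P\tilde{W}_{f,k}\varphi_f(\hat{x}_k) + \check{\zeta}_{f,k}^T\Phi_2\check{\zeta}_{f,k} \\
& + \omega_k^T\Phi_1\omega_k + \omega_k^T(E^T P E - \Phi_1)\omega_k \\
& + \check{\zeta}_{f,k}^T(P - \Phi_2)\check{\zeta}_{f,k} - (1+\mu_1)\tilde{x}_k^T P\tilde{x}_k \\
& + \sigma_1\bar{\varphi}_f^2\,\mathrm{tr}\{\tilde{W}_{f,k}^T\tilde{W}_{f,k}\} + \delta\,\mathrm{tr}\big\{\tilde{\alpha}^2\tilde{W}_{f,k}^T P\tilde{W}_{f,k} \\
& + 2\tilde{\alpha}\alpha_1\alpha_2\tilde{W}_{f,k}^T P W_f - (1+\mu_1)\tilde{W}_{f,k}^T P\tilde{W}_{f,k}\big\} \\
& + W_f^T(\delta\alpha_1^2\alpha_2^2 P - \Phi_3)W_f + W_f^T\Phi_3 W_f\big\} \\
\le {} & \mathbb{E}\{\gamma_k^T\Pi_1\gamma_k + \eta_k^T\Xi_1\eta_k\} + d_1
\end{aligned}
\tag{26}
$$
where
$$
\begin{aligned}
&\gamma_k \triangleq \begin{bmatrix}\tilde{W}_{f,k}^T & W_f^T\end{bmatrix}^T \\
&\eta_k \triangleq \begin{bmatrix}\tilde{x}_k^T & \varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^T & \omega_k^T & \check{\zeta}_{f,k}^T\end{bmatrix}^T \\
&d_1 \triangleq \mathrm{tr}\{\tilde{Q}^T\Phi_1\tilde{Q} + \Phi_2\tilde{\zeta}^2 + \Phi_3\bar{W}_f^2\} \\
&\tilde{\zeta} \triangleq 2\bar{W}_f\bar{\varphi}_f + \bar{\zeta}_f, \quad \tilde{\alpha} \triangleq 1 - \alpha_1\alpha_2.
\end{aligned}
$$
Taking (13), (15) and (26) into account, we arrive at
$$
\mathbb{E}\{\Delta V_k - \mu_1 V_k\} \le \gamma_k^T\Pi_1\gamma_k + \eta_k^T\Xi_1\eta_k + d_1 \le d_1.
\tag{27}
$$
Therefore, for any $t(i)+1 \le k < t(i+1)-q(i+1)+1$ and positive scalar $\pi_1$, we have
$$
\pi_1^{k+1}V_{k+1} - \pi_1^{k}V_k = \pi_1^{k+1}(V_{k+1} - V_k) + \pi_1^{k}(\pi_1 - 1)V_k
$$
$$
V_{t(i+1)-q(i+1)+1} \le \bar{\pi}_1^{\,q(i+1)-h(i+1)}V_{t(i)+1} + \bar{d}_1
\tag{29}
$$
where $\bar{d}_1 \triangleq d_1\dfrac{\bar{\pi}_1 - \bar{\pi}_1^{\,q(i+1)-h(i+1)+1}}{1-\bar{\pi}_1}$.
Case 2: $\{i \mid k = t(i),\ i \ge 0\} \ne \emptyset$
In this case, there exists a positive integer $i$ such that $k = t(i+1)$. Furthermore, under the effects of the buffer-aided strategy, the available measurement signals (i.e., $Y_{t(i+1)} = \{y_{t(i+1)}, y_{t(i+1)-1}, \ldots, y_{t(i+1)-q(i+1)+1}\}$) are utilized to facilitate the state estimation process, where the reorganized estimated states and NNWs are acquired (as shown in (4) and (5)). Then, the desired state estimate $\hat{x}_{t(i+1)+1}$ is generated based on the reorganized estimated states.
For $t(i+1)-q(i+1)+1 \le j < t(i+1)+1$, letting $\check{V}_j \triangleq \check{V}_{1,j} + \check{V}_{2,j} \triangleq \check{x}_j^T P\check{x}_j + \delta\,\mathrm{tr}\{\check{W}_{f,j}^T P\check{W}_{f,j}\}$ and calculating the mathematical expectation of $\check{V}_{j+1} - \check{V}_j$, we have
$$
\mathbb{E}\{\Delta\check{V}_j\} = \mathbb{E}\{\check{V}_{1,j+1} + \check{V}_{2,j+1} - \check{V}_{1,j} - \check{V}_{2,j}\}
\tag{30}
$$
where
$$
\begin{aligned}
\mathbb{E}\{\check{V}_{1,j+1} - \check{V}_{1,j}\}
= {} & \mathbb{E}\big\{2\check{x}_j^T\bar{A}_{h(i)}^T P\check{W}_{f,j}\varphi_f(\vec{x}_j) + 2\check{x}_j^T\bar{A}_{h(i)}^T P\check{\zeta}_{f,j} + \check{x}_j^T\bar{A}_{h(i)}^T P\bar{A}_{h(i)}\check{x}_j \\
& + 2\varphi_f^T(\vec{x}_j)\check{W}_{f,j}^T P\check{\zeta}_{f,j} + \varphi_f^T(\vec{x}_j)\check{W}_{f,j}^T P\check{W}_{f,j}\varphi_f(\vec{x}_j) \\
& + 2\omega_j^T\bar{E}_{h(i)}^T P\check{\zeta}_{f,j} + \omega_j^T(\bar{E}_{h(i)}^T P\bar{E}_{h(i)} - \Phi_4)\omega_j \\
& + \check{\zeta}_{f,j}^T(P - \Phi_5)\check{\zeta}_{f,j} + \omega_j^T\Phi_4\omega_j + \check{\zeta}_{f,j}^T\Phi_5\check{\zeta}_{f,j} \\
& - (1-\mu_2)\check{x}_j^T P\check{x}_j - \mu_2\check{x}_j^T P\check{x}_j\big\}
\end{aligned}
\tag{31}
$$
and
$$
\begin{aligned}
\mathbb{E}\{\check{V}_{2,j+1} - \check{V}_{2,j}\}
= {} & \delta\,\mathrm{tr}\Big\{\mathbb{E}\Big\{\Big(\tilde{\alpha}\check{W}_{f,j} - \alpha_1 C^T C\check{W}_{f,j}\varphi_f(\vec{x}_j)\tilde{\varphi}_f^T(\vec{x}_j) \\
& - \alpha_1 C^T C\bar{A}_{h(i)}\check{x}_j\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T D\omega_{j+1}\tilde{\varphi}_f^T(\vec{x}_j) \\
& - \alpha_1 C^T C\bar{E}_{h(i)}\omega_j\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T C\check{\zeta}_{f,j}\tilde{\varphi}_f^T(\vec{x}_j) \\
& + \alpha_1\alpha_2 W_f\Big)^T P\Big(\tilde{\alpha}\check{W}_{f,j} - \alpha_1 C^T C\check{W}_{f,j}\varphi_f(\vec{x}_j)\tilde{\varphi}_f^T(\vec{x}_j) \\
& - \alpha_1 C^T C\bar{A}_{h(i)}\check{x}_j\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T D\omega_{j+1}\tilde{\varphi}_f^T(\vec{x}_j) \\
& - \alpha_1 C^T C\bar{E}_{h(i)}\omega_j\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T C\check{\zeta}_{f,j}\tilde{\varphi}_f^T(\vec{x}_j) \\
& + \alpha_1\alpha_2 W_f\Big) - (1-\mu_2+\mu_2)\check{W}_{f,j}^T P\check{W}_{f,j}\Big\}\Big\}.
\end{aligned}
\tag{32}
$$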
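The constant $\bar{d}_1$ in (29) has the form of a finite geometric sum, $d_1\sum_{r=1}^{n}\bar{\pi}_1^{\,r}$ with $n = q(i+1)-h(i+1)$. The short Python check below compares the closed form with the explicit sum. It is a numerical illustration only: the function names are hypothetical, the values of `d1`, `pi_bar` and `n` are arbitrary, and the definition of $\bar{\pi}_1$ itself is not restated here.

```python
def d_bar_closed_form(d1, pi_bar, n):
    """Closed form d1 * (pi_bar - pi_bar**(n + 1)) / (1 - pi_bar)."""
    if pi_bar == 1.0:
        raise ValueError("closed form is undefined for pi_bar == 1")
    return d1 * (pi_bar - pi_bar ** (n + 1)) / (1.0 - pi_bar)


def d_bar_by_sum(d1, pi_bar, n):
    """The same quantity written as d1 * sum_{r=1}^{n} pi_bar**r."""
    return d1 * sum(pi_bar ** r for r in range(1, n + 1))


if __name__ == "__main__":
    for pi_bar in (0.9, 1.05):
        for n in (0, 1, 5):
            print(pi_bar, n,
                  d_bar_closed_form(2.0, pi_bar, n),
                  d_bar_by_sum(2.0, pi_bar, n))
```

The two functions agree up to rounding for any $\bar{\pi}_1 \ne 1$, and both return zero when $n = 0$, that is, when no steps separate the two instants compared in (29).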
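Afterwards, with the help of the inequality
$$
\sigma_2\varphi_f^T(\vec{x}_j)\check{W}_{f,j}^T\check{W}_{f,j}\varphi_f(\vec{x}_j) - \sigma_2\bar{\varphi}_f^2\,\mathrm{tr}\{\check{W}_{f,j}^T\check{W}_{f,j}\} \le 0
\tag{34}
$$
we substitute (33) and (31) into (30) to obtain
$$
\begin{aligned}
\mathbb{E}\{\Delta\check{V}_j\} \le \mathbb{E}\big\{ & 2\check{x}_j^T\bar{A}_{h(i)}^T P\check{W}_{f,j}\varphi_f(\vec{x}_j) + 2\check{x}_j^T\bar{A}_{h(i)}^T P\check{\zeta}_{f,j} \\
& + (1+\varepsilon_3)\check{x}_j^T\bar{A}_{h(i)}^T P\bar{A}_{h(i)}\check{x}_j + 2\varphi_f^T(\vec{x}_j)\cdots
\end{aligned}
$$
where $\tilde{\mu} \triangleq \bar{\mu}_1^{\,q(i+1)-h(i+1)}\bar{\mu}_2^{\,-q(i+1)}$.
Considering (19), we have from $0 \le p_n \le 1 - \sum_{\iota \in H_a} p_\iota$ that
$$
\mathbb{E}\{\bar{\mu}_1^{\,q(i+1)-h(i+1)}\bar{\mu}_2^{\,-q(i+1)}\} = \sum_{s=1}^{M} p_s\bar{\mu}_1^{\,s-s}\bar{\mu}_2^{\,-s} + \sum_{s=M+1}^{H} p_s\bar{\mu}_1^{\,M-s}\bar{\mu}_2^{\,-M}
$$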
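Condition (18), which has the same two-sum structure as the expectation above, is straightforward to evaluate numerically once the probabilities $\bar{p}_s$ are known. The following Python sketch computes its left-hand side exactly as written in (18). The function names and the sample numbers are hypothetical, and the list `p_bar` is assumed to hold $\bar{p}_1, \ldots, \bar{p}_H$ in order.

```python
def lhs_condition_18(p_bar, mu1, mu2, M):
    """Left-hand side of (18); p_bar[s - 1] holds p_bar_s for s = 1..H."""
    H = len(p_bar)
    total = 0.0
    for s in range(1, H + 1):
        if s <= M:
            # Second sum in (18): s = 1..M.
            total += p_bar[s - 1] * (1.0 - mu2) ** s
        else:
            # First sum in (18): s = M+1..H.
            total += p_bar[s - 1] * (1.0 + mu1) ** (s - M) * (1.0 - mu2) ** M
    return total


def satisfies_condition_18(p_bar, mu1, mu2, M):
    """True when the left-hand side of (18) is strictly below one."""
    return lhs_condition_18(p_bar, mu1, mu2, M) < 1.0
```

As a usage example, `lhs_condition_18([0.5, 0.5], 0.1, 0.2, 1)` evaluates $0.5 \cdot 0.8 + 0.5 \cdot 1.1 \cdot 0.8 = 0.84$, so the condition holds for these illustrative values, whereas placing all probability mass on a large $s$ with a large $\mu_1$ can push the sum above one.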
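NNW (i.e., $\hat{W}_{J,0}$ and $\hat{W}_{u,0}$) be selected from a compact set which includes the ideal weights. Assume that there exist scalars $\beta_1 > 0$, $\beta_2 > 0$, $0 < \mu_j < 1$ ($j = 3, 4$), $\sigma_s > 0$ ($s = 6, 7, 8, 9$) and positive matrices $\Gamma_l$ ($l = 1, 2, 3$) such that
$$
\Xi_3 < 0
\tag{51}
$$
$$
\Xi_4 < 0
\tag{52}
$$
where
$$
\Xi_3 \triangleq \begin{bmatrix} \Xi_3^{11} & 0 \\ * & \Xi_3^{22} \end{bmatrix}, \quad
\Xi_4 \triangleq \begin{bmatrix} \Xi_4^{11} & \Xi_4^{12} \\ * & \Xi_4^{22} \end{bmatrix}
$$
$$
\Xi_3^{11} \triangleq -\Big(\beta_1\big(2 - \sigma_6 - 4\beta_1\sigma_6\bar{\varphi}_J^2 - 4\beta_1\bar{\varphi}_J^2\big) - \mu_3\Big)
$$
$$
\begin{aligned}
\mathbb{E}\{\Delta V_{4,k} \mid \hat{x}_k, \hat{W}_{u,k}, \hat{W}_{J,k}\}
= {} & \mathbb{E}\{V_{4,k+1} \mid \hat{x}_k, \hat{W}_{u,k}, \hat{W}_{J,k}\} - V_{4,k} \\
\le {} & \mathrm{tr}\big\{\mathbb{E}\big\{(-2\beta_2\bar{\varphi}_u^2 + \beta_2^2\bar{\varphi}_u^4 + \beta_2\sigma_7 + \beta_2\bar{\varphi}_u^2\sigma_8 + \mu_4)\tilde{W}_{u,k}^T\tilde{W}_{u,k} \\
& + 2(-\beta_2 + \beta_2^2\bar{\varphi}_u^2)\tilde{W}_{u,k}^T\Upsilon_u + \Upsilon_u^T(\beta_2^2 + \beta_2^2\sigma_9\cdots
\end{aligned}
$$
$$
\begin{aligned}
\mathrm{tr}\big\{\mathbb{E}\{\Delta V_{3,k} + \Delta V_{4,k} \mid \hat{x}_k, \tilde{W}_{u,k}, \tilde{W}_{J,k}\}\big\}
\le \mathrm{tr}\big\{\mathbb{E}\big\{ & \xi_k^T\Xi_3\xi_k - \mu_3\tilde{W}_{J,k}^T\tilde{W}_{J,k} + d_3 + \bar{\xi}_k^T\Xi_4\bar{\xi}_k \\
& - \mu_4\tilde{W}_{u,k}^T\tilde{W}_{u,k} + d_4\big\}\big\}
\end{aligned}
$$
where
$$
\begin{aligned}
&\xi_k \triangleq \begin{bmatrix}\tilde{W}_{J,k}^T & \varrho_k^T\end{bmatrix}^T, \quad \bar{\xi}_k \triangleq \begin{bmatrix}\tilde{W}_{u,k}^T & \Upsilon_u^T\end{bmatrix}^T \\
&\varrho_k \triangleq l^T\big(\hat{x}_k, \hat{u}(\hat{x}_k)\big) + \Delta\varphi_J^T(\hat{x}_k)W_J, \quad d_3 \triangleq \mathrm{tr}\{3\bar{W}_J\bar{\varphi}_J\Gamma_1\} \\
&d_4 \triangleq \mathrm{tr}\{(\|R^{-1}B^T\|^2\bar{\varphi}_u^2\bar{\varphi}_J^2\bar{W}_J^2 + \bar{\varphi}_u^4\bar{W}_u^2)\Gamma_2\}.
\end{aligned}
$$
$$
\Pi_5^{33} \triangleq (2 + \sigma_{10}^{-1})B^T B - \Gamma_4.
$$
Then, system (1) with control policy (10) is EUB in mean square.
Proof: In light of optimal control theory, (8) will stabilize (in the sense of input-to-state stability) the following system on a compact set [13]: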
5
The initial values are
4 W^ 1, J, k
x0 = [0.9 −0.6 0.6]T , x̂0 = [−0.24 0.12 −0.36]T 3 W^ 2, J, k
W^ 3, J, k
Ŵ f,0 = [0.1 0.1 0.1]T , Ŵ J,0 = [−1 −1 1.8]T 2
Amplitude
1
Ŵu,0 = [−1.52 −4.24 7.6] .
0
The validity and efficacy of our proposed approach are visu-
−1
ally substantiated through results explained as follows.
1) To begin, Fig. 1 showcases the norm of state trajectories −2
×108
3.0 Fig. 3. The weight estimate of critic NN.
||xk||
2.5 8
2.0 W^ 1, u, k W^ 2, u, k W^ 3, u, k
6
Amplitude
1.5 4
1.0
Amplitude
2
0.5 0
0 −2
−0.5 −4
0 200 400 600 800 1000
Time (k)
−6
0 10 20 30 40 50 60 70 80 90
Fig. 1. Norm of the state vector of the open-loop system. Time (k)
2) Transitioning to the closed-loop system, we have dis- Fig. 4. The weight estimate of actor NN.
played both the state trajectories and estimates in Fig. 2,
which provides a clear testament to the feasibility of the NN- 8
based output-feedback control strategy developed in our ^ ^ k)
u(x
6
study. The trajectories closely align with their estimates,
underscoring the controller’s ability to maintain system stabil- 4
ity and accurately track the desired states.
Amplitude
1.0 x1, k 0
0.8 x^ 1, k
x2, k −2
0.6 x^ 2, k
0.4 x3, k −4
Amplitude
x^ 3, k
0.2 −6
0 10 20 30 40 50 60 70 80 90
0
Time (k)
−0.2
−0.4 Fig. 5. The control input.
−0.6
−0.8
4) Collectively, these simulation outcomes show that the
0 10 20 30 40 50 60 70 80 90 proposed NN-based control strategy achieves satisfactory per-
Time (k)
formance, and our developed approach not only addresses the
Fig. 2. States and their estimates of the closed-loop system. inherent instability of the system but also provides commend-
able precision and adaptability.
3) Delving into the neural network details, Figs. 3 and 4
depict the estimates of the actor/critic NNWs, respectively, V. Conclusions
and this provides insight into the dynamic adaptation and In this study, we have examined the ultimately bounded out-
learning process that the networks undergo as they interact put-feedback control for networked nonlinear systems
with the system. The control input, crucial for achieving the employing a buffer-aided strategy over unreliable communica-
desired system behavior, is represented in Fig. 5, from which tion channels was explored. Given the unreliable nature of sig-
one can verify the controller’s responsiveness and precision in nal transmission, we have used a buffer-aided strategy to relay
action. a greater number of measurements. To obtain the coveted con-
Authorized licensed use limited to: Anhui University of Technology. Downloaded on March 19,2025 at 03:14:19 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: ULTIMATELY BOUNDED OUTPUT FEEDBACK CONTROL FOR NETWORKED NONLINEAR SYSTEMS 1577
trol strategy, an NN-based observer has been devised for state Networks and Learning Systems, vol. 415, pp. 258–265, Nov. 2020.
estimation. In addition, an observer-based ADP algorithm has [16] L. Wang, S. Liu, Y. Zhang, D. Ding, and X. Yi, “Non-fragile l2–l∞ state
estimation for time-delayed artificial neural networks: An adaptive
been introduced to approximate the ideal solution for the sub- event-triggered approach,” Int. Journal of Systems Science, vol. 53,
optimal control issue. Utilizing the Lyapunov stability, suffi- no. 10, pp. 2247–2259, Jul. 2022.
cient conditions have been identified that jointly ensure that [17] Z. Zhao, X. Yi, L. Ma, and X. Bai, “Quantized recursive filtering for
the close-loop system, state estimates and critic/actor-NNW networked systems with stochastic transmission delays,” ISA
Transactions, vol. 127, pp. 99–107, Aug. 2022.
estimates are all the EUB in mean square. Numerical exam-
[18] R. Caballero-Aguila, A. Hermoso-Carazo, and J. Linares-Perez,
ples have been presented to reinforce the efficacy of the out- “Optimal state estimation for networked systems with random
lined control strategy. Potential avenues for future investiga- parameter matrices, correlated noises and delayed measurements,” Int.
tions include the extension of the proposed control strategy to Journal of General Systems, vol. 44, no. 2, pp. 142–154, Feb. 2015.
systems with buffer-aided strategy and other phenomena such [19] D. Ciuonzo, A. Aubry, and V. Carotenuto, “Rician MIMO channel- and
jamming-aware decision fusion,” IEEE Trans. Signal Processing,
as complex networks [37]−[39], wireless sensor networks vol. 65, no. 15, pp. 3866–3880, 2017.
[40], multiagent systems [41], and others [42]−[48]. [20] X.-M. Zhang, Q.-L. Han, X. Ge, D. Ding, L. Ding, D. Yue, and C.
Peng, “Networked control systems: A survey of trends and techniques,”
IEEE/CAA J. Autom. Sinica, vol. 7, no. 1, pp. 1–17, Jan. 2020.
References
[21] X. Guan, J. Hu, J. Qi, D. Chen, F. Zhang, and G. Yang, “Observer-
[1] X. Liang, Q. Qi, H. Zhang, and L. Xie, “Decentralized control for based H∞ sliding mode control for networked systems subject to
networked control systems with asymmetric information,” IEEE Trans. communication channel fading and randomly varying nonlinearities,”
Automatic Control, vol. 67, no. 4, pp. 2076–2083, Apr. 2022. Neurocomputing, vol. 437, pp. 312–324, May 2021.
[2] F. L. Lewis and D. Vrabie, “Reinforcement learning and adaptive [22] W. Qian, W. Xing, and S. Fei, “H∞ state estimation for neural networks
dynamic programming for feedback control,” IEEE Circuits and with general activation function and mixed time-varying delays,” IEEE
Systems Magazine, vol. 9, no. 3, pp. 40–58, 2009. Trans. Neural Networks and Learning Systems, vol. 32, no. 9,
[3] B. Sun and E.-J. Van Kampen, “Event-triggered constrained control pp. 3909–3918, Sept. 2021.
using explainable global dual heuristic programming for nonlinear [23] L. Yu, Y. Cui, Y. Liu, N. D. Alotaibi, and F. E. Alsaadi, “Sampled-
discrete-time systems,” Neurocomputing, vol. 468, pp. 452–463, Jan. based consensus of multi-agent systems with bounded distributed time-
2022. delays and dynamic quantisation effects,” Int. Journal of Systems
Science, vol. 53, no. 11, pp. 2390–2406, Aug. 2022.
[4] X. Wang, W. Liu, Q. Wu, and S. Li, “A modular optimal formation
control scheme of multiagent systems with application to multiple [24] Y. Zhao, X. He, L. Ma, and H. Liu, “Unbiasedness-constrained least
mobile robots,” IEEE Trans. Industrial Electronics, vol. 69, no. 9, squares state estimation for time-varying systems with missing
pp. 9331–9341, Sept. 2022. measurements under round-robin protocol,” Int. Journal of Systems
Science, vol. 53, no. 9, pp. 1925–1941, Jul. 2022.
[5] H. Zhang, Y. Luo, and D. Liu, “Neural-network-based near-optimal
control for a class of discrete-time affine nonlinear systems with control [25] H. Geng, Z. Wang, Y. Chen, X. Yi, and Y. Cheng, “Variance-
constraints,” IEEE Trans. Neural Networks and Learning Systems, constrained filtering fusion for nonlinear cyber-physical systems with
vol. 20, no. 9, pp. 1490–1503, Sept. 2009. the denial-of-service attacks and stochastic communication protocol,”
IEEE/CAA J. Autom. Sinica, vol. 9, no. 6, pp. 978–989, Jun. 2022.
[6] D. V. Prokhorov, R. Santiago, and D. C. Wunsch, “Adaptive critic
designs: A case study for neurocontrol,” Neural Networks, vol. 8, no. 9, [26] X. Li, F. Han, N. Hou, H. Dong, and H. Liu, “Set-membership filtering
pp. 1367–1372, 1995. for piecewise linear systemswith censored measurements under Round-
Robin protocol,” Int. Journal of Systems Science, vol. 51, no. 9,
[7] X. Wang, Y. Sun, and D. Ding, “Adaptive dynamic programming for pp. 1578–1588, 2020.
networked control systems under communication constraints: A survey
of trends and techniques,” Int. Journal of Network Dynamics and [27] Y. S. Shmaliy, S. Zhao, and C. K. Ahn, “Unbiased finite impluse
Intelligence, vol. 1, no. 1, pp. 85–98, Dec. 2022. response filtering: An iterative alternative to Kalman filtering ignoring
noise and initial conditions,” IEEE Control Systems Magazine, vol. 37,
[8] Q. Wei, D. Wang, and D. Zhang, “Dual iterative adaptive dynamic no. 5, pp. 70–89, 2017.
programming for a class of discrete-time nonlinear systems with time-
delays,” Neural Computing and Applications, vol. 23, pp. 7–8, Dec. [28] H. Song, D. Ding, H. Dong, G. Wei, and Q.-L. Han, “Distributed
2013. entropy filtering subject to DoS attacks in non-Gauss environments,”
Int. Journal of Robust and Nonlinear Control, vol. 30, no. 3, pp. 1240–
[9] X. Wu and C. Wang, “Event-driven adaptive near-optimal tracking 1257, Feb. 2020.
control of the robot in aircraft skin inspection,” Int. Journal of Robust
and Nonlinear Control, vol. 31, no. 7, pp. 2593–2613, May 2021. [29] Z. Wang, L. Wang, S. Liu, and G. Wei, “Encoding-decoding-based
control and filtering of networked systems: Insightsdevelopments and
[10] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward opportunities,” IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, pp. 3–18, Jan.
networks are universal approximators,” Neural Networks, vol. 2, no. 5, 2018.
pp. 359–366, Jun. 1989.
[30] D. Shi, T. Chen, and L. Shi, “Event-triggered maximum likelihood state
[11] Z. Ming, H. Zhang, Y. Luo, and W. Wang, “Dynamic event-based estimation,” Automatica, vol. 50, no. 1, pp. 247–254, Feb. 2014.
control for stochastic optimal regulation of nonlinear networked control
[31] M. Barakat, “Novel chaos game optimization tuned-fractional-order
systems,” IEEE Trans. Neural Networks and Learning Systems, vol. 34,
PID fractional-order PI controller for load-frequency control of
no. 10, p. 7308, 7299. 2023.
interconnected power systems,” Protection and Control of Modern
[12] H. Ren, H. Zhang, Y. Mu, and J. Duan, “Off-policy synchronous Power Systems, 2022. DOI: 10.1186/s41601-022-00238-x
iteration IRL method for multi-player zero-sum games with input
[32] Y. Wang and G. Yang, “Robust H∞ model reference tracking control for
constraints,” Neurocomputing, vol. 379, pp. 413–421, Feb. 2020.
networked control systems with communication constraints,” Int.
[13] D. Ding, Z. Wang, and Q.-L. Han, “Neural-network-based consensus Journal of Control, Automation, and Systems, vol. 7, no. 6, pp. 992–
control for multiagent systems with input constraints: The event- 1000, Dec. 2009.
triggered case,” IEEE Trans. Cybernetics, vol. 50, no. 8, pp. 3719–3730, [33] Y. Xu, L. Yang, Z. Wang, H. Rao, and R. Lu, “State estimation for
Aug. 2020. networked systems with Markov driven transmission and buffer
[14] Y. Chen, K. Ma, and R. Dong, “Dynamic anti-windup design for linear constraint,” IEEE Trans. Systems, Man, and Cybernetics: Systems,
systems with time-varying state delay and input saturations,” Int. vol. 51, no. 12, pp. 7727–7734, Dec. 2021.
Journal of Systems Science, vol. 53, no. 10, pp. 2165–2179, Jul. 2022. [34] Y. Cui, L. Yu, Y. Liu, W. Zhang, and F. E. Alsaadi, “Dynamic event
[15] W. Qian, Y. Li, Y. Zhao, and Y. Chen, “New optimal method for l2–l∞ based non-fragile state estimation for complex networks via partial
state estimation of delayed neural networks,” IEEE Trans. Neural nodes information,” Journal of the Franklin Institute, vol. 358, no. 18,
Authorized licensed use limited to: Anhui University of Technology. Downloaded on March 19,2025 at 03:14:19 UTC from IEEE Xplore. Restrictions apply.
1578 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 11, NO. 7, JULY 2024
pp. 10193–10212, Dec. 2021.
[35] X. Wang, D. Ding, X. Ge, and Q.-L. Han, “Neural-network-based control for discrete-time nonlinear systems with denial-of-service attack: The adaptive event-triggered case,” Int. Journal of Robust and Nonlinear Control, vol. 32, no. 5, pp. 2760–2779, Mar. 2022.
[36] L. Zou, Z. Wang, Q.-L. Han, and D. Zhou, “Ultimate boundedness control for networked systems with try-once-discard protocol and uniform quantization effects,” IEEE Trans. Automatic Control, vol. 62, no. 12, pp. 6582–6588, Dec. 2017.
[37] G. Bao, L. Ma, and X. Yi, “Recent advances on cooperative control of heterogeneous multi-agent systems subject to constraints: A survey,” Systems Science & Control Engineering, vol. 10, no. 1, pp. 539–551, Dec. 2022.
[38] C. Gao, X. He, H. Dong, H. Liu, and G. Lyu, “A survey on fault-tolerant consensus control of multi-agent systems: Trends, methodologies and prospects,” Int. Journal of Systems Science, vol. 53, no. 13, pp. 2800–2813, Oct. 2022.
[39] X. Wan, Y. Li, Y. Li, and M. Wu, “Finite-time H∞ state estimation for two-time-scale complex networks under stochastic communication protocol,” IEEE Trans. Neural Networks and Learning Systems, vol. 33, no. 1, pp. 25–36, Jan. 2022.
[40] Y. Ju, G. Wei, D. Ding, and S. Liu, “A novel fault detection method under weighted try-once-discard scheduling over sensor networks,” IEEE Trans. Control of Network Systems, vol. 7, no. 3, pp. 1489–1499, Sept. 2020.
[41] W. Qian, Y. Gao, and Y. Yang, “Global consensus of multiagent systems with internal delays and communication delays,” IEEE Trans. Systems, Man, and Cybernetics: Systems, vol. 49, no. 10, pp. 1961–1970, Oct. 2019.
[42] Y. Chen, Q. Song, Z. Zhao, Y. Liu, and F. E. Alsaadi, “Global Mittag-Leffler stability for fractional-order quaternion-valued neural networks with piecewise constant arguments and impulses,” Int. Journal of Systems Science, vol. 53, no. 8, pp. 1756–1768, Jun. 2022.
[43] X. Li, Q. Song, Y. Liu, and F. E. Alsaadi, “Nash equilibrium and bang-bang property for the non-zero-sum differential game of multi-player uncertain systems with Hurwicz criterion,” Int. Journal of Systems Science, vol. 53, no. 10, pp. 2207–2218, Jul. 2022.
[44] Y. Sun, D. Ding, H. Dong, and H. Liu, “Event-based resilient filtering for stochastic nonlinear systems via innovation constraints,” Information Sciences, vol. 546, pp. 512–525, Feb. 2021.
[45] H. Tao, H. Tan, Q. Chen, H. Liu, and J. Hu, “H∞ state estimation for memristive neural networks with randomly occurring DoS attacks,” Systems Science & Control Engineering, vol. 10, no. 1, pp. 154–165, Dec. 2022.
[46] H. Shen, M. Xing, H. Yan, and J. Cao, “Observer-based l2–l∞ control for singularly perturbed semi-Markov jump systems with an improved weighted TOD protocol,” Science China-Information Sciences, 2022. DOI: 10.1007/s11432-021-3345-1
[47] H. Yu, J. Hu, B. Song, H. Liu, and X. Yi, “Resilient energy-to-peak filtering for linear parameter-varying systems under random access protocol,” Int. Journal of Systems Science, vol. 53, no. 11, pp. 2421–2436, Aug. 2022.
[48] Q. Zhang and Y. Zhou, “Recent advances in non-Gaussian stochastic systems control theory and its applications,” Int. Journal of Network Dynamics and Intelligence, vol. 1, no. 1, pp. 111–119, Dec. 2022.

Yuhan Zhang received the B.Eng. degree in electronic information science and technology from the Shandong University of Science and Technology, in 2016, and the M.Sc. degree in marketing from the University of Nottingham, UK, in 2017. She is currently pursuing the Ph.D. degree in control science and engineering from Shandong University of Science and Technology.
Her current research interests include the control and filtering of networked systems, reinforcement learning, and neural networks.
She is a very active Reviewer for many international journals.

Zidong Wang (Fellow, IEEE) received the B.Sc. degree in mathematics in 1986 from Suzhou University, and the M.Sc. degree in applied mathematics in 1990 and the Ph.D. degree in electrical engineering in 1994, both from Nanjing University of Science and Technology.
He is currently Professor of dynamical systems and computing in the Department of Computer Science, Brunel University London, UK. From 1990 to 2002, he held teaching and research appointments in universities in China, Germany and the UK. Prof. Wang’s research interests include dynamical systems, signal processing, bioinformatics, control theory and applications. He has published a number of papers in international journals. He is a holder of the Alexander von Humboldt Research Fellowship of Germany, the JSPS Research Fellowship of Japan, William Mong Visiting Research Fellowship of Hong Kong, China.
Prof. Wang serves (or has served) as the Editor-in-Chief for International Journal of Systems Science, the Editor-in-Chief for Neurocomputing, the Editor-in-Chief for Systems Science and Control Engineering, and an Associate Editor for 12 international journals including IEEE Transactions on Automatic Control, IEEE Transactions on Control Systems Technology, IEEE Transactions on Neural Networks, IEEE Transactions on Signal Processing, and IEEE Transactions on Systems, Man, and Cybernetics-Part C. He is a Member of the Academia Europaea, a Member of the European Academy of Sciences and Arts, an Academician of the International Academy for Systems and Cybernetic Sciences, a Fellow of the IEEE, a Fellow of the Royal Statistical Society and a member of program committee for many international conferences.

Lei Zou (Senior Member, IEEE) received the Ph.D. degree in control science and engineering in 2016 from Harbin Institute of Technology.
He is currently a Professor with the College of Information Science and Technology, Donghua University. From October 2013 to October 2015, he was a visiting Ph.D. student with the Department of Computer Science, Brunel University London, U.K. His research interests include control and filtering of networked systems, moving-horizon estimation, state estimation subject to outliers, and secure state estimation.
Prof. Zou serves (or has served) as an Associate Editor for IEEE/CAA Journal of Automatica Sinica, Neurocomputing, International Journal of Systems Science, and International Journal of Control, Automation and Systems, a Senior Member of IEEE, a Senior Member of Chinese Association of Automation, a Regular Reviewer of Mathematical Reviews, and a very active Reviewer for many international journals.

Yun Chen received the B.E. degree in thermal engineering in 1999 from Central South University of Technology (Central South University), and the M.E. degree in engineering thermal physics in 2002, and the Ph.D. degree in control science and engineering in 2008, both from Zhejiang University.
From August 2009 to August 2010, he was a Visiting Fellow with the School of Computing, Engineering and Mathematics, University of Western Sydney, Australia. From December 2016 to December 2017, he was an Academic Visitor with the Department of Mathematics, Brunel University London, UK. In 2002, he joined Hangzhou Dianzi University, where he is currently a Professor. His research interests include stochastic and hybrid systems, robust control and filtering.

Guoping Lu received the B.S. degree from the Department of Applied Mathematics, Chengdu University of Science and Technology, in 1984, and the M.S. and Ph.D. degrees in applied mathematics from the Department of Mathematics, East China Normal University, in 1989 and 1998, respectively.
He is currently a Professor with the School of Electrical Engineering, Nantong University. His current research interests include singular systems, multiagent systems, networked control, and nonlinear signal processing.