Ultimately Bounded Output Feedback Control for Networked Nonlinear Systems With Unreliable Communication Channel: A Buffer-Aided Strategy

This paper addresses ultimately bounded output feedback control for networked nonlinear systems operating over unreliable communication channels, utilizing a novel buffer-aided strategy to enhance measurement data utilization. An observer-based controller is developed using neural networks to mitigate the effects of signal transmission issues and unknown nonlinear dynamics, with a focus on analyzing system performance through stochastic analysis and Lyapunov stability. The research aims to provide solutions to challenges posed by communication constraints, ensuring effective control in practical engineering applications.


1566 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 11, NO. 7, JULY 2024

Ultimately Bounded Output Feedback Control for Networked Nonlinear Systems With Unreliable Communication Channel: A Buffer-Aided Strategy
Yuhan Zhang, Zidong Wang, Fellow, IEEE, Lei Zou, Senior Member, IEEE, Yun Chen, and Guoping Lu

Abstract—This paper concerns ultimately bounded output-feedback control problems for networked systems with unknown nonlinear dynamics. Sensor-to-observer signal transmission is facilitated over networks that have communication constraints. These transmissions are carried out over an unreliable communication channel. In order to enhance the utilization rate of measurement data, a buffer-aided strategy is novelly employed to store historical measurements when communication networks are inaccessible. Using the neural network technique, a novel observer-based controller is introduced to address the effects of signal transmission behaviors and unknown nonlinear dynamics. Through the application of stochastic analysis and Lyapunov stability, a joint framework is constructed for analyzing the resultant system performance under the introduced controller. Subsequently, existence conditions for the desired output-feedback controller are delineated. The required parameters for the observer-based controller are then determined by resolving some specific matrix inequalities. Finally, a simulation example is showcased to confirm the efficacy of the method.

Index Terms—Buffer-aided strategy, neural networks, nonlinear control, output-feedback control, unreliable communication channel.

Abbreviations and Notations
HJB                          Hamilton-Jacobi-Bellman
ADP                          Adaptive dynamic programming
NN                           Neural network
NCSs                         Networked control systems
NNW                          Neural network weight
LMI                          Linear matrix inequality
$\mathbb{R}^n$               The n-dimensional Euclidean space
$\mathbb{R}^{n\times m}$     The set of all $n \times m$ real matrices
$\mathbb{N}$                 The set of nonnegative integers
$U \ge F$                    $U - F$ is positive semi-definite
$U > F$                      $U - F$ is positive definite
$S^T$                        The transpose of the matrix $S$
$\mathrm{tr}\{S\}$           The trace of the matrix $S$
$\|S\|$                      The Frobenius norm of the matrix $S$
$\lambda_{\min}(S)$          The minimum eigenvalue of $S$
$b^{-1}(\cdot)$              The inverse function of $b(\cdot)$
$\mathrm{Prob}\{\cdot\}$     The occurrence probability of the random event "$\cdot$"
$\mathbb{E}\{x\}$            The expectation of the stochastic variable $x$
$\mathbb{E}\{x|y\}$          The expectation of $x$ conditional on $y$
$0$                          Zero matrix of compatible dimension
$I$                          Identity matrix of compatible dimension
$\mathrm{diag}\{\cdots\}$    The block-diagonal matrix
"$*$"                        The symmetric parts in the symmetric block matrix

Manuscript received October 17, 2023; revised November 26, 2023; accepted February 4, 2024. This work was supported in part by the National Natural Science Foundation of China (61933007, 62273087, U22A2044, 61973102, 62073180), the Shanghai Pujiang Program of China (22PJ1400400), and the Royal Society of the UK, and the Alexander von Humboldt Foundation of Germany. Recommended by Associate Editor Xiaohua Ge. (Corresponding author: Zidong Wang.)
Citation: Y. Zhang, Z. Wang, L. Zou, Y. Chen, and G. Lu, “Ultimately bounded output feedback control for networked nonlinear systems with unreliable communication channel: A buffer-aided strategy,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 7, pp. 1566–1578, Jul. 2024.
Y. Zhang is with the College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, China (e-mail: [email protected]).
Z. Wang is with the College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, China, and also with the Department of Computer Science, Brunel University London, Uxbridge, Middlesex, UB8 3PH, United Kingdom (e-mail: Zidong.[email protected]).
L. Zou is with the College of Information Science and Technology, Donghua University, Shanghai 201620, and also with the Engineering Research Center of Digitalized Textile and Fashion Technology, Ministry of Education, Shanghai 201620, China (e-mail: [email protected]).
Y. Chen is with the School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China (e-mail: [email protected]).
G. Lu is with the School of Electrical Engineering, Nantong University, Nantong 226019, China (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JAS.2024.124314

I. Introduction
Over the past few decades, wide interest has been shown in optimal control problems due to their significance in the fields of finance, ecology, power systems, and aerospace [1]−[4]. The aim of optimal control is to minimize (or maximize) a certain performance index function for a given system while adhering to certain physical constraints. It is widely recognized that the gains of optimal controllers are typically derived from solving Hamilton-Jacobi-Bellman (HJB) equations. In linear cases, HJB equations simplify to Riccati equations, allowing the controllers' gain matrices to be parameterized upon solving these equations. However, for nonlinear systems, solving HJB equations becomes notably challenging because of the complexities introduced by inherent nonlinearities [5].
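
To make the linear special case concrete, the following minimal sketch iterates the discrete-time Riccati recursion for a quadratic stage cost $x_k^T M x_k + u_k^T R u_k$ and reads off the state-feedback gain $u_k = -Kx_k$. It is an illustration only: the system matrices, the weights `M` and `R`, the iteration count and the function name `riccati_gain` are assumptions chosen for the example, not values taken from the paper.

```python
import numpy as np

def riccati_gain(A, B, M, R, iters=2000):
    """Iterate P <- M + A^T P (A - B K), with K = (R + B^T P B)^{-1} B^T P A."""
    P = M.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = M + A.T @ P @ (A - B @ K)
    # Gain associated with the final P
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

# Hypothetical double-integrator-like plant (assumed values)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
M = np.eye(2)
R = np.array([[1.0]])

P, K = riccati_gain(A, B, M, R)
rho = max(abs(np.linalg.eigvals(A - B @ K)))  # spectral radius of the closed loop
print("gain:", K, "closed-loop spectral radius:", rho)
```

For a nonlinear plant no such closed-form recursion is available, which is the gap that the NN-based approximations discussed next are meant to fill.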

Authorized licensed use limited to: Anhui University of Technology. Downloaded on March 19,2025 at 03:14:19 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: ULTIMATELY BOUNDED OUTPUT FEEDBACK CONTROL FOR NETWORKED NONLINEAR SYSTEMS 1567

In recent years, adaptive dynamic programming (ADP) has gradually gained much research attention. Leveraging actor/critic neural networks (NNs) known for their superior approximation capabilities, ADP has been extensively employed to tackle optimal control problems with both known and unknown nonlinear dynamics [6]−[9]. The ADP-based algorithms have garnered significant research attention, and numerous notable results can be found in [10]−[12]. Although much of the research on ADP-based control has centered on state feedback, practical engineering often limits access to full state information of systems. This limitation, caused either by budget constraints or complex external environments, has steered engineers towards favoring ADP-based output-feedback control strategies [13].

Networked control systems (NCSs) denote dynamical systems in which distinct system components communicate through a network characterized by limited bandwidth [14]−[17]. Over the past two decades, rapid advancements in network-based communication technology have significantly expanded the potential of NCSs [18]−[20]. Enhanced data transmission rates, improved error correction methods, and the rise of machine learning techniques for network optimization have all combined to elevate the capabilities of these systems. As a result, NCSs have permeated a myriad of practical engineering fields including spacecraft, smart grids, mobile robots, and unmanned underwater vehicles [21]−[24]. Each of these applications underscores the versatility and transformative potential of NCSs in modern engineering landscapes.

In the deployment of NCSs, the reliability of signal transmissions is significantly impacted by pervasive communication constraints. Such constraints are often manifested as limited bandwidth or finite bit rates [25]−[29]. Issues such as congestion or packet dropping can be caused by constraints like limited communication capacity. As a result, the reliability of signal transmissions can be substantially compromised, leading to diminished or even devastated estimation/control performance [30]. Due to these challenges, attention has now been drawn to control problems associated with NCSs operating over unreliable communication channels from both the control and signal processing communities. Consequently, numerous research outcomes have been documented [31], [32].

In response to the challenges posed by unreliable communication channels, the buffer-aided strategy has gained widespread acceptance in practical applications. This strategy aims to enhance the transmission of measurement signals during specific transmission instants. Initially, newly generated signals are stored in the buffer and, following this, all the signals stored (i.e., both current and historical instant signals) are transmitted to the receiver (e.g., observer) simultaneously at the designated transmission instant (often, the present moment). Once the transmission is completed, the buffer is cleared to create space for measurement signals generated in the ensuing instants [33]. Leveraging this method, a greater number of measurement signals can be harnessed by the observer for the estimation procedure. The buffer-aided strategy not only ensures a more judicious use of resources but also facilitates the attainment of the desired estimation outcomes [9]. Unfortunately, even with its profound engineering ramifications and broad application prospects, the control problems of NCSs using a buffer-aided strategy over unreliable communication channels have yet to receive the research attention they deserve.

Motivated by the aforementioned considerations, our objective is to delve into ultimately bounded output-feedback control problems, which hold both theoretical and practical significance, for nonlinear NCSs that employ a buffer-aided strategy over unreliable communication channels. The output-feedback control problem under investigation presents three anticipated yet foundational challenges: 1) How to quantify transmission unreliability and buffer-aided strategy effects? 2) How to design the tuning laws for the neural-network-weights (NNWs) for networked nonlinear systems that use a buffer-aided strategy over unreliable communication channels? and 3) How to analyze bounded stability of the considered networked nonlinear systems with a buffer-aided strategy to counteract the limited communication capacity? The primary drive of this research is, therefore, to address these challenges through a comprehensive examination.

The primary contributions are enumerated as follows.
1) The ultimately bounded output-feedback control problem is addressed for the first time for networked nonlinear systems under a buffer-aided strategy over unreliable communication channels.
2) An intricately devised ADP-based output-feedback control scheme is introduced to address system dynamics constrained by limited communication capacity and the buffer-aided strategy.
3) An adaptive tuning law is designed for the controller.
4) The ultimate boundedness affected by unreliable communication channels and the buffer-aided strategy is rigorously analyzed.

II. Problem Formulation and Preliminaries
In this paper, a nonlinear NCS is examined in which sensor-to-controller transmission is facilitated through an unreliable communication network. A buffer-aided strategy is integrated with the aim of optimizing the efficiency of measurement data utilization by archiving historical measurements during instances when the communication channel becomes inaccessible. This section is dedicated to providing an in-depth delineation of the nonlinear NCS, the peculiarities of transmission behaviors and the control methodology employed.

A. System Model and Signal Transmissions
Consider the following nonlinear system:
$$
\begin{cases}
x_{k+1} = Ax_k + f(x_k) + Bu_k + E\omega_k \\
y_k = Cx_k + D\omega_k
\end{cases} \tag{1}
$$
where $x_k \in \mathbb{R}^{n_x}$, $y_k \in \mathbb{R}^{n_y}$ and $u_k \in \mathbb{R}^{n_u}$ represent, respectively, the system state, the measurement signal and the control input. $f(\cdot)$ is an unknown but bounded smooth nonlinear function on a compact set $\Omega \subset \mathbb{R}^{n}$. $\omega_k \in \mathbb{R}^{n_\omega}$ denotes the bounded stochastic noise with zero mean and known variance $\bar{Q} = \tilde{Q}\tilde{Q}^T$. Matrices $A$, $B$, $C$, $D$ and $E$ are known.
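
As a rough illustration of the store-then-flush behaviour outlined in the Introduction, the sketch below runs plant (1) in open loop ($u_k = 0$) and collects the measurements delivered at each transmission instant. Everything numeric here is an assumption made for the example: the matrices, the `tanh` nonlinearity, the uniform noise, the finite buffer `capacity` with oldest-first displacement, the hand-picked list of transmission `intervals`, and the function name `run_buffered_channel`.

```python
from collections import deque
import numpy as np

def run_buffered_channel(A, C, D, E, f, x0, intervals, capacity, rng):
    """Simulate plant (1) with u_k = 0; return the packet delivered at each transmission."""
    x = np.asarray(x0, dtype=float)
    buffer = deque(maxlen=capacity)   # a full deque drops its oldest entry on append
    packets = []
    for h in intervals:               # h time steps elapse between two transmissions
        for _ in range(h):
            w = rng.uniform(-0.1, 0.1, size=E.shape[1])  # bounded, zero-mean noise (assumed law)
            buffer.append(C @ x + D @ w)                 # y_k is stored, not yet sent
            x = A @ x + f(x) + E @ w                     # x_{k+1}, with the B u_k term omitted
        packets.append(list(buffer))  # channel accessible: all stored signals sent at once
        buffer.clear()                # buffer emptied after delivery
    return packets

# Hypothetical example data (not from the paper)
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.1]])
E = np.array([[0.1], [0.1]])
f = lambda x: 0.05 * np.tanh(x)
intervals = [1, 3, 5, 2]

packets = run_buffered_channel(A, C, D, E, f, [1.0, -1.0], intervals, capacity=3, rng=rng)
print([len(p) for p in packets])  # [1, 3, 3, 2]
```

With a capacity of 3, an interval of 5 steps delivers only the 3 most recent measurements, whereas without a buffer only a single measurement would reach the receiver at each transmission instant.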


The communication network is now introduced. Communication between sensors and controllers transpires via an unreliable network channel, which is prone to intermittent packet dropouts during signal transmissions. Traditionally, if the communication channel is inaccessible, the measurement signals produced by the sensors would be lost. This sporadic packet dropout, in contrast to continuous transmission, inevitably impairs the estimation/control performance, attributed mainly to the “low utilization efficiency” of measurement data. The estimation/control challenges arising from unreliable or lossy networks have been the subject of extensive research. For instance, in [34], the non-fragile estimation challenge was explored for a subset of complex networks with a dynamic event-based transmission mechanism. Similarly, [35] tackled the NN-based control problem for a nonlinear system faced with intermittent packet dropouts caused by denial-of-service attacks.

To mitigate unreliable transmission, a buffer-aided mechanism is proposed to boost the utilization efficiency of measurement signals. Specifically, this mechanism operates in two distinct modes: the storage mode and the delivery mode. In the storage mode, when the communication network is inaccessible, measurement signals are retained in the buffer, which has a designated maximum buffer capacity denoted as $Q$. If the buffer is filled to its capacity, the “oldest” measurement signal stored therein will be displaced by the most recently generated signal. Conversely, in the delivery mode, when the communication network becomes accessible, all the measurement signals retained in the buffer are concurrently dispatched over the communication channel. Subsequent to this transmission, the buffer undergoes a clearing process to remove all the signals it previously held. This approach ensures that a larger volume of measurement signals is employed for control as compared to traditional methods where generated measurements are instantly discarded if the communication channel is out of service.

In this paper, the characteristics of unreliable signal transmissions are described in the following assumptions.

Assumption 1 (Transmission interval) [33]: Let $h(i)$ be the transmission interval between $t(i)$ and $t(i-1)$, i.e., $h(i) \triangleq t(i) - t(i-1)$ ($h(i) \in \mathbb{N}^+$). For $i \in \mathbb{N}^+$, $h(i)$ satisfies
$$
h(i) \in \mathcal{H} \triangleq \{1, 2, \ldots, H\}
$$
where the constant $H$ is known and positive, representing a maximum transmission interval.

Assumption 2: The transmission intervals $\{h(i)\}_{i \ge 0}$ form a sequence of random variables which are independently and identically distributed. The disturbance noise $\omega_k$ and the transmission intervals $h(i)$ are mutually uncorrelated. The occurrence probability of $h(i) = \chi$ ($\forall \chi \in \{1, \ldots, H\}$) is partially unknown, i.e.,
$$
\begin{cases}
\mathrm{Prob}\{h(i) = \iota\} = p_\iota, & \text{if } \iota \in \mathcal{H}_a \\
\mathrm{Prob}\{h(i) = \tau\} = \,?, & \text{if } \tau \in \mathcal{H}_b
\end{cases} \tag{2}
$$
where $0 \le p_\iota \le 1$ and “?”, respectively, are the known and unknown probabilities with $\sum_{h(i)=1}^{H} p_{h(i)} = 1$. $\mathcal{H}_a \triangleq \{\iota \mid p_\iota \text{ is known}\}$ and $\mathcal{H}_b \triangleq \{\tau \mid p_\tau \text{ is unknown}\}$. Obviously, it is easy to observe that $\mathcal{H}_a \cup \mathcal{H}_b = \mathcal{H}$ and $\mathcal{H}_a \cap \mathcal{H}_b = \emptyset$.

Remark 1: Assumptions 1 and 2 are quite reasonable in real-world applications. Assumption 1 is proposed based on the intermittent characteristic of the signal transmissions under the impact of the unreliable communication channels. In engineering practice, it is obvious that the transmission intervals of networked systems are upper bounded. Assumption 2 shows the typical characteristics of the buffer (i.e., limited capacity), which is preferred in practical applications in order to save economic costs. In these cases, it is of practical significance to assume that the number of signals transmitted is bounded.

Let us now consider the measurement data received by the controller. It is clear that data can only be received by the controller at transmission instants. Specifically, at each transmission instant $t(i)$, the number of measurement signals received by the controller is dictated by the amount of data retained in the buffer. By designating $q(i)$ as the count of signals preserved in the buffer at $t(i)$, it can be deduced that
$$
q(i) = \min\{Q, h(i)\}, \quad i \in \mathbb{N}^+.
$$
Accordingly, the received measurement data for the controller at time $k$ (defined as $Y_k$) is
$$
Y_k =
\begin{cases}
\{y_{k-j}\}_{j=0,1,\ldots,q(i)-1}, & \text{if } \{i \mid k = t(i),\ i \ge 0\} \neq \emptyset \\
\emptyset, & \text{if } \{i \mid k = t(i),\ i \ge 0\} = \emptyset.
\end{cases}
$$

B. Observer-Based Controller
In this study, an observer-based control strategy is employed to control the plant as defined in (1), considering the influences of both the buffer-aided strategy and unreliable signal transmissions. To address the unknown nonlinearity $f(\cdot)$, an NN-based observer is initially introduced to produce estimates, followed by the presentation of the observer-based controller policy.

According to [10], an NN is utilized to approximate $f(\cdot)$ via $W_f\varphi_f(x_k) + \zeta_{f,k}$, where $\varphi_f(\cdot)$, $W_f \in \mathbb{R}^{n_x \times n_x}$ and $\zeta_{f,k} \in \mathbb{R}^{n_x}$ denote the activation function, the ideal weight matrix and the approximation error of the NN, respectively. Thus, we have
$$
\begin{cases}
x_{k+1} = Ax_k + W_f\varphi_f(x_k) + Bu_k + E\omega_k + \zeta_{f,k} \\
y_k = Cx_k + D\omega_k.
\end{cases} \tag{3}
$$
Here, it is reasonable to assume that
$$
\|W_*\| \le \bar{W}_*, \quad \|\varphi_*(\cdot)\| \le \bar{\varphi}_*, \quad \|\zeta_{*,k}\| \le \bar{\zeta}_*
$$
where $\bar{W}_*$, $\bar{\varphi}_*$, and $\bar{\zeta}_*$ are known positive constants, and $*$ represents $f$ or other symbols.

According to the received measurement data $Y_k$, the following observer is utilized to acquire the desired estimates:
$$
\begin{cases}
\text{Case 1: if } \{i \mid k = t(i),\ i \ge 0\} = \emptyset \\
\quad \hat{x}_{k+1} = A\hat{x}_k + \hat{W}_{f,k}\varphi_f(\hat{x}_k) + Bu_k \\
\text{Case 2: if } \{i \mid k = t(i),\ i \ge 0\} \neq \emptyset \\
\quad \vec{x}_{j+1} = A\vec{x}_j + \vec{W}_{f,j}\varphi_f(\vec{x}_j) + Bu_j + L_{h(i)}(y_j - C\vec{x}_j), \quad t(i) - q(i) + 1 \le j \le t(i) \\
\quad \hat{x}_{k+1} = \vec{x}_{k+1}
\end{cases} \tag{4}
$$
where $\{\vec{x}_{j+1}\}_{t(i)-q(i)+1 \le j \le t(i)}$ are the so-called “reorganized”
state estimates with $\vec{x}_{t(i)-q(i)+1} = \hat{x}_{t(i)-q(i)+1}$, $\hat{x}_k$ and $\hat{W}_{f,k}$ are the estimates of $x_k$ and $W_f$, respectively. $\vec{W}_{f,j}$ is the reorganized estimate value of $W_f$. Here, $L_{h(i)}$ is the observer gain.

The adaptive tuning law is
$$
\begin{cases}
\text{Case 1: if } \{i \mid k = t(i),\ i \ge 0\} = \emptyset \\
\quad \hat{W}_{f,k+1} = \hat{W}_{f,k} - \alpha_1\alpha_2\hat{W}_{f,k} \\
\text{Case 2: if } \{i \mid k = t(i),\ i \ge 0\} \neq \emptyset \\
\quad \vec{W}_{f,j+1} = \vec{W}_{f,j} + \alpha_1\big(C^T(y_{j+1} - C\vec{x}_{j+1})\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_2\vec{W}_{f,j}\big), \quad t(i) - q(i) + 1 \le j \le t(i) \\
\quad \hat{W}_{f,k+1} = \vec{W}_{f,k+1}
\end{cases} \tag{5}
$$
where $\vec{W}_{f,t(i)-q(i)+1} = \hat{W}_{f,t(i)-q(i)+1}$, $\alpha_1$ and $\alpha_2$ are two positive tuning scalars, and $\tilde{\varphi}_f(\vec{x}_k) \triangleq \varphi_f(\vec{x}_k)/(\|1 + \varphi_f^T(\vec{x}_k)\varphi_f(\vec{x}_k)\|\,\|C^TC\|)$.

Now, we are ready to consider the observer-based control strategy based on $\hat{x}_k$. The desired control input is calculated by minimizing $J(x_k)$ (i.e., $u_k = \arg\min\{J(x_k)\}$), where
$$
J(x_k) \triangleq \sum_{j=k}^{\infty} l(x_j, u_j) \tag{6}
$$
with the utility function $l(x_k, u_k) \triangleq x_k^TMx_k + u_k^TRu_k$, $l(0, 0) = 0$, and $l(x_k, u_k) \ge 0$ for any $x_k$ and $u_k$. This paper aims to design a suboptimal control strategy to optimize (6). Unfortunately, such a minimization problem is quite difficult to solve since the value of $x_k$ is unknown. An alternative method is to generate the desired control input by minimizing an approximated cost function $\hat{J}(\hat{x}_k)$.

According to the universal approximation property of the NN, it is easy to see that $J(x_k)$ can be approximated by an NN (namely, the critic NN)
$$
J(x_k) = W_J^T\varphi_J(x_k) + \zeta_{J,k} \tag{7}
$$
where $W_J$ is the ideal weight, $\varphi_J(x_k)$ is the corresponding activation function, and $\zeta_{J,k}$ is the bounded approximation error. Similarly, the ideal control input (i.e., $u_k = \arg\min\{J(x_k)\}$) can also be approximated by an NN (namely, the actor NN)
$$
u(x_k) = W_u^T\varphi_u(x_k) + \zeta_{u,k} \tag{8}
$$
where $W_u$ is the ideal weight matrix for the actor NN, $\varphi_u(x_k)$ is the corresponding activation function, and $\zeta_{u,k}$ is the bounded approximation error.

Since the plant state is inaccessible, the developed control strategy is based on the state estimates $\hat{x}_k$. Accordingly, the approximated cost function $\hat{J}(\hat{x}_k)$ and control input are
$$
\hat{J}(\hat{x}_k) = \hat{W}_{J,k}^T\varphi_J(\hat{x}_k) \tag{9}
$$
and
$$
\hat{u}(\hat{x}_k) = \hat{W}_{u,k}^T\varphi_u(\hat{x}_k) \tag{10}
$$
where $\hat{W}_{J,k}$ and $\hat{W}_{u,k}$ denote the estimates of $W_J$ and $W_u$, respectively. The detailed design procedure for the parameters $\hat{W}_{J,k}$ and $\hat{W}_{u,k}$ will be introduced in Section III-B.

Remark 2: In linear cases, HJB equations can be reduced to Riccati equations, which are straightforward to solve. However, for nonlinear systems, finding a solution to the HJB equation often proves challenging due to the presence of intricate nonlinearities within the system. In response, the ADP algorithm, leveraging the actor/critic NNs, has been introduced as an optimal control solution for these nonlinear systems. A detailed description of the control strategy design will be provided subsequently.

Before proceeding further, we shall introduce some performance requirements about exponential ultimate boundedness in mean square.

Definition 1 [36]: The discrete nonlinear system (1) is said to be exponentially ultimately bounded (EUB) in mean square if there exist constants $\vartheta > 0$, $0 \le \varrho < 1$ and $\varsigma > 0$ such that, for any solution $x_k$ with the initial condition $x_0$, the following is true:
$$
\mathbb{E}[\|x_k\|^2] \le \vartheta\|x_0\|^2\varrho^k + \varsigma, \quad k \ge 0
$$
where $\varsigma$ is an asymptotic upper bound in mean square of (1).

The objectives are twofold.
1) Design the observer parameter $L_{h(i)}$ such that the estimation error (i.e., $x_k - \hat{x}_k$) is EUB in mean square.
2) Design the weight update laws and analyze ultimate boundedness.

III. Main Results

A. Observer Design
Utilizing a buffer-aided strategy, an NN-based observer will be constructed to address unreliable signal transmission scenarios. Since the suboptimal control strategy $u_k$ is derived based on $\hat{x}_k$, the error dynamics proves crucial for achieving precise control. Subsequently, a joint analysis of the EUB of the estimation errors for both the state and the NNW will be undertaken.

Defining $\tilde{W}_{f,k} \triangleq W_f - \hat{W}_{f,k}$ and $\check{W}_{f,k} \triangleq W_f - \vec{W}_{f,k}$ as the estimation error and the reorganized estimation error of the nonlinear NNW, respectively, the error dynamics is
$$
\begin{cases}
\text{Case 1: if } \{i \mid k = t(i),\ i \ge 0\} = \emptyset \\
\quad \tilde{W}_{f,k+1} = (1 - \alpha_1\alpha_2)\tilde{W}_{f,k} + \alpha_1\alpha_2W_f \\
\text{Case 2: if } \{i \mid k = t(i),\ i \ge 0\} \neq \emptyset \\
\quad \check{W}_{f,j+1} = (1 - \alpha_1\alpha_2)\check{W}_{f,j} - \alpha_1C^TC\check{W}_{f,j}\varphi_f(\vec{x}_j)\tilde{\varphi}_f^T(\vec{x}_j) \\
\qquad - \alpha_1C^TD\omega_{j+1}\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1C^TC\check{\zeta}_{f,j}\tilde{\varphi}_f^T(\vec{x}_j) \\
\qquad + \alpha_1\alpha_2W_f - \alpha_1C^TC\bar{E}_{h(i)}\omega_j\tilde{\varphi}_f^T(\vec{x}_j) \\
\qquad - \alpha_1C^TC\bar{A}_{h(i)}\check{x}_j\tilde{\varphi}_f^T(\vec{x}_j), \quad t(i) - q(i) + 1 \le j \le t(i) \\
\quad \tilde{W}_{f,k+1} = \check{W}_{f,k+1}
\end{cases} \tag{11}
$$
with $\tilde{W}_{f,t(i)-q(i)+1} = \check{W}_{f,t(i)-q(i)+1}$, $\bar{A}_{h(i)} \triangleq A - L_{h(i)}C$, $\bar{E}_{h(i)} \triangleq E - L_{h(i)}D$ and $\check{\zeta}_{f,k} \triangleq W_f\big(\varphi_f(x_k) - \varphi_f(\vec{x}_k)\big) + \zeta_{f,k}$.

Let the estimation error and the reorganized estimation error be $\tilde{x}_k \triangleq x_k - \hat{x}_k$ and $\check{x}_k \triangleq x_k - \vec{x}_k$. The error dynamics is governed by
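$$
\begin{cases}
\text{Case 1: if } \{i \mid k = t(i),\ i \ge 0\} = \emptyset \\
\quad \tilde{x}_{k+1} = A\tilde{x}_k + \tilde{W}_{f,k}\varphi_f(\hat{x}_k) + E\omega_k + \check{\zeta}_{f,k} \\
\text{Case 2: if } \{i \mid k = t(i),\ i \ge 0\} \neq \emptyset \\
\quad \check{x}_{j+1} = \bar{A}_{h(i)}\check{x}_j + \check{W}_{f,j}\varphi_f(\vec{x}_j) + \bar{E}_{h(i)}\omega_j + \check{\zeta}_{f,j}, \quad t(i) - q(i) + 1 \le j \le t(i) \\
\quad \tilde{x}_{k+1} = \check{x}_{k+1}
\end{cases} \tag{12}
$$
with $\check{x}_{t(i)-q(i)+1} = \tilde{x}_{t(i)-q(i)+1}$.

Theorem 1: Let the state estimator gain $L_{h(i)}$ be given. Assume that there exist scalars $\delta > 0$, $\mu_1 > 0$, $0 < \mu_2 < 1$, $0 < \alpha_i < 1$ ($i = 1, 2$), $\sigma_s > 0$ ($s = 1, 2, 3, 4, 5$), and positive definite matrices $P$, $\Phi_l$ ($l = 1, 2, \ldots, 7$) such that the following conditions hold:
$$
\Pi_1 < 0 \tag{13}
$$
$$
\Pi_2 < 0 \tag{14}
$$
$$
\Xi_1 < 0 \tag{15}
$$
$$
\Xi_2 < 0 \tag{16}
$$
$$
C^TCPC^TC - \sigma_3\|C^TC\|^2P \le 0 \tag{17}
$$
$$
\sum_{s=M+1}^{H}\bar{p}_s(1+\mu_1)^{s-M}(1-\mu_2)^{M} + \sum_{s=1}^{M}\bar{p}_s(1-\mu_2)^{s} < 1 \tag{18}
$$
$$
\bar{p}_s \triangleq
\begin{cases}
p_s, & \text{if } s \in \mathcal{H}_a \\
1 - \sum_{\iota \in \mathcal{H}_a}\bar{p}_\iota, & \text{if } s \in \mathcal{H}_b
\end{cases} \tag{19}
$$
where
$$
\Pi_1 \triangleq \begin{bmatrix} \Pi_1^{11} & \Pi_1^{12} \\ * & \Pi_1^{22} \end{bmatrix}, \quad
\Pi_2 \triangleq \begin{bmatrix} \Pi_2^{11} & 0 & 0 \\ * & \Pi_2^{22} & 0 \\ * & * & \Pi_2^{33} \end{bmatrix}
$$
$$
\Xi_1 \triangleq \begin{bmatrix} \Xi_1^{11} & \Xi_1^{12} & 0 & \Xi_1^{14} \\ * & \Xi_1^{22} & 0 & \Xi_1^{24} \\ * & * & \Xi_1^{33} & 0 \\ * & * & * & \Xi_1^{44} \end{bmatrix}, \quad
\Xi_2 \triangleq \begin{bmatrix} \Xi_2^{11} & \Xi_2^{12} & 0 & \Xi_2^{14} \\ * & \Xi_2^{22} & 0 & \Xi_2^{24} \\ * & * & \Xi_2^{33} & 0 \\ * & * & * & \Xi_2^{44} \end{bmatrix}
$$
$$
\begin{aligned}
& \varepsilon_1 \triangleq \delta\tilde{\alpha}\bar{\varepsilon}, \quad \bar{\varepsilon} \triangleq 1 - \alpha_1\alpha_2 + 4\alpha_1 + \alpha_1\sigma_4 + \alpha_1\alpha_2\sigma_5 \\
& \varepsilon_4 \triangleq 1/\|C^TC\|^2, \quad \bar{\alpha} \triangleq 1 - \alpha_1\alpha_2 + 4\alpha_1, \quad \varepsilon_5 = 1 + \alpha_1^2 \\
& \varepsilon_6 \triangleq \delta\alpha_1\sigma_3\sigma_4^{-1}\bar{\alpha} + 2\sigma_3\alpha_1^2, \quad \varepsilon_2 \triangleq \varepsilon_3 \triangleq \delta\alpha_1\sigma_3\bar{\varepsilon} \\
& \varepsilon_7 \triangleq \delta\alpha_1\alpha_2\sigma_5^{-1}\bar{\alpha} + 2\alpha_1^2\alpha_2^2 \\
& \Pi_1^{11} \triangleq \delta\big(\tilde{\alpha}^2 - (1+\mu_1)\big)P + \sigma_1\bar{\varphi}_f^2I \\
& \Pi_1^{12} \triangleq \delta\tilde{\alpha}\alpha_1\alpha_2P, \quad \Xi_1^{24} \triangleq P, \quad \Xi_1^{12} \triangleq A^TP \\
& \Pi_1^{22} \triangleq \delta\alpha_1^2\alpha_2^2P - \Phi_3, \quad \Xi_1^{11} \triangleq A^TPA - (1+\mu_1)P \\
& \Xi_1^{22} \triangleq P - \sigma_1I, \quad \Xi_1^{33} \triangleq E^TPE - \Phi_1, \quad \Xi_1^{44} \triangleq P - \Phi_2 \\
& \Xi_1^{14} \triangleq A^TP, \quad \Pi_2^{22} \triangleq \varepsilon_4D^TCPC^TD - \Phi_6 \\
& \Pi_2^{11} \triangleq \big(\varepsilon_1 + \varepsilon_2\bar{\varphi}_f^2 - \delta(1-\mu_1)\big)P + \sigma_2\bar{\varphi}_f^2I \\
& \Xi_2^{11} \triangleq (1+\varepsilon_3)\bar{A}_{h(i)}^TP\bar{A}_{h(i)} - (1-\mu_2)P \\
& \Xi_2^{24} \triangleq P, \quad \Xi_2^{12} \triangleq \Xi_2^{14} \triangleq \bar{A}_{h(i)}^TP, \quad \Pi_2^{33} \triangleq \varepsilon_7P - \Phi_7 \\
& \Xi_2^{44} \triangleq (1+\varepsilon_6)P - \Phi_5, \quad \Xi_2^{22} \triangleq P - \sigma_2I \\
& \Xi_2^{33} \triangleq \varepsilon_5\bar{E}_{h(i)}^TP\bar{E}_{h(i)} - \Phi_4.
\end{aligned}
$$
Then, both the error dynamics (11) and (12) are EUB in mean square subject to $\omega_k$.

Proof: To begin with, we construct the following Lyapunov-like function:
$$
V_k \triangleq V_{1,k} + V_{2,k} \tag{20}
$$
where
$$
V_{1,k} \triangleq \tilde{x}_k^TP\tilde{x}_k, \quad V_{2,k} \triangleq \delta\,\mathrm{tr}\{\tilde{W}_{f,k}^TP\tilde{W}_{f,k}\}.
$$
Since the observer has no measurement signal to utilize when $k \neq t(i)$, the error dynamics (11) and (12) would undergo an increment. Fortunately, at $t(i)$, the buffer signal packet would be transmitted to the observer. With the aid of the signal packet, the estimation value of system state and nonlinear NNW from $t(i) - q(i) + 2$ to $t(i)$ would be regenerated, and then those regenerated estimates would be utilized to generate the state estimate of $t(i) + 1$ (as seen in (4) and (5)). In this way, the increment would be compensated by the decrement, and the overall error dynamics (for both state and NNW estimation) would be EUB in mean square. Therefore, the following analysis of the error dynamics of state and NNW estimation is implemented based on (11) and (12). Consider two cases.

Case 1: $\{i \mid k = t(i),\ i \ge 0\} = \emptyset$
In this case, there exists a positive scalar $i$ satisfying $t(i) < k \le t(i+1) - q(i)$. Denote $\Delta V_k$ as the difference between $V_{k+1}$ and $V_k$, i.e.,
$$
\Delta V_k = \sum_{r=1}^{2}\Delta V_{r,k} = \sum_{r=1}^{2}(V_{r,k+1} - V_{r,k}). \tag{21}
$$
According to the estimation error dynamics (12), by calculating the mathematical expectation of $\mathbb{E}\{\Delta V_k - \mu_1V_k\}$, we can easily obtain that
$$
\mathbb{E}\{\Delta V_k - \mu_1V_k\} = \mathbb{E}\{V_{1,k+1} - (1+\mu_1)V_{1,k} + V_{2,k+1} - (1+\mu_1)V_{2,k}\} \tag{22}
$$
where
$$
\begin{aligned}
\mathbb{E}\{V_{1,k+1} - (1+\mu_1)V_{1,k}\}
= \mathbb{E}\big\{ & 2\tilde{x}_k^TA^TP\tilde{W}_{f,k}\varphi_f(\hat{x}_k) + 2\tilde{x}_k^TA^TP\check{\zeta}_{f,k} + \tilde{x}_k^TA^TPA\tilde{x}_k \\
& + 2\varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^TP\check{\zeta}_{f,k} + \varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^TP\tilde{W}_{f,k}\varphi_f(\hat{x}_k) \\
& + \omega_k^T\Phi_1\omega_k + \omega_k^T(E^TPE - \Phi_1)\omega_k + \check{\zeta}_{f,k}^T\Phi_2\check{\zeta}_{f,k} \\
& + \check{\zeta}_{f,k}^T(P - \Phi_2)\check{\zeta}_{f,k} - (1+\mu_1)\tilde{x}_k^TP\tilde{x}_k \big\}
\end{aligned} \tag{23}
$$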
q(i+1)−h(i+1)
and Vt(i+1)−q(i+1)+1 ≤ π̄1 Vt(i)+1 + d̄1 (29)
E{V2,k+1 − (1 + µ1 )V2,k } q(i+1)−h(i+1)+1
π̄ −π̄1
{ { where d̄1 ≜ d1 1 1−π̄1 .
= δtr E (1 − α1 α2 )2 W̃ Tf,k PW̃ f,k + 2(1 − α1 α2 ) Case 2: {i|k = t(i), i ≥ 0} , ∅
In this case, there exists a positive scalar i such that
× α1 α2 W̃ Tf,k PW f − (1 + µ1 )W̃kT PW̃ f,k k = t(i + 1). Furthermore, under the effects of buffer-aided
}} strategy, the available measurement signals (i.e., Yt(i+1) =
+ W Tf (α21 α22 P − Φ3 )W f + W Tf Φ3 W f . (24)
{yt(i+1) , yt(i+1)−1 , . . . , yt(i+1)−q(i+1)+1 }) are utilized to facilitate the
Subsequently, by means of state estimation process, where the reorganized estimated
states and NNWs are acquired (as shown in (4) and (5)). Then,
σ1 φTf ( x̂k )W̃ Tf,k W̃ f,k φ f ( x̂k ) − σ1 φ̄2f tr{W̃ Tf,k W̃ f,k } ≤ 0 (25) the desired state estimate x̂t(i+1)+1 is generated based on the
and considering (22) to (25), we have reorganized estimated states.
For t(i + 1) − q(i + 1) + 1 ≤ j < t(i + 1) + 1, letting V̌ j ≜ V̌1, j +
E{∆Vk − µ1 Vk }
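$$
\begin{aligned}
\le{}& E\Big\{ 2\tilde{x}_k^T A^T P\tilde{W}_{f,k}\varphi_f(\hat{x}_k) + 2\tilde{x}_k^T A^T P\check{\zeta}_{f,k} + 2\varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^T P\check{\zeta}_{f,k} \\
&\quad + \tilde{x}_k^T A^T P A\tilde{x}_k + \varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^T P\tilde{W}_{f,k}\varphi_f(\hat{x}_k) + \check{\zeta}_{f,k}^T\Phi_2\check{\zeta}_{f,k} \\
&\quad + \omega_k^T\Phi_1\omega_k + \omega_k^T(E^T P E - \Phi_1)\omega_k + \check{\zeta}_{f,k}^T(P - \Phi_2)\check{\zeta}_{f,k} - (1+\mu_1)\tilde{x}_k^T P\tilde{x}_k \\
&\quad + \sigma_1\bar{\varphi}_f^2\,\mathrm{tr}\{\tilde{W}_{f,k}^T\tilde{W}_{f,k}\} + \delta\,\mathrm{tr}\big\{\tilde{\alpha}^2\tilde{W}_{f,k}^T P\tilde{W}_{f,k} + 2\tilde{\alpha}\alpha_1\alpha_2\tilde{W}_{f,k}^T P W_f - (1+\mu_1)\tilde{W}_{f,k}^T P\tilde{W}_{f,k}\big\} \\
&\quad + W_f^T(\delta\alpha_1^2\alpha_2^2 P - \Phi_3)W_f + W_f^T\Phi_3 W_f \Big\} \\
\le{}& E\{\gamma_k^T\Pi_1\gamma_k + \eta_k^T\Xi_1\eta_k\} + d_1
\end{aligned}
\tag{26}
$$
where
$$
\begin{aligned}
\gamma_k &\triangleq [\tilde{W}_{f,k}^T \;\; W_f^T]^T, \qquad \eta_k \triangleq [\tilde{x}_k^T \;\; \varphi_f^T(\hat{x}_k)\tilde{W}_{f,k}^T \;\; \omega_k^T \;\; \check{\zeta}_k^T]^T \\
d_1 &\triangleq \mathrm{tr}\{\tilde{Q}^T\Phi_1\tilde{Q} + \Phi_2\tilde{\zeta}^2 + \Phi_3\bar{W}_f^2\}, \qquad \tilde{\zeta} \triangleq 2\bar{W}_f\bar{\varphi}_f + \bar{\zeta}_f, \qquad \tilde{\alpha} \triangleq 1 - \alpha_1\alpha_2.
\end{aligned}
$$
Taking (13), (15) and (26) into account, we arrive at
$$
E\{\Delta V_k - \mu_1 V_k\} \le \gamma_k^T\Pi_1\gamma_k + \eta_k^T\Xi_1\eta_k + d_1 \le d_1. \tag{27}
$$
Therefore, for any $t(i)+1 \le k < t(i+1)-q(i+1)+1$ and positive scalar $\pi_1$, we have
$$
\begin{aligned}
\pi_1^{k+1}V_{k+1} - \pi_1^{k}V_k &= \pi_1^{k+1}(V_{k+1} - V_k) + \pi_1^{k}(\pi_1 - 1)V_k \\
&\le \pi_1^{k}(\pi_1 + \mu_1\pi_1 - 1)V_k + \pi_1^{k+1}d_1.
\end{aligned}
\tag{28}
$$
Defining $\bar{\pi}_1 \triangleq 1/(1+\mu_1)$ and calculating the sum of both sides of (28) from $t(i)+1$ to $t(i+1)-q(i+1)+1$ with respect to $k$, we have
$$
\begin{aligned}
&\bar{\pi}_1^{\,t(i+1)-q(i+1)+1}V_{t(i+1)-q(i+1)+1} - \bar{\pi}_1^{\,t(i)+1}V_{t(i)+1} \\
&\quad\le d_1\sum_{\phi=t(i)+2}^{t(i+1)-q(i+1)+1}\bar{\pi}_1^{\phi} = d_1\frac{\bar{\pi}_1^{\,t(i)+2} - \bar{\pi}_1^{\,t(i+1)-q(i+1)+2}}{1-\bar{\pi}_1}
\end{aligned}
$$
which implies

$\check{V}_{2,j} \triangleq \check{x}_j^T P\check{x}_j + \delta\,\mathrm{tr}\{\check{W}_{f,j}^T P\check{W}_{f,j}\}$ and calculating the mathematical expectation of $E\{\check{V}_{j+1} - \check{V}_j\}$, we have
$$
E\{\Delta\check{V}_j\} = E\{\check{V}_{1,j+1} + \check{V}_{2,j+1} - \check{V}_{1,j} - \check{V}_{2,j}\} \tag{30}
$$
where
$$
\begin{aligned}
E\{\check{V}_{1,j+1} - \check{V}_{1,j}\}
={}& E\Big\{ 2\check{x}_j^T\bar{A}_{h(i)}^T P\check{W}_{f,j}\varphi_f(\vec{x}_j) + 2\check{x}_j^T\bar{A}_{h(i)}^T P\check{\zeta}_{f,j} + \check{x}_j^T\bar{A}_{h(i)}^T P\bar{A}_{h(i)}\check{x}_j \\
&\quad + 2\varphi_f^T(\vec{x}_j)\check{W}_{f,j}^T P\check{\zeta}_{f,j} + \varphi_f^T(\vec{x}_j)\check{W}_{f,j}^T P\check{W}_{f,j}\varphi_f(\vec{x}_j) \\
&\quad + 2\omega_j^T\bar{E}_{h(i)}^T P\check{\zeta}_{f,j} + \omega_j^T(\bar{E}_{h(i)}^T P\bar{E}_{h(i)} - \Phi_4)\omega_j \\
&\quad + \check{\zeta}_{f,j}^T(P - \Phi_5)\check{\zeta}_{f,j} + \omega_j^T\Phi_4\omega_j + \check{\zeta}_{f,j}^T\Phi_5\check{\zeta}_{f,j} \\
&\quad - (1-\mu_2)\check{x}_j^T P\check{x}_j - \mu_2\check{x}_j^T P\check{x}_j \Big\}
\end{aligned}
\tag{31}
$$
and
$$
\begin{aligned}
E\{\check{V}_{2,j+1} - \check{V}_{2,j}\}
={}& \delta\,\mathrm{tr}\Big\{E\Big\{\big(\tilde{\alpha}\check{W}_{f,j} - \alpha_1 C^T C\check{W}_{f,j}\varphi_f(\vec{x}_j)\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T C\bar{A}_{h(i)}\check{x}_j\tilde{\varphi}_f^T(\vec{x}_j) \\
&\qquad - \alpha_1 C^T D\omega_{j+1}\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T C\bar{E}_{h(i)}\omega_j\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T C\check{\zeta}_j\tilde{\varphi}_f^T(\vec{x}_j) + \alpha_1\alpha_2 W_f\big)^T \\
&\quad\times P\big(\tilde{\alpha}\check{W}_{f,j} - \alpha_1 C^T C\check{W}_{f,j}\varphi_f(\vec{x}_j)\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T C\bar{A}_{h(i)}\check{x}_j\tilde{\varphi}_f^T(\vec{x}_j) \\
&\qquad - \alpha_1 C^T D\omega_{j+1}\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T C\bar{E}_{h(i)}\omega_j\tilde{\varphi}_f^T(\vec{x}_j) - \alpha_1 C^T C\check{\zeta}_j\tilde{\varphi}_f^T(\vec{x}_j) + \alpha_1\alpha_2 W_f\big) \\
&\quad - (1 - \mu_2 + \mu_2)\check{W}_{f,j}^T P\check{W}_{f,j}\Big\}\Big\}.
\end{aligned}
\tag{32}
$$
Furthermore, by means of $C^T C P C^T C \le \sigma_3\|C^T C\|^2 P$, (32) can be calculated as
$$
\begin{aligned}
E\{\check{V}_{2,j+1} - \check{V}_{2,j}\}
\le{}& \delta\,\mathrm{tr}\Big\{E\Big\{\varepsilon_1\check{W}_{f,j}^T P\check{W}_{f,j} + \varepsilon_2\bar{\varphi}_f^2\check{W}_{f,j}^T P\check{W}_{f,j} + \varepsilon_3\check{x}_j^T\bar{A}_{h(i)}^T P\bar{A}_{h(i)}\check{x}_j \\
&\quad + \varepsilon_4\omega_{j+1}^T D^T C P C^T D\omega_{j+1} + \varepsilon_5\omega_j^T\bar{E}_{h(i)}^T P\bar{E}_{h(i)}\omega_j + \varepsilon_6\check{\zeta}_j^T P\check{\zeta}_j \\
&\quad + \varepsilon_7 W_f^T P W_f - (1-\mu_2)\check{W}_{f,j}^T P\check{W}_{f,j} - \mu_2\check{W}_{f,j}^T P\check{W}_{f,j}\Big\}\Big\}.
\end{aligned}
\tag{33}
$$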
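As a numerical sanity check on the Case 1 summation step, the sketch below compares two quantities. The first is the worst case of (27), obtained by iterating $V \leftarrow (1+\mu_1)V + d_1$. The second is the closed form obtained by summing (28) with $\bar{\pi}_1 = 1/(1+\mu_1)$ and dividing by the leading power of $\bar{\pi}_1$. The function names and the values of `d1` and `v0` are hypothetical and chosen only for illustration; the code treats the bound as an equality, which is the worst case rather than the actual error trajectory.

```python
def iterate_growth(v0, mu1, d1, n):
    """Apply the worst case of (27), V <- (1 + mu1) * V + d1, n times."""
    v = v0
    for _ in range(n):
        v = (1.0 + mu1) * v + d1
    return v


def closed_form_growth(v0, mu1, d1, n):
    """Closed form from summing (28) over n steps with pi_bar = 1 / (1 + mu1)."""
    pi_bar = 1.0 / (1.0 + mu1)
    geometric = (pi_bar ** (1 - n) - pi_bar) / (1.0 - pi_bar)
    return pi_bar ** (-n) * v0 + d1 * geometric


if __name__ == "__main__":
    mu1 = 1.2          # illustrative growth rate
    d1, v0 = 0.3, 2.0  # hypothetical offset and initial value
    for n in range(5):
        print(n, iterate_growth(v0, mu1, d1, n), closed_form_growth(v0, mu1, d1, n))
```

The two columns agree up to rounding, which is consistent with the geometric-series identity used in the summation: without new measurements the bound grows by a factor $1+\mu_1$ per step, so the number of such steps between transmissions has to stay bounded.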

Authorized licensed use limited to: Anhui University of Technology. Downloaded on March 19,2025 at 03:14:19 UTC from IEEE Xplore. Restrictions apply.
1572 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 11, NO. 7, JULY 2024

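Afterwards, with the help of the inequality
$$
\sigma_2\varphi_f^T(\vec{x}_j)\check{W}_{f,j}^T\check{W}_{f,j}\varphi_f(\vec{x}_j) - \sigma_2\bar{\varphi}_f^2\,\mathrm{tr}\{\check{W}_{f,j}^T\check{W}_{f,j}\} \le 0 \tag{34}
$$
we substitute (33) and (31) into (30) to obtain
$$
\begin{aligned}
E\{\Delta\check{V}_j\}
\le{}& E\Big\{ 2\check{x}_j^T\bar{A}_{h(i)}^T P\check{W}_{f,j}\varphi_f(\vec{x}_j) + 2\check{x}_j^T\bar{A}_{h(i)}^T P\check{\zeta}_{f,j} + (1+\varepsilon_3)\check{x}_j^T\bar{A}_{h(i)}^T P\bar{A}_{h(i)}\check{x}_j \\
&\quad + 2\varphi_f^T(\vec{x}_j)\check{W}_{f,j}^T P\check{\zeta}_{f,j} + \varphi_f^T(\vec{x}_j)\check{W}_{f,j}^T P\check{W}_{f,j}\varphi_f(\vec{x}_j) \\
&\quad - (1-\mu_2)\check{x}_j^T P\check{x}_j - \mu_2\check{x}_j^T P\check{x}_j - \sigma_2\varphi_f^T(\vec{x}_j)\check{W}_{f,j}^T\check{W}_{f,j}\varphi_f(\vec{x}_j) \\
&\quad + \omega_j^T\big((1+\varepsilon_5)\bar{E}_{h(i)}^T P\bar{E}_{h(i)} - \Phi_4\big)\omega_j + \omega_j^T\Phi_4\omega_j + \check{\zeta}_{f,j}^T\Phi_5\check{\zeta}_{f,j} \\
&\quad + \check{\zeta}_{f,j}^T\big((1+\varepsilon_6)P - \Phi_5\big)\check{\zeta}_{f,j} + \delta\,\mathrm{tr}\big\{\varepsilon_1\check{W}_{f,j}^T P\check{W}_{f,j} + \delta^{-1}\sigma_2\bar{\varphi}_f^2\check{W}_{f,j}^T\check{W}_{f,j} \\
&\quad + \varepsilon_2\bar{\varphi}_f^2\check{W}_{f,j}^T P\check{W}_{f,j} + \varepsilon_4\omega_{j+1}^T D^T C P C^T D\omega_{j+1} + \varepsilon_7 W_f^T P W_f \\
&\quad - (1-\mu_2)\check{W}_{f,j}^T P\check{W}_{f,j} - \mu_2\check{W}_{f,j}^T P\check{W}_{f,j}\big\}\Big\} \\
\le{}& E\{\bar{\gamma}_j^T\Pi_2\bar{\gamma}_j + \eta_j^T\Xi_2\eta_j - \mu_2\check{V}_j\} + d_2
\end{aligned}
\tag{35}
$$
where $\bar{\gamma}_j \triangleq [\check{W}_{f,j}^T \;\; \omega_{j+1}^T \;\; W_f^T]^T$ and $d_2 \triangleq \mathrm{tr}\{\tilde{Q}^T(\Phi_4 + \Phi_6)\tilde{Q} + \Phi_5\tilde{\zeta}^2 + \Phi_7\bar{W}_f^2\}$.

It can be observed from (14), (16) and (35) that
$$
E\{\Delta\check{V}_j\} \le \bar{\gamma}_j^T\Pi_2\bar{\gamma}_j + \eta_j^T\Xi_2\eta_j - \mu_2 E\{\check{V}_j\} + d_2 \le -\mu_2 E\{\check{V}_j\} + d_2.
$$
Obviously, for any $t(i+1)-q(i+1)+1 \le k < t(i+1)$ and positive scalar $\bar{\mu}_2$, one has
$$
\begin{aligned}
\bar{\mu}_2^{\,j+1}\check{V}_{j+1} - \bar{\mu}_2^{\,j}\check{V}_j &= \bar{\mu}_2^{\,j+1}(\check{V}_{j+1} - \check{V}_j) + \bar{\mu}_2^{\,j}(\bar{\mu}_2 - 1)\check{V}_j \\
&\le \bar{\mu}_2^{\,j}(\bar{\mu}_2 - \bar{\mu}_2\mu_2 - 1)\check{V}_j + \bar{\mu}_2^{\,j+1}d_2.
\end{aligned}
\tag{36}
$$
Denoting $\bar{\mu}_2 \triangleq 1/(1-\mu_2)$ and calculating the summation in (36) from $t(i+1)-q(i+1)+1$ to $t(i+1)+1$ with respect to $k$, one has
$$
\begin{aligned}
&\bar{\mu}_2^{\,t(i+1)+1}\check{V}_{t(i+1)+1} - \bar{\mu}_2^{\,t(i+1)-q(i+1)+1}\check{V}_{t(i+1)-q(i+1)+1} \\
&\quad\le d_2\sum_{\phi=t(i+1)-q(i+1)+2}^{t(i+1)+1}\bar{\mu}_2^{\phi} = d_2\frac{\bar{\mu}_2^{\,t(i+1)-q(i+1)+2} - \bar{\mu}_2^{\,t(i+1)+2}}{1-\bar{\mu}_2}.
\end{aligned}
$$
Furthermore, it is obvious that $V_{t(i+1)+1} = \check{V}_{t(i+1)+1}$ and $V_{t(i+1)-q(i+1)+1} = \check{V}_{t(i+1)-q(i+1)+1}$. Then, we have
$$
V_{t(i+1)+1} \le \bar{\mu}_2^{-q(i+1)}V_{t(i+1)-q(i+1)+1} + \bar{d}_2 \tag{37}
$$
where $\bar{d}_2 \triangleq d_2\dfrac{\bar{\mu}_2^{-q(i+1)+1} - \bar{\mu}_2}{1-\bar{\mu}_2}$.

Aggregation of Cases 1 and 2

We now aggregate the results obtained in the analysis of Cases 1 and 2. In the following part of this subsection, we will show that the EUB of the error dynamics (11) and (12) can be simultaneously guaranteed. To this end, it is easily obtained from (29) and (37) that
$$
V_{t(i+1)+1} \le \tilde{\mu}V_{t(i)+1} + \bar{\mu}_2^{-q(i+1)}\bar{d}_1 + \bar{d}_2 \tag{38}
$$
where $\tilde{\mu} \triangleq \bar{\mu}_1^{\,q(i+1)-h(i+1)}\bar{\mu}_2^{-q(i+1)}$.

Considering (19), we have from $0 \le p_n \le 1 - \sum_{\iota\in H_a}p_\iota$ that
$$
\begin{aligned}
E\{\bar{\mu}_1^{\,q(i+1)-h(i+1)}\bar{\mu}_2^{-q(i+1)}\}
&= \sum_{s=1}^{M}p_s\bar{\mu}_1^{\,s-s}\bar{\mu}_2^{-s} + \sum_{s=M+1}^{H}p_s\bar{\mu}_1^{\,M-s}\bar{\mu}_2^{-M} \\
&\le \sum_{s=M+1}^{H}\bar{p}_s(1+\mu_1)^{s-M}(1-\mu_2)^{M} + \sum_{s=1}^{M}\bar{p}_s(1-\mu_2)^{s} \triangleq \hat{\mu}
\end{aligned}
\tag{39}
$$
and
$$
\begin{aligned}
E\{\bar{\mu}_2^{-q(i+1)}\bar{d}_1 + \bar{d}_2\}
&= E\Big\{\bar{\mu}_2^{-q(i+1)}d_1\frac{\bar{\mu}_1^{\,q(i+1)-h(i+1)+1} - \bar{\mu}_1}{1-\bar{\mu}_1} + d_2\frac{\bar{\mu}_2^{-q(i+1)+1} - \bar{\mu}_2}{1-\bar{\mu}_2}\Big\} \\
&\le \sum_{s=M+1}^{H}\bar{p}_s\Big(\bar{\mu}_2^{-M}d_1\frac{\bar{\mu}_1^{\,M-s+1} - \bar{\mu}_1}{1-\bar{\mu}_1} + d_2\frac{\bar{\mu}_2^{-M+1} - \bar{\mu}_2}{1-\bar{\mu}_2}\Big) \\
&\quad + \sum_{s=1}^{M}\bar{p}_s\Big(\bar{\mu}_2^{-s}d_1 d_2\frac{\bar{\mu}_2^{-s+1} - \bar{\mu}_2}{1-\bar{\mu}_2}\Big) \triangleq \hat{d}.
\end{aligned}
\tag{40}
$$
Then, calculating the conditional expectation of (38), we have from (39) and (40) that
$$
E\{V_{t(i+1)+1}\,|\,t(i),\tilde{x}_{t(i)}\} \le \hat{\mu}E\{V_{t(i)+1}\,|\,t(i),\tilde{x}_{t(i)}\} + \hat{d}. \tag{41}
$$
Taking the mathematical expectation of (41) gives
$$
E\{V_{t(i+1)+1}\} \le \hat{\mu}E\{V_{t(i)+1}\} + \hat{d}. \tag{42}
$$
Next, for any positive scalar $\bar{\mu}$, one has
$$
\bar{\mu}^{m+1}E\{V_{t(i+1)+1}\} - \bar{\mu}^{m}E\{V_{t(i)+1}\} \le \bar{\mu}^{m}\big(\bar{\mu} - \bar{\mu}(1-\hat{\mu}) - 1\big)E\{V_{t(i)+1}\} + \bar{\mu}^{m+1}\hat{d}. \tag{43}
$$
Subsequently, denoting $\bar{\mu} = 1/\hat{\mu}$ and summing up (43) from $t(0)+1$ to $t(z)+1$ with respect to $z$, one has
$$
\bar{\mu}^{z}E\{V_{t(z)+1}\} - E\{V_{t(0)+1}\} \le \hat{d}\,\frac{\bar{\mu} - \bar{\mu}^{z+1}}{1-\bar{\mu}}
$$
which results in
$$
\begin{aligned}
E\{V_{t(z)+1}\} &\le \hat{\mu}^{z}E\{V_{t(0)+1}\} + \hat{d}\,\frac{1-\hat{\mu}^{z}}{1-\hat{\mu}} \\
&\le \hat{\mu}^{z}(1-\mu_2)E\{V_{t(0)}\} + \hat{\mu}^{z}d_2 - \frac{\hat{\mu}^{z}\hat{d}}{1-\hat{\mu}} + \frac{\hat{d}}{1-\hat{\mu}}.
\end{aligned}
$$
Therefore, $E\{V_{t(z)+1}\}$ is ultimately bounded, i.e.,
$$
\lim_{z\to+\infty}E\{V_{t(z)+1}\} \le \frac{\hat{d}}{1-\hat{\mu}} < +\infty.
$$
Then, for any $t(z)+1 \le k < t(z+1)+1$, one has $E\{V_k\} \le E\{V_{t(z)+H}\}$, and
$$
E\{V_k\} \le E\{V_{t(z)+H}\} \le \hat{\mu}^{z}\bar{\mu}_1^{\,1-H}(1-\mu_2)V_0 + \tilde{d} + \hat{\mu}^{z}\bar{\mu}_1^{\,1-H}\Big(d_2 - \frac{\hat{d}}{1-\hat{\mu}}\Big) \tag{44}
$$
where $\tilde{d} \triangleq \dfrac{\hat{d}\bar{\mu}_1^{\,1-H}}{1-\hat{\mu}} + \dfrac{d_1(\bar{\mu}_1 - \bar{\mu}_1^{\,2-H})}{\bar{\mu}_1 - 1}$. Finally,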

ZHANG et al.: ULTIMATELY BOUNDED OUTPUT FEEDBACK CONTROL FOR NETWORKED NONLINEAR SYSTEMS 1573

$$
\lim_{k\to+\infty}E\{\|\tilde{x}_k\|^2\} \le \frac{\tilde{d}}{\lambda_{\min}(P)}.
$$
■

Theorem 2: For the error dynamics (11) and (12), assume that there exist scalars $\delta > 0$, $\mu_1 > 0$, $0 < \mu_2 < 1$, $0 < \alpha_i < 1$ $(i = 1, 2)$, $\sigma_s > 0$ $(s = 1, 2, 3, 4, 5)$, positive definite matrices $P$, $\Phi_l$ $(l = 1, 2, \ldots, 7)$, and observer gain matrix $L_{h(i)}$ satisfying (13)–(15), (17)–(19), and the following matrix inequality:
$$
\tilde{\Xi}_2 < 0 \tag{45}
$$
where
$$
\tilde{\Xi}_2 \triangleq
\begin{bmatrix}
\tilde{\Xi}_2^{11} & 0 & 0 & 0 & \tilde{\Xi}_2^{15} & \tilde{\Xi}_2^{16} \\
* & \tilde{\Xi}_2^{22} & 0 & 0 & \tilde{\Xi}_2^{25} & 0 \\
* & * & \tilde{\Xi}_2^{33} & 0 & 0 & \tilde{\Xi}_2^{36} \\
* & * & * & \tilde{\Xi}_2^{44} & 0 & 0 \\
* & * & * & * & \tilde{\Xi}_2^{55} & 0 \\
* & * & * & * & * & \tilde{\Xi}_2^{66}
\end{bmatrix}
$$
$$
\begin{aligned}
\tilde{\Xi}_2^{11} &\triangleq -(1-\mu_2)P, & \tilde{\Xi}_2^{15} &\triangleq A^T - C^T L_{h(i)}^T \\
\tilde{\Xi}_2^{16} &\triangleq \big[\sqrt{1+\varepsilon_3}\,(A^T - C^T L_{h(i)}^T) \;\; 0\big], & \tilde{\Xi}_2^{33} &\triangleq -\Phi_4 \\
\tilde{\Xi}_2^{25} &\triangleq I, & \tilde{\Xi}_2^{36} &\triangleq \big[0 \;\; \sqrt{1+\varepsilon_5}\,(E^T - D^T L_{h(i)}^T)\big] \\
\tilde{\Xi}_2^{44} &\triangleq (1+\varepsilon_6)P - \Phi_5, & \tilde{\Xi}_2^{55} &\triangleq P - 2I \\
\tilde{\Xi}_2^{22} &\triangleq -\sigma_2 I, & \tilde{\Xi}_2^{66} &\triangleq \mathrm{diag}\{P - 2I,\; P - 2I\}
\end{aligned}
$$
where $\varepsilon_s$ $(s = 3, 5, 6)$ are defined in Theorem 1. Then, (11) and (12) are EUB in mean square subject to $\omega_k$.

Proof: The proof follows from Theorem 1 and the Schur Complement Lemma. ■

Remark 3: Because $\mu_1 > 0$ and $0 < \mu_2 < 1$, the error dynamics undergoes an increment while the observer has no measurement signal to utilize when implementing the observation task. Fortunately, during $t(i)-q(i)+1 \le k < t(i+1)+1$, a decrement is available to compensate for the increment. In this way, the EUB of the error dynamics (11) and (12) can be jointly guaranteed.

B. Controller Design

In this subsection, we design the controller parameters. Furthermore, the EUB of the error dynamics of the actor/critic-NNWs will be simultaneously analyzed.

With the help of Bellman's principle of optimality, $J(x_k)$ can be rewritten as
$$
J(x_k) = l(x_k, u_k) + \sum_{j=k+1}^{\infty} l(x_j, u_j) = l(x_k, u_k) + J(x_{k+1}). \tag{46}
$$
Considering the approximation of $J(x_k)$ shown in (7), let $Z_J(W_J) \triangleq J(\hat{x}_k) - J(x_k)$ be the residual error produced during the approximation process of the critic NN. Then, we have
$$
\begin{aligned}
Z_J(W_J) &= l\big(\hat{x}_k, \hat{u}(\hat{x}_k)\big) + J(\hat{x}_{k+1}) - J(x_k) \\
&\approx l\big(\hat{x}_k, \hat{u}(\hat{x}_k)\big) + W_J^T\Delta\varphi_J(\hat{x}_k)
\end{aligned}
$$
where $\Delta\varphi_J(\hat{x}_k) \triangleq \varphi_J(\hat{x}_{k+1}) - \varphi_J(\hat{x}_k)$. By minimizing $\frac{1}{2}Z_J^T(W_J)Z_J(W_J)$, we obtain the update law of $\hat{W}_{J,k}$ for the critic NN based on gradient descent:
$$
\hat{W}_{J,k+1} = \hat{W}_{J,k} - \beta_1\Delta\varphi_J(\hat{x}_k)r_J(\hat{x}_k)Z_J^T(\hat{W}_{J,k}) \tag{47}
$$
where $\beta_1$ is the tuning scalar of the update law and $r_J(\hat{x}_k)$ is the step length used to adjust the updated amplitude, $r_J(\hat{x}_k) = 1/\big(1 + \|\Delta\varphi_J(\hat{x}_k)^T\Delta\varphi_J(\hat{x}_k)\|\big)$.

Next, we are in a position to design the weight update law. Based on (10) and Bellman's principle of optimality, one desired "optimal" control policy is governed by
$$
\frac{\partial l\big(\hat{x}_k, \hat{u}(\hat{x}_k)\big)}{\partial\hat{u}(\hat{x}_k)} = -\frac{\partial J(\hat{x}_{k+1})}{\partial\hat{u}(\hat{x}_k)}.
$$
Define $g\big(\hat{u}(\hat{x}_k)\big)$ as the derivative function of $l\big(\hat{x}_k, \hat{u}(\hat{x}_k)\big)$, which is invertible, i.e., $g\big(\hat{u}(\hat{x}_k)\big) \triangleq \partial l\big(\hat{x}_k, \hat{u}(\hat{x}_k)\big)/\partial\hat{u}(\hat{x}_k)$. As shown in [13], the approximated value of $u(\hat{x}_k)$ (i.e., $U(\hat{x}_k)$) is calculated based on an inverse function of $g\big(\hat{u}(\hat{x}_k)\big)$:
$$
U(\hat{x}_k) = g^{-1}\Big(\frac{\partial l(\hat{x}_k, \hat{u}(\hat{x}_k))}{\partial\hat{u}(\hat{x}_k)}\Big) = -\frac{1}{2}R^{-1}B^T\nabla\varphi_J^T(\hat{x}_{k+1})\hat{W}_{J,k}
$$
where $\nabla\varphi_J^T(\hat{x}_{k+1})$ represents the gradient operation of $\varphi_J^T(\hat{x}_{k+1})$. Let $Z_u(\hat{x}_k)$ be the control input error represented by
$$
Z_u(\hat{x}_k) = \hat{u}(\hat{x}_k) - U(\hat{x}_k) = \hat{W}_{u,k}^T\varphi_u(\hat{x}_k) + \frac{1}{2}R^{-1}B^T\nabla\varphi_J^T(\hat{x}_{k+1})\hat{W}_{J,k}.
$$
Similarly, using gradient descent, we obtain a weight update law by minimizing $\frac{1}{2}Z_u^T(\hat{x}_k)Z_u(\hat{x}_k)$, i.e.,
$$
\hat{W}_{u,k+1} = \hat{W}_{u,k} - \beta_2\varphi_u(\hat{x}_k)Z_u^T(\hat{x}_k). \tag{48}
$$
Define $\tilde{W}_{J,k} = \hat{W}_{J,k} - W_J$ as the estimation error of the critic-NNW. Then
$$
\begin{aligned}
\tilde{W}_{J,k+1} &= \hat{W}_{J,k+1} - W_J \\
&= \tilde{W}_{J,k} - \beta_1\Delta\varphi_J(\hat{x}_k)r_J(\hat{x}_k)Z_J^T(\hat{x}_k) \\
&= \tilde{W}_{J,k} - \beta_1\Delta\varphi_J(\hat{x}_k)r_J(\hat{x}_k)\big(\Delta\varphi_J^T(\hat{x}_k)\tilde{W}_{J,k} + \Upsilon_J\big)
\end{aligned}
\tag{49}
$$
where $\Upsilon_J \triangleq l^T\big(\hat{x}_k, \hat{u}(\hat{x}_k)\big) + \Delta\varphi_J^T(\hat{x}_k)W_J$.

Letting $\tilde{W}_{u,k} = \hat{W}_{u,k} - W_u$ be the estimation error of the actor-NNW, (48) indicates
$$
\begin{aligned}
\tilde{W}_{u,k+1} &= \hat{W}_{u,k+1} - W_u \\
&= \tilde{W}_{u,k} - \beta_2\varphi_u(\hat{x}_k)Z_u^T(\hat{x}_k) \\
&= \tilde{W}_{u,k} - \beta_2\varphi_u(\hat{x}_k)\varphi_u^T(\hat{x}_k)\tilde{W}_{u,k} - \beta_2\Upsilon_u - \frac{1}{2}\beta_2\varphi_u(\hat{x}_k)\tilde{W}_{J,k}^T\nabla\varphi_J(\hat{x}_{k+1})B(R^{-1})^T
\end{aligned}
\tag{50}
$$
where $\Upsilon_u \triangleq \frac{1}{2}\varphi_u(\hat{x}_k)\big(W_J^T\nabla\varphi_J(\hat{x}_{k+1})B(R^{-1})^T + \varphi_u^T(\hat{x}_k)W_u\big)$.

The following theorem presents the selection scheme for the tuning scalars $\beta_1$ and $\beta_2$, which ensures that the error dynamics (49) and (50) are EUB in mean square.

Theorem 3: Let the initial control input (i.e., $\hat{u}_0(\hat{x}_k) \triangleq \hat{W}_{u,0}^T\varphi_u(\hat{x}_k)$) be admissible and the initial actor- and critic-


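NNW (i.e., $\hat{W}_{J,0}$ and $\hat{W}_{u,0}$) be selected from a compact set which includes the ideal weights. Assume that there exist scalars $\beta_1 > 0$, $\beta_2 > 0$, $0 < \mu_j < 1$ $(j = 3, 4)$, $\sigma_s > 0$ $(s = 6, 7, 8, 9)$ and positive matrices $\Gamma_l$ $(l = 1, 2, 3)$ such that
$$
\Xi_3 < 0 \tag{51}
$$
$$
\Xi_4 < 0 \tag{52}
$$
where
$$
\Xi_3 \triangleq \begin{bmatrix}\Xi_3^{11} & 0 \\ * & \Xi_3^{22}\end{bmatrix}, \qquad
\Xi_4 \triangleq \begin{bmatrix}\Xi_4^{11} & \Xi_4^{12} \\ * & \Xi_4^{22}\end{bmatrix}
$$
$$
\begin{aligned}
\Xi_3^{11} &\triangleq \big(-\beta_1(2 - \sigma_6 - 4\beta_1\sigma_6\bar{\varphi}_J^2 - 4\beta_1\bar{\varphi}_J^2) - \mu_3\big) + \beta_2\bar{\varphi}_u^2\bar{\varphi}_J^2(\beta_2 + \sigma_7^{-1} + \sigma_8^{-1} + \sigma_9^{-1})\|R^{-1}B^T\|^2 \\
\Xi_3^{22} &\triangleq \beta_1(\sigma_6^{-1} + 4\beta_1\sigma_6^{-1}\bar{\varphi}_J^2 + 4\beta_1\bar{\varphi}_J^2) - \Gamma_1 \\
\Xi_4^{11} &\triangleq (-2\beta_2\bar{\varphi}_u^2 + \beta_2^2\bar{\varphi}_u^4 + \beta_2\sigma_7 + \beta_2\bar{\varphi}_u^2\sigma_8 + \mu_4) \\
\Xi_4^{12} &\triangleq -\beta_2 + \beta_2^2\bar{\varphi}_u^2, \qquad \Xi_4^{22} \triangleq \beta_2^2 + \beta_2^2\sigma_9 - \Gamma_2.
\end{aligned}
$$
Then, both estimation errors for the critic/actor-NNWs are EUB in mean square.

Proof: For the critic NN with update law (47) and the actor NN with update law (48), we construct the Lyapunov functions
$$
V_{3,k} \triangleq \mathrm{tr}\{\tilde{W}_{J,k}^T\tilde{W}_{J,k}\}, \qquad V_{4,k} \triangleq \mathrm{tr}\{\tilde{W}_{u,k}^T\tilde{W}_{u,k}\}.
$$
Taking the mathematical expectation along the trajectory of (49) and (50) leads to
$$
\begin{aligned}
E\{\Delta V_{3,k}\,|\,\hat{x}_k, \hat{W}_{u,k}, \hat{W}_{J,k}\}
={}& E\{V_{3,k+1}\,|\,\hat{x}_k, \hat{W}_{u,k}, \hat{W}_{J,k}\} - V_{3,k} \\
\le{}& \mathrm{tr}\Big\{E\Big\{\big(-\beta_1(2 - \sigma_6 - 4\beta_1\sigma_6\bar{\varphi}_J^2 - 4\beta_1\bar{\varphi}_J^2) - \mu_3\big)\tilde{W}_{J,k}^T\tilde{W}_{J,k} - \mu_3\tilde{W}_{J,k}^T\tilde{W}_{J,k} \\
&\quad + \big(\tilde{W}_{J,k}^T\Delta\varphi_J(\hat{x}_k) + \Upsilon_J^T\big)\big(\beta_1(\sigma_6^{-1} + 4\beta_1\sigma_6^{-1}\bar{\varphi}_J^2 + 4\beta_1\bar{\varphi}_J^2) - \Gamma_1\big)\big(\Delta\varphi_J^T(\hat{x}_k)\tilde{W}_{J,k} + \Upsilon_J\big) \\
&\quad + \big(\tilde{W}_{J,k}^T\Delta\varphi_J(\hat{x}_k) + \Upsilon_J^T\big)\Gamma_1\big(\Delta\varphi_J^T(\hat{x}_k)\tilde{W}_{J,k} + \Upsilon_J\big)\Big\}\Big\}
\end{aligned}
\tag{53}
$$
and
$$
\begin{aligned}
E\{\Delta V_{4,k}\,|\,\hat{x}_k, \hat{W}_{u,k}, \hat{W}_{J,k}\}
={}& E\{V_{4,k+1}\,|\,\hat{x}_k, \hat{W}_{u,k}, \hat{W}_{J,k}\} - V_{4,k} \\
\le{}& \mathrm{tr}\Big\{E\Big\{(-2\beta_2\bar{\varphi}_u^2 + \beta_2^2\bar{\varphi}_u^4 + \beta_2\sigma_7 + \beta_2\bar{\varphi}_u^2\sigma_8 + \mu_4)\tilde{W}_{u,k}^T\tilde{W}_{u,k} \\
&\quad + 2(-\beta_2 + \beta_2^2\bar{\varphi}_u^2)\tilde{W}_{u,k}^T\Upsilon_u + \Upsilon_u^T(\beta_2^2 + \beta_2^2\sigma_9 - \Gamma_2)\Upsilon_u + \Upsilon_u^T\Gamma_2\Upsilon_u \\
&\quad + \frac{1}{4}\beta_2\bar{\varphi}_u^2(\beta_2 + \sigma_7^{-1} + \sigma_8^{-1} + \sigma_9^{-1})R^{-1}B^T\nabla\varphi_J^T(\hat{x}_{k+1})\tilde{W}_{J,k}\tilde{W}_{J,k}^T\nabla\varphi_J(\hat{x}_{k+1})B(R^{-1})^T \\
&\quad - \mu_4\tilde{W}_{u,k}^T\tilde{W}_{u,k}\Big\}\Big\}.
\end{aligned}
\tag{54}
$$
Inequalities (53) and (54) indicate
$$
\begin{aligned}
\mathrm{tr}\big\{E\{\Delta V_{3,k} + \Delta V_{4,k}\,|\,\hat{x}_k, \tilde{W}_{u,k}, \tilde{W}_{J,k}\}\big\}
\le{}& \mathrm{tr}\big\{E\{\xi_k^T\Xi_3\xi_k - \mu_3\tilde{W}_{J,k}^T\tilde{W}_{J,k} + d_3 \\
&\quad + \bar{\xi}_k^T\Xi_4\bar{\xi}_k - \mu_4\tilde{W}_{u,k}^T\tilde{W}_{u,k} + d_4\}\big\}
\end{aligned}
$$
where
$$
\begin{aligned}
\xi_k &\triangleq [\tilde{W}_{J,k}^T \;\; \varrho_k^T]^T, \qquad \bar{\xi}_k \triangleq [\tilde{W}_{u,k}^T \;\; \Upsilon_u^T]^T \\
\varrho_k &\triangleq l^T\big(\hat{x}_k, \hat{u}(\hat{x}_k)\big) + \Delta\varphi_J^T(\hat{x}_k)W_J, \qquad d_3 \triangleq \mathrm{tr}\{3\bar{W}_J\bar{\varphi}_J\Gamma_1\} \\
d_4 &\triangleq \mathrm{tr}\{(\|R^{-1}B^T\|^2\bar{\varphi}_u^2\bar{\varphi}_J^2\bar{W}_J^2 + \bar{\varphi}_u^4\bar{W}_u^2)\Gamma_2\}.
\end{aligned}
$$
Inequalities (51) and (52) indicate
$$
\mathrm{tr}\big\{E\{\Delta V_{3,k} + \Delta V_{4,k}\,|\,\hat{x}_k, \tilde{W}_{u,k}, \tilde{W}_{J,k}\}\big\} \le \mathrm{tr}\{-\mu_3 V_{3,k} + d_3 - \mu_4 V_{4,k} + d_4\}. \tag{55}
$$
Remark 4: Utilizing the universal approximation property, (9) and (10) are used to suitably approximate (7) and (8), respectively. By this approach, the NN-based control algorithm can be realized. Furthermore, based on Lyapunov stability, the boundedness of both the critic-NNW and the actor-NNW is assured.

C. Boundedness Analysis for the Nonlinear NCSs

In this subsection, stability analysis will be conducted.

Theorem 4: Let the initial control input (i.e., $\hat{u}_0(\hat{x}_k) \triangleq \hat{W}_{u,0}^T\varphi_u(\hat{x}_k)$) be admissible and the initial actor- and critic-NN weights (i.e., $\hat{W}_{J,0}$ and $\hat{W}_{u,0}$) be selected from a compact set which includes the ideal weights. Suppose that there exist scalars $0 < \aleph < 1$, $\sigma_{10} > 0$, $0 < \mu_5 < 1$ and positive matrices $\Gamma_l$ $(l = 4, 5)$ such that
$$
\Pi_5 < 0 \tag{56}
$$
where
$$
\Pi_5 \triangleq \begin{bmatrix}\Pi_5^{11} & 0 & 0 \\ * & \Pi_5^{22} & 0 \\ * & * & \Pi_5^{33}\end{bmatrix}
$$
$$
\Pi_5^{11} \triangleq \aleph(1 + 2\sigma_{10}) - 1 + \mu_5, \qquad \Pi_5^{22} \triangleq (2 + \sigma_{10}^{-1})B^T B - \Gamma_3, \qquad \Pi_5^{33} \triangleq (2 + \sigma_{10}^{-1})B^T B - \Gamma_4.
$$
Then, system (1) with control policy (10) is EUB in mean square.

Proof: In light of the optimal control theory, (8) will stabilize (in the sense of input-to-state stability) the following system on a compact set [13]:
$$
x_{k+1} = Ax_k + f(x_k) + Bu_k + E\omega_k = \Lambda(x_k) + E\omega_k.
$$
In other words, there exists a positive constant $\aleph < 1$ such that
$$
E\{\|\Lambda(x_k)\|^2\} \le \aleph E\{\|x_k\|^2\} + \|E\omega_k\|. \tag{57}
$$
Considering the observer-based control framework, in view of (10), we have the following actual closed-loop system: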


$$
\begin{aligned}
x_{k+1} &= \Lambda(x_k) - BW_u^T\varphi_u(x_k) - B\zeta_{u,k} + B\hat{W}_{u,k}^T\varphi_u(\hat{x}_k) \\
&= \Lambda(x_k) - B\zeta_{u,k} - BW_u^T\grave{\varphi}_u(x_k) + B\tilde{W}_{u,k}^T\varphi_u(\hat{x}_k)
\end{aligned}
\tag{58}
$$
where $\grave{\varphi}_u(x_k) \triangleq \varphi_u(x_k) - \varphi_u(\hat{x}_k)$. Let us construct
$$
V_{5,k} \triangleq \mathrm{tr}\{x_k^T x_k\}.
$$
Taking the mathematical expectation implies
$$
\begin{aligned}
E\{\Delta V_{5,k}\} ={}& E\{V_{5,k+1}\,|\,\hat{x}_k, \hat{W}_{u,k}, \hat{W}_{J,k}\} - V_{5,k} \\
\le{}& \mathrm{tr}\Big\{E\Big\{\xi_k^T\Pi_5\xi_k + \big(\zeta_{u,k} + W_u^T\grave{\varphi}_u(x_k)\big)^T\Gamma_3\big(\zeta_{u,k} + W_u^T\grave{\varphi}_u(x_k)\big)^T \\
&\quad + \varphi_u^T(\hat{x}_k)\tilde{W}_{u,k}\Gamma_4\tilde{W}_{u,k}^T\varphi_u(\hat{x}_k) - \mu_5 x_k^T x_k + \tilde{Q}^T\tilde{Q}\Big\}\Big\} \\
\le{}& \mathrm{tr}\big\{E\{\tilde{\xi}_k^T\Pi_5\tilde{\xi}_k - \mu_5 x_k^T x_k + d_5\}\big\}
\end{aligned}
\tag{59}
$$
where $\tilde{\xi}_k \triangleq [x_k^T \;\; \zeta_{u,k}^T + \grave{\varphi}_u^T(x_k)W_u \;\; \tilde{W}_{u,k}^T]^T$ and $d_5 \triangleq \mathrm{tr}\{(\bar{\zeta}_u + 2\bar{W}_u\bar{\varphi}_u)^T\Gamma_3(\bar{\zeta}_u + 2\bar{W}_u\bar{\varphi}_u) + \bar{\varphi}_u^2\Gamma_4(d_3 + d_4) + \tilde{Q}^T\tilde{Q}\}$.

Taking (56) into consideration, it follows from (59) that
$$
E\{\Delta V_{5,k}\} \le -\mu_5 E\{V_{5,k}\} + d_5. \tag{60}
$$
Now, let us consider (55) and (60). It is obvious that
$$
\sum_{r=3}^{5}E\{V_{r,k}\} \le \sum_{r=3}^{5}(\bar{\mu}_r V_{r,k-1} + d_r) \le \sum_{r=3}^{5}\Big(\bar{\mu}_r^{\,k}V_{r,0} + d_r\frac{1 - \bar{\mu}_r^{\,k}}{1 - \bar{\mu}_r}\Big) \tag{61}
$$
where $\bar{\mu}_r \triangleq 1 - \mu_r$.

By constructing the following Lyapunov-like function:
$$
V_k \triangleq \sum_{r=1}^{5}V_{r,k}
$$
and considering (44) and (61), we have
$$
\lim_{k\to+\infty}E\{V_k\} < \tilde{d} + d_3\frac{1}{1-\bar{\mu}_3} + d_4\frac{1}{1-\bar{\mu}_4} + d_5\frac{1}{1-\bar{\mu}_5} < +\infty.
$$
■

Remark 5: In Theorems 1–4, we have explored the ultimately bounded output-feedback control for nonlinear NCSs by a buffer-aided strategy over unreliable communication channels. Specifically, we have quantitatively modeled the unreliable signal transmissions and evaluated the impact of the buffer-aided approach, designed the tuning laws for the NNWs, and ensured the bounded stability.

Remark 6: Compared with existing results, the salient features of this paper can be summarized as follows. 1) This work pioneers the exploration of NN-based output-feedback control for networked nonlinear systems utilizing a buffer-aided strategy under unreliable signal transmissions. 2) Given the nature of unreliable signal transmissions and the incorporation of the buffer-aided strategy, this paper introduces adaptive tuning laws for the nonlinear/critic/actor-NNWs. Furthermore, the NN tuning scalars have been tailored to ensure a good approximation of the unknown nonlinearities and the critic/actor NNs. 3) In the face of unreliable signal transmissions, the EUB of the system states, along with the error dynamics of the system states and the nonlinear/critic/actor-NNWs, have been collectively assured.

Remark 7: It should be mentioned that the signal transmissions of a typical network system are implemented via a digital communication channel, where an encoding-decoding mechanism is utilized to encode signals. By now, various encoding-decoding schemes have been reported in the literature (e.g., the quantization-based encoding-decoding schemes and symbolic-based encoding-decoding schemes) [29]. Different encoding-decoding mechanisms would lead to different "decoding errors", which will affect the resultant accuracy of the control system. One of our future research topics is to study the design of optimal buffer-aided control strategies for networked systems with unreliable communication channels and encoding-decoding mechanisms.

IV. Illustrative Example

Consider a networked nonlinear system (1) where
$$
A = \begin{bmatrix}0.5 & 0 & -0.6 \\ 0 & 1.01 & 0 \\ 0 & 0.5 & 0.2\end{bmatrix}, \quad
B = \begin{bmatrix}1 \\ -1 \\ 1\end{bmatrix}, \quad
E = \begin{bmatrix}0.1 \\ -0.1 \\ 0.05\end{bmatrix}
$$
$$
C = \begin{bmatrix}0.8 & -0.8 & 0 \\ -0.7 & 0 & -0.7\end{bmatrix}, \quad
D = \begin{bmatrix}0.1 \\ 0\end{bmatrix}.
$$
The variance of $\omega_k$ is set as 0.2, and $f_k = 8[\sin(x_{1,k})\ \sin(x_{2,k})\ \sin(x_{2,k})\ \cos(x_{3,k})]^T$.

Let the maximum capacity of the buffer be $Q = 2$. The transmission interval $h(i)$ is selected from the set $H = \{1, 2, 3, 4\}$, whose known occurrence probabilities are taken as $p_1 = 0.2$ and $p_2 = 0.4$.

Set $\delta = 0.1$, $\mu_1 = 1.2$, $\mu_2 = 0.75$, $\alpha_1 = 5$, $\alpha_2 = 1.2$, $\sigma_1 = 9$, $\sigma_2 = 1.2$, $\sigma_3 = 1.3$, $\sigma_4 = 0.2$ and $\sigma_5 = 0.5$. Using the MATLAB LMI Toolbox, the desired solution to the matrix inequalities (13)–(15), (17)–(19), and (45) is
$$
P = \begin{bmatrix}24.1891 & -1.3302 & -1.4385 \\ -1.3302 & 24.1972 & -0.0034 \\ -1.6385 & -0.0034 & 24.1876\end{bmatrix}
$$
$$
L_1 = \begin{bmatrix}0.3017 & 0.1487 \\ -0.2577 & -0.0482 \\ -0.1327 & -0.0617\end{bmatrix}, \quad
L_2 = \begin{bmatrix}0.5680 & 0.3322 \\ -1.2492 & -0.4145 \\ -0.5034 & -0.4423\end{bmatrix}
$$
$$
L_3 = \begin{bmatrix}0.19070 & 0.2279 \\ -0.2764 & -0.4392 \\ -0.2997 & -0.1672\end{bmatrix}, \quad
L_4 = \begin{bmatrix}0.3740 & 0.3322 \\ -0.6433 & -0.4392 \\ -0.3007 & -0.1742\end{bmatrix}.
$$
Set $\xi = 0.6$, $\mu_3 = 0.01$, $\mu_4 = 0.01$, $\mu_5 = 0.01$, $\beta_1 = 0.99$, $\beta_2 = 0.99$, $\sigma_6 = 0.8$, $\sigma_7 = 0.2$, $\sigma_8 = 0.2$, $\sigma_9 = 0.2$ and $\sigma_{10} = 0.2$. Therefore, the matrix inequalities (51), (52), and (56) hold. In what follows, let us validate this ADP-based control strategy. The utility function is selected as $l(x_k, u_k) = x_k^T M x_k + u_k^T R u_k$ where $M = 1.6I$ and $R = 1.2I$. The activation functions are selected as
$$
\begin{aligned}
\varphi_f(\hat{x}_k) &= 0.01[\tanh(\hat{x}_{1,k})\ \ \tanh(\hat{x}_{2,k})\ \ \tanh(\hat{x}_{3,k})]^T \\
\varphi_v(\hat{x}_k) &= 0.4[\tanh(\hat{x}_{1,k}^2)\ \ \tanh(\hat{x}_{2,k}\hat{x}_{3,k})\ \ \tanh(\hat{x}_{3,k})]^T \\
\varphi_u(\hat{x}_k) &= 0.4[\tanh(\hat{x}_{1,k})\ \ \tanh(0.2\hat{x}_{2,k})\ \ \tanh(0.2\hat{x}_{3,k})]^T.
\end{aligned}
$$


The initial values are
$$
\begin{aligned}
x_0 &= [0.9\ \ {-0.6}\ \ 0.6]^T, & \hat{x}_0 &= [-0.24\ \ 0.12\ \ {-0.36}]^T \\
\hat{W}_{f,0} &= [0.1\ \ 0.1\ \ 0.1]^T, & \hat{W}_{J,0} &= [-1\ \ {-1}\ \ 1.8]^T \\
\hat{W}_{u,0} &= [-1.52\ \ {-4.24}\ \ 7.6].
\end{aligned}
$$
The validity and efficacy of our proposed approach are visually substantiated through the results explained as follows.

1) To begin, Fig. 1 shows the norm of the state trajectory for the open-loop system. It is evident that the open-loop system is inherently unstable, which makes the need for an effective control strategy even more apparent.

Fig. 1. Norm of the state vector of the open-loop system. (Plotted signal: $\|x_k\|$; horizontal axis: Time (k), 0 to 1000; vertical axis: Amplitude, scale $\times 10^8$.)

2) Transitioning to the closed-loop system, we have displayed both the state trajectories and their estimates in Fig. 2, which provides a clear testament to the feasibility of the NN-based output-feedback control strategy developed in our study. The trajectories closely align with their estimates, underscoring the controller's ability to maintain system stability and accurately track the desired states.

Fig. 2. States and their estimates of the closed-loop system. (Plotted signals: $x_{1,k}$, $\hat{x}_{1,k}$, $x_{2,k}$, $\hat{x}_{2,k}$, $x_{3,k}$, $\hat{x}_{3,k}$; horizontal axis: Time (k), 0 to 90; vertical axis: Amplitude.)

3) Delving into the neural network details, Figs. 3 and 4 depict the estimates of the critic/actor NNWs, respectively, and this provides insight into the dynamic adaptation and learning process that the networks undergo as they interact with the system. The control input, crucial for achieving the desired system behavior, is represented in Fig. 5, from which one can verify the controller's responsiveness and precision in action.

Fig. 3. The weight estimate of critic NN. (Plotted signals: $\hat{W}_{1,J,k}$, $\hat{W}_{2,J,k}$, $\hat{W}_{3,J,k}$; horizontal axis: Time (k), 0 to 90; vertical axis: Amplitude.)

Fig. 4. The weight estimate of actor NN. (Plotted signals: $\hat{W}_{1,u,k}$, $\hat{W}_{2,u,k}$, $\hat{W}_{3,u,k}$; horizontal axis: Time (k), 0 to 90; vertical axis: Amplitude.)

Fig. 5. The control input. (Plotted signal: $\hat{u}(\hat{x}_k)$; horizontal axis: Time (k), 0 to 90; vertical axis: Amplitude.)

4) Collectively, these simulation outcomes show that the proposed NN-based control strategy achieves satisfactory performance, and our developed approach not only addresses the inherent instability of the system but also provides commendable precision and adaptability.

V. Conclusions

In this study, we have examined the ultimately bounded output-feedback control for networked nonlinear systems employing a buffer-aided strategy over unreliable communication channels. Given the unreliable nature of signal transmission, we have used a buffer-aided strategy to relay a greater number of measurements. To obtain the desired control strategy, an NN-based observer has been devised for state estimation. In addition, an observer-based ADP algorithm has been introduced to approximate the ideal solution for the sub-optimal control issue. Utilizing Lyapunov stability theory, sufficient conditions have been identified that jointly ensure that the closed-loop system, state estimates and critic/actor-NNW estimates are all EUB in mean square. Numerical examples have been presented to reinforce the efficacy of the outlined control strategy. Potential avenues for future investigations include the extension of the proposed control strategy to systems with buffer-aided strategy and other phenomena such as complex networks [37]–[39], wireless sensor networks [40], multiagent systems [41], and others [42]–[48].

References

[1] X. Liang, Q. Qi, H. Zhang, and L. Xie, “Decentralized control for networked control systems with asymmetric information,” IEEE Trans. Automatic Control, vol. 67, no. 4, pp. 2076–2083, Apr. 2022.
[2] F. L. Lewis and D. Vrabie, “Reinforcement learning and adaptive dynamic programming for feedback control,” IEEE Circuits and Systems Magazine, vol. 9, no. 3, pp. 40–58, 2009.
[3] B. Sun and E.-J. Van Kampen, “Event-triggered constrained control using explainable global dual heuristic programming for nonlinear discrete-time systems,” Neurocomputing, vol. 468, pp. 452–463, Jan. 2022.
[4] X. Wang, W. Liu, Q. Wu, and S. Li, “A modular optimal formation control scheme of multiagent systems with application to multiple mobile robots,” IEEE Trans. Industrial Electronics, vol. 69, no. 9, pp. 9331–9341, Sept. 2022.
[5] H. Zhang, Y. Luo, and D. Liu, “Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints,” IEEE Trans. Neural Networks and Learning Systems, vol. 20, no. 9, pp. 1490–1503, Sept. 2009.
[6] D. V. Prokhorov, R. Santiago, and D. C. Wunsch, “Adaptive critic designs: A case study for neurocontrol,” Neural Networks, vol. 8, no. 9, pp. 1367–1372, 1995.
[7] X. Wang, Y. Sun, and D. Ding, “Adaptive dynamic programming for networked control systems under communication constraints: A survey of trends and techniques,” Int. Journal of Network Dynamics and Intelligence, vol. 1, no. 1, pp. 85–98, Dec. 2022.
[8] Q. Wei, D. Wang, and D. Zhang, “Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays,” Neural Computing and Applications, vol. 23, pp. 7–8, Dec. 2013.
[9] X. Wu and C. Wang, “Event-driven adaptive near-optimal tracking control of the robot in aircraft skin inspection,” Int. Journal of Robust and Nonlinear Control, vol. 31, no. 7, pp. 2593–2613, May 2021.
[10] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, no. 5, pp. 359–366, Jun. 1989.
[11] Z. Ming, H. Zhang, Y. Luo, and W. Wang, “Dynamic event-based control for stochastic optimal regulation of nonlinear networked control systems,” IEEE Trans. Neural Networks and Learning Systems, vol. 34, no. 10, pp. 7299–7308, 2023.
[12] H. Ren, H. Zhang, Y. Mu, and J. Duan, “Off-policy synchronous iteration IRL method for multi-player zero-sum games with input constraints,” Neurocomputing, vol. 379, pp. 413–421, Feb. 2020.
[13] D. Ding, Z. Wang, and Q.-L. Han, “Neural-network-based consensus control for multiagent systems with input constraints: The event-triggered case,” IEEE Trans. Cybernetics, vol. 50, no. 8, pp. 3719–3730, Aug. 2020.
[14] Y. Chen, K. Ma, and R. Dong, “Dynamic anti-windup design for linear systems with time-varying state delay and input saturations,” Int. Journal of Systems Science, vol. 53, no. 10, pp. 2165–2179, Jul. 2022.
[15] W. Qian, Y. Li, Y. Zhao, and Y. Chen, “New optimal method for l2–l∞ state estimation of delayed neural networks,” IEEE Trans. Neural Networks and Learning Systems, vol. 415, pp. 258–265, Nov. 2020.
[16] L. Wang, S. Liu, Y. Zhang, D. Ding, and X. Yi, “Non-fragile l2–l∞ state estimation for time-delayed artificial neural networks: An adaptive event-triggered approach,” Int. Journal of Systems Science, vol. 53, no. 10, pp. 2247–2259, Jul. 2022.
[17] Z. Zhao, X. Yi, L. Ma, and X. Bai, “Quantized recursive filtering for networked systems with stochastic transmission delays,” ISA Transactions, vol. 127, pp. 99–107, Aug. 2022.
[18] R. Caballero-Aguila, A. Hermoso-Carazo, and J. Linares-Perez, “Optimal state estimation for networked systems with random parameter matrices, correlated noises and delayed measurements,” Int. Journal of General Systems, vol. 44, no. 2, pp. 142–154, Feb. 2015.
[19] D. Ciuonzo, A. Aubry, and V. Carotenuto, “Rician MIMO channel- and jamming-aware decision fusion,” IEEE Trans. Signal Processing, vol. 65, no. 15, pp. 3866–3880, 2017.
[20] X.-M. Zhang, Q.-L. Han, X. Ge, D. Ding, L. Ding, D. Yue, and C. Peng, “Networked control systems: A survey of trends and techniques,” IEEE/CAA J. Autom. Sinica, vol. 7, no. 1, pp. 1–17, Jan. 2020.
[21] X. Guan, J. Hu, J. Qi, D. Chen, F. Zhang, and G. Yang, “Observer-based H∞ sliding mode control for networked systems subject to communication channel fading and randomly varying nonlinearities,” Neurocomputing, vol. 437, pp. 312–324, May 2021.
[22] W. Qian, W. Xing, and S. Fei, “H∞ state estimation for neural networks with general activation function and mixed time-varying delays,” IEEE Trans. Neural Networks and Learning Systems, vol. 32, no. 9, pp. 3909–3918, Sept. 2021.
[23] L. Yu, Y. Cui, Y. Liu, N. D. Alotaibi, and F. E. Alsaadi, “Sampled-based consensus of multi-agent systems with bounded distributed time-delays and dynamic quantisation effects,” Int. Journal of Systems Science, vol. 53, no. 11, pp. 2390–2406, Aug. 2022.
[24] Y. Zhao, X. He, L. Ma, and H. Liu, “Unbiasedness-constrained least squares state estimation for time-varying systems with missing measurements under round-robin protocol,” Int. Journal of Systems Science, vol. 53, no. 9, pp. 1925–1941, Jul. 2022.
[25] H. Geng, Z. Wang, Y. Chen, X. Yi, and Y. Cheng, “Variance-constrained filtering fusion for nonlinear cyber-physical systems with the denial-of-service attacks and stochastic communication protocol,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 6, pp. 978–989, Jun. 2022.
[26] X. Li, F. Han, N. Hou, H. Dong, and H. Liu, “Set-membership filtering for piecewise linear systems with censored measurements under Round-Robin protocol,” Int. Journal of Systems Science, vol. 51, no. 9, pp. 1578–1588, 2020.
[27] Y. S. Shmaliy, S. Zhao, and C. K. Ahn, “Unbiased finite impulse response filtering: An iterative alternative to Kalman filtering ignoring noise and initial conditions,” IEEE Control Systems Magazine, vol. 37, no. 5, pp. 70–89, 2017.
[28] H. Song, D. Ding, H. Dong, G. Wei, and Q.-L. Han, “Distributed entropy filtering subject to DoS attacks in non-Gauss environments,” Int. Journal of Robust and Nonlinear Control, vol. 30, no. 3, pp. 1240–1257, Feb. 2020.
[29] Z. Wang, L. Wang, S. Liu, and G. Wei, “Encoding-decoding-based control and filtering of networked systems: Insights, developments and opportunities,” IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, pp. 3–18, Jan. 2018.
[30] D. Shi, T. Chen, and L. Shi, “Event-triggered maximum likelihood state estimation,” Automatica, vol. 50, no. 1, pp. 247–254, Feb. 2014.
[31] M. Barakat, “Novel chaos game optimization tuned-fractional-order PID fractional-order PI controller for load-frequency control of interconnected power systems,” Protection and Control of Modern Power Systems, 2022. DOI: 10.1186/s41601-022-00238-x
[32] Y. Wang and G. Yang, “Robust H∞ model reference tracking control for networked control systems with communication constraints,” Int. Journal of Control, Automation, and Systems, vol. 7, no. 6, pp. 992–1000, Dec. 2009.
[33] Y. Xu, L. Yang, Z. Wang, H. Rao, and R. Lu, “State estimation for networked systems with Markov driven transmission and buffer constraint,” IEEE Trans. Systems, Man, and Cybernetics: Systems, vol. 51, no. 12, pp. 7727–7734, Dec. 2021.
[34] Y. Cui, L. Yu, Y. Liu, W. Zhang, and F. E. Alsaadi, “Dynamic event based non-fragile state estimation for complex networks via partial nodes information,” Journal of the Franklin Institute, vol. 358, no. 18,

pp. 10193–10212, Dec. 2021.
[35] X. Wang, D. Ding, X. Ge, and Q.-L. Han, “Neural-network-based control for discrete-time nonlinear systems with denial-of-service attack: The adaptive event-triggered case,” Int. Journal of Robust and Nonlinear Control, vol. 32, no. 5, pp. 2760–2779, Mar. 2022.
[36] L. Zou, Z. Wang, Q.-L. Han, and D. Zhou, “Ultimate boundedness control for networked systems with try-once-discard protocol and uniform quantization effects,” IEEE Trans. Automatic Control, vol. 62, no. 12, pp. 6582–6588, Dec. 2017.
[37] G. Bao, L. Ma, and X. Yi, “Recent advances on cooperative control of heterogeneous multi-agent systems subject to constraints: A survey,” Systems Science & Control Engineering, vol. 10, no. 1, pp. 539–551, Dec. 2022.
[38] C. Gao, X. He, H. Dong, H. Liu, and G. Lyu, “A survey on fault-tolerant consensus control of multi-agent systems: Trends, methodologies and prospects,” Int. Journal of Systems Science, vol. 53, no. 13, pp. 2800–2813, Oct. 2022.
[39] X. Wan, Y. Li, Y. Li, and M. Wu, “Finite-time H∞ state estimation for two-time-scale complex networks under stochastic communication protocol,” IEEE Trans. Neural Networks and Learning Systems, vol. 33, no. 1, pp. 25–36, Jan. 2022.
[40] Y. Ju, G. Wei, D. Ding, and S. Liu, “A novel fault detection method under weighted try-once-discard scheduling over sensor networks,” IEEE Trans. Control of Network Systems, vol. 7, no. 3, pp. 1489–1499, Sept. 2020.
[41] W. Qian, Y. Gao, and Y. Yang, “Global consensus of multiagent systems with internal delays and communication delays,” IEEE Trans. Systems, Man, and Cybernetics: Systems, vol. 49, no. 10, pp. 1961–1970, Oct. 2019.
[42] Y. Chen, Q. Song, Z. Zhao, Y. Liu, and F. E. Alsaadi, “Global Mittag-Leffler stability for fractional-order quaternion-valued neural networks with piecewise constant arguments and impulses,” Int. Journal of Systems Science, vol. 53, no. 8, pp. 1756–1768, Jun. 2022.
[43] X. Li, Q. Song, Y. Liu, and F. E. Alsaadi, “Nash equilibrium and bang-bang property for the non-zero-sum differential game of multi-player uncertain systems with Hurwicz criterion,” Int. Journal of Systems Science, vol. 53, no. 10, pp. 2207–2218, Jul. 2022.

Zidong Wang (Fellow, IEEE) received the B.Sc. degree in mathematics in 1986 from Suzhou University, and the M.Sc. degree in applied mathematics in 1990 and the Ph.D. degree in electrical engineering in 1994, both from Nanjing University of Science and Technology.
He is currently Professor of dynamical systems and computing in the Department of Computer Science, Brunel University London, UK. From 1990 to 2002, he held teaching and research appointments in universities in China, Germany and the UK. Prof. Wang’s research interests include dynamical systems, signal processing, bioinformatics, control theory and applications. He has published a number of papers in international journals. He is a holder of the Alexander von Humboldt Research Fellowship of Germany, the JSPS Research Fellowship of Japan, William Mong Visiting Research Fellowship of Hong Kong, China.
Prof. Wang serves (or has served) as the Editor-in-Chief for International Journal of Systems Science, the Editor-in-Chief for Neurocomputing, the Editor-in-Chief for Systems Science and Control Engineering, and an Associate Editor for 12 international journals including IEEE Transactions on Automatic Control, IEEE Transactions on Control Systems Technology, IEEE Transactions on Neural Networks, IEEE Transactions on Signal Processing, and IEEE Transactions on Systems, Man, and Cybernetics-Part C. He is a Member of the Academia Europaea, a Member of the European Academy of Sciences and Arts, an Academician of the International Academy for Systems and Cybernetic Sciences, a Fellow of the IEEE, a Fellow of the Royal Statistical Society and a member of program committee for many international conferences.

Lei Zou (Senior Member, IEEE) received the Ph.D degree in control science and engineering in 2016 from Harbin Institute of Technology.
He is currently a Professor with the College of Information Science and Technology, Donghua University. From October 2013 to October 2015, he was a visiting Ph.D. student with the Department of Computer Science, Brunel University London, U.K. His research interests include control and filtering of networked systems, moving-horizon estimation, state estimation subject to outliers, and secure state estimation.
[44] Y. Sun, D. Ding, H. Dong, and H. Liu, “Event-based resilient filtering Prof. Zou serves (or has served) as an Associate Editor for IEEE/CAA
for stochastic nonlinear systems via innovation constraints,” Journal of Automatica Sinica, Neurocomputing, International Journal of Sys-
Information Sciences, vol. 546, pp. 512–525, Feb. 2021. tems Science, and International Journal of Control, Automation and Systems,
a Senior Member of IEEE, a Senior Member of Chinese Association of
[45] H. Tao, H. Tan, Q. Chen, H. Liu, and J. Hu, “H∞ state estimation for
Automation, a Regular Reviewer of Mathematical Reviews, and a very active
memristive neural networks with randomly occurring DoS attacks,” Reviewer for many international journals.
Systems Science & Control Engineering, vol. 10, no. 1, pp. 154–165,
Dec. 2022.
Yun Chen received the B.E. degree in thermal engi-
[46] H. Shen, M. Xing, H. Yan, and J. Cao, “Observer-based l2–l∞ control neering in 1999 from Central South University of
for singularly perturbed semi-Markov jump systems with an improved Technology (Central South University), and the M.E.
weighted TOD protocol,” Science China-Information Sciences, 2022. degree in engineering thermal physics in 2002, and
DOI: 10.1007/s11432-021-3345-1 the Ph.D. degree in control science and engineering
[47] H. Yu, J. Hu, B. Song, H. Liu, and X. Yi, “Resilient energy-to-peak in 2008, both from Zhejiang University.
filtering for linear parameter-varying systems under random access From August 2009 to August 2010, he was a Vis-
protocol,” Int. Journal of Systems Science, vol. 53, no. 11, pp. 2421– iting Fellow with the School of Computing, Engi-
2436, Aug. 2022. neering and Mathematics, University of Western
[48] Q. Zhang and Y. Zhou, “Recent advances in non-Gaussian stochastic Sydney, Australia. From December 2016 to Decem-
systems control theory and its applications,” Int. Journal of Network ber 2017, he was an Academic Visitor with the Department of Mathematics,
Dynamics and Intelligence, vol. 1, no. 1, pp. 111–119, Dec. 2022. Brunel University London, UK. In 2002, he joined Hangzhou Dianzi Univer-
sity, where he is currently a Professor. His research interests include stochas-
tic and hybrid systems, robust control and filtering.
Yuhan Zhang received the B.Eng. degree in elec-
tronic information science and technology from the Guoping Lu received the B.S. degree from the
Shandong University of Science and Technology, in Department of Applied Mathematics, Chengdu Uni-
2016, and the M.Sc. degree in marketing from the versity of Science and Technology, in 1984, and the
University of Nottingham, UK, in 2017. She is cur- M.S. and Ph.D. degrees in applied mathematics from
rently pursuing the Ph.D. degree in control science the Department of Mathematics, East China Normal
and engineering from Shandong University of Sci- University, in 1989 and 1998, respectively.
ence and Technology. He is currently a Professor with the School of
Her current research interests include the control Electrical Engineering, Nantong University. His cur-
and filtering of networked systems, reinforcement rent research interests include singular systems, mul-
learning, and neural networks. tiagent systems, networked control, and nonlinear
She is a very active Reviewer for many international journals. signal processing.

