
electronics

Article
Soft Sensors for Industrial Processes Using Multi-Step-Ahead
Hankel Dynamic Mode Decomposition with Control
Luca Patanè , Francesca Sapuppo * and Maria Gabriella Xibilia

Department of Engineering, University of Messina, Contrada di Dio, 98158 Messina, Italy; [email protected] (L.P.); [email protected] (M.G.X.)
* Correspondence: [email protected]

Abstract: In this paper, a novel data-driven approach for the development of soft sensors (SSs) for
multi-step-ahead prediction of industrial process variables is proposed. This method is based on the
recent developments in Koopman operator theory and dynamic mode decomposition (DMD). It is
derived from Hankel DMD with control (HDMDc) to deal with highly nonlinear dynamics using
augmented linear models, exploiting input and output regressors. The proposed multi-step-ahead
HDMDc (MSA-HDMDc) is designed to perform multi-step prediction and capture complex dynamics
with a linear approximation for a highly nonlinear system. This enables the construction of SSs
capable of estimating the output of a process over a long period of time and/or using the developed
SSs for model predictive control purposes. Hyperparameter tuning and model order reduction are
specifically designed to perform multi-step-ahead predictions. Two real-world case studies consisting
of a sulfur recovery unit and a debutanizer column, which are widely used as benchmarks in the SS
field, are used to validate the proposed methodology. Data covering multiple system operating points
are used for identification. The proposed MSA-HDMDc outperforms methods currently adopted in the SS domain, such as autoregressive models with exogenous inputs and finite impulse response models, and proves to be robust to the variability of the system operating points.

Keywords: soft sensors; dynamical models; system identification; dynamic mode decomposition
with control; nonlinear processes; multi-step prediction

Citation: Patanè, L.; Sapuppo, F.; Xibilia, M.G. Soft Sensors for Industrial Processes Using Multi-Step-Ahead Hankel Dynamic Mode Decomposition with Control. Electronics 2024, 13, 3047. https://doi.org/10.3390/electronics13153047

Academic Editor: Luca Mesin
Received: 27 May 2024; Revised: 19 July 2024; Accepted: 30 July 2024; Published: 1 August 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Soft sensors (SSs) are widely exploited for monitoring and controlling industrial processes where real-time estimation of variables is essential. SSs provide mathematical solutions for estimating and predicting hard-to-measure variables (i.e., quality features) using easy-to-measure variables (i.e., quantity features) [1,2].

In automation and control strategies, SSs are applied to address the issues related to delays in measurements caused by time-consuming laboratory analysis in the feedback loop. This requires that the designed SSs are capable of multi-step-ahead prediction. Applications are ubiquitous in the process industry: refineries [3], chemical plants [4], power plants [5], food processing [6], polymerization processes [7] and wastewater treatment systems [8]. Industrial processes are often highly nonlinear, suffer from intrinsic dynamic dependencies between input and output variables and may exhibit transients, intermittent phenomena and continuous spectra.

The main approach to identifying such complex nonlinear dynamical systems originates with Poincaré's studies of the geometry of subspaces of local linearizations around fixed points, periodic orbits and more general attractors [9]. This methodology has a deep theoretical foundation, such as the Hartman–Grobman theorem, which determines when and where it is possible to approximate a nonlinear system with linear dynamics.

On the one hand, such a geometric perspective enables the application of simple quantitative, locally linear models, such as autoregressive (ARX), principal component

regression (PCR) and partial least-square regression (PLSR) models [1], and proper orthogo-
nal decomposition [10], as well as the composition of multiple linear systems as components
of more complex modeling techniques [11,12]. In this scenario, the rich linear analytical
framework can be used around such operating points and is therefore suitable for linear
control strategies. On the other hand, the global analysis remains qualitative and is based
on computational analysis, which is not suitable for predicting, estimating and controlling
nonlinear systems far from fixed points and periodic orbits. Moreover, due to the complex
theoretical environment, data-driven approaches are often used for SSs to support both
linear and nonlinear methods [12–17]. In this methodological and application scenario,
the Koopman operator [18] can provide a theoretical tool for obtaining a global linear
representation that is valid for nonlinear systems, even far from fixed points and periodic
orbits. A main motivation for the adoption of the Koopman framework is the possibility
to simplify the dynamics by the eigenvalue decomposition of the Koopman operator [19]
and thus to represent a nonlinear dynamical system globally by an infinite-dimensional
linear operator. It uses a Hilbert space of observable functions related to the state of the
system to describe the space of all possible measurement state functions. It is linear and
its spectral decomposition fully characterizes the behavior of a nonlinear system, without
a direct relation to the operating points of the system.
The application of such a powerful tool to industrial problems, namely to obtain a finite-
dimensional approximation of the Koopman operator, is a challenge of recent research [20].
Moreover, since the closed form of the Koopman operator is not always obtainable [21],
data-driven algorithms are needed. Koopman mode decomposition can be performed using
data-driven approaches such as dynamic mode decomposition (DMD) [22].
Applications of DMD can be found in the literature in fluid dynamics [23,24],
epidemiology [25], neuroscience [26], plasma physics [27,28], robotics [29], power
grid instabilities [30] and renewable energy prediction [31]. DMD represents a method
for approximating the Koopman operator that provides a best-fit linear model for one-step-
ahead prediction. Such an approximation might not be rich enough to describe nonlinear
dynamics. To overcome this limitation and apply DMD to nonlinear industrial processes,
it is possible to extend DMD with different strategies based on either nonlinear functions
or delayed measurements. Extended DMD [32] and sparse identification of nonlinear dynamics (SINDy) [33] belong to the first category. The second category, on which this paper
focuses, is based on the use of delayed state variables obtained by the Hankel operator.
Such approaches overcome the limitations of standard DMD, which cannot accurately
describe systems where the number of variables is smaller than the spectral complexity.
Therefore, in Hankel-based DMD, the number of variables is increased by considering
time-delayed vectors in addition to the current state vector. There are a few variants of
the Hankel approach for the Koopman operator: Hankel DMD (HDMD) [34], high-order
DMD (HODMD) [35] and Hankel alternative view of Koopman (HAVOK) [36]. Thanks to
the state variable augmentation, these methods are more robust and accurate than classical
DMD and are therefore suitable for the identification of nonlinear dynamical systems and
offer robust noise filtering [37–39].
With the aim of applying DMD-based approaches to industrial processes with
exogenous inputs, the DMD with control (DMDc) approach has been proposed in the
literature [40]. This is a modified version of DMD that considers both system measurements
and exogenous control inputs to identify input–output relationships and the underlying
dynamics. Hankel DMD with control (HDMDc) has recently been introduced to handle
both time-delayed state variables and control inputs [41,42]. Applications of Koopman
theory for quasiperiodically driven systems have also been presented in [43].
In this paper, we propose an extension of the HDMDc approach to multi-step-ahead (MSA) prediction (hereafter referred to as MSA-HDMDc) in the SS design domain. This solution leverages the intrinsic HDMDc capability for forecasting [42] and adds an iterative multi-step-ahead model optimization and output prediction procedure. This makes SSs suitable for the application of model-based online control

strategies that are widely used in industrial processes, such as model predictive control
(MPC) [44–47]. To evaluate the potential of the MSA-HDMDc approach in industry,
and to test the robustness and reliability of the method in real-world industrial environ-
ments considering noise and uncertainty [48,49], two widely used benchmarks in the SS
field are considered: the sulfur recovery unit (SRU) [1,12,17,50,51] and the debutanizer
column (DC) [52,53]. Multi-step-ahead prediction of the output variables is evaluated
on such datasets, and a comparison with currently adopted linear model identification
techniques is performed.
The main outcomes of this work are summarized here:
• The MSA-HDMDc procedure is developed and applied in the field of soft sensors for
industrial applications to perform multi-step-ahead prediction;
• A model order reduction strategy with a two-step optimization is developed to reduce
the computational complexity of the identified model;
• A global linear model for a nonlinear process is identified so that model analysis and
control cover multiple working points of the process.
The article is structured as follows. Section 2 describes the theoretical background,
the equations and the algorithmic procedure behind the models used. The two industrial
case studies, the SRU and the DC, are presented in Section 3. Sections 4 and 5 present the
simulation results for the implemented models, including the design and optimization
of hyperparameters and model order reduction for both case studies. Comparisons with baseline methods are also reported in these sections. Finally, conclusions are drawn in Section 6.

2. Theory Fundamentals
There are different approaches for the data-driven identification of dynamic pro-
cesses. When dealing with linear models, the widely used model classes in the SS field
are the AutoRegressive with eXogenous Input (ARX) and the finite impulse response (FIR)
filter [1,2]. They are considered here for comparison with the new MSA-HDMDc, which
has been evaluated in industrial applications. This section describes the theoretical and
mathematical foundations for both the baseline and the HDMDc models. The algorithmic
procedure of the MSA-HDMDc is also presented.

2.1. Baseline Methods


An ARX model set is determined by two polynomials whose degrees are $n_a$ and $n_b$, respectively:

    A(z^{-1}, \theta) = 1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_{n_a} z^{-n_a}
    B(z^{-1}, \theta) = b_0 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_{n_b} z^{-n_b}        (1)

where $z^{-1}$ represents the time delay operator and $\theta$ is the set of parameters:

    \theta := [a_1 \; a_2 \; \cdots \; a_{n_a} \; b_1 \; b_2 \; \cdots \; b_{n_b}]^T        (2)

The acronym ARX can be explained in the model equation form for the calculation of $y(t)$, the predicted output at the time instant $t$:

    y(t) = \frac{B(z^{-1}, \theta)}{A(z^{-1}, \theta)} u(t) + \frac{1}{A(z^{-1}, \theta)} e(t)        (3)

or equivalently:

    A(z^{-1}, \theta)\, y(t) = B(z^{-1}, \theta)\, u(t) + e(t)        (4)
where e(t) is a zero-mean white noise process and u(t) is the exogenous input vector.

AR refers to the autoregressive part $A(z^{-1}, \theta)\, y(t)$ in the model, while X refers to the exogenous term $B(z^{-1}, \theta)\, u(t)$. The model set is completely determined once the integers $n_a$, $n_b$ and the parameter set $\theta$ have been specified.
A more general expression including an input/output delay is represented by:

    y(t) = z^{-n_k} \frac{B(z^{-1}, \theta)}{A(z^{-1}, \theta)} u(t) + \frac{1}{A(z^{-1}, \theta)} e(t)        (5)

where $n_k$ is the number of input–output delay samples.


When an SS is designed to replace hardware sensors, the output regressors are not always available. In these cases, an infinite-step prediction should be performed, using the past estimated values as output regressors. As an alternative, it is preferable not to involve output regressors in the description of the system dynamics, and finite impulse response (FIR) models can be adopted. FIR is a special case of Equation (1), with $n_a = 0$.
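The ARX structure of Equation (4) can be sketched numerically. The snippet below is a minimal illustration (not the identification procedure used in the paper): it simulates a SISO ARX process with invented coefficients and recovers the parameter vector θ by least squares on the regressor matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a SISO ARX(2,2) process:
# y(t) = -a1*y(t-1) - a2*y(t-2) + b0*u(t) + b1*u(t-1) + e(t)
a = np.array([-0.6, 0.2])   # true AR coefficients a1, a2 (invented for the example)
b = np.array([1.0, 0.5])    # true input coefficients b0, b1
N = 500
u = rng.normal(size=N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = -a[0]*y[t-1] - a[1]*y[t-2] + b[0]*u[t] + b[1]*u[t-1] + 0.01*rng.normal()

# Regressor matrix [-y(t-1), -y(t-2), u(t), u(t-1)] and least-squares estimate of theta
Phi = np.column_stack([-y[1:N-1], -y[0:N-2], u[2:N], u[1:N-1]])
theta, *_ = np.linalg.lstsq(Phi, y[2:N], rcond=None)
print(np.round(theta, 2))   # close to [a1, a2, b0, b1] = [-0.6, 0.2, 1.0, 0.5]
```

Setting the AR columns to zero (i.e., $n_a = 0$) turns the same least-squares problem into FIR identification.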

2.2. HDMDc Method


The algorithm produces a discrete state-space model; hence, the notation for discrete instances, $x_k$, of the continuous-time variable, $x(t)$, is used, where $x_k = x(k T_s)$ and $T_s$ is the sampling time of the model. Delay coordinates (i.e., $x_{k-1}$, $x_{k-2}$, etc.) are also included in the state-space model to account for state delays in the system. This procedure allows the creation of the augmented state space relevant to model nonlinear phenomena, as discussed in Section 1. Therefore, we define a state delay vector as:

    x_k^d = [x_{k-1} \; x_{k-2} \; \cdots \; x_{k-q+1}]^T,        (6)

where $q$ is the number of delay coordinates (including the current time step) of the state, with $x_k^d \in \mathbb{R}^{(q-1) n_x}$ and $n_x$ the number of state variables.
The input delay vector is defined as:

    u_k^d = [u_{k-1} \; u_{k-2} \; \cdots \; u_{k-q_u+1}]^T,        (7)

where $q_u$ is the number of delay coordinates (including the current time step) of the inputs, with $u_k^d \in \mathbb{R}^{(q_u-1) n_u}$ and $n_u$ the number of exogenous input variables.
The discrete state-space function is defined as:

    x_{k+1} = A x_k + A_d x_k^d + B u_k + B_d u_k^d,        (8)

where $A \in \mathbb{R}^{n_x \times n_x}$ is the state matrix, $A_d \in \mathbb{R}^{n_x \times (q-1) n_x}$ is the state delay matrix, $B \in \mathbb{R}^{n_x \times n_u}$ is the input matrix and $B_d \in \mathbb{R}^{n_x \times (q_u-1) n_u}$ is the delay input matrix. The system output is assumed to be equal to the state, i.e., the output matrix is assumed to be the identity matrix. When dealing with system identification in which only an input/output time series is available, this assumption implies that $n_x$ should be chosen as the size of the process output vector. The training time series consists of discrete measurements of the outputs (i.e., $y_k = x_k$) and the corresponding inputs (i.e., $u_k$).
The training data exploring the augmented state space, thanks to the delay shifts, are organized in the following matrices:

    X = [x_q \; x_{q+1} \; x_{q+2} \; \cdots \; x_{(q-1)+w}]        (9)

    X' = [x_{q+1} \; x_{q+2} \; x_{q+3} \; \cdots \; x_{q+w}]        (10)

    X_d = \begin{bmatrix} x_{q-1} & x_q & x_{q+1} & \cdots & x_{(q-1)+w-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_2 & x_3 & x_4 & \cdots & x_{w+1} \\ x_1 & x_2 & x_3 & \cdots & x_w \end{bmatrix}        (11)

    X'_d = \begin{bmatrix} x_q & x_{q+1} & x_{q+2} & \cdots & x_{(q-1)+w} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_3 & x_4 & x_5 & \cdots & x_{w+2} \\ x_2 & x_3 & x_4 & \cdots & x_{w+1} \end{bmatrix}        (12)

    \Gamma = [u_q \; u_{q+1} \; u_{q+2} \; \cdots \; u_{(q-1)+w}]        (13)

    \Gamma_d = \begin{bmatrix} u_{q-1} & u_q & u_{q+1} & \cdots & u_{(q-1)+w-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ u_{(q-q_u)+2} & u_{(q-q_u)+3} & u_{(q-q_u)+4} & \cdots & u_{(q-q_u)+w+1} \\ u_{(q-q_u)+1} & u_{(q-q_u)+2} & u_{(q-q_u)+3} & \cdots & u_{(q-q_u)+w} \end{bmatrix}        (14)
where $w$ represents the number of time snapshots, i.e., the number of columns of the matrices; $X'$ is the matrix $X$ shifted forward by one time step; $X_d$ is the matrix of delayed states; and $\Gamma$ is the matrix of inputs. Moreover, to incorporate the dynamic effect of the control inputs, an extended matrix of the exogenous inputs with time shifts (i.e., $\Gamma_d$) is created and included in the model. Equation (8) can now be combined with the matrices in Equations (9)–(14) to produce:

    \begin{bmatrix} X' \\ X'_d \end{bmatrix} = A X + A_d X_d + B \Gamma + B_d \Gamma_d        (15)
Note that the primary objective of HDMDc is to determine the best-fit model matrices, $A$, $A_d$, $B$ and $B_d$, given the data in $X'$, $X$, $X_d$, $\Gamma$ and $\Gamma_d$ [40]. Considering the definition of the Hankel matrix, $H$, for a generic single-measurement time series, $h_k$, and applying a $d$ time shift:

    H = \begin{bmatrix} h_d & h_{d+1} & h_{d+2} & \cdots & h_{(d-1)+w} \\ h_{d-1} & h_d & h_{d+1} & \cdots & h_{(d-2)+w} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ h_1 & h_2 & h_3 & \cdots & h_w \end{bmatrix}        (16)
we can introduce the synoptic notation:

    X_H = \begin{bmatrix} X \\ X_d \end{bmatrix}, \quad X'_H = \begin{bmatrix} X' \\ X'_d \end{bmatrix}, \quad \Gamma_H = \begin{bmatrix} \Gamma \\ \Gamma_d \end{bmatrix},
    A_H = [A \; A_d], \quad B_H = [B \; B_d]        (17)

with $X_H \in \mathbb{R}^{q n_x \times w}$ and $\Gamma_H \in \mathbb{R}^{q_u n_u \times w}$ the Hankel matrices for the time series $x_k$ and $u_k$, respectively. $A_H$ and $B_H$ are the transformation matrices for the augmented state and inputs, with $A_H \in \mathbb{R}^{q n_x \times q n_x}$ and $B_H \in \mathbb{R}^{q n_x \times q_u n_u}$.
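The delay embedding of Equations (16) and (17) can be sketched with NumPy. The helper below is an illustrative reimplementation (the series, q and w are chosen arbitrarily): it stacks q delayed copies of a scalar time series, with the most recent samples in the first row, as in Equation (16).

```python
import numpy as np

def hankel_embed(h, q, w):
    """Stack q delayed copies of the series h into a q-by-w Hankel matrix.
    Row 0 holds h[q-1 : q-1+w]; the last row holds h[0 : w] (cf. Equation (16))."""
    return np.array([h[q - 1 - d : q - 1 - d + w] for d in range(q)])

h = np.arange(10.0)           # toy series h_1 ... h_10
H = hankel_embed(h, q=3, w=5)
print(H)
# first row: [2. 3. 4. 5. 6.], last row: [0. 1. 2. 3. 4.]
```

Applying the same helper column-wise to the multivariate series $x_k$ and $u_k$ yields $X_H$ and $\Gamma_H$.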
Considering the matrix $\Omega \in \mathbb{R}^{(q n_x + q_u n_u) \times w}$ as the composition of the delayed inputs and outputs, and $G$ as the global transformation matrix described in Equation (18),

    \Omega = \begin{bmatrix} X_H \\ \Gamma_H \end{bmatrix}, \quad G = [A_H \; B_H],        (18)

we obtain:

    X'_H = G \Omega        (19)

A truncated singular value decomposition (SVD) of the $\Omega$ matrix results in the following approximation:

    \Omega \approx \tilde{U}_p \tilde{\Sigma}_p \tilde{V}_p^T        (20)

where the notation $\tilde{\cdot}$ represents the rank-$p$ truncation of the corresponding matrix, with $\tilde{U}_p \in \mathbb{R}^{(q n_x + q_u n_u) \times p}$, $\tilde{\Sigma}_p \in \mathbb{R}^{p \times p}$ and $\tilde{V}_p \in \mathbb{R}^{w \times p}$. Then the approximation of $G$ can be computed as:

    G \approx X'_H \tilde{V}_p \tilde{\Sigma}_p^{-1} \tilde{U}_p^T        (21)

To reconstruct the approximate state matrices $\tilde{A}_H$ and $\tilde{B}_H$, the matrix $\tilde{U}_p$ can be split into two separate components: $\tilde{U}_{p1}$, related to the state, and $\tilde{U}_{p2}$, related to the exogenous inputs:

    \tilde{U}_p^T = [\tilde{U}_{p1}^T \; \tilde{U}_{p2}^T]        (22)

where $\tilde{U}_{p1} \in \mathbb{R}^{q n_x \times p}$ and $\tilde{U}_{p2} \in \mathbb{R}^{q_u n_u \times p}$.


The complete $G$ matrix can therefore be split as:

    G \approx [\bar{A}_H \; \bar{B}_H] = \left[ X'_H \tilde{V}_p \tilde{\Sigma}_p^{-1} \tilde{U}_{p1}^T \quad X'_H \tilde{V}_p \tilde{\Sigma}_p^{-1} \tilde{U}_{p2}^T \right]        (23)

Due to the high dimension of the matrices, and to further optimize the computation of the reconstructed system, a truncated SVD of the $X'_H$ matrix results in the following approximation:

    X'_H \approx \hat{U}_r \hat{\Sigma}_r \hat{V}_r^T        (24)

where the notation $\hat{\cdot}$ represents the rank-$r$ truncation, with $\hat{U}_r \in \mathbb{R}^{q n_x \times r}$, $\hat{\Sigma}_r \in \mathbb{R}^{r \times r}$ and $\hat{V}_r \in \mathbb{R}^{w \times r}$; typically, $r < p$ is considered. Projecting the operators $\bar{A}_H$ and $\bar{B}_H$ onto the low-dimensional space, we obtain:

    \tilde{A}_H = \hat{U}_r^T \bar{A}_H \hat{U}_r = \hat{U}_r^T X'_H \tilde{V}_p \tilde{\Sigma}_p^{-1} \tilde{U}_{p1}^T \hat{U}_r        (25)

    \tilde{B}_H = \hat{U}_r^T \bar{B}_H = \hat{U}_r^T X'_H \tilde{V}_p \tilde{\Sigma}_p^{-1} \tilde{U}_{p2}^T        (26)

with $\tilde{A}_H \in \mathbb{R}^{r \times r}$ and $\tilde{B}_H \in \mathbb{R}^{r \times q_u n_u}$. The approximated discrete-time system, based on the Hankel transformation of the original time series (i.e., $\tilde{x}_k^H$ and $u_k^H$), can therefore be represented as:

    \tilde{x}_{k+1}^H = \tilde{A}_H \tilde{x}_k^H + \tilde{B}_H u_k^H        (27)

with $x_k^H = \hat{U}_r \tilde{x}_k^H$. The original time series $x_k$ is then extracted from $x_k^H$ by considering only the rows with index $i = n q + 1$, where $n = 0, 1, \ldots, (n_x - 1)$.
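Equations (20)–(27) reduce to a few lines of linear algebra. The sketch below is an illustrative reimplementation on random data (all dimensions and truncation orders are chosen arbitrarily); it computes the reduced operators Ã_H and B̃_H and propagates one step of the reduced dynamics.

```python
import numpy as np

rng = np.random.default_rng(1)
qnx, qunu, w = 12, 6, 200     # augmented state dim, augmented input dim, snapshots
p, r = 10, 5                  # truncation orders, with r < p

XH  = rng.normal(size=(qnx, w))    # Hankel state snapshots X_H
XHp = rng.normal(size=(qnx, w))    # one-step-shifted snapshots X'_H
GH  = rng.normal(size=(qunu, w))   # Hankel input snapshots Gamma_H

Omega = np.vstack([XH, GH])                          # Eq. (18)
U, s, Vt = np.linalg.svd(Omega, full_matrices=False)
Up, Sp, Vp = U[:, :p], s[:p], Vt[:p].T               # rank-p truncation, Eq. (20)
Up1, Up2 = Up[:qnx], Up[qnx:]                        # state/input split, Eq. (22)

Uh, sh, Vh = np.linalg.svd(XHp, full_matrices=False)
Ur = Uh[:, :r]                                       # rank-r truncation, Eq. (24)

core = XHp @ Vp @ np.diag(1.0 / Sp)                  # X'_H V_p Sigma_p^{-1}
A_t = Ur.T @ core @ Up1.T @ Ur                       # Eq. (25)
B_t = Ur.T @ core @ Up2.T                            # Eq. (26)

# One step of the reduced dynamics, Eq. (27)
x_t = Ur.T @ XH[:, [0]]                              # project initial augmented state
x_next = A_t @ x_t + B_t @ GH[:, [0]]
print(A_t.shape, B_t.shape, x_next.shape)            # (5, 5) (5, 6) (5, 1)
```

On real process data, XH, XHp and GH would come from the Hankel stacking of Equations (9)–(14) rather than from a random generator.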
Figure 1 clarifies the HDMDc procedure at a higher level. The input and output
measurement data from the historical dataset of an industrial process are fed into the
HDMDc block that performs in sequence the Hankel transformation of the input/output
variables, the merging of the state and input matrices, which integrates the control signals
(DMDc), and then the model identification procedure (DMD). The DMD performs a space
transformation, based on the SVD, and returns the reduced estimated state-space system
representation used for the output multi-step-ahead prediction.

Figure 1. HDMDc block scheme.



2.3. MSA-HDMDc Method


In this section, the proposed MSA-HDMDc method is described and the procedure is presented in Algorithm 1. It performs model optimization and multi-step-ahead prediction of the process output on the basis of a model identified using HDMDc. The model optimization is based on a cost function depending on a combination of key performance indicators (KPIs), namely the mean absolute percentage error ($MAPE_\%$) and the coefficient of determination ($R^2$), adopted for the comparison of the estimated output with the measured target.
Moreover, to compare a set of models to a chosen baseline (bl) one, the performance improvement ($PI_\%$) index is defined for each KPI as:

    KPI_{PI\%} = \frac{(T - KPI_{bl}) - (T - KPI_{new})}{(T - KPI_{bl})} \cdot 100\%        (28)

with $T = 0$ for $MAPE_\%$ and $T = 1$ for $R^2$.
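As a small numeric illustration of Equation (28) (the KPI values below are invented for the example), the PI% index rewards a drop in MAPE% toward 0 and a rise in R² toward 1 relative to the baseline:

```python
def pi_percent(kpi_bl, kpi_new, target):
    """Performance improvement (Equation (28)): positive when the new model
    moves the KPI closer to its ideal value `target` (0 for MAPE%, 1 for R^2)."""
    return 100.0 * ((target - kpi_bl) - (target - kpi_new)) / (target - kpi_bl)

# Baseline MAPE% = 8.0 vs. new 6.0
print(round(pi_percent(8.0, 6.0, target=0.0), 1))   # 25.0
# Baseline R^2 = 0.80 vs. new 0.90
print(round(pi_percent(0.80, 0.90, target=1.0), 1)) # 50.0
```

Note that the index is scale-free: a fixed absolute gain counts for more when the baseline is already close to its target.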


As a preliminary step, the optimal delay shifts, $q^{opt}$ and $q_u^{opt}$, for the state and the input, respectively, should be determined in sequence. The selection is performed by applying the HDMDc algorithm without order reduction, adopting an optimization algorithm (e.g., a grid search strategy) and comparing the prediction performances in terms of $MAPE_\%$ and $R^2$, as will be shown in Section 4.
The MSA-HDMDc algorithm is described in the following. The acquired input/output data samples are required to perform the model identification procedure. The training dataset is used to create the $X$ and $\Gamma$ matrices with the outputs and inputs, respectively. The state and input augmentations ($X_H$ and $\Gamma_H$) are performed by applying the Hankel operator to the original measurements, and the extended state matrix, $\Omega$, is obtained by appending the $X_H$ and $\Gamma_H$ matrices. The core SVD algorithm is performed on the $\Omega$ and $X'_H$ matrices. The iteration for model optimization on multi-step-ahead prediction is then performed by determining, in sequence, the optimal reduction orders for the $\Omega$ and $X'_H$ matrices.

The core of the model reduction and reconstruction is performed by the function Reconstruct in Algorithm 2. It performs the following operations:
• The state matrix order truncation ($p$, $r$) for model reduction;
• The determination of the HDMDc operators $\tilde{A}_H$ and $\tilde{B}_H$ as the state-space representation of the identified reduced model;
• The iterative reconstruction of the multi-step-ahead estimated state in the reduced state space ($\tilde{X}_H$), for each selected time horizon within $K_{max}$;
• The remapping of the reduced state variables to the original augmented state space ($\hat{X}_H$);
• The extraction of the original state variables from $\hat{X}_H$, selecting the rows related to the first time shift of each state variable;
• The evaluation of $MAPE_\%$ and $R^2$, comparing the model predictions with the target output.
The model reduction is performed in two steps. In the first step, the $\Omega$ matrix is reduced by adopting different order reductions, $p \in p_{range}$, selected on a grid with maximum value $p_{max} = n_x q^{opt} + n_u q_u^{opt}$. In this phase, the matrix $X'_H$ is kept at the full order $r = r_{max} = n_x q^{opt}$. The optimal reduction order, $p^{opt}$, is determined by maximizing the cost function, $f_p$, over $p_{range}$. The cost function consists of a linear combination of the adopted KPIs ($MAPE_\%$ and $R^2$) evaluated at the maximum prediction time step, $K_{max}$. The optimization is performed using the validation data contained in the training dataset.
In the second step, the matrix $X'_H$ is reduced by adopting different order reductions, $r \in r_{range}$, selected on a grid with maximum value $r_{max} = \min(n_x q^{opt}, p)$. The reduced matrix $\Omega$, with order $p = p^{opt}$, is here considered. The optimal reduction order, $r^{opt}$, is determined by maximizing the cost function, $f_r$, over $r_{range}$. The multi-step-ahead prediction can then be performed by reconstructing the output dynamics of the optimal identified model.
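The two-step search can be sketched as follows. This is an illustrative skeleton only: `reconstruct` is a stand-in stub whose surrogate KPI values are invented (in the actual procedure it would run Algorithm 2 on validation data), and the grids are chosen arbitrarily; the cost sums the two PI% terms as in Algorithm 1.

```python
def reconstruct(p, r, k_max):
    """Stand-in for Algorithm 2: identify the (p, r)-reduced model and return
    (MAPE%, R^2) at prediction step k_max. Surrogate values, invented here."""
    return 10.0 / (1 + 0.1 * min(p, 40)) + 0.05 * r, 1.0 - 1.0 / (1 + 0.2 * r)

def cost(mape, r2, mape_bl, r2_bl):
    # f = MAPE_PI% + R2_PI%, Equation (28) with T = 0 and T = 1
    mape_pi = 100.0 * ((0 - mape_bl) - (0 - mape)) / (0 - mape_bl)
    r2_pi = 100.0 * ((1 - r2_bl) - (1 - r2)) / (1 - r2_bl)
    return mape_pi + r2_pi

K_MAX, P_MAX, R_MAX = 30, 60, 40
p_range, r_range = range(10, P_MAX + 1, 10), range(5, R_MAX + 1, 5)

# Step 1: choose p with X'_H kept at full order r_max
mape_bl, r2_bl = reconstruct(P_MAX, R_MAX, K_MAX)
p_opt = max(p_range, key=lambda p: cost(*reconstruct(p, R_MAX, K_MAX), mape_bl, r2_bl))

# Step 2: choose r with p fixed at p_opt
mape_bl, r2_bl = reconstruct(p_opt, R_MAX, K_MAX)
r_opt = max(r_range, key=lambda r: cost(*reconstruct(p_opt, r, K_MAX), mape_bl, r2_bl))
print(p_opt, r_opt)
```

Searching the two orders in sequence, rather than jointly, is what keeps the number of model identifications linear instead of quadratic in the grid sizes.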

Algorithm 1 MSA-HDMDc Algorithm

Given the training datasets X ∈ R^{n_x × w}, Γ ∈ R^{n_u × w}:
    X  = [x_1 x_2 ··· x_{m−1}]
    X′ = [x_2 x_3 ··· x_m]
    Γ  = [u_1 u_2 ··· u_{m−1}]
Set K_max as the maximum prediction time step value
p_max = n_x q^{opt} + n_u q_u^{opt}
r_max = min(n_x q^{opt}, p)
Set p_range and r_range with r < p and r < r_max
X_H  = Hankel(X, q^{opt})
X′_H = Hankel(X′, q^{opt})
Γ_H  = Hankel(Γ, q_u^{opt})
Ω = [X_H^T Γ_H^T]^T
Compute the SVD of Ω = Ũ Σ̃ Ṽ^T
Compute the SVD of X′_H = Û Σ̂ V̂^T
for p ∈ p_range do
    [MAPE%_{(p,r_max)}, R²_{(p,r_max)}] = Reconstruct(p, r_max, K_max)
end for
[MAPE%_{bl_p}, R²_{bl_p}] = Reconstruct(p_max, r_max, K_max)
f_p(MAPE%_{(p,r_max)}, R²_{(p,r_max)}, MAPE%_{bl_p}, R²_{bl_p}) = MAPE_PI% + R²_PI%
p^{opt} = arg max_{p ∈ p_range} f_p(p, r_max)
for r ∈ r_range do
    [MAPE%_{(p^{opt},r)}, R²_{(p^{opt},r)}] = Reconstruct(p^{opt}, r, K_max)
end for
[MAPE%_{bl_r}, R²_{bl_r}] = Reconstruct(p^{opt}, r_max, K_max)
f_r(MAPE%_{(p^{opt},r)}, R²_{(p^{opt},r)}, MAPE%_{bl_r}, R²_{bl_r}) = MAPE_PI% + R²_PI%
r^{opt} = arg max_{r ∈ r_range} f_r(p^{opt}, r)
[MAPE%_{(p^{opt},r^{opt})}, R²_{(p^{opt},r^{opt})}] = Reconstruct(p^{opt}, r^{opt}, K_max)

Algorithm 2 Reduction and Reconstruction Algorithm

function [MAPE%, R²] = Reconstruct(p, r, K)
    Truncate the SVD matrices at order p: Ω ≈ Ũ_p Σ̃_p Ṽ_p^T
    Truncate the SVD matrices at order r: X′_H ≈ Û_r Σ̂_r V̂_r^T
    Compute the HDMDc operators Ã_H = Û_r^T X′_H Ṽ_p Σ̃_p^{−1} Ũ_{p1}^T Û_r
        and B̃_H = Û_r^T X′_H Ṽ_p Σ̃_p^{−1} Ũ_{p2}^T
    for j = 1, 2, ···, w − K_max do
        Get x_j^H, u_j^H from the datasets X_H and Γ_H
        x̃_j^H = Û_r^T x_j^H
        for k = 0, 1, ···, K_max − 1 do
            x̃_{j+k+1}^H = Ã_H x̃_{j+k}^H + B̃_H u_{j+k}^H
            x̂_{j+k+1}^H = Û_r x̃_{j+k+1}^H
        end for
        X̂_H = [x̂_1^H x̂_2^H ··· x̂_{w−K_max}^H]
        Select the X̂_H rows for the first time shift of each state variable
        Compute MAPE% and R²
    end for
    Return MAPE% and R² for the prediction step K

3. Case Studies
This section describes the two case studies, which are widely used in the field of SSs. Both are from the petrochemical sector and are complex systems with exogenous inputs and strong nonlinearities.

3.1. The Sulfur Recovery Unit


The SRU desulfurization unit considered here is located in a refinery in Sicily (Italy), as described in [54]. SRUs in refineries are used to recover elemental sulfur from the gaseous hydrogen sulfide (H2S) contained in by-product gases produced during the refining of crude oil and other industrial processes. Since H2S is a hazardous environmental pollutant, this process is of fundamental importance.
The inlets of each SRU line receive two acid gases: MEA gas, rich in H2S, and SWS (Sour Water Stripping) gas, rich in H2S and ammonia (NH3). These input gases are combusted in two separate chamber reactors fed by a suitable airflow supply for combustion control. The output gas stream contains residues of H2S and sulfur dioxide (SO2). Normally, the ratio of H2S to SO2 in the tail gas must be maintained at a specified setpoint. An additional secondary airflow is used as an input to improve process control. This variable is the output of a feedback control system and is used to reduce the peak values of H2S and SO2.

Figure 2 shows the working scheme of an SRU line. The application of SSs is necessary to estimate such concentrations in real time.

Figure 2. SRU line working scheme.

The input and output variables available in the SRU historical dataset are listed in
Table 1. MSA-HDMDc is used to estimate the H2 S concentration (i.e., y1 output).

Table 1. Input and output variables of the SRU models.

Variable                Description
u(1) = MEA_GAS          gas flow in the MEA chamber (Nm³/h)
u(2) = AIR_MEA          airflow in the MEA chamber (Nm³/h)
u(3) = SWS_GAS          total gas flow in the SWS chamber (Nm³/h)
u(4) = AIR_SWS          total airflow in the SWS chamber (Nm³/h)
u(5) = AIR_MEA2         secondary airflow (Nm³/h)
y1 = [H2S]              H2S concentration (output 1) (mol%)
y2 = [SO2]              SO2 concentration (output 2) (mol%)

3.2. The Debutanizer Column


The column is located at ERG Raffineria Mediterranea s.r.l. (ERGMED) in Syracuse, Italy, and is an integral part of a desulfurization and naphtha splitting plant. In the DC, propane (C3) and butane (C4) are extracted from the naphtha stream as overheads [52].

The DC is required to:


• Ensure sufficient fractionation in the debutanizer;
• Maximize the C5 content (stabilized gasoline) in the debutanizer overhead (LP gas
splitter feed) while complying with the legally prescribed limit;
• Minimize the C4 (butane) content in the bottom of the debutanizer (feed to the
naphtha splitter).
A detailed schematic of the debutanizer column is shown in Figure 3. It includes the
following components:
• E150B heat exchanger;
• E107AB overhead condenser;
• E108AB bottom reboiler;
• P102AB head reflux pump;
• P103AB feed pump to the LPG splitter;
• D104 reflux accumulator.

Figure 3. Schematic representation of the debutanizer column (DC) with indication of the location of
the hardware measuring devices, the model exogenous input, u, and soft sensor model output, y.

A number of hardware sensors are installed in the plant to monitor product quality.
The subset of sensors that are relevant for the described application are listed in Table 2.

Table 2. Input and output variables of the DC models.

Variable                    Description
u(1) = T040                 top temperature (°C)
u(2) = P011                 top pressure (kg/cm²)
u(3) = F015                 top reflux (m³/h)
u(4) = F018                 top flow (m³/h)
u(5) = T004                 side temperature (°C)
u(6) = (T036 + T037)/2      T036 and T037 bottom temperatures (°C)
y = FC4                     C4 concentration in the bottom flow (%)

The C4 concentration in the bottom flow is estimated as the output of the designed SS. It is not measured on the bottom stream, but at the overheads of the deisopentanization column. The C4 content in the C5 stream depends solely on the operating conditions of the debutanizer: it can be assumed that the C4 detected in the C5 stream is that which flows from the bottom of the debutanizer. Due to the location of the analyzer, the concentration values are determined with a long delay, which is not exactly known but is constant, probably around 30 min. To improve the control quality of the DC, real-time estimation of both the C4 and C5 content is required. For this purpose, a virtual sensor is needed, which is described in the following sections.

4. Sulfur Recovery Unit Results


In this section, the proposed MSA-HDMDc is applied to the SRU case study and the results are compared with those of the baseline models. The available dataset consists of about 14,000 samples, with a sampling period of one minute. Of these, 70% of the dataset is used for model training, 15% for the validation of the hyperparameters and model order reduction and, finally, the testing is performed on the remaining 15% of the dataset.

4.1. Model Optimization and Hyperparameter Tuning


An MSA prediction is performed for the output, y1 . To validate the performance of the
procedure, a time horizon of Kmax = 30 steps is selected. To better show the effect of the
MSA prediction for different time horizons, values in the interval K ∈ {1, 5, 10, 15, 20, 25, 30}
are considered.

4.1.1. Baseline Models


Two linear models were considered: ARX and FIR. The optimal model order was selected based on the minimum description length (MDL) criterion [55]. In particular, the ARX structure was identified to have eight common poles (na = 8), two zeros (nb = 3), and no delay for all input variables (nk = 0), while the FIR order parameters were found to be nb = 10 and nk = 0.
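The MDL-based order selection used for the baseline models can be sketched as follows. The synthetic data, the candidate order grid, and the helper names are illustrative assumptions, not the authors' implementation; the criterion is Rissanen's MDL applied to a least-squares SISO ARX fit.

```python
import numpy as np

def fit_arx(y, u, na, nb, nk=0):
    """Least-squares fit of a SISO ARX model; returns residual variance and N."""
    start = max(na, nb + nk)
    Phi = np.array([[-y[t - i] for i in range(1, na + 1)]
                    + [u[t - nk - j] for j in range(nb)]
                    for t in range(start, len(y))])
    Y = y[start:]
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return np.mean((Y - Phi @ theta) ** 2), len(Y)

def mdl(V, d, N):
    """Rissanen's minimum description length criterion."""
    return N * np.log(V) + d * np.log(N)

# Illustrative synthetic data standing in for the process measurements
rng = np.random.default_rng(0)
u = rng.standard_normal(2000)
y = np.zeros(2000)
for t in range(2, 2000):
    y[t] = 0.6 * y[t - 1] - 0.2 * y[t - 2] + 0.5 * u[t - 1] \
           + 0.01 * rng.standard_normal()

# Grid search over candidate orders; the minimum-MDL pair is selected
scores = {}
for na in range(1, 8):
    for nb in range(1, 8):
        V, N = fit_arx(y, u, na, nb)
        scores[(na, nb)] = mdl(V, na + nb, N)
best = min(scores, key=scores.get)
print("MDL-selected (na, nb):", best)
```

Because the MDL penalty grows with log(N) per parameter, the search settles on the smallest order that captures the dynamics, which is how the na, nb, nk values reported above were obtained.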

4.1.2. MSA-HDMDc Model


In a preliminary phase of the iterative MSA-HDMDc procedure, a parametric study
was performed based on model performance in terms of MAPE% and R2 . Such a procedure
allowed the definition of the hyperparameters related to the delay shifts q and qu applied
to the state and inputs, respectively.
The first step was to determine the optimal state variable delay shift, q. A grid search strategy was applied: for each q between q = 20 and q = 60, with a step of 10, the full-rank HDMDc model was identified. The estimated output reconstructed at the maximum prediction step (i.e., 30 steps) was compared with the measured output, y1.
For statistical analysis, the validation dataset was divided into 20 subsets of 100 samples.
MAPE% and R2 were evaluated for each subset, and the corresponding distribution was
determined. In particular, the median value of MAPE% was used to select the optimal
hyperparameter, q.
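The batch-wise statistical evaluation described above, in which the validation data are split into fixed-size subsets and candidate shifts are ranked by the median subset MAPE%, can be sketched as follows; the stand-in signals and the synthetic error model are assumptions for illustration only.

```python
import numpy as np

def batched_mape(y_true, y_pred, batch=100):
    """MAPE% computed over consecutive fixed-size batches of the validation set."""
    n = (len(y_true) // batch) * batch
    yt = y_true[:n].reshape(-1, batch)
    yp = y_pred[:n].reshape(-1, batch)
    return 100.0 * np.mean(np.abs((yt - yp) / yt), axis=1)

# Illustrative ranking of candidate delay shifts by the median batch MAPE%;
# the predictions are synthetic, with error growing away from q = 40
rng = np.random.default_rng(1)
y_val = rng.uniform(1.0, 2.0, 2000)
preds = {q: y_val + rng.normal(0.0, 0.02 + 0.001 * abs(q - 40), 2000)
         for q in (20, 30, 40, 50, 60)}
q_opt = min(preds, key=lambda q: np.median(batched_mape(y_val, preds[q])))
print("selected q:", q_opt)
```

Using the median of the per-subset distribution, rather than a single global score, makes the selection less sensitive to isolated poorly predicted intervals.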
Table 3 shows the mean value of MAPE% over the 20 trials considering a 30-step-ahead
prediction for the selected q ∈ {20, 30, 40, 50, 60}. The best performing model corresponds
to qopt = 40, as shown by the reported PI%. In a second step, with the optimal value qopt = 40 fixed, a further parametric optimization was performed by varying the Hankel shift, qu, applied to the exogenous control inputs in the Γ H matrix. Since qu ≤ qopt is required for the system to be causal, the model was identified for each qu in the set qu ∈ {10, 15, 30, 40}. The estimated output, reconstructed at each considered prediction step, was compared with the measured output, y1. Taking the model with qu = qopt = 40, corresponding to the maximum allowable value, as the baseline, the barplot in Figure 4 shows how the system performance changes (PI%), at different prediction steps, as qu decreases. It can be noticed that reducing the Hankel shift, qu, applied to the input variables yields negative PI% and, thus, a decay of the model performance, both in terms of MAPE% and R2, for prediction steps higher than 5. This led to the selection of qu = qopt = 40 as the optimal number of Hankel shifts for both inputs and state variables.

Table 3. SRU case study: performance comparison for the selection of qopt. The mean value of MAPE% over 20 subsets of data is reported for different state time shifts, q. The KPI is evaluated for a 30-step-ahead prediction. The PI% is reported considering q = 40 as the reference value.

State Time-Shift Optimization


q 20 30 40 50 60
MAPE% 5.53 5.22 5.19 5.28 5.46
PI% −6.66 −0.58 0 −1.79 −5.13

(a) PI% for MAPE%

(b) PI% for R2

Figure 4. SRU case study: percentage performance improvement PI% for (a) MAPE% and (b) R2 at
each prediction step, varying the input delay shifts, qu , in the MSA-HDMDc algorithm. The PI% was
calculated for each of the identified models with respect to the baseline model with qu = q = 40.

4.2. Model Order Reduction


To evaluate the importance of the model reduction phase, as proposed in MSA-
HDMDc, the results obtained during the model order optimization are shown for different
prediction horizons. To better assess the performance of the model, both MAPE% and R2
are shown.
According to Algorithm 1, the analysis was performed considering two optimization phases, first for p and then for r, each representing the SVD truncation order for the matrices Ω (Equation (20)) and X′H (Equation (24)), respectively. In a first step, the Ω matrix was truncated from the full order pmax = nx qopt + nu quopt = 240 (with qopt = 40, nx = 1, quopt = 40, and nu = 5) to the reduced order p. Models with different p-reductions from the set prange ∈ {201, 202, 210, 220, 240}, and with the full-order matrix X′H with order rmax = min(nx qopt, p) = 40, are shown in Figure 5, where the indices MAPE% and R2 are given for different prediction steps. The global performance of HDMDc on the validation dataset

was compared with the performance of the baseline methods (i.e., ARX and FIR). It can be noticed that the MSA-HDMDc model outperforms the FIR model for all prediction steps and the ARX model for prediction steps greater than or equal to 5, regardless of the model order, p. Comparing the MSA-HDMDc models for the different values of p also shows that reducing the Ω matrix to order p has a positive effect on the performance of the model for prediction steps greater than 5; the optimal configuration is achieved for order p = 202.
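The two-stage SVD truncation at the core of this step follows the standard DMDc construction [40]: the input-space matrix Ω = [X; Υ] is truncated to order p, the output snapshot matrix X′H to order r, and the reduced operators are formed by projection. A minimal sketch, with random stand-in Hankel-augmented matrices sized as in this case study (40 state rows, 200 input rows), could look as follows; it is an illustration, not the authors' code.

```python
import numpy as np

def dmdc_reduced(X, Xp, Ups, p, r):
    """DMDc with SVD truncation: Omega = [X; Ups] -> rank p, Xp -> rank r.
    Returns the reduced operators (A_r, B_r) and the projection basis Uhat."""
    n = X.shape[0]
    Omega = np.vstack([X, Ups])
    U, S, Vt = np.linalg.svd(Omega, full_matrices=False)
    U, S, Vt = U[:, :p], S[:p], Vt[:p, :]           # input-space truncation to p
    U1, U2 = U[:n, :], U[n:, :]                     # split state / control parts
    Uh, _, _ = np.linalg.svd(Xp, full_matrices=False)
    Uh = Uh[:, :r]                                  # output-space truncation to r
    G = Xp @ Vt.T @ np.diag(1.0 / S)                # X' V S^-1
    A_r = Uh.T @ G @ U1.T @ Uh                      # reduced state matrix
    B_r = Uh.T @ G @ U2.T                           # reduced input matrix
    return A_r, B_r, Uh

# Illustrative Hankel-augmented snapshot matrices standing in for the SRU data
rng = np.random.default_rng(2)
nx, nu, m = 40, 200, 500                            # augmented state/input rows, snapshots
A = 0.9 * np.eye(nx) + 0.01 * rng.standard_normal((nx, nx))
B = 0.1 * rng.standard_normal((nx, nu))
X = rng.standard_normal((nx, m))
Ups = rng.standard_normal((nu, m))
Xp = A @ X + B @ Ups                                # exact linear snapshots
A_r, B_r, Uh = dmdc_reduced(X, Xp, Ups, p=202, r=25)
print(A_r.shape, B_r.shape)                         # (25, 25) (25, 200)
```

Truncating Ω to p = 202 and X′H to r = 25 reproduces the dimensions selected for this case study.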

(a) MAPE%

(b) R2

Figure 5. SRU case study: MSA model performances: (a) MAPE%, (b) R2 for the ARX, FIR, and MSA-HDMDc models, varying the reduced order, p, of the Ω matrix in prange ∈ {201, 202, 210, 220, 240} and considering the matrix X′H at full order rmax = 40.

In a second step, while maintaining the optimal Ω order popt = 202, the X′H matrix was truncated from the full order rmax = min(nx qopt, popt) = 40 to the reduced order, r. Models with different r-reductions belonging to the set r ∈ {18, 20, 23, 25, 30, 35} were identified. The global performance comparison on the validation dataset is shown in Figure 6, where the PI% is given for both MAPE% and R2 at different prediction steps. The reference model for the PI% is MSA-HDMDc with popt = 202 and X′H at full order rmax = 40. It can be noticed that the MSA-HDMDc model with X′H reduced to order r = 25 outperforms the full-order model for prediction steps greater than 5. These results confirm that the optimized MSA-HDMDc order reduction allows identification of the dominant dynamics and thus introduces robustness and noise rejection features into the reduced model. These properties improve the long-term prediction performance compared with the full-order system. Finally, the optimal order for the MSA-HDMDc model was determined to be popt = 202 and ropt = 25.
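With the reduced operators identified, the multi-step-ahead prediction itself is a rollout of the reduced linear dynamics over the chosen horizon, driven by the known exogenous inputs. A minimal sketch, with stand-in operators sized as in this case study, could be:

```python
import numpy as np

def msa_predict(A_r, B_r, z0, U_future):
    """Roll the reduced linear model K steps ahead:
    z_{k+1} = A_r z_k + B_r u_k, given the future input snapshots."""
    z = z0.copy()
    traj = []
    for u_k in U_future.T:                  # columns are input snapshots
        z = A_r @ z + B_r @ u_k
        traj.append(z.copy())
    return np.column_stack(traj)            # shape (r, K)

# Illustrative 30-step rollout with stand-in reduced operators
rng = np.random.default_rng(3)
r, nu, K = 25, 200, 30
A_r = 0.95 * np.eye(r)                      # stable toy state matrix
B_r = 0.01 * rng.standard_normal((r, nu))
Z = msa_predict(A_r, B_r, np.ones(r), rng.standard_normal((nu, K)))
print(Z.shape)                              # (25, 30)
```

The measured output estimate at each step is then recovered from the reduced state through the projection basis, as in the identification phase.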

(a) PI% for MAPE%

(b) PI% for R2

Figure 6. SRU case study: barplot of PI% for (a) MAPE% and (b) R2 with Ω matrix order reduction popt = 202, varying the X′H matrix reduction order in rrange ∈ {18, 20, 23, 25, 30, 35}. The PI% was calculated for each of the identified models with respect to the reference model with popt = 202 and r = rmax = 40.

4.3. Model Comparisons and Discussion


In this section, the performance of the MSA-HDMDc reduced-order model is evaluated. In particular, the results for the maximum prediction step (i.e., 30 steps) are presented here. The regression plots on the test dataset for the baseline and the optimal MSA-HDMDc models are reported in Figure 7. The ARX model regression plot presents a slope of 0.41 and a bias of 0.13; its global performance over the test dataset is not considered acceptable. The FIR regression plot presents a slope of 0.76 with a bias of 0.04. MSA-HDMDc outperforms both baseline models, with a slope of 0.77 and a bias of 0.068.

(a) ARX (b) FIR

(c) MSA-HDMDc

Figure 7. SRU case study: regression plots of the predicted output at 30 steps versus the target measured output, y1: (a) ARX model, (b) FIR model, (c) MSA-HDMDc model with optimal parameters qopt = 40, quopt = 40, popt = 202, ropt = 25.

The time plot in Figure 8 shows a comparison between the measured output, y1 , and
the 30-step-ahead predicted output for the baseline and the MSA-HDMDc models for
a subset of the test dataset.

Figure 8. SRU case study: comparison of the measured output (y1) with the outputs predicted 30 steps ahead by the baseline and MSA-HDMDc models with optimal parameters qopt = 40, quopt = 40, popt = 202, ropt = 25.

As described above, the main objective of the SRU is to remove H2 S from the gas flow.
Therefore, the estimation of the output peaks is of greatest interest for the designed model.
Figure 8 shows the prediction performance of the proposed model, which outperforms the
results obtained with the baseline approaches, especially with respect to the peak events.
Tables 4 and 5 report the MAPE% and R2 on the test dataset at different prediction steps for the considered models. Both KPIs are in agreement and show the superiority of the MSA-HDMDc model for large prediction horizons.
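The KPIs used throughout follow their standard definitions; a minimal sketch is given below. The PI% helper assumes the relative-improvement form for lower-is-better indices, which is an assumption of this sketch rather than a restatement of the paper's definition.

```python
import numpy as np

def mape_pct(y, yhat):
    """Mean absolute percentage error (%)."""
    return 100.0 * np.mean(np.abs((y - yhat) / y))

def r2(y, yhat):
    """Coefficient of determination; negative for fits worse than the mean."""
    return 1.0 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)

def pi_pct(kpi, kpi_ref):
    """Percentage improvement over a reference model, for lower-is-better
    KPIs such as MAPE% (assumed form of the PI% index)."""
    return 100.0 * (kpi_ref - kpi) / kpi_ref

y = np.array([1.0, 2.0, 3.0, 4.0])
print(round(mape_pct(y, y + 0.1), 2))      # → 5.21
print(r2(y, np.full(4, np.mean(y))))       # mean predictor → 0.0
print(pi_pct(5.0, 10.0))                   # halving the error → 50.0
```

The negative R2 values reported for ARX at long horizons are consistent with this definition: the model then predicts worse than the constant mean of the test output.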

Table 4. SRU case study: MAPE% values at different prediction steps obtained for the considered models: ARX, FIR, MSA-HDMDc (qopt = 40, quopt = 40, popt = 202, ropt = 25).

MAPE%

Steps 1 5 10 15 20 25 30

ARX 2.75 8.08 10.91 12.70 13.74 14.26 14.55
FIR 11.84 11.84 11.84 11.84 11.84 11.84 11.84
MSA-HDMDc 5.56 5.85 6.24 6.59 6.90 7.15 7.35
Bold values represent the best performance for the specific prediction steps column.

Table 5. SRU case study: R2 values at different prediction steps obtained for the considered models: ARX, FIR, MSA-HDMDc (qopt = 40, quopt = 40, popt = 202, ropt = 25).

R2

Steps 1 5 10 15 20 25 30

ARX 0.95 0.49 0.18 −0.02 −0.15 −0.21 −0.24
FIR 0.26 0.26 0.26 0.26 0.26 0.26 0.26
MSA-HDMDc 0.73 0.70 0.67 0.64 0.62 0.61 0.59
Bold values represent the best performance for the specific prediction steps column.

As mentioned in Section 1, the strength of the Koopman operator, which forms the
basis of MSA-HDMDc, lies in the identification of a global state space model that is valid,
even far from specific working points and/or attractors. It differs from standard linear
models that either exploit linearization around specific working points or extend the linear
approximation to the entire domain. With this in mind, the exogenous inputs over the entire test interval were clustered using the k-means algorithm with squared Euclidean distance, which is commonly used in pattern recognition [56] and in classification and predictive modeling [57]. This method aims to identify the different operating points contained in the input dynamics. To select the optimal number of clusters, the silhouette score distribution [58] was used, yielding three distinct clusters. The results of this procedure allowed us to divide the test time series into sub-intervals, each associated with a cluster identifying a working point.
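The operating-point identification, k-means with squared Euclidean distance and the number of clusters chosen by silhouette score, can be sketched with a small self-contained implementation; the three-regime toy input matrix below is an assumption standing in for the SRU exogenous inputs, and the restarted k-means++ seeding is an implementation detail of this sketch.

```python
import numpy as np

def kmeans(X, k, n_init=5, iters=100, seed=0):
    """Plain k-means (squared Euclidean distance) with k-means++ seeding
    and multiple restarts; returns the best labeling by inertia."""
    rng = np.random.default_rng(seed)
    best_lab, best_inertia = None, np.inf
    for _ in range(n_init):
        C = [X[rng.integers(len(X))]]
        for _ in range(k - 1):                       # k-means++ seeding
            d2 = np.min(((X[:, None, :] - np.asarray(C)[None]) ** 2).sum(-1), axis=1)
            C.append(X[rng.choice(len(X), p=d2 / d2.sum())])
        C = np.asarray(C)
        for _ in range(iters):                       # Lloyd iterations
            d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            lab = d.argmin(1)
            C_new = np.asarray([X[lab == j].mean(0) if np.any(lab == j) else C[j]
                                for j in range(k)])
            if np.allclose(C_new, C):
                break
            C = C_new
        inertia = d.min(1).sum()
        if inertia < best_inertia:
            best_lab, best_inertia = lab, inertia
    return best_lab

def silhouette(X, lab):
    """Mean silhouette score over all samples (Euclidean distance)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    idx = np.arange(len(X))
    s = []
    for i in idx:
        same = (lab == lab[i]) & (idx != i)
        if not same.any():                           # singleton cluster
            s.append(0.0)
            continue
        a = D[i, same].mean()                        # mean intra-cluster distance
        b = min(D[i, lab == j].mean() for j in np.unique(lab) if j != lab[i])
        s.append((b - a) / max(a, b))
    return float(np.mean(s))

# Stand-in exogenous inputs: three well-separated operating regimes, 5 signals
rng = np.random.default_rng(4)
U = np.vstack([rng.normal(m, 0.1, size=(100, 5)) for m in (0.0, 1.0, 2.0)])
scores = {k: silhouette(U, kmeans(U, k)) for k in range(2, 6)}
k_opt = max(scores, key=scores.get)
print("selected number of clusters:", k_opt)
```

Each sample's cluster label then tags the corresponding time interval with an operating point, exactly as done before computing the per-cluster MAPE% batches.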
The results of the clustering are shown in Figure 9. The first panel shows the time evolution of the exogenous inputs, as named and described in Table 1. The second panel shows the different operating points, defined as clusters, identified by the k-means algorithm in the analyzed time window. The third panel contains the time evolution of MAPE%, computed in time batches of 100 samples for a 30-step-ahead prediction. It can be observed that, in the first and last time intervals, belonging to Cluster 1, which is representative of the majority of the dataset, all three models predict the output with good performance. For data belonging to the second and third clusters, only MSA-HDMDc guarantees a performance similar to that obtained in the previous cluster. This confirms the suitability of MSA-HDMDc for identifying global models.

Figure 9. SRU case study: analysis of MAPE% computed using time batches of 100 samples for
a 30-step-ahead prediction on a selected interval of the test dataset. The corresponding normalized
input signals and associated clusters are also included. 1st panel: time evolution of the inputs, 2nd
panel: input clusters, 3rd panel: time evolution of MAPE% .

5. Debutanizer Column Results


In this section, the proposed MSA-HDMDc is applied to the DC case study and the
results are compared with the baseline models. The available dataset consists of 4 months of
data, i.e., March, May, July and September 2004, with a sampling period of 6 min. Here, 50%
of the dataset (March and May) is used for model training, 25% (July) for the validation of
the hyperparameters and model order reduction and, finally, the testing is performed on
the final 25% of the dataset (September).

5.1. Model Optimization and Hyperparameter Tuning


An MSA prediction on the output y is performed. To validate the performance of the procedure, a time horizon of Kmax = 20 steps, corresponding to 120 min, is selected. To better show the effect of the MSA prediction for different time horizons, the values K ∈ {2, 5, 10, 20} are considered.

5.1.1. Baseline Models


Two linear models were considered: ARX and FIR. The optimal model order was selected based on the minimum description length (MDL) criterion [55]. In particular, the ARX structure was identified to have three common poles (na = 3), seven zeros (nb = 8), and no delay for all input variables (nk = 0), while the FIR order parameters were found to be nb = 8 and nk = 0.

5.1.2. MSA-HDMDc Model


In a preliminary stage of the iterative MSA-HDMDc method, the HDMDc algorithm
was used without order reduction. A grid search strategy was applied to find the
optimal q in the range from q = 10 to q = 30. The estimated output reconstructed for
the prediction step, K max , was compared with the measured output, y. For the statistical
analysis, the validation dataset was divided into 70 subsets of 100 samples. MAPE% and
R2 were evaluated for each subset and the corresponding distribution was determined.
In particular, the median value of MAPE% was taken into account when selecting the optimal hyperparameter, q.
Table 6 shows the mean value of MAPE% over the 70 trials, considering a 20-step-ahead prediction. The considered state delay shifts are q ∈ {10, 12, 15, 17, 20, 30}. The best-performing model corresponds to qopt = 12, as shown by the reported PI%.
As a second step, considering the optimal value qopt = 12, a further parametric optimization was carried out by varying the Hankel shift, qu, applied to the exogenous control inputs in the Γ H matrix. Since qu ≤ qopt is required for the system to be causal, the model was identified for each qu in the set qu ∈ {5, 10, 12}. The estimated output, reconstructed at each considered prediction step, was compared with the measured output, y. It was found that reducing the Hankel shift, qu, applied to the input variables causes a decay of the model performance, both in terms of MAPE% and R2. This led to the selection of qu = qopt = 12 as the optimal number of Hankel shifts for both inputs and state variables.

Table 6. DC case study: performance comparison for the selection of qopt. The mean value of MAPE% over 70 subsets of 100 data samples is reported for different state time shifts, q. The KPI is evaluated for a 20-step-ahead prediction. The PI% is reported considering q = 12 as the reference value.

State Time-Shift Optimization


q 10 12 15 17 20 30
MAPE% 25.73 24.87 25.32 26.43 27.52 30.19
PI% −3.43 0 −1.81 −6.28 −10.65 −17.20

5.2. Model Order Reduction


In a first step, the Ω matrix was truncated from the full order pmax = nx qopt + nu quopt = 84 (where qopt = 12, nx = 1, quopt = 12, and nu = 6) to the reduced order p. Models with different p-reductions in the set prange ∈ {65, 66, 70, 84}, and with the full-order matrix X′H with rmax = min(nx qopt, p) = 12, are shown in Figure 10, where the indices MAPE% and R2 are given for different prediction steps. The global performance on the validation dataset was compared with that of the baseline methods (i.e., ARX and FIR). It can be seen that the MSA-HDMDc model outperforms both FIR and ARX for all prediction steps, regardless of the model order, p. Comparing the MSA-HDMDc models for the different values of p, it is also noted that reducing the Ω matrix to order p leaves the performance unchanged as p decreases from 84 to 66, but the performance starts to deteriorate at p = 65. This leads to the conclusion that popt = 66 holds for all prediction steps.

(a) MAPE%

(b) R2

Figure 10. DC case study: MSA model performances in terms of (a) MAPE%, (b) R2 for the ARX, FIR, and MSA-HDMDc models, varying the reduced order, p, of the Ω matrix in prange ∈ {65, 66, 70, 84} and considering the matrix X′H at full order rmax = 12.

In a second step, while maintaining the optimal Ω order popt = 66, the X′H matrix was truncated from the full order rmax = min(nx qopt, popt) = 12 to the reduced order, r. Models with different r-reductions belonging to the set r ∈ {4, 5, 8, 12} were identified. The global performance comparison on the validation dataset is shown in Figure 11, where the PI% with respect to the MSA-HDMDc model with popt = 66 and rmax = 12 is given for both MAPE% and R2 at different prediction steps. It can be seen that the MSA-HDMDc model with X′H reduced to order r = 5 outperforms the full-order model for all selected prediction steps. These results confirm that the optimized MSA-HDMDc order reduction allows identification of the dominant dynamics and thus introduces robustness and noise rejection features into the reduced model. These properties improve the long-term prediction performance compared with the full-order system. Finally, the optimal order for the MSA-HDMDc model was determined to be popt = 66 and ropt = 5.

(a) PI% for MAPE% (b) PI% for R2

Figure 11. DC case study: barplot of PI% for (a) MAPE% and (b) R2 with Ω matrix order reduction popt = 66, varying the X′H matrix reduction order in rrange ∈ {4, 5, 8, 12}. The PI% was calculated for each of the identified models with respect to the reference MSA-HDMDc model with popt = 66 and r = rmax = 12.

5.3. Model Comparison and Discussion


In this section, the performance of the reduced-order MSA-HDMDc model is further
evaluated in terms of the predicted time series and robustness to the variability of the
operating points. The time plot in Figure 12 shows a comparison between the measured output, y, and the 5- and 10-step-ahead predicted outputs for the baseline and MSA-HDMDc models on a subset of the test dataset. The proposed model outperforms the baseline approaches, especially for the 10-step-ahead prediction, confirming that the difference between the models becomes more pronounced over longer prediction horizons.

(a) 5-step-ahead (30 min)




(b) 10-step-ahead (60 min)

Figure 12. DC case study: comparison of the measured output (y) with the output predicted (a) 5 steps ahead (30 min) and (b) 10 steps ahead (60 min) by the baseline and the MSA-HDMDc models with the optimal parameters qopt = 12, quopt = 12, popt = 66, ropt = 5, on a selected interval of the test dataset.

Tables 7 and 8 report the MAPE% and R2 on the test dataset at different prediction steps for the considered models. Both KPIs are in agreement and show the superiority of the MSA-HDMDc model for all prediction horizons.

Table 7. DC case study: MAPE% values at different prediction steps obtained for the considered models: ARX, FIR, MSA-HDMDc (qopt = 12, quopt = 12, popt = 66, ropt = 5) on the test dataset.

MAPE%
Steps 2 5 10 20
ARX 4.44 13.75 29.40 53.91
FIR 57.42 57.42 57.42 57.42
MSA-HDMDc 1.66 6.01 12.93 23.13
Bold values represent the best performance for the specific prediction steps column.

Table 8. DC case study: R2 values at different prediction steps obtained for the considered models: ARX, FIR, MSA-HDMDc (qopt = 12, quopt = 12, popt = 66, ropt = 5) on the test dataset.

R2
Steps 2 5 10 20
ARX 0.983 0.854 0.347 −1.20
FIR −0.99 −0.99 −0.99 −0.99
MSA-HDMDc 0.998 0.974 0.890 0.661
Bold values represent the best performance for the specific prediction steps column.

Model performance was further analyzed by identifying different operating points through clustering of the exogenous inputs. The k-means algorithm [56,57] was applied with squared Euclidean distance. The optimal number of clusters was found to be two using the silhouette score distribution [58]. The clustering results are reported in Figure 13. In the first panel, the time evolution of the normalized exogenous inputs, as named and described in Table 2, is shown. In the second panel, the clusters identified by the k-means algorithm in the analyzed time window are reported, showing the operating points in different colors. The third panel contains the MAPE% time evolution computed in time batches of 100 samples for a 5-step-ahead prediction. It can be seen that, for data belonging to the operating point labeled Cluster 1, between samples 1400 and 2000, only the MSA-HDMDc model maintains consistent performance across the different system operating points. In contrast, the performance of the ARX and FIR models degrades significantly and shows high variability with changes in the operating conditions. This highlights the sensitivity of the ARX and FIR models to variations in the operating points and confirms the suitability of MSA-HDMDc for identifying global models that provide robust performance under different operating conditions.

Figure 13. DC case study: analysis of MAPE% computed using time batches of 100 samples for
a 5-step-ahead prediction on a selected interval of the test dataset. The corresponding normalized
input signals and associated clusters are also included. 1st panel: time evolution of the inputs, 2nd
panel: input clusters, 3rd panel: time evolution of MAPE% .

6. Conclusions
In this work, we have proposed the MSA-HDMDc approach for the development of
SSs in industrial applications. Two well-established benchmarks based on the SRU and DC
data were considered.
The MSA-HDMDc approach focused on optimizing the identified HDMDc state-space
model for multiple-step-ahead predictions. It employed the Hankel operator for state-
space dimension augmentation to address nonlinear dynamics effectively through a linear
state-space model.
As a preliminary step, the augmentation delay time shifts for input and state variables
were optimized through statistical analyses, using the MAPE% and R2 as key performance
indexes. As a second step, model order reduction was investigated to find the minimum
number of Koopman space observables that better describe the nonlinear system dynamics
while keeping the computational burden reasonable, aiming to obtain models that can be
integrated into model-based control strategies. An iterative procedure was proposed to
select the optimal order reduction, maximizing a fitness function based on performance
improvements relative to full-order models. The optimal reduced MSA-HDMDc model
structure was then compared for different step-ahead predictions with baseline linear
identification methods (i.e., ARX and FIR), which are widely used in the SS field.
For both the SRU and the DC case studies, the MSA-HDMDc model outperformed the baseline ARX and FIR models at high prediction steps. The model also exhibited reduced sensitivity to different operational conditions compared with the baseline models. This
was demonstrated through a clustering procedure applied to the input time series data,
which identified and classified various operative states as clusters. The performance of
the analyzed models was monitored over time and associated with these distinct states.
The MSA-HDMDc model consistently maintained its performance across various condi-
tions, whereas the ARX and FIR models showed sensitivity to these variations, resulting
in decreased performance for certain clusters. This is in line with the Koopman theory
that allows the identification of a global model for the system able to properly work on
multiple operating points. This feature, along with the model optimization due to the
SVD-based model order reduction, makes the proposed MSA-HDMDc suitable for MPC-based applications, enhancing the reliability and effectiveness of the control system in industrial applications.
The capability of the proposed methodology to identify a global model that func-
tions across different operating points encourages further exploration of its transferability.
Applying the identified model to different process lines could address the issue of data
scarcity. Additionally, given that SRU and DC data exhibit strong nonlinearity and dynamic
variations in operating points and spectral content, future efforts will focus on integrating
MSA-HDMDc with multi-resolution DMD [59] to incorporate slow feature analysis.

Author Contributions: Conceptualization, L.P., F.S. and M.G.X.; methodology, L.P., F.S. and M.G.X.;
software, F.S.; validation, L.P., F.S. and M.G.X.; formal analysis, L.P., F.S. and M.G.X.; investigation,
L.P., F.S. and M.G.X.; resources, L.P. and M.G.X.; data curation, L.P., F.S. and M.G.X.; writing—original
draft preparation, F.S.; writing—review and editing, L.P., F.S. and M.G.X.; visualization, F.S.; super-
vision, L.P., F.S. and M.G.X.; project administration, L.P. and M.G.X.; funding acquisition, L.P. and
M.G.X. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by Progetto Green SENSing systems based on Bacterial Cellulose
(SENS-BC), Italian Ministry of University and Research, CUP J53D23003460006.
Data Availability Statement: Dataset available on request from the authors.
Conflicts of Interest: The authors declare no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript:

DMD Dynamic mode decomposition
HDMDc Hankel dynamic mode decomposition with control
MSA Multi-step-ahead
SVD Singular value decomposition
MPC Model predictive control
SRU Sulfur recovery unit
DC Debutanizer column

References
1. Fortuna, L.; Graziani, S.; Rizzo, A.; Xibilia, M. Soft Sensors for Monitoring and Control of Industrial Processes; Springer: London, UK,
2007. [CrossRef]
2. Kadlec, P.; Gabrys, B.; Strandt, S. Data-driven Soft Sensors in the process industry. Comput. Chem. Eng. 2009, 33, 795–814.
[CrossRef]
3. Pani, A.K.; Amin, K.G.; Mohanta, H.K. Soft sensing of product quality in the debutanizer column with principal component
analysis and feed-forward artificial neural network. Alex. Eng. J. 2016, 55, 1667–1674. [CrossRef]
4. Graziani, S.; Xibilia, M.G. Deep Structures for a Reformer Unit Soft Sensor. In Proceedings of the 2018 IEEE 16th International
Conference on Industrial Informatics (INDIN), Porto, Portugal, 18–20 July 2018; pp. 927–932. [CrossRef]
5. Sujatha, K.; Bhavani, N.; Cao, S.Q.; Ram Kumar, K. Soft Sensor for Flame Temperature Measurement and IoT based Monitoring
in Power Plants. Mater. Today Proc. 2018, 5, 10755–10762. [CrossRef]
6. Zhu, X.; Rehman, K.U.; Wang, B.; Shahzad, M. Modern Soft-Sensing Modeling Methods for Fermentation Processes. Sensors 2020,
20, 1771. [CrossRef]
7. Zhu, C.; Zhang, J. Developing Soft Sensors for Polymer Melt Index in an Industrial Polymerization Process Using Deep Belief
Networks. Int. J. Autom. Comput. 2020, 17, 44–54. [CrossRef]
8. Pisa, I.; Santín, I.; Vicario, J.; Morell, A.; Vilanova, R. ANN-Based Soft Sensor to Predict Effluent Violations in Wastewater
Treatment Plants. Sensors 2019, 19, 1280. [CrossRef]
9. Guckenheimer, J.; Holmes, P. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields; Springer: New York, NY,
USA, 1983; Volume 42. [CrossRef]
10. Babaei Pourkargar, D.; Armaou, A. Control of spatially distributed processes with unknown transport-reaction parameters via
two layer system adaptations. AIChE J. 2015, 61, 2497–2507. [CrossRef]
11. Souza, F.; Araujo, R. Online mixture of univariate linear regression models for adaptive soft sensors. IEEE Trans. Ind. Inform.
2014, 10, 937–945. [CrossRef]
12. Chen, X.; Mao, Z.; Jia, R.; Zhang, S. Ensemble regularized local finite impulse response models and soft sensor application in
nonlinear dynamic industrial processes. Appl. Soft Comput. 2019, 85, 105806. [CrossRef]

13. Liu, J.; He, J.; Tang, Z.; Xie, Y.; Gui, W.; Ma, T.; Jahanshahi, H.; Aly, A.A. Frame-Dilated Convolutional Fusion Network and
GRU-Based Self-Attention Dual-Channel Network for Soft-Sensor Modeling of Industrial Process Quality Indexes. IEEE Trans.
Syst. Man Cybern. Syst. 2022, 52, 5989–6002. [CrossRef]
14. Xie, S.; Xie, Y.; Li, F.; Yang, C.; Gui, W. Optimal Setting and Control for Iron Removal Process Based on Adaptive Neural Network
Soft-Sensor. IEEE Trans. Syst. Man Cybern. Syst. 2020, 50, 2408–2420. [CrossRef]
15. Xie, Y.; Wang, J.; Xie, S.; Chen, X. Adversarial Training-Based Deep Layer-Wise Probabilistic Network for Enhancing Soft Sensor
Modeling of Industrial Processes. IEEE Trans. Syst. Man Cybern. Syst. 2023, 54, 972–984. [CrossRef]
16. Dias, T.; Oliveira, R.; Saraiva, P.M.; Reis, M.S. Linear and Non-Linear Soft Sensors for Predicting the Research Octane Number
(RON) through Integrated Synchronization, Resolution Selection and Modelling. Sensors 2022, 22, 3734. [CrossRef]
17. Patanè, L.; Xibilia, M.G. Echo-state networks for soft sensor design in an SRU process. Inf. Sci. 2021, 566, 195–214. [CrossRef]
18. Koopman, B.O. Hamiltonian Systems and Transformation in Hilbert Space. Proc. Natl. Acad. Sci. USA 1931, 17, 315–318.
[CrossRef]
19. Mezić, I. Spectral Properties of Dynamical Systems, Model Reduction and Decompositions. Nonlinear Dyn. 2005, 41, 309–325.
[CrossRef]
20. Brunton, S.L.; Budišić, M.; Kaiser, E.; Kutz, J.N. Modern Koopman Theory for Dynamical Systems. SIAM Rev. 2022, 64, 229–340.
[CrossRef]
21. Brunton, S.L.; Brunton, B.W.; Proctor, J.L.; Kutz, J.N. Koopman invariant subspaces and finite linear representations of nonlinear
dynamical systems for control. PLoS ONE 2015, 11, e0150171. [CrossRef]
22. Brunton, S.L.; Kutz, J.N. Data-Driven Science and Engineering; Cambridge University Press: Cambridge, UK, 2022. [CrossRef]
23. Song, G.; Alizard, F.; Robinet, J.C.; Gloerfelt, X. Global and Koopman modes analysis of sound generation in mixing layers. Phys.
Fluids 2013, 25, 124101. [CrossRef]
24. Mezić, I. Analysis of Fluid Flows via Spectral Properties of the Koopman Operator. Annu. Rev. Fluid Mech. 2013, 45, 357–378.
[CrossRef]
25. Proctor, J.L.; Eckhoff, P.A. Discovering dynamic patterns from infectious disease data using dynamic mode decomposition. Int.
Health 2015, 7, 139–145. [CrossRef] [PubMed]
26. Alfatlawi, M.; Srivastava, V. An Incremental Approach to Online Dynamic Mode Decomposition for Time-Varying Systems with
Applications to EEG Data Modeling. arXiv 2019, arXiv:1908.01047.
27. Taylor, R.; Kutz, J.N.; Morgan, K.; Nelson, B.A. Dynamic mode decomposition for plasma diagnostics and validation. Rev. Sci.
Instruments 2018, 89, 053501. [CrossRef] [PubMed]
28. Kaptanoglu, A.A.; Morgan, K.D.; Hansen, C.J.; Brunton, S.L. Characterizing magnetized plasmas with dynamic mode decomposi-
tion. Phys. Plasmas 2020, 27, 032108. [CrossRef]
29. Bruder, D.; Gillespie, B.; Remy, C.D.; Vasudevan, R. Modeling and Control of Soft Robots Using the Koopman Operator and
Model Predictive Control. arXiv 2019, arXiv:1902.02827.
30. Susuki, Y.; Mezic, I. Nonlinear Koopman Modes and a Precursor to Power System Swing Instabilities. IEEE Trans. Power Syst.
2012, 27, 1182–1191. [CrossRef]
31. Jones, C.; Utyuzhnikov, S. Application of higher order dynamic mode decomposition to modal analysis and prediction of power
systems with renewable sources of energy. Int. J. Electr. Power Energy Syst. 2022, 138, 107925. [CrossRef]
32. Williams, M.O.; Kevrekidis, I.G.; Rowley, C.W. A Data–Driven Approximation of the Koopman Operator: Extending Dynamic
Mode Decomposition. J. Nonlinear Sci. 2015, 25, 1307–1346. [CrossRef]
33. Brunton, S.L.; Proctor, J.L.; Kutz, J.N. Discovering governing equations from data by sparse identification of nonlinear dynamical
systems. Proc. Natl. Acad. Sci. USA 2016, 113, 3932–3937. [CrossRef]
34. Arbabi, H.; Mezić, I. Ergodic theory, Dynamic Mode Decomposition and Computation of Spectral Properties of the Koopman
operator. SIAM J. Appl. Dyn. Syst. 2016, 16, 2096–2126. [CrossRef]
35. Clainche, S.L.; Vega, J.M. Higher Order Dynamic Mode Decomposition. SIAM J. Appl. Dyn. Syst. 2017, 16, 882–925. [CrossRef]
36. Brunton, S.L.; Brunton, B.W.; Proctor, J.L.; Kaiser, E.; Kutz, J.N. Chaos as an intermittently forced linear system. Nat. Commun.
2017, 8, 19. [CrossRef] [PubMed]
37. Wu, Z.; Brunton, S.L.; Revzen, S. Challenges in dynamic mode decomposition. J. R. Soc. Interface 2021, 18, 20210686. [CrossRef]
[PubMed]
38. Clainche, S.L.; Han, Z.H.; Ferrer, E. An alternative method to study cross-flow instabilities based on high order dynamic mode
decomposition. Phys. Fluids 2019, 31, 094101. [CrossRef]
39. Clainche, S.L. Prediction of the Optimal Vortex in Synthetic Jets. Energies 2019, 12, 1635. [CrossRef]
40. Proctor, J.L.; Brunton, S.L.; Kutz, J.N. Dynamic Mode Decomposition with Control. SIAM J. Appl. Dyn. Syst. 2016, 15, 142–161.
[CrossRef]
41. Mustavee, S.; Agarwal, S.; Enyioha, C.; Das, S. A linear dynamical perspective on epidemiology: Interplay between early
COVID-19 outbreak and human mobility. Nonlinear Dyn. 2022, 109, 1233–1252. [CrossRef] [PubMed]
42. Shabab, K.R.; Mustavee, S.; Agarwal, S.; Zaki, M.H.; Das, S. Exploring DMD-Type Algorithms for Modeling Signalised
Intersections. arXiv 2021, arXiv:2107.06369v1.
43. Das, S.; Mustavee, S.; Agarwal, S.; Hasan, S. Koopman-Theoretic Modeling of Quasiperiodically Driven Systems: Example of
Signalized Traffic Corridor. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 4466–4476. [CrossRef]
Electronics 2024, 13, 3047 24 of 24

44. Wolfram, D.; Meurer, T. DMD-Based Model Predictive Control for a Coupled PDE-ODE System. IFAC-PapersOnLine 2023,
56, 4258–4263. [CrossRef]
45. Liu, J.; Zhang, X.; Xu, X.; Xiong, Q. Receding Horizon Actor–Critic Learning Control for Nonlinear Time-Delay Systems with
Unknown Dynamics. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 4980–4993. [CrossRef]
46. Zhang, X.; Liu, J.; Xu, X.; Yu, S.; Chen, H. Robust Learning-Based Predictive Control for Discrete-Time Nonlinear Systems with
Unknown Dynamics and State Constraints. IEEE Trans. Syst. Man Cybern. Syst. 2022, 52, 7314–7327. [CrossRef]
47. Narasingam, A.; Kwon, J.S.I. Development of local dynamic mode decomposition with control: Application to model predictive
control of hydraulic fracturing. Comput. Chem. Eng. 2017, 106, 501–511. [CrossRef]
48. Lin, S.; Zhang, M.; Cheng, X.; Shi, L.; Gamba, P.; Wang, H. Dynamic Low-Rank and Sparse Priors Constrained Deep Autoencoders
for Hyperspectral Anomaly Detection. IEEE Trans. Instrum. Meas. 2024, 73, 1–18. [CrossRef]
49. Zhang, C.; Li, G.; Lei, R.; Du, S.; Zhang, X.; Zheng, H.; Wu, Z. Deep Feature Aggregation Network for Hyperspectral Remote
Sensing Image Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5314–5325. [CrossRef]
50. Yuan, X.; Wang, Y.; Yang, C.; Gui, W. Stacked isomorphic autoencoder based soft analyzer and its application to sulfur recovery
unit. Inf. Sci. 2020, 534, 72–84. [CrossRef]
51. Bidar, B.; Shahraki, F.; Sadeghi, J.; Khalilipour, M.M. Soft Sensor Modeling Based on Multi-State-Dependent Parameter Models
and Application for Quality Monitoring in Industrial Sulfur Recovery Process. IEEE Sens. J. 2018, 18, 4583–4591. [CrossRef]
52. Fortuna, L.; Graziani, S.; Xibilia, M. Soft sensors for product quality monitoring in debutanizer distillation columns. Control Eng.
Pract. 2005, 13, 499–508. [CrossRef]
53. Mou, T.; Zou, Y.; Li, S. Enhancing graph convolutional network of knowledge-based co-evolution for industrial process key
variable prediction. Control Theory Appl. 2024, 41, 416.
54. Fortuna, L.; Rizzo, A.; Sinatra, M.; Xibilia, M. Soft analyzers for a sulfur recovery unit. Control Eng. Pract. 2003, 11, 1491–1500.
[CrossRef]
55. Ljung, L. System Identification: Theory for the User; Prentice-Hall, Inc.: Upper Saddle River, NJ, USA, 1986.
56. Sahbudin, M.A.B.; Scarpa, M.; Serrano, S. MongoDB Clustering Using K-means for Real-Time Song Recognition. In Proceedings
of the 2019 International Conference on Computing, Networking and Communications (ICNC), Honolulu, HI, USA, 18–21
February 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 350–354. [CrossRef]
57. Cohn, R.; Holm, E. Unsupervised Machine Learning Via Transfer Learning and k-Means Clustering to Classify Materials Image
Data. Integr. Mater. Manuf. Innov. 2021, 10, 231–244. [CrossRef]
58. Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987,
20, 53–65. [CrossRef]
59. Schmid, P.J. Dynamic Mode Decomposition and Its Variants. Annu. Rev. Fluid Mech. 2022, 54, 225–254. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.