Gas Turbine Off-Design Performance Adaption Based on Cluster Sampling

This article presents a method for adapting gas turbine performance models using cluster sampling to improve accuracy. By applying k-means clustering to field data and utilizing particle swarm optimization, the model error was significantly reduced from 2.947% to 0.610%. The proposed approach demonstrates enhanced performance accuracy compared to traditional random sampling methods, particularly in underrepresented operating conditions.


Applied Sciences
Article
Gas Turbine Off-Design Performance Adaption Based on
Cluster Sampling
Jing Kong 1,2 , Wei Yu 1 , Jinwei Chen 2, * and Huisheng Zhang 2

1 Huadian Electric Power Research Institute Co., Ltd., Hangzhou 310030, China; [email protected] (J.K.)
2 The Key Laboratory of Power Machinery and Engineering of Education Ministry,
Shanghai Jiao Tong University, Shanghai 200240, China; [email protected]
* Correspondence: [email protected]

Abstract: An accurate gas turbine performance model is important for system performance evaluation.
To minimize the simulated performance error, an adaption method using coefficients of scaling
factors is applied to tune the component characteristic maps so that the gas turbine model matches
the measurements at a few randomly sampled points. However, the field data are non-homogeneously
distributed. In this situation, randomly selecting a few sampling points may lead to inappropriate
correction of the component characteristic maps and lower the prediction accuracy of the model. Firstly,
the coefficients of scaling factors are introduced to construct the performance adaption optimization
problem of the gas turbine model. Secondly, the k-means clustering algorithm is applied to divide the
146 field data points into 10 different categories, and then the points closest to the cluster centers are
selected to form the sampling set. Thirdly, a particle swarm optimization algorithm is used to search
for the optimal scaling factors. As a result, the model error decreases from 2.947% to 0.610%. Finally, the
proposed method is validated with the remaining field data of a real E-class gas turbine. The average
predicted error is 0.466%. Compared with the performance results obtained by random sampling, the
model based on cluster sampling shows better accuracy.

Keywords: gas turbine; performance adaption; cluster sampling; k-means; PSO

Citation: Kong, J.; Yu, W.; Chen, J.; Zhang, H. Gas Turbine Off-Design Performance Adaption Based on Cluster Sampling. Appl. Sci. 2023, 13, 7352. https://doi.org/10.3390/app13137352
Academic Editor: Satoru Okamoto
Received: 10 May 2023
Revised: 17 June 2023
Accepted: 18 June 2023
Published: 21 June 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
Gas turbines, due to their advantages of lower emissions, fast starting–accelerating,
high power density [1], and high fuel flexibility [2], have been widely applied in aircraft,
shipping [3] and power generation [4]. Performance estimation [5], operating optimization [6]
and diagnostics [7] of gas turbines are important and depend heavily on an accurate
gas turbine performance model. The accuracy of the performance model relies considerably
on its component characteristic maps, and component maps are obtained in rig
tests under different conditions, which can be costly and time consuming [8]. Thus, only
component characteristic maps coming from a similar gas turbine can be provided by
the original equipment manufacturer (OEM) [9]. This is insufficient; as discrepancies caused
by manufacturing, assembly deviation, and overhaul of gas turbines will always exist,
variations of component characteristic maps between gas turbines are inescapable [10].
To overcome this problem and obtain an accurate performance model, considerable
attention has been paid to the research of performance adaption based on operating data.
Stamatis et al. [11] introduced modification factors to modify component characteristic
maps and established an optimization procedure to optimize the modification factors.
Lambiris et al. [12] further developed this method by introducing determining corrections
and sensitivity analysis. Kong et al. [13] proposed a new scaling method based on
system identification, which used component characteristic maps and scaling factors at
both design and off-design points. Then, a genetic algorithm (GA) was introduced to
this method to create better shapes of the speed lines [14,15]. Li et al. [16] proposed a
design-point performance adaption approach by the Newton–Raphson method. Based on
this study, a multiple off-design point adaption algorithm was proposed that linearizes
the off-design scaling factors around the design point and obtains the optimal result by
GA [17]. Furthermore, a nonlinear multiple-point off-design performance adaption method was
proposed [18]. Then, a least-squares method was used to determine the search range of
the scaling factor coefficients [19]. To improve the accuracy of the off-design performance
adaption, Tsoutsanis et al. [20] introduced elliptic equations to represent compressor maps
with seven coefficients to control the shape of the maps, and tested it on a GE (General
Electric) aeroderivative gas turbine [21]. Then, a method considering the rotation of the
ellipses and transformation of their coordinates was further developed [22]. Yang et al. [23],
considering insufficient information, proposed a new map generation method, in which the
initial map was obtained according to a set of steady-operating data; then, the coefficients
were tuned through sets of transient data. To overcome the zonal distribution effect existing
in field-data-based gas turbine performance adaption, Li et al. [24] proposed a new metric to
evaluate the adaption accuracy and changed the performance parameters at the design
point. Yan et al. [25] proposed an enhanced component analytical solution and utilized
sensitivity analysis to set the weight coefficient for the tuning factors.
In most of this research, optimization algorithms are used to obtain the scaling factors or
their coefficients by minimizing the total average prediction error of the measurable parameters
at randomly sampled points. However, at the production site, due to the coordination of
power grid dispatching, gas turbine generator sets are turned on for a long period of time
and turned off for another long period of time. This is reflected in the on-site data as the
frequent occurrence of some working conditions and less data for other working conditions. At
the same time, due to interference, noise, sensitivity, fouling, and erosion, there are no
deterministic working conditions, and there will be fluctuations within a certain range
reflected in the data [24]. Consequently, randomly selected sampling points often result in
some working conditions being overlooked, and the sampled points cannot reflect the
real situation. Therefore, a gas turbine performance adaption method based on cluster
sampling is proposed to improve the current adaption and enhance the prediction accuracy
over the whole operation range. The main contributions of this study can be summarized
as follows:
(1) The proposed cluster sampling method divides all operating points into categories
and selects the points closest to the cluster centers. Compared with random sampling,
it can enhance the coverage and dispersion (representing the degree of sample
difference) of the sampling set and enhance the representativeness of the sampling set.
(2) Compared with the performance model before adaption, the accuracy of the performance
model tuned by the particle swarm optimization (PSO) algorithm is increased.
(3) The proposed method was applied on a real E-class gas turbine power station and
compared with the performance adaption based on the random sampling method.
(4) Compared with the model based on the random sampling method, the proposed
method has a higher accuracy over the entire field dataset, especially under the
conditions overlooked by random sampling.
This paper is organized as follows: Section 1 reviews previous research on performance
adaption and proposes a performance adaption method based on cluster sampling and the
iteration-eliminating model. Section 2 introduces the simulation model used in this study,
the performance adaption method, the cluster sampling method, and the flowchart of
performance adaption based on cluster sampling. Section 3 uses an example of a power plant
to demonstrate the effectiveness of the proposed method. Section 4 summarizes this paper.

2. Methodology
The proposed performance adaption process depicted in Figure 1 can tune the component
characteristic maps to meet the performance of sampling points extracted from the
field data. This method is mainly composed of a gas turbine performance model, a cluster
sampling method, and an optimization algorithm. The performance model is described in
Section 2.1, the performance adaption method that minimizes the deviation between the
field measurement data and the simulated results is described in Section 2.2, the cluster
sampling method is described in Section 2.3, and the PSO used to tune the component
characteristic maps is described in Section 2.4.

[Figure 1: flowchart. Field data are sampled by the k-means clustering method into a sampling dataset (boundary conditions x and measurement data z); the gas turbine performance model produces simulation data ẑ; particle swarm optimization updates the coefficients bx and cx (record personal and global best positions, update velocity vector V and position X) to minimize the objective function until the maximum iteration is reached, yielding the optimal coefficients bx and cx.]

Figure 1. Flowchart of the performance adaption method.
2.1. Performance Model of the Gas Turbine
As performance adaption is conducted on the component-level model, this section
introduces the performance simulation model of the gas turbine used in this study. A
model consisting of a compressor, combustor, turbine, rotor, and generator was used to
predict the measurements, as shown in Figure 2, in which "1" denotes the compressor inlet,
"2" the compressor outlet, "3" the combustor (burner) outlet, and "4" the turbine outlet.
[Figure 2: gas-path schematic. Air enters the compressor (station 1: T1, P1), leaves it at station 2 (T2, P2), passes through the combustor, which is fed with fuel (mf, Tf, Pf), to station 3, and expands through the turbine to station 4 (T4, P4); the rotor (speed n) drives the load (Pw).]

Figure 2. Gas-path schematic of an E-class gas turbine.

The performance model of the gas turbine was used to simulate the performance
parameters with the input parameters. The performance model is a steady-state mechanism
model established by thermodynamic principles, energy conservation, mass conservation
equations, and the component characteristic maps, which are used to characterize the
relationship between the component equivalent flow rate, pressure ratio, isentropic efficiency,
and equivalent speed. The simulation model can be expressed as Equation (1):

$$y = f(x) \tag{1}$$

where $f(\cdot)$ denotes the mechanism model of the gas turbine; $x = \{T_1, P_1, m_f, T_f, P_f, P_4, n\}$
is the vector of the input parameters, which are shown by the variables without asterisks in
Table 1; and $y = \{T_2, P_2, T_4, P_w\}$ is the vector of the output parameters, which are shown
by the variables marked with asterisks in Table 1.
Table 1. Measurable parameters of the E-class gas turbine.

No.  Measurable Parameters                    No.  Measurable Parameters
1    Compressor inlet temperature $T_1$       7    Fuel pressure $P_f$
2    Compressor inlet pressure $P_1$          8    Turbine outlet temperature * $T_4$
3    Compressor outlet temperature * $T_2$    9    Turbine outlet pressure $P_4$
4    Compressor outlet pressure * $P_2$       10   Power output * $P_w$
5    Fuel mass flow $m_f$                     11   Rotor speed $n$
6    Fuel temperature $T_f$
* Performance parameters used to test the predicted result.
2.2. Performance Adaption
The gas turbine model was introduced in the previous section, and the component
characteristic maps play an important role in the model computation. Therefore, the
performance adaption method is proposed to modify component characteristic maps for
enhancing the model precision. The basic method of performance adaption used in this
paper contains two procedures: (1) Map scaling for design point, which introduces a
set of fixed scaling factors to modify the original components' characteristic maps. This
procedure is not described in this paper as it is not its focus. More details can be obtained
from Li et al. [17]. (2) Performance adaption for off-design points, in which a set of scaling
factors are introduced to improve the prediction accuracy of the simulation model during
off-design conditions.
In this paper, the component characteristic maps were adapted according to multiple
off-design operation data. As shown in Figure 3, the off-design speed lines in the initial
component characteristic map (solid line) are different from the actual map (dotted line). To
modify the maps, three characteristic parameters (corrected mass flow rate, pressure ratio,
and isentropic efficiency) were calibrated at different speed lines. Taking the compressor as
an example, the off-design scaling factors of the corrected mass flow rate, pressure ratio,
and isentropic efficiency were introduced and defined as shown in Equations (2)–(4).

[Figure 3: compressor map of pressure ratio (PRComp) versus corrected mass flow rate (WACComp), showing the original map and the scaled map, the speed lines CN = nDP and CN = nOD, the design point, and the points A and A*.]

Figure 3. Changes in component characteristic maps in performance adaption.

It can be observed that the scaling factors at each speed line are different, and the
variation of the scaling factors with the speed lines is nonlinear. In this paper, a quadratic
form (Equation (5)) was applied to describe the nonlinearity of the corrected relative
non-dimensional rotational speed (CN) [17,18].

$$SF_{Comp,WAC} = \frac{WAC^{*}_{Comp,OD}}{WAC_{Comp,OD}} \tag{2}$$

$$SF_{Comp,PR} = \frac{PR^{*}_{Comp,OD} - 1}{PR_{Comp,OD} - 1} \tag{3}$$

$$SF_{Comp,ETA} = \frac{ETA^{*}_{Comp,OD}}{ETA_{Comp,OD}} \tag{4}$$

$$SF_x = 1 + b_x\left(1 - \frac{n_{OD}}{n_{DP}}\right) + c_x\left(1 - \frac{n_{OD}}{n_{DP}}\right)^2 \tag{5}$$

In Equations (2)–(4), terms with the superscript "*" mean the modified characteristic
parameters and terms without "*" are the original characteristic parameters. The subscript
"Comp" means compressor. WAC, PR, and ETA represent the corrected mass flowrate,
pressure ratio, and isentropic efficiency, respectively. OD denotes the off-design condition.
Equation (5) defines those scaled factors as a quadratic form of CN. $x$ represents one
of the characteristic parameters, WAC, PR, and ETA. DP denotes the design point. $b_x$
and $c_x$ represent the first-order and second-order coefficients in the correlation function,
respectively. $n_{OD}/n_{DP}$ denotes CN, and $\left(1 - n_{OD}/n_{DP}\right)$ represents the difference of CN
between any OD point and design point.
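
As an illustration of Equations (2)–(5), the following minimal Python sketch evaluates the quadratic scaling factor and applies it to one compressor map point. The function names and the `coeffs` dictionary layout are hypothetical and are not taken from the paper; the relation used for the pressure ratio simply inverts Equation (3).

```python
def scaling_factor(cn, b_x, c_x):
    """Equation (5): SF_x = 1 + b_x*(1 - CN) + c_x*(1 - CN)**2, with CN = n_OD / n_DP."""
    d = 1.0 - cn
    return 1.0 + b_x * d + c_x * d ** 2


def scale_compressor_point(wac, pr, eta, cn, coeffs):
    """Apply Equations (2)-(4) to one point (WAC, PR, ETA) of the original map.

    coeffs maps "WAC", "PR" and "ETA" to (b_x, c_x) pairs (hypothetical layout).
    """
    sf_wac = scaling_factor(cn, *coeffs["WAC"])
    sf_pr = scaling_factor(cn, *coeffs["PR"])
    sf_eta = scaling_factor(cn, *coeffs["ETA"])
    wac_new = sf_wac * wac                 # Equation (2)
    pr_new = sf_pr * (pr - 1.0) + 1.0      # Equation (3) scales PR - 1, not PR
    eta_new = sf_eta * eta                 # Equation (4)
    return wac_new, pr_new, eta_new
```

At the design speed (CN = 1) every scaling factor equals 1, so the off-design adaption leaves the design-point map scaling untouched.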
between any OD point and design point.
Figure 3 shows the errors between the original component map and the actual map.
The error is larger at a lower speed. Thus, the performance adaption for the off-design
points needs to be conducted on the speed lines. During map adaptation, the original map
with solid lines was modified to the new map with dotted lines.
Suppose A is a point on a speed line with CN = $n_{OD}$ with the characteristic parameters
$PR_A$ and $WAC_A$. A* with characteristic parameters $PR_A^{*}$ and $WAC_A^{*}$ is the target point
through which the speed line CN = $n_{OD}$ should pass. Let $SF_{PR,OD} = PR_A^{*}/PR_A$ and
$SF_{WAC,OD} = WAC_A^{*}/WAC_A$; after being scaled by Equations (2)–(4), the point A moves to A*
and the speed line with CN = $n_{OD}$ reaches the dotted line.
In this paper, the scaling factors are determined by CN and the coefficients $b_x$ and $c_x$.
CN is an independent input, and $b_x$ and $c_x$ are adjustable parameters. By adjusting the
coefficients, the characteristic maps can be modified, and the prediction of the performance
measurements changes accordingly. Thus, errors between the actual performance measurements
and the predicted performance measurements obtained by the performance model
of the gas turbine shown in Figure 2 can be minimized by tuning the coefficients $b_x$ and $c_x$.
To obtain the coefficients $b_x$ and $c_x$, the objective function is formed by minimizing the
predicted error:

$$\min_{b_x, c_x} f = \frac{\sum_{j=1}^{N} \sum_{i=1}^{M} \left| \hat{z}_{ij} - z_{ij} \right| / z_{ij}}{N \times M} \tag{6}$$

where $\hat{z}$ means the predicted performance measurements, calculated by the simulation
model in Figure 2; and $z$ means the actual performance measurements. $M$ represents
the number of performance measurements; and $N$ represents the number of targeted
off-design points.
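
The objective of Equation (6) can be sketched in a few lines of Python with NumPy. This is only an illustration: the function name is hypothetical, the arrays are assumed to have shape (N, M), and the deviations are taken in absolute value (an assumption of this sketch) so that positive and negative errors do not cancel.

```python
import numpy as np


def mean_relative_error(z_pred, z_meas):
    """Mean relative prediction error over N points and M measurements (Equation (6)).

    z_pred, z_meas: array-likes of shape (N, M); z_meas must be nonzero.
    Absolute deviations are used here, which is an assumption of this sketch.
    """
    z_pred = np.asarray(z_pred, dtype=float)
    z_meas = np.asarray(z_meas, dtype=float)
    return float(np.mean(np.abs(z_pred - z_meas) / np.abs(z_meas)))
```

In an adaption loop, `z_pred` would come from running the performance model with the maps scaled by a candidate set of coefficients, and the returned value is what the optimizer tries to reduce.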

2.3. Data Clustering for Sampling


When performing performance adaption, a considerable computational cost is inevitably
incurred if all data points are included. It is therefore necessary to select a few points from the
entire available dataset. The quality of data selection will directly affect the accuracy of the
prediction model. The random sampling method can reflect the distribution characteristics
of the data to some extent, but it may have difficulties when facing operating condition
data. For example, some operating conditions may repeatedly occur in a certain sampling
time period, increasing their distribution density. This may easily result in these similar
data being sampled multiple times, while some operating conditions are not collected.
Thus, a data-clustering technique was employed to learn the intrinsic relationship between
data and divide them into different clusters. The basic concept is that similar operating
condition data are close to each other and have similar outputs. The clustering technique
can help to group points that are within close proximity and find the centroids that have
the smallest sum of distances to all other points. In this paper, k-means clustering [26] was
used to group the data and find the centroids; then, the data points closest to the centroids
were chosen and formed the clustering sampling set to calculate the characteristic adaption.
K-means is a representative method of unsupervised learning, which divides data into
k predetermined classes according to the distance between the samples. The k-means
algorithm has the advantages of fast convergence speed and strong interpretability; its
drawback is that the number of clusters k needs to be predetermined, but in this study, it
was turned into an advantage as the number of sampling points can be chosen according
to the computational needs. On the one hand, cluster sampling can make full use of the
collected data. On the other hand, it can also collect data with large feature differences as
much as possible. Suppose there are m data points to be divided into k clusters. The
calculation steps are as follows:
1. Randomly select k points, $\mu_1^{0}, \mu_2^{0}, \ldots, \mu_k^{0}$, as initial centroids.
2. In the t-th iteration step, calculate the distance from each point to the k centroids
according to Equation (7), and assign them to the nearest cluster.

$$\mathrm{distance}_{i,j}^{t} = \left\lVert x_i - \mu_j^{t} \right\rVert^2, \quad i = 1, \ldots, m, \; j = 1, \ldots, k \tag{7}$$

3. Calculate the mean of the points in each cluster and update the centroids by Equation (8).

$$\mu_j^{t} = \sum_{x_i \in C_j} \frac{x_i}{n}, \quad n \text{ is the number of points in } C_j \tag{8}$$

4. Repeat steps 2 and 3, until the difference between two consecutive centroids is smaller
than a predefined threshold or the max iteration step is achieved.
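
The four steps above, followed by the selection of the data point nearest to each centroid, can be sketched with NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names, the tolerance, the seed handling, and the treatment of empty clusters are all assumptions of the sketch.

```python
import numpy as np


def squared_distances(points, centroids):
    """Equation (7): squared Euclidean distance from every point to every centroid."""
    diff = points[:, None, :] - centroids[None, :, :]
    return (diff ** 2).sum(axis=2)          # shape (m, k)


def kmeans(points, k, max_iter=100, tol=1e-8, seed=0):
    """Plain k-means following steps 1-4; returns the k centroids."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest centroid.
        labels = squared_distances(points, centroids).argmin(axis=1)
        # Step 3: move each centroid to the mean of its cluster (Equation (8)).
        updated = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:            # assumption: an empty cluster keeps its centroid
                updated[j] = members.mean(axis=0)
        shift = np.abs(updated - centroids).max()
        centroids = updated
        # Step 4: stop when the centroids no longer move.
        if shift < tol:
            break
    return centroids


def cluster_sample(points, k, **kwargs):
    """Return the indices of the data points closest to the k cluster centroids."""
    points = np.asarray(points, dtype=float)
    centroids = kmeans(points, k, **kwargs)
    return squared_distances(points, centroids).argmin(axis=0)
```

Because the measured variables have different units, a caller would probably want to normalize each column before clustering; the sketch leaves that choice to the user.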

2.4. Particle Swarm Optimization (PSO)


Particle swarm optimization (PSO) was used in this study to minimize the function
in Equation (6). PSO is a heuristic algorithm proposed by Kennedy and Eberhart in
1995 [27]. For a detailed description, please refer to Kennedy and Eberhart [27] and Poli
et al. [28]. Each solution of the optimization problem is called a particle. Particles are
located in an n-dimensional search space and move with a certain velocity, and a fitness function
is used to evaluate the merits of the particles. The particles can remember and track the personal
best position Pbest and global best position Gbest; the velocity is calculated according to the
flight experience and the best particle position. Suppose the position and velocity of the ith
particle at the kth generation are represented as n-dimensional vectors:

$$x_{i,k} = \left(x_{i,k}^{1}, x_{i,k}^{2}, \ldots, x_{i,k}^{n}\right) \tag{9}$$

$$v_{i,k} = \left(v_{i,k}^{1}, v_{i,k}^{2}, \ldots, v_{i,k}^{n}\right) \tag{10}$$

Then, in the (k + 1)th generation, the position and velocity of particles are updated
as follows:

$$v_{i,k+1} = w \times v_{i,k} + c_1 \times rand \times \left(Pbest_{i,k} - x_{i,k}\right) + c_2 \times rand \times \left(Gbest_k - x_{i,k}\right) \tag{11}$$

$$x_{i,k+1} = v_{i,k+1} + x_{i,k} \tag{12}$$


where Pbesti,k represents the previous personal best position of each particle, and Gbestk
refers to the global best position of all particles. rand is a random value from 0 to 1. c1 is the
acceleration constant of Pbest, c2 is the acceleration constant of Gbest, and w refers to inertia
weight. The values of w, c1 , and c2 can be adjusted depending on the specific problems. In
this study, w = 0.5, c1 = 1, c2 = 2, the generation was set to 200, and the population size was
set to 20.
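
The update rules of Equations (11) and (12) with the settings quoted above (w = 0.5, c1 = 1, c2 = 2, 200 generations, 20 particles) can be sketched as below. This is a generic PSO illustration, not the authors' code: the function signature, the box bounds with clipping, the zero initial velocities, and the use of a separate random number per dimension are assumptions of the sketch.

```python
import numpy as np


def pso(objective, lower, upper, n_particles=20, n_generations=200,
        w=0.5, c1=1.0, c2=2.0, seed=0):
    """Minimize objective(x) over the box [lower, upper] with a basic PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    x = rng.uniform(lower, upper, size=(n_particles, dim))   # initial positions
    v = np.zeros((n_particles, dim))                          # assumption: start at rest
    pbest = x.copy()
    pbest_val = np.array([objective(p) for p in x])
    g = pbest_val.argmin()
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    for _ in range(n_generations):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Equation (11): inertia + pull towards personal best + pull towards global best.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        # Equation (12), with positions clipped to the bounds (assumption).
        x = np.clip(x + v, lower, upper)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        g = pbest_val.argmin()
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, float(gbest_val)
```

For the adaption problem, `objective` would wrap the performance model and the error of Equation (6), and the search vector would collect the coefficients bx and cx of all scaled map parameters.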

3. Results and Discussion


3.1. Application
In this section, the performance adaption method with cluster sampling is applied to
a real E-class gas turbine power plant. The gas-path schematic of the gas turbine and the
measurable gas-path parameters are shown in Figure 2 and Table 1, respectively. The main
parameters in ISO conditions are summarized in Table 2. It is a 127.6 MW gas turbine
with a system efficiency of 33.6%.

Table 2. Main parameters of the gas turbine power plant in ISO conditions.

Parameters                        Value
Output power (MW)                 127.6
System efficiency (%)             33.6
Compressor pressure ratio         12.75
Turbine inlet temperature (K)     1397.15
Turbine outlet temperature (K)    821.55

The data came from the steady-state operating conditions of the E-class gas turbine
from June to November, totaling 146 data points. Some measurements of the collected
field data are shown in Figure 4. The ranges of the measurements are fuel flow rate
$m_f \in [14.531\ \text{lb/s}, 16.79\ \text{lb/s}]$, compressor inlet temperature $T_1 \in [294.085\ \text{K}, 315.054\ \text{K}]$,
power output $P_w \in [100.717\ \text{MW}, 120.850\ \text{MW}]$, and turbine outlet temperature
$T_4 \in [832.134\ \text{K}, 849.513\ \text{K}]$.

Figure 4. The measurements of the E-class gas turbine.

3.2. Comparison of Random Sampling and Cluster Sampling


In this section, two sampling methods, cluster sampling and random sampling, are
used to select the sampling points for the subsequent performance adaption. The number
of sampling data points was set to 10.
Random sampling randomly generates 10 points from the dataset. Cluster sampling
uses the k-means clustering algorithm and clusters the 146 data points into 10 clusters
based on the variables in Table 1. It produces 10 cluster centers and selects the points closest
to the cluster centers as the sampling points. To compare the two sampling methods, the
coverage of the total dataset and the coefficient of variation (CV, representing the degree of
dispersion) of each sampling set were calculated by Equations (13) and (14), respectively.
The analysis results of the two sampling methods are shown in Figure 5:

$$S = \frac{x_{max,\,sample\ set} - x_{min,\,sample\ set}}{x_{max,\,total\ data} - x_{min,\,total\ data}} \tag{13}$$

$$CV = \frac{\sigma}{\mu} \tag{14}$$

where $x_{max,\,sample\ set}$ and $x_{min,\,sample\ set}$ are the max and min values of the variables in Table 1
among the selected sample set, respectively; and $x_{max,\,total\ data}$ and $x_{min,\,total\ data}$ are the max
and min values of the variables among the total dataset, respectively. $\sigma$ is the standard
deviation of the selected sample set, and $\mu$ is the mean value.
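
Equations (13) and (14) are evaluated per variable, which can be sketched with NumPy as follows. The function names are hypothetical, the arrays are assumed to hold one row per data point and one column per variable, and the population standard deviation (`ddof=0`) is an assumption, since the text does not say which estimator was used.

```python
import numpy as np


def coverage_fraction(sample, total):
    """Equation (13): range of the sample set divided by the range of the total data."""
    sample = np.asarray(sample, dtype=float)
    total = np.asarray(total, dtype=float)
    sample_range = sample.max(axis=0) - sample.min(axis=0)
    total_range = total.max(axis=0) - total.min(axis=0)
    return sample_range / total_range


def coefficient_of_variation(sample):
    """Equation (14): standard deviation over mean, per variable (ddof=0 assumed)."""
    sample = np.asarray(sample, dtype=float)
    return sample.std(axis=0) / sample.mean(axis=0)
```

A coverage fraction close to 1 means the sampling set spans almost the full measured range of that variable, and a larger CV indicates a more dispersed sampling set.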

(a) (b)
Figure 5. Sampling coverage
Figure 5. fraction and coefficient
Sampling coverage ofcoefficient
fraction and the twoofsampling methods.
the two sampling (a)(a)Sampling
methods. Sampling cov-
coverage fraction of the two
erage sampling
fraction methods.
of the two (b) Coefficient
sampling methods. of variation
(b) Coefficient of variationof thetwo
of the two sampling
sampling methods.
methods.
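As a minimal sketch of Equations (13) and (14), the coverage fraction and the coefficient of variation of a sample set could be computed per variable as follows. The function names, the synthetic temperatures, and the use of the population standard deviation are assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np


def coverage_fraction(sample, total):
    """Equation (13): range of the sample set divided by the range of the total data."""
    sample = np.asarray(sample, dtype=float)
    total = np.asarray(total, dtype=float)
    return (sample.max(axis=0) - sample.min(axis=0)) / (total.max(axis=0) - total.min(axis=0))


def coefficient_of_variation(sample):
    """Equation (14): standard deviation of the sample set divided by its mean."""
    sample = np.asarray(sample, dtype=float)
    # Population standard deviation (ddof=0) is an assumption; the paper does not specify it.
    return sample.std(axis=0) / sample.mean(axis=0)


# Hypothetical one-variable example: compressor inlet temperatures in K.
total_data = np.array([280.0, 285.0, 290.0, 295.0, 300.0, 305.0, 310.0, 315.0])
narrow_sample = np.array([295.0, 300.0, 305.0])  # points bunched in the middle
wide_sample = np.array([280.0, 300.0, 315.0])    # points spread over the whole range

print(coverage_fraction(narrow_sample, total_data), coefficient_of_variation(narrow_sample))
print(coverage_fraction(wide_sample, total_data), coefficient_of_variation(wide_sample))
```

A sample set that spans the full range of the data gives S = 1, and a more dispersed sample set gives a larger CV, which is the sense in which the two indicators are compared in Figure 5.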
It can be seen that the cluster sampling method collects data points that differ from each
other as much as possible, which increases the data diversity and extends the coverage of the
total dataset. Random sampling, in contrast, may sample similar conditions repeatedly and miss
others entirely.
3.3. Optimization Results
In Section 3.2, two 10-point sampling datasets were formed by using random sampling
and cluster sampling. Based on these two sampling sets, the PSO algorithm was used
to minimize the performance measurement error by tuning the scaling factor function
coefficients. Taking the calculation process of PSO based on cluster sampling as an example,
its convergence process is shown in Figure 6.
In Figure 6, the red dots represent the iteration steps. The optimization process converges
at the 114th generation, where the fitness function is 0.610%. During the optimization, the
fitness function decreases continuously, indicating that the PSO algorithm significantly
reduces the prediction error of the four measurement variables by continuously adjusting the
correction coefficients. The average prediction errors of each measurement variable before and
after performance adaption are shown in Table 3. After performance adaption, the accuracy of
the performance model improved: the errors of power, compressor outlet temperature, and
compressor outlet pressure fell below 0.5%, and the error of the turbine outlet temperature
fell to about 1.16%.
Figure 6. Convergence process based on the clustering sample.

Table 3. Predicted errors before and after adaption.

Measurements                     Before Adaption   After Adaption
Power                            3.316%            0.497%
Compressor Outlet Temperature    3.716%            0.316%
Compressor Outlet Pressure       3.359%            0.469%
Turbine Outlet Temperature       1.396%            1.159%
Four Measurements                2.947%            0.610%
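The optimization loop described in this section can be illustrated with a generic global-best PSO. This is a minimal sketch under stated assumptions, not the authors' implementation: the swarm size, inertia weight, acceleration constants, bounds, and the toy `fitness` (the mean absolute relative error of a hypothetical two-coefficient polynomial against synthetic "measurements") are placeholders standing in for the gas turbine performance model and the field data.

```python
import numpy as np


def mean_relative_error(predicted, actual):
    """Mean absolute relative error over all points and measurements."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs((predicted - actual) / actual)))


def pso(fitness, lower, upper, n_particles=30, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic global-best PSO; returns (best_position, best_fitness)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([fitness(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + pull toward personal best + pull toward global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)
        val = np.array([fitness(p) for p in pos])
        improved = val < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = val[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    return gbest, gbest_val


# Toy stand-in for the performance model: z = 1 + b*x + c*x**2 with hypothetical b and c.
x = np.linspace(0.8, 1.2, 10)
z_actual = 1.0 + 0.3 * x - 0.1 * x**2


def fitness(coeffs):
    b, c = coeffs
    return mean_relative_error(1.0 + b * x + c * x**2, z_actual)


best, err = pso(fitness, lower=[-1.0, -1.0], upper=[1.0, 1.0])
print("tuned coefficients:", best, "fitness:", err)
```

In the paper's setting, the fitness evaluation would run the thermodynamic model at each sampled operating point, which is why the number of sampling points directly drives the computation cost.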
3.4. Comparison of the Prediction Using Random Sampling and Cluster Sampling
In the previous section, the performance adaption of the two ten-point sampling datasets
was completed using the PSO algorithm. The adapted performance models based on the random
sampling set and the cluster sampling set are denoted as Model 1 and Model 2, respectively.
This section compares and discusses the accuracy of these two models using the remaining data
points (excluding the sampling points). The prediction errors of the two models at the
remaining data points are shown in Figure 7.
As can be seen from Figure 7, the cluster sampling correction results are better than
the random sampling results in terms of power prediction and compressor outlet pressure
prediction. The prediction accuracy of the compressor outlet temperature is similar in both
methods and relatively high, with errors around 0.3%; the prediction accuracy of the
turbine outlet temperature is also similar in both methods. Table 4 lists the mean error and
maximum error of each measured variable in the dataset.
According to Table 4, the models obtained using random sampling correction and cluster
sampling correction both have average errors of less than 1.2% for each prediction variable,
and the overall average error of cluster sampling is lower than that of random sampling.
For the power and compressor outlet pressure, random sampling is worse than cluster sampling
in both the average and maximum errors. Relative to cluster sampling, the average error of
random sampling is higher by 0.559 percentage points for power and by 0.302 percentage points
for the compressor outlet pressure, and the maximum errors are higher by 1.193 and 1.041
percentage points, respectively. For the compressor outlet temperature, the prediction
accuracy of both methods is similar; random sampling is slightly more accurate on average
(0.235% versus 0.249%), but the two are at the same level of accuracy.
For the turbine outlet temperature, the average prediction error is 0.586% for Model
1 and 0.650% for Model 2, which is basically the same accuracy level. The maximum error of
Model 1 is 3.027%, which is relatively high, while that of Model 2 using cluster sampling
is 2.037%, an improvement of 0.990 percentage points over random sampling. In both cases,
most prediction errors are below 1.5%: 93.6% of the points for random sampling and 96.0% for
cluster sampling. Cluster sampling therefore appears to reduce both the maximum error and the
average error. This may be because cluster sampling takes multiple clustered operating
conditions into account and selects the center points of those conditions, which covers the
full range of the data while remaining representative of each clustered condition.

Figure 7. Comparison of the prediction errors of four different measurements between the two
models.

Table 4. Predicted errors of the different sampling methods.

Measurements                     Error Type   Model 1   Model 2
Power                            Average      1.111%    0.552%
Power                            Maximum      2.847%    1.654%
Compressor Outlet Temperature    Average      0.235%    0.249%
Compressor Outlet Temperature    Maximum      0.880%    1.068%
Compressor Outlet Pressure       Average      0.713%    0.411%
Compressor Outlet Pressure       Maximum      2.287%    1.246%
Turbine Outlet Temperature       Average      0.586%    0.650%
Turbine Outlet Temperature       Maximum      3.027%    2.037%
Four Measurements                Average      0.661%    0.466%
Four Measurements                Maximum      1.552%    1.088%
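The gaps between the two models in Table 4 can be recomputed with a few lines of arithmetic. The sketch below only restates the tabulated values; the dictionary layout and the helper name `error_gap` are illustrative choices, and a positive gap means that Model 2 (cluster sampling) has the lower error.

```python
# Errors from Table 4 in percent: (Model 1 = random sampling, Model 2 = cluster sampling).
TABLE_4 = {
    ("Power", "Average"): (1.111, 0.552),
    ("Power", "Maximum"): (2.847, 1.654),
    ("Compressor Outlet Temperature", "Average"): (0.235, 0.249),
    ("Compressor Outlet Temperature", "Maximum"): (0.880, 1.068),
    ("Compressor Outlet Pressure", "Average"): (0.713, 0.411),
    ("Compressor Outlet Pressure", "Maximum"): (2.287, 1.246),
    ("Turbine Outlet Temperature", "Average"): (0.586, 0.650),
    ("Turbine Outlet Temperature", "Maximum"): (3.027, 2.037),
    ("Four Measurements", "Average"): (0.661, 0.466),
    ("Four Measurements", "Maximum"): (1.552, 1.088),
}


def error_gap(model1, model2):
    """Model 1 error minus Model 2 error, in percentage points."""
    return round(model1 - model2, 3)


gaps = {key: error_gap(*values) for key, values in TABLE_4.items()}
for (measurement, kind), gap in gaps.items():
    print(f"{measurement:30s} {kind:8s} {gap:+.3f}")
```

Expressing the gaps in percentage points rather than as relative changes keeps them directly comparable with the error columns of the table.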

To analyze the accuracy improvement created by cluster sampling, the data set was
divided into two parts. In Section 3.2, it was mentioned that the k-means algorithm was
used to divide the dataset into 10 categories. Cluster sampling selects the points closest to
the cluster centers from each cluster to form a sample set, while random sampling picks
points directly from the data. Thus, the random sample set may include some clusters and
exclude others; so, the data can be classified into two groups: dataset 1, which contains
the clusters that were sampled by random sampling, and dataset 2, which contains the
clusters that were not sampled by random sampling. Then, the prediction accuracies of the
previous Model 1 and Model 2 were evaluated for both groups of data, and the results are
shown in Table 5.
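The split into dataset 1 and dataset 2 can be sketched as follows, assuming that a cluster label is already available for every data point (for example from k-means). The function name and the toy labels are hypothetical. In the paper the models are additionally evaluated only on points that were not used as sampling points, which this sketch leaves out.

```python
import numpy as np


def split_by_random_coverage(labels, random_sample_idx):
    """Return boolean masks (dataset1, dataset2) over all data points.

    dataset1: points whose cluster contains at least one randomly sampled point.
    dataset2: points whose cluster was missed by random sampling.
    """
    labels = np.asarray(labels)
    covered_clusters = np.unique(labels[np.asarray(random_sample_idx)])
    in_dataset1 = np.isin(labels, covered_clusters)
    return in_dataset1, ~in_dataset1


# Hypothetical example: 8 points in 3 clusters; random sampling hit clusters 0 and 2 only.
labels = np.array([0, 0, 1, 1, 1, 2, 2, 0])
random_sample_idx = [0, 5, 6]
dataset1, dataset2 = split_by_random_coverage(labels, random_sample_idx)
print(np.flatnonzero(dataset1))  # indices of points in clusters covered by random sampling
print(np.flatnonzero(dataset2))  # indices of points in the missed cluster
```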

Table 5. Predicted errors of the dataset divided into two parts.

                                              Dataset 1           Dataset 2
Measurements                     Error Type   Model 1   Model 2   Model 1   Model 2
Power                            Average      0.938%    0.567%    1.712%    0.501%
Power                            Maximum      2.585%    1.654%    2.847%    1.307%
Compressor Outlet Temperature    Average      0.235%    0.243%    0.235%    0.271%
Compressor Outlet Temperature    Maximum      0.880%    1.068%    0.616%    0.744%
Compressor Outlet Pressure       Average      0.572%    0.392%    1.204%    0.474%
Compressor Outlet Pressure       Maximum      1.731%    1.246%    2.287%    1.041%
Turbine Outlet Temperature       Average      0.432%    0.645%    1.118%    0.670%
Turbine Outlet Temperature       Maximum      2.320%    2.037%    3.028%    1.894%
Four Measurements                Average      0.544%    0.462%    1.067%    0.479%
Four Measurements                Maximum      1.238%    1.088%    1.552%    0.887%

Comparing the Model 1 results (columns 1 and 3 of Table 5) shows that the model built using
random sampling has different accuracy in the two datasets. The model prediction
accuracy in the data categories that are not covered by random sampling is lower than that
in the data categories that are covered by random sampling. This is reflected in the average
and maximum errors of almost all measurement parameters, except the compressor outlet
temperature, which is at almost the same accuracy level. This result indicates that whether
the sample set contains similar operating conditions has a certain impact on the model
prediction accuracy, and all operating conditions should be included as much as possible.
For the Model 2 results of cluster sampling in the two categories of data (columns 2 and 4),
there is a slight difference, but it is much smaller than that of the random sampling
results, and the accuracy is at almost the same level. The slight difference may arise because
performance adaption balances the accuracy of each point, which results in a decrease in
accuracy on these categories, or it may be due to some randomness in the results. Comparing
columns 1 and 2 in Table 5, although both sampling methods cover this part of the data
categories, and random sampling contains even more data points in them than cluster sampling,
the prediction accuracy of random sampling does not improve considerably. This may be because
cluster sampling not only covers more categories, but also uses the points closest to the
cluster centers to help extract the internal features of the data, which can increase the
representativeness of the data points. Therefore, more accurate prediction results can be
obtained using the cluster sampling method.
Based on Model 2 obtained by the cluster sampling method, the isentropic efficiency
of the turbine and energy efficiency were also calculated, and the results are shown in
Figure 8. It can be seen that the isentropic efficiency of the turbine was around 88.5% and
the system efficiency was around 31.4%.

Figure 8. Prediction of the isentropic efficiency of the turbine and system efficiency.
3.5. Discussion of Random Sampling and Cluster Sampling for a Long Period
To compare and demonstrate the performance of the two sampling methods using long-term
data, the field data were extended to a one-year period. Figure 9 shows the compressor
inlet temperature data in a one-year period. The data were collected every minute and the
points with loads below 75% were removed.

Figure 9. The compressor inlet temperature data in a one-year period.

Overall, summer is the peak of electricity consumption, and the gas turbine units
run for longer periods, resulting in more data points. In contrast, there were fewer data
points in winter. The mean value of the compressor inlet temperature was 301.8 K, with a
range of [275.9 K, 316.3 K] over the one-year period. The distribution of the dataset is
shown in Figure 10.

[Figure 10: histogram; x-axis: Compressor inlet temperature (K); y-axis: Percentage (%)]

Figure 10. Data distribution of the compressor inlet temperature.

As can be seen in Figure 10, the data have three distribution centers, corresponding
to winter, spring/autumn, and summer. The operation conditions of the field data are
naturally over-dispersed due to the over-dispersion of environmental conditions. The field
data at high (315 K) or low (275 K) compressor inlet temperatures are far fewer than those
at about 307 K. If the random sampling method is used, theoretically, the distribution of
the collected data will be consistent with the original data distribution: concentrated
around 307 K (summer), with the rest distributed around 294 K and 284 K. As a result, the
rare operation conditions are likely to be masked by the common operation conditions.
Furthermore, considering the computation cost of performance adaption, only a few sampling
points are selected from the original data, making it more likely to miss some operating
conditions. Cluster sampling clusters the entire dataset based on the minimum distance to
the cluster centers, ensuring the diversity and representativeness of the sampling points.
Correspondingly, the clustering method was proposed to detect these rare operation
conditions. Taking 10 sampling points as an example, the clustering results and the sampling
points obtained by the cluster sampling method are shown in Figure 11, together with random
sampling points marked by green triangles.

[Figure 11: scatter plot; x-axis: Clusters; y-axis: Compressor inlet temperature (K); legend: Data points, Clustering sampling points, Random sampling points]

Figure 11. Clustering results and sampling points obtained by the two sampling methods.

Compared with the random sampling points, the cluster sampling points marked by red
circles cover a more diverse range of operating conditions and detect the rare operation
conditions at both low and high temperatures. As can be seen, cluster sampling maintains
its advantages of diversity and representativeness even with a longer data collection time
and larger data volume.
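The contrast between the two sampling strategies on a seasonally skewed dataset can be reproduced with a small sketch. It uses a plain Lloyd's k-means written with NumPy rather than any particular library routine, and the three-mode synthetic temperature data (centered near 284 K, 294 K, and 307 K) are an assumption chosen to mimic the distribution described above; they are not the field data.

```python
import numpy as np


def kmeans(data, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on an (n_points, n_features) array; returns the centers."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old center if a cluster becomes empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


def cluster_sample(data, k, seed=0):
    """Indices of the data points closest to each of the k cluster centers."""
    data = np.asarray(data, dtype=float)
    centers = kmeans(data, k, seed=seed)
    dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return np.unique(dist.argmin(axis=0))


def coverage(sample, total):
    """Range of the sample divided by the range of the total data."""
    return float((sample.max() - sample.min()) / (total.max() - total.min()))


# Synthetic compressor inlet temperatures (K): few winter points, many summer points.
rng = np.random.default_rng(1)
temps = np.concatenate([
    rng.normal(284.0, 2.0, 300),
    rng.normal(294.0, 3.0, 700),
    rng.normal(307.0, 3.0, 2000),
]).reshape(-1, 1)

cluster_idx = cluster_sample(temps, k=10)
random_idx = rng.choice(len(temps), size=10, replace=False)

print("cluster sampling coverage:", coverage(temps[cluster_idx], temps))
print("random sampling coverage: ", coverage(temps[random_idx], temps))
```

Because random sampling follows the data density, most of its ten points tend to fall in the dominant summer mode, whereas the cluster centers are forced to spread across the whole temperature range. The exact coverage values depend on the random seeds.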

4. Conclusions
This paper proposed a performance adaption method based on cluster sampling to
adjust the component characteristic map and minimize the predicted errors of performance
parameters. The tuning factors were the coefficients of scaling factors defined by the
ratio of the original and target characteristic parameters. The optimal coefficients were
determined by PSO. Through this process, the predicted errors of the performance model
can be reduced. Unlike other adaption methods, adaption based on cluster sampling
selects more representative sampling points, which improves the model
accuracy on the entire dataset.
The proposed method was applied to a real E-class gas turbine. The simulated
performance based on the cluster sampling method was compared with the simulated
performance based on the random sampling method. The average and maximum errors of
the simulated performance based on the random sampling method on the entire dataset
were 0.661% and 1.552%, respectively. The average and maximum errors of the simulated
performance based on the cluster sampling method were 0.466% and 1.088%, respectively,
all showing some degree of improvement. In the data categories that were not included in
the random sample set, the average and maximum errors of the simulated performance
based on the random sampling method were 1.067% and 1.552%, respectively. The average
and maximum errors of the simulated performance based on the cluster sampling method
were 0.479% and 0.887%, respectively. The performance adaption based on cluster sampling
enhances the prediction accuracy of the model on the entire dataset and the prediction
stability of some operating conditions, and helps to improve the application effect in
performance estimation and gas path diagnosis.
Some aspects that can be enhanced in future research are: (1) Clustering is sensitive
to outliers in the data. When the data are large, clustering can be used to automatically
remove outliers before performing cluster sampling. (2) The current data collection does
not include the complete operating data in winter and spring. The model stability on
unknown operating conditions can be verified and adjusted in subsequent research.

Author Contributions: Conceptualization, J.K., W.Y. and H.Z.; methodology, J.K. and J.C.; software,
J.K. and J.C.; writing—original draft preparation, J.K.; writing—review and edit, J.C., W.Y. and H.Z.
All authors have read and agreed to the published version of the manuscript.
Funding: This work was funded by the National Natural Science Foundation of China (Grant No.
51876116 and 51906138) and the National Science and Technology Major Project (2017-I-0002-0002).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.

Nomenclature
Symbols
T1 Compressor inlet temperature, K
P1 Compressor inlet pressure, kPa
T2 Compressor outlet temperature, K
P2 Compressor outlet pressure, kPa
mf Fuel mass flow, lb/s
Tf Fuel temperature, K
Pf Fuel pressure, kPa
T4 Turbine outlet temperature, K


P4 Turbine outlet pressure, kPa
Pw Power output, MW
n Rotor speed
n Corrected relative non-dimensional rotational speed
WAC Corrected mass flow rate
PR Pressure ratio
ETA Isentropic efficiency
SF Scaling factor
b The first-order coefficient
c The second-order coefficient
N Number of targeted off-design points
M Number of performance measurements
ẑ Predicted performance measurements
z Actual performance measurements
Subscript
1 Inlet of compressor
2 Outlet of compressor
3 Outlet of burner
4 Outlet of turbine
Comp Compressor
OD Off-design point
DP Design point
x One of the characteristic parameters, WAC, PR, or ETA
Abbreviations
GA Genetic algorithm
CN Corrected relative non-dimensional rotational speed
PSO Particle swarm optimization algorithm
CV Coefficient of variation

References
1. Polyzakis, A.L.; Koroneos, C.; Xydis, G. Optimum gas turbine cycle for combined cycle power plant. Energy Convers. Manag.
2008, 49, 551–563. [CrossRef]
2. Vyncke-Wilson, D. Advantages of Aeroderivative Gas Turbines: Technical & Operational Considerations on Equipment Selection.
In Proceedings of the 20th Symposium of the Industrial Application of Gas Turbines Committee, Banff, AB, Canada, 21–23
October 2013.
3. Haglind, F. A Review on the Use of Gas and Steam Turbine Combined Cycles as Prime Movers for Large Ships. Part I: Background
and Design. Energy Convers. Manag. 2008, 49, 3458–3467. [CrossRef]
4. Tahan, M.; Tsoutsanis, E.; Muhammad, M.; Karim, Z.A. Performance-Based Health Monitoring, Diagnostics and Prognostics for
Condition-Based Maintenance of Gas Turbines: A Review. Appl. Energy 2017, 198, 122–144. [CrossRef]
5. Li, Y.G. Aero Gas Turbine Flight Performance Estimation Using Engine Gas Path Measurements. J. Propuls. Power 2015, 31,
851–860. [CrossRef]
6. Gu, C.; Wang, H.; Ji, X.; Li, X. Development and Application of a Thermodynamic-Cycle Performance Analysis Method of a
Three-Shaft Gas Turbine. Energy 2016, 112, 307–321. [CrossRef]
7. Tsoutsanis, E.; Meskin, N.; Benammar, M.; Khorasani, K. Transient Gas Turbine Performance Diagnostics through Nonlinear
Adaptation of Compressor and Turbine Maps. J. Eng. Gas Turbines Power 2015, 137, 091201. [CrossRef]
8. Yang, X.; Guo, X.; Dong, W. On-Line Component Map Adaptive Procedure Based on Sensor Data. In Proceedings of the ASME
Turbo Expo 2020, Virtual, 21–25 September 2020.
9. Lo Gatto, E.; Li, Y.G.; Pilidis, P. Gas Turbine Off-Design Performance Adaptation Using a Genetic Algorithm; American Society of
Mechanical Engineers Digital Collection; ASME: New York, NY, USA, 2008; pp. 551–560.
10. Alberto Misté, G.; Benini, E. Turbojet Engine Performance Tuning With a New Map Adaptation Concept. J. Eng. Gas Turbines
Power 2014, 136, 071202-1–071202-8. [CrossRef]
11. Stamatis, A.; Mathioudakis, K.; Papailiou, K.D. Adaptive simulation of gas turbine performance. ASME J. Eng. Gas Turbines Power
1990, 112, 168–175. [CrossRef]
12. Lambiris, B.; Mathioudakis, K.; Stamatis, A.; Papailiou, K. Adaptive modeling of jet engine performance with application to
condition monitoring. J. Propuls. Power 1994, 10, 890–896. [CrossRef]
13. Kong, C.; Ki, J.; Kang, M. A new scaling method for component maps of gas turbine using system identification. J. Eng. Gas
Turbines Power 2003, 125, 979–985. [CrossRef]
Appl. Sci. 2023, 13, 7352 17 of 17

14. Kong, C.; Kho, S.; Ki, J. Component map generation of a gas turbine using genetic algorithms. J. Eng. Gas Turbines Power 2006,
128, 92–95. [CrossRef]
15. Kong, C.; Ki, J. Components map generation of gas turbine engine using genetic algorithms and engine performance deck data. J.
Eng. Gas Turbines Power 2007, 129, 312–317. [CrossRef]
16. Li, Y.G.; Pilidis, P.; Newby, M.A. An adaptation approach for gas turbine design-point performance simulation. J. Eng. Gas
Turbines Power 2006, 128, 789–795. [CrossRef]
17. Li, Y.G.; Marinai, L.; Gatto, E.L.; Pachidis, V.; Philidis, P. Multiple-point adaptive performance simulation tuned to aeroengine
test-bed data. J. Propuls. Power 2009, 25, 635–641. [CrossRef]
18. Li, Y.G.; Ghafir, M.F.; Wang, L.; Singh, R.; Huang, K.; Feng, X. Nonlinear multiple points gas turbine off-design performance
adaptation using a genetic algorithm. J. Eng. Gas Turbines Power 2011, 133, 071701-1–071701-9. [CrossRef]
19. Li, Y.G.; Ghafir, M.F.; Wang, L.; Singh, R.; Huang, K.; Feng, X.; Zhang, W. Improved multiple point nonlinear genetic algorithm
based performance adaptation using least square method. J. Eng. Gas Turbines Power 2012, 134, 031701-1–031701-10. [CrossRef]
20. Tsoutsanis, E.; Li, Y.G.; Pilidis, P.; Newby, M. Part-Load Performance of Gas Turbines: Part I—A Novel Compressor Map
Generation Approach Suitable for Adaptive Simulation. In Proceedings of the ASME 2012 Gas Turbine India Conference,
Mumbai, Maharashtra, India, 1 December 2012.
21. Tsoutsanis, E.; Li, Y.G.; Pilidis, P.; Newby, M. Part-Load Performance of Gas Turbines: Part II—Multi-Point Adaptation with Compressor
Map Generation and GA Optimization; American Society of Mechanical Engineers Digital Collection; ASME: New York, NY, USA,
2013; pp. 743–751.
22. Tsoutsanis, E.; Meskin, N.; Benammar, M.; Khorasani, K. A Component Map Tuning Method for Performance Prediction and
Diagnostics of Gas Turbine Compressors. Appl. Energy 2014, 135, 572–585. [CrossRef]
23. Yang, Q.; Li, S.; Cao, Y. A New Component Map Generation Method for Gas Turbine Adaptation Performance Simulation. J.
Mech. Sci. Technol. 2017, 31, 1947–1957. [CrossRef]
24. Li, S.Y.; Li, Z.; Li, S.Y. Improved Method for Gas-Turbine Off-Design Performance Adaptation Based on Field Data. J. Eng. Gas
Turbines Power 2020, 142, 041001-1–041001-12. [CrossRef]
25. Yan, B.; Hu, M.; Feng, K.; Jiang, Z. Enhanced Component Analytical Solution for Performance Adaptation and Diagnostics of Gas
Turbines. Energies 2021, 14, 4356. [CrossRef]
26. Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A k-means clustering algorithm. J. R. Stat. Soc. Ser. C Appl. Stat. 1979, 28, 100–108.
[CrossRef]
27. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural
Networks, Perth, WA, Australia, 27 November 1995.
28. Poli, R.; Kennedy, J.; Blackwell, T. Particle swarm optimization: An overview. Swarm Intell. 2007, 1, 33–57.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
