Selecting scale factor of Bayesian multi-fidelity surrogate by minimizing posterior variance
H. Bu et al., Chinese Journal of Aeronautics, 2022
Institute of Turbomachinery, School of Energy & Power Engineering, Xi’an Jiaotong University, Xi’an 710049, China
KEYWORDS: Co-Kriging; Gaussian process regression; Multi-fidelity surrogate; Optimization; Scale factor

Abstract: The Bayesian Multi-Fidelity Surrogate (MFS) proposed by Kennedy and O'Hagan (KOH model) has been widely employed in engineering design. It builds the approximation by decomposing the high-fidelity function into a scaled low-fidelity model plus a discrepancy function. The scale factor applied to the low-fidelity function, ρ, plays a crucial role in the KOH model and is usually tuned by Maximum Likelihood Estimation (MLE). However, recent studies reported that MLE may sometimes result in an MFS of poor accuracy. In this paper, we first present a detailed analysis of why MLE can lead to an MFS of poor accuracy: when selecting ρ, the MLE overly emphasizes the variation of the discrepancy function but ignores the function waviness. To address this issue, we propose an alternative approach that chooses ρ by minimizing the posterior variance of the discrepancy function. Through tests on a one-dimensional function, two high-dimensional functions, and a turbine blade design problem, the proposed approach shows accuracy better than or comparable to MLE and is more robust than MLE. Additionally, through a comparative test on the design optimization of a turbine endwall cooling layout, the advantage of the proposed approach is further validated.
© 2022 Chinese Society of Aeronautics and Astronautics. Production and hosting by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
https://doi.org/10.1016/j.cja.2022.05.012
In the design process, computer simulations could run at multiple levels of accuracy. For example, CFD simulations could run with variable mesh resolutions, the governing equations range from inviscid (Euler) to viscous (Navier-Stokes), and the turbulence could be modeled as scale averaged (RANS, URANS) or scale resolved (LES, DNS). Some of the above-mentioned High-Fidelity (HF) computations are so expensive that only limited samples could be evaluated, which means that the accuracy of the obtained surrogate may not be satisfactory.

To achieve a better approximation of the HF simulations, some proposed to employ the Low-Fidelity (LF) simulations as auxiliary information to build the Multi-Fidelity Surrogate (MFS). The most popular MFS is Kennedy and O'Hagan's Bayesian MFS (KOH model).9,10 They developed a Bayesian multi-fidelity framework in which the HF function is approximated as the sum of the scaled LF function and a discrepancy function. Given that there are enough LF samples to obtain a fine approximation of the LF function, if the discrepancy function is tuned to be simple and easy to predict, the HF function is expected to be predicted more accurately. In this MFS, the scale factor ρ is always selected with Maximum Likelihood Estimation (MLE). As will be presented in the following sections, this paper's investigations are conducted under the framework of the KOH model.

In the past decade, several variants of MFS have been proposed to improve model performance. Han and Görtz11 proposed a simple but decent MFS called Hierarchical Kriging (HK). In this framework, the HF function is approximated with a Kriging surrogate model in which another Kriging model built from LF data serves as the trend function. Several favorable properties are obtained within this framework, such as a more reasonable formulation of the prediction uncertainty and no requirement of nested sample sets. Zhang et al.12 proposed a novel MFS based on single linear regression. In this model, the discrepancy function is modeled with polynomials and the low-fidelity model is considered as an additional basis function. The unknown coefficients of the polynomials, along with the LF model scale factor, are obtained with a single linear regression. Numerical results indicate that this approach is more robust with noisy data. Hao et al.13 incorporated gradient information into an MFS framework and proposed an adaptive infill sampling criterion which achieved the optimum value with fewer function calls. Although in some newly proposed MFS the scale factors could be obtained analytically, MLE is still widely adopted for the selection of ρ. Zhou et al.14 proposed a generalized multi-fidelity Kriging model which does not require nested samples and is more robust when correlations between fidelity levels are weak. Long et al.15 developed a multi-objective adaptive infill sampling approach for a Bayesian MFS model and applied it to the optimization of geostationary orbit satellite systems. In these multi-fidelity surrogates, the scale factors are also obtained with MLE.

As introduced above, the key parameter of the Bayesian MFS is the scale factor ρ. As given in the work of Forrester et al.,16 the scale factor is often selected with MLE. However, some studies reported that MLE may fail to select a suitable ρ. Guo et al.17 studied the MLE's choice in the selection between the HF-only model (ρ = 0) and the Multi-Fidelity (MF) model (ρ > 0). They found that even if the HF model is more accurate, the MLE frequently chooses the MF model. However, their study mainly focused on the model selection between HF and MF and did not pay attention to the ρ selection in MF modeling. The main contributions of the current study are given as follows. It is the first time that the MLE's problem in ρ selection is analyzed in detail. Besides, an alternative approach for ρ selection is proposed to overcome this problem. The remainder of this paper is organized as follows. Section 2 gives the background of the Bayesian MFS. Section 3 provides a detailed illustration of the MLE's problem. Section 4 presents the formulation of the proposed approach. In Section 5, numerical examples and a turbine blade design problem are tested to show the effectiveness of the proposed approach and its applicability to real-world design problems. Additionally, a comparative test on the design optimization of a turbine endwall cooling layout is carried out, which further validates the advantage of the proposed approach.

2. Background of Bayesian MFS

In this section, we briefly introduce the classical Bayesian multi-fidelity surrogate. The formulations of the model prediction and uncertainty are given, and the MLE parameter tuning approach is introduced.

In the Bayesian multi-fidelity surrogate, the high-fidelity function is modeled as a Gaussian Process (GP) which is the combination of the low-fidelity and discrepancy GPs:

Y_h(x) = ρY_l(x) + δ(x)    (1)

where Y_h(x), Y_l(x), and δ(x) denote the GPs of the high-fidelity, low-fidelity, and discrepancy function respectively, and ρ is the scale factor that scales the low-fidelity data to match the high fidelity. Once the observations of the high- and low-fidelity functions are obtained, the multi-fidelity prediction at an unknown point x can be calculated from the mean of the posterior distribution9:

ŷ_h(x) = E[Y_h(x) | y_h, y_l, θ_l, θ_d, ρ]    (2)

where y_h and y_l denote the observations of the high- and low-fidelity functions respectively, and θ_l and θ_d denote the hyper-parameters of the low-fidelity and discrepancy GPs respectively. The uncertainty of the prediction can be calculated from the variance of the posterior distribution:

σ̂_h²(x) = Var[Y_h(x) | y_h, y_l, θ_l, θ_d, ρ]    (3)

To be more specific, the GP models of the low-fidelity and discrepancy functions are defined as

Y_l(x) ~ GP(f(x)ᵀβ_l, σ_l²),    δ(x) ~ GP(f(x)ᵀβ_d, σ_d²)    (4)

where f(x)ᵀ is the basis function vector, β_l and β_d are the coefficient vectors of the process mean functions, and σ_l² and σ_d² are the process variances. The prediction and the prediction uncertainty at an unknown point x are given by

ŷ_h(x) = f′(x)ᵀβ′ + t(x)ᵀC⁻¹(y_S − Fβ′)    (5)

σ̂_h²(x) = ρ²σ_l² + σ_d² − t(x)ᵀC⁻¹t(x) + [f′(x)ᵀ − t(x)ᵀC⁻¹F](FᵀC⁻¹F)⁻¹[f′(x)ᵀ − t(x)ᵀC⁻¹F]ᵀ    (6)

where f′(x), F, and β′ are the extended basis function vector, basis matrix, and coefficient vector of the process mean function, t(x) is the covariance vector between x and the sample points, C is the covariance matrix of the sample points, and y_S is the observation vector. More detailed formulations of the above-mentioned quantities are listed in Appendix A.
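To make the structure of Eqs. (1)-(6) concrete, the following sketch builds a simplified, two-stage version of the KOH-style predictor with plain NumPy: a GP is fitted to the LF samples, the discrepancy data y_d = y_h − ρ·y_l(X_h) are formed, a second GP is fitted to them, and the MF prediction is assembled as ρ·ŷ_l(x) + δ̂(x). The Gaussian correlation, the zero process means, the fixed hyper-parameters, and the function names are illustrative assumptions; the paper's exact formulation is the joint posterior of Eqs. (5)-(6).

import numpy as np

def gauss_corr(XA, XB, theta):
    # Gaussian correlation matrix R_ij = exp(-theta * ||xA_i - xB_j||^2)
    d2 = ((XA[:, None, :] - XB[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-theta * d2)

def fit_gp(X, y, theta, nugget=1e-10):
    # simple zero-mean GP; returns the process variance and a predictor closure
    R = gauss_corr(X, X, theta) + nugget * np.eye(len(X))
    Rinv_y = np.linalg.solve(R, y)
    sigma2 = float(y @ Rinv_y) / len(X)          # MLE of the process variance
    def predict(Xnew):
        r = gauss_corr(Xnew, X, theta)
        mean = r @ Rinv_y
        var = sigma2 * (1.0 - np.einsum('ij,ij->i', r, np.linalg.solve(R, r.T).T))
        return mean, np.maximum(var, 0.0)
    return sigma2, predict

def koh_mfs(X_l, y_l, X_h, y_h, rho, theta_l, theta_d):
    # two-stage KOH-style MFS: y_h(x) is approximated by rho * y_l(x) + delta(x)
    _, predict_l = fit_gp(X_l, y_l, theta_l)
    yl_at_Xh, _ = predict_l(X_h)
    y_d = y_h - rho * yl_at_Xh                    # discrepancy observations
    _, predict_d = fit_gp(X_h, y_d, theta_d)
    def predict(Xnew):
        ml, vl = predict_l(Xnew)
        md, vd = predict_d(Xnew)
        return rho * ml + md, rho ** 2 * vl + vd  # mean and (simplified) variance
    return predict

For the exact joint-posterior formulas, Eqs. (5)-(6) and Appendix A should be used; the sketch above ignores the cross-covariance between the low-fidelity and discrepancy processes.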
The hyper-parameters of the Bayesian multi-fidelity surrogate have always been estimated using MLE.16 The MLE chooses the hyper-parameters by minimizing the negative logarithm of the concentrated likelihood function:

θ_l = arg min_{θ_l} {L_l(θ_l)},    (ρ, θ_d) = arg min_{ρ, θ_d} {L_d(ρ, θ_d)}    (7)

where L_l and L_d are the negative logarithm likelihood functions of the low-fidelity and discrepancy models respectively:

L_l(θ_l) = (n_l/2) ln σ_l² + (1/2) ln|R_l|    (8)

L_d(ρ, θ_d) = (n_h/2) ln σ_d² + (1/2) ln|R_d|    (9)

where n_l and n_h are the numbers of LF and HF samples respectively, and R_l and R_d are the correlation matrices of the LF and HF sample sets respectively.

It should be noted that L_d is a function of ρ because the σ_d² term in Eq. (9) is determined by the observations of the discrepancy, y_d = y_h − ρy_l, as given in Eq. (6). We will further discuss L_d in the next section to show how the minimization of L_d may not lead to the best ρ.

3. Discussion on MLE's effectiveness

Although the MLE is widely adopted to select the scale factor ρ, it is not always effective. Guo et al.17 found that, in the selection between the HF-only model (ρ = 0) and the MF model (ρ > 0), MLE frequently chose the MF model even if the HF model is more accurate. However, their research mainly focused on the model selection rather than ρ tuning. In this section, we analyze the MLE's problem in ρ tuning in detail. To solve this problem, an alternative approach to select the scale factor is proposed in the next section.

As given in Eq. (9), the likelihood function employed by MLE to find ρ and θ_d is composed of two terms: the σ_d² term and the |R_d| term. These terms represent different properties of the discrepancy function. Guo et al.17 pointed out that the σ_d² term reflects the variation of the discrepancy model and the |R_d| term is a representation of the model waviness. Park et al.10 further defined the combined effect of these two terms as an indicator of model bumpiness. In general, the MLE tries to reduce the discrepancy model's variation and waviness together, so that the real discrepancy function can be approximated more easily with limited observations. However, this approach may not always be effective. Here, we illustrate this drawback with a one-dimensional example, the Forrester function in Ref. 14:

y_h = (6x − 2)² sin[2(6x − 2)],    y_l = y_h − 0.5[(6x − 2)² − 6x],    x ∈ [0, 1]    (10)

Nested sample sets are adopted, which are the same as those in the study of Han and Görtz.11 The samples of the high- and low-fidelity functions are S_h = {0, 0.4, 0.6, 1.0} and S_l = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}. The ranges and tuning strategy of the hyper-parameters can be found in the next section. The MF models built with MLE and with manually selected parameters (ρ = 1, θ_d = 1) are compared. As shown in Fig. 1, the MLE model's prediction is not satisfactory, especially in the range of [0.6, 0.9], while the manually chosen model presents accurate prediction in the whole range. The ρ selected by MLE is 1.46, which is not the best value, as the predicted discrepancy function fails to approximate the real one. However, as shown in Fig. 1(d), when ρ = 1, the discrepancy function is well predicted. Hence, it is worth investigating why MLE did not choose the best ρ. For clarity of comparison, Table 1 gives the ρ and negative-logarithmic likelihood values of the MLE and manually chosen models.

Table 1  ρ and negative-logarithmic likelihood values of models.
Model              ρ       L_d
MLE                1.46    0.957
Manually chosen    1       6.883

As mentioned above, the likelihood function is composed of two terms: the variation term σ_d² and the waviness term |R_d|. To analyze each term's contribution to the likelihood function, the likelihood function and each term's value under different ρ and θ_d are presented in Fig. 2. As the likelihood and the variation term are functions of both θ_d and ρ, the values are plotted at several different ρ. The θ_d that minimize the likelihood under a specific ρ are plotted with red crosses in Fig. 2(a), and the θ_d that minimize the variation term under a specific ρ are plotted with red crosses in Fig. 2(b). These red crosses can be regarded as candidate hyper-parameter combinations (ρ, θ_d) for the selection of ρ. As shown in Fig. 2(a), among all the candidate points, the point on the ρ = 1.5 curve produces the lowest likelihood function value, which is consistent with the MLE result plotted with red dots. Comparing Fig. 2(a) with Fig. 2(b), one can observe that the relative positions of the constant-ρ curves and the candidate points on them do not change much. This indicates that the variation term's effect in the likelihood function is overwhelming, so that the waviness term's effect is nearly eliminated. This can also be observed in Fig. 2(c): although the waviness term's absolute value increases greatly as θ_d approaches 0, it is relatively small in the region where the candidate points locate (θ_d > 5). Consequently, the MLE overly emphasizes the discrepancy model's variation while ignoring the waviness. As shown in Fig. 1, this bias leads to a wrong estimation of ρ and correspondingly poor model accuracy.
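The trade-off described above, with the variation term (n_h/2)·ln σ_d² dominating the waviness term (1/2)·ln|R_d|, can be reproduced numerically. The sketch below evaluates both terms of Eq. (9) for the Forrester pair of Eq. (10) on a grid of (ρ, θ_d). The Gaussian correlation model, the zero-mean discrepancy, and the grid search are simplifying assumptions, so the numbers will not match Table 1 exactly.

import numpy as np

def y_h(x):   # high-fidelity Forrester function, Eq. (10)
    return (6 * x - 2) ** 2 * np.sin(2 * (6 * x - 2))

def y_l(x):   # low-fidelity function, Eq. (10)
    return y_h(x) - 0.5 * ((6 * x - 2) ** 2 - 6 * x)

def neg_log_likelihood_terms(rho, theta_d, X_h, nugget=1e-10):
    # variation term (n_h/2)*ln(sigma_d^2) and waviness term 0.5*ln|R_d| of Eq. (9)
    y_d = y_h(X_h) - rho * y_l(X_h)                     # discrepancy observations
    n_h = len(X_h)
    d2 = (X_h[:, None] - X_h[None, :]) ** 2
    R_d = np.exp(-theta_d * d2) + nugget * np.eye(n_h)  # Gaussian correlation (assumed)
    sigma_d2 = y_d @ np.linalg.solve(R_d, y_d) / n_h    # zero-mean estimate of the process variance
    variation = 0.5 * n_h * np.log(sigma_d2)
    waviness = 0.5 * np.linalg.slogdet(R_d)[1]
    return variation, waviness

X_h = np.array([0.0, 0.4, 0.6, 1.0])                    # HF samples used in Section 3
for rho in [0.5, 1.0, 1.46, 2.0]:
    best = min((sum(neg_log_likelihood_terms(rho, th, X_h)), th)
               for th in np.logspace(-3, 2, 200))
    print(f"rho={rho:4.2f}  min L_d={best[0]:8.3f} at theta_d={best[1]:7.3f}")

If the behaviour matches Fig. 2, the minimizing θ_d will sit where ln|R_d| is nearly flat, so the choice of ρ is driven almost entirely by the variation term.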
This property of MLE can also be found in some other researches. Guo et al.17 studied the MLE's selection between the HF-only model (ρ = 0) and the MF model (ρ > 0). They found that the MLE tends to select the model with the smaller discrepancy function variation. As the MF model's discrepancy function variation is often smaller than the HF function variation, the MLE tends to choose the MF model even if it is inferior to the HF-only model. Park et al.10 pointed out that MLE tries to find the ρ that minimizes the bumpiness of the discrepancy function, which is the combined effect of variation and waviness. They defined and calculated the bumpiness, variation, and waviness numerically. For the Hartmann 6 function that they studied, the bumpiness correlated strongly with the variation. Although the waviness presented an opposite trend, the contribution of waviness was overwhelmed by the variation.

Although this problem of MLE has been noticed by some researchers, few proposed alternative approaches to select ρ. Shu et al.18 proposed to minimize the product of ‖y_d‖ and ‖θ_d‖, where ‖·‖ denotes the L2 norm of a vector, with θ obtained with MLE. This indicator is also a combination of variation (‖y_d‖) and waviness (‖θ_d‖). Numerical tests have shown the effectiveness of this approach for some problems. However, this approach is designed for the simple discrepancy MFS framework19 and may not be suitable for the Bayesian MFS. In the next section, we propose an alternative approach for the selection of scale factors in the Bayesian MFS.

4. Proposed approach

Recall the one-dimensional case in the previous section: the MLE produces a model of poor accuracy, as shown in Fig. 1(a). Besides, the prediction uncertainty of the MLE model is remarkably large compared with that of the manually selected model, as shown in Fig. 1. This inspires us to consider whether the prediction uncertainty (the posterior variance) reflects the complexity of the discrepancy model and can be employed to select a suitable ρ. Here, we use the Integrated Mean Square Error (IMSE) to measure the overall estimation uncertainty, as adopted by several studies regarding the sequential design of Bayesian surrogate models.20,21 The IMSE integrates the posterior variance over the design space and is computationally expensive:

IMSE = ∫_{x∈D} σ_d²(x) dx    (11)

where D is the full design space. To obtain a cheap approximation of Eq. (11), we consider the average MSE at all the LF sample points:

ÎMSE = (1/n_l) Σ_{i=1}^{n_l} σ_d²(x_i)    (12)

When selecting ρ, we minimize the modified IMSE:

ρ = arg min_ρ {ÎMSE(ρ, θ̂)}    (13)

Note that, in Eq. (13), θ̂ is determined by minimizing the likelihood Eq. (9) given a specific ρ, as also adopted in Ref. 18. The full MFS building process using the current approach is illustrated with a flowchart in Fig. 3. As the proposed approach tries to find the ρ that minimizes the posterior variance of the model, we denote the proposed approach as Minimum Posterior Variance (MPV).

Fig. 4 presents the MF model of the Forrester function obtained with the proposed approach.
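A minimal sketch of the MPV selection loop of Eqs. (11)-(13) is given below. For each candidate ρ, the discrepancy hyper-parameter θ_d is tuned by minimizing L_d of Eq. (9), the discrepancy GP posterior variance is averaged over the LF sample sites as in Eq. (12), and the ρ with the smallest average posterior variance is returned. The grid search over ρ and θ_d, the Gaussian correlation, and the zero-mean discrepancy GP are assumptions made for readability rather than the paper's exact optimizer.

import numpy as np

def gauss_corr(XA, XB, theta):
    d2 = ((XA[:, None, :] - XB[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-theta * d2)

def discrepancy_stats(rho, theta_d, X_h, y_h, yl_at_Xh, X_l, nugget=1e-10):
    # returns L_d of Eq. (9) and the average posterior variance of Eq. (12)
    y_d = y_h - rho * yl_at_Xh
    n_h = len(X_h)
    R = gauss_corr(X_h, X_h, theta_d) + nugget * np.eye(n_h)
    Rinv_yd = np.linalg.solve(R, y_d)
    sigma_d2 = float(y_d @ Rinv_yd) / n_h
    L_d = 0.5 * n_h * np.log(sigma_d2) + 0.5 * np.linalg.slogdet(R)[1]
    r = gauss_corr(X_l, X_h, theta_d)               # posterior variance at the LF sites
    var = sigma_d2 * (1.0 - np.einsum('ij,ij->i', r, np.linalg.solve(R, r.T).T))
    return L_d, float(np.mean(np.maximum(var, 0.0)))

def select_rho_mpv(X_h, y_h, yl_at_Xh, X_l,
                   rho_grid=np.linspace(0.2, 2.5, 47),
                   theta_grid=np.logspace(-3, 2, 100)):
    # MPV: for each rho, tune theta_d by MLE (Eq. (9)), then pick the rho whose
    # discrepancy GP has the smallest average posterior variance (Eq. (13))
    best_rho, best_imse = None, np.inf
    for rho in rho_grid:
        stats = [discrepancy_stats(rho, th, X_h, y_h, yl_at_Xh, X_l) for th in theta_grid]
        imse_at_mle_theta = min(stats)[1]           # IMSE at the L_d-minimizing theta_d
        if imse_at_mle_theta < best_imse:
            best_rho, best_imse = rho, imse_at_mle_theta
    return best_rho

Here yl_at_Xh denotes the LF model predictions at the HF sample sites; the default grids mirror the ρ range [0.2, 2.5] and θ range [0.001, 100] used later in Section 5, and a global optimizer could replace them if finer tuning is needed.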
where i is the index of the test points used to compute R², and ȳ_h and ȳ_l are the mean values of the HF and LF test points.

Fig. 6 presents the relation between R² and A. As shown in Fig. 6, the correlation between the high- and low-fidelity functions decreases as A increases. When A = 0, the correlation coefficient is R² = 0.882, which is the highest.

Fig. 7  Plot of HF and several LF functions.
5.2.1. Hartmann 3

The Hartmann 3 function is defined as follows:

y_h = −Σ_{i=1}^{4} c_i exp( −Σ_{j=1}^{3} a_ij (x_j − p_ij)² )

y_l = y_h + 7.6(0.585 − 0.324x1 − 0.379x2 − 0.431x3 − 0.208x1x2 + 0.326x1x3 + 0.193x2x3 + 0.225x1² + 0.263x2² + 0.274x3²)

a = | 3     10   30 |        c = | 1   |
    | 0.1   10   35 |            | 1.2 |
    | 3     10   30 |            | 3   |
    | 0.1   10   35 |            | 3.2 |

p = | 0.3689    0.1170   0.2673 |
    | 0.4699    0.4387   0.7470 |
    | 0.1091    0.8732   0.5547 |
    | 0.03815   0.5743   0.8828 |

x_j ∈ [0, 1],  j = 1, 2, 3    (18)
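For reference, Eq. (18) translates directly into code. The sketch below is a plain NumPy transcription; the leading minus sign on y_h follows the standard Hartmann 3 definition, and the minus signs lost in the scanned low-fidelity expression are restored as written above, so treat the exact signs as an assumption.

import numpy as np

A = np.array([[3.0, 10.0, 30.0],
              [0.1, 10.0, 35.0],
              [3.0, 10.0, 30.0],
              [0.1, 10.0, 35.0]])
C = np.array([1.0, 1.2, 3.0, 3.2])
P = np.array([[0.3689, 0.1170, 0.2673],
              [0.4699, 0.4387, 0.7470],
              [0.1091, 0.8732, 0.5547],
              [0.03815, 0.5743, 0.8828]])

def hartmann3_hf(x):
    # high-fidelity Hartmann 3 function of Eq. (18); x is an array of shape (3,)
    inner = (A * (x - P) ** 2).sum(axis=1)        # one term per i = 1..4
    return -float(C @ np.exp(-inner))

def hartmann3_lf(x):
    # low-fidelity companion of Eq. (18), a quadratic perturbation of y_h
    x1, x2, x3 = x
    poly = (0.585 - 0.324 * x1 - 0.379 * x2 - 0.431 * x3 - 0.208 * x1 * x2
            + 0.326 * x1 * x3 + 0.193 * x2 * x3
            + 0.225 * x1 ** 2 + 0.263 * x2 ** 2 + 0.274 * x3 ** 2)
    return hartmann3_hf(x) + 7.6 * poly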
A nested sample set with 18 HF samples and 60 LF samples is generated using Latin Hypercube Sampling (LHS) and Gratiet's nearest-neighbor method. The hyper-parameters ρ and θ (including θ_l and θ_d) are in the ranges of [0.2, 2.5] and [0.001, 100] respectively. The hyper-parameters are tuned using the same strategy as in the previous subsection. To eliminate the impact of random sampling, 100 standalone experiments are conducted. For each experiment, 1000 additional samples are generated using LHS to evaluate the accuracy of the built models.

Table 2 gives the median errors of the two approaches. As shown in the table, the MPV approach produces more accurate models in terms of both local and global error. Fig. 11 presents the error statistics of all numerical experiments. As can be seen from the figure, the MPV approach produces a statistically more accurate MF model: the error medians (red line) of the MPV approach are all lower than those of the MLE approach, and there are fewer extreme cases with poor accuracy generated by the MPV approach.

Table 2  Error medians of MF models, Hartmann 3.
Metric   MLE       MPV
RMSE     0.91731   0.84574
MAE      6.27474   5.87569

Fig. 11  Error statistics of MF models, Hartmann 3.

5.2.2. Borehole 8

The Borehole 8 function is defined as follows:

y_h = 2πx3(x4 − x6) / { ln(x2/x1) [1 + 2x7x3/(ln(x2/x1)·x1²·x8) + x3/x5] }

y_l = 0.4y_h + 0.07x1²x8 + x1x7/x3 + x1x6/x2 + x1²x4

x1 ∈ [0.05, 0.15], x2 ∈ [100, 50000], x3 ∈ [63070, 115660], x4 ∈ [990, 1110],
x5 ∈ [63.1, 116], x6 ∈ [700, 820], x7 ∈ [1120, 1680], x8 ∈ [9855, 12045]    (19)

A nested sample set containing 24 HF samples and 80 LF samples is generated using the same approach as in the Hartmann 3 case. The hyper-parameter tuning strategy and model accuracy evaluation method are kept the same as those in the previous subsection. The experiment is repeated 100 times to obtain the performance statistics.

Table 3 gives the median errors of the two approaches. As shown in the table, the MPV approach produces a statistically more accurate model: the global and local accuracies of the MPV models are all better than those of the MLE models. Fig. 12 presents the error statistics of all numerical experiments. As shown in the figure, the MPV approach is steadier, generating fewer extreme cases with poor accuracy.

Table 3  Error medians of MF models, Borehole 8.
Metric   MLE       MPV
RMSE     0.34057   0.33543
MAE      1.48780   1.42233

5.2.3. Impact of high-to-low sample ratio

Additional experiments are conducted for the Hartmann 3 function to compare the two approaches' performance under different high-to-low sample ratios. The sampling ratios tested here are n_h/n_l = 0.3, 0.4, 0.6, each employing 60, 45, and 30 LF samples respectively. The ranges and tuning strategy of the hyper-parameters are kept the same as those in the previous subsection.
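The accuracy statistics reported in Tables 2 and 3 can be assembled with a short evaluation loop like the one below. The RMSE and maximum-absolute-error definitions, the random test design standing in for the paper's LHS test sets, the callable names, and the seeds are assumptions made for illustration rather than the paper's exact post-processing code.

import numpy as np

def evaluate_model(predict, test_fn, n_test=1000, n_dims=3, seed=0):
    # RMSE and maximum absolute error of a fitted predictor on random test points
    rng = np.random.default_rng(seed)
    X_test = rng.random((n_test, n_dims))
    y_true = np.array([test_fn(x) for x in X_test])
    y_pred = np.array([predict(x) for x in X_test])
    err = y_pred - y_true
    return float(np.sqrt(np.mean(err ** 2))), float(np.max(np.abs(err)))

def median_errors(build_model, test_fn, n_repeats=100):
    # repeat the build/evaluate cycle and report median errors, as in Tables 2 and 3
    stats = np.array([evaluate_model(build_model(seed), test_fn, seed=seed)
                      for seed in range(n_repeats)])
    return np.median(stats, axis=0)   # [median RMSE, median max abs error]

Here build_model(seed) is assumed to regenerate the sample plan and return a fitted MF predictor for that repetition.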
variables. Table 5 gives the description and range of these variables.

Table 5  Input variables of the turbine blade design problem.
No.  Description                    Baseline  Range
1    Axial chord (mm)               33.85     [31.85, 35.85]
2    Connecting angle (°)           59.8      [55.8, 61.8]
3    Inlet up deviation angle (°)   69        [54, 72]
4    Outlet deflect angle (°)       4.5       [3, 9]
5    Relation coefficient           0.35      [0.30, 0.45]
6    Bézier control coefficient 1   0.4       [0.25, 0.55]
7    Bézier control coefficient 2   0.54      [0.39, 0.69]

To illustrate the parameterization method clearly, Fig. 14 presents the definition and influence of several key parameters. The axial chord controls the width of the blade in the axial direction. The connecting angle controls the installation angle of the blade. The full blade profile consists of five curves connected end to end smoothly with one another, including the leading- and trailing-edge arcs and the Bézier curves on the blade pressure and suction sides. As shown in Fig. 14(a), the full blade suction side curve is composed of two Bézier curve segments. When the throat size is known, the junction position of these two segments is controlled by the outlet deflection angle. As shown with the red curve in Fig. 14(a), the shape of each Bézier curve segment is controlled with two points (B, D) constrained within the tangential direction. The positions of these two points are defined with two control coefficients. In this problem, the control coefficients of the suction side inlet Bézier segment are modified to have more control of this critical region. Fig. 14(b) shows the influence of three variables, including the axial chord, the deflection angle, and control coefficient 1, with each variable perturbed ±0.25 times the deviation range.

To measure the aerodynamic performance of the designed blade profile, an energy loss coefficient is defined as follows:

C_pt = 1 − [1 − (P_exit/P*_exit)^((γ−1)/γ)] / [1 − (P_exit/P*_inlet)^((γ−1)/γ)]    (20)

where P_exit and P*_exit denote the static and total pressure at the cascade exit respectively, P*_inlet denotes the total pressure at the cascade inlet, and γ is the heat capacity ratio. In-house software is employed to compute the energy loss coefficient. Table 6 gives the boundary conditions of the computation.

Table 6  Boundary conditions of computation.
Parameter                      Value
Inlet total temperature (K)    982
Inlet total pressure (kPa)     344.74
Inlet flow angle (°)           90
Mach number at cascade exit    0.878

As widely adopted in aeronautical multi-fidelity design, the high- and low-fidelity aerodynamic data are obtained from computations on fine and coarse meshes. A preliminary mesh dependence study was carried out to choose a suitable mesh resolution. Fig. 15 gives the computed energy loss coefficient obtained from meshes of different resolutions. As shown in the figure, the energy loss coefficient does not change much when the cell number is larger than 1×10^5. The coarsest mesh that can produce a result similar to the fine mesh has 1.214×10^4 cells. Hence, the meshes of 1.007×10^5 and 1.214×10^4 cells are chosen for the computations of high and low fidelity. Fig. 16 presents a detailed view of the employed meshes, with every fourth grid point displayed for clarity.

For this problem, 21 HF samples and 42 LF samples were generated to build the MF model using the same method as in the previous sections. To validate the built models, 21 additional HF samples were generated. The normalized sample points and observations are listed in Appendix B (see Tables B1-B3). The hyper-parameter tuning method is kept the same as that in the previous subsections. Fig. 17 presents the C_pt predictions of the MF models against the validation data, where C_pt and Ĉ_pt are the real and predicted values respectively.
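The energy loss coefficient of Eq. (20) is a simple closed-form post-processing step; the sketch below evaluates it from the three pressures and the heat capacity ratio. The default γ = 1.4 and the example pressures are illustrative values only, not results from the paper's CFD cases.

def energy_loss_coefficient(p_exit, p0_exit, p0_inlet, gamma=1.4):
    # energy loss coefficient C_pt of Eq. (20)
    # p_exit   -- static pressure at the cascade exit
    # p0_exit  -- total pressure at the cascade exit
    # p0_inlet -- total pressure at the cascade inlet
    # gamma    -- heat capacity ratio (1.4 assumed for air)
    expo = (gamma - 1.0) / gamma
    actual = 1.0 - (p_exit / p0_exit) ** expo
    ideal = 1.0 - (p_exit / p0_inlet) ** expo
    return 1.0 - actual / ideal

# illustrative numbers only: static exit pressure chosen for an exit Mach number near 0.878
print(energy_loss_coefficient(p_exit=208.0, p0_exit=338.0, p0_inlet=344.74))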
E[I(x)] = ∫_{−∞}^{+∞} max( min(y_h) − ŷ_h(x), 0 ) · φ(ŷ_h(x)) dŷ_h(x)    (21)

where φ is the probability density function of the normal distribution. To reduce the influence of the initial sampling on the performance of the optimization, 10 standalone sampling and optimization experiments were carried out. Fig. 20 presents the convergence histories of the MLE and the proposed approach (MPV), with the mean and standard deviation of the ten experiments plotted.

Fig. 20  Convergence histories of MLE and proposed approach.

Acknowledgements

The authors would like to acknowledge the financial support from the National Science and Technology Major Project, China (No. 2019-II-0008-0028) and the Key Program of the National Natural Science Foundation of China (No. 51936008).

Appendix A. Detailed formulation of multi-fidelity predictor

The detailed formulations of the quantities in Eqs. (5) and (2) are given as follows:

f′(x)ᵀ = [ρf(x)ᵀ, f(x)ᵀ],    β′ = [β_lᵀ, β_dᵀ]ᵀ

y_Sᵀ = [y_lᵀ, y_hᵀ],    t(x)ᵀ = [ρσ_l²R_l(x, X_l), ρ²σ_l²R_l(x, X_h) + σ_d²R_d(x, X_h)]
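The extended vectors above can be assembled mechanically once the two correlation functions are available. The sketch below builds f′(x), y_S, and t(x) for a constant basis f(x) = 1; the Gaussian correlation function and the constant basis are assumptions for illustration, and the covariance matrix C, whose blocks are not reproduced in this fragment of the appendix, is not constructed here.

import numpy as np

def corr(XA, XB, theta):
    # Gaussian correlation R(x, X) used as a stand-in for R_l and R_d
    d2 = ((XA[:, None, :] - XB[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-theta * d2)

def extended_quantities(x, X_l, X_h, y_l, y_h, rho, sigma_l2, sigma_d2, theta_l, theta_d):
    # f'(x), y_S and t(x) of Appendix A for a constant basis f(x) = 1
    x = np.atleast_2d(x)
    f_prime = np.array([rho * 1.0, 1.0])                     # [rho*f(x)^T, f(x)^T]
    y_S = np.concatenate([y_l, y_h])                         # [y_l^T, y_h^T]
    t_l = rho * sigma_l2 * corr(x, X_l, theta_l)             # rho*sigma_l^2*R_l(x, X_l)
    t_h = (rho ** 2 * sigma_l2 * corr(x, X_h, theta_l)
           + sigma_d2 * corr(x, X_h, theta_d))               # rho^2*sigma_l^2*R_l(x, X_h) + sigma_d^2*R_d(x, X_h)
    t_x = np.concatenate([t_l.ravel(), t_h.ravel()])
    return f_prime, y_S, t_x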
Table B2 (continued)
No. x1 x2 x3 x4 x5 x6 x7 Observation
12 0.594384 0.230590 0.748617 0.910226 0.771823 0.604569 0.065129 0.0345
13 0.401252 0.310364 0.186475 0.505991 0.665890 0.323853 0.526711 0.0362
14 0.313049 0.428469 0.998402 0.004000 0.670894 0.793081 0.732440 0.0489
15 0.921281 0.550723 0.479898 0.616961 0.855798 0.379506 0.250072 0.0362
16 0.258910 0.502983 0.200647 0.634875 0.600816 0.973113 0.340674 0.0360
17 0.977338 0.647921 0.425619 0.821700 0.330635 0.082301 0.229683 0.0348
18 0.340715 0.963652 0.814918 0.278368 0.545562 0.011389 0.441503 0.0385
19 0.083899 0.380879 0.158022 0.837456 0.790812 0.723793 0.811957 0.0360
20 0.782664 0.068037 0.935661 0.140153 0.253581 0.779255 0.090008 0.0411
21 0.492997 0.991521 0.050902 0.885275 0.966277 0.132381 0.156874 0.0374
22 0.631120 0.530650 0.544891 0.447860 0.997185 0.500922 0.897187 0.0391
23 0.699195 0.788697 0.283866 0.758497 0.753325 0.482189 0.121797 0.0356
24 0.543050 0.740033 0.021675 0.792794 0.572767 0.750337 0.581835 0.0357
25 0.126962 0.687085 0.658533 0.228745 0.111651 0.644878 0.940834 0.0399
26 0.683685 0.722353 0.785529 0.569685 0.924249 0.036797 0.408021 0.0374
27 0.721172 0.300778 0.375823 0.418744 0.627234 0.162397 0.522394 0.0363
28 0.457329 0.140445 0.249948 0.105041 0.832526 0.932634 0.264618 0.0459
29 0.011430 0.254736 0.805589 0.363539 0.734596 0.259428 0.548784 0.0383
30 0.511819 0.161992 0.565683 0.691572 0.520795 0.986560 0.288417 0.0353
31 0.950309 0.811968 0.142117 0.059043 0.092998 0.051286 0.871325 0.0413
32 0.560670 0.777656 0.694228 0.966118 0.304568 0.856461 0.481774 0.0352
33 0.231035 0.835749 0.856084 0.143756 0.570140 0.913020 0.916746 0.0430
34 0.059161 0.896607 0.641268 0.720989 0.423540 0.574063 0.185759 0.0355
35 0.787832 0.635058 0.382324 0.344716 0.402545 0.865873 0.839966 0.0395
36 0.201421 0.477097 0.875669 0.195474 0.014324 0.112057 0.783141 0.0393
37 0.365782 0.712652 0.589246 0.296842 0.069858 0.554533 0.369243 0.0374
38 0.903631 0.878909 0.685615 0.026345 0.487207 0.232823 0.108714 0.0427
39 0.751070 0.266025 0.460175 0.673440 0.703524 0.469243 0.604342 0.0360
40 0.617432 0.083308 0.906580 0.489151 0.142941 0.213074 0.030675 0.0358
41 0.647982 0.104777 0.963874 0.978186 0.900958 0.383566 0.646650 0.0365
42 0.958130 0.460714 0.113077 0.079200 0.038858 0.883094 0.464165 0.0406
References

1. Tyacke J, Vadlamani NR, Trojak W, et al. Turbomachinery simulation challenges and the future. Prog Aerosp Sci 2019;110:100554.
2. Tucker PG, Wang ZN. Eddy resolving strategies in turbomachinery and peripheral components. J Turbomach 2021;143(1):010801.
3. Queipo NV, Haftka RT, Shyy W, et al. Surrogate-based analysis and optimization. Prog Aerosp Sci 2005;41(1):1–28.
4. Oakley JE, O'Hagan A. Probabilistic sensitivity analysis of complex models: A Bayesian approach. J Royal Stat Soc Ser B Stat Methodol 2004;66(3):751–69.
5. Bu HY, Guo ZD, Song LM, et al. Effects of cooling configurations on the aerothermal performance of a turbine endwall with jet impingement and film cooling. J Turbomach 2021;143(6):061013.
6. Dyn N, Levin D, Rippa S. Numerical procedures for surface fitting of scattered data by radial functions. SIAM J Sci Stat Comput 1986;7(2):639–59.
7. Clarke SM, Griebsch JH, Simpson TW. Analysis of support vector regression for approximation of complex engineering analyses. J Mech Des 2005;127(6):1077–87.
8. Jones DR, Schonlau M, Welch WJ. Efficient global optimization of expensive black-box functions. J Global Optim 1998;13(4):455–92.
9. Kennedy M, O'Hagan A. Predicting the output from a complex computer code when fast approximations are available. Biometrika 2000;87(1):1–13.
10. Park C, Haftka RT, Kim NH. Low-fidelity scale factor improves Bayesian multi-fidelity prediction by reducing bumpiness of discrepancy function. Struct Multidiscip Optim 2018;58(2):399–414.
11. Han ZH, Görtz S. Hierarchical Kriging model for variable-fidelity surrogate modeling. AIAA J 2012;50(9):1885–96.
12. Zhang YM, Kim NH, Park C, et al. Multifidelity surrogate based on single linear regression. AIAA J 2018;56(12):4944–52.
13. Hao P, Feng S, Li YW, et al. Adaptive infill sampling criterion for multi-fidelity gradient-enhanced Kriging model. Struct Multidiscip Optim 2020;62(1):353–73.
14. Zhou Q, Wu YD, Guo ZD, et al. A generalized hierarchical co-Kriging model for multi-fidelity data fusion. Struct Multidiscip Optim 2020;62(4):1885–904.
15. Shi RH, Liu L, Long T, et al. Multi-fidelity modeling and adaptive co-Kriging-based optimization for all-electric geostationary orbit satellite systems. J Mech Des 2020;142(2):021404.
16. Forrester AIJ, Sóbester A, Keane AJ. Multi-fidelity optimization via surrogate modelling. Proc R Soc A 2007;463(2088):3251–69.
17. Guo ZD, Song LM, Park C, et al. Analysis of dataset selection for multi-fidelity surrogates for a turbine problem. Struct Multidiscip Optim 2018;57(6):2127–42.
18. Shu LS, Jiang P, Song XG, et al. Novel approach for selecting low-fidelity scale factor in multifidelity metamodeling. AIAA J 2019;57(12):5320–30.
19. Park C, Haftka RT, Kim NH. Remarks on multi-fidelity surrogates. Struct Multidiscip Optim 2017;55(3):1029–50.
20. Bates RA, Buck RJ, Riccomagno E, et al. Experimental design and observation for large systems. J Royal Stat Soc Ser B Methodol 1996;58(1):77–94.
21. Picheny V, Ginsbourger D, Roustant O, et al. Adaptive designs of experiments for accurate approximation of a target region. J Mech Des 2010;132(7):071008.
22. Zhang JQ, Sanderson AC. JADE: Adaptive differential evolution with optional external archive. IEEE Trans Evol Comput 2009;13(5):945–58.
23. Ruan XF, Jiang P, Zhou Q, et al. Variable-fidelity probability of improvement method for efficient global optimization of expensive black-box problems. Struct Multidiscip Optim 2020;62(6):3021–52.
24. Timko LP. Energy Efficient Engine high pressure turbine component test performance report. Washington, D.C.: NASA; 1984. Report No.: NASA CR-168289.
25. Bu HY, Yang YF, Song LM, et al. Improving the film cooling performance of a turbine endwall with multi-fidelity modeling considering conjugate heat transfer. J Turbomach 2022;144(1):011011.