Article
Evolutionary Multilabel Classification Algorithm Based on
Cultural Algorithm
Qinghua Wu 1 , Bin Wu 2 , Chengyu Hu 3 and Xuesong Yan 3,4, *
1 Faculty of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, China;
[email protected]
2 School of Economics and Management, Nanjing Tech University, Nanjing 211816, China; [email protected]
3 School of Computer Science, China University of Geosciences, Wuhan 430074, China; [email protected]
4 Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education,
Jilin University, Changchun 130012, China
* Correspondence: [email protected]
Abstract: As one of the common methods of constructing classifiers, naïve Bayes has become one of
the most popular classification methods because of its solid theoretical basis, strong prior-knowledge
learning characteristics, unique knowledge expression form, and high classification accuracy.
This classification method exhibits a symmetry phenomenon in the process of data classification.
Although the naïve Bayes classifier achieves high classification performance on single-label classification
problems, it is worth studying whether this performance still holds for the multilabel classification
problem. In this paper, the naïve Bayes classifier is taken as the basic research object. In view of the
shortcomings of the naïve Bayes classification algorithm, namely its conditional independence
assumption and its label-class selection strategy, a weighted naïve Bayes framework is adopted to obtain
a better multilabel classifier, and a cultural algorithm is introduced to search for and determine the
optimal weights; the result is the proposed weighted naïve Bayes multilabel classification algorithm based
on the cultural algorithm. Experimental results show that the algorithm proposed in this paper is superior
to other algorithms in classification performance.
Keywords: multilabel classification; naïve Bayesian algorithm; cultural algorithms; weighted Bayesian; evolutionary multilabel classification
In 2004, Gao et al. proposed a multiclass (MC) classification approach to text catego-
rization (TC) [21]. McCallum et al. proposed the use of conditional random fields (CRF)
to predict the classification of unlabeled test data [22]. Zhang and Zhou proposed the
multilabel K-nearest neighbors (ML-KNNs) algorithm for the classic multilabel classifica-
tion problem [23]. Zhang et al. converted the NBC model, which is meant for single-label
datasets, into a multilabel naïve Bayes (MLNB) algorithm that is suitable for multilabel
datasets [13]. Xu et al. proposed an ensemble based on the conditional random field
(En-CRF) method for multilabel image/video annotation [24]. Qu et al. proposed the appli-
cation of Bayes’ theorem to the multilabel classification problem [25]. Wu et al. proposed a
weighted naïve Bayes based on differential evolution (DE-WNB) algorithm and a naïve Bayes
probability estimation model based on self-adaptive differential evolution (SAMNB) for classifying
single-label datasets [26,27]. In 2014, Sucar et al. proposed the use of Bayesian
network-based chain classifiers for multilabel classification [28].
For data mining researchers, methods for improving the accuracy of multilabel classi-
fiers have become an important subject in studies on the multilabel classification problem.
The problem with the NBC model is that it is exceptionally challenging for the attributes of
real datasets to be mutually independent. The assumption of mutual independence will
significantly affect classification accuracy in datasets sensitive to feature combinations and
when the dimensionality of class labels is very large. There are two problems that must be
considered when constructing a multilabel classifier: (1) the processing of the relationships
between the different labels, label sets, and attribute sets and the different attributes, and (2)
the selection of the final label set for predicting the classification of real data. The available
strategies for solving the label selection problem in NBC-based multilabel classification
generally overlook the interdependencies between the labels. This is because they rely only
on the maximization of posterior probability to perform label selections.
As naïve Bayes multilabel classification can be considered an optimization problem,
many researchers have attempted to apply intelligent optimization algorithms to it [29–35].
The intelligent optimization algorithms have wide applications [36–57]. Cultural algo-
rithms are a type of intelligent search algorithm; compared to the conventional genetic
algorithm, cultural algorithms add a so-called “belief space” to the population component.
This component stores the experience and knowledge learned by the individuals during the
population’s evolution process. The evolution of the individuals in the population space
is then guided by this knowledge. Cultural algorithms are established to be particularly
well-suited to the optimization of multimodal problems [58,59]. Based on the characteristics
of the samples in this work, a cultural algorithm was used to search for the optimal naïve
Bayes multilabel classifier. It was then used to predict the class labels of test samples.
The testing instance is as follows: Given Y = <A1 = y1, A2 = y2, A3 = y3, A4 = y4>, the objective is to
solve for the values corresponding to the class labels C1 and C2. The sample instances are as follows:

Instance   A1    A2    A3    A4    C1   C2
1          x11   x12   x13   x14   0    1
2          x21   x22   x23   x24   1    0
3          x31   x32   x33   x34   0    0
4          x41   x42   x43   x44   1    1

The problem is solved as follows: First, construct a naïve Bayes network classifier (Figure 1). The nodes
C1 and C2 in Figure 1 represent the class attributes C1 and C2. The four other nodes (A1, A2, A3, A4)
represent the four attribute values, A1, A2, A3, and A4. The class nodes C1 and C2 are the parent nodes
of the attribute nodes A1, A2, A3, and A4.

Figure 1. A naïve Bayes classifier.

The naïve Bayes classifier in Figure 1 is based on the following assumptions:
A. All attribute nodes exhibit an equal level of importance for the selection of a class node.
B. The attribute nodes (A1, A2, A3, and A4) are mutually independent and completely unrelated to each other.
C. Assume that the class nodes C1 and C2 are unrelated and independent.

However, these assumptions tend to be untrue in real problems. In the case of Assumption A, it is
feasible for different attributes to contribute differently to the selection of a class label; different
conditional attributes may not necessarily exhibit an equal level of importance in the classification of
decision attributes. For example, in real data, if the attribute A1 has a value larger than 0.5, this instance
must belong to C1, and the value of the attribute A2 has no bearing on whether this data instance belongs
to C1 or C2. Hence, the value of A2 does not significantly affect the selection of the class label.

2.2. Cultural Algorithms

Cultural algorithms (CAs) are inspired by the processes of cultural evolution that occur in the natural
world. The effectiveness of the CA framework has already been established in many applications.
Figure 2 illustrates the general architecture of a CA. A CA consists of a population space (POP) and a
belief space (BLF). POP and BLF have independent evolution processes. In a CA, these spaces are
connected by a communication protocol, i.e., a functional function, which enables these spaces to
cooperatively drive the evolution and optimization of the individuals in the population. The functional
functions of a CA are the “accept function” and “influence function”.
Figure 2. Fundamental architecture of cultural algorithms.
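To make the division of labor between the two spaces concrete, the following minimal Python sketch (an illustrative skeleton of the generic CA framework, not the implementation used in this paper; all identifiers are ours) shows an accept function that promotes the best individuals of POP into BLF and an influence function that uses BLF to bias the generation of the next population:

```python
import random

def evaluate(ind):
    # Placeholder fitness; in this paper the fitness would be the multilabel
    # classification accuracy obtained with the weights stored in `ind`.
    return -sum((x - 0.5) ** 2 for x in ind)

def accept(population, belief_size):
    # Accept function: the best individuals of POP enter the belief space.
    return sorted(population, key=evaluate, reverse=True)[:belief_size]

def influence(belief, pop_size, step=0.1):
    # Influence function: new individuals are sampled around the exemplars
    # stored in the belief space, clipped to the (0, 1) range used for weights.
    new_pop = []
    for _ in range(pop_size):
        exemplar = random.choice(belief)
        child = [min(1.0, max(0.0, w + random.uniform(-step, step))) for w in exemplar]
        new_pop.append(child)
    return new_pop

def cultural_algorithm(dim, pop_size=20, belief_size=5, generations=50):
    pop = [[random.random() for _ in range(dim)] for _ in range(pop_size)]  # POP space
    belief = accept(pop, belief_size)                                       # BLF space
    for _ in range(generations):
        pop = influence(belief, pop_size)             # BLF guides POP evolution
        belief = accept(pop + belief, belief_size)    # POP experience updates BLF
    return max(belief, key=evaluate)

best = cultural_algorithm(dim=4)
```

In Section 3, the fitness of an individual becomes the classification accuracy of the weighted naïve Bayes multilabel classifier, and the belief space is refined into situational and normative knowledge.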
As CAs possess many evolutionary processes, a different hybrid cultural algorithm can be developed by
including a different evolutionary algorithm in the POP space. Theoretically, any evolutionary algorithm
can be incorporated within the POP space as an evolutionary rule. However, a systematic theoretical
foundation has yet to be established for applying CAs as an intelligent optimization algorithm.
3. Cultural-Algorithm-Based Evolutionary Multilabel Classification Algorithm

3.1. Weighted Bayes Multilabel Classification Algorithm
In Assumption A of the NBMLC algorithm, all the attribute nodes exhibit an equal level of importance
for the selection of a class label node. In single-label problems, many researchers incorporate feature
weighting in the NBMLC algorithm to correct this assumption. This has been demonstrated to improve
classification accuracy [26,60,61]. In this work, we apply the weighting approach to the multilabel
classification problem and, thus, obtain the weighted naïve Bayes multilabel classifier (WNBMLC).
Here, wj represents the weight of the attribute xj, i.e., the importance of xj for the class label set.
Equation (1) shows the mathematical expression of the WNBMLC algorithm.

P(C_i \mid X) = \arg\max_{C_i} P(C_i) \prod_{j=1}^{d} P(x_j \mid C_i)^{w_j}    (1)
Here, it is illustrated that the key to solving the multilabel classification problem lies in the weighting
of sample features. First, we constructed a WNBMLC (see Figure 3), where the nodes C1 and C2
correspond to the class attributes C1 and C2. The nodes A1, A2, A3, and A4 represent the four attributes,
A1, A2, A3, and A4. The class nodes C1 and C2 are the parent nodes of the attribute nodes A1, A2, A3,
and A4. The weights of the conditional attributes A1, A2, A3, and A4 for the selection of a class label
from the class label set C = {C1, C2} are w1, w2, w3, and w4, respectively. In this work, a CA was used
to iteratively optimize the selection of feature weights.
Figure 3. Weighted naïve Bayes classifier.
P(C_i) = \frac{|C_{i,D}| + 1/|C|}{|D| + 1}    (3)
the classifier. To resolve this issue, we used the M-estimate to smooth the conditional
probability formula, as in Equation (4).
P(x_{i,j} \mid C_k) = \frac{N_{j,n}(C_k) + 1/|A_j|}{N_j(C_k) + 1}    (4)
The product calculation is transformed into a log-sum to solve this problem. This solves
the underflow problem effectively, improves the accuracy of the calculation, and facilitates
stringent pairwise comparisons. To ensure accurate calculations in this work, M-estimate-
smoothed equations were used to calculate prior probability and conditional probability,
whereas the log-sum method was used to calculate posterior probability in all the experi-
ments described in this paper.
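As a concrete reading of Equations (1), (3), and (4), the sketch below (our own simplified rendering for discrete attributes; the function and variable names are hypothetical and not the authors' code) computes the M-estimate-smoothed prior and conditional probabilities and evaluates the weighted posterior in log space, so that the product of Equation (1) becomes a sum of logarithms:

```python
import math
from collections import Counter, defaultdict

def train_counts(X, y):
    """Collect the counts needed by Equations (3) and (4) for one label column.

    X: list of instances, each a list of discrete attribute values.
    y: list of class values for a single label (e.g., 0/1).
    """
    class_count = Counter(y)                  # |C_i,D|
    cond_count = defaultdict(Counter)         # N_j,n(C_k): (class, attribute j) -> value counts
    values_per_attr = [set() for _ in X[0]]   # used for |A_j|
    for xs, c in zip(X, y):
        for j, v in enumerate(xs):
            cond_count[(c, j)][v] += 1
            values_per_attr[j].add(v)
    return class_count, cond_count, values_per_attr, len(X)

def weighted_log_posterior(x, c, w, class_count, cond_count, values_per_attr, n_samples):
    """Logarithm of Equation (1) with the M-estimate smoothing of Equations (3) and (4)."""
    n_classes = len(class_count)
    # Smoothed prior, Equation (3): (|C_i,D| + 1/|C|) / (|D| + 1)
    log_p = math.log((class_count[c] + 1.0 / n_classes) / (n_samples + 1.0))
    for j, v in enumerate(x):
        # Smoothed conditional, Equation (4): (N_j,n(C_k) + 1/|A_j|) / (N_j(C_k) + 1)
        num = cond_count[(c, j)][v] + 1.0 / max(1, len(values_per_attr[j]))
        den = class_count[c] + 1.0
        # The weight w_j is an exponent in Eq. (1), i.e., a multiplicative factor in log space.
        log_p += w[j] * math.log(num / den)
    return log_p

def predict_label(x, w, model):
    class_count = model[0]
    return max(class_count, key=lambda c: weighted_log_posterior(x, c, w, *model))

# Example with two binary attributes and one label column:
model = train_counts([[0, 1], [1, 1], [0, 0]], [1, 1, 0])
print(predict_label([0, 1], [1.0, 0.5], model))
```

In the multilabel setting, this evaluation is repeated once for each label in the class label set.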
3.2. Improved Cultural Algorithm
In the proposed algorithm, the individuals (chromosomes) in the POP space are designed using
real-number coding. The variables of the individuals are randomly initialized in the (0.0, 1.0) range of
real numbers, so that each chromosome consists of a set of real numbers. The dimensionality of the
chromosome is equal to the dimensionality of the conditional attributes in the sample data. Moreover,
each real number corresponds to a conditional attribute in the dataset. Suppose that the population size
is N and that the attribute dimensionality of an individual in the population is n. Then, each individual
in the population, Wi, may be expressed as an n-dimensional vector such that Wi = {w1, w2, . . . , wj,
. . . , wn}. In this equation, wj is the weight of the j-th attribute of individual Wi, which is within
(0.0, 1.0). The structure of each chromosome is shown in Figure 4.
Figure 4. Structure of a chromosome in the cultural algorithm.
Here, wi ∈ (0, 1), and n represents the dimensionality of the conditional attributes in the multilabel
classification problem. The structure of the POP space is shown in Figure 5.

Figure 5. Structure of the population space in the cultural algorithm.

3.2.1. Definition and Update Rules of the Belief Space

The BLF space in our algorithm uses the <S, N> structure. Here, S is situational knowledge (SK), which
is mainly used to record the exemplar individuals in the evolutionary process. The structure of SK may
be expressed as SK = {S1, S2, . . . , SS}. In this equation, S represents the capacity of SK, and the
structure of each individual in the SK set has the expression Si = {xi | f(xi)}. In this equation, Si is the
i-th exemplar individual in the SK set, and f(xi) is the fitness of individual xi in the population.
The structure of SK is shown in Figure 6, and the update rules for SK are shown in Equation (7).

s^{t+1} = \begin{cases} x_{best}^{t}, & f(x_{best}^{t}) > f(s^{t}) \\ s^{t}, & \text{otherwise} \end{cases}    (7)
Figure 6. Structure of situational knowledge.
N is normative knowledge (NK). In the BLF space, it is the information that is effectively carried by the
range of values of the variable. When a CA is used to optimize a problem of dimensionality n, the
expression of NK is NK = {N1, N2, . . . , Nn}. In this equation, Ni = {(li, ui), (Li, Ui)}. Here, i ≤ n, and
li and ui are the lower and upper limits of the i-th dimensional variable, which are initialized as zero and
one, respectively. Li and Ui are the individual fitness values that correspond to the li lower limit and ui
upper limit, respectively, of variable xi. Li and Ui are initialized as positively infinite values.
The structure of NK is shown in Figure 7, and the update rules for NK are shown in Equation (8).

l_i^{t+1} = \begin{cases} x_{j,i}, & x_{j,i} \le l_i^{t} \text{ or } f(x_j) < L_i^{t} \\ l_i^{t}, & \text{otherwise} \end{cases} \qquad
L_i^{t+1} = \begin{cases} f(x_j), & x_{j,i} \le l_i^{t} \text{ or } f(x_j) < L_i^{t} \\ L_i^{t}, & \text{otherwise} \end{cases}

u_i^{t+1} = \begin{cases} x_{k,i}, & x_{k,i} \ge u_i^{t} \text{ or } f(x_k) > U_i^{t} \\ u_i^{t}, & \text{otherwise} \end{cases} \qquad
U_i^{t+1} = \begin{cases} f(x_k), & x_{k,i} \ge u_i^{t} \text{ or } f(x_k) > U_i^{t} \\ U_i^{t}, & \text{otherwise} \end{cases}    (8)
The process of our improved CA is shown in Figure 8.

Figure 8. Process of the improved cultural algorithm: the new population P(t+1) is generated from P(t) by the influence function, and the loop repeats until the terminating condition is met.
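The belief-space bookkeeping defined by Equations (7) and (8) can be summarized by the following Python sketch (a schematic under our own naming, assuming a maximization problem; it is not the authors' implementation):

```python
def update_situational(sk_best, sk_best_fitness, x_best, f_best):
    # Equation (7): the exemplar s is replaced only if the generation's best
    # individual x_best has a higher fitness than the stored exemplar.
    if f_best > sk_best_fitness:
        return x_best, f_best
    return sk_best, sk_best_fitness

def init_normative(n_dims):
    # l and u initialized to 0 and 1; L and U initialized to +infinity, as in the text above.
    return [{"l": 0.0, "u": 1.0, "L": float("inf"), "U": float("inf")} for _ in range(n_dims)]

def update_normative(nk, individual, fitness):
    # Equation (8): nk is a list of dicts {l, u, L, U}, one entry per dimension.
    # The recorded interval [l_i, u_i] and its fitness values are updated whenever
    # an individual falls outside the interval or improves on the recorded fitness.
    for i, x_i in enumerate(individual):
        d = nk[i]
        if x_i <= d["l"] or fitness < d["L"]:
            d["l"], d["L"] = x_i, fitness
        if x_i >= d["u"] or fitness > d["U"]:
            d["u"], d["U"] = x_i, fitness
    return nk
```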
3.3. CA-Based Evolutionary Multilabel Classification Algorithm

In the CA-WNB algorithm, the main purpose of the CA is to determine the weight of the attributes for
label selection, as the weight-searching process is effectively a label-learning process. Once the optimal
weights have been determined, the attribute weights may be used to classify the test set’s instances.
The architecture of the CA-WNB algorithm is shown in Figure 9. The procedure of the CA-WNB
algorithm is described below. It provides a detailed explanation of the algorithm’s architecture.
The training of the CA-WNB algorithm is performed according to the following procedure:

Figure 9. Architecture of the cultural algorithm-based weighted naïve Bayes multilabel classification (CA-WNB) algorithm.

Step 5: Update the POP. Based on the features of NK and SK of the BLF, new individuals are generated
in the POP according to the influence rules of the influence function. In the selection function, the
exemplar individuals are selected from the parents and children according to the greedy selection rules,
thus forming the next generation of the POP.

Step 6: Update the BLF. If the new individuals are superior to the individuals of the BLF, the BLF is
updated. Otherwise, Step 5 is repeated until the algorithm attains the maximum number of iterations or
until the results have converged. The algorithm is then terminated.

CA optimization is used to obtain the optimal combination of weights based on the training set.
The weighted naïve Bayes posterior probability formula is then used to predict the class labels of the
unlabeled test set instances. The predictions are then scored: a point is scored if the prediction is equal
to the theoretical value; no point is scored otherwise. This ultimately yields the average classification
accuracy of the test set’s instances.
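A compact sketch of this training loop is given below (our own schematic with hypothetical names; fitness_fn stands for the average classification accuracy obtained on the training set with the weighted naïve Bayes classifier, computed as described below, and the accept/influence steps follow Steps 5 and 6 above):

```python
import random

def train_ca_wnb(fitness_fn, n_attributes, pop_size=30, max_gen=100, belief_size=5):
    """Search for the attribute weights of the WNBMLC with a cultural algorithm.

    fitness_fn(weights) is expected to return the average classification accuracy
    achieved on the training set when the weighted naive Bayes classifier uses `weights`.
    """
    # Initialize POP with random weight vectors in (0, 1) (Figures 4 and 5).
    pop = [[random.random() for _ in range(n_attributes)] for _ in range(pop_size)]
    fit = [fitness_fn(w) for w in pop]
    # Initialize BLF with the best individuals (accept function).
    belief = sorted(zip(pop, fit), key=lambda p: p[1], reverse=True)[:belief_size]
    for _ in range(max_gen):
        new_pop, new_fit = [], []
        for w, f in zip(pop, fit):
            # Step 5: the influence function perturbs individuals toward a BLF exemplar.
            exemplar, _ = random.choice(belief)
            child = [min(1.0, max(0.0, wj + 0.1 * (ej - wj) + random.gauss(0.0, 0.05)))
                     for wj, ej in zip(w, exemplar)]
            cf = fitness_fn(child)
            # Greedy selection keeps the better of parent and child.
            if cf >= f:
                w, f = child, cf
            new_pop.append(w)
            new_fit.append(f)
        pop, fit = new_pop, new_fit
        # Step 6: the accept function updates the BLF with any superior individuals.
        belief = sorted(list(zip(pop, fit)) + belief, key=lambda p: p[1], reverse=True)[:belief_size]
    return belief[0][0]  # best weight combination found

# Smoke test with a toy fitness function standing in for the WNBMLC accuracy.
best_weights = train_ca_wnb(lambda w: -sum((x - 0.5) ** 2 for x in w), n_attributes=4, max_gen=20)
```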
Let F(C_i^k) and C_i^k denote the predicted and real values, respectively, of the class label in the k-th
dimension of the i-th sample instance in the test set; then N is the total number of sample instances in
the test set and m is the dimensionality of the class label set. The equation for calculating the
classification accuracy of each to-be-classified sample instance in the test set is Equation (12).

\mathrm{Accuracy}_i = \frac{1}{m} \sum_{k=1}^{m} T\big(F(C_i^k), C_i^k\big)    (12)

The T(F(C_i^k), C_i^k) function returns a value of one if the values of F(C_i^k) and C_i^k are equal,
and zero otherwise, as per Equation (13).

T\big(F(C_i^k), C_i^k\big) = \begin{cases} 1, & F(C_i^k) = C_i^k \\ 0, & F(C_i^k) \neq C_i^k \end{cases}    (13)

The overall evaluation criterion of the algorithm is the average classification accuracy of all the sample
instances in the test set. It is calculated using Equation (14).

\mathrm{Accuracy} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{m} \sum_{k=1}^{m} T\big(F(C_i^k), C_i^k\big)    (14)
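For reference, Equations (12) through (14) reduce to a few lines of Python (a direct transcription under hypothetical names, where predicted and actual are N lists of m label values each):

```python
def instance_accuracy(pred_labels, true_labels):
    # Equations (12)/(13): fraction of the m label dimensions predicted correctly.
    m = len(true_labels)
    return sum(1 for p, t in zip(pred_labels, true_labels) if p == t) / m

def average_accuracy(predicted, actual):
    # Equation (14): mean of the per-instance accuracies over the N test instances.
    n = len(actual)
    return sum(instance_accuracy(p, t) for p, t in zip(predicted, actual)) / n

# Example: two test instances with three labels each.
print(average_accuracy([[1, 0, 1], [0, 0, 1]], [[1, 0, 0], [0, 0, 1]]))  # 0.833...
```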
The analysis of the results of the three fitting experiments is presented in Table 4. In this
table, Gau-Cau represents the difference between the average classification accuracies of the
Gaussian and Cauchy distribution experiments. Dis-Gau represents the difference between
the average classification accuracies of the data discretization experiment and the Gaussian
distribution experiment. Dis-Cau is the difference between the average classification
accuracies of the data discretization experiment and the Cauchy distribution experiment.
Table 4. Analysis of the results of the naïve Bayes multilabel classification (NBMLC) experiments.
All three fitting methods exhibit a time complexity of magnitude O(N × n × m). Here,
N is the size of the dataset, n the dimensionality of the attributes, and m the dimensionality
of the class labels. Figure 10 shows the total computation times of each distribution method
in their 10 trial runs, which were performed using identical computational hardware.
The horizontal axis indicates the type of fitting method, whereas the vertical axis indicates
the computation time consumed by each method.
Figure 10. Comparison between the computational times of each of the three fitting methods.
Table 4 compares the classification accuracies of the NBMLCs with the three distribution methods. It is
demonstrated that the classification accuracy of an NBMLC is the highest when data discretization is
used to fit the conditional probabilities. Furthermore, it is demonstrated that the data discretization
approach yields higher classification efficacy in highly concentrated datasets. The use of Gaussian and
Cauchy distributions to fit the conditional probabilities of the dataset resulted in significantly poorer
results than that of the discretization approach. Furthermore, the classification accuracies obtained with
Gaussian and Cauchy distribution are similar. Further analysis revealed that the effects of the different
distribution methods on classification accuracy are significantly more pronounced in the “emotions”
dataset than in the CAL500, “scene”, or “yeast” datasets. In the “emotions” dataset, the classification
accuracy of the discretization approach is nearly 13% higher than that of the other approaches. In the
“scene”, “yeast”, and CAL500 datasets, the discretization approach outperformed the other approaches
by 4%, 3%, and 1%, respectively. An analysis of the characteristics of the datasets revealed that the class
label dimensionality of the “emotions” dataset is smaller than that of the other datasets. It is followed by
the “scene” dataset and the “yeast” dataset. The CAL500 dataset had the highest number of class label
dimensions. Therefore, it may be concluded that the classification accuracies of these fitting methods
become more similar as the number of class label dimensions increases. Although the algorithmic time
complexities of these fitting methods are on an identical level of magnitude, the attribute values of the
test data must be divided into intervals in the discretization approach. This requirement resulted in
higher computation times than the Gaussian and Cauchy distribution approaches.
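To illustrate what fitting a continuous attribute means here, the sketch below (our own illustration; the parameter-estimation details used in the paper may differ, and the Cauchy scale estimate in particular is an assumption on our part) models the class-conditional density of one attribute with a Gaussian or a Cauchy distribution estimated from the attribute values observed for a single class:

```python
import math
import statistics

def gaussian_pdf(x, values):
    # Gaussian fit: use the sample mean and standard deviation of the attribute
    # values observed for one class.
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values) or 1e-6
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def cauchy_pdf(x, values):
    # Cauchy fit: a common robust choice is the median for the location x0 and
    # half the interquartile range for the scale gamma (an assumption on our part).
    x0 = statistics.median(values)
    q = statistics.quantiles(values, n=4)
    gamma = max((q[2] - q[0]) / 2.0, 1e-6)
    return gamma / (math.pi * (gamma ** 2 + (x - x0) ** 2))

values = [0.21, 0.35, 0.28, 0.40, 0.33]  # attribute values of one class in the training set
print(gaussian_pdf(0.3, values), cauchy_pdf(0.3, values))
```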
In the CA-WNB algorithm, the CA is used to optimize the attribute weights of the
WNBMLC and to validate the weights of the three methods for conditional probability
fitting. The results obtained by the CA-WNB algorithm were compared to those by the
ordinary NBMLC algorithm, based on the abovementioned design rules and experimental
evaluation criteria for classification functions. As the Gaussian and Cauchy approaches
attempt to model the continuous attributes of each dataset by fitting their probability
curves, and the NBMLC results associated with these approaches are similar, we used
Prediction Methods 1 and 2 to compare the experimental results corresponding to the
CA-WNB and NBMLC algorithms with Gaussian and Cauchy fitting. These comparisons
were also conducted between the CA-WNB and NBMLC algorithms with the discretization
approach, with varying numbers of discretization intervals (num = 10 and num = 20).
Data Set    Algorithm      Gaussian (MAX   MIN   AVE)      Cauchy (MAX   MIN   AVE)
NBMLC 0.8732 0.8574 0.8622 0.8721 0.8554 0.8635
CAL_500 CA-WNB-P1 0.8893 0.8750 0.8813 0.8871 0.8737 0.8800
CA-WNB-P2 0.8897 0.8751 0.8825 0.8890 0.8744 0.8811
NBMLC 0.6976 0.6798 0.6884 0.6976 0.6787 0.6892
emotions CA-WNB-P1 0.8059 0.7853 0.7938 0.8215 0.7850 0.8040
CA-WNB-P2 0.8115 0.7900 0.7993 0.8215 0.7869 0.8044
NBMLC 0.8239 0.8195 0.8212 0.8195 0.8098 0.8151
scene CA-WNB-P1 0.8693 0.8564 0.8630 0.8841 0.8652 0.8732
CA-WNB-P2 0.8714 0.8592 0.8654 0.8848 0.8615 0.8744
NBMLC 0.7749 0.7636 0.7688 0.7739 0.7688 0.7673
yeast CA-WNB-P1 0.8045 0.7787 0.7901 0.8129 0.7873 0.7948
CA-WNB-P2 0.8051 0.7851 0.7933 0.8126 0.7831 0.7952
Tables 7–9 [31] compare the experimental results of the top-10, top-20, and top-30
topological rankings of Prediction Methods 1 and 2 between the individuals produced by
the CA-WNB and NBMLC algorithms with Gaussian and Cauchy distribution.
Data Set    Algorithm      Gaussian (MAX   MIN   AVE)      Cauchy (MAX   MIN   AVE)
NBMLC 0.8732 0.8574 0.8622 0.8721 0.8554 0.8635
CAL_500 CA-WNB-P1 0.8885 0.8697 0.8821 0.8884 0.8727 0.8801
CA-WNB-P2 0.8897 0.8716 0.8829 0.8890 0.8744 0.8811
NBMLC 0.6976 0.6798 0.6884 0.6976 0.6787 0.6892
emotions CA-WNB-P1 0.8012 0.7799 0.7939 0.8224 0.7822 0.8039
CA-WNB-P2 0.8143 0.7900 0.8011 0.8271 0.7869 0.8070
NBMLC 0.8239 0.8195 0.8212 0.8195 0.8098 0.8151
scene CA-WNB-P1 0.8698 0.8592 0.8647 0.8836 0.8638 0.8725
CA-WNB-P2 0.8714 0.8592 0.8656 0.8848 0.8626 0.8746
NBMLC 0.7749 0.7636 0.7688 0.7739 0.7688 0.7673
yeast CA-WNB-P1 0.8040 0.7898 0.7953 0.8115 0.7875 0.7943
CA-WNB-P2 0.8051 0.7851 0.7938 0.8126 0.7831 0.7941
Data Set    Algorithm      Gaussian (MAX   MIN   AVE)      Cauchy (MAX   MIN   AVE)
NBMLC 0.8732 0.8574 0.8622 0.8721 0.8554 0.8635
CAL_500 CA-WNB-P1 0.8884 0.8690 0.8823 0.8884 0.8740 0.8801
CA-WNB-P2 0.8903 0.8717 0.8832 0.8890 0.8744 0.8812
NBMLC 0.6976 0.6798 0.6884 0.6976 0.6787 0.6892
emotions CA-WNB-P1 0.8021 0.7843 0.7939 0.8231 0.7812 0.8071
CA-WNB-P2 0.8143 0.7900 0.8013 0.8271 0.7869 0.8088
NBMLC 0.8239 0.8195 0.8212 0.8195 0.8098 0.8151
scene CA-WNB-P1 0.8714 0.8573 0.8653 0.8839 0.8620 0.8724
CA-WNB-P2 0.8714 0.8592 0.8658 0.8848 0.8626 0.8750
NBMLC 0.7749 0.7636 0.7688 0.7739 0.7688 0.7673
yeast CA-WNB-P1 0.8060 0.7868 0.7955 0.8109 0.7824 0.7927
CA-WNB-P2 0.8091 0.7913 0.7972 0.8139 0.7831 0.7943
Tables 10 and 11 present the average classification accuracy of the weighting combina-
tions corresponding to the final-generation individuals whose fitness values ranked in the
top-10, top-20, and top-30, as yielded by Prediction Methods 1 and 2, in the classification
of the four experimental datasets; Gaussian and Cauchy distribution were used to model
conditional probability. These tables also show the percentage by which the CA-WNB
algorithm improves upon the classification accuracies of the NBMLC algorithm, according
to Prediction Methods 1 and 2. The bolded entries in these tables indicate the classification
accuracy obtained by the best individual. The bottom rows of these tables present the
average classification accuracy of the three algorithms. CA-WNB-P1 and CA-WNB-P2 are
the average classification accuracies obtained using the CA-WNB algorithm, according to
Prediction Methods 1 and 2.
Data Set    Algorithm      Gaussian (MAX   MIN   AVE)      Cauchy (MAX   MIN   AVE)
NBMLC 0.8732 0.8574 0.8622 0.8721 0.8554 0.8635
CAL_500 CA-WNB-P1 0.8888 0.8695 0.8822 0.8890 0.8722 0.8793
CA-WNB-P2 0.8893 0.8716 0.8829 0.8890 0.8722 0.8800
NBMLC 0.6976 0.6798 0.6884 0.6976 0.6787 0.6892
emotions CA-WNB-P1 0.8031 0.7797 0.7937 0.8187 0.7812 0.7984
CA-WNB-P2 0.8059 0.7853 0.7938 0.8215 0.7812 0.8040
NBMLC 0.8239 0.8195 0.8212 0.8195 0.8098 0.8151
scene CA-WNB-P1 0.8702 0.8580 0.8654 0.8788 0.8594 0.8701
CA-WNB-P2 0.8693 0.8564 0.8660 0.8848 0.8626 0.8735
NBMLC 0.7749 0.7636 0.7688 0.7739 0.7688 0.7673
yeast CA-WNB-P1 0.8104 0.7870 0.7951 0.8087 0.7814 0.7904
CA-WNB-P2 0.8045 0.7850 0.7921 0.8126 0.7792 0.7932
Table 10. Experiment results of NBMLC and CA-WNB with Gaussian distribution.
Table 11. Experiment results of NBMLC and CA-WNB with Cauchy distribution.
It is apparent that the average accuracy obtained by the CA-WNB algorithm is superior to that of the
NBMLC algorithm. However, this is obtained at the expense of computation time. The CA-WNB
algorithm iteratively optimizes attribute weights prior to the prediction of class labels so as to weaken
the effects of the naïve conditional independence assumption. Even if one overlooks the effects of
having multiple prediction methods and different training and test dataset sizes on the computation time,
the time complexity of the CA-WNB algorithm is still NP × MAXGEN times higher than that of the
NBMLC algorithm (NP is the population size, and MAXGEN is the maximum number of evolutions).
The running times of the CA-WNB and NBMLC algorithms (under the same conditions and environment
as in the abovementioned experiments) are shown in Figure 11. NBMLC-Gau and NBMLC-Cau are the
running times of the NBMLC algorithm with Gaussian and Cauchy distribution, respectively.
Meanwhile, CA-WNB-Gau and CA-WNB-Cau are the running times of the CA-WNB algorithm with
Gaussian and Cauchy distribution, respectively.

Figure 11. Computation times of NBMLC and CA-WNB algorithms.
5. Conclusions
In this paper, we study the multilabel classification problem. This paper presents the
algorithm framework of naïve Bayes multilabel classification and analyzes and compares
the effects of three common fitting methods of continuous attributes on the classification
performance of the naïve Bayes multilabel classification algorithm from the perspective
of average classification accuracy and algorithm time cost. On this basis, the framework
of weighted naïve Bayes multilabel classification is given, the determination of weights is
regarded as an optimization problem, the cultural algorithm is introduced to search for
and determine the optimal weight, and the weighted naïve Bayes multilabel classification
algorithm based on the cultural algorithm is proposed. In this algorithm, the classification
accuracy obtained by substituting the current individual in the weighted naïve Bayes
multilabel classification is taken as the objective function, and the attribute dimensionality of the
dataset is taken as the dimensionality of the individuals in the population.
Author Contributions: Conceptualization, Q.W. and B.W.; Data curation, Q.W. and C.H.; Investi-
gation, C.H.; Methodology, Q.W.; Software, C.H.; Visualization, X.Y.; Writing—original draft, Q.W.;
Writing—review & editing, X.Y. All authors have read and agreed to the published version of
the manuscript.
Funding: This research was funded by the Natural Science Foundation of China (U1911205 and 62073300),
the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan)
(CUGGC03), and the Fundamental Research Funds for the Central Universities, JLU (93K172020K18).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Acknowledgments: This paper is supported by the Natural Science Foundation of China (U1911205
and 62073300), China University of Geosciences (Wuhan) (CUGGC03), and the Fundamental Research
Funds for the Central Universities (JLU; 93K172020K18).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Tsoumakas, G.; Katakis, I.; Vlahavas, I. Mining multi-label data. In Data Mining and Knowledge Discovery Handbook; Springer:
Boston, MA, USA, 2010; pp. 667–685.
2. Streich, A.P.; Buhmann, J.M. Classification of multi-labeled data: A generative approach. In Machine Learning and Knowledge
Discovery in Databases; Springer: Berlin/Heidelberg, Germany, 2008; pp. 390–405.
3. Kazawa, H.; Izumitani, T.; Taira, H.; Maeda, E. Maximal margin labeling for multi-topic text categorization. In Advances in Neural
Information Processing Systems; MIT Press: Vancouver, BC, Canada, 2004; pp. 649–656.
4. Snoek, C.G.; Worring, M.; Van Gemert, J.C.; Geusebroek, J.M.; Smeulders, A.W. The challenge problem for automated detection
of 101 semantic concepts in multimedia. In Proceedings of the 14th annual ACM International Conference on Multimedia,
Santa Barbara, CA, USA, 23–27 October 2006; ACM: New York, NY, USA, 2006; pp. 421–430.
5. Vens, C.; Struyf, J.; Schietgat, L.; Džeroski, S.; Blockeel, H. Decision trees for hierarchical multi-label classification. Mach. Learn.
2008, 73, 185–214. [CrossRef]
6. Boutell, M.R.; Luo, J.; Shen, X.; Brown, C.M. Learning multi-label scene classification. Pattern Recognit. 2004, 37, 1757–1771.
[CrossRef]
7. Xia, Y.; Chen, K.; Yang, Y. Multi-Label Classification with Weighted Classifier Selection and Stacked Ensemble. Inf. Sci. 2020.
[CrossRef]
8. Qian, W.; Xiong, C.; Wang, Y. A ranking-based feature selection for multi-label classification with fuzzy relative discernibility.
Appl. Soft Comput. 2021, 102, 106995. [CrossRef]
9. Yao, Y.; Li, Y.; Ye, Y.; Li, X. MLCE: A Multi-Label Crotch Ensemble Method for Multi-Label Classification. Int. J. Pattern Recognit.
Artif. Intell. 2020. [CrossRef]
10. Yang, B.; Tong, K.; Zhao, X.; Pang, S.; Chen, J. Multilabel Classification Using Low-Rank Decomposition. Discret. Dyn. Nat. Soc.
2020, 2020, 1–8. [CrossRef]
11. Kumar, A.; Abhishek, K.; Kumar Singh, A.; Nerurkar, P.; Chandane, M.; Bhirud, S.; Busnel, Y. Multilabel classification of remote
sensed satellite imagery. Trans. Emerg. Telecommun. Technol. 2020, 4, 118–133. [CrossRef]
12. Huang, S.J.; Li, G.X.; Huang, W.Y.; Li, S.Y. Incremental Multi-Label Learning with Active Queries. J. Comput. Sci. Technol. 2020,
35, 234–246. [CrossRef]
13. Zhang, M.L.; Peña, J.M.; Robles, V. Feature selection for multi-label naive Bayes classification. Inf. Sci. 2009, 179, 3218–3229.
[CrossRef]
14. De Carvalho, A.C.; Freitas, A.A. A tutorial on multi-label classification techniques. Found. Comput. Intell. 2009, 5, 177–195.
15. Spyromitros, E.; Tsoumakas, G.; Vlahavas, I. An empirical study of lazy multilabel classification algorithms. In Artificial Intelligence:
Theories, Models and Applications; Springer: Berlin/Heidelberg, Germany, 2008; pp. 401–406.
16. Rousu, J.; Saunders, C.; Szedmak, S.; Shawe-Taylor, J. Kernel-based learning of hierarchical multilabel classification models.
J. Mach. Learn. Res. 2006, 7, 1601–1626.
17. Yang, Y.; Chute, C.G. An example-based mapping method for text categorization and retrieval. ACM Trans. Inf. Syst. (TOIS) 1994,
12, 252–277. [CrossRef]
18. Grodzicki, R.; Mańdziuk, J.; Wang, L. Improved multilabel classification with neural networks. Parallel Probl. Solving Nat. Ppsn X
2008, 5199, 409–416.
19. Gonçalves, E.C.; Freitas, A.A.; Plastino, A. A Survey of Genetic Algorithms for Multi-Label Classification. In Proceedings of the
2018 IEEE Congress on Evolutionary Computation (CEC), Rio de Janeiro, Brazil, 29 January 2018; pp. 1–8.
20. McCallum, A.; Nigam, K. A comparison of event models for naive bayes text classification. AAAI-98 Workshop Learn. Text Categ.
1998, 752, 41–48.
21. Gao, S.; Wu, W.; Lee, C.H.; Chua, T.S. A MFoM learning approach to robust multiclass multi-label text categorization. In Proceedings
of the Twenty-First International Conference on Machine Learning; ACM: New York, NY, USA, 2004; pp. 329–336.
22. Ghamrawi, N.; McCallum, A. Collective multi-label classification. In Proceedings of the 14th ACM International Conference on
Information and Knowledge Management; ACM: New York, NY, USA, 2005; pp. 195–200.
23. Zhang, M.L.; Zhou, Z.H. ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognit. 2007, 40, 2038–2048.
[CrossRef]
24. Xu, X.S.; Jiang, Y.; Peng, L.; Xue, X.; Zhou, Z.H. Ensemble approach based on conditional random field for multi-label image and
video annotation. In Proceedings of the 19th ACM International Conference on Multimedia; ACM: New York, NY, USA, 2011;
pp. 1377–1380.
25. Qu, G.; Zhang, H.; Hartrick, C.T. Multi-label classification with Bayes’ theorem. In Proceedings of the 2011 4th International
Conference on Biomedical Engineering and Informatics (BMEI), Shanghai, China, 15–17 October 2011; pp. 2281–2285.
26. Wu, J.; Cai, Z. Attribute weighting via differential evolution algorithm for attribute weighted naive bayes (wnb). J. Comput.
Inf. Syst. 2011, 7, 1672–1679.
27. Wu, J.; Cai, Z. A naive Bayes probability estimation model based on self-adaptive differential evolution. J. Intell. Inf. Syst. 2014,
42, 671–694. [CrossRef]
28. Sucar, L.E.; Bielza, C.; Morales, E.F.; Hernandez-Leal, P.; Zaragoza, J.H.; Larrañaga, P. Multi-label classification with Bayesian
network-based chain classifiers. Pattern Recognit. Lett. 2014, 41, 14–22. [CrossRef]
29. Reyes, O.; Morell, C.; Ventura, S. Evolutionary feature weighting to improve the performance of multi-label lazy algorithms.
Integr. Comput. Aided Eng. 2014, 21, 339–354. [CrossRef]
30. Lee, J.; Kim, D.W. Memetic feature selection algorithm for multi-label classification. Inf. Sci. 2015, 293, 80–96. [CrossRef]
31. Yan, X.; Wu, Q.; Sheng, V.S. A Double Weighted Naive Bayes with Niching Cultural Algorithm for Multi-Label Classification.
Int. J. Pattern Recognit. Artif. Intell. 2016, 30, 1–23. [CrossRef]
32. Wu, Q.; Liu, H.; Yan, X. Multi-label classification algorithm research based on swarm intelligence. Clust. Comput. 2016,
19, 2075–2085. [CrossRef]
33. Zhang, Y.; Gong, D.W.; Sun, X.Y.; Guo, Y.N. A PSO-based multi-objective multi-label feature selection method in classification.
Sci. Rep. 2017, 7, 376. [CrossRef] [PubMed]
34. Wu, Q.; Wang, H.; Yan, X.; Liu, X. MapReduce-based adaptive random forest algorithm for multi-label classification.
Neural Comput. Appl. 2019, 31, 8239–8252. [CrossRef]
35. Moyano, J.M.; Gibaja, E.L.; Cios, K.J.; Ventura, S. An evolutionary approach to build ensembles of multi-label classifiers. Inf. Fusion
2019, 50, 168–180. [CrossRef]
36. Guo, Y.N.; Zhang, P.; Cheng, J.; Wang, C.; Gong, D. Interval Multi-objective Quantum-inspired Cultural Algorithms.
Neural Comput. Appl. 2018, 30, 709–722. [CrossRef]
37. Yan, X.; Zhu, Z.; Hu, C.; Gong, W.; Wu, Q. Spark-based intelligent parameter inversion method for prestack seismic data.
Neural Comput. Appl. 2019, 31, 4577–4593. [CrossRef]
38. Wu, B.; Qian, C.; Ni, W.; Fan, S. The improvement of glowworm swarm optimization for continuous optimization problems.
Expert Syst. Appl. 2012, 39, 6335–6342. [CrossRef]
39. Lu, C.; Gao, L.; Li, X.; Zheng, J.; Gong, W. A multi-objective approach to welding shop scheduling for makespan, noise pollution
and energy consumption. J. Clean. Prod. 2018, 196, 773–787. [CrossRef]
40. Wu, Q.; Zhu, Z.; Yan, X.; Gong, W. An improved particle swarm optimization algorithm for AVO elastic parameter inversion
problem. Concurr. Comput. Pract. Exp. 2019, 31, 1–16. [CrossRef]
41. Yu, P.; Yan, X. Stock price prediction based on deep neural network. Neural Comput. Appl. 2020, 32, 1609–1628. [CrossRef]
42. Gong, W.; Cai, Z. Parameter extraction of solar cell models using repaired adaptive differential evolution. Solar Energy 2013,
94, 209–220. [CrossRef]
43. Wang, F.; Li, X.; Zhou, A.; Tang, K. An estimation of distribution algorithm for mixed-variable Newsvendor problems. IEEE Trans.
Evol. Comput. 2020, 24, 479–493.
44. Wang, G.G. Improving Metaheuristic Algorithms with Information Feedback Models. IEEE Trans. Cybern. 2017, 99, 1–14.
[CrossRef] [PubMed]
45. Yan, X.; Li, P.; Tang, K.; Gao, L.; Wang, L. Clonal Selection Based Intelligent Parameter Inversion Algorithm for Prestack Seismic
Data. Inf. Sci. 2020, 517, 86–99. [CrossRef]
46. Yan, X.; Yang, K.; Hu, C.; Gong, W. Pollution source positioning in a water supply network based on expensive optimization.
Desalination Water Treat. 2018, 110, 308–318. [CrossRef]
47. Wang, R.; Zhou, Z.; Ishibuchi, H.; Liao, T.; Zhang, T. Localized weighted sum method for many-objective optimization. IEEE Trans.
Evol. Comput. 2018, 22, 3–18. [CrossRef]
48. Lu, C.; Gao, L.; Yi, J. Grey wolf optimizer with cellular topological structure. Expert Syst. Appl. 2018, 107, 89–114. [CrossRef]
49. Wang, F.; Zhang, H.; Zhou, A. A particle swarm optimization algorithm for mixed-variable optimization problems.
Swarm Evol. Comput. 2021, 60, 100808. [CrossRef]
50. Yan, X.; Zhao, J. Multimodal optimization problem in contamination source determination of water supply networks.
Swarm Evol. Comput. 2019, 47, 66–71. [CrossRef]
51. Yan, X.; Hu, C.; Sheng, V.S. Data-driven pollution source location algorithm in water quality monitoring sensor networks. Int. J.
Bio-Inspir Compu. 2020, 15, 171–180. [CrossRef]
52. Hu, C.; Dai, L.; Yan, X.; Gong, W.; Liu, X.; Wang, L. Modified NSGA-III for Sensor Placement in Water Distribution System.
Inf. Sci. 2020, 509, 488–500. [CrossRef]
53. Wang, R.; Li, G.; Ming, M.; Wu, G.; Wang, L. An efficient multi-objective model and algorithm for sizing a stand-alone hybrid
renewable energy system. Energy 2017, 141, 2288–2299. [CrossRef]
54. Li, S.; Gong, W.; Yan, X.; Hu, C.; Bai, D.; Wang, L. Parameter estimation of photovoltaic models with memetic adaptive differential
evolution. Solar Energy 2019, 190, 465–474. [CrossRef]
55. Yan, X.; Zhang, M.; Wu, Q. Big-Data-Driven Pre-Stack Seismic Intelligent Inversion. Inf. Sci. 2021, 549, 34–52. [CrossRef]
56. Wang, F.; Li, Y.; Liao, F.; Yan, H. An ensemble learning based prediction strategy for dynamic multi-objective optimization.
Appl. Soft Comput. 2020, 96, 106592. [CrossRef]
57. Yan, X.; Li, T.; Hu, C. Real-time localization of pollution source for urban water supply network in emergencies. Clust. Comput.
2019, 22, 5941–5954. [CrossRef]
58. Reynolds, R.G. Cultural algorithms: Theory and applications. In New Ideas in Optimization; McGraw-Hill Ltd.: Berkshire, UK,
1999; pp. 367–378.
59. Reynolds, R.G.; Zhu, S. Knowledge-based function optimization using fuzzy cultural algorithms with evolutionary programming.
IEEE Trans. Syst. Man Cybern. Part B 2001, 31, 1–18. [CrossRef]
60. Zhang, H.; Sheng, S. Learning weighted naïve Bayes with accurate ranking. In Proceedings of the 4th IEEE International
Conference on Data Mining, Brighton, UK, 1–4 November 2004; pp. 567–570.
61. Xie, T.; Liu, R.; Wei, Z. Improvement of the Fast Clustering Algorithm Improved by K-Means in the Big Data. Appl. Math.
Nonlinear Sci. 2020, 5, 1–10. [CrossRef]
62. Yan, X.; Li, W.; Wu, Q.; Sheng, V.S. A Double Weighted Naive Bayes for Multi-label Classification. In International Symposium on
Computational Intelligence and Intelligent Systems; Springer: Singapore, 2015; pp. 382–389.