The document describes a proposed algorithm that uses particle swarm optimization (PSO) with a surrogate model to solve high-dimensional optimization problems. It initializes particles with randomly selected features from a feature space, then updates velocities and positions using standard PSO equations to optimize feature selection. The algorithm is tested on feature selection for classification with dimensionality ranging from 600 to 1200 features. Results show the proposed algorithm achieves better classification accuracy than other feature selection methods. An analysis suggests the algorithm's increased computation time for surrogate training is negligible compared to time for real-world problem evaluations.
Algorithm 1: Particle Swarm Optimization
1: Initialize parameters of PSO
2: Randomly initialize particles
3: WHILE stopping criterion not met DO
4:   calculate each particle's fitness value
5:   FOR i = 1 to population size DO
6:     update the Gbest of Pi
7:     update the Pbest of Pi
8:   END
9:   FOR i = 1 to population size DO
10:    FOR j = 1 to dimension of particle DO
11:      update the velocity of Pi according to equation (1)
12:      update the position of Pi according to equation (2)
13:    END
14:  END
15: END
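Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: equations (1) and (2) are not reproduced in this excerpt, so the standard inertia-weight velocity and position updates are assumed, and the sphere function, bounds, and parameter values are illustrative choices.

```python
import random

def pso(fitness, dim, pop_size=20, iters=100, w=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0)):
    """Minimize `fitness` with standard PSO, following Algorithm 1."""
    lo, hi = bounds
    # Randomly initialize particle positions and velocities
    X = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    V = [[0.0] * dim for _ in range(pop_size)]
    pbest = [x[:] for x in X]               # personal best positions
    pbest_f = [fitness(x) for x in X]       # personal best fitness values
    g = min(range(pop_size), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]  # global best
    for _ in range(iters):
        for i in range(pop_size):
            f = fitness(X[i])
            if f < pbest_f[i]:              # update the Pbest of Pi
                pbest_f[i], pbest[i] = f, X[i][:]
            if f < gbest_f:                 # update the Gbest
                gbest_f, gbest = f, X[i][:]
        for i in range(pop_size):
            for j in range(dim):
                r1, r2 = random.random(), random.random()
                # assumed equation (1): inertia + cognitive + social terms
                V[i][j] = (w * V[i][j]
                           + c1 * r1 * (pbest[i][j] - X[i][j])
                           + c2 * r2 * (gbest[j] - X[i][j]))
                # assumed equation (2): move the particle, clamped to bounds
                X[i][j] = min(hi, max(lo, X[i][j] + V[i][j]))
    return gbest, gbest_f

sphere = lambda x: sum(v * v for v in x)   # toy fitness function
best, best_f = pso(sphere, dim=10)
```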
Algorithm 2: Proposed Algorithm
1: Load feature space set F of unique features f
2: Set the parameter desired number of features d
3: Initialize the parameters of PSO
4: Randomly initialize particles with dimension d over the set F
5: WHILE stopping criterion not met DO
6:   calculate each particle's fitness value as the accuracy of the underlying classifier
7:   FOR i = 1 to population size DO
8:     update the Gbest of Pi
9:     update the Pbest of Pi
10:  END
11:  FOR i = 1 to population size DO
12:    FOR j = 1 to d DO
13:      update the velocity of Pi according to equation (1)
14:      update the position of Pi according to equation (2)
15:    END
16:  END
17: END
18: Return the best feature subset of d features found by the PSO
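A sketch of Algorithm 2, under the assumption that each particle holds d continuous values over F = {0, ..., |F|-1} and that rounding a position yields the selected feature subset (the excerpt does not specify the encoding). The `toy_accuracy` function is a hypothetical stand-in for the accuracy of the underlying classifier, which in practice would come from cross-validating a classifier on the selected features.

```python
import random

def feature_selection_pso(accuracy, num_features, d, pop_size=15, iters=50,
                          w=0.7, c1=1.5, c2=1.5):
    """Maximize classifier accuracy over subsets of d features (Algorithm 2 sketch)."""
    def decode(x):  # continuous position -> feature subset (duplicates merged)
        return tuple(sorted({int(round(v)) % num_features for v in x}))
    X = [[random.uniform(0, num_features - 1) for _ in range(d)]
         for _ in range(pop_size)]
    V = [[0.0] * d for _ in range(pop_size)]
    pbest = [x[:] for x in X]
    pbest_f = [accuracy(decode(x)) for x in X]   # fitness = classifier accuracy
    g = max(range(pop_size), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(pop_size):
            f = accuracy(decode(X[i]))
            if f > pbest_f[i]:                   # update the Pbest of Pi
                pbest_f[i], pbest[i] = f, X[i][:]
            if f > gbest_f:                      # update the Gbest
                gbest_f, gbest = f, X[i][:]
        for i in range(pop_size):
            for j in range(d):
                r1, r2 = random.random(), random.random()
                # assumed standard PSO updates (equations (1) and (2))
                V[i][j] = (w * V[i][j] + c1 * r1 * (pbest[i][j] - X[i][j])
                           + c2 * r2 * (gbest[j] - X[i][j]))
                X[i][j] = min(num_features - 1, max(0, X[i][j] + V[i][j]))
    return decode(gbest), gbest_f

# Hypothetical accuracy: rewards subsets containing "informative" features 0-4
toy_accuracy = lambda subset: len(set(subset) & set(range(5))) / 5
subset, acc = feature_selection_pso(toy_accuracy, num_features=600, d=10)
```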
The main motivation of this work is to push the boundary of surrogate-assisted metaheuristics for solving high-dimensional, computationally expensive problems. In the following, we examine the performance of SA-COSO on 100-dimensional test problems. As we can see, SA-COSO exhibits search dynamics similar to those observed on the 50-dimensional test problems. The RBF network is able to help the SA-COSO algorithm locate the region containing the optimum and keep improving the solution, indicating that the cooperative search of SL-PSO and PSO is very effective in finding an optimal solution. The experimental results on the 100-dimensional problems confirm the competitive performance of SA-COSO on high-dimensional problems. To further evaluate our method on higher-dimensional problems, we have also conducted comparisons on 200-dimensional problems; the results are provided in the Supplementary Materials. These results confirm the good performance of the proposed method for solving high-dimensional problems.
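The RBF surrogate used to guide the search can be illustrated as below. This is a generic interpolating RBF network with Gaussian kernels centered on the training samples; the actual kernel, width, and training scheme used by SA-COSO are not given in this excerpt, so these choices are assumptions.

```python
import numpy as np

def rbf_fit(X, y, sigma=0.5):
    """Fit a Gaussian-kernel RBF network to samples (X, y).
    A small ridge term keeps the Gram matrix well conditioned."""
    X = np.asarray(X, float)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. distances
    Phi = np.exp(-d2 / (2 * sigma ** 2))                  # Gaussian kernel matrix
    w = np.linalg.solve(Phi + 1e-8 * np.eye(len(X)), np.asarray(y, float))
    return X, w

def rbf_predict(model, x, sigma=0.5):
    """Surrogate prediction: weighted sum of kernels at the centers."""
    C, w = model
    d2 = ((C - np.asarray(x, float)) ** 2).sum(-1)
    return float(np.exp(-d2 / (2 * sigma ** 2)) @ w)

# The surrogate approximates an "expensive" function from a few evaluations,
# so the swarm can query rbf_predict instead of the real fitness function.
expensive = lambda x: sum(v * v for v in x)
pts = [[i * 0.5, j * 0.5] for i in range(-3, 4) for j in range(-3, 4)]
model = rbf_fit(pts, [expensive(p) for p in pts])
approx = rbf_predict(model, [0.25, 0.25])   # true value is 0.125
```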
D. Empirical Analysis of the Computational Complexity
The computational complexity of the proposed SA-COSO comprises three main parts: the computation time for fitness evaluations, for training the RBF network, and for calculating the distances when updating the archive DB. In this section, we empirically compare the computation time needed by each compared algorithm for solving the 50-D and 100-D optimization problems. All algorithms are implemented on a computer with a 2.50 GHz processor and 8 GB of RAM.
Table V presents the computation time of the algorithms under comparison, averaged over 20 independent runs, when a maximum of 1000 fitness evaluations using the real fitness function is allowed. From Table V, we find that PSO, whose computation time is mainly dedicated to fitness evaluations, requires the least computation time. For convenience, we use the time PSO takes to search for an optimum on a fixed budget of 1000 fitness evaluations as the baseline for comparison. From Table V, we can see that the RBF-assisted SL-PSO requires the longest time, meaning that training the surrogate model takes more time than calculating the distances for updating the archive DB. We can also find that the average time required by the proposed SA-COSO for solving the 50-D and 100-D test problems is approximately 0.6 and 1.4 seconds, respectively, per fitness evaluation using the expensive fitness function. Compared to the most time-consuming fitness evaluations in real-world applications, where each evaluation may take tens of minutes to hours, this increase in computation time for surrogate training and distance calculation can still be considered negligible.
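The negligibility claim can be checked with simple arithmetic using the per-evaluation overheads reported above (0.6 s for 50-D, 1.4 s for 100-D). The ten-minute real evaluation time below is a hypothetical example of a modest real-world simulation, not a figure from the paper.

```python
def overhead_fraction(overhead_s, real_eval_s):
    """Fraction of total per-evaluation time spent on surrogate
    training and archive-distance bookkeeping."""
    return overhead_s / (overhead_s + real_eval_s)

ten_minutes = 10 * 60  # hypothetical cost of one expensive fitness evaluation
frac_50d = overhead_fraction(0.6, ten_minutes)    # about 0.1% overhead
frac_100d = overhead_fraction(1.4, ten_minutes)   # about 0.2% overhead
```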