Sriwijaya Journal of Informatic and Applications

Vol. xx, No. xx, Month Years, pp. xx-xx


ISSN xxxx-xxxx, DOI:xxxxx 1

Optimization of The Support Vector Machine Method Using Particle Swarm Optimization Algorithm for Autism Prediction
Aprina Damayanti a,1, Alvi Syahrini Utami b,2,*, Mastura Diana Marieska b,3
a Student, Master of Informatics Engineering, Universitas Sriwijaya
b Lecturer, Department of Informatics Engineering, Faculty of Computer Science, Sriwijaya University, Jl. Srijaya Negara Bukit Besar, Palembang 30128, Indonesia
1 [email protected]; 2 [email protected] *; 3 [email protected]
* corresponding author

ARTICLE INFO

Article history
Received
Revised
Accepted

Keywords
Foreign Tourist
Fuzzy Time Series
Genetic Algorithm
MAPE

ABSTRACT

The role of the tourism sector in the Indonesian economy continues to increase. The number of foreign tourist visits is one of the main indicators in the tourism sector, but the number of foreign tourist arrivals in Indonesia fluctuates from period to period. Predicting the number of foreign tourists is important because the prediction can be used as information to determine effective programs and policies. Fuzzy Time Series is a prediction method that stores past data and processes it to produce new values for future periods. This method was chosen because it does not require the assumptions that other prediction methods do. The Fuzzy Time Series method has a weakness in the prediction process: the interval lengths of the subsets used can be too wide, so the prediction results are less than optimal. A Genetic Algorithm is used to determine the best values to use as interval limits of the Fuzzy Time Series subsets. A test using 252 monthly records of the number of foreign tourist arrivals in Indonesia resulted in a Mean Absolute Percentage Error (MAPE) of 7.729518%.

1. Introduction
Tourism, as an invisible export commodity, continues to increase its role in the Indonesian economy. Developing international tourism requires targeted and appropriate programs in order to increase the number of foreign tourist arrivals. However, the number of foreign tourists visiting Indonesia fluctuates every year, so it is not certain how many foreign tourists will come. This makes it difficult for tourism providers to offer their best services to visiting foreign tourists, and the number of foreign tourists may fall at some point, which reduces income from the tourism sector. Forecasts or predictions of the number of foreign tourists visiting Indonesia can be used as information to increase marketing activities and to improve the facilities needed by foreign tourists, such as immigration services, transportation facilities, banking, accommodation, restaurants, and travel agencies [1].
Forecasting is an important tool in effective and efficient planning [2]. One widely used forecasting or prediction method is Fuzzy Time Series, a soft computing method that has been applied to time series data analysis. Time series data analysis is data analysis that considers the effect of time. A time series prediction method identifies historical patterns (using time as a reference), then


makes predictions by extrapolating those patterns over time. A time series model assumes that some pattern or combination of patterns will repeat over time.
Prediction using Fuzzy Time Series works by storing past data and then processing it to produce new values for future periods [3], where the past data is time series data covering several time periods. However, the Fuzzy Time Series method also has a weakness. According to [4], one of the factors that can affect the accuracy of the Fuzzy Time Series method is the interval length of the subset partitions in the universe of discourse. Furthermore, according to [5], the weakness of the Fuzzy Time Series method is that the range of data values, or the interval length used, is too wide, so the prediction results are less than optimal and an additional method or algorithm is needed to obtain better prediction results. A Genetic Algorithm is used to find the best values to use as interval limits of the Fuzzy Time Series subsets, so that optimal interval lengths for the subset partitions in the universe of discourse give prediction results that are closest to the actual data.
Previously, [6] applied the Fuzzy Time Series method and a Genetic Algorithm to predict the number of prospective new students of STIKOM Dinamika Bangsa Jambi. Combining the two methods produced fairly good results, as shown by the best Mean Square Error value of 6.487. Based on this background, this research predicts the number of foreign tourists in Indonesia using the Fuzzy Time Series method and a Genetic Algorithm.
2. Literature Study
2.1 Fuzzy Time Series
Fuzzy Time Series is a concept that can be used for prediction problems in which the historical data are expressed as linguistic values. In other words, the previous data in a fuzzy time series are linguistic data, while the current data produced as a result are real numbers. The forecasting steps of Chen's Fuzzy Time Series are as follows [2]:
1. Determine the universe of discourse U of the historical data.
$U = [X_{min} - D_1,\ X_{max} + D_2]$ (2.1)
where:
$X_{min}$ = minimum data
$X_{max}$ = maximum data
$D_1$ and $D_2$ = positive numbers chosen by the researcher to determine the universal set from the historical data set.
2. Define the fuzzy sets Ai and fuzzify the observed historical data. For example, A1, A2, …, Ap are fuzzy sets that take the linguistic values of a linguistic variable. The fuzzy sets A1, A2, …, Ap in the universe of discourse U are defined as follows:
$A_1 = 1/u_1 + 0.5/u_2 + 0/u_3 + 0/u_4 + \dots + 0/u_p$
$A_2 = 0.5/u_1 + 1/u_2 + 0.5/u_3 + 0/u_4 + \dots + 0/u_p$
$A_3 = 0/u_1 + 0.5/u_2 + 1/u_3 + 0.5/u_4 + \dots + 0/u_p$
$\vdots$
$A_p = 0/u_1 + 0/u_2 + 0/u_3 + \dots + 0.5/u_{p-1} + 1/u_p$ (2.2)
where $u_i$ (i = 1, 2, …, p) is an element of the universal set U, and the number in front of the symbol “/” is the degree of membership of $u_i$ in the fuzzy set, which takes the value 0, 0.5 or 1.
3. Create the Fuzzy Logical Relationship (FLR) table based on the historical data.
4. Group the FLRs obtained in step 3 into Fuzzy Logical Relationship Groups (FLRG), merging identical relationships.
5. Defuzzify to calculate the predicted output value.
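Steps 1 to 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes equal-width intervals (the study instead tunes the interval limits with a Genetic Algorithm), it assigns each value to the fuzzy set whose interval contains it (the set with membership degree 1), and the function names and sample values are hypothetical.

```python
from collections import defaultdict

def build_intervals(data, d1, d2, n_intervals):
    """Step 1: universe of discourse [min - d1, max + d2], split into equal-width intervals."""
    low, high = min(data) - d1, max(data) + d2
    width = (high - low) / n_intervals
    return [(low + i * width, low + (i + 1) * width) for i in range(n_intervals)]

def fuzzify(value, intervals):
    """Step 2: return the index i of the fuzzy set A_i whose interval u_i contains the value."""
    for i, (lower, upper) in enumerate(intervals):
        if lower <= value < upper:
            return i
    return len(intervals) - 1  # value equal to the upper bound of the universe

def build_flrg(data, intervals):
    """Steps 3 and 4: collect FLRs A_i -> A_j between consecutive observations, grouped by A_i."""
    labels = [fuzzify(x, intervals) for x in data]
    flrg = defaultdict(set)
    for current, following in zip(labels, labels[1:]):
        flrg[current].add(following)  # the set merges identical relationships
    return labels, dict(flrg)

# Made-up sample values for illustration only (not the tourist arrival data).
series = [10, 12, 19, 27, 24, 31, 38, 29]
intervals = build_intervals(series, d1=0, d2=2, n_intervals=3)
labels, flrg = build_flrg(series, intervals)
print(intervals)
print(labels)
print(flrg)
```

Using a set for each group is one simple way to combine identical relationships, as step 4 requires.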


In Chen's fuzzy time series method, several forecasting rules must be followed:
1. If the fuzzification result in year t is Ai and Ai has no fuzzy logical relationship, for example Ai→∅, where the maximum value of the membership function of Ai is in the interval ui and the midpoint of ui is mi, then the forecast Ft+1 is mi.
2. If the fuzzification result in year t is Ai and there is only one FLR in its FLRG, for example Ai→Aj, where Ai and Aj are fuzzy sets, the maximum value of the membership function of Aj is in the interval uj and the midpoint of uj is mj, then the forecast Ft+1 is mj.
3. If the fuzzification result in year t is Ai and Ai has several FLRs in its FLRG, for example Ai→Aj1, Aj2, …, Ajk, where Ai, Aj1, Aj2, …, Ajk are fuzzy sets and the maximum values of the membership functions of Aj1, Aj2, …, Ajk are in the intervals uj1, uj2, …, ujk with midpoints mj1, mj2, …, mjk, then the forecast Ft+1 is as follows:

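$F_{t+1} = \frac{m_{j1} + m_{j2} + \dots + m_{jk}}{k}$ (2.3)
where k is the number of midpoints. The midpoint (mi) of a fuzzy set interval can be found with the following equation:

$m_i = \frac{\text{lower bound of } u_i + \text{upper bound of } u_i}{2}$ (2.4)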
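The three forecasting rules can be illustrated with a short Python sketch. It assumes that the forecast for several relationships is the arithmetic mean of the midpoints of the related intervals and that a midpoint is the average of the interval's lower and upper bounds. The function names, the interval list and the FLRG dictionary are hypothetical examples.

```python
def midpoint(interval):
    """Midpoint m_i of an interval u_i given as (lower bound, upper bound)."""
    lower, upper = interval
    return (lower + upper) / 2

def forecast_next(current_label, flrg, intervals):
    """Forecast F(t+1) from the fuzzy set index observed at time t."""
    successors = flrg.get(current_label)
    if not successors:
        # Rule 1: no relationship, so use the midpoint of the set's own interval.
        return midpoint(intervals[current_label])
    # Rules 2 and 3: average the midpoints of all related intervals
    # (with a single relationship this is just that interval's midpoint).
    mids = [midpoint(intervals[j]) for j in sorted(successors)]
    return sum(mids) / len(mids)

# Hypothetical intervals and relationship groups for illustration.
example_intervals = [(10, 20), (20, 30), (30, 40)]
example_flrg = {0: {0, 1}, 1: {2}}  # A1 -> A1, A2 ; A2 -> A3 ; A3 has no relationship

print(forecast_next(0, example_flrg, example_intervals))  # rule 3
print(forecast_next(1, example_flrg, example_intervals))  # rule 2
print(forecast_next(2, example_flrg, example_intervals))  # rule 1
```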
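The accuracy of a forecasting model can be measured with several error measures, one of which is the Mean Absolute Percentage Error (MAPE). MAPE is the average of the absolute percentage errors (differences) between the actual data and the forecasted data. MAPE can be obtained with the following formula [7]:

$MAPE = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{X_t - F_t}{X_t} \right| \times 100\%$ (2.5)
where $X_t$ is the actual value at time t, $F_t$ is the forecast value at time t, and n is the number of data points.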
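As a small worked example, MAPE can be computed in Python as the mean of the absolute percentage errors. The function name and the sample values below are hypothetical, and the sketch assumes that no actual value is zero.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent, between two equal-length sequences."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    # Each term is the absolute error relative to the actual value (assumed non-zero).
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return total / len(actual) * 100

# Made-up values: errors of 10%, 10% and 0% average to about 6.67%.
example_actual = [100, 200, 400]
example_forecast = [110, 180, 400]
print(round(mape(example_actual, example_forecast), 2))
```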
2.2 Genetic Algorithm
A Genetic Algorithm uses the principle of natural selection to find a solution. It modifies individuals over successive generations so that they approach the optimal result [8]. According to [9], the search for solutions is carried out by combining chromosomes, which are then processed with genetic operators (selection, crossover, and mutation) after initializing the genetic parameters (population size, crossover rate, mutation rate, and number of generations). The process flow of the Genetic Algorithm is as follows [10]:
1. Initialization
This is the initial stage of the Genetic Algorithm, in which random numbers are generated to form the chromosomes.
2. Reproduction
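The purpose of this stage is to create new individuals (offspring) from the parents. There are two ways of reproduction in the Genetic Algorithm: crossover and mutation.
• Crossover
One cut-point crossover is used: two parents are taken at random from the population and exchange chromosome values. The number of offspring produced by crossover is the crossover rate (cr) times the population size:
$\text{offspring} = cr \times \text{popSize}$ (2.6)
• Mutation
Mutation changes the value of a particular gene in a chromosome, starting from a parent taken at random from the population. The number of offspring produced by mutation is the mutation rate (mr) times the population size:
$\text{offspring} = mr \times \text{popSize}$ (2.7)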


3. Evaluation
The fitness value is used to assess how good a chromosome is. The fitness value is calculated with equation 2.8.
fitness = 1 / MAPE (2.8)
4. Selection
This stage selects the best individuals from the reproduction process, and the selected individuals become the parents in the next generation. Reproduction and selection are repeated for the specified number of generations (iterations). The higher the fitness value of an individual or chromosome, the higher its probability of being selected as one of the best individuals, and the better the next generation. Elitism selection is used, in which the individuals with the highest fitness values are selected.
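The four stages above can be sketched as a compact Python loop. This is an illustrative sketch under several assumptions rather than the authors' code: a chromosome is assumed to be a sorted list of real-valued interval limits, the number of offspring per operator is assumed to be the rate times the population size, mutation is assumed to replace one gene with a random value, and `error_fn` is a hypothetical stand-in for the MAPE of a Fuzzy Time Series forecast built from the chromosome (it must return a positive number).

```python
import random

def one_cut_point_crossover(parent_a, parent_b, rng):
    """Take the genes before a random cut point from one parent and the rest from the other."""
    cut = rng.randint(1, len(parent_a) - 1)
    return sorted(parent_a[:cut] + parent_b[cut:])  # sorted: limits stay in ascending order

def mutate(parent, low, high, rng):
    """Replace one randomly chosen gene with a random value in [low, high]."""
    child = list(parent)
    child[rng.randrange(len(child))] = rng.uniform(low, high)
    return sorted(child)

def run_ga(error_fn, n_genes, low, high, pop_size, cr, mr, generations, seed=0):
    rng = random.Random(seed)

    def fitness(chromosome):
        return 1.0 / error_fn(chromosome)  # fitness = 1 / MAPE

    # 1. Initialization: random chromosomes (n_genes must be at least 2).
    population = [sorted(rng.uniform(low, high) for _ in range(n_genes))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # 2. Reproduction: crossover and mutation offspring.
        offspring = []
        for _ in range(round(cr * pop_size)):
            parent_a, parent_b = rng.sample(population, 2)
            offspring.append(one_cut_point_crossover(parent_a, parent_b, rng))
        for _ in range(round(mr * pop_size)):
            offspring.append(mutate(rng.choice(population), low, high, rng))
        # 3. Evaluation and 4. elitism selection: keep the pop_size fittest individuals.
        population = sorted(population + offspring, key=fitness, reverse=True)[:pop_size]
    return max(population, key=fitness)

# Toy error function for demonstration only: distance to arbitrary target limits, plus 1.
targets = [30.0, 70.0]

def toy_error(chromosome):
    return 1.0 + sum(abs(g - t) for g, t in zip(chromosome, targets))

best = run_ga(toy_error, n_genes=2, low=0, high=100,
              pop_size=20, cr=0.5, mr=0.6, generations=30)
print(best, toy_error(best))
```

Because elitism always keeps the fittest individuals from parents and offspring combined, the best error found can never get worse from one generation to the next.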
3. Methodology
3.1 Data Collection
The data needed in this study were obtained from the official website of the Central Bureau of Statistics. The data are the monthly numbers of foreign tourist arrivals in Indonesia from 1999 to 2019, so a total of 252 data points are used in this study.
3.2 Research Testing
Testing starts from the input data on the number of foreign tourist visits, which then go through the prediction process. After the prediction calculation stages using the Fuzzy Time Series method and Genetic Algorithm, the prediction results are evaluated using the MAPE formula. The last stage is analyzing the test results and drawing conclusions.

Fig 1. Testing Stages Flowchart


4. Result and Discussion


4.1 Population Size Testing
In the population size test, the other Genetic Algorithm parameters are fixed: the number of generations = 50, and the crossover rate (cr) and mutation rate (mr) = 0.5 and 0.6. The population size test results are shown in Table 1.
Table 1. Population Size Test Results

Table 1 shows the MAPE value for each population size tested. A population size of 100 produces the lowest MAPE value, 7.8305.
4.2 Crossover Rate & Mutation Rate Testing
The Genetic Algorithm parameters used for this test are a population size of 100, obtained from the previous test, and number of generations = 50. The crossover rate and mutation rate test results are shown in Table 2.
Table 2. Crossover Rate & Mutation Rate Test Results


Based on Table 2, the combination of crossover rate = 0.5 and mutation rate = 0.6 gives the smallest MAPE value among the combinations tested, 7.85908.
4.3 Number of Generation Testing
The number of generations is tested using the best population size and the best combination of cr and mr values from the previous test results. The number of generations test results are shown in Table 3.
Table 3. Number of Generation Test Results

Based on Table 3, number of generations = 400 gives the smallest MAPE value among the values tested, 7.78808.
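The one-parameter-at-a-time procedure of Sections 4.1 to 4.3 can be sketched as below. This is only an illustration: `evaluate` is a hypothetical callable that runs the Genetic Algorithm with the given parameters and returns the resulting MAPE, the candidate lists are invented, and the toy evaluation function stands in for real runs. Since a Genetic Algorithm is stochastic, in practice each evaluation might average several runs.

```python
def tune_sequentially(evaluate, pop_sizes, rate_pairs, generation_counts,
                      start_rates=(0.5, 0.6), start_generations=50):
    """Tune population size, then (cr, mr), then generations, keeping earlier winners fixed."""
    # 4.1: vary the population size with cr, mr and generations fixed.
    pop_size = min(pop_sizes,
                   key=lambda p: evaluate(p, start_rates[0], start_rates[1], start_generations))
    # 4.2: vary the (cr, mr) pair using the best population size.
    cr, mr = min(rate_pairs,
                 key=lambda pair: evaluate(pop_size, pair[0], pair[1], start_generations))
    # 4.3: vary the number of generations using the best values so far.
    generations = min(generation_counts,
                      key=lambda g: evaluate(pop_size, cr, mr, g))
    return {"pop_size": pop_size, "cr": cr, "mr": mr, "generations": generations}

# Toy stand-in for a real GA run: a deterministic function with a known minimum.
def toy_evaluate(pop_size, cr, mr, generations):
    return (7.7 + abs(pop_size - 100) / 100 + abs(cr - 0.5)
            + abs(mr - 0.6) + abs(generations - 400) / 400)

best_params = tune_sequentially(
    toy_evaluate,
    pop_sizes=[20, 60, 100, 140],                 # hypothetical candidates
    rate_pairs=[(0.3, 0.8), (0.5, 0.6), (0.7, 0.4)],
    generation_counts=[50, 200, 400, 600],
)
print(best_params)
```

Tuning one parameter at a time is cheaper than testing every combination, but it can miss interactions between parameters.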
4.4 Test Result
This section describes the testing of the prediction results for the number of foreign tourists in
Indonesia using the Fuzzy Time Series and Genetic Algorithm, using the best parameters from the
previous testing results.
Table 4. Prediction of the Number of Foreign Tourists in Indonesia MAPE Test Results


From the MAPE results, the error generated by the Fuzzy Time Series method and Genetic Algorithm is 7.729518%. The Genetic Algorithm parameters used as input values influence the final results. Increasing the population size tends to improve the MAPE value, although a larger population does not always give a better MAPE value. A population size that is too small reduces the chance of obtaining optimal individuals to use as interval limits in the Fuzzy Time Series method.
A crossover rate and mutation rate that are too high can produce offspring populations that are similar to their parents. This affects the diversity of the population in the next generation and the search for optimal values to use as interval limits.
Too few generations prevent the Genetic Algorithm from completing the search for the most optimal individual. However, a large number of generations can sometimes also move the search away from the individuals with optimal values to use as interval limits in the Fuzzy Time Series.
5. Conclusion
The interval lengths of the Fuzzy Time Series subsets are sometimes too wide, which affects the prediction results. For this reason, the interval limits are determined with the help of a Genetic Algorithm, whose individuals are evaluated in each generation. Prediction of the number of foreign tourists in Indonesia using the Fuzzy Time Series method and Genetic Algorithm produces an error value (MAPE) of 7.729518%. The best Genetic Algorithm parameters, taken from the previous test results, are population size = 100, crossover rate = 0.5, mutation rate = 0.6, and number of generations = 400.
References
[1] Badan Pusat Statistik, “Statistik Kunjungan Wisatawan Mancanegara Tahun 2019,” Badan Pus. Stat.,
2020.
[2] N. Fauziah, S. Wahyuningsih, and Y. N. Nasution, “Peramalan Mengunakan Fuzzy Time Series Chen
(Studi Kasus : Curah Hujan Kota Samarinda),” Statistika, vol. 4, no. 2, pp. 52–61, 2016.
[3] Y. Ekananta, L. Muflikhah, and C. Dewi, “Penerapan Metode Average-Based Fuzzy Time Series Untuk
Prediksi Konsumsi Energi Listrik Indonesia,” J. Univ. Brawijaya, vol. 2, no. 3, pp. 1283–1288, 2018.
[4] A. D. A. Rifandi, B. D. Setiawan, and Tibyani, “Optimasi Interval Fuzzy Time Series Menggunakan
Particle Swarm Optimization pada Peramalan Permintaan Darah : Studi Kasus Unit Transfusi Darah
Cabang - PMI Kota Malang,” J. Pengemb. Teknol. Inf. dan Ilmu Komput. Univ. Brawijaya, vol. 2, no. 7,
pp. 2770–2779, 2018.
[5] T. Mandariansah, B. D. Setiawan, and R. C. Wihandika, “Optimasi Fuzzy Time Series Untuk Peramalan
Kebutuhan Hidup Layak Kota Kediri Dengan Menggunakan Algoritme Genetika,” J. Pengemb. Teknol.
Inf. dan Ilmu Komput. Univ. Brawijaya, vol. 2, no. 5, pp. 1823–1832, 2018.
[6] M. R. Palevi, “Fuzzy Time Series Dalam Prediksi Jumlah Calon Mahasiswa Baru STIKOM Dinamika
Bangsa Jambi,” J. Ilm. Media Process., vol. 11, no. 2, pp. 228–237, 2016.
[7] F. Aditya, D. Devianto, and M. Maiyastri, “Peramalan Harga Emas Indonesia Menggunakan Metode Fuzzy Time Series Klasik,” J. Mat. UNAND, vol. 8, no. 2, p. 45, 2019, doi: 10.25077/jmu.8.2.45-52.2019.
[8] D. Kurnianingtyas, W. F. Mahmudy, and A. W. Widodo, “Optimasi Derajat Keanggotaan Fuzzy
Tsukamoto Menggunakan Algoritma Genetika Untuk Diagnosis Penyakit Sapi Potong,” J. Teknol. Inf.
dan Ilmu Komput., vol. 4, no. 1, p. 8, 2017, doi: 10.25126/jtiik.201741294.
[9] H. A. Saputro, W. F. Mahmudy, and C. Dewi, “Implementasi Algoritma Genetika Untuk Optimasi
Penggunaan Lahan Pertanian,” J. Mhs. PTIIK, vol. 5, no. 12, p. 12, 2015.
[10] A. C. Nurhakim, B. Darma, and C. Dewi, “Optimasi Fuzzy Time Series Untuk Prediksi Jumlah Produksi
Saga Leather Fashion Menggunakan Metode Algoritme Genetika,” Jurnal Pengembangan Teknologi
Informasi dan Ilmu Komputer, vol. 3, no. 5, 2019.
