
Expert Systems With Applications 94 (2018) 21–31


New efficient hybrid candlestick technical analysis model for stock market timing on the basis of the Support Vector Machine and Heuristic Algorithms of Imperialist Competition and Genetic

Elham Ahmadi a, Milad Jasemi b,∗, Leslie Monplaisir b, Mohammad Amin Nabavi c, Armin Mahmoodi d, Pegah Amini Jam d

a Department of Industrial Engineering, Yazd University, Yazd, Iran
b Department of Industrial and Systems Engineering, Wayne State University, Detroit, MI, USA
c Department of Mathematics and Statistics, Carleton University, Ottawa, Canada
d Department of Industrial Engineering, Islamic Azad University, Masjed Soleyman Branch, Masjed Soleyman, Iran

Article info

Article history:
Received 22 May 2017
Revised 16 August 2017
Accepted 9 October 2017
Available online 11 October 2017

Keywords: Finance; Stock market forecasting; Candlestick technical analysis; Support Vector Machine; Imperialist Competition Algorithm; Genetic Algorithm

Abstract

In this paper, two hybrid models are used for timing of the stock markets on the basis of the technical analysis of Japanese candlesticks, using the Support Vector Machine (SVM) and the heuristic algorithms of Imperialist Competition and Genetic. In the first model, SVM and the Imperialist Competition Algorithm (ICA) are developed for stock market timing, in which ICA is used to optimize the SVM parameters. In the second model, SVM is used with the Genetic Algorithm (GA), where GA is used for feature selection in addition to SVM parameter optimization. Two approaches, raw-based and signal-based, are devised on the basis of the literature to generate the input data of the models. For comparison, the hit rate is considered as the percentage of correct predictions for periods of 1–6 days. The results show that the SVM-ICA performance is better than that of SVM-GA and, most importantly, better than that of the feed-forward static neural network of the literature taken as the benchmark.

© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Predicting the behavior of stock market prices is an issue in which financial scientists and investors have always been interested. The main reason for investing in the stock market is to gain profit, which requires the right information about the market, stock changes and trend forecasting. That is why investors need powerful and reliable tools to predict stock prices.

Research shows that many factors affect the stock market's performance, so fluctuations in the stock market are non-linear (Jasemi, Kimiagari, & Memariani, 2011a). The stock market is therefore a non-linear dynamic system, and predicting the path of stock prices is a difficult task. However, a suitable non-linear modeling approach such as an artificial intelligence system can discover the complex non-linear relations and handle the prevailing uncertainty and inaccuracy in the stock market.

With recent advances in artificial intelligence, new prediction methods have been proposed that are more accurate than the traditional ones. However, each of these methods has its own advantages and disadvantages. The most common among them is the neural network. Considering the technical analysis of Japanese candlesticks as the investment technique applied to stock market timing, Jasemi, Kimiagari, and Memariani (2011b) apply a supervised feed-forward neural network; Barak, Heidary, and Dahooie (2015) use a Wrapper ANFIS-ICA as a fuzzy neural network; and Ahmadi, Abooie, Jasemi, and Zare Mehrjardi (2016) apply a NARX as a non-dynamic neural network as the analyst.

Recently the Support Vector Machine (SVM), a supervised learning method, has gained popularity. This method learns the rules lying in a time series, including any changes in them, and uses them to predict the future. Setting the SVM parameters plays a major role in the accuracy of the support vectors, and studies use meta-heuristic algorithms to find proper values for them. Given the current gap in the literature, in this study SVM is combined with two meta-heuristic algorithms to predict the movement of stock prices. The two meta-heuristic algorithms are the Genetic Algorithm (GA) and the Imperialist Competition Algorithm (ICA), which are used to optimize the model parameters and features.

∗ Corresponding author.
E-mail addresses: [email protected] (E. Ahmadi), [email protected] (M. Jasemi), [email protected] (L. Monplaisir), [email protected] (M.A. Nabavi), [email protected] (A. Mahmoodi), [email protected] (P. Amini Jam).

https://doi.org/10.1016/j.eswa.2017.10.023
0957-4174/© 2017 Elsevier Ltd. All rights reserved.

Fig. 1. Support Vector Machine classifier.

Fig. 3. Structure of the SVM-GA model.

The aim of this study is to evaluate the SVM combined with the mentioned algorithms, which shape the combination of input variables and hence the overall result, and to specify the accuracy of its predictions.

The article is structured as follows: Section 2 covers the literature review. Section 3 discusses the background and the new insights brought by the proposed methods to overcome the limitations of previous studies; knowing the basics explained in that section is necessary to understand the nature of the work deeply. Section 4 presents the applied methodology along with the base conceptual model leading to the two main new models of the study. Section 5 runs the models with real data and discusses them in detail and from a variety of aspects. Section 6 presents the final discussions of the study.

Fig. 2. Structure of the SVM-ICA model.

2. Literature review

Predicting the stock market and determining trends are very interesting to finance and stock market researchers and to anyone who wants to choose the correct stock and/or the right time to buy or sell stocks (Sahin & Ozbayoglu, 2014). However, accurate prediction is very challenging due to the noisy and non-static nature of stock prices. Many macro-economic factors, such as political events, company policy, general economic conditions, product price indexes, interest rates and stocks, investors' expectations and psychological factors, affect stock prices (Majhi, Rout, & Baghel, 2014). Government policy and legislative measures also have a significant impact on the movement of the stock market overall.

Traders use different methods for decision making in the stock market, which can be divided into the two groups of technical analysis and fundamental analysis (Sankar, Vidyaraj, & Kumar, 2015). Fundamental analysis studies economy and industry conditions, financial conditions, company management and other qualitative and quantitative factors for secure investigation, while technical analysis uses previous prices to predict the future prices of the stock (Anbalagan & Maheswari, 2015). Jasemi et al. (2011a), the base paper of this study, offered a new model for stock market timing based on neural network monitoring and technical analysis of Japanese candlesticks. In that research, analysis of the Japanese charts and their patterns is used as technical information. Many studies have analyzed the use and advantages of candlesticks in predicting the stock market (Lan, Zhang, & Xiong, 2011; Lee & Jo, 1999; Xie, Zhao, & Wang, 2012).

Given the non-linearity of the stock market, soft computing methods are widely used for stock market issues (Barak, Arjmand, & Ortobelli, 2017). They are useful tools for predicting such turbulent areas, since they can capture the nonlinear behavior. The use of intelligent systems such as neural networks, fuzzy systems and GA, or of hybrid models, to predict financial quantities is vast. Recently, artificial neural networks and SVM have been used to solve problems of forecasting financial time series of stock market funds (Anbalagan & Maheswari, 2015). Many studies combine evolutionary techniques with classification mechanisms (Dahal, Almejalli, Hossain, & Chen, 2015; de Campos, de Oliveira, & Roisenberg, 2016; Kuo, Lin, & Liao, 2011). However, even after the development of many efficient models, artificial neural networks still show notable limitations.

Neural networks have a few disadvantages in the learning process, rooted in the stochastic nature of training, which results in a lack of reproducibility of the process. For this reason, many researchers prefer new approaches based on robust statistical principles such as SVM (Fernandez-Lozano et al., 2013). Recently, the SVM method, one of the supervised learning methods, has gained popularity and has become known as one of the most advanced applications of regression and classification methods. The SVM formulation minimizes the structural risk and, more importantly, shows high performance in practical applications (Huang, Nakamori, & Wang, 2005).

Given the above advantages, since SVM was introduced based on Vapnik's statistical learning theory, many studies have focused on the theory and its applications. Several studies use the SVM to predict time series (Tay & Cao, 2001; Huang et al., 2005). The SVM, developed by Vapnik (1995, 1998), is a machine learning technique that, owing to its interesting features and excellent performance in various problems, has become widely known in non-linear prediction and has been very successful there (Wang, Xu, & Weizhen Lu, 2003). Although recently it is not always the first choice, a significant number of researchers report a variety of applications in pattern recognition, regression and time series forecasting. For example, Tay and Cao (2001) use it to predict time series and conclude that the SVM is better than the multi-layer propagation neural network for forecasting financial time series. Wang et al. (2003) use SVM to predict air quality and conclude that radial-basis networks are very effective. The experimental results and the review of the literature have shown that the kernel parameters C and σ have a significant effect on the accuracy and precision of SVM (Cherkassky & Ma, 2004). Heuristic methods have not been used successfully for determining the values of these parameters; for this reason, researchers have used meta-heuristic methods to obtain them. Pai and Hong (2005, 2006) used GA and a gradual annealing algorithm, respectively. In other cases, Hong, Dong, Zheng, and Lai (2011) and Hong, Dong, Chen, and Wei (2011) used a continuous ant colony algorithm and GA, respectively, to obtain SVR parameters. Note that for numerical optimization and the setting of these parameters, GA may be preferable to other options such as evolutionary strategies, Sequential Parameter Optimization (SPO; Bartz-Beielstein, 2010), Particle Swarm Optimization (PSO; Ardjani & Sadouni, 2010) and ICA (Boutte & Santhanam, 2009). In this study, GA and ICA are used to optimize the parameters of SVM, which has never been done in the literature.

Table 1
Raw approach (indicator no.: numerator/denominator).

1: C2/C1    2: C3/C1    3: C4/C1    4: C5/C1    5: C6/C1
6: C7/C1    7: O5/C1    8: H5/C1    9: L5/C1    10: O6/C1
11: H6/C1   12: L6/C1   13: O7/C1   14: H7/C1   15: L7/C1

Table 2
Signal approach (indicator no.: numerator/denominator).

1: C2/C1                    2: C3/C1                    3: C4/C1                    4: C5/C1
5: C6/C1                    6: C7/C1                    7: O5/C5                    8: O6/C6
9: O7/C7                    10: H7/Max(O7,C7)           11: Min(O7,C7)/L7           12: Max(O7,C7)/Max(O6,C6)
13: Min(O7,C7)/Min(O6,C6)   14: O7/H6                   15: L6/O7                   16: C7/O6
17: Max(O6,C6)/Min(O5,C5)   18: Max(O7,C7)/Max(O5,C5)   19: Min(O7,C7)/Min(O5,C5)   20: Min(O6,C6)/Max(O5,C5)
21: H7/H6                   22: H7/H5                   23: L6/L5                   24: L7/L7

Fig. 4. Research method.

3. The background

In this part, the new insights brought by the proposed methods to overcome the limitations of previous studies are discussed.

3.1. SVM

SVM is in essence a binary classifier in which two classes are separated by a linear boundary. In this method, an optimization algorithm finds the samples that make up the boundary between the classes; they are called support vectors. In Fig. 1, two classes and their associated support vectors are shown. The input feature space consists of two classes holding the training points xi, i = 1, …, N; the two classes are tagged with yi = ±1. To calculate the decision boundary of two completely separable classes, the Optimal Margin method is used (Fernandez-Lozano et al., 2013; Huang, Yang, & Chuang, 2008; Tay & Cao, 2001). In general, a linear decision boundary can be written as follows:

w·x + b = 0    (1)

where x is a point on the decision boundary and w is an n-dimensional vector perpendicular to the decision boundary. Note that |b|/‖w‖ is the distance between the origin and the decision boundary, and w·x is the inner product of the two vectors.

When the classes overlap, a linear decision boundary will always make errors. To address this, the data can be mapped from the Rn space to an Rm space, via a nonlinear transformation ∅, in which the classes interfere less with each other. In this case, finding the optimal decision boundary reduces to the following optimization problem:

Max over α1, …, αN:  −(1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} αi αj yi yj ∅(xi)·∅(xj) + Σ_{i=1}^{N} αi
subject to  0 ≤ αi ≤ C,  i = 1, …, N,  and  Σ_{i=1}^{N} αi yi = 0    (2)
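As a concrete illustration of Eqs. (1)–(2), the learned classifier evaluates sign(Σ αi yi k(xi, x) + b) over the support vectors. The sketch below uses the RBF kernel introduced later in Eq. (4); the toy support vectors, multipliers and data points are invented for illustration and are not from the paper.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    # RBF kernel, cf. Eq. (4): k(x, z) = exp(-gamma * ||x - z||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def decision(x, support_vectors, alphas, labels, b, gamma=1.0):
    # Decision rule built from Eqs. (1)-(2):
    # f(x) = sign( sum_i alpha_i * y_i * k(x_i, x) + b )
    s = sum(a * y * rbf_kernel(sv, x, gamma)
            for a, y, sv in zip(alphas, labels, support_vectors))
    return 1 if s + b >= 0 else -1

# Invented toy support vectors, one per class (y = -1 and y = +1).
svs = [(0.0, 0.0), (2.0, 2.0)]
alphas = [1.0, 1.0]
labels = [-1, 1]
print(decision((1.9, 1.9), svs, alphas, labels, b=0.0))  # → 1
```

A point near (2, 2) gets a large kernel value against the positive support vector and is classified +1; a point near the origin is classified −1.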

Table 3
Details of training and testing of the applied datasets.

No.  Training    Test  | No.  Training    Test  | No.  Training    Test  | No.  Training    Test
1    2000        2001  | 13   2000–2001   2006  | 25   2000–2003   2007  | 37   2001–2000   2006
2    2000        2002  | 14   2000–2001   2007  | 26   2000–2003   2008  | 38   2001–2002   2007
3    2000        2003  | 15   2000–2001   2008  | 27   2001        2002  | 39   2001–2002   2008
4    2000        2004  | 16   2000–2002   2003  | 28   2001        2003  | 40   2001–2003   2004
5    2000        2005  | 17   2000–2001   2004  | 29   2001        2004  | 41   2001–2002   2005
6    2000        2006  | 18   2000–2001   2005  | 30   2001        2005  | 42   2001–2002   2006
7    2000        2007  | 19   2000–2001   2006  | 31   2001        2006  | 43   2001–2002   2007
8    2000        2008  | 20   2000–2001   2007  | 32   2001        2007  | 44   2001–2002   2008
9    2000–2001   2002  | 21   2000–2001   2008  | 33   2001        2008  | 45   2001–2004   2005
10   2000–2001   2003  | 22   2000–2003   2004  | 34   2001–2002   2003  | 46   2001–2004   2006
11   2000–2001   2004  | 23   2000–2003   2005  | 35   2001–2002   2004  | 47   2001–2004   2007
12   2000–2001   2005  | 24   2000–2003   2006  | 36   2001–2002   2005  | 48   2001–2004   2008
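For illustration, the transformation of a 7-day OHLC window into the 15 raw-approach indicators of Table 1 can be sketched as follows; the function and field names are hypothetical, not from the paper, and the prices are invented.

```python
def raw_indicators(o, h, l, c):
    """Build the 15 raw-approach ratios of Table 1 from a 7-day window of
    Open/High/Low/Close prices (index 0 = day 1, index 6 = day 7 = today)."""
    assert len(o) == len(h) == len(l) == len(c) == 7
    c1 = c[0]
    feats = [c[i] / c1 for i in range(1, 7)]               # C2/C1 ... C7/C1
    for day in (4, 5, 6):                                  # days 5, 6 and 7
        feats += [o[day] / c1, h[day] / c1, l[day] / c1]   # Oi/C1, Hi/C1, Li/C1
    return feats

# Hypothetical 7-day window of daily prices.
window = dict(o=[10, 10, 11, 11, 12, 12, 13],
              h=[11, 11, 12, 12, 13, 13, 14],
              l=[9, 9, 10, 10, 11, 11, 12],
              c=[10, 11, 11, 12, 12, 13, 13])
print(len(raw_indicators(**window)))  # → 15
```

The signal approach of Table 2 would follow the same pattern with its own 24 ratios.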

In this optimization problem the αi are Lagrange multipliers and C is a constant. In formula (2), instead of using ∅ directly, it is preferred to use a kernel function, defined as follows:

k(xi, xj) = ∅(xi)·∅(xj)    (3)

After determining a suitable k(xi, xj), the function k(xi, xj) replaces ∅(xi)·∅(xj) in formula (2) and the optimization problem can be solved. One of the most common kernel functions is the radial basis function (RBF) kernel, described as follows (Huang et al., 2005; Vapnik, 1995, 1998):

k(xi, xj) = exp(−γ‖x − xi‖²)    (4)

Frequently used kernels include:

- the polynomial kernel with parameter k:
  K(a, b) = (a·b + 1)^k    (5)

- the Gaussian kernel with parameter σ:
  K(a, b) = exp(−(a − b)² / (2σ²))    (6)

- the sigmoid kernel with parameters δ and k:
  K(a, b) = tanh(k a·b − δ)    (7)

Two important parameters of the SVM are C and γ, which should be selected very carefully. C denotes the penalty: if a large value is assigned to C, the classification accuracy in the training and testing phases will be respectively high and low, which is called over-fitting. On the other hand, small values of C end up with poor classification accuracy, which is inappropriate as well. A similar scenario applies to γ, although it has a deeper effect on the results than C, since it shapes the resulting feature space.

In this article the imperialist competitive algorithm, which is inspired by a socio-human phenomenon rather than a natural one, is used to optimize the parameters of the RBF kernel in the SVM. In particular, this algorithm regards the imperial process as a stage of human development, and the socio-political and mathematical modeling of this event is exploited to build a powerful optimization algorithm.
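The kernels of Eqs. (5)–(7) translate directly into code. A minimal sketch with scalar inputs follows (for vector inputs, the products a·b would become inner products); the function names are invented for illustration.

```python
import math

def polynomial_kernel(a, b, k=2):
    # Eq. (5): K(a, b) = (a.b + 1)^k
    return (a * b + 1) ** k

def gaussian_kernel(a, b, sigma=1.0):
    # Eq. (6): K(a, b) = exp(-(a - b)^2 / (2 * sigma^2))
    return math.exp(-((a - b) ** 2) / (2 * sigma ** 2))

def sigmoid_kernel(a, b, k=1.0, delta=0.0):
    # Eq. (7): K(a, b) = tanh(k * a.b - delta)
    return math.tanh(k * a * b - delta)

print(polynomial_kernel(1.0, 1.0, k=2))  # → 4.0
```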

3.2. SVM and ICA

In this section we develop a model for timing the stock market with SVM and ICA, based on a trading strategy of technical analysis with Japanese candlesticks. As shown in Fig. 2, the turning point of this model is the use of the imperialist competitive algorithm to optimize the SVM parameters. The main contributions of this model are: (1) the combination of the imperialist competitive algorithm and the multi-class SVM algorithm for determining the values of the SVM parameters, yielding a new algorithm for stock market timing; and (2) the creation of a model based on SVM concepts and Japanese candlestick charts in a way quite different from previous studies.

3.2.1. Imperialist competitive algorithm and optimization of kernel parameters

Selecting the appropriate kernel for the SVM has a major impact: the kernel determines how the data are mapped from the initial space to the new space in which a linear classifier can be built. Generally the Gaussian or the polynomial kernel is used. A basic understanding of the data can help in selecting the appropriate kernel; another way is to try different kernels, classify the data, and then select the kernel that performs best on the test data. Besides selecting the appropriate kernel, setting the kernel parameters is also very important. Among the kernels in use, the polynomial, Gaussian and sigmoid kernels are the most common (Cristianini & Shawe-Taylor, 2000).

3.3. SVM and GA

In this model, as shown in Fig. 3, GA is used for feature selection and for the selection of the optimal SVM parameters. The purpose of feature selection is to obtain a subset of useful and unique features: by extracting the most essential attributes with the smallest number of features, the computational cost can be greatly reduced. The variables related to the SVM are not only the features but also the kernel function parameters, so in this model the SVM is combined with feature selection and parameter optimization by GA. As typically used in data preprocessing, feature selection chooses a subset of attributes that make up the model and explain the data; by eliminating redundant, irrelevant or confusing features, it can improve the prediction accuracy of the classifier and the interpretability of the model. On the one hand, feature selection decreases the computing cost by reducing the dataset dimension; on the other hand, it can improve the model performance by eliminating redundant and irrelevant features. It should be noted that feature selection is different from methods such as feature conversion or feature extraction, in which new features are developed out of the original ones.

3.3.1. GA

GA, which is based on the Darwinian principle of "survival of the fittest", obtains its near-optimal response after a series of iterative calculations. In GA, successive populations of diverse solutions, represented by chromosomes, are produced until acceptable results are achieved. The quality of an answer is examined by the fitness function in the evaluation step. Mutation and crossover are the main operators.

These operators randomly affect the fitness value. Chromosomes are selected for reproduction by evaluating their merit: the more suitable chromosomes are the more likely to be chosen, by the roulette wheel method or another scheme. Crossover is the main operator that seeks new answers in the search space, while a random mechanism exchanges genes between two chromosomes through one-point, two-point or multiple-point crossover. The evolutionary process is repeated until the stop condition is satisfied.

Fig. 6. Evolution cycle in GA.

Fig. 7. Chromosome structure in GA.

4. Methodology

The purpose of this study is to use an appropriate structure for predicting stock market trading signals with high precision. For this purpose, according to the background presented in the previous sections, two models are used for the technical analysis. Each of these models is described in a separate section. Fig. 4 describes the general methodology used in this study.

4.1. Input data

The input dataset used in this study is based on the two approaches introduced for the first time by Jasemi et al. (2011b). In these two approaches, the daily stock prices, namely the Low, High, Open and Close prices, are turned into 15 and 24 indicators as shown in Tables 1 and 2, respectively for the first and second approaches. Note that in the tables Oi, Hi, Li and Ci respectively denote the open, high, low and close prices on the ith day, where the 7th day is today (the last day), the 6th day is yesterday, and so on. The output is the stock performance, given in the form of a buy, sell or no-action signal.

Table 3 describes the 48 datasets that are used on a daily basis for model training and testing. Each dataset is divided into a training set and a testing set, and each set contains daily stock prices. For example, in dataset 1 the data of year 2000 are used for training and the data of year 2001 for testing. In dataset 2 the time distance between training and test data is increased, and the data of year 2002 are used for testing. In other datasets the amount of training data is also increased; for example, in dataset 9 the data of years 2000 and 2001 are used together as a single training set.

Fig. 5. Flowchart of the imperialist competitive algorithm (ICA).

4.2. The introduction of the models

4.2.1. SVM-ICA

The following parameters and notations are used in the rest of the paper:

NPop: the number of the primary population
Nimp: the number of the imperialists
NCol: the number of the colonies
Zeta: the impact coefficient of the colonies' cost on the empires' cost
PRevolution: the probability of the revolution
ImpCost: the imperialist's cost
ImpFitness: the imperialist's fitness

Fig. 8. ICA performance during the 100 courses in the first dataset.

MaxImpCost: the maximum imperialist's cost
ImpColoniesCosts: the imperialist's colonies' costs
MaxDecades: the maximum number of decades, used as the stopping factor of the ICA
HeuristicRatio: the impact coefficient of the difference between the parents in creating a child

Fig. 9. Results obtained from the model in the first dataset.

Model selection, i.e., the search for the optimal parameters, has a key role in SVM performance. However, there are still no standard criteria for choosing the SVM's kernel function and its parameters. In this study the RBF kernel is used, for four reasons: (1) this kernel takes the boundaries of the linear space to another space with higher dimensions; (2) in terms of performance, the linear kernel with one parameter C performs like the RBF kernel with two parameters (C, σ); (3) the polynomial kernel has more parameters than the RBF kernel; and (4) the RBF kernel has fewer numerical problems, because its values lie between 0 and 1, while values of the polynomial kernel can range between 0 and infinity when its degree is large. Given these advantages, the RBF kernel is used for the construction of a predictive model for the timing of the stock market, and the C and σ parameters of the RBF kernel must be specified. The upper bounds for C and the kernel parameter σ play very important roles in the performance of SVMs. In this model the ICA is used for the selection of the SVM model's parameters. The overall scheme of the model is shown in Fig. 4. The application of the ICA is depicted in Fig. 5 and explained in the following steps:

Step 1: creating the primary population of the colonies randomly
Step 2: evaluating the cost function of each colony; in this study the misclassification probability is used as the cost function:

MisClass = 100 × (Σ_{i=1}^{N} 1[Ypredicted,i ≠ Yi]) / N    (8)

in which N is the number of outputs, Ypredicted is the output predicted by the SVM, and Y is the real output given as an input to the SVM.

Step 3: selecting the strongest colonies as imperialists
Step 4: allocating the residual colonies to the imperialists based on the strength of the empires
Step 5: creating empires from the imperialists and their colonies
Step 6: moving the colonies toward their imperialist
Step 7: revolution in some colonies
Step 8: swapping the positions of a colony and its imperialist if this lowers the empire's cost
Step 9: comparing the objective function of all empires
Step 10: giving the weakest colony to the best empire
Step 11: eliminating the weak empires
Step 12: stopping the algorithm if the stopping condition — completing the specified number of rounds — is reached; otherwise going back to step 6

Table 4
Results of accuracy of the implementation of the SVM-ICA model.

No.  C      σ     Accuracy1 | C     σ     Accuracy2
1    8.42   1.10  56        | 0.49  0.78  62
2    5.30   1.04  58        | 0.80  0.43  60
3    5.59   1.02  44        | 0.82  0.79  49
4    5.90   1.31  37        | 5.23  0.71  43
…    …      …     …         | …     …     …
45   0.86   0.57  36        | 8.68  1.18  57
46   11.36  2.15  49        | 0.35  0.94  57
47   74.92  5.76  49        | 3.73  0.75  65
48   2.20   1.86  74        | 0.19  1.19  81

The ICA is used for the selection of the SVM's parameters, so in each iteration the MisClass value decreases. In this study the parameters C and σ are represented by one colony, so each colony has two members. The colonies are formed as the primary population: the ith member of the population, consisting of the two SVM parameters, forms the ith colony. Then the cost of each colony is calculated, the colonies are sorted in decreasing order of strength, the strongest colonies are chosen as imperialists, and the remaining colonies are distributed among these empires. In fact, the strongest imperialists receive the bigger shares of the colonies.
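Eq. (8) is simply the percentage of outputs the SVM mispredicts; a minimal sketch (the function name is invented):

```python
def misclass(y_predicted, y_true):
    """Eq. (8): percentage of outputs where the SVM prediction differs
    from the real output."""
    wrong = sum(1 for yp, y in zip(y_predicted, y_true) if yp != y)
    return 100.0 * wrong / len(y_true)

print(misclass([1, -1, 1, 1], [1, 1, 1, -1]))  # → 50.0
```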

Fig. 10. Prediction accuracy of SVM-ICA model by approaches 1 and 2.

Fig. 11. The total number of signals obtained from the model in both approaches.

Table 5
Hit rate for 1- and 6-day periods by the SVM-ICA model in both approaches.

Raw approach                    | Signal approach
No.  1d (%)  6d (%)  Sig. no.   | No.  1d (%)  6d (%)  Sig. no.
1    0.20    0.91    86         | 1    0.23    0.87    30
2    0.25    0.94    72         | 2    0.09    0.77    35
3    0.12    0.54    84         | 3    0.31    0.71    35
4    0.07    0.75    59         | 4    0.16    0.65    51
…    …       …       …          | …    …       …       …
45   0.33    0.33    3          | 45   0.26    0.79    114
46   0.17    0.63    52         | 46   0.37    0.93    30
47   0.16    0.68    74         | 47   0.32    0.86    107
48   0.09    0.73    11         | 48   0.00    1.00    2

The share of the imperialists and their fitness probability is calculated as follows:

p = ImpFitness / sum(ImpFitness)    (9)

For the absorption operation, which is the movement of the colonies towards the imperialist's country, the heuristic crossover is performed. If parent 1 is the better one, the child is calculated as follows:

Child = parent2 + HeuristicRatio × (parent1 − parent2)    (10)

and if parent 2 is better:

Child = parent1 + HeuristicRatio × (parent2 − parent1)    (11)

Therefore the new colony (the child) is a version of the previous colony that has moved closer to the imperialist's country. Revolution takes place as random changes in the positions of some of the colonies in the search space. During absorption and revolution, it is possible that a colony reaches a better position, gains the chance of controlling the whole empire, and substitutes the current imperialist of the empire. In addition, if any colony has more power than its imperialist, their positions are exchanged. The MisClass criterion is used as the cost function in the ICA: after training the SVM, the SVM's outputs for the training data are evaluated with formula (8); after the training phase, the test dataset is used for checking the SVM, and the test MisClass is calculated as well. The cost of each empire depends on the cost of its imperialist and colonies, and is calculated by formula (12):

TCn = cost(Imperialistn) + zeta × mean{cost(Colonies of empiren)}    (12)

For the imperialist competition, first the fitness of each whole empire is calculated using formula (13), and then its probability is obtained by formula (9):

ImpFitness = MaxImpTotalCost − ImpTotalCost    (13)

After calculating the probabilities, the vector P is formed as:

P = [P_P1, P_P2, …, P_PNimp]    (14)

For the distribution of colonies among the empires based on their probabilities, the roulette approach is used: the empire with the largest fitness is the most likely to be chosen. After this step, if the weakest empire has no colony left, it must itself be allocated as a colony to the other empires, based on the empires' fitness.
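Eqs. (10)–(13) can be sketched for two-parameter colonies (C, σ). The function names below mirror the notation of this section, while the numeric example and default values are invented:

```python
def absorb(parent1, parent2, heuristic_ratio=0.5, parent1_better=True):
    # Eqs. (10)-(11): move the child from the worse parent toward the better one.
    if parent1_better:
        return [p2 + heuristic_ratio * (p1 - p2) for p1, p2 in zip(parent1, parent2)]
    return [p1 + heuristic_ratio * (p2 - p1) for p1, p2 in zip(parent1, parent2)]

def empire_total_cost(imp_cost, colony_costs, zeta=0.1):
    # Eq. (12): TC_n = cost(imperialist_n) + zeta * mean(cost of its colonies)
    return imp_cost + zeta * sum(colony_costs) / len(colony_costs)

def empire_fitness(total_costs):
    # Eq. (13): fitness of an empire = max total cost - its own total cost
    worst = max(total_costs)
    return [worst - tc for tc in total_costs]

# A colony (C, sigma) pulled halfway toward its imperialist.
print(absorb([1.0, 0.5], [3.0, 1.5]))  # → [2.0, 1.0]
```

The fitness values would then be normalized as in Eq. (9) and sampled with the roulette approach.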

Fig. 12. Prediction accuracy in different dataset by changing the number of features with the approach 1.

Table 6
The complete list of the results.

Raw approach                            | Signal approach
No.  1   2   3   4   5   6   7   8      | No.  1   2   3   4   5   6   7   8
1    17  13  17  11  8   12  78  86     | 1    7   7   7   2   2   1   26  30
2    18  8   15  12  9   6   68  72     | 2    3   7   6   5   4   2   27  35
3    10  7   8   8   5   7   45  84     | 3    11  5   3   2   1   3   25  35
4    4   7   6   13  7   7   44  59     | 4    8   7   6   5   4   3   33  51
…    …   …   …   …   …   …   …   …      | …    …   …   …   …   …   …   …   …
45   1   0   0   0   0   0   1   3      | 45   30  17  18  13  9   3   90  114
46   9   8   5   6   2   3   33  52     | 46   11  5   3   4   3   2   28  30
47   12  10  9   11  3   5   50  74     | 47   34  19  12  19  5   3   92  107
48   1   3   1   1   1   1   8   11     | 48   0   0   1   1   0   0   2   2
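Under the reading that the hit rate of Tables 5 and 6 is the share of issued signals that turn out correct, it can be sketched as follows (the paper's exact bookkeeping may differ; the function name and encoding are invented):

```python
def hit_rate(signals, outcomes):
    """Share of issued signals (non-zero entries) whose direction was correct.
    Entries are +1 (buy/up), -1 (sell/down) or 0 (no action)."""
    issued = [(s, o) for s, o in zip(signals, outcomes) if s != 0]
    if not issued:
        return 0.0
    return sum(1 for s, o in issued if s == o) / len(issued)

print(hit_rate([1, 0, -1, 1], [1, 1, 1, 1]))  # 2 of the 3 issued signals correct
```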

Table 7
Optimal features and parameters with prediction accuracy in each dataset.

Raw approach                                     | Signal approach
No.  Optimal features     C     σ     Accuracy   | No.  Optimal features             C     σ     Accuracy
1    [4,10,12,8,5,7,11]   0.80  0.14  62.75      | 1    [21,24,15,22,4]              0.52  0.19  63.16
2    [7,9,12,6,1]         0.81  0.10  57.31      | 2    [11,1,21,12,3,23]            0.07  0.35  57.71
3    [12,11,10,8,13]      0.78  0.07  75.40      | 3    [24,19,22,23]                0.49  0.10  76.19
4    [4,10,7,13]          0.21  0.04  83.27      | 4    [2,17,18]                    0.92  0.03  83.67
…    …                    …     …     …          | …    …                            …     …     …
45   [8,6,9,2]            0.63  0.11  66.01      | 45   [8,13,10]                    0.37  0.02  65.22
46   [12,6,13,5]          0.66  0.05  62.00      | 46   [21,4,9,18,2,20,23]          0.98  0.28  62.40
47   [4,3,2,9,13,5]       0.46  0.20  60.96      | 47   [11,7,18,13,21,4,17,15,22]   0.01  0.68  61.75
48   [2,1,15,13,14]       0.31  0.77  47.04      | 48   [21,8,12,13,7,9,6]           0.40  0.97  45.85
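In the SVM-GA model, each chromosome encodes a feature subset together with the kernel parameters (C, σ), as reflected in Table 7. One possible encoding is sketched below; the layout, gene ranges and helper names are assumptions, not taken from the paper.

```python
import random

N_FEATURES = 15  # raw approach (Table 1); 24 for the signal approach

def random_chromosome(rng):
    # Hypothetical layout: a binary feature mask plus two real-valued genes.
    mask = [rng.random() < 0.5 for _ in range(N_FEATURES)]
    c = rng.uniform(0.01, 100.0)     # SVM penalty C
    sigma = rng.uniform(0.01, 10.0)  # RBF kernel width
    return mask, c, sigma

def decode(chromosome):
    mask, c, sigma = chromosome
    selected = [i + 1 for i, bit in enumerate(mask) if bit]  # 1-based, as in Table 7
    return selected, c, sigma

features, c, sigma = decode(random_chromosome(random.Random(42)))
```

A fitness evaluation would train the SVM on the selected features with (C, σ) and score the chromosome by the MisClass function of Eq. (8).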

This algorithm is repeated until the stopping condition is reached; at last the best answer is selected and the hit rate is calculated. The semi-code of the model is as follows:

1. Uploading the data
2. Determining the training and testing sets
3. ICA settings
4. Determining the kernel function and its parameters
5. Initializing the ICA
6. The ICA process:
   - Creating the primary population randomly
   - Defining the MisClass function as the cost function
   - Absorbing colonies
   - Revolution of colonies
   - Exchange with the best colony
   - Calculating the total cost
   - Imperialist competition
   - Extracting empires and developed colonies
   - Specifying the best empire, its position and its cost
   - Determining the outputs
7. Calculating the hit rate

4.2.2. SVM-GA

Since GA can search locally among various combinations of feature sets so that the optimization criterion is enhanced, this class of algorithms is expected to perform better on the feature selection problem. In addition, different studies describe the use of GA in feature selection. Therefore in this study GA is used to search for an optimized subset of features for the stock market prediction model. Fig. 6 depicts the evolution cycle in GA.

Apart from feature selection, this algorithm is also used to choose the optimal RBF parameters, because it can handle high-dimensional data. In this case, the parameters C and σ must be created for the SVM model.

The model built from SVM and GA includes four main steps, described below in more detail. In this model, after identifying the number of features (in the initial procedure, the number of features starts from 3), loading the data and performing the other settings of the model, such as the initial population and the stop-condition parameters (number of generations), GA starts working. To evaluate a chromosome, GA uses the probability of a mistake by the SVM, which is the same as the MisClass function of SVM-ICA, and uses

Fig. 13. Prediction accuracy in different datasets when changing the number of features with approach 2.

Fig. 14. Comparing approaches 1 and 2 in terms of the total number of signals obtained in the SVM-GA model.

Table 8
Hit rate for 1- and 6-day periods with the total number of buy and sell signals by SVM-GA.

Raw approach Signal approach
No. 1d (%) 6d (%) Sig. no. No. 1d (%) 6d (%) Sig. no.
1 0.36 0.61 102 1 0.25 0.41 150
2 0.18 0.37 113 2 0.17 0.36 148
3 0.57 0.96 72 3 0.36 0.56 131
4 0.65 0.89 81 4 0.70 0.83 100
… … … … … … … …
45 0.24 0.67 73 45 0.15 0.44 120
46 0.22 0.55 112 46 0.08 0.23 162
47 0.53 0.83 72 47 0.10 0.28 164
48 0.20 0.63 43 48 0.06 0.18 153

it as the penalty function. After running the algorithm and reaching the final stop condition, the chromosome that performed best is selected as the best answer. The algorithm is repeated for higher numbers of features, up to 15 features in the first approach and 24 features in the second approach, and an optimal solution is calculated for each. Finally, among the sub-optimal subsets, the best-performing model is chosen as the final answer, the results predicted by these features and optimal parameters are examined, and the hit rate is calculated.
Step 1: Start
In the first step, the system generates the initial population (the number of features is 3) and uses it to find the optimal parameters (i.e., features and kernel parameters). Each chromosome contains all the information needed to select the optimal features and parameters. The length of each chromosome is 2 + n, where n is the number of features and 2 is the number of parameters of the RBF. In fact, each chromosome consists of 3 parts: the first part holds the features, the second part is the parameter C, and the third is the parameter σ. The chromosome structure is shown in Fig. 7.
Step 2: Training
After creating the initial population, the system trains the SVM with the parameter values held in each chromosome and calculates the performance of that chromosome. The performance of each chromosome is calculated as the probability of wrong classification by Eq. (8). In this study, the main goal is to find optimal or near-optimal parameters so that the predictions are more accurate; the fitness function is therefore set to test the prediction accuracy.
Step 3: Genetic operation
Here, a new generation of the population is created using genetic operators such as selection, crossover, and mutation. Based on the fitness values, the best chromosomes are selected and the crossover and mutation operators are applied, with a very small mutation rate. After the new generation is created, Step 2 is run again along with the calculation of fitness values. This process is repeated until the stop condition (the pre-determined number of generations) is reached. The chromosome that shows the best performance in the last population is then taken as the final answer.

Table 9
The complete detailed results of SVM-GA.

Raw approach Signal approach

No. 1 2 3 4 5 6 7 8 No. 1 2 3 4 5 6 7 8

1 37 15 6 2 2 0 62 102 1 52 21 9 4 4 2 92 150
2 21 8 6 3 3 1 42 113 2 40 17 11 6 5 4 83 148
3 41 15 6 2 2 2 69 72 3 62 22 9 3 3 3 102 131
4 53 10 3 3 1 1 72 81 4 85 16 5 4 2 2 114 100
… … … … … … … … … … … … … … … … … …
45 18 11 8 5 4 3 49 73 45 33 18 11 7 7 6 82 120
46 25 12 6 2 1 0 62 112 46 27 14 9 7 6 5 68 162
47 38 13 5 2 1 1 60 72 47 31 15 10 8 6 6 76 164
48 9 6 4 4 3 2 27 43 48 24 12 7 6 5 4 58 153
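Given the column conventions stated in Section 5.1 (columns 1–6: correct signals over one- to six-day horizons; column 7: total correct; column 8: total emitted signals), the hit rates of Table 8 can be recovered from Table 9, at least to rounding. A quick check on dataset 1, raw approach:

```python
# Dataset 1, raw approach, Table 9: correct signals over 1..6-day horizons,
# total correct (column 7) and total emitted signals (column 8).
correct_by_day = [37, 15, 6, 2, 2, 0]
total_correct = 62      # column 7 (here equal to the sum of columns 1-6)
total_signals = 102     # column 8

one_day_rate = correct_by_day[0] / total_signals   # 37/102, about 0.36
six_day_rate = total_correct / total_signals       # 62/102, about 0.61
# Table 8 reports exactly 0.36 (1d) and 0.61 (6d) for this dataset.
```

The same relation reproduces, for example, the raw-approach rows 2 (21/113 ≈ 0.18, 42/113 ≈ 0.37) and 47 (38/72 ≈ 0.53, 60/72 ≈ 0.83) of Table 8; a few other rows differ slightly, presumably through rounding.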

Table 10
Comparing the models' hit ratios in percentage.

Total Raw approach Signal approach 1-day raw approach 1-day signal approach

Feed-forward neural network 74.2 74.8 73.6 45 43.4


SVM-ICA 76 70 79 19 27
SVM-GA 60 59 61 33 33

Step 4: Reaching the optimal solution
The algorithm is then run for higher numbers of features, and the best-performing chromosome is obtained for each number of features. From all these optimal chromosomes, the one with the best performance is selected and the hit rate is calculated.

4.3. Calculating the total number of signals and the hit rate

Performance measures can be categorized into two groups: statistical and non-statistical. Non-statistical measures cover the economic aspects. In the area of this paper, the statistical ones are more common, and the most popular is the hit rate (Atsalakis & Valavanis, 2009). The hit rate is defined as (number of successes) / (total signals). If the hit rate is higher than 51%, the model is considered useful (Lee, 2009).
At this stage, the sell and buy signals and the total number of signals are obtained from the model outputs, and the number of correct signals during a 6-day period is calculated. Since the base study is Jasemi et al. (2011a), every detail is set according to that study, and reading that paper is recommended for a better understanding.

5. Results and discussion

5.1. Experimental results of SVM-ICA

The parameters set for the ICA are as follows:

MaxDecades = 100;
NPop = 30;
NImp = 5;
NCol = nPop − nImp;
Prevolution = 0.7;
Zeta = 0.1;
HeuristicRatio = 0.5;

With the implementation of the algorithm for the Raw and Signal approaches, the optimized RBF parameters and the results for the 48 datasets are shown in Table 4. This table lists the outputs of the algorithm, including the optimal parameters (C, σ) and the achieved hit ratio (accuracy). Accuracy 1 and 2 are the hit ratios associated with the first and second approaches respectively.
As a sample, the ICA performance on the first dataset is shown in Fig. 8, and the overall results obtained from the model on the series are shown in Fig. 9. This matrix shows the prediction accuracy and MisClass of the model. The diagonal of the output matrix gives the number of correct signals, and the other elements give the numbers of target signals that have been predicted by mistake. The matrix covers the 3 classes of ascending, neutral, and descending signals predicted by the model; the totals of rows 1, 2, and 3 give the numbers of ascending, neutral, and descending signals respectively. It is to be noted that such a matrix is obtained for every dataset.
Fig. 10 shows the prediction accuracy of the SVM with the two approaches. It can clearly be seen that the accuracy of the SVM with the signal approach is higher than with the raw approach for most of the datasets. Also, the mean accuracies over the 48 datasets in the first and second approaches are 70% and 79% respectively. For a better depiction of the results, Fig. 11 shows the total number of signals in both approaches.
Table 5 shows the hit rate for periods of 1 and 6 days as well as the total numbers of buying and selling signals. Table 6 presents the complete list of results: columns 1–6 give the numbers of correct signals over one-, two-, three-, four-, five-, and six-day periods respectively, column 7 shows the total number of correct signals, and column 8 the total number of signals emitted by the model.

5.2. Experimental results of SVM-GA

In this model, the initial population size (nPop) and the number of generations used as the stop condition are set to 20 and 50 respectively. Table 7 shows the results of implementing the model on the 48 datasets with both approaches; it lists the optimal features and parameters along with the prediction accuracy. The mean prediction accuracies for approaches 1 and 2 are 63.84% and 64.57% respectively.
Figs. 12 and 13 show the prediction accuracy trends as the number of features changes across the datasets for both approaches. As is clear, the prediction accuracies of the two approaches are very close to each other, and the number of features does not seem to make a major change in the forecast accuracy. However, considering the total number of signals, the second approach appears to be better.
Finally, Fig. 14 compares the approaches in terms of the total number of signals obtained with the SVM-GA model.
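The Step 1–4 cycle of the SVM-GA model (Section 4.2.2) can be sketched as follows. This is a minimal illustration only: a toy fitness function stands in for the SVM MisClass evaluation, the population size and selection scheme are assumptions of the sketch, and the chromosome follows the described 3-part layout of n feature indicators plus the two RBF parameters C and σ.

```python
import random

random.seed(0)
N_FEATURES = 10   # illustrative; the paper goes up to 15 (approach 1) or 24 (approach 2)

def fitness(chrom):
    """Toy stand-in for the SVM MisClass evaluation (lower is better).

    chrom = n feature indicator bits + [C, sigma], the 3-part layout of Fig. 7.
    """
    mask, c, sigma = chrom[:N_FEATURES], chrom[-2], chrom[-1]
    # Pretend a mid-sized feature subset and mid-range parameters are best.
    return abs(sum(mask) - 5) + abs(c - 0.5) + abs(sigma - 0.2)

def random_chrom():
    return [random.randint(0, 1) for _ in range(N_FEATURES)] + \
           [random.random(), random.random()]

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.05):
    out = chrom[:]
    for i in range(N_FEATURES):
        if random.random() < rate:    # very small mutation rate (Step 3)
            out[i] = 1 - out[i]
    return out

# Step 1: initial population; stop condition: 50 generations (Section 5.2).
pop = [random_chrom() for _ in range(20)]
for _ in range(50):
    pop.sort(key=fitness)                 # Step 2: evaluate every chromosome
    parents = pop[:10]                    # Step 3: selection ...
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(10)]       # ... crossover and mutation
    pop = parents + children
best = min(pop, key=fitness)              # Step 4: best chromosome overall
```

In the paper this loop is rerun for each candidate number of features, and the best chromosome across all runs supplies the feature subset and (C, σ) reported in Table 7.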

As presented in Tables 8 and 9, the 6-day hit rates for approaches 1 and 2 are 59% and 61% respectively, while the 1-day hit rate is almost the same at 33%.

6. Conclusion

This study proposed the two models SVM-ICA and SVM-GA for stock market timing, in which SVM is the classifier and ICA and GA are used to optimize the SVM parameters. Here GA also selects the optimum features for better prediction. To make the comparison fair, all the details are set according to the base study, so a 6-day time period is considered for the evaluation of the proposed new models.
The overall comparison between the three models (the base study and the two new models proposed in this study) is shown in Table 10. The results show that both of the new models are reliable over 6-day periods, while SVM-ICA is better than the other one and even than the base study. The SVM-ICA hit rate is 76%, while the hit rates for the base study and SVM-GA are 74.2% and 60% respectively. It is to be noted that SVM-ICA achieved the remarkable average hit rate of 79% when the second approach is applied.
It is surprising that the 1-day hit rate of the base study remains dominant over the others under every condition, which suggests a good reason for deeper research in the field. Finally, it should be noted that the application of other methods on the basis of the conceptual model presented by Jasemi et al. (2011b), in search of better results, remains a promising area for future research.

References

Ahmadi, E., Abooie, M. H., Jasemi, M., & Zare Mehrjardi, Y. (2016). A nonlinear autoregressive model with exogenous variables neural network for stock market timing: The candlestick technical analysis. International Journal of Engineering, 29(12), 1717–1725.
Anbalagan, T., & Maheswari, S. U. (2015). Classification and prediction of stock market index based on fuzzy metagraph. Procedia Computer Science, 47, 214–221.
Ardjani, F., & Sadouni, K. (2010). Optimization of SVM multiclass by particle swarm (PSO-SVM). I.J. Modern Education and Computer Science, 2, 32–38.
Atsalakis, G. S., & Valavanis, K. P. (2009). Surveying stock market forecasting techniques–Part II: Soft computing methods. Expert Systems with Applications, 36(3), 5932–5941.
Barak, S., Heidary, J., & Dahooie, T. T. (2015). Wrapper ANFIS-ICA method to do stock market timing and feature selection on the basis of Japanese Candlestick. Expert Systems with Applications, 42(23), 9221–9235.
Barak, S., Arjmand, A., & Ortobelli, S. (2017). Fusion of multiple diverse predictors in stock market. Journal of Information Fusion, 36, 90–102.
Bartz-Beielstein, T. (2010). SPOT: An R package for automatic and interactive tuning of optimization algorithms by sequential parameter optimization. arXiv preprint arXiv:1006.4645.
Boutte, D., & Santhanam, B. (2009). A hybrid ICA-SVM approach to continuous phase modulation recognition. IEEE Signal Processing Letters, 16(5), 402–405.
Cherkassky, V., & Ma, Y. (2004). Practical selection of SVM parameters and noise estimation for SVM regression. Neural Networks, 17(1), 113–126.
Cristianini, N., & Shawe-Taylor, J. (2000). An introduction to support vector machines. Cambridge University Press.
Dahal, K., Almejalli, K., Hossain, M. A., & Chen, W. (2015). GA-based learning for rule identification in fuzzy neural networks. Applied Soft Computing, 35, 605–617.
de Campos, L. M. L., de Oliveira, R. C. L., & Roisenberg, M. (2016). Optimization of neural networks through grammatical evolution and a genetic algorithm. Expert Systems with Applications, 56, 368–384.
Fernandez-Lozano, C., Canto, C., Gestal, M., Andrade-Garda, J. M., Rabuñal, J. R., Dorado, J., et al. (2013). Hybrid model based on genetic algorithms and SVM applied to variable selection within fruit juice classification. The Scientific World Journal. Article ID 982438, 13 pages.
Hong, W. C., Dong, Y., Zheng, F., & Lai, C. Y. (2011a). Forecasting urban traffic flow by SVR with continuous ACO. Applied Mathematical Modelling, 35(3), 1282–1291.
Hong, W. C., Dong, Y., Chen, L. Y., & Wei, S. Y. (2011b). SVR with hybrid chaotic genetic algorithms for tourism demand forecasting. Applied Soft Computing, 11(2), 1881–1890.
Huang, C. J., Yang, D. X., & Chuang, Y. T. (2008). Application of wrapper approach and composite classifier to the stock trend prediction. Expert Systems with Applications, 34(4), 2870–2878.
Huang, W., Nakamori, Y., & Wang, S. Y. (2005). Forecasting stock market movement direction with support vector machine. Computers & Operations Research, 32(10), 2513–2522.
Jasemi, M., Kimiagari, A. M., & Memariani, A. (2011a). A conceptual model for portfolio management sensitive to mass psychology of market. International Journal of Industrial Engineering-Theory Application and Practice, 18(1), 1–15.
Jasemi, M., Kimiagari, A. M., & Memariani, A. (2011b). A modern neural network model to do stock market timing on the basis of the ancient investment technique of Japanese Candlestick. Expert Systems with Applications, 38(4), 3884–3890.
Kuo, S. C., Lin, C. J., & Liao, J. R. (2011). 3D reconstruction and face recognition using kernel-based ICA and neural networks. Expert Systems with Applications, 38(5), 5406–5415.
Lan, Q., Zhang, D., & Xiong, L. (2011). Reversal pattern discovery in financial time series based on fuzzy candlestick lines. Systems Engineering Procedia, 2, 182–190.
Lee, K. H., & Jo, G. S. (1999). Expert system for predicting stock market timing using a candlestick chart. Expert Systems with Applications, 16(4), 357–364.
Lee, M. C. (2009). Using support vector machine with a hybrid feature selection method to the stock trend prediction. Expert Systems with Applications, 36(8), 10896–10904.
Majhi, B., Rout, M., & Baghel, V. (2014). On the development and performance evaluation of a multiobjective GA-based RBF adaptive model for the prediction of stock indices. Journal of King Saud University-Computer and Information Sciences, 26(3), 319–331.
Pai, P. F., & Hong, W. C. (2005). An improved neural network model in forecasting arrivals. Annals of Tourism Research, 32(4), 1138–1141.
Pai, P. F., & Hong, W. C. (2006). Software reliability forecasting by support vector machines with simulated annealing algorithms. Journal of Systems and Software, 79(6), 747–755.
Sahin, U., & Ozbayoglu, A. M. (2014). TN-RSI: Trend-normalized RSI indicator for stock trading systems with evolutionary computation. Procedia Computer Science, 36, 240–245.
Sankar, C. P., Vidyaraj, R., & Kumar, K. S. (2015). Trust based stock recommendation system–A social network analysis approach. Procedia Computer Science, 46, 299–305.
Tay, F. E., & Cao, L. (2001). Application of support vector machines in financial time series forecasting. Omega, 29(4), 309–317.
Vapnik, V. (1995). The nature of statistical learning theory. New York: Springer-Verlag.
Vapnik, V. N. (1998). Statistical learning theory. New York: John Wiley & Sons.
Wang, W., Xu, Z., & Weizhen Lu, J. (2003). Three improved neural network models for air quality forecasting. Engineering Computations, 20(2), 192–210.
Xie, H., Zhao, X., & Wang, S. (2012). A comprehensive look at the predictive information in Japanese candlestick. Procedia Computer Science, 9, 1219–1227.
