A multiple-kernel support vector regression approach for stock market price forecasting
Keywords: Stock market forecasting; Support vector regression; Multiple-kernel learning; SMO; Gradient projection

Abstract: Support vector regression has been applied to stock market forecasting problems. However, it is usually necessary to tune the hyperparameters of the kernel functions manually. Multiple-kernel learning was developed to deal with this problem, by which the kernel matrix weights and Lagrange multipliers can be derived simultaneously through semidefinite programming. However, the amount of time and space required is very demanding. We develop a two-stage multiple-kernel learning algorithm by incorporating sequential minimal optimization and the gradient projection method. By this algorithm, advantages from different hyperparameter settings can be combined and overall system performance can be improved. Besides, the user need not specify the hyperparameter settings in advance, and trial-and-error for determining appropriate hyperparameter settings can then be avoided. Experimental results, obtained by running on datasets taken from the Taiwan Capitalization Weighted Stock Index, show that our method performs better than other methods.

© 2010 Elsevier Ltd. All rights reserved.
This work was supported by the National Science Council under grant NSC 95-2221-E-110-055-MY2.
Corresponding author: S.-J. Lee (e-mail: [email protected]).

1. Introduction

Accurate forecasting of stock prices is an appealing yet difficult activity in the modern business world. Many factors influence the behavior of the stock market, both economic and non-economic. Therefore, stock market forecasting is regarded as one of the most challenging topics in business. In the past, methods based on statistics were proposed for tackling this problem, such as the autoregressive (AR) model (Champernowne, 1948), the autoregressive moving average (ARMA) model (Box & Jenkins, 1994), and the autoregressive integrated moving average (ARIMA) model (Box & Jenkins, 1994). These are linear models which are, more often than not, inadequate for stock market forecasting, since stock time series are inherently noisy and non-stationary. Recently, nonlinear approaches have been proposed, such as autoregressive conditional heteroskedasticity (ARCH) (Engle, 1982), generalized autoregressive conditional heteroskedasticity (GARCH) (Bollerslev, 1986), artificial neural networks (ANN) (Hansen & Nelson, 1997; Kim & Han, 2008; Kwon & Moon, 2007; Qi & Zhang, 2008; Zhang & Zhou, 2004), fuzzy neural networks (FNN) (Chang & Liu, 2008; Oh, Pedrycz, & Park, 2006; Zarandi, Rezaee, Turksen, & Neshat, 2009), and support vector regression (SVR) (Cao & Tay, 2001, 2003; Fernando, Julio, & Javier, 2003; Gestel et al., 2001; Pai & Lin, 2005; Tay & Cao, 2001; Valeriy & Supriya, 2006; Yang, Chan, & King, 2002).

ANN has been widely used for modeling stock market time series due to its universal approximation property (Kecman, 2001). Previous researchers indicated that ANN, which implements the empirical risk minimization principle, outperforms traditional statistical models (Hansen & Nelson, 1997). However, ANN suffers from local minimum traps and from the difficulty of determining the hidden layer size and the learning rate. On the contrary, SVR, proposed by Vapnik and his co-workers, has a global optimum and exhibits better prediction accuracy due to its implementation of the structural risk minimization principle, which considers both the training error and the capacity of the regression model (Cristianini & Shawe-Taylor, 2000; Vapnik, 1995). However, the practitioner has to determine in advance the type of kernel function and the associated kernel hyperparameters for SVR. Unsuitably chosen kernel functions or hyperparameter settings may lead to significantly poor performance (Chapelle, Vapnik, Bousquet, & Mukherjee, 2002; Duan, Keerthi, & Poo, 2003; Kwok, 2000). Most researchers use trial-and-error to choose proper values for the hyperparameters, which obviously takes a lot of effort. In addition, using a single kernel may not be sufficient to solve a complex problem satisfactorily, especially for stock market forecasting problems. Several researchers have adopted multiple kernels to deal with these problems (Bach, Lanckriet, & Jordan, 2004; Bennett, Momma, & Embrechts, 2002; Crammer, Keshet, & Singer, 2003; Gönen et al., 2008; Lanckriet, Cristianini, Bartlett, Ghaoui, & Jordan, 2004; Ong, Smola, & Williamson, 2005; Rakotomamonjy, Bach, Canu, &
Grandvalet, 2007, 2008; Sonnenburg, Ratsch, Schäfer, & Schölkopf, 2006; Szafranski, Grandvalet, & Rakotomamonjy, 2008; Tsang & Kwok, 2006; Wang, Chen, & Sun, 2008).

The simplest way to combine multiple kernels is by averaging them. But each kernel having the same weight may not be appropriate for the decision process, and therefore the main issue concerning multiple-kernel combination is to determine optimal weights for the participating kernels. Lanckriet et al. (2004) used a linear combination of matrices to combine multiple kernels. They transformed the optimization problem into a semidefinite programming (SDP) problem, which, being convex, has a global optimum. However, the amount of time and space required by this method is demanding. Other multiple-kernel learning algorithms include Bach et al. (2004), Sonnenburg et al. (2006), Rakotomamonjy et al. (2007), Rakotomamonjy, Bach, Canu, and Grandvalet (2008), Szafranski et al. (2008), and Gönen et al. (2008). These approaches deal with large-scale problems by iteratively using the sequential minimal optimization (SMO) algorithm (Platt, 1999) to update Lagrange multipliers and kernel weights in turn, i.e., Lagrange multipliers are updated with fixed kernel weights and kernel weights are updated with fixed Lagrange multipliers, alternately. Although these methods are faster than SDP, they are likely to suffer from local minimum traps. Multiple-kernel learning based on hyperkernels has also been studied (Ong et al., 2005; Tsang & Kwok, 2006). Tsang and Kwok (2006) reformulated the problem as a second-order cone programming (SOCP) form. Crammer et al. (2003) and Bennett et al. (2002) used boosting methods to combine heterogeneous kernel matrices.

We propose a regression model, which integrates multiple-kernel learning and SVR, to deal with the stock price forecasting problem. A two-stage multiple-kernel learning algorithm is developed to optimally combine multiple-kernel matrices for SVR. This learning algorithm applies SMO (Platt, 1999) and the gradient projection method (Bertsekas, 1999) iteratively to obtain Lagrange multipliers and optimal kernel weights. By this algorithm, advantages from different hyperparameter settings can be combined and overall system performance can be improved. Besides, the user need not specify the hyperparameter settings in advance, and trial-and-error for determining appropriate hyperparameter settings can then be avoided.

2. Support vector regression

In ε-SVR, training patterns whose images fall within the ε-insensitive band of the regression function are not penalized. Instead, the images which lie outside the ε-insensitive band are penalized, and slack variables are introduced to account for these images. For convenience, in the sequel, the term SVR is used to stand for ε-SVR. The objective function and constraints for SVR are

$$\begin{aligned}
\min_{\mathbf{w},b}\quad & \frac{1}{2}\langle\mathbf{w},\mathbf{w}\rangle + C\sum_{i=1}^{l}(\xi_i+\hat{\xi}_i),\\
\text{s.t.}\quad & (\langle\mathbf{w},\phi(\mathbf{x}_i)\rangle+b)-y_i \le \varepsilon+\xi_i,\\
& y_i-(\langle\mathbf{w},\phi(\mathbf{x}_i)\rangle+b) \le \varepsilon+\hat{\xi}_i,\\
& \xi_i,\hat{\xi}_i \ge 0,\quad i=1,\ldots,l,
\end{aligned}\tag{1}$$

where l is the number of training patterns, C is a parameter which gives a tradeoff between model complexity and training error, and $\xi_i$ and $\hat{\xi}_i$ are slack variables for exceeding the target value by more than ε and for being below the target value by more than ε, respectively. Note that $\phi:\mathcal{X}\to\mathcal{F}$ is a possibly nonlinear mapping function from the input space to a feature space $\mathcal{F}$. Also, $\langle\cdot,\cdot\rangle$ indicates the inner product of the involved arguments. The regression hyperplane to be derived is

$$f(\mathbf{x})=\langle\mathbf{w},\phi(\mathbf{x})\rangle+b,\tag{2}$$

where w and b are the weight vector and the offset, respectively.

To solve Eq. (1), one can introduce the Lagrangian, take partial derivatives with respect to the primal variables and set the resulting derivatives to zero, and turn the Lagrangian into the following Wolfe dual form:

$$\begin{aligned}
\max_{\boldsymbol{\alpha},\hat{\boldsymbol{\alpha}}}\quad & \sum_{i=1}^{l}y_i(\hat{\alpha}_i-\alpha_i)-\varepsilon\sum_{i=1}^{l}(\hat{\alpha}_i+\alpha_i)-\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}(\hat{\alpha}_i-\alpha_i)(\hat{\alpha}_j-\alpha_j)K(\mathbf{x}_i,\mathbf{x}_j),\\
\text{s.t.}\quad & \sum_{i=1}^{l}(\hat{\alpha}_i-\alpha_i)=0,\\
& C \ge \alpha_i,\hat{\alpha}_i \ge 0,\quad i=1,\ldots,l,
\end{aligned}\tag{3}$$

where $K(\mathbf{x}_i,\mathbf{x}_j)=\langle\phi(\mathbf{x}_i),\phi(\mathbf{x}_j)\rangle$ is the kernel function.
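A single-kernel SVR of this form is straightforward to try out in practice. The snippet below is a minimal illustration rather than the authors' implementation: it assumes scikit-learn, whose SVR is backed by an SMO-type (libsvm) solver of the dual in Eq. (3). The values C = 1 and ε = 0.001 mirror the settings used later in Experiment I, while the toy data and the choice γ = 0.1 for the RBF kernel are arbitrary.

```python
import numpy as np
from sklearn.svm import SVR

# Toy regression data standing in for the stock features of Section 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

# A single RBF kernel: C, epsilon and gamma must all be chosen in advance.
model = SVR(kernel="rbf", C=1.0, epsilon=0.001, gamma=0.1)
model.fit(X[:150], y[:150])
y_pred = model.predict(X[150:])
```

Choosing γ by hand is exactly the trial-and-error step that the multiple-kernel formulation of the next section is designed to avoid.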
3. Multiple-kernel support vector regression

In the multiple-kernel setting, the optimization problem becomes

$$\begin{aligned}
\min_{\mathbf{w},b,\boldsymbol{\mu}}\quad & \frac{1}{2}\langle\mathbf{w},\mathbf{w}\rangle + C\sum_{i=1}^{l}(\xi_i+\hat{\xi}_i),\\
\text{s.t.}\quad & (\langle\mathbf{w},\Phi(\mathbf{x}_i)\rangle+b)-y_i \le \varepsilon+\xi_i,\\
& y_i-(\langle\mathbf{w},\Phi(\mathbf{x}_i)\rangle+b) \le \varepsilon+\hat{\xi}_i,\\
& \xi_i,\hat{\xi}_i \ge 0,\quad i=1,\ldots,l,\\
& \mu_s \ge 0,\quad s=1,\ldots,M,\\
& \sum_{s=1}^{M}\mu_s=1,
\end{aligned}\tag{8}$$

where Φ is the vector of function mappings of Eq. (7).

By introducing the Lagrangian, as usual, Eq. (8) can be converted to the following Wolfe dual form:

$$\begin{aligned}
\min_{\boldsymbol{\mu}}\ \max_{\boldsymbol{\alpha},\hat{\boldsymbol{\alpha}}}\quad & \sum_{i=1}^{l}y_i(\hat{\alpha}_i-\alpha_i)-\varepsilon\sum_{i=1}^{l}(\hat{\alpha}_i+\alpha_i)-\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}(\hat{\alpha}_i-\alpha_i)(\hat{\alpha}_j-\alpha_j)\widetilde{K}(\mathbf{x}_i,\mathbf{x}_j),\\
\text{s.t.}\quad & \sum_{i=1}^{l}(\hat{\alpha}_i-\alpha_i)=0,\\
& C \ge \alpha_i,\hat{\alpha}_i \ge 0,\quad i=1,\ldots,l,\\
& \mu_s \ge 0,\quad s=1,\ldots,M,\\
& \sum_{s=1}^{M}\mu_s=1,
\end{aligned}\tag{9}$$

where

$$\widetilde{K}(\mathbf{x}_i,\mathbf{x}_j)=\langle\Phi(\mathbf{x}_i),\Phi(\mathbf{x}_j)\rangle
=\mu_1\langle\phi_1(\mathbf{x}_i),\phi_1(\mathbf{x}_j)\rangle+\mu_2\langle\phi_2(\mathbf{x}_i),\phi_2(\mathbf{x}_j)\rangle+\cdots+\mu_M\langle\phi_M(\mathbf{x}_i),\phi_M(\mathbf{x}_j)\rangle
=\mu_1 K_1(\mathbf{x}_i,\mathbf{x}_j)+\mu_2 K_2(\mathbf{x}_i,\mathbf{x}_j)+\cdots+\mu_M K_M(\mathbf{x}_i,\mathbf{x}_j)
=\sum_{s=1}^{M}\mu_s K_s(\mathbf{x}_i,\mathbf{x}_j)\tag{10}$$

is a weighted sum of M kernel functions $K_1, K_2, \ldots, K_M$, corresponding to mapping functions $\phi_1, \phi_2, \ldots, \phi_M$, respectively.

Now, if we keep the kernel weights μ fixed, the inner maximization of Eq. (9) becomes a standard SVR dual in the combined kernel $\widetilde{K}$; this is the first stage of the proposed two-stage learning algorithm. This equation is, obviously, identical in form to Eq. (3) and can be solved by SMO (Platt, 1999). In the second stage, the Lagrange multipliers are kept fixed, and the weight vector μ is updated by the gradient projection method (Bertsekas, 1999). Since SMO is a standard algorithm for solving the Wolfe dual form, we won't describe it here; a detailed description of SMO can be found in Platt (1999). In the following, we describe how gradient projection is applied to obtain the optimal μ in the second stage.

Since the Lagrange multipliers are considered as known in the second stage, Eq. (9) can be rewritten as follows:

$$\min_{\boldsymbol{\mu}}\ J(\boldsymbol{\mu}),\qquad \text{s.t.}\ \mu_s \ge 0,\ s=1,\ldots,M,\qquad \sum_{s=1}^{M}\mu_s=1,\tag{13}$$

where

$$J(\boldsymbol{\mu})=\sum_{i=1}^{l}y_i(\hat{\alpha}_i-\alpha_i)-\varepsilon\sum_{i=1}^{l}(\hat{\alpha}_i+\alpha_i)-\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}(\hat{\alpha}_i-\alpha_i)(\hat{\alpha}_j-\alpha_j)\sum_{s=1}^{M}\mu_s K_s(\mathbf{x}_i,\mathbf{x}_j).\tag{14}$$

Note that J(μ) only depends on μ. By gradient projection (Bertsekas, 1999), we have

$$\boldsymbol{\mu}^{k+1}=\boldsymbol{\mu}^{k}+\eta^{k}\bigl(\hat{\boldsymbol{\mu}}^{k}-\boldsymbol{\mu}^{k}\bigr),\tag{15}$$

where $\boldsymbol{\mu}^{k}$ is the weight vector of the kth iteration, $0<\eta^{k}\le 1$ is the step size, and $\hat{\boldsymbol{\mu}}^{k}$ is defined as

$$\hat{\boldsymbol{\mu}}^{k}=\begin{cases}\mathbf{z}, & \text{if } \mathbf{z} \text{ belongs to the feasible region},\\ \mathbf{z}^{\perp}, & \text{otherwise},\end{cases}\tag{16}$$

$$\mathbf{z}=\boldsymbol{\mu}^{k}-s^{k}\,\nabla J(\boldsymbol{\mu}^{k}),\tag{17}$$

where $s^{k}$ is a positive scalar, and $\mathbf{z}^{\perp}$ denotes the projection of z onto the feasible region. The feasible region contains all the vectors
$\mathbf{v}=[v_1, v_2, \ldots, v_M]$ such that $v_s \ge 0$, $1 \le s \le M$, and $\sum_{s=1}^{M}v_s=1$. $\nabla J(\boldsymbol{\mu}^{k})$ is the following gradient:

$$\bigl[\nabla J(\boldsymbol{\mu}^{k})\bigr]_s=\frac{\partial J}{\partial\mu_s}\bigg|_{\mu_s=\mu_s^{k}}=-\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}(\hat{\alpha}_i-\alpha_i)(\hat{\alpha}_j-\alpha_j)K_s(\mathbf{x}_i,\mathbf{x}_j)\tag{18}$$

for s = 1, …, M.

The projection $\mathbf{z}^{\perp}$ of z onto the feasible region can be obtained by solving the following constrained problem:

$$\min_{\mathbf{z}^{\perp}}\ \|\mathbf{z}-\mathbf{z}^{\perp}\|^{2},\qquad \text{s.t. } \mathbf{z}^{\perp}\text{ lies in the feasible region.}$$

[Figure (flowchart of the learning algorithm). Second stage: compute $\nabla J(\boldsymbol{\mu}^{k})$ by Eq. (18) and z by Eq. (17); if z does not belong to the feasible region, compute $[\mathbf{z}]^{\perp}$ by Eq. (20); obtain $\hat{\boldsymbol{\mu}}^{k}$ by Eq. (16); choose the step size by the Armijo rule, Eq. (21); update $\boldsymbol{\mu}^{k+1}$ by Eq. (15); set k = k + 1 and repeat until the stopping criterion is met, then stop and output α, $\hat{\boldsymbol{\alpha}}$, and μ.]
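The second stage is easy to prototype once the base kernel matrices are available. The sketch below is an illustration under stated assumptions, not the authors' implementation: scikit-learn's precomputed-kernel SVR (an SMO-type libsvm solver) plays the role of the first stage, the gradient comes from Eq. (18), and, because Eq. (20) and the Armijo rule of Eq. (21) are not part of this excerpt, a standard sorting-based Euclidean projection onto the simplex and a fixed step size are substituted. All function and variable names (two_stage_mksvr, project_to_simplex, and so on) are ours.

```python
import numpy as np
from sklearn.svm import SVR


def project_to_simplex(z):
    """Euclidean projection of z onto {v : v >= 0, sum(v) = 1}.

    Stand-in for the paper's Eq. (20); the sorting-based rule below is a
    standard way to realize the same feasible region.
    """
    u = np.sort(z)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(z) + 1) > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(z - theta, 0.0)


def two_stage_mksvr(kernels, y, C=1.0, epsilon=0.001, n_iter=20,
                    step=1.0, eta=1.0):
    """Alternate SMO (stage 1) and gradient projection (stage 2).

    kernels: list of M precomputed l-by-l Gram matrices K_s.
    """
    M, l = len(kernels), len(y)
    mu = np.full(M, 1.0 / M)                      # uniform initial weights
    for _ in range(n_iter):
        # Stage 1: fix mu, solve the SVR dual for the combined kernel of Eq. (10).
        K_combined = sum(m * K for m, K in zip(mu, kernels))
        svr = SVR(kernel="precomputed", C=C, epsilon=epsilon)
        svr.fit(K_combined, y)
        beta = np.zeros(l)                        # beta_i = alpha_hat_i - alpha_i
        beta[svr.support_] = svr.dual_coef_.ravel()
        # Stage 2: fix the multipliers, update mu.
        # Eq. (18): [grad J]_s = -0.5 * beta^T K_s beta (the sign convention of
        # dual_coef_ does not matter in this quadratic form).
        grad = np.array([-0.5 * beta @ K @ beta for K in kernels])
        z = mu - step * grad                      # Eq. (17)
        feasible = (z >= 0.0).all() and np.isclose(z.sum(), 1.0)
        mu_hat = z if feasible else project_to_simplex(z)   # Eq. (16)
        mu = mu + eta * (mu_hat - mu)             # Eq. (15)
    return mu, svr
```

A caller builds the list of Gram matrices once, passes it in together with the training targets, and afterwards predicts with the returned SVR on the test-versus-training kernels weighted by the learned μ.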
4. Experimental results

To test the forecasting performance of our proposed method, we have conducted three experiments on datasets taken from the Taiwan Capitalization Weighted Stock Index (TAIEX). We also compare the performance of our proposed method with that of other methods, i.e., single-kernel support vector regression (SKSVR) (Tay & Cao, 2001), the autoregressive integrated moving average (ARIMA) model (Box & Jenkins, 1994), and the TSK-type fuzzy neural network (FNN) (Chang & Liu, 2008). For convenience, we abbreviate our multiple-kernel support vector regression method as MKSVR.

4.1. Experiment I

Table 1
The datasets for Experiment I.

Datasets | Training          | Validating        | Testing
DS-I     | 2002/10 – 2004/09 | 2004/10 – 2004/12 | 2005/01 – 2005/03
DS-II    | 2003/01 – 2004/12 | 2005/01 – 2005/03 | 2005/04 – 2005/06
DS-III   | 2003/04 – 2005/03 | 2005/04 – 2005/06 | 2005/07 – 2005/09
DS-IV    | 2003/07 – 2005/06 | 2005/07 – 2005/09 | 2005/10 – 2005/12

The time periods of the four datasets used in this experiment are listed in Table 1. The input features are derived from the daily closing prices $p_t$. The difference between the closing price and its n-day exponential moving average (EMA) is

$$\widehat{\mathrm{EMA}}_n(t)=p_t-\mathrm{EMA}_n(t),\tag{24}$$

and a lagged relative difference in percentage of price (RDP) is defined as

$$\mathrm{RDP}_n(t)=\frac{p_t-p_{t-n}}{p_{t-n}}\times 100.\tag{25}$$

Then the input variables are defined as $x_{t,1}=\widehat{\mathrm{EMA}}_{15}(t-5)$, $x_{t,2}=\mathrm{RDP}_5(t-5)$, $x_{t,3}=\mathrm{RDP}_{10}(t-5)$, $x_{t,4}=\mathrm{RDP}_{15}(t-5)$, and $x_{t,5}=\mathrm{RDP}_{20}(t-5)$. The root mean squared error (RMSE) is adopted for performance comparison, and is defined as follows:

$$\mathrm{RMSE}=\sqrt{\frac{1}{T}\sum_{t=1}^{T}(y_t-\hat{y}_t)^2},\tag{26}$$

where $y_t$ and $\hat{y}_t$ are the desired output and the predicted output, respectively.
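The feature construction of Eqs. (24)–(26) is compact enough to state in code. The sketch below assumes NumPy and pandas and is illustrative only: the exact definition of EMA_n used by the authors is not part of this excerpt, so pandas' exponentially weighted mean with span = n is used as a stand-in, and the target variable of Experiment I (also not in the excerpt) is left to the caller.

```python
import numpy as np
import pandas as pd


def make_features(prices: pd.Series) -> pd.DataFrame:
    """Inputs x_{t,1}, ..., x_{t,5} built from Eqs. (24)-(25)."""
    ema15 = prices.ewm(span=15).mean()            # stand-in for EMA_15(t)
    ema_hat15 = prices - ema15                    # Eq. (24)
    rdp = {n: 100.0 * (prices - prices.shift(n)) / prices.shift(n)  # Eq. (25)
           for n in (5, 10, 15, 20)}
    return pd.DataFrame({
        "x1": ema_hat15.shift(5),                 # EMA-hat_15(t-5)
        "x2": rdp[5].shift(5),                    # RDP_5(t-5)
        "x3": rdp[10].shift(5),
        "x4": rdp[15].shift(5),
        "x5": rdp[20].shift(5),
    })


def rmse(y_true, y_pred):
    """Eq. (26)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```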
For SKSVR, there are three parameters that have to be determined in advance when using the RBF kernel, i.e., C, ε, and γ. We examine the forecasting performance of SKSVR with C = 1 and ε = 0.001. Besides, we try 37 different settings of the hyperparameter γ: from 0.01 to 0.09 with a stepping factor of 0.01, from 0.1 to 0.9 with a stepping factor of 0.1, from 1 to 9 with a stepping factor of 1, and from 10 to 100 with a stepping factor of 10.

[Fig. 3. Forecasting performance of SKSVR with different hyperparameters in Experiment I.]

From Fig. 3, for DS-I and DS-II, the best performance occurs when 0.1 ≤ γ ≤ 0.5; for DS-III, the best performance occurs when 50 ≤ γ ≤ 100; for DS-IV, the best performance occurs when 0.01 ≤ γ ≤ 0.05. The best RMSE values obtained by SKSVR are listed in Table 2.

Table 2
Performance comparison between best SKSVR and MKSVR in Experiment I (RMSE).

Methods | DS-I  | DS-II | DS-III | DS-IV
SKSVR   | 0.170 | 0.179 | 0.188  | 0.234
MKSVR   | 0.161 | 0.174 | 0.179  | 0.219

For multiple-kernel learning, a kernel combining all 37 different RBF kernels is considered, i.e., γ ∈ {0.01, 0.02, …, 0.09, 0.1, 0.2, …, 0.9, 1, 2, …, 9, 10, 20, …, 100}. Therefore, the combined kernel is a weighted sum of M = 37 RBF kernel matrices, with the weights learned by the two-stage algorithm. The RMSE values obtained by MKSVR are also listed in Table 2; MKSVR outperforms the best SKSVR on every dataset without any trial-and-error on γ.
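Building this 37-kernel dictionary is a one-liner per kernel. The sketch below assumes scikit-learn's rbf_kernel; the helper name build_kernel_dictionary is ours, and the resulting list is exactly what a routine such as the two-stage sketch of Section 3 would consume.

```python
from sklearn.metrics.pairwise import rbf_kernel

# The 37 gamma values used for the base RBF kernels in the experiments.
GAMMAS = ([0.01 * i for i in range(1, 10)]        # 0.01, ..., 0.09
          + [0.1 * i for i in range(1, 10)]       # 0.1, ..., 0.9
          + [float(i) for i in range(1, 10)]      # 1, ..., 9
          + [10.0 * i for i in range(1, 11)])     # 10, ..., 100
assert len(GAMMAS) == 37


def build_kernel_dictionary(X):
    """One RBF Gram matrix per gamma; their weighted sum realizes Eq. (10)."""
    return [rbf_kernel(X, gamma=g) for g in GAMMAS]
```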
4.2. Experiment II

In this experiment, we compare the performance of MKSVR with that of ARIMA (Box & Jenkins, 1994). The daily stock closing prices of TAIEX for the period of January 2004 to December 2005
are used. The one-season moving-window testing approach used in Pai and Lin (2005) for generating the training/validating/testing data is adopted, and four datasets, DS-V to DS-VIII, are obtained. For instance, DS-V contains the daily stock closing prices from January 2004 to September 2004 selected as the training dataset, the daily stock closing prices from October 2004 to December 2004 selected as the validating dataset, and the daily stock closing prices from January 2005 to March 2005 selected as the testing dataset. The corresponding time periods for DS-V to DS-VIII are listed in Table 3.

Table 3
The datasets for Experiment II and Experiment III.

Datasets | Training          | Validating        | Testing
DS-V     | 2004/01 – 2004/09 | 2004/10 – 2004/12 | 2005/01 – 2005/03
DS-VI    | 2004/04 – 2004/12 | 2005/01 – 2005/03 | 2005/04 – 2005/06
DS-VII   | 2004/07 – 2005/03 | 2005/04 – 2005/06 | 2005/07 – 2005/09
DS-VIII  | 2004/10 – 2005/06 | 2005/07 – 2005/09 | 2005/10 – 2005/12

Given the original daily stock closing prices $p=\{p_1, p_2, \ldots, p_t, \ldots\}$, we follow Box and Jenkins (1994) to derive training patterns $(\mathbf{x}_t, y_t)$ for this experiment. Firstly, the natural logarithmic transformation is applied to the original daily stock closing prices, resulting in another time series $y'=\{y'_1, y'_2, \ldots, y'_t, \ldots\}$ where $y'_t=\ln(p_t)$. The output sequence is $y=\{y_1, y_2, \ldots, y_t, \ldots\}$ where $y_t$ is defined by

$$y_t=y'_t-y'_{t-1}.\tag{27}$$

The notation ARIMA (m, o, n) is used. Each input vector consists of (m + n) components, i.e., $\mathbf{x}_t=[x_{t,1}\ x_{t,2}\ \ldots\ x_{t,m+n}]$. The values of the components depend on the model used. For instance, for ARIMA (2, 1, 3) we have $x_{t,1}=y_{t-1}$, $x_{t,2}=y_{t-2}$, $x_{t,3}=\epsilon_{t-1}$, $x_{t,4}=\epsilon_{t-2}$, and $x_{t,5}=\epsilon_{t-3}$, where $\epsilon_{t-1}$, $\epsilon_{t-2}$, and $\epsilon_{t-3}$ are forecast errors. The RMSE is defined as follows:

$$\mathrm{RMSE}=\sqrt{\frac{1}{T}\sum_{t=1}^{T}(p_t-\hat{p}_t)^2}=\sqrt{\frac{1}{T}\sum_{t=1}^{T}\bigl(\exp(y'_t)-\exp(\hat{y}'_t)\bigr)^2}=\sqrt{\frac{1}{T}\sum_{t=1}^{T}\bigl(\exp(y'_t)-\exp(\hat{y}_t+y'_{t-1})\bigr)^2},\tag{28}$$

where $\hat{y}_t$ is the predicted output obtained from the predictor.
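The log-difference preprocessing and the price-scale error of Eq. (28) can be sketched in a few lines of NumPy. This is an illustration only; the indexing convention (each $\hat{y}_t$ is produced with $y'_{t-1}$ known) follows the third form of Eq. (28), and the function names are ours.

```python
import numpy as np


def log_diff_targets(prices):
    """y'_t = ln(p_t) and y_t = y'_t - y'_{t-1}, as in Eq. (27)."""
    y_log = np.log(np.asarray(prices, dtype=float))
    return y_log, np.diff(y_log)


def rmse_on_price_scale(prices, y_hat):
    """Eq. (28): undo the log-difference transform, then compare prices.

    y_hat must align with prices[1:]; each entry is a predicted difference
    y_t made with the previous log price y'_{t-1} known, so the predicted
    price is exp(y_hat + y'_{t-1}).
    """
    y_log = np.log(np.asarray(prices, dtype=float))
    p_true = np.asarray(prices, dtype=float)[1:]
    p_hat = np.exp(np.asarray(y_hat, dtype=float) + y_log[:-1])
    return float(np.sqrt(np.mean((p_true - p_hat) ** 2)))
```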
To compare ARIMA and MKSVR, we consider 25 models, which are ARIMA (m, 1, n) with m ∈ {1, 2, 3, 4, 5} and n ∈ {1, 2, 3, 4, 5}. The forecasting performance obtained by ARIMA on the four datasets is shown in Fig. 5. Interestingly, little variation occurs among different parameter settings with ARIMA. We run SKSVR and MKSVR on these four datasets, with the same settings as in Experiment I. The forecasting performance obtained by SKSVR is shown in Fig. 6. From this figure, we can see that SKSVR requires different γ settings for different datasets to obtain good performance.

[Fig. 6. Forecasting performance of SKSVR with different hyperparameters in Experiment II.]

The best RMSE values obtained by ARIMA and SKSVR are listed in Table 4. The RMSE values obtained by MKSVR for these datasets are also listed in Table 4. Obviously, MKSVR can do as well as, or even better than, the best ARIMA and SKSVR for each dataset. However, we don't need to worry about the hyperparameter settings in MKSVR. Fig. 7 shows the forecasting results for datasets DS-V to DS-VIII by MKSVR.

Table 4
Performance comparison among best ARIMA, best SKSVR, and MKSVR in Experiment II (RMSE).

Methods | DS-V   | DS-VI  | DS-VII | DS-VIII
ARIMA   | 45.421 | 48.400 | 45.674 | 56.957
SKSVR   | 45.686 | 48.667 | 46.401 | 55.294
MKSVR   | 45.634 | 47.297 | 44.142 | 54.882

[Fig. 7. Forecasting results of MKSVR on DS-V to DS-VIII: desired output vs. MKSVR prediction over each testing period.]
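For reference, the ARIMA (m, 1, n) grid of this experiment can be reproduced with statsmodels. The sketch below is not the authors' code: it fits each order on the training portion of the log prices, forecasts the held-out block in one shot (the paper's exact forecasting protocol is not spelled out in this excerpt), and scores on the price scale as in Eq. (28).

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA


def best_arima(log_prices, n_test):
    """Search ARIMA(m, 1, n), m, n in {1, ..., 5}, by held-out price RMSE."""
    train, test = log_prices[:-n_test], log_prices[-n_test:]
    best_order, best_rmse = None, np.inf
    for m in range(1, 6):
        for n in range(1, 6):
            try:
                fit = ARIMA(train, order=(m, 1, n)).fit()
                pred = fit.forecast(steps=n_test)
                rmse = float(np.sqrt(np.mean((np.exp(test) - np.exp(pred)) ** 2)))
            except Exception:
                continue                      # some orders may fail to converge
            if rmse < best_rmse:
                best_order, best_rmse = (m, 1, n), rmse
    return best_order, best_rmse
```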
4.3. Experiment III
In this experiment, we compare the performance of MKSVR with that of FNN (Chang & Liu, 2008). We use the same datasets as in Experiment II, as listed in Table 3. Given the original daily stock closing prices $p=\{p_1, p_2, \ldots, p_t, \ldots\}$, we follow Chang and Liu (2008) to derive training patterns $(\mathbf{x}_t, y_t)$ for this experiment. Let $y'_t$ be $p_t$, i.e., $y'_t=p_t$. Two technical indices, SMA and BIAS, are used in generating the input vector $\mathbf{x}_t$. SMA, short for simple moving average, is used to emphasize the direction of a trend and to smooth out price and volume fluctuations. The n-day SMA of the tth day is defined as follows:

$$\mathrm{SMA}_n(t)=\frac{\sum_{i=t-5}^{t}p_i}{n}.\tag{29}$$

BIAS is used to observe the difference between the closing price and the moving average line. The n-day BIAS of the tth day is defined as follows:

$$\mathrm{BIAS}_n(t)=\frac{p_t-\mathrm{SMA}_n(t)}{\mathrm{SMA}_n(t)}\times 100.\tag{30}$$
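Both indices reduce to a rolling mean. A minimal pandas sketch (with n = 6, the value used in this experiment) is given below; the helper name sma_bias is ours.

```python
import pandas as pd


def sma_bias(prices: pd.Series, n: int = 6):
    """SMA_n and BIAS_n of Eqs. (29)-(30)."""
    sma = prices.rolling(window=n).mean()      # mean of the last n closing prices
    bias = 100.0 * (prices - sma) / sma
    return sma, bias


# Inputs of Experiment III: x'_{t,1} = SMA_6(t-1), x'_{t,2} = BIAS_6(t-1).
# sma6, bias6 = sma_bias(prices)
# x1, x2 = sma6.shift(1), bias6.shift(1)
```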
Let $x'_{t,1}=\mathrm{SMA}_6(t-1)$ and $x'_{t,2}=\mathrm{BIAS}_6(t-1)$. Now the underlying dataset is partitioned into K clusters by k-means (Hartigan & Wong, 1979), a popular clustering algorithm. Then the output variable $y_t$ is

$$y_t=\frac{y'_t-\bar{y}'_j}{\sigma_{y'_j}},\tag{31}$$

where $y'_t$ belongs to the jth cluster, and $\bar{y}'_j$ and $\sigma_{y'_j}$ are the mean and standard deviation in the $y'$ direction of the jth cluster. The input vector $\mathbf{x}_t=[x_{t,1}\ x_{t,2}]$ is obtained by

$$x_{t,i}=\frac{x'_{t,i}-\bar{x}'_{j,i}}{\sigma_{x'_{j,i}}}\tag{32}$$

for i = 1, 2, where $[x'_{t,1}\ x'_{t,2}]$ belongs to the jth cluster, and $\bar{x}'_{j,i}$ and $\sigma_{x'_{j,i}}$ are the mean and standard deviation, respectively, in the ith direction of the jth cluster. The RMSE is defined as follows:

$$\mathrm{RMSE}=\sqrt{\frac{1}{T}\sum_{t=1}^{T}(p_t-\hat{p}_t)^2}=\sqrt{\frac{1}{T}\sum_{t=1}^{T}(y'_t-\hat{y}'_t)^2}=\sqrt{\frac{1}{T}\sum_{t=1}^{T}\bigl(y'_t-(\hat{y}_t\,\sigma_{y'_j}+\bar{y}'_j)\bigr)^2},\tag{33}$$

where $\hat{y}_t$ is the predicted output and j is the index of its corresponding cluster.
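The cluster-wise normalization of Eqs. (31)–(33) can be sketched with scikit-learn's KMeans. This is an illustration under stated assumptions: the excerpt does not give the number of clusters K or the exact space that is clustered, so the joint vectors $[x'_{t,1}, x'_{t,2}, y'_t]$ are clustered with K = 3 here, and the function names are ours.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_normalize(x_prime, y_prime, n_clusters=3, random_state=0):
    """Per-cluster z-scoring of inputs (Eq. (32)) and outputs (Eq. (31)).

    x_prime: array of shape (T, 2) holding [SMA_6(t-1), BIAS_6(t-1)].
    y_prime: array of shape (T,) holding p_t (recall y'_t = p_t here).
    """
    data = np.column_stack([x_prime, y_prime])
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=random_state).fit_predict(data)
    x_mean = np.array([x_prime[labels == j].mean(axis=0) for j in range(n_clusters)])
    x_std = np.array([x_prime[labels == j].std(axis=0) for j in range(n_clusters)])
    y_mean = np.array([y_prime[labels == j].mean() for j in range(n_clusters)])
    y_std = np.array([y_prime[labels == j].std() for j in range(n_clusters)])
    x = (x_prime - x_mean[labels]) / x_std[labels]          # Eq. (32)
    y = (y_prime - y_mean[labels]) / y_std[labels]          # Eq. (31)
    return x, y, labels, (y_mean, y_std)


def rmse_price_scale(y_prime, y_hat, labels, y_stats):
    """Eq. (33): un-normalize the prediction before measuring the error."""
    y_mean, y_std = y_stats
    p_hat = y_hat * y_std[labels] + y_mean[labels]
    return float(np.sqrt(np.mean((y_prime - p_hat) ** 2)))
```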
For FNN, standard three-layer networks are adopted. There are 2 nodes in the input layer and 1 node in the output layer. To examine the effect of different architectures on the performance, we set the number of hidden nodes from 2 to 15 with a stepping factor of 1. A hybrid learning algorithm incorporating particle swarm optimization (PSO) and recursive least squares (RLS) is used for refining the antecedent parameters and the consequent parameters, respectively. The forecasting performance obtained by FNN with different numbers of hidden nodes is depicted in Fig. 8. From this figure, we can see that FNN requires different numbers of hidden nodes for different datasets to obtain good performance.

[Fig. 8. Forecasting performance of FNN with different numbers of hidden nodes in Experiment III.]

We run SKSVR and MKSVR on these four datasets, with the same settings as in Experiment I. The forecasting performance obtained by SKSVR is shown in Fig. 9. Again, we can see that SKSVR requires different γ settings for different datasets to obtain good performance.

[Fig. 9. Forecasting performance of SKSVR with different hyperparameters in Experiment III.]

The best RMSE values obtained by FNN and SKSVR are listed in Table 5. The RMSE values obtained by MKSVR for the four datasets are also listed in Table 5. Obviously, MKSVR works better than the best FNN and SKSVR for each dataset, and we don't need to do trial-and-error with MKSVR. Fig. 10 shows the forecasting results for datasets DS-V to DS-VIII by MKSVR.

Table 5
Performance comparison among best FNN, best SKSVR, and MKSVR in Experiment III (RMSE).

Methods | DS-V   | DS-VI  | DS-VII | DS-VIII
FNN     | 59.260 | 64.232 | 50.395 | 61.774
SKSVR   | 45.543 | 47.434 | 46.669 | 57.625
MKSVR   | 45.531 | 47.398 | 45.907 | 57.301

[Fig. 10. Forecasting results of MKSVR on DS-V to DS-VIII: desired output vs. MKSVR prediction over each testing period.]
5. Conclusion

We have proposed a multiple-kernel support vector regression approach for stock market price forecasting. A two-stage multiple-kernel learning algorithm is developed to optimally combine multiple-kernel matrices for support vector regression. The learning algorithm applies sequential minimal optimization and gradient projection iteratively to obtain Lagrange multipliers and optimal kernel weights. By this algorithm, advantages from different hyperparameter settings can be combined and overall system performance can be improved. Besides, the user need not specify the hyperparameter settings in advance, and trial-and-error for determining appropriate hyperparameter settings can then be avoided. Experimental results, obtained by running on datasets taken from the Taiwan Capitalization Weighted Stock Index, have shown that our method performs better than other methods.

References

Bach, F. R., Lanckriet, G. R. G., & Jordan, M. I. (2004). Multiple kernel learning, conic duality, and the SMO algorithm. In Proceedings of the 21st international conference on machine learning (pp. 6–13).
Bennett, K. P., Momma, M., & Embrechts, M. J. (2002). MARK: A boosting algorithm for heterogeneous kernel models. In Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining (pp. 24–31).
Bertsekas, D. P. (1999). Nonlinear programming (2nd ed.). Massachusetts, USA: Athena Scientific.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroscedasticity. Journal of Econometrics, 31(3), 307–327.
Box, G. E. P., & Jenkins, G. M. (1994). Time series analysis: Forecasting and control (3rd ed.). Englewood Cliffs: Prentice Hall.
Cao, L., & Tay, F. E. H. (2001). Financial forecasting using support vector machines. Neural Computing & Applications, 10(2), 184–192.
Cao, L., & Tay, F. E. H. (2003). Support vector machine with adaptive parameters in financial time series forecasting. IEEE Transactions on Neural Networks, 14(6), 1506–1518.
Champernowne, D. G. (1948). Sampling theory applied to autoregressive schemes. Journal of the Royal Statistical Society: Series B, 10, 204–231.
Chang, P.-C., & Liu, C.-H. (2008). A TSK type fuzzy rule based system for stock price prediction. Expert Systems with Applications, 34(1), 135–144.
Chapelle, O., Vapnik, V., Bousquet, O., & Mukherjee, S. (2002). Choosing multiple parameters for support vector machines. Machine Learning, 46(1–3), 131–159.
Crammer, K., Keshet, J., & Singer, Y. (2003). Kernel design using boosting. In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in neural information processing systems (Vol. 15, pp. 537–544). Cambridge, MA, USA: MIT Press.
Cristianini, N., & Shawe-Taylor, J. (2000). An introduction to support vector machines and other kernel-based learning methods. Cambridge, UK: Cambridge University Press.
Duan, K., Keerthi, S., & Poo, A. N. (2003). Evaluation of simple performance measures for tuning SVM hyperparameters. Neurocomputing, 51, 41–59.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4), 987–1008.
Fernando, P.-C., Julio, A. A.-R., & Javier, G. (2003). Estimating GARCH models using support vector machines. Quantitative Finance, 3(3), 163–172.
Gestel, T. V., Suykens, J. A. K., Baestaens, D. E., Lambrechts, A., Lanckriet, G., Vandaele, B., et al. (2001). Financial time series prediction using least squares support vector machines within the evidence framework. IEEE Transactions on Neural Networks, 12(4), 809–821.
Gönen, M., & Alpaydin, E. (2008). Localized multiple kernel learning. In Proceedings of the 25th international conference on machine learning (pp. 352–359).
Hansen, J. V., & Nelson, R. D. (1997). Neural networks and traditional time series methods: A synergistic combination in state economic forecasts. IEEE Transactions on Neural Networks, 8(4), 863–873.
Hartigan, J. A., & Wong, M. A. (1979). A K-means clustering algorithm. Applied Statistics, 28, 100–108.
Kecman, V. (2001). Learning and soft computing: Support vector machines, neural networks, and fuzzy logic models. Cambridge, MA, USA: MIT Press.
Kim, K. J., & Han, I. (2008). Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index. Expert Systems with Applications, 19(2), 125–132.
Kwok, J. T.-Y. (2000). The evidence framework applied to support vector machines. IEEE Transactions on Neural Networks, 11(5), 1162–1173.
Kwon, Y.-K., & Moon, B.-R. (2007). A hybrid neurogenetic approach for stock forecasting. IEEE Transactions on Neural Networks, 18(3), 851–864.
Lanckriet, G. R. G., Cristianini, N., Bartlett, P., Ghaoui, L. E., & Jordan, M. I. (2004). Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5, 27–72.
Oh, S.-K., Pedrycz, W., & Park, H.-S. (2006). Genetically optimized fuzzy polynomial neural networks. IEEE Transactions on Fuzzy Systems, 14(1), 125–144.
Ong, C. S., Smola, A. J., & Williamson, R. C. (2005). Learning the kernel with hyperkernels. Journal of Machine Learning Research, 6, 1043–1071.
Pai, P.-F., & Lin, C.-S. (2005). A hybrid ARIMA and support vector machines model in stock price forecasting. Omega: The International Journal of Management Science, 33(6), 497–505.
Platt, J. C. (1999). Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C. J. C. Burges, & A. J. Smola (Eds.), Advances in kernel methods: Support vector learning (Vol. 11, pp. 185–208). Cambridge, MA, USA: MIT Press.
Qi, M., & Zhang, G. P. (2008). Trend time-series modeling and forecasting with neural networks. IEEE Transactions on Neural Networks, 19(5), 808–816.
Rakotomamonjy, A., Bach, F. R., Canu, S., & Grandvalet, Y. (2007). More efficiency in multiple kernel learning. In Proceedings of the 24th international conference on machine learning (pp. 775–782).
Rakotomamonjy, A., Bach, F. R., Canu, S., & Grandvalet, Y. (2008). SimpleMKL. Journal of Machine Learning Research, 9, 2491–2521.
Sonnenburg, S., Ratsch, G., Schäfer, C., & Schölkopf, B. (2006). Large scale multiple kernel learning. Journal of Machine Learning Research, 7, 1531–1565.
Szafranski, M., Grandvalet, Y., & Rakotomamonjy, A. (2008). Composite kernel learning. In Proceedings of the 25th international conference on machine learning (pp. 1040–1047).
Taiwan Stock Exchange Corporation. <http://www.twse.com.tw/>.
Tay, F. E. H., & Cao, L. (2001). Application of support vector machines in financial time series forecasting. Omega: The International Journal of Management Science, 29(4), 309–317.
Tsang, I. W.-H., & Kwok, J. T.-Y. (2006). Efficient hyperkernel learning using second-order cone programming. IEEE Transactions on Neural Networks, 17(1), 48–58.
Valeriy, G., & Supriya, B. (2006). Support vector machine as an efficient framework for stock market volatility forecasting. Computational Management Science, 3(2), 147–160.
Vapnik, V. (1995). The nature of statistical learning theory. New York, USA: Springer-Verlag.
Wang, Z., Chen, S., & Sun, T. (2008). MultiK-MHKS: A novel multiple kernel learning algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 348–353.
Yang, H., Chan, L., & King, I. (2002). Support vector machine regression for volatile stock market prediction. In Proceedings of the third international conference on intelligent data engineering and automated learning (pp. 391–396).
Zarandi, M. H. F., Rezaee, B., Turksen, I. B., & Neshat, E. (2009). A type-2 fuzzy rule-based expert system model for stock price analysis. Expert Systems with Applications, 36(1), 139–154.
Zhang, D., & Zhou, L. (2004). Discovering golden nuggets: Data mining in financial application. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 34(4), 513–522.