
14 Neural Networks for Time-Series Forecasting

G. Peter Zhang
Department of Managerial Sciences, Georgia State University, Atlanta, GA, USA
[email protected]

1 Introduction
2 Neural Networks
3 Applications in Time Series Forecasting
4 Neural Network Modeling Issues
5 Methodological Issues
6 Conclusions

G. Rozenberg et al. (eds.), Handbook of Natural Computing, DOI 10.1007/978-3-540-92910-9_14, © Springer-Verlag Berlin Heidelberg 2012

Abstract

Neural networks have become an important method for time series forecasting. There is
increasing interest in using neural networks to model and forecast time series. This chapter
provides a review of some recent developments in time series forecasting with neural networks,
a brief description of neural networks, their advantages over traditional forecasting models,
and some recent applications. Several important data and modeling issues for time series
forecasting are highlighted. In addition, recent developments in several methodological areas
such as seasonal time series modeling, multi-period forecasting, and the ensemble method are
reviewed.

1 Introduction

Time series forecasting is an active research area that has received a considerable amount of
attention in the literature. Using the time series approach to forecasting, forecasters collect
and analyze historical observations to determine a model to capture the underlying data-
generating process. Then the model is extrapolated to forecast future values. This approach is
useful for applications in many domains such as business, economics, industry, engineering,
and science. Much effort has been devoted, over the past three decades, to the development
and improvement of time series forecasting models.
There has been an increasing interest in using neural networks to model and forecast
time series. Neural networks have been found to be a viable contender when compared to
various traditional time series models (Zhang et al. 1998; Balkin and Ord 2000; Jain and
Kumar 2007). Lapedes and Farber (1987) report the first attempt to model nonlinear time
series with neural networks. De Groot and Wurtz (1991) present a detailed analysis of
univariate time series forecasting using feedforward neural networks for two benchmark
nonlinear time series. Chakraborty et al. (1992) conduct an empirical study on multivariate
time series forecasting with neural networks. Atiya et al. (1999) present a case study of
multistep river flow forecasting. Poli and Jones (1994) propose a stochastic neural net
model based on the Kalman filter for nonlinear time series prediction. Weigend et al. (1990,
1992) and Cottrell et al. (1995) address the issue of network structure for forecasting real-
world time series. Berardi and Zhang (2003) investigate the bias and variance issue in the time
series forecasting context. Liang (2005) proposes a Bayesian neural network for time series
analysis. In addition, results from several large forecasting competitions (Balkin and Ord 2000;
Weigend and Gershenfeld 1994) suggest that neural networks can be a very useful addition to
the time series forecasting toolbox.
Time series forecasting has been dominated by linear methods for decades. Linear methods
are easy to develop and implement and they are also relatively simple to understand and
interpret. However, it is important to understand the limitation of the linear models. They are
not able to capture nonlinear relationships in the data. In addition, the approximation of
linear models to complicated nonlinear relationships is not always satisfactory as evidenced by
the well-known M-competition where a majority of commonly used linear methods were
tested with more than 1,000 real time series data (Makridakis et al. 1982). The results clearly
show that no single model is the best and the best performer is dependent on the data and
other conditions. One explanation of the mixed findings is the failure of linear models
to account for a varying degree of nonlinearity that is common in real-world problems.

That is, because of the inherent nonlinear characteristics in the data, no single linear model is
able to approximate all types of nonlinear data structure equally well.
Neural networks provide a promising alternative tool for forecasters. The inherently
nonlinear structure of neural networks is particularly useful for capturing the complex
underlying relationship in many real-world problems. Neural networks are perhaps more
versatile methods for forecasting applications in that, not only can they find nonlinear
structures in a problem, they can also model linear processes. For example, the capability of
neural networks in modeling linear time series has been studied and reported by several
researchers (Hwang 2001; Medeiros and Pedreira 2001; Zhang 2001).
In addition to the nonlinear modeling capability, neural networks have several other features
that make them valuable for time series forecasting. First, neural networks are data-driven
nonparametric methods that do not require many restrictive assumptions on the underlying
stochastic process from which data are generated. As such, they are less susceptible to the model
misspecification problem than parametric methods. This "learning from data" feature is highly
desirable in various forecasting situations where time series data are usually easy to collect but the
underlying data-generating mechanism is not known. Second, neural networks have been shown
to have universal functional approximating capability in that they can accurately approximate
many types of complex functional relationships. This is an important and powerful characteristic
as a time series model aims to accurately capture the functional relationship between the variable
to be forecast and its historical observations. The combination of the above-mentioned char-
acteristics makes neural networks a quite general and flexible tool for forecasting.
Research efforts on neural networks for time series forecasting are considerable and
numerous applications of neural networks for forecasting have been reported. Adya and
Collopy (1998) and Zhang et al. (1998) reviewed the relevant literature in forecasting with
neural networks. There has been a significant advance in research in this area since then. The
purpose of this chapter is to summarize some of the important recent work with a focus on
time series forecasting using neural networks.

2 Neural Networks

Neural networks are computing models for information processing. They are useful for
identifying the fundamental functional relationship or pattern in the data. Although many
types of neural network models have been developed to solve different problems, the most
widely used model by far for time series forecasting has been the feedforward neural network.
Figure 1 shows a popular one-output feedforward neural network model. It is composed of
several layers of basic processing units called neurons or nodes. Here, the network model has
one input layer, one hidden layer, and one output layer. The nodes in the input layer are used
to receive information from the data. For a time series forecasting problem, past lagged
observations $(y_t, y_{t-1}, \ldots, y_{t-p})$ are used as inputs. The hidden layer is composed of nodes
that are connected to both the input and the output layer and is the most important part of a
network. With nonlinear transfer functions, hidden nodes can process the information
received by input nodes. The output from the network is used to predict the future value(s)
of a time series. If the focus is on one-step-ahead forecasting, then only one output node is
needed. If multistep-ahead forecasting is needed, then multiple nodes may be employed in the
output layer. In a feedforward network, information is one directional. That is, it goes through
the input nodes to hidden nodes and to output nodes, and there is no feedback from the

Fig. 1: A typical feedforward neural network for time series forecasting.

network output. The feedforward neural network illustrated in Fig. 1 is functionally
equivalent to a nonlinear autoregressive model
$$y_{t+1} = f(y_t, y_{t-1}, \ldots, y_{t-p}) + e_{t+1}$$
where $y_t$ is the observed time series value for variable $y$ at time $t$ and $e_{t+1}$ is the error term at
time $t+1$. This model suggests that a future time series value, $y_{t+1}$, is an autoregressive function
of its past observations, $y_t, y_{t-1}, \ldots, y_{t-p}$, plus a random error.
forecasting model assumes that there is a functional relationship between the future value
and the past observations, neural networks can be useful in identifying this relationship.
In developing a feedforward neural network model for forecasting tasks, specifying its
architecture in terms of the number of input, hidden, and output neurons is an important yet
nontrivial task. Most neural network applications use one output neuron for both one-step-
ahead and multistep-ahead forecasting. However, as argued by Zhang et al. (1998), it may be
beneficial to employ multiple output neurons for direct multistep-ahead forecasting. The
input neurons or variables are very important in any modeling endeavor and especially
important for neural network modeling because the success of a neural network depends, to
a large extent, on the patterns represented by the input variables. For a time series forecasting
problem, one needs to identify how many and what past lagged observations should be used as
the inputs. Finally, the number of hidden nodes is usually unknown before building a neural
network model and must be chosen during the model-building process. This parameter is
useful for capturing the nonlinear relationship between input and output variables.
Before a neural network can be used for forecasting, it must be trained. Neural network
training refers to the estimation of connection weights. Although the estimation process is
similar to that in the regression modeling where one minimizes the sum of squared errors,
the neural network training process is more difficult and complicated due to the nature of

nonlinear optimization process. There are many training algorithms developed in the litera-
ture and the most influential one is the backpropagation algorithm by Werbos (1974) and
Rumelhart et al. (1986). The basic idea of backpropagation training is to use a gradient-
descent approach to adjust and determine weights such that an overall error function such as
the sum of squared errors can be minimized.
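As an illustration of that idea (a minimal sketch, not the exact algorithm of Werbos or Rumelhart et al.), the following Python code trains a one-hidden-layer network by full-batch gradient descent on the sum of squared errors, backpropagating the forecast errors to obtain the weight gradients. The learning rate and epoch count are arbitrary choices that would need tuning.

```python
import numpy as np

def train_mlp(X, Y, n_hidden=5, lr=1e-3, epochs=5000, seed=0):
    rng = np.random.default_rng(seed)
    W1 = 0.5 * rng.standard_normal((X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = 0.5 * rng.standard_normal(n_hidden)
    b2 = 0.0
    for _ in range(epochs):
        # Forward pass: one hidden tanh layer, linear output.
        H = np.tanh(X @ W1 + b1)
        err = (H @ W2 + b2) - Y                  # forecast errors
        # Backward pass: gradients of 0.5 * sum(err**2).
        gW2, gb2 = H.T @ err, err.sum()
        dA = np.outer(err, W2) * (1.0 - H**2)    # back through tanh
        gW1, gb1 = X.T @ dA, dA.sum(axis=0)
        # Gradient-descent weight update.
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return W1, b1, W2, b2
```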
In addition to the most popular feedforward neural networks, other types of neural
networks can also be used for time series forecasting purposes. For example, Barreto (2008)
provides a review of time series forecasting using the self-organizing map. Recurrent neural
networks (Connor et al. 1994; Kuan and Liu 1995; Kermanshahi 1998; Vermaak and Botha
1998; Parlos et al. 2000; Mandic and Chambers 2001; Huskent and Stagge 2003; Ghiassi et al.
2005; Cai et al. 2007; Jain and Kumar 2007; Menezes and Barreto 2008) that explicitly account
for the dynamic nonlinear pattern are good alternatives to feedforward networks for certain
time series forecasting problems. In a recurrent neural network, there are cycles or feedback
connections among neurons. Outputs from a recurrent network can be directly fed back to
inputs, generating dynamic feedbacks on errors of past patterns. In this sense, recurrent
networks can model richer dynamics than feedforward networks just like linear autoregressive
and moving average (ARMA) models that have certain advantages over autoregressive (AR)
models. However, much less attention has been paid to the research and applications of
recurrent networks, and the superiority of recurrent networks over feedforward networks
has not been established. The practical difficulty of using recurrent neural networks may lie in
the facts that (1) recurrent networks can assume very different architectures and it may be
difficult to specify appropriate model structures to experiment with, and (2) it is more difficult
to train recurrent networks due to the unstable nature of training algorithms.
For an in-depth coverage of many aspects of networks, readers are referred to a number of
excellent books including Smith (1993), Bishop (1995), and Ripley (1996). For neural net-
works for forecasting research and applications, readers may consult Azoff (1994), Weigend
and Gershenfeld (1994), Gately (1996), Zhang et al. (1998), and Zhang (2004).

3 Applications in Time Series Forecasting

Time series are data collected over time and this is one of the most commonly available forms
of data in many forecasting applications. Therefore, it is not surprising that the use of neural
networks for time series forecasting has received great attention in many different fields. Given
that forecasting problems arise in so many different disciplines and the literature on forecast-
ing with neural networks is scattered in so many diverse fields, it is difficult to cover all neural
network applications in time series forecasting problems in this review. Table 1 provides a
sample of recent time series forecasting applications reported in the literature since 2005. For
other forecasting applications of neural networks, readers are referred to several survey articles
such as Dougherty (1995) for transportation modeling and forecasting, Wong and Selvi
(1998) and Fadlalla and Lin (2001) for financial applications, Krycha and Wagner (1999)
for management science applications, Vellido et al. (1999) and Wong et al. (2000) for business
applications, Maier and Dandy (2000) for water resource forecasting, and Hippert et al. (2001)
for short-term load forecasting.
As can be seen from Table 1, a wide range of time series forecasting problems have been
solved by neural networks. Some of these application areas include environment (air pollutant
concentration, carbon monoxide concentration, drought, and ozone level), business and

Table 1
Some recent neural network applications in time series forecasting

Forecasting problem                      Study
Air pollutant concentration Gautama et al. (2008)
Carbon monoxide concentration Chelani and Devotta (2007)
Demand Aburto and Weber (2007)
Drought Mishra and Desai (2006)
Electrical consumption Azadeh et al. (2007)
Electricity load Hippert et al. (2005), Xiao et al. (2009)
Electricity price Pino et al. (2008)
Energy demand Abdel-Aal (2008)
Exchange rate Zhang and Wan (2007)
Food grain price Zou et al. (2007)
Food product sales Doganis et al. (2006)
Gold price changes Parisi et al. (2008)
Inflation Nakamura (2005)
Inventory Doganis et al. (2008)
Macroeconomic time series Teräsvirta et al. (2005)
Stock index option price Wang (2009)
Stock returns volatility Bodyanskiy and Popov (2006)
Tourism demand Palmer et al. (2006), Chu (2008)
Traffic flow Jiang and Adeli (2005)
Ozone level Coman et al. (2008)
River flow Jain and Kumar (2007)
Wind speed Cadenas and Rivera (2009)

finance (product sales, demand, inventory, stock market movement and risk, exchange rate,
futures trading, commodity and option price), tourism and transportation (tourist volume
and traffic flow), engineering (wind speed and river flow) and energy (electrical consumption
and energy demand). Again this is only a relatively small sample of the application areas to
which neural networks have been applied. One can find many more application areas of neural
networks in time series analysis and forecasting.

4 Neural Network Modeling Issues


Developing a neural network model for a time series forecasting application is not a trivial
task. Although many software packages exist to ease users’ effort in building a neural network
model, it is critical for forecasters to understand many important issues around the model-
building process. It is important to point out that building a successful neural network is a
combination of art and science, and software alone is not sufficient to solve all problems in the
process. It is a pitfall to blindly throw data into a software package and then hope it will
automatically give a satisfactory forecast.

Neural network modeling issues include the choice of network type and architecture, the
training algorithm, as well as model validation, evaluation, and selection. Some of these can be
solved during the model-building process while others must be carefully considered and
planned before actual modeling starts.

4.1 Data Issues

Regarding the data issues, the major decisions a neural network forecaster must make include
data preparation, data cleaning, data splitting, and input variable selection. Neural networks
are data-driven techniques. Therefore, data preparation is a very critical step in building a
successful neural network model. Without a good, adequate, and representative data set, it is
impossible to develop a useful predictive model. The reliability of neural network models often
depends, to a large degree, on the quality of data.
There are several practical issues around the data requirement for a neural network model.
The first is the size of the sample used to build a neural network. While there is no specific rule
that can be followed for all situations, the advantage of having a large sample should be clear
because not only do neural networks typically have a large number of parameters to estimate,
but also it is often necessary to split data into several portions to avoid overfitting, to select the
model, and to perform model evaluation and comparison. A larger sample provides a better
chance for neural networks to adequately capture the underlying data-generating process.
Although large samples do not always give superior performance over small samples, fore-
casters should strive to get as large a size as they can. In time series forecasting problems, Box
and Jenkins (1976) have suggested that at least 50, and preferably 100, observations are necessary to
build linear autoregressive integrated moving average (ARIMA) models. Therefore, for non-
linear modeling, a larger sample size should be more desirable. In fact, using the longest time
series available for developing forecasting models is a time-tested principle in forecasting
(Armstrong 2001). Of course, if observations in the time series are not homogeneous or the
underlying data-generating process changes over time, then larger sample sizes may not help
and can even hurt the performance of neural networks.
The second issue is the data splitting. Typically, for neural network applications, all
available data are divided into an in-sample and an out-of-sample. The in-sample data are
used for model fitting and selection, while the out-of-sample is used to evaluate the
predictive ability of the model. The in-sample data sometimes are further split into a
training sample and a validation sample. This division of data means that the true size of
the sample used in model building is smaller than the initial sample size. Although there is
no consensus on how to split the data, the general practice is to allocate more data for model
building and selection. That is, most studies in the literature use convenient ratios of
splitting for in-sample and out-of-sample, such as 70:30%, 80:20%, and 90:10%. It is
important to note that in data splitting the issue is not about what proportion of data
should be allocated in each sample but about sufficient data points in each sample to ensure
adequate learning, validation, and testing. When the size of the available data set is large,
different splitting strategies may not have a major impact. But it is quite different when the
sample size is small. According to Chatfield (2001), forecasting analysts typically retain about
10% of the data as a hold-out sample. Granger (1993) suggests that for nonlinear modeling, at
least 20% of the data should be held back for an out-of-sample evaluation. Hoptroff (1993)
recommends that at least ten data points should be in the test sample, while Ashley (2003)

suggests that a much larger out-of-sample size is necessary in order to achieve statistically
significant improvement for forecasting problems.
In addition, data splitting generally should be done randomly to make sure each subsam-
ple is representative of the population. However, time series data are difficult or impossible to
split randomly because of the desire to keep the autocorrelation structure of the time series
observations. For many time series problems, data splitting is typically done at researchers’
discretion. However, it is important to make sure that each portion of the sample is charac-
teristic of the true data-generating process. LeBaron and Weigend (1998) evaluate the effect of
data splitting on time series forecasting and find that data splitting can cause more sample
variation which in turn causes the variability of forecast performance. They caution the pitfall
of ignoring variability across the splits and drawing too strong conclusions from such splits.
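As a concrete illustration, here is a minimal sketch of such a chronological split in Python; the 70:20:10 proportions echo the convenient ratios mentioned above and are illustrative only.

```python
def split_series(y, val_frac=0.2, test_frac=0.1):
    # Keep time order: the most recent observations form the hold-out sets.
    n = len(y)
    n_test, n_val = int(n * test_frac), int(n * val_frac)
    train = y[:n - n_val - n_test]            # model fitting
    val = y[n - n_val - n_test:n - n_test]    # model selection
    test = y[n - n_test:]                     # out-of-sample evaluation only
    return train, val, test
```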
Data preprocessing is another issue that is often recommended to highlight important
relationships or to create more uniform data to facilitate neural network learning, meet
algorithm requirements, and avoid computation problems. Azoff (1994) summarizes four
methods typically used for input data normalization. They are along-channel normalization,
across-channel normalization, mixed-channel normalization, and external normalization.
However, the necessity and effect of data normalization on network learning and forecasting
are still not universally agreed upon. For example, in modeling and forecasting seasonal time
series, some researchers (Gorr 1994) believe that data preprocessing is not necessary because
the neural network is a universal approximator and is able to capture all of the underlying
patterns well. Empirical studies (Nelson et al. 1999), however, find that pre-deseasonalization
of the data is critical in improving forecasting performance. Zhang and Qi (2002, 2005)
further demonstrate that for time series containing both trend and seasonal variations,
preprocessing the data by both detrending and deseasonalization should be the most appro-
priate way to build neural networks for best forecasting performance.
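The following sketch illustrates one plausible reading of that preprocessing recipe: fit and remove a linear trend by ordinary least squares, then estimate and remove additive seasonal indexes. It assumes an additive structure and a known seasonal period, which real series may not satisfy; multiplicative adjustment or differencing may be more appropriate in practice.

```python
import numpy as np

def detrend_deseasonalize(y, season=12):
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)       # least-squares linear trend
    detrended = y - (intercept + slope * t)
    # Additive seasonal index: mean of the detrended series at each phase.
    seas = np.array([detrended[i::season].mean() for i in range(season)])
    adjusted = detrended - seas[t % season]
    return adjusted, (slope, intercept, seas)
```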

4.2 Network Design

Neural network design and architecture selection are important yet difficult tasks. Not only are
there many ways to build a neural network model and a large number of choices to be made
during the model building and selection process, but also numerous parameters and issues
have to be estimated and experimented with before a satisfactory model may emerge. Adding
to the difficulty is the lack of standards in the process. Numerous rules of thumb are available
but not all of them can be applied blindly to a new situation. In building an appropriate model
for the forecasting task at hand, some experiments are usually necessary. Therefore, a good
experimental design is needed. For discussions of many aspects of modeling issues, readers
may consult Kaastra and Boyd (1996), Zhang et al. (1998), Coakley and Brown (1999), and
Remus and O’Connor (2001).
A feedforward neural network is characterized by its architecture determined by the number
of layers, the number of nodes in each layer, the transfer or activation function used in each layer,
as well as how the nodes in each layer connect to nodes in adjacent layers. Although partial
connections between nodes in adjacent layers and direct connection from input layer to output
layer are possible, the most commonly used neural network is the fully connected one in which
each node in one layer is fully connected only to all nodes in the adjacent layers.
The size of the output layer is usually determined by the nature of the problem. For
example, in most time series forecasting problems, one output node is naturally used for

one-step-ahead forecasting, although one output node can also be employed for multistep-
ahead forecasting, in which case iterative forecasting mode must be used. That is, forecasts for
more than two steps ahead in the time horizon must be based on earlier forecasts. This may not
be effective for multistep forecasting, as pointed out by Zhang et al. (1998), which is in line with
Chatfield (2001) who discusses the potential benefits of using different forecasting models for
different lead times. Therefore, for multistep forecasting, one may either use multiple output
nodes or develop multiple neural networks each for one particular step forecasting.
The number of input nodes is perhaps the most important parameter for designing an
effective neural network forecaster. For causal forecasting problems, it corresponds to the
number of independent or predictor variables that forecasters believe are important in
predicting the dependent variable. For univariate time series forecasting problems, it is the
number of past lagged observations. Determining an appropriate set of input variables is vital
for neural networks to capture the essential underlying relationship that can be used for
successful forecasting. How many and what variables to use in the input layer will directly
affect the performance of neural networks in both in-sample fitting and out-of-sample
forecasting, resulting in the under-learning or over-fitting phenomenon. Empirical results
(Lennon et al. 2001; Zhang et al. 2001; Zhang 2001) also suggest that the input layer is more
important than the hidden layer in time series forecasting problems. Therefore, considerable
attention should be given to input variable selection especially for time series forecasting.
Balkin and Ord (2000) select the ordered variables (lags) sequentially by using a linear model
and a forward stepwise regression procedure. Medeiros et al. (2006) also use a linear variable
selection approach to choosing the input variables.
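A hedged sketch of linear lag screening in that spirit follows; the greedy loop and stopping rule are illustrative simplifications, not the exact procedure of Balkin and Ord (2000).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def forward_lag_selection(X, Y, max_lags=8, tol=1e-4):
    # Greedily add the lag (column of X) that most reduces in-sample MSE
    # of a linear model; the surviving lags become the network inputs.
    selected = []
    remaining = list(range(min(max_lags, X.shape[1])))
    best_mse = np.inf
    while remaining:
        scores = []
        for j in remaining:
            cols = selected + [j]
            pred = LinearRegression().fit(X[:, cols], Y).predict(X[:, cols])
            scores.append((np.mean((pred - Y) ** 2), j))
        mse, j = min(scores)
        if best_mse - mse < tol:          # stop on negligible improvement
            break
        best_mse = mse
        selected.append(j)
        remaining.remove(j)
    return selected
```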
Although there is substantial flexibility in choosing the number of hidden layers and the
number of hidden nodes in each layer, most forecasting applications use only one hidden layer
and a small number of hidden nodes. In practice, the number of hidden nodes is often
determined by experimenting with a number of choices and then selecting using the cross-
validation approach or the performance on the validation set. Although the number of hidden
nodes is an important factor, a number of studies have shown that the forecasting performance
of neural networks is not very sensitive to this parameter (Bakirtzis et al. 1996; Khotanzad et al.
1997; Zhang et al. 2001). Medeiros et al. (2006) propose a statistical approach to selecting the
number of hidden nodes by sequentially applying the Lagrange multiplier type tests.
Once a particular neural network architecture is determined, it must be trained so that the
parameters of the network can be estimated from the data. To be effective in performing this
task, a good training algorithm is needed. Training a neural network can be treated as a
nonlinear mathematical optimization problem and different solution approaches or algo-
rithms can have quite different effects on the training result. As a result, training with different
algorithms and repeating with multiple random initial weights can be helpful in getting better
solutions to the neural network training problem. In addition to the popular basic back-
propagation training algorithm, users should be aware of many other (sometimes more
effective) algorithms. These include the so-called second-order approaches such as conjugate
gradient descent, quasi-Newton, and Levenberg–Marquardt (Bishop 1995).
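A simple sketch of the multiple-restart idea, assuming scikit-learn's MLPRegressor; keeping the restart with the lowest final training loss is one common heuristic against poor local minima.

```python
from sklearn.neural_network import MLPRegressor

def train_with_restarts(X, Y, n_restarts=10, **net_kwargs):
    # Repeat training from different random initial weights and keep
    # the fit with the lowest final training loss.
    best, best_loss = None, float('inf')
    for seed in range(n_restarts):
        net = MLPRegressor(solver='lbfgs', max_iter=2000,
                           random_state=seed, **net_kwargs)
        net.fit(X, Y)
        if net.loss_ < best_loss:
            best, best_loss = net, net.loss_
    return best
```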

4.3 Model Selection and Evaluation

The selection of a neural network model is typically done using the cross-validation
process. That is, the in-sample data is split into a training set and a validation set. The network

parameters are estimated with the training sample, while the performance of the model is
evaluated with the validation sample. The best model selected is the one that has the best
performance on the validation sample. Of course, in choosing competing models, one must
also apply the principle of parsimony. That is, a simpler model that has about the same
performance as a more complex model should be preferred.
Model selection can also be done with solely the in-sample data. In this regard, several in-
sample selection criteria are used to modify the total error function to include a penalty term
that penalizes for the complexity of the model. In-sample model selection approaches are
typically based on some information-based criteria such as Akaike’s information criterion
(AIC) and Bayesian (BIC) or Schwarz information criterion (SIC). However, it is important to
note the limitation of these criteria as empirically demonstrated by Swanson and White (1995)
and Qi and Zhang (2001). Egrioglu et al. (2008) propose a weighted information criterion to
select the model. Other in-sample approaches are based on pruning methods such as node and
weight pruning (Reed 1993) as well as constructive methods such as the upstart and cascade
correlation approaches (Fahlman and Lebiere 1990; Frean 1990).
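For a network whose weights are estimated by least squares, the Gaussian-likelihood forms of AIC and BIC can be computed from the residuals and the number of estimated weights, as in the sketch below (one of several variants in use).

```python
import numpy as np

def aic_bic(residuals, n_params):
    # Gaussian-likelihood information criteria: both reward fit (small
    # residual variance) and penalize complexity (number of weights).
    n = len(residuals)
    sigma2 = np.mean(np.asarray(residuals) ** 2)
    log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + np.log(n) * n_params
    return aic, bic
```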
After the modeling process, the finally selected model must be evaluated using data not used
in the model-building stage. In addition, as neural networks are often used as a nonlinear
alternative to traditional statistical models, the performance of neural networks needs to be
compared to that of statistical methods. As Adya and Collopy (1998) point out, "if such a
comparison is not conducted it is difficult to argue that the study has taught one much about the
value of neural networks." They further propose three evaluation criteria to objectively evaluate
the performance of a neural network: (1) comparing it to well-accepted (traditional) models;
(2) using true out-of-sample data; and (3) ensuring enough sample size in the out-of-sample
data (40 for classification problems and 75 for time series problems). It is important to note that
the test sample served as the out-of-sample should not in any way be used in the model-building
process. If the cross-validation is used for model selection and experimentation, the perfor-
mance on the validation sample should not be treated as the true performance of the model.

5 Methodological Issues
5.1 Modeling Trend and Seasonal Time Series

Many business and economic time series exhibit both seasonal and trend variations. Seasonality
is a periodic and recurrent pattern caused by factors such as weather, holidays, repeating
promotions, as well as the behavior of economic agents (Hylleberg 1992). Because of the
frequent occurrence of these time series in practice, how to model and forecast seasonal and
trend time series has long been a major research topic that has significant practical implications.
Traditional analyses of time series are mainly concerned with modeling the autocorrelation
structure of a time series, and typically require that the data under study be stationary. Trend and
seasonality in time series violate the condition of stationarity. Thus, the removal of the trend and
seasonality is often desired in time series analysis and forecasting. For example, the well-known
Box–Jenkins approach to time series modeling relies entirely on the stationarity assumption. The
classic decomposition technique decomposes a time series into trend, seasonal factor, and
irregular components. The trend and seasonality are often estimated and removed from the
data first before other components are estimated. Seasonal ARIMA models also require that
the data be seasonally differenced to achieve stationarity condition (Box and Jenkins 1976).

However, seasonal adjustment is not without controversy. Ghysels et al. (1996) suggest that
seasonal adjustment might lead to undesirable nonlinear properties in univariate time series.
Ittig (1997) also questions the traditional method for generating seasonal indexes and
proposes a nonlinear method to estimate the seasonal factors. More importantly, some empirical
studies find that seasonal fluctuations are not always constant over time and at least in some
time series, seasonal components and nonseasonal components are not independent, and thus
not separable (Hylleberg 1994).
In the neural network literature, Gorr (1994) points out that neural networks should be
able to simultaneously detect both the nonlinear trend and the seasonality in the data. Sharda
and Patil (1992) examine 88 seasonal time series from the M-competition and find that neural
networks can model seasonality effectively and pre-deseasonalizing the data is not necessary.
Franses and Draisma (1997) find that neural networks can also detect possible changing
seasonal patterns. Hamzacebi (2008) proposes a neural network with seasonal lag to directly
model seasonality. Farway and Chatfield (1995), however, find mixed results with the direct
neural network approach. Kolarik and Rudorfer (1994) report similar findings. Based on a
study of 68 time series from the M-competition, Nelson et al. (1999) find that neural networks
trained on deseasonalized data forecast significantly better than those trained on seasonally
non-adjusted data. Hansen and Nelson (2003) find that the combination of transformation,
feature extraction, and neural networks through stacked generalization gives more accurate
forecasts than classical decomposition or ARIMA models.
Due to the controversies around how to use neural networks to best model trend and/or
seasonal time series, several researchers have systematically studied the issue recently. Qi and
Zhang (2008) investigate the issue of how to best use neural networks to model trend time
series. With a simulation study, they address the question: what is the most effective way to
model and forecast trend time series with neural networks? A variety of different underlying
data-generating processes are considered that have different trend mechanisms. Results show
that directly modeling the trend component is not a good strategy and differencing the data
first is the overall most effective approach in modeling trend time series. Zhang and Qi (2005)
further look into the issue of how to best model time series with both trend and seasonal
components. Using both simulated and real time series, they find that preprocessing the data
by both detrending and deseasonalization is the most appropriate way to build neural net-
works for best forecasting performance. This finding is supported by Zhang and Kline (2007)
who empirically examine the issue of data preprocessing and model selection using a large data
set of 756 quarterly time series from the M3 forecasting competition.

5.2 Multi-period Forecasting

One of the methodological issues that has received limited attention in the time series forecasting
as well as the neural network literature is multi-period (or multistep) forecasting. A forecaster
facing a multiple-period forecasting problem typically has a choice between the iterated method –
using a general single-step model to iteratively generate forecasts, and the direct method – using a
tailored model that directly forecasts the future value for each forecast horizon. Which method
the forecaster should use is an important research and practical question.
Theoretically, the direct method should be more appealing because it is less sensitive to
model misspecification (Chevillon and Hendry 2005). Several empirical studies on the relative
performance of iterated vs. direct forecasts, however, yield mixed findings. Findley (1985) finds

some improvement in forecasting accuracy using the direct method. Based on several
simulated autoregressive and moving average time series, Stoica and Nehorai (1989) find no
significant differences between the two methods. Bhansali (1997), however, shows that the
iterated method has a clear advantage over the direct method for finite autoregressive
processes. Kang (2003) uses univariate autoregressive models to forecast nine US economic
time series and finds inconclusive results regarding which multistep method is preferred. Ang
et al. (2006) find that iterated forecasts of US GDP growth outperformed direct forecasts at
least during the 1990s. The most comprehensive empirical study to date was undertaken by
Marcellino et al. (2006) who compared the relative performance of the iterated vs. direct forecasts
with several univariate and multivariate autoregressive (AR) models. Using 170 US monthly
macroeconomic time series spanning 1959 to 2002, they find that iterated forecasts are generally
better than direct forecasts under several different scenarios. In addition, they show that direct
forecasts are increasingly less accurate as the forecast horizon increases.
For nonlinear models, multi-period forecasting receives little attention in the literature
because of the analytical challenges and the computational difficulties (De Gooijer and Kumar
1992). Lin and Granger (1994) recommend the strategy of fitting a new model for each
forecast horizon in nonlinear forecasting. Tiao and Tsay (1994) find that the direct forecasting
method can dramatically outperform the iterated method, especially for long-term forecasts.
In the neural network literature, mixed findings have been reported (Zhang et al. 1998).
Although Zhang et al. (1998) argue for using the direct method, Weigend et al. (1992) and Hill
et al. (1996) find that the direct method performed much worse than the iterated method.
Kline (2004) specifically addresses the issue of the relative performance of iterated vs. direct
methods. He proposes three methods – iterated, direct, and joint (using one neural network
model to predict all forecast horizons simultaneously) – and compares them using a subset of
quarterly series from the M3 competition. He finds that the direct method significantly
outperformed the iterated method. In a recent study, Hamzacebi et al. (2009) also report
better results achieved by using the direct method.
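The two strategies can be sketched as follows, assuming already-trained models with a scikit-learn-style predict method and lagged-input windows of width p+1 (training is omitted and the function names are illustrative): the iterated version feeds its own forecasts back as inputs, while the direct version uses one tailored model per horizon.

```python
import numpy as np

def iterated_forecast(net, history, p, horizon):
    # One single-step model; its own forecasts are fed back as inputs.
    window = list(history[-(p + 1):])
    preds = []
    for _ in range(horizon):
        yhat = net.predict(np.array(window).reshape(1, -1))[0]
        preds.append(yhat)
        window = window[1:] + [yhat]   # slide the window forward
    return preds

def direct_forecast(nets, history, p, horizon):
    # One tailored model per horizon h, trained to map (y_t, ..., y_{t-p})
    # straight to y_{t+h}; no feedback of earlier forecasts.
    x = np.array(history[-(p + 1):]).reshape(1, -1)
    return [nets[h].predict(x)[0] for h in range(horizon)]
```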

5.3 Ensemble Models

One of the major developments in neural network time series forecasting is model combin-
ing or ensemble modeling. The basic idea of this multi-model approach is the use of each
component model’s unique capability to better capture different patterns in the data. Both
theoretical and empirical findings have suggested that combining different models can be an
effective way to improve the predictive performance of each individual model, especially
when the models in the ensemble are quite different. Although a majority of the neural
ensemble literature is focused on pattern classification problems, a number of combining
schemes have been proposed for time series forecasting problems. For example, Pelikan
et al. (1992) and Ginzburg and Horn (1994) combine several feedforward neural networks for
time series forecasting. Wedding and Cios (1996) describe a combining methodology using
radial basis function networks and the Box–Jenkins models. Goh et al. (2003) use an ensemble
of boosted Elman networks for predicting drug dissolution profiles. Medeiros and Veiga
(2000) consider a hybrid time series forecasting system with neural networks used to control
the time-varying parameters of a smooth transition autoregressive model. Armano et al.
(2005) use a combined genetic-neural model to forecast stock indexes. Zhang (2003) proposes
a hybrid neural-ARIMA model for time series forecasting. Aslanargun et al. (2007) use a

similar hybrid model to forecast tourist arrivals and find improved results. Liu and Yao (1999)
develop a simultaneous training system for negatively correlated networks to overcome the
limitation of sequential or independent training methods. Khashei et al. (2008) propose a
hybrid artificial neural network and fuzzy regression model for financial time series forecast-
ing. Freitas and Rodrigues (2006) discuss the different ways of combining Gaussian radial
basis function networks. They also propose a prefiltering methodology to address the problem
caused by nonstationary time series. Wichard and Ogorzalek (2007) use several different
model architectures with an iterated prediction procedure to select the final ensemble
members.
An ensemble can be formed by multiple network architectures, the same architecture trained
with different algorithms, different initial random weights, or even different methods. The
component networks can also be developed by training with different data such as the resam-
pling data or with different inputs. In general, as discussed in Sharkey (1996) and Sharkey and
Sharkey (1997), the neural ensemble formed by varying the training data typically has more
component diversity than that trained on different starting points, with a different number of
hidden nodes, or using different algorithms. Thus, this approach is the most commonly used in
the literature. There are many different ways to alter training data including cross-validation,
bootstrapping, using different data sources or different preprocessing techniques, as well as a
combination of the above techniques (Sharkey 1996). Zhang and Berardi (2001) propose two
data-splitting schemes to form multiple sub-time-series upon which ensemble networks are
built. They find that the ensemble achieves significant improvement in forecasting perfor-
mance. Zhang (2007b) proposes a neural ensemble model based on the idea of adding noises
to the input data and forming different training sets with the jittered input data.
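A sketch in the spirit of that jittering idea follows; the noise level, member count, and network settings are illustrative assumptions, not the values used by Zhang (2007b).

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def jittered_ensemble(X, Y, n_members=10, noise_sd=0.05, seed=0):
    # Train each member on a noise-jittered copy of the inputs, so the
    # ensemble members see slightly different training data.
    rng = np.random.default_rng(seed)
    members = []
    for i in range(n_members):
        Xj = X + noise_sd * X.std(axis=0) * rng.standard_normal(X.shape)
        net = MLPRegressor(hidden_layer_sizes=(5,), solver='lbfgs',
                           max_iter=2000, random_state=i)
        members.append(net.fit(Xj, Y))
    return members

def ensemble_predict(members, X_new):
    # Simple average of the member forecasts.
    return np.mean([m.predict(X_new) for m in members], axis=0)
```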

6 Conclusions

Neural networks have become an important tool for time series forecasting. They have many
desired features that are quite suitable for practical applications. This chapter provides a
general overview of the neural networks for time series forecasting problems. Successful
application areas of neural networks as well as critical modeling issues are reviewed. It should
be emphasized that each time series forecasting situation requires a careful study of the
problem characteristics, careful examination of the data characteristics, prudent design of
the modeling strategy, and full consideration of modeling issues. Many rules of thumb in
neural networks may not be useful for a new application although good forecasting principles
and established guidelines should be followed. Researchers need to be aware of many pitfalls
that could arise in using neural networks in their research and applications (Zhang 2007a).
It is important to recognize that although some of the modeling issues are unique to neural
networks, some are general issues for any forecasting method. Therefore, good forecasting
practice and principles should be followed. It is beneficial to consult Armstrong (2001) which
provides a good source of information on useful principles for forecasting model building,
evaluation, and use.
Neural networks have achieved great successes in the field of time series forecasting. It is,
however, important to note that they may not always yield better results than traditional
methods for every forecasting task under all circumstances. Therefore, researchers should not
focus only on neural networks and completely ignore the traditional methods in their
forecasting applications. A number of forecasting competitions suggest that no single method

including neural networks is universally the best for all types of problems in every situation.
Thus it may be beneficial to combine different models (e.g., combine neural networks and
statistical models) in improving forecasting performance. Indeed, efforts to find better ways to
use neural networks for time series forecasting should continue.

References

Abdel-Aal RE (2008) Univariate modeling and forecast- Berardi LV, Zhang PG (2003) An empirical investigation
ing of monthly energy demand time series using of bias and variance in time series forecasting:
abductive and neural networks. Comput Ind Eng modeling considerations and error evaluation.
54:903–917 IEEE Trans Neural Netw 14(3):668–679
Aburto L, Weber R (2007) Improved supply chain man- Bhansali RJ (1997) Direct autoregressive predictions for
agement based on hybrid demand forecasts. Appl multistep prediction: order selection and perfor-
Soft Comput 7(1):136–144 mance relative to the plug in predictors. Stat Sin
Adya M, Collopy F (1998) How effective are neural net- 7:425–449
works at forecasting and prediction? A review and Bishop M (1995) Neural networks for pattern recogni-
evaluation. J Forecasting 17:481–495 tion. Oxford University Press, Oxford
Ang A, Piazzesi M, Wei M (2006) What does the yield Bodyanskiy Y, Popov S (2006) Neural network approach
curve tell us about GDP growth? J Econometrics to forecasting of quasiperiodic financial time series.
131:359–403 Eur J Oper Res 175:1357–1366
Armano G, Marchesi M, Murru A (2005) A hybrid ge- Box GEP, Jenkins G (1976) Time series analysis: forecast-
netic-neural architecture for stock indexes forecast- ing and control. Holden-Day, San Francisco, CA
ing. Info Sci 170(1):3–33 Cadenas E, Rivera W (2009) Short term wind speed fore-
Armstrong JS (2001) Principles of forecasting: A hand- casting in La Venta, Oaxaca, México, using artificial
book for researchers and practitioners. Kluwer, neural networks. Renewable Energy 34(1):274–278
Boston, MA Cai X, Zhang N, Venayagamoorthy GK, Wunsch DC
Ashley R (2003) Statistically significant forecasting (2007) Time series prediction with recurrent neural
improvements: how much out-of-sample data is networks trained by a hybrid PSO-EA algorithm.
likely necessary? Int J Forecasting 19(2):229–239 Neurocomputing 70:2342–2353
Aslanargun A, Mammadov M, Yazici B, Yolacan S (2007) Chakraborty K, Mehrotra K, Mohan KC, Ranka S (1992)
Comparison of ARIMA, neural networks and hybrid Forecasting the behavior of multivariate time series
models in time series: tourist arrival forecasting. using neural networks. Neural Netw 5:961–970
J Stat Comput Simulation 77(1):29–53 Chatfield C (2001) Time-series forecasting. Chapman &
Atiya AF, El-Shoura SM, Shaheen SI, El-Sherif MS (1999) Hall/CRC, Boca Raton, FL
A comparison between neural-network forecasting Chelani AB, Devotta S (2007) Prediction of ambient
techniques-case study: river flow forecasting. IEEE carbon monoxide concentration using nonlinear
Trans Neural Netw 10(2):402–409 time series analysis technique. Transportation Res
Azadeh A, Ghaderi SF, Sohrabkhani S (2007) Forecasting Part D 12:596–600
electrical consumption by integration of neural net- Chevillon G, Hendry DF (2005) Non-parametric direct
work, time series and ANOVA. Appl Math Comput multi-step estimation for forecasting economic pro-
186:1753–1761 cesses. Int J Forecasting 21:201–218
Azoff EM (1994) Neural network time series forecasting Chu FL (2008) Analyzing and forecasting tourism
of financial markets. Wiley, Chichester, UK demand with ARAR algorithm. Tourism Manag
Bakirtzis AG, Petridis V, Kiartzis SJ, Alexiadis MC, 29(6):1185–1196
Maissis AH (1996) A neural network short term Coakley JR, Brown CE (1999) Artificial neural networks
load forecasting model for the Greek power system. in accounting and finance: modeling issues. Int J
IEEE Trans Power Syst 11(2):858–863 Intell Syst Acc Finance Manag 9:119–144
Balkin DS, Ord KJ (2000) Automatic neural network Coman A, Ionescu A, Candau Y (2008) Hourly ozone
modeling for univariate time series. Int J Forecasting prediction for a 24-h horizon using neural networks.
16:509–515 Environ Model Software 23(12):1407–1421
Barreto GA (2008) Time series prediction with the self- Connor JT, Martin RD, Atlas LE (1994) Recurrent neural
organizing map: a review. Stud Comput Intell networks and robust time series prediction. IEEE
77:135–158 Trans Neural Netw 51(2):240–254
Neural Networks for Time-Series Forecasting 14 475

Cottrell M, Girard B, Girard Y, Mangeas M, Muller C Ghiassi M, Saidane H, Zimbra DK (2005) A dynamic
(1995) Neural modeling for time series: a statistical artificial neural network model for forecasting time
stepwise method for weight elimination. IEEE Trans series events. Int J Forecasting 21(2):341–362
Neural Netw 6(6):1355–1364 Ghysels E, Granger CWJ, Siklos PL (1996) Is seasonal
De Gooijer JG, Kumar K (1992) Some recent develop- adjustment a linear or nonlinear data filtering pro-
ments in non-linear time series modeling, testing, cess? J Bus Econ Stat 14:374–386
and forecasting. Int J Forecasting 8:135–156 Ginzburg I, Horn D (1994) Combined neural networks
De Groot C, Wurtz D (1991) Analysis of univariate time for time series analysis. Adv Neural Info Process Syst
series with connectionist nets: a case study of two 6:224–231
classical examples. Neurocomputing 3:177–192 Goh YW, Lim PC, Peh KK (2003) Predicting drug disso-
Doganis P, Aggelogiannaki E, Patrinos P, Sarimveis H lution profiles with an ensemble of boosted neural
(2006) Time series sales forecasting for short networks: A time series approach. IEEE Trans Neu-
shelf-life food products based on artificial neural ral Netw 14(2):459–463
networks and evolutionary computing. J Food Eng Gorr L (1994) Research prospective on neural network
75:196–204 forecasting. Int J Forecasting 10:1–4
Doganis P, Aggelogiannaki E, Sarimveis H (2008) A com- Granger CWJ (1993) Strategies for modelling nonlinear
bined model predictive control and time series time-series relationships. Econ Rec 69(206):233–238
forecasting framework for production-inventory Hansen JV, Nelson RD (2003) Forecasting and recombin-
systems. Int J Prod Res 46(24):6841–6853 ing time-series components by using neural net-
Dougherty M (1995) A review of neural networks ap- works. J Oper Res Soc 54(3):307–317
plied to transport. Transportation Res Part C 3(4): Hamzacebi C (2008) Improving artificial neural net-
247–260 works’ performance in seasonal time series forecast-
Egrioglu E, Aladag CAH, Gunay S (2008) A new model ing. Inf Sci 178:4550–4559
selection strategy in artificial neural networks. Appl Hamzacebi C, Akay D, Kutay F (2009) Comparison of
Math Comput 195:591–597 direct and iterative artificial neural network forecast
Fadlalla A, Lin CH (2001) An analysis of the applications approaches in multi-periodic time series forecast-
of neural networks in finance. Interfaces 31(4): ing. Expert Syst Appl 36(2):3839–3844
112–122 Hill T, O’Connor M, Remus W (1996) Neural network
Fahlman S, Lebiere C (1990) The cascade-correlation models for time series forecasts. Manag Sci 42:
learning architecture. In: Touretzky D (ed) Advances 1082–1092
in neural information processing systems, vol 2. Hippert HS, Pedreira CE, Souza RC (2001) Neural net-
Morgan Kaufmann, Los Altos, CA, pp 524–532 works for short-term load forecasting: a review and
Farway J, Chatfield C (1995) Time series forecasting with evaluation. IEEE Trans Power Syst 16(1):44–55
neural networks: a comparative study using the air- Hippert HS, Bunn DW, Souza RC (2005) Large neural
line data. Appl Stat 47:231–250 networks for electricity load forecasting: are they
Findley DF (1985) Model selection for multi-step-ahead overfitted? Int J Forecasting 21(3):425–434
forecasting. In: Baker HA, Young PC (eds) Proceed- Hoptroff RG (1993) The principles and practice of time
ings of the seventh symposium on identification and series forecasting and business modeling using neu-
system parameter estimation. Pergamon, Oxford, ral networks. Neural Comput Appl 1:59–66
New York, pp 1039–1044 Huskent M, Stagge P (2003) Recurrent neural networks for
Franses PH, Draisma G (1997) Recognizing changing time series classification. Neurocomputing 50:223–235
seasonal patterns using artificial neural networks. Hwang HB (2001) Insights into neural-network forecast-
J Econometrics 81:273–280 ing of time series corresponding to ARMA (p,q)
Frean M (1990) The Upstart algorithm: A method for structures. Omega 29:273–289
constructing and training feed-forward networks. Hylleberg S (1992) General introduction. In: Hylleberg S
Neural Comput 2:198–209 (ed) Modelling seasonality. Oxford University Press,
Freitas and Rodrigues (2006) Model combination Oxford, pp 3–14
in neural-based forecasting. Eur J Oper Res 173: Hylleberg S (1994) Modelling seasonal variation. In:
801–814 Hargreaves CP (ed) Nonstationary time series anal-
Gately E (1996) Neural networks for financial forecast- ysis and cointegration. Oxford University Press,
ing. Wiley, New York Oxford, pp 153–178
Gautama AK, Chelanib AB, Jaina VK, Devotta S (2008) Ittig PT (1997) A seasonal index for business. Decis Sci
A new scheme to predict chaotic time series of air 28(2):335–355
pollutant concentrations using artificial neural net- Jain A, Kumar AM (2007) Hybrid neural network models
work and nearest neighbor searching. Atmospheric for hydrologic time series forecasting. Appl Soft
Environ 42:4409–4417 Comput 7:585–592
476 14 Neural Networks for Time-Series Forecasting

Jiang X, Adeli H (2005) Dynamic wavelet neural network model for traffic flow forecasting. J Transportation Eng 131(10):771–779
Kaastra I, Boyd M (1996) Designing a neural network for forecasting financial and economic time series. Neurocomputing 10:215–236
Kang I-B (2003) Multi-period forecasting using different models for different horizons: an application to U.S. economic time series data. Int J Forecasting 19:387–400
Kermanshahi B (1998) Recurrent neural network for forecasting next 10 years loads of nine Japanese utilities. Neurocomputing 23:125–133
Khashei M, Hejazi SR, Bijari M (2008) A new hybrid artificial neural networks and fuzzy regression model for time series forecasting. Fuzzy Sets Syst 159:769–786
Khotanzad A, Afkhami-Rohani R, Lu TL, Abaye A, Davis M, Maratukulam DJ (1997) ANNSTLF—a neural-network-based electric load forecasting system. IEEE Trans Neural Netw 8(4):835–846
Kline DM (2004) Methods for multi-step time series forecasting with neural networks. In: Zhang GP (ed) Neural networks in business forecasting. Idea Group, Hershey, PA, pp 226–250
Kolarik T, Rudorfer G (1994) Time series forecasting using neural networks. APL Quote Quad 25:86–94
Krycha KA, Wagner U (1999) Applications of artificial neural networks in management science: a survey. J Retailing Consum Serv 6:185–203
Kuan C-M, Liu T (1995) Forecasting exchange rates using feedforward and recurrent neural networks. J Appl Economet 10:347–364
Lapedes A, Farber R (1987) Nonlinear signal processing using neural networks: prediction and system modeling. Technical Report LA-UR-87-2662, Los Alamos National Laboratory, Los Alamos, NM
LeBaron B, Weigend AS (1998) A bootstrap evaluation of the effect of data splitting on financial time series. IEEE Trans Neural Netw 9(1):213–220
Lennox B, Montague GA, Frith AM, Gent C, Bevan V (2001) Industrial applications of neural networks—an investigation. J Process Control 11:497–507
Liang F (2005) Bayesian neural networks for nonlinear time series forecasting. Stat Comput 15:13–29
Lin J-L, Granger CWJ (1994) Forecasting from non-linear models in practice. J Forecasting 13:1–9
Liu Y, Yao X (1999) Ensemble learning via negative correlation. Neural Netw 12:1399–1404
Maier HR, Dandy GC (2000) Neural networks for the prediction and forecasting of water resource variables: a review of modeling issues and applications. Environ Model Software 15:101–124
Makridakis S, Andersen A, Carbone R, Fildes R, Hibon M, Lewandowski R, Newton J, Parzen E, Winkler R (1982) The accuracy of extrapolation (time series) methods: results of a forecasting competition. J Forecasting 1(2):111–153
Mandic D, Chambers J (2001) Recurrent neural networks for prediction: learning algorithms, architectures and stability. Wiley, Chichester, UK
Marcellino M, Stock JH, Watson MW (2006) A comparison of direct and iterated multistep AR methods for forecasting macroeconomic time series. J Econometrics 135:499–526
Medeiros MC, Pedreira CE (2001) What are the effects of forecasting linear time series with neural networks? Eng Intell Syst 4:237–424
Medeiros MC, Veiga A (2000) A hybrid linear-neural model for time series forecasting. IEEE Trans Neural Netw 11(6):1402–1412
Medeiros MC, Teräsvirta T, Rech G (2006) Building neural network models for time series: a statistical approach. J Forecasting 25:49–75
Menezes JMP, Barreto GA (2008) Long-term time series prediction with the NARX network: an empirical evaluation. Neurocomputing 71:3335–3343
Mishra AK, Desai VR (2006) Drought forecasting using feed-forward recursive neural network. Ecol Model 198:127–138
Nakamura E (2005) Inflation forecasting using a neural network. Econ Lett 86:373–378
Nelson M, Hill T, Remus W, O'Connor M (1999) Time series forecasting using neural networks: should the data be deseasonalized first? J Forecasting 18:359–367
Palmer A, Montano JJ, Sese A (2006) Designing an artificial neural network for forecasting tourism time series. Tourism Manag 27:781–790
Parisi A, Parisi F, Díaz D (2008) Forecasting gold price changes: rolling and recursive neural network models. J Multinational Financial Manag 18(5):477–487
Parlos AG, Rais OT, Atiya AF (2000) Multi-step-ahead prediction using dynamic recurrent neural networks. Neural Netw 13:765–786
Pelikan E, de Groot C, Wurtz D (1992) Power consumption in West-Bohemia: improved forecasts with decorrelating connectionist networks. Neural Netw World 2(6):701–712
Pino P, Parreno J, Gomez A, Priore P (2008) Forecasting next-day price of electricity in the Spanish energy market using artificial neural networks. Eng Appl Artif Intell 21:53–62
Poli I, Jones DR (1994) A neural net model for prediction. J Am Stat Assoc 89:117–121
Qi M, Zhang GP (2001) An investigation of model selection criteria for neural network time series forecasting. Eur J Oper Res 132:666–680
Qi M, Zhang GP (2008) Trend time-series modeling and forecasting with neural networks. IEEE Trans Neural Netw 19(5):808–816
Reed R (1993) Pruning algorithms—a survey. IEEE Trans Neural Netw 4(5):740–747
Remus W, O'Connor M (2001) Neural networks for time series forecasting. In: Armstrong JS (ed) Principles of forecasting: a handbook for researchers and practitioners. Kluwer, Norwell, MA, pp 245–256
Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge
Rumelhart DE, McClelland JL, PDP Research Group (1986) Parallel distributed processing: explorations in the microstructure of cognition, vol 1: Foundations. MIT Press, Cambridge, MA
Sharda R, Patil RB (1992) Connectionist approach to time series prediction: an empirical test. J Intell Manufacturing 3:317–323
Sharkey AJC (1996) On combining artificial neural nets. Connect Sci 8:299–314
Sharkey AJC, Sharkey NE (1997) Combining diverse neural nets. Knowledge Eng Rev 12(3):231–247
Smith M (1993) Neural networks for statistical modeling. Van Nostrand Reinhold, New York
Stoica P, Nehorai A (1989) On multistep prediction error methods for time series models. J Forecasting 8:357–368
Swanson NR, White H (1995) A model-selection approach to assessing the information in the term structure using linear models and artificial neural networks. J Bus Econ Stat 13:265–275
Teräsvirta T, van Dijk D, Medeiros MC (2005) Linear models, smooth transition autoregressions, and neural networks for forecasting macroeconomic time series: a re-examination. Int J Forecasting 21(4):755–774
Tiao GC, Tsay RS (1994) Some advances in non-linear and adaptive modeling in time-series. J Forecasting 13:109–131
Vellido A, Lisboa PJG, Vaughan J (1999) Neural networks in business: a survey of applications (1992–1998). Expert Syst Appl 17:51–70
Vermaak J, Botha EC (1998) Recurrent neural networks for short-term load forecasting. IEEE Trans Power Syst 13(1):126–132
Wang Y-H (2009) Nonlinear neural network forecasting model for stock index option price: hybrid GJR-GARCH approach. Expert Syst Appl 36(1):564–570
Wedding DK II, Cios KJ (1996) Time series forecasting by combining RBF networks, certainty factors, and the Box-Jenkins model. Neurocomputing 10:149–168
Weigend AS, Gershenfeld NA (1994) Time series prediction: forecasting the future and understanding the past. Addison-Wesley, Reading, MA
Weigend AS, Huberman BA, Rumelhart DE (1990) Predicting the future: a connectionist approach. Int J Neural Syst 1:193–209
Weigend AS, Huberman BA, Rumelhart DE (1992) Predicting sunspots and exchange rates with connectionist networks. In: Casdagli M, Eubank S (eds) Nonlinear modeling and forecasting. Addison-Wesley, Redwood City, CA, pp 395–432
Werbos P (1974) Beyond regression: new tools for prediction and analysis in the behavioral sciences. Ph.D. thesis, Harvard University
Wichard J, Ogorzalek M (2007) Time series prediction with ensemble models applied to the CATS benchmark. Neurocomputing 70:2371–2378
Wong BK, Selvi Y (1998) Neural network applications in finance: a review and analysis of literature (1990–1996). Inf Manag 34:129–139
Wong BK, Lai VS, Lam J (2000) A bibliography of neural network business applications research: 1994–1998. Comput Oper Res 27:1045–1076
Xiao Z, Ye SJ, Zhong B, Sun CX (2009) BP neural network with rough set for short term load forecasting. Expert Syst Appl 36(1):273–279
Zhang G, Patuwo EP, Hu MY (1998) Forecasting with artificial neural networks: the state of the art. Int J Forecasting 14:35–62
Zhang GP (2001) An investigation of neural networks for linear time-series forecasting. Comput Oper Res 28:1183–1202
Zhang GP (2003) Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 50:159–175
Zhang GP (2004) Neural networks in business forecasting. Idea Group, Hershey, PA
Zhang GP (2007a) Avoiding pitfalls in neural network research. IEEE Trans Syst Man Cybern 37:3–16
Zhang GP (2007b) A neural network ensemble method with jittered training data for time series forecasting. Inf Sci 177:5329–5346
Zhang GP, Berardi LV (2001) Time series forecasting with neural network ensembles: an application for exchange rate prediction. J Oper Res Soc 52(6):652–664
Zhang GP, Kline DM (2007) Quarterly time-series forecasting with neural networks. IEEE Trans Neural Netw 18(6):1800–1814
Zhang GP, Patuwo EP, Hu MY (2001) A simulation study of artificial neural networks for nonlinear time series forecasting. Comput Oper Res 28:381–396
Zhang GP, Qi M (2002) Predicting consumer retail sales using neural networks. In: Smith K, Gupta J (eds) Neural networks in business: techniques and applications. Idea Group, Hershey, PA, pp 26–40
Zhang GP, Qi M (2005) Neural network forecasting for seasonal and trend time series. Eur J Oper Res 160(2):501–514
Zhang YQ, Wan X (2007) Statistical fuzzy interval neural networks for currency exchange rate time series prediction. Appl Soft Comput 7:1149–1156
Zou HF, Xia GP, Yang FT, Wang HY (2007) An investigation and comparison of artificial neural network and time series models for Chinese food grain price forecasting. Neurocomputing 70:2913–2923