Rule Discovery and Technical Analysis Expert System Development in Tehran Stock Exchange (TSE) Using Decision Trees and Neural Networks, Hooman Hidaji
Abstract
In this study we develop a model for predicting prices on the Tehran Stock Exchange using data mining techniques, specifically an artificial neural network followed by decision trees, and we present a new way of extracting rules. Technical indicators are used as inputs to the model. First, a suitable neural network is created and trained on the stock data; the intention is to keep the model simple. The network is able to predict the equity price trend with an acceptable hit rate. The trained network is then used as the input for a decision tree. It is believed that rules induced from this tree perform better than those derived directly from the data. Finally, a rule-based feed-forward expert system is developed.
Keywords: Tehran Stock Exchange, technical analysis, neural network, decision tree, rule induction, expert system
Introduction
The Tehran Stock Exchange (TSE) is an emerging market compared with the markets of developed countries. Since its opening in February 1967 it has had many ups and downs. Two of the events that affected the TSE most were the 1979 Islamic revolution and the eight-year war. These events made the financial markets turbulent and uncertain about the future and caused them a great deal of damage. Since the war, however, the TSE has come a long way. Today it has evolved into a growing marketplace where individual and institutional investors trade the securities of over 420 companies. Despite this growth, little academic study was carried out on the TSE until several years ago, and a large gap could be seen. In recent years, due to the government's privatization strategies, the TSE has played an important role in the country's economy. Academic activity has also grown in universities, and institutional research has received a boost. Reflecting this significance, academic courses have been launched and many researchers have developed an interest in the stock market. The result has been an unprecedented number of dissertations, papers and reviews in this area, e.g. Yousefpour (2007), Mohammadnia (2008), Alizadeh (2008).
Stock market data are intrinsically complicated. Numerous interrelated economic, cognitive, political and other factors affect financial markets and make them difficult to predict with simple linear models. On the other hand, the development of intelligent techniques such as neural networks, genetic algorithms and fuzzy logic, together with the abundance of processing machines and PCs, has opened new horizons for studies in this field. Some instances of such studies can be found in the works of Timajchi (2008), Soleimani (2008) and Aghamiri (2009).
Technical analysis has been the primary tool for predicting stock patterns for years now. It incorporates different aspects of human behavior, economic and political events, etc.
The success of technical analysis has made it more desirable, and today financial advisors and brokerage firms seldom make decisions without using it (Chavarnakul & Enke 2009). It is important to have a good tool that can rapidly infer from raw data and turn them into useful information that can be used as signals in the market. The weak form of the efficient market hypothesis, a theory that has been influential for several decades, regards technical analysis as a mere waste of time and resources, but numerous researchers have shown that this theory does not hold in various markets, e.g. Blume, Easley & O'Hara 1994, Lo et al. 2000. Grundy and Kim (2002) also find value in using technical analysis. In recent years, technical analysis has proven to be powerful for evaluating stock prices and is widely accepted among financial economists and brokerage firms. This is because technical analysis appears to be a compromise tool, since it offers a relative mixture of human, political, and economic events (Achelis, 1995). With the abundance of computers, on the other hand, it is useful to create systems that can do the drudgery of technical analysts. Decision support systems and expert systems can effectively be put to work for this purpose, e.g. Zareiee 2009.
Researchers are always trying to reach better solutions through the use of intelligent techniques. Financial markets behave in a complex manner, and simple linear or ARIMA models cannot interpret or predict them effectively. Neural networks, on the other hand, have shown great potential for different tasks, including prediction. These networks can learn patterns and predict future trends. In time series problems they are replacing traditional statistical models such as Box-Jenkins models. Refenes et al. indicated that traditional statistical techniques for forecasting have reached their limits in applications with nonlinearities in the data set, such as stock indices (Yao & Tan 2000).
While powerful in prediction, neural networks have weak explanatory ability. A neural network's representation is incomprehensible to humans: one cannot understand on what grounds a prediction is made, and the network is said to act as a black box. In everyday applications, rule-based systems are preferred as decision aids because they provide easy-to-understand rules in symbolic form. This weakness of neural networks calls for some sort of compensation.
One technique that can be combined with a neural network to improve its effectiveness and understandability is the decision tree. Decision trees give a good understanding of the problem domain in a schematic form of nodes and arcs. With the selected attributes and variables, a tree can then provide easy-to-understand IF-THEN rules, which can easily be implemented in a decision support system. A decision tree is a data mining technique, usually an algorithm that uses some splitting criterion and a target to separate the data into classes that are each homogeneous. Decision trees have been used for classification, clustering, prediction and similar tasks. At each branch of the tree the data are split according to one variable, and this process continues until the desired difference between leaves is achieved. Each leaf represents a rule: following the criteria used along the path to the leaf, we can rephrase the path as an IF-THEN rule. Rule-based systems present knowledge in the symbolic IF-THEN form, so the user can examine the solution based on his or her knowledge of the field without needing to understand the underlying method. Neural networks cannot practically supply this kind of knowledge. It is therefore very desirable to develop a system that exploits the synergy of these techniques: it learns the underlying patterns via a neural network and represents them by rules induced from a decision tree.
In this research we create an artificial neural network for predicting several equity prices on the Tehran Stock Exchange. The network predicts whether the price will rise or fall based on several technical indicators chosen for this purpose. This network is then used as the input for a decision tree, and several rules are generated. These rules are compared with the rules induced directly from the data. In the next section the vast literature in the area is discussed. Then technical analysis, neural networks and decision trees are introduced briefly. After that, the development of the model is presented, its parameters, inputs and functioning are explained, and the expert system is presented.
1. Literature review
In recent years intelligent techniques such as neural networks have been extensively applied in stock market prediction, investment, and technical analysis. Yao & Tan (2000) showed that a neural network model is applicable to the prediction of foreign exchange rates. Time series data and technical indicators, such as the moving average, were fed to neural networks to capture the underlying rules of the movement in currency exchange rates. The study showed that no complex indicator was needed to do so; the simple moving average was enough for the prediction. It was also another challenge to the weak form of the efficient market hypothesis. The model was run for five exchange rates in FOREX and the results were acceptable. Chavarnakul & Enke (2009) developed a complex hybrid stock trading system that integrated neural networks, fuzzy logic, and genetic algorithms to increase the efficiency of stock trading. The study used the volume adjusted moving average (VAMA), a technical indicator developed from equivolume charting, and introduced a neuro-fuzzy-based genetic algorithm system. The results showed that the hybrid system took advantage of the synergy among these techniques to generate better trading decisions for the VAMA, allowing investors to make better stock trading decisions. The study improved returns both with and without transaction costs. Leigh, Purvis & Ragusa (2002) studied a hybrid DSS that combined technical analysis, pattern recognition, a neural network and a genetic algorithm to form a romantic DSS. The study showed that the bull flag price and volume pattern heuristics of technical analysis can be used effectively to make a profit. De Faria et al. (2009) predicted the Brazilian stock market with neural networks and adaptive exponential smoothing methods and concluded that neural networks outperformed the latter. Wang & Chan (2007) examined the potential profit of bull flag technical trading rules using a template matching technique based on pattern recognition for the Nasdaq Composite Index and the Taiwan Weighted Index.
The empirical results indicated that all of the technical trading rules correctly predicted the direction of changes, a finding that may provide investors with important information on asset allocation. The results also demonstrated that the average return of trading rules conditioned on the bull flag was significantly better than that of buying every day over the study period. Rotundo & Ausloos (2007) examined technical analysis using a co-evolution Bak-Sneppen model. They tested several signals by considering microeconomic factors. This model is able to explain unusual events. Friesen, Weller & Dunham (2009) examined technical trading rules derived from historical data to find out why these rules are profitable. Their paper provides a model that explains the success of certain trading rules that are based on patterns in past prices. The study notes the importance of confirmation bias, which has been shown to play a key role in other types of decision making. Traders who acquire information and trade on it tend to bias their interpretation of subsequent information in the direction of their original view. This produces autocorrelations and patterns of price movement that can predict future prices. Fernández-Blanco et al. (2008) used an evolutionary algorithm to optimize the parameter values of technical indicators. They focused specifically on Moving Average Convergence-Divergence (MACD) and were able to improve its performance. Zhu & Zhou (2009) analyzed the usefulness of technical analysis, specifically the widely employed moving average trading rule, from an asset allocation perspective and showed that, when stock returns are predictable, technical analysis adds value to commonly used allocation rules that invest fixed proportions of wealth in stocks.
When uncertainty exists about predictability, which is likely in practice, the fixed allocation rules combined with technical analysis can outperform the prior-dependent optimal learning rule when the prior is not too informative. Moreover, the technical trading rules are robust to model specification, and they tend to substantially outperform the model-based optimal trading strategies when the model governing the stock price is uncertain. Leigh, Paz & Purvis (2002) used a feed-forward neural network and a pattern recognition technique to predict short-term increases in the NYSE index. This kind of prediction is now rarely used, however, and processing is usually done on the data itself. Enke & Thawornwong (2005) studied the use of neural networks and data mining in financial market analysis. They proposed a data selection method that could improve results, and developed several market strategies whose final results were far better than those of a buy-and-hold strategy. Tsang et al. (2007) developed the NN5 neural network for predicting Hong Kong stock prices, which used several technical indicators such as the moving average, open and close prices, RSI and MACD. The network issued buy/sell signals based on the one-day change in price and had an acceptable hit rate. Skouras (2001) introduced an Artificial Technical Analyst that could select trading rules and learn from experience, so that the resulting rules could perform better than traditional rules. Huang & Tsai (2009) used SVM combined with a Self-Organizing Feature Map (SOFM) and filter-based feature selection to improve market index predictions. The hybrid system reduced time and cost and improved on the simple SVM system, successfully combining several methods. Lam (2004) investigated the ability of neural networks, specifically the backpropagation algorithm, to integrate fundamental and technical analysis for financial performance prediction. The predictor attributes include 16 financial statement variables and 11 macroeconomic variables. Mokhtari & Ashouri (2007) developed a backpropagation and genetic algorithm hybrid to predict monthly minima and maxima. Saraiee, Faraki & Keikhaiee (2007) created another backpropagation neural network for predicting different parameters of stock markets. Chang et al. (2009) developed an integrated system called CBDWNN by combining dynamic time windows, case-based reasoning (CBR), and a neural network for stock trading prediction. The system developed in that research is a first attempt in the literature to predict the sell/buy decision points instead of the stock price itself. The research uses the backpropagation algorithm to train the network and achieves good results. It also develops a support system for stock prediction, and is another example of a good hybrid system.
Others have tried to develop comprehensive systems, building DSSs and expert systems to support stock recommendation and trading. One of the best studies in this field was carried out by Lee & Jo (1999), who developed an expert system that gives recommendations based on candlestick charts.
The knowledge base proved to be accurate and the system worked effectively. Ha et al. (2009) followed a rule discovery approach and developed a recommendation system for the stock exchange; the research tries to form a rule base from a stock database. Fazel Zarandi et al. (2009) created a type-2 fuzzy rule-based expert system that used both technical and fundamental indexes to forecast stock prices and achieved significant results. Wen et al. (2010) developed an automatic stock decision support system based on box theory and the SVM algorithm. SVM is thought to be a robust method for financial time-series forecasting. The developed system uses box theory and two SVMs to create buy/sell signals. This study is a novel one and has potential for future work. Wang (2003) attempted to predict stock prices via strong rules. He developed a fuzzy rough set system consisting of two agents: a visual agent for monitoring the price and a mining agent that helped traders distinguish buy and sell signals.
On rule induction and the use of decision trees in financial markets, Craven and Shavlik (1997) developed the Trepan algorithm. Trepan combined a neural network with a decision tree algorithm similar to CART. In that research the network was used for predicting the Dollar-Mark exchange rate, and the network was then fed to a decision tree to create rules. The rules generated in this manner were far better than the rules generated directly from the data. The neural network seemed to reduce the effect of noise in the data and so created more confident rules. Andrews, Diederich & Tickle (1995) have also surveyed this field. Wang & Chan (2009) made an empirical study of trading rule discovery and proposed a template for doing so. Previous studies had not explained why some assumptions are made about specific indexes or data, so this study can be used as a black box for future research. This study uses chart data to infer buy signals.
The proposed rules perform well on US technology stocks and can be used as part of an expert system. Thammano (1999) used a neuro-fuzzy model to predict future values of Thailand's largest government-owned bank. The inputs of the model were the closing prices for the current and prior three months and the profitability ratios. The output of the model was the stock prices for the following three months. He concluded that the neuro-fuzzy architecture was able to recognize the general characteristics of the stock market faster and more accurately than the basic backpropagation algorithm. It could also predict investment opportunities during the economic crisis, when statistical approaches did not yield satisfactory results.
2. Technical analysis
It is useful to first take a look at the existing theories in financial markets and their viewpoint toward technical analysis.
technique uses charts to study the movement of stock prices. The use of technical indicators is another technique, which applies calculations or mathematical equations to inform trading decisions (Chavarnakul & Enke 2008). A technical indicator is a series of data points derived by applying a formula to the price data series. Price data include any combination of the opening, high, low or closing values over a period of time. Some indicators require only the closing prices; others incorporate volume or other kinds of information into their formula. Price data are entered into the formula, and each data point is produced from the selected information. A single data point does not offer much information and is not enough to make an indicator; a series of data points over a period of time is required to create valid reference points for analysis. With such a series, a comparison can be made between present and past levels. Many investors and traders use indicators to predict the direction of future prices. For analysis purposes, technical indicators may be shown graphically above or below a price chart. Indicators are usually classified into two broad classes (Reily 1989): oscillators, or leading indicators, and lagging indicators. Leading indicators are designed to lead price movements. They represent a form of price momentum over a fixed look-back period, which is the time span used to calculate the indicator. For example, a 20-day Stochastic Oscillator would use the past 20 days of price action (about a month) in its calculation; all prior price action would be ignored. Some of the most popular leading indicators include the Commodity Channel Index (CCI), Momentum, the Relative Strength Index (RSI), the Stochastic Oscillator and Williams %R. Lagging indicators follow the price action and are commonly referred to as trend-following indicators.
Trend-following indicators work best when markets develop strong trends. They are designed to get traders in and keep them in as long as the trend is intact. These indicators are not effective in trading or sideways markets; if used in such markets, trend-following indicators will likely lead to many false signals and whipsaws. Some popular trend-following indicators include moving averages (exponential, simple, and weighted) and Moving Average Convergence-Divergence (MACD). These indicators are used in this paper to demonstrate the proposed technique (Fernández-Blanco et al. 2008).
3. Neural Networks
An artificial neural network (ANN), usually called a neural network (NN), is a mathematical or computational model that tries to simulate the structure and/or functional aspects of biological neural networks. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. Neural networks are non-linear statistical data modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data. Like the linear and polynomial approximation methods, a neural network relates a set of input variables to a set of one or more output variables. The difference between a neural network and the other approximation methods is that the neural network makes use of one or more hidden layers, in which the input variables are transformed by a special function known as the transfer function. While this hidden-layer approach may seem esoteric, it represents a very efficient way to model nonlinear statistical processes (McNelis 2005). The collection of these simple units, called nodes, creates a complicated whole that has many applications in different sciences and industries.
Financial market data are not simple ARIMA processes; they cannot be described by simple linear models, and they are not simple white noise or even random walks. Because of the high volatility, complexity, and noise of the market environment, neural networks are prime candidates for prediction purposes (Yao & Tan 2000). Many factors affect prices in an interrelated manner, producing complex nonlinear behavior, so predicting stock price movements is a very difficult task. The use of appropriate intelligent systems has grown rapidly in recent years and has helped with this task. These techniques have helped market makers decide more confidently. Neural networks, fuzzy logic and genetic algorithms are obvious examples of such intelligent techniques that have been applied to a large variety of applications in technical analysis.
According to Lam (2004), neural networks are appropriate tools for forecasting financial performance for the following reasons. First, a neural network is an intrinsically numeric method, which is especially suitable for processing numeric data, specifically financial data and indicators. Second, neural networks do not require any limiting assumption about the distribution of the input data, e.g. that the distribution be normal. This allows neural networks to be applied to a wider collection of problems than statistical techniques such as regression. Third, neural networks can be trained further after the first training phase: new data can be introduced to retrain and update the network in a parallel manner and thus steadily improve its performance. ANNs appear to be particularly suited for financial time series forecasting, as they can learn highly non-linear models, have general learning algorithms, can handle noisy data, and can use inputs of different kinds (Chang et al. 2009). In the area of stock market forecasting, many studies have focused on the application of ANNs, which have been reported as good methods for predicting stock market levels (Huang & Cheng 2009). Neural networks can be trained to map past values of a time series for purposes of classification or prediction, and have recently become a popular tool for financial decision making (Leigh, Purvis & Ragusa 2002). Multi-layer feed-forward neural networks have been widely used for financial forecasting due to their ability to correctly classify and predict the dependent variable (Vellido, Lisboa, and Vaughan, 1999).
Backpropagation is by far the most popular algorithm for training multi-layer feed-forward neural networks. It is one of the most widely used learning methods for multilayer networks and is believed to have great potential for financial forecasting (Adya & Collopy 1998). Since feed-forward neural networks are well known, the network structures and backpropagation algorithms are not described in this paper. Although neural networks as a data mining tool have the above merits, they have their fair share of problems. One common difficulty in neural network applications is determining the optimal combination of training parameters, including the network architecture (the number of hidden layers and the number of hidden nodes), the learning rate, the order in which training examples are submitted to the network, and the number of training epochs. There are various heuristic rules and common practices for selecting the parameters, but the selection process remains an art rather than a science and varies from problem to problem. This problem is sometimes called the black box criticism (McNelis 2005). Another problem is understandability. A neural network's representation is not comprehensible to humans; one cannot infer from the mechanism of the network why an output is derived. It is because of this shortcoming that in our research we tried to improve its representation via a symbolic technique, i.e. the decision tree.
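As a rough illustration of the hidden-layer idea described in this section, the following sketch computes a forward pass through a small feed-forward network whose transfer function is tanh. The layer sizes, the random weights and the function names (`init_layer`, `forward`) are assumptions made for illustration only; this is not the trained network of this study, and no training step is shown.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layers, inputs):
    """Propagate an input vector through every layer using a tanh transfer function."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations

rng = random.Random(0)
sizes = [11, 25, 25, 1]  # inputs, two hidden layers, one output (illustrative sizes)
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
x = [rng.uniform(-1.0, 1.0) for _ in range(sizes[0])]  # inputs already scaled to [-1, 1]
y = forward(layers, x)  # a single value in (-1, 1); its sign could be read as up/down
```

Each hidden node applies the same nonlinear transformation to a weighted sum of the previous layer, which is what lets the stacked layers represent nonlinear relationships that a linear model cannot.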
5. Model Development
5.1 Inputs
Selection of the input variables is a modeling decision that can greatly affect model performance. Our intention in this study is to use solely technical indicators for prediction. There are numerous indicators in the area of technical analysis, of which the moving average may be the most famous. Some indicators, however, are more widely accepted and used by more researchers. In this research we have tried to use well-established indicators (Tsang et al. 2007, Yao & Tan 2000, Huang & Tsai 2009, Lam 2004), and in particular we have drawn on the earlier research of Timajchi (2008) on the Tehran Stock Exchange and the indicators that perform better in it. Below is a short description of the indicators used as inputs to the model.
Day information
It is fair to assume that the month, the day of the month and the day of the week may affect the security price trend, if not on their own then in combination with other factors. We have therefore incorporated this information in the model as three separate inputs. Many researchers use this information as inputs to their models, e.g. Zareiee 2009.
Moving Average
The moving average is one of the most popular and easy-to-use tools in technical analysis. It smooths a data series and makes it easier to spot trends, which is especially helpful in volatile markets. It also forms the building block for many other technical indicators. Its disadvantage is that it is late, or lagging. In this study we used a 10-day exponential moving average to capture short-period changes and a 25-day simple moving average, proposed by Timajchi (2008) as an appropriate index for the TSE. The interpretation of MAs is rather simple: a buy signal is created when the price rises above the MA and a sell signal when it falls below.
Bollinger Bands
Bollinger Bands consist of three bands designed to encompass the majority of a security's price action: a simple moving average in the middle, an upper band (SMA plus 2 standard deviations) and a lower band (SMA minus 2 standard deviations). Bollinger Bands give an idea of the price range and give a signal when the price reaches the upper or lower band. Timajchi (2008) recommended 20-day Bollinger Bands for use in the TSE.
Aroon Oscillator
Aroon is an indicator system used to determine whether a stock is trending and how strong the trend is. The Aroon oscillator is formed from two indicators, Aroon up and Aroon down. In the study carried out by Timajchi (2008), the 14-day Aroon oscillator was found to work best in the TSE.
Rate of Change (ROC)
The ROC indicator is a very simple yet effective momentum oscillator that measures the percent change in price from one period to the next. The oscillator can be used like any other momentum oscillator by looking for higher lows, lower highs, positive and negative divergences, and crosses above and below zero for signals. Timajchi (2008) compared ROC with similar indicators such as CCI and concluded that the 3-day ROC outperforms other momentum indicators. We nevertheless also used the RSI.
Relative Strength Index (RSI)
The RSI is another extremely useful and popular momentum oscillator. It compares the magnitude of a stock's recent gains with the magnitude of its recent losses and turns that information into a number that ranges from 0 to 100. It takes a single parameter, the number of time periods to use in the calculation. We use a 14-day RSI in our study, as proposed by both the creator of the index (Wilder 1987) and Timajchi (2008).
Moving Average Convergence-Divergence (MACD)
MACD is one of the simplest and most reliable technical indicators available. MACD uses moving averages, which are lagging indicators, to include some trend-following characteristics. These indicators are turned into a momentum oscillator by subtracting the longer moving average from the shorter moving average. The resulting plot forms a line that oscillates above and below zero, without any upper or lower limits.
Volume Indicators
Volume is one of the important predictive factors in stock markets. Sudden rises and falls in the volume trend can be relatively accurate signals. We have therefore used the volume itself as well as two volume indicators as inputs to the model: Volume Oscillator Points and Price Volume Trend. The Volume Oscillator displays the difference between two moving averages of a security's volume. It can be used to determine whether the overall volume trend is increasing or decreasing, and so acts like a MACD for volume. There are many ways to interpret changes in volume trends. The Volume Oscillator can be calculated in points or as a percentage; we chose to use points. The Price Volume Trend (PVT) is similar to On Balance Volume (OBV) in that it is a cumulative total of volume that is adjusted depending on changes in closing prices, but unlike OBV, the PVT adds or subtracts only a portion of the daily volume. Many investors feel that the PVT illustrates the flow of money into and out of a security more accurately than OBV does (Achelis 1995).
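The price-based indicators described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the EMA is seeded with the first observation, the RSI uses simple averages of gains and losses rather than Wilder's smoothed averages, and the 12/26 MACD periods are the conventional defaults, which the text does not specify.

```python
def sma(values, n):
    """Simple moving average; None until n observations are available."""
    return [None if i + 1 < n else sum(values[i + 1 - n:i + 1]) / n
            for i in range(len(values))]

def ema(values, n):
    """Exponential moving average with smoothing 2/(n+1), seeded with the first value."""
    alpha = 2.0 / (n + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def bollinger(values, n=20, k=2.0):
    """Upper and lower bands: SMA plus/minus k population standard deviations."""
    upper, lower = [], []
    for i, mid in enumerate(sma(values, n)):
        if mid is None:
            upper.append(None)
            lower.append(None)
            continue
        window = values[i + 1 - n:i + 1]
        sd = (sum((v - mid) ** 2 for v in window) / n) ** 0.5
        upper.append(mid + k * sd)
        lower.append(mid - k * sd)
    return upper, lower

def roc(values, n=3):
    """Percent change relative to the value n periods earlier."""
    return [None if i < n else 100.0 * (values[i] - values[i - n]) / values[i - n]
            for i in range(len(values))]

def rsi(values, n=14):
    """RSI from simple averages of gains and losses over the last n changes."""
    out = [None] * len(values)
    for i in range(n, len(values)):
        changes = [values[j] - values[j - 1] for j in range(i - n + 1, i + 1)]
        gain = sum(c for c in changes if c > 0) / n
        loss = sum(-c for c in changes if c < 0) / n
        out[i] = 100.0 if loss == 0 else 100.0 - 100.0 / (1.0 + gain / loss)
    return out

def macd(values, fast=12, slow=26):
    """Shorter EMA minus longer EMA; oscillates around zero."""
    return [f - s for f, s in zip(ema(values, fast), ema(values, slow))]

closes = [100.0 + i for i in range(30)]  # hypothetical, steadily rising closing prices
ema10, sma25 = ema(closes, 10), sma(closes, 25)
upper, lower = bollinger(closes, 20)
```

Entries are `None` until enough history exists for the look-back period, so the first rows of a series would normally be dropped before they are passed to a model.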
Table 1 Model inputs
Input                   Description
Month                   A number between 1 and 12 indicating the number of the month
Day of month            A number between 1 and 31 indicating the number of the day in the month
Day of week             A number between 1 and 5 indicating the number of the day in the week
Volume                  The volume of the stocks traded
Closing price           Closing price of the day
25-D SMA                25-day simple moving average
10-D EMA                10-day exponential moving average
Bollinger Upper Band    20-day Bollinger upper limit
Bollinger Lower Band    20-day Bollinger lower limit
Aroon Osc.              Tushar Chande's Aroon Oscillator
Price ROC               3-day Rate-of-Change of equity
Wilder RSI              Relative Strength Index
MACD                    Moving Average Convergence-Divergence
Vol.Osc. Points         Difference between two moving averages of a security's volume
Price Vol.Trend         The cumulative total of volume that is adjusted depending on changes in closing prices
Note that we developed models for five equities, and not all of the inputs were used for every model.
Company Name: Parsian Bank, Jaaber Ebne Hayyaan Pharmacy, National Copper Company, Hepco Heavy Equipment Production Company, Zamiad Company
Symbol
TEpco      Feb-04    272,095      397,800,000
KhZamia    Feb-99    1,528,800    1,200,000,000
The values of the input variables were first preprocessed by normalizing them to the range -1 to +1, to minimize the effect of differences in magnitude among the inputs and thus increase the effectiveness of the learning algorithm. Mapping the maximum and minimum of the data to 1 and -1 is very common in training neural networks. This also reduces the training time required for the network. The processing times in this research were negligible, always less than a minute and usually a few seconds, so we chose not to perform a sensitivity analysis for processing time. The data are chosen and split in time order. In other words, the data of the earlier period are used for training, validation and primary testing, and the data of the later period are used for final testing of the model. The learning algorithm used is Levenberg-Marquardt, one of the most widely used training algorithms for neural networks. The performance of the network is measured by the Mean Squared Error (MSE) between the model outputs and the real targets. MSE is the most accepted and widely used performance measure and is the default in most software packages. Another performance measure is the gradient of the performance function, which is used to measure the error and further train the network. In most cases the network had two hidden layers with 20 to 35 nodes in each. We tried to keep the model as simple as possible.
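The preprocessing and evaluation steps described above can be sketched as follows. This is a minimal illustration, assuming min-max scaling per input column; the function names and the 80/20 split fraction are hypothetical, since the text does not state the proportions used.

```python
def scale_to_unit_range(column):
    """Map the column minimum to -1 and the maximum to +1 (min-max scaling)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: nothing to scale
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in column]

def chronological_split(rows, train_fraction=0.8):
    """Earlier rows for training/validation, later rows for final testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def mean_squared_error(predicted, actual):
    """Average squared difference between model outputs and real targets."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

volumes = [10.0, 15.0, 20.0]              # hypothetical raw input column
scaled = scale_to_unit_range(volumes)     # [-1.0, 0.0, 1.0]
earlier, later = chronological_split(list(range(10)))
```

Because the split is in time order, the rows must already be sorted by date and must not be shuffled beforehand. One design choice left open here is whether the minimum and maximum are taken from the whole series or from the earlier period only; using only the earlier period avoids leaking information from the test period into the inputs.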
Fmelli     11-25-25-1    0.0052    0.0276    60.00%
TEpco      14-25-25-1    0.0024    0.005     74.00%
KhZamia    14-25-25-2    0.0089    0.0043    75.00%
Table 5 summarizes the performance of the rules. Confidence is the probability that the result of a rule holds, given that the propositions of the rule hold; it is a measure of the goodness of the rule. Rules that generalize well are more desirable. The number of instances measures the generalizability of a rule, that is, how often the rule can be fired. Another criterion for the goodness of a rule is a small number of propositions: rules with too many ANDs are hard to put to work.
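The two measures defined above can be computed for a single rule as follows. This is a minimal sketch: the rule representation, the field names (`rsi`, `price_roc`, `trend`) and the thresholds are hypothetical and chosen only for illustration.

```python
import operator
from dataclasses import dataclass

# Comparison operators allowed in a proposition.
OPS = {"<=": operator.le, ">": operator.gt}


@dataclass
class Rule:
    conditions: list  # [(feature_name, "<=" or ">", threshold), ...], joined by AND
    outcome: str      # predicted result, e.g. "up" or "down"

    def matches(self, record):
        """True if every proposition of the rule holds for the record."""
        return all(OPS[op](record[name], thr) for name, op, thr in self.conditions)


def rule_stats(rule, records, target_key="trend"):
    """Return (number of entailing instances, confidence) for one rule.

    Confidence is the share of matching records whose actual outcome
    equals the outcome predicted by the rule; None if nothing matches.
    """
    covered = [r for r in records if rule.matches(r)]
    if not covered:
        return 0, None
    correct = sum(1 for r in covered if r[target_key] == rule.outcome)
    return len(covered), correct / len(covered)
```

Averaging the second value over all rules gives a figure comparable to the average rule confidence, and summing the first value gives the total number of entailing instances.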
Table 5 Rule set characteristics (directly extracted from data)

Number of rules                      27
Average rule confidence              0.79
Total entailing instances            380 (76% of total)
Average entailing instances          14.07
Average propositions in each rule    5
Almost half of the rules are induced at the 6th level of the tree below the root, which means that for about half of the rules the branch ends at the edge of the tree (a limit of 6 levels was applied to the tree).

We then created another decision tree, this time using data produced by the neural network developed earlier. To do so, we generate random input data (according to the distribution of the variables) and feed it to the network. The output of the network is passed to the decision tree algorithm as the target values. We believe that the randomly generated data and targets can better represent the underlying patterns of the market, because there is less noise in the data. We generated 500 records to match the number of records we had for direct rule induction, though better results could be achieved with more data. The data are used in the same manner as before for rule extraction, and rules are induced from the tree under the same constraints. In this case the number of rules fell to 24: 12 rules for cases predicting lower closing prices and 12 for cases predicting higher prices. Although the number of rules decreased, the extracted rules apply to more instances. These rules were applied to the same dataset to calculate their performance on real data and compare it with the direct approach. The performance of the rules is summarized in Table 6.
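The procedure of this section (sample random inputs, label them with the trained network, then induce a depth-limited tree and read one rule off each leaf) can be sketched as below. Several parts are stand-ins and not the method used in the study. `stand_in_network` is a hypothetical placeholder for the trained network, uniform sampling replaces the fitted distributions of the variables, and a small Gini-based binary tree replaces the SPSS Clementine tree algorithm.

```python
import random


def sample_inputs(n, ranges, rng):
    """Draw n random records; uniform ranges are a placeholder for the fitted distributions."""
    return [[rng.uniform(lo, hi) for lo, hi in ranges] for _ in range(n)]


def stand_in_network(x):
    """Hypothetical placeholder for the trained network: 1 = price up, 0 = price down."""
    return 1 if x[0] + 0.5 * x[1] > 0 else 0


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(X, y):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for a, b in zip(values, values[1:]):
            thr = (a + b) / 2.0
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score - 1e-12:
                best, best_score = (f, thr), score
    return best


def extract_rules(X, y, depth=0, max_depth=6, path=()):
    """Grow a depth-limited tree; return one rule per leaf.

    Each rule is (conditions, predicted_label, confidence, n_instances),
    where conditions is a list of (feature_index, "<=" or ">", threshold).
    """
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        ups = sum(y)
        label = 1 if 2 * ups >= len(y) else 0
        confidence = max(ups, len(y) - ups) / len(y)
        return [(list(path), label, confidence, len(y))]
    f, thr = split
    rules = []
    for op, keep in (("<=", lambda v: v <= thr), (">", lambda v: v > thr)):
        part = [(row, t) for row, t in zip(X, y) if keep(row[f])]
        px, py = [row for row, _ in part], [t for _, t in part]
        rules += extract_rules(px, py, depth + 1, max_depth, path + ((f, op, thr),))
    return rules


if __name__ == "__main__":
    rng = random.Random(0)
    X = sample_inputs(200, [(-1.0, 1.0), (-1.0, 1.0)], rng)
    y = [stand_in_network(x) for x in X]
    for conditions, label, confidence, n in extract_rules(X, y, max_depth=6):
        print(conditions, "->", label, round(confidence, 2), n)
```

The confidences printed here are measured on the generated records. As in the text, the rules would then be re-scored on the real dataset before being compared with the rules induced directly from the data.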
As can be seen, the confidence of the rules and the generalizability of the model have improved. Although there are fewer rules, they can be applied effectively to more instances. The number of propositions per rule has decreased, so the rules are easier to use. The number of rules with 6 propositions is 2 (15%) compared to 6 (42%), which shows that this tree is more effective than the former and reached its rules before the depth constraint stopped it. The comparison of the two methods is summarized in Figures 1, 2 and 3.
[Figures: bar charts of "Number of rules" and "Confidence"; category label "Direct from DT"]
the rule base for compliance with the data and if it finds a suitable rule, the rule is fired and the results are shown to the user via the system's interface. TAES can provide the user with the confidence of its results. Each extracted rule has a confidence level, i.e. the degree to which the rule is correct. When a result is shown, its confidence is also available to the user. Moreover, the user can ask the system why a specific result was derived. The system then reports which rule was fired to reach that result and on which input data.
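The behaviour described above (match the data against the rule base, fire a suitable rule, report its confidence, and explain the result on request) can be illustrated with a short Python sketch. It is not the MATLAB implementation of TAES. The rule identifiers, input names, thresholds and confidence values are hypothetical, and firing the first matching rule is an assumed conflict-resolution strategy.

```python
import operator

# Comparison operators allowed in a rule condition.
OPS = {"<=": operator.le, ">": operator.gt}

# Hypothetical rule base; real rules would come from the induced decision tree.
RULES = [
    {"id": "R1", "if": [("rsi", "<=", 30.0), ("price_roc", ">", 0.0)],
     "then": "up", "confidence": 0.82},
    {"id": "R2", "if": [("rsi", ">", 70.0)],
     "then": "down", "confidence": 0.76},
]


def infer(facts, rules):
    """Fire the first rule whose conditions all hold for the given facts.

    Returns the result, its confidence, and a 'why' record naming the
    fired rule and the input values it used; returns None if no rule fires.
    """
    for rule in rules:
        if all(OPS[op](facts[name], thr) for name, op, thr in rule["if"]):
            used = {name: facts[name] for name, _, _ in rule["if"]}
            return {
                "result": rule["then"],
                "confidence": rule["confidence"],
                "why": {"rule": rule["id"], "inputs": used},
            }
    return None
```

Calling `infer({"rsi": 25.0, "price_roc": 1.2}, RULES)` fires R1, and the `why` entry of the returned dictionary carries the information needed to answer a "why" query from the user.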
Acknowledgements
MATLAB Neural Network Toolbox and SPSS Clementine were used in this research for coding and developing the neural network and decision tree models, respectively. The expert system was also developed using MATLAB.
Appendix
Below is a snapshot of the Technical Analyst Expert System
References
[1] Achelis, S.B 1995, Technical Analysis from A to Z, Probus Publisher, USA.
[2] Adya, M & Collopy, F 1998, How effective are neural networks at forecasting and prediction: A review and evaluation, Journal of Forecasting, vol. 17, pp. 481-495.
[3] Aghamiri, V 2009, 'A model for stock categorization using data mining techniques', MSc dissertation, Amirkabir University of Technology.
[4] Alizadeh, M 2008, 'DSS for portfolio selection', MSc dissertation, Amirkabir University of Technology.
[5] Andrews, R, Diederich, J & Tickle, A.B 1995, A survey and critique of techniques for extracting rules from trained artificial neural networks, Knowledge-Based Systems, vol. 8(6), pp. 373-389.
[6] Blume, L, Easley, D & O'Hara, M 1994, Market statistics and technical analysis: the role of volume, Journal of Finance, vol. 49(1), pp. 153-181.
[7] Chang, P.C, Liu, C.H, Lin, J.L, Fan, C.Y, Celeste, S.P 2009, A neural network with a case based dynamic window for stock trading prediction, Journal of Expert Systems with Applications, vol. 36, pp. 6889-6898.
[8] Chavarnakul, T & Enke, D 2008, Intelligent technical analysis based equivolume charting for stock trading using neural networks, Journal of Expert Systems with Applications, vol. 34, pp. 1004-1017.
[9] Chavarnakul, T & Enke, D 2009, A hybrid stock trading system for intelligent technical analysis-based equivolume charting, Journal of Neurocomputing, vol. 72, pp. 3517-3528.
[10] Craven, M & Shavlik, J 1997, Using neural networks for Data Mining, Future Generation Computer Systems, Special Issue on Data Mining.
[11] De Faria, E.L, Albuquerque, Marcelo P, Gonzalez, J.L, Cavalcante, J.T.P, Albuquerque, Marcio P 2009, Predicting the Brazilian stock market through neural networks and adaptive exponential smoothing methods, Journal of Expert Systems with Applications, vol. 36, pp. 12506-12509.
[12] Enke, D & Thawornwong, S 2005, The use of data mining and neural networks for forecasting stock market returns, Journal of Expert Systems with Applications, vol. 29, pp. 927-940.
[13] Fazel Zarandi, M.H, Rezaee, B, Turksen, I.B, Neshat, E 2009, A type-2 fuzzy rule-based expert system model for stock price analysis, Journal of Expert Systems with Applications, vol. 36, pp. 139-154.
[14] Fernández-Blanco, P, Bodas-Sagi, D, Soltero, F & Hidalgo, J.L 2008, Technical Market Indicators Optimization using Evolutionary Algorithms, Proceedings of GECCO'08.
[15] Friesen, G.C, Weller, P.A & Dunham, L.M 2009, Price trends and patterns in technical analysis: A theoretical and empirical examination, Journal of Banking & Finance, vol. 33, pp. 1089-1100.
[16] Grundy, B & Kim, Y 2002, Stock market volatility in a heterogeneous information economy, Journal of Financial and Quantitative Analysis, vol. 37, pp. 1-27.
[17] Ha, Y.M, Park, S, Kim, S.W, Won, J.I, Yoon, J.H 2009, A stock recommendation system exploiting rule discovery in stock databases, Journal of Information and Software Technology, vol. 51, pp. 1140-1149.
[18] Huang, C.L & Tsai, C.T 2009, A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting, Journal of Expert Systems with Applications, vol. 36, pp. 1529-1539.
[19] Kirkpatrick, C.D & Dahlquist, J.R 2007, Technical Analysis: The Complete Resource for Financial Market Technicians, FT Press.
[20] Lam, M 2004, Neural network techniques for financial performance prediction: integrating fundamental and technical analysis, Journal of Decision Support Systems, vol. 37, pp. 567-581.
[21] Lee, K.H & Jo, G.S 1999, Expert system for predicting stock market timing using a candlestick chart, Journal of Expert Systems with Applications, vol. 16, pp. 357-364.
[22] Leigh, W, Paz, M & Purvis, R 2002, An analysis of a hybrid neural network and pattern recognition technique for predicting short-term increases in the NYSE composite index, Omega, vol. 30, pp. 69-76.
[23] Leigh, W, Purvis, R & Ragusa, J.M 2002, Forecasting the NYSE composite index with technical analysis, pattern recognizer, neural network, and genetic algorithm: a case study in romantic decision support, Journal of Decision Support Systems, vol. 32, pp. 361-377.
[24] Lo, A.W, Mamaysky, H & Wang, J 2000, Foundation of technical analysis: computations, algorithms, statistical inference, and empirical implementation, Journal of Finance, vol. 55(4), pp. 1705-1765.
[25] McNelis, P 2005, Neural Networks in Finance: Gaining Predictive Edge in the Market, Elsevier Academic Press.
[26] Mohammadnia, A 2008, 'Risk management modeling for securities', MSc dissertation, Amirkabir University of Technology.
[27] Mokhtari, M & Ashouri, R 2007, Monthly minima & maxima stock prediction using Backpropagation & Genetic Algorithm hybrid, Proceedings of Iran Data Mining Conference (IDMC'07).
[28] Reilly, F.K 1989, Investment Analysis and Portfolio Management, Dryden Press, USA.
[29] Rotundo, G & Ausloos, M 2007, Microeconomic co-evolution model for financial technical analysis signals, Physica A, vol. 373, pp. 569-585.
[30] Saraiee, M.H, Faraki, M, Faraki, S 2007, Parameter values of stock prediction model using backpropagation algorithm: a case study, Proceedings of Iran Data Mining Conference (IDMC'07).
[31] Skouras, S 2001, Financial returns and efficiency as seen by an artificial technical analyst, Journal of Economic Dynamics & Control, vol. 25, pp. 213-244.
[32] Soleimani, H 2008, 'Portfolio selection and optimization using genetic algorithm', MSc dissertation, Amirkabir University of Technology.
[33] Thammano, A 1999, Neuro-fuzzy model for stock market prediction, Proceedings of the Artificial Neural Networks in Engineering Conference (ANNIE 99), pp. 587-591.
[34] Timajchi, A 2008, 'Utilizing fuzzy logic in using of technical analysis for stock selection', MSc dissertation, Amirkabir University of Technology.
[35] Tsang, P.M, Kwok, P, Choy, S.O, Kwan, R, Ng, S.C, Mak, J, Tsang, J, Koong, K, Wong, T 2007, Design and implementation of NN5 for Hong Kong stock price forecasting, Journal of Engineering Applications of Artificial Intelligence, vol. 20, pp. 453-461.
[36] Vellido, A, Lisboa, P.J.G & Vaughan, J 1999, Neural networks in business: a survey of application 1992-1998, Expert Systems with Applications, vol. 17, pp. 51-70.
[37] Walczak, S 2001, An empirical analysis of data requirements for financial forecasting with neural networks, Journal of Management Information Systems, vol. 17, pp. 203-222.
Wang, Y.F 2003, Mining stock price using fuzzy rough set system, Journal of Expert Systems with Applications, vol. 24, pp. 13-23.
[38] Wang, J.L & Chan, S.H 2007, Stock market trading rule discovery using pattern recognition and technical analysis, Journal of Expert Systems with Applications, vol. 33, pp. 304-315.
[39] Wang, J.L & Chan, S.H 2009, Trading rule discovery in the US stock market: An empirical study, Journal of Expert Systems with Applications, vol. 36, pp. 5450-5455.
[40] Wen, Q, Yang, Z, Song, Y & Jia, P 2010, Automatic stock decision support system based on box theory and SVM algorithm, Journal of Expert Systems with Applications, vol. 37, pp. 1015-1022.
[41] Wilder, J.W 1987, New Concepts in Technical Trading Systems, Trend Research Publications.
[42] Yao, J & Tan, C.L 2000, A case study on using neural networks to perform technical forecasting of FOREX, Journal of Neurocomputing, vol. 34, pp. 79-98.
[43] Yousefpour, P 2007, 'Stock price behavior determination using stochastic process and chaos theory', MSc dissertation, Amirkabir University of Technology.
[44] Zareiee, H 2009, 'Development of a fuzzy DSS/ES for selection and improvement of stock portfolio', MSc dissertation, Amirkabir University of Technology.
[45] Zhu, Y & Zhou, G 2009, Technical analysis: An asset allocation perspective on the use of moving averages, Journal of Financial Economics, vol. 92, pp. 519-544.