Discovery of Technical Analysis Patterns

Urszula Markowska-Kaczmar
Wroclaw University of Science and Technology
Abstract—In this paper our method of discovering data sequences in time series is presented. Two major approaches to this topic are considered: the first, when we need to judge whether a given series is similar to any of the known patterns, and the second, when it is necessary to find how many times a defined pattern occurs within a long series. In both cases the main problem is to recognize pattern occurrence(s), but the distinction is essential because of the time frame within which the identification process is carried out. The proposed method is based on a multilayered feed-forward neural network. The effectiveness of the method is tested in the domain of financial analysis, but it can easily be adapted to almost any kind of sequence data.

I. INTRODUCTION

[…] environment simulating the Warsaw Stock Market. The final section presents conclusions and future plans.

II. RELATED WORKS

Methods of pattern discovery in time series sequences in financial analysis are closely connected to econometrics, which can shortly be defined as the branch of economy that deals with defining models of different systems by means of mathematics and statistics. Some of these models are created by economists in order to analyze data or to predict future stock exchange quotations. The problem is to prepare a good model, where ‘good’ means a model which takes into consideration all important relations […]

The usage of rules and fuzzy rules in searching for time sequence patterns has been considered as well; examples can be found in [7] and [3].

Much research has been done using machine learning methods to retrieve predefined technical analysis patterns within time series, e.g. [5].
A very popular approach is the application of Kohonen’s neural networks to cluster patterns retrieved from stock exchange quotations. Examples of SOM networks can be found in [4] and [6]. The authors admitted that in their experiments this kind of network showed good results in searching for patterns of the main trend of quotations. They also consider this approach not ideal for predicting turning points among quotations.

Another approach which uses neural networks is presented in [5]. The method can be shortly characterized as follows. Each of the patterns is memorized as a chart in the computer’s memory within some specified boundaries. Next, a neural network (NN) is trained on the chosen pattern. After training, the network is able to recognize whether a given series is similar to the pattern it was trained on. To make the results more trustworthy, the authors suggested using two different NNs for the recognition of one pattern (the average of both results was treated as the final result). What is important, both neural networks had to be trained using different sets of learning patterns. A method based on chart pattern recognition in time sequences is proposed in […] as well.

III. THE DETAILS OF THE METHOD

Our method of discovering data sequences in time series is also based on a multilayered feed-forward neural network. It is trained with the back propagation learning algorithm. The whole idea is simple: for each pattern of technical analysis one dedicated neural network exists which is trained to recognize it. The architecture of the network used in the experiments is presented in Fig. 1. The network is fully connected. Each of the inputs represents exactly one value of a stock exchange quotation. In this figure N denotes the number of input neurons (set to 27 in the experiments), L the number of hidden neurons (equal to 14 in the experiments) and M the number of output neurons (set to 1). The response of the output neuron indicates whether a given series is recognized as the pattern that the network was trained to recognize.

Fig. 1. The neural network architecture used in the experiments.

To be more precise, it is worth mentioning that a sigmoidal function was used as the activation function. This means that the value returned by the output neuron lies in the range (0; 1). An output value close to the upper bound of the range was interpreted as meaning that a given series is similar to the series from the training set. When a continuous range of values is allowed, the obvious question is how to make a binary decision whether the series represents the pattern in question or not. The answer is not so well-defined: it depends on how the parameters of the network training were set, what stop criteria of the learning algorithm were adjusted, and what kind of activation function was chosen. After preliminary experiments this threshold value was set to 0.85. Fig. 2 presents the main steps of the proposed method of discovering data sequences.

    PrepareTrainingPatterns()   // define training set
    NormalizePatterns()         // prepare normalization
    TrainNeuralNetwork()        // training process
    SmoothInputSeries()         // step is optional
    NormalizeInputSeries()      // series' normalization
    ProceedTheSeries()          // classifying decision

Fig. 2. The algorithm of discovering technical analysis patterns in time series.

In the first step training patterns for the neural network are prepared. It is important to provide representative patterns. It is good practice that some of them be multiplied within the training set (with added noise). Adding similar learning patterns ensures that after the training process the neural network will have better generalization skills. The next step (normalization of patterns) is needed in order to reduce all defined patterns to a common range. This is important because otherwise series defined on different ranges could favor patterns with higher values. Each value s_i from a series S is normalized according to equation (1). In the next step the neural network is trained. The training process should be continued until the output error of the network reaches a satisfactory value (usually below a defined threshold).

    s_norm,i = (s_i − min S) / (max S − min S),   (1)

where s_i ∈ S, min S is the minimum and max S the maximum of the series S.
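Equation (1) is a plain min-max rescaling of the series to [0, 1]. A minimal sketch (the guard for a constant series is our addition; the paper does not discuss that case):

```python
def normalize(series):
    """Min-max normalization, as in eq. (1):
    s_norm_i = (s_i - min S) / (max S - min S)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # Degenerate flat series carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [(s - lo) / (hi - lo) for s in series]
```

For example, normalize([40, 70, 55]) returns [0.0, 1.0, 0.5], so quotations on very different price levels become directly comparable on the network inputs.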
URSZULA MARKOWSKA-KACZMAR ET. AL.: DISCOVERING TECHNICAL ANALYSIS PATTERNS 197
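The recognition network described above (27 inputs, 14 hidden neurons, one sigmoidal output, decision threshold 0.85) can be sketched as a forward pass. The weights below are random placeholders and biases are omitted for brevity; in the paper the network is trained with back propagation, which is not shown here:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class PatternNet:
    """Minimal forward pass of the fully connected 27-14-1 network
    described in the paper (untrained; weights are random placeholders)."""

    def __init__(self, n_in=27, n_hidden=14):
        rnd = random.Random(0)
        self.w1 = [[rnd.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rnd.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    def recognize(self, series, threshold=0.85):
        """Return (output, decision); the decision is positive when the
        sigmoid output reaches the 0.85 threshold used in the experiments."""
        hidden = [sigmoid(sum(w * s for w, s in zip(row, series)))
                  for row in self.w1]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)))
        return out, out >= threshold
```

One normalized series of 27 quotations maps to a single value in (0; 1), which is then thresholded into the binary pattern/no-pattern decision discussed above.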
In the next step a given series, in order to be processed by the neural network, can be smoothed. It is especially essential when a series contains abnormal values. The aim of smoothing is to reduce the number of points where the amplitude between two adjacent points in the chart is extremely high. An example of a smoothing result is presented in Fig. 3. Because this method changes the original points in the chart, it is recommended to use it only when needed.

Fig. 3. Smoothing using the moving average algorithm with the window size equal to 3 (the original series and the series smoothed with the arithmetic average).

The simplest way to determine which points should be removed is to count how many of them are surplus (sp). Afterwards the number of all points (m) in the series should be divided by sp, resulting in the step (k) to be used while designating the indexes of the surplus points within a given series. Fig. 5 presents the effect of applying this method to the series depicted in Fig. 4. The second technique is to find within a series exactly n characteristic points (called perceptually important points, PIP); the other points, which were not considered characteristic, should be removed. The last technique of shortening a series of length m to one with n values is compression. The compression can be done by specifying n segments in a given series; all the values within each segment are substituted by one value.
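The preprocessing steps just described (optional smoothing, and shortening a series of m points to the n = 27 network inputs) can be sketched as follows. The boundary handling in `smooth` and the exact index choice in `remove_surplus` are our assumptions, and a full PIP implementation is omitted:

```python
def smooth(series, window=3):
    """Centered arithmetic moving average, clipped at the series boundaries
    (boundary handling is an assumption; the paper only gives window size 3)."""
    half = window // 2
    out = []
    for i in range(len(series)):
        segment = series[max(0, i - half):i + half + 1]
        out.append(sum(segment) / len(segment))
    return out

def remove_surplus(series, n):
    """Shorten a series of m points to n by deleting sp = m - n surplus
    points spaced every k = m / sp positions (indexing detail assumed)."""
    m = len(series)
    sp = m - n
    if sp <= 0:
        return list(series)
    k = m / sp
    drop = {int(i * k) for i in range(sp)}
    return [s for i, s in enumerate(series) if i not in drop]

def compress(series, n):
    """Shorten a series to n values: split it into n segments and substitute
    each segment by one value (here, its arithmetic mean)."""
    m = len(series)
    out = []
    for j in range(n):
        segment = series[j * m // n:(j + 1) * m // n]
        out.append(sum(segment) / len(segment))
    return out
```

All three shortening variants reduce an arbitrary-length series to the fixed number of values expected on the network inputs.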
It is important to emphasize that all the previously described issues concerned recognizing a whole series as a pattern. The other case is when we want to find how many times an interesting pattern is repeated within a whole series (this operation can only be done when the series is longer than the training patterns). One approach to this problem is to specify a start index and a number which represents the length of a step (to be used for moving the window from the start index). Next, moving a window (which has the length equal to the number of input neurons of the neural network), the main series can be cut into subseries with the defined step from the start index. Then each subseries should be checked for whether it is similar to the pattern the network was trained on. The problem becomes more complex when the length of a subseries differs from the length of the training patterns. We can consider checking the subseries of lengths from 2 up to m (where m represents the length of the whole series). In this case the computational complexity becomes O(m^2). To reduce the number of subseries that should be checked, similarly to [3], a function TC (given by eq. (2)) is used. Its task is to control the length of a series which should be processed. This function returns a smaller value when the length of the series is closer to the preferred length. In eq. (2) dlen is the desired length of a series (which in our case should be equal to the number of input neurons in the network) and slen means the series length. Additionally, the dlc parameter can be adjusted to control the steepness of the function. The checking should be performed only for the points which are below a specified threshold (i.e. λ = 0.2) on the TC function graph.

    TC(slen, dlen) = 1 − exp(−((slen − dlen) / dlc)^2),   (2)

IV. EXPERIMENTS

The training set was prepared for the ‘head and shoulder’ pattern; each training pattern had the length 27 (the number of neural network inputs). It consists of positive training patterns (representing the ‘head and shoulder’ form) as well as negative ones (that do not represent this form). The network was trained to an error equal to 0.001.

To evaluate the methods of shortening series a testing set was created. It contained: 30 artificial series of the ‘head and shoulder’ pattern with lengths equal to 54, 81 and 135 (10 of each length); 10 series of ‘triple top’, ‘double top’ and some randomly chosen patterns; and finally some series of archive stock exchange (GPW) quotations, which were manually annotated by the authors as to whether they represent the pattern in question (the ‘head and shoulder’ form) or not. In these annotations the value 1 informs that a given time series represents the pattern, and the value 0 means that it does not.

In the test it was arbitrarily assumed that a network output equal to or greater than 0.9 represents recognition of the pattern in question by the neural network.

For each pattern from the testing set, the error between the desired output value and the one returned by the network (the absolute value of their difference) was used to evaluate the results. The average error calculated for each method is the basis of the comparison. The results are shown in Table I.

TABLE I. COMPARISON OF SHORTENING TECHNIQUES

    Shortening technique   Average error   Deviation of average error
    Surplus points         0.0672          0.1233
    Compression            0.0697          0.1409
    PIP                    0.0936          0.1632
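Assuming TC(slen, dlen) = 1 − exp(−((slen − dlen)/dlc)^2), which matches the description of eq. (2), the function can pre-select the window lengths worth checking. The value dlc = 10 is an illustrative assumption (the paper does not report it), chosen so that with dlen = 27 and λ = 0.2 the candidates fall roughly in the band reported in the experiments:

```python
import math

def tc(slen, dlen, dlc=10.0):
    """TC function of eq. (2): near 0 when the series length slen is close
    to the preferred length dlen, approaching 1 as they diverge. The exact
    form and the dlc value are assumptions based on the description."""
    return 1.0 - math.exp(-((slen - dlen) / dlc) ** 2)

def candidate_lengths(dlen, max_len, lam=0.2, dlc=10.0):
    """Window lengths whose TC value is below the threshold lam (λ = 0.2)."""
    return [n for n in range(2, max_len + 1) if tc(n, dlen, dlc) < lam]
```

With dlen = 27 this keeps only lengths close to 27, mirroring the narrowing from the full 2–100 sweep down to roughly 22–32 reported below.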
In the next experiment the TC function was used to limit the number of tested subseries. For the test purpose one long series was chosen. It was created on the basis of stock exchange quotations of 01NFI from 150 sessions (from 14 August 2006 till 16 March 2007). The algorithm of discovering patterns was run twice. In the first run the length of the window varied from 2 to 100. In the second one the TC function was applied; it allowed the number of time sequence lengths to be checked to be limited (to the range from 22 to 32 for the network with 27 input neurons). The results are presented in Fig. 8. The blue line represents the widths of window for which patterns could be found in the given series without using the TC function, while the pink line shows the number of discovered patterns with the use of the TC function. It can easily be noticed that its usage really limits the range of widths to (21; 32). The fact that so many patterns were found for window widths in the range 60–69 can be a surprise, but it is nothing extraordinary: we have to keep in mind that the neural network with a well performed preprocessing algorithm (which properly shortens or expands series) can effectively recognize patterns regardless of the length of the checked series. The obtained results show that the method is not very sensitive to the length of the tested time series.

Fig. 8. The number of discovered patterns in relation to the window width (without the TC function vs. using the TC function).

In Fig. 9 and Fig. 10 examples of the series found during the experiment are presented (the red line represents the shape of the ‘head and shoulder’ chart).

Fig. 9. Stock exchange quotations of 01NFI formed from 9 sessions identified as a ‘head and shoulder’ pattern.

Fig. 10. Stock exchange quotations of 01NFI formed from 39 sessions identified as a ‘head and shoulder’ pattern.

As it was mentioned before, the method of discovering data sequences in a time series was also tested in an artificial environment – a multi-agent stock exchange system. In this system agents representing real investors are evolved by a genetic algorithm. Each agent is described by a set of coefficients defining its behavior. The aim of the system is to find the set of agents (with the best suited values of coefficients) who will be able to generate stock price movement similar to the existing one in the real stock market. Evolution takes place in steps which are called generations. After each generation the individuals (the sets of agents in our case) are evaluated in terms of a fitness value that informs about the quality of an individual. The better the fitness value, the better the set of agents (individual). Originally the system had a naïve algorithm (denoted as old) for identifying which investments should be made by an agent. Then this algorithm was substituted by the method of discovering time series sequences presented in this paper (called new). The comparison of the results with the usage of both methods is shown in Table III.

TABLE III. THE COMPARISON OF THE NEW AND OLD ALGORITHMS OF TAKING DECISIONS BY THE AGENTS

         Average fitness of all        Fitness of the best
         individuals in all            individual in the
         generations                   experiment
    Nr   old        new                old        new
    1    0.12       0.25               0.47       0.51
    2    -0.01      0.19               0.43       0.45
    3    0.11       0.05               0.59       0.50
    4    0.02       0.03               0.42       0.42

An analysis of the results in the table clearly shows that the application of the new method improves the value of the agents’ fitness. The old algorithm returned good results only in the third test. This means that stock prices generated by agents using the newer decision algorithm are much more similar to the real ones. However, because a genetic algorithm has randomness embedded in its nature, more tests are required to fully evaluate the results; these were not possible to perform now because of the duration of one experiment.
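The evolutionary loop described for the multi-agent system (generations, fitness-based evaluation of sets of agent coefficients) is not detailed in the paper. A generic, heavily simplified sketch, in which the individual encoding, the selection scheme and the mutation are all our own illustrative assumptions:

```python
import random

def mutate(individual, scale=0.1):
    """Illustrative mutation: jitter every coefficient with Gaussian noise."""
    return [c + random.gauss(0.0, scale) for c in individual]

def evolve(population, fitness, generations=50):
    """Generic generational loop: score the individuals (sets of agent
    coefficients), keep the better half, refill the population by mutation."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:len(ranked) // 2]
        population = parents + [mutate(p) for p in parents]
    return max(population, key=fitness)
```

Because the better half survives unchanged each generation, the best fitness in the population never decreases, which matches the monotone improvement one expects from such a scheme.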
200 PROCEEDINGS OF THE IMCSIT. VOLUME 3, 2008

It is worth mentioning that the platform on which the tests were performed should be upgraded in some places (e.g. agents should start with an amount of money adequate to the number of stocks on the market; the genetic algorithm should not create a specified number of new agents as the result of the mutation operator after a generation, etc.). For the purpose of this test no upgrades were performed (only the mentioned change of the decision algorithm took place). The authors suspect that even better results could be gained by the use of the newer pattern discovery method if some patches to the existing platform were provided. The performed experiment was a first trial of integration and has shown that there is still some room for improvement.

V. CONCLUSION AND FUTURE PLANS

The aim of the research presented in this paper was to design an effective method which is able to properly recognize a given pattern in time series data. Based on the results of the experiments we can draw the conclusion that the proposed method can properly discover sequences of data within time series. Moreover, when the network is trained, the process of recognition is easy and fast; the network response arrives immediately. The only difficulty can be the network training – the choice of appropriate training patterns and of the training parameters – but after some trials and more experience this problem disappears.

Although the results are promising, there are still improvements possible; for instance, another optimization technique for finding series of shorter or longer widths than the number of input neurons in the network could be proposed. As was shown, the TC function limits the number of searched widths, but it is not the ideal solution, because some proper patterns can be omitted. Some improvements can be made in the test platform as well; upgrades in this system can have an impact on the trustworthiness of the performed tests. All the mentioned problems and places where improvements can be made are a great opportunity to continue studies on the proposed method of discovering technical analysis patterns.

REFERENCES

[1] Fanzi Z., Zhengding Q., Dongsheng L., Jianhai Y., Shape-based time series similarity measure and pattern discovery algorithm, Journal of Electronics (China), vol. 22, no. 2, Springer, 2005.
[2] Fogel G. B., Computational intelligence approaches for pattern discovery in biological systems, Briefings in Bioinformatics, 9(4), pp. 307–316, 2008.
[3] Fu T., Chung F., Luk R., Ng Ch., Stock time series pattern matching: Template-based vs. rule-based approaches, Engineering Applications of Artificial Intelligence, vol. 20, issue 3, pp. 347–364, 2007.
[4] Guimarães G., Temporal Knowledge Discovery with Self-Organizing Neural Networks, IJCSS, 1(1), pp. 5–16, 2000.
[5] Kwaśnicka H., Ciosmak M., Intelligent Techniques in Stock Analysis, Proceedings of Intelligent Information Systems, pp. 195–208, Springer, 2001.
[6] Lee Ch.-H., Liu A., Chen W.-S., Pattern discovery of fuzzy time series for financial prediction, IEEE Transactions on Knowledge and Data Engineering, vol. 18, issue 5, pp. 613–625, 2006.
[7] S., Lu L., Liao G., Xuan J., Pattern Discovery from Time Series Using Growing Hierarchical Self-Organizing Map, Neural Information Processing, LNCS, Springer, 2007.
[8] Markowska-Kaczmar U., Kwasnicka H., Szczepkowski M., Genetic Algorithm as a Tool for Stock Market Modelling, ICAISC, Zakopane, 2008.
[9] Suh S. C., Li D., Gao J., A novel chart pattern recognition approach: A case study on cup with handle, Proc. of the Artificial Neural Networks in Engineering Conf., St. Louis, Missouri, 2004.