Detecting Price Manipulation in The Financial Market
Abstract: Market abuse has attracted much attention from financial regulators around the world but it is difficult to fully prevent. One of the reasons is the lack of thorough studies of market abuse strategies and of the corresponding effective market abuse detection approaches. In this paper, the strategies of reported price manipulation cases are analysed as well as the related empirical studies. A transformation is then defined to convert the time-varying financial trading data into pseudo-stationary time series, where machine learning algorithms can be easily applied to the detection of price manipulation. The evaluation experiments conducted on four stocks from NASDAQ show promising, improved performance in detecting such manipulation cases.

I. INTRODUCTION

Surveillance of the financial exchange market for monitoring market abuse activities has attracted much attention from financial regulators across different exchange markets in recent years, especially since the flash crash in 2010. However, the lack of research in effective and efficient detection algorithms, in both industry and academia, causes challenges for regulators in their ability to monitor huge amounts of trading activities in real-time. A major concern in financial market abuse is price manipulation, where the manipulated target is the bid (or ask) price of certain financial instruments [1]. There is a large amount of literature regarding stock market manipulation theories [1] [2] [3] and a few empirical studies of real manipulation cases [4]. However, an effective detection model of price manipulation is yet to be developed due to the lack of understanding of strategic spoofing tactics.

In this paper, we summarize and further analyse the price manipulation strategies by examining actual reported cases as well as the empirical studies in existing literature. We define two key characteristics of price manipulation strategies, which enable us to propose a transformation procedure, converting the original market trading data to a comparable metric, where the non-stationary nature of the financial data is demonstrated to be “nearly removed” and machine learning techniques can then be effectively applied as detection models. Our proposed detection approach is evaluated based on real trading records of selected stocks from NASDAQ.

The remainder of this paper is organized as follows: Section II provides a brief review of price manipulation and the corresponding detection methods. In Section III, the price manipulation tactics are thoroughly analysed. A data transform procedure is then proposed and illustrated with real trading data. Section IV evaluates the proposed machine learning based detection models and reports the promising performance obtained. Finally, Section V concludes the paper and discusses potential improvements and future work.

This project is supported by the companies and organizations involved in the Northern Ireland Capital Markets Engineering Research Initiative.

II. REVIEW OF RELATED LITERATURE

Theoretical studies of stock price manipulation have been presented in a number of existing works. A model of transaction-based manipulation was developed, showing that price manipulation is profitable [1]. An “equilibrium model” was derived and proved that the existence of noise traders made it possible to manipulate the price, although theoretically, no profit should be expected according to the efficient market hypothesis [3]. A real price manipulation case conducted by large traders was examined and analysed in [5]. The real case proved that the manipulation tactic can make a risk-free profit, as a result of the significantly changing order flow. More empirical studies showed the increase of the volatility, liquidity, and returns of the underlying stock and an “up then down” process of the price during the manipulation period [1] [6] [2]. A comprehensive empirical study of the price manipulation strategy as well as the corresponding intention was carried out on real manipulation cases from Korea Exchange (KRX) [4]. One type of price manipulation strategy was formally defined according to its statistical features from the empirical study of the data from KRX; however, the thorough study did not lead to the design of a detection model.

Research regarding the detection of stock price manipulation is comparatively limited in both academia and the financial industry. The appropriateness of a sample entropy methodology as a measure for the detection was evaluated in [7]; however, the statistical results did not favour the properties of sample entropy as an indicator of price manipulation. Logistic regression with an artificial neural network and support vector machine has been studied and compared as a method of detecting trade-based manipulation within the emerging Istanbul Stock Market [8]. The detection model was built based on the assumption that higher deviations of the statistical features of daily return, volume and volatilities from normal cases indicate manipulation. Similar work has been carried out by first studying the reported manipulated cases and constructing a dataset of manipulated cases, and then modelling the returns, liquidity and volatility as well as the news and events related to the stocks during the manipulation period by linear and logistic regression [9]. Evaluations and comparisons of different techniques were also presented in [8] and [9], yet both works lack a reliable, reasonable analysis of the link between the abnormalities of the stock features and the disclosure of price manipulation. Therefore, this leaves a knowledge gap between the data attribute deviations and the detection techniques. An Inverse Reinforcement Learning
(IRL) algorithm was applied to learning and classifying traders’ behaviours. The experiments were conducted on a simulated S&P 500 futures market through a multi-agent approach [10] and achieved more than 90% classification accuracy. An empirical study of the relationship between market efficiency and market close price manipulation, defined as ramping, was carried out and showed a rise in the execution costs of completing large trades when experiencing market close ramping [11]. Ramping alert records generated by the detection algorithm from Smart Group International, a surveillance system provider, were analysed as a benchmark for this study. The algorithm detected market close ramping according to critical price changes, where the threshold was set as the 99% histogram distribution cut-off of the historical price change during the benchmark period. A market close ramping alert was triggered if the changes of the closing price and the price 15 minutes prior were greater than the chosen threshold [11].

To date, existing research has mainly focused on either empirical studies of certain price manipulation cases or detection techniques based on abnormalities of the market features during the manipulation period. An effective classification algorithm was shown in [10] but it was based only on simulated markets where the traders and their strategies were clearly defined.

In this paper, the manipulative strategies are analysed with no assumptions on unusual changes of market features. The proposed detection approaches are aimed at learning and modelling the trading behaviours and further identifying the manipulative actions by the learned model. Our approach is evaluated in a real data context.

III. CHARACTERISING PRICE MANIPULATION

Price manipulation activities affect price fluctuation in capital markets, where the returns, volatilities and liquidities unexpectedly rise then decline during the manipulation period [1] [6] [2]. However, the occurrence of manipulation is hard to prove given the observed changes of the market attributes, which in most cases are the result of economic cycles, market (index) moves and even public events. The detection models based on the significant deviation of the market attributes are doomed to suffer from the error rate of unusual but legitimate activities that are recognised as manipulation [7] [8] [9]. Instead of using the discrepancies of the financial market attributes, the manipulation strategic behaviour intrinsically offers a more accurate measure. Nevertheless, a model that is capable of directly monitoring behaviours is not available due to the lack of accurate definitions of manipulative behaviours. This is one of the major challenges faced when attempting to detect price manipulation.

A. Price manipulation strategy characteristics

A generic price manipulation tactic is defined as artificially pushing up (or down) the bid (or ask) price of a security and taking advantage of the shifted price so as to make a profit [12]. The deliberately constructed trading order sequences change the market bid (or ask) price and show the trader’s manipulative intentions. The characteristics of those orders define the manipulation strategy, which is the target of the detection model. The strategy is not constructed as incidental heuristic attempts at placing orders but as a careful design of every single attribute of the placed orders according to the market impact theory [13], which suggests that the market effects are correlated with the quotes and sizes of the posted orders. A quantitative estimation of this effect given by a Vector Autoregressive Model (VAR) [13] showed that either a larger size or a higher (or lower) quote (compared with the current bid (or ask)) induces a stronger price impact on the market. For normal traders, measuring and eliminating the market impact is crucial; market manipulators, however, simply exploit the market impact in their strategies to make an economic profit. Accordingly, the price manipulation orders ought to be large-sized and of a higher (or lower) price than the bid (or ask) to maximize the market impact. However, none of the reported price manipulation cases follows this format completely [13], owing to another constraint: the orders placed for spoofing the market are expected to have little chance of being executed [4] (execution represents a failed manipulative action, which is not acceptable to the manipulators). Consequently, we argue that a price manipulation strategy must fulfil two conditions: (1) maximising the induced price change; (2) minimising the execution risk.

The definition of the primary manipulation tactic, spoofing trading, summarized from the real manipulation cases, is given as: an order with a size at least twice the previous day’s average order size, with a price at least 6 basis points (bps)¹ away from the current bid (or ask) price and with a cancellation time longer than 30 minutes [4]. Those numerical definitions, “6 bps”, “2 times” and “30 minutes”, show a typical case of our argument: maximizing the induced price change (impact) by a large-sized order (at least twice the previous day’s average size) staying in the order book for a relatively long time (30 minutes), and minimizing the risk by passive quotes (6 bps away from the bid (or ask)) [4].

¹ A basis point is a unit equal to one hundredth of a percentage point.

In September 2012, a price manipulation case was reported and documented by the Financial Industry Regulatory Authority of the USA [14]. In this case, a sequence of spoofing buy orders was placed inside the spread, pushing up the bid price by 6.9 bps. After the manipulators had benefited from the transaction on their previous sell order at a higher price, the spoofing orders were cancelled. The complete manipulation process lasted for only 819 ms and is known as quote stuffing. Another 17 analogous quote stuffing cases from 2011 - 2012 were then reported by Nanex [15]. The average time duration and the induced bid (or ask) changes of the cases were calculated as 6.2 seconds and 627 bps respectively. Obviously, the numerical features, 6.9 bps and 819 ms, of quote stuffing also conform to our argument: the aggressive quotes maximise the fictitious wild price changes and the instantaneous market sweeping minimises the execution risk.

Spoofing trading and quote stuffing suggest two primary strategies of price manipulation. The former utilises a large volume and a passive quote for inducing the impact and reducing the risk respectively, while the latter uses an aggressive quote and a tiny cancellation time. Both formats can be depicted by the two key conditions defined in our argument.

The two strategies are graphically illustrated in Fig. 1, where a three-level order book is initiated at the best bid, $p_t^{b,1}$, and best ask, $p_t^{a,1}$, and the dotted lines represent a quick sweep of the market (tiny cancellation time).

[Figure: horizontal axis: price, with bid levels $p_t^{b,3}$, $p_t^{b,2}$, $p_t^{b,1}$ (outside spread, lower than bid), the bid-ask spread (inside spread), and ask levels $p_t^{a,1}$, $p_t^{a,2}$, $p_t^{a,3}$ (outside spread, higher than ask); vertical axis: volume; the spoofing trading order is marked.]
Fig. 1 Spoofing trading and quote stuffing strategies in a three-level order book.

B. Market Data Transformation

represent order price, volume and submission time (physical time) respectively. Furthermore, and denote the best bid and ask price instantaneously before the order activity. is denoted as the length of a sliding window and is set to one trading day, corresponding with the spoofing trading definition. Thus, ̅ and ̅ define the moving average volume of the buy and sell orders in the previous period of time excluding the current data point . The cancellation times and

/ = / − / (2)

for the cancelled or executed order respectively. Thus the average lifecycle of orders in the prior period is calculated as

1 ̅, / = / (3)

1 ̅, / = / (4)
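The moving average volume described above (a sliding window of one trading day, excluding the current data point) can be sketched as follows. This is only an illustration under stated assumptions: the function name `trailing_mean_volume`, the `(timestamp, volume)` tuple layout and the window expressed in seconds are hypothetical, and the sketch is not a reconstruction of the paper's equations. In practice it would be run separately on buy orders and on sell orders.

```python
from collections import deque


def trailing_mean_volume(orders, window_seconds):
    """Trailing mean volume for each order, excluding the order itself.

    `orders` is a list of (timestamp_seconds, volume) tuples sorted by time
    (hypothetical layout). For each order, the mean is taken over earlier
    orders whose timestamp lies within `window_seconds` before it. The
    result is None when no earlier order falls inside the window.
    """
    window = deque()   # earlier orders still inside the window
    total = 0.0        # running sum of the volumes held in `window`
    means = []
    for ts, volume in orders:
        # Drop orders that are older than the window start.
        while window and window[0][0] < ts - window_seconds:
            _, old_volume = window.popleft()
            total -= old_volume
        # Average over earlier orders only; the current one is added afterwards.
        means.append(total / len(window) if window else None)
        window.append((ts, volume))
        total += volume
    return means


orders = [(0, 100), (10, 200), (20, 300), (100, 400)]
print(trailing_mean_volume(orders, window_seconds=50))
```

Appending the current order only after its mean has been recorded is what keeps the current data point out of its own average.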
= / , = / , = / ;
= , = , = .

The top four stocks in NASDAQ in terms of total market capitalisation, Apple, Google, Intel and Microsoft, are selected for evaluation. The datasets, obtained from the LOBSTER project [24], cover messages over five trading days, from the 11th to the 15th of June, 2012, and consist of more than 40,000 trading orders in total for each stock. We calculated the mean and variance of the time series (X ,X ) with Δt from 0 to the length of the time series. The autocorrelation, AutoCor, is calculated between the time series (X , X ) and (X ,X ) with the same Δt values.

The calculated mean and variance of three attributes, price, volume and time, for the Intel dataset are shown as an example in Fig. 2. It should be noted that only the first 200 lag values are illustrated in the figures for a clear comparison between the original and transformed data. As shown in Fig. 2(g)-(l), the transformed price, volume and time all fluctuate around a nearly constant mean value with an approximately constant variance. The coefficient of variation (CV), the ratio of standard deviation to the mean of a data sequence, is further calculated. The CV of the mean and variance sequences under different lag values for three attributes, price, volume and time, is calculated for both the original data and the transformed data across four datasets and shown in Table I.

[Fig. 2: Stock INTEL. Panels (a)-(f): mean and variance of the original price (P), volume (V) and time (T) against lag; panels (g)-(l): mean and variance of the transformed P, V and T against lag.]

The significantly smaller CV values of the mean and variance sequences of the transformed price, volume and time compared with the original data show far lower level of

Table I COEFFICIENT OF VARIATION OF THE SEQUENCE OF MEAN & VARIANCE OF THE PRICE, VOLUME AND TIME FOR THE DATASET OF FOUR STOCKS

As a time series matches up perfectly with itself (‘zero-lag’), the plots in Fig. 3 begin from Δt = 1 to avoid the large value at Δt = 0. Fig. 3 clearly shows the autocorrelation, AutoCor, of three transformed time series , and , decreasing with an increasing lag and tailing off to tiny values, which additionally suggests the decorrelation of the transformed time series. Meanwhile, AutoCor of the original data also declines with increasing lags. Although not given in this paper, the datasets of the other three stocks (Google, Intel and Microsoft) show the same features as those illustrated in both Fig. 2 and Fig. 3.
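The two diagnostics used in this section, the coefficient of variation (standard deviation divided by mean) and the autocorrelation at a given lag, can be sketched as below. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the population standard deviation is an assumption, and the autocorrelation uses the common sample estimator normalised by the lag-0 sum of squares, which may differ from the exact windowing used for Fig. 3.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population standard deviation assumed)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return pstdev(values) / m


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Products of deviations between the series and its lagged copy.
    num = sum((series[i] - mu) * (series[i + lag] - mu) for i in range(n - lag))
    return num / denom


print(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]))
print(autocorrelation([1, -1, 1, -1, 1, -1], lag=1))
```

A sequence of per-lag means (or variances) with a small CV varies little relative to its level, which is the sense in which the transformed data are described as pseudo-stationary.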
The comparison between these figures shows that the non-stationary features of the original data are “nearly removed” by the transformation; pseudo-stationary data is then generated, compensating for the time-varying features.

[Fig. 3 panels: (a), (c), (e): INTEL (Trans), AutoCor of transformed P, V and T against lag; (b), (d), (f): INTEL (Origin), AutoCor of original P, V and T against lag.]
Fig. 3 AutoCor of original and transformed Price, Volume and Time for Intel stock.

C. Strategic behaviour illustration

the normal orders congest together as an agglomerative cluster occupying a certain space with analogous but different shapes due to the naturally distinct trading behaviours across financial instruments, while the original data show exotic shapes.

Since the manipulation cases are apparently located apart from the cluster (as in Fig. 4(a)), the boundaries of such clusters provide an effective decision threshold. However, such boundaries cannot be described by simply setting up thresholds on three attributes; this is due to the unknown convexity feature of the 3-dimensional spherical surface. Precisely describing the cluster shapes by only the normal data requires some sophistication.

[Fig. 4 panels: (a) Stock APPLE, transformed data, axes Trans. P, Trans. V and Trans. T; (b) Stock APPLE, original data. Legend: Normal Orders, Price Manipulation Orders.]
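One simple way to score how far an order lies from the cluster of normal orders, using only normal data, is the distance to its k-th nearest normal neighbour. The sketch below is an assumption-laden illustration of that idea and not the DDtools implementation used in the experiments: the function name, the Euclidean metric and the default `k` are all hypothetical choices.

```python
import math


def knn_novelty_score(point, normal_points, k=3):
    """Distance from `point` to its k-th nearest neighbour among `normal_points`.

    Points are equal-length tuples, e.g. (price, volume, time) features.
    Larger scores mean the point lies further from the normal cluster.
    """
    if not 1 <= k <= len(normal_points):
        raise ValueError("k must be between 1 and the number of normal points")
    distances = sorted(math.dist(point, q) for q in normal_points)
    return distances[k - 1]


normal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
near = knn_novelty_score((0.1, 0.1, 0.1), normal, k=2)
far = knn_novelty_score((5.0, 5.0, 5.0), normal, k=2)
print(near, far)
```

An order would be flagged when its score exceeds a threshold chosen on held-out normal data; varying that threshold traces out a ROC curve.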
One-class support vector machine (OCSVM) is another ideal approach for novelty detection, as it provides a direct description of the boundary of normality (the support vectors) [27] [28]. OCSVM applied to price manipulation detection provides a measure of unusualness in trading activity by learning a representation of normal orders.

In this paper, we examine the price manipulation detection problem using the above-mentioned two machine learning models on the transformed time series as well as the original market data. We argue that both models work effectively on the underlying detection problem and the proposed transformation procedure significantly improves the detection performance.

A. Application of OCSVM and kNN to price manipulation detection

When applying the novelty detection approaches to the price manipulation detection problem, a set of normal data vectors = { , , … , } is collected as the training dataset. The vector is from either the vector of the original market data , , ( : a buy or sell order), or the transformed data , , , calculated by Equations (5)-(7). In the experimental evaluation, the OCSVM and kNN are applied to the four datasets, Apple, Google, Intel and Microsoft, discussed in Section III.B. These datasets are selected for their relatively high trading volumes and more volatile price fluctuation, factors that may increase the likelihood of manipulation across the exchanges [4] [29]. Each dataset is divided into five subsets according to the trading date. One subset is chosen as the training dataset, where 5-fold cross-validation is used for training the models; the remaining four subsets are used in the testing.

The evaluation of a detection model is usually reliant upon labelled benchmarks of both normality and abnormality. Because only a few real manipulation cases have been reported, we needed to synthesize a number of abnormal cases based on our study of the characteristics of the manipulation strategy. Synthetic exploratory financial data is accepted in academia for evaluating a proposed model when real market data is hard to collect [30] [31].

Two primary formats of price manipulation, spoofing trading and quote stuffing, are reproduced in the context of the datasets of each stock following the original characteristics discussed in Section III.A:

• spoofing trading: orders with sizes of at least twice the previous day’s average order size, with prices of at least 6 basis points outside the current bid-ask spread and with a cancellation time of 30 minutes;
• quote stuffing: orders with regular size, with quotes 627 bps higher (or lower) than the current bid (or ask) price and with approximately 6.2 seconds of cancellation time.

The generated manipulation cases are then randomly injected into the corresponding order records, creating a mixture of both “normal” and “abnormal” patterns in the testing datasets. In order to ensure a comprehensive assessment of the approach, 5000 synthesized cases are injected into each dataset, with each type containing 2500 examples. For the Apple stock dataset, the reported real cases are also injected for evaluation.

In our experiments, LIBSVM [32] and DDtools [33], two open source libraries, are used as the implementations of the OCSVM and kNN respectively. The model parameters, namely the Gaussian kernel and the k value for kNN, are determined by 5-fold cross-validation for stable and optimised results.

Performance evaluation is based on the Receiver Operating Characteristic (ROC), which is calculated according to the confusion matrix, where false positive (FP) is defined as manipulation cases detected as normal, false negative (FN) is defined as normal cases detected as manipulation, true positive (TP) is defined as normal cases detected as normal and true negative (TN) is defined as manipulation cases detected as manipulation. The ROC curve is a widely used metric for evaluating and comparing binary classifiers. The ROC curve plots the true positive rate against the false positive rate while the discrimination threshold of the binary classifier is varied. In order to assess the overall performance of a novelty detector, one can measure the area under the ROC curve (AUC). Larger AUC values are generally an indication of better classification performance.

The ROC curves of two models on four stock datasets (original and transformed) with 5000 injected novelties in each dataset are illustrated in Fig. 5. The calculated AUC values are summarised in TABLE II.

TABLE II AUC OF TWO MODELS ON FOUR STOCK DATASETS

Model   Data     AUC (Original)   AUC (Transformed)   Improvement
OCSVM   APPLE    0.866            0.963               11.201%
OCSVM   GOOGLE   0.831            0.997               19.976%
OCSVM   INTEL    0.906            0.958               5.751%
OCSVM   MSFT     0.976            0.990               1.379%
kNN     APPLE    0.884            0.928               5.057%
kNN     GOOGLE   0.625            0.857               37.279%
kNN     INTEL    0.854            0.866               1.433%
kNN     MSFT     0.926            0.964               4.117%

It is clear that both detection models with the transformation procedure achieved a significantly better performance than with the original market data in terms of the AUC values in TABLE II, where the Improvement column is calculated by $\frac{AUC_{Transformed} - AUC_{Original}}{AUC_{Original}} \times 100\%$ as the improvement percentage. On the transformed data, both models achieved high AUC on all four of the datasets. Even the smallest AUC value, 0.857 of kNN on the Google dataset, can be considered a good performance [34]. The testing results of the Apple stock confirm that the injected real manipulation cases are successfully discovered by both models. The good performance can be explained by the OCSVM and kNN models effectively modelling the boundaries of the normal behaviour clusters.
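The synthesis and injection of manipulation cases described in Section IV.A can be sketched as follows. The thresholds (twice the previous day's average size, 6 bps, 30 minutes, 627 bps, 6.2 seconds) come from the text; everything else is an assumption made for illustration: the function names, the dictionary layout, using exactly the minimum values rather than sampling above them, and treating buy orders as priced relative to the bid and sell orders relative to the ask.

```python
import random

BPS = 1e-4  # one basis point as a fraction (0.01%)


def make_spoofing_order(side, best_bid, best_ask, prev_day_avg_size):
    """Large, passive order: 2x average size, 6 bps outside the spread, 30 min life."""
    if side == "buy":
        price = best_bid * (1 - 6 * BPS)   # below the best bid
    elif side == "sell":
        price = best_ask * (1 + 6 * BPS)   # above the best ask
    else:
        raise ValueError("side must be 'buy' or 'sell'")
    return {"kind": "spoofing", "side": side, "price": price,
            "size": 2 * prev_day_avg_size, "cancel_after_s": 30 * 60}


def make_quote_stuffing_order(side, best_bid, best_ask, regular_size):
    """Regular-sized, aggressive order: 627 bps through the quote, 6.2 s life."""
    if side == "buy":
        price = best_bid * (1 + 627 * BPS)  # higher than the current bid
    elif side == "sell":
        price = best_ask * (1 - 627 * BPS)  # lower than the current ask
    else:
        raise ValueError("side must be 'buy' or 'sell'")
    return {"kind": "quote_stuffing", "side": side, "price": price,
            "size": regular_size, "cancel_after_s": 6.2}


def inject(normal_orders, synthetic_orders, rng):
    """Insert each synthetic order at a random position among the normal ones."""
    mixed = list(normal_orders)
    for order in synthetic_orders:
        mixed.insert(rng.randrange(len(mixed) + 1), order)
    return mixed


spoof = make_spoofing_order("buy", best_bid=100.0, best_ask=100.1, prev_day_avg_size=200)
stuff = make_quote_stuffing_order("sell", best_bid=100.0, best_ask=100.1, regular_size=200)
print(inject(["n1", "n2", "n3"], [spoof, stuff], random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the injection reproducible, which matters when comparing models on the same mixed test set.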
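As a worked check of the evaluation metrics, the sketch below computes an AUC from raw detector scores and the relative AUC improvement between original and transformed data. It rests on stated assumptions: higher scores mean "more normal" (normal orders are the positive class, following the confusion-matrix definitions in the text), the AUC is computed through its rank interpretation (the probability that a random normal order outscores a random manipulation order) rather than by integrating a curve, the Improvement column is taken as the relative change in AUC, and the function names are hypothetical.

```python
def auc(normal_scores, manipulation_scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    if not normal_scores or not manipulation_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in normal_scores:
        for n in manipulation_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(normal_scores) * len(manipulation_scores))


def improvement_pct(auc_original, auc_transformed):
    """Relative AUC gain of the transformed data over the original, in percent."""
    return (auc_transformed - auc_original) / auc_original * 100.0


print(auc([0.9, 0.8, 0.7], [0.6, 0.75]))
print(round(improvement_pct(0.866, 0.963), 3))  # OCSVM, APPLE row of TABLE II
print(round(improvement_pct(0.831, 0.997), 3))  # OCSVM, GOOGLE row of TABLE II
```

The pairwise form is quadratic in the number of orders, so it suits small illustrations; for the dataset sizes used here a sort-based rank computation would be the practical choice.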
Meanwhile, as shown in Fig. 4(a), the transformed data show relatively regular cluster shapes compared with the exotically distributed initial data in Fig. 4(b). We argue that the proposed transformation procedure particularly contributes to establishing the “pseudo-stationary” regular cluster shapes. From the machine learning perspective, the proposed transformation procedure pre-processes the data and sufficiently extracts the required features. The data points in the feature domain can be easily and relatively effectively modelled by both kNN and OCSVM.

[Fig. 5 panels: ROC curve of (a) APPLE, (b) GOOGLE, (c) INTEL, (d) MSFT; horizontal axis: False Positive Rate; vertical axis: True Positive Rate; curves: Transformed Data OCSVM, Origin Data OCSVM, Transformed Data kNN, Origin Data kNN.]
Fig. 5 ROC of two models on four stock datasets.

As discussed before, to compensate for the non-stationarity of the data, one approach is to adaptively update the model by monitoring any deviations in the data distribution. The pseudo-stationarity of the transformed data will effectively reduce the necessary updates and consequently provide a computationally efficient approach.

It is also noted that, on the transformed data, the OCSVM outperforms the kNN across all four datasets. The higher performance of OCSVM may be due to a better description of the clusters of normal cases through a more accurate description of the boundary by support vectors.

V. CONCLUSION AND FUTURE WORK

In the proposed method, the stationary nature of the data is tested separately on three attributes, which, however, are modelled jointly by OCSVM and kNN as a feature vector. The study of the stationarity of the order vector and the corresponding detection model updating (re-training) will be the focus of our future work. Furthermore, in recent years, market manipulation has tended to be carried out across more than one exchange market by sophisticated manipulators. Detection within any single market hardly achieves a complete and accurate result. This requires a cross-market detection model, which is also one of our primary directions for future work.

REFERENCES

[1] F. Allen and D. Gale, “Stock-price manipulation,” Review of Financial Studies, vol. 5, no. 3, pp. 503-529, 1992.
[2] R. K. Aggarwal and G. Wu, “Stock Market Manipulations,” The Journal of Business, vol. 79, no. 4, pp. 1915-1953, 2006.
[3] F. Allen and G. Gorton, “Stock Price Manipulation, Market Microstructure and Asymmetric Information,” European Economic Review, vol. 36, pp. 624-630, 1992.
[4] E. J. Lee, K. S. Eom and K. S. Park, “Microstructure-based manipulation: Strategic behavior and performance of spoofing traders,” Journal of Financial Markets, vol. 16, no. 2, pp. 227-252, 2013.
[5] R. A. Jarrow, “Market manipulation, bubbles, corners, and short squeezes,” Journal of Financial and Quantitative Analysis, vol. 3, p. 27, 1992.
[6] M. Jianping, G. Wu and C. Zhou, “Behavior based manipulation: theory and prosecution evidence,” New York University, 2004.
[7] M. Slama and E. Strömma, “Trade-Based Stock Price Manipulation and Sample Entropy,” Stockholm School of Economics, 2008.
[8] H. Öğüt, M. M. Doğanay and R. Aktaş, “Detecting stock-price manipulation in an emerging market: The case of Turkey,” Expert Systems with Applications, vol. 36, no. 9, pp. 11944-11949, 2009.
[9] D. Diaz, B. Theodoulidis and P. Sampaio, “Analysis of stock market manipulations using knowledge discovery techniques applied to intraday trade prices,” Expert Systems with Applications, vol. 38, no. 10, pp. 12757-12771, 2011.
[10] S. Yang, M. Paddrik, R. Hayes, A. Todd, A. Kirilenko, P. Beling and W. Scherer, “Behavior based learning in identifying High Frequency Trading strategies,” in IEEE Conference on Computational Intelligence for Financial Engineering & Economics (CIFEr), New York, 2012.
[11] M. Aitken, F. R. Harris and S. Ji, “Trade-based manipulation and market efficiency: a cross-market comparison,” in 22nd Australasian Finance and Banking Conference, 2009.
[12] Y. Cao, Y. Li, S. Coleman, A. Belatreche and T. M. McGinnity, “A Hidden Markov Model with Abnormal States for Detecting Stock Price Manipulation,” in 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Manchester, pp. 3014-3019, Oct. 2013.
[13] N. Hautsch and R. Huang, “The market impact of a limit order,” Journal of Economic Dynamics and Control, vol. 36, no. 4, pp. 501-522, 2012.
[14] M. Ong and N. Condon, “FINRA Joins Exchanges and the SEC in Fining Hold Brothers More Than $5.9 Million for Manipulative Trading, Anti-Money Laundering, and Other Violations,” 25 September 2012. [Online]. Available: http://www.finra.org/Newsroom/NewsReleases/2012/P178687.
[15] NANEX, “Whac-A-Mole is Manipulation,” 25 September 2012. [Online]. Available: http://www.nanex.net/aqck2/3598.html.
[16] R. Ghazali, A. J. Hussain, N. M. Nawi and B. Mohamad, “Non-stationary and stationary prediction of financial time series using dynamic ridge polynomial neural network,” Neurocomputing, vol. 72, no. 10-12, pp. 2359-2367, 2009.
[17] L. Cao, Y. Ou and P. Yu, “Coupled Behavior Analysis with Applications,” IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 8, pp. 1378-1392, 2012.
[18] R. S. Tsay, Analysis of Financial Time Series, Wiley, 2010.
[19] R. F. Engle, “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation,” Econometrica, vol. 50, no. 4, pp. 987-1008, 1982.
[20] T. Bollerslev, “Generalized autoregressive conditional heteroskedasticity,” Journal of Econometrics, vol. 31, no. 3, pp. 307-327, 1986.
[21] C.-C. Lee, J.-D. Lee and C.-C. Lee, “Stock prices and the efficient market hypothesis: Evidence from a panel stationary test with structural breaks,” Japan and the World Economy, vol. 22, no. 1, pp. 49-58, 2010.
[22] M. B. Priestley, Spectral Analysis and Time Series, Academic Press, 1982.
[23] S. Van Bellegem, “Adaptive methods for modelling, estimating and forecasting locally stationary processes,” Université catholique de Louvain, Louvain, 2003.
[24] LOBSTER, “LOBSTER,” Humboldt Universität zu Berlin, 2013. [Online]. Available: https://lobster.wiwi.hu-berlin.de/index.php.
[25] Nanex, “Incredible, Blatant Manipulation in Apple Stock,” 10 July 2013. [Online]. Available: http://www.nanex.net/aqck2/4352.html.
[26] C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2007.
[27] B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola and R. C. Williamson, “Estimating the support of a high-dimensional distribution,” Neural Computation, vol. 13, no. 7, pp. 1443-1471, 2001.
[28] P. Hayton, S. Utete, D. King, S. King, P. Anuzis and L. Tarassenko, “Static and dynamic novelty detection methods for jet engine health monitoring,” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 365, no. 1851, pp. 493-514, 2007.
[29] D. J. Cumming, F. Zhan and M. J. Aitken, “High Frequency Trading and End-of-Day Manipulation,” Social Science Research Network, 2012.
[30] G. K. Palshikar and M. M. Apte, “Collusion set detection using graph clustering,” Data Mining and Knowledge Discovery, vol. 16, no. 2, pp. 135-164, 2008.
[31] M. Franke, B. Hoser and J. Schröder, “On the Analysis of Irregular Stock Market Trading Behavior,” Studies in Classification, Data Analysis, and Knowledge Organization, pp. 355-362, 2008.
[32] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 27:1-27:27, 2011.
[33] D. Tax, DDtools, the Data Description Toolbox for Matlab version 2.0.1, Delft University of Technology, 2013.
[34] T. Fawcett, “An introduction to ROC analysis,” Pattern Recognition Letters, vol. 27, p. 74, 2006.