
A Quantitative Framework for Liquidity Imbalance Detection and Counter-Spoofing in Futures Markets

Connor Slayton^α, Marissa Caldwell^{γθ}, Graham Lindell^γ

^α Futures Analytica
^γ Quantis Lindell Capital
^θ Statista Q

April 26, 2024


Abstract

This paper presents an integrated framework for detecting and exploiting market
manipulation, specifically focusing on spoofing and liquidity imbalances in high-
frequency trading environments. We begin by defining key spoofing detection
metrics, such as High Quoting Activity, Unbalanced Quoting, and Abnormal
Cancellations, and introduce a Spoofing Identification Score built on machine
learning models, including random forest classifiers. The next component intro-
duces the Microprobability Metric, a Bayesian-based framework that combines
limit order book data with forecasting techniques to predict small factor market
movements.
Part II explores market imbalance scanners, starting with the Counterspoof
Scanner, which classifies different phases of spoofing cycles and determines op-
timal trade points based on court-validated spoofing activity data. The DEX-
Array Scanner identifies aggressive order placement between bid and ask lev-
els, refining signals to detect shifts in market sentiment. Finally, the Trespass
Scanner focuses on liquidity imbalances, leveraging proprietary noise filtering to
identify high-confidence trade signals in volatile environments.
Our findings indicate that by integrating probabilistic models, machine learn-
ing, and proprietary noise filtering mechanisms, these tools provide a compre-
hensive solution for navigating high-frequency trading markets. The paper con-
cludes by discussing implications for market transparency and proposing av-
enues for future research, including the integration of additional data sources
and scalability in decentralized markets.
Acknowledgments
We would like to express our sincere gratitude to the following individuals for
their groundbreaking contributions to the field of market microstructure, which
have greatly influenced and inspired our research:
Dr. Martin David Gould and Dr. Julius Bonart: Their seminal work, "Queue
Imbalance as a One-Tick-Ahead Price Predictor in a Limit Order Book" (December
2015), provided essential insights into price prediction using order book
data. This research served as a key framework for the development of our
microforecasting metric.
Dr. Bao Linh Do and Professor Tālis J. Putniņš: We are deeply indebted to
their pivotal study, "Detecting Layering and Spoofing in Markets" (August
2023). Their contributions to spoofing detection have directly informed the
creation of our Counterspoofing Scanner and the spoofing identification
methodologies outlined in this paper.
Without the pioneering research of these individuals, the techniques and
advancements presented in this paper would not have been possible. We are
grateful for their contributions to the academic community and for laying the
foundation for further exploration in this vital area of market analysis.

Contents

I Market Manipulation Filters and Prediction Scoring Systems 4
1 Defining Spoofing Detection and Liquidity Manipulation as a
Probability Score 5
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Spoofing Identification Score . . . . . . . . . . . . . . . . . . . . 5
1.3 Metric 1: High Quoting Activity . . . . . . . . . . . . . . . . . . 5
1.4 Metric 2: Unbalanced Quoting . . . . . . . . . . . . . . . . . . . 6
1.5 Metric 3: Abnormal Cancellations . . . . . . . . . . . . . . . . . 7
1.6 Metric 4: Low Execution Probability . . . . . . . . . . . . . . . . 7
1.7 Metric 5: Trades Oppose Quotes . . . . . . . . . . . . . . . . . . 7
1.8 Metric 6: Cancels Oppose Trades . . . . . . . . . . . . . . . . . . 7
1.9 Metric 7: Cyclical Pattern in Depth . . . . . . . . . . . . . . . . 8
1.10 Metric 8: Cyclical Pattern in Cancellations . . . . . . . . . . . . 8
1.11 Metric Integration with Machine Learning Classifiers . . . . . . . 9
1.12 Conclusion on the Spoofing Detection Model . . . . . . . . . . . 11

2 Predictive Modeling with 2-Tick Forecasting: The Microprobability Metric for Intraday Small Factor Prediction 12
2.1 2-Tick Forecasting Microprobability Metric . . . . . . . . . . . . 12
2.2 Limit Order Book Data: Input to the Model . . . . . . . . . . . . 13
2.3 Input Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4 Bayesian Framework . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5 Step 1: Defining Prior Probabilities . . . . . . . . . . . . . . . . . 14
2.6 Step 2: Likelihood Calculation . . . . . . . . . . . . . . . . . . . 14
2.7 Step 3: Posterior Probability . . . . . . . . . . . . . . . . . . . . 14
2.8 Model Implementation . . . . . . . . . . . . . . . . . . . . . . . . 15
2.9 Bayesian Microprobability Conclusion . . . . . . . . . . . . . . . 15
2.10 Metric 2: 1-Tick Forecasting Metric . . . . . . . . . . . . . . . . 15
2.11 Step-by-Step Explanation of the 1-Tick Forecasting Metric . . . . 16
2.12 Combining the Metrics: Kalman Filter Implementation . . . . . 17
2.13 Kalman Filter Overview . . . . . . . . . . . . . . . . . . . . . . . 17
2.14 State Variables and Measurements . . . . . . . . . . . . . . . . . 17
2.15 Kalman Filter Equations . . . . . . . . . . . . . . . . . . . . . . . 18

2.16 Final Output: 0-100 Score for Long and Short . . . . . . . . . . . 19

II Market Imbalance Scanners: Counterspoofing, Liquidity Imbalances, and Aggressive Order Identification 21
3 Classification of Spoofing Phases and Determination of Optimal Trade Timing (Counterspoof Scanner) 22
3.1 Introduction on Counterspoofing . . . . . . . . . . . . . . . . . . 22
3.2 Spoofing Detection . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3 Spoofing Cycle Detection and Machine Learning Integration . . . 23
3.4 Preprocessing and Model Training . . . . . . . . . . . . . . . . . 23
3.5 Trade Direction Based on Spoofing Cycle . . . . . . . . . . . . . 24
3.6 Modeling Liquidity and Market Impact . . . . . . . . . . . . . . 24
3.7 Conclusion on Counterspoofing Trades . . . . . . . . . . . . . . . 24

4 Detection and Exploitation of Market Sentiment Shifts via Aggressive Order Identification (DEX-Array Scanner) 27
4.1 Introduction on DEX-ARRAY Scanner . . . . . . . . . . . . . . . 27
4.2 Why the DEX-Array Scanner Works . . . . . . . . . . . . . . . . 27
4.3 Aggressive Order Detection . . . . . . . . . . . . . . . . . . . . . 28
4.4 Price Change Verification . . . . . . . . . . . . . . . . . . . . . . 28
4.5 Integration with the Microprobability Metric, Spoofing Score,
and HF Omega . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.6 Conclusion on DEX-ARRAY Scanner . . . . . . . . . . . . . . . 30

5 Liquidity Imbalance Detection and Scalability in High-Frequency Environments (Trespass Scanner) 31
5.1 Introduction on Trespass Scanner . . . . . . . . . . . . . . . . . . 31
5.2 Why the Trespass Scanner Works . . . . . . . . . . . . . . . . . . 31
5.3 Liquidity Imbalance Detection and Tick Stacking . . . . . . . . . 32
5.4 Proprietary Noise Filtering and Final Signal Metric . . . . . . . . 32
5.5 Trade Validation and Filtering . . . . . . . . . . . . . . . . . . . 33
5.6 Risk Management and Exit Logic . . . . . . . . . . . . . . . . . . 33
5.7 Conclusion on Trespass Scanner . . . . . . . . . . . . . . . . . . . 34

III Implications and Future Directions 35


6 Conclusion 36
6.1 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

Part I

Market Manipulation Filters and Prediction Scoring Systems
Chapter 1

Defining Spoofing Detection and Liquidity Manipulation as a Probability Score

1.1 Introduction
Spoofing in the financial markets refers to the practice of placing large orders
on one side of the market to create a false impression of demand or supply, with
the intent of canceling those orders before they are executed. Spoofing distorts
market prices and creates unfair trading conditions. Detecting spoofing behavior
is vital for maintaining a fair market and for improving trading strategies.

1.2 Spoofing Identification Score


The Spoofing Detection Scanner uses several key metrics designed to identify
spoofing patterns in real time. While spoofing detection is complex, our ap-
proach simplifies this task by focusing on abnormal market behaviors—such
as high quoting activity, unbalanced quoting, and abnormal order cancella-
tions—that often precede manipulation. These metrics serve as the backbone
of not only our spoofing detection algorithms but also other related trading
strategies.

1.3 Metric 1: High Quoting Activity


High quoting activity indicates the presence of large orders placed on both sides
of the order book, often with the intention of manipulating market sentiment.
The metric is defined as:

\[
HQ_{i,s}(d) = \max_{t \in s(d)} \frac{\left| \mathrm{EntryAskSize}_{i,t} - \mathrm{EntryBidSize}_{i,t} \right|}{\mathrm{AskSize}_{i,t} + \mathrm{BidSize}_{i,t}}
\]
This metric is particularly useful in identifying spoofing because spoofers
frequently submit large orders to create an imbalance in the order book, which
is later canceled once their manipulation achieves the desired market reaction.

1.4 Metric 2: Unbalanced Quoting


Unbalanced quoting reflects significant discrepancies between the bid and ask
sizes, which are a hallmark of spoofing activity. The metric is given by:
\[
UQ_{i,s}(d) = \max_{t \in s(d)} \frac{\left| \mathrm{AskSize}_{i,t} - \mathrm{BidSize}_{i,t} \right|}{\mathrm{AskSize}_{i,t} + \mathrm{BidSize}_{i,t}}
\]
Unbalanced quoting is key to identifying potential spoofing, as manipulators
often create an artificial imbalance to move the market in their desired direction.

1.5 Metric 3: Abnormal Cancellations
Abnormal cancellations occur when large orders are rapidly withdrawn from one
side of the market, usually after the manipulator has achieved their intended
price movement. This metric captures that behavior:
\[
AC_{i,s}(d) = \max_{t \in s(d)} \frac{\left| \mathrm{CancelAskSize}_{i,t} - \mathrm{CancelBidSize}_{i,t} \right|}{\mathrm{AskSize}_{i,t} + \mathrm{BidSize}_{i,t}}
\]
By monitoring abnormal cancellations, we can detect when spoofers pull
their orders after influencing the market.
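The three metrics above (HQ, UQ, and AC) share the same form: a session-level maximum of a normalized size difference taken over order book snapshots. A minimal sketch in Python, assuming per-snapshot size sequences for one session s(d) (the argument names and data layout are illustrative, not specified by the paper):

```python
def session_metrics(entry_ask, entry_bid, cancel_ask, cancel_bid, ask, bid):
    """HQ, UQ, and AC for one session s(d).

    Each argument is a sequence with one value per order book snapshot t in
    the session; names and layout are illustrative assumptions.
    """
    depth = [a + b for a, b in zip(ask, bid)]  # shared denominator per snapshot
    hq = max(abs(ea - eb) / d for ea, eb, d in zip(entry_ask, entry_bid, depth))
    uq = max(abs(a - b) / d for a, b, d in zip(ask, bid, depth))
    ac = max(abs(ca - cb) / d for ca, cb, d in zip(cancel_ask, cancel_bid, depth))
    return hq, uq, ac
```

Because all three ratios are normalized by the same total displayed size, a single pass over the session's snapshots yields all three scores.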

1.6 Metric 4: Low Execution Probability


Spoofing orders typically have a low probability of execution, as they are placed
with the intent of being canceled. This metric measures that likelihood:
\[
LE_{i,s}(d) = \max_{t \in s(d)} \frac{\left| \mathrm{AskSizeLevel2to5}_{i,t} - \mathrm{BidSizeLevel2to5}_{i,t} \right|}{\mathrm{AskSize}_{i,t} + \mathrm{BidSize}_{i,t}}
\]

Low execution probability is a hallmark of spoofing orders, as the manipu-


lator wants to avoid their orders being filled while still impacting the market.

1.7 Metric 5: Trades Oppose Quotes


Trades that occur on the opposite side of a recent order book imbalance indicate
potential spoofing. For example, if there is a large imbalance on the ask side
but trades occur on the bid, this suggests manipulative behavior:

\[
TOQ_{i,s} =
\begin{cases}
1 & \text{if } OIB^{Ask}_{i,s-1} > 10\% \text{ and } Trade^{Bid}_{i,s} = 1 \\
1 & \text{if } OIB^{Bid}_{i,s-1} > 10\% \text{ and } Trade^{Ask}_{i,s} = 1 \\
0 & \text{otherwise}
\end{cases}
\]

This metric helps in pinpointing the exact moments when spoofing is likely
happening.

1.8 Metric 6: Cancels Oppose Trades


Cancels that occur on the opposite side of trades, especially after large quoting
activity, signal spoofing. The metric is calculated as:

\[
COT_{i,s} =
\begin{cases}
1 & \text{if } CIB^{Ask}_{i,s-1} > 10\% \text{ and } Trade^{Bid}_{i,s} = 1 \\
1 & \text{if } CIB^{Bid}_{i,s-1} > 10\% \text{ and } Trade^{Ask}_{i,s} = 1 \\
0 & \text{otherwise}
\end{cases}
\]

Spoofers will often cancel their orders immediately after trades occur on the
opposite side, making this a key indicator.
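The two opposing-side indicators (TOQ and COT) differ only in which imbalance is tested: quoting imbalance for TOQ, cancellation imbalance for COT. A minimal sketch of the shared logic, where the signed-imbalance convention (positive = ask-side heavy) is an illustrative assumption:

```python
def opposing_indicator(imbalance_prev, trade_side):
    """1 if the prior-interval imbalance exceeded 10% on one side while the
    trade printed on the opposite side; 0 otherwise.

    imbalance_prev: signed ratio in [-1, 1], positive meaning ask-side heavy
    (sign convention and signature are illustrative assumptions).
    trade_side: 'bid' or 'ask', the side on which the trade in interval s occurred.
    """
    if imbalance_prev > 0.10 and trade_side == 'bid':
        return 1
    if imbalance_prev < -0.10 and trade_side == 'ask':
        return 1
    return 0
```

Called with the lagged quoting imbalance it yields TOQ; called with the lagged cancellation imbalance it yields COT.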

1.9 Metric 7: Cyclical Pattern in Depth


The cyclical nature of unbalanced quoting can be a strong indicator of spoofing.
This metric tracks how often quoting imbalances switch from one side of the
market to the other:
\[
CPD_{i,s}(d) = \sum_{t \in s(d)} \mathbb{I}\{UQ_{i,t}\}
\]
By monitoring these patterns, we can identify repetitive manipulative be-
havior.

1.10 Metric 8: Cyclical Pattern in Cancellations


Spoofers often cancel orders in a cyclical fashion, rapidly switching sides in an
effort to manipulate prices. The cyclical pattern in cancellations is given by:
\[
CPC_{i,s}(d) = \sum_{t \in s(d)} \mathbb{I}\{UC_{i,t}\}
\]
Tracking cyclical cancellations is essential in detecting repeat manipulation
tactics.

1.11 Metric Integration with Machine Learning
Classifiers
The Spoofing Detection Model combines traditional financial metrics with ma-
chine learning techniques to improve the accuracy of identifying manipulation
patterns. In this section, we integrate two machine learning classifiers—Random
Forest (RF) and Boosted Tree (BT)—and combine their outputs into a final
probability index.
Step 1: Data Preprocessing
Before training the classifiers, the input metrics must undergo a preprocess-
ing phase. This includes:
• Normalization: Each metric is normalized to ensure that features have
a mean of 0 and a standard deviation of 1, reducing potential bias from
larger-scaled metrics.
• Feature Selection: Important metrics such as High Quoting Activity
(HQ), Unbalanced Quoting (UQ), and Abnormal Cancellations (AC) are
selected based on their predictive power. Correlation analysis is performed
to avoid multicollinearity.
• Class Labeling: Historical data is labeled as either ”spoofing” or ”non-
spoofing” based on manual inspection or previous algorithmic models.
This labeled dataset serves as the ground truth for training.
Let the preprocessed features for observation i at time s be denoted as:

\[
X_{i,s} = \{ HQ_{i,s-1},\, UQ_{i,s-1},\, AC_{i,s},\, TOQ_{i,s},\, COT_{i,s},\, CPD_{i,s},\, CPC_{i,s} \}
\]


Step 2: Random Forest Classifier
The Random Forest model operates by constructing multiple decision trees
during training, each trained on different random subsets of the dataset. The
classifier predicts the probability of spoofing by averaging the predictions from
all trees:
\[
P^{Spoof}_{RF}(i, s) = \frac{1}{T} \sum_{t=1}^{T} h_t(X_{i,s})
\]
Where T is the number of trees in the forest, and ht (Xi,s ) is the output of
the t-th tree, predicting the probability of spoofing for observation i at time s.
Step 3: Boosted Tree Classifier
The Boosted Tree model uses a sequential approach where weak classifiers
(decision trees) are trained iteratively. At each iteration, the model focuses on
correcting the errors made by the previous model, producing a strong classifier:
\[
P^{Spoof}_{BT}(i, s) = \sum_{k=1}^{K} \alpha_k f_k(X_{i,s})
\]

Where K is the number of boosting iterations, αk is the weight of the k-th
model, and fk (Xi,s ) is the output of the k-th weak learner.
Step 4: Cross-Validation and Model Evaluation
Both models are evaluated using k-fold cross-validation on the training set.
The evaluation metrics include:

• Accuracy: Percentage of correctly predicted spoofing and non-spoofing


instances.

• Precision and Recall: Precision measures the percentage of true pos-


itive spoofing detections, while recall measures the proportion of actual
spoofing cases detected.
• AUC-ROC: The area under the ROC curve is calculated to determine the
model’s ability to distinguish between spoofing and non-spoofing events.

The model with higher AUC-ROC scores receives a higher weight during the
final integration step.
Step 5: Probability Integration
After training and evaluating both classifiers, their outputs are integrated
into a combined probability index. The final probability of spoofing is computed
as a weighted combination of the outputs from the Random Forest and Boosted
Tree classifiers:

\[
P^{Spoof}(i, s) = \gamma_1 P^{Spoof}_{RF}(i, s) + \gamma_2 P^{Spoof}_{BT}(i, s)
\]


Where γ1 and γ2 are the weights assigned to the Random Forest and Boosted
Tree classifiers, respectively. These weights are determined based on their cross-
validation performance, ensuring that the more accurate model has a stronger
influence on the final probability index.
Step 6: Final Spoofing Probability
The final step combines the probability from the machine learning classifiers
with the original logit model, incorporating historical data for a comprehensive
detection mechanism:
 
\[
\ln\!\left(\frac{P}{1-P}\right)_{i,s} = \alpha + \beta_1 HQ_{i,s-1} + \beta_2 UQ_{i,s-1} + \beta_3 AC_{i,s} + \beta_4 TOQ_{i,s} + \beta_5 COT_{i,s} + \beta_6 CPD_{i,s} + \beta_7 CPC_{i,s} + \gamma_1 P_{RF}(i, s) + \gamma_2 P_{BT}(i, s) + \epsilon_{i,d}
\]
This combined probability index strengthens spoofing detection by leveraging both the traditional logit model and the machine learning classifiers.
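Steps 2 through 5 above can be sketched with scikit-learn. The synthetic data, the choice of `GradientBoostingClassifier` as the boosted-tree model, and the rule of setting the weights γ1, γ2 proportional to cross-validated AUC-ROC are illustrative assumptions; the paper does not specify an implementation:

```python
# Sketch of the RF + BT ensemble with AUC-weighted probability integration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 7))  # columns stand in for HQ, UQ, AC, TOQ, COT, CPD, CPC
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)  # toy labels

rf = RandomForestClassifier(n_estimators=100, random_state=0)
bt = GradientBoostingClassifier(random_state=0)

# Step 4: k-fold cross-validated AUC-ROC for each classifier.
auc_rf = cross_val_score(rf, X, y, cv=5, scoring="roc_auc").mean()
auc_bt = cross_val_score(bt, X, y, cv=5, scoring="roc_auc").mean()

# Step 5: weight each model's probability by its relative AUC (gamma_1, gamma_2).
g1, g2 = auc_rf / (auc_rf + auc_bt), auc_bt / (auc_rf + auc_bt)
rf.fit(X, y)
bt.fit(X, y)
p_spoof = g1 * rf.predict_proba(X)[:, 1] + g2 * bt.predict_proba(X)[:, 1]
```

Because the weights are a convex combination of two probabilities, the blended score stays in [0, 1] and can feed directly into the logit model of Step 6.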

1.12 Conclusion on the Spoofing Detection Model
The Spoofing Detection Model incorporates metrics like High Quoting Activity,
Unbalanced Quoting, and Abnormal Cancellations to calculate a spoofing prob-
ability score. A logit model is applied to produce this score based on real-time
data. The model relies on these combined metrics to identify spoofing patterns
across various timeframes, enabling the system to differentiate between organic
market behavior and manipulation. This structured approach allows for the
integration of multiple data points to observe market conditions during known
spoofing cycles and refine detection accuracy.

Chapter 2

Predictive Modeling with 2-Tick Forecasting: The Microprobability Metric for Intraday Small Factor Prediction

2.1 2-Tick Forecasting Microprobability Metric


The 2-Tick Forecasting Microprobability Metric is a sophisticated tool designed
for predicting market movements at the tick level. This document introduces
three core sub-metrics: the Bayesian Microprobability, the 1-Tick Forecasting
Metric, and the 2-Tick Microprobability. The Bayesian ML Microprobability
leverages data from the top 5 levels of the limit order book (LOB) over the
last 2000 bars (or user-configurable), excluding manipulated areas flagged by
a Spoofing Detection Metric, to build a dynamic Bayesian model that predicts
future price movements in real time. The 1-Tick Forecasting Metric utilizes
queue imbalance, derived from the relative sizes of the bid and ask queues, to
predict one-tick-ahead price changes, while the 2-Tick Microprobability extends
this approach to forecast movements over two ticks by blending the Bayesian
ML and 1-Tick metrics. Each sub-metric provides a probability score between
0 and 100 for both long (buy) and short (sell) directions, allowing for accurate
and adaptive predictions of market moves. This metric serves as the backbone
for a wide range of trading strategies, enhancing precision and minimizing the
impact of market manipulation.

2.2 Limit Order Book Data: Input to the Model
The LOB represents all outstanding buy and sell orders in the market at different
price levels. For this model, we focus on the top five levels on both sides of the
book—buy (bid) and sell (ask). These levels give insights into market liquidity
and supply-demand imbalances, key factors in short-term price movements. The
data from these levels, particularly the quantity and price, serve as the primary
inputs for our Bayesian model.

2.3 Input Structure


The input to our model consists of two sets of variables:

• B = {b1 , b2 , b3 , b4 , b5 }, where bi represents the quantity at the i-th bid


level.
• A = {a1 , a2 , a3 , a4 , a5 }, where ai represents the quantity at the i-th ask
level.
Each data point in this dataset represents a snapshot of the LOB at a specific
time, including both the price and volume associated with each level. This data
will be used to compute the probability distributions for upward and downward
price movements.

2.4 Bayesian Framework


Bayesian inference provides a structured way to update the probability estimates
for price movement based on prior knowledge and new data. Specifically, we
use Bayes’ Theorem:

\[
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)} \tag{2.1}
\]
Where:
• P (H|D) is the posterior probability, or the probability of the hypothesis H
(in this case, a price increase or decrease) given the data D (LOB inputs).

• P (H) is the prior probability, which represents our belief in a price move-
ment before observing new LOB data.
• P (D|H) is the likelihood, representing the probability of observing the
LOB data given that the hypothesis is true.

• P (D) is the marginal likelihood, which normalizes the result by considering


all possible outcomes.

2.5 Step 1: Defining Prior Probabilities
To begin, we define the prior probabilities for both the long (price increase) and
short (price decrease) directions. These priors can be derived from historical
price movement patterns, where we observe the frequency of upward and down-
ward price shifts over time. For example, if in a given market session, prices
increased 60% of the time and decreased 40%, we can set the priors accordingly:

P (Price Increase) = 0.6 and P (Price Decrease) = 0.4


The flexibility of the Bayesian approach allows us to update these priors as
more market data becomes available.

2.6 Step 2: Likelihood Calculation


Next, we compute the likelihood of observing the current LOB configuration
given a price increase or decrease. The likelihood function models the relation-
ship between order book data and price movement. In this case, the likelihood
depends on the volume and spread between bid and ask levels, as well as the
overall imbalance between buy and sell orders.
We define a function that calculates the imbalance I at each level:

\[
I_i = b_i - a_i
\]
Summing across all five levels gives the total imbalance:
\[
I_{\mathrm{total}} = \sum_{i=1}^{5} I_i
\]

This imbalance helps inform the likelihood P(D|H): significant imbalances in
favor of the bids (positive I_total) suggest a higher likelihood of upward
price movement, while imbalances favoring the asks (negative I_total) suggest
downward pressure.

2.7 Step 3: Posterior Probability


Using Bayes’ Theorem, we compute the posterior probability for each price
movement hypothesis (increase or decrease) based on the LOB data. The pos-
terior is updated iteratively as new data arrives, allowing the model to refine
its predictions dynamically.
The posterior probability for a price increase is calculated as:

\[
P(\text{Price Increase} \mid D) = \frac{P(D \mid \text{Price Increase})\, P(\text{Price Increase})}{P(D)}
\]
Similarly, for a price decrease:
\[
P(\text{Price Decrease} \mid D) = \frac{P(D \mid \text{Price Decrease})\, P(\text{Price Decrease})}{P(D)}
\]
Here, P(D) is the marginal likelihood, computed as the sum of the likelihoods
over both hypotheses:
\[
P(D) = P(D \mid \text{Price Increase})\, P(\text{Price Increase}) + P(D \mid \text{Price Decrease})\, P(\text{Price Decrease})
\]
This normalization ensures that the posterior probabilities sum to 1, providing
a well-calibrated forecast of price movement direction.
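One Bayesian update step can be sketched as follows. Since the paper leaves the likelihood function P(D|H) unspecified beyond its dependence on the imbalance, a logistic function of I_total is used here purely as an illustrative assumption, as are the function name and the `scale` parameter:

```python
import math

def posterior_up(i_total, p_up_prior=0.6, scale=50.0):
    """One Bayesian update for P(price increase | LOB data).

    i_total: total bid-ask imbalance across the top 5 levels.
    The logistic likelihood and `scale` (how strongly imbalance moves the
    likelihood) are illustrative assumptions, not the paper's specification.
    """
    p_down_prior = 1.0 - p_up_prior
    # Likelihoods P(D|H): imbalance toward the bids favors the "up" hypothesis.
    lik_up = 1.0 / (1.0 + math.exp(-i_total / scale))
    lik_down = 1.0 - lik_up
    # Marginal likelihood P(D) normalizes over both hypotheses.
    evidence = lik_up * p_up_prior + lik_down * p_down_prior
    return lik_up * p_up_prior / evidence  # Bayes' theorem
```

With a neutral book (I_total = 0) the posterior simply returns the prior; a strongly bid-heavy book pushes it above the prior, and an ask-heavy book below it.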

2.8 Model Implementation


The Bayesian ML Microprobability model is implemented in three core phases:
1. Initialization: Set the prior probabilities based on historical LOB data or
market behavior.
2. Inference Loop: As new LOB data is collected, calculate the imbalance at
each level, update the likelihood, and compute the posterior probabilities
for long and short directions.
3. Decision Making: Use the posterior probabilities to determine which side
of the market to favor. For example, if the posterior probability of a price
increase exceeds a predefined threshold (e.g., 0.7), the system can trigger
a buy signal.

2.9 Bayesian Microprobability Conclusion


The Bayesian Microprobability metric offers a powerful tool for forecasting
short-term price movements by leveraging the probabilistic framework of Bayesian
inference. By incorporating historical and real-time LOB data, this model dy-
namically adjusts its predictions as new information becomes available, allowing
for more precise and adaptive trading decisions. The framework described here
provides a foundation for further refinements, such as incorporating additional
data sources or adjusting the priors and likelihood functions based on changing
market conditions.

2.10 Metric 2: 1-Tick Forecasting Metric


The second sub-metric is based on research from Martin D. Gould and Julius
Bonart in "Queue Imbalance as a One-Tick-Ahead Price Predictor in a Limit
Order Book." The 1-Tick Forecasting Metric uses the queue imbalance method
to predict one-tick-ahead price movements in a limit order book.

2.11 Step-by-Step Explanation of the 1-Tick Forecasting Metric
1. Market and Limit Orders in an LOB: Market orders are immediately matched
against existing limit orders. Limit orders do not immediately match and
instead become active orders in the LOB. The LOB L(t) at any given time t
consists of all active buy and sell limit orders for a given asset.

2. Bid Price and Ask Price: The bid price b(t) is the highest price among all
active buy orders:
\[
b(t) = \max \{ p_x \mid x \in L(t),\ \omega_x < 0 \}
\]
The ask price a(t) is the lowest price among all active sell orders:
\[
a(t) = \min \{ p_x \mid x \in L(t),\ \omega_x > 0 \}
\]
The bid and ask prices are collectively referred to as the best quotes.

3. Queue Sizes at the Best Quotes: The size of the buy queue at the bid price
b(t) is denoted n^b(b(t), t), representing the total volume of buy orders at
the bid:
\[
n^b(b(t), t) = \sum_{x \in L(t),\, p_x = b(t)} |\omega_x|
\]
Similarly, the size of the sell queue at the ask price a(t) is denoted
n^a(a(t), t), representing the total volume of sell orders at the ask:
\[
n^a(a(t), t) = \sum_{x \in L(t),\, p_x = a(t)} \omega_x
\]

4. Queue Imbalance I(t): The Queue Imbalance is a normalized measure of the
difference between the sizes of the buy and sell queues at the best quotes. It
provides insight into the relative strength of buying versus selling pressure
in the market. The Queue Imbalance I(t) at time t is defined as:
\[
I(t) = \frac{n^b(b(t), t) - n^a(a(t), t)}{n^b(b(t), t) + n^a(a(t), t)}
\]
If I(t) is close to 1, this suggests strong buying pressure (the buy queue is
much larger than the sell queue). If I(t) is close to -1, this suggests strong
selling pressure (the sell queue is much larger than the buy queue). If I(t)
is close to 0, the buying and selling pressures are balanced.
5. Predicting the 1-Tick Move: The core hypothesis is that the Queue Imbalance
I(t) provides predictive power for the direction of the next mid-price
movement (the mid-price being the average of the bid and ask prices). Positive
Queue Imbalance: if I(t) > 0, indicating net buying pressure, the next price
movement is more likely to be an upward tick (an increase in the mid-price).
Negative Queue Imbalance: if I(t) < 0, indicating net selling pressure, the
next price movement is more likely to be a downward tick (a decrease in the
mid-price).
6. Scoring the 1-Tick Forecasting Metric: The 1-Tick Forecasting Metric
assigns a probability score for both upward and downward price movements based
on the queue imbalance. The probability of an upward 1-tick movement at time t
is given by:
\[
MP_{\text{1-tick long}}(i, t) = \frac{I(t) + 1}{2}
\]
The probability of a downward 1-tick movement at time t is given by:
\[
MP_{\text{1-tick short}}(i, t) = 1 - MP_{\text{1-tick long}}(i, t)
\]
These probabilities rescale the Queue Imbalance from the range [-1, 1] to the
range [0, 1], providing a score that represents the likelihood of a one-tick
price movement in the long or short direction. A score close to 1 for
MP_{1-tick long} suggests a strong likelihood of an upward price movement,
while a score close to 1 for MP_{1-tick short} suggests a strong likelihood of
a downward price movement.
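The six steps above reduce to two lines of arithmetic once the best-quote queue volumes are known. A minimal sketch (the function and argument names are illustrative):

```python
def one_tick_scores(bid_queue, ask_queue):
    """Queue imbalance I(t) and the 1-tick long/short scores.

    bid_queue / ask_queue: total volumes resting at the best bid and best ask
    (names are illustrative assumptions).
    """
    i_t = (bid_queue - ask_queue) / (bid_queue + ask_queue)  # I(t) in [-1, 1]
    p_long = (i_t + 1.0) / 2.0                               # rescale to [0, 1]
    return p_long, 1.0 - p_long
```

A balanced book (equal queues) gives I(t) = 0 and a long score of exactly 0.5.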

2.12 Combining the Metrics: Kalman Filter Implementation
The Bayesian ML Microprobability and the 1-Tick Forecasting Metric are both
effective predictors of short-term price movements, each leveraging different as-
pects of market data. To create a more robust and adaptive final prediction,
we combine these two metrics using a Kalman filter, which dynamically adjusts
the weight of each metric based on their accuracy in real-time.

2.13 Kalman Filter Overview


A Kalman filter is a recursive algorithm that estimates the state of a system (in
this case, the future price movement) by combining multiple noisy measurements
(here, the Bayesian ML Microprobability and 1-Tick Forecasting Metric). The
filter updates its estimates as new data arrives, adjusting the weight given to
each metric based on how well they are predicting the current market.

2.14 State Variables and Measurements


We define the following variables for our Kalman filter:

• State Variable xt : The true underlying probability of a price movement


at time t, which we want to estimate.

• Measurement 1 MP_Bayesian(i, t): The Bayesian ML Microprobability at
time t, providing a predicted probability of an upward or downward price
movement.
• Measurement 2 M P1-tick (i, t): The 1-Tick Forecasting Metric at time
t, offering a complementary probability prediction based on queue imbal-
ance.

The goal of the Kalman filter is to combine these two measurements to obtain
a more accurate estimate of xt , the true probability of a price movement.

2.15 Kalman Filter Equations


1. Prediction Step: At each time step t, we predict the next state xt and the
uncertainty associated with this prediction.

\[
\hat{x}_{t|t-1} = A \hat{x}_{t-1|t-1} + B u_{t-1}
\]

Where:
• x̂t|t−1 is the predicted state at time t, given the state at time t − 1.
• A is the state transition matrix (in our case, this can be considered as 1,
assuming a steady system).
• But−1 represents any control input, which can be 0 if no external forces
are applied.
We also predict the uncertainty Pt associated with the state:

\[
P_{t|t-1} = A P_{t-1|t-1} A^{\top} + Q
\]

Where:
• Pt|t−1 is the predicted uncertainty of the state at time t.
• Q is the process noise covariance matrix, representing the uncertainty in
the prediction model.
2. Update Step: In the update step, we correct the prediction using the
actual measurements from both metrics, combining them based on their uncer-
tainties.
\[
K_t = P_{t|t-1} H^{\top} \left( H P_{t|t-1} H^{\top} + R \right)^{-1}
\]
Where:
• Kt is the Kalman gain, which adjusts how much weight to give to the
measurements.
• H is the measurement matrix, mapping the state space to the measure-
ment space.

• R is the measurement noise covariance matrix, representing the uncer-
tainty in the measurements from the two metrics.
The Kalman filter then updates the state estimate based on the new mea-
surements:
\[
\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \left( z_t - H \hat{x}_{t|t-1} \right)
\]
Where:
• z_t is the measurement vector containing the two metrics: z_t = [MP_Bayesian(i, t), MP_{1-tick}(i, t)]^⊤.
• x̂t|t is the updated estimate of the state, combining both metrics.
Finally, the uncertainty estimate is updated:

\[
P_{t|t} = (I - K_t H) P_{t|t-1}
\]

Where:
• I is the identity matrix.
• Pt|t is the updated uncertainty in the state estimate.
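The predict/update cycle above can be sketched with a scalar state. Setting A = 1 and each H entry to 1 (both metrics measure the state directly), fixing Q and R numerically, and processing the two measurements sequentially (equivalent to the vector update when R is diagonal) are all simplifying assumptions; the paper gives no numeric noise values:

```python
def kalman_fuse(measurements, q=1e-4, r=(0.02, 0.05), x0=0.5, p0=1.0):
    """Fuse per-step (MP_Bayesian, MP_1-tick) probability pairs.

    q: process noise variance; r: measurement noise variances for the two
    metrics; x0, p0: initial state and uncertainty. All values are
    illustrative assumptions.
    """
    x, p = x0, p0
    for z1, z2 in measurements:
        # Prediction step (A = 1, no control input): uncertainty grows by Q.
        p = p + q
        # Update step: sequential scalar updates, one per measurement.
        for z, rv in ((z1, r[0]), (z2, r[1])):
            k = p / (p + rv)        # Kalman gain
            x = x + k * (z - x)     # state update toward the measurement
            p = (1.0 - k) * p       # uncertainty shrinks after each update
    return x
```

The gain structure makes the weighting explicit: the metric with the smaller measurement variance pulls the fused estimate harder, which is the dynamic-weighting behavior described above.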

2.16 Final Output: 0-100 Score for Long and Short
The final output of the Kalman filter at each time step is the combined prediction
x̂t|t , which represents the best estimate of the probability of a price movement
based on the Bayesian ML Microprobability and the 1-Tick Forecasting Metric.
The Kalman gain Kt ensures that the more accurate metric at any given moment
is given more weight, dynamically adjusting as the market evolves.
This combined approach enhances prediction accuracy by leveraging the
strengths of both metrics while continuously adjusting for real-time changes
in market conditions.
After combining the Bayesian ML Microprobability and the 1-Tick Forecast-
ing Metric using the Kalman filter, we normalize the final output to obtain a
score between 0 and 100 for both long and short predictions. The normalization
process ensures that the combined probability score is scaled appropriately for
decision-making.
The final scores are calculated as follows:
M Pfinal long (i, t) = (x̂t|t / x̂max ) × 100

M Pfinal short (i, t) = 100 − M Pfinal long (i, t)
Where:
• M Pfinal long (i, t) is the probability score for a long (upward) price move-
ment, constrained between 0 and 100.

• M Pfinal short (i, t) is the probability score for a short (downward) price
movement, constrained between 0 and 100.
• x̂max is a normalization factor to scale the output to the 0-100 range.
These scores represent the combined and filtered predictions of the two met-
rics, offering a dynamic and adaptive tool for short-term market predictions.
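The rescaling step above can be sketched as follows; taking x̂max = 1 is an assumption that holds when the fused state is itself a probability:

```python
def final_scores(x_filtered, x_max=1.0):
    """Map the fused Kalman estimate to complementary 0-100 long/short scores."""
    long_score = min(100.0, max(0.0, x_filtered / x_max * 100.0))  # clamp to [0, 100]
    short_score = 100.0 - long_score
    return long_score, short_score
```

By construction the two scores always sum to 100, so a single fused estimate yields both the long and the short reading.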

Part II

Market Imbalance
Scanners: Counterspoofing,
Liquidity Imbalances, and
Aggressive Order
Identification

Chapter 3

Classification of Spoofing
Phases and Determination
of Optimal Trade Timing
(Counterspoof Scanner)

3.1 Introduction to Counterspoofing


The Counterspoofing component of L2Azimuth systematically responds to liq-
uidity manipulation practices by market spoofers. This approach works by
identifying patterns of spoofing behavior, which distort the order book through
large, deceptive orders that are later canceled before execution. The Counter-
spoofing strategy positions itself at advantageous points in the spoofing cycle,
allowing for trades that counteract these manipulative moves. Rather than
avoiding spoofing-heavy environments, as many traditional strategies do, this
component leverages the liquidity dynamics created by spoofers; a secondary
effect of this is that with enough counter-spoofing participants, spoofers may
be forced to exit their positions via market orders causing a major increase in
strategy alpha.

3.2 Spoofing Detection


The primary condition for initiating a counterspoofing trade is detecting a high
likelihood of spoofing activity. The spoofing score, previously defined as:
 
Ss = ln (Pi,s / (1 − Pi,s ))
is used to determine the probability P of a spoofing event. For a counterspoofing trade to trigger, the spoofing probability Pi,s must exceed 0.95 (equivalently, the log-odds score Ss must exceed ln(0.95/0.05) ≈ 2.94), indicating at least 95% certainty that spoofing is occurring.
Once spoofing is detected, the next step is to check the Microprobability Congruity Metric (if enabled). This further refines the decision by ensuring that the expected price direction aligns with the spoofing behavior.
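The log-odds score and its probability threshold can be expressed directly; this is a minimal sketch, and the function names are illustrative rather than part of the system's API:

```python
import math

def spoofing_score(p):
    """S_s = ln(p / (1 - p)): log-odds of the spoofing probability p."""
    return math.log(p / (1.0 - p))

def counterspoof_trigger(p, p_threshold=0.95):
    """True when the spoofing probability clears the 95% certainty bar."""
    return spoofing_score(p) > spoofing_score(p_threshold)
```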

3.3 Spoofing Cycle Detection and Machine Learning Integration
Identifying spoofing activity is only the first step. To maximize the effectiveness
of counterspoofing trades, it is essential to discern which phase of the spoofing
cycle the market is currently in. Spoofers typically follow a cycle of placing large
fake orders and then canceling them to influence price direction. By timing the
entry to coincide with specific phases in this cycle, the counterspoofing strategy
optimizes trade outcomes.
We trained a machine learning model using data from real court cases of
proven spoofing activity, as detailed in "Detecting Layering and Spoofing in Markets". The dataset contains time-stamped market activities, such as
large, visible orders placed and canceled by known spoofers. This labeled data
provides a robust foundation for training our model.

3.4 Preprocessing and Model Training


To train the model, we first preprocessed the real-time data into a matrix format
suitable for machine learning. The matrix includes the following variables:

• The time of order placement,

• The size and price level of the order,


• The time of order cancellation,
• The relationship between visible orders and actual trades.
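As a sketch, the variables above can be assembled into one row per order event; the dictionary keys are hypothetical names, not fields from the original dataset:

```python
import numpy as np

def build_feature_matrix(events):
    """Stack order events into an (n_events x 5) float matrix for model training.

    Each event is a dict with hypothetical keys: t_place, size, price,
    t_cancel, and fill_ratio (relating visible orders to actual trades).
    """
    rows = [[e["t_place"], e["size"], e["price"], e["t_cancel"], e["fill_ratio"]]
            for e in events]
    return np.asarray(rows, dtype=float)
```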

By comparing the real-time spoofing data to the court-verified phases of


the spoofing cycle, the model learned to associate specific market behaviors
with phases of spoofing. The data was cross-compared to validate the model’s
performance by confirming if the predicted spoofing phase matched the real-
world market behavior from the cases.
For this task, we used a Recurrent Neural Network (RNN) with Long
Short-Term Memory (LSTM) cells. RNN-LSTM models are ideal for time-
series analysis as they can capture temporal dependencies, which is crucial for
identifying spoofing patterns over time. The RNN iteratively learns to recognize
sequences of spoofing behaviors, refining its prediction of the current phase in
the spoofing cycle.

The model was trained using a supervised learning approach, where each
phase of the spoofing cycle was labeled. After sufficient training and cross-
validation, the model achieved high accuracy in predicting spoofing phases based
solely on order book behavior observable in real-time.

3.5 Trade Direction Based on Spoofing Cycle


Once the current phase of the spoofing cycle is identified, the system uses logic
to determine whether to go long, short, or wait for a better opportunity. The
trade direction is chosen based on the spoofer’s likely next move:

• If the spoofer is building large orders on the ask side, a long trade is
triggered (expecting an upward movement once the spoof is removed).

• If the spoofer is stacking orders on the bid side, a short trade is initiated
(anticipating a downward price move).
• If the current phase of the cycle is ambiguous or suggests minimal movement, the system may choose to wait until a clearer signal is generated.
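The branching above reduces to a small decision function. The phase labels used here are illustrative, since the actual phase taxonomy is produced by the trained classifier:

```python
def trade_direction(phase):
    """Map a predicted spoofing-cycle phase to 'long', 'short', or 'wait'."""
    if phase == "ask_side_buildup":   # spoofer stacking fake asks -> expect upward move
        return "long"
    if phase == "bid_side_buildup":   # spoofer stacking fake bids -> expect downward move
        return "short"
    return "wait"                     # ambiguous phase: stand aside
```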

3.6 Modeling Liquidity and Market Impact


The Counterspoofing strategy leverages the liquidity introduced by spoofers’
fake orders to counteract liquidity shortages that are commonly a challenge in
other trading strategies. When spoofers attempt to manipulate market sen-
timent by placing and subsequently canceling large orders, they are forced to
exit the market with actual market orders, which in turn absorb liquidity. This
movement of liquidity works to the advantage of the counterspoofing trade, as
it capitalizes on the forced market movement initiated by the spoofers.
The inherent behavior of spoofers ensures sufficient liquidity to accommodate
the execution of counterspoofing trades. By exploiting these market conditions,
the strategy maintains scalability without creating significant concerns for ad-
verse market impact. Even with concurrent execution by multiple users, the
liquidity provided by the spoofers themselves supports the trade flow, mitigat-
ing risks of illiquidity and allowing for efficient execution.

3.7 Conclusion on Counterspoofing Trades


The Counterspoofing Scanner, when employed independently, demonstrates a
strong capacity to detect and respond to liquidity manipulation in the form of
spoofing. The predictive nature of the scanner alone yields a Sharpe ratio of
1.9, reflecting its ability to generate alpha by capturing market inefficiencies
introduced by spoofers. This performance highlights the scanner’s robustness
in identifying favorable entry points during periods of manipulation.

Figure 3.1: Impact of Spoofing Detection and Microforecasting Score on Sharpe
Ratio

However, the full potential of the Counterspoofing strategy becomes evident when it is augmented by the two-tick Microprobability Forecasting Metric.
Incorporating this forecasting filter significantly enhances the trade selection
process. Only trades that align with both the detected spoofing activity and
the microforecasting model’s directional certainty are executed, thus refining
the accuracy and overall profitability of the trades.
This synergy between the scanner and forecasting filter is particularly strik-
ing when we examine the impact on Sharpe ratios. The system’s performance
improves drastically, as the Sharpe ratio rises from 1.9 to 4.48 when both tools
are combined. This remarkable improvement underscores how congruent filter-
ing of spoofing detection and forecasting can lead to superior market outcomes.
The integration of the microforecasting filter effectively reduces the number of
suboptimal trades and maximizes the system’s efficiency.
The relationship between spoofing detection thresholds, microforecasting certainty, and resulting Sharpe ratios is visualized in Figure 3.1. This
3D surface plot illustrates how varying input conditions affect strategy per-
formance, offering a clear representation of the strategy’s potential when both
filters are aligned.
This figure highlights how even small improvements in detection and fore-
casting certainty can drive a significant increase in the Sharpe ratio. The surface
plot demonstrates the scalability and performance stability of the strategy, rein-
forcing the argument for the combined use of spoofing detection and microprob-
ability forecasting as a robust mechanism for capturing market inefficiencies.
In summary, while the Counterspoofing Scanner alone provides a solid foun-
dation for exploiting market manipulation, its combination with predictive microforecasting significantly enhances its effectiveness. This synergy elevates the
Sharpe ratio to levels rarely seen in trading systems and demonstrates the im-
portance of blending detection and forecasting mechanisms for optimal market
performance.

Chapter 4

Detection and Exploitation of Market Sentiment Shifts via Aggressive Order Identification (DEX-Array Scanner)

4.1 Introduction to the DEX-Array Scanner


The DEX-Array Scanner identifies aggressive limit orders placed between the
bid and ask prices, which indicate potential shifts in market sentiment. These
orders are detected when the price of a new order is placed within the spread of
the current best bid and ask prices. The system then tracks subsequent changes
in price levels to confirm the influence of the detected orders. The underlying
process filters noise to isolate instances of aggressive order placement and their
impact on the market. By combining this detection mechanism with additional
filters, such as the Microprobability Metric and Spoofing Score, the scanner
narrows its focus to more reliable signals.

4.2 Why the DEX-Array Scanner Works


The DEX-Array Scanner operates by detecting aggressive limit orders placed
between the bid and ask prices, which often indicate an impending shift in mar-
ket sentiment. These aggressive orders suggest that participants are prepared
to influence market movement by buying into the ask or selling into the bid.
When such events occur, the order book is forced to adjust, often resulting in
substantial price movement.

The scanner is designed to identify these rare occurrences with precision by
filtering out the noise typically present in high-frequency trading environments.
Although these events are infrequent, their detection is critical due to the sig-
nificant price impact they can create. The focus on isolating these aggressive
orders allows the system to accurately capture moments of market imbalance,
which are often followed by rapid price changes.
By concentrating on these specific market conditions, the DEX-Array Scan-
ner ensures that only high-confidence signals are processed. This approach min-
imizes false positives and leverages the full potential of the scanner’s design to
identify key market shifts with a degree of reliability that can only be achieved
through meticulous filtering of market data.

4.3 Aggressive Order Detection


An aggressive order is defined as one placed between the bid and ask prices.
Let:

• porder be the price of the limit order,


• pbid be the current bid price,
• pask be the current ask price.

The function to detect an aggressive order is given by:


f (porder , pbid , pask ) = 1 if pbid < porder < pask ; 0 otherwise
This means the scanner triggers when a limit order is placed between the
bid and ask, signaling that the trader placing the order is acting aggressively,
expecting imminent price movement.
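The indicator function above can be sketched directly, a one-line check that an order price lies strictly inside the spread:

```python
def is_aggressive(p_order, p_bid, p_ask):
    """f(p_order, p_bid, p_ask): 1 if the limit order lands strictly inside the spread."""
    return 1 if p_bid < p_order < p_ask else 0
```

Orders placed exactly at the best bid or ask do not trigger the scanner; only prices strictly between the two count as aggressive.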

4.4 Price Change Verification


After detecting an aggressive order, we verify whether it caused a price change.
The scanner then evaluates:

• For a buy order: We check if the new bid price (pbid new ) has risen to
the level of the aggressive order or higher.
• For a sell order: We check if the new ask price (pask new ) has fallen to
the level of the aggressive order or lower.

The functions for this verification are:


gbuy (porder , pbid new ) = 1 if pbid new ≥ porder ; 0 otherwise

gsell (porder , pask new ) = 1 if pask new ≤ porder ; 0 otherwise
If the aggressive order causes the bid (for a buy) or the ask (for a sell) to
move to the order’s price, the algorithm confirms that the aggressive action
influenced the market.
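The two verification functions translate into a pair of one-line checks, a minimal sketch of the g functions defined above:

```python
def confirmed_buy(p_order, p_bid_new):
    """g_buy: 1 if the new best bid has risen to the aggressive order's level."""
    return 1 if p_bid_new >= p_order else 0

def confirmed_sell(p_order, p_ask_new):
    """g_sell: 1 if the new best ask has fallen to the aggressive order's level."""
    return 1 if p_ask_new <= p_order else 0
```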

4.5 Integration with the Microprobability Metric, Spoofing Score, and HF Omega
The DEX-Array Scanner now integrates the Microprobability Metric, the Spoof-
ing Score, and the closed-source HF Omega metric to ensure both trade validity
and IP protection.
Let:
• M Plong be the Microprobability Metric for long trades,
• M Pshort be the Microprobability Metric for short trades,
• SS be the Spoofing Score,
• HFΩ be the closed-source HF Omega metric.
The highest Microprobability Metric is defined as:

M Pmax = max(M Plong , M Pshort )


The HF Omega metric, HFΩ , is proprietary and remains a crucial part of
the decision-making process, adding an additional layer of verification that is
not disclosed publicly. This metric ensures the security of the algorithm and
protects its intellectual property.
To proceed with a trade, three conditions must be met:

1. The Spoofing Score (SS) must be below 0.4.

2. The direction of the trade (buy or sell) must match the direction of the highest Microprobability Metric, and that metric must be above 0.82.

3. The HF Omega metric (HFΩ ) must signal a positive outcome for the trade.
The trade decision function becomes:



h(M Plong , M Pshort , Dtrade , SS, HFΩ ) =
  1 (long, Dtrade = 1)    if SS < 0.4, M Pmax > 0.82, HFΩ > 0, and M Plong = M Pmax
  −1 (short, Dtrade = −1) if SS < 0.4, M Pmax > 0.82, HFΩ > 0, and M Pshort = M Pmax
  0 (no trade)            otherwise

This additional layer of validation using the HF Omega metric provides


another level of security and ensures the proprietary nature of the overall sys-
tem, further enhancing the algorithm’s robustness and protecting its intellectual
property.
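The decision function can be sketched as follows. Since HF Omega is closed-source, the `hf_omega` argument here is a stand-in, assumed only to be a number whose positive sign clears the third condition:

```python
def dex_trade_decision(mp_long, mp_short, ss, hf_omega):
    """Sketch of h(...): +1 long, -1 short, 0 no trade.

    hf_omega is a placeholder for the proprietary HF Omega signal.
    """
    mp_max = max(mp_long, mp_short)
    if ss < 0.4 and mp_max > 0.82 and hf_omega > 0:
        return 1 if mp_long >= mp_short else -1   # side follows the stronger metric
    return 0                                      # any failed condition blocks the trade
```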

4.6 Conclusion on the DEX-Array Scanner
In conclusion, the DEX-Array Scanner demonstrates a notable capacity to de-
tect and act upon subtle yet impactful market phenomena. The combination of
aggressive order detection, verification mechanisms, and the integration of ad-
vanced metrics such as the Microprobability Metric and Spoofing Score enables
it to identify rare but meaningful shifts in the order book. While the scanner’s
activity is relatively infrequent, its precision in capturing these moments, sup-
ported by robust filtering processes, makes it a reliable component in the broader
scope of algorithmic trading strategies. The efficacy of the DEX-Array Scanner
lies not in its frequency of activation but in its ability to discern high-conviction
trading opportunities within a complex and dynamic market structure.

Chapter 5

Liquidity Imbalance
Detection and Scalability in
High-Frequency
Environments (Trespass
Scanner)

5.1 Introduction to the Trespass Scanner


The Trespass Scanner is the most powerful and profitable scanner within the
L2Azimuth suite, despite appearing simple at first glance. Its strength lies in
identifying liquidity imbalances in the limit order book (LOB), leading to high-
frequency trade opportunities. However, its unique filtering system ensures that
it only acts on high-probability setups, making it responsible for the bulk of the
total alpha generated by the strategy, even though it operates with a slightly
lower win rate compared to other scanners.

5.2 Why the Trespass Scanner Works


The Trespass Scanner is built to exploit predictable inefficiencies in the LOB
caused by liquidity imbalances. These imbalances are created when the quantity
of buy and sell orders on opposite sides of the LOB becomes skewed, signaling
the likelihood of imminent price movement. While these events occur frequently,
the scanner’s ability to process and filter this data through tick stacking and
proprietary noise reduction makes it highly effective.
Unlike other scanners in the L2Azimuth suite, the Trespass Scanner is de-
signed to work across a broad range of trading instruments. The liquidity imbalance detection is flexible, allowing traders to combine multiple tick levels
in cases where tick size is smaller (e.g., NQ or traditional stocks), making it
adaptable to various market conditions.

5.3 Liquidity Imbalance Detection and Tick Stacking
At the core of the Trespass Scanner is its ability to detect large imbalances in
the top levels of the LOB. To account for instruments with small tick sizes, such
as certain stock indices or futures, the scanner employs a technique called tick stacking. This method aggregates multiple levels in the order book to capture
the true liquidity profile and remove noise from single-level fluctuations.
Let:

• Vbid,i be the volume at the i-th level on the bid side,


• Vask,i be the volume at the i-th level on the ask side,
• L be the number of levels to aggregate (configurable by the user).

The stacked liquidity imbalance between the bid and ask sides can be quan-
tified as:
ILOB = ( Σi=1..L Vbid,i − Σi=1..L Vask,i ) / ( Σi=1..L Vbid,i + Σi=1..L Vask,i )
This imbalance ratio measures the relative strength of the buy and sell sides
across multiple levels of the LOB. The logic behind tick stacking allows the user
to smooth out noise, particularly in instruments with smaller tick sizes, where
single-level imbalances may not provide reliable signals. By aggregating these
levels, we obtain a clearer picture of true liquidity conditions.
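The stacked imbalance ratio above can be sketched directly; volumes are passed best level first, and `levels` plays the role of the user-configurable L:

```python
def stacked_imbalance(bid_volumes, ask_volumes, levels):
    """I_LOB over the top `levels` of the book: +1 all-bid, -1 all-ask, 0 balanced."""
    b = sum(bid_volumes[:levels])
    a = sum(ask_volumes[:levels])
    return (b - a) / (b + a)
```

Aggregating over several levels is what smooths out single-level noise: a large order at one level cannot dominate the ratio when the surrounding levels disagree.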

5.4 Proprietary Noise Filtering and Final Signal Metric
Once the liquidity imbalance has been computed, the data undergoes a pro-
prietary noise-reduction process. This filtering step is crucial to remove false
signals that may arise from temporary liquidity changes or erratic market be-
havior. The filtered imbalance metric, denoted as Ifiltered , forms the basis for
the final signal.
Even though the exact mathematical formulation of the final filtered metric
is proprietary and not disclosed, the process ensures that only robust imbalances
persist through the filtering stage. The final signal metric is a combination of
this filtered imbalance with additional layers of validation from the Microprob-
ability Metric and the Spoofing Probability Score.

5.5 Trade Validation and Filtering
After detecting and filtering the liquidity imbalance, the Trespass Scanner val-
idates the signal through a series of checks. First, the signal must align with both the Microprobability Metric and the Spoofing Probability Score.
Let:
• M Pdir be the Microprobability Metric in the detected trade direction (ei-
ther long or short),
• SP be the Spoofing Probability Score.
For a trade to be executed, the following conditions must be met:
1. M Pdir must confirm the detected trade signal, i.e., the predicted prob-
ability of price movement in the desired direction must exceed a preset
threshold,
2. SP < 0.2, meaning the likelihood of spoofing behavior must be less than
20%.
The decision to proceed with a trade is defined as:
h(Ifiltered , M Pdir , SP ) = 1 if M Pdir > 0.8 and SP < 0.2; 0 otherwise
This rigorous validation ensures that only trades aligned with favorable prob-
abilities are executed, thereby enhancing the accuracy of the strategy.
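The validation gate is a direct translation of the two conditions, a minimal sketch with the thresholds from the definition above:

```python
def trespass_validate(mp_dir, sp):
    """h(I_filtered, MP_dir, SP): 1 to proceed with the trade, 0 to pass."""
    return 1 if mp_dir > 0.8 and sp < 0.2 else 0
```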

5.6 Risk Management and Exit Logic


The Trespass Scanner includes a dynamic risk management system that adapts
to the trader’s risk appetite. It continuously monitors market conditions post-
entry and can exit trades early if conditions deteriorate, depending on the se-
lected risk setting.
Let:
• R be the risk setting (LOW, MEDIUM, or HIGH),
• Ptrade be the current profit/loss status of the trade,
• M Pactive be the current Microprobability score,
• SPactive be the current Spoofing Probability score.
The exit logic is defined as:

Exit Condition =
  exit if M Pactive < 0.6 or SPactive > 0.6,                 when R = LOW
  exit if in loss and (M Pactive < 0.6 or SPactive > 0.6),   when R = MEDIUM
  no early exit,                                             when R = HIGH

For users who have disabled the Microprobability filter, the exit conditions
rely solely on the spoofing score, which acts as the primary exit trigger when
market manipulation is detected.
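The exit logic above can be sketched as a single function over the three risk settings; the string labels mirror the LOW/MEDIUM/HIGH settings named in the text:

```python
def should_exit(risk, in_loss, mp_active, sp_active):
    """Early-exit rule per risk setting ('LOW', 'MEDIUM', or 'HIGH')."""
    deteriorated = mp_active < 0.6 or sp_active > 0.6
    if risk == "LOW":
        return deteriorated              # exit on any deterioration
    if risk == "MEDIUM":
        return in_loss and deteriorated  # exit only if also in a loss
    return False                         # HIGH: no early exit
```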

5.7 Conclusion on Trespass Scanner


The Trespass Scanner is a high-frequency strategy based on liquidity imbalances,
supported by advanced tick stacking, proprietary noise filtering, and robust
trade validation. Despite its seemingly simple structure, it delivers substantial
alpha and plays a critical role in the overall performance of the L2Azimuth
suite. Its flexibility across various instruments, combined with dynamic risk
management, makes it a vital tool for traders seeking consistent performance in
volatile markets.

Part III

Implications and Future Directions

Chapter 6

Conclusion

This paper has explored various methodologies for detecting market manipula-
tion, predicting small factor movements, and classifying spoofing cycles in real-
time trading environments. The integration of probabilistic models, machine
learning, and proprietary filtering mechanisms provides an advanced toolkit for
algorithmic trading strategies. Each chapter presented in this study builds upon
existing financial research, with a focus on improving detection accuracy and
trade validation across high-frequency trading scenarios.
The Spoofing Detection Model introduced in Part I emphasizes the identifi-
cation of spoofing behavior through key market metrics such as High Quoting
Activity, Unbalanced Quoting, and Cyclical Patterns in Depth. These metrics
were shown to be effective at both the intraday and second-interval levels. The
use of machine learning techniques, such as random forests and boosted trees,
was demonstrated to significantly enhance the detection of spoofing, enabling
the identification of manipulation with high accuracy.
The Microprobability Metric, utilizing Bayesian techniques and Kalman fil-
tering, provided a robust framework for predicting market shifts at the tick level.
This metric’s ability to refine trade entry points and validate signal strength
demonstrates the power of combining probabilistic modeling with real-time limit
order book data. The integration of 1-tick and 2-tick forecasting offered further
layers of predictive insight, proving useful in both small and large factor envi-
ronments.
In Part II, the Counterspoofing Scanner built on the insights from the Spoof-
ing Detection Model by incorporating a machine learning classifier to detect
phases of the spoofing cycle. This allowed for a more precise determination of
trade timing, based on real-time spoofing activity. The preprocessing of court
case data for model training provided the framework necessary to align spoofing
cycle phases with optimal trade points. The interaction between market liquid-
ity and spoofers’ forced market exits further validated the effectiveness of this
model in high-frequency environments.
The DEX-Array Scanner capitalized on rare occurrences of aggressive limit
orders between the bid and ask, providing a unique tool for detecting shifts

in market sentiment. By isolating these aggressive orders, this scanner mini-
mized noise and provided high-confidence signals that were validated through
the Microprobability Metric and Spoofing Score.
Finally, the Trespass Scanner focused on detecting liquidity imbalances, us-
ing proprietary noise filtering techniques to stack tick levels and identify op-
portunities even in volatile environments. This scanner was responsible for a
high volume of trades, leveraging its focus on imbalances to generate consistent
alpha, with built-in risk management mechanisms for early trade exits.

6.1 Future Directions


While the techniques explored in this paper have demonstrated significant po-
tential in improving the detection and exploitation of market inefficiencies, sev-
eral areas remain open for further research. First, additional data sources, such
as news sentiment and social media analytics, could be incorporated to refine
existing models. Second, the scalability of these strategies in increasingly decen-
tralized and fragmented markets warrants additional exploration. Finally, fur-
ther development of the proprietary noise filtering and machine learning models
could lead to even greater trade precision and alpha generation.
The integration of these predictive and detection models has profound im-
plications for market transparency and fairness, providing traders with more
robust tools to navigate the complexities of modern financial markets.

