Stat Arb
Week 10
traders.berkeley.edu
Announcements
● Brainteaser
● What is stat arb?
● Stat arb pipeline
○ Identifying relationships
○ Trading relationships
○ Managing risk
Problem of the Day
I have a dataset which I can split into four parts (say, by splitting on a
categorical variable). A linear model can achieve an R^2 of 0.8 on each of these
individual datasets. What can you say about the R^2 a linear model can achieve on the full dataset?
Useful Concepts
● Correlation
○ Pearson: linear relationship between two random variables
■ Cov(X,Y)/[SD(X)·SD(Y)]
○ Spearman: monotonic relationship between two variables
■ Pearson correlation between ranks
● Cointegration
○ Some linear combination of the (non-stationary) time series is stationary
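A minimal sketch of these concepts (not from the slides), assuming numpy, scipy, and statsmodels are available and using synthetic data:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(0)

# Pearson vs. Spearman on a linear relationship with noise.
x = rng.standard_normal(1000)
y = 0.7 * x + rng.standard_normal(1000)
print("Pearson:", pearsonr(x, y))    # linear association
print("Spearman:", spearmanr(x, y))  # monotonic association (Pearson on ranks)

# Cointegration: two random walks driven by a shared stochastic trend,
# so a linear combination of them is stationary.
common = np.cumsum(rng.standard_normal(1000))
p1 = common + rng.standard_normal(1000)
p2 = 2.0 * common + rng.standard_normal(1000)
t_stat, p_value, _ = coint(p1, p2)   # Engle-Granger two-step test
print("Cointegration p-value:", p_value)
```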
Identifying Relationships
Regression Hypothesis Testing
● First, we have to specify a null distribution. The test statistic for correlation (and
for regression coefficients) generally follows a Student’s t distribution under the null.
○ Null Hypothesis: 𝝆=0 Alternative: 𝝆≠0 with some significance level ɑ.
● Test for time series stationarity (Augmented Dickey-Fuller Test).
○ This is important because if your time series is non-stationary (e.g., it has a trend or unit root),
you usually cannot perform valid inference on the whole dataset.
● Most parametric tests can also be performed in non-parametric ways using
bootstrapping.
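As a sketch of the testing ideas above (assuming scipy and statsmodels; the data is synthetic, and the resampling test shown is a permutation test, a close relative of the bootstrap):

```python
import numpy as np
from scipy.stats import pearsonr
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
x = rng.standard_normal(500)
y = 0.2 * x + rng.standard_normal(500)

# Parametric test: pearsonr's p-value is based on a Student's t distribution
# under the null hypothesis rho = 0.
r, p_param = pearsonr(x, y)

# Resampling alternative: build the null distribution of the correlation by
# breaking the pairing between x and y.
n_resamples = 2000
null_rs = np.array([np.corrcoef(x, rng.permutation(y))[0, 1]
                    for _ in range(n_resamples)])
p_resample = np.mean(np.abs(null_rs) >= abs(r))
print(f"r={r:.3f}, parametric p={p_param:.4f}, resampling p={p_resample:.4f}")

# Stationarity: the ADF null hypothesis is a unit root (non-stationary).
prices = np.cumsum(rng.standard_normal(500))   # random walk: should not reject
returns = np.diff(prices)                      # differenced series: should reject
print("ADF p-value (prices): ", adfuller(prices)[1])
print("ADF p-value (returns):", adfuller(returns)[1])
```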
Using Correlation
● Usually, we want to measure the relationship between the returns, not the prices.
● Returns are generally stationary time series, which tells us that price is an order-one (I(1)) time series (see the sketch below).
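A small illustration of why we correlate returns rather than prices (synthetic data; two independent random walks frequently show a large spurious price-level correlation, while their returns do not):

```python
import numpy as np

rng = np.random.default_rng(2)

# Two independent random walks ("prices"): order-one, I(1), series.
p1 = np.cumsum(rng.standard_normal(2000))
p2 = np.cumsum(rng.standard_normal(2000))

# Differencing gives stationary "returns".
r1, r2 = np.diff(p1), np.diff(p2)

print("corr(prices): ", np.corrcoef(p1, p2)[0, 1])   # often far from 0 despite independence
print("corr(returns):", np.corrcoef(r1, r2)[0, 1])   # close to 0, as it should be
```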
Error Rate Analysis
● We will generally run many hypothesis tests, and we want to ensure that our
discoveries are real.
● Therefore we have to consider the False Discovery Rate (FDR) and the
Family-wise Error Rate (FWER).
● The FDR is E[V/R], i.e., the expected fraction of discoveries that are false
(V = number of false discoveries, R = total number of discoveries).
● The FWER, the probability of making at least one false discovery, is 1-(1-ɑ)^n when we run n
independent hypothesis tests, each at level ɑ (see the quick check below).
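A quick numeric check of the FWER formula above (pure Python, no data involved):

```python
alpha = 0.05
for n in (1, 5, 20, 100):
    # Probability of at least one false discovery across n independent tests.
    print(n, round(1 - (1 - alpha) ** n, 3))
# 1 -> 0.05, 5 -> 0.226, 20 -> 0.642, 100 -> 0.994: the error rate blows up quickly.
```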
Controlling the FWER: Bonferroni Correction
● Let’s say we run n hypothesis tests and we want a global 5% error rate (FWER).
● We can just run each test at ɑ=.05/n.
● This achieves our target FWER, but is super conservative, because it assumes all of
our rejection regions are disjoint.
● This is literally the worst case.
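A minimal Bonferroni sketch (the p-values here are made up for illustration):

```python
import numpy as np

alpha = 0.05
pvals = np.array([0.001, 0.004, 0.030, 0.200])   # hypothetical p-values
n = len(pvals)

# Bonferroni: run each individual test at alpha / n.
reject = pvals < alpha / n
print(reject)   # [ True  True False False] -- only very small p-values survive
```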
Controlling the FDR: Benjamini-Hochberg
● The Benjamini-Hochberg procedure controls the FDR as follows:
1. Run hypothesis tests.
2. Collect p values and sort them.
3. Plot the line (k/m)·ɑ, where k is the rank of the current test and m is the
number of tests.
4. Find the largest rank k whose p-value falls below this line, and reject the null
hypothesis for all tests with rank at most k.
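A sketch of the procedure in code (hand-rolled for clarity; statsmodels' multipletests with method="fdr_bh" gives the same decisions, and the p-values below are made up):

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of hypotheses rejected under BH FDR control."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)
    ranked = pvals[order]
    # Compare sorted p-values to the line (k / m) * alpha.
    below = ranked <= (np.arange(1, m + 1) / m) * alpha
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest rank whose p-value is below the line
        reject[order[: k + 1]] = True      # reject every test up to that rank
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.300, 0.900]
print(benjamini_hochberg(pvals))
# Equivalent check with statsmodels:
# from statsmodels.stats.multitest import multipletests
# multipletests(pvals, alpha=0.05, method="fdr_bh")[0]
```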
Also Important
● Online FDR control (you don’t know how many hypothesis tests you’re going
to run)
○ Use the LORD algorithm
○ Or generalized alpha-investing (GAI, SAFFRON)
Out-of-Sample Testing
● Usually, you don’t just train a model on your dataset and hope it works; you
hold out a portion of the data to test on after training.
● Beware: data leakage. What is wrong with the following situations?
○ In order to test the performance of a strategy on the S&P, I take its current constituent stocks
and trade a momentum-based strategy on their returns.
○ I take a returns series, shuffle it and split it into a train and test set. I then train a model and
see shockingly good results on the test set.
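A sketch of a leak-free, time-ordered train/test split (variable names are hypothetical; contrast it with the shuffled split in the second example above):

```python
import numpy as np

rng = np.random.default_rng(3)
returns = rng.standard_normal(1000)   # stand-in for a daily return series, ordered in time

# Time-ordered split: the training period strictly precedes the test period,
# so the model never sees information from the future.
split = int(0.7 * len(returns))
train, test = returns[:split], returns[split:]

# What NOT to do: shuffling before splitting mixes future observations into
# the training set, which is the leak in the second example above.
# shuffled = rng.permutation(returns)
```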
Trading Stat Arb
No (Stat) Arbitrage Bounds
● Whenever you take a position, you have to cross the spread.
● You should not take the position when E[profit] ≤ 2*spread
● The following example does not yield a profit