Stock Prediction using
Hidden Markov Models &
Investor Sentiment
Patrick Nicolas
patricknicolas.blogspot.com
Silicon Valley Machine Learning for Trading
Strategies meetup, April 25, 2015
Introduction
Hidden Markov models (HMM) have been
used for years to decipher the internal
workings of financial markets.
More recently, HMM have been applied to
predict stock/ETF movements using crowd
sourcing and social media.
Objectives
The goal is to evaluate the predictive capability of
HMM using investor sentiment as input and to
answer three questions:
1. Can HMM predict the price of securities in
stationary states?
2. Can HMM detect regime shifts or structural
breaks?
3. What are the alternative solutions?
Stock Prediction using HMM in
stationary states
Detection of regime changes using
Buried Markov models
Alternative models
Problem I
Problem: Use the AAII weekly sentiment survey to
predict the market trend (1 - 3 months).
Solution: Use a hidden Markov model to predict
the SPX value, with the AAII survey as observations.
The sentiment of the American Association of Individual
Investors (AAII) is regarded as a contrarian indicator
of the future direction of market indices.
AAII sentiment survey
www.aaii.com
Why HMM?
1. Observations (bullishness) are independent and discrete.
2. Observations depend on the internal state of the
market.
3. The historical data does not show any significant break
(~stationary process) or macro-economic distortion.
4. The latent behavior of the stock market can be
quantized into a finite number of states.
5. Asset/security pricing does not follow a normal
distribution, but this restriction is relaxed for mixtures.
Overview of graphical model
HMM is a generative classifier that predicts the SPY price z,
conditional on a latent, internal state of the stock
market S, using investor sentiment x as the observation O.
- Market internal state z_t, taking values in the hidden states {S_i}
- AAII weekly sentiment x_t, taking values in the symbols {O_k}
p(z_t+1 = S_j | z_t = S_i) = a_ij
p(x_t = O_k | z_t = S_i) = b_ik
a_ij is the state transition probability matrix {S_i}_t-1 -> {S_j}_t
b_ik is the emission probability matrix {S_i}_t -> {O_k}_t
Trellis model representation
[Trellis diagram: the hidden states S_0 ... S_n-1 unfold over time
t = 0, 1, ..., t, starting from the initial probabilities π_i and
linked by the transition probabilities a_ij; each state emits an
observation symbol O_t with emission probability b_i,t, optionally
modeled as a Gaussian mixture.]
The 3 key components are:
- z: market behavior with n hidden states S_t = {increase, ...}
- x: observed sentiment with m symbols O_t = {bullish, ...}
- the λ(π_i, a_ij, b_ik) model
Canonical forms
- Decoding: given a set of observations and the λ-model =>
the most likely sequence of hidden states (Viterbi path)
- Training: given a set of observations and a set of known
states => the λ-model (EM / Baum-Welch)
- Evaluation: given the λ-model => the probability of the
observations (α/β passes)
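As an illustration of the decoding form, here is a minimal log-space Viterbi sketch. This is a generic Python implementation for discrete observations, not the deck's actual Scala/Apache Spark code; the function name and array layout are assumptions.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state sequence for a discrete-observation HMM.
    pi: initial probabilities (n,), A: transitions (n, n),
    B: emissions (n, m), obs: sequence of observation-symbol indices."""
    n, T = len(pi), len(obs)
    # log-probabilities avoid numerical underflow on long sequences
    log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    delta = np.zeros((T, n))              # best log-prob ending in each state
    psi = np.zeros((T, n), dtype=int)     # back-pointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # (from, to)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # backtrack from the best final state
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```

With a sticky transition matrix and peaked emissions, the decoded Viterbi path tracks the observation switches, which is exactly the behavior the SPX/sentiment decoding relies on.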
Preliminary analysis
[Chart: SPY price (used for the hidden states), 03/15/2009 to
04/01/2014, with the observed bullish/bearish sentiment below;
annotations mark a continuation, a bullish flag pattern, and
support/resistance levels.]
Observed states: selection issues
The first step is to define the observation symbols (observed
states): number of variables, quantization and smoothing.
1. Single or multiple variables? x = [x1, x2] or x = f(x1, x2)
2. Quantization? x = {BULL, BEAR} or x = {bullishness intervals}
3. Smoothing?
where x_t = bullish/bearish, x1_t = bullish/neutral,
x2_t = bearish/neutral.
Candidate observation models:
(f1) the raw weekly ratio x_t
(f2) the moving average over m weeks, (1/m) Σ(i=0..m-1) x_t-i; here
a 4-week average (1/4) Σ(i=0..3) x_t-i, quantized into the intervals
x ∈ {]0, 0.9], ]0.9, 1.5], ]1.5, +∞[}
(f3) the pair [x1_t, x2_t], smoothed over 3 weeks, producing 9 symbols
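A short sketch of the f2 observation model: a trailing moving average followed by quantization into the three intervals above. The interval boundaries come from the slide; the symbol indices and helper names are illustrative.

```python
def moving_average(xs, m=4):
    """Trailing m-week average of the sentiment ratio; the first
    m-1 weeks are dropped (not enough history)."""
    return [sum(xs[i - m + 1:i + 1]) / m for i in range(m - 1, len(xs))]

def quantize_bullishness(ratios, boundaries=(0.9, 1.5)):
    """Map averaged bullish/bearish ratios onto 3 discrete symbols
    using the intervals ]0, 0.9], ]0.9, 1.5], ]1.5, +inf[."""
    symbols = []
    for r in ratios:
        if r <= boundaries[0]:
            symbols.append(0)   # bearish-leaning
        elif r <= boundaries[1]:
            symbols.append(1)   # roughly neutral
        else:
            symbols.append(2)   # bullish-leaning
    return symbols
```

The composition `quantize_bullishness(moving_average(weekly_ratios))` yields the discrete symbol sequence fed to the HMM as observations.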
Hidden states
Questions:
1. How many states?
2. How to initialize the λ-model (initial, transition and
emission probabilities)?
3. Should the model be dynamically updated/re-trained?
Answers:
a) 4 states: significant/moderate decline/increase
b) Initial, transition & emission probabilities
initialized by computing the average movement of
SPY over 4 weeks of historical data
c) No dynamic training (roll-over)
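The 4-state labeling of answer (a), driven by the 4-week SPY movement of answer (b), can be sketched as follows. The 4-week horizon is from the deck; the 1% significant/moderate cutoff is an assumption (borrowed from the later technical-indicator test, since this slide does not state the threshold), as is the function name.

```python
def label_hidden_state(spy_prices, t, horizon=4, threshold=0.01):
    """Hypothetical labeling of the 4 hidden states from the SPY
    movement over the next `horizon` weeks. `threshold` (1%) splits
    significant from moderate moves and is an assumption."""
    ret = (spy_prices[t + horizon] - spy_prices[t]) / spy_prices[t]
    if ret <= -threshold:
        return 0   # significant decline
    if ret < 0:
        return 1   # moderate decline
    if ret < threshold:
        return 2   # moderate increase
    return 3       # significant increase
```

Counting state occurrences and transitions over the labeled history gives the initial and transition probability estimates for the λ-model.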
Training
- Training set: 196 weeks
- Validation: 1-fold of 46 weeks
- Implementation in Scala/Apache Spark
[Plot: F1 score of the f2 model (0.64 to 0.80) vs.
training/validation set size (48 to 240 weeks).]
Prediction results
[Charts: expected vs. predicted SPY state for each observation model:
(f1) x_t = bullish/bearish
(f2) the 4-week average (1/4) Σ(i=0..3) x_t-i
(f3) the pair [x1_t, x2_t] with x1 = bullish/neutral and
x2 = bearish/neutral]
Conclusion?
1. The text-book case worked. What about structural breaks?
2. Can the market response latency (set at 4 weeks) be
optimized?
3. How does it compare to observed technical analysis data?
4. Can we combine investor sentiment with technical and
fundamental analysis data?
Stock Prediction using HMM in
stationary states
Detection of regime changes using
Buried Markov models
Alternative models
Market technical indicators
Observations: the ratio of the relative price change of the SPX
within a trading session over the relative volatility during the
same session:
x_t = [(p_close - p_prev_close) / p_prev_close] / [(p_high - p_low) / p_low]
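The observation above is straightforward to compute from a session's OHLC data; a sketch (function name is illustrative):

```python
def session_signal(prev_close, close, high, low):
    """Technical observation x_t: relative price change of the session
    divided by the session's relative volatility."""
    rel_change = (close - prev_close) / prev_close
    rel_volatility = (high - low) / low
    return rel_change / rel_volatility
```

For example, a session closing up 1% with a 2% high-low range yields a signal of 0.5: the price moved half as much as the session's volatility.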
The (in)ability of HMM to detect regime changes or
structural breaks is illustrated by using a technical analysis
signal as input observations and the same hidden states as
in the previous test.
Structural breaks & regimes
[Chart: SPX, 03/15/2009 to 04/01/2014, showing Regime 1 and
Regime 2 separated by a structural break (correction); the
validation range covers the correction.]
Market technical indicators that rely on daily data are far
more continuous than the AAII weekly sentiment survey.
Can our HMM model detect the structural break and the two
regimes it separates?
Model and tests
The data is extracted from Yahoo financials (.csv files).
- Training set: 870 trading sessions [1 - 420, 467 - 895]
- Validation: 25 trading sessions around the SPX correction
period [421 - 466]
- Hidden states (4): increase > 1%, increase < 1%,
decrease < 1%, decrease > 1%
Limitation of HMM with discrete states
The confusion matrix of predicted vs. actual SPX values using
the 4 hidden states illustrates the poor quality of the HMM
prediction with discrete states.
[Confusion matrix: expected SPY state vs. predicted SPY state.]
Problem II: structural breaks
Problem: How to deal with multiple
trends/regimes and structural breaks?
Solution: Model the observation sequence as a
Gaussian mixture and define the dependence
between observations as a Markov chain.
HMM is accurate for a "stationary" process in which
observations are independent, given a hidden state.
Can we overcome HMM's inability to operate in a shifting
environment?
Auto-regressive Markov model
We need a continuous model of the observations: the observation
set is defined as a Gaussian mixture. On top of the standard
HMM probabilities
p(z_t+1 = S_j | z_t = S_i) = a_ij
p(x_t = O_k | z_t = S_i) = b_ik
the auto-regressive model adds a dependency between consecutive
observations:
p(x_t = O_m | x_t-1 = O_k, z_t = S_j) = N(x_t | w_j · x_t-1 + μ_j, σ_j)
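The auto-regressive emission density above is a plain Gaussian whose mean is shifted by the previous observation; a minimal sketch (the function name is illustrative):

```python
import math

def ar_emission_density(x_t, x_prev, w_j, mu_j, sigma_j):
    """AR-HMM emission: Gaussian density of the current observation
    conditioned on the previous observation and hidden state j,
    N(x_t | w_j * x_prev + mu_j, sigma_j)."""
    mean = w_j * x_prev + mu_j           # state-dependent AR(1) mean
    z = (x_t - mean) / sigma_j
    return math.exp(-0.5 * z * z) / (sigma_j * math.sqrt(2.0 * math.pi))
```

Setting w_j = 0 recovers the standard (memoryless) Gaussian emission; a non-zero w_j is what lets the model track drifting observation levels across a regime.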
AR-HMM & BMM
Auto-regressive model with 2 Markov models:
- an HMM for the hidden states
- a 1st-order auto-regressive (AR-HMM) or higher-order
Buried Markov model (BMM) Markov chain for the
observations
AR-HMM & BMM: cardinality
We use the cardinality (the number of observations associated
with the same hidden state) to evaluate the stationary states
(regimes) and the sudden shifts (structural breaks). The
cardinality is quite high for stationary states, especially
when the observations are noisy.
[Diagram: cardinality over time; Regime 1 and Regime 2 appear
as long runs, separated by the state switches S_k->l and S_m->n
at the break.]
Once the break is identified, the regimes themselves are
identified by the distribution of probability across the
different hidden states.
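The cardinality measure above reduces to run-length counting over the decoded state sequence; a sketch (function name is illustrative):

```python
from itertools import groupby

def run_cardinalities(states):
    """Cardinality of each run: the number of consecutive observations
    assigned to the same decoded hidden state. Long runs suggest a
    stationary regime; a drop to short runs suggests a structural
    break."""
    return [(s, sum(1 for _ in grp)) for s, grp in groupby(states)]
```

Scanning the run lengths for a sudden collapse locates the break; comparing the state distributions on either side of it characterizes the two regimes.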
AR-HMM & BMM: limitations
• These enhancements to the HMM for continuous or
pseudo-continuous observations have been proven to
provide higher-quality predictions for homogeneous
observations (from the same source).
• AR-HMM are less accurate for observations from different
sources (i.e. weekly investor sentiment vs. daily market
technical indicators). In this scenario, regularization has to
be added to the model.
• We may look at different types of techniques/classifiers as
an alternative to complex regularization of AR-HMM.
Stock Prediction using HMM in
stationary states
Detection of regime changes using
Buried Markov models
Alternative models
Alternative to AR-HMM/BMM
The previous section describes the limitations of auto-regressive
HMM and buried Markov models for observations from
heterogeneous sources.
Here is a summary of a few models that you may
consider besides HMM:
• Conditional Random Fields
• Riemann manifold learning
• Continuous-time Kalman filter
Riemann manifolds
Transition probabilities over a regime (~constant probabilities):
A = [ a_00 ... a_0n ; ... ; a_n0 ... a_nn ]
Emission probabilities over a regime:
B = [ b_00 ... b_0n ; ... ; b_m0 ... b_mn ]
Define the (n+m)-dimensional space of transition and emission
probabilities, then find an embedded space (manifold)
containing the most significant, constant hidden states.
A regime may be defined as the subset of the transition
matrix that does not change over time.
Riemann manifolds for regimes
The Riemannian manifold consists of the transition probability
tensor C ⊂ A⨂B for the subset of constant states within a
regime, with a cumulative probability > 80%.
Conditional random fields
Conditional Random Fields (CRF) are discriminative models
derived from logistic regression.
• CRF aim at computing the conditional probability
p(x = P | z = S0, ..., Sn)
• CRF do not require the features to be independent
• CRF do not assume that the transition probabilities A
are constant
Continuous-time Kalman filter
The Kalman filter is a recursive, adaptive, optimal estimator.
• Kalman allows transitory states (adaptive)
• Kalman does not need a training set
• Kalman supports continuous state values (continuous-
time Kalman ODE)
• Kalman requires the specification of white noise for the
process and the measurement
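A minimal scalar Kalman filter sketch illustrates the recursive predict/update loop and the process/measurement white-noise variances the slide mentions. The random-walk state model and the default q, r values are assumptions for illustration, not part of the deck.

```python
def kalman_1d(zs, q=1e-4, r=0.01, x0=0.0, p0=1.0):
    """Minimal scalar Kalman filter with a random-walk state model.
    q: process-noise variance, r: measurement-noise variance
    (the white-noise specification the filter requires);
    x0, p0: initial state estimate and its variance."""
    x, p = x0, p0
    estimates = []
    for z in zs:
        # predict: state is a random walk, uncertainty grows by q
        p = p + q
        # update: blend the prediction with the measurement z
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

Because the gain k adapts at every step, the filter needs no training set and keeps tracking through transitory states, which is the contrast with the batch-trained HMM drawn above.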
References
- Scala for Machine Learning, §7 "Sequential models – Hidden Markov
Model", P. Nicolas, Packt Publishing, 2014
- Pattern Recognition and Machine Learning, §13.2.1 "Maximum
Likelihood for the HMM", C. Bishop, Springer, 2006
- American Association of Individual Investors (AAII),
http://www.aaii.com
- "Stock Market Forecasting Using Hidden Markov Model: A New
Approach", R. Hassan, B. Nath, University of Melbourne, 2008
- "Selective Prediction of Financial Trend with Hidden Markov Models",
R. El-Yaniv, D. Pidan, Technion, Israel
- Machine Learning: A Probabilistic Perspective, §17.6.4 "Auto-
regressive and buried HMMs", K. Murphy, MIT Press, 2012
For further information:
- patricknicolas.blogspot.com
- www.slideshare.net/pnicolas
- github.com/prnicolas
- www.packtpub.com
 
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
ThanushsaranS
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
1. Briefing Session_SEED with Hon. Governor Assam - 27.10.pdf
Simran112433
 
LLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bertLLM finetuning for multiple choice google bert
LLM finetuning for multiple choice google bert
ChadapornK
 
Geometry maths presentation for begginers
Geometry maths presentation for begginersGeometry maths presentation for begginers
Geometry maths presentation for begginers
zrjacob283
 
VKS-Python-FIe Handling text CSV Binary.pptx
VKS-Python-FIe Handling text CSV Binary.pptxVKS-Python-FIe Handling text CSV Binary.pptx
VKS-Python-FIe Handling text CSV Binary.pptx
Vinod Srivastava
 
Deloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit contextDeloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit context
Process mining Evangelist
 
Developing Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response ApplicationsDeveloping Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response Applications
VICTOR MAESTRE RAMIREZ
 
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbbEDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
JessaMaeEvangelista2
 
C++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptxC++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptx
aquibnoor22079
 
Digilocker under workingProcess Flow.pptx
Digilocker  under workingProcess Flow.pptxDigilocker  under workingProcess Flow.pptx
Digilocker under workingProcess Flow.pptx
satnamsadguru491
 
FPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptxFPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptx
ssuser4ef83d
 

Stock Market Prediction using Hidden Markov Models and Investor Sentiment

  • 1. Stock Prediction using Hidden Markov Models & Investor Sentiment Patrick Nicolas patricknicolas.blogspot.com Silicon Valley Machine Learning for Trading Strategies meetup, April 25, 2015
  • 2. Introduction Hidden Markov models (HMM) have been used for years to decipher the internal workings of financial markets. More recently, HMMs have been applied to predict stock/ETF movements using crowdsourcing and social media.
  • 3. Objectives The goal is to evaluate the predictive capability of HMM using investor sentiment as input and answer 3 questions: 1. Can HMM predict the price of securities in stationary states? 2. Can HMM detect regime shifts or structural breaks? 3. What are the alternative solutions?
  • 4. Stock Prediction using HMM in stationary states Detection of regime changes using Buried Markov models Alternative models 4
  • 5. Problem I Problem: Use the AAII weekly sentiment survey to predict the market trend (1 – 3 months). Solution: Use a hidden Markov model to predict the SPX value with the AAII survey as observations. The sentiment of the American Association of Individual Investors (AAII) is regarded as a contrarian indicator of the future direction of market indices.
  • 6. AAII sentiment survey www.aaii.com
  • 7. Why HMM? 1. Observations (bullishness) are independent and discrete. 2. Observations depend on the internal state of the market. 3. The historical data does not show any significant break (~stationary process): no macro-economic distortion. 4. The latent behavior of the stock market can be quantized into a finite number of states. 5. Asset/security pricing does not follow a normal distribution, but this restriction is relaxed for mixtures.
  • 8. Overview of graphical model The HMM is a generative classifier that predicts the SPY price z, conditional on a latent internal state of the stock market S, using investor sentiment x as observation O. With AAII weekly sentiment symbols x_t and market internal states z_t: p(z_t+1 = S_j | z_t = S_i) = a_ij and p(x_t = O_k | z_t = S_j) = b_jk, where a_ij is the state-transition probability matrix {S_i}_t-1 -> {S_j}_t and b_jk is the emission probability matrix {S_j}_t -> {O_k}_t.
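The λ-model above can be sketched numerically. A minimal illustration in Python/NumPy (the deck's implementation is in Scala/Spark; every probability value below is a made-up placeholder, since the real values are learned from the survey data):

```python
import numpy as np

# Illustrative lambda-model: n = 4 hidden market states, m = 3 sentiment symbols.
# All probability values are placeholders, not estimates from the AAII data.
pi = np.array([0.25, 0.25, 0.25, 0.25])      # initial state distribution
A = np.array([[0.6, 0.2, 0.1, 0.1],          # a_ij = p(z_t+1 = S_j | z_t = S_i)
              [0.2, 0.5, 0.2, 0.1],
              [0.1, 0.2, 0.5, 0.2],
              [0.1, 0.1, 0.2, 0.6]])
B = np.array([[0.7, 0.2, 0.1],               # b_jk = p(x_t = O_k | z_t = S_j)
              [0.4, 0.4, 0.2],
              [0.2, 0.4, 0.4],
              [0.1, 0.2, 0.7]])

# Each row of A and B is a probability distribution over next states / symbols.
assert np.allclose(A.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)

def predict_next(state_dist):
    """One-step prediction: the distribution over the next hidden states and
    over the sentiment symbol they would emit."""
    next_states = state_dist @ A
    return next_states, next_states @ B

states, symbols = predict_next(pi)
```

Both returned vectors remain valid probability distributions, which is the basic sanity check on any hand-initialized λ-model.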
  • 9. Trellis model representation [trellis diagram: initial probabilities π_i, transition probabilities a_ij between hidden states S_i across time steps, and emission probabilities b_i,t to observation symbols O_t; the hidden states at each time are Gaussian mixtures] The 3 key components are: • z, the market behavior with n hidden states S_t = {increase, ...} • x, the observed sentiment with m symbols O_t = {bullish, ...} • the λ(π_i, a_ij, b_ik) model.
  • 10. Canonical forms • Decoding: Given a set of observations and the λ-model => most likely sequence of hidden states (Viterbi path) • Training: Given a set of observations and a set of known states => λ-model (EM – Baum-Welch) • Evaluation: Given a set of observations and the λ-model => probability of the observations (α/β passes)
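The decoding form can be sketched with a standard log-space Viterbi pass. This is a generic textbook implementation in Python/NumPy, not the author's Scala code; the 2-state model at the bottom is a hypothetical example:

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely hidden-state sequence (Viterbi path) for a discrete HMM,
    computed in log space. pi: initial probabilities (n,), A: transition
    matrix (n, n), B: emission matrix (n, m), obs: list of symbol indices."""
    with np.errstate(divide="ignore"):          # log(0) -> -inf is fine here
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    n, T = len(pi), len(obs)
    delta = np.empty((T, n))                    # best log-prob ending in state i at t
    psi = np.zeros((T, n), dtype=int)           # back-pointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A  # scores[i, j]: from state i to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]            # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# Hypothetical 2-state model: state 0 tends to emit symbol 0, state 1 symbol 1.
pi2 = np.array([0.6, 0.4])
A2 = np.array([[0.8, 0.2], [0.3, 0.7]])
B2 = np.array([[0.9, 0.1], [0.2, 0.8]])
print(viterbi(pi2, A2, B2, [0, 0, 1, 1]))  # [0, 0, 1, 1]
```

The back-pointer table is what makes this dynamic programming rather than brute force: each time step stores only the best predecessor per state.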
  • 11. Preliminary analysis [charts, 03/15/2009 – 04/01/2014: AAII bullish/bearish ratio (observed) above the SPY price (for hidden states); annotations: bullish continuation/flag pattern with support and resistance trend lines]
  • 12. Observed states First, the observation symbols (observed state) must be defined: number of variables, quantization, smoothing. Issues and selections: 1. Single or multiple variables? f1: x_t = bullish/bearish; f3: [x1_t, x2_t] with x1_t = bullish/neutral, x2_t = bearish/neutral. 2. Quantization? x = {BULL, BEAR} or x = {bullishness intervals}, e.g. x ∈ {]0, 0.9], ]0.9, 1.5], ]1.5, +∞[}. 3. Smoothing? A moving average (1/m) Σ_{i=0..m-1} x_t-i (a), or the ratio of the current value to that average (b): f2: x_t / ((1/4) Σ_{i=0..3} x_t-i).
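The f2 smoothing/quantization above can be sketched as follows. The interval bounds 0.9 and 1.5 and the 4-week window come from the slide; the function name and sample values are mine:

```python
def f2_symbol(history, m=4, bounds=(0.9, 1.5)):
    """Quantize the current bullish/bearish ratio relative to its average over
    the previous m weeks into 3 symbols: 0 -> ]0, 0.9], 1 -> ]0.9, 1.5],
    2 -> ]1.5, +inf[ (intervals from the slide; helper name is illustrative).
    `history` lists the weekly ratios, most recent last; requires len > m."""
    current = history[-1]
    average = sum(history[-1 - m:-1]) / m    # average of the m previous weeks
    ratio = current / average
    if ratio <= bounds[0]:
        return 0
    if ratio <= bounds[1]:
        return 1
    return 2

print(f2_symbol([1.1, 0.9, 1.0, 1.0, 2.0]))  # average = 1.0, ratio = 2.0 -> 2
```

Dividing by the trailing average is what makes the symbol a relative (contrarian-friendly) signal instead of an absolute bullishness level.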
  • 13. Hidden states Questions: 1. How many states? 2. How to initialize the λ-model (initial, transition and emission probabilities)? 3. Should the model be dynamically updated/re-trained? Answers: a) 4 states: significant/moderate decline/increase. b) Posterior, transition & emission probabilities initialized by computing the average movement of SPY over 4 weeks on historical data. c) No dynamic training (roll-over).
  • 14. Training Training set: 196 weeks. Validation: 1 fold of 46 weeks. Implementation in Scala/Apache Spark. [chart: F1 score of the f2 model (0.64 – 0.80) vs. training/validation set size (48 – 240 weeks)]
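The F1 score reported above balances precision and recall; for one target hidden state it reduces to a few lines (counts below are made up for illustration):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for one hidden state,
    from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

print(f1_score(8, 2, 2))  # precision = recall = 0.8, so F1 is ~0.8
```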
  • 15. Prediction results [confusion matrices of expected vs. predicted SPY state for the three observation models: f1 (x_t = bullish/bearish), f2 (x_t relative to its 4-week average), and f3 ([x1_t, x2_t] with x1 = bullish/neutral, x2 = bearish/neutral)]
  • 16. Conclusion? 1. The text-book case worked. What about structural breaks? 2. Can the market response latency (set at 4 weeks) be optimized? 3. How does it compare to observed technical analysis data? 4. Can we combine investor sentiment with technical and fundamental analysis data?
  • 17. Stock Prediction using HMM in stationary states Detection of regime changes using Buried Markov models Alternative models 17
  • 18. Market technical indicators The (in)ability of HMM to detect regime changes or structural breaks is illustrated by using a technical analysis signal as input observations and the same hidden states as in the previous test. Observations: ratio of the relative price change of the SPX within a trading session over the relative volatility during the same session: x_t = [(p_close − p_prev_close)/p_prev_close] / [(p_high − p_low)/p_low]
  • 19. Structural breaks & regimes [chart: SPX, 03/15/2009 – 04/01/2014, showing Regime 1, a structural break (correction) spanning the validation range, and Regime 2] Market technical indicators that rely on daily data are far more continuous than the AAII weekly sentiment survey. Can our HMM model detect the structural break and the two regimes it separates?
  • 20. Model and tests The data is extracted from Yahoo financials (.csv files). • Training set: 870 trading sessions [1 – 420, 467 – 895] • Validation: 25 trading sessions around the SPX correction period [421 – 466] • Hidden states (4): increase > 1%, increase < 1%, decrease < 1%, decrease > 1%
  • 21. Limitation of HMM with discrete states The confusion matrix of predicted vs. actual SPX values over the 4 hidden states illustrates the poor quality of prediction of an HMM using discrete states. [diffuse confusion matrix: expected vs. predicted SPY state]
  • 22. Problem II: structural breaks Problem: How to deal with multiple trends/regimes and structural breaks? Solution: Model the observation sequence as a Gaussian mixture and define the dependence between observations as a Markov chain. HMM is accurate for a “stationary” process in which observations are independent given a hidden state. Can we overcome HMM’s inability to operate in a shifting environment?
  • 23. Auto-regressive Markov model A continuous model of the observations is needed: the observation set is defined as a Gaussian mixture, with p(z_t+1 = S_j | z_t = S_i) = a_ij, p(x_t = O_k | z_t = S_i) = b_ik, and p(x_t = O_m | x_t-1 = O_k, z_t = S_j) = N(x_t | w_j · x_t-1 + μ_j, σ_j)
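The auto-regressive Gaussian emission N(x_t | w_j · x_t-1 + μ_j, σ_j) can be written out directly. A sketch with scalar parameters (in practice w_j, μ_j, σ_j per state come from training; the values passed below are illustrative):

```python
import math

def ar_emission_logpdf(x_t, x_prev, w_j, mu_j, sigma_j):
    """log N(x_t | w_j * x_prev + mu_j, sigma_j): the current observation
    depends on both the hidden state j (through w_j, mu_j, sigma_j) and the
    previous observation x_prev -- the first-order AR-HMM emission."""
    mean = w_j * x_prev + mu_j           # AR(1) conditional mean
    var = sigma_j ** 2
    return -0.5 * math.log(2 * math.pi * var) - (x_t - mean) ** 2 / (2 * var)
```

The density peaks when x_t equals the state-conditioned prediction w_j · x_prev + μ_j, which is exactly how the model rewards observation sequences that follow a state's local trend.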
  • 24. AR-HMM & BMM An auto-regressive model combines 2 Markov models: an HMM for the hidden states, and a 1st-order (auto-regressive, AR-HMM) or higher-order (Buried Markov model, BMM) Markov chain for the observations. We use cardinality (the number of observations associated with the same hidden state) to evaluate the stationary states (regimes) and sudden shifts (structural breaks).
  • 25. AR-HMM & BMM: Cardinality [chart: cardinality across Regime 1 and Regime 2, with state transitions S_k->l and S_m->n] The cardinality (number of observations associated with the same hidden state) is quite high for stationary states, especially if the observations are noisy. Once the break is identified, the regimes are identified by the distribution of probability across the different hidden states.
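Cardinality as described above is just the run length of each hidden state along a decoded path; a minimal sketch (the sample path is hypothetical):

```python
from itertools import groupby

def cardinalities(path):
    """Run lengths of consecutive identical hidden states along a decoded
    (Viterbi) path. Long runs suggest a stationary regime; a cluster of
    short runs flags a possible structural break."""
    return [(state, len(list(run))) for state, run in groupby(path)]

print(cardinalities([0, 0, 0, 1, 1, 0]))  # [(0, 3), (1, 2), (0, 1)]
```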
  • 26. AR-HMM & BMM: limitations • These enhancements of the HMM for continuous or pseudo-continuous observations have been proven to provide higher-quality prediction for homogeneous observations (from the same source). • AR-HMM is less accurate for observations from different sources (i.e. weekly investor sentiment vs. daily market technical indicators). In this scenario, regularization has to be added to the model. • We may look at different types of techniques/classifiers as an alternative to complex regularization of AR-HMM.
  • 27. Stock Prediction using HMM in stationary states Detection of regime changes using Buried Markov models Alternative models 27
  • 28. Alternative to AR-HMM/BMM The previous section describes the limitations of auto-regressive HMM and Buried Markov models for observations from heterogeneous sources. Here is a summary of a few models that may be considered besides HMM: • Conditional Random Fields • Riemann manifold learning • Continuous-time Kalman filter
  • 29. Riemann manifolds Define the (n+m)-dimensional space of the transition probabilities A = [a_ij] and emission probabilities B = [b_ij], then find an embedded space (manifold) containing the most significant, ~constant hidden states. A regime may be defined by the subset of the transition matrix that does not change over time (the transition and emission probabilities stay ~constant over a regime).
  • 30. Riemann manifolds for regimes The Riemannian manifold C ⊂ A⨂B consists of the transition probability tensor for the subset of constant states within a regime with a cumulative probability > 80%.
  • 31. Conditional random fields Conditional Random Fields (CRF) are discriminative models derived from logistic regression. • CRF aims at computing the conditional probability of the hidden states given the observations, p(z = S0, …, Sn | x). • CRF does not require the features to be independent. • CRF does not assume the transition probabilities A to be constant.
  • 32. Continuous-time Kalman filter The Kalman filter is a recursive, adaptive, optimal estimator. • Kalman allows transitory states (adaptive) • Kalman does not need a training set • Kalman supports continuous state values (continuous-time Kalman ODE) • Kalman requires specification of white noise for the process and the measurement.
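A minimal scalar predict/correct recursion illustrating the filter, assuming an identity state-transition model; the noise variances q and r are the white-noise specifications mentioned above, and all values here are illustrative (this is a discrete-time sketch, not the continuous-time ODE form):

```python
def kalman_1d(zs, q=1e-4, r=2.5e-3, x0=0.0, p0=1.0):
    """Scalar Kalman filter with an identity state-transition model.
    zs: measurements; q: process-noise variance; r: measurement-noise
    variance; x0/p0: initial estimate and error covariance."""
    x, p, out = x0, p0, []
    for z in zs:
        p = p + q                  # predict: project the error covariance ahead
        k = p / (p + r)            # Kalman gain
        x = x + k * (z - x)        # correct: blend the measurement into the estimate
        p = (1.0 - k) * p          # shrink the error covariance
        out.append(x)
    return out

# Filtering a constant signal: the estimate converges to the signal value.
smoothed = kalman_1d([1.0] * 50)
```

No training set is needed: each step only blends the new measurement with the running estimate, weighted by the gain k.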
  • 33. References • “Scala for Machine Learning” §7 Sequential Models – Hidden Markov Model, P. Nicolas – Packt Publishing 2014 • “Pattern Recognition and Machine Learning” §13.2.1 Maximum Likelihood for the HMM, C. Bishop – Springer 2006 • American Association of Individual Investors (AAII) http://www.aaii.com • “Stock Market Forecasting Using Hidden Markov Model: A New Approach” R. Hassan, B. Nath – University of Melbourne 2008 • “Selective Prediction of Financial Trends with Hidden Markov Models” R. El-Yaniv, D. Pidan – Technion Israel • “Machine Learning: A Probabilistic Perspective” §17.6.4 Auto-regressive and buried HMMs, K. Murphy – MIT Press 2012

Editor's Notes

  • #2: The use case in this presentation is extracted from “Scala for Machine Learning” by Packt Publishing. The book describes the different features of the Scala programming language that are particularly useful in implementing machine learning algorithms or creating applications which use such techniques. The first section of this presentation is described in Chapter 7 “Sequential Data Models”.
  • #3: This presentation does not require any software engineering skills, but prior knowledge of machine learning, generative models such as Naïve Bayes, and basic data filtering techniques is highly recommended to absorb the key concepts.
  • #4: There are numerous papers on hidden Markov models (HMM), especially related to trading stocks, commodities or currencies. The “out-of-the-box” HMM for discrete observations works very well in an environment in which states are somewhat stationary, as illustrated in this presentation by predicting market direction using a sentiment survey. Continuous observations break the basic assumption of independence. We look at the different options to enhance HMM with auto-regressive capabilities to operate in a continuous environment. The auto-regression of the Markov chain proves to be essential in the detection of regimes and regime shifts.
  • #6: The American Association of Individual Investors is a non-profit organization that caters to individual investors for a very reasonable annual subscription fee. They publish a couple of surveys of their members regarding their sentiment on the future direction of the market. The result of the weekly survey is considered to be somewhat contrarian (Liz Ann Sonders, CIO, Charles Schwab).
  • #7: Sentiment is not provided in a vacuum: mentally, investors use a reference point (last week) to specify if they are bullish (more or less bullish than last week or the last month = average of last 4 weeks). The historical data goes back to 1973. The problem is a good candidate for a Markovian model as the current state of the market may depend on the state in the previous week.
  • #8: We use the weekly investor sentiment historical data between March 2009 and April 2014. We assume that the observations, the weekly sentiment survey results, are independent. However, the observations are conditionally dependent on the internal, unknown state of the stock market. Alternative views of the internal state of the market, such as price, do not show any significant shift. That assumption of a stationary state of the market has a significant impact on the accuracy of the prediction. Although the stock market technical indicators are pseudo-continuous, we can always extract daily, weekly or monthly discrete states. The internal state of the stock market is assumed to be a mixture or combination of Gaussian distributions, not a single normal distribution.
  • #9: We use the S&P 500 ETF, SPY, as a proxy for the market at large. The joint probability p(state, observation) can be broken down into: the conditional probability of the stock market being in state S(j) at t+1 given that it is in state S(i) at t; and the conditional probability of the investor bullishness being O(k) at t knowing that the current state of the stock market is S(j).
  • #10: This trellis representation illustrates the recursive nature of the HMM. This visualization is also a reminder of the constraints on HMM: observations depend only on the hidden state at the same time; there is no direct connection (conditional probability) between observations. The states at a specific time are indeed a mixture/summation of Gaussian distributions. A hidden state is similar to a cluster or class, and EM is used because of the mixture.
  • #11: Decoding is used as a prediction of the internal market (hidden) states. Training: the expectation-maximization algorithm processes/updates the mixture of Gaussian distributions of each hidden state to generate a lambda model. It requires the transition and emission probabilities to be initialized. Evaluation: predict the observation at t+1 given an existing sequence of observations up to t and a λ-model. The α – β passes are also known as the forward-backward passes. Note: the implementation of the HMM relies heavily on dynamic programming techniques.
  • #12: The SPY Exchange Traded Fund is used as a proxy for the overall market (S&P 500 stock index, SPX). The bullish/bearish ratio (top chart) has greater values when the SPY value (bottom chart) peaks. The trend of the market was fairly stationary between March 2009 and April 2014. The market behavior over this period can be easily explained by the constant supply of liquidity by the US Federal Reserve and other central banks. Zero-interest and quantitative-easing policies entice traders to take additional risks, long-term investors to look for income/yield in dividend stocks instead of bonds, and corporations to inflate their EPS by selling debt at low interest to buy their own stock. The flag and pennant formation at the center of the lower graph permutes the support and resistance upward trend lines. It is a well-documented bullish continuation pattern.
  • #13: Feature engineering consists of asking the right questions in order to define a model. The observed states (bullishness of investors) can be defined by a single variable (i.e. bullish vs. bearish) or two variables (i.e. bullish vs. neutral and bearish vs. neutral). These variables can be defined as Boolean (bullish = true(1), bearish = false(0)) or quantized. If the bullishness is too noisy, you may want to compute the observation states/symbols as the average or moving average of the latest m observations (a). One may also use the rate of change in bullishness, the ratio of the current bullishness over the average of the m previous weeks (b), as observation values. Note: the observations can be smoothed using a low-pass filter to avoid misclassification or over-fitting if needed. Model: 3 input/observation variable types are selected: a single-variable bullish/bearish ratio; the change of the bullish/bearish ratio compared to the average ratio over the previous 4 weeks; and two variables, bullish/neutral and bearish/neutral.
  • #14: The parameters of the HMM have to be set up prior to training: how many hidden states to accurately represent the internal state of the stock market? Hidden states can be regarded as classes or clusters, with the same trade-offs for their configuration. The components of the λ-model could be randomly or heuristically initialized. We could consider retraining the model for every n new observations to make it more adaptive. Here are the parameters for the hidden states used in the test: 4 states to represent the stock market direction over 4 weeks: significant increase (> 3%), moderate increase (< 3%), moderate decrease (> -3%) and significant decrease (< -3%). We decided to leverage the historical data on the SPY stock movement over 4-week periods. We found no need to re-train the model after a few new observations because of the stationary nature of the market between March 2009 and April 2014.
  • #15: The training is performed with the first 196 AAII weekly sentiment surveys and the validation on the remaining 46 weeks. The F1 measure, which balances precision and recall, shows limited variation as the training set increases.
  • #16: Confusion matrices (heat maps) are commonly used to visualize the comparison between the predicted and expected values of the state of the stock market. The two-variable model and the bullish/bearish ratio relative to the average of the 4 previous weeks provide the best results. The accuracy for the target hidden state in the confusion matrix varies from 62% to 75%.
  • #17: The predictive quality of our HMM model using the AAII weekly sentiment survey is quite good. This is mainly the result of the stationary behavior of the market. How would the model behave in non-stationary mode? A stationary process is assumed; the prediction does not work in case of a significant reversal (not enough visible states/symbols). The 4-week latency has been arbitrarily selected by looking at historical data (to make the example look good), but it could be either optimized or added to the model. Technical analysis signals such as volatility, volume or rate of change provide higher granularity (a higher number of observation symbols) => continuous. Combining weekly investor sentiment and quarterly company earnings and ratios with pseudo-continuous technical analysis signals is challenging.
  • #18: Let’s look at the case in which the stock market experiences a reversal or shift.
  • #19: Contrary to the AAII weekly sentiment survey, stock market technical indicators are usually collected daily or hourly and are therefore more prone to the impact of sudden changes in trend. We arbitrarily select the daily stock price variation relative to the volatility within the trading session.
  • #20: The training is executed over the entire period of 895 trading sessions/days that includes the 2 distinct regimes. The 25 trading sessions around the structural break or correction are not included in the training set.
  • #21: The training and validation sets are used to illustrate and accentuate the limitation of HMM for discrete states to cope with sudden shifts in trend (breaks). Real-world applications rely on a more even or random selection of observations for training and validation.
  • #22: The confusion matrix is quite diffused. The prediction of stock price movement using daily stock price variation relative to the volatility within the trading session is very inaccurate using the plain vanilla HMM
  • #23: Can HMM detect reversal, or change of trends in the stock market looking at different type of observations? Moreover, can a single HMM recognize different regimes?
  • #24: The solution is to relax the constraint of independence of observations by modeling the observation time series as a first-order Markov chain. An observation at time t depends on the internal state of the market at time t and the observation at time t-1.
  • #25: The combination of the Markov chain for the hidden market state, the conditional probability of an observation given a state, and the Markov chain of observations defines an auto-regressive model. The Markov model for the observations can be simple, as in the AR-HMM, or of higher order, as in the Buried Markov model. In a stationary process, multiple observation values depend on a single internal, hidden state. This ratio is known as the cardinality.
  • #26: The cardinality, the number of observations associated with the same hidden state, can be used to detect a sudden regime shift.
  • #27: Auto-regressive HMM does not work well when the data has to be regularized. There are quite a few papers that describe the benefits and limitations of auto-regressive hidden Markov models. Regularization of HMM is applied to the transition probabilities and is quite complex.
  • #29: The section gives an overview of the alternative models. Conditional Random Fields (discriminative model based on logistic regression) Riemann manifold learning Continuous-time Kalman filter
  • #30: At a higher level, a regime can be thought of as a slice of observations for which some transition probabilities (associated with high-cardinality hidden states) are rather stable (in red). The transition and emission matrices have to be defined as tensors in order to be used in a non-Euclidean subspace. The “containing” space is defined by the product of the A and B tensors.
  • #31: The Riemann manifold uses differential geometry to define a tensor metric g. The Riemann manifold C defines the states (transition probabilities) with a cumulative probability > 80% that do not vary much over time within a regime. The transition probabilities can be interpreted as a local principal component analysis. Manifold learning is beyond the scope of this presentation.
  • #32: Conditional random fields do not have the same restrictions as HMM. They rely on the minimization of a hinge loss + regularization term, which can be very computationally intensive.
  • #33: The Kalman filter is not a probabilistic model. It estimates the feature vector (~prediction) for the next step, computes the error between the estimate and the actual value (~loss), then corrects it. The Kalman filter consists of: a state transition/process model (equation) and a measurement/observations model (equation).
  • #34: Some useful references on Conditional Random Fields: “Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data” J. Lafferty, A. McCallum, F. Pereira – 2001; “Machine Learning for Multimedia Content Analysis” §9.6 Conditional Random Fields Case Study, Y. Gong, W. Xu – Springer 2007; “Machine Learning: A Probabilistic Perspective” §19.6.2 Conditional Random Fields, K. Murphy – MIT Press 2012; “Scala for Machine Learning” Chap 7 Sequential Data Models: Conditional Random Fields, P. Nicolas – Packt Publishing 2014. Interesting references on the Kalman filter: “An Introduction to the Kalman Filter” G. Welch, G. Bishop, University of North Carolina 2006 http://www.cs.unc.edu/~welch/media/pdf/kalman_intro.pdf; “Scala for Machine Learning” Chap 3 Data Pre-processing – Kalman filter, P. Nicolas – Packt Publishing 2014; “HMMs, Kalman Filters” Advanced Robotics lecture 22, University of California, Berkeley, P. Abbeel – 2009.