
Predicting VIX With Adaptive Machine Learning

This paper explores the predictability of the CBOE Volatility Index (VIX) using advanced machine learning techniques, demonstrating that daily VIX can be predicted with greater accuracy than previously reported. Key findings highlight the importance of dynamic training and a wide range of economic variables, particularly the weekly jobless claims data, in enhancing forecasting accuracy and economic relevance. The research contributes to both the understanding of VIX predictability and the development of robust quantitative investment strategies, offering insights for future financial forecasting applications.

Predicting VIX with Adaptive Machine Learning

Yunfei Bai* and Charlie X. Cai**

This draft: Oct 2024

Abstract

This paper investigates the predictability of the CBOE Volatility Index (VIX)
and explores the sources of its predictability using machine learning (ML)
techniques. We establish that daily VIX can be predicted with higher accuracy
than previously documented, yielding forecasts of significant economic value.
Our analysis underscores the efficacy of dynamic training, nonlinear methods
and a comprehensive set of economic variables in predicting VIX trends. We
identify the weekly jobless claim data as a pivotal variable, revealing its
substantial influence on market volatility, an area not extensively explored in
prior research. While accurately forecasting VIX spikes poses a challenge, our
algorithms demonstrate remarkable adaptability to new data, thereby
significantly enhancing the resilience of trading strategies. This research not
only contributes to the understanding of VIX predictability but also offers
valuable insights for the development of more robust quantitative investment
and risk management strategies.

Keywords: Machine Learning, AutoML, Explainable AI, VIX, Predictability,
Forecasting, Quantitative Trading, Big Data, S&P 500, Futures, US markets

JEL codes: G0, G17, C52, C55, C58

* PhD, AI/ML and Big Data Consultant.


** Corresponding Author, Professor of Finance, Liverpool University
School of Management, University of Liverpool, Liverpool, UK. Email:
[email protected]. Website: www.CharliexCai.info

Acknowledgement:
We thank the associate editor and referee for their constructive advice
and Guofu Zhou for his insightful comments on an early draft of this
paper. We also thank Giuliano De Rossi for sharing his working paper
with us in the early stages of the project. All errors are our own.

1 Introduction

The CBOE Volatility Index (VIX), often referred to as the 'fear index' (Whaley,
2000), is a key forward-looking indicator for market participants and
policymakers. Despite its prominence, VIX predictability remains challenging
due to traditional reliance on limited predictors and linear models. This paper
leverages recent Machine Learning (ML) advancements to address these
limitations by exploring the predictability of the VIX using a wide range of
economic indicators, focusing on forecasting accuracy and economic relevance,
and examining the sources and constraints of this predictability.

While finance-ML literature has largely focused on cross-sectional
return forecasting, research on time series volatility forecasting remains
limited. Our study bridges this gap by predicting the VIX’s directional
movement for the following day, aligning with binary investment decisions
(long or short). We compile 278 features across 14 categories from Bloomberg,
including global markets, macroeconomic data, and seasonality factors,
ensuring real-time data availability without look-ahead bias.

We employ various ML algorithms, ranging from Naïve Bayes and
Logistic Regression to Decision Trees, Random Forests, Adaptive Boosting
(AB), and Multi-Layer Perceptron, alongside an ensemble model. A novel cross-
validation strategy maintains the time-series integrity of VIX data (Bergmeir et
al., 2018). Our study segments the data into three periods: In-Sample (pre-
2009) for training, Out-of-Sample (late 2009) for validation, and
Implementation (2010-2020) for testing.

Key findings reveal that Adaptive Boosting (AB) achieved the highest
validation accuracy (68.2%), outperforming other models and demonstrating
resilience against overfitting. Moreover, AB’s forecasts delivered an annualized
return of 225% in a simulated long/short VIX investment strategy, with a
Sharpe ratio of 1.7, underscoring its practical value. Our economic evaluation
shows that VIX forecasts have significant applications in valuation and risk
management models. Additionally, the models effectively adjusted to
unexpected market events, such as the 2010 Flash Crash and 2016 Brexit
referendum, recovering losses more efficiently than non-model strategies.

Variable importance analysis highlights the US weekly jobless report as
the most influential predictor, followed by seasonality factors such as days until
VIX futures expiration. Our findings suggest that relying solely on VIX’s
historical data overlooks critical economic information. The inclusion of a
broader set of features enables a deeper understanding of VIX predictability,
particularly in capturing market behaviors like momentum and reversal
patterns.

Further tests show that dynamic retraining and balanced sampling
significantly improve model performance, particularly for predicting larger
market movements. Expanding to a four-category model (up-small, up-big,
down-small, down-big) maintains accuracy. Our ML framework’s adaptability
and automation offer valuable insights for future financial forecasting
applications.

Our research contributes significantly in two areas. First, in finance
literature, we demonstrate the superiority of machine learning (ML) over
traditional methods like Logistic Regression and HAR for volatility forecasting.
This extends the work of Konstantinidi et al. (2008), Paye (2012), and
Fernandes et al. (2014), showing that ML’s ability to include a broader range of
economic variables provides deeper insights into the relationship between
macroeconomic factors and volatility. The extended historical dataset also
enhances cross-validation and out-of-sample testing.

Second, our study offers practical value for quantitative investment
strategies and risk management by accurately predicting VIX direction. This
helps investors make informed entry and exit decisions and allows for better
risk management through volatility forecasts. The model is also highly useful
for derivatives trading, as it anticipates market movements and quickly adapts
to new information, making it valuable for both short-term and long-term
investors.

Additionally, our research offers insights for future ML applications in
financial forecasting. We propose an adaptive learning framework featuring
Automated Machine Learning (AutoML) and Hyperparameter Optimization
(HPO) for continuous, out-of-sample implementation. This framework reduces
manual intervention and ensures the model evolves with changing market
conditions, serving as a reference for future ML studies in finance.

The structure of this paper is outlined as follows: Section 2 provides an
overview of existing literature on volatility forecasting, setting the context for
our study. In Section 3, we detail our research design, outlining the
methodologies and approaches employed. The empirical findings are presented
in Sections 4, 5, and 6, where we delve into the forecasting performance,
conduct an economic evaluation, and explore the sources of predictability,
respectively. Finally, Section 7 offers concluding remarks, summarizing the key
insights and implications of our research.

2 Related literature

2.1 Machine Learning Applications in Economics and Finance

The rapid advancements in machine learning (ML) and computing
power have revolutionized fields like economics and finance, enabling
sophisticated models to process vast data and uncover patterns missed by
traditional methods. This review engages with recent developments in ML
applications in these areas. Breiman's (2001) foundational work on ensemble
learning, particularly the random forest algorithm, has been pivotal, making
ensemble methods essential due to their robustness and accuracy in fields like
financial analytics.

In finance, BlackRock's 2019 report highlights ML's transformative
impact, showing how algorithms enhance decision-making, optimize
portfolios, and manage risks by analyzing large datasets and predicting market
trends more accurately than traditional methods (BlackRock, 2019). The CFA
Institute (2020) also emphasizes ML's role in asset management, covering its
use in automated trading, fraud detection, credit scoring, and personalized
advice, while addressing ethical and regulatory concerns.

The integration of ML with economics and finance has opened new
research avenues, enabling the analysis of economic indicators and financial
trends through models capable of handling complex data. This paper explores
how advanced ML techniques, combined with a wide set of economic variables,
can improve our understanding of the VIX’s predictability, a critical financial
indicator.

2.2 Historical and realized volatility forecasting

The foundational models in volatility forecasting are the GARCH family
models, introduced by Engle (1982) and Bollerslev (1986). Andersen,
Bollerslev, Diebold, and Labys (2003) later showed that multivariate realized
volatility modeling outperforms the GARCH and stochastic volatility models in
out-of-sample forecasts. Corsi's (2009) heterogeneous autoregressive (HAR)
model, which captures volatility persistence over daily, weekly, and monthly
horizons, has become a popular benchmark due to its simplicity and
effectiveness in replicating empirical volatility patterns.

Many studies have integrated GARCH or HAR with nonlinear methods
to improve forecasting accuracy. Kristjanpoller and Minutolo (2018), Maciel,
Gomide, and Ballini (2016), and Psaradellis and Sermpinis (2016) contributed
to this area. Donaldson and Kamstra (1997) found that an Artificial Neural
Network-GARCH model outperformed traditional models like GARCH and
EGARCH in forecasting stock return volatility. More recently, Bucci (2020)
demonstrated that LSTM Recurrent Neural Networks (RNNs) can exceed linear
models in out-of-sample forecasts of monthly realized volatility. However, most
studies have focused primarily on time series methods, with less attention to
broader economic factors.

2.3 Implied Volatility and VIX Forecasting

Implied volatility has been a key measure of market expectations since the VIX
index was introduced by CBOE in 1993. Known as the 'fear index,' the VIX is a
benchmark for U.S. equity market volatility, reflecting the 30-day expected
volatility based on SPX options prices. Early work by Hamid and Iqbal (2004)
showed neural networks outperform implied volatility in predicting S&P 500
futures prices, though our focus is on forecasting the VIX itself. Konstantinidi,
Skiadopoulos, and Tzagkaraki (2008) identified predictable patterns in implied
volatility forecasting using models like regression and VAR but found negative
returns when applying these forecasts to VIX futures trading. Their best model
was a simple linear regression with seven economic variables, underlining the
role of economic factors.

Paye (2012) explored how macroeconomic uncertainty, stock returns,
and credit conditions influence volatility, finding that these factors Granger-
cause volatility but don’t significantly improve out-of-sample performance.
Fernandes, Medeiros, and Scharth (2014) used HAR models with economic
variables, finding minimal long-term effects of the term spread on VIX, while
Degiannakis, Filis, and Hassani (2018) showed non-parametric models like
SSA-HW outperform parametric ones for short-term forecasts.

Our study expands on this literature by analyzing a broader set of
economic variables, using distinct phases for training, validation, and
implementation to ensure trackable out-of-sample results. We also explore the
underlying sources of predictability, considering both model design and
variable selection, and tailor our design for practical applications, particularly
in testing the profitability of VIX predictability.

3 Research Design and the Adaptive Machine Learning Framework

In our study, we develop an adaptive learning methodology tailored for
predicting VIX signals, addressing the broader challenges of implementing
machine learning (ML) in financial forecasting.

3.1 Research Design

From a research design perspective, we explore several critical questions:

1. Forecasting Objective: We aim to predict the daily directional signal of the
VIX for the following day.

2. Explanatory Variables: Our approach involves a comprehensive selection of
variables, ensuring real-time availability and the absence of look-ahead bias.
Bloomberg serves as our primary data source, and we have identified 278
features across 14 categories, as detailed in Table 1 (refer to Online Appendix I
for the full list). These variables are selected based on economic theories and
existing studies.¹

<Insert Table 1>

Our modeling approach can be summarized by the following generic
model form.

$$\hat{y}_{t+h} = f(X_{1,t}, X_{2,t}, \ldots, X_{278,t}; \theta) \qquad \text{Equation (1)}$$

where:

• $\hat{y}_{t+h}$ is the forecasted value of the target variable at time $t+h$. In this
paper,

$$y_{t+h} = \begin{cases} \text{up}, & \text{if } \Delta VIX_{t+h} > 0 \\ \text{down}, & \text{if } \Delta VIX_{t+h} \le 0 \end{cases}$$

where $\Delta VIX_{t+h}$ represents the change in the VIX from time $t$ to
$t+h$, and $h = 1$.
• $\boldsymbol{X}_t = \{X_{1,t}, X_{2,t}, \ldots, X_{278,t}\}$ is the set of predictor variables at time $t$, which
includes the 278 features derived from 14 groups of underlying data.
These features include lagged variables of both predictor and
dependent/target data. See the online appendix for a detailed list of
variables.
• $f$ is the machine learning model function.
• $\theta$ represents the parameters of the machine learning model,
determined during training.
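As an illustration of the target definition above, the following minimal Python sketch labels each day with the direction of the next VIX change. The function name `direction_labels` and the sample levels are hypothetical, and assigning a zero change to "down" is an assumption made here so that the two classes are exhaustive.

```python
def direction_labels(vix, h=1):
    """Label day t with the direction of the VIX change from t to t+h."""
    labels = []
    for t in range(len(vix) - h):
        change = vix[t + h] - vix[t]
        # Assumption: an unchanged VIX is counted as "down".
        labels.append("up" if change > 0 else "down")
    return labels


# Made-up VIX levels, for illustration only.
vix_levels = [20.0, 21.5, 21.5, 19.0, 19.8]
labels = direction_labels(vix_levels)
print(labels)  # ['up', 'down', 'down', 'up']
```

The last `h` observations receive no label because their future value is not yet known, which is what keeps the target free of look-ahead information.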

3. Algorithm Selection: We incorporate a diverse range of ML algorithms,
including Naïve Bayes (NB), Logistic Regression (LR), classic ML techniques
like Decision Tree (DT) and Random Forest (RF), as well as more advanced
methods such as Adaptive Boosting (AB), Multi-Layer Perceptron (MLP), and
an Ensemble model (ENS) that integrates all the aforementioned algorithms.
This selection covers a broad spectrum of model complexities to determine the
most effective algorithm for predicting volatility direction. Further details on
these algorithms are provided in Online Appendix II.

¹ There is a potential mixed-frequency issue in the data. At each point in
time, we take a snapshot of all series, whatever their frequency, as of that
point to construct the input data for the forecast. We leave it to the
algorithms to determine the usefulness of each feature rather than relying on
pre-modeling feature engineering.

4. Hyperparameter Selection/Model Tuning: We delve into the specifics of each
algorithm, including the selection of the most suitable model setup for our out-
of-sample application. Additionally, we address how to systematically monitor
and manage model performance decay during the inference process, focusing
on retraining strategies.

We focus our discussion on the adaptive continuous learning
methodology in the following.

3.2 The Adaptive Continuous Learning Methodology

Our methodology introduces an automated adaptive continuous ML
framework. Figure 1 illustrates the essential elements of our closed-loop
adaptive learning design, which encompasses three primary steps: training,
validation, and implementation.

<Insert Figure 1>

3.2.1 Step 1: Training and Model Selection with Dynamic Hyperparameter Setting and K-Fold Cross-Validation

A key aspect of model construction is selecting the right hyperparameters for
classification models. Traditional methods, often based on trial and error, are
time-consuming and lack traceability. To address this, we use an AutoML-
based Hyperparameter Optimization (HPO) method with Grid Search,
combined with K-Fold cross-validation. This automates the selection process,
identifying the optimal hyperparameters for each algorithm.

Given the challenges of time series data like the VIX, where maintaining
temporal sequence is crucial, traditional cross-validation methods can
introduce bias. We adopt a strategy that preserves chronological order, as
suggested by Bergmeir et al. (2018), excluding entire rows for the test set to
retain the time-dependent structure. Our approach ensures minimal
information loss, particularly for time series features like MA5 and MA30.
Following Bergmeir et al., cross-validation remains valid if error terms are
uncorrelated, which is likely with our large models. We define a matrix of
hyperparameter ranges for each algorithm and use K-fold cross-validation to
ensure optimal performance and reduce overfitting risks (see Appendix I for
details).
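The combination of grid search and K-fold cross-validation described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the folds are taken as contiguous blocks of whole rows, `evaluate` is a hypothetical user-supplied callable that fits a model on the training rows and returns accuracy on the held-out rows, and the hyperparameter names in the example grid are placeholders.

```python
from itertools import product


def blocked_kfold(n_rows, k=5):
    """Yield (train_idx, test_idx); each test fold is a contiguous block of rows."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_rows))
        yield train_idx, test_idx
        start += size


def grid_search(param_grid, n_rows, evaluate, k=5):
    """Return (best_params, best_score) by mean cross-validated score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [evaluate(params, tr, te) for tr, te in blocked_kfold(n_rows, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_evaluate(params, train_idx, test_idx):
    # Stand-in for "fit on train rows, score on test rows"; peaks at (100, 0.5).
    return (1.0
            - abs(params["n_estimators"] - 100) / 1000
            - abs(params["learning_rate"] - 0.5))


grid = {"n_estimators": [50, 100, 200], "learning_rate": [0.1, 0.5, 1.0]}
best_params, best_score = grid_search(grid, n_rows=3600, evaluate=toy_evaluate)
```

Because each row already carries its own lagged features (for example moving averages), holding out whole rows leaves the remaining rows intact, which is the property the text relies on.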

3.2.2 Step 2: Algorithm Selection with Out-of-Sample Validation

After identifying the best model setup for each algorithm, we conduct out-of-
sample validation tests. Comparing training and validation accuracy informs us
about the relative performance and stability of different algorithms. The
algorithm with the best validation performance is selected for the next
implementation step. We also consider the variability between training and
validation performances, as large variations may indicate tendencies for
overfitting or underfitting.
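A small sketch of this selection rule is given below. The MLP figures and the AB validation accuracy are the ones reported elsewhere in the paper; the AB training accuracy is a hypothetical placeholder, and the function name `select_algorithm` is illustrative.

```python
def select_algorithm(scores):
    """Pick the algorithm with the highest validation accuracy.

    `scores` maps an algorithm name to (training_accuracy, validation_accuracy).
    Also returns the train-minus-validation gap, a rough overfitting signal.
    """
    gaps = {name: train - val for name, (train, val) in scores.items()}
    best = max(scores, key=lambda name: scores[name][1])
    return best, gaps


scores = {
    "MLP": (0.94, 0.622),  # training and validation accuracy reported in the paper
    "AB": (0.66, 0.682),   # validation from the paper; training value is hypothetical
}
best, gaps = select_algorithm(scores)
```

A large positive gap, as for MLP here, points to overfitting, while a gap near zero or below suggests the model generalizes at least as well as it fits.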

3.2.3 Step 3: Implementation and Closed-Loop Continuous Learning

Over time, predictive model performance can decline as market behaviors
evolve. The typical response is to periodically build new models with fresh data.
However, this approach, often based on human judgment, can lead to delayed
or unnecessary model updates.

To standardize and improve this process, we designed a closed-loop
continuous learning framework. When model performance falls below a
predefined threshold (e.g., a 42.5% prediction error rate in our study), a new
training process is automatically initiated. Additionally, a stabilization period
(e.g., at least 120 days in our research) is set before retraining to gather
sufficient data for reevaluation and to prevent too frequent model switches.
This approach ensures timely model updates, maintaining overall quality and
performance with minimal human intervention.
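The retraining trigger described above can be sketched as a simple rule. The 42.5% error threshold and the 120-day stabilization period come from the text; measuring the error rate over all days since the current model went live is an assumption, as is the function name `should_retrain`.

```python
def should_retrain(hits, error_threshold=0.425, min_days=120):
    """Decide whether the closed loop should start a new training run.

    `hits` holds one boolean per day since the current model went live
    (True when the directional forecast was correct).
    """
    days_live = len(hits)
    if days_live < min_days:
        return False  # still inside the stabilization period
    error_rate = hits.count(False) / days_live
    return error_rate > error_threshold


# 120 days live with a 50% error rate: retraining is triggered.
print(should_retrain([True] * 60 + [False] * 60))  # True
```

The stabilization check comes first so that a short run of bad days right after deployment cannot cause rapid model switching.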

3.3 The VIX sample and sub samples

The VIX is a real-time index that measures market expectations of 30-day
volatility, often called the "fear index." Calculated by the CBOE from S&P 500
options prices, the VIX rises during market stress and falls during stability,
providing a measure of market sentiment. Economically, the VIX serves as a
key risk management tool for investors, indicating fear during downturns and
complacency during calm periods. It also influences investment strategies,
prompting safer investments in high-volatility periods and encouraging equity
investments during low-volatility phases. VIX spikes often correlate with
financial crises or major economic events, offering insight into potential
disruptions.

Figure 2 shows VIX data from 1995 to 2020, divided into three periods.
The blue line represents the In-Sample Period (3,600 observations up to 2009),
the red line the Out-of-Sample Period (400 observations before 2010), and the
green line the Implementation Period (2010–2020). A vertical dashed line
marks the end of 2009, separating validation from implementation. During
implementation, we use a rolling window of 3,600 observations and 5-fold
validation for retraining.

<Insert Figure 2>

The VIX shows significant fluctuations, including clustering of high
volatility, reflecting its strong connection to market stress. Notable spikes
during the dot-com bubble, 2008 financial crisis, and COVID-19 pandemic
underscore its role as a market sentiment gauge. The VIX also exhibits mean
reversion, returning to its long-term average after short-term deviations.
Summary statistics and dynamic properties across periods are discussed in
Appendix II, showing close alignment between the training and
implementation phases, confirming the model captures market volatility
effectively.

4 Forecasting Performance

Our empirical results are organized into three sections: prediction accuracy,
economic evaluation, and sources of predictability. This section examines
training, validation, and forecasting performance based on accuracy and
market timing. Section 5 assesses the economic significance of the forecasts,
while Section 6 explores the sources of predictability through various
experiments.

4.1 Training and Validation Accuracy for Modeling at the End of 2009

In evaluating the prediction accuracy of our selected models for all algorithms
as of the end of 2009, we focused on the outputs from the "Step 2 validation
and algorithm selection." After completing the K-fold cross-validation training,
we retained the best-performing models and applied them to 400 validation
data points. Figure 3 presents the training and validation accuracy for each
algorithm, revealing four key observations.

<Insert Figure 3>

Firstly, the Naïve Bayes (NB) model exhibited the lowest accuracy,
indicating that more complex algorithms have a distinct advantage in this
application. Secondly, a linear model like Logistic Regression (LR)
demonstrated reasonable accuracy, suggesting that the predictability is
significantly influenced by the economic relevance of the features we selected.

Thirdly, we observed a trade-off between model complexity and
stability/variability, particularly when comparing in-sample training
performance with out-of-sample validation performance. A notable decline in
out-of-sample performance often signals overfitting. In this context, the Multi-
Layer Perceptron (MLP), a neural network-type model with a complex
nonlinear structure, showed signs of overfitting, evidenced by its high in-
sample accuracy of 94%. However, its out-of-sample validation accuracy of
62.2% was still commendable, especially when compared to NB.

Finally, Adaptive Boosting (AB) yielded the best validation results.
Intriguingly, its validation accuracy surpassed its in-sample accuracy, making
it a standout choice for implementation in Step 3, following our framework's
guidelines as of the end of 2009.

4.2 Out-of-Sample Implementation Accuracy from 2010 to 2020

In this section, we present the yearly accuracy ratios for our out-of-sample
forecasts by algorithms, as depicted in Figure 4, with mean ratios detailed in
Table 2. Aligning with our validation results, Figure 4 indicates that Naïve
Bayes (NB) consistently shows the lowest accuracy rates. In contrast, Decision
Tree (DT), Random Forest (RF), and AdaBoost (AB) maintain higher accuracy
rates, consistently above 50%. The Multi-Layer Perceptron (MLP)
underperforms, reinforcing concerns about overfitting identified during the
validation stage. The Ensemble model (ENS) displays intermediate
performance between DT and RF and appears to reduce year-to-year variability
in model performance, as evidenced by a narrower interquartile range.

<Insert Figure 4 >

Figure 5 reports the cumulative correct predictions for each model to
demonstrate the dynamics of the implementation accuracy. It can be interpreted
as the number of net correct predictions at any given point in time. These plots
demonstrate that decision tree family models, such as Decision Tree (DT),
Random Forest (RF), and AdaBoost (AB), exhibit consistent growth in correct
predictions, indicating stable and reliable performance over time. Conversely,
more complex models, such as Multi-Layer Perceptron (MLP), show more
varied performance across the time period analyzed.
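One way to compute such a curve is sketched below, assuming that each correct forecast adds one and each incorrect forecast subtracts one; the exact construction behind Figure 5 is not spelled out in the text, and the sample forecasts are invented.

```python
from itertools import accumulate


def net_correct_curve(predicted, actual):
    """Running count of correct minus incorrect directional forecasts."""
    steps = (1 if p == a else -1 for p, a in zip(predicted, actual))
    return list(accumulate(steps))


# Invented forecasts and outcomes, for illustration only.
predicted = ["up", "down", "down", "up", "up"]
actual = ["up", "down", "up", "up", "down"]
curve = net_correct_curve(predicted, actual)
print(curve)  # [1, 2, 1, 2, 1]
```

Under this convention a model with accuracy above 50% produces an upward-drifting curve, and flat or falling stretches mark periods where the model lost its edge.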

<Insert Figure 5 >

Overall, these results affirm that both the simplest probabilistic
classifiers (NB) and the most complex method (MLP) are less effective for
directional forecasting, while decision tree-based models emerge as the most
suitable for this classification task.

Regarding the model retraining during the closed-loop learning in Step
3, we assume the selection of the best models from each algorithm for
implementation. By design, weaker algorithms in terms of performance
necessitate more frequent retraining. Figure 6 shows that NB required the most
retraining, with 23 instances over 11 years. Given our minimum run
requirement of 120 days for each model, this frequency suggests that NB's
performance consistently failed to meet our error rate threshold of 42.5%,
indicating that retraining does not necessarily rectify performance issues in
weaker algorithms. Conversely, DT and AB experienced less frequent
retraining, approximately once a year. The Ensemble model demonstrated the
most stability, requiring the least retraining (7 times in 11 years).

<Insert Figure 6>

In terms of performance variation, despite frequent retraining, NB's
performance showed low variability but consistently poor results. RF and MLP,
on the other hand, exhibited significant variations in their training performance
across different stages/models. The training and validation outcomes for all
models align with our findings at the end of 2009 (Figure 3), suggesting that
our training regime consistently yields reliable outcomes across different
datasets. Notably, the overfitting issue with MLP persisted throughout the
closed-loop training.²

4.3 Statistical Test

In the out-of-sample implementation phase, Table 2 presents the accuracy and
timing measures. To assess the differences in accuracy rates between models,
we employ the Diebold and Mariano (1995, DM) test. Because the DM test can
over-reject the null hypothesis in small samples, Harvey, Leybourne, and
Newbold (1997, HLN) proposed modified statistics to mitigate this issue. Our
main results include these HLN statistics.
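For reference, the HLN-corrected DM statistic can be computed as in the sketch below. It assumes a 0/1 misclassification loss per day (1 for a wrong directional forecast), which is an illustrative choice rather than a detail confirmed by the text; the function name and the sample loss series are hypothetical.

```python
import math


def dm_hln_statistic(loss_a, loss_b, h=1):
    """Diebold-Mariano statistic with the Harvey-Leybourne-Newbold correction.

    `loss_a` and `loss_b` are per-period losses of two competing forecasts;
    `h` is the forecast horizon. A negative value favors model A.
    """
    d = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(d)
    mean_d = sum(d) / n

    def autocov(k):
        return sum((d[t] - mean_d) * (d[t - k] - mean_d) for t in range(k, n)) / n

    # Long-run variance of the loss differential (autocovariances up to h-1).
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    dm = mean_d / math.sqrt(long_run_var / n)
    correction = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return correction * dm


# Invented 0/1 losses for two models over eight days.
loss_a = [0, 1, 0, 0, 1, 0, 0, 0]
loss_b = [1, 1, 0, 1, 1, 0, 1, 0]
stat = dm_hln_statistic(loss_a, loss_b)
```

Following HLN, the corrected statistic is compared with a Student t distribution with n-1 degrees of freedom rather than the standard normal. The sketch does not guard against a zero-variance loss differential, which would need separate handling.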

To contextualize our findings within existing literature, we compare our
results with a simple linear forecasting model, the Heterogeneous
Autoregressive (HAR) model, known for its effectiveness in volatility
forecasting as discussed in Section 2. We implement a rolling daily HAR model
forecast using three variables: lagged one-day, weekly average, and monthly
average values of VIX, with a rolling window of 4000 observations, mirroring
our main analysis.

² In the early stages of our study, we incorporated the Support Vector Machine (SVM) among
our selection of models. However, we observed that this algorithm predominantly yielded
one-sided predictions, demonstrating limited timing ability. To provide a comprehensive
view, we have detailed the results and a focused discussion on the performance of SVM in an
online appendix.
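The three HAR regressors described above can be constructed as in the following sketch. The 5-day and 22-day window lengths follow the usual HAR convention and are an assumption here, since the text only says "weekly" and "monthly"; the function name is illustrative, and the least-squares fit on the rolling window is left out.

```python
def har_features(vix, week=5, month=22):
    """Build HAR regressors and next-day targets from a list of VIX levels.

    Row t holds [VIX_t, mean of the last `week` days, mean of the last
    `month` days]; the matching target is VIX_{t+1}.
    """
    rows, targets = [], []
    for t in range(month - 1, len(vix) - 1):
        daily = vix[t]
        weekly = sum(vix[t - week + 1 : t + 1]) / week
        monthly = sum(vix[t - month + 1 : t + 1]) / month
        rows.append([daily, weekly, monthly])
        targets.append(vix[t + 1])
    return rows, targets


# Synthetic series 1.0, 2.0, ..., 30.0, used only to show the indexing.
series = [float(v) for v in range(1, 31)]
rows, targets = har_features(series)
print(rows[0], targets[0])  # [22.0, 20.0, 11.5] 23.0
```

To obtain a directional signal comparable with the classifiers, one would regress the targets on the rows within each rolling window and compare the fitted next-day level with the current level.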

Table 2 reveals several key findings. Firstly, Panel A confirms that all
models, except NB and MLP, outperform the HAR model. The notable
difference between HAR and Logistic Regression (LR), both linear models, lies
in the number of features used. LR's outperformance over HAR underscores the
significance of the additional economic variables included in our study for
enhancing forecasting performance. Among the models, Decision Tree (DT)
and AdaBoost (AB) stand out in accuracy, corroborated by information
coefficients. For market timing, DT, AB, Random Forest (RF), and Ensemble
(ENS) models all demonstrate over 10% market timing with robust statistical
significance, whereas NB, MLP, and HAR show low market timing abilities.

<Insert Table 2>

Secondly, Panel B presents pairwise accuracy tests, indicating that DT
outperforms all other models in accuracy. The Ensemble model surpasses only
NB and MLP models.

In summary, the decision tree family models, including DT, RF, and AB,
exhibit the highest prediction accuracy. These models consistently outperform
the HAR model in predicting the directional movement of the VIX, highlighting
their effectiveness in volatility forecasting.

5 Economic Evaluation: A Simulated Strategy

Diebold and Mariano (1995) stressed that the economic impact of forecast
errors is context-dependent, influencing decision-making. To assess the
economic value of our forecasts, we simulate a long-short trading strategy based
on daily VIX prediction signals at market close. Although VIX itself is not
directly tradable, this simulation tests signal accuracy by size-weighting
returns.
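A minimal sketch of such a simulation is shown below. It assumes the position is +1 when the forecast is "up" and -1 when it is "down", that the daily return is the position times the percentage change in the VIX, and that annualization uses 250 trading days without compounding; these conventions and all names are illustrative rather than taken from the paper.

```python
import math


def long_short_returns(signals, vix):
    """Daily returns of a long/short position on the VIX change.

    signals[t] is the forecast made at the close of day t for day t+1.
    """
    returns = []
    for t, signal in enumerate(signals):
        pct_change = vix[t + 1] / vix[t] - 1.0
        position = 1.0 if signal == "up" else -1.0
        returns.append(position * pct_change)
    return returns


def annualized_stats(returns, days_per_year=250):
    """Return (annualized mean return, annualized Sharpe ratio)."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    sharpe = mean / math.sqrt(var) * math.sqrt(days_per_year)
    return mean * days_per_year, sharpe


# Invented levels: the VIX moves +10%, -10%, +10%; the last forecast is wrong.
vix_path = [20.0, 22.0, 19.8, 21.78]
daily = long_short_returns(["up", "down", "down"], vix_path)
ann_return, sharpe = annualized_stats(daily)
```

Because each day's return scales with the size of the VIX move, a correct call on a large move contributes more than a correct call on a small one, which is the sense in which returns weight accuracy by size.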

We evaluate out-of-sample returns (Figure 7) across models over 11
years. Higher accuracy generally correlates with better returns, though
exceptions exist, such as Decision Tree (DT) yielding lower returns than
Random Forest (RF) and Adaptive Boosting (AB), despite superior accuracy (t-
value: 3.91). Table 3 shows models like ENS, RF, and AB exhibit high Sharpe
ratios, with AB having the lowest drawdown.

<Insert Figure 7 and Table 3>

Most models generate over 50 basis points in daily returns, annualizing to
125%. However, actual returns are lower due to tracking errors and transaction
costs from rebalancing. This economic analysis shows that accurate VIX
forecasts are more effective when market movements are larger, making them
valuable for derivative portfolios and aiding market makers in SPX and VIX
futures strategies.³

6 Source of Predictability

This section explores the sources of predictability through various experiments.
We assess model performance during major volatility events (6.1) and analyze
the impact of economic variables using variable importance (6.2). Additionally,
we examine two key model aspects: closed-loop training (6.3) and balanced
sampling (6.4). We also compare multi-category versus binary predictions (6.5)
and analyze prediction persistence by evaluating delays in predictors (6.6).

6.1 Volatility Spikes

Market volatility spikes, often caused by unforeseen events like the 2010 Flash
Crash or the 2021 GameStop surge, create significant tail risks, particularly for
shorting VIX. These spikes are largely exogenous and difficult to predict. We
assess model performance around volatility spikes, defined as VIX changes over
20%. Table 4 shows a higher frequency of positive spikes (64) than negative
ones (10). Models perform well during negative spikes with error rates under
20%, but struggle with positive spikes, where error rates often exceed 50%,
except for Naïve Bayes (NB).
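The spike analysis above can be reproduced in outline as follows. The sketch assumes a spike is a daily percentage change in the VIX exceeding 20% in absolute value; the function name and the sample data are hypothetical.

```python
def spike_error_rates(vix, predicted, threshold=0.20):
    """Count spikes and forecast error rates separately by spike direction.

    predicted[t] is the forecast direction for the move from day t to t+1.
    Returns {"positive": (count, error_rate), "negative": (count, error_rate)}.
    """
    stats = {"positive": [0, 0], "negative": [0, 0]}  # [spikes, errors]
    for t, forecast in enumerate(predicted):
        change = vix[t + 1] / vix[t] - 1.0
        if abs(change) <= threshold:
            continue  # not a spike day
        kind = "positive" if change > 0 else "negative"
        actual = "up" if change > 0 else "down"
        stats[kind][0] += 1
        stats[kind][1] += forecast != actual
    return {kind: (n, errs / n if n else None) for kind, (n, errs) in stats.items()}


# Invented levels with moves of +25%, -24%, +2.6%, +23.1%.
levels = [20.0, 25.0, 19.0, 19.5, 24.0]
result = spike_error_rates(levels, ["down", "down", "up", "up"])
print(result)  # {'positive': (2, 0.5), 'negative': (1, 0.0)}
```

Splitting by direction matters because, for a strategy that shorts the VIX, a missed positive spike is the costly error.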

<Insert Table 4>

³ We demonstrate two more realistic investment tests with some tradable VIX derivatives,
considering transaction costs in an online appendix.

Table 5 tracks recovery from initial daily losses (~30%) after spikes.
Decision-tree models, especially Adaptive Boosting (AB), show strong recovery,
recouping 93% of losses within 20 days, compared to 23% for the HAR model.
Most models fully recover within 60 days, except NB and MLP.

<Insert Table 5>

In conclusion, while VIX spikes are unpredictable, decision-tree ML algorithms
demonstrate strong adaptability and recovery, highlighting their resilience in
volatile conditions.

6.2 Variable Importance and Variable Selections

Interpretable models in ML highlight the sources of predictability through
variable importance metrics. We use the ExtraTreesClassifier, an ensemble
method, to assess feature importance in VIX signal classification. This method
calculates feature importance based on node impurity reduction, indicating a
feature's influence on prediction accuracy.
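To make "node impurity reduction" concrete, the sketch below computes the impurity decrease of a single split. It assumes Gini impurity, which the text does not specify; the function names and toy labels are illustrative only. In scikit-learn, the tree ensemble class is named `ExtraTreesClassifier`, and its `feature_importances_` attribute reports impurity-based importances aggregated over the trees.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(parent, left, right):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Toy node with three "up" and three "down" days (hypothetical labels).
parent = ["up", "up", "up", "down", "down", "down"]
good_split = impurity_decrease(parent, ["up", "up", "up"], ["down", "down", "down"])
weak_split = impurity_decrease(parent, ["up", "up", "down"], ["up", "down", "down"])
print(good_split, weak_split)
```

A feature that produces splits like `good_split` accumulates a large importance, while one that only produces splits like `weak_split` contributes little.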

Table 6 shows the top 20 variables by importance. The weekly jobless claims
report consistently ranks highest across models, revealing its strong impact on
market volatility. Seasonality variables like day of the week and days until VIX
contract expiry also contribute significantly, reflecting patterns in investor
behavior linked to the economic cycle. Technical indicators such as SPX’s
Relative Strength Index (RSI) and commodities like oil and gold are also
influential.

<Insert Table 6>

While the top two variables remain consistent across models, other
rankings show variability over time, with correlations between early and later
models ranging from 63% to 77%. This highlights the need for regular model
updates. Table 7 confirms that predictability stems from both technical
indicators and fundamental factors like macroeconomic variables. Categories
with individually small contributions, such as Macroeconomics, collectively
provide significant insights. Notably, the Vix Tech group contributes only
11.46% of total importance, suggesting that focusing solely on VIX's historical
data misses key explanatory variables.
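The stability of importance rankings between an early and a later model can be summarized with a rank correlation. The text does not say which correlation measure underlies the 63% to 77% range, so the Spearman coefficient used here is an assumption, as are the toy rankings.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Spearman correlation for two tie-free rankings of the same variables."""
    n = len(rank_a)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * squared_diffs / (n * (n * n - 1))


# Hypothetical ranks of five variables in an early and a later model.
early = [1, 2, 3, 4, 5]
later = [1, 2, 5, 3, 4]
rho = spearman_from_ranks(early, later)
print(rho)
```

Here the top two variables keep their positions while the rest reshuffle, which yields a correlation well below one.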

<Insert Table 7>

6.3 One-Time Model vs. Dynamic Continuous Learning Model

To demonstrate the benefits of our continuous learning framework, we
compare it to a one-time model approach where the model is built once and
used throughout the 11-year out-of-sample period without updates. Table 8
shows that in the one-time model setup, only advanced models like Random
Forest (RF), AdaBoost (AB), and Multi-Layer Perceptron (MLP) achieve
forecast accuracy above 50%, resulting in modest gains from dynamic
retraining.

<Insert Table 8>

The most significant improvements from dynamic learning are seen in
the Decision Tree (DT) model, with accuracy rising from 49% to 58%, and the
Ensemble model also benefits. However, frequent retraining worsens
performance for Naïve Bayes (NB). These results highlight the importance of
updating models with new data, particularly for decision tree-based models.
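The contrast between a one-time model and a dynamically retrained one can be sketched as a walk-forward loop. This section does not describe the retraining trigger, so the rule below (retrain when rolling accuracy drops under a floor) is purely hypothetical, and the "model" is a toy majority-class predictor rather than any of the paper's algorithms.

```python
from collections import deque


def majority_model(train_labels):
    """Toy model: always predicts the majority class of its training window."""
    ups = sum(train_labels)
    return ups * 2 >= len(train_labels)  # True means "predict up"


def walk_forward(labels, initial=6, window=4, min_accuracy=0.5, retrain=True):
    """Return (accuracy, number of retrainings) over the out-of-sample part."""
    prediction = majority_model(labels[:initial])
    recent = deque(maxlen=window)  # rolling record of hits and misses
    hits, retrainings = 0, 0
    for t in range(initial, len(labels)):
        hit = prediction == labels[t]
        hits += hit
        recent.append(hit)
        # Hypothetical trigger: retrain when rolling accuracy falls below the floor.
        if retrain and len(recent) == window and sum(recent) / window < min_accuracy:
            prediction = majority_model(labels[t - initial + 1 : t + 1])
            retrainings += 1
            recent.clear()
    return hits / (len(labels) - initial), retrainings


# A regime shift: six up days followed by ten down days.
regime_shift = [True] * 6 + [False] * 10
one_time = walk_forward(regime_shift, retrain=False)
dynamic = walk_forward(regime_shift, retrain=True)
print(one_time, dynamic)
```

The one-time model never adapts to the new regime, while the retrained one recovers after a single update, which mirrors the qualitative pattern in Table 8.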

6.4 Balanced vs Unbalanced Sampling

A key concern with nonlinear models is the risk of one-sided predictions, where
they perform well in training but poorly out-of-sample. To address this, we use
a "balanced" sample with equal up and down observations, selecting 4000 data
points. This section compares the results of using balanced and unbalanced
samples for the AdaBoost (AB) model.
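One way to build such a balanced sample is to walk backwards through the history and keep the most recent observations of each class until both halves are full. The text does not state how the balanced points are chosen, so this selection rule and the function name are assumptions.

```python
def balanced_recent_sample(labels, size):
    """Indices of the most recent size/2 up days and size/2 down days, in time order."""
    half = size // 2
    ups, downs = [], []
    for i in range(len(labels) - 1, -1, -1):  # walk backwards from the newest day
        bucket = ups if labels[i] else downs
        if len(bucket) < half:
            bucket.append(i)
        if len(ups) == half and len(downs) == half:
            break
    return sorted(ups + downs)


# Toy history (True = VIX up) that is dominated by down days.
history = [True, False, False, True, False, False, False, True, False, False]
idx = balanced_recent_sample(history, size=4)
print(idx)
```

With `size=4000` the same routine would return 2000 up days and 2000 down days, provided the history contains enough of each.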

Table 9 shows that while the unbalanced sample has a slightly higher correct
ratio (0.5714 vs. 0.5691) and information coefficient, its timing ratio is
markedly lower (0.0895 vs. 0.1224), indicating worse performance in predicting
both up and down movements. Economically, the unbalanced model yields lower
simulated daily returns (49 basis points vs. 90) and experiences a deeper
average yearly drawdown (-58% vs. -30%).

<Insert Table 9>

In conclusion, balanced samples improve the accuracy and robustness of
predictions, reducing bias and enhancing economic performance across
different market conditions.

6.5 The Size of VIX Changes and Multi-Category Forecasting

A limitation of binary forecasting is that it doesn't account for the magnitude of
market movements. High accuracy may lack economic significance if the model
predicts minor changes correctly but errs during significant shifts. Our
simulated investment strategy (Table 3) highlighted the importance of
return-weighted signal quality. Here, we further explore the relationship
between directional predictions and VIX movement size in two ways.

First, we introduce a 'size timing' measure, categorizing market changes
into large and small based on the upper and lower 15th percentiles of the past 250
observations. We assess accuracy in predicting big versus small changes.
Second, we experiment with a multi-category prediction model (4D) with
categories: up-small, up-big, down-small, and down-big.
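The four-category target can be sketched as follows. The percentile interpolation method, the treatment of a zero change as "down", and the function names are assumptions; the 250-observation lookback and the 15% tails follow the text.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def label_4d(changes, lookback=250, tail=15):
    """Label each day after the first `lookback` days as up/down and big/small."""
    labels = []
    for t in range(lookback, len(changes)):
        window = sorted(changes[t - lookback:t])  # past observations only
        lower = percentile(window, tail)
        upper = percentile(window, 100 - tail)
        direction = "up" if changes[t] > 0 else "down"
        size = "big" if changes[t] <= lower or changes[t] >= upper else "small"
        labels.append(f"{direction}-{size}")
    return labels


# Toy series with a short lookback so the example stays readable.
toy = [-0.05, -0.03, -0.01, 0.01, 0.03, 0.05, 0.04, -0.002, -0.06, 0.02]
labels = label_4d(toy, lookback=6)
print(labels)
```

Because the thresholds use only past observations, the labels can be built without look-ahead.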

Table 10 shows that the AdaBoost (AB) model’s 4D approach doesn't
improve overall accuracy compared to the binary model (2D). In fact, the
market timing ratio is lower in the 4D model, and the binary model better
captures large movements.

<Insert Table 10>

These results suggest no clear benefit from more granular forecasts,
likely due to the 4D model's need for larger sample sizes, which our dataset
cannot currently support.

6.6 Persistence of the Prediction

This section examines the persistence of our model's predictions by addressing
two key points: the impact of data delays (such as the 15-minute lag in equity
data) and the stability of predictions over time. We assess the effect of delayed
signals and explore the model's reliance on recent data for short-term
forecasting.

We conduct three tests:

• One-Day Delay Accuracy: Assessing the impact of a one-day delay in
applying predictions.

• Next Day’s Open-to-Open Accuracy: Evaluating the prediction
accuracy for the next day’s open-to-open VIX movement.

• Direct Open-to-Open Prediction: Predicting the next day’s
open-to-open VIX movement using current day’s close data.

Table 11 shows that applying a one-day delay reduces accuracy by about
3 percentage points (from 56.9% to 54.1%), with notable declines in the
information coefficient and timing ratio, resulting in a 70 basis point drop in
mean daily returns (from 90 to 20 basis points). Similarly, predicting the next
day’s open-to-open movement with the current day’s close data shows slightly
lower accuracy (56.5%) but still delivers 69 basis points daily.
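The one-day delay test amounts to scoring each signal against the move that occurs one day later than intended. The sketch below shows that alignment; the function name and the toy series are hypothetical.

```python
def directional_accuracy(signals, realised_up, delay=0):
    """Share of days on which the signal issued `delay` days earlier matches the move."""
    pairs = list(zip(signals[: len(signals) - delay], realised_up[delay:]))
    return sum(s == r for s, r in pairs) / len(pairs)


# Toy signals and realised directions (True = VIX up).
signals = [True, False, True, True, False, True]
realised_up = [True, False, True, False, False, True]
timely = directional_accuracy(signals, realised_up, delay=0)
delayed = directional_accuracy(signals, realised_up, delay=1)
print(timely, delayed)
```

In this toy series the timely signal is right five times out of six, while the same signal applied a day late is never right, an exaggerated version of the decay reported in Table 11.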

<Insert Table 11>

These findings confirm that timely use of the signal is critical for
maximizing forecast accuracy. For those constrained by data delays, directly
predicting open-to-open movements remains a viable, albeit slightly less
accurate, alternative. The results affirm the model's robustness, with better
performance linked to more immediate data and consistent alignment between
training and application targets.

7 Conclusions

Our study delves into the predictability of the VIX and identifies its underlying
sources, demonstrating that daily VIX can be predicted with greater accuracy
than existing literature suggests, and these predictions are economically
significant. The predictability arises from two main areas: the crucial role of
human input in selecting economically relevant variables and translating the
prediction task into a format interpretable by machines, and the forecasting
framework itself. Our extensive inclusion of economic and financial data,
particularly the weekly jobless claims, contributes new insights to the literature
on the determinants of market volatility and fear. This novel finding highlights
an underexplored link between the labor market and overall market volatility,
suggesting avenues for future theoretical and empirical research.

The development of our automated and adaptive training framework
based on AutoML, focusing on explainability and trackability, addresses key
challenges in applying ML to financial forecasting. This framework reduces
human intervention, enhances algorithm selection and model tuning efficiency,
ensures robust model validation, and includes proactive monitoring and
retraining processes. These features collectively enhance the model's
adaptability to market conditions and reduce the likelihood of biased
predictions.

Furthermore, our study's findings have substantial implications for
quantitative investment and risk management. The ability to accurately predict
the VIX equips investors and risk managers with a vital tool for making
informed decisions about asset allocation, hedging strategies, and risk
exposure. The insights gained from key economic variables, especially the
weekly jobless claims, provide a deeper understanding of market dynamics,
enabling more sophisticated risk management approaches. The automated and
adaptive ML framework developed in our study not only augments prediction
accuracy but also adapts to evolving market conditions, ensuring robustness in
investment and risk management strategies amidst market volatility.

In summary, our research contributes significantly to the fields of
quantitative investment and risk management, offering advanced
methodologies for predicting market volatility and enhancing the
understanding of its determinants. This work paves the way for more effective
and adaptive investment strategies in the financial industry, particularly in the
face of unpredictable market movements.

Disclosure of Interest

There is no interest to declare.

Disclosure of Funding

No funding was received.

References

Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Paul Labys,
2003, Modeling and forecasting realized volatility, Econometrica 71,
579–625.
Bali, T. G., and H. Zhou, 2016, Risk, uncertainty, and expected returns.
Journal of Financial and Quantitative Analysis 51, 707–735.
Ballings, Michel, Dirk van den Poel, Nathalie Hespeels, and Ruben Gryp,
2015, Evaluating multiple classifiers for stock price direction prediction,
Expert Systems with Applications 42, 7046–7056.
Bergmeir, C., R. Hyndman, and B. Koo, 2018, A note on the validity of
cross-validation for evaluating autoregressive time series prediction,
Computational Statistics & Data Analysis 120, 70–83.
https://doi.org/10.1016/j.csda.2017.11.003
BlackRock, 2019, Artificial Intelligence and Machine Learning in Asset
Management,
https://www.blackrock.com/corporate/literature/whitepaper/viewpoint-artificial-intelligence-machine-learning-asset-management-october-2019.pdf
[accessed March 2021].
Bodie, Zvi, Alex Kane, and Alan J. Marcus, 2018, Investments (McGraw Hill).
Bollerslev, Tim, 1986, Generalized autoregressive conditional
heteroskedasticity, Journal of Econometrics 31, 307–327.
Booth, Ash, Enrico Gerding, and Frank McGroarty, 2014, Automated trading
with performance weighted random forests and seasonality, Expert
Systems with Applications 41, 3651–3661.
Booth, Ash, Enrico Gerding, and Frank McGroarty, 2015, Performance-
weighted ensembles of random forests for predicting price impact,
Quantitative Finance 15, 1823–1835.
Breiman, Leo, 2001, Random forests, Machine Learning 45, 5–32.
Bucci, Andrea, 2020, Realized Volatility Forecasting with Neural Networks,
Journal of Financial Econometrics 18, 502–531.
CFA Institute, 2020, Artificial Intelligence and Machine Learning in Asset
Management,
https://www.cfainstitute.org/-/media/documents/book/rf-lit-review/2020/rflr-artificial-intelligence-in-asset-management.ashx
[accessed Jan 2021].
Corsi, Fulvio, 2009, A simple approximate long-memory model of realized
volatility, Journal of Financial Econometrics 7, 174–196.
Degiannakis, Stavros, George Filis, and Hossein Hassani, 2018, Forecasting
global stock market implied volatility indices, Journal of Empirical
Finance 46, 111–129.
Diebold, F. X. and R. S. Mariano, 1995, Comparing predictive accuracy,
Journal of Business and Economic Statistics, 13, 253-63.
Donaldson, R. Glen, and Mark Kamstra, 1997, An artificial neural network-
GARCH model for international stock return volatility, Journal of
Empirical Finance 4, 17–46.
Engle, Robert F, 1982, Autoregressive Conditional Heteroscedasticity with
Estimates of the Variance of United Kingdom Inflation, Econometrica
50, 987–1007.
Erickson, Nick, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy,
Mu Li, and Alexander Smola, 2020, AutoGluon-Tabular: Robust and Accurate
AutoML for Structured Data, arXiv:2003.06505 [stat.ML],
https://arxiv.org/abs/2003.06505
Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel,
Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel
Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent
Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau,
Matthieu Brucher, Matthieu Perrot, and Édouard Duchesnay, 2018,
Scikit-learn: Machine Learning in Python, arXiv:1201.0490 [cs.LG],
https://arxiv.org/abs/1201.0490
Fernandes, M., M.C. Medeiros, and M. Scharth, 2014, Modeling and
predicting the CBOE market volatility index, Journal of Banking and
Finance 40, 1–10.
Guidolin, Massimo, and Allan Timmermann, 2003, Option prices under
Bayesian learning: Implied volatility dynamics and predictive densities,
Journal of Economic Dynamics and Control 27, 717–769.
Hamid, Shaikh A., and Zahid Iqbal, 2004, Using neural networks for
forecasting volatility of S&P 500 Index futures prices, Journal of
Business Research 57, 1116–1125.
Harvey, D., S. Leybourne, and P. Newbold, 1997, Testing the equality of
prediction mean squared errors, International Journal of Forecasting
13, 281-91.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani, 2013,
An Introduction to Statistical Learning: With Applications in R
(Springer Science & Business Media, New York).
Jurado, K., Ludvigson, S. C., and Ng, S., 2015, Measuring
uncertainty. American Economic Review 105, 1177–1216.
Konstantinidi, Eirini, George Skiadopoulos, and Emilia Tzagkaraki, 2008, Can
the evolution of implied volatility be forecasted? Evidence from
European and US implied volatility indices, Journal of Banking and
Finance 32, 2401–2411.
Kristjanpoller, Werner, and Marcel C. Minutolo, 2018, A hybrid volatility
forecasting framework integrating GARCH, artificial neural network,
technical analysis and principal components analysis, Expert Systems
with Applications 109, 1–11.
Maciel, Leandro, Fernando Gomide, and Rosangela Ballini, 2016, Evolving
Fuzzy-GARCH Approach for Financial Volatility Modeling and
Forecasting, Computational Economics 48, 379–398.
Paye, Bradley S., 2012, “Déjà vol”: Predictive regressions for aggregate stock
market volatility using macroeconomic variables, Journal of Financial
Economics 106, 527–546.
Psaradellis, I., and G. Sermpinis, 2016, Modelling and trading the U.S. implied
volatility indices. Evidence from the VIX, VXN and VXD indices,
International Journal of Forecasting 32, 1268–1283.
Rasekhschaffe, Keywan Christian, and Robert C. Jones, 2019, Machine
Learning for Stock Selection, Financial Analysts Journal 75, 70–88.
Rhoads, Russell, 2011, Trading VIX Derivatives: Trading and Hedging
Strategies Using VIX Futures, Options, and Exchange-Traded Notes,
John Wiley & Sons, Inc.
Rhoads, Russell, 2020, The VIX Trader’s Handbook: The history, patterns,
and strategies every volatility trader needs to know, Harriman House.
Whaley, Robert E., 2000, The investor fear gauge: Explication of the CBOE
VIX, Journal of Portfolio Management 26, 12–17.

Figures and Tables

Figure 1. The adaptive continuous learning methodology

Figure 2. VIX time series plot
The chart visualizes the VIX sample from 1995 to 2020, segmented into three periods. The blue
line (In-Sample Period) covers data up to 2009 for initial training and validation. The red line
(Out-of-Sample Period) shows the final 400 observations of 2009 for validation and model
selection. The green line (Implementation Period) spans 2010 to 2020 for testing the model. A
vertical dashed line marks the end of 2009, separating the validation phases from the
implementation stage.

Figure 3. Training and validation accuracy for modelling at the end
of 2009
This figure reports the accuracy ratios for the training and validation of each model.

Figure 4. Out-of-sample implementation accuracy by model
This figure reports the box plot of the yearly correct ratio by model. The correct ratio is obtained
from the out-of-sample forecast from 2010 to 2020.

Figure 5. Implementation accuracy dynamics

This figure reports the cumulative correct predictions for each model. It can
be interpreted as the number of net correct predictions at any given point in
time.

Figure 6. Retraining, validation, and implementation accuracy
during the closed-loop implementation period
This figure reports the distribution of accuracy for the models used in the implementation stage
for each algorithm. The number of retrainings for each algorithm is reported at the bottom of the figure:

Algorithm               1.NB  2.LR  3.DT  4.RF  5.AB  6.MLP  7.ENS
Number of retrainings     23    20     9    16    10     16      7

Figure 7. Out of sample simulated long-short strategy performance
This figure reports the distributions (box plots) of the mean daily return in the 11 years between
2010 and 2020 for each algorithm. The return is calculated by applying the predicted signal to
the next day's VIX return. The diamond indicates the mean return.

Table 1. Summary of variables by groups

Type | Variable Descriptions | Num | Economic Justification
Commodity | Bloomberg Commodity Index, WTI Crude Future, Brent Crude Future, Copper Future, Gold 100 Oz Future, etc. | 9 | Reflect market sentiment and economic conditions; e.g., oil prices influenced by geopolitical events.
Currency | Euro Spot, Japanese Yen Spot, British Pound Spot, Australian Dollar Spot, China Renminbi Spot, etc. | 7 | Indicate changes in economic conditions and investor sentiment across different regions.
Govt & Corp Bond | US Govt bonds (2 Yr, 5 Yr, 3 Yr, 12 Mth, 3 Mth, 6 Mth, 30 Yr), TII bonds, Corporate bonds, etc. | 14 | Reflect interest rate expectations and economic outlook; yield curve as predictor of economic activity.
Macroeconomic | ISM PMI, Unemployment Rate, PPI, Retail & Food Service, GDP, Labor Productivity, CPI, Consumer Confidence, etc. | 57 | Provide insights into overall economic health and consumer confidence, critical for predicting market volatility.
Major Equities | IBM, Apple, Amazon, General Electric, Microsoft, Bristol-Myers Squibb, FedEx, Nvidia, etc. | 12 | Represent significant market sectors; high-profile companies as market movers.
Seasonality | Day of the month, Day of the week, Week of the year, Month of the year, Day to next expired Wednesday | 5 | Seasonality effects can influence market behavior; certain times are associated with higher trading volumes.
SPX Member Tech | Technical indicators for S&P 500 Index, Bollinger Bands, Moving Averages, MACD, RSI, New highs/lows, etc. | 35 | Technical analysis indicators predict future price movements based on historical price patterns.
SPX Options & Futures | Historical call implied volatility, Put/call volume ratios, Option volumes, Open interest, Futures volume, etc. | 16 | Reflect market expectations of future volatility and risk, providing direct insight into investor sentiment.
SPX Subindex | Industry-specific indices within S&P 500 (banks, retailing, automobiles, transportation, software, insurance, etc.) | 31 | Sector-specific performance highlights trends and risks in different parts of the economy.
SPX Tech | Volatility measures, RSI, ARMS index, Money flow, Dividend per share, Volume measures, Moving averages, etc. | 39 | Provide detailed insights into market trends and investor behavior, essential for forecasting volatility.
Vix Tech | VIX-related technical indicators, Moving averages, RSI, Max/min days, Price change percentages, etc. | 20 | Understanding the historical dynamics and technical patterns of the VIX itself is crucial for predicting future volatility.
World Equity Index | Global indices (Dow Jones, Nikkei, Euro Stoxx, DAX, NASDAQ, FTSE, MSCI, etc.) | 18 | Global market trends influence domestic market volatility; international developments affect investor sentiment.

Table 2. Forecast accuracy summary
Panel A reports the correct ratio, information coefficient and timing ratio in the 11-year
implementation period. The information coefficient is calculated by (2×Correct_ratio)−1. The
timing ratio is calculated as (true positive ratio + true negative ratio) -1. The HLN column
reports the Harvey, Leybourne, and Newbold (1997) test on the difference in accuracy rate
between the model on the left and the HAR model in the daily predictions of the 11-year
out-of-sample period. t-tests for the information coefficients and timing ratio are performed on the
variations of the statistics among the 11 annual observations. Panel B reports the HLN test
statistics on the difference in accuracy rate between the models in the rows and columns. ***,
**, and * indicate statistical significance at 1%, 5% and 10% level respectively.
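The two summary statistics defined in this caption can be computed from a confusion matrix, as in the sketch below. It assumes the true positive ratio is TP / (TP + FN) and the true negative ratio is TN / (TN + FP); the function name and toy data are illustrative.

```python
def accuracy_metrics(predicted_up, realised_up):
    """Return (correct ratio, information coefficient, timing ratio)."""
    pairs = list(zip(predicted_up, realised_up))
    tp = sum(1 for p, r in pairs if p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    correct_ratio = (tp + tn) / len(pairs)
    information_coefficient = 2 * correct_ratio - 1
    # Assumed definitions: TPR = TP / (TP + FN), TNR = TN / (TN + FP).
    # Both classes must be present, otherwise a denominator is zero.
    timing_ratio = tp / (tp + fn) + tn / (tn + fp) - 1
    return correct_ratio, information_coefficient, timing_ratio


# A one-sided forecaster on a sample with six up days and four down days.
realised = [True] * 6 + [False] * 4
always_up = [True] * 10
cr, ic, tr = accuracy_metrics(always_up, realised)
print(cr, ic, tr)
```

The one-sided forecaster earns a positive information coefficient simply because up days are more common, yet its timing ratio is zero, which is why the two measures are reported side by side.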

Panel A. Accuracy Rate

          Correct ratio   Compared to HAR   Information coefficient   Timing ratio
Models    Mean            HLN               Mean       t              Mean    t
1.NB      0.487           -3.91 ***         -0.0262    -1.36          0.020   1.35
2.LR      0.552            2.13 **           0.1046     4.19 ***      0.097   4.94 ***
3.DT      0.582            4.45 ***          0.1634     6.93 ***      0.126   5.25 ***
4.RF      0.560            3.66 ***          0.1194    10.64 ***      0.122   9.60 ***
5.AB      0.569            4.12 ***          0.1383     8.01 ***      0.122   7.38 ***
6.MLP     0.530            0.39              0.0595     2.33 **       0.057   2.61 **
7.ENS     0.566            3.84 ***          0.1319     7.19 ***      0.119   6.41 ***
8.HAR     0.525                              0.0502     2.31 **       0.072   4.02 ***

Panel B. Pairwise HLN test on the accuracy rate difference between the row and
column models

         2.LR        3.DT        4.RF        5.AB        6.MLP       7.ENS       8.HAR
1.NB     -0.065***   -0.095***   -0.073***   -0.082***   -0.043***   -0.079***   -0.038***
2.LR                 -0.030**    -0.008      -0.017       0.022      -0.014       0.027**
3.DT                              0.022**     0.013       0.052***    0.016       0.057***
4.RF                                         -0.009       0.030**    -0.006       0.035***
5.AB                                                      0.039***    0.003       0.044***
6.MLP                                                                -0.036***    0.005
7.ENS                                                                             0.041***

Table 3. Out of sample simulated long-short strategy performance
This table reports the mean daily return, the Sharpe ratio, and the average maximum annual
percentage drawdown (MDD). t-tests are performed on the variations of the statistics among
the 11 annual observations. ***, **, and * indicate statistical significance at 1%, 5% and 10%
level respectively.

          Daily Return                   Yearly MDD
Model     Mean     t          Sharpe     Mean    t
1.NB      0.0029   2.79**     0.85       -60%    -4.39***
2.LR      0.0060   5.06***    1.54       -52%    -2.87**
3.DT      0.0056   4.30***    1.27       -64%    -2.57**
4.RF      0.0078   6.85***    2.05       -47%    -2.84**
5.AB      0.0090   5.80***    1.73       -30%    -3.31***
6.MLP     0.0045   3.91***    1.18       -50%    -3.24***
7.ENS     0.0074   7.42***    2.24       -89%    -1.92*
HAR       0.0053   3.89***    1.15       -56%    -3.63***

Table 4. Prediction performance on VIX spike days
This table reports the model forecasting performance on negative and positive VIX spike
days. Spikes are defined as VIX movements greater than or equal to 20%. It reports the number of
days on which spikes occur between 2010 and 2020. Error days (% Err) reports the number
(proportion) of days on which the model makes incorrect predictions. The return columns report the
mean, minimum and maximum return on those spike days for each model.

                    Number                          Return
Models   Spikes     of days   Error days   % Err    Mean      Min       Max
1.NB     negative   10        2            20%       0.1373   -0.2591    0.2957
         positive   64        23           36%       0.0781   -1.1560    0.5000
2.LR     negative   10        2            20%       0.1432   -0.2337    0.2957
         positive   64        38           59%      -0.0800   -1.1560    0.5000
3.DT     negative   10        1            10%       0.1892   -0.2327    0.2957
         positive   64        52           81%      -0.2004   -1.1560    0.4933
4.RF     negative   10        0            0%        0.2357    0.2050    0.2957
         positive   64        42           66%      -0.1161   -1.1560    0.4638
5.AB     negative   10        2            20%       0.1373   -0.2591    0.2957
         positive   64        40           63%      -0.0389   -0.5000    1.1560
6.MLP    negative   10        4            40%       0.0449   -0.2957    0.2696
         positive   64        32           50%      -0.0037   -1.1560    0.4933
7.ENS    negative   10        2            20%       0.1352   -0.2696    0.2957
         positive   64        45           70%      -0.1315   -1.1560    0.5000
HAR      negative   10        3            30%       0.0958   -0.2591    0.2957
         positive   64        40           63%      -0.0884   -1.1560    0.4933
SO       negative   10        0            0%        0.2357    0.2050    0.2957
         positive   64        64           100%     -0.3126   -1.1560   -0.2022
SVM      negative   10        2            20%       0.1381   -0.2591    0.2957
         positive   64        48           75%      -0.1553   -0.5000    1.1560

Table 5. Return performance following the VIX spikes days
This table reports the mean initial losses on the spike day for the incorrect predictions.
It also reports the cumulated profit and loss 20 (60) days after
and including the spike day, and the number of incorrect predictions (N).

         Initial losses   20 days after initial           60 days after initial
         on spike day     Cumulated   Recover             Cumulated   Recover
Models                    P&L         percentage          P&L         percentage
         (1)              (2)         ((1)-(2))/(1)       (4)         ((1)-(4))/(1)     N
1.NB     -0.3263          -0.2709     3%                  -0.2566     15%               23
2.LR     -0.3306          -0.1538     46%                  0.037      111%              38
3.DT     -0.3157          -0.1808     50%                  0.1782     161%              52
4.RF     -0.3266          -0.1214     63%                  0.1654     160%              42
5.AB     -0.2812          -0.0221     93%                  0.2702     206%              40
6.MLP    -0.3163          -0.3526     -6%                 -0.2178     32%               32
7.ENS    -0.3158          -0.119      62%                  0.0793     135%              45
HAR      -0.3208          -0.2495     23%                  0.0303     106%              40

Table 6. Top 20 Variable Importance by ranking

This table reports the top 20 variables according to their ranking in each model and
all models.
Panel A. Rank by Average of Variable Importance in the initial training
Rank Name Full Category Rank in retrain
1 US Initial Jobless Claims SA change Macroeconomic 1
2 day of the week Seasonality 2
3 SPX Index pct members with new 52w highs SPX Member Tech 28
4 SPX Index Volume SPX Tech 45
5 S5TELS Index SPX Subindex 64
6 VIX Index 60d Vix Tech 132
7 SPX Index pct members with new 8w highs SPX Member Tech 35
8 GBP Currency Currency 53
9 S5AUCO Index SPX Subindex 85
10 VIX Index RSI 14d Vix Tech 27
11 VIX Index days diff min30 Vix Tech 15
12 SPX Index pct memb px blw lwr boll band SPX Member Tech 144
13 S 1 COMB Comdty Commodity 36
14 day of the month Seasonality 4
15 SPX index days diff max30 Vix Tech 33
16 SPX Index volatility 260D SPX Tech 112
17 SX5E Index World Equity Index 103
18 US CPI Urban Consumers MoM SA Macroeconomic 194
19 VIX Index RSI 30d Vix Tech 9
20 S5INDU Index SPX Subindex 183
Panel B. Rank by Average of Variable Importance in the Retraining
Rank Name Full Category Rank in the initial training
1 US Initial Jobless Claims SA change Macroeconomic 1
2 day of the week Seasonality 2
3 SPX index RSI3d/RSI14d SPX Tech 98
4 day of the month Seasonality 14
5 VIX Index RSI 9d Vix Tech 41
6 VIX Index RSI3d/RSI14d Vix Tech 58
7 Day to maturity at next 3rd Wednesday Seasonality 92
8 SPX Index RSI 3D SPX Tech 24
9 VIX Index RSI 30d Vix Tech 19
10 VIX Index RSI 3d Vix Tech 112
11 CL1 COMB Comdty Commodity 88
12 CO1 COMB Comdty Commodity 50
13 SPX index days diff min30 SPX Tech 86
14 SPX Index RSI 30D SPX Tech 85
15 VIX Index days diff min30 Vix Tech 11
16 SPX Index pct members with new 24w highs SPX Member Tech 163
17 XAU Currency Commodity 221
18 SPX Index pct members with new 12 wk lows SPX Member Tech 31
19 VIX Index days diff max30 Vix Tech 48
20 SPX Index RSI 14D SPX Tech 60

Table 7. Variable importance by category for All model summary
This table reports the statistics for the average variable importance of all AB models used in the
implementation stage including the one at the end of 2009. It reports the mean, minimum,
maximum and sum variable importance and number of variables in each category. The rows
in the table are ordered by the sum column. The conditional formatting with green is higher
and red is lower in value within each column compared across different categories.

Category Mean Min Max Sum N


SPX Tech 0.0039 0.0031 0.0055 0.1981 51
Macroeconomic 0.0023 0.0003 0.0133 0.1432 61
SPX Subindex 0.0038 0.0033 0.0045 0.1238 33
Vix Tech 0.0041 0.0029 0.0054 0.1146 28
SPX Member Tech 0.0039 0.0032 0.0047 0.1003 26
World Equity Index 0.0039 0.0032 0.0044 0.0701 18
SPX Options and Futures 0.0039 0.0034 0.0045 0.0543 14
Govt & Corp Bond 0.0038 0.0032 0.0043 0.0527 14
Major Equities 0.004 0.0034 0.0045 0.0477 12
Commodity 0.0043 0.0038 0.0047 0.0391 9
Currency 0.0041 0.0038 0.0044 0.0284 7
Seasonality 0.0055 0.0039 0.0092 0.0276 5
All 0.0036 0.0003 0.0133 1 278

Table 8. Comparison between one-time model vs dynamic
retrained model
This table reports the accuracy rate in the 11-year implementation period for two different
training approaches: one-time (Onemodel) and dynamic retrained (Retrain1). Onemodel uses
the model trained at the end of 2009 and applies it to the 11 years without further retraining.
Retrain1 is the methodology reported in the main results where retraining is triggered
dynamically. The HLN column reports the Harvey, Leybourne, and Newbold (1997) test on the
difference in accuracy rate between the two training approaches. ***, **, and * indicate statistical
significance at 1%, 5% and 10% level respectively.

Accuracy rate
Models RETRAIN1 ONEMODEL Difference HLN
1.NB 0.487 0.496 -0.009 -0.99***
2.LR 0.552 0.500 0.052 5.21***
3.DT 0.582 0.492 0.089 5.54***
4.RF 0.560 0.530 0.030 3.60***
5.AB 0.569 0.560 0.009 1.04
6.MLP 0.530 0.527 0.003 0.19
7.ENS 0.566 0.497 0.069 6.24***

Table 9. Results for the unbalanced sample of the Adaptive
Boosting model
This table reports the correct ratio, information coefficient and timing ratio in the 11-year
implementation period for two different training approaches for the Adaptive Boosting model
(AB). One uses a ‘balanced’ sampling approach which consists of equal amounts of ups and
downs which is the same as the main result reported in Table 2. The other uses an ‘unbalanced’
sampling approach simply taking 4000 data points at the time of estimation. The information
coefficient is calculated by (2×Correct ratio)−1. The timing ratio is calculated as (true positive
ratio + true negative ratio) -1. p-values are from the test for the mean to be different from zero.

Panel A. Accuracy

             Correct ratio   Information coefficient   Timing ratio
Training     Mean            Mean      p-value         Mean      p-value
Balanced     0.5691          0.1383    <.01            0.1224    <.01
Unbalanced   0.5714          0.1428    <.01            0.0895    <.01

Panel B. Simulated before-cost return

             Daily return          Yearly MDD
Training     Mean      p-value     Mean     p-value
Balanced     0.0090    <.01        -0.30    <.01
Unbalanced   0.0049    0.02        -0.58    0.01

Table 10. Results of size-timing and multi-category prediction of
the Adaptive Boosting model.
This table reports the correct ratio, information coefficient and timing ratio in the 11-year
implementation period for two different training approaches for the Adaptive Boosting models
(AB). One trains the model to predict up and down (2D) which consists of equal amounts of
ups and downs which is the same as the main result reported in Table 2. The other trains the
model to predict four categories of movements up-small, up-big, down-small, and down-big
(4D). The information coefficient is calculated by (2×Correct_ratio)−1. The timing ratio is
calculated as (true positive ratio + true negative ratio) -1. p-values are from the test for the mean
to be different from zero.

        Correct ratio   Information coefficient   Timing ratio         Big correct ratio    Small correct ratio
NumD    Mean            Mean      p-value         Mean      p-value    Mean      p-value    Mean      p-value
2D      0.5691          0.1383    <.01            0.1224    <.01       0.6101    <.01       0.5522    <.01
4D      0.5683          0.1366    <.01            0.0871    <.01       0.5619    <.01       0.5731    <.01

Table 11. Persistence of prediction accuracy and the use of Open-to-
Open predictions.
This table reports the accuracy and simulated returns for three different experiments with
different training and application targets. C2C indicates the current close to the next period
close VIX changes; O2O next day indicates the next day’s open to the day after the next’s open
VIX changes. The training column reports the type of returns used to construct the predicted
target while the application column reports the type of return used to calculate forecasting
performance. Panel A reports the correct ratio, information coefficient and timing ratio in the
11-year implementation period for the different training and application approaches for the
Adaptive Boosting models (AB). The information coefficient is calculated by
(2×Correct_ratio)−1. The timing ratio is calculated as (true positive ratio + true negative ratio)
-1. Panel B reports the mean daily return, the Sharpe ratio, and the average maximum annual
percentage drawdown (MDD). ***, **, and * indicate statistical significance at 1%, 5% and 10%
level respectively.

Panel A. Accuracy

                                    Correct ratio   Information coefficient   Timing ratio
Training       Application          Mean            Mean     t                Mean     t
C2C            C2C (main result)    0.569           0.138    8.01 ***         0.122    7.38 ***
C2C            C2C next day         0.541           0.082    4.49 ***         0.071    4.46 ***
C2C            O2O next day         0.527           0.054    2.82 **          0.051    2.70 **
O2O next day   O2O next day         0.565           0.130    4.35 ***         0.136    4.38 ***

Panel B. Simulated returns

                                Daily Return                    Yearly MDD
Training   Application          Mean     t           Sharpe     Mean    t            Min
C2C        C2C (main result)    0.0090   5.80 ***    1.73       -30%    -3.31 ***    -87%
C2C        C2C next day         0.0020   1.38        0.43       -69%    -6.43 ***    -142%
C2C        O2O                  0.0021   1.50        0.46       -51%    -4.41 ***    -93%
O2O        O2O                  0.0069   3.69 ***    1.06       -39%    -3.93 ***    -107%

Appendices

Appendix I Detailing the utilized software, packages, and tuning parameters
for replication:

We built the ML framework using Keras, Scikit-learn, and AutoGluon, which
are prominent open-source libraries for machine learning and AutoML (see
Pedregosa et al., 2018; Erickson et al., 2020). Additionally, we extended these
libraries by incorporating our unique closed-loop training and inference
methodology, as described in Section 3.1. The parameters and their respective
ranges used during the closed-loop AutoML training are provided in the
following table.

Tuning parameters

Algorithm             Parameter                 Selection range   Description
Logistic Regression   regularization strength   0.1 to 100        The strength of regularization
Decision Tree         max_depth                 3 to 12           The maximum depth of the tree
                      min_samples_leaf          8 to 16           The minimum number of samples required to be at a leaf node
                      max_leaf_nodes            5 to 100          The maximum number of leaf nodes a tree grows
Random Forest         n_estimators              1 to 50           The number of trees in the forest
                      max_depth                 1 to 100          The maximum depth of the tree
AdaBoost              n_estimators              1 to 100          The maximum number of estimators at which boosting terminates
                      learning_rate             0.001 to 1        Learning rate
XGBoost               n_estimators              50 to 150         The maximum number of estimators at which boosting terminates
                      max_depth                 3 to 10           The maximum depth of the tree
                      min_child_weight          3 to 10           The minimum sum of instance weight in a child
                      learning_rate             0.01 to 0.1       Learning rate
MLP NN                alpha                     0.01 to 1         Learning rate
                      layer_1_size              1 to 100          The number of neurons in the first layer
                      layer_2_size              1 to 50           The number of neurons in the second layer
                      layer_3_size              1 to 20           The number of neurons in the third layer
Deep NN               batch_size                1024              Batch size for each learning step
                      epochs                    50 to 75          The number of training iterations
                      lambda                    0.025 to 0.035    The regularization applied to the model
                      dropout_rate              0.10 to 0.25      Dropout rate
                      learning rate             0                 Learning rate
                      layer_1_size              100 to 200        The number of neurons in the first layer
                      layer_2_size              100 to 200        The number of neurons in the second layer
                      layer_3_size              100 to 200        The number of neurons in the third layer
Ensemble              All parameters            All ranges        Combination of all the above parameters
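
As an illustration of how such ranges can drive a search, the sketch below encodes the AdaBoost and XGBoost rows of the table as a search space and draws random candidate configurations. The sampling scheme (uniform integers, log-uniform learning rates), the dictionary layout, and the function name are assumptions made for this example; the paper does not describe how its closed-loop AutoML procedure samples within the ranges.

```python
import math
import random

# Ranges copied from the AdaBoost and XGBoost rows of the tuning table.
# "int" = uniform integer, "log" = log-uniform float (both are assumptions).
SEARCH_SPACE = {
    "AdaBoost": {
        "n_estimators": ("int", 1, 100),
        "learning_rate": ("log", 0.001, 1.0),
    },
    "XGBoost": {
        "n_estimators": ("int", 50, 150),
        "max_depth": ("int", 3, 10),
        "min_child_weight": ("int", 3, 10),
        "learning_rate": ("log", 0.01, 0.1),
    },
}


def sample_config(algorithm, rng):
    """Draw one candidate configuration for the given algorithm."""
    config = {}
    for name, (kind, low, high) in SEARCH_SPACE[algorithm].items():
        if kind == "int":
            config[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            value = math.exp(rng.uniform(math.log(low), math.log(high)))
            config[name] = min(max(value, low), high)  # guard against rounding
    return config


rng = random.Random(0)  # fixed seed so the draws are reproducible
candidates = [sample_config("XGBoost", rng) for _ in range(5)]
```

Each candidate would then be trained and scored on the validation sample, with the best configuration carried forward; that selection loop is omitted here.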

Appendix II Summary Statistics and Dynamic Properties of VIX

This table reports the summary statistics and dynamic properties of the VIX
across different sample periods: In-Sample, Out-of-Sample, Implement, and
Full Sample.

Sample          Mean    Std Dev   Min     Max     Skewness   Kurtosis   PACF Lag 1   PACF Lag 5   PACF Lag 23   N
In-Sample       19.83   6.62      9.89    45.74   0.779      0.443      0.982        0.0597       0.0006        3600
Out-of-Sample   34.71   13.88     18.81   80.86   1.085      0.426      0.972        0.1174       0.0257        400
Implement       18.11   7.18      9.14    82.69   2.585      11.942     0.966        0.0608       -0.0077       3131
Full Sample     19.91   8.32      9.14    82.69   2.128      7.769      0.979        0.0712       0.0083        7131

The combined training sample (In-Sample + Out-of-Sample) and the
Implement period exhibit similar overall statistics and dynamics, despite
greater variation in the Out-of-Sample period. The mean VIX values are 19.83
(In-Sample) and 34.71 (Out-of-Sample), with the Out-of-Sample period’s
higher standard deviation (13.88) reflecting financial crisis volatility. The
Implement period has a mean of 18.11 and a standard deviation of 7.18, with
noticeably higher skewness (2.585) and kurtosis (11.942), indicating more
extreme values. PACF values at lag 1 are high across all periods (~0.97), with
diminishing correlations at higher lags. Overall, the training sample
effectively captures market volatility characteristics.
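
The statistics in this table can be reproduced in outline with the following sketch. It computes moments and the partial autocorrelation function (PACF) using the Durbin-Levinson recursion. Two assumptions are made: kurtosis is treated as excess kurtosis (the values below 3 for the In-Sample period suggest this, but the paper does not say), and the series is a simulated AR(1) process rather than actual VIX data.

```python
import math
import random


def moments(x):
    """Mean, sample standard deviation, skewness and excess kurtosis."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    std = math.sqrt(m2 * n / (n - 1))
    return mean, std, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def autocorr(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf(x, max_lag):
    """Partial autocorrelations for lags 0..max_lag (Durbin-Levinson)."""
    r = autocorr(x, max_lag)
    out, prev = [1.0], []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk, cur = r[1], [r[1]]
        else:
            num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        out.append(phi_kk)
        prev = cur
    return out


# Simulated persistent series (hypothetical stand-in for daily VIX levels).
rng = random.Random(42)
series = [0.0]
for _ in range(4999):
    series.append(0.9 * series[-1] + rng.gauss(0.0, 1.0))

stats = moments(series)
partial = pacf(series, 5)
```

For a persistent series of this kind, the lag-1 partial autocorrelation is close to the AR coefficient and the higher-lag values are close to zero, which is the same pattern the table reports for VIX.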

Predicting VIX with Adaptive Machine Learning

Yunfei Bai* and Charlie X. Cai**

Online Appendix

Online Appendix I. List of variables

ID Category Security name Field


1 Commodity Bloomberg Commodity Index Price Change 1 Day Percent
2 Commodity WTI CRUDE FUTURE Price Change 1 Day Percent
3 Commodity BRENT CRUDE FUTR Price Change 1 Day Percent
4 Commodity COPPER FUTURE Price Change 1 Day Percent
5 Commodity GOLD 100 OZ FUTR Price Change 1 Day Percent
6 Commodity SOYBEAN FUTURE Price Change 1 Day Percent
7 Commodity CORN FUTURE Price Change 1 Day Percent
8 Commodity SUGAR Price Change 1 Day Percent
9 Commodity Gold Spot $/Oz Price Change 1 Day Percent
10 Currency Euro Spot Price Change 1 Day Percent
11 Currency Japanese Yen Spot Price Change 1 Day Percent
12 Currency British Pound Spot Price Change 1 Day Percent
13 Currency Australian Dollar Spot Price Change 1 Day Percent
14 Currency China Renminbi Spot Price Change 1 Day Percent
15 Currency Brazilian Real Spot Price Change 1 Day Percent
16 Currency DOLLAR INDEX SPOT Price Change 1 Day Percent
17 Govt & Corp Bond US Generic Govt 2 Yr Price Change 1 Day Percent
18 Govt & Corp Bond US Generic Govt 5 Yr Price Change 1 Day Percent
19 Govt & Corp Bond US Generic Govt 3 Yr Price Change 1 Day Percent
20 Govt & Corp Bond US Generic Govt 12 Mth Price Change 1 Day Percent
21 Govt & Corp Bond US Generic Govt 3 Mth Price Change 1 Day Percent
22 Govt & Corp Bond US Generic Govt 6 Mth Price Change 1 Day Percent
23 Govt & Corp Bond US Generic Govt 30 Yr Price Change 1 Day Percent
24 Govt & Corp Bond US Generic Govt TII 10 Yr Price Change 1 Day Percent
25 Govt & Corp Bond US Generic Govt TII 5 Yr Price Change 1 Day Percent
26 Govt & Corp Bond US Corporate High Yield Price Change 1 Day Percent
27 Govt & Corp Bond Corporate Price Change 1 Day Percent
28 Govt & Corp Bond Corporate Price Change 1 Day Percent
29 Govt & Corp Bond UST 13-Week Bill High Discount Price Change 1 Day Percent
30 Govt & Corp Bond Ted Spread Price Change 1 Day Percent
31 Macroeconomic ISM Manufacturing PMI SA ISM PMI
32 Macroeconomic ISM Services PMI Services PMI

33 Macroeconomic U-3 US Unemployment Rate Total Total SA
34 Macroeconomic US PPI Finished Goods SA MoM% Goods MoM SA
35 Macroeconomic Adjusted Retail & Food Service Monthly % Change
36 Macroeconomic US Import Price Index by End U % Change
37 Macroeconomic US Export Price by End Use All % Change
38 Macroeconomic US Real Average Weekly Earning CES0500000012
39 Macroeconomic GDP US Chained 2012 Dollars Qo QoQ % Change Annualized
40 Macroeconomic US Labor Productivity Output P PRS85006092
41 Macroeconomic US Unit Labor Costs Nonfarm Bu PRS85006112
42 Macroeconomic US Employees on Nonfarm Payrol Net Change SA
43 Macroeconomic US Employees on Nonfarm Payrol Private Chng SA
44 Macroeconomic US Employees on Nonfarm Payrol Net Change
45 Macroeconomic Federal Funds Target Rate - Up Fed Funds Target Rate US
46 Macroeconomic US Initial Jobless Claims SA Initial Jobless Claims SA
47 Macroeconomic US CPI Urban Consumers MoM SA MoM % SA
48 Macroeconomic Conference Board Consumer Conf Confidence
49 Macroeconomic US Durable Goods New Orders In Month % change
50 Macroeconomic MBA US Mortgage Market Inde WoW% Change
51 Macroeconomic US New One Family Houses Sold Total sold
52 Macroeconomic US New Privately Owned Housing US Building Housing Starts
53 Macroeconomic US Industrial Production MOM S Month % change
54 Macroeconomic US Manufacturers New Orders To Monthly % Change
55 Macroeconomic US Personal Income MoM SA MoM % Change
56 Macroeconomic US Personal Consumption Expend Monthly % Change
57 Macroeconomic US Trade Balance of Goods and US Trade Balance
58 Macroeconomic Conference Board US Leading In Monthly % Change
59 Macroeconomic University of Michigan Consume Univ. of Michigan Sentiment
60 Macroeconomic ISM Manufacturing PMI SA Change
61 Macroeconomic ISM Services PMI Change
62 Macroeconomic U-3 US Unemployment Rate Total Change
63 Macroeconomic US PPI Finished Goods SA MoM% Change
64 Macroeconomic Adjusted Retail & Food Service Change
65 Macroeconomic US Import Price Index by End U Change
66 Macroeconomic US Export Price by End Use All Change
67 Macroeconomic US Real Average Weekly Earning Change
68 Macroeconomic GDP US Chained 2012 Dollars Qo Change
69 Macroeconomic US Labor Productivity Output P Change
70 Macroeconomic US Unit Labor Costs Nonfarm Bu Change
71 Macroeconomic US Employees on Nonfarm Payrol Change
72 Macroeconomic US Employees on Nonfarm Payrol Change
73 Macroeconomic US Employees on Nonfarm Payrol Change
74 Macroeconomic Federal Funds Target Rate - Up Change
75 Macroeconomic US Initial Jobless Claims SA Change
76 Macroeconomic US CPI Urban Consumers MoM SA Change
77 Macroeconomic Conference Board Consumer Conf Change
78 Macroeconomic US Durable Goods New Orders In Change

79 Macroeconomic MBA US Mortgage Market Inde Change
80 Macroeconomic US New One Family Houses Sold Change
81 Macroeconomic US New Privately Owned Housing Change
82 Macroeconomic US Industrial Production MOM S Change
83 Macroeconomic US Manufacturers New Orders To Change
84 Macroeconomic US Personal Income MoM SA Change
85 Macroeconomic US Personal Consumption Expend Change
86 Macroeconomic US Trade Balance of Goods and Change
87 Macroeconomic Conference Board US Leading In Change
88 Macroeconomic University of Michigan Consume Change
89 Macroeconomic ICE LIBOR USD 1 Month Last Price
90 Macroeconomic ICE LIBOR USD 1 Month Price Change 1 Day Percent
91 Macroeconomic US Generic Govt 10 Yr Price Change 1 Day Percent
92 Major Equities INTL BUSINESS MACHINES CORP Price Change 1 Day Percent
93 Major Equities APPLE INC Price Change 1 Day Percent
94 Major Equities AMAZON.COM INC Price Change 1 Day Percent
95 Major Equities GENERAL ELECTRIC CO Price Change 1 Day Percent
96 Major Equities CELGENE CORP Price Change 1 Day Percent
97 Major Equities MICRON TECHNOLOGY INC Price Change 1 Day Percent
98 Major Equities MICROSOFT CORP Price Change 1 Day Percent
99 Major Equities BRISTOL-MYERS SQUIBB CO Price Change 1 Day Percent
100 Major Equities FEDEX CORP Price Change 1 Day Percent
101 Major Equities GOLDMAN SACHS GROUP INC Price Change 1 Day Percent
102 Major Equities PROLOGIS INC Price Change 1 Day Percent
103 Major Equities NVIDIA CORP Price Change 1 Day Percent
104 Seasonality Day of the month
105 Seasonality Day of the week
106 Seasonality Week of the year
107 Seasonality Month of the year
108 Seasonality Day to next expired Wed
109 SPX Member Tech S&P 500 INDEX Pct of Members w/Px Below Lower Bollinger Band
110 SPX Member Tech S&P 500 INDEX Pct of Members w/Px Above Upper Bollinger Band
111 SPX Member Tech S&P 500 INDEX Percentage of Members with Px > 10 Day Moving Avg
112 SPX Member Tech S&P 500 INDEX Percentage of Members with Px > 20 Day Moving Avg
113 SPX Member Tech S&P 500 INDEX Percentage of Members with MACD > Base Line Zero
114 SPX Member Tech S&P 500 INDEX Percentage of Members with Px > 150 Day Moving Avg
115 SPX Member Tech S&P 500 INDEX Percentage of Members with Signal > Base Line Zero
116 SPX Member Tech S&P 500 INDEX Percentage of Members with Px > 250 Day Moving Avg
117 SPX Member Tech S&P 500 INDEX Percentage of Members with Px > 10 Wk Moving Avg

118 SPX Member Tech S&P 500 INDEX Percentage of Members with Px > 50 Wk Moving Avg
119 SPX Member Tech S&P 500 INDEX Percentage of Members with Px > 100 Wk Moving Avg
120 SPX Member Tech S&P 500 INDEX Percentage of Members with 14 Day RSI Betw 30 & 70
121 SPX Member Tech S&P 500 INDEX Percentage of Members with 14 Day RSI > 70
122 SPX Member Tech S&P 500 INDEX Percentage of Members with 14 Day RSI < 30
123 SPX Member Tech S&P 500 INDEX Percentage of Members with New 52 Week Highs
124 SPX Member Tech S&P 500 INDEX Percentage of Members with New 52 Week Lows
125 SPX Member Tech S&P 500 INDEX Percentage of Members with New 4 Week Highs
126 SPX Member Tech S&P 500 INDEX Percentage of Members with New 4 Week Lows
127 SPX Member Tech S&P 500 INDEX Pct of Members w/MACD Sell Signal Last 10 Days
128 SPX Member Tech S&P 500 INDEX Pct of Members w/MACD Buy Signal Last 10 Days
129 SPX Member Tech S&P 500 INDEX Percentage of Members with New 12 Week Highs
130 SPX Member Tech S&P 500 INDEX Percentage of Members with New 12 Week Lows
131 SPX Member Tech S&P 500 INDEX Percentage of Members with New 8 Week Highs
132 SPX Member Tech S&P 500 INDEX Percentage of Members with New 24 Week Highs
133 SPX Member Tech S&P 500 INDEX Percentage of Members with New 24 Week Lows
134 SPX Member Tech S&P 500 INDEX Percentage of Members with New 8 Week Lows
135 SPX Options and Futures S&P 500 INDEX Hist. Call Implied Volatility
136 SPX Options and Futures S&P 500 INDEX Put Call Volume Ratio - Current Day
137 SPX Options and Futures S&P 500 INDEX Total Option Volume - Current Day
138 SPX Options and Futures S&P 500 INDEX Total Call Volume
139 SPX Options and Futures S&P 500 INDEX Total Put Volume
140 SPX Options and Futures S&P 500 INDEX Total Call Open Interest
141 SPX Options and Futures S&P 500 INDEX Total Put Open Interest
142 SPX Options and Futures S&P 500 INDEX Total Call Volume Current Day
143 SPX Options and Futures S&P 500 INDEX Total Put Volume Current Day
144 SPX Options and Futures S&P 500 INDEX Total Call Open Interest Current Day

145 SPX Options and Futures S&P 500 INDEX Total Put Open Interest Current Day
146 SPX Options and Futures S&P 500 INDEX Total Option Volume - Current Day
147 SPX Options and Futures Generic 1st 'SP' Future Aggregate Open Interest
148 SPX Options and Futures Generic 1st 'SP' Future Aggregate Volume of Futures Contracts
149 SPX Subindex S&P 500 Banks Industry Group G Price Change 1 Day Percent
150 SPX Subindex S&P 500 Retailing Industry Gro Price Change 1 Day Percent
151 SPX Subindex S&P 500 Automobiles & Component Price Change 1 Day Percent
152 SPX Subindex S&P 500 Transportation Industr Price Change 1 Day Percent
153 SPX Subindex S&P 500 Software & Services In Price Change 1 Day Percent
154 SPX Subindex S&P 500 Insurance Industry Gro Price Change 1 Day Percent
155 SPX Subindex S&P 500 Real Estate Industry G Price Change 1 Day Percent
156 SPX Subindex S&P 500 Technology Hardware & Price Change 1 Day Percent
157 SPX Subindex S&P 500 Media & Entertainment Price Change 1 Day Percent
158 SPX Subindex S&P 500 Household & Personal P Price Change 1 Day Percent
159 SPX Subindex S&P 500 Telecommunication Serv Price Change 1 Day Percent
160 SPX Subindex S&P 500 Utilities Industry Gro Price Change 1 Day Percent
161 SPX Subindex S&P 500 Food Beverage & Tobacc Price Change 1 Day Percent
162 SPX Subindex S&P 500 Health Care Equipment Price Change 1 Day Percent
163 SPX Subindex S&P 500 Consumer Durables & Ap Price Change 1 Day Percent
164 SPX Subindex S&P 500 Pharm Biotech & Life S Price Change 1 Day Percent
165 SPX Subindex S&P 500 Energy Industry Group Price Change 1 Day Percent
166 SPX Subindex S&P 500 Capital Goods Industry Price Change 1 Day Percent
167 SPX Subindex S&P 500 Diversified Financials Price Change 1 Day Percent
168 SPX Subindex S&P 500 Food & Staples Retaili Price Change 1 Day Percent
169 SPX Subindex S&P 500 Consumer Services Indu Price Change 1 Day Percent
170 SPX Subindex S&P 500 Commercial Professiona Price Change 1 Day Percent
171 SPX Subindex S&P 500 Materials Industry Gro Price Change 1 Day Percent
172 SPX Subindex S&P 500 Consumer Discretionary Price Change 1 Day Percent
173 SPX Subindex S&P 500 Consumer Staples Secto Price Change 1 Day Percent
174 SPX Subindex S&P 500 Energy Sector GICS Lev Price Change 1 Day Percent
175 SPX Subindex S&P 500 Financials Sector GICS Price Change 1 Day Percent
176 SPX Subindex S&P 500 Health Care Sector GIC Price Change 1 Day Percent
177 SPX Subindex S&P 500 Industrials Sector GIC Price Change 1 Day Percent
178 SPX Subindex S&P 500 Information Technology Price Change 1 Day Percent
179 SPX Subindex S&P 500 Materials Sector GICS Price Change 1 Day Percent
180 SPX Subindex S&P 500 Communication Services Price Change 1 Day Percent
181 SPX Subindex S&P 500 Utilities Sector GICS Price Change 1 Day Percent
182 SPX Tech S&P 500 INDEX Volatility 30 Day
183 SPX Tech S&P 500 INDEX Volatility 90 Day
184 SPX Tech S&P 500 INDEX Volatility 60 Day
185 SPX Tech S&P 500 INDEX Volatility 260 Day
186 SPX Tech S&P 500 INDEX Volatility 360 Day
187 SPX Tech S&P 500 INDEX Volatility 10 Day
188 SPX Tech S&P 500 INDEX Volatility 20 Day

189 SPX Tech S&P 500 INDEX Volatility 180 Day
190 SPX Tech S&P 500 INDEX Volatility 200 Day
191 SPX Tech S&P 500 INDEX Volatility 120 Day
192 SPX Tech S&P 500 INDEX RSI 3 Day
193 SPX Tech S&P 500 INDEX RSI 9 Day
194 SPX Tech S&P 500 INDEX RSI 14 Day
195 SPX Tech S&P 500 INDEX RSI 30 Day
196 SPX Tech S&P 500 INDEX ARMS Daily Index
197 SPX Tech S&P 500 INDEX ARMS Weekly Index
198 SPX Tech S&P 500 INDEX Money Flow Net Non-Block
199 SPX Tech S&P 500 INDEX Money Flow Net-Block
200 SPX Tech S&P 500 INDEX Dividend Per Share Last Net
201 SPX Tech S&P 500 INDEX Volume - Realtime
202 SPX Tech S&P 500 INDEX Advance Volumes
203 SPX Tech S&P 500 INDEX Decline Volumes
204 SPX Tech S&P 500 INDEX Unchanged Volumes
205 SPX Tech S&P 500 INDEX Average Volume 5 Day
206 SPX Tech S&P 500 INDEX Average Volume 25 Day
207 SPX Tech S&P 500 INDEX Moving Average 5 Day
208 SPX Tech S&P 500 INDEX Moving Average 10 Day
209 SPX Tech S&P 500 INDEX Moving Average 20 Day
210 SPX Tech S&P 500 INDEX Moving Average 30 Day
211 SPX Tech S&P 500 INDEX Moving Average 50 Day
212 SPX Tech S&P 500 INDEX Moving Average 100 Day
213 SPX Tech S&P 500 INDEX Moving Average 200 Day
214 SPX Tech S&P 500 INDEX Percentage Index Advanced
215 SPX Tech S&P 500 INDEX DIFF_MOV_AVG_5D
216 SPX Tech S&P 500 INDEX DIFF_MOV_AVG_10D
217 SPX Tech S&P 500 INDEX DIFF_MOV_AVG_20D
218 SPX Tech S&P 500 INDEX DIFF_MOV_AVG_30D
219 SPX Tech S&P 500 INDEX DIFF_MOV_AVG_50D
220 SPX Tech S&P 500 INDEX DIFF_MOV_AVG_100D
221 SPX Tech S&P 500 INDEX DIFF_MOV_AVG_200D
222 SPX Tech S&P 500 INDEX return_5d
223 SPX Tech S&P 500 INDEX return_10d
224 SPX Tech S&P 500 INDEX return_30d
225 SPX Tech S&P 500 INDEX return_60d
226 SPX Tech S&P 500 INDEX Max 30 day
227 SPX Tech S&P 500 INDEX Days away from 30 Max
228 SPX Tech S&P 500 INDEX Min 30 day
229 SPX Tech S&P 500 INDEX Day away from 30 Min
230 SPX Tech S&P 500 INDEX RSI3d/RSI14d
231 SPX Tech S&P 500 INDEX MOV_AVG_5D_20D
232 SPX Tech S&P 500 INDEX Price Change 1 Day Percent
233 Vix Tech Cboe Volatility Index vixchanged_3d
234 Vix Tech Cboe Volatility Index vixchanged_5d
235 Vix Tech Cboe Volatility Index vixchanged_10d

236 Vix Tech Cboe Volatility Index vixchanged_30d
237 Vix Tech Cboe Volatility Index vixchanged_60d
238 Vix Tech Cboe Volatility Index vixstd_5d
239 Vix Tech Cboe Volatility Index vixstd_10d
240 Vix Tech Cboe Volatility Index vixstd_30d
241 Vix Tech Cboe Volatility Index vixstd_60d
242 Vix Tech Cboe Volatility Index Moving Average 5 Day
243 Vix Tech Cboe Volatility Index Moving Average 10 Day
244 Vix Tech Cboe Volatility Index Moving Average 20 Day
245 Vix Tech Cboe Volatility Index Moving Average 30 Day
246 Vix Tech Cboe Volatility Index Moving Average 5 Day
247 Vix Tech Cboe Volatility Index Moving Average 10 Day
248 Vix Tech Cboe Volatility Index Moving Average 20 Day
249 Vix Tech Cboe Volatility Index Moving Average 30 Day
250 Vix Tech Cboe Volatility Index RSI 3 Day
251 Vix Tech Cboe Volatility Index RSI 9 Day
252 Vix Tech Cboe Volatility Index RSI 14 Day
253 Vix Tech Cboe Volatility Index RSI 30 Day
254 Vix Tech Cboe Volatility Index Max 30 day
255 Vix Tech Cboe Volatility Index Days away from 30 Max
256 Vix Tech Cboe Volatility Index Min 30 day
257 Vix Tech Cboe Volatility Index Day away from 30 Min
258 Vix Tech Cboe Volatility Index RSI3d/RSI14d
259 Vix Tech Cboe Volatility Index MOV_AVG_5D_20D
260 Vix Tech Cboe Volatility Index Price Change 1 Day Percent
261 World Equity Index DOW JONES INDUS. AVG Price Change 1 Day Percent
262 World Equity Index NIKKEI 225 Price Change 1 Day Percent
263 World Equity Index Euro Stoxx 50 Pr Price Change 1 Day Percent
264 World Equity Index DAX INDEX Price Change 1 Day Percent
265 World Equity Index NASDAQ COMPOSITE Price Change 1 Day Percent
266 World Equity Index FTSE 100 INDEX Price Change 1 Day Percent
267 World Equity Index STXE 600 (EUR) Pr Price Change 1 Day Percent
268 World Equity Index HANG SENG INDEX Price Change 1 Day Percent
269 World Equity Index TOPIX INDEX (TOKYO) Price Change 1 Day Percent
270 World Equity Index SHANGHAI SE COMPOSITE Price Change 1 Day Percent
271 World Equity Index RUSSELL 2000 INDEX Price Change 1 Day Percent
272 World Equity Index NASDAQ 100 STOCK INDX Price Change 1 Day Percent
273 World Equity Index CAC 40 INDEX Price Change 1 Day Percent
274 World Equity Index MSCI World Index Price Change 1 Day Percent
275 World Equity Index MSCI Emerging Markets Index Price Change 1 Day Percent
276 World Equity Index NSE Nifty 50 Index Price Change 1 Day Percent
277 World Equity Index BRAZIL IBOVESPA INDEX Price Change 1 Day Percent
278 World Equity Index BarCap US Corp HY YTW - 10 Yea Price Change 1 Day Percent

Online Appendix II. Details of the research design and list of algorithms
included

OApp II.1 Predictive objective

The objective of this research is to predict the daily VIX signal for the next
day. We choose to predict the direction rather than the level of VIX because we
take the view of a portfolio manager who is interested in timing the VIX. The
forecast translates into a decision to go long or short volatility. Given this
objective, a classification forecast is a direct match for this operational
problem. We select a supervised machine learning approach, in which the
algorithm learns from the input data and then uses this learning to predict the
VIX’s UP or DOWN signal. This is a typical binomial (binary) classification
problem that can be addressed by several machine learning algorithms.
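
A minimal sketch of how such a binary target could be built from a series of daily VIX closes is shown below. The function name, the toy prices, and the treatment of an unchanged close as DOWN are assumptions for illustration; the paper does not state how ties are handled.

```python
def direction_labels(vix_close):
    """Label each day t with the direction of the change from close t to close t+1.

    The last day has no next close, so the output is one element shorter
    than the input. An unchanged close is labelled DOWN (an assumption).
    """
    labels = []
    for today, tomorrow in zip(vix_close, vix_close[1:]):
        labels.append("UP" if tomorrow > today else "DOWN")
    return labels


# Hypothetical closing levels for five consecutive trading days.
closes = [15.0, 16.2, 16.2, 14.9, 18.3]
labels = direction_labels(closes)
```

The label for day t is paired with the features observed up to the close of day t, so the classifier is always trained to predict a change that has not yet happened.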

OApp II.2 Variable selections and data processing

Before applying the ML algorithms to the features described above, we conduct
a set of feature engineering processes for the data.⁴ The quality of features in
the data directly influences the predictive model’s flexibility, simplicity,
execution performance, and corresponding results. In particular, certain ML
algorithms, such as tree-based methods, are not sensitive to feature units and
magnitudes, but others are. Therefore, scaling methods (Standardization
and Min-Max Scaling) are applied to the data fields to address features with
highly varying magnitudes, units, and ranges. Such rescaling is done using the
training sample (instead of the full sample) for each new model training to
avoid look-ahead bias.
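
The point about fitting the scaler on the training sample only can be sketched as follows. The helper names and the toy numbers are hypothetical; the sketch simply shows scaling parameters being estimated on training data and then reused unchanged on later observations.

```python
import statistics


def fit_standardizer(train_column):
    """Estimate mean and standard deviation from the training sample only."""
    mean = statistics.fmean(train_column)
    std = statistics.pstdev(train_column)
    return mean, (std if std > 0 else 1.0)  # avoid division by zero


def standardize(column, params):
    mean, std = params
    return [(v - mean) / std for v in column]


def fit_minmax(train_column):
    """Estimate the minimum and maximum from the training sample only."""
    return min(train_column), max(train_column)


def minmax_scale(column, params):
    low, high = params
    span = (high - low) or 1.0  # avoid division by zero for constant columns
    return [(v - low) / span for v in column]


# Hypothetical feature values: a training window and one later observation.
train = [10.0, 12.0, 14.0, 16.0]
later = [20.0]

z_params = fit_standardizer(train)
mm_params = fit_minmax(train)
z_later = standardize(later, z_params)
mm_later = minmax_scale(later, mm_params)
```

The later observation is scaled to a value above 1 under Min-Max scaling because it lies outside the training range. That is expected: using the full-sample maximum would keep it inside [0, 1] but would leak future information into the training features.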

OApp II.3 Machine Learning algorithm selection

There are several recent studies on financial market classifiers. For example,
Ballings et al. (2015) found that for stock market prediction, Random Forests
outperformed SVM, Kernel Factory, AdaBoost, Neural Networks, K-Nearest
Neighbor, and Logistic Regression in terms of AUC. Similarly, Booth, Gerding,
and McGroarty (2015) demonstrated that an ensemble of random forests
improved forecast accuracy by over 15% compared to competing models (linear
regression, neural networks, and SVM) in out-of-sample price impact forecasts.

⁴ Feature engineering normally covers data cleaning, data scaling and
transformation, feature selection, feature enhancement (extraction and
enhancement), feature construction and feature learning.

In this paper, we include Naïve Bayes (NB), Logistic Regression (LR),
and classic ML models such as Decision Tree (DT) and Random Forest (RF).
We also include advanced methods like Adaptive Boosting (AB), Multi-Layer
Perceptron (MLP), and an Ensemble model (Ens) combining all the above.
These models offer varying complexity to examine their efficacy in predicting
volatility direction.

Naïve Bayes (NB): This simple probabilistic classifier based on Bayes'
Theorem assumes feature independence. Despite its simplicity, Naïve Bayes
can outperform more complex methods in certain conditions, making it a useful
benchmark in this study.

Logistic Regression (LR): A statistical method used to model binary
outcomes. It finds the best-fitting model to describe the relationship between
independent variables and a dichotomous outcome. In this study, it serves as
another benchmark.

Decision Tree (DT): A tree-structured model that splits data into subsets
based on the most important features, ultimately forming a tree with decision
and leaf nodes.

Random Forest (RF): An extension of decision trees that builds multiple
trees and outputs the most likely classification. Random forests reduce
overfitting issues found in single decision trees.

Adaptive Boosting (AB): A boosting algorithm that combines weak
learners into a strong classifier, iterating over weighted data and adjusting
weights based on prediction errors.

Multi-Layer Perceptron (MLP): A type of neural network consisting of
layers of units (neurons) connected by weights, commonly used for
classification and regression tasks.

Ensemble Model (Ens): A stacking ensemble model that combines
multiple algorithms (NB, LR, DT, RF, AB, and MLP) to improve predictive
accuracy by reducing errors related to variance, noise, and bias.

Each of these models offers distinct advantages, allowing us to examine
their performance in predicting volatility direction. The ensemble approach,
which combines multiple models, is particularly effective, as noted by
Rasekhschaffe and Jones (2019).
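
To make the reweighting idea behind Adaptive Boosting concrete, the sketch below implements the textbook discrete AdaBoost algorithm with one-feature threshold rules (decision stumps) as weak learners. It is an illustration only, not the implementation used in the paper, which relies on library versions; the function names and the toy data are hypothetical.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    """Weak learner: predict `polarity` above the threshold, its opposite below."""
    return polarity if x[feature] > threshold else -polarity


def fit_stump(X, y, w):
    """Pick the stump with the lowest weighted error under weights w."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, feature, threshold, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost_fit(X, y, n_rounds):
    n = len(X)
    w = [1.0 / n] * n  # start with equal weights on all observations
    model = []
    for _ in range(n_rounds):
        err, feature, threshold, polarity = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this weak learner
        model.append((alpha, feature, threshold, polarity))
        # Increase weights on misclassified points, decrease on correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, x):
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in model)
    return 1 if score >= 0 else -1


# Toy one-feature data (hypothetical) that no single stump can classify.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost_fit(X, y, n_rounds=3)
fitted = [adaboost_predict(model, row) for row in X]
```

On this toy data the best single stump misclassifies two of the six points, while the weighted vote of three stumps classifies all six correctly, which is the sense in which boosting turns weak learners into a stronger classifier.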

Online Appendix III. Support Vector Machine

We study the Support Vector Machine (SVM) for VIX prediction in this
appendix. Table OA1 reports the performance of SVM compared with other
models. The overall accuracy rate of SVM is relatively low compared to other
ML algorithms and is comparable with that of the simple HAR model. However,
it has an extremely low timing ratio. In other words, it produces highly
unbalanced predictions, with a bias toward ‘down’ predictions. Panel B shows
that this one-sided prediction produces a negative overall return, which is also
observed in a short-only strategy. Overall, our experiment shows that SVM
produces a less effective accuracy rate due to its unbalanced predictions,
despite the use of a balanced sample. Previous studies that find supportive
evidence for SVM models typically use small systems with fewer inputs. Our
system with 278 variables takes much longer to run, and it seems to produce a
‘corner solution’.

<Insert Table OA1>

Online Appendix IV. Examples of practical applications

In this section, we present two examples of practical investment applications.
We first study strategies of trading VIX futures with the model signal and
compare them with the short-only strategy. We then present some limited
evidence of trading the contract for difference (CFD).

Trading VIX futures

We apply our VIX prediction to trade VIX derivatives, focusing on VIX futures,
which are relatively liquid with lower trading costs. However, VIX and its
futures are not perfectly correlated, making the effectiveness of VIX predictions
on VIX futures an empirical question. To study this, we construct a continuous
VIX futures return series by switching to the next front-month contract at
expiration, ensuring accurate return calculations during the switch.
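
One way to build such a continuous return series is sketched below. The key step is that each daily return is computed from two prices of the same contract, so the price gap between the expiring and the next contract never enters the series. The data layout, contract names, and prices are hypothetical, and the rule of holding the front contract through its expiration day is an assumption about the exact roll timing.

```python
def continuous_returns(prices, expiry):
    """Daily returns of a rolled front-month futures position.

    prices: list of {contract: settlement price}, one dict per trading day.
    expiry: {contract: index of its last trading day}.
    The position held from day t-1 to day t is the nearest contract that
    is still alive on day t.
    """
    contracts = sorted(expiry, key=expiry.get)
    returns = []
    held = 0  # index of the contract currently held
    for t in range(1, len(prices)):
        # Roll forward once the held contract no longer trades on day t.
        while expiry[contracts[held]] < t:
            held += 1
        c = contracts[held]
        # Both prices refer to the same contract, so no roll gap is booked.
        returns.append(prices[t][c] / prices[t - 1][c] - 1.0)
    return returns


# Hypothetical settlement prices: F1 expires on day 1, F2 on day 3.
prices = [
    {"F1": 20.0, "F2": 22.0},
    {"F1": 19.0, "F2": 21.5},
    {"F2": 21.0},
    {"F2": 22.05},
]
expiry = {"F1": 1, "F2": 3}
rolled = continuous_returns(prices, expiry)

# Naive splice of front-month prices, shown only for comparison.
naive_roll_day = prices[2]["F2"] / prices[1]["F1"] - 1.0
```

In the toy data the rolled series records a loss of about 2.3% on the day after the switch, whereas naively splicing the two front-month prices would record a spurious gain of about 10.5%.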

Three key features of this VIX futures series stand out: (1) the daily
correlation between VIX and its futures is 89%, (2) VIX futures returns are
0.6 times the VIX return, with an R-squared of 63%, and (3) the cumulative
return declines over time, reflecting the premium usually priced into VIX
futures. The "short bias" in futures reduces the correlation between VIX and
VIX futures.
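
The slope and R-squared quoted above are the kind of figures produced by regressing futures returns on VIX returns. The sketch below shows that calculation for a one-variable ordinary least squares regression; the function name and the toy return series are hypothetical and are not the paper's data.

```python
def ols_slope_r2(x, y):
    """Slope and R-squared from regressing y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


# Hypothetical daily returns: futures move about 0.6 times the spot index,
# plus a small idiosyncratic component.
vix_returns = [0.05, -0.03, 0.10, -0.08, 0.02, -0.01]
noise = [0.004, -0.006, -0.010, 0.008, 0.003, 0.001]
futures_returns = [0.6 * r + e for r, e in zip(vix_returns, noise)]

slope, r_squared = ols_slope_r2(vix_returns, futures_returns)
```

A slope below one means that a correct directional call on VIX earns a smaller move when it is traded through futures, which is one of the two reasons the text gives for the lower futures returns.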

We explore two approaches: direct application of the VIX signal to
futures and accounting for the short bias. Table OA2 shows a reduction in
accuracy across models when applied to VIX futures (a drop of 1.7% to 4.4%),
with returns significantly lower than those for VIX itself due to both reduced
accuracy and the smaller magnitude of futures movements (60% of VIX). Even
after transaction costs, some strategies, particularly AB, maintain positive
returns (15 basis points daily, 38% annually), though this is lower than VIX
returns.

We then modify the strategy to account for the short bias by avoiding
long positions when VIX futures are priced higher than VIX. Table OA3 shows
this improves accuracy, particularly in avoiding conflicting days, bringing
results closer to those for the VIX spot. The short-only (SO) strategy remains
hard to beat, yielding a 22 basis points daily return with a Sharpe ratio of 0.57.

However, the AB model consistently outperforms SO, with a 50% higher return,
better Sharpe ratio (0.93 vs. 0.57), and lower yearly MDD. The LR model also
performs reasonably well, slightly surpassing SO across most metrics. These
results demonstrate the economic relevance of our VIX forecasts, even when
incorporating futures dynamics.
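
The modified rule described above can be written as a small position function: follow the model signal, except that an UP signal is ignored when the futures contract trades above the VIX spot value. The function name and the +1/0/-1 position coding are assumptions for this illustration.

```python
def augmented_position(signal, futures_price, vix_spot):
    """Return +1 (long), -1 (short) or 0 (sit out) for one trading day.

    signal is the model's "UP" or "DOWN" prediction for VIX.
    """
    if signal == "UP" and futures_price > vix_spot:
        # Conflicting day: the model says UP but the futures already
        # trade above spot, so the strategy stays out of the market.
        return 0
    return 1 if signal == "UP" else -1


# Hypothetical days: (signal, futures price, VIX spot).
days = [
    ("UP", 21.0, 19.5),    # conflicting day
    ("UP", 18.0, 19.5),    # futures below spot
    ("DOWN", 21.0, 19.5),  # short signal, never filtered
    ("DOWN", 18.0, 19.5),
]
positions = [augmented_position(*day) for day in days]
```

Only long trades are filtered. Short trades are taken regardless of the futures premium, which keeps the strategy aligned with the downward drift in futures returns noted earlier.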

<Insert Tables OA2 and OA3>

Trading VIX CFD (spread betting)

Another way to trade VIX is through contracts for difference (CFD) or spread
betting, such as via platforms like IG.com. However, IG provides limited
historical data, and their daily data, recorded at 5 am UK time, is outside normal
VIX trading hours, leading to a low correlation with the VIX. To better match
VIX calculations, we used hourly data from March to December 2020 (197
days), focusing on trades near the VIX close at 3 pm Central Time.

Table OA4 shows a slight drop in accuracy (1%-3%) across most models
when applied to IG data, except for the basic models and the ensemble. Returns
before costs are also uniformly lower than VIX returns, but decision tree models
(DT, RF, AB) still performed well. Transaction costs, mainly from spreads, were
higher during the pandemic, with an average turnover ratio between 14% and
46%. The AB model doubled its investment over the 197 days, though DT had
the most consistent returns. However, maximum drawdowns reached 58%, and
the short-only (SO) strategy, while accurate, missed larger movements,
delivering returns similar to NB.

<Insert Table OA4>

This test confirms the value of economic data and modeling, despite the
short, pandemic-affected testing period.

Table OA1. SVM performance
This table reports the performance of SVM compared with other ML models. Panel A reports the
correct ratio, information coefficient and timing ratio in the 11-year implementation period.
The information coefficient is calculated by (2×Correct_ratio)−1. The timing ratio is (true
positive ratio + true negative ratio) -1. Detail of the models is in section App.3. Panel B reports
the mean daily return, the Sharpe ratio, and the average maximum annual percentage
drawdown (MDD). ***, **, and * indicate statistical significance at 1%, 5% and 10% level,
respectively.

Panel A. Accuracy and Timing

          Correct   Information           Timing              Positive   Negative
          ratio     coefficient           ratio               hit        hit
Models    Mean      Mean      t           Mean     t          Mean       Mean
1.NB      0.487     -0.0262   -1.36       0.020    1.35       0.770      0.250
2.LR      0.552     0.1046    4.19***     0.097    4.94***    0.480      0.620
3.DT      0.582     0.1634    6.93***     0.126    5.25***    0.350      0.780
4.RF      0.560     0.1194    10.64***    0.122    9.60***    0.570      0.550
5.AB      0.569     0.1383    8.01***     0.122    7.38***    0.480      0.650
6.MLP     0.530     0.0595    2.33**      0.057    2.61**     0.580      0.480
7.ENS     0.566     0.1319    7.19***     0.119    6.41***    0.490      0.630
HAR       0.525     0.0502    2.31**      0.072    4.02***    0.680      0.390
SO        0.546     0.0928    6.78***     0.000    .          0.000      1.000
SVM       0.523     0.0452    2.80**      0.002    0.28       0.220      0.790

Panel B Simulated return applied to VIX.

          Daily Return                    Yearly MDD
Model     Mean      t          Sharpe     Mean    t
1.NB      0.0029    2.79**     0.85       -60%    -4.39***
2.LR      0.0060    5.06***    1.54       -52%    -2.87**
3.DT      0.0056    4.30***    1.27       -64%    -2.57**
4.RF      0.0078    6.85***    2.05       -47%    -2.84**
5.AB      0.0090    5.80***    1.73       -30%    -3.31***
6.MLP     0.0045    3.91***    1.18       -50%    -3.24***
7.ENS     0.0074    7.42***    2.24       -89%    -1.92*
HAR       0.0053    3.89***    1.15       -56%    -3.63***
SVM       -0.0014   -1.80      -0.54      -75%    -17.23***
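
The return measures in Panel B can be illustrated with the sketch below. Several choices are assumptions rather than statements from the paper: the Sharpe ratio is annualized with 252 trading days and a zero risk-free rate, and drawdown is measured on the running sum of daily returns (additive accumulation), which is one reading consistent with drawdowns below -100% appearing in some of the tables.

```python
import math
import statistics


def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio with a zero risk-free rate (assumption)."""
    mean = statistics.fmean(daily_returns)
    std = statistics.stdev(daily_returns)
    return mean / std * math.sqrt(periods_per_year)


def max_drawdown(daily_returns):
    """Largest peak-to-trough fall in the running sum of returns.

    Returns a non-positive number, for example -0.35 for a 35% drawdown.
    """
    cumulative = 0.0
    peak = 0.0
    worst = 0.0
    for r in daily_returns:
        cumulative += r
        peak = max(peak, cumulative)
        worst = min(worst, cumulative - peak)
    return worst


# Hypothetical daily strategy returns.
strategy_returns = [0.10, -0.20, -0.15, 0.30, 0.05]
mean_daily = statistics.fmean(strategy_returns)
sharpe = sharpe_ratio(strategy_returns)
mdd = max_drawdown(strategy_returns)
```

To obtain the yearly figure reported in the tables, the drawdown function would be applied to each calendar year of returns separately and the yearly values averaged.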

Table OA2. Forecast accuracy and return applying to VIX futures
This table reports the performance of a strategy that trades the out-of-sample VIX prediction signal on the nearest-month VIX futures contract. The correct ratio
is the proportion of prediction days with a correct prediction. For the strategy returns, we report the mean daily return and the Sharpe ratio both before
and after costs. Costs are measured by the bid-ask spread of the contract and are applied whenever the trading direction changes. Turnover
measures the proportion of days on which the position must be rebalanced because the trading direction changes. For the after-cost return, we also report the average
maximum annual percentage drawdown (MDD) and the ‘maximum’ of the MDD over the 11 years. ***, **, and * indicate statistical significance at the 1%, 5%, and 10%
levels, respectively.

        Correct ratio  Before Cost Return         After Cost Return          Yearly MDD    Investment days  Turnover
Model   Mean           Mean     t         Sharpe  Mean     t         Sharpe  Mean   Max    Mean             Mean
1.NB    0.445          -0.0023  -2.86**   -0.86   -0.0031  -3.79***  -1.14   -0.73  -1.23  233              0.23
2.LR    0.535          0.0023   3.78***   1.14    0.0008   1.28      0.39    -0.42  -1.32  233              0.37
3.DT    0.555          0.0023   2.29**    0.69    0.0011   1.17      0.35    -0.52  -1.36  233              0.31
4.RF    0.515          0.0014   1.53      0.46    0.0000   0.04      0.01    -0.49  -1.18  233              0.33
5.AB    0.526          0.0029   2.56**    0.77    0.0015   1.48      0.44    -0.43  -0.72  233              0.38
6.MLP   0.508          0.0005   0.56      0.17    -0.0008  -0.76     -0.23   -0.53  -1.24  233              0.30
7.ENS   0.525          0.0020   2.46**    0.74    0.0006   0.73      0.22    -0.58  -1.43  233              0.33
HAR     0.470          -0.0006  -0.70     -0.21   -0.0014  -1.48     -0.45   -0.62  -1.15  233              0.19
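
The cost and turnover conventions described in the table note can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the full relative bid-ask spread is charged on each day the direction changes, no cost is charged on the first day, and the Sharpe ratio is annualised with 252 trading days. The function name and all input values are hypothetical.

```python
import math
import statistics


def futures_strategy_stats(signals, futures_returns, spreads, periods_per_year=252):
    """Before- and after-cost statistics for a daily long/short signal.

    signals:         +1 (long) or -1 (short) for each day
    futures_returns: simple daily returns of the futures contract
    spreads:         relative bid-ask spread for each day (assumed cost measure)
    """
    gross, net = [], []
    changes = 0
    previous = None
    for signal, ret, spread in zip(signals, futures_returns, spreads):
        day_gross = signal * ret
        # Costs are applied only when the trading direction changes.
        changed = previous is not None and signal != previous
        changes += changed
        gross.append(day_gross)
        net.append(day_gross - (spread if changed else 0.0))
        previous = signal

    def sharpe(returns):
        # Assumption: annualised by sqrt(periods_per_year), zero risk-free rate.
        sd = statistics.stdev(returns)
        if sd == 0:
            return float("nan")
        return statistics.mean(returns) / sd * math.sqrt(periods_per_year)

    return {
        "mean_before_cost": statistics.mean(gross),
        "mean_after_cost": statistics.mean(net),
        "sharpe_before_cost": sharpe(gross),
        "sharpe_after_cost": sharpe(net),
        # Proportion of days on which the direction changed.
        "turnover": changes / len(gross),
    }


# Hypothetical five-day example.
stats = futures_strategy_stats(
    signals=[1, 1, -1, -1, 1],
    futures_returns=[0.02, -0.01, -0.03, 0.01, 0.02],
    spreads=[0.004] * 5,
)
```

In this toy example the direction changes on two of five days, so turnover is 0.4 and the mean daily return falls from 0.0100 before costs to 0.0084 after costs.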

Table OA3. Augmented VIX futures strategy: sitting out on conflicting days to avoid shorting pressures
This table reports an augmented VIX futures trading strategy. The strategy trades the out-of-sample VIX prediction signal on the nearest-month VIX futures
contract, except on days when the model predicts an up signal while the VIX futures contract is trading above the VIX spot value. The correct ratio is
the proportion of prediction days with a correct prediction. For the strategy returns, we report the mean daily return and the Sharpe ratio both before
and after costs. Costs are measured by the bid-ask spread of the contract and are applied whenever the trading direction changes. Turnover
measures the proportion of days on which the position must be rebalanced because the trading direction changes. For the after-cost return, we also report the average
maximum annual percentage drawdown (MDD) and the ‘maximum’ of the MDD over the 11 years. ***, **, and * indicate statistical significance at the 1%, 5%, and 10%
levels, respectively.

        Correct ratio  Timing ratio  Before Cost Return         After Cost Return          Yearly MDD    Investment days  Turnover
Model   Mean           Mean          Mean     t         Sharpe  Mean     t         Sharpe  Mean   Min    Mean             Mean
1.NB    0.529          0.005         -0.0001  -0.03     -0.01   -0.0023  -0.76     -0.23   -0.48  -1.21  81               0.60
2.LR    0.593          0.026         0.0040   4.04***   1.22    0.0024   2.24**    0.67    -0.32  -1.28  151              0.36
3.DT    0.599          0.002         0.0031   2.15**    0.65    0.0021   1.52      0.46    -0.47  -1.29  174              0.24
4.RF    0.594          0.006         0.0035   1.99*     0.60    0.0018   1.10      0.33    -0.36  -1.18  128              0.36
5.AB    0.593          0.016         0.0050   3.95***   1.19    0.0036   3.08**    0.93    -0.33  -0.62  149              0.37
6.MLP   0.577          0.014         0.0032   1.98*     0.60    0.0014   0.84      0.25    -0.34  -1.21  124              0.41
7.ENS   0.592          0.006         0.0036   2.60**    0.79    0.0022   1.51      0.45    -0.45  -1.34  145              0.32
HAR     0.562          0.004         0.0019   0.74      0.22    0.0004   0.15      0.04    -0.48  -1.19  98               0.36
SO      0.594          0.000         0.0022   1.88*     0.57    0.0022   1.88*     0.57    -0.41  -1.24  233              0.00
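
The sit-out rule in the table note can be sketched as a simple filter on the daily signal. The sketch assumes signals coded as +1 (up) and -1 (down) and a flat position coded as 0 on sit-out days; the function name and the prices are hypothetical.

```python
def augmented_positions(signals, futures_prices, spot_values):
    """Apply the sit-out rule to a daily +1/-1 prediction signal.

    The position is set to 0 on days when the model predicts an up move
    (+1) while the futures contract trades above the VIX spot value.
    """
    positions = []
    for signal, futures, spot in zip(signals, futures_prices, spot_values):
        conflicting = signal == 1 and futures > spot
        positions.append(0 if conflicting else signal)
    return positions


# Hypothetical five-day example.
positions = augmented_positions(
    signals=[1, 1, -1, -1, 1],
    futures_prices=[20.0, 18.0, 21.0, 19.0, 22.0],
    spot_values=[19.0, 19.0, 20.0, 20.0, 21.0],
)
# Number of days with an open position.
investment_days = sum(1 for p in positions if p != 0)
```

Here the first and last days are skipped because an up signal coincides with futures trading above spot, leaving three investment days out of five. A strategy that never predicts an up move is unaffected by the rule, which is consistent with the SO row showing 233 investment days.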

Table OA4. Trading CFD with VIX forecasts
This table reports the performance of the models when the VIX signals are applied to spread betting data from IG.com between 11 Mar 2020 and 31 Dec 2020.
Hourly data are used, and trades are executed at 3 pm US Central Time, which is 15 minutes before the VIX close. Statistics are reported for analyses
using both the IG price and the hourly VIX data downloaded from Refinitiv Tick History. The table reports the correct ratio and the mean before-cost return.
For IG, the after-cost mean return, cumulated return (sum, no compounding), maximum drawdown, and turnover of the strategy are also reported. N reports the
number of days. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% levels, respectively.

        VIX          IG          VIX_r      IG_r       IG_r_after_cost
Models  correct      correct     Mean       Mean       Mean      Cumulated  MDD    turnover  N
1.NB    0.5          0.5         -0.0033    -0.0029    -0.0036   -0.65      -0.67  0.14      197
2.LR    0.44         0.47        0.0018     -0.0010    -0.0021   -0.51      -0.53  0.2       197
3.DT    0.65         0.63        0.0185***  0.0102***  0.0081**  2.59       -0.58  0.37      197
4.RF    0.55         0.53        0.0123**   0.0080**   0.0058    1.34       -0.22  0.41      197
5.AB    0.59         0.58        0.0189***  0.0077*    0.0052    1.03       -0.37  0.45      197
6.MLP   0.57         0.54        0.0083     0.0007     -0.0009   -0.39      -0.68  0.29      197
7.ENS   0.54         0.55        0.0116**   0.0043     0.0021    0.11       -0.65  0.38      197
SO      0.59         0.57        0.0035     0.0023     0.0023    0.14       -0.52  0.01      197
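
The cumulated return (sum, no compounding) and a matching drawdown measure can be sketched as below. The table note does not spell out how MDD is computed, so measuring it as the largest peak-to-trough fall of the additive cumulated-return curve, with the curve starting at zero, is an assumption of this sketch. The function name and the returns are hypothetical.

```python
def cumulated_return_and_mdd(daily_returns):
    """Additive cumulated return and maximum drawdown of that curve.

    Assumption: returns are summed rather than compounded, and drawdown is
    the largest fall of the running sum below its previous peak, with the
    curve starting at zero.
    """
    cumulated = 0.0
    peak = 0.0
    max_drawdown = 0.0
    for ret in daily_returns:
        cumulated += ret
        peak = max(peak, cumulated)
        # Drawdowns are reported as negative numbers, as in the table.
        max_drawdown = min(max_drawdown, cumulated - peak)
    return cumulated, max_drawdown


# Hypothetical five-day after-cost return series.
cumulated, mdd = cumulated_return_and_mdd([0.05, -0.02, -0.04, 0.03, 0.01])
```

In this example the running sum peaks at 0.05 and falls to -0.01, giving a drawdown of -0.06 and a cumulated return of 0.03.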
