Selecting Mutual Funds Using Machine Learning Classifiers
Cyril Vanderhaeghen
EDHEC Business School does not express approval or disapproval concerning the opinions given in this paper.
This paper uses machine-learning-computed probabilities as fund selection signals and tests this signal in a fund-of-funds portfolio. Using time series data and alternative data, we trained several classification methods, Support Vector Machines, Logistic Regression, Random Forest and an Artificial Neural Network, to be used as decision processes when rebalancing our portfolio of US mutual funds.
We found that the signal was relevant for accurately selecting funds; however, the models were mainly able to capture momentum information within mutual funds.
Table of Contents
1. Introduction
5.5 Relationship between the models’ probability and the returns
7. Conclusion
References
List of Abbreviations
CV Cross-validation
IR Information Ratio
Table of Figures
Figure 1 Sigmoid function
Figure 2 SVM methodology
Figure 3 Random Forest methodology
Figure 4 Artificial Neural Network methodology
Figure 5 Conditional distributions of 3 months returns
Figure 6 Conditional distributions of 6 months returns
Figure 7 Conditional distributions of 12 months returns
Figure 8 Conditional distributions of the volatility
Figure 9 Distribution of the consistency feature
Figure 10 Conditional distributions of the number of days of existence
Figure 11 Average number of positive returns per state
Figure 12 Average number of positive returns per investment style
Figure 13 5-fold ROC curves
Figure 14 Regression of returns to logistic regression probabilities
Figure 15 Regression of returns to SVM probabilities
Figure 16 Regression of returns to random forest probabilities
Figure 17 Regression of returns to ensemble classifier probabilities
Figure 18 Regression of returns to ANN probabilities
Figure 19 Equal weight portfolio of all the funds over time
Figure 20 Excess returns for a quantile of 10%
Figure 21 Excess returns for a quantile of 20%
Figure 22 Excess returns for a quantile of 30%
Figure 23 Excess returns for a quantile of 40%
Figure 24 Excess returns for a quantile of 50%
Figure 25 Predictions’ accuracy
Figure 26 Overall accuracy for models trained without momentum component
Figure 27 Excess returns for models trained without momentum component
Figure 28 Returns when choosing the top 10% funds
Figure 29 Strategies’ value when choosing the top 10% funds
Figure 30 Returns when choosing the top 20% funds
Figure 31 Strategies’ value when choosing the top 20% funds
Figure 32 Returns when choosing the top 30% funds
Figure 33 Strategies’ value when choosing the top 30% funds
Figure 34 Returns when choosing the top 40% funds
Figure 35 Strategies’ value when choosing the top 40% funds
Figure 36 Returns when choosing the top 50% funds
Figure 37 Strategies’ value when choosing the top 50% funds
Table of Tables
Table 1 Hyperparameter tuning results and cross validation scores
Table 2 Regressions’ results
Table 3 Testing mean excess returns’ significance for q = 10%
Table 4 Testing mean excess returns’ significance for q = 20%
Table 5 Testing mean excess returns’ significance for q = 30%
Table 6 Testing mean excess returns’ significance for q = 40%
Table 7 Testing mean excess returns’ significance for q = 50%
Table 8 Information ratios
Table 9 Mean accuracy of only the selected funds for different quantiles
Table 10 Testing results for higher than 50% accuracy
1. Introduction
This paper delivers an analysis of a mutual fund selection signal based on the probabilities outputted by machine learning classifiers. The weighting scheme of each component within the portfolio is based on the models’ calculated probability that the fund yields a positive return over the investment horizon. Using common risk-adjusted measures and prediction precision measures, we will see how the strategies compare to one another when different models are used to compute the investment signals. We will also analyze their performances with respect to a naïve momentum strategy.

A fund manager’s expertise is mostly captured by their funds’ track record. Similarly, a fund’s
commercial document usually displays past performance and presents this information as a selling argument. However, when it comes to fund selection, studies show that, on average, actively
managed funds struggle to outperform their benchmarks and other index funds (Fortin &
Michelson, 2002). Yet, numerous investors are willing to dedicate capital to mutual funds, and we can observe some successful funds within the industry, BlackRock or Vanguard to name a few. This is evidence for the idea that some funds can consistently add value for their clients. It suggests that, using data on the funds’ characteristics, one might be able to identify those performing funds.
Even the most popular fund classification and rating methods, like Morningstar’s methodology, are based on historical performance and use portfolio characteristics such as asset allocation, market capitalization and value-growth score, as well as the beta and alpha of the funds. These ranking methods are widely used by investors as tools to choose among a universe of funds. Within this work, when choosing our explanatory variables, we will stray away from these classical measures and analyze the predictive power of combining usual return-based features with alternative, non-return-based features. Since linear regressions are the most common modelling tools in finance, more complex relationships might not be captured using these classical methods. Moreover, these financial models are regression algorithms which aim at predicting a continuous value rather than a class.
Many now popular machine learning algorithms were developed in the late 20th century. It is only recently, given the exponential growth of computing power as well as the growing amount of data, that industries have taken interest and have been able to efficiently leverage the power of these algorithms. In this paper, we will apply classification models that are not designed to give a continuous numerical prediction but are designed to assign a class and a prediction confidence to the target variable: in this study, a 1 for a positive return and a 0 for a negative one. We will be using both linear models, like logistic regression, and nonlinear models, such as the multilayer perceptron, a common artificial neural network architecture.
2. Related Work
Whether fund-specific characteristics can yield predictive information is still not settled among
researchers. Carhart (1997) showed that there is momentum information within mutual funds. However, results are mixed: for instance, Lakonishok et al. found no relationship between the performance of one year and the following one (Lakonishok, Shleifer, Vishny, Hart & Perry, 1992). Other studies found, on average, no positive abnormal returns across mutual funds.
Machine learning algorithms find successful applications in various fields, image recognition
(Krizhevsky & Sutskever, 2012) or the medical sector (Deo, 2015) to name a few.
Understandably, these algorithms have been a subject of great interest within the financial
industry, too. They are applied to a wide range of classical problems such as stock price prediction (Tarsauliya, Kant, Kala, Tiwari & Shukla, 2010), as well as more original problems like consumer credit risk modelling (Khandani, Kim & Lo, 2010), news sentiment analysis (Ho & Wang, 2016) and even pattern recognition on chart images using Convolutional Neural Networks (Gudelek, Boluk & Ozbayoglu, 2017).
When it comes to funds, Indro, Jiang, Patuwo and Zhang (1999) used an artificial neural network to predict mutual funds’ performance. They found that the neural network performed better than classical linear models for growth and blend funds.
Ludwig and Piovoso (2005) apply Decision Trees, Neural Networks and Naïve Bayes to
compare money managers. They use input features such as 1-, 2- and 5-year excess returns,
percentage of outperforming quarters, tracking error and various ratios. The resulting accuracy
from predicting subsequent managers’ performance is above 65% and exceeded that of the benchmark methods they compared against.

In this paper, we build a feature space based on past returns and non-financial indicators. Trained on this feature space, the models predict, at each given date, which funds to invest in over the following time period.
3. Model Description
This section reviews the algorithms applied as well as the data used. Further, it is described how each model can provide a probabilistic output.
Throughout this study, we will use the widespread open-source Scikit-Learn Python package for training, testing and validating our Logistic Regression, Support Vector Machine, Random Forest and voting ensemble models. To set up and calibrate our Artificial Neural Network, we rely on Scikit-Learn’s multilayer perceptron implementation (MLPClassifier).
3.1 Logistic Regression

Logistic regression is a simple machine learning classification model that maps the result of a linear combination of the input features through the sigmoid function:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
The sigmoid function outputs a value in [0, 1], as displayed in Figure 1. Therefore, one can give a probabilistic interpretation of an element being within a certain class: in our case, the fund yielding positive returns in the next forecasting period. If the function outputs a value above 0.5, the fund is classified as a positive-return fund.
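For illustration, a minimal sketch of how such a probability can be obtained with the Scikit-Learn package used throughout this study; the variable names `X_train`, `y_train` and `X_new` are hypothetical placeholders for the training features, training targets and new observations:

```python
from sklearn.linear_model import LogisticRegression

# X_train: feature matrix, y_train: 1 for a positive next-quarter return, 0 otherwise
clf = LogisticRegression(C=1.0)  # C is the inverse regularization strength
clf.fit(X_train, y_train)

# Column 1 holds sigma(z) for the positive class: the fund's probability
# of yielding a positive return over the next forecasting period
p_positive = clf.predict_proba(X_new)[:, 1]
```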
3.2 Support Vector Machines
Support Vector Machines construct a hyperplane that best separates the funds that yield a positive return over the next period and those that do not. When predicting the funds’ returns, the trained algorithm determines where the new fund’s data points fall with respect to the hyperplane and infers a class.
Support Vector Machines do not originally give any probabilistic interpretation to their classifications; however, using Platt scaling (Platt, 1999), SVMs can be applied in a probabilistic setting.
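In Scikit-Learn, Platt scaling is enabled through the `probability` flag of `SVC`; a minimal sketch, with the same placeholder variable names as before (the kernel settings shown are illustrative, not the tuned values of section 5):

```python
from sklearn.svm import SVC

# probability=True fits a logistic (Platt) model on the SVM scores,
# turning raw distances to the hyperplane into probabilities
svm = SVC(C=1.0, kernel="poly", degree=3, probability=True)
svm.fit(X_train, y_train)
p_positive = svm.predict_proba(X_new)[:, 1]
```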
3.3 Random Forest

Random forest algorithms construct several decision trees using the training data; each decision tree finds several simple binary rules to output a class. The random forest algorithm then takes the mode of the classes predicted by all the individual trees as its prediction.
Figure 3 Random Forest Methodology
Using multiple trees has the benefit of mitigating the tendency of individual trees to overfit the training dataset.
Probabilities can be inferred for the predictions by looking at the proportion of votes for each class across all the trees.
3.4 Ensemble Classifier

This method aggregates the predictions from the previously mentioned algorithms by finding the class that maximizes the sum of predicted probabilities from all the classifiers. On average, the ensemble model works better than single classifiers, since combining several classifiers reduces the prediction variance. The ensemble classifier is composed of the logistic regression, SVM and random forest models.
The probabilities are simply computed as being the average probabilities from each classifier.
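This soft-voting aggregation corresponds to Scikit-Learn’s `VotingClassifier`; a sketch under the same placeholder names, with illustrative rather than tuned hyperparameters:

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# voting="soft" predicts the class maximizing the summed probabilities
# and exposes the averaged probabilities via predict_proba
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(C=1.0)),
        ("svm", SVC(probability=True)),  # Platt scaling for the SVM
        ("rf", RandomForestClassifier(n_estimators=200)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
p_positive = ensemble.predict_proba(X_new)[:, 1]
```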
3.5 Artificial Neural Network

Figure 4 Artificial Neural Network methodology

An Artificial Neural Network is organized in layers, with a chosen number of hidden layers and neurons per layer, each neuron from one layer being connected to all the neurons of the next layer. An ANN can capture complex nonlinear relationships between the input features and the target.
The value $y_k$ of a neuron is the weighted sum of the values of the previous layer’s neurons, passed through an activation function:

$$y_k = \varphi\left(\sum_{i=1}^{n} w_{k,i}\, x_i + b\right)$$
The $x_i$ are the previous layer’s neurons’ values, the $w_{k,i}$ the weights associated with each neuron, $b$ is a bias and $\varphi$ an activation function, typically a Rectified Linear Unit (ReLU), a hyperbolic tangent or a sigmoid.
When fitting the model, the algorithm essentially finds all the appropriate weights between neurons: backpropagation and gradient descent adjust the weights between each node toward their optimal values in order to minimise the loss function. By having one neuron as output to our ANN and choosing a sigmoid activation function for it, the output can be interpreted as the probability of the fund yielding a positive return.
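A sketch of such a network using Scikit-Learn’s `MLPClassifier`, the label under which the ANN appears in the figures of section 6; the architecture shown is a placeholder, the actual one being chosen by the Bayesian optimization of section 5.2:

```python
from sklearn.neural_network import MLPClassifier

# Two hidden layers of 32 ReLU neurons (placeholder architecture);
# for a binary target the output neuron uses a logistic (sigmoid) activation,
# so predict_proba returns the probability of a positive return
ann = MLPClassifier(hidden_layer_sizes=(32, 32), activation="relu", max_iter=500)
ann.fit(X_train, y_train)
p_positive = ann.predict_proba(X_new)[:, 1]
```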
4. Data and Features
The analysed data set is provided by the Wharton Research Data Service database; it stems from the “Survivor-Bias-Free US Mutual Fund” series. It contains historical information such as Net Asset Value (NAV) per share, cash percentage and 52-day low/high, as well as more diverse information such as parent company city/state and phone number. The data ranges from 1962 to today, with information on both active and liquidated funds.
Throughout this study we will use monthly NAV per share data, fund inception date as well as
geographical and investment style data on the funds to conduct feature engineering.
In the next sections we provide a description of the features used as explanatory variables which
we split into two categories: features computed from the NAV per share and alternative features
not based on NAV per share. All the features used to perform model training are computed
from 04/2000 to 04/2001; the models are trained to predict the next quarter’s returns, from 04/2001 to 07/2001.
This prediction date was chosen with the objective of having a balanced amount of positive and negative returns, so that the models train as efficiently as possible. At the date of prediction,
there are 2247 funds with positive returns and 1543 with negative returns, a slight imbalance towards positive returns. The monthly returns are computed from the NAV per share as:

$$r_t = \frac{NAV_t - NAV_{t-1}}{NAV_{t-1}}$$
4.1 Return based features
Using the monthly NAV, we computed returns from which we defined the following 5 features:
- The past 3-, 6- and 12-months returns
- The consistency of returns within the last 12 months
- The annualized volatility of monthly returns within the last 12 months, defined as the standard deviation of the monthly returns scaled by √12
The consistency and the 3-, 6- and 12-months returns were defined with the idea of capturing
momentum effects.
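For illustration, a sketch of how these features could be computed with pandas, assuming a hypothetical DataFrame `nav` of monthly NAV per share with one column per fund; the consistency definition shown, a count of positive months, is our assumption:

```python
import numpy as np
import pandas as pd

# nav: monthly NAV per share, rows indexed by month-end date, one column per fund
monthly_ret = nav.pct_change()
last_12 = monthly_ret.tail(12)

features = pd.DataFrame({
    "ret_3m": nav.pct_change(3).iloc[-1],    # past 3-months return
    "ret_6m": nav.pct_change(6).iloc[-1],    # past 6-months return
    "ret_12m": nav.pct_change(12).iloc[-1],  # past 12-months return
    # assumed definition: number of positive monthly returns over the last year
    "consistency": (last_12 > 0).sum(),
    # annualized volatility of the last 12 monthly returns
    "volatility": last_12.std() * np.sqrt(12),
})
```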
The features defined above display the following distributions, conditional on the sign of the forward 3-months returns.
Figure 6 Conditional distributions of 6 months returns
As we can see in Figures 5, 6 and 7, the conditional distributions show a positive relationship with the next returns: more funds with positive 3-, 6- and 12-months returns yield positive returns over the next 3 months, and the distributions have observably different means.
Figure 8 Conditional distributions of the volatility
4.2 Alternative features

To capture information that might not be present in the returns, we defined one continuous feature and several categorical features not based on the NAV per share.
Length of existence Feature

Using the funds’ inception dates, we computed each fund’s number of days of existence. As displayed in Figure 10, the distribution conditional on forward positive returns shows a different variance.

Figure 10 Conditional distributions of the number of days of existence
Location Feature

We defined a categorical feature based on the state of the US in which the fund’s parent company is located. The rationale is that, depending on the location, a fund could have advantages such as better infrastructure, contacts or access to talent.
For the 47 different states within the database, we averaged the number of positive returns of all the funds located in each state and defined four categories using the 25%, 50% and 75% quantiles. Figure 11 displays the states’ average number of positive returns, ranked from largest to smallest.
We assume that the state location features are stable and do not change over time.
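A sketch of this quantile-based encoding, assuming a hypothetical pandas Series `mean_pos` holding each state’s average number of positive returns; the same logic applies to the investment style codes below:

```python
import pandas as pd

# Bucket the states into four categories at the 25%, 50% and 75% quantiles
buckets = pd.qcut(mean_pos, q=[0, 0.25, 0.50, 0.75, 1.0],
                  labels=["q1", "q2", "q3", "q4"])

# One dummy variable per bucket, to be merged into the feature matrix by state
state_dummies = pd.get_dummies(buckets, prefix="state")
```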
Figure 11 Average number of positive returns per state
We follow the same methodology to define four categorical variables based on the Wiesenberger, Strategic Insight and Lipper Objective codes. The CRSP Style Code consists of up to four characters, with each position defined; reading left to right, the four characters represent an increasing level of granularity¹. For example, a mutual fund code is EDYG, for Equity, Domestic, Style, Growth.
Figure 12 shows the mean number of positive monthly returns for all funds in each CRSP Style
Code. We proceeded as for the state features by creating four dummy variables based on the
25%, 50% and 75% quantiles of the mean number of positive returns.
Again, we assume that a given fund’s investment methodology does not change over time.
¹ The complete descriptions of the codes are available at http://www.crsp.com/products/documentation/crsp-style-code
Figure 12 Average number of positive returns per investment style
In total there are 14 features and 3790 different funds to train our models on.
5. Calibrating the models and training results
5.1 Hyperparameter Tuning

Hyperparameter tuning is a very important task, as hyperparameters are the model parameters that cannot be learned from the training data during the training process and thus must be set by the user. Each model has its own hyperparameters; we decided to tune the following ones:
- For Logistic Regression: the inverse regularization strength, which accounts for overfitting.
- For SVM: the C parameter, which represents the hyperplane’s margin to the classes, and the degree of the polynomial kernel.
- For Random Forest: the total number of trees in the forest.
- For the Artificial Neural Network: the number of hidden layers and the number of neurons per layer.
We conduct hyperparameter tuning with random search for logistic regression, SVM and random forest, and with Bayesian optimization for the ANN.
To conduct the random search optimization and to validate our models, we use cross validation (CV). CV is a powerful method to evaluate an algorithm’s predictive power while controlling for overfitting. For instance, a 5-fold cross validation splits the training data into 5 equally sized sets; the model is trained on 4 of them and makes predictions on the remaining one, with each set serving once as the hold-out set.
5.2 Bayesian Optimization
We decided to use a Bayesian Optimization approach (Snoek, Larochelle & Adams, 2012), implemented with the Python package available at https://github.com/fmfn/BayesianOptimization, for the ANN model, as it is less computationally expensive than a grid search and more efficient than a random search.
The idea behind Bayesian Optimization is to find the parameters that maximize an unknown function by evaluating it at different points while taking the previously tried values into account through a Gaussian process. Every new evaluation point is chosen as the set of parameters with the highest expected improvement:

$$EI(x) = \mathbb{E}\left[\max\left(f(x) - f(x^*),\, 0\right)\right]$$

with $f$ the function to maximize and $x^*$ the set of parameters giving the current maximum of the function.
Within the deep learning framework, the number of epochs is the number of times the data is fed into the ANN with the weights updated using gradient descent. In our case, at each epoch, the ANN trains on 66% of the data, then makes predictions and computes the accuracy on the remaining 33%. With this definition in mind, we define the function to optimize as the average accuracy on the 33% validation split over 60 epochs. We perform the optimization with 20 evaluations on the following search space: one to five hidden layers, with a varying number of neurons per layer.
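A sketch of this procedure using the package cited above; the bounds on the number of neurons per layer are placeholders, and `X_train`, `y_train`, `X_val`, `y_val` stand for the 66%/33% split described above:

```python
import numpy as np
from bayes_opt import BayesianOptimization
from sklearn.neural_network import MLPClassifier

def avg_validation_accuracy(n_layers, n_neurons):
    """Objective: average accuracy on the 33% validation split over 60 epochs."""
    layers = (int(n_neurons),) * int(n_layers)
    ann = MLPClassifier(hidden_layer_sizes=layers, max_iter=1, warm_start=True)
    accs = []
    for _ in range(60):                      # one gradient-descent epoch per fit call
        ann.fit(X_train, y_train)
        accs.append(ann.score(X_val, y_val))
    return float(np.mean(accs))

optimizer = BayesianOptimization(
    f=avg_validation_accuracy,
    pbounds={"n_layers": (1, 5), "n_neurons": (5, 100)},  # neuron bounds assumed
    random_state=0,
)
optimizer.maximize(init_points=5, n_iter=15)  # 20 evaluations in total
best_params = optimizer.max["params"]
```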
5.3 Random Search

This method works by trying values randomly within a given search space and performing cross validation to find the best set of tried parameters; a sketch of such a search is given after the list below. The search spaces are as follows:
- For Logistic Regression’s inverse regularization strength, from 0.01 to 10
- For SVM’s C parameter and polynomial degree kernel parameter respectively, from
- For Random Forests’ total number of trees in the forest, from 100 to 300
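A sketch of such a search for the logistic regression model, using Scikit-Learn’s `RandomizedSearchCV` with 5-fold cross validation; the number of sampled candidates is a placeholder:

```python
from scipy.stats import uniform
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# Sample C uniformly in [0.01, 10] and keep the candidate with the
# best mean 5-fold cross validation accuracy
search = RandomizedSearchCV(
    LogisticRegression(),
    param_distributions={"C": uniform(loc=0.01, scale=9.99)},
    n_iter=30,            # number of random candidates (placeholder)
    cv=5,
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)
best_c = search.best_params_["C"]
```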
Table 1 shows the tuned hyperparameter values as well as the mean 5-fold cross validation score, defined as the average accuracy over each testing set. The accuracy is computed as follows:

$$acc = \frac{TP + TN}{TP + TN + FP + FN}$$
The cross-validation scores are very high for a finance exercise; we believe this is because the models capture very well the strong relationship between past returns and future returns displayed in the feature description section. Indeed, if we were to choose funds solely based on the sign of the past 3-months returns, we would already achieve an accuracy of around 64%.
Judging by the mean 5-fold CV score, the best model appears to be the random forest algorithm and the worst the ANN, which displays a bit more overfitting than the other models.
Table 1 Hyperparameter tuning results and cross validation scores
5.4 ROC Curves

To further compare our models, we look at their 5-fold Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC). The ROC curves show the relationship between the false positive and true positive rates as the classification threshold changes. An n-fold ROC curve is a similar concept to cross validation: we train the model several times on different hold-out data, compute the predictions on the rest of the data, and then compute a ROC curve for each fold.
The AUC can be interpreted as the probability of the model classifying a randomly chosen
positive instance higher than a randomly chosen negative one. It gives a metric to compare our
models.
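A sketch of how these n-fold ROC curves and AUC values can be computed, where `X`, `y` and `model` are placeholders for the training data and any of the fitted classifiers:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import StratifiedKFold

aucs = []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X, y):
    model.fit(X[train_idx], y[train_idx])           # retrain on 4 folds
    probs = model.predict_proba(X[test_idx])[:, 1]  # score the held-out fold
    fpr, tpr, _ = roc_curve(y[test_idx], probs)     # one ROC curve per fold
    aucs.append(auc(fpr, tpr))
mean_auc = np.mean(aucs)
```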
Figure 13 5-fold ROC curves
The ROC curves are plotted in Figure 13. The models reach the following AUC values:
- SVM: 0.86
- ANN: 0.91
5.5 Relationship between the models’ probability and the returns

The models output probabilities with the methods described previously in section 3. To get an idea of the relationship between the returns and the predicted probabilities, we conduct a linear regression. The target variable is the training data’s target variable: the 3-months returns from 04/2001 to 07/2001. The explanatory variables are the probabilities associated with the models’ predictions.
To conduct this analysis, we split the dataset into 66% training and 33% testing samples, fit the models on the training sample and compute the probabilities on the testing sample; the results are displayed in Figures 14 to 18.
Figure 14 Regression of returns to logistic regression probabilities
Figure 17 Regression of returns to ensemble classifier probabilities
As we can see in Figures 14 to 18, the funds’ returns seem to be well captured by the probabilities outputted by the models in a linear regression. Table 2 shows the regressions’ coefficients and r-squared; the coefficients are all highly statistically significant.
The coefficients and r-squared are all within a similar range, and these results provide support for using the models’ probabilities as an investment signal.
Table 2 Regressing returns against probability results
6. Applying the models in a strategy
We apply the models in a long-only strategy run on a universe of 10,415 funds. The time window used to run the strategy goes from 01/2002 to 12/2017, which corresponds to 64 quarters. The portfolio is rebalanced quarterly; at each rebalancing date, we filter the relevant funds, meaning those with at least 1 year of data, recompute the NAV-based features using up to 1 year of data and add the non-time-dependent features. Then, using the models, we compute each fund’s probability of yielding a positive return over the next quarter.
The investment signal used is the probability, outputted by the models, of the fund yielding a positive return in 3 months’ time. The probabilities are ranked from highest to lowest and the portfolio for the next quarter is composed of a chosen quantile of the funds. We define a weighting scheme so that funds with higher probabilities have a higher weight:

$$w_i = \frac{p_i}{\sum_k p_k}$$

where $p_i$ is the probability associated with fund $i$ and the sum runs over all the selected funds.
We also create a naïve momentum strategy in which we invest in the best past 3 months
performers, weighted according to the magnitude of the last 3 months returns, similarly to the
probability weighting:
$$w_i = \frac{r_i}{\sum_k r_k}$$

where $\sum_k r_k$ is the sum over all the selected funds’ previous quarter returns.
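A sketch of the selection and weighting step common to both signals, assuming a hypothetical pandas Series `signal` (predicted probabilities, or past 3-months returns for the naïve strategy) indexed by fund:

```python
import pandas as pd

def select_and_weight(signal: pd.Series, quantile: float = 0.10) -> pd.Series:
    """Keep the top `quantile` of funds by signal and weight them
    proportionally to the signal: w_i = s_i / sum_k s_k."""
    n_selected = max(int(len(signal) * quantile), 1)
    top = signal.nlargest(n_selected)
    # For the naive strategy this assumes the selected past returns are positive
    return top / top.sum()

weights = select_and_weight(probabilities, quantile=0.10)  # probabilities: model output
```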
We analyze the excess returns of each strategy, computed against an equally weighted market portfolio made of all the mutual funds available in our dataset, whose value over time is displayed in Figure 19.
Figures 20 to 24 below show the quarterly excess returns of the strategies for different quantiles of the universe of funds, ranging from the top 10% to the top 50% of probabilities.
The best performing models are logistic regression and the ANN (labeled MLPClassifier); however, the models yield similar return patterns, with SVM being the worst performer. We can clearly see that the machine learning based strategies’ excess returns are correlated with the Naïve momentum strategy’s performance. This might suggest that the models mostly captured momentum effects, which we investigate more in depth in part 6.3.
As we select more and more funds with lower prediction confidence, the figures below show that the strategies’ returns diminish and look more and more similar, because the strategies are more likely to pick the same funds. The Naïve strategy, on the other hand, does not change much beyond some quantile.
Figure 22 Excess Returns for a quantile of 30%
Figure 24 Excess Returns for a quantile of 50%
Tables 3 to 7 below show the results of testing the significance of the strategies’ mean excess returns as less confident predictions are added to the portfolio. Again, the more funds we choose, the lower the expected excess returns and the less the excess returns are significantly different from zero. This is in line with the regression results shown in 5.5: the models’ probabilities can be a good investment signal, as the excess returns from the strategies built on the most confident predictions are the largest.
We can also observe that the Naïve momentum strategy yields much higher and more significant excess returns.
More figures displaying absolute returns and P&L are available in the appendix.
Table 4 Testing mean excess returns’ significance for q = 20%
Table 7 Testing mean excess returns’ significance for q = 50%
Table 8 displays the information ratio (IR) of each strategy with respect to the market portfolio. This ratio captures the gain in expected excess return per unit of risk and is computed as follows:

$$IR = \frac{E[r - r_m]}{\sqrt{var[r - r_m]}}$$

where $r$ is the strategy’s return and $r_m$ the market portfolio’s return.
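For completeness, a sketch of this computation on quarterly return series, with hypothetical variable names:

```python
import pandas as pd

def information_ratio(strategy_ret: pd.Series, market_ret: pd.Series) -> float:
    """IR = mean(r - r_m) / std(r - r_m), on quarterly excess returns."""
    excess = strategy_ret - market_ret
    return excess.mean() / excess.std()
```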
It appears that the ANN’s IR drops more significantly than the other methods’ when we choose more funds. Logistic Regression, on the other hand, consistently outperforms the other algorithms and, again, SVM performs the worst; but none of our methods beats the Naïve momentum strategy.
6.2 Predictions’ accuracy
To investigate the predictive power of the models, we assess their prediction accuracy. The prediction accuracies over time, on each quarter, are displayed in Figure 25. They vary with the following standard deviations:
- SVM: 7.62%
- ANN: 8.39%
- Naïve: 10.38%
The machine learning based methods seem more reliable than the Naïve selection method, which displays the most variable accuracy over time.
The average accuracies are all similar and lower than the results we obtained on the training
data:
- SVM: 64.75%
- ANN: 65.30%
- Naïve: 65.76%
The nature of the training data we used is likely the reason for this difference between the average accuracy over the back-testing period and the training accuracies, as the training data had a slight imbalance between the number of positive and negative instances.
Table 9 displays the mean accuracies when only considering the funds in our portfolio. They are much higher than the overall accuracies; however, the Naïve method seems less reliable than our machine learning models. The outperformance of the Naïve method seen previously thus suggests that, although it is less accurate at predicting whether or not a fund will be profitable in the next period, the funds it selects yield higher returns.
Table 9 Mean accuracy of only the selected funds for different quantiles
6.3 Looking at the momentum effect
The results observed in the previous parts suggest that the machine learning based strategies could capture the same excess fund performance as a Naïve momentum selection strategy. In order to examine more closely how much of the momentum effect our algorithms have captured, we retrained our models and back-tested them after removing all the features related to momentum effects. We therefore removed the 3-, 6- and 12-months returns as well as the consistency of returns feature, as those features were created in an effort to measure past performance, that is, momentum. Thus, the feature space for this test consists of the volatility, the number of days of existence and the geographical and investment style features, a total of 10 variables. Volatility remains in the feature space as it is a proxy for the past riskiness and dispersion of fund returns. Stivers and Sun (2010), however, find for stock returns that dispersion is negatively related to subsequent momentum premiums, which means for our analysis that, if their result applied to funds too, some momentum information might be captured by volatility. For this analysis, however, volatility is retained as a proxy for riskiness.
Figure 26 shows the overall accuracy of our models, to be compared with Figure 25. Clearly, the accuracies are no better than random draws, with average accuracies ranging from 46% to 53%. However, looking closely at the accuracy patterns of the ANN and Logistic Regression, they appear to be more correlated to each other than to the other models, possibly because they use the same method to infer prediction confidence.
Figure 26 Overall accuracy for models trained without momentum component
This visual conclusion is consolidated by Table 10 below, which summarizes the results of testing whether the models’ accuracies are significantly higher than 50%.
Figure 27 Excess returns for models trained without momentum component
In line with the previous finding, the excess returns are mostly negative throughout the back-testing period.
7. Conclusion
In this paper, we applied logistic regression, random forest, support vector machines, an ensemble classifier and an artificial neural network to a fund selection problem. We defined a fund selection signal based on the probabilities given by the models, which represent the models’ confidence when classifying the next period’s returns as positive. The explanatory variables we defined include both past-returns-based features, namely volatility, consistency of returns and past returns, and alternative features extracting information from geographical and investment style data.
We applied the models when back-testing a strategy building a fund-of-funds portfolio on a universe of 10,415 funds from the Survivor-Bias-Free US Mutual Fund database of the Wharton Research Data Service. The models were trained to predict the 3-months forward returns from 04/2001 to 07/2001 using features computed prior to 04/2001; the accuracy on this training sample was very high. They were then used over a 16-year back-testing period from 01/2002 to 12/2017.
The probabilistic signal proved to be relevant for selecting funds. However, when testing the models without momentum-related features, their accuracy could not be statistically distinguished from random guessing. Thus, it can be inferred that the original models we developed and trained were only able to capture momentum information within the explanatory variables, and that no information came from the non-return-based features. The machine learning algorithms do not statistically outperform a naïve momentum fund selection strategy, but proved to be better at correctly selecting funds that will yield positive returns over the next time period, irrespective
of the magnitude of the returns. The best performing algorithms were logistic regression and
the artificial neural network, possibly because they share the same method to infer prediction
confidence. On the other hand, the support vector machine performed the worst; this might be because the algorithm is not originally designed to provide a probabilistic output, making it less reliable.
The study can be extended and pushed further by improving the feature engineering and selection. Using fewer momentum-capturing features, the accuracy of the models might improve, and classical financial measures, such as the fund’s alpha or various financial ratios, might add information. Factoring the funds’ fees and transaction costs into the calculation of the performance or into the models’ features would also be an interesting addition to the study.
Although the alternative features we chose were not able to provide additional information, many studies in other areas suggest that information can be contained in non-financial data. For instance, one could create a feature describing the fund manager’s experience and qualification (Chevalier & Ellison, 1999) and treat it as a time-dependent variable, as a fund’s management can change over time. Other machine learning models should also be looked at, as they stray from classical finance models and might capture relationships that classical methods cannot.
References
Brown, S. J., & Goetzmann, W. N. (1995). Performance persistence. The Journal of Finance, 50(2), 679-698.
Carhart, M. M. (1997). On persistence in mutual fund performance. The Journal of Finance, 52(1), 57-82.
Chevalier, J., & Ellison, G. (1999). Are some mutual fund managers better than others? Cross‐
sectional patterns in behavior and performance. The Journal of Finance, 54(3), 875-899.
Fortin, R., & Michelson, S. (2002). Indexing versus active mutual fund management. Journal of Financial Planning.
Gudelek, M. U., Boluk, S. A., & Ozbayoglu, A. M. (2017, November). A deep learning-based stock trading model with 2-D CNN trend detection. In 2017 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE.
Ho, K. Y., & Wang, W. W. (2016). Predicting stock price movements with news sentiment: An
artificial neural network approach. In Artificial Neural Network Modelling (pp. 395-403).
Springer, Cham.
Indro, D. C., Jiang, C. X., Patuwo, B. E., & Zhang, G. P. (1999). Predicting mutual fund performance using artificial neural networks. Omega, 27(3), 373-380.
Khandani, A. E., Kim, A. J., & Lo, A. W. (2010). Consumer credit-risk models via machine-learning algorithms. Journal of Banking & Finance, 34(11), 2767-2787.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
Lakonishok, J., Shleifer, A., Vishny, R. W., Hart, O., & Perry, G. L. (1992). The structure and performance of the money management industry. Brookings Papers on Economic Activity: Microeconomics, 339-391.
Platt, J. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 10(3), 61-74.
Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems (pp. 2951-2959).
Stivers, C., & Sun, L. (2010). Cross-sectional return dispersion and time variation in value and momentum premiums. Journal of Financial and Quantitative Analysis, 45(4), 987-1014.
Tarsauliya, A., Kant, S., Kala, R., Tiwari, R., & Shukla, A. (2010). Analysis of artificial neural
network for financial time series forecasting. International Journal of Computer Applications,
9(5), 16-22.
Titman, S., & Grinblatt, M. (1989). Mutual fund performance: An analysis of quarterly portfolio holdings. Journal of Business, 62(3), 393-416.