

Enhancing Performance of Investment Strategies Based on Text Categorization with NLP Techniques


2024/8/1
Yao-Tsung Chen and Shu-Yi Lin
Department of Information Management and Finance,
College of Management,
National Yang Ming Chiao Tung University,
1001 University Road,
Hsinchu City 300093, Taiwan
Yao-Tsung Chen is an assistant professor in the Department of Information Management and Finance, College of Management, National Yang Ming Chiao Tung University, Taiwan.
[email protected]

Shu-Yi Lin received his M.S. degree from the Graduate Program of Finance, Department of Information Management and Finance, College of Management, National Yang Ming Chiao Tung University, Taiwan.
[email protected]
We have no conflicts of interest to disclose.

Enhancing Performance of Investment Strategies Based on Text Categorization with NLP Techniques


2024/8/1

Abstract
With the rapid development of information technology, we can employ natural language processing (NLP) techniques to extract textual features and enhance investment performance. We crawl news articles from the Wall Street Journal website, use latent Dirichlet allocation (LDA) and Financial Bidirectional Encoder Representations from Transformers (FinBERT) to extract textual features from those articles, and apply the extracted features to text classification and, further, to investment strategies. The empirical results indicate that NLP techniques enhance the performance of investment strategies based on text classification with LDA and FinBERT.
Keywords: Investment Strategy, Text Classification, Natural Language Processing, Feature Extraction, Sentiment Word List
JEL Codes: G11, C63

Highlights
• This study explores the application of NLP techniques in financial text analysis.
• Strategies based on FinBERT and LDA both outperform strategies based on bag-of-words models.
• The outperformance of text categorization with FinBERT over bag-of-words models is statistically significant.

1. Introduction
Financial market sentiment has been studied for about two decades, and early research often used simple metrics for prediction. For example, Tetlock, Saar-Tsechansky, and Macskassy (2008) use regression models that regress revenue or stock returns on a sentiment index calculated as the ratio of negative words to the total number of words. Li et al. (2014) use sentiment analysis to study the impact of financial news on stock price returns. Recently, Chen et al. (2024) propose an investment strategy based on text categorization with sentiment words, and the empirical evidence suggests that the strategy earns a short-term positive return.
Most of the literature relies on manually annotated financial sentiment word lists (Loughran and McDonald, 2011) to express the sentiment of articles. However, a bag-of-words model is not sufficient to capture the meaning of news articles. For instance, homonyms (i.e., words with multiple meanings) and synonyms (i.e., words with the same or similar meanings) cannot be handled well.
Advanced natural language processing (NLP) models can overcome these shortcomings by considering the various parts of an article comprehensively and combining contextual relationships for a more holistic understanding. For example, latent Dirichlet allocation (LDA) can automatically extract latent topics from a large volume of text, reducing the need for manual annotation (Blei et al., 2003). Financial Bidirectional Encoder Representations from Transformers (FinBERT) is based on Google's Bidirectional Encoder Representations from Transformers (BERT), pre-trained and fine-tuned on a large amount of unlabeled text in the financial domain, and significantly outperforms Loughran and McDonald's sentiment word list (L&M's list) in classifying sentiment and identifying environmental, social, and governance (ESG) issues (Araci, 2019).
To show the advantages of NLP-based feature extraction, we use LDA and FinBERT to extract text features from news articles. We also select features (i.e., words) contained in L&M's list. We then apply these three kinds of feature sets in the investment strategy proposed by Chen et al. (2024) to compare their investment performance. The hypotheses we test are as follows:
Hypothesis 1. An investment strategy based on features generated by LDA has higher short-term positive returns than an investment strategy based on news sentiment words.
Hypothesis 2. An investment strategy based on features generated by LDA has higher short-term positive returns than an investment strategy based on all words.
Hypothesis 3. An investment strategy based on features generated by FinBERT has higher short-term positive returns than an investment strategy based on news sentiment words.
Hypothesis 4. An investment strategy based on features generated by FinBERT has higher short-term positive returns than an investment strategy based on all words.

2. Methodology
To test the hypotheses, we follow the testing procedure proposed by Chen et al. (2024). The major steps are as follows:
(1) crawling news articles;
(2) preprocessing the articles, such as removing HTML tags, removing stop words, and stemming (for FinBERT, the latter two steps are optional);
(3) extracting tickers and corresponding dates;
(4) retrieving historical stock prices by ticker and date;
(5) labelling each training article with one of five trend categories;
(6) selecting or extracting features;
(7) training a text classifier on the features of the training articles and their associated trend categories;
(8) using the trained classifier to classify a testing article into a trend category;
(9) applying the investment strategy according to the predicted trend category;
(10) calculating the investment performance and testing the hypotheses.
2.1. Preprocessing
To remove noise from the crawled news articles, the first step is to clean up garbled or coded content such as scripts and HTML tags. Next, stop words in the articles are removed. Stop words typically refer to high-frequency words that are of little use for text analysis, such as 'a', 'you', 'on', etc. Stop word lists are built into the Natural Language Toolkit (NLTK) package for Python. However, given the specificity of the financial and economic fields, further stop words commonly used in financial reports, news articles, market analyses, and related business documents should be removed as well. A financial stop word list can be downloaded from the Notre Dame Software Repository for Accounting and Finance.
Next, stemming should be performed on words that have the same meaning but different forms, to prevent the model from producing inconsistent results for the same word. A common case of stemming is to reduce the words 'cooking' and 'cooked' to their root word 'cook'. Stemmers are also built into the NLTK package.
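A minimal sketch of this preprocessing step, assuming NLTK is installed and the article text has already been stripped of scripts and HTML tags; the financial stop-word file name in the comment is a placeholder for the Notre Dame list mentioned above.

```python
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

nltk.download("stopwords", quiet=True)

stop_words = set(stopwords.words("english"))
# Optionally extend with the financial stop-word list (hypothetical file name):
# stop_words |= {w.strip().lower() for w in open("financial_stopwords.txt")}

stemmer = PorterStemmer()

def preprocess(text: str) -> list[str]:
    """Lowercase, tokenize, remove stop words, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stemmer.stem(t) for t in tokens if t not in stop_words]

print(preprocess("Cooking costs were cooked into the quarterly report."))
# ['cook', 'cost', 'cook', 'quarterli', 'report']
```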
2.2. Data Labeling
After crawling and preprocessing the news articles, we extract the ticker and release date from each article. Articles mentioning more than one ticker are removed because we cannot determine which ticker such articles belong to. With the tickers and release dates, we can query stock prices via the Yahoo! Finance API. Following Chen et al. (2024), we obtain the opening, highest, lowest, and closing prices of each stock for the three days before and after the article is released. The highest and lowest stock prices over these six days are used to label the article with one of five profit-range categories: strong downward (below -10%), medium downward (-10% to -5%), neutral (-5% to 5%), medium upward (5% to 10%), and strong upward (above 10%).
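A hedged sketch of this labeling step. It uses the third-party yfinance package as one way to query the Yahoo! Finance API; the direction rule and the use of calendar rather than trading days are assumptions and may differ in detail from Chen et al. (2024).

```python
import pandas as pd
import yfinance as yf

def label_article(ticker: str, release_date: str) -> str:
    """Label an article by the stock's price range in the 3 days before/after release."""
    start = pd.Timestamp(release_date) - pd.Timedelta(days=3)
    end = pd.Timestamp(release_date) + pd.Timedelta(days=4)
    px = yf.Ticker(ticker).history(start=start, end=end)
    rel = pd.Timestamp(release_date).date()
    before = px[px.index.date <= rel]            # window up to the release date
    after = px[px.index.date >= rel]             # window from the release date onward
    bh, bl = before["High"].max(), before["Low"].min()
    ah, al = after["High"].max(), after["Low"].min()
    upward = (ah - bl) >= (bh - al)              # assumed rule for the price direction
    move = (ah - bl) / bl if upward else (al - bh) / bh
    if move >= 0.10:
        return "strong upward"
    if move >= 0.05:
        return "medium upward"
    if move > -0.05:
        return "neutral"
    if move > -0.10:
        return "medium downward"
    return "strong downward"

# Example: label_article("AAPL", "2022-06-01")
```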
2.3. Feature Extraction (or Selection)
In a traditional bag-of-words model, each word in the training data (or corpus) is a feature. However, the corpus contains many duplicate or irrelevant features, so removing them does not result in a loss of information. To simplify a model, make it easier to understand, shorten training time, improve generality, or reduce overfitting, the number of features should be reduced. One kind of method is feature selection, in which important or representative features are selected by some metric or from a reference dictionary. In this study, we select representative financial features from L&M's list.
Many metrics for measuring the importance of a word, such as term frequency (TF) and inverse document frequency (IDF), are built into the Scikit-learn package for Python. Following the literature (Jegadeesh and Wu, 2013; Ignatov, 2023; Chen et al., 2024), the inverse document frequency (IDF) for each positive and negative word in L&M's list is computed as follows:
IDF(d, k) = log10(N / DF(k)),

where DF(k) is the number of training articles containing the keyword k, N is the total number of training articles, and IDF(d, k) is the inverse document frequency of keyword k in article d. A higher IDF value indicates greater discriminative power for classification, since fewer articles contain the word.
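A small sketch of this IDF computation, assuming each article has already been preprocessed into a list of tokens and that `lm_words` holds the positive and negative words from L&M's list (both names are placeholders).

```python
import math
from collections import Counter

def idf_features(articles: list[list[str]], lm_words: set[str]) -> list[dict[str, float]]:
    """For each article, return a feature map of IDF values for the L&M words it contains."""
    n = len(articles)
    # DF(k): number of training articles that contain keyword k
    df = Counter(w for tokens in articles for w in set(tokens) if w in lm_words)
    idf = {k: math.log10(n / count) for k, count in df.items()}
    features = []
    for tokens in articles:
        present = set(tokens)
        features.append({k: (idf[k] if k in present else 0.0) for k in idf})
    return features
```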
Another kind of method for reducing the number of features is feature extraction, in which informative and non-redundant derived values are constructed from the initial measured data. LDA is based on the assumption that each document is a mixture of a set of latent topics, where each topic is associated with a specific collection of words. LDA can not only identify themes within articles but also effectively distinguish words with multiple meanings in English, assigning them to the appropriate topics. For example, the word 'bank' can appear both in topics related to banking and in topics discussing lakes and rivers. In the training stage, LDA analyzes a corpus and creates a term-to-topic matrix in which each value gives the loading of the mapping from term to topic. In the testing stage, the word vector representing an article is multiplied by the term-to-topic matrix to generate a topic vector representing the same article. LDA is also built into the Scikit-learn package.
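A sketch of the LDA step with Scikit-learn, assuming `train_texts` and `test_texts` are lists of preprocessed article strings; the vocabulary cap of 20,000 terms is an assumption, while the 1,000 topics mirror the feature count reported in Section 3.2.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer(max_features=20000)      # term counts (bag of words)
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

lda = LatentDirichletAllocation(n_components=1000, random_state=0)
topics_train = lda.fit_transform(X_train)             # article-by-topic loading matrix
topics_test = lda.transform(X_test)                   # topic vectors for the test articles
```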
FinBERT is another feature extraction model. It is based on the BERT architecture and is specifically designed to meet the complex needs of the financial domain. The BERT architecture is a neural network that relies heavily on a self-attention mechanism. Its uniqueness lies in its bidirectional training strategy, which considers the context both preceding and following each position in a document during training, providing a more comprehensive understanding of the context. FinBERT is pre-trained on financial datasets such as financial news, market analysis reports, and analysts' comments. This specialized training enables FinBERT to understand and process the terminology and expressions of the financial industry more accurately. In the testing stage, FinBERT can represent an article as an embedded vector. FinBERT can be downloaded from the Hugging Face website.
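A hedged sketch of extracting an article embedding with a FinBERT checkpoint from the Hugging Face hub using the transformers package; "ProsusAI/finbert" is one publicly available FinBERT model and not necessarily the exact checkpoint used in this study, and mean pooling over token states is an assumed way to obtain a single 768-dimensional vector per article.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
model = AutoModel.from_pretrained("ProsusAI/finbert")
model.eval()

def embed(text: str) -> torch.Tensor:
    """Return one 768-dimensional embedding for an article (mean-pooled token states)."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state    # shape: (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)              # average over tokens

vector = embed("Shares surged after the company raised its full-year guidance.")
print(vector.shape)    # torch.Size([768])
```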
2.4. Classifier Training
The extracted (or selected) feature vectors and the labelled trend categories are used for training. As Chen et al. (2024) show, tree-based classifiers perform well in this investment task. However, since the training data may contain noisy patterns, an extended version called random forest (RF) (Breiman, 2001), built into the Scikit-learn package, is used in this study. RF builds several decision trees and combines their predictions through voting or averaging to enhance accuracy. During training, each decision tree samples from the original dataset using the bootstrap method, and at each decision node a random subset of features is chosen to evaluate the best split. This approach not only adds diversity to the models but also reduces the risk of overfitting and enhances generalization to new data by aggregating the independent predictions of all trees through majority voting (for classification problems) or averaging (for regression problems).
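A minimal sketch of this training step, assuming `X_train` and `y_train` hold the feature vectors (IDF values, LDA topic loadings, or FinBERT embeddings) and the five trend labels, and `X_test` the test features; the number of trees is an assumption, not a value reported in the paper.

```python
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=500, random_state=0, n_jobs=-1)
clf.fit(X_train, y_train)              # bootstrap samples and random feature subsets by default
pred_categories = clf.predict(X_test)  # predicted trend category for each test article
```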
2.5. Performance Measurement
After the classifier is trained, a testing article can be fed into the classifier, which predicts a trend category and its associated lower-bound return. With the predicted lower-bound return, we apply the investment strategy proposed by Chen et al. (2024). For example, if an article is predicted to be in the high upward category and the price of the company's stock (mentioned in the article) has increased by less than 10% from its lowest price within the three days prior to the article's release, we long the stock and set a stop-loss price. When the return falls by more than 1%, the long position is closed immediately. When the predicted lower-bound return is reached within three days after the article is released, the long position is also closed immediately. Finally, if the predicted lower-bound return is not reached, the long position is closed on the third day after the article is released.
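A simplified sketch of this long-side exit rule, assuming `prices` is the chronological sequence of the stock's prices over the three days after entry and `lower_bound` is the predicted category's lower-bound return (e.g. 0.10 for the high upward category); intraday execution details are omitted.

```python
def long_trade_return(entry_price: float, prices: list[float], lower_bound: float,
                      stop_loss: float = 0.01) -> float:
    """Return of a long position closed by stop loss, take profit, or the 3-day time limit."""
    for price in prices:
        ret = price / entry_price - 1.0
        if ret <= -stop_loss:      # stop loss: the return falls by more than 1%
            return ret
        if ret >= lower_bound:     # take profit: the predicted lower-bound return is reached
            return ret
    return prices[-1] / entry_price - 1.0   # otherwise close on the third day

print(long_trade_return(100.0, [101.5, 99.2, 111.0], lower_bound=0.10))  # 0.11 (take profit)
```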
Once all testing articles are processed and their corresponding investment performances are computed, the differences between strategies with different feature sets in terms of investment performance can be tested with pairwise t-tests for independent groups.
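A sketch of this comparison step with SciPy. The return series below are synthetic placeholders, not the paper's data, and the default pooled-variance form of the test is shown since the paper does not state which variant it uses.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
returns_a = rng.normal(0.0055, 0.02, size=8_000)   # placeholder per-trade returns, strategy A
returns_b = rng.normal(0.0043, 0.02, size=7_400)   # placeholder per-trade returns, strategy B

t_stat, p_value = stats.ttest_ind(returns_a, returns_b)   # independent two-sample t-test
print(f"t = {t_stat:.4f}, p = {p_value:.4f}")
```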

3. Empirical Results
3.1. Data
The news articles are crawled from the Wall Street Journal (WSJ) website and were released between 1997 and 2022. In total, we crawl 81,648 articles and employ a sliding-window method to split the training and testing datasets. Specifically, every six years of data are used as one training set, and the subsequent year is used as the test set. For example, in one iteration, data from 2011 to 2016 are used as the training set and data from 2017 as the test set; in the next iteration, data from 2012 to 2017 are used as the new training set and data from 2018 as the test set, and so forth.
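A sketch of this sliding-window split, assuming `articles` is a pandas DataFrame with a `year` column derived from each article's release date (a hypothetical layout).

```python
import pandas as pd

def sliding_windows(articles: pd.DataFrame, window: int = 6):
    """Yield (train, test) splits: `window` years of training data, then one test year."""
    first_year, last_year = int(articles["year"].min()), int(articles["year"].max())
    for test_year in range(first_year + window, last_year + 1):
        train = articles[articles["year"].between(test_year - window, test_year - 1)]
        test = articles[articles["year"] == test_year]
        yield train, test

# Example: with data from 1997 to 2022, the first split trains on 1997-2002 and tests on 2003,
# the next on 1998-2003 / 2004, and so on up to training on 2016-2021 and testing on 2022.
```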
3.2. Investment Strategies Based on Text Categorization
We use LDA and FinBERT to show the advantage of NLP over traditional bag-of-words models with full text words and with the words in L&M's list. Therefore, there are four investment strategies in total for back testing. For a fair comparison, the number of features used is about 1,000 in each case. For the bag-of-words model with full text words, we select the top 1,000 features sorted by IDF value. For the bag-of-words model with words in L&M's list, 995 words appear in both the training corpus and the word list, so 995 features are included. For LDA, we set the number of topics to 1,000. For FinBERT, the number of features is set to 768 (since the embedding dimension can only be a multiple of 768).
3.3. Investment Performance
Tables 1 to 4 summarize the investment performance of the four strategies. We list the number of articles, the average returns, and the t-values for each trend category. The last column of each table shows that all strategies have significant short-term positive returns (4th row). However, after deducting fixed transaction costs of 0.05%, none of the strategies has a significant short-term positive return (6th row). In terms of excess returns over the S&P 500 daily buy-and-hold return and three-day excess returns over the S&P 500 three-day buy-and-hold return, all strategies still have significant short-term positive returns (8th and 10th rows).
3.4. Hypothesis Tests
Panel A of Table 5 shows that the strategy with FinBERT outperforms the other strategies in terms of average return and excess return. The strategy with LDA also performs well and ranks second. The results of the pairwise t-tests for independent groups are shown in Panel B of Table 5. The t-values of the differences for the four combinations (FinBERT – All, FinBERT – L&M, LDA – All, and LDA – L&M) are 10.0182, 2.5188, 8.2768, and 0.1773, respectively. Therefore, Hypotheses 2 and 4 are strongly supported, which means that the NLP models capture features of news articles much more accurately than the bag-of-words model with full text words. Surprisingly, Hypothesis 3 is only weakly supported and Hypothesis 1 is not supported, which means that the NLP models are only slightly better than the bag-of-words model with L&M's list.

4. Conclusion
This study explored the application of NLP techniques in financial text analysis. By extracting text features from WSJ articles and integrating classification techniques, this study confirmed that NLP models not only effectively capture textual meaning but also deliver superior investment performance compared with traditional bag-of-words models. The empirical results show that strategies based on FinBERT and LDA both outperform strategies based on bag-of-words models, and the outperformance of text categorization with FinBERT over bag-of-words models is statistically significant.
Overall, this study not only deepens our understanding of financial text analysis but also demonstrates the potential applications of NLP technologies in the financial domain. By integrating machine learning techniques, we can parse vast amounts of financial news data more effectively and provide more precise predictions of market trends.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
No funding was received.

References
Araci, D. (2019). FinBERT: Financial sentiment analysis with pre-trained language models. arXiv preprint arXiv:1908.10063.
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
Chen, Y.-T., Yu, C.-Y., & Lin, S.-Y. (2024). An investment strategy based on news sentiment words and its empirical performance. The Journal of Investing, 33(5), 55-67.
Ignatov, K. (2023). When ESG talks: ESG tone of 10-K reports and its significance to stock markets. International Review of Financial Analysis, 89, 102745.
Jegadeesh, N., & Wu, D. (2013). Word power: A new approach for content analysis. Journal of Financial Economics, 110(3), 712-729.
Li, X., Xie, H., Chen, L., Wang, J., & Deng, X. (2014). News impact on stock price return via sentiment analysis. Knowledge-Based Systems, 69, 14-23.
Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. The Journal of Finance, 66(1), 35-65.
Tetlock, P. C., Saar-Tsechansky, M., & Macskassy, S. (2008). More than words: Quantifying language to measure firms' fundamentals. The Journal of Finance, 63(3), 1437-1467.

Table 1. Investment performance using random forest with IDF of all text words
This table reports the out-of-sample investment performance using RF as the classifier with the IDF values of all text words as features. For each word appearing in the articles, we calculate its inverse document frequency (IDF):
IDF(d, k) = log10(N / DF(k)),
where DF(k) is the number of training articles containing the keyword k, N is the total number of training articles, and IDF(d, k) is the inverse document frequency of keyword k in article d.
The news articles are sourced from the WSJ website from 1997 to 2022. In total, we crawl 81,648 articles and employ a sliding-window method to split the training and testing datasets. Specifically, every six years of data are used as one training set, and the subsequent year is used as the test set.
The number of articles to trade, maximum return, minimum return, mean return, and t-value for each trend category are listed. The highest (BH) and lowest (BL) prices of the reported company's stock in the 3 days before the release date and the highest (AH) and lowest (AL) prices in the 3 days after the release date are queried from Yahoo! Finance by company ticker to label the price direction and profit lower-bound category. If range > 10%, the profit lower bound is high; if range is between 5% and 10%, the profit lower bound is medium; if range < 5%, the profit lower bound is low, where range = [Max(BH, AH) - Min(BL, AL)] / Cost. If the price direction is positive, the cost is BL; otherwise it is BH. The signs *, **, and *** indicate significance at the ten, five, and one percent levels.

High_Up Medium_Up Medium_Dn. High_Dn. Total


No. of Ar. to Tr. 341 3,762 2,908 425 7,436
Min_Ret -1.0% -0.5% -0.5% -1.0% -1.0%
Max_Ret 10.57% 6.37% 9.79% 10.32% 10.57%
Mean_Ret 0.91% 0.35% 0.41% 0.86% 0.43%
t 5.9867*** 17.6632*** 16.5879*** 5.0896*** 25.3494***
Mean_Ret (-TC) 0.41% -0.15% -0.09% 0.36% -0.07%
t 2.9991*** -2.9989** -1.2602 2.8798** -1.0350*
Mean_ExRet 0.64% 0.01% 0.92% 1.35% 0.44%
t 3.4908*** 2.4671** 24.5318*** 8.2187*** 22.2098***
Mean_3d_ExRet 0.56% 0.01% 1.11% 1.27% 0.52%
t 3.1057** 1.2678 22.8368*** 6.0972*** 19.7928***

Table 2. Investment performance using random forest with IDF of words in L&M’s list
This table reports the out-of-sample investment performance using RF as the classifier with the IDF values of words in L&M's list as features. For each positive and negative word listed in L&M's (2011) list appearing in each article, we calculate its inverse document frequency (IDF):
IDF(d, k) = log10(N / DF(k)),
where DF(k) is the number of training articles containing the keyword k, N is the total number of training articles, and IDF(d, k) is the inverse document frequency of keyword k in article d.
The news articles are sourced from the WSJ website from 1997 to 2022. In total, we crawl 81,648 articles and employ a sliding-window method to split the training and testing datasets. Specifically, every six years of data are used as one training set, and the subsequent year is used as the test set.
The number of articles to trade, maximum return, minimum return, mean return, and t-value for each trend category are listed. The highest (BH) and lowest (BL) prices of the reported company's stock in the 3 days before the release date and the highest (AH) and lowest (AL) prices in the 3 days after the release date are queried from Yahoo! Finance by company ticker to label the price direction and profit lower-bound category. If range > 10%, the profit lower bound is high; if range is between 5% and 10%, the profit lower bound is medium; if range < 5%, the profit lower bound is low, where range = [Max(BH, AH) - Min(BL, AL)] / Cost. If the price direction is positive, the cost is BL; otherwise it is BH. The signs *, **, and *** indicate significance at the ten, five, and one percent levels.

High_Up Medium_Up Medium_Dn. High_Dn. Total


No. of Ar. to Tr. 992 3,495 2,888 985 8,360
Min_Ret -1.00% -0.50% -0.50% -1.00% -1.00%
Max_Ret 10.67% 6.37% 9.79% 11.59% 11.59%
Mean_Ret 0.76% 0.37% 0.42% 0.77% 0.48%
t 9.2752*** 18.4207*** 16.4105*** 8.3189*** 27.1254***
Mean_Ret (-TC) 0.26% -0.13% -0.08% 0.27% -0.02%
t 3.7287*** -2.1740* -0.2893 3.7277*** -0.0324
Mean_ExRet 0.38% 0.03% 0.85% 1.30% 0.51%
t 4.7830*** 3.5752** 23.8767*** 12.3521*** 24.4264***
Mean_3d_ExRet 0.53% -0.01% 1.03% 1.07% 0.52%
t 6.2398*** -0.1647 20.1578*** 8.1413*** 20.0825***

Table 3. Investment performance using random forest with LDA topic
This table reports the out-of-sample investment performance using RF as the classifier and LDA topic loadings as features.
The news articles are sourced from the WSJ website from 1997 to 2022. In total, we crawl 81,648 articles and employ a sliding-window method to split the training and testing datasets. Specifically, every six years of data are used as one training set, and the subsequent year is used as the test set.
The number of articles to trade, maximum return, minimum return, mean return, and t-value for each trend category are listed. The highest (BH) and lowest (BL) prices of the reported company's stock in the 3 days before the release date and the highest (AH) and lowest (AL) prices in the 3 days after the release date are queried from Yahoo! Finance by company ticker to label the price direction and profit lower-bound category. If range > 10%, the profit lower bound is high; if range is between 5% and 10%, the profit lower bound is medium; if range < 5%, the profit lower bound is low, where range = [Max(BH, AH) - Min(BL, AL)] / Cost. If the price direction is positive, the cost is BL; otherwise it is BH. The signs *, **, and *** indicate significance at the ten, five, and one percent levels.

High_Up Medium_Up Medium_Dn. High_Dn. Total


No. of Ar. to Tr. 840 3,601 3,064 781 8,286
Min_Ret -1.00% -0.50% -0.50% -1.00% -1.00%
Max_Ret 10.67% 6.48% 9.79% 11.59% 11.59%
Mean_Ret 1.01% 0.33% 0.44% 1.02% 0.50%
t 11.2104*** 16.7821*** 16.4593*** 10.7765*** 20.3236***
Mean_Ret (-TC) 0.51% -0.17% -0.06% 0.52% 0.00%
t 5.9455*** -4.3278*** -0.3988 5.7296*** 0.2986
Mean_ExRet 0.56% -0.02% 0.86% 1.77% 0.55%
t 5.2785*** -0.5433* 23.3573*** 14.4894*** 23.5646***
Mean_3d_ExRet 0.72% -0.06% 1.07% 1.65% 0.57%
t 7.3453*** -0.6561 22.7868*** 10.8988*** 21.2256***

Table 4. Investment performance using random forest with FinBERT embedded vector
This table reports the out-of-sample investment performance using RF as the classifier and FinBERT embedded vectors as features.
The news articles are sourced from the WSJ website from 1997 to 2022. In total, we crawl 81,648 articles and employ a sliding-window method to split the training and testing datasets. Specifically, every six years of data are used as one training set, and the subsequent year is used as the test set.
The number of articles to trade, maximum return, minimum return, mean return, and t-value for each trend category are listed. The highest (BH) and lowest (BL) prices of the reported company's stock in the 3 days before the release date and the highest (AH) and lowest (AL) prices in the 3 days after the release date are queried from Yahoo! Finance by company ticker to label the price direction and profit lower-bound category. If range > 10%, the profit lower bound is high; if range is between 5% and 10%, the profit lower bound is medium; if range < 5%, the profit lower bound is low, where range = [Max(BH, AH) - Min(BL, AL)] / Cost. If the price direction is positive, the cost is BL; otherwise it is BH. The signs *, **, and *** indicate significance at the ten, five, and one percent levels.

High_Up Medium_Up Medium_Dn. High_Dn. Total


No. of Ar. to Tr. 689 3,775 2,965 637 8,066
Min_Ret -1.00% -0.50% -0.50% -1.00% -1.00%
Max_Ret 10.67% 6.37% 9.79% 11.59% 11.59%
Mean_Ret 1.07% 0.37% 0.45% 1.40% 0.55%
t 10.5424*** 17.7358*** 16.8748*** 12.9261*** 28.7988***
Mean_Ret (-TC) 0.57% -0.13% -0.05% 0.90% 0.05%
t 5.0864*** -3.5615** -0.221 8.3078*** 5.0112***
Mean_ExRet 0.40% 0.01% 0.88% 2.34% 0.56%
t 3.6782*** 4.6842** 6.8539*** 9.3578*** 14.8926***
Mean_3d_ExRet 0.80% -0.02% 1.12% 2.06% 0.62%
t 7.9889*** -1.2156* 22.1265*** 12.7895*** 23.3426***


Table 5. Investment performance difference between the four strategies


This table reports the differences in out-of-sample investment performance using RF with different feature sets. Panel A lists the mean return, the mean return minus transaction costs, and the mean excess returns over the S&P 500 daily buy-and-hold and 3-day buy-and-hold returns for each strategy. Panel B reports the differences in mean return for the four pairs (i.e., FinBERT – All, FinBERT – L&M, LDA – All, and LDA – L&M) using t-tests for independent groups. The signs *, **, and *** indicate significance at the ten, five, and one percent levels.
Panel A

Strategies    Abbr.    Mean Return    Mean Return (-TC)    Mean ExReturn    Mean 3d ExReturn
IDF vector of all text words All 0.43% -0.07% 0.44% 0.52%
IDF vector of words in L&M’s list L&M 0.48% -0.02% 0.51% 0.52%
LDA topic vector LDA 0.50% 0.00% 0.55% 0.57%
FinBERT embedded vector FinBERT 0.55% 0.05% 0.56% 0.62%

Panel B
FinBERT – All    FinBERT – L&M    LDA – All    LDA – L&M
Difference 0.12% 0.07% 0.07% 0.02%

t 10.0182*** 2.5188* 8.2768*** 0.1773

