
Abstract—In this study, I explored the application of sentiment analysis to financial news headlines to understand investor sentiment. Using large language models (LLMs), I analyze sentiment from the perspective of retail investors. The chosen dataset contains categorized sentiments of financial news headlines, which serves as the basis for my analysis. I fine-tuned Llama2-7B, Llama3-8B, and Qwen2.5-14B to evaluate their effectiveness in sentiment classification. My experiments demonstrate that the fine-tuned Qwen2.5 achieves the highest accuracy, showing significant improvement after fine-tuning and indicating its robustness in capturing the nuances of financial sentiment. By accurately predicting the sentiment of financial news, this model can be instrumental in providing market insights, supporting risk management, and aiding investment decisions. The results highlight the potential of advanced LLMs to transform how we analyze and interpret financial information, offering a powerful tool for stakeholders in the financial industry. The study also presents a comprehensive NLP pipeline that integrates transformer-based Named Entity Recognition with fine-tuned large language models to enhance sentiment analysis and entity resolution in financial news.

I. INTRODUCTION

The financial industry is a dynamic and rapidly changing environment where news and information play a critical role in shaping market behavior and investor sentiment. With the constant influx of financial news, it becomes imperative for businesses, investors, and analysts to accurately gauge the sentiment conveyed in these news items. Sentiment analysis, a branch of Natural Language Processing (NLP), offers a sophisticated method to automatically determine the emotional tone behind words, providing valuable insights into market trends, investor confidence, and consumer behavior.

The sentiment of financial news can significantly impact market movements, influencing decisions made by retail investors, institutional investors, and other stakeholders. For instance, positive news about a company's performance can boost investor confidence, leading to a rise in stock prices, while negative news can trigger fear and sell-offs. Therefore, understanding the sentiment embedded in financial news headlines can aid in various strategic decision-making processes, including market insights, risk management, and investment strategies.

Advancements in NLP and Large Language Models (LLMs) have opened new avenues for sentiment analysis. These technologies enable the processing and understanding of large volumes of textual data with high accuracy. Traditional sentiment analysis models often struggle with the complexity and nuances of financial language, which can include jargon, idiomatic expressions, and context-specific meanings. LLMs such as Llama are pretrained on vast corpora and can be fine-tuned for specific tasks, offering a significant improvement over traditional methods.

The primary goal is to identify the most effective model for this task. We hypothesize that fine-tuning these pre-trained models on the Indian Financial News dataset will enhance their ability to capture the subtle nuances of financial sentiment. The fine-tuned Qwen2.5-14B achieved the highest precision, recall, and F1-score, demonstrating its robustness and accuracy in sentiment classification.

The implications of this research are significant for the financial industry. Accurate sentiment analysis can provide deeper market insights, help in identifying potential reputational risks, and support more informed investment decisions. By understanding the sentiment of financial news, businesses can better anticipate market reactions and develop strategies to mitigate risks and capitalize on opportunities.

II. RELATED WORK

The analysis of sentiment in financial news is a well-researched area, drawing interest from various domains including finance, computer science, and linguistics. The application of NLP and LLMs to this field has evolved significantly over the past decade, driven by the increasing availability of data and advancements in machine learning algorithms. This section reviews the key literature in sentiment analysis, particularly focusing on financial news, and highlights the contributions of recent studies that have utilized advanced NLP techniques and LLMs.

With the rise of deep learning, researchers have explored various neural architectures, including recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and transformers, to improve sentiment classification accuracy. The introduction of pre-trained language models such as BERT and FinBERT has significantly enhanced the ability to capture financial sentiment by leveraging domain-specific corpora. FinBERT, trained on financial texts, has been widely adopted for sentiment analysis tasks, outperforming general-purpose models like BERT in financial applications.
Moreover, integrating financial news sentiment analysis with
real-time trading strategies has gained attention in quantitative
finance. Several studies have shown that market sentiment
extracted from news sources can serve as a leading indicator of
stock price movements. The combination of sentiment-aware
models with automated trading algorithms has demonstrated
potential in enhancing decision-making processes for investors
and traders.
P. Malo et al. [1] introduce the FinancialPhraseBank dataset and propose methods for detecting sentiment in financial texts using machine learning. B. Pang and L. Lee [2] provide an extensive review of sentiment analysis techniques, including early applications to financial text. J. Si et al. [3] explore the use of topic-based sentiment analysis of Twitter data to predict stock prices. J. Devlin et al. [4] introduce BERT, a breakthrough in NLP that has significantly influenced sentiment analysis research, including applications in finance. M. Hu and B. Liu [5] discuss techniques for extracting sentiment from text, laying the groundwork for applications in financial news analysis. F. Z. Xing et al. [6] survey natural language processing techniques applied to financial forecasting, with a focus on sentiment analysis. S. Kogan et al. [7] utilize regression analysis on sentiment extracted from financial reports to predict firm risk. B. M. Barber and T. Odean [8] examine the influence of news attention on investor behavior, highlighting the role of sentiment.

X. Zhang et al. [9] demonstrate the potential of Twitter sentiment to predict stock market indicators, emphasizing real-time analysis. F. Li [10] applies Naive Bayesian classification to forward-looking statements in corporate filings to gauge sentiment and predict future performance. B.-G. Choi et al. [11] investigate how the sentiment of earnings announcements affects investor perceptions and market outcomes. B. S. Kumar and V. Ravi [12] review various text mining applications in finance, including sentiment analysis of financial news.

The reviewed literature highlights the importance of sentiment analysis in finance and the advancements made with NLP and LLM technologies. Studies show that models like BERT, LSTM networks, and deep learning techniques have improved financial sentiment analysis. My study builds on this by fine-tuning advanced models and demonstrating the superior performance of the Llama3-8B and Qwen2.5-14B models in classifying financial news sentiment, contributing to more accurate and efficient tools for the industry.

This work builds upon previous research by fine-tuning the LLaMA3 model for financial sentiment analysis using a dataset of 27,000 labeled headlines. By leveraging LoRA and half-precision (fp16) training on an NVIDIA H100 GPU, this study aims to push the boundaries of accuracy and efficiency in financial text analysis.

III. DATASET AND ANALYSIS

The datasets utilized in this study include the FinancialPhraseBank, sourced from the Kaggle repository titled "Sentiment Analysis for Financial News" by Ankur Z., and the Indian Financial News dataset from Hugging Face, which contains 27,000 financial news headlines. The FinancialPhraseBank dataset is specifically designed for sentiment analysis in the financial sector, featuring news headlines annotated with sentiment labels. It consists of two columns, Sentiment and Headline, where the sentiment is categorized into three distinct classes: positive, neutral, and negative. This structured classification provides a strong foundation for understanding sentiment trends in financial news from a retail investor's perspective.

To ensure data consistency, a stratified train-test split was performed, with 70% of the data allocated for training and 30% for testing. Additionally, 5% of the total dataset was separated as an evaluation set. The training dataset was then shuffled to mitigate ordering biases. Fig. 1 shows the distribution of the sentiment classes.

Fig. 1. Class Distribution (figure not reproduced in this extraction).

To facilitate numerical processing by machine learning models, these textual labels are mapped to corresponding integer values using a predefined encoding scheme. Specifically, the label Positive is mapped to the integer 2, Neutral (including entries labeled as "none", which are semantically aligned with neutrality) is mapped to 1, and Negative is assigned the value 0. This encoding enables the transformation of categorical sentiment annotations into a numerical format suitable for model training and evaluation.
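For illustration, the preprocessing just described can be expressed with pandas and scikit-learn as in the sketch below. The file name and column handling are assumptions made for the example, and the exact ordering of the evaluation split versus the 70/30 split is not specified in the text.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical file name; the study's actual source files are not published.
df = pd.read_csv("financial_news_headlines.csv")  # columns: Sentiment, Headline

# Map textual labels to integers: Positive -> 2, Neutral/"none" -> 1, Negative -> 0.
label_map = {"positive": 2, "neutral": 1, "none": 1, "negative": 0}
df["label"] = df["Sentiment"].str.lower().map(label_map)

# Hold out 5% of the full dataset as an evaluation set (split order assumed).
rest, eval_df = train_test_split(
    df, test_size=0.05, stratify=df["label"], random_state=42
)

# Stratified 70/30 train-test split on the remainder, then shuffle the training set.
train_df, test_df = train_test_split(
    rest, test_size=0.30, stratify=rest["label"], random_state=42
)
train_df = train_df.sample(frac=1.0, random_state=42).reset_index(drop=True)
```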
A. Data Augmentation and Prompt Engineering

To optimize sentiment classification, the training samples were reformatted into a structured prompt format. The prompt explicitly instructed the model to classify sentiment into Positive, Neutral, or Negative. The test data was structured similarly, except without sentiment labels, ensuring a realistic inference scenario.
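The exact prompt wording is not reproduced in this report, so the following is a plausible sketch of the described format; the instruction text and the helper name build_prompt are illustrative assumptions.

```python
def build_prompt(headline: str, sentiment: str | None = None) -> str:
    """Format a headline as an instruction-style prompt.

    Training samples carry the gold sentiment; test samples leave it
    blank so the model must generate the label itself.
    """
    prompt = (
        "Classify the sentiment of the following financial news headline "
        "as Positive, Neutral, or Negative.\n"
        f"Headline: {headline}\n"
        "Sentiment:"
    )
    if sentiment is not None:
        prompt += f" {sentiment}"
    return prompt

# Training sample carries the label; test sample does not.
print(build_prompt("Company X beats quarterly earnings estimates", "Positive"))
print(build_prompt("Company Y shares fall after regulatory probe"))
```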
IV. MODEL SELECTION

The LLaMA-2-7B, LLaMA-3-8B, and Qwen2.5-14B models were chosen as the base models due to their state-of-the-art performance in text generation and contextual understanding. Fine-tuning was performed using Low-Rank Adaptation (LoRA) to enhance efficiency while minimizing computational overhead. The LoRA configuration targeted key transformer layers to optimize sentiment classification performance.
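For reference, a minimal sketch of loading one of these base models with the Hugging Face transformers library is shown below. The checkpoint identifiers are the public Hugging Face Hub names, while the precision and device settings are illustrative assumptions rather than the study's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Public Hub identifiers for the three base models used in this study.
model_ids = [
    "meta-llama/Llama-2-7b-hf",
    "meta-llama/Meta-Llama-3-8B",
    "Qwen/Qwen2.5-14B",
]

# Load one model in half precision (fp16), as described in the paper.
model_id = model_ids[2]
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16, per the H100 training setup
    device_map="auto",          # place weights on available GPUs
)
```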
V. LLAMA MODEL ARCHITECTURE

A. Embedding Layer

The embedding layer converts input tokens into dense vector representations. Instead of absolute position embeddings, LLaMA 3 uses Rotary Position Embeddings (RoPE), which encode positional information directly into the attention mechanism. Each token t_i in the input sequence is mapped to a high-dimensional space:

$E(t_i) = W_e t_i$  (1)

where W_e is the embedding matrix.

B. Transformer Layers

The core of the model consists of multiple Transformer layers, each comprising the following components:

1) Multi-Head Grouped-Query Attention (GQA): LLaMA 3-8B uses Grouped-Query Attention (GQA) instead of standard Multi-Head Attention, reducing memory overhead while maintaining performance. The attention mechanism is defined as:

$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right) V$  (2)

where:
• Q, K, V are the query, key, and value matrices,
• d_k is the dimensionality of the keys,
• Rotary Position Embeddings (RoPE) are applied to Q and K before computing attention scores.

2) SwiGLU Feedforward Network (FFN): Instead of a standard feedforward network, LLaMA 3 uses SwiGLU (Swish-Gated Linear Units) for improved efficiency and expressiveness. The FFN is defined as:

$\mathrm{FFN}(X) = (\mathrm{Swish}(X W_1) \odot X W_2) W_3$  (3)

where:
• Swish(x) = x · sigmoid(x) is an activation function,
• W_1, W_2, W_3 are learned weight matrices,
• ⊙ represents element-wise multiplication.

3) Layer Normalization and Residual Connections: Each Transformer layer includes pre-normalization and residual connections to stabilize training and improve convergence:

$Z_l = \mathrm{LayerNorm}(X_l + \mathrm{Attention}(Q_l, K_l, V_l))$  (4)

$X_{l+1} = \mathrm{LayerNorm}(Z_l + \mathrm{FFN}(Z_l))$  (5)

where:
• X_l is the input to the l-th layer,
• Z_l is the output after attention,
• LayerNorm ensures stable activations.

C. Output Layer

The final transformer outputs are projected to the vocabulary space to generate logits for each token. LLaMA 3 uses tied embeddings, meaning the output weights share parameters with the input embeddings:

$\mathrm{logits} = W_e^T X_L + b_o$  (6)

where:
• W_e^T is the transposed embedding matrix,
• X_L is the final hidden state,
• b_o is a bias term.

VI. QWEN2.5 MODEL ARCHITECTURE

A. Embedding Layer

The embedding layer in Qwen2.5 converts input tokens into dense vector representations. Qwen2.5 utilizes Rotary Positional Embeddings (RoPE) to encode positional information directly into the attention mechanism. Each input token t_i is transformed using the embedding matrix W_e:

$E(t_i) = W_e \cdot t_i$  (7)

Qwen2.5 models use byte-level byte-pair encoding (BBPE) with a vocabulary of 151,643 tokens, shared across all variants for consistency.

B. Transformer Layers

Qwen2.5 maintains a decoder-only Transformer-based architecture, composed of several stacked layers that include:

1) Multi-Head Grouped Query Attention (GQA): Grouped Query Attention (GQA) is employed to enhance KV cache efficiency. In this setup, multiple query heads share fewer key/value heads, reducing memory usage:

$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T + B_{QKV}}{\sqrt{d_k}}\right) V$  (8)

where:
• Q, K, V are the query, key, and value matrices,
• d_k is the dimensionality of the keys,
• B_{QKV} is a bias term specific to Qwen2.5,
• RoPE is applied to Q and K before attention computation.

2) SwiGLU Feedforward Network (FFN): Qwen2.5 adopts the SwiGLU activation function for the feedforward network, which improves efficiency and expressiveness:

$\mathrm{FFN}(X) = (\mathrm{Swish}(X W_1) \odot X W_2) W_3$  (9)

with:
• Swish(x) = x · sigmoid(x),
• W_1, W_2, W_3 as learnable parameters,
• ⊙ indicating element-wise multiplication.
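To make the SwiGLU block concrete, the following minimal PyTorch sketch implements equations (3) and (9); the dimensions and the bias-free projections follow common LLaMA-style implementations and are assumptions, not details taken from this study.

```python
import torch
import torch.nn as nn

class SwiGLUFFN(nn.Module):
    """SwiGLU feedforward block: FFN(X) = (Swish(X W1) * X W2) W3."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_hidden, bias=False)  # gate projection
        self.w2 = nn.Linear(d_model, d_hidden, bias=False)  # up projection
        self.w3 = nn.Linear(d_hidden, d_model, bias=False)  # down projection
        self.swish = nn.SiLU()  # Swish(x) = x * sigmoid(x)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w3(self.swish(self.w1(x)) * self.w2(x))

x = torch.randn(2, 16, 512)               # (batch, sequence, d_model)
ffn = SwiGLUFFN(d_model=512, d_hidden=1376)
print(ffn(x).shape)                        # torch.Size([2, 16, 512])
```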
3) Layer Normalization and Residual Connections: Instead of traditional LayerNorm, Qwen2.5 uses RMSNorm with pre-normalization to stabilize training and maintain gradient flow:

$Z_l = \mathrm{RMSNorm}(X_l + \mathrm{Attention}(Q_l, K_l, V_l))$  (10)

$X_{l+1} = \mathrm{RMSNorm}(Z_l + \mathrm{FFN}(Z_l))$  (11)

where:
• X_l is the input to the l-th layer,
• Z_l is the output after attention computation.

C. Output Layer

In most Qwen2.5 models, output embeddings are tied with input embeddings to reduce parameter count:

$\mathrm{logits} = W_e^T X_L + b_o$  (12)

where:
• W_e^T is the transposed embedding matrix (tied weights),
• X_L is the final hidden representation,
• b_o is the output bias.
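A minimal PyTorch sketch of the RMSNorm operator used in equations (10) and (11) is given below; the epsilon value is a typical default and an assumption here.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square normalization: x / rms(x) * g, with no mean-centering."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned gain g

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight
```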
VII. PROMPT CONFIGURATION

The models were configured to predict the sentiment of news headlines. The function for this task used three main components: the test dataset (a Pandas DataFrame containing headlines), the model, and its tokenizer. For each headline, a prompt was created requesting sentiment analysis. The model then generated a sentiment prediction, which was extracted and appended to the y_pred list. The parameters used for configuring the text generation included:
• max_new_tokens: sets the maximum number of new tokens to generate.
• temperature: controls the randomness of text generation, with lower temperatures producing more predictable text and higher temperatures generating more creative and unexpected text.

The process involved the following steps, sketched in code after this list:
• Tokenizing the input headlines using the model's tokenizer.
• Creating prompts by appending the task description to each headline.
• Generating text responses from the model based on these prompts.
• Extracting and storing the predicted sentiment labels.
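A sketch of such a prediction loop, using the transformers text-generation pipeline and the build_prompt helper sketched in Section III, is shown below; the decoding values are illustrative assumptions, since the exact settings are not reported.

```python
from transformers import pipeline

def predict_sentiments(test_df, model, tokenizer):
    """Generate one sentiment label per headline in the DataFrame."""
    generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
    y_pred = []
    for headline in test_df["Headline"]:
        prompt = build_prompt(headline)   # test-time prompt, no gold label
        output = generator(
            prompt,
            max_new_tokens=3,      # the label is a single short word
            temperature=0.1,       # near-deterministic decoding
            do_sample=True,
            return_full_text=False,
        )[0]["generated_text"]
        # Keep the first recognized label word; fall back to "neutral".
        label = next(
            (w for w in ("positive", "neutral", "negative")
             if w in output.lower()),
            "neutral",
        )
        y_pred.append(label)
    return y_pred
```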
VIII. FINE TUNING

Prior to fine-tuning, the performance of the unmodified Llama model was evaluated, resulting in an overall accuracy of 0.630. The accuracy for each label was as follows: 0.807 for positive, 0.187 for neutral, and 0.897 for negative sentiments. This baseline performance indicated a need for fine-tuning to enhance the model's accuracy, particularly for the neutral sentiment class.

The fine-tuning process was conducted using the Supervised Fine-Tuning Trainer (SFTTrainer), employing Parameter-Efficient Fine-Tuning (PEFT) methods, mainly the LoRA and QLoRA techniques. This approach focuses on fine-tuning a limited set of additional parameters while keeping most pretrained model parameters fixed, thus reducing computational and storage costs and mitigating the risk of catastrophic forgetting.

The training process was conducted using the AdamW optimizer with gradient accumulation and a learning rate scheduler to optimize performance. A warmup phase was applied to stabilize learning, and the training progress was logged and monitored using Weights and Biases for real-time tracking.

To ensure robust evaluation, accuracy was computed at regular intervals, and early stopping was implemented to prevent overfitting. The best model was selected based on validation accuracy to ensure optimal generalization to unseen data.
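A condensed sketch of this training setup with the trl and transformers libraries is given below. All hyperparameter values are illustrative assumptions (the paper does not list them), and the exact argument names vary slightly across library versions.

```python
from transformers import TrainingArguments, EarlyStoppingCallback
from trl import SFTTrainer

training_args = TrainingArguments(
    output_dir="qwen2.5-14b-finnews",     # hypothetical output path
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,         # effective batch size of 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,                     # warmup phase to stabilize learning
    fp16=True,                             # half-precision training
    optim="adamw_torch",                   # AdamW optimizer
    eval_strategy="steps",
    eval_steps=100,                        # evaluate at regular intervals
    save_strategy="steps",
    save_steps=100,
    load_best_model_at_end=True,           # keep the best validation checkpoint
    report_to="wandb",                     # Weights and Biases logging
)

trainer = SFTTrainer(
    model=model,                           # base model loaded in Sec. IV
    args=training_args,
    train_dataset=train_dataset,           # prompt-formatted splits from Sec. III
    eval_dataset=eval_dataset,
    peft_config=lora_config,               # LoRA config, sketched in Sec. VIII-A
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```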
A. Parameter-Efficient Tuning

PEFT methods have emerged as an efficient approach to fine-tune pretrained LLMs while significantly reducing the number of trainable parameters. These techniques balance computational efficiency and task performance, making it feasible to fine-tune even the largest LLMs without compromising on quality.

1) Low-Rank Adaptation (LoRA): LoRA is a parameter-efficient fine-tuning technique that injects trainable low-rank matrices into Transformer layers, allowing adaptation without updating the full model. By freezing the pretrained weights and optimizing only small adapter matrices (A and B), LoRA reduces computational cost while maintaining performance. It leverages the insight that weight updates during fine-tuning are often low-rank, enabling efficient adaptation with minimal additional parameters.

$\Delta W = BA$  (13)

Here, $\Delta W$ represents the low-rank update to the pretrained weight matrix W, where $A \in \mathbb{R}^{r \times d}$ and $B \in \mathbb{R}^{d \times r}$, with $r \ll d$, making the total number of trainable parameters significantly smaller.
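A minimal LoRA configuration with the peft library might look as follows; the rank, scaling, dropout, and target module names are illustrative assumptions, though the listed projections match common LLaMA/Qwen layer names.

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                      # rank r of the adapter matrices A and B
    lora_alpha=32,             # scaling factor applied to the update BA
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[           # key attention/FFN projections in the decoder
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()   # only the adapters are trainable
```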
2) Quantized Low-Rank Adaptation (QLoRA): QLoRA is an enhanced version of LoRA that improves parameter efficiency by combining low-rank adaptation with quantization. It uses 4-bit NormalFloat (NF4) quantization to represent model weights efficiently by mapping them to a fixed range without costly computations. Additionally, Double Quantization reduces memory usage by further compressing quantization constants using 8-bit floats, enabling scalable fine-tuning of large models with minimal performance loss [13].
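Under the QLoRA recipe, the frozen base model is loaded in 4-bit NF4 with double quantization and the LoRA adapters are trained on top of it. The sketch below shows this with the transformers BitsAndBytesConfig; the compute dtype is an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # 4-bit NormalFloat quantization
    bnb_4bit_use_double_quant=True,      # compress quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# The frozen base weights are stored in 4-bit; the LoRA adapters from the
# previous sketch stay in higher precision and are the only trained parameters.
qlora_base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B",
    quantization_config=bnb_config,
    device_map="auto",
)
```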

IX. INFERENCE AND MODEL EVALUATION

After training, the fine-tuned model was evaluated on the test set using a text-generation pipeline with controlled decoding. The predicted sentiment labels were mapped to numerical values for performance assessment.

The evaluation of the model's performance was based on standard metrics including precision, recall, and F1-score. These metrics were computed for each sentiment class (positive, neutral, and negative), as well as for the overall performance of the model. The formulas for these metrics are as follows:

1) Precision:

$\mathrm{Precision} = \frac{TP}{TP + FP}$  (14)

where TP is true positives and FP is false positives.

2) Recall:

$\mathrm{Recall} = \frac{TP}{TP + FN}$  (15)

where FN is false negatives.

3) F1-score:

$F_1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$  (16)
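In practice, these per-class and aggregate metrics can be produced in a single call with scikit-learn, as sketched below, where y_pred holds the generated labels and the integer encoding follows Section III.

```python
from sklearn.metrics import accuracy_score, classification_report

# Map generated label words to the integer scheme from Section III.
label_map = {"negative": 0, "neutral": 1, "positive": 2}
y_pred_ids = [label_map[label] for label in y_pred]
y_true = test_df["label"].tolist()   # integer labels from preprocessing

print("Accuracy:", accuracy_score(y_true, y_pred_ids))
print(classification_report(
    y_true,
    y_pred_ids,
    target_names=["negative", "neutral", "positive"],
    digits=3,
))
```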
X. EXPERIMENTAL RESULTS

A. Overall Performance

The fine-tuned Qwen2.5-14B model demonstrated significant improvements in classification accuracy compared to its performance prior to fine-tuning. The overall accuracy of the model was found to be 0.931, indicating a high level of correctness in sentiment classification.

B. Classwise Performance

Detailed performance metrics for each sentiment class are provided in Tables I, II, and III. These metrics highlight the precision, recall, and F1-score for positive, neutral, and negative sentiments.

TABLE I. Classification report for Qwen2.5-14B (table contents not reproduced in this extraction).

TABLE II. Classification report for LLaMA3-8B (table contents not reproduced in this extraction).

TABLE III. Classification report for LLaMA2-7B (table contents not reproduced in this extraction).

C. Comparison with Baseline

The performance of the fine-tuned Llama3-8B and Qwen2.5-14B models was compared with several other models, including the baseline models and other fine-tuned models. Table IV highlights the precision, recall, and F1-score for each model.

TABLE IV. Comparison of model performance (table contents not reproduced in this extraction).

XI. OBSERVATIONS

These models were selected due to their strong language understanding capabilities and high-quality text generation, making them well-suited for nuanced sentiment analysis tasks. Additionally, the LLaMA and Qwen models are open-access and can be self-hosted, allowing for flexible deployment without dependency on third-party APIs. By including models of varying sizes (7B, 8B, and 14B parameters), we aimed to explore the trade-offs between computational cost, GPU usage, and inference performance. Table V shows the performance of each model.
• LLaMA2-7B showed the best latency, with an average processing time of just 0.0372 seconds per headline, while maintaining low GPU usage (15.33 GB).
• LLaMA3-8B offered slightly slower performance but required only marginally more memory (17 GB).
• Qwen2.5-14B, although more powerful in terms of parameter count, consumed 37.13 GB of GPU memory and had the highest latency at 0.0599 seconds per headline.

TABLE V. Inference performance of each model (table contents not reproduced in this extraction).
XII. REAL-TIME VALIDATION

After fine-tuning the sentiment analysis model, it was essential to validate its performance on real-world financial news. To achieve this, a news parser was developed to collect the latest financial news headlines from a news aggregator, a platform that gathers financial news from various sources.

The parser continuously fetches headlines at regular intervals, ensuring access to up-to-date financial news. Each headline is processed and stored with its associated metadata, including the content, publication date, and source URL. To avoid redundant data, a hashing mechanism was implemented to track previously collected headlines, ensuring that only new and unique articles are recorded.

The integration of real-time financial news parsing serves several important purposes. It creates a dataset enriched with sentiment annotations from experienced traders, ensuring the model is trained with contextually relevant data. Parsing current headlines allows for continuous model validation, keeping predictions aligned with market sentiment and adapting to market changes. Expert-driven annotations enhance model personalization and transparency, while automating data collection reduces manual intervention. This approach ensures the model remains accurate, up-to-date, and effective for real-time trading insights.
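The parser's code is not shown in this report; the sketch below illustrates one plausible form of the hashing-based deduplication described above, using a SHA-256 digest of the normalized headline as the uniqueness key.

```python
import hashlib
from datetime import datetime, timezone

seen_hashes: set[str] = set()   # in practice, persisted between runs
collected: list[dict] = []

def record_headline(headline: str, source_url: str) -> bool:
    """Store a headline with metadata unless an identical one was seen before."""
    digest = hashlib.sha256(headline.strip().lower().encode("utf-8")).hexdigest()
    if digest in seen_hashes:
        return False                         # duplicate: skip
    seen_hashes.add(digest)
    collected.append({
        "content": headline,
        "published": datetime.now(timezone.utc).isoformat(),
        "source_url": source_url,
    })
    return True

record_headline("RBI holds repo rate steady", "https://example.com/article-1")
record_headline("RBI holds repo rate steady", "https://example.com/article-2")
print(len(collected))   # 1: the duplicate headline was filtered out
```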
XIII. NAMED ENTITY-AWARE EQUITY IDENTIFIER IN FINANCIAL NEWS

In the domain of financial news analysis, identifying relevant entities, particularly companies or organizations, and linking them to their corresponding stock ticker symbols is a foundational step for downstream applications such as sentiment-aware trading, automated portfolio alerts, and market movement forecasting. This module presents a hybrid approach combining transformer-based Named Entity Recognition (NER) using spaCy with real-time financial data querying using the Yahoo Finance API (via the yfinance Python library).

The system is designed to process natural language news headlines and extract potential equity-relevant named entities. It then maps these entities to their corresponding stock symbols, if available, by interfacing with Yahoo Finance.

A. Named Entity Recognition (NER)

The spaCy NLP library is employed with the en_core_web_trf model, a transformer-based pipeline that leverages state-of-the-art language models for entity extraction. The NER model is fine-tuned to identify a variety of entity types; in this context, we specifically focus on ORG (Organizations), GPE (Geo-political entities), and PRODUCT labels, which are most likely to refer to corporate entities or financial products.
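A minimal sketch of this extraction step with spaCy is shown below; it assumes the en_core_web_trf pipeline has been installed separately (python -m spacy download en_core_web_trf).

```python
import spacy

# Transformer-based English pipeline with a NER component.
nlp = spacy.load("en_core_web_trf")

RELEVANT_LABELS = {"ORG", "GPE", "PRODUCT"}

def extract_entities(headline: str) -> list[str]:
    """Return entity texts whose labels may map to tradable equities."""
    doc = nlp(headline)
    return [ent.text for ent in doc.ents if ent.label_ in RELEVANT_LABELS]

print(extract_entities("Infosys and TCS rally after strong quarterly results"))
# e.g. ['Infosys', 'TCS']
```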
B. Entity Extraction

The extract_entities function filters relevant entities from each headline. These entities may include companies, brands, or regional indexes that are relevant for equity tracking.

C. Ticker Symbol Resolution

For each extracted entity, the get_stock_symbol function uses the yfinance library to attempt to resolve the corresponding stock ticker. The function is designed with basic exception handling to ensure robustness against lookup errors or rate limits.

D. Aggregation

All identified entities and their resolved ticker symbols are compiled into a mapping, associating each input headline with a set of candidate equities. A sketch of the resolution and aggregation steps is given below.
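The implementation of get_stock_symbol is not reproduced in this report; the following sketch shows one possible approach, validating a candidate symbol through yfinance and returning None on failure. The candidate heuristic and the small alias map are illustrative assumptions, not the study's actual lookup logic.

```python
import yfinance as yf

# Hypothetical alias map; a production system would use a fuller lookup table.
KNOWN_ALIASES = {"Infosys": "INFY.NS", "TCS": "TCS.NS"}

def get_stock_symbol(entity: str) -> str | None:
    """Try to resolve an entity name to a ticker, tolerating lookup failures."""
    candidate = KNOWN_ALIASES.get(entity, entity.upper())
    try:
        info = yf.Ticker(candidate).fast_info
        # A resolvable ticker exposes a last price; otherwise treat as a miss.
        return candidate if info.last_price is not None else None
    except Exception:
        return None   # network errors, rate limits, unknown symbols

def map_headline_to_equities(headline: str) -> dict[str, set[str]]:
    """Aggregate resolved tickers for a single headline."""
    symbols = {
        s for s in (get_stock_symbol(e) for e in extract_entities(headline))
        if s is not None
    }
    return {headline: symbols}
```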
XIV. CONCLUSION

This study demonstrated the effectiveness of fine-tuning the Qwen2.5-14B model for sentiment analysis of financial news headlines. The fine-tuning process significantly improved the model's performance, achieving an overall accuracy of 0.931, with strong precision, recall, and F1-score metrics across positive, neutral, and negative sentiments. The study also presents a comprehensive NLP pipeline that integrates transformer-based Named Entity Recognition with fine-tuned large language models to enhance sentiment analysis and entity resolution in financial news. This enables a structured understanding of which equities are associated with each headline, laying the groundwork for more targeted sentiment-driven financial analysis. When combined with a fine-tuned Qwen2.5-14B model, the system can not only determine the sentiment of a news item but also tie that sentiment directly to specific financial instruments, making it highly valuable for automated trading systems, portfolio monitoring, and real-time financial decision-making.

In addition, LLaMA2-7B reached an accuracy of 0.871, while the LLaMA3-8B model showcased impressive results, attaining an overall accuracy of 0.923. Although LLaMA3-8B came in only slightly below Qwen2.5-14B in raw accuracy, the larger parameter count of Qwen2.5-14B enabled it to capture more complex patterns in financial text, leading to a slightly higher accuracy of 0.931 and better generalization across diverse sentiment categories.

The ability of both models to accurately classify financial sentiment is critical for applications in financial market analysis, risk management, and investment decision-making. Future work will focus on incorporating more diverse datasets, exploring advanced fine-tuning techniques, and integrating additional contextual information to further enhance the model's capabilities.

In conclusion, the fine-tuned Qwen2.5-14B and LLaMA3-8B models offer powerful tools for sentiment analysis in the financial domain, with significant potential for further advancements and real-world applications.

XV. FUTURE WORK

A promising direction for future work involves the creation of an interactive sentiment analysis agent capable of processing financial news headlines in real time. This agent will use parsed data from various Indian news platforms to perform daily sentiment predictions. The system will automatically classify each headline into positive, neutral, or negative sentiment using the current fine-tuned model. These predictions will then be made available to a panel of financial experts or traders for validation.

The second part of this pipeline introduces a feedback loop from traders, who will review the model's sentiment predictions and provide feedback on their correctness. This feedback, indicating whether a prediction was right or wrong, will be collected and stored as a new dataset. Using this feedback, a reward model will be developed to assess the quality of sentiment predictions. This reward model serves as a scoring system that helps the AI model understand human preferences in the context of financial sentiment.

Finally, the reward data will be used to retrain the sentiment model using Reinforcement Learning from Human Feedback (RLHF). This fine-tuning process will enable the model to learn from real-world judgments instead of just static datasets, making it more adaptive and aligned with human expectations. Over time, the system will form a continuous learning loop where the sentiment agent improves daily, producing more accurate and context-aware sentiment predictions. This approach aims to build a powerful, self-correcting tool for traders, offering real-time, human-aligned financial sentiment insights.

REFERENCES

[1] P. Malo, A. Sinha, P. Korhonen, J. Wallenius, and P. Takala, "Good debt or bad debt: Detecting semantic orientations in economic texts," Journal of the Association for Information Science and Technology, vol. 65, no. 4, pp. 782–796, 2014.
[2] B. Pang, L. Lee et al., "Opinion mining and sentiment analysis," Foundations and Trends in Information Retrieval, vol. 2, no. 1–2, pp. 1–135, 2008.
[3] J. Si, A. Mukherjee, B. Liu, Q. Li, H. Li, and X. Deng, "Exploiting topic based twitter sentiment for stock prediction," in Proc. of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2013, pp. 24–29.
[4] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint arXiv:1810.04805, 2018.
[5] M. Hu and B. Liu, "Mining and summarizing customer reviews," in Proc. of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 168–177.
[6] F. Z. Xing, E. Cambria, and R. E. Welsch, "Natural language based financial forecasting: a survey," Artificial Intelligence Review, vol. 50, no. 1, pp. 49–73, 2018.
[7] S. Kogan, D. Levin, B. R. Routledge, J. S. Sagi, and N. A. Smith, "Predicting risk from financial reports with regression," in Proc. of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2009, pp. 272–280.
[8] B. M. Barber and T. Odean, "All that glitters: The effect of attention and news on the buying behavior of individual and institutional investors," The Review of Financial Studies, vol. 21, no. 2, pp. 785–818, 2008.
[9] X. Zhang, H. Fuehres, and P. A. Gloor, "Predicting stock market indicators through twitter 'I hope it is not as bad as I fear'," Procedia - Social and Behavioral Sciences, vol. 26, pp. 55–62, 2011.
[10] F. Li, "The information content of forward-looking statements in corporate filings—a naïve bayesian machine learning approach," Journal of Accounting Research, vol. 48, no. 5, pp. 1049–1102, 2010.
[11] B.-G. Choi, J. H. Choi, and S. Malik, "Not just for investors: The role of earnings announcements in guiding job seekers," Journal of Accounting and Economics, vol. 76, no. 1, p. 101588, 2023.
[12] B. S. Kumar and V. Ravi, "A survey of the applications of text mining in financial domain," Knowledge-Based Systems, vol. 114, pp. 128–147, 2016.
[13] T. Dettmers, A. Pagnoni, A. Holtzman, and L. Zettlemoyer, "QLoRA: Efficient finetuning of quantized LLMs," Advances in Neural Information Processing Systems, vol. 37, pp. 1–12, 2023.
