Synopsis on Prediction of Share Market Index Movement Using Machine Learning Algorithms
Session 2023-24
In the first stage, data will be collected and pre-processed to obtain a clean data set. The
collected data comprises historical data, live data, and option chain data. Feature selection is
then performed on the collected data using the Random Forest (RF) algorithm. In the second
stage, the pre-processed data will be passed through a long short-term memory (LSTM) model to
predict the trend or movement of the stock market.
In this research, a hybrid model is used for stock market prediction, in which Random Forest is
used only for feature selection. This approach can leverage the strengths of different
algorithms for more accurate predictions. RF can handle categorical as well as numerical data,
which makes it suitable for combining different types of data sources, and its results for
feature selection are reported to be good compared with other models. The LSTM model is used to
predict stock market trends because the literature reports higher accuracy for LSTM in
predicting the market trend than for other machine learning algorithms and other techniques.
LSTM is effective for analysing sequential or time series data: it can capture temporal
dependencies in the data, which will be useful for predicting the stock market trend based on
past close data and volatility in the market.
2. Literature survey
The literature suggests that various machine learning methods have been used for predicting
share market movement. Fatene Dioubi, et al., Nov-2024 [01] found that traditional stock
market prediction methods struggle with the non-linear and non-stationary nature of financial data.
To address this, hybrid models such as the External Trend and Internal Components Analysis
(ETICA) combined with Long Short-Term Memory (LSTM) have demonstrated improved
predictive accuracy, significantly reducing errors like RMSE and MAE. The study conducted by
Chin Yang Lin, et al., 2024 [02] revealed that long short-term memory (LSTM), artificial neural
networks (ANN), and support vector machines (SVM) are the most popular AI methods for stock
market prediction. In addition to this, a study conducted by Kevin Purnomo et al., 2024 [03]
highlights the effectiveness of advanced prediction models like Bidirectional Long Short-Term
Memory (BiLSTM) and Random Forest (RF) in forecasting stock prices of nickel mining
companies. The study reports that BiLSTM consistently outperforms RF in terms of predictive
accuracy, especially for longer forecast time frames, as evidenced by lower RMSE and MAE
values. However, RF demonstrates better computational efficiency, making it suitable for
scenarios with limited computational resources.
For instance, the study by Atif Khan Jadoon, et al., 2024 [04] used LSTM for predicting stock
market direction. LSTM is a notable method for evaluating sequential data and constructing a
robust prediction framework. SR Samarasuriya, et al., 2024 [05] found that the LSTM model is
the most suitable for working with time series data to predict prices. The main programming
language used is Python, and the model will utilize existing Python frameworks and libraries
such as pandas, NumPy, Matplotlib, and scikit-learn. The TensorFlow library will form the core of
the machine learning process, and its built-in methods and tools will be employed. A recurrent
neural network with LSTM layers will be used to handle the time series data that is primarily
used for the model.
School of Engineering & Technology, CPU, Kota, Rajasthan
• To select the key features from the pre-processed data, including open interest (OI) data,
volumes, technical analysis indicators, and volatility indicators, using advanced feature
selection techniques, i.e. Random Forest (RF).
• To evaluate the performance of the designed models and interpret the model evaluation results
to understand the impact of OI data, volumes, technical analysis indicators, and volatility
indicators on stock market prediction.
5. Proposed methodology
Two machine learning algorithms are used to predict the trend in the stock market.
Random Forest (RF) is used for feature selection and Long Short-Term Memory (LSTM) for
modelling real-time or sequential data. Combining them can be a valid approach, and it might
improve results compared with using only one of the two models. The processes involved in
this research are as follows:
Collect the historical data (i.e. price data, candlestick data, volume data, etc.) of stock
futures and the index from the NSE website, or download it from any broker's website. To fetch
live option chain data or price data, use the APIs (Application Programming Interfaces) provided
by financial data providers, or web scraping techniques.
Pre-process the collected data, remove noise and unwanted data, and ensure that the
collected data is in a format suitable for analysis. Combine all the data, i.e. historical
close-based data with live data, open interest data, volume, and volatility data. Split the data
into two sets, i.e. training and testing sets, and normalise or scale the data so that all
features are on the same scale.
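The split-and-scale step can be sketched as follows. This is a minimal illustration, not the study's implementation: the 80/20 chronological split, the choice of scikit-learn's MinMaxScaler, the helper name split_and_scale, and the synthetic columns are all assumptions made for the example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def split_and_scale(features: np.ndarray, train_fraction: float = 0.8):
    """Split rows chronologically, then scale using training statistics only."""
    split_at = int(len(features) * train_fraction)
    train, test = features[:split_at], features[split_at:]

    # Fit the scaler on the training rows only, so that no information
    # from the test period leaks into the scaling parameters.
    scaler = MinMaxScaler()
    train_scaled = scaler.fit_transform(train)
    test_scaled = scaler.transform(test)
    return train_scaled, test_scaled, scaler


# Synthetic stand-in for the combined data set. The column meanings
# (close, volume, open interest, implied volatility) are hypothetical.
rng = np.random.default_rng(0)
demo = np.column_stack([
    18000 + rng.normal(0, 50, 100).cumsum(),  # close price
    rng.integers(1_000, 5_000, 100),          # volume
    rng.integers(10_000, 50_000, 100),        # open interest
    rng.uniform(10, 25, 100),                 # implied volatility (%)
])
train_scaled, test_scaled, scaler = split_and_scale(demo)
print(train_scaled.shape, test_scaled.shape)
```

The rows are split in time order rather than shuffled because the data is a time series; test values may fall slightly outside the 0 to 1 range since the scaler has only seen the training period.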
The features that will play an important role in prediction could include the following:
a. Historical Price Data Features: These can be derived from historical price, volume,
candlestick data, etc., such as the relative strength index (RSI), simple or exponential
moving averages, and other technical indicators that capture past trends and patterns.
b. Open Interest Features: These can be derived from open interest (OI), change in open
interest, the ratio of OI at different strike prices, and OI concentrations at specific strike
prices, which will provide insights into market sentiment and potential support or resistance
levels.
c. Implied Volatility Features: These can be derived from implied volatility (IV), change in
IV, IV ratios, and IV levels at different strike prices, which will help to determine or
predict the overall market or stock volatility and option premium prices.
d. Live Price Data Features: These are derived from current price movements, price levels, and
volumes, which will give real-time information on market conditions.
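As an illustration of the historical price features in item (a), the sketch below computes a simple moving average, an exponential moving average, and RSI with pandas. The 14-period window, the function name add_price_features, and the synthetic price series are assumptions; the RSI here uses plain rolling averages of gains and losses, which is a simplified variant and not Wilder's smoothed version.

```python
import numpy as np
import pandas as pd


def add_price_features(close: pd.Series, window: int = 14) -> pd.DataFrame:
    """Derive SMA, EMA and a simple-average RSI from a closing-price series."""
    out = pd.DataFrame({"close": close})
    out["sma"] = close.rolling(window).mean()
    out["ema"] = close.ewm(span=window, adjust=False).mean()

    # RSI: compare the average gain with the average loss over the window.
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(window).mean()
    avg_loss = (-delta.clip(upper=0)).rolling(window).mean()
    rs = avg_gain / avg_loss
    out["rsi"] = 100 - 100 / (1 + rs)
    return out


# Synthetic closing prices standing in for real index data.
rng = np.random.default_rng(1)
close = pd.Series(18000 + rng.normal(0, 50, 60).cumsum())
features = add_price_features(close)
print(features.tail())
```

The first rows of the rolling columns are empty because a full window of history is needed; such rows would normally be dropped before training. Open interest and implied volatility features (for example, change in OI) could be derived in the same style once the option chain data is available.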
The most important step in Random Forest feature selection is to identify the most relevant
features among those listed above, which can improve the quality of the RF output. The RF
output will be used as input for the LSTM model for predicting share market trends. The
identified features will play an important role in predicting future price movement. By using
RF for feature selection, the model focuses on the most important features and the
dimensionality of the input data is reduced, which will improve the efficiency of the LSTM
model. Combining the RF and LSTM models will leverage the strengths of both and potentially
improve the overall performance of the prediction system.
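One common way to use Random Forest for feature selection is to rank features by the forest's impurity-based importances and keep the top few. The sketch below assumes this approach with scikit-learn; the feature names, the number of trees, the top_k cut-off, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def select_top_features(X, y, feature_names, top_k=2, seed=0):
    """Rank features by Random Forest importance and return the top_k names."""
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    importances = forest.feature_importances_
    order = np.argsort(importances)[::-1][:top_k]
    return [feature_names[i] for i in order], importances


# Synthetic data: the up/down label depends only on the first two columns,
# so those two should receive the highest importance.
rng = np.random.default_rng(0)
names = ["rsi", "oi_change", "iv_change", "volume", "noise"]
X = rng.normal(size=(400, len(names)))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

selected, importances = select_top_features(X, y, names)
print(selected)
```

In practice the forest would be fitted on the training period only, and only the selected columns would be passed on to the LSTM stage.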
[Flowchart: Start -> Feature Selection using Random Forest -> Data Preparation for LSTM ->
LSTM Training Model -> Continuous Monitoring & Adjustment -> End]
The most important step for the LSTM model is to train it to generate predictions of the
stock market. The LSTM model is well suited to sequential or time series data, i.e. stock
prices and volumes, as it can capture patterns and long-term dependencies in the data. Reshape
the data sets into a format suitable for LSTM, such as a 3D array of samples, time steps, and
features, where the features are those received as input from the RF model. Split the data into
two sets, i.e. training and testing sets.
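The reshaping into a 3D array can be sketched with a sliding window over the feature rows. The window length of 10 steps, the helper name make_windows, and the synthetic arrays are assumptions for illustration.

```python
import numpy as np


def make_windows(features: np.ndarray, targets: np.ndarray, time_steps: int):
    """Build (samples, time_steps, features) windows and their next-step targets."""
    X, y = [], []
    for end in range(time_steps, len(features)):
        # Rows end - time_steps .. end - 1 form the input window;
        # the target is the value at row `end`, i.e. the following step.
        X.append(features[end - time_steps:end])
        y.append(targets[end])
    return np.asarray(X), np.asarray(y)


# 100 time steps with 3 selected features and a binary up/down target.
rng = np.random.default_rng(0)
feature_rows = rng.normal(size=(100, 3))
direction = (rng.random(100) > 0.5).astype(int)

X_windows, y_windows = make_windows(feature_rows, direction, time_steps=10)
print(X_windows.shape, y_windows.shape)
```

Each sample contains only past rows relative to its target, so the model is never shown the value it is asked to predict.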
Apply the training set as input to the LSTM model to train it, and finally use the trained
LSTM model to generate predictions from the samples, time steps, and features of the selected
test set. Continuously monitor the performance of the LSTM model and adjust it as necessary
based on market conditions and new data.
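A minimal sketch of the training and prediction step with TensorFlow's Keras API is given below. The architecture (one LSTM layer of 32 units followed by a sigmoid output), the binary up/down target, the two training epochs, and the synthetic data are assumptions made to keep the example small; they are not the configuration of the proposed model.

```python
import numpy as np
import tensorflow as tf


def build_lstm(time_steps: int, n_features: int) -> tf.keras.Model:
    """Small LSTM classifier that outputs the probability of an upward move."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(time_steps, n_features)),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Synthetic 3D input: 120 samples, 10 time steps, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 10, 3)).astype("float32")
y = (X[:, -1, 0] > 0).astype("float32")

# Chronological split: first 100 samples for training, last 20 for testing.
X_train, X_test = X[:100], X[100:]
y_train, y_test = y[:100], y[100:]

model = build_lstm(time_steps=10, n_features=3)
model.fit(X_train, y_train, epochs=2, batch_size=16, verbose=0)

probabilities = model.predict(X_test, verbose=0)
predicted_up = (probabilities[:, 0] > 0.5).astype(int)
print(predicted_up)
```

For continuous monitoring, the same fit and predict calls could be repeated on a rolling basis as new data arrives, with the test metrics tracked over time.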
a. The gap in the existing literature is addressed by combining historical price data, live
price data, OI, IV data, and volume for stock market prediction. This holistic approach is
expected to improve the reliability and accuracy of stock market prediction for traders,
financial analysts, and investors.
b. This research contributes to machine learning by demonstrating the benefits of the Random
Forest (RF) and LSTM algorithms in stock market prediction and by showcasing the practical
application of these algorithms in a real-world financial context.
c. This research has a direct application in the financial industry, where it will provide
valuable insights and a data-driven approach for investors, traders, and financial institutions
to predict the stock market.
Collect the historical and live price data, OI data, IV data, and trading volume data from
the NSE website or other authorized sources.
Extract the features from the collected data and develop the machine learning models, i.e. RF
for feature selection and LSTM for time series modelling.
Evaluate the performance of the models using accuracy, F1 score, precision, etc., and
interpret the results to understand the impact of each feature on prediction accuracy. To
ensure the reliability and robustness of the models, validate them using out-of-sample data.
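The evaluation metrics named above can be computed with scikit-learn as sketched below. The helper name evaluate_direction and the example labels are hypothetical; the labels encode 1 for an upward move and 0 for a downward move.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score


def evaluate_direction(y_true, y_pred):
    """Return accuracy, precision and F1 score for up/down predictions."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
    }


# Hypothetical out-of-sample labels and model predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = evaluate_direction(y_true, y_pred)
print(scores)
```

In this example there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so accuracy, precision, and F1 score all equal 0.75.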
Write a research paper on the literature review and the methodology implemented in the
research, and publish it in a peer-reviewed journal.
Implement the final model for real-time prediction and analyze the performance of each
model. Also identify areas for improvement.
Write a research paper detailing the methodology and results of the study and submit the
paper to a peer-reviewed journal for publication.
8. References
1. Fatene Dioubi, Negalign Wake Hundera, Huiying Xu, Xinzhong Zhu, “Enhancing
stock market predictions via hybrid external trend and internal components analysis
and long short term memory model”, November 2024, 1319-1578, The Authors. Published
by Elsevier Ltd, https://doi.org/10.1016/j.jksuci.2024.102252.
2. Chin Yang Lin, João Alexandre Lobo Marques, “Stock market prediction using
artificial intelligence: A systematic review of systematic reviews”, March 2024, The
Authors. Published by Elsevier Ltd, https://doi.org/10.1016/j.ssaho.2024.100864.
3. Kevin Purnomo, Raphaelle Albetho Wijaya, Muhamad Fajar, Puti Andam Suri,
“Comparative Analysis of BiLSTM Deep Learning Model and Random Forest
Regressor Performance on Indonesian Nickel mining company stock Prices”,
1877-0509 © 2024 The Authors. Published by Elsevier B.V., Procedia Computer Science 245
(2024) 778–786.
4. Atif Khan Jadoon, Tariq Mahmood, Ambreen Sarwar, Maria Faiq Javaid, Munawar
Iqbal, “Prediction of Stock Market Movement Using Long Short-Term Memory
(LSTM) Artificial Neural Network: Analysis of KSE 100 Index”, Journal of Life and
Social Sciences, 2024, https://doi.org/10.57239/PJLSS-2024-22.1.009.
5. SR Samarasuriya, DVDS Abeysinghe, KGK Abeywardhane, “Recurrent Neural
Network for Stock Market Forecasting using Long Short-Term Memory and an
Analysis of How Social Media Affects Share Prices”, International Journal of
Computer Applications (0975 – 8887) Volume 186 – No.10, February 2024.
6. Anum Shafique, Nousheen Tariq Bhutta, “Assessing conditional volatility due to trade
war in the G-7 stock markets”, December 2023, 2590-2911/© 2023,
https://doi.org/10.1016/j.ssaho.2023.100768.
7. Rahul Jain and Rakesh Vanzara, “Emerging Trends in AI-Based Stock Market
Prediction: A Comprehensive and Systematic Review”, MDPI, Engineering
Proceedings, 2023, 56, 254, November 2023,
https://doi.org/10.3390/ASEC2023-15965.
8. Saranya K, Vijayashaarathi S, Sasirekha N, Koushikrajaa M, Lohith RakshaS, “Stock
market price prediction using machine learning”, Journal of Population Therapeutics
& Clinical Pharmacology, Vol. 30(12), e130-e136, May 2023.
9. Althaf S, Anto Antony Cyriac, Ben Korulla Thomas, Jais Babu, Gayathri Mohan,
“Stock Market Prediction Using Artificial Intelligence”, International Research
Journal of Modernization in Engineering Technology and Science,
Volume:05/Issue:05/May-2023.
10. Feipeng Zhang, Yilin Zhang, Yixiong Xu, Yan Chen, “Dynamic relationship between
volume and volatility in the Chinese stock market: evidence from the MS-VAR model”,
October 2023, 2666-7649/© 2023, https://doi.org/10.1016/j.dsm.2023.09.003.
11. Chenjiang Bai, Yuejiao Duan, Xiaoyun Fan, Shuai Tang, “Financial market sentiment
and stock return during the COVID-19 pandemic”, February 2023, 1544-6123/©
2023, https://doi.org/10.1016/j.frl.2023.103709.
12. Latrisha N. Mintarya, Jeta N. M. Halim, Callista Angie, Said Achmad, Aditya
Kurniawan, “Machine learning approaches in stock market prediction: A systematic
literature review”, 1877-0509 © 2023, 7th International Conference on Computer
Science and Computational Intelligence 2022.
13. Dharmaraja Selvamuthu, Vineet Kumar and Abhishek Mishra, “Indian stock market
prediction using artificial neural networks on tick data”, Financial
Innovation (2019) 5:16, March 2019, https://doi.org/10.1186/s40854-019-0131-7.