Synopsis Report
on
Stock Price Prediction using Deep Learning
Submitted as partial fulfillment for the award of
BACHELOR OF TECHNOLOGY
DEGREE
Session 2022-23 in
CSE-Data Science
By:
Upendra Patel
2100321540174
DEPARTMENT OF CSE-DS
ABES ENGINEERING COLLEGE, GHAZIABAD
AFFILIATED TO
DR. A.P.J. ABDUL KALAM TECHNICAL UNIVERSITY, U.P., LUCKNOW
(Formerly UPTU)
Student’s Declaration
I hereby declare that the work being presented in this report entitled “Stock Price Prediction
using Deep Learning” is an authentic record of my own work carried out under the
supervision of Mr. Prabhat Singh, Assistant Professor, CSE-DS. The matter embodied in this
report has not been submitted by me for the award of any other degree.
Date:
Signature of Student:
Upendra Patel
2100321540174
Department: Data Science
This is to certify that the above statement made by the candidate(s) is correct to the best of my
knowledge.
Acknowledgment
Signature of Student:
Upendra Patel
2100321540174
Department: Data Science
TABLE OF CONTENTS
Abstract iv
Chapter 1: Introduction 1
References 8
LIST OF ABBREVIATIONS
LSTM - Long Short-Term Memory
CNN - Convolutional Neural Network
RNN - Recurrent Neural Network
ANN - Artificial Neural Network
SVM - Support Vector Machine
ML - Machine Learning
DL - Deep Learning
ATS - Automated Trading System
NSE - National Stock Exchange (of India)
BSE - Bombay Stock Exchange
ABSTRACT
This project predicts stock prices from historical market data using deep learning, training a
Long Short-Term Memory (LSTM) network on daily closing prices. In the future, I would like
to extend the prediction to Android using Django, take meaningful data from the NSE and BSE
websites, and inform the user when to buy or sell a particular stock using fundamental and
technical analysis.
Keywords: LSTM, CNN, ML, DL, Trade Open, Trade Close, Trade Low, Trade
High.
Chapter 1
INTRODUCTION
The financial market is a dynamic and complex system in which people can buy and sell
currencies, stocks, equities, and derivatives over virtual platforms supported by brokers. The
stock market allows investors to own shares of public companies through trading, either on
exchanges or in over-the-counter markets. This market gives investors a chance to grow their
wealth from small initial investments, at lower risk than opening a new business or depending
on a high-salary career. Stock markets are affected by many factors, which cause uncertainty
and high volatility. Although humans can take orders and submit them to the market, automated
trading systems (ATS) operated by computer programs can submit orders faster and more
consistently than any human. However, to evaluate and control the performance of an ATS,
risk strategies and safety measures based on human judgment are still required. Many factors
are considered when developing an ATS, for instance the trading strategy to be adopted,
complex mathematical functions that reflect the state of a specific stock, machine learning
algorithms that enable the prediction of future stock values, and specific news related to the
stock being analyzed.
Section 1.1
Stock price prediction is a classic and important problem. With a successful model for stock
prediction, we can gain insight into market behavior over time, spotting trends that would
otherwise have gone unnoticed. With the increasing computational power available, machine
learning is an efficient way to approach this problem. However, public stock datasets are too
limited for many machine learning algorithms to work with, while buying additional features
can cost thousands of dollars every day. In this report, we introduce a framework that integrates
user predictions into a machine learning algorithm trained on public historical data to improve
results. The motivating idea is that if we knew all the information about today’s stock trading
(across all traders), the price would be predictable; thus, if we can obtain even partial
information, we can expect to improve the current prediction considerably.
With the growth of the Internet, social networks, and online social interactions, collecting
daily user predictions is feasible. Our motivation is therefore to design a public service that
incorporates historical data and user predictions to build a stronger model that benefits
everyone.
Section 1.2
PROBLEM STATEMENT
Time Series forecasting & modeling plays an important role in data analysis. Timeseries
analysis is a specialized branch of statistics used extensively in fields such as Econometrics
&Operation Research. Time Series is being widely used in analytics & data science. Stock
prices are volatile in nature and price depends on various factors. The main aim of this
project is to predict stock prices using Long short-term memory (LSTM).
CHAPTER 2
METHODOLOGY
Stock price prediction has been performed through various methods in the past, such as SVM,
ANN, and RNN. However, as technology has evolved over the years, more efficient methods
have emerged, such as the LSTM network used in this project.
Section 2.1
Working of LSTM:
An LSTM is a special network structure built around three “gate” structures. Three gates are
placed in each LSTM unit, called the input gate, the forget gate, and the output gate. As information
flows through the LSTM network, the gates decide what to keep: information that passes their
filters is retained, while the rest is discarded through the forget gate.
Firstly, at a basic level, the output of an LSTM at a particular point in time depends on three
things:
▹ The current long-term memory of the network — known as the cell state
▹ The output at the previous point in time — known as the previous hidden state
▹ The input data at the current time step
LSTMs use a series of ‘gates’ which control how the information in a sequence of data comes into, is
stored in, and leaves the network. There are three gates in a typical LSTM: the forget gate, the input
gate, and the output gate. These gates can be thought of as filters, and each is its own neural network.
We will explore them all in detail in this section.
In the following explanation, we consider an LSTM cell as visualized in the diagram below. When
looking at the diagrams in this section, imagine moving from left to right.
LSTM Diagram
Step 1
The first step in the process is the forget gate. Here we will decide which bits of the cell state (long
term memory of the network) are useful given both the previous hidden state and new input data.
Forget Gate
To do this, the previous hidden state and the new input data are fed into a neural network. This
network generates a vector in which each element lies in the interval [0,1] (ensured by using the
sigmoid activation). This network (within the forget gate) is trained so that it outputs close to 0 when
a component of the input is deemed irrelevant and closer to 1 when relevant. It is useful to think of
each element of this vector as a sort of filter/sieve that lets more information through as the value
gets closer to 1.
These outputted values are then sent up and pointwise multiplied with the previous cell state. This
pointwise multiplication means that components of the cell state which have been deemed irrelevant
by the forget gate network will be multiplied by a number close to 0 and thus will have less influence
on the following steps.
In summary, the forget gate decides which pieces of the long-term memory should now be forgotten
(have less weight) given the previous hidden state and the new data point in the sequence.
Step 2
The next step involves the new memory network and the input gate. The goal of this step is to
determine what new information should be added to the network's long-term memory (cell state).
Input Gate
Both the new memory network and the input gate are neural networks in themselves, and both take
the same inputs: the previous hidden state and the new input data. It is worth noting that the inputs
here are actually the same as the inputs to the forget gate.
1. The new memory network is a tanh-activated neural network which has learned how to
combine the previous hidden state and the new input data to generate a ‘new memory update
vector’. This vector essentially contains information from the new input data given the
context of the previous hidden state. It tells us how much to update each component of the
long-term memory (cell state) of the network given the new data.
Note that we use a tanh here because its values lie in [-1,1] and so can be negative. The
possibility of negative values is necessary if we wish to reduce the impact of a component
in the cell state.
2. However, in part 1 above, where we generate the new memory vector, there is a big
problem: it does not actually check whether the new input data is even worth remembering.
This is where the input gate comes in. The input gate is a sigmoid-activated network which
acts as a filter, identifying which components of the ‘new memory vector’ are worth
retaining. This network outputs a vector of values in [0,1] (due to the sigmoid activation),
allowing it to act as a filter through pointwise multiplication. Similar to what we saw in the
forget gate, an output near zero tells us we do not want to update that element of the cell
state.
3. The outputs of parts 1 and 2 are pointwise multiplied. The filter from part 2 thereby
regulates the magnitude of the new information generated in part 1, setting components to 0
where needed. The resulting combined vector is then added to the cell state, updating the
long-term memory of the network.
Step 3
Now that our updates to the long-term memory of the network are complete, we can move to the
final step, the output gate, which decides the new hidden state. To decide this, we use three things:
the newly updated cell state, the previous hidden state, and the new input data.
One might think that we could simply output the updated cell state; however, this would be
comparable to someone unloading everything they had ever learned about the stock market every
time they were asked a simple question about it. To prevent this from happening, we create a filter,
the output gate, exactly as we did in the forget gate network. The inputs are the same (previous
hidden state and new data), and the activation is also sigmoid (since we want the filter property
gained from outputs in [0,1]).
Output Gate
As mentioned, we want to apply this filter to the newly updated cell state. This ensures that only
necessary information is output (saved to the new hidden state). However, before applying the filter,
we pass the cell state through a tanh to force the values into the interval [-1,1].
▹ Apply the tanh function to the current cell state pointwise to obtain the squished cell state, which
now lies in [-1,1].
▹ Pass the previous hidden state and current input data through the sigmoid-activated neural
network to obtain the filter vector.
▹ Apply this filter vector to the squished cell state by pointwise multiplication.
▹ Output the new hidden state!
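To make the three gates concrete, the following is a minimal NumPy sketch of a single LSTM time
step. It is illustrative only: the weight names (W, U, b) and the plain matrix-vector products are
assumptions made for exposition, not the notation of any particular library.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U, b hold one weight matrix / bias vector per gate, keyed by
    # 'f' (forget), 'i' (input), 'g' (new memory) and 'o' (output).
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])  # forget gate: which parts of c_prev to keep
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])  # input gate: which parts of the update to admit
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])  # new memory update vector, values in [-1, 1]
    c_t = f * c_prev + i * g                              # updated cell state (long-term memory)
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])  # output gate: which parts of c_t to expose
    h_t = o * np.tanh(c_t)                                # new hidden state
    return h_t, c_t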
CHAPTER 3
HARDWARE REQUIREMENTS
System configuration:
• RAM: 4 GB
• Storage: 500 GB
• CPU: 2 GHz or faster
• Architecture: 32-bit or 64-bit
CHAPTER 4
Stock Prediction Web app - Jupyter Notebook
In [9]:
import yfinance as yf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas_datareader import data as pdr
yf.pdr_override()
import datetime as dt
import json
import tensorflow as tf
from tensorflow import keras
In [10]:
start = '2010-01-01'
end = '2022-12-31'
[*********************100%***********************] 1 of 1 completed
Out[10]:
[DataFrame preview with Date index; table output not captured]
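Note: the download call in the cell above appears to have been lost in extraction. A minimal sketch of
what it likely contained is given below; the 'AAPL' ticker is an assumption (the closing prices shown
later, about 7.6 in early 2010 and roughly 150 in late 2022, are consistent with Apple's split-adjusted
history), and df.head() would have produced the truncated preview above.
df = pdr.get_data_yahoo('AAPL', start, end)  # ticker is an assumption, not confirmed by the report
df.head()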
In [11]:
df.tail()
Out[11]:
[DataFrame preview with Date index; table output not captured]
In [12]:
df = df.reset_index()
df.head()
Out[12]:
In [13]:
Out[13]:
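Note: the contents of In [13] were not captured. Since df.shape is later reported as (3272, 5), this cell
most likely dropped the Date and Adj Close columns after the reset_index above; a sketch under that
assumption:
df = df.drop(['Date', 'Adj Close'], axis=1)  # assumption: leaves Open, High, Low, Close, Volume (5 columns)
df.head()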
In [16]:
plt.plot(df.Close)
Out[16]:
[<matplotlib.lines.Line2D at 0x1825bfc77c0>]
In [18]:
df
Out[18]:
In [19]:
Out[19]:
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
...
3267 150.515600
3268 150.157800
3269 149.764699
3270 149.412100
3271 149.062199
Name: Close, Length: 3272, dtype: float64
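Note: the code for In [19] was not captured. Given the leading NaN values in the output above and the
'Moving 100 day average' label used in the next plot, it almost certainly computed a 100-day rolling
mean of the closing price; a sketch:
ma100 = df.Close.rolling(100).mean()
ma100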
In [25]:
plt.figure(figsize = (12,6))
plt.plot(df.Close, label = 'Closing Price')
plt.plot(ma100,'r', label='Moving 100 day average')
plt.legend()
plt.show()
In [26]:
Out[26]:
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
...
3267 152.1331
3268 152.0096
3269 151.8867
3270 151.7593
3271 151.6110
Name: Close, Length: 3272, dtype: float64
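Note: as with ma100, the code for this cell was not captured; the 'Moving 200 day average' label in the
next plot suggests a 200-day rolling mean:
ma200 = df.Close.rolling(200).mean()
ma200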
In [11]:
plt.figure(figsize = (12,6))
plt.plot(df.Close,label='Closing Price')
plt.plot(ma100,'r',label = 'Moving 100 day average')
plt.plot(ma200,'g',label = 'Moving 200 day average')
plt.legend()
plt.show()
In [12]:
df.shape
Out[12]:
(3272, 5)
In [13]:
data_training = pd.DataFrame(df['Close'][0:int(len(df)*0.70)])
data_testing = pd.DataFrame(df['Close'][int(len(df)*0.70):int(len(df))])
print(data_training.shape)
print(data_testing.shape)
(2290, 1)
(982, 1)
In [14]:
data_training.head()
Out[14]:
Close
0 7.643214
1 7.656429
2 7.534643
3 7.520714
4 7.570714
In [15]:
from sklearn.preprocessing import MinMaxScaler # data scaling: maps the closing prices into a fixed range,
# which helps the network train more stably; this is a data preprocessing step
scaler = MinMaxScaler(feature_range=(0,1))
In [16]:
data_training_array = scaler.fit_transform(data_training)
data_training_array
Out[16]:
array([[0.01533047],
[0.01558878],
[0.01320823],
...,
[0.71710501],
[0.71739828],
[0.70127194]])
In [17]:
data_training_array.shape
Out[17]:
(2290, 1)
In [18]:
x_train = [] # input windows, e.g. 100 consecutive days of scaled closing prices
y_train = [] # targets: the scaled closing price on day 101, i.e. the day after each window
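Note: the loop that fills these lists was not captured. The shapes reported in the next two cells,
(2190, 100, 1) and (2190,), are consistent with a 100-day sliding window over the 2290 scaled training
points; a sketch under that assumption:
# Build sliding windows: 100 consecutive scaled closing prices in, the next day's price out
for i in range(100, data_training_array.shape[0]):
    x_train.append(data_training_array[i-100:i])
    y_train.append(data_training_array[i, 0])
x_train, y_train = np.array(x_train), np.array(y_train)  # -> (2190, 100, 1) and (2190,)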
In [19]:
x_train.shape
Out[19]:
(2190, 100, 1)
In [20]:
y_train.shape
Out[20]:
(2190,)
In [21]:
#ML Model
In [22]:
from keras.layers import Dense, Dropout, LSTM
from keras.models import Sequential, model_from_json
In [23]:
model = Sequential()
model.add(LSTM(units = 50, activation = 'relu', return_sequences= True, input_shape = (x_train.shape[1],1)))
model.add(Dropout(0.2))
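Note: only the first LSTM layer of the model survived extraction. The summary below reports 178,761
trainable parameters, which is consistent with a stacked LSTM of 50, 60, 80, and 120 units (with
increasing dropout) followed by a single Dense output; the remaining layers sketched here are therefore
an inference from that parameter count, not code confirmed by the report.
model.add(LSTM(units = 60, activation = 'relu', return_sequences = True))
model.add(Dropout(0.3))
model.add(LSTM(units = 80, activation = 'relu', return_sequences = True))
model.add(Dropout(0.4))
model.add(LSTM(units = 120, activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(units = 1))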
In [24]:
model.summary()
Model: "sequential"
=================================================================
Total params: 178,761
Trainable params: 178,761
Non-trainable params: 0
In [25]:
model.compile(optimizer = 'adam', loss = 'mean_squared_error')
model.fit(x_train , y_train, epochs = 50)
Epoch 1/50
69/69 [==============================] - 20s 206ms/step - loss: 0.0347
Epoch 2/50
69/69 [==============================] - 14s 199ms/step - loss: 0.0070
Epoch 3/50
69/69 [==============================] - 14s 201ms/step - loss: 0.0065
Epoch 4/50
69/69 [==============================] - 14s 199ms/step - loss: 0.0063
Epoch 5/50
69/69 [==============================] - 14s 204ms/step - loss: 0.0054
Epoch 6/50
69/69 [==============================] - 15s 217ms/step - loss: 0.0047
Epoch 7/50
69/69 [==============================] - 15s 215ms/step - loss: 0.0049
Epoch 8/50
69/69 [==============================] - 14s 207ms/step - loss: 0.0049
Epoch 9/50
69/69 [==============================] - 16s 234ms/step - loss: 0.0048
Epoch 10/50
69/69 [==============================] - 15s 218ms/step - loss: 0.0040
Epoch 11/50
69/69 [==============================] - 14s 206ms/step - loss: 0.0038
Epoch 12/50
69/69 [==============================] - 14s 209ms/step - loss: 0.0038
Epoch 13/50
69/69 [==============================] - 15s 220ms/step - loss: 0.0033
Epoch 14/50
69/69 [==============================] - 17s 243ms/step - loss: 0.0035
Epoch 15/50
69/69 [==============================] - 15s 219ms/step - loss: 0.0037
Epoch 16/50
69/69 [==============================] - 14s 204ms/step - loss: 0.0039
Epoch 17/50
69/69 [==============================] - 15s 213ms/step - loss: 0.0034
Epoch 18/50
69/69 [==============================] - 15s 214ms/step - loss: 0.0033
Epoch 19/50
69/69 [==============================] - 14s 208ms/step - loss: 0.0031
Epoch 20/50
69/69 [==============================] - 15s 218ms/step - loss: 0.0028
Epoch 21/50
69/69 [==============================] - 15s 220ms/step - loss: 0.0030
Epoch 22/50
69/69 [==============================] - 16s 227ms/step - loss: 0.0029
Epoch 23/50
69/69 [==============================] - 15s 218ms/step - loss: 0.0022
Epoch 24/50
69/69 [==============================] - 15s 224ms/step - loss: 0.0024
Epoch 25/50
69/69 [==============================] - 15s 221ms/step - loss: 0.0025
Epoch 26/50
69/69 [==============================] - 15s 211ms/step - loss: 0.0024
Epoch 27/50
69/69 [==============================] - 14s 209ms/step - loss: 0.0023
Epoch 28/50
69/69 [==============================] - 15s 223ms/step - loss: 0.0022
Epoch 29/50
69/69 [==============================] - 15s 222ms/step - loss: 0.0022
Epoch 30/50
69/69 [==============================] - 16s 229ms/step - loss: 0.0020
Epoch 31/50
69/69 [==============================] - 16s 228ms/step - loss: 0.0021
Epoch 32/50
69/69 [==============================] - 16s 237ms/step - loss: 0.0022
Epoch 33/50
69/69 [==============================] - 16s 233ms/step - loss: 0.0023
Epoch 34/50
69/69 [==============================] - 16s 232ms/step - loss: 0.0020
Epoch 35/50
69/69 [==============================] - 16s 226ms/step - loss: 0.0018
Epoch 36/50
69/69 [==============================] - 16s 233ms/step - loss: 0.0018
Epoch 37/50
69/69 [==============================] - 15s 215ms/step - loss: 0.0020
Epoch 38/50
69/69 [==============================] - 16s 238ms/step - loss: 0.0020
Epoch 39/50
69/69 [==============================] - 15s 224ms/step - loss: 0.0021
Epoch 40/50
69/69 [==============================] - 16s 235ms/step - loss: 0.0017
Epoch 41/50
69/69 [==============================] - 16s 225ms/step - loss: 0.0017
Epoch 42/50
69/69 [==============================] - 14s 204ms/step - loss: 0.0018
Epoch 43/50
69/69 [==============================] - 15s 220ms/step - loss: 15.4308
<keras.callbacks.History at 0x20de7492d90>
In [26]:
model.save('keras_model4.keras')
In [27]:
data_testing.head()
Out[27]:
Close
2290 42.602501
2291 42.357498
2292 42.722500
2293 42.544998
2294 42.700001
In [28]:
past_100_days = data_training.tail(100)
In [29]:
final_df = pd.concat((past_100_days, data_testing), ignore_index = True, axis=0)
In [30]:
final_df.head()
Out[30]:
Close
0 55.959999
1 54.470001
2 54.560001
3 54.592499
4 55.007500
In [31]:
input_data = scaler.fit_transform(final_df) # note: this refits the scaler on data that includes the test period;
# scaler.transform(final_df) would avoid look-ahead into the test set
input_data
Out[31]:
array([[0.13937014],
[0.1291969 ],
[0.1298114 ],
...,
[0.61785443],
[0.64222927],
[0.64441407]])
In [32]:
input_data.shape
Out[32]:
(1082, 1)
In [33]:
x_test = []
y_test = []
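Note: the windowing loop for the test data was not captured. The shapes printed in the next cell,
(982, 100, 1) and (982,), are consistent with the same 100-day sliding window applied to the 1082-row
input_data; a sketch:
for i in range(100, input_data.shape[0]):
    x_test.append(input_data[i-100:i])
    y_test.append(input_data[i, 0])
x_test, y_test = np.array(x_test), np.array(y_test)  # -> (982, 100, 1) and (982,)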
In [34]:
(982, 100, 1)
(982,)
In [35]:
#Making Predictions
y_predicted = model.predict(x_test)
In [36]:
y_predicted.shape
Out[36]:
(982, 1)
In [37]:
y_test
Out[37]:
In [38]:
y_predicted
Out[38]:
array([[0.09207357],
[0.09277508],
[0.09351471],
[0.09425803],
[0.09497693],
[0.09565249],
[0.09627241],
[0.09683166],
[0.09733434],
[0.09778409],
[0.09819143],
[0.09857252],
[0.09894121],
[0.09930849],
[0.09967425],
[0.10003999],
[0.1004099 ],
[0.1007849 ],
In [42]:
scaler.scale_ # the factor by which the data was scaled down; its inverse restores the original price scale
Out[42]:
array([0.00682769])
In [44]:
scale_factor = 1/0.00682769 # equivalently, 1/scaler.scale_[0]
y_predicted = y_predicted * scale_factor
y_test = y_test * scale_factor
In [45]:
plt.figure(figsize=(12,6))
plt.plot(y_test, 'b', label = "Original Price")
plt.plot(y_predicted, 'r', label = "Predicted Price")
plt.xlabel('Time')
plt.ylabel('Price')
plt.legend()
plt.show()
In [ ]:
# Streamlit web app. This cell relies on the imports from the cells above
# (np, pd, plt, pdr, keras, MinMaxScaler). Reconstructed where the original
# was incomplete; the ticker and the describe() call are assumptions.
import streamlit as st
start = '2010-01-01'
end = '2022-12-31'
df = pdr.get_data_yahoo('AAPL', start, end)
df = df.reset_index()
#Describing Data
st.write(df.describe())
#Visualization
fig = plt.figure(figsize=(12,6))
plt.plot(df.Close, label='Closing Price')
plt.legend()
st.pyplot(fig)
data_training = pd.DataFrame(df['Close'][0:int(len(df)*0.70)])
data_testing = pd.DataFrame(df['Close'][int(len(df)*0.70):int(len(df))])
scaler = MinMaxScaler(feature_range=(0,1))
data_training_array = scaler.fit_transform(data_training)
#Load My model
model = keras.models.load_model('keras_model4.keras')
#Testing Part
past_100_days = data_training.tail(100)
final_df = pd.concat((past_100_days, data_testing), ignore_index = True, axis=0)
input_data = scaler.fit_transform(final_df)
x_test = []
y_test = []
for i in range(100, input_data.shape[0]):
    x_test.append(input_data[i-100:i])
    y_test.append(input_data[i, 0])
x_test, y_test = np.array(x_test), np.array(y_test)
y_predicted = model.predict(x_test)
# Rescale predictions and targets back to price units
scale_factor = 1/scaler.scale_[0]
y_predicted = y_predicted * scale_factor
y_test = y_test * scale_factor
#Final Graph
st.subheader('Predictions vs Original')
fig2 = plt.figure(figsize=(12,6))
plt.plot(y_test, 'b', label = "Original Price")
plt.plot(y_predicted, 'r', label = "Predicted Price")
plt.xlabel('Time')
plt.ylabel('Price')
plt.legend()
st.pyplot(fig2)
Future work:
We want to extend this application to predict cryptocurrency prices and to add sentiment
analysis for better results. We would also like to develop an Android app that can make live
predictions using live data.
REFERENCES
[2] Nandakumar, R., Uttamraj, K. R., Vishal, R., and Lokeswari, Y. V. "Stock price prediction
using long short term memory." International Research Journal of Engineering and Technology
5.03 (2018).
[3] Roondiwala, Murtaza, Harshal Patel, and Shraddha Varma. "Predicting stock prices using
LSTM." International Journal of Science and Research (IJSR) 6.4 (2017): 1754-1756.
[4] Pahwa, Kunal, and Neha Agarwal. "Stock market analysis using supervised machine
learning." 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel
Computing (COMITCon). IEEE, 2019.
[5] Rajput, Vivek, and Sarika Bobde. "Stock market forecasting techniques: literature survey."
International Journal of Computer Science and Mobile Computing 5.6 (2016): 500-506.
[6] Singh, S., Madan, T. K., Kumar, J., and Singh, A. K. "Stock market forecasting using
machine learning: Today and tomorrow." 2019 2nd International Conference on Intelligent
Computing, Instrumentation and Control Technologies (ICICICT), Vol. 1, pp. 738-745. IEEE, 2019.
[7] Yu, Pengfei, and Xuesong Yan. "Stock price prediction based on deep neural networks."
Neural Computing and Applications 32.6 (2020): 1609-1628.
[8] Nayak, Aparna, M. M. Manohara Pai, and Radhika M. Pai. "Prediction models for Indian
stock market." Procedia Computer Science 89 (2016): 441-449.
[9] Ghosh, Achyut, Soumik Bose, Giridhar Maji, Narayan Debnath, and Soumya Sen. "Stock
price prediction using LSTM on Indian share market." Proceedings of 32nd International
Conference on, Vol. 63, pp. 101-110. 2019.
[10] Wei, Dou. "Prediction of stock price based on LSTM neural network." 2019 International
Conference on Artificial Intelligence and Advanced Manufacturing (AIAM). IEEE, 2019.