
Crypto-Currency Price Prediction Using Decision Tree and Regression Techniques

This document discusses predicting the price of Bitcoin using machine learning models. It describes using a recurrent neural network and LSTM model to classify the direction of future Bitcoin prices based on historical data. The LSTM achieved 52% accuracy, outperforming an ARIMA model. Both deep learning models were tested on GPU and CPU hardware, with the GPU providing faster training times.

Uploaded by

Bhargav Raj
Copyright
© © All Rights Reserved

Predicting the Price of Bitcoin Using Machine Learning

Abstract:

The goal of this paper is to ascertain with what accuracy the direction of Bitcoin price in USD
can be predicted. The price data is sourced from the Bitcoin Price Index. The task is
achieved with varying degrees of success through the implementation of a Bayesian
optimised recurrent neural network (RNN) and a Long Short Term Memory (LSTM) network.
The LSTM achieves the highest classification accuracy of 52% and an RMSE of 8%. The
popular ARIMA model for time series forecasting is implemented as a comparison to the
deep learning models. As expected, the non-linear deep learning methods outperform the
ARIMA forecast which performs poorly. Finally, both deep learning models are benchmarked
on both a GPU and a CPU with the training time on the GPU outperforming the CPU
implementation by 67.7%.

Motivation:

Bitcoin's value fluctuates much like that of a stock, but the factors driving it are
different. A number of algorithms are already used on stock market data for price
prediction. However, the parameters affecting Bitcoin are different: unlike the stock
market, its price does not depend on business events or government intervention. Predicting
the value of Bitcoin is necessary so that correct investment decisions can be made. We
therefore feel it is necessary to leverage machine learning to predict the price of Bitcoin.

Objective:

Following the recent rise in popularity of Bitcoin, many researchers have tried to implement
prediction models. Building a prediction model for a machine learning problem is a difficult
task, as there is no right or wrong answer: the best fit must be found through extensive
empirical testing for each specific use case.

Scope:
The goal of this undergraduate project is to show how a trained machine learning model can
predict the price of a cryptocurrency when given enough data and computational power. It
displays a graph of the predicted values. Machine learning is the kind of technological
solution that could help mankind predict future events. With vast amounts of data being
generated and recorded on a daily basis, we are approaching an era in which predictions can
be accurate and based on concrete, factual data.

Problem:

Prediction of mature financial markets such as the stock market has been researched at
length. Bitcoin presents an interesting parallel to this as it is a time series prediction
problem in a market still in its transient stage. Traditional time series prediction methods
such as Holt-Winters exponential smoothing models rely on linear assumptions and require
data that can be broken down into trend, seasonal and noise components to be effective. This type of
methodology is more suitable for a task such as forecasting sales where seasonal effects are
present. Due to the lack of seasonality in the Bitcoin market and its high volatility, these
methods are not very effective for this task.
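To make the idea of breaking a series into trend, seasonal and noise parts concrete, here is a minimal sketch. It is a plain additive decomposition with a straight-line trend, not Holt-Winters itself, and the weekly "sales" series, the function name `decompose_additive` and the period of 7 are illustrative assumptions rather than anything taken from the project.

```python
import numpy as np

def decompose_additive(series, period):
    """Split a series into linear trend + repeating seasonal pattern + noise."""
    series = np.asarray(series, dtype=float)
    t = np.arange(len(series))
    # Trend: least-squares straight line (a linear assumption).
    slope, intercept = np.polyfit(t, series, 1)
    trend = slope * t + intercept
    # Seasonal: average detrended value at each position within the period.
    detrended = series - trend
    phase_means = np.array([detrended[p::period].mean() for p in range(period)])
    seasonal = phase_means[t % period]
    # Noise: whatever the trend and seasonal parts do not explain.
    noise = series - trend - seasonal
    return trend, seasonal, noise

# Made-up series with a clear weekly pattern, the kind of data these methods suit.
rng = np.random.default_rng(0)
t = np.arange(28)
weekly = np.array([3.0, 1.0, 0.0, -1.0, -2.0, -1.0, 0.0])
sales = 50 + 0.5 * t + weekly[t % 7] + rng.normal(0, 0.1, size=28)

trend, seasonal, noise = decompose_additive(sales, period=7)
```

On a series with no stable repeating pattern, the seasonal part explains little and most of the movement ends up in the noise term, which is the situation described above for Bitcoin.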

Existing System

• Bitcoin presents an interesting case, as it is a time series prediction problem in a market
still in its transient stage.
• Traditional time series prediction methods such as Holt-Winters exponential
smoothing models rely on linear assumptions and require data that can be broken
down into trend, seasonal and noise components to be effective.
• The prediction of Bitcoin price can be considered analogous to other financial
time series prediction tasks such as forex and stock prediction. Several bodies of
research have implemented the Multilayer Perceptron (MLP) for stock price
prediction. One study reported over three times faster training and testing of its ANN
model when implemented on a GPU rather than a CPU.

Disadvantages of Existing System:

• As the number of bitcoins in circulation approaches the limit, bitcoins
become increasingly harder to mine.
• One of the problems that analysts and researchers faced was implementing a system
capable of accurately predicting the prices.

Solution:

We attempt to predict the Bitcoin price accurately, taking into consideration various
parameters that affect the Bitcoin value. For the first phase of our investigation, we aim to
understand and identify daily trends in the Bitcoin market while gaining insight into optimal
features surrounding Bitcoin price. Our data set consists of various features relating to the
Bitcoin price and payment network over the course of five years, recorded daily. For the
second phase of our investigation, using the available information, we will predict the sign
of the daily price change with the highest possible accuracy.

Proposed System:

• The main goal of my project is to investigate with what accuracy the price of Bitcoin
can be predicted using machine learning. To illustrate more traditional
approaches to financial forecasting, an ARIMA time series model is developed for
performance comparison with the neural network models.
• The closing price of Bitcoin in USD from the CoinDesk Bitcoin Price Index is treated
as the target variable for this study. I have taken the average price from five major
Bitcoin exchanges: Bitstamp, Bitfinex, Coinbase, OkCoin and itBit.
• To evaluate the performance of the models, we use the root mean squared error (RMSE)
of the closing price and further encode the predicted price as a categorical variable
reflecting price up, down or no change.
• This encoding allows additional performance metrics that would be useful to a trader in
the design of a trading strategy: classification accuracy, specificity, sensitivity and
precision. The other variables for this study are taken from CoinDesk and
Blockchain.info. The closing price, the opening price and the daily high and low are
included, as well as Blockchain data (hash rate).
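The evaluation described above can be sketched as follows. This is only an illustration under stated assumptions: the prices are made up, the helper names (`rmse`, `direction`, `direction_metrics`) are hypothetical, accuracy is computed over all three classes, and sensitivity, specificity and precision treat "price up" as the positive class, a choice the report does not specify.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted prices."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def direction(previous, current):
    """Encode a price move as +1 (up), -1 (down) or 0 (no change)."""
    diff = np.asarray(current, dtype=float) - np.asarray(previous, dtype=float)
    return np.sign(diff).astype(int)

def direction_metrics(true_dir, pred_dir):
    """Three-class accuracy, plus up-versus-rest sensitivity, specificity, precision."""
    true_dir, pred_dir = np.asarray(true_dir), np.asarray(pred_dir)
    true_up, pred_up = true_dir == 1, pred_dir == 1
    tp = int(np.sum(true_up & pred_up))
    tn = int(np.sum(~true_up & ~pred_up))
    fp = int(np.sum(~true_up & pred_up))
    fn = int(np.sum(true_up & ~pred_up))

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": float(np.mean(true_dir == pred_dir)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }

# Made-up closing prices and next-day forecasts (one forecast per day after the first).
closes = np.array([100.0, 102.0, 101.0, 101.0, 105.0])
predicted = np.array([101.0, 103.0, 100.0, 104.0])

error = rmse(closes[1:], predicted)
true_dir = direction(closes[:-1], closes[1:])
pred_dir = direction(closes[:-1], predicted)
scores = direction_metrics(true_dir, pred_dir)
```

Both directions are measured against the previous day's actual close, so a forecast counts as "up" when it lies above the last known price.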
Advantages of Proposed System:

• Models possess great potential to turn opportunity into revenue.
• High performance and accuracy.

SYSTEM REQUIREMENTS

HARDWARE REQUIREMENTS:

• Processor : Intel i3 and above
• RAM : 4 GB and higher
• Hard Disk : 500 GB minimum

SOFTWARE REQUIREMENTS:

• Programming Language / Platform : Python
• IDE : Jupyter

Libraries Used:

• Pandas (used for data extraction, preparation and filtering)
• Matplotlib (used for data visualization)
• Numpy (used for matrix processing)
• Sklearn (provides supervised and unsupervised machine learning algorithms)
• Keras (used to build and design neural networks)

Modules:

Data Collection. With my game-plan ready, it was time to start the actual work. I went
on bitcoin.com and downloaded all of the data they had available, ending up with 37
different Excel files. Tough luck. There was a way to feed each file separately into the neural
network, but it was way easier to manually combine the files into one big file that has 37
columns instead of just one column per file. So I did this and got a huge file that was 37
columns by around 2667 rows (each row is a day, each column is a feature of Bitcoin for
that day). Unfortunately, it’s not that easy to do machine learning — I also had to do some
data preprocessing to make sure my data was fed into the neural network in the best way.
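The manual merge could also be done in code. Below is a minimal sketch with pandas, assuming each file holds a date column and a single value column; the two small in-memory tables and their names (`close_price`, `hash_rate`, `date`, `value`) are hypothetical stand-ins for the 37 downloaded files, which in practice would be loaded with something like `pd.read_excel`. Dropping days with missing values is also an assumption.

```python
import pandas as pd

# Hypothetical stand-ins for the per-feature files (one feature per file).
dates = pd.date_range("2017-01-01", periods=4, freq="D")
per_feature = {
    "close_price": pd.DataFrame({"date": dates, "value": [1000.0, 1010.0, 1005.0, 1020.0]}),
    "hash_rate": pd.DataFrame({"date": dates[:3], "value": [2.1, 2.2, 2.3]}),
}

# Turn each file into one named column indexed by day.
columns = [
    frame.set_index("date")["value"].rename(name)
    for name, frame in per_feature.items()
]

# Align the columns on the date index: each row is a day, each column a feature.
combined = pd.concat(columns, axis=1).sort_index()

# Assumption: keep only the days on which every feature is available.
combined = combined.dropna()
```

Aligning on the date index rather than on row position guards against files that start on different days or skip a day.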

Data Preprocessing. Okay buckle up because data preprocessing has some pretty
technical steps to it. The first thing I did was to apply a sliding window transformation to the
data. What's that? Basically, I slid an imaginary window over the big Excel file to make it
into arrays of 50 days by 37 features. So imagine changing a 2D rectangle into a 3D
rectangular prism. It’s kind of like that. The next thing I did was some normalization on the
data. Since the range of values for each feature varied so much, it was in my best interest
to normalize the numbers for each feature so that each separate data point would
contribute about the same to the overall training of the neural network. Sounds like a lot of
work! Hold your horses — I still had to split the data into training, validation, and testing
sets. This step is pretty easy though; I basically took the most recent 10 percent of the data
as a test set and took the other 90 percent as training data (5 percent of that 90 percent
was split off into a validation set). With the data preprocessing done I could finally get
started making a cool neural network!
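Here is a rough sketch of those three steps with NumPy, on a small random table instead of the real 2667 x 37 data. Several details are assumptions because the text does not pin them down: min-max scaling per feature as the normalization, the first column as the price to predict, the target being the day right after each window, and the validation set being the most recent slice of the training portion. All function names are illustrative.

```python
import numpy as np

def min_max_scale(table):
    """Scale every feature (column) to the range [0, 1]."""
    table = np.asarray(table, dtype=float)
    low = table.min(axis=0)
    span = table.max(axis=0) - low
    span[span == 0] = 1.0  # avoid dividing by zero for constant features
    return (table - low) / span

def make_windows(table, window, target_col=0):
    """Turn a (days, features) table into (samples, window, features) inputs
    and the next-day value of `target_col` as the target for each window."""
    n_samples = len(table) - window
    x = np.stack([table[i:i + window] for i in range(n_samples)])
    y = table[window:, target_col]
    return x, y

def split_chronologically(samples, test_frac=0.10, val_frac=0.05):
    """Most recent 10% as test data; 5% of the remainder as validation data."""
    n_test = int(round(len(samples) * test_frac))
    n_rest = len(samples) - n_test
    n_val = int(round(n_rest * val_frac))
    train = samples[: n_rest - n_val]
    val = samples[n_rest - n_val : n_rest]
    test = samples[n_rest:]
    return train, val, test

# Small random stand-in: 120 days x 3 features.
rng = np.random.default_rng(0)
table = rng.uniform(1.0, 100.0, size=(120, 3))

scaled = min_max_scale(table)
x, y = make_windows(scaled, window=50)
x_train, x_val, x_test = split_chronologically(x)
y_train, y_val, y_test = split_chronologically(y)
```

One design note: scaling the whole table before splitting uses statistics from the test period as well. Computing the minimum and maximum from the training rows only would avoid that.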

Deep Learning Model. As I said earlier, I focused on a Long Short-Term Memory
Recurrent Neural Network to allow the neural network to identify small patterns in the
sequenced data and predict the next-day price based on that data. I also decided to add in
some dropout layers from this paper to make sure my model wasn't fitting too much to the
training data (even though that sounds like it would be awesome, it actually makes the
model less accurate overall). I used Keras, a neural network library, in Python 2.7 to create
my model, with a TensorFlow backend. The layers are as follows:

• Input layer (takes data of shape n samples x 50 x 37)
• Bidirectional LSTM layer (returns a sequence, 100 cells)
• Dropout layer (20% dropout, reduces overfitting)
• Bidirectional LSTM layer (returns a sequence, 100 cells)
• Dropout layer (20% dropout, reduces overfitting)
• Bidirectional LSTM layer (doesn't return a sequence, 50 cells)
• Output layer (returns the predicted next day price of Bitcoin)
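The layer list above might look roughly like this in code. This is a sketch written against the current `tensorflow.keras` API rather than the Python 2.7 setup mentioned in the text, and it assumes that "100 cells" and "50 cells" mean the number of LSTM units per direction and that the output layer is a single dense unit. The function name `build_model` is illustrative.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(window=50, n_features=37):
    """Stacked bidirectional LSTM that maps a window of days to one price."""
    model = keras.Sequential([
        keras.Input(shape=(window, n_features)),
        layers.Bidirectional(layers.LSTM(100, return_sequences=True)),
        layers.Dropout(0.2),  # 20% dropout to reduce overfitting
        layers.Bidirectional(layers.LSTM(100, return_sequences=True)),
        layers.Dropout(0.2),
        # The last recurrent layer returns only its final state, not a sequence.
        layers.Bidirectional(layers.LSTM(50, return_sequences=False)),
        # Single linear unit: the predicted next-day price.
        layers.Dense(1, activation="linear"),
    ])
    # Mean squared error loss and the Adam optimizer.
    model.compile(loss="mean_squared_error", optimizer="adam")
    return model

model = build_model()
```

Because each LSTM is wrapped in `Bidirectional`, its output width is twice the number of units (forward and backward passes concatenated), so the final recurrent layer hands 100 values to the output layer.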

Training. I trained the model for 100 epochs (iterations over the dataset), and after this
the model converged pretty well. Essentially, I stopped training the model when its training
accuracy began to stabilize and it wasn’t getting any better. Of course, there are some other
hyper-parameters that I used for my model — loss function (mean squared error), batch
size (1024), activation function (linear), optimizer (Adam).
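The "stop when it isn't getting any better" rule can be written down as a simple plateau check. The sketch below is an assumption about how such a rule could look, not the procedure actually used: it watches the training loss rather than an accuracy figure, and the names `has_plateaued` and `epochs_until_plateau` as well as the `patience` and `min_delta` values are illustrative. Keras offers a comparable ready-made mechanism in its `EarlyStopping` callback, which also takes `patience` and `min_delta` arguments.

```python
def has_plateaued(losses, patience=5, min_delta=1e-4):
    """True when the last `patience` epochs did not beat the best earlier
    loss by at least `min_delta`."""
    if len(losses) <= patience:
        return False
    best_before = min(losses[:-patience])
    best_recent = min(losses[-patience:])
    return best_recent > best_before - min_delta

def epochs_until_plateau(losses, patience=5, min_delta=1e-4):
    """Number of epochs after which training would have been stopped."""
    for epoch in range(1, len(losses) + 1):
        if has_plateaued(losses[:epoch], patience, min_delta):
            return epoch
    return len(losses)

# Made-up loss curve: quick early progress, then no further improvement.
loss_history = [1.0, 0.5, 0.3, 0.2, 0.2, 0.2, 0.2]
stop_epoch = epochs_until_plateau(loss_history, patience=3)
```

A larger `patience` tolerates longer flat stretches before stopping, at the cost of extra epochs.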
