BRCreport
SUBMITTED BY:
ANKIT RANJAN (0103AL211024)
SUBMITTED TO:
PROF. ADITYA PATEL
CERTIFICATE
This is to certify that the work embodied in this Minor Project entitled
“Book Recommendation System” has been satisfactorily completed by
Ankit Ranjan [0103AL211024]. It is a bona fide
piece of work, carried out under the guidance of the Department of
Computer Science & Engineering (AI & ML), Lakshmi Narain College of
Technology, Bhopal, in partial fulfillment of the Bachelor of
Engineering during the academic year 2022-23.
ACKNOWLEDGEMENT
ABSTRACT
read more. With the assistance of datasets and machine learning, we believe we
can choose the right book for someone based on their interests and on the data
from many other readers. We therefore use a collaborative filtering method.
PREFACE
In this project, I have created a book recommender system using a Kaggle
dataset (Book Recommendation Dataset | Kaggle). The dataset contains three
CSV files: Books.csv, Users.csv and Ratings.csv.
I used machine learning to build the model that recommends books, along with
libraries such as NumPy, pandas, seaborn and so on. The project uses the
collaborative filtering method, which is discussed in depth later.
I also created a web page using HTML, CSS and Bootstrap, whose source code is
shared later, and served it with Flask (Flask is a web development framework
with a built-in development server and a debugger).
Finally, I published this project on GitHub. Along the way I learned a lot
about how GitHub works, how a website is made and, most importantly, how an ML
model is deployed in a website, which made this an interesting project to build.
GitHub:
Viratankit9/Book_recom.github.io
Contents
Certificate
Acknowledgement
Abstract
Preface
1: Introduction
3: Implementation
4: ML Model
5: Web Page
End Note
1. Introduction:
During the last few decades, with the rise of YouTube, Amazon, Netflix, and many
other such web services, recommender systems have taken an ever larger place in
our lives. From e-commerce (suggesting to buyers articles that could interest them)
to online advertisement (suggesting to users the right content, matching their
preferences), recommender systems are today unavoidable in our daily online
journeys.
1. Content-Based Filtering
The algorithm recommends products that are similar to those the user has already
watched or used. In simple words, this algorithm tries to find look-alike items.
For example, a person who likes to watch Sachin Tendulkar shots may also like
watching Ricky Ponting shots, because the two videos have similar tags and
similar categories.
It only looks at the similarity between pieces of content and does not focus on the
person who is watching. It simply recommends the product with the highest
similarity score based on past preferences.
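The look-alike idea above can be sketched in a few lines of Python. This is only an illustration, not the code used in this project: the items, their tags, and the choice of Jaccard overlap as the similarity score are all assumptions made for the example.

```python
def jaccard(a: set, b: set) -> float:
    """Share of tags two items have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical items and tags, invented for illustration only.
item_tags = {
    "Sachin Tendulkar shots": {"cricket", "batting", "highlights"},
    "Ricky Ponting shots": {"cricket", "batting", "highlights", "australia"},
    "Pasta recipe": {"cooking", "food"},
}


def most_similar(liked: str, items: dict) -> list:
    """Rank every other item by tag overlap with the liked item."""
    scores = [
        (name, jaccard(items[liked], tags))
        for name, tags in items.items()
        if name != liked
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("Sachin Tendulkar shots", item_tags)
```

Here the Ricky Ponting video ranks first because it shares three of four tags with the liked video, while the unrelated item scores zero. Nothing about the viewer is used, only the content.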
2. Collaborative-based Filtering
Collaborative filtering recommender systems are based on past interactions between
users and target items. In simple words, we search for look-alike customers and
offer products based on what a user's look-alike has chosen. Let us understand this
with an example. X and Y are two similar users. User X has watched movies A, B, and
C, and user Y has watched movies B, C, and D. We will then recommend movie A to
user Y and movie D to user X.
Some organizations use this method, such as Facebook, which shows news that is
relevant to you and to others in your network; LinkedIn uses the same approach.
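The X and Y example can be written out directly with Python sets. This is a minimal sketch of the idea only; the function name and the assumption that the look-alike user is already known are illustrative.

```python
# Watch history from the example above: X saw A, B, C and Y saw B, C, D.
watched = {
    "X": {"A", "B", "C"},
    "Y": {"B", "C", "D"},
}


def recommend_from_lookalike(user: str, lookalike: str, history: dict) -> list:
    """Suggest what the look-alike has watched that the user has not."""
    return sorted(history[lookalike] - history[user])


rec_for_x = recommend_from_lookalike("X", "Y", watched)
rec_for_y = recommend_from_lookalike("Y", "X", watched)
```

In a real system the look-alike is not given in advance; it has to be found by comparing the histories or ratings of many users.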
3. Popularity-Based Filtering
As the name suggests, a popularity-based recommendation system works with the trend.
It basically uses the items that are trending right now. For example, if a product
is usually bought by every new user, then there is a good chance that it will be
suggested to a user who has just signed up.
The popularity-based recommender system has its own problems, most notably that it
is not personalized because every user sees the same items, but it also solves one:
it can make suggestions for new users who have no history yet.
So, I hope you now have a good idea of the popularity-based recommendation
system.
Python language:
Machine Learning:
In the real world, we are surrounded by humans who can learn from their
experiences, and we have computers or machines which work on our instructions.
But can a machine also learn from experiences or past data like a human does?
This is where Machine Learning comes in.
With the help of sample historical data, known as training data, machine
learning algorithms build a mathematical model that helps in making predictions
or decisions without being explicitly programmed. Machine learning brings
computer science and statistics together to create predictive models. It
constructs or uses algorithms that learn from historical data. In general, the
more data we provide, the better the performance tends to be.
HTML:
HTML is an acronym for HyperText Markup Language, which is used for creating
web pages and web applications. Let's see what is meant by Hypertext and Web
page.
Hyper Text: Hypertext simply means "Text within Text." A text that has a link
within it is a hypertext. Whenever you click on a link that brings you to a new
webpage, you have clicked on a hypertext. Hypertext is a way to link two or more
web pages (HTML documents) with each other.
Web Page: A web page is a document which is commonly written in HTML and
rendered by a web browser. A web page can be identified by entering a URL. A
web page can be of the static or dynamic type. With HTML alone, we can create
only static web pages.
CSS:
CSS stands for Cascading Style Sheets. It is a style sheet language which is used to
describe the look and formatting of a document written in markup language. It
provides an additional feature to HTML. It is generally used with HTML to change
the style of web pages and user interfaces. It can also be used with any kind of
XML documents including plain XML, SVG and XUL.
CSS is used along with HTML and JavaScript in most websites to create user
interfaces for web applications and user interfaces for many mobile applications.
BOOTSTRAP:
Bootstrap is a free and open-source tool collection for creating responsive
websites and web applications. It is one of the most popular HTML, CSS, and
JavaScript frameworks for developing responsive, mobile-first websites. With
Bootstrap, websites can be made to work well in all major browsers (IE, Firefox,
and Chrome) and on all sizes of screens (Desktop, Tablets, Phablets, and Phones).
Bootstrap was developed by Mark Otto and Jacob Thornton at Twitter and was
later released as an open-source project.
FLASK:
Flask is a web framework that provides libraries to build lightweight web
applications in Python. It was developed by Armin Ronacher, who leads an
international group of Python enthusiasts (Pocoo). It is based on the Werkzeug
WSGI toolkit and the Jinja2 template engine. Flask is considered a micro framework.
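To make the description concrete, here is a minimal sketch of a Flask application that renders a Jinja2 template. It is not the application built in this project: the route, the inline template, and the `TOP_BOOKS` list are hypothetical placeholders, and Flask is assumed to be installed.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Hypothetical data; a real application would load its books from the model.
TOP_BOOKS = [
    {"title": "Example Book One", "author": "Author A"},
    {"title": "Example Book Two", "author": "Author B"},
]

# Inline Jinja2 template, kept in a string so the sketch is self-contained.
PAGE = """
<h1>Top Books</h1>
<ul>
{% for book in books %}
  <li>{{ book["title"] }} by {{ book["author"] }}</li>
{% endfor %}
</ul>
"""


@app.route("/")
def index():
    """Render the list of books into HTML with the Jinja2 template."""
    return render_template_string(PAGE, books=TOP_BOOKS)
```

Saved as, say, `app.py`, this could be started with Flask's built-in development server using `flask --app app run`. In a larger project the template would normally live in its own file under a `templates` folder.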
PyCharm:
PyCharm is one of the most well-known Python IDEs. It offers code completion and
inspection, a comprehensive debugger, and support for web programming and
several frameworks. PyCharm was created by JetBrains, a Czech firm specializing
in building integrated development environments for different web development
languages, including PHP and JavaScript.
Jupyter Notebook:
The Jupyter Notebook is an open source web application that you can use to
create and share documents that contain live code, equations, visualizations, and
text. Jupyter Notebook is maintained by the people at Project Jupyter.
Jupyter Notebooks are a spin-off project from the IPython project, which used to
have an IPython Notebook project itself. The name, Jupyter, comes from the core
programming languages that it supports: Julia, Python, and R. Jupyter ships with
the IPython kernel, which allows you to write your programs in Python, but there
are currently over 100 other kernels that you can also use.
GitHub:
GitHub is a web-based interface that uses Git, the open source version control
software that lets multiple people make separate changes to web pages at the
same time. As Carpenter notes, because it allows for real-time collaboration,
GitHub encourages teams to work together to build and edit their site content.
3. Implementation:
Book Recommendation System
In this section, we will use the collaborative filtering method to build a book
recommender system.
Let’s get our hands dirty by implementing a book recommendation system using
collaborative filtering.
Dataset Description:
• Books.csv
Books are identified by their respective ISBN. Invalid ISBNs have already been
removed from the dataset. Moreover, some content-based information is given
(Book-Title, Book-Author, Year-Of-Publication, Publisher), obtained from Amazon
Web Services. Note that in case of several authors, only the first is provided. URLs
linking to cover images are also given, appearing in three different flavours
(Image-URL-S, Image-URL-M, Image-URL-L), i.e., small, medium, large. These URLs
point to the Amazon web site.
• Ratings.csv
Contains the book rating information. Ratings (Book-Rating) are either explicit,
expressed on a scale from 1-10 (higher values denoting higher appreciation), or
implicit, expressed by 0.
• Users.csv
Contains the users. Note that user IDs (User-ID) have been anonymized and map
to integers. Demographic data is provided (Location, Age) if available. Otherwise,
these fields contain NULL values.
Load Data
Let us start by importing the libraries and loading the datasets. While loading
the files we run into a few problems:
• The values in the CSV file are separated by semicolons, not by commas.
• Some lines are malformed and cannot be parsed by pandas, which raises an
error when it reaches them.
• The file is encoded in Latin (Latin-1).
So while loading the data we have to handle these cases. After running the
loading code you will get some warnings showing which lines had an error and
were skipped while loading.
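The three problems listed above map onto three arguments of `pandas.read_csv`. The sketch below is an assumption about how the loading could be done, not the original code of this project. It reads a tiny in-memory sample instead of the real Books.csv, and it uses `on_bad_lines="skip"`, which requires pandas 1.3 or newer.

```python
import io

import pandas as pd

# Tiny stand-in for Books.csv, invented for illustration: semicolon separated,
# Latin-1 encoded, with one malformed row that has too many fields.
raw = (
    "ISBN;Book-Title;Book-Author\n"
    "111;Les Misérables;Victor Hugo\n"
    "222;Broken;Row;With;Too;Many;Fields\n"
    "333;Emma;Jane Austen\n"
).encode("latin-1")

books = pd.read_csv(
    io.BytesIO(raw),       # with the real data this would be the file path
    sep=";",               # values are separated by semicolons
    encoding="latin-1",    # the file is not UTF-8
    on_bad_lines="skip",   # drop rows that cannot be parsed
)
```

Passing `on_bad_lines="warn"` instead skips the same rows but also reports each one, which matches the warnings mentioned above.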
Preprocessing Data:
Now, the books file has some extra columns which are not required for our task,
such as the image URLs. We will also rename the columns of each file, because
the column names contain spaces and uppercase letters, to make them easier to
use.
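A possible way to do this preprocessing with pandas is sketched below. The column names of the input come from the dataset description; the sample row and the renaming rule (lowercase, with hyphens and spaces replaced by underscores) are assumptions chosen for the example.

```python
import pandas as pd

# One invented row with the column layout described for Books.csv.
books = pd.DataFrame(
    {
        "ISBN": ["111"],
        "Book-Title": ["Emma"],
        "Book-Author": ["Jane Austen"],
        "Year-Of-Publication": [1815],
        "Publisher": ["Example Press"],
        "Image-URL-S": ["http://example.com/s.jpg"],
        "Image-URL-M": ["http://example.com/m.jpg"],
        "Image-URL-L": ["http://example.com/l.jpg"],
    }
)

# Drop the image URL columns, which the recommender does not need.
books = books.drop(columns=["Image-URL-S", "Image-URL-M", "Image-URL-L"])


def tidy(name: str) -> str:
    """Lowercase a column name and replace hyphens and spaces with underscores."""
    return name.strip().lower().replace("-", "_").replace(" ", "_")


# rename() accepts a function and applies it to every column name.
books = books.rename(columns=tidy)
```

The same `tidy` function could be applied to the users and ratings frames so that all three files follow one naming scheme.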
The dataset can be considered large: it has data on 271,360 books, approximately
278,000 registered users on the website, and about 11 lakh (1.1 million) ratings.
Hence we can say that the dataset is sizeable and reliable for our task.
4. ML Model:
Popularity Based Method:
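The popularity-based idea described in the Introduction can be sketched with a pandas group-by. This is an illustrative assumption, not the project's original code: the sample ratings, the column names, and the `MIN_RATINGS` threshold are invented for the example.

```python
import pandas as pd

# Invented ratings: one row per (book, rating) pair.
ratings = pd.DataFrame(
    {
        "title": ["A", "A", "A", "B", "B", "C"],
        "rating": [8, 9, 10, 10, 10, 10],
    }
)

# Hypothetical threshold: ignore books that too few people have rated.
MIN_RATINGS = 2

# Count and average the ratings of each book.
stats = ratings.groupby("title")["rating"].agg(
    num_ratings="count", avg_rating="mean"
)

# Keep well-rated books only and sort them by average rating.
popular = stats[stats["num_ratings"] >= MIN_RATINGS].sort_values(
    "avg_rating", ascending=False
)

# The home page shows the top 50; the sample here has fewer books than that.
top_titles = popular.head(50).index.tolist()
```

Book C has a perfect average but only one rating, so the threshold removes it. On the real data, implicit ratings stored as 0 would probably need to be filtered out before averaging.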
The similarity is not restricted to the taste of the user; similarity between
different items can also be considered. The system will give better
recommendations if we have a large volume of information about users and items.
There are various types of collaborative filtering techniques, as shown in the
diagram given below.
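Similarity between items, as mentioned above, is often measured with cosine similarity over the ratings each item received. The sketch below assumes that measure and uses invented ratings; it is not taken from the project's code.

```python
import math

# Invented ratings: each list holds the ratings of four users for one book,
# in the same user order, with 0 meaning "not rated".
ratings_by_book = {
    "Book A": [5, 4, 0, 1],
    "Book B": [4, 5, 0, 1],
    "Book C": [0, 1, 5, 4],
}


def cosine(u: list, v: list) -> float:
    """Cosine of the angle between two rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_books(title: str, table: dict, top_n: int = 2) -> list:
    """Return the top_n books whose rating pattern is closest to the given one."""
    scores = [
        (other, cosine(table[title], vector))
        for other, vector in table.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


neighbours = similar_books("Book A", ratings_by_book)
```

Book B comes out as the closest neighbour of Book A because the same users rated both highly, while Book C was liked by a different group of users.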
import pickle
(The pickle module is used for implementing binary protocols for serializing
and de-serializing a Python object structure.)
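As a small illustration of what serializing and de-serializing means, the sketch below turns a Python object into bytes and back. The `model_data` dictionary is a hypothetical stand-in for whatever the trained model needs to keep.

```python
import pickle

# Hypothetical object standing in for the data the web app needs later.
model_data = {
    "titles": ["Book A", "Book B"],
    "similarity": [[1.0, 0.9], [0.9, 1.0]],
}

# Serialize the object to a bytes string...
blob = pickle.dumps(model_data)

# ...and rebuild an equal object from those bytes.
restored = pickle.loads(blob)
```

To store the object in a file instead, `pickle.dump(obj, file)` and `pickle.load(file)` work the same way on a file opened in binary mode. Pickle files should only be loaded from sources you trust, because loading can execute arbitrary code.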
This is the final data frame, in which 4-5 columns are the most important.
5. Web Page:
Home page
The home page shows a list of the top 50 books:
Recommend page
End Notes
Hurray! We have built a reliable book recommendation system, and you can
further modify it and convert it into an end-to-end project. This is a wonderful
unsupervised learning project in which we have done a lot of preprocessing. You
can explore the dataset more, and if you find something more interesting, please
share it in the comment box.
I hope it was easy to catch up with each method and follow along with the article.
If you have any queries, please post them in the comment section below. I will be
happy to help you with any queries.