Book Recommendation Using Collaborative Filtering IJERTV12IS040195
Abstract: It is becoming very difficult for users to select appropriate books on a specific topic because a lot of choices are available. There is a need for a system that takes user preferences into consideration while searching for and recommending online books. The objective of this research work is therefore to design an application that recommends books based on users' ratings. The proposed system uses a machine learning technique, collaborative filtering [CF]: it first constructs the user-item interaction matrix, then constructs a vector matrix from the user-item interaction matrix using the cosine similarity measure, then finds the similarity between books using the vector matrix, and finally recommends the top n books similar to the book given by the user as input to the algorithm. The results indicate that recommendation performance is better when both average ratings and the cosine vector similarity measure are used, as compared to the existing systems.

Keywords: Collaborative Filtering Technique [CF], Cosine Similarity, Euclidean distance similarity, RMSE

1. INTRODUCTION
A recommendation system is used to suggest items to users based on criteria like past purchases, search history, and other factors. It finds items based on user preference and solves the problem of information overload. It helps the user by providing personalized services and user-based content, and saves a lot of time and money. In 2006 Netflix launched a competition to improve the accuracy of its movie recommender system by 10%; the prize was awarded in 2009. Recommendation systems are very important in increasing the revenue of a company. As per the statistics, 35 percent of what consumers purchase on Amazon and more than half of what they watch on Netflix come from recommendation systems. Usually, suggestions from friends and family are useful for finding books to read or anything new to watch, but even after looking through all the suggestions we may not find something that matches our preferences. Hence there is a need for a system that takes our preferences into consideration before recommending anything. We do not want to spend time searching for books that we prefer, so we create a recommendation tool using collaborative filtering, where users give the name of a book as input and items similar to the input item are suggested. We implemented two methods of recommendation: a popularity-based recommendation using the total number of ratings and the average ratings of users, and a collaborative filtering algorithm applying cosine vector similarity, which measures the normalized dot product of two vectors (book data). Cosine similarity is used to measure similarities in the book dataset and to find books with high similarity to the given book. This recommendation approach can also be used to recommend movies, articles, music, and news.

2. RELATED WORK
Recommender systems have become extremely common and are utilized in a variety of areas: some popular applications include movies, books, research articles, and social tags. There are three basic categories of recommendation algorithms: collaborative filtering, content-based filtering, and hybrid recommendation.
Collaborative filtering methods are based on collecting and analyzing a large amount of information on users’ behaviors, activities or preferences and predicting what users will like based on their similarity to other users. This kind of algorithm does not need professional knowledge, and its recommendations become better as more of the user's interests are captured, but it suffers from data sparsity and other problems. There are two types of collaborative filtering techniques: item-based and user-based collaborative filtering [CF]. Similarity between items or users can be calculated using different similarity metrics such as cosine, Euclidean, and Jaccard.
Content-based filtering methods are based on a description of the item and a profile of the user’s preferences. These algorithms try to recommend items that are similar to those that a user liked in the past. Hybrid recommendation combines collaborative filtering and content-based filtering. These methods can also be used to overcome some of the common problems in recommender systems, such as cold start and the sparsity problem.

3. EXISTING SYSTEM
Okon et al. (2018) [1] proposed a model that generates recommendations to buyers through an enhanced CF algorithm, a quick sort algorithm and Object-Oriented Analysis and Design Methodology (OOADM). Scalability was ensured through the implementation of Firebase SQL. This system performed well on the evaluation metrics.
Kurmashov et al. (2015) [2] used Pearson correlation coefficient-based CF to provide internet-based recommendations to book readers and evaluated the system through an online survey.
Mathew et al. (2016) [3] proposed a system that saves details of books purchased by the user. From these book contents and ratings, a hybrid algorithm using collaborative filtering, content-based filtering and association rule mining generates book recommendations. Rather than Apriori, they recommended the use of Equivalence Class Clustering and bottom-up Lattice Traversal (ECLAT), as this algorithm
is faster because it examines the entire dataset only once.
Parvatikar et al. (2015) [4] proposed item-based collaborative filtering and association rule mining to give recommendations. Similarity between different users was computed through the Adjusted Cosine Vector Similarity function. Better recommendations were obtained because this method removed the data sparsity problem.
Ayub et al. (2018) [5] proposed a Jaccard-like similarity function to locate similar items and users for the queried item and user in nearest-neighbor-based collaborative filtering. They proposed that the absolute value of ratings should be taken, as against the ratio of co-rated items taken in Jaccard Similarity. They also compared the performance of their method with other similarity measures.
Gogna and Majumdar [6] suggested the use of the buyer’s demographic and the item category to overcome data sparsity and cold start problems in their movie recommendation system. A Latent Factor Model (LFM) was used. They developed a matrix to match the buyer and user information to get a dense user matrix and a dense item matrix. The Label Consistency map, the outcome of this system, was used to suggest unrated and other items to new buyers.
Chatti et al. (2013) [7] suggested tag-based and rating-based CF recommendation in technology enhanced learning (TEL) to resolve the data sparsity problem and extract relevant information from the rating database. Sixteen different memory-based and model-based tag-based collaborative filtering algorithms were evaluated for buyer satisfaction and accuracy of recommendations in Personal Learning Environments.
Choi et al. (2010) [8] proposed a recommender system (RS) based on HYRED, a hybrid algorithm using both content and collaborative filtering on a compact dataset (obtained by reducing user interest items) and neighbor data. HYRED used altered Pearson coefficient-based collaborative filtering and distance-to-boundary (DTB) content filtering. This results in better and faster recommendations for large amounts of data.
Liu et al. (2012) [9] added the dimension of user interest. They proposed iExpand, a 3-tier model consisting of user, user interest and item. This helped in overcoming the issues of overspecialization and cold start as well as reducing computation costs.
Feng et al. (2018) [10] proposed an RS for movies based on a similarity model constituted of the factors S1 (similarity between users), S2 (ratio of co-rated items) and S3 (user’s rating choice weight). This RS was particularly useful for sparse datasets.
According to Aggarwal, C.C. [11], the collaborative filtering method is used in recommendation systems to develop recommendations based on ratings provided by the other users of the system. If buyers' ratings of some items match, it is likely that their ratings of other items will also match; this is the basic assumption of CF. Computers cannot gauge qualitative factors such as taste or quality; therefore recommendations based on the ratings of humans, who can rate on the basis of qualitative factors (i.e., collaboration), will give a better outcome.
Gattu Vijaya Kumar, Prasanta Kumar Sahoo and K. Eswaran [12] explain that the main reason we need a recommendation system in the current generation is that people have extremely many alternatives to choose from among the information available on the Internet. Their work implements five classification algorithms: Support-Vector Machines, Logistic Regression, Multinomial Naive Bayes, Multilayer Perceptron and K-Nearest Neighbors. A comparison of all the algorithms showed that the Support-Vector Machine gives 75.13% accuracy.
Prasanta Kumar Sahoo, Kodaty Sri Sai Chaitany and N. Archana [13] observed that most e-commerce retailers struggle to display targeted and relevant results for a search keyword given by customers. In this scenario, analyzing the past interests and recent behavior of customers is one of the important aspects of ensuring user-relevant search results. In their work, online customer behavior was analyzed using a Homophily Detection Algorithm. They implemented behavior analysis, trend analysis and personalization on customer data to display relevant search results when a customer searches for a particular product, which leads to greater customer satisfaction.

The literature survey suggests that recommendation systems are being used by a large number of online marketers to increase their sales by offering products to customers which match their tastes.

4. PROPOSED SYSTEM
This research work mainly focuses on the popularity-based method and the collaborative filtering technique for the recommendation system. The collaborative filtering [CF] technique primarily focuses on the relationship between users and items. It is a technique that can pick out items that a user might like on the basis of ratings given by other users, and it recommends the top-n similar items to the user. The similarity of two items is determined by the similarity of the ratings of those items by the users who have rated both items. Item-based collaborative filtering is being used by Amazon for its customers. In an existing system where there are more users than items, item-based filtering is faster and more stable than user-based filtering. This algorithm uses a similarity measure to find the similarity between items.

5. METHODOLOGY
1. Import all necessary libraries such as pandas, numpy, seaborn, etc.
2. Data preprocessing: in this step the dataset is checked for missing and null values, which are then removed or filled.
3. The recommendation system is designed using the two methods given below:
a. Popularity based:
To implement popularity-based recommendation, the data set is built by merging two data frames: one holds the number of ratings for each book and the other holds the average rating. Popular books are those that have received more than 15 ratings; they are selected as per this criterion and the top 5000 results are printed.
b. Collaborative filtering:
To implement recommendation based on collaborative filtering, cosine similarity is used to find the similarity of user preferences, and books are recommended based on the cosine similarity between books.
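The popularity-based step (3a) can be sketched as follows. This is a minimal, dependency-free illustration, not the authors' code: the function name `popular_books`, the tuple layout of the input, the toy data, and the choice to rank the surviving books by average rating are all assumptions made here. Only the "more than 15 ratings" threshold and the top-5000 cut-off come from the text.

```python
from collections import defaultdict

def popular_books(ratings, min_ratings=15, top_n=5000):
    """Return (title, num_ratings, avg_rating) for books with more than
    min_ratings ratings, best average first.

    ratings: iterable of (user_id, book_title, rating) tuples.
    Rows with a missing title or rating are skipped (preprocessing step).
    """
    totals = defaultdict(lambda: [0, 0.0])  # title -> [num_ratings, rating_sum]
    for _user, title, rating in ratings:
        if title is None or rating is None:
            continue
        totals[title][0] += 1
        totals[title][1] += rating

    # "Merge" of the two tables: number of ratings and average rating per book.
    table = [
        (title, count, total / count)
        for title, (count, total) in totals.items()
        if count > min_ratings
    ]
    # Assumption: rank by average rating, then by number of ratings, then title.
    table.sort(key=lambda row: (-row[2], -row[1], row[0]))
    return table[:top_n]

# Hypothetical toy data: Book B is rated highly but by only 3 users.
sample = (
    [(u, "Book A", 8) for u in range(20)]
    + [(u, "Book B", 10) for u in range(3)]
    + [(u, "Book C", 9) for u in range(16)]
    + [(99, None, 7)]
)
print(popular_books(sample))
```

On the toy data, Book B is filtered out despite its high average because it has too few ratings, which is the purpose of the threshold.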
6. OVERVIEW OF DATASET
6.1 Importing Dataset
Datasets are a critical component of AI development because they provide the data used to train and test machine learning models. In this project the datasets Books.csv and Ratings.csv were collected from Kaggle for implementation purposes.
6.2 Merging Datasets
The data set is formed by merging Books.csv and Ratings.csv into Bookrec.csv. The merged data set contains seven attributes, and unnecessary attributes are removed in the process. The attributes of Bookrec.csv are: User-ID, ISBN, Book-Rating, Book-Title, Book-Author, Year-Of-Publication.
6.4 Data visualization
import matplotlib.pyplot as plt
import seaborn as sns

# df is assumed to be the merged Bookrec.csv data loaded as a pandas DataFrame
fig = plt.figure(figsize=(10, 5))
sns.countplot(x=df['Book-Rating'])  # one bar per rating value
plt.xlabel("Book-Rating")
plt.ylabel("No. of books")
plt.title("BOOK RATING DISTRIBUTION")
plt.show()
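The merge described in Section 6.2 can be sketched without third-party libraries as an inner join on ISBN. This is an illustration under assumptions: the helper name `merge_on_isbn`, the in-memory toy CSV contents, and the extra `Publisher` column standing in for an "unnecessary attribute" are hypothetical; only the attribute names listed in Section 6.2 come from the text.

```python
import csv
import io

# Columns kept in the merged data set, as listed in Section 6.2.
KEEP = ["User-ID", "ISBN", "Book-Rating", "Book-Title",
        "Book-Author", "Year-Of-Publication"]

def merge_on_isbn(books_file, ratings_file, keep=KEEP):
    """Inner-join ratings with book details on ISBN and keep selected columns."""
    books = {row["ISBN"]: row for row in csv.DictReader(books_file)}
    merged = []
    for rating in csv.DictReader(ratings_file):
        book = books.get(rating["ISBN"])
        if book is None:  # rating refers to a book with no details: drop it
            continue
        row = {**book, **rating}
        merged.append({col: row[col] for col in keep})
    return merged

# Hypothetical toy contents standing in for Books.csv and Ratings.csv.
BOOKS_CSV = (
    "ISBN,Book-Title,Book-Author,Year-Of-Publication,Publisher\n"
    "111,Book A,Author A,2001,Pub X\n"
    "222,Book B,Author B,1999,Pub Y\n"
)
RATINGS_CSV = (
    "User-ID,ISBN,Book-Rating\n"
    "1,111,8\n"
    "2,222,5\n"
    "3,333,9\n"
)

merged = merge_on_isbn(io.StringIO(BOOKS_CSV), io.StringIO(RATINGS_CSV))
for row in merged:
    print(row)
```

With pandas, the same join would typically be written with `DataFrame.merge(..., on="ISBN")` followed by dropping the unneeded columns.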
Case study 1
The cosine similarity of two vectors A and B is
cos(A, B) = (A · B) / (||A|| · ||B||) = Σ Ai·Bi / (√(Σ Ai²) · √(Σ Bi²))
Here, A = first vector
B = second vector
Ai = rating of book A by user i
Bi = rating of book B by user i
Example:
B1 = Harry Potter: Goblet of Fire
B2 = Harry Potter: Deathly Hallows
After creating a word table from the books, the books can be represented by the following vectors:
B1 = {1,1,1,1,1,0,0}
B2 = {1,1,0,0,0,1,1}
Using these two vectors we can calculate the cosine similarity. First, we calculate the dot product of the vectors:
B1 · B2 = 1·1 + 1·1 + 1·0 + 1·0 + 1·0 + 0·1 + 0·1 = 2
Magnitude of vectors:
||B1|| = √5
||B2|| = √4
Cosine similarity (B1, B2) = 2 / (√5 · √4) = 0.44721

[Figure: Data similarity matrix]
6.7 Cosine similarity importing and implementation
• Importing cosine similarity
• Recommending books using similarity scores
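The worked example and the recommendation step can be checked with a short, dependency-free sketch. The paper's original code for this section did not survive extraction; it presumably imported a library routine (for example scikit-learn's `cosine_similarity`), but that is an assumption. The function names `cosine_similarity` and `recommend`, and the small book-by-user rating matrix below, are hypothetical and used only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a book with no ratings is similar to nothing
    return dot / (norm_a * norm_b)

def recommend(title, item_vectors, top_n=5):
    """Return the top_n (title, score) pairs most similar to the given book."""
    target = item_vectors[title]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in item_vectors.items()
        if other != title
    ]
    scores.sort(key=lambda pair: (-pair[1], pair[0]))  # highest score first
    return scores[:top_n]

# Vectors from Case study 1.
b1 = [1, 1, 1, 1, 1, 0, 0]
b2 = [1, 1, 0, 0, 0, 1, 1]
print(round(cosine_similarity(b1, b2), 5))  # 0.44721

# Hypothetical book-by-user rating matrix (one row per book, 0 = not rated).
item_vectors = {
    "Book A": [5, 4, 0, 0],
    "Book B": [4, 5, 0, 1],
    "Book C": [0, 0, 5, 4],
}
print(recommend("Book A", item_vectors, top_n=2))
```

In the toy matrix, Book B is rated by the same users as Book A and so ranks first, while Book C shares no raters with Book A and gets a score of 0.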
recommendation system based on popularity method and collaborative filtering is much more accurate than the existing systems.
8. REFERENCES
[1] Okon, E.U., Eke, B.O. and Asagba, P.O. (2018). An improved online book recommender system using collaborative filtering algorithm. International Journal of Computer Applications (0975-8887), Volume 179, No. 46, June 2018.
[2] Kurmashov, N., Konstantin, L. and Nussipbekov, A. (2015). Online book recommendation system. Proceedings of Twelfth International Conference on Electronics Computer and Computation (ICECC).
[3] Mathew, P., Kuriakose, B. and Hegde, V. (2016). Book recommendation system through content based and collaborative filtering method. Proceedings of International Conference on Data Mining and Advanced Computing (SAPIENCE).
[4] Parvatikar, S. and Joshi, B. (2015). Online book recommendation system by using collaborative filtering and association mining. Proceedings of IEEE International Conference on Computational Intelligence and Computing Research (ICCIC).
[5] Ayub, M., Ghazanfar, M.A., Maqsood, M. and Saleem, A. (2018). A Jaccard base similarity measure to improve performance of CF based recommendation system. Proceedings of International Conference on Information Networking (ICOIN).
[6] Gogna, A. and Majumdar, A. (2015). A Comprehensive Recommender System Model: Improving Accuracy for Both Warm and Cold Start Users. IEEE Access, Vol. 3, 2803-2813, 2015.
[7] Chatti, M.A., Dakova, S., Thus, H. and Schroeder, U. (2013). Tag-Based Collaborative Filtering Recommendation in Personal Learning Environments. IEEE Transactions on Learning Technologies, Vol. 6, No. 4, October-December 2013.
[8] Choi, S.H., Jeong, Y.S. and Jeong, M.K. (2010). A Hybrid Recommendation Method with Reduced Data for Large-Scale Application. IEEE Transactions on Systems, Man and Cybernetics-Part C: Applications and Reviews, Vol. 40, No. 5, September 2010.
[9] Liu, Q., Chen, E., Xiong, H., Ding, C.H.Q. and Chen, J. (2012). Enhancing Collaborative Filtering by User Interest Expansion via Personalised Ranking. IEEE Transactions on Systems, Man and Cybernetics-Part B: Cybernetics, Vol. 42, No. 1, February 2012.
[10] Feng, J., Fengs, X., Zhang, N. and Peng, J. (2018). An improved collaborative filtering method based on similarity. PLOS ONE 13(9): e0204003, September 2018.
[11] Aggarwal, C.C. (2016). Recommender Systems: The Textbook. XXI, 29 p. ISBN 978-3-319-29657-9.
[12] Gattu Vijaya Kumar, Prasanta Kumar Sahoo and K. Eswaran, “A Recommendation System & Their Performance Metrics using several ML Algorithms”, International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958, Volume 9, Issue 3, February 2020.
[13] Prasanta Kumar Sahoo, Kodaty Sri Sai Chaitany and N. Archana, “Analyzing Customer Behavior in E-Commerce Using Homophily Detection Algorithm”, Advances and Applications in Mathematical Sciences, Volume 20, Issue 12, Pages 3235-3255, 2021.

Dr. Prasanta Kumar Sahoo
Professor in the Department of Computer Science & Engineering, Sreenidhi Institute of Science & Technology, affiliated to JNTUH. He completed his Ph.D. in Computer Science and Engineering at Fakir Mohan University, Odisha. He has 19 years of teaching, research and administrative experience. He has earlier worked as Head of the Department in both CSE and IT departments in various reputed engineering colleges. His research interests include Cyber Security, Information Security, Data Science and Machine Learning. He has published around 60 research papers in various reputed journals at both national and international level. Dr. Prasanta Kumar Sahoo has won the best teacher award many times in various colleges for his contribution to the teaching and learning process. He is a Certified Professional from BalaBit and has completed the Electronic Contextual Security Intelligence exam, Intermediate Level (ECSI). He has guided more than 50 projects at both UG and PG level. He has delivered more than 15 guest lectures. He has organized three national conferences and nine faculty development programs with immense success.

S. Dhanish, Venkat Ramana and A. Nikith Kumar are B. Tech IV year students in Computer Science and Engineering at Sreenidhi Institute of Science & Technology, affiliated to JNTUH.