www.nature.com/scientificreports

Pattern‑based hybrid book recommendation system using semantic relationships
Fikadu Wayesa, Mesfin Leranso*, Girma Asefa & Abduljebar Kedir

Recommendation systems (RS), or recommendation engines, are widely used in machine learning and artificial intelligence. Recommendation systems based on user preferences help consumers make good decisions without depleting their cognitive resources. They are applied in many domains, including search engines, travel, music, movies, literature, news, gadgets, and dining. Many people use RS on social media sites such as Facebook, Twitter, and LinkedIn, and RS have proven beneficial in corporate settings such as Amazon, Netflix, Pandora, and Yahoo. Numerous recommender system variants have been proposed. However, some techniques produce unfair recommendations from biased data because no connections have been established between the items and the consumers. To address this challenge for new users, we propose to employ Content-based Filtering (CBF) and Collaborative Filtering (CF) with semantic relationships, capturing the relationships as a knowledge base for recommending books to readers in a digital library. When recommending items, patterns are more discriminative than single phrases. To capture the similarity of the books that a new user has retrieved, the patterns were grouped in a semantically equivalent manner using a clustering method. The effectiveness of the proposed model is examined through extensive experiments using Information Retrieval (IR) evaluation criteria: the three widely used performance metrics Recall, Precision, and F-Measure. The findings demonstrate that the proposed model performs noticeably better than state-of-the-art models.

There is no shortage of content in the modern era, and there are many ways to generate and access different types of data. When so much content is available, people struggle to decide what to access for their needs and interests. The problem is greatest when a person has too many options and must gather enough information to make an informed decision about suitable goods or services. For instance, someone looking for a book to read, with no specific idea of what they want, can waste a lot of time exploring different websites in the hope of finding one. They might instead look for a recommendation from other people [1].
Popular Web applications such as Amazon, Facebook, and Netflix use recommendation to suggest new products and services to their users, since navigating from page to page does not satisfy a user’s needs in a large amount of data. Predicting users’ wishes for items such as books and electronics is very important on e-commerce sites, whether the prediction is based on recent browsing history or on hidden patterns in a limited context. Some systems, like Google AdSense, rely on keywords rather than on an estimate of the user’s taste from her/his recent browsing history [2,3]. Recommender systems provide users with relevant items (a top-k ranked list of the “best” items) based on the profile information gathered about a specific user. Such systems are called personal recommenders. The context may rely on the user’s current activity or her/his long-term interests. Recommendation can also be based on information about the products, such as ratings and item details.
Reviews and ratings of items are collected from users through an explicit or implicit approach [4,5].

• Explicit profiling collects information by asking each visitor to state how much they liked a specific item (page, book, movie, news, CD, hotel, product, or service such as transport) by providing a numerical rating.

Information Technology, School of Computing and Informatics, College of Engineering and Technology, Wachemo
University, Hosaena, Ethiopia. *email: [email protected]

Scientific Reports | (2023) 13:3693 | https://doi.org/10.1038/s41598-023-30987-0

• Implicit profiling gathers information by tracking the visitor’s behavior, such as watching a movie or opening pages, and treating it as interest. This technique is generally transparent to the user: browsing is tracked by recording the identified user’s history and behavior [6]. For example, Amazon logs each customer’s buying history and recommends specific purchases based on that history.

The main approaches to recommendation systems are as follows [2,6].

(1) Collaborative Filtering: This approach builds a model from a user’s past behavior, as well as similar decisions made by other users, to predict items that the user may be interested in.
(2) Content-based Filtering: The characteristics of an item are analyzed to recommend items to the user.
(3) Demographic Recommendation: This technique uses only the users’ information to find correlations between users based on their demographic profiles. Users with similar demographic profiles receive similar recommendations. The demographic technique suffers from a cold start problem for new items, as a new item has not yet been preferred by any user with the same demographic profile.
(4) Knowledge-Based Filtering: Items are recommended to a customer by using knowledge of the item domain. The system collects the customer’s preferences for a specific product and uses its knowledge to find products that match those preferences.
(5) Mobile Recommender System: These approaches make use of internet-connected smartphones to offer personalized, context-sensitive recommendations. They focus on spatial and temporal data [6,7].
(6) Hybrid: A combination of the above or other approaches.

Recommending an item to an existing user with a history is usually not a problem, even if the recommendation does not exactly fit the user’s need. A prediction for a new user who has no demographic input data, however, is problematic. Another cause of poor recommendations is a new item, or an item with no rating value. This paper aims to develop a hybrid book recommendation system that applies pattern (data mining) rules to find correlations between the books to be recommended to a new user who has no previous history. Patterns found by data mining can help new users, as well as new items that have few or no ratings, when searching for books, and give better recommendations. In addition to pattern features, grouping them to identify related items and similar user interests is very important. The main research contributions are summarized as follows:

• We propose to exploit Content-based Filtering (CBF) and Collaborative Filtering (CF) with semantic relationships, capturing the relationships as a knowledge base, to recommend books to readers in a digital library and to address the problems stated above for new users.
• We model the patterns, which are grouped in a semantically equivalent manner using a clustering method, to capture the similarity of the books retrieved for the new user.
• We conduct extensive experiments and use the three popular Information Retrieval (IR) performance metrics, Recall, Precision, and F-Measure, to evaluate the effectiveness of the proposed model.
• The results show that the proposed model significantly outperforms state-of-the-art models.

The rest of this paper is organized as follows. “Related works” reviews the existing research. “Methods” formulates the problem addressed in this paper and proposes an algorithm to solve it. The experimental results are presented in “Experimentation result, discussion and evaluation”. Finally, we conclude this paper in “Discussion and conclusion”.

Related works
Researchers have been working to boost the performance of recommender systems by integrating more than one technique, and various hybrid approaches have shown good results. In [8,9], the authors suggested a technique, called the product-based clustering hybrid approach, that introduces the contents of products into a product-based collaborative filtering system to improve the performance of the prediction algorithm. In this approach, they first applied a clustering algorithm to group the products into sets and to provide content-based information for determining similarities. Each product has its own attributes; a movie, for example, may have an actor, an actress, and a director. The items were grouped based on those attributes.
In [10,11], a hybrid recommender system that integrates collaborative and content-based approaches was adopted. First, a content-based filtering algorithm is applied to find customers who share similar interests. Second, a collaborative algorithm is applied to make predictions. The system integrates product information and product ratings to calculate the product–product similarity, called the product-based clustering method. It also integrates customer information and customer ratings to calculate the customer–customer similarity, called the customer-based clustering method.
In [10,12], the authors suggested a content-based predictor to enrich existing user data and then used collaborative filtering to produce tailored suggestions. To handle a vector of bags of words, they constructed a bag-of-words naive Bayesian text classifier, where each bag of words corresponds to a specific aspect of a film, such as an actor or a director. They used the classifier to learn a user’s profile from a collection of rated movies, and the learned profile is then used to predict the ratings of unrated movies. The neighborhood is built using the Pearson correlation algorithm and user-based collaborative filtering.
In [11,13–15], the authors suggested a hybrid recommender system for tourism that can give useful information on tourist attractions based on a user’s profile, location, schedule, and the amount of time available to visit preferred locations. Because mobile technology now provides useful communication and computation functions, the authors proposed a platform that enables users to decide based on their location, schedule, context, and mobility requirements. The suggested system has two parts, a smartphone and a server. Through the smartphone, the user is connected to the system and has access to all functions at any time. The server offers the user a number of functionalities, including presentation, recommendation, punishment, socialization, and advertising.
In [8,16], to increase prediction accuracy, provide better coverage, and address the cold start issue, the authors suggested a hybrid strategy that combines content-based, collaborative, and demographic filtering techniques. To solve the cold start problem, the customers are classified into several categories by their demographic characteristics (e.g., gender, race, age, employment status, and occupation) using the nearest-neighbor technique; the KNN algorithm [5,14] was applied to locate the nearest neighbors. The authors compared their hybrid algorithm with other approaches. According to their findings, their strategy outperforms the other approaches both for cold start users and for all customers.
Each category contains readers who share similar demographic characteristics. Combining demographic characteristics with content-based approaches makes it possible to handle new items that are added to the system. However, none of the related works addresses the problem that items are presented to a user in an order dominated by the most popular ones, which ignores rarely rated (newly added) and unpopular items.
On the other hand, some approaches consider user profiles in the recommendation process, since a profile represents the information needs of an individual user. The accuracy of each user profile affects the performance of the entire recommender system. If a new user has no history, producing an effective recommendation is difficult.
This paper therefore aims to enhance the quality of recommended items and to mitigate the cold start and data scarcity problems by integrating a knowledge base, built by extracting the relations between entities, with a clustering method. After clustering based on a probability approach, the books to be recommended are filtered and their similarity is calculated. A further focus is to provide justifications for the recommended books, as this plays a crucial role in obtaining the satisfaction and trust of readers. It has been shown that a reader’s trust is positively associated with the reader’s intention to read a book.

Methods
Figure 1 depicts the overall procedure followed in this work. Each component is described in turn below.

Components of the Proposed Model.

Figure 1. The proposed framework.


1. Clustering Data: We grouped user demographic data, book data, and the ratings of books by users. Because the number of readers and books is large, it is difficult to search and compare the similarity of each user with all existing users, and such a search might not meet the user’s need. We therefore applied a clustering algorithm to group all reader data, book ratings, and book information by the similarity of the data in our database.
2. Clustering Users Data-set: We clustered the existing user data-set by grouping users on attributes selected from their demographic information. Existing users are assigned to the appropriate group according to their category; in our case, the department is used. We selected this attribute on the assumption that readers from the same department have similar interests in books, because readers regularly want information related to their department.

This clustering is used when searching for users similar to a newly registered user, and it reduces the search complexity of finding the specific user group for which recommendations are needed. Since we are recommending to new users, it helps to identify the user’s group from the registered information. During the online process, we always look for a group that matches the new user’s department. When a new user registers, the system searches for the appropriate cluster, and if the cluster is found the recommendation is made. If no matching group is found, a new group is created for the user, and the new user is recommended the filtered popular books. The algorithm for user clustering is as follows.
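When finding relationships between books, we consider not only the magnitude of the word-count vector of each book but also the angle between the two vectors, which is calculated as:

$$\cos\theta = \frac{\vec{a}\cdot\vec{b}}{\lVert\vec{a}\rVert\,\lVert\vec{b}\rVert} \qquad (1)$$

where $\vec{a}$ and $\vec{b}$ are the word-count vectors of the two books.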
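As a rough illustration of Eq. (1), the following Python sketch computes the cosine of the angle between the word-count vectors of two texts. The whitespace tokenization and the function name `cosine_similarity` are assumptions made for this example; the paper does not specify how the word counts were obtained.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between the word-count vectors of two texts."""
    # Assumption: words are lower-cased and split on whitespace.
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product over the words that the two texts share.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction
    return dot / (norm_a * norm_b)
```

Two texts with identical word counts give a value of 1, and texts that share no words give 0, regardless of their length.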

Algorithm 1 Reader Clustering Pseudocode

Given a set of reader departments D = {D1, ..., Dx}

FOR each new reader Un with department d
    IF d exists in D THEN
        FOR each Di in D, from D1 up to Dx
            IF d == Di THEN
                Store Un data in cluster Di
            END IF
        END FOR
    ELSE
        Create a new cluster for d and store Un data in it
    END IF
END FOR
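The reader clustering step can be sketched in Python as follows. This is a minimal sketch, assuming that clusters are kept in an in-memory dictionary keyed by department; the function name `assign_reader` and the data layout are hypothetical and do not reflect the Java/MySQL prototype described later in the paper.

```python
from collections import defaultdict


def assign_reader(clusters, reader_id, department):
    """Store a new reader in the cluster of their department.

    Returns True if the department cluster already existed and False if a
    new cluster had to be created for this reader.
    """
    existed = department in clusters  # membership test does not create the key
    clusters[department].append(reader_id)  # defaultdict creates the list if needed
    return existed


# Usage with department codes like those in the dataset (reader IDs are made up).
clusters = defaultdict(list)
clusters["CS"].append(1001)
assign_reader(clusters, 1002, "CS")   # joins the existing CS cluster
assign_reader(clusters, 1003, "Bio")  # no Bio cluster yet, so one is created
```

A dictionary lookup replaces the inner loop over departments, which is the search-cost reduction that clustering by department is meant to provide.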

• Clustering Books Data: The books dataset contains the books read and rated by users, stored in a table with book features such as ISBN, category, title, author, year, and publisher. Because there are many books, processing and retrieving them takes too much time and memory, so we need a better way to access the books in the system. Clustering overcomes this problem. We clustered the books dataset by category, so that books with similar categories fall under the same cluster. For example, the books of the Information Technology department are placed in the Information Technology cluster. In general, clustering reduces the processing time and the memory used when retrieving the required books, and avoids retrieving unnecessary data. This method therefore addresses the scalability problem, one of the common problems in book recommendation systems.
• Clustering Rating Data: This dataset contains the books rated by users, with the user ID and the book’s ISBN (ID). In our model, a new user registered in the system looks for books, and the system provides the requested books based on their rating values. Processing this large dataset takes a lot of time and memory, which leads to the scalability problem, so we apply clustering to group the rating dataset. The clustering is based on the rating values given by the users who read and rated the books. It yields four groups of rated books: above-average rating data, with rating values from 7 to 10; average rating data, with a rating value of 6.5; medium rating data, with values from 3 to 5; and below-average rating data, with values of 1 and 2. A further group, visited rating data, holds books that readers visited but did not rate. We also used reader behavior information gathered from reading history and from readers’ ratings of books. Readers rate books on a scale of 1 to 10, where 1 indicates a least favorite and 10 a top favorite book. Books that the reader has not rated are indicated by 0. This rating information creates contextual information for the system and is shown as follows:


$$R : \text{Readers} \times \text{Book} \times \text{Context} \rightarrow \text{rating}$$
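The rating groups described above can be sketched as a simple mapping from a rating value to a group name. The group names and the function `rating_group` are hypothetical. The text assigns only the value 6.5 to the average group, so treating every rating above 5 and below 7 as "average" is an assumption of this sketch, as is placing values between 2 and 3 in the lowest group.

```python
def rating_group(rating: float) -> str:
    """Map a rating on the 0-10 scale to one of the rating clusters."""
    if rating == 0:
        return "visited"        # book was visited but not rated
    if 7 <= rating <= 10:
        return "above_average"
    if 5 < rating < 7:
        return "average"        # assumption: covers the stated 6.5 value
    if 3 <= rating <= 5:
        return "medium"
    if 1 <= rating < 3:
        return "below"          # assumption: includes values between 2 and 3
    raise ValueError(f"rating out of range: {rating}")
```

For example, `rating_group(9)` falls in the above-average group and `rating_group(0)` in the visited group.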

• Clustering New Users: The group of a newly registered user is identified according to user similarity, and the user’s data are registered in the appropriate cluster of similar users. New users are identified by their demographic attributes: the system adds the user’s demographic data to his/her respective group, and users are clustered with similar users based on their department. Filtering on this information reduces the number of users to consider, on the assumption that users in similar age groups have more similar interests in the books to be read.

We used Information Extraction (IE) to capture the relationships between books and authors in order to suggest a given book; a sample is shown in Algorithm 2. We generated triples, each representing a pair of entities and the relation between them. For example, (Michael T. Goodrich, write, Data Structures and Algorithms) is a triple in which ‘Michael T. Goodrich’ and ‘Data Structures and Algorithms’ are the related entities and the relation between them is ‘write’. We used a rule-based approach, defining a set of rules over syntactic and grammatical properties to extract information from books, and used the result as input for recommendation.

Algorithm 2 Sample Rule Generated

Noun Phrases: ['Michael T. Goodrich', 'Data Structures and Algorithms in Java', 'author', 'Roberto Tamassia', 'Operating System']
Verbs: ['write', 'publish', 'explain', 'purpose']
Michael T. Goodrich PERSON
Turkish NORP
The Republic of Turkey GPE
First ORDINAL
1923 DATE
Turkey GPE
Roberto Tamassia PERSON
the 20th Century DATE

According to Algorithm 2, before any book is recommended to the user, semantic relations are generated from the book information using pattern relationships. Then, based on the constructed relationships, the most relevant books are recommended.
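To make the idea of a rule-based triple concrete, the sketch below extracts an (author, write, title) triple from a sentence with a single regular-expression rule. The rule, the sentence forms it accepts, and the names `WRITE_RULE` and `extract_triple` are hypothetical; the paper does not list its actual rule set, which also relies on noun phrases and named entities.

```python
import re

# Hypothetical rule: "<Person> wrote|writes|has written <Title>."
# The verb is normalised to the relation name "write".
WRITE_RULE = re.compile(
    r"^(?P<author>.+?)\s+(?:wrote|writes|has written)\s+(?P<title>.+?)\.?$"
)


def extract_triple(sentence: str):
    """Return (subject, relation, object) if the rule matches, else None."""
    match = WRITE_RULE.match(sentence.strip())
    if match is None:
        return None
    return (match.group("author"), "write", match.group("title"))


triple = extract_triple(
    "Michael T. Goodrich wrote Data Structures and Algorithms in Java."
)
```

Here `triple` holds the author as subject, `write` as relation, and the book title as object, matching the shape of the example triple given in the text.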

• Retrieving Most Highly Rated Books: The books rated by the filtered users are fetched and prioritized before the prediction for the newly registered user is made. The user is provided with a list of 10 highly rated books, to ensure that highly rated books are represented.

Our approach predicts recommendations using relevance (most highly rated), clustering, and popularity (most highly read). The recommendation is made with both the content-based and the collaborative filtering approaches.

1. Filtering Books by content-based approach: The book categories are used to cluster the books rated by users, and each book may be rated by different users. Books from one category that are rated by many users are recommended to new users in the similar cluster, in addition to the books recommended by collaborative filtering.
2. Filtering Books by collaborative filtering approach: Books rated by the users in one group, formed according to demographic similarity, are filtered. Since the rated books are clustered into four different groups, the system checks all groups and follows their priority when returning books. The books with the highest rating values get priority if they fill the number of books to be generated for the readers. If no book is found for the group, nothing is recommended. Since the new user has no rating values for any book, the rating value of each book to be recommended is predicted from the ratings given to that book within the new user’s cluster. To predict the rating value, the system calculates the weight of the books rated highly by the users in the group similar to the new user. After the weight of the rating is calculated, the books with that weight value are recommended to the user.
3. Filtering popular Books: In this work, popular books are assumed to be those with the highest rating values that are rated by many users, and they are recommended. We check their frequency, or number of occurrences, since a book that occurs frequently has been rated by many users and is popular with many readers. Our model takes as input a reader’s interest profile, rating data, and book information, on which it bases the recommendations. Accurate reader information plays a crucial role in integrating different recommendation techniques. The reader’s profile contains descriptive information about the reader, such as the department, which we gathered during reader registration. Finally, we retrieve the popular books by checking their rating values and retrieving all books with the highest rating value.
4. Generating Top Recommended Books: The top-N book recommendation is produced by the ranking algorithm we developed for the book recommendation system. Since our work recommends books, we need to consider when the books were published and their popularity with readers. The books to be recommended are retrieved through both the content-based and the collaborative filtering approaches, and books are also generated based on popularity.
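The filtering steps above can be illustrated with a small Python sketch that ranks books for a reader with no history, using only the ratings given inside the reader's cluster. The text does not define how the rating weight is computed, so the mean rating used here is an assumption, as are the tie-break by number of ratings and the name `top_n_for_new_reader`.

```python
from collections import defaultdict


def top_n_for_new_reader(cluster_ratings, n=10):
    """Rank books for a reader with no history.

    cluster_ratings: iterable of (isbn, rating) pairs given by readers in the
    new reader's cluster; a rating of 0 means "visited, not rated".
    Assumption: the predicted rating of a book is the mean of its non-zero
    ratings, and ties are broken by popularity (number of ratings).
    """
    ratings_by_book = defaultdict(list)
    for isbn, rating in cluster_ratings:
        if rating > 0:
            ratings_by_book[isbn].append(rating)
    scored = [(sum(r) / len(r), len(r), isbn) for isbn, r in ratings_by_book.items()]
    # Highest mean first, then most-rated first, then ISBN for a stable order.
    scored.sort(key=lambda s: (-s[0], -s[1], s[2]))
    return [(isbn, mean) for mean, _count, isbn in scored[:n]]
```

The default of `n=10` follows the list of 10 highly rated books mentioned in the text; unrated (visited-only) books never enter the ranking in this sketch.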

Each of the approaches we used has its own ranking method for the books it generates. The results obtained from the individual approaches are combined. Finally, we order the books to be recommended by the time they were published: the most recent books are displayed at the top, followed by the rest in the same order.
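The final combination step can be sketched as follows: the candidate lists from the three filters are merged, duplicates are dropped, and the result is sorted so that the most recently published books come first. The function name `merge_and_rank` and the `(isbn, year)` input format are assumptions made for this illustration.

```python
def merge_and_rank(content_based, collaborative, popular, n=10):
    """Combine three candidate lists and put the most recent books first.

    Each input is a list of (isbn, year) pairs. A book that appears in more
    than one list is kept once.
    """
    year_by_isbn = {}
    for isbn, year in [*content_based, *collaborative, *popular]:
        year_by_isbn.setdefault(isbn, year)  # keep the first year seen per ISBN
    # sorted() is stable, so books from the same year keep their merge order.
    ranked = sorted(year_by_isbn.items(), key=lambda item: item[1], reverse=True)
    return [isbn for isbn, _year in ranked[:n]]
```

Sorting purely by year discards the per-approach ranking except as a tie-break, which matches the description in the text but means an older, highly rated book can fall below a recent, lower-rated one.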

Experimentation result, discussion and evaluation

This section describes the dataset, the implementation tools, and the evaluation metrics, and presents the experimental evaluation of the proposed recommendation method.

Dataset. This research uses the good books dataset [17] (Harper and Konstan, 2015), which is commonly used in the domain of recommender systems. It contains the results of real users’ interactions with a recommender system and supports recommending books from the user profile. The available content descriptions help in finding books similar to a selected one. The dataset is designed to offer user-item matrices for developing recommendation algorithms. It was chosen to evaluate the algorithms developed in this research and to show their effectiveness and novelty compared with other algorithms. The dataset contains a text file of books with each book’s ISBN, title, author, publisher, categories, publication year, and URL. We converted the text data from the source into a table in our database, mapping each attribute to a column. The dataset consists of 326,376 ratings, with numerical values ranging from 1 to 10, from 278,850 users on 271,379 books. Users who had made fewer than 20 ratings were removed from the analysis. In addition, the dataset contains demographic information for each user, including their department, year, and semester. Users who did not complete these details were also removed from the analysis.
We store the demographic information of the users who rated or read books (department, semester, year, and user ID) in one table. The rating values given to each book by the active users are a further data source for this study and are stored in another table.
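The two filtering rules for users (at least 20 ratings, and complete demographic details) can be sketched in Python as below. The in-memory data layout and the name `active_users` are assumptions for illustration; the paper stores these data in MySQL tables.

```python
from collections import Counter


def active_users(ratings, demographics, min_ratings=20):
    """Return the IDs of the users kept for the analysis.

    ratings: list of (user_id, isbn, rating) rows.
    demographics: dict mapping user_id to a dict with the keys
    "department", "year" and "semester" (missing or None if not filled in).
    """
    counts = Counter(user_id for user_id, _isbn, _rating in ratings)
    required = ("department", "year", "semester")
    return {
        user_id
        for user_id, count in counts.items()
        if count >= min_ratings
        and all(demographics.get(user_id, {}).get(f) is not None for f in required)
    }
```

A user is dropped if either condition fails, so a prolific rater with an incomplete profile is excluded just like a user with too few ratings.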


As Table 1 shows, the user dataset contains two attributes, the user ID and the department (category). Predefined features were extracted for the books to obtain a rating matrix of items by a set of users, the users’ descriptions (sex, age, location, and profession), and the items’ descriptions (genre, author, title, date, and price). Recommender systems researchers have applied different measures to evaluate recommendation algorithms in terms of accuracy and quality. Such measures are useful for evaluating the quality of a system and its ability to forecast the rating of a particular item. We selected the metrics that relate to the objectives of our study. Since the main objective is to recommend books that are more relevant or interesting to users, we have to evaluate the relevance of the recommended books to users. The most widely used information retrieval metrics for measuring relevance are precision, recall, and F1-score.

Implementation tools. Both the backend and the frontend of the prototype have been implemented using
several tools. Our study’s implementation tools included the Java programming language, NetBeans 8.0.1 tools
for building the model on the front end, and MySQL 5.1 for storing and processing the dataset on the back end
through a connection to NetBeans. The prototype of our approach is implemented using the NetBeans tool,
which is also used to create new user accounts and deliver output to users.

Evaluation metrics. To evaluate the performance of our model in terms of Top-N recommendations, the classification accuracy metrics (i.e., Precision at N, Recall at N, and the F1 measure) were chosen and computed for the users in the test set. Table 2 shows the quantities on which these metrics are based.


UserID   ISBN         Book rating   Department
276726   0155061224   5             HO
276729   052165615X   3             Med
276729   0521795028   6             Med
276744   038550120X   7             CS
276747   0060517794   9             CS
276747   0671537458   9             CS
276747   0679776818   8             CS
276747   0943066433   7             Med
276747   1885408226   7             Med
276748   0747558167   6             Bio
276751   3596218098   8             IT

Table 1. Books dataset information.

                   Recommended           Not recommended
Relevant books     TP (true-positive)    FN (false-negative)
Irrelevant books   FP (false-positive)   TN (true-negative)

Table 2. Evaluation metrics table [11].

1. Precision: Precision is the fraction of recommended books that are relevant, calculated using Eq. (2) [1]:

$$P = \frac{TP}{TP + FP} \qquad (2)$$

where P is the precision, TP is the number of relevant books that were recommended, and FP is the number of irrelevant books that were recommended.

2. Recall: In our case, recall measures the fraction of all relevant (good) books that were recommended, as described in Eq. (3) [1]:

$$R = \frac{TP}{TP + FN} \qquad (3)$$

where R is the recall and FN is the number of relevant books that were not recommended.

3. F1-Score: The F1-score is the harmonic mean of precision and recall, as shown in Eq. (4):

$$F1\text{-}Score = \frac{2TP}{2TP + FP + FN} \qquad (4)$$
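The three metrics can be computed for a single reader from the set of recommended books and the set of relevant books, using the TP, FP, and FN counts of Table 2. The sketch below is illustrative only; the function name `precision_recall_f1` and the convention of returning 0 when a denominator is zero are assumptions.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 for one reader's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)  # relevant books that were recommended
    fp = len(recommended - relevant)  # irrelevant books that were recommended
    fn = len(relevant - recommended)  # relevant books that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return precision, recall, f1


# Four books recommended, three relevant, two in common: TP=2, FP=2, FN=1.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
```

In this example the precision is 2/4, the recall is 2/3, and the F1-score is 4/7, which equals the harmonic mean of the first two values.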

Evaluation result. The model uses the data described above to produce the output shown to users, and we evaluate the final output with the metrics described above. We evaluated the performance of our work in two ways: one experiment compares individual users with each other, and the other compares users with clusters that consist of many users. This work helps recommend books to users by discovering the relationships among books and users, which allows a user’s descriptive static profile to be combined with dynamic book behavior.

1. Experimentation for individual user similarity: For individual users, we evaluated the model on 20 users taken from 400 active users. We removed the books rated by these users, registered each of the 20 users as a new user, ran our proposed model to recommend books to them, and compared the books each user had previously rated with the actual recommendations. We then calculated the precision, recall, and F1-score values using the formulas discussed in the previous section. The results of this experiment are given in Table 3.


User ID   Precision   Recall     F1-Score
93589     0.740741    0.4578     0.5992705
132852    0.85121     0.5645     0.707855
255542    0.88451     0.6568     0.770655
19498     0.65985     0.3351     0.497475
44035     0.55123     0.2354     0.393315
Average   0.6384141   0.404274   0.521344

Table 3. Experimentation based on user similarity.

Table 3 reports the results for a sample of the evaluated users. Averaged over users, the recommendations
reach a precision of 63.84%, a recall of 40.43%, and an F1-score of 52.13%. Figure 2 presents the sample
results graphically.
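
As a worked check of Eq. (4) on one row of Table 3, the sketch below derives the F1-score directly from precision and recall through the equivalent form F1 = 2PR / (P + R). The helper names are assumptions for illustration. For user 93589 the harmonic mean comes to roughly 0.566, whereas the tabulated 0.5992705 coincides with the arithmetic mean (P + R) / 2, so readers comparing against Table 3 may want to keep both conventions in mind.

```python
def f1_harmonic(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (equivalent to Eq. 4)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def arithmetic_mean(precision: float, recall: float) -> float:
    """Plain average of precision and recall, shown only for comparison."""
    return (precision + recall) / 2


# Precision and recall for user 93589, taken from Table 3.
p, r = 0.740741, 0.4578
harmonic = f1_harmonic(p, r)
arithmetic = arithmetic_mean(p, r)
print(round(harmonic, 4), round(arithmetic, 7))
```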

2. Experimentation by user cluster-based similarity: The second evaluation is based on user clusters. We
took 4 of the user clusters in our dataset, generated recommendations for new users similar to each
selected cluster, and measured accuracy by comparing the books actually recommended with the books
recommended for that cluster. The results are shown in Table 4.

Based on this experiment, the average precision is 0.76416, the average recall is 0.37429, and the average
F1-score is 0.56923, as shown in Table 4.

Figure 2. Experimentation value based on individual user similarity.

Category number    Precision    Recall     F1-Score
2                  0.7745       0.36213    0.5992705
4                  0.58652      0.13541    0.707855
6                  0.8845       0.6568     0.770655
8                  0.811124     0.3351     0.497475
Average            0.76416      0.37429    0.56923

Table 4. Category based evaluation.

Discussion and conclusion.

1. Discussion: This work employs Content-based Filtering and Collaborative Filtering with semantic relations
to capture the relationships as knowledge-based book recommendations for the biased data of new users.
We conducted two experiments: one based on individual user similarity and one based on clustered user
similarity. In both experiments we obtained higher-quality book recommendations than previously
published work.

The two experiments yield different values, as discussed in the previous section, so recommendation
performance depends on whether similarity is computed against clusters or against individual users. The
cluster-based experiment gives the more precise recommendations, with a precision of 76.4%, a recall of 37.4%,
and an F1-score of 56.9%, compared with 63.8%, 40.4%, and 52.1% for individual users. We attribute this to
the cluster-based recommendation containing more books related to the users in that cluster than the books
recommended on the basis of individual user similarity. Recall, however, is lower in the cluster-based
experiment than in the individual one. The reason is that the many users in a cluster contribute a large
number of good books; when the number of good books is large, the recall value becomes smaller.
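
The effect of a large set of good books on recall can be illustrated with a small sketch. It assumes a recommendation list of fixed length, which the text does not state explicitly; `max_recall` and the numbers used are hypothetical. Because at most `list_size` relevant books can be returned, the best achievable recall falls as the number of relevant books grows.

```python
def max_recall(list_size: int, n_relevant: int) -> float:
    """Upper bound on recall when at most `list_size` books are recommended."""
    if n_relevant == 0:
        return 0.0
    return min(list_size, n_relevant) / n_relevant


# With a list of 10 recommendations, the bound drops once the number of
# relevant books exceeds the list length.
for n_relevant in (5, 10, 20, 40):
    print(n_relevant, max_recall(list_size=10, n_relevant=n_relevant))
```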
As Table 5 shows, the proposed models achieve the highest precision and F1 scores among the compared models.
The two proposed models, the Hybrid Model and Pattern-based Word Embedding, reach F1 scores of 0.521 and
0.569, respectively. Hence, it is reasonable to attribute the comparatively good overall evaluation to the
extraction of semantic relationships between books described previously. Figure 3 compares the performance
of our model with the other models and shows the same outcome.

2. Conclusion: The educational domain is based on a heterogeneous collection of information and services,
such as student information services and digital library services. The main objective of this study was
to design a recommendation system for a digital library. Recommendation systems face many challenges, as
discussed in the related work. In this study, a hybrid book recommendation system that uses new-user
profile data was proposed. The hybrid scheme combines content-based and collaborative approaches with
user profile information, with the help of pattern relationships between users. The content-based
component uses book features to learn the content type of the books and selects recommendations for
similar users in the same cluster from the most-rated categories of books.

Approach                                          Precision    Recall    F1 Score
Hybrid model (proposed model)                     0.638        0.404     0.521
Semantic based STRuFSP (ref. 18)                  0.563        0.479     0.484
Probase-LDA                                       0.413        0.327     0.429
CLDA (ref. 19)                                    0.38         0.37      0.401
Pattern based word embedding (proposed model)     0.764        0.374     0.569
Pattern based TNG                                 0.446        0.386     0.374
N-Gram                                            0.401        0.386     0.361
Frequent closed patterns-FCP                      0.428        0.385     0.362

Table 5. Result comparison of different algorithms with the proposed model.
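
The comparison in Table 5 can be checked per metric with a short sketch. The scores are copied from the table; the dictionary layout and the `best_by` helper are assumptions made for illustration.

```python
# Scores copied from Table 5: approach -> (precision, recall, F1 score)
table5 = {
    "Hybrid model (proposed)": (0.638, 0.404, 0.521),
    "Semantic based STRuFSP": (0.563, 0.479, 0.484),
    "Probase-LDA": (0.413, 0.327, 0.429),
    "CLDA": (0.38, 0.37, 0.401),
    "Pattern based word embedding (proposed)": (0.764, 0.374, 0.569),
    "Pattern based TNG": (0.446, 0.386, 0.374),
    "N-Gram": (0.401, 0.386, 0.361),
    "Frequent closed patterns-FCP": (0.428, 0.385, 0.362),
}


def best_by(metric_index: int) -> str:
    """Return the approach with the highest value for the given column."""
    return max(table5, key=lambda name: table5[name][metric_index])


best_precision = best_by(0)
best_recall = best_by(1)
best_f1 = best_by(2)
print(best_precision, best_recall, best_f1, sep="\n")
```

Looking at each column separately makes it visible that the leading approach can differ between precision, recall, and F1 score.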

Figure 3. Result comparison of the proposed model with other models. Series shown: Precision, Recall,
F1 Score; vertical axis from 0 to 0.7.


Data availability
All data generated or analyzed during this study are included in this published article and its supplementary
information files.

Received: 19 July 2022; Accepted: 3 March 2023

References
1. Gupta, V. & Pandey, S. R. Recommender systems for digital libraries: A review of concepts and concerns. In Library Philosophy and
Practice, 1–9 (2019).
2. Chandak, M., Girase, S. & Mukhopadhyay, D. Introducing hybrid technique for optimization of book recommender system. Proc.
Comput. Sci. 45, 23–31 (2015).
3. Jain, S., Grover, A., Singh Thakur, P. & Choudhary, S. K. Trends, problems, and solutions of recommender system. In International
Conference on Computing, Communication & Automation, 955–958. (IEEE, 2015).
4. Koenigstein, N., Dror, G. & Koren, Y. Yahoo! music recommendations: Modeling music ratings with temporal dynamics and item
taxonomy. In Proceedings of the Fifth ACM Conference on Recommender Systems, 165–172 (2011).
5. Tian, Y., Zheng, B., Wang, Y., Zhang, Y. & Qi, Wu. College library personalized recommendation system based on hybrid
recommendation algorithm. Proc. CIRP 83, 490–494 (2019).
6. Omisore, M. O. & Samuel, O. W. Personalized recommender system for digital libraries. Int. J. Web-Based Learn. Teach. Technol.
IJWLTT 9(1), 18–32 (2014).
7. Jomsri, P. Book recommendation system for digital library based on user profiles by using association rule. In Fourth Edition of
the International Conference on the Innovative Computing Technology (INTECH 2014), 130–134. (IEEE, 2014).
8. Li, Q. & Kim, B. M. Constructing user profiles for a collaborative recommender system. In Asia-Pacific Web Conference, 100–110.
(Springer, 2004).
9. Zhang, H.-R., Min, F., He, X. & Xu, Y.-Y. A hybrid recommender system based on user-recommender interaction. Math. Probl. Eng.
2015 (2015).
10. Burke, R. Hybrid web recommender systems. In The Adaptive Web, 377–408 (2007).
11. Chen, J.-H., Chao, K.-M. & Shah, N. Hybrid recommendation system for tourism. In 2013 IEEE 10th International Conference on
e-Business Engineering, 156–161. (IEEE, 2013).
12. Javed, U. et al. A review of content-based and context-based recommendation systems. Int. J. Emerg. Technol. Learn. iJET 16(3),
274–306 (2021).
13. Alrasheed, H. et al. A multi-level tourism destination recommender system. Proc. Comput. Sci. 170, 333–340 (2020).
14. Lucas, J. P. et al. A hybrid recommendation approach for a tourism system. Expert Syst. Appl. 40(9), 3532–3550 (2013).
15. Rey-López, M., Barragáns-Martínez, A. B., Peleteiro, A., Mikic-Fonte, F. A. & Burguillo, J. C. Moretourism: Mobile
recommendations for tourism. In 2011 IEEE International Conference on Consumer Electronics (ICCE), 347–348. (IEEE, 2011).
16. Rozeva, A. & Zerkova, S. Assessing the semantic similarity of texts – methods and algorithms. In AIP Conference Proceedings, vol.
1910, 060012. (AIP Publishing LLC, 2017).
17. Maxwell Harper, F. & Konstan, J. A. The movielens datasets: History and context. ACM Trans. Interact. Intell. Syst. (TIIS) 5(4),
1–19 (2015).
18. Kapugama Geeganage, D. T., Xu, Y. & Li, Y. Semantic-based topic representation using frequent semantic patterns. Knowl. Based
Syst. (2021).
19. Tang, Y.-K., Mao, X.-L., Huang, H., Shi, X. & Wen, G. Conceptualization topic modeling. Multimedia Tools Appl. 3455–3471 (2018).

Acknowledgements
First of all, I would like to praise my God for his support in all my work. Next, I would like to thank my
academic staff, my senior supporter Mesfin Leranso (Ph.D.), and other staff members for their countless
support.

Author contributions
F.W. proposed and developed the main idea and worked on the overall writing of the paper. M.L.B. reviewed
and commented on the manuscript. G.A. and A.K. read and approved the final manuscript modifications.

Competing interests
The authors declare no competing interests.

Additional information
Supplementary Information The online version contains supplementary material available at
https://doi.org/10.1038/s41598-023-30987-0.
Correspondence and requests for materials should be addressed to M.L.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.


Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2023
