Journal of Cloud Computing: Advances, Systems and Applications (2020) 9:57
https://doi.org/10.1186/s13677-020-00199-2
Abstract
The recommendation system is an effective means to solve the information overload problem that exists in social
networks, and it is also one of the most common applications of big data technology. The matrix
decomposition recommendation model based on scoring data has been extensively studied and applied in recent
years, but the data sparsity problem affects the recommendation quality of the model. To this end, this paper
proposes a hybrid recommendation model based on deep emotion analysis and multi-source view fusion, which
makes personalized recommendations using user-post interaction ratings, implicit feedback and auxiliary
information in a hybrid recommendation system. First, the HITS algorithm is used to process the data set,
which selects the users and posts with high influence and eliminates most of the low-quality users and posts.
Second, a method for measuring the similarity of candidate posts and a method for calculating the K
nearest neighbors are designed, which solves the problem that the text description information of post content in
the recommendation system is difficult to mine and utilize. Then, the cooperative training strategy is used to
achieve the fusion of two recommendation views, which eliminates the data distribution deviation added to the
training data pool in the iterative training. Finally, the performance of the DMHR algorithm proposed in this paper
is compared with other state-of-the-art algorithms on a Twitter dataset. The experimental results show that the
DMHR algorithm achieves significant improvements in score prediction and recommendation performance.
Keywords: Hybrid recommendation system, Emotion analysis, Multi-source view, Fusion
© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License,
which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if
changes were made. The images or other third party material in this article are included in the article's Creative Commons
licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons
licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Jiang et al. Journal of Cloud Computing: Advances, Systems and Applications (2020) 9:57 Page 2 of 16
in many applications of artificial intelligence. Besides, the user rating matrix is still the main data source
used by most recommendation systems, but recommendation based on user reviews, user implicit feedback, and
item content information is getting more and more attention [13]. However, the progress made in these aspects
of research is not very satisfactory due to the constraints of text mining and user behavior analysis, although
they have important potential for improving the accuracy, cold start behavior and interpretability of the
recommendation system. Meanwhile, data sparsity and cold start problems are usually more serious in social
networks than in the setting of traditional recommendation algorithms, which brings great challenges to the
research of social recommendation algorithms.

Aiming at the above-mentioned issues, the collaborative filtering algorithm is highly praised by researchers
[14, 15]. Its goal is to transform the binary relationship between users and posts into a score prediction
problem, and then to generate a recommendation list by collaborative filtering or sorting based on users'
scores of posts. Furthermore, subsequent research has found that the recommendation results based on user
ratings do not accurately reflect the user's interest preferences due to the constraints of user ratings and
the sparseness of the scoring matrix.

In content-based recommendation, the description text of the post content is an important recommendation basis
[16]. Content-based recommendation can effectively solve the cold start problem and is not constrained by score
sparsity; it can discover hidden information and offers a good user experience. Hence, it has received wide
attention recently. However, the natural language description of the post content (usually short and
fragmented) does not provide enough information for the machine to make statistical inferences, which brings
great difficulties to the semantic understanding of the post content.

At present, research on deep learning techniques for integrating multi-source heterogeneous data, fusing the
scoring matrix with review text, and multi-featured collaborative recommendation has become a hot topic
[17–20]. Based on the above research, this paper proposes a hybrid recommendation model based on deep emotion
analysis and multi-source view fusion (the DMHR algorithm), which addresses the balance of user score
distribution and the difficulty of multi-factor recommendation in recommendation systems. The multi-source
view here refers to the multidimensional recommendation factors in the recommendation system. The hybrid
recommendation method of this paper combines three recommendation views: the user rating matrix, the user
review text, and the content description information of posts. Different from traditional hybrid methods such
as weighted fusion and cascading, this paper designs a recommendation algorithm based on collaborative training
to achieve the fusion of the behavioral view of user ratings and post content.

The main contribution of this paper is to propose a score prediction method based on multi-view fusion through
collaborative training, and to explore the integration of auxiliary language information such as user review
text into the recommendation system by using natural language processing technology based on deep learning.
The work of this paper is mainly reflected in the following aspects:

(1) A data preprocessing method based on the HITS algorithm is introduced, which selects the users and posts
with high influence, so as to eliminate most of the low-quality users and posts and ensure better efficiency in
subsequent processing. At the same time, the authority value of the post is obtained and used as the user's
initial rating for the post. Besides, a method based on a comprehensive measure of the user's emotional tendency
and the original rating level is proposed. The deviation of the user's original score from the user's real
interest preference is corrected by mining the emotional tendency of the user's reviews. The perspective
pre-filtering method is used to achieve a comprehensive measure of the user's emotional tendency and the
original rating level, which provides more accurate comprehensive scoring data, reflecting the user's real
interest preference, for the post-based collaborative filtering recommendation model.

(2) A method for text information mining based on post content description is proposed. The text of the content
description of the post is mined, a neural network method is used to represent it as a distributed paragraph
vector, and the similarity between post contents is computed, thereby constructing a recommendation model based
on the content of the post. A method for measuring the similarity of candidate posts and a method for
calculating the K nearest neighbors are designed, which solves the problem that the text description
information of post content in the recommendation system is difficult to mine and utilize.

(3) A hybrid recommendation algorithm based on collaborative training is proposed. The cooperative training
strategy is used to achieve the fusion of two recommendation views, and a data selection strategy based on
confidence estimation and cluster analysis is added to collaborative training, eliminating the data
distribution deviation added to the training data pool in the iterative training. On this basis, the initial
recommendation results are filtered and sorted by using the scoring matrix and the post similarity output by
the collaborative training model, and the final recommendation results are obtained.

The rest of the paper is organized as follows. In Section II, related work on recommendation algorithms based
on collaborative filtering and content description is discussed. In Sections III and IV, a hybrid
recommendation model based on deep emotion analysis and multi-source
view fusion is presented. Our experiments are analyzed and discussed in Section V. The conclusions and our
future work are given in the last section.

Related work
The recommendation system is an effective means to solve the information overload problem and is one of the
most common applications of big data technology. It utilizes knowledge discovery technology to filter
information and products that users are interested in according to their historical information, hobbies and
other characteristics, thereby achieving personalized recommendation. In addition, recommendation algorithms
based on collaborative filtering and content description are the two most common recommendation algorithms [21].

Recommendation algorithms based on collaborative filtering
Collaborative filtering recommendation algorithms can often be divided into user rating-based methods and
implicit semantic model-based methods. User rating-based methods use historical scoring data to discover
similar users or similar items and generate recommendation lists based on similarity. Implicit semantic
model-based methods map the user and the item to feature vectors with some real meaning and calculate the
user's preference for the item as the inner product of the vectors. For example, Guo et al. [22] proposed a
neural variational collaborative filtering framework for top-k recommendation. The actual effect of the
algorithm is improved by incorporating the side information of users and items and employing a Stochastic
Gradient Variational Bayes approach. Yan et al. [23] proposed a stage-wise matrix factorization algorithm by
exploiting manifold optimization techniques. Applying this algorithm to the collaborative filtering
recommendation model can greatly improve performance and efficiency on large-scale real data. Koren [24]
proposed a matrix factorization-based model for recommendation in social rating networks, named the SVD++
algorithm, which introduces the trust delivery mechanism into social recommendation and better reflects the
influence of the social network trust relationship on the recommendation. Although the collaborative filtering
algorithm is widely used and easy to implement, it has many problems, such as high computing cost, poor
scalability and sparse data.

The social network-based recommendation is the extension of collaborative filtering recommendation algorithms
to social networks, which have the characteristics of data diversity, real-time data update and high
interaction. Guo et al. [25] proposed a collaborative filtering recommendation algorithm named TDSRec that
integrates the characteristics of social networks. It obtains the trust and trusted characteristic matrix, and
recommends accordingly. It solves the problem of data sparsity in traditional collaborative filtering
algorithms to some extent. Forsati et al. [26] proposed a matrix factorization-based model for recommendation
in social rating networks, named the SocialMF algorithm, which introduces the trust delivery mechanism into
social recommendation and better reflects the influence of the social network trust relationship on the
recommendation. Although social recommendation algorithms have a wide range of applications, they also face
problems such as the huge amount of data, complex data content, complex algorithm implementation, high time
complexity, and weakly personalized recommendation results.

Recommendation algorithms based on content description
In content-based recommendation, the content information of the item is an important recommendation basis and
an important way to solve the cold start problem, but this recommendation method is limited by information
acquisition technology [27]. Content-based recommendation uses the content information of the user's favorite
items to find similar items for recommendation. The current popular practice is to use the relevant theories,
methods and techniques of information retrieval to model the item content information. Zhao et al. [28]
proposed a review-based recommendation model by fusing users' internal influence into a matrix factorization to
improve the accuracy of rating predictions. User sentimental deviations and the review's reliability are
explored to measure their impact on social recommendation. McAuley et al. [29] proposed the HFT algorithm,
which fuses the scoring matrix and the review text during the parameter learning and fitting phases. It models
user ratings and user reviews by establishing a link between the topic distribution of the user's reviews and
the latent factors of the user or post. Bao et al. [30] proposed the TopicMF algorithm, which uses non-negative
matrix factorization to mine the topic distribution of a single comment. The topic distribution is considered
to reflect user preferences and item characteristics, and it is mapped to the user latent factors and item
latent factors. Ding et al. [31] proposed a learning algorithm based on the element-wise Alternating Least
Squares learner, which integrates view data into a recommendation system based on implicit feedback to mine
hidden preference information other than primary feedback data such as purchases. However, text content is
usually short and fragmented. Without reference to historical information, there is often insufficient
information for the machine to make statistical inferences, which brings great difficulty to the semantic
understanding of the item content. The recommended information also lacks diversity, and the coverage of the
user's interests is limited.
Human emotion expressed in social media plays an increasingly important role in shaping policies and
decisions. Emotion analysis on social networks has attracted increasing research attention. In order to improve
the accuracy of recommendation, emotion analysis is combined with other factors. Chouchani et al. [18] used
information about social influence processes to improve emotion analysis. Phan et al. [19] proposed a new
approach based on a feature ensemble model related to tweets containing fuzzy emotion by taking into account
elements such as lexical, word-type, semantic, position, and emotion polarity of words. Chung et al. [20]
developed a novel framework for dissecting emotion and examining user influence in social media which
comprehensively considered emotions, social positions, influence and other factors. However, human emotion
fluctuates, more user history data is needed, and multiple recommendation factors are not easy to integrate.
Recommendation models based on emotion analysis are easily restricted by data sparsity and cold start, so it is
not easy to obtain good recommendation results.

Data preprocessing
The data obtained from social networks is disorganized and suffers from data sparsity and cold start, so it
must be pre-processed to improve the recommendation model. To this end, this paper introduces the HITS
algorithm [32] into the recommendation model.

The HITS algorithm is one of the classic algorithms for web search. It finds the authority pages and the hub
pages in a page collection by analyzing the hyperlinks between the pages. These characteristics of the HITS
algorithm have attracted many researchers' attention and have been introduced into online social networks.
Similarly, the hub value and the authority value are used here to represent the influence of users and posts,
respectively. The HITS algorithm is used to process the data set and select the users and posts with high
influence, so as to eliminate most of the low-quality users and posts, which ensures better efficiency in
subsequent processing. At the same time, the authority value of the post is obtained and used as the user's
initial rating for the post. The authority value of a post can be represented by the sum of the hub scores of
all users who have forwarded that post:

$$a(p) = \sum_{u} h(u) \tag{1}$$

The authority value of the post a(p) is then normalized:

$$a(p) = \frac{a(p)}{\sqrt{\sum_{p} [a(p)]^{2}}} \tag{2}$$

The authority value of the post a(p) is obtained by iterating repeatedly until a(p) converges. However, the
initial score obtained in this way does not take into account the time attribute and the actual interest of the
user, and the recommendation model proposed in this paper overcomes these shortcomings.

Hybrid recommendation model based on deep emotion analysis and multi-source view fusion
In view of the above discussion on the status quo of recommendation model research, this paper proposes a
hybrid recommendation model based on deep emotion analysis of user reviews and cooperative fusion of
multi-source recommendation views, named DMHR. The process of the DMHR hybrid recommendation model is as
follows. Firstly, the perspective pre-filtering method [33] is used to achieve a comprehensive measure of the
user's emotional tendency and the original rating level, which provides more accurate comprehensive scoring
data, reflecting the user's real interest preference, for the post-based collaborative filtering recommendation
model. Simultaneously, the text of the content description of the post is mined, and a neural network method is
used to represent it as a distributed paragraph vector, so that the similarity between post contents can be
computed and a recommendation model based on the content of the post can be constructed. Secondly, the
cooperative training strategy is used to achieve the fusion of two recommendation views, adding a data
selection strategy based on confidence estimation and cluster analysis to collaborative training and
eliminating the data distribution deviation added to the training data pool in the iterative training. Finally,
on this basis, the initial recommendation results are filtered and sorted by using the scoring matrix and the
post similarity output by the collaborative training model, and the final recommendation results are obtained.
The deviation of the user's original score from the user's real interest preference is corrected by mining the
emotional tendency of the user's reviews for the next recommendation. The framework of the hybrid
recommendation model is shown in Fig. 1.

Emotional analysis of user reviews
Distributed vector representation of user review text
Through statistical analysis of the user review text in the recommendation model, it is found that the
presentation form is usually a keyword and a short text. Research shows that these short text messages are
usually processed differently from long text. The short text has the characteristics of short length and
irregular grammar, which makes traditional natural language processing technology powerless in short text
analysis. Early analysis and application of short text mainly relied on enumeration or keyword matching,
avoiding the semantic understanding of text, while automatic short text understanding usually relies on
additional knowledge. In this
paper, we use a keyword representation method based on word vectors to solve the curse of dimensionality of
traditional sparse representations and their inability to express semantic information. At the same time, the
association attributes between words are also mined, which improves the accuracy of the semantic representation
of keywords.

Word2vec is a predictive model for efficient word embedding learning, with two variants, the CBOW model and the
Skip-Gram model [34]. CBOW predicts the probability of occurrence of a central word from the words within the
window, while Skip-Gram predicts the probability of the words within the window from the central word. Its
training goal is to find vector representations of words that are useful for predicting the surrounding words
in sentences or documents. If ω1, ω2, …, ωT denote the words in a given sentence, the objective function g(ω)
of the Skip-Gram model is to maximize the average logarithmic probability:

$$g(\omega) = \frac{1}{T} \sum_{t=1}^{T} \sum_{-c \le j \le c,\ j \ne 0} \log p(\omega_{t+j} \mid \omega_{t}) \tag{3}$$

In the above formula, c denotes the size of the training context window; the larger c is, the higher the
accuracy of the model may be. The Skip-Gram model uses the hierarchical Softmax function to define
p(ωt+j | ωt). Hierarchical Softmax uses a binary tree representation of the output layer with the W words as
its leaves. For each node, the relative probabilities of its sub-nodes are explicitly represented. These define
a random walk that assigns a probability to each word.

Word2vec automatically learns syntactic and semantic information from large-scale unlabeled user reviews,
enabling the characterization of keywords in user reviews. The use of Word2vec to vectorize the short text
information of user reviews is mainly divided into the following two steps:

(i) According to the collected user review text data, a word vector model is trained using Skip-Gram or CBOW,
and each word is expressed as a K-dimensional real-valued vector;

(ii) For the short text of user reviews, the Top-N words that express the emotion of the text are extracted
based on word segmentation using TF-IDF and other algorithms, and then the K-dimensional vector representation
of the extracted Top-N words is looked up in the word vector model.

After obtaining the K-dimensional real-valued vector representation of each keyword, a common method is to take
the weighted average of the keyword vectors as the vector representation of the user review text, in order to
realize the emotional analysis of the review information. This weighted averaging method ignores the influence
of word order
on the affective prediction model. Because the word vector representation based on Word2vec carries out
“semantic analysis” only at the level of individual words, the weighted average of word vectors does not have
the ability of “semantic analysis” of context. Therefore, this paper constructs an emotional computing model
based on word vectors and a long short-term memory (LSTM) network to realize the emotional analysis of user
reviews.

dynamically changing the accumulation at different times when the parameters are fixed. In the LSTM network
structure, the calculation formula of each LSTM unit is as shown in formulas (4) to (9).

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{4}$$

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \tag{5}$$

$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) \tag{6}$$
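The gate computations above can be illustrated with a minimal pure-Python sketch of a single LSTM step. Equations (4) to (6) follow the text; the output gate, the cell-state update and the hidden-state update are not shown in this section, so they are assumed here to follow the standard LSTM formulation. The names `lstm_step`, `affine` and `encode_review`, and the parameter dictionary layout, are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W.v + b, where W is a list of rows and v, b are lists."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step on plain lists; p maps names such as 'W_f' and 'b_f' to weights."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]      # Eq. (4), forget gate
    i_t = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]      # Eq. (5), input gate
    c_hat = [math.tanh(a) for a in affine(p["W_C"], z, p["b_C"])]  # Eq. (6), candidate state
    # Assumption: the remaining steps use the standard LSTM formulation.
    o_t = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]      # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def encode_review(word_vectors, hidden_dim, p):
    """Run the LSTM over a sequence of word vectors and return the last hidden state."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x_t in word_vectors:
        h, c = lstm_step(x_t, h, c, p)
    return h
```

Unlike a weighted average of word vectors, the hidden state returned by `encode_review` depends on the order in which the word vectors are fed in, which is the property the text relies on.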
on real emotions, this paper uses the pre-filtering method based on viewpoints and the embedding method based
on user ratings, respectively, to fuse user ratings and emotional prediction ratings. The former uses the LSTM
network to obtain the predicted score and then computes its weighted sum with the original user score. The
latter combines the LSTM output vector with the user rating information and uses the result as the input of the
last layer to directly output the final comprehensive score.

In the perspective pre-filtering method, the user review text is modeled for emotion analysis by Word2vec and
LSTM, the emotional tendency score score_r of each user's review on the post is predicted, and it is weighted
and summed with the user's original score to obtain a comprehensive score score_c:

$$\mathrm{score}_c = \alpha \, \mathrm{score}_r + 100 (1 - \alpha) \, \mathrm{score}_H \tag{10}$$

In the above formula, score_r represents the user's emotional prediction score for the post review, and score_H
represents the post's authority value in the HITS algorithm. Due to the limited amount of data collected, the
post's authority value is small; in order to increase its impact on the results, it is multiplied by 100. α is
the balance factor between the two scores.

The method based on user rating embedding is also based on the emotional analysis of the user review
information. It combines the obtained LSTM output vector with the user rating information; the result is used
as the input to the last (fully connected) layer, and the final comprehensive emotional score is directly
output via the SoftMax activation function:

$$H_i = h_t \oplus \mathrm{Score}(\mathrm{User}_i) \tag{11}$$

Here H_i is the input to the last layer, formed by combining (⊕) the LSTM output vector h_t with the user's
rating Score(User_i).

paragraph coding vectors and word vectors are accumulated or concatenated as the input of SoftMax in the output
layer. The paragraph code remains unchanged during the training of the text description of the post, and the
semantic information of the entire sentence is integrated every time the word probability is predicted. In the
prediction phase, a new paragraph code is assigned to the description text of the post content while keeping
the parameters of the word vectors and the SoftMax layer unchanged. Finally, the gradient descent method is
used to train on the new post description text until it converges, resulting in a low-dimensional vector
representation of the post content. The distributed representation of the paragraph vector of the post content
is shown below (Fig. 3).

After obtaining the unique d-dimensional distributed vector representation of the post content, the similarity
and distance between any two post contents can be computed. This paper uses the cosine formula to measure the
similarity between two posts, and uses the Mahalanobis distance to calculate the distance between the natural
language descriptions of the two posts. Assume that the paragraph vectors of the natural language descriptions
of the two post contents are PVa = (x11, x12, …, x1d) and PVb = (x21, x22, …, x2d), where d denotes the
dimension of the two paragraph vectors. Then the similarity and distance between them are defined as follows:

$$\mathrm{sim}(PV_a, PV_b) = \frac{PV_a \cdot PV_b}{\|PV_a\|_2 \, \|PV_b\|_2} = \frac{\sum_{i=1}^{d} x_{1i} x_{2i}}{\sqrt{\sum_{i=1}^{d} x_{1i}^{2}} \, \sqrt{\sum_{i=1}^{d} x_{2i}^{2}}} \tag{12}$$
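As a small illustration of Eq. (12), the following sketch computes the cosine similarity between two paragraph vectors and ranks the other posts by it. Ranking the neighbors by cosine similarity alone is a simplification made here; the Mahalanobis distance that the text also uses is not covered. The function names and the dictionary layout of `paragraph_vectors` are hypothetical.

```python
import math


def cosine_similarity(pv_a, pv_b):
    """Cosine similarity of two equal-length vectors, as in Eq. (12)."""
    dot = sum(a * b for a, b in zip(pv_a, pv_b))
    norm_a = math.sqrt(sum(a * a for a in pv_a))
    norm_b = math.sqrt(sum(b * b for b in pv_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def k_most_similar(target_id, paragraph_vectors, k):
    """Return the ids of the k posts most similar to target_id (simplified neighbor search)."""
    target = paragraph_vectors[target_id]
    scored = [
        (cosine_similarity(target, vec), post_id)
        for post_id, vec in paragraph_vectors.items()
        if post_id != target_id
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [post_id for _, post_id in scored[:k]]
```

In practice the paragraph vectors would come from the trained paragraph vector model described above; here they are simply passed in as lists of floats.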
Fig. 3 The distributed representation of the paragraph vector of the post content
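The authority computation of Eqs. (1) and (2) and the score fusion of Eq. (10) can be sketched together as follows. The hub update (a user's hub score as the normalized sum of the authority values of the posts that user forwarded) is the standard HITS step and is an assumption here, since the text only gives the authority update. The function names, the `forwards` mapping and the convergence tolerance are hypothetical.

```python
import math


def hits_authority(forwards, max_iter=100, tol=1e-9):
    """forwards maps each user to the set of posts that user forwarded.

    Returns (authority per post, hub per user) after iterating to convergence.
    """
    users = list(forwards)
    posts = sorted({p for ps in forwards.values() for p in ps})
    hub = {u: 1.0 for u in users}
    auth = {p: 1.0 for p in posts}
    if not posts:
        return auth, hub
    for _ in range(max_iter):
        # Eq. (1): authority of a post = sum of hub scores of users who forwarded it.
        new_auth = {p: sum(hub[u] for u in users if p in forwards[u]) for p in posts}
        # Eq. (2): normalize by the L2 norm over all posts.
        norm = math.sqrt(sum(a * a for a in new_auth.values())) or 1.0
        new_auth = {p: a / norm for p, a in new_auth.items()}
        # Assumed standard HITS hub update, normalized the same way.
        new_hub = {u: sum(new_auth[p] for p in forwards[u]) for u in users}
        hub_norm = math.sqrt(sum(h * h for h in new_hub.values())) or 1.0
        new_hub = {u: h / hub_norm for u, h in new_hub.items()}
        delta = max(abs(new_auth[p] - auth[p]) for p in posts)
        auth, hub = new_auth, new_hub
        if delta < tol:
            break
    return auth, hub


def composite_score(score_r, score_h, alpha):
    """Eq. (10): weighted sum of the emotion score and the scaled authority value."""
    return alpha * score_r + 100.0 * (1.0 - alpha) * score_h
```

For example, with an emotion score of 1.0, an authority value of 0.005 and α = 0.6, `composite_score` gives 0.6 + 100 × 0.4 × 0.005 = 0.8.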
recommendation views is realized based on the cooperative training strategy. In data selection, a data
selection algorithm based on confidence estimation and clustering analysis is used to filter the data, which is
then added to the training data pool of the other classifier for the next round of training, and so on
iteratively.

Hybrid recommendation algorithm based on collaborative training
The hybrid recommendation algorithm based on collaborative training first constructs the initial scoring matrix
based on the users' scoring of the posts. Then the perspective pre-filtering method is used to compute the
composite score and update the scoring matrix. Finally, a hybrid recommendation algorithm based on
collaborative training is designed in which the scoring matrix is cyclically filled and optimized according to
the comprehensive scoring matrix and the vector similarity of the post content descriptions, so as to achieve
recommendation and sorting. The hybrid recommendation algorithm based on collaborative training is shown in
Fig. 4.

In the recommendation model, the score of user u on post p is recorded as $R_u(p)$, which is taken from the
post's authority value in the HITS algorithm. The corresponding scoring matrix is $R_{m \times n}(U, P)$, where
the number of rows m is the number of users and the number of columns n is the number of posts. The post-based
collaborative filtering recommendation model takes as input the user's original scoring matrix
$R_{m \times n}(U, P)$, where $R_u(p) \in [0, 1]$, and the virtual scoring matrix $\vec{R}_{m \times n}(U, P)$
predicted by the emotion analysis model, where $\vec{R}_u(p) \in \{0, 1\}$, 0 means that the user's emotion is
negative and 1 means that the user's emotion is positive; it outputs the data set $D_{train}$. The post-based
collaborative filtering recommendation algorithm is described in Algorithm 1.
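Algorithm 1 itself is not reproduced in this section. As a rough illustration of what filling a sparse scoring matrix with post-based collaborative filtering can look like, the sketch below uses a generic textbook variant: post-post cosine similarity over co-rating users and a similarity-weighted average for prediction. It is not the authors' Algorithm 1, it omits the emotion-based virtual scoring matrix and the data selection step, and all names are hypothetical.

```python
import math


def post_similarity(ratings, p, q):
    """Cosine similarity between posts p and q over users who scored both."""
    common = [u for u in ratings if p in ratings[u] and q in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][p] * ratings[u][q] for u in common)
    norm_p = math.sqrt(sum(ratings[u][p] ** 2 for u in common))
    norm_q = math.sqrt(sum(ratings[u][q] ** 2 for u in common))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def predict(ratings, user, post):
    """Similarity-weighted average of the user's known scores; None if nothing is comparable."""
    numerator = 0.0
    denominator = 0.0
    for other_post, score in ratings[user].items():
        sim = post_similarity(ratings, post, other_post)
        numerator += sim * score
        denominator += abs(sim)
    return numerator / denominator if denominator else None


def fill_missing(ratings, posts):
    """Return a copy of ratings in which missing (user, post) scores are predicted where possible."""
    filled = {u: dict(scores) for u, scores in ratings.items()}
    for u in ratings:
        for p in posts:
            if p not in ratings[u]:
                estimate = predict(ratings, u, p)
                if estimate is not None:
                    filled[u][p] = estimate
    return filled
```

Here `ratings` maps each user to a dictionary of post scores in [0, 1], mirroring the range of $R_u(p)$ given above; existing scores are left unchanged and only missing entries are filled.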
In Algorithm 1, the post-based collaborative filtering recommendation method is used to fill in the missing
values of the user's scoring matrix and update the training data set of user u at the same time. Emotional
classification models are generally divided into fine-grained (5-level classification) and coarse-grained
(2-level classification). Considering that the accuracy of the 2-level emotional classification model is much
higher than that of the 5-level emotional classification model, this paper adopts 2-level emotional
classification in the recommendation algorithm. The user's positive and negative emotions are set to 1 point
and 0 points, respectively. Then, the user's emotional scores and original scores are comprehensively measured
by means of perspective pre-filtering. Finally, the post-based collaborative filtering model is used to predict
and fill the scoring matrix, the data selection algorithm based on confidence estimation and cluster analysis
is used to filter the data, and the incremental data is added to the training data set of user u.

In the content-based description model, the K-nearest neighbor algorithm is used to calculate the distance
between content descriptions, and the cosine similarity of posts and the Mahalanobis distance of the K nearest
neighbor posts are used to update or fill in the user's scores and missing values, which are then used in the
content-based recommendation model for the next iteration. The

Multiple recommendation techniques are mixed within a hybrid recommendation method to compensate for each
other's shortcomings and achieve better recommendations. Different from traditional hybrid recommendation
technologies, such as weighted fusion, mixed recommendation and cascade recommendation, the collaborative
training strategy is used in this paper to construct a hybrid model of post-based collaborative filtering
recommendation (Algorithm 1) and content-based recommendation (Algorithm 2). In each iteration of the
collaborative training model, the calculated comprehensive scoring data is used to train the scoring prediction
model to fill and update the scoring matrix. Then, the content-based model is trained according to the updated
scoring matrix and the content description information of the posts (posts with score ≥ 0.7 and score ≤ 0.3 are
placed in the training pools of posts that the user likes and dislikes, respectively). The matrix is filled and
updated, and it is used as the input of the post-based collaborative filtering recommendation model for the
next training iteration. Compared with weighted fusion hybrid recommendation, which needs to adjust the weight
of each recommendation result, with the difficulty of ranking in mixed recommendation, and with the staged
process of cascaded recommendation, the hybrid recommendation method based on collaborative training proposed
in this paper makes full use
Jiang et al. Journal of Cloud Computing: Advances, Systems and Applications (2020) 9:57 Page 10 of 16
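The threshold rule stated for the collaborative training pools (posts with score ≥ 0.7 go to the pool of liked posts, posts with score ≤ 0.3 to the pool of disliked posts) can be illustrated with a minimal sketch. The function name, the dictionary layout and the sample post identifiers are hypothetical and are not taken from the paper; only the two thresholds come from the text.

```python
# Minimal sketch of the pool-assignment rule; names and data are illustrative.

def split_training_pools(predicted_scores, like=0.7, dislike=0.3):
    """Split posts into 'liked' and 'disliked' training pools by score.

    predicted_scores maps a post id to a score in [0, 1]. Posts whose
    score lies strictly between the two thresholds join neither pool.
    """
    liked = [post for post, score in predicted_scores.items() if score >= like]
    disliked = [post for post, score in predicted_scores.items() if score <= dislike]
    return liked, disliked


# Hypothetical scores for four posts of one user.
scores = {"post_a": 0.91, "post_b": 0.25, "post_c": 0.55, "post_d": 0.70}
liked, disliked = split_training_pools(scores)
print(liked)     # ['post_a', 'post_d']
print(disliked)  # ['post_b']
```

Posts with intermediate scores, such as `post_c` here, are left out of both pools in this sketch, which matches the idea of adding only confidently labeled samples in each iteration.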
Evaluation measures
In order to evaluate the performance of the proposed al-
gorithm, we choose the classical accuracy index in the
recommendation model: mean absolute error (MAE).
For a user u and a post p in our datasets, $r_{up}$ is the actual score of user u on post p, and $\tilde{r}_{up}$ is the predicted score obtained by the algorithm proposed in this paper. T is the number of scores of user u on post p in our datasets. Then the evaluation index MAE in the recommendation model is calculated as follows:

$$\mathrm{MAE} = \frac{\sum_{u,p \in T} \left| r_{up} - \tilde{r}_{up} \right|}{T} \qquad (14)$$
The lower the MAE value is, the higher the accuracy
of the algorithm prediction is.
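As an illustration of Eq. (14), the following minimal sketch computes the MAE over a set of (user, post) pairs. The dictionary layout and the sample values are assumptions made for the example and do not come from the paper’s dataset.

```python
# Minimal sketch of the MAE in Eq. (14); data layout and values are illustrative.

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted scores.

    Both arguments map a (user, post) pair to a score; every pair in
    `actual` must also be present in `predicted`.
    """
    if not actual:
        raise ValueError("at least one rated (user, post) pair is required")
    total = sum(abs(score - predicted[pair]) for pair, score in actual.items())
    return total / len(actual)


# Hypothetical actual and predicted scores for three (user, post) pairs.
actual = {("u1", "p1"): 1.0, ("u1", "p2"): 0.0, ("u2", "p1"): 1.0}
predicted = {("u1", "p1"): 0.8, ("u1", "p2"): 0.3, ("u2", "p1"): 0.6}
mae = mean_absolute_error(actual, predicted)
print(round(mae, 3))  # absolute errors 0.2, 0.3 and 0.4 average to 0.3
```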
Comparative methods
In the experiments of this paper, four classical recommendation algorithms are selected as comparison algorithms for the proposed DMHR algorithm. The performance of each algorithm is evaluated by the performance indicator MAE. In the case study, the performance of the DMHR algorithm is evaluated by the Top N recommendation of some specific instances. The four comparison recommendation algorithms are described in detail as follows:
TDSRec algorithm [25]: It is a collaborative filtering recommendation algorithm that integrates the characteristics of social networks. It obtains the trust and trusted characteristic matrix, and recommends accordingly. It solves the problem of data sparsity in the traditional collaborative filtering algorithm to some extent.
SocialMF algorithm [26]: It is a matrix factorization-based model for recommendation in social rating networks, which introduces the trust delivery mechanism into social recommendation and better reflects the influence of the social network trust relationship on the recommendation.
SVD++ algorithm [24]: It is an improved singular value decomposition (SVD) technique that introduces implicit feedback based on SVD. The user’s historical browsing data and the user’s historical rating data are all used as new parameters.
HFT algorithm [29]: It models user ratings and user reviews by establishing a link between the topic distribution of the user’s reviews and the potential factors of the user or post.

The influence of parameters
Effect of balance factor α
In the DMHR algorithm proposed in this paper, there is an important parameter α, which reflects the weighting of the original user scores and the emotion analysis virtual scores of the user reviews based on the perspective pre-filtering method. The following formula is used to evaluate the emotional tendency of the post:

$$\mathrm{score}_c = \alpha\,\mathrm{score}_r + 100(1 - \alpha)\,\mathrm{score}_H \qquad (15)$$

The larger the value of α is, the greater the weight of the virtual score predicted by the emotion classification model in the comprehensive score. In this experiment, the value of α is set from 0 to 1.0, and the step size is 0.1. The experimental results obtained on our datasets are shown in Fig. 5 below.
As can be seen from the data in Fig. 5, when α = 0.7, the MAE value of the dataset reaches the minimum value. In the perspective pre-filtering method, the value of α represents the weight of the virtual score in the comprehensive score. This shows that the virtual score calculated by the emotion classification model has an important influence on the accuracy of the recommended prediction scoring model. To a certain extent, it also verifies the assumption proposed in this paper that the user’s review information can better reflect the user’s real interest preferences. In order to reduce the estimation bias of noise data to the score prediction, the weighted synthesis of the original user score and the user’s commentary emotional score based on the perspective pre-filtering method can be used to solve the problem of large deviation between the user’s original score and the real interest preference.

The influence of the number of neighboring posts K
In this paper, the collaborative training strategy is used to fuse user score data and post content description information to construct a hybrid recommendation system. In the post content recommendation model, the KNN algorithm is used to calculate the distance of the post content description, and the cosine similarity is used to measure the similarity of the post content description, so as to update or fill in the user’s score and default value by using the scores of the K nearest neighbor posts. Finally, experiments have shown that choosing the appropriate K value has an important impact on the final recommendation. In this experiment, the value of K is set from 10 to 100, and the step size is 10. The experimental results obtained on our datasets are shown in Fig. 6 below.
As can be seen from the data in Fig. 6, the MAE value of the dataset reaches the minimum value when K = 60. Subsequently, as the K value continues to increase, the MAE value of the model also increases, indicating that the effect of the recommended model is worse. It is concluded that the recommended effect of the model on the dataset has a greater relationship with the value of the nearest neighbor number K. However, the MAE accuracy of the recommended model is not particularly sensitive to the K value, and the relatively ideal MAE accuracy can be obtained within a certain range when the value of K is large. In this experiment, it is better to choose K in the range of [50, 70]. Therefore, K = 60 is selected as the parameter of the DMHR algorithm when using the KNN algorithm to calculate the content description of similar posts.

Iterations of emotion classification model N
In order to take account of the interaction of user ratings and review information on real emotions, this paper uses the pre-filtering method based on viewpoints and the embedding method based on user ratings to fuse user ratings and emotional prediction ratings respectively. The former uses the LSTM network to get the prediction score, and then weights the sum with the original user score. The LSTM network vector with the user rating information is combined with the method based on user score embedding, and the result is used as the input of the last layer to directly output the final comprehensive score. In order to show the performance of the emotion classification model trained by the LSTM algorithm more clearly, we compared the accuracy of the model in different iterations. Set N = {1,10,20,30, …, 100} respectively, and the 10-
fold cross-validation method is used to evaluate the dataset. The detailed emotional classification model performance indicators are shown in Fig. 7 below.
As can be seen from the data in Fig. 7, in the case of the same parameter settings, the accuracy of the method based on user score embedding reaches the maximum value 92.1% when N = 20. With the further increase of the number of iterations, the accuracy of the model fluctuates above 90%, which indicates that the performance of the emotion classification model trained by the LSTM algorithm is relatively stable. When N = 20, the emotion classification model can achieve the best results to ensure the effect of subsequent experiments.

Result analysis
In this experiment, MAE is used as the evaluation index to measure the experimental effect of the various recommendation algorithms. On the same data set, the TDSRec algorithm, the SocialMF algorithm, the SVD++ algorithm, the HFT algorithm and the DMHR algorithm proposed in this paper are used for comparison experiments. In the DMHR algorithm, the comprehensive scoring result of the perspective pre-filtering method is adopted, and the parameters are set to α = 0.7 and K = 60, the values at which the best result is obtained. For each of the other recommendation algorithms, the parameters are also set to the values at which the best results are obtained. The specific recommendation results are shown below (Fig. 8).
As can be seen from the data in the above figure, overall, the DMHR algorithm proposed in this paper is superior to the other four classic recommendation algorithms in the MAE evaluation indicator. The SocialMF algorithm has the worst overall performance, because the algorithm only introduces the trust delivery mechanism into the recommendation model and cannot achieve good results on the dataset of the social network. The overall performance of the TDSRec algorithm is similar to that of the SVD++ algorithm, and it is significantly improved compared with the SocialMF algorithm. This is mainly because the two algorithms add the trust and trust characteristic matrix and user history data to the model respectively, which can improve the performance of the recommendation model. It shows that it is feasible to use auxiliary information such as user reviews to improve the recommendation effect. However, the review information and interest preference are not always positively correlated. The fusion of multiple recommendation views does not always improve the performance of the model. If some unreliable recommendation factors are introduced into the model, they will have a negative effect on the performance of the system.
Based on the above analysis data, the DMHR algorithm proposed in this paper has a significant improvement on the MAE evaluation index compared with the traditional algorithms. This indicates that the prediction accuracy of the recommendation model is related to the real user score, and that using the perspective pre-filtering method to fuse the virtual score into the user’s comprehensive score can effectively improve the user’s scoring accuracy, which ultimately affects the recommendation model’s scoring prediction accuracy. In addition, the cold start issue is one of the most interesting issues in the recommendation scenario; however, few records (including ratings and reviews) are considered relevant to the “cold start”. The DMHR algorithm proposed in this paper combines post-based collaborative filtering recommendation and post content-based recommendation. The recommendation factor incorporates the emotional tendency of user reviews and the semantic information of the natural language description of post content. Moreover, the data preprocessing method is used to process the messy data obtained from social networks and eliminates most of the low-quality users and posts, which ensures better efficiency in subsequent processing. Theoretically, this auxiliary information will help to solve the cold start and sparse data problem to a certain extent.

Table 2 Authorities of the posts
Post                        Authority
668,946,589,063,364,608     0.001194922
668,943,987,470,811,136     0.001194922
668,946,589,063,364,608     0.001194922
668,943,987,470,811,136     0.001194922
681,697,568,456,192,001     0.000636248
681,693,469,564,383,232     0.000636248
681,697,568,456,192,001     0.000636248
681,695,337,304,702,976     0.000545355
681,695,337,304,702,976     0.000545355
681,697,928,268,910,593     0.000545355

Table 3 Hubs of the users
User                        Hub
339,283,603                 0.003429355
1,679,619,506               0.003233392
3,693,887,599               0.003135411
3,628,926,974               0.002645503
933,364,430                 0.002253576
4,068,440,360               0.001959632
1,000,421,510               0.00186165
3,254,047,099               0.001665687
2,310,175,028               0.001567705
2,168,821,905               0.001567705
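Tables 2 and 3 report authority scores for posts and hub scores for users, which are the two outputs of the HITS algorithm used for data preprocessing. The following is a minimal sketch of the standard HITS iteration on a user-post interaction matrix. The matrix layout, the sum-to-one normalization, the fixed iteration count and the toy data are assumptions made for this example; the paper’s exact graph construction is not described in this section.

```python
# Minimal sketch of HITS on a user-post graph; layout and data are illustrative.

def hits(adjacency, iterations=50):
    """Return (hub scores of users, authority scores of posts).

    adjacency[u][p] is 1 if user u interacted with post p, else 0.
    The matrix must contain at least one interaction. Scores are
    rescaled to sum to 1 after every update.
    """
    n_users, n_posts = len(adjacency), len(adjacency[0])
    hubs = [1.0] * n_users
    auths = [1.0] * n_posts
    for _ in range(iterations):
        # A post is authoritative if good hubs (users) point to it.
        auths = [sum(adjacency[u][p] * hubs[u] for u in range(n_users))
                 for p in range(n_posts)]
        total = sum(auths)
        auths = [a / total for a in auths]
        # A user is a good hub if they point to authoritative posts.
        hubs = [sum(adjacency[u][p] * auths[p] for p in range(n_posts))
                for u in range(n_users)]
        total = sum(hubs)
        hubs = [h / total for h in hubs]
    return hubs, auths


# Toy graph: 3 users (rows) and 3 posts (columns).
interactions = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
hubs, auths = hits(interactions)
print(auths.index(max(auths)))  # 0: the post every user interacted with
print(hubs.index(max(hubs)))    # 2: the user who interacted with every post
```

Users and posts with the lowest scores would then be candidates for removal in a preprocessing step of this kind; the cutoff is a design choice not specified here.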
Case study
In order to evaluate the performance of the proposed model, this paper uses the leave-one-out method, which has been widely used in the literature [38]. In this case study, Top-N recommendation is used to verify the effectiveness of the algorithm. The experiment selects, as candidate posts, 100 posts that have not been rated by the user and are most similar to the posts that the user likes. The 100 posts have been manually sorted based on the user’s attention and the timeliness of the post. We take this sorting result as the real result and compare it with the recommended results obtained by the various algorithms mentioned above.
The hit ratio (HR) is used to evaluate the performance of the model:

$$\mathrm{HR@}N = \frac{\mathrm{Num@}N}{T\mathrm{@}N} \qquad (16)$$

where T@N represents the number of candidate test sets, and Num@N represents the number of Top-N posts obtained by the algorithm mentioned above that are among the Top-N recommended posts of the real result; ranking is not considered here. The final result is shown in the figure below (Fig. 9).
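The hit ratio in Eq. (16) can be illustrated with a minimal sketch for a single user. Here the denominator is taken to be the size of the reference Top-N list, which is one possible reading of T@N and is an assumption of this example; the function name and the sample post identifiers are likewise hypothetical.

```python
# Minimal sketch of HR@N for one user; the reading of T@N is an assumption.

def hit_ratio_at_n(recommended, reference, n):
    """Share of the reference Top-N posts that the algorithm also ranks in its Top-N.

    Both lists are ordered best-first; order inside the Top-N is ignored.
    """
    top_recommended = set(recommended[:n])
    top_reference = set(reference[:n])
    if not top_reference:
        raise ValueError("the reference list must not be empty")
    return len(top_recommended & top_reference) / len(top_reference)


# Hypothetical manually sorted reference list and an algorithm's ranking.
reference = ["p1", "p2", "p3", "p4", "p5"]
recommended = ["p2", "p9", "p1", "p7", "p3"]
hr3 = hit_ratio_at_n(recommended, reference, 3)
print(round(hr3, 3))  # p1 and p2 are hits among the top 3, so 2/3
```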
It can be seen from the experimental results that the proposed DMHR algorithm and the HFT algorithm achieve better performance than the other algorithms in the HR@N evaluation index. Since these two algorithms mine user comment information and use the description text information of the post content in the collaborative training model, they can better overcome the cold start problem of the recommendation system. Therefore, they have achieved good results in reflecting the recall rate performance of the recommendation system (HR@N). In contrast, other algorithms such as the TDSRec algorithm, the SocialMF algorithm and the SVD++ algorithm only use the traditional collaborative filtering algorithm based on posts and use the topic distribution information to build the model, so they cannot get good recommendations. This also proves the feasibility of the idea, proposed in this paper, of building a hybrid recommendation system by integrating multiple recommendation views. Compared with the HFT algorithm, the DMHR algorithm uses the method of perspective pre-filtering to calculate the user’s comprehensive rating of the post, and adds the time dimension to the user’s review. The more recent the review, the larger the weighting factor; the older the review, the smaller the weighting factor. The time factor of the recommended post is also considered when the Top-N recommendation ranking is performed. Experiments show that considering the time span in the recommendation process has an important impact on the final recommendation results.

Conclusion and future work
The recommendation system is an effective tool for solving information overload, and it has received much attention in current academic and industrial circles. This paper proposes a hybrid recommendation model based on deep emotion analysis and multi-source view fusion. Based on the analysis of user behavior preferences, it focuses on emotional mining and deep semantic analysis of text information. The natural language description information of the post content is mined and, combined with the collaborative training strategy in semi-supervised learning, the post-based collaborative filtering recommendation view and the content-based recommendation view are combined to build a hybrid recommendation system. Because the method adopts the collaborative filtering model, it can effectively solve the problem that the user’s original score deviates from the real interest preference in the recommendation system and that the score distribution is extremely uneven. Since the DMHR recommendation algorithm proposed in this paper takes into account the content information of the post, it effectively solves the cold start problem of the recommendation system and improves the recall rate of the recommendation system. Furthermore, in terms of the recommendation effect, the experimental results show that the accuracy of the DMHR algorithm proposed in this paper has been significantly improved compared with existing methods, and the problem of cold start has also been solved to some extent.
In the next step, future research can consider the impact of user preferences over time, reviews on text emotions, weights of potential features, and social relationships on recommendations. In addition, the DMHR model can be applied to group recommendation, friend relationship recommendation and other issues in future work.

Acknowledgements
Not applicable.

Authors’ contributions
Liang Jiang, Lu Liu, Jingjing Yao, and Leilei Shi developed the idea of the study, participated in its design and coordination and helped to draft the manuscript. Liang Jiang and Leilei Shi contributed to the acquisition and interpretation of data. Lu Liu provided critical review and substantially revised the manuscript. All authors read and approved the final manuscript.

Authors’ information
LIANG JIANG received the B.S. degree from the Nanjing University of Posts and Telecommunications, China, in 2007, and the M.S. degree from Jiangsu University, Zhenjiang, China, in 2011, where he is currently pursuing the Ph.D. degree with the School of Computer Science and Telecommunication Engineering. His research interests include OSNs, computer networks, and network security. LU LIU received the M.S. degree from Brunel University and the Ph.D. degree from the University of Surrey. He is currently a Professor of Distributed Computing with the University of Leicester, U.K. His research interests are in areas of cloud computing, social computing, service-oriented computing, and peer-to-peer computing. Prof. Liu is a fellow of the British Computer Society. JINGJING YAO received the B.E. degree from Jiangsu University, Zhenjiang, China, in 2011, and the D.M. degree from Jiangsu University, Zhenjiang, China, in 2016. Her research interests include complex network, information dissemination. LEILEI SHI received the B.S. degree from Nantong University, Nantong, China, in 2012, and the M.S. degree from Jiangsu University, Zhenjiang, China, in 2015, where he is currently pursuing the Ph.D. degree with the School of Computer Science and Telecommunication Engineering. His research interests include event detection, data mining, social computing, and cloud computing.

Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 71701082, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20170069, in part by the U.K.–Jiangsu 20–20
World Class University Initiative Programme, in part by the U.K.–China Knowledge Economy Education Partnership, in part by the Postgraduate Research and Practice Innovation Program of Jiangsu Province under Grant KYCX17_1808, and in part by Natural Science Research Projects of Jiangsu Higher Education Institutions under Grant 19KJB520027.

Availability of data and materials
The datasets used or analysed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare no conflict of interest.

Author details
1 School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang, China. 2 Jiangsu Key Laboratory of Security Technology for Industrial Cyberspace, Jiangsu University, Zhenjiang, China. 3 School of Informatics, University of Leicester, Leicester, UK. 4 School of Economy and Finance, Jiangsu University, Zhenjiang, China.

Received: 15 January 2020 Accepted: 8 September 2020

References
1. Daud NN, Ab Hamid SH, Saadoon M, Sahran F, Anuar NB (2020) Applications of link prediction in social networks: a review. J Netw Comput Appl 166:102716
2. Yi B et al (2019) Deep matrix factorization with implicit feedback embedding for recommendation system. IEEE Transactions Industrial Informatics 15(8):4591–4601
3. Kant S, Mahara T (2018) Nearest biclusters collaborative filtering framework with fusion. J Comput Sci 25:204–212
4. Salawu S, He Y, Lumsden J (2020) Approaches to automated detection of Cyberbullying: a survey. IEEE Trans Affect Comput 11(1):3–24
5. Wei J, He J, Chen K, Zhou Y, Tang Z (2017) Collaborative filtering and deep learning based recommendation system for cold start items. Expert Syst Appl 69:29–39
6. Nguyen V-D, Sriboonchitta S, Huynh V-N (2017) Using community preference for overcoming sparsity and cold-start problems in collaborative filtering system offering soft ratings. Electron Commer Res Appl 26:101–108
7. Shi L, Liu L, Wu Y, Jiang L, Hardy J (2017) Event detection and user interest discovering in social media data streams. IEEE Access 5:20953–20964
8. Gu K, Fan Y, Di Z (2020) How to predict recommendation lists that users do not like. Physica A: Statistical Mechanics and its Applications 537:122684
9. Yuan W, Wang H, Yu X, Liu N, Li Z (2020) Attention-based context-aware sequential recommendation model. Inf Sci 510:122–134
10. Shi L, Liu L, Wu Y, Jiang L, Panneerselvam J, Crole R (2019) A social sensing model for event detection and user influence discovering in social media data streams. IEEE Transactions on Computational Social Systems:1–10
11. Yu S, Yang M, Qu Q, Shen Y (2019) Contextual-boosted deep neural collaborative filtering model for interpretable recommendation. Expert Syst Appl 136:365–375
12. Liu H et al (2020) Hybrid neural recommendation with joint deep representation learning of ratings and reviews. Neurocomputing 374:77–85
13. Xiao H, Chen Y, Shi X, Xu G (2019) Multi-perspective neural architecture for recommendation system. Neural Netw 118:280–288
14. Rosa RL, Schwartz GM, Ruggiero WV, Rodríguez DZ (2019) A knowledge-based recommendation system that includes sentiment analysis and deep learning. IEEE Transactions on Industrial Informatics 15(4):2124–2135
15. Shi L et al (2019) Human-centric cyber social computing model for hot-event detection and propagation. IEEE Transactions on Computational Social Systems 6(5):1042–1050
16. Sanz-Cruzado J, Castells P, Macdonald C, Ounis I (2020) Effective contact recommendation in social networks by adaptation of information retrieval models. Information Processing & Management 57(5):102285
17. Almaghrabi M, Chetty G (2018) A deep learning based collaborative neural network framework for recommender system. In: 2018 International Conference on Machine Learning and Data Engineering (iCMLDE), Los Alamitos, pp 121–127
18. Chouchani N, Abed M (2020) Enhance sentiment analysis on social networks with social influence analytics. J Ambient Intell Humaniz Comput 11(1):139–149
19. Phan HT, Tran VC, Nguyen NT, Hwang D (2020) Improving the performance of sentiment analysis of tweets containing fuzzy sentiment using the feature ensemble model. IEEE Access 8:14630–14641
20. Chung W, Zeng D (2020) Dissecting emotion and user influence in social media communities: an interaction modeling approach. Information Management 57(1):103108
21. Shi C, Hu B, Zhao WX, Yu PS (2019) Heterogeneous information network embedding for recommendation. IEEE Trans Knowl Data Eng 31(2):357–370
22. Deng X, Zhuang F, Zhu Z (2019) Neural variational collaborative filtering with side information for top-K recommendation. Int J Machine Learning Cybernetics 10(11):3273–3284
23. Yan Y, Tan M, Tsang I, Yang Y, Shi Q, Zhang C (2020) Fast and low memory cost matrix factorization: algorithm, analysis and case study. IEEE Trans Knowl Data Eng 32(2):288–301
24. Koren Y (2008) Factorization meets the neighborhood: a multifaceted collaborative filtering model. Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, Las Vegas, Nevada, pp 426–434
25. Guo N, Wang B, Hou Y (2018) Collaborative filtering recommendation algorithm based on characteristics of social network. J Front Computer Science and Technology 12(2):208–217
26. Forsati R, Mahdavi M, Shamsfard M, Sarwat M (2014) Matrix factorization with explicit trust and distrust side information for improved social recommendation. ACM Trans Inf Syst 32(4):1–38
27. Feng Y, Zhou P, Wu D, Hu Y (2018) Accurate content push for content-centric social networks: a big data support online learning approach. IEEE Transactions on Emerging Topics in Computational Intelligence 2(6):426–438
28. Zhao G, Lei X, Qian X, Mei T (2019) Exploring Users' internal influence from reviews for social recommendation. IEEE Transactions on Multimedia 21(3):771–781
29. McAuley J, Leskovec J (2013) Hidden factors and hidden topics: understanding rating dimensions with review text. Proceedings of the 7th ACM Conference on Recommender Systems, Hong Kong, China, pp 165–172
30. Bao Y, Fang H, Zhang J (2014) TopicMF: simultaneously exploiting ratings and reviews for recommendation. Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, Québec, Canada:2–8
31. Ding J, Yu G, He X et al (2018) Improving implicit recommender systems with view data. Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, pp 3343–3349
32. Jiang L, Shi L, Liu L, Yao J, Yuan B, Zheng Y (2019) An efficient evolutionary user interest community discovery model in dynamic social networks for internet of people. IEEE Internet Things J 6(6):9226–9236
33. Pero S, Horvath T (2013) Opinion-driven matrix factorization for rating prediction. Proceedings of the 21st International Conference on User Modeling, Adaptation, and Personalization, Rome, Italy, pp 1–13
34. Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, Lake Tahoe, Nevada, pp 3111–3119
35. Yu T, Hui H, Zhang WZ, Jia Y (2018) Automatic generation of review content in specific domain of social network based on RNN. In: 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC), Los Alamitos, CA, USA, pp 601–608
36. Shi M, Y T, Liu J (2019) Functional and contextual attention-based LSTM for service recommendation in Mashup creation. IEEE Transactions on Parallel and Distributed Systems 30(5):1077–1090
37. Le Q, Mikolov T (2014) Distributed representations of sentences and documents. Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, Beijing, China, pp 1188–1196
38. Gantner Z, Rendle S, Freudenthaler C et al (2011) MyMediaLite: a free recommender system library. Proceedings of the 5th ACM Conference on Recommender Systems, Chicago, USA, pp 305–308

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.