Complexity
Volume 2022, Article ID 5217491, 9 pages
https://doi.org/10.1155/2022/5217491
Research Article
A Deep Neural Network-Based Approach for Sentiment Analysis of
Movie Reviews
Copyright © 2022 Kifayat Ullah et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The number of comments/reviews for movies is enormous and cannot be processed manually. Therefore, machine learning techniques are used to process users' opinions efficiently. This research work proposes a deep neural network with seven layers for the sentiment analysis of movie reviews. The model consists of an input layer, called the embedding layer, which represents the dataset as a sequence of numbers called vectors; two consecutive layers of 1D-CNN (one-dimensional convolutional neural network) for extracting features; a global max-pooling layer to reduce dimensionality; a dense layer for classification; a dropout layer to reduce overfitting and improve the generalization error of the neural network; and a fully connected layer as the last layer to predict between the two classes. Two movie review datasets, widely accepted by the research community, are used. The first dataset contains 25,000 samples, half positive and half negative, whereas the second dataset contains 50,000 samples of movie reviews. Our neural network model performs binary sentiment classification, distinguishing positive from negative movie reviews. The model achieves 92% accuracy on both datasets, which is more accurate than traditional machine learning models.
To solve the problem of sentiment analysis, three methods are used: lexicon-based, machine learning, and hybrid methods [4].

Lexicon-based techniques use either dictionary-based or corpus-based techniques. Dictionary-based methods use dictionary terms, like SentiWordNet and WordNet, and corpus-based approaches use statistical analysis [4].

Machine learning is used to learn automatically from labeled or unlabeled data, called supervised or unsupervised learning, respectively, while a hybrid approach is a combination of both. Machine learning is further divided into traditional machine learning and deep learning.

Some of the most popular traditional machine learning techniques are SVM (Support Vector Machine), Decision Tree, NB (Naïve Bayes), and RF (Random Forest). Deep learning mimics the working of the human brain to process data and create patterns, which are then used in the decision-making process. A deep learning approach can learn automatically and improve from experience without any explicit programming.

This research work proposes a seven-layer deep neural network model for larger datasets. First, the data is converted into word vectors. According to [5], Word2Vec is one of the most powerful word-embedding techniques, which Keras supports through its embedding layer. We use two convolutional layers: the first 1D convolutional layer extracts features from the input data to produce a feature map, and the second convolutional layer summarizes the features selected by the first. The global max-pooling layer reduces the resolution of the output features and prevents overfitting of the data [8]. The dropout layer is used to address generalization and overfitting; it randomly drops units from the network, temporarily removing them along with all their outgoing and incoming connections [6]. The dense layer uses the loss function on the trained dataset to classify the input features as positive or negative.

The remainder of the paper is organized as follows: Section 1.1 introduces the related research work about movie reviews, machine learning, and sentiment analysis; Section 2 describes the proposed seven-layer deep neural network model; Section 3 describes the experiments and results of our proposed model; Section 4 presents discussions; and Section 5 concludes the research work.

1.1. Related Works. Reference [7] proposed a deep neural network model for sentiment analysis applied to comments on YouTube videos written in Brazilian Portuguese. The model has six layers and achieves an accuracy of 60% to 84% [7].

Reference [6] showed that overfitting is a severe problem in deep neural networks. A deep neural network with many parameters is a powerful but slow machine. The problem of overfitting is handled using the dropout technique, and that paper shows that dropout improves the neural network's performance in supervised learning.

There are certain challenges in processing natural languages. In recent years, it has been observed that neural networks are a favorable solution to the challenges of natural language processing [8]. Moreover, according to [9, 10], a deep learning approach with two hidden layers is more accurate than one with a single hidden layer.

Reference [11] categorized sentiment analysis into three levels. The first is the document level, which is used to categorize a whole document as negative or positive. The second is the sentence level, which classifies individual sentences. The third is the aspect or feature level, which classifies the document or sentence based on some aspect.

Reference [12] proposed an unsupervised method in which the WordNet dictionary is used to determine opinion words and their antonyms and synonyms. This method is applied to movie reviews, classifying each document as negative, positive, or neutral. The model applies a POS tagger to the collected reviews, which tags all words of the document. Reference [12] prepared a seed list that holds some opinion words and their polarity; WordNet is used to find a word's synonyms if the extracted word is not in the seed list. The results concluded that the proposed model achieves 63% accuracy on document classification using movie reviews.

In [13], the proposed SLCABG model combines the advantages of deep learning and a sentiment lexicon. To enhance the sentiment features within the review, a sentiment lexicon is used at the first stage of the model, and then a convolutional neural network and a GRU (Gated Recurrent Unit) are used. These layers extract sentiment features and then use the attention mechanism to weigh them.

References [14, 15] presented the key challenges faced in sentiment analysis. Reference [16] presented the two most essential comparisons of sentiment analysis. First, it discussed the comparison between sentiment review structure and analysis challenges, which reveals that the challenges faced in sentiment analysis are domain-dependent. Reference [16] shows that the negation challenge is very common in all types of reviews where there is a minor difference between explicit and implicit meaning; it also concluded that the nature and structure of reviews present a considerable challenge. Second, it examined the significance of sentiment analysis challenges for improving accuracy. Another hot area of research is the theoretical side of sentiment analysis. These two comparisons are collected from 47 research studies [16].

The WHO (World Health Organization) declared a pandemic on 11th March 2020 because of COVID-19 (Coronavirus Disease of 2019). As a result, a significant amount of pressure mounted on each country to assess the cases of COVID-19 and efficiently utilize the available resources, as there was fear, panic, and anxiety while the number of cases increased [17]. More than 24 million people had tested positive worldwide as of 27th August 2020 [17].

Reference [17] extracted facts from tweets related to WHO and COVID-19 and believes that the World Health Organization is ineffectual in guiding the public. The authors therein described that two types of tweets were analyzed.
First, they gathered around 23,000 tweets from 01-01-2019 to 23-03-2020; these tweets were analyzed, and it was concluded that most of them conveyed negative or neutral sentiments. The second dataset, collected from December 2019 to May 2020, contained about 226,668 tweets; its analysis concluded that most of the tweets conveyed positive or neutral sentiments [17]. The authors claimed that people had posted mostly positive tweets. This claim was then validated using a deep neural network classifier with 81% accuracy.

Reference [18] focused on multilingual text data from social media to discover the concentration of extremism in sentiment. The authors therein used four classes: neutral, moderate, low extreme, and high extreme. Reference [18] extracted Urdu data from different sources, which was then authenticated by Urdu domain professionals, achieving 88% accuracy. Naïve Bayes and SVM were applied for classification purposes and achieved 82% accuracy.

According to [19], data encoding has a vital role in convolutional neural network training. One of the most popular and simplest encoding methods is one-hot encoding, but when the dataset becomes large, the data does not spread the full label set. In one-hot encoding, words are mutually independent, so the relationships between words are lost, and when a larger dataset is used, the dimension of the word vectors becomes very large.

Reference [19] made two contributions. First, the authors therein showed that random projections are an effective tool for calculating embeddings of lower dimensions. Second, they proposed a normalized eigenrepresentation of the class, which encodes targets with minimal information loss and improves the accuracy of random projections with the same convergence rates.

Reference [20] used an artificial neural network trained on a movie review database with two large lists of negative and positive words. This model achieved 91% accuracy on training and 86.67% accuracy on validation.

The study performed by [6] showed that overfitting is a severe problem in deep neural networks. A deep neural network with an enormous number of parameters is a powerful but slow machine. That work addresses the issue of overfitting, where a model that has learned too much performs well on the training dataset but poorly on new data. These problems are handled using the dropout technique, and the paper demonstrates that the neural network performs better when dropout is used in supervised learning. The dropout value may be between 20% and 50%; too small a value has minimal effect, and too large a value results in underlearning by the network.

CNN is the most famous technique in computer vision; however, recent studies show that convolutional neural networks also perform well on natural language processing. Reference [21] presented Facebook fastText word embeddings to represent words for sentiment analysis instead of conventional word embeddings and trained a convolutional neural network.

Reference [22] proposed a deep neural network model and conducted experiments using different numbers of CNN layers and different numbers and sizes of filters. Reference [22] also performed sentiment analysis of Hindi movie reviews; 50% of the dataset was used for training, and the other 50% was used for testing the model. The model proposed by [22] achieved 95% accuracy and performed better than traditional machine learning techniques.

The model proposed by [22] comprises four layers: an embedding layer, a convolutional layer, a global max-pooling layer, and a dense layer. The first layer, the embedding layer, represents a sequence of words as vectors, where the dimension must be less than the vocabulary size. Reference [22] used Word2Vec to capture semantic properties; the trained model maps a word to its corresponding vector representation, and softmax probability is used to calculate high-dimensional vectors for every word. Every hidden neuron accepts a one-word vector, which means that there must be a hidden neuron in the model for every word vector.

Reference [22] used a convolutional layer having "m" kernels/filters applied to a frame of size "h" over every sentence. These "m" kernels operate in parallel, generating several features. The feature map produced by the convolutional layer is given to the global max-pooling layer, which samples these features and generates the local optimum. This layer aggregates information and reduces the representation.

In [22], the experimental results claimed that the CNN with two convolutional layers and kernel sizes of 4 and 3 performs better, achieving 95.4% accuracy, whereas the CNN with three convolutional layers achieves 93.44% accuracy. When the number of convolutional layers and the kernel sizes increase, the training time also increases.

Reference [8] reviewed the latest studies and showed that deep learning solves problems of sentiment analysis and natural language processing. RNN, DNN, and CNN models were applied using TF-IDF and word embeddings to many datasets. Word embeddings perform well compared with TF-IDF, and the convolutional neural network outperforms the other models, presenting a good balance between CPU runtime and accuracy.

Reference [23] showed that deep learning is better for sentiment analysis. That paper explains the sigmoid function and how weights are learned in neural networks, and a particular convolutional layer and pooling operations are used.

2. Methods

Alexandre Cunha proposed a six-layer neural network model for sentiment analysis to mine features and classify comments [7]. However, he recommended that further work was needed to make this model efficient for analyzing large datasets (Figure 1).

In this research work, to check the performance of [7], we implemented the same neural network model in Python and applied it to larger movie review datasets. Unfortunately, the model does not give satisfactory results.
Figure 1: Workflow of the approach: the data is preprocessed and vectorized, split into training (75%) and testing (25%) sets, and used for model training, model testing, and evaluation.
So we have proposed a seven-layer deep neural network model for larger datasets. Furthermore, in this research work, one more hidden layer (a one-dimensional global max-pooling layer) was added with default parameters, and some parameters were changed in the model proposed by Alexandre Cunha, which increased the model's accuracy.

Our proposed model has seven layers. The first layer is the embedding layer, followed by two consecutive convolutional layers, a global max-pooling layer, a fully connected layer, a dropout layer, and a dense layer.

2.1. Embedding Layer. The embedding layer is the first layer of the model and is provided with labeled data. The embedding layer requires that the dataset be cleaned. The datasets are prepared so that one-hot encoding is generated for each word, and the size of the vector space must be specified as part of the model; we used 256 dimensions. The word embedding vectors are initialized in the first step with small random vectors. One option is one-hot encoding to map words into word vectors, but in one-hot encoding there is no relationship between words, as each word is independent [19]. Furthermore, the dimension of the word vectors will be very large if the number of words is large. Therefore, researchers have proposed encoding words into dense vectors to solve the problems of one-hot encoding [24]. In this study, word embedding is used to convert words into vectors. The vocabulary size is restricted to the top 78,000 most common words for dataset I and 108,000 for dataset II. 256 dimensions are used, and each review is cut off after 1200 words, a value selected from the experimental results of Table 1. The weights of the embedding layer are initialized randomly from the dictionary created from the datasets, and these vectors are then adjusted through backpropagation [25].

2.2. Convolution Layer. In the convolutional layer, the name "convolution" refers to extracting features from the input data using filters. In a one-dimensional convolutional layer (1D-CNN), features are extracted from the input data to produce a feature map using a filter or kernel. The convolutional layer is part of a feed-forward deep neural network primarily applicable in computer vision, natural language processing, and recommender systems [8]. The objective of a convolutional layer is to extract the most significant features from the input [26]. We used convolutional layers to effectively extract important features using fewer neurons compared with a dense layer [7]. More convolutional layers extract more features from the input vectors; therefore, we have used two convolutional layers, which have 128 and 64 filters, respectively. We have used 1D convolutional layers, which summarize along one dimension; their advantage is that they automatically detect important features without any human supervision. The second convolutional layer summarizes the features selected by the first convolutional layer because the filter size is reduced in this layer.

2.3. Global Max-Pooling Layer. The global max-pooling layer reduces the resolution of features in the output and prevents data overfitting [8]. Furthermore, according to [23], a pooling layer is used to decrease dimensionality by reducing the number of factors and hence shortening the execution time. We use the global max-pooling 1D layer. In this layer, the dimensionality of the features is decreased without losing key information: the features generated by the convolutional layer are summarized, and the features of the global region are presented in a feature map. The next layer performs operations on the summarized features, and the model can still identify input features even if their positions vary. Pooling operations are divided into max-pooling and average pooling. In the max-pooling operation, the maximum elements are selected from the region of the feature map covered by the filter. In contrast, global max-pooling gives a single value by reducing each channel in the feature map.
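To make the layer behavior of Sections 2.1-2.3 concrete, the following minimal sketch (our own illustration using tf.keras; the ReLU activations and the default "valid" padding are assumptions, as the paper does not state them) traces the tensor shapes for the dataset-I settings:

```python
import numpy as np
import tensorflow as tf

# Toy batch: 2 reviews, each padded/truncated to 1200 token ids
# drawn from the top-78,000-word vocabulary (dataset-I settings).
x = np.random.randint(0, 78000, size=(2, 1200))

embed = tf.keras.layers.Embedding(input_dim=78000, output_dim=256)  # word id -> 256-d vector
conv1 = tf.keras.layers.Conv1D(128, 7, activation="relu")  # 128 filters, kernel size 7
conv2 = tf.keras.layers.Conv1D(64, 5, activation="relu")   # 64 filters, kernel size 5
pool = tf.keras.layers.GlobalMaxPooling1D()                # one maximum per channel

h = embed(x)  # (2, 1200, 256)
h = conv1(h)  # (2, 1194, 128): 1200 - 7 + 1 positions
h = conv2(h)  # (2, 1190, 64): 1194 - 5 + 1 positions
z = pool(h)   # (2, 64): each of the 64 feature maps reduced to its single maximum
print(z.shape)
```

Reducing each feature map to its single maximum is what makes the pooled representation insensitive to where in the review a feature occurs.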
2.4. Dropout Layer. Multiple nonlinear hidden layers in deep neural networks make the model more expressive, able to learn the complicated relationship between input and output [6].

Generalization and overfitting are two significant problems in neural networks. The model's ability to perform well on new data is called generalization. In contrast, when a model is trained on a large dataset and works well on that data but performs poorly on the test dataset, this problem is called overfitting.

Dropout is a technique that effectively addresses both problems and efficiently combines many different neural network architectures [6]. Dropout means dropping visible or hidden units from the network: the units are temporarily removed, along with all their outgoing and incoming connections.

Reference [6] proposed that a typical dropout value for hidden units lies in the range of 0.5 to 0.8, because a shallow value has a very slight effect on the model and a very high value results in underlearning by the neural network. We use the dropout layer with a dropout rate of 0.3, which we obtained from the experimental results in Table 2. When the dropout layer is used in the neural network, its neurons become less sensitive to specific weights, which results in a neural network that is less likely to overfit the training data and generalizes better.

Table 2: Effect of dropout value.
Dropout    Accuracy (%)    Precision (%)    Recall (%)    F1-Score (%)
0.3        90.23           90.88            89.91         89.78
0.4        89.81           90.33            89.99         89.52
0.6        89.35           89.45            89.96         89.18
0.8        88.89           88.22            87.83         87.35

Model: "sequential"
Layer (type)                               Output Shape        Param #
embedding (Embedding)                      (None, 1500, 256)   19,968,000
conv1d (Conv1D)                            (None, 1494, 128)   229,504
conv1d_1 (Conv1D)                          (None, 1490, 64)    41,024
global_max_pooling1d (GlobalMaxPooling1D)  (None, 64)          0
dense (Dense)                              (None, 50)          3,250
dropout (Dropout)                          (None, 50)          0
dense_1 (Dense)                            (None, 1)           51
Total params: 20,241,829
Trainable params: 20,241,829
Non-trainable params: 0

Figure 2: Seven-layer neural network model.
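The summary in Figure 2 can be reproduced with the tf.keras sketch below. The layer sizes, kernel sizes, and the 0.3 dropout rate come from the text and Tables 2 and 3; the ReLU/sigmoid activations and the Adam optimizer are our assumptions, since the paper does not state them.

```python
import tensorflow as tf

# Sketch of the seven-layer model of Figure 2
# (vocabulary 78,000, input length 1500, embedding dimension 256).
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(78000, 256, input_length=1500),  # 78,000 x 256 = 19,968,000 params
    tf.keras.layers.Conv1D(128, 7, activation="relu"),         # (None, 1494, 128), 229,504 params
    tf.keras.layers.Conv1D(64, 5, activation="relu"),          # (None, 1490, 64), 41,024 params
    tf.keras.layers.GlobalMaxPooling1D(),                      # (None, 64)
    tf.keras.layers.Dense(50, activation="relu"),              # (None, 50), 3,250 params
    tf.keras.layers.Dropout(0.3),                              # rate chosen from Table 2
    tf.keras.layers.Dense(1, activation="sigmoid"),            # (None, 1), 51 params
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()  # parameter counts match Figure 2: 20,241,829 in total
```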
2.5. Fully Connected Layer. A dense layer (densely connected layer) is a fully connected layer, and a fully connected layer performs the primary classification. We have used two dense layers: one hidden layer and the last layer. The hidden dense layer takes a feature vector as input and returns a 50-dimensional vector of scores. The last dense layer classifies the input features into the output classes.

The seven-layer sequential neural network model is shown in Figure 2.

3. Experiments

This section evaluates the neural network model for sentiment analysis.

3.1. Data Collections. There are several sources for movie review datasets, like GitHub, Kaggle.com, the UCI machine learning repository, and IMDb. We extracted movie review datasets from IMDb. Critics frequently use movie rating websites like IMDb to post remarks and rate movies, which assists users in deciding whether to watch a movie or not. In this study, we have used two datasets based on IMDb, an online database of information about movies. These datasets are publicly available and broadly accepted by researchers [8]. The first dataset (dataset I) is available on Kaggle.com [27] and contains IMDb movie reviews from the audience, with 25,000 samples, half positive and half negative [28]. The second dataset (dataset II) is also available on Kaggle.com [29]; it is a binary classification dataset containing 50,000 movie reviews.

3.2. Preprocessing. To improve the quality of sentiment analysis and reduce errors and inconsistencies, we need to clean the data [1]. We observed that the datasets contain HTML tags and punctuation, so we removed the HTML tags and special characters. Removing punctuation leaves single-character strings that make no sense, so we removed all single-character strings and replaced them with a space; this creates multiple spaces in the text, which we then removed, and, finally, we converted the text to lowercase. After cleaning, each dataset is divided into two parts, training and testing: 75% of the dataset is used to train the proposed model and the remaining 25% for testing.
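The cleaning and splitting steps above can be sketched as follows. This is a minimal illustration only: the exact regular expressions used by the authors are not given in the paper, and the `reviews`/`labels` toy variables are hypothetical.

```python
import re
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

def clean_review(text: str) -> str:
    text = re.sub(r"<[^>]*>", " ", text)       # remove HTML tags
    text = re.sub(r"[^A-Za-z]+", " ", text)    # remove punctuation and special characters
    text = re.sub(r"\b[A-Za-z]\b", " ", text)  # remove leftover single characters
    text = re.sub(r"\s+", " ", text).strip()   # collapse multiple spaces
    return text.lower()                        # convert to lowercase

# Hypothetical toy data standing in for the IMDb reviews.
reviews = ["<b>Great movie!</b> A true masterpiece...", "Bad plot, don't watch it.",
           "Loved every minute.", "Terrible acting."]
labels = [1, 0, 1, 0]

cleaned = [clean_review(r) for r in reviews]
X_train, X_test, y_train, y_test = train_test_split(cleaned, labels, test_size=0.25)

# Vocabulary capped at the top 78,000 words, reviews cut off at 1200 tokens
# (the dataset-I settings from Section 2.1).
tokenizer = Tokenizer(num_words=78000)
tokenizer.fit_on_texts(X_train)
X_train_ids = pad_sequences(tokenizer.texts_to_sequences(X_train), maxlen=1200)
X_test_ids = pad_sequences(tokenizer.texts_to_sequences(X_test), maxlen=1200)
```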
In dataset I, out of 25,000 samples, 18,750 sequences or samples are used for training, while the remaining 6,250 are used for testing. Meanwhile, in dataset II, out of 50,000 samples, 37,500 sequences or samples are used for training and 12,500 are used for testing.

3.3. Performance Metrics. The evaluation metrics used in this research work are accuracy, precision, recall, and F1-Score. Their parameters are defined as follows:

TP: the total number of reviews categorized as positive whose actual class is positive.
FP: the total number of reviews categorized as positive whose actual class is negative.
TN: the total number of reviews categorized as negative whose actual class is negative.
FN: the total number of reviews categorized as negative whose actual class is positive.

Accuracy is calculated by taking the ratio between the correctly predicted reviews and the total number of reviews:

Accuracy = (TP + TN) / (TP + TN + FP + FN).    (1)

Precision is calculated by taking the ratio between the reviews predicted correctly as positive and the total number of reviews predicted as positive:

Precision = TP / (TP + FP).    (2)

Recall is calculated by taking the ratio between the reviews predicted correctly as positive and all of the reviews that actually belong to the positive class:

Recall = TP / (TP + FN).    (3)

F1-Score is calculated as the harmonic mean of precision and recall:

F1-Score = (2 × Precision × Recall) / (Precision + Recall).    (4)
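Equations (1)-(4) translate directly into code; the small helper below is our own illustration of the definitions, not code from the paper.

```python
def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, precision, recall, and F1-Score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # equation (1)
    precision = tp / (tp + fp)                          # equation (2)
    recall = tp / (tp + fn)                             # equation (3)
    f1 = 2 * precision * recall / (precision + recall)  # equation (4)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Example with made-up counts: 90 TP, 85 TN, 10 FP, 15 FN.
print(binary_metrics(tp=90, tn=85, fp=10, fn=15))
# {'accuracy': 0.875, 'precision': 0.9, 'recall': 0.857..., 'f1': 0.878...}
```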
3.4. Experiment Results. These experiments are performed using datasets I and II. We took the largest, average, and smallest review lengths to conduct the experiments. The parameters of the proposed model are shown in Table 3, and we record the experimental results in Tables 1, 4, and 5. We discovered from the experimental results that using the longest or the average review length as the fixed input length gives better results than using the smallest review length. Furthermore, the average review length is more effective in terms of processing speed and accuracy, which affects the performance of the proposed neural network model. The experiments in Table 5 are performed on the six-layer model proposed by [7]; Table 5 shows that the accuracy of the six-layer model is less than that of our proposed model.

Table 3: Parameters of the model.
Parameters                                               Values
The length of the input sentence                         1200
The dimension of the word vector                         256
The thesaurus size                                       78,000; 108,000
The size of the convolution kernel                       7 × 5
The number of hidden neurons in the convolution layer    128
Dropout                                                  0.3

Table 4: Effect of the input statement using fixed length (dataset II).
Sentence length    Accuracy (%)    Precision (%)    Recall (%)    F1-Score (%)
6                  73.28           74.74            71.40         72.28
2375               90.85           91.03            90.93         90.47
1200               91.04           91.13            91.09         90.66

Table 5: Effect of the input statement using fixed length (six-layer model).
Length of sentence    Accuracy (%)    Precision (%)    F1-Score (%)
10                    68.41           66.51            89.35
2375                  50.28           50.05            99.78
1200                  52.39           53.97            99

The generalization performance of the neural network model is improved by using a dropout layer. Different dropout values are used in the experiments, and it has been observed that when the dropout value is set to 0.3, the model's performance is optimum. The results of these experiments are shown in Table 2; the experimental results of Table 2 are for the seven-layer model, while the experimental results of Table 6 are for the six-layer model [7], whose accuracy is not more than 52.39%.

It has also been observed from the experiments that the model's performance is affected by the number of iterations: when we increase the number of iterations, the performance of the model also improves. This is presented in Tables 7 and 8 and shown in Figures 3 and 4. The experiments in Table 9 are performed on the six-layer model, with accuracy not more than 55%.

The testing accuracy, recall, precision, and F1-Score are shown in Tables 10 and 11. From Tables 10 and 11, it is noted that the model achieved a final accuracy of 91.18% on dataset I and 91.98% on dataset II. Meanwhile, from Table 12, it is noted that the six-layer model achieved a final accuracy of 53.70%, which is less than that of our proposed model.

4. Discussions

We proposed a seven-layer deep neural network model for sentiment analysis to classify movie reviews in this research work. Before the word vectors are input into the model, the data is preprocessed to improve the quality of sentiment analysis. We remove HTML tags, punctuation marks, and extra spaces in preprocessing as they do not contain any sentiment information.
Table 6: Effect of dropout value (six-layer model).
Dropout    Precision (%)    Accuracy (%)    F1-Score (%)
0.8        51.74            52.08           100

Table 7: Effect of the iterations on the model (dataset I).
Epoch    Accuracy (%)    Precision (%)    Recall (%)    F1-Score (%)
1        89.99           90.58            90.01         89.52
2        97.13           97.29            97.01         97.02
3        99.63           99.63            99.64         99.62
4        99.89           99.92            99.88         99.89

Table 8: Effect of the iterations on the model (dataset II).
Epoch    Accuracy (%)    Precision (%)    Recall (%)    F1-Score (%)
1        91.04           91.13            91.09         90.66
2        96.84           97.00            96.75         96.74
3        99.18           99.16            99.20         99.14
4        99.42           99.44            99.38         99.38

Figure 3: Training accuracy, F1-Score, precision, and recall per epoch on dataset I.

Figure 4: Accuracy on dataset II (training accuracy, F1-Score, precision, and recall per epoch).

Table 9: Effect of the number of iterations (six-layer model).
Epoch    Precision (%)    Accuracy (%)    F1-Score (%)
1        54.09            52.48           100
2        54.59            53.14           100
3        54.58            53.65           100
4        54.93            53.87           100

Table 10: Testing results (dataset I).
           Accuracy (%)    Recall (%)    F1-Score (%)    Precision (%)
Testing    92.18           92.53         91.94           91.79
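The per-epoch results in Tables 7-9 amount to training for one epoch at a time and recording the metrics after each pass. A hedged sketch follows (the batch size is our assumption; `model`, `tokenizer`, `X_train`, and `y_train` are carried over from the earlier sketches):

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Re-pad to the Figure 2 model's 1500-token input length
# (the paper quotes 1200 in Table 3 and 1500 in Figure 2).
X = pad_sequences(tokenizer.texts_to_sequences(X_train), maxlen=1500)
y = np.asarray(y_train, dtype="float32")

# One epoch at a time, recording training accuracy after each pass,
# as in the iteration experiments of Tables 7 and 8.
for epoch in range(1, 5):
    model.fit(X, y, epochs=1, batch_size=64, verbose=0)
    loss, acc = model.evaluate(X, y, verbose=0)
    print(f"epoch {epoch}: training accuracy {acc:.4f}")
```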
We applied the six-layer model of Alexandre Cunha [7] to larger datasets of movie reviews, but it gives 55% accuracy. We added one more layer, the global max-pooling layer, to the same model as Alexandre Cunha and changed some parameters, improving the model's accuracy from 55% to 92%. The accuracy of the six-layer model proposed by [7] is lower because the dataset used is large, and every review of the dataset is about 1500 words (cut-off review).

We have successfully implemented a deep neural network with seven layers on movie review data. Our model achieves accuracy of 91.18%, recall of 92.53%, F1-Score of 91.94%, and precision of 91.79% on dataset I; on dataset II, the model achieves accuracy of 91.98%, precision of 94.14%, recall of 89.93%, and F1-Score of 91.72%.

Data Availability

The code and data used to support the findings of this study have been deposited in the GitHub repository and are available at https://github.com/asifntu/sentimentanalysis. The readers can easily follow the steps to reproduce the study.

Conflicts of Interest

The authors have no conflicts of interest to report regarding this study.

Acknowledgments

This work was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project no. PNURSP2022R54, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

References

[1] D. S. Sisodia, S. Bhandari, N. K. Reddy, and A. Pujahari, "A comparative performance study of machine learning algorithms for sentiment analysis of movie viewers using open reviews," in Performance Management of Integrated Systems and its Applications in Software Engineering, pp. 107-117, Springer, New York, NY, USA, 2020.
[2] B. Lakshmi Devi, V. Varaswathi Bai, S. Ramasubbareddy, and K. Govinda, "Sentiment analysis on movie reviews," in Emerging Research in Data Engineering Systems and Computer Communications, pp. 321-328, Springer, New York, NY, USA, 2020.
[3] W. Zhang, M. Xu, and Q. Jiang, "Opinion mining and sentiment analysis in social media: challenges and applications," in Proceedings of the 5th International Conference, HCIBGO 2018, Held as Part of HCI International 2018, Las Vegas, NV, USA, July 15-20, 2018.
[4] B. Bhavitha, A. P. Rodrigues, and N. N. Chiplunkar, "Comparative study of machine learning techniques in sentimental analysis," in Proceedings of the 2017 International Conference on Inventive Communication and Computational Technologies (ICICCT), Coimbatore, India, 10-11 March 2017.
[5] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," Advances in Neural Information Processing Systems, vol. 26, 2013.
[6] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.
[7] A. A. L. Cunha, M. C. Costa, and M. A. C. Pacheco, "Sentiment analysis of YouTube video comments using deep neural networks," International Conference on Artificial Intelligence and Soft Computing, China, 2019.
[8] N. C. Dang, M. N. Moreno-García, and F. De la Prieta, "Sentiment analysis based on deep learning: a comparative study," Electronics, vol. 9, no. 3, p. 483, 2020.
[9] A. Alharbi and O. Sohaib, "Technology readiness and cryptocurrency adoption: PLS-SEM and deep learning neural network analysis," IEEE Access, vol. 9, pp. 21388-21394, 2021.
[10] O. Sohaib, W. Hussain, M. Asif, M. Ahmad, and M. Mazzara, "A PLS-SEM neural network approach for understanding cryptocurrency adoption," IEEE Access, vol. 8, pp. 13138-13150, 2020.
[11] B. Liu, Sentiment Analysis and Opinion Mining, vol. 5, no. 1, pp. 1-167, 2012.
[12] R. Sharma, S. Nigam, and R. Jain, "Opinion mining of movie reviews at document level," 2014, https://arxiv.org/abs/1408.3829.
[13] L. Yang, Y. Li, J. Wang, and R. S. Sherratt, "Sentiment analysis for E-commerce product reviews in Chinese based on sentiment lexicon and deep learning," IEEE Access, vol. 8, pp. 23522-23530, 2020.
[14] R. Sujata and K. Parteek, "Challenges of Sentiment Analysis and Existing State of Art," 2014, https://www.researchgate.net/publication/308331478_Challenges_of_Sentiment_Analysis_and_Existing_State_of_Art.
[15] G. Vinodhini and R. M. Chandrasekaran, Sentiment Analysis and Opinion Mining: A Survey, vol. 2, no. 6, pp. 282-292, 2012.
[16] D. M. E.-D. M. Hussein, "A survey on sentiment analysis challenges," Journal of King Saud University - Engineering Sciences, vol. 30, no. 4, pp. 330-338, 2018.
[17] K. Chakraborty, S. Bhatia, S. Bhattacharyya, J. Platos, R. Bag, and A. E. Hassanien, "Sentiment Analysis of COVID-19 tweets by Deep Learning Classifiers—A study to show how popularity is affecting accuracy in social media," Applied Soft Computing, vol. 97, Article ID 106754, 2020.
[18] M. Asif, A. Ishtiaq, H. Ahmad, H. Aljuaid, and J. Shah, "Sentiment analysis of extremism in social media from textual information," Telematics and Informatics, vol. 48, Article ID 101345, 2020.
[19] P. Rodríguez, M. A. Bautista, J. Gonzàlez, and S. Escalera, "Beyond one-hot encoding: lower dimensional target embedding," Image and Vision Computing, vol. 75, pp. 21-31, 2018.
[20] Z. Shaukat, A. A. Zulfiqar, C. Xiao, M. Azeem, and T. Mahmood, "Sentiment analysis on IMDB using lexicon and neural networks," SN Applied Sciences, vol. 2, no. 2, p. 148, 2020.
[21] I. Santos, N. Nedjah, and L. de Macedo Mourelle, "Sentiment analysis using convolutional neural network with fastText embeddings," in Proceedings of the 2017 IEEE Latin American Conference on Computational Intelligence (LA-CCI), Arequipa, Peru, 08-10 November 2017.
[22] S. Rani and P. Kumar, "Deep learning based sentiment analysis using convolution neural network," Arabian Journal for Science and Engineering, vol. 44, no. 4, pp. 3305-3314, 2019.
[23] K. Chakraborty, S. Bhattacharyya, R. Bag, and A. A. Hassanien, "Sentiment analysis on a set of movie reviews using deep learning techniques," Social Network Analytics: