2 authors, including:
Binita Verma
Jayoti Vidyapeeth Women's University
All content following this page was uploaded by Binita Verma on 22 May 2021.
B. Verma (✉)
BU, Bhopal, India
e-mail: [email protected]
R. S. Thakur
Department of Computer Application and Mathematics,
Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India
e-mail: [email protected]
1 Introduction
Social media has become another communication channel between consumers
and organizations. Social networking sites such as Twitter, Facebook, and Tumblr
give users a platform to publish and express their emotions, views, and
preferences about various topics, people, products, and services. Traditionally, text
and reviews were collected through questionnaires prepared by researchers. This
method of data collection was time consuming and difficult to manage. With the
advancement of technology and the Internet, consumers use social media to provide
feedback and comments in the form of unstructured text. Opinions expressed in
social media can be classified to determine the orientation (negative, positive,
neutral) of the posted text. The sentiment strength and intensity of a post are
determined in order to identify the opinion and emotion of the user about a specific
product or service. Sentiment analysis is used to analyze such text and reviews.
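To make the notions of orientation and strength concrete, the following is a minimal sketch of lexicon-based scoring. The toy lexicon, its scores, and the function names are assumptions made for illustration only; they are not taken from the paper or from any published lexicon.

```python
import re

# Hypothetical toy lexicon (word -> polarity score); illustrative values only.
TOY_LEXICON = {
    "good": 1.0, "great": 2.0, "love": 2.0,
    "bad": -1.0, "poor": -1.0, "terrible": -2.0,
}

def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of every token in the text (unknown words count 0)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)

def orientation(text, lexicon=TOY_LEXICON):
    """Map the summed score to positive, negative, or neutral."""
    score = sentiment_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def strength(text, lexicon=TOY_LEXICON):
    """Use the magnitude of the summed score as a crude strength measure."""
    return abs(sentiment_score(text, lexicon))
```

Under these assumptions, a post such as "I love this phone, great battery" is labeled positive with strength 4.0, while a post containing no lexicon words is labeled neutral. A practical system would also need to handle negation, intensifiers, and emoticons, which this sketch ignores.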
Social networking sites must keep customers satisfied in order to retain them and
stay ahead of the competition. One way to achieve customer satisfaction is to
display products that match a customer’s interests, inform customers about
upcoming sales on those products, and suggest new products similar to the
preferences users reveal on social sites. The content of the social Web is dynamic
and changes quickly to reflect the social and emotional ups and downs of
users [1].
Recently, blogs have become one of the most popular sources of personal opinion
about any topic, person, or product. As demand for the Internet increases, the
number of blog pages is also growing rapidly. For sentiment analysis, blogs serve
as a source of the opinions available on the social Web. For consumers, review
sites help in forming a judgment about products or services available on the
Internet [2]. Nowadays, users spend their time on social media sites such as
Twitter, Orkut, Tumblr, and MySpace, creating short messages that can be used as
a data source for sentiment classification.
2 Related Work
Sentiment Analysis Using Lexicon and Machine … 443
corpus into positive and negative dictionaries. They updated the dictionary by
exploiting emotion symbols like emoticons, acronyms, and exclamation words.
Augustniak et al. [4] presented a fast and efficient approach for extracting a
lexicon using bag of words (BoW) for sentiment determination. They combined a
word-frequency approach to building the lexicon with an ensemble method,
exploiting the efficiency provided by machine learning and the performance
provided by lexicon-based methods.
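The idea of deriving a lexicon from bag-of-words frequencies can be sketched as follows. This is an illustrative simplification and not the procedure of [4]: the scoring formula, the `min_count` threshold, and all names are assumptions. Each word receives a score in [-1, 1] based on how often it appears in positive versus negative documents.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_lexicon(labeled_docs, min_count=2):
    """Build a word -> score lexicon from (text, label) pairs.

    Labels are assumed to be "positive" or "negative". The score is
    (pos - neg) / (pos + neg), where pos and neg are the word's frequencies
    in positive and negative documents. Words seen fewer than min_count
    times in total are dropped as unreliable.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for text, label in labeled_docs:
        target = pos_counts if label == "positive" else neg_counts
        target.update(tokenize(text))

    lexicon = {}
    for word in set(pos_counts) | set(neg_counts):
        pos, neg = pos_counts[word], neg_counts[word]
        if pos + neg >= min_count:
            lexicon[word] = (pos - neg) / (pos + neg)
    return lexicon
```

Words that occur equally often in both classes receive a score of 0.0 and carry no polarity; a real pipeline would likely filter them out along with stop words.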
Dang et al. [5] proposed a framework that combines machine learning and
lexicon-based approaches to improve the performance of sentiment classification.
Goodarzi et al. [6] proposed a lexicon-based framework for sentiment analysis of
citations in research content. They determined sentiment orientation, polarity, and
strength. They also compared the performance of the SentiWordNet, Bing Liu, and
AFINN lexicons on PubMed schema-based research content.
Liu et al. [7] presented a scalable implementation of the Naïve Bayes algorithm to
analyze the sentiment of sentences in millions of movie reviews. They built a big
data analysis system consisting of simple MapReduce analysis jobs, a workflow
controller (WFC), a user terminal, a result collector, and a data parser.
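For readers unfamiliar with the classifier itself, here is a minimal single-machine sketch of multinomial Naïve Bayes with add-one (Laplace) smoothing. It does not reproduce the MapReduce system of [7]; the class name, tokenizer, and training data format are assumptions for illustration. The word counting done in `fit` is the kind of step that a distributed implementation would typically parallelize.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayesSentiment:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, docs):
        """Count documents per class and words per class from (text, label) pairs."""
        self.class_docs = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in docs:
            tokens = tokenize(text)
            self.class_docs[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        """Return the label with the highest log posterior for the text."""
        total_docs = sum(self.class_docs.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]  # skip unseen words
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_docs.items():
            logp = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                logp += math.log((count + 1) / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps a word that never occurred in one class from forcing that class's probability to zero.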
3 Sentiment Analysis
These learning approaches are based on building classifiers from labeled instances
of textual posts. They perform well for the domain on which they are trained.
Machine learning can be divided into two approaches [12].
a. Supervised learning
Supervised learning approaches use labeled training documents and are based on
automatic text classification. A labeled training set with pre-defined categories
is used, and a classification model is built to predict the class of a document
based on those pre-defined categories [12, 13].
Supervised learning algorithms include the following [13]:
1. Probabilistic classifiers such as Naïve Bayes, Bayesian network, and maximum
entropy.
2. Linear classifiers determine good separators that can best separate the space
into different classes. The best-known linear classifiers are the support vector
machine (SVM) and the neural network.
3. Rule-based classifiers divide the data space using a set of rules. Rules of the
form “IF condition THEN conclusion” are generated during the training phase. The
decision-rule classification method assigns documents to annotated categories.
4. Decision tree classifiers build a hierarchical tree-like structure with true/false
queries based on the categorization of the training documents.
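Of the classifier families listed above, the rule-based one is the easiest to illustrate in a few lines. The sketch below generates rules of the form "IF word occurs in the document THEN label" during training and applies the first matching rule at prediction time. The support and confidence thresholds, the default label, and the function names are assumptions chosen for illustration, not a method described in [13].

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Return the set of distinct lowercase word tokens in the text."""
    return set(re.findall(r"[a-z']+", text.lower()))

def learn_rules(docs, min_support=2, min_confidence=1.0):
    """Generate (word, label) rules from (text, label) training pairs.

    A rule is kept if the word occurs in at least min_support documents
    and at least min_confidence of those documents share one label.
    """
    word_label_counts = defaultdict(Counter)
    for text, label in docs:
        for word in tokenize(text):
            word_label_counts[word][label] += 1

    rules = []
    for word, counts in word_label_counts.items():
        label, hits = counts.most_common(1)[0]
        support = sum(counts.values())
        if support >= min_support and hits / support >= min_confidence:
            rules.append((word, label, hits))

    # Try the best-supported rules first; break ties alphabetically.
    rules.sort(key=lambda rule: (-rule[2], rule[0]))
    return [(word, label) for word, label, _ in rules]

def apply_rules(text, rules, default="neutral"):
    """Return the conclusion of the first rule whose condition matches."""
    words = tokenize(text)
    for word, label in rules:
        if word in words:
            return label
    return default
```

With `min_confidence=1.0`, a word that appears in documents of both classes never becomes a rule, so only unambiguous indicators are used; lowering the threshold trades precision for coverage.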
b. Unsupervised learning
Unlike supervised learning approaches, unsupervised learning approaches do not
depend on the domain and topic of training data. Unsupervised learning approaches
overcome the difficulty of collecting and creating labeled training data.
Table 1 (continued)

Name of the      Sub-technique /          Type of data               Pros & cons
technique        approaches
---------------  -----------------------  -------------------------  -------------------------------
Lexicon-based    Manual approaches        Lexicon and optionally     Covered wide range of terms
technique                                 on unlabeled data
                 Dictionary based         Lexicon and optionally     Lot of words in a lexicon and
                                          on unlabeled data          assign sentiment scores to
                 Corpus based             Labeled data               each word
Hybrid           Combination of machine   Labeled data and lexicon,  Concept-level measurement of
technique        learning and             optionally with            sentiment and lesser
                 lexicon-based            unlabeled data             sensitivity to change in
                 approaches                                          topic domain
                                                                     Noisy data
4 Conclusion
References
1. Rodriguez, C., Grivolla, J., Codina, J.: A hybrid framework for scalable opinion mining in
social media. In: Proceedings of the Workshop on Semantic Analysis in Social Media (2012)
2. Priya Mohana, R.: Sentiment Analysis and Opinion mining using SentiwordNet: A Survey
(2015)
3. Hanen, A., Salma, J.: Dynamic construction of dictionaries for sentiment classification. In:
Third International Conference on Cloud and Green Computing (CGC), Karlsruhe (2013)
4. Augustniak, L., Kajdanowicz, T., et al.: Simple is better? Lexicon based ensemble sentiment
classification beats supervised methods. In: IEEE/ACM International Conference on
Advances in Social Networks Analysis and Mining (ASONAM), Beijing (2014)
5. Dang, Y., Zhang, Y., Chen, H.: A lexicon-enhanced method for sentiment classification: an
experiment on online product reviews. IEEE Intell. Syst. 25(4), 46–53 (2009)
6. Goodarzi, M., Mahmoudi, M., Zamani, R.: A framework for sentiment analysis on
schema-based research content via lexical analysis. In: 7th International Symposium on
Telecommunications (IST), Tehran (2014)
7. Liu, B., Blasch, E., Chen, Y., Shen, D.: Scalable sentiment classification for big data analysis
using Naïve Bayes Classifier. In: IEEE International Conference on Big Data, Silicon Valley,
CA (2013)
8. Yongzehng, Z., Dan, S., Catherine, B.: Sentiment analysis in Practice, Dec (2011)
9. Zhou, X., Tao, X., Yong, J., Yang, Z.: Sentiment analysis on tweets for social events. In:
Proceedings of the 2013 IEEE 17th International Conference on Computer Supported
Cooperative Work in Design (2013)
10. Bing, L., Lei, Z.: A Survey of Opinion Mining and Sentiment Analysis. Springer, pp. 415–
463 (2012)
11. Hailong, Z., Wenyan, G., Bo, J.: Machine Learning and Lexicon Based Methods for
Sentiment Classification, IEEE, vol. 11, pp. 262–265 (2014)
12. Walaa, M., Ahmed, H., Korashy, H.: Sentiment analysis algorithms and applications: a
survey. Ain Shams Eng. J. 5(4), 1093–1113 (2013)
13. Baharum, B., Lam, H.L., Khairullah, K.: A review of machine learning algorithms for
text-documents classification. J. Adv. Inf. Technol. 1(1), 4–20 (2010)
14. Wan, X.: Using Bilingual knowledge and ensemble technique for unsupervised Chinese
sentiment analysis. In: Proceedings of the Conference on Empirical Methods in Natural
Language Processing. Association for Computational Linguistics, pp. 553–561 (2008)
15. Hardeniya, T., Borikar, D.A.: Dictionary based approach to sentiment analysis—a review. In:
IJAEMS: Open Access International Journal, vol. 2(5). Infogain Publication (2016)
16. Ye, Q., Zhang, Z., Law, R.: Sentiment classification of online reviews to travel destinations by
supervised machine learning approaches. Expert Syst. Appl. 36, 6527–6535 (2009).
https://doi.org/10.1016/j.eswa.2008.07.035
17. Sindhwani, V., Melville, P.: Document-word co-regularization for semi-supervised sentiment
analysis. In: Eighth IEEE International Conference on Data Mining, 2008 (ICDM '08),
pp. 1025–1030. IEEE (2008)
18. D’Andrea, A., Ferri, F., Grifoni, P., Guzzo, T.: Approaches, tools and applications for
sentiment analysis implementation. Int. J. Comput. Appl. (0975-8887) 125(3) (2015)