Sentiment Analysis
BUSINESS INTELLIGENCE
Submitted by
Maaz Ghazanfar (BSCSF15E004)
Submitted to
Mr. Muhammad Fahad
Abstract
In response to the growing availability of informal, opinionated texts such as blog posts and
product review websites, the field of Sentiment Analysis has sprung up in the past decade to
address the question “What do people feel about a certain topic?” Bringing together researchers
in computer science, computational linguistics, data mining, psychology, and even sociology,
Sentiment Analysis expands the traditional fact-based text analysis to enable opinion-oriented
information systems. This paper is an overview of Sentiment Analysis, its basic tasks and the
latest techniques developed to address the challenges of working with emotionally-charged
text.
Goals
Because of the complexity of the problem (underlying concepts, expressions in text, etc.),
Sentiment Analysis encompasses several separate tasks. These are usually combined to
produce some knowledge about the opinions found in text. This section provides an overview
of these tasks:
Introduction
When conducting serious research or making everyday decisions, we often look for other
people’s opinions. We consult political discussion forums when casting a political vote, read
consumer reports when buying appliances, and ask friends to recommend a restaurant for the
evening. The Internet has now made it possible to find out the opinions of millions of people on
everything from the latest gadgets to political philosophies. The latest Pew study on Internet and
Civic engagement says that “just under one in five internet users (19%) have posted material
about political or social issues or used a social networking site for some form of civic or
political engagement”. Another study shows that a third (33%) of internet users read blogs,
with 11% doing so on a daily basis. The ready availability of opinionated text has created a new area
in text analysis, expanding the subject of study from the traditionally fact- and information-centric
view of text to enable sentiment-aware applications. In the past decade, the extraction of
sentiment from text has received a lot of attention in both industry and academia.
Increasingly, businesses realize the importance of Internet users’ opinions about their products
and services.
History
Definition of Sentiment:
Any definition of sentiment must begin with the objects of the study: opinions and subjectivity.
Subjectivity was originally defined by linguists, most prominently Randolph Quirk (Quirk and
Svartvik, 1985). Quirk defines a private state as something that is not open to objective
observation or verification. Private states include emotions, opinions, and speculations, among
others. The expression of a private state is highly context-sensitive and often peculiar to each
person. Subjectivity should not be confused with falsehood: the sentence “Ali loves chocolate”
expresses a sentiment of Ali towards chocolate, but that does not mean it is untrue. Likewise, not
all objective sentences are true. To underline the ambiguity of the concept, Pang and Lee (Pang
and Lee, 2008) list the definitions of terms closely linked to the notion of sentiment:
Sentiment analysis
It is still a challenge to extract the full private state, complete with the emotion’s intensity, its
holder, and its target.
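To make the components of a private state concrete, the following minimal Python sketch represents one as a record with a holder, a target, a polarity, and an intensity. The class name, the three-way polarity labels, and the 0.0 to 1.0 intensity scale are illustrative assumptions, not definitions taken from Quirk or from Pang and Lee, and the intensity value in the example is hand-assigned rather than extracted.

```python
from dataclasses import dataclass

# Hypothetical label set; the literature uses many different schemes.
POLARITIES = {"positive", "negative", "neutral"}


@dataclass(frozen=True)
class PrivateState:
    """One sentiment expressed in text (illustrative structure only)."""
    holder: str       # who holds the sentiment
    target: str       # what the sentiment is about
    polarity: str     # one of POLARITIES
    intensity: float  # assumed scale: 0.0 (weak) to 1.0 (strong)

    def __post_init__(self):
        # Reject values outside the assumed label set and scale.
        if self.polarity not in POLARITIES:
            raise ValueError(f"unknown polarity: {self.polarity!r}")
        if not 0.0 <= self.intensity <= 1.0:
            raise ValueError("intensity must lie between 0.0 and 1.0")


# "Ali loves chocolate", annotated by hand.
example = PrivateState(holder="Ali", target="chocolate",
                       polarity="positive", intensity=0.9)
```

A full extraction system would have to fill in all four fields automatically, which is exactly the part that remains difficult.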
Preliminary Knowledge
As mentioned before, some of the most studied texts in Sentiment Analysis are product and
movie reviews (Hu and Liu, 2005; Popescu and Etzioni, 2005). The advantage is that they
already have a clearly specified topic, and it is often (reasonably) assumed that the sentiments
expressed in the reviews have to do with the topic. Many also have a star rating system, which
serves as a quantitative indication of the opinion. Such data is often used as a gold standard when
evaluating sentiment extraction or identification. A more general task in sentiment research
is to find opinions on a given product in arbitrary web content. Several companies that offer
services in brand tracking and market perception use Sentiment Analysis techniques. For
example, OpSec Security provides “monitoring, measuring, and analyzing consumer feedback”
to their customers, helping them understand the market needs, target customer segments, and
their position against competitors. On the other hand, one of the most difficult areas for
Sentiment Analysis methods is that of politics. Political discussions are fraught with quotations,
sarcasm, and complex references to persons, organizations, and ideas (Gamon et al., 2008).
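The use of star ratings as a gold standard for reviews can be sketched as follows. This is a minimal Python illustration under stated assumptions: the word lists, the toy reviews, the word-counting classifier, and the mapping of 4 to 5 stars to "positive" and 1 to 2 stars to "negative" (with 3-star reviews skipped) are all hypothetical choices made for the example, not a method taken from the cited papers.

```python
# Hypothetical, deliberately tiny opinion lexicons.
POSITIVE_WORDS = {"good", "great", "excellent", "love", "amazing"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate", "awful"}


def predict_polarity(text):
    """Classify a review by counting lexicon words (ties count as positive)."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    score = (sum(t in POSITIVE_WORDS for t in tokens)
             - sum(t in NEGATIVE_WORDS for t in tokens))
    return "positive" if score >= 0 else "negative"


def gold_label(stars):
    """Map a 1-5 star rating to a gold label; 3 stars is treated as unlabeled."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return None


def accuracy(reviews):
    """Fraction of labeled reviews whose prediction matches the star-based label."""
    scored = [(predict_polarity(text), gold_label(stars))
              for text, stars in reviews
              if gold_label(stars) is not None]
    if not scored:
        raise ValueError("no reviews with a usable star rating")
    return sum(pred == gold for pred, gold in scored) / len(scored)


# Toy data invented for this example: (review text, star rating).
REVIEWS = [
    ("Great camera, I love it!", 5),
    ("Terrible battery and poor screen.", 1),
    ("It is okay.", 3),
    ("Awful. I hate this phone.", 2),
    ("Not good at all.", 1),
]

print(accuracy(REVIEWS))  # 0.75
```

The one error on this toy data comes from "Not good at all.", which the word counter labels positive because it ignores negation. This small case shows why simple lexical methods struggle, and why texts with sarcasm or quotation, such as political discussions, are harder still.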
Real life commercial uses
Although the field of Sentiment Analysis is relatively young, numerous businesses already
offer the techniques developed in this field to customers interested in brand
tracking and market perception.
• Online communities
• Weblogs
• Chatrooms
• Newsgroups
By aggregating, evaluating, and interpreting the data found on these web sites, OpSec
promises to “provide insights and recommendations” and “forecast product and brand trends”.
The discovery of the “opinion leaders”, they claim, helps companies discover their strengths.
Methodologies
References
Airoldi, E. M., Bai, X., and Padman, R. (2006). Markov blankets and meta-heuristic search:
Sentiment extraction from unstructured text. Lecture Notes in Computer Science, 3932:167–187.
Bai, X., Padman, R., and Airoldi, E. (2005). On learning parsimonious models for extracting
consumer opinions. Proceedings of the Hawaii International Conference on System Sciences.
Bansal, M., Cardie, C., and Lee, L. (2008). The power of negative thinking: Exploring label
disagreement in the min-cut classification framework. Proceedings of the International
Conference on Computational Linguistics (COLING).
Benamara, F., Cesarano, C., Picariello, A., Reforgiato, D., and Subrahmanian, V. (2007).
Sentiment analysis: Adjectives and adverbs are better than adjectives alone. In Proceedings of
the International Conference on Weblogs and Social Media (ICWSM).
Blitzer, J., Dredze, M., and Pereira, F. (2007). Biographies, Bollywood, boom-boxes and
blenders: Domain adaptation for sentiment classification. Proceedings of the 45th Annual
Meeting of the Association for Computational Linguistics, pages 440–447.
Chesley, P., Vincent, B., Xu, L., and Srihari, R. K. (2006). Using verbs and adjectives to
automatically classify blog sentiment. Proceedings of the AAAI Spring Symposium on
Computational Approaches to Analyzing Weblogs.
Church, K. W. and Hanks, P. (1989). Word association norms, mutual information and
lexicography. Proceedings of the 27th Annual Meeting of the Association for Computational
Linguistics, pages 76–83.
Das, S. and Chen, M. (2001a). Yahoo! for amazon: Extracting market sentiment from stock
message boards. Proceedings of the Asia Pacific Finance Association Annual Conference (APFA).
Das, S. and Chen, M. (2001b). Yahoo! for amazon: Sentiment parsing from small talk on the
web. Proceedings of the 8th annual Conference of the Asia Pacific Finance Association (APFA).
Dave, K., Lawrence, S., and Pennock, D. M. (2003). Mining the peanut gallery: Opinion
extraction and semantic classification of product reviews. Proceedings of the World Wide Web
Conference.
……………………………………………………………………….