Sentiment Analysis

The growing expansion of contents, placed on the Web, provides a huge collection of textual resources. People share their experiences, opinions or simply talk just about whatever concerns them online.

Uploaded by

Eesh Uww

SENTIMENT ANALYSIS

BUSINESS INTELLIGENCE

Submitted by
Maaz Ghazanfar (BSCSF15E004)

Submitted to
Mr. Muhammad Fahad
Abstract
As a response to the growing availability of informal, opinionated texts like blog posts and
product review websites, the field of Sentiment Analysis has sprung up in the past decade to
address the question “What do people feel about a certain topic?” Bringing together researchers
in computer science, computational linguistics, data mining, psychology, and even sociology,
Sentiment Analysis expands traditional fact-based text analysis to enable opinion-oriented
information systems. This paper is an overview of Sentiment Analysis, its basic tasks, and the
latest techniques developed to address the challenges of working with emotionally charged
text.

Goals
Because of the complexity of the problem (underlying concepts, expressions in text, etc.),
Sentiment Analysis encompasses several separate tasks. These are usually combined to
produce some knowledge about the opinions found in text. This section provides an overview
of these tasks:

• Sentiment or opinion detection
• Polarity classification
• Discovery of the opinion’s target
• Feature extraction & comparative sentences
• Topic-specific & cross-topic sentiment analysis

Introduction
When conducting serious research or making everyday decisions, we often look for other
people’s opinions. We consult political discussion forums when casting a political vote, read
consumer reports when buying appliances, and ask friends to recommend a restaurant for the
evening. The Internet has now made it possible to find out the opinions of millions of people on
everything from the latest gadgets to political philosophies. The latest Pew study on Internet and
civic engagement says that “just under one in five internet users (19%) have posted material
about political or social issues or used a social networking site for some form of civic or
political engagement”. Another study shows that a third (33%) of internet users read blogs,
with 11% doing so on a daily basis. The ready availability of opinionated text has created a new
area in text analysis, expanding the subject of study from the traditionally fact- and
information-centric view of text to enable sentiment-aware applications. In the past decade, the
extraction of sentiment from text has been getting a lot of attention in both industry and academia.
Increasingly, businesses realize the importance of Internet users’ opinions about their products
and services.
History
Definition of Sentiment:

This section defines the objects of the study: opinions and subjectivity. Originally, subjectivity
was defined by linguists, most prominently Randolph Quirk (R. Quirk and Svartvik, 1985). Quirk
defines a private state as something that is not open to objective observation or verification.
These private states include emotions, opinions, and speculations, among others. Such a state is
highly context-sensitive, and its expression is often peculiar to each person. Subjectivity is not
the opposite of truth: the sentence “Ali loves chocolate” expresses a sentiment of Ali towards
chocolate, but that does not make it untrue. Likewise, not all objective sentences are true. To
underline the ambiguity of the concept, Pang and Lee (2008) list the definitions of terms closely
linked to the notion of sentiment:

• Sentiment suggests a settled opinion reflective of one’s feelings (“her feminist
sentiments are well-known”).
• View suggests a subjective opinion (“very assertive in stating his views”).

Sentiment analysis

As a field of research, Sentiment Analysis is closely related to computational linguistics,
natural language processing, and text mining. Proceeding from the study of affective state
(psychology) and judgment, this field seeks to answer questions long studied in other areas of
discourse using new tools provided by data mining and computational linguistics. Sentiment
Analysis has many names. It is often referred to as subjectivity analysis, opinion mining, and
appraisal extraction, with some connections to affective computing: computer recognition and
expression of emotion (Pang and Lee, 2008). The field usually studies subjective elements, defined
by Wiebe et al. as “linguistic expressions of private states in context” (Wiebe et al., 2004).
These are usually single words, phrases, or sentences. Sometimes whole documents are studied as a
sentiment unit (Turney and Littman, 2003; Agrawal et al., 2003), but it is generally agreed that
sentiment resides in smaller linguistic units (Pang and Lee, 2008). Since sentiment and opinion
often refer to the same idea, this paper will use the terms interchangeably. Sentiment that
appears in text comes in two flavors: explicit, where the subjective sentence directly expresses
an opinion (“It’s a beautiful day”), and implicit, where the text implies an opinion (“The
earphone broke in two days”) (Liu, 2006). Most of the work done so far focuses on the first kind
of sentiment, since it is the easier one to analyze.

Sentiment polarity is a particular feature of text. It is usually divided into two classes,
positive and negative, but polarity can also be thought of as a range. A document containing
several opinionated statements would have a mixed polarity overall, which is different from not
having a polarity at all (being objective). Furthermore, a distinction must be made between the
polarity of a sentiment and its strength. One may feel strongly about a product being OK, not
particularly good or bad, or weakly about a product being very good.
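
The distinction between polarity, strength, mixed polarity, and objectivity can be made concrete with a small data structure. The following is only an illustrative sketch: the `Opinion` class, its numeric ranges, and the `document_polarity` function are assumptions introduced here, not part of any cited system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Opinion:
    """One opinionated statement; both numeric scales are assumptions."""
    polarity: float  # -1.0 (negative) .. +1.0 (positive); 0.0 means "OK"
    strength: float  # 0.0 (weakly felt) .. 1.0 (strongly felt)


def document_polarity(opinions: List[Opinion]) -> Optional[str]:
    """Summarize a document's polarity from its opinionated statements.

    Returns None for an objective document (no opinions at all), which is
    deliberately different from "mixed" or "neutral".
    """
    if not opinions:
        return None
    has_positive = any(o.polarity > 0 for o in opinions)
    has_negative = any(o.polarity < 0 for o in opinions)
    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return "neutral"


# Strength is independent of polarity:
strongly_ok = Opinion(polarity=0.0, strength=0.9)   # feels strongly that it is just OK
weakly_great = Opinion(polarity=0.9, strength=0.2)  # feels weakly that it is very good
```

Keeping polarity and strength as separate fields lets a system represent both of the cases described above without conflating them in a single score.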
Background
This section gives an overview of the work done in the most popular task of Sentiment Analysis,
polarity classification, extended from Zhou and Chaovalit (2008). The work in this area started
around 2000 and is still strong today. A lot of work has been done on movie and product reviews;
especially popular are reviews from the Internet Movie Database (IMDb) and product reviews
downloaded from Amazon. The performance achieved by these methods is difficult to compare, since
each method uses a variety of resources for training and different collections of documents for
testing. Many studies, such as Blitzer et al. (2007), deal with several domains, some more
“challenging” for their algorithms than others. Notice especially how much the results vary
across domains: in a recent study by Melville et al. (2009), the performance of their system on
blogs and on political commentary differs by nearly 30%. Some studies, such as Godbole et al.
(2007), work on the level of words, sometimes achieving accuracy of over 90%. Others, working on
longer documents such as blog posts and full web pages, generally achieve performance of around
65–85%. Few studies have been done outside the realm of short documents like product reviews,
especially in difficult domains like political commentaries.

It is still a challenge to extract the full private state, complete with the emotion’s intensity, its
holder, and its target.

Preliminary Knowledge
As mentioned before, some of the most studied texts in Sentiment Analysis are product and
movie reviews (Hu and Liu, 2005; Popescu and Etzioni, 2005). The advantage is that they
already have a clearly specified topic, and it is often (reasonably) assumed that the sentiments
expressed in the reviews have to do with the topic. Many also have a star rating system, which
serves as a quantitative indication of the opinion. Such data is often used as a gold standard
when evaluating sentiment extraction/identification. A general task aimed at sentiment research
would be to find opinions on a given product in any web content. Several companies that offer
services in brand tracking and market perception use Sentiment Analysis techniques. For
example, OpSec Security provides “monitoring, measuring, and analyzing consumer feedback”
to their customers, helping them understand the market needs, target customer segments, and
their position against competitors. On the other hand, one of the most difficult areas for
Sentiment Analysis methods is that of politics. Political discussions are fraught with quotations,
sarcasm, and complex references to persons, organizations, and ideas (Gamon et al., 2008).
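
The use of star ratings as a gold standard, described above, can be sketched as follows. This is a minimal illustration under stated assumptions: the 1–5 scale, the thresholds (4 stars and up is positive, 2 and below is negative, 3 is skipped), and the function names are all hypothetical choices made here.

```python
from typing import List, Optional


def stars_to_label(stars: int) -> Optional[str]:
    """Map a 1-5 star rating to a polarity label (thresholds are an assumption)."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return None  # 3-star reviews are treated as ambiguous and skipped


def accuracy_against_stars(predictions: List[str], stars: List[int]) -> float:
    """Fraction of unambiguous reviews whose predicted label matches the stars."""
    pairs = [(pred, stars_to_label(s)) for pred, s in zip(predictions, stars)]
    scored = [(pred, gold) for pred, gold in pairs if gold is not None]
    if not scored:
        raise ValueError("no reviews with an unambiguous star rating")
    correct = sum(1 for pred, gold in scored if pred == gold)
    return correct / len(scored)


# Four reviews: the 3-star review is skipped, two of the remaining three match.
predicted = ["positive", "negative", "positive", "negative"]
ratings = [5, 1, 2, 3]
score = accuracy_against_stars(predicted, ratings)
```

Discarding middle ratings is one possible design choice; a system that treats polarity as a range could instead compare a predicted score to the rating directly.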
Real life commercial uses

Although the field of Sentiment Analysis is relatively young, there are already numerous
businesses that offer the techniques developed in this field to customers interested in brand
tracking and market perception.

Specifically, these are the types of activities that may be involved:

• Tracking collective user opinions and ratings of products and services

• Analyzing consumer trends, competitors, and market buzz

• Measuring response to company-related events and incidents

• Monitoring critical issues to prevent negative viral effects

• Evaluating feedback in multiple languages

As a source of opinionated discourse, these companies look at:

• Online communities

• Discussion boards

• Weblogs

• Product rating sites

• Chatrooms

• Price comparison portals

• Newsgroups

By aggregating, evaluating, and interpreting the data found on these web sites, OpSec
(http://www.opsecsecurity.com/) promises to “provide insights and recommendations” and
“forecast product and brand trends”.

The discovery of the “opinion leaders”, they claim, helps companies discover their strengths.
Methodologies

Tools and techniques

• Classification: Many of the tasks in Sentiment Analysis can be thought of as
classification. Machine Learning offers many algorithms designed to do just that, but
the task of classifying text according to its sentiment presents many unique challenges.
These can be formulated in one question: “What kinds of features do we use?”
• Term Presence vs. Frequency: Traditional Information Retrieval systems have long
emphasized the importance of term frequency. Terms that appear often in the document
but seldom in the whole collection are more informative as to what the document is
about than terms mentioned just once. In the field of Sentiment Analysis,
we find that instead of paying attention to the most frequent terms, it is more beneficial to
seek out the most unique ones.
• n-grams: Term positions are also important in document representation for Sentiment
Analysis. The position of terms determines, and sometimes reverses, the polarity of the
phrase. So, position information is sometimes encoded into the feature vector.
• Syntax: Syntax information has also been used in feature sets, though there is still
discussion about the merit of this information in Sentiment classification. This
information may include important text features such as negation, intensifiers, and
diminishers.
• Negations: Negations have long been known to be integral to Sentiment Analysis. The
usual bag-of-words representation of text disconnects all of the words, and considers
sentences like “I like this book” and “I don’t like this book” very similar, since only one
word distinguishes one from the other. But when talking about sentiment, a negation
flips the polarity of a whole phrase.
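
The feature choices listed above can be illustrated with a short sketch. It contrasts term frequency with term presence, builds bigrams as a simple way of encoding word order, and marks negated words with a `NOT_` prefix. The prefixing scheme is one common workaround, introduced here as an assumption rather than taken from this paper; the function names and the small `NEGATIONS` set are likewise hypothetical, and the input is treated as a single phrase.

```python
import re
from collections import Counter
from typing import Dict, List

# Small illustrative set of negation words (an assumption, not a full list).
NEGATIONS = {"not", "no", "never", "don't", "doesn't", "isn't", "didn't"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower().replace("’", "'"))


def term_frequency(tokens: List[str]) -> Dict[str, int]:
    """Feature value = how many times each term occurs."""
    return dict(Counter(tokens))


def term_presence(tokens: List[str]) -> Dict[str, int]:
    """Feature value = 1 if the term occurs at all, however often."""
    return {tok: 1 for tok in set(tokens)}


def bigrams(tokens: List[str]) -> List[str]:
    """Adjacent word pairs, which keep some of the word order."""
    return [first + " " + second for first, second in zip(tokens, tokens[1:])]


def mark_negation(tokens: List[str]) -> List[str]:
    """Prefix every token that follows a negation word with NOT_."""
    marked, negated = [], False
    for tok in tokens:
        if tok in NEGATIONS:
            negated = True
            marked.append(tok)
        else:
            marked.append("NOT_" + tok if negated else tok)
    return marked


plain = set(tokenize("I like this book"))
negated = set(mark_negation(tokenize("I don't like this book")))
shared = plain & negated  # only "i" remains in common after marking
```

Without the marking step the two example sentences share four of their tokens; with it they share only one, so a bag-of-words classifier can tell them apart.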
Some techniques are discussed below:
• Extended lexicons: Lexicons are a fundamental part of Sentiment Analysis, but not all of
them are alike. The simplest ones are those with a binary classification of words into
positive vs. negative polarities or objective vs. subjective. A finer distinction
between the classes can be made with fuzzy lexicons, where each label has a score
associated with it, conveying the “strength” of the label. A yet more sophisticated
approach is to adopt any of the finer-grained affective classifications developed in areas
of psychology, such as Plutchik’s emotion model (Prinz, 2004).
• Using Training Documents: It is possible to perform sentiment classification using
statistical analysis and machine learning tools that take advantage of the vast resources
of labeled (manually by annotators) documents available. Product review websites like
C-NET and eBay are typical sources of such labeled documents.
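
Both techniques above can be sketched in a few lines. The first function scores text against a fuzzy lexicon in which each word carries a signed score. The other two train and apply a classifier on labeled documents. Naive Bayes with add-one smoothing is used here only as one familiar choice of machine learning tool, not as the method this paper prescribes; the toy lexicon, its scores, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical fuzzy lexicon: sign gives polarity, magnitude gives strength.
TOY_LEXICON: Dict[str, float] = {
    "excellent": 0.9, "good": 0.5, "bad": -0.5, "terrible": -0.9,
}


def lexicon_score(tokens: List[str], lexicon: Dict[str, float]) -> float:
    """Sum the scores of tokens found in the lexicon; unknown words count as 0."""
    return sum(lexicon.get(tok, 0.0) for tok in tokens)


def train_naive_bayes(docs: List[Tuple[List[str], str]]):
    """Count labels and per-label token occurrences from (tokens, label) pairs."""
    label_counts: Counter = Counter()
    word_counts: Dict[str, Counter] = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(tokens: List[str], model) -> str:
    """Return the label with the highest log-probability (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n_docs in label_counts.items():
        logp = math.log(n_docs / total_docs)  # log prior of the label
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            count = word_counts[label][tok]
            logp += math.log((count + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Toy labeled documents standing in for annotated product reviews.
training = [
    (["good", "book"], "positive"),
    (["excellent", "plot"], "positive"),
    (["bad", "book"], "negative"),
    (["terrible", "plot"], "negative"),
]
model = train_naive_bayes(training)
```

The lexicon approach needs no training data but depends on the coverage of the word list, while the trained classifier learns its vocabulary from the labeled documents it is given.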
Conclusion
This paper describes the field of Sentiment Analysis and its latest developments. Bringing
together researchers from computer science, data mining, text retrieval, and computational
linguistics, this field provides ample opportunities for both quantitative and qualitative work.
Tackling the blurry definition of sentiment and the complexity of its manifestation in text, it
opens doors for novel uses of techniques already developed for data mining and text analysis
and brings up new questions, prompting development of yet better tools. The Internet provides us
with an unlimited source of the most diverse and opinionated text, and as yet only a small
part of the existing domains has been explored. Much work has been done on product reviews,
short documents that have a well-defined topic. More general writing, such as blog posts and
web pages, has recently been receiving more attention. Still, the field is struggling with more
complex texts like sophisticated political discussions and formal writings. Future work in
expanding existing techniques to handle more linguistic and semantic patterns will surely be an
attractive opportunity for researchers and business people alike.

A selection of lists of “fundamental” or “basic” emotions


References
Agrawal, R., Rajagopalan, S., Srikant, R., and Xu, Y. (2003). Mining newsgroups using networks
arising from social behavior. Proceedings of the Twelfth International World Wide Web Conference.

Airoldi, E. M., Bai, X., and Padman, R. (2006). Markov blankets and meta-heuristic search:
Sentiment extraction from unstructured text. Lecture Notes in Computer Science, 3932:167–187.

Annett, M. and Kondrak, G. (2008). A comparison of sentiment analysis techniques: Polarizing
movie blogs. Advances in Artificial Intelligence, 5032:25–35.

Bai, X., Padman, R., and Airoldi, E. (2005). On learning parsimonious models for extracting
consumer opinions. Proceedings of the Hawaii International Conference on System Sciences.

Bansal, M., Cardie, C., and Lee, L. (2008). The power of negative thinking: Exploring label
disagreement in the min-cut classification framework. Proceedings of the International
Conference on Computational Linguistics (COLING).

Benamara, F., Cesarano, C., Picariello, A., Reforgiato, D., and Subrahmanian, V. (2007).
Sentiment analysis: Adjectives and adverbs are better than adjectives alone. Proceedings of
the International Conference on Weblogs and Social Media (ICWSM).

Blitzer, J., Dredze, M., and Pereira, F. (2007). Biographies, bollywood, boom-boxes and
blenders: Domain adaptation for sentiment classification. Proceedings of the 45th Annual
Meeting of the Association for Computational Linguistics, pages 440–447.

Chesley, P., Vincent, B., Xu, L., and Srihari, R. K. (2006). Using verbs and adjectives to
automatically classify blog sentiment. Proceedings of the AAAI Spring Symposium on
Computational Approaches to Analyzing Weblogs.

Church, K. W. and Hanks, P. (1989). Word association norms, mutual information and
lexicography. Proceedings of the 27th Annual Meeting of the Association for Computational
Linguistics, pages 76–83.

Das, S. and Chen, M. (2001a). Yahoo! for amazon: Extracting market sentiment from stock
message boards. Proceedings of the Asia Pacific Finance Association Annual Conference (APFA).

Das, S. and Chen, M. (2001b). Yahoo! for amazon: Sentiment parsing from small talk on the
web. Proceedings of the 8th annual Conference of the Asia Pacific Finance Association (APFA).

Dave, K., Lawrence, S., and Pennock, D. M. (2003). Mining the peanut gallery: Opinion
extraction and semantic classification of product reviews. Proceedings of the World Wide Web
Conference.

……………………………………………………………………….
