Volume 2, Issue 4, April 2017 International Journal of Innovative Science and Research Technology

ISSN No: 2456-2165

Sentiment Analysis on Twitter in R


S. Sathya [1], Amith Mathew [2], T. S. Karthikraja [3], John Rosy [4]
[1] Assistant Professor, [2] [3] [4] UG Scholars
Department of CSE, RVS College of Engineering and Technology, Coimbatore

Abstract: Mining posts in social networks has great potential for new applications. However, the huge amount of data produced every day in these networks makes it impractical for people to undertake the task manually. The sentiment analysis system presented here returns the polarity of tweets written in English about a topic of interest. Two main requirements direct the design and development of the system: (a) good usability and (b) good precision in determining sentiments. To meet the first requirement, the system has a clean interface for entering the keywords that describe the topic of interest and for presenting the results at several levels of detail. To meet the second requirement, the system uses a Naïve Bayes classifier to identify the sentiments of tweets; the literature shows that this algorithm combines good classification performance with low response time. The system also achieved good results when its precision in identifying the predominant polarity of tweets was measured on five different topics of interest.

Keywords: Sentiment analysis, opinion mining, Twitter, Naive Bayes classifier.

I. INTRODUCTION

Social networks are increasingly present in the daily life of people of all ages and social layers, and are used in diverse activities: professional, entertainment and socializing. The opinions expressed in these networks are an important source for understanding the collective sentiment about various topics. They can provide feedback to companies about their brands and products, and to public figures, such as politicians, artists and athletes, about their reputation. The amount of information circulating through these networks is very high. Twitter, for example, recorded an average of roughly 500 million tweets per day in 2014. By 2015, the number of its active users rose from 300 million to 400 million, users who comment on, like and share ideas, opinions and criticism every day.

The sheer volume of data produced on social networks makes it impractical for people to analyze it. In this context, data mining aims to extract information automatically. Among the various data mining tasks that can be performed on text messages in natural language, sentiment analysis, also called opinion mining, aims at identifying the polarity (positive/negative) of messages. The terms "sentiment analysis" and "opinion mining" have come into wide use in recent years, owing to the penetration of social networks and the large range of applications that can be developed by identifying the subjectivity embedded in messages. Sentiment analysis is a natural language processing (NLP) problem in which only the positive or negative sentiment related to sentences, entities or topics is determined. Determining the sentiment associated with a fragment of text essentially consists of classifying it into one of the categories positive, negative or neutral.

The number of categories may be higher when the intensity of the sentiment is to be expressed: very positive, positive, not positive, neutral, etc. The success of these methods depends heavily on the selection or extraction of an adequate set of features of the text, including terms (n-grams), grammatical categories and semantic dependencies.

Existing work mostly focuses on the classification performance of the algorithms used to detect sentiments expressed in texts (blogs, news sites and social networks) written in English. The analysis described here processes text messages written and published on the social network Twitter, known as tweets, in order to identify their sentiment. A Naive Bayes (NB) classifier is used to evaluate the sentiments expressed in the tweets. This algorithm is adopted because it is simple to implement, requires relatively few computational resources and performs similarly to more complex and computationally expensive alternatives. The classifier is applied to the comments on the topic of interest. The results of the sentiment analysis are displayed as bars, so that the user has a clear visualization of the results obtained.

II. THE NAIVE BAYES CLASSIFIER

The Naive Bayes (NB) classifier is based on Bayes' rule for the inversion of conditional probabilities, presented here in the context of text classification:

$$P(c \mid d) = \frac{P(d \mid c)\, P(c)}{P(d)}$$

Here P(c | d) is the posterior probability of category c given the document d (a tweet, in this context), P(c) is the prior probability of category c, P(d) is the prior probability of document d, and P(d | c) is the probability of document d given that it belongs to class c. In sentiment analysis, c is either positive or negative.
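As a small numeric illustration of Bayes' rule for two categories, the sketch below computes P(c | d) from a prior and a likelihood. It is written in Python rather than the R used by the authors, and the function name `posterior` and all probability values are made up for illustration; none of them come from the paper. P(d) is obtained by summing P(d | c) P(c) over the categories.

```python
def posterior(prior, likelihood):
    """Return P(c | d) for every category c using Bayes' rule."""
    # Numerators P(d | c) * P(c), one per category
    joint = {c: likelihood[c] * prior[c] for c in prior}
    # P(d) is the sum of the numerators over all categories
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}

# Illustrative values only (assumptions, not taken from the paper)
priors = {"positive": 0.5, "negative": 0.5}
likelihoods = {"positive": 0.008, "negative": 0.002}

post = posterior(priors, likelihoods)
print(post)
```

Because P(d) is the same for every category, it changes the scale of the scores but not which category scores highest.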

IJISRT17AP14 www.ijisrt.com 33
The prior probability of a category, P(c), is determined from the number of categories considered (here two, positive and negative). When Bayes' theorem is used in the NB classifier, the denominator P(d) is treated as a constant and eliminated, so the expression reduces to its numerator:

$$P(c \mid d) \propto P(c) \prod_{i=1}^{n_d} P(x_i \mid c)$$

P(x_i | c) is the conditional probability that the word (term) x_i occurs in category c, and n_d is the number of terms in document d. P(x_i | c) is calculated by dividing the number of times the word x_i is found in the training list of category c by the total number of words in that list. The lists associated with the categories are built in the training phase of the classifier from examples of positive and negative tweets. The best class to assign to a document by the Naive Bayes classifier, c_NB, is therefore

$$c_{NB} = \arg\max_{c \in C} P(c) \prod_{i=1}^{n_d} P(x_i \mid c)$$

where C is the set of classes considered.

The Naïve Bayes (NB) classifier rests on two assumptions:

1) The positions of the words do not matter.
2) The probabilities of the terms are independent given a class c.

NB performs well in text classification tasks, comparably to more complex algorithms that are computationally more time-consuming.

III. FUNCTIONALITY AND ARCHITECTURE

A. System Flow Diagram

The flow of data through the system is shown in Fig 1.

B. Fetching data

Raw data are fetched from Twitter and analyzed using the R language.

C. Tokenizing

In sentiment analysis, tokenization is the process of breaking a stream of text up into words.

D. Data Pre-Processing

The text is cleaned by removing unwanted words.

E. Sentence Reduction

After data preprocessing is complete, the data are transformed into an understandable format.

F. Visualizing the data

The output of the classifier is visualized: the numbers of tweets classified as positive, negative, neutral and negation are estimated from the Twitter data.
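The tokenizing, pre-processing and sentence reduction steps could be sketched roughly as follows. This is a minimal illustration in Python rather than the authors' R code. The stop-word list, the regular expressions, the function names and the choice of a space-joined string as the "understandable format" are all assumptions, since the paper does not specify them.

```python
import re

# Hypothetical stop-word list; the paper does not say which words it removes.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "rt"}

def tokenize(text):
    """Break a stream of text up into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def preprocess(tweet):
    """Strip URLs and @mentions, tokenize, and drop unwanted words."""
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet)
    return [t for t in tokenize(tweet) if t not in STOP_WORDS]

def reduce_sentence(tokens):
    """Join the cleaned tokens back into one normalized string (assumed format)."""
    return " ".join(tokens)

tokens = preprocess("RT @user: The new phone is GREAT!! https://t.co/abc #happy")
reduced = reduce_sentence(tokens)
print(tokens)
print(reduced)
```

Removing URLs and mentions before tokenizing keeps fragments such as "https" or user names out of the word lists that the classifier is later trained on.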

[Figure: Twitter → Raw data → Tokenizer → Data pre-processing → Sentence reduction → Naïve Bayes classifier → Text identifier → Visualized output]

Fig 1. System Architecture

Fig 2. Result for the Sentiment Analysis.
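The Naive Bayes classification rule of Section II (pick the category that maximizes P(c) times the product of the word probabilities) might be implemented along the following lines. This is a hedged sketch in Python, not the authors' R implementation. The names `train` and `classify` and the toy training tweets are hypothetical. Two details are additions that the paper does not describe: scores are summed as logarithms to avoid numeric underflow, and add-alpha smoothing is applied so that a word missing from a category's list does not produce a zero probability.

```python
import math
from collections import Counter

def train(examples):
    """Build one word-count list per category from (tokens, label) pairs."""
    counts = {}
    for tokens, label in examples:
        counts.setdefault(label, Counter()).update(tokens)
    return counts

def classify(tokens, counts, alpha=1.0):
    """Return the category maximizing log P(c) + sum of log P(x_i | c)."""
    vocab = set().union(*counts.values())
    # Uniform prior determined by the number of categories
    log_prior = math.log(1.0 / len(counts))
    best_label, best_score = None, -math.inf
    for label, words in counts.items():
        total = sum(words.values())
        score = log_prior
        for t in tokens:
            # Count of the word in this category's list over the list size,
            # with add-alpha smoothing (an assumption, not in the paper)
            score += math.log((words[t] + alpha) / (total + alpha * len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data, invented for illustration
examples = [
    (["great", "phone"], "positive"),
    (["love", "great", "camera"], "positive"),
    (["bad", "battery"], "negative"),
    (["hate", "bad", "screen"], "negative"),
]
counts = train(examples)
print(classify(["great", "camera"], counts))
print(classify(["bad", "screen"], counts))
```

With the smoothing removed, the word probabilities would be the plain count ratio described in Section II, at the cost of failing on any word that was never seen in a category during training.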

IV. FINAL CONSIDERATIONS

Identifying the polarity of comments made about generic topics on the social network Twitter offers a wide range of possibilities, but it is a challenging task, even for humans. In addition to the difficulties inherent in the automatic interpretation of natural language texts, tweets are often written using slang and often do not respect standard language conventions. The system seeks to simplify, for the user, the process of identifying the topic of interest, collecting tweets and analyzing the polarity contained in these tweets. The polarity is determined using a Naive Bayes classifier.

The results obtained in the experimental evaluation described in the article indicate good performance, particularly considering that the system allows tweets on any topic to be analyzed. Training the algorithm with this perspective requires a large number of tweets on various subjects, both positive and negative. The strategy adopted was to collect a few thousand tweets and classify them as positive or negative according to the polarity associated with the emoticons each contained.

A system focused on a particular target (a brand, product, person or company) would allow the classifier to be trained with more specific messages, which would tend to increase accuracy. It would also allow the performance of the classifier to be compared with that obtained when the training tweets are classified manually.
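The emoticon-based labeling of training tweets might look roughly like the sketch below, written in Python rather than the authors' R. The emoticon sets and the function name `label_by_emoticon` are assumptions, as the paper does not list the emoticons it used. Discarding tweets with no emoticon or with emoticons of both polarities is also an assumption.

```python
# Hypothetical emoticon sets; the paper does not list the ones it used.
POSITIVE_EMOTICONS = {":)", ":-)", ":D", "=)", ";)"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'(", "=("}

def label_by_emoticon(tweet):
    """Return 'positive', 'negative', or None when the signal is absent or mixed."""
    words = tweet.split()
    has_pos = any(w in POSITIVE_EMOTICONS for w in words)
    has_neg = any(w in NEGATIVE_EMOTICONS for w in words)
    if has_pos == has_neg:  # neither kind present, or both
        return None
    return "positive" if has_pos else "negative"

raw_tweets = [
    "Loving the new update :)",
    "My flight got cancelled :(",
    "Meeting at noon",
    "Great camera :) but slow :(",
]
# Keep only the tweets that received an unambiguous label
labeled = [(t, label_by_emoticon(t)) for t in raw_tweets if label_by_emoticon(t)]
print(labeled)
```

Labels obtained this way are noisy, since an emoticon does not always match the polarity of the text, which is one reason a manually classified training set is a useful point of comparison.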

V. CONCLUSION

The use of social networking sites is increasing, and users wish to have a short, quick view of the discussions currently going on. With this system, users can obtain the analysis quickly, at lower cost and with greater accuracy. The output is presented in graphical form. The application can give a quick view of topics such as politics during an election, health care, movie reviews, customer service improvement and product reviews.
