Can Deep Learning solve the Sentiment Analysis Problem? 
Mark Cieliebak, Zurich University of Applied Sciences
Annual Meeting of SGAICO – Swiss Group for Artificial Intelligence and Cognitive Science
18.11.2014
Outline 
1. What is sentiment analysis?
2. How good are "classical" approaches?
3. Does deep learning solve the problem?
18.11.2014 Mark Cieliebak 2
About Me 
Mark Cieliebak 
Institute of Applied Information Technology (InIT) 
ZHAW, Winterthur 
Email: ciel@zhaw.ch, Website: www.zhaw.ch/~ciel 
Research Interests:
• Text Analytics
• Open Data
• Automated Test Generation
• Software Engineering
What is Sentiment Analysis
"… WiFi Analytics is a free Android app that I find very handy when it comes to troubleshooting and monitoring a home network." [1]
Sample Application: Social Media Monitoring
Text Analytics Components:
• Find relevant documents
• Hot topic analysis
• Sentiment analysis
[7]
Flavours of Sentiment Analysis
• Document-Based
• Sentence-Based
• Target-Specific
• Rating Prediction
Classic Approaches to Sentiment Analysis
• Rule-Based
• Corpus-Based
[Figure: diagram ending in "Predicted Label"] [3] [4]
Simple Sentiment Analysis
Idea: Count the number of positive and negative words
Use Sentiment Dictionary:
POSITIVE: great, love, nice, ...
NEUTRAL: hello, see, I, …
NEGATIVE: bad, hate, ugly, ...
Examples:
"This camera is great[+1]." → +1 (pos)
"I find it beautiful[+1] and good[+1]." → +2 (pos)
"It looks terrible[-1]." → -1 (neg)
"This car has a blue color." → 0 (neu)
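The counting idea on this slide can be sketched in a few lines of Python. This is only an illustration: the word sets below are a small assumed dictionary (extended with the words used in the slide's examples), and the names `count_score` and `label` are hypothetical.

```python
import re

# Tiny illustrative dictionary; these word sets are assumptions,
# not the dictionary used in the talk.
POSITIVE = {"great", "love", "nice", "beautiful", "good"}
NEGATIVE = {"bad", "hate", "ugly", "terrible"}


def count_score(sentence):
    """Return (#positive words) - (#negative words) for one sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)


def label(score):
    """Map a numeric score to pos / neg / neu."""
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neu"


for s in [
    "This camera is great.",
    "I find it beautiful and good.",
    "It looks terrible.",
    "This car has a blue color.",
]:
    print(s, count_score(s), label(count_score(s)))
```

Words that appear in neither set contribute nothing, which is how the sketch treats the NEUTRAL list.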
Sample Rules
• Detect Booster Words: "The car is really very expensive[-1 -1 -2]."
• New Category "Mixed": "This car has an appealing[+1] design and comfortable[+1] seats, but it is expensive[-1]."
• Negation: Invert only the score of words occurring after the negation: "The car is appealing[+3] and I do not[*-1] find it expensive[-2]."
• Problem case: "I do not find the car expensive and it is appealing."
Need to "understand" the sentence
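A rough sketch of the negation rule, and of why it fails on the last sentence, might look as follows. The scores for "appealing" and "expensive" are taken from the slide; the negation word list and the function name `score_with_negation` are assumptions made for illustration.

```python
import re

SCORES = {"appealing": 3, "expensive": -2}  # scores as annotated on the slide
NEGATIONS = {"not", "never", "no"}          # assumed negation words


def score_with_negation(sentence):
    """Invert the score of every sentiment word that occurs after a negation."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    total, negated = 0, False
    for tok in tokens:
        if tok in NEGATIONS:
            negated = True  # stays active until the end of the sentence
        elif tok in SCORES:
            total += -SCORES[tok] if negated else SCORES[tok]
    return total


# Negation at the end: only "expensive" is inverted, as intended.
print(score_with_negation("The car is appealing and I do not find it expensive"))
# Negation at the start: "appealing" is inverted too, which is not intended.
print(score_with_negation("I do not find the car expensive and it is appealing."))
```

In the second sentence the rule also flips "appealing", so a clearly positive sentence gets a negative score. That is the sense in which the rule needs to "understand" the sentence.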
Linguistic Analysis
→ RULE: Invert scores of words that are in the same phrase as the negation.
“I do not find the car expensive[+2] and it is appealing[+3].” → +5 (pos)
[Figure: parse tree of the sentence "I do not find the car expensive and it is appealing", with node labels Sentence, Conj., Noun Phrase, Verb Phrase, Verb, Adverb, Adj., Det., Noun, Participle]
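The phrase-scoped rule can be approximated without a real parser. The sketch below is a deliberate simplification: it treats the conjunctions "and" and "but" as clause boundaries instead of building the parse tree shown on the slide. The word lists and the name `score_clause_scoped` are assumptions.

```python
import re

SCORES = {"appealing": 3, "expensive": -2}  # scores as annotated on the slides
NEGATIONS = {"not"}
CONJUNCTIONS = {"and", "but"}  # crude stand-in for clause boundaries in a parse tree


def score_clause_scoped(sentence):
    """Invert sentiment scores only inside the clause that contains the negation."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    total, negated = 0, False
    for tok in tokens:
        if tok in CONJUNCTIONS:
            negated = False  # a new clause starts, so the negation scope ends
        elif tok in NEGATIONS:
            negated = True
        elif tok in SCORES:
            total += -SCORES[tok] if negated else SCORES[tok]
    return total


print(score_clause_scoped("I do not find the car expensive and it is appealing."))
```

A real system would take the scope from a syntactic parser; splitting at conjunctions only works for simple coordinated clauses like this one.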
Rule-Based Sentiment Analysis
Most Important Issues:
- Requires good hand-crafted rules
- Hard to transfer to new tasks or languages
- Does not work well for texts with bad grammar (Twitter)
[5]
Corpus-Based Sentiment Analysis
[Figure: diagram ending in "Predicted Label"] [4]
Corpus-Based Sentiment Analysis
Annotated Corpus
Polarity | Sentence
Pos | This analysis is good.
Neg | It looks awful.
Neu | This car has a blue color.
Mix | This car has an appealing design, comfortable seats, but it is expensive.
Mix | This car has a very appealing design, comfortable seats, but it is really expensive.
Neg | This analysis is not good.
Mix | This car has an appealing design, comfortable seats and it is not expensive.
Neg | This movie was like a horror event.
Mix | This car is appealing and is not expensive.
... | ...
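To make "corpus-based" concrete, here is a minimal sketch that learns a classifier from the annotated sentences above. The talk does not say which learning algorithm is used; a multinomial Naive Bayes model with add-one smoothing is swapped in purely as an assumption, and all function names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# The annotated corpus from the slide: (sentence, polarity).
CORPUS = [
    ("This analysis is good.", "Pos"),
    ("It looks awful.", "Neg"),
    ("This car has a blue color.", "Neu"),
    ("This car has an appealing design, comfortable seats, but it is expensive.", "Mix"),
    ("This car has a very appealing design, comfortable seats, but it is really expensive.", "Mix"),
    ("This analysis is not good.", "Neg"),
    ("This car has an appealing design, comfortable seats and it is not expensive.", "Mix"),
    ("This movie was like a horror event.", "Neg"),
    ("This car is appealing and is not expensive.", "Mix"),
]


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def train(corpus):
    """Count documents per label and word occurrences per label."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sentence, lab in corpus:
        doc_counts[lab] += 1
        for tok in tokenize(sentence):
            word_counts[lab][tok] += 1
            vocab.add(tok)
    return doc_counts, word_counts, vocab


def predict(text, model):
    """Return the label with the highest Naive Bayes log-probability."""
    doc_counts, word_counts, vocab = model
    n_docs = sum(doc_counts.values())
    best_label, best_logp = None, -math.inf
    for lab in doc_counts:
        total = sum(word_counts[lab].values())
        logp = math.log(doc_counts[lab] / n_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                logp += math.log((word_counts[lab][tok] + 1) / (total + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = lab, logp
    return best_label


model = train(CORPUS)
print(predict("It looks awful.", model))
```

With only nine training sentences the model mostly memorises surface words, which illustrates why this approach depends on large annotated corpora.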
Sample Features for Tweets
• Word ngrams: presence or absence of contiguous sequences of 1, 2, 3, and 4 tokens; noncontiguous ngrams
• POS: the number of occurrences of each part-of-speech tag
• Sentiment Lexica: each word annotated with tonality score (-1..0..+1)
• Negation: the number of negated contexts
• Punctuation: the number of contiguous sequences of exclamation marks, question marks, and both exclamation and question marks
• Emoticons: presence or absence; whether the last token is a positive or negative emoticon
• Hashtags: the number of hashtags
• Elongated words: the number of words with one character repeated (e.g. ‘soooo’)
from: Mohammad et al., SemEval 2013
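Several of these features are simple surface counts and can be sketched with regular expressions. The code below is an illustration under assumptions, not the feature extractor of Mohammad et al.: the emoticon sets are tiny samples, whitespace tokenisation is used, "elongated" is taken to mean a letter repeated three or more times, and a punctuation sequence is taken to be two or more marks in a row. POS, lexicon and negation features are left out.

```python
import re

# Small sample emoticon sets; assumptions for illustration only.
POSITIVE_EMOTICONS = {":)", ":-)", ":D", ";)"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'("}


def word_ngrams(tokens, max_n=4):
    """Set of all contiguous n-grams for n = 1..max_n, joined by spaces."""
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def tweet_features(tweet):
    tokens = tweet.split()  # whitespace tokenisation (assumption)
    runs = re.findall(r"[!?]{2,}", tweet)  # sequences of 2+ punctuation marks
    last = tokens[-1] if tokens else ""
    return {
        "ngrams": word_ngrams([t.lower() for t in tokens]),
        "num_hashtags": sum(1 for t in tokens if re.fullmatch(r"#\w+", t)),
        "num_elongated": sum(1 for t in tokens if re.search(r"([A-Za-z])\1{2,}", t)),
        "num_exclamation_runs": sum(1 for r in runs if set(r) == {"!"}),
        "num_question_runs": sum(1 for r in runs if set(r) == {"?"}),
        "num_mixed_runs": sum(1 for r in runs if set(r) == {"!", "?"}),
        "has_emoticon": any(t in POSITIVE_EMOTICONS | NEGATIVE_EMOTICONS for t in tokens),
        "last_is_positive_emoticon": last in POSITIVE_EMOTICONS,
        "last_is_negative_emoticon": last in NEGATIVE_EMOTICONS,
    }


features = tweet_features("I looove this #camera soooo much!!! :)")
print({k: v for k, v in features.items() if k != "ngrams"})
```

The resulting dictionary would typically be turned into a numeric feature vector before it is passed to a classifier.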
Corpus-Based Sentiment Analysis
Most Important Issues:
- Requires large annotated corpora
- Depends on good features
[6]
How good are Sentiment Analysis Tools?
Quick Poll
"How good are state-of-the-art sentiment analysis tools?"
• Short texts: 1-2 sentences from Twitter, news, reviews etc.
• Three-class classification: positive, negative, other
• Accuracy = #correct docs / #docs
Accuracy (vote options): <50%, 50-60%, 60-70%, 70-80%, 80-90%, >90%
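The accuracy measure used in the poll is the share of documents whose predicted class equals the annotated class. A minimal sketch follows; the function name and the example labels are made up for illustration.

```python
def accuracy(predicted, gold):
    """Accuracy = #correct docs / #docs."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical three-class example: 3 of 5 documents are classified correctly.
gold = ["positive", "negative", "other", "positive", "other"]
predicted = ["positive", "other", "other", "negative", "other"]
print(accuracy(predicted, gold))
```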
Tool Accuracy
[Figure: accuracy per corpus on an axis from 0.2 to 0.8, with the series "Best Tool per Corpus", "Worst Tool per Corpus" and "Overall Best Tool"] [14]
Averages: Best Tool per Corpus 61%, Worst Tool per Corpus 40%, Overall Best Tool 59%
Take-Home Lesson
Accuracy of the best commercial tool on arbitrary short texts is 59%
Approaches to Sentiment Analysis
• Rule-Based
• Corpus-Based
• Deep Learning
[Figure: diagram ending in "Predicted Label"] [9] [8]
Deep Learning on Text
It's all about Word Vectors!
Word2Vec
• Huge set of text samples (billions of words)
• Extract dictionary
• Word-Matrix: k-dimensional vector for each word (k typically 50-500)
• Word vectors initialized randomly
• Train word vectors to predict next words, given a sequence of words from sample text
Major contributions by Bengio et al. 2003, Collobert & Weston 2008, Socher et al. 2011, Mikolov et al. 2013
[9]
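The setup steps listed on this slide (dictionary, randomly initialized word matrix, next-word training examples) can be sketched as below. This covers only the data preparation as described on the slide; the training step that actually adjusts the vectors is omitted, and the toy text, the helper names and the context size are assumptions.

```python
import random


def build_vocab(tokens):
    """Extract the dictionary: map each distinct word to a row index."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def init_word_matrix(vocab_size, k, seed=0):
    """One randomly initialized k-dimensional vector per word."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(vocab_size)]


def next_word_pairs(tokens, context_size):
    """Training examples: (sequence of context words, next word to predict)."""
    return [
        (tuple(tokens[i:i + context_size]), tokens[i + context_size])
        for i in range(len(tokens) - context_size)
    ]


# Toy text; a real corpus would contain billions of words.
tokens = "the car is appealing and the car is not expensive".split()
vocab = build_vocab(tokens)
matrix = init_word_matrix(len(vocab), k=50)
pairs = next_word_pairs(tokens, context_size=2)

print(len(vocab), len(matrix), len(matrix[0]))
print(pairs[0])
```

During training, the rows of `matrix` would be updated so that the vectors of the context words become good predictors of the next word.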
The Magic of Word Vectors
King - Man + Woman ≈ Queen
Live Demo on 100b words from Google News dataset: http://radimrehurek.com/2014/02/word2vec-tutorial/
[10]
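The analogy arithmetic can be illustrated with a toy example. The three-dimensional vectors below are hand-made for this sketch and are not learned Word2Vec vectors; the function names are hypothetical. The nearest word is found by cosine similarity, excluding the three input words.

```python
import math

# Hand-made toy vectors (assumption): dimensions loosely mean royal, male, female.
VECTORS = {
    "king": [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.5, 0.5],
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = (w for w in VECTORS if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))


print(analogy("king", "man", "woman"))
```

With learned vectors the match is only approximate, which is why the slide writes ≈ rather than =.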
Relations Learned by Word2Vec
[11]
Using Word Vectors in NLP
Collobert et al., 2011:
• SENNA: Generic NLP system based on word vectors
• No manual feature engineering
• Solves many NLP tasks as well as benchmark systems
[12]
Deep Learning and Sentiment
Maas et al., 2011:
• Enrich word vectors with sentiment context
• Capture semantics of words (unsupervised) and sentiment (supervised) in parallel, using multiple learning tasks
[Figure: example words "wonderful", "amazing", "terrible", "awful"]
Deep Learning and Sentiment
Socher et al. 2013:
• Word vectors do not help for sentiment analysis
• Recursive Neural Tensor Networks
• Representing sentence structures as trees while adding sentiment annotations at the same time
• Restricted to single, well-structured sentences
[13]
Deep Learning and Sentiment
Le and Mikolov, 2014:
• "Paragraph Vectors"
• Add context (sentence, paragraph, document) to word vectors during training
• Improves many existing approaches
[9]
Does Deep Learning solve the Sentiment Analysis Problem?
Conclusion: Deep Learning for Sentiment
• Small improvements, not a revolution
• Very recent research, not yet the "end of the story"
• SemEval 2015 will be the benchmark
Talk in Short!
1. Classic approaches are rule-based or corpus-based
2. State-of-the-art tools classify 4 out of 10 docs wrong
3. Deep Learning does not need hand-crafted features
4. Deep Learning improves existing benchmarks
Thank You!
Mark Cieliebak
Zurich University of Applied Sciences (ZHAW)
Winterthur, Switzerland
Email: ciel@zhaw.ch, Website: www.zhaw.ch/~ciel
[15]
