Sentiment Analysis Dataset

The document provides a list of datasets that can be used for sentiment analysis and natural language processing tasks. It includes links to datasets containing customer reviews from Amazon and IMDb movie reviews. Additionally, it lists medical forum datasets containing health discussions and a depression forum dataset. The document also provides links to multi-modal sentiment analysis datasets that contain audio, text and video modalities like MOSI and IEMOCAP.

1. Kaggle datasets
a) Goodreads books
b) Pizza restaurants
c) Zomato Bangalore restaurants
d) Amazon customer reviews - https://www.kaggle.com/bittlingmayer/amazonreviews
e) Twitter dataset - https://www.kaggle.com/kazanova/sentiment140
f) YouTube dataset - https://www.kaggle.com/datasnaek/youtube-new
g) Twitter US airline sentiment data - https://www.kaggle.com/crowdflower/twitter-airline-sentiment
h) Amazon Alexa product reviews - https://www.kaggle.com/sid321axn/amazon-alexa-reviews
i) Movie review dataset - https://inclass.kaggle.com/c/si650winter11/data

2. Multi-Domain Sentiment Dataset (Version 2.0)

http://www.cs.jhu.edu/~mdredze/datasets/sentiment/

The Multi-Domain Sentiment Dataset contains product reviews taken from Amazon.com from
many product types (domains). Some domains (books and DVDs) have hundreds of thousands of
reviews. Others (musical instruments) have only a few hundred. Reviews contain star ratings (1
to 5 stars) that can be converted into binary labels if needed. The dataset page linked above
describes the data; questions can be emailed to Mark Dredze or John Blitzer.
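
The conversion from star ratings to binary labels can be sketched as follows. This is a minimal sketch, not the dataset authors' procedure: the function name `stars_to_binary` and the choice to treat 3-star reviews as neutral and drop them are assumptions made for illustration.

```python
from typing import Optional


def stars_to_binary(stars: float, neutral: float = 3.0) -> Optional[int]:
    """Map a 1-5 star rating to a binary sentiment label.

    Returns 1 (positive) above the neutral rating, 0 (negative) below it,
    and None for neutral reviews, which the caller may discard.
    The neutral threshold of 3 is an assumption, not part of the dataset.
    """
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    if stars > neutral:
        return 1
    if stars < neutral:
        return 0
    return None


# Hypothetical ratings, for illustration only.
ratings = [5, 4, 3, 2, 1]
labels = [stars_to_binary(r) for r in ratings]

# Keep only the reviews that received a positive or negative label.
labeled = [(r, y) for r, y in zip(ratings, labels) if y is not None]
```

Dropping the neutral reviews keeps the two classes well separated; a different threshold may suit other tasks.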

3. IMDB reviews

This is another older and relatively small dataset for binary sentiment classification, featuring
25,000 movie reviews for training and testing. The download also includes unlabeled data and a
README file with more details about the dataset.
https://blog.cambridgespark.com/50-free-machine-learning-datasets-sentiment-analysis-b9388f79c124

4. Stanford – Movie review dataset


This is a dataset for binary sentiment classification containing substantially more data than
previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for
training, and 25,000 for testing. There is additional unlabeled data for use as well. Raw text and
already processed bag of words formats are provided. See the README file contained in the
release for more details.
https://ai.stanford.edu/~amaas/data/sentiment/
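
For readers unfamiliar with the bag-of-words representation mentioned above, here is a minimal sketch of the idea using only the Python standard library. The tokenizer and the helper name `bag_of_words` are assumptions for illustration; the processed files in the release use their own vocabulary and file format, which the README describes.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count how often each lowercase token occurs, ignoring word order."""
    # Simple illustrative tokenizer: runs of letters, digits and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


# Hypothetical review text, for illustration only.
review = "A great film, great cast."
bow = bag_of_words(review)
```

Each review becomes a mapping from token to count, which a classifier can consume once it is aligned to a fixed vocabulary.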

5. MedHelp forum dataset on different health topics

https://www.medhelp.org/forums/Mens-Health/show/93
The eDiseases dataset contains patient data from the MedHelp health site
(http://www.medhelp.org/), where different communities share information and opinions about
diseases. Each community consists of a number of conversations, each conversation being a
sequence of comments posted by patients.
Dataset link: https://zenodo.org/record/1479354#.XV4p8eMzbIU
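
The community, conversation and comment hierarchy described above could be modeled in memory roughly as follows. This is only a sketch: the class names, field names and sample values are hypothetical, and the actual files on Zenodo may be organized differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    author: str  # hypothetical field; the real data may identify patients differently
    text: str


@dataclass
class Conversation:
    # A conversation is an ordered sequence of comments.
    comments: List[Comment] = field(default_factory=list)


@dataclass
class Community:
    name: str
    conversations: List[Conversation] = field(default_factory=list)

    def comment_count(self) -> int:
        """Total number of comments across all conversations."""
        return sum(len(conv.comments) for conv in self.conversations)


# Hypothetical content, for illustration only.
community = Community(
    name="example-community",
    conversations=[
        Conversation([Comment("user_a", "First post."), Comment("user_b", "A reply.")]),
        Conversation([Comment("user_c", "Another thread.")]),
    ],
)
```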

6. Depression dataset

https://www.depressionforums.org/forums/login/

7. MOSI, IEMOCAP and MOSEI datasets – Soujanya Poria et al.

https://github.com/soujanyaporia/multimodal-sentiment-analysis

S.No  Name, Year                           Modality               Web link
1     MOUD, 2013                           Audio, Text and Video  http://web.eecs.umich.edu/~mihalcea/downloads.html
2     MOSI, 2018                           Audio, Text and Video  https://www.amir-zadeh.com/datasets
3     ICT-MMMO, 2013                       Audio, Text and Video  By sending mail to Giota Stratou ( [email protected] )
4     HUMAINE, 2007                        Audio and Video        https://humaine-db.sspnet.eu/
5     BELFAST, 2000                        Audio and Video        https://belfast-naturalistic-db.sspnet.eu/
6     SEMAINE, 2007                        Audio and Video        https://semaine-db.eu/
7     IEMOCAP, 2008                        Audio and Video        https://sail.usc.edu/iemocap/
8     The eNTERFACE, 2006                  Audio and Video        http://enterface.net/
9     Classified datasets                  Text                   thinknook.com/wp-content/uploads/2012/09/
10    UCI Machine Learning Repository      Audio and Video        https://archive.ics.uci.edu/ml/index.php
11    Sentiment Treebank                   Text                   https://nlp.stanford.edu/sentiment/treebank.html
12    Sentiment Analysis Dataset - Kaggle  Text                   https://www.kaggle.com/sonaam1234/sentimentdata
13    YouTube dataset                      Audio, Text and Video  By sending mail to Giota Stratou ( [email protected] )
