Orange 2 Sentiment Analysis Part 1
with Orange
Amy Larner Giroux, PhD
UCF Center for Humanities & Digital Research
NEH Digital Culture Summer Institute
Welcome
The tutorial outlined in this slide deck is an extension of the
Introduction to Data Analysis in Orange. We will work
through your second workflow in Orange and learn about
Sentiment Analysis. For full details, please consult the
“Orange Tweet Analysis Tutorial” PDF available in Google
Classroom. That document covers both tutorials for this
week in more detail.
Agenda
• Downloading the dataset
• Beginning where we left off in the previous tutorial
• Detailed steps for your second workflow – from
Preprocess Text to Sentiment Analysis
• Updating the current widgets with new data
• Adding a Tweet Profiler and Box Plot
• Adding Sentiment Analysis and Scatter Plots
• Your NEH Institute deliverables for this tutorial
Tutorial Dataset
Please download the dataset:
http://chdr.cah.ucf.edu/neh-digculture/Tweet-Profiled-ReadyForOrangeNEH.xlsx
The dataset for this tutorial has four categorical columns in addition to the regular tweet data.
The criteria used for the Influentials, Opinion-leaders, and Political Leaning columns are
discussed in the “Orange Tweet Analysis Tutorial” PDF. The last column, Political Leaning
(numeric), was added so that the category can be used within the Scatter Plot widget. More
on that later in the tutorial.
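The idea behind a numeric copy of a categorical column can be sketched in a few lines of Python. This is only an illustration: the category labels and the codes below are hypothetical, not the values used in the tutorial spreadsheet, and the function name encode_leaning is invented for this example.

```python
# Hypothetical labels and codes (assumptions); the real values are in the spreadsheet.
LEANING_CODES = {"Left": 0, "Center": 1, "Right": 2}

def encode_leaning(labels, codes=LEANING_CODES):
    """Return a numeric code for each label, or None when the label is unknown."""
    return [codes.get(label) for label in labels]

sample = ["Left", "Right", "Center", "Unknown"]
print(encode_leaning(sample))
```

A numeric column like this can be placed on a plot axis, which a text category cannot always do.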
Reopen Your First Workflow
Open Orange and use Ctrl-Alt-O or File -> Open and Freeze to open your workflow
from the previous tutorial.
Using the previous workflow will save us some setup time.
File Load
Double-click on the File widget to open it.
Using the open folder button, browse to and choose the tutorial spreadsheet. If you are on
Windows and have saved it to your desktop, you may need to navigate through Users ->
your user name -> Desktop to get to the file.
After opening the tutorial spreadsheet, click on the
Reload button at the top right to refresh the data.
Select Columns
Double-click on the Select Columns widget to open its options screen.
Corpus
Double-click on the Corpus widget to open the options screen.
Workflow Check
At this point in the tutorial, your workflow should resemble the one in this image, which is
the same as the workflow at the end of the last tutorial.
We will leave the Preprocess Text widget alone unless you changed
the Stopwords language to something other than English. If so,
please set it back to English.
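For readers curious about what a stopword filter does to a tweet, here is a minimal sketch in Python. It is not the code Orange runs: the stopword list is a tiny hypothetical sample, the tokenizing rule is an assumption, and the widget's actual English list is not reproduced here.

```python
import re

# Tiny illustrative stopword list (an assumption, not the widget's list).
STOPWORDS = {"the", "a", "an", "and", "is", "to", "of", "in", "for"}

def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase the text, split it into word tokens, and drop the stopwords."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [token for token in tokens if token not in stopwords]

print(remove_stopwords("The vaccine is a step in the right direction"))
```

If the Stopwords language does not match the language of the tweets, common words like these stay in the corpus and crowd out the meaningful ones.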
We have imported the new COVID19 data file; chosen the columns for
target variables; changed the Data Sample widget to use 100% of our
dataset; and made sure the Corpus had the columns correctly set.
The profiler defaults to the Ekman (Multi Class) model for emotions
and is set to Commit Automatically.
1 The assigned reading for the homework, “Emotion Recognition on Twitter: Comparative Study and Training a Unison
Model” by Niko Colnerič and Janez Demšar, describes in detail the various models used in the Tweet Profiler widget.
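As a rough illustration of what "Multi Class" means for the Ekman model, each tweet receives exactly one of six emotion labels. The sketch below assumes the label is simply the highest-scoring emotion. The scores are made up for this example and the function name top_emotion is hypothetical; the widget's internals are described in the assigned reading, not here.

```python
# Ekman's six basic emotions.
EKMAN_EMOTIONS = ("anger", "disgust", "fear", "joy", "sadness", "surprise")

def top_emotion(scores):
    """Return the single emotion with the highest score (missing emotions count as 0)."""
    return max(EKMAN_EMOTIONS, key=lambda emotion: scores.get(emotion, 0.0))

# Made-up scores for one tweet, for illustration only.
example_scores = {
    "anger": 0.05, "disgust": 0.02, "fear": 0.61,
    "joy": 0.08, "sadness": 0.20, "surprise": 0.04,
}
print(top_emotion(example_scores))
```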