MARK3088 - TUT4 wk5 - Setting Up Orange Text Analysis
Product Analytics
Tutorial 4
Week 5: Setting Up Orange For Learning
Agenda
• Install Orange data mining software
• Troubleshooting with tutor and classmates
• Basic function introduction
• Perform text analysis step by step with Orange
Orange
Orange is a data mining and visualization
toolbox for novice and expert alike. To
explore data with Orange, one requires no
programming or in-depth mathematical
knowledge. The workflow-based data
science tools democratize data science by
hiding complex underlying mechanics and
exposing intuitive concepts. Anyone who
owns data, or is motivated to peek into
data, should have the means to do so.
Install https://orangedatamining.com/
Building Workflows
The core principle of Orange is visual programming, which means
each analytical step is contained within a “widget”.
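To make the widget idea concrete, here is a minimal Python sketch of a workflow as a chain of steps, where each function plays the role of one widget and passes its output to the next. This is only an analogy, not how Orange is implemented: the function names and the numbers are made up for illustration.

```python
from typing import Any, Callable, Sequence


def load_numbers(_: Any = None) -> list[float]:
    """Stands in for a data-loading widget (illustrative values only)."""
    return [4.0, 8.0, 15.0, 16.0, 23.0, 42.0]


def keep_large(values: list[float]) -> list[float]:
    """Stands in for a filtering widget: keep values of at least 10."""
    return [v for v in values if v >= 10]


def average(values: list[float]) -> float:
    """Stands in for a statistics widget: arithmetic mean."""
    return sum(values) / len(values)


def run_workflow(steps: Sequence[Callable[[Any], Any]], data: Any = None) -> Any:
    """Feed the output of each step into the next, like links between widgets."""
    for step in steps:
        data = step(data)
    return data


result = run_workflow([load_numbers, keep_large, average])
print(result)
```

Reordering or swapping the functions in the list changes the analysis, much as rewiring widgets on the canvas does.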
Let us start with a simple workflow
https://orange3.readthedocs.io/projects/orange-visual-programming/en/latest/building-workflows/index.html
Basic data exploration
A simple workflow to inspect a loaded dataset. Here we use the Iris
flower data set (also called Fisher's Iris data set), a multivariate
data set made famous by the British statistician and biologist
Ronald Fisher.
What does the data look like?
Quick check with common statistics and
visualization widgets
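For comparison, the same kind of quick check can be sketched in plain Python. The example below uses only the standard library and six rows of the Iris data (two per species) typed in by hand, so it runs without Orange installed. The grouping by species and the choice of petal length are illustrative assumptions, not part of the tutorial workflow.

```python
import statistics
from collections import defaultdict

# Six rows of the Iris data set:
# (sepal length, sepal width, petal length, petal width, species), in cm.
ROWS = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]

# Group petal lengths by species.
petal_by_species: dict[str, list[float]] = defaultdict(list)
for sepal_l, sepal_w, petal_l, petal_w, species in ROWS:
    petal_by_species[species].append(petal_l)

# Mean petal length per species, rounded for display.
summary = {
    species: round(statistics.mean(lengths), 2)
    for species, lengths in petal_by_species.items()
}

for species, mean_length in summary.items():
    print(f"{species}: mean petal length = {mean_length} cm")
```

Even on this tiny sample the species separate clearly by petal length, which is the kind of pattern the statistics and visualization widgets make visible on the full data set.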
Saving your work (1/2)
You can save this workflow
using the File/Save menu and
share it with your colleagues.
Just don’t forget to put the data
files in the same directory as
the file with the workflow.
Saving your work (2/2)
Widgets also have a Report button in their
bottom status bar, which you can use to keep
a log of your analysis. When you find
something interesting, just click it and the
graph will be added to your log. You can also
add reports from the widgets on the path to
this one, to make sure you don’t forget
anything relevant.
You can save the report as HTML or PDF, or as a
report file that includes all workflow-related
report items and can be reopened later in Orange.
In this way, you and your colleagues can
reproduce your analysis results.
Widget Catalogue from Orange
Sentiment Analysis
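One common family of sentiment methods scores a text by counting words from positive and negative word lists. The sketch below illustrates that idea in plain Python. The word lists are tiny and made up for this example, and the slide does not say which method is used in the tutorial workflow, so treat this as an illustration of the concept only.

```python
import re

# Hypothetical, deliberately tiny lexicons for illustration.
POSITIVE = {"great", "good", "love", "excellent"}
NEGATIVE = {"bad", "poor", "broken", "hate"}


def sentiment_score(text: str) -> int:
    """Return (# positive words) - (# negative words) in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


for review in [
    "Great phone, love it",
    "Poor battery and broken screen",
    "It arrived on Tuesday",
]:
    print(review, "->", sentiment_score(review))
```

A score above zero suggests positive wording, below zero negative wording, and zero means no lexicon words were found or they cancelled out.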
Preprocessing Text
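Text preprocessing typically means lowercasing, splitting text into tokens, and filtering out very common words. The sketch below shows those three steps in plain Python. The stopword list and the sample sentence are assumptions made for illustration, and a real workflow would use a fuller stopword list.

```python
import re

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "is", "and", "of", "to", "it", "this", "but"}


def preprocess(text: str) -> list[str]:
    """Lowercase the text, tokenize on letters, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]


tokens = preprocess("The battery is great, but the screen scratches easily!")
print(tokens)
```

Changing any of these steps (for example, keeping stopwords) changes what the later analysis steps see, so it is worth inspecting the tokens before moving on.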
Word Cloud
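A word cloud is a picture of word frequencies: the more often a word occurs, the larger it is drawn. The sketch below computes those underlying counts in plain Python. The three sample reviews are made up for illustration.

```python
import re
from collections import Counter

# Made-up sample documents for illustration.
reviews = [
    "Great battery and great price",
    "Battery died after a week",
    "Price is fair",
]

# Count every lowercase word across all documents.
counts = Counter(
    word
    for review in reviews
    for word in re.findall(r"[a-z]+", review.lower())
)

# The most frequent words are the ones a word cloud would draw largest.
top_words = counts.most_common(3)
print(top_words)
```

Note that filler words such as "and" or "a" are still counted here, which is why text is usually preprocessed before a word cloud is built.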
Topic Modelling
What Did We Examine Today?
• Learn about Orange data mining software
• Understand how to use Orange to explore data and save your work
• Understand methodology for text analysis
What’s next?
QUESTIONS?
Consultations: upon request
• Request via email
Email:
• Response within 24hr, on weekdays
Thank you!