Visual Analysis-Part I

Module I: Introducing Tableau


End goal
• By the end of this module we will:
– Install Tableau on a local computer.
– Identify appropriate data for analysis.
– Create an initial visualization in Tableau.
Some technicalities
• Utilizes VizQL, a visual query language that
translates mouse inputs (drag-and-drop) into
database queries.

• Easy exploration of the data: you do not need to
know what you are looking for beforehand.

• Easier handling of table dimensions

• Interactivity!!
Example: interactivity
Basic workflow
1. Connect to dataset: files, databases, cloud
services (Google Analytics)
2. Query the dataset visually
3. Display the results in various types of charts,
plots and maps
4. Collate individual charts in a dashboard and put
them in context to create a story.
5. Communicating the results: individual
workbooks, interactive dashboards, social media
Tableau Suite
• Tableau Desktop: Connect to Excel/CSV files
and save workbooks to local drives.
• Tableau Prep: Prepares data before it is
processed in Tableau Desktop. Helps with
merging differently formatted datasets, cleaning
data, and aggregation.
• Tableau Server: Organization-wide data
visualization and dashboard creation that can
be viewed in a browser.
Tableau Suite-contd..
• Tableau Online: Similar to Tableau Server but is a
cloud-based service. No local server hardware is
needed.
• Tableau Public: A hosting service for publishing
data visualizations on the web. Typically used by
organizations to get data stories into the public eye.
Visualizations can be viewed in a browser directly on
the Tableau Public platform, or they can be embedded
into blogs and websites.
• Tableau Reader: A free desktop application to open
and interact with Tableau workbook files that have
been created in Tableau Desktop.
Tableau Public: examples
• https://public.tableau.com/en-gb/gallery/how-happy-are-we?tab=viz-of-the-day&type=viz-of-the-day

• https://public.tableau.com/en-gb/s/resources
Installing Tableau Public version
• Install Tableau Public for free
• Go to:
https://public.tableau.com/app/discover
• Create an account and sign in
• Go to the “Create” tab
• Click on “Download Tableau Desktop Public
Edition”
• Click on “Download Tableau Public”
Basic Data Format
• The two most common ways of structuring
datasets:
• Wide table format: Often summary tables
containing aggregated measures, or static
variables.
• Long table format: Mostly raw data. Each row
contains one data point.
More on data format
• Three key considerations:
– What is a variable?
– What is a unit of observation?
– Which data should go in each row of the data
matrix?
Wide table
Context-
• Each county was measured at four time
points, once every 10 years starting in 1970.
• The outcome variable is # of jobs in each
county.
• Three other variables: Land Area, Natural
Amenity (4=No and 3=Yes), and the proportion
of the county population in that year that had
graduated from college [1].
[1] https://www.theanalysisfactor.com/wide-and-long-data/
Wide table: contd..
Observe:
• Land area and presence of a natural amenity
don’t change from decade to decade, so each
needs only one variable (column) per county.
• Jobs and the proportion of college grads have
different values in each year, so they require a
different variable (column) for each year.
Wide table- example
Long tables
• Recall: in long format each row is one time point
per subject (county).
• Each county will have data in multiple rows.
• Any variables that don’t change across time will
have the same value in all the rows.
• No longer need four columns for either Jobs or
College. All four values of Jobs for each county
are stacked–they’re all in the Jobs column. The
same is true for the four values of College.
• What are we missing?
Long table -example
Things to consider
• In wide format, the unit of analysis is the
subject: the county.
• In long format, the unit of analysis is
each measurement occasion for each county.
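The difference in the unit of analysis can be seen in a small reshaping sketch. The county names, column names, and all numbers below are made up for illustration; only the four measurement years (1970 to 2000) come from the example above.

```python
# Hypothetical wide table: one row per county, one column per decade.
wide_rows = [
    {"County": "A", "LandArea": 100,
     "Jobs1970": 10, "Jobs1980": 12, "Jobs1990": 15, "Jobs2000": 18,
     "College1970": 0.10, "College1980": 0.12,
     "College1990": 0.15, "College2000": 0.20},
    {"County": "B", "LandArea": 250,
     "Jobs1970": 7, "Jobs1980": 9, "Jobs1990": 9, "Jobs2000": 11,
     "College1970": 0.08, "College1980": 0.09,
     "College1990": 0.11, "College2000": 0.13},
]
YEARS = [1970, 1980, 1990, 2000]


def wide_to_long(rows, years):
    """Turn one row per county into one row per county and year."""
    long_rows = []
    for row in rows:
        for year in years:
            long_rows.append({
                "County": row["County"],
                "LandArea": row["LandArea"],   # constant, repeated in every row
                "Year": year,
                "Jobs": row[f"Jobs{year}"],
                "College": row[f"College{year}"],
            })
    return long_rows


long_rows = wide_to_long(wide_rows, YEARS)
print(len(wide_rows), "rows in wide format (unit: county)")
print(len(long_rows), "rows in long format (unit: county and year)")
```

In the long result, Jobs and College for the same decade sit in the same row, while LandArea is simply repeated.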
Implication of the unit of analysis
• In long format, where the occasion is the unit of
analysis, you can use each decade’s college
education rate as a predictor for the same
decade’s Jobs value.
• In wide format, where the unit of observation is
the county, there is no way to do this. Repeated
outcomes are considered different and
non-interchangeable variables.
• What happens if each county had been measured
a different number of times, or measured in
different years?
Implications-contd..
• Need to be able to switch back and forth
between formats.
• It’s often easier to enter and manipulate data
in the wide format, even if you need to
analyze it in the long format.
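Going from long back to wide is the reverse operation. The sketch below is one possible way to do it in plain Python; the helper name long_to_wide and the sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical long table: one row per county and year.
long_rows = [
    {"County": "A", "Year": 1970, "Jobs": 10},
    {"County": "A", "Year": 1980, "Jobs": 12},
    {"County": "B", "Year": 1970, "Jobs": 7},
    {"County": "B", "Year": 1980, "Jobs": 9},
]


def long_to_wide(rows, key, time, value):
    """Collect all rows of one subject into a single row, one column per time point."""
    wide = defaultdict(dict)
    for row in rows:
        subject = row[key]
        wide[subject][key] = subject
        wide[subject][f"{value}{row[time]}"] = row[value]
    return list(wide.values())


wide_rows = long_to_wide(long_rows, "County", "Year", "Jobs")
```

If a county had been measured in fewer or different years, its wide row would simply lack some columns, which is one reason the long format copes better with irregular measurements.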
Implications-crosstabbing
• Should we connect a fully formatted Excel
report that already shows data aggregations?
Crosstabbing- contd..
• Once an aggregated wide table is created, some
information is lost, unless..??
• Loss of info reduces the scope of visualization
• Recommendation: work with the
unaggregated raw data as much as possible
• This will show the items broken down to the
smallest units.
• If raw data is not available, alternatives will be
discussed later.
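A minimal illustration of the information loss, using two made-up sets of raw values: both produce the same average, so a report that stores only the average cannot tell them apart.

```python
from statistics import mean

# Two hypothetical sets of raw measurements.
raw_a = [10.0, 20.0, 30.0]
raw_b = [20.0, 20.0, 20.0]

# The aggregated view is identical ...
average_a = mean(raw_a)
average_b = mean(raw_b)

# ... but the spread of the underlying values is not.
range_a = max(raw_a) - min(raw_a)
range_b = max(raw_b) - min(raw_b)

print(average_a, average_b)
print(range_a, range_b)
```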
Begin data prep
• Remove the introductory text (Temperature
Measurements).
• Put the hierarchical headers (Seattle, New York)
in a new, separate column (Location).
• Pivot the data from the wide format, with
Morning, Noon, Evening in the headers, to a long
format, with this information about the time of
day in a new column (named Time of Day).
• Use the full date and time (01.04.2018 06:00)
instead of just stating the month.
Data prep-contd…
• Ensure that numbers are formatted as such
and not as text.
• Remove any summary rows and columns
(Average and Average Across All).
• Remove any empty rows.
• Make sure each column has a meaningful
heading.
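The preparation steps above can be sketched in code. Everything about the input is an assumption: the sheet layout, the temperature values, the clock times behind Morning, Noon, and Evening, and the reading of 01.04.2018 as day.month.year. In practice the same steps can be done by hand in Excel or in Tableau Prep.

```python
from datetime import datetime

# Hypothetical crosstab, as it might come out of a formatted Excel report.
raw_sheet = [
    ["Temperature Measurements"],                        # introductory text
    [],                                                  # empty row
    ["Seattle"],                                         # hierarchical header
    ["Date", "Morning", "Noon", "Evening", "Average"],
    ["01.04.2018", "8", "14", "11", "11"],
    ["02.04.2018", "7", "13", "10", "10"],
    ["Average", "7.5", "13.5", "10.5", "10.5"],          # summary row
    [],
    ["New York"],
    ["Date", "Morning", "Noon", "Evening", "Average"],
    ["01.04.2018", "5", "9", "7", "7"],
    ["Average Across All", "", "", "", "9.33"],          # summary row
]

# Assumed clock time for each time-of-day label.
CLOCK = {"Morning": "06:00", "Noon": "12:00", "Evening": "18:00"}


def tidy(sheet):
    """Return one record per temperature reading (long format)."""
    records = []
    location = None
    for row in sheet:
        if not row:                                      # remove empty rows
            continue
        if len(row) == 1:                                # title or location header
            if row[0] != "Temperature Measurements":
                location = row[0]                        # becomes the Location column
            continue
        if row[0] == "Date" or row[0].startswith("Average"):
            continue                                     # skip header and summary rows
        date = row[0]
        # row[1:4] holds Morning, Noon, Evening; the Average column is dropped.
        for label, cell in zip(CLOCK, row[1:4]):
            stamp = datetime.strptime(f"{date} {CLOCK[label]}", "%d.%m.%Y %H:%M")
            records.append({
                "Location": location,
                "Timestamp": stamp,                      # full date and time
                "Time of Day": label,
                "Temperature": float(cell),              # number, not text
            })
    return records


records = tidy(raw_sheet)
print(len(records), "temperature records")
```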
Ready-for-analysis Dataset
Remark on the dataset
• Every row contains exactly one temperature
record with the exact time stamp.
• It doesn’t contain any aggregations, such as
averages.
• All averages can be calculated in Tableau later.
• Adjust the level of aggregation as required
later for downstream analysis.
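To see why unaggregated rows are enough, here is a small sketch that computes averages at several levels from the same raw records. The records and the helper average_by are hypothetical; Tableau performs this kind of aggregation for you when you place fields in the view.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records, one temperature reading per row.
records = [
    {"Location": "Seattle", "Time of Day": "Morning", "Temperature": 8.0},
    {"Location": "Seattle", "Time of Day": "Noon", "Temperature": 14.0},
    {"Location": "New York", "Time of Day": "Morning", "Temperature": 5.0},
    {"Location": "New York", "Time of Day": "Noon", "Temperature": 9.0},
]


def average_by(rows, *fields):
    """Average the Temperature column, grouped by the given fields."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in fields)
        groups[key].append(row["Temperature"])
    return {key: mean(values) for key, values in groups.items()}


by_location = average_by(records, "Location")
by_time = average_by(records, "Time of Day")
overall = average_by(records)          # no grouping fields: one overall average
```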
Sample dataset
• Superstore.xls
• C:~\Documents\My Tableau
Repository\Datasources
• Contains three different sheets: Orders, People,
and Returns.
• The three tables are relational providing
information about individual sales transactions.
• No summary aggregations: each row contains one
record.
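"Relational" means the sheets can be linked through shared columns. The sketch below assumes that Orders and Returns share an Order ID column and that Orders and People share a Region column; these column names and all values are assumptions for illustration, so check them against the actual file.

```python
# Hypothetical miniature versions of the three sheets.
orders = [
    {"Order ID": "O-1", "Region": "West", "Sales": 100.0},
    {"Order ID": "O-2", "Region": "East", "Sales": 250.0},
    {"Order ID": "O-3", "Region": "West", "Sales": 75.0},
]
people = [
    {"Region": "West", "Person": "Manager W"},
    {"Region": "East", "Person": "Manager E"},
]
returns = [
    {"Order ID": "O-2", "Returned": "Yes"},
]

# Lookup structures built from the two smaller tables.
manager_by_region = {row["Region"]: row["Person"] for row in people}
returned_ids = {row["Order ID"] for row in returns}

# Enrich every order with its regional manager and a returned flag.
enriched = [
    {
        **order,
        "Person": manager_by_region.get(order["Region"]),
        "Returned": order["Order ID"] in returned_ids,
    }
    for order in orders
]
```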
Open the data file
• Launch Tableau Desktop.
• In the Connect panel on the left, you can
select a file or a server as a data source
• Select the Microsoft Excel option under the
“To A File” heading.
• Find the Superstore.xls file in the Documents
folder. Click Open to use this file as a data
source in Tableau.
Open data file –contd..
• Tableau switches to the Data Source pane
– Left-hand side: lists the names of the three sheets
contained in the Excel file
– Right side: canvas
• Drag and drop the Orders sheet onto the white
space in the top half of the screen.
• The bottom half of the screen shows a preview of the
data.
• Click Sheet 1 in the tabs bar at the bottom of the
window to create a Tableau worksheet.
Tableau Workspace
• The blank canvas (1) includes the title Sheet 1 (2).
• Left: Data pane (3)
• The tab next to it opens the Analytics pane
• Most interactions are achieved by dragging and
dropping items onto the canvas.
• Both dimensions (4)(including hierarchical dimensions
[5]) and measures (6) can be moved directly onto it.
• Alternatively, they can be placed on the Columns (7)
and Rows (8) shelves, in order to add them to your
visualization
Workspace-contd..
• Fields from the Data pane can also be placed onto the
Marks (9), Filters (10), and Pages (11) cards: for
example, to change the color of marks or to display
marks only for a filtered subset of the data.
• The tabs bar at the bottom of the screen allows you to
go back to the data source editor (12) and to toggle
between your different worksheets (13), each
containing a single visualization.
• With the three buttons to the right of the tabs, you can
open additional worksheets, new dashboards, and
stories, respectively.
Workspace-contd..
• Top of the screen: menu bar (14)
• Directly under that is the toolbar (15), including these
three buttons:
– The Tableau icon: This brings you back to the start
screen, where, among other things, you can add
additional data sources.
– Undo: This allows you to go back a step so you can
safely try out different ideas. You can go back as many
steps as you like.
– Redo: This allows you to restore any undone actions.
Workspace-contd…
The Menu Bar
• File Menu: Contains Open, Save, and Save As, along with further functions. The
Print To PDF menu item allows you to export your worksheets and
dashboards as PDF files. With the Repository Location option, you can look
up and change the default location for Tableau files on your machine. With
Export As Version, you can create workbooks for colleagues who might still
be using an older version of Tableau Desktop.

• Data Menu: Insert function presents a quick, ad hoc way to add a data
table—for example, from a website. Simply select and copy the table in
the original document, and click Insert in Tableau. This will add the data to
your workbook as a new data source. (More about it in Module 2.)

• Worksheet Menu: With Export, you can take your data out of Tableau by
creating an image, a database file, or an Excel crosstab. Duplicate As
Crosstab, on the other hand, opens a new worksheet in Tableau, showing
a crosstab view of the data used in your visualization.
Menu Bar-contd..
• Dashboard Menu: Actions that add interactivity
to dashboards are set up and tweaked by clicking
Actions (more about this later).
• Story Menu: Lets you create a story from your
worksheets and dashboards. In a story, content is
arranged sequentially for presentation and
enriched with annotations. See
https://help.tableau.com/current/pro/desktop/en-us/story_example.htm
Menu Bar-contd…
• Analysis Menu: With this menu, you can create and edit
calculated fields (will be discussed in Module 4). Here, you
will also find options for tweaking table layouts as well as
for showing grand totals, forecasts, and trend lines.

• Map Menu: Choose between different background maps.
The Offline option is particularly useful when you have no
Internet connection and would like to access the built-in
cartographic material. Details in Module 5.

• Format Menu: Set the font, alignment, shading, and other
formatting options. In addition, you can set the overall
workbook design and adjust the cell size.
Menu Bar-contd…
• Server Menu: Use this menu for sharing your dashboard via
Tableau Online, Tableau Server, or Tableau Public. With the
Create User Filter submenu, you can set audience-specific
filters that grant specific users or user groups (which have
been defined in Tableau Online or Tableau Server) access to
selected subsets of the data.

• Window Menu: Use the Presentation Mode option to use
the full screen for your dashboard.

• Help Menu: Via this menu, you have access to the Tableau
online help, training videos, and sample workbooks.
Also investigate the Start Performance Recording option.
Data Pane
• The Data pane is divided into measures and
dimensions.
• You control what visualizations you want to
display by adding different combinations of
measures and dimensions to the canvas.
Data Pane: Measures
• Measures are numeric variables.
• By adding a measure to the view, you decide
which values from your dataset to visualize.
• By default, Tableau automatically applies an
aggregation function such as SUM or AVG to
measures. That way, you can, for instance, show
the sum or the average of a sales discount across
different transactions.
• Measures typically (but not always) come with
green symbols, which represent continuous
variables.
Data Pane: Dimensions
• Dimensions are descriptive, categorical variables.
• With dimensions, you can decide how to group
the aggregated values of the used measures. For
instance, the sum of sales revenue (a measure)
could be broken down by country, product
category, or both (i.e. two different dimensions).
• Typically, dimensions come with blue symbols in
Tableau.
• If Tableau has erroneously added a measure to
the dimensions section of the Data pane, drag it
into the Measures section, and vice versa.
Visualize a measure
• In the Superstore workbook, we will visualize
sales revenue.
• Drag the Sales measure onto the left side of
the canvas, to the vertical area labeled Drop
Field Here.
• Alternatively, you can drop the measure onto
the Rows shelf above the canvas. The result is
the same: you see the total sales revenue of
all the records in the dataset.
Visualizing measures across
dimensions
• How do the sales numbers break down by
product category?
• Drag the Category dimension onto the
Columns shelf above the canvas.
• You now see the total sales revenue broken down
by product category, in accordance with the
dataset.
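Conceptually, the two drag-and-drop steps above correspond to an aggregate query and a grouped aggregate query. The sketch below runs such queries with Python's built-in sqlite3 module on a made-up miniature table. It illustrates the idea only and is not the query that VizQL actually generates; the table name, column names, and values are assumptions.

```python
import sqlite3

# In-memory database with a hypothetical miniature Orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (category TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("Furniture", 200.0),
        ("Technology", 300.0),
        ("Furniture", 50.0),
        ("Office Supplies", 25.0),
    ],
)

# Sales on Rows: one aggregated value for the whole dataset.
total = conn.execute("SELECT SUM(sales) FROM orders").fetchone()[0]

# Category on Columns: the same measure, grouped by a dimension.
by_category = conn.execute(
    "SELECT category, SUM(sales) FROM orders "
    "GROUP BY category ORDER BY category"
).fetchall()

print(total)
print(by_category)
```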
Marks
• After the Rows and Columns shelves, the next-
most-important area is the Marks card.
• Add dimensions and measures here to
incorporate additional information.
• Control the color, size, form, and labeling of
the marks displayed in your visualization.
Marks: Colors
• One of the most-used features of the Marks
card is the Color field.
• Drag the Segment dimension onto Color to
color the marks by segment.
• When you place a measure onto Color
instead, you can select the colors and intervals
of the color gradient.
Marks: Tooltips
• A tooltip is a little hover box that displays
additional information when you point at
individual marks in the visualization.
• This makes your charts interactive, something
not available in static charts and PDF reports.
• Drag the Profit measure onto the Tooltip field
of the Marks card.
• Move the pointer over different marks in the
visualization.
Saving your work
• Open the File menu, and click Save As.
• Two different file types to choose from in the
dialog box that opens:
• Tableau Workbook (*.twb)
– Contains all the visualizations as well as the
metadata. It does not, however, contain the
actual data.
– When you share a Tableau workbook, the
recipient will need to have access to the original
file or database that you used.
Saving your work-contd..
• Tableau Packaged Workbook (*.twbx)
– Contains the actual data in addition to the
visualizations and metadata.
– The data is compressed, thereby reducing the
overall file size.
– When you share a Tableau packaged workbook,
the recipient will be able to open and work with
your visualizations even without having access to
the original data source.
Saving your work-contd..
• Recommendation: save your work as Tableau
packaged workbooks (.twbx files).
• You will always have your data extracts with
you, even if you can’t reach your database
servers remotely.
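A packaged workbook is a zip archive that bundles the .twb file with the data files, so its contents can be inspected with standard tools. A minimal sketch using Python's zipfile module follows; the function name is hypothetical.

```python
import zipfile


def list_packaged_contents(path):
    """Return the names of the files stored inside a .twbx archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

For example, list_packaged_contents("my_workbook.twbx") (a hypothetical file name) would list the workbook file and the bundled data files.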
