Course Material Tableau

The document discusses data visualization processes and challenges involving multiple data sources that are dynamically updated. It presents steps to collate and prepare data from different file types and sources, draw various chart types, customize charts, and create dynamic and interactive dashboards. Tableau products like Desktop, Prep, Server, and Online are introduced for visualizing data through live or extract connections and joining data from multiple tables.

Uploaded by

sairamakanth14

Data Visualization Process

Too simple?

Single file contains data -> Create a chart -> Use a snapshot of the chart in report

Start-Tech Academy
Data Visualization Process
Practical challenges

• Data is in multiple files
• Data is in different types of files
• Data is getting dynamically updated
• Ability to create all popular charts
• Ability to customize charts as required
• Ability to draw multiple charts and create dashboards
Data Visualization Process

Practical challenges

Multiple data sources

Steps
1. Connect with data sources
2. Collate and prepare data
3. Draw all types of charts
4. Customize the charts

Dynamic & Interactive dashboard

Share, Embed & View

Collaborate
Tableau Products
Products: Tableau Desktop, Tableau Prep, Tableau Server, Tableau Online, Tableau Public

Tableau Desktop

1. Windows/ Mac application for PC
2. The main visualization tool by Tableau
3. Can connect with Tableau Online/ Tableau Public for collaboration with team members
Tableau Products
Tableau Prep

1. Used for preprocessing the data
2. Does data cleaning/ validation and integration using a visual interface
3. Tableau Desktop Creator license grants access to Tableau Prep also
Tableau Products
Tableau Server/ Tableau Online

• Used for connecting and collaborating with team members

Tableau Server
• Tableau Server will be installed on company server
• Tableau Desktop will talk with Tableau Server via Intranet
• Individuals will be using Tableau Desktop on their PC
Tableau Products
Tableau Server/ Tableau Online

• Used for connecting and collaborating with team members

Tableau Online
• Tableau Online is hosted and managed by Tableau
• Tableau Desktop will talk with Tableau Online via Internet
Tableau Products
Tableau Public

1. Hosted by Tableau – free for all to use
2. Your work will be open for anyone to see
3. Web based application
Live vs Extract Connection

Live connection: Real time updates

Data Source -> Visualizations

• Whenever any change is made, data is fetched from the data source
• Visualizations are updated in real-time

Extract: Better performance

Data Source -> Snapshot of data (Extract) -> Visualizations

• Whenever any change is made, data is fetched from the snapshot (Extract)
• Extracts need to be updated for any update in data
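
The difference between a live connection and an extract can be imitated outside Tableau. The following is only a rough sketch in Python: the in-memory SQLite database, the `sales` table and the helper names are all hypothetical stand-ins for a data source, not anything Tableau exposes.

```python
import sqlite3

# Hypothetical in-memory database standing in for the data source.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("South", 100.0), ("West", 250.0)],
)


def total_live(conn):
    """Live connection: query the data source every time."""
    return conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]


def take_extract(conn):
    """Extract: copy the rows once into a local snapshot."""
    return conn.execute("SELECT region, amount FROM sales").fetchall()


def total_extract(snapshot):
    """Work only on the snapshot; the data source is not touched."""
    return sum(amount for _, amount in snapshot)


extract = take_extract(source)

# The data source changes after the extract was taken.
source.execute("INSERT INTO sales VALUES ('East', 50.0)")

live_total = total_live(source)           # sees the new row
stale_total = total_extract(extract)      # still the old snapshot
extract = take_extract(source)            # refresh the extract
refreshed_total = total_extract(extract)  # now matches the source again
```

The live total reflects the new row immediately, while the extract total only changes after the snapshot is refreshed.
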
Joining data from multiple tables

Scenario 1: Joining two different tables

Sales Table (Sales values only)
Order Line Order ID Order Date Ship Date Ship Mode Customer ID Product ID Sales Quantity Discount Profit
1 CA-2016-152156 08-11-2016 11-11-2016 Second Class CG-12520 FUR-BO-10001798 261.96 2 0 41.9136
2 CA-2016-152156 08-11-2016 11-11-2016 Second Class CG-12520 FUR-CH-10000454 731.94 3 0 219.582
3 CA-2016-138688 12-06-2016 16-06-2016 Second Class DV-13045 OFF-LA-10000240 14.62 2 0 6.8714
4 US-2015-108966 11-10-2015 18-10-2015 Standard Class SO-20335 FUR-TA-10000577 957.5775 5 0.45 -383.031
5 US-2015-108966 11-10-2015 18-10-2015 Standard Class SO-20335 OFF-ST-10000760 22.368 2 0.2 2.5164

Customer Table (Region values only)
Customer ID Customer Name Segment Age Country City State Postal Code Region
CG-12520 Claire Gute Consumer 67 United States Henderson Kentucky 42420 South
DV-13045 Darrin Van Huff Corporate 31 United States Los Angeles California 90036 West
SO-20335 Sean O'Donnell Consumer 65 United States Fort Lauderdale Florida 33311 South
BH-11710 Brosina Hoffman Consumer 20 United States Los Angeles California 90032 West
3 ways of joining data in Tableau

1. Joining
2. Blending (covered later)
3. Relationships (new way of joining data in Tableau)
Merging data from multiple tables
Combining similar data using operators like Union, Intersect and Except

Scenario 2: Merging similar tables

Online customers
Customer ID Customer Name Segment Age Country City State Postal Code Region
EB-13870 Emily Burns Consumer 34 United States Orem Utah 84057 West
EH-13945 Eric Hoffmann Consumer 21 United States Los Angeles California 90049 West
TB-21520 Tracy Blumstein Consumer 48 United States Philadelphia Pennsylvania 19140 East
MA-17560 Matt Abelman Home Office 19 United States Houston Texas 77095 Central

Offline customers
Customer ID Customer Name Segment Age Country City State Postal Code Region
ON-18715 Odella Nelson Corporate 27 United States Eagan Minnesota 55122 Central
PO-18865 Patrick O'Donnell Consumer 64 United States Westland Michigan 48185 Central
LH-16900 Lena Hernandez Consumer 66 United States Dover Delaware 19901 East

Merged table (online and offline customers)
Customer ID Customer Name Segment Age Country City State Postal Code Region
EB-13870 Emily Burns Consumer 34 United States Orem Utah 84057 West
EH-13945 Eric Hoffmann Consumer 21 United States Los Angeles California 90049 West
TB-21520 Tracy Blumstein Consumer 48 United States Philadelphia Pennsylvania 19140 East
MA-17560 Matt Abelman Home Office 19 United States Houston Texas 77095 Central
ON-18715 Odella Nelson Corporate 27 United States Eagan Minnesota 55122 Central
PO-18865 Patrick O'Donnell Consumer 64 United States Westland Michigan 48185 Central
LH-16900 Lena Hernandez Consumer 66 United States Dover Delaware 19901 East
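
Scenario 2 stacks the rows of two tables that share the same columns. A minimal Python sketch of that union, using a few columns of the rows shown above (the `union` helper is a hypothetical name, and only four of the nine columns are kept for brevity):

```python
# Columns kept for this sketch: Customer ID, Customer Name, Segment, Region.
online_customers = [
    ("EB-13870", "Emily Burns", "Consumer", "West"),
    ("EH-13945", "Eric Hoffmann", "Consumer", "West"),
    ("TB-21520", "Tracy Blumstein", "Consumer", "East"),
    ("MA-17560", "Matt Abelman", "Home Office", "Central"),
]
offline_customers = [
    ("ON-18715", "Odella Nelson", "Corporate", "Central"),
    ("PO-18865", "Patrick O'Donnell", "Consumer", "Central"),
    ("LH-16900", "Lena Hernandez", "Consumer", "East"),
]


def union(*tables):
    """Append the rows of every table, one after the other."""
    merged = []
    for table in tables:
        merged.extend(table)
    return merged


all_customers = union(online_customers, offline_customers)
```

A union adds rows and keeps the column list unchanged, which is why it only makes sense for tables with the same structure.
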
Joins

What’s needed

To join tables we must know:

1. The names of the tables to be joined
2. The common column based on which we will join them
3. The list of columns from each table

Order Line Order ID Order Date Ship Date Ship Mode Customer ID Product ID Sales Quantity Discount Profit
1 CA-2016-152156 08-11-2016 11-11-2016 Second Class CG-12520 FUR-BO-10001798 261.96 2 0 41.9136
2 CA-2016-152156 08-11-2016 11-11-2016 Second Class CG-12520 FUR-CH-10000454 731.94 3 0 219.582
3 CA-2016-138688 12-06-2016 16-06-2016 Second Class DV-13045 OFF-LA-10000240 14.62 2 0 6.8714
4 US-2015-108966 11-10-2015 18-10-2015 Standard Class SO-20335 FUR-TA-10000577 957.5775 5 0.45 -383.031
5 US-2015-108966 11-10-2015 18-10-2015 Standard Class SO-20335 OFF-ST-10000760 22.368 2 0.2 2.5164

Customer ID Customer Name Segment Age Country City State Postal Code Region
CG-12520 Claire Gute Consumer 67 United States Henderson Kentucky 42420 South
DV-13045 Darrin Van Huff Corporate 31 United States Los Angeles California 90036 West
SO-20335 Sean O'Donnell Consumer 65 United States Fort Lauderdale Florida 33311 South
BH-11710 Brosina Hoffman Consumer 20 United States Los Angeles California 90032 West
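
The three things needed for a join (the tables, the common column, the columns to pick) map directly onto a small function. This is a sketch of an inner join in plain Python, not Tableau's implementation; `inner_join` is a hypothetical helper, only a few columns of the slide's rows are kept, and it assumes the key is unique in the right-hand table.

```python
sales = [
    {"Order Line": 1, "Customer ID": "CG-12520", "Sales": 261.96},
    {"Order Line": 2, "Customer ID": "CG-12520", "Sales": 731.94},
    {"Order Line": 3, "Customer ID": "DV-13045", "Sales": 14.62},
    {"Order Line": 4, "Customer ID": "SO-20335", "Sales": 957.5775},
    {"Order Line": 5, "Customer ID": "SO-20335", "Sales": 22.368},
]
customers = [
    {"Customer ID": "CG-12520", "Region": "South"},
    {"Customer ID": "DV-13045", "Region": "West"},
    {"Customer ID": "SO-20335", "Region": "South"},
    {"Customer ID": "BH-11710", "Region": "West"},
]


def inner_join(left, right, key, right_columns):
    """Attach the chosen right-hand columns to every matching left row.

    Assumes `key` is unique in `right` (one customer per Customer ID).
    """
    lookup = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[key])
        if match is not None:
            joined.append({**row, **{col: match[col] for col in right_columns}})
    return joined


sales_with_region = inner_join(sales, customers, "Customer ID", ["Region"])

# With Region available on every sales row, sales can be summed by region.
sales_by_region = {}
for row in sales_with_region:
    region = row["Region"]
    sales_by_region[region] = sales_by_region.get(region, 0.0) + row["Sales"]
```

Customer BH-11710 has no sales rows, so it does not appear in the inner join result.
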
Relationship – performance options
Cardinality

We are joining customer table with postal code master - postal code is the matching key

Customer table
Customer ID Postal code
CG-12520 42420
DV-13045 90036
SO-20335 33311
BH-11710 90036
. .

Are postal code values unique in this table? Not unique -> ‘Many’

Postal code master table
Postal code City State Region
32303 Tallahassee Florida South
32725 Deltona Florida South
32935 Melbourne Florida South
33012 Hialeah Florida South
33311 Fort Lauderdale Florida South
. . . .

Are postal code values unique in this table? Unique -> ‘One’

Result: ‘Many to One’
Relationship – performance options
Cardinality

We are joining customer table with reference table – customer ID is the matching key

Customer table
Customer ID Postal code
CG-12520 42420
DV-13045 90036
SO-20335 33311
BH-11710 90036

Is customer ID unique in this table? Unique -> ‘One’

Reference table
Customer ID Ref Name Ref Contact
CG-12520 Cindy Stewart 10897310
CM-11935 Dan Campbell 16589278
CM-12385 Darren Koutras 95721837
CG-12520 Denny Ordway 16507437
CS-12355 Evan Bailliet 76772276
CS-12460 Erica Hackney 69524187

Is customer ID unique in this table? Not unique -> ‘Many’

Result: ‘One to Many’
Relationship – performance options

Cardinality

‘Many to Many’ is the default setting in Tableau

It will not give wrong visualizations

We can improve performance slightly if we specify the exact cardinality

Change only if you know what you are doing
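
The uniqueness question asked on the cardinality slides can be checked mechanically. A minimal Python sketch, using the key values shown on the slides; the `side` and `cardinality` helpers are hypothetical names for illustration only.

```python
def side(values):
    """Return 'One' if every key value is unique, otherwise 'Many'."""
    values = list(values)
    return "One" if len(values) == len(set(values)) else "Many"


def cardinality(left_keys, right_keys):
    """Describe the relationship between two tables from their key columns."""
    return f"{side(left_keys)} to {side(right_keys)}"


# Customer table joined with postal code master on postal code.
customer_postal_codes = ["42420", "90036", "33311", "90036"]
master_postal_codes = ["32303", "32725", "32935", "33012", "33311"]

# Customer table joined with reference table on customer ID.
customer_ids = ["CG-12520", "DV-13045", "SO-20335", "BH-11710"]
reference_customer_ids = [
    "CG-12520", "CM-11935", "CM-12385", "CG-12520", "CS-12355", "CS-12460",
]

postal_relationship = cardinality(customer_postal_codes, master_postal_codes)
reference_relationship = cardinality(customer_ids, reference_customer_ids)
```

Postal code 90036 repeats in the customer table and CG-12520 repeats in the reference table, which is what makes those sides ‘Many’.
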
Relationship – performance options
Referential Integrity

We are joining customer table with reference table – customer ID is the matching key

Customer table
Customer ID Postal code
CG-12520 42420
DV-13045 90036
SO-20335 33311
BH-11710 90036

Are all customer IDs in customer table also present in the reference table? No -> ‘Some records match’

Reference table
Customer ID Ref Name Ref Contact
CG-12520 Cindy Stewart 10897310
CM-11935 Dan Campbell 16589278
CM-12385 Darren Koutras 95721837
CG-12520 Denny Ordway 16507437
CS-12355 Evan Bailliet 76772276
CS-12460 Erica Hackney 69524187

Are all customer IDs in reference table also present in the customer table? Yes -> ‘All records match’
Relationship – performance options

Referential Integrity

‘Some records match’ is the default setting in Tableau

It will not give wrong visualizations

We can improve performance if we specify the exact referential integrity

Change only if you know what you are doing
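
The referential integrity question is a subset test on the key values of one table against the other. The sketch below uses small hypothetical ID lists (not the slide's rows) and a hypothetical `referential_integrity` helper, purely to illustrate the check.

```python
def referential_integrity(keys, other_keys):
    """Report whether every key value is also present in the other table."""
    if set(keys) <= set(other_keys):
        return "All records match"
    return "Some records match"


# Hypothetical key columns, for illustration only.
orders_customer_ids = ["C1", "C2", "C2", "C3"]
customer_ids = ["C1", "C2", "C3", "C4"]

# Every customer ID used in orders exists in the customer table.
orders_side = referential_integrity(orders_customer_ids, customer_ids)

# Customer C4 never appears in orders, so this side matches only partially.
customers_side = referential_integrity(customer_ids, orders_customer_ids)
```

The check is done separately for each table, which is why the two sides of one relationship can have different settings.
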
Physical vs Logical layer
Observations

Physical Layer
• We can do Join and Union here
• Result of joining and union is a single table

Logical Layer
• Single logical table can have multiple joined and Union tables
• Relationships can be defined – ‘Noodles’
• Related tables remain individual tables
Physical vs Logical layer
Data Model - Example

Logical Layer
• Sales and Customer joined data
• Product Master – Union of all product tables
• The two logical tables are connected by a Relationship

Physical Layer
• Sales Table Join Customer Table
• Online Products Table Union Offline Products Table Union Archived Products Table
Physical vs Logical layer
Data Model

Logical Layer
• Sales data, Product table and Customer table connected by Relationships

Physical Layer
• Sales data: Sales Table Union Bug sales Table
Types of Data in Tableau
Dimensions vs measures

• Dimensions are columns containing categories/ segments based on which aggregation will be done
• Measures are numeric columns for which we wish to get the aggregate values

Student ID Gender Age Hours studied Marks scored Year of exam
S101 Male 19 18 73 2021
S102 Female 20 15 85 2020
S103 Female 16 21 71 2023
S104 Male 19 23 89 2022
S105 Female 19 25 94 2022
S106 Female 20 27 70 2020
S107 Male 21 15 95 2019
S108 Male 21 20 70 2021
S109 Female 16 17 79 2019
S110 Female 18 26 73 2020
… … … … … …

Gender Sum of Hours studied
Female 131
Male 76

Year of Exam Average of Marks scored
2019 87
2020 76
2021 71.5
2022 91.5
2023 71

Sum, Average, Min, Max, count are some aggregate functions
Types of Data in Tableau
Dimensions vs measures

• Same column can act as a dimension in some scenarios and as a measure in other scenarios

Student ID Gender Age Hours studied Marks scored Year of exam
S101 Male 19 18 73 2021
S102 Female 20 15 85 2020
S103 Female 16 21 71 2023
S104 Male 19 23 89 2022
S105 Female 19 25 94 2022
S106 Female 20 27 70 2020
S107 Male 21 15 95 2019
S108 Male 21 20 70 2021
S109 Female 16 17 79 2019
S110 Female 18 26 73 2020
… … … … … …

Age Average of Hours studied
16 19
18 26
19 22
20 21
21 17.5

Gender Average of Age
Female 18.2
Male 20
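
The aggregate tables on the two slides above can be reproduced from the ten student rows. This is a plain-Python sketch of "group by a dimension, aggregate a measure"; the `aggregate` helper and the column index names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (Student ID, Gender, Age, Hours studied, Marks scored, Year of exam)
students = [
    ("S101", "Male", 19, 18, 73, 2021),
    ("S102", "Female", 20, 15, 85, 2020),
    ("S103", "Female", 16, 21, 71, 2023),
    ("S104", "Male", 19, 23, 89, 2022),
    ("S105", "Female", 19, 25, 94, 2022),
    ("S106", "Female", 20, 27, 70, 2020),
    ("S107", "Male", 21, 15, 95, 2019),
    ("S108", "Male", 21, 20, 70, 2021),
    ("S109", "Female", 16, 17, 79, 2019),
    ("S110", "Female", 18, 26, 73, 2020),
]
GENDER, AGE, HOURS, MARKS, YEAR = 1, 2, 3, 4, 5


def aggregate(rows, dimension, measure, func):
    """Group rows by the dimension column and aggregate the measure column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {key: func(values) for key, values in groups.items()}


hours_by_gender = aggregate(students, GENDER, HOURS, sum)
marks_by_year = aggregate(students, YEAR, MARKS, mean)

# Age used as a dimension (grouping column) ...
hours_by_age = aggregate(students, AGE, HOURS, mean)
# ... and the same Age column used as a measure (aggregated column).
age_by_gender = aggregate(students, GENDER, AGE, mean)
```

Whether a column is a dimension or a measure depends only on which argument position it is passed in, which is the point of the second slide.
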
Types of Data in Tableau

Discrete vs continuous

• Discrete - A set of finite values
  • Will add headers on the axes
• Continuous – Infinite range
  • Will add infinite range on the axes

[Figure: Age is set as discrete]
[Figure: Age is set as continuous]
Types of Data in Tableau

Discrete (Blue): Finite – adds headers
Continuous (Green): Infinite range – adds axes

Dimension (String, date, numeric)
• Discrete – Common. Eg. - Name, Gender, category, etc.
• Continuous – Rare. Eg. Year(transaction date)

Measure (Numeric - aggregation)
• Discrete – Rare. Eg. Sum(profit)
• Continuous – Common. Eg. Sum(profit), Average(age)
Binning data
Converting continuous numeric data into bins/ groups

Age -> Bins
18 to 39 -> Young
40 to 64 -> Middle Aged
65 and above -> Seniors
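
Binning is a mapping from a numeric value to a labelled range. A small Python sketch of the age bins above; the `age_bin` function is a hypothetical name, and the exact boundaries (39/40 and 64/65) are read off the slide's layout, so treat them as an assumption.

```python
def age_bin(age):
    """Map a numeric age to one of the three bins shown on the slide."""
    if age >= 65:
        return "Seniors"
    if age >= 40:
        return "Middle Aged"
    if age >= 18:
        return "Young"
    return None  # ages below 18 are not covered by the slide


ages = [18, 39, 40, 64, 65, 80]
bins = [age_bin(age) for age in ages]
```
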
Grouping data
Clubbing similar categories together into groups

Sub-categories: Accessories, Appliances, Bookcases, Chairs, Copiers, Envelopes, Furnishings, Labels, Machines, Paper, Phones, Storage, Tables

Groups
Phones & Acc.: Phones, Accessories
Tables & Chairs: Tables, Chairs
Others: Appliances, Bookcases, Copiers, Envelopes, Furnishings, Labels, Machines, Paper, Storage
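
Grouping, unlike binning, maps named categories to a group label. A minimal Python sketch of the grouping above, with a hypothetical `group_of` helper; every sub-category not listed explicitly falls into "Others".

```python
# Explicit members of the two named groups; everything else is "Others".
GROUPS = {
    "Phones": "Phones & Acc.",
    "Accessories": "Phones & Acc.",
    "Tables": "Tables & Chairs",
    "Chairs": "Tables & Chairs",
}


def group_of(sub_category):
    """Return the group a sub-category belongs to."""
    return GROUPS.get(sub_category, "Others")


sub_categories = [
    "Accessories", "Appliances", "Bookcases", "Chairs", "Copiers",
    "Envelopes", "Furnishings", "Labels", "Machines", "Paper",
    "Phones", "Storage", "Tables",
]
grouped = {name: group_of(name) for name in sub_categories}
```
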
Filtering
Showing only relevant data/ hiding irrelevant information

Examples

Filters

Order of operation

Order of operation is the order in which different types of filters are executed in Tableau

In Tableau, filters are executed in the following order:

1. Extract filters
2. Data source filters
3. Context filters
4. Filters on dimensions
5. Filters on measures

Order of execution is important because it can impact the final output and the performance when multiple filters are applied
Filters

Output of two dimension filters

Two filters
Filter 1 – Only red fruits (inside or outside)
Filter 2 – Top 2 fruits by weight

Calculation
Output of filter 1 – Apple, Strawberry and Watermelon
Output of filter 2 – Watermelon and melon
Final output – Only Watermelon

Steps
Filter 1 checks 10 fruits
Filter 2 checks 10 fruits
Common/ intersection of two results is shown finally
Filters

Output of one context and one dimension filter

Two filters
Filter 1 (Set as context filter) – Only red fruits (inside or outside)
Filter 2 – Top 2 fruits by weight

Calculation
Output of filter 1 – Apple, Strawberry and Watermelon
Output of filter 2 – Watermelon and Apple
Final output – Watermelon and Apple

Steps
Filter 1 checks 10 fruits
Filter 2 checks 3 fruits only (Less calculation)
Result of filter 2 is final result (Different Result)
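
The two fruit examples differ only in what the second filter gets to see. The sketch below imitates both cases in Python; the ten fruits, their weights and the function names are hypothetical values chosen so that the outputs match the slides, and the code is not a model of Tableau's query engine.

```python
# Ten hypothetical fruits: weight in grams, and whether the fruit is red
# inside or outside.
fruits = [
    {"name": "Apple", "weight": 180, "red": True},
    {"name": "Strawberry", "weight": 15, "red": True},
    {"name": "Watermelon", "weight": 5000, "red": True},
    {"name": "Melon", "weight": 1500, "red": False},
    {"name": "Pineapple", "weight": 1000, "red": False},
    {"name": "Mango", "weight": 200, "red": False},
    {"name": "Orange", "weight": 150, "red": False},
    {"name": "Banana", "weight": 120, "red": False},
    {"name": "Kiwi", "weight": 75, "red": False},
    {"name": "Pear", "weight": 170, "red": False},
]


def top_n_by_weight(rows, n):
    """Return the n heaviest rows, heaviest first."""
    return sorted(rows, key=lambda fruit: fruit["weight"], reverse=True)[:n]


def two_dimension_filters(rows):
    """Both filters check all rows; only the intersection is shown."""
    red = [fruit for fruit in rows if fruit["red"]]
    top2_names = {fruit["name"] for fruit in top_n_by_weight(rows, 2)}
    return [fruit["name"] for fruit in red if fruit["name"] in top2_names]


def context_then_dimension_filter(rows):
    """The context filter runs first; the top-2 filter sees only its output."""
    red = [fruit for fruit in rows if fruit["red"]]
    return [fruit["name"] for fruit in top_n_by_weight(red, 2)]


independent_result = two_dimension_filters(fruits)
context_result = context_then_dimension_filter(fruits)
```

With independent filters the top 2 are picked from all ten fruits; with a context filter they are picked from the three red fruits only, which changes both the amount of work and the answer.
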
Maps
To plot data points on custom background images

[Figure: Custom background image]
Maps
[Figure: Custom background image with an X axis from 0 to 100 and a Y axis from 0 to 50]

State X coordinate Y coordinate
California 13 26
New York 84 34
Texas 50 10
Maps

Territories

Territory States
Territory 1 California, Texas, New York
Territory 2 Washington, Pennsylvania, Illinois
Territory 3 All other states
Blending for missing geocoding

Joining Blending Relationships

Blending is used when the data tables are present in different data sources
Data Analysis
Functions

Predefined formulas to do a specific calculation

Types of functions:

1. Number functions
2. Date functions
3. Text functions
4. Logical functions
5. Aggregate functions

Documentation - https://help.tableau.com/current/pro/desktop/en-us/functions_all_categories.htm

Task – Create the same calculated field (as in last class) using CASE WHEN function
Data Analysis
Table Calculations

Table calculations – Calculations on plotted data
Data Analysis
Table Calculations

Table calculations only consider the final data plotted for calculations

[Figure: Final data which is plotted, shown as a part of The complete data]
Data Analysis
Table Calculations

Table calculations have two parts – Calculation and Direction

Types of calculations – Differences, percentages, ranks etc.

Direction – Across the table, Down the table, down and across etc.

Marketing Finance HR Total
Salaries 200 300 100 600
Agency payments 500 100 500 1100
Other expenses 300 100 200 600
Total 1000 500 800 2300
Data Analysis
Table Calculations

Table calculations have two parts – Calculation and Direction

Direction: Across (the Total of each row will be used)

Marketing Finance HR Total
Salaries 33% 50% 17% 100%
Agency payments 45% 9% 45% 100%
Other expenses 50% 17% 33% 100%
Data Analysis
Table Calculations

Table calculations have two parts – Calculation and Direction

Direction: Down (the Total of each column will be used)

Marketing Finance HR
Salaries 20% 60% 12.5%
Agency payments 50% 20% 62.5%
Other expenses 30% 20% 25%
Total 100% 100% 100%
Data Analysis
Table Calculations

Table calculations have two parts – Calculation and Direction

Direction: Table (the grand total, 2300, will be used)

Marketing Finance HR
Salaries 9% 13% 4%
Agency payments 22% 4% 22%
Other expenses 13% 4% 9%
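
The three percent-of-total tables differ only in which total each value is divided by. A short Python sketch over the expense table from the slides; the function names are hypothetical and the values are rounded to one decimal place rather than to whole percentages.

```python
departments = ["Marketing", "Finance", "HR"]
expenses = {
    "Salaries": [200, 300, 100],
    "Agency payments": [500, 100, 500],
    "Other expenses": [300, 100, 200],
}


def percent_across(table):
    """Divide every value by the total of its own row."""
    return {
        row: [round(100 * value / sum(values), 1) for value in values]
        for row, values in table.items()
    }


def percent_down(table):
    """Divide every value by the total of its own column."""
    column_totals = [sum(column) for column in zip(*table.values())]
    return {
        row: [round(100 * value / total, 1)
              for value, total in zip(values, column_totals)]
        for row, values in table.items()
    }


def percent_of_table(table):
    """Divide every value by the grand total of the whole table."""
    grand_total = sum(sum(values) for values in table.values())
    return {
        row: [round(100 * value / grand_total, 1) for value in values]
        for row, values in table.items()
    }


across = percent_across(expenses)
down = percent_down(expenses)
whole_table = percent_of_table(expenses)
```

The calculation (percent of total) stays the same in all three functions; only the direction, meaning the choice of denominator, changes.
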
Sets
What

A subset of data based on some conditions

List of all states -> States with more than 50 customers/ employees
Sets
Why

Sets are created based on a condition

Either select manually or specify a condition:
If true – that element is part of Set A or the IN set
If false – that element is part of Set B or the OUT set
Sets
Why

Sets can be used to compare IN vs OUT performance

Example: Compare the sum of sales in top 3 states vs all other states

IN – Top 3 states
OUT – All other states
Sets
Why

Sets can be used to combine sets as per set theory

Set 1: Customers who made a purchase in 2020
Set 2: Customers who made a purchase in 2021
Combined set: Customers who made a purchase in both, 2020 and 2021
Sets
Why

Sets can be used to combine sets as per set theory

Union – All members in both sets
Intersect – Shared members in both sets
Except – Members in one set except other set
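
Tableau sets behave much like sets in a programming language. The Python sketch below shows a condition-based IN/OUT split and the three combinations; all customer IDs and customer counts are hypothetical values made up for the illustration.

```python
# Hypothetical number of customers per state.
customers_per_state = {"California": 120, "Texas": 80, "Utah": 12, "Delaware": 30}

# Condition-based set: states with more than 50 customers are IN.
in_set = {state for state, count in customers_per_state.items() if count > 50}
out_set = set(customers_per_state) - in_set

# Hypothetical sets of customers who made a purchase in each year.
customers_2020 = {"CG-12520", "DV-13045", "SO-20335"}
customers_2021 = {"SO-20335", "BH-11710"}

union = customers_2020 | customers_2021      # all members in both sets
intersect = customers_2020 & customers_2021  # shared members in both sets
except_2021 = customers_2020 - customers_2021  # in 2020 but not in 2021
```
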
Box plot

What

Level of Detail
What

Level of Detail is the granularity in data/ how fine is the information

Original table – has 3 * 12 = 36 marks - very high level of detail
Student Exam Subject Institute Marks Scored
Student 1 Math A 92
Student 2 Science A 73
Student 3 English A 86
Student 4 Math A 66
Student 5 Science A 52
Student 6 English A 100
Student 7 Math B 86
Student 8 Science B 51
Student 9 English B 99
Student 10 Math B 67
Student 11 Science B 81
Student 12 English B 54

Average of Marks Scored by Subject and Institute – 2 * 3 = 6 marks - less level of detail
Subject Institute A Institute B
English 93.0 76.5
Math 79.0 76.5
Science 62.5 66.0

Average of Marks Scored by Subject – 1 * 3 = 3 marks - lesser level of detail
Subject Average Marks Scored
English 84.8
Math 77.8
Science 64.3

Average of Marks Scored – 1 mark - least level of detail
75.6
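
The same twelve rows give a different number of results depending on which dimensions the average is grouped by. A Python sketch of the three levels of detail above; `average_marks` and the column index names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (Student, Exam Subject, Institute, Marks Scored)
marks = [
    ("Student 1", "Math", "A", 92),
    ("Student 2", "Science", "A", 73),
    ("Student 3", "English", "A", 86),
    ("Student 4", "Math", "A", 66),
    ("Student 5", "Science", "A", 52),
    ("Student 6", "English", "A", 100),
    ("Student 7", "Math", "B", 86),
    ("Student 8", "Science", "B", 51),
    ("Student 9", "English", "B", 99),
    ("Student 10", "Math", "B", 67),
    ("Student 11", "Science", "B", 81),
    ("Student 12", "English", "B", 54),
]
SUBJECT, INSTITUTE, MARKS = 1, 2, 3


def average_marks(rows, dimensions):
    """Average the marks for every combination of the given dimensions."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        groups[key].append(row[MARKS])
    return {key: mean(values) for key, values in groups.items()}


by_subject_and_institute = average_marks(marks, (SUBJECT, INSTITUTE))  # 6 values
by_subject = average_marks(marks, (SUBJECT,))                          # 3 values
overall = average_marks(marks, ())                                     # 1 value
```

Fewer dimensions in the grouping key means fewer, coarser results: six, then three, then a single overall average.
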
LOD expressions
What & Why

Level of Detail expressions help in specifying the level of detail for aggregation of a calculated field

Usually aggregation happens at visualization level

With LOD expressions, we can control the level of detail of the aggregation

Syntax – { LOD keyword Dimension(s) : Aggregate Calculation }

Example: { FIXED [segment] : SUM([profit]) }

There are three LOD keywords:

1. FIXED
2. INCLUDE
3. EXCLUDE
LOD expressions
Example { FIXED [Student]: SUM([Marks Scored]) }

Student Exam Subject Institute Marks Scored FIXED LOD
Student 1 Math A 92 251
Student 1 Science A 73 251
Student 1 English A 86 251
Student 2 Math A 66 218
Student 2 Science A 52 218
Student 2 English A 100 218
Student 3 Math B 86 236
Student 3 Science B 51 236
Student 3 English B 99 236
Student 4 Math B 67 202
Student 4 Science B 81 202
Student 4 English B 54 202

Students Sum of Marks Scored
Student 1 251
Student 2 218
Student 3 236
Student 4 202

Compare: AVG(marks scored) on institute dimension vs AVG(FIXED LOD) on institute dimension
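
The FIXED column and the comparison suggested above can be imitated in plain Python. This is only a sketch of the idea (compute the sum per student regardless of the view, then average by institute); `fixed_sum` and `average_by` are hypothetical helpers, and the code does not reproduce how Tableau evaluates LOD expressions internally.

```python
from collections import defaultdict
from statistics import mean

# (Student, Exam Subject, Institute, Marks Scored)
rows = [
    ("Student 1", "Math", "A", 92), ("Student 1", "Science", "A", 73),
    ("Student 1", "English", "A", 86), ("Student 2", "Math", "A", 66),
    ("Student 2", "Science", "A", 52), ("Student 2", "English", "A", 100),
    ("Student 3", "Math", "B", 86), ("Student 3", "Science", "B", 51),
    ("Student 3", "English", "B", 99), ("Student 4", "Math", "B", 67),
    ("Student 4", "Science", "B", 81), ("Student 4", "English", "B", 54),
]
STUDENT, SUBJECT, INSTITUTE, MARKS = range(4)


def fixed_sum(rows, dimension, measure):
    """Like { FIXED [dimension] : SUM([measure]) }: one value per row,
    equal to the total of the measure for that row's dimension value."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return [totals[row[dimension]] for row in rows]


def average_by(rows, dimension, values):
    """Average the given per-row values for each value of the dimension."""
    groups = defaultdict(list)
    for row, value in zip(rows, values):
        groups[row[dimension]].append(value)
    return {key: mean(group) for key, group in groups.items()}


fixed_lod = fixed_sum(rows, STUDENT, MARKS)

avg_marks_by_institute = average_by(rows, INSTITUTE, [r[MARKS] for r in rows])
avg_fixed_by_institute = average_by(rows, INSTITUTE, fixed_lod)
```

AVG(marks scored) by institute averages individual exam marks, while AVG(FIXED LOD) by institute averages the per-student totals, so the two numbers answer different questions.
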
LOD expressions
Example { INCLUDE [Student]: SUM([Marks Scored]) }

Student Exam Subject Institute Marks Scored INCLUDE
Student 1 Math A 92 251
Student 1 Science A 73 251
Student 1 English A 86 251
Student 2 Math A 66 218
Student 2 Science A 52 218
Student 2 English A 100 218
Student 3 Math B 86 236
Student 3 Science B 51 236
Student 3 English B 99 236
Student 4 Math B 67 202
Student 4 Science B 81 202
Student 4 English B 54 202

Students Sum of Marks Scored
Student 1 251
Student 2 218
Student 3 236
Student 4 202

Compare: AVG(marks scored) on institute dimension vs AVG(INCLUDE) on institute dimension

Compare: AVG(FIXED LOD) on Subject dimension vs AVG(INCLUDE) on subject dimension
LOD expressions
Example { EXCLUDE [Student]: SUM([Marks Scored]) }

Student Exam Subject Institute Marks Scored
Student 1 Math A 92
Student 1 Science A 73
Student 1 English A 86
Student 2 Math A 66
Student 2 Science A 52
Student 2 English A 100
Student 3 Math B 86
Student 3 Science B 51
Student 3 English B 99
Student 4 Math B 67
Student 4 Science B 81
Student 4 English B 54

Students Sum of Marks Scored
Student 1 907
Student 2 907
Student 3 907
Student 4 907

Aggregation depends on Visualization LOD
Student field will be excluded while aggregating

Compare: AVG(marks scored) vs AVG(Include) on institute dimension

Compare: AVG(Fixed LOD) on Subject dimension vs AVG(Include) on subject dimension
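
EXCLUDE starts from the dimensions in the view and drops the named one before aggregating. The Python sketch below imitates that for the table above; `exclude_sum` is a hypothetical helper and only illustrates the idea, not Tableau's actual evaluation.

```python
from collections import defaultdict

# (Student, Exam Subject, Institute, Marks Scored)
rows = [
    ("Student 1", "Math", "A", 92), ("Student 1", "Science", "A", 73),
    ("Student 1", "English", "A", 86), ("Student 2", "Math", "A", 66),
    ("Student 2", "Science", "A", 52), ("Student 2", "English", "A", 100),
    ("Student 3", "Math", "B", 86), ("Student 3", "Science", "B", 51),
    ("Student 3", "English", "B", 99), ("Student 4", "Math", "B", 67),
    ("Student 4", "Science", "B", 81), ("Student 4", "English", "B", 54),
]
STUDENT, SUBJECT, INSTITUTE, MARKS = range(4)


def exclude_sum(rows, view_dimensions, excluded, measure):
    """Like { EXCLUDE [excluded] : SUM([measure]) } in a view built on
    view_dimensions: sum over the view's dimensions minus the excluded ones,
    then report that sum for every combination shown in the view."""
    remaining = tuple(d for d in view_dimensions if d not in excluded)
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in remaining)] += row[measure]
    result = {}
    for row in rows:
        view_key = tuple(row[d] for d in view_dimensions)
        result[view_key] = totals[tuple(row[d] for d in remaining)]
    return result


# View with only Student: excluding Student leaves nothing to group by,
# so every student shows the grand total.
per_student = exclude_sum(rows, (STUDENT,), (STUDENT,), MARKS)

# View with Institute and Student: excluding Student leaves Institute,
# so every student shows the total of their institute.
per_institute_student = exclude_sum(rows, (INSTITUTE, STUDENT), (STUDENT,), MARKS)
```

The result changes when the view changes, which is what "Aggregation depends on Visualization LOD" means for EXCLUDE.
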
