Combining Data (1)

The document compares the Physical Layer and Logical Layer in data combination, highlighting that the Physical Layer requires fixed joins and unions for data at consistent levels of detail, while the Logical Layer allows for dynamic relationships that accommodate varying levels of granularity. It explains the importance of understanding granularity and relationships, including one-to-one, one-to-many, and many-to-many scenarios, as well as the roles of joins, unions, and join culling in data analysis. Additionally, it discusses the use of spatial joins and the process of creating joins in Tableau, emphasizing the need for careful consideration of data structure and integrity.

Uploaded by

Saira Wahid
Copyright
© All Rights Reserved

Combining Data

Physical Layer Vs. Logical Layer


• The major difference between these two layers lies in how they
combine data and in the level of flexibility they offer.
• Logical Layer:
• This layer uses Tableau's data model to create relationships between two or more
data tables based on common fields.
• Relationships are dynamic and flexible. Tableau queries the data only when fields
from the related tables are used in the view.
• Relationships allow your data to be at different levels of detail, accommodating
many-to-many situations without necessarily causing duplication.
• When you first connect to data in Tableau, you are automatically in the logical layer.
• The logical layer aims to provide a more flexible and intuitive way to combine data,
especially when dealing with varying levels of granularity. Tableau handles the
appropriate joins behind the scenes based on the context of your analysis.
• Physical Layer:
• This layer is where you create joins and unions to combine data from multiple
tables.
• Data combination in the physical layer is neither dynamic nor flexible. Once you
define a join or union, that combination is fixed.
• It is best suited for data that is at the same level of detail, often resulting in a
one-to-one join.
• Combining data at different levels of detail in the physical layer can lead to
data duplication.
• Joins in the physical layer add columns, while unions append rows.
• In summary, the physical layer requires explicit definitions of joins and
unions and works best with data at a consistent level of detail. The logical
layer uses relationships to describe how tables are related and lets
Tableau determine dynamically how to combine data based on the
analysis, which offers greater flexibility when dealing with data at different
levels of detail.
Relationships
• Relationships provide a more dynamic and flexible way to connect
multiple tables in the logical layer. Instead of creating a fixed join, you
define how tables are related based on common fields. Tableau then
intelligently determines how to combine the data based on the
context of your analysis in each worksheet.
• Relationships are particularly beneficial when dealing with data at
different levels of detail or in many-to-many relationships, as they
help to avoid the data duplication issues that can arise with traditional
joins. Tableau queries the related tables only when the fields from
those tables are actually used in a visualisation
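The duplication issue mentioned above can be sketched in a few lines of Python. This is only an illustration, not Tableau's actual query generation: the `orders` and `targets` tables and the `sum_by` helper are hypothetical. It contrasts a fixed join, which repeats each coarse row once per matching fine row, with a relationship-style approach that aggregates each table at its own level of detail before combining.

```python
from collections import defaultdict

# Hypothetical sample data: orders are per transaction, targets are per region.
orders = [
    {"order_id": 1, "region": "East", "sales": 100},
    {"order_id": 2, "region": "East", "sales": 150},
    {"order_id": 3, "region": "West", "sales": 200},
]
targets = [
    {"region": "East", "target": 300},
    {"region": "West", "target": 250},
]


def sum_by(rows, key, measure):
    """Sum `measure` for each distinct value of `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[measure]
    return dict(totals)


# Fixed join: each target row is repeated once per matching order,
# so summing the target afterwards double-counts the East region.
joined = [
    {**o, "target": t["target"]}
    for o in orders
    for t in targets
    if o["region"] == t["region"]
]
join_target_total = sum_by(joined, "region", "target")

# Relationship-style: aggregate each table first, then combine per region.
sales_total = sum_by(orders, "region", "sales")
target_total = sum_by(targets, "region", "target")
related = {r: (sales_total[r], target_total[r]) for r in sales_total}

print(join_target_total)
print(related)
```

In the joined result the East target sums to 600 instead of 300, while the aggregate-first version keeps each target at its original value.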
Level of detail (LOD) can refer to two related concepts
• The level of detail of the view, which is determined by the dimensions
you have placed on the Rows and Columns shelves and on the Marks card. This
defines the context at which measures are aggregated. For example, if you
have 'Region' on the Rows shelf, the level of detail of the view is 'per
region', and any measures will be aggregated at that level.
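As a rough sketch of how the dimensions in the view set the aggregation level, the snippet below sums a measure once per distinct combination of dimensions. The rows and the `aggregate` helper are hypothetical and only mimic the idea; they are not how Tableau computes a view.

```python
from collections import defaultdict

# Hypothetical rows; 'region' plays the role of the dimension on the Rows shelf.
rows = [
    {"region": "North", "city": "Leeds", "sales": 120},
    {"region": "North", "city": "York", "sales": 80},
    {"region": "South", "city": "Bath", "sales": 50},
]


def aggregate(rows, dimensions, measure):
    """Sum `measure` once per distinct combination of `dimensions`."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Only 'region' in the view: one aggregated value per region.
per_region = aggregate(rows, ["region"], "sales")

# Adding 'city' makes the level of detail finer: one value per region and city.
per_city = aggregate(rows, ["region", "city"], "sales")

print(per_region)
print(per_city)
```

Adding a dimension never changes the underlying rows; it only changes how many groups the measure is aggregated into.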
Granularity
• Granularity refers to the level of detail present in your data. Data with
finer granularity has more detail, such as individual transactions or
events.
• For example, a sales dataset with each row representing a single
product sold at a specific time and location has a fine granularity.
Conversely, data with coarser granularity is more aggregated or
summarized, such as total monthly sales per region.
• Understanding the granularity of your data is crucial when combining
tables, as joining tables with different granularities without proper
consideration can lead to unexpected results, such as the duplication of
records.
Fine granularity Vs. Coarse granularity
Fine granularity:
Data with finer granularity has more detail.
Example: a dataset recording every single transaction made at a store, including the
specific items bought, the exact time, and the individual customer. This is data with a
high level of detail.

Coarse granularity:
"Coarser" means a lower level of detail in the data. It refers to data that has been
aggregated or summarized.
Example: a dataset that only shows the total daily sales for each store location. This data
has been aggregated from the individual transactions and has a lower level of detail; it is
therefore of coarser granularity.
Relationships created in the logical layer represent different types of cardinality:
• One-to-One Relationships:
• These occur when a single record in one table corresponds to at most one record in another table.
While the logical layer can handle this, traditional joins in the physical layer may also be
suitable for this scenario, especially when the data is at the same level of detail.
• One-to-Many Relationships:
• This is when one record in one table can be related to multiple records in another table. A common
example is the relationship between customers and their orders (one customer can have many
orders). The logical layer is well-suited to manage these kinds of relationships.
• Many-to-Many Relationships:
• These are situations where multiple records in one table can be related to multiple records in
another table.
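One way to check which of these cases applies is to test whether the linking field is unique on each side. The sketch below is an assumption-laden illustration: the `cardinality` helper and the sample key lists are hypothetical, and it classifies purely on key uniqueness within each column.

```python
def cardinality(left_keys, right_keys):
    """Classify a relationship by whether the key is unique on each side."""
    left_unique = len(set(left_keys)) == len(left_keys)
    right_unique = len(set(right_keys)) == len(right_keys)
    if left_unique and right_unique:
        return "one-to-one"
    if left_unique:
        return "one-to-many"
    if right_unique:
        return "many-to-one"
    return "many-to-many"


# Hypothetical key columns.
customer_ids = [1, 2, 3]          # one row per customer
order_customer_ids = [1, 1, 2]    # a customer can have several orders
passport_customer_ids = [1, 2]    # at most one passport per customer

print(cardinality(customer_ids, passport_customer_ids))
print(cardinality(customer_ids, order_customer_ids))
print(cardinality(order_customer_ids, [1, 1, 3]))
```

The customers-to-orders pair comes out as one-to-many, matching the example in the text.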
Joins, Unions, and Relationships
• Joins, unions, and relationships are different ways of
combining data from one or more sources to facilitate analysis.
Joins
• A join combines the fields (columns) of two tables.
• You need joins to combine data from multiple tables by adding columns. This is
typically done when the tables share one or more common fields (keys).
For example, you might join a table of sales orders with a table of customer details
using a customer ID to bring the customer's address information into your sales
analysis.
• Tableau supports different types of joins like inner, left, right, and full outer joins,
each determining which rows are included in the combined dataset based on the
matching keys.
• Joins are particularly useful for data with one-to-one or one-to-many
relationships. However, if the data is at different levels of detail, joins in the
physical layer can lead to data duplication.
Types of Joins:
• Inner Join: This is the default join type and retains only the data where there is a
match in the specified join clause across both data sources. The Venn diagram for
an inner join shows only the overlapping section coloured.
• Left Join: This join retains all the data from the left-hand side data source and
brings in any matching records from the right-hand side data source. If there is no
match in the right-hand data source, the columns from that source will have null
values. The Venn diagram shows the left circle and the overlapping section
coloured.
• Right Join: This is the opposite of a left join. It retains all the data from the right-
hand side data source and brings in any matching records from the left-hand side
data source. Null values will appear for columns from the left-hand side where
there is no match. The Venn diagram shows the right circle and the overlapping
section coloured.
• Full Outer Join: This join retains all data from both data sources, combining
records where the join clause is met and filling in nulls where there is no match.
The Venn diagram shows both circles fully coloured
Types of Joins
Inner Join
Left table: CustomerNames. Right table: CustomerAges.
The output table shows only the matching rows from the left and right tables. Any
non-matching row is excluded from the output table.

CustomerNames          CustomerAges
ID   Name              ID   Age
1    Ali               1    20
2    Hassan            3    35

Output:
ID   Name     ID   Age
1    Ali      1    20
Left Join
Left table: CustomerNames. Right table: CustomerAges.
The output table shows all rows from the left table but only the matching rows from
the right table. Where there is no match, the right-table columns are NULL.

CustomerNames          CustomerAges
ID   Name              ID   Age
1    Ali               1    20
2    Hassan            3    35

Output:
ID   Name     ID     Age
1    Ali      1      20
2    Hassan   NULL   NULL
Right Join
Left table: CustomerNames. Right table: CustomerAges.
The output table shows all rows from the right table but only the matching rows
from the left table. Where there is no match, the left-table columns are NULL.

CustomerNames          CustomerAges
ID   Name              ID   Age
1    Ali               1    20
2    Hassan            3    35

Output:
ID     Name     ID   Age
1      Ali      1    20
NULL   NULL     3    35
Full Join
Left table: CustomerNames. Right table: CustomerAges.
The output table shows all rows from the left table and all rows from the right
table. Matching rows are combined, and NULLs fill the columns where there is no match.

CustomerNames          CustomerAges
ID   Name              ID   Age
1    Ali               1    20
2    Hassan            3    35

Output:
ID     Name     ID     Age
1      Ali      1      20
2      Hassan   NULL   NULL
NULL   NULL     3      35
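The four join types can be reproduced on the CustomerNames and CustomerAges tables with a small Python sketch. The `join` helper is hypothetical and written only for illustration (it assumes both tables are non-empty and uses `None` in place of NULL); it is not how Tableau performs joins internally.

```python
customer_names = [{"ID": 1, "Name": "Ali"}, {"ID": 2, "Name": "Hassan"}]
customer_ages = [{"ID": 1, "Age": 20}, {"ID": 3, "Age": 35}]


def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; `how` is inner, left, right or full.

    Each output row is a tuple of the left columns followed by the right columns.
    Assumes both tables contain at least one row.
    """
    left_cols = list(left[0])
    right_cols = list(right[0])
    result = []
    matched_right = set()
    for l_row in left:
        matches = [i for i, r_row in enumerate(right) if r_row[key] == l_row[key]]
        matched_right.update(matches)
        for i in matches:
            result.append(
                tuple(l_row[c] for c in left_cols)
                + tuple(right[i][c] for c in right_cols)
            )
        # Keep unmatched left rows for left and full joins.
        if not matches and how in ("left", "full"):
            result.append(
                tuple(l_row[c] for c in left_cols) + (None,) * len(right_cols)
            )
    # Keep unmatched right rows for right and full joins.
    if how in ("right", "full"):
        for i, r_row in enumerate(right):
            if i not in matched_right:
                result.append(
                    (None,) * len(left_cols) + tuple(r_row[c] for c in right_cols)
                )
    return result


for how in ("inner", "left", "right", "full"):
    print(how, join(customer_names, customer_ages, "ID", how))
```

Each printed result lists rows as (ID, Name, ID, Age), in the same column order as the output tables above.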
Creating Joins in Tableau Desktop
1. Open the Data Source tab.
2. Ensure the desired data sources are added. Note that Tableau Data sources (.tds or .tdsx) cannot be used in
joins.
3. Drag the main table onto the canvas.
4. Double-click the dragged table to enter the join/union interface (physical layer).
5. Drag the table you want to join with onto the canvas next to the first table. A connecting line with a Venn
diagram will appear.
6. Click the Venn diagram to open the join configuration window.
7. Select the desired join type (Inner, Left, Right, Full Outer) by clicking on the corresponding Venn diagram icon.
8. Define the join clause by selecting the fields that should match between the two data sources. You can select
common fields with the same name, or choose different fields. You can add multiple join clauses using the "+"
icon.
9. Ensure that the data types of the fields used in the join clause are the same.
10. You can also create complex join conditions using calculation logic, for example, joining on date ranges or
concatenating fields.
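The idea behind step 10, joining on a calculated key rather than a stored field, can be sketched as follows. The tables, field names, and the `full_name` helper are all hypothetical; the example builds the join key by concatenating two fields and then performs an inner join on it.

```python
# Hypothetical tables: one stores the name in two fields, the other in one.
employees = [
    {"first": "Ali", "last": "Khan", "dept": "Sales"},
    {"first": "Sara", "last": "Malik", "dept": "Finance"},
]
badges = [
    {"full_name": "Ali Khan", "badge": "A-17"},
    {"full_name": "Omar Aziz", "badge": "B-02"},
]


def full_name(row):
    """Calculated join key: first and last name separated by a space."""
    return f"{row['first']} {row['last']}"


# Index the right table by its key, then keep only employees with a match
# (inner join on the calculated key).
badge_by_name = {b["full_name"]: b["badge"] for b in badges}
joined = [
    {**e, "badge": badge_by_name[full_name(e)]}
    for e in employees
    if full_name(e) in badge_by_name
]

print(joined)
```

Both sides of the calculated key are strings here, which mirrors the requirement in step 9 that the joined fields share a data type.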
Considerations for Joining Data:
• Granularity: Be mindful of the level of detail in your data sources. Joining data
at different levels of aggregation can lead to duplication of records and
inaccurate aggregations. Consider aggregating data before joining if necessary.
• Primary and Foreign Keys: Joins often rely on primary keys (unique identifiers
for records in a table) and foreign keys (fields in one table that reference the
primary key in another). Joining on non-unique keys can lead to erroneous
results.
• Join Culling: Tableau can optimise queries by using join culling, where it only
includes joined tables that are specifically referenced by fields in the view,
assuming referential integrity.
• Spatial Joins: These are used specifically when your data contains spatial
information to join based on spatial relationships.
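Before joining, it can help to check the key fields for the problems listed above. The sketch below uses hypothetical `customers` and `orders` tables and two illustrative helpers: one reports key values that are not unique, the other reports foreign-key values that have no matching primary key.

```python
from collections import Counter

# Hypothetical tables; customer_id 2 is duplicated and order 12 has no customer.
customers = [
    {"customer_id": 1, "name": "Ali"},
    {"customer_id": 2, "name": "Hassan"},
    {"customer_id": 2, "name": "Hassan (duplicate)"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 2},
    {"order_id": 12, "customer_id": 3},
]


def duplicate_keys(rows, key):
    """Return key values that occur more than once in `rows`."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def orphan_keys(child_rows, parent_rows, key):
    """Return foreign-key values in `child_rows` that have no parent row."""
    parent_keys = {row[key] for row in parent_rows}
    return sorted({row[key] for row in child_rows} - parent_keys)


print(duplicate_keys(customers, "customer_id"))
print(orphan_keys(orders, customers, "customer_id"))
```

A non-empty result from either check suggests the join may duplicate or drop rows and that the data should be cleaned or aggregated first.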
Join Culling
• Join culling is a performance optimisation technique used by Tableau
when querying databases with joins. It works by generating SQL
queries that only include the tables necessary to retrieve the data
required for the current view.
• How it works: When you build a visualisation in Tableau using data
from joined tables, Tableau analyses which fields from which tables
are actually needed to render the view. If a joined table does not
contain any of the fields required for the visualisation, Tableau's query
engine will exclude (or "cull") that table from the generated SQL
query.
• Purpose: The primary goal of join culling is to improve query
performance by reducing the complexity of the SQL queries sent to
the database and the amount of data that needs to be processed and
transferred. By only querying necessary tables, Tableau can execute
queries faster and more efficiently.
• Conditions for effectiveness: Join culling assumes that the tables in
the database have referential integrity. This means that if you join a
fact table to a dimension table on a common key, the information in
that dimension table is consistent and reliable. In such cases, if a
query only needs data from the fact table (or a subset of joined
dimension tables), the other joined tables will not be referenced.
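The effect of culling, and why it depends on referential integrity, can be illustrated with Python's built-in `sqlite3` module. The schema, data, and both SQL statements are hypothetical; they are written by hand to resemble a query with and without an unused joined table, and are not SQL that Tableau generated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact and dimension tables. The schema does not enforce
# referential integrity, so an orphan order can be inserted later.
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ali'), (2, 'Hassan');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);
""")

# The view only needs orders.amount, but this query still includes the join.
full_query = """
    SELECT SUM(o.amount)
    FROM orders AS o
    INNER JOIN customers AS c ON c.customer_id = o.customer_id
"""
# Culled query: customers is dropped because no field from it is used.
culled_query = "SELECT SUM(amount) FROM orders"

full_total = conn.execute(full_query).fetchone()[0]
culled_total = conn.execute(culled_query).fetchone()[0]
print(full_total, culled_total)

# Break referential integrity: an order whose customer does not exist.
conn.execute("INSERT INTO orders VALUES (13, 99, 10.0)")
orphan_full = conn.execute(full_query).fetchone()[0]
orphan_culled = conn.execute(culled_query).fetchone()[0]
print(orphan_full, orphan_culled)
```

With intact keys both queries return the same total, so dropping the join is safe. Once an orphan row exists, the inner join filters it out while the culled query counts it, which is why culling is only valid when referential integrity holds.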
Spatial Joins
• Spatial joins provide a powerful way to integrate and analyse data
based on geographic relationships, allowing you to go beyond
traditional attribute-based joins and leverage the spatial context of
your data.
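A spatial join matches rows on a geometric condition instead of equal key values. As a minimal sketch, assuming flat x/y coordinates rather than real geographic data, the snippet below pairs each point with the polygon that contains it using a standard ray-casting test. The `zones` and `stores` data and the `point_in_polygon` helper are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as a vertex list?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical polygons (two adjacent squares) and point locations.
zones = {
    "Zone A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "Zone B": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
stores = {"Store 1": (1, 1), "Store 2": (6, 2), "Store 3": (9, 9)}

# Inner spatial join: keep only (store, zone) pairs where the point is inside.
spatial_join = [
    (store, zone)
    for store, location in stores.items()
    for zone, polygon in zones.items()
    if point_in_polygon(location, polygon)
]

print(spatial_join)
```

Store 3 lies outside both zones, so it is dropped, just as an unmatched row is dropped by an inner join on a key.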
Unions
We use unions to combine data by appending rows from two or more
tables or files that have the same structure and set of fields. The tables
should have the same number of columns and the same data types. Unlike
joins, unions do not need a key to combine data.
For example, if you have sales data for different years stored in
separate Excel sheets with identical columns, you would use a union to
stack this data on top of each other, creating a single table with all the
historical sales records. Unions are essential for analysing data that is
split across multiple sources but represents the same type of
information.
How Union Works

Orders 2022        Orders 2023
ID   Date          ID   Date
1    2022          3    2023
2    2022          4    2023

Output:
ID   Date
1    2022
2    2022
3    2023
4    2023
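The same union can be sketched in Python on the Orders 2022 and Orders 2023 tables. The `union` helper is hypothetical; as an assumption made for this illustration, it rejects tables whose column names differ, in line with the same-structure requirement described above.

```python
orders_2022 = [{"ID": 1, "Date": 2022}, {"ID": 2, "Date": 2022}]
orders_2023 = [{"ID": 3, "Date": 2023}, {"ID": 4, "Date": 2023}]


def union(*tables):
    """Append the rows of tables that share the same column names and order.

    Assumes the first table contains at least one row.
    """
    columns = list(tables[0][0])
    combined = []
    for table in tables:
        for row in table:
            if list(row) != columns:
                raise ValueError(f"column mismatch: {list(row)} != {columns}")
            combined.append(dict(row))
    return combined


all_orders = union(orders_2022, orders_2023)
for row in all_orders:
    print(row["ID"], row["Date"])
```

No key is involved: the rows are simply stacked, so the result has as many rows as the inputs combined and the same two columns.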