Share Data Through The Art of Visualization
You learned about the David McCandless method in the first lesson on effective data
visualizations, but as a refresher, the McCandless Method lists four elements of good data
visualization:
This approach is a useful set of questions that can help consumers of data visualization
critique what they are consuming and determine how effective it is. The Checkup has
three questions:
1. What is the practical question?
2. What does the data say?
3. What does the visual say?
Note: This checklist helps you think about your data viz from the perspective of your
audience and decide if your visual is communicating your data effectively to them or not.
In addition to these frameworks, there are some other building blocks that can help you
construct your data visualizations.
Marks
Marks are basic visual objects like points, lines, and shapes. Every mark can be broken
down into four qualities:
Channels
Channels are visual aspects or variables that represent characteristics of the data.
Channels are basically marks that have been used to visualize data. Channels will vary in
terms of how effective they are at communicating data based on three elements:
1. Accuracy - Are the channels helpful in accurately estimating the values being
represented?
For example, color is very accurate when communicating categorical differences, like
apples and oranges. But it is much less effective when distinguishing quantitative data like
5 from 5.5.
There are many ways of drawing attention to specific parts of a visual, and many of them
leverage pre-attentive attributes like line length, size, line width, shape, enclosure, hue,
and intensity.
3. Grouping - How good is a channel at communicating groups that exist in the data?
Design principles
Once you understand the pre-attentive attributes of data visualization, you can go on to
design principles for creating effective visuals. These design principles are important to
your work as a data analyst because they help you make sure that you are creating
visualizations that communicate your data effectively to your audience. By keeping these
rules in mind, you can plan and evaluate your data visualizations to decide if they are
working for you and your goals. And, if they aren’t, you can adjust them!
Principle: Description
Choose the right visual: One of the first things you have to decide is which visual will be
the most effective for your audience. Sometimes, a simple table is the best visualization.
Other times, you need a more complex visualization to illustrate your point.
Optimize the data-ink ratio: Optimizing the data-ink ratio entails focusing on the part of
the visual that is essential to understanding the point of the chart. Try to minimize
non-data ink, like boxes around legends or shadows.
Use orientation effectively: Make sure the written components of the visual, like the labels
on a bar chart, are easy to read. You can change the orientation of your visual to make it
easier to read and understand.
Color: There are a lot of important considerations when thinking about using color in your
visuals. These include using color consciously and meaningfully, staying consistent
throughout your visuals, being considerate of what colors mean to different people, and
using inclusive color scales that make sense for everyone viewing them.
Numbers of things: Think about how many elements you include in any visual. If your
visualization uses lines, try to plot five or fewer. If that isn’t possible, use color or hue
to emphasize important lines. Also, when using visuals like pie charts, try to keep the
number of segments to fewer than seven, since too many elements can be distracting.
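The "numbers of things" guideline above can be turned into a quick check before you share a chart. The sketch below is only an illustration: the function name `check_element_count` and the chart-type strings are hypothetical, and the thresholds simply restate the guideline (five or fewer lines, fewer than seven pie segments).

```python
def check_element_count(chart_type: str, n_elements: int) -> str:
    """Return a short note based on the 'numbers of things' guideline."""
    if chart_type == "line":
        # Guideline: try to plot five or fewer lines.
        if n_elements <= 5:
            return "ok"
        return "too many lines: use color or hue to emphasize the important ones"
    if chart_type == "pie":
        # Guideline: keep the number of segments to fewer than seven.
        if n_elements < 7:
            return "ok"
        return "too many segments: they can be distracting"
    # The guideline only names line charts and pie charts.
    return "no guideline for this chart type"


print(check_element_count("line", 4))
print(check_element_count("pie", 9))
```

A check like this is a reminder rather than a rule; a chart can break the threshold and still be the right choice for your audience.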
Avoiding misleading or deceptive charts
Finally, keep in mind that data visualization is an art form, and it takes time to develop
these skills. Over your career as a data analyst, you will not only learn how to design good
data visualizations, but you will also learn how to evaluate good data visualizations. Use
these tips to think critically about data visualization—both as a creator and as an audience
member.
Further reading
The beauty of data visualization: In this video, David McCandless explains the need
for design to not just be beautiful, but for it to be meaningful as well. Data
visualization must be able to balance function and form for it to be relevant to your
audience.
‘The McCandless Method’ of data presentation: At first glance, this blog appears to
be written by a David McCandless fan, and it is. However, it contains very useful
information and provides an in-depth look at the 5-step process that McCandless
uses to present his data.
Information is beautiful: Founded by McCandless himself, this site serves as a hub
of sample visualizations that make use of the McCandless method. Explore data
from the news, science, the economy, and so much more and learn how to make
visual decisions based on facts from all kinds of sources.
Beautiful daily news: In this McCandless collection, explore uplifting trends and
statistics that are beautifully visualized for your creative enjoyment. A new chart is
released every day so be sure to visit often to absorb the amazing things happening
all over the world.
The Wall Street Journal Guide to Information Graphics: The Dos and Don'ts of
Presenting Data, Facts, and Figures: This is a comprehensive guide to data
visualization, including chapters on basic data visualization principles and how to
create useful data visualizations even when you find yourself in a tricky situation.
This is a useful book to add to your data visualization library, and you can
reference it over and over again.
Of course, one of the best ways to understand the importance of data visualization is to go
through different examples of it. As a junior data analyst, you want to have several
visualization options for your creative process whenever you need. Below is a list of
resources that can inspire your next data-driven decisions, as well as teach you how to
make your data more accessible to your audience:
The data visualization catalogue: Not sure where to start with data visualization?
This catalogue features a range of different diagrams, charts, and graphs to help
you find the best fit for your project. As you navigate each category, you will get a
detailed description of each visualization as well as its function and a list of similar
visuals.
The 25 best data visualizations: In this collection of images, explore the best
examples of data that gets made into a stunning visual. Simply click on the link
below each image to get an in-depth view of each project, and learn why making
data visually appealing is so important.
10 data visualization blogs: Each link will lead you to a blog that is a fountain of
information on everything from data storytelling to graphic data. Get your next
great idea or just browse through some visual inspiration.
Information is beautiful: Founded by David McCandless, this gallery is dedicated to
helping you make clearer, more informed visual decisions based on facts and data.
These projects are made by students, designers, and even data analysts to help
you gain insight into how they have taken their own data and turned it into visual
storytelling.
Data studio gallery: Information is vital, but information presented in a digestible
way is even more useful. Browse through this interactive gallery and find examples
of different types of data communicated visually. You can even use the data studio
tool to create your own data-driven visual.
Correlation in statistics is the measure of the degree to which two variables move
in relationship to each other. An example of correlation is the idea that “As the
temperature goes up, ice cream sales also go up.” It is important to remember that
correlation doesn’t mean that one event causes another. But, it does indicate that
they have a pattern with or a relationship to each other. If one variable goes up and
the other variable also goes up, it is a positive correlation. If one variable goes up
and the other variable goes down, it is a negative or inverse correlation. If one
variable goes up and the other variable stays about the same, there is no
correlation.
Causation refers to the idea that an event leads to a specific outcome. For
example, when lightning strikes, we hear the thunder (sound wave) caused by the
air heating and cooling from the lightning strike. Lightning causes thunder.
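One common way to put a number on correlation is the Pearson correlation coefficient, which ranges from -1 (negative correlation) through 0 (no linear correlation) to +1 (positive correlation). The sketch below computes it by hand; the function name `pearson_r` and all of the data values are made up for illustration and are not taken from this reading.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of each pair's deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical daily observations (invented for this example).
temperature_c = [18, 21, 24, 27, 30, 33]
ice_cream_sales = [120, 135, 160, 180, 210, 240]
heater_sales = [50, 44, 37, 30, 22, 15]

# Both go up together: the coefficient is close to +1.
print(round(pearson_r(temperature_c, ice_cream_sales), 2))
# One goes up while the other goes down: the coefficient is close to -1.
print(round(pearson_r(temperature_c, heater_sales), 2))
```

A coefficient near +1 or -1 only describes how closely the two variables move together. It says nothing about whether one causes the other.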
Why is differentiating between correlation and causation important?
When you make conclusions from data analysis, you need to make sure that you don’t
assume a causal relationship between elements of your data when there is only a
correlation. When your data shows that outdoor temperature and ice cream consumption
both go up at the same time, it might be tempting to conclude that hot weather causes
people to eat ice cream. But, a closer examination of the data would reveal that every
change in temperature doesn’t lead to a change in ice cream purchases. In addition, there
might have been a sale on ice cream at the same time that the data was collected, which
might not have been considered in your analysis.
Knowing the difference between correlation and causation is important when you make
conclusions from your data since the stakes could be high. The next two examples
illustrate the high stakes to health and human services.
Cause of disease
For example, pellagra is a disease with symptoms of dizziness, sores, vomiting, and
diarrhea. In the early 1900s, people thought that the disease was caused by unsanitary
living conditions. Most people who got pellagra also lived in unsanitary environments.
But, a closer examination of the data showed that pellagra was the result of a lack of
niacin (Vitamin B3). Unsanitary conditions were related to pellagra because most people
who couldn’t afford to purchase niacin-rich foods also couldn’t afford to live in more
sanitary conditions. But, dirty living conditions turned out to be a correlation only.
Distribution of aid
Here is another example. Suppose you are working for a government agency that provides
food stamps. You noticed from the agency’s Google Analytics that people who qualify for
food stamps are browsing the official website, but they are leaving the site without signing
up for benefits. You think that the people visiting the site are leaving because they aren’t
finding the information they need to sign up for food stamps. Google Analytics can help
you find clues (correlations), like the same people coming back many times or how quickly
people leave the page. One of those correlations might lead you to the actual cause, but
you will need to collect additional data, like in a survey, to know exactly why people
coming to the site aren’t signing up for food stamps. Only then can you figure out how to
increase the sign-up rate.
Key takeaways
In your data analysis, remember to:
Further information
You can explore the following article and training for more information about correlation
and causation:
Correlation is not causation: This article describes the impact to a business when
correlation and causation are confused.
Correlation and causation (Khan Academy lesson): This lesson describes
correlation and causation along with a working example. Follow the examples of
the analysis and notice if there is a positive correlation between frostbite and
sledding accidents.
Line chart
A line chart is used to track changes over short and long periods of time. When smaller
changes exist, line charts are better to use than bar graphs. Line charts can also be used to
compare changes over the same period of time for more than one group.
Let’s say you want to present the graduation frequency for a particular high school
between the years 2008-2012. You would input your data in a table like this:
Maybe your data is more specific than above. For example, let’s say you are tasked with
presenting the difference of graduation rates between male and female students. Then
your chart would resemble something like this:
Column chart
Column charts use size to contrast and compare two or more values, using height or
lengths to represent the specific values.
Below is example data concerning vehicle sales over the course of 5 months:
Heatmap
Similar to bar charts, heatmaps also use color to compare categories in a data set. They
are mainly used to show relationships between two variables and use a system of color-
coding to represent different values. The following heatmap plots temperature changes
for each city during the hottest and coldest months of the year.
Pie chart
The pie chart is a circular graph that is divided into segments representing proportions
corresponding to the quantity it represents, especially when dealing with parts of a whole.
For example, let’s say you are determining favorite movie categories among avid movie
watchers. You have gathered the following data:
Scatterplot
Scatterplots show relationships between different variables. Scatterplots are typically
used for two variables for a set of data, although additional variables can be displayed.
For example, you might want to show data of the relationship between temperature
changes and ice cream sales. It would resemble something like this:
As you may notice, the higher the temperature got, the more demand there was for ice
cream – so the scatterplot is great for showing the relationship between the two variables.
Distribution graph
A distribution graph displays the spread of various outcomes in a dataset.
Let’s apply this to real data. To account for its supplies, a brand new coffee shop owner
wants to measure how many cups of coffee their customers consume, and they want to
know if that information is dependent on the days and times of the week. That
distribution graph would resemble something like this:
From this distribution graph, you may notice that the amount of coffee sales steadily
increases from the beginning of the week, reaching the highest point mid-week, and then
decreases towards the end of the week.
If outcomes are categorized on the x-axis by distinct numeric values (or ranges of numeric
values), the distribution becomes a histogram. If data is collected from a customer
rewards program, they could categorize how many customers consume between one and
ten cups of coffee per week. The histogram would have ten columns representing the
number of cups, and the height of the columns would indicate the number of customers
drinking that many cups of coffee per week.
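The histogram described above boils down to counting how many customers fall into each value. Here is a minimal sketch of that counting step, assuming a hypothetical list of weekly cup counts from the rewards program (the numbers are invented for illustration). It prints a rough text version of the histogram, with one row per number of cups.

```python
from collections import Counter

# Hypothetical rewards-program data: cups of coffee per week, one entry per customer.
cups_per_week = [1, 3, 3, 4, 5, 5, 5, 6, 7, 7, 8, 10, 2, 5, 3]

# Count how many customers drink each number of cups.
counts = Counter(cups_per_week)

# One row per value from 1 to 10; the run of '#' stands in for the column height.
for cups in range(1, 11):
    print(f"{cups:>2} cups | {'#' * counts[cups]} ({counts[cups]})")
```

Values that no customer reported, such as nine cups in this made-up data, still get a row with a count of zero, which keeps the x-axis evenly spaced.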
Reviewing each of these visual examples, where do you notice that they fit in relation to
your type of data? One way to answer this is by evaluating patterns in data. Meaningful
patterns can take many forms, such as:
A decision tree is a decision-making tool that allows you, the data analyst, to make
decisions based on key questions that you can ask yourself. Each question in the
visualization decision tree will help you make a decision about critical features for your
visualization. Below is an example of a basic decision tree to guide you towards making a
data-driven decision about which visualization is the best way to tell your story. Please
note that there are many different types of decision trees that vary in complexity, and can
provide more in-depth decisions.
- Does your data have only one numeric variable? Histogram or density plot
- Are there multiple data sets? Line chart or pie chart
- Are you measuring changes over time? Bar chart
- Do relationships between the data need to be shown? Scatter plot or heatmap
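The four questions in the basic decision tree above can be sketched as a small function. This is only an illustration of the idea: the function name `suggest_visuals` and its parameter names are hypothetical, and the suggestions are copied from the four-question summary rather than from any charting tool.

```python
def suggest_visuals(one_numeric_variable=False, multiple_datasets=False,
                    changes_over_time=False, show_relationships=False):
    """Walk the four questions in order and collect the suggested chart types."""
    suggestions = []
    if one_numeric_variable:
        suggestions += ["histogram", "density plot"]
    if multiple_datasets:
        suggestions += ["line chart", "pie chart"]
    if changes_over_time:
        suggestions += ["bar chart"]
    if show_relationships:
        suggestions += ["scatter plot", "heatmap"]
    return suggestions


# Example: heights of a group of students (a single numeric variable).
print(suggest_visuals(one_numeric_variable=True))
# Example: hours spent studying versus grades (a relationship between two variables).
print(suggest_visuals(show_relationships=True))
```

The function returns candidates, not a final answer. You still choose among them based on your audience and the story you want to tell.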
Begin with your story
Start off by evaluating the type of data you have and go through a series of questions to
determine the best visual source:
Does your data have only one numeric variable? If your data has a single,
continuous, numerical variable, then a histogram or density plot is the best
way to plot it. Depending on your type of data, a bar
chart can even be appropriate in this case. For example, if you have data pertaining
to the height of a group of students, you will want to use a histogram to visualize
how many students there are in each height range:
Are there multiple datasets? For cases dealing with more than one set of data,
consider a line or pie chart for accurate representation of your data. A line chart
connects the points in each data set with a continuous line, showing how
numbers have changed over time. A pie chart is good for dividing a whole into
multiple categories or parts. An example of this is when you are measuring
quarterly sales figures of your company. Below are examples of this data plotted
on both a line and pie chart.
Are you measuring changes over time? A line chart is usually adequate for
plotting trends over time. However, when the changes are larger, a bar chart is the
better option. If, for example, you are measuring the number of visitors to NYC over
the past 6 months, the data would look like this:
Do relationships between the data need to be shown? When you have two
variables for one set of data, it is important to point out how one affects the other.
Variables that pair well together are best plotted on a scatterplot. However, if there
are too many data points, the relationship between variables can be obscured so a
heat map can be a better representation in that case. If you are measuring the
population of people across all 50 states in the United States, your data points
would consist of millions so you would use a heat map. If you are simply trying to
show the relationship between the number of hours spent studying and its effects
on grades, your data would look like this:
Additional resources
The decision tree example used in this reading is one of many. There are multiple decision
trees out there with varying levels of details that you can use to help guide your visual
decisions. If you want more in-depth insight into more visual options, explore the
following resources:
After we go through the various design principles, spend some time examining the visual
examples to ensure that you have a thorough understanding of how the principle is put
into practice. Let’s get into it!
1. Balance: The design of a data visualization is balanced when the key visual elements,
like color and shape, are distributed evenly. This doesn’t mean that you need complete
symmetry, but your visualization shouldn’t have one side distracting from the other. If
your data visualization is balanced, this could mean that the lines used to create the
graphics are similar in length on both sides, or that the space between objects is equal.
For example, this column chart (also shown below) is balanced; even though the columns
are different heights and the chart isn’t symmetrical, the colors, width, and spacing of the
columns keep this data visualization balanced. The colors provide sufficient contrast to
each other so that you can pay attention to both the motivation level and the energy level
displayed.
2. Emphasis: Your data visualization should have a focal point, so that your audience
knows where to concentrate. In other words, your visualizations should emphasize the
most important data so that users recognize it first. Using color and value is one effective
way to make this happen. By using contrasting colors, you can make certain that graphic
elements—and the data shown in those elements—stand out.
For example, you will notice a heat map data visualization below from The Pudding’s
“Where Slang Comes From” article. This heat map uses colors and value intensity to
emphasize the states where search interest is highest. You can visually identify the
increase in the search over time from low interest to high interest. This way, you are able
to quickly grasp the key idea being presented without knowing the specific data values.
3. Movement: Movement can refer to the path the viewer’s eye travels as they look at a
data visualization, or literal movement created by animations. Movement in data
visualization should mimic the way people usually read. You can use lines and colors to
pull the viewer’s attention across the page.
For example, notice how the average line in this combo chart (also shown below) draws
your attention from left to right. Even though this example isn’t moving, it still uses the
movement principle to guide viewers’ understanding of the data.
4. Pattern: You can use similar shapes and colors to create patterns in your data
visualization. This can be useful in a lot of different ways. For example, you can use
patterns to highlight similarities between different data sets, or break up a pattern with a
unique shape, color, or line to create more emphasis.
In the example below, the different colored categories of this stacked column chart
are a consistent pattern that makes it easier to compare book sales by
genre in each column. Notice in the chart that the Fantasy & Sci Fi category (royal blue) is
increasing over time even as the general category (green) is staying about the same.
5. Repetition: Repeating chart types, shapes, or colors adds to the effectiveness of your
visualization. Think about the book sales chart from the previous example: the repetition
of the colors helps the audience understand that there are distinct sets of data. You may
notice this repetition in all of the examples we have reviewed so far. Take some time to
review each of the previous examples and notice the elements that are repeated to create
a meaningful visual story.
6. Proportion: Proportion is another way that you can demonstrate the importance of
certain data. Using various colors and sizes helps demonstrate that you are calling
attention to a specific visual over others. If you make one chart in a dashboard larger than
the others, then you are calling attention to it. It is important to make sure that each chart
accurately reflects and visualizes the relationship among the values in it. In this
dashboard (also shown below), the slice sizes and colors of the pie chart compared to the
data in the table help make the number of donuts eaten by each person the focal point.
These first six principles of design are key considerations that you can make while you are
creating your data visualization. These next three principles are useful checks once your
data visualization is finished. If you have applied the initial six principles thoughtfully,
then you will probably recognize these next three principles within your visualizations
already.
8. Variety: Your visualizations should have some variety in the chart types, lines, shapes,
colors, and values you use. Variety keeps the audience engaged. But it is good to find
balance since too much variety can confuse people. The variety you include should make
your dashboards and other visualizations feel interesting and unified.
9. Unity: The last principle is unity. This means that your final data visualization should be
cohesive. If the visual is disjointed or not well organized, it will be confusing and
overwhelming.
Being a data analyst means learning to think in a lot of different ways. These nine
principles of design can help guide you as you create effective and interesting
visualizations.
Monthly spending is shown as a donut chart that reflects different categories like
utilities, housing, transportation, education, and groceries.
When customers set a budget for a category, the donut chart shows filled and
unfilled portions in the same view.
Customers can also set an overall spending limit, and the dashboard will
automatically assign the budgeted amounts (unfilled areas of the donut chart) to
each category based on past spending trends.
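The reading does not say how the dashboard turns past spending trends into budgeted amounts. One simple possibility, shown here purely as an assumption, is to split the overall limit in proportion to each category's share of past spending. The function name `allocate_budget` and the dollar figures are hypothetical.

```python
def allocate_budget(overall_limit, past_spending):
    """Split an overall limit across categories in proportion to past spending."""
    total = sum(past_spending.values())
    if total == 0:
        # No spending history: fall back to an even split (an assumption).
        share = overall_limit / len(past_spending)
        return {category: round(share, 2) for category in past_spending}
    return {
        category: round(overall_limit * amount / total, 2)
        for category, amount in past_spending.items()
    }


# Hypothetical average monthly spending per category.
past = {"utilities": 200, "housing": 1200, "transportation": 300,
        "education": 100, "groceries": 200}

# A limit of 1800 against 2000 of past spending gives each category 90% of its history.
print(allocate_budget(1800, past))
```

Each result would become the budgeted amount for that category's segment of the donut chart. A real dashboard might weight recent months more heavily or protect fixed costs like housing.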
Empathize
First, empathize by putting yourself in the shoes of a customer who has a checking
account with the bank.
Define
Now, imagine that you are helping dashboard designers define other things that
customers might want to achieve besides saving money.
Prototype
Finally, developers can prototype the next version of the dashboard with new and
improved data visualizations.
Test
Developers can close the cycle by having you (and others) test the prototype before it is
sent to stakeholders for review and approval.
Key takeaways
This design thinking example showed how important it is to:
If you already know what headlines, subtitles, labels and annotations do, go to the
guidelines and style checks at the end of this reading. If you don’t, these next sections are
for you.
Headlines that pop
A headline is a line of words printed in large letters at the top of a visualization to
communicate what data is being presented. It is the attention grabber that makes your
audience want to read more. Here are some examples:
Turns out, this illustration is showing average rents in the tri-city area. So, let’s add a
headline to make that clear to the audience. Adding the headline, “Average Rents in the
Tri-City Area” above the line chart instantly informs the audience what it is comparing.
Subtitles that clarify
A subtitle supports the headline by adding more context and description. Adding a
subtitle will help the audience better understand the details associated with your chart.
Typically, the text for subtitles has a smaller font size than the headline.
In the average rents chart, it is unclear from the headline “Average Rents in the Tri-City
Area” which cities are being described. There are tri-cities near San Diego, California
(Oceanside, Vista, and Carlsbad), tri-cities in the San Francisco Bay Area (Fremont,
Newark, and Union City), tri-cities in North Carolina (Raleigh, Durham, and Chapel Hill),
and tri-cities in the United Arab Emirates (Dubai, Ajman, and Sharjah).
We are actually reporting the data for the tri-city area near San Diego. So adding
“Oceanside, Vista, and Carlsbad” becomes the subtitle in this case. This subtitle enables
the audience to quickly identify which cities the data reflects.
Labels that identify
A label in a visualization identifies data in relation to other data. Most commonly, labels in
a chart identify what the x-axis and y-axis show. Always make sure you label your axes. We
can add “Months (January - June 2020)” for the x-axis and “Average Monthly Rents ($)”
for the y-axis in the average rents chart.
Data can also be labeled directly in a chart instead of through a chart legend. This makes it
easier for the audience to understand data points without having to look up symbols or
interpret the color coding in a legend.
We can add direct labels in the average rents chart. The audience can then identify the
data for Oceanside in yellow, the data for Carlsbad in green, and the data for Vista in blue.
Suppose in the average rents chart that we want the audience to pay attention to the
rents at their highs. Annotating the data points representing the highest average rents will
help people focus on those values for each city.
Guidelines and pro tips
Refer to the following table for recommended guidelines and style checks for headlines,
subtitles, labels, and annotations in your data visualizations. Think of these guidelines as
guardrails. Sometimes data visualizations can become too crowded or busy. When this
happens, the audience can get confused or distracted by elements that aren’t really
necessary. The guidelines will help keep your data visualizations simple, and the style
checks will help make your data visualizations more elegant.
Visualization components: Guidelines and Style checks
Headlines
Guidelines:
- Content: Briefly describe the data
- Length: Usually the width of the data frame
- Position: Above the data
Style checks:
- Use brief language
- Don’t use all caps
- Don’t use italic
- Don’t use acronyms
- Don’t use abbreviations
- Don’t use humor or sarcasm
Subtitles
Guidelines:
- Content: Clarify context for the data
- Length: Same as or shorter than headline
- Position: Directly below the headline
Style checks:
- Use smaller font size than headline
- Don’t use undefined words
- Don’t use all caps, bold, or italic
- Don’t use acronyms
- Don’t use abbreviations
Labels
Guidelines:
- Content: Replace the need for legends
- Length: Usually fewer than 30 characters
- Position: Next to data or below or beside axes
Style checks:
- Use a few words only
- Use thoughtful color-coding
- Use callouts to point to the data
- Don’t use all caps, bold, or italic
Annotations
Guidelines:
- Content: Draw attention to certain data
- Length: Varies, limited by open space
- Position: Immediately next to data annotated
Style checks:
- Don’t use all caps, bold, or italic
- Don’t use rotated text
- Don’t distract viewers from the data
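A couple of the style checks above are mechanical enough to automate. The sketch below checks a chart label against two of them: the suggested length (usually fewer than 30 characters) and the rule against all caps. The function name `check_label` and its `max_length` parameter are hypothetical, and the other checks (bold, italic, color-coding) depend on formatting that plain text cannot capture.

```python
def check_label(text, max_length=30):
    """Return a list of style-check problems found in a chart label."""
    problems = []
    # Guideline: labels are usually fewer than 30 characters.
    if len(text) >= max_length:
        problems.append(f"{len(text)} characters; aim for fewer than {max_length}")
    # Style check: don't use all caps (only letters are considered).
    letters = [ch for ch in text if ch.isalpha()]
    if letters and all(ch.isupper() for ch in letters):
        problems.append("uses all caps")
    return problems


print(check_label("Average Monthly Rents ($)"))
print(check_label("AVERAGE MONTHLY RENTS"))
```

An empty list means the label passed both checks. Since the length guideline says "usually", treat a length warning as a prompt to reconsider rather than an error.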
Choosing to represent your data via a chart is usually the most simple and efficient
method. Let’s go through the entire process of creating any type of chart in 60 minutes.
The goal here is to develop a prototype or mock up of your chart that you can quickly
present to an audience. This will also enable you to have a sense of whether or not the
chart is communicating the information that you want.
5 minutes: prep | 15 minutes: talk & listen | 20 minutes: sketch & design | 20 minutes: prototype & improve
Follow this high-level 60-minute chart to guide your thinking whenever you begin working
on a data visualization.
Prep (5 min): Create the mental and physical space necessary for an environment of
comprehensive thinking. This means allowing yourself room to brainstorm how you want
your data to appear while considering the amount and type of data that you have.
Talk and listen (15 min): Identify the object of your work by getting to the “ask behind
the ask” and establishing expectations. Ask questions and really concentrate on feedback
from stakeholders regarding your projects to help you hone how to lay out your data.
Sketch and design (20 min): Draft your approach to the problem. Define the timing and
output of your work to get a clear and concise idea of what you are crafting.
Prototype and improve (20 min): Generate a visual solution and gauge its effectiveness at
accurately communicating your data. Take your time and repeat the process until a final
visual is produced. It is alright if you go through several visuals until you find the perfect
fit.
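As a quick arithmetic check, the four steps above add up to exactly 60 minutes. The short sketch below lays them out as a running schedule. The step names and durations come from this reading, while the variable names are just for illustration.

```python
# Steps of the 60-minute chart, in order, with their durations in minutes.
steps = [
    ("Prep", 5),
    ("Talk and listen", 15),
    ("Sketch and design", 20),
    ("Prototype and improve", 20),
]

elapsed = 0
for name, minutes in steps:
    # Show the start and end minute of each step.
    print(f"{elapsed:>2}-{elapsed + minutes:>2} min: {name}")
    elapsed += minutes

print(f"Total: {elapsed} minutes")
```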
Key takeaway
This is a great overview you can use when you need to create a visualization in a short
amount of time. As you become more experienced in data visualization, you will find
yourself creating your own process. You will get a more detailed description of different
visualization options in the next reading, including line charts, bar charts, scatterplots,
and more. No matter what you choose, always remember to take the time to prep, identify
your objective, take in feedback, design, and create.
What insights did you gain about the two products you visualized? What trends did
you notice?
How did design thinking (Empathize, Define, Ideate, Prototype, Test) influence the
process of making the data visualization?
How can design thinking help make data visualizations more accessible and easier
to understand?
A good response would include how design thinking should be at the heart of your
visualization process because it allows analysts to create user-centric visualizations.
Design thinking helps you stay focused on your audience, message, and goal. This helps
you create a data visualization that tells a meaningful story about your data that is useful
to your audience. Design thinking also helps you plan for accessibility issues. By improving
accessibility, you make data visualizations that communicate more effectively.
To create a chart in Google Sheets, select the data cells, click Insert from the main
menu, and then select Chart. You can set up and customize the chart in the dialog
box on the right.
To create a chart in Microsoft Excel, select the data cells, click Insert from the main
menu, and then select the chart type. Tip: You can optionally click Recommended
Charts to view Excel’s recommendations for the data you selected and then select
the chart you like from those shown.
These are the primary chart types available:
Column (vertical bar): a column chart allows you to display and compare multiple
categories of data by their values.
Line: a line chart showcases trends in your data over a period of time. The last line
chart example is a combo chart which can include a line chart. Refer to the
description for the combo chart type.
Pie: a pie chart is an easy way to visualize what proportion of the whole each data
point represents.
Horizontal bar: a bar chart functions similarly to a column chart, but is flipped
horizontally.
Area: area charts allow you to track changes in value across multiple categories of
data.
Combo: combo charts use multiple visual markers like columns and lines to
showcase different aspects of the data in one visualization. The example below is a
combo chart that has a column and line chart together.
Types of charts and graphs in Google Sheets: a Google Help Center page with a list
of chart examples you can download.
Excel Charts: a tutorial outlining all of the different chart types in Excel, including
some subcategories.
Which chart or graph is right for you? This presentation covers 13 of the most
popular charts in Tableau.
The Ultimate Cheat Sheet on Tableau Charts. This blog describes 24 chart
variations in Tableau and guidelines for use.
The following are visualizations that are more specialized in Tableau with links to
examples or the steps to create them:
Highlight tables appear like tables with conditional formatting. Review the steps
to build a highlight table.
Heat maps show intensity or concentrations in the data. Review the steps to build
a heat map.
Density maps show concentrations (like a population density map). Refer to
instructions to create a heat map for density.
Gantt charts show the duration of events or activities on a timeline. Review the
steps to build a Gantt chart.
Symbol maps display a mark over a given longitude and latitude. Learn more from
this example of a symbol map.
Filled maps are maps with areas colored based on a measurement or dimension.
Explore an example of a filled map.
Circle views show comparative strength in data. Learn more from this example of
a circle view.
Box plots, also known as box-and-whisker charts, show the distribution of values
along a chart axis. Refer to the steps to build a box plot.
Bullet graphs compare a primary measure with another and can be used instead
of dial gauge charts. Review the steps to build a bullet graph.
Packed bubble charts display data in clustered circles. Review the steps to build a
packed bubble chart.
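To make the idea of "distribution of values" in a box plot more concrete, the sketch below computes the five numbers a basic box plot is usually drawn from: minimum, first quartile, median, third quartile, and maximum. It uses Python's standard library rather than Tableau. The function name and the sample order quantities are made up for illustration, and Tableau may calculate quartiles and whiskers differently.

```python
import statistics

def five_number_summary(values):
    """Return the five values a basic box plot is typically drawn from."""
    ordered = sorted(values)
    # method="inclusive" treats the data as the whole population; other tools
    # may use a slightly different quartile definition.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }

# Hypothetical order quantities, used only for illustration.
order_quantities = [7, 1, 9, 3, 5, 2, 8, 4, 6]
summary = five_number_summary(order_quantities)
print(summary)
```

The box spans q1 to q3, the line inside the box marks the median, and in the simplest version the whiskers extend to the minimum and maximum.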
Key takeaway
This reading described the chart types you can create in spreadsheets and introduced
visualizations that are more unique to Tableau.
Moreover, some versions of the program are available at no charge. Because of these
advantages, many data analysts use it extensively. With the information in this activity,
you can prepare for upcoming activities where you will learn more about what you can do
in Tableau.
This reading will help you download Tableau Desktop, if you want to try it out.
Important note: All hands-on activities are based on the use of Tableau Public. You don't
need to download Tableau Desktop to complete the activities.
Features
Tableau Public is a great free resource, and it allows you to explore a lot of data
visualizations online. But the desktop version has features that can help make Tableau an
even more powerful tool. Using Tableau Desktop you can:
Once you have downloaded Tableau onto your desktop, you can go to the Connect menu
to upload data from local files, like Excel spreadsheets or PDFs, connect to data stored in a
server, or connect to data sources you have used before. Tableau also has a step-by-step
guide to help you get started using your own data in Tableau Desktop.
Tableau is a powerful tool, and once you have downloaded the app onto your desktop,
you can start using it to create data visualizations of your own.
Misleading visualizations
You can create data visualizations in Tableau using a wide variety of charts, colors, and styles. And you
have tremendous freedom in the tool to decide how these visualizations will look and how they will
present your data.
Red normally indicates danger or a warning. Why do you think cells are highlighted in red?
Green normally indicates a positive or “go” status. Is it clear why certain cells are highlighted in
green?
The purpose of the color coding isn’t clear without a legend, but can you guess what might
have been the intent?
This table potentially tells the audience that the numbers in red were bad for the
business while the numbers in green were good. Without proper context, the
audience would not be able to ascertain the accuracy of the visuals.
The table probably wanted to show the extent to which the order quantities had
achieved certain sales quotas.
The key to effective presentations is data visualizations that are clear and convincing. In
turn, the key to effective visualizations is selecting the best way to depict your data.
You have learned about a few types of visualizations (e.g., bar graphs, pie charts) and what
each type is best at emphasizing. Determining which type of visualization to use is
essential to giving your presentation the impact it needs.
So far, you have considered a few rules about what makes a helpful data visualization:
A good reflection would include how the first step to identifying appropriate visualizations
is understanding what kind of data you are presenting, and that you should apply the four
rules above to ensure the visualization has the biggest impact.
After you understand the type of data (frequency, changes over time, categorical
comparisons, etc.), then you must determine what your audience needs to see to
understand your analysis. After that, find which graph or chart style fits your goal. Finally,
utilize the visual design guidelines above to create an accessible and aesthetically
pleasing data visualization.
What did linking data from multiple sources allow you to do with your visualization
in Tableau?
What other kinds of datasets could you link to the four you used in this activity?
What kinds of comparisons or insights could you make?
If you couldn’t link data in this way, how would you make complex comparative
datasets and visualizations like this?
A good response would include that linking data allows you to combine different
features of multiple datasets without having to create a new dataset as you
visualize comparisons and combinations of data.
With Tableau and other visualization software, you can simplify the process of
combining and visualizing data. Otherwise, you would need to select the
information you need and create a new data source, which takes a lot of time. This
simplified process will allow you to share more insights with your peers and
stakeholders throughout your career as a data analyst.
Resource: Description
Set up data sources: This page links to other resources explaining how to set up your data
sources and prepare them for analysis once you have connected them to your Tableau
account. It specifically includes articles explaining how to join or blend data, and what a
union is and how they work. This is a great starting point as you get ready to begin using
and combining data sources.
Join your data: Joining refers to the process of combining data sources based on common
fields. This article gives a more detailed explanation of the different joins, how to use them
in Tableau, and an example join with a step-by-step guide.
Don’t be scared of relationships: Relationships allow you to combine multiple data sources
in Tableau. This is a more flexible alternative to joins, and doesn’t force you to create one
single table with your multiple data sources. This article will give you more insight into how
relationships work.
How relationships differ from joins: This article goes into more detail about the differences
between using relationships and joins, and guides you through the process of using
relationships to combine data.
Blend your data: Data blending is another method you can use to combine multiple data
sources. Instead of truly combining the data, blends allow you to query and aggregate data
from multiple sources. This resource goes into more detail about blending and includes a
tutorial.
Combining multiple date fields: This resource provides examples that explain how to
combine date fields when using four different methods of data combination in Tableau.
These are just a few resources you can use. You can also find more information online or in
the Tableau community forums.
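As a rough illustration of what "combining data sources based on common fields" means, here is a minimal sketch of an inner join and a left join written in plain Python. This is not how Tableau implements joins. The join function, the product_id field, and the sample rows are all hypothetical and exist only to show the idea.

```python
def join(left, right, key, how="inner"):
    """Combine two lists of row dictionaries on a shared field."""
    # Index the right-hand rows by the shared key for quick lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Fields that only exist on the right side, used to pad unmatched rows.
    right_fields = {field for row in right for field in row if field != key}

    joined = []
    for left_row in left:
        matches = index.get(left_row[key], [])
        if matches:
            joined.extend({**left_row, **right_row} for right_row in matches)
        elif how == "left":
            # A left join keeps unmatched left rows and fills the gaps with None.
            joined.append({**left_row, **dict.fromkeys(right_fields)})
    return joined

# Two hypothetical data sources that share the "product_id" field.
products = [
    {"product_id": 1, "name": "Notebook"},
    {"product_id": 2, "name": "Pen"},
    {"product_id": 3, "name": "Stapler"},
]
sales = [
    {"product_id": 1, "units": 40},
    {"product_id": 2, "units": 15},
    {"product_id": 2, "units": 5},
]

inner_rows = join(products, sales, "product_id")
left_rows = join(products, sales, "product_id", how="left")
print(len(inner_rows), len(left_rows))
```

The inner join drops the product with no sales, while the left join keeps it with an empty units value. This is the kind of difference the join types in the linked article describe.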
Effective data stories
In data analytics, data storytelling is communicating the meaning of a dataset with visuals
and a narrative that is customized for a particular audience. In data journalism, journalists
engage their audience of readers by combining visualizations, narrative, and context into
data-driven articles. It turns out that data analysts and data journalists have a lot in
common! As a junior data analyst, you might learn a few things about effective storytelling
from data journalism. Read further to explore the role and work of a data journalist in
telling a good story.
Note: This reading refers to an article published in The New Yorker. Non-subscribers may
access several free articles each month. If you already reached your monthly limit on free
articles, bookmark the article and come back to this reading later.
Ben Wellington, a contributing writer for The New Yorker and a professor at the Pratt
Institute, used New York City’s open data portal to track down noise complaints from
logged service requests. He analyzed the data to gain a more quantitative understanding
of where the noise was coming from and which neighborhoods were the noisiest. Then, he
presented his findings in the Mapping New York's Noisiest Neighborhoods article.
First, click the link above to skim the article and familiarize yourself with the data
visualizations. Then, join the bus tour of the data! You will be directed to three
visualizations (tour stops) to observe how each visualization helped strengthen the overall
storytelling in the article.
How does the visualization help set the context? The combo table and bar chart is
effective in summarizing the noise categories as percentages of the logged
complaints. This helps set the context by answering the question, “what is noise?”
Notice that the data journalist created a combo table and bar chart instead of a pie
chart. With 11 noise categories, a list with a bar chart showing relative proportions
is an elegant representation. A pie chart with 11 slices would have been harder to
read.
How does the visualization help clarify the data? If you add the percentages in
the categories in the combo table and bar chart, the total is ninety-eight percent.
There is a difference of two percent that can’t be accounted for in the visualization.
So, rather than clarifying the data, the visualization actually causes a little
confusion. One lesson is to always make sure that your percentages add up
correctly. Sometimes rounding decimal places up or down causes percentages to
be off so they don’t add up to 100%.
Do you notice a data visualization best practice? You learned that a companion
table in Tableau shows data in a different way in case some in your audience prefer
tables. It appears that the data journalist had the same idea by using a combo
table and bar chart. Note: As a refresher, a companion table in Tableau is displayed
right next to a visualization. A companion table displays the same data as the
visualization, but in a table format. You may replay the Getting Creative video
which includes an example of a companion table.
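The rounding issue mentioned at this tour stop can be reproduced with a small sketch. The code below uses made-up counts (not the article's data) to show how rounding each percentage independently can produce a total other than 100. It also shows one common remedy, the largest remainder method, which is offered as an option and not as what the journalist did or should have done. Rounding is only one possible cause of a gap like the one in the article.

```python
from math import floor

def rounded_percentages(counts):
    """Round each category's share to a whole percent independently."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}

def largest_remainder_percentages(counts):
    """Whole-number percentages that sum to 100 (largest remainder method)."""
    total = sum(counts.values())
    exact = {name: 100 * n / total for name, n in counts.items()}
    result = {name: floor(share) for name, share in exact.items()}
    leftover = 100 - sum(result.values())
    # Give the leftover points to the categories with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - result[name], reverse=True)
    for name in by_remainder[:leftover]:
        result[name] += 1
    return result

# Hypothetical complaint counts, used only for illustration.
counts = {"music": 1, "dogs": 1, "traffic": 1}
naive = rounded_percentages(counts)
adjusted = largest_remainder_percentages(counts)
print(sum(naive.values()))     # 99
print(sum(adjusted.values()))  # 100
```

Whichever approach you choose, checking the total before publishing a chart is a quick way to catch this kind of discrepancy.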
In the article, review the stacked area chart for the distribution of noise complaints by
hour of the day. Evaluate the visualization:
How does the visualization perform against the five-second rule? Recall that the
five-second rule states that you should understand what is being conveyed within
the first five seconds of seeing a chart. We are guessing that this visualization
performs quite well! The area charts for loud music and barking dogs help the
audience understand that more of these types of noise complaints were made
during late night and early morning hours (between 10:00 PM and 2:00 AM). Notice
also that the color coding in the legend aligns with the colors in the chart. A chart
legend normally has the largest category at the top, but the data journalist chose
to order the legend so the largest category, “Loud music or party” appears at the
bottom instead. How much time do you think this alignment saved readers?
How does the visualization help clarify the data? Unlike the visualization from
the previous tour stop, this visualization does a better job of clearly showing that
all percentages add up to 100%.
Do you notice a data visualization best practice? As a best practice, both the
x-axis and y-axis should be labeled. But the data journalist chose to include % or
A.M. and P.M. with each tick on an axis. As a result, labeling the x-axis “Time of
Day” and the y-axis “Percentage of Noise Complaints” isn’t required. This
demonstrates that a little creativity with labeling can help you achieve a cleaner
chart.
In the article, review the neighborhood map for how close a noisy neighborhood is to a
quiet neighborhood. Evaluate the visualization:
How does the visualization help make a point? The data journalist observed that
one of the noisiest neighborhoods was right next to one of the quietest
neighborhoods. The neighborhood map is effective in emphasizing this
observation as a dark blue area versus a white area.
How does the visualization help clarify the data? The visualization classifies the
data by neighborhood and allows the audience to follow along when the journalist
focuses specifically on the Williamsburg, East Williamsburg, and North Side/South
Side neighborhoods.
Do you notice a data visualization best practice? Each neighborhood is directly
labeled so a legend isn’t necessary.
Insight immediately begins to lose value and continues to do so the longer the data
remains in a static state
Snapshots can't keep up with the pace of data change
Live data means that you can build dashboards, reports, and views connected to
automatically updated data.
PROS
Can take engineering resources to keep pipelines live and scalable, which may be
outside the scope of some companies' data resource allocation
Without the ability to interpret data, you can lose control of the narrative, which
can cause data chaos (i.e. teams coming to conflicting conclusions based on the
same data)
Can potentially cause a lack of trust if the data isn’t handled properly
Key takeaways
Analysts need to familiarize themselves with the business and data so they can
recommend when an updated static analysis is needed or should be refreshed. Also, this
data insight will help you make the case for what sorts of analyses, visualizations, and
additional data are recommended for the types of decisions that the business needs to
make.
Keep this customer survey spreadsheet on hand as it will be useful for the next video.
How did you arrange the sheets onto the dashboard to effectively present the data?
What are some other ways in which you might use dashboards?
Is there a dashboard that you would like to create? If so, what kinds of data might it feature?
A good response would include how you can arrange the layout of a dashboard with
visualizations and corresponding legends to help highlight key takeaways from the data.
Up next
Get started with the messy vs. good presentation comparison by viewing the first video:
Connor: Messy example of a data presentation.
Guide: Sharing data findings in presentations
Use this guide to help make your presentation stand out as you tell your data story. Follow
the recommended tips and slide sequence in this guide for a presentation that will truly
impress your audience.
You can also download this guide as a PDF, so you can reference it in the future:
PDF File
In order to develop the right flow for your presentation, keep your audience in mind. Ask
yourself these two questions to help you define the overall flow and build out your
presentation.
Who is my audience?
If your intended audience is executives, board members, directors, or other C-level
(C-Suite) executives, your storytelling should be kept at a high level. This audience
will want to hear about your story but might not have time to hear the entire story.
Executives tend to focus on endings that encourage improving, correcting, or
inventing things. Keep your presentation brief and spend most of your time on
your results and recommendations. Refer to an upcoming topic in this reading—
Tip 3: end with your recommendations.
If your intended audience is stakeholders and managers, they might have more
time to learn about how you performed your analysis and they might ask more
data-specific questions. Be prepared with talking points about the aspects of your
analysis that led you to your final results and conclusions.
If your intended audience is other analysts and individual contributors, you will
have the most freedom—and perhaps the most time—to go more deeply into the
data, processes, and results.
What is the purpose of my presentation?
Knowing exactly what you will say when explaining each slide throughout your
presentation also creates a natural flow to your story. Talking points help you avoid
awkward pauses between topics. Slides that summarize data can also be repetitive (and
boring). If you prepare a variety of interesting talking points about the data, you can keep
your audience alert and paying attention to the data and its analysis.
Use one slide for your recommendations at the end. Be clear and concise.
If you are recommending that something be done, provide next steps and describe
what you would consider a successful outcome.
Assume that everyone in your audience is busy. Keep your presentation on topic and as
short as possible by:
Being aware of your timing. This applies to the total number of slides and the time
you spend on each slide.
Presenting your data efficiently. Make sure that every slide tells a unique and
important part of your data story. If a slide isn’t that unique, you might think about
combining the information on that slide with another slide.
Saving enough time for questions at the end or allowing enough time to answer
questions throughout your presentation.
Introductions (4 minutes)
Project overview and goals (5 minutes)
Data and analysis (10 minutes)
Recommendations (3 minutes)
Actionable steps (3 minutes)
Questions (5 minutes)
Service center consolidation is an important cost savings initiative. The aim of this project
was to determine the impact of service center consolidation on customer response times.
But, if you choose to tell your story using more than one slide, keep the following in mind:
Slides typically have a logical order (beginning, middle, and end) to fully build the
story.
Each slide should logically introduce the slide that follows it. Visual cues from the
slides or verbal cues from your talking points should let the audience know when
you will go on to the next slide.
Remember not to use too much text on the slides. When in doubt, refer back to the
second tip on preparing talking points and limiting the text on slides.
The high-level information that people read from the slides shouldn’t be the same
as the information you provide in your talking points. There should be a nice
balance between the two to tell a good story. You don’t want to simply read or say
the words on the slides.
For extra visuals on the slides, use animations. For example, you can:
Suppose the data analysis showed that service center consolidation negatively impacted
customer response times. A call to action might be to examine if processes need to change
to bring customer response times back to what they were before the consolidation.
Wrapping it up: Getting feedback
After you present to your audience, think about how you told your data story and how you
can get feedback for improvement. Consider asking your manager or another data analyst
for candid thoughts about your storytelling and presentation overall. Feedback is great to
help you improve. When you have to write a brand new data story (or a sequel to the one
you already told), you will be ready to impress your audience even more!
Overview
Earlier in this course, you practiced creating, giving, and evaluating your own presentation
for the Hands-on Activity: Presenting practice. Now, you’ll complete an entry in your
learning log revisiting that presentation and reflecting on how much your presentation
skills have developed so far. By the time you complete this activity, you will have more
experience presenting, evaluating, and receiving and applying presentation feedback.
This will help you prepare for future presentations as a data analyst.
For the hands-on activity, you recorded yourself presenting a data visualization
dashboard that you created. You then evaluated your work. Now that you have more
knowledge and practice under your belt, it’s time to try again!
Re-record your presentation with the information you’ve learned during this
course. Keep it as concise as possible so you can compare it to your previous
version.
Share the presentation with someone you know who might not be familiar with
data analysis. Keep them in mind while you record your presentation, as it should
be as simple and accessible as possible.
Ask them for their feedback. Did they find it engaging? Did they truly understand
the concept that you explained?
If it would be helpful to receive feedback in a formal way, print out the checklist
you used last time (provided below) and give it to your audience.
Presentation Evaluation Checklist:
Do I include a good title and subtitle that describe what I’m about to present?
Do I include the date of my presentation or the date when my slideshow was last
updated?
Does my font size let the audience easily read my slides?
Do I showcase what business metrics I used?
Do I include effective visuals (like charts and graphs)?
Once you have finished revising, recording, and sharing your presentation again, you’ll
have a chance to reflect on your experience in the learning log template linked below.
OR
If you don’t have a Google account, you can download the template directly from the
attachment below.
DOCX File
Reflection
Now that you’ve finished rerecording your presentation and receiving feedback, take a
moment to reflect on the process you just completed. In your learning log template, write
2-3 sentences (40-60 words) in response to each question below:
There are many things to consider before you begin asking and answering possible questions – like the
objective, stakeholder expectations, and if there are any limitations. Make sure you have everything
covered before you begin. The checklist below identifies ten tasks that you should engage in to be well
prepared for your Q&A:
Sometimes, you may receive questions or objections about your presentation. This is
normal, as your audience wants to understand your presentation as completely as
possible. Responding to these questions and objections in a clear, concise, and polite
manner is crucial to delivering an effective presentation.
Examples of objections
Consider the following situations where a data analyst delivers a presentation and
receives an objection:
Reflection
If you were the data analyst involved, how would you respond to your
stakeholders?
What would be the impact of not addressing these objections?
Using the first scenario as an example: If you receive an objection about the
completeness of your analysis, you should politely acknowledge the objection.
Then, reiterate each step you took in your analysis and explain why you did each
one. Finally, promise to investigate your analysis question further so that the
analysis is complete or your presentation is more clear.
If you don’t address the objection, your stakeholders may not appreciate or
respect the work you’ve done in your analysis. By communicating respectfully with
your stakeholders, you establish a positive relationship with them. You also can
use their feedback to improve your analytical approach for future presentations.
What are some other ways you can receive feedback from your audience and
stakeholders?
How can you be as inclusive as possible? How can you make your audience feel
comfortable asking you for clarification?
What creative methods can you use to engage your audience?
Feedback is a healthy process for data analysts, as it is for all other
professionals. What matters is the quality of the feedback. Feedback comes in two
styles, pulling and pushing, and pulling is by far the more effective of the two.
The objective of both may be the same, but the approach is very different.
Pushing feedback is focused on the past, critical, and general. Pulling feedback
is future oriented, supportive, and specific.
Pushing feedback may sound like “That’s not good enough; we need to do better.”
Pulling feedback may sound like “Great job taking initiative and helping to clear
up any ambiguity.” So, feedback can be the difference between motivating us and
disappointing us.
There are three practical ways to get useful pulling feedback:
1. Prepare: be fully receptive and attentive when receiving feedback. Most of the
time, there are valid points behind it, so stay task focused and future oriented.
2. Stay positive: take a deep breath, keep your motivational energy up, and
show appreciation and full understanding before replying to feedback.
3. Dialogue: we may need to engage our stakeholders in further dialogue outside
the presentation forum to reach the best solution for the team, asking them what
they think and how to move forward.