Unit 2 and 3 PDF
STORYTELLING IN A
DIGITAL ERA
note
Data visualization is the practice of graphically
representing data to help people see and understand
patterns, insights, and other discoveries hidden inside
information. Data storytelling translates seeing into
meaning by weaving a narrative around the data to
answer questions and support decision making.
Data visualization and data storytelling are not the same thing;
however, they are two sides of the same coin. A true data story
utilizes data visualizations as a literary endeavor would use
illustrations—proof points to support the narrative. However,
there’s a little bit of a role reversal here: whereas data
visualizations provide the “what” in the story, the narrative
itself answers the “why.” As such, the two work together in
tandem to translate raw data into something meaningful for its
audience. So, to be a proper data storyteller you need to know
how to do both: curate effective data visualizations and frame
a storyboard around them. This starts with learning how to
visualize data, and more importantly, how to do so in the best
way for communication rather than purely analytical purposes.
As discussed later in this book, visualizations for analysis
versus presentation are not always the same thing in data
storytelling.
One of the most common clichés in the viz space is that “data
visualizations are only as effective as the insights they reveal.”
In this context, effectiveness is a function of careful planning.
Any meaningful visualization is a two-pronged one. It requires
analytical perfection and correct rendering of statistical
information, as well as a well-orchestrated balance of visual
design cues (color, shape, size, and so on) to encode that data
with meaning. The two are not mutually exclusive.
Figure 1.2 A sampling of great data stories in recent headlines
by statisticians and data journalists.
This might sound like an easy task, but it’s not. Learning to
properly construct correct and effective data visualizations isn’t
something you can accomplish overnight. It takes as much
time to master this craft as it does any other, as well as a
certain dedication to patience, practice, and keeping abreast of
changes in software. In addition, like so many other things in
data science, data visualization and storytelling tend to evolve
over time, so an inherent need exists for continuous learning
and adaptation, too. The lessons in this book will guide you as
you begin your first adventures in data storytelling using data
visualizations in Tableau.
FROM VISUALIZATION TO
VISUAL DATA STORYTELLING:
AN EVOLUTION
With all the current focus on data visualization as the best
(and sometimes only) way to see and understand today’s
biggest and most diverse data, it’s easy to think of the
practice as a relatively new way of representing data and
other statistical information. In reality, the practice of
graphing information—and communicating visually—
reaches back all the way to some of our earliest prehistoric
cave drawings where we charted minutiae of early human
life, through initial mapmaking, and into more modern
advances in graphic design and statistical graphics. Along
the way, the practice of data visualization has been aided by
both advancements in visual design and cognitive science as
well as technology and business intelligence, and these have
given rise to the advancements that have led to our current
state of data visualization.
note
This dataset is regularly updated and maintained by Ryan
Swanstrom, and is available via GitHub at
https://github.com/ryanswanstrom/awesome-datascience-colleges.
SUMMARY
This chapter focused on providing an introductory
discussion on data visualization and visual data storytelling
by taking a look at how these concepts are similar and
different, and how both have been transformed in the digital
era. The next chapter takes a closer look at the power of
visual data stories to help us understand what makes them
so powerful and important in today’s data deluge.
_____________
1. https://actu.epfl.ch/news/the-world-s-largest-data-visualization/
2. https://www.gapminder.org/tag/trendalyzer/
3. Wixom, Barbara; Ariyachandra, Thilini; Douglas, David; Goul, Michael; Gupta, Babita; Iyer,
Lakshmi; Kulkarni, Uday; Mooney, John G.; Phillips-Wren, Gloria; and Turetken, Ozgur (2014).
“The Current State of Business Intelligence in Academia: The Arrival of Big Data,”
Communications of the Association for Information Systems: Vol. 34 , Article 1.
4. http://www.gartner.com/newsroom/id/2593815
CHAPTER 2
Media and journalists are not the only ones putting emphasis
on data storytelling, although they arguably have been a
particularly imaginative bunch of communicators. Today
we’ve seen the power of storytelling used to color in
conversations on just about every type of data imaginable—
from challenging astronomical principles, to visualizing the
tenure pipeline at Harvard Business School, to quantifying the
fairytale of Little Red Riding Hood. In every organization and
every industry, data stories are becoming the next script for
how we share information.
As diverse as data stories can be, they all have one thing in
common: They give us something to connect to in a very
literal sense. Let’s delve into the power of stories, first by
looking behind the curtain at the science of storytelling and
then looking at some incredible data stories over time to see
how they have capitalized on the secret sauce of storytelling.
Fitness
As much as we might try to argue it, human beings did not
evolve to find truth. We evolved to defend positions and
obtain resources—oftentimes regardless of the cost—to
survive. These concepts are at the heart of the Darwinian theory
of natural selection: survival of the fittest is the mechanism,
and fitness is our ability to overcome (or, biologically, to
reproduce).
Closure
Aside from being bent on survival, humans also tend to
require closure. The few philosophical exceptions
notwithstanding, in general we don’t enjoy ongoing
questions and curiosities with no resolution—we need
endings, even unhappy ones. We simply can’t abide
cliffhangers; they’re sticky in the worst of ways, bouncing
around in our brains until we can finally “finish” them and
put them to rest. There’s actually a term for this
phenomenon: the Zeigarnik effect. It was named for
Soviet psychologist Bluma Zeigarnik, who demonstrated
that people have a better memory for unfinished tasks than
they do for finished ones. Today, the Zeigarnik effect is
known formally as a “psychological device that creates
dissonance and uneasiness in the target audience.”
But first, let’s set the record straight. There is much to be said
about how visual data stories create meaning in a time of
digital data deluge, but it would be careless to relegate data
storytelling to the role of “a fun new way to talk about data.”
In fact, it has radically changed the way we talk about data
(though certainly not invented the concept). The traditional
charts and graphs we’ve always used to represent data are still
helpful because they help us to better visually organize and
understand information. They’ve just become a little static.
With today’s technology, fueled by today’s innovation, we’ve
moved beyond the mentality of gathering, analyzing, and
reporting data to collecting, exploring, and sharing information
—rather than simply rendering data visually we are focused on
using these mechanisms to engage, communicate, inspire, and
make data memorable. No longer relegated to the tasks of
beautifying reports or dashboards, data visualizations are
lifting out of paper, coming out of the screen, and moving into
our hearts, minds, and emotions. The ability to stir emotion is
the secret ingredient of visual data storytelling, and what sets it
apart from the aforementioned static visual data renderings.
Story Takeaway
note
See more of Chelsea’s Netflix data story at
https://www.umbel.com/blog/data-visualization/netflix-chill-little-data-experiment-understanding-my-own-taste-tv/.
Story Takeaway
COLOR CUES
Let’s begin to build a story around this and see where we end
up. First let’s agree on a foundation: The year follows a
seasonal cycle that starts cold and gets progressively warmer
until it peaks and begins to cool again. Repeat. Right? This is a
pretty basic assumption. More importantly, it’s one that we can
successfully chart—loosely and without requiring any more
specific data or numbers at all. Rather, we’ll use points from
the basic story premise we laid out earlier to graph a seasonal
continuum for the year, using length of daylight as our curve
(see Figure 2.9). From there, we can try to decide just how
many seasons are really in a year.
How many curves does the orange line trace? The answer,
obviously, is two—hence the two-season viewpoint (see
Figure 2.10).
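The two-curve reading can be checked with a rough numerical sketch. The code below is not from the book: it assumes an idealized cosine for day length, and the 12-hour mean, 3-hour amplitude, and peak on day 172 are illustrative assumptions, not measured values. It counts how many arcs the curve traces above or below its midline over one cyclical year.

```python
import math

def daylight_hours(day, mean=12.0, amplitude=3.0, peak_day=172):
    """Idealized day length (assumed model): a cosine peaking near midsummer."""
    return mean + amplitude * math.cos(2 * math.pi * (day - peak_day) / 365)

def count_arcs(values, midline):
    """Count contiguous runs above or below the midline, treating the year as a cycle."""
    above = [v > midline for v in values]
    # A new arc starts wherever the side flips; index -1 wraps around to year end.
    return sum(1 for i in range(len(above)) if above[i] != above[i - 1])

year = [daylight_hours(d) for d in range(365)]
arcs = count_arcs(year, midline=12.0)
print(arcs)
```

Under these assumptions the curve crosses its midline twice a year, so it traces one arc above and one below, which matches the two-season viewpoint.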
Story Takeaway
Napoleon’s March
As I’ve mentioned, using visualizations to tell stories about
data is not a new technique. French civil engineer Charles
Joseph Minard has been credited with several significant
contributions in the field of information graphics, among
them his unique visualizations of two military
campaigns—Hannibal’s march from Spain to Italy some
2,200 years ago and Napoleon’s invasion of Russia in 1812.
Both of these visualizations were published in 1869 when
Minard was a spry 88 years old.
Story Takeaway
SUMMARY
In this chapter we discussed what makes stories so
impactful on the human brain. We then looked at a few real-
life examples of visual data storytelling in action. We could
analyze many more examples for this purpose, and more are
available online at the website companion to this book,
www.visualdatastorytelling.com.
_____________
1. www.marketplace.org/2016/09/26/sustainability/corner-office-marketplace/dont-call-
nationalgeographic-stodgy
2. http://mindingourway.com/there-are-only-two-seasons/
3. https://www.ncdc.noaa.gov/news/meteorological-versus-astronomical-seasons
CHAPTER 3
USING TABLEAU
Standing out against many other data visualization tools on
the market, Tableau is an industry-leading, best-of-breed
tool that delivers an approachable, intuitive environment for
self-service users of all levels to help them prepare, analyze,
and visualize their data. The software also provides delivery
channels for the fruits of its users’ visual analysis, including
dashboards and native storytelling functionality, called
“story points” in Tableau.
WHY TABLEAU?
In his book, Communicating Data with Tableau, author Ben
Jones included a personal note to his readers on why he
chooses Tableau. I thought I might do something similar.
TABLEAU IN DEMAND
note
The Tableau pricing model is based on users and
designed to scale as your organization’s needs grow.
Free software trials are also available as well as free
licenses for students and teachers—more at
https://www.tableau.com/pricing.
Tableau Server
As you might expect, Tableau Server is best suited for
enterprise-wide deployments. It is intended to provide entire
organizations with the ability to connect to any data source
—on-premise or in the cloud—with centrally managed
governance and granular security protocols to maintain
balance between user flexibility and IT control. This
product is used in conjunction with Tableau Desktop.
Tableau Desktop
Tableau’s flagship product, Tableau Desktop, is an
application that can be used on either Windows or Mac
machines. It allows connection to data on-premise or in the
cloud, and facilitates the entire visual discovery and
analytics process from connecting to data to sharing
visualizations, dashboards, or interactive stories using
Tableau Server or Tableau Online. The software also
includes a device designer to help users design and publish
dashboards optimized for various form factors.
Tableau Online
The online version of Tableau eliminates the need for a
server and is a fully cloud-hosted platform that primarily
works with cloud databases, but can also work with live on-
premises queries or scheduled extract refreshes. It provides
the ability for on-the-go users to build, explore, curate, and
share visualizations and dashboards that are accessible from
a browser or a Tableau Mobile app.
Tableau Public
One part data visualization hosting service, one part social
networking, Tableau Public is a free service that allows
users to publish interactive data visualizations online. These
visualizations can be embedded into webpages and blogs,
shared via social media or email, or made available for
download to other users.
note
You can follow me and see many of the visualizations
included in this book on Tableau Public at
https://public.tableau.com/profile/lindyryan.
GETTING STARTED
The first thing you need to do to get started with Tableau is
to get your hands on a license. If you have not done so
already, refer to the Introduction for guidance on how to get
a free trial of Tableau Desktop. You can also visit the
Tableau website to explore trial and purchase options.
CONNECTING TO DATA
When you first open Tableau Desktop, the Connect to Data
screen appears (see Figure 3.2).
note
This book focuses on the art of visual data storytelling,
and as such is not a user manual for Tableau. I
recommend you review the Training videos provided by
Tableau in the Discover pane.
Connecting to Tables
tip
I’m using the Global Superstore Excel training file
provided by Tableau. This is a simple dataset of sales
for a global retailer that sells furniture, office supplies,
and technology goods. You can download this file from
the Tableau Community to follow along.
Before moving on, there are a few more things to take note of
on this screen.
Figure 3.5 Clicking the data type icon allows you to change
the default data type for that column. This determines how the
fields are displayed on your worksheet in the next step.
join errors!
Sometimes, an issue occurs in joins. Tableau notes these
with a red exclamation point to the side of the join
wherein the error occurs (see Figure 3.8).
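As a rough illustration of how a join can go wrong, the sketch below assumes one plausible cause: the join key is stored as a number in one table and as text in the other. The tables, field names, and the inner_join helper are hypothetical and written in plain Python. This is not Tableau's own join logic.

```python
# Hypothetical tables; field names are illustrative, not from the book's dataset.
orders = [{"Order ID": 1001, "Region": "West"}, {"Order ID": 1002, "Region": "East"}]
returns = [{"Order ID": "1001", "Returned": "Yes"}]  # key stored as text

def key_types(table, key):
    """Collect the Python type names seen in one key column."""
    return {type(row[key]).__name__ for row in table}

def inner_join(left, right, key):
    """Inner join on one key; raise if the key's types differ between tables."""
    left_types, right_types = key_types(left, key), key_types(right, key)
    if left_types != right_types:
        raise TypeError(
            f"join key {key!r} has types {sorted(left_types)} vs {sorted(right_types)}"
        )
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

try:
    inner_join(orders, returns, "Order ID")
    join_failed = False
except TypeError:
    join_failed = True

# Casting the text key to a number lets the join succeed.
fixed_returns = [{**r, "Order ID": int(r["Order ID"])} for r in returns]
joined = inner_join(orders, fixed_returns, "Order ID")
print(join_failed, joined)
```

Changing the key's data type on one side, so that both sides agree, is the kind of fix this sketch stands in for.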
Data Window
The pane on the left of the sheet is called the Data window
and has two tabs: a Data tab and an Analytics tab.
Data
At the top of the Data tab is a list of all open data
connections and the fields from that data source categorized
as either dimensions or measures (discussed shortly).
Analytics
The Analytics tab enables you to bring out pieces of your
analysis—summaries, models, and more—as drag-and-drop
elements. We review these functions later.
Legends
Legends will be created and automatically appear when you
place a field on the Color, Size, or Shape card. To change
the order (or appearance) of fields in a visualization, drag
them around in the legend. Hide legends by clicking on the
menu and selecting Hide Card. Likewise, bring them back
by selecting the Legend option on the appropriate space in
the Marks card or by using the Analysis menu.
UNDERSTANDING DIMENSIONS
AND MEASURES
When you bring a data source into Tableau, Tableau
automatically classifies each field as a dimension or a
measure. The differences between these two are important,
though they can be tricky for those new to analysis. Perhaps
the best way to differentiate these two classifications is
this: dimensions are categories, whereas measures are fields
you can do math with.
Dimensions
Dimensions are things that you can group data by or drill
down by. They are usually—but not always—categories
(such as City, Product Name, or Color), and they can be
grouped into strings, dates, or geographic fields.
Measures
Measures are generally numerical data on which you want
to perform calculations—summing, averaging, and so on.
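The distinction can be sketched outside Tableau in a few lines of plain Python. The rows and the classify_fields and sum_by helpers below are hypothetical. The rule used here (all-numeric fields are measures, everything else is a dimension) is a simplifying assumption for illustration, not a description of how Tableau actually classifies fields.

```python
from collections import defaultdict

# Hypothetical rows, loosely modeled on a retail sales table (not real data).
rows = [
    {"Category": "Furniture", "City": "Paris", "Sales": 120.0, "Quantity": 2},
    {"Category": "Technology", "City": "Paris", "Sales": 300.0, "Quantity": 1},
    {"Category": "Furniture", "City": "Lagos", "Sales": 80.0, "Quantity": 4},
]

def classify_fields(rows):
    """Label a field a measure if every value is numeric, else a dimension."""
    roles = {}
    for field in rows[0]:
        numeric = all(
            isinstance(r[field], (int, float)) and not isinstance(r[field], bool)
            for r in rows
        )
        roles[field] = "measure" if numeric else "dimension"
    return roles

def sum_by(rows, dimension, measure):
    """Group rows by a dimension and sum a measure within each group."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[dimension]] += r[measure]
    return dict(totals)

roles = classify_fields(rows)
sales_by_category = sum_by(rows, "Category", "Sales")
print(roles)
print(sales_by_category)
```

Here Category and City act as dimensions (the things you group by), while Sales and Quantity act as measures (the things you add up within each group).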
COLORFUL PILLS
SUMMARY
This chapter introduced the Tableau product ecosystem and
then took a high-level view of the Tableau user interface,
including connecting and preparing data and the core
functionality of the Sheets canvas. In future chapters, you
will put this knowledge into practice as you begin working
hands-on with this functionality.
_____________
1. http://onlinehelp.tableau.com/current/pro/desktop/en-us/basicconnectoverview.html