8 Rules For Better Data Storytelling
It was the end of 2020 when social media platforms faced an influx of Spotify Wrapped content—a summary of the user’s most-streamed music in the past
year. Those given the title of “top 1% fans worldwide” of their favorite artists proudly paraded their title like a badge of honor, while others shared their Most
Played Genre as a declaration of their musical preferences. In 2020 alone, 90 million users engaged with Spotify Wrapped. Clearly, users enjoyed the data
stories their streaming statistics told about them.
Spotify Wrapped is only one example that illustrates the increasing use of data visualizations in our daily lives. As the world watched the Covid-19 pandemic
unfold, data visualizations like the Johns Hopkins dashboard played a crucial role in informing the public on the extent of the pandemic. Data stories like National
Geographic’s coverage of 500,000 Covid-19 deaths raised awareness of the deadliness of the virus. All these examples were in line with DataCamp’s
prediction, made in our 2021 Trends Report, that data visualizations (and data stories) would become mainstream.
The difference between data visualizations and data stories is nuanced but important. Put simply, data visualizations are good at articulating what has
happened, but not necessarily why something has happened. This is where data storytelling comes into the picture.
Data storytelling is a methodology for communicating information with a compelling narrative. It is made up of three components—the data, visuals, and
narratives. Together, these components help data storytellers engage the audience, make data more memorable, and be more persuasive.
You might find it much easier to recall a funny story than the tenth digit of pi.
Stories serve as a framework to connect disparate pieces of information coherently and elicit an emotional response.
Humankind has long used storytelling to pass down culture and traditions. The paper Memory, Imagination and Learning: Connected by the Story reports
that storytelling is the most effective technique for oral cultures to faithfully memorize their knowledge. In a sense, humans are hardwired to
recall stories better than cold hard facts.
Data stories help us communicate better
Stories facilitate communication between the storyteller and the listener. In fact, neuroscientist Uri Hasson
observed that the brain activities of the listeners of a story are similar to the
brain activities of storytellers. This phenomenon, called “speaker-listener
neural coupling,” indicates the achievement of successful communication.
When data are presented intuitively in the form of stories, they can be
understood and acted upon quickly. Data stories also serve as a medium to
communicate data insights and inspire collaborative action in a way that a
regular visualization or dashboard on its own could not.
Data storytelling is a valuable skill set for both technical and non-technical
professionals. As companies build up self-serve capabilities in analytics, more
On the other hand, data practitioners who craft audience-specific data stories
can effectively convey the impact of their data science projects in the
language of the audience. This technique can be applied in the scoping,
implementation, and evaluation stages of the projects to alleviate the
skepticism of non-technical users. This paves the way to getting the stamp of
approval from decision makers and end users.
With the growing importance of data storytelling in the upcoming decade, all
employees must be equipped with the fundamentals of data storytelling.
In this white paper, we outline eight rules for better data storytelling that will
help anyone craft impactful data-driven narratives.
1. Know your audience
journalism and analysis, recently ventured into Instagram. In charting this new
territory, it constantly redesigns charts for its audience instead of simply reusing
graphs meant for the magazine. The Economist’s feat in garnering millions of
Instagram followers within a few years attests to its ability to cater to its audience.
Data storytellers should know what matters to the audience, and present data
stories that are of importance to the audience. Otherwise, one runs the risk of
Particularly, a non-technical audience cares much more about the business impact
metrics has a higher priority than that of technical metrics for business
stakeholders. They can also benefit from data storytelling techniques in conveying
An example of this in action is how The Economist leverages Instagram
as a medium to reach an audience that it could not reach with its print version. Knowing that its
stories of interest to its younger audience, running the gamut from climate change
to celebrity news.
Empathize with your audience
Data stories crafted with the audience in mind have a higher chance of retaining
the audience’s attention. Empathizing with your audience might influence the
content of the data story, the tone of the presentation, the length of the
document, and even the rate of speech. Here are some questions that might help
you understand and empathize with the audience better:
Does the audience have the necessary prerequisite knowledge to understand
a particular metric?
How much time does the audience have to consume this data story?
What is the medium of presentation (written/oral) that the audience prefers?
The Economist is not afraid to present complex charts that require high data
literacy to understand since it is targeted at highly educated business
professionals. Data storytellers should be aware of the data literacy of their
audience when designing charts to avoid underwhelming or overwhelming them.
Whenever possible, data practitioners should avoid technical jargon that might
confuse non-technical stakeholders, and instead use business metrics to convey
Fig 2: How the Economist simplified
2. Begin with the goal in mind
“Success in data visualization does not start with data visualization,” proclaimed
Cole Knaflic, the author of the influential book Storytelling with Data. In the book,
Knaflic emphasized the importance of understanding the end goal and the
context of a data story when starting a data visualization.
"When it comes to the form and function of our data visualizations, we first want to think about what it is we
understanding the target audience also helps the analyst decide on the level
of complexity of the data story
What do you want the audience to know or do? The main goal of building a
data story is to communicate a point and/or to give recommendations.
Knowing what exactly the point or recommendation is can help the analyst
stay focused on the main point instead of beating around the bush. A good
data story drives insights, which in turn drive decisions and call for action.
How can you use your data to convey your point? With an understanding of
the audience and the intended action, we can start gathering the necessary
data evidence that supports the story arc.
The answers to these questions are ingredients for a coherent data story. Once
that is done, we can start building data visualizations that support the story.
3. Choose the best visualization for your story
The process of choosing the correct type of visualization for a particular
insight requires an understanding of the different chart types. Here are the
four most common types of charts.
Bar charts are best suited to show comparisons of different categories.
Line plots are useful to show the changes of a variable over time.
Histograms show the distribution of one variable. This tells us how
frequently a particular value occurs relative to others.
Scatterplots can be used to show the relationship between two variables.
In general, data visualizations can be used to show comparison, relationship,
composition, and distribution. The following mindmap is a guide for
determining a suitable chart based on its purpose.
Chart type | Description | Examples
Bar charts | Best suited to show comparisons of different categories. | A bar chart showing the reasons for users to unsubscribe from the company’s newsletter (Fig 3A).
Line plots | Useful to show the changes of a variable over time. | A line plot of the company revenue over time (Fig 3B).
from an H survey.
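As a rough illustration of the chart guidance above, the mapping from purpose to chart type can be written as a small lookup. This is only a sketch: the function name suggest_chart and the purpose labels are assumptions made for this example, and the mapping covers only the four chart types described in this section.

```python
# Hypothetical helper: maps the purpose of a visualization to one of the
# four common chart types described in this section. The purpose labels
# are illustrative assumptions, not a standard vocabulary.
CHART_FOR_PURPOSE = {
    "comparison": "bar chart",        # compare different categories
    "change over time": "line plot",  # track a variable over time
    "distribution": "histogram",      # how often each value occurs
    "relationship": "scatterplot",    # relate two quantities
}


def suggest_chart(purpose: str) -> str:
    """Return a suggested chart type for a purpose, or raise ValueError."""
    key = purpose.strip().lower()
    if key not in CHART_FOR_PURPOSE:
        known = ", ".join(sorted(CHART_FOR_PURPOSE))
        raise ValueError(f"Unknown purpose {purpose!r}; expected one of: {known}")
    return CHART_FOR_PURPOSE[key]


if __name__ == "__main__":
    print(suggest_chart("Comparison"))
    print(suggest_chart("distribution"))
```

A lookup like this is only a starting point; the audience and the message of the story may still justify a different chart.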
4. Do not lie with data stories
Such a claim ignores the fact that global temperatures naturally fluctuate from
time to time due to events like El Niño, volcanic activity, and ocean
conditions. When such effects are taken into account, and data are presented
from the 19th century to now, it is clear that the earth is warming.
Ensure that the axes scales are appropriate
When needed to select a sample for a data story, ensure that the sample is
Use mean, median, and mode appropriately to ensure that the average is
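The point about choosing between mean, median, and mode can be illustrated with a short sketch. The salary figures below are made up for this example; it uses Python's standard statistics module to show how a single extreme value pulls the mean far away from the median and the mode.

```python
from statistics import mean, median, mode

# Hypothetical monthly salaries; one very large value skews the data.
salaries = [3000, 3200, 3200, 3500, 4000, 50000]

summary = {
    "mean": mean(salaries),      # pulled upward by the single outlier
    "median": median(salaries),  # middle of the sorted values
    "mode": mode(salaries),      # most frequent value
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

With data like this, reporting only the mean as "the average salary" would mislead the audience, while the median stays close to what most people in the sample earn.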
5. Keep your visualizations minimal and avoid clutter
Clutter in visualizations adds to the audience’s cognitive load without providing
additional value. Removing distractions from a visualization focuses the
attention of the audience on the core message of the chart.
To quantify the amount of unnecessary ink in a chart, Edward Tufte presented
an influential concept called the data-ink ratio, which is the proportion of ink
used to present the actual data to the total ink used to print the graphic. Good
graphics should have a high data-ink ratio.
To maximize the data-ink ratio, analysts should strive to minimize superfluous
words, colors, and lines without sacrificing clarity. In particular, here are some
ways one can keep visualizations minimal yet effective.
Fig 8A: A cluttered chart. Fig 8B: The same chart, decluttered. (Bar chart of calories per 100g.)
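The data-ink ratio described above can be sketched as a simple calculation. In practice nobody measures ink this precisely, so the element list and the ink estimates below are hypothetical, and classifying an element as data or non-data is a judgment call.

```python
# Each chart element is listed with a rough, made-up ink estimate and a flag
# saying whether it encodes the data itself. All numbers are hypothetical.
cluttered_chart = [
    ("bars", 40.0, True),
    ("value labels", 5.0, True),
    ("background fill", 30.0, False),
    ("gridlines", 15.0, False),
    ("border", 10.0, False),
]


def data_ink_ratio(elements):
    """Share of total ink that is spent on presenting the actual data."""
    total_ink = sum(ink for _, ink, _ in elements)
    if total_ink == 0:
        raise ValueError("chart has no ink at all")
    data_ink = sum(ink for _, ink, is_data in elements if is_data)
    return data_ink / total_ink


# Decluttering here means dropping every element that carries no data.
decluttered_chart = [e for e in cluttered_chart if e[2]]

print(f"cluttered:   {data_ink_ratio(cluttered_chart):.2f}")
print(f"decluttered: {data_ink_ratio(decluttered_chart):.2f}")
```

A ratio of exactly 1 is rarely the goal, since some non-data elements such as axis labels help the audience read the chart. The ratio is best treated as a prompt to question each element, not a score to maximize at the cost of clarity.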
6. Add color to your stories
First, it is used to distinguish between groups that do not have an intrinsic order.
A prime example is the use of blue for Democrats and red for Republicans as in
Figure 10 below.
example, we can immediately identify areas with darker colors as locations of
busier Covid-19 testing sites.
it is the main character of the
data story
Fig 10 Use of color intensity to highlight prevalence of
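The idea of using darker colors for higher values, as in the testing-site example, can be sketched as a linear blend between a light and a dark shade. The function name intensity_color, the two endpoint colors, and the test counts are all assumptions made for this illustration.

```python
def intensity_color(value, low, high):
    """Map a value in [low, high] to a hex blue shade; higher means darker."""
    if high <= low:
        raise ValueError("high must be greater than low")
    # Clamp so out-of-range values still get a valid shade.
    share = min(max((value - low) / (high - low), 0.0), 1.0)
    # Interpolate each RGB channel from a light blue to a dark blue.
    light, dark = (222, 235, 247), (8, 48, 107)
    rgb = [round(a + (b - a) * share) for a, b in zip(light, dark)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical daily test counts per site (made-up numbers).
tests_per_site = {"Site A": 120, "Site B": 480, "Site C": 950}
low, high = min(tests_per_site.values()), max(tests_per_site.values())
colors = {site: intensity_color(n, low, high) for site, n in tests_per_site.items()}

for site, color in colors.items():
    print(site, color)
```

A single-hue scale like this suits ordered quantities. Groups without an intrinsic order, such as political parties, are better served by clearly distinct hues.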
7. Use texts appropriately
Texts clarify the meaning of charts and make them accessible. Too much text
adds clutter while too little text causes confusion. Here are some general rules
for using texts in your data stories.
Without the axes and the titles, the audience is left guessing the meaning of the
chart, as is the case for Figure 14. Thus, it is crucial to add texts to name the
metrics being plotted to avoid miscommunication.
When the goal of the visualization is to drive a certain action, using actionable
chart titles can quickly call the audience to action. It is also helpful to add notes
to the footnote section of the chart for further details.
You may also notice the differences in font sizes in the charts above. In general,
font sizes need to be legible, and the difference in sizes depends on the relative
importance of the texts in conveying the message. In the chart above, the call to
action is the most prominent while the footnote is the least thanks to the choice
of word sizes.
8. Develop a narrative around your data
Cognitive psychologist Jerome Bruner suggests we are 22 times more likely to
remember a fact when it has been wrapped in a story. A narrative connects the
dots between facts and makes them more memorable.
"The power of data storytelling is if we combine the
right data with the right narrative and the right visuals,
we have something that's very powerful that can really
drive change and alter people's perspectives."
Brent Dykes, author of Effective Data Storytelling
The first step to data storytelling, according to Tableau’s whitepaper, is to find
the data story. To do so, one needs to identify the core elements of the story.
Who is the protagonist?
What is the challenge?
What should the audience do at the end of the story?
Hans Rosling, a global health expert and the founder of Gapminder, developed a
narrative around global health data to effectively debunk myths about the
developing world in his influential TED Talk Stats that Reshape Your Worldview.
In this data story, the developing countries (the protagonists) overcame the HIV
epidemic and overpopulation (the challenge) with public health measures before
eventually reaching life expectancies on par with developed countries (the
ending). As the audience engaged with the visual narrative, Rosling moved one step
closer to Gapminder’s goal of overcoming systematic misconceptions about
global trends.
Everyone will be a data storyteller
Pablo Picasso once said, “Learn the rules like a pro, so you can break them
like an artist.” While these eight rules serve as a general guideline for data
storytellers, they are by no means the commandments of data storytelling.
Data storytellers should recognize that there are always exceptions to the rule,
much like how The Economist consciously broke the rules of data visualization
in creating masterful data stories.
One should also note that these eight rules for data storytelling are by no
means exhaustive. Instead of aiming to learn every single theoretical rule, one
should aim to start crafting and telling their first data stories as soon as
possible. The experience gained from data storytelling will pay dividends as
data storytelling becomes a necessary skill set for all in the coming decade,
and data insights become a cornerstone of organizations.
Build a Team of Data Visualization Experts with DataCamp
Take your data storytelling to the next level with Data Visualization best practices
DataCamp’s extensive data visualization curriculum can help the data
storytellers of the future hone one of the key elements of data storytelling:
DataCamp’s proven learning methodology provides a cyclical process for
learning and retention. This learning methodology enables learners across the
data literacy spectrum to assess their skills and identify gaps, develop a
learning plan based on these gaps, practice skills, and apply them in a real-
world setting. Practitioners of any skill level can upskill on the latest data tools,
techniques, and concepts.
Assess: Test your skills and track progress
Learn: Complete interactive courses
Practice: Practice with quick daily challenges
Apply: Solve real-world problems
Data Visualization Courses at Your Disposal
Data Visualization with Python
DataCamp provides a host of data visualization courses across the most popular Python
packages. Whether it’s matplotlib, seaborn, bokeh, plotly, or dash, you’ll be able to develop
succinct data visualizations and dashboards that are deeply customizable for your data stories.
Data Visualization with R
If you’re looking to sharpen up your data visualization skills in R, we also have you covered. R is
considered one of the best tools for data visualization and reporting. Whether using ggplot2,
leaflet, or plotly for data visualization, or the shiny package for developing interactive
dashboards, you’ll be able to design better visualizations that complement your data story.
Tell your Data Stories with Tableau and Power BI
Business intelligence tools were made with data storytellers in mind. Powering the analysts of the
future, business intelligence tools like Tableau and Power BI provide easy drag-and-drop
interfaces to design and deploy dashboards that walk consumers through a data story.
Upskill your team
Track skills development with skill matrix
Track the data skills your team has today and map a path to the skills they need tomorrow.
Using the Skill Matrix, admin users can easily filter to identify individuals with the skills you
need to take on specific projects or teams with low use or data skills gaps. They can then
create and assign custom tracks to help bridge these gaps and report on skill development.
Create Custom Tracks
DataCamp makes it easy for you to create bespoke learning
paths and assignments to meet the needs of all your roles,
teams, and departments.
Set Assignments
Assignments are a great way to set clear, time-sensitive learning goals.
On average, courses assigned by Enterprise customers have completion
rates that are twice as high as unassigned courses.