
DS351 DataViz Intro

Uploaded by

khuzi yunus

Introduction to Data Science

Data Visualization

Farhan Khan
Outline
Visualization:
• How not to do it
• How to do static visualizations
• Making it interactive
FIRST, A CLASSIC
Charles Joseph Minard 1869
Napoleon’s March

According to Tufte: “It may well be the best statistical graphic ever drawn.”
Five variables: army size, location, dates, direction, and temperature during the retreat
More Examples
• The famous Gapminder video by Hans Rosling: 200 Countries, 200 Years, 4 Minutes
• https://www.youtube.com/watch?feature=player_embedded&v=jbkSRLYSojo

• NY Times interactive visualizations (e.g., the 2013 federal budget)
• http://www.nytimes.com/interactive/2012/02/13/us/politics/2013-budget-proposal-graphic.html

• Also, map-based visualizations, such as CrimeMapping
• http://www.crimemapping.com/map.aspx?aid=3f1738a8-6160-4c68-998a-ae00f597613a
Visualization Types

Depends on what you want to show
• Temporal visualizations – the data satisfy two conditions: they are linear and they are one-dimensional.
• Hierarchical visualizations – order groups within larger groups.
• Network visualizations – show how data items relate to one another within a network.
• Multidimensional visualizations – always involve two or more variables, which can be combined into a 3D data visualization.
• Geospatial (spatial) visualizations – relate to real-life physical locations, overlaying data points on familiar maps.
Temporal visualization

The most common type that you
normally see and use

Scatter plots

Polar area diagrams

Time series sequences

Timelines

Line graphs
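
As a rough illustration of a one-dimensional, time-ordered view, the sketch below draws a text-only line graph (a sparkline) from a sequence of values. The function name `sparkline` and the `monthly_views` numbers are made up for this example; a real chart would normally come from a plotting tool.

```python
# Eight block characters of increasing height, used as a tiny vertical axis.
BARS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Map each value to a block character scaled between the min and max."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # A flat series has no variation to show; draw the lowest bar throughout.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[round((v - lowest) / span * top)] for v in values)

# Hypothetical monthly counts, in time order.
monthly_views = [9, 12, 15, 11, 20, 26, 24, 30]
print(sparkline(monthly_views))
```

Because the values are only positioned by their order, this works for evenly spaced observations; irregular timestamps would need a real time axis.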
Hierarchical Visualization

Hierarchical visualizations
are best suited if you’re
looking to display clusters of
information, especially if
they flow from a single origin
point.

Tree diagrams

Ring charts

Sunburst diagrams
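
One way to see how a ring chart or sunburst diagram flows from a single origin point is to compute the angle each node occupies. The sketch below assumes a hypothetical nested-dict schema (`name`, `value`, `children`) and made-up sales figures; it only computes the layout, and drawing the arcs is left to a plotting library.

```python
import math

def subtree_value(node):
    """Total value of a node: its own value for a leaf, else the sum of its children."""
    if "children" in node:
        return sum(subtree_value(child) for child in node["children"])
    return node["value"]

def sunburst_arcs(node, start=0.0, end=2 * math.pi, depth=0):
    """Yield (name, depth, start_angle, end_angle) for every node, root first."""
    yield node["name"], depth, start, end
    children = node.get("children", [])
    total = sum(subtree_value(child) for child in children)
    angle = start
    for child in children:
        # Each child gets a share of its parent's arc proportional to its value.
        span = (end - start) * subtree_value(child) / total if total else 0.0
        yield from sunburst_arcs(child, angle, angle + span, depth + 1)
        angle += span

# Hypothetical hierarchy: the root is the single origin point.
sales = {
    "name": "All sales",
    "children": [
        {"name": "Europe", "children": [
            {"name": "France", "value": 30},
            {"name": "Spain", "value": 10},
        ]},
        {"name": "Asia", "value": 40},
    ],
}

for name, depth, a0, a1 in sunburst_arcs(sales):
    print(f"{'  ' * depth}{name}: {math.degrees(a0):.0f} to {math.degrees(a1):.0f} degrees")
```

The depth gives the ring a node sits on, and the angles give its slice of that ring.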
Geospatial Visualization

These types of data
visualizations are commonly
used to display sales or
acquisitions over time

Flow map – shows the
movement of objects
between geographical points

Density map

Cartogram

Heat map
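
A density map can be approximated by counting how many points fall into each cell of a regular grid and then shading the cells by count. The sketch below shows only the counting step; the function name `density_grid`, the 0.5-degree cell size, and the coordinates are assumptions for illustration.

```python
import math
from collections import Counter

def density_grid(points, cell_size_deg=0.5):
    """Count (lat, lon) points per grid cell; keys are (row, col) cell indices."""
    counts = Counter()
    for lat, lon in points:
        # floor() keeps negative coordinates in the correct cell.
        cell = (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        counts[cell] += 1
    return counts

# Hypothetical event locations as (latitude, longitude) pairs.
points = [(24.86, 67.01), (24.91, 67.08), (24.95, 67.20), (33.68, 73.05)]

for cell, count in density_grid(points).most_common():
    print(cell, count)
```

Cells of equal size in degrees do not cover equal areas on the ground, so this is only a rough sketch for small regions.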
Some Anti-Examples
• Courtesy of WTFViz.net
Visualization to Educate?

from wtfviz.net
Another Interesting One

from wtfviz.net
Pie in the Sky?

from wtfviz.net
Needs Fixing

from wtfviz.net
Okay, so that’s how not to do it!
Let’s talk about how to do it well:
• Some principles
• Best practices for static visualization
• Emerging principles and tools for interactive
visualization
What is Visualization?
Definition (www.oed.com)
1. The action or fact of visualizing; the power or process
of forming a mental picture or vision of something not
actually present to the sight; a picture thus formed.
More Definitions
• “Transformation of the symbolic into the geometric” [McCormick et al. 1987]

• “... finding the artificial memory that best supports our natural means of perception.” [Bertin 1967]

• “The use of computer-generated, interactive, visual representations of data to amplify cognition.” [Card, Mackinlay, & Shneiderman 1999]
Uses for Data Viz
A: Support reasoning about information (analysis)
• Finding relationships
• Discover structure
• Quantifying values and influences
• Should be part of a query/analyze cycle

B: Inform and persuade others (communication)


• Capture attention, engage
• Tell a story visually
• Focus on certain aspects, and omit others
Data Presentation
• Designer-Reader-Data Trinity

From “Designing Data Visualizations”, Iliinsky and Steele, O’Reilly, 2011
Uses for Data Viz
A case for Ugly visualizations
People instinctively gravitate to attractive visualizations, and they
have a better chance of getting on the cover of a journal.

But does this conflict with the goals of visualization?

• Rapid exploration
• Focus on most important details
• Easy and fast to develop and customize

PowerPoint vs Keynote vs InDesign


A case for Ugly visualizations
But you can go too far:

Ugliness does correlate with being hard to interpret, but the two are not the same thing.
Data Scientist’s Workflow
Sandbox: Digging Around in Data, Hypothesize, Model, Evaluate, Interpret
Production: Large Scale Exploitation
A case for Interactivity

That is, visualizations usually aren’t an end in themselves, but part of a query/interpret cycle.

Interactivity can speed up the query/interpret cycle.


Baby Names Voyager
(Wattenberg et al. 2005)
An interactive visualization with rich narrative quality
(i.e. you can discover stories through the names).

http://www.babynamewizard.com/

Hides more than it reveals, but lets you explore in an intuitive way, i.e., it supports rapid query/interpret cycles.
Many Eyes
(Wattenberg et al. 2007)
Participatory visualization and explanation site:

http://www.many-eyes.com
Outline
Visualization:
• How not to do it
• How to do static visualizations
• Making it interactive
Chart Selection – Andrew Abela
Chart Selection – Juice Analytics
Design Considerations
• Tables and charts
• Reduce chartjunk/tablejunk; increase data-ink ratio
• Lessons from perception: limit the number of objects displayed at once
• Typography: capitalization, serif/sans-serif; use what your company uses!
• Colors
• Color scheme
• Contrast, emphasis
• Use what your company uses!
• 6 Gestalt Psychology principles (1912):
• For groups of objects: proximity, similarity, enclosure, connection
• Visual representation: closure, continuity
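
One simple way to limit the number of objects displayed at once is to keep the largest categories and fold the rest into a single bucket before charting. This is a minimal sketch: the function name `limit_categories`, the default of five items, and the `browser_share` figures are assumptions, not recommendations from the slides.

```python
def limit_categories(counts, max_items=5, other_label="Other"):
    """Keep the largest categories and fold the remainder into one bucket."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= max_items:
        return dict(ranked)
    # Reserve one slot for the combined bucket so the total stays at max_items.
    kept = ranked[: max_items - 1]
    remainder = sum(value for _, value in ranked[max_items - 1:])
    return dict(kept + [(other_label, remainder)])

# Hypothetical category totals with a long tail of small values.
browser_share = {"A": 40, "B": 25, "C": 15, "D": 8, "E": 5, "F": 4, "G": 3}
print(limit_categories(browser_share, max_items=4))
```

The total is preserved, so a bar or pie chart built from the result still sums to the same whole while showing fewer marks.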

Chart Design

• Example from Tim Bray

Design Considerations
• Color
• By default, use your organization’s palette
• Choose colors based on the information you want to convey
• Sequential
• Diverging
• Categorical
• Use online resources to discover and record your color
schemes
• Color Brewer
• Kuler
• Colour Lovers
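
To make the three scheme types concrete, the sketch below builds a sequential ramp (one hue, light to dark) and a diverging ramp (two hues meeting at a neutral midpoint) by blending hex colors, and lists a few distinct hues as a categorical scheme. All function names and color values are illustrative assumptions. Plain RGB blending is not perceptually uniform, so curated palettes such as those from Color Brewer are usually a better starting point.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

def blend(start, end, t):
    """Linearly mix two hex colors; t=0 gives start, t=1 gives end."""
    mixed = (round(a + (b - a) * t) for a, b in zip(hex_to_rgb(start), hex_to_rgb(end)))
    return "#{:02x}{:02x}{:02x}".format(*mixed)

def sequential(low, high, n):
    """n colors from low to high, for ordered data (e.g. counts)."""
    if n < 2:
        raise ValueError("need at least 2 colors")
    return [blend(low, high, i / (n - 1)) for i in range(n)]

def diverging(low, mid, high, n):
    """n colors (n odd) around a neutral midpoint, for data with a meaningful center."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd number of at least 3")
    half = n // 2
    left = [blend(low, mid, i / half) for i in range(half)]
    right = [blend(mid, high, i / half) for i in range(1, half + 1)]
    return left + [mid] + right

# Categorical: unordered groups get clearly distinct hues (illustrative picks).
categorical = ["#1b9e77", "#d95f02", "#7570b3"]

print(sequential("#000000", "#6496c8", 3))
print(diverging("#0000ff", "#ffffff", "#ff0000", 5))
```

The choice between the three follows the data: ordered values suit a sequential ramp, values around a reference point suit a diverging ramp, and unordered groups suit a categorical list.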

Visualization Tools
Tableau

• The market-leading choice for modern business intelligence
• An analytics platform that makes it easier for people to explore and manage data
• In high demand: many data scientist jobs ask for experience in Tableau
• Proprietary software; it includes a free version, and students get a free one-year license for the full version
