Introduction To Data Science Interactive Visualization: CS 194 Fall 2015 John Canny
Lecture 11
Interactive Visualization
Data Scientist’s Workflow
[Workflow diagram. Labels: Sandbox, Publish, Information, Hypothesize, Digging Around in Data, Model, Evaluate, Interpret]
Outline
Visualization:
• Some great examples
• Some counter-examples
• Principles for Visualization Design
• Visualization Toolkits preview
FIRST, A CLASSIC
Charles Joseph Minard 1869
Napoleon’s March
According to Tufte: “It may well be the best statistical graphic ever drawn.”
5 variables: Army Size, location, dates, direction, temperature during retreat
Interactivity: Baby Names Voyager
(Wattenberg et al. 2005)
A modern classic with rich narrative quality (i.e. you can
discover stories through the names).
http://www.babynamewizard.com/
From Interactivity to Participation:
Many Eyes
(Wattenberg et al. 2007)
Participatory visualization and explanation site:
http://www.many-eyes.com
Interactivity to Educate
• The famous Gapminder Video, Hans Rosling:
200 Countries, 200 Years, 4 Minutes
• https://www.youtube.com/watch?feature=player_embedded&v=jbkSRLYSojo
The Future of Journalism?
from wtfviz.net
Pie in the Sky?
from wtfviz.net
from wtfviz.net
Needs Fixing
from wtfviz.net
Outline
Visualization:
• Some great examples
• Some counter-examples
• Principles for Visualization Design
• Visualization Toolkits preview
Visualization Definitions
• “Transformation of the symbolic into the geometric”
[McCormick et al. 1987]
Chart Design: Simplifying
• Example from Tim Bray
Principle 1: Simplify
• Tables and charts
• Reduce chartjunk/tablejunk; increase data-ink ratio
• Lessons from perception: Limit the number of objects
displayed at once
• Beware:
• Gratuitous 3D
• Shadows
• Gratuitous animation
• How do you tell if a feature is gratuitous?
Ask whether using it reveals more information.
Interactive Chart Design: Simplifying
• With interactive charts you can keep things very simple by
hiding and dynamically revealing important structure.
• On an interactive chart, you reveal the information most
useful for navigating the chart.
Principle 2: Understand Magnitudes
Which is brighter?
(128, 128, 128) (144, 144, 144)
Just Noticeable Difference
• JND (Weber’s Law)
$\Delta S = k \, \frac{\Delta I}{I}$
S = sensation
I = intensity
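Weber's Law says that what matters for a just noticeable difference is the relative change in intensity, not the absolute change. The sketch below applies that idea to the two grays from the "Which is brighter?" slide. It makes two loud assumptions: the 8-bit gray value is treated directly as intensity (real displays apply a gamma curve), and the `threshold` of 0.05 is a hypothetical Weber fraction chosen only for illustration.

```python
def weber_fraction(i_ref: float, i_test: float) -> float:
    """Relative intensity change |delta I| / I, measured against i_ref."""
    return abs(i_test - i_ref) / i_ref


def is_noticeable(i_ref: float, i_test: float, threshold: float = 0.05) -> bool:
    """True if the relative change reaches the threshold.

    The default threshold is a made-up value for illustration,
    not a measured perceptual constant.
    """
    return weber_fraction(i_ref, i_test) >= threshold


# The two grays from the slide, using the raw 8-bit value as "intensity"
# (an assumption; this ignores display gamma).
fraction_mid_gray = weber_fraction(128, 144)

# The same absolute step of 16 starting from a brighter gray.
fraction_light_gray = weber_fraction(224, 240)

print(f"128 -> 144: relative change {fraction_mid_gray:.3f}")
print(f"224 -> 240: relative change {fraction_light_gray:.3f}")
```

The same step of 16 is a smaller relative change on the brighter gray, so under Weber's Law it should be harder to see there.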
Perceived magnitude as a power of intensity, $S = I^{p}$ (Stevens’ power law):
p < 1 : underestimate
p > 1 : overestimate
Length
Slope
Angle
Area
Volume
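One reading of the p thresholds above is as the exponent of a power law, S = I^p (Stevens' power law), with a different exponent for each visual encoding. The sketch below shows how a true 2x difference would be perceived under that model. The exponents in `ASSUMED_EXPONENTS` are illustrative placeholders, not values from the lecture; published estimates vary by study and stimulus.

```python
# Illustrative exponents only (assumptions, not taken from the slides).
ASSUMED_EXPONENTS = {"length": 1.0, "area": 0.7, "volume": 0.6}


def perceived_ratio(true_ratio: float, p: float) -> float:
    """Perceived ratio between two stimuli under S = I**p.

    If one stimulus is true_ratio times the other, the perceived
    ratio is true_ratio**p.
    """
    return true_ratio ** p


for name, p in ASSUMED_EXPONENTS.items():
    ratio = perceived_ratio(2.0, p)
    print(f"{name}: a true 2.0x difference is perceived as about {ratio:.2f}x")
```

With p below 1 the perceived ratio falls short of the true one, which is the usual argument for encoding quantities as lengths rather than areas or volumes.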
Principle 3: Use Color
• Color
Principle 4: Use Structure
• Gestalt Psychology principles (1912):
Source http://blog.fusioncharts.com/2014/03/how-to-use-the-gestalt-principles-for-visual-storytelling-podv/
Principle 4: Use Structure
(but not like this)
Source https://www.vocalabs.com/blog/my-dashboard-pet-peeve
Principle 4: Use Structure
Source https://www.vocalabs.com/blog/my-dashboard-pet-peeve
Chart Selection – Andrew Abela
Chart Selection – Juice Analytics
Data Viz in the Sciences
Uses for Data Viz
A case for Ugly visualizations
People instinctively gravitate to attractive visualizations, and
they have a better chance of getting on the cover of a journal.
5 min BREAK
Lecture Wrap-up
One more lecture next week: Joey Gonzalez (co-developer of
GraphLab at CMU and GraphX at Berkeley)
Quark: rich, complex energy models; faithful, physical simulation
Raptor-X: data-intensive, general ML models; feature-based inference; Conditional Neural Fields
What’s Hard (and Rewarding)
about Data Science
Critical Thinking:
• Overcoming assumptions.
• (Not) making ad-hoc explanations of data patterns.
• (Not) overgeneralizing.
• Checking enough (validate models, data pipeline
integrity, etc.).
• Using statistical tests correctly.
• If it looks weird, it’s usually wrong; figure out why…
What’s Hard and Rewarding
about Data Science
Managing Complexity
• Check and validate everything (again).
• Prototype → Production transitions.
• Data pipeline complexity (who knows the entire
system?).
Communicating
• You have to distill the results of ###-bytes of data
into a few paragraphs or a chart, and be accurate.
• Models are only approximations to reality.