
Unit+1+Intro+and+Theory+I

The document provides an introduction to the course BMI 6340: Health Information Visualization and Visual Analytics, focusing on the principles of information visualization and data visualization. It outlines the course objectives, grading structure, and the use of Tableau as a primary tool for creating effective visualizations. Key topics include the theory behind visualizations, practical design experience, and various types of data analysis techniques.

Uploaded by

gzt5jmx7df
Copyright
© © All Rights Reserved

INTRODUCTION TO INFORMATION VISUALIZATION
BMI 6340: Health Information Visualization and Visual Analytics
Todd R. Johnson, PhD
ANSCOMBE’S QUARTET (1973)
Statistically identical data sets

Mean

Variance

Correlation

Regression line
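
The claim that the four data sets share the same summary statistics can be checked numerically. The sketch below is illustrative only: the x/y values are the ones commonly reproduced from Anscombe's 1973 paper (they are not listed on the slide), and the helper name `summarize` is an assumption made for this example.

```python
from math import sqrt

# Anscombe's quartet, as commonly reproduced from the 1973 paper.
# These values do not appear on the slide itself.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return mean, sample variance, correlation, and least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimals the four rows print nearly the same numbers, which is the point of the slide: the differences only become obvious when the data are plotted.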
WHAT IS INFORMATION VISUALIZATION?

The use of computer-supported interactive visual representations of abstract data to amplify cognition.

Card, Mackinlay and

Computer supported

Interactive vs. non-interactive

Abstract data (no natural physical form) vs. image of real object
Information Visualization?
Static image?
Zoomable satellite view?

Information Visualization?
Static image?
Zoomable view plus click on icons to get more details

Information Visualization?
Sky
Shady side of pyramid
Sunny side of pyramid
WHAT IS DATA VISUALIZATION?
“The use of visual representation to explore, make sense of, and communicate data.”
Stephen Few, author, scientist

Transforms Data into Information
Syllabus Review
BMI 6340 Health Information Visualization and Visual Analytics
3 Semester Credit Hours

Will require 9-20 hours of effort each week


Learning Objectives
Primary

Learn the theory behind effective visualizations

Gain practical experience designing effective visualizations

Secondary

Learn how to use Tableau to create effective information visualizations
Why Tableau?
Directly supports visualizations as mappings from data to visual representations of that data

More than any other visualization tool, Tableau allows us to focus on design instead of the mechanics (e.g., data manipulation and programming) behind effective visualizations

Often automatically follows sound theory for creating visualizations

Produces highly interactive visualizations with little effort

Available free to students and faculty through the Tableau for Teaching program. Tableau Public is available for free to everyone.
Prereqs/Coreqs
Access to Internet

Downloading Tableau, data, examples, readings, etc.

Canvas for online course material and discussions

Computer capable of running Tableau (Mac or Windows)
Instructor
Todd R. Johnson, PhD
Office phone: 713-500-3913
860 UCT 7000 Fannin, Suite 600
Houston, TX 77030
[email protected]

Office Hours

Online help sessions held once each week (see Canvas for schedule and session link)

Otherwise schedule one on one with the course TA or me
Online Help Sessions
Q & A Session open to all students

Be sure to have a mic and headphones and be prepared to share your screen

Questions taken in the following order

Questions about the next assignment due

Questions about most recent lecture

Questions about all other assignments or lectures (past and future if you are working ahead)

Questions about class in general

Questions about anything else related to visualization in healthcare, skills needed to help get a job in visualization, etc.
Grading
Requirements                                   Percentage of Total Points
Questions embedded in videos and pre-quizzes   0%
Weekly Review Quizzes                          15%
Homeworks                                      40%
Midterm Exam                                   10%
Midterm Project                                10%
Term Project                                   25%
Total                                          100%
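
As a rough illustration of how the component percentages combine into a course grade, the sketch below uses the weights stated on the component slides that follow (15%, 40%, 10%, 10%, 25%). The function name `course_grade` and the example scores are hypothetical, and the 0% component is left out because it does not affect the total.

```python
# Weights in percent, taken from the component slide titles.
WEIGHTS = {
    "weekly_review_quizzes": 15,
    "homeworks": 40,
    "midterm_exam": 10,
    "midterm_project": 10,
    "term_project": 25,
}


def course_grade(scores):
    """Weighted average of component scores, each given as a percent (0-100)."""
    if sum(WEIGHTS.values()) != 100:
        raise ValueError("weights must sum to 100")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student: 90 on everything except 80 on homeworks.
EXAMPLE_SCORES = {
    "weekly_review_quizzes": 90,
    "homeworks": 80,
    "midterm_exam": 90,
    "midterm_project": 90,
    "term_project": 90,
}
print(course_grade(EXAMPLE_SCORES))  # 86.0
```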
Late Assignment Policy
25% penalty for each day after the due date.

Applies to all graded assignments, quizzes, exams, and projects

Example:

Due Date: December 1 by midnight

Turned in: December 2 at 12:30am (counted as 1 day late!)

Score: 100 out of 100 points

Grade: 100 - .25*100 = 75 points

Assessments turned in after the 4th day will not be graded.
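
The late policy can be sketched as a small calculation. This is one reading of the slide, not the official grading code: it assumes any fraction of a day counts as a full day late, that the penalty is a fraction of the earned score, that "by midnight" means the end of the due date, and that submissions more than 4 days late get no grade. The year 2024 and the function names are placeholders.

```python
import math
from datetime import datetime

PENALTY_PER_DAY = 0.25  # 25% per day late, from the slide
MAX_DAYS_LATE = 4       # later submissions are not graded
SECONDS_PER_DAY = 24 * 60 * 60


def days_late(due, submitted):
    """Whole days late; any part of a day counts as a full day (assumption)."""
    seconds = (submitted - due).total_seconds()
    if seconds <= 0:
        return 0
    return math.ceil(seconds / SECONDS_PER_DAY)


def late_adjusted_score(raw_score, n_days_late):
    """Score after the late penalty, or None if the work is too late to grade."""
    if n_days_late > MAX_DAYS_LATE:
        return None
    return raw_score * (1 - PENALTY_PER_DAY * n_days_late)


# Slide example: due December 1 by midnight, turned in December 2 at 12:30am.
due = datetime(2024, 12, 2, 0, 0)         # end of December 1 (year is a placeholder)
submitted = datetime(2024, 12, 2, 0, 30)
late = days_late(due, submitted)
print(late, late_adjusted_score(100, late))  # 1 75.0
```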


Solutions
Quizzes and Midterm Exam

Correct answers will be available 4 days after due date (due to late submission policy)

Open one of your attempts to see the correct answers

Homeworks

Walkthroughs (if available) will be available approximately 4 days after the homework due date.

Each will appear in the module in which the homework was assigned.
Weekly Review Quizzes (15%)
Intended to reinforce weekly material (formative assessments)

Open book, notes, and web. Just do your own work so that you learn the material.

Online in Canvas

You will have one week to complete the questions from the time they are posted

You can retake the questions as many times as you like, but questions are pulled pseudo-randomly from multiple question banks.

Canvas keeps the highest score


Homeworks (40%)
Usually one per week

Hands-on exercises to create or extend visualizations

I expect you to work individually on these assignments

You can consult with me or your classmates for help, especially regarding how to use Tableau, but your work should be your own

I do expect you to use the internet for help with Tableau and even design ideas. This is all part of learning how to learn new concepts and new tools. It is also essential for getting design inspiration.
Homework Grading Rubrics

Each homework has a detailed grading rubric to show exactly how much each element of the homework is worth.

The TA will include a grade and possible comments in the rubric for your submission.
Midterm Exam (10%)
Online Canvas Quiz

You can submit as many times as you like until 4 days after the due date (but beware late penalty).

Grade is taken from last submission.

Correct answers not shown until after due date.

You will have approximately one week to do the exam
Midterm Project (10%)

Tableau project covering material covered in the first half of the semester

You have approximately one week to complete it
Term Project (25%)
Fairly open-ended

Create a Tableau dashboard using real data (supplied by the instructor).

Dashboard must meet specific user needs that you are given

You must justify your design

You will have several weeks to do the term project
Textbook

Plus selected readings


Topics
Theory and practical guidelines for designing effective visualizations

Tools for creating Information Visualizations

Emphasis on Tableau Data Visualization Software

Brief discussions and examples using other information visualization tools

Visual analysis for specific kinds of data

Dashboards
Time-Series Data

Data collected at equal (or nearly equal) intervals

Is the proportion of visits increasing over time?
Ranking Relations
Is Clinic 2 better or worse than Clinic 5?

Where does Clinic 4 rank among all clinics?

Interested in rank order, not magnitude of differences

Barchart

An unsorted barchart does not work well

Sorting from low to high supports rank comparisons

Barchart sorted from low to high also works
Part to Whole Relations
What proportion does one value contribute to a whole?

Which clinic sees the highest proportion of patients in 2009?

Which clinic sees the lowest proportion of patients in 2010?
Deviations
How does one set of values differ from a reference set of values?

How much did each Unit miss or exceed its readmission rate goal?

Sorting from best to …

Bullet Charts show …
Distributions
How are quantitative values spread across their range?

What is the distribution of Birthweight by Race?
Correlations

Use LOS vs. some other variable


Multivariate Data
Items described by a common set of variables

Patients: age, height, weight, gender, race

Countries: population, health spending per person, GDP, etc.

Questions

Which items are alike or similar? (Which patients are like me?)

How can we group items?

Which set of variables and values lead to a particular outcome?

Parallel Coordinates Plot
Temporal Event Relations
Temporal sequence of events

Point events

Events with a duration (start and end)

Different types of events (e.g., admission, surgery, discharge)

Events may take place at non-uniform time periods

Time periods and actual date and time of measurement may vary by subject

Questions

What happens to patients…

EventFlow
Signal Detection and
Quality Improvement
Geographic Data
We will not cover…
Data governance

Data quality, other than the use of visualization to discover potential quality issues

Needs analysis (essential for visualizations)

Less common (but still useful) graphs and tools

Advanced visualization theories: grammar of graphics, types of visualization tools, etc.
A Closer Look at Data Visualization
WHAT IS DATA VISUALIZATION?
“The use of visual representation to explore, make sense of, and communicate data.”
Stephen Few

Transforms Data into Information
Data

Country           Region              Income per Person  Life Expectancy  Population
China             Asia                9502               75               1.35 B
Congo             Sub-Saharan Africa  403                50               70 M

Mapping (Encoding)

Circle + Tooltip  Color               X loc              Y loc            Size

Visual Representation

Perception + Knowledge

Information

Data (same table as above)

Mapping (Encoding)

Size, X

Visual Representation

Changing the mapping changes the

Data (same table as above)

Mapping (Encoding)

Visual Representation

What is the mapping?
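
To make the idea of a mapping (encoding) concrete, here is a minimal sketch that turns the two data rows into visual properties. It assumes the mapping Country to circle with tooltip, Region to color, Income per Person to X location, Life Expectancy to Y location, and Population to circle size, which is how the flattened slide labels appear to line up. The palette, plot dimensions, axis maxima, and function names are hypothetical choices for illustration.

```python
import math

# The two rows from the slide's data table (population written out in full).
COUNTRIES = [
    {"country": "China", "region": "Asia",
     "income": 9502, "life_exp": 75, "population": 1.35e9},
    {"country": "Congo", "region": "Sub-Saharan Africa",
     "income": 403, "life_exp": 50, "population": 70e6},
]

# Hypothetical palette and plot size in pixels.
REGION_COLORS = {"Asia": "red", "Sub-Saharan Africa": "blue"}
PLOT_WIDTH, PLOT_HEIGHT = 500, 300


def linear_scale(value, domain_max, range_max):
    """Map a value in [0, domain_max] linearly onto [0, range_max]."""
    return value / domain_max * range_max


def encode(row):
    """Apply the assumed mapping: one data row becomes one circle mark."""
    return {
        "tooltip": row["country"],                             # Country -> label
        "color": REGION_COLORS[row["region"]],                 # Region -> color
        "x": linear_scale(row["income"], 10_000, PLOT_WIDTH),  # Income -> X loc
        "y": linear_scale(row["life_exp"], 100, PLOT_HEIGHT),  # Life exp. -> Y loc
        # Population -> size: radius grows with the square root so that
        # circle AREA, not radius, is proportional to population.
        "radius": math.sqrt(row["population"] / 1e6),
    }


MARKS = [encode(row) for row in COUNTRIES]
for mark in MARKS:
    print(mark)
```

Swapping which field feeds which visual property in `encode` produces a different picture from the same data.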
All Kinds of Mappings are Possible, But Not All Are Good
What’s the Mapping?
Not all mappings are good!

Bar Height

Bar Color

Bars ordered
Bars ordered
An effective mapping depends on

Characteristics of the data

How we perceive visual objects and relationships

The viewer’s information need(s)

The viewer’s background knowledge

Let’s look at these in detail

Data

Mapping (Encoding)

Visual Representation

Perception + Knowledge

Information
Case Study: Clinic Visits
What’s the Mapping?

Data

Clinic  Year  Visits (%)
1       2008  26.3
2       2008  73.7
1       2009  23.611
2       2009  76.389

Mapping: Clinic to Row, Year to Column, Visits (%) to Hindu-Arabic Numeral

2008 2009 2010 2011 2012

Clinic 1 26.3 23.611 37.681 43.089 62.338

Clinic 2 73.7 76.389 62.319 56.911 37.662

What proportion of patients
visited Clinic 1 in 2009?
2008 2009 2010 2011 2012

Clinic 1 26.3 23.611 37.681 43.089 62.338

Clinic 2 73.7 76.389 62.319 56.911 37.662

Find a specific data point

Easy with the table

Only two clinics

Dates in chronological order

Very precise
Which clinic has a higher
proportion of patient visits in
2012?
Clinic 1
2008 2009 2010 2011 2012

Clinic 1 26.3 23.611 37.681 43.089 62.338

Clinic 2 73.7 76.389 62.319 56.911 37.662

Comparison of two data points

A little more difficult

Requires cognitive comparison of magnitude

Spatial proximity of the two values makes it easier


Is the proportion of patients
visiting Clinic 1 generally
increasing over time?
Yes
2008 2009 2010 2011 2012

Clinic 1 26.3 23.611 37.681 43.089 62.338

Clinic 2 73.7 76.389 62.319 56.911 37.662

Synthesis of multiple differences to determine an overall trend

Much harder

Requires multiple comparisons and judgement about slight variations (small decrease in 2008-2009)
From 2008 to 2011 does one
clinic consistently see more
patients?
Yes
2008 2009 2010 2011 2012

Clinic 1 26.3 23.611 37.681 43.089 62.338

Clinic 2 73.7 76.389 62.319 56.911 37.662

Multiple comparisons

Easier than trend estimation?


Between which two years does one
clinic overtake the other in terms
of proportion of patients seen?
2008 2009 2010 2011 2012

Clinic 1 26.3 23.611 37.681 43.089 62.338

Clinic 2 73.7 76.389 62.319 56.911 37.662

Comparison of pairs of values and detection of shift in proportion

Hardest yet?
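
The pairwise comparison this question demands of a table reader can be spelled out as code. The sketch below uses the clinic table from the slide; the function name `find_crossovers` is an assumption made for this example.

```python
YEARS = [2008, 2009, 2010, 2011, 2012]
CLINIC_1 = [26.3, 23.611, 37.681, 43.089, 62.338]
CLINIC_2 = [73.7, 76.389, 62.319, 56.911, 37.662]


def find_crossovers(years, series_a, series_b):
    """Return (year_before, year_after) pairs where the lead changes hands."""
    crossovers = []
    for i in range(1, len(years)):
        gap_before = series_a[i - 1] - series_b[i - 1]
        gap_after = series_a[i] - series_b[i]
        # A sign change in the gap means one series overtook the other.
        if gap_before * gap_after < 0:
            crossovers.append((years[i - 1], years[i]))
    return crossovers


print(find_crossovers(YEARS, CLINIC_1, CLINIC_2))  # [(2011, 2012)]
```

Every adjacent pair of years has to be inspected before the answer is known, which is why this task is hard to do by eye from a table.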
If the trends continue, approximately
what proportion of patient visits will
Clinic 1 have in 2013?
Around 70?
2008 2009 2010 2011 2012

Clinic 1 26.3 23.611 37.681 43.089 62.338

Clinic 2 73.7 76.389 62.319 56.911 37.662

Synthesis to determine trend, plus projection

Very difficult

Calculate growth (year-over-year change for Clinic 1)

-2.7, +14.1, +5.4, +19.2
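
The growth calculation and projection can be worked through directly from the Clinic 1 row of the table. This sketch assumes the simplest possible projection, adding the average yearly change to the 2012 value; the slide does not say which method was intended, and the variable names are hypothetical.

```python
YEARS = [2008, 2009, 2010, 2011, 2012]
CLINIC_1 = [26.3, 23.611, 37.681, 43.089, 62.338]

# Year-over-year change: four differences for five yearly values.
YEARLY_CHANGES = [later - earlier for earlier, later in zip(CLINIC_1, CLINIC_1[1:])]

# Average change per year equals (last - first) / number of intervals.
AVERAGE_CHANGE = (CLINIC_1[-1] - CLINIC_1[0]) / (len(CLINIC_1) - 1)

# Naive linear projection for 2013 (an assumption, not the slide's stated method).
PROJECTION_2013 = CLINIC_1[-1] + AVERAGE_CHANGE

print([round(change, 1) for change in YEARLY_CHANGES])
print(round(AVERAGE_CHANGE, 2), round(PROJECTION_2013, 1))
```

The projection lands a little above 71, in line with the "Around 70?" answer on the slide.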


LET’S TRY AGAIN WITH GRAPHS

What are the mappings?
What proportion of patients
visited Clinic 1 in 2009?

WHICH IS EASIEST?
Which clinic has a higher
proportion of patient visits in
2012?

WHICH IS EASIEST?
Is the proportion of patients
visiting Clinic 1 generally
increasing over time?

WHICH IS EASIEST?
From 2008 to 2011 does one
clinic consistently see more
patients?

WHICH IS EASIEST?
Between which two years does one
clinic overtake the other in terms
of proportion of patients seen?

WHICH IS EASIEST?
If the trends continue, approximately
what proportion of patient visits will
Clinic 1 have in 2013?
Around 70?

WHICH IS EASIEST?
Key Points
The best visualization (mapping) depends on the data and the information need(s)

For looking up exact values, tables can be better than graphs

Graphs are often better for comparison and synthesis (trends, projections)

To pick the best graph you need to know your data, the users’ information need(s), and the users’ background knowledge

Since users can have more than one information need, more than one graph might be needed for the same data
All Kinds of Mappings are Possible, But Not All Are Good
Why is this a bad mapping?

To answer this, we need to understand variables, measurement scales, and visual perception

Bar Height

Bar Color

Bars ordered
Variables and Measurement Scales

Variable: A characteristic of an observational unit that may assume more than one of a set of values

Measurement Scale: All possible values for a variable
Observational Unit: Patient

Variable        Measurement Scale
Age (in years)  {0, 1, 2, 3, …}
Age Group       {infant, toddler, adolescent, teen, adult}
Age Range       {0-5, 6-12, 13-19, 20-25, 26 and older}
Gender          {Male, Female}
Gender          {Male, Female, Transgender, Other}

Categorical

Quantitative
Stevens’ Scale Types

Formal Properties                         Nominal (=, ≠)  Ordinal (<, >)  Interval (-)  Ratio (÷)
Category (equality)                       X               X               X             X
Magnitude (greater or less)                               X               X             X
Equal Interval (equality of differences)                                  X             X
Absolute Zero                                                                           X
Nominal
Values are non-numeric with no meaningful
order OR values are numbers used as names
or labels, where the value of the number is
meaningless

Category (=, ≠)  Magnitude (<, >)  Equal Interval (-)  Absolute Zero (÷)
✓

Ordinal Scale
Values have a meaningful order, but differences don’t matter (intervals between
values are not equal)

Age: {0-5, 6-12, 13-19, 20-25, 26 and older}

Rank order, such as finish place in a race: {1st, 2nd, 3rd, …}

Survey response: {Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree}

Category (=, ≠)  Magnitude (<, >)  Equal Interval (-)  Absolute Zero (÷)
✓                ✓
Interval Scale
Order and differences matter, but ratios of two values do not matter (intervals between values are equal)

Discharge Time – Arrival Time = LOS

Discharge Time / Arrival Time = ?

Category (=, ≠)  Magnitude (<, >)  Equal Interval (-)  Absolute Zero (÷)
✓                ✓                 ✓
Interval Scale
Order and differences matter, but ratios of two values do not matter

These two patients’ temperatures dropped by an equal amount

104 F - 103 F = 1
102 F - 101 F = 1

Category (=, ≠)  Magnitude (<, >)  Equal Interval (-)  Absolute Zero (÷)
✓                ✓                 ✓
Ratio Scale
Numbers tell us how much of one thing we have in comparison to another

Has a non-arbitrary absolute zero, meaning total absence of the characteristic being measured

Category (=, ≠)  Magnitude (<, >)  Equal Interval (-)  Absolute Zero (÷)
✓                ✓                 ✓                   ✓
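
The cumulative structure of the four scale types (each one adds a formal property to the one before) can be sketched as a small lookup. This is an illustrative model of the checkmark rows on these slides, not a standard library; the names `Scale`, `PROPERTY_MIN_SCALE`, and `properties_of` are assumptions made for this example.

```python
from enum import IntEnum


class Scale(IntEnum):
    """Scale types ordered from fewest to most formal properties."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# The lowest scale type at which each formal property holds.
PROPERTY_MIN_SCALE = {
    "category": Scale.NOMINAL,         # =, != are meaningful
    "magnitude": Scale.ORDINAL,        # <, > are meaningful
    "equal_interval": Scale.INTERVAL,  # subtraction is meaningful
    "absolute_zero": Scale.RATIO,      # division is meaningful
}


def has_property(scale, prop):
    """True if values on this scale have the given formal property."""
    return scale >= PROPERTY_MIN_SCALE[prop]


def properties_of(scale):
    """All formal properties a scale type has, in order."""
    return [prop for prop, minimum in PROPERTY_MIN_SCALE.items() if scale >= minimum]


for scale in Scale:
    print(scale.name, properties_of(scale))
```

For example, `has_property(Scale.INTERVAL, "absolute_zero")` is false, which is why the ratio of two interval-scale values such as clock times is not meaningful.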
Celsius vs. Kelvin
0 K is absolute lack of heat

Note location of 0º on the C scale

20º C is not twice as much heat as 10º C

10º C = 283.15 K

20º C = 293.15 K

20º C is just 1.035 times as much heat as 10º C:

293.15/283.15 = 1.035
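
The same arithmetic as a short sketch: converting to Kelvin, which has an absolute zero, before taking the ratio. The function name `celsius_to_kelvin` is chosen for this example.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15


# Dividing Celsius values treats an interval scale as if it were a ratio scale.
NAIVE_RATIO = 20 / 10

# Kelvin has an absolute zero, so this ratio is meaningful.
KELVIN_RATIO = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(NAIVE_RATIO)             # 2.0
print(round(KELVIN_RATIO, 3))  # 1.035
```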
Scales and Mappings

To create an accurate mapping

Choose visual properties that match the scale of the data as closely as possible
All Kinds of Mappings are Possible, But Not All Are Good
Why is this a bad mapping?

Scales of data and mappings are mismatched

Ratio

Nominal

Bar Height

Ratio

Bar Color

Bars ordered

Nominal

Ratio
Country  Region              Income per Person  Life Expectancy  Population
China    Asia                9502               75               1.35 B
Congo    Sub-Saharan Africa  403                50               70 M
US       North America       41231              79               310 M

Countries (Nominal) to Distinct Labeled Bars (Nominal property of
What’s wrong with this graph?

Y axis is no longer uniform

Looks like ratio, but isn’t

An Accurate Mapping is Necessary But Not Sufficient for an Effective Mapping
Summary
Information visualization is the use of computer-supported, interactive visualizations of abstract data to amplify cognition

A subtype of data visualization

A data visualization

maps data to a visual representation

uses visual perception to turn data into information

The best visualization depends on the data, the users’ information needs and their background knowledge
Summary
To know what a visualization means, we must know the mapping used to encode it

Data is measured along 4 scales

Nominal (categorical)

Ordinal

Interval

Ratio

An accurate mapping is one in which the scale of the data matches the scale of the visual property
Assignments

Download and install Tableau

Try to work through the getting started tutorial

Do the weekly review questions

Read this week’s and next week’s reading assignments
Next Week

Perception and Memory

How they affect visual representations and vice versa

Making effective use of color
