Data Visualization Basics

This document provides an introduction to data visualization. It discusses the basics of visualization including its importance in decision making. The history of visualization is explored through examples dating back 30,000 years. Modern visualization is examined along with its relationship to other fields such as computer graphics and scientific vs information visualization. The visualization process is outlined including data analysis, mapping data to displays, and choosing appropriate visual representations.

Unit I Introduction and Data Foundation

1.1 Basics
a. Visualization
- The communication of information using graphical representations.
- A single picture contains a wealth of information and can be processed more quickly
than a page of words.
- This is because image interpretation is a parallel process, whereas text analysis is a
sequential process.
- Pictures are also easily understood by others, e.g., maps or graphs.
- Some examples of data visualization are:
o A map of a region that helps to select a route to a new destination.
o A table in a newspaper that represents data being discussed in an article.
o A train and subway map with arrival and departure times.
o An analysis of human smoking behaviours.
o The result of a financial and stock market analysis.
b. Importance of Visualization:
Visualization is essential for decision-making because humans are primarily visual
beings and rely on sight to understand information.
Visualization matters for decision-making in two ways:
▪ Data Distortion:
- Different ways of presenting the same data can lead to different interpretations. Changing
the scale of the axes can distort the data’s representation, making the same data appear
clustered, linear or quadratic. The choice of scaling can therefore affect decision-making
by changing how the data is perceived.
▪ Human Interpretation:
- In a real-world example, clinicians were presented with the results of hypothetical
clinical trials using different visualization techniques. The choice of visualization
significantly influenced the decision-making process. For instance, icon displays were
the most effective, leading to 82% correct decisions, while bar charts and pie charts
were less effective, resulting in only 56% correct decisions. User preferences also
played a role in the effectiveness of visualization methods.
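
The data-distortion point can be illustrated with a minimal sketch. It assumes synthetic data that follows y = x^2 (not taken from any study) and a hypothetical helper, segment_slopes. On linear axes the slopes between neighbouring points keep growing, so the plot looks quadratic; after a log transform of both axes the same points have a constant slope and look linear.

```python
import math

def segment_slopes(xs, ys):
    """Slopes between consecutive points; equal slopes mean the points lie on a line."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

# Synthetic, illustrative data following y = x^2 (an assumption, not real measurements).
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [x ** 2 for x in xs]

# Linear axes: the slopes keep growing, so the plot curves upward.
linear_slopes = segment_slopes(xs, ys)

# Log-log axes: every slope is about 2, so the same data plots as a straight line.
log_slopes = segment_slopes([math.log(x) for x in xs], [math.log(y) for y in ys])

print(linear_slopes)
print([round(s, 6) for s in log_slopes])
```

The data never changes here; only the scale applied before plotting does, which is why the choice of scale needs to be stated alongside any chart.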

Overall, visualization is crucial in making sense of data, as it can affect how data is perceived
and interpreted. As data becomes more abundant and complex, visualization tools and
techniques play an increasingly important role in helping individuals understand, analyze
and communicate information in various domains.

c. History of Visualization:

The history of visualization traces back to early human communication, where
graphical representations played a crucial role. Notable examples include:

PREETHIKA P 1
1. Chauvet Cave Paintings:
Over 30,000 years ago, early humans in the Chauvet Cave, France, created
over 250 paintings, likely as a means of passing on information to future
generations.

Fig: Chauvet Cave Paintings, France


2. Kish Limestone Tablet:
Considered the earliest written document, this pictographic tablet from
Mesopotamia, with some syllabic writing, provides an early form of visual
communication.

Fig: Kish Limestone Tablet, Sumer


3. Hieroglyphics:
The ancient Egyptians used hieroglyphics, consisting of logograms
(written/pictorial symbols), phonograms (symbols representing vocal sounds) and
determinatives, to encode information, such as religious and historical details.

Fig: Hieroglyph
4. Peutinger Map:
An early road map of the Roman Empire that depicted roads and distances,
though with some distortion. It was used for travel, commerce and military planning.

Fig: Peutinger Map

5. Hereford Map:
An enormous medieval map of the world located in Hereford, England,
featuring real and religious information, as well as mythical figures.

Fig: Hereford Map

6. John Snow’s Cholera Map:
In 1854, John Snow’s map of cholera deaths in London highlighted the
concentration of cases around the Broad Street water pump, leading to a critical
public health discovery.

Fig: John Snow’s Cholera Map


7. Early Time Series Visualizations:
Early time series visualizations depicted astronomical data, such as the
phases of the moon and planetary motion.

Fig: Biruni Circa phases of moon

8. Minard’s Napoleonic March:
Charles Minard’s famous map depicted Napoleon’s march on Moscow,
demonstrating the loss of troops during the expedition.

Fig: Napoleon’s 1812 March

9. Playfair’s Visualizations:
William Playfair’s work included plots of national debt over time and the
balance of trade between England and Norway/Denmark.

Fig: Playfair’s National Debt of England

10. Joseph Priestley’s Longevity Display:
Priestley’s visualization showcased the lifespans of famous individuals.

Fig: Joseph Priestley’s display of the longevity of famous people (1765)
11. Florence Nightingale’s Coxcomb Chart:
Nightingale’s coxcomb chart illustrated monthly deaths in the army,
distinguishing between deaths from disease, wounds and other causes.

Fig: Nightingale’s coxcomb chart showing monthly deaths and wounds from battle
12. Leonardo da Vinci’s Anatomy Drawings:
Da Vinci’s detailed drawings of human anatomy served as early medical
visualizations.

Fig: Leonardo da Vinci, the muscles of the shoulder and arm, and the bones of the foot (verso).

These historical examples highlight the evolution of visualization as a means to
communicate and understand information, from early cave paintings to complex data
representations.

d. Modern Visualization:

• Visualization provides both qualitative and quantitative views of information.
• Examples like the Tokyo underground map and Google Maps directions demonstrate
how easy visualizations are to interpret.
• Textual statements provide precise information, while visualizations offer nuanced
insights into trends and changes.
• Examples like the Dow Jones Average plot and the U.S. National Debt Clock provide
precise and detailed views of complex data.
• Visualizations such as scatterplots can reveal regression lines and data patterns, aiding
in comprehensive data analysis.
• Anscombe’s data sets emphasize the importance of visualizations in overcoming
statistical misconceptions.
• Modern applications in various fields, such as medical reconstruction, aerospace
simulations, and bioinformatics, showcase the diverse and powerful uses of
visualizations.
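
The point about Anscombe’s data sets can be checked numerically. The sketch below uses the four data sets as they are commonly reproduced from Anscombe’s 1973 paper; the helper name pearson and the dictionary layout are illustrative choices. It shows that all four sets share nearly identical means and correlation, even though their scatterplots look completely different.

```python
from statistics import mean

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# (mean of x, mean of y, correlation), each rounded to two decimals.
summary = {
    name: (round(mean(xs), 2), round(mean(ys), 2), round(pearson(xs, ys), 2))
    for name, (xs, ys) in ANSCOMBE.items()
}

for name, stats in summary.items():
    print(name, stats)
```

Because the rounded summaries coincide, only a plot reveals that one set is roughly linear, one is curved, one has a single outlier off an otherwise exact line, and one has all but one point at the same x value.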

1.2 Relationship between Visualization and Other Fields
1.2.1 Difference between Visualization and Computer Graphics
• Visualization was initially considered a subset of computer graphics, using graphics
to display data visually.
• Visualizations apply graphics techniques to represent data via images, emphasizing
the connection between data and graphics.
• Visualizations extend beyond computer graphics by incorporating elements from
various fields like human-computer interaction, psychology, databases, statistics
and data mining.
• The goal of computer graphics is to create interactive, realistic images and
animations, often used in entertainment and art.
• Visualization focuses on effective data communication, often dealing with
non-physical attributes or unseen properties.
• Computer graphics forms the foundation for visualization, with tools like graphics
programming languages, rendering techniques, and hardware being integral to the
field.
1.2.2 Scientific Data Visualization vs. Information Visualization
• In the past, there was a distinction between scientific visualization and information
visualization.
• Both scientific visualization and information visualization aim to represent data,
but they often deal with different types of data.
• Biomolecular chemistry has evolved from basic stick-and-ball representations
to more realistic ones and now includes information visualizations like
scatterplots.

1.3 The Visualization Process

The visualization process involves several key aspects:
1.3.1 Data Analysis and Purpose of Visualization:
• The process of designing a new visualization starts with an analysis of the available data
and the information to be conveyed.
• Data can be diverse in origin and complexity.
• Visualization can be used for exploration, hypothesis confirmation or presentation of
analysis results.
• Notable results may include anomalies, clusters and trends, each serving different
purposes in data analysis.
1.3.2 Mapping Data to Display
• Data visualization requires defining a mapping from data to the display. Various methods
can be used for this mapping.
• Data must be mapped to graphical representations on the display.
• Data values or attributes are used to define graphical objects, like points, lines and
shapes.
• Attributes such as size, position, orientation and colour are employed for visual
representation.
• Different data-to-visual mappings can be applied, like plotting numbers as points or
bars.
• The choice of mapping determines how effectively the data is visualized.
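
A data-to-display mapping of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: the three-column records are made up, the names scale and map_records are hypothetical, and the choice of which column drives position and which drives size is arbitrary.

```python
def scale(value, lo, hi, out_lo=0.0, out_hi=1.0):
    """Linearly rescale value from [lo, hi] to [out_lo, out_hi]."""
    if hi == lo:
        return out_lo  # constant column: avoid division by zero
    return out_lo + (value - lo) / (hi - lo) * (out_hi - out_lo)

def map_records(records):
    """Map each (a, b, c) record to a point glyph: a -> x, b -> y, c -> radius."""
    columns = list(zip(*records))
    ranges = [(min(col), max(col)) for col in columns]
    glyphs = []
    for a, b, c in records:
        (a_lo, a_hi), (b_lo, b_hi), (c_lo, c_hi) = ranges
        glyphs.append({
            "x": scale(a, a_lo, a_hi),                   # position in the unit square
            "y": scale(b, b_lo, b_hi),
            "radius": scale(c, c_lo, c_hi, 0.01, 0.05),  # size encodes the third column
        })
    return glyphs

# Made-up records, for illustration only.
records = [(1.0, 10.0, 100.0), (2.0, 30.0, 300.0), (3.0, 20.0, 200.0)]
glyphs = map_records(records)
for glyph in glyphs:
    print(glyph)
```

Swapping the roles of the columns (for example, letting the third column drive x instead of radius) gives a different picture of the same records, which is the sense in which the mapping is a design choice.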
1.3.3 User Interface and Interactivity
• Interactive controls are crucial for viewing and mapping variables in modern
visualization.
• The user interface involves components for
o Data entry
o Presentation
o Monitoring
o Analysis
o Computation
• Dialog boxes or visual representations of data facilitate user interaction.
• Modern visualizations prioritize interactivity, allowing users to control data selection,
mapping, colour schemes and view refinement.
• User input and control play a significant role: early visualizations were static, while
modern ones are dynamic.
1.3.4 Customization and Refinement
• The effectiveness of a visualization depends on user backgrounds, abilities, preferences
and the specific task.
• Users should be able to customize, modify and interactively refine visualizations to
achieve their goals and effectively represent data patterns.

1.3.5 Integration into Larger Processes

• Visualization is often a part of broader processes like exploratory data analysis,
knowledge discovery and visual analytics.
• These processes involve data preparation, which includes tasks like cleaning erroneous or
noisy data.
• Visualization and data analysis work together to create models that represent or
approximate the data.
• Visualization in data exploration is used to communicate information, discover new
knowledge, and identify structure, patterns, anomalies, trends and relationships.

1.3.6 Pipeline Model
• The process of transforming data into images, visualizations or models is traditionally
described as a pipeline, involving different stages.
• These processes vary between graphics, visualization and knowledge discovery but
share commonalities.

They all start with data and ultimately serve the user’s needs and objectives.

THE COMPUTER GRAPHICS PIPELINE:

The computer graphics stages are as follows:

Modelling:

• A three-dimensional model, consisting of planar polygons defined by vertices and
surface properties, is generated using a world coordinate system.

Viewing:

• A virtual camera is defined at a location in world coordinates, along with a direction
and orientation (given as vectors).
• All vertices are transformed into a viewing coordinate system based on the camera
parameters.

Clipping:

• Defining the boundaries of the intended image, typically based on the corner positions
of a projection plane positioned in front of the camera, enables the removal of objects
not within view and clipping of partially visible objects.
• Objects can be converted to normalized viewing coordinates to streamline the clipping
process, which can occur at various stages of the pipeline.

Hidden surface removal:

• Polygons facing away from the camera, or those obscured by others, are removed or
clipped.
• This process may be integrated into the projection process.
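
The first case, polygons facing away from the camera, is commonly handled by back-face culling. The sketch below is one simple way to do it, assuming triangles whose vertices are listed counter-clockwise when seen from the front; the helper names are illustrative. It does not handle polygons obscured by other polygons, which needs a separate technique such as depth sorting or a depth buffer.

```python
def sub(a, b):
    """Component-wise difference a - b of two 3D vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def is_back_facing(triangle, camera_pos):
    """True if the triangle's front side points away from the camera.

    Assumes counter-clockwise vertex order marks the front face.
    """
    v0, v1, v2 = triangle
    normal = cross(sub(v1, v0), sub(v2, v0))  # front-face normal
    to_camera = sub(camera_pos, v0)           # from the surface towards the camera
    return dot(normal, to_camera) <= 0

# A triangle in the z = 0 plane whose front faces +z.
triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(is_back_facing(triangle, (0.0, 0.0, 5.0)))   # camera in front of the triangle
print(is_back_facing(triangle, (0.0, 0.0, -5.0)))  # camera behind the triangle
```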

Projection:

• Three-dimensional polygons are projected onto the two-dimensional plane of
projection, usually using a perspective transformation.
• The results may be in a normalized 2D coordinate system or device/screen coordinates.
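
A perspective transformation can be sketched for a single point as follows. The setup is an assumption made for illustration: the camera sits at the origin of viewing coordinates and looks along +z, the projection plane lies at distance plane_distance, and normalized coordinates span -1 to 1 on both axes. The function names are hypothetical.

```python
def project_point(point, plane_distance=1.0):
    """Perspective-project a point in viewing coordinates onto the plane z = plane_distance."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    # Similar triangles: the further away the point, the closer it lands to the centre.
    return (plane_distance * x / z, plane_distance * y / z)

def to_screen(ndc, width, height):
    """Map normalized coordinates in [-1, 1] to pixel coordinates, origin at top-left."""
    x, y = ndc
    return ((x + 1.0) / 2.0 * width, (1.0 - y) / 2.0 * height)

near = project_point((2.0, 4.0, 2.0))
far = project_point((2.0, 4.0, 4.0))   # same x and y, twice as far away
centre = to_screen((0.0, 0.0), 640, 480)
print(near, far, centre)
```

Doubling the depth halves the projected coordinates, which is the foreshortening effect that distinguishes a perspective projection from a parallel one.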

Rendering:

• The actual colour of the pixels associated with a visible polygon depends on a number
of factors, such as:
o Material properties being synthesized (base colour, texture, surface roughness,
shininess),
o Type, location, colour and intensity of the light source,
o Degree of occlusion from direct light exposure,
o Amount and colour of light being reflected off other objects onto the
polygons.
• This process may also be applied at different stages of the pipeline. However, due to its
computational complexity, it is usually performed simultaneously with projection.

Ray Tracing:

It is a variant of the pipeline that involves casting rays from the camera through the plane of
projection to determine which polygons are hit.

For reflective or translucent surfaces, secondary rays can be generated upon intersection
with the surface, and the results are accumulated.

The key algorithms involved include

• determining where rays intersect surfaces,
• determining the orientation of the surface at the intersection point, and
• the mechanism for combining the effects of secondary rays.
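
The first two of these algorithms can be sketched for a single triangle. The intersection test below follows the well-known Möller-Trumbore method, which is one common choice and not necessarily the one the original notes have in mind; the mirror-reflection helper shows how a secondary ray direction could be derived. All function names are illustrative.

```python
EPS = 1e-9

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def ray_triangle(origin, direction, triangle):
    """Return (t, normal) where the ray origin + t * direction hits the triangle, else None."""
    v0, v1, v2 = triangle
    edge1, edge2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < EPS:                 # ray is parallel to the triangle's plane
        return None
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det      # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    if t <= EPS:                       # intersection lies behind the ray origin
        return None
    return t, cross(edge1, edge2)      # surface orientation (unnormalized normal)

def reflect(direction, unit_normal):
    """Direction of a mirror-reflected secondary ray; unit_normal must have length 1."""
    k = 2.0 * dot(direction, unit_normal)
    return tuple(d - k * n for d, n in zip(direction, unit_normal))

triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), triangle)
miss = ray_triangle((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), triangle)
bounce = reflect((0.0, 0.0, -1.0), (0.0, 0.0, 1.0))
print(hit, miss, bounce)
```

How the colours returned by secondary rays are weighted and accumulated depends on the material model, so that third step is left out of this sketch.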

THE VISUALIZATION PIPELINE

The data or information visualization pipeline exhibits certain parallels with the graphics
pipeline, particularly at a conceptual level.

Data selection:

• Like clipping, data selection entails pinpointing the portion of data that could
potentially be visualized.
• This selection can be either entirely user-driven or accomplished through algorithmic
approaches, such as cycling through temporal segments or autonomously detecting
user-relevant features.

Data to visual mappings:

• The core of the visualization pipeline revolves around the process of associating data
values with graphical entities or their attributes.
• Consequently, a specific component of a data record might determine the size of an
object, while others could influence the object’s position or colour.
• This mapping frequently requires preprocessing of the data before the mapping itself,
which can include operations like scaling, shifting, filtering, interpolating or
subsampling.

Scene parameter setting (view transformations):

• Similar to traditional graphics, certain attributes of the visualization need to be
specified by the user, and these attributes are relatively detached from the data.
• These attributes encompass the choice of a colour map, where specific colours may
have well-defined meanings in different domains.
• They also include the selection of a sound map when auditory channels are used.
• They further include the lighting specifications required for 3D visualizations.

Rendering or generation of the visualization:

• The projection and rendering of visualization objects differ depending on the mapping
method employed.
• Various techniques like shading or texture mapping may be utilized, although some
visualization approaches simply involve drawing lines and uniformly shaded polygons.
• In addition to presenting the data, most visualizations incorporate extra details to aid
interpretation, such as axes, legends and annotations.

THE KNOWLEDGE DISCOVERY PIPELINE

The knowledge discovery field, also called data mining, has its own pipeline. Similar to the
pipelines used in graphics and visualization, the process commences with raw data, which is
then processed with the goal of constructing a model, rather than producing
visual graphics.

The visualization pipeline can be integrated with the knowledge discovery (KD) pipeline.

Data:

• The KD pipeline places a stronger emphasis on data.
• Graphics and visualization processes usually assume that the data is pre-structured
for display.

Data integration, cleaning, warehousing and selection:

• The initial step involves identifying potential data sets for analysis.
• User involvement is possible during this step.
• Data curation and management techniques such as filtering, sampling, subsetting
and aggregation may be employed to prepare the data for the data mining process.

Data mining:

• The core of the KD pipeline revolves around the algorithmic analysis of data to
generate a model.

Pattern evaluation:

• The generated model or models need to undergo evaluation to determine their
robustness, stability, precision and accuracy.

Rendering or visualization:

• The particular outcomes must be communicated to the user.
• Whether or not this step is an integral part of the graphics or visualization pipelines,
the essential point is that the user will eventually need to review the process’s results.

The Role of Perception

• In all visualizations, understanding the capabilities and limitations of the human visual
system is crucial when considering the user experience.
• When the objective of visualization is to convey information accurately through images, it
is imperative to take perceptual abilities into account.
• While a well-drawn picture can be engaging, it’s undesirable to introduce ambiguities, as
demonstrated by Shepard’s many-legged elephant.

Fig: Shepard’s many-legged elephant

• For instance, when examining a cluster of slightly spaced black squares, observe the
effects as you gaze at them.
• Although there are no actual moving black dots at the intersections of the white lines,
such a presentation of data would create visual instability.
• Hence, it’s not logical to map a variable to a graphical attribute that humans have
difficulty controlling or quantifying if the goal is to accurately communicate a numeric
value.

Understanding human visual perception is important in creating effective visual
displays. A significant portion of the human brain processes visual information in a
continuous and parallel manner, recognizing primitive attributes like
texture, colour and motion.

The Role of Cognition

The visualization pipeline focuses on creating visualizations for users and their tasks. It
aims to understand what users notice, understand, forget or remember in a visualization. It
also considers how long they can remember this information. Answering these questions
requires us to go beyond perception and into cognition.

1.4 Pseudocode Conventions

The following global variables and functions are used in the pseudocode:
• data
o The working data table.
o It contains only numeric values.
o In general, any parts of the data table that hold words or other non-numeric values
must first be converted into numbers.
• m
o The number of dimensions (columns) in the working data table.
o Dimensions are typically iterated over using j as the running dimension index.
• n
o The number of records (rows) in the working data table.
o Records are iterated over using i as the running record index.
• NORMALIZE (record, dimension), NORMALIZE (record, dimension, min, max)
o A function that transforms data values in a data table to a range between a
minimum and maximum value.
o If no minimum and maximum are specified, it maps the value into the range
0 to 1.
o The normalization is usually linear and local to a single dimension.
o The code should be written so that any kind of normalization can be used: local,
global, or local to the active dimensions.
o This helps in creating visualizations where different parts may need different types
of scaling.
• COLOR (color)
o A function that changes the colour setting of the graphics environment to a
specified colour, where the colour is represented as an integer containing RGB
values.
• MAPCOLOR (record, dimension)
o A function that sets the colour of the graphics environment to the colour
obtained by applying the global colour map to the normalized value of the
specified record and dimension in the working data table.
• CIRCLE (x, y, radius)
o A function that fills a circle centered at the given (x, y) location, with
▪ the given radius and
▪ the colour of the graphics environment.
o The plotting space for visualization is the unit square.
o This function must map the unit square to a square in pixel coordinates.
• POLYLINE (xs, ys)
o A function that draws a polyline (many connected line segments) from the given
arrays of x- and y-coordinates.
• POLYGON (xs, ys)
o A function that fills the polygon defined by the given arrays of x- and y-coordinates
with the colour of the current colour state.
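
Some of these conventions can be sketched in Python. This is a rough illustration under stated assumptions, not the reference implementation: the lowercase names stand in for NORMALIZE and MAPCOLOR, the sample table is made up, the 0xRRGGBB integer layout and the greyscale colour map are assumptions, and unit_to_pixel is a hypothetical helper for the unit-square mapping that CIRCLE needs.

```python
# Illustrative working data table: n records (rows) by m dimensions (columns).
data = [[1.0, 200.0], [3.0, 400.0], [5.0, 300.0]]
n, m = len(data), len(data[0])

def normalize(record, dimension, lo=0.0, hi=1.0):
    """Linearly rescale data[record][dimension] to [lo, hi], local to one dimension."""
    column = [data[i][dimension] for i in range(n)]
    col_min, col_max = min(column), max(column)
    if col_max == col_min:
        return lo  # constant column: avoid division by zero
    return lo + (data[record][dimension] - col_min) / (col_max - col_min) * (hi - lo)

def pack_rgb(r, g, b):
    """Pack three 8-bit channels into one integer (assumed layout: 0xRRGGBB)."""
    return (r << 16) | (g << 8) | b

def map_color(record, dimension):
    """Apply an assumed greyscale colour map: 0 maps to black, 1 maps to white."""
    level = round(normalize(record, dimension) * 255)
    return pack_rgb(level, level, level)

def unit_to_pixel(x, y, width, height):
    """Map a point in the unit square to a centred-free square of pixels, y axis flipped.

    The square uses the smaller of width and height and starts at pixel (0, 0).
    """
    side = min(width, height)
    return (round(x * (side - 1)), round((1.0 - y) * (side - 1)))

print(normalize(1, 0), normalize(2, 1, 10.0, 20.0))
print(hex(map_color(2, 0)), unit_to_pixel(0.0, 0.0, 101, 101))
```

A global variant of normalize would take the minimum and maximum over the whole table instead of a single column; which one is appropriate depends on whether the dimensions share a unit.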
For geographic visualizations, the following functions are assumed to exist in the
environment:
• GETLATITUDES (record), GETLONGITUDES (record)
o Functions that obtain the arrays of latitude and longitude coordinates for the
geographic polygon linked with the given record.
o Example: outlines of the countries of the world.
• PROJECTLATITUDES (lats, scale), PROJECTLONGITUDES (longs, scale)
o Functions that project
▪ arrays of latitude values to arrays of y values, and
▪ arrays of longitude values to arrays of x values.
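
The notes do not say which map projection these functions use. As a minimal sketch, the code below assumes a simple equirectangular projection, in which latitude and longitude are rescaled linearly, and treats scale as the size of the output range; both are assumptions made for illustration.

```python
def project_latitudes(lats, scale=1.0):
    """Map latitudes in [-90, 90] degrees linearly to y values in [0, scale]."""
    return [scale * (lat + 90.0) / 180.0 for lat in lats]

def project_longitudes(longs, scale=1.0):
    """Map longitudes in [-180, 180] degrees linearly to x values in [0, scale]."""
    return [scale * (lon + 180.0) / 360.0 for lon in longs]

# A made-up rectangular outline, given as latitude and longitude arrays.
ys = project_latitudes([-90.0, 0.0, 90.0])
xs = project_longitudes([-180.0, 0.0, 180.0], scale=2.0)
print(xs, ys)
```

The resulting x and y arrays are in the form that POLYGON (xs, ys) expects, so a country outline could be filled directly after projection.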

For graph and 3D surface data sets, the following is added:

• GETCONNECTIONS (record)
o A function that retrieves an array of record indices to which the given record is
connected.
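
One way to back such a function is an adjacency list built once from an edge list. The sketch below assumes undirected connections and a made-up set of edges; build_adjacency and get_connections are illustrative names.

```python
def build_adjacency(record_count, edges):
    """Build a list where entry i holds the sorted indices of records connected to i."""
    adjacency = [[] for _ in range(record_count)]
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)  # undirected: store the connection in both directions
    return [sorted(neighbours) for neighbours in adjacency]

# Hypothetical graph with four records and three connections.
EDGES = [(0, 1), (0, 2), (2, 3)]
ADJACENCY = build_adjacency(4, EDGES)

def get_connections(record):
    """Return the indices of all records connected to the given record."""
    return ADJACENCY[record]

print(get_connections(0), get_connections(3))
```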
