Data Visualization Techniques
Bar Charts
Scatter Plots
Pie Charts
Visualization Velocity
Decision Trees
Mobile
Conclusion
the best possible visual based on the data that is selected.
The visualizations make it easy to see patterns and trends
Bar Charts
Bar charts are most commonly used for comparing the quantities of different categories or groups. Values of a category are represented using the bars, and they can be configured with either vertical or horizontal bars, with the length or height of each bar representing the value.

When values are distinct enough that differences in the bars can be detected by the human eye, you can use a simple bar chart. However, when the values (bars) are very close together or there are large numbers of values (bars) that need to be displayed, it becomes more difficult to compare the bars to each other.

To help provide visual variance, bars can have different colors. The colors can be used to indicate such things as a particular status or range. Coloring the bars works best when most bars are in a different range or status. When all bars are in the same range or status, the color becomes irrelevant, and it is most visually helpful to keep the color consistent or have no coloring at all.

Another form of a bar chart is called the progressive bar chart, or waterfall chart. A waterfall chart shows how the initial value of a measure increases or decreases during a series of operations or transactions (see Figure 3). The first bar begins at the initial value, and each subsequent bar begins where the previous bar ends. The length and direction of a bar indicate the magnitude and type (positive or negative, for example) of the operation or transaction. The resulting chart is a stepped cascade that shows how the transactions or operations lead to the final value of the measure.

Figure 3: This bar graph, a waterfall chart, is used to represent the relative contribution of each category to the total.
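The waterfall rule described above (each bar begins where the previous bar ends) can be sketched in a few lines of Python. This is only an illustration of the geometry, not how SAS Visual Analytics builds the chart; the function name `waterfall_segments` and the sample figures are hypothetical.

```python
def waterfall_segments(initial, changes):
    """Return (label, start, end) for each bar of a waterfall chart.

    The first bar starts at the initial value; every later bar starts
    where the previous one ended. A negative change gives end < start.
    """
    segments = []
    running = initial
    for label, delta in changes:
        segments.append((label, running, running + delta))
        running += delta
    return segments


# Hypothetical transactions, for illustration only.
bars = waterfall_segments(100, [("sales", 40), ("refunds", -15), ("fees", -5)])
for label, start, end in bars:
    direction = "up" if end >= start else "down"
    print(f"{label:8s} {start:>5} -> {end:>5} ({direction})")

# The end of the last bar is the final value of the measure.
final_value = bars[-1][2]
```

The length of each bar is `abs(end - start)` and its direction is the sign of the change, which matches the "magnitude and type" wording in the text.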
Scatter Plots
A scatter plot (or X-Y plot) is a two-dimensional plot that shows the joint variation of two data items. In a scatter plot, each marker (symbols such as dots, squares and plus signs) represents an observation. The marker position indicates the value for each observation. Scatter plots also support grouping. When you assign more than two measures, a scatter plot matrix is produced. A scatter plot matrix is a series of scatter plots that displays every possible pairing of the measures that are assigned to the visualization.

In a scatter plot, you can also apply statistical analysis with correlation and regression. Correlation identifies the degree of statistical correlation between the variables in the plot. Regression plots a model of the relationship between the variables in the plot.

Once you have plotted all of the data points using a scatter plot, you are able to visually determine whether data points are related. Scatter plots can help you gain a sense of how spread out the data might be or how closely related the data points are, as well as quickly identify patterns present in the distribution of the data (see Figure 4). Scatter plots are helpful when you have many data points. If you are working with a small set of data points, a bar chart or table may be a more effective way to display the information.

Bubble Plots: A Scatter Plot Variation
A bubble plot is a variation of a scatter plot in which the markers are replaced with bubbles. In a bubble plot (see Figure 5), each bubble represents an observation. The location of the bubble represents the value for two measured axes; the size of the bubble represents the value for a third measure. These plots are useful for data sets with dozens to hundreds of values or when the values differ by several orders of magnitude. You can also use a bubble plot when you want specific values to be represented by different bubble sizes. Animated bubble plots are a good way to display changing data over time.

Figure 4: A scatter plot is a good way to visualize relationships in data.
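To make the correlation and regression overlays on a scatter plot concrete, here is a minimal Python sketch. The text does not say which statistics the product uses, so the Pearson coefficient and an ordinary least-squares line are assumptions chosen because they are the most common defaults; the function names are hypothetical.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return the means and the centered sums used by both statistics."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(xs, ys):
    """Degree of linear correlation between two measures, from -1 to 1."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares (slope, intercept) of the line drawn through the points."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical observations, for illustration only.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
r = pearson_r(xs, ys)
slope, intercept = fit_line(xs, ys)
```

Both functions assume at least two observations and that neither measure is constant; a production version would need to guard against those cases.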
next to each other. If you do use pie charts, they are most effective when there are limited components and when text and percentages are included to describe the content. By providing additional information, information consumers do not have to guess the meaning and value of each slice. If you choose to use a pie chart, the slices should be a percentage of the whole (see Figure 6). When designing reports or

Figure 6: A pie chart helps you compare the percentages of different components.
Figure 7: Alternatives to pie charts include line charts and bar charts.
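The advice that each slice be a percentage of the whole, labeled with text and a percentage, can be illustrated with a short Python sketch. The function names and the sample categories are hypothetical.

```python
def slice_percentages(values):
    """Map each component to its share of the total, in percent."""
    total = sum(values.values())
    return {name: 100 * amount / total for name, amount in values.items()}


def slice_labels(values, decimals=1):
    """Build 'name: percent' labels so readers need not guess slice sizes."""
    return [
        f"{name}: {pct:.{decimals}f}%"
        for name, pct in slice_percentages(values).items()
    ]


# Hypothetical component totals, for illustration only.
revenue = {"North": 50, "South": 30, "West": 20}
labels = slice_labels(revenue)
```

Because every slice is divided by the same total, the percentages always add up to 100, which is the property that makes a pie chart readable.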
Of course, there are many other chart types you can use to present data and analytical results. The selection of charts usually will depend upon the number of categories and measures (or dimensions) you want to visualize. Even after following the tips outlined here and studying the examples, you may need to try different types of visuals and test them with your audience to make sure the correct information is being conveyed.

Visualizing Big Data
Big data brings new challenges to visualization because of the speed, size and diversity of data that must be taken into account. The cardinality of the columns you are trying to visualize should also be considered. One of the most common definitions of big data is data that is of such volume, variety and velocity that an organization must move beyond its comfort zone technologically to derive intelligence for effective decisions.

Volume refers to the size of the data.
Variety describes whether the data is structured, semistructured or unstructured.
Velocity is the speed at which data pours in and how frequently it changes.

Building upon basic graphing and visualization techniques, SAS Visual Analytics has taken an innovative approach to addressing the challenges associated with visualizing data. Using innovative, in-memory capabilities combined with SAS Analytics and data discovery, SAS provides new techniques based on core fundamentals of data analysis and the presentation of results.
Large Data Volumes
One challenge when working with big data is how to display results of data exploration and analysis in a way that is meaningful and not overwhelming. You may need a new way to look at the data that collapses and condenses the results in an intuitive fashion but still displays graphs and charts that decision makers are accustomed to seeing. You may also need to make the results available quickly via mobile devices, and provide users with the ability to easily explore data on their own in real time.

data you wish to examine and then, based on the amount of data and the type of data, it presents the most appropriate visualization.

Figure 9: This box plot compares the distribution of data points within a category.
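A box plot is one way to "collapse and condense" a large number of points into a familiar graphic, because it draws only five summary numbers per category. The Python sketch below shows that reduction. It is an illustration only: quartile conventions differ between tools, and the "inclusive" method of the standard library's `statistics.quantiles` is an assumption, not the method used in Figure 9.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for one category."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Hypothetical measurements, for illustration only.
summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

However many rows a category contains, the plot still needs only these five values, which is why the technique scales to large data volumes.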
Different Varieties of Data (Semistructured and Unstructured)
Data variety brings challenges because semistructured and unstructured data require new visualization techniques. A word cloud visual (where the size of the word represents its frequency within a body of text) can be used on unstructured data as a way to display high- or low-frequency words (see Figure 10).

SAS Visual Analytics takes the concept of word clouds a step further. It takes advantage of taxonomies and ontologies to make associations and then organizes words into topics based on how the words are being used. SAS Visual Analytics word clouds display the hot topics of the day gleaned from this text analysis. End users can drill down by clicking on the individual topic to see exactly what words or phrases comprise a particular topic.

Another visualization technique that can be used for semistructured or unstructured data is the network diagram. Network diagrams view relationships in terms of nodes (representing individual actors within the network) and ties (which represent relationships between the individuals, such as friendship, kinship, organizations, business relationships, etc.). These networks are often depicted in a diagram where nodes are represented as points and ties are represented as lines.

Network diagrams can be used in many applications and disciplines. For example, businesses analyze social networks to understand their interactions with customers, while counterintelligence and law enforcement might map a clandestine or covert organization such as an espionage ring, an organized crime family or a street gang. You can also superimpose the network diagram on a map, for example, to show the relationship or product sales across geographic areas (see Figure 11). Word clouds and network diagrams are currently available in solutions such as SAS Text Miner and SAS Social Media Analytics.
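Returning to the basic word cloud idea, where the size of a word represents its frequency, the mapping from text to font sizes can be sketched in Python as follows. This covers only the plain frequency version, not the taxonomy-based topic clouds described for SAS Visual Analytics. The function name, the size range and the linear scaling are assumptions, and common "stop words" are deliberately not filtered out.

```python
import re
from collections import Counter


def word_sizes(text, min_size=10, max_size=48):
    """Map each word to a font size proportional to how often it occurs."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    scale = (max_size - min_size) / span
    return {word: min_size + (count - lowest) * scale for word, count in counts.items()}


# Hypothetical text, for illustration only.
sizes = word_sizes("big data big charts big data")
```

The most frequent word receives `max_size` and the least frequent receives `min_size`; everything else falls in between.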
Figure 10: A word cloud shows the words or phrases associated with a topic.

For example, you could use the topic cloud to categorize customer comments on Twitter about your products or services and then click on a topic to drill down to see the actual comments.

While visualizing structured data is fairly simple, semistructured or unstructured data requires new visualization techniques, such as

Visualization Velocity
Velocity is all about the speed at which data is coming into the organization. The ability to access and process varying velocities of data quickly is critical. A correlation matrix combines big data and fast response times to quickly identify which variables are related. It also shows how strong the relationships are between variables. SAS Visual Analytics makes it easy to assess the relationships. Simply select a group of variables and drop them into a visualization pane. The intelligent autocharting function displays a color-coded correlation matrix that quickly identifies strong and weak relationships between the variables. Darker boxes indicate a stronger correlation; lighter boxes indicate a weaker correlation. If you hover over a box, a summary of the relationship is shown. You can double-click on a box in the matrix for further details.

Figure 12 displays 45 correlation calculations on slightly more than 1.1 billion rows of data. This graph shows the correlation values, and returns results in two to six seconds using the SAS LASR Analytic Server. Previously, this type of calculation would take hours. Now it can be done in seconds. By using box plots and correlation matrices, SAS Visual Analytics can help speed up your analytics life cycle because analytical modelers can perform variable reductions more quickly and efficiently.

Figure 12: In this correlation matrix, darker boxes indicate a stronger correlation; lighter boxes indicate a weaker correlation. You can double-click on a box for further details.
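A correlation matrix holds one calculation per pair of variables. The Python sketch below enumerates those pairs and computes a coefficient for each. It is a small in-memory illustration, not the SAS LASR Analytic Server implementation. The Pearson coefficient is an assumption, and so is the reading that 45 calculations correspond to 10 variables (10 x 9 / 2 = 45); the text does not state how many variables Figure 12 uses.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_pairs(columns):
    """Return {(name_a, name_b): r} for every distinct pair of columns."""
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical columns, for illustration only.
columns = {"x": [1, 2, 3, 4], "y": [2, 4, 6, 8], "z": [4, 3, 2, 1]}
matrix = correlation_pairs(columns)

# Shade by strength: the absolute value decides how dark a box is drawn.
strength = {pair: abs(r) for pair, r in matrix.items()}

# Ten variables would produce 45 distinct pairs.
pairs_for_ten = len(list(combinations(range(10), 2)))
```

Using the absolute value for shading means a strong negative relationship is drawn as dark as a strong positive one, which matches the "darker is stronger" convention in the caption.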
Figure 14: This overview axis bar chart shows the high
cardinality in this big data more clearly. You can easily scroll
through the entire chart.
Cardinality becomes a concern in big data because the data may have many unique values per column. The example in Figure 13 shows only 128 unique cities. Because you cannot see the labels for each bar, the graph becomes less meaningful. Imagine if you had a million bars! It would be impossible to see them.

SAS has adopted a method for dealing with high cardinality in SAS Visual Analytics bar charts that provides an overview bar that zooms into the bar chart and enables information consumers to scroll through the entire chart. The level of zoom can also be controlled. If you compare Figure 13 to Figure 14, it is easy to see that Figure 14 presents the information more clearly.

But what if the filter isn't meaningful or it skews the data in undesirable ways? One way to better understand the composition of your data is through the use of histograms. Histograms provide a visual distribution of the data along with cues for how the data will change if you filter on a particular measure. Histograms save time by giving you an idea of the effect the filter will have on the data before you apply it. Rather than relying on trial and error or instinct, you can use the histogram to help you decide what to focus on.

Data Visualization Made Easy With Autocharting
In SAS Visual Analytics, intelligent autocharting produces the best visual based on what data you drag and drop onto the visual palette. It is important to note that autocharting may not always create the exact visualization you had in mind. In that case, you also can select a specific visual to build. However, when you are first exploring a new data set, autocharts are useful because they provide a quick view of the data. You then have the ability to switch to another specific visual as desired. For example, with autocharting, when a single measure is selected, distribution of that measure is shown (Figure 15).

also required and the visual will be a line graph (see Figures 1 and 2). If the category is geographic, then a map will be displayed (see Figure 18).
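The distribution of a single measure, and the way a histogram previews the effect of a filter before you apply it, can be sketched in Python. The fixed-width binning and the function names are assumptions made for illustration; the text does not describe how the product chooses its bins.

```python
def histogram(values, bin_width, start=0):
    """Count values per bin; keys are the left edge of each bin."""
    counts = {}
    for value in values:
        left_edge = start + ((value - start) // bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))


def share_kept(values, minimum):
    """Fraction of observations that would survive a 'value >= minimum' filter."""
    return sum(1 for value in values if value >= minimum) / len(values)


# Hypothetical measure values, for illustration only.
measure = [1, 4, 5, 9, 12, 14]
bins = histogram(measure, bin_width=5)
kept = share_kept(measure, minimum=5)
```

Reading the bin counts to the right of a candidate cutoff gives the same answer as `share_kept`, which is the "cue" the histogram provides without trial and error.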
Can You See Into the Future?
Forecasting estimates future values for your data based on statistical trends. As such, it is an extremely important tool for organizational planning. Fortunately, SAS Visual Analytics can help you expand the culture of forecasting in your organization. Easy-to-use capabilities take the complexity out of forecasting, so that users of all skill levels can see for themselves what might happen in the future.

A simple menu guides users through the process of generating forecasting results. Select the date, time or date-time data items you want to use for the forecast. The software automatically chooses the most appropriate forecasting algorithm for the data chosen. You also have the option to select the forecasting intervals. When you click OK, a line chart is created, along with a clear explanation of the forecasting results in the "What does it mean?" section at the bottom of the screen, as shown in Figure 22. This is just another way SAS Visual Analytics brings advanced analytics to nontechnical users in an approachable format.

When additional measures are added to the forecast (as shown in Figure 23), three things happen in SAS Visual Analytics:
1. Each variable is evaluated to determine whether it influences the forecast. Variables deemed to be influencers are added to the bottom of the screen for simulation purposes.
2. When influencers are found, the forecast is recalculated and refined. As you can see, the confidence interval (light blue bars) around the forecast in Figure 23 is much tighter than in Figure 22.
3. Users can manipulate the values of the influencing variables to see the potential impact on the forecast, in effect performing simulations.
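The idea of software choosing "the most appropriate forecasting algorithm" automatically can be illustrated with a toy model-selection loop in Python. This is not the SAS procedure: the two candidate models (repeat the last value, or extend the average step), the holdout comparison and all names below are assumptions chosen to keep the sketch small.

```python
def naive_forecast(history, steps):
    """Repeat the last observed value."""
    return [history[-1]] * steps


def drift_forecast(history, steps):
    """Extend the average step between the first and last observation."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (i + 1) for i in range(steps)]


CANDIDATES = {"naive": naive_forecast, "drift": drift_forecast}


def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def choose_model(series, holdout=3):
    """Pick the candidate with the lowest error on the last `holdout` points."""
    train, test = series[:-holdout], series[-holdout:]
    errors = {
        name: mean_abs_error(test, model(train, holdout))
        for name, model in CANDIDATES.items()
    }
    return min(errors, key=errors.get)


def forecast(series, steps, holdout=3):
    """Return the chosen model's name and its forecast for the next `steps`."""
    name = choose_model(series, holdout)
    return name, CANDIDATES[name](series, steps)


# Hypothetical monthly values, for illustration only.
chosen, future = forecast([10, 12, 14, 16, 18, 20, 22, 24], steps=2)
```

The series needs at least two points left after the holdout is removed. A fuller version would also compare the spread of the holdout errors, which is one way to think about the confidence interval mentioned in the text.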
Figure 22: With automated forecasting capabilities, SAS Visual Analytics chooses the most appropriate forecasting algorithm for the selected data. "What does it mean?" (bottom) provides explanations of analytic functions and data correlations, so even nontechnical users can understand what the data means.
Figure 23: By adding additional measures, these underlying factors are evaluated as to their potential impact on the forecast, the forecast is recalculated accordingly, and users can use these additional values to perform simulations.
Mobile
Growing employee adoption of mobile devices means that businesses need to deliver company information to these devices at any time and from anywhere. SAS Visual Analytics comes with SAS Mobile BI to allow businesses to give front-line and mobile employees access to business intelligence. With SAS Mobile BI, employees can look at a vast array of different types of company business intelligence reports, KPIs and dashboards on their mobile devices. Rather than having to wait until they get back to the office, mobile users can quickly and easily gain a deeper analytical understanding of business performance.

Conclusion
Visualizing your data can be both fun and challenging. It is much easier to understand information in a visual compared to a large table with lots of rows and columns. However, with the many visually exciting choices available, it is possible that the visual creator may end up presenting the information using the wrong visualization. In some cases, there are specific visuals you should use for certain data. In other instances, your audience may dictate which visualization you present. In the latter scenario, showing your audience an alternative visual that conveys the data more clearly may provide just the information that's needed to truly understand the data.

You can choose the most appropriate visualization by understanding the data and its composition, what information you are trying to convey visually to your audience, and how viewers process visual information.

And products such as SAS Visual Analytics can help provide the best, fastest visualizations possible. The solution enables you to explore all of your data using visual techniques combined with industry-leading analytics. Visualizations such as box plots and correlation matrices help you quickly understand the composition and relationships in your data.

The net effect is the ability to accelerate the analytics life cycle and to perform the process more often, with more data. Users can quickly view more options, ask more questions, make more precise decisions and succeed faster than ever before.

Analytics, download white papers, view screenshots and see other related material, please visit sas.com/visualanalytics. Or, try SAS Visual Analytics for yourself at sas.com/vademos.
To contact your local SAS office, please visit: sas.com/offices
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. Copyright © 2014, SAS Institute Inc. All rights reserved.
106006_S120359.0514