MEAP Edition
Manning Early Access Program
Visualizing Graph Data
Version 7

Copyright 2016 Manning Publications

For more information on this and other Manning titles go to
www.manning.com

https://forums.manning.com/forums/visualizing-graph-data
welcome
Thank you for purchasing the MEAP for Visualizing Graph Data! This is an introductory-level
book aimed at data scientists who want to take better advantage of graphs by visualizing
them, and at web developers who want to create applications that include graph visualization.
Although many of the examples use JavaScript for creating interactive user interfaces around
graphs, knowledge of the language is not strictly required and I expect that a non-
programmer would get value from this book as well.
I am releasing the first three chapters to start.
Chapter 1 covers graphs more generally and will be helpful to someone interested in
understanding graph technology and how it might be useful in the context of software
applications.
Chapter 2 goes through some concrete examples of graphs and graph visualizations in
different industries to show the value of visualizations with more detail.
Chapter 3 introduces some of the tools that we use in the rest of the book, KeyLines and
Gephi, and explains when each might be helpful. I will also include a more technical appendix
that covers detailed implementations using some of the other popular visualization tools, such
as D3 or Sigma.js.
Please feel free to take advantage of the Author Online forum. I’ll be reviewing all the
feedback there and responding when I can. Your feedback is helpful in making sure the book
is accessible and useful during the development process.

—Corey Lanum
brief contents
PART 1: GRAPH VISUALIZATION BASICS
1 Introduction to Graph Visualization
2 Case Studies in Graph Visualization
3 An Introduction to Gephi and KeyLines
PART 2: VISUALIZE YOUR OWN DATA
4 Data Modeling
5 Engage Your Audience: How To Build Graph Visualizations
6 Creating Interactive Visualizations
7 Graph Layouts: How to Organize a Chart
8 Big Data: Using Graphs When There is Too Much Data
9 Dynamic Graphs: How to Show Data Over
10 Graphs on Maps: The Where of Graph Visualization
APPENDIXES: ALTERNATIVE VISUALIZATION TOOLS
A Visualize Scientific Data with Cytoscape
B Build your own visualizations using D3.js

©Manning Publications Co. We welcome reader comments about anything in the manuscript - other than typos and
other simple mistakes. These will be cleaned up during production of the book by copyeditors and proofreaders.
https://forums.manning.com/forums/visualizing-graph-data

1 Introduction to Graph Visualization

In December 2001, the Enron Corporation filed for what was at the time the largest-ever
corporate bankruptcy. Its stock had fallen from a high of $90 per share the previous year to
$0.61, decimating its employees’ pensions and shareholders’ investments in it. The FBI’s
investigation into this collapse became the largest white-collar criminal investigation in history
as they seized over 3000 boxes of documents and 4 terabytes of data. Among the information
seized were about 600,000 e-mails between key executives at the organization. Although the
FBI took pains to read every e-mail individually, the investigators recognized that they were
unlikely to find a smoking gun – people committing complex financial fraud don’t often
disclose their actions in a written form. And in 2001, e-mails were only starting to become the
primary means of internal communications; lots of information was still exchanged via phone
calls.
In addition to looking at the text of individual e-mails, the FBI also wanted to uncover
patterns in the communications, perhaps in an attempt to better understand who the decision
makers were within Enron or who had access to much of the company’s internal information.
To do this, they modeled the e-mails in Enron as a graph.
Before we get into how the FBI used graph visualization in their investigation leading to the
conviction of 24 Enron executives, it’s worth mentioning there are two types of people who
can benefit by reading this book. The first are web developers interested in building a graph-
visualization application or adding graph-visualization capability to an existing web application.
The included JavaScript samples will be particularly relevant for this group.
The second type is data scientists or data engineers who wish to learn more about graph-
visualization technology and how it can enhance their research. The coding samples may not
add much for this group; those sections can be skipped for these readers, who may find the
Gephi examples of prime interest.

Back to the FBI. A graph is a model of data that consists of nodes, which are discrete data
elements, and edges, which are relationships between nodes. The graph model brings to the
forefront relationships that may be hidden in tabular views of the same data and illustrates
what is most important. By making those relationships between the data elements a core part
of the data structure, it helps you identify patterns in the data that wouldn’t otherwise be
apparent. But building graph data structures is only half the solution to pattern recognition.
This book will teach you how to visualize graphs using interactive node-link visualization
diagrams, and by the end, you’ll be able to create your own dynamic, interactive visualizations
using a variety of tools available today.
In this chapter, we’ll go a little deeper into the concept of a graph, its history and uses,
and talk about various techniques used to visualize graph data. Subsequent chapters will build
on this framework by introducing concrete examples of graph visualizations and the data they
are based on and discuss various techniques for creating useful visualizations.

1.1 Get To Know Graphs


Graphs are everywhere. As long as you’re interested in how items can be related to one
another, there’s a graph somewhere in your data. In this section, I’ll walk you through what a
graph is and what can be gained from visualizing one.

1.1.1 What Is a Graph


As described above, a graph is a set of interconnected data elements expressed as a series of
nodes and edges (this book also refers to edges as links).
In the common definition of a graph, an edge has exactly two endpoints, no more. An edge
can take one of two forms:

• directed – the relationship has a direction. Stella owns the car, but it doesn’t make
sense to say the car owns Stella.
• undirected – the two items are linked without the concept of direction; the relationship
inherently goes both ways. If Stella is linked to Roger because they committed a crime
together, it means the same thing to say Stella was arrested with Roger as it does to say
Roger was arrested with Stella.

Below in figure 1.1, we see an example of a directed link with properties.

Figure 1.1: A property graph of a single e-mail between Enron executives. The two nodes are the sender and
recipient of the e-mail, and the edge is the e-mail

Both nodes and edges can have properties: key-value pairs that describe either the data
element itself or the relationship. Figure 1.2 is a simple property graph showing that Stella
bought a 2008 Volkswagen Jetta in September 2007 and sold it in October 2013. Modeling this
as a graph highlights that Stella had a relationship with this car, albeit a temporary one.

Figure 1.2: A simple property graph with two nodes and an edge. Stella (the first node) bought a 2008
Volkswagen Jetta (the second node) in September 2007 and sold it in October 2013. By modeling it as a graph,
it highlights that Stella had a relationship with this car (the edge).
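
The property graph described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class names Node and Edge, the property keys, and the "owned" relationship label are invented for the example and do not come from KeyLines, Gephi, or any other tool discussed in this book.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A discrete data element with key-value properties."""
    id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A relationship between exactly two nodes, with its own properties."""
    source: str
    target: str
    directed: bool = True
    properties: dict = field(default_factory=dict)


# The two nodes from the Stella example; property keys are assumptions.
nodes = {
    "stella": Node("stella", {"type": "person", "name": "Stella"}),
    "jetta": Node("jetta", {"type": "car", "name": "2008 Volkswagen Jetta"}),
}

# The edge carries the dates of the relationship as properties.
owned = Edge(
    "stella",
    "jetta",
    directed=True,
    properties={"relationship": "owned", "bought": "2007-09", "sold": "2013-10"},
)


def describe(edge, nodes):
    """Render an edge as text, using an arrow only when it is directed."""
    arrow = "->" if edge.directed else "--"
    source = nodes[edge.source].properties["name"]
    target = nodes[edge.target].properties["name"]
    return f"{source} {arrow}[{edge.properties['relationship']}] {target}"


print(describe(owned, nodes))
```

Keeping the purchase and sale dates on the edge rather than on either node reflects the point made above: the dates describe the relationship, not Stella or the car.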

An e-mail is a relationship, too, between the sender and the recipient. The properties of the
nodes are things like e-mail address, name, and title, and the properties of the relationship are
the date/time it was sent, its subject line, and the text of the e-mail.
To prove conspiracy, the FBI was interested in all the e-mails sent among the Enron
executives, not just a single one, so let’s add some more nodes to represent a larger number
of e-mails sent during a specified period of time as you can see in figure 1.3.

Figure 1.3: A graph of some of the Enron executives' e-mail communications. You can easily see that Timothy
Belden is a hub of communication in this segment of Enron, sending and receiving email from many other
executives

This is called a directed graph because it matters whether Kevin Presto sent an e-mail to
Timothy Belden or received one – there’s a big difference between sending and receiving
information when you’re investigating who knew what when. The arrowheads on the edges
show that directionality: Kevin Presto sent an e-mail to Timothy Belden, but Timothy Belden
did not reply, indicating they may not have been close associates, or they may have spoken
offline. As we start to add more data to our graph, you can see the value of graphs – patterns
become apparent. In the example above, you can easily see that Timothy Belden is a hub of
communication in this segment of Enron, sending e-mail to and receiving e-mail from many
other executives.
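
To make the idea of a hub concrete, here is a minimal sketch that counts sent and received e-mails in a directed graph. The edge list is hypothetical: only the e-mail from Kevin Presto to Timothy Belden comes from the text, and "Executive A" through "Executive C" are placeholders rather than real Enron data.

```python
from collections import Counter

# Directed edges as (sender, recipient) pairs. Only the first pair is taken
# from the text; the rest are invented placeholders for illustration.
emails = [
    ("Kevin Presto", "Timothy Belden"),
    ("Timothy Belden", "Executive A"),
    ("Timothy Belden", "Executive B"),
    ("Executive A", "Timothy Belden"),
    ("Executive C", "Timothy Belden"),
    ("Executive B", "Executive C"),
]

sent = Counter(sender for sender, _ in emails)            # outgoing links per person
received = Counter(recipient for _, recipient in emails)  # incoming links per person

# Total number of links touching each person, regardless of direction.
people = set(sent) | set(received)
total = {person: sent[person] + received[person] for person in people}

hub = max(total, key=total.get)
print(hub, total[hub])

# Direction matters: Presto wrote to Belden, but there is no edge back.
belden_replied = ("Timothy Belden", "Kevin Presto") in set(emails)
print(belden_replied)
```

Counting links is the simplest possible measure of importance; it is used here only because it matches what the eye picks out in figure 1.3.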

1.1.2 A bit of theory


Graph theory began in the early 18th century with the Seven Bridges of Königsberg problem. In
Königsberg, Prussia (now Kaliningrad, Russia), it was a common parlor game to try to
determine a route that would allow one to cross each of the seven bridges over the Pregel
River exactly once. (Go ahead and give it a shot using the map of the city in figure 1.4
below, and see if you can prove three centuries of mathematicians wrong.)

Figure 1.4: The Seven Bridges of Königsberg problem. Using this map of the bridges of Königsberg, Prussia, try
to draw a route that crosses every bridge exactly once.

Leonhard Euler proved this problem impossible by abstracting the regions of the city into
individual points and the bridges as paths between those points, as you can see in figure 1.5
below.

Figure 1.5: 7 Bridges and 4 Land Areas of Königsberg as a Graph. In this graph, nodes denote the land masses
bordering the Pregel River and two islands in its middle. Edges represent the bridges connecting the two islands
and two shorelines.

Each land area of Königsberg is indicated by a point, and the bridges are the lines that connect
the points. This is a graph, just like the Enron one. The graph model in figure 1.5 makes it
easier to see that nodes with an even number of links can be passed through easily (you can
enter and exit using two different links), but a node with an odd number of links can only be
either the beginning or the end of a path (this is obvious for a node with only one link, but you
can also see it applies to 3, 5, etc.). The number of links from a node is called that node’s degree.
A route that crosses every bridge exactly once is therefore possible only if either no nodes or
exactly two nodes have an odd degree and the rest have an even degree. The above diagram
doesn’t satisfy that condition, because all four land areas have an odd degree (5, 3, 3, and 3),
and therefore it’s impossible to cross each bridge only once. Graph theory has answered a
problem previously considered intractable.
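
The degree argument can be checked mechanically. The sketch below encodes the seven bridges as an edge list and counts each node's degree. The single-letter labels (N and S for the two banks, K for the central island, E for the eastern island) are my own shorthand, and the check assumes the graph is connected, which it is here.

```python
from collections import Counter

# Each tuple is one bridge between two land areas. Two areas can be joined
# by more than one bridge, so some pairs appear twice.
bridges = [
    ("N", "K"), ("N", "K"),   # north bank <-> central island
    ("S", "K"), ("S", "K"),   # south bank <-> central island
    ("N", "E"),               # north bank <-> eastern island
    ("S", "E"),               # south bank <-> eastern island
    ("K", "E"),               # central island <-> eastern island
]

# A node's degree is the number of bridge ends that touch it.
degree = Counter()
for a, b in bridges:
    degree[a] += 1
    degree[b] += 1

odd_nodes = sorted(node for node, d in degree.items() if d % 2 == 1)

# A walk using every edge exactly once needs zero or two odd-degree nodes
# (assuming the graph is connected).
route_exists = len(odd_nodes) in (0, 2)

print(dict(degree))
print(odd_nodes, route_exists)
```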

1.1.3 The Graph Data Model Introduced


Graphs are interesting mathematical constructs, and many academic mathematicians have
spent their entire careers studying the domain, but the purpose of this book is to describe how
graphs can be derived from data, and how they can be presented to non-mathematicians so
that they can better understand the data. Let’s go back to the Enron example. We chose to
model the data such that the nodes were the employees at Enron and the e-mails were
represented by links between them, but that’s not the only graph model that could be derived
from this data. That model, and the resulting visualization, shows who is communicating with
whom but ignores basic data about the e-mail itself. It also fails to take into account
potentially interesting information such as the forwarding of e-mails or the sending of a single
e-mail to multiple people, some of whom may be CCed or BCCed.

DEFINITION: A visualization is any method of using imagery to convey a point. When working with
computer graphics, it typically means finding a way to show large amounts of data in a single view. Creating
images to show graph data is called graph visualization.

In this case, we may want to consider the e-mail itself a node. Figure 1.6 shows the sender
and the recipients of an e-mail with subject line “Let’s Talk,” sent among Enron executives.
The nodes are executives from Enron, and the edges represent how they received the e-mail
(whether they were in the To:, CC:, or BCC: fields).

Figure 1.6: Graphing a single e-mail at Enron
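
One way to sketch the alternative model, in which the e-mail itself is a node, is shown below. Everything except the subject line “Let’s Talk” is hypothetical: the addresses are placeholders, and the edge labels SENT, TO, CC, and BCC are my own naming, not a convention from any particular tool.

```python
# The e-mail is a node in its own right, so its subject (and, in a fuller
# model, its date and body) lives on that node instead of on an edge.
email_id = "email-1"

nodes = {
    email_id: {"type": "email", "subject": "Let's Talk"},
    "sender@example.com": {"type": "person"},
    "to1@example.com": {"type": "person"},
    "cc1@example.com": {"type": "person"},
    "bcc1@example.com": {"type": "person"},
}

# Directed edges as (source, target, kind). The kind records how each
# person is attached to the message.
edges = [
    ("sender@example.com", email_id, "SENT"),
    (email_id, "to1@example.com", "TO"),
    (email_id, "cc1@example.com", "CC"),
    (email_id, "bcc1@example.com", "BCC"),
]


def recipients(edges, email, kind):
    """Return the people attached to an e-mail through a given field."""
    return [target for source, target, k in edges if source == email and k == kind]


print(recipients(edges, email_id, "CC"))
```

With this model a single message sent to several people stays one node with several outgoing edges, which the person-to-person model cannot express.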

Consider the following very simple table. It consists of only two columns: a name and a
country where that name is used. In the United States, the Social Security
Administration releases a similar table each year showing the popularity of first names given
to babies, as measured by new applications for Social Security numbers.

Name      Country
Joe       United States
Juan      Mexico
João      Brazil
Jean      France
Antoine   France
Jean      United States
Ignacio   Mexico
João      Portugal

This can be modeled as a graph – take each name as a node and also each country. The link
between them exists if the name and country appear in the same row. The result looks like the
following:

Figure 1.7: Using a graph to illustrate a table. The graph view of these name/country pairs
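
As a rough sketch of that conversion, the following Python turns each row of the table into a link between a name node and a country node, then looks for names linked to more than one country. The variable names are my own; the rows are the ones in the table above.

```python
from collections import defaultdict

# The rows of the name/country table.
rows = [
    ("Joe", "United States"),
    ("Juan", "Mexico"),
    ("João", "Brazil"),
    ("Jean", "France"),
    ("Antoine", "France"),
    ("Jean", "United States"),
    ("Ignacio", "Mexico"),
    ("João", "Portugal"),
]

# Each row becomes an undirected link, stored from both ends so the graph
# can be walked starting from either a name or a country.
countries_by_name = defaultdict(set)
names_by_country = defaultdict(set)
for name, country in rows:
    countries_by_name[name].add(country)
    names_by_country[country].add(name)

# Names with more than one link are the ones that stand out in the drawing.
shared_names = {
    name: sorted(countries)
    for name, countries in countries_by_name.items()
    if len(countries) > 1
}
print(shared_names)
```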

Looking at this graph tells us more than is immediately obvious from the table, namely that
Jean is used both in France and the United States. It also shows that João is used in Brazil and
Portugal, but no other names are associated with those countries. As you can see, you can
generate graph models from even the simplest data set, but typically, you’re going to want
properties on the nodes, the links, or both. In this case, we may want the frequency of that
name’s use in the country as a property of the link, and perhaps whether that name is
typically male or female.

1.1.4 When are Graphs helpful?


Now that we understand what graphs are, why would we use them? While there are certainly
cases where a graph model wouldn’t be appropriate – a long key-value pairing comes to mind
– they can be very useful when there are relationships between your data elements. If the
nodes are connected to one another somehow, and those connections are as important as the
data themselves, then a graph is a useful model for understanding the data. For example, in a
list of financial data, it may be important to look at the data in aggregate, say in a budget
where you’re interested in the total amount spent in a set of categories. A graph would be
counterproductive in this instance, because you aren’t looking for connections within this data.
You care only about the bottom line. However, in the same set of data, if we’re interested in
the transactions embedded in the data – for example, which consumers are spending money
at which merchants, and which merchants are using which banks – then a graph would be a
very useful model for storing and visualizing that data.

VALUE OF GRAPHS

Graphs can be incredibly useful, but there’s a danger in overusing them. Many people, when
first exposed to graph concepts, begin to see graphs everywhere, in every data set, but forcing
a graph model onto data can sometimes obscure its meaning.
Graphs are a good choice when:

• The links between items are not obvious. For example, linking someone’s first name
and last name together is unlikely to be useful unless you’re specifically looking at
the relationships of each name independently (for example, “How many ‘Coreys’ drive
black cars?”).
• There’s a structure embedded in the data. If every link has unique endpoints that no
other link shares, then the graph is a bunch of disconnected links, which doesn’t answer
any interesting question.
• There are at least some properties on the nodes. If a data set doesn’t have any
properties, then it may create a pretty picture when drawn, but it’s impossible to tell
what you’re looking at.

Below in figure 1.8 is a graph model that is not very helpful. It represents the data found in
the back of a road atlas that shows the mileage and driving times between city pairs.

Figure 1.8: A graph of driving times between North American cities. A graph is not an effective way of presenting
this type of data because everything is connected to everything else.

In reality, every city in North America is connected to every other city via roads, and it’s
unlikely that someone browsing the atlas wants to bother adding up the various segments and
city pairs necessary to drive from Richmond to Buffalo; they just want to know how far it is
and how long it takes to get there. In this case, they might prefer an association matrix,
which is a different way of representing graphs:

Association Matrix
An association matrix (often called an adjacency matrix) is a table displaying the node names as
both the columns and rows. For each pair of nodes that has a link between them, the cell where
those names intersect is either filled with a mark or filled in with a property value. In an
undirected graph, the values will be repeated, as each node pairing appears twice: once with the
first node as a row and the second as a column, and then vice versa.

This table – taken from a 2008 Rand McNally atlas – is an association matrix presenting
similar data on city pairs.

Figure 1.9: The back page of my road atlas. City names are represented as both columns and rows and the cell
indicates the distance between them.
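
An association matrix like this can be built from a list of links with a few lines of Python. The sketch below is illustrative only: the distances are made-up placeholder numbers, not real mileage figures, and "City C" is a hypothetical third city added so the matrix has more than one pair.

```python
# Undirected links as (city, city, distance). The distances are placeholders
# chosen for the example and are NOT real mileage values.
links = [
    ("Richmond", "Buffalo", 10),
    ("Richmond", "City C", 20),
    ("Buffalo", "City C", 30),
]

# Every city gets one row and one column, in the same order.
cities = sorted({city for a, b, _ in links for city in (a, b)})
index = {city: i for i, city in enumerate(cities)}

# Start with an empty table; None means "no link recorded".
matrix = [[None] * len(cities) for _ in cities]
for a, b, miles in links:
    matrix[index[a]][index[b]] = miles
    matrix[index[b]][index[a]] = miles  # undirected, so the value is mirrored


def distance(a, b):
    """Look up a city pair directly, with no path-finding needed."""
    return matrix[index[a]][index[b]]


print(cities)
for city, row in zip(cities, matrix):
    print(city, row)
```

The mirrored assignment is what produces the repeated values described in the definition above; a directed graph would fill only one of the two cells.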

Data sets in which the relationships between the data elements are the most important
feature are the most useful to model as a graph. Graphs work best when those relationships
are a key component of the analysis. In this section, you’ve had a brief introduction to the
graph data model, seen how tabular data can be represented as a graph in the “Graph Data
Model Introduced” subsection, and learned when graphs might be useful. In the next section,
we will discuss how and when to visualize graphs, that is, draw pictures of this data model on
paper or on computer screens.

1.2 Get To Know Graph Visualization


Why does visualizing graph data make it easier to understand? There are two reasons.
Humans are intuitively visual creatures, and it’s almost impossible to think of a model – any
sort of model – and not picture how it looks. Ernest Rutherford proposed the model of the
atom that we’re all familiar with in 1911: a nucleus of protons and neutrons with electrons
whizzing in orbits around it. This model was quickly replaced by Schrödinger’s more
accurate model based on quantum mechanics, but the Rutherfordian model is still, 90
years later, the one that the public embraces. Why? Because it can be pictured. The
Schrödinger model is more accurate, but it’s a mathematical concept, not a visual one, and
therefore has never acquired broad public appeal. Data is the same way; unless you can show
your audience what you’re talking about, they won’t recall it. Visualization helps bridge that
gap and gets the understanding of the data to the decision maker.
I described the property graph model, with nodes, edges, and properties in section 1.1,
but the topic of this book is graph visualization. The entire reason to collect data is to make
better informed decisions based on the data, so it’s important to not just collect data with no
useful way of accessing it. And with graph data, that typically means drawing the graph.
Although there are many different methods of graph visualization, and I’ll briefly discuss
many of them, the focus of this book is going to be on node-link visualization. This is not to
say that other visualizations are never useful, but node-link visualizations tend to have the
broadest appeal regardless of the data source and require the least technical background for
viewers to understand what they are seeing. We’ve been using node-link visualization
so far in this chapter, and it’s just what it sounds like: nodes are points, polygons, or icons,
and links are lines connecting those points. Node-link diagrams are almost always drawn on a
2D plane and almost never in three dimensions. An important aspect of the node-link diagram
is that a node’s location doesn’t tell you anything interesting about the node. Nodes are placed
solely for convenience and readability, which makes this quite different from a Cartesian
scatterplot, for example. An effect of this is that layout, or how nodes are arranged on the
chart, becomes much more important. Two charts with identical data but different layouts can
imply different things to the human eye.

1.2.1 When to Visualize Graphs


There are two reasons to visualize graphs, both of which are important:

1. The first is to better understand the structure of the data that you have.
2. The second purpose of visualization is to expose a broader audience to the data
connections.

VISUALIZING GRAPH DATA STRUCTURE

The visualization in figure 1.10 below shows the structure of a sales database and how its
elements are connected to one another, but not the connections between individual employees
and products.

Figure 1.10: A sales database showing connections between different data types

With regard to the first purpose, understanding the structure: in a dataset, what sorts of
things are linked to other things? Many of the diagrams you’ll see from graph databases are
designed to illuminate this structure.
In the above example, the structure of a sales database is visualized. Suppliers supply
products that are members of categories. Employees take orders that consist of products. To a
data scientist or an application engineer, this view is very important: it helps define the model
of the data, how it’s stored, and how users will interact with it. Get this wrong, and it can be a
very time-consuming and expensive process to fix.

DRAWING YOUR GRAPH DATA

The second purpose is to visualize the data inside your dataset. In this case, we’re not
interested in the categories of data, but the actual relationships among the data elements
themselves. Instead of saying “Employees sell orders which consist of products”, we can get
specific and say “Brad from Home Depot sold me a chainsaw from Husqvarna.” Then we can
get more specific – who else is buying Husqvarna chainsaws from Home Depot? What other
products are included in those orders? The visualization allows us to better understand the
connections embedded within the data itself.
A key reason that graph visualization is important is that it gives a visual interface to data
discovery. While much of the big data revolution of the past decade has been about
understanding trends in aggregate data, it is equally important to discover connections and
relationships in the individual data that may not have been known. A dashboard is unlikely to
show these, but a graph allows the user to explore the data and discover these patterns
visually. We’ll discuss this more in chapter 6.
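
The chainsaw questions above amount to walking a few links outward from a starting node. Here is a minimal sketch of that exploration in Python. Only Brad, Home Depot, and the Husqvarna chainsaw come from the text; the order ids, the other customers, "Employee Y", "Store Z", the safety glasses, and the edge labels are all hypothetical.

```python
# Instance-level graph as (source, kind, target) triples.
edges = [
    ("Brad", "WORKS_AT", "Home Depot"),
    ("Brad", "SOLD", "order-1"),
    ("order-1", "BOUGHT_BY", "me"),
    ("order-1", "CONTAINS", "Husqvarna chainsaw"),
    ("Brad", "SOLD", "order-2"),
    ("order-2", "BOUGHT_BY", "customer-2"),
    ("order-2", "CONTAINS", "Husqvarna chainsaw"),
    ("order-2", "CONTAINS", "safety glasses"),
    ("Employee Y", "WORKS_AT", "Store Z"),
    ("Employee Y", "SOLD", "order-3"),
    ("order-3", "BOUGHT_BY", "customer-3"),
    ("order-3", "CONTAINS", "Husqvarna chainsaw"),
]


def targets(edges, source, kind):
    """Follow edges of one kind forward from a node."""
    return {t for s, k, t in edges if s == source and k == kind}


def sources(edges, kind, target):
    """Follow edges of one kind backward into a node."""
    return {s for s, k, t in edges if k == kind and t == target}


# Orders that contain the chainsaw and were sold by someone at Home Depot.
home_depot_staff = sources(edges, "WORKS_AT", "Home Depot")
chainsaw_orders = {
    order
    for order in sources(edges, "CONTAINS", "Husqvarna chainsaw")
    if sources(edges, "SOLD", order) & home_depot_staff
}

# Who else is buying them, and what else is in those orders?
buyers = set().union(*(targets(edges, o, "BOUGHT_BY") for o in chainsaw_orders))
other_products = set().union(
    *(targets(edges, o, "CONTAINS") for o in chainsaw_orders)
) - {"Husqvarna chainsaw"}

print(sorted(buyers), sorted(other_products))
```

In an interactive visualization the same steps happen by clicking and expanding nodes rather than by writing queries, which is the kind of data discovery described above.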

1.2.2 Non-Network Based Graph Visualizations


The node-link visualization isn’t the only way to display a graph, although it’s the main focus
of this book. Below I’ll show you some examples of when other types of visualizations may
make more sense.

CIRCLE PLOT

If the primary goal is to show aggregate links between groups of nodes, as opposed to
individual data, something like the circle plot may make more sense. A good example is from
http://www.global-migration.info/, which shows global migration between and within the
six populated continents. The underlying data is a graph of migration patterns from each
country to each other country, but a node-link diagram of all this data would be busy and
wouldn’t show aggregate patterns as well as the circle plot, so the circle plot was a good choice.

Figure 1.11: A graph from Global-Migration.info. This stylish graph illustrates migration patterns among people
from all six inhabited continents.

HIVE PLOT

Another example of an alternative to the node-link diagram is called the hive plot. As
mentioned above, node-link graph visualizations focus on individual data elements and the
connections between them. While useful, they can fail to identify and communicate
connections among different types or groups of data elements. Seeing those group-level
connections can be especially useful when trying to understand structure in very large
networks, with tens or hundreds of thousands of nodes or links. The hive plot differentiates
nodes into three or more types and aligns each type along an axis radiating from the center of
the chart. Links between elements of different types are drawn as curved lines around the
center of the chart. This allows a viewer to visually differentiate between two types that are
tightly linked and two that are weakly linked, but it doesn’t display links between elements of
the same type, and it makes drilling down to look at subsets of the data much more difficult.

Figure 1.12: Hive plot of the E. coli bacterium. Notice the large number of connections of the leftmost group but
the fewer connections between the top and right.

There are lots of benefits to using node-link diagrams to better understand data, but there are
instances where other visualizations are more helpful. In general, node-link diagrams are
valuable when you want to focus on specifics, and are less helpful when you’re interested in
aggregates, as we’ve seen above.

1.3 Summary
In this chapter, I’ve introduced you to the definition of a graph and discussed the benefits of
thinking about data in terms of nodes and links. I’ve emphasized when this is beneficial
and when you’d see more benefit from a tabular view of data. I’ve also touched on the history
of graph visualization and the reason that drawing a picture of your data can be enlightening.
Although most of this book will focus on node-link visualizations, I mentioned a few other
styles of visualization that can be helpful in certain circumstances.

• A graph is a model of data emphasizing the connections in the data.
• The graph model can be created from any data set where items share a common
property. Some are more useful than others.
• Graph data can come from any source, not solely graph databases.
• The node-link diagram is the most common way of presenting and communicating
graph data.
• Graph visualization can serve two purposes, allowing a user to explore the data
independently or communicating findings.
• Software-based graph visualizations started in the 1980s and 1990s with algorithmic
graph layouts being developed.
• There are many other ways of displaying graph data that don’t rely on the node-link
diagram. Most of them help with looking at the structure of larger networks and not the
individual detail.


2 Case Studies in Graph Visualization

This chapter covers

• Examples of useful graph visualizations across industries
• Intelligence and Law Enforcement Graphs
• Financial and Review Fraud Graphs
• Cyber Security
• Sales and Marketing Graphs

Over the past dozen years, interest in graphs has exploded beyond academia and into
industry. The intelligence failures that allowed the September 11, 2001 attacks were portrayed
in the media as a “failure to connect the dots”: individual government agencies had
suspicions about individual terrorists, but no one was collecting and analyzing the big picture.
Connecting dots means understanding the relationships between individual data sets,
and although it still has a lot of room for improvement, the US intelligence community was
one of the first adopters of graph visualization, specifically among anti-terrorism analysts and
investigators.

The application of graphs to understand flows across a network also appealed to
investigators looking at money laundering. Often, discovering money laundering involves
looking at financial flows between people and companies and identifying areas that don’t make
sense, either because they have more inflows than outflows, or because they appear central in
a network when they shouldn’t be.

Fraud isn’t limited to financial fraud: any sort of
misrepresentation for gain is fraud. A relatively recent fraudulent practice is review fraud,
which means submitting fake positive reviews of products or services in which one has a financial stake,
or fake negative reviews of one’s competitors. As an increasing share of the economy is
made up of middlemen who match services or products with their customers (think
OpenTable for restaurants, Uber for taxicabs, or Yelp for local businesses), ensuring
the integrity of these reviews is critical.

I’ve been using “networks” thus far in the book in the
generic mathematical sense: any collection of nodes and the edges connecting them.
Most people, however, think of computer networks, either a basic local area network or
something as complex as the global Internet. As the world continues to evolve toward the
Internet being the primary communications infrastructure, not just for people but for devices
as well (the Internet of Things, or IoT), understanding how all these things may be connected
to one another becomes much more important. Cyber security has become tremendously
important over the last few years, as more critical functions of businesses and individuals are
conducted over the Internet. Graphs can be used both to identify weaknesses in computer
network infrastructure and to visualize a cyber attack to determine how to stop one in
progress or prevent a future one.
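
The money-laundering check mentioned above, comparing each party's inflows with its outflows, can be sketched directly on a list of transfers. The function names, the `(sender, receiver, amount)` row shape, and the threshold are assumptions made for this illustration, and the account names in the usage note are invented.

```python
from collections import defaultdict

def flow_imbalance(transfers):
    """Return {account: inflow - outflow} for (sender, receiver, amount) rows."""
    net = defaultdict(float)
    for sender, receiver, amount in transfers:
        net[sender] -= amount    # money leaving the sender
        net[receiver] += amount  # money arriving at the receiver
    return dict(net)

def flag_accounts(transfers, threshold):
    """List (account, net inflow) pairs above the threshold, largest first."""
    net = flow_imbalance(transfers)
    flagged = [(acct, value) for acct, value in net.items() if value > threshold]
    return sorted(flagged, key=lambda item: -item[1])
```

With transfers such as `[("a", "shell", 500.0), ("b", "shell", 300.0), ("shell", "c", 100.0)]` and a threshold of 200, only the hypothetical account `shell` would be flagged. A real investigation would need far more context than a single number, so this is a starting filter rather than a detection method.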
We’ll be looking at all the above examples in this chapter, including how graph
visualizations aid law-enforcement investigations at both the national and local level, and help
businesses weed out fraudulent behavior by their customers or online reviewers. These case
studies will illustrate several different ways that graph visualization has been used successfully
in real life. Here are some other industries where graph visualization may be useful:

Table 2.1: More industries and data where graph visualization can be used

Industry                    Data
Healthcare                  Drug combination responses
                            Communicable disease infection patterns
Transportation              Airline route networks
                            Shipping logistics
Supply Chain Management     Vendor relationships
Business Intelligence       Consumer buying behavior
                            Consumer sentiment analysis

The data in each of these case studies is real-life data from its domain; however, some of
it has been anonymized to protect confidentiality. I’d encourage you to download it from
<here (provide URL)> to play with visualizations yourself.

2.1 Intelligence and Terrorism


In 2004, Marc Sageman, a senior fellow at the Foreign Policy Research Institute and former
CIA agent, published a book called Understanding Terror Networks that includes detailed
biographical profiles of 172 Al Qaeda members and sympathizers across the globe and the
social bonds between individuals in this group. We can understand this data as a graph by
modeling the people as nodes, and the relationships as edges. Let’s look at a small sample of
this data in tabular form, one table for the nodes and one for the edges.


Table 2.2: Table of Al Qaeda members and international sympathizers

A table version of Al Qaeda members and sympathizers and their whereabouts, taken from Marc Sageman’s
book Understanding Terror Networks. In a graph visualization, these people will be the nodes.

And here’s the data showing the relationships between these people as a matrix.

Figure 2.3: Matrix of Al Qaeda members and international sympathizers

The relationships among the people from Table 2.2, in matrix format. These relationships will be represented by
edges in a graph visualization.
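
To make the step from tables to a graph concrete, here is a minimal Python sketch that turns a node table and a relationship matrix into an edge list. The people, countries, and matrix values are placeholders invented for the example, not rows from Sageman's data, and `matrix_to_edges` is a hypothetical helper name.

```python
def matrix_to_edges(names, matrix):
    """Convert a symmetric 0/1 relationship matrix into an undirected edge list."""
    edges = []
    for i in range(len(names)):
        # Scan only above the diagonal so each pair of people appears once.
        for j in range(i + 1, len(names)):
            if matrix[i][j]:
                edges.append((names[i], names[j]))
    return edges

# Placeholder node table: one row per person (hypothetical values).
nodes = [
    {"id": "P1", "country": "Country A"},
    {"id": "P2", "country": "Country B"},
    {"id": "P3", "country": "Country B"},
]

# Placeholder matrix: a 1 means the two people know one another.
knows = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

edges = matrix_to_edges([row["id"] for row in nodes], knows)
```

The node table supplies the properties used for display (such as the country), while the edge list carries only the fact that a relationship exists, which matches the data described here.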

Now we’ll expand this to represent all 172 terrorists in a node-link visualization, treating the fact
that two people know one another as a link between them in the chart. We’ve made a couple
of design choices here. One is to use the flag of the country that the person lives in (or lived
in; this is 2004 data) as the node icon. This is helpful because it allows us to tell at a glance
where people are from. Another is to draw all links identically. Normally we’d want to use
visual properties of the links, like width and color, to indicate something substantive about the
data, but our data just records whether a link exists, without a lot of properties on those
links. We’ll also run a force-directed layout, which creates separation between the nodes and
attempts to make the chart more readable. The result is below:

Figure 2.4: A graph visualization of Al Qaeda members and international sympathizers

A graph visualization illustrating the relationships among 172 people in the worldwide Al Qaeda network.
Nodes are people, and edges show who knows whom.
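
For readers curious about what a force-directed layout does, the following Python sketch shows one common scheme in the style of Fruchterman and Reingold: every pair of nodes repels, linked nodes attract, and movement is capped by a "temperature" that cools over time. It is not necessarily the algorithm used to produce the figure; the function name, parameters, and constants are assumptions, and it expects a non-empty list of nodes.

```python
import math
import random

def force_layout(nodes, edges, iterations=200, seed=0):
    """Return {node: (x, y)} from a simple force-directed layout."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred distance between nodes

    for step in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force

        # Attraction: linked nodes pull together.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force

        # Cooling: cap how far a node may move, less each iteration.
        temp = 0.1 * (1 - step / iterations)
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy)
            if length > 0:
                scale = min(length, temp) / length
                pos[n][0] += dx * scale
                pos[n][1] += dy * scale

    return {n: tuple(p) for n, p in pos.items()}
```

The pairwise repulsion loop grows quadratically with the number of nodes, which is acceptable for a chart of 172 people but is one reason production libraries use approximations for much larger graphs.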

As you can see, a static view of 172 nodes isn’t likely to be very helpful. There is a temptation
in graph visualizations to show more and more data all at once, and that can be
counterproductive, as your diagram becomes less and less readable. Zooming in on
subsections of the chart can give some better insights, as seen below.


Figure 2.5: A closer look at an Al Qaeda graph visualization

Zooming in allows you to see in better detail the relationships between Encep Nurjaman, a Malaysian
member of Al Qaeda, and an international group of his associates.
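
Zooming in on one person and his associates amounts to extracting a neighborhood from the full graph. A minimal sketch, assuming the graph is held as a plain edge list and using a hypothetical helper named `ego_network`:

```python
def ego_network(edges, center, hops=1):
    """Return (nodes, edges) within `hops` steps of `center`."""
    # Build an undirected neighbor lookup from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    seen = {center}
    frontier = {center}
    for _ in range(hops):
        # Step outward one hop, ignoring nodes already collected.
        frontier = {m for n in frontier for m in neighbors.get(n, ())} - seen
        seen |= frontier

    # Keep only links whose two endpoints are both in the neighborhood.
    kept = [(a, b) for a, b in edges if a in seen and b in seen]
    return seen, kept
```

Drawing only the returned nodes and edges gives a focused view like the one in the figure, without the clutter of the remaining network. Raising `hops` widens the view one step at a time.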

Now we see Encep Nurjaman, from the table above, represented by a Malaysian flag, as the
key connection point between a group of mostly Malaysians below and a mix of countries
above. And in fact, this is true: Nurjaman was known as the “Osama bin Laden of Southeast
Asia” and was the main link between Al Qaeda’s Middle East arm and its Southeast Asian
operations. So with no information other than who knows whom, we’ve
identified some key people in this network.
So even without diving in deeper with more properties for each node, the graph
visualization helps identify who knows whom on an international scale. Merely displaying the
data graphically enables you to identify key patterns from a data set that would be impossible
to see in tabular form. From there, you can start to look for hubs – focal points where one
cluster meets another, indicating someone with broad reach within a network.
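
One way to look for such hubs in code, rather than by eye, is betweenness centrality, which counts how many shortest paths between other nodes pass through each node. This measure is a swapped-in technique for illustration and is not named in the text above. The sketch below follows Brandes' algorithm for unweighted, undirected graphs; the function name and input shapes are assumptions.

```python
from collections import deque

def betweenness(nodes, edges):
    """Return {node: betweenness score} for an unweighted, undirected graph."""
    adj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    score = {n: 0.0 for n in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {n: [] for n in nodes}
        sigma = {n: 0 for n in nodes}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]

    # Each undirected pair was counted once from each endpoint.
    return {n: value / 2 for n, value in score.items()}
```

In a toy graph of two triangles that share a single node, that shared node receives the highest score, which is the same pattern as a person sitting where one cluster meets another.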

to see the ladies home.
“Quilting Bees” define themselves in their name. They were very
similar to spinning bees, except that the work was done after the
guests had assembled.
Of “Stoning Bees,” “Logging Bees” and “Raising Bees,” description is
unnecessary. The names are almost self-explaining, though just why
they were called “Bees” I cannot learn, unless it is because those
who came were expected to, and usually did, imitate the industrial
virtues of that insect. They were also sometimes called “frolics,”
possibly for the reason that the frolicking was often as hard and as
general as the work. Strong and hearty men were much inclined to
playful trials of strength and other frivolities when they met at such
times. This tendency was much enhanced in the earlier days by the
customary presence of intoxicants.
These amusements were varied and extended far beyond those
above mentioned. They exhibited and illustrate much of the
character, surroundings and habits of those early people. They
wanted no better amusement. It was, in their esteem, a wicked
waste of time and in conflict with their necessary economies to have
parties or gatherings of any kind exclusively for amusement, and
unaccompanied with some economic or industrial purpose like those
indicated above.
The dancing party or ball was a thing of later date, but even when it
came, and for many years after, it was looked upon by the more
serious people as not only wicked and degrading in a religious and
moral point of view, but very wasteful in an economic sense.
Their hard sense taught them that their industrio-social gatherings,
together with the church meetings and Sunday-schools, furnished
ample occasions for the young to meet and become acquainted,
while the elements of evil that crept into modern society elsewhere
were there reduced to a minimum.

A THRIFTY STOREKEEPER
A good story is told of Joseph Hoover dating well back in the first half
of the century. He went one day to the store of Mr. Jacob R——, in a
neighboring town, to get a gallon of molasses, taking with him the jug
usually used for that purpose. As it happened that day, the son,
Isaac, who usually waited on him, was otherwise engaged, and the
father, Jacob, went down cellar to draw the molasses. After being
gone some time, Jacob called up from the cellar to Joseph and said
that the jug did not hold a gallon. “Call Isaac,” replied Hoover, “and
let him try; he has always been able to get a gallon in that jug!”
THE ARMY OF THE POTOMAC
A PAGE OF HISTORY CORRECTED

III
HALLECK AND POPE

The fourth letter[13] contains a sentence which almost takes one’s
breath. It is bunglingly constructed, a thing unusual in Pope’s
communications. He had received a letter from Halleck, dated
November 7, intimating that the Secretary of War would order a court
of inquiry, and he answers, conveying the following:
“The overt act at Alexandria, during the engagement near
Centreville, can be fully substantiated by letters from many officers
since I have been here [St. Paul], it is quite certain [Now mark!] that
my defeat was predetermined, [Now mark again!] and I think you
must now be conscious of it.”
Pope does not even intimate who predetermined that “overt act,”
although he intimates rather clearly that Halleck is conscious of the
facts. It is difficult to see, however, how either McClellan, Porter or
Griffin could “predetermine” either a victory or a defeat at that time.
On the 25th day of November, 1862, the very day set by General
Pope, Major-General Halleck ordered a general court-martial for the
trial of Major-General Fitz-John Porter, and on that same day he
made his official report of the battle in which he certified to Pope’s
efficiency, as the latter had demanded in those uncanny letters. And
on the 5th day of December, Major-General Pope declared under
oath:
“This is all I have yet done”: i. e., “in my official reports of the
operations of the army, to set forth all the facts as they transpired on
the field. I have not preferred charges against him. I have merely set
forth facts in my official reports,” etc.
The “Official Records” referred to show that he “set forth” certain
facts [or fancies] in his private letters to Halleck, which, by some
mysterious influence have found their way into print, and suggest
that an explanation is in order to reconcile his sworn testimony with
the fact that he was urging General Halleck to action, by military
court, and even threatening him in case he should neglect such
action.
He says: “No man knows better than yourself the constancy, the
energy, and the zeal with which I endeavored to carry out your
programme in Virginia. Your own letters and dispatches, from
beginning to end, are sufficient evidence of this fact, and also of the
fact that I not only committed no mistake, but that every act and
movement met with your heartiest concurrence.”
[Note.—This statement is fully corroborated by the “Official Records.”
It is as certain as anything can be that Halleck formulated the plan
and that Pope executed it. If he appeared to be making mistakes, he
was obeying orders, and Halleck should be chargeable.]
Pope continues: “Your own declarations to me up to the last hour I
remained in Washington bore testimony that I had shown every
quality to command success.”...
“Having, at your own urgent request [Mark that well! and what
follows also. This paragraph shows that Halleck himself was the
instigator of the charges against Porter], and from a sense of duty [!]
laid before the Government, the conduct of McClellan, Porter and
Griffin, and substantiated the facts stated by their own written
documents, I am not disposed to push the matter further, unless the
silence of the Government [this means Halleck, as has been shown
Halleck was the only objector to the gratification of Pope’s wishes], in
the midst of the unscrupulous slander and misrepresentation
purposely put in circulation against me and the restoration of these
officers, without trial, to their commands, coupled with my
banishment to a distant and unimportant department, render it
necessary as an act of justice to myself.”
How keenly Pope feels his disgrace, having been used as a tool and
then flung aside, is shown clearly. He continues:
“As I have already said, I challenge and seek examination of my
campaign in Virginia in all its details, and unless the Government by
some high mark of public confidence, such as they have given to me
in private, relieves me from the atrocious injury done to my character
as a soldier ... justice to myself and to all connected with me
demands that I should urge the court of inquiry.... This investigation,
under the circumstances above stated, I shall assuredly urge in
every way. If it cannot be accomplished by military courts, it will
undoubtedly be the subject of the inquiry in Congress.”
Then follows a darkly ominous hint: “It is especially hard, in view of
my relations with you [Note that!] that I should be compelled even to
ask at your hands the justice which it is your duty to assure to every
officer of the army.... I tell you frankly that by the time Congress
meets such influences as can not be resisted will be brought to bear
on this subject.... I prefer greatly that you should do me this justice of
your own accord.”[14]
Altogether this letter is a rare specimen of the chiaroscuro in the art
epistolary; it tells of Halleck’s acts of injustice which Pope will right
by every means in his power. At times it breathes hatred and
vengeance, and closes with such a loving assurance as this:
“I write you this letter with mixed feelings. Personal friendship and
interest in your welfare, I think, predominate. I am not so blinded as
not to know that it gave you pain to allow such scandal against me
and to take such action as you thought the peculiar circumstances
required. Much as I differ with you on the subject, I am not ready to
blame you or to feel bitterly.”
Then follows that warning: “I impress upon you the necessity for your
own sake of considering carefully the suggestions I have presented,”
and closes with the assurance, “I shall not again address you a letter
on such a subject.”
This assurance was not fulfilled. Indeed, Pope wrote several letters
on the subject, as will appear. Queer letters were they, to be written
by a major-general commanding a department, to his superior, the
general-in-chief, to whom he administers the medicine à la cheval de
trait.
To summarize: Pope makes these charges against Halleck.
(1) That the plan of campaign was Halleck’s.[15]
(2) That Pope was but an instrument in the hands of the general-in-
chief.[16]
(3) That Pope faithfully executed Halleck’s plans.[17]
(4) That the latter fully approved every act of the former, thereby
making himself responsible, so far as Pope was concerned, for the
final result.[18]
Here a pause. These charges are fully substantiated by letters and
telegrams passing between Halleck and Pope, which appear in parts
II and III, of Vol. XII, of the Official Records. Pope was regularly
advising Halleck of his movements, and Halleck was as regularly
approving the same. And as late as August 26, 11:45 a. m., Halleck
wired Pope: “Not the slightest dissatisfaction has been felt in regard
to your operations on the Rappahannock,” etc.
Returning to the charges:
(5) That Pope had made the charges against Generals McClellan,
Porter and Griffin “at Halleck’s own urgent request.”[19] Halleck was
the real instigator.
(6) That Halleck had not assigned him [Pope] to command of the
western department, which, as Pope says, “would at once have
freed me [Pope] from the odium and abuse which have so
shamefully and unjustly been heaped upon me by the papers and
people,” etc.[20]
(7) That he found himself banished to the frontier.[21]
(8) That his character and reputation as a soldier had been deeply
and irretrievably injured.[22]
(9) That the Government refused to allow him to publish the facts[23]
and
(10) That General-in-Chief Halleck declined to acknowledge his
services publicly.
All through the letters are insinuations and charges against
McClellan, Porter and Griffin. And he makes categorical demand in
these words:
“I said, and say now, that one of three things I was entitled to; any
one of them would have satisfied me. The dictates of the commonest
justice gave me the right to expect one of them at least:
1st. That the court of inquiry be at once held and the blame be fixed
where it belongs. It is now too late for that, as the delay has already
made the worst impression against me that is possible.
2d. That the Government should acknowledge publicly, as it had
done privately, my services in Virginia, or
3d. That in case neither of these things could be done, then that the
Government bestow upon me some mark of public confidence, as its
opinion of my ability warranted.
None of these things have been done,” etc.
He continues: “You know me well enough I think, to understand that I
will never submit if I can help it. The court of inquiry, which you
inform me has been ordered, will amount to nothing for several
reasons. It is too late, so far as I am concerned. Its proceedings, I
presume, will be secret, as in Harper’s Ferry business. The principal
witnesses are here with me, and I myself should be present. The
Mississippi River closes by the 25th of November [Note that date!];
frequently sooner than that. It is then next to impossible to get away
from this place. A journey through the snow of 200 miles is required
to communicate with any railroad.”[24]
And on the very day which Pope had named, November 25, 1862,
General-in-Chief Halleck issued his order for the court-martial of Fitz-
John Porter, and issued his report certifying to the efficiency of
General Pope, thus avoiding the court of inquiry which Pope had
threatened to demand.
Such a court, if honestly conducted, would have laid bare the truth,
and shown to the world that Halleck himself had prevented the
reinforcements from reaching Pope, caused the defeat of Second
Bull Run, imperiled the national capital, and opened the door of
Maryland to Jefferson Davis and Robert E. Lee.
This conclusion is supported both by Halleck’s official report and by
his testimony before the Joint Committee on the Conduct of the War.
In the former, he says: “Had the Army of the Potomac arrived a few
days earlier, the rebel army could have been easily defeated and,
perhaps, destroyed.” His testimony before that committee, on March
11, 1863, reads:[25]
“Question. To what do you attribute the disastrous result of General
Pope’s campaign?
Answer. I think our troops were not sufficiently concentrated so as to
be all brought into action on the field of battle; and there was great
delay in getting reinforcements from the Army of the Potomac to
General Pope’s assistance.
Question. To what is that delay attributable?
Answer. Partly, I think, to accidents, and partly to a want of energy in
the troops, or their officers, in getting forward to General Pope’s
assistance. I could not say that that was due to any particular
individual. It may have resulted from the officers generally not feeling
the absolute necessity of great haste in re-enforcing General Pope.
The troops, after they started from the Peninsula, were considerably
delayed by heavy storms that came on at that time.”
[Note.—General Halleck has not told that committee, what his own
letters and telegrams conclusively prove, that the principal delay of
those reinforcements was due to his own wilfully false telegrams to
Generals McClellan, Burnside, and Porter, and that he also
prevented General Franklin and the Sixth Army Corps from reaching
Pope from Alexandria by refusing to provide transportation. The next
question and answer fixes the blame directly upon Halleck himself]:
“Question. Had the Army of the Peninsula [i. e., the army under
McClellan, which embraced both Porter’s and Franklin’s corps] been
brought to co-operate with the Army of Virginia [under the command
of Pope] with the utmost energy that circumstances would have
permitted, in your judgment as a military man, would it not have
resulted in our victory instead of our defeat?
Answer. I thought so at the time, and still think so.”
And this is the opinion of all military critics who have pronounced
judgment in the case. It is also certainly true that Halleck’s own
orders and telegrams prove that he himself, and apparently
purposely, prevented such co-operation, and it throws a peculiar
significance on Pope’s charge in his letter to Halleck, dated
November 20, 1862, before quoted, “It is quite certain that my defeat
was predetermined, and I think you must now be conscious of it.”[26]
The consequences which followed the defeat of Pope were not
immediately and fully appreciated at the time in the North, on
account of the censorship of the press, nor do they seem to be so at
this day. Orders were given to prepare for the evacuation of
Washington; vessels were ordered to the arsenal to receive the
munitions of war for shipment northward; one warship was anchored
in the Potomac, ready to receive the President, the Cabinet and the
more important archives of the Government: Secretary Stanton
advised Mr. Hiram Barney, then Collector of the Port of New York, to
leave Washington at once, as communication might be cut off before
morning;[27] Stanton and Halleck assured President Lincoln that the
Capital was lost.
Singularly enough the designs against Washington in the East were
at the same time and in the same manner being duplicated against
Cincinnati, then the “Queen City of the West.”
On August 30, while Pope was fighting the second Bull Run battle in
Virginia, the Confederate Major-General, E. Kirby Smith, was fighting
the battle of Richmond, Ky. In his report to General Braxton Bragg,
Smith says:
“The enemy’s loss during the day is about 1400 killed and wounded,
and 4000 prisoners. Our loss is about 500 killed and wounded.
General Miller was killed, General Nelson wounded, and General
Manson taken prisoner. The remnant of the Federal force in
Kentucky is making its way, utterly demoralized and scattered, to the
Ohio. General Marshall is in communication with me. Our column is
moving upon Cincinnati.”
On September 2, Lexington was occupied by Kirby Smith’s infantry.
He reports to General Cooper that the Union killed and wounded
exceed 1000; “the prisoners amount to between 5000 and 6000; the
loss—besides some twenty pieces of artillery, including that taken
here (Lexington) and at Frankfort—9000 small arms and large
quantities of supplies.” The Confederate cavalry, he reports, pursued
the Union forces to within twelve miles of Louisville; and, he adds: “I
have sent a small force to Frankfort, to take possession of the
arsenal and public property there. I am pushing some forces in the
direction of Cincinnati, in order to give the people of Kentucky time to
organize. General Heth, with the advance, is at Cynthiana, with
orders to threaten Covington.”
This invasion of Kentucky was due to Halleck, as was proved before
the military court appointed “to inquire into and report upon the
operations of the forces under command of Major-General Buell in
the States of Tennessee and Kentucky, and particularly in reference
to General Buell suffering the State of Kentucky to be invaded by the
rebel forces under General Bragg,” etc.
That court was in session from November 27, 1862, until May 6,
1863, with the gallant Major-General Lew Wallace presiding. Its
opinion recited that Halleck had ordered General Buell to march
against Chattanooga and take it, with the ulterior object of dislodging
Kirby Smith and his rebel force from East Tennessee; that General
Buell had force sufficient to accomplish the object if he could have
marched promptly to Chattanooga; that the plan of operation
prescribed by General Halleck compelled General Buell to repair the
Memphis and Charleston railroad from Corinth to Decatur, and put it
in running order; that the road proved of comparatively little service;
that the work forced such delays that a prompt march upon
Chattanooga was impossible, while they made the rebel invasion of
Tennessee and Kentucky possible. Our forces were driven northward
to the Ohio, leaving the Memphis and Charleston railroad in
excellent condition for the use of the Confederates. Strangely
enough, Halleck’s orders to Buell had inured to the benefit of the
Confederates in the West, in the same manner and along the same
lines as his orders to McClellan and to Pope had inured to the
benefit of the Confederates in the East.
Both Washington and Cincinnati were imperiled at the same time,
and by the same officer, General-in-Chief Halleck, and in the same
way—by a succession of steps that appear to have been carefully
planned.
Now, mark what follows.
On March 1, 1872, the House of Representatives called upon the
Secretary of War for a copy of the proceedings of that military court;
and on April 13 the Secretary reported to the House, “that a careful
and exhaustive search among all the records and files in this
Department fails to discover what disposition was made of the
proceedings of the Commission,” etc.
But though the records of those proceedings which fix the blame for
that campaign upon Major-General Halleck were lost or stolen from
the archives of the War Department, Benn Pitman, the phonographic
reporter of the court, had possession of a report of those
proceedings. And, by Act of Congress, approved by President Grant
on June 5, 1872, the Secretary of War was “directed to employ at
once Benn Pitman to make a full and complete transcript of the
phonographic notes taken by him during the said investigation, and
to put the same on file among the records of the War Department,
and to furnish a copy of the same to Congress.”
The report of those proceedings may now be found in “Official
Records,” Series I, Vol. XVI, Part I, pp. 6 to 726, inclusive. The most
melancholy part of the story lies in the fact that Porter, who certainly
helped to save Washington from falling into Lee’s hands, had his life
blasted by Halleck, and died without knowledge that Halleck, not
Pope, was really guilty of the disaster which so nearly resulted in the
abandonment of the Capital to the Confederates, and while Halleck
was directing affairs in the West in such a manner as to imperil
Cincinnati.
The remarkable co-operation between Pope and Buell for the
surrender of those cities, and which was attempted by Halleck, does
not look like a concatenation of accidental circumstances. This is
accentuated by the charge against Halleck’s loyalty to the Republic
which was made by the gallant Wallace after he had presided over
that Buell military court. He was a careful man; and, being a good
lawyer, he understood the laws and effect of evidence. Porter, who
prevented the surrender of Washington, and Buell, who saved
Cincinnati, were both punished. It looks as if they had interfered with
Halleck’s plan of a general surrender.

L’ENVOI

In January, 1899, the writer commenced to unravel the mystery
surrounding the battle of Harper’s Ferry, which culminated in the
surrender of that post September 15, 1862. He was a member of
that garrison, and he knew that history had not truthfully recorded the
defense, some chronicles reading that “Harper’s Ferry fell without a
struggle,” others that “there was no defense”; in the main, historians
were a unit.
Such reports are wholly false. The defense of that post was stubborn
and prolonged, lasting from September 11, when the Confederates
showed themselves in Pleasant Valley, until the 15th, when the
garrison was subjected to one of the fiercest bombardments of the
Civil War. Never was hope abandoned until the last shell was
expended, though the little garrison of 12,500 men was besieged by
what was practically the whole of Lee’s army. Starting on a new line
of research, and abandoning the path beaten by others, he found
many battles lost in the same manner, and the responsibility shifted
from the shoulders of the guilty and carefully loaded upon those of
the innocent, and all by the use of the same means, a false report by
General-in-Chief Halleck, and a bogus trial by a military court.
Conspicuous among these was the battle of Second Bull Run,
followed by the trial of Fitz-John Porter. That battle was certainly lost
by Halleck, as shown by documents over that general’s own
signature. And Pope knew it, and charged that it was premeditated.
To avoid the odium which some papers were attaching to his name,
the latter applied the whip and spur to the former, who, under threat
of exposure, ordered the court-martial of the innocent and gallant
Major-General Fitz-John Porter. The battle of Harper’s Ferry
followed; the result was the same; lost by Halleck; responsibility lifted
from his shoulders, and carefully divided between General McClellan
(for not relieving the post) and Colonel Dixon S. Miles (for not
defending it). After that came Fredericksburg, with similar results;
lost by Halleck; responsibility lifted from his shoulders, and divided
between Burnside and Franklin.
Study the plans adopted in one instance; the plans adopted in the
others become manifest. The losing of the battles to the Union arms
was accomplished by carefully prepared plans, and reduced to an
exact science.
R. N. Arpe.
New York City.

FOOTNOTES:
[13] November 20, pp. 825-6.
[14] P. 818.
[15] P. 817.
[16] Ibid.
[17] Ibid.
[18] Ibid.
[19] P. 817.
[20] P. 818.
[21] Ibid.
[22] P. 921.
[23] Ibid.
[24] P. 822.
[25] Vol. II, Part I, p. 454.
[26] O. R., Vol. XII, Part III, p. 825.
[27] See Warden’s Chase, p. 415.

THE NORTHERN NECK OF VIRGINIA
PRESENT-DAY ASPECTS OF WASHINGTON’S BIRTHPLACE

Five Virginia counties lying between the Potomac and the
Rappahannock constitute the Northern Neck, the region in which
George Washington, Light Horse Harry Lee, and his more famous
son were born and bred. There are a scant thousand square miles in
these counties of King George, Westmoreland, Richmond,
Northumberland, and Lancaster, and the population of the five is
under fifty-five thousand. At no point are the rivers much more than
thirty miles apart, and near the northern boundary line of King
George the harbors on the two streams are only nine miles apart.
Washington was born on a lonely plantation in Westmoreland
County, bordering the beautiful Bridges Creek, within sight of the
Potomac. At Colonial Beach, two or three miles across the mouth of
Monroe Creek, also in Westmoreland County, stands a house in
good repair, which is declared to have been the residence of Light
Horse Harry Lee before he removed to Fairfax County. Washington
as an infant was taken by his parents to their new home opposite
Fredericksburg, in Stafford County, and at the age of twenty he
inherited from his half-brother Lawrence the fine estate of Mount
Vernon, in Fairfax County. Lawrence had named his estate in honor
of Admiral Vernon, with whom the young Virginian had served as an
officer in the campaign against the Spanish-American stronghold of
Cartagena. It was Lawrence’s acquaintance with Admiral Vernon that
won for George Washington the offer of a midshipman’s commission
in the royal navy, an appointment that only his mother’s strong
objection prevented him from accepting.
From the birthplace of Washington to his second home opposite
Fredericksburg is hardly more than fifty-five miles as the crow flies,
and from the birthplace to the scene of his death at Mount Vernon is
under seventy miles. The triangle enclosed by the lines connecting
these points includes a tract of Virginia that is full of historic interest,
and singularly rich and beautiful as an agricultural region. Most of the
counties of the Northern Neck are increasing in population, but they
lie far from railways, and their mode of communication with the
outside world is the steamboats that ply from Baltimore up and down
the two rivers.
In spite, therefore, of the rolling years, and of civil war, and
emancipation, the Northern Neck of Virginia is in many respects
much what it was when George Washington and Light Horse Harry
Lee were born a month apart in the quaint and lovely old
Westmoreland of the year 1732. The visitor to Mount Vernon comes
away with a strong impression of Washington, the local magnate and
world-wide hero. But Mount Vernon, in spite of its tomb and its relics,
many of them actually used and handled by Washington himself, can
hardly give one the eighteenth century atmosphere. To obtain that
one must make a pilgrimage to the region of Washington’s birth. A
fair shaft erected by the Federal Government now stands on the spot
occupied by the homestead of Augustine Washington, the birthplace
of his mighty son. The spot is as remote and lonely as it was when
Washington’s eyes first saw the light, and the aspect of the region
must be much what it was in that day. Doubtless the woodland has
shrunk in area and the plowed land has widened. But there, in full
view from the monument, are the land-locked tidal waters of the little
stream, and eastward lies the broad lazy flood of the Potomac, idly
moving beneath the soft overarching sky. Everywhere are the marks
of an old civilization. The road that leads from the wharf at Wakefield
on Monroe Creek to the monument is lined with cherry trees
escaped from the old orchards of the neighborhood. The
mockingbird sings in all the woodlands as it must have sung in the
ears of Augustine Washington as he moved about his fields, and
gray old log granaries of the eighteenth century pattern still stand
amid piles of last year’s corncobs. Even to-day brand-new corn cribs
are built in the same fashion of partly hewn logs. The crops are also
those of the earlier century. The monument itself stands in the midst
of a waving wheat field, and acres of Indian corn rustle green and
rich as they must have rustled in the first hot summer of George
Washington’s infancy.
The reality of it all is increased by the bodily presence of
Washington’s own kin, men and women bearing his name, the
descendants of his collateral relatives. A little boat rocking at anchor
off the wharf at Wakefield is the fishing dory of Lawrence
Washington, commonly called “Lal” Washington by his neighbors. He
is a man of substance and dignity. But he takes delight in fishing his
own pound nets, and the unpretentious fishermen of the region tell
how the old man’s enthusiasm was such that he rushed waist deep
into the water to help three or four young fellows drag ashore a
heavily laden seine. His brother was for years State’s Attorney of a
neighboring county, and other members of the family are landholders
in Westmoreland. Their neighbors accept these families of historic
name in a simple, matter-of-fact fashion, and with no humiliating
sense of inferiority. “They’re all smart people,” said the young
fisherman that sailed us across Monroe Creek to the wharf at
Wakefield, and that is what Westmoreland expects of the
Washingtons.
Neighboring plantations are stocked with fine old European nut and
fruit trees, such as the colonists with the increasing wealth of the
third and fourth generations were accustomed to import. In some
places the fig is cultivated, and within the shadow of the birthplace
monument is a dense colony of young fig shoots which have sprung
and resprung after every severe winter for perhaps more than a
century and a half. The steep bank of Bridges Creek to the southeast
of the monument is lined with cherry trees that to this day bear
excellent fruit, to be had merely for the picking. One gathers from all
the surroundings of the place a strong sense of the dignity and
simplicity that mark plantation life in Virginia.
It is a quiet life, indeed, that the people of these Westmoreland
plantations lead. Even to this day sailing craft slowly worm their way
far into the deep navigable inlets of the region, and carry freight to
Baltimore and Washington. Each plantation has its own wharf, and
each planter keeps a lookout for the coming schooner, just as their
ancestors of Washington’s day must have watched for the slow and
patient craft that plied up and down the Potomac, and away to
Baltimore, Philadelphia, and New York, or across the Atlantic to
England, a voyage that might stretch out for six or eight, ten, or even
twelve weeks.
The very speech of the people has a slightly archaic flavor, and
family names are redolent of old English ancestry. Here still are the
Kendalls, who like to boast that one of their ancestors was the
earliest mail contractor in Virginia. The elder Kendall, a man of
substance and fair education, found satisfactory reasons for selling
all that he had and coming to Jamestown with Captain John Smith.
In coming away he left behind a son just grown to manhood and
some debts owing to the estate. The son was instructed to collect
what he could of the proceeds, invest it in blankets and trinkets such
as the Indians liked, and to follow the father to Jamestown. The
young man obeyed the paternal instructions, but in sailing up the
Potomac with his freight of gewgaws he mistook the Potomac for the
James. After vainly looking for Jamestown, he concluded that the
settlement had been destroyed by the Indians, and, having reached
the present site of Alexandria, he made a settlement and called it
Bell Haven. Some months later an Indian who visited Bell Haven
made the settlers understand that there were white men on a river
further south. Young Kendall knew then that Jamestown was still in
being. So he wrote a letter to his father and entrusted it to the Indian
to be delivered at Jamestown, paying him for the service one gay
woolen blanket. Father and son thus came into communication, but
the son remained at Bell Haven, and from him are descended the
Kendalls of the Northern Neck.
The whole region teems with traditions of Washington. Down in
Northumberland County, the lovely little harbor of Lodge is named
from the fact that here stood the Masonic lodge that Washington
used to attend. The British destroyed the house during the
Revolutionary War, but the cornerstone was found and opened not
many years ago, and some of its treasures of old English money
were placed in the cornerstone of the Masonic lodge at Kinsale,
another charming little Virginia harbor. It is at Lodge that the maker
of canceling dies for the Post Office Department, exiled from
Washington because of the climate, has for nearly twenty years
carried on his business with the aid of country youths trained for the
purpose.
If the shore is much what it was in Washington’s infancy, the river
and its tributaries are even more so. Those who know the Potomac
at Washington or amid the mountains that hem it in further west and
north, may well have no suspicion of the vast flood which it becomes
in the lower part of its course. Fifty miles below Washington the river
is from four to six miles wide. Sixty miles below the capital it has
spread to a width of ten miles, and in the lower forty miles of its
course it is from ten to eighteen miles wide, a great estuary of the
Chesapeake, with tributaries, almost nameless on the map, that
fairly dwarf the Hudson. The busy steamers plying these waters to
carry the produce of the plantations to the markets of Baltimore and
Washington leave the Potomac from time to time to lose themselves
in its tortuous tributaries. Cape on cape recedes to unfold new and
unexpected depths of loveliness; little harbors sit low on the tidal
waters backed by wooded bluffs, behind which lie the rich
plantations of Northumberland and Westmoreland. A soft-spoken
race of easy-going Virginians haunts the landing-places. Fishermen,
still pursuing the traditional methods of the eighteenth century, fetch
in sea trout and striped bass and pike to sell them at absurdly low
prices, and for nine months of the year oystermen are busy. Every
planter who will can maintain his pound net in the shallows of the
Potomac or one of its tributaries, and all along the lower course of
the stream the planter may secure his own oysters almost without
leaving the shore. The dainties that filled colonial larders in
Washington’s youth are still the food of the region—oysters and
clams, soft-shell crabs, wild duck, geese, and swan in winter, and a
bewildering variety of fish.
Just across the Potomac from Washington’s birthplace is old
Catholic Maryland of the Calvert Palatinate, settled almost exactly a
century before his birth, and still rich in the names and traditions of
that earlier time. The great width of the separating flood makes one
shore invisible from the other, and the only means of communication
are either the local sailing craft or the steamers that weave from side
to side of the river and lengthen the voyage from Baltimore to
Washington to a matter of thirty hours. Communication between
Maryland and Virginia was almost as easy in Washington’s day, for
the steamboats have an annoying habit of neglecting many miles of
one shore or the other, and there are days when no steamer crosses
the stream. A man living in one of the little harbors of the Northern
Neck, being in a hurry to travel northward, found his most
expeditious mode of travel to be a drive of seventy miles to a railway
at Richmond. Shut in thus, the people of the Northern Neck have
nursed their traditions and held hard by their old family names, so
that the visiting stranger, if he have any touch of historic instinct,
finds himself singularly moved with a sense of his nearness in time
to George Washington and his contemporaries. The telephone,
indeed, has brought these people into tenuous communication with
the modern world, but he that looks out upon the sea-like flood of the
Potomac from the mouth of one of its many navigable tributaries in
the Northern Neck can hardly persuade himself that the capital of
80,000,000 people lies less than a hundred miles up stream.
Washington the man seems vastly more real and present than
Washington the city.
E. N. Vallandigham.
Evening Post, N. Y.
