Chapter 6 PDF
Chapter content
GIS analysis functions
Measurement
Selection queries
Reclassification
Dissolving
Overlay operations
Buffering
Thiessen polygon
Spread computation
Seek computation
Network analysis
Error propagation in spatial data processing
2
Introduction
Spatial data analysis aims to find answers to
questions that have spatial relevance. Example:
Where is the most suitable site for waste disposal?
Where is the best site for plantation of certain species?
Steps in analysis:
Frame the question
Select your data
Choose analysis method
Process the data
Look at the results
3
Measurement operations
Only geometric measurements are discussed, no
measurements on attribute values.
Measurement functions can be:
Vector measurements
Raster measurements
4
Measurement …
Location → always stored by GIS.
One coordinate pair for
points.
List of pairs for lines
and polygons.
The location of the
centroid of a polygon.
5
Measurement …
Length is associated
with lines, and with
polygon boundaries.
It can be stored by the
GIS.
It can also be computed
on the fly.
6
Measurement …
Distance between two points
→ Pythagorean distance
function.
If one or both features are not
a point we will measure the
minimal distance between the
two features.
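The distance rules above can be sketched as follows. This is a minimal illustration, assuming planar coordinates in the same units and a line feature stored as a list of vertices; the function names are made up for this example.

```python
import math

def point_distance(p, q):
    """Pythagorean (Euclidean) distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def point_segment_distance(p, a, b):
    """Minimal distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return point_distance(p, a)
    # Project p onto the segment and clamp the result to its end points.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return point_distance(p, (ax + t * dx, ay + t * dy))

def point_line_distance(p, vertices):
    """Minimal distance from point p to a polyline given as a vertex list."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(vertices, vertices[1:]))

print(point_distance((0, 0), (3, 4)))                 # 5.0
print(point_line_distance((0, 1), [(-1, 0), (1, 0)]))  # 1.0
```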
7
Measurement …
Raster measurements
include: location, distance
and area size.
Location of an individual
cell → derived from anchor
point and resolution.
The cell’s location can be its
lower left corner or
midpoint.
Resolution/cell size:
30 x 30 meters
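A small sketch of deriving a cell's location from the anchor point and the resolution. It assumes the anchor point is the lower left corner of the raster and that columns and rows are counted from that corner starting at 0; real systems may anchor at the upper left instead.

```python
def cell_location(anchor_x, anchor_y, resolution, col, row, midpoint=False):
    """Return the (x, y) location of a cell.

    With midpoint=False the lower left corner of the cell is returned,
    with midpoint=True the centre of the cell.
    """
    offset = 0.5 if midpoint else 0.0
    x = anchor_x + (col + offset) * resolution
    y = anchor_y + (row + offset) * resolution
    return x, y

# Cell in column 2, row 3 of a 30 m raster anchored at (1000, 2000).
print(cell_location(1000, 2000, 30, 2, 3))                 # (1060.0, 2090.0)
print(cell_location(1000, 2000, 30, 2, 3, midpoint=True))  # (1075.0, 2105.0)
```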
8
Measurement …
Distance → standard
distance function applied
to the locations of their
mid-points.
When a raster is used to
represent line features as
strings of cells, the length
of a line is computed as
the sum of the distances
between the cells.
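The length of a line stored as a string of cells can be sketched like this, assuming the cells are given in order along the line as (column, row) pairs and distances are measured between cell mid-points.

```python
import math

def raster_line_length(cells, resolution):
    """Sum the mid-point distances between consecutive cells of a line."""
    total = 0.0
    for (c0, r0), (c1, r1) in zip(cells, cells[1:]):
        # A straight step measures 1 cell, a diagonal step sqrt(2) cells.
        total += math.hypot(c1 - c0, r1 - r0) * resolution
    return total

# One straight step and one diagonal step on a 30 m raster.
length = raster_line_length([(0, 0), (1, 0), (2, 1)], 30)
print(round(length, 2))  # 72.43
```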
9
Measurement …
Area size → number of cells *
area of one cell.
When you know the
resolution you can calculate
the area of a single cell.
In this example 30 x 30
meters = 900 m².
The number of cells is also
called the frequency or count.
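As an illustration, the area per class can be computed from the cell count. The raster below is a made-up example stored as a list of rows.

```python
from collections import Counter

def class_areas(raster, resolution):
    """Return {cell value: area} using count * area of a single cell."""
    counts = Counter(value for row in raster for value in row)
    cell_area = resolution * resolution
    return {value: count * cell_area for value, count in counts.items()}

landuse = [[1, 1, 1],
           [1, 2, 2]]
print(class_areas(landuse, 30))  # {1: 3600, 2: 1800}
```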
10
Spatial selection queries
Interactive selection
Spatial selection by attribute conditions
Relational operators
Logical operators
Combining attribute conditions
11
Interactive selection
Interactive spatial
selection is a selection
in which you select
features by:
clicking on the screen
(on the feature to select)
or
drawing a graphic, to
select all objects within
this graphic.
12
Attribute queries
Attribute queries use a
selection condition on the
features' attributes.
The condition is specified
in a query language.
This query language can be
SQL when the data is
stored in a relational
database.
Answers questions: Where
are the features with…?
13
Attribute queries
A condition that tests a
single criterion is called
an atomic condition.
Atomic conditions use a
predicate symbol such as
< (less than).
Any of these symbols is
combined with an
expression on the left and
on the right.
14
Attribute queries
Logical connectives are used to combine atomic
conditions into one composite condition.
Examples of logical connectives are: AND, OR and
NOT.
AND returns true if both expressions are true.
OR returns true if one or both of the expressions
is/are true.
NOT returns true if the expression is false.
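A brief sketch of atomic and composite conditions, written in Python rather than SQL. The parcel table and its attribute names are hypothetical.

```python
# Hypothetical attribute table: one dictionary per feature.
parcels = [
    {"id": 1, "landuse": "forest", "area": 120.0},
    {"id": 2, "landuse": "urban", "area": 45.0},
    {"id": 3, "landuse": "forest", "area": 30.0},
    {"id": 4, "landuse": "water", "area": 300.0},
]

def select(features, condition):
    """Return the ids of all features for which the condition is true."""
    return [f["id"] for f in features if condition(f)]

# Two atomic conditions combined with AND, OR and NOT.
both = select(parcels, lambda f: f["landuse"] == "forest" and f["area"] > 100)
either = select(parcels, lambda f: f["landuse"] == "forest" or f["area"] > 100)
negated = select(parcels, lambda f: not f["landuse"] == "forest")

print(both)     # [1]
print(either)   # [1, 3, 4]
print(negated)  # [2, 4]
```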
15
Attribute queries
16
Topological relationships
For spatial selections
using topological
relationships, the steps
carried out are:
Select one or more
selection objects.
Apply a chosen spatial
relationship to determine
the features that have that
relationship with the
selection objects.
17
Topological relationships
Selecting features that are inside selection objects.
Polygons can contain polygons, lines or points, and
lines can contain lines or points.
18
Topological relationships
An example of a selection
using the inside
relationship.
In picture 1 the state of
Georgia is selected; this is
the selection object.
In picture 2 all the cities
that are inside this state are
selected.
19
Topological relationships
Selecting features that
intersect.
In picture 1 the state of
Georgia is selected
In picture 2, the
interstates that run
through Georgia are
selected (intersect).
20
Topological relationships
Selecting features
adjacent to selection
objects.
Adjacency is the same as
the meets relationship.
Adjacency depends on the
algorithm used.
Shared line
Shared node
21
Topological relationships
Selecting features based on
their distance.
Will search within a given
distance from the selection
object, at a given distance,
or beyond a given
distance.
Example (bottom): select
all the cities within 100
kilometers of Atlanta.
22
Location selection methods
Cities
Countries
Result
Cities intersected by
selected countries
Reclassification
It is an assignment of a class or value based on the
attributes or geography of an object.
It is used to reduce the complexity of a layer in order
to show patterns.
Reduce the number of classes and eliminate details.
Two types of classifications:
user controlled classification and
automatic classification
25
Reclassification
User controlled classification
Classification table
Two examples of classification tables:
26
Reclassification
Automatic classification:
User specifies the number of
output classes.
Computer decides the class
break points.
Two methods of determining
the class breaks will be
discussed:
Equal interval technique
Equal frequency technique
27
Reclassification
Equal interval is calculated as:
$$\frac{V_{max} - V_{min}}{n}$$
Where:
$V_{max}$ is the maximum attribute value,
$V_{min}$ is the minimum attribute value,
$n$ is the number of classes.
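A minimal sketch of the equal interval technique. The function names are illustrative, and assigning a value that falls exactly on a break to the upper class is an assumption of this example.

```python
def equal_interval_breaks(values, n):
    """Return the n - 1 class breaks, each (Vmax - Vmin) / n apart."""
    vmin, vmax = min(values), max(values)
    width = (vmax - vmin) / n
    return [vmin + i * width for i in range(1, n)]

def classify(value, breaks):
    """Return the class index 0 .. n-1 of a value."""
    return sum(value >= b for b in breaks)

values = [0, 10, 20, 100]
breaks = equal_interval_breaks(values, 4)
print(breaks)                                # [25.0, 50.0, 75.0]
print([classify(v, breaks) for v in values])  # [0, 0, 0, 3]
```

The example shows a known weakness of equal intervals: with skewed data most features end up in one class.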
28
Reclassification
Equal frequency is
also called quantile.
Total number of
features / number of
classes (n)
Create categories
with roughly equal
number of features
(or cells).
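The equal frequency technique can be sketched by ranking the values and cutting the ranking into n groups of roughly equal size. This simple version may split identical values over two classes; handling ties differently is a design choice left open here.

```python
def equal_frequency_classes(values, n):
    """Assign each value a class 0 .. n-1 with about len(values) / n members."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, index in enumerate(order):
        classes[index] = rank * n // len(values)
    return classes

print(equal_frequency_classes([5, 1, 9, 3, 7, 11], 3))  # [1, 0, 2, 0, 1, 2]
```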
29
Dissolve
A function whose primary purpose is to combine like
features within a data layer.
Adjacent polygons may have identical values.
Dissolve removes or “dissolves away” the common
boundary.
Used prior to applying area-based selection in spatial
analysis.
Dissolve is often used after reclassification.
30
Dissolve
31
Overlay
Standard overlay operators take two input data layers,
and assume that they:
are geo-referenced in the same system,
overlap in study area.
32
Overlay
Vector overlay techniques
Intersection
Clip by
Overwrite by
33
Vector overlay-intersection
The standard operator for
two layers of polygons is the
polygon intersection
operator.
The result of this operator is
the collection of all possible
polygon intersections
The attribute table combines
the information of the two
input tables (spatial join).
34
Vector overlay-intersection
Vector overlays are
usually also defined for
point and line data layers.
When a polygon layer is
intersected by a line layer
the result will be a line
layer (the layer of the
lowest order).
35
Vector overlay-clip
Clip takes a polygon data layer
and restricts its spatial extent
(the area that it covers) to the
outer boundary of a second
input layer (clip layer).
No other polygons from the clip
layer play a role in the result.
This technique can be used to
reduce the area of a thematic
layer to that of the study area.
36
Vector overlay-overwrite
The polygon overwrite
creates a layer with the
polygons of the first layer
except where polygons exist
in the second layer (as they
take priority).
This operator can be used to
overwrite a layer with
“updates” stored in a second
layer.
37
Raster overlay operations
Raster overlays are mostly
cell by cell computations.
GISs that support raster
processing have a full
language to express
operations.
This is called a raster
calculus.
No geometric calculation.
38
Raster overlay operations
New cell values are calculated using raster calculus,
also known as map algebra.
Performed on cell-by-cell basis.
Overview
Arithmetic overlay operators
Comparison and logical operators
Conditional expressions
Decision table
39
Raster overlay-arithmetic
Arithmetic operators:
+, -, *, /
MOD (modulo division)
DIV (integer division)
Trigonometric operators:
sin, cos, tan, asin, acos,
atan.
For example:
Raster2 := Raster1 * 5
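A sketch of cell-by-cell arithmetic, with rasters stored as lists of rows. The helper `map_cells` is a made-up name for this illustration and not part of any GIS package.

```python
def map_cells(func, *rasters):
    """Apply func cell by cell to one or more equally sized rasters."""
    return [[func(*cells) for cells in zip(*rows)] for rows in zip(*rasters)]

raster1 = [[1, 2],
           [3, 4]]

raster2 = map_cells(lambda v: v * 5, raster1)            # Raster2 := Raster1 * 5
total = map_cells(lambda a, b: a + b, raster1, raster2)  # Raster1 + Raster2
remainder = map_cells(lambda v: v % 3, raster2)          # Raster2 MOD 3
quotient = map_cells(lambda v: v // 3, raster2)          # Raster2 DIV 3

print(raster2)    # [[5, 10], [15, 20]]
print(total)      # [[6, 12], [18, 24]]
print(remainder)  # [[2, 1], [0, 2]]
print(quotient)   # [[1, 3], [5, 6]]
```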
40
Raster overlay-arithmetic
41
Raster overlay-comparison & logical
Comparison operators:
<, <=, =, >=, >, <>
C := A<>B is true when the
cell’s value in A differs
from the cell’s value in B. It
is false if they are the same.
Logical operators:
AND, OR, NOT, XOR
(exclusive)
a XOR b is true if either a or
b is true, but not both.
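Comparison and logical operators work the same way, producing rasters of true/false values. The helper `map_cells` and the input rasters are illustrative only.

```python
def map_cells(func, *rasters):
    """Apply func cell by cell to one or more equally sized rasters."""
    return [[func(*cells) for cells in zip(*rows)] for rows in zip(*rasters)]

a = [[1, 2],
     [3, 4]]
b = [[1, 5],
     [3, 0]]

differs = map_cells(lambda x, y: x != y, a, b)          # C := A <> B
large = map_cells(lambda x: x >= 3, a)                  # D := A >= 3
xor = map_cells(lambda p, q: p != q, differs, large)    # E := C XOR D

print(differs)  # [[False, True], [False, True]]
print(large)    # [[False, False], [True, True]]
print(xor)      # [[False, True], [True, False]]
```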
42
Raster overlay-logical operators
43
Raster overlay-comparison + logical
44
Raster overlay-conditional
45
Raster overlay-decision table
A decision table expresses
the same logic as a
conditional expression, but
presents it in a different
way.
The decision table will
guide the overlay process.
It lists all possible
combinations of input
values, and the output
values.
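A decision table can be sketched as a lookup keyed by the combination of input values. The land cover and slope classes and the output values below are invented for this example.

```python
# Hypothetical decision table: (land cover, slope class) -> output value.
decision_table = {
    ("forest", "steep"): "protect",
    ("forest", "flat"): "harvest",
    ("grass", "steep"): "graze",
    ("grass", "flat"): "crop",
}

def overlay_with_table(layer_a, layer_b, table):
    """Look up the output value for every pair of input cells."""
    return [[table[(a, b)] for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]

cover = [["forest", "grass"],
         ["grass", "forest"]]
slope = [["steep", "steep"],
         ["flat", "flat"]]

print(overlay_with_table(cover, slope, decision_table))
# [['protect', 'graze'], ['crop', 'harvest']]
```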
46
Raster overlay-decision table
47
Summary
Vector overlay techniques: intersection, clip by and overwrite by.
Clip by is like a cookie cutter, cutting out the map extent of the second
layer.
48
Neighbourhood functions
Neighbourhood functions evaluate the characteristics
of an area surrounding a feature’s location.
Buffer zone generation (or buffering): determines a spatial
envelope (buffer) around (a) given feature(s).
The buffer may have a fixed width, or a variable width.
Interpolation functions: predict unknown values using the
known values at nearby locations.
This typically occurs for continuous fields, like elevation.
49
Buffering
In buffer zone generation,
we select one or more
target locations and
determine the area around
them.
It can be performed on
vector as well as raster
data.
Target locations can be
point, line or polygon in
vector environment.
50
Buffering
It can be simple or
zonated.
With zonated buffer, the
buffer consists of multiple
rings each representing a
different distance.
In vector buffer
generation, the buffer will
be a new polygon in the
output layer.
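For raster data, a zonated buffer can be sketched by measuring the distance from every cell to the nearest target cell. This is an illustrative brute-force version; cells are addressed as (row, column) and distances are taken between cell mid-points.

```python
import math

def zoned_buffer(targets, n_rows, n_cols, resolution, ring_distances):
    """Return a raster with ring numbers: 1 = innermost ring, 0 = outside."""
    rings = sorted(ring_distances)
    out = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            # Distance to the nearest target cell, in map units.
            d = min(math.hypot(r - tr, c - tc) for tr, tc in targets) * resolution
            for zone, limit in enumerate(rings, start=1):
                if d <= limit:
                    out[r][c] = zone
                    break
    return out

# One target cell, a 30 m raster, rings at 30 m and 60 m.
print(zoned_buffer([(0, 0)], 1, 4, 30, [30, 60]))  # [[1, 1, 2, 0]]
```

A simple (single-ring) buffer is the special case with one distance in `ring_distances`.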
51
Buffering
A variable buffer
changes the buffer
distance depending on
feature attributes.
It requires some way
of specifying the
distance.
This is most often done
with an attribute in a
table.
52
Thiessen polygon
Thiessen polygons: divide
an area into polygons, so
that each polygon contains
locations that are closer to
the midpoint than to any
other midpoint.
It will generate a polygon
around each target location
that identifies all those
locations that belong to that
target.
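A raster approximation of Thiessen polygons assigns every cell to its nearest target location. The sketch below is brute force, and a cell at equal distance from two targets goes to the one listed first; both are simplifications.

```python
import math

def thiessen_raster(targets, n_rows, n_cols):
    """targets maps an id to a (row, col) cell; return the nearest id per cell."""
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            nearest = min(
                targets,
                key=lambda t: math.hypot(r - targets[t][0], c - targets[t][1]),
            )
            row.append(nearest)
        out.append(row)
    return out

print(thiessen_raster({"A": (0, 0), "B": (0, 4)}, 1, 5))
# [['A', 'A', 'A', 'B', 'B']]
```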
53
Spread computation
In spread computation the
neighborhood of a target
location not only depends
on distance but also on
direction and differences in
the terrain.
Example:
Determining flooded area
Spreading of pollution etc.
54
Spread computation
Spread computation
involves one or more
target locations (source
locations).
Spread computation also
involves a local resistance
raster, which for each cell
provides a value that
indicates how difficult it is
to pass by that cell.
55
Spread computation
While computing total
resistance, the GIS takes
proper care of correct
spread path lengths.
The spread from a cell
to its neighbour cell to
the east is shorter than
to its northeast
neighbour.
56
Spread computation
The GIS computes the total
minimal resistance raster
for a diagonal neighbour as:
57
Spread computation
Spreading from a source
usually follows the easiest or
shortest route, so the GIS
always determines the
minimal resistance value.
To determine the minimal
resistance, a GIS calculates
the resistance along all possible
paths to reach a cell and
then takes the minimal
value.
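The minimal total resistance raster can be sketched with a shortest-path search over the cells. The step cost used here is an assumption of this example: the mean resistance of the two cells, multiplied by √2 for a diagonal step, with distances expressed in cell sizes. An actual GIS may weight steps differently.

```python
import heapq
import math

def spread(resistance, sources):
    """Return the minimal total resistance from any source to every cell."""
    n_rows, n_cols = len(resistance), len(resistance[0])
    total = [[math.inf] * n_cols for _ in range(n_rows)]
    heap = []
    for r, c in sources:
        total[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        cost, r, c = heapq.heappop(heap)
        if cost > total[r][c]:
            continue  # a cheaper path to this cell was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr == 0 and dc == 0) or not (0 <= nr < n_rows and 0 <= nc < n_cols):
                    continue
                # Diagonal steps are longer than straight steps.
                step = math.sqrt(2) if dr and dc else 1.0
                new_cost = cost + step * (resistance[r][c] + resistance[nr][nc]) / 2
                if new_cost < total[nr][nc]:
                    total[nr][nc] = new_cost
                    heapq.heappush(heap, (new_cost, nr, nc))
    return total

result = spread([[2, 4],
                 [6, 8]], [(0, 0)])
print(result[0][1], result[1][0])  # 3.0 4.0
print(round(result[1][1], 2))      # 7.07
```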
58
Seek computation
Seek computation is used for
phenomena that do not
spread in all directions, but
choose a least-cost path.
A typical example is
the drainage pattern in a
catchment.
Example applications:
Determination of the path of
water flow,
Highway planning etc.
59
Seek computation
Input for a seek
computation is an
elevation raster.
For each cell the steepest
downward slope to a
neighbour cell is
determined.
The direction of this
downward slope is stored
in the flow direction
raster.
60
Seek computation
For each cell first eliminate
all cells that are not
downhill (have higher
elevation value).
Two types of neighbours:
direct neighbours and
diagonal ones.
For each type pick the
steepest.
Compensate for the
difference in path length.
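The steps above can be sketched as follows. Directions are stored as (row step, column step) pairs, which is a choice made for this illustration; GIS packages usually use numeric direction codes. A cell without a lower neighbour gets no direction.

```python
import math

def flow_direction(elevation):
    """Return, per cell, the step to the steepest downhill neighbour or None."""
    n_rows, n_cols = len(elevation), len(elevation[0])
    out = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            best_slope = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr == 0 and dc == 0) or not (0 <= nr < n_rows and 0 <= nc < n_cols):
                        continue
                    drop = elevation[r][c] - elevation[nr][nc]
                    if drop <= 0:
                        continue  # neighbour is not downhill
                    # Compensate for the longer path to a diagonal neighbour.
                    slope = drop / (math.sqrt(2) if dr and dc else 1.0)
                    if slope > best_slope:
                        best_slope = slope
                        out[r][c] = (dr, dc)
    return out

elevation = [[9, 8, 7],
             [8, 6, 4],
             [7, 4, 1]]
directions = flow_direction(elevation)
print(directions[0][0])  # (1, 1): towards the lower right neighbour
print(directions[2][2])  # None: lowest cell, no downhill neighbour
```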
61
Seek computation
From the flow direction
raster the GIS will calculate
the accumulated flow count
raster.
The value of the
accumulated flow for each
cell is the number of cells
that flow into this cell.
Cells with a high
accumulated flow count
represent streams.
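A simple way to sketch the accumulated flow count is to follow the flow path from every cell and add one to each cell it passes through. The input format, (row step, column step) or None per cell, is an assumption of this example, and the directions are assumed never to point off the raster.

```python
def flow_accumulation(direction):
    """Count, for every cell, how many cells drain into it."""
    n_rows, n_cols = len(direction), len(direction[0])
    acc = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            cr, cc = r, c
            steps = 0
            # Follow the path downstream; the step limit guards against loops.
            while direction[cr][cc] is not None and steps < n_rows * n_cols:
                dr, dc = direction[cr][cc]
                cr, cc = cr + dr, cc + dc
                acc[cr][cc] += 1
                steps += 1
    return acc

direction = [
    [(1, 1), (1, 1), (1, 0)],
    [(1, 1), (1, 1), (1, 0)],
    [(0, 1), (0, 1), None],
]
print(flow_accumulation(direction))
# [[0, 0, 0], [0, 1, 2], [0, 2, 8]]
```

The lower right cell receives flow from all eight other cells, so it would be the stream outlet in this small example.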
62
Exercise
Given a flow direction raster layer, calculate the corresponding flow accumulation layer.
63
Exercise …
Evaluate the elevation layer shown below. Which flow direction raster shows the correct direction
for this layer?
Answer 2
64
Summary
65
Network analysis
Network analysis techniques can be characterized by
their use of feature networks.
Feature networks are almost entirely comprised of
linear features.
Types of network analysis:
Finding the best route
Finding the closest facility
Creating an origin-destination cost matrix etc.
Example applications:
Determining stream order
Determining drainage upstream from a location
66
Types of networks
Directed network:
Example: 1-way traffic;
rivers
Undirected network:
Example: 2-way traffic
Planar (2-dimensional):
Example: road network
without overpasses or
underpasses; rivers
67
Types of networks
Non-planar network:
consists of multilevel
crossings, underpasses
and overpasses.
When they are modeled
in 2-D, these overpasses
and underpasses should
be modeled in a special
way.
68
Types of network analysis
The common goal in network analysis is to determine
the way in which goods can be transported
along a set of connected lines.
Optimal path finding: generates a least-cost path on a
network between a pair of predefined locations using both
geometric and attribute data.
Ordered
Unordered
Trace analysis
69
Optimal path finding
Optimal path finding is
used when a least cost
path between two nodes in
a network must be found.
The two nodes are called
origin and destination.
The cost can be based on
attributes in the feature
attribute table, such as
length or travel time.
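Least-cost path finding between an origin and a destination can be sketched with Dijkstra's algorithm, one common technique for this problem (the slides do not name a specific algorithm). The network below is hypothetical; each line is stored with its cost, for example travel time.

```python
import heapq

def least_cost_path(network, origin, destination):
    """network maps a node to a list of (neighbour, cost) pairs."""
    best = {origin: 0.0}
    previous = {}
    heap = [(0.0, origin)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # outdated heap entry
        for neighbour, edge_cost in network.get(node, []):
            new_cost = cost + edge_cost
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))
    if destination not in best:
        return None, float("inf")  # destination cannot be reached
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1], best[destination]

# Hypothetical directed network with travel times as cost.
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
print(least_cost_path(roads, "A", "D"))  # (['A', 'C', 'B', 'D'], 6.0)
```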
70
Optimal path finding
Problems related to optimal
path finding are ordered
and unordered optimal
path finding, in which
extra nodes must be visited
along the path.
Ordered optimal path
finding: the sequence in
which these extra nodes are
visited matters.
Unordered optimal path
finding: the sequence does
not matter.
71
Network partitioning
Network allocation
The purpose is to assign
lines and/or nodes of the
network to a number of
target locations.
The target locations play
the role of service centre
for the network.
For example: medical
treatment, education,
water supply etc.
72
Network partitioning
Trace analysis: used to
determine that part of the
network that is upstream (or
downstream) from a given
target location.
Such connectivity problems
exist in pollution tracing
along river/stream systems,
but also in network failure
chasing in energy
distribution networks.
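Trace analysis can be sketched as a search along a directed network. The river network below is invented; each line is stored as a (from node, to node) pair in the direction of flow.

```python
from collections import deque

def trace(network, target, upstream=True):
    """Return the nodes upstream (or downstream) of the target node."""
    neighbours = {}
    for src, dst in network:
        if upstream:
            neighbours.setdefault(dst, []).append(src)  # walk against the flow
        else:
            neighbours.setdefault(src, []).append(dst)  # walk with the flow
    found = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in found and nxt != target:
                found.add(nxt)
                queue.append(nxt)
    return found

river = [("s1", "j1"), ("s2", "j1"), ("j1", "j2"), ("s3", "j2"), ("j2", "mouth")]
print(sorted(trace(river, "j1")))                  # ['s1', 's2']
print(sorted(trace(river, "j1", upstream=False)))  # ['j2', 'mouth']
```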
73
Summary
74
Error propagation
75
Error propagation
Error propagation analysis:
Testing the accuracy of each state by measurement against the
real world
Modeling error propagation, either analytically or by means of
simulation techniques.
Initially the complexity of spatial data led to the development
of mathematical models describing only the propagation of
attribute errors.
Modern models incorporate both spatial and attribute errors.
76