Chapter 6 PDF

The document discusses various types of spatial data analysis functions in GIS including measurement, selection queries, reclassification, overlay operations, and network analysis. It explains how these functions are used to analyze spatial data, find answers to spatially-relevant questions, and understand patterns and relationships. The document provides details on specific analysis techniques like measuring distances and areas, selecting features based on attributes or topology, and reclassifying data into fewer classes to simplify patterns.

Uploaded by

Dani Ftwi

Spatial Data Analysis

Chapter content
 GIS analysis functions
 Measurement
 Selection queries
 Reclassification
 Dissolving
 Overlay operations
 Buffering
 Thiessen polygon
 Spread computation
 Seek computation
 Network analysis
 Error propagation in spatial data processing

Prepared by Amina Abdelkadir


Introduction
 Spatial data analysis involves the manipulation or calculation of coordinates or attribute variables with various operators (tools).
 Why do we need to conduct analysis?
 To figure out what to do.
 To understand the problem at hand.
 To establish a solution to the problem.
 To optimize our solution.
 To guide our implementation.
 To apply the solution and know its effects before it is actually implemented in the real world.

2
Introduction
 Spatial data analysis aims to find answers to
questions that have spatial relevance. Example:
 Where is the most suitable site for waste disposal?
 Where is the best site for plantation of certain species?
 Steps in analysis:
 Frame the question
 Select your data
 Choose analysis method
 Process the data
 Look at the results

3
Measurement operations
 Only geometric measurements are discussed, no
measurements on attribute values.
 Measurement functions can be:
 Vector measurements
 Raster measurements

 Vector measurements include:


 Location,
 Length,
 Distance and
 Area size

4
Measurement …
 Location → always stored by GIS.
 One coordinate pair for
points.
 List of pairs for lines
and polygons.
 The location of the
centroid of a polygon.

5
Measurement …
 Length is associated
with lines, and with
polygon boundaries.
 It can be stored by the
GIS.
 It can also be computed
on the fly.

6
Measurement …
 Distance between two points
→ Pythagorean distance
function.
 If one or both features are not
a point we will measure the
minimal distance between the
two features.
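
The two cases above can be sketched in Python as follows. This is only an illustration: the function names and the use of plain (x, y) tuples in a planar coordinate system are assumptions, and only the point-to-segment case of "minimal distance between two features" is shown.

```python
import math

def point_distance(p, q):
    """Pythagorean (Euclidean) distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def point_segment_distance(p, a, b):
    """Minimal distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return point_distance(p, a)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    nearest = (ax + t * dx, ay + t * dy)
    return point_distance(p, nearest)

print(point_distance((0, 0), (3, 4)))                  # 5.0
print(point_segment_distance((2, 3), (0, 0), (4, 0)))  # 3.0
```

For a line or polygon boundary made of several segments, the minimal distance would be the smallest of the per-segment distances.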

7
Measurement …
 Raster measurements
include: location, distance
and area size.
 Location of an individual
cell → derived from anchor
point and resolution.
 The cell’s location can be its
lower left corner or
midpoint.
 Resolution/cell size:
 30 x 30 meters
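
A minimal sketch of deriving a cell's location from the anchor point and the resolution. It assumes that the anchor point is the lower left corner of the raster and that rows are counted upward from 0; real systems may anchor at the upper left instead. The function name and the sample coordinates are hypothetical.

```python
def cell_location(anchor_x, anchor_y, resolution, row, col, midpoint=False):
    """Return the (x, y) location of a raster cell.

    Assumes the anchor point is the lower left corner of the raster and
    that rows are counted upward from the bottom, starting at 0.
    """
    x = anchor_x + col * resolution
    y = anchor_y + row * resolution
    if midpoint:
        # Shift from the cell's lower left corner to its midpoint.
        x += resolution / 2
        y += resolution / 2
    return (x, y)

print(cell_location(1000, 2000, 30, row=2, col=3))                 # (1090, 2060)
print(cell_location(1000, 2000, 30, row=2, col=3, midpoint=True))  # (1105.0, 2075.0)
```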

8
Measurement …
 Distance between two cells → standard distance function applied to the locations of their mid-points.
 When a raster is used to represent line features as strings of cells, the length of a line is computed as the sum of the distances between consecutive cells.

9
Measurement …
 Area size → number of cells * area of a single cell.
 When you know the resolution you can calculate the area of a single cell.
 In this example: 30 m x 30 m = 900 m².
 The number of cells is also called the frequency or count.
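
As a small illustration of the area computation, the sketch below counts the cells of each class and multiplies the count by the area of a single 30 x 30 meter cell. The raster values are hypothetical class codes, not data from the slides.

```python
from collections import Counter

CELL_SIZE = 30  # metres, as in the example above

# Hypothetical raster; each value is a class code.
raster = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 2, 2],
]

# Frequency (count) of each class, then area = count * area of one cell.
counts = Counter(value for row in raster for value in row)
areas = {value: count * CELL_SIZE * CELL_SIZE for value, count in counts.items()}
print(areas)  # {1: 2700, 2: 4500, 3: 900}
```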

10
Spatial selection queries
 Interactive selection
 Spatial selection by attribute conditions
 Relational operators
 Logical operators
 Combining attribute conditions

 Spatial selection using topological relationships


 Selecting features that are inside selection objects
 Selecting features that intersect
 Selecting features adjacent to selection objects
 Selecting features based on their distance

11
Interactive selection
 Interactive spatial
selection is a selection
in which you select
features by:
 clicking on the screen
(on the feature to select)
or
 drawing a graphic, to
select all objects within
this graphic.

12
Attribute queries
 Attribute queries use a selection condition on the features' attributes.
 The condition is specified in a query language.
 This query language can be SQL when the data is stored in a relational database.
 Answers questions of the form: Where are the features with…?
13
Attribute queries
 A condition that tests a single criterion is called an atomic condition.
 Atomic conditions use a predicate symbol such as < (less than).
 Any of these symbols is combined with an expression on the left and an expression on the right.

14
Attribute queries
 Logical connectives are used to combine two atomic
conditions to one composite condition.
 Examples of logical connectives are: AND, OR and
NOT.
 AND returns true if both expressions are true.
 OR returns true if one or both of the expressions
is/are true.
 NOT returns true if the expression is false.
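
The connectives above can be tried out on a small attribute table. The sketch below uses plain Python instead of SQL; the table, its field names and the `select` helper are hypothetical.

```python
# Hypothetical feature attribute table: one dict per feature.
parcels = [
    {"id": 1, "land_use": "forest", "area_ha": 120},
    {"id": 2, "land_use": "urban", "area_ha": 40},
    {"id": 3, "land_use": "forest", "area_ha": 30},
    {"id": 4, "land_use": "crop", "area_ha": 75},
]

def select(features, condition):
    """Return the ids of the features for which the condition is true."""
    return [f["id"] for f in features if condition(f)]

# AND: both atomic conditions must be true.
forest_and_large = select(parcels, lambda f: f["land_use"] == "forest" and f["area_ha"] > 50)
# OR: at least one atomic condition must be true.
urban_or_large = select(parcels, lambda f: f["land_use"] == "urban" or f["area_ha"] > 70)
# NOT: true when the atomic condition is false.
not_forest = select(parcels, lambda f: not f["land_use"] == "forest")

print(forest_and_large)  # [1]
print(urban_or_large)    # [1, 2, 4]
print(not_forest)        # [2, 4]
```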

15
Attribute queries

16
Topological relationships
 For spatial selections
using topological
relationships, the steps
carried out are:
 Select one or more
selection objects.
 Apply a chosen spatial
relationship to determine
the features that have that
relationship with the
selection objects.

17
Topological relationships
 Selecting features that are inside selection objects.
 Polygons can contain polygons, lines or points, and
lines can contain lines or points.

18
Topological relationships
 An example of a selection
using the inside
relationship.
 In picture 1 the state of
Georgia is selected, this is
the selection object.
 In picture 2 all the cities
that are inside this state are
selected.

19
Topological relationships
 Selecting features that
intersect.
 In picture 1 the state of
Georgia is selected
 In picture 2, the
interstates that run
through Georgia are
selected (intersect).

20
Topological relationships
 Selecting features
adjacent to selection
objects.
 Adjacency is the same as
the meets relationship.
 Adjacency depends on the
algorithm used.
 Shared line
 Shared node

21
Topological relationships
 Selecting features based on their distance.
 The search can be within a given distance from the selection object, at a given distance, or beyond a given distance.
 Example (bottom): select all the cities within 100 kilometers of Atlanta.
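
A distance-based selection of point features can be sketched as below. The city names and coordinates are made up, and the coordinates are assumed to be in kilometers in a projected (planar) system so that the Pythagorean distance applies.

```python
import math

# Hypothetical city locations as (x, y) in kilometers in a projected system.
cities = {
    "A": (0, 0),      # plays the role of the selection object
    "B": (60, 70),    # about 92 km from A
    "C": (30, 40),    # 50 km from A
    "D": (120, 90),   # 150 km from A
}

def select_by_distance(features, origin, max_distance):
    """Names of the features within max_distance of origin (origin excluded)."""
    ox, oy = features[origin]
    return sorted(
        name
        for name, (x, y) in features.items()
        if name != origin and math.hypot(x - ox, y - oy) <= max_distance
    )

print(select_by_distance(cities, "A", 100))  # ['B', 'C']
```

Selecting "beyond a given distance" would only require reversing the comparison.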

22
Location selection methods

Select by location offers many selection methods:
 Intersect
 Contain
 Are contained by
 Share a line segment
 Touch boundary
 Within a distance
 Are identical
 Are centered in
 More…
[Figure: two examples. Pollution and Counties layers, result: polluted areas completely within selected counties. Sightings and River layers, result: animal sightings within a distance of 2 km from rivers.]
Select by location (spatial query)

Use features in one layer to select features in another

[Figure: Cities and Countries layers. Result: cities intersected by selected countries.]
Reclassification
 It is an assignment of a class or value based on the
attributes or geography of an object.
 It is used to reduce the complexity of a layer in order
to show patterns.
 Reduce the number of classes and eliminate details.
 Two types of classifications:
 user controlled classification and
 automatic classification

25
Reclassification
 User controlled classification
 Classification table
 Two examples of classification tables:
 In the table on the left, the original values are ranges; in the table on the right, the old values were already a classification.

26
Reclassification
 Automatic classification:
 User specifies the number of
output classes.
 Computer decides the class
break points.
 Two methods of determining
the class breaks will be
discussed:
 Equal interval technique
 Equal frequency technique

27
Reclassification
 Equal interval: the width of each class is calculated as:
$$\text{class width} = \frac{V_{\max} - V_{\min}}{n}$$
where:
$V_{\max}$ is the maximum attribute value,
$V_{\min}$ is the minimum attribute value,
$n$ is the number of classes.
 In our example: (10 - 1) / 5 = 1.8 ≈ 2
 Each class will have two values.
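
The equal interval technique can be sketched in a few lines of Python. The function name is hypothetical, and the sketch uses the exact class width of 1.8 instead of the rounded value 2; for the values 1 to 10 and five classes it still puts two values in each class.

```python
def equal_interval_class(value, v_min, v_max, n):
    """Return the class number (1..n) of value using equal-width intervals."""
    width = (v_max - v_min) / n
    index = int((value - v_min) / width)
    return min(index, n - 1) + 1  # the maximum value belongs to the last class

values = list(range(1, 11))  # attribute values 1..10
classes = {v: equal_interval_class(v, 1, 10, 5) for v in values}
print(classes)
# {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 4, 9: 5, 10: 5}
```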

28
Reclassification
 Equal frequency, is
also called quantile.
 Total number of
features / number of
classes (n)
 Create categories
with roughly equal
number of features
(or cells).
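
A minimal sketch of the equal frequency (quantile) technique: the values are ranked, and the ranks are cut into n groups of roughly equal size. The function name and the sample values are assumptions. Note that this simple version may place identical values in different classes, which a real GIS would normally avoid.

```python
def equal_frequency_classes(values, n):
    """Assign each value a class number (1..n) with roughly equal counts."""
    # Indices of the values, ordered from the smallest to the largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n // len(values) + 1
    return classes

values = [2, 35, 7, 1, 90, 12, 4, 60, 3]
quantile_classes = equal_frequency_classes(values, 3)
print(quantile_classes)  # [1, 3, 2, 1, 3, 2, 2, 3, 1]
```

Nine features in three classes gives three features per class, even though the value ranges of the classes (1 to 3, 4 to 12, 35 to 90) differ widely.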

29
Dissolve
 A function whose primary purpose is to combine like
features within a data layer.
 Adjacent polygons may have identical values.
 Dissolve removes or “dissolves away” the common
boundary.
 Used prior to applying area-based selection in spatial
analysis.
 Dissolve is often used after reclassification.

30
Dissolve

31
Overlay
 Standard overlay operators take two input data layers and assume that they:
 are geo-referenced in the same system,
 overlap in study area.
 If either condition is not met, the use of an overlay operator is senseless.
 The principle of spatial overlay is to:
 compare the characteristics of the same location in both data layers,
 produce a new output value for each location.

32
Overlay
 Vector overlay techniques
 Intersection
 Clip by
 Overwrite by

 Raster overlay techniques


 Arithmetic operators
 Comparison and logical operators
 Conditional expressions
 Decision table

33
Vector overlay-intersection
 The standard operator for
two layers of polygons is the
polygon intersection
operator.
 The result of this operator is
the collection of all possible
polygon intersections
 The attribute table combines
the information of the two
input tables (spatial join).

34
Vector overlay-intersection
 Vector overlays are
usually also defined for
point and line data layers.
 When a polygon layer is
intersected by a line layer
the result will be a line
layer (the layer of the
lowest order).

35
Vector overlay-clip
 Clip takes a polygon data layer
and restricts its spatial extent
(the area that it covers) to the
outer boundary of a second
input layer (clip layer).
 No other polygons from the clip
layer play a role in the result.
 This technique can be used to
reduce the area of a thematic
layer to that of the study area.

36
Vector overlay-overwrite
 The polygon overwrite
creates a layer with the
polygons of the first layer
except where polygons exist
in the second layer (as they
take priority)
 This operator can be used to
overwrite a layer with
“updates“ stored in a second
layer.

37
Raster overlay operations
 Raster overlays are mostly
cell by cell computations.
 GISs that support raster
processing have a full
language to express
operations.
 This is called a raster
calculus.
 No geometric calculation.

38
Raster overlay operations
 New cell values are calculated using raster calculus (map algebra).
 Performed on cell-by-cell basis.
 Overview
 Arithmetic overlay operators
 Comparison and logical operators
 Conditional expressions
 Decision table

39
Raster overlay-arithmetic
 Arithmetic operators:
+, -, *, /
 MOD (modulo division)
 DIV (integer division)
 Trigonometric operators:
sin, cos, tan, asin, acos,
atan.
 For example:
Raster2 := Raster1 * 5
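
The cell-by-cell nature of these operators can be illustrated with rasters stored as nested Python lists. The `local_op` helper is hypothetical; it mimics the raster calculus expression above and is not the syntax of any particular GIS.

```python
import operator

def local_op(op, a, b):
    """Apply a binary operator cell by cell; b may be a raster or a constant."""
    if not isinstance(b, list):
        b = [[b] * len(row) for row in a]  # expand a constant to a raster
    return [[op(x, y) for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

raster1 = [[1, 2], [3, 4]]
raster2 = local_op(operator.mul, raster1, 5)         # Raster2 := Raster1 * 5
raster_sum = local_op(operator.add, raster1, raster2)
raster_mod = local_op(operator.mod, raster2, 4)      # MOD
raster_div = local_op(operator.floordiv, raster2, 4) # DIV

print(raster2)     # [[5, 10], [15, 20]]
print(raster_sum)  # [[6, 12], [18, 24]]
print(raster_mod)  # [[1, 2], [3, 0]]
print(raster_div)  # [[1, 2], [3, 5]]
```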

40
Raster overlay-arithmetic

41
Raster overlay-comparison & logical
 Comparison operators:
<, <=, =, >=, >, <>
 C := A<>B is true when the
cell’s value in A differs
from the cell’s value in B. It
is false if they are the same.
 Logical operators:
AND, OR, NOT, XOR (exclusive OR)
 a XOR b is true if either a or b is true, but not both.
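
A short sketch of comparison and logical operators applied cell by cell. The rasters A, B, P and Q hold made-up values, and `cellwise` is a hypothetical helper.

```python
def cellwise(func, a, b):
    """Apply a two-argument function to each pair of corresponding cells."""
    return [[func(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

A = [[1, 2], [3, 4]]
B = [[1, 5], [3, 0]]
C = cellwise(lambda x, y: x != y, A, B)  # C := A <> B

# Two Boolean rasters, for example the results of earlier comparisons.
P = [[True, False], [True, True]]
Q = [[True, True], [False, False]]
p_and_q = cellwise(lambda p, q: p and q, P, Q)
p_or_q = cellwise(lambda p, q: p or q, P, Q)
p_xor_q = cellwise(lambda p, q: p != q, P, Q)  # true if exactly one is true

print(C)        # [[False, True], [False, True]]
print(p_and_q)  # [[True, False], [False, False]]
print(p_or_q)   # [[True, True], [True, True]]
print(p_xor_q)  # [[False, True], [True, True]]
```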
42
Raster overlay-logical operators

43
Raster overlay-comparison + logical

44
Raster overlay-conditional

45
Raster overlay-decision table
 A decision table expresses the same logic as a conditional expression, but presents it in a different way.
 The decision table will
guide the overlay process.
 It lists all possible
combinations of input
values, and the output
values.
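
One way to picture a decision table is as a lookup from each combination of input values to an output value. The layers, class names and output labels below are invented for illustration only.

```python
# Hypothetical input rasters.
land_use = [["forest", "crop"], ["crop", "forest"]]
slope = [["steep", "steep"], ["flat", "flat"]]

# Decision table: every combination of input values with its output value.
decision_table = {
    ("forest", "steep"): "protect",
    ("forest", "flat"): "manage",
    ("crop", "steep"): "erosion risk",
    ("crop", "flat"): "suitable",
}

# The overlay looks up the output for each pair of corresponding cells.
result = [
    [decision_table[(lu, sl)] for lu, sl in zip(lu_row, sl_row)]
    for lu_row, sl_row in zip(land_use, slope)
]
print(result)  # [['protect', 'erosion risk'], ['suitable', 'manage']]
```

The same result could be written as a chain of conditional expressions; the table only makes the combinations easier to read and check.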
46
Raster overlay-decision table

47
Summary
 Vector overlay techniques: intersection, clip by and overwrite by.
 Intersection is the fundamental operator; the attribute table is a spatial join (fields from both input tables).
 Clip by is like a cookie cutter, cutting out the map extent of the second layer.
 Four types of raster overlay techniques: arithmetic operators, comparison and logical operators, conditional expressions and decision table.
 Comparison and logical operators only evaluate to true or false.
 Conditional expressions and decision tables lead to the same result.

48
Neighbourhood functions
 Neighbourhood functions evaluate the characteristics
of an area surrounding a feature’s location.
 Buffer zone generation (or buffering): determines a spatial
envelope (buffer) around (a) given feature(s).
 The buffer may have a fixed width, or a variable width.
 Interpolation functions: predict unknown values using the
known values at nearby locations.
 This typically occurs for continuous fields, like elevation.

49
Buffering
 In buffer zone generation,
we select one or more
target locations and
determine the area around
them.
 It can be performed on
vector as well as raster
data.
 Target locations can be
point, line or polygon in
vector environment.

50
Buffering
 It can be simple or
zonated.
 With zonated buffer, the
buffer consists of multiple
rings each representing a
different distance.
 In vector buffer
generation, the buffer will
be a new polygon in the
output layer.

51
Buffering
 A variable buffer
changes the buffer
distance depending on
feature attributes.
 It requires some way
of specifying the
distance.
 This is most often done
with an attribute in a
table.

52
Thiessen polygon
 Thiessen polygons divide an area into polygons so that each polygon contains the locations that are closer to its midpoint (the target location) than to any other midpoint.
 The operation generates a polygon around each target location that identifies all the locations that belong to that target.
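
The idea can be approximated on a raster by labelling every cell with its nearest target location. This is a simplified raster stand-in for true (vector) Thiessen polygons; the function name and the target positions are assumptions.

```python
import math

def thiessen_raster(n_rows, n_cols, targets):
    """Label each cell with the name of its nearest target.

    targets maps a name to the (row, col) of a target cell; distances are
    measured between cell midpoints, in cell units.
    """
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            nearest = min(
                targets,
                key=lambda name: math.hypot(r - targets[name][0], c - targets[name][1]),
            )
            row.append(nearest)
        grid.append(row)
    return grid

grid = thiessen_raster(3, 4, {"A": (1, 0), "B": (1, 3)})
for row in grid:
    print("".join(row))  # each row prints AABB
```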
53
Spread computation
 In spread computation the
neighborhood of a target
location not only depends
on distance but also on
direction and differences in
the terrain.
 Example:
 Determining flooded area
 Spreading of pollution etc.

54
Spread computation
 Spread computation involves one or more target locations (source locations).
 Spread computation also involves a local resistance raster, which for each cell provides a value that indicates how difficult it is to pass through that cell.

55
Spread computation
 While computing total resistance, the GIS takes proper care of correct spread path lengths.
 The spread from a cell to its neighbour to the east is shorter than the spread to its northeast neighbour.

56
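Spread computation
 The GIS computes the total minimal resistance for a diagonal neighbour (here, the neighbour to the northeast) as:
$$\left(\tfrac{1}{2}\,r_{\text{source}} + \tfrac{1}{2}\,r_{\text{NE}}\right) \times \sqrt{2}$$
 That is, half the resistance value $r_{\text{source}}$ of the source cell plus half the resistance value $r_{\text{NE}}$ of the cell to the northeast, with the sum multiplied by the square root of 2.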

57
Spread computation
 Spreading from a source usually follows the easiest or shortest route, so the minimal total resistance is always the value that is determined.
 To determine the minimal resistance, a GIS calculates the resistance along all possible paths to reach a cell and then takes the minimal value.
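
A minimal sketch of a minimal total resistance computation. It uses Dijkstra's algorithm, which is a choice made here for illustration; the slides do not say which algorithm a GIS uses. A step between two cells is assumed to cost the mean of their resistance values, multiplied by the square root of 2 for a diagonal step. The resistance values are made up.

```python
import heapq
import math

def minimal_total_resistance(resistance, sources):
    """Minimal total resistance from any source cell to every cell.

    A step between two cells costs the mean of their resistance values,
    multiplied by sqrt(2) when the step is diagonal.
    """
    n_rows, n_cols = len(resistance), len(resistance[0])
    total = [[math.inf] * n_cols for _ in range(n_rows)]
    queue = []
    for r, c in sources:
        total[r][c] = 0.0
        heapq.heappush(queue, (0.0, r, c))
    while queue:
        cost, r, c = heapq.heappop(queue)
        if cost > total[r][c]:
            continue  # a cheaper path to this cell was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < n_rows and 0 <= nc < n_cols):
                    continue
                length = math.sqrt(2) if dr and dc else 1.0
                step = length * (resistance[r][c] + resistance[nr][nc]) / 2
                if cost + step < total[nr][nc]:
                    total[nr][nc] = cost + step
                    heapq.heappush(queue, (cost + step, nr, nc))
    return total

resistance = [
    [1, 1, 5],
    [1, 9, 5],
    [1, 1, 1],
]
total = minimal_total_resistance(resistance, [(0, 0)])
for row in total:
    print([round(v, 2) for v in row])
```

In this example the centre cell (resistance 9) is reached more cheaply by a straight step from a neighbour than by the diagonal step from the source, which shows why all possible paths have to be compared.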
58
Seek computation
 Seek computation is used for phenomena that do not spread in all directions but choose a least-cost path.
 A typical example is the drainage pattern in a catchment.
 Example applications:
 Determination of the path of
water flow,
 Highway planning etc.
59
Seek computation
 Input for a seek
computation is an
elevation raster.
 For each cell the steepest
downward slope to a
neighbour cell is
determined.
 The direction of this
downward slope is stored
in the flow direction
raster.
60
Seek computation
 For each cell, first eliminate all neighbouring cells that are not downhill (those with a higher elevation value).
 Two types of neighbours: direct neighbours and diagonal ones.
 For each type pick the steepest.
 Compensate for the difference in path length.
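
The steps above can be sketched as follows. The sketch compares all eight neighbours in one pass and divides the elevation drop to a diagonal neighbour by the square root of 2, which gives the same result as picking the steepest of each type and then compensating. The elevation values and the encoding of a direction as a (row step, column step) pair are assumptions.

```python
import math

def flow_direction(elevation):
    """For each cell, return the (d_row, d_col) step to its steepest downhill neighbour.

    Cells without a lower neighbour get None. Distances are in cell units.
    """
    n_rows, n_cols = len(elevation), len(elevation[0])
    directions = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            best_slope = 0.0  # only strictly downhill neighbours qualify
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr, dc) == (0, 0) or not (0 <= nr < n_rows and 0 <= nc < n_cols):
                        continue
                    drop = elevation[r][c] - elevation[nr][nc]
                    # Compensate for the longer path to a diagonal neighbour.
                    slope = drop / (math.sqrt(2) if dr and dc else 1.0)
                    if slope > best_slope:
                        best_slope = slope
                        directions[r][c] = (dr, dc)
    return directions

elevation = [
    [9, 8, 7],
    [8, 6, 4],
    [7, 5, 1],
]
directions = flow_direction(elevation)
print(directions[0][0])  # (1, 1)
print(directions[1][1])  # (1, 1)
print(directions[2][2])  # None
```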
61
Seek computation
 From the flow direction
raster the GIS will calculate
the accumulated flow count
raster.
 The value of the
accumulated flow for each
cell is the number of cells
that flow into this cell.
 Cells with a high
accumulated flow count
represent streams.
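
A minimal sketch of deriving the accumulated flow count from a flow direction raster. Each cell's flow path is followed downstream, and every cell on that path is incremented. The direction encoding as (row step, column step) pairs and the sample raster are assumptions, and the directions are assumed to contain no cycles.

```python
def flow_accumulation(directions):
    """Count, for each cell, the number of cells that flow into it.

    directions holds a (d_row, d_col) step per cell, or None for cells
    without outflow. The flow directions are assumed to contain no cycles.
    """
    n_rows, n_cols = len(directions), len(directions[0])
    count = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            # Follow the flow path downstream and add this cell to every
            # cell that the path passes through.
            cr, cc = r, c
            while directions[cr][cc] is not None:
                dr, dc = directions[cr][cc]
                cr, cc = cr + dr, cc + dc
                count[cr][cc] += 1
    return count

E, S, SE = (0, 1), (1, 0), (1, 1)  # east, south, southeast
flow_dirs = [
    [SE, S, S],
    [E, SE, S],
    [E, E, None],
]
accumulation = flow_accumulation(flow_dirs)
for row in accumulation:
    print(row)
# [0, 0, 0]
# [0, 3, 1]
# [0, 1, 8]
```

The lower right cell receives flow from all eight other cells, so it would be the outlet of this small catchment.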
62
Exercise
See a Flow direction raster layer. Calculate the corresponding flow accumulation layer.

Corresponding accumulation raster

63
Exercise…..

Evaluate the elevation layer shown below. Which flow direction raster shows the correct direction for this layer?

Answer 2

64
Summary

65
Network analysis
 Network analysis techniques can be characterized by their use of feature networks.
 Feature networks are almost entirely composed of linear features.
 Types of network analysis:
 Finding the best route
 Finding the closest facility
 Creating an origin-destination cost matrix etc.
 Example applications:
 Determining stream order
 Determining drainage upstream from a location
66
Types of networks
 Directed network:
 Example: 1-way traffic;
rivers
 Undirected network:
 Example: 2-way traffic
 Planar (2-dimensional):
 Example: road network
without overpasses or
underpasses; rivers
67
Types of networks
 Non-planar network:
consists of multilevel
crossings, underpasses
and overpasses.
 When they are modeled
in 2-D, these overpasses
and underpasses should
be modeled in a special
way.

68
Types of network analysis
 The common goal in network analysis is to determine how goods can be transported along a set of connected lines.
 Optimal path finding: generates a least-cost path on a network between a pair of predefined locations, using both geometric and attribute data.
 Ordered
 Unordered
 Network partitioning: assigns network elements (nodes or line segments) to different locations using predefined criteria.
 Network allocation
 Trace analysis
69
Optimal path finding
 Optimal path finding is used when a least-cost path between two nodes in a network must be found.
 The two nodes are called origin and destination.
 The cost can be taken from attributes in the feature attribute table, such as length or travel time.
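
A least-cost path between an origin and a destination can be found with Dijkstra's algorithm, sketched below. The choice of algorithm, the network and the travel times are assumptions made for illustration; the slides do not prescribe a method.

```python
import heapq

def least_cost_path(network, origin, destination):
    """Return (total_cost, path) for the least-cost path, or (None, []) if unreachable.

    network maps each node to a dict {neighbour: cost}.
    """
    queue = [(0, origin, [origin])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, edge_cost in network.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + edge_cost, neighbour, path + [neighbour]))
    return None, []

# Hypothetical undirected road network; costs are travel times in minutes.
roads = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 1, "D": 5},
    "C": {"A": 2, "B": 1, "D": 8},
    "D": {"B": 5, "C": 8},
}
print(least_cost_path(roads, "A", "D"))  # (8, ['A', 'C', 'B', 'D'])
```

A directed network (for example one-way streets) could be represented by listing an edge only under the node it starts from.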

70
Optimal path finding
 A related problem arises when additional nodes must be visited along the path; it comes in an ordered and an unordered variant.
 Ordered optimal path finding: the sequence in which these extra nodes are visited matters.
 Unordered optimal path finding: the sequence does not matter.
71
Network partitioning
Network allocation
 The purpose is to assign
lines and/or nodes of the
network to a number of
target locations.
 The target locations play
the role of service centre
for the network.
 For example: medical
treatment, education,
water supply etc.
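
Network allocation can be sketched as a search that starts from all service centres at once and assigns each node to the centre that reaches it at the lowest cost. This is one possible approach under simple assumptions (no capacity limits on the centres); the street network, node names and costs are hypothetical.

```python
import heapq

def allocate(network, centres):
    """Assign every reachable node to the service centre with the lowest cost.

    Returns a dict {node: (centre, cost)}.
    """
    assigned = {}
    queue = [(0, centre, centre) for centre in centres]
    heapq.heapify(queue)
    while queue:
        cost, node, centre = heapq.heappop(queue)
        if node in assigned:
            continue  # already claimed by a centre at equal or lower cost
        assigned[node] = (centre, cost)
        for neighbour, edge_cost in network.get(node, {}).items():
            if neighbour not in assigned:
                heapq.heappush(queue, (cost + edge_cost, neighbour, centre))
    return assigned

# Hypothetical street network; H1 and H2 are health centres.
streets = {
    "H1": {"a": 2},
    "a": {"H1": 2, "b": 3},
    "b": {"a": 3, "c": 2},
    "c": {"b": 2, "H2": 1},
    "H2": {"c": 1},
}
allocation = allocate(streets, ["H1", "H2"])
print(allocation["a"])  # ('H1', 2)
print(allocation["b"])  # ('H2', 3)
print(allocation["c"])  # ('H2', 1)
```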

72
Network partitioning
 Trace analysis: used to
determine that part of the
network that is upstream (or
downstream) from a given
target location.
 Such connectivity problems
exist in pollution tracing
along river/stream systems,
but also in network failure
chasing in energy
distribution networks.
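
An upstream trace on a directed network can be sketched as a breadth-first search against the flow direction. The river network below and its node names are made up for illustration.

```python
from collections import deque

def trace_upstream(flows_to, target):
    """Return all nodes from which flow can reach target.

    flows_to maps each node to the list of nodes directly downstream of it.
    """
    # Reverse the directed network so it can be walked against the flow.
    flows_from = {}
    for node, downstream in flows_to.items():
        for d in downstream:
            flows_from.setdefault(d, []).append(node)
    upstream = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for source in flows_from.get(node, []):
            if source not in upstream:
                upstream.add(source)
                queue.append(source)
    return upstream

# Hypothetical river network: each node flows to the next node downstream.
river = {"s1": ["j1"], "s2": ["j1"], "j1": ["j2"], "s3": ["j2"], "j2": ["outlet"]}
print(sorted(trace_upstream(river, "j1")))  # ['s1', 's2']
print(sorted(trace_upstream(river, "j2")))  # ['j1', 's1', 's2', 's3']
```

A downstream trace works the same way on the original, unreversed network.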
73
Summary

74
Error propagation
How errors propagate:
 Errors already present in the input data will propagate through the manipulations.
 New errors arise from the computer processing (the analytical operations performed).

75
Error propagation
Error propagation analysis:
 Testing the accuracy of each state by measurement against the
real world
 Modeling error propagation, either analytically or by means of
simulation techniques.
 Initially the complexity of spatial data led to the development
of mathematical models describing only the propagation of
attribute errors.
 Modern models incorporate both spatial and attribute errors.

76