Geostatistical Analysis
Course Code: GeES 612 Mapping Patterns
Compiled By
Dr. Zubairul Islam
Associate Professor
GIS and Remote Sensing
Department of Geography and Environmental Sciences
Adigrat University, Ethiopia
E-mail: [email protected] , Contact no.: +251-967490505
Deterministic methods
Deterministic techniques have parameters that control either (1) the extent of similarity (for example, inverse
distance weighted) of the values or (2) the degree of smoothing (for example, radial basis functions) in the
surface. These techniques are not based on a random spatial process model, and there is no explicit
measurement or modeling of spatial autocorrelation in the data.
The Geostatistical Wizard offers several types of kriging, which are suitable for different types of data and
have different underlying assumptions:
● Ordinary
● Simple
● Universal
● Indicator
● Probability
● Disjunctive
● Areal interpolation
● Empirical Bayesian
These methods can be used to produce prediction, prediction standard error, probability, and quantile surfaces.
Inverse distance weighted (IDW) interpolation explicitly makes the assumption that
things that are close to one another are more alike than those that are farther
apart. To predict a value for any unmeasured location, IDW uses the measured
values surrounding the prediction location. The measured values closest to the
prediction location have more influence on the predicted value than those farther
away. IDW assumes that each measured point has a local influence that
diminishes with distance. It gives greater weights to points closest to the prediction
location, and the weights diminish as a function of distance, hence the name
inverse distance weighted.
Weights assigned to data points are illustrated in the following example.
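A minimal NumPy sketch of this weighting scheme, assuming straight-line (Euclidean) distances; the function name `idw_predict` and its arguments are illustrative, not part of any particular GIS API:

```python
import numpy as np

def idw_predict(xy, values, x0, power=2.0):
    """Inverse-distance-weighted prediction at location x0 from measured
    points xy (n x 2 array) with their values (length-n array)."""
    d = np.linalg.norm(xy - x0, axis=1)
    hit = d == 0
    if hit.any():                      # exact interpolator: at a sample
        return float(values[hit][0])   # point, return the sample value
    w = 1.0 / d ** power               # weights fall off with distance
    w /= w.sum()                       # normalize weights to sum to 1
    return float(np.dot(w, values))

# Three measured points; the nearest one dominates the prediction.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
vals = np.array([10.0, 20.0, 30.0])
print(idw_predict(pts, vals, np.array([0.1, 0.1])))  # close to 10
```

Raising `power` makes the weights fall off more steeply, so the prediction follows the nearest samples more closely.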
Once a neighborhood shape has been specified, you can restrict which data locations within the shape should
be used. You can define the maximum and minimum number of locations to use, and you can divide the
neighborhood into sectors. If you divide the neighborhood into sectors, the maximum and minimum constraints
will be applied to each sector. Several sector configurations are available, such as one sector, four sectors, four sectors offset by 45 degrees, or eight sectors.
The points highlighted in the data view show the
locations and the weights that will be used for
predicting a location at the center of the ellipse
(the location of the crosshair). The search
neighborhood is limited to the interior of the
ellipse. In this example, the two red points will be given weights of more than 10 percent. In the eastern sector, one point (brown)
will be given a weight between 5 percent and 10
percent. The rest of the points in the search
neighborhood will receive lower weights.
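A sectored circular search neighborhood of this kind can be sketched as follows; the circular shape, four-sector layout, and per-sector limits are illustrative assumptions rather than a specific product's implementation:

```python
import numpy as np

def sector_neighbors(xy, center, radius, n_sectors=4, max_per_sector=2):
    """Indices of the points a sectored circular search neighborhood
    would use: within 'radius' of 'center', keeping at most
    'max_per_sector' nearest points in each angular sector."""
    d = np.linalg.norm(xy - center, axis=1)
    ang = np.arctan2(xy[:, 1] - center[1], xy[:, 0] - center[0])
    sector = ((ang + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    chosen = []
    for s in range(n_sectors):
        in_s = np.where((sector == s) & (d <= radius) & (d > 0))[0]
        chosen.extend(in_s[np.argsort(d[in_s])][:max_per_sector])
    return sorted(int(i) for i in chosen)

pts = np.array([[1.0, 0], [0.5, 0], [0.6, 0], [0.7, 0], [0, 1.0], [3.0, 0]])
# The four points along the positive x-axis share one sector, so only the
# two nearest are kept; the point at (3, 0) falls outside the radius.
print(sector_neighbors(pts, np.array([0.0, 0.0]), 2.0))  # [1, 2, 4]
```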
When to use IDW
A surface calculated using IDW depends on the selection of the power value (p) and the search
neighborhood strategy. IDW is an exact interpolator: the maximum and minimum values in the interpolated surface can only occur at sample points.
The output surface is sensitive to clustering and the presence of outliers. IDW assumes that the
phenomenon being modeled is driven by local variation, which can be captured (modeled) by defining an
adequate search neighborhood. Since IDW does not provide prediction standard errors, justifying the use of
this model may be problematic.
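The role of the power value p can be illustrated with a tiny two-point sketch (a hypothetical helper, not a library call): as p grows, the nearer sample dominates more strongly, and every prediction stays within the range of the measured values, which is why the surface's extremes can only occur at sample points.

```python
import numpy as np

def idw(xy, v, x0, p):
    """Bare IDW prediction at x0 (assumes x0 is not a sample location)."""
    d = np.linalg.norm(xy - x0, axis=1)
    w = 1.0 / d ** p
    return float(np.dot(w, v) / w.sum())

pts = np.array([[0.0, 0.0], [2.0, 0.0]])
vals = np.array([0.0, 100.0])
x0 = np.array([0.5, 0.0])            # three times closer to the 0-valued point
for p in (1, 2, 4):
    print(p, idw(pts, vals, x0, p))  # prediction sinks toward 0 as p grows
```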
Topic 3.2 - Mapping Pattern
Identifying geographic patterns is important for understanding how geographic phenomena
behave.
Although you can get a sense of the overall pattern of features and their associated values
by mapping them, calculating a statistic quantifies the pattern. This makes it easier to
compare patterns for different distributions or different time periods. Often the tools in the
Analyzing Patterns toolset are a starting point for more in-depth analyses. Using the
Incremental Spatial Autocorrelation tool to identify distances where the processes promoting
spatial clustering are most pronounced, for example, might help you select an appropriate
distance (scale of analysis) to use for investigating hot spots (Hot Spot Analysis).
The tools in the Analyzing Patterns toolset are inferential statistics; they start with the null
hypothesis that your features, or the values associated with your features, exhibit a spatially
random pattern. They then compute a p-value representing the probability that the null
hypothesis is correct (that the observed pattern is simply one of many possible versions of
complete spatial randomness). Calculating a probability may be important if you need to have a
high level of confidence in a particular decision. If there are public safety or legal implications
associated with your decision, for example, you may need to justify your decision using
statistical evidence.
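The null-hypothesis logic can be made concrete with a Monte Carlo sketch (the toolset itself computes p-values analytically, not by simulation): generate many complete-spatial-randomness point sets and ask how often they look at least as clustered as the observed one. The extent, statistic, and simulation count below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_nn_distance(xy):
    """Mean distance from each point to its nearest neighbor."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)      # ignore zero self-distances
    return d.min(axis=1).mean()

def csr_p_value(xy, extent=100.0, n_sims=999):
    """One-sided pseudo p-value for clustering: the share of random (CSR)
    point sets in an extent x extent square whose mean nearest-neighbor
    distance is at least as small as the observed one."""
    observed = mean_nn_distance(xy)
    hits = sum(mean_nn_distance(rng.uniform(0, extent, size=xy.shape)) <= observed
               for _ in range(n_sims))
    return (1 + hits) / (n_sims + 1)

clustered = rng.normal(50, 1, size=(20, 2))  # a tight blob in a 100 x 100 area
print(csr_p_value(clustered))                # small p: reject randomness
```

A small p-value means the observed pattern would be very unlikely under complete spatial randomness, which is exactly the evidence the text describes.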
The Analyzing Patterns tools provide statistics that quantify broad spatial patterns. These tools
answer questions such as, "Are the features in the dataset, or the values associated with the
features in the dataset, spatially clustered?" and "Is the clustering becoming more or less
intense over time?" The following table lists the tools available and provides a brief description
of each.
Tool Description
Average Nearest Neighbor: Calculates a nearest neighbor index based on the average distance from each feature to its nearest neighboring feature.
High/Low Clustering (Getis-Ord General G): Measures the degree of clustering for either high values or low values using the Getis-Ord General G statistic.
Incremental Spatial Autocorrelation: Measures spatial autocorrelation for a series of distances and optionally creates a line graph of those distances and their corresponding z-scores. Z-scores reflect the intensity of spatial clustering, and statistically significant peak z-scores indicate distances where spatial processes promoting clustering are most pronounced. These peak distances are often appropriate values to use for tools with a Distance Band or Distance Radius parameter.
Spatial Autocorrelation (Global Moran's I): Measures spatial autocorrelation based on feature locations and attribute values using the Global Moran's I statistic.
Multi-Distance Spatial Cluster Analysis (Ripley's K-function): Determines whether features, or the values associated with features, exhibit statistically significant clustering or dispersion over a range of distances.
Average Nearest Neighbor
Introduction
Calculates a nearest neighbor index based on the average distance from each feature to
its nearest neighboring feature.
Uses
The Average Nearest Neighbor tool returns five values: Observed Mean Distance,
Expected Mean Distance, Nearest Neighbor Index, z-score, and p-value. These
values are accessible from the Results window and are also passed as derived
output values for potential use in models or scripts. Optionally, this tool will create
an HTML file with a graphical summary of results. Double-clicking on the HTML
entry in the Results window will open the HTML file in the default Internet browser.
Right-clicking on the Messages entry in the Results window and selecting View
will display the results in a Message dialog box.
● The z-score and p-value results are measures of statistical significance which tell you
whether or not to reject the null hypothesis. Note, however, that the statistical significance
for this method is strongly impacted by study area size (see below). For the Average
Nearest Neighbor statistic, the null hypothesis states that features are randomly distributed.
● The Nearest Neighbor Index is expressed as the ratio of the Observed Mean Distance to
the Expected Mean Distance. The expected distance is the average distance between
neighbors in a hypothetical random distribution. If the index is less than 1, the pattern
exhibits clustering; if the index is greater than 1, the trend is toward dispersion or
competition.
● The average nearest neighbor method is very sensitive to the Area value (small changes in
the Area parameter value can result in considerable changes in the z-score and p-value
results). Consequently, the Average Nearest Neighbor tool is most effective for comparing
different features in a fixed study area. The picture below is a classic example of how
identical feature distributions can be dispersed or clustered depending on the study area
specified.
● If an Area parameter value is not specified, then the area of the minimum enclosing
rectangle around the input features is used. Unlike the extent, a minimum enclosing
rectangle will not necessarily align with the x- and y-axes.
● When the Input Feature Class is not projected (that is, when coordinates are given in
degrees, minutes, and seconds) or when the output coordinate system is set to a
Geographic Coordinate System, distances are computed using chordal measurements.
Chordal distance measurements are used because they can be computed quickly and
provide very good estimates of true geodesic distances, at least for points within
about thirty degrees of each other. Chordal distances are based on an oblate spheroid.
Given any two points on the earth's surface, the chordal distance between them is the length of the straight line that passes through the three-dimensional earth to connect them. Chordal distances are reported in meters.
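A sketch of the chordal computation: convert each point to Earth-centered Cartesian (ECEF) coordinates on a spheroid, then take the straight-line distance. WGS84 constants are assumed here, since the text does not name a specific spheroid.

```python
import math

def ecef(lat_deg, lon_deg, a=6378137.0, f=1 / 298.257223563):
    """Geodetic latitude/longitude (degrees, height 0) to Earth-centered
    Cartesian coordinates in meters on the WGS84 spheroid."""
    e2 = f * (2 - f)                       # first eccentricity squared
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    return (n * math.cos(lat) * math.cos(lon),
            n * math.cos(lat) * math.sin(lon),
            n * (1 - e2) * math.sin(lat))

def chordal_distance(p1, p2):
    """Length of the straight line through the earth between two points."""
    return math.dist(ecef(*p1), ecef(*p2))

# One degree of latitude: the chord is only slightly shorter than the
# roughly 110.6 km surface distance at this short range.
print(chordal_distance((0.0, 0.0), (1.0, 0.0)))
```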
How Average Nearest Neighbor works
The Average Nearest Neighbor tool measures the distance between each feature
centroid and its nearest neighbor's centroid location. It then averages all these nearest
neighbor distances. If the average distance is less than the average for a hypothetical
random distribution, the distribution of the features being analyzed is considered
clustered. If the average distance is greater than a hypothetical random distribution, the
features are considered dispersed. The average nearest neighbor ratio is calculated as
the observed average distance divided by the expected average distance (with expected
average distance being based on a hypothetical random distribution with the same
number of features covering the same total area).
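That calculation can be sketched directly. The expected mean distance for n random points in an area A is the standard Clark-Evans expression 0.5 / sqrt(n / A); the example points and study area below are illustrative.

```python
import numpy as np

def average_nearest_neighbor(xy, area):
    """Nearest Neighbor Index: observed mean nearest-neighbor distance
    divided by the mean expected for n random points in the given area."""
    n = len(xy)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # exclude zero self-distances
    observed = d.min(axis=1).mean()      # mean nearest-neighbor distance
    expected = 0.5 / np.sqrt(n / area)   # complete spatial randomness
    return observed / expected           # < 1 clustered, > 1 dispersed

area = 100 * 100
cluster = np.array([[50.0, 50], [50.5, 50], [50, 50.5], [50.5, 50.5]])
grid = np.array([[25.0, 25], [25, 75], [75, 25], [75, 75]])
print(average_nearest_neighbor(cluster, area))  # 0.02: strongly clustered
print(average_nearest_neighbor(grid, area))     # 2.0: dispersed
```

The same four points give opposite verdicts in different study areas, which is the Area sensitivity noted above.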
Interpretation
If the index (average nearest neighbor ratio) is less than 1, the pattern exhibits clustering. If the index is greater than 1, the trend is toward dispersion.
Thanks