
Lecture 3

Advanced GIS and Remote Sensing Lecture Notes 3

Uploaded by

kassaye hussien
Copyright
© All Rights Reserved

Spatial Prediction: Geostatistical Methods

Lecture 3
Limitations of Non-linear Methods
 Non-linear methods: Artificial Neural Networks,
Decision Trees, Expert Systems...
 Non-linear methods are increasingly used to model
and map continuous spatial properties.
◦ These can use more ancillary variables than explicitly
spatial methods.
 However, they are usually assessed using non-spatial
global error measures, which:
◦ Summarize many data points in a single value
◦ Cannot easily identify where the model is correct
What is Spatial Interpolation?
 Spatial prediction is the process of estimating the
value of (quantitative) properties at unvisited sites
within the area covered by existing observations.
• Predict values at unknown locations using values at
measured locations
• Assumes spatial autocorrelation
 Interpolation methods can be classified in several ways;
broadly, they fall into two groups according to the amount
of statistical analysis included:
◦ Deterministic interpolation models
◦ Statistical (geostatistical) models
• Many interpolation methods exist, both statistical and
non-statistical.
Cont…
 There are two main groupings of
interpolation techniques:
◦ Deterministic Interpolation Techniques
─ create surfaces from measured points, based on either
the extent of similarity (inverse distance weighted) or
the degree of smoothing (radial basis functions).
◦ Geostatistical interpolation techniques
(kriging)
─ utilize the statistical properties of the measured points.
─ quantify the spatial autocorrelation among measured
points and account for the spatial configuration of the
sample points around the prediction location.
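As a concrete instance of a deterministic technique, inverse distance weighting can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name idw_predict, the power parameter (set to 2 here), and the toy coordinates are assumptions made for the example.

```python
import numpy as np

def idw_predict(xy_obs, z_obs, xy_new, power=2.0):
    """Inverse distance weighted prediction at each row of xy_new."""
    xy_obs = np.asarray(xy_obs, dtype=float)
    z_obs = np.asarray(z_obs, dtype=float)
    xy_new = np.asarray(xy_new, dtype=float)
    # Distance matrix: one row per prediction point, one column per sample.
    d = np.linalg.norm(xy_new[:, None, :] - xy_obs[None, :, :], axis=2)
    preds = np.empty(len(xy_new))
    for j, row in enumerate(d):
        hit = row == 0.0
        if hit.any():
            # Prediction point coincides with a sample: return the measured value.
            preds[j] = z_obs[hit][0]
        else:
            w = 1.0 / row ** power          # closer samples get larger weights
            preds[j] = np.sum(w * z_obs) / np.sum(w)
    return preds

# Two hypothetical samples; predict midway between them and at the first sample.
print(idw_predict([[0.0, 0.0], [2.0, 0.0]], [10.0, 20.0],
                  [[1.0, 0.0], [0.0, 0.0]]))
```

Note that the weights depend only on distance. Nothing here measures how strongly the data are actually autocorrelated, and no prediction error is returned, which is the gap the geostatistical techniques address.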
Why Geostatistical Methods?
 Deterministic interpolation methods
◦ Not based on statistical theory
◦ Cannot estimate prediction error; assumptions
are unclear
◦ Examples: IDW, Spline, Natural Neighbor, Trend
 Geostatistical methods
◦ Predictions based on statistical principles and
theory
◦ Clear assumptions that can be checked
◦ Provide measures of uncertainty for predictions
3.1. Geostatistical: Kriging Method
 Kriging is the most widely used geostatistical method
◦ Many types of kriging: ordinary, simple, universal,
indicator kriging, etc.
 Ordinary kriging is one of the basic kriging
methods. It provides estimates at unobserved
locations, taking into account the spatial
autocorrelation between observed points.
 Extensions of ordinary kriging
◦ Co-kriging
◦ Regression kriging
◦ Geographically weighted regression
3.1.1 Ordinary Kriging (OK)
 Ordinary kriging estimates unknown values within
finite neighborhoods, taking into account the shape,
size, and spatial position of the sampled areas along
with their spatial relationship to the unsampled
locations.
 The variation function (variogram) provides the
structural information that is used to estimate the spatial
distribution of target variables.
 It is a weighting function required to interpolate
the target (primary) variable between observed points,
and can be represented graphically by a variogram.
OK…
 The variogram plots semivariance γ as a function of the
distance between samples, known as lag distance h, and
is defined as:

$$\gamma(h) = \frac{1}{2\,n(h)} \sum_{i=1}^{n(h)} \left[ z(x_i) - z(x_i + h) \right]^2$$

 Where
◦ γ(h) is the semivariance as a function of lag distance h,
◦ n(h) is the number of pairs of data locations separated by h, and
◦ z(xi) and z(xi + h) are the data values at locations xi and (xi + h).
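The semivariance definition can be turned into a short computation. The sketch below is illustrative only: because irregularly spaced samples rarely sit exactly h apart, it groups pairs into lag bins, which is an assumption added here. The function name empirical_semivariogram and the lag_edges argument are likewise hypothetical.

```python
import numpy as np

def empirical_semivariogram(xy, z, lag_edges):
    """Return lag-bin centres, semivariance per bin, and pair counts n(h)."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)           # every pair counted once
    dist = np.linalg.norm(xy[i] - xy[j], axis=1)  # separation of each pair
    sqdiff = (z[i] - z[j]) ** 2                   # squared value difference
    lag_edges = np.asarray(lag_edges, dtype=float)
    centres = 0.5 * (lag_edges[:-1] + lag_edges[1:])
    gamma = np.full(len(centres), np.nan)         # NaN marks empty bins
    npairs = np.zeros(len(centres), dtype=int)
    for b in range(len(centres)):
        in_bin = (dist > lag_edges[b]) & (dist <= lag_edges[b + 1])
        npairs[b] = in_bin.sum()
        if npairs[b] > 0:
            # gamma(h) = sum of squared differences / (2 * n(h))
            gamma[b] = sqdiff[in_bin].sum() / (2.0 * npairs[b])
    return centres, gamma, npairs

# Four hypothetical samples on a transect, one unit apart.
xy = [[0, 0], [1, 0], [2, 0], [3, 0]]
z = [1.0, 2.0, 4.0, 7.0]
centres, gamma, npairs = empirical_semivariogram(xy, z, [0.5, 1.5, 2.5, 3.5])
print(centres, gamma, npairs)
```

In this toy transect the semivariance grows with lag distance, which is the pattern expected when nearby samples are more alike than distant ones.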
OK…
 Semivariogram Model
◦ Quantifies the first law of geography
◦ Common semivariogram models: spherical, exponential, and Gaussian
◦ Model parameters: nugget, range, and sill
OK…….
 Example: Exponential Semivariogram
model
OK…
• Model parameters: nugget (C0), partial sill (C1), sill (C = C0 + C1), and range
of influence (r)
• Basic concepts of variograms: distinguish between the sill variation (C0 +
C1) and the sill parameter (C1), and between the range parameter (R) and
the practical range.
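The distinction between the range parameter and the practical range is easiest to see numerically. The sketch below assumes the parameterization γ(h) = C0 + C1(1 − exp(−h/R)) for h > 0; other software scales the exponent differently, so treat this form, the function name, and the parameter values as assumptions. Under this form the exponential model only approaches its sill asymptotically, and the practical range (about 3R) is where it reaches roughly 95% of the partial sill.

```python
import numpy as np

def exponential_variogram(h, nugget, partial_sill, range_param):
    """Exponential semivariogram; gamma(0) is 0 by definition."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + partial_sill * (1.0 - np.exp(-h / range_param))
    return np.where(h > 0, gamma, 0.0)

# Hypothetical parameter values, chosen only for illustration.
c0, c1, R = 0.2, 0.8, 100.0
sill = c0 + c1                      # sill variation C0 + C1
practical_range = 3.0 * R           # distance where ~95% of C1 is reached

at_R = float(exponential_variogram(R, c0, c1, R))
at_practical = float(exponential_variogram(practical_range, c0, c1, R))
print(sill, at_R, at_practical)
```

At h = R the model has reached only about 63% of the partial sill, so reading R as "the distance at which autocorrelation vanishes" would understate the extent of spatial dependence.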
OK…
 The model parameters (nugget, range, and sill) are described
for the best-fitting semivariogram model
◦ The nugget represents the variance of measurement errors at
an infinitesimally small distance.
◦ The range delineates the effective distance of spatial
autocorrelation.
◦ The sill is the maximum value of the semivariogram, reached when the
distance between two locations equals the range.
◦ The partial sill is defined as sill − nugget; stronger spatial
autocorrelation is indicated by higher values of partial sill/sill.
◦ Meanwhile, the spatial variation is characterized by the basal
effect, defined as nugget/sill.
◦ In other words, a larger value of nugget/sill shows that the spatial
variation among samples is more strongly caused by stochastic factors.
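To show how a fitted semivariogram becomes a set of interpolation weights, here is a minimal ordinary kriging sketch. It assumes an exponential model of the form nugget + partial sill × (1 − exp(−h/range)) and solves the standard ordinary kriging system, in which a Lagrange multiplier forces the weights to sum to one. The function names and the sample values are illustrative assumptions.

```python
import numpy as np

def exp_variogram(h, nugget, partial_sill, range_param):
    """Assumed exponential model; gamma(0) is 0 by definition."""
    h = np.asarray(h, dtype=float)
    g = nugget + partial_sill * (1.0 - np.exp(-h / range_param))
    return np.where(h > 0, g, 0.0)

def ordinary_kriging(xy, z, xy0, nugget, partial_sill, range_param):
    """Predict at one location xy0; returns (prediction, kriging variance)."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)  # sample-sample
    d0 = np.linalg.norm(xy - np.asarray(xy0, dtype=float), axis=1)  # sample-target
    # Kriging system: semivariances bordered by ones (unbiasedness constraint).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d, nugget, partial_sill, range_param)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exp_variogram(d0, nugget, partial_sill, range_param)
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]          # kriging weights, Lagrange multiplier
    prediction = float(weights @ z)
    variance = float(weights @ b[:n] + mu)
    return prediction, variance

# Three hypothetical samples; predict at an unvisited location.
xy = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
z = [10.0, 20.0, 14.0]
print(ordinary_kriging(xy, z, [1.0, 1.0], nugget=0.0, partial_sill=1.0, range_param=2.0))
```

Unlike inverse distance weighting, the weights come from the semivariogram and the sample configuration, and the second return value gives the measure of prediction uncertainty that deterministic methods lack.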
3.1.2. Regression Kriging (RK)
 Why RK? The concept of spatial variability
◦ Spatial variability of environmental variables is
commonly a result of complex processes working at
the same time and over long periods of time, rather
than an effect of a single realization of a single factor.
◦ Ideally, variability of environmental variables is
determined by a finite set of inputs and they exactly
follow some known physical law.
◦ If the algorithm (formula) is known, the values of the
target variables can be predicted exactly.
◦ In reality, the relationship between the feature of
interest and physical environment is so complex that it
cannot be modelled exactly.
Regression Kriging (RK)
 Regression kriging (RK) assumes that the target variable can
be explained by independent variables through regression,
and that the residuals can be described by their spatial
autocorrelation.
 Accordingly, the value of a target variable at some location s
can be modelled as a sum of the deterministic and stochastic
components:

$$z(s) = m(s) + e(s)$$

where m(s) is the deterministic component and e(s) is the stochastic component (residual).
 Both deterministic and stochastic components of spatial
variation can be modelled separately.
RK…
 By combining the two approaches, we obtain:

$$\hat{z}(s_0) = \hat{m}(s_0) + \hat{e}(s_0) = \sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_0) + \sum_{i=1}^{n} \lambda_i \, e(s_i)$$

where
◦ $\hat{m}(s_0)$ is the fitted deterministic part,
◦ $\hat{e}(s_0)$ is the interpolated residual,
◦ $\hat{\beta}_k$ are the estimated deterministic model coefficients ($\hat{\beta}_0$ is the estimated intercept),
◦ $q_k(s_0)$ are the values of the p predictors at location $s_0$ (with $q_0(s_0) = 1$),
◦ $\lambda_i$ are the kriging weights determined by the spatial dependence structure of the residual,
and $e(s_i)$ is the residual at location $s_i$, for each of the n sample locations.
RK…
 Predicting the spatial distribution of target variables by
RK usually follows five steps:
◦ (1) determine the prediction model using multiple linear
regression (MLR);
◦ (2) calculate the prediction model residuals at each sample
location;
◦ (3) model the covariance structure of the residuals using a
variogram model;
◦ (4) spatially interpolate the residuals using the parameters
of the variogram model;
◦ (5) add the prediction model surface to the interpolated
residuals at each prediction point.
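The five steps can be sketched end to end as follows. This is a simplified illustration under stated assumptions: step 3 is not actually fitted here, and the variogram parameters (nugget, partial sill, range of an assumed exponential model) are passed in as vparams as if they had already been estimated from the residuals. All function names, the covariate array q, and the sample values are hypothetical.

```python
import numpy as np

def exp_variogram(h, nugget, partial_sill, range_param):
    """Assumed exponential model; gamma(0) is 0 by definition."""
    h = np.asarray(h, dtype=float)
    g = nugget + partial_sill * (1.0 - np.exp(-h / range_param))
    return np.where(h > 0, g, 0.0)

def krige_residuals(xy, e, xy_new, vparams):
    """Ordinary kriging of residuals e at each row of xy_new."""
    n = len(e)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d, *vparams)
    A[n, n] = 0.0
    out = np.empty(len(xy_new))
    for j, s0 in enumerate(xy_new):
        b = np.ones(n + 1)
        b[:n] = exp_variogram(np.linalg.norm(xy - s0, axis=1), *vparams)
        lam = np.linalg.solve(A, b)[:n]      # kriging weights for this location
        out[j] = lam @ e
    return out

def regression_kriging(xy, q, z, xy_new, q_new, vparams):
    xy, q, z = (np.asarray(a, dtype=float) for a in (xy, q, z))
    xy_new = np.asarray(xy_new, dtype=float)
    q_new = np.asarray(q_new, dtype=float)
    # Step 1: multiple linear regression of z on the covariates q.
    X = np.column_stack([np.ones(len(z)), q])
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    # Step 2: regression residuals at the sample locations.
    resid = z - X @ beta
    # Step 3 (assumed done elsewhere): vparams describes the residual variogram.
    # Step 4: interpolate the residuals with ordinary kriging.
    e_hat = krige_residuals(xy, resid, xy_new, vparams)
    # Step 5: add the regression surface to the interpolated residuals.
    m_hat = np.column_stack([np.ones(len(q_new)), q_new]) @ beta
    return m_hat + e_hat

# Hypothetical samples: coordinates, one covariate, and the target variable.
xy = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
q = [[1.0], [2.0], [1.5], [3.0], [4.0]]
z = [2.1, 3.9, 3.2, 6.1, 7.8]
print(regression_kriging(xy, q, z, [[0.5, 0.5]], [[2.0]], vparams=(0.0, 1.0, 1.5)))
```

A useful sanity check on any such implementation is that predicting back at the sample locations, with their own covariate values, reproduces the observed values, since the kriged residual there equals the regression residual.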
RK…
RK Assumptions
 Ordinary Least Squares (OLS) is one of the most widely used
statistical techniques
 The dependent variable is modeled as a weighted sum
of explanatory variables
 Explanatory variables should have a linear
relationship with the dependent variable
◦ Explanatory variables should be independent of each other
 Ordinary Least Squares model:
◦ Dependent variable = Intercept + (EV1 * coef1) + (EV2 *
coef2) + … + (EVk * coefk) + Error
◦ Error is assumed to be random noise; coefficients are
estimated by the regression equation
RK: Assumptions…
 Hybrid of regression and kriging
 Regression kriging model:
◦ Dependent variable = Intercept + (EV1 *
coef1) + (EV2 * coef2) + … + (EVk * coefk) +
Error
◦ Error is modeled with a semivariogram
 The regression equation estimates the average
value for kriging.
 Kriging is performed on the error term
[Figure: Comparison of spatial prediction techniques for mapping SAND]
3.1.3. Geographically Weighted Regression (GWR)

 Geographically weighted regression (GWR) is a simple
regression function that allows the coefficient weights
of the environmental covariates to vary across space.
 It takes the spatial locations of samples into consideration,
and uses the locally weighted least squares method to model
the observations of soil parameters.
 GWR is based on the idea of local smoothing and on relationships
among different environmental variables within a local
space.
 It is capable of embedding the spatial location of samples
into a regression through the locally weighted least squares
method.
 GWR provides extra information on the spatial processes,
but spatial dependence of the residuals can still remain.
3.1.3. Geographically Weighted Regression

What is the essence of Geographically Weighted Regression?
Some Definitions
 Spatial non-stationarity: the same stimulus provokes
a different response in different parts of the study
region
 Global models: statements about processes which are
assumed to be stationary and as such are location
independent
 Local models: spatial decompositions of global
models; the results of local models are location
dependent
◦ A characteristic we usually anticipate from geographic (spatial)
data
Stationary vs. non-stationary
[Figure: two panels with residuals e1 to e4. Stationary process (assumed): yi = β0 + β1x1i. Non-stationary process (more realistic): yi = βi0 + βi1x1i.]

Stationary vs. non-stationary
 If a non-stationary process is modeled with
a stationary model
◦ Wrong conclusions might be drawn
◦ Residuals of the model might be highly spatially
autocorrelated
Simpson’s paradox
[Figure: House price plotted against house density, for spatially aggregated data and for spatially disaggregated data]

Why do relationships vary spatially?
 Sampling variation
◦ Nuisance variation, not real spatial non-stationarity
 Relationships intrinsically different across
space
◦ e.g. Differences in attitudes, preferences or
different administrative,
 Model misspecification
◦ Suppose a global statement can ultimately be made,
but the model is not properly specified to allow us to make
it. Local models are a good indicator of how the model is
misspecified.
Regression
 Regression establishes a relationship between a
dependent variable and a set of independent
variables
 In a typical linear regression model applied to
spatial data we assume a stationary process:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_n x_{ni} + \varepsilon_i$$

 Where
◦ yi is the dependent variable,
◦ xji (j from 1 to n) is the set of independent variables, and
◦ εi is the residual, all at location i
so that...
The parameter estimates obtained in the calibration of such
a model are constant over space:

$$\hat{\beta} = (X^T X)^{-1} X^T Y$$

which means that any spatial variations in the processes
being examined can only be measured by the error term
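The global estimator can be checked directly on a small synthetic example. The data below are generated purely for illustration (the coefficients 1.5, 2.0 and -0.5 and the noise level are assumptions), and the sketch solves the normal equations rather than forming the inverse explicitly, which gives the same estimate with better numerical behaviour.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.uniform(0, 10, n)                     # synthetic explanatory variable 1
x2 = rng.uniform(0, 5, n)                      # synthetic explanatory variable 2
y = 1.5 + 2.0 * x1 - 0.5 * x2 + rng.normal(0, 0.1, n)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), x1, x2])

# Solve (X^T X) beta = X^T Y, equivalent to beta = (X^T X)^{-1} X^T Y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# One coefficient vector for the whole study area; every location-specific
# departure from it ends up in the residuals.
residuals = y - X @ beta_hat
print(beta_hat)
```

Because beta_hat holds a single value per variable, mapping the residuals is the only way to see where such a global model fits poorly.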
Regression
 When applied to spatial data, as can be
seen, it assumes a stationary spatial process
◦ The same stimulus provokes the same
response in all parts of the study region
◦ Highly untenable for spatial processes
Geographically weighted regression
 A local form of linear regression used to model spatially
varying relationships.
 Spatial non-stationarity is assumed and will be
tested
─ Based on the “First Law of Geography”: everything is
related to everything else, but closer things are more
related
GWR
 Addresses the non-stationarity directly
◦ Allows the relationships to vary over space, i.e., the βs do not
need to be the same everywhere
◦ This is the essence of GWR, in the linear form:

$$y_i = \beta_{i0} + \beta_{i1} x_{1i} + \beta_{i2} x_{2i} + \dots + \beta_{in} x_{ni} + \varepsilon_i$$

◦ Instead of remaining the same everywhere, the βs now vary with
location (i)
GWR…
 The formula of GWR can be written as

$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i$$

 where
◦ (ui,vi) are the coordinates of point i,
◦ β0(ui,vi) is the intercept,
◦ βk(ui,vi) is the coefficient of explanatory variable k,
◦ xik is the value of explanatory variable k at point i,
◦ p is the total number of explanatory variables,
◦ εi is the error term, generally assumed to be independent and
normally distributed with zero mean and constant variance; the values
of the above parameters vary with the location.
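Local coefficients of this kind are usually obtained by weighted least squares, with weights that decay with distance from the location of interest. The sketch below assumes a Gaussian kernel and a fixed bandwidth; the notes do not specify the kernel, so both are illustrative choices, as are the function name and the synthetic data, in which the true slope increases from west to east.

```python
import numpy as np

def gwr_coefficients(coords, X, y, target, bandwidth):
    """Local coefficients at `target` by Gaussian-kernel weighted least squares."""
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)     # nearer samples weigh more
    Xd = np.column_stack([np.ones(len(y)), X])  # add the intercept column
    XtW = Xd.T * w                              # same as Xd.T @ diag(w)
    # beta(u, v) = (X^T W X)^{-1} X^T W y, solved without an explicit inverse
    return np.linalg.solve(XtW @ Xd, XtW @ y)

# Synthetic data: the slope on x grows with the easting u (non-stationary).
rng = np.random.default_rng(42)
coords = rng.uniform(0, 10, (200, 2))
x = rng.uniform(0, 1, 200)
y = 1.0 + (1.0 + coords[:, 0]) * x

b_west = gwr_coefficients(coords, x[:, None], y, (1.0, 5.0), bandwidth=1.0)
b_east = gwr_coefficients(coords, x[:, None], y, (9.0, 5.0), bandwidth=1.0)
print(b_west, b_east)
```

The bandwidth controls how local the model is. As it grows very large all weights approach one and the local estimates collapse to the single global regression, so GWR can be read as a spatially weighted generalization of the stationary model.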
Hybrid / Mixed Models
 Non-linear Prediction Models
◦ ML
◦ ANN
◦ SVR
 Geostatistical Prediction Models
◦ OK
◦ RK
◦ GWR
 Mixed Models
◦ ANNK = ANN + OK of residuals
◦ SVRK = SVR + OK of residuals
◦ GWRK = GWR + OK of residuals
Example: Geographically Weighted
Regression Kriging (GWRK)
 Although GWR uses spatially varying covariate
coefficients, it does not directly consider spatial
autocorrelation in model development.
 Geographically weighted regression kriging (GWRK)
combines the frameworks of GWR and RK models to
account for spatial nonstationarity and spatial
autocorrelation of the residuals.
Model Validation
 Validation allows you to evaluate your predictions
using a dataset that was not involved in creating the
prediction model.
 As with cross-validation, your goals should be to
have the following:
─ An average error close to zero
─ A small root mean square prediction error
─ An average standard error similar to the root mean square prediction
error
─ A standardized mean prediction error near 0
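These four diagnostics can be computed from a validation set once each prediction comes with a standard error. The sketch below is a hedged illustration: it takes the average standard error as the square root of the mean squared standard error, which is one common definition, and the function name and the toy numbers are assumptions.

```python
import numpy as np

def validation_summary(observed, predicted, std_error):
    """Summary diagnostics for predictions at held-out validation locations."""
    observed, predicted, std_error = (
        np.asarray(a, dtype=float) for a in (observed, predicted, std_error)
    )
    err = predicted - observed
    return {
        # Bias: should be close to zero.
        "mean_error": float(err.mean()),
        # Overall accuracy: should be small.
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        # Should be similar to the RMSE if uncertainty is assessed correctly.
        "average_standard_error": float(np.sqrt(np.mean(std_error ** 2))),
        # Errors scaled by their standard errors: should be near zero.
        "mean_standardized_error": float(np.mean(err / std_error)),
    }

# Hypothetical validation values and prediction standard errors.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
std_error = [1.0, 1.0, 1.0, 1.0]
print(validation_summary(observed, predicted, std_error))
```

If the average standard error is well above the RMSE the model is overstating its uncertainty, and if it is well below the RMSE the model is understating it.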
