USGS

science for a changing world

OVERVIEW AND TECHNICAL AND PRACTICAL


ASPECTS FOR USE OF GEOSTATISTICS IN
HAZARDOUS-, TOXIC-, AND RADIOACTIVE-
WASTE-SITE INVESTIGATIONS

U.S. GEOLOGICAL SURVEY

Water-Resources Investigations Report 98-4145

Prepared in cooperation with the


U.S. ARMY CORPS OF ENGINEERS
By C.R. Bossong, M.R. Karlinger, B.M. Troutman, and A.V. Vecchia

Denver, Colorado
1999
U.S. DEPARTMENT OF THE INTERIOR
BRUCE BABBITT, Secretary

U.S. GEOLOGICAL SURVEY


Charles G. Groat, Director

The use of firm, trade, and brand names in this report is for identification purposes only and does
not constitute endorsement by the U.S. Geological Survey.

For additional information write to:

District Chief
U.S. Geological Survey
Box 25046, Mail Stop 415
Denver Federal Center
Denver, CO 80225-0046

Copies of this report can be purchased from:

U.S. Geological Survey
Information Services
Box 25286
Federal Center
Denver, CO 80225
CONTENTS
Notation
Abstract
1.0 Introduction
1.1 Purpose and Scope
1.2 Organization
1.3 Overview of the Use of Geostatistics in Hazardous-, Toxic-, and Radioactive-Waste-Site Investigations
1.4 Acknowledgments
2.0 Overview of Some Technical Aspects of Geostatistics
2.1 General Considerations in Spatial Prediction
2.2 Important Geostatistical Concepts
2.2.1 Variograms
2.2.2 Directional Variogram and Anisotropy
2.2.3 Kriging and Kriging Variance
2.2.4 Trends and Universal Kriging
2.2.5 Block Kriging
2.2.6 Prediction Intervals and Normality
2.2.7 Transformations
2.2.8 Indicator Kriging
3.0 Technical Aspects of Geostatistics
3.1 Regionalized Random Variables
3.1.1 Example 3.1.1
3.1.2 Example 3.1.2
3.2 Variograms
3.3 Kriging
3.3.1 Ordinary Kriging
3.3.1.1 Example 3.3.1.1
3.3.1.2 Example 3.3.1.2
3.3.2 Universal Kriging
3.3.3 Block Kriging
3.4 Co-Kriging
3.5 Using Kriging to Assess Risk
3.5.1 Normal Distributions and Transformations
3.5.2 Indicator Kriging
4.0 Geostatistical Resources and Tools
4.1 Texts on Geostatistics
4.2 Journals
4.3 Software
5.0 Practical Aspects of Variogram Construction and Interpretation
5.1 General Computation of an Empirical Variogram
5.2 Nonstationarity
5.3 Variogram Refinement
5.4 Transformations and Anisotropy
5.4.1 Transformations
5.4.2 Directional Variograms and Anisotropy
5.5 Fitting a Theoretical Variogram to the Sample Variogram Points
5.5.1 Exponential Variogram
5.5.2 Spherical Variogram
5.5.3 Gaussian Variogram
5.5.4 Linear Variogram
5.6 Additional Trend Considerations
5.7 Outlier Detection
5.8 Cross Validation for Model Verification
5.8.1 Calibration Statistics
5.8.2 Variogram-Parameter Adjustments
6.0 Practical Aspects of Geostatistics in Hazardous-, Toxic-, and Radioactive-Waste-Site Investigations
6.1 Ground-Water-Level Examples
6.2 Bedrock-Elevation Examples
6.3 Ground-Water-Quality Examples
7.0 Review of Kriging Applications
7.1 Applicability of Kriging
7.2 Important Elements of Kriging Applications
7.3 Errors in Measured Data
8.0 Other Spatial Prediction Techniques
8.1 Global Measure of Central Tendency (Simple Averaging)
8.2 Simple Moving Average
8.3 Inverse-Distance Squared Weighted Average
8.4 Triangulation
8.5 Splines
8.6 Trend-Surface Analysis
8.7 Simulation
9.0 Summary
10.0 References

FIGURES
1-3. Diagrams showing:
1. Covariance function properties: A, hypothetical study area; B, stationary covariance functions; and C, isotropic covariance function
2. Variogram and features
3. Theoretical variograms indicating A, exponential; B, spherical; C, Gaussian; and D, linear models
4. Map showing measured water levels from Saratoga data
5-13. Graphs showing:
5. Squared differences of values for all possible pairs of points for Saratoga data
6. Initial sample variogram points for Saratoga data
7. Sample variogram points for ordinary least-squares trend residuals for Saratoga data
8. Sample variogram points for ordinary least-squares trend residuals for Saratoga data binned to 6.5 kilometers
9. Initial directional sample variogram points for raw Saratoga data: A, north-south and B, east-west
10. Sample variogram points and theoretical spherical fit for iterated Saratoga residuals
11. Sample variogram points and theoretical Gaussian fit for iterated Saratoga residuals
12. Cross-validation probability plot for Saratoga data
13. Scatterplot of measured versus kriging estimates from cross validation of Saratoga data
14. Maps showing location of measured data for ground-water-level examples: A, original data; B, original data without dropped sites; and C, original data with added sites
15. Graphs showing variogram and variogram cross-validation plots for residuals in water-level examples: A, theoretical variogram; B, cross-validation scatterplot; and C, cross-validation probability plot
16. Maps showing kriging results for ground-water-level examples: A, kriging estimates for original data; B, kriging standard deviations for original data; C, ratio (original data to original with dropped sites) of kriging standard deviations; and D, kriging standard deviations for original data with added sites
17. Maps showing location of measured data for bedrock-elevation examples: A, original data and B, restricted data
18. Graphs showing variogram and variogram cross-validation plots for bedrock-elevation examples: A, theoretical variogram; B, cross-validation scatterplot; and C, cross-validation probability plot
19. Maps showing kriging results for bedrock-elevation examples: A, kriging estimates; B, kriging standard deviations; C, block kriging results; and D, block kriging standard deviations
20. Maps showing location of measured data for ground-water-quality examples
21. Graphs showing directional variograms and variogram cross-validation plots for ground-water-quality examples: A, theoretical major-direction variogram; B, theoretical minor-direction variogram; C, cross-validation scatterplot; and D, cross-validation probability plot
22. Maps showing kriging results for ground-water-quality examples: A, kriging estimates back-transformed; B, kriging estimates in log space; C, kriging standard deviations in log space; and D, 95-percent confidence level for kriging estimates back-transformed
23. Graphs showing directional variogram plots for indicator kriging ground-water-quality example: A, theoretical major-direction variogram and B, theoretical minor-direction variogram
24. Map showing indicator kriging results for ground-water-quality example
25. Diagram showing Voronoi polygons

TABLES
1. Geostatistical software characteristics
2. Univariate statistics for example data sets
3. Variogram characteristics and cross-validation statistics
4. Univariate statistics for gridded kriging estimates in example applications

CONVERSION FACTORS, VERTICAL DATUM, AND ABBREVIATIONS

Multiply      By        To obtain

Length
kilometer     0.6214    mile
meter         1.094     yard

NOTATION

a            Angle for directional variogram.
c            Generic constant used for cutoff value in probability distribution or indicator transformation.
e            Kriging error.
ẽ            Reduced kriging error.
f_j          Explanatory variables used in drift equations.
g            Nugget of variogram.
h            Lag or distance between two data points.
n            Number of data points.
m            Number of locations in a given block.
r            Range of variogram.
s            Sill of variogram.
w            Weight.
x = (u, v)   Location based on coordinates u and v.
z(x)         Measurement of Z at location x.
ẑ(x)         Kriging estimate using measured data.
A            Area of triangle.
B            Area designation in block kriging.
C            Population covariance function.
Ĉ            Sample covariance function.
C(x_1, x_2)  Covariance of data values at locations x_1 and x_2.
D_ij         Difference in values between data points i and j.
E            Expectation.
I(.)         Indicator function.
K            Number of variogram bins.
N(.)         Number of squared differences in variogram bin.
P            Probability.
S_n²         Sample variance of n measurements.
T            Transformation.
V            Voronoi polygon.
Var          Population variance.
W(x)         Co-kriging random variable at location x.
Y(x)         Transformed variable at location x.
Ŷ(x)         Predictor or estimate of Y at location x, obtained from kriging.
Z            Regionalized random variable.
Z(x)         Potential value of Z at location x.
Ẑ(x)         Predictor or estimate of Z at location x, obtained from kriging.
Z*(x)        The residuals of Z(x).
Z̃(x)         Arbitrary predictor of Z at location x.
Z̄_n          Sample mean of Z from n observations.
β            Regression coefficient used in polynomial representation for drift.
γ̂            Sample variogram.
γ            Theoretical variogram.
γ(h)         Theoretical variogram for lag h.
λ            Optimization coefficient.
η            Parameter used in spline analysis.
ρ(h)         Correlation function as function of h.
σ(x)         Spatial population standard deviation at location x.
σ²(x)        Spatial population variance at location x.
σ_K(x)       Kriging standard deviation at location x.
σ_K²(x)      Kriging variance at location x.
μ(x)         Spatial population mean of Z at location x.

Overview and Technical and Practical Aspects
for Use of Geostatistics in Hazardous-, Toxic-,
and Radioactive-Waste-Site Investigations
By C.R. Bossong, M.R. Karlinger, B.M. Troutman, and A.V. Vecchia

Abstract

Technical and practical aspects of applying geostatistics are developed for individuals involved in investigations at hazardous-, toxic-, and radioactive-waste sites. Important geostatistical concepts, such as variograms and ordinary, universal, and indicator kriging, are described in general terms for introductory purposes and in more detail for practical applications.

Variogram modeling using measured ground-water elevation data is described in detail to illustrate principles of stationarity, anisotropy, transformations, and cross validation. Several examples of kriging applications are described using ground-water-level elevations, bedrock elevations, and ground-water-quality data.

A review of contemporary literature and selected public domain software associated with geostatistics also is provided, as is a discussion of alternative methods for spatial modeling, including inverse distance weighting, triangulation, splines, trend-surface analysis, and simulation.

1.0 INTRODUCTION

This report addresses the use of geostatistics at hazardous-, toxic-, and radioactive-waste (HTRW) site investigations. The report was prepared in cooperation with the U.S. Army Corps of Engineers (USACE) for use as a guidance document within the USACE. The USACE has distributed the report as an Engineer Technical Letter within their agency (USACE, 1997).

One very fundamental aspect of perhaps all HTRW-site investigations that deal with environmental contamination is the need to characterize the extent and spatial distribution of contamination. Such a characterization usually includes describing and evaluating the spatial trends and variability of the contamination, using a variety of statistical or analytical tools. A principal difficulty in characterizing the contamination is the fact that measurements might be few or might be sparsely scattered over large regions. Another difficulty that arises naturally is how to interpolate between measured data in order to make predictions (or estimates) at points where measurements of contaminant concentration are not available. Such interpolation is referred to as point, or punctual, estimation in this report. Additionally, an investigator might need to determine a single representative value for an area that has several measured or estimated values, or both; this determination is referred to in this report as block estimation. Geostatistics is a set of statistical procedures designed to address these difficulties and needs. Geostatistics can be applied to many other problems, besides contamination, that occur at HTRW sites. Even though this report addresses only two-dimensional applications, geostatistics can be used in three dimensions as well. Indeed, there are many cases in which the third dimension, usually stratification, is desirable to address.

Kriging is the principal geostatistical technique described in this report. For introductory purposes, kriging can be defined as a technique for determining the optimal weighting of measurements at measured or sampled locations for obtaining predictions, or estimates, at unmeasured or unsampled locations; additional definition of kriging is provided throughout this report. Kriging is well suited for making point and block estimates; however, much of the advantage of using geostatistical techniques, such as kriging, is not just in the point and block estimates but in the information provided concerning the uncertainty associated with the estimates. The uncertainty information is usually quantified by a kriging variance that is associated with a kriging estimate. The uncertainty also is sometimes referred to as the kriging standard deviation, which is simply the square root of the kriging variance.

Original geostatistical work involved making estimates for the areal extent and concentrations of economic mineral deposits in relation to mining. Today (1998), geostatistical techniques continue to have a function in mining. However, a well-developed method that is capable of interpolating a given set of measured values at discrete locations into estimates for new locations or developing an individual estimate for an area including many locations, or both, has attracted users from many disciplines, and there is a trend toward incorporating geostatistics as standard curriculum for most geoscience educational programs. The use of geostatistical techniques as part of HTRW-site investigations is becoming common because of the almost routine need for data interpolation as part of these investigations.

Once investigators have established that the data are adequate as to quality and quantity, geostatistics can be a powerful analytical tool that results in quantitative characterization of areas of special interest within the study area or the entire study area. These characterizations could be used to determine spatial variation; for example, where concentrations of contaminants in soils are relatively high or low, are less than or greater than a specified concentration, or even have a high or low probability of exceeding a certain concentration.

1.1 Purpose and Scope

The purpose of this report is to address the use of geostatistics in HTRW-site investigations by presenting an overview of geostatistical methods and discussing their technical and practical aspects. The report also includes a brief literature and software review, a presentation of kriging applications, a discussion of the review of kriging applications, and a discussion of more advanced geostatistical techniques, such as conditional simulation.

The scope of this report is limited to discussions and examples of two-dimensional point and block estimations using a geostatistical method known as kriging. The technical aspects of geostatistics are presented through discussion of the assumptions about, and the mechanics of, several types of kriging, including ordinary kriging, which is applicable when the mean for the variable of interest is constant over the region of interest, and universal kriging, which is applicable when the mean for the variable of interest changes gradually over the region. A specialized form of kriging known as indicator kriging and the use of information concerning uncertainty associated with kriging estimates also are discussed. The fundamental concepts of geostatistical kriging theory are discussed, but various references are provided for more detailed information.

1.2 Organization

This report is divided into eight sections described here.

Section 1.0 presents introductory material and includes an overview of the use of geostatistics at HTRW sites.

Section 2.0 presents an overview of some important technical aspects of geostatistics with a minimum of theory and equations.

Section 3.0 discusses the assumptions and theory behind kriging, including equations and concepts that are useful for obtaining a better understanding of the technical aspects, or mathematics, of kriging interpolation. Many of the concepts developed in section 3.0 are discussed in general terms in section 2.0, so those readers desiring only an overview of kriging concepts may wish to read only section 2.0 and bypass section 3.0.

Section 4.0 reviews texts that contain much more detailed information regarding kriging theory than material included in section 3.0. Section 4.0 also provides a brief generic discussion of kriging software.

Section 5.0 discusses detailed step-by-step variogram construction and demonstrates some pitfalls and solutions to this crucial process. Section 5.0 also discusses techniques that investigators may use to evaluate their variograms.

Section 6.0 discusses practical aspects of geostatistics by presenting several examples of kriging applications using data from the HTRW field. The examples illustrate a few of the many different ways kriging can be used in HTRW-site investigations and are not presented with the same level of detail used in section 5.0.

Section 7.0 provides additional detail on some crucial aspects of kriging applications and includes considerations that may be helpful to determine if kriging is feasible for the intended use.

Section 8.0 briefly discusses other methods for spatial modeling and also includes discussion of advanced stochastic methods, such as simulation.

1.3 Overview of the Use of Geostatistics in Hazardous-, Toxic-, and Radioactive-Waste-Site Investigations

Investigations of HTRW sites involve complex administrative, scientific, and engineering functions and are truly interdisciplinary. For instance, administrative functions that are associated with fiscal, managerial, or regulatory input can guide or constrain scientific or engineering work. Similarly, scientific or engineering findings may define the scope of the administrative effort.

Scientists and engineers involved in HTRW-site investigations have found that they have an implicit need for many disciplines to fulfill the objectives of each particular investigation. Frequently, an HTRW-site investigation will benefit from specialized information available from earth-science disciplines such as geology, hydrogeology, and geochemistry, among others. Some HTRW-site investigations are large enough to use several individuals from each of these disciplines, as well as many others, for the duration of multi-year investigations. Most disciplines associated with HTRW-site investigations will benefit from knowledge or input from specialized or interdisciplinary branches, or both; the geologist, for example, will occasionally benefit from knowledge of geophysics. Interdisciplinary input also can be very helpful, especially in geostatistics, where earth-science disciplines rely on assistance from statisticians.

1.4 Acknowledgments

Several individuals have provided valuable technical review of this report. Dave Becker, USACE, HTRW Center for Expertise, not only provided a consistent and thorough technical review, but also coordinated additional technical reviews. Additional technical reviews from the USACE were provided by Terry Walker and Tom Georgian, both from the HTRW Center for Expertise; Brad Call, Earl Edris, Dave Kachek, Doug Mullendore, and Kerry Walker, all from USACE field offices; and Tommiann McDaniel from USACE headquarters. Additional technical reviews from outside the USACE were provided by Evan Englund, U.S. Environmental Protection Agency National Exposure Research Laboratory; Ed Gilroy, U.S. Geological Survey (USGS); Mohan Srivastava, Froidevaux, Srivastava, and Schofield; and Wayne Woldt, University of Nebraska.

2.0 OVERVIEW OF SOME TECHNICAL ASPECTS OF GEOSTATISTICS

This section provides an overview of some of the procedures and concepts discussed in detail in this report. Some of the technical ideas and terminology are introduced in very general terms to familiarize the reader with geostatistics.

2.1 General Considerations in Spatial Prediction

The principal consideration in this report is spatial prediction or modeling values of a spatial process; in particular, to make best use of measurements of a variable (such as pollutant concentration) at sampled locations so as to make inferences (or predictions) about that variable at unsampled locations or for the region as a whole.



A spatial process can have a large-scale or a regional component and a small-scale or a local component; both these components need to be accounted for when modeling a spatial process. The large-scale component is referred to as the mean field and is most often modeled by a spatial trend that may or may not be constant over the region. The small-scale component is a random fluctuation that is mathematically combined with the trend to make up the sample at a point. On the average, the random fluctuation is assumed to be zero, but can be either positive or negative in individual samples. The separation of the trend from the random fluctuation is problem- and scale-dependent and needs to be determined carefully. There can be several solutions to the problem of separating the trend and the random fluctuation that may be useful for various geostatistical purposes when using a single set of data.

The small-scale fluctuation of the variable of interest (for example, water levels or contaminant concentrations at a sample point), although random, can indicate some association with the random fluctuations at nearby points. This association is referred to as spatial correlation. Positive spatial correlation between measurements indicates that the random fluctuation at both points tends to have the same sign, whereas negative correlation indicates that the random components tend to have the opposite sign. The large-scale trend and the positive spatial correlation of the small-scale fluctuations contribute to measurements at locations that are close together being more closely related than are measurements at locations that are farther apart.

The most obvious procedure to determine spatial prediction at unsampled locations is simply to take an average of the measured sample values and to assume that this average value gives a reasonable prediction at all locations in the region of interest. This procedure may work adequately in some cases, but there also are pitfalls. Using a single value for an entire region implicitly assumes spatial homogeneity. This assumption ignores any spatial trends that might exist in the data and also ignores spatial continuity. If the variable of interest does have a tendency to be spatially correlated, then a weighted average rather than a simple average could be used to make a spatial prediction by giving measurements at sampled locations that are nearer to the unsampled location more weight. This motivation is the basis for the geostatistical techniques discussed in this report. The technique known as kriging is a technique for determining the optimal weighting of measurements at sampled locations for obtaining predictions at unsampled locations. These optimal weights depend on spatial trends and correlations that may be present.

There are a number of ways to perform spatial prediction. The geostatistical technique of kriging belongs to a class of techniques known as stochastic techniques. In these techniques, the measurements, actual and potential, are considered to constitute a single realization of a random (or stochastic) process. One advantage of assuming the existence of such a random process is that measures of uncertainty, such as the variance used in kriging, can be defined. These measures of uncertainty permit the objective assessment of a spatial-prediction technique on the basis of how small such measures are. Once a measure of uncertainty has been selected, the weights to be used in spatial prediction may be determined to minimize the measure of uncertainty. In short, the use of stochastic techniques provides a way of objectively quantifying errors and determining weights. In practice, spatial predictions obtained using kriging are almost always accompanied by a measure of the associated error. Such an error evaluation is an integral part of a kriging analysis and is one of the principal advantages of using kriging (or stochastic techniques in general).

Nonstochastic techniques are generally applied strictly empirically; no assumptions concerning the existence of an underlying random process are made, and no theoretical framework is used to evaluate statistically the performance or optimality of the nonstochastic techniques. When applied in such a manner, whether such techniques would be expected to yield results that are satisfactory cannot be evaluated in advance. Two techniques that are commonly applied nonstochastically are simple averaging and trend analysis, which is a least-squares method for fitting a smooth surface to the data. Even though these two techniques are usually applied nonstochastically, performance can still be assessed if a stochastic setting is assumed. Generally, simple averaging would perform well if there were no trend and no spatial correlation, and trend analysis would perform well if there was a trend that can be modeled, but no spatial correlation. Lack of correlation in the measurements is one assumption that is made in ordinary statistical regression analysis, and trend analysis, if it is placed in a stochastic setting, is actually a special type of regression. The stochastic technique of kriging explicitly incorporates the spatial correlations that are ignored in trend analysis. In section 8.0, a few other common

techniques that are usually applied nonstochastically are discussed briefly. Most of these techniques are designed to incorporate spatial continuity, but the way it is incorporated may be subjective. Use of kriging provides an objective means of incorporating spatial correlation and makes the background assumptions explicit.

2.2 Important Geostatistical Concepts

This section presents some of the key concepts in geostatistics that are discussed in detail in section 3.0. The concepts are presented in about the same order as they are discussed in section 3.0.

2.2.1 Variograms

A central concept in geostatistics is the use of spatial correlation to improve spatial predictions or interpolations. The variogram is the principal tool used to characterize the degree of spatial correlation present in the data and is fundamental to kriging. The correlation between measurements at two points is usually assumed to depend on the separation between the two points. This dependence can be examined by squaring the difference between the measured values at each pair of locations and then categorizing the squared differences according to the separating distance between the paired locations. For small separations, or lags, the squared differences are usually small and increase as the lag increases. A plot of the squared differences per sample pair as a function of lag is referred to as the sample variogram.

The general behavior of the points in the sample variogram is affected by the spatial correlation between sample sites and can provide investigators with qualitative information about the spatial process, but to use this information rigorously as a basis for interpolation, a function that has specific properties needs to be fit to the sample variogram points. The fitting passes a smooth curve through the scattered points. The curve, which can be represented by a mathematical expression or function, is called a model. There are several models introduced in section 3.0 that have characteristic features that are commonly used in geostatistics. The variogram model is used to determine kriging weights for use in interpolation.

2.2.2 Directional Variogram and Anisotropy

Spatial correlation often depends not only on the distance between points, but also on the direction along which the points plot. For example, measurements at pairs of points that are 100 meters apart and are oriented north-south may have a different correlation than measurements at points that are the same distance apart, but that are oriented east-west. Correlations dependent on direction indicate anisotropy, and when anisotropy is present, a directional variogram needs to be used for the geostatistical analysis.

2.2.3 Kriging and Kriging Variance

Kriging yields optimal spatial estimates at points where no measurements exist using measurements at points where there are data. As discussed in section 2.1, placing an analysis in a stochastic framework enables precision in defining optimality. In kriging, a restriction that the predicted value at any point is a linear combination of the measured values is imposed first; that is, the kriging estimate is a linear predictor. Given this restriction, the values of the coefficients in this linear function are chosen to ensure that the predictor is optimal.

The first optimality criterion imposed is that the estimate be unbiased, or that on average, the difference between the predicted value and the actual value is zero. The second optimality criterion is that the variance of the predictions be minimized. The variance in the predictions is a statistical error measure defined to be the average squared difference between predicted and actual values. Because the kriging estimate minimizes this variance, the estimate, or prediction, is known as the best (minimum variance) unbiased linear predictor. This minimization is performed algebraically and results in equations known as the kriging equations, which are explicit representations of the optimal coefficients (weights) in terms of the variogram. These equations are presented in section 3.0.

An expression for the kriging variance also is discussed in section 3.0. This variance depends on the geometry of the data sites, with the variance at locations near measured points tending to be small. A variance then can be associated with any spatial prediction, which gives an indication of the uncertainty about that predicted value. The fact that kriging provides this measure of uncertainty is one of its principal advantages over many other techniques.
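The pairing and binning procedure described in section 2.2.1 can be sketched in a few lines of Python. This is a minimal illustration, not the software used for this report: the function name `sample_variogram`, the bin edges, and the four data values are hypothetical. The sketch also assumes the common semivariogram convention of reporting half the average squared difference per lag bin, which goes beyond the general wording of section 2.2.1.

```python
import numpy as np

def sample_variogram(coords, values, lag_edges):
    """Half the mean squared difference of all point pairs, grouped by lag bin.

    coords    : (n, 2) array-like of (u, v) locations (hypothetical data)
    values    : n measured values
    lag_edges : increasing bin edges for the separation distance
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)           # each pair counted once
    lags = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq = 0.5 * (values[i] - values[j]) ** 2        # half squared difference
    centers, gamma, counts = [], [], []
    for lo, hi in zip(lag_edges[:-1], lag_edges[1:]):
        in_bin = (lags > lo) & (lags <= hi)
        if in_bin.any():                                # skip empty lag bins
            centers.append(lags[in_bin].mean())
            gamma.append(half_sq[in_bin].mean())
            counts.append(int(in_bin.sum()))
    return np.array(centers), np.array(gamma), np.array(counts)

# Hypothetical example: four sample points on a line, 1 unit apart.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 2.0, 4.0, 7.0]
centers, gamma, counts = sample_variogram(coords, values, lag_edges=[0, 1.5, 2.5, 3.5])
```

Plotting `gamma` against `centers` gives the scattered points to which a variogram model would be fit; the pair counts indicate how well each point is supported.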

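As a hedged illustration of how a variogram model leads to kriging weights and a kriging variance (section 2.2.3), the sketch below solves one common form of the ordinary kriging equations with NumPy. The exponential variogram model, its `sill` and `length` parameters, the function names, and the data are assumptions chosen for illustration only; the kriging equations used in this report are those presented in section 3.0.

```python
import numpy as np

def exponential_variogram(h, sill=1.0, length=1.0):
    """Hypothetical variogram model with no nugget: sill * (1 - exp(-h / length))."""
    return sill * (1.0 - np.exp(-np.asarray(h, dtype=float) / length))

def ordinary_kriging(coords, values, target, variogram):
    """Point kriging with an unknown constant mean (weights constrained to sum to 1)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)
    # Variogram between every pair of data sites, bordered by the
    # unbiasedness constraint (a row and a column of ones, zero in the corner).
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(dists)
    A[n, n] = 0.0
    # Variogram between each data site and the prediction location.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - target, axis=1))
    solution = np.linalg.solve(A, b)
    weights, lagrange = solution[:n], solution[n]
    estimate = weights @ values                 # linear predictor
    variance = weights @ b[:n] + lagrange       # kriging variance
    return estimate, variance, weights

# Hypothetical example: two measurements, prediction midway between them.
coords = [(0.0, 0.0), (2.0, 0.0)]
values = [3.0, 5.0]
estimate, variance, weights = ordinary_kriging(coords, values, (1.0, 0.0), exponential_variogram)
```

In this symmetric example the two weights are equal, so the estimate is the simple average of the two measurements. Predicting at one of the data sites instead returns the measured value with zero variance, which is consistent with the statement that the kriging variance tends to be small near measured points.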


2.2.4 Trends and Universal Kriging

In kriging, special attention must be given to the question of whether there are spatial trends in the data. A trend is usually any detectable tendency for the measurements to change as a function of the coordinate variables, but also can be a function of other explanatory variables. For example, aside from random fluctuations, measurements of ground-water elevations may have a tendency to consistently increase in a certain direction. A kriging analysis in which there is no spatial trend is known as ordinary kriging; when a trend does exist, universal kriging should be considered. In universal kriging, the trends present are accounted for. For example, the trend might be represented as a linear function of coordinate variables. The form of the trend model then is incorporated into the universal kriging equations to obtain the optimal weights and account for the trend.

2.2.5 Block Kriging

The kriging discussed in sections 2.2.3 and 2.2.4 is usually known as point, or punctual, kriging. In point kriging, the goal is to predict the value of a variable at discrete locations. By contrast, in block kriging, the goal is to predict the average value of a variable for a specified region. As in point kriging, the optimal predictor is a linear combination of the measured values, and the degree of uncertainty is indicated by a block kriging variance. Block kriging variances tend to be smaller than point kriging variances because averages tend to be less variable than individual point values.

2.2.6 Prediction Intervals and Normality

A standard kriging analysis gives two values for any location: the optimal kriging estimate and the kriging variance. The variance provides a measure of uncertainty about the prediction. In some studies, the nature of the uncertainty needs to be specified beyond just giving the variance. One way to further specify the uncertainty is to obtain a prediction interval, which is an interval where there is a certain probability, generally 95 percent, that the actual value is within the interval. Finding such an interval often hinges on the probability distribution of the variables being sampled. An ideal situation is when the variable of interest, such as contaminant concentration, can be assumed to have a normal distribution. In this situation, and given the set of measured values, a potential value at an unsampled location has a normal distribution, with the average given by the kriging estimate and the variance given by the kriging variance. Thus, on the basis of classical statistics, this normal distribution can be used directly to obtain a 95-percent prediction interval for a concentration at an unsampled location.

2.2.7 Transformations

A prediction interval generally is much more informative than the kriging estimate and kriging variance, so a common question is whether a normality assumption can be made for the data. When a normality assumption cannot be made, a transformation can be identified that will make the data normal, or almost normal. For example, for data that have values greater than 0, a logarithmic transformation is often tried; that is, a geostatistical analysis is performed on logarithmically transformed values rather than on the original data. Prediction intervals obtained using the transformed values can be readily converted to corresponding intervals on untransformed variables. However, there are subtleties that need to be considered in back-transforming the kriging estimate and the kriging variance; these subtleties are discussed in more detail in section 3.0.

2.2.8 Indicator Kriging

In indicator kriging, analysis is performed using indicator variables of the measured data rather than the measured data. An indicator variable is a special kind of transform of the measured data and can have one of two possible values: 0 or 1. To obtain the indicator variables to be analyzed, a threshold value is specified (c), which, for example, may represent a contaminant concentration level of particular importance. At each measurement location, the indicator variable then is assigned a value of 1 if the measured value is less than or equal to c and is assigned a value of 0 if the measured value is greater than c. This kind of transform allows censored data, or data reported as less than some reporting limit, to be included in the analysis if the reporting limit is less than or equal to the threshold, or cutoff, value of c. After the indicator transform has been determined, the kriging analysis then is done using these indicator variables in the

usual manner; first, a variogram is obtained, and then the kriging equations yield the optimal linear predictor and the kriging variance for the indicators.

Although the indicator kriging analysis uses only 0's and 1's, the interpolated estimates are not restricted to these two values. In most analyses, the estimates are between 0 and 1, which is interpreted to be the probability that the actual value is less than or equal to the threshold c. Use of this analysis for a number of different threshold values can provide information about the probability distribution of contaminant values at a location, which may be used to obtain prediction intervals. Such prediction intervals may even be more valuable than having only the optimal predictor and variance provided by the usual kriging analysis, particularly if the behavior of extremes may be of interest. An advantage of using indicator kriging to obtain prediction intervals is that there is no need to assume a normal distribution for the data.

3.0 TECHNICAL ASPECTS OF GEOSTATISTICS

This section provides the technical aspects or the necessary theoretical background for understanding kriging applications. Emphasis is placed on presentation of the basic ideas; long formulae or derivations are minimized. Statistical terms that are commonly used in geostatistical applications are in bold text and are briefly defined as they are introduced; notation used in this report also is listed in the "Notation" section. More thorough discussions of these fundamental concepts are indicated by references cited in section 4.0. A knowledge of engineering statistics at the level of Devore (1987) and Ross (1987) would help in understanding some parts of this section. Readers who have limited statistical experience may wish to briefly scan this section and refer back to it after reading the remaining sections.

In section 3.1, regionalized random variables are discussed. Regionalized random variables constitute the random process that is sampled to obtain the observed data that are available for analysis. Basic ideas related to probability distributions, averages, variances, and correlation are introduced. In section 3.2, the variogram, which is the fundamental tool used in geostatistics to analyze spatial correlation, is introduced. In section 3.3, the use of kriging to obtain the best weights for spatial prediction is discussed, and the computation of the average-, or mean-, squared prediction error for these predictions also is discussed. In section 3.4, co-kriging, which is prediction of one variable based on measurements of that variable and other variables, is discussed. Finally, in section 3.5, the application of kriging to determine not just optimal spatial predictions, but also probabilities associated with various events, such as extreme events that may be of importance in risk-based analyses, is discussed.

3.1 Regionalized Random Variables

Suppose the extent of ground-water contamination by a particular pollutant in a given study area is being determined. To simplify the presentation, all data are assumed to be distributed over a two-dimensional region. In three-dimensional ground-water flow systems, one could study the depth-averaged concentration of a pollutant or the concentration of the pollutant in a particular horizontal stratum of the flow system. Let a vector x = (u, v) denote an arbitrary spatial location in the study area. Unless otherwise stated, u is assumed to be the east-west coordinate and v the north-south coordinate (fig. 1A). Use z(x) to denote a measurement at location x, such as the concentration of a pollutant. The ultimate goal is to determine z(x) for all locations in the study area. However, without explicit knowledge of the ground-water flow and transport field, this goal cannot be achieved. Therefore, suppose that the goal is to estimate the values of z(x) with a given error tolerance. For some studies, a small estimation error for some parts of the study area (for instance, near a domestic water supply) may need to be obtained, while allowing larger estimation errors in other parts of the study area. The theory of regionalized random variables is designed to accomplish these goals.

In the regionalized random-variable theory, the true measurement, z(x), is assumed to be the value of a random variable, Z(x). A random variable Z(x) is associated with a true measurement z(x) to characterize the degree of uncertainty in the quantity of interest at point x. If there is no measurement obtained at x, then the values acquired by Z(x) represent potential measurements at x; that is, Z(x) represents possible values that might be expected if a measurement were



obtained at x. Because there is uncertainty associated with Z(x), the random variable needs to be characterized by a probability distribution, defined by P[Z(x) ≤ c], where P denotes probability and c is any constant. This distribution is a function of c and, to be completely defined, needs to be known for all values of c. The distribution is used to make certain evaluations. For example, suppose there is no measurement of the concentration of a certain contaminant at x, but the distribution is known and a threshold value of c = 8 milligrams per liter is of interest. If P[Z(x) ≤ 8] = 0.60, and if a measurement were made at x, there is a 60-percent chance of obtaining a value less than or equal to 8 milligrams per liter. The distribution also may be used to calculate other probabilities, such as the probability of obtaining a value in some specified interval.

Figure 1. Covariance function properties: A, hypothetical study area; B, stationary covariance functions; and C, isotropic covariance function. Panel A: Hypothetical study area with two sampling locations. The coordinate axes are in the east-west (u) and north-south (v) directions. Panel B: Covariance function (variogram) is stationary if all pairs of observations that are separated by the same lag (h) and angle (a) have identical covariance (variogram) values. Panel C: Covariance function (variogram) is isotropic if all pairs of observations that are separated by the same lag have identical covariance (variogram) values.

An important concept in all geostatistical applications is the support of the regionalized random variable. The support of Z(x) is the in-situ geometric unit represented by an individual sample. For example, in a soil-contamination study, Z(x) might represent the concentration of a contaminant in a vertical soil core 0.1 meter in diameter and 1 meter in length and centered at location x. Thus, although Z(x) is defined at a particular point, it represents a volume of soil. Changing the support of Z(x) usually changes its probability distribution. Therefore, all the measurements in a geostatistical analysis need to have the same support. The technique called point, or punctual, kriging, described in section 3.3, is designed to predict values of Z(x) with the same support as the sample data.

A concept closely related to support is that of estimation block, which is a geometric unit larger than the support of a single measurement, for which a single representative value is desired. For example, in the example soil-contamination study, an estimate of the average concentration of a contaminant in a truckload of soil excavated from a block 6 meters long, 6 meters wide, and 0.3 meter thick may be necessary. Using a method called block kriging, also described in section 3.3, the block average can be predicted based on individual measurements.

Although the distribution of Z(x) completely characterizes Z(x) at any particular location, the distribution indicates nothing about the relations among the values of Z(x) at different locations, which is very important because geostatistics is based on using a measured value of a regionalized variable at one location to gain information about values of the variable at another location. The distribution of Z(x) at a single location can be readily generalized to two or more locations. For two locations, if $x_1$ and $x_2$ are two distinct locations, then the

joint probability distribution is defined to be the probability $P[Z(x_1) \le c_1, Z(x_2) \le c_2]$ for any constants $c_1$ and $c_2$. This latter probability means the probability that both $Z(x_1) \le c_1$ and $Z(x_2) \le c_2$. If the variables $Z(x_1)$ and $Z(x_2)$ are statistically independent of one another, then the joint probability distribution can be obtained as the product of the individual probability distributions,

$$P[Z(x_1) \le c_1, Z(x_2) \le c_2] = P[Z(x_1) \le c_1] \, P[Z(x_2) \le c_2].$$ (3-1)

However, in most applications, $Z(x_1)$ and $Z(x_2)$ are not statistically independent, and their joint distribution cannot be obtained from the individual distributions. When this joint description is applied to more than two locations, specification of the full spatial distribution of Z would need the joint distribution of $Z(x_1), \ldots, Z(x_n)$ for any set of n spatial locations and for any n; however, except in very special cases, working with the full set of distribution functions of Z(x) is not feasible and is not done.

To simplify the problem even further, various parameters of the distributions are used rather than using the entire distributions. The parameter most commonly used to characterize a distribution is the mean; because the mean in geostatistical applications depends on the spatial variable x, the mean may be called the spatial mean, or the drift. In statistics, the mean is referred to as the expectation (E) of the random variable Z(x), and the symbol μ is used in this report to denote this expectation. Thus,

$$\mu(x) = E[Z(x)]$$ (3-2)

is used to denote the mean, or expected value, of the bracketed term, in this case Z(x). Thinking of the expectation as an average can be helpful. In fact, if the distribution of Z(x) assigned equal probability to a finite number of values, then the expectation of Z(x) would indeed be the simple average of these numbers. However, in geostatistics, Z(x) is usually assumed to take on any value in a continuous range of possible values rather than being limited to a discrete set of values. Therefore, calculus needs to be used to define the expectation. The following example illustrates the difference between averages and expectations.

3.1.1 Example 3.1.1

An experiment consists of injecting a conservative tracer at a particular well in a steady-state ground-water flow system and measuring the concentration, $Z_1(x)$, of the tracer in a neighboring well 24 hours later. The tracer then is allowed to flush from the system, and the experiment is repeated a second time to obtain another concentration measurement, $Z_2(x)$, at the same location. If this process is repeated n times, n concentration measurements $Z_1(x), Z_2(x), \ldots, Z_n(x)$ would be obtained, all at location x. The average concentration at location x is

$$\bar{Z}_n(x) = \frac{1}{n}\left[Z_1(x) + Z_2(x) + \cdots + Z_n(x)\right],$$ (3-3)

which would change depending on n and on the actual values obtained for $Z_1(x), Z_2(x), \ldots, Z_n(x)$. However, in the limit as n increases, $\bar{Z}_n(x)$ becomes closer and closer to the true mean, or expected, concentration

$$\bar{Z}_n(x) \to \mu(x) \text{ as } n \text{ increases}.$$ (3-4)

This theoretical limit is a constant value, or population parameter, as opposed to $\bar{Z}_n(x)$, which is a random variable, or a property of the particular sample that is obtained.

In example 3.1.1, no assumptions were needed concerning whether the mean changed with spatial location because all sampling was done at one sampling location, x. In most HTRW-site applications, the mean probably changes, depending on the sampling location. In addition, usually only one measurement is available at any particular location. Therefore, some assumptions regarding the structure of μ(x) must be made. For example, to assume that μ(x) = μ is constant for all x sometimes is appropriate, in which case, Z(x) has a stationary mean. For example, data that have no underlying trend, such as hydraulic conductivity in a homogeneous aquifer, might be assumed to have a constant mean. If the mean is constant, estimating it with the sample average of n measurements obtained at different spatial locations $x_1, x_2, \ldots, x_n$ is reasonable; therefore,



$$\bar{Z}_n = \frac{1}{n}\sum_{i=1}^{n} Z(x_i).$$ (3-5)

However, in contrast to example 3.1.1, $\bar{Z}_n$ may not get closer to μ as n increases, as defined in equation 3-4. Because of the possible spatial correlation in the data, the size of the sampling region needs to be large, compared to the correlation length, for $\bar{Z}_n$ to accurately estimate μ.

In addition to the mean of Z(x), its variability or dispersion also is of interest, and this variability is most commonly measured by the spatial variance, defined to be the mean of squared deviations of Z(x) from μ(x) and denoted by $\sigma^2(x)$,

$$\sigma^2(x) = E\{[Z(x) - \mu(x)]^2\}.$$ (3-6)

The spatial standard deviation σ(x) is the square root of the variance. The following example illustrates the difference between the population variance, which has been defined in equation 3-6, and a sample variance.

3.1.2 Example 3.1.2

If the scenario presented in example 3.1.1 is used again, the sample variance $S_n^2(x)$ of the n measurements could be computed as follows:

$$S_n^2(x) = \frac{1}{n-1}\sum_{i=1}^{n}\left[Z_i(x) - \bar{Z}_n(x)\right]^2.$$ (3-7)

This equation gives a measure of dispersion of the $Z_i(x)$ values from their sample mean. The sample variance depends on n and on the particular values measured for $Z_1(x), Z_2(x), \ldots, Z_n(x)$. However, in the limit as n increases, $S_n^2(x)$ gets closer and closer to a constant value, which is denoted by $\sigma^2(x)$. Thus, $\sigma^2(x)$ is a population parameter, and $S_n^2(x)$ is a random variable.

The mean and the variance both can be calculated from the probability distribution of Z(x). Again, in geostatistics, the relations among regionalized variables at different locations are of interest. From the joint distribution of $Z(x_1)$ and $Z(x_2)$, the spatial covariance function,

$$C(x_1, x_2) = E\{[Z(x_1) - \mu(x_1)][Z(x_2) - \mu(x_2)]\},$$ (3-8)

may be obtained. This function is key in geostatistical analyses. It is a measure of the association between values obtained at point $x_1$ and the values obtained at point $x_2$. If values at these two spatial locations tend to be greater than average or less than average at the same time, then the covariance is positive. However, if the values vary in the opposite direction (that is, one value tends to be larger than average when the other value is less than average, or vice versa), the covariance is negative.

Because $C(x_1, x_2)$ is an unknown population parameter, it too needs to be estimated using a statistic computed from sample data. To make this estimate possible, the covariance function is sometimes assumed to depend only on the distance between points, which is the lag h, and not on the relative location or orientation of the points:

$$C(x_1, x_2) = C(h), \qquad h = \sqrt{(u_1 - u_2)^2 + (v_1 - v_2)^2}.$$ (3-9)

Under this assumption, C(h) can be estimated by pooling all pairs of measurements that are approximately h units apart and computing a sample covariance function

$$\hat{C}(h) = \operatorname*{average}_{h - \Delta h < h_{ij} < h + \Delta h} \left\{ [Z(x_i) - \bar{Z}_n][Z(x_j) - \bar{Z}_n] \right\},$$ (3-10)

where $h_{ij}$ is the distance between $x_i$ and $x_j$ and the average is from all pairs of points so that $h_{ij}$ is between $h - \Delta h$ and $h + \Delta h$. The quantity $\Delta h$ is called the lag tolerance. There are more effective ways to estimate C(h) besides using equation 3-10; for example, see Isaaks and Srivastava (1989). However, because the emphasis in this report is on the variogram (to be defined below) rather than on the covariance function, this method of estimating the covariance function does not need to be used.
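The pooling of pairs within a lag tolerance can be sketched as follows. This minimal Python example assumes a constant (stationary) mean, so that deviations are taken from the overall sample mean; the function name `sample_covariance` and the data are hypothetical and serve only to illustrate the averaging over pairs whose separation lies between h minus the lag tolerance and h plus the lag tolerance.

```python
import numpy as np

def sample_covariance(coords, values, h, dh):
    """Average product of deviations from the sample mean over all pairs
    whose separation distance lies strictly between h - dh and h + dh."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)            # each pair counted once
    lags = np.linalg.norm(coords[i] - coords[j], axis=1)
    dev = values - values.mean()                         # assumes a constant mean
    in_bin = (lags > h - dh) & (lags < h + dh)           # lag tolerance window
    if not in_bin.any():
        return float("nan"), 0                           # no pairs at this lag
    products = dev[i][in_bin] * dev[j][in_bin]
    return float(products.mean()), int(in_bin.sum())

# Hypothetical example: four sample points on a line, 1 unit apart.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 2.0, 4.0, 7.0]
c1, n1 = sample_covariance(coords, values, h=1.0, dh=0.5)
c3, n3 = sample_covariance(coords, values, h=3.0, dh=0.5)
```

With these values the estimate at lag 1 is positive and the estimate at lag 3 is negative, but the latter rests on a single pair, which illustrates why the number of pairs per lag should be reported alongside the estimate.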

A covariance function is called stationary if it does not depend on the origin of the coordinate system; that is,

$$C(x_1 + b,\, x_2 + b) = C(x_1, x_2) \tag{3-11}$$

for any given vector, b (fig. 1B). The covariance function (eq. 3-9) is stationary because changing the origin does not change the distance between the points. Substituting $x_1 = x_2 = x$ in equation 3-9 yields

$$C(x, x) = C(0), \tag{3-12}$$

which, when combined with the definitions in equations 3-6 and 3-8, becomes

$$\sigma^2(x) = C(0) \quad \text{for all } x. \tag{3-13}$$

Therefore, when Z(x) has a stationary covariance function, the variance of Z(x) is constant for all x. The covariance function then can be standardized by dividing it by the variance. The resulting dimensionless function of h is called the spatial correlation function,

$$\rho(h) = \frac{C(h)}{C(0)}. \tag{3-14}$$

The correlation function is a scale-independent measure of linear association between values of Z at different locations. The spatial correlation is always between -1 and +1, with zero indicating no linear association.

In addition to being stationary, the covariance function in equation 3-9 has another important property. It also is isotropic, or omnidirectional, because the function does not depend on the direction between the two locations. In many HTRW applications, the correlation between values of Z at two locations is affected by direction as well as lag. For example, contaminant concentrations in a ground-water flow system might be more highly correlated along a transect in the direction of flow than along a transect perpendicular to the flow. Therefore, the covariance function depends on the lag, h, and on the angle, $\alpha$, between locations,

$$C(x_1, x_2) = C(h, \alpha), \qquad h = \sqrt{(u_1 - u_2)^2 + (v_1 - v_2)^2}, \qquad \alpha = \operatorname{atan}\!\left(\frac{v_2 - v_1}{u_2 - u_1}\right). \tag{3-15}$$

In this example, $\alpha$ is the angle measured counterclockwise from the east (fig. 1B and C). In many geostatistical publications or computer software, the angle may be defined as clockwise from the north, so the appropriate angle in any application needs to be carefully defined. A covariance function that satisfies equation 3-15 is called anisotropic, or directional.

To summarize, the basic model framework that is used throughout this report is the following: the value of a measurement z(x) (concentration, porosity, hydraulic head, and so on) at location x in a two-dimensional region is the value of a regionalized random variable, Z(x), with mean $\mu(x)$ and stationary covariance function $C(h, \alpha)$. Other assumptions may be added in the applications sections of this report to analyze specific data sets, but this framework is the basic framework from which many of the results are derived. In some situations, the covariance-stationarity assumption may be relaxed; for instance, when using the linear variogram described in the next section.

3.2 Variograms

Regionalized random variables differ from classical (ordinary least-squares) regression models in that the residuals, defined as the deviations of the regionalized random variable from its mean and denoted by

$$Z^*(x) = Z(x) - \mu(x), \tag{3-16}$$

are related to one another, whereas the residuals in a regression model are generally assumed to be independent. Thus, in the regionalized random-variable model, measured values of the residuals at measurement locations contain valuable information when predicting the value of Z(x) at unsampled locations. The relation between the residuals can be understood by examining the variogram, which is a tool that is widely used in geostatistics for modeling the degree of spatial dependence in a regionalized random variable.
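Both the directional covariance function of equation 3-15 and the variograms discussed in this section are functions of the lag and angle between two locations. The following is a minimal Python sketch of that computation, not code from this report; the function names `lag_and_angle` and `to_azimuth` are hypothetical. It assumes that a pair of locations is unordered, so the angle is folded into the interval (-90, 90] degrees, and it shows one way to convert to the clockwise-from-north convention that some software uses.

```python
import math

def lag_and_angle(p1, p2):
    """Lag h and angle alpha (degrees, counterclockwise from east) for two points.

    p1 and p2 are (u, v) coordinate pairs. The pair is treated as unordered,
    so alpha is folded into (-90, 90], matching the atan form of equation 3-15.
    """
    du = p2[0] - p1[0]
    dv = p2[1] - p1[1]
    h = math.hypot(du, dv)
    alpha = math.degrees(math.atan2(dv, du))  # atan2 avoids dividing by du == 0
    if alpha > 90.0:
        alpha -= 180.0
    elif alpha <= -90.0:
        alpha += 180.0
    return h, alpha

def to_azimuth(alpha):
    """Convert counterclockwise-from-east degrees to clockwise-from-north degrees in [0, 180)."""
    return (90.0 - alpha) % 180.0

h, alpha = lag_and_angle((0.0, 0.0), (3.0, 4.0))
print(h, alpha, to_azimuth(alpha))
```

Checking which convention a given software package expects before passing an angle to it avoids rotating the anisotropy by mistake.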



Although the variogram is closely related to the covariance function, there are some important differences between the variogram and the covariance function that are described in this section. The covariance function and the related correlation function are more commonly used in basic statistics courses than the variogram, so many readers may be more familiar with the former concepts. However, the variogram is more widely used in geostatistics and is used as the primary tool for analyzing spatial dependence in the remainder of this report.

As with the covariance function, it is necessary to distinguish between the theoretical variogram, which is based on population parameters, and the sample variogram, which is an estimator of the theoretical variogram obtained from measured data. The theoretical variogram of a regionalized random variable, $\gamma(x_1, x_2)$, is defined as one-half the variance of the difference between residuals at locations $x_1$ and $x_2$,

$$\gamma(x_1, x_2) = \tfrac{1}{2}\operatorname{Var}\!\left[Z^*(x_1) - Z^*(x_2)\right]. \tag{3-17}$$

Because the residuals have been mean-centered, as shown in (3-16), they have a mean of zero. Therefore, using the well-known formula for the variance of a random variable, X,

$$\operatorname{Var}(X) = E(X^2) - (EX)^2, \tag{3-18}$$

equation 3-17 is equal to

$$\gamma(x_1, x_2) = \tfrac{1}{2}E\!\left\{\left[Z^*(x_1) - Z^*(x_2)\right]^2\right\}. \tag{3-19}$$

The theoretical variogram is always nonnegative; a small value of $\gamma$ indicates that the residuals at locations $x_1$ and $x_2$ tend to be similar and a large value of $\gamma$ indicates that the residuals tend to be different. Although equation 3-19 is sometimes called a semi-variogram because of the multiplication by 1/2, it is referred to in this report as a variogram.

Knowing the theoretical variogram before taking measurements would be ideal, but the theoretical variogram is typically estimated using sample data. To facilitate variogram estimation, it is usually assumed that, as with the covariance function, $\gamma$ depends only on the lag,

$$\gamma(x_1, x_2) = \gamma(h), \tag{3-20}$$

or possibly, on the lag and angle between locations,

$$\gamma(x_1, x_2) = \gamma(h, \alpha), \qquad \alpha = \operatorname{atan}\!\left(\frac{v_2 - v_1}{u_2 - u_1}\right) \tag{3-21}$$

(fig. 1). Equation 3-20 is called an isotropic variogram, and equation 3-21 is a directional variogram at angle $\alpha$.

For the isotropic variogram, the sample, or empirical, variogram is obtained by averaging the square of all computed differences between residuals separated by a given lag:

$$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{h - \Delta h < h_{ij} < h + \Delta h} \left[z^*(x_i) - z^*(x_j)\right]^2, \tag{3-22}$$

where
$h_{ij}$ is the distance between $x_i$ and $x_j$, and
$N(h)$ is the number of pairs in the sum.

For a given h, as more and more points that are separated by distance $h \pm \Delta h$ are sampled and as $\Delta h$ decreases, $\hat{\gamma}(h)$ approaches the theoretical variogram. More detail on variogram estimation is presented in section 5.0, including the directional case. In this present section, some general properties of isotropic variograms are described that are referred to numerous times in section 5.0.

A plot of the sample variogram versus h often has a considerable degree of scatter (fig. 2), which is especially evident if the sample size, n, is small. However, the points can usually be fitted by a smooth curve that represents a theoretical variogram selected from a suite of possible choices. Usually, the

theoretical variogram is monotonically increasing, signifying that the farther two measurements are apart, the more their residuals tend to differ, on average, from one another. Several properties common to many theoretical variograms are shown in figure 2. If the variogram either reaches or becomes asymptotic to a constant value as h increases, that value is called the sill (fig. 2). The distance (value of h) after which the variogram remains at or close to the sill is called the range. Measurements whose locations are farther apart than the range have the same, or even no, degree of association and are assumed to be uncorrelated. Often, a variogram has a discontinuity at the origin, signifying that even measurements obtained very close together are not identical. Such variation in the measurements at small scales is called the nugget effect. The size of the discontinuity is called the nugget. Although the nugget effect is sometimes confused with the measurement error, there is a subtle difference between these two concepts that is explained in section 3.3. A simple monotonic function is usually selected to approximate the variogram. Four such functions that are often used in practice are:

The exponential variogram (parameters: sill, s > 0; nugget, 0 < g < s; range, r > 0),

$$\gamma(h) = \begin{cases} g + (s - g)\left[1 - \exp\!\left(-\dfrac{3h}{r}\right)\right], & h > 0 \\ 0, & h = 0; \end{cases} \tag{3-23}$$

the spherical variogram (parameters: sill, s > 0; nugget, 0 < g < s; range, r > 0),

$$\gamma(h) = \begin{cases} s, & h > r \\ g + (s - g)\left[\dfrac{3h}{2r} - \dfrac{h^3}{2r^3}\right], & 0 < h \le r \\ 0, & h = 0; \end{cases} \tag{3-24}$$

the Gaussian variogram (parameters: sill, s > 0; nugget, 0 < g < s; range, r > 0),

$$\gamma(h) = \begin{cases} g + (s - g)\left[1 - \exp\!\left(-\dfrac{3h^2}{r^2}\right)\right], & h > 0 \\ 0, & h = 0; \end{cases} \tag{3-25}$$

the linear variogram (parameters: nugget, g > 0; slope, b > 0),

$$\gamma(h) = \begin{cases} g + bh, & h > 0 \\ 0, & h = 0. \end{cases} \tag{3-26}$$

Although there are many other functions, these four describe the variogram models most commonly used (Journel and Huijbregts, 1978); the four models are shown in figure 3. The exponential, spherical, and Gaussian models are similar in that they all have a sill and a range. However, they have different shapes near zero lag (h = 0) that result in substantial differences in the prediction results using the three models as discussed in section 5.0. The linear model is quite different from the other three, in that it does not reach a sill, but increases linearly without bound. This fact has important implications on the prediction results using a linear variogram. Because the squared differences between residuals increase without bound as the lag increases, a regionalized random variable based on a linear variogram has ever increasing variability about its mean as the size of the sampling region increases. In applications involving the linear variogram, the variogram is usually truncated at a sill corresponding to the value of the variogram at maximum lag, $h_{max}$. This is illustrated in figure 3 where a linear variogram is shown with a sill and range.

Figure 2. Variogram and features. [Plot of the variogram against lag (h), with the nugget, sill, and range marked.]
NOTE: The x's denote hypothetical sample variogram points computed from observed data. The smooth curve represents a theoretical variogram fitted to the sample variogram points.
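The four variogram models of equations 3-23 through 3-26 can be written as short functions. The following is a minimal Python sketch using NumPy, not code from this report; the function names are hypothetical. It assumes the practical-range convention implied by the factor of 3 in the exponential model, and it applies the same factor in the Gaussian model. The optional `h_max` argument of the linear model is an assumed way to express the truncation at maximum lag described above.

```python
import numpy as np

def exponential_variogram(h, s, g, r):
    """Exponential model with sill s, nugget g, and range r; zero at h = 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, g + (s - g) * (1.0 - np.exp(-3.0 * h / r)), 0.0)

def spherical_variogram(h, s, g, r):
    """Spherical model; equals the sill s exactly for h >= r."""
    h = np.asarray(h, dtype=float)
    x = np.minimum(h / r, 1.0)  # clip so the curve stays at the sill beyond r
    return np.where(h > 0, g + (s - g) * (1.5 * x - 0.5 * x**3), 0.0)

def gaussian_variogram(h, s, g, r):
    """Gaussian model; assumes the same factor of 3 as the exponential model."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, g + (s - g) * (1.0 - np.exp(-3.0 * h**2 / r**2)), 0.0)

def linear_variogram(h, g, b, h_max=None):
    """Linear model with nugget g and slope b, optionally truncated at h_max."""
    h = np.asarray(h, dtype=float)
    if h_max is not None:
        h = np.minimum(h, h_max)  # hold the value reached at maximum lag
    return np.where(h > 0, g + b * h, 0.0)

lags = np.array([0.0, 2.5, 5.0, 10.0, 20.0])
print(exponential_variogram(lags, s=2.0, g=0.5, r=10.0))
print(spherical_variogram(lags, s=2.0, g=0.5, r=10.0))
```

With these conventions the exponential and Gaussian models reach about 95 percent of the distance from the nugget to the sill at h = r, whereas the spherical model reaches the sill exactly.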



Figure 3. Theoretical variograms indicating A, exponential; B, spherical; C, Gaussian; and D, linear models. [Four panels, each plotting the variogram against lag (h).]
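Before one of the models in figure 3 can be fitted, the sample variogram of equation 3-22 has to be computed from the residuals. The following is a minimal Python sketch of the isotropic estimator using NumPy, not the procedure of any particular software package; the function name `sample_variogram` and its arguments are hypothetical. It assumes the residuals have already been obtained by subtracting the mean, and it returns the number of pairs in each lag class so that sparsely populated lags can be identified.

```python
import numpy as np

def sample_variogram(coords, resid, lags, tol):
    """Isotropic sample variogram for residuals at two-dimensional locations.

    coords: (n, 2) array of (u, v) locations
    resid:  length-n array of residuals z*(x_i)
    lags:   lag values h at which to estimate the variogram
    tol:    lag tolerance; a pair is used when h - tol < h_ij < h + tol
    """
    coords = np.asarray(coords, dtype=float)
    resid = np.asarray(resid, dtype=float)
    i, j = np.triu_indices(len(resid), k=1)      # each unordered pair once
    dist = np.hypot(*(coords[i] - coords[j]).T)  # pair distances h_ij
    sqdiff = (resid[i] - resid[j]) ** 2
    gamma, counts = [], []
    for h in lags:
        mask = (dist > h - tol) & (dist < h + tol)
        n_pairs = int(mask.sum())
        # one-half the average squared difference; NaN when no pairs fall in the class
        gamma.append(0.5 * sqdiff[mask].mean() if n_pairs else np.nan)
        counts.append(n_pairs)
    return np.array(gamma), np.array(counts)

coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
resid = [0.0, 1.0, 3.0]
gamma, counts = sample_variogram(coords, resid, lags=[1.0, 2.0], tol=0.5)
print(gamma, counts)
```

In this small example two pairs are one unit apart and one pair is two units apart, so the second estimate rests on a single squared difference, which illustrates why sample variograms from small data sets show considerable scatter.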

Although the variogram is commonly used in a geostatistical analysis, an intuitive understanding of geostatistical techniques may be more easily obtained by using the covariance function, or equivalently, the spatial variance and the correlation function. When Z(x) has a stationary, isotropic covariance function (eq. 3-9), there is a one-to-one correspondence between the variogram and the covariance function:

$$\gamma(h) = C(0) - C(h). \tag{3-27}$$

As long as C(h) approaches zero as h increases (a minor technicality that can always be assumed in practice), then, the variogram reaches a sill and the sill equals C(0) as indicated by equation 3-27. Therefore, using a regionalized random variable that is covariance stationary, the variogram and the spatial covariance function contain the same information. By factoring out C(0) = s from equation 3-27 and using equation 3-14, the relation between the spatial correlation function and the variogram can be obtained,

$$\rho(h) = 1 - \frac{\gamma(h)}{s}. \tag{3-28}$$

From equation 3-28, high values of $\gamma(h)$ (that is, close to s) signify low values of $\rho(h)$. In fact, $\rho(h) = 0$ whenever $\gamma(h) = s$, indicating that measurements whose locations are farther apart than the range are uncorrelated. As h decreases, a nugget in $\gamma(h)$ is reflected in a correlation that is less than 1,

$$\rho(h) \rightarrow 1 - \frac{g}{s} \quad \text{as } h \rightarrow 0. \tag{3-29}$$

Therefore, the larger g is in relation to s, the less correlated nearby observations are. The case when g = s, called a pure nugget variogram, results in $\rho(h) = 0$ for all h > 0. In that case, neighboring measurements are uncorrelated no matter how closely they are spaced.

Occasionally, $\gamma(h)$ may not reach a finite sill, as in the linear variogram (eq. 3-26). In that case, it is not possible to define a correlation function as in (3-28). The corresponding regionalized random variable is said to be intrinsically stationary (Journel and Huijbregts, 1978), which is more general than covariance stationary. The theory behind intrinsically stationary variograms is not discussed in this report. As long as a pseudo-range, $h_{max}$, is defined, all of the computations described below can be used for the linear variogram model.
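The relation between the variogram and the correlation function in equations 3-28 and 3-29 can be checked numerically. The following is a minimal Python sketch, not code from this report, that applies equation 3-28 to the exponential model of equation 3-23; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

def exponential_variogram(h, s, g, r):
    """Exponential variogram with sill s, nugget g, and range r (eq. 3-23)."""
    if h == 0:
        return 0.0
    return g + (s - g) * (1.0 - math.exp(-3.0 * h / r))

def correlation(h, s, g, r):
    """Spatial correlation implied by the variogram (eq. 3-28)."""
    return 1.0 - exponential_variogram(h, s, g, r) / s

s, g, r = 2.0, 0.5, 10.0
print(correlation(0.0, s, g, r))    # exactly 1 at zero lag
print(correlation(1e-9, s, g, r))   # close to 1 - g/s for a very small lag (eq. 3-29)
print(correlation(5.0, s, s, r))    # pure nugget (g = s): zero for any h > 0
```

The jump from 1 at zero lag to about 1 - g/s just beyond it is the nugget effect expressed in terms of correlation.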

3.3 Kriging

Given a regionalized random variable Z(x) that has a known theoretical variogram, how can the value of Z(x) be predicted at an arbitrary location, based on measurements taken at other locations? To answer that question, suppose that Z is measured at n specified locations: $Z(x_1), \ldots, Z(x_n)$. For example, Z could represent hydraulic conductivity and the locations might correspond to n preexisting wells in an aquifer. Let a new location be given by $x_0 = (u_0, v_0)$ and denote the ith measurement location by $x_i = (u_i, v_i)$. Suppose that, based on prior knowledge of the geology in the study area, there are no prevailing trends in hydraulic conductivity, so the mean of Z(x) is assumed to be constant over the entire study area:

$$\mu(x) = \mu \quad \text{(constant)}. \tag{3-30}$$

Suppose the value of $Z(x_0)$ is to be predicted by using a linear predictor, $\hat{Z}(x_0)$, which is defined as a weighted linear combination of the measured data,

$$\hat{Z}(x_0) = \sum_{i=1}^{n} w_i Z(x_i), \tag{3-31}$$

where
$w_i$ is the weight assigned to $Z(x_i)$.

To determine specific values for the weights, some criteria need to be specified for $\hat{Z}(x_0)$ to be a good predictor of $Z(x_0)$. The first criterion is that $\hat{Z}(x_0)$ needs to be an unbiased predictor of $Z(x_0)$, which is expressed as

$$E\!\left[\hat{Z}(x_0) - Z(x_0)\right] = 0. \tag{3-32}$$

An unbiased predictor neither consistently overpredicts nor underpredicts $Z(x_0)$ because the statistical expectation of the prediction errors is zero. The second criterion for a good predictor is that it have small prediction variance as defined by

$$\operatorname{Var}\!\left[\hat{Z}(x_0) - Z(x_0)\right] = E\!\left\{\left[\hat{Z}(x_0) - Z(x_0)\right]^2\right\}. \tag{3-33}$$

The smaller the prediction variance, the closer $\hat{Z}(x_0)$ is (on average) to the true value $Z(x_0)$. The geostatistical technique of kriging computes the best linear unbiased predictor of $Z(x_0)$, which is the linear unbiased predictor (eqs. 3-31 and 3-32) that has the smallest possible prediction variance (eq. 3-33).

The best linear unbiased predictor depends on the mean of Z(x). For example, if Z(x) has a constant mean (eq. 3-30) and a pure nugget variogram [$\gamma(h) = s$ for all h > 0], the best linear unbiased predictor of $Z(x_0)$ is the average of the measured data,

$$\hat{Z}(x_0) = \frac{1}{n}\sum_{i=1}^{n} Z(x_i). \tag{3-34}$$

Because the variogram is the same for all h > 0 and there is no trend in the data, there is no reason to favor any of the measurements over any of the other measurements. Therefore, the weights are all the same. Ordinary kriging, which is discussed in section 3.3.1, deals with the constant-mean model (the assumption in eq. 3-30) in which the variogram is not a pure nugget variogram. The weights of the best linear unbiased predictor reflect the information in the variogram and result in an improved predictor over the sample mean. In section 3.3.2, universal kriging, which is the extension of ordinary kriging to a nonconstant mean, is discussed. Universal kriging is a very powerful tool that can be used to combine regression models and spatial prediction into one unifying theory. Other, more specialized types of kriging that are discussed in this section are block kriging (section 3.3.3), co-kriging (section 3.4), and indicator kriging (section 3.5.2).

There also is a prediction technique in geostatistics known as simple kriging, which uses the best linear unbiased prediction in the case when the mean of Z(x) is fixed and known. Simple kriging is not discussed in this report because, in most applications, the mean is not known and has to be estimated.



3.3.1 Ordinary Kriging

Let Z(x) be a regionalized random variable with a constant mean (eq. 3-30) and an isotropic variogram (eq. 3-20). Also, assume that the variogram reaches a sill so the variance of Z(x) is C(0) = s, and the correlation function is given by equation 3-28. Although the prediction equations can be expressed in terms of a variogram, they are defined in this report in terms of the sill (variance) and the correlation function.

Consider linear unbiased predictors from equation 3-31 with the condition in equation 3-32 holding. The unbiased condition is equivalent to

$$\mu\left(\sum_{i=1}^{n} w_i - 1\right) = 0$$

for any $\mu$, which holds if, and only if,

$$\sum_{i=1}^{n} w_i = 1.$$

Therefore, all linear unbiased predictors need to have weights that sum to 1. There are many sets of weights that satisfy this condition, including the set in which all the weights equal 1/n, as in the sample mean (eq. 3-34). However, the unique set of weights that minimize the prediction variance (eq. 3-33) can be shown to satisfy the following set of n + 1 ordinary kriging equations (Isaaks and Srivastava, 1989, chap. 12):

$$\sum_{j=1}^{n} w_j \rho_{ij} + \frac{\lambda}{s} = \rho_{i0}, \quad i = 1, 2, \ldots, n, \tag{3-35a}$$

$$\sum_{j=1}^{n} w_j = 1, \tag{3-35b}$$

where
$\rho_{ij} = \rho(h_{ij})$ is the correlation between measurements i and j,
$h_{ij}$ is the distance between locations i and j, and
$\lambda$ is a coefficient resulting from the constrained optimization.

Furthermore, the resulting ordinary kriging variance is

$$\sigma_K^2(x_0) = s\left[1 - \sum_{j=1}^{n} w_j \rho_{j0}\right] - \lambda. \tag{3-36}$$

The system of equations 3-35a and 3-35b can easily be solved for the $w_j$'s and $\lambda$, after which the kriging variance can be obtained from equation 3-36. The ordinary kriging variance changes depending on the prediction location, $x_0$, even though the variance of $Z(x_0)$ itself (eq. 3-6) is constant for all $x_0$.

3.3.1.1 Example 3.3.1.1

Let the mean of Z(x) satisfy equation 3-30, and suppose that the residual $Z^*(x)$ (eq. 3-16) has an isotropic exponential variogram (eq. 3-23). Consider predicting $Z(x_0)$ based on n = 2 measurements, $Z(x_1)$ and $Z(x_2)$, where the three locations ($x_0$, $x_1$, and $x_2$) are distinct. Using equations 3-23 and 3-28, the correlation function is

$$\rho(h) = \begin{cases} \left(1 - \dfrac{g}{s}\right)\exp\!\left(-\dfrac{3h}{r}\right), & h > 0 \\ 1, & h = 0. \end{cases} \tag{3-37}$$

Suppose that

$$\frac{g}{s} = p, \quad 0 < p < 1, \tag{3-38}$$

where
p is a fixed proportion.

The quantity p is sometimes referred to as a relative nugget.

The ordinary kriging equations 3-35a and 3-35b are given by

$$w_1 + w_2\rho_{12} + \frac{\lambda}{s} = \rho_{10} \tag{3-39a}$$

$$w_1\rho_{12} + w_2 + \frac{\lambda}{s} = \rho_{20} \tag{3-39b}$$

$$w_1 + w_2 = 1. \tag{3-39c}$$

These three equations have three unknowns $w_1$, $w_2$, and $\lambda$; the solution is

$$w_1 = \frac{1}{2} + \frac{1}{2}\,\frac{\rho_{10} - \rho_{20}}{1 - \rho_{12}}, \tag{3-40a}$$

$$w_2 = \frac{1}{2} - \frac{1}{2}\,\frac{\rho_{10} - \rho_{20}}{1 - \rho_{12}}, \tag{3-40b}$$

and

$$\lambda = -\frac{s}{2}\left(1 + \rho_{12} - \rho_{10} - \rho_{20}\right). \tag{3-41}$$

The resulting kriging variance is

$$\sigma_K^2(x_0) = s\left[\frac{3 + \rho_{12}}{2} - \frac{(\rho_{10} - \rho_{20})^2}{2(1 - \rho_{12})} - \rho_{10} - \rho_{20}\right]. \tag{3-42}$$

Although there are only three sample locations in this example (two actual and one potential), the example indicates several properties of best linear unbiased prediction that generally hold. For example,

Effect of sill: The kriging weights depend on s only through the relative nugget, p. However, the kriging variance is directly proportional to s. The sill is called a scaling parameter because scaling each measurement by a constant, c, has the effect of scaling s by $c^2$. When the relative nugget is allowed to vary so that s and g can change independently, the effect of s is somewhat more complicated.

Effect of nugget: Increasing p has the effect of drawing each of the weights closer to 1/2. As p approaches 1, both weights equal 1/2. The larger g is compared to s, the more small-scale variability there is in the data, and the less important the correlation between neighboring locations becomes. The increased small-scale variability also causes an increase in the kriging variance.

Effect of correlations: If $Z(x_0)$ is more highly correlated with $Z(x_1)$ than with $Z(x_2)$, then $w_1$ is larger than $w_2$, indicating that the measurement at the first location has more predictive information than the measurement at the second location. Also, correlation in the data always decreases the kriging variance compared to the variance using uncorrelated data.

Effect of data clumping: If $Z(x_1)$ and $Z(x_2)$ are highly correlated, as indicated by $\rho_{12}$ being close to 1, then the two measurements contain much of the same information. Two situations then can occur: $\rho_{10} = \rho_{20}$, where the weights are both equal, or $\rho_{10} > \rho_{20}$ [$\rho_{10} < \rho_{20}$], where $w_1$ is much larger [or smaller] than $w_2$. In either case, the kriging variance increases to reflect the same information in the two measurements. The automatic adjustment of the kriging weights and kriging variance to account for data clumping is an important property of the kriging predictor.

3.3.1.2 Example 3.3.1.2 (Nugget Effect Versus Measurement Error)

In example 3.3.1.1, all three locations $x_0$, $x_1$, and $x_2$, were assumed to be distinct. When a prediction location coincides with a measurement location, an important distinction needs to be made between a true nugget effect and a measurement error. Suppose that, in example 3.3.1.1, $x_0$ and $x_1$ are the same. If there is only small-scale variability and no measurement error, then repeated measurements at the same location would be identical, that is, $\rho_{10} = 1$. In this situation, the kriging equations result in $w_1 = 1$, in $w_2 = 0$, in $\lambda = 0$, and in a kriging variance of zero. That is, $Z(x_1)$ is a perfect predictor of $Z(x_0)$. This property, called exact interpolation, is a property of kriging when the data are assumed to contain no measurement errors. However, suppose that the nugget is interpreted as a measurement error rather than a small-scale variability. Repeated measurements at the same location would not be perfectly correlated, but rather, $\rho_{10} = 1 - g/s$. Substituting this correlation into the kriging equations and solving the equations results in a predictor that does not exactly interpolate the data, but smooths the measured data to account for the measurement error. In this report, prediction locations are assumed not to coincide with measurement locations, in which case no distinction needs to be made between the nugget and the measurement error.
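The ordinary kriging equations 3-35a and 3-35b form a small linear system that can be solved directly. The following is a minimal Python sketch using NumPy, not the algorithm of any particular kriging package; the function names `exp_corr` and `ordinary_kriging` are hypothetical. It assumes the exponential correlation function of equation 3-37 with relative nugget p, and it treats the nugget as small-scale variability, so a prediction location that coincides with a measurement location has a correlation of 1 with that measurement.

```python
import numpy as np

def exp_corr(h, p, r):
    """Exponential correlation with relative nugget p and range r (eq. 3-37)."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, (1.0 - p) * np.exp(-3.0 * h / r), 1.0)

def ordinary_kriging(coords, x0, s, p, r):
    """Solve eqs. 3-35a and 3-35b; return weights, lambda, and kriging variance."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    rho = exp_corr(d, p, r)                                    # rho_ij
    rho0 = exp_corr(np.linalg.norm(coords - np.asarray(x0, dtype=float), axis=1), p, r)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = rho      # last column multiplies lambda/s; last row sums the weights
    A[n, n] = 0.0
    b = np.append(rho0, 1.0)
    sol = np.linalg.solve(A, b)
    w, lam_over_s = sol[:n], sol[n]
    variance = s * (1.0 - w @ rho0 - lam_over_s)               # eq. 3-36
    return w, s * lam_over_s, variance

# Two measurements, prediction location closer to the first one.
w, lam, var = ordinary_kriging([(0.0, 0.0), (10.0, 0.0)], (2.0, 0.0), s=2.0, p=0.2, r=10.0)
print(w, lam, var)

# Larger relative nugget: the weights move toward 1/2 and the variance increases.
w_big, _, var_big = ordinary_kriging([(0.0, 0.0), (10.0, 0.0)], (2.0, 0.0), s=2.0, p=0.8, r=10.0)
print(w_big, var_big)
```

For n = 2 the numerical weights can be compared with the closed-form solution of example 3.3.1.1. To treat the nugget as measurement error instead, the correlation at zero distance between the prediction location and a measurement would be set to 1 - p rather than 1.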



3.3.2 Universal Kriging

Universal kriging is an extension of ordinary kriging and can be important in HTRW-site investigations because environmental data often contain drift. Universal kriging addresses the nonconstant mean $\mu(x)$. Generally, the mean is assumed to have a functional dependence on spatial location of the form

$$\mu(x) = \sum_{k=1}^{p} \beta_k f_k(u, v), \tag{3-43}$$

where the $f_k(u, v)$'s are known deterministic functions of x = (u, v) (that is, these functions serve as independent variables) and the $\beta_k$'s are regression coefficients to be estimated from the data. Suppose Z(x) is hydraulic head in an aquifer. If the flow is in a steady state, the mean of Z(x) could be assumed, in a given study, to have a unidirectional ground-water gradient that is expressed by

$$\mu(u, v) = \beta_1 + \beta_2 u. \tag{3-44}$$

In this example, there are two independent variables,

$$f_1(u, v) = 1 \quad \text{and} \quad f_2(u, v) = u, \tag{3-45}$$

and two regression coefficients ($\beta_1$ and $\beta_2$). The mean can include other independent variables besides the simple algebraic functions of u and v. For example, if the aquifer is not of uniform thickness, an independent variable that involves the aquifer thickness at location (u, v) could be included.

The form of the mean in equation 3-43 also is generally used in standard linear-regression analysis. In regression, ordinary least squares is used to solve for the coefficients; when this is done, the residuals are assumed to be independent and identically distributed. Universal kriging is an extension of ordinary least-squares regression that allows for spatially correlated residuals. Assuming that Z(x) is a regionalized random variable with a mean as in equation 3-43 and a residual correlation function as in equation 3-28, the best linear unbiased predictor can be obtained from the following n + p equations, called the universal kriging equations (Journel and Huijbregts, 1978):

$$\sum_{j=1}^{n} w_j \rho_{ij} + \sum_{k=1}^{p} \frac{\lambda_k}{s} f_k(x_i) = \rho_{i0}, \quad i = 1, 2, \ldots, n, \tag{3-46a}$$

$$\sum_{j=1}^{n} w_j f_k(x_j) = f_k(x_0), \quad k = 1, 2, \ldots, p, \tag{3-46b}$$

where, in contrast to the ordinary kriging equations (eqs. 3-35a and 3-35b), there are now p coefficients $\lambda_1, \ldots, \lambda_p$ resulting from the unbiased condition on the predictor. The first term in the mean (eq. 3-43) is usually a constant, or an intercept, for which $f_1(x) = 1$. Therefore, the universal kriging model includes ordinary kriging as a special case. The universal kriging variance is given by

$$\sigma_K^2(x_0) = s\left[1 - \sum_{i=1}^{n} w_i \rho_{i0}\right] - \sum_{k=1}^{p} \lambda_k f_k(x_0). \tag{3-47}$$

Equations 3-46a and 3-46b and equation 3-47 can be easily solved to obtain universal kriging predictors and kriging variances for any location. The estimated trend surface does not need to be computed to obtain the universal kriging predictor. If a particular application needs an estimate of the trend surface, then generalized least-squares regression can be used to estimate the coefficients ($\beta_k$'s) in the regression equation.

3.3.3 Block Kriging

In the previous sections, the problem of predicting the value of a regionalized random variable at a specified location in the region for which

the variable is defined has been discussed. Implicit in this discussion is the assumption that the support of the variable being predicted is defined in the same way as the variables that make up the measurements. However, there may be applications where estimating the average value of Z for an estimation block of much larger area than is represented by an individual sample is necessary. For example, an estimate of the average concentration of a contaminant in an entire aquifer that is based on point measurements at various locations might be needed. In other applications, an estimate of the average concentration of soil contaminant, in daily excavation volumes that are much larger than the volume of an individual sample, may be needed. Let $Z_B$ be the average value of Z(x) for a particular block B,

$$Z_B = \frac{1}{m}\sum_{i=1}^{m} Z(x_{0i}), \tag{3-48}$$

where $x_{0i}$, i = 1, ..., m, denotes m prediction locations in block B. The object is to predict this average rather than the regionalized random variable at a single location. In many applications, the locations $x_{0i}$ might correspond to nodes of a regular grid or finite-element nodes in a ground-water model. The results of the block kriging are dependent on m and on the prediction locations. Selecting a large number of locations in block B, where each location has approximately the same representative area, probably is the best approach to block kriging (Isaaks and Srivastava, 1989, chap. 13).

The objective of block kriging is to obtain the best linear unbiased predictor of $Z_B$ and an estimate of the block kriging variance based on the measurements. The model for Z(x) can be the constant-mean model (eq. 3-30) assumed for ordinary kriging or the more general linear-regression model (eq. 3-43) assumed for universal kriging. For either, the predicted value of $Z_B$ coincides with the average of the predicted values of the individual measurements in the block; that is,

$$\hat{Z}_B = \frac{1}{m}\sum_{i=1}^{m} \hat{Z}(x_{0i}). \tag{3-49}$$

In this model, the individual predicted values are obtained from either the ordinary or the universal kriging equations. However, computation of the block kriging variance is not as simple as computation of the point kriging variance because the individual kriging estimates are not independent of one another. There are simple modifications to the kriging equations, discussed in sections 3.3.1 and 3.3.2, that can be used to directly compute the kriging estimate of $Z_B$ and its kriging variance (Isaaks and Srivastava, 1989, chap. 13). The equations are not presented in this report. The computer packages described in the next section can be used to compute block kriging estimates. In general, kriged values of block averages are less variable than kriged values at single locations. Consequently, the block kriging variance tends to be smaller than the kriging variance at a single location.

3.4 Co-Kriging

Kriging as discussed so far provides a way of predicting values of a regionalized variable Z(x) at a location $x_0$ based on measurements of the same variable at locations $x_1, x_2, \ldots, x_n$. In some situations, however, measurements could be available not only of Z(x), but also of one or more other variables that can be used to improve predictions of $Z(x_0)$. The variable Z(x) is called the primary variable because it is the one to be predicted, and the other variables are called secondary variables. Co-kriging is a technique that uses the information contained in secondary variables to predict a primary variable. For example, suppose that Z(x) is a regionalized variable representing the hexavalent chromium concentration, a relatively difficult determination, and suppose that the hexavalent chromium concentration needs to be predicted at a location $x_0$ based on measurements of hexavalent chromium at other locations. However, there also are measurements of a second, relatively easily determined contaminant, such as lead, that, for the purposes of this example, tend to be correlated with hexavalent chromium concentration, and these data are to be used as well. Denote the second variable, lead, by a regionalized variable W(x), and assume that measurements have been made on W at m locations $x'_1, x'_2, \ldots, x'_m$. The co-kriging predictor of $Z(x_0)$ then is



$$\hat{Z}(x_0) = \sum_{i=1}^{n} w_i Z(x_i) + \sum_{j=1}^{m} w'_j W(x'_j). \tag{3-50}$$

This extension of the kriging predictor in equation 3-31 is straightforward. Analogous to kriging, co-kriging produces the weights $w_i$ and $w'_j$ so that the resulting predictor is the best linear unbiased predictor. Also, as with kriging, co-kriging uses modeling of the variogram for Z, but co-kriging presents an additional necessity of modeling the variogram for W and the cross variogram for Z and W. The optimal weights then are expressed in terms of all these variogram properties. More than one secondary variable may be included in the co-kriging predictor, and theory has been developed for co-kriging in the presence of drift (universal co-kriging) and block co-kriging. Details are not included in this report, but Isaaks and Srivastava (1989) and Deutsch and Journel (1992) have more discussion and citations of other references.

One situation for which co-kriging might be useful is when the primary variable is undersampled, so any additional information, such as that given by secondary variables, would be helpful. However, although co-kriging can be a useful tool, joint modeling of several variables tends to be demanding as to data and computational requirements. Thus, undersampling of the primary variable may present problems for co-kriging and for one-variable kriging. Also, unless the primary variable of interest is highly correlated with the secondary variable(s), the weights assigned to the secondary variable(s) are often small, and the effort needed to include the additional variable(s) may not be worthwhile. For these reasons, co-kriging is not used extensively in practice.

Although co-kriging is similar to universal kriging in that both techniques use extra variables to predict Z(x), there is an important distinction between the two techniques. In universal kriging, the independent variables in equation 3-43 need to be known with certainty at the prediction location $x_0$. For example, aquifer thickness might be an independent variable in predicting aquifer head if the thickness can easily be determined at any location. However, aquifer thickness may need to be considered a secondary variable in a co-kriging procedure if the thickness is only known at a few selected locations in the aquifer.

3.5 Using Kriging to Assess Risk

The kriging predictor of $Z(x_0)$ has certain desirable properties based on how close it is to the actual value of $Z(x_0)$; it is unbiased and has the smallest variance among all linear predictors. However, when possible, the relation between the predicted and observed values could be specified further, and ideally, probability statements could be made. For example, if $Z(x_0)$ is the concentration of a contaminant, a 95-percent certainty that the true concentration is within 0.05 microgram per liter of the predicted concentration might be desired. In other situations, the probability that the actual concentration exceeds a given target concentration might need to be estimated. Knowledge of the entire distribution function of Z(x), as opposed to knowledge of only the mean and variogram of Z(x), can be used for risk-qualified inferences in situations when extremes might be of more interest than averages.

A discussion of the concept of a conditional probability-distribution function of the regionalized variable Z(x) is appropriate at this point. The concept also is applicable in section 8.0 when conditional simulation is discussed. The conditional probability-distribution function is defined much like the probability-distribution function in section 3.1, except the probability that Z(x) < c is computed conditional on, or given, information at other spatial locations. Geostatistics is used to make predictions at a location $x_0$ using information at measurement locations $x_1, x_2, \ldots, x_n$; therefore, in conditional distributions, the focus is on $P[Z(x_0) < c \mid Z(x_1), Z(x_2), \ldots, Z(x_n)]$. The vertical bar denotes the conditioning and is read "given." The conditional probability distribution needs to be determined to make probability statements about the regionalized variable at location $x_0$. Also, conditional mean and conditional variance can be defined in the same way that mean and variance for distribution functions were defined in section 3.1.

Section 3.5.1 contains methods for using kriging output to obtain prediction intervals or
quantiles when the regionalized random variable is either normally distributed or can be
transformed to a near-normal distribution. Section 3.5.2 discusses indicator kriging,
which is a nonparametric technique for obtaining quantiles when data cannot be adequately
transformed to a normal distribution.

3.5.1 Normal Distributions and Transformations

For prediction at a location $x_0$, a kriging analysis produces the predictor
$\hat{Z}(x_0)$ and the associated kriging variance $\sigma_K^2(x_0)$. If more informative
probability assessments are to be made, the ideal situation is when Z(x) is assumed to be
a Gaussian, or normal, process, which means that $[Z(x_1), \ldots, Z(x_n)]$ has a joint
normal probability distribution for any set of n locations and any value of n. Then the
conditional probability distribution of $Z(x_0)$, given the n measurements, is a normal
distribution that has a conditional mean equal to the kriging predictor $\hat{Z}(x_0)$ and
conditional variance equal to the kriging variance $\sigma_K^2(x_0)$.

This normal distribution can be used to obtain a prediction interval for $Z(x_0)$
(conditional on the measured data). For example, from a table of the normal distribution,
a value of 1.96 corresponding to a 0.95 (two-sided) probability can be obtained. Then the
assertion that there is a 95-percent chance that $Z(x_0)$ is in the 95-percent prediction
interval $[\hat{Z}(x_0) - 1.96\,\sigma_K(x_0),\ \hat{Z}(x_0) + 1.96\,\sigma_K(x_0)]$ can
be made. Knowing this interval is much more useful than simply knowing the kriging
predictor and variance.

To illustrate quantile estimation, suppose that contaminant concentrations are being
studied, and a concentration that has only a 1-percent chance of being exceeded at
location $x_0$ needs to be determined. The appropriate (one-sided) value from a normal
table is 2.33, so the desired estimate is $\hat{Z}(x_0) + 2.33\,\sigma_K(x_0)$.

Even if Z(x) is not Gaussian, a transformation, Y(x) = T[Z(x)], can often be found so that
Y(x) is approximately Gaussian. When a transformation is made, the kriging analysis is
performed using the transformed data Y(x), and the inverse transformation may be applied
to obtain prediction intervals for the original data. For example, the most common
transformation is the (natural) logarithmic transformation, in which Y(x) = ln[Z(x)]. A
95-percent prediction interval for $Z(x_0)$ then is
$\{\exp[\hat{Y}(x_0) - 1.96\,\sigma_K(x_0)],\ \exp[\hat{Y}(x_0) + 1.96\,\sigma_K(x_0)]\}$,
in which $\sigma_K(x_0)$ is the kriging standard deviation of the transformed data. As
long as the transformation is a one-to-one function, such as a logarithmic transform,
prediction intervals for the original data can be obtained by simply back-transforming
prediction intervals for the transformed data.

Although prediction intervals and probabilities can be easily obtained using simple
back-transformation, obtaining a predictor of the untransformed data that is unbiased and
optimal in some sense is more difficult. For example, using a logarithmic transformation,
a kriging analysis using the transformed data yields a predictor $\hat{Y}(x_0)$, which is
the best linear unbiased predictor of $Y(x_0)$. However, the back-transformed value
$\hat{Z}(x_0) = \exp[\hat{Y}(x_0)]$ does not possess the same optimality properties as a
predictor of $Z(x_0)$. The technique known as log-normal kriging, and more generally as
trans-normal kriging, has been developed to obtain predictors when transformations are
made (Journel and Huijbregts, 1978), but because of the complexity involved, the technique
is not usually used by practitioners. For example, if a predicted value corresponding to
$Z(x_0)$ needs to be obtained for contour plotting, the kriging predictions
$\hat{Y}(x_0)$ may be back-transformed and plotted, as long as the investigator realizes
that such values do not have the usual kriging optimality properties.

3.5.2 Indicator Kriging

There may be situations when a transformation that makes Z(x) approximately normal cannot
be easily determined. In such situations, indicator kriging can be used to infer the
probability distribution of Z(x). Because no distributional assumptions are made, this
technique is known as a nonparametric statistical technique. An example of indicator
kriging is included in section 6.0, and an article by Journel (1988) is a good reference
for additional information about indicator kriging.

To perform indicator kriging, a special transformation, known as an indicator
transformation, is applied to Z(x):

$$ I(x, c) = \begin{cases} 1, & Z(x) \le c \\ 0, & Z(x) > c \end{cases} \qquad (3\text{-}51) $$
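The calculations described in section 3.5.1 and the indicator transformation of
equation 3-51 can be sketched in a few lines of Python. This is a minimal illustration
under stated assumptions, not a routine from any of the software packages discussed in
this report: the function names are hypothetical, the kriging predictor and kriging
variance are assumed to have been computed elsewhere, and the normal quantiles come from
the Python standard library rather than a printed table.

```python
import math
from statistics import NormalDist


def normal_prediction_interval(z_hat, kriging_var, level=0.95):
    """Two-sided prediction interval, assuming Z(x0) is conditionally normal
    with mean z_hat (kriging predictor) and variance kriging_var."""
    sigma = math.sqrt(kriging_var)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level = 0.95
    return z_hat - z * sigma, z_hat + z * sigma


def exceedance_quantile(z_hat, kriging_var, exceed_prob=0.01):
    """Value that has probability exceed_prob of being exceeded (one-sided)."""
    z = NormalDist().inv_cdf(1.0 - exceed_prob)  # about 2.33 for 1 percent
    return z_hat + z * math.sqrt(kriging_var)


def lognormal_prediction_interval(y_hat, kriging_var_y, level=0.95):
    """Interval for Z(x0) when kriging was done on Y = ln(Z): the interval
    for Y is back-transformed endpoint by endpoint."""
    lo, hi = normal_prediction_interval(y_hat, kriging_var_y, level)
    return math.exp(lo), math.exp(hi)


def indicator_transform(values, c):
    """Indicator transformation: 1 where the value is <= c, otherwise 0."""
    return [1 if v <= c else 0 for v in values]
```

For instance, a predictor of 10 with a kriging variance of 4 would give a 95-percent
interval of roughly 6.08 to 13.92. The back-transformed interval is valid only because
the logarithm is one-to-one; as noted above, the back-transformed predictor itself does
not inherit the kriging optimality properties.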



If, as in the usual kriging scenario, the data set at hand consists of measurements of the
regionalized variable Z(x) at n locations, c needs to be fixed first, and then the
indicator transformation is applied by replacing values that are less than or equal to c
with 1 and values that are greater than c with 0. The variogram and kriging analysis then
is performed using these 0's and 1's rather than the raw data.

Kriging predictors using the indicator data are equal to their measured values of 0 or 1
at the measurement locations $x_i$, $i = 1, \ldots, n$. However, at locations different
from the measurement locations, predictions may be between 0 and 1. In interpreting these
predictions, the power of indicator kriging becomes apparent. A predicted value at $x_0$
is an estimate of the conditional probability distribution
$P[Z(x_0) < c \mid Z(x_1), Z(x_2), \ldots, Z(x_n)]$. This analysis may be performed for a
range of values of c; therefore, the entire distribution function can be estimated. This
estimate of the distribution function can be used to obtain prediction intervals or
estimates of quantiles. For example, to estimate the value that has a 1-percent chance of
being exceeded at location $x_0$, the value of c for which the kriged indicator prediction
is 0.99 at that location is determined.

One advantage of indicator kriging is that the indicator variogram is robust with respect
to extreme outliers in the data because no matter how large (or small) Z(x) is, the
indicator variable is either 0 or 1. Indicator variables also may be used in block
kriging. For example, a spatial average of I(x,c) over a block B equals the fraction of
block B for which Z(x) is less than c. Another advantage of indicator kriging is that it
can be used when some data are censored.

Despite the relative ease of implementation, there are several drawbacks to indicator
kriging, and this technique might be used only when other techniques, such as normality
transformations, produce unacceptable results. For example, the kriged values of I(x,c)
may be less than 0 or larger than 1. Also, the kriged prediction for $I(x, c_1)$ may be
larger than the kriged prediction for $I(x, c_2)$ even if $c_1 < c_2$, which is not
compatible with a valid probability distribution. There are several more advanced
techniques for solving these problems (Isaaks and Srivastava, 1989, chap. 18); however,
those techniques are beyond the scope of this report.

4.0 GEOSTATISTICAL RESOURCES AND TOOLS

Since the mid-1970's, there have been texts and articles published that are either totally
dedicated to geostatistical methods or discuss geostatistics in detail. As well as being
separately published, numerous computer programs and software packages on geostatistics
and kriging are included in these texts and articles. Although only a few of these
resources are briefly described in this report, their references can provide lists of
other geostatistical topics or software not specifically covered in the resources.

4.1 Texts on Geostatistics

The geostatistical texts presented in this section can be classified into two broad
categories: instructional texts or reference texts. For one who is delving into
geostatistics for the first time, Clark's (1979) book can be a starting point. Simple
explanations of the basic kriging techniques are applied to an example data set. A more
detailed treatment of the kriging techniques is described by Isaaks and Srivastava (1989).
This text book presents discussions of many of the background statistical tools and
concepts needed in geostatistical applications, including histograms and distributions
(univariate and bivariate), sampling, correlation, and spatial continuity. The text also
discusses how to treat the subtleties of kriging using three example data sets. As well as
being instructional, the book also can be used as a reference.

Texts by Cressie (1991) and Journel and Huijbregts (1978) describe the tools of
geostatistics, but also include a comprehensive theoretical background on the techniques.
Cressie's (1991) text is a treatment of spatial processes in general and reviews a wide
range of statistical techniques in the analysis and stochastic modeling of spatial data.
There is a four-chapter section on geostatistics, with a complete discussion of variogram
estimation, kriging (including universal kriging), intrinsic random functions, and
comparisons of kriging to other spatial prediction techniques. The text is written from a
statistician's point of view and is, in places, written at a fairly high level
mathematically. Nevertheless, it contains numerous examples and illustrations using
real-world data. Journel and Huijbregts (1978) maintained a mining-geological perspective.
Two other texts

written by statisticians that present general treatments of spatial processes, but that
lack detailed discussions of kriging, are by Cliff and Ord (1981) and by Ripley (1981).

David's (1977) text was the first extensive discussion of the practice of geostatistics
and kriging in mining applications, and the discussion is presented from a practitioner's
viewpoint. The text references many specific mining applications and results for
geostatistics. A broad statistics text by Davis (1973), with a bent toward geological
applications, serves as a reference for standard statistical procedures needed in
geological applications of geostatistics. A book by Bras and Rodriguez-Iturbe (1985) that
discusses a range of techniques for stochastic modeling in hydrology includes a chapter on
applications of kriging. There is a fairly complete mathematical development of kriging
with details of an application to predict mean areal precipitation. In an article prepared
for the U.S. Environmental Protection Agency, Journel (1993) discussed geostatistics as it
relates to environmental science. Finally, Olea (1991) presented a useful glossary of
geostatistical terms.

4.2 Journals

The journal Mathematical Geology by the International Association for Mathematical
Geologists reports new developments in the theory and application of kriging. Although
many of the articles present new applications of kriging tools, many articles also are
dedicated to the derivation of statistical properties of the variogram, to kriging
estimation, and to cross-validation results. Journals such as Water Resources Research,
published by the American Geophysical Union, and Groundwater, published by the Association
of Ground Water Scientists and Engineers, contain articles describing special applications
of kriging techniques in the environmental arena. Water Resources Research includes many
theoretical articles. Other journals that may contain information addressing geostatistics
are the Journal of Environmental Engineering, published by the American Society of Civil
Engineers; Stochastic Hydrology and Hydraulics, published by Springer International; and
the North American Council on Geostatistics, published by the Colorado School of Mines.

4.3 Software

The geostatistics software described in this section is limited to a few readily available
public-domain packages that are executable at least on the DOS and sometimes on the UNIX
platforms. There are several commercial packages that are being marketed, but these
packages are not reviewed in this report. Each of the software packages described in this
report is listed in table 1, which may serve as a reference guide to other software
packages.

One of the earliest interactive kriging software packages was developed by Grundy and
Miesch (1987). Overall, this general statistics package, known as STATPAC, contains a
series of programs that can handle two-dimensional kriging, including universal kriging.
The package has capabilities for univariate statistics, transformations, variogram
analysis, and cross validation (table 1). The graphics in the package are limited to
simple line-printer plots of the sample variogram points and data maps. The menu-driven
package includes a tutorial using all of the kriging routines. The package is distributed
with most, but not all, of its source code and, therefore, can be modified by the user if
desired. All two-dimensional kriging routines can be executed from the command line, which
provides users with the opportunity for batch processing.

The geostatistical environmental-assessment software, known as GEO-EAS (Englund and
Sparks, 1991), also is an interactive, menu-driven kriging software package for performing
two-dimensional kriging. It has no direct provisions for universal kriging (table 1).
GEO-EAS does have an advantage over STATPAC through its enhanced graphics capabilities,
which are useful in the interactive fitting of theoretical variograms to sample variogram
points. In addition, in the computation of the sample variogram points, GEO-EAS allows for
variable bin sizes, the use of which is further discussed in section 5.0.

STATPAC and GEO-EAS were originally developed for the personal computer. Since then,
versions of GEO-EAS have been developed for some types of workstations. The kriging
routines in STATPAC have not been adapted to workstations.

A third software package, the geostatistical software library known as GSLIB (Deutsch and
Journel, 1992), is a suite of programs developed over the years at Stanford University,
Stanford, California.



Table 1. Geostatistical software characteristics

[Note: STATPAC, statistical software package developed by Grundy and Miesch (1987);
GEO-EAS, geostatistical environmental-assessment software developed by Englund and Sparks
(1991); GSLIB, geostatistical software library developed by Deutsch and Journel (1992);
GMS, ground-water modeling system developed for the U.S. Department of Defense]

Characteristic                       STATPAC                GEO-EAS    GSLIB                      GMS 2.0

Operating system                     DOS                    DOS/UNIX   Independent (requires      WINDOWS 95
                                                                       FORTRAN compiler)          UNIX
Menu driven                          Yes                    Yes        No                         Yes
Batch processing                     Yes                    No         Yes                        Yes
User modifications                   Yes, source            No         Yes, source                No
                                     code provided                     code provided
Data-set constraints                 Yes, modifications     Yes        Yes, modifications         Yes
                                     possible via                      possible via
                                     source code                       source code
ASCII output                         Yes                    Yes        Yes                        Yes

Univariate statistics                Yes                    Yes        Yes                        Yes
Additional exploratory capabilities  Yes                    Yes        Yes                        Yes
Graphical support for analysis       Yes                    Yes        Yes                        Yes
Transformations                      Yes                    Yes        Yes                        Yes
Back-transformations                 No                     No         Yes                        Yes

Variogram construction               Yes                    Yes        Yes                        Yes
Variogram analysis                   Yes                    Yes        Yes                        Yes
Variogram graphics                   Yes                    Yes        Yes                        Yes
Cross-validation operations          Yes                    Yes        Yes                        Yes

Ordinary kriging                     Yes                    Yes        Yes                        Yes
Universal kriging                    Yes                    No         Yes                        Yes
Block kriging                        Yes                    Yes        Yes                        Yes
Indicator kriging                    Yes                    Yes        Yes                        No
Conditional simulation               Perhaps with           No         Yes                        No
                                     batch processing
Three-dimensional kriging            Perhaps with           No         Yes                        Yes
                                     batch processing

Mapping                              Yes                    Yes        Yes                        Yes
Contouring                           Yes                    Yes        Yes                        Yes
Gray-scale maps                      Yes                    Yes        Yes                        Yes
Line printer                         Yes                    No         Yes                        Yes
High-resolution screen               No                     Yes        Yes, via                   Yes
                                                                       postscript
High-resolution printer              No                     Yes        Yes                        Yes
Postscript                           No                     No         Yes                        Yes

It is presented as a collection of routines that are machine independent (table 1) and are
intended to be used as a modular concept. The package is distributed as a suite of FORTRAN
source codes that need to be compiled. To use GSLIB effectively, a relatively high level
of familiarity with geostatistics is required. Like the other two software packages, GSLIB
handles variogram analysis and kriging techniques (table 1). Two of its primary advantages
over the other two packages are its simulation techniques and its ability to analyze
three-dimensional data sets. Such techniques are useful especially in estimating potential
extreme outcomes in a geostatistical analysis.

The U.S. Department of Defense Groundwater Modeling System (GMS) is a fourth software
package that has kriging capabilities. GMS is a windows-based integrated modeling
environment for site characterization, ground-water flow and transport modeling, and
visualization of results. The GSLIB software has been implemented in GMS to facilitate
two- and three-dimensional kriging and interactive variogram modeling. GMS also provides
comprehensive visualization techniques and other interpolation techniques that can be used
as alternatives to kriging. The GMS system was developed for the U.S. Department of
Defense by the Brigham Young University Engineering Computer Graphics Laboratory. GMS may
be obtained from the U.S. Army Groundwater Modeling Technical Support Center, Waterways
Experiment Station, Vicksburg, MS 39180.

In geostatistical software and literature, there can be differences in jargon or notation.
These differences may cause some initial confusion if users or readers do not familiarize
themselves with the jargon or notation. For example, some authors may use the term
"semi-variogram" rather than "variogram"; others may express random variables as other
than Z (which has been used in this report); and different software often has different
references for directional angles when discussing anisotropy.

5.0 PRACTICAL ASPECTS OF VARIOGRAM CONSTRUCTION AND INTERPRETATION

Section 3.0 presented the mathematical foundation for geostatistics and the kriging
technique. One theme that pervades the technique is the importance of the theoretical
variogram. The theoretical variogram, or what is often referred to simply as the
variogram, is a mathematical function or model that is fitted to sample variogram points
obtained from data. Permissible models, which include those models discussed in
section 3.0, belong to a family of smooth curves having particular mathematical properties
and are each specified by a set of parameters. Section 5.0 describes a sequence of stages
for estimating and investigating sample variogram points and a calibration procedure for
specifying the parameters of the variogram model eventually to be fitted to the sample
points. Although the calibration procedure is largely an objective way for evaluating
theoretical variograms, the process of obtaining sample variogram points and finalizing a
theoretical variogram remains an art as much as a science. An understanding of the
material presented in section 3.0 and professional judgment achieved through experience in
geostatistical studies are important in effectively using the guidelines presented in
section 5.0.

An accurate estimate of a variogram from a kriging perspective is needed because the
correlation matrix used to obtain the kriging weights is developed from the variogram
values. Even more directly, the variogram affects the computation of the kriging variance
(eqs. 3-36 and 3-47) through the product of the kriging weights and the correlation
values. An accurate variogram also can be used outside the strict context of kriging. For
example, in augmenting a spatial network with new data-collection sites, the range
parameter of the variogram could be used as the minimum distance of separation between the
new sites and between the new and the existing sites to maximize overall additional
regional information. In another nonkriging-specific application, the variogram is used in
dispersion variance computations in which the variance of areal or block values is
estimated from the variance of point-data values (Isaaks and Srivastava, 1989, p. 480).

The stages of variogram construction are described based on an example data set of
ground-water elevations measured near Saratoga, Wyoming (Lenfest, 1986). The data set is
summarized in table 2 and the relative locations of the data are shown in figure 4.

The sequence of steps in computing sample variogram points depends on the stationarity
properties of the regional variable represented by the data. If the mean of the regional
variable is the same for all locations, then the mean is said to be spatially stationary;
if the mean changes with location, then



Table 2. Univariate statistics for example data sets

[Note: Base unit for Saratoga, water level A and B, and bedrock A and B is meters; base
unit for water quality A is log concentration, concentration in micrograms per liter]

Example          Number of                      Minimum  Maximum  Mean     Median   Standard    Skewness
identifier       measurements   Transformation  (base    (base    (base    (base    deviation   (dimension-
                                                units)   units)   units)   units)   (base       less)
                                                                                    units)
Saratoga         44             Drift           614      687      646      641      17.3        0.45
Water level A    83             Drift           7.80     20.0     12.9     11.7     3.09        1.03
Water level B    74             Drift           7.80     20.0     13.1     11.8     3.23        0.87
Bedrock A        107            None            6.91     24.5     13.5     13.1     3.28        0.89
Bedrock B        90             None            7.75     21.1     13.3     13.2     2.62        0.26
Water quality A  66             Natural log     2.08     8.01     5.19     5.59     1.75        -0.42

it is spatially nonstationary. If the data have a stationary spatial mean, the discussions
in sections 5.2 and 5.6, which address nonstationarity and additional trend
considerations, can be skipped. If the spatial mean is not stationary, as in this example
data set, then sections 5.2 and 5.6 become important, and the sequence of stages for
obtaining a variogram becomes an iterative procedure. All variogram and kriging
computations for the Saratoga ground-water-level example were performed using the
interactive kriging software STATPAC described by Grundy and Miesch (1987).

5.1 General Computation of an Empirical Variogram

As described in section 3.2, the variogram, $\gamma(h)$, characterizes the spatial
continuity of a regional variable for pairs of locations as a function of distance or lag,
h, between the locations. This variogram is sometimes called the theoretical variogram
because it is assigned a continuous functional form that expresses the spatial correlation
for any lag in the region of analysis. The function is estimated by fitting one of the
equations in section 3.2 to empirical or sample variogram points, $\hat{\gamma}(h)$, using
data whose locations contribute only a finite number of lags. Although $\gamma(h)$
characterizes the spatial correlation of the data, it is computed from residuals of the
data from the spatial mean. Therefore, without prior knowledge of nonstationarity in the
underlying spatial process, the first step in computing the sample variogram is to
identify existing nonstationarity indicated for the spatial mean.

Figure 4. Measured water levels from Saratoga data. [Map of measurement locations; axes
show map distance, in kilometers; explanation gives an index to plotted values, in meters,
ranging from 614 to 679.]

The approximation to equation 3-19 begins by computing squared differences, $D_{ij}^2$,
from the data values $z(x_1), z(x_2), \ldots, z(x_n)$ collected at locations
$x_1, x_2, \ldots, x_n$:

$$ D_{ij}^2 = [z(x_i) - z(x_j)]^2 \qquad (5\text{-}1) $$

If the spatial mean is stationary, then the squared differences of the data are equal to
the squared differences of the residuals, and sample variogram computations can be
continued using the data themselves. If the spatial mean is strongly nonstationary, the
plot of equation 5-1 versus the distance between associated points may indicate a trend or
drift that needs to be removed before further variogram computations can be made. Drift
needs to be considered in HTRW studies such as determining contaminant concentrations
areally dispersed from localized sources or determining ground-water elevations that
follow a local or regional gradient. In such studies, sample variogram computations need
to be made using residuals obtained by subtracting the estimated drift value at each
location from the value of the datum at the location.

The differencing of the data in equation 5-1 is done without considering the relative
direction between the locations; that is, $D_{ij}^2$ is isotropically computed. A plot of
$D_{ij}^2$ versus $h_{ij}$ for all i, j (i > j), where $h_{ij} = |x_i - x_j|$, produces a
cloud of points whose properties govern the behavior of $\hat{\gamma}$. The central
tendency of the cloud generally increases with h. A substantial increase in the central
tendency that persists for large h can indicate a nonstationary spatial mean. The cloud
computed for the Saratoga data, with ground-water levels (z) in meters and lag (h) in
kilometers, is shown in figure 5 and does show increasing $D^2$ (meters squared) with
increasing h, indicating potential nonstationarity.
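The cloud of squared differences and the use of residuals from an estimated drift can be
sketched as follows. This is a minimal illustration assuming two-dimensional coordinates
and NumPy; the function names are hypothetical, and the first-order polynomial drift
fitted by ordinary least squares is only one possible choice of drift model, used here for
illustration.

```python
import numpy as np


def squared_difference_cloud(coords, values):
    """Return lags h_ij and squared differences D_ij^2 for every pair of
    locations, each pair counted once and without regard to direction."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)            # all unordered pairs
    h = np.linalg.norm(coords[i] - coords[j], axis=1)    # separation distance
    d2 = (values[i] - values[j]) ** 2                    # squared difference
    return h, d2


def linear_drift_residuals(coords, values):
    """Residuals after subtracting a plane a + b*u + c*v fitted by ordinary
    least squares (an assumed, first-order drift model)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    design = np.column_stack([np.ones(len(values)), coords])   # columns 1, u, v
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ coef
```

A data set of n measurements yields n(n - 1)/2 pairs, so the 44 Saratoga measurements
would give 946 points in the cloud. Passing the residuals rather than the raw values to
`squared_difference_cloud` gives the cloud for the drift-removed data.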

Figure 5. Squared differences of values for all possible pairs of points for Saratoga
data. [Scatter plot; horizontal axis: distance between pairs, in kilometers (0 to 100);
vertical axis: squared differences, in meters squared (0 to 6,000). Dashed line is
ordinary least squares fit indicating slight parabolic shape.]



Generally, there is a large amount of scatter in these plots, as seen in figure 5, and
this scatter can conceal the central tendency of $D^2$ with h. One way to estimate the
central tendency and to minimize the effect of aberrant data is to collect the $D^2$ into
K bins or lag intervals of width $(\Delta h)_k$, $k = 1, \ldots, K$, and assign to
$\hat{\gamma}$ the average of the values of $D^2$ in each bin. This process is similar to
the way data are placed in bins for obtaining histograms. The expression for the kth
average bin value is

$$ \hat{\gamma}(h_k) = \frac{1}{2N(h_k)} \sum_{i>j} I_k(h_{ij})\, D_{ij}^2 \qquad (5\text{-}2) $$

where
$N(h_k)$ is the number of squared differences that fall into bin k,
$h_k$ is the lag distance associated with bin k, and
$I_k(\cdot)$ is an indicator function that has a value of 1 if $h_{ij}$ falls into bin k
and 0 otherwise [$I_k(\cdot)$ only allows values of $D_{ij}^2$ in the calculation that
have an $h_{ij}$ that falls into the bin].

The lag value $h_k$ can be the midpoint of the bin or it can be the average of the actual
lag values for the points that fall in the bin.

To establish bins, equal bin widths are specified and the distance between the two most
separated data points, $h_{max}$, is subdivided according to these equal increments, or a
K is chosen that defines the bin width. For the Saratoga data, a bin width of about
8 kilometers established K = 12 bins for $\hat{\gamma}$. The $\hat{\gamma}$ points
computed from the binned $D^2$ values in figure 5 are shown in figure 6. The lag plotting
positions are the average h values in the bin. The symbol x indicates that N(h) is less
than 30 pairs for the particular bin, and this differentiation is discussed in
section 5.3. Although the sample variogram is still preliminary, its general behavior at
this

Figure 6. Initial sample variogram points for Saratoga data. [Plot; horizontal axis: lag,
in kilometers (0 to 100); vertical axis scaled from 0 to 1,500. Symbol + marks a lag with
greater than or equal to 30 pairs; symbol x marks a lag with less than 30 pairs.]
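The binning of squared differences into sample variogram points can be sketched as
follows. This is a minimal illustration with a hypothetical function name, not the
STATPAC routine used for the Saratoga example. It assumes equal bin widths, uses the
average lag in each bin as the plotting position, and applies a factor of 1/2 to the bin
average (the semivariogram convention, which is an assumption here). The `min_pairs`
argument is likewise an illustrative parameter for flagging sparsely populated bins.

```python
import numpy as np


def sample_variogram(h, d2, bin_width, min_pairs=30):
    """Bin squared differences by lag and return one tuple per nonempty bin:
    (average lag, variogram value, number of pairs, enough_pairs flag)."""
    h = np.asarray(h, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    n_bins = max(1, int(np.ceil(h.max() / bin_width)))
    # Bin index for each pair; the largest lag is kept in the last bin.
    k = np.minimum((h / bin_width).astype(int), n_bins - 1)
    points = []
    for b in range(n_bins):
        in_bin = k == b
        n_pairs = int(in_bin.sum())
        if n_pairs == 0:
            continue                                  # skip empty bins
        lag = float(h[in_bin].mean())                 # average lag in the bin
        gamma = 0.5 * float(d2[in_bin].mean())        # assumed 1/2 convention
        points.append((lag, gamma, n_pairs, n_pairs >= min_pairs))
    return points
```

The flag in the last position of each tuple plays the role of the + and x symbols in
figure 6, separating bins that reach the chosen minimum number of pairs from those that
do not.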

stage is adequate to indicate if nonstationarity needs to be addressed before sample
variogram refinement is done.

5.2 Nonstationarity

An indication of substantial nonstationarity or drift in the spatial mean would be a
parabolic shape through all lags in a plot of $\hat{\gamma}$. This shape occurs because
differences between data contain differences in the drift component that increase as h
increases. If equation 3-16 is inserted into equation 3-17, squaring the differences in
$\mu$ greatly amplifies the increase with h. In this case, drift, generally a low-order
(less than three) polynomial drift in (u,v), is fitted to the data and subsequently
subtracted from the data to obtain residuals. Trend surfaces are not necessarily limited
to polynomial forms. For example, a numerical model of ground-water flow may be used to
obtain residuals of ground-water head data.

In theory, the polynomial trend indicates a slowly varying drift in the spatial mean and,
as such, one regional trend surface should be fitted to all the data. However, often the
drift and residuals are obtained locally; that is, using moving neighborhoods of
locations. Therefore, estimates of these values at any point are made using a decreased
number (usually between 8 and 16) of surrounding locations, which is done because,
ultimately, the kriging estimates are made using only the data values in the given
neighborhood. Manipulating the kriging matrices takes less time when a small number of
data are used to make estimates, and these efficiencies can be substantial for dealing
with large data sets. Little accuracy is lost because the nearest neighbors have the most
effect in the kriging weighting scheme.

A parabolic shape for $\hat{\gamma}$ in the Saratoga data is shown in figure 6 for the
sample variogram points plotted for lags up to about 32 kilometers (the first four points)
and for lags greater than about 56 kilometers. A parabolic shape in the sample variogram
points was not surprising because analysis of the data indicates a north-south gradient in
the ground-water levels. The simplest polynomial trend, linear in u and v, was fitted to
all the data using ordinary least-squares estimation. Residuals obtained by subtracting
this regional trend surface from the data were used to reestimate $\hat{\gamma}$ in
equation 5-2, and the sample variogram for the residuals is shown in figure 7.

5.3 Variogram Refinement

In section 5.1, an initial $\hat{\gamma}$ was specified by points computed from
equation 5-2. In general, the larger $N(h_k)$ is for any bin k, the more reliable are the
points defining $\hat{\gamma}(h_k)$. Also, the larger K is, the greater the number of
sample variogram points shaping $\hat{\gamma}$. However, $N(h_k)$ and K are competing
elements of $\hat{\gamma}$. Journel and Huijbregts (1978) suggested that each bin k could
have $N(h_k)$ equal to at least 30 pairs. The American Society for Testing Materials
(1996) suggested 20 pairs for each lag interval. For small data sets, the number of
intervals may have to be small to guarantee either number of suggested pairs in all bins.

The minimum number of data, n, needed to satisfy the $N(h_k)$ requirements for all bins of
a sample variogram is difficult to determine. Simple combinatorial analysis can establish
a sample size needed for a given total number of distinct pairs obtained from the sample,
but the analysis does not address the spatial considerations needed for proper lagging.
For example, for data collected on a uniform grid and equal-sized bins, fixing an n to
just satisfy the minimum $N(h_k)$ for the small lags would yield insufficient data pairs
to meet the minimum $N(h_k)$ for the larger lags. Fixing an n to ensure the minimum
$N(h_k)$ for the large lags would generally have $N(h_k)$ much greater than the minimum
for the small lags. Therefore, the question of how much data are needed to adequately
compute a variogram also needs to address the relative locations of the data-collection
sites.

The first 10 of the 12 bins for $\hat{\gamma}$ for the Saratoga data contained more than
30 data pairs. Therefore, the bin width was decreased to have more points define the early
part of $\hat{\gamma}$. These bin-width adjustments were made to refine $\hat{\gamma}$
whether it was computed from the data or from the residuals. A plot of $\hat{\gamma}$ for
the residuals for the Saratoga ground-water elevations with the bin width narrowed to
about 6.5 kilometers is shown in figure 8.

Spatial data usually are not collected on a uniform grid, but occur in a pattern that
reflects problem areas, accessibility, and general spatial coverage. In the Saratoga data
set, nonuniform data spacing resulted in the number of data pairs in each bin being highly
variable among the bins, although there were still greater than 30 data pairs. This
variability yields different reliabilities for the points

5.0 PRACTICAL ASPECTS OF VARIOGRAM CONSTRUCTION AND INTERPRETATION 29


defining y. To establish a balance for N(hk) among adequate. Knowing that the variogram is a smooth
the bins, variable bin sizes can be used so that each function, the analyst visually decides when the
bin contains approximately the same (large) number sample variogram is sufficiently defined at all lags
of points. A bin having few points can be coalesced to adequately approximate a theoretical variogram.
with an adjacent bin to form a wider bin having a large
number of points. Conversely, a bin having an exces-
sive number of points can be subdivided into adjacent, 5.4 Transformations and Anisotropy
narrower bins. The coalescing and subdividing proce-
dure is largely trial and error until the distribution of 5.4.1 Transformations
the pairs of points is satisfactory. A transformation is applied to a data set gener-
The values of y at the small lag values are ally for one of two interrelated purposes. First, a
the most critical in defining the appropriate y. There- transformation can decrease the variability of highly
fore, the trade-off between the number of bins and fluctuating data. This variability especially occurs with
the number of data pairs within each bin can be varied contaminant concentrations where order-of-magnitude
for different regions of the sample variogram. At small changes in data at proximate sites are not uncommon.
lags, the numbers of data pairs per bin can be closer to The effects of such data would be erratic sample vario-
the minimum N(hk) so more bins can be defined. At gram points indicated by a large amplitude, ill-defined
larger lags, a smaller number of wider bins may be sawtooth pattern of the lines connecting the points.

[Plot: sample variogram points versus lag, in kilometers (0 to 100). + Lag with greater than or
equal to 30 pairs; x Lag with less than 30 pairs]

Figure 7. Sample variogram points for ordinary least-squares trend residuals for Saratoga data.
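The trend removal behind figure 7 can be sketched as follows. This is a minimal illustration, assuming a planar drift z = a + b*u + c*v fitted by ordinary least squares; the function names and the five synthetic observations are hypothetical and are not the Saratoga data.

```python
def fit_linear_trend(u, v, z):
    """Ordinary least-squares fit of z = a + b*u + c*v; returns (a, b, c)."""
    n = len(z)
    # Centering the variables reduces the normal equations to a 2x2 system.
    mu, mv, mz = sum(u) / n, sum(v) / n, sum(z) / n
    du = [x - mu for x in u]
    dv = [y - mv for y in v]
    dz = [w - mz for w in z]
    suu = sum(x * x for x in du)
    svv = sum(y * y for y in dv)
    suv = sum(x * y for x, y in zip(du, dv))
    suz = sum(x * w for x, w in zip(du, dz))
    svz = sum(y * w for y, w in zip(dv, dz))
    det = suu * svv - suv * suv
    if det == 0:
        raise ValueError("locations are collinear; a planar trend is not unique")
    b = (suz * svv - svz * suv) / det
    c = (svz * suu - suz * suv) / det
    a = mz - b * mu - c * mv
    return a, b, c


def trend_residuals(u, v, z, coeffs):
    """Subtract the fitted trend surface from the data."""
    a, b, c = coeffs
    return [zi - (a + b * ui + c * vi) for ui, vi, zi in zip(u, v, z)]


# Hypothetical observations that lie exactly on a plane.
u = [0.0, 10.0, 0.0, 10.0, 5.0]
v = [0.0, 0.0, 10.0, 10.0, 2.0]
z = [100.0 + 0.5 * ui - 2.0 * vi for ui, vi in zip(u, v)]
coeffs = fit_linear_trend(u, v, z)
residuals = trend_residuals(u, v, z, coeffs)
```

The residuals, not the raw values, would then be passed to the sample variogram computation.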

[Plot: sample variogram points versus lag, in kilometers (0 to 100). + Lag with greater than or
equal to 30 pairs; x Lag with less than 30 pairs]

Figure 8. Sample variogram points for ordinary least-squares trend residuals for Saratoga data binned to 6.5 kilometers.
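The binning that produces plots such as figures 7 and 8 can be sketched as follows. The sketch assumes the classical estimator (half the mean squared difference of all pairs in a lag bin) for equation 5-2, which is defined outside this section; the function names, the equal-width bins, and the four-point transect are hypothetical.

```python
import math


def sample_variogram(coords, values, bin_width, n_bins):
    """Return (mean_lag, gamma_hat, n_pairs) for each non-empty lag bin.

    Bin k holds pairs whose separation h satisfies
    k * bin_width <= h < (k + 1) * bin_width.
    """
    sq_sums = [0.0] * n_bins
    lag_sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            k = int(h // bin_width)
            if k < n_bins:
                sq_sums[k] += (values[i] - values[j]) ** 2
                lag_sums[k] += h
                counts[k] += 1
    return [
        (lag_sums[k] / counts[k], sq_sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    ]


def has_enough_pairs(bins, min_pairs=30):
    """Mirror the figure legends: flag bins with at least min_pairs pairs."""
    return [n_pairs >= min_pairs for _, _, n_pairs in bins]


# Hypothetical transect with unit spacing.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
bins = sample_variogram(coords, values, bin_width=1.0, n_bins=4)
enough = has_enough_pairs(bins, min_pairs=2)
```

Narrowing `bin_width` gives more points at small lags at the cost of fewer pairs per point, which is the trade-off described in the text. Bins flagged as sparse are candidates for coalescing with a neighbor.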

Second, a proper transformation of data whose probability distribution is highly skewed often
produces a set of values that is approximately normally distributed by mitigating the effect
of problematic extreme data. A data set having a normal distribution is important in kriging
when confidence levels of the estimates are desired. The use of confidence levels in a kriging
analysis is discussed in section 6.0.

Among the more common transformations is the natural logarithmic (log) transform. In this
transformation, γ̂ is the sample variogram of the logarithms, and subsequent kriged estimates
are logarithms. Another transformation that is often used, especially in spatial analyses of
contaminant levels, is the indicator transformation described in section 3.5.2. Although a
transformation might result in a better distribution of sample variogram points, there are
subtleties in interpreting the kriging results of the transformed data or in back-transforming
kriging results into the untransformed (original) units, as discussed in section 3.5.1. If a
satisfactory variogram of the original data cannot be achieved and a transformation is
indicated, the computation of a sample variogram needs to begin again with equation 5-2. Even
though no transformation was needed for the Saratoga data, an example using a logarithmic
transformation and an example using the indicator transformation are presented in section 6.0.

5.4.2 Directional Variograms and Anisotropy

Anisotropy in the data can be investigated by computing sample variograms for specific
directions. Locations included in a given direction from an original location are contained in
a sector of a circle of radius h_max centered on the original location. The sector is
specified by two angular inputs. The first is a bearing defining the specific direction of
interest [measured counterclockwise from east (= 0 degrees)] and the second is a window angle
defining an arc in both directions from the bearing. Thus, in the terminology used here, the
total angle defining a direction is equal to twice the window angle. Differences in sample
variograms computed using these angle windows specified for different directions can be an
indication of anisotropy.

Anisotropy is generally either geometric or zonal. Geometric anisotropy is indicated by
directional theoretical variograms that have a common sill value but different ranges. The
treatment of geometric anisotropy is dependent on the software used. The lags of the
directional variograms can be scaled by the ratio of their ranges to the range of a standard
or common variogram. In some cases, the lags of all directional variograms are scaled by their
respective ranges, and a common variogram that has a range parameter of 1 is used.
Ground-water contaminant plumes often have geometric anisotropy in which the prevailing plume
direction has a greater range than the range of the transect of the plume.

Zonal anisotropy is indicated by directional variograms that have the same range but different
sills. Pure zonal anisotropy is usually not seen in practice; generally, it is found combined
with geometric anisotropy. Such mixed anisotropy may be present when evaluating the variograms
of three-dimensional HTRW-sampling results. Variability of such data (as indicated by the sill
of the variogram) may be substantially higher and the range substantially shorter in the
vertical direction than in the horizontal direction. To model this mixture of anisotropic
variograms, the overall variogram is set to a weighted sum of individual models of the
directional variograms and scaled by their ranges. This process is called nesting, in which
the choice of weights uses a trial-and-error approach with a constraint that the sum of the
weights equals the sill of the overall variogram. Isaaks and Srivastava (1989, p. 377-390)
contain further information on both types of anisotropy.

For a given number of data locations, directional sample variograms will have fewer points for
any lag when compared to the points for the same lag in the omnidirectional variogram. Hence,
point values in directional variograms are less reliable, which could be a critical
constraining factor for small data sets or for a data pattern that does not conform to a
direction of anisotropy. To determine the adequacy of the data for determining anisotropy, the
computations of anisotropic sample variograms can be initially limited to two orthogonal
directions with window angles of 45 degrees.

Directional sample variograms also can be used to further delineate nonstationarity of the
spatial mean. If the omnidirectional sample variogram indicates a drift in the data, the
directional variograms may determine the dimension of the drift. That is, although the
directional sample variograms may not establish the degree of the polynomial in the drift
equation, they can indicate the relative strengths of the drift in the u and v directions.

The computed sample variograms for north-south and east-west directions and window angles of
45 degrees for the Saratoga data are shown in figure 9. The north-south variogram is specified
by a direction angle of 90 degrees and a window angle of 45 degrees. The north-south variogram
shows the preferential north-south data alignment by mimicking the omnidirectional (direction
angle = 0 degrees and window angle = 90 degrees) sample variogram in figure 6. The east-west
variogram is specified by a direction angle of 0 degrees and a window angle of 45 degrees. The
lack of pairs of locations for the east-west variogram precludes a good analysis for this
direction, but the overlap of the few sufficiently defined variogram points with the
north-south variogram indicates a consistency of drift in the two directions. Because of this
consistency, an isotropic variogram is assumed for the Saratoga residuals. An example of
kriging using anisotropic variograms is described in section 6.0.

5.5 Fitting a Theoretical Variogram to the Sample Variogram Points

The importance of adequately defining the bin values of a sample variogram is substantiated by
the need to accurately generalize the data-based behavior of the sample variogram by a
theoretical variogram, γ. The parameters controlling the specific behavior of theoretical
variograms are the nugget value, the range, and the sill or, in the case of a linear
variogram, a slope. Of these parameters, the nugget and the sill can be related to properties
and statistics of the data.

[Plots, panels A and B: sample variogram points versus lag, in kilometers (0 to 100). + Lag with
greater than or equal to 30 pairs; x Lag with less than 30 pairs]

Figure 9. Initial directional sample variogram points for raw Saratoga data A, north-south and
B, east-west.
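The sector test used to assign a pair of locations to a directional variogram can be sketched as follows. The sketch assumes that direction is axial, so a pair pointing south qualifies for the north-south variogram just as a pair pointing north does; under that assumption a bearing of 0 degrees with a window angle of 90 degrees admits every pair, matching the omnidirectional settings quoted above. The function name and the sample coordinates are hypothetical.

```python
import math


def in_direction(p, q, bearing_deg, window_deg, h_max):
    """True if the separation from p to q falls in the directional sector.

    bearing_deg is measured counterclockwise from east (0 degrees).
    window_deg is the half-width of the sector on either side of the bearing.
    """
    du, dv = q[0] - p[0], q[1] - p[1]
    h = math.hypot(du, dv)
    if h == 0.0 or h > h_max:
        return False
    angle = math.degrees(math.atan2(dv, du))   # range -180 to 180
    diff = abs(angle - bearing_deg) % 180.0    # fold opposite directions together
    diff = min(diff, 180.0 - diff)             # angular distance to the axis, 0 to 90
    return diff <= window_deg


origin = (0.0, 0.0)
north_site = (0.0, 5.0)
east_site = (3.0, 0.5)

north_in_ns = in_direction(origin, north_site, 90.0, 45.0, h_max=100.0)
north_in_ew = in_direction(origin, north_site, 0.0, 45.0, h_max=100.0)
east_in_ew = in_direction(origin, east_site, 0.0, 45.0, h_max=100.0)
east_in_ns = in_direction(origin, east_site, 90.0, 45.0, h_max=100.0)
```

A pair lying exactly on a sector boundary would be counted in both orthogonal directions with this test; real software may resolve that case differently.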



The nugget is essentially the extrapolation of the sample variogram to a lag of zero. It
indicates the uncertainty of the variogram at lags that are smaller than the minimum
separation between any two data locations. The nugget can include measurement error variance,
and an estimate of this variance approximates a minimum value of the extrapolation.

The sill determines the maximum value of a variogram and approximates the variance of the
data. However, the points defining γ̂ take precedence over the sample variance in determining
the sill. Some variograms are unbounded, and others may only reach a sill value
asymptotically. A defined sill allows conversion of the variogram to a covariance function
using equation 3-27, which is generally done because computations in the kriging algorithms
are more efficiently performed using a covariance function.

Fitting a function to the sample variogram values can range from a visual fit to a
sophisticated statistical fit. A statistical fit is an objective method as long as the choice
of bins and the weighting of the sample variogram points remain fixed. However, because the
inputs vary, inherent subjectivity persists as in a visual fit. A final calibration of the
variogram parameters is based on the kriging algorithm; thus, either of the initial fitting
methods at this stage would suffice.

Because the initial part of the variogram has the most effect on subsequent kriging output, a
good estimate of the nugget value is the most important first step. The range and the sill, in
that order, complete the ranking of the effect of variogram parameters on the output of a
geostatistical analysis. Whatever the fitting method used, the theoretical variogram needs to
be supported by the sample variogram values. For variograms that have a range parameter, this
support could extend to the range. Journel and Huijbregts (1978) suggested that this support
would be through one-half the dimension of the field or essentially through one-half the
maximum lag distance of the sample data.

Most geostatistical studies can be successfully completed using one of the following four
singular theoretical variogram forms: exponential, spherical, Gaussian, and linear functions
(fig. 3); however, positive linear combinations of these forms also are acceptable as
theoretical variograms (see section 5.4.2). Geometric relations for obtaining parameters for
the four variogram forms are described in the following sections and are shown in figure 3.

5.5.1 Exponential Variogram

The exponential variogram (eq. 3-23) is specified by the nugget, g; sill, s; and a practical
range, r. The range is qualified as practical because the sill is reached only asymptotically.
From the nugget, the sample variogram points indicate a convex behavior that persists through
all lags, although to a much lesser degree at larger lag values. A nugget and a sill are first
specified based on the γ̂ points. The practical range is chosen so the value of the resulting
exponential function at the practical range lag is 95 percent of the sill. The specified
exponential function meshes with the sample variogram points at least through the practical
range lag. An initial estimate of the practical range can be made by noting that a line
tangent to the variogram at the nugget intersects the sill at a lag value equal to one-third
of the practical range, as shown in figure 3. Examples of the exponential variogram are
included in spatial studies of sulfate and total alkalinity in ground-water systems (Myers and
others, 1980).

5.5.2 Spherical Variogram

The spherical variogram parameters (eq. 3-24) are a nugget, g; a range, r; and a sill, s. At
small lag values, the sample variogram points indicate linear behavior from the nugget that
then becomes convex and reaches a sill at some finite lag (fig. 3). A sill is estimated, and a
line drawn through the points of the initial linear part of the variogram intersects the sill
at a lag value approximately equal to two-thirds of the range. With these estimates of the
parameters, a spherical variogram is defined that should be supported by the sample variogram
points. If the spherical variogram does not plot near the sample variogram points, adjustments
need to be made to the parameter estimates and the subsequent fit evaluated. Although the
spherical variogram model is one of the most often used models for real-valued spatial
studies, it also seems to be a predominant model for indicator values at various cutoff levels
as, for example, in a study of lead contamination (Journel, 1993).
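The four theoretical forms can be sketched as follows. Equations 3-23 through 3-26 are outside this section, so the expressions below are common textbook parameterizations chosen to be consistent with the geometric relations described for the exponential and spherical forms; they are assumptions and may be scaled differently from the report's equations. In particular, `sill` is taken here as the total sill (nugget included), and all function names are hypothetical.

```python
import math


def exponential(h, nugget, sill, practical_range):
    """Assumed form: reaches about 95 percent of the sill at the practical range."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / practical_range))


def spherical(h, nugget, sill, rng):
    """Assumed form: reaches the sill exactly at the range."""
    if h == 0:
        return 0.0
    if h >= rng:
        return sill
    x = h / rng
    return nugget + (sill - nugget) * (1.5 * x - 0.5 * x ** 3)


def gaussian(h, nugget, sill, practical_range):
    """Assumed form: flat near the origin, about 95 percent of sill at practical range."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * (h / practical_range) ** 2))


def linear(h, nugget, slope):
    """Unbounded form with no sill."""
    return 0.0 if h == 0 else nugget + slope * h


# Check the tangent-line relations numerically with a zero nugget.
s, r, eps = 100.0, 30.0, 1e-6
slope_exp = exponential(eps, 0.0, s, r) / eps   # initial slope, about 3*s/r
slope_sph = spherical(eps, 0.0, s, r) / eps     # initial slope, about 1.5*s/r
tangent_lag_exp = s / slope_exp                 # about r/3
tangent_lag_sph = s / slope_sph                 # about 2*r/3
```

With this assumed Gaussian form the value at one-half the practical range is about 53 percent of the sill, so a fit based on the report's own equation 3-25 should use that equation rather than this sketch.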

5.5.3 Gaussian Variogram

The Gaussian variogram parameters (eq. 3-25) are a nugget value, g, and a sill, s; and this
variogram also has a practical range, r. The Gaussian variogram is horizontal from the nugget,
becomes a concave upward function at small lags, inflects to concave downward, and
asymptotically approaches a sill (fig. 3). After a nugget and sill are specified based on the
γ̂ points, the variogram value at a lag of one-half the estimated practical range is two-thirds
of the sill. Again, this fitted variogram needs to be supported by the γ̂ points to a
reasonable degree. As is described in the example using the Saratoga data, the Gaussian
variogram often is used where the analyzed variable is spatially very continuous, such as a
ground-water potentiometric surface.

5.5.4 Linear Variogram

Parameters for a linear variogram (eq. 3-26) are a nugget value, g, and a slope, b. Sample
points that indicate a linear variogram increase linearly from the nugget and fail to reach a
sill even for large lags (fig. 3). Using the nugget as the intercept, the slope is computed
for the line passing through the γ̂ points. A pseudosill can be defined as the value of the
line at the greatest lag, h_max, between any two locations. This lag becomes the de facto
range, r, for a linear variogram. Examples of the use of the linear variogram are in
hydrogeochemical studies of specific conductance and in studies of trace elements, such as
barium and boron (Myers and others, 1980).

5.6 Additional Trend Considerations

If a drift in the data is indicated as in section 5.2, the theoretical variogram of residuals
that has been fitted thus far is used to update the drift equation. Although ordinary least
squares often suffices for computing a polynomial drift equation, drift determination is a
function of γ when the data are spatially correlated. But γ cannot be estimated until a drift
equation is obtained to yield the residuals. Therefore, obtaining a sample variogram and a
subsequent theoretical variogram from drift residuals of a specified drift form is an
iterative process (David, 1977, p. 273-274) using the following steps:

1. An initial variogram is specified and drift coefficients are computed to obtain residuals.
   For this step, a pure nugget (that is, a constant) variogram can be used to compute the
   initial estimates of the drift coefficients. These initial coefficients yield an ordinary
   least-squares estimate of the drift and a first-iteration sample variogram of residuals.

2. A theoretical variogram is fitted to the sample variogram of the residuals and is used to
   obtain updated drift coefficients.

3. The residuals from the drift that were obtained in step 2 are used to compute an updated
   sample variogram.

4. The sample variogram computed at the end of step 3 is compared to the sample variogram from
   step 2. If the two sample variograms compare favorably, then the theoretical variogram from
   step 2 is accepted as the variogram of residuals for subsequent kriging computations. If
   the sample variogram from step 3 differs markedly from the sample variogram from step 2,
   steps 2 through 4 are repeated using the sample variogram from the most recent step 3.

Generally, the plot of the points of γ̂ from a set of residuals initially increases with h,
reaches a maximum, and then decreases as shown in figure 7. This typical haystack-type
behavior, discussed by David (1977, p. 272-273), is attributed to a bias resulting from the
estimation error in the drift and its coefficients. This behavior in the variogram of the
residuals generally would more readily occur with a high degree of drift polynomial and need
not prohibit acceptable variogram determination because the initial points of the sample
variogram of residuals are still indicative of the theoretical variogram. For example, the lag
associated with the maximum of γ̂ of the residuals can be a good first approximation for the
range of the theoretical variogram.

5.7 Outlier Detection

Outliers in a data set can have a substantial adverse effect on γ̂. However, divergent data
can be screened for evaluation using a Hawkins statistic (Hawkins, 1980), which is described
in the context of kriging by Krige and Magri (1982). A neighborhood containing 4 to 10 data
points that are approximately normally distributed around each suspected outlier needs to be
defined. Despite potential outliers in the data set, a best guess initial theoretical
variogram also is needed.

The Hawkins statistic is obtained by comparing a suspect datum to the mean value of the 4 to
10 surrounding data, the smaller number being sufficient if the variability is low. The
spacing between these surrounding points is accounted for by the properties of the chosen
variogram. A statistic of 3.84 or higher would indicate an outlier on the basis of a
95-percent confidence interval. A large number of surrounding points has the direct effect of
increasing the magnitude of the statistic. Anomalous points are removed from the data set, and
the procedure described for obtaining the sample variogram is repeated for the smaller data
set. There were no outlier problems in the Saratoga data.

There is debate among geostatisticians regarding the merit of automated outlier-detection
procedures. A procedure such as that described is presented as an investigative tool with the
understanding that attendant justification and a Hawkins-type statistic need to be used to
ultimately decide if a data value is discarded as a true outlier or retained as a valid
measurement. In some situations, highly problematic data are removed for computation of the
sample variogram points, but are reinstated for kriging.

5.8 Cross Validation for Model Verification

The parameters of the theoretical variogram obtained from the initial fitting and refinement
of the sample variogram are calibrated using a kriging cross-validation technique. In this
technique, the fitted theoretical variogram is used in a kriging analysis in which data are
individually suppressed and estimates are made at the location using subsets of the remaining
data. As described in section 5.2, these subsets are the data points in a moving neighborhood
surrounding the location under consideration. The calibration estimate made at each data
location needs a matrix inversion, which could be very time consuming if all the data
locations were used to construct the matrices rather than just the data within a neighborhood
of a limited search radius.

After kriged values at all data locations have been estimated in the above manner, the data
are used with their kriged values and the kriging standard deviations to obtain
cross-validation statistics. A successful calibration is based on criteria for these
statistics, which are described in section 5.8.1. If the criteria cannot be reasonably met by
adjusting the parameters in the given theoretical variogram function, then the calibration
needs to be reinitialized using a different theoretical variogram model. In some data sets
that have a nonstationary spatial mean, the drift polynomial and the variogram may have to be
changed to achieve a satisfactory calibration.

5.8.1 Calibration Statistics

The kriging cross-validation error, e_i, which corresponds to measurement z(x_i), is defined
as

    e_i = z(x_i) - \hat{z}(x_i)                                                    (5-3)

where
    \hat{z}(x_i) is the kriged estimate.

The kriged estimate is obtained by ordinary kriging if the spatial mean is constant or by
universal kriging if the spatial mean is not stationary. A reasonable criterion for selecting
a theoretical variogram is to minimize the sum of squared errors, \sum e_i^2, with respect to
the variogram parameters. However, unlike ordinary least-squares regression, which also
minimizes the sum of squared errors, simply minimizing the squared errors is not sufficient
for kriging because the resulting model can yield inconsistent estimates of the kriging
variances, \sigma_k^2(x_i), at location x_i. This simple minimization would give unrealistic
measures of the accuracy of the kriging estimates. To guard against such bias, an expression
for the square of a reduced kriging error is defined:

    \tilde{e}_i^2 = \frac{e_i^2}{\sigma_k^2(x_i)}                                  (5-4)

where the kriging variances are computed using either equation 3-36 or equation 3-47. If the
kriging variance is a consistent estimate of the true mean-squared error of estimate, then the
squared reduced kriging errors have an average of about 1. Therefore, the standard
cross-validation technique for evaluating a theoretical variogram is:

    \min \left[ \frac{1}{n} \sum_{i=1}^{n} e_i^2 \right]^{1/2}
    subject to                                                                     (5-5)
    \left[ \frac{1}{n} \sum_{i=1}^{n} \tilde{e}_i^2 \right]^{1/2} \approx 1

The expression to be minimized is called the kriging root-mean-squared error and the
constraint is called the reduced root-mean-squared error. The reduced root-mean-squared error
needs to be well within the interval having endpoints

    1 + 2\sqrt{2/n}

and

    1 - 2\sqrt{2/n}

(Delhomme, 1978). An additional check on the goodness of the cross-validation results is the
unbiasedness condition where

    \frac{1}{n} \sum_{i=1}^{n} e_i \approx 0

As indicated in section 3.0, if probabilistic statements concerning an actual value of Z at an
unmeasured location are to be made compared to the kriged estimate and the kriging variance at
the location, the distribution of the cross-validation kriging errors needs to be analyzed. In
particular, the reduced errors, \tilde{e}_i, i = 1, 2, ..., n, need to be approximately
normally distributed with mean 0 and variance 1. A histogram or normal probability plot of the
reduced kriging errors can be used to assess the validity of assuming a standard normal
distribution for the reduced kriging errors. Additionally, if the distribution of reduced
kriging errors can be assumed to be standard normal, outliers not detected using the technique
discussed in section 5.7 may be detected by comparing the absolute values of the reduced
kriging errors to quantiles of the standard normal distribution.

Using the Saratoga data, a spherical variogram was fitted to the refined sample variogram of
the residuals. The estimated nugget was about 1.49 meters squared, the sill was 133.8 meters
squared, and the range was about 48 kilometers. Because of the difficulty in determining an
exact extrapolated value for the nugget, the value of 1.49 meters squared was selected based
on an estimated measurement error related to obtaining water levels at the well depths in the
Saratoga Valley.

After two iterations using drift residuals, as described in section 5.6, a final variogram was
chosen that had a nugget of 1.49 meters squared, a sill of 148.6 meters squared, and a range
of 44.8 kilometers (fig. 10). These parameters defined the theoretical variogram used to
obtain the cross-validation errors through universal kriging with an assumed linear drift. The
best combination of statistics that could be obtained after several attempts at refining the
model was a root-mean-squared error of 3.45 meters and a reduced root-mean-squared error of
0.5794. The reduced root-mean-squared error is too small, indicating that the kriging
variances produced by the model are relatively large compared to the actual squared errors.
This fact, coupled with the rather large root-mean-squared error, warranted additional
variogram refinements. In section 5.8.2, a Gaussian variogram was fitted to the data; the
Gaussian variogram produces much better cross-validation results than the results from the
spherical variogram.

5.8.2 Variogram-Parameter Adjustments

If any of the cross-validation statistics vary unacceptably from their suggested values, minor
adjustments to the variogram parameters can be made to attempt to improve the statistics.
Modifications made to the parameters should not have to be so severe that the variogram
function drastically deviates from the sample variogram points. If the support of the sample
variogram points is compromised to achieve acceptable cross-validation results with the given
drift-variogram model, a different drift-variogram combination needs to be investigated.

A reduced root-mean-squared error that is unacceptable may be improved by adjusting the range
parameter or the nugget value of the variogram. Modifying the range parameter needs to be
considered first, and any shifts in the nugget value need to be minimal and made only as a
final recourse. The calibration errors are relatively insensitive to minor adjustments of the
sill.



[Plot: sample variogram points and fitted spherical model versus lag, in kilometers (0 to 100).
+ Lag with greater than or equal to 30 pairs; x Lag with less than 30 pairs.
Spherical model fitting parameters: Nugget = 1.49 meters squared; Sill = 148.64 meters squared;
Range = 44.80 kilometers]

Figure 10. Sample variogram points and theoretical spherical fit for iterated Saratoga residuals.
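The calibration statistics of section 5.8.1 can be sketched as follows. The sketch assumes the cross-validation error is the measurement minus the kriged estimate and that the kriging variances come from a separate kriging step; the function name, dictionary keys, and four sample values are hypothetical and are not Saratoga results.

```python
import math


def cross_validation_stats(measured, kriged, kriging_variances):
    """Summarize leave-one-out kriging results.

    Returns the root-mean-squared error, the reduced root-mean-squared error,
    the mean error, and the interval 1 +/- 2*sqrt(2/n) for the reduced statistic.
    """
    n = len(measured)
    errors = [z - z_hat for z, z_hat in zip(measured, kriged)]
    reduced_sq = [e * e / var for e, var in zip(errors, kriging_variances)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    reduced_rmse = math.sqrt(sum(reduced_sq) / n)
    half_width = 2.0 * math.sqrt(2.0 / n)
    lower, upper = 1.0 - half_width, 1.0 + half_width
    return {
        "rmse": rmse,
        "reduced_rmse": reduced_rmse,
        "mean_error": sum(errors) / n,
        "interval": (lower, upper),
        "within_interval": lower < reduced_rmse < upper,
    }


# Hypothetical cross-validation output at four data locations.
stats = cross_validation_stats(
    measured=[10.0, 12.0, 9.0, 11.0],
    kriged=[9.0, 13.0, 9.0, 12.0],
    kriging_variances=[1.0, 1.0, 4.0, 1.0],
)
```

A reduced statistic well below 1, as in the spherical fit described above, suggests the model's kriging variances are too large relative to the observed squared errors.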

If the reduced root-mean-squared error is too small, as in the Saratoga example, extending the
range (equivalent to decreasing the slope for a linear variogram) decreases the kriging
variance and, thus, increases the reduced root-mean-squared error. If a shift in the nugget is
needed, a decrease in the nugget decreases the kriging variance. If the reduced
root-mean-squared error is too large, then a contraction of the range or a positive shift in
the nugget can be made, based on the priority and the extent of the changes. Generally,
changes in these parameters also have an effect on the mean-squared error. The larger the
nugget is as a percentage of the sill, the larger the mean-squared error is. In general,
improvements in one statistic are usually made at the expense of the other statistics. The
optimization of the statistics as a set is, in effect, a trial-and-error procedure that is
operationally convergent.

The reduced kriging errors may not approximate a standard normal distribution. If so, a
transformation of the data may be needed to achieve a more normal distribution, and the
variogram estimation procedure would be repeated.

Because no convergence could be reached for parameter values of a spherical variogram for the
Saratoga data, a Gaussian theoretical variogram was fitted to the sample variogram of
residuals in figure 8. This choice was made because the initial sample variogram points seemed
to have a slight upward concavity, but eventually reached a sill. This behavior can be
attributed to correlation rather than to further drift. After an iterated cross validation
using the Gaussian parameters, a Gaussian variogram that had a nugget of 1.49 meters squared,
a sill of 185.81 meters squared, and a range of 27.52 kilometers (fig. 11) yielded a
root-mean-squared error of 2.33 meters and a reduced root-mean-squared error of 1.083. The
mean cross-validation error was 0.0195 meter. These values represented an improvement over the
spherical variogram and were deemed acceptable for the Gaussian variogram.

A probability plot of the reduced kriging errors using the final Gaussian variogram is shown
in figure 12. The plot is reasonably linear within two standard deviations and, thus,
approximates a standard-normal-distribution function. A plot in figure 13 of the measured data
versus their kriged estimates indicates that the linear drift/Gaussian variogram model
selected for the Saratoga data would produce accurate estimates of ground-water elevations for
interpolation or contour gridding in the region.

[Plot: sample variogram points and fitted Gaussian model versus lag, in kilometers (0 to 100).
+ Lag with greater than or equal to 30 pairs; x Lag with less than 30 pairs.
Gaussian model fitting parameters: Nugget = 1.49 meters squared; Sill = 185.81 meters squared;
Range = 27.52 kilometers]

Figure 11. Sample variogram points and theoretical Gaussian fit for iterated Saratoga residuals.

[Plot: normal probability plot; horizontal axis, reduced kriging errors; both axes range from
-3.5 to 3.5]

Figure 12. Cross-validation probability plot for Saratoga data.

6.0 PRACTICAL ASPECTS OF GEOSTATISTICS IN HAZARDOUS-, TOXIC-, AND RADIOACTIVE-WASTE-SITE
INVESTIGATIONS

In this section, several example applications are described. The applications have been
developed using hydrologic, geologic, and contaminant data from established and well-studied
hazardous-waste sites. The real nature of the data enables discussion of some problems that
can occur during HTRW-site investigations that originate, not only from natural field
conditions, but also from typical problems that are associated with the types of data
involved. In addition, the real nature of the example data provides an opportunity for
comparison between kriging estimates and the real data; these comparisons are brief and
general. This report does not provide comprehensive analyses of data that are available in
other more elaborate studies.

The principal intent of the examples is to provide systematic descriptions for a few of the
large number of possible applications that may be used during HTRW-site investigations. The
examples are not intended to provide guidance for comprehensive analysis of the included data.
However, this report presents some fundamental problems that can occur in geostatistical
applications and, in some examples, indicates some possible alternatives.

With each example, a purpose is established and a general environmental setting is described.
Most aspects of variogram construction and calibration are briefly described and are shown in
figures and listed in tables. A comprehensive treatment of variogram construction has been
presented in section 5.0.

The GEO-EAS software has been used whenever the example data did not require universal
kriging; for examples that required universal kriging, STATPAC was used. As indicated in
section 4.0, both of these software packages run on the DOS platform (table 1), which is
probably most convenient to readers. The results of kriging estimates are portrayed by
gray-scale maps rather than by contours because of the objective nature of the gray-scale
format. North is at the top of all maps presented although this orientation may represent some
deviation from the real data.

6.1 Ground-Water-Level Examples

The principal purpose of the ground-water-level examples is to familiarize the reader with a
kriging exercise using ground-water levels and to indicate simply how kriging standard
deviations may be useful in evaluating monitoring networks. The data are from

a water-table setting in unconsolidated sediments where the local relief for the land surface is about 30 meters. The data involved in this example are considered virtually free of actual measurement error.

The location of measured water levels is shown in figure 14A, and the basic univariate statistics for this data set are listed in table 2 (water level A); modifications to the measured data, in the form of removal and addition of measured values, are shown in figures 14B and C. The techniques described in section 5.0 were used to guide the following steps for variogram construction:

1. A raw variogram analysis and basic hydrologic knowledge of water-level behavior indicated that universal kriging would be needed for this analysis.
2. To obtain a stable variogram of residuals, an iterative, generalized least-squares operation was initially used to remove prominent linear drift of the form a + bu + cv, observed in the measured water levels, where a, b, and c are constants determined in the iterative process.
3. After drift was removed, residuals were determined to be stationary and universal kriging with a linear drift was appropriate.
4. A Gaussian model was used to fit the stabilized variogram of residuals (fig. 15A), which has a nugget of 0.09 meter squared, a sill of 2.69 meters squared, and a range of 1,219 meters (table 3).

Cross validation was performed, and the results are shown in figures 15B and C and listed in table 3. The cross-validation statistics conform to the criteria discussed in section 5.0.

Linear drift is commonly observed in ground-water-elevation data where there are no major anthropogenic activities, such as large ground-water withdrawals.
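The drift-removal step in the list above can be illustrated with a minimal Python sketch. It fits a drift of the form a + bu + cv by ordinary least squares and returns the residuals whose variogram would then be modeled. This is a simplification: the report describes an iterative, generalized least-squares operation, and the helper name `fit_linear_drift` and the synthetic coordinates are assumptions made only for illustration.

```python
import numpy as np

def fit_linear_drift(u, v, z):
    """Ordinary least-squares fit of z ~ a + b*u + c*v (hypothetical helper).

    Returns the coefficients (a, b, c) and the residuals z - drift.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    z = np.asarray(z, dtype=float)
    design = np.column_stack([np.ones_like(u), u, v])  # columns: 1, u, v
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    residuals = z - design @ coef
    return coef, residuals

# Synthetic water levels (not data from the report): a plane plus noise
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1500.0, 40)
v = rng.uniform(0.0, 1500.0, 40)
z = 10.0 + 0.004 * u - 0.002 * v + rng.normal(0.0, 0.3, 40)

coef, residuals = fit_linear_drift(u, v, z)
```

The residuals, rather than the raw water levels, are the values from which a stable sample variogram would be computed.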

Figure 13. Scatterplot of measured versus kriging estimates from cross validation of Saratoga data. [Horizontal axis: measured value, in meters.]
Figure 14. Location of measured data for ground-water-level examples: A, original data; B, original data without dropped sites; and C, original data with added sites (added sites indicated with +). [Map axes: map distance, in meters; index to plotted values, in meters.]

Under these circumstances, there is usually a fairly uniform and general ground-water movement along a flow path. This uniform and general nature introduces a nonstationary element to the data that, in geostatistics, is referred to as drift. As indicated in section 5.0, the presence of drift is indicated by a parabolic variogram shape. In this example, the initial variogram in the raw variogram analysis had a characteristic parabolic shape, and a linear drift was identified. Once the drift was identified and characterized, universal kriging procedures were used.

A Gaussian model is usually appropriate for variograms of highly continuous variables, such as ground-water-elevation data, and this model is particularly appropriate in this example. The variogram (fig. 15A) at small lags beyond the nugget has an upward concavity that cannot be fit with a linear, spherical, or exponential model. The observed shape was interpreted as a function of continuous small-scale variability. The Gaussian model fits the bowl shape of the small lag data and other data well to a lag of about 610 meters, but it is not flexible enough to closely fit the points much beyond 610 meters, indicating that kriging estimates should be computed using neighborhoods with a search radius less than 610 meters. In section 5.0, the initial part of the variogram was described as having the most effect on subsequent kriging estimates.

The established variogram then was used with the measured data to produce universal kriging estimates for all points in a 26-by-26 grid that had a grid size of about 61-by-61 meters. A gray-scale map of the kriging water levels is shown in figure 16A and basic univariate kriging estimate statistics are listed in table 4 (water level A). The kriging results are a good representation of the results from other more elaborate studies.
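A minimal sketch of how a universal kriging estimate with linear drift might be computed at a single grid node is given here in Python. It is not the STATPAC implementation. The function names, the synthetic measurement locations, and the Gaussian-model convention (total sill, with the range treated as a practical range through a factor of 3) are assumptions; only the nugget, sill, and range values are taken from the water-level example.

```python
import numpy as np

def gaussian_variogram(h, nugget, sill, a):
    """Gaussian model with gamma(0) = 0 (assumed convention, see lead-in)."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * (h / a) ** 2))
    return np.where(h > 0.0, gamma, 0.0)

def universal_krige(xy, z, x0, variogram):
    """Universal kriging with drift terms 1, u, v at one location x0.

    Returns (estimate, kriging variance).
    """
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    n = len(xy)
    # Distances among measurement locations and from each location to x0
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    d0 = np.linalg.norm(xy - x0, axis=1)
    F = np.column_stack([np.ones(n), xy])   # drift functions at the data
    f0 = np.concatenate([[1.0], x0])        # drift functions at x0
    # Kriging system: [Gamma F; F^T 0] [weights; mu] = [gamma0; f0]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = variogram(d)
    A[:n, n:] = F
    A[n:, :n] = F.T
    gamma0 = variogram(d0)
    sol = np.linalg.solve(A, np.concatenate([gamma0, f0]))
    weights, mu = sol[:n], sol[n:]
    estimate = weights @ z
    variance = weights @ gamma0 + mu @ f0
    return estimate, variance

# Synthetic locations and an exactly planar surface (not report data)
rng = np.random.default_rng(1)
xy = rng.uniform(0.0, 1500.0, size=(12, 2))
z = 10.0 + 0.004 * xy[:, 0] - 0.002 * xy[:, 1]
model = lambda h: gaussian_variogram(h, nugget=0.09, sill=2.69, a=1219.0)

est, var = universal_krige(xy, z, [750.0, 750.0], model)
_, var_zero = universal_krige(xy, np.zeros(12), [750.0, 750.0], model)
```

In this sketch the kriging variance is computed from the weights, the variogram, and the drift terms only, so it is the same whether the measured values or zeros are supplied at the measurement locations.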

The kriging standard deviations for the kriging estimates are shown in figure 16B. The magnitude of kriging standard deviations can provide investigators with a direct indication of where the uncertainty associated with kriging estimates is relatively high or low. The areas of the greatest uncertainty for the kriged water levels are in the upper right and the lower left corners of figure 16B, where standard deviations are as high as about 1.4 and 0.8 meters. These areas are where the density of the measured data is relatively low. Throughout much of the remainder (about 70 percent) of figure 16B, the kriging standard deviation is almost constant at about 0.35 meter.

To use the kriging standard-deviation values more quantitatively, some assurance is needed that the measured data and the reduced kriging errors are approximately normally distributed and that the assumption of stationary residuals after drift removal is correct. If these assumptions are valid, then the basic statistical principles involving confidence intervals can be applied. In this example, the kriging standard deviation of 0.35 meter throughout most of the map indicates that there is a 95-percent chance that the true value at a location where there is a kriging estimate is within 0.76 meter (twice the kriging standard deviation) of the kriging estimate.

As an example of evaluating network density and the accuracy of kriging estimates, two new maps were developed. To compile the first map, a decrease in network density was effected by removing nine measured locations from the northwest part of the area (fig. 14B) where sampling density was high and kriging standard deviations were low. Kriging estimates were produced for the same grid and the basic univariate kriging estimate statistics are listed in table 4 (water level B). The map shown in figure 16C indicates that the ratio of the original kriging standard deviations and the kriging standard deviations with the nine measured locations removed is always very close to 1.00, which indicates that there is very little difference between the two sets of kriging standard deviations and that water levels are oversampled in the area where the nine measured locations were removed.

To produce the third map (fig. 14C), nine locations were added in the southwest corner where the sampling density was relatively low and the kriging standard deviation was relatively high. In section 3.3, equation 3-47 indicates that the universal kriging variance depends on the variogram, the type of trend, and the measurement locations; in this respect, the kriging standard deviation does not depend on the values at the measurement locations. Consequently, values of zero were used for the nine new measurement locations and only the resultant map of kriging standard deviations (fig. 16D) is of interest. The map shows that the kriging standard deviations in the lower left corner, which formerly had values of about 0.8 meter, have been decreased by a factor of about 0.25, which indicates that the kriging estimates, based on the geometry of the network, are more reliable.

6.2 Bedrock-Elevation Examples

The following examples are for bedrock elevations. The principal purposes of the examples are to familiarize the reader with a kriging exercise using bedrock elevations and to describe block kriging. The data are from an area where bedrock consists of a series of intercalated terrestrial deposits that have been weathered somewhat and then covered with alluvium. Measurement error in these data is inevitable because the determination of just where bedrock begins is complicated and subjective.

The set of measured locations, set A, is shown in figure 17A, and the basic univariate statistics are listed in table 2 (bedrock A); modifications to the measured data, such as removal of sites, are shown in figure 17B. The techniques described in section 5.0 were used to guide the following steps for variogram construction:

1. The raw variogram indicated a stationary spatial mean. The data were assumed to be suitable for ordinary kriging.
2. An isotropic Gaussian model was used to fit the variogram, which had a nugget of 0.65 meter squared, a sill of 12.54 meters squared, and a range of 914 meters (table 3, bedrock A).
3. Cross validation was performed, and the results (table 3, bedrock A) were not acceptable.
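The cross-validation statistics used to judge a variogram (average kriging error, kriging root-mean-squared error, and reduced root-mean-squared error, as listed in table 3) can be sketched in Python as follows. The function name, the sign convention for the error (estimate minus measured value), and the small made-up input arrays are assumptions for illustration; section 5.0 remains the reference for the definitions and acceptance criteria.

```python
import numpy as np

def cross_validation_stats(measured, estimated, kriging_sd):
    """Summary statistics from leave-one-out cross validation.

    measured, estimated: values at the measurement locations
    kriging_sd: kriging standard deviation of each cross-validation estimate
    """
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    kriging_sd = np.asarray(kriging_sd, dtype=float)
    errors = estimated - measured      # assumed sign convention
    reduced = errors / kriging_sd      # reduced (standardized) kriging errors
    return {
        "average_kriging_error": errors.mean(),
        "kriging_rmse": np.sqrt(np.mean(errors ** 2)),
        "reduced_rmse": np.sqrt(np.mean(reduced ** 2)),
    }

# Made-up values: errors of 1 meter with a claimed standard deviation of 0.5
stats = cross_validation_stats(
    measured=[10.0, 12.0, 14.0, 16.0],
    estimated=[11.0, 11.0, 15.0, 15.0],
    kriging_sd=[0.5, 0.5, 0.5, 0.5],
)
```

A reduced root-mean-squared error well above 1, as in this made-up case, means the actual errors are larger than the kriging standard deviations imply, which is the sense in which the kriging variance is underestimated.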

Figure 15. Variogram and variogram cross-validation plots for residuals in water-level examples: A, theoretical variogram; B, cross-validation scatterplot; and C, cross-validation probability plot. [A: Gaussian model fitting parameters are nugget = 0.09 meter squared, sill = 2.69 meters squared, and range = 1,219 meters; horizontal axis is lag, in meters. B: horizontal axis is measured value, in meters. C: horizontal axis is reduced kriging errors.]
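The upward concavity at small lags that distinguishes the Gaussian model from the spherical and exponential models can be checked numerically. The Python sketch below evaluates the three models with the water-level parameters from figure 15A and compares a second difference at small lags. The model formulas use a common practical-range convention (factor of 3 for the Gaussian and exponential models), which is an assumption and may differ from the parameterization in the software used for the report; the function names are illustrative.

```python
import numpy as np

# Each model is evaluated only for lags greater than zero, so the
# jump from gamma(0) = 0 to the nugget is not handled here.

def gaussian(h, c0, sill, a):
    return c0 + (sill - c0) * (1.0 - np.exp(-3.0 * (h / a) ** 2))

def exponential(h, c0, sill, a):
    return c0 + (sill - c0) * (1.0 - np.exp(-3.0 * h / a))

def spherical(h, c0, sill, a):
    r = np.minimum(h / a, 1.0)
    return c0 + (sill - c0) * (1.5 * r - 0.5 * r ** 3)

def second_difference(model, lags, **params):
    """gamma(h1) - 2*gamma(h2) + gamma(h3) for three equally spaced lags.

    Positive values indicate upward concavity over the interval.
    """
    g = model(np.asarray(lags, dtype=float), **params)
    return g[0] - 2.0 * g[1] + g[2]

params = dict(c0=0.09, sill=2.69, a=1219.0)  # nugget, sill, range (fig. 15A)
lags = [100.0, 200.0, 300.0]                 # small lags, in meters
curvature = {m.__name__: second_difference(m, lags, **params)
             for m in (gaussian, exponential, spherical)}
```

Under these assumptions only the Gaussian model gives a positive second difference at small lags, which is consistent with the bowl shape described for the sample variogram.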

Table 3. Variogram characteristics and cross-validation statistics

[Note: NA, not applicable; base unit for water levels and bedrock is meters; base unit for water quality A is log concentration, concentration in micrograms per liter; base unit for water quality B is indicator units]

                 |------------------- Variogram characteristic -------------------|  |- Cross-validation statistics -|
Example          Transformation  Direction/  Model        Nugget    Sill      Range     Average   Kriging   Reduced
identifier                       tolerance                (base     (base     (meters)  kriging   root-     root-
                                                          units     units               error     mean-     mean-
                                                          squared)  squared)            (base     squared   squared
                                                                                        units)    error     error
                                                                                                  (base     (dimen-
                                                                                                  units)    sionless)
Water levels     Drift           0/NA        Gaussian     0.09      2.69      1,219     -0.0006   0.37      1.083
Bedrock A        None            0/NA        Gaussian     0.65      12.54     914       0.045     2.53      2.146
Bedrock B        None            0/NA        Gaussian     0.74      8.36      732       -0.010    1.34      1.192
Water quality A  Natural log     150/45      Exponential  1.00      3.20      1,295     0.105     1.54      0.938
Water quality A  Natural log     240/45      Exponential  1.00      3.20      228       0.105     1.54      0.938
Water quality B  Indicator       150/45      Spherical    0.05      0.25      610       NA        NA        NA
Water quality B  Indicator       240/45      Spherical    0.05      0.25      213       NA        NA        NA

Figure 16. Kriging results for ground-water-level examples: A, kriging estimates for original data; B, kriging standard deviations for original data; C, ratio (original data to original with dropped sites) of kriging standard deviations; and D, kriging standard deviations for original data with added sites. [Map axes: map distance, in meters; index to plotted values, in meters (A, B, and D) and dimensionless (C).]
Table 4. Univariate statistics for gridded kriging estimates in example applications

[Note: Base unit for water level A and B and bedrock B and C is meters; base unit for water quality A is log concentration, concentration in micrograms per liter]

Example          Transformation  Minimum  Maximum  Mean    Median  Standard   Skewness
identifier                       (base    (base    (base   (base   deviation  (dimen-
                                 units)   units)   units)  units)  (base      sionless)
                                                                   units)
Water level A    Drift           7.42     19.8     14.0    13.6    3.09       0.11
Water level B    Drift           7.49     19.8     14.0    13.5    3.09       0.11
Bedrock B        None            7.96     19.8     12.6    12.1    2.35       0.82
Bedrock C        None            8.14     19.8     12.6    12.1    2.33       0.82
Water quality A  Natural log     2.92     7.07     5.17    5.03    0.72       -0.06

The cross-validation exercise produced a reduced root-mean-squared error of 2.146 (table 3, bedrock A), which indicates that the kriging variance is underestimated. Further attempts to fit the Gaussian model to the sample variogram points produced better cross-validation statistics; however, the Gaussian curve began to depart substantially from the sample variogram points at the low lag sample points. As a result, the distribution of the residuals was examined, and the eastern, and especially northeastern, parts of the area were determined to contain problematic data that rendered the distribution nonhomogeneous. The nonhomogeneous nature was related to an incised channel present on the bedrock surface. Therefore, the measured data were restricted to exclude the outlying measurements. Before the restriction, two alternative techniques for dealing with the outlying measurements were considered and deemed beyond the scope of this effort. However, a brief discussion of the alternatives is appropriate.

The first alternative was to fit a contrived and nongradual surface to the measured data to remove the outlier effect. A splined surface might be capable of producing the desired result. The decision whether or not to pursue such an alternative becomes somewhat philosophical. In a relatively simple example, as in this bedrock example, such an alternative may be entirely appropriate; however, this alternative may actually involve two unique and homogeneous domains. Therefore, the second alternative, distributing the kriging process so that each homogeneous domain is addressed independently, becomes more attractive. In more complicated applications where a large number of domains are present, a distributed approach may be necessary to avoid an undue amount of compromise.

The restriction of measured data, set B, is shown in figure 17B, and the basic univariate statistics are listed in table 2 (bedrock B). The restriction exercise resulted in removing 18 measured locations and in the truncation of the northeastern part of the area so that the area became polygonal rather than rectangular. The techniques described in section 5.0 were used to guide the following steps for variogram construction:

1. A Gaussian model was used to fit the variogram, which had a nugget of 0.65 meter squared, a sill of 8.36 meters squared, and a range of 732 meters. The variogram indicated a stationary spatial mean.
2. Initial cross validation was performed, and the nugget was changed from 0.65 meter squared to 0.74 meter squared to improve cross-validation statistics. The final variogram is shown in figure 18A, and the characteristics are listed in table 3.
3. Final cross validation was performed, and the results, shown in figures 18B and C and listed in table 3 (bedrock B), were acceptable.

Figure 17. Location of measured data for bedrock-elevation examples: A, original data and B, restricted data. [Map axes: map distance, in meters; index to plotted values, in meters.]

Figure 18. Variogram and variogram cross-validation plots for bedrock-elevation examples: A, theoretical variogram; B, cross-validation scatterplot; and C, cross-validation probability plot. [A: Gaussian model fitting parameters are nugget = 0.74 meter squared, sill = 8.36 meters squared, and range = 732 meters; horizontal axis is lag, in meters. B: horizontal axis is measured value, in meters. C: horizontal axis is reduced kriging errors.]

The large difference between the sill defined for the initial data set and the sill for the restricted data set [12.54 meters squared and 8.36 meters squared (table 3)] supports the hypothesis that the original data set is actually two different domains. The final variogram then was used, along with the measured data, to produce ordinary kriging estimates for all points in a 52-by-52 grid that had a spacing of about 30-by-30 meters, which was truncated along the northeastern border because of the restriction operation. For the kriging procedure, a search radius of about 914 meters, with a maximum of 16 and a minimum of 8 surrounding locations, was specified. It is not uncommon to specify a search radius that is greater than the variogram range; this practice helps ensure that, in this case, between 8 and 16 points would be obtained to develop the kriging estimate. Gray-scale maps of the kriging estimates and kriging standard deviations are shown in figures 19A and B, respectively, and the univariate kriging estimate statistics are listed in table 4 (bedrock B). The kriging results indicate channel-like features in the bedrock surface and a prominent bedrock high at the south border of the area; the results are a good representation of the results from other more elaborate studies.

For an example of block kriging, an investigative goal of establishing block values of bedrock elevation for a finite-difference ground-water-model grid having about 120-by-120-meter cells was assumed. The same variogram and search criteria were used to estimate block values for a 13-by-13 grid that had about 120-by-120-meter spacing; a 4-by-4 block was specified. Each kriging value shown in figure 19C is an estimate of the average value of bedrock elevation throughout the about 120-by-120-meter block. The standard deviation for the block estimates is less than the standard deviation for the point estimates (table 4). Gray-scale maps of the kriging estimates and of the kriging standard deviations are shown in figures 19C and D, and the univariate kriging estimate statistics are listed in table 4 (bedrock C).
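One ingredient of block kriging, the average variogram between a measurement location and a block, can be sketched in Python. The sketch assumes that the "4-by-4 block" refers to a 4-by-4 array of discretization points inside each block of about 120 by 120 meters, which is a common way for kriging software to approximate block averages but is not confirmed by the text. The function names are hypothetical, the Gaussian-model convention (total sill, factor of 3) is an assumption, and the parameters are those of the bedrock B variogram.

```python
import numpy as np

def gaussian_variogram(h, nugget=0.74, sill=8.36, a=732.0):
    """Gaussian model with gamma(0) = 0 (assumed convention)."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * (h / a) ** 2))
    return np.where(h > 0.0, gamma, 0.0)

def block_points(center, size=120.0, n=4):
    """Centers of an n-by-n discretization of a square block."""
    offsets = ((np.arange(n) + 0.5) / n - 0.5) * size
    gx, gy = np.meshgrid(offsets, offsets)
    return np.column_stack([gx.ravel(), gy.ravel()]) + np.asarray(center, dtype=float)

def point_to_block_gamma(xy, center, size=120.0, n=4):
    """Average variogram between each measurement location and the block."""
    pts = block_points(center, size, n)
    xy = np.asarray(xy, dtype=float)
    d = np.linalg.norm(xy[:, None, :] - pts[None, :, :], axis=2)
    return gaussian_variogram(d).mean(axis=1)

# Two hypothetical locations: one at the block center, one far away
pts = block_points([600.0, 600.0])
gbar = point_to_block_gamma([[600.0, 600.0], [600.0, 5500.0]], [600.0, 600.0])
```

In a block kriging system these averaged values would replace the point-to-point variogram values on the right-hand side of the kriging equations, which is why block estimates describe an average over the cell and tend to have smaller standard deviations than point estimates.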

Figure 19. Kriging results for bedrock-elevation examples: A, kriging estimates; B, kriging standard deviations; C, block kriging results; and D, block kriging standard deviations. [Map axes: map distance, in meters; index to plotted values, in meters.]
6.3 Ground-Water-Quality Examples

The following examples are for ground-water-quality information consisting of concentrations determined for a contaminant. The principal purposes of the examples are to familiarize the reader with a kriging exercise using ground-water-quality information and to illustrate indicator kriging. The examples also are intended to familiarize the reader with data that are strongly anisotropic and need transformation. The data are from a water-table aquifer developed in alluvial sediments where the depth to water was less than about 23 meters. Several analytical laboratories were involved in measuring the concentration of the contaminant in the water-quality examples. Each of the analytical laboratories had to follow rather comprehensive guidelines that specified tests of instrument performance before sample determinations were made, as well as measurement of extraction efficiencies. Because of these performance guidelines, the potential for instrument error was considered to be either known or relatively low. In addition to the use of performance guidelines, field quality-assurance samples were collected. These samples can be used to evaluate other possible errors, such as cross contamination and representativeness of the sample. Duplicate samples for the contaminant in the water-quality examples indicate as much as about 15-percent variability in reported results. This variability is not entirely unusual and is most likely related to the integrity of the analytical method or the method for aggregating the sample media during sample collection.

The set of measured locations is shown in figure 20 and the basic univariate statistics are listed in table 2 (water quality A). An initial review of the data indicated three important features:

1. The data seemed to have strong anisotropy at about 150 degrees counterclockwise from the east-west base line.
2. The data required a natural logarithmic (log) transformation so the distribution was approximated by a normal distribution.

Figure 20. Location of measured data for ground-water-quality examples. [Map axes: map distance, in meters; index to plotted values, in micrograms per liter.]
3. No trends were indicated during preliminary exploration, and ordinary kriging was tentatively selected as the appropriate technique.

Natural log transformations are routinely needed for concentration data that vary over several orders of magnitude, which is common in areas of contaminant plumes. The data were transformed to log space and fit acceptable criteria for normality. After transformation to log space, the techniques described in section 5.0 were used to guide the following steps for variogram construction:

1. An exponential model was used to fit a directional variogram at an angle of 150 degrees counterclockwise from the east-west base line. The variogram had a nugget of 1.00 log concentration squared, a sill of 3.20 log concentration squared, and a range of 1,295 meters [fig. 21A and table 3 (water quality A)].
2. An exponential model also was fit to a directional variogram at an angle of 240 degrees counterclockwise from the east-west base line. The variogram had a nugget of 1.00 log concentration squared, a sill of 3.20 log concentration squared, and a range of 228 meters [fig. 21B and table 3 (water quality A)].
3. Cross validation was performed using the geometric anisotropy of the two variograms and the results [figs. 21C and D, and table 3 (water quality A)] were acceptable.

The residuals are symmetrically distributed (fig. 21D). However, the scatterplot (fig. 21C) indicates that small concentrations were overestimated and that large concentrations were underestimated. This discrepancy in the estimates does not indicate an error in the model, but rather, indicates a consequence of data that have a large nugget compared to the sill; in this example, the nugget is approximately 30 percent of the sill. The large nugget decreases the predictive capacity of the model and increases the smoothing introduced by kriging.

The established variogram then was used, along with the measured locations, to produce ordinary kriging estimates for all points in a 40-by-20 grid using a grid spacing of about 91-by-91 meters. For the kriging procedure, a search radius of about 1,524 meters with a maximum of 16 and a minimum of 8 locations was specified. Gray-scale maps of kriging estimates, in back-transformed and log-space concentrations, as well as the kriging standard deviations in log space, are shown in figures 22A, B, and C.

The back-transformation procedure was a simple exponentiation of the log-space kriging estimates. Such a back-transformation does not use bias-correction factors to deal with moment bias; consequently, the back-transformed values need to be interpreted as median values rather than average, or mean, values. A simple back-transformation, however, is convenient and was performed, principally, to enhance visual interpretation of the kriging estimates. Univariate statistics for the log-space kriging estimates are listed in table 4 (water quality A). The kriging results do have noticeable smoothing; however, they also indicate a plume emanating from a location just northwest of the center of the area and indicate movement and some dispersion to the southeast; the estimates are a very good representation of the results from other more elaborate studies.

Additionally, to indicate the effect of the log transform on probabilities in converting, or back-transforming, kriging estimates, the kriging estimates and the kriging standard deviations, in log space, were used to estimate the one-sided 95th percentile at each kriging-estimate location according to the formula:

$$C_{0.95} = \exp\left[\hat{Z}(x_0) + 1.645\,\sigma_K(x_0)\right]$$

where
$\hat{Z}(x_0)$ is the kriging estimate at location $x_0$, in log space; and
$\sigma_K(x_0)$ is the corresponding kriging standard deviation in log space.

The resulting map is shown in figure 22D and can be used to indicate areas where the true concentration has only a 5-percent chance of exceeding the value indicated.

To perform indicator kriging, the indicator transformation, as described in section 3.0, was applied. An indicator cutoff equal to the median value of 270 micrograms per liter for the untransformed measured data was selected. The model for indicator kriging estimates the probability that the concentration would be less than the indicator cutoff.
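The back-transformation and the indicator transformation described above can be sketched in Python. The sketch assumes normally distributed kriging errors in log space, so that the one-sided 95th percentile uses the standard-normal quantile 1.645. The function names and input values are illustrative, and the choice of "less than or equal to" at the cutoff is an assumption; section 3.0 defines the indicator transformation used in the report.

```python
import numpy as np

Z95 = 1.645  # standard-normal quantile for a one-sided 95th percentile

def back_transform(log_estimate, log_sd):
    """Median and one-sided 95th percentile of concentration.

    log_estimate: kriging estimate in natural-log space
    log_sd: kriging standard deviation in natural-log space
    """
    log_estimate = np.asarray(log_estimate, dtype=float)
    log_sd = np.asarray(log_sd, dtype=float)
    median = np.exp(log_estimate)            # simple exponentiation, no bias correction
    p95 = np.exp(log_estimate + Z95 * log_sd)
    return median, p95

def indicator(concentration, cutoff=270.0):
    """1 where the concentration does not exceed the cutoff, else 0."""
    return (np.asarray(concentration, dtype=float) <= cutoff).astype(float)

# Illustrative values only (not grid values from the report)
median, p95 = back_transform([5.0, 6.0], [1.2, 0.9])
flags = indicator([100.0, 270.0, 500.0])
```

Because the exponential is monotonic, percentiles computed in log space carry over directly to concentration, which is why the back-transformed estimate is a median rather than a mean.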

Figure 21. Directional variograms and variogram cross-validation plots for ground-water-quality examples: A, theoretical major-direction variogram (southeast); B, theoretical minor-direction variogram (northeast); C, cross-validation scatterplot; and D, cross-validation probability plot. [A: exponential model fitting parameters are nugget = 1.00 squared log concentration, sill = 3.20 squared log concentration, and range = 1,295 meters; horizontal axis is lag, in meters. B: exponential model fitting parameters are nugget = 1.00 squared log concentration, sill = 3.20 squared log concentration, and range = 229 meters; horizontal axis is lag, in meters. C: horizontal axis is measured value, in log concentration, with a line of agreement. D: horizontal axis is reduced kriging errors.]

6.0 PRACTICAL ASPECTS OF GEOSTATISTICS IN HAZARDOUS-, TOXIC-, 57


AND RADIOACTIVE-WASTE-SITE INVESTIGATIONS
[Figure 22 panels (gray-scale maps not reproduced). Each panel has a north arrow and axes of map distance, in meters. Legend values recovered from the plots: A, 366 to 1,179 micrograms per liter; B, 3.34 to 7.08 log concentration (concentration in micrograms per liter); C, 1.29 to 2.00 log concentration (concentration in micrograms per liter); D, 155 to 13,724 micrograms per liter.]

Figure 22. Kriging results for ground-water-quality example: A, kriging estimates back-transformed; B, kriging estimates in log space; C, kriging standard deviations in log space; and D, 95-percent confidence level for kriging estimates back-transformed.
The techniques described in section 5.0 were used to guide the following steps in variogram construction:
1. No trends were indicated during preliminary exploration, and ordinary kriging was tentatively selected as the appropriate technique.
2. A spherical model was used to fit an anisotropic variogram at an angle of 150 counterclockwise degrees to the east-west base line. The variogram had a nugget of 0.05 indicator units squared, a sill of 0.25 indicator units squared, and a range of 610 meters [fig. 23A and table 3 (water quality B)].
3. A spherical model also was fit to an anisotropic variogram at an angle of 240 counterclockwise degrees to the east-west base line. The variogram had a nugget of 0.05 indicator units squared, a sill of 0.25 indicator units squared, and a range of 213 meters [fig. 23B and table 3 (water quality B)].

The established variogram and the indicator transform of the measured data were used to produce ordinary kriging estimates for the same grid and search criteria as the first ground-water-quality example. A gray-scale map of the kriging estimates is shown in figure 24. The kriging indicator map provides a gridded estimate for the probability of contaminant values being less than the indicator cutoff, which is a concentration of 270 micrograms per liter in this example.

The cutoff value selected for the preceding indicator kriging example is probably higher than many investigators involved in HTRW-site investigations would like to use. The number of measurements [66 in table 2 (water quality A)] used in this example is probably a high number of measurements for typical HTRW-site investigations; yet, even with this high number of measurements, it was not possible to construct a variogram for indicator values much lower than the median. One alternative would be to assume the log-transformed kriging model developed in the first water-quality example is correct and to rely on the kriging estimates from that model to determine areas greater than or less than some indicator value. The same estimates also could be used to compute the probability that the concentration was less than some arbitrarily selected value.

7.0 REVIEW OF KRIGING APPLICATIONS

This section presents a brief discussion of three principal topics: applicability of kriging techniques, important elements that need to be addressed in kriging applications, and errors in measured data. Much of the information presented in this section has been gathered from other sections of this report and is presented collectively here. The items identified as important to kriging applications may be helpful in assessing kriging applications that are under review.

7.1 Applicability of Kriging

In the preceding sections of this report, the theory of kriging techniques has been summarized, and examples have been given to indicate the use of kriging techniques in HTRW-site investigations. The examples presented were selected so that kriging would provide satisfactory results or be applicable. Additionally, the examples were designed so that, for the purposes of demonstration, some sort of adjustment of the data was needed; that is, drift was removed or transformations were made.

Investigators are very likely to have data for which, in a strict sense, kriging may be applicable, but results may be unsatisfactory. Much of the fundamental information that might be used to establish how satisfactory the application of kriging techniques may be has been presented in the preceding sections. In particular, section 5.0 includes a detailed discussion on variogram construction, which is the preliminary step in any kriging application, and systematically describes many decisions in variogram construction that need attention. If a variogram that has structure, or some identifiable dependence on lag, cannot be obtained from the data or be obtained from other means such as institutional knowledge, the results of a kriging application may not be satisfactory. Some additional discussion that is designed to aid in evaluating the amount of data that may be required for kriging applications is presented in this section. This discussion assumes that the measured data are correct; a separate and brief discussion of measurement errors also is presented in this section.

[Figure 23 panels (plots not reproduced). A and B: sample variogram points plotted against lag, in meters, with separate symbols for lags with greater than or equal to 30 pairs and lags with less than 30 pairs, and a spherical model fit. A: nugget = 0.05 squared indicator unit; sill = 0.25 squared indicator unit; range = 610 meters. B: nugget = 0.05 squared indicator unit; sill = 0.25 squared indicator unit; range = 213 meters.]

Figure 23. Directional variogram plots for indicator kriging ground-water-quality example: A, theoretical major-direction variogram and B, theoretical minor-direction variogram.
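The fitted models reported for figures 21 and 23 can be evaluated numerically with a short sketch such as the following. Two conventions are assumed here and may differ from the definitions in section 3.0: the reported sill is treated as the total sill (nugget included), which is consistent with the nugget being described as about 30 percent of the sill and with the indicator sill of 0.25 equaling the variance of a median-cutoff indicator; and the exponential range is treated as a practical range at which the model reaches about 95 percent of the structured variance. The function names are illustrative only.

```python
import math


def spherical(h, nugget, sill, a):
    """Spherical variogram; `sill` is the total sill and `a` the range."""
    if h <= 0.0:
        return 0.0
    if h >= a:
        return sill
    r = h / a
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, sill, a, practical_range=True):
    """Exponential variogram; with practical_range=True the model reaches
    about 95 percent of (sill - nugget) at h = a (an assumed convention)."""
    if h <= 0.0:
        return 0.0
    scale = a / 3.0 if practical_range else a
    return nugget + (sill - nugget) * (1.0 - math.exp(-h / scale))


# Indicator example, major direction: nugget 0.05, sill 0.25, range 610 m.
for lag in (150.0, 305.0, 610.0, 900.0):
    print(lag, round(spherical(lag, 0.05, 0.25, 610.0), 4))

# Log-concentration example, major direction: nugget 1.00, sill 3.20,
# range 1,295 m.
print(round(exponential(1295.0, 1.00, 3.20, 1295.0), 3))
```

For geometric anisotropy, the same function would be evaluated with the major-direction range (610 meters) or the minor-direction range (213 meters), depending on the direction of the lag.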

[Figure 24 (gray-scale map not reproduced): north arrow; axes are map distance, in meters; legend values recovered from the plot, in indicator units, include 0.18, 0.59, 0.80, 0.90, and 1.01.]

Figure 24. Indicator kriging results for ground-water-quality example.
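As a small illustration of the indicator transformation used for this example, the sketch below codes each measurement as 1 when it is less than the cutoff and 0 otherwise. The strict inequality follows the wording "less than the indicator cutoff" and is an assumption about how ties are handled; the concentrations are hypothetical and are not the measured data from table 2.

```python
def indicator_transform(values, cutoff):
    """Return 1 for values less than the cutoff and 0 otherwise."""
    return [1 if v < cutoff else 0 for v in values]


def indicator_variance(indicators):
    """Variance of a 0/1 variable, p * (1 - p), where p is the fraction of 1s."""
    p = sum(indicators) / len(indicators)
    return p * (1.0 - p)


# Hypothetical concentrations, in micrograms per liter.
concentrations = [40.0, 120.0, 260.0, 310.0, 900.0, 4500.0]
indicators = indicator_transform(concentrations, 270.0)
print(indicators)
print(indicator_variance(indicators))
```

When the cutoff is the median, half of the indicators are 1 and the indicator variance is 0.25, which matches the sill of 0.25 indicator units squared reported for the fitted variograms.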

Initially, many investigators have a tendency to focus on the amount of measured data that is available as an initial consideration; however, the applicability of kriging techniques cannot be based simply on the amount of measured data. Unless the investigator is presented with a reliable variogram, the amount and spatial distribution of measured data can be a constraint. If, for instance, there are fewer than 25 measured values at optimal locations from the field, there may not be enough data to confidently estimate Gaussian variogram parameters; however, a small amount of measured data may be suitable for other variogram models.

How much data are needed to apply kriging techniques is not easy to determine, but information in this report, especially in section 5.0, and the literature cited can provide some guidance. Section 5.3 points out that a good minimum for the number of pairs of locations in each variogram lag is 30, and the American Society for Testing and Materials (1996) has suggested that 20 may work well also. Most investigators would probably feel comfortable defining a Gaussian form, which has more inflection and, consequently, is more difficult to fit compared to the other standard variogram models, with 8 to 10 optimally located sample variogram points (enough points to define the nugget, two areas of curvature, and the sill). In this ideal case, about 25 measured values would be needed to fulfill the conservative minimum of 30 pairs per lag. In this case, the relatively few measured data points need to be systematically located so that the optimally located variogram points can be computed. If the measured data were not located systematically, as is usually the case, then more measured data would be needed.

Once sample variogram points meeting the required number of pairs can be defined, the resultant variogram still must have structure. The variogram, for instance, may simply exhibit noise about a horizontal line, a case that has no structure. If measured data are clustered and the number of lags has been minimized to meet the required number of pairs of locations, the variogram may seem horizontal because it is dominated by small-scale effects in the clustered data. The investigator then has latitude to adjust the lags and attempt to balance the lag spacing and required

number of pairs per lag interval, as described in section 5.3. However, the variogram also could seem horizontal because the actual sill is reached within a very small lag. If that lag is smaller than the minimum spacing of measured data, obtaining structure in the variogram would not be possible. In such cases, the measured data need to be considered independent, and kriging techniques, at the lag of the measured data, would be ineffective, or at least, offer little advantage over other interpolation techniques.

7.2 Important Elements of Kriging Applications

Many important elements of kriging applications have been discussed in this report. These discussions have been presented as a systematic and sequential method designed to provide guidance in kriging applications. Occasionally, an investigator is presented with the results of a previous kriging application and needs to evaluate the application before deciding whether or not to use the results. This section presents a brief review of some important elements of kriging applications that can be used in that evaluation. For a more detailed discussion of important elements of geostatistical applications, the reader is referred to the American Society for Testing and Materials (1994) for content of geostatistical investigations.

The presence of or lack of stationarity in the spatial mean needs to be demonstrated definitively. If the spatial mean is nonstationary, then drift is indicated and appropriate measures to address the nonstationarity, which are similar to the measures presented in section 5.2, need to be part of the application. In ideal situations, nonstationarity occurs as a gradual change. HTRW-site investigations may present cases, especially when working with ground-water-quality data in and around plumes, that have abrupt step-like changes at plume boundaries and do not appear as regional drift. In these cases, the investigator needs to be aware that, without knowledge of the plume boundaries, points from within the plume will be grouped with points from outside the plume in computing the sample variogram. The effect of this problem is minimized as long as the investigator can define lags that allow data points within the plume to be grouped together.

The construction of the variogram needs to be described and included as part of the kriging application documentation. The description needs to address the number of pairs of locations in each variogram lag and to demonstrate that the variogram has structure. A plot of the variogram is helpful to demonstrate the presence or absence of structure. The variogram construction discussion also needs to establish the presence of or lack of isotropy. If anisotropy is present, its nature needs to be established, and it needs to be addressed by variogram adjustments similar to the adjustments presented in section 5.4.2.

The variogram cross-validation statistics described in section 5.8 are useful and, if available, they can aid in the evaluation of a kriging application; authoritative and definitive kriging applications should include cross validation. Often, the most useful variograms have cross-validation statistics that conform to the guidelines discussed in section 5.8. Section 5.8.1 indicates that the cross-validation exercise needs to balance minimizing the kriging cross-validation errors with efforts to guard against bias. Also, as discussed in section 5.8.1, if probabilistic statements are part of the kriging application, there needs to be some investigation of the normality of the reduced kriging error, such as the cross-validation probability plots included with the examples in section 6.0.

Maps of the kriging estimates and standard deviations need to be presented or discussed. The maps of kriging estimates need to conform to any qualitative knowledge available to the investigator about the quantities portrayed on the maps. The maps of kriged standard deviations can be used to delineate areas of large uncertainty in the kriging estimates.

Finally, the variogram and kriging algorithms are intended as interpolation tools rather than extrapolation tools. Once the application extends to areas beyond the geographic extremes of the measured data, or perhaps those extremes plus the range, there needs to be some qualification of the area of extrapolation. For instance, in universal kriging, the practitioner would need to have some assurance that the conditions of drift defined in the study area continue into the area of extrapolation.
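The sample-size guidance in section 7.1 (about 25 measured values to support 8 to 10 variogram lags at 30 pairs per lag) can be checked with simple pair counting. The sketch below is a best-case bound only: it assumes that every pair of locations falls in exactly one usable lag and that pairs are spread evenly across lags, which corresponds to the "ideal case" of optimally located data. The function names are hypothetical.

```python
def total_pairs(n):
    """Number of distinct pairs that can be formed from n measured locations."""
    return n * (n - 1) // 2


def min_measurements(n_lags, pairs_per_lag=30):
    """Smallest n whose total pair count could fill n_lags lags with at
    least pairs_per_lag pairs each (best case, evenly spread pairs)."""
    needed = n_lags * pairs_per_lag
    n = 2
    while total_pairs(n) < needed:
        n += 1
    return n


print(total_pairs(25))          # 300 pairs, i.e., 10 lags of 30 pairs
print(min_measurements(10))     # 25
print(min_measurements(8))      # 23
print(min_measurements(10, 20))  # 21, using the 20-pair suggestion
```

Because measured data are rarely located so that pairs divide evenly among lags, the counts returned here are lower bounds rather than recommended sample sizes.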



7.3 Errors in Measured Data

Data associated with HTRW-site investigations have the same possibilities for errors that most investigations do. The errors may involve, among others, bias, inaccuracy, or lack of representativeness. The classical nature of these errors is described in a publication by the U.S. Army Corps of Engineers (1995), which describes HTRW data-quality design.

The presence of contamination may complicate the function of errors in HTRW-site investigations. Because these investigations often concern contamination, there can be large ranges of values for data involving contaminant concentrations, and these large ranges have a tendency to increase the incidence of data that may seem to be statistical outliers. Even more complicating is the presence of high concentrations of organic materials that may create challenging analytical problems in laboratory determinations that also may result in reported values that seem to be statistical outliers. In either case, the kriging practitioner is likely to find that the apparent outliers have a strong effect on the results of the kriging application.

When HTRW-site investigations find data that seem to be outliers, the data need to be very carefully evaluated before removal is seriously contemplated. Automated outlier detection tools, as suggested in section 5.7, may be best used to identify points that may be outliers and warrant further investigation. Often data that appear to be outliers may be the most important and meaningful data of all measurements. For example, in the first case described in the preceding paragraph, apparent outliers often are representative values. In the second case, the reported value may be an erroneous determination that has been affected by the extremely contaminated nature of the sample matrix. The investigator needs to either possess or have access to qualitative or institutional knowledge of the study area that aids in outlier interpretation.

8.0 OTHER SPATIAL PREDICTION TECHNIQUES

In this section, some alternative approaches to spatial prediction are discussed. At the beginning of section 3.0, the distinction between stochastic and nonstochastic techniques for spatial prediction was discussed. Kriging is a stochastic technique because of the structure that is imposed in terms of an underlying random process (the regionalized variables) with joint probability distributions that obey certain assumptions. Kriging yields the predictor that is statistically optimal in that it is the best linear unbiased predictor, given certain assumptions that are detailed in section 3.0. There are other stochastic techniques that are less well known, such as Markov-random-field prediction and Bayesian nonparametric smoothing (Cressie, 1991), but these techniques are not discussed here.

Several techniques that are often applied in a nonstochastic setting are discussed. Techniques applied in a nonstochastic setting are generally applied strictly empirically and are not evaluated by rigorous statistical criteria, such as mean-squared prediction error, although, as discussed in section 3.0, such criteria may be applied in certain of the techniques, such as simple average and trend analysis. As indicated in this report, there are some compelling advantages for assuming some kind of stochastic setting. However, the simplicity of not postulating and justifying the structure and assumptions inherent in stochastic analyses might be considered one advantage of nonstochastic techniques, and a nonstochastic analysis may be perfectly adequate for certain problems.

In addition to statistical optimality and simplicity, there are other considerations in selecting a spatial-prediction technique including properties such as ease of computation, sensitivity to data errors, and whether the predictors are exact interpolators; that is, the interpolators match the measurements exactly at the measurement locations $x_1, x_2, \ldots, x_n$. The last property is one that needs to be given careful consideration. Kriging, as usually applied, is an exact interpolator. Questions may be raised, however, about whether exact interpolation is a desirable property if the measurements are contaminated with a considerable measurement error. One advantage of stochastic techniques is that, generally, the existence of measurement error may be incorporated objectively; in fact, some kriging software packages (including STATPAC) have incorporated this feature, resulting in a surface that is not an exact interpolator. Several of the nonstochastic techniques discussed in this section depend on a parameter that controls the deviation from exact interpolation. The ability to adjust such a parameter when using these techniques lends a degree of flexibility, but selecting the best value may not be straightforward and may involve considerable subjectivity.

In most of the following techniques, the predictor of the process at location $x_0$ is a linear combination of the measurements at locations $x_i$, $i = 1, 2, \ldots, n$. Using $\tilde{Z}(x_0)$ to denote an arbitrary predictor [this notation distinguishes the predictors to be discussed in this section from the kriging predictor, which is denoted by $\hat{Z}(x_0)$], the definition of $\tilde{Z}(x_0)$ is

$$\tilde{Z}(x_0) = \sum_{i=1}^{n} w_i\, z(x_i) \tag{8-1}$$

Although this form is the same form that is taken by the kriging predictor, the difference is in the way the coefficients $w_i$ are computed.

8.1 Global Measure of Central Tendency (Simple Averaging)

The predictor for the process at any location $x_0$ is the simple average of the measurements; that is, the weights $w_i$ are all equal and are given by (Cressie, 1991)

$$w_i = \frac{1}{n}. \tag{8-2}$$

This predictor represents the smoothest possible predictor surface. In using this predictor, a certain degree of spatial homogeneity is assumed. No attempt is made to incorporate any detectable patterns (or trends) in the mean or variance of the data as a function of location, and the fact that measurements made at points that are close together may be related is disregarded. Such a predictor has the advantage of being very simple to compute; it needs no estimation of a variogram or other model parameters. The disadvantage is that representing the spatial field by a single value ignores much of the relevant and interesting structure that may be very helpful in improving predictions. As discussed in section 3.3, if applied in a stochastic setting, this predictor would be optimal (best linear unbiased) if there is no drift and if residuals are uncorrelated and have a common variance.

8.2 Simple Moving Average

Let $h_{i0}$ be the distance of $x_0$ from $x_i$, let $h_{[i0]}$ be the ordered (from smallest to largest) distances, and fix $1 \le k \le n$. Then the weights $w_i$ are (Cressie, 1991)

$$w_i = \begin{cases} \dfrac{1}{k}, & h_{i0} \le h_{[k0]} \\ 0, & h_{i0} > h_{[k0]} \end{cases} \tag{8-3}$$

Thus, this predictor is the average of the measurements at the $k$ nearest locations from $x_0$.

If $k$ is equal to $n$, this predictor is identical to the simple average, with weights as given in equation 8-2. A choice of $k$ smaller than $n$ assumes that the predictor needs to incorporate more of the local fluctuation measured in the data, or, equivalently, that measurements at locations near $x_0$ are more informative than measurements at other locations in predicting $z(x_0)$; the smaller $k$ is, the more variable the predictor. If $k = 1$, the predictor is an exact interpolator and is constant on the Voronoi polygons (see section 8.4) induced by the measurement locations.

There are several variations of this predictor. In one such variation, a distance $r$ may be fixed (rather than fixing $k$) and averages over locations that are within distance $r$ of $x_0$ may be obtained. Additionally, a moving median may be used rather than a moving average. Sorting and testing distances can slow computations compared to obtaining the simple average, and use of medians rather than means results in a predictor that is more resistant to outliers.

8.3 Inverse-Distance Squared Weighted Average

The weights $w_i$ are (Journel and Huijbregts, 1978)

$$w_i = \frac{h_{i0}^{-2}}{\sum_{j=1}^{n} h_{j0}^{-2}} \tag{8-4}$$

where again $h_{i0}$ is the distance of $x_0$ from $x_i$.
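The weights in equations 8-2 through 8-4 and the linear predictor in equation 8-1 can be sketched as follows. This is a minimal illustration, not an implementation from the report: the function names are hypothetical, locations are assumed to be two-dimensional coordinate tuples, ties among the k nearest distances are broken arbitrarily (exactly k locations are kept), and a prediction location that coincides with a measurement location is handled by returning that measurement so that the inverse-distance predictor remains an exact interpolator.

```python
import math


def distances(x0, points):
    """Euclidean distances h_i0 from the prediction location to each point."""
    return [math.dist(x0, p) for p in points]


def simple_average_weights(n):
    """Equation 8-2: equal weights 1/n."""
    return [1.0 / n] * n


def moving_average_weights(x0, points, k):
    """Equation 8-3: weight 1/k on the k nearest locations, 0 elsewhere."""
    h = distances(x0, points)
    nearest = sorted(range(len(points)), key=lambda i: h[i])[:k]
    weights = [0.0] * len(points)
    for i in nearest:
        weights[i] = 1.0 / k
    return weights


def inverse_distance_weights(x0, points, power=2.0):
    """Equation 8-4, with the exponent left adjustable (2 by default)."""
    h = distances(x0, points)
    for i, d in enumerate(h):
        if d == 0.0:  # prediction location coincides with a measurement
            return [1.0 if j == i else 0.0 for j in range(len(h))]
    inverse = [d ** -power for d in h]
    total = sum(inverse)
    return [v / total for v in inverse]


def predict(weights, values):
    """Equation 8-1: linear combination of the measurements."""
    return sum(w * z for w, z in zip(weights, values))


# Hypothetical measurement locations and values.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
values = [10.0, 20.0, 30.0, 40.0]
x0 = (0.5, 0.0)
print(predict(simple_average_weights(len(values)), values))
print(predict(moving_average_weights(x0, points, 2), values))
print(predict(inverse_distance_weights(x0, points), values))
```

The three predictors differ only in how the weights are formed, which mirrors the point made after equation 8-1.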



In the simple moving average, weights are the same, provided the measurement locations are sufficiently close to the prediction locations and are zero otherwise. For the inverse-distance squared method, weights are forced to decrease smoothly as distance from the prediction location increases. This predictor again has the advantage of being easy to compute. Another feature of this predictor is that it is an exact interpolator. In addition, the exponent 2 of $h_{i0}$ may be changed to any positive number, providing some flexibility in determining the rate of decrease of weights as a function of distance from $x_0$. Isaaks and Srivastava (1989, p. 257-259) presented an example describing the effects on weights of changing the exponent.

8.4 Triangulation

To compute this predictor, the region $R$ is partitioned into what are referred to as Voronoi polygons $V_1, V_2, \ldots, V_n$, with $V_i$ being the set of locations closer to measurement location $x_i$ than to any other measurement location. If any two polygons, $V_i$ and $V_j$, share a common boundary, $x_i$ and $x_j$ are joined with a straight line. The collection of all such lines defines what is known as the Delaunay triangulation. One such triangle contains the prediction location $x_0$; the vertices of this triangle, which are measurement locations, are labeled $x_j$, $x_k$, and $x_l$. The spatial prediction at $x_0$ is the planar interpolant through the coordinates $[x_j, z(x_j)]$, $[x_k, z(x_k)]$, and $[x_l, z(x_l)]$. By joining $x_0$ and $x_j$, $x_k$, and $x_l$, three subtriangles are formed. The weights $w_i$ are (Cressie, 1991)

$$w_i = \begin{cases} \dfrac{A_i}{A_j + A_k + A_l}, & i = j, k, \text{ or } l \\ 0, & \text{otherwise} \end{cases} \tag{8-5}$$

where
$A_i$ is the area of the subtriangle opposite vertex $x_i$.

These definitions are shown in figure 25. In this figure, the dashed lines depict the Voronoi polygons associated with the measurement points $x_1, x_2, \ldots$, and the solid lines define the Delaunay triangulation. Vertices of the triangle containing the prediction point $x_0$ are $x_1$, $x_5$, and $x_6$, and dotted lines represent the subtriangles defining the associated areas $A_1$, $A_5$, and $A_6$. For this example, $j$, $k$, and $l$ in the general equation 8-5 are 1, 5, and 6, so the weights assigned to points $x_1$, $x_5$, and $x_6$ are

$$w_1 = \frac{A_1}{A_1 + A_5 + A_6}, \quad w_5 = \frac{A_5}{A_1 + A_5 + A_6}, \quad \text{and} \quad w_6 = \frac{A_6}{A_1 + A_5 + A_6}. \tag{8-6}$$

The weight assigned to a point is proportional to the area of the triangle opposite the point.

Computation of this predictor is slower than computation of the predictors in sections 8.1, 8.2, and 8.3. The predictor is an exact interpolator, and the surface produced is continuous but not differentiable at the edges of the triangulation.

[Figure 25 (diagram not reproduced). Explanation: Delaunay triangulation (solid lines), Voronoi polygons (dashed lines), and subtriangles (dotted lines).]

Figure 25. Diagram showing Voronoi polygons.
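The area weights of equation 8-5 can be illustrated with a short sketch. It assumes that the Delaunay triangle containing the prediction location has already been identified (that search is not shown) and that the prediction location lies inside or on the triangle. The function names and the test surface are hypothetical.

```python
def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c given as (x, y) tuples."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return abs(cross) / 2.0


def triangulation_weights(x0, xj, xk, xl):
    """Equation 8-5: each vertex is weighted by the area of the subtriangle
    opposite it, divided by the sum of the three subtriangle areas."""
    a_j = triangle_area(x0, xk, xl)  # subtriangle opposite vertex xj
    a_k = triangle_area(x0, xj, xl)  # subtriangle opposite vertex xk
    a_l = triangle_area(x0, xj, xk)  # subtriangle opposite vertex xl
    total = a_j + a_k + a_l
    return a_j / total, a_k / total, a_l / total


# Hypothetical triangle with values sampled from the plane z = 1 + 2x + 3y.
xj, xk, xl = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
zj, zk, zl = 1.0, 9.0, 13.0
x0 = (1.0, 1.0)
wj, wk, wl = triangulation_weights(x0, xj, xk, xl)
print(wj, wk, wl)
print(wj * zj + wk * zk + wl * zl)  # planar interpolant at x0
```

Because the measurements in this example lie on a plane, the weighted sum reproduces the plane's value at the prediction location, which is the sense in which the predictor is a planar interpolant.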

8.5 Splines

In spline modeling, the measurements are interpolated using combinations of certain so-called basis functions. These basis functions are usually piecewise polynomials of a certain degree that are determined by the user; let this degree be $k$. The coefficients of these polynomials are chosen so that the function values and the first $k-1$ derivatives agree at the locations where they join. The larger $k$ is, the smoother the prediction surface is. Spline techniques are often applied in a nonstochastic framework; in such, they represent a way of fitting a surface that has certain smoothness properties to measurements at a set of locations with no explicit consideration of statistical optimality. There is, however, a considerable body of work in which this technique is applied in a stochastic setting. Splines may be used, for example, in nonparametric regression estimation problems (Wegman and Wright, 1983).

A typical approach to formulating a spline problem is to pose the problem as an optimization problem. In one special formulation, the first two derivatives of the prediction surface are assumed to exist, which imposes a certain degree of smoothness, and the spline function is assumed to minimize

$$\sum_{i=1}^{n} \left[z(x_i) - \tilde{Z}(x_i)\right]^2 + \eta\, Q \tag{8-7}$$

where
$Q$ is a term that depends on the first two derivatives of the predictor surface.

The parameter $\eta$ is a nonnegative number that needs to be specified by the user; the value of this parameter indicates the trade-off between goodness of fit to the data, measured by the first term, and smoothness, as measured by $Q$. If $\eta$ is 0, the spline is an exact interpolator and passes through all the data points. If $\eta > 0$, the spline is not an exact interpolator. (Splines that are not exact interpolators are referred to as smoothing splines.) There are a number of numerical procedures that may be used for fitting splines, but allowing the smoothing parameter $\eta$ to be greater than 0 renders the computational problem more complex.

Under some conditions, a solution to the optimization problem (eq. 8-7) also may be obtained by a kriging algorithm, if the smoothing parameter $\eta$ is equal to the variance of the measurement error and if a special form is chosen for the covariance function. In this situation, spline approximation is a special type of kriging. However, the variogram that needs to be used in the kriging equations to make the kriging predictor equal to the spline predictor is determined by the basis functions selected for the spline. Because the basis functions selected are subjective on the part of the user, the resulting equivalent variogram may not be representative of the true variogram of the data. Because kriging uses the data to indicate reasonable variogram choices, kriging has an important advantage over splines. Another advantage of using the kriging framework is the interpretation of the smoothing parameter in terms of measurement errors. Many times, an objective estimate of the magnitude of the measurement error can be obtained. The connections between kriging and splines are discussed further by Wegman and Wright (1983), Watson (1984), and Cressie (1991).

8.6 Trend-Surface Analysis

Trend-surface analysis is the process of fitting a function, such as that in equation 3-43, to the data, using least squares to determine the coefficients that yield the best fit. Computationally, trend-surface analysis is equivalent to universal kriging, with an assumption that the $Z^*(x_i)$'s in equation 3-16 are uncorrelated. Thus, there is no need to estimate a variogram, and readily available regression packages may be used for estimating the coefficients. As in universal kriging, polynomial surfaces are the most commonly used. When trend surfaces are applied in a stochastic setting, the resulting predictor is optimal if deviations from the surface are uncorrelated and have a common variance.

8.7 Simulation

In this section, a regionalized random variable $Z(x)$, where $x$ is a location in a two-dimensional study region $R$, is considered. Kriging is an interpolation algorithm that yields spatial predictions $\hat{Z}(x)$ that are optimal, as has been discussed in this report. The



mean-squared prediction error is smallest among all predictors that are linear in the measurements. This optimality property is local, in that the mean-squared error of predictions at unsampled locations, when considered one at a time, is minimized without specific regard to preservation of global spatial features. If, however, the actual realization $z(x)$ could be compared to the kriged prediction surface based on $n$ measured values, the kriged surface would be much smoother than the actual surface, especially in areas of sparser sampling. Thus, the kriged surface is a good and realistic representation of reality in that the $n$ measured values are honored, but the kriged surface is less realistic for global properties, such as overall variability.

The purpose of simulation is to produce one or more spatial surfaces (realizations) that are more realistic in preserving global properties than the surface produced by interpolation algorithms, such as kriging. These realizations are produced by using numbers that are drawn randomly (Monte Carlo) to impart variability to the simulated surface, making the simulated surface more representative of the overall appearance of the actual surface. Generally, simulation uses the idea that the true value of a random surface may be expressed as the sum of a predicted value (which is obtained by kriging) plus a random error, which varies spatially and depends on the random numbers drawn. A number of independent realizations are generated, and these realizations are equally probable representations of reality.

A simulation algorithm is said to be conditional if the resulting realizations agree with the measurements at measurement locations $x_1, x_2, \ldots, x_n$. If the underlying process $Z(x)$ is assumed to be Gaussian (or if a transformation is found that makes the process Gaussian), the most common technique of conditional simulation is known as sequential Gaussian simulation (Deutsch and Journel, 1992, p. 141-143). Another, more complicated, Gaussian simulation technique that is particularly useful for three-dimensional simulations because of its computational efficiency is the turning-bands technique (Journel and Huijbregts, 1978; Deutsch and Journel, 1992).

In sequential Gaussian simulation, a set of grid points for which simulated values are desired is defined, and the points are addressed sequentially from location to location along a predetermined path. At each location, a specified set of neighboring conditioning data is retained, including the original data and the values simulated at previously traversed grid locations along the path. Then, a random number is generated from a Gaussian distribution in which the conditional mean and the variance are determined using a kriging algorithm. The value of the random number determines the simulated process at this location. The conditional Gaussian distribution used in the simulation is identical to the conditional distribution discussed in section 3.5.1. An idea of the computational requirements can be obtained from the fact that a kriging algorithm needs to be applied for each simulation location. For multiple realizations, if the path connecting the grid points remains the same, the kriging equations need to be solved for only the first simulation. However, implementation of this procedure needs to account for the assumptions concerning the existence of drift; the details of such an implementation are beyond the scope of this report.

The same sequential approach also may be applied with indicator kriging (see section 3.5.2). At each grid point along the path, a (Bernoulli) random variable that has only two possible values, 0 or 1, is generated, with the relative probability of these two values being determined by indicator kriging applied, as in the previous paragraph, to the original observed indicator data and the previously simulated indicator values.

To get an idea of how simulation results might be used in a risk-assessment setting, assume again that the underlying process is Gaussian and that 1,000 conditional realizations have been generated. If a single grid point $x_0$ (which is not a measurement point) is used, then the simulation has produced 1,000 values at $x_0$, which, when analyzed in histogram form, approximate the probability distribution of potential measurements at that location. If an interval that has exactly 25 (2.5 percent) of the values less than its lower end and 25 of the values larger than its upper end were established, the interval would correspond approximately to the 95-percent prediction interval $\hat{Z}(x_0) - 1.96\,\sigma_K(x_0)$ to $\hat{Z}(x_0) + 1.96\,\sigma_K(x_0)$ discussed in section 3.5.1. For this single location, the simulation would not produce much more information than kriging alone would have produced. The real value of simulation is that realizations, not just
at a single location, but at all of the grid locations jointly, can be obtained. These realizations then can be used to calculate probabilities associated with any number of spatial locations together. For example, the probability that the largest (maximum) contaminant value over a certain subregion is greater than a particular concentration might be assessed. (If the word "largest" here were replaced with "average," then block kriging could be used to obtain the answer.)

A central point that needs to be emphasized is that simulation is especially useful when probabilities associated with complicated, usually nonlinear, functions of the regionalized variables for a region need to be analyzed. The maximum function mentioned in the preceding paragraph is one simple example. Another example is the problem of determining placement of ground-water monitoring wells to detect and monitor ground-water contamination emanating from a potential point source. Given an existing set of hydraulic-head data, kriging might be applied and flow paths determined from resulting hydraulic-head gradients. Intersection of the flow path from the point source with the regional boundary then might be used to determine monitoring-well placement. Conditional simulation would be useful to determine the uncertainty associated with well placement or to give an indication of how many monitoring wells might be appropriate. In this example, the variable of interest, well location, is a complicated function of hydraulic heads, so this is a problem for which simulation is well suited. The reader may refer to Easley and others (1991) for a more detailed discussion of this application of kriging.

The complicated functions of interest in ground-water studies often involve physically based ground-water flow models. Conditional simulation may be used, for example, to generate a suite of hydraulic-conductivity realizations to be used as input to a model that produces as output a set of corresponding hydraulic-head realizations. Weber and others (1991) discussed how ground-water modeling might be used with conditional simulation to study the monitoring-well-placement problem discussed in the preceding paragraph.

9.0 SUMMARY

The geostatistical technique known as kriging can be used to determine optimal weighting of measurements at sampled locations for obtaining predictions, or kriging estimates, at unsampled locations. Kriging also provides information concerning the uncertainty associated with kriging estimates. The uncertainty information available from kriging, as well as the optimal weighting, distinguishes kriging from other techniques used for spatial modeling.

The theory of regionalized random variables is the basis for different forms of kriging. Ordinary kriging is used when the spatial mean is considered constant. Universal kriging is an extension of ordinary kriging that can be used to address a nonconstant spatial mean. Block kriging is used to obtain kriging estimates for a block of area that is larger than the area represented by an individual sample. Indicator kriging implements the kriging equations nonparametrically.

A fundamental step in kriging applications is development of a variogram. The variogram is usually developed from the results of measurements at many locations within the application area. The variogram describes spatial correlation within the application area and provides basic information required to determine optimal weights for measurements to be used in making kriging estimates. Information from the exercise of variogram development can be used to cross-validate the variogram, and the cross-validation statistics can, in turn, be used to fine-tune variogram development.

Example applications of kriging illustrate basic techniques and some constraints that apply to kriging. These applications also illustrate how different types of kriging, such as ordinary, universal, block, and indicator, can be used.

Other spatial modeling techniques include nonstochastic techniques such as simple averaging, inverse-distance squared weighted averaging, triangulation, splines, and trend-surface analysis. These nonstochastic techniques can be simpler to apply than kriging and may be appropriate to use for some problems, especially when it is not necessary to evaluate results with respect to statistical criteria. Another extension to kriging, simulation, is intended to preserve overall variability and to compensate for the tendency of kriging to smooth results.
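
The sequential Gaussian simulation procedure described in section 8.7, and the use of realizations to assess the probability that the maximum value over a subregion exceeds a given concentration, can be illustrated with a minimal Python sketch. The sketch is not taken from this report or from GSLIB. It assumes simple kriging with a known constant mean, an exponential covariance model, a fixed path in grid order, and the use of all previously simulated points as conditioning data rather than a limited neighborhood. The function names (exp_cov, sgs_realization, prob_max_exceeds), the data values, and the parameter values are hypothetical.

```python
import numpy as np

def exp_cov(h, sill=1.0, corr_range=10.0):
    # Exponential covariance; the model and its parameters are assumed here.
    return sill * np.exp(-3.0 * np.asarray(h) / corr_range)

def sgs_realization(data_xy, data_z, grid_xy, rng, mean=0.0,
                    sill=1.0, corr_range=10.0, path=None):
    """One conditional realization by sequential Gaussian simulation.

    Assumes simple kriging with a known constant mean and conditions on
    all data plus all previously simulated values (no neighborhood search).
    """
    pts = np.asarray(data_xy, dtype=float)
    vals = np.asarray(data_z, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)
    if path is None:
        path = np.arange(len(grid_xy))  # predetermined path: grid order
    sim = np.empty(len(grid_xy))
    for idx in path:
        x0 = grid_xy[idx]
        # Covariances among conditioning points and to the target point.
        d_pp = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        d_p0 = np.linalg.norm(pts - x0, axis=1)
        C = exp_cov(d_pp, sill, corr_range)
        c0 = exp_cov(d_p0, sill, corr_range)
        w = np.linalg.solve(C, c0)              # simple-kriging weights
        cond_mean = mean + w @ (vals - mean)    # kriging prediction
        cond_var = max(sill - w @ c0, 0.0)      # kriging variance
        # Draw the simulated value from the conditional Gaussian distribution.
        sim[idx] = rng.normal(cond_mean, np.sqrt(cond_var))
        # Add the new value to the conditioning set, unless the grid point
        # coincides with an existing point (which would make C singular).
        if cond_var > 1e-12:
            pts = np.vstack([pts, x0])
            vals = np.append(vals, sim[idx])
    return sim

def prob_max_exceeds(realizations, subregion_mask, threshold):
    # Fraction of realizations whose subregion maximum exceeds the threshold.
    sub_max = realizations[:, subregion_mask].max(axis=1)
    return float(np.mean(sub_max > threshold))

# Hypothetical measurements at three locations on a 5-by-5 grid.
rng = np.random.default_rng(0)
data_xy = [(1.0, 1.0), (4.0, 2.0), (2.0, 4.0)]
data_z = [0.8, -0.3, 1.5]
gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])

realizations = np.array([
    sgs_realization(data_xy, data_z, grid_xy, rng, corr_range=6.0)
    for _ in range(200)
])
subregion = grid_xy[:, 0] >= 3.0  # grid points with x >= 3
p = prob_max_exceeds(realizations, subregion, threshold=1.0)
print(f"Estimated P(max over subregion > 1.0) = {p:.3f}")
```

Because the path and the data locations are fixed, the kriging weights at each step do not depend on the simulated values. An implementation intended for many realizations could therefore compute the weights once and reuse them, as noted in section 8.7; the sketch recomputes them for clarity.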

10.0 REFERENCES

American Society for Testing and Materials, 1994, Standard guide for content of geostatistical site investigations, D 5549-94: Philadelphia, Pa., American Society for Testing and Materials Committee D-18 on Soil and Rock, 5 p.
___1996, Standard guide for analysis of spatial variation in geostatistical site investigations, D 5922-96: Philadelphia, Pa., American Society for Testing and Materials Committee D-18 on Soil and Rock.
Bras, R.L., and Rodriguez-Iturbe, Ignacio, 1985, Random functions and hydrology: Reading, Mass., Addison-Wesley Publishing Company, 559 p.
Clark, Isobel, 1979, Practical geostatistics: London, Applied Science Publishers, 129 p.
Cliff, A.D., and Ord, J.K., 1981, Spatial processes, models and applications: London, Pion Limited, 266 p.
Cressie, Noel, 1991, Statistics for spatial data: New York, Wiley, 900 p.
David, Michel, 1977, Geostatistical ore reserve estimation, in the collection Developments in geomathematics 2: Amsterdam, Elsevier, 364 p.
Davis, J.C., 1973, Statistics and data analysis in geology: New York, Wiley, 646 p.
Delhomme, J.P., 1978, Kriging in the hydrosciences: Advances in Water Resources, v. 1, no. 5, p. 251-266.
Deutsch, C.V., and Journel, A.G., 1992, GSLIB, geostatistical software library and user's guide: New York, Oxford University Press, 340 p.
Devore, J.L., 1987, Probability and statistics for engineering and the sciences: Monterey, Calif., Cole Publishing Company, 672 p.
Easley, D.H., Borgman, L.E., and Weber, D., 1991, Monitoring well placement using conditional simulation of hydraulic head: Mathematical Geology, v. 23, no. 8, p. 1059-1080.
Englund, E.J., and Sparks, A.R., 1991, GEO-EAS (Geostatistical Environmental Assessment Software) users guide: Las Vegas, Nev., Environmental Monitoring Systems Laboratory Report EPA/600/8-91/008. [Available from National Technical Information Service, Springfield, VA 22161 as NTIS Report PB89-151252.]
Grundy, W.D., and Miesch, A.T., 1987, Brief description of STATPAC and related statistical programs for the IBM personal computer - A, Documentation: U.S. Geological Survey Open-File Report 87-411-A, 34 p.
Hawkins, D.M., 1980, Identification of outliers: London, Chapman and Hall, 188 p.
Isaaks, E.H., and Srivastava, R.M., 1989, An introduction to applied geostatistics: New York, Oxford University Press, 561 p.
Journel, A.G., 1988, Non-parametric geostatistics for risk and additional sampling assessment, in Keith, L., ed., Principles of environmental sampling: Washington, D.C., American Chemical Society, p. 45-72.
Journel, A.G., 1993, Geostatistics for the environmental sciences: Las Vegas, Nev., U.S. Environmental Protection Agency, 135 p.
Journel, A.G., and Huijbregts, C., 1978, Mining geostatistics: London, Academic Press, 600 p.
Krige, D.G., and Magri, E.J., 1982, Studies of the effects of outliers and data transformation on variogram estimates for a base metal and a gold ore body: Mathematical Geology, v. 14, no. 6, p. 557-564.
Lenfest, L.W., Jr., 1986, Ground-water levels and use of water for irrigation in the Saratoga Valley, south-central Wyoming, 1980-81: U.S. Geological Survey Water-Resources Investigations Report 84-4040, 23 p.
Myers, D.E., Begovich, C.L., Butz, T.R., and Kane, V.E., 1980, Application of kriging to the hydrogeochemical data from the National Uranium Resource Evaluation Program: Oak Ridge, Tenn., Union Carbide Corporation, Nuclear Division Oak Ridge Gaseous Diffusion Plant, Report GJBX-57-81, 124 p. [Available from U.S. Department of Energy, Grand Junction, Colorado.]
Olea, R.A., 1991, Geostatistical glossary and multilingual dictionary, in the collection Wiley series in probability and mathematical statistics: New York, Oxford University Press, 177 p.
Ripley, B.D., 1981, Spatial statistics: New York, Wiley, 252 p.
Ross, S.M., 1987, Introduction to probability and statistics for engineers and scientists: New York, Wiley, 492 p.
U.S. Army Corps of Engineers, 1995, Technical project planning - Guidance for hazardous, toxic, and radioactive waste data quality design: Washington, D.C., U.S. Army Corps of Engineers Engineer Manual 200-1-2.
___1997, Practical aspects of applying geostatistics at hazardous, toxic, and radioactive waste sites: Washington, D.C., U.S. Army Corps of Engineers, Engineer Technical Letter ETL 1110-1-175.
Watson, G.S., 1984, Smoothing and interpolation by kriging and with splines: Mathematical Geology, v. 16, no. 6, p. 601-615.
Weber, D., Easley, D.H., and Englund, E.J., 1991, Probability of plume interception using conditional simulation of hydraulic head and inverse modeling: Mathematical Geology, v. 23, no. 2, p. 219-239.
Wegman, E.J., and Wright, I.W., 1983, Splines in statistics: American Statistical Association Journal, v. 78, no. 382, p. 351-365.
