Chapter 1. Introduction and Theoretical Issues in Archaeological GIS. Chapter 2.
First principles
2.1 Introduction
The power of GIS, as with other computer programs, can be deceptive: visually
impressive but ultimately meaningless results can appear unassailable because of
the sophisticated technologies used to produce them (Eiteljorg 2000). The famil-
iar adage ‘garbage in, garbage out’ is particularly applicable to GIS, and one of
our primary aims throughout this book is to provide guidance on how to use this
technology in ways that strengthen and extend our understanding of the human past,
rather than to obfuscate it. In this chapter we start by providing an overview of the
‘first principles’ of GIS: the software and hardware requirements, geodetic and car-
tographic principles, and GIS data models. These provide the conceptual building
blocks that are essential for understanding what GIS is, how it works, and what
its strengths and limitations are. Although some of these ‘first principles’ may be
familiar to readers who are experienced in cartography and computer graphics, we
nevertheless provide a thorough review of each as they form the foundation on
which we build in later chapters.
Fig. 2.1 The five main groups of tasks performed by GIS (after Jones 1997, Fig. 1.2).
that is concerned both with the more fundamental conceptual issues of spatial and
space–time relationships as well as the impact geospatial technologies are having
within the humanities and social sciences (Marble 1990; Curry 1998; Forer and
Unwin 1999; Johnston 1999; Longley et al. 2005).
The acquisition of spatial data: GIS is a software platform for the acquisition and integration
of spatial datasets. Spatial data include, but are certainly not limited to, topographic maps,
site locations and morphology, archaeological plans, artefact distributions, air photography,
geophysical data and satellite imagery, all of which can be integrated into a common analytic
environment.
Spatial data management: GIS uses sophisticated database management systems for the storage
and retrieval of spatial data and their attributes. This might involve the transformation of
map coordinate systems to enable data collected from different sources to be integrated,
the building of vector topologies, the ‘cleaning’ of newly digitised spatial datasets, and the
creation of geospatial metadata.
Database management: A major strength of GIS is that it provides an environment for linking
and exploring relationships between spatial and non-spatial datasets. For example, given a
database on the provenance of a sample of projectile points, and another database that contains
information on the morphology of the same points, they can be linked in such a way that it
becomes possible to look for spatial patterns in points’ morphological variability. Database
management, involving conceptual and logical data modelling, is thus an important part
of GIS, as is database construction and maintenance to ensure that the spatial and aspatial
components of a dataset are properly linked.
Spatial data analysis: GIS also provides the ability to undertake locational and spatial analysis
of archaeological data, as well as tools for examining visibility (viewsheds) and movement
(cost-surfaces) across landscapes. Much work in GIS involves the mathematical combination
of spatial datasets in order to produce new data that may provide insight into natural and
anthropogenic phenomena. These range from ecological models that provide predictions
of soil suitability for agriculture or erosion potential to predictive models of potential site
location. Tools for geostatistical modelling of spatial data to create, for example, continuous
surfaces from a set of discrete observations are also available. GIS can also be a route to
the computer simulation of human behaviour and decision making in different types of
environments.
Spatial data visualisation: GIS has powerful visualisation capabilities used for viewing spatial
data in innovative ways (such as thematically or for ‘fly-throughs’ in three dimensions) that can
suggest potential patterns and routes for further analysis. GIS also provides cartographic tools
to help produce hard-copy paper maps. Many GIS packages also facilitate the publication of
interactive map data on the Internet.
although we may, at times, use locations such as parish, county or survey region. More
frequently we use quantitative location data in the form of map coordinates. These
include global geographic locational systems, with latitude and longitude being the most
common, or national, regional or locally defined Cartesian metric coordinate systems.
• A morphology that defines the shape and size of an object, such as ‘straight’ or
‘100 m²’. Qualitative or quantitative descriptors can be recorded as attribute data by,
for example, recording the size of an archaeological site or the shape of a distribution.
Alternatively, it is possible to record spatial morphology directly by mapping the size
and shape of a phenomenon, such as an archaeological site on a map. For certain
analytical or visual purposes, morphology might be drawn directly on a map, such as
the arrangement of a skeleton or the shape of a distribution of artefacts.
• Information about spatial association and interaction that describes spatial relation-
ships, such as ‘path a crosses path b’, ‘from settlement p one can see settlement q’ or
‘site k is 100 m east of fresh water’. As we discussed in Chapter 1, some types of spatial
associations are referred to as topological, such as when we talk about path or road
connections. Topological relationships are also described as orientation-independent
because only the connective relationships between objects are important and not their
orientation or spatial location. Orientation-dependent or directional relationships are
those that use relational directions, such as above, below, in front, behind, or the
cardinal directions east, west, south, north (Jones 1997, p. 25).
• Temporal relationships that describe the date and/or associated features in relative
terms, like ‘contemporary with’, ‘later than’, ‘earlier than’, etc. Temporal relationships
can be important for ensuring that particular types of analysis, such as settlement
patterns, are undertaken only on contemporaneous sites.
• One or more aspatial attributes that describe the nature of the object. This might
consist of a biography of a site or object, information about the colour and raw material
of an object, the time of day that a field was fieldwalked, the shape of the cross-section
of a feature, or the estimated age of a burial.
The ability to associate aspatial data with spatial objects means that it is pos-
sible to explore the spatial characteristics of non-spatial data. For example, given
a database of handaxes that records their spatial provenance and aspatial morpho-
logical attributes (e.g. weight, size, shape, raw material, reduction stage, etc.) it is
possible to explore the relationship between their location and their other charac-
teristics. This is an extremely important ability of GIS that has found application
in many areas of archaeological research.
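The linking of spatial and aspatial records described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record identifiers, coordinates, attribute values and the function name link_tables are all invented for the example and do not come from any real dataset or GIS package.

```python
# Hypothetical records: ids, coordinates and attribute values are invented
# purely for illustration.
provenance = {
    "HA01": {"easting": 412.0, "northing": 230.5},
    "HA02": {"easting": 418.5, "northing": 228.0},
    "HA03": {"easting": 530.0, "northing": 101.0},
}
morphology = {
    "HA01": {"weight_g": 310, "raw_material": "flint"},
    "HA02": {"weight_g": 295, "raw_material": "flint"},
    "HA03": {"weight_g": 640, "raw_material": "quartzite"},
}


def link_tables(spatial, aspatial):
    """Join two tables on their shared id, keeping only ids found in both."""
    return {
        key: {**spatial[key], **aspatial[key]}
        for key in spatial
        if key in aspatial
    }


linked = link_tables(provenance, morphology)

# A spatial question asked of an aspatial attribute:
# where were the flint handaxes found?
flint_locations = [
    (record["easting"], record["northing"])
    for record in linked.values()
    if record["raw_material"] == "flint"
]
print(flint_locations)
```

Once the two tables share a key, any aspatial attribute can be used to select locations, which is the kind of exploration the paragraph above describes.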
1 http://grass.itc.it. 2 www.clarklabs.org.
3 www.esri.com. 4 www.mapinfo.com.
To emphasise the differences between traditional paper maps and the dynamic
interface that GIS offers it is worth noting some of the constraints of the former (cf.
Longley et al. 1999, p. 6). Paper maps differ from GIS because they are:
Static The dynamic space-time interactions between objects cannot easily be depicted
(e.g. changes in population and settlement patterns, or environmental change). A GIS
offers the advantage of enabling exploration of the dynamics of temporal patterning.
The University of Sydney Archaeological Computing Lab’s TimeMap project5 is an
excellent example of this form of dynamic mapping.
Two-dimensional Multidimensionality cannot be easily depicted on paper. Multivariate
spatial data and the three-dimensional representation of topography benefit from
multidimensional forms of display available in GIS (e.g. Portugali and Sonis 1991;
Couclelis 1999).
5 www.timemap.net.
A further major advantage of a GIS over traditional mapping is that a GIS permits
the organisation of different components of the same map into different thematic
map layers (an approach often referred to as thematic mapping), which is the basic way
that spatial data are organised within a GIS environment. In practice this means
that in one GIS digital display many different elements may be combined, each
of which can be individually turned on or off, queried, modified, reclassified and
edited. Many analytical functions, such as spatial queries, can operate across one
or more layers depending on the need of the GIS analyst. Map layers, or subsets
of individual layers, can also be combined to produce new maps at will, providing
potential insight into relationships between elements on different themes.
Fig. 2.3 Polar coordinates. The circle of the sphere in the x, y-plane is the equator, and
in the x, z-plane it is the meridian. If p is an arbitrary point on the surface of the Earth,
then the angle defined by θ is therefore longitude, and the angle defined by φ is latitude
(after Worboys 1995, p. 143).
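The relationship between the angles in Fig. 2.3 and a point on the sphere can be sketched as follows. The sketch assumes a perfect sphere rather than an ellipsoid, and assumes that the x-axis passes through the intersection of the equator and the reference meridian; the function name is hypothetical.

```python
import math


def geographic_to_cartesian(longitude_deg, latitude_deg, radius=1.0):
    """Convert longitude (theta) and latitude (phi) to x, y, z on a sphere.

    Assumes a sphere of the given radius, with the equator in the
    x, y-plane and the reference meridian in the x, z-plane.
    """
    theta = math.radians(longitude_deg)
    phi = math.radians(latitude_deg)
    x = radius * math.cos(phi) * math.cos(theta)
    y = radius * math.cos(phi) * math.sin(theta)
    z = radius * math.sin(phi)
    return x, y, z


# A point on the equator at the reference meridian lies on the x-axis;
# the pole lies on the z-axis.
origin_point = geographic_to_cartesian(0.0, 0.0)
north_pole = geographic_to_cartesian(0.0, 90.0)
```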
to display areas of the Earth’s round surface on a flat map is map projection, which
involves a mathematical transformation of the units of longitude and latitude (i.e.
graticules) to a flat plane. Essentially, a flat map of a large area of the Earth’s surface
cannot be produced without some form of projection. When mapping areas at the
continental or international scale the transformation from three to two dimensions
causes profound distortion and spatial error in particular types of measurement. At
national and regional scales or larger, the distortion arising from projection to a
flat surface causes fewer problems, and national and state mapping agencies have
established projections for minimising error within their own boundaries. At very
large scales, what we might term subregional or local, the surface of the Earth can
be regarded as flat and grid systems can be established and used without reference to
geodetic correction. Note here the use of the terms ‘large scale’ and ‘small scale’, as
this can be a source of confusion. Large scale generally refers to scales of 1 : 50 000
or greater (e.g. 1 : 25 000, 1 : 5000, etc.), and small scale to maps with scales smaller
than 1 : 50 000 (e.g. 1 : 100 000, 1 : 1 000 000, etc.; Thurston et al. 2003, p. 37).
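Because the terminology is easy to invert, a small sketch may help. It encodes the 1 : 50 000 threshold given above and converts a distance measured on a map to a ground distance; the function names are hypothetical, and the threshold is the general convention quoted in the text rather than a fixed standard.

```python
def classify_scale(denominator):
    """Classify a map scale of 1 : denominator.

    A scale of 1 : 50 000 or greater (a denominator of 50 000 or less)
    is treated as large scale; anything smaller is small scale.
    """
    return "large scale" if denominator <= 50_000 else "small scale"


def ground_distance_m(map_distance_mm, denominator):
    """Ground distance in metres for a distance measured on the map in mm."""
    return map_distance_mm * denominator / 1000.0


print(classify_scale(25_000))         # large scale
print(classify_scale(1_000_000))      # small scale
print(ground_distance_m(10, 25_000))  # 10 mm on a 1 : 25 000 map
```

The larger the denominator, the smaller the scale: 10 mm on a 1 : 25 000 map covers far less ground than 10 mm on a 1 : 1 000 000 map.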
Many forms of map projection have been developed for both global and national
mapping purposes and most GIS programs will support many or all of the common
ones (GRASS, for example, supports some 123 different projections). Projection
systems may be grouped into a projection family of which there are three main
ones, conical, azimuthal and cylindrical, defined according to how the sphere is
projected onto a flat surface (for a mathematical discussion see Iliffe 2000). Each
projection family has either a line of tangency or two lines of secancy that define
where the imagined projection surface comes into contact with the Earth, and where
there is correspondingly the least distortion (Fig. 2.4). All projections will distort
Fig. 2.4 A conical projection with two lines of secancy (left) and one line of tangency
(right). The point(s) of contact are also referred to as standard parallels.
Fig. 2.5 Albers equal-area conical projection with one line of tangency (left) and a
meridian (dashed line). The resulting map is to the right, showing the lines of latitude
as concentric arcs.
Fig. 2.6 Azimuthal projection with a point of contact at the North Pole. The resulting
map has radiating lines of longitude, and concentric lines of latitude. Angle and
distance measurements taken along the lines of longitude remain accurate.
(or planar) projections are usually used to map the poles although in theory they can
occur anywhere on the Earth’s surface. If polar, then the projection is conformal
with concentric lines of latitude and radiating lines of longitude. Area distortion
occurs as one moves away from the poles, but directions and linear distances from
the centre point to any other point on the map are accurate.
Cylindrical projections are conformal and so 90° angles are maintained between
the lines of latitude and longitude (Fig. 2.7). Measurements along the line of tan-
gency are equidistant but at further distances from this line area measurements
become increasingly distorted.
The most common cylindrical projection is the Mercator Projection, which
uses the equator as its line of tangency and scales the y-dimension (latitude) to
reduce the distortion at polar extremes. This projection gives a very misleading
view of the world as movement away from the equator causes areas towards the top
and bottom of the map to become disproportionately large in area (Snyder and
Voxland 1989, p. 10). The Transverse Mercator Projection (TM projection),
invented by Johann Lambert (1728–1777), rotates the cylinder 90° so that a meridian
becomes the line of tangency. This distorts measurements in the east–west
axis but maintains north–south measurements better than the standard Mercator
Projection. The TM Projection is one of the standard ways of mapping the globe.
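The area distortion of the Mercator Projection can be made concrete with a short sketch. It uses the standard spherical form of the projection, which is an assumption here because the text does not give the equations; GIS packages normally apply an ellipsoidal version. The function names are hypothetical.

```python
import math


def mercator(longitude_deg, latitude_deg, radius=1.0):
    """Spherical Mercator: project longitude and latitude to planar x, y.

    Assumes a sphere with the equator as the line of tangency.
    """
    lam = math.radians(longitude_deg)
    phi = math.radians(latitude_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


def area_exaggeration(latitude_deg):
    """Factor by which mapped areas are inflated at a given latitude."""
    return 1.0 / math.cos(math.radians(latitude_deg)) ** 2


for latitude in (0, 30, 60, 80):
    print(latitude, round(area_exaggeration(latitude), 2))
```

Under these assumptions areas are true at the equator, roughly four times too large at 60° latitude, and grow without limit towards the poles, which is why the projection gives such a misleading impression of the relative size of high-latitude regions.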
Fig. 2.7 Cylindrical projection with a line of tangency corresponding to the equator
and a meridian (dashed line). The resulting map is to the right, showing the lines of
latitude as parallel lines.
6 www.ngs.noaa.gov. 7 www.ngs.noaa.gov/cgi-bin/nadcon.prl.
[Figure residue: planar coordinate grid with y-axis (northings); point p (67, 31); point q (34, 18); triangle sides a, b and c.]
US State Plane system, British National Grid and in most other national grids.
National grid systems, such as the US State Plane system, are often better choices
for regional mapping projects because the ellipsoid is often selected to maximise
spatial accuracy for the specific area covered by that particular system. In parts of
the world where national or military grids are unavailable, UTM is an excellent
choice. We must emphasise again our warning from the previous section regarding
the inevitable and significant spatial errors that will result from combining data
derived from maps with different projections and/or ellipsoids.
Metric planar systems have the crucial advantage of allowing the
easy calculation of distance and area. For example: linear distance measurements
can be calculated using Pythagoras’ theorem (Fig. 2.9); polygon areas can be
Fig. 2.10 The three vector ‘geographic primitives’ of points, lines and polygons.
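The distance calculation that metric planar systems allow can be sketched as follows. The coordinates are those of the points labelled p (67, 31) and q (34, 18) in the figure; the function name, and the reading of a and b as the easting and northing differences, are assumptions made for this illustration.

```python
import math


def planar_distance(p, q):
    """Straight-line distance between two points in a planar metric grid,
    using Pythagoras' theorem: c = sqrt(a^2 + b^2)."""
    a = q[0] - p[0]  # difference in eastings
    b = q[1] - p[1]  # difference in northings
    return math.sqrt(a ** 2 + b ** 2)


p = (67, 31)
q = (34, 18)
distance_pq = planar_distance(p, q)
print(round(distance_pq, 2))  # 35.47 grid units
```

The same calculation is not valid on unprojected latitude and longitude values, because a degree of longitude does not represent a constant ground distance.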
Vector topology
An extremely important concept that underlies the vector structure is the geometrical
relationships between vector objects, referred to as topology. The analysis
of topological relationships is explored more fully in Chapter 11, so here it is
sufficient to note a few basic concepts. Firstly, topological relationships define
the connections and relationships between vector objects rather than their spatial
location. For example, when two roads cross each other, two different topological
relationships can potentially exist between those entities. If the lines simply cross
without sharing a node, the lines are not topologically connected. This is equivalent
to a road crossing another via an underpass and it is not possible to get from one
road to another at the point of intersection. If the roads do share a node, they are
topologically linked. In this case, it would be equivalent to the two roads meeting
at an intersection. Topological relationships are therefore defined by the presence
of shared nodes between vector objects. In practice, many GIS require nodes at
both crossing and meeting points, in which case additional methods must be used
to provide adequate topological information (see Chapter 11).
Fig. 2.11 Vector objects linked to attribute data. In this example, each polygon has a
unique id number that links it directly to an attribute table that defines the soil type
represented by that polygon.
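The distinction between lines that merely cross and lines that share a node can be sketched as follows. As a simplification, every stored coordinate pair is treated as a node; the road geometries and function names are hypothetical.

```python
def shared_nodes(line_a, line_b):
    """Return the coordinate pairs that occur in both lines."""
    return set(line_a) & set(line_b)


def topologically_connected(line_a, line_b):
    """Two lines are connected only if they share at least one node."""
    return bool(shared_nodes(line_a, line_b))


# Two roads whose geometry crosses at (5, 5) but with no node stored there:
# the equivalent of an underpass.
road_1 = [(0, 5), (10, 5)]
road_2 = [(5, 0), (5, 10)]

# The same roads with a node stored at the crossing point:
# the equivalent of an intersection.
road_3 = [(0, 5), (5, 5), (10, 5)]
road_4 = [(5, 0), (5, 5), (5, 10)]

print(topologically_connected(road_1, road_2))  # False
print(topologically_connected(road_3, road_4))  # True
```

Both pairs of roads look identical when drawn, which is why the connection has to be recorded in the data structure rather than inferred from appearance.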
Topological relationships also define how polygons relate to each other. For
example, two adjacent polygons, perhaps representing separate parcels of land or
survey zones, are topologically related if they share one or more nodes or arcs
in common (Fig. 2.12). Without common nodes this relationship does not exist,
and the polygons then must either overlap and/or have a gap between them. It
is entirely possible that they intentionally overlap or have a gap to reflect a real-
world spatial relationship; but more usually adjacent polygons have an assumed,
if not actual, topological relationship. The calculation of spatial relationships and
properties of vector objects is not a trivial process, and is dependent both on the
data structure and accuracy of the dataset. During the data collection phase and
particularly during the process of digitising vector objects, care should be taken
to ensure that topological relationships are properly maintained and defined. Many
vector GIS programs have ‘clean-up’ routines that can be used to create topologies
between objects automatically (Chapter 5).
Some geodatabases, such as those used by ArcGIS, provide a set of topological rules to ensure
that vector objects are always related in appropriate ways. For example, polygons
that define survey areas might have a ‘Must Not Overlap’ rule, so that any instances
where this occurs are identified and the appropriate action taken (e.g. the overlapping
area is subtracted from one polygon, or a new polygon defined by the
overlapping area is created).
Fig. 2.12 Three topologically related polygons. Polygons 1, 2 and 3 share arcs (edges)
defined by nodes cm, mj and mg.
Topological accuracy also makes for more efficient storage of vector data as
vector objects can then share data. Some GIS take advantage of this when
storing the geometric definitions by only recording an arc (and its vertices) once,
and then defining its relationship to polygons. In Fig. 2.12, for example, arcs cm,
mj and mg need only be stored once instead of twice for each of the polygon
boundaries they define. On large, complex, polygonal maps such as those routinely
encountered with soil or geological series, this can result in a significant saving
of storage space and computational time, an issue examined in further detail in
Chapter 4.
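The idea of storing each arc once and letting polygons refer to it can be sketched as follows. The arc names cm, mj and mg follow Fig. 2.12, but the assignment of arcs to polygons and the outer boundary arcs (here called jc, cg and gj) are assumptions made for illustration, as are the function names.

```python
from itertools import combinations

# Each polygon is defined by references to arcs, not by its own coordinates.
# Which polygon uses which arc is assumed here, not taken from the figure.
polygon_arcs = {
    1: ["jc", "cm", "mj"],
    2: ["cg", "mg", "cm"],
    3: ["gj", "mj", "mg"],
}


def adjacent_polygons(arcs_by_polygon):
    """Find pairs of polygons that share one or more arcs."""
    pairs = {}
    for (id_a, arcs_a), (id_b, arcs_b) in combinations(arcs_by_polygon.items(), 2):
        common = set(arcs_a) & set(arcs_b)
        if common:
            pairs[(id_a, id_b)] = common
    return pairs


def arcs_stored(arcs_by_polygon):
    """Count arcs stored with sharing, and without it (each polygon
    holding its own copy of every boundary arc)."""
    shared = len({arc for arcs in arcs_by_polygon.values() for arc in arcs})
    unshared = sum(len(arcs) for arcs in arcs_by_polygon.values())
    return shared, unshared


print(adjacent_polygons(polygon_arcs))
print(arcs_stored(polygon_arcs))  # (6, 9)
```

In this small example sharing reduces nine stored arcs to six, and adjacency between polygons falls out of the arc references without any geometric calculation.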
Fig. 2.13 Point, line and polygon primitives as represented on a raster grid.
Fig. 2.14 Representing complex curves with raster data can be problematic. The box
on the far left shows five vector polylines. The centre box shows the same lines using a
10 × 10 raster grid (i.e. 100 cells). On the far right the resolution has been increased
to a 20 × 20 grid (i.e. 400 cells). This improves the representation, but the raster map
still suffers from being blocky and from lost detail.
Disadvantages of the raster structure
There are three major disadvantages of the
raster structure: its fixed resolution, its difficulty in representing discrete entities and
its limited ability to handle multiple attribute data. The first problem arises when
data collected at different scales need to be integrated. Combining multiscalar
datasets could be seen as introducing additional problems regardless of the data
model, and might ideally be avoided, but in practice there are many instances when
data collected at different scales must be combined. Field survey data, for example,
often mix scales of representation from the larger survey unit (such as a field) to
site-based artefact collections where more detail is collected. The representation
of multiscalar data is difficult in raster systems and the combination of raster data
collected at different scales often results in having to default to the smaller scale and
losing detail. Secondly, problems can arise with representing complex boundaries
using raster data because of the inherent limitations of grid data for representing
tightly curved objects. Unless the cells are very small in relation to the object being
represented and the storage size correspondingly increased, curved lines will always
be blocky in appearance (Fig. 2.14).
For this reason complex shapes, such as contour lines, are better modelled using
vector objects. Finally, raster data have always been difficult to connect to attribute
tables. Although some GIS programs, notably Idrisi and GRASS, provide a facility
for linking raster data to a database, in practice this is often more cumbersome than
the embedded attribute tables that vector-based GIS programs provide. The raster
data structure thus has limitations for the management and querying of multiscalar
spatial datasets.
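The effect of cell size on the representation of a curve, as illustrated in Fig. 2.14, can be sketched by rasterising the same line at two resolutions. The curve (a quarter circle), the sampling approach and all function names are assumptions chosen for simplicity; GIS packages use more efficient rasterisation methods.

```python
import math


def rasterise(points, cells_per_side, extent=1.0):
    """Mark every grid cell that contains at least one sampled point."""
    cell_size = extent / cells_per_side
    cells = set()
    for x, y in points:
        # Clamp so that points on the upper edge fall in the last cell.
        col = min(int(x / cell_size), cells_per_side - 1)
        row = min(int(y / cell_size), cells_per_side - 1)
        cells.add((row, col))
    return cells


def quarter_circle(samples=2000, radius=0.9):
    """Densely sampled points along a quarter circle (a hypothetical curve)."""
    step = (math.pi / 2) / (samples - 1)
    return [
        (radius * math.cos(i * step), radius * math.sin(i * step))
        for i in range(samples)
    ]


def render(cells, cells_per_side):
    """Draw the grid as text, with '#' for cells crossed by the curve."""
    lines = []
    for row in reversed(range(cells_per_side)):
        lines.append(
            "".join("#" if (row, col) in cells else "." for col in range(cells_per_side))
        )
    return "\n".join(lines)


curve = quarter_circle()
coarse = rasterise(curve, 10)  # 10 x 10 grid, 100 cells
fine = rasterise(curve, 20)    # 20 x 20 grid, 400 cells

print(render(coarse, 10))
print(render(fine, 20))
```

Doubling the resolution quadruples the number of cells that must be stored, yet the finer grid still represents the curve as a staircase of cells.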
2.5 Conclusion
Geographical information systems (GIS) are a powerful technology that offers a
host of analytical possibilities for investigating the spatial organisation of culture
and human–environment relationships. These ‘first principles’ of GIS only define
the starting point for exploring the complexity of the human use of space with GIS.
In fact, many of these first principles are being constantly challenged by research
that is pushing beyond the constraints of two-dimensional mapping to use GIS to
model space–time relationships more adequately than the basic vector and raster