Principles and Applications of GIS
Geographical Information Systems arose when it was realized that there was a deficiency in aiding prediction: humans were limited to showing things as they were, rather than their possible happenings, which made it difficult to make decisions.
1.1 Definitions
Geographical Information System
"…a system of computer software and procedures designed to support the capture,
management, manipulation, analysis, and display of spatially referenced data for
solving complex planning and management problems."
OR: a GIS is a computerized system that facilitates the phases of data entry, data analysis and data presentation, especially where we are dealing with geo-referenced data. Geo-referenced data is data showing the distribution of different things in space on the earth's surface.
Information is data that has been interpreted by a human being. Humans work with and act upon information, not data. Human perception and mental processing lead to information and, hopefully, to understanding and knowledge. Geoinformation is a specific type of information resulting from the interpretation of spatial data.
(iii) Temporal data: data that shows change over time.
Conceptually, the basic objective of any temporal database is to record or portray
change over time. Change is normally described as an event or collection of events.
Perhaps the most encompassing definition of an event is "something of significance that happens".
Change, and therefore also events can be distinguished in terms of their temporal
pattern into four types:
a) continuous – going on throughout some interval of time
b) majorative – going on most of the time
c) sporadic – occurring some of the time
d) unique – occurring only once.
This means that duration and frequency become important characteristics in describing
temporal pattern.
v) Organizational Context
For effective GIS use, an appropriate organizational context is required.
Training of human resources is essential if GIS is to be applied in management: personnel and managers with knowledge of GIS keep the whole system running.
Other GIS functionalities come with these tools, including support for various coordinate systems and transformations between them, many different kinds of computation with geo-referenced data, and freedom of choice in the presentation of results.
However, they differ in that a much greater volume and density of data is input to a GIS, and much more analysis is performed in a GIS.
Every geographical phenomenon can be represented by the above three entity types together with a "label" describing what it is, e.g. oil wells are represented by a "point" entity consisting of a single (x, y) coordinate and a label explaining what it is.
The translation of GIS data (spatial or attribute) from one type/format to another is
what is referred to as Data conversion.
Vector lines are often referred to as arcs and these consist of strings of vertices
terminated by a node, a node being a vertex that starts or ends an arc segment.
Point features are defined by one coordinate pair, a vertex. Polygonal features are defined by a closed set of coordinate pairs.
In vector representation, storage of the vertices for each feature is important, as is connectivity between features, i.e. the sharing of common vertices where features intersect. The vector format also encompasses other data models, some of which include the topological data model and Computer Aided Drafting (CAD).
The Computer Aided Drafting data model consists of listed elements, not features: geographic features such as points, lines, or areas are defined by strings of vertices. There is considerable redundancy with this data model, since the boundary segment between two polygons can be stored twice, once for each feature.
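To make the contrast concrete, here is a minimal sketch in Python (with made-up coordinates) of CAD-style storage, where the shared boundary is duplicated, versus topological storage, where each arc is stored once and referenced by the polygons on either side:

    # CAD-style storage: each polygon lists its full boundary, so the
    # shared edge between A and B is stored twice.
    polygon_a = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
    polygon_b = [(4, 0), (8, 0), (8, 3), (4, 3), (4, 0)]

    # Topological storage: arcs are stored once and referenced by id.
    arcs = {
        1: [(0, 0), (4, 0)],                  # bottom of A
        2: [(4, 0), (4, 3)],                  # shared boundary, stored once
        3: [(4, 3), (0, 3), (0, 0)],          # left side of A
        4: [(4, 0), (8, 0), (8, 3), (4, 3)],  # right side of B
    }
    # Each polygon is a list of (arc id, direction) references; B walks
    # the shared arc in the reverse direction.
    polygons = {
        "A": [(1, +1), (2, +1), (3, +1)],
        "B": [(4, +1), (2, -1)],
    }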
The size of cells depends on data accuracy and resolution required by the user. There
is no explicit coding of geographic coordinates required since it is implicit in the layout
of cells. Points are represented by one pixel/grid cell. Lines are represented by a
number of pixels in a given direction while areas are represented by an aggregation of
pixels.
The raster data type involves a division of spatial data into regularly spaced cells, each having the same shape and size, the most utilized shape being a square. Most systems require that a raster cell contain only a single discrete value; hence a data set may consist of a series of raster maps, one per attribute, e.g. a height map, a density map, among others.
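As an illustrative sketch (Python with numpy; the origin and cell size are hypothetical), a raster height layer is simply a two-dimensional array, and geographic position is implicit in each cell's row/column together with the grid's origin and cell size:

    import numpy as np

    origin_x, origin_y = 450_000.0, 60_000.0  # hypothetical map coordinates
    cell_size = 30.0                          # metres per (square) cell

    # A tiny height map: a single value per cell.
    heights = np.array([
        [1102, 1105, 1110, 1108, 1103],
        [1100, 1104, 1112, 1115, 1109],
        [1098, 1101, 1107, 1111, 1106],
    ])

    def cell_centre(row, col):
        # Recover the map coordinate of a cell centre from its indices;
        # no per-cell coordinates need to be stored.
        x = origin_x + (col + 0.5) * cell_size
        y = origin_y - (row + 0.5) * cell_size  # rows count downwards
        return x, y

    print(cell_centre(0, 0))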
The use of raster data types allows for sophisticated mathematical modeling processes, while vector-based systems are often constrained by the capabilities and language of a relational DBMS.
Rasterisation
In many cases vector data may be converted to raster data in a process called vector
to raster data conversion (rasterisation). In this process, a digitizer is used to
encode the polygons by digitizing their arcs (line segments forming polygons borders
or individual linear features).
These sets of arcs can easily be converted into raster form at any resolution required
by using programs or application software packages. Most GIS software allows the
user to define a raster grid cell for conversion. It is imperative that the original scale
e.g. accuracy of data be known prior to conversion.
The accuracy of the data is often referred to as resolution, and this should determine the cell size of the output raster map during conversion. Rasterisation leads to a loss of information, because cells or pixels near the digitized boundaries are miscoded. The loss in accuracy is proportional both to the size of the grid cell and to the wiggliness of the boundaries.
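The core of rasterisation can be sketched as follows (plain Python; a simplified cell-centre test rather than any particular package's algorithm): each cell is coded 1 or 0 according to whether its centre falls inside the polygon, so the coarser the cell, the more boundary cells are miscoded:

    def point_in_polygon(x, y, poly):
        # Even-odd ray-casting test; poly is a list of (x, y) vertices.
        inside = False
        n = len(poly)
        for i in range(n):
            x1, y1 = poly[i]
            x2, y2 = poly[(i + 1) % n]
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside

    def rasterise(poly, nrows, ncols, cell):
        # Code each cell 1 or 0 by testing its centre against the polygon.
        return [[1 if point_in_polygon((c + 0.5) * cell, (r + 0.5) * cell, poly)
                 else 0 for c in range(ncols)]
                for r in range(nrows)]

    triangle = [(1.0, 1.0), (9.0, 2.0), (4.0, 8.0)]
    for row in rasterise(triangle, nrows=10, ncols=10, cell=1.0):
        print(row)

Re-running with a smaller cell value captures the boundary more faithfully, which is why the cell size should match the accuracy of the source data.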
Vectorization
This is data conversion from a raster to a vector format. In this process, algorithms are needed to convert arrays of pixels/cells into line data. This provides the capability to convert data from scanners and digitizers into lines and text, and also to output raster data to devices such as pen plotters.
It should be noted that data conversions come with advantages and disadvantages as
used in GIS some of which are given below.
• Display and plotting can be expensive, particularly for high quality color
• The technology is expensive, particularly for the more sophisticated software
and hardware
• Spatial analysis and filtering within polygons are impossible
• Combination of several vector polygon maps through overlay creates
difficulties
• Continuous data, such as elevation data, is not effectively represented in
vector form. Usually substantial data generalization or interpolation is required
for these data layers.
• Simulation is difficult because each spatial unit has a different topological form.
Attribute data is usually input by manual keying or via a bulk loading utility of the
DBMS software. ASCII format is a de facto standard for the transfer and conversion
of attribute information.
The choice of data input method is governed largely by the application, the
available budget, and the type and the complexity of data being input.
There are at least four basic procedures for inputting spatial data into a GIS. These are:
(i) Manual digitizing;
(ii) Automatic scanning;
(iii) Entry of coordinates from survey data (coordinate geometry); and
(iv) Conversion of existing digital data.
1. Manual Digitizing
The majority of GIS spatial data entry is done by manual digitizing.
A digitizer is an electronic device consisting of a table upon which the map or drawing
is placed. The user traces the spatial features with a hand-held magnetic pen, often
called a mouse or cursor or a digitizing puck. While tracing the features the
coordinates of selected points, (e.g. vertices,) are sent to the computer and stored.
All points that are recorded are registered against positional control points,
usually on the map corners, which are keyed (typed) in by the user at the beginning of
the digitizing session. The coordinates are recorded in a user-defined coordinate system or map projection. Latitude and longitude (geographic coordinates) and UTM coordinates (planar coordinates) are most often used.
The ability to adjust or transform data during digitizing from one projection or
coordinate system to another is a desirable function of the GIS software. Numerous
functional techniques exist to aid the operator in the digitizing process.
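For example, the registration against control points described above can be implemented with an affine transformation fitted by least squares. A hedged sketch (Python with numpy; the table and map coordinates are invented): the four corner points give an over-determined system, after which every digitized point can be mapped into ground coordinates:

    import numpy as np

    # Hypothetical control points: digitizer table coordinates (cm) and
    # the known map coordinates (m) keyed in for the four map corners.
    table = np.array([[2.1, 3.0], [52.3, 3.2], [52.0, 43.1], [1.9, 42.8]])
    world = np.array([[450_000, 20_000], [500_000, 20_000],
                      [500_000, 60_000], [450_000, 60_000]], dtype=float)

    # Solve for a 6-parameter affine transform by least squares.
    design = np.column_stack([table, np.ones(len(table))])
    params, *_ = np.linalg.lstsq(design, world, rcond=None)

    def to_world(tx, ty):
        # Transform a digitized table point into map coordinates.
        return np.array([tx, ty, 1.0]) @ params

    print(to_world(27.0, 23.0))  # a point traced mid-sheet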
Digitizing methods
Point vs. stream digitizing.
Digitizing can be done in point mode, where single points are recorded one at a time, or in stream mode, where a point is collected at regular intervals of time or distance, measured by X and Y movement, e.g. a point is recorded every 3 metres of cursor travel.
Blind vs. on screen digitizing
Digitizing can also be done "blindly" or with a "graphics terminal". Blind digitizing implies that the graphic result is not immediately viewable to the person digitizing. Most systems display the digitized linework on an accompanying graphics terminal as it is being digitized.
Spaghetti mode of digitizing: This allows the user to simply digitize lines by
indicating a start point and an end point. Data can be captured in point or stream
mode. However, some systems do allow the user to capture the data in an arc/node
topological data structure. The arc/node data structure requires that the user identify nodes while digitizing. However, this does not negate the requirement for editing and cleaning of the digitized linework before a complete topological structure can be obtained.
Digitizing devices are very reliable and most often offer a greater precision than the data warrants.
For raster-based GIS software, data is still commonly digitized in a vector format and converted to a raster structure after the building of a clean topological structure. The procedure usually differs minimally from digitizing for vector-based software, other than that some raster systems allow the user to define the resolution (size) of the grid cell.
Conversion to the raster structure may occur on-the-fly or afterwards as a separate
conversion process.
2. Automatic Scanning
A variety of scanning devices exist for the automatic capture of spatial data. Their advantage is the ability to capture spatial features from a map at a rapid rate.
Disadvantages
• Scanners are generally expensive to acquire and operate.
• Most scanning devices have limitations with respect to the capture of selected features, e.g. text and symbol recognition.
• Hard-copy data may not be in a form that is viable for effective scanning, e.g. maps that are of poor quality or in poor condition; for instance, most cadastral maps, once scanned into the LIS, were in a poor, barely readable state.
• With raster scanning it is difficult to read unique labels (text) for a geographic feature effectively.
NOTE:
Consensus within the GIS community indicates that scanners work best when the
information on a map is kept very clean, very simple, and uncluttered (not congested)
with graphic symbology.
The sheer cost of scanning usually eliminates the possibility of using scanning methods
for data capture in most GIS implementations. Large data capture shops and
government agencies are those most likely to be using scanning technology.
3. Entry of Coordinates from Survey Data
This involves entering, from survey data, the explicit measurement of features from some known survey control.
Disadvantage
This input technique is obviously very costly and labour intensive. In fact, it is rarely
used for natural resource applications in GIS.
Advantage
This method is useful for creating very precise cartographic definitions of property, and
accordingly is more appropriate for land records management at the cadastral or
municipal scale. It is currently used to input data in the LIS at the district MZOs.
4. Conversion of Existing Digital Data
The most common digital data to be used in a GIS is data from CAD (computer aided drafting) systems. A number of data conversion programs exist, mostly from GIS software vendors, to transform data from CAD formats to a raster or topological GIS data format. Several specific standards for data exchange have been established in the marketplace.
E.g. Most GIS software vendors provide an ASCII data exchange format specific to
their product, and a programming subroutine library that will allow users to write their
own data conversion routines to fulfill their own specific needs.
Some of the data formats common to the GIS marketplace are listed below. Most
formats are only utilized for graphic data. Attribute data is usually handled as ASCII
text files.
DLG - Digital Line Graph (US Geological Survey): This ASCII format is used by the USGS as a distribution standard and consequently is well utilized in the United States. It is not used very much in Canada, even though most software vendors provide two-way conversion to DLG.
GENERATE - ARC/INFO Graphic Exchange Format: A generic ASCII format for spatial data used by the ARC/INFO software to accommodate generic spatial data.
The editing of spatial data is a time consuming, interactive process that can take as
long, if not longer, than the data input process itself.
Locational placement errors of spatial data. These types of errors usually are
the result of careless digitizing or poor quality of the original data source.
Distortion of the spatial data. This kind of error is usually caused by base maps
that are not scale-correct over the whole image, e.g. aerial photographs, or from
material stretch, e.g. paper documents.
Incorrect linkages between spatial and attribute data. This type of error is commonly the result of incorrect unique identifiers (labels) being assigned during manual keying (typing) or digitizing. This may involve assigning an entirely wrong label to a feature, or more than one label being assigned to a feature.
Attribute data is wrong or incomplete. Often the attribute data does not
match exactly with the spatial data. This is because they are frequently from
independent sources and often different time periods. Missing data records or too
many data records are the most common problems.
The identification of errors in spatial and attribute data is often difficult. Most spatial
errors become evident during the topological building process. The use of check plots
to clearly determine where spatial errors exist is a common practice.
Usually data is input by digitizing. Digitizing allows a user to trace spatial data from a
hard copy product, e.g. a map, and have it recorded by the computer software. Most
GIS software has utilities to clean the data and build a topologic structure.
If the data is unclean to start with, for whatever reason, the cleaning process can be
very lengthy.
Interactive editing of data is a distinct reality in the data input process. Experience
indicates that in the course of any GIS project 60 to 80 % of the time required to
complete the project is involved in the input, cleaning, linking, and verification of the
data.
The most common problems that occur in converting data into a topological structure include:
1. Slivers/flakes and gaps in the line work;
2. Dead ends, e.g. undershoots and overshoots in the line work;
3. Duplicate lines; and
4. Bow ties or weird polygons.
Note:
Topological errors only exist with linear and polygon features.
They are most evident with polygonal features.
SOLUTION
It is advisable to digitize data layers one at a time with respect to an existing
data layer, e.g. hydrography, rather than attempting to match data layers later.
A proper plan and definition of priorities for inputting data layers will save many
hours of interactive editing and cleaning.
Dead ends usually occur when data has been digitized in a spaghetti mode, or without
snapping to existing nodes. Most GIS software will clean up undershoots and
overshoots based on a user defined tolerance, e.g. distance.
SOLUTION
The definition of a proper tolerance for cleaning requires an understanding of the scale
and accuracy of the data set.
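The idea behind tolerance-based cleaning can be sketched as follows (plain Python; a crude illustration rather than any vendor's algorithm): every line endpoint that lies within the tolerance of an already-registered node is snapped onto it, closing small undershoots:

    from math import hypot

    def snap_endpoints(lines, tolerance):
        # lines is a list of vertex lists; snap each line's endpoints to
        # any existing node within `tolerance`, else register a new node.
        nodes = []
        for line in lines:
            for idx in (0, -1):                  # start and end vertex
                x, y = line[idx]
                for nx, ny in nodes:
                    if hypot(x - nx, y - ny) <= tolerance:
                        line[idx] = (nx, ny)     # snap to the existing node
                        break
                else:
                    nodes.append(line[idx])      # register a new node
        return lines

    roads = [[(0.0, 0.0), (10.0, 0.0)],
             [(10.05, 0.02), (10.0, 8.0)]]       # a 5 cm undershoot
    print(snap_endpoints(roads, tolerance=0.1))

Set the tolerance too large and distinct nodes are merged; too small and undershoots survive, which is why the tolerance must reflect the scale and accuracy of the data.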
Duplicate lines. These usually occur when data has been digitized or converted from a CAD system. The lack of topology in such drafting systems permits the inadvertent creation of elements that are exact duplicates. However, most GIS packages afford automatic elimination of duplicate elements during the topological building process; accordingly, this may not be a concern with vector-based GIS software. Users should be aware of the duplicate element that retraces itself, e.g. a three-vertex line where the first point is also the last point.
Bow ties or weird polygons. Some GIS packages do not identify these feature inconsistencies and will build such a feature as a valid polygon, because the topological definition is mathematically correct even though it is not geographically correct. Most GIS software will provide the capability to eliminate bow ties and slivers by means of a feature elimination command based on area, e.g. polygons less than 100 square metres.
Solution: Most GIS software contains functions that check for and clearly identify
problems of linkage during attempted operations.
Cleanup of lines and junctions. This process is usually done by software first and
interactive editing second.
Correction for distortion and warping. Most GIS software has functions for scale
correction and rubber sheeting. However, the distinct rubber sheet algorithm used will
vary depending on the spatial data model, vector or raster, employed by the GIS.
Some raster techniques may be more intensive than vector based algorithms.
Construction of polygons. Since the majority of data used in GIS is polygonal, the
construction of polygon features from lines/arcs is necessary. Usually this is done in
conjunction with the topological building process.
These data verification steps occur after the data input stage and prior to or during the
linkage of the spatial data to the attributes. Verification should include some brief
querying of attributes and cross checking against known values.
These files are accessed by binary search procedures. Instead of beginning the search at the beginning of the list, the record in the middle is examined first, and half of the remaining records are eliminated at each step. A binary search therefore requires approximately log2(n + 1) steps. E.g. if the file holds 10,000 names and it takes 1 second to examine each one, the time required is log2(10,001) ≈ 13.3 seconds, rather than up to 10,000 seconds for a sequential search.
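A minimal sketch of the procedure (Python, with a hypothetical file of parcel names) that also counts the steps taken:

    from math import log2

    def binary_search(sorted_names, target):
        # Examine the middle record first; halve the range each step.
        lo, hi, steps = 0, len(sorted_names) - 1, 0
        while lo <= hi:
            steps += 1
            mid = (lo + hi) // 2
            if sorted_names[mid] == target:
                return mid, steps
            if sorted_names[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1, steps

    names = sorted(f"parcel{i:05d}" for i in range(10_000))
    print(binary_search(names, "parcel07431"))  # found in at most ~14 steps
    print(log2(10_000 + 1))                     # ~13.3, the predicted bound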
Indexed files
Indexed files contain index tables holding a list of keys and the addresses of the corresponding records; for instance, a land parcel referenced to the particular volume and folio on which its data is stored. Access to the original data is fast. There are two sub-types of indexed files.
If the data items in the files provide the main order of the file, then these are called direct files.
Inverted files are those where the location of items in the main file can also be specified according to topic, normally given in a second file. Index files permit rapid access to databases.
DATABASE
A database is a collection of structured, non-redundant data "sharable" between different application systems. The definition highlights the essence of sharing data between the given application systems, thus calling for data consistency. A geo-database, hence, is a collection of structured, non-redundant geospatial data.
The simplest way to reduce the incidence of inconsistent data is to eliminate unnecessary duplication of data; this in turn implies that data should be stored as a common pool, sharable between application systems. This pool of data is the enterprise database.
Role of a database
A database or information system allows the user to add new information easily, and to retrieve, change or edit information easily, giving it a prime role in handling related data.
It comes with a number of useful functions:
• The database can be used by multiple users at the same time, i.e. it allows concurrent use.
• The database offers a number of techniques for storing data and allows use of the most efficient one, i.e. it supports storage optimization.
• The database allows rules to be imposed on the stored data, which are automatically checked after each update, i.e. it supports data integrity.
• The database offers an easy-to-use data manipulation language, which allows all sorts of data extraction and data updates, i.e. it has a query facility.
• The database will try to execute each query in the data manipulation language as efficiently as possible, i.e. it supports query optimization.
A Database Management System (DBMS) acts in support of data storage and processing.
Database design
Databases are made up of tables and of the ability to relate them.
A table is an elementary building block used to describe conceptual models, a "model" being a representation of the actual phenomena.
Database design involves identifying what is to be included in these tables, i.e. the entities. An entity is therefore a distinct object (a person, place, thing, concept or event) in the organization that is to be represented in the database. An "alias" is just another name for an entity.
Enterprise rules are rules that govern the data in a database and are applicable to the conceptual model of the database.
One-to-many (1:N) relationship: this exists when one row of the first table matches multiple rows in the second table, for example one parcel being owned by more than one person (many people), especially under the customary tenure system.
Many-to-many (N:M) relationship: this exists when one row in the first table matches multiple rows in the second table and one row in the second table matches multiple rows in the first table. A schema sketch of such relationships follows below.
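As a hedged sketch of how such relationships are realized in tables (Python's built-in sqlite3; the parcel/person table and column names are hypothetical), an N:M relationship is resolved through a link table whose rows pair identifiers from the two sides:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY, area_m2 REAL);
        CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
        -- link table: one parcel, many owners; one person, many parcels
        CREATE TABLE ownership (
            parcel_id INTEGER REFERENCES parcel(parcel_id),
            person_id INTEGER REFERENCES person(person_id),
            PRIMARY KEY (parcel_id, person_id)
        );
    """)
    conn.execute("INSERT INTO parcel VALUES (1, 2400.0)")
    conn.executemany("INSERT INTO person VALUES (?, ?)",
                     [(1, "Okello"), (2, "Namata")])
    conn.executemany("INSERT INTO ownership VALUES (?, ?)", [(1, 1), (1, 2)])

    rows = conn.execute("""
        SELECT parcel.parcel_id, person.name
        FROM parcel
        JOIN ownership ON ownership.parcel_id = parcel.parcel_id
        JOIN person ON person.person_id = ownership.person_id
    """).fetchall()
    print(rows)   # parcel 1 is owned by two people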
RELATIONAL DATABASE
The term relational database was originally defined and coined by Edgar Codd, at IBM's San Jose Research Laboratory, in 1970.
Key Terms
Relational database theory uses a set of mathematical terms, which are roughly
equivalent to Structured Query Language (SQL) database terminology.
The table below summarizes some of the most important relational database terms
and their SQL database equivalents.
Relational term      SQL equivalent
relation             table
tuple                row
attribute            column
Relations or Tables
A relation is defined as a set of tuples that have the same attributes. A tuple usually
represents an object and information about that object. Objects are typically physical
objects or concepts. A relation is usually described as a table, which is organized into
rows and columns. All the data referenced by an attribute are in the same domain and
conform to the same constraints.
The relational model specifies that the tuples (rows) of a relation have no specific
order (ordering of rows is not significant) and that the tuples, in turn, impose no order
on the attributes.
Applications access data by specifying queries, which use operations such as select to
identify tuples, project to identify attributes, and join to combine relations.
Relations can be modified using the insert, delete, and update operators. New tuples
can supply explicit values or be derived from a query. Similarly, queries identify tuples
for updating or deleting. It is necessary for each tuple of a relation to be uniquely
identifiable by some combination (one or more) of its attribute values. This
combination is referred to as the primary key.
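The three operations can be sketched in a few lines of plain Python (relations as lists of dicts; the parcel/owner relations are purely illustrative):

    # Relations as sets of tuples with named attributes (dicts).
    parcels = [{"id": 1, "use": "residential"}, {"id": 2, "use": "farming"}]
    owners  = [{"parcel": 1, "name": "Okello"}, {"parcel": 2, "name": "Namata"}]

    def select(rel, pred):            # identify tuples
        return [t for t in rel if pred(t)]

    def project(rel, attrs):          # identify attributes
        return [{a: t[a] for a in attrs} for t in rel]

    def join(r, s, pred):             # combine relations
        return [{**t, **u} for t in r for u in s if pred(t, u)]

    farms = select(parcels, lambda t: t["use"] == "farming")
    print(project(join(farms, owners, lambda t, u: t["id"] == u["parcel"]),
                  ["id", "name"]))    # [{'id': 2, 'name': 'Namata'}]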
2.4 TABLES
An example of tables and their characteristics is given below.
Attribute types
A table type contains only the table name and attribute types, but not attribute occurrences/values.
NORMALIZATION
A table that satisfies the above rules/restrictions is called a normalized table. One that violates them is called an unnormalized table. Unnormalized tables have redundant attribute values.
NULL VALUES
An attribute may be null, that is, "not yet known" or "not applicable". A null value in a table is usually represented by a blank, but this does not mean that null and blank are the same.
REDUNDANT DATA
This is any data which, if removed, will cause no loss of information; i.e. a data value is redundant if its deletion leads to no loss of information.
When a table contains no multiple values and no redundant data, we say the table is fully normalized.
DISADVANTAGES
1. The search is sequential, and a considerable amount of time can be spent searching a large database.
2. A relational database has to be very skillfully designed in order to support the required capabilities with reasonable speed; that is, it is very expensive to design a relational database structure.
3. Relational databases are not very good at storing more complex types of data.
4. Relational databases are not inherently set up to deal with spatial data.
Other relations do not store data, but are computed by applying relational
operations to other relations. These relations are sometimes called "derived
relations". In implementations, these are called "views" or "queries".
Derived relations are convenient in that though they may grab information from
several relations, they act as a single relation. Also, derived relations can be used as
an abstraction layer.
Domain
A domain describes the set of possible values for a given attribute. Because a domain constrains the attribute's values, it can be considered a constraint. Mathematically, attaching a domain to an attribute means that every value of the attribute must be an element of the specified set.
Topology
A core feature of Geographical Information Systems (GIS) is the ability to create and
manipulate topological data structures for vector-based data.
Hierarchical systems assume there is a good correlation between the key attributes
and the associated attributes.
Disadvantages are:
• Data access via unique identifiers is difficult for associated attributes.
• Large index files have to be maintained, and certain attribute values may have to be repeated many times, leading to data redundancy and thus increasing storage and access overheads.
Illustration
[Figure 1: Map M - two adjacent polygons I and II bounded by arcs a-g joining nodes 1-6; arc c is shared between them.]
[Figure 2: Polygons I and II (from M), shown separately with their bounding arcs.]
[Hierarchical structure: each polygon (I, II) points to its bounding arcs (I: a, b, c, d; II: c, e, f, g), and each arc points to its from- and to-nodes (a: 1-2, b: 2-3, c: 3-4, d: 4-1, e: 3-5, f: 5-6, g: 6-4).]
Advantage:
Network systems are very functional when tables or linkages can be specified beforehand (in the process avoiding data redundancy while making use of available data).
Disadvantage:
The database is enlarged by the overhead of pointers, which in complex systems can become quite a substantial part of the database. The problem is that these pointers must be updated every time a change is made to the database.
Data are stored in simple records called tuples; a complete row is called a record. For example, the polygons and arcs of map M above can be stored in two tables:
Polygons
Polygon   Arcs
I         a, b, c, d
II        c, e, f, g

Lines
Polygon   Arc   From-node   To-node
I         a     1           2
I         b     2           3
I         c     3           4
I         d     4           1
II        e     3           5
II        f     5           6
II        g     6           4
II        c     4           3
Characteristics
It consists of rows and columns.
When a row and a column intersect we call the place an attribute
occurrence.
Columns contain attribute types e.g. code, name, area etc.
These structures store no pointers and express no hierarchy. Instead, data are stored in simple records known as "tuples", containing an ordered set of attribute values grouped together in two-dimensional tables called relations. Each table or relation is usually a separate file.
Object-Oriented structure:
The object-oriented database model manages data through objects.
An object is a collection of data elements and operations that together are considered
a single entity. The object-oriented database is a relatively new data structure.
This approach has the attraction that querying is very natural, as features can be
bundled together with attributes at the database administrator's discretion. To date,
only a few GIS packages are promoting the use of this attribute data model.
The tables designed are fully "normalized", enhancing "referential integrity". In the process, constraints are defined, including primary identifiers (keys), posted identifiers, other unique identifiers, and check constraints. This ensures the integrity of the data and the removal of redundant data.
One such integrity rule, referential integrity, keeps the relationships between tables intact and unbroken in a relational database management system: it prohibits you from changing existing data in ways that invalidate or break the links between tables.
Referential integrity preserves the defined relationships between tables when records are added, modified or deleted by ensuring that identifier values are consistent across tables. Such consistency requires that there are no references to non-existent values and that, if an identifier value changes, all references to it change consistently throughout the database, so that an identifier cannot be changed in isolation.
Spatial analysis can also be defined as the process of extracting or creating new
information about a set of geographic features to perform routine examination,
assessment, evaluation, analysis or modeling of data in a geographic area based on
established and computerized criteria and standards.
Connectivity analysis
This is the analysis of connectivity between points, lines and polygons in terms of distance, area, travel time, optimum paths, etc. Examples include proximity by buffering and network analysis.
Proximity analysis
This is primarily concerned with closeness of one feature to another. Proximity analysis
means the ability to identify any feature that is near any other feature based on
location, attribute values, or specific distance. It may also be defined as,
“measurement of distances from points, lines and boundaries of polygons.”
Questions like the following are answered: "Which parcels are within 80 m of a certain road?" "Which parcels are within 60 m of the railway line?" "How many houses lie within 100 metres of this water main?" "What is the total number of customers within 10 kilometres of this store?" "What proportion of the alfalfa crop is within 500 metres of the well?"
To answer such questions, GIS technology uses a process called buffering to determine the proximity between features. Further questions of this kind include: "How many houses lie within 30 km of this reservoir?" and "What is the total number of patients within 20 km of a health care facility?"
GIS analysis answers them by creating buffers around selected features, for example a radius of 15 km around a health centre to denote a catchment area. Proximity analysis is not always based on distance; it can also be based on time.
Buffering
A buffer operation is one of the most common spatial analysis tools.
When creating a buffer, the user selects the features to buffer from as well as the distance to be buffered. The buffer operation creates a new polygon dataset, in which a polygon at the specified distance is drawn around the selected features of a layer.
Buffers are a frequently used analysis tool. In essence, buffers calculate distances from spatial objects and produce polygons that reflect the objects and the area around them. Buffer zones are frequently used to mitigate environmental hazards.
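A minimal buffering example, assuming the third-party shapely package is installed (the coordinates are invented, in metres):

    from shapely.geometry import Point, LineString

    health_centre = Point(450_000, 20_000)
    catchment = health_centre.buffer(15_000)   # 15 km catchment polygon

    water_main = LineString([(449_000, 18_500), (452_000, 21_000)])
    corridor = water_main.buffer(100)          # 100 m corridor polygon

    house = Point(450_200, 19_700)
    print(corridor.contains(house))  # does the house lie within 100 m?
    print(catchment.area)            # area of the catchment polygon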
Network analysis
This type of analysis examines how linear features are connected and how easily resources can flow through them. Many analyses can be carried out on a network, e.g. for transportation planning, utility management, airline scheduling, and navigation.
Overlay
An overlay process combines the features of two layers to create a new layer that contains
the attributes of both. This resulting layer can be analyzed to determine which features
overlap, or to find out how much of a feature is in one or more areas.
An overlay could be done to combine soil and vegetation layers to calculate the area of a
certain vegetation type on a specific type of soil.
Overlay analysis integrates spatial data layers with attribute data. This is done by
combining information from different GIS layers and finally deriving an attribute
(another layer).
In fact, all types of spatial objects can be overlain in order to analyse the spatial relationships between sets of objects and their surroundings.
Overlays/spatial joins can for example link land use and environmental data to
population and disease data. It could also integrate data of different types such as
soils, vegetation and land ownership.
Vector overlay
It is the combination of two separate spatial data sets to create a new output vector dataset. The results are similar to mathematical Venn diagram overlays.
In the process, map features and associated attributes are integrated to produce
composite maps.
Union overlay
It combines the geographical features and attribute tables of both inputs into a single new output.
Intersect overlay
It defines the area where both inputs overlap and retains a set of attribute fields for each.
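The geometric core of union and intersect overlays can be sketched with shapely (assumed installed; the attribute handling that a full GIS overlay also performs is omitted):

    from shapely.geometry import Polygon

    clay  = Polygon([(0, 0), (6, 0), (6, 4), (0, 4)])   # a soil feature
    maize = Polygon([(4, 2), (9, 2), (9, 7), (4, 7)])   # a vegetation feature

    both   = clay.intersection(maize)  # intersect overlay: common area only
    either = clay.union(maize)         # union overlay: combined extent

    print(both.area)    # area of maize growing on clay soil
    print(either.area)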
Data extraction
It is a GIS process similar to vector overlay, but it can be used in either vector or raster data analysis. Rather than combining the properties and features of both datasets, data extraction involves using a "clip" or "mask" to retain only those features that fall within a given extent.
One can perform analysis to obtain answers to a particular question or to find solutions to a particular problem.
Retrieval
GIS analysis allows the user to retrieve data. Retrieval occurs in both spatial and
attribute data.
Data retrieval involves the capability to easily select data for graphic or attribute
editing, updating, querying, analysis and/or display.
Retrieval involves the selection, search, manipulation and output of data without the requirement to modify the geographic location of the features involved. Often, data is selected by attribute and viewed graphically.
The ability to retrieve data is based on the unique structure of the DBMS and
command interfaces are commonly provided with the software. Most GIS software also
provides a programming subroutine library, or macro language, so the user can write
their own specific data retrieval routines if required.
Querying
This is the capability to retrieve data, usually a data subset, based on some user
defined formula. These data subsets are often referred to as logical views. Often the
querying is closely linked to the data manipulation and analysis subsystem.
Many GIS software offerings have attempted to standardize their querying capability by use of the Structured Query Language (SQL). This is especially true of systems that
make use of an external relational DBMS. Through the use of SQL, GIS software can
interface to a variety of different DBMS packages.
Reclassification
Reclassification involves the selection and presentation of a selected layer of data
based on the classes or values of a specific attribute e.g. cover group. It involves
looking at an attribute, or a series of attributes, for a single data layer and classifying
the data layer based on the range of values of the attribute.
Accordingly, features adjacent to one another that have a common value, e.g. cover
group, but differ in other characteristics, e.g. tree height, species, will be treated and
appear as one class.
The dissolving of map boundaries based on a specific attribute value often results in a
new data layer being created. This is often done for visual clarity in the creation of
derived maps. Almost all GIS software provides the capability to easily dissolve
boundaries based on the results of a reclassification. Some systems allow the user to
create a new data layer for the reclassification while others simply dissolve the
boundaries during data output.
Note;
The querying capability of the DBMS (software) is a necessity in the reclassification
process. The ability and process for displaying the results of reclassification, a map or
report, will vary depending on the GIS.
In some systems the querying process is independent from data display functions,
while in others they are integrated and querying is done in a graphics mode. The exact
process for undertaking a reclassification varies greatly from GIS to GIS.
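For a raster layer, reclassification amounts to mapping ranges of cell values onto class codes. A sketch with numpy (the height values and class breaks are hypothetical):

    import numpy as np

    heights = np.array([[1102, 1131, 1160],
                        [1095, 1148, 1175],
                        [1089, 1122, 1169]])

    # Reclassify the continuous height raster into three classes:
    # 1 = lowland (<1100), 2 = midland (1100-1149), 3 = highland (>=1150)
    classes = np.digitize(heights, bins=[1100, 1150]) + 1
    print(classes)

Adjacent cells that end up with the same class code merge visually into one zone, which is the raster analogue of dissolving boundaries.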
With network elements, that is, the lines that make up the network, extra values are commonly associated, such as distance, quality of the link, or carrying capacity.
A commonly used data structure in GIS software is the Triangulated Irregular Network,
or TIN. It is one of the standard implementation techniques for digital terrain models,
but it can be used to represent any continuous field.
Principle of a TIN
It is built from a set of locations (coordinated points) for which we have a
measurement, for instance an elevation.
The locations can be arbitrarily scattered in space.
From these three-dimensional points, an irregular tessellation made of triangles is constructed.
This is illustrated in figure 2.8. Two such tessellations are illustrated in figure 2.9.
NOTE:
In three-dimensional space, three points uniquely determine a plane, as long as they are not collinear, i.e. they must not be positioned on the same line.
A plane fitted through these points has a fixed aspect and gradient, and can be used to compute an approximation of the elevation at other locations.
A TIN clearly is a vector representation: each anchor point has a stored reference
(coordinate). It is also called an irregular tessellation, as the chosen triangulation
provides a partitioning of the entire study space.
However, in this case the cells do not have an associated stored value, as is typical of tessellations; rather, each triangle carries a simple interpolation function that uses the elevation values of its three anchor points.
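A sketch of that interpolation for a single triangle (Python with numpy; the anchor points are invented): three non-collinear points determine the plane z = ax + by + c, whose coefficients then give the elevation anywhere in the triangle as well as its gradient:

    import numpy as np

    # Three anchor points of one TIN triangle: (x, y, elevation).
    p1, p2, p3 = (0.0, 0.0, 100.0), (10.0, 0.0, 120.0), (0.0, 10.0, 110.0)

    # Solve for the plane z = a*x + b*y + c through the three points.
    A = np.array([[p[0], p[1], 1.0] for p in (p1, p2, p3)])
    z = np.array([p[2] for p in (p1, p2, p3)])
    a, b, c = np.linalg.solve(A, z)

    def interpolate(x, y):
        # Approximate the elevation at a point inside the triangle.
        return a * x + b * y + c

    print(interpolate(3.0, 3.0))  # elevation estimate at an interior point
    print(a, b)                   # gradient components of the fitted plane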
DATA QUALITY
As Geo-information is intended to reduce uncertainty in decision making, any errors
and uncertainties in spatial information products may have practical, financial and even
legal implications for the user.
For this reason, it is important that those involved in the acquisition and processing of spatial data are able to assess the quality of the base data and of the derived information products.
The collection and maintenance of base data remains the responsibility of various governmental agencies, such as National Mapping Agencies (NMAs), which are responsible for collecting topographic data for the entire country following pre-set standards. Utility supply companies, local government departments and many others all collect and maintain spatial data for their own particular purposes.
If data is to be shared among different users, these users need to know not only what
data exists, where and in what format it is held, but also whether the data meets their
particular quality requirements. This “data about data” is known as metadata.
Since the real power of GIS lies in its ability to combine and analyze georeferenced data from a range of sources, we must pay attention to issues of data quality and error, as data from different sources are likely to contain different kinds of error. Assessment of data quality plays an important role for several reasons:
1. Even when source data, such as official topographic maps have been subject to
stringent quality control, errors are introduced when these data are input to GIS.
2. Unlike a conventional map, which is essentially a single product, a GIS database
normally contains data from different sources of varying quality.
3. Unlike a topographic or cadastral database, a natural resource database contains data that are inherently uncertain and therefore not suited to conventional quality control procedures.
4. Most GIS analysis operations will themselves introduce errors.
DATA PRESENTATION/OUTPUT
The end result is best visualized as maps, images, 3D views, reports or graphs. Maps are
efficient for storing and communicating geographic information. GIS provides new and
exciting tools to extend the art and science of map making. Maps can be integrated with
reports, three-dimensional views, photographic images, and other digital media.
Sharing the results of your geographical analysis is one of the primary justifications
for investing resources in GIS. Taking displays created through a GIS and
outputting them into distributable formats is a great way to do this.
The more avenues for output a GIS can offer, the greater the potential for
reaching the right audience with the right information.
MAP PROJECTION
This is a system by which locations on the curved surface of the earth are displayed on a flat surface. There are 3 basic types of map projections, which include the following.
a. Planar projections: also called azimuthal projections. Here the curved surface of the earth is projected onto a plane: a flat surface is placed in contact with the globe, e.g. at a pole, and all points are projected onto that flat surface.
b. Conical projections: here the projection is made onto the surface of a cone tangent to the globe at a circle of latitude.
c. Cylindrical projections: here the projection is made onto the surface of a cylinder tangent to the globe, e.g. along the equator or, in the transverse case, along a meridian. The Universal Transverse Mercator (UTM) system described below is based on a transverse cylindrical projection.
[Figure: the UTM system of zones and latitude bands, showing bands lettered C, D, E, F, ... with parallels at 80° and 60°.]
The latitude bands are lettered upwards from C to X. UTM coordinates are expressed in metres as eastings and northings. The central meridian of each zone is given an easting of 500,000 m at the start. Northings are measured from the equator: for the northern hemisphere the equator is given the value zero (0), while for the southern hemisphere the equator is given a value of 10,000,000 metres. The UTM system only works between 84°N and 80°S; beyond these latitudes it is not used, as distortions become too great.
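A short example of geographic-to-UTM conversion, assuming the third-party pyproj package is installed (EPSG:32636 is UTM zone 36N, which covers much of Uganda):

    from pyproj import Transformer

    # WGS84 geographic coordinates (lon/lat) to UTM zone 36N.
    to_utm = Transformer.from_crs("EPSG:4326", "EPSG:32636", always_xy=True)

    print(to_utm.transform(32.58, 0.31))  # approx. Kampala -> (easting, northing)
    # A point on the zone's central meridian (33 deg E) gets an easting
    # of exactly 500,000 m:
    print(to_utm.transform(33.0, 0.31)[0])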
Advantages of UTM
a. It is the most frequently used system in mapping.
b. It is a universal approach to geo-referencing.
c. It is consistent over most parts of the globe.
Disadvantages of UTM