
GIS abstracts

The document provides an extensive overview of raster and vector data representations in GIS, detailing their principles, pros and cons, and typical applications. It covers various raster formats, visualization techniques, georeferencing methods, classification methods, and map algebra operations, emphasizing the importance of spatial interpolation and digital terrain models. Additionally, it discusses different classification methods and their properties, as well as the principles of local, focal, and zonal map algebra.


Theoretical background

You will be asked two questions from the following topics:

1. Principles of raster data


Raster and vector representation of geographic phenomena. Principles, differences, specifics, pros and
cons of each representation.
Geographical phenomena: (Discrete) objects, Continuous fields, Discrete fields
Data models: Vector and Raster

Raster:
Geometry: regular tessellation; elementary units (cells/pixels of a grid, voxels in 3D), where each cell represents a
discrete unit of space and holds a value corresponding to a specific attribute.
It may be less suitable for representing precise boundaries and complex features due to its
gridded structure.
Raster's header: coordinate system, coordinates of one corner, number of columns, number of rows, cell
size. The raster's orientation is always collinear with the coordinate system axes!
Principles:
- Utilizes a grid structure to organize data into cells.
- Suitable for continuous and regularly distributed phenomena.
- Resolution, determined by the size of the cells, influences the level of detail in the
representation.
- Raster data is well-suited for spatial analysis involving mathematical operations, interpolation,
overlay analysis, and proximity analysis.

Example: Continuous data such as elevation models, satellite imagery, land cover, temperature
distributions

Pros:
Efficient for continuous data representation.
Well-suited for spatial analysis.
Commonly used in remote sensing applications.
Cons:
Can be memory-intensive for large datasets.
Challenges in representing precise boundaries.
Limited in capturing complex geometric features.

Vector:
Geometry: points, lines, and polygons to represent features on the Earth's surface.
Points - individual locations
Lines - linear features
Polygons - enclosed areas.
It may be less suitable for continuous and regularly distributed phenomena.
Principles:
- Utilizes points, lines, and polygons to represent features.
- Suitable for representing complex features with well-defined boundaries.
- Maintains topological spatial connectivity and adjacency.
- Versatile for various applications with attribute data.
Example: Administrative boundaries, road networks, land parcels.

Pros:
Well-suited for precise boundaries and complex features.
Maintains topological relationships.
Versatile with attribute data for comprehensive analysis.
Cons:
May not be efficient for continuous and regularly distributed phenomena.
Complex geometric features may be challenging to represent.
Certain spatial analyses may be computationally intensive.

Basic types of rasters and their specifics and typical applications.


Raster's attributes: continuous vs. discrete rasters; single-band vs. multiband rasters

Raster formats and their specifics.


Pictures (rasters of colors): JPEG, PNG – store information about the color of each pixel, typically using
RGB channels. Used for photographs, digital images, and graphics where color information is
essential. Suitable for visualizations and mapping where color distinctions are important.
JPEG2000 – provides high compression efficiency with minimal loss of quality; suitable for both lossless
and lossy compression.

Rasters of values: GeoTIFF, single-band TIFF – represent data using a single band (channel),
typically containing values of a specific attribute like elevation, temperature or intensity. Used for
elevation models, satellite imagery with a single spectral band (e.g., black-and-white infrared),
and datasets representing various physical measurements.
GeoTIFF – supports georeferencing information, allowing spatial referencing of the raster data; can store
both image and associated metadata; widely used for satellite imagery and other remote sensing data.

Esri Grid – stores grid datasets, consisting of multiple files in a specific folder structure. Used in Esri's
ArcGIS software for various types of spatial analysis, including terrain modeling, hydrology,
and environmental modeling.
USGS DEM, Erdas Imagine(.img), Anything + World file

Raster visualization - what are pyramids, what rendering techniques you know.
Pyramids - a multi-resolution representation of a raster dataset. The purpose of creating
pyramids is to optimize the display and analysis of raster data at different zoom levels.
Rendering – the process of generating images from spatial data, displaying raster on the screen using
colors: RGB Composite, Stretched, Classified, Unique Values, Colormap, Discrete Color
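The idea behind pyramids can be sketched in a few lines: each overview level halves the resolution by aggregating 2x2 blocks of cells. This is a minimal pure-Python sketch (plain lists stand in for a real raster; real software also chooses among aggregation rules such as nearest neighbor or mean):

```python
# Minimal sketch of raster pyramid levels: each level halves the
# resolution by aggregating 2x2 blocks of cells (here: their mean).
# Plain Python lists stand in for a real raster; names are illustrative.

def downsample(grid):
    """Aggregate each 2x2 block into one cell (mean), halving each dimension."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - 1, 2):
        out.append([
            (grid[r][c] + grid[r][c + 1] +
             grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ])
    return out

def build_pyramid(grid, levels):
    """Return the full-resolution raster plus coarser overview levels."""
    pyramid = [grid]
    for _ in range(levels):
        grid = downsample(grid)
        pyramid.append(grid)
    return pyramid

raster = [[1, 1, 3, 3],
          [1, 1, 3, 3],
          [5, 5, 7, 7],
          [5, 5, 7, 7]]
levels = build_pyramid(raster, 2)
# levels[1] is 2x2: [[1.0, 3.0], [5.0, 7.0]]; levels[2] is 1x1: [[4.0]]
```

When zoomed out, the viewer reads a coarse level instead of the full-resolution grid, which is why display stays fast at any zoom.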

Raster georeference - how it works, what is a "world file", what is raster resampling.
Raster resampling – the process of overlaying a raster with a new grid and interpolating the values for this
new grid. It happens when:
a) georeferencing and rectifying a raster
b) changing the coordinate system (Project Raster)
c) shifting the raster position (Snap Raster)
d) changing the raster resolution (Resample)
Interpolation methods:
- nearest neighbor method – for discrete rasters
- bilinear interpolation
- bicubic interpolation (Cubic Convolution)
Raster georeferencing is the process of assigning spatial coordinates (latitude and longitude) to a raster
dataset so that it aligns correctly with a specific location on the Earth's surface. This allows for accurate
mapping, spatial analysis, and integration with other geospatial data. The georeferencing process involves
associating each pixel in the raster with its corresponding geographic location.
Key steps in raster georeferencing include identifying control points in the raster (known locations with
known coordinates), transforming the raster to match those control points, and defining a coordinate system
for the raster data.
A "world file" is a common method for storing georeferencing information associated with raster datasets.
It is a plain text file with a predefined format that contains parameters to define the location, rotation, and
scale of the raster. The world file has the same filename as the raster image but with a different extension
based on the image format (e.g., .tfw for TIFF, .jgw for JPEG).
World file or AUX.XML file – externally saved georeference; the original raster is unchanged.
Rectification – a new raster is created, resampled to be collinear with the coordinate system; the georeference is
saved either internally (Esri Grid, GeoTIFF etc.) or externally (World File with no rotation).
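The six world-file lines define an affine transform from pixel (column, row) to world (x, y). A minimal sketch of parsing one and applying the transform (the sample values are illustrative, not from a real dataset):

```python
# A world file is six plain-text lines: A (x cell size), D and B (rotation
# terms), E (y cell size, negative), C and F (world coordinates of the
# center of the upper-left pixel). Sketch with made-up sample values.

def parse_world_file(text):
    """Return (A, D, B, E, C, F) in the order they appear in the file."""
    a, d, b, e, c, f = (float(line) for line in text.split())
    return a, d, b, e, c, f

def pixel_to_world(col, row, params):
    """Affine transform: x = A*col + B*row + C; y = D*col + E*row + F."""
    a, d, b, e, c, f = params
    return a * col + b * row + c, d * col + e * row + f

sample_tfw = """10.0
0.0
0.0
-10.0
500000.0
4100000.0"""  # 10 m cells, no rotation, upper-left pixel center at (500000, 4100000)

params = parse_world_file(sample_tfw)
print(pixel_to_world(3, 2, params))   # (500030.0, 4099980.0)
```

Note the negative E term: row numbers grow downward while world y grows upward, which is why moving down two rows subtracts 20 m here.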
2. Classification methods
What classification methods do you know? What are their properties and differences? When would you use
each of them?
Classification - a process in which a group of values (referred to as "observed values" or "observations"
further in the text) is divided into subgroups called "classes". It consists in dividing the range of values
into several non-overlapping intervals (classes) and determining which values belong to a particular
interval (class). In GIS, classification is used for displaying a numerical attribute by drawing the features
from each class with a different symbol. The various classification methods differ in the way they determine
the class breaks (i.e. the upper and lower bounds of the intervals).

Equal intervals
Divides the whole range of values into equally sized intervals.
Equal Interval: 1) define the number of classes
2) the range of values is divided into the specified number of equally sized intervals;
ignores the true distribution of the data along the number line.
Defined Interval: 1) define the size (i.e. width) of the intervals
2) the intervals are placed next to one another until they cover the whole range of
values.
+++
- the classes are easily interpretable, especially in the case of
rounded class breaks
- the outlying (i.e. extreme) values are well represented, having usually their own class
----
- if the data are not regularly distributed but rather crowded, most of the data appear in one or a few
classes, whereas other classes can even be empty
- visually this leads to a map with only a few colors covering most of the area

Quantiles:
Creates classes that all contain the same (or almost the same) number of observations.
+++
- easy to interpret
----
- in the parts of the number line where the data are more scattered, the requirement of the same number of
observations in each class leads to wide classes, grouping together very distant values
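The two methods above can be sketched in a few lines on a toy attribute list. This is illustrative pure Python; exact break-placement conventions vary between GIS packages:

```python
# Class breaks for Equal Interval and Quantile classification on a small,
# skewed toy dataset. Breaks are the upper bounds of each class.

def equal_interval_breaks(values, n_classes):
    """Upper bounds of n equally sized intervals spanning the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * k for k in range(1, n_classes + 1)]

def quantile_breaks(values, n_classes):
    """Upper bounds chosen so each class holds (almost) the same count."""
    data = sorted(values)
    n = len(data)
    inner = [data[min(n - 1, (n * k) // n_classes - 1)]
             for k in range(1, n_classes)]
    return inner + [data[-1]]

values = [1, 2, 2, 3, 4, 10, 20, 40, 80, 100]
print(equal_interval_breaks(values, 4))  # [25.75, 50.5, 75.25, 100.0]
print(quantile_breaks(values, 4))        # [2, 4, 20, 100]
```

Note how the skewed data expose each method's weakness: Equal Interval crowds 6 of 10 observations into the first class, while Quantile's last class spans the distant values 40-100.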

Mean and Standard Deviations


Standard deviation classification focuses on the variation of the data around the mean.
The classification is fully automatic, with no possibility to define the number of classes. The classes are
determined so that they all have the same width of one standard deviation, and one of the classes is centered
on the mean.
The first and the last class are usually enlarged to include all the outlying observations.
It is usual to use a two-color scale for a map based on this classification, to reflect how the data vary above
and below the mean.
+++
----
- it works well only when the data are normally (or nearly normally) distributed, or at least when the
distribution is symmetrical around the mean
- for strongly non-normally distributed data, the result is usually visually unattractive, uninformative, and
sometimes even misleading, as some of the classes may be almost or totally empty

Geometrical Interval
Designed for data that are accumulated around some typical value and become more and more scattered
with greater distance from the center of the main crowd.
The class widths form a geometrical sequence. (Recall that a geometrical sequence is a sequence of
numbers such that each number equals the previous one multiplied by some constant. This constant is
called the coefficient of the sequence.)
The coefficient of the sequence is determined so that the numbers of observations in the different classes are
as similar as possible (ideally the same, which would be the case for geometrically distributed data).
Moreover, the coefficient can be at some point replaced by its inverse value.
The resulting classification thus contains a dividing point from which the class widths gradually increase (or
decrease) both to the right and to the left.
The determination of the coefficient, as well as of the position of the dividing point where it switches to its
inverse, is realized automatically using an optimization algorithm; the user has only the option to specify the
number of classes.
In contrast with the Quantile method, it preserves some kind of order in the classes, namely that each class
width equals the previous class width multiplied by the coefficient.
The class breaks, however, are usually non-rounded and may thus look confusing to a map reader.
+++
----
- sometimes create wide classes grouping distant observations into one class.

Natural Breaks
Tries to overcome all these limitations.
It reflects the true distribution of the data along the number line, but instead of fitting a specific distribution to
them (normal or geometrical), it determines the "natural" breaks or gaps between the observations, or more
accurately, it tries to group together only similar observations.
The determination of which observations should be grouped into a single class is based on an objective
criterion: in the optimal classification, the variation of values inside each class should be
minimal (i.e. as small as possible). The variation can be assessed using any variation measure, such as
absolute deviation or standard deviation. In the optimization process, the method repeatedly tries
different classifications, and for each classification it computes the total intra-class variation by
summing the variations in the individual classes (i.e. how the observations inside the class vary around
the class median or mean). Once the classification with the minimum possible total intra-class variation
is found, the process terminates. Because of this optimization nature, this method is sometimes
referred to as the "Optimal method".
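The optimization idea can be sketched as a brute-force search for the single best break (two classes): try every split point and keep the one with the smallest total intra-class variation. Real implementations (Jenks natural breaks) use dynamic programming to handle more classes efficiently; this toy version only illustrates the criterion:

```python
# Brute-force "natural breaks" for two classes: minimize the total
# intra-class variation, measured here as the sum of squared deviations
# from each class mean. Toy data; not the full Jenks algorithm.

def sse(cls):
    """Sum of squared deviations of a class around its mean."""
    mean = sum(cls) / len(cls)
    return sum((v - mean) ** 2 for v in cls)

def best_single_break(values):
    """Split sorted values into two classes minimizing total intra-class SSE."""
    data = sorted(values)
    best = None
    for i in range(1, len(data)):
        total = sse(data[:i]) + sse(data[i:])
        if best is None or total < best[0]:
            best = (total, data[i - 1])  # break = upper bound of first class
    return best[1]

values = [1, 2, 2, 3, 50, 51, 53]
print(best_single_break(values))   # 3 -- the "natural" gap between 3 and 50
```

The search lands on the obvious gap in the data, which is exactly what "grouping together only similar observations" means in practice.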
3. Map algebra
Map algebra = collection of operations on rasters
Local map algebra - types of operations, tools in ArcGIS, examples of use.
Local map algebra: working on each cell independently without considering its neighbours / evaluated on
a cell-by-cell basis / inputs are usually 2 or more rasters
Example: adding elevation values of two cells together / subtracting rainfall values of one cell from
another / checking if the temperature in a cell is higher than a certain threshold
Types of operations:
- arithmetic -> binary: + - * / % **; unary: Abs() Exp() Log() Int() Sin() Cos()
- comparison -> < <= > >= == InList(...); output is a binary ("logical") raster: 1 = true, 0 = false
- logical -> AND OR XOR NOT; both input and output are binary ("logical") rasters
- combinatorial -> combinatorial AND, combinatorial OR; 0 = false; 1, 2, 3, ... = different kinds of "truth"
- statistical -> sum mean sd median max min variety; 2 or more input rasters
Tools in ArcGIS: Raster Calculator, Spatial Analyst toolbox
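The cell-by-cell principle can be sketched with plain lists standing in for rasters (in ArcGIS the Raster Calculator applies such expressions to whole rasters at once; the grids and names here are invented):

```python
# Local map algebra sketch: each output cell depends only on the values
# of the corresponding cells in the input rasters. Toy 2x2 grids.

def local_op(raster_a, raster_b, fn):
    """Apply a cell-by-cell binary operation to two aligned rasters."""
    return [[fn(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(raster_a, raster_b)]

rain_a = [[100, 120], [140, 160]]
rain_b = [[10, 20], [30, 40]]

total = local_op(rain_a, rain_b, lambda a, b: a + b)       # arithmetic
wetter = local_op(rain_a, rain_b, lambda a, b: int(a > b)) # comparison -> 0/1 raster
print(total)    # [[110, 140], [170, 200]]
print(wetter)   # [[1, 1], [1, 1]]
```

Note that the comparison returns a binary ("logical") raster of 1 = true, 0 = false, matching the operation list above.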


Focal map algebra - principles, tools, examples.
Focal map algebra: working on the group of cells around each central cell, with a focus on the neighborhood
of each cell in a grid / input is usually a single raster of values / output raster: the cell value is computed from a
neighborhood of the corresponding input-raster cell
Examples: calculating the average elevation within a certain radius of each cell / identifying areas with high
or low concentrations of features / applying a smoothing filter to remove noise from the data
Types of operations:
- block statistics -> mean median sd min max, ...; neighborhoods do not overlap; application: changing raster resolution
- focal statistics -> mean median sd min max, ...; neighborhoods do overlap ("moving window"); application: smoothing a terrain
- filters -> high pass / low pass; highlighting / suppressing changes; application: highlighting edges
Tools in ArcGIS: Focal Statistics, Focal mean, Focal sum ....
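A 3x3 moving-window mean, the classic focal smoothing operation, can be sketched like this (edge cells here use whatever neighbors exist, which is one common convention among several):

```python
# Focal (moving window) statistics sketch: each output cell is the mean
# of the 3x3 neighborhood around the corresponding input cell.

def focal_mean(grid):
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Collect the neighborhood, clipped at the raster edges.
            neigh = [grid[rr][cc]
                     for rr in range(max(0, r - 1), min(rows, r + 2))
                     for cc in range(max(0, c - 1), min(cols, c + 2))]
            row.append(sum(neigh) / len(neigh))
        out.append(row)
    return out

noisy = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
print(focal_mean(noisy))   # the spike at the center is smoothed to 1.0
```

Because the windows overlap, the single noisy spike is spread out and damped, which is exactly the smoothing effect used on terrain rasters.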

Zonal map algebra - principles, tools, examples.


Zonal map algebra: working on groups of cells that belong to defined zones/regions within a raster dataset /
calculation is made inside zones / zones: polygons, discrete raster
Examples: calculating the average temperature within each administrative district, determining the total
population within each land cover type, finding the maximum elevation within each watershed
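The first example above (average temperature per district) can be sketched with a value raster and an aligned zone raster, in the spirit of ArcGIS's Zonal Statistics tool (toy data, illustrative zone ids):

```python
# Zonal statistics sketch: accumulate the value raster per zone id from
# the aligned zone raster, then compute the statistic (here: mean).

from collections import defaultdict

def zonal_mean(values, zones):
    sums, counts = defaultdict(float), defaultdict(int)
    for row_v, row_z in zip(values, zones):
        for v, z in zip(row_v, row_z):
            sums[z] += v
            counts[z] += 1
    return {z: sums[z] / counts[z] for z in sums}

temperature = [[10, 12, 20],
               [11, 13, 22],
               [12, 14, 24]]
district =    [[1, 1, 2],
               [1, 1, 2],
               [1, 1, 2]]
print(zonal_mean(temperature, district))   # {1: 12.0, 2: 22.0}
```

Unlike focal operations, the grouping is defined by the zone raster rather than by a geometric neighborhood, so a zone may even consist of scattered, non-contiguous cells.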
4. Spatial interpolation
Interpolation – estimation of a value of an unknown function at some point based on known values at
surrounding points
Nearest neighbor - principle, characteristics of the output surface.
Nearest neighbor assigns to each cell the value of the nearest sample point. The plane is thereby divided
into Thiessen polygons (TP), one per sample point, and every cell inside a polygon takes that point's value.
Principle:
- a Thiessen polygon is constructed around each point
- all locations inside a TP are closer to that point than to any other point
- leads to discontinuous surfaces with abrupt changes at polygon boundaries
- used e.g. in meteorology
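The principle reduces to a nearest-point lookup per cell, which is why the output is flat inside each polygon and jumps at the boundaries. A minimal sketch with invented sample points:

```python
# Nearest-neighbor (Thiessen polygon) interpolation sketch: every cell
# takes the value of its nearest sample point.

import math

def nearest_neighbor(cell_xy, points):
    """points: list of ((x, y), value); return the value of the closest point."""
    return min(points, key=lambda p: math.dist(cell_xy, p[0]))[1]

samples = [((0.0, 0.0), 5.0), ((10.0, 0.0), 9.0)]
print(nearest_neighbor((2.0, 1.0), samples))   # 5.0 -- inside the first polygon
print(nearest_neighbor((8.0, 1.0), samples))   # 9.0 -- inside the second polygon
```

Evaluating this over a whole grid yields constant plateaus per polygon: the discontinuous surface described above.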
Triangulation - principle, characteristics of the output surface.
Triangulation creates a TIN by dividing a set of elevation points into triangles, resulting in a vector-based
representation of the terrain surface with irregular triangle shapes and continuous elevation values.
- the result is bounded by the convex hull of the input points
Principle:
- points are connected to form irregular triangles; the cell value is determined using linear interpolation on the
corresponding triangle
- Delaunay triangulation: no point lies inside the circle circumscribing any triangle (leads to "reasonable"
triangle shapes)
- connecting the circumcircle centers forms the Voronoi diagram
- leads to a continuous but not smooth, locally linear surface
- interpolated values are inside the range of measured values
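The linear-interpolation step can be sketched for a single triangle: the value at a query point is a blend of the three vertex values, weighted by barycentric coordinates (the vertex coordinates and elevations here are invented for illustration):

```python
# Linear interpolation on one TIN triangle via barycentric coordinates.
# The three weights sum to 1, so the result stays within the vertex values.

def barycentric(p, a, b, c):
    """Barycentric coordinates of point p in triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    wa = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    wb = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return wa, wb, 1.0 - wa - wb

def tin_interpolate(p, tri, values):
    """Linearly interpolate the vertex values at point p inside triangle tri."""
    wa, wb, wc = barycentric(p, *tri)
    return wa * values[0] + wb * values[1] + wc * values[2]

triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
elevations = [100.0, 108.0, 104.0]
print(tin_interpolate((1.0, 1.0), triangle, elevations))   # 103.0
```

Because the weights are non-negative inside the triangle and sum to 1, the interpolated value cannot leave the range of the measured vertex values, as stated above.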

Natural neighbors - principle, characteristics of the output surface.


Calculates values at unknown locations by considering the influence of nearby sample points.
- resulting surface is smooth and its values are inside the range of measured values
- the result is bounded by a convex hull of input points
Principle:
- the cell value is computed as a weighted average from those points, whose Thiessen polygon intersects
with the Thiessen polygon constructed around the cell center
- weights are proportional to those intersections

IDW - principle, characteristics of the output surface.


- the resulting terrain is a smooth surface, with values inside the range of measured values
Principles:
- the cell value is computed as a weighted average from surrounding points
- weights are proportional to the inverse distances of the points from the cell center (Inverse Distance
Weighted average)

Parameter p:
- the higher the values, the more abrupt are the changes between the values (for p → ∞, IDW approaches
Thiessen polygons)
- the lower the value, the more smoothed the resulting terrain
is (for p → 0, IDW approaches a moving average)
Neighborhood size: the larger the neighborhood, the more smoothed the terrain
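The whole method fits in a short sketch: weights are 1 / distance**p, so a larger p lets the nearest points dominate (toward Thiessen polygons), while a smaller p approaches a plain moving average. Sample points are invented:

```python
# IDW sketch: interpolated value = weighted average of sample values,
# with weights 1 / distance**p.

import math

def idw(cell_xy, points, p=2.0):
    """points: list of ((x, y), value); p: distance-decay exponent."""
    num = den = 0.0
    for xy, value in points:
        d = math.dist(cell_xy, xy)
        if d == 0.0:
            return value          # exact hit on a sample point
        w = 1.0 / d ** p
        num += w * value
        den += w
    return num / den

samples = [((0.0, 0.0), 10.0), ((4.0, 0.0), 30.0)]
print(idw((1.0, 0.0), samples, p=2.0))   # ~12.0 -- pulled toward the nearer sample
```

With p=2 the point at distance 1 gets weight 1 versus 1/9 for the point at distance 3, so the result sits much closer to 10 than to 30; raising p pushes it closer still.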

Spline - principle, characteristics of the output surface.


Calculates values at unknown locations based on a set of known points
- resulting surface is smooth
Principle:
- Low-order polynomial surfaces locally fitted to data points
- parameters of the polynomials are found to meet the following criteria:
1. resulting surface passes exactly through the data points
2. resulting surface is smooth
These requirements lead to ∞ solutions!
→ additional condition is needed.
- version 1: thin plate spline
- version 2: spline with tension
- version 3: regularized spline

TopoToRaster - principle and specifics.


Converts topographic data, typically represented as contour lines, into raster format, resulting in a
continuous surface representation of terrain in GIS.
Principle:
- interpolation technique based on spline
- developed to create hydrologically correct DTMs
- originally a program called ANUDEM (Hutchinson, 1988)
- enables one to interpolate from contours (the only method in ArcGIS that does!)
- additional input data (e.g. hydrological) can be used

What is the major difference (from an applied point of view) between IDW and Spline? When would you
choose the former and when the latter?
IDW -> calculates values based on the weighted average of nearby points, leading to sharp transitions
between areas.
Spline -> calculates smooth curves/surfaces that pass through known data points.
IDW: when I need a simple and quick interpolation method with evenly distributed sample points
Spline: when I need a smoother and more accurate representation of the surface, especially for complex
data patterns or irregularly distributed sample points
5. Digital terrain models
What are the basic techniques to acquire elevation data? What are their pros and cons?

What is a digital terrain model and what is a digital surface model? How to acquire them?
DTM – digital representation of the Earth’s surface that focuses on the natural topography of the terrain,
including hills, valleys and other landforms, without including features such as buildings, vegetation or
infrastructure.
Used for: terrain analysis, slope calculations, watershed delineation and other applications where the
emphasis is on the natural landscape.
To acquire: elevation data is collected using LiDAR, photogrammetry, GPS surveys or ground-based
surveys. These data are then processed to remove features (buildings and vegetation), resulting in a
representation of the bare ground surface.
DSM – digital representation of the Earth's surface including all features present on the terrain (buildings,
trees and other structures) in addition to the bare ground surface.
Used for: a comprehensive view of the terrain, including both natural and man-made features;
applications: urban planning, 3D visualization, telecommunications planning.
To acquire: elevation data is collected using the same techniques as for a DTM, but it is not
processed to remove features (buildings and vegetation), resulting in a representation of the entire surface
including all above-ground features.
What global/continental elevation models do you know? What are their properties?
6. Digital terrain analysis
Overview of tasks - what problems can you solve using digital terrain models?
How are the slope and aspect computed in ArcGIS? What other approaches do you know?
Explain the formula for hillshade.
Explain how visibility analysis works and how to combine DTM and DSM in their computation (the problem
of forests).
Final grade

You can get a maximum of 30 points from the exam (20 points from the project plus
10 points from the theory). The final grade will be determined as follows:
- 27 - 30 points … 1
- 22 - 26 points … 2
- 18 - 21 points … 3
- Less than 18 points …
