GIS abstracts
Raster:
Geometry: regular tessellation built from elementary units (grid cells, pixels, or voxels), where each cell represents a
discrete unit of space and holds a value corresponding to a specific attribute.
It may be less suitable for representing precise boundaries and complex features due to its
gridded structure.
Raster’s header: coordinate system, coordinates of one corner, number of columns, number of rows, cell
size. Raster’s orientation is always collinear with the coordinate system axes!
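The header items listed above are enough to locate any cell. A minimal sketch, assuming the stored corner is the lower-left one and that row 0 is the top row (both are assumptions; `RasterHeader` and `cell_center` are hypothetical names, not part of any GIS library):

```python
from dataclasses import dataclass


@dataclass
class RasterHeader:
    # Hypothetical header layout; the corner is assumed to be the lower-left one.
    ncols: int
    nrows: int
    xll: float       # x coordinate of the lower-left corner
    yll: float       # y coordinate of the lower-left corner
    cellsize: float


def cell_center(h, row, col):
    """Map coordinates of the centre of cell (row, col); row 0 is the top row."""
    x = h.xll + (col + 0.5) * h.cellsize
    y = h.yll + (h.nrows - row - 0.5) * h.cellsize
    return x, y


header = RasterHeader(ncols=4, nrows=3, xll=100.0, yll=200.0, cellsize=10.0)
print(cell_center(header, 0, 0))   # top-left cell: (105.0, 225.0)
print(cell_center(header, 2, 3))   # bottom-right cell: (135.0, 205.0)
```

Because the raster is collinear with the axes, no rotation term is needed: a corner and a cell size fully determine the position of every cell.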
Principles:
- Utilizes a grid structure to organize data into cells.
- Suitable for continuous and regularly distributed phenomena.
- Resolution is determined by the size of the cells, which influences the level of detail in the
representation.
- Raster data is well-suited for spatial analysis involving mathematical operations, interpolation,
overlay analysis, and proximity analysis.
Example: Continuous data such as elevation models, satellite imagery, land cover, temperature
distributions
Pros:
Efficient for continuous data representation.
Well-suited for spatial analysis.
Commonly used in remote sensing applications.
Cons:
Can be memory-intensive for large datasets.
Challenges in representing precise boundaries.
Limited in capturing complex geometric features.
Vector:
Geometry: points, lines, and polygons to represent features on the Earth's surface.
Points - individual locations
Lines - linear features
Polygons - enclosed areas.
It may be less suitable for continuous and regularly distributed phenomena.
Principles:
- Utilizes points, lines, and polygons to represent features.
- Suitable for representing complex features with well-defined boundaries.
- Maintains topology (spatial connectivity and adjacency).
- Versatile for various applications with attribute data.
Example: Administrative boundaries, road networks, land parcels.
Pros:
Well-suited for precise boundaries and complex features.
Maintains topological relationships.
Versatile with attribute data for comprehensive analysis.
Cons:
May not be efficient for continuous and regularly distributed phenomena.
Complex geometric features may be challenging to represent.
Certain spatial analyses may be computationally intensive.
Raster of values: single-band GeoTIFF or TIFF – represents data using a single band (channel),
typically containing values representing a specific attribute like elevation, temperature or intensity. Used for
elevation models, satellite imagery with a single spectral band (e.g., black and white infrared),
and datasets representing various physical measurements.
GeoTIFF – supports georeferencing information, allowing spatial referencing of the raster data; can store
both image and associated metadata; widely used for satellite imagery and other remote sensing data.
Esri Grid – stores grid datasets, consisting of multiple files in a specific folder structure. Used in Esri's
ArcGIS software for various types of spatial analysis, including terrain modeling, hydrology,
and environmental modeling.
Other formats: USGS DEM, Erdas Imagine (.img), any image format + world file
Raster visualization - what are pyramids, what rendering techniques you know.
Pyramids - a multi-resolution representation of a raster dataset. The purpose of creating
pyramids is to optimize the display and analysis of raster data at different zoom levels.
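The idea of pyramids can be sketched as repeated downsampling. This is a minimal illustration, assuming each level halves the resolution by averaging 2x2 blocks (real software may use other resampling methods and reduction factors; `downsample_by_2` and `build_pyramid` are hypothetical names):

```python
def downsample_by_2(grid):
    """Average each 2x2 block of cells into one coarser cell.

    A trailing odd row or column is dropped in this simplified sketch.
    """
    rows, cols = len(grid) // 2, len(grid[0]) // 2
    return [
        [
            (grid[2 * r][2 * c] + grid[2 * r][2 * c + 1]
             + grid[2 * r + 1][2 * c] + grid[2 * r + 1][2 * c + 1]) / 4
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(grid):
    """Return [full resolution, 1/2, 1/4, ...] until a level is too small to halve."""
    levels = [grid]
    while len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2:
        levels.append(downsample_by_2(levels[-1]))
    return levels


raster = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 6, 6],
    [4, 4, 8, 8],
]
pyramid = build_pyramid(raster)
for level in pyramid:
    print(level)
```

When zoomed out, a viewer can draw one of the coarse levels instead of reading every cell of the full-resolution raster.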
Rendering – the process of generating images from spatial data, displaying raster on the screen using
colors: RGB Composite, Stretched, Classified, Unique Values, Colormap, Discrete Color
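As an illustration of the Stretched renderer, the sketch below applies a simple minimum-maximum stretch that maps cell values to grey levels 0 to 255. It is an assumption that a linear min-max stretch is used (other stretch types exist), and `minmax_stretch` is a hypothetical helper name:

```python
def minmax_stretch(grid):
    """Linearly rescale cell values so the minimum maps to 0 and the maximum to 255."""
    values = [v for row in grid for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant raster has no contrast to stretch.
        return [[0 for _ in row] for row in grid]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in grid]


elevation = [[200, 240], [320, 400]]
print(minmax_stretch(elevation))   # [[0, 51], [153, 255]]
```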
Raster georeference - how it works, what is a “world file”, what is raster resampling.
Raster resampling – the process of overlaying a raster with a new grid and interpolating the values for this
new grid. It happens when:
a) georeferencing and rectifying a raster
b) changing the coordinate system (Project Raster)
c) shifting the raster position (Snap Raster)
d) changing the raster resolution (Resample)
Interpolation methods:
- nearest neighbor method (suitable for discrete rasters)
- bilinear interpolation (continuous rasters)
- bicubic interpolation (Cubic Convolution, continuous rasters)
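The difference between the first two interpolation methods can be sketched as follows. Positions are given as fractional (row, column) indices measured between cell centres of the old grid, which is a simplifying assumption; `sample_nearest` and `sample_bilinear` are hypothetical names:

```python
import math


def sample_nearest(grid, r, c):
    """Value of the cell whose centre is closest to the fractional position (r, c)."""
    rows, cols = len(grid), len(grid[0])
    ri = min(max(int(math.floor(r + 0.5)), 0), rows - 1)
    ci = min(max(int(math.floor(c + 0.5)), 0), cols - 1)
    return grid[ri][ci]


def sample_bilinear(grid, r, c):
    """Distance-weighted average of the four cell centres surrounding (r, c)."""
    rows, cols = len(grid), len(grid[0])
    r = min(max(r, 0.0), rows - 1.0)
    c = min(max(c, 0.0), cols - 1.0)
    r0, c0 = int(math.floor(r)), int(math.floor(c))
    r1, c1 = min(r0 + 1, rows - 1), min(c0 + 1, cols - 1)
    fr, fc = r - r0, c - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr


land_cover = [[1, 1], [2, 3]]               # class codes (discrete raster)
elevation = [[10.0, 20.0], [30.0, 40.0]]    # continuous raster

print(sample_nearest(land_cover, 0.6, 0.2))    # 2, an existing class code
print(sample_bilinear(elevation, 0.5, 0.5))    # 25.0, a plausible elevation
print(sample_bilinear(land_cover, 0.5, 0.5))   # 1.75, not a valid class code
```

The last line shows why nearest neighbor is the method for discrete rasters: averaging class codes produces values that do not correspond to any class.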
Raster georeferencing is the process of assigning spatial coordinates (e.g. latitude and longitude, or projected map coordinates) to a raster
dataset so that it aligns correctly with a specific location on the Earth's surface. This allows for accurate
mapping, spatial analysis, and integration with other geospatial data. The georeferencing process involves
associating each pixel in the raster with its corresponding geographic location.
Key steps in raster georeferencing include identifying control points in the raster (known locations with
known coordinates), transforming the raster to match those control points, and defining a coordinate system
for the raster data.
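The transformation step can be sketched with the simplest case, an affine (first-order) transformation determined from exactly three control points. This is only an illustration under that assumption: real georeferencing tools typically accept more points and fit them by least squares. The names `fit_affine`, `solve3` and `det3` and the coordinates are made up for the example:

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("control points are collinear")
    result = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = rhs[i]
        result.append(det3(mk) / d)
    return result


def fit_affine(pixel_pts, map_pts):
    """Fit x = a*col + b*row + c and y = d*col + e*row + f from three control points."""
    m = [[col, row, 1.0] for col, row in pixel_pts]
    a, b, c = solve3(m, [x for x, _ in map_pts])
    d, e, f = solve3(m, [y for _, y in map_pts])
    return a, b, c, d, e, f


# Hypothetical control points: (col, row) in the image and (x, y) on the map.
pixel_pts = [(0, 0), (100, 0), (0, 100)]
map_pts = [(500000.0, 4200000.0), (500200.0, 4200000.0), (500000.0, 4199800.0)]
params = fit_affine(pixel_pts, map_pts)
print(params)   # (2.0, 0.0, 500000.0, 0.0, -2.0, 4200000.0)
```

Here the fitted parameters describe a 2 m cell size with no rotation; the negative `e` reflects that row numbers grow downward while y grows upward.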
A "world file" is a common method for storing georeferencing information associated with raster datasets.
It is a plain text file with a predefined format that contains parameters to define the location, rotation, and
scale of the raster. The world file has the same filename as the raster image but with a different extension
based on the image format (e.g., .tfw for TIFF, .jgw for JPEG).
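A world file holds six numbers, one per line, in the order A, D, B, E, C, F: x cell size, two rotation terms, y cell size (normally negative), and the x and y coordinates of the centre of the upper-left pixel. The sketch below shows how they turn a pixel position into map coordinates; the sample values are made up and the function names are hypothetical:

```python
def parse_world_file(text):
    """Read the six numeric lines of a world file in their standard order: A, D, B, E, C, F."""
    a, d, b, e, c, f = (float(line) for line in text.split())
    return a, d, b, e, c, f


def pixel_to_map(params, col, row):
    """Map coordinates of the centre of pixel (col, row)."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Made-up example: 2 m cells, no rotation.
world_text = """2.0
0.0
0.0
-2.0
500001.0
4199999.0
"""
world = parse_world_file(world_text)
print(pixel_to_map(world, 0, 0))    # (500001.0, 4199999.0)
print(pixel_to_map(world, 10, 5))   # (500021.0, 4199989.0)
```

Note that the world file carries no coordinate system definition; that has to be known from elsewhere.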
World file or AUX.XML file – externally saved georeference, the original raster is unchanged
Rectification – a new raster is created, resampled to be collinear with the coordinate system; the georeference is
saved either internally (Esri Grid, GeoTIFF etc.) or externally (world file with no rotation)
2. Classification methods
What classification methods do you know? What are their properties and differences? When would you use
each of them?
Classification - a process in which a group of values (referred to as “observed values” or “observations”
further in the text) is divided into subgroups called “classes”. It consists of dividing the range of values
into several non-overlapping intervals (classes) and determining which values belong to a particular
interval (class). In GIS, classification is used to display a numerical attribute by drawing the features
from each class with a different symbol. The various classification methods differ in how they determine
the class breaks (i.e. the upper and lower bounds of the intervals).
Equal intervals
Divides the whole range of values into equally sized intervals
Equal Interval: 1) define the number of classes
2) the range of values is divided into the specified number of equally sized intervals.
The method ignores the true distribution of the data along the number line.
Defined Interval: 1) define the size (i.e. width) of the intervals
2) the intervals are placed next to one another, until they cover the whole range of
values.
+++
- the classes are easily interpretable, especially in the case of
rounded class breaks
- the outlying (i.e. extreme) values are well represented, having usually their own class
----
- if the data are not regularly distributed but rather crowded, most of the data fall into one or a few
classes, whereas other classes may even be empty.
- visually this leads to a map with only a few colors covering most of the area
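Both variants can be sketched in a few lines. The sketch assumes that each class includes its upper bound and that Defined Interval starts at a multiple of the chosen width (to obtain rounded breaks); these are assumptions, and the function names are hypothetical:

```python
import math


def equal_interval_breaks(values, n_classes):
    """Upper class bounds when the range is cut into n_classes equally wide intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * k for k in range(1, n_classes + 1)]


def defined_interval_breaks(values, width):
    """Upper class bounds of fixed-width intervals laid side by side over the range."""
    lo, hi = min(values), max(values)
    start = math.floor(lo / width) * width      # assumed: start at a multiple of the width
    n = max(1, math.ceil((hi - start) / width))
    return [start + width * k for k in range(1, n + 1)]


def count_per_class(values, breaks):
    """Number of observations per class; each class includes its upper bound."""
    counts = [0] * len(breaks)
    for v in values:
        for i, upper in enumerate(breaks):
            if v <= upper:
                counts[i] += 1
                break
    return counts


values = [1, 2, 2, 3, 3, 4, 5, 40]
breaks = equal_interval_breaks(values, 3)
print(breaks)                            # [14.0, 27.0, 40.0]
print(count_per_class(values, breaks))   # [7, 0, 1]
print(defined_interval_breaks(values, 10))   # [10, 20, 30, 40]
```

The counts illustrate the drawback described above: the crowded values end up in one class, one class is empty, and the outlier gets a class of its own.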
Quantiles:
Creates classes that all contain the same (or almost the same) number of observations.
+++
- easy to interpret
----
- in the parts of the number line where the data are more scattered, the requirement of the same number of
observations in each class leads to wide classes that group very distant values together.
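A minimal sketch of the quantile principle, assuming there are at least as many observations as classes and no tied values at the class boundaries (software packages differ in how they handle ties and fractional positions; `quantile_breaks` is a hypothetical name):

```python
def quantile_breaks(values, n_classes):
    """Upper class bounds chosen so each class holds (almost) the same number of observations."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for k in range(1, n_classes + 1):
        last_index = (k * n) // n_classes - 1   # index of the last observation in class k
        breaks.append(ordered[last_index])
    return breaks


values = [1, 2, 3, 4, 5, 6, 8, 40]
print(quantile_breaks(values, 4))   # [2, 4, 6, 40]
```

Every class holds two observations, but the last class spans the values 8 and 40, which shows how scattered data lead to wide classes.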
Geometrical Interval
Designed for data that are concentrated around some typical value and become more and more scattered with
increasing distance from the center of the main cluster.
The class widths form a geometrical sequence. (Recall that a geometrical sequence is a sequence of
numbers in which each number equals the previous one multiplied by some constant. This constant is
called the coefficient of the sequence.)
The coefficient of the sequence is determined so that the numbers of observations in different classes are
as similar as possible (ideally the same, which would be the case of geometrically distributed data).
Moreover, the coefficient can be at some point replaced by its inverse value.
The resulting classification thus contains a dividing point from which the class widths gradually increase (or
decrease) both to the right and to the left.
The determination of the coefficient, as well as of the position of the dividing point where it switches to its
inverse, is performed automatically by an optimization algorithm, but the user has the option to specify the number of classes.
In contrast with the Quantile method, it preserves some kind of order in the classes, namely that each class
width is equal to the previous class width multiplied by the coefficient.
The class breaks, however, are usually non-rounded and may thus look confusing to a map reader.
+++
----
- sometimes creates wide classes that group distant observations into one class.
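The geometric progression of class widths can be illustrated with a fixed coefficient. This sketch deliberately leaves out the optimization of the coefficient and the switch to its inverse described above; the coefficient is simply supplied by the caller, and `geometric_breaks` is a hypothetical name:

```python
def geometric_breaks(lo, hi, n_classes, coefficient):
    """Upper class bounds whose class widths form a geometric sequence."""
    if coefficient == 1:
        first_width = (hi - lo) / n_classes
    else:
        # Sum of the geometric series of widths must equal the range hi - lo.
        first_width = (hi - lo) * (coefficient - 1) / (coefficient ** n_classes - 1)
    breaks, upper, width = [], lo, first_width
    for _ in range(n_classes):
        upper += width
        breaks.append(upper)
        width *= coefficient
    return breaks


print(geometric_breaks(0, 150, 4, 2))   # widths 10, 20, 40, 80 -> [10.0, 30.0, 70.0, 150.0]
```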
Natural Breaks
Tries to overcome all of these limitations.
It reflects the true distribution of the data along the number line, but instead of fitting a specific distribution to
them (normal or geometrical), it determines the “natural” breaks or gaps between the observations, or more
accurately, it tries to group together only similar observations.
The decision about which observations should be grouped into a single class is based on an objective
criterion: in the optimal classification, the variation of values inside each class is
minimal (i.e. as small as possible). The variation can be assessed using any variation measure, such as
absolute deviation or standard deviation. In the optimization process, the method repeatedly tries
different classifications, and for each classification, it computes the total intra-class variation by
summing the variations in the individual classes (i.e. how the observations inside the class vary around
the class median or mean). Once the classification with the minimum possible total intra-class variation
is found, the process terminates. Because of this optimization nature, the method is sometimes
referred to as the “Optimal method”.
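The optimization criterion can be sketched by brute force: try every way of cutting the sorted values into the requested number of classes and keep the one with the smallest total intra-class variation. The sketch assumes the sum of squared deviations from the class mean as the variation measure and is only practical for small datasets (production implementations use far more efficient algorithms); the function names are hypothetical:

```python
from itertools import combinations


def sum_squared_deviations(group):
    """Variation of one class: squared deviations around the class mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def natural_breaks(values, n_classes):
    """Exhaustively search for the classification with minimal total intra-class variation."""
    ordered = sorted(values)
    n = len(ordered)
    best_cost, best_classes = None, None
    # Each combination lists the positions where the sorted values are cut.
    for cuts in combinations(range(1, n), n_classes - 1):
        bounds = (0,) + cuts + (n,)
        classes = [ordered[bounds[i]:bounds[i + 1]] for i in range(n_classes)]
        cost = sum(sum_squared_deviations(c) for c in classes)
        if best_cost is None or cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes


values = [1, 2, 3, 10, 11, 12, 30, 31]
print(natural_breaks(values, 3))   # [[1, 2, 3], [10, 11, 12], [30, 31]]
```

The classes follow the gaps in the data: the breaks fall between 3 and 10 and between 12 and 30.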
3. Map algebra
Map algebra = collection of operations on rasters
Local map algebra - types of operations, tools in ArcGIS, examples of use.
Local map algebra: works on each cell independently without considering its neighbours / evaluated on
a cell-by-cell basis / inputs are usually 2 or more rasters
Example: adding elevation values of two cells together / subtracting rainfall values of one cell from
another / checking if the temperature in a cell is higher than a certain threshold
Types of operations:
-> binary: + - * / % **
-> unary: Abs() Exp() Log() Int() Sin() Cos()
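A minimal sketch of local operations, with plain nested lists standing in for rasters of identical shape and alignment. `local_op` is a hypothetical helper and the rainfall values are made up:

```python
def local_op(func, *rasters):
    """Apply func cell by cell to rasters of identical shape."""
    return [
        [func(*cells) for cells in zip(*rows)]
        for rows in zip(*rasters)
    ]


rain_2022 = [[10, 20], [30, 40]]
rain_2023 = [[12, 18], [33, 40]]

difference = local_op(lambda a, b: b - a, rain_2022, rain_2023)    # binary operator: -
wetter = local_op(lambda a, b: int(b > a), rain_2022, rain_2023)   # comparison, gives a 0/1 raster
absolute = local_op(abs, difference)                                # unary function: Abs()

print(difference)   # [[2, -2], [3, 0]]
print(wetter)       # [[1, 0], [1, 0]]
print(absolute)     # [[2, 2], [3, 0]]
```

Each output cell depends only on the input cells at the same position, which is what distinguishes local operations from neighbourhood-based ones.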
Parameter p:
- the higher the value, the more abrupt are the changes between the values (for p → ∞, IDW approaches
Thiessen polygons)
- the lower the value, the more smoothed the resulting terrain is (for p → 0, IDW approaches a moving
average)
Neighborhood size: the larger the neighborhood, the more smoothed the terrain
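The effect of the parameter p can be checked with a small IDW sketch. It uses all sample points rather than a limited neighborhood, which is a simplification; `idw` is a hypothetical function and the sample values are made up:

```python
def idw(samples, x, y, p=2.0):
    """Inverse distance weighted estimate at (x, y) from (xi, yi, zi) samples."""
    weighted_sum, weight_total = 0.0, 0.0
    for xi, yi, zi in samples:
        dist = ((x - xi) ** 2 + (y - yi) ** 2) ** 0.5
        if dist == 0:
            return float(zi)   # exactly on a sample point
        w = 1.0 / dist ** p
        weighted_sum += w * zi
        weight_total += w
    return weighted_sum / weight_total


# Two samples: z = 100 at x = 0 and z = 200 at x = 10; estimate at x = 2.
samples = [(0.0, 0.0, 100.0), (10.0, 0.0, 200.0)]
for p in (0.0, 2.0, 20.0):
    print(p, idw(samples, 2.0, 0.0, p))
```

With p = 0 all weights are equal and the estimate is the plain average (150). With p = 2 the nearer sample dominates, and with p = 20 the estimate is practically the value of the nearest sample, as in Thiessen polygons.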
What is the major difference (from an applied point of view) between IDW and Spline? When would you
choose the former and when the latter?
IDW -> calculates values based on the weighted average of nearby points, leading to sharp transitions
between areas.
Spline -> calculates smooth curves/surfaces that pass through known data points.
IDW: when I need a simple and quick interpolation method with evenly distributed sample points
Spline: when I need a smoother and more accurate representation of the surface, especially for complex
data patterns/irregularly distributed sample points
5. Digital terrain models
What are the basic techniques to acquire elevation data? What are their pros and cons?
What is a digital terrain model and what is a digital surface model? How to acquire them?
DTM – digital representation of the Earth’s surface that focuses on the natural topography of the terrain,
including hills, valleys and other landforms, without including features such as buildings, vegetation or
infrastructure.
Used for: terrain analysis, slope calculations, watershed delineation and other applications where the
emphasis is on the natural landscape
To acquire: elevation data are collected using LiDAR, photogrammetry, GPS surveys or ground-based
surveys. These data are then processed to remove features (buildings and vegetation), resulting in a
representation of the bare ground surface
DSM – digital representation of the Earth’s surface including all features present on the terrain (buildings,
trees and other structures) in addition to the bare ground surface
Used for: comprehensive view of the terrain, including both natural and man-made features;
applications such as urban planning, 3D visualization, telecommunications planning
To acquire: elevation data are collected using the same techniques as for a DTM, but features (buildings
and vegetation) are not removed during processing, resulting in a representation of the entire surface
including all above-ground features
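The relationship between the two models can be made concrete by subtracting them cell by cell: the difference is the height of whatever stands above the bare ground. This difference raster is often called a normalized DSM, a term not used in these notes; the sketch assumes both rasters share the same grid, and the values are made up:

```python
def normalized_dsm(dsm, dtm):
    """Height of above-ground features: DSM minus DTM, cell by cell."""
    return [
        [surface - ground for surface, ground in zip(dsm_row, dtm_row)]
        for dsm_row, dtm_row in zip(dsm, dtm)
    ]


dtm = [[250.0, 251.0], [252.0, 253.0]]   # bare ground elevation
dsm = [[250.0, 263.5], [270.0, 253.0]]   # ground plus buildings and trees

print(normalized_dsm(dsm, dtm))   # [[0.0, 12.5], [18.0, 0.0]]
```

Cells with a difference of 0 are open ground; the other cells carry a 12.5 m and an 18 m high object.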
What global/continental elevation models do you know? What are their properties?
6. Digital terrain analysis
Overview of tasks - what problems can you solve using digital terrain models?
How are the slope and aspect computed in ArcGIS? What other approaches do you know?
Explain the formula for hillshade.
Explain how visibility analysis works and how to combine DTM and DSM in their computation (the problem
of forests).
Final grade
You can get a maximum of 30 points from the exam (20 points from the project plus
10 points from the theory). The final grade will be determined as follows:
27 - 30 points … 1
22 - 26 points … 2
18 - 21 points … 3