Lecture 6 - GIS Functions - Part 2

This document provides an overview of GIS data management and exploration. It discusses how GIS uses both spatial and attribute data that can be stored in vector formats like shapefiles and coverages, or raster formats like TIFFs, JPEGs and PNGs. It also covers considerations for data quality such as positional, attribute, temporal, logical and completeness accuracy. The document concludes with a brief introduction to exploring GIS data through statistics, graphs, and spatial and attribute queries.

Uploaded by

Philip Wagih

Geographical Information

Systems

Lecture 6
GIS Basic Functions -Part II
Data Management & Exploration

Prepared by
Dr. Naglaa Fathy
Image source: Westfield State University
Agenda
• GIS Data Management
✓Vector Data
✓Raster Data
• GIS Data Quality
• GIS Data Exploration

2
GIS Data Management
• A geographic information system (GIS) involves both spatial and
attribute data:
➢ Spatial data relate to the geometries (locations) of spatial features, and
➢ Attribute data describe the characteristics of the spatial features.

• Geospatial data are stored in many different file formats.


➢ Each geographic information system (GIS) software package, and each version
of these software packages, supports different formats.

3
GIS Data Management - Vector file formats
• The georelational data model
• Stores spatial data and attribute data separately and links the two by the
feature ID
• The two data sets are synchronized so that they can be queried, analyzed, and
displayed in unison.
• Two file formats follow this model: shapefiles and coverages.
• The object-based data model (e.g., geodatabase)
• Combines both geometries and attributes in a single system.
• Each spatial feature has a unique object ID and an attribute to store its
geometry.
• Although the two data models handle the storage of spatial data
differently, both operate in the same relational database environment.
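
As a rough illustration of the georelational idea, the sketch below keeps geometries and attributes in two separate tables and links them through a shared feature ID. The names (geometries, attributes, join_by_feature_id) and the sample values are hypothetical and are not taken from any GIS package.

```python
# Hypothetical sample data: geometry and attributes are stored separately,
# and both tables are keyed by the same feature ID (the georelational link).
geometries = {
    1: [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)],
    2: [(5.0, 5.0), (6.0, 7.0)],
}
attributes = {
    1: {"name": "Parcel A", "land_use": "residential"},
    2: {"name": "Road B", "land_use": "transport"},
}


def join_by_feature_id(geoms, attrs):
    """Return one combined record per feature ID present in both tables."""
    joined = {}
    for fid, coords in geoms.items():
        if fid in attrs:
            joined[fid] = {"geometry": coords, **attrs[fid]}
    return joined


features = join_by_feature_id(geometries, attributes)
print(features[1]["name"], features[1]["geometry"])
```

An object-based model would instead store the geometry as one more field of each record, which is roughly what the joined dictionary looks like after the link is resolved.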
4
GIS Data Management - Vector file format (Shape)

• Shapefiles are nontopological files developed to store the geometric
location and attribute information of geographic features.
• Each shapefile can represent only point, line, or polygon feature sets.
• Supported data types are limited to floating point, integer, date, and
text.
• Shapefiles are supported by almost all commercial and open-source
GIS software.

5
GIS Data Management - Vector file format (Shape)
• Despite being called a “shapefile,” this format is actually a compilation
of many different files.
• One shapefile must have at least 3 files, but most shapefiles have
around 6 files. A shapefile must have:
➢ .shp – this file stores the geometry of the feature
➢ .shx – this file stores the index of the geometry
➢ .dbf – this file stores the attribute information for the feature

• All files for the shapefile must be stored in the same location with the
same name or else the shapefile will not load.
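
The "same location, same name" rule can be checked before loading. This is a minimal sketch using only the Python standard library; the function name and the example path are hypothetical, and it assumes lowercase extensions, which may not hold on every system.

```python
from pathlib import Path

# The three mandatory parts of a shapefile named on this slide.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the required extensions with no matching file beside shp_path."""
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]


# Hypothetical path; an empty list would mean all three parts are present.
print(missing_shapefile_parts("data/roads.shp"))
```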

6
GIS Data Management - Vector file format (Coverage)

• The earliest vector format file for use in GIS software packages, which
is still in use today, is the ArcInfo coverage.
• This georelational file format supports multiple feature types (e.g.,
points, lines, polygons, annotations) while also storing the topological
information associated with those features.
➢ Information for Arc-Node topology, Polygon-Arc topology, and Left-Right
topology is stored in coverage files.
• Attribute data are stored as multiple files in a separate directory
labeled “Info”.

7
GIS Data Management - Raster file format
• The raster data model presents a different scenario in terms of data
management.
• Each cell value corresponds to the value of a continuous or discrete
variable at the cell location. The Value Attribute Table (VAT) summarizes
the distinct cell values and their frequencies rather than listing a value
for every cell.
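
To make the VAT idea concrete, the sketch below counts how often each value occurs in a small raster. The grid and the function name are made up for illustration; real GIS packages build this table internally.

```python
from collections import Counter

# Hypothetical 3 x 4 raster of land-cover codes.
raster = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [1, 1, 3, 2],
]


def value_attribute_table(grid):
    """Return sorted (cell value, count) rows, one row per distinct value."""
    counts = Counter(value for row in grid for value in row)
    return sorted(counts.items())


vat = value_attribute_table(raster)
for value, count in vat:
    print(value, count)
```

Twelve cells collapse into three rows, which is why a VAT stays small even for large rasters with few distinct values.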

8
GIS Data Management - Raster file format

• Due to ongoing technological advancements, raster image file sizes
have been getting larger and larger.
• To deal with this potential constraint, two types of file compression
are commonly used:
➢ Lossless compression reduces file size without decreasing image quality.
➢ Lossy compression removes information from the image that cannot be
sensed. It results in smaller file sizes than lossless compression.
• Among the most common raster files used on the web are the JPEG,
TIFF, and PNG formats, all of which have openly published specifications
and can be used with most GIS software packages.

9
GIS Data Management - Raster file format
• Native JPEG, TIFF, and PNG files do not have georeferenced information
associated with them, and therefore cannot be used in any geospatial
mapping efforts.
• In order to employ these files in a GIS, a world file must first be created.
• A world file is a separate, plaintext data file that specifies the locations and
transformations that allow the image to be projected into a standard coordinate
system.
• Examples of raster file formats with explicit georeferencing
information are the MrSID (Multiresolution Seamless Image Database) format
and the ECW (Enhanced Compression Wavelet) format.
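
The world file mentioned on this slide holds six numbers, one per line, in the order A, D, B, E, C, F. They define the affine transform x = A*col + B*row + C and y = D*col + E*row + F, where C and F are the map coordinates of the center of the upper-left pixel. The sketch below parses such a file and converts a pixel position to map coordinates; the sample values and function names are hypothetical.

```python
def parse_world_file(text):
    """Parse the six numbers of a world file, given in the order A, D, B, E, C, F."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file should contain exactly six numbers")
    a, d, b, e, c, f = values
    return {"A": a, "D": d, "B": b, "E": e, "C": c, "F": f}


def pixel_to_map(params, col, row):
    """Apply the affine transform to a pixel (column, row) position."""
    x = params["A"] * col + params["B"] * row + params["C"]
    y = params["D"] * col + params["E"] * row + params["F"]
    return x, y


# Hypothetical world file: 30-unit pixels, no rotation, north-up image
# (E is negative because row numbers increase downward).
sample = "30.0\n0.0\n0.0\n-30.0\n500000.0\n4200000.0\n"
params = parse_world_file(sample)
print(pixel_to_map(params, 10, 20))
```

Note that the world file carries only the transform; the coordinate system itself still has to be known from somewhere else.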

10
GIS Data Management - Hybrid file formats

• A geodatabase is a recently developed, ESRI file format that supports
both vector and raster feature datasets (e.g., points, lines, polygons,
annotation, JPEG, TIFF) within a single file.
• This format maintains topological relationships and is stored as an
MDB file.
• The geodatabase was developed to be a comprehensive model for
representing and modeling geospatial information.

11
GIS Data Quality
• Data quality refers to the ability of a given dataset to satisfy the
objective for which it was created. It could be characterized by accuracy.
• Accuracy describes how close a measurement is to its actual value and is
often expressed as a probability.
➢ (e.g., 80 percent of all points are within +/− 5 meters of their true locations)
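
A statement such as "80 percent of points are within +/− 5 meters" can be computed from a set of check points. The sketch below is one simple way to do it, assuming planar coordinates in meters; the point values and the function name are invented for the example.

```python
import math


def share_within_tolerance(measured, true, tolerance):
    """Return the fraction of measured points within `tolerance` of their true location."""
    hits = sum(
        1
        for (mx, my), (tx, ty) in zip(measured, true)
        if math.hypot(mx - tx, my - ty) <= tolerance
    )
    return hits / len(true)


# Hypothetical check points (meters); the last one is 9 m off.
true_points = [(0, 0), (10, 0), (20, 0), (30, 0), (40, 0)]
measured_points = [(1, 1), (10, 3), (24, 0), (30, -2), (40, 9)]

share = share_within_tolerance(measured_points, true_points, 5.0)
print(share)  # 0.8
```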

• Data accuracy types:
➢Positional accuracy
➢Attribute accuracy
➢Temporal accuracy
➢Logical consistency
➢Data completeness
12
GIS Data Quality - Positional accuracy

• Positional accuracy is the probability of a feature being within a
stated +/− number of units of either its true location on earth (absolute
positional accuracy) or its location in relation to other mapped features
(relative positional accuracy).
• Positional errors arise via multiple sources:
➢ The process of digitizing paper maps.
Discrepancies between digitized map and source map might be due to
human errors in manual digitizing or in scanning and tracing.
➢ When features to be mapped are inherently vague.

13
GIS Data Quality - Positional accuracy
• Positional inaccuracy in a digitized map is evaluated by measuring the
root-mean-square (RMS) error.
➢ This statistic measures the deviation between the actual (true) and estimated
(digitized) locations of the control points.
• For example
• This figure illustrates the inaccuracies of lines representing soil types.
• By applying an RMS error calculation to the dataset, one could determine the
accuracy of the digitized map and thus determine its suitability for inclusion in
a given study.
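
A common two-dimensional form of the RMS error is sqrt((1/n) * sum((x_true − x_dig)^2 + (y_true − y_dig)^2)) over the n control points. The sketch below assumes that form and uses invented control points; specific GIS packages may report the statistic slightly differently.

```python
import math


def rms_error(true_points, digitized_points):
    """Root-mean-square distance between true and digitized control points."""
    squared = [
        (tx - dx) ** 2 + (ty - dy) ** 2
        for (tx, ty), (dx, dy) in zip(true_points, digitized_points)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical control points: only the first one was digitized off target.
true_pts = [(0, 0), (10, 0), (10, 10), (0, 10)]
digitized_pts = [(3, 4), (10, 0), (10, 10), (0, 10)]

rmse = rms_error(true_pts, digitized_pts)
print(rmse)  # 2.5
```

A single 5-unit error spread over four control points gives an RMS error of 2.5, so the statistic should be read together with the individual residuals.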

14
GIS Data Quality - Positional accuracy
• Positional errors can also arise
when features to be mapped are
inherently vague.
• Take the example of a wetland:
what defines a wetland boundary?

15
Data quality - Attribute & Temporal

• Attribute accuracy
Attribute errors can occur when an incorrect value is recorded within the
attribute field or when a field is missing a value.
• Temporal accuracy
➢ It addresses the age or timeliness of a dataset. No dataset is ever
completely current.
➢ There are several dates to be aware of when using a dataset (publication
date, collection date, etc.).
➢ To address temporal accuracy, many datasets undergo a regular data
update regime.

16
Data quality - Logical & Completeness

• Logical consistency
➢ It requires that the data are topologically correct.
➢ For example: do roadways connect at nodes? Do all the connections and
flows point in the correct direction in a network?
• Data completeness
• All the data must be present for a dataset to be accurate.
• For example: are all of the counties in the state represented? Are all of the
stream segments included in the river network?
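
Both checks above can be approximated with very little code. The sketch below flags segment endpoints that are used only once (candidates for roads that fail to connect at a node) and lists expected features that are absent from a dataset. The road coordinates and county names are hypothetical, and a flagged endpoint may also be a legitimate dead end.

```python
from collections import Counter


def dangling_endpoints(segments):
    """Return endpoints that belong to only one segment."""
    counts = Counter()
    for start, end in segments:
        counts[start] += 1
        counts[end] += 1
    return sorted(point for point, n in counts.items() if n == 1)


def missing_features(expected, present):
    """Return expected feature names that do not appear in the dataset."""
    return sorted(set(expected) - set(present))


# Hypothetical road segments given as (start, end) coordinate pairs.
roads = [((0, 0), (1, 0)), ((1, 0), (2, 0)), ((1, 0), (1, 1))]
dangles = dangling_endpoints(roads)
gaps = missing_features(["Adams", "Baker", "Clark"], ["Adams", "Clark"])
print(dangles)
print(gaps)
```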

17
GIS data exploration

• Data exploration allows data to be viewed from different
perspectives, making it easier for information processing.
• Data exploration in GIS involves:
➢Statistics,
➢Graphs,
➢Attribute data query, and
➢Spatial query.

18
GIS data exploration - Statistics

• Descriptive statistics provide
simple numeric summaries of
large datasets by examining one
variable at a time (e.g., change in
population rate).
• Assuming the dataset is arranged
in ascending order, measures
include:
• range,
• median,
• mean,
• standard deviation, etc.
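
These measures are available in Python's standard statistics module. The sketch below uses an invented list of values and the population standard deviation (pstdev); whether the population or sample form is appropriate depends on the dataset.

```python
import statistics

# Hypothetical attribute values, sorted into ascending order first.
values = sorted([22, 15, 31, 12, 18, 15, 20])

data_range = values[-1] - values[0]   # largest minus smallest value
median = statistics.median(values)    # middle value of the sorted list
mean = statistics.mean(values)        # arithmetic average
std = statistics.pstdev(values)       # population standard deviation

print(data_range, median, mean, round(std, 2))
```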

19
GIS data exploration - Graphs

• Different types of graphs
are used for data
exploration.
• A graph may involve a
single variable or multiple
variables, and it may
display individual values
or classes of values.
• An important guideline in
choosing a graph is to let
the data tell their story
through the graph.
20
Data exploration - Attribute query

• Attribute query retrieves a data subset by working with attribute data.
• The selected data subset can be displayed in charts, linked to the
highlighted features in the map, printed or saved for further processing.
• Attribute query requires the use of expressions, which must be
interpretable by a database management system.
• ArcGIS, for example, uses SQL (Structured Query Language) for query expressions.
• Attribute query examples:
✓show all the census tracts that have a population density of 500 or greater,
✓show all counties that are less than or equal to 100 square kilometers.
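
The first example query can be written as an SQL expression. The sketch below uses Python's built-in sqlite3 module as a stand-in database management system; the table name, column names, and values are hypothetical and do not come from ArcGIS.

```python
import sqlite3

# In-memory database standing in for a layer's attribute table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracts (tract_id TEXT, pop_density REAL)")
conn.executemany(
    "INSERT INTO tracts VALUES (?, ?)",
    [("T1", 320.0), ("T2", 500.0), ("T3", 1250.5)],
)

# Attribute query: census tracts with a population density of 500 or greater.
rows = conn.execute(
    "SELECT tract_id FROM tracts WHERE pop_density >= 500 ORDER BY tract_id"
)
selected = [row[0] for row in rows]
conn.close()

print(selected)
```

The second example would use the same pattern with a condition such as area_km2 <= 100 on a counties table, where area_km2 is again an assumed column name.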

21
Data exploration - Spatial query

• Spatial query refers to the process of retrieving features by examining
their position relative to other features.
➢ Target layer refers to the feature dataset whose attributes are selected.
➢ Source layer refers to the feature dataset on which the spatial query is
applied.
• Spatial relationships for spatial query have been grouped into
distance, direction, and topological relationship.
• Distance example: find restaurants that are within 1 km of a street
intersection.
• Direction example, find recreation areas west of an interstate highway.
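
The distance and direction examples can be sketched with plain coordinate arithmetic. The code below assumes planar coordinates in meters and, for the direction case, a highway simplified to a north-south line at a fixed x value; all names and locations are hypothetical.

```python
import math

# Hypothetical point features in planar coordinates (meters).
intersection = (0.0, 0.0)
restaurants = {
    "Cafe A": (300.0, 400.0),
    "Diner B": (900.0, 900.0),
    "Grill C": (-600.0, 0.0),
}


def within_distance(points, origin, max_dist):
    """Distance query: names of points no farther than max_dist from origin."""
    ox, oy = origin
    return sorted(
        name for name, (x, y) in points.items()
        if math.hypot(x - ox, y - oy) <= max_dist
    )


def west_of(points, x_line):
    """Direction query: names of points lying west of the line x = x_line."""
    return sorted(name for name, (x, _) in points.items() if x < x_line)


nearby = within_distance(restaurants, intersection, 1000.0)
western = west_of(restaurants, 0.0)
print(nearby)
print(western)
```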
22
Data exploration - Spatial query

Three basic topological spatial relationships:
1. Containment
• Selects target features that fall within, or are contained by, source
features.
• Examples include finding schools within each county, and national parks
within each state.
2. Intersect
• Selects target features that intersect, or are crossed by, source features.
• Examples include finding urban places that intersect an active fault line
and land parcels that are crossed by a proposed road.
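
Containment and intersect both reduce to simple geometric tests. The sketch below uses a ray-casting point-in-polygon test for containment and an orientation test for two line segments that cross. It is a simplified illustration: points exactly on a boundary, touching endpoints, and collinear overlaps are not handled, and the county and road coordinates are invented.

```python
def point_in_polygon(point, polygon):
    """Ray-casting containment test; polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def _orient(p, q, r):
    """Sign of the turn p -> q -> r (positive for counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True when segment ab and segment cd properly cross each other."""
    return (_orient(c, d, a) * _orient(c, d, b) < 0
            and _orient(a, b, c) * _orient(a, b, d) < 0)


county = [(0, 0), (10, 0), (10, 10), (0, 10)]   # hypothetical source polygon
print(point_in_polygon((5, 5), county))          # school inside the county
print(segments_cross((0, 0), (10, 10), (0, 10), (10, 0)))  # road crosses a parcel edge
```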
23
Data exploration - Spatial query

Three basic topological spatial relationships:
3. Proximity
• Selects target features that are close, or adjacent, to source features.
• Examples include:
✓ Find the closest hazardous waste site to a city,
✓State parks that are within a distance of 10 miles of an interstate
highway,
✓Land parcels that are adjacent to a flood zone.
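
The "closest site" example amounts to picking the candidate with the smallest distance to the source feature. The sketch below treats features as points in planar coordinates; the site names and locations are hypothetical, and adjacency between polygons would need a geometry library rather than this point-based shortcut.

```python
import math


def nearest(origin, candidates):
    """Return the name of the candidate point closest to origin."""
    ox, oy = origin
    return min(
        candidates,
        key=lambda name: math.hypot(candidates[name][0] - ox,
                                    candidates[name][1] - oy),
    )


# Hypothetical hazardous waste sites and a city location (miles).
sites = {"Site A": (3.0, 4.0), "Site B": (6.0, 8.0), "Site C": (-1.0, -1.0)}
city = (0.0, 0.0)

closest = nearest(city, sites)
print(closest)
```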

24
Data exploration - Spatial query
Combining attribute and spatial queries:
• In many cases data exploration requires both types of queries.
• Example combined queries:
Find gas stations that are within 1 mile of a freeway exit in southern
California and have an annual revenue exceeding $2 million each.
➢ Given the layers of gas stations and freeway exits, there are at least two ways
to answer the question. (next slides)

25
Data exploration - Spatial query
Combining attribute and spatial queries - Solution 1

Use freeway exits as source features and gas stations as target features
in a spatial query to find gas stations that are within a distance of 1
mile of a freeway exit.
We can then use an attribute query to find those gas stations that
have annual revenues exceeding $2 million.
(A join query is performed because the attributes of selected gas
stations are joined to the attribute table of freeway exits.)

26
Data exploration - Spatial query
Combining attribute and spatial queries - Solution 2

Perform an attribute query of all gas stations to find those stations
with annual revenues exceeding $2 million.
Then we can use freeway exits as source features and selected gas
stations as target features in a spatial query to narrow selected gas
stations to only those that are within a distance of 1 mile of a freeway
exit.
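
The two solutions differ only in the order of the spatial and attribute steps, so they should select the same stations. The sketch below checks this on invented data, with planar coordinates in miles; the station names, revenues, and helper functions are hypothetical.

```python
import math

# Hypothetical source features (freeway exits) and target features (gas stations).
exits = [(0.0, 0.0), (10.0, 0.0)]
stations = [
    {"name": "S1", "xy": (0.5, 0.5), "revenue": 2_500_000},
    {"name": "S2", "xy": (0.2, 0.0), "revenue": 1_200_000},
    {"name": "S3", "xy": (5.0, 5.0), "revenue": 3_000_000},
    {"name": "S4", "xy": (10.0, 0.9), "revenue": 2_100_000},
]


def near_any_exit(station, exit_points, max_miles=1.0):
    """Spatial condition: station lies within max_miles of at least one exit."""
    sx, sy = station["xy"]
    return any(math.hypot(sx - ex, sy - ey) <= max_miles
               for ex, ey in exit_points)


def high_revenue(station, threshold=2_000_000):
    """Attribute condition: annual revenue exceeds the threshold."""
    return station["revenue"] > threshold


# Solution 1: spatial query first, then attribute query.
step1 = [s for s in stations if near_any_exit(s, exits)]
solution_1 = [s["name"] for s in step1 if high_revenue(s)]

# Solution 2: attribute query first, then spatial query.
step2 = [s for s in stations if high_revenue(s)]
solution_2 = [s["name"] for s in step2 if near_any_exit(s, exits)]

print(solution_1, solution_2)
```

In practice the order can still matter for speed: running the cheaper or more selective query first leaves fewer features for the second step.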

27