GeoHealth Mapping GIS Training
When working through the curriculum, you will need to have both your browser and the software program
open. The most efficient way to switch between the two programs will depend on your monitor size and
resolution. If you have a smaller monitor, you may need to keep both programs fully maximized in order to
see the details. In this instance, using the taskbar is the easiest way to jump back and forth between the two
programs. Some Windows users also like to use the Alt+Tab method.
If you have a larger or higher resolution monitor, you will find it faster to size the two windows so that they
sit side-by-side. You can then work through many of the steps in QGIS at one time. When you need to see
greater detail in either program, you can use the Window Maximize buttons. Note that we have found
Chrome does a better job of keeping the step you were on centered when switching between window sizes.
Both Internet Explorer and Firefox tended to drift a little and required some scrolling to restore the exact
location.
Users will need to download the workbook exercises and data (zip 19.6 MB) to complete the training.
Exercises and data may also be downloaded for just the individual section as needed. To match curriculum
references, create a folder and name it QGIS_Training. Save the downloaded exercise files in this folder.
Accessibility
A number of screen shots are included in the training curriculum to supplement the text-based instructions.
If you are using assistive technology to take this training, you should be able to complete many of the tasks
by following the steps in order. However, the software program will be producing non-accessible visible
output. Some steps will ask you to compare that on-screen output with the associated screen shot in the
curriculum. The alternative text for the screen shots will reference the numbered step of the equivalent text-
based instructions so that you may request assistance as needed to verify their successful completion.
Privacy Policy
This web-based training curriculum is governed by the Health Policy Project's privacy policy with the
following exception: If you choose to use the "Save My Place" feature, a non-tracking, site-specific cookie,
CFCLIENT_GEOHEALTH, will be placed on your computer. If you do not use the "Save My Place"
feature, then no cookie will be created. The cookie is not required to use this web-based training curriculum;
it simply stores the page you were on if you click the "Save My Place" button in the navigation menu.
1. Introduction
Estimated time for completion: 10 minutes
Welcome to the USAID- and PEPFAR-funded Health Policy Project's GeoHealth Mapping GIS Training.
Training materials are organized as self-directed modules and are intended for users with an introductory-
level understanding of geographic data and geographic information systems (GIS), and of why a geographic
approach to data analysis can be useful in examining HIV variation at subnational levels.
This set of training modules builds upon such a foundational understanding and provides the user with
practical exercises to further develop mapping and spatial analytic skills.
The overarching objectives of this collection of modules are to help monitoring and evaluation or strategic
information officers:
Become familiar with basic mapping concepts and with QGIS, a free and open-source mapping
software package
Develop skills in using QGIS to map and conduct spatial analysis for use in HIV program and policy
planning
Learn to create geographic data and collect data via GPS
Understand which skills and mapping techniques are most useful in the context of generalized versus
concentrated HIV epidemics
Each module is organized using a similar structure and includes learning objectives, examples of
programmatic questions that can be answered upon completion of the module, and estimated time for
completion.
Users will need to download the workbook exercises and data (zip 19.6 MB) to complete the training. To
match curriculum references, create a folder and name it QGIS_Training. Save the downloaded exercise
files in this folder.
"… without information, things are done arbitrarily and one becomes unsure of whether a policy or
program will fail or succeed. If we allow our policies to be guided by empirical facts and data, there will be
a noticeable change in the impact of what we do." — Nigerian policymaker, MEASURE Evaluation
Many public health professionals are overwhelmed with the collection and use of data related to the services
they deliver. In some contexts, data requirements from governments and donors have grown exponentially,
to the point where some service providers and implementing partners have pages and pages of forms to
complete on a daily basis.
Rarely are these data used to monitor programs and make decisions beyond individual patient care. This is a
lost opportunity because data are critical to program improvement and the decision-making process. It also
becomes difficult to identify important patterns in health data, which has great implications for program
planning and resource allocation. There is therefore a need for technology that can facilitate data use and
help us understand such variations in space and time.
Geographic Information Systems (GIS) are one such technology quickly gaining ground as a powerful tool
for evidence-based decision making for policy and planning in many sectors, including health. Maps are
now a commonly used medium for visualizing and interpreting patterns in health data.
As HIV epidemics grow and change, geospatial analysis has allowed epidemiologists to develop a clearer
picture of HIV. Scientists have long understood that HIV does not emerge in populations uniformly. Now,
using GIS software, they have the ability to visualize with increasing precision where HIV infections are
concentrated, giving them a greater understanding of how to drive down rates of infection. Because HIV
exists in hot spots (pockets of higher transmission and infection rates), controlling the epidemic will involve
targeting the specific geographic areas and marginalized populations most affected by the disease.
While mapping health-related data offers tremendous promise for identifying and reaching people who are
most at risk for HIV, international public health professionals find themselves facing a double-edged sword.
Matching the geography of key populations to programs and life-saving services is a powerful weapon in the
fight against HIV; however, this approach risks putting data about individuals and services in the hands of
those who might inflict harm. This risk is especially pronounced in countries with legal restrictions and/or
rights-constrained environments—for example, where same-sex relationships, sex work, or injecting drug
use are criminalized, or where identifying as transgender is either criminalized or simply unacceptable.
Before beginning any mapping or spatial analysis, analysts, program managers, and decisionmakers should
review available resources regarding ethical considerations in programmatic mapping. This will ensure that
any analysis will directly inform policies or programs, that it is presented at an appropriate level of
aggregation to protect individuals and communities, and that the maps themselves are shared selectively.
Burgert, C.R., J. Colston, T. Roy, and B. Zachary. 2013. Geographic Displacement Procedure and
Georeferenced Data Release Policy for the Demographic and Health Surveys. DHS Spatial Analysis
Reports No. 7. Calverton, Maryland, USA: ICF International. Available at
http://www.dhsprogram.com/publications/publication-SAR7-Spatial-Analysis-Reports.cfm#sthash.57s11ryj.dpuf.
amfAR, International AIDS Vaccine Initiative, Johns Hopkins Bloomberg School of Public Health,
and UNDP. 2011. Respect, Protect, Fulfill. Best Practices Guidance in Conducting HIV Research
with Gay, Bisexual, and Other Men Who Have Sex with Men (MSM) in Rights-Constrained
Environments. New York: amfAR. Available at
http://www.amfar.org/uploadedFiles/_amfar.org/In_The_Community/Publications/MSMguidance2011.pdf.
VanWey, L.K., R.R. Rindfuss, M.P. Gutmann, B. Entwisle, and D.L. Balk. 2005. "Confidentiality
and Spatially Explicit Data: Concerns and Challenges." Proceedings of the National Academy of
Sciences of the United States of America 102 (43): 15337–15342. Available at
http://www.pnas.org/content/102/43/15337.abstract.
MEASURE Evaluation GIS Working Group. 2008. Overview of Issues Concerning Confidentiality
and Spatial Data. Chapel Hill, NC: MEASURE Evaluation. Available at
http://www.cpc.unc.edu/measure/publications/wp-08-106.
Sherman, J.E. and T.L. Fetters. 2007. "Confidentiality Concerns with Mapping Survey Data in
Reproductive Health Research." Studies in Family Planning 38 (4): 309–321. Available at
http://www.ipas.org/~/media/Files/Ipas%20Publications/ShermanSFP2007.ashx.
This curriculum assumes a basic understanding of GIS. In this section, we will review basic GIS concepts
and terminology. Users who require an in-depth introduction to GIS concepts would benefit from taking the
MEASURE Evaluation course, Geographic Approaches to Global Health.
1.3.1 Objectives
Geographic data refers to features that have a spatial component. These features are commonly referred to as
spatial data layers. A spatial data layer is composed of two components: spatial data, or the location of a
feature (i.e., latitude and longitude); and attribute data, or information about the feature.
Source: Mayienda, R. 2003. Adapted from "Basic Training in Geographic Information Systems for Wildlife
Conservation." Wildlife Conservation Society.
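The two components of a spatial data layer can be sketched in a few lines of Python. This is only an illustration of the concept, not how QGIS stores layers internally; the `Feature` class, the facility names, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One feature in a spatial data layer (hypothetical structure)."""
    longitude: float  # spatial data: where the feature is, in decimal degrees
    latitude: float
    attributes: dict = field(default_factory=dict)  # attribute data: what the feature is


# A hypothetical layer of health facilities; names and values are invented.
facilities = [
    Feature(39.28, -6.80, {"name": "Clinic A", "type": "health centre"}),
    Feature(36.82, -1.29, {"name": "Clinic B", "type": "hospital"}),
]

# Attribute query: select features by information about them.
hospitals = [f for f in facilities if f.attributes["type"] == "hospital"]

# Spatial query: select features by location (south of 5 degrees S).
southern = [f for f in facilities if f.latitude < -5.0]
```

Keeping location and attributes together in one record is what lets a GIS answer both kinds of question about the same layer.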
Vector data
Raster data
The basic element of the vector data model is a point. The model is used to represent discrete objects such as
locations of towns, rivers, roads, or region boundaries. Within the vector model, there are three data types:
points, lines, and polygons. Points connect to make lines, lines connect to make polygons, and polygons are
grouped to form regions.
Source: Mayienda, 2003.
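The way points build up into lines and polygons can be shown with plain Python tuples and lists. This is a minimal sketch under the assumption that a polygon is stored as a closed ring of points; the type aliases, the `close_ring` helper, and the coordinates are hypothetical and do not come from the curriculum data.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # a single (x, y) location
Line = List[Point]           # an ordered sequence of points
Polygon = List[Point]        # a closed ring: the first point equals the last


def close_ring(line: Line) -> Polygon:
    """Turn a line into a polygon ring by repeating its first point at the end."""
    if len(line) < 3:
        raise ValueError("a polygon needs at least three points")
    return line if line[0] == line[-1] else line + [line[0]]


town: Point = (1.0, 1.0)                            # point feature
road: Line = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]   # line feature
district: Polygon = close_ring(road + [(0.0, 2.0)])  # polygon feature
```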
The most common file format for storing vector data is the Esri shapefile. Other file types exist that are
software-dependent. A shapefile consists of a minimum of three files with the following
extensions: .shp, .shx, and .dbf. However, it can also include files with other extensions, such as .sbx and .prj.
Provinces.shp: stores information about the geometry of the vector (i.e., is it a point, line, or
polygon?)
Provinces.shx: the computer index file
Provinces.dbf: table where the attribute data is stored
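Because a shapefile is really a set of files, a common problem is copying the .shp file without its companions. The following sketch checks that the three required files sit side by side. The function name `missing_components` is hypothetical, and the check assumes lowercase extensions and that all parts share one folder and base name.

```python
from pathlib import Path

# The three files every shapefile needs, as listed above.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_components(shp_path):
    """Return the required extensions that are absent next to the given .shp path."""
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]
```

For example, `missing_components("QGIS_Training/Provinces.shp")` returns an empty list when the .shp, .shx, and .dbf files are all present; the folder and file name here are only illustrative.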
Most paper maps you are familiar with use the vector data model:
Gold Coast with Togoland. 1949. European Digital Archive of Soil Maps (EuDASM). Available at
http://eusoils.jrc.ec.europa.eu/esdb_archive/eudasm/africa/maps/afr_ghgc.htm.
In the raster data model, geographic features are represented on a grid of square cells, all the same size. The
basic element is a grid cell.
Source: Mayienda, R. 2014. Unpublished Map
Gradients
Data from remote sensing imagery, collected in grids
Topographic data
Source: Mayienda, 2003.
1.3.8 Vector versus Raster
Raster data are computationally efficient because computers handle matrices well and because the cells of
different layers align exactly when the layers are overlaid.
However, rasters can generalize spatial locations, depending on the resolution used.
In practice, the data model that you use depends on software and data.
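The effect of resolution on spatial detail can be sketched with a small grid calculation. The function below assumes a north-up grid of square cells whose top-left corner is at (x_min, y_max); the function name, grid extent, and point coordinates are hypothetical.

```python
import math


def cell_index(x, y, x_min, y_max, cell_size):
    """Return (row, col) of the square cell containing point (x, y).

    Row 0 is the top row; (x_min, y_max) is the top-left corner of the grid.
    """
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y_max - y) / cell_size)
    return row, col


# Two hypothetical points 0.3 units apart.
a = (10.2, 20.7)
b = (10.5, 20.7)

# With 1.0-unit cells both points fall in the same cell,
# so the raster cannot tell them apart.
coarse = (cell_index(*a, 0, 100, 1.0), cell_index(*b, 0, 100, 1.0))

# With 0.25-unit cells they fall in different cells.
fine = (cell_index(*a, 0, 100, 0.25), cell_index(*b, 0, 100, 0.25))
```

A finer grid preserves more positional detail, at the cost of many more cells to store and process.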
A coordinate system is a reference system used to represent the locations of geographic features, imagery,
and observations such as GPS locations within a common geographic framework.
A global or spherical coordinate system such as latitude-longitude: These are often referred to
as geographic coordinate systems, and the units of measurement are degrees.
A projected coordinate system, which is based on a map projection: Map projections represent the three-
dimensional Earth as a two-dimensional Cartesian coordinate plane. Common systems include the
transverse Mercator, Albers equal area, or Robinson. The units of measurement are either metric or
statute.
Geographic reference systems use degrees as units of measurement, while the units of projected reference systems
are either metric or statute. So far we have worked with data in geographic coordinate systems.
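To make the change of units concrete, here is a minimal sketch of one simple projection, the spherical Mercator, which turns longitude and latitude in degrees into x and y in meters. It is not one of the projections named above and is chosen only because its formula is short; the function name is hypothetical, and the Earth is assumed to be a sphere with the WGS 84 semi-major axis as its radius.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis in meters (sphere assumed)


def spherical_mercator(lon_deg, lat_deg):
    """Project geographic coordinates (degrees) to planar x, y (meters)."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = EARTH_RADIUS_M * lam
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y
```

The input is an angle on a sphere and the output is a position on a flat plane. In practice, GIS software such as QGIS performs these conversions for you once the coordinate reference system of a layer is set.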
Coordinate reference systems are used to accurately identify locations on the Earth's surface. There are two
types of reference systems: geographic (spheroid) and projected (planar).
Geographic coordinate reference systems are spherical systems referenced by their latitude and longitude
values, where the units of measurement are degrees. Decimal degrees (DD) express latitude and longitude
geographic coordinates as decimal fractions and are used in many geographic information systems. DDs are
an alternative to using degrees, minutes, and seconds (DMS). As with latitude and longitude, the values are
bounded by ±90° and ±180°, respectively. As shown in the map below, the coordinates for Buenos Aires are
in decimal degrees; those for Dar es Salaam are in degrees and minutes, while those for Lhassa are in DMS.
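Converting between the two notations is simple arithmetic: one degree is 60 minutes and one minute is 60 seconds, and southern and western coordinates are negative in decimal degrees. The sketch below shows the conversion; the function name and the sample coordinate are illustrative and are not read from the map.

```python
def dms_to_dd(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds and a hemisphere letter to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range 0 to 59.99...")
    dd = abs(degrees) + minutes / 60 + seconds / 3600
    # Latitude is bounded by 90 degrees, longitude by 180 degrees.
    limit = 90 if hemisphere in ("N", "S") else 180
    if dd > limit:
        raise ValueError("coordinate is outside the valid range")
    # South and west are negative in decimal degrees.
    return -dd if hemisphere in ("S", "W") else dd


# Illustrative value: 6 degrees 48 minutes 0 seconds South.
latitude = dms_to_dd(6, 48, 0, "S")
```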
Projected coordinate systems are planar reference systems based on a map projection. Map projections
represent the three-dimensional Earth as a two-dimensional Cartesian coordinate plane. One commonly used
system is the Universal Transverse Mercator (UTM). Other systems include Albers equal area and
Robinson. The UTM coordinate reference system (CRS) divides the Earth between 84°N and 80°S into 60
zones, each of which covers 6 degrees of longitude. The map below shows an example of the world using
the UTM CRS.
Source: Wikimedia Commons, 2007. The longitude and latitude zones in the Universal Transverse Mercator
system. https://commons.wikimedia.org/wiki/File:Utm-zones.jpg.
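Because each UTM zone spans 6 degrees of longitude, the zone number for a location can be worked out from its longitude alone. The sketch below assumes zones are numbered eastward starting at 180°W and ignores the irregular zones around Norway and Svalbard; the function name and the sample coordinates are illustrative.

```python
import math


def utm_zone(lon_deg, lat_deg):
    """Return the UTM zone number (1 to 60) for a longitude and latitude in decimal degrees."""
    if not -80.0 <= lat_deg <= 84.0:
        raise ValueError("UTM is defined only between 80°S and 84°N")
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    zone = int(math.floor((lon_deg + 180.0) / 6.0)) + 1
    return min(zone, 60)  # a longitude of exactly 180 is assigned to zone 60


# Approximate coordinates near Dar es Salaam (illustrative values).
zone = utm_zone(39.28, -6.80)
```

Knowing the zone matters because each zone has its own projected CRS; layers in different zones need to be reprojected before their planar coordinates can be compared.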
We map data to see geographic patterns and relationships between data features. As a general rule, mapping
follows four basic steps:
1. Frame your question of interest. In other words, what do you want to know about your data?
2. Gather available data to answer your question.
o Are existing data available to answer the question?
o Can you triangulate multiple data sources?
o Do you need to collect data?
o Do you need to digitize data?
3. Determine whether the question can be answered using a geographical approach.
o Would a chart, graph, or table be a better format to communicate your findings?
o Are you interested in displaying values or do you need a map to reveal patterns or
relationships in the data?
4. Select a method or combination of methods to help answer your question of interest.
The first step in conducting geospatial analyses is selecting the GIS software. A number of commercial and
free GIS software applications are available, each with its own advantages and disadvantages. Some
examples include
Commercial
ArcGIS (Esri)
MapInfo
Open source
QGIS
DevInfo
DIVA-GIS
Epi Map
DHIS
Selection of suitable GIS software will depend on a user's needs and resources. Some key factors to consider
include
This training will use the free and open-source software QGIS, an official project of the Open Source Geospatial
Foundation (OSGeo). QGIS allows users to create, edit, visualize, analyze, and publish geospatial
information, without the cost of commercial software.