August 2013

Technical Brief

How to approach a dataset
Part 1: Database design

Technical Brief Database Design

Contents
1 Introduction
a) Why do we need a database?
b) Excel as a simple database solution
2 Analysis plan
a) What should an analysis plan contain?
b) Transforming data collection units into reporting units
3 Data collection tool
4 Designing your data model
a) Database documentation
b) Define your tables
c) Define your rows
d) Define your columns
e) Column design steps
f) Define your data values
5 Prepare your database for data entry
a) Create drop down menus
b) Setting up named ranges
c) Creating cascading drop-down menus (advanced)
d) Creating a look-up for P-codes
6 Testing your database
7 Data cleaning and consolidation
a) Quality control of data entry
b) Validation of rules during data entry
c) Consolidating data from multiple sources
d) Cleaning of consolidated data
e) Categorization of open response questions
8 Documenting changes
9 Additional Resources

Acknowledgements
This document would not have been possible without the leadership, support and guidance of many
people. ACAPS especially wishes to express its gratitude to Emese Csete, Nigel Woof, Aldo Benini
and Benoit Munch for their thorough revision and critical insight.
1 Introduction
In an ideal world, a rapid onset disaster would be the instigating event for an equally rapid deployment
of a skilled assessment team, with sufficient resources at their disposal to quickly conduct a rapid
multi-sector needs assessment capable of guiding decision making. This team might be comprised of
an assessment coordinator, an information manager/analyst, sectoral specialists and an IT specialist.

In the potential absence of an information manager/data analyst, this technical note provides guidance
on how to set up a simple database suitable for storing the small amounts of data typically generated
by a rapid assessment with relatively small sample sizes. It is aimed at non-specialists in
information management who have a working knowledge of spreadsheet applications and need to set up
a suitable structure rapidly to support analysis. The document uses an example questionnaire and
database from the Joint Rapid Assessment for Northern Syria (JRANS).

Database design should be undertaken at the same time as designing your data collection tools,
methodology and analysis approach – therefore it is recommended that this note is first quickly read
in full at the start of the assessment design process, before undertaking any of the steps within.

Analysis planning, data collection tool design and data cleaning are covered in less detail. This note
should be read in conjunction with the technical notes How to Approach a Dataset Part 2: Data
Preparation and How to Approach a Dataset Part 3: Analysis, available on the ACAPS website at
http://www.acaps.org/resources/advanced-material .

a) Why do we need a database?

A database is a ‘tool that stores data, and lets you create, read, update, and delete the data in some
manner’. It does not matter whether you are using paper or a computer software program to collect and
store the data: if you have an organised collection of data collected for a specific purpose, then you
have a database. Databases offer certain advantages in terms of efficiency:
• Provide a centralised digital storage facility for data, which makes it easy to share
• Make retrieval and updating of specific information faster and easier, with the possibility of
using a number of different search criteria
• Allow easy updating of data
• Facilitate analysis by structuring data in such a way that it is simple to conduct calculations

b) Excel as a simple database solution

Choosing the right approach and the right software, and deciding how to model your data within your
database, depends on the processes which you will want to carry out on your data. Within the
context of a rapid assessment where technical database skills are limited, the following requirements
are key:
• Fast to set up
• Does not require specialist technical database skills or software licences
• Easy to enter data
• Structures data so as to facilitate analysis

For these reasons, the recommended approach outlined here is to use Microsoft Excel. Whilst this is
a spreadsheet application as opposed to database management software, Excel is very good for
entering, storing and analysing small amounts of data; it is much easier to learn, and it also
provides built-in analysis features which would require significant programming in a database
application.

2 Analysis plan

a) What should an analysis plan contain?

It is a common mistake for data collection tool design and sampling design to be seen as a very
separate issue from database design, data cleaning/coding and analysis outputs.

These steps are very closely linked; the information output which you get from an assessment will
depend on structuring the data in such a way that it can be analysed, and this in turn is constrained
by how the data has been collected (e.g. one community group discussion per site, or several
household interviews), and what has been collected (e.g. which questions were asked). If you design a tool
without thought to the other stages, you may come unstuck somewhere along the way by having
collected data which does not end up providing the information which you need.

Develop an analysis plan at the start of an assessment. This will help you to think through the links
between your information needs, the question/data request which will collect this data (e.g. ‘indicate
your top three priority sectors from the following list’), and your sampling strategy. It will contain
details about how the output of each question will be analysed to provide the desired information.
This will ensure that you know in advance how you will transform data into information.

It is important to make the distinction between data and information. Data are facts about the world,
whereas information is the result of processing raw data to reveal its meaning. Processing can mean
carrying out complex calculations, but it can also be as simple as organising data to reveal a pattern,
or extracting a key data item (e.g. who was the oldest participant?). In order to reveal meaning,
information also requires context, i.e. comparison with baseline data. Good decisions require good
information that is derived from raw facts.1

Your analysis plan should contain:

• Data collection units, e.g. community, household
• Reporting unit. This tends to be a location: either a community or an administrative unit.
• Transformation from data collection unit to reporting unit, if required
• Transformations from data inputs into information outputs

The types of data which are collected/reported and the way in which they are collected will affect the design
of the database structure, and also the ease of analysis.

Quantitative data is data which can be measured and analysed numerically, allowing it to be
presented as statistics, tables and graphs (e.g. number of affected households).

Qualitative data is descriptive, and can be observed but not measured in an exact way (e.g. types of
humanitarian needs – shelter, health).

Qualitative data can still be analysed in a quantitative way. For instance, a need for shelter is
qualitative data; however, a need for shelter in 50% of sites visited is a quantitative analysis from
qualitative data. The benefit of quantitative analysis of qualitative data is that it provides a succinct
summary, is easy to understand/interpret and allows for simple comparison.
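
As a minimal sketch of this idea, the Python snippet below turns qualitative records (the needs reported at each site) into a quantitative summary (the percentage of sites reporting each need). The site names, the needs and the function name `share_of_sites` are hypothetical and chosen only for illustration; they do not come from the J-RANS data.

```python
from collections import Counter

# Hypothetical records: one entry per site visited, listing the needs reported there.
sites = [
    {"site": "S1", "needs": ["shelter", "health"]},
    {"site": "S2", "needs": ["health"]},
    {"site": "S3", "needs": ["shelter"]},
    {"site": "S4", "needs": ["food"]},
]


def share_of_sites(records):
    """Return, for each need, the percentage of sites that reported it."""
    # set() ensures a need counts once per site even if it was listed twice.
    counts = Counter(need for record in records for need in set(record["needs"]))
    return {need: 100 * n / len(records) for need, n in counts.items()}


shares = share_of_sites(sites)
print(shares["shelter"])  # 50.0, i.e. shelter is needed in 50% of sites visited
```

The same calculation can be done in a spreadsheet by counting the non-empty cells in a column and dividing by the number of rows.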

1 Database Systems: Design, Implementation and Management (9th Edition), 2010

b) Transforming data collection units into reporting units

In some cases, you may not have a one-to-one relationship between units of
reporting and units of data collection. For instance, you may have the reporting unit of an
administrative area, e.g. district, but may have sampled at the household level.

Alternatively, your methodology may involve doing several key informant interviews, community group
discussions and direct observation in each community. These are both one-to-many relationships
between the reporting unit (community) and the data collection unit.

Your analysis plan should indicate how you will transform/aggregate/consolidate units of data
collection into units of reporting. For instance, in the previous example of a district reporting unit but
a household sampling, you may decide that you will take the majority view; in the case of quantitative
information, you could average it – so long as it is credible to treat each observation as equivalent.
For qualitative data, you could take the most popular response in the case of a single option question.
For multiple option questions, this becomes much more complex. You will need to give good thought
to whether the outputs of these calculations will be logical, and whether they will allow you to meet
your information needs.
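
The aggregation rules described above can be sketched in Python as follows. This is an illustration only: the field names (`district`, `hh_size`, `main_need`) and the sample values are hypothetical, each household is treated as an equivalent observation, and ties between equally popular responses are not handled.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical household-level records (data collection unit = household).
households = [
    {"district": "D1", "hh_size": 5, "main_need": "shelter"},
    {"district": "D1", "hh_size": 7, "main_need": "shelter"},
    {"district": "D1", "hh_size": 6, "main_need": "food"},
    {"district": "D2", "hh_size": 4, "main_need": "health"},
]


def aggregate_by_district(rows):
    """Aggregate household rows to one summary per district (reporting unit)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["district"]].append(row)

    summary = {}
    for district, members in groups.items():
        needs = Counter(m["main_need"] for m in members)
        summary[district] = {
            "n_households": len(members),
            # Quantitative variable: average across households.
            "mean_hh_size": mean(m["hh_size"] for m in members),
            # Single-option qualitative variable: most popular response.
            "majority_need": needs.most_common(1)[0][0],
        }
    return summary


by_district = aggregate_by_district(households)
print(by_district["D1"])
```

Recording the number of households behind each summary row makes it easier to judge how much weight a district-level figure can bear.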

You might also face the situation where in each location, several data collection techniques are used
a different number of times – for instance, several key informant interviews, community group
discussions and direct observations. If these cover some of the same variables (recording the same
information but from different sources, e.g. asking both key informants and community group
discussions for their priority sector for response), it will be less credible to treat these with equal
weight. The outputs of a community group discussion already represent a larger consensus than those of a
single key informant, and so should not necessarily be weighted equally. Also, direct observation is often used as
a technique for the assessment team to verify visually what they have been told in key informant interviews or
community group discussions. How will this information be cross-checked against other information?

In these more complicated scenarios, it becomes difficult to define hard and fast rules for aggregating
to the reporting unit. As this requires judgement rather than calculation, this is a task which is best
carried out by the assessment team, immediately after the data is collected. Having interacted with
respondents, the assessment team are the ones best placed to determine what the ‘correct’ response
is. When sufficient trust is placed in the enumerators and when time constraints are important, final
conclusions of the assessment teams after the field visit should be recorded in one single form, the
one that will be used in the database.

Finally, consider whether there is validity to maintaining some views separately. For instance, if
interviewing individuals, rather than averaging responses at the community level, you may want to
average the female and the male view separately and maintain both, so as to be able to disaggregate
your analysis by gender. If conducting separate male and female community group discussions (CGDs) and
key informant interviews (KIs), you may want to consolidate this information into a female and a male
viewpoint for each community; this consolidation could be carried out by the assessment team. Figure 1 shows an extract from the Analysis plan from
the Pakistan MIRA, demonstrating the linkages between information needs, indicators, required data,
data sources and analysis type.


3 Data collection tool

This document does not cover questionnaire design in detail. If a thorough job is done on the analysis
plan, the data collection tools should be “fairly” simple to design.

Often, data collection tools are designed before an analysis plan has been written and sampling
methodology has been determined. Whilst this is not ideal, the retrospective development of an
analysis plan may still help to highlight any pitfalls and limitations to the tool which could limit analysis.
Some recommendations to support later analysis are outlined in Table 1 below.

Table 1. Recommendations for data collection design

Quantitative

Units: Ensure reporting units are the same, either by stipulating the unit of measurement (e.g. number
of affected households) or at the very least by ensuring that the unit is reported.

Definitions: If the data could be open to interpretation, always provide a definition (e.g. “affected
population”).

Null responses: Ensure that it will be possible to differentiate between a response of zero and a
null response. For example, if a question asks for affected population and is left blank, is it because
there are no affected people, or because the respondent does not know? Wherever clarity is
required to differentiate this in your responses, consider adding a ‘do not know’ or ‘not
applicable’ option, as appropriate to the question.

Qualitative

Create categories in advance: Qualitative data can be analysed in a quantitative way if responses are categorised.
Categories should be defined in advance, and applied at the point of data collection; this can
be done either by presenting a closed question of set options, or by providing an open
question, the answer to which is then categorised or recoded by the data collector at the point
of information recording.

Don’t assume you know it all: The downside of pre-defined categories is that they assume that you already know all the
possible options. In order to allow for unanticipated responses, include an ‘other’ category,
always ensuring that details of what the ‘other’ response is are recorded.

Single response vs. multiple response: Single choice options are suitable in cases where options are mutually exclusive, and will
allow simple and intuitive statistics to be carried out, e.g. ‘40% of respondents were displaced,
30% were staying with family and 30% were still living in their homes’.

In certain cases, responses may only be fully described through a combination of options
(multiple choice), e.g. there is a problem with access to water because water sources are
contaminated and security prevents access to water points. The analytical outputs of multiple-
response questions can be more complex, but can be a more accurate representation of
reality and variability in the context.

If it is necessary to categorise responses into exclusive categories, then consider also asking
respondents to select their main/priority/dominant category (remember that they are the only
ones who can do this!). You can do this by designing the question as a ranking, where the
respondent indicates the relative priority of responses.
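
To illustrate the recommendation on null responses, the sketch below shows one possible way of keeping a reported zero distinct from a blank or a ‘do not know’ when raw entries are processed. The function name `parse_count` and the status labels are assumptions made for this example, not part of any ACAPS tool.

```python
def parse_count(raw):
    """Classify a raw questionnaire entry and return (value, status).

    A reported zero is kept as the number 0, whereas a blank entry or an
    explicit 'do not know' / 'not applicable' yields None with a status
    explaining why no number is available.
    """
    text = raw.strip().lower()
    if text == "":
        return None, "blank"
    if text in {"do not know", "not applicable"}:
        return None, text
    # Remove thousands separators such as the comma in '2,500'.
    return int(text.replace(",", "")), "reported"


print(parse_count("0"))            # (0, 'reported')
print(parse_count(""))             # (None, 'blank')
print(parse_count("Do not know"))  # (None, 'do not know')
```

Storing the status alongside the value means that averages can be calculated over reported numbers only, without silently treating unknowns as zero.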


Figure 1. Extract from Aleppo J-RANS Analysis plan – WASH Section

Columns: Information needs, Indicator, Indicator needs, Data source, Analysis type, Question, Sample visualization (images).

Information needs: Main issues in water supply
Indicator: Main problems in water supply as expressed by the population
Indicator needs: Frequency of problems reported due to access issues; frequency of problems reported due to availability issues
Data source: Local population, relief committees, Head of HH, Water committee, local organization, NGOs
Analysis type: Breakdown per areas
Question: Is there a serious problem regarding water in this neighbourhood? If yes, I am reading a list of possible problems (Select max five most serious problems)

Information needs: Main issues in Sanitation
Indicator: Main problems in Sanitation as expressed by the population
Indicator needs: Frequency of problems reported due to access issues; frequency of problems reported due to availability issues
Data source: Local population, relief committees, Head of HH, Water committee, local organization, NGOs
Analysis type: Breakdown per areas
Question: Is there a serious problem regarding sanitation and hygiene in this neighbourhood? If yes, I am reading a list of possible problems (Select max five most serious problems)

Information needs: Affected groups
Indicator: Ranking of groups the most at risk as reported by the population
Indicator needs: Groups the most vulnerable in the WASH sector
Data source: Local population, relief committees, Head of HH, Water committee, local organization, NGOs
Analysis type: Breakdown per areas and priority rank
Question: Regarding the lack of safe water, which group is most at risk? (rank top three: 1=first rank, 2=second rank, 3=third rank)

Information needs: Response capacity
Indicator: Type and regularity of assistance provided in the WASH sector
Indicator needs: Frequency of intervention reported in the WASH sector; type of assistance provided
Data source: Local population, relief committees, Head of HH, Water committee, local organization, NGOs
Analysis type: Breakdown per areas; breakdown by humanitarian actor
Question: Which organizations have been providing regular water, sanitation or hygiene support in this neighbourhood over the past 30 days?
• Type of organization
• Organisation responsible
• Regular or one off support

Information needs: Severity of conditions
Indicator: Severity of problems
Indicator needs: Severity status on a life-saving scale
Data source: Local population, relief committees, Head of HH, Water committee, local organization, NGOs
Analysis type: Breakdown per areas
Question: Overall, which of the following statements describes best the general status of water supply? (Circle one right answer)


4 Designing your data model

a) Database documentation

A lack of documentation of records and variables is a commonly identified issue when a new dataset
is received by an analyst who did not design it2. In order to ensure that the database is never
separated from this information, create supporting documentation in worksheets in the same
workbook as your database. Table 2 summarises the various worksheets which will eventually be
needed, some of which will be covered in later sections of the document.

Table 2. Supporting documents

Database: Where all data will be stored. If your data model contains several tables, there will be
one worksheet per table.

Data dictionary/codebook: Worksheet containing all variable names, types, data formats, categorical values.
Demonstrates how each database field relates back to the data collection tools.

Domains: Contains all of the lists of categories used within the database.

P-code lookup: Contains all of the administrative boundary information.

Change log: For keeping a record of all modifications made to data within the database.

A data dictionary contains the ‘metadata’ for your data model, essentially all of the information
necessary for someone else to understand your database. Data dictionaries document what each of
your variables is, its name, its data type, and the codes for categorical values. A
codebook is a similar document, but provides a technical description of a data file, describing how
the data are arranged in the files, what the various numbers and letters mean, and any special
instructions on how to use the data properly.

For the purpose of a rapid assessment, you can combine both codebook and data dictionary
functionality in one document which describes not just your database, but relates the database design
back to your data collection instruments. An example codebook can be found within the example
database attached to this document. In your codebook, you should include the following:
• Name of each column
• Variable type (relating the data back to the relevant data collection item, or if the variable is
calculated, indicating how it was calculated)
• Data format (number, category, text, date)
• Categorical values
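
The four items listed above can be thought of as one record per database column. The Python sketch below is one possible representation, not the layout of the example database: the class name `CodebookEntry`, its field names and the helper `undocumented` are hypothetical, and the column names are borrowed from the examples in Table 3.

```python
from dataclasses import dataclass, field


@dataclass
class CodebookEntry:
    column: str       # name of the database column
    source: str       # questionnaire item, or how the variable was calculated
    data_format: str  # number, category, text or date
    categories: list = field(default_factory=list)  # allowed categorical values


codebook = [
    CodebookEntry("E2_HC_YN", "Question E2", "category", ["Yes", "No", "Do not know"]),
    CodebookEntry("B1_popn", "Question B1", "number"),
]


def undocumented(columns, entries):
    """Return the database columns that have no entry in the codebook."""
    documented = {entry.column for entry in entries}
    return [column for column in columns if column not in documented]


# 'X9_new' stands in for a column added to the database but never documented.
missing = undocumented(["E2_HC_YN", "B1_popn", "X9_new"], codebook)
print(missing)  # ['X9_new']
```

A check of this kind is a quick way to confirm that the codebook and the database have not drifted apart.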

2 Technical brief: How to approach a dataset. Part 1: Data preparation. Aldo Benini, 15/03/2013


b) Define your tables

Figure 2 shows the four steps in designing the structure of your database. The first step is to decide
how many tables you need. This will relate strongly back to your analysis plan. As the aim is simplicity,
the ideal would be one table, where each row represents one unit of reporting, and each column holds
a different variable. This will make your life much easier when it comes to analysis.

Figure 2. The four steps in designing your data model

More than one table means you have a relational database. Whilst Excel can be used to model either
a non-relational or a relational database, it lacks the functionality of full database applications (Access,
SQL and Oracle) which helps manage these relationships. Keeping the number of tables
to a minimum will therefore avoid the manual management of complex relationships. Key principles are as
follows:
• Data should be organised to support analysis, not to reflect the data collection tools.
• Minimise the extent to which you have to analyse across tables.
• Wherever you have the same data structure, store information together (e.g. if you have a
questionnaire which can be applied to both male and female KIs, store these in the same table).

Databases are information containers, which hold digital data in a way that allows a user to interact
with it.

The easiest way to store small quantities of data is as a non-relational database - which is a two
dimensional array of data - rather than as a relational database, which models data in terms of the
relationships between different sections of the data (resulting in several tables of information
interlinked through the use of unique keys, allowing one to many relationships between different data
sections).


Non-relational databases are ideally suited to data with the same number of data fields for each
record. They have some drawbacks compared to relational database models (such as more
redundancy), but for small amounts of static data such as from a rapid assessment, the advantages
in terms of ease of analysis outweigh these constraints:
• User-friendly: This is a very simple model which most people can easily visualise, and which can
easily be modelled using a spreadsheet application.
• Ease of analysis: This approach structures data immediately into a format which will easily allow
analysis.

c) Define your rows

One row should represent one record/one sample, with each record being the basic unit of reporting
for your assessment. This should be evident from the information needs outlined within your analysis
plan, and will probably correspond with one set of responses at the data collection stage. For instance,
if your information need is to know the proportion of households assessed who need shelter, then
your rows would represent households. Structuring one row per response unit will allow for an easy
analysis, by allowing meaningful calculations at the level of each column.

d) Define your columns

Each column, or database field, should contain one variable/discrete unit of information. Each column
should be defined in your code book (see Figure 3), giving it a variable name and relating the field
back to the relevant section of the data collection tool. In some cases, it will be possible to structure
one field to contain the response to one question, though often more than one field will be necessary.
Table 3 provides examples of how certain common question types can be mapped across columns.
Section e contains column design steps.

Figure 3. Questionnaire section with corresponding code book excerpt

e) Column design steps

Unique ID in first column: In the first column, keep a unique ID which allows you to cross reference
between rows in the database and original paper/digital questionnaires or forms, to allow you to trace
data back to its source. Using an alpha-numeric code can be useful for recording both the
identity of the assessment team and the number of the returned questionnaire (e.g., ‘EC07’).
Because alpha-numeric codes cannot be sorted as effectively as numeric codes, an additional column
containing numeric values can be added to the database as the record number, to facilitate
sorting.
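
The sorting issue can be illustrated with a short Python sketch. It assumes, purely for illustration, that an ID consists of letters identifying the assessment team followed by the questionnaire number; the pattern and the function name `split_record_id` are hypothetical.

```python
import re

# Assumed ID layout: team letters followed by a questionnaire number, e.g. 'EC07'.
ID_PATTERN = re.compile(r"^([A-Za-z]+)(\d+)$")


def split_record_id(record_id):
    """Split an ID such as 'EC07' into a team code and a numeric part."""
    match = ID_PATTERN.match(record_id)
    if match is None:
        raise ValueError(f"Unexpected ID format: {record_id!r}")
    team, number = match.groups()
    return team.upper(), int(number)


ids = ["EC10", "EC7", "AB2"]

# Plain text sorting compares character by character, so 'EC10' lands before 'EC7'.
text_sorted = sorted(ids)

# Sorting on (team, number) gives the order a reader would expect.
numeric_sorted = sorted(ids, key=split_record_id)

print(text_sorted)     # ['AB2', 'EC10', 'EC7']
print(numeric_sorted)  # ['AB2', 'EC7', 'EC10']
```

Padding the number with leading zeros, as in ‘EC07’, avoids the problem for as long as the number of digits stays fixed; a separate numeric record number avoids it altogether.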

Unique column headers: Create unique column headers (necessary for managing data entry if done
through a form, and also for cross-tabbing data during analysis). The column headers should be kept
short (e.g. the question number, not the whole question) and should allow you to easily reference
between the column contents and the questionnaire. When many columns refer to the same question
number (this is the case with multiple choice, multiple option questions), extend the code to be
question number plus a code to represent the response (e.g. see Figure 3; each response to question
J2 is coded with ‘j2_sc_’ then a code representing the answer, e.g. ‘nf’ for ‘Schools not functioning’).
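
A naming convention of this kind can be generated rather than typed by hand. The sketch below builds headers in the ‘j2_sc_nf’ style and checks that they are unique. Only the ‘nf’ answer code comes from the example above; the codes ‘dm’ and ‘oc’ and the function name `multi_choice_headers` are invented for illustration.

```python
def multi_choice_headers(question, topic, answer_codes):
    """Build one short, unique column header per answer option."""
    headers = [f"{question}_{topic}_{code}".lower() for code in answer_codes]
    if len(set(headers)) != len(headers):
        raise ValueError("Column headers must be unique")
    return headers


# 'nf' = 'Schools not functioning'; 'dm' and 'oc' are hypothetical answer codes.
headers = multi_choice_headers("J2", "sc", ["nf", "dm", "oc"])
print(headers)  # ['j2_sc_nf', 'j2_sc_dm', 'j2_sc_oc']
```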

Additional column headers: In your code book, add additional header rows to help relate the
responses back to the questionnaire text. See Figure 3.

Stratification data near start: In the columns following the unique ID, include columns for the main
factors which will be used to stratify the analysis (major elements of comparison, e.g. displaced/host,
urban/rural, male/female, geographical area - as outlined within your analysis plan).

Geographical information: Geographical information must be recorded accurately in order to ensure
that information is attributed to the correct location, allowing mapping and comparison against other
data. Assessment information is often collected and recorded by administrative unit. The best way to
record the administrative unit is by using the P-code; get an up-to-date P-code list which you can use
to design the geographical information columns. If your information is not being gathered by
administrative unit but by specific location, geographical coordinates should be recorded. In the latter
case, use separate fields for latitude and longitude; do not put both into one field.
See the box on geographical data / P-codes for more on recording geographical information.
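
One benefit of separate latitude and longitude fields is that each can be checked on its own. The sketch below assumes coordinates are stored in decimal degrees, which the text above does not specify; the record values and the function name `check_coordinates` are illustrative only.

```python
def check_coordinates(latitude, longitude):
    """Return a list of problems found in a latitude/longitude pair.

    Assumes decimal degrees: latitude in [-90, 90], longitude in [-180, 180].
    """
    problems = []
    if not -90 <= latitude <= 90:
        problems.append("latitude out of range")
    if not -180 <= longitude <= 180:
        problems.append("longitude out of range")
    return problems


# Illustrative record with latitude and longitude held in separate fields.
record = {"id": "EC07", "latitude": 36.2021, "longitude": 37.1343}

print(check_coordinates(record["latitude"], record["longitude"]))  # []
print(check_coordinates(137.1343, 36.2021))  # ['latitude out of range']
```

A range check catches only some entry errors, for example a value typed into the wrong field when it exceeds 90; swapped values that both fall within range will pass.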

Map remaining questions: Map out the remaining questions across the other columns. Stick to the
same order of items as in the data collection tool to allow easier navigation through the data.


Table 3. Mapping questions to columns

Example question Example response structure Example database structure

Number field where unit is specified: one column


B1. Estimated population in Estimated number of Responses B1_popn
neighbourhood: population in 2,500 Data values: Number: age
R1 2,500
neighbourhood:
R2
R3

Number field where unit is variable: two columns


B1. Estimated population in Estimated population in Responses D7_sitepop D7_sitepop_u
neighbourhood: 2,500 Units: Individuals
neighbourhood: R1 2,500 Individuals
R2
R3
Data values: Data values:
Number Freetext: units

Single choice option list: one column


E2. Is there a serious problem Responses E2_HC_YN
because people are not able x Yes No Do not know R1 Yes Dropdown menu:
to get adequate health care Yes, No, Do not know
R2
for themselves?
R3

12
Technical Brief Database Design

E2. If yes, I am reading a list Not enough health facilities available


of possible problems. Select Responses E2_HC_main
x Lack of medicines Lack of Dropdown menu:
the main reason. Not enough health facilities available ,
Lack of medical staff R1 medicines ,

lack of medicines, etc.


No access to health services due to physical/logistical constraints R2
No access to health services due to security constraints
R3
No access to health services due to limited economic resources

Single choice option list with ‘other’ category: two columns


E2. If yes, I am reading a list Not enough health facilities available
of possible problems. Select Responses E2_HC_main E2_HC_oth
Lack of medicines
the main reason, or use R1 Other No birthing service
‘other’. Lack of medical staff
No access to health services due to physical/logistical constraints R2
No access to health services due to security constraints R3
No access to health services due to limited economic resources Dropdown menu Data values:
x Other (give details) : No birthing service Not enough health facilities Freetext: Details of ‘other’
available, lack of medicines, etc.

Multiple choice list: as many columns as options


Questionnaire:
  E2. If yes, I am reading a list of possible problems. Select all which apply.
      [x] Not enough health facilities available
      [ ] Lack of medicines
      [ ] Lack of medical staff
      [x] No access to health services due to physical/logistical constraints
      [ ] No access to health services due to security constraints
      [ ] No access to health services due to limited economic resources

Database:
  Responses   E2_1   E2_2   E2_3   E2_4   E2_5   E2_6
  R1          1                    1
  R2
  R3
  Data values: 1 or 0 (Boolean)


Multiple choice list with ‘other’ category: as many columns as options +1


Questionnaire:
  E2. If yes, I am reading a list of possible problems. Select all which apply.
      [x] Not enough health facilities available
      [ ] Lack of medicines
      [ ] Lack of medical staff
      [x] No access to health services due to physical/logistical constraints
      [ ] No access to health services due to security constraints
      [ ] No access to health services due to limited economic resources
      [x] Other (give details): No birthing service

Database:
  Resp.   E2_1   E2_2   E2_3   E2_4   E2_5   E2_6   E2_7   E2_o
  R1      1                    1                    1      No birthing service
  R2
  R3
  Data values (E2_1 to E2_7): 1 or 0 (Boolean)
  Data values (E2_o): Freetext: Details of ‘other’
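
The mapping from ticked options to one Boolean column per option can be sketched in Python. This is only an illustration of the layout described above, not part of the example workbook; the option order and the function name encode_multiple_choice are assumptions.

```python
# Assumed option order for question E2; the seventh option is 'Other'.
E2_OPTIONS = [
    "Not enough health facilities available",
    "Lack of medicines",
    "Lack of medical staff",
    "No access to health services due to physical/logistical constraints",
    "No access to health services due to security constraints",
    "No access to health services due to limited economic resources",
    "Other",
]


def encode_multiple_choice(ticked, other_text=""):
    """Return one database row: E2_1..E2_7 hold 1 or 0, E2_o holds free text."""
    row = {}
    for position, option in enumerate(E2_OPTIONS, start=1):
        row[f"E2_{position}"] = 1 if option in ticked else 0
    # The free-text column is only filled when 'Other' was ticked.
    row["E2_o"] = other_text if "Other" in ticked else ""
    return row


example_row = encode_multiple_choice(
    {"Lack of medicines", "Other"}, other_text="No birthing service"
)
```

Storing an explicit 0 for unticked options, rather than leaving the cell empty, keeps column sums and counts unambiguous.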

Ranking of pre-set options: as many columns as options


Questionnaire:
  F5. Which group is most at risk of having not enough food to survive in this neighbourhood?
      (rank top three: 1=first rank, 2=second rank, 3=third rank)
      1      Displaced people living in host families
      ----   Displaced people in collective shelter (schools, camps, etc.)
      3      Displaced people in vacated buildings
      2      Resident population hosting displaced persons
      ----   Resident population who have not been displaced

Database:
  Responses   f5_fv_dh   f5_fv_dc   f5_fv_dv   f5_fv_rh   f5_hv_rn
  R1          1                     3          2
  R2
  R3
  Data values: Dropdown menu: 1, 2 or 3. You could also make the dropdown menu show ‘first’, ‘second’ and ‘third’, so that it is easy to differentiate this ordinal data from nominal data.

Ranking of non pre-set options: as many columns as ranks


Questionnaire:
  E3. Which specific health interventions are most urgently required in this neighbourhood? (Enter short description)
      First rank: Treatment for skin infections
      Second rank: Birthing services
      Third rank: Vaccinations

Database:
  Responses   e3_hi_r1                        e3_hi_r2            e3_hi_r3
  R1          Treatment for skin infections   Birthing services   Vaccinations
  R2
  R3
  Data values: Freetext


Geographical data / P-codes

P-codes are unique identification codes, represented by combinations of letters and/or numbers to
identify a specific administrative area or location. These are commonly used when mapping data, in
order to allow information to be linked to geographical boundaries and represented on a map. As well
as ensuring a unique reference system, using P-codes rather than names also avoids issues of
inconsistent admin name spelling, which are not uncommon when names have been
translated from their original language.

Geographical areas are often a key disaggregation factor in rapid assessments. The accurate
recording of the location where information was collected is essential; however, the names of
administrative units can sometimes be duplicated within a country, within different regions (for
instance, in the US there are 13 cities, 11 towns, and 14 townships named Springfield). In order to
make sense of administrative areas, these must be recorded in a way in which they are unique. This
can be by reporting the full administrative hierarchy within which it is situated, e.g. Springfield,
Massachusetts. This is a very commonly applied approach often seen within assessments, where
location will be recorded as Admin level 1, Admin level 2, Admin level 3 (e.g.: Province, District,
Commune).

Another way to uniquely identify geographical areas is by recording corresponding P-Codes (codes
allocated to each administrative area as a means of identifying them uniquely). Recording the p-code
for any location based information will ensure that the information is not attributed to the wrong
location at a later stage due to name duplication, and will also allow the information to be easily
imported into a GIS system, linked to boundary data and displayed on a map. It will also allow for an
easy comparison between assessment outputs and data generated by other institutions (for instance,
census data showing pre-disaster population figures).

Whilst the actual names of the administrative areas will be used for reporting (no-one tends to
memorise the list of p-codes!), it is desirable to ‘translate’ this into p-coded data when entering the
information into the database. To do this, ensure that you have an up to date copy of the p-coded
administrative areas for the country. These can normally be found on each country’s Humanitarian
Response website, under the Common Operational Datasets (CODs), a registry of which can be
found on the main humanitarian response website, here: http://cod.humanitarianresponse.info/.
OCHA are a good point of contact for up to date P-code lists in country.

In cases where information is not being collected and recorded by administrative unit, it may be
desirable to record the location of data collection for later use. Assessment teams will need access
to a GPS and knowledge of how to use it. Coordinates should be collected in one
pre-agreed coordinate system (e.g. WGS84 Latitude/Longitude coordinates) and reported in a
common format, preferably decimal degrees. This will eliminate the need for coordinate conversion,
which can be both time consuming and error prone if coordinates are not transcribed correctly.
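
To illustrate the conversion that a common decimal degrees format avoids, the sketch below converts degrees, minutes and seconds into decimal degrees. The function name and the hemisphere letters are assumptions made for this example.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to decimal degrees.

    Southern and western hemispheres are returned as negative values.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"Unknown hemisphere: {hemisphere}")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Hypothetical reading: 36 degrees 12 minutes 0 seconds North
latitude = dms_to_decimal(36, 12, 0, "N")
```

Every manual conversion of this kind is a chance to mistype a value or drop a sign, which is why agreeing one format before data collection is preferable.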

f) Define your data values

In your codebook, define the data type for each column (e.g. number, text, date). If the data value is
one of a list of options, define the data type as a domain, and list the potential options (domain values)
to each question (see figure 4). These will be used in the next section to constrain data entries. Where
necessary, allocate codes to differentiate between zero and ‘no reply’ (Table 4). If you choose to code
all of your responses, you should list the codes and the response which they correspond to (see box
below on coding responses).

Figure 4. Example code book with response options

Table 5. Defining data values

Data type: In your code book, define the format of the data to be entered, e.g. number, text, date, domain
(categories).

Domain values: Where a set of responses can be defined, create a list of all of the options. This will be used as
the basis for drop-down lists, which will speed up data entry and ensure spelling consistency.

Null value codes: Where you are recording numbers, identify a code for null responses, in order to differentiate
them from zero. For instance, in a field where you are recording age (a number), if a data
collection form is returned with an incorrect entry of ‘male’ and no age, then this should be
recorded in the database with a specific code. Choose a non-numeric code (e.g. ‘Null’) to ensure
that descriptive statistics can be generated without errors.
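
The effect of a non-numeric null code can be sketched as follows. The code ‘Null’ follows the example in Table 5; the column contents and function name are made up for illustration.

```python
from statistics import mean

NULL_CODE = "Null"  # assumed code for 'no reply', as suggested in Table 5


def numeric_values(column):
    """Keep genuine numbers (including zero) and skip the null code."""
    return [value for value in column if value != NULL_CODE]


# Hypothetical 'age' column: 0 is a real value, 'Null' means no reply.
ages = [25, 0, "Null", 40, "Null", 31]
valid_ages = numeric_values(ages)
average_age = mean(valid_ages)
missing_count = ages.count(NULL_CODE)
```

Had the missing replies been stored as 0, the average would have been pulled down and the two kinds of entry could no longer be told apart.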

To code or not to code?

In some situations (see example in Figure 4 for question D1), your response options may be very
wordy. Good database etiquette would be to replace these options with a code (e.g. 1, 2 or 3 in the
previous example); storing only one number per respondent reduces redundancy in the database. In
a relational database, the ‘translation’ for the code is normally also stored at the database level.
While you could implement this for your database, the issue of storage space is unlikely to be a
consideration when dealing with small data sets from rapid assessments. Furthermore, introducing a
process of translation from text to code at data entry, then from code back to text during analysis, is
both time consuming and leaves you more open to errors. For this reason, the examples in this
document are all uncoded.


5 Prepare your database for data entry

Your code book now contains the outline of the structure of your database. Create a copy, keeping
the column header rows and deleting the data values; this will be your final database. If you will be
conducting data entry directly into the spreadsheet, these header rows will help to ensure that the
person doing the data entry can orientate themselves easily between the questionnaire and the
database. If conducting data entry through a form, you may need to reduce this to one column header,
in which case keep the column ID header, and remove the other header rows.

a) Create drop down menus

The final step is to create drop down menus to support data entry, and to validate the incoming data.
If you will be entering data directly into the spreadsheet, you will do this in the cells of the database
spreadsheet. If you will be setting up a form for data entry, you can do it in the form. As form design
is beyond the scope of this document, this section will focus on setting up the validation within the
workbook.

Drop down menus help to constrain data entry, the advantages being:

• Data entry will be faster
• Erroneous values will be prevented
• Spellings will be constrained. This is essential for ensuring that you will be able to automatically
sum, filter and cross tabulate information. For instance, if a field containing ‘gender’ contains both
the values ‘female’ and ‘woman’, it will not be possible to automatically sum up all ‘female’
responses unless data is cleaned. Even additional white spaces can prevent Excel from matching
different values.
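
The matching problem described in the last point can be sketched outside Excel. The domain list and the entered values below are hypothetical.

```python
GENDER_DOMAIN = ["female", "male"]  # hypothetical domain values


def invalid_entries(entries, domain):
    """Return (row number, value) pairs that do not match the domain exactly."""
    return [
        (row, value)
        for row, value in enumerate(entries, start=1)
        if value not in domain
    ]


# 'woman' and 'female ' (trailing space) both fail an exact match.
entered = ["female", "woman", "male", "female "]
problems = invalid_entries(entered, GENDER_DOMAIN)
female_count = entered.count("female")
```

Only one of the three intended ‘female’ responses is counted, which is the kind of silent undercount that drop-down menus are meant to prevent.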

To create drop-down menus in Excel, use the ‘data validation’ function. This is set up at the level of
the cell, allowing you to specify the list of acceptable options. Highlight the cell which should have the
dropdown list, then go to the ‘Data’ menu and select data validation. In the data validation dialogue
box (figure 5), select the method of validation to be a list.

Figure 5. Data validation dialogue box


For the source of the list, whilst you could reference the list of options in the codebook directly by typing
in the reference of the cells containing the options (e.g. ‘=A1:A5’) or by selecting them manually, it is
preferable to work with named ranges. These are names assigned to a range of cells, allowing the
range to be referenced in formulas by name rather than by hard coding the location of the cells.
To reference a named range, type an equals sign followed directly by the name of the range (e.g.
‘=elec_func’). The following section details how to set up and manage named ranges.

Only set up your data validation in the first cell immediately below the column header. Once you have
set this for all columns, you can use the autofill function to copy all of these validation rules to the
remaining spreadsheet – this will save you copy/paste time, and will also ensure that you don’t have
any erroneous rows with the wrong formula.

Why work with named ranges?

The advantage of working with named ranges rather than hard coding absolute cell references relates
to database upkeep. If there are any changes to the questionnaire structure (last minute changes to
the current data collection format, or re-assessment at a later stage with slight modifications to the
questionnaire), the database and code book will also need to be updated.

If you have lists of response options which are used several times within the database (in the example
database, see ‘source reliability’, with options ‘1= reliable’, ‘2= fairly reliable’, ‘3=unreliable’) and the
cell references are hard coded, then for any change to this list (e.g. to add ‘0=no response’) it would be
necessary to go through the entire database and change the absolute cell reference in every column
which references this list.

Once you have set up cell validation, there is no way to easily see which cells are referenced within
the validation without clicking on each cell and inspecting the formula for the source. If you have
several columns within your database that reference the same set of cells where a list of options is
given, any additions to or removals from the list of options would need to be changed in all of those cell
validations.

However, if you have set up a named range which references the three options, and have set up your
data validation in all of the relevant columns to reference that named range, you will only need to
update the named range to include the additional option, rather than having to hunt through your
worksheet to find all references to it.

The named range will exist across the whole workbook, which is particularly useful if you have chosen
to create a data model across several different worksheets. Another advantage is that it makes your
formulas easier to read, therefore easier for the database to be used by others.
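
The maintenance benefit of named ranges can be sketched by analogy in Python: each column refers to a list by name, so the list is edited in one place only. The column names below are hypothetical; the option labels are those of the ‘source reliability’ example.

```python
# One shared definition of each domain, comparable to a named range.
DOMAINS = {
    "source_reliability": ["1= reliable", "2= fairly reliable", "3=unreliable"],
}

# Hypothetical columns that all validate against the same domain.
COLUMN_DOMAIN = {
    "A_reliability": "source_reliability",
    "B_reliability": "source_reliability",
}


def allowed_values(column):
    """Look up the options for a column through the name of its domain."""
    return DOMAINS[COLUMN_DOMAIN[column]]


# A single edit to the shared list reaches every column that references it.
DOMAINS["source_reliability"].append("0=no response")
```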

b) Setting up named ranges

Each range which you will reference should have a unique name. If you re-use the same list of options
repeatedly within the database, you will only need to set up one named range.
To set up a named range, highlight the cells containing the options. Whilst highlighted, edit the name
box next to the formula bar, entering a unique and meaningful name for the range, and press enter
(figure 6).


Figure 6. Named range creation and storage

Set up a separate spreadsheet specifically for the domains (Figure 7, from the sheet ‘domains’ in
example workbook), avoiding duplication if several questions have the same lists of response options.

Figure 7. Named ranges stored with names

Show each range below the name of the range to make it easier to read and therefore manage. The
domain sheet will need to always be kept with the database, to ensure these named ranges can be
used within the database.

All of the named ranges within a workbook can be viewed and managed through the Name Manager,
which can be found under the Formulas menu (Figure 8).

Figure 8. Name manager


c) Creating cascading drop-down menus (advanced)

In some circumstances, you may want the list of options in a dropdown menu to vary according to a
previous choice. This is the case for administrative areas – once an Admin 1 area (in the example
workbook, this is the Governorate) is chosen, offering only the admin 2 levels (District) which are
found within that Governorate helps both to reduce the number of options to a more manageable list,
and also ensures the validity of responses.

This section outlines one method of achieving cascading drop down menus, though there are several
ways to add this functionality. It is recommended that you keep all of the administrative data in one
sheet in the workbook, in which you can create all of the relevant lists.

Set up lists: In your spreadsheet of administrative areas, set up your administrative data so that you
have:
• One list of unique Admin 1 (Governorate) areas
• One list of unique Admin 2 (District) areas, alongside a column of the Admin 1 areas these relate
to.

Set up variables: Create named ranges which will be referenced in formulas:
• Admin 1 names (in the example workbook, the named range is called Governorate).
• The column header of the list of Admin 1 names belonging to the Admin 2 list (in the example, this
is called StartGov).
• The whole column of the list of Admin 1 names belonging to the Admin 2 list (in the example, this
is called ColGov).

Define formula: Create another named range containing the formula for looking up the set of Admin
2 names (in the example workbook, the named range is called District). You will need to do this from
the Name Manager. The following formula can now be used, where Database!RC[-2] is the
reference to the cell two columns to the left, which contains the selected Admin 1 name.

District=OFFSET(StartGov,MATCH(Database!RC[-2],ColGov,0)-1,1,COUNTIF(ColGov,Database!RC[-2]),1)

This works by finding the starting position of the Admin 1 name in the list alongside the Admin 2 names,
and returning all of the corresponding Admin 2 names which have the same Admin 1 name. Because
OFFSET returns a single contiguous block of cells, the Admin 2 list must be sorted by Admin 1 name.

Set validation: In the database, set up validation:
• For the Admin 1 name column (Column G in the example database), data validation is simply the
list of unique Admin 1 names (Governorate)
• For the Admin 2 name column (Column I in the example database), data validation is set as the
named range containing the formula defined above (in the example, District).

This process can be repeated for further cascading menus; in the example database, Admin 3 (Sub-
district) dropdown menus are also dynamic, referencing the named range called SubDistrict, which
contains the following formula:

SubDistrict=OFFSET(StartDist,MATCH(Database!RC[-2],ColDist,0)-1,2,COUNTIF(ColDist,Database!RC[-2]),1)


This uses the selected District two columns to the left (Database!RC[-2]), and uses this to look through
a third table which contains a unique list of Subdistricts, listed alongside their District and Governorate
names, as shown in Figure 9.

Figure 9. Spreadsheet of administrative areas with additional Index columns for P-code lookup.
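
The logic of the District formula in this section can be emulated in Python to show what MATCH, COUNTIF and OFFSET each contribute. This is a sketch with made-up area names, not the contents of the example workbook.

```python
# Hypothetical Admin 2 list, sorted by Admin 1 name so that matches are contiguous.
ADMIN2_TABLE = [
    ("Governorate A", "District A1"),
    ("Governorate A", "District A2"),
    ("Governorate B", "District B1"),
    ("Governorate B", "District B2"),
    ("Governorate B", "District B3"),
]


def districts_for(governorate):
    """Return the Admin 2 names listed alongside the selected Admin 1 name."""
    col_gov = [gov for gov, _ in ADMIN2_TABLE]
    if governorate not in col_gov:
        return []
    start = col_gov.index(governorate)    # like MATCH(..., ColGov, 0): first matching row
    height = col_gov.count(governorate)   # like COUNTIF(ColGov, ...): number of matching rows
    # Like OFFSET: take 'height' rows from 'start', reading the column to the right.
    return [district for _, district in ADMIN2_TABLE[start:start + height]]
```

If the table were not sorted by Admin 1 name, the slice would pick up districts from other governorates, which is the same limitation the Excel formula has.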

d) Creating a look-up for P-codes

Storing the full administrative hierarchy safeguards against information being misattributed to the
wrong location. A further step of adding the p-code will greatly simplify the process of comparing this
data to other datasets, mapping the information, and also providing it for others in an easy to use
format.

By adding a few formulas at the time of database design, these p-codes can be automatically filled
in. As with cascading menus, there are a number of ways of looking up information in Excel. The
following method is simple to implement.

Create index columns: In your Administrative worksheet, ensure that you have the whole table
of Administrative levels and corresponding p-codes.
• Add a column containing a concatenation of the Admin 1 and Admin 2 names using the
CONCATENATE formula:

=CONCATENATE(RC[-6],RC[-4])

• Add a further column concatenating Admin 1, Admin 2 and Admin 3 names (see Figure 9).

Create named ranges: Create the following:
• A named range of the whole look-up table (example: LookupTable)
• Named ranges referencing the newly created Indexes (DistrictIndex and SubDistrictIndex)
and also an Index for Governorates (GovernorateIndex). For this last Index, you can use the
existing column of Governorate names (column 6 in the example).

Define formula for admin 1: In the cell where you would like the p-code for Admin 1 to be added,
write the following formula, where RC[-1] references the cell with the Admin 1 name:

=INDEX(LookupTable,MATCH(RC[-1],GovernorateIndex,0),2)

This matches the Governorate name with the GovernorateIndex column, and returns the
corresponding Governorate P-code in column 2 of the LookupTable.


Define formula for admin 2: In the cell where you would like the p-code for Admin 2 to be added,
write the following formula, where RC[-3] is the cell containing the Admin 1 name, and RC[-1] is the
cell containing the Admin 2 name:

=INDEX(LookupTable,MATCH(RC[-3]&RC[-1],DistrictIndex,0),4)

This matches the combination of the Governorate and District names in the adjoining cells with the
DistrictIndex column, and returns the corresponding District P-code in column 4 of the LookupTable.

Define formula for admin 3: In the cell where you would like the p-code for Admin 3 to be added,
write the following formula, where RC[-5] is the cell containing the Admin 1 name, RC[-3] is the cell
containing the Admin 2 name, and RC[-1] contains the Sub-district name:

=INDEX(LookupTable,MATCH(RC[-5]&RC[-3]&RC[-1],SubDistrictIndex,0),6)

This matches the combination of the Governorate, District and Sub District names in the adjoining
cells with the SubDistrictIndex column, and returns the corresponding Sub District P-code in column
6 of the LookupTable.
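
The INDEX/MATCH lookup on a concatenated index can be sketched in Python. The table below is invented for illustration (names, p-codes and column order are assumptions); it shows why the Admin 1 name is part of the key when Admin 2 names repeat.

```python
# Hypothetical rows: (Admin 1 name, Admin 1 p-code, Admin 2 name, Admin 2 p-code)
LOOKUP_TABLE = [
    ("North", "P01", "Springfield", "P0101"),
    ("North", "P01", "Riverside", "P0102"),
    ("South", "P02", "Springfield", "P0201"),
]

# Index column: concatenation of Admin 1 and Admin 2 names, one entry per row.
DISTRICT_INDEX = [admin1 + admin2 for admin1, _, admin2, _ in LOOKUP_TABLE]


def district_pcode(admin1, admin2):
    """Return the Admin 2 p-code for a name pair, or None if there is no match."""
    key = admin1 + admin2
    if key not in DISTRICT_INDEX:
        return None
    row = DISTRICT_INDEX.index(key)   # like MATCH(key, DistrictIndex, 0)
    return LOOKUP_TABLE[row][3]       # like INDEX(LookupTable, row, 4)
```

Looking up ‘Springfield’ alone would always return the first match; the concatenated key keeps the two Springfields apart.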

6 Testing your database

Once you have implemented all of the functionality outlined within the previous sections, there will be
a number of different formulas, lookups and calculations embedded. Before starting data entry, you
should have someone thoroughly test the database. This is best carried out by someone other than
the developer, and should be done in conjunction with reviewing the data collection tools, to ensure
that all of the data items in the data collection tools will be able to be recorded within the database.

Test the first empty row of the database. Once this is verified, the contents of these cells can be
copied to subsequent rows.

Check the following:

• Is there a corresponding field in the database for every data item within the data collection tools?
• Are all dropdown menus working correctly?
• Are all dropdown menus showing the right set of options?
• Are cascading menus working correctly?
• Are p-codes being filled in correctly?
• Have formulas in the first row been copied to subsequent rows?

7 Data cleaning and consolidation

Data cleaning is a critical stage before beginning analysis. Care should be taken to ensure data is as
accurate and consistent (e.g. spellings, to allow aggregation) as possible. Whilst some errors may
only be discovered during analysis, it is far more efficient to correct them in advance, to
avoid having to repeat the analysis. This section focusses on deciding on your data cleaning strategy,
as opposed to the data cleaning process.


When deciding upon an approach to data cleaning, it is useful to consider the different types of errors
which can be made (see Figure 10 for a typology of data errors), and to plan at what point in your
process you will try to identify them. Assuming a system where data entry is distributed across field
locations and consolidation occurs in a different location, Figure 10 suggests where these errors
should be corrected.

Figure 10. A typology of data errors and suggested correction points

a) Quality control of data entry

Erroneous entries (where the data entered is valid but differs from the source, e.g. 26 instead of
25) and entries into the wrong field can only be rectified by comparison to source data, so these
checks should be done at the time of data entry, when original sources are close at hand.

Missing values should also be examined during data entry. Depending on the question type, it may
be necessary to differentiate a ‘no reply’ from a ‘do not know’ or a ‘no’ or a zero. During the data
model design, ‘null’ codes should have already been implemented for such fields - ensure that data
entry staff know when and how to use them.

These errors are easiest to identify when a quality control procedure exists, ensuring that a second
pair of eyes compare source data to data entered. This will be particularly important when there is a
process of translation at data entry, to ensure consistency/accuracy of translation.

b) Validation of rules during data entry

If there are additional ‘rules’ which should have been followed during data collection (e.g. ‘pick only
three’ or ‘must add to 100%’), ensure that data entry staff are familiar with them, so that violations
can be identified early on and verified or rectified rather than being entered into the database. If data
entry will be conducted in distributed locations, document the rules to follow, where focus should be
given, and how to solve errors/issues.


Additional functionality can be added in Excel to highlight rule violations, such as conditional
formatting. The decision to include this in the database must be pragmatic, weighing up the merits of
having errors detected and rectified by data entry staff, versus the time required to set this up.
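
Rules such as ‘pick only three’ and ‘must add to 100%’ can also be checked programmatically. The sketch below uses hypothetical column names and is not a feature of the example workbook.

```python
def check_rules(row, choice_columns, percent_columns, max_choices=3):
    """Return a list of rule violations found in one row (a dict of column values)."""
    violations = []
    ticked = sum(1 for column in choice_columns if row.get(column) == 1)
    if ticked > max_choices:
        violations.append(f"{ticked} options ticked, maximum is {max_choices}")
    total = sum(row.get(column, 0) for column in percent_columns)
    if total != 100:
        violations.append(f"percentages add to {total}, expected 100")
    return violations


# Hypothetical row: four options ticked and percentages adding to 110.
row = {"Q1_1": 1, "Q1_2": 1, "Q1_3": 1, "Q1_4": 1,
       "pct_none": 50, "pct_slight": 30, "pct_heavy": 30}
found = check_rules(
    row,
    choice_columns=["Q1_1", "Q1_2", "Q1_3", "Q1_4"],
    percent_columns=["pct_none", "pct_slight", "pct_heavy"],
)
```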

c) Consolidating data from multiple sources

Your methodology design may require that data is entered in a number of different locations, by
different people. This will not present an issue, so long as the same database structure is used; it will
be possible to copy and paste additional rows of data into one master version, so long as no
alterations are made to the structure (e.g. no additional columns added). Ensure that IDs are not
duplicated across different database versions. This can be done by allocating a range of ID numbers
to each member of the data entry staff.
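
The two conditions for safe consolidation, an identical structure and no duplicated IDs, can be sketched as checks performed while merging. The data layout (a header list plus rows, with the ID in the first column) is an assumption for this example.

```python
def consolidate(*sources):
    """Append rows from each source; refuse differing headers or duplicate IDs."""
    header = sources[0]["header"]
    master, seen_ids = [], set()
    for source in sources:
        if source["header"] != header:
            raise ValueError("Database structure differs between sources")
        for row in source["rows"]:
            record_id = row[0]  # assumes the ID is stored in the first column
            if record_id in seen_ids:
                raise ValueError(f"Duplicate ID: {record_id}")
            seen_ids.add(record_id)
            master.append(row)
    return master


# Hypothetical ID ranges: team A uses 1-100, team B uses 101-200.
team_a = {"header": ["ID", "Site"], "rows": [[1, "Site A"], [2, "Site B"]]}
team_b = {"header": ["ID", "Site"], "rows": [[101, "Site C"]]}
master_rows = consolidate(team_a, team_b)
```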

d) Cleaning of consolidated data

Misspellings and inconsistent spellings will have been largely avoided if you have implemented drop-
down menus. If you have some fields where drop down menus were not used, you can quickly check
for spelling inconsistencies by switching on auto-filtering. When using the filter, each of
the unique entries in the column will be listed, making it easy to spot items which should be the same
but have been spelt in different ways. The find and replace function can then be used to quickly
replace these.
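
What the auto-filter list reveals can be reproduced in a short sketch: list the unique entries, then group those that differ only by case or surrounding spaces. The values are invented for illustration.

```python
from collections import Counter


def unique_values(column):
    """Return each distinct entry with its count, as an auto-filter list would show."""
    return sorted(Counter(column).items())


def likely_variants(column):
    """Group entries that differ only by case or surrounding spaces."""
    groups = {}
    for value in column:
        groups.setdefault(value.strip().lower(), set()).add(value)
    return {key: sorted(values) for key, values in groups.items() if len(values) > 1}


# Hypothetical free-text column with three spellings of the same place.
sites = ["Aleppo", "aleppo", "Aleppo ", "Idleb"]
variants = likely_variants(sites)
```

This only catches case and spacing differences; genuine misspellings still need to be spotted by eye in the list of unique values.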

Extraneous errors (where additional irrelevant information has been added) are best removed in the
consolidated database, allowing a consistent approach to be applied.

Incorrectly derived errors (where some calculations have been applied incorrectly) can be reduced
by conducting all calculations after consolidation to ensure consistency (e.g. converting households
to individuals or vice versa).

e) Categorization of open response questions

If your database contains some ‘open response’ questions, or if you have added ‘other’ options to
some of your categorical questions, you will probably need to categorise these into common
responses.

Best practice is to add an additional field to contain the categorised responses (which is a derived
field), leaving the original text behind – this allows you to always trace back to the original response
and not to over clean your data.
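
A derived category field that leaves the original text untouched can be sketched as follows. The keyword rules and the name of the derived column are assumptions; in practice categorisation is usually a manual judgement.

```python
# Hypothetical keyword-to-category rules for free-text 'other' responses.
CATEGORY_KEYWORDS = {"birth": "Maternal health", "vaccin": "Vaccination"}


def add_category(rows, text_field, category_field):
    """Write a derived category next to each original response, which is kept as-is."""
    for row in rows:
        text = row[text_field].lower()
        row[category_field] = next(
            (category for keyword, category in CATEGORY_KEYWORDS.items()
             if keyword in text),
            "Uncategorised",
        )
    return rows


responses = [
    {"E2_HC_oth": "No birthing service"},
    {"E2_HC_oth": "Vaccinations needed"},
    {"E2_HC_oth": "Long queues"},
]
add_category(responses, "E2_HC_oth", "E2_HC_oth_cat")
```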

8 Documenting changes

Documentation of errors, alterations, additions and error checking is essential to:

• Maintain data quality
• Avoid duplication of error checking by different data cleaners
• Recover from data cleaning errors
• Determine the fitness of the data for use
• Inform users who have already used the data of what changes have been made since they last
accessed it


In order to manage this process and track changes, create a change log within your workbook, where
you will store all information related to modified fields. This will serve as an audit trail showing any
modifications, and will allow a roll back to the original value if required. Within the change log, store
the following fields:

• Table (if multiple tables are implemented)
• Column, row
• Date changed
• Changed by
• Old value
• New value
• Comments

Figure 11. Sample change log from J-RANS Assessment

Columns: Date, Time, Changed By, Change, New value

29-Jan, 10:45, Georges
  Change: Cleaning actors across all sectors

29-Jan, 11:00, Henry
  Change: Added recategorisation of priority needs
  New value: Refer code book

04-Feb, 21h55, Henry
  Change: Addition of the last questionnaire Deir ez Zor

05-Feb, 11h33, Georges
  Change: Duplication between questionnaire 49 and 59 solved. Contact with the enumerators. It was
  decided to delete the questionnaire 49 (no possible contact with the enumerators) and to replace it by
  questionnaire 59 (after debriefing), considered more reliable
  New value: Questionnaire 49 deleted and replace by 59

05-Feb, 15h37, Christine
  Change: Double and triple check on Q59 pop figures. Initial figures reflected pop number at district
  level and not sub district. Contact with inside informant to change pop figures
  New value: Change from 256.000 to 345.000

05-Feb, 16h15, Georges
  Change: Dana agencies updated after MdM input

06-Feb, 22h31, Christine
  Change: Number of arrested change in Q49. DNK instead of 20000
  New value: from 20.000 to DNK

07-Feb, 17h05, Henry
  Change: Complete cleaning of the demographic section. Creation of a second tab called Database pop
  figures OK where sum of IDPs in public building+host+vacated buidling = total number of IDPs. See tab
  Changes in pop figures for more details

Always make this information available when sharing the dataset internally or externally (i.e. by
enclosing the change log in a separate worksheet).
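
A change log with the fields listed in this section could be kept alongside the data as in the sketch below. The table layout (a list of row dictionaries), the function names and the example values are assumptions made for illustration.

```python
from datetime import date


def apply_change(table, change_log, row, column, new_value, changed_by,
                 comments="", table_name="Database"):
    """Update one cell and record the old and new values in the change log."""
    old_value = table[row][column]
    table[row][column] = new_value
    change_log.append({
        "table": table_name, "column": column, "row": row,
        "date_changed": date.today().isoformat(), "changed_by": changed_by,
        "old_value": old_value, "new_value": new_value, "comments": comments,
    })


def roll_back(table, entry):
    """Restore the value that a logged change replaced."""
    table[entry["row"]][entry["column"]] = entry["old_value"]


# Hypothetical data: one row with a population figure that is corrected.
database = [{"ID": 59, "population": 1000}]
log = []
apply_change(database, log, row=0, column="population", new_value=1200,
             changed_by="Data cleaner", comments="Corrected after verification")
```

Because the old value is stored with every entry, any single change can be reversed without going back to the source questionnaires.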

9 Additional Resources

Database design in Excel:

http://www.und.edu/dept/cndtrain/Excel/database.pdf
http://spreadsheets.about.com/od/datamanagementinexcel/ss/excel_database.htm
https://intranet.birmingham.ac.uk/as/claddivision/skills/documents/public/excel3.pdf

Change logs:
http://www.codeproject.com/Articles/105768/Audit-Trail-Tracing-Data-Changes-in-Database

ANNEX: Joint Rapid Assessment for Aleppo City Form 2013 (J-RANS 2013)

Questionnaire ID:
Date (dd/mm/yy):
Team name/code:
Contested: (y/n)
# of Neighbourhoods covered in this form:
Names of neighbourhoods covered (identical to MAP): 1.   2.   3.   4.   5.   6.

A. Damages by Conflict

Type: (INGO, Committee, local group, health staff, other):      Reliability**
Main Source

A1. Due to conflict number of persons:*

             Total    Male    Female    Of whom Children < 5 yrs
  Dead
  Injured
  Missing
  Arrested

**(Rating: 1=reliable, 2=fairly reliable, 3= unreliable)

A2. Due to conflict damages of physical infrastructure (enter in %)
Total for each column should be 100%

Type: (INGO, Committee, local group, health staff, other):      Reliability rate
Main Source:

  Columns: Description; Private Buildings (houses, apartment buildings, etc.); Public Infrastructure (schools, health centres, etc.)
  Rows:
    No damages
    Slight damages: light repairs required (windows, doors)
    Moderate damages: Under 30% roof damage, fire damage, can be repaired
    Heavy damage: Over 30% roof damage, severe fire damage, can be repaired
    Destruction: Unusable, houses levelled, can’t be repaired

A3. Electricity (per day, over the past 30 days)
  ☐ Not functional   ☐ 1-6 hrs   ☐ 6-12 hrs   ☐ 12-18 hrs   ☐ 18-24hrs

B. Demography*

Type: (INGO, Committee, local group, health staff, other):      Reliability**
Main Source

B1. Estimated # of population in neighbourhood:

  Columns: Total; % Female
  Rows:
    Total # of pre-conflict population (2011)
    Of whom # who have fled the neighbourhood
    Current total # of population (resident population + new arrivals, at this moment)
    - Of whom total # of displaced population (total # of below groups)
    - # Displaced people living in collective accommodation
    - # Displaced people hosted by local families
    - # Displaced people in vacated buildings

* ‘0’=not present ; ‘DNK’=Don’t know ; otherwise provide point estimate
**(Rating: 1=reliable, 2=fairly reliable, 3= unreliable)

B2. Have the displaced / crisis-affected people been registered in this neighbourhood?
  ☐ Yes (completed)   ☐ No
  ☐ Yes (under way)   ☐ Not yet, but scheduled
  If yes, which organization conducted the registration in this neighbourhood?

B3. Is the population increasing, decreasing, or staying about the same in this neighbourhood? Ask this
question to more than one person: LCC/Local authorities, IDPs, neutral party (i.e. NGO)
  ☐ Increasing   ☐ Decreasing   ☐ About the same   ☐ DNK

B4. How is the relationship between the displaced and the host community in this neighbourhood? Select only one
  ☐ Host community willing to assist for as long as necessary
  ☐ Host community willing to assist, but for limited time
  ☐ Tensions already exist
  ☐ Other (specify ___________)
  ☐ Not applicable

C. Humanitarian access

C1. Humanitarian Access: Are there problems to gain access to aid in this neighbourhood?
  ☐ Yes   ☐ No   ☐ Do not know
  If yes, how severe are the following problems: (Tick only one box per problem)

  Columns: Problem; No problem; Limited problem; Severe problem
  Rows:
    Restriction of movement for people
    Interference into humanitarian activities
    Violence against personnel, facilities and assets
    Restriction and obstruction of access to aid
    Active hostilities
    Presence of mines and explosives

D. Information

D1. Is humanitarian assistance provided in this neighbourhood over the past 30 days?
  ☐ Yes   ☐ No   ☐ Do not know
  If yes, are people generally: (Select only one)
  ☐ Well informed about humanitarian assistance
  ☐ Poorly informed about humanitarian assistance
  ☐ Not at all informed about humanitarian assistance

E. Health

E1. Health Status: Is there a serious problem regarding health in this neighbourhood?
  ☐ Yes   ☐ No   ☐ Do not know
  If yes, I am reading a list of possible problems: (Select max five most serious problems)
  ☐ Numerous cases of psychological trauma (anxiety, depression, phobia, etc.)
  ☐ Numerous injured less than 6 months ago
  ☐ Numerous injured more than 6 months ago
  ☐ Numerous disabled with limitation to move (amputation, spinal cord Injury, brain Injury, or peripheral nerve injury)
  ☐ Numerous cases with other disabilities (hear, see, speak)
  ☐ Incidents of communicable diseases (measles, tetanus, scabies, cholera, etc.)
  ☐ Numerous cases of chronic diseases (arthritis, dialysis, etc.)
  ☐ Numerous cases of diarrhoea
  ☐ Numerous cases of fever
  ☐ Numerous cases of respiratory diseases
  ☐ Numerous cases of pregnancy related diseases
  ☐ Other: __________________

E2. Health Care: Is there a serious problem because people are not able to get adequate health care for
themselves in this neighbourhood?
  ☐ Yes   ☐ No   ☐ Do not know
  If yes, I am reading a list of possible problems: (Select max five most serious problems)

ANNEX: Joint Rapid Assessment for Aleppo City Form 2013 (J-RANS 2013)

☐ Not enough health facilities available
☐ Lack of ambulance services
☐ Lack of medicines
☐ Lack of mobility devices (wheelchairs, prosthetics, others)
☐ Not enough rehabilitation services
☐ Lack of medical staff
☐ Not enough access to health services due to physical/logistical constraints
☐ Not enough access to health services due to security constraints
☐ Not enough access to health services due to limited economic resources (lack of money)
☐ Other: __________

E3. Which specific health interventions are most urgently required in this neighbourhood? (Enter short description) ☐ Do not know
First rank:
Second rank:
Third rank:

E4. Overall, which of the following statements describes best the general status of public health in this neighbourhood? (Circle right answer)
0. DNK
1. No concern – situation under control
2. Situation of concern that requires monitoring
3. Many people will suffer if no health assistance is provided soon
4. Many people will die if no health assistance is provided soon
5. Many people are known to be dying right now because of insufficient health services
Main reason for selecting category: (add short text)
________________________________________________________

E5. Distance and capacity of next functional hospital:
Distance (in travel time) ____________ minutes

E6. Which group faces the biggest health risks in this neighbourhood? (Rank top three: 1=first rank, 2=second rank, 3=third rank)
___ Displaced people living in host families
___ Displaced people in collective shelter (schools, camps, etc.)
___ Displaced people in vacated buildings
___ Resident population hosting displaced persons
___ Resident population who have not been displaced

E7. Which organisations have been providing regular health care services in this neighbourhood over the past 30 days?
Type (INGO, Local Org, Self-help group, other) | Organisation responsible | Type of regular support (excluding one-offs)

F. Food
F1. Is there a serious problem regarding food in this neighbourhood? ☐ Yes ☐ No ☐ Do not know
If yes, I am reading a list of possible problems: (Select max five most serious problems)
☐ Not enough food available (including in markets, etc.)
☐ Not enough diversity in food
☐ Not enough access to markets due to physical/logistical constraints (transport)
☐ Not enough access to food sources (i.e. markets) due to security constraints
☐ Not enough access to markets due to limited economic resources (income)
☐ Price increase of basic food items
☐ Agricultural production is disrupted
☐ There are not enough cooking facilities or utensils
☐ Not enough cooking fuel
☐ Loss of economic assets due to conflict (livestock, machinery, seeds, etc.)
☐ Other: _______________

F2. Which specific food security interventions are most urgently required in this neighbourhood? ☐ Do not know
First rank:
Second rank:
Third rank:

F3. Are there functional bakeries regularly providing bread to the people in this neighbourhood? ☐ Yes ☐ No ☐ Do not know (bag = 6-7 loaves)
If yes, what is their normal capacity (tons of wheat flour processed per day) ______ (tons)
What is their current output (tons of wheat flour processed per day) ____ (tons)
Price of subsidized bread (per bag): ___________ SYP
Price on the street (per bag, not subsidized): _____________ SYP

F4. Overall, which of the following statements describes best the general status of food security in this neighbourhood? (Circle right answer)
0. DNK
1. No concern – situation under control
2. Situation of concern that requires monitoring
3. Many people will suffer if no food assistance is provided soon
4. Many people will die if no food assistance is provided soon
5. Many people are known to be dying right now due to lack of food
Main reason for selecting category: (add short text)
________________________________________________________

F5. Which group is most at risk of having not enough food to survive in this neighbourhood? (Rank top three: 1=first rank, 2=second rank, 3=third rank)
___ Displaced people living in host families
___ Displaced people in collective shelter (schools, camps, etc.)
___ Displaced people in vacated buildings
___ Resident population hosting displaced persons
___ Resident population who have not been displaced

F6. Which organizations have been providing regular food support in this neighbourhood over the past 30 days?
Type (INGO, Local Org, Self-help group, other) | Organisation responsible | Type of regular support (excluding one-offs)

G. NUTRITION
G1. Nutritional Status: Is there a serious problem regarding nutrition in this neighbourhood? ☐ Yes ☐ No ☐ Do not know
If yes, who in this neighbourhood do you think are the most vulnerable to the issue of poor nutrition: (Select only one most vulnerable group)

☐ Children under 6 months
☐ Children under 5 years
☐ Children over 5 years
☐ Pregnant and lactating women
☐ Other: ________________________

G2. Are mothers facing a problem with feeding their babies? If yes, what are some of the reasons mothers are facing trouble feeding: ☐ Yes ☐ No ☐ Do not know
If yes, I am reading a list of possible problems: (Select max five most serious problems)
☐ Women are unable to breastfeed due to stress/fear
☐ Women are unable to breastfeed due to insufficient food availability
☐ Women are unable to breastfeed due to lack of privacy
☐ Women are unable to access breastfeeding support
☐ Lack of infant formula in the markets
☐ Lack of fuel/water/sterilizing equipment for preparation of infant formula
☐ Unsolicited / untargeted distributions of infant formula (milk or powder) ongoing
☐ Other: ________________

G3. Which specific nutrition interventions are most urgently required in this neighbourhood? ☐ Do not know
First rank:
Second rank:
Third rank:

G4. Overall, which of the following statements describes best the general nutritional status in this neighbourhood? (Circle right answer)
0. DNK
1. No concern – situation under control
2. Situation of concern that requires monitoring
3. Many people will suffer if no nutrition assistance is provided soon
4. Many people will die if no nutrition assistance is provided soon
5. Many people are known to be dying right now because of insufficient nutrition services
Main reason for selecting category: (add short text)
________________________________________________________

G5. Which group faces the biggest risks of malnutrition in this neighbourhood? (Rank top three: 1=first rank, 2=second rank, 3=third rank)
___ Displaced people living in host families
___ Displaced people in collective shelter (schools, camps, etc.)
___ Displaced people in vacated buildings
___ Resident population hosting displaced persons
___ Resident population who have not been displaced

G6. Which organisations have been providing regular nutrition services in this neighbourhood over the past 30 days?
Type (INGO, Local Org, Self-help group, other) | Organisation responsible | Type of regular support (excluding one-offs)

H. Places to live in and non-food items (NFI)
H1. Is there a serious problem in this neighbourhood regarding shelter? ☐ Yes ☐ No ☐ Do not know
If yes, I am reading a list of possible problems: (Select max five most serious problems)
☐ Not enough shelter space available
☐ Not enough protection against cold (snow, wind, rain)
☐ Not enough access to privately rented shelter space
☐ Not enough access to collective shelter space (lack of facilities/overcrowded)
☐ Not enough access to building materials due to physical/logistical constraints
☐ Not enough access to building materials due to security constraints
☐ Not enough access to building materials due to limited economic resources (income)
☐ Other (Specify): ___________

H2. Which specific shelter interventions are most urgently required in this neighbourhood? ☐ Do not know
First rank:
Second rank:
Third rank:

H3. Is there a serious problem in your neighbourhood regarding Non Food Items? ☐ Yes ☐ No ☐ Do not know
If yes, I am reading a list of possible problems: (Select max five most serious problems)
☐ Lack of cooking utensils (pots, dishes, utensils)
☐ Lack of household lights
☐ Lack of adult clothing/shoes
☐ Lack of child clothing/shoes
☐ Lack of baby supplies (diapers, etc.)
☐ Lack of personal hygiene products (nail clippers, toothbrush)
☐ Lack of female hygiene products (sanitary pads, underwear)
☐ Lack of mattresses and blankets
☐ Other (Specify): _________

H4. Which specific NFI interventions are most urgently required in this neighbourhood? ☐ Do not know
First rank:
Second rank:
Third rank:

H5. Overall, which of the following statements describes best the general status of Shelter and NFIs?
0. DNK
1. No concern – situation under control
2. Situation of concern that requires monitoring
3. Many people will suffer if no shelter assistance is provided soon
4. Many people will die if no shelter is provided soon
5. Many people are known to be dying right now due to lack of shelter
Main reason for selecting category: (add short text)
________________________________________________________

H6. Which group is most at risk due to lack of shelter and NFIs? (Rank top three: 1=first rank, 2=second rank, 3=third rank)
___ Displaced people living in host families
___ Displaced people in collective shelter (schools, camps, etc.)
___ Displaced people in vacated buildings
___ Resident population hosting displaced persons
___ Resident population who have not been displaced

H7. Which organizations have been providing regular shelter and NFI support in this neighbourhood over the past 30 days?
Type (INGO, Local Org, Self-help group, other) | Organisation responsible | Type of regular support (excluding one-offs)

I. Water, Sanitation and Hygiene
I1. What is the main water source in this neighbourhood?

☐ Piped water system
☐ Stream, river or hillside spring
☐ Water truck
☐ Rain water harvesting
☐ Private well
☐ Other (Specify): ____________

I2. Is there a serious problem regarding water in this neighbourhood? ☐ Yes ☐ No ☐ Do not know
If yes, I am reading a list of possible problems: (Select max five most serious problems)
☐ Lack of jerry cans
☐ The water available is not safe for drinking
☐ Water does not taste good or does not look good enough
☐ Lack of ways to treat water or fuel for boiling it
☐ Not enough water available because water too expensive
☐ Not enough water available because water is too far away or difficult to access
☐ Not enough water available because people don’t have means to store water
☐ Not enough water available because water system, well or pump is broken
☐ Other (Specify): ___________

I3. Overall, which of the following statements describes best the general status of water supply? (Circle one right answer)
0. DNK
1. No concern – situation under control
2. Situation of concern that requires monitoring
3. Many people will suffer due to lack of water
4. Many people will die if insufficient water remains available
5. Many people are known to be dying right now due to lack of water
Main reason for selecting category: (add short text)
________________________________________________________

I4. Regarding the lack of safe water, which group is most at risk? (Rank top three: 1=first rank, 2=second rank, 3=third rank)
___ Displaced people living in host families
___ Displaced people in collective shelter (schools, camps, etc.)
___ Displaced people in vacated buildings
___ Resident population hosting displaced persons
___ Resident population who have not been displaced

I5. Is there a serious problem regarding sanitation and hygiene in this neighbourhood? ☐ Yes ☐ No ☐ Do not know
If yes, I am reading a list of possible problems: (Select max five most serious problems)
☐ Not enough places to wash your body or bathe
☐ Not enough access to water, soap or places to wash due to security constraints
☐ Not enough access to water or soap because the cost is too expensive
☐ Not enough toilets available for men
☐ Not enough toilets available for women
☐ Not enough access to toilets due to security constraints
☐ Not enough access to toilets because they are too far away
☐ Not enough access to toilets because they are not segregated
☐ No regular rubbish collection so general waste builds up
☐ Others: ____________

I6. Which specific water, sanitation, and hygiene interventions are most urgently required? ☐ Do not know
First rank:
Second rank:
Third rank:

I7. Which organizations have been providing regular water, sanitation or hygiene support in this neighbourhood over the past 30 days?
Type (INGO, Local Org, Self-help group, other) | Organisation responsible | Regular support (excluding one-offs)

J. EDUCATION
Number of functional schools in this neighbourhood before the conflict | Number of functional schools today in this neighbourhood (used for education)

J1. What percentage of children (6-14 yrs of age) is regularly attending school in this neighbourhood?
☐ 0-25% ☐ 26-50% ☐ 51-75% ☐ 76-100%

J2. What are the reasons why children are not attending schools? (Select all that apply)
☐ Schools not functioning (damaged, destroyed or occupied)
☐ Safety - fear of schools being bombed/targeted
☐ Lack or absence of teachers
☐ Lack of school materials (stationery, books, etc.)
☐ Lack of water and hygienic sanitation facilities in schools
☐ Other (specify): ____________

J3. Are education activities taking place in other locations? (e.g. home, mosque, etc.) ☐ Yes ☐ No ☐ Do not know
If yes, add main type of location: ______________________________

J4. What percentage of children (6-14 yrs of age) is regularly receiving education in these other locations?
☐ 0-25% ☐ 26-50% ☐ 51-75% ☐ 76-100%

K. Protection
Main Source: | Type: (INGO, Committee, local group, health staff, other): | Reliability rate

K1. Is there a serious problem in your neighbourhood regarding protection issues for vulnerable groups?
☐ Yes ☐ No ☐ Do not know
If yes, what are the three main problems: (See separate list for guidance)
First rank:
Second rank:
Third rank:

K2. Which specific interventions to protect vulnerable persons are most urgently required in this neighbourhood?
☐ Do not know
First rank:
Second rank:
Third rank:

K3. What are the most vulnerable groups in this neighbourhood? (Select only one)
☐ Female-headed households
☐ Elderly headed households
☐ Households with disabled persons
☐ Destitute families
☐ Families belonging to ethnical / religious minorities
☐ Children without appropriate family care / orphans
☐ Other. Specify _______

K4. What are the structures in the area that are responsible for the protection of vulnerable persons in this neighbourhood? (Select max five)
☐ Local Council
☐ Community based structures / groups / committees
☐ Local charities
☐ Religious leaders
☐ Schools
☐ Local police
☐ Family
☐ No structures responsible for protection in the area
☐ Other. Specify: ____________

K5. Which groups contain the most vulnerable people in this neighbourhood? (Rank top three: 1=first rank, 2=second rank, 3=third rank)
___ Displaced people living in host families
___ Displaced people in collective shelter (schools, camps, etc.)
___ Displaced people in vacated buildings
___ Resident population hosting displaced persons
___ Resident population who have not been displaced

K6. Which organisations have been providing regular protection services in this neighbourhood over the past 30 days?
Type (INGO, Local Org, Self-help group, other) | Organisation responsible | Type of regular support (excluding one-offs)

L. Sector Prioritization
After these specific questions, we want to recapitulate. In terms of which sector poses the most serious problems, can you say which is
the most serious, second most, third most, fourth most, and fifth most serious? I read you a list of 7 sectors:

L1. Priority Level. Rank a maximum of 5: 1=first priority, 2=second priority, 3=third priority, 4=fourth priority, 5=fifth priority

Health

Food Security

Nutrition

Water, Sanitation, Hygiene

Places to live and Non-Food Items

Education
Protection

L2. Are there any other urgent problems in this neighbourhood, which I have not yet asked you about? (Please write down bullet points
only)

L3. Any further observations from the assessment team on the difficulty to collect information or the situation in the neighbourhood
(Please elaborate as required)
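
The answer rules printed on this form (the B1 coding of '0' and 'DNK', the 0-5 severity scales, "Select max five", "Rank top three", and the L1 ranking of at most five of seven sectors) can be checked at data entry. The following is only a minimal Python sketch of such checks. All function and variable names are hypothetical and not part of the J-RANS tooling, and the requirement that ranks run 1, 2, 3 without gaps or ties is an assumption rather than something the form states.

```python
from typing import Dict, List, Optional

# The seven sectors listed under L1 of the form.
SECTORS = [
    "Health", "Food Security", "Nutrition", "Water, Sanitation, Hygiene",
    "Places to live and Non-Food Items", "Education", "Protection",
]


def parse_population_cell(raw: str) -> Optional[int]:
    """B1 coding: '0' = not present, 'DNK' = don't know (returned as None),
    anything else must be a non-negative point estimate."""
    text = raw.strip()
    if text.upper() == "DNK":
        return None
    value = int(text)  # raises ValueError if the cell is not an integer
    if value < 0:
        raise ValueError(f"negative population estimate: {raw!r}")
    return value


def check_severity(score: int) -> bool:
    """Severity scales (E4, F4, G4, H5, I3) run from 0 (DNK) to 5."""
    return 0 <= score <= 5


def check_max_selected(selected: List[str], limit: int = 5) -> bool:
    """'Select max five': at most `limit` options, none ticked twice."""
    return len(selected) <= limit and len(set(selected)) == len(selected)


def check_ranking(ranks: Dict[str, int], max_ranked: int) -> bool:
    """Assumed rule: ranks are 1..n with no gaps or ties, n <= max_ranked."""
    values = sorted(ranks.values())
    return len(values) <= max_ranked and values == list(range(1, len(values) + 1))


def check_sector_priorities(ranks: Dict[str, int]) -> bool:
    """L1: rank at most five of the seven listed sectors."""
    return all(name in SECTORS for name in ranks) and check_ranking(ranks, 5)
```

For example, `check_ranking({"Displaced people in vacated buildings": 1}, 3)` accepts a partially filled "rank top three" question, while two groups sharing rank 1 would be rejected under the assumed no-ties rule.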
