Class - DEM 1110 - Data Presentation-1

Uploaded by

tembomweenzu
Copyright
© All Rights Reserved

DEM 1110

Introduction to Demography

Data Presentation

PRESENTATION OF SOCIAL, ECONOMIC AND
DEMOGRAPHIC DATA
Basic Measures

Basic Measures
• RATIO
– The quantitative relation between two amounts, showing
the number of times one value contains or is contained
within the other. Mathematically, a ratio shows the
relative sizes of two or more values.
Examples of ratios include:
– Sex Ratio: the number of males per 100 females
• SR = (M/F) × 100
– Dependency Ratio: the number of dependents per 100
economically active persons
Basic Measures
• Dependency Ratio: the number of
dependents (ages 0-14 and 65+) per 100
economically active persons (ages 15-64)
\[
\text{Dependency Ratio} = \frac{\text{Population}(0\text{-}14) + \text{Population}(65+)}{\text{Population}(15\text{-}64)} \times 100
\]
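The two ratios above can be sketched in a few lines of Python. This is only an illustration: the function names and the population counts are made up for the example and do not come from the slides.

```python
def sex_ratio(males: float, females: float) -> float:
    """Number of males per 100 females: SR = (M / F) * 100."""
    return 100 * males / females


def dependency_ratio(pop_0_14: float, pop_65_plus: float, pop_15_64: float) -> float:
    """Dependents (ages 0-14 and 65+) per 100 persons aged 15-64."""
    return 100 * (pop_0_14 + pop_65_plus) / pop_15_64


# Hypothetical counts, for illustration only.
print(sex_ratio(males=4_900, females=5_000))  # 98.0
print(dependency_ratio(pop_0_14=4_500, pop_65_plus=500, pop_15_64=5_000))  # 100.0
```

A sex ratio below 100 means fewer males than females; a dependency ratio of 100 means one dependent for every person of working age.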
Basic Measures
• Proportion
– A proportion is a part or share of a whole; it is a
ratio which expresses the relative size of a
number in terms of the total. It can also be defined
as a part, share or number considered in
comparative relation to a whole.
• Examples include:
– Femininity ratio, Masculinity ratio, Proportion of
HIV+ persons eligible for ART initiation.
– Proportion of women below 35 years = Number
of women below age 35 / Total number of women
Basic Measures
• Percentage
– A proportion multiplied by a constant of 100, so that
it is expressed per 100 units of the total. The constant
100 is used for easy interpretation and comparison.
• Examples include:
– % of total population with access to education
– % of total population with access to Health
– % of total population with access to clean water
Basic Measures
• Rate
– A ratio of one figure to another in a specified
period of time.
• Examples include:
– Pregnancy Rate = (Number of pregnancies / Total
population of women aged 15-49) × 1000
– Sometimes a rate is referred to as an occurrence or
exposure ratio because the numerator is a number of
occurrences or events, usually in a year, while the
denominator measures the number of people exposed
to the risk of experiencing the event.
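As a rough sketch, the proportion, percentage and rate defined in these slides differ only in the denominator chosen and the constant applied. The helper names and all counts below are hypothetical.

```python
def proportion(part: float, whole: float) -> float:
    """Share of the whole, between 0 and 1."""
    return part / whole


def percentage(part: float, whole: float) -> float:
    """Proportion expressed per 100."""
    return 100 * part / whole


def rate_per_1000(events: float, population_at_risk: float) -> float:
    """Events in a period per 1000 people exposed to the risk."""
    return 1000 * events / population_at_risk


# Hypothetical counts, for illustration only.
print(proportion(part=600, whole=1_000))                     # 0.6
print(percentage(part=600, whole=1_000))                     # 60.0
print(rate_per_1000(events=90, population_at_risk=1_200))    # 75.0
```

For the pregnancy rate, `events` would be the pregnancies in the year and `population_at_risk` the women aged 15-49.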
Methods of Presenting Data
• Recap on types of data
– Continuous Data: These are numerical data that
are measured on an unbroken scale, such as age,
weight and CD4 count. Often these data can be
categorized during data analysis.
– Age: . . .18, 19, 20, 21, 22, 23 . . .
– Weight: . . .150, 151, 152, 153 . . .
– Temperature: … 37, 37.2, 37.9, 38…
Methods of Presenting Data
• Recap on types of data
– Categorical: These are data that can be broken
into distinct categories, such as gender and
marital status.
– Age Categories: 18-24, 25-44, 45-64, 65+
– Sex: Male, Female
– HIV status: positive, negative
When is it useful to visually
display data?
•Communicate with policy makers
•Document disease
•Demonstrate successes
•Show trade-offs between 2 choices
•Make decisions
•Other examples?

What do you want to communicate?


•Disease Trends?
•Differences among populations?
How do we convince people to
listen to us?

SHOW THEM THE DATA!

Measles data in a table

Improving Presentation of Measles Data

What is the main point of these data?


How can you make the data easier to
interpret?
How could we present these data to show
context?
Common ways to display data –
tables and charts
Table: An arrangement of data in
rows and columns

Table 1.0: # PLWHA by District and Sex, Province Y

District       Male     Female       Total
A            14,800        200      15,000
B           400,000     20,000     420,000
C           997,000      3,000   1,000,000
D           985,000     15,000   1,000,000
E         1,460,000     40,000   1,500,000
F           465,000     35,000     500,000
G           940,000     10,000     950,000
H           380,000    220,000     600,000
I           900,000    600,000   1,500,000
J           545,000      5,000     550,000
Total     7,086,800    948,200   8,035,000

PLWHA = Persons Living With HIV/AIDS
Parts of the table marked on the slide: title, row, column.
Improving Presentation of our Data
DOH ID#  School  Age  Sex  Complaint  Have meds  Date visit ('04)  Temp  Previous asthma visits
1        A       8    F    asthma     Y          3-Dec             98    1
2        B       11   M    asthma     N          8-Dec             102   0
3        B       10   F    asthma     Y          15-Dec            95    3
4        A       8    F    asthma     N          16-Dec            100   0
5        A       9    F    asthma     Y          18-Dec            101   1
6        B       11   M    asthma     Y          2-Dec             99    3
7        C       12   F    asthma     Y          12-Dec            103   2
8        A       10   M    asthma     Y          8-Dec             98    0
9        B       12   F    asthma     N          5-Dec             100   0
10       C       11   M    asthma     N          19-Dec            98    0
11       A       8    F    asthma     Y          20-Dec            98    1
12       A       9    M    asthma     N          12-Dec            102   2
13       B       10   F    asthma     Y          20-Dec            99    1
14       C       11   M    asthma     N          17-Dec            100   1
15       D       12   F    asthma     Y          4-Dec             102   0
16       F       9    F    asthma     N          2-Dec             98    0
17       D       9    F    asthma     N          7-Dec             101   0
18       C       7    F    asthma     N          14-Dec            97    1
19       A       9    M    asthma     N          16-Dec            100   0
20       B       10   M    asthma     N          20-Dec            101   1

•What is the main point of these data?
•How can you make it easier to read?
•Is a table the best way to represent these data?
CHARTS
• When used appropriately, can summarize and
display complex data clearly and effectively and
emphasize specific points
• Let you identify and present distributions, trends,
and relationships among the data
• Help make sense of the data in the profile and
communicate findings to planning groups and
decision makers
• However, poorly designed or executed tables and
figures can mislead users or distract them from
your message!
Key points to consider when
using charts to display data:
• What information is being conveyed in
each?
• What different points do they make?
• Is this the most appropriate chart type
to represent these data?
Key elements of a Chart
Title
Vertical Axis (y-axis)

Legend

Horizontal Axis (x-axis)


Pie Charts
• Circular chart split into segments which show
components of a larger group
• Suitable for displaying categorical data, or
discrete data in distinct categories
• The size of a “slice” is proportional to the
amount of data (e.g., number of cases) it
represents
• Proportion (%) of the total each component
represents is frequently added on the slice
• Different colors can be used to identify
various slices
Pie Charts
[Figure: pie chart. Distribution of notifiable diseases reported
during football World Cup, June 4 - July 10, 1998]
Legionellosis              30%
Foodborne outbreak         29%
Meningococcal infection    26%
Listeriosis                 8%
Typhoid fever               6%
Brucellosis                 1%

Data source: Norwegian Institute of Public Health, Oslo, Norway
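Because the size of each slice is proportional to the share it represents, the slice angle follows directly from the percentage (360 degrees for 100%). A minimal sketch using the percentages shown on the slide; the variable names are only illustrative.

```python
# Percentages as shown on the pie chart slide.
shares = {
    "Legionellosis": 30,
    "Foodborne outbreak": 29,
    "Meningococcal infection": 26,
    "Listeriosis": 8,
    "Typhoid fever": 6,
    "Brucellosis": 1,
}

# A pie chart only makes sense if the parts add up to the whole.
assert sum(shares.values()) == 100

# Each slice gets a share of the full circle (360 degrees).
angles = {name: pct * 360 / 100 for name, pct in shares.items()}

for name, angle in angles.items():
    print(f"{name}: {shares[name]}% -> {angle:.1f} degrees")
```

The 1% Brucellosis slice spans under 4 degrees, which shows why very small slices are hard to read.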


When not to use a pie chart
• When you would have more than 8 slices
• When the values of each slice are similar
because it is difficult to see differences
between slice sizes
• When you have a limited “n” or total (i.e., only
two cases of measles out of 12 people)
• To compare with another pie chart

Bar charts
• Suitable for displaying categorical data and
for comparing discrete data in distinct
categories
• Each bar represents one category
• Can be organized horizontally or vertically
• The height of each bar is proportional to the
number of events (e.g. cases) in the category
• Variables in a bar graph can be discrete (e.g.
sex, region, race) or continuous (e.g. age) but
organized in categories (e.g. age groups)
Bar charts
Limitations of Bar Charts
• Poorly designed chart can mislead users
or distract them from your message!
• Can be difficult to see actual numbers

Four common types of bar charts:

1) Simple
2) Grouped
3) Stacked
4) 100% component
Simple bar chart
[Figure: Number of district-level workers trained in EDU by province
in Zambia, 2012. Vertical axis: # of people (0 to 100); horizontal axis: Province.]
“Grouped” bar graph
Stacked bar graph
[Figure: # Days Since Patients Started ARVs. Vertical axis: # Patients (0 to 300);
horizontal axis: Town (Itezhi Tezhi, Kamoto, Katondwe, Mtendere, Mwandi, Siavonga).
Legend: 1-30 days, 31-60 days, 61-90 days, 91-180 days, More than 180 days.]
100% Component column graph
[Figure: # Days Since Patients Started ARVs. Vertical axis: share of patients (0% to 100%);
horizontal axis: Town (Itezhi Tezhi, Kamoto, Katondwe, Mtendere, Mwandi, Siavonga).
Legend: 1-30 days, 31-60 days, 61-90 days, 91-180 days, More than 180 days.]
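A 100% component chart plots each category as a share of its own bar total rather than as a raw count. The conversion can be sketched as below; the patient counts are hypothetical (the slide's actual values are not recoverable from the figure) and `to_percent_shares` is an illustrative helper name.

```python
# Hypothetical patient counts per duration band, for illustration only.
counts = {
    "Kamoto": {"1-30 days": 20, "31-60 days": 30, "61-90 days": 10,
               "91-180 days": 15, "More than 180 days": 25},
    "Mwandi": {"1-30 days": 5, "31-60 days": 10, "61-90 days": 5,
               "91-180 days": 10, "More than 180 days": 20},
}


def to_percent_shares(bands: dict[str, int]) -> dict[str, float]:
    """Convert raw counts into percentages of the bar's total."""
    total = sum(bands.values())
    return {band: 100 * n / total for band, n in bands.items()}


for town, bands in counts.items():
    print(town, to_percent_shares(bands))
```

Mwandi has half as many patients as Kamoto here, yet both bars reach 100%, so the chart compares composition, not size.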
Line Graphs
• Used for continuous variables
• Display relationships between 2 variables
• When used appropriately, can summarize
and display complex data clearly and
effectively and emphasize specific points
• Let you identify and present distributions,
trends, and relationships among the data
Line Graphs
[Figure: Trends in HIV prevalence among 15-49 year olds in province X: 2001–2010.
Vertical axis: % (0 to 80); horizontal axis: Year (2001 to 2010).]
Source: Ministry of Health annual report 2010
Line Graphs

http://www.ined.fr/en/everything_about_population/graphs-maps/population_graphs/
Limitations of Line Graphs
• Dependent on accurate data input from
tables and figures
• Can be difficult to see actual numbers, and
limited to the number of lines in a graph
Population Pyramid
• A population pyramid is a graphical
illustration that shows the distribution of
various age groups in a population. It is used
to depict the age-sex structure of a
population
Population Pyramid

http://www.ined.fr/en/everything_about_population/graphs-maps/population_graphs/
Key benefits of data presentation
using tables and graphs
Any graph used to report findings should
show:
• Significant features and findings of the
investigation (e.g., M&E work) in an easily
read way
• Relationships between and within
variables
• The data profile, so that findings are communicated to
planning groups and decision makers
Summarizing Continuous Variables
• Mean
• Median
• Mode
• Range
• Standard deviation
Statistics describing continuous
variable distribution
Mean
\[ \text{Mean} = \frac{\text{sum of values}}{\text{\# of observations}} \]
Example: age (yrs) of children (8, 11, 10, 8, 9)

\[ \text{Mean} = \frac{8 + 8 + 9 + 10 + 11}{5} = \frac{46}{5} = 9.2 \text{ yrs} \]
Mean
\[ \text{Mean} = \frac{\text{sum of values}}{\text{\# of observations}} \]

Example: age (yrs) of individuals (8, 56, 10, 8, 9)

\[ \text{Mean} = \frac{8 + 8 + 9 + 10 + 56}{5} = \frac{91}{5} = 18.2 \text{ yrs} \]

➢Adding an outlier (age = 56) changes the mean
dramatically!
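The two mean calculations can be checked with Python's standard `statistics` module. This is just a quick sketch; the variable names are illustrative.

```python
from statistics import mean

children = [8, 11, 10, 8, 9]       # ages from the first example
with_outlier = [8, 56, 10, 8, 9]   # same data with 11 replaced by 56

print(mean(children))      # 9.2
print(mean(with_outlier))  # 18.2
```

A single extreme value nearly doubles the mean, even though four of the five ages are unchanged.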
Median
Median = Middle value

Example: age (yrs) of children (8, 11, 10, 8, 9)

Median: - Order data: 8, 8, 9, 10, 11
- Pick the middle value
Here it is the 3rd: 9 years
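A short sketch of the median using the standard `statistics` module. The second and third calls go slightly beyond the slide: they use the outlier data from the mean example, and the usual convention (not stated on the slide) of averaging the two middle values when the number of observations is even.

```python
from statistics import median

print(median([8, 11, 10, 8, 9]))  # 9   (sorted: 8, 8, 9, 10, 11)
print(median([8, 56, 10, 8, 9]))  # 9   (the outlier 56 does not move the median)
print(median([8, 8, 9, 10]))      # 8.5 (even count: average of 8 and 9)
```

Unlike the mean, the median stays at 9 years when the outlier is present.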
Mode
Observation that occurs most frequently
9 12 15 15 15 16 16 20 26

Observation    Number of occurrences
9              1
12             1
15             3
16             2
20             1
26             1

Mode = 15 (it occurs 3 times)
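The occurrence table can be reproduced by counting each value; a minimal sketch with `collections.Counter` (variable names are illustrative).

```python
from collections import Counter

values = [9, 12, 15, 15, 15, 16, 16, 20, 26]

# Count how many times each observation occurs.
counts = Counter(values)

# The mode is the observation with the highest count.
mode_value, occurrences = counts.most_common(1)[0]
print(mode_value, occurrences)  # 15 3
```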
Range
• The spread of the data from lowest
(minimum) value to highest value
(maximum)
Example:
9 12 15 15 15 16 16 20 26

Range = 9 to 26
Sometimes reported as the difference: Range = 26 - 9 = 17
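Both ways of reporting the range come from the minimum and maximum; a brief sketch:

```python
values = [9, 12, 15, 15, 15, 16, 16, 20, 26]

lowest, highest = min(values), max(values)
print(lowest, highest)   # 9 26  (range reported as an interval)
print(highest - lowest)  # 17    (range reported as a single number)
```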
Standard Deviation (s)
• Measure of the deviation or distance of the
observations from the mean

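• Standard deviation:
\[ s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}} \]
where \bar{x} is the mean and N is the number of observations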
Standard Deviation (s)
Example: calculate the standard deviation for the
following set of numbers:
9 12 15 15 15 16 16 20 26
\[ s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}} \]

Mean = 16, N = 9
\[ (9-16)^2 + (12-16)^2 + (15-16)^2 + (15-16)^2 + (15-16)^2 + (16-16)^2 + (16-16)^2 + (20-16)^2 + (26-16)^2 = 184 \]
\[ s = \sqrt{\frac{184}{8}} = \sqrt{23} \approx 4.80 \]
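The worked example can be verified step by step, and cross-checked against `statistics.stdev`, which also uses the N - 1 denominator. A sketch with illustrative variable names:

```python
from math import sqrt
from statistics import stdev

values = [9, 12, 15, 15, 15, 16, 16, 20, 26]

n = len(values)                      # 9 observations
x_bar = sum(values) / n              # mean = 16.0
sum_sq = sum((x - x_bar) ** 2 for x in values)  # sum of squared deviations = 184.0
s = sqrt(sum_sq / (n - 1))           # sample standard deviation

print(x_bar, sum_sq)            # 16.0 184.0
print(round(s, 2))              # 4.8
print(round(stdev(values), 2))  # 4.8
```

Note the square root: without it the result (184 / 8 = 23) is the variance, not the standard deviation.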
Data Sources and Utilization Level

Utilization Level                                    | Type of Data                                       | Sources
Inputs (Resources)                                   | Administrative Data                                | Administrative Records (e.g. Hospital Records, Health Statistics)
Process (Activities)                                 | Administrative Data                                | Administrative Records
Program Outputs (e.g. # of Clinics built)            | Administrative Data                                | Administrative Records
Intermediate Outcomes (e.g. Access to Health Care)   | Administrative Data & Surveys (ZDHS, ZAMPHIA, LFS) | Administrative Records (MOH), CSO
Impact                                               | Surveys (ZDHS, LFS etc.), Census                   | CSO
EVALUATION OF
DEMOGRAPHIC DATA
Overview
• Data are used in many areas that span
all aspects of human, economic,
social and political development.
• Various sources are used to capture social,
economic and demographic data, such as
the census, surveys, vital registration
systems and population registers, among
others.
Overview
• In developing countries the common
sources are censuses and surveys.
• Although surveys and censuses are the most
dependable sources of data, their reliability
and accuracy cannot be fully guaranteed.
• Thus, social, economic and demographic
data need to be evaluated before they are used.
Why Evaluate Data?
• To identify sources and types of errors in order to
know which sections of data contain errors and
what procedures of data collection and
processing lead to these errors; this is useful in
planning future preventive measures
• To determine the accuracy of data collected
• To make appropriate adjustments to the data
before they can be used confidently to estimate
demographic events and plan for social and economic
development.
Principal Sources of Errors in
Data
• Errors resulting from inadequate
preparation in data collection, collation and
analysis
• Errors committed during data collection
• Errors due to processing of data (editing,
coding, transcription, entry)
• Sampling errors
• Non Sampling Errors
Non Sampling Errors
• These are errors that arise during the course
of all data collection activities and
processing procedures
• These exist in both sample survey and
census data.
• Non-sampling errors can be divided into two
main categories:
– Coverage errors
– Content errors
Types and Sources of Errors in Data
A. Coverage Errors
– These result from under-enumeration or over-enumeration
of a place, village or persons.
– They occur when there is an omission, duplication or
erroneous inclusion of units in the sampling frame.
B. Content Errors
– These arise as a result of:
– Interviewer biases
– Use of wrong questionnaires
– Misunderstanding of questions, memory lapse,
deliberately giving wrong answers
– Data entry errors
– Response, non-response, data processing and reporting errors
Common Errors in data
• Errors in demographic data affect the recorded composition of the
population, such as age, marital status and other
characteristics.
• Misstatement of age
– Underreporting (or under-enumeration) of age,
which bears on the coverage or completeness of age
enumeration
– Misreporting of age, and
– Non-reporting of age
• In some developed countries, misreporting of sex is
negligible, for there appears to be little or no reason for
one sex to be reported at the expense of the other.
Age and Sex Data
• Measurement of Age and Digit Preference
• Types of Age errors
• Graphical Analysis
• Indexes of age preference
• Whipple’s Index
• Myers’ blended index
• Age ratio analysis
Methods of Detecting Errors
• Two methods:
A. Direct Method
– This method uses two independent data
collections that gather the same
information, e.g. the census and the
post-enumeration survey. The post-enumeration
survey is done in a selected area, and its
results are compared with the census
results for that area. The difference
between the two reflects
the margin of error/accuracy.
Methods of Detecting Errors
B. Indirect Method
• External consistency check method
(involves comparing the results from one
census to another, or the census with other
sources, e.g. surveys or the vital registration system (VRS))
– Balancing equation
– Error of closure
– Intercensal growth rate
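The slide names these three checks without giving formulas. The sketch below assumes the usual textbook forms: the balancing equation P1 = P0 + births - deaths + immigrants - emigrants, the error of closure as the enumerated minus the expected population, and an exponential intercensal growth rate r = ln(P1 / P0) / t (other growth models exist). Function names and all figures are hypothetical.

```python
from math import log


def expected_population(p0, births, deaths, immigrants, emigrants):
    """Balancing equation: population expected at the second census."""
    return p0 + births - deaths + immigrants - emigrants


def error_of_closure(enumerated, expected):
    """Difference between the enumerated and the expected population."""
    return enumerated - expected


def intercensal_growth_rate(p0, p1, years):
    """Average annual growth rate, assuming exponential growth."""
    return log(p1 / p0) / years


# Hypothetical figures, for illustration only.
expected = expected_population(p0=1_000_000, births=400_000, deaths=150_000,
                               immigrants=20_000, emigrants=30_000)
print(expected)                               # 1240000
print(error_of_closure(1_250_000, expected))  # 10000
r = intercensal_growth_rate(1_000_000, 1_250_000, years=10)
print(round(100 * r, 2))                      # 2.23 (% per year)
```

A large error of closure signals coverage problems in one of the censuses or incomplete registration of births, deaths or migration.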
Indirect Methods Cont...
• Internal consistency check: this method
relies heavily on age-sex data, because
age and sex influence many social,
economic and demographic data
– Single-year age data: this method detects errors resulting from
age misstatement and from preference for ages ending
in 0 or 5 or in even digits. Under this method the
following are used:
• Graphical method, which provides a visual inspection of the
data
• Mathematical methods, which involve deriving
indices of digit preference.
Graphical Method
Methods used to detect digit
preference (Errors in data)
Index of age preference
• Whipple's index: this method was
developed to detect the degree of age
misstatement that can be attributed to the
preference for ages ending in the digits 0 and 5
• Myers' blended index: this method was
developed to detect preference for each
terminal digit from 0 to 9. It gives
equal weight to each of the ten terminal
digits.
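The slide does not give the formula for Whipple's index. The sketch below assumes the standard definition: the population reporting ages ending in 0 or 5 within ages 23 to 62, divided by one fifth of the total population aged 23 to 62, times 100. A value of 100 indicates no preference for 0 and 5; 500 means every age was reported as ending in 0 or 5. The function name and the test populations are illustrative.

```python
def whipple_index(pop_by_age: dict[int, float]) -> float:
    """Whipple's index from single-year age counts (ages 23-62 are used)."""
    ages = range(23, 63)
    total = sum(pop_by_age.get(age, 0) for age in ages)
    # Ages 25, 30, ..., 60: those ending in 0 or 5 within 23-62.
    heaped = sum(pop_by_age.get(age, 0) for age in ages if age % 5 == 0)
    return 100 * heaped / (total / 5)


# Hypothetical populations, for illustration only.
no_heaping = {age: 100 for age in range(0, 100)}
full_heaping = {age: (100 if age % 5 == 0 else 0) for age in range(0, 100)}

print(whipple_index(no_heaping))    # 100.0
print(whipple_index(full_heaping))  # 500.0
```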
Methods of
Smoothing/Graduating data
– Graphical methods
– Mathematical methods
• Moving averages
• UN 5 point Method
• Carrier-Farrag method
• Sex Ratio Method
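Of the methods listed, the moving average is the simplest to illustrate. The sketch below assumes an unweighted, centred window of odd width; the slide does not specify weights, and the other listed methods use their own formulas. The function name and the counts are hypothetical.

```python
def moving_average(values: list[float], window: int = 3) -> list[float]:
    """Unweighted centred moving average; end points without a full window are dropped."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


# Hypothetical counts by single year of age showing a zig-zag pattern.
counts = [90, 120, 90, 120, 90]
print(moving_average(counts))  # [100.0, 110.0, 100.0]
```

The smoothed series varies between 100 and 110 instead of 90 and 120, damping the heaping while keeping the overall level.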
