Lesson 6b: Analyze Data (Summary Statistics): Integrated Training for Surveillance Officers in Nigeria (ITSON)

Uploaded by

Salihu Mustapha
Integrated Training for Surveillance Officers

in Nigeria (ITSON)

Lesson 6b:
Analyze Data
(Summary Statistics)
Learning objectives

At the end of this lesson, participants will be able to:


• Use tables to describe characteristics of the
affected population
• Use maps to analyze location of populations at risk
• Use tables, graphs and histograms to analyze
trends
• Calculate (or identify) and explain the following
measures of central location and spread: mode,
median, mean, and range

Lesson 6: Analyze Data


Learning objectives

• Calculate and explain the following measures of
disease frequency: incidence, prevalence, attack
rates, mortality rates
• Draw conclusions about analysis results
• Make recommendations based on the conclusions

Reported cases of cholera, Zango LGA,
August 2008

Zango Health Office


Summary Report
of
Reportable Diseases

August, 2008


The surveillance cycle: Analyze data

[Figure: the surveillance cycle, with the steps Identify, Report, Analyze/Interpret, Prepare/Respond, Communicate and Evaluate arranged in a circle]


Why analyze data?

• Know the size of the health problem
• Monitor trends and take prompt action
• Identify the cause of the problem
• Monitor progress of public health programs


Analyze routine data by person, place
and time

• Person: Describe who is at greatest risk for the disease and potential risk factors
• Place: Determine where cases are occurring
• Time: Detect time-based changes in disease and the period of time from exposure to onset of symptoms


Analysis by time

• Objective:
– Detect changes in trends over time
• Tools used:
– Tables, line-graphs, histograms (epi-curve)
• Events that occurred during the outbreak can also be
highlighted on the graph with arrows, e.g.:
– Onset of the first (or index) case
– The date the health facility notified the LGA
– The date the first case was seen at the health facility
– The date the LGA began the case investigation
– The date the response activity began
– The date the LGA notified the higher level


[Figure: trend of confirmed measles cases in Lukawa village]


Analysis by place

• Residence
• Birthplace
• Employment
• School
• Location in care facility
• Travel
• Urban/rural
• Ecologic zone


[Figure: distribution of CSM cases in Nigeria, January 2018 (analysis by place)]


Analysis by person

• Inherent
– Age ― Sex ― Race/ethnicity/tribe
• Acquired
– Immunity
• Marital status
• Activities
– Occupation ― Leisure ― Medications/tobacco/alcohol
• Living conditions
– Socioeconomic status ― Urban/rural


Types of variables

• Variable type matters because different types of
variables are analyzed differently
• Qualitative/categorical data
– Descriptions
– Non-numeric information
– Examples: Illness (yes/no), sex, district
• Quantitative data
– Measurements ― Numeric data
– Examples: Age, height, number of children


Example — Variable types

• For each variable, state whether it is qualitative or
quantitative.
1. Age (years): Quantitative
2. Marital status: Qualitative
3. No. living siblings: Quantitative
4. HIV status (pos. / neg.): Qualitative
5. CD4+ T-cell count: Quantitative
6. Use seat belts (never, occasionally, usually,
always): Qualitative


Different types of variables are analyzed
differently

Variable Type    Tools for Summarizing
Quantitative     Measures of central location (mean, median, mode, range, etc.)
Qualitative      Measures of disease frequency (ratios, proportions, frequency distributions, rates)


Calculating Measures of
Central Location



Types of variables

Scale                      Example                                      Summary
Qualitative/Categorical    Disease (yes/no), country                    Proportions, rates
Quantitative/Continuous    Age in years, height, CD4+ T-cell counts     Measures of central location and spread


Measure of central location and spread

[Figure: histogram of number of people (0 to 20) by 10-year age group (0-9 to 90-99), with the central location and the spread of the distribution marked]


• Raw data set: Incubation period (days) of EVD cases, West Africa

Days: 9, 8, 16, 9, 10, 7, 9, 9, 6, 3, 11, 8, 17, 9, 8, 13, 8, 10, 7


• Order the data set from the lowest value to the highest value
• Add observation numbers

Obs   Days
1     3
2     6
3     7
4     7
5     8
6     8
7     8
8     8
9     9
10    9
11    9
12    9
13    9
14    10
15    10
16    11
17    13
18    16
19    17


What is the mode?

• Definition: Value that occurs most frequently
• To identify mode, follow these steps:
1. Arrange raw data in ascending order
2. Identify the value that occurs most often


What is the mode of our data set?

Mode = 9 days (the value 9 occurs five times, at observations 9 to 13, more often than any other value)


Mode

Steps to identify mode:


1. Arrange raw data in ascending order
or
Create frequency distribution
or
Draw histogram
2. Identify the value that occurs most
often (if any)


Example: Mode of cholera age data
Sorted by Age

Age Frequency Age Frequency


9 2 37 1
17 1 38 1
19 2 44 1
21 1 54 1
22 3 61 2
27 1 63 1
28 2 64 1
30 1 67 1
35 1 69 1

Mode = 22


Identifying the mode from a histogram

Mode = most frequent value
Mode = 9 days

[Figure: histogram of the incubation period data, No. Cases (0 to 6) by Days (1 to 20)]
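The frequency distribution behind such a histogram can be sketched in a few lines of Python. This is only an illustration using the incubation period data from this lesson; the function names `frequency_distribution` and `text_histogram` are made up for the example.

```python
from collections import Counter

# Incubation periods (days) from the raw EVD data set in this lesson.
incubation_days = [9, 8, 16, 9, 10, 7, 9, 9, 6, 3, 11, 8, 17, 9, 8, 13, 8, 10, 7]


def frequency_distribution(values):
    """Map each observed value to how often it occurs, in ascending order."""
    return dict(sorted(Counter(values).items()))


def text_histogram(values):
    """Return one line per observed value, with one '#' per case."""
    freq = frequency_distribution(values)
    return [f"{value:>2} | {'#' * count}" for value, count in freq.items()]


for line in text_histogram(incubation_days):
    print(line)
```

The longest row of `#` marks corresponds to the tallest bar of the histogram, which is the mode.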


Mode – Properties/uses

• Easiest measure to understand, explain, identify


• May be more than one mode
• May be no mode
• Mode may not be “central”
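
As a rough illustration of these properties, the sketch below returns every value tied for the highest frequency, and an empty list when no value repeats. The function name `modes` is an assumption made for this example; the two data sets are the ones used in this lesson.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often (empty list if none repeats)."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)


incubation_days = [9, 8, 16, 9, 10, 7, 9, 9, 6, 3, 11, 8, 17, 9, 8, 13, 8, 10, 7]
cholera_ages = [54, 69, 61, 63, 9, 37, 44, 17, 28, 28, 21, 30,
                22, 67, 9, 38, 22, 19, 22, 19, 35, 64, 27, 61]

print(modes(incubation_days))  # [9]
print(modes(cholera_ages))     # [22]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  (more than one mode)
print(modes([1, 2, 3]))        # []      (no mode)
```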



What is the median?

Definition: Middle value; value that splits the
distribution into two equal parts
– 50% of observations are below the median
– 50% of observations are above the median

To calculate median, follow these steps:


1. Arrange observations in order
2. Find middle position as (n + 1)/2
3. Identify the value at the middle


Finding median of sorted series,
odd number of values (n = 19)

Steps
1. Sort
2. Find middle position = (19 + 1)/2 = 10th
3. Median = value at 10th position = 9

Median = 9, with 9 observations below the median and 9 observations above the median


Finding median of sorted series,
even number of values (n = 20)

Data: the same sorted series of 19 incubation periods, plus a 20th observation of 49 days

Steps
1. Sort
2. Find middle position = (20 + 1)/2 = 10.5th
3. Median = average of the values at the 10th and 11th positions = (9 + 9)/2 = 9

Median = 9


Example: Median of cholera age data

Age (yr), sorted, n = 24:
9, 9, 17, 19, 19, 21, 22, 22, 22, 27, 28, 28, 30, 35, 37, 38, 44, 54, 61, 61, 63, 64, 67, 69

Steps
1. Sort
2. Find middle position = (24 + 1)/2 = 12.5th
3. Median = average of the values at the 12th and 13th positions = (28 + 30)/2 = 29
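The three median steps can be written out as a short Python sketch. The function below is an illustration only (Python's standard library already offers `statistics.median`); it follows the (n + 1)/2 middle-position rule and is applied to the three series used in this lesson.

```python
def median(values):
    """Median using the (n + 1)/2 middle-position rule."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)            # step 1: sort
    n = len(ordered)
    position = (n + 1) / 2              # step 2: middle position, counted from 1
    if position.is_integer():           # odd n: a single middle value
        return ordered[int(position) - 1]
    lower = ordered[int(position) - 1]  # even n: e.g. 10.5 -> 10th and 11th values
    upper = ordered[int(position)]
    return (lower + upper) / 2          # step 3: average the two middle values


incubation_days = [9, 8, 16, 9, 10, 7, 9, 9, 6, 3, 11, 8, 17, 9, 8, 13, 8, 10, 7]
cholera_ages = [54, 69, 61, 63, 9, 37, 44, 17, 28, 28, 21, 30,
                22, 67, 9, 38, 22, 19, 22, 19, 35, 64, 27, 61]

print(median(incubation_days))         # n = 19 -> 9
print(median(incubation_days + [49]))  # n = 20 -> 9.0
print(median(cholera_ages))            # n = 24 -> 29.0
```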


Median – Properties/uses

• Good descriptive measure


• Measure of choice for data that are not distributed
symmetrically
• Focuses on the center of the data, so is not
affected by a few extreme values (“outliers”)



What is arithmetic mean?

• Definition: “Average” value


To calculate mean follow these steps:
1. Sum up all of the values
2. Divide the sum by the number of observations (n)



Arithmetic mean

Recall: Mode = 9, Median = 9

Mean = Sum of all values / n

Sum = 177
n = 19

Mean = 177/19 = 9.3


Which measures are “sensitive”
to outliers?

Measure   Original data   19th obs = 56 days
Mode      9               9
Median    9               9
Mean      9.3             216/19 = 11.4
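A small Python sketch can reproduce this comparison. It uses the standard `statistics` module and the incubation period data from this lesson; the variable name `with_outlier` is chosen for the example.

```python
from statistics import mean, median

# Sorted incubation periods (days); the 19th observation is 17 days.
incubation_days = [3, 6, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 11, 13, 16, 17]

# Replace the 19th observation with an extreme value of 56 days.
with_outlier = incubation_days[:-1] + [56]

for label, data in [("original", incubation_days), ("19th obs = 56", with_outlier)]:
    print(label, "| mean =", round(mean(data), 1), "| median =", median(data))
```

The mean moves from 9.3 to 11.4 while the median stays at 9, which is why the median is usually the safer summary when outliers are present.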


Example: Mean of Cholera Age Data

ID   Age (yr)
01   54
02   69
03   61
04   63
05   9
06   37
07   44
08   17
09   28
10   28
11   21
12   30
13   22
14   67
15   9
16   38
17   22
18   19
19   22
20   19
21   35
22   64
23   27
24   61

Steps to identify mean:
1. Sum up all of the values: Sum = 866
2. Divide the sum by the number of observations (n): n = 24
3. Mean = 866/24 = 36.1


Arithmetic mean – Properties/uses

• Probably best known measure of central location


• Uses all of the data
• Affected by extreme values (outliers)
• Best for symmetrically distributed data


Range

• Definition: description of smallest to largest value


Method for identification
1. Sort data or create frequency distribution
2. Find minimum and maximum values



Range

Minimum value = 3
Maximum value = 17

Range = 3 to 17 days


Example: How to summarize…

Cholera Age Data
• Mode = 22
• Median = 29
• Mean = 36.1
• Range = 9 – 69 years
• Age, median (range): 29 (9–69) years

Ebola Incubation Period Data
• Mode = 9
• Median = 9
• Mean = 9.3
• Range = 3 – 17 days
• Incubation period, median (range): 9 (3–17) days
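The "median (range)" summary line can be produced with a short helper. This is a sketch only; the function name `median_with_range` and its output format (with a plain hyphen in the range) are assumptions made for the example.

```python
from statistics import median


def median_with_range(values, unit):
    """Format a quantitative variable as 'median (minimum-maximum) unit'."""
    return f"{median(values):g} ({min(values)}-{max(values)}) {unit}"


cholera_ages = [54, 69, 61, 63, 9, 37, 44, 17, 28, 28, 21, 30,
                22, 67, 9, 38, 22, 19, 22, 19, 35, 64, 27, 61]
incubation_days = [9, 8, 16, 9, 10, 7, 9, 9, 6, 3, 11, 8, 17, 9, 8, 13, 8, 10, 7]

print("Age, median (range):", median_with_range(cholera_ages, "years"))
print("Incubation period, median (range):", median_with_range(incubation_days, "days"))
```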


Exercise 2: Calculate measures
of central location

• Review the Ebola data set


• Determine the mode, median, mean, and range
for:
– Number of Household Residents (“No. HH Residents”)
for all cases
– Days to Death for those who died



Points to remember: Measures
of central location
• Measure of Central Location – single measure that
represents an entire distribution
– Mode: most common value
– Median: central value
– Mean: average value
– Mean uses all data, so sensitive to outliers
– Mean preferred for symmetrical data, which is not
common
– In epidemiology, median is safer choice
– Use median with range


Common Measures of
Disease Frequency



Measures of frequency

▪ Counts
▪ Ratios
▪ Proportions
▪ Rates


Cerebrospinal meningitis cases and
deaths by State, Nigeria – March, 2017
States Total No of Cases Deaths
Zamfara 1225 162
Katsina 175 44
Sokoto 414 32
Kebbi 53 8
Niger 81 33
Nasarawa 1 0
FCT 4 0
Gombe 1 0
Taraba 45 0
Cross River 2 0
Lagos 3 2
Total 2004 281


Counts

• Common descriptive measure


• Provide picture of burden of disease
• Essential for service delivery, planning
• First step in calculating rates



Common form of measures of frequency

Ratios, proportions and rates = (x / y) x 10^n

Where x = numerator
y = denominator
10^n = constant (1, 100, 1000, etc.)


Ratio example

• In 2014, Country A recorded 202,143 male live


births and 193,141 female live births
• Calculate the ratio of male to female live births

No. of male live births / No. of female live births
= 202,143 / 193,141
= 105 males per 100 females
= 100 males per 95.5 females


Ratio example

• A city of 4 million people has 400 clinics


• Calculate the ratio of clinics per person

– Ratio = 400/4,000,000 = 0.0001 clinics per person
– Multiply by 10,000 to express the ratio per 10,000 persons
– Ratio = 0.0001 x 10,000
= 1 clinic per 10,000 persons


What is the ratio of males to females
among patients with cholera?

• Number of males = 14
• Number of females = 10
• Ratio of males to females = 14 : 10
or 1.4 : 1
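
The three ratio examples in this lesson can be checked with a small Python sketch. The function name `ratio` and its `per` parameter are made up for the illustration.

```python
def ratio(x, y, per=1):
    """Return x relative to y, scaled to read 'x per `per` units of y'."""
    if y == 0:
        raise ValueError("the denominator of a ratio cannot be zero")
    return x * per / y


# Males to females among the 24 cholera patients.
print(ratio(14, 10))
# Male live births per 100 female live births, Country A, 2014.
print(round(ratio(202_143, 193_141, per=100), 1))
# Clinics per 10,000 persons in a city of 4 million people with 400 clinics.
print(ratio(400, 4_000_000, per=10_000))
```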



Proportions

• Definition: Comparison of a part to the whole


• Useful for describing distribution of characteristics
within a population
• Proportion = x/y
• Where:
– x is the number with a characteristic
– y is the total number
– Percent = proportion x 100%, e.g., (x / y) x 100%



Cerebrospinal meningitis cases and
deaths by State, Nigeria – March, 2017

States        Cases   (%)    Deaths   (%)
Zamfara       1225    61     162      58
Katsina       175     8.7    44       16
Sokoto        414     20.7   32       11
Kebbi         53      2.6    8        3
Niger         81      4.0    33       12
Nasarawa      1       <0.1   0        0
FCT           4       0.2    0        0
Gombe         1       <0.1   0        0
Taraba        45      2.2    0        0
Cross River   2       0.1    0        0
Lagos         3       0.1    2        0.7
Total         2004    100    281      100


Proportions - Example

• Number of females = 10
• Total number of patients = 24
• Proportion of females = 10/24
= 0.417
• Percentage of females = 0.417 x 100%
= 41.7%
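A proportion and its percentage can be computed with a sketch like the one below. The function names `proportion` and `percent` are assumptions made for the example; the inputs are the cholera patient counts and the Zamfara case count from the meningitis table.

```python
def proportion(part, whole):
    """Return part/whole; the part must be contained in the whole."""
    if whole <= 0 or not 0 <= part <= whole:
        raise ValueError("need 0 <= part <= whole and whole > 0")
    return part / whole


def percent(part, whole, digits=1):
    """Return the proportion expressed as a percentage, rounded to `digits`."""
    return round(100 * proportion(part, whole), digits)


print(round(proportion(10, 24), 3))  # proportion of female cholera patients
print(percent(10, 24))               # the same proportion as a percentage
print(percent(1225, 2004))           # Zamfara's share of all CSM cases
```

Unlike a ratio, the numerator of a proportion is always included in the denominator, which is why the helper rejects a part larger than the whole.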


Health-related rates

• Incidence
• Prevalence
• Attack rate
• Case-fatality rate
• Mortality rate
• Many others



Incidence rate

• Definition: Frequency with which an event (such
as a new case of illness) occurs in a population
over a specified period of time

Incidence rate = [No. new cases during specified period / ((size of population) x (time))] x Constant

– Example 1: During the past year, 10 cases of measles in a
population estimated to be ~ 250,000
– Calculate incidence rate per 100,000 per year
– [10 cases / (250,000 pop. x 1 year)] x 100,000 = 4.0 cases
per 100,000 population per year


Incidence rate (continued)

Example 2: 18 cases of measles over 2 years in a
population estimated to be ~ 250,000
Calculate incidence rate per 100,000 per year
[18 cases / (250,000 pop. x 2 years)] x 100,000 = 3.6
- On average, incidence rate of measles was 3.6 cases
per 100,000 population per year
Example 3: 574 hospital admissions over 7 days
among an estimated 32,000 refugees in a new camp
Calculate incidence rate per 10,000 per day
[574 cases / (32,000 x 7 days)] x 10,000 = 25.6/10,000/day
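The incidence rate formula can be sketched in Python as follows. The function name `incidence_rate` and its parameter names are illustrative; the inputs are the three worked examples from this lesson.

```python
def incidence_rate(new_cases, population, time_units, per):
    """New cases per `per` population per unit of time."""
    if population <= 0 or time_units <= 0:
        raise ValueError("population and time must be positive")
    return new_cases * per / (population * time_units)


# Example 1: 10 measles cases in 1 year, population ~250,000, per 100,000 per year.
print(incidence_rate(10, 250_000, 1, per=100_000))
# Example 2: 18 measles cases over 2 years, same population, per 100,000 per year.
print(round(incidence_rate(18, 250_000, 2, per=100_000), 1))
# Example 3: 574 admissions over 7 days among 32,000 refugees, per 10,000 per day.
print(round(incidence_rate(574, 32_000, 7, per=10_000), 1))
```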


Attack rate (“Risk”)

• Definition: Frequency with which an event (such
as a new case of illness) occurs in a population
over a specified period of time, usually in an outbreak

Attack rate = (No. new cases during specified period / Size of population at start of that period) x Constant (such as 100% or 1,000)

• Example: 16 cases of cholera in village of 800
• Express attack rate as percentage (per 100)
16 / 800 = 0.02
0.02 x 100 = 2.0
2 cases per 100 population = 2.0%


Counts versus attack rates
Number of Cases of Hemorrhagic Fever by Age and Sex, Zaire, 1976
Age (years) Male Female Total
<1 10 14 24
1 – 14 18 25 43
15 – 29 33 60 93
30 – 49 57 52 109
≥ 50 23 26 49
Total 141 177 318

Q1. Which age group had the most cases? 30 – 49 year olds
Q2. Which age group had the greatest risk of illness? ???
Need denominators (population size) to calculate risk


Counts versus attack rates
Number of Cases of Hemorrhagic Fever by Age and Sex, Zaire, 1976
Age (years)   Male cases   Male pop.   Female cases   Female pop.   Total cases   Total pop.
<1            10           800         14             850           24            1,650
1 – 14        18           8,200       25             8,150         43            16,350
15 – 29       33           5,500       60             6,000         93            11,500
30 – 49       57           6,250       52             6,750         109           13,000
≥ 50          23           3,000       26             4,500         49            7,500
Total         141          23,750      177            26,250        318           50,000

Q3. Calculate attack rate (risk) for females 15-29 years old,
per 1,000 population.
A3. (60 / 6,000) x 1,000 = 0.01 x 1,000 = 10 cases / 1,000 pop.
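To see how counts and attack rates can point to different age groups, the following Python sketch computes the attack rate per 1,000 for each age group from the total cases and population figures in the table. The function name `attack_rate` and the dictionary layout are assumptions made for the illustration.

```python
def attack_rate(new_cases, population_at_start, per=1_000):
    """New cases per `per` persons in the population at the start of the period."""
    if population_at_start <= 0:
        raise ValueError("population must be positive")
    return new_cases * per / population_at_start


# (total cases, total population) by age group, hemorrhagic fever, Zaire, 1976.
totals = {
    "<1": (24, 1_650),
    "1-14": (43, 16_350),
    "15-29": (93, 11_500),
    "30-49": (109, 13_000),
    ">=50": (49, 7_500),
}

rates = {age: attack_rate(cases, pop) for age, (cases, pop) in totals.items()}
for age, rate in rates.items():
    print(f"{age:>5}: {rate:.1f} cases per 1,000")

most_cases = max(totals, key=lambda age: totals[age][0])
highest_risk = max(rates, key=rates.get)
print("Most cases:", most_cases, "| Highest attack rate:", highest_risk)
```

With these figures the 30 to 49 year olds have the most cases, but infants under 1 year have the highest attack rate, which is the point of comparing counts with rates.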


Prevalence

• Prevalence of disease = Number of existing
(prevalent) cases of disease present in a defined
population at a point or over a period of time
– New or old cases
– Doesn’t directly measure risk
• Prevalence of an attribute = proportion of persons
with a particular attribute at a “point” or period of
time



Prevalence – Examples

Prevalence of HIV infection in Makurdi, 2014 =
(No. persons living with HIV infection in Makurdi in 2014) / (Makurdi population on 1 July 2014)

Prevalence of cigarette smoking in Katsina, 2014 =
(No. persons who smoked cigarettes in Katsina in 2014) / (Katsina population on 1 July 2014)


Prevalence – Examples

[Figure: timeline running from 1 July to 1 August]


Comparing incidence and prevalence

Incidence
• NEW cases or events over period of time
• Useful for studying factors that cause disease (“risk factors”)

Prevalence
• ALL cases at point/period of time
• Useful for measuring size of problem and planning


Death (mortality) rate

• Definition: frequency of death in a defined
population during a specified period of time
– Death rate
– Cause-specific death rate
– Age-specific death rate
– Infant mortality rate
– Maternal mortality rate


Death rate

• Form: (numerator / denominator) x 10^n
– numerator = number of deaths during specified period
– denominator = estimated population
– 10^n = 1,000 or 100,000
• Example:
– 540,000 deaths from all causes, Country A, 2014
– 60,000,000 estimated population, Country A, 2014
– Calculate the mortality rate per 1,000
– (540,000 / 60,000,000) x 1,000
– 9.0 deaths per 1,000 population
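The death rate calculation follows the same (x / y) x 10^n form and can be sketched as below. The function name `death_rate` is an assumption; the inputs are the Country A figures from this lesson.

```python
def death_rate(deaths, population, per=1_000):
    """Deaths during the period per `per` persons in the estimated population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * per / population


# Country A, 2014: 540,000 deaths in an estimated population of 60,000,000.
print(death_rate(540_000, 60_000_000))               # per 1,000 population
print(death_rate(540_000, 60_000_000, per=100_000))  # per 100,000 population
```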


Case fatality rate (CFR)

• Definition: The proportion of people with a
particular disease (cases) who die from the
disease
– Describes the virulence or lethality of the disease
– Actually a proportion
– Often reported as a percentage

CFR = (No. deaths due to a disease / No. cases of that disease) x Constant (such as 100)


Confirmed Human H5N1 cases,
2003–2013
Country             Cases   Deaths   CFR
Indonesia           195     163      84%
Egypt               173     63       36%
Vietnam             125     62       50%
China               45      30       67%
Cambodia            47      33       70%
Others              64      34       53%
Total (worldwide)   649     385      ???

Q1. How did WHO calculate the case-fatality rate for China?
30 / 45 x 100% = 67%
Q2. Calculate the worldwide case-fatality rate for 2003-2013
385 / 649 x 100% = 59%
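The case-fatality rates in the table can be recomputed with a short Python sketch. The function name `case_fatality_rate` and the dictionary layout are made up for the illustration; the counts are those shown in the table.

```python
def case_fatality_rate(deaths, cases):
    """Percentage of cases of a disease that died from it."""
    if cases <= 0 or not 0 <= deaths <= cases:
        raise ValueError("need 0 <= deaths <= cases and cases > 0")
    return 100 * deaths / cases


# (cases, deaths) of confirmed human H5N1 infection, 2003-2013.
h5n1 = {
    "Indonesia": (195, 163),
    "Egypt": (173, 63),
    "Vietnam": (125, 62),
    "China": (45, 30),
    "Cambodia": (47, 33),
    "Others": (64, 34),
}

cfr_by_country = {
    country: round(case_fatality_rate(deaths, cases))
    for country, (cases, deaths) in h5n1.items()
}
total_cases = sum(cases for cases, _ in h5n1.values())
total_deaths = sum(deaths for _, deaths in h5n1.values())
worldwide_cfr = round(case_fatality_rate(total_deaths, total_cases))

print(cfr_by_country)
print(f"Worldwide: {total_deaths}/{total_cases} = {worldwide_cfr}%")
```

Note that the worldwide figure is computed from the total deaths and total cases, not by averaging the country percentages.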


Exercise 3: Calculate measures
of frequency
1. What was the prevalence of HIV among this
cohort of men at the beginning of the study?
2. What was the prevalence of HIV among this
cohort of men at the end of the study?
3. What was the incidence of HIV infection during
the study?
4. What was the incidence of AIDS during the study?
5. What proportion of HIV-seropositive persons
developed AIDS?


Conclusions

• The time to think about how to analyze the data is
before you begin to analyze the data!
• Create analysis plan
• For quantitative variables:
– Summarize with
▪ Mode (not common)
▪ Mean
▪ Median (median is safe choice)
• Use median with range


Conclusions

• For qualitative variables
– Summarize with
▪ Ratios (not common)
▪ Proportions
▪ Rates
– Rates are preferred but require having population
denominators
• “Analysis turns data into information!”


Points to remember

• Time: trend
• Place: map
• Person: demographics
• Counts: number of cases
• Ratio: comparison of any two numbers
• Proportion: part of the whole
• Rate: number of cases divided by population
• Attack rate: new cases, short time interval


Points to remember

• Incidence rate: new cases, any time interval, need
to take time into account
• Prevalence rate: current cases in population,
regardless of time of onset
• Case-fatality rate: proportion of cases that died


The surveillance cycle: Analyze data

[Figure: the surveillance cycle, with the steps Identify, Report, Analyze/Interpret, Prepare/Respond, Communicate and Evaluate arranged in a circle]


How I can use this at my job

TOPIC                                                HOW CAN I USE THIS AT MY JOB
Introduction to public health surveillance
Identify priority diseases, conditions and events
Report priority diseases, events and conditions
Role of Laboratory
Collect and Organize Data
Analyze Data
Interpret Data
Outbreak investigation (Descriptive Epidemiology)