3.data Org & Summerization

Uploaded by

masresha
Methods Of Data Collection, Organization And Presentation

Samuel D. [BSc/PH, MPH/Epidemiology & Biostatistics]
Learning Objectives
At the end of this chapter, the students will be able to:

✓ Identify the different methods of data organization and presentation

✓ Understand the criteria for selecting a method to organize and present data

✓ Identify the different methods of data collection and the criteria used to select a method of data collection

Introduction

• Before any statistical work can be done, data must be collected.

• Depending on the type of variable and the objective of the study, different data collection methods can be employed.

Frequency Distributions

• A frequency is the number of times a given datum occurs in a data set.

• A frequency distribution is a table that shows "classes" or "intervals" of data entries with a count of the number of entries in each class.

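As a minimal sketch of the definition above, the following Python snippet counts how often each value occurs in a small data set. The data (number of siblings reported by ten respondents) and the function name `frequency_table` are hypothetical and serve only as an illustration.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())


# Hypothetical data: number of siblings reported by ten respondents.
siblings = [2, 0, 1, 2, 3, 1, 2, 0, 2, 1]

for value, freq in frequency_table(siblings):
    print(value, freq)
```

The frequencies always add up to the number of observations, which is a quick check that no datum was missed.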
A categorical distribution

• Non-numerical information can also be represented in a frequency distribution.

• In connection with large sets of data, a good overall picture and sufficient information can often be conveyed by grouping the data into a number of class intervals.

A categorical distribution cont…
• Example

Age (years)    Number of persons
Under 18       1,748
18 – 24        3,325
25 – 34        3,149
35 – 44        1,323
45 – 54        512
55 and over    335
Total          10,392

• This kind of frequency distribution is called a grouped frequency distribution.

A categorical distribution cont…

• Frequency distributions present data in a relatively compact form, give a good overall picture, and contain information that is adequate for many purposes, but there are usually some things which can be determined only from the original data.

• For instance, the above grouped frequency distribution cannot tell us how many of the arrested persons are 19 years old, or how many are over 62.

A categorical distribution cont…

• The construction of a grouped frequency distribution consists essentially of four steps:
1. Choosing the classes
• Choosing a suitable classification involves choosing the number of classes and the range of values each class should cover, that is, where each class should begin and end.

• Both of these choices are arbitrary to some extent, but they depend on the nature of the data and its accuracy, and on the purpose the distribution is to serve.

A categorical distribution cont…
• A guide for determining the number of classes (K) is Sturges' formula:

• K = 1 + 3.322 × log10(n), where n is the number of observations

• The length or width of the class interval (W) can then be calculated as:
• W = (Maximum value – Minimum value) / K = Range / K

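The two formulas above can be sketched in Python as follows. The function names are hypothetical, and rounding K up to the next whole number is an assumption made here; the slides do not state a rounding rule. The minimum and maximum values in the usage lines are also made up for illustration.

```python
import math


def sturges_classes(n):
    """Number of classes K = 1 + 3.322 * log10(n).

    Assumption: K is rounded up to the next whole number.
    """
    return math.ceil(1 + 3.322 * math.log10(n))


def class_width(minimum, maximum, k):
    """Class width W = (maximum - minimum) / K = range / K."""
    return (maximum - minimum) / k


# n = 80 observations; minimum 10 and maximum 34 are hypothetical values.
k = sturges_classes(80)
w = class_width(10, 34, k)
print(k, w)
```

In practice the computed width is usually rounded to a convenient value so that the class limits are easy to read.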
A categorical distribution cont…

2. Sorting (or tallying) the data into these classes,

3. Counting the number of items in each class, and

4. Displaying the results in the form of a chart or table

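Steps 2 to 4 can be sketched as a short Python routine. The data, the class limits and the function name `tally` are hypothetical. The sketch treats each class as the half-open interval from its lower limit up to (but not including) the next lower limit, so every value falls into at most one class.

```python
def tally(values, lower_limits, width):
    """Count how many values fall in each class [lower, lower + width)."""
    counts = [0] * len(lower_limits)
    for v in values:
        for i, lower in enumerate(lower_limits):
            if lower <= v < lower + width:
                counts[i] += 1
                break  # a value belongs to one class only
    return counts


# Hypothetical whole-number measurements and classes 10-19, 20-29, 30-39, 40-49.
data = [12, 25, 27, 31, 38, 44, 21, 19, 35, 29]
lower_limits = [10, 20, 30, 40]
counts = tally(data, lower_limits, 10)

# Step 4: display the result as a simple table.
for lower, count in zip(lower_limits, counts):
    print(f"{lower}-{lower + 9}: {count}")
```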
A categorical distribution cont…
• The following are some rules that are generally observed:
1. We seldom use fewer than 6 or more than 20 classes, and 15 is generally a good number. The exact number we use in a given situation depends mainly on the number of measurements or observations we have to group.
2. We always make sure that each item (measurement or observation) goes into one and only one class, i.e. classes should be mutually exclusive.
3. Determination of class limits:
• Class limits should be definite and clearly stated. In other words, open-end classes should be avoided since they make it difficult, or even impossible, to calculate certain further descriptions that may be of interest.

Cumulative Frequencies

• When the frequencies of two or more classes are added up, the resulting totals are called cumulative frequencies.

• These frequencies help us find the total number of items whose values are less than or greater than some value.

Cumulative Frequencies cont…
Note:
• In the construction of a cumulative frequency distribution, if we start the cumulation from the lowest value of the variable and move to the highest, the resulting frequency distribution is called a 'less than cumulative frequency distribution'.

• If the cumulation runs from the highest value to the lowest, the resulting frequency distribution is called a 'more than cumulative frequency distribution'.

• The most common cumulative frequency is the less than cumulative frequency.

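As an illustration, both kinds of cumulative frequency can be computed for the age distribution shown earlier in this chapter (1,748; 3,325; 3,149; 1,323; 512; 335). This is a sketch only, and the variable names are hypothetical.

```python
from itertools import accumulate

# Class frequencies from the age table, lowest class first.
freqs = [1748, 3325, 3149, 1323, 512, 335]

# Less than cumulative: accumulate from the lowest class upward.
less_than = list(accumulate(freqs))

# More than cumulative: accumulate from the highest class downward,
# then reverse so the list lines up with the original class order.
more_than = list(accumulate(reversed(freqs)))[::-1]

print(less_than)
print(more_than)
```

The last entry of the less than column and the first entry of the more than column both equal the total number of observations.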
Relative Frequency

• A relative frequency is the fraction of times a value occurs.

• To find the relative frequencies, divide each frequency by the total number of observations in the sample.

• The last entry of the cumulative relative frequency column is one, indicating that one hundred percent of the data has been accumulated.

Cumulative Relative frequency

• Cumulative relative frequency is the accumulation of the previous relative frequencies.

• To find the cumulative relative frequencies, add all the previous relative frequencies to the relative frequency for the current row.

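The following sketch computes relative and cumulative relative frequencies for a small set of hypothetical class frequencies. It divides the running totals by the sample size, which gives the same result as adding up the relative frequencies row by row but avoids accumulating rounding error, so the last entry is exactly one.

```python
from itertools import accumulate

# Hypothetical class frequencies for a sample of 10 observations.
freqs = [2, 4, 3, 1]
total = sum(freqs)

# Relative frequency: each frequency divided by the sample size.
relative = [f / total for f in freqs]

# Cumulative relative frequency: running total divided by the sample size.
cumulative_relative = [c / total for c in accumulate(freqs)]

for f, r, cr in zip(freqs, relative, cumulative_relative):
    print(f, r, cr)
```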
Mid-Point of a class interval and the determination of Class Boundaries

• The mid-point or class mark (Xc) of an interval is the value that lies mid-way between the lower true limit (LTL) and the upper true limit (UTL) of a class. It is calculated as:

• Xc = (LTL + UTL) / 2

True limits (or class boundaries)

• True limits are those limits that are determined mathematically to make the intervals of a continuous variable continuous in both directions, so that no gap exists between classes.

• The true limits are what the tabulated limits would correspond to if one could measure exactly.

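One common way to obtain true limits, sketched below, is to take half of the gap between the upper limit of one class and the lower limit of the next, subtract it from every lower limit and add it to every upper limit. The slides do not spell out this procedure, so treat it as an assumption; the function name is hypothetical, and the sketch assumes at least two classes listed in increasing order with the same gap between them.

```python
def class_boundaries(limits):
    """Convert tabulated (lower, upper) class limits to true limits.

    Assumes at least two classes in increasing order, with the same
    gap between the upper limit of one class and the lower limit of the next.
    """
    gap = limits[1][0] - limits[0][1]  # e.g. 20 - 19 = 1
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]


limits = [(10, 19), (20, 29), (30, 39)]
boundaries = class_boundaries(limits)
print(boundaries)
```

With these boundaries, the upper true limit of each class equals the lower true limit of the next, so no gap remains between classes.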
Example:
• Frequency distribution of weights (in ounces) of malignant tumors removed from the abdomen of 57 subjects

Weight (oz)  Class boundary  Xc    Frequency  Cumulative frequency  Relative frequency
10-19        9.5-19.5        14.5  5          5                     0.0877
20-29        19.5-29.5       24.5  19         24                    0.3333
30-39        29.5-39.5       34.5  10         34                    0.1754
40-49        39.5-49.5       44.5  13         47                    0.2281
50-59        49.5-59.5       54.5  4          51                    0.0702
60-69        59.5-69.5       64.5  4          55                    0.0702
70-79        69.5-79.5       74.5  2          57                    0.0351
Total                              57                               1.0000

Note:
• The width of a class is found from the true class limits by subtracting the lower true limit from the upper true limit of that class.

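The derived columns of the tumor-weight table can be recomputed from its class boundaries and frequencies. The sketch below is one way to do so; the variable names are hypothetical, and rounding the relative frequencies to four decimal places is an assumption based on the precision shown in the table.

```python
from itertools import accumulate

# Class boundaries (true limits) and frequencies from the table.
boundaries = [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5), (39.5, 49.5),
              (49.5, 59.5), (59.5, 69.5), (69.5, 79.5)]
freqs = [5, 19, 10, 13, 4, 4, 2]

total = sum(freqs)

# Class mark: midway between the lower and upper true limits.
midpoints = [(ltl + utl) / 2 for ltl, utl in boundaries]

# Class width: upper true limit minus lower true limit.
widths = [utl - ltl for ltl, utl in boundaries]

# Less than cumulative frequency.
cumulative = list(accumulate(freqs))

# Relative frequency, rounded to four decimal places (assumed precision).
relative = [round(f / total, 4) for f in freqs]

for row in zip(boundaries, midpoints, freqs, cumulative, relative):
    print(*row)
```

Every class in this table has the same width of 10 ounces, which can be confirmed from the `widths` list.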
Example 2:
• Construct a grouped frequency distribution of the following data on the amount of time (in hours) that 80 college students devoted to leisure activities during a typical school week:
