DATA ORGANIZATION & PRESENTATION
 The raw material of statistics is called data.
 We may define data as figures that result from the process of counting or from taking a measurement.
 For example:
 A hospital administrator counts the number of patients (counting).
 A nurse weighs a patient (measurement).
Data
 To collect data from a normal population in order to set benchmarks or standards.
 To collect data from a sick population in order to describe the disease or vital events.
 To compare the characteristics of normal populations in various localities, countries and regions.
 To compare the normal with the abnormal.
OBJECTIVES OF DATA COLLECTION:
Primary:
Examples:
 Observations
 Questionnaire
 Interviews
 Survey
Secondary:
Examples:
 Census
 Medical records
Ungrouped:
Data are presented or observed individually.
Example: list of weights (in pounds) for six men: 140, 150, 160, 150, 150, 160.
Grouped:
Data are presented in various classes or categories.
Classification of Data
 Observation.
 Face to face interview
 Telephone interview
 E-mail interview
 Focus group discussion
 Written Questionnaire
 Existing records
METHODS OF DATA COLLECTION
1. Tabulation.
2. Diagrams
3. Graphs
Data can be presented in 3 forms:
 A table is a systematic arrangement of data into columns and rows. The process of arranging data into columns and rows is called tabulation.
 Tables can be simple, 2 x 2, or complex.
1-Tabulation
◦ Tables should be numbered.
◦ Each table must be given a title, which must be brief and self-explanatory.
◦ Headings of the columns and rows should be clear and concise.
◦ Data should be presented according to size or importance, or chronologically, alphabetically or geographically.
◦ If percentages or averages are to be compared, they should be placed as close together as possible.
◦ No table should be too large.
◦ Footnotes should be used where additional information is to be provided.
Principles to be followed while designing tables
Year    Population
1991    115 million
1995    122 million
1998    130 million
2002    145 million (estimated)
*Census of Pakistan 1998
Table 1: Population of Pakistan
 Data are first split into class intervals, and the number of items (frequency) occurring in each interval is shown in the adjacent table.
 Guidelines for class intervals:
 The number of classes should be small enough to provide an effective summary but large enough to display the relevant characteristics of the data. Usually the number of classes should be between 5 and 20.
 Each piece of data must belong to exactly one class.
 All classes should have the same width.
Frequency Distribution Table
 Classes: Categories for grouping data
 Frequency: The number of pieces of data in a class
 Frequency distribution: A listing of classes and their
frequencies
 Relative frequency (rf): The ratio of the frequency of the class to the total number of pieces of data:
rf = f / Σf
Terms used in Data grouping
 Relative frequency distribution: A listing of classes
and their relative frequencies
 Lower class limit: The smallest value that can go into a
class
 Upper class limit: The largest value that can go into a
class
 Class mark/Mid point: The midpoint of a class
 Class width: The difference between the lower class
limit of the given class and lower class limit of the next
higher class
Terms used (continued)
 It shows at a glance how many individual observations are in a
group and where the main concentration lies.
 It also shows the range, and the shape of distribution.
 These tables can also be extended to relative frequency distribution tables and cumulative frequency distribution tables.
 A cumulative frequency is obtained by summing the frequencies of all classes representing values less than a specified class limit. Cumulative relative frequency is expressed as a percentage.
Advantages of Frequency Distribution Tables:
210 209 212 208
217 207 210 203
208 210 210 199 (L)
215 221 (H) 213 218
202 218 200 214
Cholesterol levels of the 20 patients ((L) = lowest value, (H) = highest value)
Cholesterol level   Tally       Frequency
195-199             |           1
200-204             |||         3
205-209             ||||        4
210-214             ||||| ||    7
215-219             ||||        4
220-224             |           1
Total number of observations    20
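Class (level)   Frequency   Relative frequency   Class mark
195-199         1           0.05
200-204         3           0.15
205-209         4
210-214         7           0.35                 212
215-219         4
220-224         1
Total           20
Relative frequency = frequency of the class / total frequency, e.g. 7/20 = 0.35
Class mark = (L + U)/2, where L and U are the lower and upper class limits, e.g. (210 + 214)/2 = 212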
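The grouping above can be reproduced programmatically. The following is a minimal sketch, assuming the 20 cholesterol values listed earlier, a first lower class limit of 195 and a class width of 5; the function name `frequency_table` is illustrative and not part of the original slides.

```python
from collections import Counter

# Cholesterol levels of the 20 patients, as listed on the earlier slide.
levels = [210, 209, 212, 208, 217, 207, 210, 203, 208, 210,
          210, 199, 215, 221, 213, 218, 202, 218, 200, 214]

LOWEST_LIMIT = 195  # lower class limit of the first class (from the table)
WIDTH = 5           # class width: lower limit to the next lower limit


def frequency_table(values, lowest_limit, width):
    """Return rows of (lower, upper, frequency, relative frequency, class mark).

    Assumes whole-number data that is never below lowest_limit.
    """
    counts = Counter((v - lowest_limit) // width for v in values)
    total = len(values)
    rows = []
    for k in range(max(counts) + 1):
        lower = lowest_limit + k * width
        upper = lower + width - 1           # e.g. 195-199 for whole-number data
        f = counts.get(k, 0)
        rows.append((lower, upper, f, f / total, (lower + upper) / 2))
    return rows


table = frequency_table(levels, LOWEST_LIMIT, WIDTH)
for lower, upper, f, rf, mark in table:
    print(f"{lower}-{upper}  f={f}  rf={rf:.2f}  class mark={mark:g}")
```

Each row carries the relative frequency rf = f / Σf and the class mark (L + U)/2 for its class.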
Total blood cholesterol   Frequency   Relative        Cumulative   Cumulative relative
level (mg/dl)                         frequency (%)   frequency    frequency (%)
100-119                   2           0.95            2            0.95
120-139                   2           0.95            4            1.9
140-159                   6           2.9             10           4.8
160-179                   33          15.8            43           20.6
180-199                   36          17.2            79           37.8
200-219                   40          19.1            119          56.9
220-239                   29          13.9            148          70.8
240-259                   27          12.9            175          83.7
260-279                   13          6.2             188          89.9
280-299                   9           4.3             197          94.2
300-319                   11          5.3             208          99.5
320-339                   0           0               208          99.5
340-359                   0           0               208          99.5
360-379                   1           0.5             209          100.0
Total                     209         100.0
Cumulative frequency distribution of total blood cholesterol levels
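The cumulative columns of this table are running totals of the frequency column. A short sketch, using only the class frequencies from the table above (variable names are illustrative):

```python
from itertools import accumulate

# Class frequencies from the table, for classes 100-119 up to 360-379.
frequencies = [2, 2, 6, 33, 36, 40, 29, 27, 13, 9, 11, 0, 0, 1]

total = sum(frequencies)
cumulative = list(accumulate(frequencies))               # running totals
cumulative_pct = [100 * c / total for c in cumulative]   # as percentages

for f, c, p in zip(frequencies, cumulative, cumulative_pct):
    print(f"{f:3d}  {c:3d}  {p:5.1f}%")
```

The last cumulative frequency always equals the total number of observations, and the last cumulative relative frequency is always 100%.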
 The simplest way to present nominal data (or ordinal data, if there are not too many points on the scale) is tabulation: list the categories in one column of the table and the number or percentage of observations in another column.
Method of delivery    No. of births   Percentage
Normal                478             79.7
Forceps               65              10.8
Caesarean section     57              9.5
Total                 600             100
Table: Method of Delivery of 600 Babies Born in Hospital
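The percentage column is each count divided by the total, multiplied by 100. A minimal sketch using the counts from the table (the dictionary name `births` is illustrative):

```python
# Counts from the table "Method of Delivery of 600 Babies Born in Hospital".
births = {"Normal": 478, "Forceps": 65, "Caesarean section": 57}

total = sum(births.values())
percentages = {method: round(100 * n / total, 1) for method, n in births.items()}

for method, pct in percentages.items():
    print(f"{method:<18} {births[method]:>4} {pct:>6.1f}")
```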
Contingency Tables:
 Data obtained by observing the values of two variables are called bivariate data. They can be grouped using tables called contingency tables.
 In its general form, the ‘r’ by ‘c’ contingency table contains counts of observations arranged in r rows and c columns, representing the various levels of two discrete variables. When each variable has two levels, such as diseased/non-diseased and exposed/non-exposed, there are two rows and two columns and the table is referred to as a 2x2 table.
Example:
Bivariate data on age (in years) and sex were obtained from the students attending the medical class.
Age  Sex    Age  Sex    Age  Sex
21   M      29   F      22   M
20   M      20   M      23   M
32   F      18   F      19   F
21   M      21   M      21   M
19   F      26   M      21   F
Contingency table of sex by age group:
          Under 21   21-25   Over 25   Total
Male      2          6       1         9
Female    3          1       2         6
Total     5          7       3         15
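The contingency table can be derived from the 15 raw (age, sex) pairs by counting each combination of sex and age group. A sketch, assuming the age groups Under 21, 21-25 and Over 25 used in the table; the helper `age_group` is illustrative.

```python
from collections import Counter

# The 15 (age, sex) observations from the example, read row by row.
students = [(21, "M"), (29, "F"), (22, "M"), (20, "M"), (20, "M"), (23, "M"),
            (32, "F"), (18, "F"), (19, "F"), (21, "M"), (21, "M"), (21, "M"),
            (19, "F"), (26, "M"), (21, "F")]

GROUPS = ["Under 21", "21-25", "Over 25"]


def age_group(age):
    """Map an age in years to the age group used in the table."""
    if age < 21:
        return "Under 21"
    if age <= 25:
        return "21-25"
    return "Over 25"


# Count each (sex, age group) cell of the table.
cells = Counter((sex, age_group(age)) for age, sex in students)

for sex, label in (("M", "Male"), ("F", "Female")):
    row = [cells[(sex, g)] for g in GROUPS]
    print(label, row, sum(row))

column_totals = [cells[("M", g)] + cells[("F", g)] for g in GROUPS]
print("Total", column_totals, sum(column_totals))
```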
Graphical Representation
 Many people have no taste for figures and prefer a form of presentation in which figures are avoided. This is achieved by representing statistical data visually, that is, by graphical representation.
 Graphical representations can be divided into graphs and diagrams.
 The basic difference is that a graph represents data by a continuous curve, while a diagram is any other visual form.
Graphical Representation
Advantages:
 Powerful impact on the imagination of people.
 Popular method in newspapers and magazines.
 Better retained in the memory than statistical tables.
 Data must be simple.
 Comparison is easier with diagrams.
Disadvantage:
 Much of the detail of the original data may be lost in charts and diagrams.
Advantages & Disadvantages:
They are divided into:
◦ Simple Bar Chart.
◦ Multiple bar chart
◦ Histogram
◦ Pie diagram
◦ Pictogram
DIAGRAMS OR CHARTS
 A popular medium for presenting statistical data, usually nominal or ordinal data.
 Easy to prepare, and they enable values to be compared visually.
 Counts or percentages of the characteristics of interest are shown as bars.
 The length of each bar is proportional to the magnitude it represents.
 The bars do not abut each other.
BAR CHARTS:
Simple Bar Chart:
[Figure: Simple bar chart showing the mean depth of water sources by district. Y-axis: depth in feet (0-90). Bar values: 72.3, 83.6, 34, 65.5, 45.8, 30.2, 57.3]
[Figure: Multiple bar chart. Y-axis: % (0-90). Category labels: Punjab, NWFP, Baluchistan, Sindh. Legend: ARI, Diarrhoea, Measles]
Multiple Bar Charts:
Two or more bars can be grouped together.
Component bar chart:
 The bars may be divided into two or more parts, each part representing a certain item and proportional to the magnitude of that item.
 Example: Component bar chart showing household water sources by district.
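[Figure: Component bar chart of household water sources by district. Y-axis: percentages (0%-100%). Legend: Hand & Motorized Pumps, Community Supply, Others. Values as extracted, three per bar: 78.4%, 0.0%, 21.6%; 52.6%, 43.6%, 3.8%; 99.9%, 0.1%, 0.0%; 89.8%, 10.1%, 0.1%; 85.3%, 5.0%, 9.8%; 100.0%, 0.0%, 0.0%; 84.5%, 5.6%, 9.9%]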
 A histogram consists of a set of adjacent bars.
 The width of each bar corresponds to a class interval along the horizontal X-axis (abscissa), and the frequency or percentage of observations in each class is shown on the vertical Y-axis (ordinate).
 With equal class widths, the height of each bar is equal to the frequency of the class it represents.
 It displays the distribution of a quantitative continuous variable.
 Only one set of data is shown.
 If the class intervals are not of equal width, the bar heights must be adjusted so that the area of each bar remains proportional to its frequency and direct comparisons can be made.
HISTOGRAM
 To compare the distributions of two data sets visually, it is better to use relative frequency histograms than frequency histograms, because all relative frequency histograms share the same vertical scale (i.e. 0%-100%).
 In public health, data on cases, deaths, males and females are usually shown by histograms.
 The classes are depicted on the horizontal axis of a histogram using class boundaries.
 For data recorded in whole units, obtain the class boundaries by subtracting 0.5 from the lower class limit and adding 0.5 to the upper class limit.
Example:
Class limits   Class boundaries
50-59          49.5-59.5
60-69          59.5-69.5
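The conversion from class limits to class boundaries can be written as a one-line rule. A minimal sketch; the function name and the `unit` parameter (the recording precision of the data, assumed to be 1) are illustrative and not from the slides.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Return (lower boundary, upper boundary) for a class.

    `unit` is the precision to which the data were recorded; half a unit
    is subtracted from the lower limit and added to the upper limit.
    """
    half = unit / 2
    return lower_limit - half, upper_limit + half


for limits in [(50, 59), (60, 69)]:
    print(limits, "->", class_boundaries(*limits))
```

Because the upper boundary of one class equals the lower boundary of the next, the bars of the histogram touch without gaps.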
Data Organizarion and presentation (1).pptx
 The areas of the segments of the circle are compared.
 Suitable for expressing proportions or percentages.
 The total area of the pie is 100%.
 The total angle of the circle is 360°.
 The area of each segment depends on its angle.
 To find the angle of a segment, the formula is:
X/360 = P/T
where X is the angle, P is the part and T is the total, so that
X = (P/T) × 360
PIE CHART
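As a worked illustration of X = (P/T) × 360, the sketch below computes the segment angles for the method-of-delivery counts given in the earlier table. Reusing that table here is a choice made for illustration; the slides do not say which data the pie chart showed.

```python
def pie_angles(parts):
    """Return the angle in degrees of each segment: X = (P / T) * 360."""
    total = sum(parts.values())
    return {label: 360 * part / total for label, part in parts.items()}


# Counts from the table "Method of Delivery of 600 Babies Born in Hospital".
births = {"Normal": 478, "Forceps": 65, "Caesarean section": 57}
angles = pie_angles(births)

for label, angle in angles.items():
    print(f"{label:<18} {angle:6.1f} degrees")
```

The angles always add up to 360 degrees, whatever the counts.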
 A popular method of presenting data to lay persons who may find other charts hard to understand.
 It is a form of bar chart.
 Small pictures or symbols are used to present the data.
Example:
A picture of a doctor to represent the population per physician.
PICTOGRAM
 It shows the relationship between two variables.
 The scatter diagram plots each pair of bivariate observations (x, y) at the point where the vertical line through the x value on the abscissa meets the horizontal line through the y value on the ordinate.
 If the dots cluster round a straight line, this is evidence of a linear relationship. If there is no such cluster, it is probable that there is no linear relationship between the variables.
 The straight line drawn through the dots is called the regression line, or sometimes the line of best fit.
Scatter Diagram:
Frequency Polygon (Line graphs):
 It is a graphical form of a frequency
distribution.
 It is constructed by plotting the frequencies against the class midpoints and connecting the points with straight lines. It can also be constructed by joining the midpoints of the tops of successive rectangles in the histogram with straight lines.
GRAPHS
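The points of a frequency polygon can be listed before any plotting is done. The sketch below uses the classes and frequencies of the 20-patient cholesterol table from earlier in the deck; closing the polygon on the horizontal axis at the midpoints of the neighbouring empty classes is a common convention and is assumed here.

```python
# Classes and frequencies from the 20-patient cholesterol table.
class_limits = [(195, 199), (200, 204), (205, 209),
                (210, 214), (215, 219), (220, 224)]
frequencies = [1, 3, 4, 7, 4, 1]

# Class mark (midpoint) of each class: (L + U) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Class width: lower limit of a class to the lower limit of the next.
width = class_limits[1][0] - class_limits[0][0]

# Start and end on the axis, one class width beyond the outermost midpoints.
points = ([(midpoints[0] - width, 0)]
          + list(zip(midpoints, frequencies))
          + [(midpoints[-1] + width, 0)])

for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the polygon.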
 A relative frequency (or percentage) polygon can be constructed similarly.
 Polygons are used to compare two distributions on the same graph.
 The end points of the resulting line are joined to the horizontal axis at the class midpoints immediately below the lowest and above the highest non-zero frequencies.
 Frequency polygons may take on a number of different shapes:
 Symmetrical (bell-shaped curve)
 Here there are high frequencies in the centre of the distribution and low frequencies at the two extremes, which are called the upper and lower tails of the distribution. An example is height.
 A frequency distribution curve is said to be skewed when it departs from symmetry. Here the frequencies tend to pile up at one end or the other of the distribution.
 Positively skewed distribution: The upper tail of the
distribution is longer than the lower tail.
 Negatively skewed distribution: The lower tail of the
distribution is longer than the upper tail.
Skewed Distributions:
 All three distributions are unimodal, that is,
they have just one peak.
 Sometimes there is a bimodal frequency
distribution. This is occasionally seen and
usually indicates that the data are a mixture of
two separate distributions e.g., hormone levels
of males and females.
 The horizontal scale is the same as that used for a histogram.
 The vertical scale indicates cumulative frequency or cumulative relative frequency.
 To construct the ogive, we place a point at the upper class boundary of each class interval.
 Each point represents the cumulative frequency for that class. Note that all the data of a class interval have been accumulated only once its upper class boundary has been reached. The ogive is completed by connecting the points.
 Useful in comparing two sets of data.
Cumulative Frequency Polygon (Ogive)
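The points of an ogive pair each upper class boundary with the cumulative frequency reached at that boundary. A sketch using the 20-patient cholesterol classes from earlier in the deck; starting the curve at zero on the lower boundary of the first class is a common convention assumed here, not something the slide states.

```python
from itertools import accumulate

# Classes and frequencies from the 20-patient cholesterol table.
class_limits = [(195, 199), (200, 204), (205, 209),
                (210, 214), (215, 219), (220, 224)]
frequencies = [1, 3, 4, 7, 4, 1]

# Upper class boundary = upper class limit + 0.5 for whole-number data.
upper_boundaries = [upper + 0.5 for _, upper in class_limits]
cumulative = list(accumulate(frequencies))

# Assumed starting point: zero at the lower boundary of the first class.
start = (class_limits[0][0] - 0.5, 0)
ogive_points = [start] + list(zip(upper_boundaries, cumulative))

for boundary, cum_freq in ogive_points:
    print(boundary, cum_freq)
```

Dividing each cumulative frequency by the total (20 here) gives the cumulative relative frequency version of the same curve.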
 Is used to describe a large continuous data set.
 Is based on the five-number summary and provides a graphical display of the centre and variation of a data set. The five-number summary consists of, in increasing order: Min, Q1, Q2, Q3, Max.
 Is suited for comparing two or more data sets.
BOX PLOT (BOX AND WHISKER DIAGRAM)
1. Determine the quartiles of the data.
2. Determine the minimum and maximum of the data.
3. Draw a horizontal axis on which the values obtained in steps 1 and 2 can be located. Above this axis, mark the quartiles, the minimum and the maximum with vertical lines.
4. Connect the quartiles to each other to make a box, and then connect the box to the minimum and maximum with lines.
Steps of constructing a box plot
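Steps 1 and 2 amount to computing the five-number summary. The sketch below does this for the 20 cholesterol values from earlier in the deck using Python's standard `statistics.quantiles`. Several quartile conventions exist and the slides do not say which one is intended; the "inclusive" method is an assumption here, and other methods can give slightly different Q1 and Q3.

```python
import statistics

# Cholesterol levels of the 20 patients, as listed on the earlier slide.
levels = [210, 209, 212, 208, 217, 207, 210, 203, 208, 210,
          210, 199, 215, 221, 213, 218, 202, 218, 200, 214]


def five_number_summary(values):
    """Return (Min, Q1, Q2, Q3, Max) using the 'inclusive' quartile method."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


summary = five_number_summary(levels)
print("Min, Q1, Q2, Q3, Max =", summary)
```

These five values are what steps 3 and 4 then mark on the axis and join into the box and whiskers.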
 Outlier: For a set of numerical data, any value that is markedly smaller or larger than the other values is called an outlier.
 A value is also commonly considered an outlier if it lies more than 1.5 times the interquartile range (IQR = Q3 - Q1) below Q1 or above Q3.
 An outlier requires special attention: it may be the result of a measurement or recording error, a member of a different population from the rest of the sample, or simply an unusual extreme value.
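A common way to apply the 1.5 × IQR rule is to place fences at Q1 − 1.5 × IQR and Q3 + 1.5 × IQR and flag any value outside them. The sketch below uses that convention with the "inclusive" quartile method (both are assumptions, as the slides do not fix either) on the 20 cholesterol values from earlier in the deck. The added value 260 is hypothetical and appears only to show a flagged case.

```python
import statistics

# Cholesterol levels of the 20 patients, as listed on the earlier slide.
levels = [210, 209, 212, 208, 217, 207, 210, 203, 208, 210,
          210, 199, 215, 221, 213, 218, 202, 218, 200, 214]


def iqr_outliers(values, k=1.5):
    """Return (outliers, (lower fence, upper fence)) for the k * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high], (low, high)


outliers, fences = iqr_outliers(levels)
print(outliers, fences)

# Hypothetical extra reading of 260, added only for illustration.
print(iqr_outliers(levels + [260]))
```

In the original 20 values, the lowest (199) and highest (221) readings are extreme values but fall inside the fences, so neither is flagged.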
 Note that an extreme value need not be an outlier; it may instead be an indication of skewness.
 When an outlier is found, its cause should be determined. If it is due to a measurement or recording error, or if for some other reason it clearly does not belong to the data set, it should be removed.