
INTRODUCTION TO BIOSTATISTICS

Dr. Mohd Vaseem Ismail


Assistant Professor
SPER, Jamia Hamdard
This session covers:
• Origin and development of Statistics & Biostatistics
• Definition of Statistics and Biostatistics
• Reasons to know about Biostatistics
• Applications & Uses of Biostatistics
• Types of data
• Frequency distribution of data
• Graphic & Diagrammatic representation of data
Definition of Statistics
• “Statistics may be defined as the science which deals with the collection, organization, presentation, analysis and interpretation of numerical data”
--Croxton and Cowden
• “Statistics is the science which deals with collection, classification and tabulation of numerical facts as the basis for explanation, description and comparison of phenomena”
--Lovitt
Origin and development of Statistics
• Sir Ronald A. Fisher (1890-1962), known as the “Father of Statistics”, placed statistics on a very sound footing by applying it to diversified fields such as genetics, biometry, education and agriculture.
• He introduced the concepts of point estimation, fiducial inference, exact sampling distributions, analysis of variance and design of experiments.
• His contributions won statistics a very respectable position among the sciences.
Origin and development of Statistics in Medical Research
• In 1929, a lengthy paper by Dunn on the application of statistics was published in a physiology journal.
• In 1937, 15 articles on statistical methods by Austin Bradford Hill were published in book form.
• In 1948, an RCT of streptomycin for pulmonary tuberculosis was published, in which Bradford Hill had a key influence.
• From 1952 to 1982, the use of statistics in medicine grew 8-fold.
[Portraits: C.R. Rao, Douglas Altman, Ronald Fisher, Karl Pearson, Gauss]
Definition of Biostatistics
• Sir Francis Galton (1822-1911) is known as the “Father of Biometry”. His contribution to biology was the application of statistical methods to the analysis of biological variation, correlation and regression.
• Biostatistics has been defined as the application of statistical methods to biological sciences.
• It comprises the statistical methods used in medicine, biology and public health for planning and conducting investigations in these fields and for analyzing the data that arise from them.
Reasons to know about Biostatistics
• Medicine is becoming increasingly quantitative.
• The planning, conduct and interpretation of much of medical research are becoming increasingly reliant on statistical methodology.
• Statistics pervades the medical literature.
Applications & Uses of Biostatistics: In Medicine
• To identify signs & symptoms of a disease or syndrome
• To find out the association between two attributes such as cancer and smoking
• To compare the efficacy of a particular drug
Pharmacology
• To compare the action of two drugs or two successive dosages of the same drug
• To find out the action of a drug: a drug is given to animals and humans to observe whether the changes produced are due to the drug or to chance
• To find out the relative potency of a new drug with respect to a standard drug
Physiology and Anatomy
• To define what is normal or healthy in a population and to find out the limits of normality in variables
• To find the difference between the means and proportions of normal values at two places or in different periods
• To find the correlation between two variables such as height and weight
CLINICAL MEDICINE
• In the documentation of medical history of diseases.
• In the planning and conduct of clinical studies.
• In evaluating the merits of different procedures.
• In providing methods for definition of “normal” and “abnormal”.
PREVENTIVE MEDICINE
• To provide the magnitude of any health problem in the community.
• To find out the basic factors underlying ill-health.
• To evaluate the health programs which were introduced in the community (success/failure).
• To introduce and promote health legislation.
• To test the usefulness of sera and vaccines in the field.
WHAT DOES STATISTICS COVER?
Planning
Design
Data collection
Data Processing
Data analysis
Presentation
Interpretation
Publication
HOW A “BIOSTATISTICIAN” CAN HELP?
• Design of study
• Sample size & power calculations
• Selection of sample and controls
• Designing a questionnaire
• Data Management
• Choice of descriptive statistics & graphs
• Application of univariate and multivariate statistical analysis techniques
INVESTIGATION
Data Collection
• Data Presentation: Tabulation, Diagrams, Graphs
• Descriptive Statistics: Measures of Location, Measures of Dispersion, Measures of Skewness & Kurtosis
• Inferential Statistics: Estimation (Point estimate, Interval estimate), Hypothesis Testing, Univariate analysis, Multivariate analysis
WHAT IS DATA?
• A collection of observations expressed in numerical figures.
• “Data” is always used in the collective sense and never in the singular.
• Collection may be done in two ways:
  • By complete enumeration
  • By the sample survey method
TYPES OF DATA
• Primary data: Collected for the first time and original in character.
• Secondary data: Already collected by someone for some purpose and available for the present study.
• For example: Medical records are primary data for the hospital but secondary data for anyone else.
TYPES OF DATA cont…
• QUALITATIVE DATA
• DISCRETE QUANTITATIVE
• CONTINUOUS QUANTITATIVE
QUALITATIVE
Nominal
Example: Sex (M, F)
Exam result (P, F)
Blood group (A, B, O or AB)
Color of eyes (blue, green, brown, black)
Ordinal
Example: Response to treatment (poor, fair, good)
Severity of disease (mild, moderate, severe)
Income status (low, middle, high)
QUANTITATIVE (DISCRETE)
Example: The no. of family members
The no. of heart beats
The no. of admissions in a day

QUANTITATIVE (CONTINUOUS)
Example: Height, Weight, Age, BP, Serum Cholesterol and BMI
Discrete data -- Gaps between possible values
Example: Number of children
Continuous data -- Theoretically, no gaps between possible values
Example: Hb
CONTINUOUS DATA grouped into DISCRETE DATA
Wt. (in kg): underweight, normal & overweight
Ht. (in cm): short, medium & tall
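As a rough illustration of turning a continuous measurement into discrete categories, the sketch below bins heights into short, medium and tall. The cut-off values and the sample heights are hypothetical and chosen only for the example; the slides do not give them.

```python
# Hypothetical cut-offs (not from the slides), used only to illustrate grouping.
HEIGHT_CUTOFFS_CM = (160.0, 175.0)  # below 160: short, 160 to <175: medium, 175+: tall


def height_category(height_cm: float) -> str:
    """Map a continuous height in cm to an ordered category."""
    low, high = HEIGHT_CUTOFFS_CM
    if height_cm < low:
        return "short"
    if height_cm < high:
        return "medium"
    return "tall"


# Made-up sample heights in cm.
heights_cm = [152.4, 168.0, 181.3, 160.0, 174.9]
categories = [height_category(h) for h in heights_cm]
print(categories)
```

Grouping like this discards information: two people 14 cm apart can land in the same category.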
Scale of measurement
Qualitative variable:
A categorical variable
Nominal (classificatory) scale
- gender, marital status, race
Ordinal (ranking) scale
- severity scale, good/better/best
Scale of measurement
Quantitative variable:
A numerical variable: discrete; continuous
Interval scale:
Data are placed in meaningful intervals and order. The units of measurement are arbitrary.
- Temperature: the differences 37º C - 36º C and 38º C - 37º C are equal
- No implication of ratio (30º C is not twice as hot as 15º C)
Ratio scale:
Data are presented in a frequency distribution in logical order. A meaningful ratio exists because the scale has a true zero.
- Age, weight, height, pulse rate
- A pulse rate of 120 is twice as fast as 60
- A person weighing 80 kg is twice as heavy as one weighing 40 kg
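A small sketch of why ratios are meaningful on a ratio scale but not on an interval scale, using the temperature, pulse and weight examples above. The Kelvin conversion is an addition for illustration only: Kelvin has a true zero, so it shows what the "real" ratio of the two temperatures is.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius (arbitrary zero) to Kelvin (true zero)."""
    return celsius + 273.15


# Interval scale: equal differences are meaningful.
diff_a = 37 - 36
diff_b = 38 - 37

# The ratio of Celsius readings looks like 2, but the zero is arbitrary.
naive_ratio = 30 / 15
# On a scale with a true zero the same temperatures differ by only about 5%.
kelvin_ratio = celsius_to_kelvin(30) / celsius_to_kelvin(15)

# Ratio scale: these ratios are meaningful as they stand.
pulse_ratio = 120 / 60
weight_ratio = 80 / 40

print(diff_a == diff_b, naive_ratio, round(kelvin_ratio, 3), pulse_ratio, weight_ratio)
```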
Scales of Measure
• Nominal - qualitative classification of equal value: gender, race, color, city
• Ordinal - qualitative classification which can be rank ordered: socioeconomic status of families
• Interval - numerical or quantitative data which can be rank ordered and whose sizes can be compared: temperature
• Ratio - quantitative interval data along with a meaningful ratio: time, age
Frequency Distribution
• A frequency distribution is a statistical table which shows the values of a variable arranged in order of magnitude, either individually or in groups, with the corresponding frequencies side by side.
Types of Frequency Distribution
• There are two types of frequency distribution:
  1) Simple or Discrete Frequency Distribution
  2) Grouped or Continuous Frequency Distribution
• A discrete (or simple) frequency distribution shows the values of the variable individually.
Table: Simple frequency distribution

Age of Patients (in years)    No. of Patients
10                            2
20                            7
30                            15
40                            8
50                            5
Total                         37

• A continuous (or grouped) frequency distribution shows the values of the variable in groups or intervals.

Table: Continuous frequency distribution

Age of Patients (in years)    No. of Patients
10-20                         10
20-30                         15
30-40                         25
40-50                         12
50-60                         7
Total                         69
Tabulate the hemoglobin values of 30 adult male patients listed below

Patient No  Hb (g/dl)    Patient No  Hb (g/dl)    Patient No  Hb (g/dl)
1           12.0         11          11.2         21          14.9
2           11.9         12          13.6         22          12.2
3           11.5         13          10.8         23          12.2
4           14.2         14          12.3         24          11.4
5           12.3         15          12.3         25          10.7
6           13.0         16          15.7         26          12.5
7           10.5         17          12.6         27          11.8
8           12.8         18          9.1          28          15.1
9           13.2         19          12.9         29          13.4
10          11.2         20          14.6         30          13.1
Steps for making a table
Step 1: Find the minimum (9.1) & maximum (15.7)
Step 2: Calculate the difference: 15.7 - 9.1 = 6.6
Step 3: Decide the number and width of the classes (7 class intervals): 9-10, 10-11, ...
Step 4: Prepare a dummy table with columns Hb (g/dl), Tally marks, No. of patients
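Steps 1 to 3 can be sketched in a few lines for the 30 Hb values listed earlier. The class width of 1 g/dl matches the classes named in Step 3; starting the first class at the whole number below the minimum is an assumption made here to reproduce them.

```python
import math

# Hb values (g/dl) of the 30 patients, in patient order.
hb = [12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
      11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
      14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1]

# Step 1: minimum and maximum.
lowest, highest = min(hb), max(hb)

# Step 2: range (rounded to avoid floating-point noise).
value_range = round(highest - lowest, 1)

# Step 3: classes of width 1 g/dl covering the whole range (assumed start: floor of minimum).
class_width = 1
first_lower = math.floor(lowest)
last_upper = math.floor(highest) + 1
classes = [(lo, lo + class_width) for lo in range(first_lower, last_upper, class_width)]

print(lowest, highest, value_range)
print(classes)
```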
DUMMY TABLE and TALLY MARKS TABLE
(The dummy table has the same rows with the last two columns left blank; the tally marks table fills them in.)

Hb (g/dl)    Tally marks    No. of patients
9 - 10       l              1
10 - 11      lll            3
11 - 12      llll l         6
12 - 13      llll llll      10
13 - 14      llll           5
14 - 15      lll            3
15 - 16      ll             2
Total        -              30
Table: Frequency distribution of 30 adult male patients by Hb

Hb (g/dl)    No. of patients
9 - 10       1
10 - 11      3
11 - 12      6
12 - 13      10
13 - 14      5
14 - 15      3
15 - 16      2
Total        30
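The tallying itself can be sketched as a count per class for the 30 Hb values. The sketch assumes each class includes its lower limit and excludes its upper limit (so 12.0 falls in 12 - 13); the slides do not state the convention, but this choice reproduces the frequencies in the table.

```python
from collections import Counter

# Hb values (g/dl) of the 30 patients, in patient order.
hb = [12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
      11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
      14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1]

# With 1 g/dl classes, the whole-number part of a value is its class's lower limit.
counts = Counter(int(value) for value in hb)

for lower in range(9, 16):
    n = counts[lower]
    print(f"{lower:2d} - {lower + 1:2d}  {'|' * n:<10}  {n}")
print("Total", sum(counts.values()))
```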
Table: Frequency distribution of adult patients by Hb and gender

Hb (g/dl)    Male    Female    Total
<9.0         0       2         2
9 - 10       1       3         4
10 - 11      3       5         8
11 - 12      6       8         14
12 - 13      10      6         16
13 - 14      5       4         9
14 - 15      3       2         5
15 - 16      2       0         2
Total        30      30        60
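In a two-way table like this one, the row and column totals are sums over the cells. A minimal sketch using the counts from the table (the variable names are illustrative):

```python
hb_classes = ["<9.0", "9-10", "10-11", "11-12", "12-13", "13-14", "14-15", "15-16"]
male = [0, 1, 3, 6, 10, 5, 3, 2]
female = [2, 3, 5, 8, 6, 4, 2, 0]

# Row totals: both genders combined, per Hb class.
row_totals = [m + f for m, f in zip(male, female)]

# Column totals: all Hb classes combined, per gender.
male_total, female_total = sum(male), sum(female)

# The grand total must agree whichever way it is summed.
grand_total = sum(row_totals)

for label, m, f, t in zip(hb_classes, male, female, row_totals):
    print(f"{label:<6} {m:>3} {f:>3} {t:>3}")
print(f"{'Total':<6} {male_total:>3} {female_total:>3} {grand_total:>3}")
```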
Elements of a Table
An ideal table should have: Number, Title, Column headings, Foot-notes

Number - Table number for identification in a report
Title, place - Describe the body of the table, variables, time period (what, how classified, where and when)
Column headings - Variable name, No., percentages (%), etc.
Distribution of 120 (Madras) Corporation divisions according to annual death rate based on registered deaths in 1975 and 1976

Death rate (/1000 per annum)    No. of divisions
7.0 - 7.9                       4 (3.3)
8.0 - 8.9                       13 (10.8)
9.0 - 9.9                       20 (16.7)
10.0 - 10.9                     27 (22.5)
11.0 - 11.9                     18 (15.0)
12.0 - 12.9                     11 (9.2)
13.0 - 13.9                     11 (9.2)
14.0 - 14.9                     6 (5.0)
15.0 - 15.9                     2 (1.7)
16.0 - 16.9                     4 (3.3)
17.0 - 18.9                     3 (2.5)
19.0 +                          1 (0.8)
Total                           120 (100.0)

Figures in parentheses indicate percentages
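Each percentage in parentheses is the class count divided by the total number of divisions, times 100. A short sketch that recomputes the percentages from the counts in the table, rounded to one decimal place:

```python
# Number of divisions in each death-rate class, in table order.
divisions = [4, 13, 20, 27, 18, 11, 11, 6, 2, 4, 3, 1]

total = sum(divisions)
percentages = [round(100 * n / total, 1) for n in divisions]

for n, pct in zip(divisions, percentages):
    print(f"{n:>3} ({pct})")
print(f"{total} ({round(sum(percentages), 1)})")
```

Recomputing percentages this way is a quick check on a published table: equal counts must have equal percentages.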


DIAGRAMS/GRAPHS

Discrete data
--- Bar diagram (one or two groups)

Continuous data
--- Histogram
--- Frequency polygon (curve)
--- Stem-and-leaf plot
--- Box-and-whisker plot
General rules for designing graphs
• A graph should have a self-explanatory legend
• A graph should help the reader to understand the data
• Axes should be labeled, with units of measurement indicated
• Scales are important: start at zero (otherwise show a // break)
• Avoid graphs with a three-dimensional impression; they may be misleading
Example data

68 63 42 27 30 36 28 32
79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31
28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21
16 24 64 47 23 22 43 27
49 28 23 19 11 52 46 31
30 43 49 12
Histogram
[Figure: histogram; x-axis: Age (ticks from 11.5 to 71.5); y-axis: Frequency (ticks at 10 and 20)]

Figure 1 Histogram of ages of 60 subjects
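The bar heights of a histogram are class frequencies. The sketch below counts the 60 example ages in 10-year classes (10-19, 20-29, ...) and prints a text histogram. The class boundaries are an assumption for illustration; the axis ticks in the figure suggest the original may use slightly different limits.

```python
from collections import Counter

ages = [68, 63, 42, 27, 30, 36, 28, 32,
        79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31,
        28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27,
        49, 28, 23, 19, 11, 52, 46, 31,
        30, 43, 49, 12]

# Lower limit of the 10-year class each age falls in (assumed classes: 10-19, 20-29, ...).
counts = Counter(age // 10 * 10 for age in ages)

for lower in sorted(counts):
    n = counts[lower]
    print(f"{lower}-{lower + 9} | {'#' * n} ({n})")
```

Joining the tops of these bars at the class midpoints gives the frequency polygon.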


Frequency Polygon
[Figure: frequency polygon; x-axis: Age (ticks from 11.5 to 71.5); y-axis: Frequency (ticks at 10 and 20)]
Stem and leaf plot
Stem-and-leaf of Age N = 60
Leaf Unit = 1.0

6 1 122269
19 2 1223344555777788888
(11) 3 00111226688
13 4 2223334567999
5 5 01127
4 6 3458
2 7 49
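A stem-and-leaf display with a leaf unit of 1 can be rebuilt from the 60 example ages by splitting each age into its tens digit (stem) and units digit (leaf). This is a minimal sketch; the first printed column is simply the number of leaves on each row.

```python
from collections import defaultdict

ages = [68, 63, 42, 27, 30, 36, 28, 32,
        79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31,
        28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27,
        49, 28, 23, 19, 11, 52, 46, 31,
        30, 43, 49, 12]

# Sorting first means the leaves on each stem come out in ascending order.
leaves_by_stem = defaultdict(list)
for age in sorted(ages):
    leaves_by_stem[age // 10].append(age % 10)

rows = {stem: "".join(str(leaf) for leaf in leaves)
        for stem, leaves in leaves_by_stem.items()}

for stem in sorted(rows):
    print(f"{len(rows[stem]):>3}  {stem}  {rows[stem]}")
```

Unlike a histogram, the display keeps every individual value readable.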
Box plot
[Figure: box plot; y-axis: Age (10 to 80)]
Descriptive statistics report: Boxplot
- minimum score
- maximum score
- lower quartile
- upper quartile
- median
- mean
- the skew of the distribution:
  positive skew: mean > median & high-score whisker is longer
  negative skew: mean < median & low-score whisker is longer
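The quantities a box plot reports can be computed directly for the 60 example ages. This sketch uses Python's `statistics.quantiles` with its default method; other software may use a slightly different quartile rule, so the quartiles can differ a little from those in the plotted figure. Only the mean-versus-median part of the skew rule is checked here.

```python
import statistics

ages = [68, 63, 42, 27, 30, 36, 28, 32,
        79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31,
        28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27,
        49, 28, 23, 19, 11, 52, 46, 31,
        30, 43, 49, 12]

minimum, maximum = min(ages), max(ages)
# Three cut points: lower quartile, median, upper quartile.
lower_quartile, median, upper_quartile = statistics.quantiles(ages, n=4)
mean = statistics.mean(ages)

if mean > median:
    skew = "positive"
elif mean < median:
    skew = "negative"
else:
    skew = "none"

print(minimum, lower_quartile, median, upper_quartile, maximum)
print(mean, skew)
```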
Pie Diagram
• Circular diagram: the total is 100%
• Divided into segments, each representing a category
• Decide adjacent category
• The size of each slice is proportional to the amount for its category
[Figure: pie chart with segments of 70%, 20% and 10%; legend: Mild, Moderate, Severe]
The prevalence of different degrees of hypertension in the population
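Because the whole circle (360 degrees) represents 100%, the angle of each slice is its percentage times 3.6. A small sketch using the three shares shown in the figure; which severity category each share belongs to is not assumed here.

```python
# Shares shown in the pie chart, in percent.
shares_percent = [70, 20, 10]

# Each slice's angle is its share of the full 360-degree circle.
angles = [360 * share / 100 for share in shares_percent]

for share, angle in zip(shares_percent, angles):
    print(f"{share}% -> {angle} degrees")
```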
Simple Bar Diagram
• The height of each bar indicates frequency
• Frequency on the Y axis and categories of the variable on the X axis
• The bars should be of equal width and should not touch the other bars
[Figure: simple bar diagram; x-axis: Risk factor (Smo, Alc, Chol, DM, HTN, No Exer, F-H); y-axis: Number (0 to 25); bar labels as extracted: 20, 20, 16, 12, 12, 9, 8]
The distribution of risk factors among cases with cardiovascular diseases
HIV cases enrolment in USA by gender
Multiple Bar Diagram
[Figure: multiple bar diagram; x-axis: Year (1986 to 1992); y-axis: Enrollment (hundred), 0 to 12; series: Men, Women]
HIV cases enrolment in USA by gender
Sub-divided or component bar diagram
[Figure: component bar diagram; x-axis: Year (1986 to 1992); y-axis: Enrollment (Thousands), 0 to 18; series: Men, Women]
Graphic and Diagrammatic Presentation of Data
- the frequency polygon (quantitative data)
- the histogram (quantitative data)
- the bar diagram (qualitative data)
Thank You
