Biostatistics Notes Introductory Chapter

Uploaded by

Joseph Mbambo
Copyright
© All Rights Reserved

UNIT 1

BASIC INTRODUCTION TO BIOSTATISTICS

Definition 1.1

Statistics: the branch of mathematics that deals with the collection, presentation and analysis of data for the purpose of making informed decisions.

Biostatistics: the application of statistics to plant, animal and health data. In other words, biostatistics means applying statistics to bio data.

Bio data include plant and animal science data as well as health-related data. Any health-related data is bio data.

Statistics is a decision science, and so is biostatistics, because it assists in decision making.

Data: raw figures.

Information: processed data.

Data + Processing = Information

UNIT 2

DATA COLLECTION AND PRESENTATION

Population: is the totality of elements or items under study.

Population can be finite or infinite.

Finite: countable.

Eg. Number of students

Infinite: uncountable in practice.

Eg. -Mosquitoes

-Fish in the sea


Classes of Biostatistics

1. Descriptive biostatistics

2. Inferential biostatistics

Descriptive: deals with the use of measures of variability and central tendency, as well as graphical techniques, to extract information from data.

Descriptive biostatistics deals with:

-Variability / dispersion

-Central tendency / location

-Graphical techniques

Inferential: deals with generalizing sample results to the population.

How do we generalize?

Sample statistic: a measure that describes a sample characteristic.

Parameter: a measure that describes a population characteristic.

To every sample statistic, there is a corresponding population parameter.

Table 1: Examples of sample statistics (estimators) and corresponding population parameters

SAMPLE STATISTIC                     POPULATION PARAMETER

Sample mean (x̄)                      Population mean (μ)

Sample variance                      Population variance

Sample standard deviation            Population standard deviation

Sample coefficient of variation      Population coefficient of variation

Sample proportion                    Population proportion

Sample size                          Population size


Important terms and phrases

1. Statistics
2. Biostatistics
3. Inferential biostatistics
4. Descriptive biostatistics
5. Data
6. Information
7. Population
8. Finite population
9. Infinite population
10. Sample statistic
11. Population parameter

Properties of Estimators

-Sufficiency

-Unbiasedness

-Efficiency

Variable: a quantity that assumes different values.

Classification of variables

-Quantitative/ Numerical

-Qualitative/ Categorical

Quantitative variables take numerical values and represent some kind of measurement.

Quantitative

Eg. Height, weight, rainfall, temperature, time

Discrete: assumes exact values (0, 1, 2, etc.)

Eg. Number of students

Continuous: can assume any value in a given interval.

Eg. Height in the interval (0.3; 2.5)

Qualitative

Nominal: cannot be ordered

Eg. Sex (male and female), religion

Ordinal: can be ordered

Eg. Education level (Primary, Junior and Secondary)

Non-probability sampling: members of the population do not have a known chance of being selected. Eg. convenience sampling, judgmental sampling


UNIT 3

MEASURES OF CENTRAL TENDENCY AND VARIABILITY

Measures of central tendency / location

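The three main measures of central tendency are:

- Mean

- Median

- Mode

Other measures of location are:

- Quartiles (Q1, Q2, Q3): divide the data into 4 equal parts

- Percentiles: divide the data into 100 equal parts

- Deciles: divide the data into 10 equal parts (the first decile has 10% of the data below it and 90% above it)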

Ungrouped Data Case

Data are ungrouped if they are not classified into classes.

Eg. Yield data

1, 3, 4, 5, 6, 8, 10, 11, 12

Data are grouped if they are arranged in classes, with a frequency for each class.

Eg.

Class       0-5   5-10   10-15   15-20

Frequency   3     6      2       9

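Median: divides an arranged data set so that 50% of the values are below or equal to it and 50% are above or equal to it.

If n is even:

$\text{Median} = \dfrac{\left(\frac{n}{2}\right)\text{th term} + \left(\frac{n}{2} + 1\right)\text{th term}}{2}$

Eg. 1, 3, 4, 5, 6, 10, 11, 12

n = 8 (even)

$\frac{n}{2} = \frac{8}{2}$ = 4th term

$\text{Median} = \dfrac{\text{4th term} + \text{5th term}}{2} = \dfrac{5 + 6}{2}$

= 5.5

If n is odd:

$\text{Median} = \left(\dfrac{n + 1}{2}\right)\text{th term}$

Eg. 1, 3, 5, 7, 9

n = 5 (odd)

$\text{Median} = \left(\dfrac{5 + 1}{2}\right)\text{th term}$

= 3rd term

= 5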

Mode: the value that appears most often.

1, 2, 2, 3, 4, 5: the mode is 2 (unimodal)

1, 2, 2, 3, 3, 4, 5: the modes are 2 and 3 (bimodal)
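The mean, median and mode of the small ungrouped data sets used in these notes can be checked with a short sketch. It relies only on Python's standard `statistics` module; the variable names `even_data` and `odd_data` are illustrative and not part of the notes.

```python
from statistics import mean, median, multimode

# Data sets taken from the worked examples in these notes.
even_data = [1, 3, 4, 5, 6, 10, 11, 12]   # n = 8 (even)
odd_data = [1, 3, 5, 7, 9]                # n = 5 (odd)

print(mean(even_data))     # arithmetic mean: sum of the values divided by n
print(median(even_data))   # average of the 4th and 5th sorted terms
print(median(odd_data))    # the 3rd sorted term

# multimode returns every value tied for the highest count.
print(multimode([1, 2, 2, 3, 4, 5]))       # one mode: unimodal
print(multimode([1, 2, 2, 3, 3, 4, 5]))    # two modes: bimodal
```

`multimode` is used rather than `mode` because it reports all tied values, which is what the bimodal example needs.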

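Quartiles

Q1: divides an arranged data set so that 25% of the values are below or equal to it and 75% are above or equal to it.

Eg. 1, 3, 4, 5, 6, 10, 11, 12 (n = 8)

$Q_1 = \frac{1}{4}(n + 1)\text{th term} = \frac{1}{4} \times 9$

= 2.25th term (position of the first quartile)

If the position is not a whole number, we perform a linear interpolation between the 2nd term (3) and the 3rd term (4):

Q1 = 3 + 0.25(4 - 3)

= 3.25

Q2: the median

$Q_2 = \frac{2}{4}(n + 1)\text{th term}$

$= \frac{2}{4} \times 9$

= 4.5th term, so Q2 = 5 + 0.5(6 - 5) = 5.5

$Q_3 = \frac{3}{4}(n + 1)\text{th term}$

$= \frac{3}{4} \times 9$

= 6.75th term, so Q3 = 10 + 0.75(11 - 10) = 10.75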
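The position rule and linear interpolation used for the quartiles can be sketched as a small function. This is a minimal illustration, assuming the (n + 1) position rule from the worked example; the function name `quartile` is hypothetical, and other textbooks and software use slightly different quartile conventions.

```python
def quartile(data, q):
    """Return quartile q (1, 2 or 3) using the q(n + 1)/4 position rule."""
    values = sorted(data)
    n = len(values)
    position = q * (n + 1) / 4      # 1-based position, may be fractional
    lower = int(position)           # whole part of the position
    fraction = position - lower     # fractional part used for interpolation
    if lower < 1:                   # position falls before the first term
        return values[0]
    if lower >= n:                  # position falls at or after the last term
        return values[-1]
    below, above = values[lower - 1], values[lower]
    return below + fraction * (above - below)


data = [1, 3, 4, 5, 6, 10, 11, 12]
q1, q2, q3 = (quartile(data, q) for q in (1, 2, 3))
print(q1, q2, q3)
print(q3 - q1)   # interquartile range
```

For this data set the positions are 2.25, 4.5 and 6.75, so the function interpolates between the 2nd and 3rd, 4th and 5th, and 6th and 7th sorted terms respectively.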

Grouped Data Case

Class mid-point (x)   2     6     10     14      18

Class                 0-4   4-8   8-12   12-16   16-20

Frequency (f)         3     4     6      5       9

x is the class mid-point.

$\text{Median} = L_m + C_m \left( \dfrac{\frac{n}{2} - F_m}{f_m} \right)$

Lm: is the lower limit of the median class

Cm: is the class width of the median class

Fm: is the cumulative frequency of the classes below the median class

fm: is the frequency of the median class

n: is the total frequency

Here n = 27 and n/2 = 13.5, so the median class is 12-16, with Lm = 12, Cm = 4, Fm = 13 and fm = 5:

$\text{Median} = 12 + 4 \left( \dfrac{13.5 - 13}{5} \right)$

= 12.4

For the mode, Lm, Cm and fm refer to the modal class, and:

fm-1: is the frequency of the class just below the modal class

fm+1: is the frequency of the class just above the modal class

NB: the modal class is the class with the highest frequency.

$\text{Mode} = L_m + C_m \left( \dfrac{f_m - f_{m-1}}{2f_m - f_{m-1} - f_{m+1}} \right)$

Examples

Class       0-7   7-14   14-21   21-29   29-36

Frequency   3     10     13      4       10
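The grouped median and mode can be computed directly from the class limits and frequencies. The sketch below assumes the usual textbook interpolation formulas, Median = Lm + Cm(n/2 - Fm)/fm and Mode = Lm + Cm(fm - fm-1)/(2fm - fm-1 - fm+1); the mode formula in particular is an assumption about what these notes intend. The function and variable names are illustrative.

```python
def grouped_median(classes, freqs):
    """Median = Lm + Cm * (n/2 - Fm) / fm for data grouped into classes."""
    n = sum(freqs)
    cumulative = 0   # Fm: cumulative frequency of the classes below the current one
    for (lower, upper), f in zip(classes, freqs):
        # The median class is the first class whose cumulative frequency reaches n/2.
        if f > 0 and cumulative + f >= n / 2:
            return lower + (upper - lower) * (n / 2 - cumulative) / f
        cumulative += f
    raise ValueError("no observations")


def grouped_mode(classes, freqs):
    """Mode = Lm + Cm * (fm - fm-1) / (2fm - fm-1 - fm+1) for the modal class."""
    m = freqs.index(max(freqs))   # first class with the highest frequency
    lower, upper = classes[m]
    f_m = freqs[m]
    f_prev = freqs[m - 1] if m > 0 else 0                 # no class below: use 0
    f_next = freqs[m + 1] if m < len(freqs) - 1 else 0    # no class above: use 0
    return lower + (upper - lower) * (f_m - f_prev) / (2 * f_m - f_prev - f_next)


# First grouped table in the notes (the median there is given as 12.4).
classes_a = [(0, 4), (4, 8), (8, 12), (12, 16), (16, 20)]
freqs_a = [3, 4, 6, 5, 9]
print(grouped_median(classes_a, freqs_a))

# The "Examples" table, with class limits exactly as printed in the notes.
classes_b = [(0, 7), (7, 14), (14, 21), (21, 29), (29, 36)]
freqs_b = [3, 10, 13, 4, 10]
print(grouped_median(classes_b, freqs_b), grouped_mode(classes_b, freqs_b))
```

Each class carries its own lower and upper limit, so the class width is taken from the median or modal class itself rather than assumed constant.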

Measures of variability / Dispersion

These measures are used to describe how the values are spread around a central value, usually the mean.

The main measures are:

-Standard deviation

-variance

-Range

-Interquartile range

-Coefficient of variation

Measures of variability: Ungrouped Data Case

Consider the yield data given below:

1, 2, 0, 3, 4, 6, 2

Determine the range, standard deviation, variance, interquartile range, coefficient of variation and signal-to-noise ratio.

Range=max-min

=6-0

=6

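Mean: $\bar{x} = \dfrac{\sum x}{n} = \dfrac{18}{7} = 2.57$

Variance (definitional formula): $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$

Variance (computational formula): $s^2 = \dfrac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$

x       x̄       x - x̄     x²

1       2.57    -1.57     1

2       2.57    -0.57     4

0       2.57    -2.57     0

3       2.57    0.43      9

4       2.57    1.43      16

6       2.57    3.43      36

2       2.57    -0.57     4

Total   18                70

$s^2 = \dfrac{70 - \frac{18^2}{7}}{7 - 1} = 3.95$

Standard deviation: $s = \sqrt{s^2} = 1.99$

Coefficient of variation: $CV = \dfrac{s}{\bar{x}} \times 100\%$

Signal-to-noise ratio: invert the ratio used in the coefficient of variation, $SNR = \dfrac{\bar{x}}{s}$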
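The measures of variability for the yield data can be verified with a few lines of code. This sketch assumes the sample (n - 1) computational formula for the variance and takes the signal-to-noise ratio to be the inverse of the coefficient of variation ratio (mean divided by standard deviation); both are assumptions about the intended definitions, and the variable names are illustrative.

```python
from math import sqrt

data = [1, 2, 0, 3, 4, 6, 2]          # yield data from the notes
n = len(data)
total = sum(data)                     # sum of x
total_sq = sum(x * x for x in data)   # sum of x squared

data_range = max(data) - min(data)
mean = total / n
# Computational formula: (sum of x^2 - (sum of x)^2 / n) / (n - 1)
variance = (total_sq - total ** 2 / n) / (n - 1)
std_dev = sqrt(variance)
cv = std_dev / mean * 100             # coefficient of variation, in percent
snr = mean / std_dev                  # assumed definition: inverse of the CV ratio

print(data_range, round(mean, 2), round(variance, 2), round(std_dev, 2))
print(round(cv, 1), round(snr, 2))
```

The computational formula needs only the two running totals, which is why the worked table carries an x² column.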

Grouped Data Case

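x           1     3     5     7

Class       0-2   2-4   4-6   6-8

Frequency   2     1     3     1

x: is the class mid-point

x    f    x²    fx²    fx

1    2    1     2      2

3    1    9     9      3

5    3    25    75     15

7    1    49    49     7

$\sum fx^2 = 135$

$\sum fx = 27$

$n = \sum f = 7$

$s^2 = \dfrac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \dfrac{135 - \frac{27^2}{7}}{7 - 1}$

= 5.14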
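The grouped variance can be reproduced from the class mid-points and frequencies. This is a minimal sketch assuming the (n - 1) computational formula, which is the one that yields the 5.14 shown in the notes; the variable names are illustrative.

```python
midpoints = [1, 3, 5, 7]   # class mid-points x
freqs = [2, 1, 3, 1]       # class frequencies f

n = sum(freqs)                                              # total frequency
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))       # sum of f*x
sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))  # sum of f*x^2

mean = sum_fx / n
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

print(sum_fx, sum_fx2)
print(round(mean, 2), round(variance, 2))
```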

Skewness and Kurtosis

Skewness: any deviation from normality (symmetry).

Kurtosis: the peakedness of the data, i.e. how sharply curved the distribution is.

Normality: the pattern of symmetry about the mean.

Different forms of kurtosis

- Platykurtic (flatter than normal)
- Mesokurtic (normal peak)
- Leptokurtic (more peaked than normal)

Skewness Analysis

To establish whether the data are skewed or not, we use skewness measures, i.e. different skewness coefficients.

Measures are: Bowley’s coefficient of Skewness and Pearson’s Skewness coefficient

1. Bowley’s Coefficient of Skewness

It measures whether the data are positively or negatively skewed:

$SK_B = \dfrac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$

Its value is interpreted as:

Normally distributed: $SK_B = 0$

Positively skewed: $SK_B > 0$

Negatively skewed: $SK_B < 0$

2. Pearson’s Skewness coefficient

$SK_P = \dfrac{\bar{x} - \text{Mode}}{s}$

or

$SK_P = \dfrac{3(\bar{x} - \text{Median})}{s}$
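Both skewness coefficients can be illustrated on the ungrouped data set 1, 3, 4, 5, 6, 10, 11, 12. The sketch assumes the standard definitions, Bowley's coefficient (Q3 + Q1 - 2Q2)/(Q3 - Q1) and Pearson's median-based coefficient 3(mean - median)/s; the function names are hypothetical, and the quartile values 3.25, 5.5 and 10.75 are those obtained with the (n + 1) position rule.

```python
from statistics import mean, median, stdev


def bowley_skewness(q1, q2, q3):
    """Quartile-based skewness: 0 is symmetric, > 0 positive, < 0 negative."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)


def pearson_median_skewness(data):
    """Pearson's median-based coefficient, using the sample standard deviation."""
    return 3 * (mean(data) - median(data)) / stdev(data)


data = [1, 3, 4, 5, 6, 10, 11, 12]
bowley = bowley_skewness(3.25, 5.5, 10.75)
pearson = pearson_median_skewness(data)
print(round(bowley, 2), round(pearson, 2))
```

Both coefficients come out positive for this data set, which matches the longer tail towards the larger values.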

Converting Ungrouped to Grouped Data

Converting data from ungrouped to grouped is important in providing clear graphical illustration. Some
graphs are easily produced if data is grouped.

Grouping data: is the process of putting data into classes, with the frequency specified against each class.

Grouped Data

Eg.

Class       0-5   5-10   10-15   15-20   20-25

Frequency   6     7      9       1       2

Ungrouped Data

1, 3,4, 6, 10, 11, 0, 2, 8, 15, 15, 16, 20, 20, 5, 12, 13, 17, 18, 18, 5

In grouping data, the major issue is determining the number of classes to use.

Two methods are used to achieve this:

1. $2^k$ rule

2. Sturges' rule

$2^k$ Rule

Using this method, the number of classes k must be chosen as the smallest whole number such that the relationship $2^k > n$ is satisfied, where n is the number of observations.

For the ungrouped data above, n = 21:

$2^5 = 32 > 21$, so k = 5

Range = 20 - 0

= 20 is the overall range (maximum - minimum)

For classes: Class width = $\dfrac{\text{Range}}{k} = \dfrac{20}{5} = 4$

NB: A value equal to the upper limit of a class is not included in that class, except in the last class.

Class    Frequency

0-4      4

4-8      4

8-12     3

12-16    4

16-20    6
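The 2^k rule and the resulting frequency table can be reproduced in code. This is a minimal sketch that assumes k is the smallest whole number with 2^k > n and that each class includes its lower limit but not its upper limit, except for the last class; the function name `classes_2k` is illustrative.

```python
data = [1, 3, 4, 6, 10, 11, 0, 2, 8, 15, 15, 16, 20, 20, 5, 12, 13, 17, 18, 18, 5]


def classes_2k(n):
    """Smallest whole number k such that 2**k > n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


n = len(data)
k = classes_2k(n)
low, high = min(data), max(data)
width = (high - low) / k    # class width = range / k

table = []
for i in range(k):
    lower = low + i * width
    upper = lower + width
    last = i == k - 1
    # Upper limit is excluded, except in the last class.
    count = sum(1 for x in data if lower <= x < upper or (last and x == upper))
    table.append((lower, upper, count))

for lower, upper, count in table:
    print(f"{lower:g}-{upper:g}: {count}")
# Note: the listed values 8, 10 and 11 all fall in class 8-12.
```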
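
Sturges' Rule

Using this method, the number of classes is $k = 1 + 3.322 \log_{10} n$, rounded to a whole number, where n is the number of observations.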
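Sturges' rule is commonly stated as k = 1 + 3.322 log10(n). The sketch below applies that standard statement to the 21 observations used in the 2^k example; rounding to the nearest whole number is an assumption (some texts round up instead), and the function name is illustrative.

```python
from math import log10


def sturges_classes(n):
    """Number of classes k = 1 + 3.322 * log10(n), rounded to the nearest whole number."""
    return round(1 + 3.322 * log10(n))


print(sturges_classes(21))
```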

Classes are of two types:

1. Apparent classes: suitable for discrete random variables

2. Real classes: suitable for continuous random variables

Apparent classes have gaps between them, while real classes have no gaps.

Apparent:   Real:

0-5         0-5

6-11        5-10

12-17       10-15

Conversion of apparent to real classes

Subtract half the gap between consecutive classes from each lower limit and add it to each upper limit.

Apparent:   Real:          Apparent:   Real:

0-5         -0.5-5.5       1-2         0-3

6-11        5.5-11.5       4-6         3-7

12-17       11.5-17.5      8-10        7-11
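The conversion from apparent to real classes can be expressed as a short function. The sketch assumes the gap between consecutive classes is constant and moves each limit outwards by half that gap; the function name `to_real_classes` is hypothetical.

```python
def to_real_classes(apparent):
    """Close the gaps between apparent classes by moving each limit half a gap outwards."""
    # Gap between the upper limit of the first class and the lower limit of the second.
    gap = apparent[1][0] - apparent[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in apparent]


print(to_real_classes([(0, 5), (6, 11), (12, 17)]))
print(to_real_classes([(1, 2), (4, 6), (8, 10)]))
```

With a gap of 1 the first class 0-5 becomes -0.5 to 5.5; with a gap of 2 the class 1-2 becomes 0 to 3.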

Graphical Techniques

Graphical techniques form an important part of descriptive biostatistics. They are useful in displaying the salient features of any data set.

Different types of graphs that are useful in research

The following are some of the useful graphs:

1. Box and Whisker Plot
2. Stem and Leaf (Back to back)
3. Histogram and Bar graph
4. Ogive (Less than and More than)
5. Polygons
6. Pie charts
7. Line graphs
8. Time series plots
9. Scattergrams

Box and Whisker Plot

Class       0-4   4-8   8-12   12-16   16-20   20-24

Frequency   6     5     7      9       4       2
