Biostatistics For Nurses
University of Basra
Chapter one
Definition of Biostatistics
What is Statistics?
Types of Variables
Types of Statistics
Descriptive statistics
Inferential statistics
Statistical Terms
Sampling
What is Biostatistics?
Biostatistics is the use of statistical data to study biological
changes in humans, such as function and disease, reproduction, growth,
and death.
Biostatistics is the application of statistics to a variety of topics in
biology.
What is Statistics?
Statistics is the study of how to collect, organize, analyze, and
interpret information from numerical data.
A statistic is a descriptive measure calculated from sample data that
serves as an estimate of an unknown population parameter.
Types of Variables
A variable is a property that can take on many values.
A variable is a characteristic of interest that is measured or observed on each individual,
such as age, height, gender, or blood group.
They are called variables because their values change from one respondent to
another.
b) Continuous: the variable can take on any value in some range of
values. Continuous quantitative variables are obtained through a measuring
process, e.g. the height of a person, exam results, etc.
b) Ordinal: non-numerical variables whose categories can be arranged
in a meaningful order, e.g. a ranking of satisfaction or perception
level, or a Likert scale of degree of agreement.
For example: first, second, third, or fourth.
Comments:
Notice that the values of the categorical variable Smoking have been coded as
the numbers 0 or 1.
It is quite common to code the values of a categorical variable as numbers, but you
should remember that these are just codes.
Descriptive statistics
1. Descriptive statistics merely “describe” the data and do not allow for
conclusions or predictions.
2. Descriptive statistics usually operate within a specific area that contains the entire
target population.
Summary
Inferential statistics
Statistical Terms
Population:- the entire group of individuals we want information about. It can be
huge, like "all women", or small, like "all third-year statistics students".
Sample Design:- the method used to choose the sample from the population. Poor
sample designs can lead to misleading conclusions.
Chapter two
Dot Plots
Bar Graph
Line Graph
A dot plot is composed of dots plotted on graph paper.
Solution:
The line graph for the following data is given below:
Bar Graph
A bar graph is one of the most frequently used graphs in statistics and in the media. It
is made up of rectangular bars whose lengths are proportional to the numerical
values they represent.
In a bar graph, the bars may be plotted either horizontally or vertically, but a vertical
bar graph (also known as a column graph) is used more often than a horizontal one.
The bars are separated by some distance to distinguish them from one another.
A bar graph shows a comparison among the given categories.
Usually the horizontal axis represents the categories and the vertical axis
shows the discrete numerical values.
Students A B C D
Marks 8 14 9 5
Solution:
The following bar graph is obtained:
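As a rough text-only sketch of this bar graph (drawn with horizontal bars, using the marks from the table above; the helper name `text_bar_chart` is only illustrative):

```python
# Marks for students A-D, taken from the table above.
marks = {"A": 8, "B": 14, "C": 9, "D": 5}

def text_bar_chart(data):
    """Return one line per category; bar length is proportional to the value."""
    return [f"{label} | {'#' * value} {value}" for label, value in data.items()]

for line in text_bar_chart(marks):
    print(line)
```

Each `#` stands for one mark, so the longest bar belongs to student B.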
Line Graph
A line graph represents data as a series of points connected by straight line
segments: the data points are plotted on a graph and then joined together.
Line graphs are used in science, statistics, and the media. They are
easy to create and are popular compared with other graphs because
they show trends in the data very clearly. A line graph gives
a clear visual comparison between two variables, which are represented on the X-axis
and Y-axis.
Solution:
We drew the following histogram:
The frequency polygon is a type of graphical representation that gives a better understanding of the shape
of a given distribution. Frequency polygons serve much the same purpose as histograms.
Chapter Three
The mean
The mode
The median
A- Ungrouped data
Mean = $\bar{x} = \frac{\sum x}{N}$
Example 1:- find the mean (average) for the following ungrouped data: 24, 25, 33, 50, 53, 66, 78
Solution:- ∑x = 24 + 25 + 33 + 50 + 53 + 66 + 78 = 329
N = 7
Mean = $\frac{\sum x}{N} = \frac{329}{7} = 47$
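A minimal sketch of this calculation in Python, using the values from Example 1 (the function name `arithmetic_mean` is illustrative only):

```python
example_1 = [24, 25, 33, 50, 53, 66, 78]

def arithmetic_mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

total = sum(example_1)              # the sum of x
mean_1 = arithmetic_mean(example_1)  # sum of x divided by N
print(total, len(example_1), mean_1)
```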
Example 2:- find the mean for the set of data (2, 4, 7, 10, 13, 16, 18).
Solution:-
Mean = $\bar{x} = \frac{\sum x}{N} = \frac{2+4+7+10+13+16+18}{7} = \frac{70}{7} = 10$
B- grouped data
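We use a modified formula when finding the mean of grouped data:
Mean $(\bar{x}) = \frac{\sum f x_i}{\sum f}$, where $f$ is the class frequency and $x_i$ is the class mark (midpoint).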
class frequency
60 - 62 5
63 - 65 18
66 - 68 42
69 - 71 27
72 - 74 8
Solution:
classes f class mark (xi) fxi
60 - 62 5 (60+62)/2=61 305
63 - 65 18 (63+65)/2=64 1152
66 - 68 42 (66+68)/2=67 2814
69 - 71 27 (69+71)/2=70 1890
72 - 74 8 (72+74)/2=73 584
∑f = 100    ∑fxi = 6745
Mean $(\bar{x}) = \frac{\sum f x_i}{\sum f} = \frac{6745}{100} = 67.45$
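The same table can be worked through in a short Python sketch. The tuple layout (lower limit, upper limit, frequency) and the function name `grouped_mean` are choices made here for illustration:

```python
# (lower limit, upper limit, frequency) for each class in the table above.
classes = [(60, 62, 5), (63, 65, 18), (66, 68, 42), (69, 71, 27), (72, 74, 8)]

def grouped_mean(rows):
    """Mean of grouped data: sum of f * class mark, divided by sum of f."""
    total_f = sum(f for _, _, f in rows)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in rows)
    return total_fx / total_f

grouped = grouped_mean(classes)
print(round(grouped, 2))
```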
2- The mode of a set of data values is the value that appears most
often.
A – Ungrouped data
Example 1: find the mode from the following data
5, 9, 7, 4, 6, 8, 2, 4, 1, 3, 5, 1, 4, 6, 9, 8, 7, 5, 2, 4, 1
Solution :
Order: 1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9
Mode = 4
Note:
- There is no mode when all the scores are different (or when every score
occurs the same number of times).
- Sometimes there is more than one mode.
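A small sketch of these cases using `statistics.multimode` from the Python standard library (Python 3.8+), which returns every value tied for the highest count. The two short lists are made-up illustrations, not data from the notes:

```python
from statistics import multimode

# Data from Example 1.
scores = [5, 9, 7, 4, 6, 8, 2, 4, 1, 3, 5, 1, 4, 6, 9, 8, 7, 5, 2, 4, 1]
modes = multimode(scores)

# Illustrative lists (not from the notes).
two_modes = multimode([1, 1, 2, 2, 3])   # two values tie for most frequent
all_different = multimode([1, 2, 3])     # every value ties: no single mode

print(modes, two_modes, all_different)
```

When every value is returned, as in the last case, the data set is usually described as having no mode.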
B – Grouped data
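The mode for grouped data:- you can calculate the mode for a grouped
frequency table using the following formula:
Mode = $L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$
Where:
The modal class is the class with the highest frequency, $L$ is its lower limit, $f_1$ is its
frequency, $f_0$ and $f_2$ are the frequencies of the classes before and after it, and $h$ is the class width.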
classes Frequency
5-25 12
25-45 8
45-65 14
65-85 20
85-105 6
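Solution:
The modal class is 65-85 (frequency 20), the class before it has frequency 14, the class after it has frequency 6, and the class width is 20.
Mode = $65 + \frac{20 - 14}{2(20) - 14 - 6} \times 20$
Mode = $65 + \frac{6}{20} \times 20$
Mode = 65 + 6
Mode = 71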
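A sketch of the grouped-mode calculation in Python. The formula used, lower limit + (f1 − f0) / (2·f1 − f0 − f2) × width, is an assumption inferred from the worked numbers above (it reproduces 65 + 6 = 71); the function name is illustrative:

```python
# (lower limit, upper limit, frequency) for each class in the table above.
classes = [(5, 25, 12), (25, 45, 8), (45, 65, 14), (65, 85, 20), (85, 105, 6)]

def grouped_mode(rows):
    """Mode of grouped data from the modal class and its two neighbours."""
    freqs = [f for _, _, f in rows]
    i = freqs.index(max(freqs))                   # position of the modal class
    lower, upper, f1 = rows[i]
    f0 = freqs[i - 1] if i > 0 else 0             # class before the modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class after it
    width = upper - lower
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * width

mode_value = grouped_mode(classes)
print(round(mode_value, 2))
```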
Example 2: (even data): find the median for the following data 3, 24, 35, 8,
11, 13, 46, 11, 12, 48
Solution:
Order 3, 8, 11, 11, 12, 13, 24, 35, 46, 48
Median = (12+13) / 2 = 12.5
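For an even number of values, the median is the average of the two middle values after sorting. A short sketch with the data from this example, checked against `statistics.median`:

```python
from statistics import median

data = [3, 24, 35, 8, 11, 13, 46, 11, 12, 48]

ordered = sorted(data)
middle = len(ordered) // 2
# Even count: average the two values on either side of the middle.
manual_median = (ordered[middle - 1] + ordered[middle]) / 2

print(ordered, manual_median, median(data))
```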
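Median = L + [(N/2 – C) / f] h
Where:
L = lower limit of the median class, N = total frequency, C = cumulative frequency of the
class before the median class, f = frequency of the median class, h = class width.
Median class: locate N/2 in the column of cumulative frequency. The class
in which it lies is called the median class.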
classes Frequency
5-9 3
10-14 5
15-19 8
20-24 10
25-29 18
30-34 17
35-39 11
40-44 9
45-49 7
Solution:
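A sketch of how the median formula could be applied to this table in Python. Two things are assumptions here, not statements from the notes: L is taken as the lower class boundary (lower limit minus 0.5, since the classes are one unit apart), and h as the boundary-to-boundary width of 5. A different convention for L would shift the result.

```python
# (lower limit, upper limit, frequency) for each class in the table above.
classes = [(5, 9, 3), (10, 14, 5), (15, 19, 8), (20, 24, 10), (25, 29, 18),
           (30, 34, 17), (35, 39, 11), (40, 44, 9), (45, 49, 7)]

def grouped_median(rows, gap=1):
    """Median = L + [(N/2 - C) / f] * h, with L as the lower class boundary."""
    n = sum(f for _, _, f in rows)
    cumulative = 0                      # C: cumulative frequency so far
    for lower, upper, f in rows:
        if cumulative + f >= n / 2:     # first class that reaches N/2
            boundary = lower - gap / 2  # assumed lower class boundary
            width = upper - lower + gap
            return boundary + (n / 2 - cumulative) / f * width
        cumulative += f

median_value = grouped_median(classes)
print(median_value)
```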
The geometric mean is a type of average, usually used for growth rates.
Definition: for n numbers, multiply them all together and then take the nth root.
Geometric Mean = $\sqrt[n]{x_1 \times x_2 \times \cdots \times x_n}$
Geometric Mean = $\sqrt[3]{10 \times 51.2 \times 8} = \sqrt[3]{4096} = 16$
Geometric Mean = √ =9
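The definition translates directly into code. A minimal sketch using the three numbers 10, 51.2 and 8 from the cube-root example above (`math.prod` needs Python 3.8+; the function name is illustrative):

```python
import math

def geometric_mean(values):
    """Multiply all n values together, then take the nth root."""
    return math.prod(values) ** (1 / len(values))

gm = geometric_mean([10, 51.2, 8])
print(round(gm, 4))
```

Floating-point arithmetic may give a value a hair below 16, which is why the result is rounded for display.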
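The weighted mean is similar to the arithmetic mean, but instead of each data point
contributing equally to the final mean, some data points contribute more
“weight” than others. If all the weights are equal, then the weighted mean
equals the arithmetic mean.
The weighted mean for a given set of non-negative data x1, x2, x3, …, xn
with non-negative weights w1, w2, w3, …, wn can be derived from the
formula:
$\bar{x} = \frac{\sum w_i x_i}{\sum w_i} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$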
Math 90, Stat 75, Chemistry 88 and Physics 79 and the weights are 1, 2, 2,
$\bar{x} = 81.91$
EX:- The following table shows the prices per 100 gram of coffee of
Brands            A    B    C    D    E    F
Price (in $)      2.5  3    3.5  4    4.5  5
Quantity (in kg)  10   8    3    5    7    2
desired.
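A sketch of a weighted mean in Python using the coffee table. The exercise text is incomplete, so what it asks for is an assumption here: the mean price with each brand's quantity used as its weight.

```python
prices = [2.5, 3, 3.5, 4, 4.5, 5]    # price of brands A-F
quantities = [10, 8, 3, 5, 7, 2]     # quantities, used here as weights

def weighted_mean(values, weights):
    """Sum of w * x divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

mean_price = weighted_mean(prices, quantities)
print(round(mean_price, 3))
```

If every weight were the same, the function would return the ordinary arithmetic mean of the prices.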
Solution:-
1- Add the reciprocals of the numbers in the set: 1/1 + 1/5 + 1/8 + 1/10 = 1.425
EX:- Calculate the harmonic mean of the numbers 13.2, 14.2, 14.8, 15.2 and 16.1
x        1/x
13.2     0.0758
14.2     0.0704
14.8     0.0676
15.2     0.0658
16.1     0.0621
Total    ∑(1/x) = 0.3417
$\bar{x} = \frac{n}{\sum (1/x)} = \frac{5}{0.3417} = 14.63$
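A short Python sketch of the same calculation, using the x values listed in the table (13.2, 14.2, 14.8, 15.2, 16.1) and checking the manual formula against `statistics.harmonic_mean`:

```python
from statistics import harmonic_mean

values = [13.2, 14.2, 14.8, 15.2, 16.1]

# Step 1: add the reciprocals. Step 2: divide n by that sum.
reciprocal_sum = sum(1 / x for x in values)
manual_hm = len(values) / reciprocal_sum

print(round(reciprocal_sum, 4), round(manual_hm, 3), round(harmonic_mean(values), 3))
```

The unrounded result can differ slightly in the third decimal from a hand calculation that rounds each reciprocal to four places.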
Chapter four
The range
1- the range :-
The range of a set of data is the difference between the largest and
smallest values
Example 2:- find the range from the following data set:
23 56 45 65 59 55 62 54 85 25
So the range = 85 – 23
Range = 62
x      x − x̄           (x − x̄)²
17     17 − 20 = −3     9
20     20 − 20 = 0      0
23     23 − 20 = 3      9
18     18 − 20 = −2     4
19     19 − 20 = −1     1
21     21 − 20 = 1      1
22     22 − 20 = 2      4
∑x = 140    ∑(x − x̄) = 0    ∑(x − x̄)² = 28
$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$, $S = \sqrt{\frac{28}{6}} = 2.16$
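A sketch of the table in code, assuming the sample form of the standard deviation (n − 1 in the denominator), which is the form the variance calculations elsewhere in these notes follow. `statistics.stdev` uses the same convention:

```python
from statistics import mean, stdev

data = [17, 20, 23, 18, 19, 21, 22]

x_bar = mean(data)
deviations = [x - x_bar for x in data]               # the x - mean column
sum_squares = sum(d ** 2 for d in deviations)        # the squared column, summed
s = (sum_squares / (len(data) - 1)) ** 0.5           # sample standard deviation

print(x_bar, sum(deviations), sum_squares, round(s, 2))
```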
10 100
12 144
∑X = 42    ∑X² = 364
$S = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$
Formula:-
1- $S^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
2- $S^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
Example 1:- find the variance (S²) and standard deviation (S) from the table.
Solution:-
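Mean = x̄ = ∑X ÷ n
x̄ = (727.7 + 1086.5 + 1091.0 + 1361.3 + 1490.5 + 1956.1) ÷ 6
x̄ = 1285.5
$S^2 = \frac{\sum (X - \bar{x})^2}{n - 1}$
$S^2 = \frac{886047.09}{5} = 177209.42$
$S = \sqrt{S^2}$
$S = \sqrt{177209.42} = 420.96$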
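The variance and standard deviation for these six values can be checked with a short sketch. It assumes the sample (n − 1) definition and shows both the deviation form and the shortcut form of the variance:

```python
from statistics import mean, variance

values = [727.7, 1086.5, 1091.0, 1361.3, 1490.5, 1956.1]
n = len(values)

x_bar = mean(values)
# Form 1: squared deviations from the mean, divided by n - 1.
var_1 = sum((x - x_bar) ** 2 for x in values) / (n - 1)
# Form 2: shortcut using the sum of squares and the square of the sum.
var_2 = (sum(x ** 2 for x in values) - sum(values) ** 2 / n) / (n - 1)
s = var_1 ** 0.5

print(round(x_bar, 1), round(var_1, 2), round(s, 2))
```

Both forms give the same variance up to floating-point rounding.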
Formula: $SE = \frac{S}{\sqrt{n}}$
Formula: $CV = \frac{S}{\bar{x}} \times 100\%$
Example 1:- Calculate standard error and coefficient of variation for the
following data.
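Solution:-
$\bar{x} = \frac{\sum X}{n} = \frac{71}{6} = 11.83$
$S = \sqrt{\frac{\sum (X - \bar{x})^2}{n - 1}}$
X      x̄        X − x̄     (X − x̄)²
14     11.83    2.17      4.71
8      11.83    −3.83     14.67
11     11.83    −0.83     0.69
12     11.83    0.17      0.029
16     11.83    4.17      17.39
10     11.83    −1.83     3.35
∑(X − x̄)² = 40.839
$S = \sqrt{\frac{40.839}{5}} = \sqrt{8.168} = 2.858$
$SE = \frac{S}{\sqrt{n}} = \frac{2.858}{\sqrt{6}}$, SE = 1.167
$CV = \frac{S}{\bar{x}} \times 100\%$
$CV = \frac{2.858}{11.83} \times 100\% = 24.16\%$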
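A sketch of the standard error and coefficient of variation for the six X values in the table, again assuming the sample (n − 1) standard deviation. Results computed without intermediate rounding can differ in the last digit from a hand calculation.

```python
from statistics import mean, stdev

data = [14, 8, 11, 12, 16, 10]
n = len(data)

x_bar = mean(data)
s = stdev(data)          # sample standard deviation (n - 1)
se = s / n ** 0.5        # standard error of the mean
cv = s / x_bar * 100     # coefficient of variation, in percent

print(round(x_bar, 2), round(s, 3), round(se, 3), round(cv, 2))
```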
$SE = \frac{S}{\sqrt{n}}$, SE = 5.67
$CV = \frac{S}{\bar{x}} \times 100\%$
So CV = 16.05 %
Chapter Five
Normal distribution
Probability definition:
Multiple-event probability
Any variable can have two types of values. The values can either be fixed
numbers, which are also known as discrete values, or fall within a specified range,
in which case they are continuous.
If a variable can take on any value between two specified values, it is called
a continuous variable; otherwise, it is called a discrete variable.
Some examples will clarify the difference between discrete and continuous
variables.
Suppose the nursing department requires that all nurses weigh
between 60 and 90 kg. The weight of nurses would be an example of
a continuous variable, since a nurse’s weight could take on any value
between 60 and 90 kg.
Suppose we flip a coin and count the number of heads. The number
of heads could be any integer value between 0 and plus infinity. We
could not, for example, get 2.5 heads. Therefore, the number of
heads must be a discrete variable.
An example will make this clear. Suppose you flip a coin two times. This
simple statistical experiment can have four possible outcomes: HH, HT, TH,
and TT. Now, let the random variable X represent the number of Heads
that result from this experiment. The random variable X can only take on
the values 0, 1, or 2, so it is a discrete random variable.
Number of heads (x)    Probability P(X = x)
0                      0.25
1                      0.50
2                      0.25
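The probabilities in this table can be obtained by listing the four equally likely outcomes and counting heads in each. A minimal sketch, assuming a fair coin:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All outcomes of two flips: HH, HT, TH, TT (assumed equally likely).
outcomes = list(product("HT", repeat=2))

# X = number of heads in each outcome.
head_counts = Counter(outcome.count("H") for outcome in outcomes)
distribution = {x: Fraction(count, len(outcomes))
                for x, count in sorted(head_counts.items())}

for x, p in distribution.items():
    print(x, float(p))
```

The probabilities add up to 1, as they must for any probability distribution.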
Normal distribution
The normal distribution is the most widely known and used of all
distributions. Because the normal distribution approximates many natural
phenomena so well, it has developed into a standard of reference for many
probability problems.
• 5- The rule for a normal density function is
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
• 6- About 2/3 of all cases fall within one standard deviation of the mean,
that is P(μ - σ ≤ X ≤ μ + σ) = 0.6826
• 7- About 95% of cases lie within 2 standard deviations of the mean, that is
P(μ - 2σ ≤ X ≤ μ + 2σ) = 0.9544
• 8- About 99.7% of cases lie within 3 standard deviations of the mean, that is
P(μ - 3σ ≤ X ≤ μ + 3σ) = 0.9973
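These three probabilities can be reproduced numerically. For a normal variable, the probability of falling within k standard deviations of the mean equals erf(k / √2), where erf is the error function available as `math.erf`. A short sketch (the function name is illustrative):

```python
import math

def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normal variable X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

The values agree with the figures above to about three decimal places; small differences come from rounding in printed tables.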
Probability definition:
2. Flip a coin twice and the two results are different: {(H, T), (T,H)}.
Formula:
Example:
Solution:
where,
n (A) = 3, total number of even numbers occurred is 3.
Probability that event A does not occur = P (A') = 1 – P (A) = 1 - 0.5 = 0.5.
Formula:
Probability that either event occurs: P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
P(A') = 1 - P(A)
= 1 - 0.5 = 0.5.
Probability that event A does not occur = 0.5.
P(B') = 1 - P(B)
= 1 - 0.5 = 0.5.
Probability that event B does not occur = 0.5.
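A sketch of these rules with sets. Event A (an even number on one roll of a fair six-sided die) matches n(A) = 3 above. Event B is a hypothetical choice made for this illustration (a number greater than 3), picked only because it also has probability 0.5.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}                   # one roll of a fair die
event_a = {x for x in sample_space if x % 2 == 0}   # even numbers
event_b = {x for x in sample_space if x > 3}        # hypothetical event B

def prob(event):
    """P(event) = favourable outcomes / total outcomes."""
    return Fraction(len(event), len(sample_space))

p_a, p_b = prob(event_a), prob(event_b)
p_not_a = 1 - p_a                                   # complement rule
p_union = p_a + p_b - prob(event_a & event_b)       # addition rule

print(float(p_a), float(p_not_a), float(p_union))
```

The addition rule gives the same answer as counting the outcomes of A ∪ B directly.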
Chapter Six
Hypothesis testing
Types of hypothesis:
T-Test: comparison of arithmetic means
Independent one-sample T-Test
Hypothesis testing:
Types of hypothesis:
1- Null hypothesis (H0): the means of the groups are not significantly
different. It has the form:
H0: μ1 = μ2 = μ3 = μ4 ……μn
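Used to compare the mean of the sample with the mean of the population.
Formula: $T = \frac{\bar{x} - \mu}{S / \sqrt{n}}$. The degrees of freedom used in this test is n - 1.
Where: ($\bar{x}$) is the sample arithmetic mean, (μ) is the population arithmetic mean,
(S) is the sample standard deviation and (n) is the sample size.
Solution
$T = \frac{\bar{x} - \mu}{S / \sqrt{n}}$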
X X2
150 22500
170 28900
140 19600
110 12100
160 25600
120 14400
170 28900
140 19600
110 12100
130 16900
∑x = 1400    ∑x² = 200600
x̄ = 1400/10 = 140
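$S = \sqrt{S^2}$
$S^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
$S^2 = \frac{200600 - \frac{(1400)^2}{10}}{10 - 1}$
$S^2 = \frac{200600 - 196000}{9}$, $S^2 = \frac{4600}{9}$
$S^2 = 511.11$
Since $S = \sqrt{S^2}$, $S = \sqrt{511.11} = 22.61$
$T = \frac{\bar{x} - \mu}{S / \sqrt{n}}$
Since µ = 145
$T = \frac{140 - 145}{22.61 / \sqrt{10}} = \frac{-5}{7.15} = -0.70$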
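A sketch of the one-sample T calculation for the ten values in the table, with µ = 145 as stated above. The function name `one_sample_t` is illustrative, and the sample standard deviation (n − 1) is assumed, matching S² = 511.11.

```python
from statistics import mean, stdev

sample = [150, 170, 140, 110, 160, 120, 170, 140, 110, 130]
mu = 145  # population mean given in the example

def one_sample_t(values, population_mean):
    """T = (sample mean - population mean) / (S / sqrt(n))."""
    n = len(values)
    return (mean(values) - population_mean) / (stdev(values) / n ** 0.5)

t_value = one_sample_t(sample, mu)
degrees_of_freedom = len(sample) - 1

print(round(t_value, 3), degrees_of_freedom)
```

The calculated T is then compared with the tabulated T value at the chosen significance level and n − 1 degrees of freedom.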
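This model is widely used in scientific research to assess how
a sample changes before and after a treatment.
Formula: $T = \frac{\bar{x}_1 - \bar{x}_2}{S_d / \sqrt{n}}$. The degrees of freedom used in this test is n − 1.
Where ($S_d$) is the standard deviation of the differences d = x1 − x2.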
Ex: Five patients have heart rates (x1 = 100, 120, 115, 110, 130); after care these
become (x2 = 90, 100, 90, 80, 95). Calculate the T-value, and then interpret the
result statistically.
Solution:
$T = \frac{\bar{x}_1 - \bar{x}_2}{S_d / \sqrt{n}}$
X1 X2 d d2
100 90 10 100
120 100 20 400
115 90 25 625
110 80 30 900
130 95 35 1225
∑x1 = 575    ∑x2 = 455    ∑d = 120    ∑d² = 3250
x̄1 = 575 ÷ 5 = 115    x̄2 = 455 ÷ 5 = 91
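$S_d = \sqrt{S_d^2}$
$S_d^2 = \frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}$
$S_d^2 = \frac{3250 - \frac{(120)^2}{5}}{5 - 1}$
$S_d^2 = \frac{3250 - 2880}{4}$
$S_d^2 = 92.5$, so $S_d = \sqrt{92.5} = 9.62$
$T = \frac{\bar{x}_1 - \bar{x}_2}{S_d / \sqrt{n}}$, $T = \frac{115 - 91}{9.62 / \sqrt{5}}$
$T = \frac{24}{4.30} = 5.58$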
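A sketch of the paired calculation for the five patients, working from the differences d = x1 − x2 and assuming the sample (n − 1) standard deviation of the differences, consistent with Sd² = 92.5 above:

```python
from statistics import mean, stdev

before = [100, 120, 115, 110, 130]   # x1
after = [90, 100, 90, 80, 95]        # x2

differences = [b - a for b, a in zip(before, after)]   # d column
n = len(differences)
s_d = stdev(differences)             # standard deviation of the differences

# The mean difference equals mean(x1) - mean(x2).
t_value = mean(differences) / (s_d / n ** 0.5)

print(sum(differences), round(s_d ** 2, 1), round(t_value, 2))
```

The result is compared with the tabulated T value at n − 1 = 4 degrees of freedom.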
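Formula: $T = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{S_x^2}{n_1} + \frac{S_y^2}{n_2}}}$. The degrees of freedom used in this test is n1 + n2 - 2.
Where ($\bar{x}$) is the arithmetic mean of the first sample, ($\bar{y}$) is the arithmetic mean of the second sample,
($S_x^2$) is the variance of the first sample, and ($S_y^2$) is the variance of the second sample.
Solution: $T = \frac{\bar{x} - \bar{y}}{\sqrt{\frac{S_x^2}{n_1} + \frac{S_y^2}{n_2}}}$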
x X2 y y2
8 64 9 81
6 36 7 49
5 25 5 25
4 16 4 16
7 49 7 49
6 36 6 36
6 36 8 64
5 25 4 16
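∑x = 47    ∑x² = 287    ∑y = 50    ∑y² = 336
x̄ = 47 ÷ 8 = 5.875    ȳ = 50 ÷ 8 = 6.25
$S_x^2 = \frac{287 - \frac{(47)^2}{8}}{7} = 1.55$, $S_y^2 = \frac{336 - \frac{(50)^2}{8}}{7} = 3.36$
$T = \frac{5.875 - 6.25}{\sqrt{\frac{1.55}{8} + \frac{3.36}{8}}}$, $T = \frac{-0.375}{\sqrt{0.614}}$
$T = \frac{-0.375}{0.784}$, $T = -0.48$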
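A sketch of the two-sample calculation for the x and y columns above. The form of the denominator is an assumption: the sketch uses √(S²x/n1 + S²y/n2). Because both samples have 8 values, the pooled-variance form gives the same number here.

```python
from statistics import mean, variance

x = [8, 6, 5, 4, 7, 6, 6, 5]
y = [9, 7, 5, 4, 7, 6, 8, 4]

def independent_t(a, b):
    """T = (mean(a) - mean(b)) / sqrt(var(a)/n1 + var(b)/n2), sample variances."""
    standard_error = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / standard_error

t_value = independent_t(x, y)
degrees_of_freedom = len(x) + len(y) - 2

print(round(t_value, 3), degrees_of_freedom)
```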
Chapter Seven
What is a correlation?
Pearson correlation
Chi-Square distribution (χ²):- The Chi-Square test is a widely used non-parametric
statistical test that describes the magnitude of the discrepancy between the observed
data and the data expected under a specific hypothesis.
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Where,
O = Observed Frequency
E = Expected or Theoretical Frequency
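A sketch of the chi-square statistic for a contingency table. The 2×2 table used here is hypothetical, made up only to show the mechanics; it is not data from these notes. Each expected frequency is taken as row total × column total ÷ overall total.

```python
def chi_square(observed):
    """Sum of (O - E)^2 / E over every cell of a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total   # expected frequency
            statistic += (o - e) ** 2 / e
    return statistic

# Hypothetical observed counts (rows = groups, columns = outcomes).
hypothetical_table = [[10, 20], [30, 40]]
chi2 = chi_square(hypothetical_table)
df = (len(hypothetical_table) - 1) * (len(hypothetical_table[0]) - 1)

print(round(chi2, 4), df)
```

The calculated value is compared with the tabulated chi-square value at (rows − 1) × (columns − 1) degrees of freedom.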
smoke no smoke
Multiply each row total by each column total and divide by the overall total:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Since χ²-calculated = 4.102 < χ²-tabular = 6.635 at the 0.01 level (df = 1), the null hypothesis is not rejected.
Voting Preferences
Solution:-
Voting Preferences
Sum
Laboratory clinical management
Calculate the "Expected Value" for each entry: E = (row total × column total) / sample size
$\chi^2 = \sum \frac{(O - E)^2}{E}$
χ² = 16.2
Solution:-
sinusitis men women children Sum
Calculate the "Expected Value" for each entry: E = (row total × column total) / sample size
$\chi^2 = \sum \frac{(O - E)^2}{E}$
χ² = 22.152
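R-calculated:
$R = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sqrt{\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]\left[\sum y^2 - \frac{(\sum y)^2}{n}\right]}}$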
EX1: Sample question: compute the value of the correlation coefficient from the
following table:
57 87
59 81
Solution:
R = 478 / √813984.48
R = 478 / 902.21
R = 0.529
Interpretation: R-calculated = 0.529 < R-tabular (0.811) at the 0.05 level (df = 4), so the
correlation between X and Y is not statistically significant.
EX2: Calculate the correlation coefficient between two variables, and then interpret
the R value. X= 1, 3, 4, 4 and Y= 2, 5, 5, 8.
Solution:
x y xy x2 y2
1 2 2 1 4
3 5 15 9 25
4 5 20 16 25
4 8 32 16 64
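Σx = 12    Σy = 20    Σxy = 69    Σx² = 42    Σy² = 118
$R = \frac{69 - \frac{(12)(20)}{4}}{\sqrt{\left[42 - \frac{(12)^2}{4}\right]\left[118 - \frac{(20)^2}{4}\right]}} = \frac{9}{\sqrt{6 \times 18}} = \frac{9}{10.39} = 0.866$
Interpretation: R-calculated = 0.866 < R-tabular (0.950) at the 0.05 level (df = 2),
so the correlation between X and Y is not statistically significant.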
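A sketch of the Pearson correlation for the EX2 data, built from the same column sums as the table (Σx, Σy, Σxy, Σx², Σy²). The function name `pearson_r` is illustrative:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from the column sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = ((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)) ** 0.5
    return numerator / denominator

x = [1, 3, 4, 4]
y = [2, 5, 5, 8]
r = pearson_r(x, y)
df = len(x) - 2

print(round(r, 3), df)
```

With only four pairs (df = 2), even a fairly large R stays below the tabulated value at the 0.05 level.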
df = n - 2
Level of Significance (p) for Two-Tailed Test
df    .10    .05    .02    .01