2-Presentation2
College of Science
Department of Biology
Lecture 2
2024-2025
Biostatistics
2/27/2025 biostatistics 1
3. Cluster sample: Divide the population into groups called clusters, then
randomly select clusters and include all the members of each selected cluster.
Example:
4. Systematic sample: Assign a number to each member of the population,
randomly pick a starting number, then select members at the same fixed
interval from that starting number.
\[ \text{Systematic sample interval} = \frac{K}{n} \]
Where:
K = number of members in the population (population size),
n = number of members in the sample (sample size)
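The procedure above can be sketched in Python. This is only a minimal illustration, assuming a hypothetical population numbered 1 to 100 and a sample of 10; the function name `systematic_sample` and the `seed` parameter are not from the lecture, and the interval is taken as the integer part of K/n.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick every k-th member after a random start, with k = K // n."""
    K = len(population)              # population size
    k = K // n                       # sampling interval (integer part of K/n)
    rng = random.Random(seed)
    start = rng.randrange(k)         # random starting position in the first interval
    return [population[start + i * k] for i in range(n)]

members = list(range(1, 101))        # hypothetical population numbered 1..100
sample = systematic_sample(members, n=10, seed=1)
print(sample)
```

With K = 100 and n = 10 the interval is 10, so consecutive selected numbers always differ by 10, whatever the random start.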
Data are the measurements or observations of a variable. Data are often thought of as numbers,
but they need not be numbers. Data are the raw material for statistics. There are two main kinds
of data:
1. Data resulting from measurement, such as body weight, body height,
serum cholesterol level, etc.
2. Data resulting from counting, such as the number of patients discharged from
a hospital on a given day, the number of teeth extracted from a person,
the number of pregnancies for a woman, etc.

Examples of Variables and Data
Variables (Variable)                     Data (Datum)
Body temperature                         37.4 °C
Body weight                              75 kg
Age                                      25 years
Number of females                        24
Number of persons who have blue eyes     10
Number of cigarettes smoked per day      25
Variable types
There are two main types of variables: Numerical (quantitative) variables and
categorical (qualitative) variables. Numerical variables are further classified into
numerical continuous or numerical discrete. Categorical variables are classified
into categorical nominal or categorical ordinal. For details of variable types with
examples, see Figure 1 below.
Exercise 1: Types of variables
Categorize the following variables as either numerical or categorical variables:
[Age, sex, race, residency, occupation, years of formal education,
cigarette smoking, No. of cigarettes smoked/day, weight, height, hemoglobin level, anemia,
educational level of mother (whether primary, secondary, college), coffee drinking, No. of
cups of tea drunk/day, blood sugar level, blood group, Rh, blood urea level, serum sodium
level].

Numerical variables: Age, years of formal education, No. of cigarettes smoked/day,
weight, height, hemoglobin level, No. of cups of tea drunk/day, blood sugar,
blood urea level, serum sodium level.

Categorical variables: Sex, race, residency, cigarette smoking, anemia, blood group,
Rh, educational level of mother (whether primary, secondary, college), coffee drinking.
DATA PRESENTATION
A large amount of raw data is meaningless and hard to understand unless it is organized and
summarized in a simple and understandable way. One of the first things that you may wish
to do when you have entered your data onto a computer is to summarize them in some way
so that you can get a ‘feel’ for the data. This can be done by producing diagrams, tables or
summary statistics. Diagrams are often powerful tools for conveying information about the
data, for providing simple summary pictures before any analyses are performed.
Data presentation is the method by which people summarize, organize and communicate
information using a variety of tools, such as diagrams, distribution charts, histograms and
graphs. The methods used to present raw data vary widely. Common presentation modes
include coding data and drawing diagrams, tables, pie charts and histograms. Common ways
of data presentation are:
Some definitions
Ungrouped data: the raw or original data collected from the source.
Grouped data: data organized in the form of a frequency distribution table.
Class: one of the groups into which the values of the variable are divided; each class has
two limits (a minimum and a maximum limit) and a range.
Class limits: each class has a minimum and a maximum limit.
Real class limits: each class has a real minimum and a real maximum limit.
Class length: the width of a particular class. Classes of equal length are preferable.
Class center: the midpoint of the class, that is, the sum of the two class limits divided by two.
Class frequency: the number of values that fall between the two class limits.
Relative Frequency Distribution Table
A table showing the relative importance of each class, obtained by dividing the
frequency of each class by the total frequency. Note that the sum of the relative
frequencies always equals one. Multiplying the relative frequency by 100 gives the
percentage relative frequency:
\[ \text{pe.re.f} = \frac{f_i}{\sum f_i} \times 100 \]
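The formula can be checked with a few lines of Python. This is a minimal sketch using made-up class frequencies (2, 3 and 5) chosen only for illustration; the function name `percentage_relative_frequency` is an assumption, not part of the lecture.

```python
def percentage_relative_frequency(freqs):
    """Return 100 * f_i / sum(f) for each class frequency f_i."""
    total = sum(freqs)
    return [100 * f / total for f in freqs]

freqs = [2, 3, 5]                        # hypothetical class frequencies
pct = percentage_relative_frequency(freqs)
rel = [p / 100 for p in pct]             # plain relative frequencies
print(pct, sum(rel))
```

The relative frequencies add up to one and the percentages add up to 100, as stated above.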
The general steps to create a frequency distribution table
1. Range
2. Number of classes
3. Length of classes
Example: The following data were recorded for 28 patients (note that the minimum of the first
class = 25). Required:
1. Find the frequency distribution table for these data.
2. Find the percentage relative frequency distribution table for these data.
3. Find the real limits of the classes.
26 26 27 30 32 30 31
31 33 30 34 31 34 39
33 31 32 38 37 35 39
38 38 42 37 42 40 35
Solution:
1. Range
\[ \text{Range} = \text{max value} - \text{min value} = 42 - 26 = 16 \]
2. Number of classes
\[ \text{num. class} = 1 + 3.3 \log n = 1 + 3.3 \log 28 = 1 + 3.3(1.44) = 1 + 4.752 = 5.75 \approx 6 \]
Note: the number of classes should be no fewer than 5 and no more than 12.
3. Length of class
\[ \text{Length} = \frac{\text{range}}{\text{num. class}} \]
Note: the class length is rounded up to the nearest larger natural number.
\[ \text{Length} = \frac{16}{6} = 2.67 \approx 3 \]
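The three steps can be reproduced with a short Python sketch on the 28 data values of the example. Rounding the number of classes to the nearest whole number is an assumption made here; with the unrounded logarithm the formula gives about 5.78 rather than 5.75, and both round to 6.

```python
import math

data = [26, 26, 27, 30, 32, 30, 31,
        31, 33, 30, 34, 31, 34, 39,
        33, 31, 32, 38, 37, 35, 39,
        38, 38, 42, 37, 42, 40, 35]

# Step 1: range = max value - min value
data_range = max(data) - min(data)

# Step 2: number of classes = 1 + 3.3 * log10(n), rounded to a whole number
raw_classes = 1 + 3.3 * math.log10(len(data))
num_classes = round(raw_classes)

# Step 3: class length = range / number of classes, rounded up
class_length = math.ceil(data_range / num_classes)

print(data_range, num_classes, class_length)
```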
\[ \text{Center class} = \frac{\text{max limit} + \text{min limit}}{2} \]
\[ \text{Real min limit} = \text{center class} - \frac{1}{2}\,\text{length class} = 26 - 0.5(3) = 26 - 1.5 = 24.5 \]
\[ \text{Real max limit} = \text{center class} + \frac{1}{2}\,\text{length class} = 26 + 1.5 = 27.5 \]
\[ \text{Percentage relative frequency} = \frac{f_i}{\sum f_i} \times 100 \]
For the first class:
\[ \frac{3}{28} \times 100 = 10.71\% \]

No.    Class limits   (f)   Center of class   Real limits of class   P.re.f (%)
1.     25-27          3     26                24.5-27.5              10.714
2.     28-30          3     29                27.5-30.5              10.714
3.     31-33          8     32                30.5-33.5              28.571
4.     34-36          4     35                33.5-36.5              14.286
5.     37-39          7     38                36.5-39.5              25
6.     40-42          3     41                39.5-42.5              10.714
Total                 28                                             100
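The table can be rebuilt from the raw data with a short Python sketch. It assumes the values used in the solution (first class minimum 25, class length 3, six classes); the variable names are illustrative only.

```python
data = [26, 26, 27, 30, 32, 30, 31,
        31, 33, 30, 34, 31, 34, 39,
        33, 31, 32, 38, 37, 35, 39,
        38, 38, 42, 37, 42, 40, 35]

first_min, length, num_classes = 25, 3, 6   # values taken from the worked solution

rows = []
for i in range(num_classes):
    lo = first_min + i * length              # minimum class limit
    hi = lo + length - 1                     # maximum class limit
    f = sum(lo <= x <= hi for x in data)     # class frequency
    center = (lo + hi) / 2                   # center of class
    real = (center - length / 2, center + length / 2)  # real class limits
    pct = 100 * f / len(data)                # percentage relative frequency
    rows.append((lo, hi, f, center, real, pct))

for lo, hi, f, center, real, pct in rows:
    print(f"{lo}-{hi}  f={f}  center={center}  real={real[0]}-{real[1]}  {pct:.3f}%")

freqs = [row[2] for row in rows]
```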
Homework: In a battle, a number of soldiers were wounded, and 25 soldiers
donated blood to their fellow soldiers. The donors' blood types were:
A+ B A+ B AB
O- B O- B A+
AB A+ AB O- AB
A+ O- AB B O-
B AB A+ O- A+
Cumulative Frequency Distribution
These tables are designed to show how many observations are less than or more than
a certain value. There are two types:
1. Increasing Cumulative Frequency Distribution: a table whose purpose is to show how many
observations are less than the minimum of a certain class.
2. Decreasing Cumulative Frequency Distribution: a table whose purpose is to show how many
observations are more than the minimum of a certain class.

Example / Find the Increasing Cumulative Frequency distribution table and the Decreasing
Cumulative Frequency distribution table for the following table. Then find the number of
observations less than (71) and their percentage, as well as the number of observations
more than (71) and their percentage.

No.    Class limit   (f)
1.     31-40         1
2.     41-50         2
3.     51-60         5
4.     61-70         15
5.     71-80         25
6.     81-90         20
7.     91-100        12
Total                80
Solution: Increasing and Decreasing Cumulative Frequency tables

I.C.f   Class limit       D.C.f   Class limit
0       Less than 31      80      More than 31
1       Less than 41      79      More than 41
3       Less than 51      77      More than 51
8       Less than 61      72      More than 61
23      Less than 71      57      More than 71
48      Less than 81      32      More than 81
68      Less than 91      12      More than 91
80      Less than 101     0       More than 101

Number of observations less than 71 = 23
\[ \text{Percentage less than 71} = \frac{23}{80} \times 100 = 28.75\% \]
Number of observations more than 71 = 57
\[ \text{Percentage more than 71} = \frac{57}{80} \times 100 = 71.25\% \]
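Both cumulative tables follow from running totals of the class frequencies, as this minimal Python sketch illustrates. The boundary 101 (one past the last upper limit) is added here as an assumption to close the tables.

```python
from itertools import accumulate

lower_limits = [31, 41, 51, 61, 71, 81, 91]   # minimum limit of each class
freqs = [1, 2, 5, 15, 25, 20, 12]             # class frequencies
total = sum(freqs)

boundaries = lower_limits + [101]             # 101 closes the last class (assumption)
increasing = [0] + list(accumulate(freqs))    # observations less than each boundary
decreasing = [total - c for c in increasing]  # observations at or above each boundary

i = boundaries.index(71)
less_71, more_71 = increasing[i], decreasing[i]
pct_less = 100 * less_71 / total
pct_more = 100 * more_71 / total

for b, inc, dec in zip(boundaries, increasing, decreasing):
    print(f"less than {b}: {inc:2d}    more than {b}: {dec:2d}")
print(less_71, pct_less, more_71, pct_more)
```

At every boundary the increasing and decreasing counts add up to the total of 80, which is a quick way to check a hand-built table.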
Question: Complete the following frequency distribution table, given that n = 20:

Class       Fi    Center
   - 23     3     21
24 -        5
29 -
   -        6
   - 43     4     41