Introduction and frequency distribution

Uploaded by

Soberly Mohanty

BIOStatistics

Lesson plan
Assignments and quiz

Sl. No.   Internal/External (MSM)                 Weightage (%)
1         Assignment 1: Digital assignment I      10
2         Assignment 2: Digital assignment II     10
3         Assignment 3: Quiz                      10
4         CAT 1                                   15
5         CAT 2                                   15
6         FAT                                     40
          Total                                   100
What is statistics?

• Methodology for
   • Data collection
   • Analysis
   • Presentation
   • Interpretation and drawing conclusions

• What kind of data, and how much, needs to be collected? (DESIGN)
• How should it be organized and summarized? (DESCRIPTION)
• How should it be analysed and interpreted? (INFERENCE)
• How can it be simplified and presented? (REPRESENTATION)
• How can we assess the strength of the conclusions and evaluate their uncertainty? (PREDICTION)

• E.g., what is the right dose of a drug for treatment?
   • Study the same patients or different ones?
   • Study the disease as a whole or at different stages?
   • How many dose conditions?
Key concepts

• Population: the collection of all individuals or items under consideration in a statistical study
   • Can be infinite (universal) or finite (limited in space or time)

• Sample: the subset of the population that is actually studied
   • Has to be finite

• Sampling unit (unit): the source of each measurement

• The population is always the target of an investigation
   • Why, then, is a sample needed?

• E.g., what is the right dose of a drug for treatment?
   • What is the population here?
   • What is the sample here?
• Descriptive statistics: methods for organizing and summarizing information

• Inferential statistics: methods for drawing conclusions and measuring their reliability (inference and prediction)

• Is there a difference between the weights of boys and girls in the class?
   • Mean, SD, graphs
• Is the difference significant (reliable)? Can it be extended to other classes?
   • Hypothesis testing, confidence intervals and significance levels (probability theory)

• Statistics can never prove anything; it only gives the probability of associations, predictions, etc.
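
The descriptive step for the boys-versus-girls question can be sketched in a few lines of Python. The weights below are invented purely for illustration, and the helper name `describe` is hypothetical. The sketch only summarizes the two groups; deciding whether the difference is significant would still need a hypothesis test.

```python
from statistics import mean, stdev

# Hypothetical weights in kg, invented purely for illustration.
boys_kg = [52.0, 48.5, 60.0, 55.5, 50.0]
girls_kg = [47.0, 45.5, 52.0, 49.5, 46.0]


def describe(values):
    """Return the sample mean and sample standard deviation (n - 1 denominator)."""
    return mean(values), stdev(values)


boys_mean, boys_sd = describe(boys_kg)
girls_mean, girls_sd = describe(girls_kg)
difference = boys_mean - girls_mean

print(f"Boys:  mean = {boys_mean:.2f} kg, SD = {boys_sd:.2f} kg")
print(f"Girls: mean = {girls_mean:.2f} kg, SD = {girls_sd:.2f} kg")
print(f"Difference in means = {difference:.2f} kg")
```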
Key concepts

• Variable: a characteristic that varies from one unit to another
   • Quantitative or numerical: involves numerical information, e.g.?
   • Qualitative or categorical: involves non-numerical information, e.g.?

• Discrete and continuous

• Can categorical variables be changed to numerical?
   • Scales
   • Coding

• Quantitative variables can be classified as discrete or continuous (e.g.?)

Frequency distribution

• A common and useful method for evaluating qualitative and quantitative data in biology

• Classification of the values of a random variable into a number of classes or class intervals, with the number of occurrences recorded for each class

• Can be represented as tables or diagrams (including graphs)
Frequency distribution of a variable

• Frequency: the number of observations/events that fall into a particular class (or category)
• Frequency distribution: a table/graph listing all classes and their frequencies

• Used for evaluating both quantitative and qualitative variables

Code   Category    Frequency
1      Married     5
2      Unmarried   8
3      Divorced    20
4      Separated   34
5      Widower     19
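
A table like the one above could be produced from raw coded observations roughly as follows. This is a minimal sketch: the coding scheme is taken from the table, but the list of observations is hypothetical and is not the data behind the frequencies shown.

```python
from collections import Counter

# Coding scheme from the table above: numeric code -> category label.
CODES = {1: "Married", 2: "Unmarried", 3: "Divorced", 4: "Separated", 5: "Widower"}

# Hypothetical raw observations recorded as codes (not the data behind the table).
observations = [1, 3, 3, 2, 4, 4, 4, 5, 3, 1]

# Count how many times each code occurs.
counts = Counter(observations)

print(f"{'Code':<6}{'Category':<12}{'Frequency':>9}")
for code, label in CODES.items():
    print(f"{code:<6}{label:<12}{counts[code]:>9}")
```

Coding a categorical variable this way makes it easier to store and count, but the codes remain labels: arithmetic such as averaging them is not meaningful.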
Relative Frequency (RF)

• Relative frequency: the frequency of a class divided by the total number of observations; often expressed as a percent

• E.g., 105 deer out of 300: the relative frequency is 105/300 = 0.35 (percent frequency is 35)

• The sum of the RFs is always 1 (100%)
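
The deer example can be checked with a short sketch. The function name `relative_frequencies` and the "other animals" category (the remaining 195 of the 300) are assumptions added for illustration.

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the total number of observations."""
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("at least one observation is required")
    return {category: count / total for category, count in frequencies.items()}


# 105 deer out of 300; the label for the other 195 is a hypothetical placeholder.
counts = {"deer": 105, "other animals": 195}
rf = relative_frequencies(counts)

for category, value in rf.items():
    print(f"{category}: RF = {value:.2f} ({value:.0%})")

print(f"Sum of RFs = {sum(rf.values()):.2f}")
```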

• Discrete frequency distribution and tally marking (e.g., counting the number of bats in a squad of 26 cricket players, and tabulating using tally marks)

• Continuous frequency distribution with class intervals

Discrete Frequency Table

No. of bats   Tally marks   Frequency   RF
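
A discrete frequency table with tally marks might be filled in as sketched below. The bat counts are hypothetical, and the sketch assumes the variable is the number of bats per player. In this text rendering, each complete group of five is written as `||||/` (four strokes plus a crossing stroke).

```python
from collections import Counter

# Hypothetical number of bats for each of the 26 players (illustration only).
bats_per_player = [1, 2, 2, 3, 1, 2, 4, 2, 3, 1, 2, 2, 3,
                   1, 2, 5, 2, 3, 1, 2, 2, 3, 4, 2, 1, 3]


def tally(count):
    """Render a count as tally marks, with '||||/' standing for a group of five."""
    fives, rest = divmod(count, 5)
    groups = ["||||/"] * fives
    if rest:
        groups.append("|" * rest)
    return " ".join(groups)


counts = Counter(bats_per_player)
n = len(bats_per_player)

print(f"{'No. of bats':<13}{'Tally marks':<22}{'Frequency':<11}{'RF'}")
for bats in sorted(counts):
    frequency = counts[bats]
    print(f"{bats:<13}{tally(frequency):<22}{frequency:<11}{frequency / n:.2f}")
```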


Continuous Frequency Table

• What about continuous quantitative variables?
   • They have to be grouped into classes before analysis
   • Find the min and max values of the variable
   • Choose non-overlapping intervals of equal length between the min and max values (class intervals or classes)

• Class limits: the end points of a class interval (lower and upper)
• Mid value or class mark: the midpoint of the class interval
• Class frequency: the number of observations in the class interval

• Overlapping and non-overlapping classes

• Real or true class limits: the value midway between the upper class limit of one class interval and the lower class limit of the next class
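
The grouping steps listed above (find the min and max, choose equal-length intervals, count the observations in each) can be sketched as follows. The data and the function name `group_into_classes` are hypothetical. The sketch treats each class as including its lower end point and excluding its upper one, except that the maximum value is placed in the last class; other conventions are equally possible.

```python
def group_into_classes(values, n_classes):
    """Split the range [min, max] into equal-width classes and count observations."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("values must not all be identical")
    width = (highest - lowest) / n_classes

    frequencies = [0] * n_classes
    for value in values:
        # The maximum would land in a class of its own, so clamp it into the last class.
        index = min(int((value - lowest) / width), n_classes - 1)
        frequencies[index] += 1

    classes = [(lowest + i * width, lowest + (i + 1) * width) for i in range(n_classes)]
    return classes, frequencies


# Hypothetical measurements, invented for illustration.
heights_cm = [150, 152, 155, 158, 160, 161, 163, 165, 168, 170]
classes, frequencies = group_into_classes(heights_cm, n_classes=4)

for (lower, upper), frequency in zip(classes, frequencies):
    midpoint = (lower + upper) / 2
    print(f"{lower:.0f}-{upper:.0f}  mid value = {midpoint:.1f}  frequency = {frequency}")
```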
Real (True) class limits
Real (True) class intervals or class boundaries

Sl. No.   Practical class interval   Class boundary   Midpoint   F
1         3.3-3.5                    3.25-3.55        3.4        2
2         3.6-3.8                    3.55-3.85        3.7        5
3         3.9-4.1                    3.85-4.15        4.0        11
4         4.2-4.4                    4.15-4.45        4.3        5
5         4.5-4.7                    4.45-4.75        4.6        2
Total                                                            25
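
The class boundaries and midpoints in this table can be recomputed from the practical class limits. The sketch below assumes the gap between consecutive classes is the same throughout (0.1 here) and uses `Decimal` to avoid floating-point rounding noise; the variable names are illustrative.

```python
from decimal import Decimal

# Practical class limits from the table, kept as strings so Decimal stays exact.
limits = [("3.3", "3.5"), ("3.6", "3.8"), ("3.9", "4.1"), ("4.2", "4.4"), ("4.5", "4.7")]
classes = [(Decimal(lower), Decimal(upper)) for lower, upper in limits]

# Gap between the upper limit of one class and the lower limit of the next.
gap = classes[1][0] - classes[0][1]
half_gap = gap / 2

rows = []
for lower, upper in classes:
    boundary = (lower - half_gap, upper + half_gap)  # real (true) class limits
    midpoint = (lower + upper) / 2                   # class mark
    rows.append((boundary, midpoint))
    print(f"{lower}-{upper}  boundary = {boundary[0]}-{boundary[1]}  midpoint = {midpoint}")
```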
Graphical data representation: Pie chart
Graphical data representation: Bar chart
Graphical data representation: Histogram
Histograms: distribution of quantitative continuous variables (no spaces between the bars)

Bar graphs: qualitative and quantitative discrete variables


Frequency polygon and curve

Polygon: for a small sample (sample distribution)


Curve: for a very large sample or the whole population (population distribution)
Line graph
[Figure panels: N = 100, N = 2000, N = whole population]
Cumulative frequency (CF) and Cumulative Relative frequency (CRF)

• Cumulative frequency: the running total of the frequencies of all classes in a frequency distribution up to and including a given class interval

• Cumulative relative frequency (CRF): the CF divided by the total number of observations; often expressed as a percent

• What do they tell us? The frequency with respect to a reference point

Sl. No.   Variable (x)   Frequency (f)   Cumulative frequency (CF)   Cumulative relative frequency (CRF)   Percent CRF
1         0-10           8               8                           8/66 = 0.12                           12
2         10-20          15              23                          23/66 = 0.35                          35
3         20-30          20              43                          43/66 = 0.65                          65
4         30-40          18              61                          61/66 = 0.92                          92
5         40-50          5               66                          66/66 = 1                             100
N = 66

How many students got marks below 30?


What was the percentage of students with marks below 30?
What about students with marks above 30?
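
The CF and CRF columns, and the answers to the three questions, can be worked out with a running total. This sketch uses the frequencies from the table; it assumes a mark of exactly 30 belongs to the 30-40 class, which the slides do not state.

```python
from itertools import accumulate

classes = ["0-10", "10-20", "20-30", "30-40", "40-50"]
frequencies = [8, 15, 20, 18, 5]

total = sum(frequencies)                 # N
cf = list(accumulate(frequencies))       # running total of the frequencies
crf = [value / total for value in cf]    # CF divided by N
percent = [round(100 * value) for value in crf]

for label, f, c, r, p in zip(classes, frequencies, cf, crf, percent):
    print(f"{label:<7}{f:>4}{c:>5}{r:>7.2f}{p:>5}")

# Reading the table at the reference point 30 (end of the third class).
below_30 = cf[2]
percent_below_30 = percent[2]
at_or_above_30 = total - below_30  # assumes a mark of exactly 30 is in the 30-40 class

print(f"Below 30: {below_30} students ({percent_below_30}%)")
print(f"30 or above: {at_or_above_30} students")
```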
Ogive curve

• Nothing but a cumulative frequency polygon

• Instead of frequency on the Y axis, it has cumulative frequency

• Shows how many observations lie above or below a reference point


Frequency table – more than and less than CF

X       f    Less than CF   More than CF
1       5    5              45
2       10   15             40
3       15   30             30
4       10   40             15
5       5    45             5
Total   45

How many students got marks less than or equal to 3?
Less than CF
How many students got marks more than or equal to 4?
More than CF
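
Both cumulative columns can be derived from the frequency column alone: the less than CF is a running total from the top, and the more than CF is a running total from the bottom. A minimal sketch using the values in the table (variable names are illustrative):

```python
from itertools import accumulate

x = [1, 2, 3, 4, 5]
f = [5, 10, 15, 10, 5]

# Running total from the first class downwards.
less_than_cf = list(accumulate(f))
# Running total from the last class upwards, then restored to table order.
more_than_cf = list(accumulate(reversed(f)))[::-1]

for value, freq, less, more in zip(x, f, less_than_cf, more_than_cf):
    print(f"{value:<4}{freq:<5}{less:<6}{more}")

# Marks less than or equal to 3: read the less than CF at X = 3.
at_most_3 = less_than_cf[x.index(3)]
# Marks more than or equal to 4: read the more than CF at X = 4.
at_least_4 = more_than_cf[x.index(4)]

print(f"Marks <= 3: {at_most_3} students")
print(f"Marks >= 4: {at_least_4} students")
```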
Plotting ogive plots

[Figure: less than and more than ogives plotted against X = 1 to 5]
Less than c.f. (ogive): below the reference point
More than c.f. (ogive): above the reference point
Continuous variable

Sl. No.   X       f     Less than CF   Plotting less than ogive   More than CF   Plotting more than ogive
1         0-10    8     8              (10, 8)                    130            (0, 130)
2         10-20   16    24             (20, 24)                   122            (10, 122)
3         20-30   30    54             (30, 54)                   106            (20, 106)
4         30-40   35    89             (40, 89)                   76             (30, 76)
5         40-50   15    104            (50, 104)                  41             (40, 41)
6         50-60   26    130            (60, 130)                  26             (50, 26)
Total             130
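
The plotting points in this table follow a simple rule: the less than ogive pairs each class's upper limit with its less than CF, and the more than ogive pairs each class's lower limit with its more than CF. The sketch below reproduces the points from the class limits and frequencies; it only computes coordinates and leaves the actual plotting to whichever tool is preferred.

```python
from itertools import accumulate

class_limits = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [8, 16, 30, 35, 15, 26]

less_than_cf = list(accumulate(frequencies))
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

# Less than ogive: (upper class limit, less than CF).
less_than_points = [(upper, cf) for (_, upper), cf in zip(class_limits, less_than_cf)]
# More than ogive: (lower class limit, more than CF).
more_than_points = [(lower, cf) for (lower, _), cf in zip(class_limits, more_than_cf)]

print("Less than ogive:", less_than_points)
print("More than ogive:", more_than_points)
```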
Exercise 1

• No. of hours of sleep for subjects under the influence of a sleep-inducing drug (sedative)
   • Create a non-overlapping frequency distribution table
   • Create a relative frequency distribution table
   • Create a cumulative relative frequency distribution table

• What kind of data is this: discrete or continuous?
• Is the drug effective?
Exercise 2

• The data given below represent the lengths (in mm) of a sample of fish collected from a pond: 22, 15, 38, 21, 30, 44, 34, 45, 55, 23, 24, 76, 32, 43, 65, 23, 12, 10, 8, 46, 76, 43, 55, 22, 20, 40, 87, 99, 67, 49, 37, 20, 68, 98, 78, 66, 34, 54, 65, 27, 34, 35, 36, 37, 38, 39, 50, 41, 45, 43, 42, 32, 48, 49, 60, 67, 68, 42, 22, 12, 21, 23, 34, 64, 69, 70

• Using all of the given data, construct a frequency table with tally marks.
