Ipsita Panda-Biostats Assignment
A chi-square (χ2) statistic is a measure of the difference between the
observed (O) and expected (E) frequencies of the outcomes of a set of events
or variables.
Chi-square depends on the size of the difference between the observed and
expected values, the degrees of freedom and the sample size.
It can also be used to test the goodness of fit between an observed distribution
and a theoretical distribution of frequencies.
The formula used is:
χ2 = Σ (O - E)^2 / E
O = observed values
E = expected values
Σ = sum over all categories
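As a minimal sketch of the formula above, the statistic can be computed in Python as follows. The function name and the coin-toss counts are illustrative assumptions, not part of the assignment data.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 coin tosses giving 60 heads and 40 tails,
# compared with the 50/50 split expected for a fair coin.
chi2 = chi_square_statistic([60, 40], [50, 50])
print(chi2)  # 4.0
```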
Goodness of fit: the chi-square test provides a way to test how well a sample
of data matches the (known or assumed) characteristics of the larger population
that the sample is intended to represent.
Example 1: In a flowering plant, white flowers (B) are dominant over red flowers (b) and
short plants (E) are dominant over tall plants (e). When two double heterozygote
(BbEe) plants were crossed, the resulting phenotypes were observed (O) as follows: white &
short (206), red & short (83), white & tall (65) and red & tall (30).
- The next step is to calculate the chi-square value for each individual
observation and then sum all the values.
- Now calculate the critical chi-square value using the following formula,
(Formula to calculate critical chi square value.)
- Since the critical chi-square value is greater than the test statistic (the
calculated chi-square value), the null hypothesis is not rejected, which is
conventionally described as accepting it.
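The steps of Example 1 can be sketched in Python as below. Two things here are assumptions rather than statements from the example: the null hypothesis is taken to be the 9:3:3:1 ratio expected for a dihybrid cross with independent assortment, and the significance level is taken to be 0.05, for which the tabulated critical value at 3 degrees of freedom is 7.815.

```python
# Observed phenotype counts from Example 1.
observed = {
    "white & short": 206,
    "red & short": 83,
    "white & tall": 65,
    "red & tall": 30,
}

# Assumed null hypothesis: phenotypes follow a 9:3:3:1 ratio.
ratio = {
    "white & short": 9,
    "red & short": 3,
    "white & tall": 3,
    "red & tall": 1,
}

total = sum(observed.values())
ratio_sum = sum(ratio.values())

# Expected count for each phenotype under the assumed ratio.
expected = {name: total * part / ratio_sum for name, part in ratio.items()}

# Chi-square contribution of each phenotype, then their sum.
contributions = {
    name: (observed[name] - expected[name]) ** 2 / expected[name]
    for name in observed
}
chi_square = sum(contributions.values())

degrees_of_freedom = len(observed) - 1

# Table value for df = 3 at the assumed significance level of 0.05.
CRITICAL_VALUE = 7.815

reject_null = chi_square > CRITICAL_VALUE

print(f"expected = {expected}")
print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}")
print("reject null hypothesis" if reject_null else "do not reject null hypothesis")
```

Under these assumptions the calculated value is about 4.324, which is below 7.815, so the observed counts are consistent with the 9:3:3:1 ratio.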
I. MEAN
II. MEDIAN
- The median is the value that divides the entire data set into two equal
parts, provided that the data are arranged in ascending or descending
order.
- The median is a positional average which locates the centre of the observations.
- If the number of observations "n" is odd, there is a unique median: the
((n + 1)/2)th observation from either end of the ordered data.
- If the number of observations "n" is even, there is no middle observation, and
the median is defined by convention as the average of the (n/2)th and
(n/2 + 1)th observations.
- In the case of a discrete frequency distribution, the first step is to arrange the data in
ascending or descending order and then find the cumulative frequencies. Then divide
the total frequency N by 2. The value whose cumulative frequency is just greater
than N/2 is the median of the data.
- In the case of a continuous frequency distribution, the first step is to find the
cumulative frequency and then apply the formula:
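For reference, the interpolation formula usually given in textbooks for a continuous frequency distribution is Median = l + ((N/2 - cf) / f) × h, where l is the lower boundary of the median class, N the total frequency, cf the cumulative frequency before the median class, f the frequency of the median class and h the class width. The Python sketch below covers the three cases described in this list. The function names and sample data are illustrative assumptions, and equal class widths are assumed for the grouped case.

```python
def median_raw(values):
    """Median of ungrouped observations."""
    data = sorted(values)
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]  # the ((n + 1)/2)th observation
    # Average of the (n/2)th and (n/2 + 1)th observations.
    return (data[mid - 1] + data[mid]) / 2


def median_discrete(values, freqs):
    """Value whose cumulative frequency is just greater than N/2."""
    half = sum(freqs) / 2
    cumulative = 0
    for value, freq in sorted(zip(values, freqs)):
        cumulative += freq
        if cumulative > half:
            return value


def median_grouped(lower_boundaries, width, freqs):
    """Median = l + ((N/2 - cf) / f) * h for equal-width classes."""
    half = sum(freqs) / 2
    cf_before = 0  # cumulative frequency before the current class
    for lower, freq in zip(lower_boundaries, freqs):
        if cf_before + freq >= half:
            return lower + (half - cf_before) / freq * width
        cf_before += freq


# Hypothetical data for each case.
print(median_raw([7, 3, 5]))                           # 5
print(median_raw([4, 1, 3, 2]))                        # 2.5
print(median_discrete([1, 2, 3, 4], [2, 3, 4, 1]))     # 3
print(median_grouped([0, 10, 20, 30], 10, [5, 8, 12, 5]))
```

In the last call the median class is 20 to 30, so the result is 20 + ((15 - 13) / 12) × 10, roughly 21.67.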
III. MODE
To calculate the mean, median and mode, the following steps are performed:
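As one possible illustration, assuming the observations are available as a plain list of numbers, all three measures can be computed with Python's standard statistics module. The sample values below are made up for the sketch.

```python
import statistics

# Hypothetical observations.
data = [2, 4, 4, 5, 10]

mean_value = statistics.mean(data)      # sum of values / number of values
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequent value

print(mean_value, median_value, mode_value)  # 5 4 4
```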
3. GRAPHS
A graph can be defined as a pictorial representation or diagram that presents data or
values in an organized manner.
I. BAR GRAPHS
A bar graph or bar chart is a visual presentation of a group of data that is made
up of horizontal or vertical rectangular bars, each of length proportional to
the value it represents.
IV. HISTOGRAM
A histogram is a graphical representation of a grouped frequency distribution
with continuous classes. It is an area diagram and can be defined as a set of
rectangles with bases along the intervals between class boundaries and
with areas proportional to the frequencies in the corresponding classes.
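Because the areas, not the heights, are proportional to the frequencies, the height of each rectangle is the frequency divided by the class width. A minimal Python sketch of that calculation follows; the function name, class boundaries and frequencies are illustrative assumptions.

```python
def histogram_heights(boundaries, freqs):
    """Height of each rectangle = class frequency / class width."""
    widths = [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]
    return [freq / width for freq, width in zip(freqs, widths)]


# Hypothetical classes 0-10, 10-20 and 20-40 (note the wider last class).
boundaries = [0, 10, 20, 40]
freqs = [5, 8, 12]

heights = histogram_heights(boundaries, freqs)
print(heights)  # [0.5, 0.8, 0.6]
```

The last class has the largest frequency but not the tallest bar, because its width is twice that of the others.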
V. OGIVE
The ogive is a graph of a cumulative frequency distribution. It shows the data
values on the horizontal axis and the cumulative frequencies, the cumulative
relative frequencies or the cumulative per cent frequencies on the vertical
axis.
There are two types of ogive: (i) less than ogive: the frequencies of all
preceding classes are added to the frequency of a class. (ii) greater than
ogive: the frequencies of all succeeding classes are added to the frequency of
a class.
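The two kinds of cumulative frequency can be sketched in Python as follows. The class boundaries and frequencies are illustrative assumptions. By the usual convention, the less than values are plotted against the upper class boundaries and the greater than values against the lower class boundaries.

```python
from itertools import accumulate

# Hypothetical classes 0-10, 10-20, 20-30 and 30-40.
boundaries = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 5]

# Less than ogive: add the frequencies of all preceding classes.
less_than = list(accumulate(freqs))

# Greater than ogive: add the frequencies of all succeeding classes.
greater_than = list(accumulate(reversed(freqs)))[::-1]

# Points to plot as (class boundary, cumulative frequency).
less_than_points = list(zip(boundaries[1:], less_than))
greater_than_points = list(zip(boundaries[:-1], greater_than))

print(less_than_points)     # [(10, 5), (20, 13), (30, 25), (40, 30)]
print(greater_than_points)  # [(0, 30), (10, 25), (20, 17), (30, 5)]
```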