F Test T Test Chi Square Test
1. Introduction
2. The Chi-Square Distribution
Definition
Sampling Distribution of Sample Variance
3. The t Distribution
Definition
Rationale for t distribution
4. The F Distribution
Definition
Use of F Distribution
Introduction
We will discuss here the chi-square, t and F distributions, which underlie the chi-square test, the
t test and the F test, and their use in determining probabilities. Each is the distribution of a
variable derived from normally distributed random variables. Hence, all three are closely related to
the normal distribution.
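The Chi-Square Distribution
Definition
We take k independent random variables with standard normal distribution. Let these be
$Z_1, Z_2, \ldots, Z_k$. Then the sum of their squares,
$$\chi^2 = Z_1^2 + Z_2^2 + \cdots + Z_k^2 \qquad (1)$$
has the chi-square distribution with k degrees of freedom.
It has only one parameter, called the degrees of freedom (k). Degrees of freedom, in general, are
denoted by v.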
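The definition can be checked by simulation. The following is a minimal sketch, not a definitive implementation: it sums k squared standard normal draws many times and compares the simulated mean and variance with the theoretical values k and 2k. The function name `simulate_chi_square`, the seed, the choice k = 5 and the number of draws are illustrative assumptions.

```python
import random
import statistics

def simulate_chi_square(k, n_draws, seed=0):
    """Return n_draws simulated values of Z_1^2 + ... + Z_k^2, each Z_i ~ N(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n_draws)]

k = 5  # illustrative degrees of freedom
draws = simulate_chi_square(k, n_draws=20000)
sample_mean = statistics.fmean(draws)    # should be close to k
sample_var = statistics.variance(draws)  # should be close to 2k
print(f"k = {k}: simulated mean = {sample_mean:.2f}, simulated variance = {sample_var:.2f}")
```

Because the values are sums of squares, every draw is non-negative, which is why the distribution lives on the positive axis only.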
The $\chi^2$ probability distribution graph is depicted below for different degrees of freedom.
• The distribution is skewed to the right, and the skewness decreases as the degrees of freedom
increase (the skewness coefficient is $\sqrt{8/k}$). For few degrees of freedom, the distribution
plot is highly skewed; for many degrees of freedom, it is less skewed and becomes more
symmetrical.
• For degrees of freedom greater than 30, the distribution can be approximated by a normal
distribution, by standardizing the $\chi^2$ statistic in (1) above using the mean and variance of
the chi-square distribution (k and 2k respectively). Hence, $Z = (\chi^2 - k)/\sqrt{2k}$ is
calculated, and the area under the standard normal curve gives the approximate probability.
• The sum of two independent $\chi^2$ variables (with $n_1$ and $n_2$ degrees of freedom) is also
a $\chi^2$ random variable, with $n_1 + n_2$ degrees of freedom.
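The normal approximation can be compared with an exact tail probability. The sketch below is illustrative only: it uses the closed-form upper tail that holds for even degrees of freedom, $P(\chi^2_k > x) = e^{-x/2}\sum_{j=0}^{k/2-1}(x/2)^j/j!$, and the standardization $Z = (\chi^2 - k)/\sqrt{2k}$. The function names and the values k = 40 and x = 55.76 are assumptions chosen for the example.

```python
import math
from statistics import NormalDist

def chi2_upper_tail_even_df(x, k):
    """Exact P(chi-square_k > x) for positive even k, via the closed-form finite sum."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this closed form needs a positive even number of degrees of freedom")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for j in range(1, k // 2):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total

def chi2_upper_tail_normal_approx(x, k):
    """Approximate P(chi-square_k > x) by standardizing with mean k and variance 2k."""
    z = (x - k) / math.sqrt(2 * k)
    return 1.0 - NormalDist().cdf(z)

k, x = 40, 55.76  # illustrative values
exact = chi2_upper_tail_even_df(x, k)
approx = chi2_upper_tail_normal_approx(x, k)
print(f"exact tail = {exact:.4f}, normal approximation = {approx:.4f}")
```

Since 55.76 is roughly the tabulated 5% point for 40 degrees of freedom, the exact tail comes out close to 0.05, while the approximation is somewhat smaller because the chi-square distribution is still right-skewed at this size.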
Sampling Distribution of Sample Variance
The $\chi^2$ distribution is used to describe the sampling distribution of the sample variance. A
random sample $Y_1, Y_2, \ldots, Y_n$ from a normal population (with mean $\mu$ and variance
$\sigma^2$) provides the following sample variance:
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2$$
If $\bar{X}$ and $S^2$ are the mean and the variance of a random sample of size n from a normal
population with mean $\mu$ and standard deviation $\sigma$, then
1. $\bar{X}$ and $S^2$ are independent.
2. The random variable $(n-1)S^2/\sigma^2$ has a chi-square distribution with n-1 degrees of freedom.
Using the above results, we can determine the probability of obtaining a sample variance greater
than or equal to a given value. Statistical tables list critical values for only a limited set of
$\alpha$ values, so from tables we can usually determine only the range within which this
probability lies.
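As an illustration, the probability of a sample variance at least as large as a given value can be estimated by simulation and compared with the table. This is a sketch under stated assumptions: the population (mean 50, standard deviation 2), the sample size n = 11 and the threshold 7.3228 are hypothetical, chosen so that the statistic $(n-1)s^2/\sigma^2$ equals 18.307, the tabulated 5% point for 10 degrees of freedom.

```python
import random

def simulate_prob_variance_at_least(threshold, n, mu, sigma, n_sims, seed=0):
    """Estimate P(S^2 >= threshold) for samples of size n drawn from N(mu, sigma^2)."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        s2 = sum((y - mean) ** 2 for y in sample) / (n - 1)  # sample variance
        if s2 >= threshold:
            hits += 1
    return hits / n_sims

n, mu, sigma = 11, 50.0, 2.0  # hypothetical population and sample size
threshold = 7.3228            # hypothetical observed sample variance
chi2_stat = (n - 1) * threshold / sigma ** 2  # chi-square statistic with n - 1 = 10 d.f.
estimate = simulate_prob_variance_at_least(threshold, n, mu, sigma, n_sims=20000)
print(f"chi-square statistic = {chi2_stat:.3f}, simulated P(S^2 >= {threshold}) = {estimate:.3f}")
```

The simulated probability should land near 0.05, in line with what a chi-square table gives for 18.307 at 10 degrees of freedom.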
The T Distribution
Definition
If Y and Z are independent random variables, where Y has a chi-square distribution with v degrees
of freedom and Z has the standard normal distribution, then the distribution of
$$T = \frac{Z}{\sqrt{Y/v}} \qquad (2)$$
is called the t distribution with v degrees of freedom.
The distribution is symmetrical about zero, like the standard normal distribution, but its tails
are fatter, implying a higher probability of extreme t values than under the standard normal
distribution.
As the degrees of freedom tend to infinity, the t distribution approaches the standard normal
distribution.
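The fatter tails can be seen by simulating T directly from its definition. The following is a rough sketch: it builds $T = Z/\sqrt{Y/v}$ from independent standard normal draws and compares the two-sided tail beyond a cutoff with the corresponding normal tail. The function name, v = 5, the number of draws and the cutoff 2.571 (the tabulated upper 2.5% point for 5 degrees of freedom) are illustrative choices.

```python
import math
import random
from statistics import NormalDist

def simulate_t(v, n_draws, seed=0):
    """Draw T = Z / sqrt(Y / v), with Z ~ N(0, 1) and Y a sum of v squared N(0, 1) draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z = rng.gauss(0.0, 1.0)                               # numerator, independent of y
        y = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(v))   # chi-square with v d.f.
        draws.append(z / math.sqrt(y / v))
    return draws

v = 5           # illustrative degrees of freedom
cutoff = 2.571  # tabulated upper 2.5% point of t with 5 d.f.
t_draws = simulate_t(v, n_draws=40000)
t_tail = sum(abs(t) > cutoff for t in t_draws) / len(t_draws)
normal_tail = 2.0 * (1.0 - NormalDist().cdf(cutoff))
print(f"P(|T| > {cutoff}) ~ {t_tail:.3f} for v = {v}; normal tail = {normal_tail:.3f}")
```

The simulated t tail should be near 0.05, several times larger than the normal tail beyond the same cutoff.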
Rationale for t distribution
Recall that for random samples of size n taken from a normal population with mean $\mu$ and
variance $\sigma^2$, the sample mean $\bar{X}$ has a normal distribution with mean $\mu$ and
variance $\sigma^2/n$. Equivalently, the statistic $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ has the
standard normal distribution.
However, when the population variance is unknown, it is replaced with the sample variance.
The t distribution then gives the exact distribution of $(\bar{X} - \mu)/(S/\sqrt{n})$ for random
samples taken from a normal population when the population standard deviation is unknown.
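If $\bar{X}$ and $S^2$ are the mean and variance of a random sample of size n from a normal
population with mean $\mu$ and variance $\sigma^2$, then
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \qquad (3)$$
has the t distribution with n-1 degrees of freedom.
To see this, note that $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ has the standard normal distribution,
$Y = (n-1)S^2/\sigma^2$ has the chi-square distribution with n-1 degrees of freedom, and $\bar{X}$
and $S^2$ (and hence Z and Y) are independent.
Hence, substituting these values of Z and Y in (2), with v = n-1, yields the proof of (3).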
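Written out, the substitution is short. The following worked step is a sketch that takes $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$, $Y = (n-1)S^2/\sigma^2$ and $v = n-1$, as in the text; the unknown $\sigma$ cancels:

```latex
\begin{aligned}
T = \frac{Z}{\sqrt{Y/v}}
  &= \frac{(\bar{X}-\mu)/(\sigma/\sqrt{n})}{\sqrt{\dfrac{(n-1)S^2/\sigma^2}{n-1}}} \\
  &= \frac{(\bar{X}-\mu)/(\sigma/\sqrt{n})}{S/\sigma} \\
  &= \frac{\bar{X}-\mu}{S/\sqrt{n}}
\end{aligned}
```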
t distribution tables give the area to the right of a specified t value for a given number of
degrees of freedom. For example, for 20 degrees of freedom, the probability of getting a t value
greater than 2.086 is 0.025.
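A table entry of this kind can be reproduced numerically. The sketch below is one possible approach, not the method used to build the tables: it integrates the standard t density, $f(t) = \frac{\Gamma((v+1)/2)}{\sqrt{v\pi}\,\Gamma(v/2)}\left(1 + t^2/v\right)^{-(v+1)/2}$, with Simpson's rule and subtracts the result from 0.5. The function names and the number of intervals are assumptions.

```python
import math

def t_density(t, v):
    """Density of the t distribution with v degrees of freedom."""
    log_const = math.lgamma((v + 1) / 2) - math.lgamma(v / 2) - 0.5 * math.log(v * math.pi)
    return math.exp(log_const - (v + 1) / 2 * math.log1p(t * t / v))

def t_upper_tail(t, v, n_intervals=2000):
    """P(T > t) for t >= 0: 0.5 minus a Simpson's-rule integral of the density over [0, t]."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    if n_intervals % 2:
        n_intervals += 1  # Simpson's rule needs an even number of intervals
    h = t / n_intervals
    total = t_density(0.0, v) + t_density(t, v)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * t_density(i * h, v)
    return 0.5 - total * h / 3.0

tail_20 = t_upper_tail(2.086, 20)
print(f"P(T > 2.086) with 20 d.f. = {tail_20:.4f}")
```

For 20 degrees of freedom and t = 2.086 this should agree with the tabulated area of 0.025 to about three decimal places.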
Using the above results, we can determine the probability of obtaining a sample mean greater than
or equal to a given value when only the sample standard deviation is available. Statistical tables
list critical values for only a limited set of $\alpha$ values, so from tables we can usually
determine only the range within which this probability lies.
The F Distribution
Definition
If we take two independent chi-square variables, $Z_1$ and $Z_2$, with $k_1$ and $k_2$ degrees of
freedom, respectively, the variable
$$F = \frac{Z_1/k_1}{Z_2/k_2}$$
is said to follow the F distribution with $k_1$ and $k_2$ degrees of freedom. The F distribution
has been depicted below:
The mean and variance of the F distribution are
$$\frac{k_2}{k_2 - 2} \quad (k_2 > 2) \qquad \text{and} \qquad \frac{2k_2^2\,(k_1 + k_2 - 2)}{k_1 (k_2 - 2)^2 (k_2 - 4)} \quad (k_2 > 4),$$
respectively.
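The two formulas are easy to mistype, so a small sketch may help. The function names `f_mean` and `f_variance` are illustrative; the guards reflect the fact that the mean exists only for k2 > 2 and the variance only for k2 > 4.

```python
def f_mean(k2):
    """Mean of the F distribution; depends only on the denominator degrees of freedom."""
    if k2 <= 2:
        raise ValueError("the mean exists only for k2 > 2")
    return k2 / (k2 - 2)

def f_variance(k1, k2):
    """Variance of the F distribution with k1 and k2 degrees of freedom."""
    if k2 <= 4:
        raise ValueError("the variance exists only for k2 > 4")
    return 2 * k2 ** 2 * (k1 + k2 - 2) / (k1 * (k2 - 2) ** 2 * (k2 - 4))

# Illustrative values: k1 = 6 and k2 = 10
print(f"mean = {f_mean(10):.4f}, variance = {f_variance(6, 10):.4f}")
```

Note that the mean is always above 1 and approaches 1 as k2 grows.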
The F distribution is skewed to the right and approaches the normal distribution as both degrees
of freedom become large.
It can be shown that the square of a t distributed random variable with k degrees of freedom
has an F distribution with 1 and k degrees of freedom.
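A short sketch of why this holds, using the definitions of the t and F distributions given in the text (Z standard normal, Y chi-square with k degrees of freedom, Z and Y independent):

```latex
T^2 = \frac{Z^2}{Y/k} = \frac{Z^2/1}{Y/k},
\qquad Z^2 \sim \chi^2(1), \quad Y \sim \chi^2(k), \quad Z^2 \text{ and } Y \text{ independent}
\;\Longrightarrow\; T^2 \sim F(1, k)
```

As a numerical check against the tables: for 20 degrees of freedom the upper 2.5% point of t is 2.086, and $2.086^2 \approx 4.35$, which matches the tabulated 5% point of F with 1 and 20 degrees of freedom.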
Further, it can be shown that $F(v_1, \infty) = \chi^2(v_1)/v_1$. This implies that when the
denominator degrees of freedom are large, an F variable is approximately distributed as a
chi-square variable with the numerator degrees of freedom, divided by the numerator degrees of
freedom.
Use of F Distribution
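If $S_1^2$ and $S_2^2$ are the variances of independent random samples of size $n_1$ and $n_2$
from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$, then
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2}$$
is a random variable having an F distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.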
F distribution tables give areas under the F curve to the right of a specified F value for given
degrees of freedom. For example, the F value for the 0.05 level of significance with $k_1 = 6$ and
$k_2 = 10$ is 3.22. This implies that, with 6 and 10 degrees of freedom, the probability of
obtaining an F value (calculated by the above method) of 3.22 or greater is 0.05, or 5%.
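The table value can be checked by simulating the variance ratio itself. This is a sketch under stated assumptions: the sample sizes 7 and 11 (giving 6 and 10 degrees of freedom), the population standard deviations 1.5 and 4.0, the zero population means and the function names are all hypothetical choices for the illustration.

```python
import random

def sample_variance(values):
    """Unbiased sample variance with divisor n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def simulate_variance_ratio_tail(cutoff, n1, n2, sigma1, sigma2, n_sims, seed=0):
    """Estimate P(F >= cutoff), where F = (S1^2 / sigma1^2) / (S2^2 / sigma2^2)."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    hits = 0
    for _ in range(n_sims):
        s1_sq = sample_variance([rng.gauss(0.0, sigma1) for _ in range(n1)])
        s2_sq = sample_variance([rng.gauss(0.0, sigma2) for _ in range(n2)])
        f_value = (s1_sq / sigma1 ** 2) / (s2_sq / sigma2 ** 2)
        if f_value >= cutoff:
            hits += 1
    return hits / n_sims

# n1 = 7 and n2 = 11 give 6 and 10 degrees of freedom
f_tail = simulate_variance_ratio_tail(3.22, n1=7, n2=11, sigma1=1.5, sigma2=4.0, n_sims=20000)
print(f"simulated P(F >= 3.22) with 6 and 10 d.f. = {f_tail:.3f}")
```

The estimate should be close to 0.05 whatever population variances are chosen, because each sample variance is divided by its own population variance.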
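The table provides F values for only a few levels of significance. In order to find values for
other levels of significance (in particular, lower-tail values), we use the following result, where
$F_{\alpha}(k_1, k_2)$ denotes the F value with area $\alpha$ to its right:
$$F_{1-\alpha}(k_1, k_2) = \frac{1}{F_{\alpha}(k_2, k_1)}$$
For example, $F_{0.95}(10, 6) = 1/F_{0.05}(6, 10) = 1/3.22 \approx 0.31$.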
Using the above results, we can determine the probability of obtaining a sample variance ratio
greater than or equal to (or less than or equal to) a given value. Statistical tables list critical
values for only a limited set of $\alpha$ values, so from tables we can usually determine only the
range within which this probability lies.