MODULE 5
EXPLAIN:
STATISTICAL DECISIONS
Very often in practice we are called upon to make decisions about populations on the basis of
sample information. Such decisions are called statistical decisions. For example, we may wish to
decide on the basis of sample data whether a new serum is really effective in curing a disease,
whether one educational procedure is better than another, or whether a given coin is loaded.
STATISTICAL HYPOTHESIS
In attempting to reach decisions, it is useful to make assumptions (or guesses) about the
populations involved. Such assumptions, which may or may not be true, are called statistical
hypotheses. They are generally statements about the probability distributions of the populations.
For a generic hypothesis test, the two hypotheses are the null hypothesis and the alternative
hypothesis, described in turn below.
In many instances we formulate a statistical hypothesis for the sole purpose of rejecting or
nullifying it. For example, if we want to decide whether a given coin is loaded, we formulate the
hypothesis that the coin is fair (i.e., p=0.5 where p is the probability of heads). Similarly, if we
want to decide whether one procedure is better than another, we formulate the hypothesis that
there is no difference between the procedures (i.e., any observed differences are due merely to
fluctuations in sampling from the same population). Such hypotheses are often called null
hypotheses and are denoted by H0.
Any hypothesis that differs from a given hypothesis is called an alternative hypothesis. For
example, if one hypothesis is p = 0.5, alternative hypotheses might be p = 0.7, p ≠ 0.5, or p > 0.5.
A hypothesis alternative to the null hypothesis is denoted by H1.
If we suppose that a particular hypothesis is true but find that the results observed in a random
sample differ markedly from the results under the hypothesis (i.e., expected on the basis of pure
chance, using sampling theory), then we would say that the observed differences are significant
and would thus be inclined to reject the hypothesis (or at least not accept it on the basis of the
evidence obtained). For example, if 20 tosses of a coin yield 16 heads, we would be inclined to
reject the hypothesis that the coin is fair, although it is conceivable that we might be wrong.
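To make the coin example concrete, the chance of a result at least as extreme as 16 heads in 20 tosses can be computed directly from the binomial distribution under the fair-coin hypothesis p = 0.5. This is a small illustrative sketch; the function name is our own:

```python
from math import comb

def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more heads."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Probability of 16 or more heads in 20 tosses of a fair coin.
upper = binom_tail(20, 16)     # about 0.0059
# By symmetry, the two-sided "at least this extreme" probability doubles it.
two_sided = 2 * upper          # about 0.0118
print(f"P(X >= 16) = {upper:.4f}, two-sided = {two_sided:.4f}")
```

Since such an extreme result would occur only about 1% of the time under the fair-coin hypothesis, rejecting that hypothesis is a reasonable decision, even though we might be wrong.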
INFERENTIAL STATISTICS AB
Procedures that enable us to determine whether observed samples differ significantly from the
results expected, and thus help us decide whether to accept or reject hypotheses, are called tests
of hypotheses, tests of significance, rules of decision, or simply decision rules.
In either of the following cases, a wrong decision or error in judgment has been made.
If we reject a hypothesis when it should be accepted, we say that a Type I error has been
made.
If, on the other hand, we accept a hypothesis when it should be rejected, we say that a
Type II error has been made.
In order for decision rules (or tests of hypotheses) to be good, they must be designed so as to
minimize errors of decision. This is not a simple matter, because for any given sample size, an
attempt to decrease one type of error is generally accompanied by an increase in the other type of
error. In practice, one type of error may be more serious than the other, and so a compromise
should be reached in favor of limiting the more serious error. The only way to reduce both types
of error is to increase the sample size, which may or may not be possible.
LEVEL OF SIGNIFICANCE
In testing a given hypothesis, the maximum probability with which we would be willing to risk a
Type I error is called the level of significance, or significance level, of the test. This probability,
often denoted by α, is generally specified before any samples are drawn so that the results
obtained will not influence our choice.
In practice, a significance level of 0.05 or 0.01 is customary, although other values are used. If,
for example, the 0.05 (or 5%) significance level is chosen in designing a decision rule, then there
are about 5 chances in 100 that we would reject the hypothesis when it should be accepted; that
is, we are about 95% confident that we have made the right decision. In such cases we say that
the hypothesis has been rejected at the 0.05 significance level, which means that we could be
wrong with probability 0.05.
To illustrate the ideas presented above, suppose that under a given hypothesis the sampling
distribution of a statistic S is a normal distribution with mean μ_S and standard deviation σ_S.
Thus the distribution of the standardized variable (or z score), given by z = (S − μ_S)/σ_S, is the
standard normal distribution (mean 0, variance 1), as shown in the figure below.
As indicated in the above figure, we can be 95% confident that if the hypothesis is true, then the z
score of an actual sample statistic S will lie between −1.96 and 1.96 (since the area under the
normal curve between these values is 0.95). However, if on choosing a single sample at random
we find that the z score of its statistic lies outside the range −1.96 to 1.96, we would conclude
that such an event could happen with a probability of only 0.05 (the total shaded area in the
figure) if the given hypothesis were true. We would then say that this z score differed
significantly from what would be expected under the hypothesis, and we would then be inclined
to reject the hypothesis.
The total shaded area 0.05 is the significance level of the test. It represents the probability of our
being wrong in rejecting the hypothesis (i.e., the probability of making a Type I error). Thus we
say that the hypothesis is rejected at the 0.05 significance level or that the z score of the given
sample statistic is significant at the 0.05 level.
The set of z scores outside the range −1.96 to 1.96 constitutes what is called the critical region
of the hypothesis, the region of rejection of the hypothesis, or the region of significance. The set
of z scores inside the range −1.96 to 1.96 is thus called the region of acceptance of the hypothesis,
or the region of nonsignificance.
the region of nonsignificance.
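The 0.05 figure attached to the ±1.96 cutoffs can be verified directly from the standard normal distribution, for instance with Python's standard library:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, variance 1

inside = z.cdf(1.96) - z.cdf(-1.96)  # area of the region of acceptance
outside = 1 - inside                 # area of the critical region (both tails)
print(f"inside = {inside:.4f}, outside = {outside:.4f}")
```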
On the basis of the above remarks, we can formulate the following decision rule (or test of
hypothesis or significance):
Reject the hypothesis at the 0.05 significance level if the z score of the statistic S lies outside the
range −1.96 to 1.96 (i.e., either z > 1.96 or z < −1.96). This is equivalent to saying that the
observed sample statistic is significant at the 0.05 level.
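The decision rule above can be written out as a short function. The sample numbers below are made up for illustration, reusing the coin example: under H0 the number of heads in 20 tosses has mean np = 10 and standard deviation √(np(1−p)) ≈ 2.236.

```python
def two_tailed_decision(s, mu_s, sigma_s, critical=1.96):
    """Apply the two-tailed rule: reject H0 when |z| exceeds the critical value."""
    z = (s - mu_s) / sigma_s  # standardize the statistic S
    decision = "reject H0" if abs(z) > critical else "do not reject H0"
    return z, decision

# Hypothetical sample: 16 heads in 20 tosses of a coin claimed to be fair.
z, decision = two_tailed_decision(16, 10.0, 2.236)
print(f"z = {z:.2f}: {decision}")
```

Here z ≈ 2.68 lies outside the range −1.96 to 1.96, so the result is significant at the 0.05 level.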
Because the z score plays such an important part in tests of hypotheses, it is also called a
test statistic.
It should be noted that other significance levels could have been used. For example, if the 0.01
level were used, we would replace 1.96 everywhere above with 2.58 (see the table below). The
confidence level can also be used, since the sum of the significance and confidence levels is 100%.
In the above test we were interested in extreme values of the statistic S, or its corresponding z
score, on both sides of the mean (i.e., in both tails of the distribution). Such tests are called two-sided tests, or
two-tailed tests.
Often, however, we may be interested only in extreme values to one side of the mean (i.e., in one
tail of the distribution), such as when we are testing the hypothesis that one process is better than
another (which is different from testing whether one process is better or worse than the other).
Such tests are called one-sided tests or one-tailed tests. In such cases the critical region is a region
to one side of the distribution, with area equal to the level of significance.
The table below, which gives critical values of z for both one-tailed and two-tailed tests at
various levels of significance, will be found useful for reference purposes. Critical values of z for
other levels of significance are found from the table of normal-curve areas.
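Rather than reading them from a printed table, the critical values of z can also be computed directly from the quantile (inverse CDF) function of the standard normal distribution; the levels shown below are the customary ones, and any other level works the same way:

```python
from statistics import NormalDist

q = NormalDist().inv_cdf  # quantile (inverse CDF) of the standard normal

for alpha in (0.10, 0.05, 0.01, 0.005):
    one_tailed = q(1 - alpha)      # all of alpha placed in a single tail
    two_tailed = q(1 - alpha / 2)  # alpha split evenly between the two tails
    print(f"alpha={alpha}: one-tailed ±{one_tailed:.3f}, two-tailed ±{two_tailed:.3f}")
```

Note that the two-tailed value at α = 0.05 is the familiar 1.96, and the two-tailed value at α = 0.01 is 2.58 (to two decimals), matching the substitution mentioned earlier.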