Unit I
Parametric Tests
A statistical test in which specific assumptions are made about the
population parameters.
Parametric tests assume that the data follow a specific distribution, usually
a normal distribution; that variances are homogeneous (equal variances across
groups); and that the data are measured on an interval or ratio scale.
The observations must be independent: the inclusion or exclusion of any
case in the sample should not unduly affect the results of the study.
The meaningfulness of the results of a parametric test depends on the
validity of these assumptions.
Parametric tests are sensitive to outliers and non-normality, which can affect
the validity of the results.
Parametric tests are useful as these tests are most powerful for testing the
significance of the computed sample statistics.
Examples: t-test (compares means between two groups), ANOVA (compares
means across multiple groups)
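The two worked examples above can be sketched with scipy.stats; the scores below are hypothetical data invented for illustration:

```python
from scipy import stats

group_a = [23, 25, 28, 30, 26]   # hypothetical scores under condition A
group_b = [31, 29, 33, 35, 30]   # condition B
group_c = [27, 26, 29, 31, 28]   # condition C

# t-test: compares the means of two groups
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# one-way ANOVA: compares means across three or more groups
f_stat, f_p = stats.f_oneway(group_a, group_b, group_c)

print(f"t = {t_stat:.2f}, p = {t_p:.3f}")
print(f"F = {f_stat:.2f}, p = {f_p:.3f}")
```

Both functions return the test statistic together with a p-value, which is then compared with the chosen level of significance.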
Non-Parametric Tests
They are distribution-free tests: statistical tests that are not based on a
normal distribution of the data or on any other distributional assumption.
The test statistic does not depend on the form of the population distribution
and can be applied to data measured on ordinal or nominal scales, as well as
interval and ratio scales.
Results are interpreted based on ranks or medians.
Non-parametric tests are less sensitive to outliers and can be used for
skewed distributions and small sample sizes.
The use of non-parametric tests is recommended in the following situations:
Sample size is quite small, as small as N=5 or N=6
Assumption like normality of the distribution of scores in the population
are doubtful
When the measurement of data is available either in the form of ordinal
or nominal scales or when the data can be expressed in the form of
ranks
Non-parametric tests typically make fewer assumptions about the data and
may therefore be the more appropriate choice in a particular situation.
Examples: Spearman's rank correlation (measures the strength and direction
of the association between two ranked variables), chi-square test (tests the
association between categorical variables)
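The two examples can likewise be sketched with scipy.stats; the ranks and counts below are hypothetical:

```python
from scipy import stats

# Spearman's rank correlation: two judges rank the same six items
judge1 = [1, 2, 3, 4, 5, 6]
judge2 = [2, 1, 4, 3, 6, 5]
rho, rho_p = stats.spearmanr(judge1, judge2)

# Chi-square test of association: rows and columns are categories
# (e.g. group membership vs. stated preference), cells are counts
table = [[20, 30],
         [25, 25]]
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

print(f"rho = {rho:.3f}, p = {rho_p:.3f}")
print(f"chi2 = {chi2:.3f}, df = {dof}, p = {chi_p:.3f}")
```

Note that both tests work on ranks or category counts, not on means, which is what makes them distribution-free.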
Z-test
Random Sampling
A random sample of a given population is a sample so drawn that each
possible sample of that size has an equal probability of being selected from
the population.
The method of selection, not the particular sample outcome, defines a
random sample.
There are two sampling plans that yield a random sample:
sampling with replacement- in which an element may appear more
than once in a sample
sampling without replacement- in which no element may appear more
than once
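The two sampling plans can be illustrated with Python's standard random module; the five-element population below is hypothetical:

```python
import random

random.seed(0)  # fixed seed so the draw is reproducible
population = ["A", "B", "C", "D", "E"]

# Sampling with replacement: an element may appear more than once
with_repl = random.choices(population, k=3)

# Sampling without replacement: no element may appear more than once
without_repl = random.sample(population, k=3)

print(with_repl)
print(without_repl)
```

`random.sample` can never return duplicates, whereas `random.choices` can, which is exactly the distinction between the two plans.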
A random sampling distribution of the mean is the relative frequency
distribution of mean (X-bar) obtained from all possible random samples of a
given size that could be drawn from a given population.
Characteristics of the Random Sampling Distribution of the
Mean
The expected value of the sample mean is the same as the mean of the
population of scores from which the samples were drawn.
The central limit theorem also states that the sampling distribution will have
the following properties:
The mean of the sampling distribution will be equal to the mean of the
population distribution: μ(X-bar) = μ
The variance of the sampling distribution will be equal to the variance of
the population divided by the sample size: σ²(X-bar) = σ²/n; equivalently,
the standard error of the mean is σ(X-bar) = σ/√n
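These properties of the sampling distribution of the mean can be checked by simulation; the skewed population below is generated artificially for illustration:

```python
import random
import statistics

random.seed(42)
# A deliberately non-normal (right-skewed) population of scores
population = [random.expovariate(1.0) for _ in range(100_000)]
mu = statistics.fmean(population)
sigma2 = statistics.pvariance(population)

# Draw many random samples of size n (with replacement) and record each mean
n = 36
sample_means = [statistics.fmean(random.choices(population, k=n))
                for _ in range(20_000)]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)

print(mean_of_means, mu)          # close to the population mean
print(var_of_means, sigma2 / n)   # close to sigma^2 / n
```

Even though the population is skewed, with n = 36 the distribution of the sample means is close to normal, which is the central limit theorem at work.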
Testing Hypothesis
A hypothesis is a statement about a population parameter to be subjected to
test and, on the outcome of the test, to be retained or rejected.
The key to any problem in statistical inference is to discover what sample
values will occur by chance in repeated sampling and with what probability.
Sampling Distribution: a theoretical relative frequency distribution of the
values of a statistic that would be obtained by chance from an infinite
number of samples of a particular size drawn from a given population.
Probability Samples: samples for which the probability of inclusion in the
sample of each element in the population is known.
The goal of hypothesis testing is to make inferences about a population
based on a sample.
Null Hypothesis
The hypothesis that a researcher tests is called the null hypothesis,
symbolized Ho. It is the hypothesis that he or she will decide to retain or
reject.
The null hypothesis is simply whatever hypothesis we choose to test.
The null hypothesis implies a statement that expects no difference or effect.
Level of Significance
The level of significance, symbolized by alpha, is the probability criterion
chosen in advance for deciding whether an obtained sample result is too rare
to be attributed to chance when Ho is true; .05 and .01 are common choices.
Conclusion
Rejecting the null hypothesis- the obtained sample statistic (X-bar) has a
low probability of occurring by chance if the value of the population
parameter stated in Ho is true
Retaining the null hypothesis- we do not have sufficient evidence to reject
Ho
Large samples increase the precision by reducing sampling variation.
Alternate Hypothesis
Alternative hypothesis: a hypothesis about a population parameter that
contradicts the null hypothesis.
Ha is the symbol for the alternative hypothesis.
The alternative hypothesis is one that expects some difference or effect.
The alternative hypothesis may be directional or non-directional
The time to decide on the nature of the alternative hypothesis is at the
beginning of the study, before the data are collected.
Assumptions of Z-test
A random sample has been drawn from the population. This ensures that
each member of the population has an equal chance of being included in the
sample, which helps in making the sample representative of the population.
The sample has been drawn by the with-replacement sampling plan.
The sampling distribution of Mean follows the normal curve. When the
scores in the population are not normally distributed, the central limit
theorem comes to the rescue when the sample size is 30 or larger
The standard deviation of the population of scores is known.
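Under these assumptions the z-test can be computed directly; the values of Ho, σ, n, and the obtained mean below are hypothetical:

```python
from math import sqrt, erf

mu0 = 100    # population mean stated in Ho (hypothetical)
sigma = 15   # known population standard deviation (assumption of the z-test)
n = 36
xbar = 106   # obtained sample mean (hypothetical)

# z = (X-bar - mu0) / (sigma / sqrt(n))
z = (xbar - mu0) / (sigma / sqrt(n))

def normal_cdf(x):
    """Standard normal cumulative probability via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

# Two-tailed p-value: probability of a result this deviant in either direction
p_two_tailed = 2 * (1 - normal_cdf(abs(z)))

print(f"z = {z:.2f}, p = {p_two_tailed:.4f}")
```

With alpha = .05 two-tailed, a z this large would lead to rejecting Ho.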
Student's Distribution of t
Characteristics
Student’s distribution of t is not a single distribution, but rather a family of
distributions. They differ in their degree of approximation to the normal
curve
The mean of the t-distribution is zero.
The distributions are symmetrical.
They are unimodal.
They are leptokurtic compared to the normal distribution (i.e., narrower at
the peak, with a greater concentration in the tails than the normal curve).
The shape of the t-distribution depends on the degrees of freedom, which
are typically related to the sample size, df = n - 1
As the degrees of freedom increase, the t-distribution approaches the
normal distribution. For df > 30, the t-distribution is very close to the normal
distribution.
Has a standard deviation larger than that of the standard normal
distribution (σz = 1); for df > 2, σt = √(df / (df − 2)).
As the degrees of freedom increase, this standard deviation approaches 1,
the standard deviation of the standard normal distribution.
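The standard deviation of Student's t is √(df / (df − 2)) for df > 2, so its approach to 1 can be shown in a few lines:

```python
from math import sqrt

# sigma_t = sqrt(df / (df - 2)) for df > 2
sd_t = {df: sqrt(df / (df - 2)) for df in (5, 10, 30, 100)}

for df, sd in sd_t.items():
    print(df, round(sd, 3))
# The values shrink toward 1 (the standard normal's sigma) as df grows
```

This is one concrete sense in which the t-distribution "approaches the normal distribution" as df increases.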
Assumptions of t-test
The t-test is a statistical test used to compare the means of two groups.
Difference from the z-test: the population standard deviation is unknown and
must be estimated from the sample by s, so the test statistic follows
Student's t-distribution rather than the normal curve.
Degrees of Freedom
The degrees of freedom associated with s, the estimated standard deviation
of a population, correspond to the number of observations that are
completely free to vary.
df = n − 1
The degrees of freedom of a test statistic determine the critical value of
the hypothesis test.
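How the degrees of freedom determine the critical value can be seen by looking up critical t-values with scipy.stats (the sample size n = 10 is hypothetical):

```python
from scipy import stats

alpha = 0.05
n = 10
df = n - 1

# Two-tailed critical values: the cutoffs beyond which Ho is rejected
t_crit = stats.t.ppf(1 - alpha / 2, df)  # critical t for df = 9
z_crit = stats.norm.ppf(1 - alpha / 2)   # critical z, for comparison

# With few degrees of freedom the critical t exceeds the critical z,
# so a larger test statistic is needed to reject Ho.
print(round(t_crit, 3), round(z_crit, 3))
```

As df grows, the critical t shrinks toward the critical z, mirroring the convergence of the t-distribution to the normal curve.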
P-value
The p-value is the probability, when Ho is true, of observing a sample mean
as deviant or more deviant (in the direction specified in Ha) than the
obtained value of the mean.
It measures the strength of evidence against the null hypothesis.
It is not established in advance and is not a statement of risk; it simply
describes the rarity of the sample outcome if Ho is true.
If the p-value is less than or equal to the level of significance (alpha),
reject the null hypothesis: the observed effect is statistically significant.
If the p-value is greater than alpha, do not reject the null hypothesis:
there is not enough evidence to conclude that the effect is statistically
significant.
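The decision rule reduces to a single comparison; the p-values passed in below are hypothetical:

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value decision rule for a test at level alpha."""
    if p_value <= alpha:
        return "reject Ho"   # the effect is statistically significant
    return "retain Ho"       # not enough evidence against Ho

print(decide(0.03))   # below alpha: reject Ho
print(decide(0.20))   # above alpha: retain Ho
```

Note that "retain Ho" does not prove Ho true; it only means the sample did not provide sufficient evidence against it.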