Inferential Statistics
• To decide which test suits your aim, consider whether your data
meets the conditions necessary for parametric tests, the number of
samples, and the levels of measurement of your variables.
• Means can only be found for interval or ratio data, while medians and
rankings are more appropriate measures for ordinal data.
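As a small illustration of the second point, the sketch below summarizes two invented data sets: responses on a five-point agreement scale (ordinal) and exam scores (ratio). The data and variable names are hypothetical and only meant to show which summary fits which level of measurement.

```python
from statistics import mean, median

# Hypothetical responses on a 5-point agreement scale (ordinal):
# 1 = strongly disagree ... 5 = strongly agree.
responses = [1, 2, 2, 3, 3, 3, 4, 5, 5]

# The median relies only on the ordering of the categories,
# so it is a defensible summary for ordinal data.
ordinal_center = median(responses)

# Hypothetical exam scores (ratio data): distances between values are
# meaningful, so the mean is an appropriate summary.
exam_scores = [62.0, 71.5, 68.0, 80.5, 75.0]
score_mean = mean(exam_scores)

print(f"Median response: {ordinal_center}")
print(f"Mean exam score: {score_mean:.1f}")
```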
Hypothesis Testing:
• Inferential statistics involves testing hypotheses to
determine if the patterns or relationships observed in
a sample are likely to apply to the broader population.
This includes testing a null hypothesis (typically
suggesting no effect or difference) against an
alternative hypothesis.
Seven Steps to Hypothesis
Testing
(1) Making assumptions
(2) Stating the research hypothesis
(3) Stating the null hypothesis
(4) Selecting alpha
(5) Selecting the sampling distribution and specifying the test statistic
(6) Computing the test statistic
(7) Making a decision and interpreting the results
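The steps above can be sketched with a one-sample t-test. This is a minimal illustration, not a prescribed procedure: the scores, the national mean of 72, and all variable names are invented, and the critical value is read from a standard t table rather than computed.

```python
import math
from statistics import mean, stdev

# Hypothetical data: exam scores for a class of 10 students.
scores = [72, 78, 81, 69, 75, 84, 77, 80, 73, 79]
national_mean = 72.0  # hypothetical population mean assumed under H0

# Step 1: assumptions (random sample, interval/ratio data, roughly normal scores).
# Step 2: research hypothesis H1: the class mean differs from the national mean.
# Step 3: null hypothesis H0: the class mean equals the national mean.
# Step 4: select alpha.
alpha = 0.05

# Step 5: the sampling distribution is Student's t with n - 1 degrees of freedom.
n = len(scores)
df = n - 1
critical_t = 2.262  # two-tailed critical value for alpha = 0.05 and df = 9 (t table)

# Step 6: compute the test statistic.
sample_mean = mean(scores)
standard_error = stdev(scores) / math.sqrt(n)
t_value = (sample_mean - national_mean) / standard_error

# Step 7: make a decision and interpret it.
reject_null = abs(t_value) > critical_t
print(f"t({df}) = {t_value:.2f}, reject H0 at alpha = {alpha}: {reject_null}")
```

The hard-coded critical value only applies to this sample size and alpha; a different n or alpha needs a different table entry.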
The philosophy of the null
hypothesis in inferential statistics
• The null hypothesis is rooted in scientific skepticism, the
principle of falsifiability, and the aim of making cautious,
evidence-based decisions. By requiring strong
evidence to reject H0, researchers ensure rigor
and reliability in scientific conclusions. Typically,
the null hypothesis suggests "no effect" or "no
difference," setting a standard against which the
research hypothesis (alternative hypothesis) is
evaluated.
Sampling error in inferential
statistics
• Since the size of a sample is always smaller than the size of the
population, some of the population isn’t captured by sample data.
This creates sampling error, which is the difference between the true
population values (called parameters) and the measured sample
values (called statistics).
• Sampling error arises any time you use a sample, even if your sample
is random and unbiased. For this reason, there is always some
uncertainty in inferential statistics. However, using probability
sampling methods reduces this uncertainty.
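A short simulation can make sampling error concrete. The sketch below draws simple random samples from an invented population and measures how far each sample mean lands from the population mean. The population values, seed, and function name are assumptions for illustration. It also compares two sample sizes, which goes slightly beyond the text: larger random samples tend to have smaller sampling error, but the error never disappears.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 values; its mean is the parameter.
population = [rng.gauss(100, 15) for _ in range(10_000)]
parameter = mean(population)

def average_sampling_error(sample_size, repetitions=500):
    """Average absolute gap between a sample mean (statistic) and the parameter."""
    errors = []
    for _ in range(repetitions):
        sample = rng.sample(population, sample_size)  # simple random sample
        errors.append(abs(mean(sample) - parameter))
    return mean(errors)

small = average_sampling_error(25)
large = average_sampling_error(400)

print(f"Average sampling error, n = 25:  {small:.2f}")
print(f"Average sampling error, n = 400: {large:.2f}")
```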
Estimating population parameters
from sample statistics
• The characteristics of samples and populations are described by
numbers called statistics and parameters. A statistic describes a
sample (for example, the sample mean); a parameter describes the
whole population (for example, the population mean).
• The output of a one-sample t-test typically reports the following values:
• 1. **t-value:** The calculated t-statistic. It indicates how far the sample mean lies from the hypothesized population mean, in
units of standard error.
• 2. **df (degrees of freedom):** The sample size minus 1 (n - 1). It determines the critical value of the t-distribution.
• 3. **Sig. (2-tailed):** The two-tailed p-value, the probability of obtaining a result at least this extreme if the null
hypothesis is true. With alpha set at 0.05:
• - If **p < 0.05**: You reject the null hypothesis (there is a significant difference between the sample mean and the population
mean).
• - If **p ≥ 0.05**: You fail to reject the null hypothesis (no significant difference was detected).
• 4. **Mean Difference:** The difference between the sample mean and the hypothesized population mean.
• 5. **Confidence Interval (CI):** The range within which the true mean difference likely falls, based on your sample
data.
• Example interpretation (a class's average score compared with the national average):
• - The **t-value** of 2.45 means that the sample mean is 2.45
standard errors above the hypothesized population mean.
• - The **p-value (Sig.)** is 0.02, which is less than 0.05, so you **reject
the null hypothesis**. This means there is a significant difference
between the class's average score and the national average score.
• - The **mean difference** of 3.5 indicates that, on average, students in
the class scored 3.5 points higher than the national average.
• - The **95% CI** suggests that we are 95% confident that the true
mean difference lies between 0.5 and 6.5 points. Because the interval
excludes 0, it agrees with the decision to reject the null hypothesis.
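The reported numbers are tied together by simple arithmetic, which the sketch below checks. It uses only the values quoted in the example (t = 2.45, mean difference = 3.5, 95% CI from 0.5 to 6.5); the variable names are illustrative. The implied critical value is an inference, since the example does not state the sample size.

```python
# Values quoted in the example interpretation.
t_value = 2.45
mean_difference = 3.5
ci_low, ci_high = 0.5, 6.5

# t = mean difference / standard error, so the standard error can be recovered.
standard_error = mean_difference / t_value

# A CI for the mean difference is centered on the mean difference itself.
ci_center = (ci_low + ci_high) / 2
half_width = (ci_high - ci_low) / 2

# half-width = critical t * standard error, so the critical t can be recovered.
implied_critical_t = half_width / standard_error

print(f"Standard error:     {standard_error:.2f}")
print(f"CI center:          {ci_center:.1f}")
print(f"Implied critical t: {implied_critical_t:.2f}")
```

An implied critical value near 2.1 is what a t table gives for a two-tailed 5% test with roughly 18 degrees of freedom, so the figures are internally consistent with a class of about 19 students. That sample size is a back-calculation, not something the example reports.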
Confidence intervals
A confidence interval uses the variability around a statistic to come up with an interval estimate for a
parameter. Confidence intervals are useful for estimating parameters because they take sampling error
into account.
While a point estimate gives you a single best-guess value for the parameter you are interested in, a confidence
interval tells you how uncertain that point estimate is. They are best used in combination with each
other.
Each confidence interval is associated with a confidence level. The confidence level tells you the
percentage of intervals that would contain the true parameter if you repeated the study many
times.
A 95% confidence level means that if you repeated your study with a new sample in exactly the same
way 100 times, you could expect about 95 of the 100 resulting intervals to contain the true parameter.
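The repeated-study reading of a confidence level can be checked by simulation. The sketch below repeats a hypothetical study 2,000 times, builds a 95% interval around each sample mean, and counts how often the interval contains the true mean. For simplicity it assumes a normal population with a known standard deviation, so it uses a z-based interval; the population values, seed, and names are invented.

```python
import math
import random
from statistics import NormalDist, mean

rng = random.Random(7)  # fixed seed so the run is repeatable

true_mean, true_sd = 50.0, 10.0  # hypothetical population parameters
sample_size = 30
studies = 2000

# z multiplier for 95% confidence (about 1.96), valid here because the
# population standard deviation is treated as known.
z = NormalDist().inv_cdf(0.975)
margin = z * true_sd / math.sqrt(sample_size)

covered = 0
for _ in range(studies):
    sample = [rng.gauss(true_mean, true_sd) for _ in range(sample_size)]
    estimate = mean(sample)  # point estimate from this "study"
    if estimate - margin <= true_mean <= estimate + margin:
        covered += 1

coverage = covered / studies
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, though not exactly on it, because the simulation itself is subject to sampling error.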
Estimates you can make about the
population
• There are two important types of estimates you can make about the
population: point estimates and interval estimates. A point estimate
is a single value; an interval estimate is a range of values.
• No single interval is guaranteed to contain the true parameter.
However, with random sampling and a suitable sample size, you can
reasonably expect your confidence interval to contain the parameter
a certain percentage of the time.
Hypothesis testing
• Hypothesis testing is a formal process of statistical analysis using
inferential statistics. The goal of hypothesis testing is to compare
populations or assess relationships between variables using samples.