statss-2

The document provides a comprehensive overview of various statistical tests, their purposes, data types, and assumptions. It categorizes tests into t-Tests, Non-Parametric Tests, ANOVA, Chi-Square Tests, Correlation and Regression, Factor Analysis, and Effect Size, along with descriptive statistics and distributions. Additionally, it includes key concepts in inference, comparisons between parametric and non-parametric tests, and a quick reference guide for selecting appropriate tests based on variable types.

Uploaded by

Čulo Ivan
Copyright
© All Rights Reserved

Here is a comprehensive table comparing the statistical tests and their typical uses:

| Category | Test/Method | Used For | Data Type |
|---|---|---|---|
| t-Tests | Independent Samples t-Test | Comparing means of two independent groups | Continuous (interval/ratio) |
| t-Tests | Equal Variance (Levene's Test) | Testing for equal variances between groups | Continuous |
| t-Tests | Unequal Variance (Welch's t-Test) | Comparing means when variances are unequal | Continuous |
| t-Tests | Normality Assumptions | Assessing normality (Shapiro-Wilk, Skewness, Kurtosis) | Continuous |
| t-Tests | Paired Samples t-Test | Comparing means of related groups or repeated measures | Continuous |
| t-Tests | Within-person differences | Assessing changes within the same individuals over time | Continuous |
| Non-Parametric Tests | Mann-Whitney U Test | Alternative to Independent t-Test for non-normal distributions or ordinal data | Ordinal, non-normal continuous |
| Non-Parametric Tests | Wilcoxon Signed-Rank Test | Alternative to Paired t-Test for non-normal distributions or ordinal data | Ordinal, non-normal continuous |
| Analysis of Variance | One-way ANOVA | Comparing means of three or more independent groups | Continuous |
| Analysis of Variance | Repeated Measures ANOVA | Comparing means of three or more related groups | Continuous |
| Analysis of Variance | Friedman's ANOVA | Non-parametric alternative for repeated measures ANOVA | Ordinal, non-normal continuous |
| Analysis of Variance | Factorial ANOVA | Examining interactions between multiple factors | Continuous |
| Chi-Square Test | Association Analysis | Determining association between categorical variables | Categorical (nominal/ordinal) |
| Chi-Square Test | Contingency Tables | Comparing expected vs. observed frequencies | Categorical |
| Correlation and Regression | Pearson Correlation | Assessing linear relationships between two continuous variables | Continuous |
| Correlation and Regression | Multiple Regression | Predicting outcomes using multiple independent variables | Continuous (predictor & outcome) |
| Factor Analysis | Data Reduction | Reducing variables into underlying factors | Continuous |
| Factor Analysis | Identifying Structures | Exploring underlying data patterns | Continuous |
| Effect Size | Cohen's d | Measuring magnitude of differences | Continuous |
| Effect Size | Practical Significance | Understanding real-world importance beyond statistical significance | Continuous |
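
To make the Welch's t-Test and Cohen's d rows concrete, here is a minimal sketch using only the Python standard library. The function names and the two small samples are hypothetical, chosen for illustration. The sketch computes the test statistic and degrees of freedom only; a p-value would need a t-distribution from a statistics package.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled SD."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances (n - 1)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def welch_t(a, b):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    se2_a = statistics.variance(a) / na  # squared standard error of each mean
    se2_b = statistics.variance(b) / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical scores for two independent groups (illustration only).
group_a = [2, 4, 6, 8]
group_b = [1, 2, 3, 4]

d = cohens_d(group_a, group_b)
t_stat, df = welch_t(group_a, group_b)
print(f"Cohen's d = {d:.3f}, Welch t = {t_stat:.3f}, df = {df:.2f}")
```

Welch's version divides each variance by its own sample size instead of pooling, which is why it remains usable when Levene's test suggests unequal variances.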

1. Descriptive Statistics

| Concept | Definition / Purpose | Type of Data | When to Use | Typical Result |
|---|---|---|---|---|
| Mean | The arithmetic average of a set of values. | Continuous (Interval/Ratio) | To find a central value for normally distributed (or roughly symmetric) data. | Single numeric value. |
| Median | The middle value in a sorted list of values. | Continuous or Ordinal | When the data is skewed or has outliers; a robust measure of central tendency. | Single numeric value. |
| Mode | The most frequently occurring value in a data set. | Any (Continuous, Ordinal, Nominal) | When identifying the most common category or value is important; also relevant for categorical data. | Single numeric value (or category). |
| Standard Deviation | A measure of how spread out values are from the mean. Most informative for roughly normal distributions. | Continuous (Interval/Ratio) | To quantify spread/dispersion in data that is approximately normally distributed. | Single numeric value (typical distance from the mean; square root of the variance). |
| Variance | The average of the squared differences from the mean. | Continuous (Interval/Ratio) | Foundation for Standard Deviation; used in inferential tests (ANOVA, t-test, etc.). | Single numeric value (square of SD). |
| Interquartile Range (IQR) | The difference between the 75th (Q3) and 25th (Q1) percentile of data. | Continuous or Ordinal | To measure the middle 50% spread; less affected by outliers than range or SD. | Single numeric value (Q3 - Q1). |
| Z-Score | Number of standard deviations a point is from the mean. | Continuous (Interval/Ratio) | To standardize values from different data sets or distributions for comparison. | Dimensionless number (distance in SD units). |
| Kurtosis | A measure of the "tailedness" (peak and tail thickness) of a distribution. | Continuous (Interval/Ratio) | To understand the shape of a distribution, particularly how heavy or light the tails are. | Single numeric value (positive, negative, or zero). |
| Skewness | A measure of the asymmetry of a distribution (positive/right or negative/left skew). | Continuous (Interval/Ratio) | When checking whether the distribution is symmetrical or skewed. | Single numeric value (positive, negative, or zero). |
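
The measures above can be computed directly with Python's built-in `statistics` module. The following is a sketch on a made-up sample. The quartile method (`"inclusive"`) and the population form of skewness are assumptions, since the table does not say which convention is intended, and other software may report slightly different quartiles.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical sample, illustration only

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
variance = statistics.variance(data)   # sample variance (divides by n - 1)
sd = statistics.stdev(data)            # square root of the sample variance

# Quartiles; "inclusive" interpolates between the sample minimum and maximum.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Z-score of each value: distance from the mean in SD units.
z_scores = [(x - mean) / sd for x in data]

# Skewness as the standardized third central moment (population form).
n = len(data)
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
skewness = m3 / m2 ** 1.5

print(mean, median, mode, variance, round(sd, 3), iqr, round(skewness, 3))
```

In this sample the mean (6) sits above the median (5) and the skewness is positive, which matches the right-skew pattern described in the table.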

2. Distributions

| Distribution | Definition / Purpose | Type of Data | When to Use | Typical Result |
|---|---|---|---|---|
| Binomial Distribution | Probability distribution of successes/failures in a fixed number of independent Bernoulli trials. | Discrete (count of events) | When the outcome is binary (success/failure) and trials are independent. | Probability of $k$ successes out of $n$. |
| Normal (Gaussian) Distribution | Symmetric, bell-shaped distribution defined by mean ($\mu$) and standard deviation ($\sigma$). | Continuous | Many biological, psychological, and natural phenomena approximate normality. | Bell curve shape; mean = median = mode. |
| Uniform (Rectangular) Distribution | All outcomes in a range are equally likely. | Continuous or Discrete | When all intervals or categories have equal probability. | Flat line or rectangular shape; constant probability. |
| Bimodal Distribution | A distribution with two distinct peaks. | Continuous | When data naturally clusters around two different modes (subpopulations, etc.). | Two peaks in the histogram or density plot. |
| Skewed Distribution (Positive/Negative) | Asymmetric distribution with a longer tail on one side. | Continuous | When data cluster on one side with a tail extending to the other (income data, reaction times). | Asymmetry; tail extends to positive or negative side. |
| Sampling Distribution of Means | Distribution of sample means for all possible samples of a given size from a population (Central Limit Theorem). | Continuous (for means) | Underpins inferential statistics and the concept of the standard error. | Approx. normal shape when $n$ is large (by CLT). |
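
As a small worked illustration of the binomial and normal rows, the sketch below computes the probability of $k$ successes out of $n$ and the share of a standard normal distribution within 1.96 standard deviations of the mean. The helper name `binomial_pmf` and the coin-flip numbers are illustrative assumptions, not part of the original notes.

```python
import math
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Probability of exactly 5 heads in 10 fair coin flips.
p_five_heads = binomial_pmf(5, 10, 0.5)

# The probabilities over all possible k (0..10) sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

# Standard normal: share of values within 1.96 SDs of the mean.
z = NormalDist(mu=0, sigma=1)
within_196 = z.cdf(1.96) - z.cdf(-1.96)

print(p_five_heads, total, round(within_196, 4))
```

The roughly 95% share within 1.96 SDs is the reason 1.96 shows up in normal-approximation 95% confidence intervals.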

3. Sampling, Parameters, and Key Concepts in Inference

| Concept | Definition / Purpose | Type of Data | When to Use | Typical Result |
|---|---|---|---|---|
| Parameter | A value that describes a characteristic of a population (e.g., true $\mu$, $\sigma$). | Population-level | When describing the entire population's measure. | Single numeric value (e.g., $\mu$). |
| Statistic | A value that describes a characteristic of a sample (e.g., $\bar{x}$, $s$). | Sample-level | When describing a sample's measure, used to estimate the population parameter. | Single numeric value (e.g., $\bar{x}$). |
| Standard Error (SE) | The standard deviation of a sampling distribution (often the distribution of sample means). | Continuous (for means) | When estimating how much a sample mean varies from one sample to another. | Single numeric value (spread of sample means). |
| Confidence Interval | A range of values likely to contain the true population parameter at a chosen confidence level (e.g., 95%). | Continuous (for means) or proportions | When wanting an interval estimate (uncertainty range) of a parameter (mean, proportion, etc.). | Interval: $[\text{Lower bound}, \text{Upper bound}]$. |
| Null Hypothesis ($H_0$) | Statement that there is no effect or difference; the default assumption in hypothesis testing. | Any (depends on test) | When performing hypothesis testing; the assumption to be tested (and possibly rejected). | Usually "no difference" or "no relationship" statement. |
| Alternative Hypothesis ($H_1$ or $H_a$) | Statement that there is an effect or difference, contradicting the null hypothesis. | Any (depends on test) | The claim you usually want to find evidence for; tested indirectly by attempting to reject $H_0$. | Usually "there is a difference" or "there is a relationship". |
| Hypothesis Testing | Statistical analysis to decide whether to reject the null hypothesis. | Any (depends on test) | Whenever making inferences about population parameters or relationships from sample data. | A p-value, decision to reject or fail to reject $H_0$. |
| Type I Error ($\alpha$) | Rejecting $H_0$ when $H_0$ is actually true (False Positive). | Any (depends on test) | Controlling this error rate is crucial in hypothesis testing (often set $\alpha = 0.05$). | Probability of a false alarm. |
| Type II Error ($\beta$) | Failing to reject $H_0$ when $H_0$ is actually false (False Negative). | Any (depends on test) | Want to keep $\beta$ low to avoid missing a true effect; related to power $= 1 - \beta$. | Probability of missing a true effect. |
| Statistical Power | Probability of correctly rejecting $H_0$ when $H_0$ is false. | Any (depends on test) | Important when designing studies to ensure enough sample size and effect detectability. | Numeric value between 0 and 1 (commonly 0.8 or higher is desired). |
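
The standard error and confidence interval rows can be sketched in a few lines. This example assumes a normal (z) critical value for simplicity; with a small sample like the hypothetical one below, a t critical value would give a somewhat wider interval. The function name and data are illustrative only.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Sample mean, its standard error, and a z-based confidence interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)     # SE = s / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return mean, se, (mean - z * se, mean + z * se)


# Hypothetical measurements (illustration only).
sample = [10, 12, 9, 11, 13, 10, 12, 11, 9, 13]

mean, se, (lower, upper) = mean_confidence_interval(sample)
print(f"mean = {mean}, SE = {se:.3f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Because the SE shrinks with the square root of $n$, quadrupling the sample size halves the width of the interval, all else being equal.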

4. Parametric vs. Non-Parametric Tests (Overview)

This table shows which tests are parametric or non-parametric, what kind of data they typically handle, and the assumptions behind them.

| Test | Parametric / Non-Parametric | When to Use | Variables & Data Type | Key Assumptions | Result / Output |
|---|---|---|---|---|---|
| t-test (general) | Parametric | Compare means of two groups | DV: Continuous (Interval/Ratio), IV: 2 groups (nominal) | 1) Normal distribution in each group; 2) Similar variances (for Student's t-test); 3) Independent observations | t-statistic, p-value |
| Independent Samples t-test | Parametric | Compare means of two independent groups | DV: Continuous, IV: Categorical with 2 levels | Same as above + groups are independent | t-statistic, p-value |
| Paired Samples t-test | Parametric | Compare means of the same group measured twice | DV: Continuous, IV: Time/Condition (within-subject) | Differences are normally distributed, measurements paired | t-statistic, p-value |
| Welch's t-test | Parametric | Compare means of two independent groups with unequal variances | DV: Continuous, IV: Categorical with 2 levels | Normal distribution in each group, but does NOT assume equal variances | t-statistic, p-value |
| Mann-Whitney U test | Non-Parametric | Compare two independent groups (median comparison) when data isn't normal | DV: Ordinal or Continuous (skewed), IV: Categorical with 2 levels | Independent observations, distributions have similar shape | U-statistic, p-value |
| Wilcoxon Signed-Rank Test (paired samples) | Non-Parametric | Compare two related groups (matched or repeated measures) when data isn't normal | DV: Ordinal or skewed Continuous, IV: repeated measure or matched pairs | Data are paired, differences in ranks tested | W-statistic, p-value |
| ANOVA (one-way) | Parametric | Compare means of 2+ groups on one factor | DV: Continuous, IV: Categorical with ≥2 levels | Normal distribution, homogeneity of variance, independent groups | F-statistic, p-value |
| Repeated Measures ANOVA | Parametric | Compare means of 2+ repeated conditions on the same participants | DV: Continuous, IV: repeated measure factor with ≥2 levels | Sphericity (or corrected with Greenhouse-Geisser), normal distribution of differences | F-statistic, p-value |
| Factorial ANOVA | Parametric | Compare means of 2+ groups across multiple categorical factors | DV: Continuous, multiple IVs (each Categorical) | Normal distribution, homogeneity of variance, independent groups | F-statistics (main effects & interactions), p-values |
| Friedman Repeated Measures ANOVA | Non-Parametric | Compare 3+ repeated measures when data isn't normal | DV: Ordinal or skewed Continuous, repeated measures on same subjects | Dependent/paired data, rank-based test | Chi-square statistic, p-value |
| Chi-Square Test | Non-Parametric | Test relationship between 2+ categorical variables | DV & IV both Categorical | Expected frequencies >5 in each cell (approx), independence of observations | Chi-square statistic, p-value |
| Chi-Square Goodness of Fit | Non-Parametric | Compare categorical data distribution to a theoretical expected distribution | DV: Categorical | Expected frequencies >5 in each category, independence | Chi-square statistic, p-value |
| Pearson's Correlation (r) | Parametric | Assess linear relationship between 2 continuous variables | Both variables: Continuous (Interval/Ratio) | Bivariate normality (roughly), linear relationship assumption | r (from -1 to +1), p-value, significance |
| Multiple Regression | Parametric | Predict a continuous DV from 2+ predictors (continuous or categorical) | DV: Continuous, IVs: Continuous or Dummy-coded Categorical | Linearity, normality of residuals, homoscedasticity, independence | Regression coefficients (Betas), p-value, R² |
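
To show what "rank-based" means for the non-parametric rows, here is a minimal sketch of the Mann-Whitney U statistic computed from average ranks. The helper names and data are hypothetical, and the sketch stops at the U statistics; a p-value would require exact tables or a normal approximation from a statistics package.

```python
def rank_with_ties(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples."""
    ranks = rank_with_ties(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Hypothetical ordinal scores for two independent groups.
group_a = [3, 5, 8]
group_b = [1, 5, 9, 10]

u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

Only the ordering of the values enters the calculation, so the statistic is unaffected by outliers or skew in the raw scores, which is why the test suits ordinal or non-normal data.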

5. Additional Statistical Concepts & Effect Sizes

| Concept | Definition / Purpose | Type of Data | When to Use | Typical Result |
|---|---|---|---|---|
| Degrees of Freedom (df) | The number of values in the final calculation that are free to vary. | Any (depends on test) | Everywhere from t-tests to ANOVA; used to reference critical values in distributions. | Numeric (depends on sample size, number of parameters). |
| Levene's Test | Tests if the variances of two or more groups are equal. | Continuous (Interval/Ratio) DV, Categorical IV | Before running parametric tests that assume equal variances (e.g., ANOVA, Student's t-test). | F-statistic, p-value; indicates variance equality or inequality. |
| Cramér's V or Phi | Effect size measure for Chi-square tests. | Categorical | When a Chi-square test is significant and you want to measure the strength of association. | Ranges 0 to 1 (0 = no association, 1 = perfect association). |
| Beta Coefficient (β) | Standardized regression coefficient indicating the unique contribution of a predictor. | Continuous DV, Continuous or Dummy IVs | In multiple regression to compare relative importance of predictors. | Numeric values (e.g., 0.30), sign indicates direction of effect. |
| Factor Loading | Correlation between observed variable and underlying factor in factor analysis. | Continuous (Interval/Ratio) variables | Factor analysis or PCA (Principal Component Analysis) contexts to see how strongly a variable relates to a factor. | Ranges -1 to +1, indicates direction and magnitude of correlation. |
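
The Cramér's V row can be illustrated together with the Chi-square statistic it is derived from. The sketch below works on a hypothetical 2x2 contingency table and applies no continuity correction, which is an assumption; some software applies Yates' correction to 2x2 tables by default and would report a smaller statistic.

```python
import math


def chi_square_independence(table):
    """Chi-square statistic, df, Cramér's V, and expected counts for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count for each cell under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Cramér's V scales chi-square by n and the smaller table dimension minus 1.
    k = min(len(row_totals), len(col_totals))
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    return chi2, df, cramers_v, expected


# Hypothetical counts: rows = group, columns = outcome (illustration only).
observed = [[30, 10],
            [10, 30]]

chi2, df, v, expected = chi_square_independence(observed)
print(chi2, df, v, min(min(row) for row in expected))
```

Printing the smallest expected count makes it easy to check the "expected frequencies >5" assumption before interpreting the result.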

6. Quick Reference: “Which Test? Which Variables?”

Below is a concise lookup matching type of variables and typical tests.

| Scenario | Type of Dependent Variable (DV) | Type of Independent Variable (IV) | Typical Test(s) |
|---|---|---|---|
| Compare 2 independent groups on a continuous DV (normal) | Continuous (Interval/Ratio) | Categorical (2 groups) | Independent t-test (Student’s or Welch’s) |
| Compare 2 related groups (pre-post) on a continuous DV (normal) | Continuous (Interval/Ratio) | Same subjects measured twice (within-subject) | Paired Samples t-test |
| Compare 2 independent groups on a non-normal or ordinal DV | Ordinal or non-normal Continuous | Categorical (2 groups) | Mann-Whitney U test |
| Compare 2 related groups on a non-normal or ordinal DV | Ordinal or non-normal Continuous | Same subjects measured twice (within-subject) | Wilcoxon Signed-Rank (Paired Sample Wilcoxon) |
| Compare 2+ groups on a continuous DV (normal) | Continuous (Interval/Ratio) | Categorical (2+ groups) | One-Way ANOVA |
| Compare multiple conditions in the same subjects (normal) | Continuous (Interval/Ratio) | Repeated measure factor | Repeated Measures ANOVA |
| Compare multiple factors on a continuous DV (normal) | Continuous (Interval/Ratio) | 2+ Categorical IVs | Factorial ANOVA |
| Compare 2+ repeated measures on non-normal or ordinal DV | Ordinal or non-normal Continuous | Repeated measure factor | Friedman Repeated Measures ANOVA |
| Relationship between two continuous variables | Continuous (Interval/Ratio) | Continuous (Interval/Ratio) | Pearson’s Correlation (r) |
| Predict a continuous DV from multiple IVs | DV = Continuous (Interval/Ratio) | IVs = Continuous or Categorical (dummy-coded) | Multiple Regression |
| Relationship between two categorical variables | Categorical (nominal) | Categorical (nominal) | Chi-Square Test, measure effect size with Cramér’s V |
| Compare observed frequencies to expected frequencies | Categorical (nominal) | Theoretical distribution (expected) | Chi-Square Goodness of Fit |
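
The lookup above can also be written as a small decision function. This is only a sketch: the function name `choose_test`, its parameters, and the scenario labels are hypothetical, and combinations the table does not list (such as three or more independent groups with a non-normal DV) are reported as not covered instead of being filled in.

```python
def choose_test(scenario, dv="continuous", levels=2, normal=True):
    """Return the typical test from the quick-reference table.

    scenario: "independent", "related", "factorial", "correlation",
              "regression", "association", or "goodness_of_fit"
    dv:       "continuous", "ordinal", or "categorical"
    levels:   number of groups or repeated conditions being compared
    normal:   whether a continuous DV is treated as roughly normal
    """
    parametric = dv == "continuous" and normal

    if scenario == "independent":
        if levels == 2:
            return ("Independent t-test (Student's or Welch's)" if parametric
                    else "Mann-Whitney U test")
        return "One-Way ANOVA" if parametric else "not covered by the table"

    if scenario == "related":
        if levels == 2:
            return "Paired Samples t-test" if parametric else "Wilcoxon Signed-Rank"
        return ("Repeated Measures ANOVA" if parametric
                else "Friedman Repeated Measures ANOVA")

    # Scenarios that map to a single test regardless of levels.
    simple = {
        "factorial": "Factorial ANOVA",
        "correlation": "Pearson's Correlation (r)",
        "regression": "Multiple Regression",
        "association": "Chi-Square Test (effect size: Cramér's V)",
        "goodness_of_fit": "Chi-Square Goodness of Fit",
    }
    return simple.get(scenario, "not covered by the table")


print(choose_test("independent", dv="ordinal"))
print(choose_test("related", dv="continuous", levels=3, normal=False))
```

A function like this is a memory aid only; whether the normality and variance assumptions actually hold still has to be checked against the data.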
