This document provides an overview of common statistical tests used to analyze different types of data, including:
1) Matched t-tests and one-way ANOVAs, which compare means between two or more groups for independent or related/matched samples.
2) Correlation coefficients, which measure the strength and direction of the linear relationship between two continuous variables.
3) Regression analysis, which describes the relationship between an independent and a dependent variable in order to predict outcomes.
Test of Difference, Correlation, Statistical Power, and Effect Size
Test of Difference: Matched t test and one-way independent ANOVA

T-test

The matched t test / paired sample t test, sometimes called the dependent sample t test, is a statistical procedure used to determine whether the mean difference between two sets of observations is zero. Typical applications:
- Before and after data
- Case control studies
- Repeated measures

Like many statistical procedures, the paired sample t test has two competing hypotheses, the null and the alternative hypothesis:
- The null hypothesis (\(H_0\)) assumes that the true mean difference (\(\mu_d\)) between the paired samples is equal to zero.
- The two-tailed alternative hypothesis (\(H_1\)) assumes that \(\mu_d\) is not equal to zero.
- The upper-tailed alternative hypothesis (\(H_1\)) assumes that \(\mu_d\) is greater than zero.
- The lower-tailed alternative hypothesis (\(H_1\)) assumes that \(\mu_d\) is less than zero.

Assumptions of the matched t test
- Normality – the population of difference scores should follow a normal distribution.
- Independent random sampling – despite the relationship between the two members of a pair of scores, each pair should be independent of all other pairs and, ideally, selected randomly from all possible pairs.
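To make the test concrete, here is a minimal sketch in Python using scipy.stats.ttest_rel; the before/after scores are made-up example data.

```python
from scipy import stats

# Hypothetical before/after scores for the same eight participants
before = [72, 68, 75, 80, 64, 70, 77, 69]
after = [75, 70, 78, 84, 66, 73, 80, 72]

# Paired (matched) t test: H0 is that the true mean difference is zero
t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
# If p < 0.05, reject H0 and conclude the mean difference is nonzero
```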
Assumptions of the one-way ANOVA for independent groups
- Independent random sampling – all of the samples in the experiment should be drawn randomly from the population of interest.
- Normal distributions – it is assumed that all of the populations are normally distributed.
- Homogeneity of variance – it is assumed that all of the populations involved have the same variance.
- Post hoc test – useful if your independent variable includes more than two groups.

ONE WAY ANOVA – short for analysis of variance
- A popular test when conducting experiments
- ANOVA tests whether two or more groups differ from each other significantly in one or more characteristics (see the code sketch after the tables below)
- ANOVA works when the factor sorts the data points into one of the groups and therefore causes the difference in the mean value of the groups
- Example fields: medicine, sociology, management studies

Choosing a test of difference by hypothesis and level of the DV (Nominal = testing differences in proportions; Ordinal = testing differences in ranks; Scale = testing means):
- One sample (compared with a standard, which may be hypothetical: a parameter mean or a parameter proportion): Nominal – chi-square goodness-of-fit test, Z test for a proportion; Ordinal – Kolmogorov–Smirnov test; Scale – t test (single mean), Z test (single mean, if the parameter variance is given)
- Two independent samples: Nominal – chi-square difference test; Ordinal – Mann–Whitney U test; Scale – independent samples t test, Z test (means, if the population variance is given), Z test (proportions)
- Two dependent samples: Nominal – McNemar test; Ordinal – binomial sign test or Wilcoxon t test for related samples; Scale – t test for paired or dependent groups
- More than two independent samples: Nominal – chi-square test for contingency tables; Ordinal – Kruskal–Wallis one-way ANOVA by ranks; Scale – one-way ANOVA with post hoc test (if one independent or grouping variable)
- More than two dependent samples: Nominal – Cochran's chi-square test; Ordinal – Friedman's test; Scale – repeated measures ANOVA (if one independent or grouping variable)
- If more than one grouping or independent variable: factorial ANOVA

CORRELATION vs. REGRESSION LINE
- Correlation measures the strength of a linear relation between 2 variables; the regression line describes the relation.
- In correlation there is no distinction between the 2 variables, and either one may be designated as X; in regression, one variable (X) is the IV and is the predictor, while the other (Y) is to be predicted (the criterion variable).
- Correlation is used when the relation between variables is first being studied; regression is used in a (much) later part of the investigation.

Tests of association by level of measurement:
- Nominal × nominal: phi coefficient (for a 2x2 design)
- Ordinal × ordinal: Spearman, Kendall's tau, Goodman–Kruskal gamma
- Scale × scale: Pearson
- Mixed levels: consider the lower level of measurement
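A minimal one-way ANOVA sketch in Python with scipy.stats.f_oneway; the three treatment groups are made-up example data.

```python
from scipy import stats

# Hypothetical scores for three independent groups
group_a = [23, 25, 21, 27, 24]
group_b = [30, 28, 33, 29, 31]
group_c = [26, 24, 27, 25, 28]

# One-way ANOVA: H0 is that all group means are equal
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
# If p < 0.05, at least one group mean differs; a post hoc test
# (e.g., Tukey HSD) then shows which specific groups differ
```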
CORRELATION COEFFICIENT
- Used to measure how strong a relationship is between two variables
- The most popular measure is Pearson's correlation, also called Pearson's r
- Used in linear regression
- Gives an idea of how well the data fit a line or curve
- Francis Galton was the first person to measure correlation, originally termed "co-relation"
- Relevant whenever you are studying the relationship between a couple of different variables

Correlation coefficient formula:
- The formula is used to find how strong a relationship is between data
- The formula returns a value between -1 and 1
- 1 indicates a strong positive relationship
- -1 indicates a strong negative relationship
- A result of zero indicates no relationship at all

WARNING: "Even though 2 variables show a high correlation (whether positive or negative), this does not show any cause-and-effect relationship between them."

Pearson
The Pearson product-moment correlation coefficient (Pearson's correlation, for short) is a measure of the strength and direction of association that exists between two variables measured on at least an interval scale.
For example, you could use a Pearson's correlation to understand whether there is an association between exam performance and time spent revising, or between depression and length of unemployment.
A Pearson's correlation attempts to draw a line of best fit through the data of two variables, and the Pearson correlation coefficient, r, indicates how far away all these data points are from this line of best fit (i.e., how well the data points fit this model/line of best fit).

Assumptions of Pearson's correlation
1. Interval or ratio level – the two variables should be measured at the interval or ratio level (they are continuous).
2. Linear relationship – there should be a linear relationship between your two variables; there are a number of ways to check whether one exists.
3. No significant outliers – outliers are single data points within your data that do not follow the usual pattern (e.g., in a study of 100 students' IQ scores, where the mean score was 108 with only small variation between students, one student had a score of 156, which is very unusual and may even put her in the top 1% of IQ scores globally).
4. Approximate normality – the variables should be approximately normally distributed.

Spearman's rank-order correlation (Spearman's rho)
- The Spearman rank-order correlation coefficient (Spearman's correlation, for short) is a nonparametric measure of the strength and direction of association that exists between two variables measured on at least an ordinal scale.
- Denoted by the symbol rs or the Greek letter ρ, pronounced "rho".

Assumptions of Spearman's correlation
1. Ordinal, interval, or ratio – the two variables should be measured on an ordinal, interval, or ratio scale (e.g., a Likert scale from strongly agree to strongly disagree).
2. Paired observations – the two variables represent paired observations.
3. Monotonic relationship – there should be a monotonic relationship between the two variables. A monotonic relationship exists when either the variables increase in value together, or as one variable's value increases, the other variable's value decreases.

Kendall's tau
Kendall's tau measures the correlation between ranks for ordinal data. It is a nonparametric measure of the strength and direction of association that exists between two variables measured on at least an ordinal scale.

Assumptions of Kendall's tau-b
1. Ordinal or continuous – the two variables should be measured on an ordinal or continuous scale.
2. Monotonic relationship – Kendall's tau-b determines whether there is a monotonic relationship between your two variables. As such, it is desirable if your data appear to follow a monotonic relationship, so that formally testing for such an association makes sense, but it is not a strict assumption or one that you are often able to assess.

Goodman–Kruskal gamma
Goodman and Kruskal's gamma (G or γ) is a nonparametric measure of the strength and direction of association that exists between two variables measured on an ordinal scale. Whilst it is possible to analyse such data using Spearman's rank-order correlation or Kendall's tau-b, Goodman and Kruskal's gamma is recommended when your data have many tied ranks.

Assumptions of Goodman and Kruskal's gamma
1. Ordinal scale – the two variables should be measured on an ordinal scale.
2. Monotonic relationship – there needs to be a monotonic relationship between the two variables. A monotonic relationship exists when either the variables increase in value together, or as one variable's value increases, the other variable's value decreases. It is typically not possible to check this assumption when running a Goodman and Kruskal's gamma analysis.
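The first three measures above are available in SciPy; here is a minimal sketch on made-up paired data (SciPy's kendalltau computes tau-b; gamma has no SciPy function, so it is omitted).

```python
from scipy import stats

# Hypothetical paired observations (e.g., hours revised vs. exam score)
x = [2, 4, 5, 6, 8, 9, 11, 12]
y = [55, 60, 58, 65, 70, 68, 75, 80]

r, p_r = stats.pearsonr(x, y)        # Pearson's r: linear association
rho, p_rho = stats.spearmanr(x, y)   # Spearman's rho: rank-based, monotonic
tau, p_tau = stats.kendalltau(x, y)  # Kendall's tau-b: rank concordance

print(f"Pearson r = {r:.3f} (p = {p_r:.4f})")
print(f"Spearman rho = {rho:.3f} (p = {p_rho:.4f})")
print(f"Kendall tau-b = {tau:.3f} (p = {p_tau:.4f})")
# All three lie between -1 and 1; values near 0 indicate no association
```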
Statistical power and effect size

Statistical power: the power of a hypothesis test is the probability of detecting an effect, if there is a true effect present to detect.

Statistical hypothesis testing
A statistical hypothesis test makes an assumption about the outcome, called the null hypothesis.
For example, the null hypothesis for the Pearson's correlation test is that there is no relationship between two variables. The null hypothesis for the Student's t test is that there is no difference between the means of two populations.

p-value (p): the probability of obtaining a result equal to or more extreme than the one observed in the data.
In interpreting the p-value of a significance test, you must specify a significance level, often referred to by the Greek lower-case letter alpha (α). A common value for the significance level is 5%, written as 0.05. The p-value is interpreted in the context of the chosen significance level. A result of a significance test is claimed to be "statistically significant" if the p-value is less than the significance level; this means that the null hypothesis (that there is no effect) is rejected. With α = 0.05:
- p > α: fail to reject H0
- p ≤ α: reject H0

Significance level (alpha): the boundary for specifying a statistically significant finding when interpreting the p-value.
We can see that the p-value is just a probability and that in actuality the result may be different. The test could be wrong. Given the p-value, we could make an error in our interpretation.
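As a tiny illustration of this decision rule in Python (the p-value here is just a made-up number):

```python
# Hypothetical p-value from some significance test
p_value = 0.031
alpha = 0.05  # chosen significance level

if p_value <= alpha:
    # Statistically significant: the null hypothesis is rejected
    print(f"p = {p_value} <= {alpha}: reject H0")
else:
    # Not significant: we fail to reject H0 (we never 'accept' it)
    print(f"p = {p_value} > {alpha}: fail to reject H0")
```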
TYPES OF ERROR

Type I error: rejecting the null hypothesis when there is in fact no significant effect (a false positive); the p-value is optimistically small. In other words, you reject the null hypothesis even though there is no real effect.
Type II error: not rejecting the null hypothesis when there is a significant effect (a false negative); the p-value is pessimistically large. In other words, you fail to reject the null hypothesis even though there is a significant effect.
- A result is significant when p is below 0.05.
- A result is not significant when p is above 0.05.

WHAT IS STATISTICAL POWER?
- Statistical power, or the power of a hypothesis test, is the probability that the test correctly rejects the null hypothesis.
- That is, the probability of a true positive result. It is only useful when the null hypothesis is rejected.
- Do not say "accept the null hypothesis" when you report a result.
- The higher the statistical power for a given experiment, the lower the probability of making a Type II (false negative) error; that is, the higher the probability of detecting an effect when there is an effect. In fact, the power is precisely the complement of the probability of a Type II error.
- More intuitively, statistical power can be thought of as the probability of accepting the alternative hypothesis when the alternative hypothesis is true.
- Power is the probability of rejecting the null hypothesis when, in fact, it is false: the probability of making a correct decision (to reject the null hypothesis) when the null hypothesis is false.
- Power is the probability that a test of significance will pick up on an effect that is present, i.e., will detect a deviation from the null hypothesis, should such a deviation exist.
- Power is the probability of avoiding a Type II error; simply put, power is the probability of not making a Type II error.
- Mathematically, power is 1 − beta. The power of a hypothesis test is between 0 and 1; if the power is close to 1, the hypothesis test is very good at detecting a false null hypothesis.
- Beta is commonly set at 0.2, but may be set by the researchers to be smaller. Consequently, power may be as low as 0.8, but may be higher. Powers lower than 0.8, while not impossible, would typically be considered too low for most areas of research.

When we interpret statistical power, we seek experimental setups that have high statistical power.
- Low statistical power: large risk of committing Type II errors, e.g. a false negative.
- High statistical power: small risk of committing Type II errors.

Power is increased when a researcher increases the sample size, as well as when a researcher increases effect sizes and significance levels. There are other variables that also influence power, including the variance (σ²).
In reality, a researcher wants both Type I and Type II errors to be small. In terms of significance level and power, Weiss says this means we want a small significance level (close to 0) and a large power (close to 1).

POWER ANALYSIS
Statistical power is one piece in a puzzle that has four related parts:
- Effect size. The quantified magnitude of a result present in the population. Effect size is calculated using a specific statistical measure, such as Pearson's correlation coefficient for the relationship between variables or Cohen's d for the difference between groups.
- Sample size. The number of observations in the sample.
- Significance. The significance level used in the statistical test, e.g. alpha, often set to 5% or 0.05.
- Statistical power. The probability of accepting the alternative hypothesis if it is true.
All four variables are related. For example, a larger sample size can make an effect easier to detect, and the statistical power can be increased in a test by increasing the significance level.

A power analysis involves estimating one of these four parameters given values for the three other parameters. This is a powerful tool in both the design and the analysis of experiments that we wish to interpret using statistical hypothesis tests.
For example, the statistical power can be estimated given an effect size, sample size, and significance level. Alternately, the sample size can be estimated given different desired levels of significance. Perhaps the most common use of a power analysis is in the estimation of the minimum sample size required for an experiment.
As practitioners, we can start with sensible defaults for some parameters, such as a significance level of 0.05 and a power level of 0.80, then estimate a desirable minimum effect size, specific to the experiment being performed. A power analysis can then be used to estimate the minimum sample size required.
In addition, multiple power analyses can be performed to provide a curve of one parameter against another, such as the change in the size of an effect given changes to the sample size. More elaborate plots can be created varying three of the parameters. This is a useful tool for experimental design.
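A minimal power analysis sketch using statsmodels (an assumed, widely used dependency); it solves for the minimum per-group sample size of an independent two-sample t test given the other three parameters.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Solve for the minimum sample size per group, given effect size
# (Cohen's d), significance level (alpha), and desired power
n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Required sample size per group: {n:.1f}")  # about 64

# The same object can instead solve for power at a fixed sample size
power = analysis.solve_power(effect_size=0.5, nobs1=40, alpha=0.05)
print(f"Power with n = 40 per group: {power:.3f}")
```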
EFFECT SIZE
- Statistical significance is the least interesting thing about the results.
- Minimize the Type I error.
- Ask what the degree of the effect in your research is.
- The lower the p-value, the more readily the null hypothesis is rejected.

Why report effect size?
- The p-value on its own is not sufficient.
- A lower p-value is sometimes interpreted as meaning there is a stronger relationship between two variables. However, statistical significance only means that it is unlikely (less than 5%) that the null hypothesis is true.
- Therefore, a significant p-value tells us that an intervention works, whereas an effect size tells us how much it works.
- It can be argued that emphasizing the size of the effect promotes a more scientific approach, since, unlike significance tests, effect size is independent of sample size.
- Unlike the p-value, effect size can be used to quantitatively compare the results of studies done in different settings. It is widely used in meta-analysis.

How to calculate and interpret effect sizes
- Effect sizes measure either the sizes of associations between variables or the sizes of differences between group means (see the sketch after this list of methods).

Cohen's d
Cohen's d is an appropriate effect size for the comparison between two means. It can be used, for example, to accompany the reporting of t-test and ANOVA results. It is also widely used in meta-analysis. For group means \(M_1\) and \(M_2\), \(d = (M_1 - M_2) / SD_{pooled}\).
Cohen suggested that d = 0.2 be considered a 'small' effect size, 0.5 a 'medium' effect size, and 0.8 a 'large' effect size. This means that if the difference between two groups' means is less than 0.2 standard deviations, the difference is negligible, even if it is statistically significant.

Glass's Δ method of effect size:
- This method is similar to Cohen's method, but it uses the standard deviation of the second (control) group: \(\Delta = (M_1 - M_2) / SD_2\).

Hedges' g method of effect size:
- This method is a modification of Cohen's d that corrects for small-sample bias; one common form is \(g = d \times (1 - 3 / (4(n_1 + n_2) - 9))\).

Cohen's f² method of effect size:
- Cohen's f² measures effect size when we use methods like ANOVA, multiple regression, etc. The Cohen's f² measure of effect size for multiple regression is defined as \(f^2 = R^2 / (1 - R^2)\), where \(R^2\) is the squared multiple correlation.

Cramer's φ or Cramer's V method of effect size:
- Chi-square is the best statistic to measure the effect size for nominal data. When a nominal variable has two categories, Cramer's phi is the best statistic to use; when there are more than two categories, Cramer's V will give the best result for nominal data.
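The mean-difference effect sizes above can be computed directly from their formulas; here is a hedged Python sketch on made-up data for two groups.

```python
import math
from statistics import mean, stdev

def cohens_d(x, y):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)

def glass_delta(x, y):
    """Glass's delta: mean difference divided by the control group's SD (y)."""
    return (mean(x) - mean(y)) / stdev(y)

def hedges_g(x, y):
    """Hedges' g: Cohen's d with a small-sample bias correction."""
    nx, ny = len(x), len(y)
    return cohens_d(x, y) * (1 - 3 / (4 * (nx + ny) - 9))

# Hypothetical treatment and control scores
treatment = [85, 88, 90, 84, 87, 91]
control = [80, 82, 84, 79, 83, 81]
print(f"d = {cohens_d(treatment, control):.2f}")
print(f"delta = {glass_delta(treatment, control):.2f}")
print(f"g = {hedges_g(treatment, control):.2f}")
```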
Odds ratio:
- The odds ratio is the odds of success in the treatment group relative to the odds of success in the control group. This method is used in cases when the data are binary, for example when outcomes are arranged in a 2×2 table of group (treatment/control) by outcome (success/failure).
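A minimal odds ratio sketch for a hypothetical 2×2 table (all counts are made up):

```python
# Hypothetical 2x2 table of counts:
#              success  failure
# treatment       20       10
# control         12       18
a, b = 20, 10  # treatment group: successes, failures
c, d = 12, 18  # control group: successes, failures

# Odds of success within each group
odds_treatment = a / b  # 2.0
odds_control = c / d    # about 0.67

# Odds ratio: treatment odds relative to control odds
odds_ratio = odds_treatment / odds_control
print(f"OR = {odds_ratio:.2f}")  # 3.00: odds of success are 3x higher
```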