
Statistics: Research Software in Language Teaching
Chapter 7: Analyses of differences between two conditions: the t-test
Ali Karimi
PRESENTATION OUTLINE

1 Analysis of two conditions
2 Analysis of differences between two independent groups
3 Confidence limits around the mean
4 Measure of effect
5 The size of the effect
6 Inferential statistics: the t-test
7 Output for the independent t-test
8 Assumptions to be met in using t-tests
9 Related t-test
10 Summary


Analysis of two conditions
The analysis of two conditions includes the following:
1. Descriptive statistics, such as means or medians, and standard deviations; confidence intervals
around the mean of both groups separately, where this is appropriate; graphical illustrations such as
box and whisker plots and error bars.
2. Effect size – a measure of the degree to which differences in a dependent variable are
attributable to the independent variable.
3. Confidence limits around the difference between the means.
4. Inferential tests – t-tests assess how likely it is that the difference between the conditions could
be attributable to sampling error, assuming the null hypothesis to be true.
Analysis of differences between two independent groups
Twenty-four people were involved in an
experiment to determine whether background
experiment to determine whether background
noise (music, slamming of doors, people making
coffee, etc.) affects short-term memory (recall of
words). Half of the sample were randomly
allocated to the NOISE condition, and half to the
NO NOISE condition.
The participants in the NOISE condition tried to
memorise a list of 20 words in two minutes, while
listening to pre-recorded noise through
earphones. The other participants wore
earphones but heard no noise as they attempted
to memorise the words. Immediately after this,
they were tested to see how many words they
recalled. The numbers of words recalled by each
person in each condition are as shown in the
Table.
Confidence limits around the
mean
The means you have obtained for your sample are point estimates. These sample means are the best
estimates of the population means. If, however, we repeated the experiment many times, we would
find that the mean varied from experiment to experiment. For example, the sample mean for the NO
NOISE condition is 13.8; if we repeated the experiment, we might find that the sample mean was 13.3.
If you repeated the experiment many times, the best estimate of the population mean would be the
mean of all the sample means. Our estimate could still differ slightly from the real population mean,
so it is more realistic to give a range (an interval estimate) rather than a single point estimate.
The interval is bounded by a lower limit (12.1 in this case) and an upper limit (15.6 in the above
example). These are called confidence limits, and the interval that the limits enclose is called the
confidence interval.
The confidence limits let you know how confident you are that the population mean is within a certain
interval: that is, it is an interval estimate for the population (not just your sample).
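To make this concrete, here is a minimal Python sketch (using numpy and scipy) of how a 95%
confidence interval around a sample mean can be computed. The recall scores are hypothetical
stand-ins, since the chapter's data table is not reproduced here.

    import numpy as np
    from scipy import stats

    # Hypothetical NO NOISE recall scores (n = 12)
    no_noise = np.array([15, 12, 14, 13, 16, 14, 12, 15, 13, 14, 13, 15])

    mean = no_noise.mean()
    sem = stats.sem(no_noise)  # standard error of the mean
    ci_low, ci_high = stats.t.interval(0.95, df=len(no_noise) - 1,
                                       loc=mean, scale=sem)

    print(f"mean = {mean:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")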
Measure of Effect
We can also take one (sample) mean from the other, to see how much they differ: 7.3 - 13.8 = -6.5
This score on its own, however, tells us very little. It becomes much more useful when converted into
a standardized score, like a z-score. This standardized measure of effect is called d; d measures the
extent to which the two means differ, in terms of standard deviations. This is how we calculate it:

    d = |mean 1 − mean 2| / mean SD

This means that we take one mean away from the other (it does not matter which is which – ignore
the sign) and divide the difference by the mean standard deviation.
Step 1: find the mean sample SD – the average of the two groups' standard deviations.
Step 2: find d – divide the difference between the means by the mean SD (see the sketch below).
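A short sketch of this two-step calculation in Python; the two groups' scores are hypothetical, and
the simple average of the two SDs is used, as described here (not the pooled estimate mentioned
later in the chapter).

    import numpy as np

    def cohens_d(group1, group2):
        # Step 1: mean sample SD – the average of the two standard deviations
        mean_sd = (np.std(group1, ddof=1) + np.std(group2, ddof=1)) / 2
        # Step 2: difference between the means, in units of that SD (sign ignored)
        return abs(np.mean(group1) - np.mean(group2)) / mean_sd

    # Hypothetical NOISE and NO NOISE recall scores
    noise = np.array([6, 8, 7, 9, 5, 8, 7, 6, 9, 8, 7, 8])
    no_noise = np.array([15, 12, 14, 13, 16, 14, 12, 15, 13, 14, 13, 15])
    print(f"d = {cohens_d(noise, no_noise):.2f}")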
The Size of the effect
Z-scores are standardized so that the mean is zero and the standard deviation is 1. You
can see that, if the means differed by 0.1, they would differ by only a tenth of a standard
deviation.
That is quite small, on our scale of 0 to 3. If the means differed by 3 standard deviations,
that would be a lot. There is no hard and fast rule about what constitutes a small or large
effect, but Cohen (1988) gave the following guidelines: d = 0.2 is a small effect, d = 0.5 a
medium effect, and d = 0.8 a large effect.
When there is little difference between our groups, the scores will overlap substantially. The
scores for each group can be plotted separately: scores for the NOISE condition will tend to be
normally distributed, and so will scores for the NO NOISE condition. If there is little difference
between the groups, the two distributions will overlap; if there is a large difference between
them, the distributions will be further apart.
Inferential statistics: the t-test
t-test: The t-test is used when we have two conditions. The t-test assesses whether there is a statistically significant
difference between the means of the two conditions.

Independent t-test: used when the participants perform in only one of the two conditions –
that is, an independent, between-participants or unrelated design.
Related t-test: used when the participants perform in both conditions – that is, a related,
within-participants or repeated measures design.
• The t-test is basically a ratio between a measure of the between-groups variance and the
within-groups variance. The larger the variance between the groups (columns), compared with
the variance within the groups (rows), the larger the t-value.
• Once we have calculated the t-value, we (or rather the computer) can find the probability of
obtaining such a t-value by chance (sampling error) if the null hypothesis were true. That
is, if there were no differences between the NOISE condition and the NO NOISE condition, how
likely is it that our value of t would be found?
• If there were no real differences between the NOISE and the NO NOISE conditions, and we took
repeated samples, most of the differences would fall around the zero mark (mean of NOISE
condition and mean of NO NOISE condition would be almost the same).
• If the null hypothesis were true, it would be more likely that our t would fall in the middle
region than in one of the ‘tails’ of the sampling distribution. This is because we know, through
the Central Limit Theorem, that most of our obtained values will fall in the middle range.
• If, for instance, our obtained t has an associated probability level of 0.03, we can say that,
assuming the null hypothesis to be true, a t-value such as the one we obtained in our experiment
would be likely to arise on only 3 occasions out of 100. Therefore, we conclude that there is a
difference between conditions that cannot be explained by sampling error (a small simulation
below illustrates this logic).
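The logic in these bullet points can be demonstrated with a small simulation – a sketch in Python
(numpy/scipy), assuming hypothetical samples of 12 drawn from a single population, i.e. with the
null hypothesis true:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)

    # Under the null hypothesis both conditions come from the SAME population,
    # so any difference between sample means is pure sampling error.
    t_values = []
    for _ in range(10_000):
        a = rng.normal(loc=10, scale=3, size=12)  # a "NOISE" sample
        b = rng.normal(loc=10, scale=3, size=12)  # a "NO NOISE" sample
        t_values.append(stats.ttest_ind(a, b).statistic)

    t_values = np.array(t_values)
    # Most t-values cluster around zero; values far out in either tail are rare.
    print("proportion of |t| > 2.07:", np.mean(np.abs(t_values) > 2.07))
    # about 0.05 – matching the two-tailed 5% critical t for df = 22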
Output for Independent T-Test
All good computer packages, such as SPSS, will give the following information:
1. Means of the two conditions, and the difference between them. What you want to know is
whether the difference between the two means is large enough to be important (not only ‘statistically
significant’, which tells you the likelihood of your test statistic being obtained, given that the null
hypothesis is true).
2. Confidence intervals: SPSS, using the t-test procedure, gives you confidence limits for the
difference between the means. The difference between means, for your sample, is a point estimate.
This sample mean difference is the best estimate of the population mean difference. If, however, we
repeated the experiment many times, we would find that the mean difference varied from experiment
to experiment. The best estimate of the population mean difference would then be the mean of all these mean
differences. It is obviously better to give an interval estimate, as explained before. Confidence limits let
you know how confident you are that the population mean difference is within a certain interval. That
is, it is an interval estimate for the population (not just for your sample).
Output for Independent T-Test
3. t-value: the higher the t-value, the more likely it is that the difference between groups is not the
result of sampling error. A negative value is just as important as a positive value. The positive/negative
direction depends on how you have coded the groups. For instance, we have called condition 1 NOISE
and condition 2 NO NOISE. This was obviously an arbitrary decision; we could just as well have called
condition 1 NO NOISE and condition 2 NOISE – this would result in exactly the same t-value, but it
would have a different sign (plus or minus).
4. p-value: this is the probability of your obtained t-value having arisen by sampling variation, or
error, given that the null hypothesis is true. A small p-value means that your obtained t falls in a
region of the curve that is uncommon – by chance alone, you would not expect your obtained t-value
to fall in this region. For instance, p = 0.001 means that there is only one chance in a thousand of
this result arising from sampling error, given that the null hypothesis is true.
Output for Independent T-Test

5. Degrees of freedom (DF): for most purposes and tests (but not all), degrees of freedom roughly
equate to sample size. For a related t-test, DF are always 1 less than the number of participants. For an
independent t-test, DF are (n1 - 1) + (n2 - 1), so for a total sample size of 20 (10 participants in each group) DF
= 18 (i.e. 9 + 9). For a within-participants design with sample size of 20, DF = 19. DF should always be
reported in your laboratory reports or projects, along with the t-value, p-value and confidence limits for
the difference between means. Degrees of freedom are usually reported in brackets, as follows: t (87)
= 0.78. This means that the t-value was 0.78, and the degrees of freedom 87. A sketch of how this
output can be reproduced appears below.
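The pieces of output listed above can also be reproduced outside SPSS; here is a sketch using scipy
with hypothetical data (the confidence_interval method on the test result requires a recent scipy
version, 1.11 or later):

    import numpy as np
    from scipy import stats

    # Hypothetical recall scores for the two independent groups
    noise = np.array([6, 8, 7, 9, 5, 8, 7, 6, 9, 8, 7, 8])
    no_noise = np.array([15, 12, 14, 13, 16, 14, 12, 15, 13, 14, 13, 15])

    res = stats.ttest_ind(noise, no_noise)   # independent t-test
    diff = noise.mean() - no_noise.mean()    # 1. difference between the means
    ci = res.confidence_interval(0.95)       # 2. CI for that difference (scipy 1.11+)
    df = len(noise) + len(no_noise) - 2      # 5. (n1 - 1) + (n2 - 1)

    print(f"mean difference = {diff:.2f}, 95% CI = [{ci.low:.2f}, {ci.high:.2f}]")
    print(f"t({df}) = {res.statistic:.2f}, p = {res.pvalue:.4f}")  # 3. and 4.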
Assumptions to be met in using
t-tests
• The t-test is a parametric test, which means that certain conditions about the distribution of the
data need to be in force: that is, data should be drawn from a normally distributed population of
scores. We assume this is the case if our sample scores are normally distributed. You can tell whether
your data are skewed by looking at histograms.
• The t-test is based on the normal curve of distribution.
• When we have different numbers of participants in the two groups, taking a simple average of the
two variances might be misleading because the formula would give the two groups equal
weighting, when in fact one group might consist of more participants. In this case we would use a
weighted average. The weighted average for the sample (called the pooled variance estimate) is
used in order to obtain a more accurate estimate of the population variance. If your data are
extremely skewed and you have very small participant numbers, you will need to consider a
non-parametric test (a quick normality check is sketched below).
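As one way of carrying out such a check – a histogram, as suggested above, or a numerical skewness
value – here is a minimal sketch with hypothetical scores:

    import numpy as np
    from scipy import stats

    # Hypothetical recall scores; a quick check before running a t-test
    scores = np.array([15, 12, 14, 13, 16, 14, 12, 15, 13, 14, 13, 15])

    print("skewness:", round(stats.skew(scores), 2))  # values near 0 suggest symmetry
    # A histogram gives the visual check described above:
    # import matplotlib.pyplot as plt; plt.hist(scores); plt.show()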
Related T-Test
The related t-test is also known as the paired t-test; these terms are interchangeable. The related t-test
is used when the same participants perform under both conditions.
• Fewer participants are needed because each participant performs in both conditions, and so each
participant can be tested against him- or herself. If we have 20 people in a related design (20
people taking part in both conditions), we would need 40 in an unrelated design (20 in each
condition).
• If you compare an independent and a related t-test using the same dataset, you will find that the
related t-test gives a result with a lower associated probability value – this is because the
comparison of participants with themselves gives rise to a reduced within-participants variance,
leading to a larger value of t (see the sketch below).
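A minimal sketch of the related t-test, assuming (hypothetically) that the same 12 participants had
produced both sets of recall scores:

    import numpy as np
    from scipy import stats

    # Hypothetical scores: the SAME participants tested in both conditions
    noise = np.array([6, 8, 7, 9, 5, 8, 7, 6, 9, 8, 7, 8])
    no_noise = np.array([15, 12, 14, 13, 16, 14, 12, 15, 13, 14, 13, 15])

    t_value, p_value = stats.ttest_rel(noise, no_noise)  # related (paired) t-test
    df = len(noise) - 1  # one less than the number of participants
    print(f"t({df}) = {t_value:.2f}, p = {p_value:.4f}")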
Summary
1. Confidence limits allow you to infer with a certain degree of confidence (usually 95%) that a
population mean (or difference between means) falls within a certain range. If we calculate
confidence intervals around our sample mean, we can get a good idea of how close it is to the
population mean.
2. d, an effect size, gives the magnitude of difference between two independent means, expressed in
standard deviations.
3. t-tests allow us to assess the likelihood of having obtained the observed differences between the
two groups by sampling error: for example, p = 0.03 means that, if we repeated our experiment 100
times using different samples, then assuming no real difference between conditions, we would
expect to find our pattern of results three times, by sampling error alone.
4. t-tests are suitable for data drawn from a normal population – they are parametric tests.
