META - Spatial Training Conference - Sat PM
A Meta-analysis
Nora S. Newcombe
Temple University
Importance of training?
• Potential to improve skills relevant to STEM (Hedges &
Chung, in prep; Shea, Lubinski & Benbow, 2001)
– High spatial ability: More likely to have STEM major and STEM job
– Can also reduce disparities in STEM achievement
• How and how much?
• Goal: To aggregate systematically past research on
spatial training to determine consensus in literature.
http://www.spatialintelligence.org
Overview
What is training?
How can we compare training effectiveness across
studies?
Research questions:
1. How much do (vs. can) spatial skills improve?
• Might vary by task – Embedded Figures vs. Water-level Task?
2. What works?
• Impact of grouping variables
3. Are training effects durable?
4. Does training generalize (transfer) to untrained tests?
Examples: Video games
• Effect of playing videogames (Tetris) on mental rotation and
Paper Folding Test (Wright, Thompson, Ganis, Newcombe &
Kosslyn, 2008).
Examples: Spatial coursework
Engineering course using multi-media software and
workbook (Sorby, 2008)
• Isometric pictorials from coded plans
• Multi-view drawings
• Paper folding/2-D to 3-D transformations
• Object rotations about one axis
• Object rotations about two or more axes
• Cutting planes and cross sections
• Surfaces and solids of revolution
• Combining solids
Improved Purdue Spatial Visualization Test performance:
g = 2.02
Examples: Repeated practice
Repeated practice on different Group Embedded Figures
(Chance & Goldstein, 1971; Schaeffer & Thomas, 1999)
g = 1.12
Methods
• Searched for both published and unpublished work:
– Dissertations, conference posters, technical reports.
– Electronic searches, references lists, direct contacts
• Coded on several grouping variables, including:
– Age, sex, ability level (i.e., prescreened for low
performers?)
– Outcome measure, type of training
– Publication status, random assignment, location of study
(classroom?), feedback provided (yes/no), training
frequency
Effect Sizes
• Standard measure of efficacy across studies
– Does not depend on individual measurement (raw score)
– Expresses mean change, as a result of training or
experience, in standard deviation units.
• Final “sample”
– 101 published (76) and unpublished (25) studies
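The slides report effect sizes as g but do not give the formula. A minimal sketch, assuming g denotes Hedges' g (a standardized mean difference with a small-sample correction); the function name, argument names, and the input numbers are illustrative assumptions, not values from the meta-analysis:

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (g, variance) for a treatment vs. comparison contrast.

    Assumption: g is Hedges' g, i.e. Cohen's d multiplied by the
    small-sample correction factor J. The talk may have used a variant.
    """
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd
    # Small-sample correction factor (common approximation).
    j = 1 - 3 / (4 * df - 1)
    # Large-sample variance of d, scaled by J squared to give the variance of g.
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return j * d, j ** 2 * var_d


# Made-up scores: the difference is expressed in standard deviation units,
# so it does not depend on the raw-score scale of the test.
g, var_g = hedges_g(mean_t=12.0, mean_c=10.0, sd_t=4.0, sd_c=4.0, n_t=20, n_c=20)
print(round(g, 3), round(var_g, 3))
```

Because the result is unit-free, the same function could be applied to any outcome measure before pooling across studies.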
Analysis Plan
• How do we make sense of various training methods and
dependent variables?
• Created 5 conceptual categories of dependent
variables and 3 categories of training.
• Describe each category then compare size of training
effects in each category.
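One common way to compare the size of effects across categories is a between-groups Q statistic. The sketch below assumes a fixed-effect, inverse-variance approach, which the slides do not confirm; the category labels and all numbers are hypothetical:

```python
def compare_categories(groups):
    """Compare weighted mean effect sizes across categories.

    groups: dict mapping a category label to a list of (g, variance) pairs.
    Returns (per_category_means, q_between, degrees_of_freedom).
    Assumption: fixed-effect, inverse-variance weighting.
    """
    means = {}
    weights = {}
    for label, studies in groups.items():
        w = [1.0 / var for _, var in studies]
        weights[label] = sum(w)
        means[label] = sum(wi * g for wi, (g, _) in zip(w, studies)) / sum(w)
    total_weight = sum(weights.values())
    grand_mean = sum(means[k] * weights[k] for k in groups) / total_weight
    # Q_between: weighted squared deviations of category means from the grand mean.
    q_between = sum(weights[k] * (means[k] - grand_mean) ** 2 for k in groups)
    return means, q_between, len(groups) - 1


# Hypothetical data, not taken from the meta-analysis.
example = {
    "category_a": [(0.2, 0.05), (0.4, 0.05)],
    "category_b": [(0.8, 0.05), (1.0, 0.05)],
}
means, q, df = compare_categories(example)
print(means, round(q, 2), df)
```

Under these assumptions, Q_between is referred to a chi-square distribution with (number of categories - 1) degrees of freedom.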
• Categories of dependent variables
• Categories of training
– Video games: Designed for recreation and entertainment.
– Courses: Full-length or short-term enhancements. Example: dress-making, spatial modules, drafting (vs. water purification).
– Spatial task training, specific: Direct rehearsal or practice on outcome measure of interest. Example: repeated practice on the GEFT, VMRT, WLT.
– Spatial task training, transfer: Transfer of training to reference tests. Example: regular WLT test on irregular WLT; Tetris test on PFT.
Results
• Overall effectiveness of training
• Control group effects
• Age and Sex
• Are some kinds of training better than
others? Are some outcome measures more
malleable than others?
• Duration
• Transfer
Overall Effectiveness
• 101 studies
– Mean effect size = .65 (i.e., about 2/3 of an SD of improvement)
– “Moderate” improvement (Cohen, 1988)
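The slides do not say how the mean across the 101 studies was weighted. A minimal sketch of one common convention, the fixed-effect inverse-variance weighted mean, using made-up inputs (the function name and the data are assumptions):

```python
import math


def weighted_mean_effect(effects, variances):
    """Inverse-variance weighted mean effect size and its standard error.

    Assumption: fixed-effect weighting (weight = 1 / variance). The
    meta-analysis may have used a different model.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * g for w, g in zip(weights, effects)) / total
    standard_error = math.sqrt(1.0 / total)
    return mean, standard_error


# Hypothetical study-level values, not taken from the talk.
mean_g, se = weighted_mean_effect([0.2, 0.8], [0.04, 0.04])
print(round(mean_g, 2), round(se, 3))
```

More precise studies (smaller variance) pull the pooled mean toward their own estimate, which is the reason for weighting rather than taking a simple average.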
[Figure: Effect sizes (g), experimental group]
Experimental groups do significantly exceed
control groups
• Treatment groups improve more in nearly all cases:
[Table: treatment vs. control group effect sizes]
† Homogeneity achieved.
a, b: Groups labeled with different superscripts are significantly different.
* Age x Control group type χ2 significant, p < .05
But…
Why control groups matter
• Important to separate control and treatment groups
– Spatial principles highest Ec effect size, lowest Control group g
– Spatial perception lowest Ec effect size, highest g for Control.
[Figure: Mean effect size (g) for control and treatment groups, spatial principles vs. spatial perception; bar labels: 0.95, 0.76, 0.64, 0.18]
Why control groups improve so much
What’s the evidence to support this claim?
Test variety is effective training
• Test-retest effect: Not just number of repetitions
• Number of separate tests given during pretest-posttest:
[Figure: Mean effect size (g) by number of test-retest measures per study (one measure; 2-4 measures; 5 or more measures); bar labels: 0.78, 0.59, 0.49]
Age effects: It’s all in the control group
• Does malleability vary by age?
– On average, effect size significantly higher for children
than for adults, p < .05
– Initially, appears that children are more malleable…
Age effects: Control and Experimental
Sex differences
• Does malleability vary by sex? No difference in mean
effect size, both sexes respond to training (same g)
• Overall, results from prior work are most consistent
with last scenario.
• Male advantage is similar in magnitude at pre and post
[Figure: Schematic of male (M) and female (F) performance before and after training]
Training works – What works?
• Focus on treatment effect sizes:
Grouping variable: Results
– Outcome measure: Outcomes largely similar in malleability. Only significant difference: spatial perception (.96) > mental rotation (.67).
– Training frequency: More frequent training produced a larger g for mental rotation only (.81 for multi-session vs. .38 for single-session).
– Feedback during training: Feedback led to larger effect sizes for most outcome measures, except spatial perception (where the opposite is true).
– Random assignment: Led to lower effect sizes (more rigorous).
Duration: Training lasts
• Majority of studies (85%) tested only immediate impact of training.
• Among treatment groups: No significant decline in effect size
measured immediately, 2 weeks after, or more than 2 weeks after
end of training (which includes up to 3 months later).
[Figure: Effect size (g) at immediate test, up to 2 weeks, and more than 2 weeks after training; bar labels: 0.76, 0.59, 0.57]
Training transfers
• Why does this matter?
– Suggests training is NOT just a practice effect
– If spatial training has effects that extend beyond mere
practice, training should transfer to untrained tasks.
• Near vs. Far transfer:
– Near g = 1.01
– Far g = .56
– But Far is more durable.
Training transfers
• Studies expecting to obtain far transfer might use training
that produces especially durable effects:
[Figure: Effect size (g) by transfer type (near, far); bar labels: 1.05, 0.72, 0.58, 0.37]
Conclusions
• Training leads to improvements in spatial skills that are:
– Durable - No significant losses in pretest-posttest
improvement, even when retested 3 months later.
– Generalizable to other tasks – Training leads to
improvements on untrained tasks.
Conclusions
• How much can spatial skills improve?
– Use longer periods of training
• 47% of studies used only a single session of training
• 85% conducted only an immediate posttest
• When long periods of training are used, durable effects AND far
transfer are observed.
– Test a larger range of outcome measures
• 48% of outcome measures are mental rotation
• Vs. 9% perspective taking, 11% spatial principles, etc.
– Include a variety of methods of training
• Allows for alignment and comparison across problems (Gentner
& Markman, 1994, 1997)
Future directions
• To develop best-practice guidelines for spatial
interventions at elementary and high school levels.
• Investigate transfer to STEM in more detail.
• Understand thresholds for success
Acknowledgements
• Larry Hedges (NU)
• Spyros Konstantopoulos (NU)
• David B. Wilson (George Mason University)
• Chris Warren and Alison Lewis
• Research assistance:
– Kate O’Doherty
– Bridget O’Brien
– Eleanor Tushman
– Maggie Carlin
– Laura Mesa, Bonnie Vu, Melissa Sifuentes