15-shintani2013
This article reports a meta-analysis of studies that investigated the relative effectiveness
of comprehension-based instruction (CBI) and production-based instruction (PBI). The
meta-analysis only included studies that featured a direct comparison of CBI and PBI in
order to ensure methodological and statistical robustness. A total of 35 research projects
in 30 published studies were retrieved. The studies were coded for three types of effect
sizes: comparative, absolute, and pre-to-post change. The comparative effect sizes were
used in a subsequent moderator analysis to test the impact of two moderator variables:
CBI with and without Processing Instruction and PBI involving text creation versus text
manipulation. The results showed that (1) overall, both types of instruction had large
effects on both receptive and productive knowledge; (2) for receptive knowledge, CBI
had a greater effect than PBI when the acquisition was measured within one week but the
difference diminished in the delayed tests (i.e., posttests administered between 1 week
and 75 days after the treatment); (3) for productive knowledge, CBI and PBI had similar
effects in short-term measurement but PBI was more effective in the delayed tests; and
(4) the initial advantage found for CBI was largely due to Processing Instruction. We
discuss the theoretical and pedagogical significance of these findings.
Introduction
Traditionally in the field of language teaching, grammar instruction has em-
phasized the importance of learning through producing sentences in the target
We thank the anonymous reviewers and the editor of Language Learning for their detailed and
helpful suggestions for revising the draft of this article.
Correspondence concerning this article should be addressed to Natsuko Shintani, National
Institute of Education, Nanyang Technological University, 1 Nanyang Walk Singapore 637616.
E-mail: [email protected]
DOI: 10.1111/lang.12001
Shintani, Li, and Ellis Meta-Analysis of CBI and PBI
is considered later in detail. As there are now a sufficient number of studies that
have conducted direct comparisons of CBI and PBI, a meta-analytical approach
is possible. Meta-analysis provides a quantitative synthetic review of previous
empirical research on a certain construct or clinical intervention. Since the
publication of Norris and Ortega’s (2000) seminal work on the effectiveness
of L2 instruction, a number of such analyses have been conducted (e.g., Gold-
schneider & DeKeyser, 2001; Li, 2010; Lyster & Saito, 2010; Mackey & Goo,
2007; Norris & Ortega, 2006; Plonsky, 2011). The comparative effectiveness of
CBI and PBI has yet to be addressed but is ripe for a new meta-analysis. Unlike
the previous meta-analyses of different types of instruction, our approach in the
meta-analysis we report in this article is unique in that we included only those
studies that conducted direct comparisons of the two types of instruction. We
inspected cumulative evidence gleaned from within-study comparisons of CBI
versus PBI by examining evidence from three types of effect sizes: comparative
(i.e., contrasting CBI and PBI directly), absolute (i.e., contrasting CBI or PBI
to a control group), and pre-to-post (i.e., contrasting performance by a CBI or
PBI treatment over time).
Theoretical Issues
As noted in the Introduction, comparison of CBI and PBI provides a means
of addressing a key issue in SLA, namely the skill-specificity of language
instruction. Four different positions are possible: (1) CBI benefits receptive but
not productive L2 knowledge whereas PBI benefits productive knowledge but
not receptive knowledge; (2) PBI will prove superior to CBI in developing both
types of knowledge; (3) CBI will prove superior to PBI in developing both
types of knowledge; and (4) PBI will only benefit acquisition if it takes account
of the learners’ level of L2 development but no such constraint exists in the
case of CBI.
The first position is supported by skill-learning theory, which emphasizes
that the power law of practice leads to qualitative changes in the learner’s
knowledge system over time but only “in the basic cognitive mechanisms
used to execute the same task” (DeKeyser, 2007, p. 99). This is because the
kind of knowledge that characterizes the later stages of development (i.e., the
automatization stage) is highly specific and so does not transfer to tasks that
are dissimilar from those used to develop the knowledge. In other words, the
knowledge that results from comprehension-based instruction will be available
in receptive tasks while that which results from production-based tasks will be
available in productive tasks.
The second position receives support from theories that emphasize the
importance of production in L2 acquisition. Swain’s (1985) Output Hypothesis,
for example, proposes that advanced acquisition requires opportunities for
pushed output. Swain argued that comprehensible input alone was not sufficient
to ensure high levels of linguistic competence. She suggested that production
was beneficial when it pushed learners to produce messages that are concise
and socially appropriate. As Swain (1995) noted “learners . . . can fake it,
so to speak, in comprehension, but they cannot do so in the same way as
production” (p. 127). It should be noted that the Output Hypothesis addresses
the role played by free, communicative production, not the kind of controlled
production resulting from grammar drills and exercises. That is, the Output
Hypothesis suggests that PBI consisting of text-creation activities is needed.
Theories of implicit learning (N. Ellis, 2005; Long & Robinson, 1998) also
emphasize the need for text-creation activities in conjunction with a focus
on form (i.e., drawing learners’ attention to linguistic problems that arise in
their communicative production). The importance attached to text-creation
activities in PBI is further supported by the Transfer Appropriate Learning
Hypothesis, which claims that “we can better remember what we have learned
if the cognitive processes that are active during learning are similar to those that
are active during retrieval” (Lightbown, 2008, p. 27). In other words, if the aim
is to develop the kind of knowledge required for meaningful communication,
it is necessary to engage learners in production involving text creation.
The third position is supported by models of L2 acquisition that propose
a single knowledge store that is drawn on for both comprehension and pro-
duction and that develops as a result of processing input. VanPatten (2007)
argued that input processing (defined as the process by which a form-meaning
connection is established) in conjunction with other cognitive processes (such
as restructuring) leads to changes in the learner’s internal grammar and these
will be subsequently manifest in both receptive and productive tasks. This
is based on the hypothesis that for acquisition to take place learners need
to overcome a number of processing principles that define their attentional
priorities and prevent them from paying attention to specific grammatical fea-
tures in the input. For example, “learners prefer processing ‘more meaningful’
morphology before ‘less’ or ‘nonmeaningful morphology’” (VanPatten, 1996,
p. 15) and so may pay attention to English plural -s but not third person -s. The
goal of instruction, therefore, should be to induce attention to those features
that learners overlook as a result of processing constraints. It should be noted,
however, that VanPatten (2002) argued that these principles do not apply to the
entirety of the grammar of a language but only to those specific features influ-
enced by the principles. Processing Instruction potentially involves a number
of different components—explicit instruction, processing strategy training and
structured input—with some studies (e.g., VanPatten & Oikkenon, 1996) sug-
gesting that it is the last of these that is crucial. Several PI studies (e.g., Benati,
2005; VanPatten & Cadierno, 1993), based on VanPatten’s theory, lend support
to the superiority of comprehension-based instruction over production-based
instruction in developing both receptive and productive knowledge.
The fourth position is evident in Pienemann’s (1985) claims regarding the
learnability and teachability of grammatical structures. Pienemann proposed
that the ability to produce a specific grammatical structure depends on whether
the learner has mastered the specific processing operation that is required. The
Teachability Hypothesis predicts that “instruction can only promote language
acquisition if the interlanguage is close to the point when the structure to be
taught is acquired in the natural setting so that sufficient processing prerequi-
sites are developed” (Pienemann, 1985, p. 37). By instruction Pienemann is
referring to production-based activities. Pienemann acknowledged that the pro-
cesses involved in comprehension and production are different. Thus, he saw
no problem in exposing learners to grammatical structures that are beyond their
current level of development provided they are not required to produce them.
From this perspective, CBI avoids the danger of forcing learners to produce
a structure they are not ready for, which according to Pienemann (1989) can
actually delay L2 development.
The present study reports a meta-analysis of studies that have featured a
direct comparison of CBI and PBI. In order to carry out the meta-analysis,
we have treated these two types of instruction as macro instructional types.
We distinguished them by defining CBI as instruction that did not require
production (but did not prohibit it) and that directed learners’ attention to
processing the meaning of specific target features in the input, while PBI was
defined as instruction that required learners to produce the L2 target features
and directed attention at processing the target features correctly as output. As
noted above, each type of instruction can be implemented in a variety of ways.
In this broad approach we have followed the practice of previous meta-analyses,
which have been based on instructional distinctions of a similar macro nature
(e.g., Norris & Ortega’s [2000] macro variables of focus-on-form vs. focus-on-
forms and implicit vs. explicit instruction). We argue that the investigation of
CBI and PBI as macro instructional types is justified because of their importance
to both language pedagogy and SLA theory.
Research Questions
The present meta-analysis has two major aims. One is to investigate the relative
effects of CBI and PBI on learners’ acquisition of both receptive and
productive knowledge. To this end, two research questions were formulated:
RQ1: Is there any difference in the effects of CBI and PBI on the acquisition
of receptive knowledge of grammatical features?
RQ2: Is there any difference in the effects of CBI and PBI on the acquisition
of productive knowledge of grammatical features?
The results obtained for these questions provide a basis for evaluating the
different SLA theoretical positions outlined above. The second aim was to
investigate whether key differences in CBI and PBI affect the acquisition of
grammar features, leading to a third research question:
RQ3: What factors moderate the effects of CBI and PBI on L2 grammar
acquisition?
Two moderating factors were chosen for examination: (1) PI versus non-PI
in the case of CBI and (2) text-manipulation versus text-creation activities in
the case of PBI. The explanation for the choice of these two factors is presented
in the next section.
Method
Selection of Studies
The current meta-analysis did not include fugitive literature (e.g., unpublished
doctoral dissertations and conference presentations) due to the difficulty in-
volved in retrieving such materials consistently. The literature search started
with ten major SLA journals, in alphabetical order: Applied Linguistics, The
Canadian Modern Language Review, Computer Assisted Language Learning,
Language Awareness, Language Learning, Language Teaching Research, The
Modern Language Journal, Studies in Second Language Acquisition, System,
and TESOL Quarterly. The search was carried out by inspecting each jour-
nal using online resources or hard copies. Then, electronic databases such
as the Education Resources Information Center and the Linguistic and Lan-
guage Behaviour Abstracts were used to extend the search. Some of the key
words used in the database search were “input,” “output,” “comprehension,”
“production,” “receptive,” “productive,” “comparative,” “comparison,” “class-
room,” and “grammar learning.” As a result, articles from a number of other
journals (Applied Psycholinguistics, Foreign Language Annals, Hispania, JALT
Journal, Second Language Research, RELC Journal, and Spanish Applied Lin-
guistics) and chapters from some books (Benati, 2009; Benati & Lee, 2008;
Lee & Benati, 2007a, 2007b; and VanPatten, 2004) were identified.
The following criteria were used to select studies for the meta-analysis.
Initially, studies that met criteria (1), (2), and (3) were selected. Then the
selected studies were further screened according to criteria (4), (5) and (6):
1. Only studies published in refereed journals or books between 1991 and
2010 were included, 2010 being the point where the data collection for
this study was completed. The year 1991 was set as the starting point
because, although debate about the relative effectiveness of CBI and PBI
had started much earlier, there were few empirical studies prior to 1991
and those that did exist—such as Asher’s (1977) Total Physical Response
experiments—did not report sufficient data for effect size calculation.
2. Only studies that featured a comparison of a CBI treatment with a PBI
treatment were selected. A CBI treatment was defined as comprising any
activity that required learners to comprehend oral or written L2 input
to achieve the goal of the activity but required only limited production
(e.g., “yes” or “no”) or no production at all. A PBI treatment was defined
other to PBI, both of which figured widely in the studies and can be considered
instructional aspects of theoretical and pedagogical significance, as outlined in
the Introduction. Although these factors moderated just CBI or PBI, we decided
to include them as variables because of their potential impact on the efficacy
of the instruction.
Moderator variable 1 was whether or not the CBI involved PI. That is, we
decided to investigate whether or not the CBI studies that included structured
input involved PI in ways that were faithful to VanPatten’s (1996) definition.
Many of the CBI studies involved PI and we thought this might be a factor in the
efficacy of the instruction. We determined this variable in the following way. We
first identified those CBI studies that included a structured input activity (i.e., an
activity that requires learners to make form-meaning connections for the target
agreement rate between the two raters reached 94.5%, and the disparities were
resolved through discussion. A final round of coding was then performed with
special attention to the disputed aspects of the coding scheme.
are not associated with direct contrasts between CBI and PBI, and comparing
the two in terms of their effectiveness would involve two steps: aggregat-
ing effect sizes related with treatment–control/pre–post contrasts followed by
between-group comparisons. Also, analyses based on comparative effect sizes
involved all the studies included in the meta-analysis whereas analyses based
on absolute or pre-to-post effect sizes necessarily precluded studies without a
control group or without pretest data.
One issue meta-analysts have to face is sample size inflation, that is, in-
cluding multiple effect sizes from a single study in the analysis. Lipsey and
Wilson (2001) rightly pointed out that the inclusion of more than one ef-
fect size in the same analysis violates the assumption of independence of
data points and “can render the statistical results highly suspect” (p. 105).
To minimize violation of this assumption while retaining as much data as
possible, a shifting unit of analysis was adopted (Patall,
Cooper, & Robinson, 2008). Initially, effect sizes from each study were com-
puted separately. However, when effect sizes were aggregated to obtain overall
effects, each study contributed a single effect size, which was the average of
multiple effect sizes if a study generated more than one effect size. When
moderator analyses were performed, there was a possibility that more than
one effect size was involved in one analysis. However, in each category of
an independent variable, only one effect size was included from each study.
For instance, when both a receptive and a productive test were
used in a study, the two corresponding effect sizes were included in the
analysis related to the effect of the outcome measure. However, within each
of the two categories (receptive and productive), only one effect size was
included.
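The shifting unit of analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study labels and effect sizes are hypothetical, the function names are ours, and averaging within a category is one plausible way of reducing a study to a single effect size per category.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records (not data from the meta-analysis):
# (study id, outcome category, effect size d).
effects = [
    ("study_a", "receptive", 1.20),
    ("study_a", "receptive", 0.80),
    ("study_a", "productive", 0.10),
    ("study_b", "receptive", 0.50),
    ("study_b", "productive", -0.30),
    ("study_b", "productive", -0.10),
]


def per_study(records):
    """Overall analysis: each study contributes the mean of all its effect sizes."""
    grouped = defaultdict(list)
    for study, _category, d in records:
        grouped[study].append(d)
    return {study: mean(ds) for study, ds in grouped.items()}


def per_study_and_category(records):
    """Moderator analysis: one effect size per study within each category.

    Assumption: multiple effect sizes in the same category are averaged.
    """
    grouped = defaultdict(list)
    for study, category, d in records:
        grouped[(study, category)].append(d)
    return {key: mean(ds) for key, ds in grouped.items()}


overall = per_study(effects)
by_category = per_study_and_category(effects)
```

In the overall analysis each study appears once; in the moderator analysis a study may appear in both the receptive and the productive category, but only once in each.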
Another issue relates to outliers. There are three options for dealing with
outliers: (1) including them intact, (2) recoding them to less extreme values
(e.g., using the technique known as winsorizing), and (3) removing them.
In this meta-analysis, outliers were excluded from all analyses in order to
ensure the robustness of the results. Outliers were detected by examining the
standardized values of effect sizes (z scores): Any value that was more than 2.5
standard deviation units from the mean effect size was considered an outlier.
For each set of effect sizes, outlier identification was performed repeatedly until
there were no further outliers. It should be noted that outliers are contingent,
that is, an outlier in one set of effect sizes is not necessarily an outlier in
another set. Information about the outliers is provided in Appendix S2 of
the online Supporting Information, where a list of included studies and their
methodological features can be found.
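A minimal sketch of this repeated screening is given below. The 2.5 standard deviation threshold and the repetition until no outliers remain come from the description above; the use of the sample standard deviation, the function name, and the illustrative values are assumptions.

```python
from statistics import mean, stdev


def remove_outliers(values, threshold=2.5):
    """Repeatedly drop values whose z score exceeds the threshold.

    Assumption: z scores use the sample standard deviation (n - 1).
    Returns the retained values and the removed outliers.
    """
    kept = list(values)
    removed = []
    while len(kept) > 2:
        m = mean(kept)
        s = stdev(kept)
        if s == 0:
            break  # all remaining values are identical
        flagged = [v for v in kept if abs(v - m) / s > threshold]
        if not flagged:
            break  # no further outliers in this set
        removed.extend(flagged)
        kept = [v for v in kept if abs(v - m) / s <= threshold]
    return kept, removed


# Hypothetical set of effect sizes with one extreme value.
sample = [0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9, 1.0, 8.0]
kept, removed = remove_outliers(sample)
```

Because the mean and standard deviation are recomputed after each pass, a value that is unremarkable in the full set can become an outlier once a more extreme value has been removed, which is why the screening is run separately for each set of effect sizes.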
Results
Descriptive Results
A total of 35 comparative studies investigating the differential effects of CBI
and PBI on SLA were retrieved. These studies were published between 1991
and 2010. As shown in Figure 1, there has been a rapid growth in the amount
of research on the topic under investigation (from 1 during the first 5 years to
15 over the last 5-year period). Altogether 276 effect sizes were generated from
these studies involving 7,700 codes relating to the dependent and independent
variables as well as methodological features. Out of the total effect sizes, 141
were computed based on productive measures, and 135 on receptive measures.
Also, 191 represented immediate effects and 85 related to delayed effects (i.e.,
longer than 1 week following the instruction).
Table 2 Comparative effects of CBI and PBI
                                                  95% CI          Heterogeneity
                      k    Mean d   p      SE     Lower   Upper   Q       p
Receptive
  Immediate           31   1.09     .00∗   0.23   0.64    1.55    323.43  0.00∗
  Delayed             15   0.25     .11    0.16   −0.06   0.55    51.24   0.00∗
Productive
  Immediate           32   −0.08    .29    0.07   −0.23   0.07    48.28   0.02∗
  Delayed             15   −0.21    .03∗   0.09   −0.39   −0.02   19.29   0.15
∗Statistically significant at p < .05.
Meta-Analytic Results
This section first reports the overall results on the effectiveness of the two types of
instruction as reflected by the three categories of effect sizes (i.e., comparative,
absolute, and pre-to-post effect sizes) associated with receptive and productive
measures. This is followed by the results obtained for the moderator analysis.
The overall results for the comparative effects of CBI and PBI appear in
Table 2, which shows the number of contrasts (k), the mean effect size (and the
related p value, standard error, CI), and the between-group Q test results. Re-
call that comparative effect sizes were computed based on the mean differences
between CBI and PBI; a positive effect size indicates a superior effect for CBI
and a negative effect size shows a better effect for PBI. For receptive knowl-
edge, CBI was more effective than PBI on the immediate posttests, d = 1.09,
p = .00, but the difference was no longer statistically significant on the posttests
administered more than one week following the instruction and the effect size
at this time was only d = 0.25. For productive knowledge, the initial effects
of the two types of instruction were similar, but PBI showed a significant but
small mean effect advantage over CBI on the delayed posttests of d = −0.21,
p = .03.
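The sign convention for the comparative effect sizes can be illustrated with a short sketch. It assumes a standardized mean difference (Cohen's d) with a pooled standard deviation, which is one common choice and is not confirmed here as the exact formula used; the function name and the input values are hypothetical.

```python
from math import sqrt


def comparative_d(mean_cbi, sd_cbi, n_cbi, mean_pbi, sd_pbi, n_pbi):
    """Standardized mean difference between a CBI and a PBI group.

    Assumption: Cohen's d with a pooled standard deviation.
    Positive values favor CBI; negative values favor PBI.
    """
    pooled_var = (
        (n_cbi - 1) * sd_cbi**2 + (n_pbi - 1) * sd_pbi**2
    ) / (n_cbi + n_pbi - 2)
    return (mean_cbi - mean_pbi) / sqrt(pooled_var)


# Hypothetical posttest scores: the CBI group scores higher, so d is positive.
d_favoring_cbi = comparative_d(12.0, 2.0, 20, 10.0, 2.0, 20)

# Swapping the group means reverses the sign, indicating an advantage for PBI.
d_favoring_pbi = comparative_d(10.0, 2.0, 20, 12.0, 2.0, 20)
```

Under this convention the delayed productive result of d = −0.21 reads as a small advantage for PBI.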
The results for the absolute effect sizes are displayed in Table 3. Absolute
effect sizes for CBI and PBI are based on the contrasts between the two types of
instruction and the control conditions. As shown, both CBI and PBI significantly
outperformed the control groups with large effect sizes on all measures. Q
Table 3 Absolute effects of CBI and PBI (treatment vs. control)
                                                  95% CI          Group contrast
Variables             k    Mean d   p      SE     Lower   Upper   Qb      p
Receptive
  Immediate                                                       5.12    .02∗
    CBI               20   1.96     .00∗   0.29   1.38    2.54
    PBI               21   1.13     .00∗   0.22   0.69    1.56
  Delayed                                                         1.01    .32
    CBI               9    1.58     .00∗   0.29   0.99    2.16
    PBI               9    1.12     .00∗   0.34   0.45    1.79
Productive
  Immediate                                                       0.07    .79
    CBI               22   1.23     .00∗   0.22   0.81    1.66
    PBI               22   1.32     .00∗   0.23   0.86    1.77
  Delayed                                                         0.12    .73
    CBI               8    0.91     .00∗   0.19   0.54    1.29
    PBI               8    1.02     .00∗   0.26   0.52    1.53
∗Statistically significant at p < .05.
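As a rough check on how a between-group Q statistic relates to the group means and standard errors, the two-group case can be approximated from the summary values alone. The sketch below assumes inverse-variance weights taken directly from the reported standard errors and a chi-square distribution with one degree of freedom; it is not the authors' procedure, and the function name is ours.

```python
from math import erfc, sqrt


def q_between(d1, se1, d2, se2):
    """Two-group between-group Q from mean effect sizes and standard errors.

    Assumption: inverse-variance weights (1 / SE^2). For two groups the
    weighted sum of squared deviations reduces to the expression below.
    """
    q = (d1 - d2) ** 2 / (se1**2 + se2**2)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = erfc(sqrt(q / 2))
    return q, p


# Receptive immediate row of the absolute effects: CBI 1.96 (SE 0.29), PBI 1.13 (SE 0.22).
q, p = q_between(1.96, 0.29, 1.13, 0.22)
```

With these inputs the approximation gives a Q of about 5.2 and a p value of roughly .02, close to the reported Qb of 5.12; the small gap is expected because the inputs are rounded summary values.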
                                                  95% CI          Group contrast
Variables             k    Mean d   p      SE     Lower   Upper   Qb      p
Receptive
  Immediate                                                       21.09   .00∗
    CBI               26   2.36     .00∗   0.26   1.85    2.88
    PBI               27   1.01     .00∗   0.13   0.76    1.27
  Delayed                                                         5.04    .02∗
    CBI               13   1.70     .00∗   0.26   1.19    2.22
    PBI               12   1.02     .00∗   0.15   0.73    1.32
Productive
  Immediate                                                       0.35    .55
    CBI               29   1.66     .00∗   0.19   1.29    2.03
    PBI               30   1.83     .00∗   0.22   1.40    2.27
  Delayed                                                         0.12    .72
    CBI               12   1.23     .00∗   0.21   0.83    1.67
    PBI               12   1.36     .00∗   0.22   0.93    1.79
∗Statistically significant at p < .05.
                                                  95% CI          Group contrast
Variables             k    Mean d   p      SE     Lower   Upper   Qb      p
CBI–PI/non-PI
  Receptive                                                       48.42   .00∗
    PI                18   2.14     .00∗   0.33   1.49    2.79
    Non-PI            10   −0.28    .01∗   0.11   −0.50   −0.07
  Productive                                                      0.96    .33
    PI                18   0.05     .55    0.09   −0.12   0.23
    Non-PI            12   −0.14    .43    0.17   −0.48   0.20
PBI–text-creation/text-manipulation
  Receptive                                                       2.74    .09
    Text-creation     11   0.69     .04∗   0.35   0.02    1.37
    Text-manipulation 23   1.48     .00∗   0.32   0.85    2.12
  Productive                                                      1.47    .23
    Text-creation     13   −0.15    .08    0.09   −0.32   0.02
    Text-manipulation 23   0.03     .81    0.12   −0.20   0.26
∗Statistically significant at p < .05.
Discussion
Before we summarize and discuss the findings of the present meta-analysis, we
would like to recognize the wide variety of ways in which CBI and PBI were
operationalized across studies and to comment on the validity of synthesizing
CBI and PBI effects as macro instructional types.
We identified 35 research experiments in 30 published studies that had
compared CBI and PBI. However, in the process of the selection, we noted that
the studies varied considerably in how they provided learners opportunities for
production in CBI and opportunities for comprehension in PBI (see Appendix
S1 in the online Supporting Information). Some studies (e.g., Shintani & Ellis,
2010) reported that the participants in the CBI group engaged in substantial
L2 production without being requested to do so. It is possible that this also
occurred in other CBI studies but unfortunately most studies failed to report
whether the treatment led to learners producing the L2. Thus, in most cases it
proved impossible to identify the extent to which production figured in CBI.
Similarly, we were not able to determine to what extent the PBI treatments
afforded opportunities for learners to comprehend input containing the target
features. Many of the PBI treatments provided some L2 input as a stimulus
for the production activities. For example, the “traditional instruction” and
“meaning-oriented output” in the PI studies (e.g., Benati, 2005; Farley, 2001a,
2001b; Lee & Benati, 2007a) typically involved explicit grammar instruction
and activities that presented sentences containing the target grammatical forms
(e.g., “Last week, I (visit) my uncle” in Benati, 2005). Some PBI treatments even
required the learners to first comprehend input including the target grammatical
features as a prelude to the production activities. For example, the PBI in
Collentine (1998) included a pair-work interview task where learners needed
to comprehend their partners’ production. Where the PBI involved dictogloss
tasks (Qin, 2008; VanPatten et al., 2009), learners were required to comprehend
the text before they tried to reconstruct it.
There were other differences in the ways in which the CBI and PBI were
operationalized. Some CBI studies required learners to distinguish the correct
form from the incorrect form (e.g., DeKeyser & Sokalski, 1996) whereas others
merely exposed the participants to exemplars of the correct target feature (e.g.,
Izumi, 2002; Leeser, 2008). In PBI, the types of L2 production activities varied
from controlled discrete-point production (i.e., text manipulation) to more free
production (i.e., text creation). Both CBI and PBI also varied in terms of
whether there was provision of explicit grammar information, the participants’
age, their proficiency levels, the instructional context, the research setting, the
target language, and the length of treatment (see Appendix S2 in the online
Supporting Information). Yet, it was only possible to investigate two of these
moderator variables as there were insufficient studies which had incorporated
other variables.
We argue that it is valid to carry out a comparison of the effects of CBI
and PBI on learning, even though each type of instruction differs in a number
of ways. Our reasons are as follows. First, the distinction is well established in
discussions of language pedagogy and has been the subject of debate among
teacher educators (e.g., Winitz, 1981). Second, as we have shown, a number of
studies have compared the two types of instruction. In fact, our meta-analysis
differs from most of the preceding SLA meta-analyses in that it only examined
comparative studies (i.e., it did not examine studies that investigated just CBI or
PBI). We have argued that examining only comparative studies where the two
types of instruction are maximally different affords a more robust comparison as
each study involved participants drawn from the same population. Third, there
is a fundamental difference in an approach to teaching that emphasizes input
and comprehension and one that emphasizes production that is of theoretical
importance in SLA as it concerns the relative roles of input and output in L2
learning. We will now discuss what answers the results of the meta-analysis
provide to our three research questions and how they speak to relevant SLA
theoretical issues.
Table 6 Summary of effects on receptive knowledge
                      k    Result              Mean d
Comparative effects
  Immediate           31   CBI >∗ PBI          1.09
  Delayed             15   CBI > PBI           0.25
Absolute effects
  Immediate∗∗         20   CBI >∗ Control      1.96
                      21   PBI >∗ Control      1.13
  Delayed             9    CBI >∗ Control      1.58
                      9    PBI >∗ Control      1.12
Pre-to-post effects
  Immediate∗∗         26   CBI: Post >∗ Pre    2.36
                      27   PBI: Post >∗ Pre    1.01
  Delayed∗∗           13   CBI: Post >∗ Pre    1.70
                      12   PBI: Post >∗ Pre    1.02
∗The difference between the groups/tests was significant at p = .05.
∗∗The Qb value showed a significant difference in the effect sizes between CBI and PBI.
Table 7 Summary of effects on productive knowledge
                      k    Result              Mean d
Comparative effects
  Immediate           32   CBI = PBI           −0.08
  Delayed             15   PBI >∗ CBI          −0.21
Absolute effects
  Immediate           22   CBI >∗ Control      1.23
                      22   PBI >∗ Control      1.32
  Delayed             8    CBI >∗ Control      0.91
                      8    PBI >∗ Control      1.02
Pre-to-post effects
  Immediate           29   CBI: Post >∗ Pre    1.66
                      30   PBI: Post >∗ Pre    1.83
  Delayed             12   CBI: Post >∗ Pre    1.23
                      12   PBI: Post >∗ Pre    1.36
∗The difference between the groups/tests was significant at p = .05.
Note. All Qb values were nonsignificant.
effect sizes shown in Table 6 for the absolute and the pre-to-post effects for both
CBI and PBI were all either large or medium. In other words, they indicate
both types of instruction were effective in developing receptive knowledge. In
addition, however, the absolute and pre-to-post effect sizes for CBI were always
larger than those for PBI. Also, the comparative effect indicates that CBI was
superior to PBI in the immediate tests (with a close-to-large effect size), although this
difference disappeared in the delayed tests (i.e., the effect size for this comparison
was negligible). Overall, then, while both types of instruction were effective in
developing receptive knowledge, the CBI proved more effective than the PBI
although its superiority was not clearly sustained over time.
Research question 2 asked whether there was any difference in the relative
effects of CBI and PBI on the acquisition of productive knowledge. The results
are summarized in Table 7. They indicate that in the short term CBI and PBI
were similarly effective in developing learners’ productive knowledge. The
absolute effects and the pre-to-post effects for both types of instruction are all
large with the exception of the medium mean effect size for CBI versus Control
in the delayed tests. The analysis of the comparative effects indicates that PBI
led to more durable productive knowledge than CBI, but only to a small degree
when compared directly, because the difference was statistically significant but
numerically small (d = −0.21).
Taking Tables 6 and 7 together, it is clear that both CBI and PBI are effective
in developing both receptive and productive knowledge in both the short and
long term. However, there are differences in the effects of the two types of
instruction. In the short term, CBI resulted in greater receptive knowledge than
PBI but there was no notable difference in productive knowledge. In the long
term, while both CBI and PBI resulted in similar levels of receptive knowledge,
PBI led to greater sustained productive knowledge. It is possible, then, that while
CBI has an initial advantage, PBI may have more durable effects. Therefore, we
would like to consider the theoretical implications of (1) the initial advantage
of CBI over PBI and (2) the more durable effect of PBI.
                      k    Result              Mean d
Receptive
  CBI∗∗
    PI                18   CBI >∗ PBI          2.14
    Non-PI            10   PBI >∗ CBI          −0.28
  PBI
    Text-creation     11   CBI >∗ PBI          0.69
    Text-manipulation 23   CBI >∗ PBI          1.48
Productive
  CBI
    PI                18   CBI = PBI           0.05
    Non-PI            12   CBI = PBI           −0.14
  PBI
    Text-creation     13   CBI = PBI           −0.15
    Text-manipulation 23   CBI = PBI           0.03
∗The difference between the groups/tests was significant at p = .05.
∗∗The Qb value indicated that the moderator variable had a significant (or
near-significant) effect on the comparative effect sizes of CBI and PBI.
words, the results can be explained by proposing that CBI is more effective for
teaching new features whereas PBI is better for developing productive ability
of features that have already been partially acquired and need consolidating.
Such a view, it should be noted, accords with VanPatten’s (2004) claims about
the relative contributions of the two types of instruction. However, another
possibility is that the initial advantage for CBI fades over time because it re-
sulted in only explicit knowledge which is less durable than implicit knowledge
(Ellis, 2004). In contrast, PBI may have assisted the development of implicit
knowledge. Both of these interpretations of the results are speculative and it is
not possible to decide between them at this time.
Of the two moderator variables investigated, one (i.e., whether the CBI
involved PI or not) was found to have a statistically significant influence on
the comparative effectiveness of CBI and PBI. The results support VanPatten’s
(2004) claim that CBI is more effective than PBI when it involves PI, but
not when it consists of structured input in a non-PI framework. Wong (2004)
identified the following key features of PI: choice of a target feature that is
governed by a Processing Principle and structured input designed in such a way
as to alter learners’ processing strategies. The studies coded as PI met these
conditions. The results showed PI was more effective than PBI in developing
receptive knowledge. Interestingly, studies involving structured input but not
PI proved to be less effective than PBI in developing receptive knowledge. It
should be noted that there was no difference between the PI studies and non-PI
studies in the case of productive knowledge and the effect sizes were negligible.
This again supports VanPatten’s claims about PI. VanPatten acknowledged that
PBI may be needed to help learners automatize their interlanguage knowledge
for production. Overall, then, the results of the meta-analysis point to the
effectiveness of CBI when it involves PI but only where receptive knowledge
is concerned.
The other instructional moderator variable (i.e., text creation vs. text manipulation)
did not have a significant influence on the comparative effect of CBI
and PBI on acquisition. However, the effect sizes for receptive knowledge were
large in the case of text manipulation and small in the case of text creation.
In other words, CBI was found to be more effective for developing receptive
knowledge than PBI, irrespective of whether the latter involved text-creation or
text-manipulation activities. However, the advantage for CBI is clearer when the
PBI involved text manipulation (d = 1.48) than when it involved text creation
(d = 0.69). In short, CBI is more clearly effective for developing receptive
knowledge than PBI when PBI consists only of text-manipulation activities.
Conversely, the advantage for CBI is less evident when the PBI consists of
text-creation activities.
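To make the logic of the moderator analysis concrete, the following minimal Python sketch shows how a between-groups Q statistic (Qb) can be computed for two subgroups of studies. It assumes a fixed-effect model with inverse-variance weights, which may differ from the model actually used in this meta-analysis. The function names and the effect sizes in the usage example are hypothetical and are not data from the included studies.

```python
import math


def weighted_mean(studies):
    """Return the inverse-variance weighted mean effect and the summed weight.

    `studies` is a list of (d, variance) pairs, one pair per study.
    """
    weights = [1.0 / variance for _, variance in studies]
    total_weight = sum(weights)
    mean = sum(w * d for w, (d, _) in zip(weights, studies)) / total_weight
    return mean, total_weight


def q_between(groups):
    """Compute Qb for a categorical moderator under a fixed-effect model.

    `groups` maps a subgroup label to its list of (d, variance) pairs.
    Returns (qb, degrees_of_freedom).
    """
    stats = [weighted_mean(studies) for studies in groups.values()]
    grand_weight = sum(w for _, w in stats)
    grand_mean = sum(m * w for m, w in stats) / grand_weight
    # Qb is the weighted squared deviation of subgroup means from the grand mean.
    qb = sum(w * (m - grand_mean) ** 2 for m, w in stats)
    return qb, len(groups) - 1


def p_value_one_df(q):
    """Upper-tail p value of a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(q / 2.0))


# Hypothetical illustration only: two subgroups with two studies each.
example_groups = {
    "subgroup_a": [(1.0, 0.1), (1.0, 0.1)],
    "subgroup_b": [(0.0, 0.1), (0.0, 0.1)],
}
qb, df = q_between(example_groups)
print(f"Qb = {qb:.2f}, df = {df}, p = {p_value_one_df(qb):.4f}")
```

With only two subgroups, Qb has one degree of freedom, so the p value can be obtained from the complementary error function in the standard library. A moderator with more than two levels would need a general chi-square distribution function instead.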
Conclusion
In the motivation for this study, we distinguished a number of theoretical po-
sitions that address the roles of CBI and PBI in L2 learning. The results of
the meta-analysis do not support the prediction of skill-learning theory, which
claims that CBI benefits receptive but not productive L2 knowledge, whereas
PBI benefits productive knowledge but not receptive knowledge. We found
that both types of instruction were beneficial for both receptive and productive
knowledge. Nor is there really any clear evidence to demonstrate the superiority
of one type of instruction. We did find that CBI was more effective than PBI
for developing receptive knowledge in the immediate tests but there was no
difference in the delayed tests administered 1 week or longer following the in-
struction. PBI, however, was more effective than CBI for productive knowledge
in the delayed tests, although there was no difference in the immediate tests. We
have suggested that these differences can be explained in terms of whether the
target features were new or partially acquired, with CBI advantageous for the
former and PBI for the latter. This explanation accords with the predictions of
the Teachability Hypothesis (Pienemann, 1985), which makes different claims
about the effects of CBI and PBI on new and partially acquired features. How-
ever, we were unable to test these predictions as it was not possible to establish
the status of the target features in the learners’ interlanguages in the studies we
investigated.
Clearly, the results do not allow us to propose that instruction be based on
just CBI or just PBI. Both are effective and, in the case of productive knowledge,
equally so. It is well known that course materials in language textbooks tend
to emphasize PBI—even at the beginning level. Perhaps then, course designers
and materials writers might give greater emphasis to the use of CBI in the
future, especially when introducing new grammatical features, and especially
for beginner-level learners. It is also possible that grammar instruction will
be most effective if it involves a combination of comprehension-based and
production-based activities. The results of this meta-analysis can be seen as
compatible with such a proposal but nevertheless this will need investigating
empirically.
As Oswald and Plonsky (2010) pointed out, one of the purposes of meta-analyses
is to prompt further research by identifying areas in need of further
study. The meta-analysis reported in this article points to a number of direc-
tions for future research. Clearly, there is a need for further primary studies that
investigate the comparative effectiveness of CBI and PBI as macro types of
instruction. Such studies need to provide detailed information about the class-
room processes that arise during implementation of the instruction. In this way
it will be possible to identify the characteristics that clearly differentiate CBI
and PBI and that impact on learning. It would also be useful to distinguish the
effects of the instruction on the two types of production measures that Norris
and Ortega (2000) examined (i.e., constrained constructed response and free
constructed response). Further studies are needed to investigate the moderating
effects on learning outcomes of Processing Instruction in CBI and also of text
creation versus text manipulation in PBI. Ideally, too, our understanding of the
Notes
1 An anonymous reviewer suggested a number of other studies that could be
included in the meta-analysis. However, of eight studies suggested, only one study
(Cheng, 2002) met the selection criteria. The other studies were excluded because
they did not involve a direct comparison of CBI and PBI, were published after
2010 (our cut-off point), or were not published in a refereed journal or book.
Cheng’s study was missed because the title did not include any of the key words
used in the search for articles.
2 Trim and Fill (Duval, 2005) is a technique for estimating the potential impact of
publication bias on the observed results; it first estimates the number of studies
that might be missing from a meta-analysis due to publication bias, and
it then calculates the effect size that would have been obtained had these studies
been included.
3 An anonymous reviewer rightly pointed out that a symmetrical distribution does
not rule out publication bias as this meta-analysis only included published studies.
4 An alternative benchmark for evaluating effect size can be achieved by comparing
the effect sizes reported in this study with those reported in similar meta-analyses.
Norris and Ortega (2000) reported absolute effect sizes for Focus on Form and
Focus on Forms and for Explicit and Implicit instruction in the range of 0.54
(Implicit Instruction) to 1.92 (Focus on Form). The absolute effect sizes shown in
Table 6 range from 1.13 (PBI) to 1.96 (CBI) and thus demonstrate a similar degree
of effect to those reported by Norris and Ortega.
References
Note. Studies marked with an asterisk were included in the meta-analysis.
∗Allen, L. Q. (2000). Form-meaning connections and the French causative. Studies in
Second Language Acquisition, 22, 69–84.
Asher, J. J. (1977). Learning another language through actions: The complete
teacher’s guidebook. Los Gatos, CA: Sky Oaks Productions.
∗Benati, A. (2005). The effects of processing instruction, traditional instruction and
meaning-output instruction on the acquisition of the English past simple tense.
Language Teaching Research, 9, 67–93.
∗Benati, A. (2009). Japanese language teaching: A communicative approach. London:
Continuum.
∗Benati, A., & Lee, J. F. (2008). From processing instruction on the acquisition of
Italian noun-adjective agreement to secondary transfer-of-training effects on Italian
future tense verb morphology. In A. Benati & J. F. Lee (Eds.), Grammar
acquisition and processing instruction: Secondary and cumulative effects
(pp. 54–87). Clevedon, UK: Multilingual Matters.
∗Benati, A., Lee, J. F., & Houghton, S. D. (2008). From processing instruction on the
acquisition of English past tense to secondary transfer-of-training effects on English
third person singular present tense verb morphology. In A. Benati & J. F. Lee (Eds.),
Grammar acquisition and processing instruction: Secondary and cumulative effects
(pp. 88–120). Clevedon, UK: Multilingual Matters.
∗Benati, A., Lee, J. F., & Laval, C. (2008). From processing instruction on the
acquisition of French imparfait to secondary transfer-of-training effects on French
subjunctive and to cumulative transfer-of-training effects with French causative
constructions. In A. Benati & J. F. Lee (Eds.), Grammar acquisition and processing
instruction: Secondary and cumulative effects (pp. 121–157). Clevedon, UK:
Multilingual Matters.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2006).
Comprehensive meta-analysis (Version 2.2.027) [Computer software]. Englewood,
NJ: Biostat.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction
to meta-analysis. Chichester, UK: Wiley.
Cadierno, T. (1995). Formal instruction from a processing perspective: An
investigation into the Spanish past tense. Modern Language Journal, 79, 179–193.
Cheng, A. C. (2002). The effects of processing instruction on the acquisition of ser and
estar. Hispania, 85, 308–323.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). San
Diego, CA: Academic Press.
∗Collentine, J. (1998). Processing instruction and the subjunctive. Hispania, 81,
576–587.
Cross, C., Copping, L., & Campbell, A. (2011). Sex differences in impulsivity: A
meta-analysis. Psychological Bulletin, 137, 97–130.
Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence
intervals, and meta-analysis. New York: Routledge.
∗De Jong, N. (2005). Can second language grammar be learned through listening? An
experimental study. Studies in Second Language Acquisition, 27,
205–234.
DeKeyser, R. M. (2007). Practice in a second language: Perspectives from applied
linguistics and cognitive psychology. New York: Cambridge University Press.
∗DeKeyser, R. M., & Sokalski, K. J. (1996). The differential role of comprehension
and production practice. Language Learning, 46, 613–642.
Duval, S. (2005). The trim and fill method. In H. R. Rothstein, A. J. Sutton, & M.
Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment, and
adjustments (pp. 127–144). Chichester, UK: Wiley.
Ellis, N. C. (2005). At the interface: Dynamic interactions of explicit and implicit
language knowledge. Studies in Second Language Acquisition, 27, 305–352.
Ellis, R. (1995). Interpretation tasks for grammar teaching. TESOL Quarterly, 29,
87–105.
Ellis, R. (1997a). Second language acquisition. Oxford, UK: Oxford University Press.
Ellis, R. (1997b). SLA research and language teaching. Oxford, UK: Oxford
University Press.
Ellis, R. (2003). Task-based language learning and teaching. Oxford, UK: Oxford
University Press.
Ellis, R. (2004). The definition and measurement of L2 explicit knowledge. Language
Learning, 54, 227–275.
∗Erlam, R. (2003). Evaluating the relative effectiveness of structured-input and
output-based instruction in foreign language learning. Studies in Second Language
Acquisition, 25, 559–582.
∗Farley, A. P. (2001a). Authentic processing instruction and the Spanish subjunctive.
Hispania, 84, 289–299.
∗Farley, A. P. (2001b). Processing instruction and meaning-based output instruction: A
comparative study. Spanish Applied Linguistics, 5(2), 57–94.
Farley, A. P. (2004). The relative effects of processing instruction and meaning-based
output instruction. In B. VanPatten (Ed.), Processing instruction: Theory, research,
and commentary (pp. 143–168). Mahwah, NJ: Erlbaum.
∗Gass, S., & Torres, M. J. A. (2005). Attention when: An investigation of the ordering
effect of input and interaction. Studies in Second Language Acquisition, 27,
1–31.
Goldschneider, J., & DeKeyser, R. M. (2001). Explaining the “natural order of L2
morpheme acquisition” in English: A meta-analysis of multiple determinants.
Language Learning, 51, 1–50.
Hedges, L. V. (1994). Statistical considerations. In H. Cooper & L. V. Hedges (Eds.),
The handbook of research synthesis (pp. 29–38). New York: Russell Sage
Foundation.
Hodges, L. V., Humphris, G., & Macfarlane, G. (2005). A meta-analytic investigation
of the relationship between the psychological distress of cancer patients and their
carers. Social Science and Medicine, 60, 1–12.
Hunter, J., & Schmidt, F. (2004). Methods of meta-analysis. London: Sage.
∗Izumi, S. (2002). Output, input enhancement, and the noticing hypothesis: An
experimental study on ESL relativization. Studies in Second Language Acquisition,
24, 541–577.
Joe, A. (1998). What effects do text-based tasks promoting generation have on
incidental vocabulary acquisition? Applied Linguistics, 19, 357–377.
∗Koyanagi, K. (1999). Differential effects of focus on form vs. focus on forms. In T.
Fujimura, Y. Kato, & R. Smith (Eds.), Proceedings on the 10th Conference on
Second Language Research in Japan, 10 (pp. 1–31). Tokyo, Japan: International
University of Japan.
Krashen, S., & Terrell, T. (1983). The natural approach. Hayward, CA: Alemany Press.
∗Lee, J. F., & Benati, A. (2007a). Comparing modes of delivering processing
instruction and meaning-based output instruction on Italian and French subjunctive.
In J. F. Lee & A. Benati (Eds.), Delivering processing instruction in classrooms and
in virtual contexts: Research and practice (pp. 99–136). London: Equinox.
∗Lee, J. F., & Benati, A. (2007b). The effects of structured input activities on the
acquisition of two Japanese linguistic features. In J. F. Lee & A. Benati (Eds.),
Delivering processing instruction in classrooms and in virtual contexts: Research
and practice (pp. 49–71). London: Equinox.
∗Leeser, M. J. (2008). Pushed output, noticing, and development of past tense
morphology in content-based instruction. Canadian Modern Language Review, 65,
195–220.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis.
Language Learning, 60, 309–365.
Lightbown, P. M. (2008). Transfer appropriate processing as a model for classroom
second language acquisition. In Z. Han (Ed.), Understanding second language
process (pp. 27–44). Clevedon, UK: Multilingual Matters.
Lipsey, M., & Wilson, D. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
Littell, J. H., Corcoran, J., & Pillai, V. (2008). Systematic reviews and meta-analysis.
New York: Oxford University Press.
Long, M. H., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In
C. Doughty & J. Williams (Eds.), Focus on form in classroom second language
acquisition (pp. 15–40). Cambridge, UK: Cambridge University Press.
Lyster, R., & Saito, K. (2010). Oral feedback in classroom SLA. Studies in Second
Language Acquisition, 32, 265–302.
Mackey, A., & Goo, J. M. (2007). Interaction research in SLA: A meta-analysis and
research synthesis. In A. Mackey (Ed.), Input, interaction and corrective feedback
in L2 learning (pp. 379–452). Oxford, UK: Oxford University Press.
∗Morgan-Short, K., & Bowden, H. W. (2006). Processing instruction and meaningful
output-based instruction: Effects on second language development. Studies in
Second Language Acquisition, 28, 31–65.
∗Nagata, N. (1998a). Input vs. output practice in educational software for second
language acquisition. Language Learning and Technology, 1(2), 23–40.
∗Nagata, N. (1998b). The relative effectiveness of production and comprehension
practice in second language acquisition. Computer Assisted Language Learning, 11,
153.
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research
synthesis and quantitative meta-analysis. Language Learning, 50, 417–528.
Oswald, F., & Plonsky, L. (2010). Meta-analysis in second language research: Choices
and challenges. Annual Review of Applied Linguistics, 30, 85–110.
Ortega, L. (2010). Research synthesis. In B. Paltridge & A. Phakiti (Eds.), Companion
to research methods in applied linguistics (pp. 111–126). London: Continuum.
Patall, E., Cooper, H., & Robinson, J. (2008). The effects of choice on intrinsic
motivation and related outcomes: A meta-analysis of research findings.
Psychological Bulletin, 134, 270–300.
Pienemann, M. (1985). Learnability and syllabus construction. In K. Hyltenstam & M.
Pienemann (Eds.), Modelling and assessing second language acquisition
(pp. 23–75). Clevedon, UK: Multilingual Matters.
Pienemann, M. (1989). Is language teachable? Psycholinguistic experiments and
hypotheses. Applied Linguistics, 10, 52–79.
Plonsky, L. (2011). The effectiveness of second language strategy instruction: A
meta-analysis. Language Learning, 61, 993–1038.
∗Qin, J. (2008). The effect of processing instruction and dictogloss tasks on acquisition
of the English passive voice. Language Teaching Research, 12, 61–82.
Schulze, R. (2004). Meta-analysis: A comparison of approaches. Cambridge, MA:
Hogrefe & Huber.
∗Shintani, N., & Ellis, R. (2010). The incidental acquisition of English plural -s by
Japanese children in comprehension-based and production-based lessons. Studies in
Second Language Acquisition, 32, 607–637.
∗Song, M. J., & Suh, B. R. (2008). The effects of output task types on noticing and
learning of the English past counterfactual conditional. System, 36, 295–312.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input
and comprehensible output in its development. In S. Gass & C. Madden (Eds.),
Input in second language acquisition (pp. 235–256). Rowley, MA: Newbury House.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook
& B. Seidlhofer (Eds.), Principle and practice in applied linguistics: Studies in
honor of H.G. Widdowson (pp. 125–144). Oxford, UK: Oxford University Press.
∗Takimoto, M. (2009). The effects of input-based tasks on the development of
learners’ pragmatic proficiency. Applied Linguistics, 30, 1–25.
∗Tanaka, T. (2001). Comprehension and production practice in grammar instruction:
Does their combined use facilitate second language acquisition? JALT Journal, 23,
6–30.
∗Toth, P. D. (2006). Processing instruction and a role for output in second language
acquisition. Language Learning, 56, 319–385.
Ur, P. (1996). A course in language teaching: Practice and theory. Cambridge, UK:
Cambridge University Press.
VanPatten, B. (1996). Input processing and grammar instruction: Theory and research.
Norwood, NJ: Ablex.
VanPatten, B. (2002). Processing instruction: An update. Language Learning, 52,
755–803.
Supporting Information
Additional Supporting Information may be found in the online version of this
article at the publisher’s website: