Language Learning ISSN 0023-8333

SYSTEMATIC REVIEW ARTICLE


Comprehension-Based Versus
Production-Based Grammar Instruction:
A Meta-Analysis of Comparative Studies
Natsuko Shintani,ᵃ Shaofeng Li,ᵇ and Rod Ellisᵇ
ᵃNational Institute of Education, Nanyang Technological University and ᵇThe University of
Auckland

This article reports a meta-analysis of studies that investigated the relative effectiveness
of comprehension-based instruction (CBI) and production-based instruction (PBI). The
meta-analysis only included studies that featured a direct comparison of CBI and PBI in
order to ensure methodological and statistical robustness. A total of 35 research projects
in 30 published studies were retrieved. The studies were coded for three types of effect
sizes: comparative, absolute, and pre-to-post change. The comparative effect sizes were
used in a subsequent moderator analysis to test the impact of two moderator variables:
CBI with and without Processing Instruction and PBI involving text creation versus text
manipulation. The results showed that (1) overall, both types of instruction had large
effects on both receptive and productive knowledge; (2) for receptive knowledge, CBI
had a greater effect than PBI when the acquisition was measured within one week but the
difference diminished in the delayed tests (i.e., posttests administered between 1 week
and 75 days after the treatment); (3) for productive knowledge, CBI and PBI had similar
effects in short-term measurement but PBI was more effective in the delayed tests; and
(4) the initial advantage found for CBI was largely due to Processing Instruction. We
discuss the theoretical and pedagogical significance of these findings.

Keywords meta-analysis; comprehension-based instruction; production-based instruction;
grammar instruction; receptive and productive knowledge; processing instruction;
skills acquisition theory

Introduction
Traditionally in the field of language teaching, grammar instruction has em-
phasized the importance of learning through producing sentences in the target

We thank the anonymous reviewers and the editor of Language Learning for their detailed and
helpful suggestions for revising the draft of this article.
Correspondence concerning this article should be addressed to Natsuko Shintani, National
Institute of Education, Nanyang Technological University, 1 Nanyang Walk Singapore 637616.
E-mail: [email protected]

Language Learning 63:2, June 2013, pp. 296–329 296



© 2013 Language Learning Research Club, University of Michigan

DOI: 10.1111/lang.12001
Shintani, Li, and Ellis Meta-Analysis of CBI and PBI

language. This is true of grammar translation (e.g., first-to-second language
translation exercises), of the Audiolingual Method (which sought to develop
habits by having learners mechanically produce exemplars of specific structural
patterns), and present-practice-produce (PPP), which differs from other tradi-
tional methods mainly because it incorporates opportunities for free as well as
controlled production in the second language (L2). There are obvious reasons
for this emphasis on production. First, the goal of most language courses is to
enable students to speak and write in the L2 and it would seem self-evident
that this can best be achieved by having students speak and write. Second,
teachers frequently evaluate the success of their lessons in terms of student
participation and view some kind of participation as necessarily involving pro-
duction. Third, students’ production, when erroneous, provides opportunities
for corrective feedback and so-called errors are conceived as problems with
production not comprehension. Production-based instruction (PBI) remains
dominant in language teaching today. Ellis (2003), for example, analyzed the
learning activities in six grammar practice books and found that all six included
copious controlled production exercises while four of them also provided free
production activities. In contrast, only two provided any comprehension-based
activities.
However, there have been challenges to the pre-eminence of PBI in a variety
of proposals that can be characterized as comprehension-based instruction
(CBI). In the 1960s Asher (e.g., 1977) promoted Total Physical Response as a
comprehension-based method and conducted a number of studies that indicated
it was superior to traditional PBI methods. In 1981 Winitz published an
edited collection of papers on CBI, while in the same decade Krashen and
Terrell (1983) proposed their Natural Approach, which was premised on the
assumption that ability to produce in an L2 emerges only after learners have
acquired some language through comprehending input. VanPatten (1996) also
argued that the acquisition of grammatical features originates in input. However,
unlike Winitz and Krashen, he rejected the view that learning will always
occur “naturally,” as he put it (that is, automatically), if learners are exposed to
comprehensible input. VanPatten argued that what was needed was an approach
that focused learners’ attention on specific grammatical forms and the meanings
they realize.
There is, then, clearly a need to examine these differing pedagogic claims
by comparing the effects that CBI and PBI have on L2 learning. A comparison
of the two types of instruction will also make it possible to evaluate the claims
that the theories underpinning them make regarding the skill-specificity
of instruction, a key issue in second language acquisition (SLA) research that
is considered later in detail. As there are now a sufficient number of studies that
have conducted direct comparisons of CBI and PBI, a meta-analytical approach
is possible. Meta-analysis provides a quantitative synthetic review of previous
empirical research on a certain construct or clinical intervention. Since the
publication of Norris and Ortega’s (2000) seminal work on the effectiveness
of L2 instruction, a number of such analyses have been conducted (e.g., Gold-
schneider & DeKeyser, 2001; Li, 2010; Lyster & Saito, 2010; Mackey & Goo,
2007; Norris & Ortega, 2006; Plonsky, 2011). The comparative effectiveness of
CBI and PBI has yet to be addressed but is ripe for a new meta-analysis. Unlike
the previous meta-analyses of different types of instruction, our approach in the
meta-analysis we report in this article is unique in that we included only those
studies that conducted direct comparisons of the two types of instruction. We
inspected cumulative evidence gleaned from within-study comparisons of CBI
versus PBI by examining evidence from three types of effect sizes: comparative
(i.e., contrasting CBI and PBI directly), absolute (i.e., contrasting CBI or PBI
to a control group), and pre-to-post (i.e., contrasting performance by a CBI or
PBI treatment over time).
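
The three types of effect size can be illustrated with a minimal Python sketch. It assumes a standardized mean difference (Cohen's d with a pooled standard deviation) as the index. This is an assumption made for illustration only, since the formula actually used is a matter for the data analysis section. The function names and all group statistics below are hypothetical.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference (group 1 minus group 2)."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)

# Hypothetical statistics, each given as (mean, sd, n)
cbi_post = (14.0, 3.0, 20)
pbi_post = (12.5, 3.0, 20)
control_post = (9.5, 3.0, 20)
cbi_pre = (8.0, 3.0, 20)

comparative = cohens_d(*cbi_post, *pbi_post)      # CBI contrasted with PBI directly
absolute = cohens_d(*cbi_post, *control_post)     # CBI contrasted with a control group
pre_to_post = cohens_d(*cbi_post, *cbi_pre)       # CBI posttest contrasted with CBI pretest
```

In this sketch the pre-to-post index treats pretest and posttest scores as if they came from independent groups, which ignores the correlation between repeated measures; it is a simplification, not a recommendation.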

Background to the Meta-Analysis


Defining and Characterizing CBI and PBI
The essential difference between CBI and PBI rests on whether production
is or is not required and, therefore, how learners are expected to process the
grammatical features that are the target of the instruction. CBI does not require
production of the target features. It aims to teach them by embedding them in
input which (1) learners have to comprehend and which (2) induces learners’
conscious attention to specific forms and the meanings these encode. However,
CBI does not proscribe production; learners may voluntarily elect to try to
produce the target language. Nevertheless, learner production of the target
features is limited and may be nonexistent. In CBI, feedback is focused on
whether learners have successfully comprehended and processed the target
structure in the input. In contrast, PBI seeks to elicit the correct production of the
target features. A key feature of most PBI activities is corrective feedback (i.e.,
feedback directed at correcting any errors that learners make in production).
PBI also provides opportunities for comprehending input. Production activities
may provide learners with input containing the target feature as a stimulus
for triggering output. Also one learner’s output in PBI can serve as another
learner’s input. However, PBI differs from CBI in viewing production rather
than comprehension of the target feature as the source of acquisition.

Although it is possible to identify clear differences between CBI and PBI,
it is important to acknowledge that there is considerable variation in both.
Both types of instruction may or may not include explicit instruction (i.e., the
provision of information about the target features). Also, both types may in-
volve oral or written activities or a combination of both. In the case of CBI,
the input to be comprehended may be enriched or enhanced (i.e., the target
feature may be orthographically or phonologically made salient to the learn-
ers). While all CBI activities involve learners in demonstrating that they have
processed the target feature by means of a nonverbal or minimally verbal re-
sponse, the actual nature of this response differs considerably. In some studies
(e.g., Erlam, 2003), it involves error-identification; in others (e.g., Allen, 2000)
it involves answering multiple-choice questions. Many of the studies involved
what is known as structured input activities. These take the form of what R. Ellis
(1995) calls interpretation tasks—that is, tasks that consist of input containing
multiple examples of the target feature and that require learners to demonstrate
they have successfully processed the input by, for example, choosing which
picture matches the input stimulus. Interpretation tasks aim to help the learners
create a form-meaning mapping for the target feature. A specific rationale for
structured input activities has been developed by VanPatten (1996, 2002), who
proposed a type of instruction he called Processing Instruction (PI), which he
argued is particularly effective because of the psycholinguistic principles we
discuss in a later section. A key difference in PBI concerns whether the activ-
ities involve text manipulation or text creation (Ellis, 1997b). This distinction
applies equally to oral and written activities. Text-manipulation activities sup-
ply learners with sentences illustrating the target structure and require them
to operate on these in some way (e.g., by substituting one grammatical form
for another as in a pattern drill, filling in blanks, transforming sentences from
one pattern to another, or translating a sentence from the first language to the
L2). Examples of studies in our meta-analysis that employed text-manipulation
activities are DeKeyser and Sokalski (1996) and Lee and Benati (2007b). Text-
creation activities require students to compose their own sentences, that is,
they involve free production. Examples of studies in our meta-analysis that
included such activities are Gass and Torres (2005), who made use of jigsaw
and information-gap tasks, and Lee and Benati (2007a), who also included
meaning-based production tasks. The difference between these two types of
PBI activities, however, is continuous rather than dichotomous. That is, activ-
ities can involve degrees of text manipulation or text creation—as seen, for
example, in Ur’s (1996) description of different types of production-practice
activities.


Theoretical Issues
As noted in the Introduction, comparison of CBI and PBI provides a means
of addressing a key issue in SLA, namely the skill-specificity of language
instruction. Four different positions are possible: (1) CBI benefits receptive but
not productive L2 knowledge whereas PBI benefits productive knowledge but
not receptive knowledge; (2) PBI will prove superior to CBI in developing both
types of knowledge; (3) CBI will prove superior to PBI in developing both
types of knowledge; and (4) PBI will only benefit acquisition if it takes account
of the learners’ level of L2 development but no such constraint exists in the
case of CBI.
The first position is supported by skill-learning theory, which emphasizes
that the power law of practice leads to qualitative changes in the learner’s
knowledge system over time but only “in the basic cognitive mechanisms
used to execute the same task” (DeKeyser, 2007, p. 99). This is because the
kind of knowledge that characterizes the later stages of development (i.e., the
automatization stage) is highly specific and so does not transfer to tasks that
are dissimilar from those used to develop the knowledge. In other words, the
knowledge that results from comprehension-based instruction will be available
in receptive tasks while that which results from production-based tasks will be
available in productive tasks.
The second position receives support from theories that emphasize the
importance of production in L2 acquisition. Swain’s (1985) Output Hypothesis,
for example, proposes that advanced acquisition requires opportunities for
pushed output. Swain argued that comprehensible input alone was not sufficient
to ensure high levels of linguistic competence. She suggested that production
was beneficial when it pushed learners to produce messages that are concise
and socially appropriate. As Swain (1995) noted, “learners . . . can fake it,
so to speak, in comprehension, but they cannot do so in the same way as
production” (p. 127). It should be noted that the Output Hypothesis addresses
the role played by free, communicative production, not the kind of controlled
production resulting from grammar drills and exercises. That is, the Output
Hypothesis suggests that PBI consisting of text-creation activities is needed.
Theories of implicit learning (N. Ellis, 2005; Long & Robinson, 1998) also
emphasize the need for text-creation activities in conjunction with a focus
on form (i.e., drawing learners’ attention to linguistic problems that arise in
their communicative production). The importance attached to text-creation
activities in PBI is further supported by the Transfer Appropriate Learning
Hypothesis, which claims that “we can better remember what we have learned
if the cognitive processes that are active during learning are similar to those that
are active during retrieval” (Lightbown, 2008, p. 27). In other words, if the aim
is to develop the kind of knowledge required for meaningful communication,
it is necessary to engage learners in production involving text creation.
The third position is supported by models of L2 acquisition that propose
a single knowledge store that is drawn on for both comprehension and pro-
duction and that develops as a result of processing input. VanPatten (2007)
argued that input processing (defined as the process by which a form-meaning
connection is established) in conjunction with other cognitive processes (such
as restructuring) leads to changes in the learner’s internal grammar and these
will be subsequently manifest in both receptive and productive tasks. This
is based on the hypothesis that for acquisition to take place learners need
to overcome a number of processing principles that define their attentional
priorities and prevent them from paying attention to specific grammatical fea-
tures in the input. For example, “learners prefer processing ‘more meaningful’
morphology before ‘less’ or ‘nonmeaningful morphology’” (VanPatten, 1996,
p. 15) and so may pay attention to English plural -s but not third person -s. The
goal of instruction, therefore, should be to induce attention to those features
that learners overlook as a result of processing constraints. It should be noted,
however, that VanPatten (2002) argued that these principles do not apply to the
entirety of the grammar of a language but only to those specific features influ-
enced by the principles. Processing Instruction potentially involves a number
of different components (explicit instruction, processing strategy training, and
structured input), with some studies (e.g., VanPatten & Oikkenen, 1996) suggesting
that it is the last of these that is crucial. Several PI studies (e.g., Benati,
2005; VanPatten & Cadierno, 1993), based on VanPatten’s theory, lend support
to the superiority of comprehension-based instruction over production-based
instruction in developing both receptive and productive knowledge.
The fourth position is evident in Pienemann’s (1985) claims regarding the
learnability and teachability of grammatical structures. Pienemann proposed
that the ability to produce a specific grammatical structure depends on whether
the learner has mastered the specific processing operation that is required. The
Teachability Hypothesis predicts that “instruction can only promote language
acquisition if the interlanguage is close to the point when the structure to be
taught is acquired in the natural setting so that sufficient processing prerequi-
sites are developed” (Pienemann, 1985, p. 37). By instruction Pienemann is
referring to production-based activities. Pienemann acknowledged that the pro-
cesses involved in comprehension and production are different. Thus, he saw
no problem in exposing learners to grammatical structures that are beyond their
current level of development provided they are not required to produce them.
From this perspective, CBI avoids the danger of forcing learners to produce
a structure they are not ready for, which according to Pienemann (1989) can
actually delay L2 development.
The present study reports a meta-analysis of studies that have featured a
direct comparison of CBI and PBI. In order to carry out the meta-analysis,
we have treated these two types of instruction as macro instructional types.
We distinguished them by defining CBI as instruction that did not require
production (but did not prohibit it) and that directed learners’ attention to
processing the meaning of specific target features in the input, while PBI was
defined as instruction that required learners to produce the L2 target features
and directed attention at processing the target features correctly as output. As
noted above, each type of instruction can be implemented in a variety of ways.
In this broad approach we have followed the practice of previous meta-analyses,
which have been based on instructional distinctions of a similar macro nature
(e.g., Norris & Ortega’s [2000] macro variables of focus-on-form vs. focus-on-
forms and implicit vs. explicit instruction). We argue that the investigation of
CBI and PBI as macro instructional types is justified because of their importance
to both language pedagogy and SLA theory.

Research Questions
The present meta-analysis has two major aims. One is to investigate the relative
effects of CBI and PBI on learners’ acquisition of both receptive and
productive knowledge. To this end, two research questions were formulated:

RQ1: Is there any difference in the effects of CBI and PBI on the acquisition
of receptive knowledge of grammatical features?
RQ2: Is there any difference in the effects of CBI and PBI on the acquisition
of productive knowledge of grammatical features?

The results obtained for these questions provide a basis for evaluating the
different SLA theoretical positions outlined above. The second aim was to
investigate whether key differences in CBI and PBI affect the acquisition of
grammar features, leading to a third research question:

RQ3: What factors moderate the effects of CBI and PBI on L2 grammar
acquisition?

Two moderating factors were chosen for examination: (1) PI versus non-PI
in the case of CBI and (2) text-manipulation versus text-creation activities in
the case of PBI. The explanation for the choice of these two factors is presented
in the next section.

Method
Selection of Studies
The current meta-analysis did not include fugitive literature (e.g., unpublished
doctoral dissertations and conference presentations) due to the difficulty in-
volved in retrieving such materials consistently. The literature search started
with ten major SLA journals, in alphabetical order: Applied Linguistics, The
Canadian Modern Language Review, Computer Assisted Language Learning,
Language Awareness, Language Learning, Language Teaching Research, The
Modern Language Journal, Studies in Second Language Acquisition, System,
and TESOL Quarterly. The search was carried out by inspecting each jour-
nal using online resources or hard copies. Then, electronic databases such
as the Education Resources Information Center and the Linguistics and Language
Behavior Abstracts were used to extend the search. Some of the key
words used in the database search were “input,” “output,” “comprehension,”
“production,” “receptive,” “productive,” “comparative,” “comparison,” “class-
room,” and “grammar learning.” As a result, articles from a number of other
journals (Applied Psycholinguistics, Foreign Language Annals, Hispania, JALT
Journal, Second Language Research, RELC Journal, and Spanish Applied Lin-
guistics) and chapters from some books (Benati, 2009; Benati & Lee, 2008;
Lee & Benati, 2007a, 2007b; and VanPatten, 2004) were identified.
The following criteria were used to select studies for the meta-analysis.
Initially, studies that met criteria (1), (2), and (3) were selected. Then the
selected studies were further screened according to criteria (4), (5) and (6):
1. Only studies published in refereed journals or books between 1991 and
2010 were included, 2010 being the point where the data collection for
this study was completed. The year 1991 was set as the starting point
because, although debate about the relative effectiveness of CBI and PBI
had started much earlier, there were few empirical studies prior to 1991
and those that did exist—such as Asher’s (1977) Total Physical Response
experiments—did not report sufficient data for effect size calculation.
2. Only studies that featured a comparison of a CBI treatment with a PBI
treatment were selected. A CBI treatment was defined as comprising any
activity that required learners to comprehend oral or written L2 input
to achieve the goal of the activity but required only limited production
(e.g., “yes” or “no”) or no production at all. A PBI treatment was defined
as any activity that required learners to produce the target feature(s) in
either oral or written form. Studies that investigated the effects of either
CBI or PBI alone in comparison to some other type of instruction or
a control condition were excluded. This decision was taken because of
the difficulty of comparing the effects of CBI and PBI when effect sizes
are computed using baseline data derived from heterogeneous instructional
interventions. Also, the way control conditions were operationalized varied
across studies. In some studies, participants in the control group did not
receive any treatment; in others they received some placebo treatment
that was irrelevant to the target structure. Clearly effect sizes based on
experiment-control comparisons constitute a less accurate representation
of the comparative effects of CBI and PBI than effect sizes based on direct
contrasts between the two types of instruction. Also, the fact that there are
sufficient comparative studies makes it possible to conduct such a meta-
analysis. The statistical advantages of using comparative effect sizes will
be further discussed in the data analysis section.
3. Only studies that investigated grammar acquisition were included. Gram-
mar acquisition was operationalized as L2 morphology, syntax, or pragma-
linguistic features (e.g., Takimoto, 2009). There are a considerable number
of comparative studies investigating the effects of CBI and PBI on vocab-
ulary acquisition but, given the different processes involved in learning
grammar and vocabulary, this meta-analysis focuses only on the former.
4. Only studies that provided enough statistical information for computing
effect sizes were included. The detailed criteria for selecting studies that
allow for this are considered below. For example, Cadierno (1995) was
excluded because the study reported only mean test scores for each group
and the results of one-way analysis of variance, which were not sufficient
to estimate effect sizes.
5. When the same experimental study was reported in more than one published
article (e.g., Farley, 2001b, 2004), only one was selected for the meta-
analysis.
6. When a single study reported more than one experiment, each experiment
contributed an effect size in the meta-analysis. In other words, each exper-
iment was treated as a separate study (e.g., VanPatten & Wong, 2004).
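
The two-stage screening described in criteria (1) to (6) can be sketched in Python as follows. This is only an illustration of the logic: the `Report` record, its field names, and the four example reports are hypothetical and do not correspond to the actual studies screened.

```python
from dataclasses import dataclass

@dataclass
class Report:
    """Hypothetical record for one published report (field names are illustrative)."""
    year: int
    refereed: bool                 # criterion 1
    compares_cbi_with_pbi: bool    # criterion 2
    targets_grammar: bool          # criterion 3
    has_effect_size_data: bool     # criterion 4
    duplicate_of_included: bool    # criterion 5
    n_experiments: int = 1         # criterion 6

def passes_first_screen(r: Report) -> bool:
    """Criteria (1) to (3): refereed, published 1991-2010, direct CBI-PBI comparison, grammar."""
    return (r.refereed and 1991 <= r.year <= 2010
            and r.compares_cbi_with_pbi and r.targets_grammar)

def passes_second_screen(r: Report) -> bool:
    """Criteria (4) and (5): enough statistics reported, not a duplicate report."""
    return r.has_effect_size_data and not r.duplicate_of_included

reports = [
    Report(2004, True, True, True, True, False, n_experiments=2),
    Report(1995, True, True, True, False, False),   # insufficient statistics
    Report(1977, True, False, True, False, False),  # too early, no direct comparison
    Report(2008, True, True, True, True, False),
]

included = [r for r in reports if passes_first_screen(r) and passes_second_screen(r)]
# Criterion 6: each experiment in an included report counts as a separate study.
n_experiments = sum(r.n_experiments for r in included)
```

With these made-up records, two reports survive both screens and contribute three experiments, which mirrors how a smaller number of published studies can yield a larger number of research experiments.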

As a result, 35 research experiments reported in 30 published studies were
included in the meta-analysis.¹ Detailed information about the instruction,
target features, and type of outcome measurement of each study is reported in
Appendix S1 in the online Supporting Information for this article.


Study Coding: Independent, Dependent, and Moderator Variables


The 35 research experiments were coded for independent, dependent, and
moderator variables, as follows.
In terms of independent variables, the experimental and comparison groups
in each research experiment were coded as either CBI or PBI based on the
description of the instructional treatment and the definitions of these two types
of instruction. Some studies involved more than one CBI or PBI group. For
example, Koyanagi (1999) involved one CBI group (the Input Group) and two
PBI groups (the Output and Drill Groups). In such cases, separate effect sizes
for each group comparison were calculated.
The dependent variables were receptive and productive L2 knowledge. They
were measured in terms of the learners’ test scores following the treatment as
these were reported in the studies. The tests in each experiment were classified
as either receptive or productive. A receptive test was a test that required
the learners to comprehend oral or written L2 input but did not require the
production of the target features. A productive test was a test that required
the learners to produce the target L2 feature in oral or written form. This
coding conformed to how the tests were designated in the original studies (see
Appendix S1 in the online Supporting Information).
In order to ascertain the durability of the instructional effects, the individual
tests were further categorized as either an immediate posttest (if it was taken
within 7 days after the treatment) or a delayed posttest (if it was administered
8 days after the treatment or later). It should be noted that this classification
did not always accord with how the tests had been originally labeled in the
primary studies. For example, Song and Suh’s (2008) and Takimoto’s (2009)
so-called immediate posttests were actually conducted more than 1 week after
the treatment was concluded. They were thus coded as delayed for the purpose
of this meta-analysis. The timing of the outcome measures of each study is
shown in Appendix S1 in the Supporting Information online.
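
The timing rule can be stated compactly in code. The following Python sketch applies the 7-day cutoff described above; the function name is hypothetical, and the sketch assumes that timing is recorded as a whole number of days after the treatment.

```python
def classify_posttest(days_after_treatment: int) -> str:
    """Code a test by its timing, regardless of the label used in the primary study."""
    if days_after_treatment < 0:
        raise ValueError("a posttest cannot precede the treatment")
    # Within 7 days counts as immediate; 8 days or later counts as delayed.
    return "immediate" if days_after_treatment <= 7 else "delayed"
```

Under this rule, a test that a primary study labeled an immediate posttest but administered, say, 10 days after the treatment would be coded as delayed.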
As for moderator variables, during the process of systematically coding
the studies we considered a number of potentially interesting factors that have
figured in theoretical discussions of L2 instruction (e.g., the learners’ age, their
proficiency, type of target structure, classroom vs. computer-mediated settings,
and incidental vs. intentional learning). In Table 1 and in Appendix S2 in
the Supporting Information online, the coding for these variables is shown.
Unfortunately, it proved impossible to investigate many of these variables in
the meta-analysis because there were insufficient studies to make comparisons
possible or because some of the individual studies provided insufficient information.
Finally, we chose two moderating features, one relating to CBI and the
other to PBI, both of which figured widely in the studies and can be considered
instructional aspects of theoretical and pedagogical significance, as outlined in
the Introduction. Although these factors moderated just CBI or PBI, we decided
to include them as variables because of their potential impact on the efficacy
of the instruction.

Table 1 Methodological features of included studies

Aspects                      Subcategories               k

Context                      Foreign language           32
                             Second language             3
Target language              English                     9
                             French                      2
                             Japanese                    8
                             Spanish                    16
Instructional status         University                 24
                             Secondary                   6
                             Elementary                  1
                             Miscellaneous               4
Research setting             Face-to-face               31
                             Computer-mediated           4
Age                          Adult                      28
                             Adolescent                  6
                             Child                       1
Intentional/incidental       Intentional                29
                             Incidental                  6
Type of activities in CBI    Enriched/Enhanced input     4
                             Structured input           31
                               PI studies               18
                               Non-PI studies           13
Type of activities in PBI    Text-creation              14
                             Text-manipulation          21

Moderator variable 1 was whether or not the CBI involved PI. That is, we
decided to investigate whether or not the CBI studies that included structured
input involved PI in ways that were faithful to VanPatten’s (1996) definition.
Many of the CBI studies involved PI and we thought this might be a factor in the
efficacy of the instruction. We determined this variable in the following way. We
first identified those CBI studies that included a structured input activity (i.e., an
activity that requires learners to make form-meaning connections for the target
grammatical features in oral or written input). The instructional treatment was
then coded as “PI” if the researcher(s) claimed that the experimental treatment
involved PI. However, some studies that claimed their treatment constituted
PI failed to satisfy VanPatten’s criteria for PI (as discussed earlier), and these
cases were coded as “non-PI.” These studies were Allen (2000), DeKeyser and
Sokalski (1996) (see Wong, 2004), and Collentine (1998) (see Farley, 2001b).
Those structured input studies that did not claim to be PI were also coded as
“non-PI.”
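
The coding decision for moderator variable 1 can be summarized as a small decision function. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and returning `None` for treatments without a structured input activity is our reading of the procedure rather than something stated explicitly.

```python
from typing import Optional

def code_pi_moderator(has_structured_input: bool,
                      claims_pi: bool,
                      meets_pi_criteria: bool) -> Optional[str]:
    """Code a CBI treatment as 'PI' or 'non-PI' following the steps described above."""
    if not has_structured_input:
        # Assumption: the moderator is coded only for structured-input treatments.
        return None
    if claims_pi and meets_pi_criteria:
        return "PI"
    # Either the study did not claim to be PI, or it claimed PI
    # but did not satisfy VanPatten's criteria.
    return "non-PI"
```

A study that claimed PI but did not satisfy the criteria would thus be coded via `code_pi_moderator(True, True, False)`, which yields "non-PI".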
Moderator variable 2 was whether PBI involves text creation or text ma-
nipulation. This variable was included because, as discussed above, whether
production practice involves text creation or text manipulation can impact on
L2 acquisition. A study was coded as text creation if the PBI treatment required
students to compose their own sentences at any stage of the instruction. For
example, Toth (2006) was coded as text creation as the PBI group engaged in
communicative output tasks involving summarizing, comparing and contrast-
ing pictures or narrating personal stories of recent “unplanned occurrences”
(p. 339). Although dictogloss tasks are designed to direct learners’ attention
to form by requiring them to reconstruct sentences accurately, they were
coded as text creation because completing them involves communicative L2
interaction (e.g., Izumi, 2002; Leeser, 2008; Qin, 2008;
VanPatten, Inclezan, Salazar, & Farley, 2009). On the other hand, a study was
coded as text manipulation if the PBI involved only controlled production of the
L2 such as substituting one grammatical form for another, filling in a blank, or
transforming sentences from one pattern to another. Although some researchers
described their PBI as meaning-oriented instruction, a closer examination of
their instructional procedures and materials showed that the studies only in-
volved controlled production of the L2 (e.g., Lee & Benati, 2007a, 2007b).
These studies were coded as “text manipulation.”

Reliability of the Coding of the Studies


The bulk of the data was coded by the first author, who is a trained SLA
researcher and experienced language teacher and who is familiar with the theo-
retical and pedagogical aspects of CBI and PBI. The development of the coding
scheme was an evolving, recursive process that involved repeated revisions. The
coding scheme was established by means of extensive and in-depth discussions
among the authors based on the methodological information reported by primary
researchers, the related theories, and other meta-analyses in SLA. Once the
authors had agreed on a complete scheme, a subset of 12 studies (over
one-third of the total 35 studies) was coded by the second author. The overall
agreement rate between the two raters reached 94.5%, and the disparities were
resolved through discussion. A final round of coding was then performed with
special attention to the disputed aspects of the coding scheme.

Data Analysis and Methodological Decisions


An effect size may take various forms depending upon the construct to be
investigated. This meta-analysis examines the effectiveness of two types of in-
struction, and thus the effect size index relates to mean differences. Standard-
ized mean difference is calculated by dividing the mean difference between two
groups by an estimate of its variation (e.g., the pooled standard deviation). In
this study, Cohen’s d was used as the index of effect size estimate.
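
The calculation just described can be sketched as follows. This is a minimal illustration, assuming the pooled standard deviation is used as the denominator; the function name `cohens_d` and the summary statistics are hypothetical and are not taken from any of the included studies.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (group A minus group B) with a pooled SD."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Hypothetical posttest scores: group A (M = 14, SD = 4, n = 20)
# versus group B (M = 12, SD = 4, n = 20).
d = cohens_d(14.0, 4.0, 20, 12.0, 4.0, 20)
print(round(d, 2))  # 0.5
```

With equal standard deviations the pooled SD equals the common SD, so a 2-point difference on a scale with SD = 4 yields d = 0.5.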
Most SLA meta-analyses have followed Cohen’s (1988) benchmarks in in-
terpreting the magnitude of effect size: 0.80 constitutes a large effect, 0.50 a
medium effect, and 0.20 a small effect. However, Cohen’s reference criteria are
controversial and experts of meta-analysis have cogently argued for the consid-
eration of field specificity in interpreting effect size, that is, understanding it
within the context of the particular research domain (Ortega, 2010; Oswald &
Plonsky, 2010; see also Cumming, 2012). Oswald and Plonsky, based on their
synthesis of 27 published and unpublished meta-analyses in SLA, found that the
average of the mean effect sizes of these meta-analyses was 0.68 (confidence
interval [CI]: 0.53, 0.83) for experiment-control comparisons and 0.95 (CI:
0.65, 1.25) for pretest-posttest gains. Based on this, the authors recommended
as a set of “SLA standards for effect sizes” (p. 99) that 0.4 be considered a small
effect, 0.70 medium, and 1.0 a large effect. This meta-analysis used Oswald
and Plonsky’s criteria in interpreting d values for between-group effects. How-
ever, because pre-to-post effects (pretest-posttest gains; with the lower bound
of the CI being over 0.60) are typically larger than between-group effects, we
consider it necessary to set higher standards for the former, with 0.50 repre-
senting a small effect, 0.80 a medium effect, and 1.10 a large effect. Effect
sizes were also interpreted with reference to the 95% CI and related p values.
A 95% CI suggests that, if the same study were repeated many times, about 95%
of the intervals so constructed would contain the true effect. A narrow CI indicates a robust
finding. If a CI crosses zero and the p value is above .05, the null hypothesis
(i.e., that the effect of a treatment or intervention is not different from
zero) cannot be rejected. However, a zero-crossing or nonsignificant CI should not be
interpreted as lack of any effect: “If CIs also cover effect size . . . values that are
clinically significant, we cannot rule out the possibility that there is a clinically
significant average effect that could not be detected in the analysis” (Littell,
Corcoran, & Pillai, 2008, p. 135).
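
The interpretive rules in this paragraph can be sketched in a few lines. This is an illustration only: the function names are hypothetical, treating each benchmark as a lower cut-off is an assumption (the benchmarks are reference points rather than strict boundaries), the label "negligible" for values below the small benchmark is likewise an assumption, and the CI uses the normal approximation d ± 1.96 × SE.

```python
def magnitude(d, within_group=False):
    """Label |d| using the between-group or the stricter within-group benchmarks."""
    small, medium, large = (0.50, 0.80, 1.10) if within_group else (0.40, 0.70, 1.00)
    size = abs(d)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "negligible"


def ci95(d, se):
    """Approximate 95% confidence interval around a mean effect size."""
    half_width = 1.96 * se
    return d - half_width, d + half_width


def crosses_zero(lower, upper):
    """True when the interval includes zero (null hypothesis not rejected)."""
    return lower <= 0.0 <= upper


# Hypothetical inputs: a between-group effect and a within-group effect.
print(magnitude(1.09))                     # large
print(magnitude(1.01, within_group=True))  # medium
print(crosses_zero(*ci95(0.25, 0.16)))     # True
```

The same d value can therefore receive different labels depending on whether it is a between-group or a pre-to-post effect, which is the point of setting separate standards.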

As with most other meta-analyses, this meta-analysis follows a two-step
procedure: Effect size aggregation followed by moderator analysis. Effect size
aggregation generates an average effect size that indicates the overall effect
of an instruction type or treatment. The purpose of moderator analysis is to
identify variables that potentially moderate the effect of the treatment and that
may account for the between-study variability among the effect sizes. Moderator
analyses were conducted by using an analogue to analysis of variance, which
“partitions the total homogeneity statistic, Q, into the portion explained by the
categorical variable (Qb[etween]) and the residual pooled within groups portion
(Qw[ithin])” (Lipsey & Wilson, 2001, p. 121). A significant Qb would mean that
the moderator variable accounts for a significant portion of the variance of
the effects of the treatment or intervention. All analyses were conducted using
Comprehensive Meta-Analysis (Borenstein, Hedges, Higgins, & Rothstein,
2006) professional software.
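
The between-groups portion of Q can be illustrated with a short sketch. It is not the routine implemented in the software used here; it assumes that each subgroup is summarized by a mean effect size and its standard error, and that the weights are the inverse squared standard errors. The function names are hypothetical. The example values are the rounded receptive PI and non-PI subgroup statistics reported in Table 5, so the result only approximates the reported Qb of 48.42.

```python
import math


def q_between(means, ses):
    """Weighted sum of squared deviations of subgroup means from the grand mean."""
    weights = [1.0 / se**2 for se in ses]
    grand_mean = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    return sum(w * (m - grand_mean) ** 2 for w, m in zip(weights, means))


def p_value_df1(q):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


# Two subgroups (PI, non-PI), hence df = 2 - 1 = 1.
q = q_between([2.14, -0.28], [0.33, 0.11])
print(round(q, 1))            # 48.4
print(p_value_df1(q) < 0.05)  # True
```

For two subgroups the statistic reduces to the squared difference between the means divided by the sum of their squared standard errors.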
Three categories of effect sizes were computed in this meta-analysis: com-
parative effect sizes, absolute effect sizes, and pre-to-post effect sizes. A com-
parative effect size represents the effect of CBI in comparison with PBI (i.e.,
as measured in a direct comparison). In order to make the results comparable
across included studies, CBI data were consistently coded as “experimental”
data, and PBI data as “baseline” data. Thus a significant comparative effect
size d = 0.3 would suggest that CBI is more effective than PBI by 0.3 standard
deviation units. Conversely, a significant effect size in the negative direction
d = −0.3 would imply that PBI has a larger effect than CBI and the effect
differs by 0.3 standard deviation units. An absolute effect size shows the effect
of CBI or PBI in relation to a control condition (in the event of studies that
included a control group as well as the experimental groups). This group of
effect sizes were based on the contrasts between CBI or PBI, on the one hand,
and no instruction or instruction that did not relate to the target structure (in
either case usually called control in the primary studies) on the other hand. A
pre-to-post effect size reflects the effect of an instruction type as a result of the
treatment; it involves the contrast between pretest and posttest scores by the
same (CBI or PBI) group.
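
The three contrasts can be made concrete with a small sketch. All numbers are hypothetical and chosen only to show the sign convention. As a simplifying assumption, the pre-to-post effect is computed here with the pooled pretest and posttest standard deviation, ignoring the correlation between the two sets of scores; the source does not specify the exact formula.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference (first group minus second) with a pooled SD."""
    pooled_var = ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)


# Hypothetical summary statistics as (mean, SD, n).
cbi_pre, cbi_post = (9.0, 4.0, 20), (15.0, 4.0, 20)
pbi_post = (13.0, 4.0, 20)
control_post = (10.0, 4.0, 20)

# Comparative: CBI coded as "experimental", PBI as "baseline".
comparative = cohens_d(*cbi_post, *pbi_post)
# Absolute: an instructed group against the control group.
absolute_cbi = cohens_d(*cbi_post, *control_post)
# Pre-to-post: posttest against pretest for the same group.
pre_to_post_cbi = cohens_d(*cbi_post, *cbi_pre)

print(comparative, absolute_cbi, pre_to_post_cbi)  # 0.5 1.25 1.5
```

Because CBI is always entered first, a positive comparative value favors CBI and a negative value favors PBI.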
Notwithstanding the fact that three types of effect sizes were computed to
obtain an initial, holistic picture of the effects of the two instruction types, it is
the comparative effect sizes that were used in subsequent moderator analyses.
This decision reflects the primary objective of this meta-analysis, that is, exam-
ining the comparative effects of CBI and PBI. Using comparative effect sizes
made it possible to minimize the statistical analysis needed (i.e., extra group
comparisons) and reduce Type I error rate. Absolute or pre-to-post effect sizes
are not associated with direct contrasts between CBI and PBI, and comparing
the two in terms of their effectiveness would involve two steps: aggregating
effect sizes related to treatment–control/pre–post contrasts followed by
between-group comparisons. Also, analyses based on comparative effect sizes
involved all the studies included in the meta-analysis whereas analyses based
on absolute or pre-to-post effect sizes necessarily precluded studies without a
control group or without pretest data.
One issue meta-analysts have to face is sample size inflation, that is, in-
cluding multiple effect sizes from a single study in the analysis. Lipsey and
Wilson (2001) rightly pointed out that the inclusion of more than one ef-
fect size in the same analysis violates the assumption of independence of
data points and “can render the statistical results highly suspect” (p. 105).
To minimize the violation of this assumption and at the same time retain
as much data as possible, a shifting unit of analysis was adopted (Patall,
Cooper, & Robinson, 2008). Initially, effect sizes from each study were com-
puted separately. However, when effect sizes were aggregated to obtain overall
effects, each study contributed a single effect size, which was the average of
multiple effect sizes if a study generated more than one effect size. When
moderator analyses were performed, there was a possibility that more than
one effect size was involved in one analysis. However, in each category of
an independent variable, only one effect size was included from each study.
For instance, in the event that both a receptive and a productive test
were used in a study, the two corresponding effect sizes were included in the
analysis related to the effect of the outcome measure. However, within each
of the two categories (receptive and productive), only one effect size was
included.
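
The shifting unit of analysis can be sketched as two averaging steps. The records, study labels, and function names below are hypothetical; the sketch assumes a simple unweighted average within each study or within each study-by-category cell.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (study, outcome category, effect size d).
effects = [
    ("Study A", "receptive", 1.2),
    ("Study A", "receptive", 0.8),
    ("Study A", "productive", 0.1),
    ("Study B", "receptive", 0.5),
    ("Study B", "productive", -0.3),
    ("Study B", "productive", -0.1),
]


def one_per_study(records):
    """Overall aggregation: each study contributes the average of all its effect sizes."""
    by_study = defaultdict(list)
    for study, _category, d in records:
        by_study[study].append(d)
    return {study: mean(ds) for study, ds in by_study.items()}


def one_per_study_and_category(records):
    """Moderator analysis: each study contributes one averaged effect size per category."""
    cells = defaultdict(list)
    for study, category, d in records:
        cells[(study, category)].append(d)
    return {cell: mean(ds) for cell, ds in cells.items()}


print(one_per_study(effects))
print(one_per_study_and_category(effects))
```

In the overall aggregation Study A contributes a single value, whereas in the analysis by outcome measure it contributes one receptive and one productive value.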
Another issue relates to outliers. There are three options for dealing with
outliers: (1) including them intact, (2) recoding them to less extreme values
(e.g., using the technique known as winsorizing), and (3) removing them.
In this meta-analysis, outliers were excluded from all analyses in order to
ensure the robustness of the results. Outliers were detected by examining the
standardized values of effect sizes (z scores): Any value that was more than 2.5
standard deviation units from the mean effect size was considered an outlier.
For each set of effect sizes, outlier identification was performed repeatedly until
there were no further outliers. It should be noted that outliers are contingent,
that is, an outlier in one set of effect sizes is not necessarily an outlier in
another set. Information about the outliers is provided in Appendix S2 of
the online Supporting Information, where a list of included studies and their
methodological features can be found.
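
The repeated outlier screening can be sketched as a loop. This is an illustration under stated assumptions: z scores are computed from the unweighted mean and the sample standard deviation of the current set (the source does not specify whether weighted statistics were used), and the function name and data are hypothetical.

```python
from statistics import mean, stdev


def remove_outliers(values, cutoff=2.5):
    """Drop values whose |z| exceeds the cutoff, repeating until none remain."""
    kept = list(values)
    removed = []
    while len(kept) > 2:
        m, s = mean(kept), stdev(kept)
        if s == 0:
            break
        flagged = [v for v in kept if abs(v - m) / s > cutoff]
        if not flagged:
            break
        removed.extend(flagged)
        kept = [v for v in kept if abs(v - m) / s <= cutoff]
    return kept, removed


# Hypothetical set of effect sizes containing one extreme value.
ds = [0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9, 1.0, 6.0]
kept, removed = remove_outliers(ds)
print(removed)    # [6.0]
print(len(kept))  # 11
```

Recomputing the mean and standard deviation after each pass matters because an extreme value inflates the standard deviation and can mask further outliers in the first pass.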

A classical debate among meta-analysts concerns whether to adopt a fixed-effect
model or a random-effects model in effect size aggregation (Borenstein,
Hedges, Higgins, & Rothstein, 2009; Hunter & Schmidt, 2004; Lipsey &
Wilson, 2001; Schulze, 2004). A fixed-effect model assumes that the population
mean effect size is the same in all included studies and that the variation among
effect sizes is attributable only to sampling error. A random-effects model
allows the population effect to vary among studies, so that effect size variation
results from both within-study and between-study variability. While fixed-effect
models were once the default, recent meta-analyses (e.g., Cross, Copping, &
Campbell, 2011) have started to adopt random-effects models because of the
potentially inflated Type I error rates and misleadingly narrow confidence
intervals associated with the former (Hedges, 1994; Hunter & Schmidt, 2004).
The Q-statistic (within-group) is sometimes used to determine which model is
appropriate (Hodges, Humphris, & Macfarlane, 2005; Schulze, 2004). A significant
Q indicates a heterogeneous distribution, in which case a random-effects model
is warranted. However, because the statistical power of homogeneity tests is
typically low (Oswald & Plonsky, 2010), it is recommended that model selection
be based on the meta-analyst’s observation of the collected data and
understanding of whether there is a common effect size among the identified
samples (Borenstein et al., 2009). In this meta-analysis, a random-effects model
was used in effect size aggregation because of the methodological heterogeneity
observed among the included studies and the assumption that the population mean
effect size varied across studies. In addition, the Q values were significant for
nearly all aggregated effect sizes, including the immediate comparative effects
on which all subsequent moderator analyses were based.
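
The contrast between the two models can be sketched as follows. The random-effects part uses the DerSimonian–Laird estimate of between-study variance, which is an assumption made for illustration; the source does not state which estimator was applied. Function names and data are hypothetical.

```python
import math


def fixed_effect(ds, variances):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / v for v in variances]
    mean_d = sum(w * d for w, d in zip(weights, ds)) / sum(weights)
    return mean_d, math.sqrt(1.0 / sum(weights))


def q_statistic(ds, variances):
    """Homogeneity statistic Q around the fixed-effect mean."""
    mean_d, _ = fixed_effect(ds, variances)
    return sum((d - mean_d) ** 2 / v for d, v in zip(ds, variances))


def random_effects(ds, variances):
    """Random-effects mean, SE, and tau^2 (DerSimonian-Laird estimate)."""
    weights = [1.0 / v for v in variances]
    q = q_statistic(ds, variances)
    df = len(ds) - 1
    c = sum(weights) - sum(w**2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)
    # Between-study variance is added to every study's sampling variance.
    star = [1.0 / (v + tau2) for v in variances]
    mean_d = sum(w * d for w, d in zip(star, ds)) / sum(star)
    return mean_d, math.sqrt(1.0 / sum(star)), tau2


# Hypothetical heterogeneous set of effect sizes and sampling variances.
ds = [0.2, 0.5, 1.4, 2.0]
variances = [0.04, 0.05, 0.06, 0.05]
fe_mean, fe_se = fixed_effect(ds, variances)
re_mean, re_se, tau2 = random_effects(ds, variances)
print(re_se > fe_se)  # True
```

When Q exceeds its degrees of freedom, the estimated between-study variance is positive and the random-effects standard error is larger, which widens the confidence interval relative to the fixed-effect result.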

Results
Descriptive Results
A total of 35 comparative studies investigating the differential effects of CBI
and PBI on SLA were retrieved. These studies were published between 1991
and 2010. As shown in Figure 1, there has been a rapid growth in the amount
of research on the topic under investigation (from 1 during the first 5 years to
15 over the last 5-year period). Altogether 276 effect sizes were generated from
these studies involving 7,700 codes relating to the dependent and independent
variables as well as methodological features. Out of the total effect sizes, 141
were computed based on productive measures, and 135 on receptive measures.
Also, 191 represented immediate effects and 85 related to delayed effects (i.e.,
longer than 1 week following the instruction).


Figure 1 Publication frequency of comparative studies of CBI versus PBI.

A trim-and-fill analysis² was performed for the receptive and productive
effect sizes (immediate effects) to explore publication bias. It was found that,
with respect to the receptive effect sizes, two values on the right side of the mean
effect were missing. Imputing them would change the mean effect size from
d = 1.09, CI = 0.64–1.55 to d = 1.27, CI = 0.79–1.76. For the productive
effect sizes, one hypothetical effect size was imputed on the right side of the
mean effect size (d = −0.05, CI = −0.21–0.11), and the adjusted effect size was
d = −0.03, CI = −0.19–0.13. These results suggest that, in general, the
extracted effect sizes are normally distributed and that the retrieved studies
constitute a reliable representation of comparative studies on the effects of CBI
and PBI.³
Table 1 shows a summary of the methodological features of the included
studies. As shown, 33 out of the 35 studies were conducted in foreign language
contexts and only two in second language contexts. Among the target languages
involved, Spanish was the most frequently studied, followed by English,
Japanese, and French. In 24 of the studies the participants were undergraduate
students enrolled in university foreign language classes, 6 targeted students at
secondary schools, and only 1 investigated elementary school students. Twenty-
nine studies were conducted in classroom settings and only six studies were
based in a laboratory setting. As to the distribution of age groups among the L2
population, 28 investigated adult learners, 6 studied adolescent learners, and
1 examined child learners. The majority of studies (n = 29) involved inten-
tional learning, and only 6 involved incidental learning. Most studies (n = 31)
involved at least one CBI treatment with structured input and only 4 studies
involved enriched or enhanced input. Of the 31 structured input studies, 18 were
PI studies. Learners engaged in text creation in the PBI treatments that featured
in 14 studies and text manipulation in 24 studies (three studies involved both
text creation and text manipulation in different PBI treatments).

Table 2 Overall results: Comparative effects

                                                    95% CI          Heterogeneity
Variables               k    Mean d   p      SE     Lower   Upper   Q        p
Receptive
  Immediate             31   1.09     .00∗   0.23   0.64    1.55    323.43   .00∗
  Delayed               15   0.25     .11    0.16   −0.06   0.55    51.24    .00∗
Productive
  Immediate             32   −0.08    .29    0.07   −0.23   0.07    48.28    .02∗
  Delayed               15   −0.21    .03∗   0.09   −0.39   −0.02   19.29    .15

∗Statistically significant at p < .05.

Meta-Analytic Results
This section first reports the overall results on the effectiveness of the two types of
instruction as reflected by the three categories of effect sizes (i.e., comparative,
absolute, and pre-to-post effect sizes) associated with receptive and productive
measures. This is followed by the results obtained for the moderator analysis.
The overall results for the comparative effects of CBI and PBI appear in
Table 2, which shows the number of contrasts (k), the mean effect size (and the
related p value, standard error, and CI), and the heterogeneity (Q) test results.
Recall that comparative effect sizes were computed based on the mean differences
between CBI and PBI; a positive effect size indicates a superior effect for CBI
and a negative effect size shows a better effect for PBI. For receptive knowl-
edge, CBI was more effective than PBI on the immediate posttests, d = 1.09,
p = .00, but the difference was no longer statistically significant on the posttests
administered more than one week following the instruction and the effect size
at this time was only d = 0.25. For productive knowledge, the initial effects
of the two types of instruction were similar, but PBI showed a significant but
small mean effect advantage over CBI on the delayed posttests of d = −0.21,
p = .03.
The results for the absolute effect sizes are displayed in Table 3. Absolute
effect sizes for CBI and PBI are based on the contrasts between the two types of
instruction and the control conditions. As shown, both CBI and PBI significantly
outperformed the control groups with large effect sizes on all measures. Q
statistics demonstrated that the immediate effects of CBI were significantly
larger than those of PBI in terms of receptive knowledge but otherwise no
significant differences were found.

Table 3 Overall results: Absolute effects

                                                    95% CI          Group contrast
Variables               k    Mean d   p      SE     Lower   Upper   Qb      p
Receptive
  Immediate                                                         5.12    .02∗
    CBI                 20   1.96     .00∗   0.29   1.38    2.54
    PBI                 21   1.13     .00∗   0.22   0.69    1.56
  Delayed                                                           1.01    .32
    CBI                 9    1.58     .00∗   0.29   0.99    2.16
    PBI                 9    1.12     .00∗   0.34   0.45    1.79
Productive
  Immediate                                                         0.07    .79
    CBI                 22   1.23     .00∗   0.22   0.81    1.66
    PBI                 22   1.32     .00∗   0.23   0.86    1.77
  Delayed                                                           0.12    .73
    CBI                 8    0.91     .00∗   0.19   0.54    1.29
    PBI                 8    1.02     .00∗   0.26   0.52    1.53

∗Statistically significant at p < .05.

Table 4 shows the pre-to-post effect sizes. The results showed that both CBI
and PBI led to significant improvement in the learners’ posttest scores compared
with their pretest scores. CBI was more effective than PBI on receptive measures
(Qb = 21.09, p = .00 for immediate effects; Qb = 5.04, p = .02 for delayed
effects) but there was no significant difference between them on productive
measures.

Table 4 Overall results: Pre-to-post effects

                                                    95% CI          Group contrast
Variables               k    Mean d   p      SE     Lower   Upper   Qb      p
Receptive
  Immediate                                                         21.09   .00∗
    CBI                 26   2.36     .00∗   0.26   1.85    2.88
    PBI                 27   1.01     .00∗   0.13   0.76    1.27
  Delayed                                                           5.04    .02∗
    CBI                 13   1.70     .00∗   0.26   1.19    2.22
    PBI                 12   1.02     .00∗   0.15   0.73    1.32
Productive
  Immediate                                                         0.35    .55
    CBI                 29   1.66     .00∗   0.19   1.29    2.03
    PBI                 30   1.83     .00∗   0.22   1.40    2.27
  Delayed                                                           0.12    .72
    CBI                 12   1.23     .00∗   0.21   0.83    1.67
    PBI                 12   1.36     .00∗   0.22   0.93    1.79

∗Statistically significant at p < .05.

Table 5 summarizes the results of the moderator analysis. The results for
each Q test indicate whether the variable is a significant moderator or whether
the two groups of effect sizes for that variable are significantly different. The
moderator analyses are based on the comparative effect sizes associated with
the immediate posttests.

Table 5 Moderator analysis

                                                    95% CI          Group contrast
Variables               k    Mean d   p      SE     Lower   Upper   Qb      p
CBI–PI/non-PI
  Receptive                                                         48.42   .00∗
    PI                  18   2.14     .00∗   0.33   1.49    2.79
    Non-PI              10   −0.28    .01∗   0.11   −0.50   −0.07
  Productive                                                        0.96    .33
    PI                  18   0.05     .55    0.09   −0.12   0.23
    Non-PI              12   −0.14    .43    0.17   −0.48   0.20
PBI–text-creation/text-manipulation
  Receptive                                                         2.74    .09
    Text-creation       11   0.69     .04∗   0.35   0.02    1.37
    Text-manipulation   23   1.48     .00∗   0.32   0.85    2.12
  Productive                                                        1.47    .23
    Text-creation       13   −0.15    .08    0.09   −0.32   0.02
    Text-manipulation   23   0.03     .81    0.12   −0.20   0.26

∗Statistically significant at p < .05.

With respect to the first moderator variable, CBI consisting of structured
input in a PI study resulted in significantly greater receptive learning than PBI but
there was no difference between this type of CBI and PBI in productive learning.
In contrast, CBI consisting of structured input in a non-PI study resulted
in a significantly lower level of receptive learning than PBI with no difference
between the two types of instruction in the case of productive learning. The
Qb values also showed that the involvement of PI had a significant impact on
the comparative effectiveness of CBI and PBI regarding the development of
receptive knowledge (Qb = 48.42, p = .00) but not of productive knowledge
(Qb = .96, p = .33).
Analysis of the second moderator variable yielded the following results.
CBI was significantly more effective than PBI in terms of receptive knowledge
both when PBI involved text creation (d = 0.69, CI = 0.02–1.37, p = .04) and
when it involved text manipulation (d = 1.48, CI = 0.85–2.12, p = .00). There
was no significant difference between CBI and PBI on productive tests. The
different levels of the effect sizes for text creation (medium) and text
manipulation (large) suggest that CBI had numerically larger effects when the
PBI involved text manipulation than when it involved text creation. However,
the nonsignificant Qb value (Qb = 2.74, p = .09) indicated that this variable
did not moderate the comparative effectiveness of CBI and PBI.

Discussion
Before we summarize and discuss the findings of the present meta-analysis, we
would like to recognize the wide variety of ways in which CBI and PBI were
operationalized across studies and to comment on the validity of synthesizing
CBI and PBI effects as macro instructional types.
We identified 35 research experiments in 30 published studies that had
compared CBI and PBI. However, in the process of the selection, we noted that
the studies varied considerably in how they provided learners opportunities for
production in CBI and opportunities for comprehension in PBI (see Appendix
S1 in the online Supporting Information). Some studies (e.g., Shintani & Ellis,
2010) reported that the participants in the CBI group engaged in substantial
L2 production without being requested to do so. It is possible that this also
occurred in other CBI studies but unfortunately most studies failed to report
whether the treatment led to learners producing the L2. Thus, in most cases it
proved impossible to identify the extent to which production figured in CBI.
Similarly, we were not able to determine to what extent the PBI treatments
afforded opportunities for learners to comprehend input containing the target
features. Many of the PBI treatments provided some L2 input as a stimulus
for the production activities. For example, the “traditional instruction” and
“meaning-oriented output” in the PI studies (e.g., Benati, 2005; Farley, 2001a,
2001b; Lee & Benati, 2007a) typically involved explicit grammar instruction
and activities that presented sentences containing the target grammatical forms
(e.g., “Last week, I (visit) my uncle” in Benati, 2005). Some PBI treatments even
required the learners to first comprehend input including the target grammatical
features as a prelude to the production activities. For example, the PBI in
Collentine (1998) included a pair-work interview task where learners needed
to comprehend their partners’ production. Where the PBI involved dictogloss
tasks (Qin, 2008; VanPatten et al., 2009), learners were required to comprehend
the text before they tried to reconstruct it.
There were other differences in the ways in which the CBI and PBI were
operationalized. Some CBI studies required learners to distinguish the correct
form from the incorrect form (e.g., DeKeyser & Sokalski, 1996) whereas others
merely exposed the participants to exemplars of the correct target feature (e.g.,
Izumi, 2002; Leeser, 2008). In PBI, the types of L2 production activities varied
from controlled discrete-point production (i.e., text manipulation) to more free
production (i.e., text creation). Both CBI and PBI also varied in terms of
whether there was provision of explicit grammar information, the participants’
age, their proficiency levels, the instructional context, the research setting, the
target language, and the length of treatment (see Appendix S2 in the online
Supporting Information). Yet, it was only possible to investigate two of these
moderator variables as there were insufficient studies which had incorporated
other variables.
We argue that it is valid to carry out a comparison of the effects of CBI
and PBI on learning, even though each type of instruction was operationalized
in a number of different ways. Our reasons are as follows. First, the distinction is well established in
discussions of language pedagogy and has been the subject of debate among
teacher educators (e.g., Winitz, 1981). Second, as we have shown, a number of
studies have compared the two types of instruction. In fact, our meta-analysis
differs from most of the preceding SLA meta-analyses in that it only examined
comparative studies (i.e., it did not examine studies that investigated just CBI or
PBI). We have argued that examining only comparative studies where the two
types of instruction are maximally different affords a more robust comparison as
each study involved participants drawn from the same population. Third, the
difference between an approach to teaching that emphasizes input
and comprehension and one that emphasizes production is of theoretical
importance in SLA as it concerns the relative roles of input and output in L2
learning. We will now discuss what answers the results of the meta-analysis
provide to our three research questions and how they speak to relevant SLA
theoretical issues.

Table 6 Summary of effects for receptive knowledge

                        k     Comparative effects    Effect size
Comparative effects
  Immediate             31    CBI >∗ PBI             1.09
  Delayed               15    CBI > PBI              0.25
Absolute effects
  Immediate∗∗           20    CBI >∗ Control         1.96
                        21    PBI >∗ Control         1.13
  Delayed               9     CBI >∗ Control         1.58
                        9     PBI >∗ Control         1.12
Pre-to-post effects
  Immediate∗∗           26    CBI: Post >∗ Pre       2.36
                        27    PBI: Post >∗ Pre       1.01
  Delayed∗∗             13    CBI: Post >∗ Pre       1.70
                        12    PBI: Post >∗ Pre       1.02

∗The difference between the groups/tests was significant at p < .05.
∗∗Qb value showed a significant difference in the effect sizes between CBI and PBI.

The Relative Effectiveness of CBI and PBI for the Development of
Receptive and Productive Grammar Knowledge
Research question 1 asked whether there was any difference in the effects
of CBI and PBI on the acquisition of receptive knowledge of grammatical
features. Table 6 summarizes the effects of the two types of instruction on
receptive knowledge.
The comparative effect size results suggest CBI is superior to PBI on the
immediate tests. However, the absolute effect sizes show that both CBI and
PBI groups outperformed the control groups on both immediate and delayed
posttests administered 1 week or longer after the instruction. Furthermore,
the pre-to-post effect sizes demonstrate that both types of instruction led to
solid improvements in their posttest scores in comparison with their pretest
performance and the effects were sustained over time. To what extent are these
results of practical significance as indicated by the effect sizes reported in
Table 6? As we discussed in the Method section, following Oswald and Plon-
sky’s (2010) recommendations we have elected to use the following bench-
marks. For between-groups effects (i.e., comparative and absolute effects in
this study), d = 0.4 can be considered small, d = 0.7 medium, and d = 1.0
large. For within-groups effects (i.e., pre-to-posttest effects), d = 0.50 represents
a small effect, d = 0.80 a medium effect, and d = 1.10 a large effect. Thus, the
effect sizes shown in Table 6 for the absolute and the pre-to-post effects for both
CBI and PBI were all either large or medium.⁴ In other words, they indicate
both types of instruction were effective in developing receptive knowledge. In
addition, however, the absolute and pre-to-post effect sizes for CBI were always
larger than those for PBI. Also, the comparative effect indicates that CBI was
superior to PBI in the immediate tests (with a large effect size), although this
difference disappeared in the delayed tests (i.e., the effect size for this comparison
was negligible). Overall, then, while both types of instruction were effective in
developing receptive knowledge, the CBI proved more effective than the PBI
although its superiority was not clearly sustained over time.
Research question 2 asked whether there was any difference in the relative
effects of CBI and PBI on the acquisition of productive knowledge. The results
are summarized in Table 7. They indicate that in the short term CBI and PBI
were similarly effective in developing learners’ productive knowledge. The
absolute effects and the pre-to-post effects for both types of instruction are all
large with the exception of the medium mean effect size for CBI versus Control
in the delayed tests. The analysis of the comparative effects indicates that PBI
led to more durable productive knowledge than CBI, but only to a small degree:
the difference on the delayed tests was statistically significant but
numerically small (d = −0.21).

Table 7 Summary of effects for productive knowledge

                        k     Comparative effects    Effect size
Comparative effects
  Immediate             32    CBI = PBI              −0.08
  Delayed               15    PBI >∗ CBI             −0.21
Absolute effects
  Immediate             22    CBI >∗ Control         1.23
                        22    PBI >∗ Control         1.32
  Delayed               8     CBI >∗ Control         0.91
                        8     PBI >∗ Control         1.02
Pre-to-post effects
  Immediate             29    CBI: Post >∗ Pre       1.66
                        30    PBI: Post >∗ Pre       1.83
  Delayed               12    CBI: Post >∗ Pre       1.23
                        12    PBI: Post >∗ Pre       1.36

∗The difference between the groups/tests was significant at p < .05.
Note. All Qb values were nonsignificant.


Taking Tables 6 and 7 together, it is clear that both CBI and PBI are effective
in developing both receptive and productive knowledge in both the short and
long term. However, there are differences in the effects of the two types of
instruction. In the short term, CBI resulted in greater receptive knowledge than
PBI but there was no notable difference in productive knowledge. In the long
term, while both CBI and PBI resulted in similar levels of receptive knowledge,
PBI led to greater sustained productive knowledge. It is possible, then, that while
CBI has an initial advantage, PBI may have more durable effects. Therefore, we
would like to consider the theoretical implications of (1) the initial advantage
of CBI over PBI and (2) the more durable effect of PBI.

Theoretical Significance of the CBI–PBI Results


The initial advantage found for CBI lends some support to VanPatten’s (2004)
claims. He suggests that learners’ limited processing capacity impedes the
learners’ ability to convert input to intake and that forcing learners to produce
in the L2 can interfere with their capacity to notice or attend to linguistic
form. On the other hand, the advantage seen for PBI in the long term might be
explained by the deeper processing that production requires as proposed, for
example, by Izumi (2002). These two apparently conflicting positions can be
reconciled if we view the acquisitional process in terms of the computational
model of L2 acquisition (R. Ellis, 1997a). According to this model, the first
step in the process of acquiring new L2 forms involves converting input into
intake (i.e., attending to the feature in the input and rehearsing it in working
memory). The next step involves integration (i.e., incorporating the feature into
long-term memory) and accommodation (VanPatten, 1996) (i.e., restructuring
the interlanguage system to accommodate the new features). The final step is
output, where the learner accesses the new features for use in speech or writing,
a process that can also contribute to acquisition (see Swain, 1995).
The two types of instruction may differ in terms of the particular step they
assist. That is, CBI caters primarily to the initial stage of acquisition (i.e.,
converting input to intake) while PBI assists the process of accessing partially
acquired knowledge. Thus, CBI has an initial advantage in the case of new
receptive knowledge because it induces noticing of the new grammatical forms
and promotes their rehearsal in working memory. It also assists production
because, as VanPatten suggests, comprehension and production draw on the same
interlanguage system. However, the effects of CBI atrophy because it does not
involve the kind of generative use that Joe (1998) found was important for
vocabulary learning and may well be equally important for grammar. In other
words, the results can be explained by proposing that CBI is more effective for
teaching new features whereas PBI is better for developing productive ability
with features that have already been partially acquired and need consolidating.
Such a view, it should be noted, accords with VanPatten’s (2004) claims about
the relative contributions of the two types of instruction. However, another
possibility is that the initial advantage for CBI fades over time because it
resulted in only explicit knowledge, which is less durable than implicit knowledge
(R. Ellis, 2004). In contrast, PBI may have assisted the development of implicit
knowledge. Both of these interpretations of the results are speculative and it is
not possible to decide between them at this time.

Table 8 Summary of moderator analysis

Moderator variables        k    Comparative effects   Effect size (d)

Receptive
  CBI∗∗
    PI                     18   CBI >∗ PBI              2.14
    Non-PI                 10   PBI >∗ CBI             −0.28
  PBI
    Text-creation          11   CBI >∗ PBI              0.69
    Text-manipulation      23   CBI >∗ PBI              1.48
Productive
  CBI
    PI                     18   CBI = PBI               0.05
    Non-PI                 12   CBI = PBI              −0.14
  PBI
    Text-creation          13   CBI = PBI              −0.15
    Text-manipulation      23   CBI = PBI               0.03

∗ The difference between the groups/tests was significant at p = .05.
∗∗ The Qb value indicated that the moderator variable had significant (or approaching
significant) effects on the comparative effect sizes of CBI and PBI.

The Facilitating Role of Processing Instruction Within CBI


Research question 3 addressed the factors that moderated the effects of CBI and
PBI on L2 grammar acquisition. Table 8 summarizes the results of the moderator
analysis. It should be noted that this analysis only focused on the scores obtained
for the immediate posttests. This was because there were insufficient studies
with posttests administered one week or longer after the instruction to conduct
a convincing moderator analysis.
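The logic of a between-groups Q test for a two-level moderator (for example, PI versus non-PI) can be sketched as follows. This is a simplified fixed-effect illustration and not necessarily the model used for the analysis reported here. The function names, the effect sizes, and the variances are hypothetical, and the p value is valid only for two subgroups (df = 1).

```python
import math

def weighted_mean(effects):
    """Inverse-variance weighted mean of (d, variance) pairs, plus the total weight."""
    weights = [1.0 / variance for _, variance in effects]
    total_weight = sum(weights)
    mean = sum(w * d for w, (d, _) in zip(weights, effects)) / total_weight
    return mean, total_weight

def q_between(group_a, group_b):
    """Q-between for two subgroups under a fixed-effect model, with its p value (df = 1)."""
    mean_a, weight_a = weighted_mean(group_a)
    mean_b, weight_b = weighted_mean(group_b)
    grand_mean, _ = weighted_mean(group_a + group_b)
    # Weighted squared deviations of the subgroup means from the grand mean
    q_b = weight_a * (mean_a - grand_mean) ** 2 + weight_b * (mean_b - grand_mean) ** 2
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(q_b / 2.0))
    return q_b, p_value

# Hypothetical (d, variance) pairs; not the values coded in the meta-analysis
pi_studies = [(1.9, 0.10), (2.3, 0.12), (2.1, 0.08)]
non_pi_studies = [(-0.3, 0.09), (-0.2, 0.11), (-0.35, 0.10)]

q_b, p_value = q_between(pi_studies, non_pi_studies)
print(f"Q_b = {q_b:.2f}, p = {p_value:.3g}")
```

A Q-between value above the critical chi-square value for df = 1 (about 3.84 at the .05 level) would indicate that the comparative effect differs reliably between the two subgroups, which is the sense in which a moderator is described as significant.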

Of the two moderator variables investigated, one (i.e., whether the CBI
involved PI or not) was found to have a statistically significant influence on
the comparative effectiveness of CBI and PBI. The results support VanPatten’s
(2004) claim that CBI is more effective than PBI when it involves PI, but
not when it consists of structured input in a non-PI framework. Wong (2004)
identified the following key features of PI: choice of a target feature that is
governed by a Processing Principle and structured input designed in such a way
as to alter learners’ processing strategies. The studies coded as PI met these
conditions. The results showed PI was more effective than PBI in developing
receptive knowledge. Interestingly, studies involving structured input but not
PI proved to be less effective than PBI in developing receptive knowledge. It
should be noted that there was no difference between the PI studies and non-PI
studies in the case of productive knowledge and the effect sizes were negligible.
This again supports VanPatten’s claims about PI. VanPatten acknowledged that
PBI may be needed to help learners automatize their interlanguage knowledge
for production. Overall, then, the results of the meta-analysis point to the
effectiveness of CBI when it involves PI but only where receptive knowledge
is concerned.
The other instructional moderator variable (i.e., text creation versus text
manipulation) did not have a significant influence on the comparative effect of
CBI and PBI on acquisition. CBI was more effective than PBI for developing
receptive knowledge irrespective of whether the PBI involved text-creation or
text-manipulation activities. However, the advantage for CBI was large when the
PBI involved text manipulation (d = 1.48) and small when it involved text creation
(d = 0.69). In short, the advantage of CBI for receptive knowledge is clearest
when the PBI consists only of text-manipulation activities and less evident when
it consists of text-creation activities.

Conclusion
In the motivation for this study, we distinguished a number of theoretical
positions that address the roles of CBI and PBI in L2 learning. The results of
the meta-analysis do not support the prediction of skill-learning theory, which
claims that CBI benefits receptive but not productive L2 knowledge, whereas
PBI benefits productive knowledge but not receptive knowledge. We found
that both types of instruction were beneficial for both receptive and productive
knowledge. Nor is there really any clear evidence to demonstrate the superiority
of one type of instruction. We did find that CBI was more effective than PBI
for developing receptive knowledge in the immediate tests but there was no
difference in the delayed tests administered 1 week or longer following the
instruction. PBI, however, was more effective than CBI for productive knowledge
in the delayed tests, although there was no difference in the immediate tests. We
have suggested that these differences can be explained in terms of whether the
target features were new or partially acquired, with CBI advantageous for the
former and PBI for the latter. This explanation accords with the predictions of
the Teachability Hypothesis (Pienemann, 1985), which makes different claims
about the effects of CBI and PBI on new and partially acquired features. However,
we were unable to test these predictions as it was not possible to establish
the status of the target features in the learners’ interlanguages in the studies we
investigated.
Clearly, the results do not allow us to propose that instruction be based on
just CBI or just PBI. Both are effective and, in the case of productive knowledge,
equally so. It is well known that course materials in language textbooks tend
to emphasize PBI—even at the beginning level. Perhaps then, course designers
and materials writers might give greater emphasis to the use of CBI in the
future, especially when introducing new grammatical features, and especially
for beginner-level learners. It is also possible that grammar instruction will
be most effective if it involves a combination of comprehension-based and
production-based activities. The results of this meta-analysis can be seen as
compatible with such a proposal but nevertheless this will need investigating
empirically.
As Oswald and Plonsky (2010) pointed out, one of the purposes of meta-analyses
is to prompt further research by identifying areas in need of further
study. The meta-analysis reported in this article points to a number of directions
for future research. Clearly, there is a need for further primary studies that
investigate the comparative effectiveness of CBI and PBI as macro types of
instruction. Such studies need to provide detailed information about the classroom
processes that arise during implementation of the instruction. In this way
it will be possible to identify the characteristics that clearly differentiate CBI
and PBI and that impact on learning. It would also be useful to distinguish the
effects of the instruction on the two types of production measures that Norris
and Ortega (2000) examined (i.e., constrained constructed response and free
constructed response). Further studies are needed to investigate the moderating
effects on learning outcomes of Processing Instruction in CBI and also of text
creation versus text manipulation in PBI. Ideally, too, our understanding of the
theoretical and pedagogical significance of the two macro types of instruction
will be enhanced by investigating additional moderator factors. In particular,
it would be desirable to investigate the choice of target structure, the learners’
proficiency level, the role played by explicit instruction, and individual
difference factors such as working memory and language anxiety.
Final revised version accepted 29 April 2012

Notes
1 An anonymous reviewer suggested a number of other studies that could be
included in the meta-analysis. However, of eight studies suggested, only one study
(Cheng, 2002) met the selection criteria. The other studies were excluded because
they did not involve a direct comparison of CBI and PBI, were published after
2010 (our cut-off point), or were not published in a refereed journal or book.
Cheng’s study was missed because the title did not include any of the key words
used in the search for articles.
2 Trim and Fill (Duval, 2005) is a technique for estimating the potential impact of
publication bias on the observed results; it first estimates the number of
studies that might be missing from a meta-analysis due to publication bias, and
it then calculates a hypothetical new outcome effect, had these studies been
included.
3 An anonymous reviewer rightly pointed out that a symmetrical distribution does
not rule out publication bias as this meta-analysis only included published studies.
4 An alternative benchmark for evaluating effect size can be achieved by comparing
the effect sizes reported in this study with those reported in similar meta-analyses.
Norris and Ortega (2000) reported absolute effect sizes for Focus on Form and
Focus on Forms and for Explicit and Implicit instruction in the range of 0.54
(Implicit Instruction) to 1.92 (Focus on Form). The absolute effect sizes shown in
Table 6 range from 1.13 (PBI) to 1.96 (CBI) and thus demonstrate a similar degree
of effect to those reported by Norris and Ortega.

References
Note. Studies marked with an asterisk were included in the meta-analysis.

Allen, L. Q. (2000). Form-meaning connections and the French causative. Studies in
Second Language Acquisition, 22, 69–84.
Asher, J. J. (1977). Learning another language through actions: The complete
teacher’s guidebook. Los Gatos, CA: Sky Oaks Productions.

Benati, A. (2005). The effects of processing instruction, traditional instruction and
meaning-output instruction on the acquisition of the English past simple tense.
Language Teaching Research, 9, 67–93.


Benati, A. (2009). Japanese language teaching: A communicative approach. London:
Continuum.

Benati, A., & Lee, J. F. (2008). From processing instruction on the acquisition of
Italian noun-adjective agreement to secondary transfer-of-training effects on Italian
future tense verb morphology. In A. Benati & J. F. Lee (Eds.), Grammar
acquisition and processing instruction: Secondary and cumulative effects
(pp. 54–87). Clevedon, UK: Multilingual Matters.

Benati, A., Lee, J. F., & Houghton, S. D. (2008). From processing instruction on the
acquisition of English past tense to secondary transfer-of-training effects on English
third person singular present tense verb morphology. In A. Benati & J. F. Lee (Eds.),
Grammar acquisition and processing instruction: Secondary and cumulative effects
(pp. 88–120). Clevedon, UK: Multilingual Matters.

Benati, A., Lee, J. F., & Laval, C. (2008). From processing instruction on the
acquisition of French imparfait to secondary transfer-of-training effects on French
subjunctive and to cumulative transfer-of-training effects with French causative
constructions. In A. Benati & J. F. Lee (Eds.), Grammar acquisition and processing
instruction: Secondary and cumulative effects (pp. 121–157). Clevedon, UK:
Multilingual Matters.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2006).
Comprehensive meta-analysis (Version 2.2.027) [Computer software]. Englewood,
NJ: Biostat.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction
to meta-analysis. Chichester, UK: Wiley.
Cadierno, T. (1995). Formal instruction from a processing perspective: An
investigation into the Spanish past tense. Modern Language Journal, 79, 179–193.
Cheng, A. C. (2002). The effects of processing instruction on the acquisition of ser and
estar. Hispania, 85, 308–323.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). San
Diego, CA: Academic Press.

Collentine, J. (1998). Processing instruction and the subjunctive. Hispania, 81,
576–587.
Cross, C., Copping, L., & Campbell, A. (2011). Sex differences in impulsivity: A
meta-analysis. Psychological Bulletin, 137, 97–130.
Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence
intervals, and meta-analysis. New York: Routledge.

De Jong, N. (2005). Can second language grammar be learned through listening? An
experimental study. Studies in Second Language Acquisition, 27,
205–234.
DeKeyser, R. M. (2007). Practice in a second language: Perspectives from applied
linguistics and cognitive psychology. New York: Cambridge University Press.

DeKeyser, R. M., & Sokalski, K. J. (1996). The differential role of comprehension
and production practice. Language Learning, 46, 613–642.

Duval, S. (2005). The trim and fill method. In H. R. Rothstein, A. J. Sutton, & M.
Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment, and
adjustments (pp. 127–144). Chichester, UK: Wiley.
Ellis, N. C. (2005). At the interface: Dynamic interactions of explicit and implicit
language knowledge. Studies in Second Language Acquisition, 27, 305–352.
Ellis, R. (1995). Interpretation tasks for grammar teaching. TESOL Quarterly, 29,
87–105.
Ellis, R. (1997a). Second language acquisition. Oxford, UK: Oxford University Press.
Ellis, R. (1997b). SLA research and language teaching. Oxford, UK: Oxford
University Press.
Ellis, R. (2003). Task-based language learning and teaching. Oxford, UK: Oxford
University Press.
Ellis, R. (2004). The definition and measurement of L2 explicit knowledge. Language
Learning, 54, 227–275.

Erlam, R. (2003). Evaluating the relative effectiveness of structured-input and
output-based instruction in foreign language learning. Studies in Second Language
Acquisition, 25, 559–582.

Farley, A. P. (2001a). Authentic processing instruction and the Spanish subjunctive.
Hispania, 84, 289–299.

Farley, A. P. (2001b). Processing instruction and meaning-based output instruction: A
comparative study. Spanish Applied Linguistics, 5(2), 57–94.
Farley, A. P. (2004). The relative effects of processing instruction and meaning-based
output instruction. In B. VanPatten (Ed.), Processing instruction: Theory, research,
and commentary (pp. 143–168). Mahwah, NJ: Erlbaum.

Gass, S., & Torres, M. J. A. (2005). Attention when: An investigation of the ordering
effect of input and interaction. Studies in Second Language Acquisition, 27,
1–31.
Goldschneider, J., & DeKeyser, R. M. (2001). Explaining the “natural order of L2
morpheme acquisition” in English: A meta-analysis of multiple determinants.
Language Learning, 51, 1–50.
Hedges, L. V. (1994). Statistical considerations. In H. Cooper & L. V. Hedges (Eds.),
The handbook of research synthesis (pp. 29–38). New York: Russell Sage
Foundation.
Hodges, L. V., Humphris, G., & Macfarlane, G. (2005). A meta-analytic investigation
of the relationship between the psychological distress of cancer patients and their
carers. Social Science and Medicine, 60, 1–12.
Hunter, J., & Schmidt, F. (2004). Methods of meta-analysis. London: Sage.

Izumi, S. (2002). Output, input enhancement, and the noticing hypothesis: An
experimental study on ESL relativization. Studies in Second Language Acquisition,
24, 541–577.
Joe, A. (1998). What effects do text-based tasks promoting generation have on
incidental vocabulary acquisition? Applied Linguistics, 19, 357–377.

Koyanagi, K. (1999). Differential effects of focus on form vs. focus on forms. In T.
Fujimura, Y. Kato, & R. Smith (Eds.), Proceedings on the 10th Conference on
Second Language Research in Japan, 10 (pp. 1–31). Tokyo, Japan: International
University of Japan.
Krashen, S., & Terrell, T. (1983). The natural approach. Hayward, CA: Alemany Press.

Lee, J. F., & Benati, A. (2007a). Comparing modes of delivering processing
instruction and meaning-based output instruction on Italian and French subjunctive.
In J. F. Lee & A. Benati (Eds.), Delivering processing instruction in classrooms and
in virtual contexts: Research and practice (pp. 99–136). London: Equinox.

Lee, J. F., & Benati, A. (2007b). The effects of structured input activities on the
acquisition of two Japanese linguistic features. In J. F. Lee & A. Benati (Eds.),
Delivering processing instruction in classrooms and in virtual contexts: Research
and practice (pp. 49–71). London: Equinox.

Leeser, M. J. (2008). Pushed output, noticing, and development of past tense
morphology in content-based instruction. Canadian Modern Language Review, 65,
195–220.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis.
Language Learning, 60, 309–365.
Lightbown, P. M. (2008). Transfer appropriate processing as a model for classroom
second language acquisition. In Z. Han (Ed.), Understanding second language
process (pp. 27–44). Clevedon, UK: Multilingual Matters.
Lipsey, M., & Wilson, D. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
Littell, J. H., Corcoran, J., & Pillai, V. (2008). Systematic reviews and meta-analysis.
New York: Oxford University Press.
Long, M. H., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In
C. Doughty & J. Williams (Eds.), Focus on form in classroom second language
acquisition (pp. 15–40). Cambridge, UK: Cambridge University Press.
Lyster, R., & Saito, K. (2010). Oral feedback in classroom SLA. Studies in Second
Language Acquisition, 32, 265–302.
Mackey, A., & Goo, J. M. (2007). Interaction research in SLA: A meta-analysis and
research synthesis. In A. Mackey (Ed.), Input, interaction and corrective feedback
in L2 learning (pp. 379–452). Oxford, UK: Oxford University Press.

Morgan-Short, K., & Bowden, H. W. (2006). Processing instruction and meaningful
output-based instruction: Effects on second language development. Studies in
Second Language Acquisition, 28, 31–65.

Nagata, N. (1998a). Input vs. output practice in educational software for second
language acquisition. Language Learning and Technology, 1(2), 23–40.

Nagata, N. (1998b). The relative effectiveness of production and comprehension
practice in second language acquisition. Computer Assisted Language Learning, 11,
153.
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research
synthesis and quantitative meta-analysis. Language Learning, 50, 417–528.

Ortega, L. (2010). Research synthesis. In B. Paltridge & A. Phakiti (Eds.), Companion
to research methods in applied linguistics (pp. 111–126). London: Continuum.
Oswald, F., & Plonsky, L. (2010). Meta-analysis in second language research: Choices
and challenges. Annual Review of Applied Linguistics, 30, 85–110.
Patall, E., Cooper, H., & Robinson, J. (2008). The effects of choice on intrinsic
motivation and related outcomes: A meta-analysis of research findings.
Psychological Bulletin, 134, 270–300.
Pienemann, M. (1985). Learnability and syllabus construction. In K. Hyltenstam & M.
Pienemann (Eds.), Modelling and assessing second language acquisition
(pp. 23–75). Clevedon, UK: Multilingual Matters.
Pienemann, M. (1989). Is language teachable? Psycholinguistic experiments and
hypotheses. Applied Linguistics, 10, 52–79.
Plonsky, L. (2011). The effectiveness of second language strategy instruction: A
meta-analysis. Language Learning, 61, 993–1038.

Qin, J. (2008). The effect of processing instruction and dictogloss tasks on acquisition
of the English passive voice. Language Teaching Research, 12, 61–82.
Schulze, R. (2004). Meta-analysis: A comparison of approaches. Cambridge, MA:
Hogrefe & Huber.

Shintani, N., & Ellis, R. (2010). The incidental acquisition of English plural -s by
Japanese children in comprehension-based and production-based lessons. Studies in
Second Language Acquisition, 32, 607–637.

Song, M. J., & Suh, B. R. (2008). The effects of output task types on noticing and
learning of the English past counterfactual conditional. System, 36, 295–312.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input
and comprehensible output in its development. In S. Gass & C. Madden (Eds.),
Input in second language acquisition (pp. 235–256). Rowley, MA: Newbury House.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook
& B. Seidlhofer (Eds.), Principle and practice in applied linguistics: Studies in
honor of H.G. Widdowson (pp. 125–144). Oxford, UK: Oxford University Press.

Takimoto, M. (2009). The effects of input-based tasks on the development of
learners’ pragmatic proficiency. Applied Linguistics, 30, 1–25.

Tanaka, T. (2001). Comprehension and production practice in grammar instruction:
Does their combined use facilitate second language acquisition? JALT Journal, 23,
6–30.

Toth, P. D. (2006). Processing instruction and a role for output in second language
acquisition. Language Learning, 56, 319–385.
Ur, P. (1996). A course in language teaching: Practice and theory. Cambridge, UK:
Cambridge University Press.
VanPatten, B. (1996). Input processing and grammar instruction: Theory and research.
Norwood, NJ: Ablex.
VanPatten, B. (2002). Processing instruction: An update. Language Learning, 52,
755–803.

VanPatten, B. (2004). Input processing in second language acquisition. In B. VanPatten
(Ed.), Processing instruction: Theory, research, and commentary (pp. 5–32).
Mahwah, NJ: Erlbaum.
VanPatten, B. (2007). Input processing in adult second language acquisition. In B.
VanPatten & J. Williams (Eds.), Theories in second language acquisition
(pp. 115–135). Mahwah, NJ: Erlbaum.

VanPatten, B., & Cadierno, T. (1993). Input processing and second language
acquisition: A role for instruction. Modern Language Journal, 77, 45–57.

VanPatten, B., Inclezan, D., Salazar, H., & Farley, A. P. (2009). Processing instruction
and dictogloss: A study on object pronouns and word order in Spanish. Foreign
Language Annals, 42, 557–575.

VanPatten, B., & Oikkenon, S. (1996). Explanation versus structured input in
processing instruction. Studies in Second Language Acquisition, 18, 495–510.
VanPatten, B., & Wong, W. (2004). Processing instruction and the French causative:
A replication. In B. VanPatten (Ed.), Processing instruction (pp. 97–118). Mahwah,
NJ: Erlbaum.
Winitz, H. (1981). The comprehension approach to foreign language instruction.
Rowley, MA: Newbury House.
Wong, W. (2004). The nature of processing instruction. In B. VanPatten (Ed.),
Processing instruction: Theory, research, and commentary (pp. 33–63). Mahwah,
NJ: Erlbaum.

Supporting Information
Additional Supporting Information may be found in the online version of this
article at the publisher’s website:

Appendix S1. Details of the Studies Included in the Meta-Analysis.


Appendix S2. Variables of the Meta-Analysis Studies.
