Chen - 2012 - Dictionary Use and Vocabulary Learning in The Context of Reading
216–247
doi:10.1093/ijl/ecr031 Advance access publication 2 December 2011
Yuzhen Chen: College of Foreign Languages and Cultures of Xiamen University, Fujian,
China and Department of Languages of Putian University, Fujian, China
This empirical study attempts to explore the role of dictionary use in L2 vocabulary
learning in reading context. It involved the use of English-Chinese bilingualized dictionaries
(BLDs) for EFL vocabulary task completion and incidental vocabulary acquisition
by undergraduate English majors in Chinese universities. The subjects were asked to
read an English passage and perform a reading task under one of three conditions: with
the aid of a paper BLD (PBLD) or an electronic BLD (EBLD), or without access to any
dictionary. After task completion, they were given an unexpected retention test on the
target lexical items included in the reading passage. The same retention test was repeated
one week later. The study found that BLD use can effectively facilitate vocabulary
comprehension and enhance incidental vocabulary acquisition, suggesting that
dictionary use is a more effective strategy of vocabulary learning than contextual guessing.
There was no significant difference in dictionary effectiveness between the PBLD
and the EBLD, yet the latter showed some advantage over the former for vocabulary
retention. Students varying on vocabulary proficiency levels and reading conditions
fared differently on incidental vocabulary acquisition.
1. Introduction
Dictionary use has long been recognized as one of the vocabulary learning strategies
(Gu and Johnson 1996, Scholfield 1997, Nation 1990, 2001, Gu 2003,
Nation and Meara 2010). Yet despite the important role of the dictionary
for L2 learning and the relatively long history of the research on vocabulary
learning through dictionary use, in the domain of L2 vocabulary acquisition,
‘interest from a research perspective has been limited and sporadic over the
years’ (Ronald 2003: 285).1 Fortunately, recent years have witnessed steady
development of dictionary use research which includes investigations of the
use and usefulness of dictionaries for various language activities. This study
attempts to evaluate the effectiveness of dictionary use for L2 vocabulary
learning in reading context. It examines the use of English-Chinese bilingualized
dictionaries (henceforth BLDs) for EFL vocabulary task completion and
incidental vocabulary acquisition during reading. This type of dictionary is
hugely popular with Chinese EFL learners, yet has received little attention
from researchers of dictionary use studies. By evaluating the effectiveness of
such dictionaries for EFL vocabulary learning and identifying the problems
with dictionary use, this research attempts to shed some light on vocabulary
pedagogy and dictionary use instruction in a Chinese EFL environment.
(a) the test itself was made up of items which were not likely to be affected
by the availability of a dictionary,
(b) the dictionary did not include information needed to answer the compre-
hension questions, and
(c) the user failed to identify the words in the text which were most
crucial for correct answering of the test questions (Nesi and Meara
1991: 643).
one. Aizawa (1999) even found that subjects in the non-dictionary condition
achieved significantly better results of reading comprehension than those using
dictionaries.
In contrast, some studies demonstrated a positive correlation between
dictionary use and vocabulary comprehension. Summers (1988) revealed that
compared with no-entry use, the use of dictionary entries yielded substantially
better results of comprehension as well as production. Tono (1989), later
republished as Tono (2001: 75–83), showed that a significant difference in
performance existed between reading comprehension with dictionaries and
that without dictionaries. Similar findings were obtained by Bogaards (2002,
cited in Welker 2010: 178–179) and Hayati and Pour-Mohammadi (2005). In
Szczepaniak (2006), the monolingual dictionary was found to be effective
were even higher than those of the students with marginal glosses. Aizawa
(1999) showed a different role of dictionary use for vocabulary comprehension
and vocabulary retention: for the former, the non-dictionary group scored
significantly higher than the dictionary group while there was a reverse result
for the latter. Nevertheless, for the more proficient learners, there was almost
no difference in vocabulary retention. In the same vein, Chang (2002) found
that reading with different conditions did not produce significant effects on
reading comprehension but for vocabulary retention, the use of marginal
glosses and electronic dictionaries yielded different results.
Different from the above-mentioned studies, Conceição (2004, cited in
Welker 2010) concluded that dictionary use does not contribute significantly
to vocabulary retention as there was no significant difference in retention
the advantage of dictionary use for vocabulary retention. Yet with the sole
exception of Laufer (2011), none of the studies mentioned above involved
BLD use. This is one of the reasons that prompted the present study.
There is now a large body of literature on the use of various kinds of electronic
dictionaries covering such topics as the usefulness of dictionaries for learning
tasks, the comparison of dictionary effectiveness between different types, and
lookup preferences and behavior of dictionary users in CALL context etc. Yet
on the whole, the contrastive studies between electronic dictionaries and paper
3. The study
Based on the literature review given above, the author proposed the following
research hypotheses concerning the use of a paper BLD (henceforth PBLD)
and an electronic BLD (henceforth EBLD) for vocabulary learning (including
vocabulary comprehension and incidental vocabulary acquisition).
Subjects of the study were asked to finish a reading task, which was followed by
an unexpected vocabulary retention test that was repeated one week later. The
following is a detailed introduction to design issues.
3.2.1 Selection of the reading material and target lexical items. Several factors were
taken into consideration for the selection of reading material. The priority is
to ensure that the reading text is of an appropriate level of difficulty with a lexical
density that would allow general comprehension through contextual guessing.
Therefore, the author adopted a density of 98% known words as advocated by
Hu and Nation (2000) and Nation (2001). To arouse students’ interest in reading
under experimental conditions, narratives were preferred over other types of
writings such as argumentation, exposition or description. Furthermore, texts
which are too long or too short would not be considered as appropriate for
practical reasons of test administration. In view of these considerations, two
texts of similar levels of difficulty, length and type were chosen from a corpus
that is not accessible to the subjects. Three teachers of extensive English reading
3.2.2 Design of the reading task. The reading text is accompanied by 13 comprehension
questions. Different from the regular type of reading comprehension
questions which mostly involve global text comprehension, this study adopts a
word-focused approach: most of the questions are related to the comprehen-
sion of the target items. This approach can be justified by two considerations.
One is to avoid the pitfall of test design of some previous studies such as
Bensoussan et al. (1984) and Nesi and Meara (1991): the test may be made
up of items which are not likely to be affected by dictionary use. By incorpor-
ating target items into comprehension questions, a closer correlation between
word comprehension and dictionary use will be established. Meanwhile, some
of the comprehension questions are also connected with overall text comprehension,
so as to prevent participants from merely skimming the text.
To be specific, the questions concerning the comprehension of target lexical
items account for nearly 70% of the total, i.e. among 13 questions, nine are
word-focused, six of which are multiple choice questions and three in
question-and-answer form. Given the nature of the study, the scores of these
nine questions were analyzed, rather than the total score of all 13 questions.
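The composition of the reading task can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the two-points-per-question value is taken from the scoring procedure the paper describes for the nine word-focused questions.

```python
# Composition of the reading task as described in the text.
TOTAL_QUESTIONS = 13
MULTIPLE_CHOICE = 6        # word-focused, multiple choice
QUESTION_AND_ANSWER = 3    # word-focused, question-and-answer form
POINTS_PER_QUESTION = 2    # maximum points per question (from the scoring procedure)


def word_focused_summary():
    """Return (number of word-focused questions, their share, maximum score)."""
    word_focused = MULTIPLE_CHOICE + QUESTION_AND_ANSWER
    share = word_focused / TOTAL_QUESTIONS
    max_score = word_focused * POINTS_PER_QUESTION
    return word_focused, share, max_score


if __name__ == "__main__":
    n, share, max_score = word_focused_summary()
    print(f"{n} of {TOTAL_QUESTIONS} questions are word-focused ({share:.1%})")
    print(f"maximum analysed score: {max_score}")
```

The share works out to roughly 69%, which matches the "nearly 70%" stated above, and the maximum analysed score is 18.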
The other reason to adopt a word-focused approach is to ensure a higher
incidence of incidental vocabulary acquisition. As reported by previous researchers
(Hulstijn 1992, Cho and Krashen 1994, Paribakht and Wesche
1997, Zahar et al. 2001, Horst 2005), vocabulary gains through reading without
any enhancement tasks tend to be extremely small, ranging from one to seven
words per text (which is up to 7,000 words). Empirical evidence supports the
claim that in instructed L2 context, word-focused activities play a more im-
portant role than reading alone in building the learner’s lexical knowledge
3.2.3 Design of the vocabulary retention tests. Like most incidental learning re-
search, vocabulary retention in this study is measured by checking whether
the subjects can recall the meaning of the target items.2 The study included
two retention tests with the same content: one was conducted immediately after
the reading task, the other one week later. Hulstijn (2003: 372) argues that
experiments comparing different methods of cognitive processing of new lexical
material need only immediate post-tests, for it would not be possible to
3.2.4 Selection of the BLDs and solution to the problem of dictionary underuse. As the
study also involves a comparison of effectiveness between a PBLD and an
EBLD, to strictly control the variable of dictionary form, the author operatio-
nalized the electronic-paper opposition by using a desktop dictionary and its
Figure 2: Dictionary entry for rapt in the PBLD. This figure appears in
colour in the online version of the International Journal of Lexicography.
underuse effect. First, all the target lexical items in the reading text were marked
in bold type so as to obtain a higher degree of salience. Secondly, all of them were
included in the question items, in bold type again, and subjects had to know their
meanings when completing the task. Thirdly, the dictionary groups were encour-
aged to consult the target lexical items and told to underline in the reading text
any word they consulted during task completion. Those who failed to consult the
target items were excluded from the final data analysis. Fourthly, the study was
done in subjects’ regular class sections and under the supervision of their re-
spective teachers, which promoted a higher degree of willingness on the part of
students to cooperate and follow test instructions. All these measures proved
useful to handle the problem of dictionary underuse in experimental treatment.
there is little point in eliciting all that learners may know about a particular set
of words (Read 2007: 113). In fact, learners’ word knowledge naturally deepens
as vocabulary size increases, so that good size measures may be all that are
required (Vermeer 2001). Furthermore, there are so many more aspects of word
knowledge that could potentially be assessed and no consensus has emerged as
to which are the most significant ones (Read 2004). Therefore, the author
decided on a two-section Vocabulary Levels Test (henceforth VLT), adapted
from Nation (2001), Schmitt et al. (2001) and Laufer and Nation (1999). The
first section is taken from Schmitt et al.’s version of VLT (2001) in which
subjects are requested to match words with their synonyms or short definitions.
The following is an example.
For the same reasons mentioned above, only three word levels, i.e. the 3000
word level, the 5000 word level and the University Word List level, were
adopted. Except for the University Word List level, which includes 13 items,
each level contains 12 sentences with 12 tested items, thus reaching a total of
37 items. Altogether, there are 100 tested items in the VLT.
3.3 Participants
Participants for this study included three intact teaching classes of English
seniors from Putian University and another four classes of juniors from
The study was conducted during two regular class sections on two consecutive
weeks under the supervision of the participants’ respective teachers. In the case
target items in either L1 or L2. The RT1 was to be finished within six to seven
minutes and without the assistance of any dictionary. At the end of the RT1
paper, students were also asked to identify among the target lexical items the
word(s), if any, that they had known prior to the study. No mention was made
of the other retention test that would follow in the next week, but students were
told in a delicate way not to do anything more with the target lexical items
after the RT1. In addition, they were instructed not to tell their fellow students
in other classes anything about the study so as to prevent collaboration. This
first phase of study, including the reading task and the RT1, took about
45 minutes.
Seven days later, the participants were given a delayed retention test
(henceforth RT2) in the same class and under the supervision of the same
may have known from other students that there would be another retention
test. The exclusion of all these students from the study left the final number
at 176.
The author alone undertook all the scoring work. For the reading task, the
maximum score is 18 for the nine word-focused questions. Each correct answer
to a multiple choice question yielded two points. The scoring of the
question-and-answer items was based on semantic and pragmatic criteria. If a question
was answered correctly and appropriately, it received two points. If the
answer met only one of the criteria, it received one point. For example, when asked
to compose a sentence with a dab hand, many students wrote sentences like I’m
a dab hand in cooking, she is a dab hand with her papers or he is a dab hand in
farming, instead of using the idiom with a more frequent preposition at, which
3.5 Results
reading Groups (the EBLD, the PBLD and the ND groups) as the
between-subjects factor and Time (the RT1 and the RT2) the within-subjects
factor. The measure of effect size is η², expressing explained variance. Results
in Tables 5 and 6 indicate a significant main effect for both Time
[F(1, 173) = 237.03; p < 0.001; η² = 0.578] and Groups [F(2, 173) = 32.23;
p < 0.001; η² = 0.271] and a significant Time × Groups interaction as well
[F(2, 173) = 4.64; p = 0.011; η² = 0.051]. As revealed by multiple comparisons
(Scheffé), a significant difference occurred between the ND and the other two
groups (p < 0.001) while the latter two did not differ substantially from each
other (p = 0.137).
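The explained-variance values reported above can be recovered from the F ratios and their degrees of freedom. The sketch below assumes that the reported effect size is partial eta squared, computed with the standard identity F·df_effect / (F·df_effect + df_error); the paper does not name the variant, so this is an assumption, and the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, effect size as reported in the text)
REPORTED = {
    "Time": (237.03, 1, 173, 0.578),
    "Groups": (32.23, 2, 173, 0.271),
    "Time x Groups": (4.64, 2, 173, 0.051),
}

if __name__ == "__main__":
    for name, (f_value, df1, df2, reported) in REPORTED.items():
        computed = partial_eta_squared(f_value, df1, df2)
        print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```

Under that assumption, all three computed values agree with the reported ones to three decimal places.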
RT1
Source                                 df   Mean Square   F       Sig.    η²
Vocabulary level                       1    42.99         14.35   0.000   0.140
Reading condition                      2    40.52         13.52   0.000   0.235
Vocabulary level * reading condition   2    11.27         3.76    0.027   0.079
Error                                  88   2.99
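The rows of the RT1 table are internally consistent, which can be verified from the mean squares alone. The sketch below is a check under stated assumptions, not the author's procedure: it takes the error mean square of 2.996 from the note to Table 9, treats the second numeric column as the mean square, and assumes the effect size column is partial eta squared. All names are hypothetical.

```python
MS_ERROR = 2.996   # error mean square, as stated in the note to Table 9
DF_ERROR = 88      # error degrees of freedom

# (source, df, mean square) as listed in the RT1 table
EFFECTS = [
    ("Vocabulary level", 1, 42.99),
    ("Reading condition", 2, 40.52),
    ("Vocabulary level * reading condition", 2, 11.27),
]


def f_ratio(ms_effect: float, ms_error: float = MS_ERROR) -> float:
    """F is the effect mean square divided by the error mean square."""
    return ms_effect / ms_error


def partial_eta_squared_from_ms(df_effect: int, ms_effect: float,
                                df_error: int = DF_ERROR,
                                ms_error: float = MS_ERROR) -> float:
    """Partial eta squared from sums of squares rebuilt as df * mean square."""
    ss_effect = df_effect * ms_effect
    ss_error = df_error * ms_error
    return ss_effect / (ss_effect + ss_error)


if __name__ == "__main__":
    for source, df, ms in EFFECTS:
        print(f"{source}: F = {f_ratio(ms):.2f}, "
              f"partial eta squared = {partial_eta_squared_from_ms(df, ms):.3f}")
```

The recomputed F ratios (14.35, 13.52, 3.76) and effect sizes (0.140, 0.235, 0.079) match the tabulated values.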
Figure 3: Profile plot: RT1 scores for vocabulary levels (legend: EBLD, PBLD, ND). This figure appears
in colour in the online version of the International Journal of Lexicography.
those who used the EBLD scored significantly higher than those who did not
use any dictionary (mean difference = 3.27, p < 0.001), and those with access to
the PBLD (mean difference = 2.45, p = 0.009).
Results of two-way ANOVA in Table 10 show that there was no significant
interaction between the effects of vocabulary levels and reading conditions on
students’ RT2 scores [F(2, 88) = 1.15, p = 0.322, η² = 0.025]. The main effect
of vocabulary levels on students’ long-term vocabulary retention was statistically
significant [F(1, 88) = 13.30, p < 0.001, η² = 0.131], as was the case with
Table 9: Post hoc tests for six new cell codes: multiple comparisons
Dependent variable: RT1 scores (Tukey HSD)
(I) six new cell codes   (J) six new cell codes   Mean Difference (I-J)   Std. Error   Sig.
Based on observed means. The error term is Mean Square (Error) = 2.996.
*The mean difference is significant at the 0.05 level.
Table 10: Two-way ANOVA for RT2 scores as a function of reading condi-
tions and vocabulary levels
RT2
Source                                 df   Mean Square   F       Sig.    η²
Vocabulary level                       1    52.52         13.30   0.000   0.131
Reading condition                      2    26.57         6.73    0.002   0.133
Vocabulary level * reading condition   2    4.54          1.15    0.322   0.025
Error                                  88   2.99
Table 11: Means, standard deviations, and n for RT2 scores as a function of
reading conditions and vocabulary levels
EBLD PBLD ND
3.6.1 BLD use vs. contextual guessing. The study yielded clear evidence to support
Hypothesis 1 proposed in Section 3.1, i.e. students using the PBLD or
the EBLD achieve significantly better results of vocabulary learning than
those without access to the dictionary. Students who used the BLDs fared
Table 12: Correct answer rates for each multiple choice question
question revealed that dictionary use did not necessarily produce better results
than non-dictionary use in every case. Table 12 shows the correct answer rates
for each question between the ND group and the two BLD groups combined.
In this case arrive finally in a place and end up are treated as two distinct senses
of wind up. Yet the Chinese translation covers only one of the senses, omitting
3.6.2 BLD use and incidental vocabulary acquisition. As revealed by the study, stu-
dents who had access to the dictionary, be it the PBLD or the EBLD, achieved
significantly better results on both vocabulary retention tests than those who
did not, thus confirming the benefit of BLD use for incidental vocabulary
learning. As can be calculated from Tables 5 and 6, there was a large effect
size of both time (η = 0.76) and reading conditions (η = 0.52) on vocabulary
retention scores. Since the RT2 was done seven days later than the RT1, the
factor of time surely played a major role, for a loss of lexical knowledge is
bound to occur when students have no further exposure to target lexical items.
The more noteworthy finding is the large effect size (η = 0.52) of reading
conditions upon retention scores, suggesting that the former exerted a powerful
impact on the latter.
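The two effect sizes quoted in this paragraph (0.76 and 0.52) appear to be the square roots of the explained-variance values reported in the Results section (0.578 for time and 0.271 for reading conditions). The sketch below shows that calculation; the reading of the coefficient as the square root of explained variance is an inference from the numbers rather than something the paper states, and the function name is hypothetical.

```python
import math


def eta_from_explained_variance(explained_variance: float) -> float:
    """Square root of an explained-variance (eta squared) value."""
    return math.sqrt(explained_variance)


if __name__ == "__main__":
    for label, value in [("time", 0.578), ("reading conditions", 0.271)]:
        print(f"{label}: {eta_from_explained_variance(value):.2f}")
```

Rounded to two decimals, the results are 0.76 and 0.52, in line with the figures given above.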
The advantage of dictionary use for vocabulary retention identified by the
study echoes what was found in Luppescu and Day (1993), Knight (1994), and
Hulstijn et al. (1996). It can be explained in terms of the Involvement Load
Hypothesis proposed by Laufer and Hulstijn (2001). This task-induced con-
struct involves motivational and cognitive dimensions: need, search and evalu-
ation, which can be absent or present during word processing in a natural or
artificially designed task. Compared with the ND group, the task-induced in-
volvement load of the BLD groups is higher, because the component of search
was present: students had to consult the dictionary when dealing with the task.
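The comparison made in this paragraph can be sketched as a simple tally of the three components. This is a minimal illustration, assuming a binary coding (0 = absent, 1 = present) as suggested by the wording "absent or present"; the need and evaluation values below are placeholders chosen only so that the two conditions differ in search alone, and the class name is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvolvementLoad:
    """Task-induced involvement: each component is 0 (absent) or 1 (present)."""
    need: int
    search: int
    evaluation: int

    def index(self) -> int:
        # The overall load is taken here as the sum of the three components.
        return self.need + self.search + self.evaluation


# Placeholder values for need and evaluation; only the search component
# reflects the contrast described in the text.
nd_group = InvolvementLoad(need=1, search=0, evaluation=1)   # no dictionary
bld_group = InvolvementLoad(need=1, search=1, evaluation=1)  # dictionary lookup

if __name__ == "__main__":
    print("ND group load:", nd_group.index())
    print("BLD group load:", bld_group.index())
```

With the other components held equal, the presence of search alone gives the BLD groups the higher involvement load.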
3.6.3 Comparison between BLDs in paper and electronic form. The study showed that
there was no significant difference between the PBLD and the EBLD groups in
3.6.4 The effects of vocabulary levels and reading conditions on incidental vocabulary
acquisition. The study revealed a significant interaction, with a medium effect
size (η = 0.28), between the effects of vocabulary levels and reading conditions
on students’ RT1 scores. In particular, the effect size of reading conditions was
larger than that of vocabulary levels (0.48 vs. 0.37), suggesting that the former
exerted a more powerful influence on vocabulary retention than the latter.
What is more noteworthy is the finding about the effects of different BLDs
on immediate vocabulary retention for students at different vocabulary levels.
For the higher level group, PBLD use led to significantly better results than
non-dictionary use whereas for the lower level group, EBLD users fared sig-
nificantly better than the PBLD and the ND groups. In other words, students
at the higher vocabulary level were at an advantage when using the PBLD
while the EBLD proved to be more beneficial for those at the lower vocabulary
level.
As to the long-term retention measured by the RT2, the interaction between
the effects of vocabulary levels and reading conditions was not statistically
significant, though the main effects of both factors reached the 0.05 significance
level. The effect sizes of both factors dropped from large in the case of the RT1
to medium in the RT2 and were very close to each other in the latter case.
Obviously, the impact of reading conditions and vocabulary levels both
decreased with the lapse of time. Students at the higher vocabulary level
achieved substantially better scores on both retention tests than those at the
lower level. In other words, more proficient students remembered more words
than less proficient ones. For students at both levels of vocabulary proficiency,
those using the EBLD fared significantly better than those without access to
the dictionary. Interestingly, for the higher level group, the advantage of PBLD
use over non-dictionary use diminished to a considerable degree from the RT1
to the RT2 while students at the lower level group were still at an advantage
when using the EBLD.
The discussion above substantiates Hypothesis 3, i.e. students varying on
vocabulary proficiency levels and reading conditions fare differently on inci-
dental vocabulary acquisition.
4. Concluding remarks
The study revealed that compared with non-dictionary use, BLD use can
effectively facilitate vocabulary comprehension, indicating that dictionary use
is a more effective strategy of vocabulary learning than contextual guessing.
Useful as it is, contextual guessing might be a process prone to incomplete or
Despite the general advantage of BLD use over non-dictionary use for vo-
cabulary learning, some problems with dictionary use were also identified by
the author. Some of them are related to the dictionary itself, but more are
concerned with students’ inadequate dictionary use skills. For example, some
students were unable to distinguish variant forms of phrases or collocations;
some did not pay sufficient attention to dictionary examples; and some were
too careless or overconfident to make judicious choices. These problems might
not be specific to BLD use and might also be found with students using other
types of dictionaries. Therefore, teachers should pay sufficient attention to
students’ dictionary use skills and provide necessary training to help them
make the best use of the dictionary.
Acknowledgments
Notes
1 In this article, the term L2 refers to second or foreign language acquired after one’s
native language (L1). The acquisition-learning distinction as proposed by S. Krashen
(1981) is not distinguished here; therefore, the terms vocabulary learning and vocabu-
lary acquisition are used interchangeably throughout the article.
2 It is acknowledged that such a simple format allows only a crude measurement of
the word knowledge elaborated by Nation (2001). However, since the purpose of the
study is not to investigate how well subjects mastered new lexical items but how much
they retained after different experimental treatment, there is reason to claim that such a
test is as adequate as any other.
References
Aizawa, K. 1999. A Study of Incidental Vocabulary Learning Through Reading by
Japanese EFL Learners. Tokyo: Tokyo Gakugei University.
Albus, D., J. Bielinski, M. Thurlow and K. Liu. 2001. The Effect of a Simplified English
Language Dictionary on a Reading Test (LEP Project Report). Minneapolis, MN:
University of Minnesota. Available at: http://education.umn.edu/NCEO/OnlinePubs/LEP1.html.
Atkins, B. T. S. and K. Varantola. 1998. ‘Language Learners Using Dictionaries:
The Final Report on the EURALEX/AILA Research Project on Dictionary Use’.
In B. T. S. Atkins (ed.), Using Dictionaries. Studies of Dictionary Use by Language
Learners and Translators. Tübingen: Niemeyer, 21–81.
Bensoussan, M., D. Sim and R. Weiss. 1984. ‘The Effect of Dictionary Usage on EFL
Test Performance Compared with Student and Teacher Attitudes and Expectations’.
Reading in a Foreign Language, 2.2: 262–276.
Bogaards, P. 2002. ‘The Use of the DE GRUYTER WÖRTERBUCH DEUTSCH
ALS FREMDSPRACHE for receptive purposes’. In H. E. Wiegand (ed.),
Perspektiven der pädagogischen Lexikographie des Deutschen. Untersuchungen
anhand des ‘de Gruyter Wörterbuchs Deutsch als Fremdsprache’. Tübingen: Niemeyer,
Read, J. 2007. ‘Second Language Vocabulary Assessment: Current Practices and New
Directions’. International Journal of English Studies, 7.2: 105–125.
Ronald, J. 2003. ‘A Review of Research into Vocabulary Acquisition through
Dictionary Use. Part 1: Intentional Vocabulary Learning through Dictionary Use’.
Studies in the Humanities and Sciences, 44.1: 285–307.
Rundell, M. 1999. ‘Dictionary Use in Production’. International Journal of
Lexicography, 12.1: 35–53.
Schmidt, R. 1994. ‘Deconstructing Consciousness: in Search of Useful Definitions for
Applied Linguistics’. AILA Review, 11: 11–26.
Schmidt, R. 2001. ‘Attention’. In P. Robinson (ed.), Cognition and Second Language
Instruction. Cambridge: Cambridge University Press, 3–32.
Schmitt, N., D. Schmitt and C. Clapham. 2001. ‘Developing and Exploring the Behavior
of Two New Versions of the Vocabulary Levels Test’. Language Testing, 18.1: 55–88.