The Role of Derivational Morphology in Vocabulary Acquisition: Get by With A Little Help From My Morpheme Friends
Bertram, R., Laine, M. & Virkkala, M. M. (2000). The role of derivational morphology in vocabulary acquisition: Get by with a little help from my morpheme friends. Scandinavian Journal of Psychology, 41, 287-296.
This study explores the role of morphology in the vocabulary knowledge of 3rd and 6th grade Finnish elementary school children. In a word definition task, children from both grades performed better overall on derived words than on monomorphemic words. However, the results were modified by the factors Frequency and Productivity. Most strikingly, performance on monomorphemic words was disproportionately weaker than on derived words at the low-frequency range. At the high-frequency range, derived words with low-productive suffixes yielded the poorest performance. We appeal in part to the lexical-statistical properties of the Finnish language to explain the interaction of Frequency and Word Structure. At any rate, the results suggest that Finnish elementary school children benefit significantly from utilizing morphology in determining word meanings.
Key words: Morphology, acquisition, Finnish, lexicon, productivity, frequency.
Raymond Bertram, Department of Psychology, University of Turku, FIN-20520 Turku, Finland. E-mail: [email protected]
Elementary school children encounter a huge number of words that they have never seen before. Nagy and Anderson (1984) estimate on the basis of the American Heritage Word Frequency Book (WFB) of Carroll, Davies, and Richman (1971) that in a corpus of printed school English in the US from grades three through nine, more than 600,000 orthographically distinct word types are present. The greater part of these word types is of very low frequency. As a matter of fact, about 585,000 word types would occur no more than 1 time per million. Somehow children have to make sense of these low frequency words they are confronted with during their elementary school years. After all, they are supposed to extract a general understanding of the texts they are exposed to. How would children get around the problem that many of these texts contain a large number of low frequency words that are most probably neologisms to them?

The most straightforward way to deal with this challenge is seemingly to take up a dictionary and read the definition and other lexical information related to the lexical entry. This strategy, however, would be highly disruptive when one is reading texts with a large number of unfamiliar words. Moreover, recent studies (e.g., Scott & Nagy, 1997) show that elementary school children frequently misunderstand dictionary definitions and that they fail to use adequately the information dictionary definitions provide them with.

An alternative strategy would be to make use of the context to determine the meaning of an unfamiliar word form. Several studies indicate that this strategy can indeed be beneficial (Graves, 1986; Nagy & Herman, 1987; Wysocki & Jenkins, 1987). Nevertheless, the context does not always give sufficient clues to determine the meaning of an unfamiliar word. Wysocki and Jenkins (1987) found that children's success in deriving the meaning of unfamiliar words was very much dependent on the strength of the surrounding sentence context. Fairly often the context is rather neutral and does not hint at the meaning of an unfamiliar word. In addition, the more unfamiliar word forms occur in a text, the fewer clues the context will provide the readers with and the more difficult it will be to determine the meaning of a particular unfamiliar word form.

A more successful way to determine the meaning of many unfamiliar words could be to make use of a word's morphological structure, since most of the low frequency words consist of two or more morphemes (Nagy & Anderson, 1984; see also Baayen, 1994, who calculated that in the Dutch INL corpus of 42 million words there are 13,360 word formations occurring only once, and 96% of them are morphologically complex). The ability to recognize the morphemic structure in complex words and the ability to arrive at the meaning by means of the more familiar components therefore seem very useful. Indeed, there is ample evidence that children use word elements in learning unfamiliar words (e.g., Anglin, 1993; Freyd & Baron, 1982; Shu & Anderson, 1997; Tyler & Nagy, 1989; Van Daalen-Kapteijns & Elshout-Mohr, 1981). In a detailed study, Anglin (1993) shows that a great deal of the vocabulary growth of elementary school children can be accounted for by an increasing ability to deal with morphologically complex words. Moreover, his study clearly points out that many morphologically complex words are not listed in the
© 2000 The Scandinavian Psychological Associations. Published by Blackwell Publishers, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA. ISSN 0036-5564.
mental lexicon, but that children from the 1st, 3rd and 5th grade arrive at their meaning via the morphological substructure of the words. Freyd and Baron (1982) found that fast vocabulary development is enhanced by morphological awareness. In their study, superior 5th grade elementary school children (mean age 10:9 years) outperformed average 8th graders on a standard vocabulary test including 30 monomorphemic words and 30 derived words. The 5th graders outperformed the 8th graders on both word types, but the difference between grades was greater for the derived words than for the simple words. This difference was based on the greater ability of the superior 5th graders to analyze words into morphemes, more specifically, to detect the root in the derived word forms and to give definitions on the basis of the root.

All in all, the general observation is that morphology has a role to play in vocabulary acquisition. The more specific question in this study is whether the effect of morphological structure interacts with factors such as frequency and affixal productivity (see Baayen, 1994, for a way of measuring the degree of productivity of a given affix) that have been shown to affect lexical processing of adults (Bertram, Laine, & Karvinen, 1999; Bertram, Schreuder, & Baayen, 2000b; Stemberger & MacWhinney, 1986). With respect to frequency, some models like the Morphological Race Model of Frauenfelder and Schreuder (1992) or the AAM model of Caramazza and his colleagues (Burani & Caramazza, 1987; Caramazza, Laudanna, & Romani, 1988) assume that morpheme-based processing is much more likely to take place for low frequency complex words than for complex words of higher frequency. The idea is that it would necessarily take a certain number of exposures for complex words to develop full-form representations via which lexical processing could take place. Bertram et al. (2000b) stress that the balance of storage and computation hinges upon the interaction of specific factors of the affixes involved. In too many studies a pot-pourri of affixes is selected to represent a certain category (for instance, derivations or inflections) without taking into consideration affix-specific properties.¹

Another new aspect of the present study is the language involved. While most vocabulary acquisition studies have been conducted in English with English natives (but see e.g., Shu & Anderson, 1997 and Shu, Anderson, & Zhang, 1995, for some exceptions), this study explores a language very different from any Indo-European language, namely Finnish. Finnish is a Finno-Ugric, agglutinative language with a very rich morphology. It has been estimated that nominal inflection in Finnish yields over 2000 possible word forms (Karlsson & Koskenniemi, 1985). Nouns may be marked for number, case (13 cases in active use), and possession. In addition, clitic particles conveying pragmatic information may be attached to the end of a word. Moreover, derivation and compounding are very productive in Finnish, leading to huge morphological families.² Of the 1,022,944 distinct noun types in our database,³ only 26,355 (2.6%) are accounted for by monomorphemic nominative singulars.⁴ Interestingly, hapaxes, word formations that appear only once in our corpus, surface relatively more in the polymorphemic (N = 613,572, 60.7%) than in the monomorphemic group (33.6%). To summarize, the bulk of Finnish words in running text is polymorphemic (for nouns, 97.4%) and of very low frequency. Consequently, Finnish language users will have to resort to morphological parsing quite frequently. This leads us to speculate that the role of morphology in this language could be visible in domains where one would not easily expect it, for instance at the high-frequency range.

In this study, we will employ a vocabulary (word definition) test to examine the performance of native Finnish 3rd and 6th graders on monomorphemic and derived words, presented in a random order, while controlling for psycholinguistically relevant factors such as word length (in letters), surface frequency (the number of times the word proper occurs in our corpus), lemma frequency (the number of times the word proper plus all the inflectional variants of the stem occur in our corpus) and average bigram frequency (the average of the number of times that all combinations of two subsequent letters in a word occur in our corpus). Moreover, we will systematically investigate the role that frequency and affix productivity might play in children's word knowledge.

MATERIALS AND METHODS

This study investigates how morphological knowledge contributes to Finnish children's understanding of words. More specifically, the question is what children would know about a morphologically complex word form such as juusto + la ("cheese" + location marker → "cheesery") in comparison to a matched monomorphemic control word veitikka "rascal". Understanding is measured by oral definitions given by children to the probe words. Thus the general question addressed in this study is whether children significantly benefit from morphological structure in determining the meaning of derived words. By varying the factors Frequency and Productivity, we will investigate whether and how performance differs for high frequency versus low frequency words and for derived words with high versus low productive suffixes. We selected a group of 3rd grade children and a group of 6th grade children to assess these questions at different developmental stages.

To assess the potential effect of word frequency on vocabulary knowledge of elementary school children, a group of words of relatively high frequency (about 35 per million) and a group of words of relatively low frequency (about 1 per million) were selected. Frequency effects are found in all kinds of lexical tasks with various subject populations (e.g., Baayen, Dijkstra & Schreuder, 1997; Laine, Niemi, Koivuselkä-Sallinen & Hyönä, 1995; Stemberger & MacWhinney, 1986), generally showing that high
frequency words are dealt with more effectively than low frequency words. Accordingly, we expect that our sample of school children will also show better performance for the high frequency than for the low frequency words. Of particular interest is the possible interaction between frequency and morphological structure, that is, whether children benefit from their morphological knowledge to the same extent in the high and the low frequency range. As suggested by several authors (e.g., Schreuder & Baayen, 1995; Stemberger & MacWhinney, 1986), whole-word representations play a much more important role in the access of complex words in the higher frequency range. The role of morphology might therefore be more visible at the low frequency range than at the high frequency range. On the other hand, the overall morphological productivity of the Finnish language could entail that effects of morphology are visible even at the high frequency range.

To assess the potential effect of suffix productivity, a group of derived words with a relatively high-productive suffix⁵ (henceforth, high-productive derivations) and a group of derived words with a relatively low-productive suffix (henceforth, low-productive derivations) were selected and presented to the children. Bertram et al. (1999; 2000b) suggest that high productivity triggers rule-based processing behaviour. In other words, the more productive a suffix is, the more likely it is that morphological structure affects performance. In Experiment 1, the vocabulary test was administered to the 3rd graders and in Experiment 2 to the 6th graders.

THE SCORING SYSTEM

Two of the authors (MMV & ML), being native speakers of Finnish, rated all the definitions of six of the 3rd and 6th grade children independently. Additional scoring was included to assess suffix knowledge in greater detail. The ratings of the judges were highly correlated (for each condition, r > 0.90), after which one of them (MMV) rated the rest of the material. All of the average scores presented here are based on her ratings.

Scoring on the basis of the whole word form

The scoring system we employed was based on the meaning and part of speech of the whole word form. For every target word, a child obtained 2 points when the given definition was fully correct, 1 point when the definition was partly correct, and 0 points when the definition was not correct at all or when no answer was given. As a reference, the definitions of a standard Finnish dictionary, the Nykysuomen sanakirja (1978), were employed. It should be noted that for almost any derivation the meaning is greatly determined by the stem. In practice this means that at least one point was given when the children indicated knowledge of the stem, but no points when they indicated knowledge of the suffix only. For instance, for the word sika + la ("pig" + location marker → "piggery"), a definition like "a place where you can buy computers" will be given 0 points; a definition like "it smells there, there are pigs" will be given 1 point; a definition like "a place where they breed pigs" will be given full credit, that is, 2 points. When the stem was stripped off and appeared in another morphologically simple or complex word formation (e.g., an answer like sikaa, "pig" + partitive, on the presentation of sikala), the answer was rewarded with 1 point. This purely formal morphological analysis did not occur very often though (on 4.0% of the items in the 3rd grade and 1.8% in the 6th grade).

Scoring on the basis of the suffix for zero point responses

For all those instances where children's answers were rated zero points, we conducted subsequent analyses to assess their suffix knowledge in more detail. One point was given for a suffix that was formally and/or semantically recognized in the response. In other words, in this suffix-related scoring, the children got one point when they had said juustola "cheesery" or "a place where you can buy computers" upon the presentation of sikala "piggery".

EXPERIMENT 1

Method

Participants. Thirty-two elementary school children from two 3rd grade classes (on the average 9:6 years of age) of the Puolala School in Turku (Finland) were tested individually in a quiet room. All of them were native speakers of Finnish.

Materials. Seventy target words were selected from our lexical database accessed by the search program WordMill of Laine & Virtanen (1999). Thirty-five of them were from the high-frequency range and 35 from the low-frequency range. Within each frequency range there were three different word types: derived words with a high-productive suffix, derived words with a low-productive suffix and monomorphemic words. In Table 1, the relevant quantitative data of the 6 different conditions are presented; in addition, the suffix functions are described and presented with an example. Within each frequency range, conditions were matched for word length in letters and for average lemma, surface and bigram frequency. Moreover, the root frequency of the low- and high-productive derivations was matched at both frequency ranges.⁶ Note that the root frequency should not have any effect if the children's responses are based on the whole-word form. In other words, if no morphological analysis takes place, the response pattern for both types of derivation and the monomorphemic word forms should not differ. Finally, adding the derivational suffixes did not alter the root orthographically or phonologically, neither in the low-productive nor in the high-productive condition. In other words, we deliberately selected our derived target items so that no stem formation took place that could have obscured the salience of the root. All the suffixes employed conveyed both grammatical and lexical information.
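
The two-step scoring scheme described under THE SCORING SYSTEM can be summarized in a small sketch. The rater's judgements are the inputs; the class and function names (`Judgement`, `whole_word_score`, `suffix_score`) are illustrative assumptions and were not part of the study materials. The three example judgements follow the sample definitions of sikala given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Judgement:
    """A rater's reading of one oral definition (hypothetical structure)."""
    fully_correct: bool = False      # agrees with the dictionary meaning
    partly_correct: bool = False     # e.g. shows knowledge of the stem only
    suffix_recognized: bool = False  # suffix recognized formally and/or semantically


def whole_word_score(j: Judgement) -> int:
    """2 = fully correct, 1 = partly correct, 0 = incorrect or no answer."""
    if j.fully_correct:
        return 2
    if j.partly_correct:
        return 1
    return 0


def suffix_score(j: Judgement) -> Optional[int]:
    """Supplementary score, applied only to zero-point responses."""
    if whole_word_score(j) != 0:
        return None  # response is not part of the supplementary analysis
    return 1 if j.suffix_recognized else 0


# Sample definitions of sikala ("piggery") from the text.
buy_computers = Judgement(suffix_recognized=True)  # "a place where you can buy computers"
smells_pigs = Judgement(partly_correct=True)       # "it smells there, there are pigs"
breed_pigs = Judgement(fully_correct=True, suffix_recognized=True)  # "a place where they breed pigs"

scores = [whole_word_score(j) for j in (buy_computers, smells_pigs, breed_pigs)]
```

In this sketch a location-type answer with the wrong stem earns nothing in the whole-word scoring but one point in the supplementary suffix scoring, which is the case the second analysis was designed to capture.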
Table 1. Quantitative data of the 6 target conditions in Experiment 1, 3rd grade

Condition                                          Root       Lemma      Surface    Bigram     Word length  Number
                                                   frequency  frequency  frequency  frequency  in letters   of items
High-frequent derivation, high-productive suffix   314        32.8       6.4        1094       8.00         15
High-frequent derivation, low-productive suffix    181        33.9       8.4        1181       7.73         15
High-frequent monomorphemic word                   -          38.2       6.5        1120       7.20         5
Low-frequent derivation, high-productive suffix    29.7       0.81       0.20       1359       8.27         15
Low-frequent derivation, low-productive suffix     28.5       1.00       0.23       1297       8.13         15
Low-frequent monomorphemic word                    -          0.86       0.22       1163       7.20         5

Condition                                          Suffix  Function                           Example
High-frequent derivation, high-productive suffix   -jA     deverbal agent marker              laula + ja "singer"
                                                   -Us     deverbal abstract noun marker      ilmoit + us "announcement"
                                                   -ntA    deverbal causative marker          kerro + nta "narration"
High-frequent derivation, low-productive suffix    -stO    denominal collective noun marker   laiva + sto "fleet"
                                                   -mO     deverbal location marker           kampaa + mo "hairdresser's"
                                                   -lA     denominal location marker          kahvi + la "coffee bar"
High-frequent monomorphemic word                   none    -                                  kellari "cellar"
Low-frequent derivation, high-productive suffix    -jA     deverbal agent marker              somista + ja "decorator"
                                                   -Us     deverbal abstract noun marker      kuitta + us "receipt"
                                                   -ntA    deverbal causative marker          haudo + nta "bathing"
Low-frequent derivation, low-productive suffix     -stO    denominal collective noun marker   taru + sto "mythology"
                                                   -mO     deverbal location marker           sulatta + mo "meltery"
                                                   -lA     denominal location marker          juusto + la "cheesery"
Low-frequent monomorphemic word                    none    -                                  sikermä "cluster"
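
The frequency columns of Table 1 rest on the corpus counts defined in the Introduction (surface, lemma and average bigram frequency). As a rough illustration only, the sketch below computes these three measures for one word on a toy token list. The function name `frequency_measures`, the `lemma_of` mapping, the token-weighted bigram count and the per-million scaling are all assumptions made for this example; the actual values came from the authors' lexical database.

```python
from collections import Counter


def bigrams(word):
    """All pairs of two subsequent letters in a word."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def frequency_measures(target, tokens, lemma_of):
    """Surface, lemma and average bigram frequency of `target` in `tokens`.

    `lemma_of` maps inflected forms to their lemma (hypothetical input);
    forms that are not listed count as their own lemma.
    """
    surface_counts = Counter(tokens)
    lemma_counts = Counter(lemma_of.get(t, t) for t in tokens)
    bigram_counts = Counter(bg for t in tokens for bg in bigrams(t))

    per_million = 1_000_000 / len(tokens)  # assumed normalization
    target_bigrams = bigrams(target)
    return {
        "surface_per_million": surface_counts[target] * per_million,
        "lemma_per_million": lemma_counts[lemma_of.get(target, target)] * per_million,
        "avg_bigram_count": sum(bigram_counts[b] for b in target_bigrams) / len(target_bigrams),
    }


# Toy corpus: sikaa is an inflected (partitive) form of sika; sikala is a
# derivation and therefore counts as a separate lemma.
tokens = ["sika", "sikaa", "sikala", "sika"]
lemma_of = {"sikaa": "sika"}
measures = frequency_measures("sika", tokens, lemma_of)
```

On this toy input the lemma count of sika exceeds its surface count because the partitive form is folded in, while the derived form sikala is kept apart, mirroring the distinction the text draws between inflectional variants and derivations.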
Procedure. Each word was printed on a separate card and shown to the participants. At the same time the word was spoken aloud by the experimenter. The children were instructed to give an oral definition of every single word presented. When they indicated that they did not know a particular word, they were encouraged to guess. Responses were tape-recorded and transcribed afterwards for further analysis. The experiment lasted approximately 30 minutes and was preceded by 10 practice words of variable morphological structure and frequency.

RESULTS

Prior to statistical analysis we excluded two opaque items from the low-frequent low-productive derivations, for which, according to three Finnish native speakers, it was impossible to calculate the meaning via the constituent morphemes. All other items were included as they were within two standard deviations of their condition mean. Two children were discarded due to overall non-responsiveness.

Scoring on the basis of the whole word form

The responses on the remaining 68 definitions of the remaining 30 children were used to calculate the mean scores per condition (see Table 2).

Table 2. Mean definition scores with SD of the 6 target conditions of the 3rd graders in Experiment 1

A 2 × 3 repeated measures ANOVA revealed a significant main effect for Frequency (F1(1, 29) = 312, p < 0.001), with high-frequent items being better defined than low-frequent items, and for Morphological Structure (F1(2, 58) = 52.2, p < 0.001). Moreover, the interaction between these two factors was significant (F1(2, 58) = 36.2, p < 0.001). As regards the main effect of Morphological Structure, post-hoc comparisons based on subsequent F-tests showed that the words with high-productive suffixes were better defined than the words with low-productive suffixes (F1(1, 29) = 61.8, p < 0.001) or the monomorphemic words (F1(1, 29) = 79.8, p < 0.001). In addition, words with low-productive suffixes elicited significantly higher scores than monomorphemic words (F1(1, 29) = 15.8, p < 0.001). The observed interaction is mainly caused by the large performance difference on the high vs. low-frequent monomorphemic words (a difference of 1.10 points), whereas performance on both types of derived words drops down less drastically when moving from the high- to the low-frequency range (0.67 for high-productive and 0.49 for low-productive derived words). Separate ANOVAs at the high and low frequency range both showed a significant effect for morphological structure (F1(2, 58) = 42.2, p < 0.001 and F1(2, 58) = 44.2, p < 0.001, respectively). Post-hoc comparisons revealed that all contrasts differed significantly from each other (all p's < 0.05). This means that at the high-frequency range high-productive derivations elicit the best performance, followed by monomorphemic words and low-productive derivations. At the low-frequency range, again high-productive derivations yield the best performance, followed by low-productive derivations and by monomorphemic words.

If we look at the distribution of 0-, 1-, and 2-point definitions, it appears that the percentage of 0-point definitions is higher for the monomorphemic words than for the derived items (monomorphemic words 33.3%, low-productive derivations 14.8%, high-productive derivations 16.1%). For the 2-point definitions, the proportion is greater for high-productive derivations than for the other two word types (monomorphemic words 48.3%, low-productive derivations 46.3%, high-productive derivations 66.7%), yielding a highly significant effect (χ²(4) = 164.9, p < 0.001, with Yates' continuity correction).

Scoring on the basis of the suffix for zero point responses

For all the 130 instances where an attempt to define a derived word yielded zero points, it was decided whether suffix knowledge was present in the answer (either formally or semantically). In 71.5% of all these instances (N = 93), this was indeed the case. Most interestingly, a chi-square test shows that suffix-related knowledge is not randomly distributed over the high- and low-productive condition (χ²(1) = 22.8, p < 0.001, with continuity correction). Suffix knowledge was much more common in the high-productive condition (60 out of 66) than in the low-productive condition (33 out of 64).

DISCUSSION OF EXPERIMENT 1

In line with previous research, this study shows a robust frequency effect. Children's definitions of high-frequency words were, on an ordinal scale, nearly twice as good as those of low-frequency words. As expected, 3rd graders have a better understanding of words that are used and/or encountered regularly than of words that are employed only occasionally.

More interestingly, the morphological make-up of words clearly affects our subjects' performance in giving word definitions. First of all, overall performance is poorest for the category for which no fall-back on morphology is possible, that is, the monomorphemic words, and best for derived words with high-productive suffixes. The most intriguing part of the data is the differential pattern that we found at the high-frequency range in comparison with the low-frequency range. In both frequency ranges, high-productive derivations elicit the best performance. However, whereas monomorphemic words elicit good performance at the high-frequency range, they caused considerable trouble at the low-frequency range. In contrast, low-productive derivations elicit relatively good performance at the low-frequency range, but relatively poor performance at the high-frequency range. How could we explain this diverse pattern of results? We presume in accordance with previous research (e.g., Stemberger & MacWhinney, 1986) that full-form representations had little chance to develop at the low-frequency range (in our case words with 1 or fewer occurrences per million). When no or only weak full-form representations are available, little sense can be made of monomorphemic words, which do not contain sub-lexical units. However, for derived words, sub-lexical units do exist in the form of rather high-frequent morphemes, and calculating the meaning on the basis of these units provides a rather successful back-up option, be it more successful for high- than for low-productive derivations. At the high-frequency range, monomorphemic words have developed more stable representations via which children frequently retrieve the correct semantics. The fact that both types of derivation differ from monomorphemic words at this frequency range suggests to us that morphological structure still has an impact here. For low-productive derivations it is hindering performance, since the low-productive suffixes are not yet so familiar and the grammatical and semantic operations of the suffixes on the word roots are not so well known. For the high-productive derivations, performance is boosted, since the suffixes here are more familiar, leading to a better understanding of the grammatical and semantic operations of the suffixes on the word roots. This notion is backed up by the chi-square analysis of suffix knowledge on zero-scored words, where high-productive suffixes elicit better performance than low-productive ones.

It should be noted that the low-productive suffixes employed here are rather transparent and it may well be that the 6th graders, being more advanced in their vocabulary knowledge in general, and in their morphological awareness and knowledge in particular, will perform better with the low-productive suffixes than the 3rd graders. By means of more or less the same vocabulary test for the 6th graders of the same school, we tried to explore this possibility, next to the other issues already raised in the Introduction.

EXPERIMENT 2

Method

Participants. Thirty-two elementary school children (on the average 12:6 years of age) from two 6th grade classes of the Puolala School in Turku, Finland, were tested individually in a quiet room on location. All of them were native speakers of Finnish.

Materials. Seventy target words were selected from our lexical database, 35 representing the high-frequency range and 35 the low-frequency range. The high-frequency words were exactly the same as the ones employed for the 3rd grade, but about 43% of the low-frequency words were replaced in order to increase the degree of difficulty. As in Experiment 1, both frequency conditions included three different word types: derived words with a high-productive suffix, derived words with a low-productive suffix and monomorphemic words. In Table 3, the relevant quantitative data of the three conditions at the low-frequency range (the ones that differed from Experiment 1) are presented. As in Experiment 1, word length in letters, lemma frequency, surface frequency and bigram frequency were controlled. The root frequency difference between the high-productive derivations (7.1) and the low-productive derivations (18.9) almost reached significance (t2(28) = 1.94, p2 = 0.06). We discuss the implications of this root frequency bias after presentation of the results.

Procedure. The procedure was identical to that of Experiment 1.
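
Because the procedure, and with it the supplementary suffix analysis, carries over from Experiment 1, the chi-square reported there offers a convenient worked check. The sketch below recomputes the 2 × 2 test with Yates' continuity correction from the counts given in the text: 60 of 66 zero-point responses showed suffix knowledge in the high-productive condition, against 33 of 64 in the low-productive condition. The function name `yates_chi_square` is an assumption of this example, and the closed-form 2 × 2 formula is used here in place of a statistics library.

```python
def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square (df = 1) with Yates' correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # The correction subtracts n/2 from |ad - bc|, floored at zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


# Rows: high- vs. low-productive condition.
# Columns: suffix knowledge present vs. absent in the zero-point response.
chi2 = yates_chi_square(60, 66 - 60, 33, 64 - 33)
print(round(chi2, 1))  # 22.8, matching the value reported for Experiment 1
```

The marginal totals in this table also reproduce the other figures in the text: 93 of the 130 zero-point responses (71.5%) show suffix knowledge.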
Table 3. Quantitative data of the three low-frequency target conditions in Experiment 2, 6th grade

Condition                                         Root       Lemma      Surface    Bigram     Word length  Number    Suffixes         Example
                                                  frequency  frequency  frequency  frequency  in letters   of items  employed
Low-frequent derivation, high-productive suffix   7.1        0.80       0.16       1100       7.80         15        -jA, -Us, -ntA   kuitta + us "receipt"
Low-frequent derivation, low-productive suffix    18.9       0.52       0.15       1237       7.93         15        -stO, -mO, -lA   taru + sto "mythology"
Low-frequent monomorphemic word                   -          1.05       0.28       1361       8.20         5         none             sikermä "cluster"

Table 4. Mean definition scores with SD of the 6 target conditions of the 6th graders in Experiment 2
by the now non-significant difference between high- and low-productive suffixes in the additional chi-square analysis on suffix knowledge. However, on the basis of this explanation one would expect equal performance for high- and low-productive derivations and not that the latter outperform the former. This unexpected difference then is most probably caused by the (almost significantly) higher average root frequency for the low-productive derivations in comparison to the average root frequency of the high-productive ones. As noted before, frequency of exposure is one of the most consistent and important factors in triggering differential performance patterns in lexical processing. This root frequency effect would be yet another line of evidence that at the low-frequency range children effectively compute the syntax/semantics of complex words on the basis of the constituent morphemes.

GENERAL DISCUSSION

Morphological knowledge matters. As already shown by previous studies (most notably, Anglin, 1993, and Nagy & Anderson, 1984), children acquire a more extensive vocabulary by making use of their ability to analyze and comprehend words via morphological constituents. The general picture that arises from this study is that, on top of that, they have a greater understanding of morphologically complex words than of simple words when the word types are tightly matched on all relevant factors. However, we were able to acquire a more detailed picture by manipulating factors such as frequency of occurrence and suffix productivity in a relatively unexplored but, from a morphological point of view, very interesting language, namely Finnish.

Suffix productivity turned out to be an important differentiating factor. High-productive derivations were understood best by children of both the 3rd and the 6th grade, although in the latter grade performance on low-productive derivations seemed to approach that on high-productive ones. Why would suffix productivity be so important in vocabulary acquisition? A reasonable answer to this question could be given by referring to what Tyler and Nagy (1989) call relational knowledge. They define this type of knowledge as the ability to recognize morphological relations between words that share common morphemes such as work and worker, or for that matter, worker and thinker. It is not hard to see that a morphological pattern that gets reinstated regularly in many word formations at both input and output will be acquired earlier and more strongly than a pattern that is encountered or produced less often. In general, it becomes more and more apparent that affixal properties have a huge impact on processing of morphologically complex words (e.g., Bertram et al., 1999; Bertram et al., 2000b; Burani, Dovetto, Thornton, & Laudanna, 1997; Laudanna & Burani, 1995; Schreuder &

In this study frequency of occurrence interacts with morphological structure. Most notably, knowledge of monomorphemic words at the low-frequency range is rather poor. Given the fact that particularly at this frequency range words are represented only weakly, if at all, this does not come as a surprise. Whereas for complex words children can fall back upon more familiar constituent morphemes, the lack of sub-lexical structure does not allow children to do the same with monomorphemic words. The data patterns for low-frequent derived words are in line with models such as the AAM model (Caramazza et al., 1988) or the MR model (Frauenfelder & Schreuder, 1992), which claim that the morpheme-based route is the rule for complex words at this frequency range. At the high-frequency range, we also find a differential performance pattern for complex words in comparison to monomorphemic words. This goes against the predictions of most models, as they generally assume that the full form route takes care of lexical access and retrieval for such items. If children based their definitions purely on the full form, we should not have observed a difference between the derived and monomorphemic words as we did. The effect of morphology at this frequency range might be linked to the lexical-statistical properties of the Finnish language. Finnish is a language with an extremely rich morphology in which any given word can appear in hundreds or even thousands of different word forms (Karlsson & Koskenniemi, 1985). Therefore a Finnish language user will have to resort to morphology in production and comprehension far more often than users of morphologically restricted languages such as English and Dutch. In that respect it does not come as a surprise that young Finnish children are more sensitive to word-internal morphological structure than their Indo-European age-mates.

Thus frequency, suffix productivity and language all seem to affect the role that morphological structure plays in vocabulary acquisition. Therefore it is hazardous to make bold statements about complex words or even derived/inflected words in general. Moreover, one should already be careful in item selection not to represent a certain category with a variety of affix types which differ in many dimensions. The more general notion of this study is that children benefit greatly from utilizing morphology in determining word meanings. This might be particularly handy while they are engaged in listening to speech or in reading texts with a high number of infrequently used words. As shown in this study, it is especially in the low-frequency range that they will get by with a little help from their morphemic friends.

We wish to thank Jukka Hyönä, Christina Burani, William Nagy, one anonymous reviewer and Pekka Niemi for their helpful comments on an earlier version of this paper. The Turun Sanomat Company kindly provided us with a massive corpus of written Finnish. This study was financially supported by the Academy of
Baayen, 1995). Finland (grant #27774 to Matti Laine), the Centre of International
Mobility (CIMO) and the Graduate School of Psychology, financed storage and computation in morphological processing: the role
by the Finnish Ministry of Education. of Word Formation Type, Affixal Homonymy, and Productiv-
ity. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 26, 489 ±511.
Burani, C. and Caramazza, A. (1987). Representation and proces-
NOTES
sing of derived words. Language and Cognitive Processes, 2,
1 217±227.
There are exceptions to this observation. Lewis and Windsor
(1996), for instance, assessed the effects of suffix productivity on Burani, C., Dovetto, M., Thornton, A. M. & Laudanna, A. (1997).
nonsense derivations. The more productive the suffix, the better Accessing and naming suffixed pseudo-words. In G. E. Booij
production and comprehension performance on the nonsense and J. Van Marle (Eds.), Yearbook of Morphology 1996.
derivations was observed. (pp. 55±73). Dordrecht: Kluwer Academic Publishers.
2 Caramazza, A., Laudanna, A. & Romani, C. (1988). Lexical access
We define a morphological family as the group of derived and
compound words that are sharing the same root. Thus words like and inflectional morphology. Cognition, 28, 297 ±332.
``worker'', ``work-man'', ``unworkable'' belong to the same mor- Carroll, J. B., Davies, P. & Richman, B. (1971). The American
phological family. In a recent article Bertram, Baayen, & Schreuder Heritage Word Frequency Book. Boston: Houghton Mifflin.
(2000a) assess the effects of morphological family size on lexical Frauenfelder, U. & Schreuder, R. (1992). Constraining psycholin-
processing. guistic models of morphological processing and representation:
3 The role of productivity. In G. Booij and J. van Marle (Eds.),
The database mentioned here has compiled words of articles of
the Turun Sanomat, the second largest newspaper in Finland. The Yearbook of morphology 1991. (pp. 165±183). Dordrecht:
compilation stretches from 1.4.1994 to 30.6.1996. There are 22.7 Kluwer.
million word forms divided over 1, 483, 912 distinct word types. Freyd, P. & Baron, J. (1982). Individual differences in acquisition of
Nouns are clearly the largest grammatical class in this database; the derivational morphology. Journal of Verbal Learning and Verbal
two other major classes, verbs and adjectives, account for 118, 521 Behavior, 21, 282±295.
and 219, 984. Graves, M. (1986). Vocabulary learning and instruction. Review of
4
This percentage is actually even lower, for the low-productive Research in Education, 13, 49± 89.
derived word types are still included in the count of 26, 355, since Karlsson, F. & Koskenniemi, K. (1985). A process model of
they are not recognized by the automatic morphological parser morphology and lexicon. Folia Linguistica, 29, 207 ±231.
employed in the tagging of our lexical database. Laine, M., Niemi, J., KoivuselkaÈ-Sallinen, P. & HyoÈnaÈ, J. (1995).
5 Morphological processing of polymorphemic nouns in a highly
In order to assess the degree of productivity for certain
derivational affixes of Finnish, a production experiment with 37 inflecting language. Cognitive Neuropsychology, 12, 457 ±502.
adult participants was conducted in which the participants had to Laine, M. & Virtanen, P. (1999). WordMill Lexical Search Program.
create as many words as possible with a given suffix in a limited Centre for Cognitive Neuroscience, University of Turku.
amount of time. For the high-productive suffixes -jA, -Us, and Laudanna, A. & Burani, C. (1995). Distributional properties of
-ntA, the growth rate (see Baayen, 1994) was 0.322, 0.491, and derivational affixes: Implications for processing. In L. B.
0.289, respectively. For the low-productive suffixes -mO, -stO, Feldman (Ed.), Morphological Aspects of Language Processing.
and -lA, the growth rate was 0.275, 0.201, and 0.113, respectively. (pp. 345±364). Hillsdale, NJ: Erlbaum.
6 Lewis, D. J. & Windsor, J. (1996). Children's analysis of
A two-tailed t-test for independent samples revealed that the
root frequency of the two derivation conditions was matched indeed derivational suffix meanings. Journal of Speech and Hearing
at both frequency ranges (low-frequency range t2(28) < 1; high- Research, 39, 209 ±216.
frequency range t2(28) 1.05, p2 < 0.30), even though the absolute Nagy, W. E. & Anderson, R. C. (1984). How many words are there
difference in the latter range seems quite large (181 vs. 314). This in printed school English. Reading Research Quarterly, 19, 304±
difference, however, is caused by one outlier in the high-productive 329.
derivation condition with a root frequency of 1638. Without this Nagy, W. E. & Herman, P. A. (1987). Breadth and depth of
item the average root frequency for high-productive derivations vocabulary knowledge: Implications for acquisition and instruc-
would have been 219. tion. In M. A. McKeown and M. E. Curtis (Eds.), The nature of
vocabulary acquisition. (pp. 19± 35). Hillsdale, NJ: Erlbaum.
Nykysuomen sanakirja [Standard dictionary on the Finnish
language] (1978). Porvoo: WSOY.
REFERENCES Schreuder, R. & Baayen, R. H. (1995). Modelling morphological
Anglin, J. M. (1993). Vocabulary development: a morphological processing. In L. B. Feldman (Ed.), Morphological Aspects of
analysis. Monographs of the Society for Research in Child Language Processing. (pp. 131 ±154). Hillsdale, NJ: Erlbaum.
Development. (Serial No. 238). Chicago: University of Chicago Scott, J. A. & Nagy, W. E. (1997). Understanding the definitions of
Press. unfamiliar verbs. Reading Research Quarterly, 32, 184 ±200.
Baayen, R. H. (1994). Productivity in language production. Shu, H. & Anderson, R. C. (1997). Role of radical awareness in the
Language and Cognitive Processes, 9, 447±469. character and word acquisition of Chinese children. Reading
Baayen, R. H., Dijkstra, T. & Schreuder, R. (1997). Singulars and Research Quarterly, 32, 78±89.
plurals in Dutch: Evidence for a parallel dual route model. Shu, H., Anderson, R. C. & Zhang, H. (1995). Incidental learning of
Journal of Memory and Language, 36, 94 ±117. word meanings. A Chinese and American cross-cultural study.
Bertram, R., Baayen, R. H. & Schreuder, R. (2000a). Effects of Reading Research Quarterly, 30. 76±96.
family size for complex words. Journal of Memory and Stemberger, J. P. & MacWhinney, B. (1986). Frequency and the
Language, 42, 390±405. lexical storage of regularly inflected forms. Memory and
Bertram, R., Laine, M. & Karvinen, K. (1999). The interplay of Cognition, 14, 17 ±26.
Word Formation Type, Affixal Homonymy, and Productivity in Tyler, A. & Nagy, W. E. (1989). The acquisition of English
lexical processing: evidence from a morphologically rich derivational morphology. Journal of Memory and Language, 28,
language. Journal of Psycholinguistic Research, 28, 213± 226. 649±667.
Bertram, R., Schreuder, R. & Baayen, R. H. (2000b). The balance of Van Daalen-Kapteijns, M. M. & Elshout-Mohr, M. (1981). The
acquisition of word meanings as a cognitive learning process. morphological generalization. Reading Research Quarterly, 22,
Journal of Verbal Learning and Verbal Behavior, 20, 386 ±399. 66 ±81.
Wysocki, K. & Jenkins, J. (1987). Deriving word meanings through
Received 4 March 1999, accepted 8 October 1999