Language Learning - 2024 - Köylü - Longitudinal Development of Holistic Formulaicity Formulaic Sequences and Lexical
EMPIRICAL STUDY
This is an open access article under the terms of the Creative Commons Attribution-NonCom-
mercial-NoDerivs License, which permits use and distribution in any medium, provided the orig-
inal work is properly cited, the use is non-commercial and no modifications or adaptations are
made.
Introduction
A study-abroad (SA) context may provide high amounts of repeated, contex-
tualized, and meaningful second language (L2) input informing learners about
the social and cultural aspects of language use and nativelike idiomaticity in
the target language (TL) environment (Siyanova & Schmitt, 2008, p. 447).
The current study thus explores L2 development in terms of measures such as
holistic formulaicity, use of formulaic sequences, and lexical measures. Holis-
tic formulaicity is defined as a combination of intensifiers, fillers, multiword
sequences, collocations, idiomatic phrases, verb–argument constructions, and
pragmatic and discourse features: all that is typically used by a first language
(L1) speaker. Formulaic sequences are defined as multiword sequences of
the kinds used most frequently by L1 speakers. Lexical measures are defined
in terms of diversity, sophistication, density, and relative frequency. In this
study, we assess holistic formulaicity through human judgment, and formulaic
sequences with automated analyzers, to explore sojourner diaries for L2 de-
velopment, complementing this with the examination of lexical development.
In terms of theoretical framework, we draw on a dynamic usage-based
(DUB) perspective, a combination of usage-based linguistics, which holds
that language is a large array of conventionalized utterances that are acquired
through exposure (Langacker, 2008; Schmid, 2020), and complex dynamic sys-
tems theory (CDST), a theory of change (van Geert, 1991). A CDST view
holds that even if individuals receive the same amount of instruction in and/or
exposure to the L2, they follow their own unique trajectories, which cannot be
generalized to a population (Lowie & Verspoor, 2019). Therefore, to study the
actual process of development, studies within a CDST framework analyze sin-
gle cases (sometimes in small groups of participants) in longitudinal designs
through the inspection of dense data (e.g., biweekly over 2 years; Fogal, 2022).
Background Literature
The Study-Abroad Context
SA programs are popular due to the common belief that sojourners, regardless
of their initial level of proficiency, will improve their L2 during SA: a belief
probably encouraged as a marketing strategy and in need of further investiga-
tion (Güvendir et al., 2021). However, research has shown that a threshold level
of competence (mostly a pre-intermediate to lower intermediate level) is key
to development during SA (Collentine, 2009; DeKeyser, 2007; Pérez-Vidal,
2014; Köylü, 2023). In contrast, advanced learners have probably reached a
plateau and stabilized their interlanguage (Flynn & O’Neil, 1988; Han, 2004;
Osborne, 2007), so that not much change in their L2 might be expected with
further instruction or sojourn experience. In this respect, DeKeyser (2007,
2014) repeatedly argues that advanced learners need more time for progress
to become apparent. Still, an immersion context should provide a wealth of
opportunities for learners to develop their L2 even at advanced levels
(Pérez-Vidal, 2014), although which aspects are positively influenced is still
disputed (Borràs & Llanes, 2021; Pérez-Vidal & Llanes, 2021).
Research into the effects of SA periods spent in the TL country has grown
exponentially over the past three decades. Cross-sectional and longitudinal
studies such as Languages and Social Networks Abroad (LANGSNAP;
Mitchell et al., 2017) and Study Abroad and Language Acquisition (SALA;
Pérez-Vidal, 2014) have also added to our knowledge about learners’ linguistic
development while abroad. Both these research programs set out to investigate
context effects, in the wake of Collentine and Freed’s seminal 2004 publication
on the topic, through longitudinal, multiple-measure, and mixed-methods
studies with a large sample size representing the Erasmus student population
learning English, French, and Spanish (Erasmus being the European Union’s
student exchange program). Such a rich body of data has allowed researchers
to conduct a series of meta-analyses (Güvendir et al., 2024; Tseng et al., 2024;
Xu, 2019; Yang, 2016) as well as a recent all-round critical appraisal of the
methods used in the examination of the effects of SA periods (Pérez-Vidal
& Sanz, 2023). With regard to nonlinguistic development, the field has
encompassed in-depth analyses of newly identified individual features such as
identity, interculturality, personality, emotions, or social networks (Sanz &
Morales-Front, 2018; Mitchell & Tyne, 2022).
Such an array of studies has revealed that overall learners develop signifi-
cantly more through SA, either short or long, than through immersion at home.
Moreover, the SA context is shaped by the type of program offered to students,
which in turn conditions the opportunities for contact with local speakers and
the amount of input of which they can potentially avail themselves.
The above-mentioned meta-analyses reveal that the program features associ-
ated with such progress include taking language courses and having signed
some kind of language pledge or commitment to use the TL while abroad.
This is precisely the case of the learners in the current study, who participated
in a 4-month-long Erasmus mobility scheme in an Anglophone country.
Participation in the Erasmus program typically involves a student traveling on
their own, finding their own accommodation, taking pre-established courses
at the host university with the local students, integrating into the local com-
munity, and even trying to find some work. It has been argued that these are
optimal conditions for progress to be made (Pérez-Vidal, 2014). It is under
such circumstances that our participants were writing the diaries analyzed
hereafter.
Turning to linguistic development, the accumulated evidence points to
significant progress in listening comprehension and oral fluency (see Borràs &
Llanes, 2021, for an overview). Significant gains are also obtained for lexical
complexity, that is, density, diversity, and sophistication. Pragmatics, as mea-
sured through speech act reception and production, and humor and identity
construction, also shows significant progress (Pérez-Vidal & Shively, 2019;
Taguchi, 2018). In contrast, mixed findings are often obtained for morphology,
syntactic complexity, and lexical accuracy (Köylü & Tracy-Ventura, 2022).
With regard to the focus of the current study, Zaytseva et al. (2021) explored
vocabulary growth in terms of nativelike selections, following Foster (2009),
by comparing oral and written SALA data. The authors found that gains
were larger for oral than for written lexical complexity. In writing, learners
significantly progressed in their use of more nativelike general statements
(e.g., using pronouns or referents rather than naming a specific person or thing
as in Anyone who knows coding can help) and adverbial intensifiers, although
they still had a restricted lexical repertoire. In their oral production, they
used fewer cognates, their collocational accuracy grew, and they employed
larger numbers of lexicalized fillers (I mean, I’d say) and targetlike adverbs
(really, basically, actually). It is clear that learners’ language may change over
time in a short-term sojourn context, perhaps not in terms of all traditional
complexity, accuracy, and fluency measures, but more in terms of some lexical
measures and especially in nativelike (or conventionalized) ways of saying
things, which may include fillers, intensifiers, and collocations. Therefore, it
is of interest to study L2 development in the SA context longitudinally from a
DUB perspective in order to see how nativelike language emerges.
Holistic Formulaicity
Pawley and Syder (1983) pointed out that a native speaker has the ability to
convey meaning by means of expressions that are not only grammatical, but
also natural and idiomatic when considered as options from among a range of
grammatically accurate paraphrases. Similarly, Langacker (2008), the founder
of usage-based linguistics, pointed out that nativelike language consists of con-
ventionalized expressions that go beyond the use of idioms, collocations, for-
mulaic sequences, and lexical items. Langacker’s concept of “particular ways
of phrasing certain notions out of all the ways they could in principle be ex-
pressed in accordance with lexicon and grammar of the language” (2008, p. 84)
is clearly linked to concepts such as “preferred ways of saying things” (Sinclair,
1991) and “routines” in nativelike selections (Foster, 2009), and this notion of
formulaicity is quite broad.
In addition to collocational accuracy, this construct may include native-
like verb–argument constructions or general statements; the use of adverbial
intensifiers; the use of lexicalized fillers such as I mean and I’d say; and the
targetlike use of adverbs such as really, basically, and actually (Zaytseva et al.,
2021). Also, the preceding and succeeding textual context, morphosyntactic
choices (e.g., choice of tense and voice), and pragmatic features add formu-
laicity (Smiskova-Gustafsson et al., 2012). This construct may also include
other discourse features, such as repetition, that are not necessarily quantifi-
able but will make a L2 writer sound more or less like a L1 writer. Finally and
Formulaic Sequences
One very prominent and ubiquitous aspect of holistic formulaicity is formu-
laic sequences, which can be defined in a range of ways, depending on one’s
research objectives and overall perspective on language (Wray, 2002). Vari-
ous terms are used to refer to formulaic sequences in the literature, such as
fixed expressions, lexicalized phrases, prefabricated routines, multiword
constructions (Siyanova-Chanturia & Pellicer-Sánchez, 2018, p. 2), memorized
phrases, chunks, collocations, or formulas/formulae (Foster, 2009, p. 91).
The use of formulaic sequences can be researched through a corpus-driven
approach (Biber, 2000), based on measures of collocational strength, such as
ΔP (Ellis, 2006), or predictability (familiarity and frequency), or through the
phraseological approach (Nesselhauf, 2004; Paquot, 2019), which differentiates
between different multiword unit types in terms of linguistic criteria, such as
their noncompositionality (idiomaticity) and their fixedness (e.g., tie the knot,
tie-dye) or use in free combinations representing no fixedness (e.g., tie some-
thing to something). Formulaic sequences can be extracted from a probabilistic
network of constructions based on their fixedness and/or frequency of occur-
rence in reference corpora (Colson, 2017) by using automated analyzers, such
as CollGram (Bestgen & Granger, 2014) or IdiomSearch (Colson, 2017).
Another way to detect longer sequences of conventionalized ways of say-
ing things in L2 learners’ output is by means of a more qualitative approach.
This approach starts with the meaning and then explores how different L2
learners express such a meaning (Gustafsson, 2019). In their study, Smiskova-
Gustafsson et al. (2012) used a three-step analysis based on a triangulation
Verspoor, 2017; Macqueen & Knoch, 2020; Verspoor et al., 2012). However,
although the use of formulaic sequences has been studied longitudinally in the
early stages of L2 development (Gustafsson & Verspoor, 2017), it has received
little attention at advanced stages, the focus of the present study. We expected that the SA
context, which typically provides high amounts of repeated, contextualized,
and meaningful L2 input, would have a positive effect on the sojourners’
development of formulaic sequences (Siyanova & Schmitt, 2008, p. 447),
which can be captured by automated tools.
Lexical Complexity
Previous SA studies have shown significant gains for lexical complexity, that
is, diversity, sophistication, and density (see the reviews by Borràs & Llanes,
2021, and Llanes, 2011). L2 lexical complexity (Bulté & Housen, 2012) is a
complex dynamic phenomenon consisting of several subdimensions, and these
should be taken into consideration in a multidimensional analysis framework.
Despite numerous available indices, the majority of L2 research has em-
ployed a few measures to determine lexical diversity/variation, sophistication,
and density (McCarthy & Jarvis, 2010). Traditionally, researchers of L2
acquisition employed generic indices to capture diversity and variation (e.g.,
type/token ratio [TTR], root-TTR or Guiraud’s index, VocD), sophistication
(e.g., average word length, the number of words from the Academic Word List
[AWL; Coxhead, 2000], percentage of words from different word frequency bands),
and density (content word ratio). Given that the TTR could be problematic
depending on text length (Zenker & Kyle, 2021), our study addresses this
problem by employing an index confirmed to be robust against text-length
effects (Zenker & Kyle, 2021), namely the measure of textual lexical diversity
(MTLD; McCarthy & Jarvis, 2010). Along with diversity and density, lexical
sophistication is frequently measured as an index to capture the depth
of learners’ word knowledge, measured mostly through special word lists
constructed as a result of corpus-driven frequency counts (e.g., the AWL or
the General Service List; Kyle & Crossley, 2015). Most lexical complexity
measures operate on frequency counts. However, it has been suggested that a
multidimensional framework should also include a measure of dispersion, that is,
of how widely a word or word family is used (Kyle & Crossley, 2015). The data set
analyzed in this study (further explained in the Method section) necessitates
employing measures that cover lexical sophistication in both academic and
nonacademic contexts (the latter involving daily language use and informal
genres). Thus, we employ measures that capture all these dimensions of L2
lexical complexity within a multidimensional analytical framework.
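To make the text-length problem of the TTR concrete, the following minimal Python sketch computes the TTR and Guiraud's index (root-TTR) for a toy sentence and for the same sentence repeated three times. The tokenizer, the function names, and the sample sentence are illustrative assumptions; the sketch does not reproduce the tokenization of TAALED or the MTLD algorithm used in the study.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and extract word tokens (a deliberate simplification)."""
    return re.findall(r"[a-z']+", text.lower())


def ttr(tokens):
    """Type/token ratio: number of distinct word forms divided by number of tokens."""
    if not tokens:
        raise ValueError("cannot compute TTR for an empty text")
    return len(set(tokens)) / len(tokens)


def guiraud(tokens):
    """Guiraud's index (root-TTR): types divided by the square root of tokens."""
    if not tokens:
        raise ValueError("cannot compute Guiraud's index for an empty text")
    return len(set(tokens)) / math.sqrt(len(tokens))


# Hypothetical toy text, not taken from the diary corpus.
sample = "I have met wonderful people here and I have made good friends here"
short_tokens = tokenize(sample)
# Repeating the text adds tokens but no new types, so the TTR must fall.
long_tokens = tokenize(" ".join([sample] * 3))

print(f"short text: TTR = {ttr(short_tokens):.3f}, Guiraud = {guiraud(short_tokens):.3f}")
print(f"long text:  TTR = {ttr(long_tokens):.3f}, Guiraud = {guiraud(long_tokens):.3f}")
```

Because the longer text adds tokens without adding types, both indices drop even though the vocabulary is identical, which is the kind of text-length effect that motivates a length-robust index.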
A number of studies within the CDST or DUB approach have explored lex-
ical development longitudinally by analyzing dense learner performance data
elicited through written or oral tasks (see Fogal, 2022, for an overview). One
of the main findings is that different variables (TTR, sophistication, academic
words) develop at different stages of development (beginner, intermediate, ad-
vanced). For example, Verspoor et al. (2017) traced three advanced Dutch L1
learners of English for over 4 years. Even though each learner had an individ-
ual trajectory, there were commonalities: None showed significant changes for
lexical diversity, but they all made significant gains in the number of academic
words, average word length, and unique content words. The authors concluded
that some of the lexical indices “that discriminate well between L2 English
texts written by L1 Dutch students at lower levels of proficiency do not do so
at the higher level of students with the same L1 background, and vice versa”
(p. 18). Thus, we might expect different developmental trajectories for dif-
ferent dimensions of L2 performance across different stages of development.
These findings were supported by Penris and Verspoor (2017), who traced the
writing development of a L1 Dutch learner of English in two different educa-
tional contexts. For the first 5 years, he was at a teacher training college (31
texts) and developed within a high-intermediate to low-advanced range; then,
after an 8-year gap as an English teacher, he continued a 3-year postgraduate
program in applied linguistics (18 texts) and developed from a low-advanced to
an advanced-academic range. Between the first and second stages, there were
significant differences in average content word length, percentage of academic
words (AWL), percentage of less frequent lexical items, percentage of unique
lexical items, and TTR, but there was no difference in lexical density (LD).
Overall, most studies looking into lexical complexity development at
advanced levels have reported increases for sophistication (AWL or average
word length), but almost no changes have been found for lexical diversity and
density.
Method
This study traces the written texts of 26 Catalan/Spanish bilingual tertiary-level
sojourners in an Anglophone country over one semester (12 to 17 weeks)
on three sets of measures. We used both human judgment and automated tools to
measure holistic formulaicity, formulaic sequences, and lexical complexity
(diversity, density, sophistication).
Participants
The participants in this study were involved in the Study Abroad and Language
Acquisition (SALA) project (Pérez-Vidal, 2014). All SALA participants, in-
cluding the 26 participants in this subgroup, were bilingual speakers of Catalan
and Spanish as their L1, majoring in a language specialization degree at a
Catalan university in Barcelona, with English as their major language of study.
The median age for all SALA participants was 19 at the onset of the study,
within an age range of 17–27 years.
Predeparture Proficiency
The students participating in the study were obliged to take an institutional pro-
ficiency test to meet the home university entry requirements (before starting
their studies at the home university). This test involves a reading comprehen-
sion task and a written essay of about 200 words in response to an audiovisual
prompt. The receptive and productive tasks are relevant and/or complementary
to each other in terms of topic. The test finishes with a translation compo-
nent (from Spanish or Catalan to English). These tests are holistically assessed
by specially trained in-house instructors. Prior to departing for their semester
abroad at the beginning of their 2nd year, all SALA participants were certi-
fied to be at an upper-intermediate to advanced level—equivalent to higher-B2
to C1 level of the Common European Framework of Reference (CEFR)—of
initial proficiency in English, as a major requirement for their degrees and
the host universities involved in their Erasmus mobility scheme. The univer-
sity has mutual agreements with universities around the globe, which require
different minimum language competencies to be met at predeparture, as the
minimum proficiency to embark on an exchange at an Anglophone university
varies between the higher-B2 and C1 levels of the CEFR. See Appendix S1 in
the Supporting Information online for the predeparture proficiency scores of
the 26 participants in the current study.
2014). In total, 383 weekly diary entries were collected (12 to 17 entries per
participant, with a total of 274,041 words; Köylü et al., 2023); see Appendix
S1 in the Supporting Information online for overview data on participants’
diary entries in the corpus. The diaries were written using a word processor
and submitted all together upon return home.
The participants’ assignment was to write a piece every week, with no
preset word limit, reflecting on their overall sojourn experiences in terms of
noteworthy linguistic, cultural, and social interactions, observations, and their
perceived language development during SA. With this prompt in mind, the
participants used an overall academic register; however, the prompt-based
task meant that they described common daily observations and occurrences
rather than research-oriented topics, and therefore the number of academic
words used was not expected to increase. Thus, even though the entries are
not quintessential free-style diary pieces, the typical storytelling component
of diaries through personal anecdotes was evident in each entry. The following
three excerpts show the type of written production in the diaries (all errors are
original):
1. This was a special week because the enrolment took place in it. The reg-
istration session was arranged for all incoming Erasmus students, which
meant that I would have the opportunity to meet students like me from
all over the European Union (P13W02).
2. Today I’ve been told about “the Beefeaters”. It is the name of the guards
standing on the Tower of London who, by order of the queen, have to feed
all black birds of England so as to not let them disappear because it is of
public knowledge the day there are no black birds left the monarchy will
end. Pitifully, I saw no Beefeaters while being in England (P07W03).
3. Of course, the bartender took the glass right away, apologized, filled in
the desired amount of Sprite, and apologized again for the embarrassing
incidence. At that point a real duel of ‘sorries’ broke out. The customer
was so sorry for not being pleased with her drink at the first place and
the girl behind the bar was so sorry for her unforgivable mistake. Without
exaggerating, I could count more than 30 ‘sorries’ during the two minutes
‘combat,’ and I was quite amused by excessive changing of apologies
(P10W04). (Köylü et al., 2023, p. 5)
Knoch, 2020). For example, Bestgen and Granger (2014) used CollGram to
analyze the phraseological strength of bigrams (two-word relationships) in a
corpus of 171 essays from a group of tertiary English learners, showing that
bigrams of less frequent words (e.g., Korean peninsula) positively correlated
with holistic essay scores assigned by expert raters, whereas those of high
frequency (e.g., of the) did not show a significant correlation. The authors
suggested that human raters probably recognize these less frequent bigrams
as indicators of a higher competence in the TL. Also, the negative correla-
tion between erroneous combinations (e.g., everyone are) and essay scores
indicated that human raters probably attend to grammatical accuracy in their
judgments. As human raters have intuitions about nativelike language use
(Macqueen & Knoch, 2020), and as holistic formulaicity may be difficult to
quantify, we asked experienced English teachers to rate the data set on holistic
formulaicity and assign holistic formulaicity (HOLFOR) scores. The four raters
were tertiary-level instructors in English as a foreign language or English for
academic purposes (near-native speakers of English with more than 20 years
of tertiary-level teaching experience), along with the first author, who has
similar experience in teaching English for academic purposes.
In a similar procedure to the holistic proficiency assessment instructions
set out by Verspoor et al. (2017, pp. 4–5), the raters were first trained by com-
paring a small number of texts in accordance with the holistic formulaicity
definition to establish a ranking from 1 to 5 (1 = least formulaic to 5 = most
formulaic) that everyone could agree upon (see below for the instructions).
This small set of rated texts was then used as benchmarks for rating the
remainder of the texts.
Once the benchmarks were established, the texts from all students were
provided in randomized order. Overall, each instructor rated around 90 texts,
but the first author rated all 383. Each text was assigned a score by three in-
dependent raters. The interrater reliability for holistic formulaicity was high,
with a Cronbach’s alpha of .81.
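As an illustration of how such a reliability coefficient is obtained, the minimal Python sketch below computes Cronbach's alpha by treating each rater as an "item" and each text as a case. The ratings are hypothetical toy values on the 1 to 5 scale, not the study's data, and the function name is an assumption made for this example.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of rows (one row per text, one column per rater).

    alpha = k / (k - 1) * (1 - sum of per-rater variances / variance of row totals)
    """
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two raters and two rated texts")
    rater_columns = list(zip(*ratings))  # transpose: one tuple per rater
    sum_rater_var = sum(variance(col) for col in rater_columns)
    total_var = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - sum_rater_var / total_var)


# Hypothetical scores from three raters for five texts (1 = least formulaic,
# 5 = most formulaic); these are NOT scores from the study.
toy_ratings = [
    [3, 3, 4],
    [4, 4, 4],
    [2, 3, 2],
    [5, 4, 5],
    [3, 2, 3],
]
alpha = cronbach_alpha(toy_ratings)
print(f"Cronbach's alpha = {alpha:.2f}")
```

The coefficient approaches 1 as the raters' scores covary more strongly across texts, so a higher alpha means the raters ordered the texts more consistently.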
one per word (PW ratio), which considers the number of phraseological units
per total number of words, and another per text (PT ratio), which considers the
total number of words in the text included in the phraseological units. In other
words, a PT ratio of 0.31 as in the Figure 1 example means that 31% of all the
word tokens in the text are included in the phraseological units, whereas 69%
of them occur outside those units. Table 1 presents the formulaicity indices
employed in the analyses.
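The arithmetic behind the two ratios can be sketched in a few lines of Python. The example assumes that the phraseological units have already been identified and do not overlap; the units below are hand-marked for illustration and are not IdiomSearch output, and the function names are hypothetical.

```python
def pw_ratio(units, tokens):
    """PW ratio: number of phraseological units per total number of words."""
    return len(units) / len(tokens)


def pt_ratio(units, tokens):
    """PT ratio: share of word tokens that fall inside phraseological units.

    Assumes the units do not overlap, so their lengths can simply be summed.
    """
    return sum(len(unit) for unit in units) / len(tokens)


# Hypothetical 12-token text with two hand-marked units (illustration only).
tokens = "i started from scratch and tried to make the most of it".split()
units = [
    ["from", "scratch"],
    ["make", "the", "most", "of", "it"],
]

pw = pw_ratio(units, tokens)
pt = pt_ratio(units, tokens)
print(f"PW = {pw:.2f}, PT = {pt:.2f}")
```

In this toy case, 7 of the 12 tokens belong to a unit, so the PT ratio is about 0.58, which is read in the same way as the 0.31 example in the text: the proportion of word tokens located inside phraseological units.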
Lexical Sophistication (TAALES; Kyle & Crossley, 2015), and the Tool for the
Automatic Analysis of Lexical Diversity (TAALED; Kyle, Crossley, & Jarvis,
2021), to capture lexical density, sophistication, and diversity. Table 2 describes
the indices used in the analyses.
Data Analyses
We aimed to identify a shared L2 developmental tendency across the whole data
set over the total time spent abroad. To trace development while accounting for
individual trajectories, we fitted a series of generalized additive mixed models
(GAMMs) relating the formulaicity, formulaic sequence, and lexical complexity
measures to time, using the mgcv package (Wood, 2006) in R (R Core Team, 2022).
GAMMs allow for a nonlinear function of time when analyzing nested
dependencies in dense data, for instance, time intervals within learning
trajectories (Kliesch & Pfenninger, 2021, p. 248). GAMMs are preferred on account
of potential intercorrelations within clusters in our data set and also in an at-
tempt to provide a more accurate representation of the data structure. Unlike
traditional linear models, which commonly assume both linearity and normal-
ity of residuals, GAMMs offer increased flexibility by accommodating nonlin-
ear relationships in the data. Nonetheless, unless otherwise specified, GAMMs
generally maintain the assumption of normally distributed residuals, which is
a consideration still upheld in our modeling approach. We plotted the results
from the GAMM analyses using the itsadug package (van Rij et al., 2020).
Examining individual raw data can pose challenges in discerning any over-
all improvement or change. As with conventional statistical methods, smooth-
ing techniques are crucial in GAMMs for revealing underlying trends by fitting
smooth, flexible forms to the data. In GAMMs, these smooths help in model-
ing complex nonlinear relationships without making parametric assumptions
about the form of these relationships, enhancing the model’s ability to describe
the data accurately. Consequently, smoothers are highly effective in conveying
the direction of change and offering an overview of the general developmen-
tal pattern. The effective degrees of freedom (EDF) in the results indicate the
complexity of the smooth term, reflecting the nonlinearity of the trajectory. In
contrast, the F value and p value from the output of the mgcv package (Wood,
2006) assess the overall statistical significance of the variable, independent
of whether the relationship with the outcome variable is linear or nonlinear.
An EDF value larger than 1.000 shows a nonlinear trajectory, while 1.000 in-
dicates a linear trend. The dependent variables were the linguistic variables
(e.g., scores on different measures), while time (data collection points) was
| Dimension | Index | Label | Description | Tool |
| --- | --- | --- | --- | --- |
| Density | LD | Lexical density | Number of content words per total number of words | Vocabprofile |
| Sophistication | AWL | Academic Word List | Number of words from list of 570 words frequently used in an academic context (Coxhead, 2000) | Vocabprofile |
| Sophistication | BNC_Written_Freq_AWᵃ | BNC written word frequency average | Average frequency counts from written word frequencies derived from the British National Corpus (BNC) | TAALES |
| Sophistication | BNC_Spoken_Freq_AWᵃ | BNC spoken word frequency average | Average frequency counts from spoken word frequencies derived from the British National Corpus (BNC) | TAALES |
| Sophistication | COCA_Spoken_Freq_AW | COCA spoken word frequency average | Average frequency counts from spoken word frequencies derived from the Corpus of Contemporary American English (COCA) | TAALES |
| Diversity | MTLDᵇ | Measure of textual lexical diversity | Average number of tokens in a text required to reach a given type/token value | TAALED |

Note. TAALES = Tool for the Automatic Analysis of Lexical Sophistication; TAALED = Tool for the Automatic Analysis of Lexical Diversity.
ᵃ See Kyle and Crossley (2015) for a full description of BNC word frequency average scores.
ᵇ See McCarthy and Jarvis (2010) for a full description of MTLD.
14679922, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/lang.12680 by Morocco Hinari NPL, Wiley Online Library on [24/10/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
Köylü et al. Sojourners’ L2 Formulaicity & Lexical Development
used as the fixed effect and participant as a random effect with random
intercepts, allowing a nonlinear function of time. The following model
specification was used in the GAMM analyses via the mgcv package (Wood, 2006; for
the complete GAMM syntax, see Appendix S6 in the Supporting Information
online):
Results
Here we present the results relating to each of the three research questions in
turn.
Parametric Coefficients | Estimate | Std. Error | t | Pr(>|t|) | Edf | Ref.df | F | p
Figure 3 Holistic formulaicity (HOLFOR) scores for each sojourner each week.
Week 1
[…] in the United Kingdom the sport is a very important part in the life
of the student. I personally think that is a very healthy way of conceive
the university life. […] And coming back to what I was saying first, the
sport will let me know some specific vocabulary related to the positions,
the movements and the tactics inside the field. Concerning to the Scottish
accent, and even the English accent, I have some difficulties to
understand some questions sometimes. Maybe this is due to the fact that I
am more used to the American accent. I was told that the English from
the United Kingdom was more understandable and clear, but I can assure
that this is not true. It all depends on what are you used to. (P05W01)
Week 12
The most important thing is to be yourself, to be happy with what you
have and make the most of it, and always treat the others as you would
like to be treated. I came here determined to give “me” a chance, and I
started from scratch. I have met wonderful people here, all of them very
different, from different countries, and all very interesting in their own
way. In my last day here I received presents from all of them, some
unexpected, and that made me feel loved, lucky for having friends like
them. I am not saying that I don’t have good friends here, because I do,
but I know them since I was little. The merit now is bigger. I feel that
people can really like me for what I am now. (P05W12)
The weekly entry from which the first excerpt was taken received a HOLFOR
score of 3.00 on a scale of 5.00, whereas the one from Week 12 received 4.25.
As Figure 3 shows, most sojourners were variable in their HOLFOR scores
over time, with most showing an upward trend (e.g., Participant 10) and some
a downward trend (e.g., Participant 3). More importantly, the trajectories are
rather linear for most, with the exception of Participants 2, 8, 11, 14, 16, 21,
and 23. What is striking is that those following a nonlinear route also improve
their holistic formulaicity at the end of their sojourn.
sequences in our type of L2 English learner data, given that the register and
style of the sojourners’ diary writing approximates spoken, informal style,
rather than written and academic style. Taken together, these points may ex-
plain why the PW and PT measures determined by the IdiomSearch tool did
not show further development in our participants.
The main findings for the third research question are that the sojourners’
lexical scores did not develop significantly during the semester abroad in terms
of density, sophistication, or diversity. These findings could be explained in
different ways. As the sojourners learn more conventionalized ways of saying
things, they may use more colloquial, less sophisticated words, and use the
same words more frequently. For example, phrasal verbs, which are one of the
hallmarks of nativelike use of English, consist of high-frequency but rather
low-sophistication words (e.g., the sojourners might use the phrasal verb put
up with instead of a more sophisticated word like tolerate or endure). Such
expressions will contribute to the overall impression of high proficiency and
formulaicity, so the holistic proficiency scores will increase, even though lexi-
cal measures will remain the same or decrease. This finding is in line with the
current literature in terms of lexical diversity and density (Penris & Verspoor,
2017). However, our results also did not confirm any development in
sophistication (e.g., academic words or the use of unique words), in contrast to
what the literature suggests (Penris & Verspoor, 2017; Verspoor et al., 2017).
This might be related to the idea that our participants, as advanced learners
of English, have already reached a plateau by stabilizing their interlanguage
(Flynn & O’Neil, 1988; Han, 2004; Osborne, 2007). Another explanation
might involve task effects. Because diary writing involves reporting on daily
life experiences, it might have led our participants to repeatedly use words from
higher frequency bands throughout their stay.
Additionally, complex dynamic systems studies in the literature have con-
firmed different growth trajectories for different proficiency levels (e.g., from
pre-intermediate to intermediate). We believe that upper-intermediate to ad-
vanced proficiency has a different developmental trajectory from that seen for
lower levels, as our participants probably ended up with gains in different di-
mensions of their written formulaicity that we could not individually measure.
These results, thus, support the argument that different linguistic features will
become marked or develop at different proficiency levels (Penris & Verspoor,
2017; Verspoor et al., 2012). Our participants might have wanted to sound
more nativelike and authentic through frequent use of conventionalized
constructions, mostly involving lexical items from higher frequency bands
(Siyanova-Chanturia & Pellicer-Sánchez, 2018). Thus, there seems to be a logical link
Conclusion
Taking a DUB perspective, this study investigated the nexus between time and
development in holistic formulaicity, formulaic sequences, and lexical com-
plexity in a group of Catalan/Spanish tertiary-level sojourners who had upper-
intermediate to advanced proficiency in English. We aimed to determine a
nonlinear trend for L2 development after a semester abroad through a series
of GAMMs, while expecting large amounts of individual variation as we ex-
plored the predictive power of time and its nonlinear function.
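As a reading aid, a generic GAMM of the kind referred to here can be sketched as follows. The symbols are assumptions introduced for illustration only; the models actually fitted are specified in Appendix S6.

```latex
y_{ij} = \beta_0 + f(t_{ij}) + b_i + f_i(t_{ij}) + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

In this sketch, y_{ij} is the score of participant i in week t_{ij}, f is a smooth function of time shared by the group, b_i is a participant-level random intercept, and f_i is a participant-specific smooth deviation that allows for individual nonlinear trajectories. A linear main effect of time corresponds to the special case f(t) = \beta_1 t.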
Our analyses confirmed a significant linear main effect of time only on
holistic formulaicity (which subsumes lexical sequences, verb–argument con-
structions, intensifiers, fillers, pragmatic markers, discourse markers, and for-
mulaic sequences), while nonlinear individual trajectories were also present
in the data set. There was no significant development in the use of formu-
laic sequences; this may have been due to ceiling effects in the sojourners’
interlanguage, but we also suspect that the automated measures used may fail
to detect to what degree syntactic or lexical constructions can be regarded as
conventionalized sequences in the TL, most likely because the frequency of
occurrence of set sequences is not the only indicator of formulaicity. We can
conclude by saying that, after a semester abroad, our upper-intermediate to
advanced sojourners increased their use of holistic formulaicity according to
human raters; however, they did not show development either within the typical
dimensions (e.g., lexical diversity) or to the extent that the automated tools
employed for the analyses would capture it, nor with respect to the more
conventional lexical dimensions.
This study significantly contributes to our understanding of L2 develop-
ment during SA sojourns for those of high predeparture proficiency levels. Our
results support the idea that learners of different proficiencies may display var-
ied changes in competencies and capabilities (Penris & Verspoor, 2017), while
the nature of trade-offs between holistic formulaicity, formulaic sequences, and
lexical complexity may alter as sojourners start from an already competent
threshold level. Yet, it should be kept in mind that we did not directly compare
any developmental patterns across different levels of proficiency. With our
findings, those of Verspoor et al. (2012) can now be extended from lower
proficiency levels to upper-intermediate and/or advanced levels for additional
dimensions of L2 performance: holistic formulaicity as rated by humans and
formulaic sequences as assessed by automated analyzers, along with lexical
density, sophistication, and diversity. Sounding more authentic and nativelike
is probably the key objective of sojourners at such high proficiency levels.
They might be trying to imitate L1 users more than lower proficiency learners
(Pérez-Vidal & Barquin, 2014).
This study is not without limitations. Firstly, we analyzed a data set
composed of weekly diary entries. We did not assign our participants a fixed,
repeated task: although they were given clear instructions and prompts
to complete the weekly written task, their entries varied in length from
400 words to 2,000 words. However, there was a thematic similarity across
different entries. We also employed statistical methods that are robust against
imbalanced data sets, accommodating disparities such as differing total
numbers of entries per participant. Additionally, we calculated indices that
take account of variation in word counts per entry, ensuring that our analyses
remained sensitive to these differences without assuming uniform distribution
of the outcome variable across different weeks or participants.
In sum, we have sought to contribute to the field with one of the
first examples of DUB L2 development research investigating the case of
advanced-level sojourners spending their SA exchange in a TL country.
Therefore, we strongly advocate more studies with lower proficiency levels
to explore how a DUB perspective can contribute to our understanding of
L2 development during SA. To our knowledge, this is also the first study
involving holistic formulaicity assessment through human judgment and
formulaic sequence assessment with automated analyzers to explore sojourner
diaries for L2 development. Our findings suggest that holistic formulaicity
Notes
1 The MTLD is calculated as “the mean length of word strings that maintain a
criterion level of lexical variation” (McCarthy & Jarvis, 2010, p. 381).
2 The tool was freely accessible from a web application
(https://idiomsearch.lsti.ucl.ac.be/) until mid-September 2024. Please contact
Jean-Pierre Colson (https://uclouvain.be/fr/repertoires/jean-pierre.colson) for use.
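The definition quoted in Note 1 can be made concrete with a minimal sketch. It assumes the procedure described by McCarthy and Jarvis (2010): a running type–token ratio (TTR) is tracked, a factor is counted each time the TTR falls below a criterion (0.72 by default), a partial factor is added for the remainder, and the result is averaged over a forward and a backward pass. The function names are hypothetical, tokenization is left to the caller, and the analyzer used in this study may differ in detail.

```python
from typing import List


def _mtld_pass(tokens: List[str], threshold: float) -> float:
    """One directional pass: total tokens divided by the number of factors."""
    factors = 0.0
    types = set()
    count = 0
    for token in tokens:
        count += 1
        types.add(token)
        # A full factor is complete once the running TTR falls below the criterion.
        if len(types) / count < threshold:
            factors += 1.0
            types = set()
            count = 0
    # Leftover tokens contribute a partial factor proportional to how far
    # their TTR has moved from 1.0 toward the criterion.
    if count > 0:
        ttr = len(types) / count
        factors += (1.0 - ttr) / (1.0 - threshold)
    if factors == 0.0:
        # Assumption: the value is treated as undefined when no repetition occurs.
        return float("nan")
    return len(tokens) / factors


def mtld(tokens: List[str], threshold: float = 0.72) -> float:
    """Mean of the forward and backward passes over the token list."""
    if not tokens:
        raise ValueError("MTLD needs at least one token")
    forward = _mtld_pass(tokens, threshold)
    backward = _mtld_pass(tokens[::-1], threshold)
    return (forward + backward) / 2.0


print(mtld("of the people by the people for the people".split()))  # 9.0
```

Because the score is a mean string length rather than a raw ratio, it is less sensitive to text length than a simple TTR, which matters for diary entries that vary widely in word count.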
References
Bestgen, Y., & Granger, S. (2014). Quantifying the development of phraseological
competence in L2 English writing: An automated approach. Journal of Second
Language Writing, 26, 28–41. https://doi.org/10.1016/j.jslw.2014.09.004
Biber, D. (2000). Lexical expressions in speech and writing. In D. Biber, S. Johansson,
G. Leech, S. Conrad, & E. Finegan (Eds.), Longman grammar of spoken and written
English (pp. 987–1036). Longman. https://doi.org/10.1162/089120101300346831
Borràs, J., & Llanes, À. (2021). Re-examining the impact of study abroad on L2
development: A critical overview. The Language Learning Journal, 49(5),
527–540. https://doi.org/10.1080/09571736.2019.1642941
Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In A.
Housen, F. Kuiken, & I. Vedder (Eds.), Dimensions of L2 performance and
proficiency: Complexity, accuracy and fluency in SLA (pp. 21–36). John Benjamins.
https://doi.org/10.1075/lllt.32.02bul
Cobb, T. (2000). Web Vocabprofile [Computer software]. Accessed August, 2022, from
http://www.lextutor.ca/vp
Collentine, J. (2009). Study abroad research: Findings, implications and future
directions. In M. H. Long & C. J. Doughty (Eds.), Handbook of language teaching
(pp. 218–233). Wiley-Blackwell. https://doi.org/10.1002/9781444315783.ch13
Collentine, J., & Freed, B. (2004). Introduction: Learning context and its effects on
language acquisition. Studies in Second Language Acquisition, 26(2), 153–172.
https://doi.org/10.1017/S0272263104262015
Colson, J.-P. (2017). The IdiomSearch experiment: Extracting phraseology from a
probabilistic network of constructions. In R. Mitkov (Ed.), Computational and
Güvendir, E., Acar-Güvendir, M., & Dündar, S. (2021). Study abroad marketing and
L2 self-efficacy beliefs. In M. Howard (Ed.), Study abroad and the second language
learner: Expectations, experiences and development (pp. 49–68). Bloomsbury
Academic. http://doi.org/10.5040/9781350104228.ch-003
Güvendir, E., Borràs, J., & Acar-Güvendir, M. (2024). The effects of study abroad on
L2 vocabulary development: A meta-analysis. Study Abroad Research in Second
Language Acquisition and International Education, 9(1), 26–51.
https://doi.org/10.1075/sar.22014.bor
Han, Z. (2004). Fossilization in adult second language acquisition. Multilingual
Matters. https://doi.org/10.21832/9781853596889
Kliesch, M., & Pfenninger, S. E. (2021). Cognitive and socioaffective predictors of L2
microdevelopment in late adulthood: A longitudinal intervention study. The
Modern Language Journal, 105(1), 237–266. https://doi.org/10.1111/modl.12696
Köylü, Z. (2023). The ERASMUS sojourn: Does the destination country or
pre-departure proficiency impact oral proficiency gains? The Language Learning
Journal, 51(1), 48–60. https://doi.org/10.1080/09571736.2021.1930112
Köylü, Z., Eryılmaz, N., & Pérez-Vidal, C. (2023). A dynamic usage-based analysis of
L2 proficiency: Syntactic and lexical complexity development of sojourners.
Journal of Second Language Writing, 60, Article 101002.
https://doi.org/10.1016/j.jslw.2023.101002
Köylü, Z., & Tracy-Ventura, N. (2022). Learning English in today’s global world: A
comparative study of at home, anglophone, and lingua franca study abroad. Studies
in Second Language Acquisition, 44(5), 1330–1355.
http://doi.org/10.1017/S0272263121000917
Kyle, K., & Crossley, S. A. (2015). Automatically assessing lexical sophistication:
Indices, tools, findings, and application. TESOL Quarterly, 49(4), 757–786.
https://doi.org/10.1002/tesq.194
Kyle, K., Crossley, S. A., & Jarvis, S. (2021). Assessing the validity of lexical
diversity indices using direct judgements. Language Assessment Quarterly, 18(2),
154–170. https://doi.org/10.1080/15434303.2020.1844205
Kyle, K., Crossley, S., & Verspoor, M. (2021). Measuring longitudinal writing
development using indices of syntactic complexity and sophistication. Studies in
Second Language Acquisition, 43(4), 781–812.
https://doi.org/10.1017/S0272263120000546
Langacker, R. W. (2008). Cognitive grammar as a basis for language instruction. In P.
Robinson & N. C. Ellis (Eds.), Handbook of cognitive linguistics and second
language acquisition (pp. 66–88). Routledge.
https://doi.org/10.4324/9780203938560-12
Llanes, À. (2011). The many faces of study abroad: An update on the research on L2
gains emerged during a study abroad experience. International Journal of
Multilingualism, 8(3), 189–215. https://doi.org/10.1080/14790718.2010.550297
Lowie, W. M., & Verspoor, M. H. (2019). Individual differences and the ergodicity
problem. Language Learning, 69(S1), 184–206. https://doi.org/10.1111/lang.12324
McCarthy, P. M., & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study
of sophisticated approaches to lexical diversity assessment. Behavior Research
Methods, 42, 381–392. https://doi.org/10.3758/BRM.42.2.381
Macqueen, S. S., & Knoch, U. (2020). Adaptive imitation: Formulaicity and the words
of others in L2 English academic writing. In G. Fogal & M. H. Verspoor (Eds.),
Complex dynamic systems theory and L2 writing development (pp. 81–108). John
Benjamins Publishing Company. https://doi.org/10.1075/lllt.54.04mac
Michel, M., Murakami, A., Römer, U., & Alexopoulou, D. (2022, August 24–27). Do
formulaic sequences mask proficiency? Considering evidence from a large learner
corpus [Paper presentation]. EuroSLA 31, Fribourg, Switzerland.
Mitchell, R., Tracy-Ventura, N., & McManus, K. (2017). Anglophone students abroad:
Identity, social relationships and language learning. Routledge.
https://doi.org/10.4324/9781315194851
Nesselhauf, N. (2004). What are collocations? In D. J. Allerton, N. Nesselhauf, & P.
Skandera (Eds.), Phraseological units: Basic concepts and their application
(pp. 1–21). Schwabe.
Ortega, L., & Byrnes, H. (2008). Theorizing advancedness, setting up the longitudinal
research agenda. In L. Ortega & H. Byrnes (Eds.), The longitudinal study of
advanced L2 capacities. Routledge.
Osborne, J. (2007). Why do they keep making the same mistakes? Evidence for error
motivation in a learner corpus. In J. Waliński, K. Kredens, & S. Goźdź-Roszkowski
(Eds.), Corpora and ICT in language studies: PALC 2005 (pp. 343–355). Peter
Lang.
Paquot, M. (2019). The phraseological dimension in interlanguage complexity
research. Second Language Research, 35(1), 121–145.
https://doi.org/10.1177/0267658317694221
Pawley, A., & Syder, F. H. (1983). Two puzzles for linguistic theory: Nativelike
selection and nativelike fluency. In J. C. Richards & R. W. Schmidt (Eds.),
Language and communication (pp. 191–225). Longman.
Penris, W., & Verspoor, M. (2017). Academic writing development: A complex,
dynamic process. In S. E. Pfenninger & J. Navracsics (Eds.), Future research
directions for applied linguistics (pp. 215–242). Multilingual Matters.
https://doi.org/10.21832/9781783097135-012
Pérez-Vidal, C. (2014). Language acquisition in study abroad and formal instruction
contexts. John Benjamins. https://doi.org/10.1075/aals.13
Pérez-Vidal, C., & Barquin, E. (2014). Comparing progress in writing after formal
instruction and study abroad. In C. Pérez-Vidal (Ed.), Language acquisition in
study abroad and formal instruction contexts (pp. 217–234). John Benjamins.
https://doi.org/10.1075/aals.13
Supporting Information
Additional Supporting Information may be found in the online version of this
article at the publisher’s website:
Accessible Summary
Appendix S1. Overview of Participant Diaries and Proficiency Profiles.
Appendix S2. Information on the Training of the Human Raters.
Appendix S3. Results of Generalized Additive Mixed Models for Nonsignifi-
cant Models.
Appendix S4. Additional Variables From the Tool for the Automatic Analysis
of Lexical Diversity (TAALED).
Appendix S5. Group Tendencies in the Generalized Additive Mixed Models.
Appendix S6. Syntax of the Generalized Additive Mixed Models.