Teaching language variation using Italian corpora

Isabella Chiari (La Sapienza University of Rome)

Abstract

This paper aims at evaluating reference corpora of spoken and written Italian as tools for
teaching variation to learners of Italian as a second language. Specifically, some of the main
Italian reference corpora will be compared, focusing on their classification of varieties and
their potential as sources of information on register, domains, genre, and text types.

1. Introduction

The aim of this paper is to evaluate reference corpora of spoken and written Italian as tools
for teaching variation to learners of Italian as a second language. Language variation is
traditionally one of the most crucial and complex abilities that learners are expected to
acquire: aspects such as genre, register, text type, domain, topic, activity type, sublanguage
and style are key in developing language awareness. In corpus design, there is a wide variety
of approaches to capturing variation, a variety reflected in the possibilities offered by different
query systems. What kind of information about linguistic varieties is encoded in Italian
reference corpora? And how can learners acquire knowledge about variation from the
available electronic resources, either autonomously or in guided classroom activities?

2. Language variation and corpora

The fruitfulness of corpora in language teaching has been widely stressed in the literature, be
it in a general didactic environment (Sinclair 2004; Gavioli 2005; Scott & Tribble 2006), in
translation studies (Bernardini & Zanettin 2000; Botley et al. 2000; Wang 2000; Zanettin et
al. 2003), in classroom work (Johns 1991a; Partington 1998; Godwin-Jones 2001; Timmis
2003), and in learner corpus exploitation (Granger et al. 2000; Granger et al. 2002; Tono
2002). The advantages for the learner include easy access to authentic texts, autonomous
management of the learning process in a data-driven framework (Johns 1991b, c; Wible et al.
2002), and stimulation of metalinguistic reflection. For the teacher they include the possibility
to obtain real examples of language use, assistance in syllabus and materials design, and other
functions (Kahn 1985; Szendeffy 2005).

The interaction between language teaching using corpora and the need to develop
understanding of language variation has been addressed by a number of scholars. The general
framework proposed by Biber (1988, 1995), being both theoretical and methodological, has
led to considerable discussion of the relationship between language variation and corpora.
Biber’s attention has generally been focused on exploiting corpora for variation studies (Biber
& Finegan 1991; Biber 1992, 1993a, b; Biber et al. 1998), and he attempts to clarify the
terminological jungle that characterises this area by associating the terms genre and register with
“situationally defined text categories (such as fiction, sports, broadcasts, psychology
articles)”, and text type with “linguistically defined text categories” (1993a: 244-5).1 Biber
argues that precedence should be given to external criteria in corpus design, using genre and
register as the main principles in distinguishing subcorpora. But reference corpora of major
languages have extremely dissimilar designs, and exhibit different capabilities for analysing
language varieties. Not only are registers differently defined and represented (qualitatively
and quantitatively), but the possibility to retrieve data about variation varies in different
corpus querying systems.

Once the corpus has been designed and constructed, other questions arise as to the possibility
of investigating different aspects of variation. Since corpora cannot be said to be
representative of a single genre tout court – we can always think of features that are only
marginally present in the corpus because they are rare, because they refer to a level of analysis
which was not foreseen in the corpus design, or simply because they show marginal
tendencies – we must always take into account the limits of data retrieval from different
corpora, and from different subcorpora within a corpus. For the study of language variation,
especially directed towards language learning, the obvious sources are reference corpora. A
reference corpus has been defined as

designed to provide comprehensive information about a language. It aims to be large enough to
represent all the relevant varieties of the language, and the characteristic vocabulary, so that it can be
used as a basis for reliable grammars, dictionaries, thesauri and other language reference materials.
(Eagles 1996)

Reference corpora are not the only source of such data, but constitute, by definition, the most
comprehensive portrait of a language’s main varieties.

Variation must play a role in the language learning process, since it is a key aspect of
communicative competence (Hymes 1971; Spolsky 1989). But a decision has to be made
whether to adopt a genre-driven or a text-type-driven approach. If we accept Biber and
Finegan’s claim that “linguistic variation is generally conditioned by some combination of
social, situational, discourse and processing characteristics” (1991: 216), then we will tend to
prefer a functional approach that will give precedence to genre/register in presenting variation
to language learners. This seems a sensible way to move if we wish to maintain a
communicative methodology in teaching.

Whether we prefer a genre-driven or a text-type-driven approach, there remain practical
problems related to corpus exploitation in classroom environments. When it comes to
teaching such central aspects of variation as differences between spoken and written language,
sociolinguistic variables, formal and informal registers, and language change, the need for
explicitly coded information in corpora (as well as in dictionaries and other tools) becomes
urgent: “language teachers and researchers need to know exactly what kind of language they
are examining or describing” (Lee 2001: 37). But corpora encode aspects of language
variation such as formal and informal styles (at phonological, morphological, syntactic and
textual levels) and in general diaphasic variations (written vs. spoken, diatopic differences,
speaker styles, genres, domains, sublanguages, etc.) differently. Furthermore, to investigate
language variation, it is essential to have access to multiple linguistic sources, since it is only
possible to detect variation through comparison. Features which are to be considered specific
to a genre or a text type can only be detected through the observation of diversity or of
different degrees of similarity. When it comes to analysing variation using corpora, the first
condition is the possibility to keep different genre labels apart.

1
While the term register is commonly used as a “general cover term associated with all aspects of variation in
use” (Biber 1995: 9), it is most closely connected to linguistic patterns and choices instantiated by specific
genres (Lee 2001).

It is of course possible to investigate language variation by close analysis of textual material
extracted from corpora. But it is not feasible to expect language learners to discover patterns
of variation which have not been previously coded by the corpus designer or at least explored
by the teacher in advance. The most obvious instruments for extracting information about
variation are the retrieval tools associated with different corpora. These retrieval tools (mainly
corpus querying systems) enable, to different extents, the user to mine a corpus for lexical,
morphological, syntactic and pragmatic aspects of variation.

3. Exploiting available Italian corpora

Reference corpora are becoming widely used for teaching purposes and a number of
theoretical investigations of their potential for teaching variation have been produced
(Hunston 2002; Conrad 2004). Maximum coverage of variation is considered a prerequisite of
reference corpora aiming at representativeness, register-diversity and balance among genres.
So if we want to look at variation for teaching purposes, reference corpora should be the ideal
tool. In this section I compare some of the main Italian reference corpora, focusing on their
classification of varieties and their potential as sources of information on register, domains,
genre, and text types. I will also look at examples of how variation can be observed, and how
students can examine the main characteristics of register diversification identified by Biber
(1993b: 221): “individual linguistic features are distributed differently across registers, and
[…] the same (or similar) linguistic features can have different functions in different
registers”.

3.1. Written Reference Corpora

Among the corpora of written Italian developed during the past decades (Chiari 2005), some
can be considered reference corpora in their design and scope. Excluding some which are no
longer available, such as LIF (Bortolini et al. 1971), I will focus on two of these: the large
100-million word COrpus di Riferimento dell'Italiano Scritto, 1998-2001 (CORIS/CODIS:
Rossini Favretti 2000; Rossini Favretti et al. 2002), and the smaller 4-million word Corpus e
Lessico di Frequenza dell'Italiano Scritto (COLFIS: Laudanna et al. 1995).

3.1.1. CORIS/CODIS

Developed by the Centre for theoretical and applied linguistics of the University of Bologna
(CILTA), this is currently the largest publicly available balanced corpus of written Italian. It
exists in two forms: a static one (COrpus di Riferimento dell'Italiano Scritto (CORIS)), and a
dynamic one (COrpus Dinamico dell'Italiano Scritto (CODIS)). The latter is a monitor corpus
updated every two years, modelled on the Bank of English. The corpus is divided into
subcorpora of traditional genres: press, fiction, administrative and legal prose, academic
prose, miscellanea and ephemera. These subcorpora are divided into sections and subsections
based on external parameters (see Table 1), including both national and local varieties of the
language. The hierarchical structuring of sections and subsections involves a mixture of
subgenres and domains. Thus the fiction subcorpus contains crime, adventure, and
science-fiction, along with women’s literature, literature for adults, and literature for children.
Academic prose includes books, reviews, popular history, philosophy, arts, literary criticism,
law, economy, biology, etc. Other subcorpora seem to be organized on the basis of
differences in readers and purposes.

SUBCORPORA                 SECTIONS                         SUB-SECTIONS

FICTION                    novels, short stories            Italian, foreign, for adults, for children,
25 Mw                                                       crime, adventure, science-fiction,
                                                            women’s literature

PRESS                      newspapers, periodic,            national, local, specialist, non-specialist,
38 Mw                      supplement                       connotated, non-connotated

ACADEMIC PROSE             human sciences, natural          books, reviews, scientific, popular history,
12 Mw                      sciences, physics,               philosophy, arts, literary criticism, law,
                           experimental sciences            economy, biology, etc.

LEGAL AND ADMIN. PROSE     legal, bureaucratic,             books, reviews
10 Mw                      administrative

MISCELLANEA                books on religion, travel,       books, reviews
10 Mw                      cookery, hobbies, etc.

EPHEMERA                   letters, leaflets,               private, public, printed, electronic
5 Mw                       instructions

Table 1. Design structure of CORIS/CODIS

The only tool available to access the corpus is concordancing over the web
(http://corpora.dslo.unibo.it). The dynamic version allows the user to choose which
subcorpora to include, and how large each should be (see Fig. 1), adapting to different needs
and working hypotheses. The query interface admits single word forms, specific sequences of
word forms (AND/OR logical operator), sequences at a distance (which retrieves two words
with up to a given number of words in between), and returns the total number of hits
satisfying the query, displaying a maximum of 300. Collocational values using mutual
information, t-score or raw frequency are optionally shown. No grammatical annotation is
provided to the end user, and at the moment the public online version of the corpus is not
lemmatized (accessed May 2008). The very concise online help seems to suggest the
presence of grammatical annotation, though no explanation of the tagset and query language
is given. A major limitation is the impossibility of exploring entire texts (probably due to
copyright issues), or of restricting queries to specific sections and subsections within the
subcorpora. Distributions over subcorpora, sections and subsections are not given.
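
The two collocational measures named in this description can be illustrated with a minimal sketch. It assumes the standard textbook definitions of mutual information and t-score, both of which compare the observed number of co-occurrences with the number expected if the two words were independent. Which exact formulas and window sizes CODIS applies is not documented here, so the function names and the counts below are illustrative assumptions, not a description of the CODIS implementation.

```python
import math


def expected_cooccurrences(freq_node: int, freq_collocate: int, corpus_size: int) -> float:
    """Co-occurrences expected by chance if the two words were independent."""
    return freq_node * freq_collocate / corpus_size


def mutual_information(observed: int, freq_node: int, freq_collocate: int,
                       corpus_size: int) -> float:
    """Mutual information: log2 of observed over expected co-occurrences."""
    expected = expected_cooccurrences(freq_node, freq_collocate, corpus_size)
    return math.log2(observed / expected)


def t_score(observed: int, freq_node: int, freq_collocate: int,
            corpus_size: int) -> float:
    """t-score: difference between observed and expected, scaled by sqrt(observed)."""
    expected = expected_cooccurrences(freq_node, freq_collocate, corpus_size)
    return (observed - expected) / math.sqrt(observed)


# Hypothetical counts, invented for illustration (not taken from CORIS/CODIS).
mi = mutual_information(observed=50, freq_node=1_000, freq_collocate=2_000,
                        corpus_size=1_000_000)
t = t_score(observed=50, freq_node=1_000, freq_collocate=2_000,
            corpus_size=1_000_000)
print(f"MI = {mi:.2f}, t-score = {t:.2f}")  # MI = 4.64, t-score = 6.79
```

In classroom use the two measures are best read side by side: mutual information tends to highlight rare but exclusive pairings, whereas the t-score favours frequent ones.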

Fig. 1. CODIS query form

We can search for specific wordforms or sequences, such as praticamente or a lui gli piace,
and obtain the actual number of hits found (309 and 2 respectively), a concordance display of
up to 300 occurrences with a maximum of 160 characters of context, and an indication of
which subcorpus each line comes from (see Fig. 2). If the total number of hits is over 300 but
not too large, we can obtain all the concordance lines by doing separate queries for each
subcorpus. But when faced, say, with the 2,232 hits for the form diciamo (we say/let’s say),
we cannot analyse all the hits in order to distinguish uses as an ordinary verb from those as a
discourse marker. No hint on how to search for lemmas is given in the help file or in the
documentation. Information about register variation is only obtainable via separate queries in
different subcorpora.

Fig. 2. CODIS concordance: praticamente

3.1.2. Corpus e Lessico di Frequenza dell'Italiano Scritto (COLFIS)

COLFIS (Laudanna et al. 1995) is a recently released frequency list of Italian words derived
from a 3,798,275 word corpus of written texts collected in the early ’90s. While the frequency
list is fully available (in a variety of different formats), the corpus at the moment is in
prototype form, and queries can only be performed on those texts authorized by the copyright
holders (http://www.ge.ilc.cnr.it/strumenti.php). COLFIS has an extremely interesting design,
being based on reading statistics for a sample of the Italian population from 11 years of age,
processed by the National Institute of Statistics (ISTAT). The relative proportions of texts
included in the corpus were based on 1992-94 data for newspapers, magazines and books. In
Biber’s terms (1993a:245) the corpus is designed on text reception principles, aiming to
represent language use from a demographic perspective.

SUBCORPORA        SECTIONS               SUBSECTIONS

NEWSPAPERS        Corriere della Sera    Business and economics
1,523,167         La Repubblica          Local news
                  La Stampa              Gossip
                                         Crime
                                         Foreign affairs
                                         Home affairs
                                         Science
                                         Arts and sport

MAGAZINES                                Arts, science and technology
1,084,574                                Cars and boats
                                         Children and young people
                                         Home and hobbies
                                         Photo comics2
                                         General information
                                         Gossip
                                         Radio and TV
                                         Sport
                                         Travel and ecology
                                         Women’s
                                         Other

BOOKS                                    Art
542,334                                  Children’s
                                         Classic fiction
                                         Crime and espionage
                                         Drama and poetry
                                         Essays
                                         Hobbies and travel
                                         Modern fiction
                                         Natural and exact sciences
                                         Romantic fiction
                                         Science fiction
                                         Other

Table 2. Design structure of COLFIS

2
A popular subgenre of comics in which the story is told not by drawings, but by photos.

For each of the three national newspapers sampled, there are 9 canonical categories of texts
(corresponding to sections in the newspapers). Magazines are divided into 12 categories using
a mixture of genre, domain, and text-type criteria. Books are distributed into 13 categories,
again with mixed genre and domain criteria.

Fig. 3. COLFIS query form for the non lemmatized corpus

The authorized portion of the corpus can be concordanced on the web, using either the raw
text or the lemmatized version. The three subcorpora can be searched individually, but not
subsections of these. The raw corpus has a very intuitive interface (see Fig. 3). As there is no
query syntax, only word types and word sequences can be searched for. The output is a
tabular concordance showing the hit paragraph with information on subcorpus, subsection,
newspaper or magazine source, author, title, publisher, year, and date. Context length cannot
be varied.

Fig. 4. COLFIS concordance of sai

A similar query window is available for the lemmatized search, producing a similar output
with POS tagging (12 categories) included, but only a sentence of context.

Most queries performed with CODIS can be performed in COLFIS, except that the latter does
not support wildcards. Frequency information is not provided in the concordance tables, but
can be obtained through a specific search mask giving frequencies, relative frequencies and
dispersion values for all the wordforms and lemmas in the corpus. The general frequency lists
can be downloaded, while results of single queries are not exportable. As with CODIS, access
to the full text is not permitted.
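
Because the three subcorpora differ considerably in size, raw hit counts are not directly comparable. The sketch below shows one way a teacher might normalise counts and summarise how evenly a form is spread. The subcorpus sizes are those printed in Table 2; the hit counts are invented for illustration; and the dispersion measure is Gries's deviation of proportions (DP), chosen here for simplicity. Which dispersion formula COLFIS itself reports is not stated in this section, so DP should be read as a stand-in.

```python
# Subcorpus sizes in words, as printed in Table 2.
SUBCORPUS_SIZES = {
    "newspapers": 1_523_167,
    "magazines": 1_084_574,
    "books": 542_334,
}


def per_million(hits: int, size: int) -> float:
    """Relative frequency expressed as occurrences per million words."""
    return hits / size * 1_000_000


def deviation_of_proportions(hits: dict, sizes: dict) -> float:
    """Gries's DP: 0 means the form is spread exactly in proportion to
    subcorpus size; values near 1 mean it is concentrated in a small part."""
    total_hits = sum(hits.values())
    if total_hits == 0:
        raise ValueError("the word form does not occur in any subcorpus")
    total_size = sum(sizes.values())
    return 0.5 * sum(
        abs(hits.get(part, 0) / total_hits - size / total_size)
        for part, size in sizes.items()
    )


# Hypothetical hit counts for a single word form (not real COLFIS data).
hypothetical_hits = {"newspapers": 120, "magazines": 150, "books": 30}

for part, size in SUBCORPUS_SIZES.items():
    rate = per_million(hypothetical_hits[part], size)
    print(f"{part:<11}{rate:8.1f} per million words")
print(f"DP = {deviation_of_proportions(hypothetical_hits, SUBCORPUS_SIZES):.3f}")
```

Normalised rates make the subcorpora comparable at a glance, while a single dispersion value warns the learner when an apparently common form is in fact confined to one kind of text.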

If looking for specific words or sequences, COLFIS can be used to determine distribution
across the main subcorpora, but morphological and syntactic patterns cannot be investigated.
The impossibility of querying specific subsections significantly reduces the suitability of the
corpus for studying variation.

3.2. Spoken Corpora

Many researchers have favoured the use of spoken language corpora in classroom work.
Cresti (2007) notes how artificially-constructed materials differ radically from natural
conversations performed on the same tasks, and how the latter present far greater diaphasic
variability. While for formal written language, real-life uses may be relatively similar to those
presented to the learner in the classroom, the distance between the speech of teaching
materials and that of real life communication is enormous, making the presentation of spoken
language variation a key task for teaching.

If, as we have seen, written corpora can vary considerably in their coverage of language
variation due to an ambiguous distinction between external and internal factors, for speech
corpora the obstacles are multiplied by the absence of a shared classification of genres and
registers.

Over the last fifteen years a number of Italian speech corpora have been created, starting with
the pioneering Lessico di frequenza dell’italiano parlato (LIP: De Mauro et al. 1993). Some
multilingual corpora also include spoken Italian components, such as C-ORAL-ROM,
Integrated reference corpora for spoken romance languages (Cresti & Moneglia 2005). Other
corpora have been designed to explore phonetic and phonological properties, such as Archivio
delle Varietà di Italiano Parlato (AVIP: Pettorino & Giannini 2003), Archivio di Parlato
Italiano (API), Italiano Parlato (IPar: Albano Leoni & Giordano 2005) and Corpora e Lessici di
Italiano Parlato e Scritto (CLIPS: http://www.clips.unina.it/).3 Here we will focus attention
on two corpora that aim at representativeness of variation, namely LIP (500,000 words) and
the Italian component of C-ORAL-ROM (300,000 words).

3.2.1. The LIP corpus

LIP was designed for use in the production of the first frequency list of spoken Italian (De
Mauro et al. 1993). It was modelled on the corpus used for the first frequency list of written
Italian (LIF: Bortolini et al. 1971). By current standards the corpus is rather small (500,000
words), consisting of 57 hours of speech recorded in the early ’90s, transcribed and
lemmatized. It was designed along two dimensions of variation: diatopic (4 cities: Rome,
Florence, Milan and Naples) and interaction typology (5 classes: A: bi-directional exchange,
face to face, with free turn-taking; B: bi-directional exchange, not face to face, with free
turn-taking; C: bi-directional exchange, face to face, with regulated turn-taking; D: unidirectional
exchange, with the addressee being present; E: distanced unidirectional exchange). Criteria
are thus mainly external ones (see Table 3).

3
The wide interest in spoken language corpora is testified by the Parlaritaliano project, directed by Miriam
Voghera, aiming at the constitution of a web portal of electronic resources and bibliographies on Italian spoken
language (http://www.parlaritaliano.it/).

The corpus can now be interrogated at the BADIP website
(http://languageserver.uni-graz.at/badip/). The BADIP interface is a powerful one. The query
syntax accepts wildcards, POS tags, lemma searches, and gaps between sequences. Subcorpora can
be queried separately. Simple statistics (frequency and subcorpora frequency) are given for each
query. A complete utterance is provided as context. Concordances can be exported in various
formats (txt, xls, html, xml). As a simple search produces a concordance with statistics on
dispersion in different text-types, it is immediately clear that praticamente, for example, occurs
less frequently in unidirectional exchanges – type D, the most formal category – and is common
in informal speech.

The strength of the LIP corpus lies in its coverage of different situational contexts. This
avoids the vicious circle caused by designs based on internal criteria (the main problem with
which is circularity in text selection and data retrieval). LIP’s external classification seems
particularly useful to investigate variation in language teaching contexts. Not only can
expressions, tags, and syntactic patterns be queried over different situation-types, but access is
available to the entire transcript, allowing the learner to explore this freely.

Among the disadvantages of LIP is its small size, which makes it unreliable for medium
frequency words and patterns. A small scale investigation of the most frequent loan words in
Italian (Chiari 2008) obtained sparse data, which was heavily influenced by single text
domains. Another drawback is the inaccessibility of the original audio files, and the absence
of phonetic and prosodic annotation. However, it remains unrivalled for its balanced typology
based exclusively on situational parameters.

TEXT TYPOLOGY                               INCLUDED TEXTS

A: bi-directional exchange, face to face,   - conversations at home;
   with free turn-taking                    - conversations at work;
                                            - conversations at school or at the university;
                                            - conversations during recreation or on means of transport.

B: bi-directional exchange, not face to     - normal telephone conversations;
   face, with free turn-taking              - telephone conversations broadcast on radio;
                                            - messages recorded by telephone answering machines.

C: bi-directional exchange, face to face,   - legislative assemblies;
   with regulated turn-taking               - cultural discussions;
                                            - assemblies at school;
                                            - labor union assemblies;
                                            - meetings of workers;
                                            - oral exams in the elementary school;
                                            - oral exams in the secondary school;
                                            - oral exams at the university;
                                            - interrogations in the courtroom;
                                            - interviews on radio or television.

D: unidirectional exchange, with the        - lessons in the elementary school;
   addressee being present                  - lessons in the secondary school;
                                            - university lectures;
                                            - speeches held during party conventions or labor union
                                              meetings;
                                            - presentations at scientific meetings;
                                            - speeches held during electoral campaigns;
                                            - sermons;
                                            - presentations at non-specialist meetings;
                                            - court pleadings.

E: distanced unidirectional exchange        - television programs;
                                            - radio programs.

Table 3. LIP text typology (from BADIP website)

3.2.2. C-ORAL-ROM

C-ORAL-ROM (Integrated reference corpora for spoken romance languages) includes a
comparable set of spontaneous spoken language corpora for French, Italian, Portuguese and
Spanish, each represented by 300,000 word samples (Cresti & Moneglia 2005a). The corpus
was designed to focus on prosodic and pragmatic features (speech acts). The DVD includes
the digitalized audio (in wav format), transcriptions (in CHAT format) and text analyses of
the recordings, accessible in a simultaneous aligned format. The corpus is annotated
prosodically and with action values. The Italian component comprises about 36 hours of
speech.

Fig. 5. C-ORAL-ROM design scheme (from Cresti & Moneglia 2005)

The corpus design is specifically aimed at capturing variation, “covering a wide range of
semantic and pragmatic domains of application” (Cresti & Moneglia 2005b, see Fig. 5). The
highest level distinction is one of register, between formal and informal speech. The
definition of these terms is rather peculiar, since informal is intended as an “un-scripted low
variety of language, used for everyday interactive purposes”, and the formal as a “partially-
scripted task-oriented high variety of language” (Cresti & Moneglia 2005b). The second
level takes into account the channel of communication for formal speech (face-to-face,
broadcast and telephonic), and public versus private for informal speech. At the third level
informal speech is divided into monologue, dialogue and (multi-party) conversation. For
some formal speech categories a further distinction is made on the basis of domain: natural
context (political speech; political debate; preaching; teaching; professional explanation;
conference; business; law) and media (news; sport; interviews; science; weather forecast;
scientific press; reportage; talk show). As Cresti notes, the criteria adopted for the formal and
informal sections are rather different:

The definition of a finite list of typical domains of use is the main criterion applied in documenting the formal
uses of the four romance languages, while variations in dialogue structure and social context of use is the
sampling criterion of the informal part. The choice of the specific semantic domain of use is left random in the
informal sampling.
(Cresti & Moneglia 2005b)

The architecture of the corpus is not consistent, including both internal and external criteria.
In the domain distinction for formal natural contexts we can find topic-related categories
(preaching “religion”, business, law), but also ones that represent situational aspects and text-
types (teaching, professional explanation and conference), as well as mixed topic- and
activity-centered categories, such as political speech and political debate.

The quantitative balance of the different categories reflects a questionable principle:

[…] while it can be assumed that in western societies the formal use of language is applied in a closed series of
typical domains, the same does not hold for the informal use of language. The list of possible domains of use for
informal language is by definition open, and no domain can in principle be considered more typical than others.
(Cresti & Moneglia 2005b)

While the open nature of the domains of informal language use can be easily agreed with, the
idea that formal language use is set in a predefined (or at least closed) set of domains is
debatable. The formal-informal parameter seems a continuum rather than a clear-cut
distinction, and hence theoretically weak. The sampling reflects these differences in the
treatment of the formal and informal categories (see Fig. 6).

Fig. 6. Quantitative balance of C-ORAL-ROM (Italian component)

When it comes to the issue of language variation, it is extremely difficult to evaluate the
project. On the one hand, if we suppose that the formal/informal distinction holds and has
been applied consistently, it would be extremely practical to observe and compare the two
sections in classroom work. On the other hand the different quantitative balance of the
subsections makes it hard to go beyond this dichotomy.4

Turning to retrieval tools, C-ORAL-ROM is distributed with Véronis’ Contextes. The query
syntax accepts word types, lemmas, POS tags, multiple word queries, and regular
expressions, and the interface permits the export of concordances and frequency lists of
selected words, as well as direct access to the full text. No general frequency and dispersion
data is given for sections of the corpus, but a non-exportable frequency list for the whole
corpus is provided (without dispersion and usage counts). Like the LIP corpus, C-ORAL-
ROM is rather small and hence unreliable for medium frequency words.

4. Variation in Italian language teaching

Italian reference corpora treat language variation in very different ways, posing a number of
problems for their exploitation in teaching. The speech corpora discussed have a number of
features which are highly interesting from this point of view: access to both concordances and
full text (the latter being essential to understand the speech acts involved), aligned audio,
tagged versions for the investigation of grammatical patterns. Their disadvantages concern
query limitations, their small size, and design inconsistencies. If the learner wants to
investigate similarities and differences between spoken and written Italian, she will encounter
a number of obstacles (common in some cases to the language researcher): differences in size,
design, and tools that do not permit statistical conclusions to be drawn from the different
sources available. Italian corpus linguistics has been mainly concerned with the construction
of balanced resources, and the debate over large vs. register-diversified balanced corpora
(Biber 1993b) has barely touched the Italian scene, mainly because we do not have any very
large corpora at the moment. But if we consider the problem from the language learner’s
perspective, it is obvious that a balanced diversified corpus will respond better to their needs.

4
The frame is of course different for research purposes, since it is always possible to evaluate representativeness
related to single parameters (Biber 1993a, b).

Although language variation in all its aspects is at the centre of debate in corpus linguistics,
there are still inconsistencies in the classification and labelling of variation, not only in
reference to Italian. These lead to difficulties in managing corpus material, since text-types
are grouped in different classes in unpredictable ways (a situation common to many English
resources as well, as Lee 2001 has shown). At the level of query capabilities, there are severe
limitations for Italian reference corpora (which have been largely overcome for English
corpora like the BNC, with Dodd’s Xaira (http://www.oucs.ox.ac.uk/rts/xaira/), Davies’ View
(http://corpus.byu.edu/bnc/), and Fletcher’s Pie (http://pie.usna.edu/)). The flexibility of
retrieval tools is indispensable for correct data extraction, and an interface offering
accessibility, user-friendliness, and a clean and intuitive layout is a vital requirement from a
teaching perspective, where aims potentially differ from those of linguistic research.

One of the main obstacles for teachers is the unpredictability of query results, which makes
them reluctant to use corpora extensively, since doing so requires a lot of advance
planning. A clear description of corpus features and the variation dimensions represented
should be a priority in corpus distribution. Explicit evaluation of the main tasks that teachers
and learners need to perform with single corpora should include: looking for patterns of
variation from a lexical, syntactic, pragmatic, and sociolinguistic point of view; looking for
specific features that characterize different genres, different domains and different text types;
looking for textual specificity of sublanguages. For spoken corpora a requisite should be
access to aligned audio, so that the user can examine spoken realisations of different registers
(phonetic and prosodic features). C-ORAL-ROM and CLIPS (www.clips.unina.it) both go in
this direction, albeit at different levels.

In order to explore such issues, retrieval tools and documentation should provide:
• full classification of subcorpora and their subsections;
• querying in subsections;
• querying over POS tags, preferably using regular expressions or intuitive matching
systems;
• explicit frequency data (including relative frequency and dispersion, perhaps indicated
roughly with learner-friendly symbols like those in dictionaries);
• exact numbers of hits found and display of complete concordances with adjustable context
length;
• full text access for traditional textual analysis and pragmatic observation.
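
Some of the requirements listed above (querying in subsections, relative frequency, a rough dispersion measure, and concordances with adjustable context length) can be illustrated with a minimal Python sketch. It does not reproduce the query system of any corpus discussed in this paper: the toy corpus, its subsection labels, and all function names are hypothetical and chosen only for illustration, and dispersion is approximated very roughly as the number of subsections in which a word form occurs.

```python
import re
from collections import Counter

# Hypothetical toy corpus (subsection label -> text); not taken from any real corpus.
CORPUS = {
    "press": "il governo ha detto che la legge cambia",
    "fiction": "sai che la casa era vuota e sai che pioveva",
    "spoken": "sai la casa la vedo domani",
}


def tokenize(text):
    """Split a text into lowercase word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def relative_frequency(word, tokens, per=1_000_000):
    """Occurrences of `word` normalised per `per` tokens of the given subsection."""
    if not tokens:
        return 0.0
    return Counter(tokens)[word] / len(tokens) * per


def dispersion_range(word, corpus):
    """Rough dispersion: number of subsections containing `word` at least once."""
    return sum(1 for text in corpus.values() if word in tokenize(text))


def kwic(word, tokens, context=3):
    """Return (left context, keyword, right context) triples for every hit."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == word:
            left = " ".join(tokens[max(0, i - context):i])
            right = " ".join(tokens[i + 1:i + 1 + context])
            hits.append((left, tok, right))
    return hits


if __name__ == "__main__":
    fiction = tokenize(CORPUS["fiction"])
    print("hits in fiction:", len(kwic("sai", fiction)))
    print("subsections containing 'sai':", dispersion_range("sai", CORPUS))
    for left, key, right in kwic("sai", fiction, context=2):
        print(f"{left:>15} [{key}] {right}")
```

Restricting a query to one subsection here simply means passing the tokens of that subsection; a real interface would also need POS-tagged input and a less naive tokenizer.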

Language teaching may benefit from the corpus-based study of linguistic variation in many
ways: in the selection and grading of content for syllabus and materials design, in developing
activities to enhance learner awareness and to help learners exploit resources, and in
assessment procedures.

References
Albano Leoni F. & R. Giordano (eds.) (2005), Italiano parlato. Analisi di un dialogo. Napoli: Liguori.

Bernardini S. & F. Zanettin (eds) (2000) I corpora nella didattica della traduzione. Bologna: CLUEB.
Biber D. (1988) Variation across speech and writing. Cambridge: Cambridge University Press.
Biber D. (1992) The multi-dimensional approach to linguistic analyses of genre variation: an overview
of methodology and findings. Computers and the humanities, 26, 331-345.
Biber D. (1993a) Representativeness in corpus design. Literary and linguistic computing, 8, 243-257.
Biber D. (1993b) Using register-diversified corpora for general language studies. Computational
Linguistics, 19, 219-241.
Biber D. (1995) Dimensions of register variation: a cross-linguistic comparison. Cambridge:
Cambridge University Press.
Biber D., S. Conrad & R. Reppen (1998) Corpus linguistics: investigating language structure and use.
Cambridge: Cambridge University Press.
Biber D. & E. Finegan (1991) On the exploitation of computerized corpora in variation studies. In
Aijmer K. & B. Altenberg (eds), English corpus linguistics: studies in honour of Jan Svartvik.
London: Longman, 204-220.
Biber D. & E. Finegan (eds) (1994) Sociolinguistic perspectives on register. Oxford: Oxford
University Press.
Bortolini U., C. Tagliavini & A. Zampolli (1971) Lessico di frequenza della lingua italiana
contemporanea. Milano: IBM Italia.
Botley S., J. Glass, T. McEnery & A. Wilson (eds) (1996) Proceedings of teaching and language
corpora 1996. Lancaster: UCREL.
Botley S., A. McEnery & A. Wilson (eds) (2000) Multilingual corpora in teaching and research.
Amsterdam: Rodopi.
Burnard L. & T. McEnery (eds) (2000) Rethinking language pedagogy from a corpus perspective.
Frankfurt am Main: Peter Lang.
Chiari I. (2005) Linguistica e informatica: la linguistica dei corpora in Italia. Bollettino di Italianistica,
4, 101-118.
Chiari I. (2008) Ingresso, uso, integrazione e produttività delle parole nuove in italiano. Metodi e
problemi dell’indagine quantitativa sul lessico. In M. Pettorino, A. Giannini, M. Vallone & R.
Savy (eds.) Atti del Convegno internazionale sulla comunicazione parlata. Napoli: Liguori, 402-421.
Conrad S. (2004) Corpus linguistics, language variation and language teaching. In J.M.
Sinclair (ed), How to use corpora in language teaching. Amsterdam: John Benjamins, 67-85.
Cresti E. (2007) Some comparisons between UBLI and C-ORAL-ROM. In Zaima K.Y.S. & T.
Takagaki (eds), Spoken language corpus and linguistics informatics. Amsterdam: John Benjamins,
125-115.
Cresti E. & M. Moneglia (2005a) C-ORAL-ROM: integrated reference corpora for spoken Romance
languages. Amsterdam: John Benjamins.
Cresti E. & M. Moneglia (2005b) C-ORAL-ROM: integrated reference corpora for spoken Romance
languages. URL: http://lablita.dit.unifi.it/corpora/descriptions/coralrom/ (date accessed May 15
2008).
De Mauro T., F. Mancini, M. Vedovelli & M. Voghera (1993) Lessico di frequenza dell'italiano
parlato (LIP). Milano: Etaslibri.
Eagles (1996) Preliminary recommendations on Corpus Typology, EAG--TCWG--CTYP/P,
http://www.ilc.cnr.it/EAGLES96/corpustyp/corpustyp.html. May, 1996. [Accessed April 2007].
Ellis R. (1987) Second language acquisition in context. Englewood Cliffs, NJ: Prentice-Hall.
Gavioli L. (2005) Exploring corpora for ESP learning. Amsterdam: John Benjamins.
Godwin-Jones B. (2001) Emerging technologies. tools and trends in corpora use for teaching and
learning. Language Learning & Technology, 5(3), 7-12.
Granger S. & M. Wynne (2000) Optimising measures of lexical variation in EFL learner corpora. In
Kirk J.M. (ed) Corpora galore: analyses and techniques in describing English. Amsterdam:
Rodopi, 249-257.
Hunston S. (2002) Pattern grammar, language teaching, and linguistic variation: applications of a
corpus-driven grammar. In Reppen R., S.M. Fitzmaurice & D. Biber (eds) Using corpora to
explore linguistic variation. Amsterdam: John Benjamins, 167-183.
Hymes D. (1971) On communicative competence. Philadelphia: University of Pennsylvania Press.
Johns T. (1991a) Classroom concordancing: University of Birmingham, Centre for English Studies.
Johns T. (1991b) From printout to handout: grammar and vocabulary learning in the context of data-
driven learning. English Language Research Journal, 4, 27-45.
Johns T. (1991c) Should you be persuaded – two examples of data-driven learning materials. English
Language Research Journal, 4, 1-16.
Johns T. (2002) Data-driven learning: the perpetual challenge. In Kettemann B. & G. Marko (eds)
Teaching and learning by doing corpus analysis. Amsterdam: Rodopi, 107-117.
Kahn B. (1985) Computers in science: using computers for learning and teaching. Cambridge:
Cambridge University Press.
Laudanna A., A.M. Thornton, G. Brown, C. Burani & L. Marconi (1995) Un corpus dell'italiano
scritto contemporaneo dalla parte del ricevente. In Bolasco S., L. Lebart & A. Salem (eds), III
Giornate internazionali di analisi statistica dei dati testuali. Roma: CISU, 103-109.
Lee D.Y. (2001) Genres, registers, text types, domains, and styles: clarifying the concepts and
navigating a path through the BNC jungle. Language Learning & Technology, 5(3), 37-72.
Lewandowska-Tomaszczyk B. (ed) (2003) PALC 2001: practical applications in language corpora.
Frankfurt am Main: Peter Lang.
Lewandowska-Tomaszczyk B. (ed) (2004) Practical applications in language and computers: PALC
2003. Frankfurt am Main: Peter Lang.
Lewandowska-Tomaszczyk B. & P.J. Melia (eds) (1997) International conference on practical
applications in language corpora. Lodz: Lodz University Press.
Lewandowska-Tomaszczyk B. & P.J. Melia (eds) (2000) PALC'99 – Practical applications in
language corpora. Frankfurt am Main: Peter Lang.
Meyer C.F. (2004) Can you really study language variation in linguistic corpora? American Speech,
79, 339-355.
Paltridge B. (1996) Genre, text type, and the language learning classroom. ELT Journal, 50, 237-243.
Partington A. (1998) Patterns and meanings: using corpora for English language research and
teaching. Amsterdam: John Benjamins.
Pettorino M. & A. Giannini (2003) Progetti AVIP e API-unità di ricerca dell’Università degli Studi di
Napoli l’Orientale. In Albano Leoni F., Cutugno F., Pettorino M., Savy R. (eds.), Il parlato
italiano. Atti del Convegno Nazionale, D'Auria, Napoli, N06.
Rossini Favretti R. (2000) Progettazione e costruzione di un corpus di italiano scritto: CORIS/CODIS.
In Rossini Favretti R. (ed.), Linguistica e informatica. Multimedialità, corpora e percorsi di
apprendimento. Roma: Bulzoni, 39-56.
Rossini Favretti R., F. Tamburini & C. De Santis (2002) A corpus of written Italian: a defined and a
dynamic model. In Wilson A., P. Rayson & T. McEnery (eds), A rainbow of corpora: corpus
linguistics and the languages of the world. Munich: Lincom-Europa, 27-38.
Scott M. & C. Tribble (eds) (2006) Textual patterns: key words and corpus analysis in language
education. Amsterdam: John Benjamins.
Sinclair J.M. (ed) (2004) How to use corpora in language teaching. Amsterdam: John Benjamins.
Spolsky B. (1989) Communicative competence, language proficiency, and beyond. Applied
Linguistics, 10, 138-156.
Szendeffy J. (2005) A practical guide to using computers in language teaching. Ann Arbor: University
of Michigan Press.
Timmis I. (2003) Corpora, classroom and context: the place of spoken grammar in English language
teaching. Ph.D. Thesis, University of Nottingham.
Tono Y. (2002) The role of learner corpora in SLA research and foreign language teaching: the
multiple comparison approach. Ph.D. Thesis, University of Lancaster.
Wang L. (2000) The use of parallel texts in language learning: computer software and teaching
materials for English and Chinese. Birmingham: University of Birmingham.
Wible D., F.-y. Chien, C.-H. Kuo & C.C. Wang (2002) Toward automating a personalized
concordancer for data-driven learning: a lexical difficulty filter for language learners. In Kettemann
B. & G. Marko (eds) Teaching and learning by doing corpus analysis. Amsterdam: Rodopi, 147-154.
Wichmann A., S. Fligelstone, T. McEnery & G. Knowles (eds) (1997) Teaching and language
corpora. London: Longman.
Zanettin F., S. Bernardini & D. Stewart (eds) (2003) Corpora in translator education. Manchester: St.
Jerome.

