Elliot, M. The Expression of Affect in Speaking...
Editorial Board
Dr Nick Saville, Director, Research and Validation Group, Cambridge ESOL
Roger Johnson, Director, Assessment and Operations Group, Cambridge ESOL
Production Team
Caroline Warren, Research Support Administrator
Rachel Rudge, Production Controller
George Hammond, Design
Research Notes
Contents
Editorial Notes
Developing a model for investigating the impact of language assessment: Nick Saville
Construct validation of the Reading module of an EAP proficiency test battery: Hanan Khalifa
Comparing proficiency levels in a multi-lingual assessment context: Karen Ashton
Testing financial English: Specificity and appropriacy of purpose in ICFE: Angela Wright
The expression of affect in spoken English: Mark Elliott
Peer–peer interaction in a paired Speaking test: The case of FCE: Evelina D Galaczi
Second language acquisition of dynamic spatial relations: Ivana Vidaković
Demonstrating cognitive validity of IELTS Academic Writing Task 1: Graeme Bridges
Qualification and certainty in L2 writing: A learner corpus study: Sian Morgan
Prompt and rater effects in second language writing performance assessment: Gad S Lim
Computer-based and paper-based writing assessment: A comparative text analysis: Lucy Chambers
A study of the context and cognitive validity of a BEC Vantage Test of Writing: Hugh Bateman
Models of supervision – some considerations: Juliet Wilson
A framework for analysing and comparing CEFR-linked certification exams: Marylin Kies
IRT model fit from different perspectives: Muhammad Naveed Khalid
Conferences and publications
Editorial Notes
Welcome to issue 42 of Research Notes, our quarterly publication reporting on matters
relating to research, test development and validation within Cambridge ESOL.
This special issue of Research Notes shares with the readers summaries of doctoral and
Master’s theses by Cambridge ESOL staff. The issue is organised according to skill area and
domain of interest. It begins with Nick Saville’s paper on an expanded impact model intended
to provide a more effective way of understanding how language examinations impact on
society. In the area of reading, Hanan Khalifa investigates the construct validity of the reading
module of an EAP test battery using qualitative and quantitative research methods. Also using
a mixed-method approach, Karen Ashton compares reading proficiency levels of secondary
school learners of German, Japanese and Urdu, while Angela Wright examines context validity
of the ICFE test of Reading. If your interests lie in the area of speaking, you may want to read
Mark Elliott’s paper on affective factors in oral communication, Evelina Galaczi’s summary of
her thesis on paired test format and Ivana Vidaković’s summary on learning how to express
motion in a second language and factors affecting second language acquisition. In the area of
writing, we would like to introduce to you Graeme Bridges’ paper on cognitive validity of IELTS,
Sian Morgan’s paper on qualification and certainty in L2 writing, Gad Lim’s work on prompt
and rater effect in assessing writing, Lucy Chambers’ summary on comparability issues
between paper-based and computer-based modes of assessment and Hugh Bateman’s work
on context and cognitive validity of a BEC Writing paper. Finally, Juliet Wilson discusses
models of teaching supervision, Marylin Kies proposes a framework for assessing and
comparing examinations linked to the CEFR and Muhammad Naveed Khalid investigates IRT
model fit from a variety of perspectives.
We finish this issue by reporting on the conference season and events Cambridge ESOL
supported. Laura Cope and Tamsin Walker report on the IACAT conference (June 2010) on
computerised adaptive testing. Martin Nuttall describes the ALTE events and Lynda Taylor
provides a brief on the three latest volumes in the SiLT series.
©UCLES 2010 – The contents of this publication may not be reproduced without the written permission of the copyright holder.
Cambridge ESOL: Research Notes: Issue 42 / November 2010
The context: There was one main context which was the focus of attention: the school and classroom (i.e. the micro context). The test-taking context was typically not separated from the school context where the teaching and learning takes place. Although some wider contextual features (macro context) were starting to be discussed, these were not yet a major focus.

The participants: The main participants were taken to be the teacher and the learners in the classroom/school context. There was a limited focus on other participants, such as materials writers, or participants from the wider context (e.g. parents).

The outcomes: Outcomes were seen as changes attributable to the introduction of the test: behaviour of participants – actions, activities, performance in the target language; views and attitudes of participants; decisions to make changes to the curriculum/syllabus and to develop new materials and methods (products).

The processes involved in bringing about the outcomes were not well understood nor well represented in the model. For example, the processes whereby the test features influenced the content and methods of the teachers were not understood. Some evidence existed to suggest that content but not the teaching methodology was affected, but when these effects occurred, how they actually came about and what factors influenced the strength of the effects were not included in the model.

The researcher: The washback researcher was typically an academic, not usually involved in the test development process as a participant, nor as a participant in the teaching/learning context itself (i.e. an outsider).

The research methods: No clear impact methodology, instrument validation procedures or validated instruments had been established, but qualitative methods were emerging in addition to survey techniques for data collection. The need to problematise washback in terms of hypotheses had been recognised.

The timeline: In the washback model, the timeline was implied but not explicitly focused on. The need for comparative data – before/after – had led to a focus on time-series designs and an appeal to insights from innovation theory. Innovation theory, in relation to Wall’s (1999, 2005) work using Henrichsen’s (1989) hybrid model of diffusion/implementation, suggests that each period of an educational innovation has its own antecedents, processes and consequences. The investigation of ‘antecedent conditions’ is Henrichsen’s version of the baseline study (see also Saville 2003). The consequences, therefore, are the changes which are brought about as a result of the new processes which have been introduced.

Cheng (1997, 2005), Green (2003, 2007) and Wall (1999, 2005) looked at different aspects of washback and had begun to focus more broadly on impact issues. However, there had been no serious attempt to bring all the features of impact together within a comprehensive model which would allow the complex relationships to be examined across broader educational and societal contexts.

Locating impact research within Cambridge ESOL

A fundamental concern in the thesis was how impact-related research can be integrated into operational processes. For Cambridge ESOL, impact research needed to combine theoretical substance with practical applications and to become an integral part of the operational test development and validation processes.

In placing impact within a validation framework, the work of Bachman was influential, especially his series of seminars delivered in Cambridge in 1990–1. He was one of the first language testers to discuss impact as a ‘quality’ of a test and suggested that impact should be considered within the overarching concept of test usefulness (Bachman & Palmer 1996). The development of ‘useful tests’ involves the balancing of four qualities: validity, reliability, impact and practicality – the VRIP features as they became known in Cambridge.

In an internal working paper, Milanovic & Saville (1996) first set out ideas on an expanded concept of test impact to meet the needs of Cambridge ESOL. They addressed the question of how examinations can be developed with appropriate systems in place to monitor and evaluate their impact.

Aware of the work of Hughes (1989) and others (e.g. Bailey 1996) who used checklists of behaviours to encourage positive washback, Milanovic & Saville (1996) proposed four maxims to support working practices:

Maxim 1: PLAN
Use a rational and explicit approach to test development
Maxim 2: SUPPORT
Support stakeholders in the testing process
Maxim 3: COMMUNICATE
Provide comprehensive, useful and transparent information
Maxim 4: MONITOR and EVALUATE
Collect all relevant data and analyse as required

The statements were deliberately designed to be short and memorable, to capture the key principles and what is most relevant, and in so doing to provide a basis for decision-making and action planning.

Under Maxim 1 there was a requirement to plan effectively and for the organisation to adopt a rational and explicit model for managing the test development processes in a cyclical and iterative way. Maxim 2 focused on the requirement to provide adequate support for the stakeholders involved in the many processes associated with international examinations. Maxim 3 focused on the importance of communication and of providing useful and transparent information to the stakeholders and Maxim 4 on the requirement to collect relevant data and to carry out analyses as part of the iterative process model.

By conceptualising impact within VRIP-based validation processes, there was an explicit attempt to integrate impact research into ongoing procedures for accumulating validity evidence. The Cambridge perspective on impact was framed by these considerations and provided the starting point for the model developed in the thesis.
Locating impact within educational systems

The thesis focused broadly on how impact operates within educational systems and the literature on educational reform and management of change was particularly relevant. An understanding of how socio-political change processes work within education was also considered to be crucial (Fullan 1991).

Several concepts emerged from the literature and were explored:
• a definition of stakeholders and the roles they play in many varied contexts where language learning and assessment operate
• a view of educational systems as complex and dynamic in which planned innovations are difficult to implement successfully
• an understanding of how change can be anticipated and how change processes related to assessment systems can be successfully managed through the agency of an examination provider
• the critical importance of the evidence collected as part of the validation system and as the basis for claims about validity.

It has been suggested that educational processes take place within complex dynamic systems with interplay between many sub-systems and ‘cultures’ and where understanding the roles of stakeholders as participants is a critical factor (e.g. Fullan 1993, 1999, Thelen & Smith 1994, Van Geert 2007).

The thesis situated the discussion of impact within the work of researchers who focus on how change can be managed successfully within educational systems. Figure 1 illustrates macro and micro contexts within society; it shows how diversity and variation between contexts tend to increase as the focus moves from the macro context to the multiple micro contexts at the local level (i.e. schools, classes, groups, individual teachers and learners).

[Figure 1 (not reproduced). Labels recoverable from the figure: Macro context: Country (culture, politics, L1, role of L2, model of L2), Region (urban/rural, wealthy/poor), Community (demographic make up), Sector (public/private), Cycle (primary, middle, upper). Micro context and culture: School, Class, Group, Learner and Teacher. Individual Differences: demographic, socio-psychological, strategic, prior knowledge/learning. Increasing variation.]

Understanding the nature of context within educational systems and the roles of stakeholders in those contexts are clearly important considerations for an examination board like Cambridge ESOL (see Saville 2003:60).

Using case studies as meta-data

A range of data collection and analysis techniques needs to be employed in impact-related research. These were discussed with reference to the literature on social research. Ways in which quantitative and qualitative approaches can be effectively combined in mixed-method designs were noted and the validation of instruments was illustrated.

Three case studies formed the central part of the thesis.
• Case 1 was the survey of the impact of IELTS (the International English Language Testing System). This was the starting point for the impact model; it set out the conceptualisation of impact and described the design and validation of suitable instruments to investigate it, as applied within four Impact Projects as part of an ongoing programme of validation following the 1995 revision. This case included a description of the IELTS development and the underlying constructs, the nature of the impact data which was targeted and the necessary instrumentation to collect that data. The lessons learned were summarised in relation to the developing model and how they informed the next phase of development in Case 2.
• Case 2 was the Italian Progetto Lingue 2000 (PL2000) Impact Study. This impact study was an application of the original model within a macro educational context and described an initial attempt at applying the approach within a state educational context, i.e. the Italian state system of education and a government reform project intended to improve standards of language education at the turn of the 21st century – the Progetto Lingue 2000. The impact of the reforms generally and the specific role of external examinations provided by Cambridge ESOL formed the basis of this case. This study provided greater
focus on the contextual variables and the roles and responsibilities of particular stakeholder groups and individuals within the educational system (see Hawkey 2006).
• Case 3 was the Florence Learning Gains Project (FLGP). Still within Italy, this project built directly on the PL2000 case and was an extension and re-application of the model within a single school context (i.e. at the micro level). It focused on individual stakeholders in one language teaching institution, namely teachers and learners preparing for a range of English language examinations at a prestigious language school in Florence. The complex relationships between assessment and learning/teaching in a number of language classrooms, including the influence of the Cambridge examinations, were examined against the wider educational and societal milieu in Italy. The micro level of detail, as well as the longitudinal nature of the project conducted over an academic year, were particularly relevant in this case.

The analysis and discussion in each case study were broadly structured around the seven features of the washback model which had emerged by the end of the 1990s, as noted above.

The revised model of impact

Insights from the three case studies were assembled into an expanded model; this meta-framework builds on Milanovic & Saville’s maxims (1996), and constitutes an action-oriented approach with four inter-related dimensions (see Figure 2).

[Figure 2 (not reproduced). Label recoverable from the figure: Stance: perspective of UK examinations board; influenced by critical realism, contemporary pragmatism.]

Dimension 1: re-conceptualise the place and role of impact study within the assessment enterprise, vis-à-vis societal systems generally and language education specifically.

The re-conceptualisation of test impact draws on theories in the social sciences and goes beyond the work in applied linguistics and measurement. It is based on a 21st century world view and takes into account recent ontological and epistemological developments.

It extends the epistemological influences which guided Messick and his predecessors in the development of validity theory in the second half of the 20th century. Messick explicitly referred to the philosophical perspectives of Leibniz, Locke, Kant, Hegel and Singer, and to the influences of their rationalism and logical positivism on the nature of scientific enquiry in the 20th century (Messick 1989:30). In moving beyond Messick into the 21st century, the influence of post-modernism cannot be ignored, but for examination boards and language test providers an epistemology which can provide the basis for action is required.

The ontological approach suggested draws on ‘critical realism’ in the social sciences (e.g. Sayer 1984, 2000) and contemporary views on pragmatism derived from the philosophy which originated with Peirce in the late 19th century. This realist stance underpins the suggested re-conceptualisation of impact and the other dimensions of the meta-framework:

a. Anticipating and managing change over time is a key aspect of impact research, noting the importance of timescales and the timeline (change over time, planned and unpredicted) with recurrent cycles (before/during/after). The recent educational literature on management
of innovation suggests mechanisms which can be put in place to anticipate and achieve desirable outcomes through change processes. Fullan (1993:19), for example, suggests that the solution to achieving productive educational change ‘lies in developing better ways of thinking about, and dealing with, inherently unpredictable processes’. His work also points up the social dimension of education and the relevance of theories of social systems and practices to assessment which have also been a focus of attention in language testing circles in recent years (e.g. McNamara & Roever 2006).

b. Socio-cognitive theories which place importance on both social and cognitive considerations are particularly relevant to the conceptualisation of language constructs (e.g. Weir 2005).
The research methodologies needed to investigate the impact of examinations in their socio-cultural contexts indicate that insights from socio-cognitive theory might also be helpful in understanding how language learning and preparation for examinations takes place in formalised learning contexts. The literature on social psychology may also be relevant as social psychologists seek to explain human behaviour in terms of the interaction between mental state and social context; this is an important aspect of impact at the micro level.

c. Constructivism is important for the re-conceptualisation of impact for two reasons: first because contemporary approaches to teaching and learning in formal contexts now appeal to constructivist theories; second because it underpins the research paradigm which is most often appropriate to finding out what goes on in contexts of test use, as seen in the case studies.

d. Contemporary theories of knowledge and of language learning need to play a more prominent role in the study of impact. For example, from the learner’s perspective, affective factors are vital for motivation, and feedback from tests that highlights strengths positively tends to lead to better learning (assessment for learning).

These considerations are relevant in designing language assessment systems with learning-oriented objectives, and whether these objectives have been met is a concern in impact research.

Dimension 2: introduce the concept of ‘impact by design’ into the planning and operationalisation of language assessments by examination providers.

The concept of ‘impact by design’ is a key feature of the expanded impact model. This means designing tests which have the potential for positive impacts, including well-defined focal constructs supported by contemporary theories of communicative language ability, language acquisition and assessment (cf. the socio-cognitive model). It takes an ex ante approach to anticipating the possible consequences of a given policy ‘before the event’.

‘Impact by design’ builds on Messick’s idea (1996) of achieving ‘validity by design as a basis for washback’. The importance of the rational model of test development and validation with iterative cycles is a necessary condition for creating construct-valid tests and for the development of successful systems to support them.

At the heart of this is the adequate specification of the focal construct which is crucial for ensuring that the test is appropriate for its purpose and contexts of use (and to counter the twin threats to validity – construct under-representation and construct-irrelevant variance – noted by Messick (1996:252)).

This is a necessary condition for achieving the anticipated outcomes, but it is not sufficient and only provides the ‘latent potential’ for validity in use. For Cambridge ESOL, ‘impact by design’ highlights the importance of designing and implementing assessment systems, which extend the design features beyond the technical validities related to the construct, and incorporate considerations explicitly related to the social and educational contexts of test use.

As time passes following the introduction of an examination, new contexts of use arise and new users acquire a stake in the examination. As this extension of ‘ownership’ happens, there is a risk of ‘drift’ away from the original intentions of the test developers; for example, the intended relationship between use of test results and the test construct may begin to change over time due to influences in the wider educational context. The potential for negative impact is likely to increase when the original construct is no longer suitable for the decisions which the new users are making. In other words, the examination is no longer ‘fit for purpose’ and so corrective action of some kind needs to be taken.

Similarly, consequences – intended and unintended – emerge after the test has been ‘installed’ into real-life contexts of use which are not uniform and are constantly changing as a result of localised socio-political and other factors. The overall validity of an assessment system, therefore, is an emergent property resulting from a test interacting with contexts over time.

‘Impact by design’ is therefore not strictly about prediction; a more appropriate term might be ‘anticipation’. In working with stakeholders, possible impacts on both micro and macro levels can be anticipated as part of the design and development process. Where negative consequences are anticipated, potential remedial actions or mitigations can be planned in advance. So, for example, if ‘construct drift’ is a risk, it can be anticipated and appropriate tolerances set before test revisions are required. This approach is congruent with the concept of social impact assessment, a form of policy-oriented social research.

Dimension 3: re-organise validation procedures to incorporate impact research into operational activities to provide the basis for knowing about and understanding how well an assessment system works in practice with regard to its impact.

It is essential to know what happens when a test is introduced into its intended contexts of use; this should constitute a long-term validation plan, as required by the impact by design concept.

Finding out and understanding needs to be a routine
Examples of theory of action are found in the literature on educational reform and school improvements, especially in the USA. Such examples provide support for the ways in which the four dimensions of the expanded model fit together in practice (e.g. Resnick and Glennan 2002). A theory of action provides planners and practitioners with the capacity to act in social contexts, to determine what needs to be done and when/how to do it. Being prepared to change and to manage change is critical to a theory of action. The challenge for the examination provider is to ‘harness the forces of change’ in order to get the relevant stakeholders working together to achieve better assessment outcomes.

Some of the dilemmas which arise in assessment contexts can only be dealt with if a wide range of stakeholders agrees to manage them in ways which they find acceptable. As Fullan (1999:xx) puts it: ‘Top-down mandates and bottom-up energies need each other.’

Hawkey, R (2006) The theory and practice of impact studies: Messages from studies of the IELTS test and Progetto Lingue 2000: Cambridge ESOL/Cambridge University Press.
Henrichsen, L E (1989) Diffusion of innovations in English language teaching: The ELEC effort in Japan, 1956–1968, New York: Greenwood Press.
Hughes, A (1989) Testing for language teachers, Cambridge: Cambridge University Press.
McNamara, T and Roever, C (2006) Language Testing: the Social Dimension, Oxford: Blackwell.
Messick, S (1989) Validity, in Linn, R L (Ed) Educational measurement (3rd ed), New York: Macmillan, 13–103.
Messick, S (1996) Validity and washback in language testing, Language Testing 13 (3), 241–256.
Milanovic, M and Saville, N (1996) Considering the Impact of Cambridge EFL Examinations, Manuscript Internal Report, Cambridge: Cambridge ESOL.
Resnick, L B and Glennan, T K (2002) Leadership for learning: A theory of action for urban school districts, in Hightower, A M, Knapp, M S, Marsh, J A and McLaughlin, M W (Eds), School Districts and Instructional Renewal, New York: Teachers College Press, 160–172.
Saville, N (2003) The process of test development and revision within Cambridge EFL, in Weir, C and Milanovic, M (2003) (Eds) Continuity and Innovation: Revising the Cambridge Proficiency in English Examination 1913–2002, Cambridge: Cambridge ESOL/Cambridge University Press.
Sayer, A (1984) Method in Social Science: A Realist Approach, London: Routledge.
Sayer, A (2000) Realism and Social Science, London: Sage.
Thelen, E and Smith, L B (1994) A Dynamic Systems Approach to the Development of Cognition and Action, Cambridge, MA: The MIT Press.
Van Geert, P (2007) Dynamic systems in second language learning: Some general methodological reflections, Bilingualism: Language and Cognition 10, 47–49.
Wall, D (1999) The impact of high-stakes examinations on classroom teaching: a case study using insights from testing and innovation theory, unpublished PhD thesis, Lancaster University.
Wall, D (2005) The Impact of High-Stakes Testing on Classroom Teaching: A Case Study Using Insights from Testing and Innovation Theory, Cambridge: Cambridge ESOL and Cambridge University Press.
Watanabe, Y (1997) The Washback Effects of the Japanese University Entrance Examinations of English – Classroom-based Research, unpublished PhD thesis, University of Lancaster.
Watanabe, Y (2004) Teacher factors mediating washback, in Cheng, L and Watanabe, Y (Eds) with Curtis A, Washback in language testing: Research contexts and methods, Mahwah, NJ: Lawrence Erlbaum Associates, 19–36.
Weir, C J (2005) Language Testing and Validation: An Evidence-based Approach, Basingstoke: Palgrave Macmillan.
This summary is based on a doctoral thesis submitted to the University of Reading (UK) in 1997. The PhD was supervised by Professor Cyril Weir.

Research purpose

The research sought to establish the construct validity of the Reading module of an English for Academic Purposes (EAP) Graduate Proficiency Test (GPT) Battery developed by the ESP Center of Alexandria University in Egypt. It investigated the componential nature of the reading construct and the effect of background knowledge on test performance. Only full consideration of these two issues would substantiate validation of the Reading module.

Research questions

The Reading module of the Egyptian Graduate Proficiency Test Battery (GPT) was intended to measure global and local comprehension. In the study, ‘global comprehension’ refers to understanding propositions at the macro-structure level of the text and ‘local comprehension’ refers to understanding propositions at the micro-structure level. The former is concerned with the relationships between ideas represented in complexes of propositions or paragraphs which tend to be logical or rhetorical (see Vipond 1980), whereas the latter is concerned with the relationships between individual sentences or concepts which tend to be mechanical or syntactical. Reading at the global level involves skimming to establish the gist of the text, search reading to locate information on a pre-determined topic, and careful reading to understand explicitly and implicitly stated main ideas. Reading at the local comprehension level is operationalised by scanning to locate specific information, and reading carefully to infer the meaning of lexical items and identify pronominal referents. Global and local comprehension levels are characterised by two different rates of reading. Operations like skimming, search reading, and scanning require a faster reading rate than those involving careful reading at microlinguistic level. Weir & Urquhart (1998) refer to the former as expeditious reading operations (whereby the reader processes text quickly, selectively and efficiently) while they refer to the latter as slow careful reading operations.

On reviewing empirical evidence provided by product- and process-oriented studies, it became apparent that there is a case for and against the multi-divisible nature of reading. Product-oriented studies like those of Berkoff (1979), Carver (1992), Davis (1968), Guthrie & Kirsch (1987) and process-oriented studies (e.g. Anderson, Bachman, Perkins & Cohen 1991, Cohen 1984, Hosenfeld 1977, Nevo 1989) have provided empirical evidence for the separability of skills. On the other hand, product-oriented studies (e.g. Lunzer, Waite & Dolan 1979, Rosenshine 1980, Rost 1993, Thorndike 1973) and process-oriented studies like that of Alderson (1990a & b) have provided evidence that reading is a single holistic process. What is most significant in all of these studies is the occurrence of vocabulary as a second factor (also referred to as word meaning, verbal reasoning, word knowledge, semantic difficulty).

The contradiction in findings seemed to be due to sample selection and methodology used. First, process-oriented studies researched at that time highlighted the absence of a working definition of the operations used in tests and, hence, disagreement among experts on what skill each item tested. Second, most of the product-oriented studies did not take into account the ability to process text quickly, i.e. the tests used did not exhibit a wide coverage of putative
EAP reading operations. Third, most of the studies favouring a unitary concept
have been carried out on young learners and in L1 contexts. The case might,
therefore, be different if the sample used were adult non-native speakers who are
spread out across a range of language proficiencies. It may well be that for such
a sample a distinction between lower-order skills and higher-order skills is
valid (see Clarke 1980, Eskey & Grabe 1988).

The fact that the Reading module was intended to measure a variety of reading
operations, and that some studies provided evidence for the emergence of certain
operations as factors separate from a general reading competence one, provided
the rationale for the formulation of the first research question.

Research Question 1:
Within which of the three Reading tests of the GPT Reading module are the
components of these tests testing different reading operations as claimed by the
module designers?

The starting point for the second research question was Weir & Porter’s (1994)
suggestion, based on reviewing some empirical data, that tests which include
items testing local lower-order skills might discriminate against the
micro-linguistically disadvantaged but otherwise competent reader. Similarly,
Alderson & Lukmani’s (1989) study has shown that weaker students tended to cope
quite well with the text and questions at the global level but this was not
matched by their performance on questions focusing on microlinguistic items at
the local level. Thus, the researcher set out to investigate whether candidates
were disadvantaged by the inclusion of any of the subtests, hence, the
formulation of the second research question where group and individual
performances are considered.

Research Question 2:
(A) Do groups at different levels of proficiency perform the same across the four
components of each test in the GPT Reading module?
(B) Do individuals perform the same across the four components of each test in
the GPT Reading module?

The discussion of the nature of the reading construct posed another question: if
sub-skills exist, do they interact with other factors such as text organisation
or readers’ familiarity with test content? It seemed quite obvious that drawing
inferences can be easy when the reader has adequate background knowledge about
the topic. When discussing reading comprehension, we cannot discuss just the
interaction between the reader and the reading operations, but also the
interaction between the reader and the text, in other words, the role of readers’
background knowledge in text comprehension. Several studies (e.g. Alderson &
Urquhart 1983, 1985 & 1988, Ausubel 1960, Clapham 1994 & 1996, Erickson & Molloy
1983, Ja’far 1992, Jensen & Hansen 1995, Kattan 1990, Koh 1985, Moy 1975, Peretz
& Shoham 1990, Shoham, Peretz & Vorhaus 1987, Tan 1990) have investigated the
effect of content familiarity on candidates’ performance in EAP reading tests.
Data emerging from these studies gives some tentative indication that there is a
relation between candidates’ background knowledge in their academic discipline
and their performance on EAP reading comprehension tests.

When developing the specifications for the GPT, designers debated whether there
should be separate academic modules for the disciplines involved. Ultimately,
they decided that the Reading module would have texts covering three broad
academic discipline areas: (1) Arts, Social Sciences, Administrative and Business
Studies (ASAB); (2) Sciences (SS); and (3) Dentistry, Medicine, and Health
Sciences (DMHS). This decision was based on three views. First, if one were to
design discipline-specific modules for all disciplines it would clearly be a very
large undertaking. Second, variation within a discipline area inevitably meant
that one module was by no means specific for all the candidates doing that
module. Third, there is as yet no body of evidence to support EAP testing claims
that candidates are disadvantaged if they take a test which is not in the area of
their discipline. The grouping of disciplines into three broad areas and
classification of candidates accordingly were based on the lists supplied by the
Student Affairs Divisions in Alexandria (Egypt) and Reading (UK) universities.

The third research question explored the value of including subject-specific
reading tests in EAP testing. What is meant by subject specific is ‘specific to
the broad discipline areas’, for example, specific to the area of Science
disciplines.

Research Question 3:
Will postgraduate candidates in three broad discipline areas perform better on a
Reading Comprehension test whose content is on a topic that is related to their
own broad discipline area than on a Reading Comprehension test whose content is
on a topic that is related to another broad discipline area, given that the texts
are of approximately comparable difficulty?

Studies in ESP testing examined at the time also appeared to suggest that other
factors are at play and that these factors seemed to be influencing the results
or leading to conflicting results. We could divide these factors into two types:
test-related factors, such as sample size, sample linguistic homogeneity, and
sample academic level; and text-related factors, such as text specificity, text
difficulty, and topic familiarity. Thus, the fourth research question attempted
to find out which of these factors contributes most to candidates’ performance on
EAP Reading Comprehension tests.

Research Question 4:
Which contributes more to candidates’ EAP reading proficiency scores: topic
familiarity, topic/text ease, or L2 proficiency level?

Research methods

Quantitative and qualitative research methods were used to investigate the above
research questions. This included: mindmapping, introspection procedures,
feedback questionnaires and statistical analysis.

Instruments

To ensure that the reading construct as defined by the test designers was
adequately captured by the test items, the items were matched against mindmaps of
the text
produced by subject and language experts. This procedure was used to justify the
existence of the test items, and to re-categorise the items under four subtests.
A panel of language and subject experts was asked to provide mindmaps of the
texts and identify key lexical words. They went through the operations the items
were supposed to test. A synthesis of information was then collected from the
mindmaps. Items which did not feature in this consensus or on which expert judges
widely disagreed were marked for possible exclusion from the tests.

The mindmapping procedure was followed by an introspection activity. The first
part of this activity was used to establish whether each item measured what it
was designed to measure. Another group of language and subject experts and a
group of proficient subject students were asked to introspect on what skill(s)
they use in answering the items. The second part of the activity consisted of
retrospection interviews with subject students. Interviews were conducted to
clarify those cases where candidates had arrived at the same response via a
process different from the expected one, and to ask why candidates had left an
item unanswered or had used more than one skill. The introspection procedure was
a way of gaining insights into how readers arrive at their answers and of
determining if test items were testing what they claimed to test.

The module was then administered and data was subjected to classical and Rasch
analyses. Decisions on which items to exclude or retain depended on the pooling
of evidence from three different data sources: meaning and lexical consensus,
introspection proforma, and item analysis.

In order to investigate research questions 2(A) and 4, it was necessary to have a
common measure of proficiency so that candidates could be placed into language
levels. Thus, a vocabulary and grammar test which was part of the Test of English
for Educational Purposes (TEEP) (see Weir 1988) was used. Candidates were divided
into three levels in accordance with Egyptian universities’ proficiency level
requirements for admission to postgraduate courses.

In order to investigate research question 4, two sets of questionnaires were used
to find out about text specificity, topic/text ease, and topic familiarity. The
subject lecturers’ questionnaire was used to find out how they assessed the
specificity, familiarity, and difficulty of the Reading module texts on a 4-point
scale (high, medium, low, not at all) according to their knowledge of their
students’ level of proficiency and of the discipline knowledge they thought their
students might use in answering the items. The term ‘specific’ here was used to
indicate how specific the topic was, how specific the vocabulary used in the
text, and how specific the non-linear information given in the text were to their
postgraduate students. Familiarity was defined in terms of the topic and the
rhetorical organisation of the texts and tasks required to answer the test items.
Difficulty was seen in terms of language in a text and item difficulty.

The test takers’ feedback questionnaire was used to find out about perceived
topic familiarity, and perceived topic ease/test bias. A 3-point scale was used
for those items. It should be pointed out that the questionnaires were
administered to test takers immediately after they had finished the tests. Since
candidates did not take any two tests immediately after each other, there is no
reason to believe that in answering the questionnaires candidates were comparing
texts.

Participants

Candidates who participated in this research comprised two sub-samples:
linguistically heterogeneous and linguistically homogeneous EAP learners. The
homogeneous sample consisted of 973 non-native speakers of English registering
for postgraduate courses at Alexandria University in Egypt. Candidates here share
the same L1 background (i.e. Arabic). They were classified into the three broad
discipline areas described above. The heterogeneous sample consisted of 355
non-native speakers of English. These were registering for postgraduate courses
at Reading University in England. Candidates in this sample had different L1s
(e.g. Chinese, French, Japanese, Danish, Italian, Turkish). Candidates were
classified into two broad discipline areas: Arts and Sciences. There is no
Medical group in the UK sample since Reading University does not provide courses
for candidates in this group.

Forty-five subject lecturers (of near-native proficiency in English) who were
teaching postgraduates in Alexandria University in Egypt participated in the
study. Lecturers in Arts disciplines were teaching at the faculties of Arts, Fine
Arts, Commerce and Tourism. Science disciplines lecturers were teaching at
Agriculture, Engineering and Science faculties. Lecturers from the Medical
disciplines were teaching at the faculties of Dentistry, Medicine, Nursing and
Pharmacy. No data was collected from subject lecturers in the UK due to practical
constraints.

Results and discussion

Research question 1

In order to investigate the first research question, qualitative data from
introspection proforma and retrospection interviews as well as quantitative data
from subtests’ inter-correlations and factor analysis were collected from
Egyptian and UK pre-sessional samples taking a single test: the Arts Test, the
Science Test, or the Medicine Test.

All three tests exhibited low inter-correlations between subtests measuring
global and local comprehension, and between subtests requiring expeditious and
careful reading. Factor analysis gave an indication that the tests were not
operating uni-dimensionally. It showed the consistent presence of at least a
second factor. It also appeared to suggest that candidates behave differently on
the operations being tested: a clear factor structure showing a distinction
between expeditious and careful reading occurred across a range of samples of EAP
candidates taking different tests. This is in line with Guthrie & Kirsch’s (1987)
and Carver’s (1992) findings that made a case for differentiating between reading
to
comprehend explicitly stated ideas and reading to locate specific information.

Similarly, introspection proforma and retrospective interviews indicated that the
operations the subject students reported using to answer the test items differed
according to the subtest they were answering. For example, in the scanning
subtest, students reported rapid inspection of the text; going backward and
forward in the text looking for specific words, dates, etc. In contrast, in the
reading carefully subtest students reported slow inspection of the text;
observance of the linearity and sequencing of the text. They read and reread in
order to establish more clearly and accurately the comprehension of main ideas.

On the whole, the answer to this research question is ‘Yes’. Findings from
qualitative and quantitative research methods appear to support the test
designers’ claim that the tests are measuring separable subskills, and lend
support to the argument for the existence of separate reading operations. They,
therefore, contradict the oft-expressed view that reading is a unitary construct.

Research question 2

For research question 2, group and individual performances in the linguistically
homogeneous and heterogeneous samples in single and paired data sets were looked
at. The Grammar Test was used as a measure of candidates’ general language
ability and to classify them into high, middle, and low proficiency level groups.
Cross-tabulations were used. The intention was to compare the performances of
individuals who passed and those who failed in each of the GPT Reading module
tests. Research findings provided evidence for significant differential
performance on the components of the tests.

In most cases candidates perform better on global items than on local items. This
seems to be in line with the findings of Alderson & Lukmani’s (1989) study.
Similarly, most of the evidence shows that candidates of different ability levels
seem to perform better on items requiring slow careful reading than those
requiring expeditious reading. This is in line with Beard (1972) and Weir (1983)
whose studies into students’ abilities indicate that ‘for many readers reading
quickly and efficiently posed greater problems than reading carefully and
efficiently’ (Weir 1998). This draws attention to Weir & Urquhart’s (1998) call
for ‘paying attention to expeditious reading strategies in both teaching and
testing’. It should be noted that candidates of different proficiency levels
performed the worst on scanning, with the low-level groups being the most
severely disadvantaged by the inclusion of scanning items in a Reading
Comprehension test.

The results of cross-tabulations for individual performances affirmed those
reported for the group data. The most interesting finding, however, came from the
paired data sets. These showed that, across two tests, not a single individual
performed consistently better on local than on global comprehension components,
or on expeditious than on slow careful reading components. In contrast, the
results showed a number of individuals who consistently passed on global and
failed on the local comprehension parts of both tests, and others who
consistently passed on slow careful but failed on the expeditious reading parts.

Overall, the findings indicate that candidates perform differentially on the
subtests. Some appear to be disadvantaged by the expeditious reading subtests
compared to the careful reading ones, while others appear to be disadvantaged by
the local comprehension subtests compared to the global ones. In certain
individuals, however, the case may be more marked in terms of global and local or
expeditious and slow. Individuals vary in their profile of proficiency – where
local comprehension might be weaker than global comprehension and expeditious
reading weaker than slow reading, for instance. Furthermore, these differences
may vary considerably with level of candidates and according to text. It is clear
from this data that a serious case can be made for the profiling of abilities in
each of the skill operations; otherwise false conclusions may be drawn about
candidates’ reading ability.

Research question 3

There seemed to be no straightforward answer to this research question. The
findings showed that the evidence is mixed. For the entire test population, no
significant difference was observed between the performance of the different
discipline groups. Candidates did not seem to either suffer or profit from taking
Reading tests in different discipline areas. This finding is compatible with
those of Carrell (1983) and Clapham (1993, 1994, 1996).

When looking at group performances in the paired data sets, significant
differences were found. Both discipline groups (Arts and Sciences) of the
linguistically heterogeneous sample appeared to suffer when taking the Science
Test and profit when taking the Arts Test. In contrast, each of the three groups
of the linguistically homogeneous sample (Arts, Sciences, Medicine) appeared to
be at an advantage when taking the Science Test and at a disadvantage when taking
the Arts Test. This picture was confirmed when considering individual
performances in the paired data sets.

In considering the findings of this research question, it should be noted that
the value of using a homogeneous sample is that candidates share the same L1,
similar instructional background, or previous learning experiences, that is,
variables that were not controlled for in the heterogeneous sample and might have
neutralised the subject effect for this sample. It should also be noted that the
texts in the GPT Reading module were selected from academic journals in the
appropriate broad discipline areas. They were expected to be appropriate and
specific to the relevant Reading module and, therefore, by implication to be
unsuitable for or unfamiliar to candidates in other disciplines. However, the
evidence provided by this research showed that, in some cases, this is not
necessarily the case. One possible explanation could be that studying in one
particular discipline area does not mean that candidates are ignorant about other
disciplines or unfamiliar with other rhetorical structures. They may well read
books and articles in disciplines outside their own academic field.

The findings of the third research question seem to indicate that if there is to
be one test catering for
education when deciding on the number and nature of texts to be used in an EAP
Reading test.

Given the above mixed evidence, it would probably be safer to use a variety of
texts from the broad discipline areas. If either one text or more than one is
opted for, there should be systematic ways followed in text selection. The next
section describes the basis on which texts should be selected.

Selection of texts

In selecting texts, the importance of face validity cannot be ignored if just one
academic Reading test is opted for. We cannot ignore what subject lecturers and
test takers said regarding text specificity and topic familiarity. In addition,
it might be hard to get an approval from university authorities to use a test
which seemingly does not look subject specific. The danger also exists that under
examination conditions some students might be upset by the apparently unfamiliar
material. In turn, they might not do as well as they should have.

On the other hand, if there is a need to create parallel EAP Reading tests, it
seems quite impossible to find texts which are similar in terms of specificity,
difficulty, and familiarity unless either a general academic text or a very
highly specific one is opted for. If the latter is chosen, then the number of
candidates who would be sitting for such a test is inevitably limited. In
addition, tests which are too specialised may assess subject matter knowledge in
a particular field more than the reading ability of the candidates, and thus
individuals who happen to have less subject matter knowledge might be
discriminated against. Thus one is forced to choose texts which are equally
comprehensible for, and generally accessible to candidates in all fields within
the broad discipline areas. They should come from an academic source and have an
academic nature. The rhetorical structure could be argumentative or
Introduction-Methods-Results-Discussion (IMRD), the former being more suitable to
humanities-oriented candidates and the latter to scientifically-oriented
candidates.

In other words, in developing Reading tests which cater for a large number of
candidates, there is a need to ensure that the chosen topic is fairly familiar to
all candidates so as to avoid bias caused by topic familiarity. Several texts of
different topics might be used to counter-balance the topic-familiarity effect.
The level of difficulty of the test should also be taken into account. The texts
also would have to be submitted to subject specialists and students to check that
no discipline is advantaged over another. These factors appear to be crucial to
test designers to get stable, reliable, and meaningful results. Thus what seems
to be needed is the development of a mechanism to screen texts for difficulty,
familiarity, and specificity.

Triangulation of data sources

In empirically validating the GPT Reading module, information was collected from
a variety of sources: experts’ mindmapping consensus, experts’ and subject
students’ introspection proforma, subject students’ retrospection interviews, and
item statistical analyses. Following such a procedure provided a sound basis for
the final version of the tests in the Reading module. The mindmapping consensus
eliminated idiosyncrasies that existed in content selection. The introspection
proforma appeared to enhance the probability that the required operations were
being tested. The retrospection interviews illuminated, to some extent, how the
behaviour test items produced may equate with the behaviour identified in the
theory-based model.

References

Alderson, J C (1990a) Testing Reading Comprehension Skills (Part One), Reading in a Foreign Language 6 (2), 425–438.
Alderson, J C (1990b) Testing Reading Comprehension Skills: Getting Students to Talk about Taking a Reading Test (Part Two), Reading in a Foreign Language 7 (1), 465–503.
Alderson, J C and Lukmani, Y (1989) Cognition and Reading: Cognitive Levels as Embodied in Test Questions, Reading in a Foreign Language 5 (2), 253–270.
Alderson, J C and Urquhart, A H (1983) The Effect of Students’ Background Discipline on Comprehension: a Pilot Study, in Hughes, A and Porter, D (Eds) Current Developments in Language Testing, London: Academic Press, 121–138.
Alderson, J C and Urquhart, A H (1985) The Effect of Students’ Academic Discipline on Their Performance on ESP Reading Tests, Language Testing 2 (2), 192–204.
Alderson, J C and Urquhart, A H (1988) This Test is Unfair: I’m not an Economist, in Carrell, P L, Devine, J and Eskey, D E (Eds) Interactive Approaches to Second Language Reading, Cambridge: Cambridge University Press, 168–183.
Anderson, N J, Bachman, L, Perkins, K and Cohen, A (1991) An Exploratory Study into the Construct Validity of a Reading Comprehension Test: Triangulation of Data Sources, Language Testing 8 (1), 41–66.
Ausubel, D P (1960) The Use of Advance Organisers in the Learning and Retention of Meaning Material, Journal of Educational Psychology 51, 267–272.
Beard, R (1972) Teaching and Learning in Higher Education, Harmondsworth: Penguin Books Ltd.
Berkoff, N A (1979) Reading Skills in Extended Discourse in English as a Foreign Language, Journal of Research in Reading 2 (2), 95–107.
Carrell, P L (1983) Some Issues in Studying the Role of Schemata or Background Knowledge in Second Language Comprehension, Reading in a Foreign Language 1 (2), 81–92.
Carver, R P (1992) Reading Rate: Theory, Research and Practical Implications, Journal of Reading 36 (2), 84–95.
Clapham, C (1993) Is ESP Justified? in Douglas, D and Chapelle, C (Eds) A New Decade of Language Testing Research, TESOL, 257–271.
Clapham, C (1994) The Effect of Background Knowledge on EAP Reading Test Performance, unpublished PhD thesis, University of Lancaster.
Clapham, C (1996) The Development of IELTS: A Study into the Effect of Background Knowledge on Reading Comprehension, Cambridge: University of Cambridge Local Examinations Syndicate.
Clarke, M A (1980) The Short-circuit Hypothesis of ESL Reading – or When Language Competence Interferes with Reading Performance, The Modern Language Journal 64 (2), 104–109.
Cohen, A D (1984) On Taking Tests: What the Students Report, Language Testing 1 (1), 70–81.
Davis, F B (1968) Research in Comprehension in Reading, Reading Research Quarterly 3 (4), 499–545.
Erickson, M and Molloy, J (1983) ESP Test Development for Engineering Students, in Oller, J W (Ed.) Issues in Language Testing Research, Rowley, Mass.: Newbury House, 280–300.
Eskey, D E and Grabe, W (1988) Interactive Models for Second Language Reading: Perspectives on Instruction, in Carrell, P L, Devine, J and Eskey, D E (Eds) Interactive Approaches to Second Language Reading, Cambridge: Cambridge University Press, 223–236.
Guthrie, J T and Kirsch, I S (1987) Distinctions Between Reading Comprehension and Locating Information in Text, Journal of Educational Psychology 79, 220–228.
Hosenfeld, C (1977) A Preliminary Investigation of the Reading Strategies of Successful and Nonsuccessful Second Language Learners, System 5 (2), 110–123.
Ja’far, W M (1992) The Interactive Effects of Background Knowledge on ESP Reading Comprehension Proficiency Tests, unpublished PhD thesis, University of Reading.
Jensen, C and Hansen, C (1995) The Effect of Prior Knowledge on EAP Listening Test Performance, Language Testing 12 (1), 99–119.
Kattan, J (1990) The Construction and Validation of an EAP Test for Second Year English and Nursing Majors at Bethlehem University, unpublished PhD thesis, University of Lancaster.
Koh, M Y (1985) The Role of Prior Knowledge in Reading Comprehension, Reading in a Foreign Language 3 (1), 375–380.
Lunzer, E, Waite, M and Dolan, T (1979) Comprehension and Comprehension Tests, in Lunzer, E and Gardner, K (Eds) The Effective Use of Reading, London: Heinemann Educational, 37–71.
Mohammed, M A H and Swales, J M (1984) Factors Affecting the Successful Reading of Technical Instructions, Reading in a Foreign Language 2 (2), 206–217.
Moy, R (1975) The Effect of Vocabulary Clues, Content Familiarity and English Proficiency on Cloze Scores, unpublished PhD thesis, University of California.
Nevo, N (1989) Test-taking Strategies on a Multiple-choice Test of Reading Comprehension, Language Testing 6 (2), 199–215.
Peretz, A S and Shoham, M (1990) Testing Reading Comprehension in LSP: Does Topic Familiarity Affect Assessed Difficulty and Actual Performance?, Reading in a Foreign Language 7 (1), 447–455.
Rosenshine, B V (1980) Skill Hierarchies in Reading Comprehension, in Spiro, R J, Bruce, B C and Brewer, W F (Eds) Theoretical Issues in Reading Comprehension, Hillsdale, NJ: Erlbaum, 535–554.
Rost, D H (1993) Assessing the Different Components of Reading Comprehension: Fact or Fiction? Language Testing 10 (1), 79–92.
Shoham, M, Peretz, A S and Vorhaus, R (1987) Reading Comprehension Tests: General or Subject Specific?, System 15 (1), 81–88.
Tan, S H (1990) The Role of Prior Knowledge and Language Proficiency as Predictors of Reading Comprehension among Undergraduates, in de Jong, J H A L and Stevenson, D K (Eds), Individualising the Assessment of Language Abilities, Clevedon, PA: Multilingual Matters, 214–224.
Thorndike, R L (1973) Reading as Reasoning, Reading Research Quarterly 9, 135–147.
Vipond, D (1980) Micro- and Macro-processes in Text Comprehension, Journal of Verbal Learning and Verbal Behaviour 19, 276–296.
Weir, C J (1983) Identifying the Language Problems of Overseas Students in Tertiary Education in the United Kingdom, unpublished PhD thesis, University of London.
Weir, C J (1988) The Specification, Realisation and Validation of an English Language Proficiency Test, ELT Documents: 127, Modern English Publications: The British Council.
Weir, C J (1998) The Testing of Reading in a Second Language, Language Testing & Assessment 7, Kluwer: Dordrecht.
Weir, C J and Porter, D (1994) The Multi-Divisible or Unitary Nature of Reading: The Language Tester between Scylla and Charybdis, Reading in a Foreign Language 10 (2), 1–19.
Weir, C J and Urquhart, A H (1998) Reading in a Second Language: Process and Product, Longman.
Zuck, L V and Zuck, J G (1984) The Main Idea: Specialists and Non-specialist Judgements, in Pugh, A K and Ulijn, J M (Eds), Reading for Professional Purposes: Studies and Practices in Native and Foreign Languages, London: Heinemann Educational, 130–145.
This short summary is based on a doctoral thesis submitted to the Faculty of
Education, Cambridge University (UK) in 2008. The research was funded by
Cambridge ESOL. The PhD was supervised by Dr Neil Jones and Dr Edith Esch. The
PhD research focused on Cambridge ESOL’s Asset Languages assessments.

This mixed-methods PhD explores and compares the reading proficiency of secondary
school learners of German, Japanese and Urdu in England with the aim of
investigating and shedding light upon the feasibility of relating learners of
different languages and contexts to the same framework. This research has
important implications within education, particularly given the use of frameworks
such as the National Curriculum for Modern Foreign Languages (DfES and QCA 1999)
and the increasing use of the Common European Framework of Reference (CEFR
hereafter) (Council of Europe 2001) both within England and Europe.

‘Can Do’ statements are commonly used, and are being promoted for wider adoption
(see Council of Europe 2008), in educational assessment to describe the level of
a learner’s reading proficiency. However, there is no research as to how, or
whether, such ‘Can Do’ frameworks can be applied to all languages, particularly
non-Latin script or community languages. The majority of research in this area
has focused on learners of English, although the few single language research
studies undertaken indicate that reading in languages like Japanese and Urdu
requires different processing strategies from reading in alphabetic languages
such as German for learners with English as their first language. Existing
research has also failed to relate findings to proficiency level, making it
impossible to compare findings across studies.

This thesis employed a mixed-methods approach, using self-assessment ‘Can Do’
surveys and think-aloud protocols, to compare the reading proficiency of
secondary school learners of German, Japanese and Urdu in England. Findings show
that statistically the same three factors best represent learners’ understanding
of reading proficiency across all three languages. However, there are also strong
differences. For example, the difficulty of script acquisition in Japanese
impacts on learners’ understanding of the construct, while learners of both
Japanese and Urdu were unable to scan texts in the way learners of German were
able to. Urdu learners under-rated their ability, not taking into account the
wide range of natural contexts in which they use Urdu outside the classroom. The
findings also illustrate how Urdu learners use their spoken knowledge of Urdu as
a resource when reading. Finally, this research demonstrates that the construct
of reading in the National Curriculum for Modern Foreign Languages is not
endorsed by any of the learner groups, which is worrying for language education
and assessment within England and raises the need for further research.

References

Council of Europe (2001) Common European Framework of Reference for Languages: Learning, Teaching, Assessment, Cambridge: Cambridge University Press.
Council of Europe (2008) Recommendations of the Committee of Ministers to member states on the use of the Common European Framework of Reference for Languages (CEFR) and the promotion of plurilingualism, Strasbourg, Adopted by the Committee of Ministers on 2 July 2008.
DfES and QCA (1999) The National Curriculum for England: Modern Foreign Languages, London: DfEE/QCA.
This short summary is based on a Master’s thesis submitted to Anglia Ruskin University in 2007. The research was funded by Cambridge ESOL.
Developers of tests of languages for specific purposes are faced with the challenge of creating tests which allow for an appropriate interaction between subject knowledge and language ability in relation to the target language use domain. This dissertation was completed while the International Certificate in Financial English (ICFE) was under development and set out to establish the extent to which the Reading paper meets this challenge. The research aimed to establish the degree of specificity of the ICFE Reading paper, to try to identify the characteristics that make it specific, and to find out how appropriate it is as a testing instrument for people working in or intending to work in the financial domain. There were three stages in this research, each comparing ICFE to tests of General and Business English at the same level (CEFR levels B2/C1). In the first stage, a questionnaire was administered to both subject specialists and non-specialists. It was designed to measure the subject specificity and appropriacy of the texts used in ICFE. In the second stage, a questionnaire was given to testing specialists only. It was designed to measure the degree of specificity of various aspects of context validity in ICFE in comparison to Business and General tests. The third stage involved a corpus study which aimed to identify some of the characteristics of the core language of Financial English, by comparing Financial English texts to Business and General English texts. The results taken together suggest that ICFE might be placed nearer the specific end of the ‘specificity continuum’ than the General and Business English tests, and that although there is considerable fuzziness between Financial and Business English, distinct linguistic differences were found between Financial and General English and the beginning of a core Financial lexis was identified. It was found that the degree of specificity of ICFE made it appropriate as a testing instrument in relation to the target domain. For more details on one of the aspects of this study see Wright (2008).
References
Wright, A (2008) A corpus-informed study of specificity in Financial English: the case of ICFE Reading, Research Notes 31, 16–21.
This paper is based on a Master’s thesis submitted to King’s College London (UK) in 2008. The thesis was supervised by Susan Maingay and Dr Nick Andon.
When we speak, we do not merely transfer information from one individual to another; we also give expression to a whole range of emotions, attitudes and evaluations. This phenomenon, ‘pervasive, because no text or utterance is ever absolutely free from it [and] elusive, because it may be difficult to say exactly what it is that gives the text or utterance that certain quality’ (Dossena & Jucker 2007:7), is known as affect.
At present, affect tends to sit on the periphery of models of language and language proficiency, treated as an ‘optional overlay of emotion’ (Thompson & Hunston 2000:20) to the expression of ‘core’ informational meaning.
Affect can be broken down into two core areas: emotion and attitudes. Emotion covers feelings such as anger and happiness, while attitudes are an individual’s opinions of the world, formed through predisposition, experience and ideology, and which colour his or her perceptions. Attitudes are realised in language by evaluation (Thompson & Hunston 2000), which consists essentially of good or bad value judgements. Evaluation ‘does not occur in discrete items but can be identified across whole phrases, or units of meaning, and ... is cumulative’ (Hunston 2007:39).
Affect can be expressed towards many different objects. These are most likely to be previous utterances, the proposition being made, agents implicated within the proposition, the listener or the speaker; there could, however, be still more.
Many different resources are employed in the expression of affect, and they interact in complex and sometimes unpredictable ways. To reflect this, this study is grounded in a complex systems view of language (Larsen-Freeman & Cameron 2008). The study considers how different elements of language interact within a specific context to create affective meaning.
Complex systems theory and language
‘Tidy explanations survive as long as all that has to be explained is the meaning of sentences invented by armchair linguists’ (Coates 1990:62).
Coates captures one of the tensions at the heart of applied linguistics. By focusing on small, manageable areas of the language and producing clear, tidy explanations, we can lose sight of the fact that real-life language simply does not behave in this fashion. In reality, the production of meaning is a highly complex process involving the interaction of a variety of components: lexis, grammar, phonology, discourse-level features, paralinguistic and non-verbal features and, crucially, context. Indeed, language exhibits many, if not all, of the properties of a complex dynamic system (Larsen-Freeman & Cameron 2008), and treating it as such provides a suitable framework for investigating the expression of affect.
Complex systems involve a large number of components interacting, often in a non-linear fashion (i.e. when a change in input results in a disproportionate change in output). Complex systems exhibit certain key features. Let us consider these, following Larsen-Freeman & Cameron (2008), with examples of how they relate to language:
1. Heterogeneity of elements and agents: the elements or agents in a complex system are often extremely diverse, and can be processes rather than entities, or even complex subsystems. Although the components may be diverse, they are interconnected – change in one component affects others. Language elements include phonetic and phonological features, lexis, grammar and discourse-level features; agents include users of the language (at an individual level) and society (at a higher level).
2. Dynamics: complex systems are in a permanent state of flux. Change takes place on scales (time) and levels (size): change may occur at the level of the whole system, a subsystem within it, or only a very small part of it. Different levels and scales influence each other upwards and downwards. Languages change on both micro levels (such as the introduction of a new word) and macro levels (such as changes in the formation of tenses), and over both short and long scales.
3. Non-linearity: due to the interconnected nature of the elements in a complex system, change can result which is out of proportion to the external stimulus. An example of this is the famous ‘butterfly effect’ (weather is an example of a complex system). Some language innovations spread rapidly through a language while others are ignored. Similarly, a slight change of intonation could lend a completely different interpretation to an utterance.
4. Openness: complex systems are open. They can – and must – take on new elements and energy in order to remain in a state of dynamic stability, where the system is stable but not static or fixed. New words are constantly being created, either to label new developments in society and the world (the source of external energy), or from other languages through ‘borrowed’ words.
5. Adaptation: many complex systems are adaptive, meaning that change in one part of the system leads to change in the system as a whole, as it adapts to the new situation. Although languages are in constant flux, the basic requirement of intelligibility dictates that the language incorporates changes by adapting to new circumstances without losing its overall integrity.
6. The importance of context: context is crucial when considering complex systems – indeed, the context
within which a system operates cannot be considered separate from the system itself; it actually forms a part of the system. For example, no utterance in any language can be fully interpreted without consideration of the context it was uttered in, such as who uttered it, to whom and in what situation.
7. Constructions: construction grammar (Goldberg 2003) provides a model of grammar which is consistent with complex systems theory, and within which we shall frame this study. Constructions range from morphemes through words and chunks, up to abstract grammatical structures. Constructions carry inherent semantic or discoursal functions, rather than being ‘empty’ syntactic shells for meaning-carrying words. These semantic meanings can change over time – for example the be going to construction originally only denoted movement: I’m going to the shops (literally), but developed its present future meaning, as in: I’m going to buy some bread there (Perez 1990).
Discourse and complex systems
We try to understand language in use ‘by looking at what the speaker says against the background of what he might have said but did not, as an actual in the environment of a potential’ (Halliday 1978:52). This Systemic Functional viewpoint is echoed in a complex systems approach, where discourse is ‘action in complex dynamic systems nested around the microgenetic moment of language using’ (Larsen-Freeman & Cameron 2008:163). Individuals adapt their utterances to take into account all relevant contextual features.
In discourse, different scales and levels interact to create complex systems phenomena we have already encountered: self-organisation (the progression of the discourse), emergence (of meaning and new semiotic entities within the discourse) and reciprocal causality (between the interlocutors, and between the speakers and the discourse itself). The expression of affective meaning can be viewed as an emergent phenomenon from the interaction of the elements and agents of the complex system of discourse.
Affective resources
Speakers use a range of resources within the language to create affective meaning: lexis, grammar, phonology, discourse-level features and context. We will term these affective resources, and consider them in turn.
Lexis
Individual lexemes
Some words and phrases serve purely affective functions; brilliant, for example, has no ideational meaning beyond the evaluative. However, the affective meaning of an utterance is not determined by lexis alone. The utterance That was brilliant could convey its ‘natural’ semantic meaning, but in a different context and with sarcastic intonation, it could also convey precisely the opposite meaning. As Volos̆inov (1986:68) notes regarding the malleability of language: ‘What is important for the speaker about a linguistic form is not that it is always a stable and self-equivalent signal, but that it is an always changeable and adaptable sign.’ This is not to negate the importance of lexis, but merely to underline that it is one of several affective resources employed in an utterance; this holds true of all affective resources. In analysing a text, we need to consider the interaction of the affective resources.
There are other lexemes which encode ideational meanings whilst also expressing an affective connotation; these often exist in apposition to more affectively neutral alternatives. For example, the words dog, doggie, cur and mutt all have the same ideational referent, but encode rather different affective connotations.
Semantic prosody
A form of connotation can exist at another level through semantic prosody – how ‘a given word or phrase may occur most frequently in the context of other words or phrases which are predominantly positive or negative in their evaluative orientation’ (Channell 2000:38). In this way, connotations of collocants are ‘inherited’ by the word or phrase, often lending it an affective meaning which can develop across a text or texts. Corpus analysis of semantic prosodies has produced some interesting, not always intuitive, results – the phrase par for the course, for example, almost exclusively appears in cases of negative evaluation, so although it may not directly encode a negative connotation, it carries a negative semantic prosody (ibid.).
Grammar
Affective constructions
Wierzbicka (1987) argues that certain constructions encode specific affective meanings that cannot be accounted for by reference to conversational implicature alone. I will term such constructions, which encode an affective meaning either instead of or in addition to an ideational meaning, affective constructions. A simple example of an affective construction is the What’s X doing Y? construction which expresses incongruity, e.g. What’s this scratch doing on the table? (Kay & Fillmore 1999).
Other constructions, particularly focusing constructions, may contribute to the expression of affect indirectly. For example, non-defining which-clauses, particularly continuative ones, have been shown to encode an evaluative function in the majority of cases (Tao & McCarthy 2001). The use of such marked forms may be considered a case of grammatical metaphor (see below).
Grammatical metaphor
‘A meaning may be realised by a selection of words that is different from that which is in some sense typical or unmarked. From this end, metaphor is variation in the expression of meanings’ (Halliday 1994:341).
Halliday’s concept of grammatical metaphor, analogous to the concept of lexical metaphor, holds that grammatical choices are made in the production of any utterance, and that such choices are meaningful. Halliday uses the term
Context
Sociolinguistic considerations
The expression of affect is not a sociolinguistic phenomenon. Sociolinguistics describes how external sociological factors influence and constrain language; affect, on the other hand, is intensely personal and internal. However, sociolinguistic factors constitute a key element in
[Figure: labels ‘Less use of affective resources’ and ‘More use of affective resources’; ‘Realisation of speaker’s affective judgement’]
MP, leader of the British political party the Liberal Democrats (prior to his becoming Deputy Prime Minister). The sample features a question from a female caller about immigration policy, specifically whether Clegg would ‘close the borders’ of the UK.
The interaction patterns within the sample are complex. While the speaker is ostensibly addressing Clegg with her question, she has another audience – the radio listeners. Indeed, it could be argued that the listeners are her primary audience, since the speaker’s motivation for phoning in to such a programme seems to be to make a point rather than to make a genuine enquiry of Mr Clegg.
The medium of a radio phone-in affects the exchange. The lack of visual contact prevents the use of non-verbal communication, which means that the language itself carries all the affective meaning.
From a sociolinguistic perspective, the setting of a radio phone-in, and the position of Clegg as a senior politician, are likely to have the following effects:
• the dual audience means there are two sets of sociolinguistic norms at play – those between the speakers and the radio audience, and those between the speakers themselves
• the ‘exposed’ nature of the discussion, conducted in such a public forum, is likely to lead to circumspection, since the speakers will not want to appear unreasonable.
Agents
Nick Clegg has been an MEP, Liberal Democrat spokesperson for Europe (2005–06) and Home Affairs spokesperson (2006–07). In the past, he has described the issue of immigration as ‘the dog-pit of British politics – a place only the political rottweilers are happy to enter’ and argued for a ‘liberal managed immigration system’ (Clegg 2007). The caller is Mary, a woman from Coventry. The programme was hosted by Victoria Derbyshire, a BBC Radio presenter.
Discussion
The analysis focused on the following extended turn by the caller, although the previous (and subsequent) parts of the discussion were also considered.
‘Um … We have open borders within Europe. Millions of people can come in here potentially. Um … (unclear) I want to ask you, when did you, or any of the other two leaders, ask the people of this country if this is what they want? It’s not your country. Will you close the borders within Europe if we find that we are totally swamped? Our culture and our way of life have changed beyond belief, people are scared of the extent of immigration, I believe one in four in Boston, Lincolnshire is an immigrant. Would you close our borders to people from Europe, let alone the rest of the world, if the people of this country became so distressed at … you know, I just want to know – would you close the borders, or are you so keen on Europe that you don’t care how many people come here?’
The text reveals multiple objects of evaluation:
• Immigration and immigrants. Immigrants are subdivided into those from Europe and those from the rest of the world.
• The people of this country (or just people), consistently positioned as victims in the text: people are scared of the extent of immigration; when did you … ask the people of this country if this is what they want? The speaker positions herself with this group, which includes the listeners.
• Politicians – specifically party leaders, and in particular Clegg himself, consistently evaluated negatively.
Throughout her discourse, the speaker employs bare assertions – statements with no hedging employed – to create a strongly monoglossic feel, not acknowledging any alternative viewpoints. The evaluation builds through the text; we will consider two utterances of particular interest in depth (for full analysis, see Elliott 2008).
Utterance 1:
‘Will you close the borders within Europe if we find that we are totally swamped? Our culture and our way of life have changed beyond belief, people are scared of the extent of immigration …’
The speaker uses the strongly negative term swamped. The term swamped has an interesting developmental history. It has a particular resonance in British political discourse on immigration – Margaret Thatcher was accused of racism when she used the term in 1979, and further controversy was caused in 2002 by the then Home Secretary David Blunkett’s use of the term. The term swamped is so loaded as to create a qualitatively different feel to the discourse in affective terms. Also, beyond belief serves a similarly strong role.
Utterance 2:
‘Would you close our borders to people from Europe, let alone the rest of the world, if the people of this country became so distressed at …’
Use of the let alone construction posits a scalar relationship between Europe and the rest of the world (Fillmore, Kay & O’Connor 1988), which would naturally be interpreted in terms of the relative desirability of immigration from the two parts of the world; this scalar relationship is reinforced by marked stress and intonation accorded to both let alone and rest.
Here, so is heavily marked, with marked stress, a markedly low fall, heavy sibilance on the consonant /s/ and an elongated diphthong /əʊ/, conveying an impression of anger (Kienast, Paeschke & Sendlmeier 1999, Walsh 1968).
The utterance is left unfinished, which naturally raises the question of how it would finish; grammatically, completion with a that-clause to create a cause-and-effect relationship is suggested. We can only speculate as to what the unexpressed effect would be, but we can note the following:
• The cause if the people ... became so distressed at ... evokes a fairly extreme set of circumstances, which naturalises an expectation that the response would be proportionally strong.
• The impression of an extreme response from the British people is reinforced by the fact that the utterance remains unfinished. After producing some strong, direct statements, the speaker feels unable to articulate these
consequences. She then appears to backtrack – you know, I just want to know … – suggesting a reasonable position on the part of the speaker, especially with the use of just (with a low intonational fall).
We cannot know how the speaker intended to complete the utterance, but what is important is the interpretation that the unfinished utterance, in conjunction with previous utterances, naturalises – the perceived attitude. This seems to be that the consequences of the people of Britain becoming so distressed are rather dark – too dark to be spelled out on a radio programme.
As can be seen, the utterances need to be considered in the light of the full text, plus surrounding turns and the wider context, to realise how the interaction of the different affective resources creates the full evaluative effect.
Global overview
• The use of noun phrases (the people of this country, people) and pronouns (we, our) throughout to position the people of Britain as victims of both immigration and the politicians Mary holds responsible. The use of the noun phrase the people of this country is interesting; concordance analysis shows that it almost exclusively occurs in political rhetoric, and that it carries a strong positive semantic prosody (Elliott 2008).
• The repeated use of bare assertions (often in conjunction with subjective statements) lends a monoglossic feel to the whole turn: the speaker does not acknowledge alternatives. This is reinforced by (phonologically) prosodic features such as a rapid speech rate for such utterances and low final falls in intonation.
• The evaluation builds throughout the turn, reaching a peak with the unfinished utterance, as the layers of evaluation interact to reinforce each other and amplify the effect.
• The complex interaction patterns and multiple audiences have an effect on the speaker as she attempts to tailor her message to the different audiences and conform to different sociolinguistic norms simultaneously (it may have been an inability to reconcile these with the intended message that led the speaker to abort the utterance).
What is particularly striking is how different affective resources interact to produce the overall effect, and how the evaluation is dependent on previous utterances (and previous texts, as in the case of swamped). An analysis focusing on only one or two of these areas, or on individual utterances in isolation, would not be able to account fully for the extremely strong affective meaning expressed throughout.
Conclusions
We have seen that different elements of language combine to create affective meaning in a highly interrelated manner, but that some individual elements can create a particularly strong effect which reverberates throughout the whole text. Even what is not said can often contribute greatly to the overall effect, as the unfinished utterance exemplifies.
The text we examined was a telephone-based exchange with a whole host of other contextual and sociolinguistic factors in play relating to participants, medium, (multiple) audiences and interaction patterns. The last two points in particular raise interesting questions for future research regarding their effects, since they apply whenever more than two people are involved in an exchange, even in a passive listening role.
These reflections raise questions regarding models of language and language proficiency – affective meaning, a central plank of communication, and often its main motivation, is underrepresented in current models and is a prime candidate for in-depth exploration, which would enrich our understanding of language as a whole. Similarly, the study of its progression as a key part of language proficiency could reap dividends, with consequences for language assessment – although obstacles such as the high context-sensitivity and deeply personal nature of affective communication are by no means easy to overcome within an assessment context.
References and further reading
Carter, R and McCarthy, M (1995) Grammar and the spoken language, Applied Linguistics 16 (2), 141–158.
Channell, J (2000) Corpus-based analysis of evaluative lexis, in Hunston, S and Thompson, G (Eds) Evaluation in Text: Authorial Stance and the Construction of Discourse, Oxford: Oxford University Press, 38–55.
Clegg, N (2007) Immigration in the 20th Century, speech at Liberal Democrats Conference 2007, retrieved from http://www.nickclegg.org.uk/index.php?option=com_content&task=view&id=219&Itemid=45.
Coates, J (1990) Modal meaning: the semantic-pragmatic interface, Journal of Semantics 7, 53–63.
Dossena, M and Jucker, A (2007) Introduction, Textus XX, 7–16.
Elliott, M (2008) The Expression of Affect in Spoken English: a case study, unpublished MA thesis, King’s College London.
Fillmore, C, Kay, P and O’Connor, M (1988) Regularity and idiomaticity in grammatical constructions: the case of let alone, Language 64 (3), 501–538.
Gobl, C and Chasaide, A (2003) The role of voice quality in communicating emotion, mood and attitude, Speech Communication 40, 189–212.
Goldberg, A (2003) Constructions: a new theoretical approach to language, Trends in Cognitive Sciences 7 (5), 219–224.
Grice, H (1975) Logic and Conversation, in Cole, P and Morgan, J (Eds) Syntax and Semantics Volume 3: Speech Acts, London: Academic Press, 41–58.
Halliday, M (1978) Language as a Social Semiotic, London: Arnold.
Halliday, M (1994) An Introduction to Functional Grammar (2nd ed.), London: Arnold.
Halliday, M (2003) On Language and Linguistics (edited by Webster, J), London: Continuum.
Hunston, S (2007) Using a corpus to investigate stance quantitatively and qualitatively, in Englebretson, R (Ed.) Stancetaking in Discourse, Amsterdam: John Benjamins, 27–48.
Jenkins, J (2000) The Phonology of English as an International Language, Oxford: Oxford University Press.
Kay, P and Fillmore, C (1999) Grammatical constructions and linguistic generalizations: the what’s X doing Y? construction, Language 75 (1), 1–33.
Kienast, M, Paeschke, A and Sendlmeier, W (1999) Articulatory reduction in emotional speech, EUROSPEECH ‘99, 117–120.
Larsen-Freeman, D and Cameron, L (2008) Complex Systems and Applied Linguistics, Oxford: Oxford University Press.
Marsen, S (2006) How to mean without saying: presupposition and implication revisited, Semiotica 160, 243–263.
Martin, J and White, P (2005) The Language of Evaluation: Appraisal in English, Basingstoke: Palgrave Macmillan.
Mumford, K and Power, A (2003) East Enders, Bristol: The Policy Press.
Nariyama, S (2006) Pragmatic information extraction from subject ellipsis in informal English, Proceedings of the 3rd Workshop on Scalable Natural Language Understanding, 1–8.
Perez, A (1990) Time in motion: grammaticalisation of the be going to construction in English, La Trobe University Working Papers in Linguistics 3, 49–64.
Stefanowitsch, A and Gries, S (2003) Collostructions: investigating the interaction of words and constructions, International Journal of Corpus Linguistics 8 (2), 209–243.
Tao, H and McCarthy, M (2001) Understanding non-restrictive which-clauses in spoken English, which is not an easy thing, Language Sciences 23, 651–677.
Tarone, E (1973) Aspects of intonation in Black English, American Speech 48 (1–2), 29–36.
Thompson, G and Hunston, S (2000) Evaluation: An introduction, in Hunston, S and Thompson, G (Eds) Evaluation in Text: Authorial Stance and the Construction of Discourse, Oxford: Oxford University Press, 1–27.
Veltman, R (2003) Phonological metaphor, in Simon-Vandenbergen, A-M, Taverniers, M and Ravelli, L (Eds) Grammatical Metaphor, Amsterdam: John Benjamins, 311–335.
Volos̆inov, V (1986) [1929] Marxism and the Philosophy of Language, Cambridge, MA: Harvard University Press.
Walsh, M (1968) Explosives and spirants: primitive sounds in cathected words, Psychoanalytic Quarterly 37, 199–211.
Wierzbicka, A (1987) Boys will be boys: ‘radical semantics’ vs. ‘radical pragmatics’, Language 63 (1), 95–114.
Zwicky, A (2005) Saying more with less, Language Log, retrieved from http://158.130.17.5/~myl/languagelog/archives/2005_03.html.
This short summary is based on a doctoral thesis submitted to Columbia University, New York City (US) in 2004. The PhD was supervised by Professor James Purpura.
This discourse-based study, which was undertaken as part of a doctoral degree, investigated paired test taker discourse in the First Certificate in English (FCE) Speaking test. Its primary aim was to focus on fundamental conversation management concepts, such as overall structural organisation, turn-taking, sequencing, and topic organisation of the paired test taker interaction. The analysis highlighted global patterns of interaction in the peer test taker dyads and salient discourse features of interaction. The three distinct patterns of interaction which emerged were termed ‘collaborative’, ‘parallel’, and ‘asymmetric’. The patterns of interaction were distinguished based on the dimensions of mutuality and equality, and were conceptualised as continua ranging from high to low. In addition, the dimension of conversational dominance, operationalised as ‘participatory’, ‘sequential’, and ‘quantitative’, was found to intersect with the dimensions of mutuality and equality, leading to sub-groups within each interactional pattern of high or low conversational dominance. The second goal of the study was to investigate a possible relationship between the patterns of peer–peer interaction and the FCE score for ‘interactive communication’ (IC). The aim was to understand more accurately the relationship between the discourse generated by the task and the scores for ‘interactive communication’, and to provide some validity evidence for the IC scores. The results showed that the high scorers mostly oriented to a collaborative pattern of interaction, while the low scorers generally oriented to a parallel pattern of interaction, as would have been expected. The significance of the study lies in the deeper understanding it provides of paired oral test interaction in the FCE and the construct of conversation management. This study also holds implications for FCE examiner training as it provides insights which could lead to more accurate and consistent assessment of FCE candidate output. A further contribution of the present study is the recommendations it provides for the performance descriptors used for ‘interactive communication’ in the FCE assessment scales, which would ultimately lead to a fairer test. For more details on this study, see Galaczi (2003, 2008).
References
Galaczi, E D (2003) Interaction in a paired speaking test: the case of the First Certificate in English, Research Notes 14, 19–23.
Galaczi, E D (2008) Peer–Peer Interaction in a Speaking Test: The Case of the First Certificate in English Examination, Language Assessment Quarterly 5 (2), 89–119.
©UCLES 2010 – The contents of this publication may not be reproduced without the written permission of the copyright holder.
C A M B R I D G E E S O L : R E S E A R C H N OT E S : I SS U E 4 2 / N OV E M B E R 2 0 1 0 | 23
This short summary is based on a doctoral thesis submitted to the University of Cambridge (UK) in 2006. The PhD was supervised by Dr Henriëtte Hendriks.
The aim of this thesis is to shed light on the nature of adult second language acquisition, factors guiding the acquisition and the ways in which these factors interact. This is achieved through exploring how English learners of Serbian and Serbian learners of English acquire another way of expressing dynamic spatial relations (motion) in a second (foreign) language.
Talmy (1985) divides languages into:
a. satellite-framed, typically encoding Path in satellites and Manner in motion verbs (e.g. The bottle floated out) and
b. verb-framed, typically encoding Path in motion verbs and Manner, if expressed at all, outside the verb (e.g. La botella salió flotando – The bottle exited floating).
English and Serbian were both classified as satellite-framed languages within Talmy’s typology. However, recent research revealed that Serbian differs to a certain extent from English as to where Manner and Path are typically expressed (Filipović 2002), and as to the frequency of expression of Manner. Therefore, Filipović (2002) reclassified Serbian placing it midway in the continuum satellite-framed>Serbian>verb-framed. The contribution of the non-acquisition part of the thesis resides in providing further support for the reclassification of Serbian, based on the analysis of the spoken mode of language use and systematic examination of attention to Manner (as reflected in the frequency of Manner mention). The findings show that:
a. when they want to express Manner in boundary-crossing situations (e.g. entering, exiting, crossing), Serbian native speakers most frequently opt for the verb-framed pattern of expressing Path in the verb and Manner outside it when using their mother tongue, and
b. they omit Manner information considerably more frequently than English native speakers when speaking in their mother tongue, even when Manner is not inferable from the context.
Using the Interlanguage approach, the main, acquisition-related part of the thesis examines how lower-intermediate, upper-intermediate and advanced learners express motion at a given stage of the acquisition process, how their linguistic means develop and what factors influence the acquisition. According to this approach, which has proved fruitful for analysing the acquisition process of beginners, learners’ interlanguage and its development over time are systematic. This systematicity cannot be directly related to either the first or the second language. The acquisition paths exhibit similarities across different (first – L1, and second – L2) language pairings, being influenced mostly by universal, and only marginally by language-specific factors, since the interlanguage of beginners is syntactically and semantically a very simple system. Previous studies on higher-level learners, whose interlanguages are more complex syntactically and semantically, document mostly language-specific influences. The present thesis set out to investigate whether universal characteristics of learners’ development persist among learners beyond the beginning stage, or whether only language-specific influences hold sway, how all of them manifest and what their scope is. Since the learners examined are beyond the beginning stage, the over-arching hypothesis was that language-specific influences would be stronger than among beginners and acquisition paths not so homogenous, yet factors other than first or second language may bring out similarities in the interlanguages and acquisition paths of learners with different first and second languages.
One of the contributions of the present thesis resides in showing that even the interlanguage of learners beyond the beginning stage shows similarities unrelated to the first or second language, and also that it exhibits a rich interplay of both language-specific (L1/L2) and universal factors. For example, both English and Serbian learners mostly prefer the satellite-framed, English pattern (e.g. run into X) to the verb-framed pattern favoured by Serbian native speakers when using their L1 (e.g. go running into X). In this way, learners resort to the economy-of-form strategy¹ opting for a pattern that is more economical by being shorter, syntactically simpler and thus easier for processing (production/understanding). It is in the domain of linguistic attention to Manner that a language-specific influence (L1 influence) is at its strongest at times, being clearly visible even among the advanced English and Serbian learners. In addition, the findings reveal that L2 learners undergo not only linguistic reorganisation, but also a change in the degree of linguistic attention to Manner (increasing/decreasing frequency of Manner mention) with increasing proficiency levels.
Besides theoretical implications for the field of second language acquisition, this thesis has also practical implications for teaching the linguistic devices expressing dynamic spatial relations in the two languages. For more details on this study see Filipović & Vidaković (2010).
¹ This term was first used in Vidaković (2006).
Figure 1: Construct validity components (Shaw & Khalifa 2007)
[Diagram labels: SCORING VALIDITY, TEST CONSTRUCT, CONTEXT VALIDITY, COGNITIVE VALIDITY, TEST TAKER CHARACTERISTICS]
The elements making up construct validity can be seen to be symbiotically related in that decisions taken in terms of task context will impact on the processing that takes place in task completion. Likewise scoring criteria where known to the test taker will impinge on cognitive processing. Taken together ‘the more evidence collected on each of the components of this framework, the more secure we can be in our claims for the validity of a test’ (Weir 2005:47).
Models of second language writing
Before the 1960s writing was often conceptualised as transcribed speech and was viewed as ‘decontextualised’ (Ellis 1994:188) and product-oriented with final texts seen as ‘autonomous objects’ where various elements were organised according to a ‘system of rules’ (Hyland 2002:6). Writing is now seen as essentially a communicative act. A written text is therefore viewed as discourse in that the writer attempts to engage the reader using linguistic patterns influenced by a variety of social constraints and choices (writer’s goals, relationship with audience, content knowledge, etc.). Any model of writing needs to account for these contextual factors and see writing as a social act.
A model of writing also needs to take account of the internal processing writers undertake. A recent model from Field (2004) is based upon information processing principles from psycholinguistic theory. He provides a detailed account of the stages a writer proceeds through:
• macro-planning: ideas gathering and identifying major constraints (genre, readership, goals)
• organisation: ordering ideas and identifying relationships between them
• micro-planning: focusing on the part of the text (paragraph and sentence) about to be produced
• translation: converting propositional content held in abstract form to linguistic form
• monitoring: checking mechanical accuracy and overall coherence
• revising: adjusting text as a result of monitoring.
These stages of executive processing are the basis of Shaw and Weir’s (2007) conceptualisation of the cognitive validity component of the socio-cognitive framework for writing and as such inform the methodology outlined below.
A parallel strand of research focuses not on the stages of the writing process per se but how these relate to different levels of language proficiency. Eysenck & Keane (2005:418) argue that it is the planning process that differentiates the skilled from the unskilled writer. Scardamalia & Bereiter (1987) describe two major strategies, knowledge telling and knowledge transforming, which occur mainly at the planning stage and help to identify the processing of skilled writers and the less able. In knowledge telling the writer plans very little and is concerned mainly with generating content from remembered existing resources in terms of content, task and genre. In knowledge transforming the skilled writer considers the complexities of a task as well as content, audience, register and other relevant factors in written communication.
IELTS-related writing research
As a high-stakes test IELTS has always attracted attention from researchers including those who have focused just on the writing component. Much of the research has been generated by the IELTS partners themselves thus demonstrating their commitment to the continual improvement of the test (see for example Taylor & Falvey (2007) for a collection of IDP and British Council joint-funded research reports on IELTS Writing). In 2005, the assessment criteria and rating scales were revised in IELTS Writing largely as a consequence of these and other research findings. Many of the inevitable criticisms that a high-stakes test such as IELTS attracts were addressed in 2005 but some issues concerning cognitive validity still remain.
Of the two tasks in IELTS Academic Writing most research has been conducted on Task 2, the short essay. Being the longer of the two in terms of time allocation (40 minutes) and word length (250 words) it generates a greater sample of L2 writing. There have therefore been several a posteriori studies on Task 2 candidate scripts (see Mayor, Hewings, North, Swann & Coffin 2006). Task 2 also carries the heavier weighting in scoring, one of the justifications for Moore & Morton’s (2006) a priori study on test task authenticity.
Weir et al (2007) were the first to use a specially designed cognitive validity-based questionnaire in their study of comparability of word-processed and pen & paper IELTS writing. In that study, they compared candidate scores on two Task 2 prompts (a posteriori) as well as a quantitative and qualitative analysis of the questionnaire responses (a priori). This questionnaire forms the basis of one of the research instruments used in my study.
Task 1 on the other hand has generated relatively less research interest and apart from some internal Cambridge ESOL validation studies, it has always been researched alongside Task 2. Of greatest relevance to the present study
is Mickan, Slater and Gibson’s (2000) a priori study examining the readability of test prompts (Task 1 and 2) and test-taking behaviours of intending IELTS candidates using verbal protocol analysis. This study essentially focused on the context validity parameters of task input emphasising the ‘socio-cultural influences on candidates’ demonstration of their writing ability’ (ibid:29). As many aspects of IELTS Writing have evolved since this study, including the rubric, it would be interesting to see how candidates perceive Task 1 now.
Figure 2: Data input task – cinema attendance
You should spend about 20 minutes on this task.
The graph below gives information about cinema attendance in Australia between 1990 and the present, with projections to 2010.
Summarise the information by selecting and reporting the main features, and make comparisons where relevant.
Write at least 150 words.
[Graph: cinema attendance in Australia]
Methodology outline
In the study the participants were asked to verbalise their thoughts concurrently as they wrote their responses to one of two non-live IELTS Academic Writing Task 1s. Their verbalisations were audio-recorded generating a set of protocols making up a body of qualitative data. Concurrent reports are generally regarded as more reliable than retrospective reports in that data is not reliant on recovering thoughts from memory. These reports were supplemented by field notes (e.g. instances of underlining, crossing out and […]
A quick debriefing at the end of the recording session took place where subjects were asked to comment on the task and the research procedure. After this, the participants were asked to complete a questionnaire.
After the data was collected, the recordings were transcribed and data was segmented according to their correspondence to single thought processes. The unit of analysis for segmentation was sometimes a word, phrase, clause, sentence or even 2–3 sentences. Each segment was delineated with a ‘/’ and timed. In order to facilitate analysis, a coding scheme was developed by focusing on one protocol at a time and attempting to describe each segment as a thought process.
This involved four iterations of re-coding until a scheme was established that accounted for all four sets of protocols. Green (1998:70) emphasises that it is important at this stage to keep ‘any theoretical assumptions to a minimum’ as otherwise there is the danger of ignoring those verbalisations that are inconsistent with a particular hypothesis.
The coding that finally emerged consisted of each protocol being divided into three phases – pre-writing, writing and post-writing – which were labelled PreW, W and PostW respectively. Each thought process was then assigned a number so that PreW1 for example referred to the process of ‘Reading (part of) the introductory background to the visual input’. As well as code labels and length of time, comments from the field notes were also collated. For example, the beginning of a participant’s protocol was presented as follows:

Segment | Time | Verbal protocol | Code | Length of time | Comments
001 | 00.00 | OK. Writing Task 1. You should spend about 20 minutes on this task/ | PreW3 | 00.08 |
002 | 00.08 | The diagram below shows the process by which bricks are manufactured for the building industry/ | PreW1 | 00.10 | Underlines ‘bricks’ on task
003 | 00.18 | Summarise the information by selecting and reporting the main features, and make comparisons where relevant/ | PreW3 | 00.17 | Underlines ‘make comparisons’ and circles ‘main features’ on task

Two participants (1 and 3) thought aloud as they wrote a response to the data input task (Figure 2) and two participants (2 and 4) responded to the diagrammatic Academic Writing Task 1 (Figure 3).
All were IELTS preparation students at Anglia Ruskin University in Cambridge. More demographic information on the participants is provided in Table 2 below.

Table 2: Demographic data of participants
[…]
2 | Mongolian | F | 29 | Hoping to do an MA in Modern Society and Global Transformation at Cambridge
3 | Korean | M | 23 | Wants to do BA in Sports Management at Loughborough
4 | French | M | 20 | Wants to improve English while in UK for a year

Cognitive processing questionnaire
For this part of the study, I adapted the 38-item cognitive processing questionnaire (CPQ) designed by Weir et al (2007:321). The questions are grouped to reflect the cognitive processes that writers are hypothesised to undergo and are identified in the CPQ as one of Field’s (2004:329) six stages outlined previously. For example, question 21 (see below) is one of several that focuses on the translation phase:
I felt it was easy to express ideas using the correct sentences.
1. Strongly disagree 2. Disagree 3. No view 4. Agree 5. Strongly agree
Each stage is represented by at least four questions in order to enhance the reliability of the questionnaire, as a single question is always susceptible to bias.
A further advantage of this procedure is its uni-dimensionality in that all the questions measure in the same direction. Each item can therefore be scored from 1 to 5 (except Question 12 which elicited a yes/no response). The higher the score, the more favourable is the attitude. This in turn means that a frequency count can be carried out for the number and percentage of respondents who choose each option of each question. The mean value of responses to each question can then be calculated to reveal the tendency of the responses with the proviso that a minimum number of 30 respondents are sourced.
For those four who participated in the think aloud procedure, this questionnaire was administered afterwards in order to avoid the possible contamination of the protocols. As well as to these four participants, I distributed this questionnaire to several language schools that run IELTS preparation courses in order to generate some quantitative data.
A total of 60 IELTS preparation students of varied nationalities studying in the UK (44 students) and Hong Kong (16 students) wrote a response to either the data
input or diagrammatic writing tasks (see Table 3). They then completed one of two questionnaires depending on the task they had responded to.

Table 3: Breakdown of respondents by language institute and task type
IELTS preparation course provider | No of respondents: Data | Diagram | Total
Eurocentres, Cambridge | 4 | 4 | 8
Anglia Ruskin University, Cambridge | 8 | 10 | 18
St Giles, central London | 9 | 9 | 18
Centre for Language in Education, Hong Kong Institute of Education | 8 | 8 | 16
Total | 29 | 31 | 60

Table 4b: Writing phase (17 minutes 33 seconds/75.81% of overall time on task)
Code | Coding category | Length of time | Frequency
W2 | Converting ideas into text | 07.54 | 22
W1 | Rehearsing a linguistic form before writing | 01.58 | 8
W11 | Interpreting feature(s) of visual input | 01.28 | 5
W14 | Reviewing grammatical/lexical correctness after writing some text | 01.22 | 4
W4 | Previewing a concept before writing some text | 00.54 | 1
W17 | Making a goal statement | 00.50 | 4
W13 | Reviewing grammatical/lexical correctness while writing some text | 00.37 | 3
W3 | Attempting to retrieve a linguistic form from memory | 00.25 | 2
W7 | Reading (part of) the standard instructions | 00.25 | 1
W15 | Reviewing informational content while writing some text | 00.11 | 1
W20 | Monitoring the word count | 00.05 | 1
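
As a hedged illustration of the two rather crude measures used in the verbal protocol analysis, the sketch below (in Python, and not part of the original study) converts the mm.ss timings to seconds, reproduces the phase shares quoted in the captions of Tables 4a, 4b and 4c for Participant 1, and ranks coding categories by time spent with their frequency alongside. The function names are hypothetical; the segment data are the three segments of the protocol excerpt presented in the methodology outline.

```python
def to_seconds(mm_ss):
    """Convert an 'mm.ss' timing as used in the tables (e.g. '17.33') to seconds."""
    minutes, seconds = mm_ss.split(".")
    return int(minutes) * 60 + int(seconds)


def phase_shares(phase_durations):
    """Return each phase's share of overall time on task, in percent."""
    total = sum(to_seconds(d) for d in phase_durations.values())
    return {phase: 100.0 * to_seconds(d) / total
            for phase, d in phase_durations.items()}


def rank_codes(segments):
    """Aggregate (code, length) segments into (code, seconds, frequency) rows,
    ordered by total time spent, longest first."""
    totals = {}
    for code, length in segments:
        secs, freq = totals.get(code, (0, 0))
        totals[code] = (secs + to_seconds(length), freq + 1)
    rows = [(code, secs, freq) for code, (secs, freq) in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Phase durations for Participant 1, taken from the captions of Tables 4a-4c.
shares = phase_shares({"PreW": "03.41", "W": "17.33", "PostW": "01.55"})
print({phase: round(pct, 2) for phase, pct in shares.items()})

# The three coded segments from the protocol excerpt (segments 001-003).
excerpt = [("PreW3", "00.08"), ("PreW1", "00.10"), ("PreW3", "00.17")]
print(rank_codes(excerpt))
```

On the caption figures this gives 15.91%, 75.81% and 8.28% for the pre-writing, writing and post-writing phases, which together account for the 23.09 minutes on task.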
Data collection and analysis
‘Think aloud’ verbal protocols
From each of the protocols collected, the instances where a coding category was applied were ranked in order of time spent. This was supplemented with data on the frequency of instances so that together these rather crude measures could provide some indication of the prevalence of certain thought processes. This information was collated for each writing phase for each participant.
For the purposes of exemplification, the findings of each writing phase based on the verbal protocol of Participant 1 are summarised in Table 4a, Table 4b and Table 4c. Of the 23.09 minutes she took to complete the task, she spent 03.41 minutes planning her response (see Table 4a). There is evidence of macro-planning in that she clarified the task requirements by reading the task-specific rubric (PreW1 and PreW2) and the graphical input (PreW5). She attempted to interpret the data (PreW8) and summarise it (PreW9). This was the only protocol where there was evidence of topic definition (PreW10) where the writer generates ideas by utilising world knowledge. However, at no time did she write any notes although she did claim in the debriefing that she made notes in her head.

Table 4a: Pre-writing phase (3 minutes 41 seconds/15.91% of overall time on task)
Code | Coding category | Length of time | Frequency
PreW10 | Defining the topic | 00.46 | 3
PreW5 | Reading (part of) the visual input | 00.36 | 4
PreW8 | Interpreting feature(s) of visual input | 00.31 | 5
PreW9 | Summarising feature(s) of visual input | 00.23 | 2
PreW2 | Re-reading (part of) the introductory background to the visual input | 00.13 | 1
PreW6 | Re-reading (part of) the visual input | 00.11 | 1
PreW1 | Reading (part of) the introductory background to the visual input | 00.10 | 1
PreW7 | Previewing potential linguistic form(s) | 00.07 | 1

Just over 75% of the time (17.33 minutes) was spent actually writing (see Table 4b), of which she spent 07.54 minutes engaging in translating – the actual conversion of abstract ideas to linguistic form (W2 22 instances). The second most common thought process was rehearsing a linguistic form before writing. There were eight instances of this (W1) which generally occurred before the actual putting of pen to paper. There were however some overt examples of micro-planning where the writer broke off mid-sentence, tried to find a phrase to continue the sentence, went back to the task and read the instructions and then made a goal statement, previewed an idea before finally writing. This highlights the dynamic nature of writing where the text becomes part of the context thus compelling the writer to re-visit the task, the instructions, goals and their memory before they can continue encoding their thoughts.
As well as micro-planning there are also examples of monitoring during (W13) and after writing some text (W14). While writing there were occasions where the writer self-corrected some errors e.g. The graph illustrate illustrates erm/ (W13). This is an example of low level monitoring involving mechanical accuracy such as punctuation, spelling and syntax. However, the monitoring that occurred after some text had been written does require more attentional resources as it involves checking cohesion between sentences and within sentences e.g. the writer in her final paragraph prepared to write ‘To conclude’, realised that the previous paragraph began with ‘To conclude’ so replaced it with ‘To compare’ some 3 minutes after originally beginning the penultimate paragraph.
The degree of monitoring however did not seem to extend to any consideration of the reader or to goals set earlier. Nevertheless there is evidence of an evolving orientation towards goals. There are four instances of this where the writer prompts herself: to make a difference, write one more sentence then a conclusion, draw a comparison and put it in my conclusion (W17).
The sheer complexity of writing is further evidenced with this participant in that she prompted herself twice to retrieve a linguistic form from her long-term memory (W3), felt the need to read the standard instructions for the first time (W7), reviewed the informational content of a piece of
text (W15) and was aware of the need to monitor the word count (W20).
This participant was one of only two subjects who devoted any time to the post-writing phase (see Table 4c) although she had to prompt herself to do this (PostW1). She mostly spent the time correcting errors (PostW3) although there were a couple of instances where she read her script making no corrections (PostW2). In her debriefing she thought that her response was short and that she didn’t have enough time to count the number of words.

Table 4c: Post-writing phase (1 minute 55 seconds/8.28% of overall time on task)
Code | Coding category | Length of time | Frequency
PostW3 | Editing (part of) text | 01.02 | 3
PostW1 | Making a goal statement | 00.18 | 2
PostW2 | Reading (part of) text | 00.17 | 2

Overall there is strong evidence from this and from the other three participants that all but one of the cognitive processes outlined in Weir (2005) and Field (2004) are being employed. The only process where there was very little evidence was of organising – this was also the case in Mickan et al’s (2000) study which concentrated on Task 2, a longer task requiring knowledge transforming skills. Perhaps even more so for Task 1, candidates are unlikely to write notes or mentally plan an outline. What was striking from all the participants was the perception that there was not enough time so perhaps organising the response was sacrificed due to that. However, from the participants’ scripts and also from some of their goal statements there was still some evidence of the provisional outlining of ideas.
The findings based on verbal reports of all four participants showed that there did not seem to be any striking dissimilarities in thought processes between those taking the data input task as opposed to the diagram. Differences were largely based on writing competence with the more skilled writers such as Participant 1 engaging more in macro-planning and monitoring than the less skilled (for more information see Bridges 2008).
Not surprisingly, the protocols collected in this study provide stronger evidence of knowledge telling than knowledge transforming. Task 1 is after all designed to facilitate the transfer of assembled information from a visual input to a verbal written output.
It must be emphasised, however, that as this study involved just four participants it should be seen as exploratory and any conclusions drawn are tentative. There are also drawbacks with the methodology of VPA itself which need to be considered in any conclusion.
Cognitive processing questionnaire
The design of the two questionnaires was aimed at investigating, through participants’ self-reports, the extent of the cognitive processes they employ in responding to two types of the Academic Writing Task 1. Table 5 below summarises the different stages and the questions designed to elicit respondent behaviour.

Table 5: Stages involved in writing and questions designed to elicit candidates’ behaviour
Stages | Question No.
Macro-planning | 1–9
Organising | 10–15
Micro-planning | 16–19
Translating | 20–26
Monitoring & Revising | 27–38

From the frequency data collected, the percentage of agreement for each question was obtained by adding up the percentage of those expressing agreement and strong agreement. This was done by task type and as a total and is presented in Tables 6, 7, 8, 9 and 10 overleaf. Preceding each table is a summary of the data highlighting the main findings with some tentative speculation as to the reasons for the results.
Macro-planning
In the goal-setting part of this stage (questions 1–5, see Table 6) there is generally quite high to very high agreement among the respondents. It does seem that many of these preparation students do read the instructions very carefully and attempt to interpret both these and the visual input so that they can meet the task requirements. This seems to be especially true of those who responded to the diagrammatic input.
A very low proportion of candidates seem to utilise world knowledge or consider the genre constraints when responding to Academic Writing Task 1s. Regarding the question of topic knowledge (Q6), it could be argued that low levels of agreement are actually a good thing as IELTS Writing tasks should not be seen to favour candidates from any specific discipline. Tasks have to be about something but not at a level where specialised knowledge would create bias.
Of more concern perhaps is the low level of knowledge about this task type which is a 150-word descriptive summary (cf. question 8 in Table 6 overleaf). Interestingly, more candidates, albeit very marginally, seemed to be more familiar with the diagrammatic task type than the data input.
Organising
A not particularly clear picture emerges from this sample during this organising stage (see Table 7). For questions 10 and 11, which elicit information on whether the writer starts to generate their ideas after the macro-planning phase above, it seems that about a third of the students report that they engage in these activities.
Questions 12 and 13 reveal that just over half do plan an outline either on paper or as mental notes and that just over 50% have thought of their ideas before they plan their outline. These ideas may well be incomplete (see question 10) or not well-organised (question 11) but there does seem to be some provisional organisation of ideas.
Not surprisingly, as 51.7% reported that they thought of most of their ideas before planning an outline, only 29% mostly thought of ideas while planning an outline. An
equally low percentage thought of their ideas in English. This is not altogether surprising. L2 writers, especially unskilled ones, may experience a heavy cognitive load in simply encoding their thoughts as they write so are unlikely to plan for writing in English.
Micro-planning
This level of planning takes place as the text evolves at both the paragraph and sentence level while also taking into account decisions made in macro-planning. Perhaps the most interesting finding is the substantial difference in responses between those who responded to the diagrammatic task, of whom 58.1% thought it was easy to put their ideas in good order, and the data task respondents, of whom only 17.2% thought it was easy (see question 19 in Table 8). It could be surmised that the diagrammatic task does offer more scaffolding than the data task although interestingly more data task respondents reported being able to put their ideas or content in good order (46.7% to 31.1%, see question 17) but that of course does not necessarily mean it was easy.
No. | Question | Data | Diagram | Total
1 | I FIRST read the instructions very slowly considering the significance of each word in it. | 55.2% | 77.5% | 66.7%
2 | I thought of WHAT I was required to write after reading the instructions and visual input. | 79.3% | 80.6% | 80.0%
3 | I thought of HOW to write my response so that it would respond well to the instructions. | 79.3% | 71.0% | 75.0%
5 | I was able to understand the instructions for this writing test completely. | 69.0% | 80.7% | 75.0%
6 | I know A LOT about this topic, i.e., I have enough ideas to write about this topic. | 24.1% | 16.1% | 20.0%
7 | I felt it was easy to produce enough ideas for the Task 1 from memory. | 17.2% | 35.5% | 26.6%
8 | I know A LOT about this task type, i.e. I know how to write a descriptive summary of data (chart, diagram, table)/diagrams (process, map, plan). | 24.1% | 25.8% | 25.0%
9 | I know A LOT about other types of IELTS Academic Writing Task 1s e.g., diagrams (process, map, plan)/data (chart, diagram, table). | 27.5% | 32.2% | 30.0%

11 | Ideas occurring to me at the beginning were well ORGANISED. | 31.0% | 45.1% | 38.3%
12 | I planned an outline on paper or in my head BEFORE starting to write.* | 51.8% | 51.6% | 51.7%
13 | I thought of most of my ideas for the task BEFORE planning an outline. | 60.0% | 43.8% | 51.7%
14 | I thought of most of my ideas for the task WHILE I planned an outline. | 33.3% | 25.1% | 29.0%
*As respondents only had to answer Yes or No to this item, % agreement is based on those who answered ‘yes’.

17 | I was able to put my ideas or content in good order. | 46.7% | 31.1% | 38.8%
18 | Some ideas had to be removed while I was putting them in good order. | 40.0% | 37.6% | 38.7%
19 | I felt it was easy to put ideas in good order. | 17.2% | 58.1% | 38.4%
20 I felt it was easy to express ideas using the appropriate words. 17.2% 45.1% 31.7%
21 I felt it was easy to express ideas using the correct sentences. 24.1% 25.8% 25.0%
22 I thought of MOST of my ideas for the summary WHILE I was actually writing it. 41.3% 64.6% 53.3%
23 I was able to express my ideas by using appropriate words. 13.8% 61.3% 38.4%
24 I was able to express my ideas using CORRECT sentence structures. 20.6% 45.2% 33.3%
25 I was able to develop any paragraph by putting sentences in logical order in the paragraph. 31.0% 64.5% 48.3%
26 I was able to CONNECT my ideas smoothly in the whole response. 13.8% 41.9% 28.4%
27 I tried NOT to write more than the required number of words in the instructions. 31.0% 25.8% 28.3%
28 I reviewed the correctness of the contents and their order WHILE writing this response. 44.8% 45.1% 45.0%
29 I reviewed the correctness of the contents and their order AFTER finishing this response. 44.8% 48.4% 46.6%
30 I reviewed the appropriateness of the contents and their order WHILE writing this response. 41.4% 45.2% 43.3%
31 I reviewed the appropriateness of the contents and their order AFTER finishing this response. 48.3% 42.0% 45.0%
32 I reviewed the correctness of sentences WHILE writing this response. 51.8% 51.7% 51.6%
33 I reviewed the correctness of sentences AFTER finishing this response. 44.8% 41.9% 43.4%
34 I reviewed the appropriateness of words WHILE writing this response. 51.7% 54.8% 53.4%
35 I reviewed the appropriateness of words AFTER finishing this response. 44.8% 41.9% 43.4%
36 I was able to write a draft response in this test, then wrote the response again neatly within the given time. 37.9% 19.4% 28.4%
37 After finishing the summary I also thought for a while of those statements or thoughts I removed. 37.9% 29.0% 33.4%
38 I felt it was easy to review or revise the whole response. 24.1% 29.0% 26.7%
Conclusion and recommendations

For the cognitive processes required to complete the Task 1 in IELTS Academic Writing to be deemed appropriate, they need to replicate those thought processes that test takers will need to utilise in the future target language use situation. This study demonstrates that there is evidence of a large variety of the cognitive processes being employed, although organising does not seem to be activated as much as the other processes. This is perhaps because ultimately the completion of Task 1 requires a knowledge-telling strategy even with very proficient writers. Unskilled writers are likely to plan less with each sentence generating the content of the next piece of text in a linear non-reflective manner. Skilled writers on the other hand may find that re-shaping the content from a visual input is not particularly demanding. They may adopt problem-solving strategies involved in knowledge transformation such as organising, but knowledge telling may be successful with very straightforward Task 1s.

In order to follow up this study and to furnish further evidence of cognitive validity to support the use of IELTS Academic Writing Task 1 the following research projects could be initiated:
• Further verbal protocol analysis where each informant would verbalise their thoughts on both data and diagram input tasks. Comparisons were limited in my study as the task variable was confounded by the participant variable.
• Keystroke logging of responses during VPA as subjects type their responses. This kind of research will become increasingly relevant as the IELTS partners plan to offer computer-based variations on the traditional pen and paper administrations they currently offer. Keystroke logging provides a more accurate record of when and where writers pause and together with concurrent protocols potentially offers richer data.
• Analysis of linguistic features of scripts from the VPA participants to gain further insight into levels of processing in terms of rhetorical and content parameters.

IELTS has always been a research-led enterprise and so these and other studies are likely to come to fruition in one form or another. As a high-stakes test it is important that IELTS continues to demonstrate validity. It is hoped that this small scale study using a relatively recent theoretical framework contributes in some way to the validity argument supporting the use of IELTS as a means of assessing the writing ability of those wishing to study or work in the medium of English.

References
Bachman, L (1990) Fundamental Considerations in Language Testing, Oxford: Oxford University Press.
Bridges, G (2008) Demonstrating further evidence of cognitive and context validity for Task 1 of the IELTS Academic Writing Paper using a socio-cognitive validity framework, unpublished MA dissertation, Anglia Ruskin University.
Chapelle, C (1998) Construct definition and validity inquiry in SLA research, in Bachman, L and Cohen, A (Eds) Second Language acquisition and language testing interfaces, Cambridge: Cambridge University Press, 32–70.
Ellis, R (1994) The Study of Second Language Acquisition, Oxford: Oxford University Press.
Eysenck, M and Keane, M (2005) Cognitive Psychology (5th edition), Hove: Psychology Press.
Field, J (2004) Psycholinguistics: the Key Concepts, London: Routledge.
Green, A (1998) Verbal protocol analysis in language testing research, Cambridge: UCLES/Cambridge University Press.
Hyland, K (2002) Teaching and Researching Writing, London: Longman.
IELTS Scores Explained DVD (2006), Cambridge: Cambridge ESOL Publications.
Mayor, B, Hewings, A, North, S, Swann, J and Coffin, C (2006) A linguistic analysis of Chinese and Greek L1 scripts for IELTS Academic Writing Task 2, in Taylor, L and Falvey, P (Eds) IELTS Collected Papers: Research in speaking and writing assessment, Cambridge: Cambridge ESOL/Cambridge University Press, 250–315.
Mickan, P, Slater, S and Gibson, C (2000) Study of Response Validity of the IELTS Writing Subtest, in Tulloh, R (Ed.) IELTS Research Reports Volume 3, Canberra: IELTS Australia, 29–48.
Moore, T and Morton, J (2006) Authenticity in the IELTS Academic Writing test: a comparative study of Task 2 items and university assignments, in Taylor, L and Falvey, P (Eds) IELTS Collected Papers: Research in speaking and writing assessment, Cambridge: Cambridge ESOL/Cambridge University Press, 197–249.
Saville, N (2003) The process of test development and revision within UCLES EFL, in Weir, C and Milanovic, M (Eds) Continuity and innovation: revising the Cambridge Proficiency in English Examination 1913–2002, Cambridge: UCLES/Cambridge University Press, 57–120.
Scardamalia, M and Bereiter, C (1987) Knowledge telling and knowledge transforming in written composition, in Rosenberg, S (Ed.) Advances in Applied Psycholinguistics, Volume 2: Reading, writing and language learning, Cambridge: Cambridge University Press, 142–175.
Shaw, S and Khalifa, H (2007) Deconstructing the Main Suite tests to understand them better, Cambridge ESOL presentation to internal staff.
Shaw, S and Weir, C (2007) Examining Writing: Research and practice in assessing second language writing, Cambridge: Cambridge ESOL/Cambridge University Press.
Taylor, L and Falvey, P (2007) IELTS Collected Papers: Research in speaking and writing assessment, Cambridge: Cambridge ESOL/Cambridge University Press.
Weir, C (2005) Language Testing and Validation: an evidence-based approach, Basingstoke: Palgrave Macmillan.
Weir, C, O’Sullivan, B, Jin Yan and Bax, S (2007) Does the computer make a difference? Reaction of candidates to a computer-based versus a traditional hand-written form of the IELTS Writing component: effects and impact, in Taylor, L (Ed.) IELTS Research Report Volume 7, IELTS Australia and British Council, 311–347.
aspects of language which present difficulties for specific groups of learners at different points on the interlanguage continuum. This information can yield insights about range, complexity and typical performance at different proficiency levels. In fact, CLC is currently being used in the English Profile project to describe in more detail linguistic and lexical features of learner output (McCarthy 2009). With C1 and C2 levels, where advanced language performance may reveal clusters of different features (Jarvis, Grant, Bikowski & Ferris 2003:399), corpus analysis may help us understand how these are distributed over student populations. Regardless of level however, if we are able to identify typical errors or avoidance strategies which still need to be addressed, we can then try to feed work on these areas into our teaching.

Focus of the study

This study was prompted by a previous investigation by Hyland and Milton (1997) into the way Hong Kong students express qualification and certainty in their writing. The authors believe that flexible use of linguistic devices to mitigate and boost statements is crucial to academic discourse for the following reasons:
Mitigators or ‘hedges’ allow writers to:
• avoid absolute statements
• acknowledge the presence of alternative voices
• express caution in anticipation of criticism.
Amplifiers or ‘boosters’ allow writers to:
• demonstrate confidence and commitment in a proposition
• mark their involvement and solidarity with the reader.

My own experience of working with Italian students suggests that they have firm control of amplifiers but are less likely to mitigate their statements. For example, several years ago one student, Chiara¹, wrote a well-structured and supported, generally accurate essay on the subject of teenage pregnancies in Britain, and was disappointed at receiving a slightly lower mark than she had expected. This was because she had failed to navigate the ‘area between Yes and No’ (Halliday 1985:335), and used only categorical statements with inappropriate strength of claim, resulting in what Milton (1999:230) has called ‘over zealous emphasis’. If Chiara had qualified her statements more, in order to ‘recognise alternative voices’ (Hyland 2005:52) her essay would have been more persuasive. According to Hyland (2005:24):
‘… meaning is not synonymous with ‘content’ but dependent on all the components of a text. …both propositional and metadiscoursal elements occur together … each element expressing its own ‘content’: one concerned with the world, and the other with the text and its reception.’ (bold added)

¹ A pseudonym

Equally importantly, as well as its central function in establishing the tone and style of academic writing, the ability to express qualification and certainty is considered to be an important politeness strategy in speech and writing. Salager-Meyer (1995) considers hedges and boosters to be ‘a significant communicative resource for student writers at any proficiency level’. Hyland & Milton (1997:186) also comment on this important area of pragmatic competence, and argue that these devices influence the reader’s assessment of ‘both referential and affective aspects of texts’ (bold added). In spoken discourse too, increasing attention has been paid to the pragmatic importance of hedging strategies. Carter (2005:68) suggests that they have an important interpersonal function in keeping lines of communication open; Hyland (2005) refers to this elsewhere as ‘opening up a discursive space’ in written discourse.

All of this seems to suggest that flexible use of modal devices is important both as an interpersonal feature and as a communication strategy in L2 production in general. It is because of their all-pervasive nature in many types of discourse, as well as their significance in academic writing, that I decided to carry out a preliminary study using learner corpora to investigate the frequency and occurrence of these devices in my own local teaching context.

The research question was the following: how do undergraduate students express qualification and certainty in their argumentative writing, and what type of devices do they use most frequently?

Student profile and methods

Although the learner corpus used is very small, Granger (1998a) suggests that small corpora compiled by teachers of their own students’ work can yield useful insights into a group profile of learner language. Clearly, for any corpus to be useful it is essential to have clear design criteria; in the case of learner language it is particularly important to control for the many different types of learner language and situations, taking into account variables such as the following:

Table 1: Variables to control for in learner corpora design
Language        Learner
medium          age
genre           sex
topic           L1
task            level
task setting    learning context
(Adapted from Granger 1998b:9)

The students in this project formed a relatively homogenous group in terms of age, level and language learning background. The 50 students involved were in their second year of a degree in European languages and culture at the University of Modena and Reggio Emilia. This was a predominantly female student population (42 female and 8 male) whose language level ranged from high B2 to low C1, as measured by their results in the first year exam.

The study was conducted with this group of high-intermediate students as it was hoped that their firm
control of grammatical and lexical resources would free them up to reflect upon how modal or epistemic devices could be used to achieve different rhetorical purposes. The aim was to observe how they hedged or boosted their statements; therefore the focus here was on appropriateness, rather than accuracy.

The data used are two small corpora based on student writing produced at the end of the first and second semesters. CORPUS 1 was compiled from two short argumentative writing tasks submitted in the first semester. The handwritten scripts were later keyed into the computer verbatim by the students themselves. I then corrected typographical errors only and analysed the texts using Wordsmith Tools text retrieval software to examine the type and frequency of hedges and boosters occurring in the scripts. A further manual analysis was conducted to disambiguate any items. CORPUS 2 was compiled from two further assignments submitted at the end of the second semester and a similar analysis was carried out.

Table 2: Top 10 epistemic devices which occurred in this study

… so work experience can really help you to grow…
… that’s why you’re really interested on it.
… world of sport has really changed today …
… the meeting are really serious and …
… have turned out to be really appreciated …

Expert NS and non-native-speaker (NNS) writers, in a similar argumentative task, might have achieved this emphasis more formally, for example, by replacing ‘really important’ with ‘crucial’, ‘really help you’ with ‘be of considerable help’, and ‘really appreciated’ with ‘very much appreciated’.

Predominance of central modals
The same central modal verbs ‘will’, ‘should’, ‘would’, ‘could’, and the epistemic verb ‘think’, appeared in the top 10 tokens of both Corpus 1 and Corpus 2 (see Table 3).

Table 3: Occurrence of central modals in Corpus 1 and 2 (raw figures)
         CORPUS 1   CORPUS 2
Total    316        125
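The frequency counts reported in this study were produced with Wordsmith Tools followed by manual disambiguation. As a rough illustration only, the automatic part of such a count could be sketched in Python as below. The device list, the tokeniser and the sample sentence are assumptions made for this sketch and are not the study's actual settings or data; an automatic count of this kind cannot tell epistemic from non-epistemic uses (for example ‘will’ as a noun), which is why a manual check is still needed.

```python
import re
from collections import Counter

# Hypothetical device list: a few items borrowed from the study's tables.
DEVICES = ["will", "would", "think", "could", "should", "always", "in fact", "quite"]

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def device_frequencies(text, devices=DEVICES):
    """Return {device: (raw count, % of all tokens)} for each listed device."""
    tokens = tokenize(text)
    total = len(tokens)
    joined = " ".join(tokens)
    counts = Counter()
    for device in devices:
        # Whole-word match, so 'will' does not match inside 'willing'.
        pattern = r"\b" + re.escape(device) + r"\b"
        counts[device] = len(re.findall(pattern, joined))
    return {
        d: (n, round(100 * n / total, 1) if total else 0.0)
        for d, n in counts.items()
    }

# Invented sample text, not taken from the learner corpus.
sample = "In fact, I think we should act. It could work, and it will always help."

ranked = sorted(device_frequencies(sample).items(), key=lambda kv: -kv[1][0])
for device, (n, pct) in ranked:
    print(f"{device:<8} {n:>3} {pct:>5}%")
```

Multi-word items such as ‘in fact’ are matched against the re-joined token stream, so punctuation between the two words would not block a match; whether that is desirable depends on the research question.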
Table 4: Top 10 modal devices occurring in Corpus 1 and 2 (percentages)
CORPUS 1    No.   %       CORPUS 2    No.   %
will        123   5%      in fact     41    1.4%
would       81    3.5%    think       34    1.2%
think       61    2.7%    will        31    1.2%
could       60    2.6%    could       30    1.1%
should      39    1.7%    would       29    0.9%
always      26    1.1%    always      29    0.9%
in fact     24    1%      quite       28    0.9%
know        19    0.8%    should      20    0.7%
usually     18    0.8%    clear       16    0.5%
possible    15    0.6%    believe     15    0.5%

Predominance of boosters
It is also interesting to note that boosters (which have an amplifying function) rather than hedges (mitigating function) predominate in the list of 10 most frequently occurring devices in this corpus. This may be a result of a mother tongue (L1) fingerprint on L2, although this hypothesis would need to be researched further for an Italian L1 context. Past learning experience or instruction where students are encouraged to express their views assertively may also be a contributory factor.

Sentence position
Previous studies of complexity in L2 writing have found that, possibly because of the multiple demands of the composing process, learners frequently default to safe usages such as ‘thing’ instead of ‘topic/issue/question’. In this corpus, too, the same phenomenon occurs when expressing opinions. For instance, many students in this corpus relied on personal subjectivity markers such as ‘in my opinion’, what Hasselgren (1994) might describe as a ‘lexical teddy bear’. This is illustrated in the examples below:
… instead of having a walk with a friend. In my opinion, it would be better spend …
… “real” encounter takes place. In my opinion, to deal with this issue …
… action proposing these two projects. In my opinion, proposal number one is …
… the Car Park and the city centre, yet in my opinion this may be revealed as …
… threatened or highly endangered. In my opinion, we have led our planet …
The first proposal is, in my opinion, a great solution for …
… and stressful sport activity. In my opinion the secret for staying fit …
… the health side to doing sports. In my opinion practicing sports, and …
… too much traffic and much noise. In my opinion a good solution for …

initial position has a mitigating or amplifying effect on writer commitment, and if this effect might change if it were embedded or inserted at clause-initial position.

It may be that NNS writers prefer to use fixed phrases in sentence-initial position because they are often presented in school textbooks in this way, and this makes them implicitly available for uptake by students. This might be something we want to draw students’ attention to when using published materials.

Compound hedges
Despite the predominance of boosters in this learner corpus, there were also some clear attempts to qualify assertions. For example, some students tried to combine devices in a ‘compound hedge’ (Salager-Meyer 1995:155), not always with harmonic results. Nevertheless, it is interesting to note that such clusters, typical of expert or NS writers, also occurred in this corpus (see examples below). This seems to indicate an increasing awareness of the reader–writer relationship in this high-intermediate student population.
If it is possible for me to make a suggestion, my advice would be to try to reduce the number of cars circulating
…. or rather I would say that I feel the need to express my opinion concerning …
Personally, I think that imposing a daily “congestion charge” could be a good idea.…
This restriction seems to me not quite right …

Some researchers (e.g. Hyland & Milton 1997) have found that students who modify their statements with more tentative expressions tend to have a higher level of general language proficiency. Others, instead, suggest that although greater linguistic competence is an important prerequisite, it does not automatically imply the parallel development of pragmatic competence (Bardovi-Harlig & Dörnyei 1998:234).

Possible reasons for lack of control of modal devices
Even in this small study of a relatively homogeneous student population there was some variation both in the degree of formality and in the degree of tentativeness. This may be linked to one or more of the following factors:
• language level (even within this relatively homogeneous student population)
• writing competence (as opposed to language competence)
• incomplete register control
• individual differences in communicative style
• cultural differences in rhetorical style.
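The sentence-position observation discussed above rests on concordance lines for ‘in my opinion’. As a hedged sketch only, a keyword-in-context listing that also flags sentence-initial hits could be produced along the following lines. The function name, the context width and the rule for deciding what counts as sentence-initial are assumptions made for this illustration, and the sample text is pieced together from fragments quoted in the article rather than taken from the corpus itself.

```python
import re

def kwic(text, phrase, width=30):
    """Yield (left context, match, right context, sentence_initial) per hit."""
    for m in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Assumption: a hit is sentence-initial if it opens the text
        # or directly follows a full stop, exclamation or question mark.
        before = text[:m.start()].rstrip()
        initial = before == "" or before[-1] in ".!?"
        yield left, m.group(0), right, initial

# Illustrative sample assembled from fragments quoted in the article.
sample = (
    "There is too much traffic and much noise. In my opinion a good solution "
    "exists. The first proposal is, in my opinion, a great solution."
)

hits = list(kwic(sample, "in my opinion"))
for left, match, right, initial in hits:
    print(f"{left:>30} [{match}] {right:<30} initial={initial}")
print(sum(h[3] for h in hits), "of", len(hits), "hits are sentence-initial")
```

A listing of this kind only locates the phrase; judging whether a given position mitigates or amplifies writer commitment remains an interpretive step.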
populations previously studied (Hinkel 2005, Hyland & Milton 1997), they tend to overstate rather than hedge their assertions, possibly in a bid to ‘sell’ their ideas, and often default to informal items (e.g. ‘really’), creating a degree of writer visibility which may not be appropriate in all types of writing. Also, the narrow range of modal auxiliaries which learners tend to rely on at this stage may not be adequate as they progress to more complex, pragmatically sensitive writing events in future contexts, both academic and professional. Therefore it is important to make learners aware that there is a wider spectrum of linguistic choices available for these purposes and to provide opportunities for them to encounter such alternatives in context.

A further consideration is the improvement of stylistic proficiency, which is an important objective as students progress along the writing continuum. The increasing internationalisation of higher education means that, in order to gain access to English-medium university courses, students need to obtain advanced English language qualifications such as International English Language Testing System (IELTS), CAE (Cambridge English: Advanced²) or CPE (Certificate of Proficiency in English). Testing criteria for these exams, based on the Common European Framework of Reference (CEFR) descriptors, include lexical resources and interactive communication. To meet the required level for C1 and C2, students need to use a wide range of lexis accurately and appropriately to perform interpersonal functions and meet the testing criteria. Therefore a strong learner training component in exam preparation classes could provide learners with strategies to extend their range of lexis and discover alternatives to certain default usages or ‘islands of reliability’ (Dechert 1984:227).

² Previously known as Certificate in Advanced English

For second language learners, increasing their stock of lexis is a particular challenge (Schmitt 2008:329). Research on advanced students’ vocabulary (Ringbom 1998:43) has shown that learners at this level consistently use the 100 most frequent words more often than NS writers. Rundell & Granger (2007) report corpus findings demonstrating that learners writing academic texts use the discourse marker ‘besides’ about 15 times more frequently than native speakers writing in the same mode. Such findings highlight how expanding lexical resources is a key priority for learners, and how vocabulary acquisition should concern not only content words, but also a range of lexis to perform interpersonal functions such as agreeing, disagreeing or expressing opinion. For example, the findings of this particular study suggest that these students need to develop their repertoire of alternatives to central modals. Sinclair’s (1991) idiom (rather than open choice) principle holds that meaning is attached to the whole phrase rather than the individual parts of it, so teachers may want to draw students’ attention to prefabricated modal chunks (lexical phrases) as they are encountered, as well as individual tokens (modal verbs).

As well as providing opportunities for intentional learning of vocabulary, we need to provide opportunities for incidental learning of vocabulary (Schmitt 2008:353). Students may benefit from exposure to appropriate text models through extensive reading of a variety of text types. In this way they can explore contextualised examples of these devices, notice how they occur typically in discourse, and reflect on their function in each context. For example, the hedges that predominate in the abstract and discussion sections of an academic article are polypragmatic in that they express a degree of uncertainty and therefore humility towards the academic community. Apprentice texts written by advanced-level students (Flowerdew 2000) can also be an excellent source of reading texts for students of slightly lower levels. Attention can be drawn to hedging devices, which are often lexically invisible to learners (Lowe 1996:30), and the possible purpose of these can then be discussed. For example, they may be used to express caution in anticipation of criticism, to show politeness and modesty towards the academic community and wider readership, or to open up a dialogical space, among others.

The following are some suggestions for form-focused instruction and consciousness-raising (CR) activities:
• remove hedges from texts and ask students to discuss the resulting effect on the reader
• ask students to explore the function of multi-word items which naturally occur in the target discourse such as ‘it would seem that’, ‘to my knowledge’, ‘to some extent’ or the more informal ‘on the whole’ in their reading (and notice that they are sometimes embedded in the clause and not in sentence-initial position)
• ask students to distinguish statements in a text which report facts and those which are unproven
• students rewrite an academic essay (which uses hedges and boosters) into popular journalistic style (which doesn’t) or vice versa (Hyland 2005)
• design persuasive tasks of various kinds on sensitive topics, anticipating the potentially critical views of the reader (Hyland 2005)
• students could reformulate texts to accommodate different audiences, and compare the before and after effect on the audience.

Conclusion

This has been a preliminary investigation into an area of learner language which is receiving increasing attention from discourse analysts. The study should be regarded as a point of departure rather than arrival, and the findings are intended to be representative of a specific student population only. Clearly, it would benefit from further quantitative and qualitative analysis and replication in other student populations. Nevertheless, it has thrown up interesting insights about how the students in this setting navigate the ‘area of meaning between Yes and No’, which I have since used to inform my teaching. What it suggests is that we may need to adopt a more systematic approach to raising students’ awareness of these interpersonal features in building reader–writer relationships and fostering effective communication in general. In this way, unlike Chiara in her essay on teenage pregnancies, they can learn to acknowledge the presence of ‘alternative voices’.
References and further reading
Bardovi-Harlig, K and Dörnyei, Z (1998) Do language learners recognize pragmatic violations? Pragmatic versus grammatical awareness in instructed L2 learning, TESOL Quarterly 32 (2), 233–262.
Carter, R (2005) What is a frequent word?, paper presented at the international IATEFL conference, Cardiff, 5–9 April, 2005.
Dechert, H (1984) Second language production: Six hypotheses, in Dechert, H, Mohle, D and Raupach, M (Eds) Second Language Productions, Tubingen: Gunter Narr Verlag, 211–223.
Flowerdew, L (2000) Using a genre-based framework to teach organizational structure in academic writing, ELT Journal 54 (4), 369–378.
Granger, S (1998a) Prefabricated patterns in advanced ELT writing: collocations and formulae, in Cowie, A P (Ed.) Phraseology: theory, analysis, and applications, Oxford: Clarendon Press, 145–160.
Granger, S (1998b) The computer learner corpus: a versatile new source of data for SLA research, in Granger, S (Ed.) Learner English on Computer, New York: Pearson Education, 3–18.
Granger, S (2002) A Bird’s eye view of Learner Corpus Research, in Granger, S, Hung, J and Petch-Tyson, S (Eds) Computer Learner Corpora, Second Language Acquisition and Foreign Language teaching, Amsterdam/Philadelphia: John Benjamins Publishing Company, 3–33.
Halliday, M (1985) An Introduction to Functional Grammar, London: Arnold.
Hasselgren, A (1994) Lexical teddy bears and advanced learners: a study into the way Norwegian students cope with English vocabulary, International Journal of Applied Linguistics 4, 237–58.
Hinkel, E (2005) Hedging, inflating and persuading, Applied language learning 15 (1–2), 29–53.
Hunston, S (2002) Corpora in Applied Linguistics, Cambridge: Cambridge University Press.
Hyland, K (2000) Hedges, Boosters and Lexical Invisibility: Noticing Modifiers in Academic Texts, Language Awareness 9 (4), 179–197.
Hyland, K (2005) Metadiscourse, London/New York: Continuum.
Hyland, K and Milton, J (1997) Qualification and Certainty in L1 and L2 Students’ Writing, Journal of Second Language Writing 6 (2), 183–205.
Jarvis, S, Grant, L, Bikowski, D and Ferris, D (2003) Exploring multiple profiles of highly rated learner compositions, Journal of Second Language Writing 12, 377–403.
Lowe, G (1996) Intensifiers and Hedges in Questionnaire items and the Lexical Invisibility Hypothesis, Applied linguistics, 17 (1), 1–37.
McCarthy, M (2009) English Profile. TESOL Talk from Nottingham, retrieved from http://portal.lsri.nottingham.ac.uk/SiteDirectory/TTfN/default.asp
Milton, J (1999) Lexical thickets and electronic gateways, in Candlin, C N and Hyland, K (Eds) Writing: texts, processes and practices, London: Longman, 221–244.
Morgan, B S (2008) The space between Yes and No: how Italian students qualify and boost their statements, in Palawek, M (Ed.) Investigating English Language Learning and Teaching, Poznan-Kalisz: Adam Mickiewicz University, 267–278.
Ringbom, H (1998) Vocabulary frequencies in advanced learner English: A cross-linguistic approach, in Granger, S (Ed.) Learner English on Computer, London & New York: Addison Wesley Longman, 41–52.
Rundell, M and Granger, S (2007) From Corpus to confidence, retrieved from: http://www.macmillandictionaries.com/MED-Magazine/August2007/46-Feature_CorporatoC.htm
Salager-Meyer, F (1995) I Think That Perhaps You Should: A Study of Hedges in Written Scientific Discourse, Journal of TESOL France 2, 127–143.
Schmitt, N (2008) Review article: Instructed second language vocabulary learning, Language Teaching Research 12, 329–363.
Sinclair, J M (1991) Corpus Concordance Collocation, Oxford: Oxford University Press.
Van Els, T, Bongaerts, T, Extra, G, van Os, C, and Janssen-van Dieten, A M (1984) Applied Linguistics and the Learning and Teaching Languages, Edward Arnold: London.
©UCLES 2010 – The contents of this publication may not be reproduced without the written permission of the copyright holder.
CAMBRIDGE ESOL : RESEARCH NOTES : ISSUE 42 / NOVEMBER 2010
This short summary is based on a doctoral thesis submitted to the University of Michigan, Ann Arbor
(US) in 2009. The PhD was supervised by Professor Diane Larsen-Freeman.

Performance assessments have become the norm for evaluating language learners’ writing abilities in
international examinations of English proficiency. Two aspects of these assessments are usually
systematically varied: test takers respond to different prompts, and their responses are read by
different raters. This raises the possibility of undue prompt and rater effects on test takers’
scores, which can affect the validity, reliability and fairness of these tests.

This study uses data from the Michigan English Language Assessment Battery (MELAB), including all
official ratings given over a period of more than four years (n=29,831), to examine these issues
related to scoring validity. It uses the multi-facet extension of Rasch methodology to model this
data, producing measures on a common, interval scale. First, the study investigates the
comparability of prompts that differ on topic domain, rhetorical task, prompt length, task
constraint, expected grammatical person of response, and number of tasks. It also considers whether
prompts are differentially difficult for test takers of different genders, language backgrounds, and
proficiency levels. Second, the study investigates the quality of raters’ ratings and whether these
are affected by time and by raters’ experience and language background. It also considers whether
raters alter their rating behaviour depending on their perceptions of prompt difficulty and of test
takers’ prompt selection behaviour.

The results show that test takers’ scores reflect actual ability in the construct being measured as
operationalised in the rating scale, and are generally not affected by a range of prompt dimensions,
rater variables, test taker characteristics, or interactions thereof. It can be concluded that
scores on this test and others like it have score validity and, assuming that other inferences in
the validity argument are similarly warranted, can be used as a basis for making appropriate
decisions. Further studies to develop a framework of task difficulty and a model of rater
development are proposed.
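
The multi-facet Rasch approach mentioned above models the log-odds of a response receiving score category k rather than k-1 as test-taker ability minus prompt difficulty, rater severity and a category threshold. The following is a minimal Python sketch of that standard rating-scale formulation. The facet names, the threshold values and the function names are illustrative assumptions; they are not the MELAB rating scale or the model specification used in the thesis.

```python
import math

def category_probabilities(ability, prompt_difficulty, rater_severity, thresholds):
    """Return P(score = k) for k = 0..len(thresholds) under a rating-scale
    many-facet Rasch model (all values in logits; names are hypothetical)."""
    eta = ability - prompt_difficulty - rater_severity
    # Log-numerator of category k is the sum over h <= k of (eta - threshold_h);
    # category 0 is the reference category with log-numerator 0.
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + eta - tau)
    peak = max(log_numerators)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(probs):
    """Expected score category given the category probabilities."""
    return sum(k * p for k, p in enumerate(probs))

# Same test taker and prompt, rated by a lenient and by a severe rater
# (made-up values, four score categories).
lenient = category_probabilities(1.0, 0.0, -0.5, [-1.0, 0.0, 1.0])
severe = category_probabilities(1.0, 0.0, 0.5, [-1.0, 0.0, 1.0])
print(expected_score(lenient), expected_score(severe))
```

Because every facet sits on the same logit scale, a rater or prompt effect can be read directly as a shift in the expected score for a test taker of fixed ability.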
This short summary is based on a Master’s thesis submitted to the Faculty of Arts, Law and Social
Sciences, Anglia Ruskin University in 2007. The research was funded by Cambridge ESOL. The MA was
supervised by Dr Sebastian Rasinger.

This MA research focused on Cambridge ESOL’s Preliminary English Test (PET).

In 2007 Cambridge ESOL was starting to launch computer-based versions of many of its paper-based
tests. Thus it was important that the issues of comparability between administration modes were
explored. This study focuses on the skill of writing and builds on research from overall score and
writing sub-element score comparability studies. Unlike the majority of current research, which
focuses on score comparability, this study focuses on the comparability of text and linguistic
features. Features studied include lexical range and sophistication, text length and organisation
and surface features such as capitalisation and punctuation.

The study is set within an ESOL assessment environment and is in two parts. The first part is a
qualitative analysis of a small sample of scripts that also acts as a pilot for part two. Tasks from
Cambridge ESOL’s Preliminary English Test (PET) are used and the resulting scripts from paper-based
and computer-based administrations analysed.

In the second and main part of the study scripts produced from a live PET administration were
studied. Two samples of texts were chosen; these samples were matched on candidates’ proficiency and
the country in which they sat the exam. A number of linguistic and text features were analysed.
Texts were found to be comparable in text length, surface features and lexical error rates. However,
there were differences in lexical variation and in the number of sentences and paragraphs produced.
It is recommended that these results be considered a starting point from which to further explore
text-level differences across writing modes, covering additional first languages, proficiency levels
and writing genres. Results from this and future studies can help inform rater training and provide
information for teachers and candidates. For more details on this study see Chambers (2008).

References
Chambers, L (2008) Computer-based and paper-based Writing assessment: a comparative text analysis,
Research Notes 34, 9–15.
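
Several of the text-level features compared in this summary (text length, lexical variation, and the number of sentences and paragraphs) can be approximated with simple counts. The Python sketch below is only an illustration: the tokenisation rules, the use of a type-token ratio as the measure of lexical variation, and the function name are assumptions, since the summary does not say how the study operationalised these features.

```python
import re

def text_features(text):
    """Rough surface counts for one script (illustrative definitions only)."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())          # word tokens
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    types = set(tokens)                                        # distinct word forms
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": len(types) / len(tokens) if tokens else 0.0,
        "sentences": len(sentences),
        "paragraphs": len(paragraphs),
    }

# Made-up two-paragraph script.
sample = "I like tea. I like coffee.\n\nTea is nice!"
print(text_features(sample))
```

Comparing such counts across matched paper-based and computer-based samples is one simple way to look for mode differences; a raw type-token ratio is sensitive to text length, so texts of very different lengths would need a length-corrected measure.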
This short summary is based on a Master’s thesis submitted to Anglia Ruskin University in 2008. The
research was funded by Cambridge ESOL. The MA was supervised by Dr Sebastian Rasinger.

This study applied Weir’s (2005) socio-cognitive framework to investigate context and cognitive
validity of the Writing component of a test of English in a business context. Cognitive validity was
investigated primarily through a small-scale, qualitative study which used verbal protocol analysis
to establish whether one of the test tasks activated the same cognitive processes as similar tasks
in the real-life workplace. Cognitive validity was found to be high. All three subjects displayed
the same five stages of cognitive processing in completing the test task and the real-life task.
However, there was no evidence in either task of a sixth stage identified in the above framework, in
which writers organise ideas in a pre-linguistic form. It seems probable that the lack of an
organisation phase is related to the brevity of the tasks rather than their English for Specific
Purposes (ESP) nature. The fine-grained processing operations of all three subjects were very
similar for both tasks in the translation and monitoring phases. Two of the three subjects displayed
very similar micro-planning operations on both tasks; however, micro-planning of the third subject’s
test task was influenced by a desire not to exceed the word limit specified in the task.
Consideration of the word limit also influenced one subject’s macro-planning of the test task, and
all three subjects engaged in considerably more macro-planning for the test task than their
real-life task. However, there was no evidence that macro-planning was affected by completing the
test task on paper rather than on computer. All three subjects engaged in similar revising activity
on both tasks. There was no evidence that the limits on major revisions to wording or structure that
apply when handwriting a test task resulted in different cognitive processing operations to a
word-processed task. For a summary of the part of the study that investigated the test’s context
validity (specifically, the linguistic demands the test made of the candidates who took it), see
Bateman (2009).

References
Bateman, H (2009) Some evidence supporting the alignment of an LSP Writing test to the CEFR, Research
Notes 37, 29–34.
Weir, C J (2005) Language Testing and Validation: An Evidence-Based Approach, Basingstoke: Palgrave
Macmillan.
This short summary is based on the report submitted as part of the requirements for an MA in TESOL
at the University of London in 1996. The thesis was supervised by Dr John Norrish.

My report Models of Supervision – some considerations was concerned with aspects of teacher
supervision. After summarising various historical approaches to teacher supervision and feedback, I
outlined some of the factors which need to be taken into account when evaluating the potential of
these different models, including the education/training debate, the issue of teacher evaluation and
the resulting roles that a supervisor may be called upon to carry out. The report went on to
consider two case studies – my experiences as a trainer on a pre-service certificate course at a
Further Education College in London, and as a supervisor at a secondary school in Malta during the
practicum of the Teacher Education and Training module of the MA. I explored the limitations and
successes of these two experiences and showed how the work I did in Malta modified my view of the
supervisory process and led me to draw some tentative conclusions about the advantages of a
non-evaluative, co-operative approach to teacher training.
This short summary is based on a Master’s dissertation submitted to the University of London
Institute of Education in 2009. It was supervised by Dr Amos Paran.

Communication and transparency are fundamental ideals underlying the Council of Europe Common
European Framework of Reference (CEFR). The CEFR has facilitated communication immensely, as
teachers, students, publishers, policy makers and examination boards all now make reference to the
CEFR levels. Transparency, however, presents a greater challenge, at least regarding language
certification. Although test users may presume that exams pegged to the same CEFR level are ‘in some
way equivalent’ or at ‘exactly the same level’ (COE 2009:4), this is not necessarily so, as the
Council of Europe (COE) encourages diversity. Moreover, interpretation of the CEFR specifications
varies considerably and no overseeing authority monitors claims of linkage. As students and aspiring
employees normally choose the certification exams recognised, required or offered by institutions
and employers, these latter must set their policies wisely.

This study suggested how institutional and professional test users may analyse and compare
certification exams linked to the CEFR using three sets of criteria: Weir’s (2005) socio-cognitive
validity framework to evaluate overall test validity, the CEFR scales to evaluate the extent to
which these are addressed in test tasks and the COE’s (2009) Manual for relating language
examinations to the CEFR to assess the validity of linkage to a CEFR level. To illustrate this
procedure, two 4-skills B1 certification exams in English for speakers of other languages were
compared: Cambridge ESOL’s Preliminary English Test and Trinity College London’s Integrated Skills
in English 1. The resulting analysis revealed that even exams that are similar in terms of their
characteristics, aims and recognition might not equally satisfy an institutional or professional
test user’s requirements.

References
Council of Europe, Language Policy Division (2009) Relating language examinations to the Common
European Framework of Reference for Languages: Learning, teaching, assessment (CEFR) A Manual,
Strasbourg: Language Policy Division.
Weir, C J (2005) Language Testing and Validation: An Evidence-Based Approach, Basingstoke: Palgrave
Macmillan.
This short summary is based on a doctoral thesis submitted to the University of Twente (Netherlands)
in 2010. The PhD was supervised by Professor Cees A W Glas.

The chapters in this thesis are self-contained; hence they can be read separately.

In Chapter 2, item bias or differential item functioning (DIF) is seen as a lack of fit to an IRT
model. It is shown that inferences about the presence and importance of DIF can only be made if DIF
is sufficiently modelled. This requires a process of so-called test purification where items with
DIF are identified using statistical tests and DIF is modelled using group-specific item parameters.
In the present study, DIF is identified using a Lagrange multiplier (LM) statistic. The first
problem addressed is that the dependency of these statistics might cause problems in the presence of
a relatively large number of DIF items. However, simulation studies show that the power and Type I
error rate of a stepwise procedure where DIF items are identified one at a time are good. The second
problem pertains to the importance of DIF, i.e. the effect size, and the related problem of defining
a stopping rule for the searching procedure. Simulations show that the importance of DIF and the
stopping rule can be based on the estimate of the difference between the means of the ability
distributions of the studied groups of respondents. The searching procedure is stopped when the
change in this effect size becomes negligible.

Chapter 3 presents measures for evaluating the most important assumptions underlying unidimensional
item response models, such as subpopulation invariance, form of item response function, and local
stochastic independence. These item fit statistics are studied in two frameworks. In a frequentist
MML framework, LM tests for model fit based on residuals are studied. In the framework of LM model
tests, the alternative hypothesis clarifies exactly which assumptions are targeted by the residuals.
The alternative framework is the Bayesian one. The posterior predictive check (PPC) is a much used
Bayesian model checking tool because it has an intuitive appeal, and
is simple to apply. A number of simulation studies are presented that assess the Type I error rates
and the power of the proposed item fit tests in both frameworks. Overall, the LM statistic performs
better in terms of power and Type I error rates.

Chapter 4 presents fit statistics that are used for evaluating the degree of fit between the chosen
psychometric model and an examinee’s item score pattern. A person fit statistic reflects the extent
to which the examinee answered test questions according to the assumptions and description of the
model. Frequentist tests such as the LM test and tests with Snijders’ correction (which take into
account the estimation of the ability parameter) are compared with PPCs. Simulation studies are
carried out using a number of fit statistics in a number of combinations in both frameworks.

In Chapter 5, a method based on structural equation modelling (or, more specifically, confirmatory
factor analysis) for examining measurement equivalence is presented. Top-down and bottom-up
approaches were evaluated for constructing nested models. A comprehensive comparative simulation
study is carried out to explore the factors that affect performance in detecting DIF items.
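
As an illustration of what a person fit statistic measures, the Python sketch below computes the standardised log-likelihood statistic l_z under a Rasch model. This choice is an assumption made for illustration: the summary does not say which person fit statistics the thesis studied. The sketch also treats ability as known, so it omits Snijders’ correction for estimated ability, and the item difficulties and response patterns are made-up values.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def lz_statistic(responses, ability, difficulties):
    """Standardised log-likelihood person fit statistic l_z.

    responses: list of 0/1 item scores; ability is treated as known
    (no correction for estimation error is applied in this sketch).
    """
    loglik = expected = variance = 0.0
    for x, b in zip(responses, difficulties):
        p = rasch_probability(ability, b)
        q = 1.0 - p
        loglik += x * math.log(p) + (1 - x) * math.log(q)
        expected += p * math.log(p) + q * math.log(q)
        variance += p * q * math.log(p / q) ** 2
    if variance == 0.0:
        raise ValueError("l_z is undefined when every item has p = 0.5")
    return (loglik - expected) / math.sqrt(variance)

difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical item difficulties
consistent = [1, 1, 1, 0, 0]                 # easy items right, hard items wrong
aberrant = [0, 0, 1, 1, 1]                   # easy items wrong, hard items right
print(lz_statistic(consistent, 0.0, difficulties))
print(lz_statistic(aberrant, 0.0, difficulties))
```

Large negative values indicate a score pattern that is unlikely under the model, which is the kind of misfit a person fit test is meant to flag.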
Germany, Spain and the UK signed up for the ALTE summer testing courses which took place from 20 to
24 September, and from 27 September to 1 October. These courses were hosted by the Basque
Government, ALTE’s Basque member, at the Royal Academy of the Basque Language in Bilbao. The first
course was an Introductory Course in Language Testing run by Professor Cyril Weir and Dr Lynda
Taylor, and the second was an Introduction to Testing Reading run by Dr Hanan Khalifa and Dr Ivana
Vidaković from Research and Validation.

Later in the year, ALTE’s 39th meeting and conference will take place at the Charles University in
Prague from 10 to 12 November. As at previous meetings, the first two days will include a number of
Special Interest Group meetings, and workshops for ALTE members and affiliates, and the third day
will be an open conference day for anyone with an interest in language testing. The theme of the
conference is ‘Fairness and Quality Management in Language Testing’ and the speakers at the
conference will include Professor Antony Kunnan and Dr Piet van Avermaet, as well as Dr Neil Jones,
Juliet Wilson and Mike Gutteridge from Cambridge ESOL. Juliet, Mike and Dittany Rose will also run
workshops on the two days prior to the conference day.

Just prior to the Prague conference, ALTE is launching the first of its Tier 3 language testing
courses with a 2-day course on The Application of Structural Equation Modelling (SEM) in Language
Testing Research on 8 and 9 November. The course will be run by Dr Ardeshir Geranpayeh from Research
and Validation. This is an advanced course in language testing (ALTE Tier 3) and is aimed at
experienced and knowledgeable language testing professionals. The Tier 3 courses complement the
Foundation Courses (Tier 1) and Introductory Courses (Tier 2) which are already well established.
Following the conference, on 13 November, ALTE will continue its programme of Foundation Courses
when Annie Broadhead will run a general introduction to language testing.

The call for papers for the ALTE 4th International Conference to be held in Kraków, Poland from 7 to
9 July 2011 is already open and will run until the end of January 2011. We encourage you to submit a
paper for the conference, and reflecting ALTE’s commitment to multi-lingualism, papers can be
submitted in English, French, German, Italian, Polish and Spanish. The theme of the conference is
‘The Impact of Language Frameworks on Assessment, Learning and Teaching viewed from the perspectives
of policies, procedures and challenges’ and the plenary speakers are Professor Lyle Bachman,
Professor Giuliana Grego Bolli, Dr Neil Jones, Dr Waldemar Martyniuk, Dr Michaela Perlmann-Balme and
Professor Elana Shohamy.

For further information about these events and other ALTE activities, please visit the ALTE website
– www.alte.org

Studies in Language Testing

The last 12 months have seen the publication of three more titles in the Studies in Language Testing
series, published jointly by Cambridge ESOL and Cambridge University Press. Volume 29, authored by
Hanan Khalifa and Cyril J Weir, is entitled Examining Reading: Research and practice in assessing
second language reading. This volume develops a theoretical framework for validating tests of second
language reading ability. The framework is then applied through an examination of the tasks in
Cambridge ESOL Reading tests from a number of different validity perspectives that reflect the
socio-cognitive nature of any assessment event. The authors show how an understanding and analysis
of the framework and its components can assist test developers to operationalise their tests more
effectively, especially in relation to the key criteria that differentiate one proficiency level
from another.

Key features of the book include: an up-to-date review of the relevant literature on assessing
reading; an accessible and systematic description of the different proficiency levels in second
language reading; and a comprehensive and coherent basis for validating tests of reading. This
volume is a rich source of information on all aspects of examining reading ability. As such, it will
be of considerable interest to examination boards wishing to validate their own reading tests in a
systematic and coherent manner, as well as to academic researchers and graduate students in the
field of language assessment more generally. This is a companion volume to the previously published
Examining Writing (Shaw & Weir 2007).

Volume 31, co-edited by Lynda Taylor and Cyril J Weir, is entitled Language Testing Matters:
Investigating the wider social and educational impact of assessment – Proceedings of the ALTE
Cambridge Conference, April 2008. It explores the social and educational impact of language testing
and assessment, at regional, national and international level, by bringing together a collection of
20 edited papers based on presentations given at the 3rd international conference of the Association
of Language Testers in Europe (ALTE) held in Cambridge in April 2008.

The selected papers focus on three core strands addressed during the conference. Section One
considers new perspectives on testing for specific purposes, including the key role played by
language assessment in the aviation industry, in the legal system, and in migration and citizenship
policy. Section Two contains insights on testing policy and practice in the context of language
teaching and learning in different parts of the world, including Africa, Europe, North America and
Asia. Section Three offers reflections on the impact of testing among differing stakeholder
constituencies, such as the individual learner, educational authorities, and society in general.

Key features of the volume include: up-to-date information on the impact of language testing and
assessment in a wide variety of social and educational contexts worldwide; accounts of recent
research into the profiling of language proficiency levels and into cheating in tests; insights into
new areas for testing and assessment, e.g. teacher certification, examinations in L2 school systems,
testing of intercultural competence; discussion of the relationships among different test
stakeholder constituencies.

With its broad coverage of key issues, combining theoretical insights and practical advice, this
volume is a valuable reference work for academics, employers and policy-makers in Europe and beyond.
It is also a useful resource for postgraduate students of language testing and for practitioners,
i.e. teachers, teacher educators,
curriculum developers, materials writers, and anyone seeking greater understanding of the social and
educational impact of language assessment.

July 2010 saw the publication of another title in the Studies in Language Testing series, published
jointly by Cambridge ESOL and Cambridge University Press. Volume 32, by Toshihiko Shiotsu, is
entitled Components of L2 Reading: Linguistic and processing factors in the reading test
performances of Japanese EFL learners.

This latest volume investigates the linguistic and processing factors underpinning the reading
comprehension performance of Japanese learners of English. It describes a comprehensive and rigorous
empirical study to identify the main candidate variables that impact on reading performance and to
develop appropriate research instruments to investigate these. The study explores the contribution
to successful reading comprehension of factors such as syntactic knowledge, vocabulary breadth and
reading speed in the second language.

Key features of the book include: an up-to-date review of the literature on the development and
assessment of L1 and L2 reading ability; practical guidance on how to investigate the L2 reading
construct using multiple methodologies; and fresh insights into interpreting test data and
statistics, and into understanding the nature of L2 reading proficiency. This volume will be a
valuable resource for academic researchers and postgraduate students interested in investigating
reading comprehension performance, as well as for examination board staff concerned with the design
and development of reading assessment tools. It will also be a useful reference for curriculum
developers and textbook writers involved in preparing syllabuses and materials for the teaching and
learning of reading.

Information on all the volumes published in the SiLT series is available at:
www.CambridgeESOL.org/what-we-do/research/silt.html