Hoadjli's PhD Thesis

The thesis explores the washback effect of an alternative testing model on teaching and learning in English as a Foreign Language (EFL) secondary classes in Biskra, Algeria. It identifies issues with the current testing system, which often leads to negative washback, and proposes an Alternative Testing Model (ATM) to improve educational outcomes. The study aims to investigate the perceptions of teachers and students regarding the ATM and its impact on their behaviors and attitudes towards language learning and assessment.

The People’s Democratic Republic of Algeria

Ministry of Higher Education and Scientific Research

Mohamed Kheider University of Biskra

Faculty of Letters and Foreign Languages

Department of Foreign Languages

Section of English

Thesis submitted to the Department of Foreign Languages in Candidacy for the Degree of Doctorate in Language and Civilization

The Washback Effect of an Alternative Testing Model on Teaching and Learning:
An Exploratory Study on EFL Secondary Classes in Biskra

Candidate: HOADJLI Ahmed Chaouki

Supervisor: Dr. HAMLAOUI Naima

Board of Examiners:

Chair: Pr. HAMADA Hacène - ENS Constantine.

Supervisor: Dr. HAMLAOUI Naima - University Badji Mokhtar, Annaba.

Examiner: Pr. MELLOUK Mohamed - University Djilali Liabes, Sidi-Bel-Abbes.

Examiner: Pr. MERBOUH Zouaoui - University Djilali Liabes, Sidi-Bel-Abbes.

Examiner: Dr. BOUHANIA Bachir - University Ahmed Draia, Adrar.

March 2015
Contents

Dedication
Acknowledgements
Abstract
List of Abbreviations and Acronyms
List of Appendices
List of Tables
List of Figures
General Introduction
Chapter One: Washback in Language Testing
Introduction
1.1 Washback: Origins and Definitions
    1.1.1 Washback: Origins
    1.1.2 Washback: Definition
1.2 Types of Washback
    1.2.1 Negative Washback
    1.2.2 Positive Washback
1.3 Functions and Mechanisms of Washback
    1.3.1 Functions of Washback
    1.3.2 Mechanisms of Washback
1.4 Washback: Empirical Studies
1.5 Washback: Lever of Innovation in Education
    1.5.1 Innovation: Definition
    1.5.2 The Process of Innovation
1.6 A Washback Model for the Present Study
    1.6.1 Washback on Participants
    1.6.2 Washback on Process
    1.6.3 Washback on Products
Conclusion

General Introduction

It may seem paradoxical but tests are not sanctions. They should be looked at as a

rewarding experience. In the past, language tests used to be regarded as the students’ “bête

noire”. Students did not enjoy taking tests and teachers did not enjoy marking them.

Nowadays, more focus is put on the relevance of language tests to the teaching operation.

Instead of being a separate subject that frequently takes place at the end of a course, a term or

an academic year, language tests have become an integral part of the curriculum; that is, they

are seen as a learning experience, which is part of the on-going course of study.

More importantly, language tests are now regarded as a valuable tool for providing

information that is relevant to several concerns in language teaching. They can be one way of

providing systematic feedback for both teachers and students. The teacher can see how well

or badly the students are performing and check for any discrepancies between expectations

and actual performance. Likewise, the students can see how much progress

they are making in learning the language.

Language tests can also be a good means of evaluating instructional materials and tasks

and their relevance to the educational goals. Ideally, the goals of language tests and/or test-

items should be clear to students, so that they need not spend time guessing what the teacher

means. If the students perceive the tests as relevant to their needs, they themselves are

probably going to engage more actively in the process of dealing with them.

Another aspect of language tests concerns the insights and inferences a teacher often

draws from the outcomes of language tests. The usefulness of such inferences is manifested

when they provide feedback to be utilized in making the teaching programme more effective

and when they provide information about the kind of materials and tasks students need. These

inferences can be the only ground on which teachers can make appropriate decisions about the

teaching operation.

To be able to achieve its aims, a language test must meet the requirements of some

fundamental test-qualities, such as validity, reliability, and practicality. In other contexts, it

should be authentic, interactive, and should have an impact on all the concerned parties.

Certainly, these considerations will vary from one situation to another because what might be

appropriate in a given case might not be so in another. Because of this reality, an

understanding of how one designs and develops a language test is crucial.

Furthermore, an important assumption behind the numerous uses of language tests is

that a language test exerts an influence on both teaching and learning. It can extend beyond these

levels to reach even broader areas in the educational setting. In the assessment literature,

there is now a clear consensus that the impact of a test on teachers and students is termed

washback. For many specialists, the concept of washback direction encapsulates the

principles that some effects of a language test may be beneficial, while some other effects

may be harmful. Positive washback is often seen as encouraging learning, and, conversely,

negative washback usually inhibits the attainment of the educational goals held by learners.

In this sense, washback is judged positive or negative according to how far it enhances or

reduces forms of teaching and learning evaluated to be appropriate. Of course, what is

considered appropriate will depend on the instructional goals one espouses.

Given the power of the decisions they can inform, language tests are key

components in projects to introduce change in education through interventions. It is argued that a

language test can stand as an effective instrument to reform or innovate an educational or a

language testing system. In line with this assumption, research on washback has

demonstrated that a language test can also be a way by which teachers’ and students’

behaviours and perceptions of their own abilities can change. In addition, a testing instrument

may influence the content and methodology of teaching programmes, attitudes towards the

value of certain educational goals and activities; in the long term, it may even serve the

needs of society as a whole. Given this potential power, a language test, as an innovative act,

can thus become a useful source for contending participants who seek to bring about a new

vision of what education in general and instruction in particular should be.

In Algeria, evidence from assessment practices in the school system is scanty and such

evidence is not of sufficient quality to support the inferences that tests are expected to yield in relation

to teaching and learning. It can be argued that the current testing system used by teachers in

the Algerian educational system may be developed under pressure to reflect more closely,

and to sustain desired educational goals. There is clear evidence that these practices are

meant to exploit the format and content of developed tests to improve the final outcomes

quickly and efficiently. In reality, there is something wrong in the way these tests are

conceived. The assessment practices do not consider the fundamental theoretical and

procedural constructs of tests and the connection that should exist between teaching and

learning and the mechanisms by which students have to be assessed. In this spirit, it is

assumed that the employed traditional testing system leads to inappropriate or outmoded

forms of inferences which often fail to keep pace with the requirements of pedagogy.

Therefore, if improvement in the educational system is to be realised, the testing system

should be used principally in support of teaching and learning, rather than in producing

limited outcomes for the sake of just passing or failing. The underlying objective in this

context should be to support the learning process. Thus, the aim of

assessment is to develop and improve students’ achievement and progress in learning the

target language, and not just to measure their performance in some given skills and language

components. Of course, this can be attained only when testing is connected to the teaching and

learning process through feedback. In addition, teachers

should ensure appropriate standards of test design and development and content strategies

with the intention to relate their tests more closely to valued classroom behaviours as these

are the underlying assumptions of the educational objectives set out in the official syllabuses

suggested for the teaching of English language in the Algerian schools. In short, achievement

tests should mirror the best practices of teachers, so that test practice will involve students in

activities which will develop a full range of skills in the target language.

1. Statement of the Problem

From our experience of 10 years teaching the English language at the secondary school

level, we observed that the testing practices, through the available assessment instruments, did

not provide specific information about students’ achievement and progress in

English. The current testing system employed by teachers in Algerian schools relied

either on copies of the “Bac” exam model or on merely intuitive tests constructed without

reference to any theoretical bases or operational procedures. Mistakenly, teachers attempt

to build test content on the basis of previously existing tests, believing that such a

practice will improve students’ scores. It is apparent that there is not much congruence

between test-content and the contents of the syllabuses. The syllabuses, very often, are based

on the integration of the four skills and the development of competencies; but, when it comes

to assessment, our observation with regard to this issue has shown that tests assess one

or two skills and neglect the others. For instance, the didactic unit presented to

students throughout the learning cycle starts with a listening phase, but students are never

tested on that skill. The same remark can be made about the speaking skill.

This study was stimulated by another observation in the field of language testing in the

Algerian context. A dangerous phenomenon has cropped up: the scores have lost their

credibility in judging the actual degree of achievement and progress of students. Those who

might seem bright are not really the best, while those with low scores are often wrongly believed to be

weak, as far as the learning of the English language is concerned. The explanation for this

phenomenon may lie in a misconception of language test design and development. As

a result, a negative washback effect has emerged. Instead of designing and developing a test

that assesses the degree of mastery of the content of the syllabus and shows to what

extent the educational goals have been reached, teachers have become mere trainers of

students in how to respond mechanically to the types of questions and activities that are

frequently included in the “Bac” exam papers. This system has pushed these teachers to

considerably reduce the time available for instruction, restrict the range of the curriculum,

limit their teaching methods, and potentially give up their freedom to teach content or use

methods that are believed to be incompatible with the examination format.

As pointed out earlier in this section, and speaking as a former English teacher, what bothered us

most about English language teaching in Algerian secondary schools is that teachers

in this context do not realise how important assessment has been in shaping the current

teaching and learning situation. We consider the situation a serious case of negative

washback of an external standardized examination on the processes of teaching and learning.

It is within this framework that we conceived our work in order to understand the

reasons for the set of anomalies observed in the current testing system in the context under

exploration. The intention is not to criticize this system for the sake of criticizing it, but it is

believed that this study would be an opportunity to judge the mechanisms by which the

Algerian teachers come to evaluate and assess English language learners, and highlight

common errors that currently occur in their assessment practices. To remedy this situation,

we found it more appropriate to suggest an Alternative Testing Model (ATM) that seeks to

overcome the deficiencies diagnosed in the current testing model, and to repair the

shortcomings that are thought to be one of the sources of the decline in learning English

language. To be able to succeed in this new project, there is a need to conceive the

relationship that ought to exist between teaching, learning, and testing as one of partnership.

Each element in this equation has a tremendous role to play for the ultimate goals traced in

this research.

2. Research Questions

Based on the statement of the problem above, the study explores these research

questions:

RQ1: What is the nature and scope of the washback effect on teachers’ behaviours of
aspects of teaching English language in the Algerian secondary schools due to
the use of the current testing system?
RQ2: What is the nature and scope of the washback effect on students’ perceptions
of aspects of teaching English language in the Algerian secondary schools due
to the use of the current testing system?
RQ3: What strategies does one need to implement the Alternative Testing Model
with EFL classes at the Algerian secondary school level?
RQ4: What is the nature and scope of the washback effect on teachers’ behaviours as
a result of the Alternative Testing Model implementation?
RQ5: What is the nature and scope of the washback effect on students’ perceptions
as a result of the Alternative Testing Model implementation?

3. Aims of the Study

The general purpose of the present study is to explore how those involved in teaching

and learning of English in the Algerian secondary schools perceive themselves to be affected

by the implementation of the Alternative Testing Model. More specifically, this study aims at:

1. investigating the phenomenon of washback effect in light of using the ATM.

2. understanding how the main participants at the micro-level (teachers and

students) reacted to the implementation of the ATM. In other words, the study

offers the teachers and students an opportunity to voice their opinions to

decision makers for a serious consideration of the issue of testing. Such a

purpose can provide a new solid ground for more consistent assessment reforms

in the Algerian educational system.

3. exploring the nature of washback effect on aspects of teachers’ and students’

perceptions of the new model, and teachers’ behaviours within the context of

English in the Algerian secondary schools.

4. Hypotheses

On the basis of the research questions formulated before, we hypothesised that:

a. A useful achievement test for EFL secondary classes in Algeria will positively

influence teaching and learning; and, conversely, an achievement test for EFL

learners that is not useful will negatively influence teaching and learning.

b. A useful achievement test will influence attitudes towards the content and

method of teaching and learning.

c. An achievement test that has important consequences will have positive

washback; and, conversely, an achievement test that does not have important

consequences will have no washback.

5. Research Methodology

Because the intention of this study is an exploration of the washback effect of an ATM

on teaching and learning for EFL learners at different levels, focusing on perceptions, values,

and situational factors in the complex and varying situation of the Algerian secondary

schools, the design of the study needs to take into account all variables that are concerned

with the different facets of teaching and learning where such a phenomenon may occur.

In order to draw a picture of the context where the current testing practices take place

and find out the differences of this situation with the intended changes that might occur with

the ATM implementation, the participants’ perceptions, attitudes, and opinions were

surveyed in two phases: (1) before and (2) after the introduction of the new

testing policy. Accordingly, a mixed-methods approach to data collection seemed

appropriate; no single method is able to explain such a complex phenomenon as different

methods have different strengths and weaknesses. Thus, using a range of methods can be

considered the best way to understand the problem.

A mixed-methods approach, driven by both qualitative and quantitative data,

would allow these methods to be refined and checked in context. As a result, the

research methodology in the present study relies on questionnaires, interviews, classroom

observations, and focus groups as data collection methods. These methods are seen to

complement one another and could be relatively integrated in practice.

In order to verify the results and check their validity, the widely

used statistical software SPSS (Statistical Package for the Social Sciences) is employed as an

additional validating instrument in this study. From a research point of view, SPSS serves

as a complement to the methods used, helping to corroborate the results. From a statistical point of

view, its application allows the researcher to validate the findings of some

methods by cross-checking them against those of other methods.

6. Structure of the Thesis

Besides a general introduction and a general conclusion, the thesis is divided into 6

chapters. Chapter 1 provides background information and the literature review on the central

research concept of ‘washback’ in this study. It also presents its implications for innovation in

education and provides a washback model for the present study.

Chapter 2 identifies the research methodology adopted in the current study. The

chapter describes the research methodology, research strategies, and data collection methods.

Chapter 3 describes how the teacher-participants in this study perceive the current

testing system they use to assess and evaluate their students’ achievement and progress. In

this chapter, the research design highlights the research questions and purposes and delineates

the procedures used to develop the research instruments in the Preliminary Study. This

includes procedures for validating and increasing reliability of the data collection methods,

and the procedures for collecting data.

Chapter 4 describes the proposed ATM, i.e., it displays the components of the new

testing system. The content and format of the three sub-tests that form an achievement test

based on the new model are discussed. The chapter ends with a description of the time

framework and scoring procedure this model utilises.

Chapter 5 discusses the results obtained after the implementation of the ATM. The

reactions of the participants to this innovation are shown in this chapter. As with the

Preliminary Study, the design of the Final Study is displayed. This concerns the data

collection methods used, the kind of respondents for each research instrument, and the

data collection and analysis procedures. The chapter ends with a discussion and summary of

the findings.

Chapter 6 synthesizes the findings from the pre-implementation to the post-

implementation time of the new testing model. In addition, the chapter devotes a section to

the implications and recommendations of the study for those parties involved in the teaching

and learning of English, and limitations of the study.

Chapter One: Washback in Language Testing

Introduction

This chapter centres on the underpinnings that shaped and guided this

research. First, the chapter highlights the origins of washback as a recent concept that has

emerged in language education in general and language testing in

particular. Second, it provides a definition of this concept and its related constructs as it is

conceived by most language testing specialists. Then, it identifies the different types of

washback. Following this section, the chapter displays the functions and mechanisms of the

concept under exploration. Next, washback as a lever to innovation and change in educational

settings is discussed. Finally, this chapter ends with a potential washback model for the

present study that investigates how the different components that make up washback are

structured. It also provides an ideal opportunity to understand how new testing systems are

introduced into different educational systems.

1.1 Washback: Origins and Definitions

In recent years, research on washback has led to a greater understanding of this construct in the

testing literature. In this section, a review of the literature related to the origins and

definitions of this concept will be presented. In addition, drawing on this literature review, the different

points of view about what this construct may encompass will be thoroughly

discussed.

1.1.1 Washback: Origins

Although the subject of the effects of examination has long been discussed in the

literature of General Education (Kellaghan et al., 1982; Vernon, 1956), and has been looked

at from different points of view (Madaus, 1988; Fredericksen, 1984), it has been common in

the testing literature that the concept of ‘washback’, as it is known now, has come to attract

the attention of testing and assessment researchers only at the beginning of the 1990s. Before that,

applied linguists used different terms to refer to the idea of examination influence. Some of

these terms included ‘test impact’ (Bachman & Palmer, 1996; Baker, 1991), ‘systemic

validity’ (Messick, 1989), ‘measurement-driven instruction’ (Popham, 1988), ‘curriculum

alignment’ (Shepard, 1990), ‘backwash’ (Biggs, 1993), and possibly other terms.

Among this set of terms, two of them dominated the scene of the issue of examination

influence. Reference here is to the two concepts: ‘backwash’ and ‘washback’. Testing

specialists have quickly admitted that in order to avoid any sort of confusion in terms of the

adoption and use of the appropriate terminology, it would be better to assume that these two

concepts can be used interchangeably, and, therefore there is no need to place a clear

distinction between, on the one hand, ‘washback’, and, on the other, ‘backwash’ in language

testing uses and practices. In this sense, Hamp-Lyons (1997) corroborated the idea and noted

that “washback is one of the set of forms that has been used in language education and

language testing to refer to a number of beliefs about the relationship between testing and

learning” (295). She goes on to add that “another set of terms is ‘backwash’, but it would

appear that the terms ‘backwash’ and ‘washback’ are used interchangeably in the field”

(ibid). To confirm this interpretation of the two concepts, Hughes (1993) points out that there

is an interchangeable use of the two terms. He makes it more explicit when he states that

“where washback comes from, I don't know. What I know is that you can find backwash in

dictionaries, but not washback” (57). However, in another context, Cheng and Curtis (2005)

prefer the term ‘washback’ and not ‘backwash’ since they think that ‘washback’ is the

concept that is frequently found in applied linguistics in general and language testing

literature in particular. In short, this brief review of the distinction between ‘washback’

and ‘backwash’ leads one to believe that these two terms do not differ in any

meaningful way; rather, the great majority of testing researchers who have dealt with this matter

have agreed that both terms carry the same sense, and hence

the two can be used interchangeably.

Language testing researchers have realized that the emergence of this concept is the

result of the considerable reforms and advances that have taken place in the field of language

testing during the last two decades of the twentieth century. Indeed, it has been

assumed that one of the areas that was actively discussed in that period of time was the

influence of tests on both instruction and learning. Cheng (2005) indicates that the subject of

examination influence was rooted in the notion that tests are often seen to drive teaching and

learning. She argues that, in order to realize what she refers to as measurement-driven

instruction, there is a pressing need to match the construct

of the test with what teachers present in instruction. In other words, the

clearer the fit between test content and teaching is, the greater the potential improvement will

be on the test. In a different approach, Messick (1989) placed the concept of washback in a

broader scope of construct validity. He claims that this construct encompasses a set of aspects

about testing such as the impact of tests on language test takers and teachers, the

interpretation of scores by decision makers and the intended uses of these scores. In such a

view, the concept washback stands as an inherent quality of any kind of assessment,

especially when test takers’ futures are affected by examination results.

In a comprehensive study of how washback has come to exist as an

important research concept in the field of language teaching and testing, Tsagari (2006)

proposed an artificial time framework divided into three different but successive phases: (i)

the 'pre-1993' phase, (ii) the '1993' phase, and, finally, (iii) the 'post-1993' phase. First, she

labelled the 'pre-1993' phase the 'myth' phase. She identified it as the period of time when

writers recognized the examination influence phenomenon but no one accounted for it. What

is noticeable in this era is that few empirical studies were carried out and published for the

language testing community (Wesdorp, 1982; Hughes, 1988), which made strong claims of

the absence of this phenomenon. Most of the available studies in this period were merely

based on self-report data or on test results rather than direct contact with the

participants involved. Second, the '1993' phase was markedly different from the previous one

since it was characterised by the publication of a seminal paper by two

prominent language testing researchers, Alderson and Wall, who are credited with

being the first to question the nature of examination influence. More crucially, the

authors managed to reconceptualise this phenomenon by suggesting a set of relevant

hypotheses. Finally, the third, 'post-1993' phase, or, as Tsagari named it, the 'reality phase', was

recognized as a new era in which substantial research projects on washback

used developed models to dissect and explain the various components that

make up this concept.

To summarize, although relatively little has been written about the origins of washback,

a great deal of information has emerged about various concepts that refer to examination

influence. Based on this review, it is not an exaggeration to say that the study of the origins of

washback is crucial to shape the scope of further needed research in this area. This matter

should be treated as a direct consequence of other educational studies that targeted the

investigation on the relationship between learning, teaching, and testing.

1.1.2 Washback: Definition

In testing research and literature, definitions of washback are numerous. These

definitions vary from simple and straightforward to complex. Some take a narrow focus on

teachers and students, while others extend to educational systems and

society in general. Some definitions stress intentionality whereas others insist that washback occurs

haphazardly (Bailey, 1999 : 3). In this subsection, a discussion of these definitions and how

they are similar to or different from one another will be presented.

Many applied linguists have indicated that the concept washback is rarely found in

language dictionaries. The few available definitions can be found in dictionaries such as 'The

New Webster's Comprehensive Dictionary', which describes washback as “the unwelcome

repercussions of some social actions”, and the ‘Collins Cobuild Dictionary of

English’, which defines washback as “the unpleasant after-effects of an event or situation”.

Apart from these two examples, a meticulous search of other common language dictionaries

in English has shown that there does not exist an explanation or indication of the term

washback as it is generally known in the testing literature in the present time.

Unlike the rare definitions found in language dictionaries, a great number of other

definitions of the concept washback are present throughout the published assessment research

and literature with various meanings. In a paper on testing listening comprehension, Buck

(1988) describes the apparent effect of Japanese university entrance examinations on English

language learning in Japan. In this description of washback, he says:

There is a natural tendency for both teachers and students to tailor their classroom
activities to the demands of the test, especially when the test is very important to the
future of student, and pass rates are used as a measure of teacher success. This
influence of the test on the classroom (referred to as washback by language testers) is,
of course important; this washback effect can be either beneficial or harmful (17).
In this sense, Buck's definition emphasises the importance of what teachers and

students do in classrooms.

In another definition, Messick (1996) considers washback as a prominent concept in

applied linguistics. For him, this term refers to the extent to which the introduction and the use

of a test influence language teachers and learners to do things they would not otherwise do

that promote or inhibit language learning. Shohamy (1992) also focuses on washback in

terms of language learners as test-takers when she describes, “the utilisation of external

language tests to affect and drive foreign language learning in the school context” (15). She

points out that “this phenomenon is the result of the strong authority of external testing and

the major impact it has on the lives of test takers” (ibid). To corroborate this belief, Shohamy

cites the example of the introduction of an oral proficiency test based on an interview in the

United States. She says that this example involves “the power of tests to change the behaviour

of teachers and students” (Shohamy: 514).

Bailey (1999) refers to Shohamy who summarized four key definitions that are crucial

to understand the washback concept. Accordingly,

1. Washback effect refers to the impact tests have on teaching and learning.

2. Measurement driven instruction refers to the notion that tests should drive

learning.

3. Curriculum alignment focuses on the connection between testing and the teaching

syllabus.

4. Systemic validity implies the integration of tests into the educational system and

the need to demonstrate that the introduction of a new test can improve learning.

Cheng (2005) converges to some extent with Shohamy's (1992) ideas. She relies on

Pearson (1988) to show the influence of external examinations on the attitudes, behaviours,

and motivations of classroom teachers, learners, and even on other broader, related areas of

this research concept. Similarly, Cheng (1997) introduced the concept of intensity, “the

degree of washback effects in an area or a number of areas of teaching and learning affected by

an examination” (43). Cohen (1994) also takes a broad view. He describes washback in terms

of “how assessment instruments affect educational practices and beliefs” (41). In the same

vein, Pierce (1992) suggests that “the washback effect, sometimes referred to as the systemic

validity of a test, refers to the impact of a test on classroom pedagogy, curriculum

development, and educational policy” (687).

However, a number of researchers in general education have reported that before

yielding an accurate definition of the concept washback in its broader sense, there is a crucial

need to clearly draw the distinction between washback and the related, often confused concept of

impact in language testing. On this matter, Tsagari (2006), in a study on washback, explores

the relationship and/or distinction between these two major concepts: washback and impact.

Tsagari argues that the common view which prevails in the field of language assessment considers

washback as one dimension of impact. The latter is often used to describe effects on the

wider educational context. Tsagari goes back to Wall (1995) to discuss in some detail the

existing relationship between washback and impact. Wall suggests that “washback is

frequently seen to refer to the effects that tests may have on teaching and learning, whereas

impact deals with the effects that tests may have on individuals, policies, and practices,

within the classroom, the school, the educational system, or even societies as a whole” (16).

Following this interpretation, Tsagari (2006) recognizes that this view intersects to a great

extent with other writers' explanations, such as McNamara (2000) and Shohamy et al. (1996)

who place washback within the scope of impact.

These ideas are revisited in Bachman and Palmer (1996 : 35). They note that washback,

however, is a more complex phenomenon than simply the effect of a test on teaching and

learning. Instead, they feel that the impact of a test should be evaluated with reference to

contextual variables of society's goals and values, the educational system in which the test is

used, and the potential outcomes of its use. They referred to these uses at two levels:

• a micro-level, in terms of individuals who are affected by the particular test uses,

especially test-takers and teachers, and;

• a macro-level, in terms of society and educational systems.

Bailey (1996) adopts “a holistic view on washback, but prefers to consider overall

impact in terms of ‘washback to the learners’, and ‘washback to the program’, counsellors, etc.”

(263-264). For Cain (2005), although the two terms washback and impact are, in

many cases, used interchangeably, test impact more accurately refers to the wider implications

and effects of a given test use. Andrews (1994 : 37), writing on washback, appears to

acknowledge the fragility of the washback-impact distinction. He reports that:

The term washback is interpreted broadly ... washback refers to the effects of tests on
teaching and learning, the educational system, and the various stakeholders in the
educational process, whereas the word impact is used in a non-technical sense, as a
synonym of effect.

Hawkey (2006) comments on the fact that the concepts washback and impact are often

considered in terms of their:

• logical location,

• definition scope,

• positive and negative implications,

• intentionality,

• complexity,

• direction,

• intensity, emphasis,

• relationship with validity and validation,

• relationship with testing critical view, and,

• the role in washback/ impact models.

Alderson and Wall (1993) also discussed the notion of washback, and tried to identify

what washback was. The authors reviewed the concept as it has been presented by language

specialists up to that time. They concluded that the concept was too vaguely defined to be

useful and much of what had been said and written about this concept had been based on

assertion rather than empirical findings. As a response to this claim, they presented a number of

'washback hypotheses', which were meant to illustrate some of the effects that tests might

have on teaching and learning. They argued that test developers should specify the types of

impact that they wished to promote and the kinds of effects test evaluators should look for

when deciding whether or not the desired washback has occurred (Wall, 2005: 51). The
washback hypotheses they presented stated:

1. A test will influence teaching.

2. A test will influence learning.

3. A test will influence what teachers teach; and

4. A test will influence how teachers teach; and

5. A test will influence what learners learn.

6. A test will influence how learners learn.

7. A test will influence the rate and sequence of teaching; and

8. A test will influence the rate and sequence of learning.

9. A test will influence the degree and depth of teaching; and

10. A test will influence the degree and depth of learning.

11. A test will influence attitudes to the content, method, etc. of teaching and

learning.

12. Tests that have important consequences will have washback; and conversely,

13. Tests that do not have important consequences will have no washback.

14. Tests will have washback on all learners and teachers.

15. Tests will have washback effects for some learners and some teachers, but not

for others.

Alderson and Wall proposed these hypotheses as a result of their own extensive work in

Sri Lanka and reviewing case studies conducted in Nepal (Khaniya, 1990), Turkey (Hughes,

1988), and the Netherlands (Wesdorp, 1982).

After reviewing numerous definitions of the concept washback, it is evident that this

concept is open to a variety of explanations and that there are a number of variables one needs

to consider when conducting research on this subject. Crucially, what comes out from this

discussion is that washback can be defined according to two major scopes: one following a

narrower definition which focuses on the effects that a test has on teaching and learning, and

the other following a wider and more holistic view of washback that extends beyond the

classroom to take into account the educational system and society at large, which, as

noted earlier in this section, would be more accurately referred to as test impact. In this

connection, Hamp-Lyons (1997) summarises the situation and the terminology well. She

finds that Alderson and Wall's limitation of the term washback to influence on teaching,

teachers, and learning seems now to be generally accepted, and the discussion of the wider

influences of tests is considered under the term impact, which is the term used in the wider

educational measurement literature. In a similar view, the adoption of Bachman and

Palmer's definition, which refers to issues of test use and social impact as 'macro' issues of

impact, while washback takes place at the 'micro' level of participants, particularly teachers

and learners, sounds the most acceptable.

It is not within the scope of the present study to look in detail at the wider implications

of testing. Rather, in the context of this research, and since this work is exploratory in nature,

the washback adopted will be primarily concerned with the area identified by Alderson &

Wall (1993) and Bachman & Palmer (1996), i.e., ‘washback to teaching’ and ‘washback to

learning’. In other words, the researcher will adopt the narrow definition of washback

focusing more on the washback at the micro-level that investigates the effects of a suggested

ATM on teachers and students. Besides, he will try to be consistent in the use of the two

terms ‘washback’ and ‘impact’. In this respect, he uses ‘washback’ to cover influences of

language tests on language learners and teachers, language learning and teaching processes

and outcomes. In the same vein, he uses ‘impact’ to cover influences of language tests on

stakeholders beyond language learners and teachers.

1.2 Types of Washback

Assessment studies have indicated that washback very often implies movement in a

particular direction. This movement in a particular direction is an inherent part of the use of

this term to describe the teaching-testing relationship. Hsu (2009) referred to Pearson

(1988) who pointed out that:

Public examinations influence the attitudes, behaviours, and motivation of teachers,
learners, and parents, and because examinations often come at the end of the course,
this influence is seen working in a backward direction, hence, the term washback
(98).

Besides, washback has also been perceived as bipolar: either negative (harmful) or

positive (beneficial). Messick (1996) cites Alderson and Wall's (1993 : 17) definition of

washback as the “extent to which a test influences language teachers and learners to do things

they would not necessarily otherwise do that promote or inhibit language learning” (Messick:

241). They add that “tests can be powerful determiners, both positively and negatively, of

what happens in classrooms” (cited in Hsu, 2009 : 46-47).

Following this line of argument in this sub-section, regarding the two types of

washback, the tremendous impact and power of testing on teaching and learning in schools

and whether this washback exerts a positive or negative influence will be discussed in some

detail.

1.2.1 Negative Washback

Negative washback is seen by testing researchers as the negative influence of tests on

teaching and learning. Alderson and Wall (1993) point out that:

A negative washback is defined as the undesirable effects on teaching and learning
of a particular test. The tests may fail to reflect the learning principles and/or the
course objectives to which they are supposedly related (5).

In this case, such tests will lead to the narrowing of content in the curriculum, instead

of covering a definite content from what has been learnt in class. For Vernon (1956),

“teachers tend to ignore subjects and activities that are not directly related to passing

examinations, and testing accordingly alters the curriculum in a negative way” (18). Once

again, it is logical that those tests may fail to create correspondence between the learning

principles and/or the course objectives to which they should be related (Cheng, 2005 : 8).

More dangerously, negative washback can substantially reduce the time available for

instruction, narrow curriculum offerings and modes of instruction, and potentially reduce the

capacities of teachers to teach content and use methods and materials that are incompatible

with useful testing instruments (Smith, 1991 : 120). Madaus (1988) concurs with the above

claims and adds that

Negative washback definitely results in cramming, narrowing the curriculum,
focusing attention on those skills that are the most relevant to testing, placing
constraints on teachers’ and learners’ creativity and spontaneity, and disparaging the
professional judgment of educators (22).

One strong impression that resulted from negative washback is that an increasing

number of coaching classes are set up to prepare students for examinations, but what students

will learn are test-taking skills rather than language activities (Wiseman, 1961 : 21). In such

a learning context, an atmosphere of high anxiety and fear of test results becomes common

among teachers and learners (Shohamy et al., 1996 : 9). For Shohamy, teachers will feel that

success or failure of their students is reflected on them, and they speak of pressure to cover

the materials for the examination. When the students know that one single measure of

performance can determine their levels, they will be less likely to take a positive attitude toward

learning.

1.2.2 Positive Washback

There are other testing researchers, on the other hand, who have seen washback in a

more positive way (Andrews, Fullilove and Wong, 2002; Bailey, 1996; Davies, 1985; Hsu,

2009). Those researchers strongly believe that it is possible to bring about beneficial changes

in teaching by changing examinations, representing the positive washback (Cheng &

Watanabe, 2004 : 10). This phenomenon refers to tests and examinations that influence

teaching and learning positively (Alderson & Wall, 1993 : 15). In a broad sense, good tests

can be utilised and designed as beneficial teaching-learning activities so as to encourage a

positive teaching-learning process (Pearson, 1988 : 7). Andrews et al. (2002) suggest

deliberately introducing innovations in the language curriculum through modifications in

language testing. For instance, an oral proficiency test was introduced in the expectation that

it would promote the teaching of speaking (Hsu, 2009 : 49). Davies (1985) considers that “a

creative and innovative test can advantageously result in syllabus alteration or even in a new

syllabus” (18). In this sense, a test no longer needs to be only an obedient servant; rather, it

can also be a leader.

Nevertheless, in educational settings, things are somewhat different from what one might expect,

in that assessment researchers have come to realise that there exists a set of conflicting

positions towards washback in language testing. That is, most of these experts claim that

there is no clear consensus among practitioners as to whether certain washback effects are

negative or positive. One explanation for this conflicting situation is that the potentially positive

or negative nature of the test can be influenced by many contextual factors (Hsu, 2009 : 9).

Alderson and Wall (1993 : 117-118), commenting on this particular case, posit that the

washback effect of a test might be said to be either beneficial or detrimental.

They add that whatever changes educators would like to bring about in teaching and learning

by a particular assessment method, it is worthwhile to first explore the educational context in

which an assessment is introduced.

Therefore, for many testing specialists research into washback may be more fruitful if

it turns its attention to the complex causes of such a phenomenon in teaching

and learning, rather than focusing on deciding whether or not the effects can be classified as

positive or negative. According to Alderson and Wall (1993), the best way to realise this is to

investigate as thoroughly as possible the broad educational context in which the act of

assessing is taking place, since the major variables that often affect this act, and that might

prevent washback from appearing, exist within the education system. Cheng and Watanabe

(2004) summarise this situation, and note that “if the consequences of a particular test for

teaching and learning are to be evaluated, the educational context in which the test

takes place needs to be fully understood” (31). This means that whether the washback effect

is positive or negative will largely depend on where and how it exists and manifests

itself within a particular educational context; examining that context is necessary to understand the mechanism of

washback.

1.3 Functions and Mechanisms of Washback

Traditionally, tests used to be given at the end of the teaching and learning processes to

provide an accurate diagnosis of the effects of teaching and learning. Nevertheless, with the

advances and changes made in the field of testing and in how the latter is conceived, a test can

also be developed to be used at the beginning or in the middle of the teaching and learning

processes in order to influence either or both processes. This section intends to shed light on

the functions and mechanisms by which washback occurs in relation to other educational

theories and practices.

1.3.1 Functions of Washback

In discussing the functions of language tests through which washback occurs in actual

teaching and learning environments, Wall (2005) referred to a set of reviews of those tests

and the influences they could have on the systems they are introduced into. One of these crucial

reviews is the one that was produced by Eckstein and Noah (1993). In its essence, Eckstein

and Noah provided a historical account of the many functions and

influences of some types of tests, showing how people throughout history have

usually considered tests as an important tool by which they take the desired decisions for

some targeted purposes. For instance, for the authors, the first documented use of written,

public examination systems occurred under the Han Dynasty in China about 200 B.C. The

main function of these particular examinations was to select candidates for entry into the

government services. In other words, the examinations were used to break the monopoly over

government jobs enjoyed by an aristocratic feudal system.

With Eckstein and Noah, the second example of the functions of tests was that one

which sought to check patronage and corruption. A typical example of this function was

Britain, where people could gain entry into higher education or the professions on the strength of their examination results. An

important, direct consequence of this examination was the establishment of a great number of

public schools, which aimed at preparing students to sit for these examinations. In addition to

this, a third example of functions of examinations, suggested by Eckstein and Noah, was to

encourage levels of competence and knowledge amongst those who were entering

government services or professions. The intention was to design and develop examinations

which reflected the demands and requirements of the target situations; students preparing for those

examinations would have to develop skills which were relevant to the work they hoped to get

in the future.

The fourth function, in this series of examples, was that of allocating scarce places in

higher education. At this level, examinations were used as means of selecting the most able

candidates for the available places. This type of examination is similar to what is referred

to as placement tests in the testing literature in the present time. The fifth function in this

illustration was to measure and improve the effectiveness of teachers and schools. Eckstein

and Noah again used Britain as an example, describing how, at a certain time, the government

set up a system of examinations to monitor the performance of schools by sponsoring these

examinations through the allocation of considerable funds. The amount of funds that the

school received depended on how its students performed. However, the system had serious

unintended consequences and ultimately failed to achieve the expected objectives. The last

function, in this set of examples suggested by Eckstein and Noah, was limiting curriculum

differentiation. In Britain, in the nineteenth and the twentieth centuries, there was a

remarkable resistance to the idea of centralised education, and all the schools had the freedom

to decide on their own curriculum and means of assessment. With the establishment of

certificate examinations, these schools had a common target they could aim for, and all these

schools turned to teaching the curriculum that would best help students do well in the examinations

leading to these certificates.

In the modern world, tests are frequently used for accountability within the system, and

in particular for certification of achievement in education. They form part of the procedure

for decisions about the allocation of scarce resources at both the systemic and the individual

level. For example, in many countries, Algeria among them, tests control the

transition between school and higher education, and they may lead to the awarding of a

degree. Tests are also seen as ways to upgrade knowledge and to improve the performance of

institutions (Hsu, 2009 : 62). Through testing, education policy can be rapidly diffused and

implemented at relatively low cost (Linn, 2000). Test results that are visible and ideally

measurable can be reported by the media in terms the public can understand and can be used

to show that change has or has not taken place. However, tests are also criticized for exerting

a certain authority and power at both the systemic and individual levels. But, in spite of the

criticism levelled at them, tests continue to occupy a leading place in the educational system

of many countries.

The series of functions, exposed above, are typical situations where these tests were

used to exert influence on the final outcomes to suit the expected intentions of those who are

in authority to make and impose their policies. As pointed out by some testing specialists, this

is an especially common practice in countries with centralized educational systems, where the

taught programmes are controlled by central agencies. Policymakers in these contexts and

countries have used tests to manipulate educational systems, to control curricula, and to

impose new textbooks and teaching methods. In such settings, tests have been viewed as a

primary tool through which changes in the educational system can be introduced without

having to change other educational components such as teacher training and curricula. On

this point, Shohamy (1993) commented that:

The power and authority of tests and external examinations enable policy-makers to
use them as effective tools for controlling educational systems and prescribing the
behaviours of those who are affected by their results - administrators, teachers,
students and others (239).

Given the status of tests and examinations in public spheres, it seems that it is important

to understand the functions of testing in relation to many facets and scopes of teaching as

mentioned in the examples discussed earlier. The importance of considering these functions

serves as a starting point and also a linking point to get a clear picture of the various

mechanisms within different educational contexts.

1.3.2 Mechanisms of Washback

In explaining the complex mechanisms through which washback occurs in actual

teaching and learning environments, Bailey (1996) cited Hughes' (1993) trichotomy to show

how this phenomenon works in different contexts. Bailey points out that this particular

trichotomy allows educators in general education and language testing specialists in

particular to develop a basic model of washback that explains how the various components

that make up this framework interact to help the understanding of the nature of this subject of

interest. In describing this model, Hughes states that the trichotomy is formed of three parts.

First, the participants are mainly people such as classroom teachers, students,

administrators, materials developers, and even publishers whose perceptions and attitudes

toward their work may be affected by a test. Hughes' second component in this framework is termed

process. The latter covers any actions taken by the participants, which may contribute to the

process of learning, such as the development of teaching methods. Third, in Hughes' framework,

product refers to what is learnt, such as facts, skills, and other aspects, and also the quality of

learning.

Contrary to Hughes, who stresses the three components that make up this

model, Alderson and Wall (1993), in their Sri Lankan study, focus on what they referred to as

'micro-aspects' of teaching and learning that might be influenced by examinations. Alderson

and Wall argued that there is little evidence provided by empirical research to sustain the idea

that tests impact on teaching. They advocated that:

The concept is not well defined, and we believe that it is important to be more
precise about what washback might be before we can investigate its nature and
whether it is a natural or inevitable consequence of testing (117).

Consequently, they suggest 15 hypotheses that can aid researchers to illustrate areas in

teaching and learning that are usually affected by washback and can stand as a basis for

further research. This set of hypotheses suggests that there exists a strong correlation

between the importance of tests and the extent of washback. Alderson and Wall concluded

that further research is needed and that such research must “entail increasing specification of

the washback hypothesis”. They called on researchers in the field of language testing

to take into account the research literature in at least two areas: that of motivation and performance, and

that of innovation in educational settings.

Following this seminal work realised on washback hypotheses, Wall (1996) followed

up their study and stressed the difficulties in finding explanations on how tests exerted

influence on teaching. She went back to innovation theory and literature to explore the

complex topic of washback. In this respect, she proposed that the research areas that are seen

to be relevant to washback should include (a) the writing of detailed baseline studies to

identify important characteristics in the target system and the environment, including an

analysis of the current testing practices (Shohamy, Donitsa-Schmidt and Ferman, 1996),

current testing resources (Bailey, 1996; Hughes, 1993);

(b) the attitudes of stakeholders (Bailey, 1996), and;

(c) the formation of management teams representing all important interest groups: teachers,

teacher trainers, ministry officials, parents, and learners (Cheng, 2008).

Likewise, in the same perspective of washback mechanisms as a phenomenon of

change in teaching and learning, Hsu (2009) referred to Smith (1991) who investigated an

ELT project and worked to construct a corresponding model of the variables involved, with the

aim to introduce the desired change in the teaching and learning processes. In its essence,

Smith's model comprises five components of change: the target system, the management

system, the innovation itself, the resources available, and the environment in which change is

supposed to take place. Hsu adds that, on the ground of the same idea, Markee (1997)

illustrated through another study how change might occur on larger subjects such as

curricula through successive stages, which are to design, to implement, and finally to

maintain. In this respect, Markee suggested a framework that was based on the

questions that were posed by Cooper (1989) and which referred to: who (participants), what

(product), where (the context), when (the time, duration), why (the rationale), and how

(different approaches in managing the washback effect).

In two other studies, Fullan with Stiegelbauer (1991) dealt with the issue of washback

effect but in its broader uses. They discussed the effects and changes of tests on schools and

came to identify two main recurring themes: first, a washback effect should be seen as a

process rather than an event. Second, all participants who are affected by this phenomenon

have to find their understanding of what washback effect is. Cheng (2004) made this last

point clearer. She explained that according to Fullan teachers work on their own with little

reference to experts or consultation with colleagues. Thus, teachers are usually forced to

make on-the-spot decisions, with little time to reflect on better solutions. The other

problem they often encounter in this context is that they are always unable to accomplish

what they prepared to do. Consequently, their lives can become very difficult, indeed. This

reality can explain why intended washback does or does not occur in teaching and learning.

In other words, this means that, if educational change is often imposed upon teachers and

students without meticulous preparation, resistance is likely to be a natural response (Curtis,

2000 : 4).

In summary, these reviews of major studies on the mechanisms of washback have

corroborated the fundamental relationship between the design of given tests and their positive

or negative impact and power on teaching and learning. However, it is worth noting that,

even if the outcomes of these studies have contributed to advancing research into

washback in language testing, they remain insufficient to draw a larger and clearer

picture of this issue, since a number of questions raised about the mechanisms of

washback in language testing remain without definite answers.

1.4 Washback: Empirical Studies

In this section, a number of common empirical research studies into washback in both

language and general education are discussed. This literature review is a summary of detailed

reviews carried out by a number of researchers. These reviews highlighted the basis of the central

research concept and pointed out useful research methods adopted by many

researchers to carry out their investigations. Such an elucidation is of great utility for the

present exploration since it shapes the scope of the study and serves as a guideline for many

relevant issues needing further research. For ease of reference, Table 1.1 provides

background information for the most used studies in terms of the educational context, exam

type, and research methods employed.

Authors | Context | Exam | Methods
Wesdorp (1982) | The Netherlands | Multiple-choice language assessment and final exams in Dutch secondary schools | Scores; analysis of tests; teacher and student questionnaires
Hughes (1988) | Turkey | University entrance test | Test scores; questionnaires to lecturers
Li (1990) | China | Matriculation English Test (MET) | Questionnaire to teachers and local officers (and students' discussion)
Alderson and Wall (1993) | Sri Lanka | O-level examination in English | Individual and group interviews with teachers; questionnaires to teachers, students, and teacher advisors; materials and test analysis; observations
Lam (1993) | Hong Kong | New Use of English (NUE) (end of secondary school) | Questionnaire to teachers; textbook analysis; analysis of test scripts and scores
Andrews (1994) | Hong Kong | Oral component of the Revised Use of English (RUE) | Two parallel questionnaires to the working party members and teachers
Alderson and Hamp-Lyons (1996) | USA | TOEFL exam | Individual and group teacher and student interviews; observations; field notes
Watanabe (1997) | Japan | University entrance exam | Questionnaires; interviews with students and teachers
Read and Hayes (2003) | New Zealand | IELTS | Interviews; questionnaires; observations; pre- and post-English test
Qi (2004) | China | National Matriculation English Test (NMET) | Interviews and questionnaires with NMET constructors, inspectors, teachers, and students; observations
Cheng (2005) | Hong Kong | Revised Hong Kong Certificate of Education Examination (HKCEE) | Questionnaires to teachers and students; observations; interviews
Gosa (2006) | Romania | English component of the Romanian school-leaving exam (Bac) | Students' diaries (10 students, retrospective use)
Hawkey (2006) | UK | CPE (Cambridge ESOL) | Textbook analysis

Table 1.1: Overview of the common research projects into washback

Referring back to the table above, what is apparent is that most reports

from the available research studies into washback indicate that its influence

has been observed on various aspects of learning and teaching, and that this

phenomenon was mediated by numerous factors. What is more significant on this

matter is that these research projects have been carried out in

several different countries and various contexts. Crucially, all these research studies are

organised with regard to what Hughes (1993) referred to as process-washback on content,

teaching methods, and classroom assessment, product-washback on students' learning, and

participants-washback on feelings and attitudes of teachers and students in the context under

exploration. Thus, in order to have a clear understanding of how these research studies were

carried out, it is worth examining several of the most interesting tenets of these research

projects.

Wesdorp (1982) investigated whether the incorporation of the multiple-choice technique in

secondary schools would lead to the impoverishment of the curriculum. He argued

that such a fear was plausible because the skills that could not be tested through

multiple-choice would not be practised and hence would eventually disappear. He added

that, if this happened, some teaching methods would fail to be adopted. It might also

provoke changes in the way in which students prepared themselves for tests.

Wesdorp's study, however, revealed no differences in teaching

practices or in students' preparation methods. He concluded with the assumption that, after all,

the so-called washback effects are a mere “myth”. If they do exist, they must be so weak or

so small that our methods cannot detect them.

Hughes (1988), in a project at Boğaziçi University in Istanbul, Turkey, explored how

the introduction of a test in English for academic purposes helped to improve English

performance in the university's English-medium undergraduate courses. Hughes

explained that the basic aim was to devise a new proficiency test as the sole means

by which students could gain access to undergraduate programmes. The intended test was

developed after completing a needs analysis. It comprised sections on three major skills:

listening, reading, and writing. Hughes, after introducing this test, noted that, “for the first

time, the foreign languages schools teachers were compelled by the test to consider seriously

just how to provide their students with training appropriately for the tasks which they could

face them at the end of the course” (44). Hughes concluded that the washback effect occurred

as a result of the incorporation of this test, with great changes in the materials used in the

foreign language school.

Likewise, Li (1990), in another washback study, examined the high-stakes examination

taken by Chinese students at the end of secondary school, the Matriculation

English Test (MET). He pointed out that this test was first introduced in 1984 to replace an

earlier examination, which was weak and lacked validity and reliability. Li

aimed at identifying whether or not the new test would lead to better results than

the examination it replaced. Some time after the

introduction of this new test, Li came to the conclusion that the teachers' and students'

attitudes towards the MET were positive. In this respect, he wrote “tests are able to subjugate

the minds of millions of people to the thraldom of forced memorisation, but, we would say it

is a greater kind of power to be able to liberate people's minds from such thraldom” (42).

From the short discussion of these three research projects into washback,

the common view is that the very few empirical studies carried out prior to

the 1990s found little effect of tests on learning and teaching or

on classroom assessment. Such a result could be explained by the fact that these

studies failed to construct a definite washback model that would take

into account the array of factors which may play a part in determining why teachers react in

the way they do.

In brief, the washback studies that prevailed in this particular phase succeeded in

providing a clear definition of the concept of washback, some guidelines about how to achieve

positive washback, and a few references to the effects of tests on the contexts they had been

introduced into; at the same time, there were few detailed accounts of specific attempts to

innovate through testing. Most of the research was based on questionnaires or on test results

rather than direct classroom observations.

In contrast to the preceding period, from 1993 onwards the field of

language education in general, and testing in particular, has seen a significant increase in

empirical studies on washback effects. Many language testing researchers recognise

that the recent research projects have led to a more detailed understanding of the phenomenon

in the domain of language education and of the factors which contribute to it.

Wall and Alderson (1993), for example, examined the effects of the new O-level

examination on English teaching and learning in secondary schools in Sri Lanka. They

emphasized that, at the time the study was published, it was the only investigation that

included classroom observation as one of its research methods. Wall (1996) summarised the

findings of the study by pointing out:

The examination had had considerable impact of English lessons on the way
teachers designed their classroom tests, but, on the other hand, this examination had
had little to no impact in the methodology they used in the classroom or on the way
they marked their students' performance (348).

Wall and Alderson found that the potential factors which impeded teachers from using

new teaching methods included insufficient teacher training, problematic school

management, and teachers' beliefs in the efficacy of various methods.

In another context, Lam (1993) examined the New Use of English (NUE) examination

in Hong Kong. Through this study, Lam attempted to answer a set of

questions, such as whether the amount of time that schools allocated to the teaching of

English was sufficient, whether schools set aside special time to prepare for one particular

section of the examination, what the attitudes and abilities of their students were, what the

quality of English textbooks was, and what the content of the teaching and the students'

performance were like. He concluded that examination designers should take

into account how different factors in the context where the examination occurs might interact

with one another to yield the appropriate intended results and a clear picture of the expected

examination.

Andrews' (1994) Hong Kong study concerned the development of the Revised Use of

English (RUE) test to measure students' oral performance. In order

to gauge the efficiency of this newly developed test, Andrews conducted a study

using two parallel questionnaires administered to the working party members and teachers, with three

groups of candidates. The investigation did not reach one definite

conclusion about the washback effect of the oral test; rather, Andrews remarked that

the nature of washback varied across the three groups: only

a small improvement in performance between the first and the second group was apparent.

These results led the researcher to conclude that the washback effect of the test was delayed.

For this reason, the study suggested re-using the test in a second year to

see whether or not the expected results would become noticeable.

Alderson and Hamp-Lyons (1996) found, in their study of the washback of the Test of

English as a Foreign Language (TOEFL) on preparation courses, that “this particular test was

seen to have a more direct washback effect on teaching content than on teaching

methodology”. The researchers employed three different types of data: interviews with

students in groups, interviews with teachers (both individuals and groups), and field notes

and audio-recordings during classroom observations. Like Watanabe (1996), Alderson and

Hamp-Lyons observed “two different teachers while they taught both TOEFL preparation

classes and courses. This particular design permitted Alderson and Hamp-Lyons to compare

TOEFL preparation to non-TOEFL preparation classes” (cited in Bailey 1999 : 32). The

authors concluded that the amount and type of washback which occurred depended on

The status of the test, the extent to which the test is counter to current practice, the
extent to which teachers and textbook writers think about appropriate methods for
test preparation, and the extent to which teachers and textbook writers are willing
and able to innovate (296).

A similar research design was used by Watanabe (1997), who investigated the

university entrance examination in Japan through two different types of data collection

methods: questionnaires, and interviews with teachers and students. Watanabe found that all

the textbooks used by the teachers observed consisted of past exam papers and materials. In

addition, the results showed that the presence of grammar-translation questions on a particular

university entrance examination did not influence all teachers in the same way: some

teachers were affected by these exams, and others were not. Watanabe identified “three

possible factors that might promote or inhibit washback to teachers: (1) the teachers'

educational background and/or experience; (2) differences in teachers' beliefs about effective

teaching methods; and (3) the timing of the researcher's observation” (cited in Bailey 1999:

23). Thus, Watanabe concluded that "teacher factors may outweigh the influence of an

examination in terms of how exam preparation courses are actually taught" (ibid).

Moreover, he noted that school cultures might influence the degree of washback in that “a

school positive atmosphere which encouraged students to interact with authentic language

might infiltrate into individual classrooms” (Hsu 2009 : 58).

Cheng (2005), in a large-scale empirical study combining quantitative and qualitative methods, sought to

determine whether the modified Hong Kong Certificate of Education

Examination (HKCEE), taken by most secondary graduates, brought about the positive

washback on teaching that was intended. In this study, Cheng used questionnaires,

interviews, and observations during the first year after the announcement and discovered that

the newly introduced examination was having considerable influence on 'what' teachers

teach, but not on 'how' they teach (Wall, 2005). In other words, the change in the

examination changed teachers' classroom activities, but it did not change teachers'

beliefs and attitudes about teaching. Cheng suggested that, “to change the how ... genuine

changes in how teachers teach and textbooks are designed must be involved. A change in the

examination syllabus itself will not alone fulfil the intended goals”.

In a study whose findings converged with those of Cheng's work, Read and Hayes

(2003) attempted to measure learning outcomes through an investigation of the

IELTS in New Zealand. The researchers used four data collection tools: interviews,

questionnaires, observations, and a pre- and post-English test. What was particular to this study

was that the researchers had two small groups of 17 students. These students took retired versions

of the IELTS exam as pre- and post-tests to two IELTS courses (intensive and general). Like

the final result obtained by Cheng, Read and Hayes' study did not show any significant

improvement overall, nor any significant difference between the groups of students. The researchers concluded that

time is needed for washback to occur.

Furthermore, in order to determine how students' individual differences can be affected

by washback effect, Ferman (2004) examined the influence of the introduction of an oral test

on learners' achievement. The author found that average-ability students were

significantly different from other students: their anxiety level was the lowest and they were

not adversely affected by potential failure in the test. For that reason, Ferman concluded that,

for washback to occur, it is important to consider the individual differences

among learners.

Like Ferman (2004), Gosa (2006) sought to identify possible washback effects that took

place inside and outside classrooms as experienced by her students in Romania. Gosa used

students' diaries to analyse whether or not the students’ study environment was affected by

test washback. Adopting that particular method, Gosa recognised that the individual

differences among the learners and the environment where they operate need to be considered

to see if an exam might exert the expected effects. Attitudes, perceptions, beliefs, learning

styles, and anxiety should be taken into account when trying to promote positive washback, as

they are likely to interact with the test and hence intervene in the washback process.

Qi (2004) conducted a survey to examine the impact of the National Matriculation

English Test (NMET) in China. This investigation focused on the test's main function,

the selection of students for higher education. The data revealed that the NMET had a

considerable impact on materials and learning activities, but not of the kind that was

intended. Qi concluded that one of the reasons for this

was that teachers failed to teach students the required skills that are supposed to be an integral

part of instructional objectives. Instead, these teachers felt pressured to work only for

the good results that had to be obtained in this examination.

Wall and Horák (2006) examined, from the teachers' point of view, the impact of the changes

to the TOEFL on teaching and learning in test preparation classes in Central and

Eastern Europe. Wall and Horák used interviews over a period of five months to gauge

teachers' awareness of the changes in the TOEFL test. They observed that there

was indeed a certain awareness, but that it grew very slowly. Nevertheless, the two researchers

found a positive attitude toward the introduction of a speaking test and the

integrated writing task. They concluded that the availability and quality of the information

about the test and of test preparation materials would be a major factor contributing to teachers'

reactions to the desired changes.

Therefore, from the above literature review on washback effects a number of findings

have emerged with regard to this phenomenon and the ways in which it can be investigated.

Some of these concluding remarks are summarised as follows:

First of all, the review of the literature showed clearly that washback is broad and

multi-faceted and can be brought about through the agency of many independent and intervening

variables besides the exam itself. The factors which seem to have affected

the form that washback can take include teacher

and student factors such as beliefs, attitudes, experience, education, training, and personality, as well as the

status of the subject to be tested, resources, classroom conditions, management of practices in

the schools, communication between test providers and test users, and even the socio-political

context in which the test is put to use. In addition, what stands out clearly is that a

multitude of different data collection methods were employed to investigate washback.

Examples of these instruments include classroom observation, individual

interviews, group discussion, questionnaires, and analysis of participants' diaries and of their talk in

the context under exploration. All these methods and instruments aimed to examine the

factors and variables that make up washback.

As was seen in this review of literature, the majority of research studies into washback

tried to report the effects of examinations on the teaching content (Alderson & Hamp-Lyons,

1996; Read & Hayes, 2003; Wall & Alderson, 1993). Some results have indicated that tests

altered teaching methods and materials, but others have shown that the tests had limited or no

impact on either (Alderson & Hamp-Lyons, 1996; Cheng, 1997; Wall & Alderson, 1993).

Washback may also be differential: it occurs with some teachers, but not with others

(Alderson & Hamp-Lyons, 1996; Watanabe, 1996). Besides, most of the analysed data

revealed that tests have a superficial impact on students' learning, and that individual

learners, like teachers, have experienced this influence in different ways, with the potential for

considerable impact in terms of affective factors and teachers' behaviours (Cheng, 1997;

Ferman, 2004). The differences between the degree of washback on teachers and on students

have raised questions about the extent to which washback to teachers can be assumed to be

generalizable to washback to learners. In sum, there is evidence indicating how washback

to teachers and programmes might interact with washback to learners.

Finally, the literature review above shows that most of the

available empirical studies investigated the washback of language testing on

learning and teaching, but in the context of large-scale proficiency tests. In

other words, washback in classroom assessment is

still not well explored. This reality is expressed by McNamara (2000), who argues that too much language

testing research is about high-stakes proficiency tests, ignoring the classroom context, and

focusing on the use of technically sophisticated qualitative methods to improve the quality of

tests at the expense of methods more accessible to non-experts. Hence, what is required to

address this gap is a study of washback on learning and teaching for

classroom tests, and of the influence of these tests on teachers' and students' behaviours and

attitudes. This objective is a priority for washback investigation, and it is

the main argument on which the present study rests.

1.5 Washback: Lever of Innovation in Education


Washback is the key concept in this study. So, this section will attempt to clarify the

implication of this concept to educational innovation. The aim of this section is to highlight a

possible framework that can help test developers to judge whether their innovations (the

tests they are developing) are likely to have the impact they intend them to have. In the

literature, it has been argued that, to understand the nature of washback, it is also important to take

account of research findings in the area of innovation and change

in education.

Many applied linguists assert that there has been a well-established tradition,

which led to a number of networks that served to yield an elegant

compilation of ideas about the different phases in the innovation process and the factors at

work in every phase (Rogers, 2003; Fullan, 2007), as well as an increasing body of literature

focusing on the English teaching context (Henrichsen, 1989; Kennedy, 1990; Markee, 1993;

White, 1993; Li, 2001). Crucially, what is most remarkable about these research studies is that

they succeeded, to some extent, in making clear the complexity of the innovation

process, and the factors which inhibit or facilitate successful diffusion and implementation.

Because it is not possible to cover all of innovation theory in a single section, the

discussion will be limited to the ideas which are relevant to the present study. The

ideas in this section are arranged so as to

help readers find a link between innovation and washback in

education. The researcher looks first at the term innovation and what it implies as a specific

concept in relation to washback in language testing, then considers what distinguishes

innovation from other types of change. This will be followed by a

discussion of the process of innovation and the meaning of change for the individuals who are

most affected by it. This section concludes with the provision of several models of

innovation, including the hybrid model of the diffusion/implementation process by

Henrichsen (1989), which served as the starting point for the analysis of data in the present

study.

1.5.1 Innovation: Definition

The first question that needs to be answered is what the term 'innovation' refers to.

As cited in Wall (2005: 60), Rogers (1985) defines innovation as an “idea, practice, or object

that is perceived as new by an individual or other unit of adoption”. For Rogers:

It does not matter whether the idea is objectively new (in terms of the amount of
time that has passed since its discovery or invention), but rather whether it is felt to
be new by those who may be adopting or using it (ibid).

In Hsu (2009), innovation is defined as a planned and deliberate effort,

perceived as new by individuals, to bring about improvement in relation to desired objectives.

Hsu makes this idea more explicit. He argues that educational innovation is a response to a

number of problems that a given educational system can present, such as failure in students'

achievement, poor performance by students in specific areas, or a lack of transparent

accountability reporting. What is worth noting about these problems is that they also

extend to aspects of educational systems that concern systematic attempts by

some authorities to change educational policies and practices with the intention to achieve

better final outcomes (Brindley, 2008: 36).

Some other researchers make a distinction between innovation and other types of

change. For Wall (2005: 60), who cites White (1993), “the difference has to do with

intentionality: while 'change' is any difference that occurs between time one and time two, an

'innovation' requires human intervention” (244). For Miles (1964) “innovation is a deliberate,

novel, specific change, which is thought to be more efficacious in accomplishing the goals of

a system” (cited by White, 1988: 211). Nevertheless, some other researchers

use the terms as synonyms. Many of them argue that, even if they believe

the distinction is a valid one, they use the terms interchangeably to avoid any sort of

confusion or ambiguity. The researcher prefers the view that these two

terms bear the same sense, since the efforts required to launch innovation in this research are

so high, and the challenges go beyond the distinction between the two concepts.

In the literature on innovation in language education, the ideas

provided by Markee (1987) are said to be the most comprehensive. Markee recommends that

language-teaching professionals should adopt a 'diffusion-of-innovation perspective' in order

to understand why their attempts to innovate meet with success or failure. In other words, this

means that language specialists need to be aware of the matters and findings reported by educators, and

vice versa. For Markee, this approach will not only provide language educators with a

coherent set of guidelines and principles for the development of their own innovations,

but will also supply them with criteria for retrospective evaluation of the extent to

which these innovations have actually been implemented (cited in Wall, 2005: 61).

What emerges from this brief discussion of the sense of the concept

‘innovation’ is that it is deliberate, intentional and also planned. With regard to this

definition, it is apparent that the other concept ‘washback’, the central research concept in

this study, should be conceived with a meaning that overlaps to a large extent with this

definition of innovation in order to bring about changes and improvements to the teaching

and learning processes. Obviously, this is what is intended by the introduction of a new

testing system in the context under study in this thesis. In what follows, the process of

innovation, what it encompasses, and how it is adapted to this

research will be discussed.

1.5.2 The Process of Innovation

The concept of innovation has so far been defined. The next step in this section is to

synthesise the process of innovation through the discussion of four major views of innovation,

provided by four different educationists, in order to

bring their relevant ideas together: a basic model for innovation by Rogers (1995), a

comprehensive survey of innovation by Fullan (1991), a set of principles that are seen to

guide an innovative act by Markee (1997), and Henrichsen's (1989) innovation model. The

latter is discussed in some detail because of its relevance to language education and to

language testing projects in various contexts.

1.5.2.1 Rogers' View

In the innovation process, a number of attributes have been proposed as correlating

with the success of implementing innovation. One of the most cited sets of attributes is

perhaps that proposed by Rogers (2003). For Rogers, five attributes compose

his model: relative advantage, compatibility, observability, trialability, and

complexity. The first attribute, as Rogers posits, concerns the question that is

first raised by the persons who are most affected by the innovation. In addition, relative

advantage represents the perception that those persons have of the innovative act. It is

believed that the greater an individual perceives the relative advantage to be, the quicker it

will be adopted. Hsu (2009) refers to Rogers to make this point more explicit:

He linked relative advantage to a person's incentives. If incentives contribute


strongly to a decision to adopt a change, there may be little relative advantage to its
continued use after the incentive is removed or sometimes reduced (70).

Compatibility, the second attribute, refers to the degree of

congruence between the innovation and the existing values, past experiences and perceived

needs of those who are expected to adopt this innovation. For Rogers, if there is

a high degree of compatibility between the innovation and the norms and values of the

system where it is to occur, the innovation is likely to be adopted rapidly.

In contrast, the third attribute, complexity, refers to the

extent to which an innovation is difficult to understand and use. It is argued that if an

innovation is seen by its potential adopters to be difficult, it will be slow to diffuse and be adopted

(Rogers, 2003).

The next attribute in Rogers' model is trialability, which

refers to the extent to which a prospective adopter can try out an innovation

before adopting it. For Rogers, an innovation that is trialable represents less uncertainty to the

person who is considering its adoption, as it is possible to adopt the innovation a little at a

time rather than all at once, and to learn by doing. The final attribute, observability, pertains to

the adopter's ability to actually see the innovation being used by others. For Rogers, if an

innovation is observable, it is easy to adopt and diffuse (Hsu, 2009: 70). What is significant

about this model is its claim that attention to these five attributes makes an

innovation easier to adopt and quicker to diffuse. Among these five attributes, innovation

researchers believe that relative advantage and compatibility are the most

important in explaining an innovation's rate of adoption.

1.5.2.2 Fullan's View

On the issue of quality, Fullan and Stiegelbauer (1991), in their comprehensive

survey of innovation in education, observe that innovation should be regarded as a process

rather than an event. For Fullan (2007), the innovation process is identified through three

fundamental stages, which are: 'Initiation', 'Implementation', and 'Continuation'. What is

worth noting about these stages is that Fullan recognised that it is not possible to predict how

events in one phase will influence those in the others, or how long it will take for one phase

to change into another.

The three basic stages in Fullan's model are defined as

follows:

1. The 'Initiation' stage: it is the process that occurs between the first appearance of an idea

for a change and the time when it is adopted. In this stage, Fullan proposes asking a set of

questions to determine whether the idea is worth adopting. These questions include:

• What is the source of the idea of change?

• What was the motivation behind the idea?

• What is the quality of the innovation?

• Are the participants in the innovation process aware of the requirements of the idea?

• Is this particular idea supported by external establishments?

2. The 'Implementation' stage: it is the process of putting into practice an idea, programme,

or a set of activities and structures new to the people attempting or expecting change.

Fullan insists that careful consideration should be given to the factors

which are important in this stage. These fall into three groups: the

characteristics of the innovation (need, clarity, and availability/practicability), the

characteristics of the local context (the district, the community, the principal and the

teachers), and the characteristics of external bodies (such as government and ministries)

(cited in Wall, 2005 : 62).

3. The 'Continuation' stage: it refers to whether an innovation becomes part of the

educational system, or whether it fails and is rejected. Like the previous stage, a number of

factors need to be considered. These mainly concern matters such as:

• the degree to which an innovation has been built into the system;

• the number of people who are committed to and skilled in the change;

• the strength of the procedures in place to provide continued support; and

• the degree of staff turnover in the target situation.

Besides identifying the three stages of the innovation process, Fullan

(2007) observes that quality in implementing projects needs to be considered; he argues that

quality may be compromised, especially in politically driven projects, simply because the

period between the decision to initiate and start-up is often too short to allow for adequate

quality assurance. Hsu (2009) makes this idea more explicit; he posits that “when adoption is

more important than implementation, decisions are frequently made without the follow-up or

preparation time necessary to generate adequate materials” (74). To overcome this problem,

Fullan (2007) proposes that “it is important to attempt substantial change and to do it by

persistently working on multi-level meaning across the system over time” (92). By this,

Fullan shows that innovation is a complicated task to carry out. In order to make it easier for

innovators, it is wiser to break down complex changes into components, which can be

implemented in an incremental manner.

1.5.2.3 Markee's View

Markee's (1997) framework in the area of educational innovation is seen by innovation

researchers as one of the most comprehensive models, one that succeeds in

summarising and displaying the relevance of ideas from innovation theory. For the diffusion of

innovation in language education, Markee provides principles for language teaching

professionals:

To understand the factors that affect the design, implementation, and maintenance of
innovation. His framework is based on the questions that were posed by Cooper
(1989): these include questions such as, who adopts, what, when, why, and how (118).

In terms of who, Markee, drawing on Fullan (1982), associates the who with teachers.

He sees them as key players in language teaching innovation. Though

teachers differ from one context to another, they tend to assume the role of

implementers who carry out the process of innovation on the ground. Markee also reports

that Kennedy's (1988) study shows that the individuals involved in this process of innovation

often extend beyond the teachers, mainly to those who are seen to play

the role of decision-makers, such as Ministry of Education officials, directors of schools,

and general inspectors. The other party in this process is the students, who are the clients. All of

these individuals form the community that the innovation targets.

Drawing on the studies by Rogers (1983) and Rogers and Shoemaker (1971), Markee notes that

potential adopters pass through a number of basic stages in the course of the implementation

process. He identifies these stages as:

1. gaining knowledge about the innovation;

2. being persuaded of its value;

3. making a preliminary decision of whether to adopt or reject the innovation; and

4. confirming their previous decisions.

In terms of what, Nicholls (1983) defines innovation as “an idea, object, or practice

perceived as new by individuals, which is intended to bring about improvement in relation to

desired objectives, which is fundamental and which is planned and deliberate” (cited in

Markee, 1993: 231).

Chinda (2009), drawing on Markee's interpretation of the what, describes it as follows:

…what needs to be considered as a managed process of development whose
principal products are teaching or teaching materials, methodological skills, and
pedagogical values that are perceived as new by potential adopters or as he labelled
it the who (65).

Here, Markee addresses two issues he felt were missing in Nicholls' definition. These

are the notion of fundamental change and the question of whether innovations need

to be planned or not (Wall, 2005: 71).

In terms of where, citing Cooper (1989), Markee says that “where an innovation is

implemented is a socio-cultural, not a geographical issue” (55). Markee also stresses the

importance of understanding the context where innovation takes place. In general terms, the

context here refers to a social and cultural setting in which many factors, such as cultural,

ideological, and socio-linguistic ones, are involved and affect the innovation. Markee cites Kennedy

(1988), who gives the name 'sub-systems' to these factors, displayed in Figure 1.1.

Figure 1.1: The Hierarchy of inter-relating systems in which an innovation has to operate

(Source: Kennedy, 1988 : 332, cited in Wall, 2005 : 72)

In terms of when, Markee discusses the rate of diffusion. He points out that this rate

may vary from one type of innovation to another. He also adds that “the diffusion process

tends to begin slowly and then accelerates to finally slacken” (58). Besides, Markee thinks

that innovation takes time and always takes longer to implement than expected.

To grasp this idea of the rate of diffusion of an innovation, it is appropriate to refer

to Rogers (1995), who represented rates of diffusion in the form of an S-shaped curve (Figure

1.2). Rogers claims that most innovations follow the same pattern. First, the

rate of diffusion is slow in the beginning, but then, once the innovation has been adopted by a number of

individuals, the rate accelerates. This is indicated by the steep climb in the curve.

Figure 1.2: The Rate of adoption of an innovation (The S-shaped diffusion curve)

(Source: Rogers, 1995: 11, cited in Wall, 2005 : 74)

Rogers (1995) makes clear what determines the rate of diffusion. In his words,

he states that:

the rate of adoption is usually determined by five types of variables: the attributes of
innovation, the type of innovation-decision, communication channels operating in
the environment, the nature of the social system, and the extent of the change agents' efforts (cited
in Wall, 2005: 74).

In terms of why, Markee discusses the characteristics of adopters and the characteristics of innovations

which can facilitate or hinder adoption. Regarding the characteristics of adopters, Markee refers

to Rogers (1993) and stresses the need to consider five categories of adopters. These were

discussed in some detail under the who. He adds that paying close attention to the

characteristics of adopters makes it more likely that the change agent will be able to persuade them to

adopt the innovation. On this particular point, Rogers used the phrase ‘audience

segmentation’ to talk about the various communication channels or appeals that are used to

target different categories of adopters (Rogers, 1993, cited in Wall, 2005 : 74). The second

factor discussed by Markee concerns the features of successful innovations. Rogers

(1995) stresses that it is the adopters' perceptions of these features which determine whether

the innovation is adopted or rejected. Drawing on this assumption, Markee points out five

crucial attributes that influence the decision to adopt or reject an innovation. These attributes were

presented under Rogers' view (see 1.5.2.1).

Finally, in terms of how, Markee (1997) describes five different approaches to effecting

change: the social interaction model; the centre-periphery model; the research, development and

diffusion model; the problem-solving model; and the linkage model.

1. The social interaction model: “it sees the diffusion of an innovation as a matter of

communication: individuals belong to one or more networks and information about innovations spreads

as colleagues interact with others in their own social grouping” (62-63).

2. The research, development and diffusion model: “it assumes that research, long-term

planning and specialist teams working on different aspects of development can ensure

high-quality innovations” (63-64).

3. The centre-periphery model: “it is more ‘top-down’: policy-makers decide whether

an innovation will be adopted and what form it should take, and pass their decision on to the

subordinates who must try to manage the implementation” (ibid).

4. The problem-solving model: “it is a ‘bottom-up’ model, where it is the potential users of

innovation who decide whether there is a need for change. They identify possible solutions, trial and

evaluate them, and repeat the process until they reach satisfactory outcomes” (67-68).

5. The linkage model: “it incorporates features from the social interaction and problem-solving

models, and acknowledges that different approaches should be used in different

situations, depending on the type of problem that needs to be solved” (68).

1.5.2.4 Henrichsen's View

In line with the innovation views discussed earlier in this

section, Henrichsen's (1989) model suggests that a full understanding of the diffusion and

implementation of innovations requires not only an examination of the innovation

itself, but also an in-depth examination of (a) the role of the change agents (e.g. policy makers

and decision-makers), (b) the role of the adopters (e.g. teachers and students), (c) the various

stages of the innovation diffusion process (e.g. decision making, adoption, implementation,

diffusion), and (d) the local constraints under which reformers operate. In other words, it is crucial to

understand the context where the innovation would occur, the length of time that is

required for successful innovation, and the factors that are present in the context where the

innovation is expected to happen (Andrews, 2004). Without an understanding of these

different components of the innovation process, innovators will find it very difficult to

carry out the innovation they intend.

In this respect, drawing on his work on the diffusion of innovations in English language teaching,

Henrichsen (1989) proposes a hybrid model of the diffusion/implementation process. The

model consists of three main elements: ‘antecedents’, ‘process’, and ‘consequences’.

1. The ‘antecedents’ component of the model focuses on the significance of the set of

conditions of the educational context or environment before an innovation is introduced.

On this point, Henrichsen insists that those who advocate an innovation must be

aware of the characteristics of the intended ‘user system’, the characteristics of the ‘users’,

traditional pedagogical practices, and the experience of previous reforms before they

decide on the suitable innovation to be carried out. The characteristics of the ‘user system’

correspond to the structure and power relationships in schools and society. The

characteristics of the intended ‘users’ of the innovation include their attitudes,

values, norms, and abilities. Traditional pedagogical practices derive from a

variety of cultural and historical influences. Finally, the experiences of previous

reforms provide an understanding of how to achieve the goal or how to overcome the

difficulties present in the innovation.

2. In the ‘process’ component of the model, Henrichsen describes and analyses the factors

which act as facilitators of and/or hindrances to change; he lists the factors as follows:

• Within the innovation itself, including originality, complexity,

explicitness, relative advantage, trialability, observability, status,

practicality, flexibility, adaptability, primacy, and form;

• Within the ‘intended user system’, including geographic location,

centralisation of power and administration, and size of the adopting unit;

• Communication structure, gap orientation, and balance, learner factors,

student factors, capacities, educational philosophy and examination;

• Within the inter-elemental factors, including compatibility, linkage, reward,

proximity, and synergism.

3. In the ‘consequences’ component, the hybrid model sets out different types of

innovation decisions and outcomes. In this section, Henrichsen describes how a decision

to adopt or reject an innovation can be changed at a later stage; he also describes the types

of innovation decisions: collective decisions, authority decisions, and contingent decisions.

Besides, he labels the types of outcomes, which can be immediate or delayed, manifest or

latent, and functional or dysfunctional (Wall, 2005: 83-86; Chinda, 2009: 66-67).

Based on this discussion, the different views on innovation in education carry

implications for the present study, since the major aim of this research is to remedy the many

anomalies present in the current testing system adopted by EFL teachers in Algerian

secondary schools, and hence to seek to implement an ATM; besides, another implication

mainly concerns examining the impact of the intended innovation on those who are

concerned by it in the context to be explored. In this respect, this synthesis of the available

literature on educational innovation has led us to consider that Rogers' (2003) model

has provided a general definition of innovation, its basic characteristics, and its diffusion

process. Fullan (1991) has proposed broad phases of change in innovation, as well as factors

affecting each phase. Fullan, through his model, has made clear that innovation

should be seen as a process rather than as an event, and that all the participants who are affected

by it have to find their own understanding of the change. Markee's (1997) model has

yielded specific perspectives in the domain of innovation in education. The set of proposed

questions has proved to be crucial in analysing this process. Finally, Henrichsen's (1989)

hybrid model has offered insights into the fundamental factors affecting the different stages

in the implementation and diffusion of innovation.

From this synthesis of the implications of the views on innovation for this study, it is

evident that different theories of innovation and change have provided the researcher in the

present exploration with useful insights into how one should proceed to implement practices that are

new to the people concerned by this change and that are meant to bring positive outcomes.

Since not all of what these theories propose can be taken for granted, it

is essential to note that there is a need to adapt the contents of these models to the subject

and objective of the present study, so that

these frameworks can be used effectively.

As has been pointed out at several points in this review of the literature on

innovation in education, bringing about any kind of change can be an extremely long, complex,

and difficult process. Research into washback has consistently shown that tests, in many cases, can be

seen as effective and useful levers for innovation in education. This is why the final

implication of this discussion of innovation in education for the current study is that it has

enabled the researcher to base his investigation on a well-founded theoretical background,

and, second, it has also offered an understanding of how systematic the process of

implementing a new testing system should be, an issue which is indispensable for

diagnosing the factors affecting the different stages of this innovation in this study.

1.6 A Washback Model for the Present Study

In section 1.3 of this chapter, the functions of washback and the mechanisms through which

it is believed to operate were investigated. The present section explores a washback model

arrived at from a review of different empirical studies in various contexts. Its

fundamental aim is to investigate the effects of the introduction of new tests in this research

and to consider the nature of the evidence required to support claims of a washback effect.

The model proposes that the nature of washback from language tests flows from

overlap, that is, the distance between the contents of the test and the instructional objectives set out in the

relevant taught syllabus. The greater the correspondence between the two, the more likely

positive washback becomes. Nevertheless, in this model, washback is not simply a matter of

test design; it is realised through, and limited by, participant characteristics. Participants'

perceptions of test importance and difficulty, their attitudes and reactions, and their ability to

accommodate to test demands will moderate the strength of any effect and certainly the

evaluation of its value.

To give this model a structure, the researcher adapted a model of washback

that is based on a framework suggested by Hughes (1993). It is worth noting that this

framework is widely used in current research on

washback. The researcher discussed this model in some detail in section 1.4. The choice of

this washback model from among the many models available in the language testing area is

justified by the fact that such a framework is well suited to the nature of the present

research. In what follows, the different components that form the adopted and adapted

washback model in this study are discussed. However, before proceeding, it is

worthwhile to mention that, to describe the washback phenomenon and how it occurs, it is

important to distinguish between participants, process, and product, recognising that these

three components may be affected by the nature of a test.

1.6.1 Washback on Participants

In considering this first component in the adopted framework in this research, one can

point out that the participants' behaviour can either support or override the intended washback

effect of the introduction of the new testing system. As noted by Shohamy et al. (1996), the

results obtained from tests can have serious consequences for individuals as well as

programmes, since many crucial decisions are made on the basis of test results. In this study,

the researcher called for one of these two important sorts of washback, that is, washback on

participants. This idea overlaps, to some extent, with Bachman and Palmer's (1996: 30-31)

micro level of washback. On this point, Bailey (1999: 12) holds that the participants,

whether teachers or students, affected by washback may be influenced by information about a test

prior to its administration, or by ‘folk-information’ (such as reports from students who have

taken earlier versions of the test). Besides, these participants may also be influenced by several

sources of feedback following the administration of the test. These would include the actual

test scores provided by teachers, feedback from students, and feedback from the teachers on

the students' scores.

To access the participants' attitudes in a washback study, the literature on this

issue presents a variety of means of investigation. Alderson and Wall (1993) point to the

inadequacies of relying on survey data in isolation, but acknowledge that surveys can help to

explain teachers' behaviour by probing their understanding and beliefs. Watanabe (2004)

proposes that qualitative methods such as interviews can, in understanding washback in

context, provide access to the world view of the participants. He adds that qualitative

interviews can also assist the researcher both in the design of more quantitative instruments

and in the interpretation of the results (cited in Green, 2007 : 28).

1.6.2 Washback on Process

The second component in the adopted model is washback on process. Hughes (1993: 2)

defines process as “any actions taken by the participants which may contribute to the process

of learning”. Specifically, he includes processes such as materials development, syllabus design,

teaching methods, contents of teaching, learning strategies, and assessment.

To understand washback on process better, a number of language assessment

researchers recommend a triangulation of perspectives that incorporates the views of

teachers (Watanabe, 2004; Shohamy, 2001; Turner, 2001; Alderson & Wall, 1993).

Hence, they suggest that both questionnaire responses and interview data need to be

supported by another instrument. Cheng (1997), citing Bailey (1999), agrees that “observation

allows for a richer understanding of washback than surveys alone and argues for a

combination of asking through surveys and interviews - and watching through observation”

(cited in Green 2007 : 29).

1.6.3 Washback on Products

The last component in the adopted washback model in the current study is washback on

products. Hughes (1993) defines products as “what is

learnt (facts, skills, etc.) and the quality of learning”. What is notable about this component is

that it is sometimes difficult to untangle it from the two other components,

participants and process. For Bailey (1989), much of the literature about participants

and washback describes the various processes participants engage in. Such processes

include aspects such as reviewing what one does in teaching. Shohamy (1993)

highlights processes as well, claiming that negative washback often brings about “under

emphasis on the means by which the learner arrives at proficiency” (186). This argues for

including both processes and products in the model.

As for language outcomes, most language assessment researchers claim that it is

complicated to measure products. The reasons for the lack of consideration given to products

include the problem of comparing non-equivalent, often distant groups and the selection of

alternative outcome measures (Green, 2007: 29). For Madaus (1988), in evaluating outcomes,

it is important to bear in mind the circularity of evaluating test impact through score gains. A

rise in scores does not necessarily imply that there is an improvement in learning. Rather, the

scores may mask the reality that there is no positive washback of the test on the final outcomes

of learners. Because of this, and in order to capture washback on products in the current

study, the researcher will work on the evidential link between test design issues and test

score interpretation. This can result in gains on the new testing system.

Having explained the choice of a washback model, highlighted its structure and the

components it comprises, and introduced the appropriate research methodology to be employed to

access each of these components in this research, we now turn to the dependent variables of the

model arrived at above. These indicate how washback can be recognized and give a clear picture

of its nature. A thorough examination of these dependent variables is needed and will be

presented further through the two fundamental studies: the “Preliminary Study” and the

“Final Study” in Chapters 3 and 5.

On this matter, Green (2007) cites Alderson and Wall (1993), who argue that it is

necessary to define dependent variables in washback research; their 15 washback

hypotheses suggest predictions regarding content (what), methods (how), rate, sequence,

degree, and depth of teaching and learning as potential dependent variables for investigation.

In the same vein, calling for the same explicitness about these dependent variables, Bailey (1999)

remarks that washback studies can broadly be divided into those focusing on perceptions and

those concerning actions. Hughes (1993), whose ideas are the starting point of the washback

model adopted in this research, offers a model that treats the dependent

variables as a basis for research which encompasses both perceptions and actions and links

these two variables to learning outcomes. Bailey (1996), who developed this model, presents

it in the form of a flow diagram. Where the conditions outlined for washback are met,

washback will occur to participants, affecting their attitudes towards work.

Participants' attitudes will affect processes, including both what participants do and how they

do it. Processes concern aspects such as teaching materials, teaching, and learning. In their turn,

these processes will influence the product: the content and the extent of learning.

Drawing on the reviewed literature, dependent variables in this study will include the

effects of the existing testing system in Algerian secondary schools on

participants' attitudes and beliefs, the content and methods of teaching and learning, and the

students' outcomes, in the form of their scores and self-assessed gains. Likewise, the same

way of conceiving these dependent variables will be adopted for the effects of the newly

introduced testing system in the context under exploration on participants' attitudes, beliefs,

and reactions to that innovation. Each of these facets poses its own

challenges for the researcher in this investigation. Therefore, this study aims to explore how

different components in the Algerian educational system react when washback is

strategically anticipated, to determine the possible areas of washback intensity in teaching and

learning English in Algerian secondary schools, and to define the interrelationship

between ‘who’ changes ‘what’, ‘how’, ‘where’, and ‘why’.

Conclusion

To summarize, this chapter reviewed a number of issues related to the central

concept in this research. Crucially, an attempt has been made to elucidate the origins,

definitions, functions, and mechanisms of washback in language testing. In addition, this

literature review has considerably helped us to display the power and

authority of tests over the teaching and learning processes, and to indicate how language tests become

effective means of influencing educational systems and prescribing the behaviour of those who are

affected by their results. Some ideas related to the question of the impact of tests on teaching and

learning, whether positive or negative, are still not well explained, and the questions raised

remain without thorough and comprehensive answers. Furthermore, in this

chapter, an array of assessment studies on washback has revealed that a large number of

investigations of this phenomenon, from different perspectives and at multiple levels, are

available. At the same time, those studies have shown that few of them are of an empirical

nature, and findings become fewer when research turns to explore the washback effects of

language tests in classrooms.

The following chapter presents the theoretical background of the research methodology

employed in the present study.
