Classroom Testing

The document discusses various types of classroom tests including diagnostic, progress, achievement, placement, and proficiency tests. It examines the purposes of testing students, including to check their understanding, evaluate teaching, and determine future lessons. The document also covers qualities of effective tests such as validity, reliability, clarity, and positive impact.

DELTA Testing page 1

CLASSROOM TESTING

Introduction
Task 1

Why do we test students? Tick the reasons which you think are justifiable and
give examples of what kind of tests might help in each case.

☐ to punish them

☐ to motivate them to study

☐ to check whether they've studied the material from the previous lesson

☐ because they expect it

☐ because the Director of Studies says we must

☐ to determine which class they should join

☐ to determine whether they should pass or repeat the class

☐ to award them qualifications such as the FCE

☐ to see how intelligent they are

☐ to evaluate our own teaching

☐ to determine what to do next

☐ to plan remedial lessons

© CELT Athens (GV) 2011



Types of Tests

Task 2

Suggest which of the purposes in Task 1 each of the following tests might be said
to serve.

A diagnostic test is usually a very short test that focuses on one area of language use.
Such tests can be administered as often as the class teacher feels necessary, and they
are an invaluable source of information for lesson planning.

A progress test is usually longer and focuses on a number of areas that have been
"covered" in a period of time. Progress tests are usually administered at regular
intervals.

An achievement or attainment test is long and formal. It is usually administered at the
end of the year and its results can be critical! It reflects the emphases of the course as a
whole.

A placement test is useful if the level of a learner is not known. It is administered
immediately after registration.

A proficiency test is one that an examining body is responsible for. It leads to a
qualification which certifies that the examinee has reached a certain level of proficiency
in the language.


Qualities of a Test

Validity

Traditionally, four different types of validation procedure were recognised.

• Construct validation: to establish that a test measures a theoretical construct, usually through some sort of experimental reasoning
• Criterion-related validation: to establish that a test agrees with some external criterion which seems to measure the same skill(s) as the test in question. There were two types of criterion-related validation: (a) concurrent criterion-related validation, which involves comparing the results of the test to another measure given or obtained at the same point in time; and (b) predictive criterion-related validation, which compares the results of the test to another measure or variable obtained after the test is completed, and which the test seeks to predict.
• Content validation: to establish that a test agrees with a specified content or theoretical domain, through the comparison of the tasks on the test to some description of this domain. This is the area of test validation where test specifications have had a historically important role.
• Face validation: to judge a test prima facie, that is, to simply look at a test and determine its validity. Bachman (1990) and others called for the elimination of this form of test validity, because it was weak in its empirical and evidentiary base.

A more modern approach views validity as a unitary phenomenon.

Validation is a process of producing evidence for the claim that the test serves its
purpose, using the classical "types" of validity as sources of evidence.

Thus, validity is not a property of a test; rather, it is a property of the inferences drawn
from the test. Tests can be misused or underused, and one must know much about a
particular test use situation before determining if a test in that situation provides
evidence upon which to base a valid inference about test takers.

The study of the consequences of tests is a key feature of test validation. Inferences
from testing mean that we look at what the test does to the people who take it: What
kinds of decisions are made based on the test result? How does the test taker benefit
from the test? How does the test taker suffer?

The classic role of reliability (see below) is as but one form of evidence of the validity of
a test; it has thus even been suggested that reliability become an element in a validity
argument.


Reliability
Reliability is a quality of test scores. A perfectly reliable score would be one which is free
from errors of measurement. It thus has to do with the consistency of measures across
different times, test forms, raters and other characteristics of the measurement context.

Some of the most common approaches to measuring reliability include:

• Internal consistency (measurement of how consistent test-takers' performances on the different parts of the test are with each other)
• Intra-rater reliability (obtaining two independent ratings from a single rater for each individual language sample)
• Inter-rater reliability (obtaining two independent ratings from two raters for each individual language sample)
• Equivalence (reliability of parallel forms of the test)
Discrimination
Tests should be pitched at the right level for the learners so that the teacher can get an
accurate picture of the various levels of ability within her class.
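One common way to check discrimination item by item is a classical discrimination index: compare how the strongest and weakest learners performed on each item. A minimal sketch with invented right/wrong item scores and total test scores:

```python
# Sketch: a classical discrimination index for one test item.
# D = (proportion correct in the top third of testees, ranked by total score)
#   - (proportion correct in the bottom third).
# Items near 0 (or negative) fail to separate stronger from weaker learners.
# All data below is invented for illustration.

def discrimination_index(item_scores, total_scores):
    # order the item's 1/0 scores by each learner's total test score
    ranked = [item for _, item in sorted(zip(total_scores, item_scores))]
    n = len(ranked) // 3                 # size of each extreme group
    bottom, top = ranked[:n], ranked[-n:]
    return sum(top) / n - sum(bottom) / n

item = [1, 0, 1, 1, 0, 0, 1, 1, 0]             # right/wrong on this item
totals = [78, 40, 65, 90, 35, 52, 88, 70, 45]  # overall test scores
print(discrimination_index(item, totals))      # → 1.0: perfect discrimination here
```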

Clarity
Test instructions should be very clear and examples should always be included!
Explanations should be given to the learners when they evidently do not understand
what a question is asking.

Practicality
Tests should be practical to administer and to mark. Test design should take into
account the time that is available for marking and issuing results in relation to the human
resources available.

Positive lmpact
The test impact is the effect that the test may have on individuals, policies or practices,
within the classroom, the school, the educational system or society as a whole.
Washback is sometimes used as a synonym of impact, but is more frequently used to
refer to the effects of tests on teaching and learning.


Qualities of a Test
Task 3

Which of these complaints made by teachers and students about tests and testing
methods do you consider legitimate?

• I had to mark 35 tests in two days and each of them contained three compositions!

• We were supposed to do a reading test but most of the exercises in it were grammar!

• There must be something wrong with this test. How can my students have failed the listening component when they scored 80% or more in the writing component?

• It's disgusting! We have to take a test once a month, and as if this wasn't enough, there's another test to be taken at the end of the year!

• During the year we did a lot of games and speaking activities, but the final test only contained grammar and vocabulary exercises!

• She gave us a handwritten exercise on a scruffy piece of paper and then told us that it was a test!

• I wrote exactly the same things as Makis, but he got a 20 and I only got a 15! I think it's because the teacher hates me....

• We were left alone in the classroom so most students cheated. It's not fair to those of us who studied hard for the test.

• I had never done this sort of exercise before and I didn't appreciate having to do it for the first time as part of a test!

• I did not know what to do and when I asked her what the exercise was asking she just said I should know!

• I hate it when I have to give them multiple-choice tests: the correct answer always stands out; it's either longer or shorter than the others and you never know whether the learners really knew why it was right!

• The test was so difficult none of us got more than 20%.

• I don't think I can trust this exam; last year the pass rate was 40%, the year before last it was 65% and this year it's 20%. Is this really supposed to be the same exam?

• I want to emigrate to Australia, where I'm hoping to get a job as a nurse. I need to produce evidence that my English is good enough, but the evidence they will accept is a certificate in academic English.

Test Item Evaluation

Task 4

Study the items below, which were all taken from progress and achievement tests. In
each case, the level and aim of the item is stated. Please consider how effective each
item is in serving the aim stated and, if ineffective, suggest ways of improving it.

One word is missing in each of these sentences. Indicate where the word is
missing and what it is. (10 marks)
Example: The room smelt gas. of
The Thames is 200 mile-long river.
Oxford is the west of London.

upper progress test, grammar section

Do you eat too much chocolate? Do you read too many romantic novels? Use
the table to talk about your bad habits.

eat
read
I drink
play
smoke

elementary diagnostic test; focus on form of new item

What would you like to be able to do? Choose three things. Write three
sentences.

play chess write books make films


sail a yacht make pottery sing in a group
windsurf fly a helicopter

I'd like to be able to write books.

pre-intermediate progress test, grammar section


The word affluent in line 6 of the text means


A. abounding, wealthy
B. stupid
C. poor
D. fluent
progress test, … section

Dear John,
I've got some news for you! My uncle gave me some money for my
birthday so I can come to visit your country this summer.
What are you going to do? Are you going away? Tell me when you
are going to be at home so I can visit you.
I don't know your country very well. I've got a small guide book but
it isn't very good. Can you tell me something about it? Which places should I
visit? Are there any interesting things I should see?

Mark

True or false?
Mark's uncle gave him money for his birthday.
Mark is going to visit John's country this winter.
Mark doesn't want to visit John.
Mark knows a lot of things about John's country.

intermediate achievement test, reading section

Which of the test items above would you consider consistent with communicative
language teaching?


Test features

Study these definitions of test features and discuss advantages and disadvantages of
each type of test.

integrative / discrete-point
norm-referenced / criterion-referenced
objective / subjective
direct / indirect
An integrative test attempts to assess performance in a number of language
systems/skills at the same time through items which involve a number of language
abilities, whereas a discrete-point (or discrete-item) test seeks to isolate one aspect of
a system or skill to test in each item.

A norm-referenced test compares the performance of one testee with that of the other
testees, whereas a criterion-referenced one compares each testee's performance with
pre-determined criteria.
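The difference can be seen by interpreting one raw score both ways. A minimal sketch with invented scores, where the cut score of 60 is an arbitrary example of a pre-determined criterion:

```python
# Sketch: the same raw score interpreted norm-referenced (rank against the
# group) versus criterion-referenced (against a fixed cut score).
# Scores and the cut score are invented for illustration.
scores = [34, 48, 55, 61, 67, 72, 80]  # the whole group's raw scores
my_score = 61

# norm-referenced: what percentage of the group did this testee beat?
percentile = 100 * sum(s < my_score for s in scores) / len(scores)

# criterion-referenced: did the testee reach the pre-determined standard?
passes = my_score >= 60  # hypothetical cut score

print(f"percentile rank: {percentile:.0f}, passes criterion: {passes}")
```

Note that the norm-referenced verdict shifts if the group changes, while the criterion-referenced verdict shifts only if the cut score does.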

If you need to use your judgement in marking a test, it is a subjective one; an objective
test can be marked without the marker needing to use their judgement.

A direct test tests directly the skills or abilities that it sets out to test, whereas an
indirect one attempts to test underlying skills or systems.


Forms of Assessment

Which of the forms of assessment shown below do you use in class? Discuss
with your partner:

• whether you regularly use any of these assessment techniques
• to what extent you use each
• the relative importance you attach to each of them
• whether your emphases change depending on the class or level being taught

TESTING PROCEDURES

LINGUISTIC FACTORS              NON-LINGUISTIC FACTORS
• written homework              • attitude/effort
• written grammar activities    • participation in class
• speaking activities           • group work
• projects                      • organisation of work
• listening tasks               • presentation of work
• reading tasks                 • punctuality
• writing tasks
• vocabulary activities


Test Evaluation: Questions to ask yourself

Test Type
• What was the aim of the test? Was it ...
  > ... in the language?
  > to establish whether the learner has successfully completed a course?

Test Features
• Can you mark the test without using your judgement?
• Do the test items test one thing at a time?
• Is marking based on a predetermined scheme or is it relative to the learners' performance?

Validity
• Does the test test what it is supposed to test? How do you know?
• Does it use techniques compatible with your teaching?
• Are these techniques consistent with a theory of learning and teaching that you subscribe to?
• Does it focus on a representative sample of the work done?
• Do the results agree with results from other sources?

Reliability
• How did you ensure that conditions were the same for all testees?
• How did you decide on a marking scheme?
• How did you achieve a high degree of objectivity in your marking?

Instructions
• How clear were your instructions?
• Were there examples for all exercises?

Discrimination
• Was the test at the right level of difficulty?
• What did the results show?

Practicality
• How easy was it to administer the test?
• How easy was it to mark it?

Washback
• How did the test affect your teaching?
• How did the learners feel about it?


Communicative Testing

What does communication involve?

• choice of language form
• focus on the message
• communicative purpose
• audience
• integration of skills

How can a test be made more communicative?

• Should one or more correct answers be possible?
• What about contextualisation of activities? What kinds of contexts should be present?
• Should the test focus on the language systems or on skills?
• How could the objectives of such a test be defined?
• How would you go about correcting and marking a communicative test? What would be your priorities?


Further Reading

Alderson, J. C., C. Clapham and D. Wall. 1995. Language Test Construction and Evaluation. Cambridge: Cambridge University Press.
Bachman, L. F. 1990. Fundamental Considerations in Language Testing. Oxford: Oxford University Press.
Bachman, L. F. 1991. What does language testing have to offer? TESOL Quarterly 25 (4): 671-704.
Bachman, L. F. and A. S. Palmer. 1996. Language Testing in Practice. Oxford: Oxford University Press.
Brindley, G. 2001. Assessment. In Carter, R. and D. Nunan (eds.) The Cambridge Guide to Teaching English to Speakers of Other Languages. Cambridge: Cambridge University Press, 137-143.
Council of Europe. 2002. Common European Framework of Reference for Languages: Learning, teaching, assessment. Cambridge: Cambridge University Press.
Harris, M. and P. McCann. 1994. Assessment. Oxford: Heinemann.
Hughes, A. 2003. Testing for Language Teachers. Second Edition. Cambridge: Cambridge University Press.
McNamara, T. 2001. Language assessment as social practice: challenges for research. Language Testing 18 (4): 333-349.
Weir, C. 1993. Understanding and Developing Language Tests. Hemel Hempstead: Prentice Hall.
Widdowson, H. G. 1979. Explorations in Applied Linguistics. Oxford: Oxford University Press.
Widdowson, H. G. 1990. Aspects of Language Teaching. Oxford: Oxford University Press.
