
Test Features

I. Direct versus indirect testing

Testing is said to be direct when it requires the candidate to perform precisely the skill that we wish to measure. If
we want to know how well candidates can write compositions, we get them to write compositions. If we want to
know how well they pronounce a language, we get them to speak. The tasks, and the texts that are used in direct
testing, should be as authentic as possible. The fact that candidates are aware that they are in a test situation means
that the tasks can never be fully authentic. Nevertheless, every effort is made to make them as realistic as possible.

Direct testing is easier to carry out when it is intended to measure the productive skills of speaking and writing. The
very acts of speaking and writing provide us with information about the candidate’s ability. With listening and
reading, however, it is necessary to get candidates not only to listen or read but also to demonstrate that they have
done this successfully. Testers have to devise methods of eliciting such evidence accurately and without the method
interfering with the performance of the skills in which they are interested. Appropriate methods for achieving this
are discussed in Chapters 11 and 12. Interestingly enough, in many texts on language testing it is the testing of
productive skills that is presented as being most problematic, for reasons usually connected with reliability. In fact
these reliability problems are by no means insurmountable, as we shall see in Chapters 9 and 10.

Direct testing has a number of attractions. First, provided that we are clear about just what abilities we want to
assess, it is relatively straightforward to create the conditions which will elicit the behaviour on which to base our
judgements. Secondly, at least in the case of the productive skills, the assessment and interpretation of students’
performance is also quite straightforward. Thirdly, since practice for the test involves practice of the skills that we
wish to foster, there is likely to be a helpful backwash effect.

Indirect testing attempts to measure the abilities that underlie the skills in which we are interested. There was a time
when some professional testers would use the multiple choice technique to measure writing ability. Their items
were of the following kind where the candidate had to identify which of the underlined elements is erroneous or
inappropriate in formal standard English:

At the outset the judge seemed unwilling to believe anything that was said to her by my wife and I.

(The erroneous element here is ‘my wife and I’, which in formal standard English would be ‘my wife and me’.)

While the ability to respond to such items has been shown to be related statistically to the ability to write
compositions (although the strength of the relationship was not particularly great), the two abilities are far from
being identical. Another example of indirect testing is Lado’s (1961) proposed method of testing pronunciation
ability by a paper-and-pencil test in which the candidate has to identify pairs of words which rhyme with each other.

Perhaps the main appeal of indirect testing is that it seems to offer the possibility of testing a representative sample
of a finite number of abilities which underlie a potentially indefinitely large number of manifestations of them. If, for
example, we take a representative sample of grammatical structures, then, it may be argued, we have taken a
sample which is relevant for all the situations in which control of grammar is necessary. By contrast, direct testing is
inevitably limited to a rather small sample of tasks, which may call on a restricted and possibly unrepresentative
range of grammatical structures. On this argument, indirect testing is superior to direct testing in that its results are
more generalisable.

The main problem with indirect tests is that the relationship between performance on them and performance of the
skills in which we are usually more interested tends to be rather weak in strength and uncertain in nature. We do not
yet know enough about the component parts of, say, composition writing to predict accurately composition writing
ability from scores on tests that measure the abilities that we believe underlie it. We may construct tests of
grammar, vocabulary, discourse markers, handwriting, punctuation, or of any other linguistic element. But we will
still not be able to predict accurately scores on compositions (even if we make sure of the validity of the composition
scores by having people write many compositions and by scoring these in a valid and highly reliable way).

It seems to us that in our present state of knowledge, at least as far as proficiency and final achievement tests are
concerned, it is preferable to rely principally on direct testing. Provided that we sample reasonably widely (for
example require at least two compositions, each calling for a different kind of writing and on a different topic), we
can expect more accurate estimates of the abilities that really concern us than would be obtained through indirect
testing. The fact that direct tests are generally easier to construct simply reinforces this view with respect to
institutional tests, as does their greater potential for positive backwash. It is only fair to say, however, that many
testers are reluctant to commit themselves entirely to direct testing and will always include an indirect element in
their tests. Of course, to obtain diagnostic information on underlying abilities, such as control of particular
grammatical structures, indirect testing may be perfectly appropriate.

In summary, we might say that both direct and indirect testing rely on obtaining samples of behaviour and drawing
inferences from them. While sampling may be easier in indirect testing, making meaningful inferences is likely to be
more difficult. Accurate inferences may be more readily made in direct testing, though it may be more difficult to
obtain samples that are truly representative. One can expect the backwash effect of direct testing to be the more
positive.

Before ending this section, it should be mentioned that some tests are referred to as semi-direct. The most obvious
examples of these are speaking tests where candidates respond to recorded stimuli, with their own responses being
recorded and later scored. These tests are semi-direct in the sense that, although not direct, they simulate direct
testing.

II. Discrete point versus integrative testing

Discrete point testing refers to the testing of one element at a time, item by item. This might, for example, take the
form of a series of items, each testing a particular grammatical structure. Integrative testing, by contrast, requires
the candidate to combine many language elements in the completion of a task. This might involve writing a
composition, making notes while listening to a lecture, taking a dictation, or completing a cloze passage. Clearly this
distinction is not unrelated to that between indirect and direct testing. Discrete point tests will almost always be
indirect, while integrative tests will tend to be direct. However, some integrative testing methods, such as the cloze
procedure, are indirect. Diagnostic tests of grammar of the kind referred to in an earlier section of this chapter will
tend to be discrete point.

III. Norm-referenced versus criterion-referenced testing

Imagine that a reading test is administered to an individual student. When we ask how the student performed on the
test, we may be given two kinds of answer. An answer of the first kind would be that the student obtained a score
that placed her or him in the top 10 percent of candidates who have taken that test, or in the bottom five percent; or
that she or he did better than 60 percent of those who took it. A test which is designed to give this kind of
information is said to be norm-referenced. It relates one candidate’s performance to that of other candidates. We
are not told directly what the student is capable of doing in the language.

An answer of the second kind would tell us what the student is able to do in the language, for example by assigning him or her to one of a series of defined reading levels. Testing for assignment to levels is intended to be carried out in a face-to-face situation, with questions being asked
orally. The tester gives the candidate reading matter of different kinds and at different levels of difficulty, until a
conclusion can be made as to the candidate’s ability. This can only be done, of course, with relatively small numbers
of candidates.

In this case we learn nothing about how the individual’s performance compares with that of other candidates.
Rather we learn something about what he or she can actually do in the language. Tests that are designed to provide
this kind of information directly are said to be criterion-referenced.

When the previous edition of this book was published, it was not difficult to point to major language tests which
were norm-referenced. The scores which were reported did not indicate what a candidate could or could not do.
Rather a numerical score was provided, which candidates, teachers and institutions had to interpret on the basis of
experience. Only over time did it become possible to relate a person’s score to their likely success in coping in
particular second or foreign language situations.

Pure criterion-referenced tests classify people according to whether or not they are able to perform some task or set
of tasks satisfactorily. The tasks are set, and the performances are evaluated. It does not matter in principle whether
all the candidates are successful, or none of the candidates is successful. In broad terms, tasks are set, and those
who perform them satisfactorily ‘pass’; those who don’t, ‘fail’. This means that students are encouraged to measure
their progress in relation to meaningful criteria, without feeling that, because they are less able than most of their
fellows, they are destined to fail. Criterion-referenced tests therefore have two positive virtues: they set meaningful
standards in terms of what people can do, which do not change with different groups of candidates, and they
motivate students to attain those standards. We welcome the trend to make major tests more criterion-referenced.

Books on language testing have tended to give advice which is more appropriate to norm-referenced testing than to
criterion-referenced testing. One reason for this may be that procedures for use with norm-referenced tests
(particularly with respect to such matters as the analysis of items and the estimation of reliability) are well
established, while those for criterion-referenced tests are not. The view taken in this book, and argued for in Chapter
6, is that criterion-referenced tests are often to be preferred, not least for the positive backwash effect they are
likely to have. The lack of agreed procedures for such tests is not sufficient reason for them to be excluded from
consideration.

IV. Objective testing versus subjective testing

The distinction here is between methods of scoring, and nothing else. If no judgement is required on the part of the
scorer, then the scoring is objective. A multiple choice test, with the correct responses unambiguously identified,
would be a case in point. If judgement is called for, the scoring is said to be subjective. There are different degrees of
subjectivity in testing. The impressionistic scoring of a composition may be considered more subjective than the
scoring of short answers in response to questions on a reading passage.

Objectivity in scoring is sought after by many testers, not for itself, but for the greater reliability it brings. In general,
the less subjective the scoring, the greater agreement there will be between two different scorers (and between the
scores of one person scoring the same test paper on different occasions). However, there are ways of obtaining
reliable subjective scoring, even of compositions.
