Assessment
Types of Test
1- Language Aptitude Test.
Although admittedly not common, this type of test predicts a person's success
prior to exposure to a second language, that is, the capacity or ability to
learn a foreign language (for example, the Modern Language Aptitude Test,
applied in the United States). Learners are given a number of tasks. For
example, examinees must learn a set of numbers through aural input and then
discriminate different combinations of those numbers, or they must learn a set
of correspondences between speech sounds and phonetic symbols. For more
details, see the table on p. 44 in your textbook.
3- Placement Test.
Some proficiency tests can play the role of a placement test, the purpose of
which is to place a student into a particular level or section of a language
curriculum or school. Such a test usually indicates the point at which the
student will find the material neither too easy nor too difficult but
appropriately challenging. The English as a Second Language Placement Test
(ESLPT) at San Francisco State University has three parts: (a) students read a
short article and then write a summary essay about it; (b) students write a
composition in response to an article; (c) a multiple-choice part in which
students read an essay and identify the grammatical errors in it.
4- Achievement Test.
It is the most important type of testing, since school teachers are responsible
for testing how much of a syllabus a student has learnt. It is based on a
detailed course syllabus (annual school examinations and all public
examinations are of this type). From this perspective, the test is related to
classroom lessons and units, so it should be limited to particular material
within a particular time frame. The aim is to see whether a course has achieved
its objectives or not. The maximum time allowed for the test is three hours.
5- Diagnostic Test.
This test is designed to diagnose a specific aspect of language: which skills or
aspects of a language programme a student has or has not mastered. For
example, a test in pronunciation might diagnose the phonological features of
English that a learner finds difficult. A typical test of this type was
developed by Clifford Prator (1972) to accompany a manual of English
pronunciation. Test-takers are directed to read a 150-word passage while they
are tape-recorded. The recordings are then analyzed phonologically with
reference to the learner's production.
University of Basrah /College of Education for Human Sciences
Department of English /Evening Study 2020-2021
Subject :Test Design & Assessment.
Year: Fourth
Set by: Kareem A. Abed
______________________________________________________
Lecture 3
Approaches to Language Testing : A Brief History.
Integrative tests, on the other hand, involve the testing of language in context
and are concerned with meaning and the total communicative effect of discourse.
There are two examples of integrative tests: the Cloze test and Dictation. A
Cloze test is a reading passage (perhaps 150-300 words) in which roughly every
sixth or seventh word has been deleted. The test-taker is required to supply
words that fit into those blanks. Oller (1979) claimed that Cloze test results
are good measures of overall proficiency. According to the theoretical
constructs underlying this claim, the ability to supply appropriate words in the
blanks requires a number of abilities that lie at the heart of competence in a
language. In Dictation, learners listen to a passage of 100-150 words, read
aloud by an administrator or played from an audiotape, and write what they have
heard, using correct spelling. There is a short pause after every phrase to
give the learners a chance to write down what they have heard. Supporters of
this method argue that dictation is an integrative test because it taps into
grammatical and discourse competencies.
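The deletion procedure described for the Cloze test can be sketched in a few
lines of Python. This is only an illustration: the function name make_cloze,
the sample sentence and the numbered-blank layout are assumptions made for the
example, and punctuation attached to a word is deleted together with it.

```python
def make_cloze(passage, n=7):
    """Delete every n-th word of a passage.

    Returns the gapped text and the list of deleted words (the answer key).
    The name and the blank format are illustrative assumptions.
    """
    gapped, key = [], []
    for position, word in enumerate(passage.split(), start=1):
        if position % n == 0:
            key.append(word)
            gapped.append(f"({len(key)}) ______")
        else:
            gapped.append(word)
    return " ".join(gapped), key


# Hypothetical one-sentence passage; a real Cloze passage has 150-300 words.
text = "The test taker reads the passage and supplies a word for each blank"
cloze_text, answers = make_cloze(text, n=6)
print(cloze_text)
print(answers)
```

With n=6 the sixth and twelfth words are removed, which matches the "every
sixth or seventh word" rule in the description above.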
1- Validity
Validity is the degree to which a test measures what it is supposed to
measure. Two questions arise: what precisely does the test measure, and how
well does it do that?
There are four kinds of validity. The first, content validity, is concerned
with what is being tested. The remaining three, empirical, face and construct
validity, are concerned with the extent to which the measurement is
satisfactory.
a- Content Validity
Almost certainly the most important issues for a teacher when preparing a test
are (a) the extent to which the test covers the syllabus to be tested and (b)
the relative importance of each area and the number of items given to it. The
first refers to the language area to be tested, while the second is concerned
with the extent to which the questions adequately cover that language area.
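Point (b) can be illustrated with a small Python sketch that shares out a fixed
number of items in proportion to the importance of each syllabus area. The
function name allocate_items, the area names and the weights are hypothetical;
a teacher would set them from the actual syllabus.

```python
def allocate_items(weights, total_items):
    """Share total_items across syllabus areas in proportion to their weights."""
    total_weight = sum(weights.values())
    exact = {area: total_items * w / total_weight for area, w in weights.items()}
    counts = {area: int(share) for area, share in exact.items()}
    # Items lost to rounding down go to the areas with the largest remainders.
    leftover = total_items - sum(counts.values())
    by_remainder = sorted(exact, key=lambda a: exact[a] - counts[a], reverse=True)
    for area in by_remainder[:leftover]:
        counts[area] += 1
    return counts


# Hypothetical relative importance of each area (any positive numbers will do).
syllabus = {"grammar": 40, "vocabulary": 30, "reading": 20, "writing": 10}
plan = allocate_items(syllabus, total_items=20)
print(plan)
```

The item counts always add up to the requested total, so no area of the
syllabus is dropped merely because of rounding.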
b- Empirical Validity
This is also referred to as statistical validity. If we want to check the
effectiveness of a test and to determine how well it measures, we should
compare the test results with the results of some independent outside criterion
that we believe is an indicator of the ability tested. If there is a high
correlation between them, then our test is empirically valid. Examples of
independent criteria include the score given at the end of a course, or an
external examination.
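The comparison described above is usually expressed as a correlation
coefficient. The sketch below computes the Pearson coefficient between test
scores and an outside criterion. The choice of Pearson's r, the function name
and the two score lists are assumptions made for illustration; the notes do not
say which coefficient to use.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must not all be identical")
    return covariance / sqrt(spread_x * spread_y)


# Invented scores: our test versus an external examination, same six students.
test_scores = [55, 62, 70, 78, 85, 91]
criterion_scores = [58, 60, 72, 75, 88, 90]
r = pearson_r(test_scores, criterion_scores)
print(round(r, 2))
```

A value close to 1 means that students who do well on the test also do well on
the criterion, which is what empirical validity requires.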
c- Face Validity
This means the extent to which students view the test (the way a test looks to
the testees) as fair, relevant and useful for improving learning. In other
words, it is the degree to which a test looks right or appears to measure the
knowledge or ability it claims to measure.
d- Construct Validity
Construct validity means that the testing methods should be in harmony with the
teaching method used. The teaching programme is not likely to achieve its
objectives unless there is a close relationship between the course objectives
and the instructional material on the one hand and the teaching and testing
methods on the other. For example, when a course of study emphasizes the
communicative aspect of the language but the test is designed according to
discrete-point items, the construct validity of the test will be low.
2- Reliability
Reliability is the stability of test scores. In other words, if a test is given
twice to the same group of students under the same conditions, it should give
the same results. The requisites of a reliable test are: (a) multiple samples,
which means that a test must be long enough to provide a generous sampling of
the area tested and should contain a wide variety of levels of difficulty; (b)
standard conditions, which means that all students take the test under
identical conditions (in a listening test, for example, all students must be
able to hear the items clearly); (c) standard tasks, which means that all
students must be given the same items, or items of equal difficulty; and (d)
standard scoring, which means that all papers must be scored in an identical
manner. A teacher or scorer should give the same, or nearly the same, score
repeatedly for the same test performance.
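Requisite (d) can be checked by marking the same papers twice and comparing the
two sets of marks. The following Python sketch reports how often the marks are
identical and how often they are "nearly the same". The tolerance of one mark,
the function name and the marks themselves are illustrative assumptions.

```python
def scoring_agreement(first_pass, second_pass, tolerance=1):
    """Return (exact, near) agreement rates between two markings of the same papers.

    exact: share of papers given identical marks both times.
    near:  share of papers whose two marks differ by at most `tolerance`.
    """
    if len(first_pass) != len(second_pass) or not first_pass:
        raise ValueError("both markings must cover the same, non-empty set of papers")
    differences = [abs(a - b) for a, b in zip(first_pass, second_pass)]
    exact = sum(d == 0 for d in differences) / len(differences)
    near = sum(d <= tolerance for d in differences) / len(differences)
    return exact, near


# Invented marks for five papers scored on two separate occasions.
first = [14, 17, 9, 12, 18]
second = [14, 16, 9, 15, 18]
exact, near = scoring_agreement(first, second)
print(exact, near)
```

Here three of the five papers receive identical marks and four are within one
mark, so the fourth paper is the one whose scoring needs a second look.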
3- Practicality
Two features have to be considered to achieve the practicality of a test:
economy, which means the cost in time, money and personnel of administering a
particular test, and ease, which means the degree of difficulty experienced in
administering and scoring a test. For example, an oral test that demands the
use of a tape recorder is not practical if it has to be administered to
thousands of students.
4- Accuracy
Accuracy means a test should be free from grammatical, spelling, and
punctuation errors. The numbering of questions, sub-questions, and items
should be correct. The directions for each question should be accurately worded,
with the marks allotted to it stated, as well as the time allotted for the whole
test. For example, "Choose the correct option" is an accurate direction,
whereas "Complete the following sentences" is inaccurate.
5- Authenticity
Bachman and Palmer (1996) define authenticity as the degree of
correspondence of the characteristics of a given language test task to the
features of a target language task. Essentially, when you make a claim for
authenticity in a test task, you are saying that this task is likely to be enacted in
the real world.
In a test, authenticity may be present in the following ways:
- The language in the test is as natural as possible.
- Items are contextualized rather than isolated.
- Topics are meaningful (relevant, interesting) for the learner.
- Tasks represent, or closely approximate, real-world tasks.
6- Washback
Washback refers to the effect of testing on teaching and learning. When a test
helps a teacher to identify the weak and strong points of his students, this is
called washback.
Positive washback refers to expected test effects. For example, a test may
encourage students to study more or may promote a connection between
standards and instruction. Negative washback refers to the unexpected, harmful
consequences of
_______________________________________________________________________
Lecture 5
Test Design and Construction
1- Planning the test. Effective testing requires careful planning and much
consideration of the balance of the test content.
2- Preparing the test items. A teacher may write more items than he could
possibly need, or than the testees could answer during the time allotted for
the test, and so some items have to be eliminated. As a result, we might omit
some of the material which the students had mastered at a previous stage of
learning. We could thus (a) concentrate on the new range of activities that
were taught later in the course. Other important issues to be considered in
this context are (b) the time to be provided for the test, (c) the test
techniques, which may include multiple choice items, completion, short answer
questions, essay writing and so on, and (d) the test directions, which must be
simple, clear, easily understood and free from possible ambiguity.
3- Reviewing the items. Once the items are written, they should be set aside for
a day or two before being reviewed by the teacher. When the teacher is
satisfied with the test, it should be submitted to a colleague who is
experienced in the subject. The comments of this outside reviewer have to be
taken into consideration, and the faulty items must be corrected before
preparing the final version of the test.
Note: If a test is for research purposes, it should be submitted to a jury of
at least ten language and testing experts.
4- Setting the scoring scheme. For the purpose of objectivity and reliability,
a scoring scheme should be made. Scoring refers to the process of correcting
tests and assigning numerical scores. In objective tests each item is marked as
correct or incorrect (full mark or zero). In semi-objective tests, like
transformation or completion with three or four words, a different scoring
scheme is required. For example, two marks are given to each correct answer,
zero to an incorrect answer and one mark to a recognizable answer with some
errors.
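The scheme in step 4 can be written down as a short Python sketch so that every
paper is marked by the same rule. The labels "correct", "partial" and
"incorrect", the function name, and the choice of one mark as the "full mark"
for an objective item are assumptions made for this example.

```python
def score_item(item_type, judgement):
    """Return the marks for one item under the scheme described in step 4.

    item_type: "objective" or "semi-objective".
    judgement: "correct", "partial" (recognizable, with some errors) or "incorrect".
    """
    if item_type == "objective":
        # Full mark (assumed here to be 1) or zero; no partial credit.
        return 1 if judgement == "correct" else 0
    if item_type == "semi-objective":
        return {"correct": 2, "partial": 1, "incorrect": 0}[judgement]
    raise ValueError(f"unknown item type: {item_type}")


# One invented paper: two objective items and two semi-objective items.
paper = [
    ("objective", "correct"),
    ("objective", "incorrect"),
    ("semi-objective", "partial"),
    ("semi-objective", "correct"),
]
total = sum(score_item(kind, judgement) for kind, judgement in paper)
print(total)
```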
5- Reproducing the test. The next step is the presentation of the paper itself.
Where possible, it should be printed so as to appear neat and tidy. Nothing is
worse and more confusing to the testees than an untidy test paper, full of
spelling mistakes, omissions and corrections.
7- Analyzing the results. Having given the test and collected the papers, the
teacher's task is to calculate the number of students who have responded
correctly to the test items and reached the required level, for example 70 or
75 per cent of the items successfully answered. Students reaching this level
are those who have succeeded in terms of the course objectives, and those who
fail to reach the level are at-risk students who need assistance.
8- Using the results. Besides measuring the students' achievement, this last
step helps to pinpoint the problematic items some students failed to answer
correctly, and diagnose their specific weaknesses. This helps the teacher to
provide feedback, and extra exercises.
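Steps 7 and 8 can be sketched together in Python: the first part separates the
students who reach the required level from those at risk, and the second
pinpoints the items that many students failed. The function name, the data
layout, the 70 per cent level chosen from the "70 or 75" mentioned above, and
the cut-off of one half for a "problematic" item are all assumptions.

```python
def analyse_results(responses, mastery_percent=70, weak_item_cutoff=0.5):
    """Analyse marked papers.

    responses maps each student to a list of 1 (correct) / 0 (incorrect),
    one entry per item. Returns (passed, at_risk, weak_items), where
    weak_items holds the 1-based numbers of items that fewer than
    weak_item_cutoff of the students answered correctly.
    """
    n_items = len(next(iter(responses.values())))
    passed, at_risk = [], []
    for student, marks in responses.items():
        # Integer comparison avoids rounding problems exactly at the cut-off.
        if sum(marks) * 100 >= mastery_percent * n_items:
            passed.append(student)
        else:
            at_risk.append(student)
    # Proportion of students who answered each item correctly.
    proportion_correct = [
        sum(marks[i] for marks in responses.values()) / len(responses)
        for i in range(n_items)
    ]
    weak_items = [i + 1 for i, p in enumerate(proportion_correct) if p < weak_item_cutoff]
    return passed, at_risk, weak_items


# Invented results for four students on a four-item test.
responses = {
    "S1": [1, 1, 0, 1],
    "S2": [1, 0, 0, 1],
    "S3": [1, 1, 0, 0],
    "S4": [1, 1, 1, 1],
}
passed, at_risk, weak_items = analyse_results(responses)
print(passed, at_risk, weak_items)
```

In this invented class, item 3 was answered correctly by only one student, so
it is the material on which feedback and extra exercises would concentrate.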
Test Design and Assessment
______________________________________________________________
Lecture 6
Constructing Questions
Before going deep into how to form various types of questions, like True/False
questions, writing a composition, multiple-choice items, gap-filling questions
and so on, we would like to refer to two types of tests: objective tests and
subjective tests.
Objective tests, like True/False, multiple-choice items, matching, gap filling
and completion, rearrangement and odd item out, are the ones which are marked
with either a full mark or zero because they have one limited answer, i.e.,
there is no flexibility to give half a mark or something like that.
Subjective tests include short-answer essay, extended-response essay, problem
solving and performance test items, and writing a composition.
Poor: Coal, which used to be the most important source of power for
ages, is used nowadays in most fields of primary industries.
Better: Coal is used nowadays in primary industries.
5- The number of the True statements and the false statements should
be approximately equal in one test.
Advantages
1- True-False items are easy to construct.
2- They can cover a wide sampling of the course material.
3- Scoring is easy.
Disadvantages
1- Results can be distorted by the testee's guessing and cheating.
2- The learning outcomes are largely limited to the knowledge area.
This test in its simplest form consists of two lists with instructions on how
matching is to be undertaken. The items in the first list, for which a match is
sought, are called "premises", and those in the other list, from which a
selection is made, are called "responses". The testee's task is to identify the
pairs of items that are to be associated.
Advantages
1- The first advantage is ease of construction.
2- It is possible to measure a large amount of related factual material in a
relatively short time.
For example:
Pictures and Words
Questions and Answers
Titles and Texts
Authors and Titles of Books
Machines and Uses
Verbs and Nouns (collocation)
Disadvantages
1- Matching tests are restricted to the measurement of factual information.
2- Finding significant homogeneous material is sometimes difficult.
3- Matching tests are sometimes considered a reduced multiple-choice test. Once
you have matched the first items, you are left with fewer options. Eventually
the answer to the difficult item may be the only one left.
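The third disadvantage can be made concrete with a short calculation. The
Python sketch below assumes that the two lists are equally long, that each
response is used exactly once, and that every earlier match was made correctly.
The function name is hypothetical.

```python
from fractions import Fraction


def guess_chances(n_pairs):
    """Chance of guessing each successive match by pure luck.

    Assumes equal-length lists, each response used once, and all earlier
    matches correct, so one response is eliminated after every match.
    """
    return [Fraction(1, n_pairs - done) for done in range(n_pairs)]


chances = guess_chances(5)
print([str(c) for c in chances])
```

For five pairs the chances rise from 1/5 for the first match to 1 for the last,
which is why the answer to the hardest item can simply be "the only one left".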