Language Testing

The document discusses the key characteristics of a good language test: validity, reliability, and practicality. It defines each characteristic and provides examples. Validity refers to what a test accurately measures, such as content or skills. Reliability means a test consistently measures examinees. Practicality concerns a test's cost, ease of use, and interpretation. A good test must demonstrate these essential qualities.

Uploaded by

Handi Pabriana


LANGUAGE TESTING

CHAPTER TWO
CHARACTERISTICS OF A GOOD TEST

A good test possesses three qualities:

1- Validity
2- Reliability
3- Practicality
Any test that we use must be appropriate to our objectives, dependable in the
evidence it provides, and applicable to our particular situation.
Without any one of these three qualities, a test would be a poor test.
RELIABILITY
1- The meaning of reliability.
2- Types of estimates of reliability.
3- Estimating the reliability of speeded tests.
4- The question of satisfactory reliability.
5- The standard error of measurement.
1- The meaning of reliability.
Reliability is the stability of test scores. A test cannot measure anything well
unless it measures consistently.
To have confidence in a measuring instrument, we would need to be assured
that approximately the same results would be obtained:
A- If we tested a group on Tuesday instead of Monday.
B- If we gave two parallel forms of the test to the same group on Monday and on
Tuesday.
C- If we scored a particular test on Tuesday instead of Monday.
D- If two or more competent scorers scored the test independently.
Two types of consistency or reliability:
1- Reliability of the test itself.
2- Reliability of the scoring of the test.
Test Reliability:
It is affected by a number of factors:
A- The adequacy of the sampling of tasks. The more samples of students’
performance we take, the more reliable our assessment of their knowledge and
ability will be.
B- The conditions under which the test is administered. These tend to
fluctuate from administration to administration.
C- Student motivation. Poor motivation will also lower the reliability of a test.
Sometimes the lack of proper motivation can be attributed to weaknesses
in the test or the testing procedure.

Scorer or Rater Reliability:

It concerns the consistency with which test performances are evaluated.
A- Would one scorer give the same score repeatedly for the same test
performance?
B- Would two or more scorers assign equivalent scores for the same
performance?
Scorer reliability is nearly perfect in multiple-choice tests.
It is low in free-response tests such as compositions, where individual
judgment must be involved.
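The second question above (do two scorers assign equivalent scores?) can be checked numerically. The following is a minimal sketch, not a method given in these slides: it computes simple percent agreement and Cohen's kappa, a commonly used chance-corrected agreement index. The band scores and the function names are hypothetical, chosen only for illustration.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of performances given the identical score by both raters."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same band
    # if each assigned bands independently at their own observed rates.
    expected = sum(counts_a[band] * counts_b[band] for band in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical band scores (1-5) given to ten compositions by two raters.
rater_a = [3, 4, 2, 5, 3, 4, 1, 2, 4, 3]
rater_b = [3, 4, 3, 5, 3, 3, 1, 2, 4, 4]

print("Percent agreement:", percent_agreement(rater_a, rater_b))
print("Cohen's kappa:", round(cohen_kappa(rater_a, rater_b), 2))
```

With these invented scores the raters agree on 7 of 10 compositions, and kappa comes out lower than the raw agreement because some of that agreement would be expected by chance alone.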
Types of Estimates of Reliability
1- Test-retest. The same test is given twice to the same group. If the results
of the two administrations are highly correlated, the test has temporal
stability. This method is limited:
A- If the time interval is short, examinees may remember their answers (the
memory factor), and the reliability of the test is overestimated.
B- If the time interval is long, the examinees’ proficiency may have undergone
genuine change, and the reliability of the test is underestimated.
2- Alternate or parallel forms. Different versions of the same test, equivalent
in length, difficulty, time limit, and format, are given to the same group.
3- Split-half. A single form of the test is administered once and the items are
divided into two halves, giving two scores for each individual, which are then
correlated.
4- Rational equivalence. Reliability here is estimated from a single
administration of one form. We are concerned with inter-item consistency as
determined by the proportion of persons who pass and who don’t pass each
item.
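The estimates listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure of any particular test publisher: the score lists and the 0/1 item matrix are invented, the split-half function uses an odd/even split with the Spearman-Brown step-up (a standard correction that these slides do not name), and rational equivalence is computed with Kuder-Richardson formula 20, one common formula for that approach.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half(item_matrix):
    """Correlate odd-item and even-item half scores, then step up to full length."""
    odd = [sum(row[0::2]) for row in item_matrix]
    even = [sum(row[1::2]) for row in item_matrix]
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)  # Spearman-Brown correction


def kr20(item_matrix):
    """Kuder-Richardson formula 20 for items scored 1 (pass) or 0 (fail)."""
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum over items of p * q, the proportions passing and failing each item.
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)


# Hypothetical data: the same five examinees tested on Monday and on Tuesday.
monday = [55, 60, 72, 48, 66]
tuesday = [57, 58, 70, 50, 69]
print("Test-retest r:", round(pearson(monday, tuesday), 3))

# Hypothetical 0/1 item responses: one row per examinee, one column per item.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print("Split-half (stepped up):", round(split_half(responses), 2))
print("KR-20:", round(kr20(responses), 2))
```

The same `pearson` function serves the parallel-forms method as well: pass in the scores from form A and form B instead of the Monday and Tuesday scores.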
Speeded Tests
In a speeded test, the items are easy but the time limit is short. Neither the
split-half nor the rational equivalence method should be used with speeded tests.
Test-retest or parallel forms are the methods best suited to estimating the
reliability of speeded tests.
Satisfactory Reliability
A reliability coefficient (quotient) of 1.00 indicates a perfectly reliable test.
A standardized test used to make individual diagnoses should have a reliability
of at least 0.90.
Homemade (teacher-made) tests usually run somewhat lower, in the 0.70s or 0.80s.
Reliability can be increased by lengthening the test, provided that the
additional material is similar in quality and difficulty to the original.
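The effect of lengthening a test can be estimated with the Spearman-Brown prophecy formula. The slides do not name this formula, so treat the sketch below as a standard classical-test-theory illustration rather than the author's method. It assumes, as stated above, that the added material is similar in quality and difficulty to the original. The starting reliability of 0.70 and the function names are hypothetical.

```python
def lengthened_reliability(reliability, factor):
    """Predicted reliability when the test is made `factor` times as long."""
    return factor * reliability / (1 + (factor - 1) * reliability)


def required_factor(reliability, target):
    """How many times longer the test must be to reach the target reliability."""
    return target * (1 - reliability) / (reliability * (1 - target))


# Hypothetical homemade test with a reliability of 0.70.
current = 0.70
print("Doubled in length:", round(lengthened_reliability(current, 2), 3))
print("Factor needed for 0.90:", round(required_factor(current, 0.90), 2))
```

Under these assumptions, doubling the test moves a reliability of 0.70 into the low 0.80s, while reaching 0.90 would take a test nearly four times as long, which shows why lengthening has diminishing returns.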
The Standard Error of Measurement
An obtained score on any test consists of the “true” score plus a certain
amount of test error.
A student may score 60 on an English entrance test and 55 when retested with
an equivalent form of the test. A five-point decrease is probably not
statistically significant.
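The standard error of measurement is conventionally computed as SEM = SD × sqrt(1 − reliability). The slides give no standard deviation or reliability for the entrance test, so the values below (SD of 10, reliability of 0.90) are assumptions made purely to work through the 60 versus 55 example. The function names are hypothetical as well.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """Typical size of the error component in an obtained score."""
    return sd * sqrt(1 - reliability)


def standard_error_of_difference(sd, reliability):
    """Standard error of the difference between two scores on equivalent forms."""
    return standard_error_of_measurement(sd, reliability) * sqrt(2)


# Assumed values, not taken from the source.
sd, reliability = 10, 0.90
sem = standard_error_of_measurement(sd, reliability)
se_diff = standard_error_of_difference(sd, reliability)

first_score, retest_score = 60, 55
drop = first_score - retest_score

print("SEM:", round(sem, 2))
print("Standard error of a difference:", round(se_diff, 2))
# A difference smaller than about 1.96 standard errors is within the range
# that measurement error alone could produce at the 5% level.
print("Drop exceeds 1.96 standard errors:", drop > 1.96 * se_diff)
```

With these assumed values the SEM is a little over 3 points, and the five-point drop falls well inside the range that measurement error alone could explain.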
In short, reliability refers simply to the precision with which the test measures.
No matter how high the reliability quotient, it is by no means a guarantee that
the test measures what the test user wants to measure.
VALIDITY
What precisely does the test measure?
How well does the test measure?
A test must be based on a sound analysis of the skill or skills we wish to
measure.
There must be sufficient evidence that test scores correlate fairly highly with
actual ability in the skill area being tested. Then we assume that the test is
valid for our purposes.
Types of Validity:

1- Content Validity
2- Empirical Validity
3- Face Validity
Content Validity:
If a test is designed to measure mastery of a specific skill or the content of a
particular course of study, we should expect the test to be based upon a
careful analysis of the skill or an outline of the course.
In choosing a test, we cannot simply accept the title which the authors have
given it, for titles very often are misleading.
We should expect the test makers to be able to provide us with information
about the specific materials or skills being tested, and the basis for their
selection.

Empirical Validity:
The best way to check on the actual effectiveness of a test is to determine how
test scores are related to some independent, outside criterion, such as marks
given at the end of a course or teachers’ ratings.
If there is a high correlation between test scores and a trustworthy external
criterion, we are justified in putting our confidence in the empirical validity of
the test.
Two kinds of empirical validity:
A- Predictive
B- Concurrent
If we use a test of English as a second language to screen university
applicants and then correlate test scores with grades made at the end of
the first semester, we are attempting to determine the predictive validity
of the test.
If we follow up the test immediately by having an English teacher rate
each student’s English proficiency on the basis of class performance
during the first week, and correlate the two measures, we are seeking to
establish the concurrent validity of the test.
Empirical validity depends on the reliability of both the test and the criterion
measure.
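A validity coefficient of this kind is simply the correlation between test scores and the criterion. The sketch below uses invented entrance-test scores and first-semester grades to illustrate the predictive case. It also shows a standard classical-test-theory result that the slides do not state explicitly: the observed validity coefficient cannot exceed the square root of the product of the two reliabilities. The reliabilities of 0.81 and 0.64 are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def validity_ceiling(test_reliability, criterion_reliability):
    """Upper bound on the validity coefficient set by the two reliabilities."""
    return sqrt(test_reliability * criterion_reliability)


# Hypothetical data: entrance-test scores and first-semester grade averages.
entrance_scores = [52, 61, 70, 45, 83, 66, 58, 75]
semester_grades = [2.4, 2.9, 3.1, 2.0, 3.7, 2.8, 3.0, 3.3]

print("Predictive validity coefficient:",
      round(pearson(entrance_scores, semester_grades), 2))
# Assumed reliabilities for the test (0.81) and for the grades (0.64).
print("Ceiling on validity:", round(validity_ceiling(0.81, 0.64), 2))
```

For concurrent validity the same calculation applies, with the teacher's first-week ratings in place of the semester grades.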
Face Validity:
By face validity we simply mean the way the test looks to the examinees.
Its importance should not be underestimated.
Content must be relevant and appropriate.
Test makers must always keep face validity in mind.
PRACTICALITY
Practicality refers to:
1- Economy.
2- Ease of administration and scoring.
3- Ease of interpretation.
Economy
A- The cost per copy, and whether the test books are reusable.
B- The number of scorers and administrators needed.
C- The time allowed for administration and scoring.
Ease of Administration and Scoring
A- Whether full test directions are provided.
B- Test requirements (mechanical devices, rooms available, number of
examinees).
C- Whether the test is scored subjectively or objectively.
Ease of Interpretation
If a standard test is being adopted, examine the date of publication, check
whether there is an up-to-date test manual with information about both
reliability and validity, and check whether the test items are appropriate.
