C2-LE THI THANH TU - Classroom-Based Assessment-Assignment

This document summarizes key principles of language assessment, including practicality, reliability, validity, authenticity, and washback. Practicality means a test is feasible to administer within budget and time constraints. Reliability refers to a test producing consistent results. Validity means a test accurately measures what it intends to measure. Authenticity means a test resembles real-world language tasks. Washback refers to how a test impacts teaching and learning; beneficial washback positively influences both.


Session 3 - SUMMARY - CHAPTER 2 (P. 27-56)

Chapter 2: Principles of Language Assessment

I. PRACTICALITY

A PRACTICAL TEST...
* stays within budgetary limits
* can be completed by the test-taker within appropriate time constraints
* has clear directions for administration
* appropriately utilizes available human resources
* does not exceed available material resources
* considers the time and effort involved in both designing and scoring.

Time is always a crucial practical factor for busy teachers in classroom-based testing.
II. RELIABILITY

A reliable test is consistent and dependable.

A RELIABLE TEST...
* has consistent conditions across two or more administrations
* gives clear directions for scoring/evaluation
* has uniform rubrics for scoring/evaluation
* lends itself to consistent application of rubrics by the scorer
* contains items/tasks that are unambiguous to the test-taker
The issue of the reliability of tests can be better understood by considering a number of factors that can contribute to their unreliability.
1. Student-Related Reliability
The most common issues are caused by physical or psychological factors (e.g., temporary illness, fatigue, a "bad day," anxiety) or by a test-taker's test-wiseness, or strategies for efficient test-taking.
2. Rater Reliability
Inter-rater reliability is achieved when two or more scorers yield consistent scores on the same test.
Intra-rater reliability is an internal factor and a common concern for classroom teachers: reliability can be violated by unclear scoring criteria, fatigue, bias toward particular "good" and "bad" students, or simple carelessness.
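The chapter treats rater reliability conceptually, but a quick way to see the idea is to correlate two raters' scores on the same set of papers; a coefficient near 1.0 suggests the rubric is being applied consistently. Below is a minimal Python sketch; the raters and scores are entirely hypothetical:

# Minimal sketch: indexing inter-rater reliability with a Pearson
# correlation between two raters' scores on the same essays.
# All scores here are hypothetical illustration data.
from statistics import correlation  # available in Python 3.10+

rater_a = [18, 15, 12, 20, 16, 9, 14]   # rater A's scores
rater_b = [17, 14, 13, 19, 15, 10, 13]  # rater B, same essays

r = correlation(rater_a, rater_b)
print(f"Inter-rater correlation: r = {r:.2f}")  # near 1.0 = consistent

A low coefficient would signal exactly the problems listed above: unclear criteria, fatigue, or bias creeping into one rater's judgments.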
3. Test Administration Reliability
Unreliability may also result from the conditions in which the test is administered: unexpected noise from outside the room, photocopying variations, the amount of light in different parts of the room, variations in temperature, and even the condition of desks and chairs.
4. Test Reliability
Test unreliability can be caused by many factors, including rater bias (a risk in subjective tests with open-ended responses, in contrast to objective tests with predetermined fixed responses).
Further unreliability may be caused by poorly written test items or by tests with too many items for the time limit.
=> Test characteristics can interact with student-related unreliability, muddying the lines of distinction between test reliability and test administration reliability.
III. VALIDITY

● the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment

A VALID TEST...
* measures exactly what it proposes to measure
* does not measure irrelevant or "contaminating" variables
* relies as much as possible on empirical evidence (performance)
* involves performance that samples the test's criterion (objective)
* offers useful, meaningful information about a test-taker's ability
* is supported by a theoretical rationale or argument
1. Content-Related Evidence
● A test actually samples the subject matter about which conclusions are to be drawn and requires the test-taker to perform the behavior being measured.
● Note the difference between direct and indirect testing:
○ Direct testing involves the test-taker in actually performing the target task.
○ In an indirect test, learners do not perform the task itself but rather a task that is related in some way.
2. Criterion-Related Evidence
● Tests measure specified classroom objectives, and implied predetermined levels of performance are expected to be reached.
● Criterion-related validity is best demonstrated through a comparison of the results of an assessment with the results of some other measure of the same criterion.
● Criterion-related evidence usually falls into one of two categories: (1) concurrent and (2) predictive validity.
○ Concurrent validity: test results are supported by other concurrent performance beyond the assessment itself.
○ Predictive validity: central to placement tests, admissions assessment batteries, and achievement tests designed to determine students' readiness to "move on" to another unit.
3. Construct-Related Evidence
● A construct is any theory, hypothesis, or model that attempts to explain observed phenomena in our universe of perceptions.
● Constructs may or may not be directly or empirically measured; their verification often requires inferential data.
● Proficiency and communicative competence are examples of linguistic constructs; self-esteem and motivation are psychological constructs.
● Construct validity is a major issue in validating large-scale standardized tests of proficiency: such tests must adhere to the principle of practicality, must sample a limited number of domains of language, and may not be able to cover all the content of a particular field or skill.
4. Consequential Validity (Impact)
● Consequential validity encompasses all the consequences of a test: its accuracy in measuring intended criteria, its effect on the preparation of test-takers, and the (intended and unintended) social consequences of a test's interpretation and use.
● The related term impact perhaps more broadly encompasses the many consequences of assessment, both before and after a test administration.
5. Face Validity
● Face validity refers to the degree to which a test looks right and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers.
● Test appearance does indeed have an effect that neither test-takers nor test designers can ignore.
● Teachers can increase students' perception of a fair test by using:
○ formats that are expected and well constructed, with familiar tasks
○ tasks that can be accomplished within an allotted time limit
○ items that are clear and uncomplicated
○ directions that are crystal clear
○ tasks that have been rehearsed in their previous course work
○ tasks that relate to their course work (content validity)
○ a level of difficulty that presents a reasonable challenge
The psychological state of the learner (confidence, anxiety, etc.) is an important ingredient in peak performance.
IV. AUTHENTICITY

● the degree of correspondence of the characteristics of a given language test task to the features of a target language task

AN AUTHENTIC TEST...
● contains language that is as natural as possible
● has items that are contextualized rather than isolated
● includes meaningful, relevant, and interesting topics
● provides some thematic organization to items, such as through a story line or episode
● offers tasks that replicate real-world tasks

V. WASHBACK

● the effect of testing on teaching and learning
● refers to both the promotion and the inhibition of learning, thus emphasizing what may be referred to as beneficial versus harmful (or negative) washback

A TEST THAT PROVIDES BENEFICIAL WASHBACK...
● positively influences what and how teachers teach
● positively influences what and how learners learn
● offers learners a chance to adequately prepare
● gives learners feedback that enhances their language development
● is more formative in nature than summative
● provides conditions for peak performance by the learner

● Washback includes the effects that tests have on instruction in terms of how students prepare for the test.
○ Washback can have a number of positive manifestations, from the benefits of preparing and reviewing for a test to the learning that accrues from feedback on one's performance. => Teachers can provide information that "washes back" to students in the form of useful diagnoses of strengths and weaknesses.
● Washback also covers the effects of an assessment on preparation for the assessment. (Informal performance assessment is by nature more likely to have built-in washback effects because the teacher usually provides interactive feedback.)
● To enhance washback, comment generously and specifically on test performance.
● The difference between formative and summative tests clarifies washback: formative tests provide washback in the form of information to the learner on progress toward goals.
● Washback also implies that students have ready access to you to discuss the feedback and evaluation you have given; students need to have a chance to "feed back" on teachers' feedback.
VI. APPLYING PRINCIPLES TO CLASSROOM TESTING

1. Are the Test Procedures Practical?

PRACTICALITY CHECKLIST
1. Are administrative details all carefully attended to before the test?
2. Can students complete the test reasonably within the set time frame?
3. Can the test be administered smoothly, without procedural "glitches"?
4. Are all printed materials accounted for?
5. Has equipment been pre-tested?
6. Is the cost of the test within budgeted limits?
7. Is the scoring/evaluation system feasible in the teacher's time frame?
8. Are methods for reporting results determined in advance?
2. Is the Test Itself Reliable?
- Test and test administration reliability can be achieved by making sure that all students receive the same quality of input, whether written or auditory.

TEST RELIABILITY CHECKLIST
1. Does every student have a cleanly photocopied test sheet?
2. Is sound amplification clearly audible to everyone in the room?
3. Is video input clearly and uniformly visible to all?
4. Are lighting, temperature, extraneous noise, and other classroom conditions equal (and optimal) for all students?
5. For closed-ended responses, do scoring procedures leave little debate about correctness of an answer?
3. Can You Ensure Rater Reliability?
- Intra-rater reliability for open-ended responses may be enhanced by answering these questions:

INTRA-RATER RELIABILITY CHECKLIST
1. Have you established consistent criteria for correct responses?
2. Can you give uniform attention to those criteria throughout the evaluation time?
3. Can you guarantee that scoring is based only on the established criteria and not on extraneous or subjective variables?
4. Have you read through tests at least twice to check for consistency?
5. If you have made "midstream" modifications of what you consider a correct response, did you go back and apply the same standards to all?
6. Can you avoid fatigue by reading the tests in several sittings, especially if the time requirement is a matter of several hours?
4. Does the Procedure Demonstrate Content Validity?
- Content validity: the extent to which the assessment requires students to perform tasks included in the previous classroom lessons and that directly represent the objectives of the unit on which the assessment is based.

CONTENT VALIDITY CHECKLIST (FOR A TEST ON A UNIT)
1. Are unit objectives clearly identified?
2. Are unit objectives represented in the form of test specifications? (See below for details on test specifications.)
3. Do the test specifications include tasks that have already been performed as part of the course procedures?
4. Do the test specifications include tasks that represent all (or most) of the objectives for the unit?
5. Do those tasks involve actual performance of the target task(s)?

- Test specifications (specs): a test should have a structure that follows logically from the lesson or unit being tested. Many tests have a design that:
+ divides them into a number of sections (corresponding, perhaps, to the objectives assessed)
+ offers students a variety of item types
+ gives an appropriate relative weight to each section (see the sketch below)
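To make "relative weight" concrete, here is a minimal Python sketch of how weighted sections combine into a total score. The section names, point values, and weights are hypothetical, not from the chapter:

# Minimal sketch: combining weighted test sections into a total score.
# All section names, maxima, and weights are hypothetical examples.
sections = {
    # section: (points earned, points possible, weight in % of total)
    "listening cloze": (8, 10, 20),
    "grammar multiple choice": (12, 15, 30),
    "reading comprehension": (7, 10, 20),
    "short essay": (18, 25, 30),
}

total = sum(earned / possible * weight
            for earned, possible, weight in sections.values())
print(f"Weighted total: {total:.1f}/100")

In a real test, the weights would mirror how heavily each unit objective was emphasized in class.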
5. Has the Impact of the Test Been Carefully Accounted For?

CONSEQUENTIAL VALIDITY CHECKLIST
1. Have you offered students appropriate review and preparation for the test?
2. Have you suggested test-taking strategies that will be beneficial?
3. Is the test structured so that, if possible, the best students will be modestly challenged and the weaker students will not be overwhelmed?
4. Does the test lend itself to your giving beneficial washback?
5. Are the students encouraged to see the test as a learning experience?
6. Are the Test Tasks as Authentic as Possible?
- Evaluate the extent to which a test is authentic by asking the following questions:

AUTHENTICITY CHECKLIST
1. Is the language in the test as natural as possible?
2. Are items as contextualized as possible rather than isolated?
3. Are topics and situations interesting, enjoyable, and/or humorous?
4. Is some thematic organization provided, such as through a story line or episode?
5. Do tasks represent, or closely approximate, real-world tasks?

- Decontextualized tasks
- Contextualized tasks
7. Does the Test Offer Beneficial Washback to the Learner?

WASHBACK CHECKLIST
1. Is the test designed in such a way that you can give feedback that will be relevant to the objectives of the unit being tested?
2. Have you given students sufficient pretest opportunities to review the subject matter of the test?
3. In your written feedback to each student, do you include comments that will contribute to students' formative development?
4. After returning tests, do you spend class time "going over" the test and offering advice on what students should focus on in the future?
5. After returning tests, do you encourage questions from students?
6. If time and circumstances permit, do you offer students (especially the weaker ones) a chance to discuss results in an office hour?

- When classroom time after the test is spent reviewing its content, students discover their areas of strength and weakness.
VII. MAXIMIZING BOTH PRACTICALITY AND WASHBACK

- building as much authenticity as possible into multiple-choice task types and items
- designing classroom tests that have both objective-scoring sections and open-ended response sections
- varying the performance tasks
- turning multiple-choice test results into diagnostic feedback on areas of needed improvement
- maximizing the preparation period before a test to elicit performance relevant to the ultimate criteria of the test
- teaching test-taking strategies
- helping students achieve learning beyond the test (don't "teach to the test")
- triangulating information on a student before making a final assessment of competence
