paper language assessment group 9
Lecturer:
Dr. Ridho Kurniawan, M. Pd
Arranged by Group 9:
1. Dhea Aura Cindyana (211014288203006)
2. Dinda Oktaviandini (211014288203007)
3. Elsa Febria (211014288203008)
4. Namira Futri Lia Haliza (201014288203011)
PREFACE
First of all, we thank Allah SWT, with whose help we finished writing this paper,
entitled “Principles of Language Assessment”. The purpose of this paper is to fulfill
the assignment given by Dr. Ridho Kurniawan, M. Pd, as lecturer in Language Assessment.
In arranging this paper, we faced many challenges and obstacles, but with the help of
many individuals, those obstacles were overcome.
We realize that this paper still contains mistakes and is imperfect in both arrangement
and content. We therefore hope that criticism from readers will help us improve the next
paper. Last but not least, we hope this paper helps readers gain more knowledge about
the subject of Language Assessment.
Authors
TABLE OF CONTENTS
PREFACE
TABLE OF CONTENTS
CHAPTER I INTRODUCTION
A. Background
B. Problem Formulation
C. Purpose of Writing
CHAPTER II DISCUSSION
A. Washback
B. Applying Principles to Classroom Testing
1. Are the Test Procedures Practical?
2. Is the Test Itself Reliable?
3. Can You Ensure Rater Reliability?
4. Does the Procedure Demonstrate Content Validity?
5. Has the Impact of the Test Been Carefully Accounted for?
6. Are the Test Tasks as Authentic as Possible?
7. Does the Test Offer Beneficial Washback to the Learner?
C. Maximizing Both Practicality and Washback
CHAPTER III CLOSING
A. Conclusion
B. Suggestion
REFERENCES
CHAPTER I
INTRODUCTION
A. Background
The development and understanding of human language is very important in everyday
life because language is the main tool for humans to communicate and transfer
information. In the context of education, language assessment is an important component
to measure the extent to which a person can communicate well in a language.
The importance of language assessment can be seen in various aspects of life, such as
formal education, work, and social life. However, for accurate and useful language
assessment, strong and relevant principles are needed.
To measure and assess one's language ability, the process of language assessment
becomes an inevitable necessity. However, in developing language assessment
instruments, it is important to understand the underlying principles to ensure fairness,
validity and reliability in measuring language skills.
B. Problem Formulation
The problem formulation of this paper is as follows.
1. What is washback?
2. How can the principles be applied to classroom testing?
3. How can practicality and washback be maximized?
C. Purpose of Writing
The objectives of writing this paper are as follows.
1. To know what washback is.
2. To know how to apply the principles to classroom testing.
3. To know how to maximize practicality and washback.
CHAPTER II
DISCUSSION
A. Washback
“Washback” (alternatively “backwash”) is a term used in education to describe the
influence, whether beneficial or damaging, of an assessment on the teaching and
learning that precedes and prepares for that assessment (Green, 2020).
A Test That Provides Beneficial Washback:
Teachers face the challenge of designing classroom assessments that not only
evaluate but also promote learning. Mistakes made by students can offer valuable
insights for further learning, while correct answers should be acknowledged,
especially when they demonstrate progress in language proficiency. Teachers can play
a coaching role by offering strategies for improvement. One effective way to enhance
the impact of assessments is to provide detailed and constructive feedback on
students' performance. Many teachers simply assign a letter grade or numerical score
without additional commentary, which fails to provide meaningful information to
students. Grades and scores, without context or feedback, offer little value and can
foster a competitive rather than cooperative learning environment.
With this in mind, when returning a written test or an oral production test data
sheet, it's beneficial to provide feedback beyond just a number, grade, or brief phrase.
Even if you don't have time to write a detailed paragraph, you can address specific
details throughout the test. Acknowledge strengths and offer constructive criticism for
weaknesses. Provide hints on how students can improve certain aspects of their
performance. Essentially, aim to make the test experience intrinsically motivating for
students, fostering a sense of accomplishment and challenge.
B. Applying Principles to Classroom Testing
The tips and checklists provided in this chapter refer to five principles (practicality,
reliability, validity, authenticity, and washback) and help in evaluating existing tests
for classroom use. It is important to note that the sequence of these questions does not
imply a priority order. While validity is often considered the most significant
principle, practicality might take precedence in certain classroom testing scenarios.
Alternatively, authenticity could be prioritized for specific tests. Ultimately,
however, without validity the other considerations may lose their significance.
1. Are the Test Procedures Practical?
d) Are all printed materials accounted for?
e) Has equipment been pre-tested?
f) Is the cost of the test within budgeted limits?
g) Is the scoring/evaluation system feasible in the teacher’s time frame?
h) Are methods for reporting results determined in advance?
As this checklist suggests, after accounting for the administrative details of
giving a test, you should consider the feasibility of your plans for scoring it.
Time is often the most crucial factor in teachers’ busy lives, taking precedence
over all other considerations when evaluating a test. Teachers frequently need to
adapt tests to fit their own schedules, so it is important to do so without
compromising the test’s validity and washback. For example, teachers should resist
the temptation to offer only quickly scored multiple-choice items that may be
poorly designed or irrelevant. Everyone knows that teachers privately dislike
grading tests almost as much as students dislike taking them, and that they will do
almost anything to finish the task as quickly and painlessly as possible. However,
effective instruction nearly always requires teachers to devote time to giving
students test-related feedback, including comments and recommendations.
3. Can You Ensure Rater Reliability?
Instructors are constantly concerned about intra-rater reliability: what happens
to our limited stamina and concentration over the course of scoring a test?
Teachers must find ways to sustain their concentration and energy throughout the
time it takes to score assessments. This is a crucial matter for tests that call
for open-ended responses. After hours of scoring, it is easy to let mentally
established standards slip. For open-ended responses, the following questions may
help improve intra-rater reliability:
Intra-rater reliability checklist
a) Have you established consistent criteria for correct responses?
b) Can you give uniform attention to those criteria throughout the evaluation time?
c) Can you guarantee that scoring is based only on the established criteria and not
on extraneous or subjective variables?
d) Have you read through tests at least twice to check for consistency?
e) If you have made “midstream” modifications of what you consider a correct
response, did you go back and apply the same standards to all?
f) Can you avoid fatigue by reading the tests in several sittings, especially if the
time requirement is a matter of several hours?
4. Does the Procedure Demonstrate Content Validity?
The main factor in determining whether a classroom test is valid is content
validity: the extent to which the test requires students to perform tasks that were
covered in previous lessons and that accurately reflect the unit's objectives. For
example, if you have been teaching an English class in which students practiced
reading, summarizing, and responding to short texts, a content-valid assessment
must involve performance of those skills. Content validity and criterion validity
are closely related in classroom assessment, since lesson or unit objectives serve
as the assessment's criteria. You can take several steps to evaluate the content
validity of a classroom test:
Content validity checklist:
One of the main challenges in establishing content validity is recognizing that the
objectives of the relevant lesson, module, or unit form the foundation of any
well-designed classroom assessment. Therefore, identifying the objectives is the
first step in evaluating a classroom test effectively. This is sometimes easier
said than done. All too often, teachers work through lessons day after day with
little or no awareness of the objectives they are trying to achieve. Or those
objectives may be so vaguely stated that it is hard to tell whether they were met.
A second issue in content validity is test specifications (specs). Don’t let this
word scare you. It simply means that a test should have a structure that follows
logically from the lesson or unit you are testing. Many tests have a design that:
The content validity of an existing classroom test should be apparent in the way
the objectives of the unit being tested are represented in the content of items,
clusters of items, and item types. Do you clearly perceive the performance of
test-takers as reflective of the classroom objectives? If so (and you can argue
this), content validity has most likely been achieved.
5. Has the Impact of the Test Been Carefully Accounted for?
This question integrates the concept of consequential validity (impact) and the
importance of structuring an assessment procedure to elicit optimal performance by
the student. Remember that even though it is an elusive concept, the appearance of
a test from a student’s point of view is important to consider. The following factors
might help you to pinpoint some of the issues surrounding the impact of a test:
Consequential validity checklist:
a) Have you offered students appropriate review and preparation for the test?
b) Have you suggested test-taking strategies that will be beneficial?
c) Is the test structured so that, if possible, the best students will be modestly
challenged and the weaker students will not be overwhelmed?
d) Does the test lend itself to your giving beneficial washback?
e) Are the students encouraged to see the test as a learning experience?
6. Are the Test Tasks as Authentic as Possible?
Evaluate the extent to which a test is authentic by asking the following
questions:
Authenticity checklist
Let’s consider two excerpts from tests, and the concept of authenticity may
become clearer. The sequence of items in the following decontextualized tasks
takes the test-taker into five different topic areas with no context for any item and
with the grammatical category as the only unifying element. Each sentence could
well be written or spoken in the real world, but probably in five different
contexts. Given the constraints of a multiple-choice format, this first excerpt
rates only fair on a scale of authenticity.
b) Going
c) going to
The second excerpt that follows is more effective and is rated as good. The
sequence of items in these contextualized tasks achieves a modicum of
authenticity by sequentially linking all the items in a story line. The conversation
is one that might occur in the real world, although with a little less formality.
Multiple-choice tasks—contextualized
Directions: After answering the questions, click the “Submit” button.
“Going To”
7. Does the Test Offer Beneficial Washback to the Learner?
The design of an effective test should point the way to beneficial washback. A
test that achieves content validity demonstrates relevance to the curriculum in
question and thereby sets the stage for washback. When test items represent the
various objectives of a unit, and/or when sections of a test clearly focus on
major topics of the unit, classroom tests can serve in a diagnostic capacity even
if they are not specifically labeled as such.
a) Is the test designed in such a way that you can give feedback that will be
relevant to the objectives of the unit being tested?
b) Have you given students sufficient pre-test opportunities to review the subject
matter of the test?
c) In your written feedback to each student, do you include comments that will
contribute to students’ formative development?
d) After returning tests, do you spend class time “going over” the test and
offering advice on what students should focus on in the future?
e) After returning tests, do you encourage questions from students?
f) If time and circumstances permit, do you offer students (especially the weaker
ones) a chance to discuss results in an office hour?
C. Maximizing Both Practicality and Washback
As teachers, we are obliged to assess our students' learning, and we should use
sound and appropriate assessment techniques. Teachers need to master these techniques
so that they can adapt them to the situation. For example, when we are busy and
pressed by deadlines, we will naturally tend to choose a quick and easy assessment,
that is, a practical one. In such cases practicality tends to become the overriding
principle. However, prioritizing practicality can sacrifice authenticity and
washback. Maximizing both practicality and washback is therefore a dilemma for
teachers.
Three types of assessment can be distinguished by their levels of practicality:
multiple-choice tests (high in practicality and reliability, but low in washback),
essay tests (moderate in practicality and reliability, with moderate washback), and
portfolios, journals, and conferences (low in practicality and reliability, but high
in washback).
It could be argued that large-scale multiple-choice tests cannot offer much washback
or authenticity, and that portfolios and similar alternatives cannot achieve much
practicality or reliability. This need not be the case. The challenge facing
conscientious teachers and assessors in our profession is to change the levels that
can be achieved, with varying degrees of effort. Examples include efforts to push
traditional test formats toward more washback and authenticity, and to increase the
practicality and reliability of portfolios, journals, and conferences.
Beyond that, portfolios, journals, and conferences have several advantages as
alternative assessments. They are:
Open-ended in their time orientation and format.
Contextualized to a curriculum.
Referenced to the criteria (objectives) of that curriculum.
Likely to build intrinsic motivation.
All of the approaches above can be used to balance practicality and washback in
assessment. In large-scale standardized tests, for example, practicality is usually
more important than washback; the opposite is probably true for most classroom
tests.
CHAPTER III
CLOSING
A. Conclusion
Washback, in education, refers to the impact of an assessment on teaching and learning.
A test with beneficial washback positively influences what and how teachers teach and
what and how learners learn. In classroom-based assessment, washback can take positive
forms such as the advantages of studying and preparing for a test and the knowledge
gained from receiving performance feedback. Teachers can provide students with helpful
assessments of their strengths and weaknesses. Teachers face the challenge of designing
assessments that not only evaluate but also promote learning. They can provide detailed
and constructive feedback on students' performance, which enhances the impact of
assessments. Practicality is crucial in classroom testing. Teachers should consider the
viability of their plans for scoring the test, administering it, and providing test-related
feedback.
Reliability applies to the student, the test administration, the test itself, and the teacher.
Test and test administration reliability can be achieved by making sure that all students
receive the same quality of input. Content validity is determined by how much of a test
asks students to complete tasks that are covered in earlier classes and that accurately
reflect the unit's goals. The impact of a test should be carefully accounted for by offering
students appropriate review and preparation for the test, suggesting test-taking strategies,
and structuring the test so that it challenges the best students and does not overwhelm the
weaker ones. Authenticity can be evaluated by asking if the language in the test is natural,
if items are contextualized, if topics are interesting, and if tasks represent real-world tasks.
B. Suggestion
We, the writers, apologize for the shortcomings of this paper. We know that it is still
far from perfect, so we welcome suggestions from readers to help improve it. Thank you
very much to our readers.
REFERENCES
Brown, H. D., & Abeywickrama, P. (2018). Language Assessment: Principles and Classroom
Practices (Third Edition). Pearson.