paper language assessment group 9

The document is a paper titled 'Principles of Language Assessment' prepared by a group of students under the guidance of Dr. Ridho Kurniawan. It discusses the significance of language assessment in education, outlines key principles such as washback, reliability, and validity, and provides guidelines for effective classroom testing. The paper aims to enhance understanding of language assessment and improve future assessments through constructive feedback and adherence to established principles.

LANGUAGE ASSESSMENT

“Principles of Language Assessment”

Lecturer:
Dr. Ridho Kurniawan, M. Pd

Arranged by Group 9:
1. Dhea Aura Cindyana (211014288203006)
2. Dinda Oktaviandini (211014288203007)
3. Elsa Febria (211014288203008)
4. Namira Futri Lia Haliza (201014288203011)

ENGLISH EDUCATION DEPARTMENT


FACULTY OF TEACHER TRAINING AND EDUCATION
MUHAMMADIYAH UNIVERSITY OF MUARA BUNGO
2024
PREFACE

First of all, we thank Allah SWT, with whose help we were able to finish this paper,
entitled “Principles of Language Assessment”. The purpose of this paper is to fulfill the
assignment given by Dr. Ridho Kurniawan, M. Pd, lecturer in Language Assessment. In
preparing this paper we faced many challenges and obstacles, but with the help of many
individuals we were able to overcome them.

We realize that this paper is still imperfect in both its arrangement and its content. We
hope that criticism from readers will help us improve our next paper. Last but not least,
we hope this paper helps readers gain more knowledge about the subject of Language
Assessment.

Muara Bungo, 14 February 2024

Author

TABLE OF CONTENTS

PREFACE
TABLE OF CONTENTS
CHAPTER I INTRODUCTION
A. Background
B. Problem Formulation
C. Purpose of Writing
CHAPTER II DISCUSSION
A. Washback
B. Applying Principles to Classroom Testing
1. Are the Test Procedures Practical?
2. Is the Test Itself Reliable?
3. Can You Ensure Rater Reliability?
4. Does the Procedure Demonstrate Content Validity?
5. Has the Impact of the Test Been Carefully Accounted for?
6. Are the Test Tasks as Authentic as Possible?
7. Does the Test Offer Beneficial Washback to the Learner?
C. Maximizing Both Practicality and Washback
CHAPTER III CLOSING
A. Conclusion
B. Suggestion
REFERENCES

CHAPTER I
INTRODUCTION

A. Background
The development and understanding of human language is very important in everyday
life because language is the main tool for humans to communicate and transfer
information. In the context of education, language assessment is an important component
to measure the extent to which a person can communicate well in a language.
The importance of language assessment can be seen in various aspects of life, such as
formal education, work, and social life. However, for accurate and useful language
assessment, strong and relevant principles are needed.
To measure and assess one's language ability, the process of language assessment
becomes an inevitable necessity. However, in developing language assessment
instruments, it is important to understand the underlying principles to ensure fairness,
validity and reliability in measuring language skills.

B. Problem Formulation
The problem formulation of this paper is as follows.
1. What is washback?
2. How can the principles be applied to classroom testing?
3. How can practicality and washback be maximized?

C. Purpose of Writing
The objectives of writing this paper are as follows.
1. To know what washback is.
2. To know how to apply the principles to classroom testing.
3. To know how to maximize practicality and washback.

CHAPTER II
DISCUSSION

A. Washback
“Washback” (alternatively “backwash”) is a term used in education to describe the
influence, whether beneficial or damaging, of an assessment on the teaching and
learning that precedes and prepares for that assessment (Green, 2020).
A test that provides beneficial washback:

• Positively influences what and how teachers teach.
• Positively influences what and how learners learn.
• Offers learners a chance to adequately prepare.
• Gives learners feedback that enhances their language development.
• Is more formative in nature than summative.
• Provides conditions for peak performance by the learner.
In large-scale assessment, the word "washback" frequently describes how tests
affect teaching in terms of how students prepare for the test. "Teaching to the
test" and "cram" courses are examples of washback that can have both beneficial and
detrimental consequences. Because standardized tests are currently used all over the
world as gatekeepers, students may become more concerned with getting a passing
grade than with developing their language skills. On the positive side, many students
who take test-preparation classes report becoming more proficient in specific
language-related tasks (Chapelle, Enright, & Jamieson, 2008).

Washback in classroom-based assessment can take many positive forms, including the
benefits of studying and preparing for a test and the knowledge gained from
receiving feedback on performance. Teachers can give students information that
"washes back" to them in the form of useful diagnoses of their strengths and
weaknesses. Washback also refers to the effects of an assessment on teaching and
learning prior to the assessment, that is, on preparation for it. Because teachers
typically provide interactive feedback, informal performance assessments are more
likely to have built-in washback effects. Formal assessments can also yield positive
washback, but they provide little or none if students receive only a single numerical
score or a simple letter grade (Brown & Abeywickrama, 2018).

Teachers face the challenge of designing classroom assessments that not only
evaluate but also promote learning. Mistakes made by students can offer valuable
insights for further learning, while correct answers should be acknowledged,
especially when they demonstrate progress in language proficiency. Teachers can play
a coaching role by offering strategies for improvement. One effective way to enhance
the impact of assessments is to provide detailed and constructive feedback on
students' performance. Many teachers simply assign a letter grade or numerical score
without additional commentary, which fails to provide meaningful information to
students. Grades and scores, without context or feedback, offer little value and can
foster a competitive rather than cooperative learning environment.

With this in mind, when returning a written test or an oral production test data
sheet, it's beneficial to provide feedback beyond just a number, grade, or brief phrase.
Even if you don't have time to write a detailed paragraph, you can address specific
details throughout the test. Acknowledge strengths and offer constructive criticism for
weaknesses. Provide hints on how students can improve certain aspects of their
performance. Essentially, aim to make the test experience intrinsically motivating for
students, fostering a sense of accomplishment and challenge.

Additionally, providing some washback by specifying numerical scores for
different subsections of the test can also be helpful. For instance, if a subsection on
verb tenses receives a low score, it serves as diagnostic feedback, indicating an area
where the student may need to focus and improve.
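
The idea of subsection scores as diagnostic feedback can be illustrated with a short
Python sketch. The section names, the scores, and the 60% cut-off are hypothetical
values chosen only for illustration; they do not come from the sources cited in this
paper, and a teacher would set them to suit the test at hand.

```python
def diagnose(section_scores, threshold=0.6):
    """Return the sections scored below the threshold, weakest first.

    section_scores maps a section name to (points earned, points possible).
    The 0.6 default threshold is an illustrative assumption.
    """
    weak = []
    for section, (earned, possible) in section_scores.items():
        ratio = earned / possible
        if ratio < threshold:
            weak.append((section, ratio))
    # Sort by the fraction correct so the weakest section comes first.
    return [name for name, _ in sorted(weak, key=lambda pair: pair[1])]


# Hypothetical results for one student on a three-section test.
scores = {
    "verb tenses": (4, 10),
    "vocabulary": (9, 10),
    "reading comprehension": (7, 10),
}

focus_areas = diagnose(scores)
print("Suggested focus areas:", focus_areas)
```

A list like this could accompany the returned test, so that the student sees not only
a total score but also where to direct further study.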

B. Applying Principles to Classroom Testing


The five principles of practicality, reliability, validity, authenticity, and washback
offer valuable guidance for both assessing an existing assessment procedure and
creating one from scratch. Whether it's quizzes, tests, final exams, or standardized
proficiency tests, all can be analyzed through these lenses.

The tips and checklists provided in this chapter refer to these five principles, aiding
in the evaluation of existing tests for classroom use. It's crucial to note that the
sequence of these questions doesn't imply a priority order. While validity is often
considered the most significant principle, practicality might take precedence in certain
classroom testing scenarios. Alternatively, authenticity could be prioritized for
specific tests. However, ultimately, without validity, the other considerations may lose
their significance.

1. Are the Test Procedures Practical?


What happens before and after the test, as well as the teacher’s and students’
time limits, expenses, and administrative details, all play a part in practicality. You
might want to follow the checklist below to see if a test is appropriate for your
needs.
Practicality checklist:

a) Are administrative details all carefully attended to before the test?
b) Can students complete the test reasonably within the set time frame?
c) Can the test be administered smoothly, without procedural “glitches”?
d) Are all printed materials accounted for?
e) Has equipment been pre-tested?
f) Is the cost of the test within budgeted limits?
g) Is the scoring/evaluation system feasible in the teacher’s time frame?
h) Are methods for reporting results determined in advance?

As this checklist suggests, once you have attended to the administrative details
of giving the test, you should consider how feasible your plans for scoring it
are. In teachers’ busy lives, time often becomes the most important factor,
overriding all other considerations when evaluating a test. Teachers frequently
need to adapt tests to fit their own schedules, so it is important to do so
without compromising the test’s validity and washback. For example, teachers
should resist the temptation to offer only quickly scored multiple-choice items
that may be poorly designed or irrelevant. It is no secret that teachers dislike
grading tests almost as much as students dislike taking them, and that they will
do almost anything to get through the task as quickly and painlessly as possible.
However, effective teaching almost always requires the teacher to invest time in
giving students feedback on the test, including comments and recommendations.

2. Is the Test Itself Reliable?


Reliability applies to the student, the test administration, the test itself, and the
teacher. At least four sources of unreliability must be guarded against (Brown &
Abeywickrama, 2018). Test and test administration reliability can be achieved
by making sure that all students receive the same quality of input, whether written
or auditory. The following checklist should help you to determine whether a test is
itself reliable:
Test reliability checklist:

a) Does every student have a cleanly photocopied test sheet?
b) Is sound amplification clearly audible to everyone in the room?
c) Is video input clearly and uniformly visible to all?
d) Are lighting, temperature, extraneous noise, and other classroom conditions
equal (and optimal) for all students?
e) For closed-ended responses, do scoring procedures leave little debate about
correctness of an answer?

3. Can You Ensure Rater Reliability?

Rater reliability is another common issue in assessment, and it can be harder to
address because we tend to give it too little attention. Inter-rater reliability is
rarely a problem in classroom assessments because there are rarely two scorers.
Rather, teachers need to be concerned about intra-rater reliability: what happens
to our human limits of stamina and attention over the course of scoring a test?
Teachers must find ways to sustain their concentration and energy throughout the
scoring period. This is especially important for tests that call for open-ended
responses. After hours spent scoring, it is easy to let mentally established
standards slip. For open-ended responses, the following questions may help
improve intra-rater reliability:
Intra-rater reliability checklist:
a) Have you established consistent criteria for correct responses?
b) Can you give uniform attention to those criteria throughout the evaluation time?
c) Can you guarantee that scoring is based only on the established criteria and not
on extraneous or subjective variables?
d) Have you read through tests at least twice to check for consistency?
e) If you have made “midstream” modifications of what you consider a correct
response, did you go back and apply the same standards to all?
f) Can you avoid fatigue by reading the tests in several sittings, especially if the
time requirement is a matter of several hours?
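
Item d) of the checklist, reading tests twice to check for consistency, can be
sketched in Python as a simple comparison of two scoring passes by the same teacher.
The student labels, the scores, and the tolerance of one point are hypothetical
assumptions for illustration, not a procedure prescribed by the cited sources.

```python
def flag_inconsistent(first_pass, second_pass, tolerance=1):
    """Return students whose two scores from the same rater differ by more than tolerance."""
    flagged = []
    for student, first_score in first_pass.items():
        gap = abs(first_score - second_pass[student])
        if gap > tolerance:
            flagged.append(student)
    return sorted(flagged)


# Hypothetical essay scores (0-20) given by one teacher on two separate readings.
first_reading = {"S01": 15, "S02": 12, "S03": 18, "S04": 9}
second_reading = {"S01": 14, "S02": 16, "S03": 18, "S04": 12}

to_rescore = flag_inconsistent(first_reading, second_reading)
print("Papers to read again against the criteria:", to_rescore)
```

Papers flagged in this way are candidates for a third reading against the established
criteria, preferably in a separate sitting to limit the effect of fatigue.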

4. Does the Procedure Demonstrate Content Validity?

The main factor in determining whether a classroom test is valid is its content
validity: the extent to which it asks students to perform tasks that were covered
in earlier classes and that accurately reflect the unit's objectives. For example,
if you have been teaching an English class in which students practice reading,
summarizing, and responding to short texts, a content-valid assessment must
require actual performance of those skills. In classroom assessment, content
validity and criterion validity are closely related, since lesson or unit
objectives serve as the criteria for the assessment. You can take several steps to
evaluate the content validity of a classroom test:
Content validity checklist:

a) Are unit objectives clearly identified?
b) Are unit objectives represented in the form of test specifications?
c) Do the test specifications include tasks that have already been performed as part
of the course procedures?
d) Do the test specifications include tasks that represent all (or most) of the
objectives for the unit?
e) Do those tasks involve actual performance of the target task(s)?

One of the main challenges in establishing content validity is recognizing that
the objectives of the relevant lesson, module, or unit form the foundation of any
well-designed classroom assessment. Identifying those objectives is therefore the
first step in evaluating a classroom test effectively. That is sometimes easier
said than done. All too often, teachers work through lessons day after day with
little or no awareness of the objectives they are trying to achieve. Or those
objectives may be so vaguely stated that it is hard to tell whether they were met.

A second issue in content validity is test specifications (specs). Don’t let this
word scare you. It simply means that a test should have a structure that follows
logically from the lesson or unit you are testing. Many tests have a design that:

• Divides them into a number of sections (corresponding, perhaps, to the
objectives assessed).
• Offers students a variety of item types.
• Gives an appropriate relative weight to each section.
Naturally, some tests are not well suited to this kind of organization. For a
university-level academic writing course, an in-class written essay on a given
topic might legitimately serve as the test. There would be just one “item” and one
response, so to speak. In this case, however, the specifications would be embedded
in the prompt and in the scoring rubric used to evaluate the student’s response
and provide feedback.
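
For tests that do follow a sectioned design, the notion of giving each section an
appropriate relative weight can be sketched in Python as follows. The section names,
the weights, and the scores are hypothetical assumptions for illustration; the
sources cited in this paper do not prescribe particular weights.

```python
def weighted_total(sections):
    """Combine section results into a total out of 100.

    sections maps a section name to (weight, points earned, points possible).
    Weights are percentages and are assumed to sum to 100.
    """
    total_weight = sum(weight for weight, _, _ in sections.values())
    if total_weight != 100:
        raise ValueError(f"section weights sum to {total_weight}, expected 100")
    return sum(weight * earned / possible
               for weight, earned, possible in sections.values())


# Hypothetical test specification: each weight reflects how much emphasis
# the matching unit objective received in class.
spec = {
    "grammar": (30, 8, 10),
    "reading": (50, 15, 20),
    "writing": (20, 3, 5),
}

total = weighted_total(spec)
print(f"Weighted total: {total:.1f} out of 100")
```

Writing the weights down in this way makes it easy to check that the emphasis of the
test matches the emphasis of the unit being tested.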

The content validity of an existing classroom test should be apparent in how the
objectives of the unit being tested are represented in the content of its items,
item clusters, and item types. Do you clearly perceive the performance of
test-takers as reflective of the classroom objectives? If so (and you can argue
this), content validity has most likely been achieved.

5. Has the Impact of the Test Been Carefully Accounted for?

This question integrates the concept of consequential validity (impact) and the
importance of structuring an assessment procedure to elicit optimal performance by
the student. Remember that even though it is an elusive concept, the appearance of
a test from a student’s point of view is important to consider. The following factors
might help you to pinpoint some of the issues surrounding the impact of a test:
Consequential validity checklist:
a) Have you offered students appropriate review and preparation for the test?
b) Have you suggested test-taking strategies that will be beneficial?
c) Is the test structured so that, if possible, the best students will be modestly
challenged and the weaker students will not be overwhelmed?
d) Does the test lend itself to your giving beneficial washback?
e) Are the students encouraged to see the test as a learning experience?

6. Are the Test Tasks as Authentic as Possible?
Evaluate the extent to which a test is authentic by asking the following
questions:
Authenticity checklist:

a) Is the language in the test as natural as possible?
b) Are items as contextualized as possible rather than isolated?
c) Are topics and situations interesting, enjoyable, and/or humorous?
d) Is some thematic organization provided, such as through a story line or episode?
e) Do tasks represent, or closely approximate, real-world tasks?

To make the concept of authenticity clearer, consider two excerpts from tests. The
sequence of items in the following decontextualized tasks takes the test-taker
into five different topic areas, with no context for any item and with the
grammatical category as the only unifying element. Each sentence could plausibly
be written or spoken in the real world, but perhaps only in five different
contexts. On a scale of authenticity, and given the constraints of a
multiple-choice format, this first excerpt rates only fair.

Multiple-choice tasks—decontextualized “Going To”

1) What _______ this summer?
a) John is going to do
b) Is John going to do
c) You’re going to do

2) _______ anything special next weekend?
a) Are you going to do
b) You are going to do
c) Is going to do

3) She and I _______ my English class tomorrow.
a) are going to
b) are going
c) going to

4) The Giants are playing baseball on Wednesday. _______
a) What’s it going to?
b) Who’s it going to be?
c) Where's it going to be played?

5) The ocean’s _______ to be at low tide later this morning.
a) go
b) going
c) going to

The second excerpt, which follows, is more effective and is rated good. The
sequence of items in these contextualized tasks achieves a modicum of
authenticity by linking all the items sequentially in a story line. The
conversation is one that might occur in the real world, although perhaps with a
little less formality.

Multiple-choice tasks—contextualized
Directions: After answering the questions, click the “Submit” button.
“Going To”

1) Amanda: What _______ this weekend?
a) you are going to do
b) are you going to do
c) your gonna do

2) Gwen: I’m not sure. _______ anything special?
a) Are you going to do
b) You are going to do
c) Is going to do

3) Amanda: Melissa and I _______ a party. Would you like to come?
a) are going to
b) are going
c) go to

4) Gwen: I’d love to! _______
a) What’s it going to be?
b) Who’s going to be?
c) Where’s it going to be?

5) Amanda: It’s _______ to be at Ruth’s house.
a) Go
b) Going
c) Gonna

7. Does the Test Offer Beneficial Washback to the Learner?
The design of an effective test should point the way to beneficial washback. A
test that achieves content validity demonstrates relevance to the curriculum in
question and thereby sets the stage for washback.

When test items represent the various objectives of a unit, and/or when sections
of a test clearly focus on major topics of the unit, classroom tests can serve in
a diagnostic capacity even if they aren't specifically labeled as such.

The following checklist should help you to maximize beneficial washback in a
test:
Washback checklist:

a) Is the test designed in such a way that you can give feedback that will be
relevant to the objectives of the unit being tested?
b) Have you given students sufficient pre-test opportunities to review the subject
matter of the test?
c) In your written feedback to each student, do you include comments that will
contribute to students’ formative development?
d) After returning tests, do you spend class time “going over” the test and
offering advice on what students should focus on in the future?
e) After returning tests, do you encourage questions from students?
f) If time and circumstances permit, do you offer students (especially the weaker
ones) a chance to discuss results in an office hour?

Sometimes evidence of washback may be only marginally visible from an
examination of the test itself. Here again, what happens before and after the
test is critical. Preparation time before the test can contribute to washback
since the learner is reviewing and focusing in a potentially communicative way on
the objectives in question. In what can informally be called “wash forward,”
students can be helped by strategic efforts to internalize the material being
tested. An increasingly common occurrence in student-centered classrooms is the
formation of study groups whose task is to review the subject matter of an
upcoming test. Sometimes these study groups are more valuable, in terms of
measurable washback, than the test itself.

By spending classroom time after the test reviewing its content, students
discover their areas of strength and weakness. Teachers can raise the washback
potential by asking students to use test results as a guide to setting goals for
their future effort. The key is to play down the “Whew, I'm glad that's over”
feeling that students are likely to have and to play up the learning that can now
take place from their knowledge of the results.

C. Maximizing Both Practicality and Washback
As teachers, we are obliged to assess students' progress, and we must use sound and
appropriate assessment techniques, adapting them to the situation. For example,
when we are busy and pressed by deadlines, we are tempted to choose a quick and
easy assessment, that is, one that is practical. Practicality then tends to become
the overriding principle. However, an emphasis on practicality can come at the
expense of authenticity and washback. Balancing practicality against authenticity
and washback is therefore a dilemma that teachers must face.
Assessments can be placed at three broad levels of practicality: multiple-choice
tests (high in practicality and reliability but low in washback), essay tests
(moderate in practicality and reliability, with medium washback), and portfolios,
journals, and conferences (low in practicality and reliability but high in
washback).

Look at the picture!

It could be argued that large-scale multiple-choice tests cannot offer much washback
or authenticity, and that portfolios and similar alternatives cannot achieve much
practicality or reliability. This need not be the case. The challenge facing
conscientious teachers and assessors in our profession is to change this trade-off
through varied efforts. Examples include pushing traditional test formats toward
more washback and authenticity and increasing the practicality and reliability of
portfolios, journals, and conferences.
In addition, portfolios, journals, and conferences have several advantages as
alternative assessments. They are:

• Open-ended in their time orientation and format.
• Contextualized to a curriculum.
• Referenced to the criteria (objectives) of that curriculum.
• Likely to build intrinsic motivation.

Even if this difficulty seems impossible to overcome, a teacher cannot simply give
up. When the tests in use are inauthentic and produce negative washback, teachers
should not surrender; the testing experience can still be turned into a more
pedagogically satisfying learning experience.

The approaches that can be taken include:

• Build as much authenticity as possible into multiple-choice task types and items.
• Design classroom tests that combine an objectively scored section with an
open-ended response section, varying the performance tasks.
• Convert multiple-choice test results into diagnostic feedback on areas needing
improvement.
• Maximize the preparation period before a test to elicit performance relevant to
the ultimate criteria of the test.
• Teach test-taking strategies.
• Help students achieve learning beyond the test (don't “teach to the test”).
• Triangulate information on students before making a final assessment of
competence.

All of the approaches above can help balance practicality and washback in an
assessment. In large-scale standardized tests, practicality is usually more
important than washback; the opposite is probably true for most classroom tests.

CHAPTER III
CLOSING

A. Conclusion
Washback, in education, refers to the impact of an assessment on teaching and learning.
A test with beneficial washback positively influences what and how teachers teach and
what and how learners learn. In classroom-based assessment, washback can take positive
forms such as the advantages of studying and preparing for a test and the knowledge
gained from receiving performance feedback. Teachers can provide students with helpful
assessments of their strengths and weaknesses. Teachers face the challenge of designing
assessments that not only evaluate but also promote learning. They can provide detailed
and constructive feedback on students' performance, which enhances the impact of
assessments. Practicality is crucial in classroom testing. Teachers should consider the
viability of their plans for scoring the test, administering it, and providing test-related
feedback.
Reliability applies to the student, the test administration, the test itself, and the teacher.
Test and test administration reliability can be achieved by making sure that all students
receive the same quality of input. Content validity is determined by how much of a test
asks students to complete tasks that are covered in earlier classes and that accurately
reflect the unit's goals. The impact of a test should be carefully accounted for by offering
students appropriate review and preparation for the test, suggesting test-taking strategies,
and structuring the test so that it challenges the best students and does not overwhelm the
weaker ones. Authenticity can be evaluated by asking if the language in the test is natural,
if items are contextualized, if topics are interesting, and if tasks represent real-world tasks.

B. Suggestion
We, the writers, apologize for the shortcomings of this paper. We know that it is still
far from perfect, so we welcome suggestions from readers to help us improve it. Thank
you to all our readers.

REFERENCES

Brown, H. D., & Abeywickrama, P. (2018). Language Assessment: Principles and Classroom
Practices (Third Edition). Pearson.

Green, A. (2020). The Encyclopedia of Applied Linguistics.
https://doi.org/10.1002/9781405198431.wbeal1274.pub2
