AIL 2
I. Introduction
Formulating instructional objectives or learning targets is the first step in both teaching and evaluation. Once you have determined your objectives or learning targets, that is, answered the question "what to assess," you will probably be concerned with answering the question "how to assess." At this point, it is important to keep in mind several criteria that determine the quality and credibility of the assessment methods you choose. This lesson will focus on these principles or criteria and provide suggestions for practical steps you can take to keep the quality of your assessment high. At the end of your reading, you should be able to explain and apply each of the criteria of high-quality assessment summarized in Figure 1 below.
Figure 1. Criteria of high-quality assessment: clear and appropriate learning targets, appropriate assessment methods, validity, reliability, fairness, positive consequences, and practicality and efficiency.
The types of learning targets presented provide a start to identifying the focus of instruction and assessment, but you will find other sources that are more specific about learning targets, such as Bloom's Taxonomy of Objectives.
Cognitive Domain
Each level of the taxonomy represents an increasingly complex type of cognition, with
knowledge level considered as the lowest level. However, the remaining five levels are referred to
as “Intellectual abilities and skills.” Though this categorization of cognitive tasks was created
more than 50 years ago, and other more contemporary frameworks were offered, the taxonomy is
still valuable in providing a comprehensive list of possible learning objectives with clear action
verbs that operationalize the learning targets.
APPROPRIATENESS OF ASSESSMENT METHODS
Many different approaches or methods are used to assess students, but your choice will greatly depend on the match between the learning target and the method. The different methods of assessment are categorized according to the nature and characteristics of each method. There are four major categories: selected-response, constructed-response, teacher observation, and self-report.
I. Selected Response
a) Multiple Choice
b) Binary Choice (e.g., true/false)
c) Matching
II. Constructed Response
a) Brief constructed response
1. Short answer
2. Completion
3. Label a diagram
b) Performance-based tasks
1. Products
Paper
Project
Poem
Portfolio
Reflection
Journal
Graph/Table
2. Skills
Speech
Demonstration
Debate
Recital
c) Essay Items
1) Restricted-response
2) Extended-response
d) Oral Questioning
1) Informal questioning
2) Examinations
3) Interviews
Table 1. Sample Table of Specifications

                                        Learning Targets
Major Content     Knowledge/       Deep Understanding   Skills   Products   Affects   Totals
Areas             Simple           and Reasoning
                  Understanding
1. (Topic)        No./%            No./%                No./%    No./%      No./%     No./%
2. (Topic)        No./%            No./%                No./%    No./%      No./%     No./%
3. (Topic)        No./%            No./%                No./%    No./%      No./%     No./%
4. Mammals        4/8%             No./%                No./%    No./%      No./%     No./%
...
N. (Topic)        -                -                    -        -          -         -
Total no. of      No./%            No./%                No./%    No./%      No./%     50/100%
items/% of test
The table is completed by simply indicating the number of items (No.) and the percentage of items for each type of learning target. For example, if the topic were vertebrates, you might have mammals as one content area. If there were four knowledge items for mammals, and this was 8 percent of the test (N = 50), then 4/8% would be entered in the table under knowledge. The rest of the table is completed by your judgment as to which learning targets will be assessed, what area of the content will be sampled, and how much of the assessment measures each target. In this process, evidence of content-related validity is established.
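To make this bookkeeping concrete, the tally can be automated. Here is a minimal sketch in Python; the topics, targets, and item counts are hypothetical examples, not the vertebrates test above:

    # Minimal sketch: tallying the No./% cells of a table of specifications.
    from collections import Counter

    # Each planned test item is tagged with its content area and learning target
    # (all names and counts here are hypothetical).
    items = (
        [("mammals", "knowledge")] * 4 +
        [("mammals", "reasoning")] * 2 +
        [("birds", "knowledge")] * 6 +
        [("birds", "skills")] * 3
    )

    counts = Counter(items)  # number of items in each (topic, target) cell
    total = len(items)       # total number of items in the test

    for (topic, target), n in sorted(counts.items()):
        # Each cell is reported as "No./%", e.g., 4 items out of 50 would be 4/8%.
        print(f"{topic:10} {target:10} {n}/{100 * n / total:.0f}%")
    print(f"Total items: {total}/100%")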
Another consideration related to this type of evidence is the extent to which an assessment can be said to have instructional validity, which is concerned with the match between what is taught and what is assessed. One way to check this is to examine the table of specifications after teaching a unit to determine if the emphasis in different areas is consistent with what was emphasized in class. For example, if you emphasized knowledge in teaching a unit (e.g., facts, definitions of terms, places, dates, and names), it would not be logical to test for reasoning and then make inferences about the knowledge students learned in the class.
Criterion-related evidence. This is established by relating an assessment to some other valued measure (criterion) that either provides an estimate of current performance (concurrent criterion-related evidence) or predicts future performance (predictive criterion-related evidence). Classroom teachers do not conduct formal studies to obtain correlation coefficients that will provide evidence of validity, but the principle is very important for teachers to employ. The principle is that when you have two or more measures of the same thing, and these measures provide similar results, then you have established criterion-related evidence. For example, if your observation of a student's skill in using a microscope agrees with that student's score on a quiz that tests the steps in using a microscope, then you have criterion-related evidence that your inference about the skill of this student is valid.
Similarly, if you are interested in the extent to which preparation by your students, as indicated by scores on a final exam in mathematics, predicts how well they will do next year, then you can examine the grades of previous students and determine informally whether students who scored high on your final exam are getting high grades and students who scored low are obtaining low grades. If such a correlation is found, then an inference about predicting how your students will perform, based on their final exam, is valid; this is predictive criterion-related evidence.
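Informally, this check amounts to computing a correlation coefficient. Here is a minimal sketch in Python; the scores and variable names are made-up illustrations (statistics.correlation requires Python 3.10+):

    # Sketch: predictive criterion-related evidence as a Pearson correlation
    # (hypothetical data).
    from statistics import correlation

    final_exam  = [95, 88, 76, 70, 62]   # this year's math final exam scores
    next_grades = [92, 85, 80, 68, 65]   # the same students' grades the next year

    r = correlation(final_exam, next_grades)
    print(f"Pearson r = {r:.2f}")  # an r near +1 supports the predictive inference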
Construct-related evidence. A construct refers to an unobservable trait or characteristic that a person possesses, such as intelligence, reading comprehension, honesty, self-concept, attitude, reasoning, learning style, and anxiety. These are not measured directly; rather, the characteristic is constructed to account for behavior that can be observed. Three types of construct-related evidence are theoretical, logical, and statistical. A theoretical explanation defines the characteristic in such a way that its meaning is clear and not confused with any other construct (e.g., "What is attitude?" or "How much do students enjoy reading?"). Logical analyses, on the other hand, can be done by asking students to comment on what they were thinking when they answered the questions, or by comparing the scores of groups who, as determined by other criteria, should respond differently. Finally, statistical procedures can be used to correlate scores from measures of the construct. For example, self-concept of academic ability scores from one survey should be related to another measure of the same thing (convergent construct-related evidence) but less related to measures of self-concept of physical ability (divergent construct-related evidence).
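To illustrate the statistical approach, here is a minimal sketch in Python with hypothetical survey scores: convergent evidence appears as a high correlation between two measures of the same construct, and divergent evidence as a near-zero correlation with a measure of a different construct:

    # Sketch: convergent vs. divergent construct-related evidence (hypothetical data).
    from statistics import correlation

    academic_self_concept_a = [30, 25, 28, 20, 35]  # survey 1: academic self-concept
    academic_self_concept_b = [32, 24, 27, 21, 33]  # survey 2: same construct
    physical_self_concept   = [22, 28, 25, 24, 26]  # survey 3: different construct

    # Two measures of the same construct should correlate highly.
    print("convergent r =", round(correlation(academic_self_concept_a,
                                              academic_self_concept_b), 2))
    # A measure of a different construct should correlate weakly.
    print("divergent r  =", round(correlation(academic_self_concept_a,
                                              physical_self_concept), 2))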
RELIABILITY
Like validity, term reliability has been used for so many years to describe an essential
characteristic of sound assessment. Reliability is concerned with the consistency, stability, and
dependability of the results. In other words, a reliable result is one that shows similar
performance at different times or under different conditions.
Suppose Mrs. Reyes is assessing her students' addition and subtraction skills. She decides to give the students a twenty-point quiz to determine their skills. She examines the results but wants to be sure about the level of performance before designing appropriate instruction, so she gives another quiz two days later on the same addition and subtraction skills. The results are as follows:
          Addition            Subtraction
          Quiz 1   Quiz 2     Quiz 1   Quiz 2
Carlo       18       16         13       20
Kate        10       12         18       10
Jane         9        8          8       14
Fely        16       15         17       12
The scores for addition are fairly consistent. All four students scored within one or two points on the two quizzes; students who scored high on the first quiz also scored high on the second quiz, and students who scored low did so on both quizzes. Consequently, the results for addition are reliable. For subtraction, on the other hand, there is considerable change in performance from the first to the second quiz: students scoring low on the first quiz score high on the second. For subtraction, then, the results are unreliable because they are not consistent; the scores contradict one another.
The teacher's goal is to use the quiz to accurately determine the targeted skill. In the case of addition, she can get a fairly accurate picture with an assessment that is reliable. For subtraction, on the other hand, she cannot use this result alone to estimate the students' real or actual skill. More assessments are needed before she can be confident that the scores are reliable and thus provide a dependable result.
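The consistency Mrs. Reyes is looking at can also be checked numerically. Here is a minimal sketch in Python that correlates the two quizzes, using the scores from the table above (statistics.correlation requires Python 3.10+):

    # Sketch: test-retest consistency for Mrs. Reyes's quizzes
    # (data taken from the table above).
    from statistics import correlation

    addition_q1, addition_q2       = [18, 10, 9, 16], [16, 12, 8, 15]
    subtraction_q1, subtraction_q2 = [13, 18, 8, 17], [20, 10, 14, 12]

    print("addition    r =", round(correlation(addition_q1, addition_q2), 2))
    print("subtraction r =", round(correlation(subtraction_q1, subtraction_q2), 2))
    # With these data, addition r is about 0.93 (consistent, reliable), while
    # subtraction r is about -0.47 (inconsistent, unreliable).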
But even though the scores in addition are reliable, they are not without some degree of error. In fact, all assessments have error; they are never perfect measures of the trait or skill. The concept of error in assessment is critical to understanding reliability. Conceptually, whenever we assess something, we get an observed score or result. This observed score is a product of what the true or real ability or skill is plus some degree of error:
Observed score = True score + Error
Reliability is directly related to error. It is not a matter of all or none, as if some results were reliable and others unreliable. Rather, for each assessment there is some degree of error; thus, we think in terms of low, moderate, or high reliability. It is important to remember that error can be positive or negative. That is, the observed score can be higher or lower than the true score depending on the nature of the error. For example, if a student is sick, tired, in a bad mood, or distracted, the score may have negative error and underestimate the true score.
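Here is a minimal simulation sketch in Python of this relationship; the true score, the error spread, and the number of trials are all hypothetical:

    # Sketch: Observed score = True score + Error, simulated with random error.
    import random

    random.seed(1)            # fixed seed so the sketch is repeatable
    true_score = 80           # the student's actual, unobservable skill level
    for trial in range(5):
        error = random.gauss(0, 4)     # random error, mean 0; may be + or -
        observed = true_score + error  # what the assessment actually reports
        print(f"trial {trial + 1}: observed = {observed:.1f} (error = {error:+.1f})")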
So what are the sources of error in assessment that may affect test reliability? Figure 3 summarizes the different sources of assessment error.
Figure 3. Possible sources of assessment error. The assessment of actual or true knowledge, understanding, reasoning, skills, products, or affects yields an observed score. Internal sources of error include health, mood, motivation, test-taking skills, anxiety, fatigue, and general ability. External sources of error include directions, luck, item ambiguity, heat in the room, lighting, sampling of items, observer differences, test interruptions, scoring, and observer bias.

The following suggestions can help enhance the reliability of your assessments:
Use a sufficient number of items or tasks. (Other things being equal, longer tests are more reliable; see the sketch after this list.)
Use independent raters or observers who provide similar scores on the same performances.
Construct items and tasks that clearly differentiate students on what is being assessed.
Make sure the assessment procedures and scoring are as objective as possible.
Continue assessment until results are consistent.
Eliminate or reduce the influence of extraneous events or factors.
Use shorter assessments more frequently rather than fewer but longer assessments.
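The first suggestion can be quantified with the Spearman-Brown formula, which also appears in Activity 2 below. Here is a minimal sketch with hypothetical reliability values:

    # Sketch: the Spearman-Brown formula, which predicts the reliability of a
    # test lengthened by a factor k from its current reliability r_old:
    #     r_new = k * r_old / (1 + (k - 1) * r_old)
    def spearman_brown(r_old: float, k: float) -> float:
        return k * r_old / (1 + (k - 1) * r_old)

    # Doubling a test (k = 2) whose reliability is 0.60 (hypothetical values):
    print(round(spearman_brown(0.60, 2), 2))  # 0.75: longer test, higher reliability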
FAIRNESS
A fair assessment is one that provides all students an equal opportunity to demonstrate achievement and yields scores that are comparably valid from one person or group to another. If some students have an advantage over others because of factors unrelated to what is being taught, then the assessment is not fair. Thus, neither the assessment task nor the scoring should be differentially affected by race, gender, ethnic background, or other factors unrelated to what is being assessed. The following criteria represent potential influences that determine whether or not an assessment is fair.
1. Student knowledge of learning targets and assessment. A fair assessment is one in which it is clear what will and will not be tested; your objective is not to fool or trick students or to outguess them on the assessment. Rather, you need to be very clear and specific about the learning target: what is to be assessed and how it will be scored.
2. Opportunity to learn. This means that students know what to learn and then are provided ample time and appropriate instruction. It is usually not sufficient to simply tell students what will be assessed and then test them. You must plan instruction that focuses specifically on helping students understand, providing students with feedback on their progress, and giving students the time they need to learn.
3. Prerequisite knowledge and skills. It is unfair to assess students on things that require prerequisite knowledge or skills that they do not possess. For example, suppose you want to test math reasoning skills and your questions are based on short paragraphs that provide the needed information. In this situation, math reasoning skills can be demonstrated only if students can read and understand the paragraphs; thus, reading skills are a prerequisite. If students do poorly on the test, their performance may have more to do with a lack of reading skills than with math reasoning.
4. Avoiding stereotypes. Stereotypes are judgments about how groups of people will behave based on characteristics such as gender, race, socioeconomic status, and physical appearance. Though it is impossible to avoid stereotypes completely because of our values, beliefs, and preferences, we can control the influence of these prejudices.
5. Avoiding bias in assessment tasks and procedures. Bias is present if the assessment distorts performance because of the student's ethnicity, gender, race, religious background, and so on. Bias appears in two forms: offensiveness and unfair penalization.
POSITIVE CONSEQUENCES
Ask yourself these questions: How will assessment affect student motivation? Will students be more or less likely to be meaningfully involved? Will their motivation be intrinsic or extrinsic? How will the assessment affect my teaching? What will the parents think about my assessment? It is important to remember that the nature of classroom assessment has important consequences for teaching and learning.
Positive consequences on students. The most direct consequence of assessment is that students learn and study in a way consistent with your assessment task. If your assessment is multiple choice designed to determine students' knowledge of specific facts, students will tend to memorize information. Assessment also has clear consequences for students' motivation. If students know what will be assessed and how it will be scored, and if they believe that the assessment will be fair, they are likely to be motivated to learn. Finally, the student-teacher relationship is influenced by the nature of assessment: when teachers construct assessments carefully and provide feedback to students, the relationship is strengthened.
Positive consequences on teachers. Just as students learn depending on the assessment, teachers tend to teach to the test. Thus, if the assessment calls for memorization of facts, the teacher tends to teach lots of facts; if the assessment requires reasoning, then the teacher structures exercises and experiences that get students to think. Assessment may also influence how you are perceived by others. Are you comfortable with school administrators and parents reviewing and critiquing your assessments? What about the views of other teachers? How do your assessments fit with what you want to be as a professional? Thus, like students, teachers are affected by the nature of the assessments they give their students.
PRACTICALITY AND EFFICIENCY
High-quality assessments are practical and efficient. Because time is a limited commodity for teachers, factors like familiarity with the method, time required, complexity of administration, ease of scoring, ease of interpretation, and cost should be considered.
1. Familiarity with the method. This includes knowing the strengths and limitations of the method, how to administer it, and how to score and interpret responses. Otherwise, teachers risk time and resources for questionable results.
2. Time required. Gather only as much information as you need for the decision. The time required should include how long it takes to construct the assessment and how long it takes to score the results. Thus, if you plan to use a test format (like multiple choice) over and over for different groups of students, it is efficient to put in considerable time preparing the assessment, as long as you can reuse many of the same test items each year or semester.
3. Complexity of administration. The directions and procedures for administration should be clear so that little time and effort are needed. Assessments that require long and complicated instructions are less efficient, and because students will probably misunderstand them, reliability and validity are affected.
4. Ease of scoring. It is obvious that objective tests are easier to score than other methods. In general, use the easiest method of scoring appropriate to the method and purpose of the assessment. Performance-based assessments, essays, and papers are more difficult to score, so it is more practical to use rating scales and checklists rather than writing extended individualized evaluations.
5. Ease of interpretation. Objective tests that report a single score are easier to interpret than other methods, while individualized written comments are more difficult to interpret. You can provide students with a key and other materials that give meaning to different scores or grades.
6. Cost. Like other practical aspects, it is best to use the most economical assessment. However, it would certainly be unwise to use a less reliable or less valid instrument just because it costs less.
BALANCE
1. Assessment methods should be able to assess all domains of learning (cognitive, psychomotor, and affective).
2. Assessment methods should be able to assess all levels in the hierarchy of objectives.
AUTHENTICITY
1. Assessment should touch real-life situations.
2. Assessment should emphasize practicability.
CONTINUITY
1. Since assessment is an integral part of the teaching-learning process, it should be continuous.
CLEAR COMMUNICATION
1. Assessment results should be communicated to all people involved.
2. Assessment results can be communicated through pre-test and post-test reviews.
ETHICS IN ASSESSMENT
As an educator who uses assessments, you are expected to uphold principles of professional conduct such as:
1) Protecting the safety, health, and welfare of all examinees;
2) Knowing about and behaving in compliance with laws relevant to assessment activities;
3) Maintaining and improving your competence in assessment;
4) Providing assessment services only in your area of expertise;
5) Adhering to, and promoting, high standards of professional conduct within and between educational institutions;
6) Promoting the understanding of sound assessment practices; and
7) Performing your professional responsibilities with honesty, integrity, due care, and fairness.
III. Learning Task (you will be notified through Google Classroom of the date of submission)
Activity 1 (on learning targets and methods of assessment)
For each of the following situations or questions, indicate which assessment method provides the best match. Then provide a brief explanation of why you chose that method of assessment. Choices are: selected response, essay, performance-based, oral questioning, observation, and self-report.
1. Mrs. Abad needs to check students to see if they are able to draw graphs correctly, like the examples just demonstrated in class.
Method: _______________________
Why?
______________________________________________________________________________
______________________________________________________________________________
_________________________________________________________________.
2. Mr. Garcia wants to see if his students are comprehending the story before moving
to the next set of instructional activities.
Method: ________________________
Why?
______________________________________________________________________________
______________________________________________________________________________
_________________________________________________________________.
3. Ms. Santos wants to find out how many spelling words her students know.
Method: _________________________
Why?
______________________________________________________________________________
______________________________________________________________________________
________________________________________________________________.
4. Ms. Cruz wants to see how well her students can compare and contrast the EDSA 1 and EDSA 2 People Power revolutions.
Method: __________________________
Why?
______________________________________________________________________________
______________________________________________________________________________
_________________________________________________________________.
5. Mr. Mango's objective is to enhance his students' self-efficacy and attitude toward school.
Method: __________________________
Why?
______________________________________________________________________________
______________________________________________________________________________
_________________________________________________________________.
6. Mr. Fuentes wants to know if his class can identify the different parts of a
microscope.
Method: __________________________
Why?
______________________________________________________________________________
______________________________________________________________________________
____________________________________________________________________.
Activity 2. (On validity and reliability)
A. Answer the following questions briefly.
1. Should teachers be concerned about relatively technical features of assessment such as validity
and reliability? Why or why not?
2. The students in the following lists are rank ordered based on their performance on two tests on the same content (highest score at the top). Do the results suggest a reliable assessment? Why or why not?
Test A      Test B
George      Ann
Tess        Robert
Ann         Carlo
Carlo       George
Robert      Tess
3. Reading activity. When do we use the following methods of establishing reliability evidence?
a. Split-half reliability
b. Spearman-Brown formula
c. Kuder-Richardson (KR-20 and KR-21) formulas
d. Coefficient alpha
Activity 3. (On fairness, practicality, and positive consequences)
1. Which aspect of fairness is illustrated in each of the following assessment situations?
a. Students complained because they were not told what to study for the test.
b. Students studied the wrong way for the test (e.g., they memorized the content).
c. The teacher was unable to cover the last unit that was on the test.
d. The test included a story about life in Baguio City, and students who had been to Baguio showed better comprehension scores than students who had not been there.
2. Is the following test item biased? Why or why not?
Carlo has decided to develop a family budget. He has P2,000 to work with and decides to put P1,000 into house rental, P300 into food, P200 into transportation, P300 into entertainment, P150 into utilities, and P50 into savings. What percent of Carlo's budget is being spent on each of the categories?
3. Why is it important for teachers to consider practicality and efficiency in selecting their
assessments?
4. Based on your experience or observed practices, suggest at least two ways to enhance the practicality and efficiency of assessment in terms of:
a. Cost
b. Ease of scoring
c. Complexity of administration
5. On-site activity. Ask a group of high school or elementary students, depending on your interest, what they see as fair assessment. Also, ask them how different kinds of assessment affect them; for example, do they study differently for essay and multiple-choice tests?
Activity 4. Share insights that you gained in the lesson. A paragraph or two on each principle/criterion of high-quality assessment is encouraged.
IV. References
Buendicho, Flordeliza C. Assessment of Student Learning 1. Manila: Rex Book Store, 2010.
Garcia, Carlito D. Measuring and Evaluating Learning Outcomes: A Textbook in Assessment of Learning 1 & 2. Mandaluyong City: Books Atbp. Publishing Corp., 20