
ASSESSMENT OF LEARNING

Test
 An instrument designed to measure any characteristic, quality, ability, knowledge or skill.
 It is comprised of items in the area it is designed to measure.
Example: Quiz.

Measurement
 A process of quantifying the degree to which someone/something possesses a given trait.
 Assigning of numbers to a performance, product, skill, or behavior of a student, based on a pre-
determined procedure or set of criteria
 Assigning of numbers to the results of a test or other type of assessment
 Awarding of points for a particular aspect of an essay or performance.

Assessment can be defined both as a product and a process.


Assessment as a Product
 Refers to the instrument (e.g. set of questions or tasks) that is designed to elicit a
predetermined behavior, unique performance, or a product from a student.
Assessment as a Process
 Collection, interpretation, and use of qualitative and quantitative information to assist
teachers in their educational decision-making.
Hence, assessment is a prerequisite to evaluation: it provides the information that enables
evaluation to take place.

Evaluation
 A process of making judgments about the quality of a performance, product, skill, or behavior of a
student.

Testing
 A strategy or method employed to obtain information for evaluation purposes.

Types of Assessment
Assessment FOR Learning
1. Placement – done prior to instruction
 Its purpose is to assess the needs of the learners to have the basis in planning for a relevant
instruction.
 The results of this assessment place students in specific learning groups to facilitate
teaching and learning.
2. Formative – done during instruction
 Through this assessment, teachers continuously monitor the students’ level of attainment
of the learning objectives.
 The results of this assessment are communicated clearly and promptly to the students for
them to know their strengths and weaknesses and the progress of their learning.
3. Diagnostic – done before/during instruction
 Used to determine students’ recurring or persistent difficulties.
 Identifies causes of learning problems
 It helps formulate a plan for detailed REMEDIAL INSTRUCTION

Assessment OF Learning
1. Summative assessment – done after instruction
 Measures end-of-course achievement
 It is used to certify what students know and can do and the level of their proficiency or
competency.
 Usually expressed as marks or letter grade
 Results are communicated to the students, parents, and other stakeholders for decision
making
Assessment AS Learning (Self-assessment)
 Self-directed
 This is done by the students themselves: they monitor and reflect on their own learning,
which in turn informs assessment FOR and OF learning.

Modes of Assessment
1. Traditional – the objective paper-and-pencil test, which usually assesses low-level thinking skills.
Examples: standardized tests and teacher-made tests
Advantages:
 Scoring is objective.
 Administration is easy because students can take the test at the same time.
Disadvantages:
 Preparation of instrument is time-consuming
 Prone to cheating.

2. Performance – a mode of assessment that requires actual demonstration of skills or creation of
products of learning.
Examples: practical tests, oral and aural tests, projects
Advantages:
 Preparation of instrument is relatively easy.
 Measures behaviors that cannot be faked.
Disadvantages:
 Scoring tends to be subjective without rubrics.
 Administration is time consuming.

3. Portfolio – a process of gathering multiple indicators of student progress to support course goals in
a dynamic, ongoing, and collaborative process. It is an alternative to the paper-and-pencil objective
test and is also performance-based.
Examples: working portfolios, show portfolios, documentary portfolios
Advantages:
 Measures student’s growth and development.
 Intelligence-fair
Disadvantages:
 Development is time-consuming
 Rating tends to be subjective without rubrics.

Traditional and Authentic Assessments Compared

Traditional ------------------------------- Authentic

Selecting a Response ------------------------------- Performing a Task
Contrived ------------------------------- Real-life
Recall/Recognition ------------------------------- Construction/Application
Teacher-structured ------------------------------- Student-structured
Indirect Evidence ------------------------------- Direct Evidence

Seven Criteria in Selecting a Good Performance Assessment Task


1. Authenticity – the task is similar to what the students might encounter in the real world as opposed
to encountering only in the school.
2. Feasibility – the task is realistically implementable in relation to its cost, space, time, and equipment
requirements.
3. Generalizability – the likelihood that the students’ performance on the task will generalize to
comparable tasks.
4. Fairness – the task is fair to all students regardless of their social status or gender.
5. Teachability – the task allows one to master the skill that one should be proficient in.
6. Multiple Foci – the task measures multiple instructional outcomes.
7. Scorability – the task can be reliably and accurately evaluated.
PORTFOLIO ASSESSMENT
Portfolio Assessment is also an alternative tool to pen-and-paper objective test. It is a purposeful,
on-going, dynamic, and collaborative process of gathering multiple indicators of the learner's growth and
development. Portfolio assessment is also performance-based.

TYPES OF ASSESSMENT PORTFOLIO


1. Documentation or Working Portfolio
 To highlight development and improvement over time
 Showcase the process of learning by including full progression of project development
 Growth portfolio
 Development portfolio
 Best and weakest of student’s work
 Often involves a range of artifacts from brainstormed lists to rough drafts to finished products
2. Process Portfolio
 To document all stages of the learning process
 It also includes samples of student work throughout the entire educational progression.
 It expands on the information in a documentation portfolio by integrating reflections and higher-
order cognitive activities.
 It includes documentation of reflection such as learning logs, journals, or documented
discussions.
 Best and weakest of student’s work + reflection
3. Product or Showcase Portfolio
 To highlight student's best work by showcasing the quality and range of student
accomplishments.
 Typically, it is used as a summative assessment to evaluate mastery of learning objectives.
 Best works or completed works only included

Rubric is a measuring instrument used in rating performance-based tasks. It is the "key to corrections" for
assessment tasks designed to measure the attainment of learning competencies that require demonstration
of skills or creation of products of learning. It offers a set of guidelines or descriptions in scoring different
levels of performance or qualities of products of learning. It can be used in scoring both the process and the
products of learning.

Similarity of Rubric with Other Scoring Instruments


Rubric is a modified checklist and rating scale.

1. Checklist – presents the observed characteristics of a desirable performance or product.


- the rater checks the trait/s that has/have been observed in one's performance or
product.
2. Rating Scale – measures the extent or degree to which a trait has been satisfied by one's
work or performance
- offers overall description of the different levels of quality of a work or a
performance
- uses 3 or more levels to describe the work or performance, although the most
common rating scales have 4 or 5 performance levels.
3. Rubric – shows the observed traits of a work/performance
- shows degree of quality of work/performance

Holistic Rubric
Description: It describes the overall quality of a performance or product. In this rubric, there is only
one rating given to the entire work or performance.
Advantages: It allows fast assessment. It provides one score to describe the overall performance or
quality of work.
Disadvantages: It does not clearly describe the degree of the criterion satisfied or not by the
performance or product. It does not permit differential weighting of the qualities of a product or a
performance.

Analytic Rubric
Description: It describes the quality of a performance or product in terms of the identified dimensions
and/or criteria, which are rated independently to give a better picture of the quality of the work or
performance.
Advantages: It clearly describes the degree of the criterion satisfied or not by the performance or
product. It permits differential weighting of the qualities of a product or a performance. It helps raters
pinpoint specific areas of strengths and weaknesses.
Disadvantages: It is more time-consuming to use. It is more difficult to construct.
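Since the analytic rubric permits differential weighting, a short sketch may help. This is a minimal illustration (the criteria, weights, and ratings below are invented for the example, not taken from this document) of how weighted criterion ratings combine into one score:

```python
# Minimal sketch of weighted analytic-rubric scoring. The criteria,
# weights, and ratings below are illustrative assumptions.
rubric = {            # criterion: (weight, rating on a 1-4 scale)
    "Content":      (0.40, 4),
    "Organization": (0.30, 3),
    "Delivery":     (0.20, 3),
    "Mechanics":    (0.10, 2),
}

# Weighted average of the ratings; the weights sum to 1.0.
score = sum(weight * rating for weight, rating in rubric.values())
print(f"weighted rubric score: {score:.2f} / 4")   # 3.30 / 4
```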

PRINCIPLES OF HIGH-QUALITY CLASSROOM ASSESSMENT

Principle 1: Clear and Appropriate Learning Targets


 Learning targets should be clearly stated, specific, and centered on what is truly important.

Principle 2: Appropriate Methods

Assessment Methods

Objective Supply: Short Answer, Completion Test
Objective Selection: Multiple Choice, Matching, True/False
Essay: Restricted Response, Extended Response
Performance-Based: Presentations, Papers, Projects, Athletics, Demonstrations, Exhibitions, Portfolios
Oral Question: Oral Examination, Conferences, Interviews
Observation: Informal, Formal, Obtrusive, Unobtrusive
Self-Report: Attitude Survey, Sociometric Devices, Questionnaires, Inventories

Types of Tests According to FORMAT


1. Selective Type – provides choices for the answer
a. Multiple Choice – consists of a stem, which describes the problem, and 3 or more
alternatives, which give the suggested solutions. The incorrect alternatives are the
distractors.
b. True-False or Alternative Response – consists of a declarative statement that one has to
mark true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, and the
like.
c. Matching Type – consists of two parallel columns: Column A, the column of premises
from which a match is sought; Column B, the column of responses from which the
selection is made.

2. Supply Test
a. Short Answer – uses a direct question that can be answered by a word, a phrase, a
number, or a symbol
b. Completion Test – consists of an incomplete statement to be filled in

3. Essay Test
a. Restricted Response – limits the content of the response by restricting the scope of the
topic
b. Extended Response – allows the students to select any factual information that they
think is pertinent and to organize their answers in accordance with their best judgment

Guidelines for Constructing Test Items

When to Use Essay Tests


Essays are appropriate when:
1. the group to be tested is SMALL and the test is NOT TO BE USED again;
2. you wish to encourage and reward the development of student's SKILL IN WRITING;
3. you are more interested in exploring the student's ATTITUDES than in measuring his/her
academic achievement;
4. you are more confident of your ability as a critical and fair reader than as an imaginative writer of
good objective test items.

When to Use Objective Test Items


Objective test items are especially appropriate when:
1. The group to be tested is LARGE and the test may be REUSED;
2. HIGHLY RELIABLE TEST SCORES must be obtained as efficiently as possible;
3. IMPARTIALITY of evaluation, ABSOLUTE FAIRNESS, and FREEDOM from possible test
SCORING INFLUENCES (e.g., fatigue, lack of anonymity) are essential;
4. You are more confident of your ability to express objective test items clearly than your ability to
judge essay test answers correctly;
5. There is more PRESSURE FOR SPEEDY REPORTING OF SCORES than for speedy test
preparation.

Principle 3: Balanced
 A balanced assessment sets targets in all domains of learning or domains of intelligence.

Principle 4: Validity
Validity – the degree to which the assessment instrument measures what it intends to measure. It is
the most important criterion of a good assessment instrument.

Ways in Establishing Validity


1. Face Validity – done by examining the physical appearance of the instrument.
2. Content Validity – done through a careful and critical examination of the objectives of assessment
so that it reflects the curricular objectives.
3. Criterion-related Validity – is established statistically such that a set of scores revealed by the
measuring instrument is correlated with the scores obtained in another external predictor or
measure.
a. Concurrent Validity- correlating the sets of scores obtained from two measures given
concurrently.
b. Predictive Validity - describes the future performance of an individual by correlating the sets of
scores obtained from two measures given at a longer time interval.
4. Construct Validity – established statistically by comparing psychological traits or factors that
theoretically influence scores in a test.
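To make the statistics concrete, here is a minimal sketch (the scores and variable names below are illustrative assumptions, not from this document) of how criterion-related validity is established: correlate scores from the new instrument with scores on an external measure.

```python
# Minimal sketch of criterion-related (concurrent) validity: correlate
# scores on a new instrument with scores on an established external
# measure given at about the same time. Data here are illustrative.
import numpy as np

new_test = np.array([12, 18, 25, 9, 22, 15])            # new instrument
established_test = np.array([40, 55, 80, 30, 70, 50])   # external criterion

# A high positive Pearson r is taken as evidence of concurrent validity.
r = np.corrcoef(new_test, established_test)[0, 1]
print(f"validity coefficient r = {r:.2f}")
```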

Factors Influencing the Validity of an Assessment Instrument


1. Unclear Directions – directions that do not clearly indicate to the students how to respond to the
task and how to record the responses tend to reduce validity.
2. Reading vocabulary and sentence structure too difficult – vocabulary and sentence structure
that are too complicated for the students turn the task into a test of reading comprehension, thus
altering the meaning of the assessment results.
3. Ambiguity – ambiguous statements in assessment tasks contribute to misinterpretation and
confusion. Ambiguity sometimes confuses the better students more than it does the poor students.
4. Inadequate time limits – time limits that do not provide students with enough time to consider the
tasks and provide thoughtful responses can reduce the validity of interpretations of results.
5. Overemphasis of easy-to-assess aspects of the domain at the expense of important but hard-to-
assess aspects (construct underrepresentation) – it is easy to develop test questions that assess
factual recall and generally harder to develop ones that tap conceptual understanding or higher-
order thinking processes such as the evaluation of competing positions or arguments. Hence, it is
important to guard against underrepresentation of tasks targeting the important but more difficult-
to-assess aspects of achievement.
6. Test items inappropriate for the outcomes being measured - Attempting to measure
understanding, thinking skills, and other complex types of achievement with test forms that are
appropriate for only measuring factual knowledge will invalidate the results.
7. Poorly constructed test items – test items that unintentionally provide clues to the answer tend to
measure the students' alertness in detecting clues as well as mastery of skills or knowledge the test
is intended to measure.
8. Test too short – if a test is too short to provide a representative sample of the performance we are
interested in, its validity will suffer accordingly.
9. Improper arrangement of items – test items are typically arranged in order of difficulty, with the
easiest items first. Placing difficult items first in the test may cause students to spend too much time
on these and prevent them from reaching items they could easily answer. Improper arrangement
may also influence validity by having a detrimental effect on student motivation.
10. Identifiable pattern of answers – placing correct answers in some systematic pattern (e.g., T, T,
F, F or B, B, B, C, C, C, D, D, D) enables students to guess the answers to some items more easily,
and this lowers validity.

Principle 5: Reliability
Reliability – refers to the consistency of the scores obtained by the same person when tested using
the same instrument (or its parallel form), or when compared with other students who took the same test.

Ways to test reliability:


1. Test-Retest – measure of stability. Give a test twice to the same group, with any time interval
between tests from several minutes to several years. Statistical measure: Pearson r.
2. Equivalent/Parallel Forms – measure of equivalence. Give parallel forms of the test with a close
time interval between forms. Statistical measure: Pearson r.
3. Test-Retest with Equivalent Forms – measure of stability and equivalence. Give parallel forms of
the test with an increased time interval between forms. Statistical measure: Pearson r.
4. Split-Half – measure of internal consistency. Give a test once, then score equivalent halves of the
test (e.g., odd- and even-numbered items). Statistical measures: Pearson r and the Spearman-Brown
formula.
5. Kuder-Richardson – measure of internal consistency. Give the test once, then correlate the
proportion/percentage of students passing and not passing a given item. Statistical measures:
Kuder-Richardson Formulas 20 and 21.
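As a minimal sketch of two of these estimates (the 0/1 item-score matrix and the function names below are illustrative assumptions, not from this document):

```python
# Minimal sketch: split-half reliability with the Spearman-Brown
# correction, and KR-20 for dichotomously scored (0/1) items.
import numpy as np

def split_half_reliability(scores):
    """Correlate odd- and even-numbered item halves, then step the
    half-test Pearson r up with the Spearman-Brown formula."""
    odd = scores[:, 0::2].sum(axis=1)    # totals on odd-numbered items
    even = scores[:, 1::2].sum(axis=1)   # totals on even-numbered items
    r_half = np.corrcoef(odd, even)[0, 1]
    return 2 * r_half / (1 + r_half)     # Spearman-Brown correction

def kr20(scores):
    """Kuder-Richardson Formula 20."""
    k = scores.shape[1]                          # number of items
    p = scores.mean(axis=0)                      # proportion passing each item
    q = 1 - p                                    # proportion failing each item
    var_total = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / var_total)

# Rows = students, columns = items (1 = correct, 0 = incorrect).
scores = np.array([
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
])
print("Split-half (Spearman-Brown):", round(split_half_reliability(scores), 2))
print("KR-20:", round(kr20(scores), 2))
```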

Improving Test Reliability


Several test characteristics affect reliability. They include the following:
1. Test length. In general, a longer test is more reliable than a shorter one because longer tests
sample the instructional objectives more adequately.
2. Spread of scores. The type of students taking the test can influence reliability. A group of students
with heterogeneous ability will produce a larger spread of test scores than a group with homogeneous
ability.
3. Item difficulty. In general, tests composed of items of moderate or average difficulty (.30 to .70) will
have more influence on reliability than those composed primarily of easy or very difficult items.
4. Item discrimination. In general, test composed of more discriminating items will have greater
reliability than those composed of less discriminating items.
5. Time limits. Adding a time factor may improve reliability for lower-level cognitive test items. Since all
students do not function at the same pace, a time factor adds another criterion to the test that causes
discrimination, thus improving reliability. Teachers should not, however, arbitrarily impose a time
limit. For higher-level cognitive test items, the imposition of time may defeat the intended purpose of
the items.

3 Criteria in Determining Desirability and Undesirability of an Item


1. Difficulty of an Item
Item Difficulty Level – the proportion of examinees who answer the particular item correctly.

Difficulty Index = (Number of students who answered item X correctly) / (Total number of students who answered item X)

Difficulty Level

Index Range Difficulty Level

0.0 - 0.20 Very Difficult


0.21 - 0.40 Difficult
0.41 - 0.60 Moderately Difficult
0.61 - 0.80 Easy
0.81 - 1.00 Very Easy
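A minimal sketch (the counts and function names are illustrative assumptions) of computing the difficulty index and mapping it to the levels in the table above:

```python
# Minimal sketch: difficulty index and its level, per the table above.
def difficulty_index(num_correct, num_examinees):
    """Proportion of examinees who answered the item correctly."""
    return num_correct / num_examinees

def difficulty_level(index):
    if index <= 0.20: return "Very Difficult"
    if index <= 0.40: return "Difficult"
    if index <= 0.60: return "Moderately Difficult"
    if index <= 0.80: return "Easy"
    return "Very Easy"

# Example: 18 of 40 students answered item X correctly.
p = difficulty_index(18, 40)
print(p, difficulty_level(p))   # 0.45 Moderately Difficult
```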
2. Discriminating power
Discrimination Index – the degree to which the item discriminates between the high-performing
and low-performing groups.

Discrimination Index = (Proportion of the upper group who got the item right) − (Proportion of the lower group who got the item right)

Discrimination Index Item Evaluation

0.40 and up Very good Item


0.30 - 0.39 Reasonably Good item but possibly subject to improvement.
0.20 - 0.29 Marginal Item, usually needing and being subject to improvement.
0.19 and below Poor item, to be rejected or improved by revision.

Positive discrimination - if the proportion of students who got an item right in the upper group is
GREATER THAN the lower group.

Negative discrimination – if the proportion of students who got an item right in the lower group is
GREATER THAN the upper group.

Zero discrimination – if the proportions of students who got the item right in the upper group
and the lower group are equal
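A minimal sketch (the counts and names are illustrative assumptions) of the computation with equal-sized upper and lower groups:

```python
# Minimal sketch: discrimination index as the difference between the
# proportions of the upper and lower groups answering correctly.
def discrimination_index(upper_correct, lower_correct, group_size):
    return (upper_correct - lower_correct) / group_size

# Example: of 10 students per group, 8 in the upper group and 3 in the
# lower group got the item right.
d = discrimination_index(8, 3, 10)
print(d)   # 0.50 -> positive discrimination; a "Very good item" above
```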

Item Analysis

Retained: difficulty index 0.26 – 0.75 and discrimination index 0.20 and above
Revised: difficulty index 0.26 – 0.75 but discrimination index 0.19 and below; or
difficulty index 0.0 – 0.25 or 0.76 – 1.0 but discrimination index 0.20 and above
Rejected: difficulty index 0.0 – 0.25 or 0.76 – 1.0 and discrimination index 0.19 and below
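A minimal sketch (function name ours; the thresholds are taken from the tables above) that turns the two indices into the retain/revise/reject decision:

```python
# Minimal sketch: retain/revise/reject an item from its two indices.
def item_decision(difficulty, discrimination):
    good_difficulty = 0.26 <= difficulty <= 0.75
    good_discrimination = discrimination >= 0.20
    if good_difficulty and good_discrimination:
        return "Retain"
    if good_difficulty or good_discrimination:
        return "Revise"    # exactly one index is acceptable
    return "Reject"        # both indices fall outside acceptable ranges

print(item_decision(0.55, 0.35))   # Retain
print(item_decision(0.55, 0.10))   # Revise
print(item_decision(0.90, 0.05))   # Reject
```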

3. Measures of attractiveness – an analysis of how attractive each incorrect option (distractor) is.
An effective distractor attracts more students from the lower group than from the upper group.

Mark 11:24
“Therefore I tell you, whatever you ask in prayer, believe that you have received it, and it will be yours.”
