
Professional Education – Assessment of Learning Lyndon Laborte Lazaro

BASIC CONCEPTS

TEST
A sample of an individual's behaviour that is obtained, evaluated and scored using standardized procedures
A means of collecting students' performance

MEASUREMENT
Set of rules for assigning numbers to represent objects, traits, attributes or behaviors
A means of rating

ASSESSMENT
Systematic procedure for collecting information that can be used to make inferences about the characteristics of people or objects

EVALUATION
Determining the extent to which instructional objectives are attained
Comparison of information with the desired performance

QUALITATIVE RESEARCH
Data in narrative form
Results are contextual and unique to the individual and setting

QUANTITATIVE RESEARCH
Data in numerical form
Results are generalizable and attempt to find laws and generalizations

ASSESSMENT IN INSTRUCTION
PLACEMENT ASSESSMENT
Done at the beginning of instruction
Concerned with the entry performance of the student

FORMATIVE ASSESSMENT
Done during or after instruction
Monitors student progress

DIAGNOSTIC ASSESSMENT
Identifies strengths and weaknesses

SUMMATIVE ASSESSMENT
Done at the end of instruction
Certifies mastery; used for assigning grades

INTERPRETATION OF RESULTS
NORM-REFERENCED
Covers a broad range of learning tasks
Provides a relative ranking of students
For survey testing
Compares performance among students; discriminates among students

CRITERION-REFERENCED
Covers a specific domain of learning tasks
For mastery testing
Compares performance to a clearly specified domain

APPLICATIONS OF ASSESSMENT
Student evaluation – Summative/Formative
Instructional decisions – Placement/Diagnostic
Selection, placement and classification decisions
Policy decisions – Evaluating the curriculum
Counselling and guidance decisions – Promoting self-understanding

RELIABILITY AND VALIDITY
MEASUREMENT ERROR
Limits the extent to which test results can be generalized and reduces the confidence we have in test results
SOURCES
a. Content Sampling Error
-the amount of measurement error will be relatively small if the items on a test are a good sample of the domain
b. Time Sampling Error
-limits our ability to generalize test scores across different situations
c. Systematic Error
-consistent inflation or deflation of the obtained score
d. Random Error
-inflation or deflation of scores in an unpredictable manner

VALIDITY
Appropriateness and meaningfulness of the inferences made from assessment results
Representativeness and relevance of the assessment results to the measurement of a specific achievement domain
EVIDENCE OF VALIDITY
a. Content-related evidence
-adequacy of sampling
-content relevance
b. Criterion-related evidence
-degree of relationship between two sets of measures: test scores and the criterion to be estimated or predicted
-expressed in correlation coefficients or expectancy tables
.71 – 1.00 = strong correlation
.31 – .70 = moderate correlation
.00 – .30 = weak correlation
c. Construct-related evidence
-hypothetical qualities that we assume to exist in order to explain behaviour
-seeks to account for all possible influences on the score

RELIABILITY
Stability or consistency of assessment results
Obtaining approximately the same results for tests administered at different times, with a different sample of equivalent items, or rated by different raters
Provides the consistency needed to obtain validity
Reliability measures provide an estimate of how much variation to expect under different conditions
Reliability tends to be lower when the test is short, the range of scores is limited, testing conditions are inadequate, or scoring is subjective
METHODS OF ESTABLISHING RELIABILITY
a. Test-retest method
-administer the same test twice to a group with a time interval in between
-stability of scores over time
b. Equivalent-forms method (alternate/parallel forms)
-administer two equivalent forms of the test in close succession
-consistency of test scores over different forms of the test
c. Test-retest with equivalent forms
-administer two equivalent forms of the test with a time interval in between
d. Internal-consistency method
-administer the test once and compute the consistency of the responses within the test
-e.g. Split-half method

DEVELOPING A CLASSROOM TEST
TABLE OF SPECIFICATIONS (TOS)
Test blueprint
Describes test items in terms of content (what the student should know) and process (what he should do with that knowledge)
Ensures congruence between classroom instruction and test content
Teachers
-helps them review curriculum content
-encourages them to use items of varying complexity
Students
-basis for study and review

INITIAL STEPS
Identify and state instructional objectives, considering the following:
a. Scope – how broad or specific the objective is
too broad – lacks the characteristics needed to develop tests with good measurement properties
too narrow – disjointed items, rote memory of low-level cognitive abilities
b. Domain – cognitive, affective and psychomotor
Cognitive
Bloom's Hierarchical Taxonomy
1. Knowledge – recall and recognition
2. Comprehension – translate, interpret
3. Application – use of generalizations
4. Analysis – determine relationships
5. Synthesis – create new relationships
6. Evaluation – exercise learned judgment
Revised
1. Remembering
2. Understanding
3. Applying
4. Analyzing
5. Evaluating
6. Creating
Psychomotor
Simpson's Hierarchical Taxonomy
1. Perception – awareness of a sensory stimulus
2. Set – readiness to act; relates cues
3. Guided Response – performs as demonstrated
4. Mechanism – performs simple acts
5. Complex Overt Response – skilful performance of complex acts
6. Adaptation – modifies movements for special problems
7. Origination – new movement patterns/creativity
R.H. Dave
1. Imitation – copies the action of another
2. Manipulation – reproduces an activity from instruction or memory
3. Precision – executes a skill reliably
4. Articulation – adapts and integrates expertise
5. Naturalization – automated, unconscious mastery of the activity
Anita Harrow
1. Reflex movements – reactions that are not learned
2. Fundamental movements
3. Perception – response to stimuli
4. Physical abilities – stamina that must be developed further
5. Skilled movements – advanced learned movements
6. Non-discursive communication – effective body language
Affective
Krathwohl's Hierarchical Taxonomy
1. Receiving – willingness to pay attention
2. Responding – reacts voluntarily or complies
3. Valuing – acceptance
4. Organization – rearrangement of the value system
5. Characterization – incorporates the value into life
c. Format – behavioural vs. non-behavioural
Behavioural Objectives (overt)
-observable and measurable
Non-behavioural Objectives (covert)
-unobservable and not directly measurable

TYPES OF TEST ITEMS
Selection-type items
-present a set of possible responses from which students select the most appropriate answer
a. Multiple Choice
-stem: the problem situation or question
-alternatives: the options or choices
b. True-False Items
-used when there are only two possible alternatives
-ability to identify whether statements of fact are correct
c. Matching Items
-premises: the series of stems
-responses: the series of alternative answers
Supply-type items
-students create and supply their own answer
a. Short-answer items
-also called completion items
-used for simple recall of knowledge and for computational problems
b. Essay
-provides freedom of response
-used to measure the ability to organize, integrate and express ideas
-may be restricted or extended
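The internal-consistency (split-half) method described in the reliability section can be sketched in code: correlate scores on one half of the items with scores on the other half, then step the half-test correlation up to a full-test estimate. This is a minimal illustration, not a definitive implementation; the Spearman-Brown correction used here is the standard adjustment for the split-half approach, and the 10-student item matrix is made-up demonstration data.

```python
# Split-half reliability: correlate odd-item and even-item half scores,
# then apply the Spearman-Brown correction to estimate the reliability
# of the full-length test. Sample data below are hypothetical.

def pearson_r(x, y):
    """Pearson correlation coefficient between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_matrix):
    """item_matrix[student][item] = 1 (correct) or 0 (wrong)."""
    odd = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    # Spearman-Brown: estimated reliability of the full-length test
    return (2 * r_half) / (1 + r_half)

scores = [
    [1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 0], [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0], [0, 1, 1, 0, 0, 1], [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0], [1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
print(round(split_half_reliability(scores), 2))  # → 0.7
```

The same `pearson_r` helper also applies to the test-retest and equivalent-forms methods, where the two score lists come from two administrations rather than two halves of one test.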
SCORING RUBRIC
Structured, unbiased scoring procedure
A rating scale used with performance assessments
Scoring guide
a. Holistic rubric
-scores the overall process or product
-a single score is assigned based on the overall quality of the student's response
b. Analytical rubric
-scores separate, individual parts of the product or performance
-needs to specify the value assigned to each characteristic

ITEM ANALYSIS
After-the-fact analysis of test results
Improves tests by revising or eliminating ineffective items
Examines class-wide performance on individual items

DIFFICULTY INDEX
Proportion of the number of students in the upper and lower groups who answered an item correctly
Tells very little about an item's usefulness in measuring the test's construct

INDEX RANGE     DIFFICULTY LEVEL
.00 – .20       Very Difficult
.21 – .40       Difficult
.41 – .60       Optimum Difficulty
.61 – .80       Easy
.81 – 1.00      Very Easy

DISCRIMINATION INDEX
Comparison of how overall high scorers on the whole test did on one particular item compared to overall low scorers
Dependent on the difficulty of the item

INDEX RANGE     DISCRIMINATION LEVEL
Below .10       Questionable
.11 – .20       Not discriminating
.21 – .30       Moderately discriminating
.31 – .40       Discriminating
.41 – 1.00      Very discriminating

DISTRACTER ANALYSIS
Distracters – incorrect alternatives which serve to distract examinees who do not know the correct response
"Working" distracters – appear attractive to the lower group
Indicates the extent to which knowledge is related to the student's response
Measures validity
NOTE: An item is RETAINED if
.20 – .60 = Difficulty index
.20 – .40 = Discrimination index

PERFORMANCE ASSESSMENTS AND PORTFOLIOS
PERFORMANCE ASSESSMENT
Requires students to complete a process or produce a product in a context that closely resembles a real-life situation
Used in classes such as art, music and PE
Also referred to as authentic assessment or alternative assessment
Can measure abilities that are not accessible using other assessments
Consistent with modern learning theory
May result in better instruction
Makes learning more meaningful
Notorious for producing unreliable scores
Time-consuming and difficult to construct, administer and score

PORTFOLIOS
Involve the systematic collection of a student's work products over a specified period of time according to a specific set of guidelines
Reflect student achievement and growth over time
Students evaluate their own performances and products
Scoring in a reliable manner is difficult
Time-consuming and demanding process

INTERPRETING ASSESSMENT RESULTS
BRANCHES OF STATISTICS
Descriptive
-collecting, describing and analysing a set of data without drawing conclusions
Inferential
-analysis of a subset of data leading to predictions or inferences about the entire set of data

BASIC STATISTICAL TERMS
Population – totality of observations
Sample – subset of a population
Parameter – numerical value describing a population; denoted by a Greek letter
Statistic – describes a characteristic of a sample; denoted by an ordinary English letter

MEASURE OF CENTRAL TENDENCY
Measure of the center of a set of data when the data are arranged in increasing or decreasing order of magnitude
a. Mean
-arithmetic average of a distribution
-lower than the median in a negatively skewed distribution
b. Median
-middle value
-affected by the size of the sample
-unaffected by extreme scores or outliers
c. Mode
-most frequently occurring score
-quickest estimate

SCALES
Nominal
-categories, classes or sets
-Mode
Ordinal
-ranks things according to the amount of a characteristic they display or possess
-Median
Interval
-ranks on a scale with equal units
-has a constant unit of measurement with an arbitrary zero
-Mean, Median and Mode
Ratio
-has the properties of interval scales plus a true zero point
-Mean, Median and Mode

MEASURES OF VARIABILITY
Range
-largest score minus smallest score
Variance
-average of the squared deviations of the observations from the mean
Standard Deviation
-measure of the average distance that scores vary from the mean of the distribution
Coefficient of Variation
-measure of relative variation
-can be used to compare the variability of two or more sets of data

FREQUENCY DISTRIBUTION
Basic Terms
Frequency Distribution Table
-lists categories or scores along with their corresponding frequencies
Frequency
-number of original scores that fall into a class
Classes or Categories
-groupings of a frequency table
Class width
-difference between two consecutive lower class limits or class boundaries
Class limits
-smallest or largest numbers that can actually belong to the different classes
Class marks
-midpoints of the classes
Class boundaries
-obtained by increasing or decreasing the class limits by 0.5
Class interval
-group of scores in a grouped frequency distribution

NORMAL DISTRIBUTION
Data symmetrically distributed on either side of the midpoint
Tails – areas on either side of the peak
Mean = Median = Mode
Z-scores – scores expressed in SD units

KURTOSIS
Peakedness
a. Leptokurtic – too peaked
b. Platykurtic – too flat

SKEWNESS
If one tail of the curve is longer than the other, the distribution is skewed.
a. Negatively skewed – the longer tail is on the left
b. Positively skewed – the longer tail is on the right

Z-SCORES
Show relative rank
Measure how many SDs an observation lies above or below the mean
Z = (X – mean) / SD

T-SCORES
Derived scores with a mean of 50 and an SD of 10
T = 10(Z) + 50

FRACTILES OR QUANTILES
Measures of location that describe or locate the position of pieces of data relative to the entire set of data
a. Percentile Ranks
-percentage or proportion of scores that fall below a given score

CORRELATION COEFFICIENTS
Positive correlation coefficient
-an increase in one variable is associated with an increase in the other variable
Negative correlation coefficient
-an increase in one variable is associated with a decrease in the other variable

HYPOTHESIS TEST
Type I error – the null hypothesis is rejected when it should have been accepted
Type II error – the null hypothesis is accepted when it should have been rejected

ASSESSMENT OF AFFECTIVE LEARNING
BASIC CONCEPTS
Behaviour
-everything a person does that can be observed
Personality
-characterizes one's adaptation to the world
Trait
-tends to lead to certain behaviors

ASSESSMENT METHODS
Projective tests
-use an ambiguous stimulus and ask the examinee to describe it or tell a story about it
a. Projective drawings
-draw-a-person
-house-tree-person
b. Rorschach Inkblot
c. Thematic Apperception Test
-a story about each of the pictures
d. Sentence completion
-word association
e. Graphology
-study of handwriting
Self-report tests
-have a large number of statements and a limited number of answers
-used in the identification of traits
-empirically keyed tests
Behaviour rating

GUIDANCE AND COUNSELLING
GUIDANCE SERVICES
Individual Inventory service
Information service
Counselling service
Placement service
Follow-up service
Appraisal/Testing service
Referral service
Research and Evaluation service

RA 9258
Guidance and Counselling Act of 2004
Guidance and Counselling
-a profession that involves the use of an integrated approach to the development of a well-functioning individual
Functions:
a. Counselling
b. Psychological testing
c. Research, placement, group process

PRINCIPLES OF GUIDANCE
Is based on a true concept of the client
Designed to provide assistance to the person in crisis, who resolves his crisis through self-discovery and self-direction
A learning process
Is preventive rather than curative
Responsibility of parents in the home and of teachers in the school
Tests have their place in guidance

ASSIGNING MARKS/RATINGS
Informing students of the grading system
Feedback and evaluation
a. Formative
b. Summative
c. Grades – formal recognition of a specific level of mastery
-brief summary statements that fail to convey rich details about a student's performance
Frame of reference
a. Norm-referenced
b. Criterion-referenced
Reporting student progress
a. Letter grades
b. Numerical grades
c. Verbal descriptors
d. Pass-fail
e. Supplemental systems

SOME COMMON ERRORS IN RATING
Halo effect
-raters are influenced by a single positive or negative trait
Leniency effect
-tendency to give all students good ratings
Severity effect
-tendency to give all students poor ratings
Central Tendency effect
-tendency to give all students scores in the middle range
Personal biases
-tendency to let stereotypes influence ratings
Logical errors
-rater assumes that two characteristics are related and gives similar ratings based on that assumption
Order effects
-changes in scoring that emerge during the grading process

NEW DEPED GRADING SYSTEM
A    Advanced                    90 and above
P    Proficient                  85 – 89
AP   Approaching Proficiency     80 – 84
D    Developing                  75 – 79
B    Beginning                   74 and below

Weights:
Knowledge – 15%
Process or skills – 25%
Understanding – 30%
Products or performance – 30%
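The DepEd component weights and level descriptors listed above can be combined as a short sketch. The component scores for the sample student are hypothetical; only the weights (15/25/30/30) and the grade cut-offs come from the notes.

```python
# Sketch: weight the four grading components (Knowledge 15%,
# Process/skills 25%, Understanding 30%, Products/performance 30%)
# and map the weighted total to a DepEd level descriptor.

WEIGHTS = {
    "knowledge": 0.15,
    "process_or_skills": 0.25,
    "understanding": 0.30,
    "products_or_performance": 0.30,
}

LEVELS = [  # (minimum grade, descriptor)
    (90, "A - Advanced"),
    (85, "P - Proficient"),
    (80, "AP - Approaching Proficiency"),
    (75, "D - Developing"),
    (0,  "B - Beginning"),
]

def final_grade(scores):
    """scores: component name -> percentage score (0-100)."""
    return sum(scores[part] * w for part, w in WEIGHTS.items())

def descriptor(grade):
    for cutoff, label in LEVELS:
        if grade >= cutoff:
            return label

student = {  # hypothetical component scores
    "knowledge": 80,
    "process_or_skills": 84,
    "understanding": 90,
    "products_or_performance": 90,
}
g = final_grade(student)
print(round(g, 1), descriptor(g))  # → 87.0 P - Proficient
```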

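Returning to the Z-SCORES and T-SCORES sections earlier in these notes, the two formulas Z = (X – mean) / SD and T = 10(Z) + 50 can be worked through numerically. This is a minimal sketch using the population SD; the class scores are made up for illustration.

```python
# Worked example: express a raw score in SD units (Z-score), then
# rescale it to a T-score with mean 50 and SD 10.

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    """Population standard deviation."""
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

def z_score(x, xs):
    return (x - mean(xs)) / sd(xs)

def t_score(x, xs):
    return 10 * z_score(x, xs) + 50

scores = [70, 75, 80, 85, 90]  # mean = 80, SD ≈ 7.07
print(round(z_score(90, scores), 2))  # → 1.41 (1.41 SDs above the mean)
print(round(t_score(90, scores), 1))  # → 64.1
```

A score equal to the mean always gives Z = 0 and T = 50, which is a quick sanity check on any hand computation.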