Module 3 Educ 105 Modified

This document provides an overview of Module III which focuses on designing and developing assessment tools. It discusses the key characteristics of quality assessment tools, including objectivity, validity, reliability, and usability. Specific types of validity like face validity, content validity, and construct validity are defined. Methods for estimating reliability such as test-retest, parallel forms, and split-half are also outlined.

Uploaded by

Maria Zobel Cruz

MODULE III

Designing and Developing Assessment Tools

Lesson 1  Characteristics of Quality Assessment Tools
Lesson 2  The Table of Specifications
Lesson 3  Assessment Tools Development


Lesson 1

Characteristics of Quality Assessment Tools

The evaluation tool serves a variety of uses. Regardless of the type of tool used or how the result of evaluation is to be used, all types of evaluation should possess the following characteristics:

Characteristics of a Good Measuring Instrument

1. Objectivity represents the agreement of two or more competent judges, scorers, or test administrators concerning a measurement; in essence, it is the reliability of test scores between or among more than one evaluator. If two raters who assess the same individual on the same test cannot agree on a score, the test lacks objectivity and the score of neither judge is valid. Thus, lack of objectivity reduces test validity in the same way that lack of reliability does (Popham, 2000).
2. Validity of a test is the degree of accuracy with which it measures what it aims to measure. For instance, if a test aims to measure proficiency in solving linear systems in algebra, and it does not measure proficiency in solving linear systems in algebra, then it is not valid. The degree of validity of a test is often expressed numerically as a coefficient of correlation with another test of the same kind and of known validity.

Types of Validity

a. Face Validity refers to how test takers perceive the attractiveness and appropriateness of a test. If test takers consider the test to have face validity, they may offer a more conscientious effort to complete the test. If a test does not have face validity, they might hurry through it and take it less seriously.

b. Content Validity refers to the relevance of the test items to the subject matter or situation from which they are taken. This is also called "logical validity".

c. Concurrent Validity refers to the correspondence of the scores of a group on a test with the scores of the same group on a similar test of already known validity used as a criterion.


d. Predictive validity refers to the degree of accuracy with which a test predicts the level of performance in a certain activity which it intends to foretell.

e. Construct validity is used to ensure that the measure is actually a measure of what it is intended to measure (the construct), and not of other variables. A test has construct validity if it accurately measures a theoretical, non-observable construct or trait. The construct validity of a test is worked out over a period of time on the basis of an accumulation of evidence. Two methods of establishing a test's construct validity are:

e.1. Convergent/divergent validation. A test has convergent validity if it has a high correlation with another test that measures the same construct. By contrast, a test's divergent validity is demonstrated through a low correlation with a test that measures a different construct.

e.2. Factor analysis. Factor analysis is a complex statistical procedure which is conducted for a variety of purposes, one of which is to assess the construct validity of a test or a number of tests.

3. Reliability of a test is the degree of consistency of the measurement that it gives. Suppose a test is given to an individual and, after the lapse of a certain length of time, is given again: if the scores are identical or almost identical, the test is reliable. Like validity, the degree of reliability of a test is numerically expressed as a coefficient of correlation.

Factors of reliability

a. Adequacy refers to the appropriate length of the test and the proper sampling of the test content. A test is adequate if it is long enough to contain a sufficient number of representative items of the behavior to be measured so that it is able to give a true measurement.

b. Objectivity. A test is objective if it yields the same score no matter who checks it or even if it is checked at different times. To make a test objective, make the responses to the items single symbols, words, or phrases.

c. Testing Condition refers to the condition of the examination room.

d. Test administration procedure. The manner of administering a test also affects its reliability. Explicit directions usually accompany a test, and they should be followed strictly because these procedures are standardized. Directions should be clearly understood before starting the test.

Reliability is a factor of validity; that is, a test cannot be valid without being reliable. However, validity is not a factor of reliability, because a test can be reliable without being valid.

Methods of Estimating the Reliability of a Good Measuring Instrument

1. Test-retest method. The same measuring instrument is administered twice to the same group of subjects. The scores of the first and second administrations of the test are correlated; the Spearman rank correlation (Spearman rho) may be used to correlate the scores in this method.
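As a sketch of how this coefficient might be computed, the following applies the Spearman rank-difference formula to hypothetical first- and second-administration scores; the data and names are illustrative only, not part of the module.

```python
def ranks(scores):
    """1-based ranks; tied scores receive the average of their ranks."""
    ordered = sorted(scores)
    result = []
    for s in scores:
        first = ordered.index(s) + 1          # rank of first occurrence
        last = first + ordered.count(s) - 1   # rank of last occurrence
        result.append((first + last) / 2)
    return result

def spearman_rho(x, y):
    """Spearman rank correlation: rho = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

first_admin  = [88, 75, 93, 60, 82]   # hypothetical first administration
second_admin = [85, 72, 90, 65, 95]   # same pupils, second administration
rho = spearman_rho(first_admin, second_admin)   # 0.70 for these scores
```

A rho near 1.0 indicates that pupils kept nearly the same rank order across the two administrations, i.e., a reliable test under this method.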

Disadvantages of the test-retest method

a. When the time interval is short, memory effects may operate. The subject may recall his responses, which tends to make the correlation of the test high.

b. When the time interval is long, such factors as unlearning and forgetting, among others, may occur and may result in a low correlation of the test.

c. Regardless of the time interval separating the two administrations, varying environmental conditions such as noise, temperature, lighting, and other factors may affect the correlation of the test.

2. Parallel-forms method. Parallel or equivalent forms of a test may be administered to the same group of subjects, and the paired observations correlated. In estimating reliability by administering parallel or equivalent forms of a test, criteria of parallelism are required. The two forms of the test must be so constructed that the content, type of items, difficulty, instructions for administration, and many other features are similar but not identical. The Pearson product-moment correlation may be used.

Module III
5

The correlation between the scores obtained on paired observations of these two forms represents the reliability coefficient of the test. If the r value obtained is high, the measuring instrument is reliable.
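A minimal sketch of this computation, using the Pearson product-moment formula on hypothetical paired scores from two forms (the scores are invented for illustration):

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between paired scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

form_a = [45, 38, 50, 29, 42, 35]   # hypothetical scores on Form A
form_b = [43, 40, 48, 31, 44, 33]   # same group's scores on Form B
r = pearson_r(form_a, form_b)       # close to 1.0: the two forms agree well
```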

3. Split-half method. The test in this method may be administered once, but the test items are divided into halves. The common procedure is to divide the test into odd and even items. The two halves of the test must be similar but not identical in content, number of items, difficulty, means, and standard deviations. Each student obtains two scores, one on the odd and the other on the even items, in one test. The scores obtained on the two halves are correlated. The result is a reliability coefficient for a half test; since the reliability holds only for the half test, the reliability coefficient for the whole test may be estimated by using the Spearman-Brown formula.
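The procedure above can be sketched as follows. The half-test scores are hypothetical; the Spearman-Brown step-up used is the standard two-half form, r_whole = 2r / (1 + r).

```python
def pearson_r(x, y):
    """Pearson correlation between the two half-test scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman_brown(r_half):
    """Step the half-test correlation up to a whole-test estimate."""
    return 2 * r_half / (1 + r_half)

odd_items  = [10, 8, 12, 6, 9]    # hypothetical scores on odd-numbered items
even_items = [11, 7, 12, 5, 10]   # same students' scores on even-numbered items
r_half = pearson_r(odd_items, even_items)
r_full = spearman_brown(r_half)   # always higher than r_half when r_half > 0
```

For example, a half-test correlation of 0.60 steps up to 2(0.60)/1.60 = 0.75 for the whole test.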

4. Internal-consistency method. This method is used with psychological tests which are constructed of dichotomously scored items. The examinee either passes or fails an item. A 1 is assigned for a pass and a 0 for a failure. The reliability coefficient in this method is determined by the Kuder-Richardson formulas, KR-20 or KR-21.

4. Usability refers to the characteristics of administrability, scorability, economy, comparability, and utility of a test. A test is usable if it is easy to administer, easy to score, and economical, and if its results can be given meaning and are useful.

Factors of Usability

a. Administrability. There are tests that are easy to administer and there are tests that are hard to administer.

b. Scorability. There are tests that are easy to score, and they are in demand. But there are some that are difficult to score, with very complicated computations to arrive at the final score. This situation lessens the demand for these tests.

c. Economy. There are tests the answers to which are written on the tests themselves, so they cannot be used again. This makes these kinds of tests costly and limits their usability. There are also tests that utilize separate answer sheets so that they can be used again and again. Because these tests are cheaper, they are more in demand, enhancing their usability.


d. Comparability refers to the availability of norms with which the scores of students are compared to determine the meanings of their scores.

e. Utility. A test is utile if it adequately serves the very purpose for which it is intended. If a test is intended to measure achievement in mathematics and it does, then the test has high utility. The test is usable.

Practicality and Efficiency of Assessment of Student Learning

Teachers need to be familiar with the tools of assessment. In the development and use of classroom assessment tools, certain issues must be addressed in relation to the following important criteria.

1. Purpose and Impact. How will the assessment be used and how will it
impact instruction and the selection of curriculum?
2. Validity and Fairness. Does it measure what it intends to measure? Does it allow students to demonstrate both what they know and what they are able to do?
3. Reliability. Is the data that is collected reliable across applications
within the classroom, school, and district?
4. Significance. Does it address content and skills that are valued by
and reflect current thinking in the field?
5. Efficiency. Is the method of assessment consistent with the time
available in the classroom setting?

Ethics in Assessment
Ethics refers to questions of right and wrong. There are some aspects of the teaching-learning situation that should not be assessed, like:

 Requiring students to answer a checklist of their sexual fantasies;

 Asking elementary pupils to answer sensitive questions without the consent of their parents;

 Testing the mental abilities of pupils using an instrument

In any situation, teachers should consider the following ethical issues that may be raised:

a. Possible harm to the participants. Teachers do not want any physical or psychological harm to come to any student as a result of assessment.


b. Confidentiality of the assessment data. Test results and assessment results are confidential. They should be known only by the student concerned and the teacher.

c. Presence of concealment or deception. There are instances in which it is necessary to conceal or hide the objective of the assessment from the students in order to ensure fair and impartial results. When this is the case, the teacher has a special responsibility to:

 Determine whether the use of such techniques is justified by the educational value of the assessment;

 Determine whether alternative procedures are available that do not make use of concealment; and

 Ensure that students are provided with a sufficient explanation as soon as possible.

Finally, the temptation to assist certain individuals in class during testing is ever present. In this case, it is best if the teacher does not administer the test himself if he believes that such a concern may, at a later time, be considered unethical.

 LEARNING ACTIVITY

Answer the following:

1. How do you interpret this statement: "Is a valid test always valid?" Explain your answer and give examples.
2. Is a reliable test always valid? Why? Give an example.
3. Discuss briefly the "ethics in assessment".
4. As a future teacher, how will you ensure "fairness" in assessing the achievements of your students?
5. Discuss briefly the types of validity. Give examples of each type.


Lesson 2

The Table of Specifications

Classroom tests provide teachers with essential information used to make decisions about instruction and student grades. A table of specifications (TOS) can be used to help teachers frame the decision-making process of test construction and improve the validity of teachers' evaluations based on tests constructed for classroom use.

A TOS, sometimes called a test blueprint, is a plan to help teachers decide on the subject matter to test. Instructional objectives specify the actual learning behavior, and test items are then designed to elicit those behaviors (Chase, 1999). It is a two-way chart which describes the topics to be covered by a test and the number of items or points which will be associated with each topic.

To ensure that classroom tests measure a representative sample of instructionally relevant tasks, it is important to develop specifications that can guide the selection of the best items and assessment tasks.

There are three important aspects to be included in preparing a table of specifications:
1. Selecting the learning outcomes to be tested;
2. Outlining the subject matter; and
3. Making a two-way chart.

How to Select Learning Outcomes to be Tested

The specific nature of the course, the objectives attained in previous courses, the philosophy of the school, the special needs of the students, and the host of other local factors that have a bearing on the instructional program have to be considered.
The instructional objectives will include learning outcomes in the following areas:
1. Knowledge
2. Intellectual abilities and skills
3. General skills – laboratory, performance, communication
4. Attitudes, interests, and appreciations

How to Outline the Subject Matter

The content of the course may be outlined in detail for teaching purposes, but for test planning, only the major categories need to be listed.

Example:
A. Linguistic Knowledge
   1. Knowledge of the sound system
   2. Knowledge of words
   3. Creativity of linguistic knowledge
   4. Knowledge of sentences and non-sentences
B. Linguistic Knowledge and Performance
   1. What is grammar?
      a. Descriptive grammar
      b. Prescriptive grammar
      c. Teaching grammar
   2. Language universals
      a. Sign language
C. The Human Brain
   1. The modularity of the brain
   2. More evidence for modularity
      a. Aphasia

Steps in Constructing the Table of Specifications

1. Define the content categories. This is the scope or coverage of what was taught (for a classroom test) or what was intended to be taught (for an achievement test).
2. Define the skills. These can be defined in general categories or in specific behaviors.
3. Determine relative weights. For the content and skills categories, assign weights considering priorities and points of emphasis.
4. Complete the entries. Allocate the number of items in each category. It is necessary to fill all the cells.

Using the table of specifications, test items can be prepared using the content and skills in the blueprint. The test items are developed in these stages:
1. Choose a cell of the table;
2. Write an item that seems to fit the content and skill category; and
3. After revising the item, check its appropriateness for the intended cell.
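Steps 3 and 4 above can be sketched in code: given percentage weights for each content category (the topics and weights below are hypothetical, not from the module), allocate whole-number item counts that sum exactly to the test length.

```python
def allocate_items(weights, total_items):
    """Turn percentage weights per content area into whole-number item counts."""
    raw = {topic: pct * total_items / 100 for topic, pct in weights.items()}
    counts = {topic: int(r) for topic, r in raw.items()}
    leftover = total_items - sum(counts.values())
    # hand the remaining items to the topics with the largest fractional parts
    by_remainder = sorted(raw, key=lambda t: raw[t] - counts[t], reverse=True)
    for topic in by_remainder[:leftover]:
        counts[topic] += 1
    return counts

# hypothetical weights for a 50-item test
weights = {"Fractions": 30, "Decimals": 25, "Ratio": 25, "Percent": 20}
plan = allocate_items(weights, 50)
# the counts total exactly 50; e.g. Fractions (30%) receives 15 items
```

Largest-remainder rounding is one simple way to keep the total exact; a teacher may of course adjust individual cells by judgment afterward.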

How the Use of a TOS Benefits Students

A table of specifications benefits students in two ways:
1. It improves the validity of teacher-made tests; and
2. It can improve student learning.

A table of specifications helps to ensure that there is a match between what is taught and what is tested. Classroom assessment should be driven by classroom teaching, which itself is driven by course goals and objectives. Tables of specifications provide the link between teaching and testing.

Objectives → Teaching → Testing


Tables of specifications can help students at all ability levels learn better. By providing the table to students DURING instruction, students can recognize the main ideas, key skills, and relationships among concepts more easily. The table of specifications can act in the same way as a CONCEPT MAP to analyze content areas.

Teachers can even collaborate with students on the construction of the table of specifications – what are the main ideas and topics, what emphasis should be placed on each topic, what should be on the test.

Open discussion and negotiation of these issues can encourage higher levels of understanding while also modeling good learning and study skills (Chase, 1999).

Example: A Unit Exam: Amoeba

Bloom's Taxonomy
Cognitive Level      Knowledge   Application   Analysis or Synthesis    Total
Classification           3            3                                 6 (24%)
Structure                5                              3               8 (32%)
Reproduction             2                              2               4 (16%)
Medical                  2            2                 3               7 (28%)
Total                12 (48%)      5 (20%)           8 (32%)              25

The purpose of a table of specifications is to identify the achievement domains being measured and to ensure that a fair and representative sample of questions appears in the test. Teachers cannot measure every topic or objective and cannot ask every question they wish to ask.

A table of specifications allows the teacher to construct a test which focuses on the key areas and weights those different areas based on their importance. A table of specifications provides the teacher with evidence that a test has content validity, that is, that it covers what it should cover.

Another example is a two-way chart that indicates both the total number of test items and assessment tasks and the percentage allotted to each objective and each area of content. For classroom testing, using the number of items may be sufficient, but the percentages are useful in determining the amount of emphasis to give to each area.

In the example, a list of five performance assessment tasks corresponds to the instructional objectives, together with a plan for the relative weight to be given to the scores on those tasks and on the classroom test.

Examples:
In Table 1, the total number of items used is 60. This is done by starting within each cell, assigning the number of items intended for each content-behavior combination. These points can be transformed into percentages by dividing the subtotals by the total points.
Table 1. Table of Specifications in Science

                                         Objectives
                 Knows     Knows      Knows      Understands     Interprets   Total #   Percent
                 basic     weather    specific   influence of    weather      of        of
Content          terms     symbols    facts      each factor on  maps         items     items
                                                 weather
                                                 formation
Air pressure       1          1          1             3              3           9        15
Wind               1          1          1            10              2          15        25
Temperature        1          1          1             4              2           9        15
Humidity and
precipitation      1          1          1             7              5          15        25
Clouds             2          2          2             6                         12        20
Total number
of items           6          6          6            30             12          60
Percent of
items             10         10         10            50             20                   100
Source: Linn and Gronlund, 2000
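The transformation of subtotals into percentages described above can be sketched with the Table 1 figures:

```python
# item subtotals per content area, from Table 1
subtotals = {"Air pressure": 9, "Wind": 15, "Temperature": 9,
             "Humidity and precipitation": 15, "Clouds": 12}
total = sum(subtotals.values())                        # 60 items in all
percents = {topic: 100 * n / total for topic, n in subtotals.items()}
# e.g. Wind: 100 * 15 / 60 = 25 percent of the test
```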

Looking closely at the next table, you will notice the following steps in preparing a two-way chart.

1. List the general objectives across the top of the table.
2. List the major content areas down the left side of the table.
3. Indicate the proportion of test items devoted to each objective and each content area.

Table 2. Table of Specifications for a 50-item test in addition of fractions

                               Adds        Adds Fractions   Adds       Total
Content Area                   Fractions   and Mixed        Mixed      Items
                                           Numbers          Numbers
Denominators are alike             5              5             5        15
Denominators are unlike            5              5             5        15
(with common factor)
Denominators are unlike            6              7             7        20
(without common factor)
Total Items                       16             17            17        50
Source: Linn and Gronlund, 2000


The next example is expressed in terms of the percentage of points associated with each combination of objectives and content areas (Chase, 1999).

Table 3. Table of Specifications for fractions and decimals showing targeted percentage of points from a combination of a classroom test and a set of performance assessment tasks

                               Instructional Objectives
Content Area          Procedural   Understanding   Application   Total
                      Skills                                     percentage
                                                                 of points
Simple fractions          5             10              5           20
Mixed numbers             5             15             10           30
Decimals                  5             10              5           20
Decimal-fraction         10             10             10           30
relationships
Total Percentage         25             45             30          100
of Points

Sample two-way table of specifications for a summative test:

Outcomes                          Knows                  Comprehends   Applies      Total
Content                  Terms   Facts   Procedures   Principles    Principles   number of
                                                                                   items
Role of tests in
instruction                4       4                       2                         10
Principles of testing      4       3         2             6             5           20
Norm-referenced vs
criterion-referenced       4       3         3                                       10
Planning the test          3       5         5             2             5           20
Total number of items     15      15        10            10            10           60
Source: Gronlund, 1995

Specifications for a Formative Test

Specify the knowledge outcomes that should result from study of the cognitive domain of the taxonomy and the number of test items for each intended outcome.

Table 4. List of reading comprehension skills and number of items for each specific skill

Reading Skill                                        Number of Items
Identifies details stated in a passage                     10
Identifies the main idea of a passage                      10
Identifies the sequence of actions or events               10
Identifies relationships expressed in a passage            10
Identifies inferences drawn from a passage                 10
Total Number                                               50

Checklist for the Specification Table

1. Are the specifications in harmony with the purpose?
2. Do the specifications reflect the nature and limits of the domain?
3. Do the specifications indicate the types of learning outcomes to be measured?
4. Do the specifications indicate the sample of learning outcomes to be measured?
5. Are the number and types of items/tasks appropriate?
6. Is the distribution of items and tasks adequate?
7. Is the number of items adequate to represent the domain?

 LEARNING ACTIVITY
Referring to the K to 12 curriculum guide, choose a subject aligned to your field of specialization/major, then prepare/construct a table of specifications for a 50-item test.


Lesson 3

Assessment Tool Development

TEACHER-MADE EXAMINATIONS are those constructed by teachers to be given to their students for the purpose of marking and promotion. Teacher-made examinations are the principal tools in measuring school achievement. They are grouped into the following major classes:

1. Oral Examinations. These are tests in which the answers are given in spoken words. The questions may be given in spoken words or in writing. Examples are oral recitations and the oral defense of a thesis or dissertation in graduate studies.

2. Written Examinations. These are tests in which the answers are given in writing. The questions may be given orally or in writing. Examples are essay and objective examinations.

3. Performance Examinations. These are examinations in which the responses are given by means of overt actions. Examples are calisthenics in physical education, marching and assembling a gun in military training, planing in woodworking, making a dress, etc. The questions may be given orally or in writing.

Types of Questions Asked in Examinations According to What They Measure

1. Questions or items that measure knowledge of facts or factual information.
Example: Magellan was killed in (a) Cebu (b) Mactan (c) Panay.

2. Questions or items that measure understanding, particularly of terminologies, ideas and concepts, and methods and procedures.
Example: Explain the meaning of democracy.

3. Questions or items that measure reasoning power, critical thinking, analytical ability, critical evaluation, etc.
Example: Why should the Philippines industrialize?

4. Questions or items that measure performance skills. These are skill questions. Among the skills are laboratory, computational, oral communication, social, singing, and other observable skills.
Example: Sing a kundiman.

5. Questions or items that measure values, attitudes, and appreciation, especially in literature, music and art, scientific achievement, and social interaction.
Example: How can you show your love for your country?

6. Questions or items that measure judgment.
Example: Drug addiction is a major problem.

7. Questions or items that measure interests: personal, social, educational, vocational, leisure, etc.
Example: Which do you prefer to see, a basketball game or a movie?

8. Questions or items that measure social and emotional adjustment.
Example: Do you have a problem that disturbs your emotional stability?

Types of Oral Examinations

A. According to the Number of Examinees Answering

1. Individual oral examination
Examples: reciting a poem, oral defense of a thesis

2. Group oral test
Examples: choral speaking, oral renditions

B. According to the Objectives of Measurement

1. Questions for marking. The students are graded according to the quality of their answers.

2. Questions for selection. The purpose is to fill a vacant position, whether for honor, scholarship, or employment.

Advantages of Oral Examinations

1. It can be used in any grade or year level, especially in the lower grades.

2. It can be used in any subject.

3. It can be used for diagnostic purposes to discover the strengths, weaknesses, and study habits of the students.

4. The difficulty of the question can be adjusted to suit the ability of the student answering.

Disadvantages of Oral Examinations

1. Marking is too subjective and there is no way of standardizing the questioning and scoring. Hence, an oral examination seriously lacks validity and reliability.

2. It is unfair, since some students are asked to answer difficult questions while others are asked to answer easy ones.

3. Since only one or two questions may be asked of a student, the amount of knowledge measured is very limited.

4. It takes too long a time before all the students in a class are measured.

5. The results cannot be used for comparison purposes, that is, for comparing the scores of the students, because the students are asked to answer different questions of varying difficulty.

Improving the Construction and Conduct of Oral Examinations

1. The question should present a single, central point.

2. It should be stated in positive form.

3. It should enable the informed pupil to show evidence of attainment.

4. It should deal with an important point.

5. It should present an element of novelty if it seeks to measure a high-level outcome.

6. Proper advance preparation should be made before the examination.

7. Consistent procedures should be followed in the questioning session.

8. Scoring and rating methods should be systematically applied.

9. The teacher should be as fair as possible.


Two Types of Informal Teacher-Made Tests (Written Examinations)

A. Essay Type
B. Objective Type

Essay Examination

An essay examination consists of questions to which the students respond in one or more sentences to a specific question or problem. It measures skill in writing, testing the student's ability to express his ideas accurately and to think critically within a certain period of time. In other words, an essay examination may be evaluated in terms of content and form.

Suggestions in Constructing an Essay Examination

1. It must be planned and constructed carefully in advance.

2. It must cover the major aspects of the lessons, so care must be taken to distribute questions evenly among the different units.

3. After the test has been planned and the questions have been written tentatively, precautions against the causes of unreliability should be taken.

4. In assembling the test questions into final form, the teacher should be careful to phrase the questions vividly so that their scope will be clear to the students.

5. A time limit for each question should be reckoned so that students have adequate time to answer.

Types of Essay Examination Questions

1. Selective recall
2. Evaluative recall
3. Comparison of two things (specific)
4. Comparison of two things (general)
5. Decision (for or against)
6. Causes or effects
7. Explanation of the use or exact meaning of some phrases or statements in a passage
8. Summary of some unit of the text or some articles read
9. Analysis
10. Statement of relationships
11. Illustrations and examples of principles in sentence construction in language
12. Classification
13. Application of rules or principles in new situations
14. Discussion
15. Statement of aim
16. Criticism
17. Outline
18. Reorganization of facts
19. Formulation of new questions and problems raised
20. New methods or procedures

Advantages of an Essay Examination

1. Easy to construct. With regard to the preparation of the test, an essay examination is easier for a classroom teacher to construct, and it saves time and energy as far as construction is concerned because it involves few items.

2. Economical. An essay examination is economical when it comes to duplication facilities like the typewriter, computer, mimeographing machine, etc., because the questions can be written on the board. It is also advantageous for schools that lack duplicating facilities.

3. Trains the powers of organizing, expressing, and reasoning. An essay examination trains the students to organize, express, and reason out their ideas.

4. Minimizes guessing. Responses to an essay examination consist of one or more sentences; hence, guessing is minimized.

5. Develops critical thinking. An essay examination develops the students' ability to think critically. Essay questions call for comparison, analysis, reorganization of facts, criticism, defense of opinion, decision, and other mental activity.

6. Minimizes cheating and memorizing. Cheating and memorizing in an essay examination are minimized because essay tests are evaluated in terms of content and form, and an answer to a question is composed of one or more sentences.

7. Develops good study habits. An essay examination develops good study habits on the part of the students, in the sense that they study their lessons with comprehension rather than rote memory.


Disadvantages of an Essay Examination:

An essay examination has some difficulties that limit its


affectivity as a measuring instrument. The limitations or disadvantages of
an essay examination are as follows:

1. Low validity. Essay examination has low validity for it has limited
sampling.

2. Low reliability. Low reliability may occur in an essay examination


due to its subjectivity of scoring. The tendency of some teachers is
to react unfavorably to answers of students whom he considered
weak and give favorable impressions to answers of bright students.

3. Low usability. An examination is time consuming to both teacher


and student wherein much time and energy are wasted.

4. Encourages bluffing. Another limitation of an essay examination is


it encourages bluffing on the part of the examinee. The tendency of
the student who does not know the answer is to bluff his answers
just to cover up lack of information. If the student is intelligent
enough to present a worth discussion as an answer related to the
scope covered by the question, this often misleads the teacher and
sometimes such answer may give a sense of completeness. If
bluffing becomes satisfactory on an essay examination, inaccuracy
of the measuring instrument occurs and evaluation of the student’s
achievement may not be valid and reliable.

5. Difficult to correct or score. Another weakness of an essay
examination is the difficulty the teacher faces in correcting or
scoring it, since an answer to a question consists of one or more
sentences.

6. Disadvantageous to students with poor penmanship. Some teachers
react unfavorably to the responses of students with poor handwriting
and untidy papers.

Scoring an Essay Examination


To avoid subjectivity in scoring an essay test, the following
procedures are hereby presented:

1. Brush up on the expected answers before scoring.

2. Quickly read through the papers, judge their worthiness, and sort
them into five groups: (a) very superior papers, (b) superior
papers, (c) average, (d) inferior, and (e) very inferior.


3. Read all responses to the same question at one time.

4. Re-read the papers in each group and shift any that you feel have
been misplaced.

5. Avoid looking at the names on the papers you are scoring.

Objective Examination
Two main types of objective tests:
1. Recall type is categorized into:

1.1 Simple-recall

1.2 Completion

2. Recognition type is categorized into:

2.1 Alternative response

2.2 Multiple Choice

2.3 Matching

2.4 Rearrangement

2.5 Analogy

2.6 Identification

Recall Type

Simple-Recall Type. This test is one of the easiest of the objective
types to construct. Each item appears as a direct question, a stimulus
word or phrase, or a specific direction. The response requires the
subject to recall previously learned material, and the answers are
usually short, consisting of either a word or a phrase.

This test is applicable in natural science subjects such as
mathematics and the chemical and physical sciences, where the stimulus
appears in the form of a problem that requires computation.

Rules and Suggestions for the Construction of the Simple-Recall
Type are as follows:

1. The test item should be worded so that the response is as brief as
possible, preferably a single word, number, symbol, or very brief
phrase. This objectifies and facilitates scoring.

2. The direct-question form is usually preferable to the statement


form. It is easier to phrase and more natural to the student.

3. The blanks for the responses should be in a column, preferably at
the right of the items. This arrangement facilitates scoring and is
more convenient for the students.

4. The question should be so worded that there is only one correct
response. Where this is not possible, all acceptable answers should
be included in the scoring key.

5. Make minimal use of textbook language in wording the questions.
Unfamiliar phrasing reduces the possibility of correct responses
that represent mere meaningless verbal associations.

Completion Test. This test consists of a series of items that require the
subject to fill in a word or phrase in the blanks. An item may contain one
or more blanks.

Some Rules and Suggestions for the Construction of a
Completion Test are:

1. Give the student a reasonable basis for the responses desired.

a. Avoid indefinite statements.

b. Avoid over-mutilated statements.

2. Avoid giving the student unwarranted clues to the desired response.
There are several ways in which clues are often carelessly given.
The following suggestions may help to prevent the common errors
in constructing a completion test:

a. Avoid lifting statements directly from the book.

b. Omit only key words or phrases rather than trivial details.

c. Whenever possible, avoid placing “a” or “an” immediately before a
blank. These words may give a clue as to whether the response
starts with a consonant or a vowel.

d. Do not indicate the expected answer by varying the length of the
blanks or by using a dot for each letter in the correct word.

e. Guard against the possibility that one item or part of the test
may suggest the correct response to another item.

f. Avoid giving grammatical clues to the answer expected.

3. Arrange the test so as to facilitate scoring.

a. Allow one point for each blank correctly filled. Avoid fractional
credits or unequal weighting of items in a test.

b. Select items to which only one correct response is possible.

c. Arrange the items, as far as possible, so that the student’s
responses are in a column at the right of the sentences.
Sentences containing a single blank will be easier to score if the
blank is at the end.

d. Scoring is more rapid if the blanks are numbered and the


student is directed to write his response in the appropriate
numbered blanks.

e. Prepare a key for scoring by writing all acceptable answers on a
copy of the test.
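Rules 3(a), 3(b), and 3(e) above can be sketched in code: a scoring key that lists every acceptable answer for each numbered blank, with one point allowed per correctly filled blank. This is only an illustrative sketch; the item content and all names are hypothetical.

```python
# Illustrative sketch of rules 3(a), 3(b), and 3(e): one point per blank,
# with every acceptable answer recorded in the key. Content is hypothetical.

# Key: blank number -> set of acceptable answers (compared case-insensitively)
answer_key = {
    1: {"photosynthesis"},
    2: {"carbon dioxide", "co2"},   # rule 3(e): list all acceptable answers
    3: {"chlorophyll"},
}

def score_completion(responses):
    """Allow one point for each blank correctly filled (rule 3a)."""
    score = 0
    for blank, response in responses.items():
        accepted = answer_key.get(blank, set())
        if response.strip().lower() in accepted:
            score += 1
    return score

student = {1: "Photosynthesis", 2: "CO2", 3: "xylem"}
print(score_completion(student))  # 2 of 3 blanks correct
```

Because each response is reduced to a single comparison against the key, scoring stays objective regardless of who marks the papers.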

Recognition Type

The recognition type of test may be categorized into six types,


namely, alternative response, multiple choice, matching, analogy,
rearrangement, and identification.

Alternative Response Test. This test consists of a series of items,
each of which admits only one correct response chosen from two or three
constant options. This type is commonly used in classroom testing,
particularly the two-constant-alternative or true-false type. Other forms
are right-wrong, plus-minus, yes-no, correct-incorrect, same-different,
etc.

Other types of constant-alternative response tests are the
three-constant-alternative type, for instance, true-false-doubtful; and
the constant alternative with correction, i.e., the modified true-false
type.

Suggestions for the Construction of True-False Test

1. The test items must be arranged in groups of five to facilitate
scoring. The groups must be separated by a double space and the
items within a group by a single space.


2. The manner of indicating a response must be as simple as possible;
a single letter is enough to facilitate scoring, for instance, T for
true and F for false, or X for true and O for false, among others.
It is better if responses are placed in one column at the right
margin. Sometimes the response is written before the item number,
but the former placement is preferable.
3. Statements lifted from the book must be avoided to minimize rote
memory in studying.
4. The items must be carefully constructed so that the language is
within the level of the students; hence, flowery statements are
avoided.

5. Specific determiners like “all”, “always”, “none”, “not”,
“nothing”, and “no” are more likely to appear in false statements,
while determiners such as “may”, “some”, “seldom”, “sometimes”,
“usually”, and “often” are more likely to appear in true statements.
The foregoing determiners must be avoided because they give indirect
suggestions of the probable answer.
6. Qualitative terms, for instance, “few”, “many”, “great”,
“frequent”, and “large”, are vague and indefinite and must be
avoided.

7. Statements which are partly right and partly wrong must be avoided.

8. Each statement must be so constructed that it is definitely either
true or false.
9. Ambiguous and double negative statements must be avoided.

Multiple-Choice Test. This test is made up of items consisting of
three or more plausible options each. The choice is multiple in the sense
that the student must choose the one correct or best option from the
rest.

The multiple-choice test is regarded as one of the best test forms
for measuring learning outcomes. The form is most valuable and widely
used in standardized tests because of its flexibility and objectivity in
scoring. In teacher-made tests, it is applicable for testing vocabulary,
reading comprehension, relationships, interpretation of graphs, formulas,
and tables, and the drawing of inferences from a set of data.

Some Suggestions for the Construction of a Multiple-Choice
Test

1. Statements borrowed from textbooks or other reference


materials must be avoided. Use familiar phrasing to test the
comprehension of students.


2. All options must be plausible with one another so that the
distracters, or incorrect responses, attract students, and only
those who have mastered the material can get the best option.

3. All options must be grammatically consistent with the stem. For
instance, if the stem is singular, the options are singular.

4. The articles “a” and “an” must be avoided as the last word of an
incomplete sentence. These words give students clues as to whether
the best option starts with a consonant or a vowel.

5. Four or more options must be provided in each item to minimize


guessing.

6. The positions of the correct answers across items must be randomly
arranged rather than following a regular pattern.

7. A uniform number of options must be used. For instance, if there
are twenty items of this type and item 1 starts with five options,
the rest of the items must also have five options.

8. The best option must be consistent in length with the distracters.

9. The homogeneity of the options must be increased so that the best
option cannot be obtained through a mere process of logical
elimination.

10. The simplest method of indicating a response must be used to
facilitate scoring. For instance, if the options in each item are
lettered or numbered, the choice is made by indicating a letter or
number.
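Suggestion 6 above, randomizing the position of the correct answer, can be mechanized when a test is assembled. The sketch below shuffles each item's options and records the resulting key letter; the item content and function names are hypothetical, and a fixed seed is used only so the same key sheet can be regenerated.

```python
import random

# Sketch of suggestion 6: place correct answers in random positions
# rather than following a regular pattern. Item content is hypothetical.

def shuffle_item(item, rng):
    """Return the item's options in random order plus the key letter."""
    options = [item["correct"]] + item["distracters"]
    rng.shuffle(options)
    key_letter = "ABCDE"[options.index(item["correct"])]
    return options, key_letter

item = {
    "stem": "Which planet is nearest the sun?",
    "correct": "Mercury",
    "distracters": ["Venus", "Mars", "Jupiter", "Saturn"],
}

rng = random.Random(7)  # fixed seed: the key sheet can be reproduced
options, key = shuffle_item(item, rng)
print(item["stem"], options, "key:", key)
```

Recording the key letter at shuffle time also satisfies suggestion 10: scoring reduces to comparing a single letter per item.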

The five varieties of the multiple-choice form of test are:

1. Stem-and-option variety. This variety is most commonly used in
classroom testing, civil service examinations, the National College
Entrance Examination (NCEE), and many others. The stem serves as
the problem and is followed by four or more options from which the
students select the correct answer or best option.

2. Setting-and-options variety. The optional responses in this type of
test depend upon a setting or foundation of some sort. The setting
can be a graphical representation, a sentence, a paragraph, a
picture, or some other form of representation.

3. Group-Term variety. This type consists of a group of words or


terms in which one does not belong to the group.


4. Structured-response variety. This variety makes use of structured
responses and is commonly used in classroom testing for natural
science subjects.

5. Contained-Option variety. This variety is designed to identify


errors in a word, phrase, sentence or paragraph.

Matching Type. This consists of two columns in which a proper
pairing relationship between two sets of things is strictly observed;
for instance, Column A is to be matched with Column B. The two forms of
the matching type are as follows:

a. Balanced form. The number of items is equal to the number of


options.

b. Unbalanced form. There are unequal numbers in the two


columns.

Suggestions for the Construction of Matching Type

1. Heterogeneous material must be avoided in matching exercises. For
instance, dates and items, persons and events, and measurements and
definitions, among others, must not be mixed with one another.
2. More options than items must be included, from which selections
are made, to minimize the guessing factor. In other words, the
unbalanced matching type is preferable.

3. Each category must be grammatically consistent.

4. All options, including distracter, must be plausible with each other.

5. The item column must be placed at the left and the option column at
the right.
6. The option column must be arranged in alphabetical order, and dates
in chronological order, to facilitate the selection of the correct
answer. Each option is assigned a code number or letter.
7. There should only be one correct response in each item.

8. Be sure each item has a pair in the option column.

9. The ideal number of items is 5 to 10 and a maximum of 15.

10. All items must appear on one page to avoid waste of time and energy
in turning the pages.
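Suggestion 6 above, alphabetizing the option column and coding each option with a letter, is easy to mechanize when a matching exercise is prepared. The sketch below is illustrative only; the option names are hypothetical.

```python
# Sketch of suggestion 6 for the matching type: arrange the option
# column alphabetically and assign each option a letter code.
# Option content is hypothetical.

options = ["Rizal", "Bonifacio", "Aguinaldo", "Mabini", "Luna"]

# Sort case-insensitively, then pair each sorted option with a letter.
coded = dict(zip("ABCDE", sorted(options, key=str.lower)))

for letter, name in coded.items():
    print(f"{letter}. {name}")
```

Sorting before assigning codes means students locate an option by its initial letter instead of scanning the whole column, which speeds up both answering and scoring.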


Rearrangement Type. This type of multiple-option item requires
the examinee to arrange the options in chronological, logical, rank, or
some other order.

Analogy. This type is made up of items consisting of pairs of words
that are related to each other. It is designed to measure the ability of
students to observe that the relationship in the first pair carries over
to the second pair.

There are 15 kinds of relationships. These types are as follows:

1. Purpose
2. Cause and Effect
3. Part-Whole
4. Part-Part
5. Action to Object
6. Object to Action
7. Antonym
8. Synonym
9. Place
10. Degree
11. Characteristics
12. Sequence
13. Grammatical
14. Numerical
15. Association

Suggestions for the Construction of an Analogy Type

1. The relationship of the first pair of words must be the same as
the relationship of the second.

2. Distracters must be as plausible as the correct option so that they
attract the students and the correct answer cannot be obtained by a
mere process of elimination.

3. All options must be constructed in a parallel language.

4. All items must be grammatically consistent.

5. Four or more options must be included in each item to minimize the
chances of guessing. If fewer than four options are used, a
correction formula must be applied.

6. Only a homogeneous relationship must be used within each item. For
instance, if a purpose relationship is used in the first pair of
words, the second pair must also show a purpose relationship.

Identification Test. This is a test in which a term is defined,
described, explained, or indicated by a picture, diagram, or concrete
object, and the term referred to is supplied by the pupil or student.


Advantages of an Informal Objective Type

1. Easy to correct or score. An objective test is easy for classroom
teachers to correct because of the short response involved in each
item. A response may consist of a single word, letter, number, or
phrase.

2. Eliminate subjectivity. An objective test eliminates subjectivity in


scoring because the responses are short and exact.

3. Adequate sampling. More items are included in an objective test, so
the validity and reliability of the test can be adequately ensured.

4. Objectivity in scoring. Objective tests can be scored objectively
because each item calls for a short, single correct response.

5. Eliminates bluffing. Bluffing is eliminated in an objective type of


test because the students only choose the answers from the options
provided.

6. Norms can be established. Because of the adequate sampling of the
test, norms can be established.

7. Saves time and energy in answering questions. An objective test
saves the student’s time and energy in answering questions because
options are provided, from which the selection of answers is to be
made, using short statements.

Limitations of an Informal Objective Test

1. Difficult to construct. As far as preparation is concerned, an
objective test is difficult to construct because more items are
involved.

2. Encourages cheating and guessing. An objective test encourages
cheating and guessing because of the short answer given for each
item. Responses can be in the form of a letter, number, word, or
phrase.

3. Expensive. Because of the adequate sampling of an objective test,
it is expensive to reproduce. The questions cannot be written on the
board, which is disadvantageous for schools without duplicating
facilities.

4. Encourages rote memorization. An objective test encourages rote
memorization rather than meaningful learning because an answer to an
item may consist only of a single word or phrase. A student’s
ability to think critically and to express, organize, and reason out
his ideas is not developed.

5. Time consuming. The preparation of an objective test is
time-consuming on the part of the teacher.

Performance Test. A performance test is one in which the responses to
test questions take the form of overt manual, vocal, and other similar
behavioral activities.

Classification of finished products which are measured by


performance test:

1. Intangible – the actual process of performance, after which it is
no longer observable. Examples: public speaking, gymnastics, dancing

2. Tangible – the concrete object or article produced by the
performer. Examples: drawings, paintings, hollow blocks, dresses

Other Evaluation Instruments

1. Questionnaires. These consist of questions to which the
pupil/student responds in each item by encircling an option or by
the use of a checkmark.

2. Checklists. A teacher makes use of a checklist to record whether a
pupil/student exhibits a desired behavior or neglects certain
outcomes.

3. Rating Scales. These scales are filled out by teachers to rate
meritorious achievement by a pupil/student.

4. Cumulative records. These records provide information about a
pupil/student’s personality, special talents, scholarship, and
family background.

5. Anecdotal records. An anecdotal record is designed to determine
what happened and what the behavior of the learner probably means.

 LEARNING ACTIVITY
Based on the table of specifications constructed in Lesson 2,
prepare/develop a 50-item test using at least three (3) types of
tests.

