Module Assessment 3 - 020535
Designing and Developing Assessments
Overview
Learning Outcomes
Assessment tools are techniques used to measure a student’s academic abilities, skills,
and/or fluency in a given subject or to measure one’s progress toward academic
proficiency in a specific subject area.
It is the instrument (form, test, rubric, etc.) used to collect data for each outcome: the actual
product handed out to students for the purpose of assessing whether they have achieved a
particular learning outcome or outcomes.
Assessments can be either formal or informal. Informal assessments are often inferences
an educator draws as a function of unsystematic observations of a student’s performance
in the subject matter under consideration. Formal assessments are objective
measurements of a student’s abilities, skills, and fluency using screening, progress
monitoring, diagnosis, or evaluation. Both types of assessments are important; however,
only formal assessments are research- or evidence-based.
Educators use assessment tools to make informed decisions regarding strategies to
enhance student learning.
1. Measure all instructional objectives. When a teacher constructs test items to measure the
learning progress of the students, the items should match all the learning objectives posed during
instruction. That is why the first step in constructing a test is for the teacher to go back to the
instructional objectives.
2. Cover all the learning tasks. The teacher should construct a test that contains a wide
sampling of items. In this case, the teacher can determine the educational outcomes or
abilities so that the resulting scores are representative of the total performance in the areas
measured.
3. Use appropriate test items. The test items constructed must be appropriate to measure
learning outcomes.
4. Make the test valid and reliable. The teacher must construct a test that is valid so that it can
measure what it is supposed to measure. The test is reliable when the scores of the students
remain the same or consistent when the teacher gives the same test a second time.
5. Use tests to improve learning. The test scores should be utilized properly by the teacher to
improve learning by discussing the skills or competencies in the items that have not been
learned or mastered by the learners.
The type of test used should match the instructional objective or learning outcomes of the
subject matter posed during the delivery of the instruction. The following are the types of
assessment tools:
1. Objective Test. It requires students to select the correct response or to supply a word or short
phrase to answer a question or complete a statement. It includes true-false, matching-type, and
multiple-choice questions. The word objective refers to the scoring: it indicates that there is
only one correct answer.
2. Subjective Test. It permits the student to organize and present an original answer. It includes
either short answer questions or long general questions. This type of test has no specific
answer. Hence, it is usually scored on an opinion basis, although there will be certain facts
and understanding expected in the answer.
3. Performance Assessment. It is an assessment in which students are asked to perform real-
world tasks that demonstrate meaningful application of essential knowledge and skills. It can
appropriately measure learning objectives which focus on the ability of the students to
demonstrate skills or knowledge in real-life situations.
4. Portfolio Assessment. It is an assessment based on the systematic, longitudinal
collection of student work created in response to specific, known instructional objectives and
evaluated in relation to the same criteria. A portfolio is a purposeful collection of students' work
that exhibits the students' efforts, progress, and achievements in one or more areas over a
period of time. It measures the growth and development of students.
5. Oral Questioning. This method is used to collect assessment data by asking oral questions.
It is the most commonly used of all forms of assessment in class, assuming that the learner hears
and shares the use of a common language with the teacher during instruction. The ability of the
students to communicate orally is very relevant to this type of assessment. This is also a form
of formative assessment.
6. Observation Technique. This is a method of collecting assessment data in which the teacher
observes how students carry out certain activities, observing either the process or the product.
There are two types of observation techniques: formal and informal observations. Formal
observations are planned in advance, as when the teacher assesses an oral report or presentation
in class, while informal observation is done spontaneously during instruction, such as observing
the working behavior of students while performing a laboratory experiment.
7. Self-report. The responses of the students may be used to evaluate both performance and
attitude. Assessment tools could include sentence completion, Likert scales, checklists, or
holistic scales.
5. Scorability means that the test should be easy to score; directions for scoring should be clearly
stated in the instructions. Provide the students with an answer sheet, and provide the answer key
for the one who will check the test.
6. Adequacy means that the test should contain a wide sampling of items to determine the
educational outcomes or abilities, so that the resulting scores are representative of the total
performance in the areas measured.
7. Administrability means that the test should be administered uniformly to all students so that the
scores obtained will not vary due to factors other than differences in the students' knowledge
and skills. There should be clear instructions for the students, the proctors, and even
the one who will check the test.
8. Practicality and Efficiency refers to the teacher’s familiarity with the methods used, time
required for the assessment, complexity of the administration, ease of scoring, ease of
interpretation of the test results and the materials used must be at the lowest cost.
2. What are the advantages and disadvantages of a subjective test over objective type
of test?
There are different ways of assessing the performance of the students, such as objective tests,
subjective tests, performance-based assessment, oral questioning, portfolio assessment, self-
assessment, and checklists. Each of these has its own function and use. The type of assessment
tool should always be appropriate to the objectives of the lesson. There are two general types
of test items to use in a paper-and-pencil achievement test: selection-type items and
supply-type items.
A. Multiple-choice Test
Knowledge Level
The number of chromosomes in a cell produced by meiosis is ______
A. half as many as the original cell.
B. twice as many as the original cell.
C. the same number as the original cell.
D. not predictable.
Comprehension Level
Why did John B. Watson reject the structuralist study of mental events?
A. He believed that structuralism relied too heavily on scientific methods.
B. He rejected the concept that psychologists should study observable behavior.
C. He believed that scientists should focus on what is objectively observable.
D. He actually embraced both structuralism and functionalism.
Application Level
An envelope contains 140 bills consisting of P1,000 and P500 peso bills. If the number
of P500 bills is 20 more than the number of P1,000 bills, how many P500 bills are
there?
A. 40
B. 60
C. 70
D. 80
Analysis Level
What is the statistical test used when you test the mean difference between the pre-
test and post-test?
A. Analysis of variance
B. t-test
C. Correlation
D. Regression analysis
A matching-type item consists of two columns. Column A contains the descriptions and is
placed on the left side, while Column B contains the options and is placed on the right side.
The examinees are asked to match the options with the descriptions they are associated with.
Direction: Match each trigonometric term in Column A with the corresponding expression
in Column B. Write only the letter of your choice on the space provided for each item.

COLUMN A               COLUMN B
_____ 1. 1             A. sin θ / cos θ
_____ 2. csc θ         B. sin θ csc θ
_____ 3. cos θ         C. 1 – cos²θ
_____ 4. tan θ         D. sec θ
In this type of test, the examinees determine whether the statement presented is true or
false. The true-false test item is an example of a "forced-choice test" because there are only two
possible choices in this type of test. The students are required to choose true or false
in recognition of a correct or incorrect statement. This type of test is appropriate for
assessing behavioural objectives such as "identify," "select," or "recognize." It is also suited to
assessing the knowledge and comprehension levels of the cognitive domain. It is appropriate when
there are only two plausible alternatives or distracters.
1. Avoid writing very long statements. Eliminate unnecessary words in the statement.
2. Avoid trivial questions.
3. Each item should contain only one idea, except for statements showing the relationship
between cause and effect.
4. It can be used for establishing cause-and-effect relationships.
5. Avoid using negatives or double negatives. Construct the statement positively. If this cannot be
avoided, bold or underline the negative word to call the attention of the examinees.
6. Avoid opinion-based statements; if this cannot be avoided, the statement should be
attributed to somebody.
7. Avoid specific determiners such as "never," "always," "all," and "none," for they tend to appear in
statements that are false.
8. Avoid specific determiners such as "some," "sometimes," and "may," for they tend to appear in
statements that are true.
9. The number of true items must be the same as the number of false items.
10. Avoid grammatical clues that lead to the correct answer, such as the articles (a, an, the).
11. Avoid statements directly taken from the textbook.
12. Avoid arranging the statements in a logical order (such as TTTTTFFFFF, TFTFTFTFTF, etc.).
13. Directions should indicate where or how the students should mark their answer.
Direction: Determine whether the statement is true or false. Write true if the statement is true;
otherwise, write false.
1. It is limited to low-level thinking skills such as knowledge and comprehension, or the
recognition or recall of information.
2. There is a high probability of guessing the correct answer compared to multiple choice, which
consists of more than two choices.
1. The item should require a single-word answer or a brief and definite statement. Do not use
an indefinite statement that allows several answers.
2. Be sure that the language used in the statement is precise and accurate in relation to the
subject matter being tested.
3. Be sure to omit only key words; do not eliminate so many words that the meaning of the
item statement changes.
4. Do not leave the blank at the beginning or within the statement. It should be at the end of the
statement.
5. Use direct question rather than incomplete statement. The statement should pose the problem
to the examinee.
6. Be sure to indicate the units in which the answer is to be expressed when the statement
requires a numerical answer.
7. Be sure that the answer the student is required to produce is factually correct.
8. Avoid grammatical clues.
9. Do not select textbook sentences.
Question Form
Direction: Write your answer on the space provided before each item.
Completion Form
1. It is only appropriate for questions that can be answered with short responses.
2. There is difficulty in scoring when the questions are not prepared properly and clearly. The
question should be clearly stated so that the answer expected of the student is clear.
3. It can assess only knowledge, comprehension and application levels in Bloom’s taxonomy of
cognitive domain.
4. It is not adaptable in measuring complex learning outcomes.
5. Scoring is tedious and time consuming.
B. Essay Items
It is appropriate when assessing students' ability to organize and present original ideas. It
consists of a small number of questions wherein the examinee is expected to demonstrate the
ability to recall factual knowledge, organize this knowledge, and present it in a logical
and integrated answer. Extended response essays and restricted response essays are the two
types of essay test items. An Extended Response Essay allows the students to determine the
length and complexity of the response. It is very useful in assessing the synthesis and evaluation
skills of the students. When the objective is to determine whether the students can organize,
integrate, and express ideas and evaluate information, it is best to use an extended
response essay test. A Restricted Response Essay is an essay item that places strict limits on
both the content and the response given by the students. In this type of essay, the content is usually
restricted by the scope of the topic to be discussed, and the limitations on the form of the
response are indicated in the question.
1. Present and describe the modern theory of evolution and discuss how it is
supported by evidence from the areas of (a) comparative anatomy and (b) population genetics.
2. From the statement, "Mathematics may be defined as the subject in which we
never know what we are talking about, nor whether what we are saying is true," what do
you think is the reasoning behind the statement? Explain your answer.
Restricted Response Essay
1. Point out the advantages and disadvantages of an essay type of test. Limit your
1. It is easier to prepare and less time consuming compared to other paper and pencil tests.
2. It measures higher-order thinking skills (analysis, synthesis and evaluation).
3. It allows students’ freedom to express individuality in answering the given question.
4. The students have a chance to express their own ideas in order to plan their own answer.
5. It reduces guessing compared to any objective type of test.
6. It presents a more realistic task to the students.
7. It emphasizes the integration and application of ideas.
C. Problem-Solving Test
Direction: Analyze and solve each problem. Show your solution neatly and clearly by
applying the strategy indicated in each item. Each item is worth 10 points.
1. Debbie begins a physical fitness program. Debbie's goal is to do 100 sit-ups. On the first
day of the program, she does 20 sit-ups. Every 5th day of the program, she increases the
number of sit-ups by 10. After how many days will she reach her goal? (Make a list or table)
2. In three more years, Miguel's grandfather will be six times as old as Miguel was last year.
When Miguel's present age is added to his grandfather's present age, the total is 68. How old
is each one now? (Use an equation)
1. It minimizes guessing by requiring the students to provide an original response rather than to
select from several alternatives.
2. It is easier to construct.
3. It can most appropriately measure learning objectives which focus on the ability to apply skills
and knowledge in the solution of problems.
4. It can measure an extensive amount of content objectives.
2. Formulate at least two examples of the different types of objective and subjective test
in your area of specialization.
Table of Specification
Table of specification (TOS) is a chart or table that details the content and level of
cognitive domain assessed on a test as well as the types and emphases of test items
(Gareis and Grant, 2008).
The TOS is very important in addressing the validity and reliability of the test items. Validity
of the test means that the assessment can be used to draw appropriate conclusions
because the assessment guards against systematic error.
The TOS provides the test constructor a way to ensure that the assessment is based on the
intended learning outcomes.
It is also a way of ensuring that the number of questions on the test is adequate to ensure
dependable results that are not likely caused by chance.
It is also a useful guide in constructing a test and in determining the type of test items that
you need to construct.
A. Format 1 of a Table of Specification. This format is composed of the specific objectives, the
cognitive level, type of test used, the item number, and the total points needed in each item.
Specific Objectives refer to the intended learning outcomes stated as specific instructional
objective covering a particular test topic.
Cognitive Level pertains to the intellectual skill or ability to correctly answer a test item using
Bloom’s taxonomy of educational objectives. We sometimes refer to this as the cognitive
domain of a test item. Thus, entries in this column could be knowledge, comprehension,
application, analysis, synthesis, and evaluation.
Type of Test Item identifies the type or kind of test a test item belongs to. Examples of entries in
this column could be multiple-choice, true or false, or even essay.
Item Number simply identifies the question number as it appears in the test.
Total Points summarizes the score given to a particular test item.
1. Determine the learning outcomes to be assessed. These will include the learning outcomes
in the areas of knowledge, intellectual skills or abilities, general skills, attitudes, interest,
and appreciation. Use Bloom's taxonomy or Krathwohl's 2001 revised taxonomy of the
cognitive domain as a guide.
2. Make an outline of the subject matter to be covered in the test. The length of the test will
depend on the areas covered in its content and the time needed to answer.
3. Decide on the number of items per subtopic. Use the formula shown after this list to determine
the number of items to be constructed for each subtopic covered in the test, so that the number
of items in each topic is proportional to the number of class sessions spent on it.
4. Make the two-way chart as shown in the format 2 and format 3 of a Table of
Specification.
5. Construct the test items. A classroom teacher should always follow the general principle of
constructing test items. The test item should always correspond with the learning outcome so
that it serves whatever purpose it may have.
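The module cites a formula for step 3 without reproducing it. A standard allocation rule consistent with the description above (proportional to instructional time; the source's exact formula may differ) is:

Number of items for a topic = (class sessions spent on the topic ÷ total class sessions) × total number of test items

For example, a topic taught in 4 of 20 class sessions on a 50-item test would receive (4 ÷ 20) × 50 = 10 items.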
1. Planning Stage
Determine who will use the assessment results and how they will use them.
Identify the learning targets to be assessed.
Select the appropriate assessment method or methods.
Determine the sample size.
2. Development Stage
Develop or select items, exercises, tasks, and scoring procedures.
Review and critique the overall assessment for quality before use.
3. Use Stage
Conduct and score the assessment.
Revise as needed for future use.
1. Examine the instructional objectives of the topics previously discussed. The first step in
developing a test is to go back to the instructional objectives so that you can
match them with the test items to be constructed.
2. Make a table of specification (TOS). The TOS ensures that the assessment is based on the
intended learning outcomes.
3. Construct the test items. In constructing test items, it is necessary to follow the general
guidelines for constructing test items. Kubiszyn and Borich (2007) suggested some guidelines
to help classroom teachers improve the quality of the test items they write.
Begin writing items far enough in advance that you will have time to revise them.
Match items to the intended outcomes at the appropriate level of difficulty to provide a valid
measure of the instructional objectives. Limit each question to the skill being assessed.
Be sure each item deals with an important aspect of the content area and not with trivia.
Be sure the problem posed is clear and unambiguous.
Be sure each item is independent of all other items. The answer to one item should
not be required as a condition for answering the next item. A hint to one answer should
not be embedded in another item.
Be sure the item has one correct or best answer on which experts would agree.
Prevent unintended clues to an answer in the statement or question. Grammatical
inconsistencies such as “a” or “an” give clues to the correct answer to those students
who are not well prepared for the test.
Avoid replication of the textbook in writing test items; do not quote directly from the
textual materials. You are usually not interested in how well students memorize the text.
Besides, taken out of context, direct quotes from the text are often ambiguous.
Avoid trick or catch questions in an achievement test. Do not waste time testing how well
the students can interpret your intentions.
Try to write items that require higher-order thinking skills.
4. Assemble the test items. After constructing the test items following the different principles of
test construction, the next step is to assemble the test. There are two
steps in assembling the test: (1) packaging the test; and (2) reproducing the test. In
assembling the test, consider the following guidelines:
Group all test items with similar format.
Arrange test items from easy to difficult.
Space the test items for easy reading.
Keep each item and its options on the same page.
Place the illustrations near the description.
Check the answer key.
Decide where to record the answer.
5. Check the assembled test items. Before reproducing the test, it is very important to
proofread the test items for typographical and grammatical errors and make the necessary
corrections, if any. If possible, let others examine the test to validate its content. This can save
time during the examination and avoid disrupting the students' concentration.
6. Write directions. Check the test directions for each item format to be sure they are clear for
the students to understand. The test directions should contain the numbers of the items to which
they apply, how to record answers, the basis on which to select answers, and the criteria
for scoring or the scoring system.
7. Make the answer key. Be sure to check your answer key so that the correct answers follow a
fairly random sequence.
8. Analyze and improve the test items. Analyzing and improving the test items should be done
after checking, scoring and recording the test.
Item Analysis
Item analysis is a process of examining the students' responses to individual items in the
test. It consists of different procedures for assessing the quality of the test items given to the
students. Through the use of item analysis, we can identify which of the given test items are
good and which are defective. Good items are to be retained, and defective items are to be
improved, revised, or rejected.
1. Item analysis data provide a basis for efficient class discussion of the test results.
2. Item analysis data provide a basis for remedial work.
3. Item analysis data provide a basis for general improvement of classroom instruction.
4. Item analysis data provide a basis for increased skills in test construction.
5. Item analysis procedures provide a basis for constructing a test bank.
1. Difficulty Index
It refers to the proportion of the number of students in the upper and lower groups who
answered an item correctly. The larger the proportion, the more students have learned the
subject matter measured by the item. To compute the difficulty index of an item, use the formula:

DF = n / N

where: DF = difficulty index; n = number of students selecting the correct answer in the
upper group and in the lower group; and N = total number of students who answered the
test.
Level of Difficulty
To determine the level of difficulty of an item, find first the difficulty index using the
formula and identify the level of difficulty using the range given below. The higher the value of the
index of difficulty, the easier the item is. Hence, more students got the correct answer and more
students mastered the content measured by that item.
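The interpretation table for the difficulty index is not reproduced in this copy of the module. The short Python sketch below (an addition, not from the module) computes DF and applies the commonly published bands, which agree with the worked examples that follow (0.18 is read as "very difficult" and 0.36 as "difficult"); the exact cut-offs in the source may differ.

def difficulty_index(correct_upper, correct_lower, total_examinees):
    # DF = n / N, where n pools the correct answers of the upper and
    # lower groups and N is the total number of students who took the test.
    return (correct_upper + correct_lower) / total_examinees

def difficulty_level(df):
    # Assumed interpretation bands; verify against the module's own table.
    if df <= 0.20:
        return "very difficult"
    if df <= 0.40:
        return "difficult"
    if df <= 0.60:
        return "moderately difficult"
    if df <= 0.80:
        return "easy"
    return "very easy"

df = difficulty_index(4, 3, 39)            # data from Example 2 below
print(round(df, 2), difficulty_level(df))  # 0.18 very difficult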
2. Discrimination Index
It is the power of the item to discriminate between the students who know the lesson and those
who do not. It is computed as the number of students in the upper group who got an item
correct minus the number of students in the lower group who got the item correct, divided
by the number of students in the upper group or the lower group (use the higher number
if they are not equal). The discrimination index is
one basis for measuring the validity of an item. This index can be interpreted as an indication of the
extent to which overall knowledge of the content area or mastery of the skills is related to the
response on an item. The formula used to compute for the discrimination index is:
DI = (CUG − CLG) / D

where: DI = discrimination index value; CUG = number of students selecting the correct
answer in the upper group; CLG = number of students selecting the correct answer in the lower
group; and D = the number of students in either the lower group or the upper group.
1. Positive discrimination happens when more students in the upper group got the item
correctly than those students in the lower group.
2. Negative discrimination occurs when more students in the lower group got the item correctly
than the students in the upper group.
3. Zero discrimination happens when the numbers of students in the upper group and the lower group
who answer the item correctly are equal; hence, the test item cannot distinguish between the students
who performed well on the overall test and the students whose performance was very poor.
Level of Discrimination
Ebel and Frisbie (1986) as cited by Hetzel (1997) recommended the use of the Level of
Discrimination of an Item for easier interpretation.
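The Ebel and Frisbie level table is also missing from this copy of the module. A sketch using their commonly cited thresholds (an assumption, to be checked against the original table):

def discrimination_index(correct_upper, correct_lower, group_size):
    # DI = (CUG - CLG) / D, where D is the size of one group
    # (use the larger group if the two groups are unequal).
    return (correct_upper - correct_lower) / group_size

def discrimination_level(di):
    # Assumed Ebel and Frisbie (1986) interpretation bands.
    if di >= 0.40:
        return "very good item"
    if di >= 0.30:
        return "reasonably good item"
    if di >= 0.20:
        return "marginal item, usually needing improvement"
    return "poor item, to be rejected or revised"

di = discrimination_index(4, 3, 20)            # data from Example 2 below
print(round(di, 2), discrimination_level(di))  # 0.05 poor item, ...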
Distracter Analysis
1. Distracter. This is the term used for the incorrect options in a multiple-choice test, while
the correct answer is called the key.
2. Miskeyed item. The test item is a potential miskey if there are more students from the upper
group who choose the incorrect options than the key.
3. Guessing item. Students from the upper group have equal spread of choices among the given
alternatives.
4. Ambiguous item. This happens when more students from the upper group choose an
incorrect option and the keyed answer equally.
Example 1. A class is composed of 40 students. Divide the group into two. Option B is the correct
answer. Based on the given data in the table, as a teacher, what would you do
with the test item?
Options        A    B*   C    D    E
Upper Group    3    10   4    0    3
Lower Group    4     4   8    0    4
c. Retain options A, C, and E because most of the students who did not perform well in
the overall examination selected them; those options attract most students from the lower
group.
4. Conclusion: Retain the test item but change option D, making it more realistic so that it is
effective for the upper and lower groups: an incorrect option should be chosen by at least 5%
of the examinees, and no one chose option D.
Example 2. Below is the result of an item analysis for a test item in Mathematics. Are you going
to reject, revise or retain the test item?
Options        A    B    C*   D    E
Upper Group    4    3    4    3    6
Lower Group    3    4    3    4    5
1. Compute the difficulty index.
n = 4 + 3 = 7
N = 39
DF = n / N = 7 / 39
DF = 0.18 or 18%
2. Compute the discrimination index.
CUG = 4
CLG = 3
D = 20
DI = (CUG − CLG) / D = (4 − 3) / 20 = 1 / 20
DI = 0.05 or 5%
3. Make an analysis about the level of difficulty, discrimination and distracters.
a. Only 18% of the examinees got the answer correct; hence, the item is very difficult.
b. More students from the upper group got the answer correct; hence, it has a positive
discrimination index of 5%.
c. Students respond about equally to all alternatives, an indication that they are guessing.
d. If the test item is well-written but too difficult, reteach the material to the class.
4. Conclusion: Reject the item because it is very difficult, the discrimination index is very poor,
and options A and B are not effective distracters.
Example 3. A class is composed of 50 students. Use 27% to get the upper and the lower groups.
Analyze the item given the following results. Option D is the correct answer. What will
you do with the test item?
Options        A    B    C    D*   E
Upper Group    3    1    2    6    2
Lower Group    5    0    4    4    1
1. Compute the difficulty index.
n = 6 + 4 = 10
N = 28
DF = n / N = 10 / 28
DF = 0.36 or 36%
2. Compute the discrimination index.
CUG = 6
CLG = 4
D = 14
DI = (CUG − CLG) / D = (6 − 4) / 14
DI = 0.14 or 14%
3. Make an analysis about the level of difficulty, discrimination and distracters.
a. Only 36% of the examinees got the answer correct; hence, the item is difficult.
b. More students from the upper group got the answer correct; hence, it has a positive
discrimination index of 14%.
c. Modify options B and E because more students from the upper group chose them
compared with the lower group; hence, they are not effective distracters, because most
of the students who performed well in the overall examination selected them as their
answers.
d. Retain options A and C because most of the students who did not perform well in the
overall examination selected them as the correct answers. Hence, options A and C
are effective distracters.
4. Conclusion: Revise the item by modifying options B and E.
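The distracter judgments in Examples 1 to 3 follow mechanical rules, so they can be automated. A minimal sketch, assuming the at-least-5% rule of thumb from Example 1 and the upper-versus-lower comparison used above:

def analyze_distracters(upper, lower, key):
    # upper and lower map each option to the number of students in that
    # group who chose it; key is the correct option.
    total = sum(upper.values()) + sum(lower.values())
    report = {}
    for option in upper:
        if option == key:
            continue
        chosen = upper[option] + lower[option]
        if chosen < 0.05 * total:
            report[option] = "modify: chosen by fewer than 5% of examinees"
        elif upper[option] > lower[option]:
            report[option] = "modify: attracts the upper group"
        else:
            report[option] = "retain: effective distracter"
    return report

# Example 3 data (option D is the key):
upper = {"A": 3, "B": 1, "C": 2, "D": 6, "E": 2}
lower = {"A": 5, "B": 0, "C": 4, "D": 4, "E": 1}
print(analyze_distracters(upper, lower, "D"))
# B and E are flagged for modification; A and C are retained, as concluded above.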
Test Reliability
Reliability refers to the consistency with which a test yields the same rank for individuals who
take the test more than once (Kubiszyn and Borich, 2007), that is, how consistent test results or
other assessment results are from one measurement to another. A test is reliable when it
yields practically the same scores when administered twice to the same group of
students, with a reliability index of 0.60 or above. The reliability of a test can be determined by
means of the Pearson Product-Moment Correlation, the Spearman-Brown formula, the Kuder-
Richardson formulas, Cronbach's alpha, etc.
1. Test-retest Method. A type of reliability determined by administering the same test twice to the
same group of students with a time interval between the tests. The test scores
are correlated using the Pearson Product-Moment Correlation Coefficient (r), and this correlation
coefficient provides a measure of stability. It indicates how stable the test results are over a
period of time.
2. Equivalent/Parallel/Alternate Forms. A type of reliability determined by administering two
different but equivalent forms of the test to the same group of students in close succession.
The equivalent forms are constructed from the same set of specifications, so they are similar in
content, type of items, and difficulty. The results of the test scores are correlated using the Pearson
Product-Moment Correlation Coefficient, and this correlation coefficient provides a measure of the
degree to which generalization about the performance of students from one assessment to
another assessment is justified. It measures the equivalence of the tests.
3. Split-half Method. Administer the test once and score two equivalent halves of the test. To split
the test into halves that are equivalent, the usual procedure is to score the even-numbered
and the odd-numbered test items separately. This provides two scores for each student. The
results of the test scores are correlated using the Spearman-Brown formula and this
correlation coefficient provides a measure of internal consistency. It indicates the degree to
which consistent results are obtained from two halves of the test.
4. Kuder-Richardson Formulas. Administer the test once, score the total test, and apply the
Kuder-Richardson (KR) formula. The KR-20 formula is applicable only in situations where
students' responses are scored dichotomously and is therefore most useful with traditional
test items that are scored as right or wrong, true or false, or yes or no. KR-20 reliability
estimates indicate the degree to which the items in the test measure the same
characteristic. Another formula for estimating the internal consistency of a test is the KR-21
formula, which is simpler to compute but assumes that all items are of equal difficulty.
Reliability Coefficient
Reliability coefficient is a measure of the amount of error associated with the test
scores. Reliability Coefficient has the following description:
(a) The range of the reliability coefficient is from 0 to 1.0;
(b) The acceptable range value is 0.60 or higher;
(c) The higher the value of the reliability coefficient, the more reliable the overall test
scores;
(d) Higher reliability indicates that the test items measure the same thing.
1. Group variability affects the size of the reliability coefficient. Higher coefficients result
from heterogeneous groups than from homogeneous groups. As group variability
increases, reliability goes up.
2. Scoring reliability limits test score reliability. If tests are scored unreliably, error is introduced.
This limits the reliability of the test scores.
3. Test length affects test score reliability. As the length increases, the reliability tends to go up.
4. Item difficulty affects test score reliability. As test items become very easy or very hard, the
test’s reliability goes down.
Example 1. Prof. Joel administered a test to his 10 students in an Elementary Statistics class twice,
with a one-day interval. The test given after one day was exactly the same test given the
first time. The scores below were gathered in the first test (FT) and second test (ST).
Using the test-retest method, is the test reliable? Show the complete solution.
Student   FT   ST
1         36   38
2         26   34
3         38   38
4         15   27
5         17   25
6         28   26
7         32   35
8         35   36
9         12   19
10        35   38
Using the Pearson r formula, find Σx, Σy, Σxy, Σx², and Σy².
Student   FT (x)   ST (y)   xy     x²     y²
1         36       38       1368   1296   1444
2         26       34       884    676    1156
3         38       38       1444   1444   1444
4         15       27       405    225    729
5         17       25       425    289    625
6         28       26       728    784    676
7         32       35       1120   1024   1225
8         35       36       1260   1225   1296
9         12       19       228    144    361
10        35       38       1330   1225   1444
n = 10    Σx = 274   Σy = 316   Σxy = 9192   Σx² = 8332   Σy² = 10400
r = [n(Σxy) − (Σx)(Σy)] / √{[n(Σx²) − (Σx)²][n(Σy²) − (Σy)²]}

r = 0.91
Analysis: The reliability coefficient using the Pearson r is 0.91, which means that the test has very
high reliability. The scores of the 10 students tested twice with a one-day interval are consistent.
Hence, the test has very high reliability.
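A minimal Python sketch (an addition, not part of the module) that reproduces Example 1's test-retest computation from the raw scores:

from math import sqrt

def pearson_r(x, y):
    # r = [n(Σxy) − (Σx)(Σy)] / sqrt{[n(Σx²) − (Σx)²][n(Σy²) − (Σy)²]}
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    return (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

first_test  = [36, 26, 38, 15, 17, 28, 32, 35, 12, 35]
second_test = [38, 34, 38, 27, 25, 26, 35, 36, 19, 38]
print(round(pearson_r(first_test, second_test), 2))   # 0.91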
Example 2. Prof. Glenn administered a test to his 10 students in his Chemistry class. The test was
given only once. The students' scores on the odd items (O) and even items (E) were
gathered. Using the split-half method, is the test reliable? Show the complete solution.
Use the Spearman-Brown formula r_ot = 2r_oe / (1 + r_oe) to find the reliability of the whole test,
and find Σx, Σy, Σxy, Σx², and Σy² to solve for the correlation between the odd and even test items.
Steps:
1. Use the Pearson Product-Moment Correlation Coefficient formula to solve for r.

r = [n(Σxy) − (Σx)(Σy)] / √{[n(Σx²) − (Σx)²][n(Σy²) − (Σy)²]}

r = 0.33

2. Find the reliability of the whole test using the formula:

r_ot = 2r_oe / (1 + r_oe)
r_ot = 2(0.33) / (1 + 0.33)
r_ot = 0.66 / 1.33
r_ot = 0.50
3. Analysis: The reliability coefficient using the Spearman-Brown formula is 0.50, which indicates
questionable reliability. Hence, the test items should be revised.
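A sketch of the step-up computation. The odd/even score table for Example 2 is not reproduced in this copy of the module, so the half-test correlation r_oe = 0.33 is taken directly from the worked result:

def spearman_brown(r_half):
    # Steps up the correlation between two half-tests to an estimate of
    # the reliability of the whole test: r_ot = 2 * r_oe / (1 + r_oe).
    return 2 * r_half / (1 + r_half)

print(round(spearman_brown(0.33), 2))   # 0.50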
Example 3. Ms. Tan administered a 40-item test in English to her Grade VI pupils in UEPLES.
Below are the scores of 15 pupils. Find the reliability using the Kuder-Richardson
(KR-21) formula.
Student   Score (x)
1         16
2         25
3         35
4         39
5         25
6         18
7         19
8         22
9         33
10        36
11        20
12        17
13        26
14        35
15        39
Steps:
1. Solve for the mean and the standard deviation of the scores. From the table, the mean is
x̄ = 27 and the variance is s² = 70.14.
2. Apply the KR-21 formula:

KR21 = [k / (k − 1)] [1 − x̄(k − x̄) / (k s²)]

KR21 = [40 / 39] [1 − 27(40 − 27) / (40 × 70.14)]
KR21 = 1.03 [1 − 351 / 2805.60]
KR21 = 1.03 [1 − 0.1251]
KR21 = 1.03 [0.8749]
KR21 = 0.90
3. Analysis: The reliability coefficient using the KR-21 formula is 0.90, which means that the test
has very good reliability. That is, the test is very good for a classroom test.
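A Python sketch (an addition, not from the module) that reproduces Example 3; using the sample variance (n − 1 divisor) matches the module's s² = 70.14:

def kr21(k, scores):
    # KR21 = [k/(k-1)] * [1 - mean*(k - mean) / (k * variance)]
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))

scores = [16, 25, 35, 39, 25, 18, 19, 22, 33, 36, 20, 17, 26, 35, 39]
print(round(kr21(40, scores), 2))   # prints 0.9, i.e., KR21 ≈ 0.90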
Test Validity
Validity is concerned with whether the information obtained from an assessment permits the
teacher to make a correct decision about a student's learning; it concerns the appropriateness
of the score-based inferences or decisions made from the students' test results. Validity is the
extent to which a test measures what it is supposed to measure.
Types of Validity
1. Face Validity. It is the extent to which a measurement method appears “on its face” to
measure the construct of interest. Face validity is at best a very weak kind of evidence that a
measurement method is measuring what it is supposed to. One reason is that it is based on
people’s intuitions about human behaviour, which are frequently wrong. It is also the case that
many established measures in psychology work quite well despite lacking face validity.
2. Content Validity. A type of validation that refers to the relationship between the test and the
instructional objectives; it establishes that the test content measures what it is supposed to
measure. Things to remember about content validity:
a. The evidence of the content validity of a test is found in the Table of Specification.
b. This is the most important type of validity for a classroom teacher.
c. There is no coefficient for content validity. It is determined by experts judgmentally, not
empirically.
3. Criterion-related Validity. A type of validation that refers to the extent to which scores from a
test relate to theoretically similar measures. It is a measure of how accurately a student’s
current test score can be used to estimate a score on a criterion measure, like performance in
courses, classes or another measurement instrument. For example, the classroom reading
grades should indicate similar levels of performance as Standardized Reading test scores.
a. Concurrent validity. The criterion and the predictor data are collected at the same time.
This type of validity is appropriate for tests designed to assess a student's current criterion
status, or when you want to diagnose a student's status; it is a good diagnostic screening
test. It is established by correlating the criterion and the predictor using the Pearson
Product-Moment Correlation Coefficient or other statistical tools.
b. Predictive validity. A type of validation that refers to a measure of the extent to which a
student's current test result can be used to estimate accurately the outcome of the
student's performance at a later time. It is appropriate for tests designed to assess a
student's future status on a criterion. Regression analysis can be used to predict the
criterion from a single predictor or multiple predictors.
4. Construct Validity. A type of validation that refers to the measure of the extent to which a test
measures a theoretical, unobservable variable or quality, such as intelligence, math
achievement, or performance anxiety, over a period of time, on the basis of gathered evidence.
It is established through intensive study of the test or measurement instrument using
convergent/divergent validation and factor analysis. There are other ways of assessing
construct validity, such as the test's internal consistency, developmental change, and
experimental intervention.
a. Convergent validity is a type of construct validation wherein a test has a high
correlation with another test that measures the same construct.
b. Divergent validity is a type of construct validation wherein a test has a low correlation with
a test that measures a different construct. In this case, high validity occurs only
when there is a low correlation coefficient between the tests that measure different
traits.
c. Factor analysis assesses the construct validity of a test using complex statistical
procedures.
1. Validity refers to the decisions we make, and not to the test itself or to the measurement.
2. Like reliability, validity is not an all-or-nothing concept; it is never totally absent or absolutely
perfect.
3. A validity estimate, called a validity coefficient, refers to a specific type of validity. It ranges
between 0 and 1.
4. Validity can never be finally determined; it is specific to each administration of the test.
Validity Coefficient
The validity coefficient is the computed value of r_xy. In theory, the validity coefficient,
like the correlation coefficient, ranges from 0 to 1. In practice, most validity coefficients are
small: they usually range from 0.3 to 0.5, and few exceed 0.6 to 0.7. Hence, there is a lot of room
for improvement in most of our psychological measurements. Another way of interpreting the
findings is the squared correlation coefficient (r_xy)², called the coefficient of determination. The
coefficient of determination indicates how much variation in the criterion can be accounted for by
the predictor.
Example: Teacher James develops a 45-item test, and he wants to determine whether his test is
valid. He takes another test that is already acknowledged for its validity and uses it as the
criterion. He administered the two tests to his 15 students. The following table
shows the results of the two tests. Is the test valid? Find the validity coefficient using
the Pearson r and the coefficient of determination.
r = (250530 − 236220) / √[(232305 − 216225)(272430 − 258064)]
r = 14310 / √[(16080)(14366)]
r = 14310 / √231005280
r = 14310 / 15198.85785
r = 0.94

Coefficient of determination = r² = (0.94)² = 0.8836 or 88.36%
Interpretation: The correlation coefficient is 0.94, which means that the validity of the test is high,
or 88.36% of the variance in the students’ performance can be attributed to the test.
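The raw scores for this example are not reproduced in this copy of the module, so the sketch below back-calculates the summary sums from the worked solution (n = 15, Σx = 465, Σy = 508, Σxy = 16702, Σx² = 15487, Σy² = 18162 are derived values, an assumption):

from math import sqrt

# Summary statistics recovered from the worked solution above.
n = 15
sx, sy = 465, 508            # sums of the two sets of scores
sxy = 16702                  # sum of the cross-products
sx2, sy2 = 15487, 18162      # sums of squares

r = (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
print(round(r, 2))                   # 0.94 -> validity coefficient
print(round(round(r, 2) ** 2, 4))    # 0.8836 -> coefficient of determination (88.36%)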
1. A 25-item multiple-choice test in Physical Education with four options was recorded
below for item number 10. Listed are the numbers of students in the lower and upper
groups who answered A, B, C, and D.
Item 10             A    B    C    D*
Upper Group (27%)   4    5    2    9
Lower Group (27%)   6    4    5    5
2. Teacher Luis conducted a test with his 15 students in Science class twice, with a one-day
interval. The test given the second time is exactly the same test given the first time
it was conducted. The scores below were gathered in the first test (FT) and second test
(ST).
a. Using the test-retest method, is the test reliable? Show the complete solution
using the Pearson r and Spearman rho formulas.
Feedback
How was it working with this module? Were you exhausted by seeing so many terms, numbers,
and computations used in designing and developing assessments? I hope you were able to follow
the discussion in this module. Remember that in assessment, numbers and computations are
always involved. The results of the different formulas used for item analysis, reliability, and
validity testing are used for the interpretation of the test items, so it is necessary that you
know these computation processes. If you are having a hard time with some lessons, you can
always go back to the different topics and examples.
Summary
To aid you in reviewing the concepts in this module, here are the highlights:
Table of specification (TOS) is a chart or table that details the content and level of
cognitive domain assessed on a test as well as the types and emphases of test
items (Gareis and Grant, 2008).
Item analysis is a process of examining the students' responses to individual items in
the test. It consists of different procedures for assessing the quality of the test
items given to the students. Through the use of item analysis, we can identify which
of the given test items are good and which are defective.
Difficulty Index refers to the proportion of the number of students in the upper and
lower groups who answered an item correctly.
Discrimination Index is the power of the item to discriminate the students who know
the lesson and those who do not know the lesson.
Reliability refers to the consistency with which a test yields the same rank for
individuals who take the test more than once, that is, how consistent test results or
other assessment results are from one measurement to another.
Validity is the extent to which a test measures what it is supposed to measure.
Suggested Readings
If you want to learn more about the topics in this module, you may log on to the following
links:
https://content.schoolinsites.com/api/documents/a4734c1ff0b948828e25b66791054c3b.pdf
https://www.slideshare.net/RonaldQuileste/constructing-test-questions-and-the-table-of-specifications-tos
https://www.yourarticlelibrary.com/statistics-2/teacher-made-test-meaning-features-and-uses-statistics/92607
https://www.slideshare.net/tamlinares/sound20-design2028ch204-729
https://opentextbc.ca/researchmethods/chapter/reliability-and-validity-of-measurement/
References
Burton, S. J., Sudweeks, R. E., Merrill, P. F., & Wood, B. (1991). How to prepare better multiple-choice test items: Guidelines for university faculty. Retrieved from http://testing.byu.edu/info/handbooks/betteritems.pdf
Gabuyo, Y. A. (2012). Assessment of learning I. Rex Book Store, Inc., Manila, Philippines.