Item Writing

The document outlines the steps in developing test items and discusses different item formats including dichotomous, polychotomous, Likert, checklists and Q-sorts. It details the advantages and disadvantages of each format and provides guidance on writing clear, unbiased items at an appropriate reading level for test takers.

Uploaded by

azailelleon

ITEM WRITING AND ITEM FORMATS

Objectives

1. To outline the steps taken in developing test items.

2. To discuss the different item formats (dichotomous, polychotomous, Likert, category, checklist and Q-sort) and their advantages and disadvantages.

Introduction

Items are the specific questions or problems that make up a test (Kaplan & Saccuzzo, 2009).

An item is a specific stimulus to which a person responds overtly (i.e. in a way that can be observed) and which can be scored or evaluated, for example on a scale or as a grade. A grade of 75% on a 100-item test means the individual answered 75 items correctly.

A test is a measurement device or technique used to quantify behaviour or to help in understanding and predicting behaviour. It can also be described as a collection of items.

Item Writing

Item writing involves a number of steps:

1. Define clearly what you want to measure.

Most often, it will be in one of these areas:

- A type of cognitive achievement. This can be either knowledge or a skill. An example of knowledge is ‘knowledge of Ugandan history’; an example of a skill is ‘demonstration of an ability to multiply decimals’.
- A type of affective trait, for example interest in psychology.

The items should be made as specific as possible.

2. Generate an item pool

The item developer should take care in selecting and developing items, and should avoid redundant items.

To end up with the required number of items, one may need to write 3-4 items for each item wanted in the final test. For example, if you wish to have 20 items in your test, you may generate a pool of 60-80 items.
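
The pool-size rule of thumb can be sketched in a few lines of Python. The function name `item_pool_range` and its parameters are illustrative assumptions, not part of any standard library; the ratios of 3 and 4 are the ones stated in the text.

```python
def item_pool_range(final_items: int, low_ratio: int = 3, high_ratio: int = 4) -> tuple[int, int]:
    """Return the (smallest, largest) suggested pool size for a test.

    Assumes the rule of thumb from the text: write 3-4 draft items
    for every item wanted in the final test.
    """
    if final_items <= 0:
        raise ValueError("final_items must be a positive integer")
    return final_items * low_ratio, final_items * high_ratio


# A 20-item test calls for a pool of 60 to 80 draft items.
print(item_pool_range(20))  # (60, 80)
```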

3. Avoid exceptionally long items.

Exceptionally long items tend to be misleading or confusing, so they should be avoided.

4. Keep the level of reading difficulty appropriate for those who will complete the scale.

It is important to be mindful of the reading level of the targeted test takers. For example, if the item developer is writing for nursery school children, the items should be within the reading ability of those children. If this is not done, the test takers will not understand the items and will therefore fail the test.

5. Avoid double-barrelled items that convey two or more ideas at the same time.

Double-barrelled items may confuse test takers, who may be unable to decide whether to agree or disagree with the statement. This will eventually affect the results of the test. An example is:

Indicate whether you agree or disagree with the statement:
“I vote NRM because I support Universal Secondary Education”.

This item combines two different statements: “I vote NRM” and “I support Universal Secondary Education”. Someone can agree with one but not the other.

6. Consider mixing positively and negatively worded items.

At times, test takers develop an ‘acquiescence response set’, in which they tend to respond positively to all items. To counter this bias, you may include items that are worded in the opposite direction.

For example:

“I feel tired”.

“I feel energised”.

7. When writing test items, you need to be sensitive to cultural and ethnic differences.

For example, if you are writing items for a religious population, it may not be appropriate to write items reflecting behaviours that may be offensive to them, such as drinking alcohol or eating foods that are taboo to them.

8. It is important to realise that items become obsolete. When they become obsolete, they lose reliability.

When items are used over a long period, they tend to lose reliability. Hence the need to check that they are still reliable at the point when they are to be used.

Other general guidelines for item writing

- “All of the Above” should not be an answer option
- “None of the Above” should not be an answer option
- All answer options should be credible
- Order of answer options should be logical or vary
- Items should cover important concepts and objectives
- Negative wording should not be used
- Answer options should include only one correct answer
- Specific determiners (e.g. always, never) should not be used
- Answer options should be homogeneous
- The correct answer option should not be the longest answer option
- Items should be independent of each other
- Test copies should be clear, readable and not hand-written

Item Formats

Different item formats are used for different purposes. The format used for evaluating attitudes may not be the one used for assessing personality. Each format is chosen on the basis of its particular pros and cons.

a. Dichotomous Format

This format offers two alternatives for each item. A point is awarded when the test taker selects one particular alternative, the keyed answer.

A common dichotomous test is the True-False examination. The test taker’s task is to mark each item as either true or false, but not both.

Other response pairs used in this format include “Yes” or “No”.

An example of dichotomous items:

No.  Item                                                True   False
1    Tough managers produce the best-performing teams
2    Teamwork encourages social loafing
3    Introverts perform tasks better as individuals
4    I often worry about my reading ability
Advantages of the Dichotomous Format

1. It is easy to construct and administer.

2. It is easily scored. The tester only needs to count the number of correct items to get the score.

3. True-false items require absolute judgement. The test taker cannot choose anything in between.
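
The scoring step described in the second advantage can be sketched as follows. This is a minimal illustration, assuming responses and the answer key are stored as lists of booleans in item order; the function name `score_dichotomous`, the key and the responses are all hypothetical.

```python
def score_dichotomous(responses: list[bool], key: list[bool]) -> int:
    """Count the items on which the response matches the keyed answer."""
    if len(responses) != len(key):
        raise ValueError("responses and key must cover the same items")
    return sum(response == keyed for response, keyed in zip(responses, key))


# Hypothetical answer key and one test taker's answers for a 4-item test.
key = [True, False, True, True]
responses = [True, True, True, False]

print(score_dichotomous(responses, key))  # 2 (items 1 and 3 match the key)
```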

Disadvantages

1. They encourage students to memorise material, so that students may pass the test even when they have not really understood the concepts.

2. Dichotomous items tend to be less reliable than other item formats. This is because every item offers a 50% chance of being answered correctly by guessing alone. It is easy for a test taker to guess a correct answer without understanding the content of the item.

b. The Polychotomous Format (Polytomous)

This resembles the dichotomous format, except that it has more than two alternatives.

A point is given for selecting the correct alternative, and none for selecting any other choice.

In a polychotomous examination, the test taker has to determine which alternative is correct. The incorrect alternatives are called distractors.

According to psychometric theory, adding more distractors increases the reliability of the item, and it is usually best to have 3-4 distractors for this purpose. However, poorly written distractors may lower the quality of the test.

Whereas a dichotomous item gives a 50% chance of success by guessing, in the polychotomous format the chance of success depends on the number of choices available per item. If there are four choices, the chance of a correct guess is one out of four, which is 25%. If there are three choices, the chance is one out of three, which is 33.3%.

Some test takers can therefore get items correct simply by guessing, even if they have not studied the subject matter.
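
The relationship between the number of choices and the chance of a correct guess can be sketched as below. This is an illustration only: it assumes that every choice is equally likely to be picked by a guessing test taker, and the function names are hypothetical.

```python
from fractions import Fraction


def chance_of_correct_guess(n_choices: int) -> Fraction:
    """Probability of guessing one item correctly, assuming equally likely choices."""
    if n_choices < 2:
        raise ValueError("an item needs at least two choices")
    return Fraction(1, n_choices)


def expected_chance_score(n_items: int, n_choices: int) -> Fraction:
    """Expected number of correct answers when every item is guessed."""
    return n_items * chance_of_correct_guess(n_choices)


# Compare dichotomous (2 choices) with polychotomous (3 and 4 choices) items.
for n_choices in (2, 3, 4):
    p = chance_of_correct_guess(n_choices)
    print(f"{n_choices} choices: {float(p):.1%} chance per item")
```

Under this assumption, a test taker who guesses on all 100 items of a four-choice test would be expected to score `expected_chance_score(100, 4)`, i.e. 25.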

Because of guessing, a correction for guessing is applied. The formula to correct for guessing on a test is:

$$\text{Corrected score} = R - \frac{W}{n-1}$$

Where R = the number of right responses

W = the number of wrong responses

n = the number of choices for each item

Take the example of a 100-item test with 4 choices per item, on which the test taker guesses every answer. The expected score from guessing alone is a quarter of the 100 items, so R is expected to be 25, the number of wrong responses is W = 100 - 25 = 75, and n = 4.

Using the formula above:

$$\text{Corrected score} = 25 - \frac{75}{4-1} = 25 - \frac{75}{3} = 25 - 25 = 0$$

So when the correction for guessing is applied, the corrected score is 0.

An example:

Mukiibi took a psychological test with 100 items, each with four answer choices. He answered 88 items correctly and was pronounced to have passed the test. What is Mukiibi’s score after correction for guessing?

Here R = 88, W = 12 and n = 4, so from the formula:

$$\text{Corrected score} = 88 - \frac{12}{4-1} = 88 - \frac{12}{3} = 88 - 4 = 84$$

So Mukiibi’s corrected score is 84.

Omitted items are not included in the calculation. They earn neither credit nor penalty.

The expression W/(n-1) is an estimate of the number of responses the test taker is expected to have got right by chance.
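
The correction for guessing can be sketched as a small Python function that reproduces the two worked examples. The function name `corrected_score` and its parameter names are illustrative assumptions; omitted items are simply not passed in, since they carry neither credit nor penalty.

```python
def corrected_score(right: int, wrong: int, n_choices: int) -> float:
    """Apply the correction for guessing: R - W / (n - 1).

    right     -- number of right responses (R)
    wrong     -- number of wrong responses (W); omitted items are excluded
    n_choices -- number of choices for each item (n)
    """
    if n_choices < 2:
        raise ValueError("each item needs at least two choices")
    if right < 0 or wrong < 0:
        raise ValueError("response counts cannot be negative")
    return right - wrong / (n_choices - 1)


# Pure guessing on 100 four-choice items: 25 right, 75 wrong.
print(corrected_score(25, 75, 4))  # 0.0

# Mukiibi: 88 right, 12 wrong, four choices per item.
print(corrected_score(88, 12, 4))  # 84.0
```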

Advantages of the Polychotomous Format

- It takes little time for test takers to respond, since they do not write out the answers. Hence one can respond to a large number of items in a short time.
- The tests are easy to score. The tester only counts the correct items to get the score.

Disadvantages

- It may be easy to guess, so a correct answer may be selected by chance.

c. The Likert Format

This format requires the respondent to indicate the degree of agreement with a particular attitudinal statement.

It is very popular in personality and attitude scales.

The scale is non-comparative and measures only a single trait. The respondent is asked to indicate their level of agreement with a given statement on an ordinal scale.

It is commonly expressed as a four-, five- or six-point scale. A five-point scale, for example, ranges over Strongly Agree, Agree, Neutral, Disagree and Strongly Disagree. A scale with an even number of points has no neutral midpoint, so the respondent cannot remain neutral.

An example of a six-point scale:

No.  Item                          Strongly  Moderately  Mildly    Mildly  Moderately  Strongly
                                   Disagree  Disagree    Disagree  Agree   Agree       Agree
1    I am afraid of caterpillars
2    I love snakes
3    I fear cats
4    I do not like centipedes

Likert scales are sometimes referred to as summative scales because the responses to related items can be summed to create a score for a group of statements.

Scoring requires that each negatively worded item be reverse scored before the responses are summed.
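
Reverse scoring and summing can be sketched as follows. This is a minimal illustration under stated assumptions: responses are coded 1 (Strongly Disagree) to 6 (Strongly Agree), the scale is taken to measure fear of small animals, and “I love snakes” is therefore treated as the item worded in the opposite direction. The item labels, the responses and the function name `likert_total` are all hypothetical.

```python
def likert_total(responses: dict[str, int], reverse_keyed: set[str], n_points: int = 6) -> int:
    """Sum Likert responses after reverse scoring the reverse-keyed items.

    A response r on an n-point scale coded 1..n is reversed as (n + 1) - r,
    so 1 becomes n, 2 becomes n - 1, and so on.
    """
    total = 0
    for item, response in responses.items():
        if not 1 <= response <= n_points:
            raise ValueError(f"response to {item!r} must be between 1 and {n_points}")
        if item in reverse_keyed:
            response = n_points + 1 - response
        total += response
    return total


# Hypothetical responses from one respondent on the six-point scale.
responses = {
    "afraid_of_caterpillars": 5,
    "love_snakes": 2,  # worded in the opposite direction, reversed to 5
    "fear_cats": 4,
    "dislike_centipedes": 6,
}

print(likert_total(responses, reverse_keyed={"love_snakes"}))  # 5 + 5 + 4 + 6 = 20
```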

Advantages

- It is easy to construct.
- It produces a highly reliable scale.
- It is easy for test takers to read and complete.

Weaknesses of this scale include:

- Central tendency bias: participants may avoid the extreme response categories.
- Acquiescence bias: participants may agree with statements as presented in order to please the tester.
- Social desirability bias: respondents may wish to portray themselves in a more favourable light rather than answer honestly.
- Validity may be difficult to demonstrate: the scale may not measure what the tester intended to measure.

d. The Category Format

It is similar to the Likert format but uses a greater number of choices and a defined point rating system.

Test takers are required to rate a given item or scenario on a scale within a category range. For example, one may use a scale of 1 to 5 or 1 to 10, where 1 is the lowest score and 5 or 10 is the highest score respectively.

The numbers assigned on the rating scale are sometimes influenced by the context in which the items are rated.

The number of categories used depends on how fine a discrimination the test takers are willing to make. The finer the discrimination required, the more categories are used.

An example:

1. On a scale of 1 to 5, rate Bazalaki’s attitude towards class assignments (where 1 is very negative and 5 is very positive).

2. On a scale of 1 to 10, rate the level of academic excellence of Makerere University (where 1 is very ordinary and 10 is very competitive).

Advantages

- It is very easy to administer.

Disadvantages

- It does not take into account the context in which the subject is being rated. For example, in a class of average-performing students, a student may be rated 9, which represents a very good performer. Yet if the same student is placed in a class of only high-performing students, the same student may be rated 3, which represents a relatively poor performance.
- Test takers also tend to spread their responses evenly across the entire scale of 1 to 10, which may not fairly represent the actual score.

To overcome the problems above, the end points of the scale have to be clearly defined by outlining the characteristics expected at each point (Kaplan & Ernest, 1983).

For example, if one is rating the performance of students in a given class, a student who scores 10 must be one who:

- attends all classes
- contributes to every question asked in class
- solves problems fast
- assists others to complete their class work
- regularly passes class tests with over 80%.

The opposite characteristics would describe a student scoring 1.

e. Checklists

These are used in personality measurement.

The test taker is given a list of adjectives and asked to indicate whether each is characteristic of him/herself or of someone else.

For example:

Castro:

- Is a dependable person.
- Is a talkative individual.
- Behaves in a sympathetic or considerate manner.
- Appears to have a high degree of intellectual capacity.
- Is protective of those close to him.
- Tends to be self-defensive.
- Is thin-skinned; sensitive to anything that can be construed as criticism.

f. Q-sorts

A test taker is given a set of statements about a person’s characteristics and asked to sort them into a given number of piles, e.g. 5 or 9 piles.

The statements are sorted into piles that indicate the degree to which they appear to describe the person accurately.

For example, 100 statements about a person’s characteristics are written on cards, one statement per card, making 100 cards.

Piles numbered 1 to 9 are provided, and the test taker places each card on the pile number that best reflects how well its statement describes the person being studied. A rating of 9 means that the statement on the card is the best description of the person’s characteristics, while 1 means it is the least descriptive.

The cards are distributed across the 9 piles according to the test taker’s interpretation of the person being studied.

The frequency of cards placed on each pile is noted, and the statements that best describe the person under study are identified.

The observed results tend to follow a normal distribution. However, the items that lie at the extreme ends of the distribution say the most about the true personal characteristics of the subject.
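
Tallying a Q-sort can be sketched as below. This is an illustration only, assuming each statement's placement is recorded as a pile number from 1 (least descriptive) to 9 (most descriptive); the function names and the five sample placements are hypothetical, and a real sort would cover all the cards.

```python
from collections import Counter


def pile_frequencies(placements: dict[str, int], n_piles: int = 9) -> list[int]:
    """Return the number of cards on each pile, from pile 1 up to pile n_piles."""
    for statement, pile in placements.items():
        if not 1 <= pile <= n_piles:
            raise ValueError(f"{statement!r} is on pile {pile}, outside 1..{n_piles}")
    counts = Counter(placements.values())
    return [counts.get(pile, 0) for pile in range(1, n_piles + 1)]


def most_descriptive(placements: dict[str, int], n_piles: int = 9) -> list[str]:
    """Return the statements placed on the highest pile, in alphabetical order."""
    return sorted(s for s, pile in placements.items() if pile == n_piles)


# Hypothetical placements of five cards by one test taker.
placements = {
    "Is a dependable person": 9,
    "Is a talkative individual": 5,
    "Tends to be self-defensive": 1,
    "Is protective of those close to him": 9,
    "Is thin-skinned": 5,
}

print(pile_frequencies(placements))  # [1, 0, 0, 0, 2, 0, 0, 0, 2]
print(most_descriptive(placements))
```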

Conclusion

If the items are written carefully, they will help in the assessment of the subject and give accurate results to the tester.

References

- Kaplan, R. M., & Saccuzzo, D. P. (2009). Psychological Testing: Principles, Applications, and Issues.
- Crocker, L., & Algina, J. (2008). Introduction to Classical and Modern Test Theory.
- Suen, H. K., & McClellan, S. (2003). Test item construction techniques and principles.
