
Teaching & Learning Seminar Series

College of Health Sciences


Midwestern University

Interpretation of Discrimination Data from Multiple-Choice Test Items


Understanding how to interpret three useful statistics concerning your students' multiple-choice test scores
will help you construct well-designed tests and improve instruction.

1. Item Difficulty (P): the percentage of students who correctly answered an item.
 Also referred to as the p-value
 Ranges from 0% to 100%, or more typically written as a proportion 0.00 to 1.00
 The higher the value, the easier the item
 P-values above 0.90 indicate very easy items that you should not reuse on subsequent tests. If almost
all students responded correctly, the item probably addresses a concept that is not worth testing.
 P-values below 0.20 indicate very difficult items. If almost all students responded incorrectly, either
the item is flawed or students did not understand the concept. Consider revising confusing
language, removing the item from subsequent tests, or targeting the concept for re-instruction.
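
As a quick illustration, item difficulty can be computed directly from a scored response matrix. The sketch below is not part of the original handout; the data and array names are made up, and it assumes NumPy is available.

import numpy as np

# Rows are students, columns are items; 1 = correct, 0 = incorrect (made-up data).
scores = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [1, 1, 0, 1],
])

# Item difficulty (P) is the proportion of students who answered each item correctly.
p_values = scores.mean(axis=0)
print(p_values)  # one proportion per item; here 0.8, 0.8, 0.0, 1.0

# Flag items against the guidelines above (P > 0.90 very easy, P < 0.20 very difficult).
for i, p in enumerate(p_values, start=1):
    if p > 0.90:
        print(f"Item {i}: very easy (P = {p:.2f}); consider dropping from future tests")
    elif p < 0.20:
        print(f"Item {i}: very difficult (P = {p:.2f}); review wording or re-teach the concept")
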
For maximum discrimination potential, desirable difficulty levels are slightly higher than midway between
chance (1.00 divided by the number of choices) and perfect scores (1.00) for an item:

ITEM FORMAT AND RESPECTIVE IDEAL DIFFICULTY

 Five-response multiple-choice .60
 Four-response multiple-choice .63
 Three-response multiple-choice .66
 True-false (two-response multiple-choice) .75
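
The values above are essentially the midpoints just described; a quick check (illustrative code, not part of the handout):

def ideal_difficulty(num_choices: int) -> float:
    """Midway between chance (1.00 / num_choices) and a perfect score (1.00)."""
    chance = 1.00 / num_choices
    return (chance + 1.00) / 2

for k in (5, 4, 3, 2):
    print(f"{k}-response item: midway point = {ideal_difficulty(k):.3f}")
# 5 -> 0.600, 4 -> 0.625, 3 -> 0.667, 2 -> 0.750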

2. Point-Biserial Correlation Coefficient (PBCC) for Item Discrimination, or R(IT): the relationship between how well students performed on the item and their total test score.
 Ranges from -1.00 to +1.00
 The higher the value, the more discriminating the item
 A highly discriminating item indicates that students with high test scores responded correctly
whereas students with low test scores responded incorrectly.
Remove items with discrimination values near or less than zero, because such values indicate that students who
performed poorly on the test did better on the item than students who performed well on the test. The
item is confusing your better-scoring students in some way.
EVALUATE ITEMS USING FOUR GUIDELINES FOR CLASSROOM TEST
DISCRIMINATION VALUES:
 0.40 or higher very good items
 0.30 to 0.39 good items
 0.20 to 0.29 fairly good items
 0.19 or less poor items
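
A minimal sketch of this item-total relationship, computed here as the correlation between each item and the total of the remaining items (the corrected item-total form); the response data are made up, and NumPy is assumed:

import numpy as np

# Rows are students, columns are items; 1 = correct, 0 = incorrect (made-up data).
scores = np.array([
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0],
])

total = scores.sum(axis=1)

for i in range(scores.shape[1]):
    item = scores[:, i]
    rest = total - item  # exclude the item itself so it does not inflate the correlation
    r_it = np.corrcoef(item, rest)[0, 1]
    if r_it >= 0.40:
        label = "very good"
    elif r_it >= 0.30:
        label = "good"
    elif r_it >= 0.20:
        label = "fairly good"
    else:
        label = "poor; consider removing"
    print(f"Item {i + 1}: R(IT) = {r_it:.2f} ({label})")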

3. Reliability Coefficient (ALPHA): a measure of the amount of error associated with a test score.
 Ranges from 0.00 to 1.00
 The higher the value, the more reliable the test score
 Typically, a measure of internal consistency, indicating how well items are correlated with one
another
 High reliability indicates that items are measuring the same construct (e.g., knowledge of how to
calculate integrals)
 Two ways to improve test reliability: 1) increase the number of items or 2) use items with high
discrimination values
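
For tests scored 0/1, the usual internal-consistency estimate is Cronbach's alpha, which reduces to KR-20 for dichotomous items. A minimal sketch with made-up data (not from the handout):

import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a students-by-items score matrix (equals KR-20 for 0/1 items)."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Rows are students, columns are items; 1 = correct, 0 = incorrect (made-up data).
scores = np.array([
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")
# Adding more items or replacing low-discrimination items raises this value.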

RELIABILITY INTERPRETATION
 .90 and above: Excellent reliability; at the level of the best standardized tests
 .80 - .90: Very good for a classroom test
 .70 - .80: Good for a classroom test; in the range of most classroom tests. There are probably a few
items that could be improved.
 .60 - .70: Somewhat low. This test should be supplemented by other measures to determine grades.
There are probably some items that could be improved.
 .50 - .60: Suggests need to revise the test, unless it is quite short (ten or fewer items). The test must
be supplemented by other measures for grading.
 .50 or below: Questionable reliability. This test should not contribute heavily to the course grade, and
it needs revision.

DISTRACTOR EVALUATION
Another useful item review technique is distractor evaluation.
You should consider each distractor an important part of an item: nearly 50 years of research shows a
relationship between the distractors students choose and total test score. The quality of the distractors
influences student performance on a test item.
Although correct answers must be truly correct, it is just as important that distractors be clearly incorrect,
appealing to low scorers who have not mastered the material rather than to high scorers. You should review
all item options to anticipate potential errors of judgment and inadequate performance so you can revise,
replace, or remove poor distractors.
One way to study responses to distractors is with a frequency table that shows the proportion of students
who selected each distractor. Remove or replace distractors selected by few or no students, because
students find them implausible.
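
A minimal sketch of such a frequency table; the option letters, responses, and answer key below are made up for illustration:

from collections import Counter

# Choices made by each student on one item (made-up data); "C" is the keyed answer.
responses = ["C", "C", "B", "C", "A", "C", "C", "B", "C", "D", "C", "C"]
key = "C"

counts = Counter(responses)
n = len(responses)
for option in sorted(counts):
    flag = " (key)" if option == key else ""
    print(f"{option}{flag}: {counts[option]:2d} ({counts[option] / n:.0%})")
# Distractors chosen by few or no students (here, A and D) are candidates for
# revision or replacement because students find them implausible.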

CAUTION WHEN INTERPRETING ITEM ANALYSIS RESULTS


Mehrens and Lehmann (1973) offer three cautions about using the results of item analysis:
 Item analysis data are not synonymous with item validity. An external criterion is required to
accurately judge the validity of test items. By using the internal criterion of total test score, item
analyses reflect internal consistency of items rather than validity.
 The discrimination index is not always a measure of item quality. There are a variety of reasons why
an item may have low discrimination power:
o extremely difficult or easy items will have low ability to discriminate, but such items are often needed to
adequately sample course content and objectives.
o an item may show low discrimination if the test measures many content areas and cognitive skills. For
example, if the majority of the test measures "knowledge of facts," then an item assessing "ability to apply
principles" may have a low correlation with total test score, yet both types of items are needed to measure
attainment of course objectives.
 Item analysis data are tentative. Such data are influenced by the type and number of students being
tested, instructional procedures employed, and chance errors. If repeated use of items is possible,
statistics should be recorded for each administration of each item.

Standards of Acceptance

 Item difficulty: 30% - 90%
 Item discrimination ratio: 25% and above
 PBCC: 0.20 and above
 KR-20: 0.70 and above
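
A short sketch of screening items against these standards; the item numbers and statistics are hypothetical:

# Hypothetical per-item statistics from an item-analysis report.
items = {
    1: {"difficulty": 0.95, "pbcc": 0.05},
    2: {"difficulty": 0.62, "pbcc": 0.41},
    3: {"difficulty": 0.25, "pbcc": 0.18},
}

def check_item(stats: dict) -> list[str]:
    """Return the acceptance standards (listed above) that an item fails."""
    problems = []
    if not 0.30 <= stats["difficulty"] <= 0.90:
        problems.append(f"difficulty {stats['difficulty']:.2f} outside 0.30-0.90")
    if stats["pbcc"] < 0.20:
        problems.append(f"PBCC {stats['pbcc']:.2f} below 0.20")
    return problems

for item_id, stats in items.items():
    issues = check_item(stats)
    print(f"Item {item_id}: {'meets standards' if not issues else '; '.join(issues)}")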

Activity: Item Analysis

Review the item statistics for the exams and answer the following questions when possible:

Which item(s) would you remove altogether from the test? Why?
Which distractor(s) would you revise? Why?
Which items are working well?
What does the pattern of responses for the correct and incorrect alternatives across the various splits tell
you about the item?
How can you use the frequency counts (and related percentages) to determine how the class did as a
whole?
How would you use the standard scores to compare 1) the same students across different tests or 2) the
overall scores between tests?
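
For the standard-score question, a z-score puts results from tests with different scales on a common footing; a minimal sketch with made-up class scores:

import statistics

# Made-up raw scores for one class on two exams with different point totals.
exam1 = [72, 65, 80, 58, 90, 77, 69]
exam2 = [40, 35, 44, 30, 48, 42, 38]

def z_score(x: float, scores: list[float]) -> float:
    """How many standard deviations x lies above the class mean."""
    return (x - statistics.mean(scores)) / statistics.stdev(scores)

# A raw 77 on exam 1 and a raw 42 on exam 2 become directly comparable once standardized.
print(f"Exam 1: z = {z_score(77, exam1):.2f}")
print(f"Exam 2: z = {z_score(42, exam2):.2f}")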

References
The University of Texas at Austin, Faculty Innovation Center. https://facultyinnovate.utexas.edu/. Last
accessed November 2016.
DeVellis, R. F. (1991). Scale development: Theory and applications. Newbury Park, CA: Sage Publications.
Haladyna, T. M. (1999). Developing and validating multiple-choice test items (2nd ed.). Mahwah, NJ:
Lawrence Erlbaum Associates.
Lord, F. M. (1952). The relationship of the reliability of multiple-choice tests to the distribution of item
difficulties. Psychometrika, 18, 181-194.
Mehrens, W. A., & Lehmann, I. J. (1973). Measurement and evaluation in education and psychology. New
York: Holt, Rinehart and Winston, pp. 333-334.
Suen, H. K. (1990). Principles of test theories. Hillsdale, NJ: Lawrence Erlbaum Associates.
