Item Analysis and Validation
VALIDATION
WHAT IS ITEM ANALYSIS?
PHASES
• TRY-OUT PHASE
• ITEM ANALYSIS PHASE (LEVEL OF DIFFICULTY)
• ITEM REVISION PHASE
ITEM ANALYSIS
TWO CHARACTERISTICS
(a) ITEM DIFFICULTY
- "THE DIFFICULTY VALUE OF AN ITEM IS DEFINED AS THE PROPORTION OR PERCENTAGE
OF THE EXAMINEES WHO HAVE ANSWERED THE ITEM CORRECTLY."
- J.P. GUILFORD
(b) DISCRIMINATION INDEX
- "THE INDEX OF DISCRIMINATION IS THAT ABILITY OF AN ITEM ON THE BASIS OF WHICH
THE DISCRIMINATION IS MADE BETWEEN SUPERIORS AND INFERIORS."
- BLOOD AND BUDD (1972)
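• ITEM DIFFICULTY = NUMBER OF STUDENTS WHO ANSWERED THE ITEM CORRECTLY / TOTAL NUMBER OF STUDENTS
HERE THE TOTAL NUMBER OF STUDENTS IS 100 AND 75 OF THEM ANSWERED THE ITEM
CORRECTLY; HENCE, THE ITEM DIFFICULTY INDEX IS 75/100 OR 75%.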
Range of difficulty index interpretation action
• The proportion of the total group who got the item right.
• FORMULA: P = (R / T) × 100
P – percentage who answered the item correctly (index of difficulty)
R – number who answered the item correctly
T – the total number who tried the item
• Example: P = (R / T) × 100 = 40%
The smaller the percentage figure the more difficult the item.
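The formula P = (R / T) × 100 can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure; the function name `difficulty_index` and its parameter names are assumptions made for this example, and the input values reuse the 75-out-of-100 case mentioned earlier.

```python
def difficulty_index(num_correct: int, num_tried: int) -> float:
    """Return P = (R / T) * 100, the percentage who answered the item correctly.

    num_correct is R and num_tried is T; both names are hypothetical.
    """
    if num_tried <= 0:
        raise ValueError("num_tried (T) must be positive")
    if not 0 <= num_correct <= num_tried:
        raise ValueError("num_correct (R) must be between 0 and num_tried (T)")
    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    return num_correct * 100 / num_tried


if __name__ == "__main__":
    # 75 of 100 students answered correctly.
    print(difficulty_index(75, 100))  # 75.0
```

A lower return value means fewer examinees answered correctly, so the item is more difficult.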
Estimate the item discriminating power (D) using the discrimination formula;
in the worked example the result is D = 0.40.
The discriminating power of an item is reported as a decimal
fraction; maximum discriminating power is indicated by an index of
1.00
Maximum discrimination is usually found at the 50 percent level of
difficulty.
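The discrimination formula itself did not survive in the text above, so the sketch below assumes the upper-lower group form that is common in measurement textbooks: D = (RU − RL) / (T / 2), where RU and RL are the numbers of examinees in the upper and lower groups who answered correctly, and T is the total number who tried the item. The function name, parameter names, and input values are hypothetical and chosen only for illustration.

```python
def discrimination_index(upper_correct: int, lower_correct: int, total_tried: int) -> float:
    """Return D = (RU - RL) / (T / 2) under the assumed upper-lower group formula.

    Assumes the upper and lower groups are equal in size (T / 2 each).
    """
    if total_tried <= 0 or total_tried % 2 != 0:
        raise ValueError("total_tried (T) must be a positive even number")
    half = total_tried // 2
    if not (0 <= upper_correct <= half and 0 <= lower_correct <= half):
        raise ValueError("group counts must be between 0 and T / 2")
    return (upper_correct - lower_correct) / half


if __name__ == "__main__":
    # Illustrative values only: 6 correct in the upper group, 2 in the lower, 20 examinees.
    print(discrimination_index(6, 2, 20))   # 0.4
    # Everyone in the upper group and no one in the lower group answers correctly.
    print(discrimination_index(10, 0, 20))  # 1.0
```

The second call shows the maximum index of 1.00. In that case exactly half of all examinees answered correctly, which matches the note that maximum discrimination is usually found at the 50 percent level of difficulty.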
0.00 – 0.20 = very difficult
0.21 – 0.80 = moderately difficult
0.81 – 1.00 = very easy
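The three ranges above can be applied with a small lookup function. This is a sketch under stated assumptions: the index is passed as a proportion between 0 and 1, values that fall between the listed bands (for example 0.205) are assigned to the next band up, and the function name `interpret_difficulty` is hypothetical.

```python
def interpret_difficulty(p: float) -> str:
    """Map a difficulty index (proportion answering correctly) to its label."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("difficulty index must be between 0.00 and 1.00")
    if p <= 0.20:
        return "very difficult"
    if p <= 0.80:
        return "moderately difficult"
    return "very easy"


if __name__ == "__main__":
    for value in (0.10, 0.75, 0.90):
        print(value, "=", interpret_difficulty(value))
```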
Validation
The extent to which a test measures what it purports to measure; it also
refers to the appropriateness, correctness, meaningfulness and usefulness of
the specific decisions a teacher makes based on the test results.
High 20 10 5
Average 10 25 5
Low 1 10 14
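One way to read a table of counts like the one above is to convert each row to percentages. The column labels were lost from the table, so this sketch assumes only that each row is a group (High, Average, Low) and each column is one of three unnamed outcome categories. The names `table` and `row_percentages` are hypothetical.

```python
# Counts copied from the table above; the three columns are unlabeled in the source.
table = {
    "High": [20, 10, 5],
    "Average": [10, 25, 5],
    "Low": [1, 10, 14],
}


def row_percentages(counts: list[int]) -> list[float]:
    """Convert one row of counts to percentages of that row's total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("row has no observations")
    return [round(100 * c / total, 1) for c in counts]


if __name__ == "__main__":
    for group, counts in table.items():
        print(group, row_percentages(counts))
    # High [57.1, 28.6, 14.3]
    # Average [25.0, 62.5, 12.5]
    # Low [4.0, 40.0, 56.0]
```

Under these assumptions, most of the High group falls in the first column and most of the Low group falls in the last.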
RELIABILITY
Refers to the consistency of the scores
obtained.
RELIABILITY INTERPRETATION
.90 and above = Excellent reliability; at the level of the best standardized tests.
.80 - .90 = Very good for a classroom test.
.70 - .80 = Good for a classroom test; in the range of most. There are probably a few items which could be improved.
.60 - .70 = Somewhat low. This test needs to be supplemented by other measures (e.g., more tests) to determine grades. There are probably some items which could be improved.
.50 - .60 = Suggests need for revision of test, unless it is quite short (ten or fewer items). The test definitely needs to be supplemented by other measures (e.g., more tests) for grading.
.50 or below = Questionable reliability. This test should not contribute heavily to the course grade,