
ITEM ANALYSIS AND VALIDATION
WHAT IS ITEM ANALYSIS?

- It is a statistical technique used for selecting and rejecting the items of a test on the basis of their difficulty value and discriminating power.
PURPOSE OF ITEM ANALYSIS

• Evaluates the quality of each item.
• Rationale: the quality of the items determines the quality of the test (i.e., its reliability and validity).
• May suggest ways of improving the measurement of a test.
• Can help with understanding why certain tests predict some criteria but not others.
DRAFT

• SUBJECTED TO ITEM ANALYSIS AND VALIDATION

PHASES

• TRY-OUT PHASE
• ITEM ANALYSIS PHASE (LEVEL OF DIFFICULTY)
• ITEM REVISION PHASE
ITEM ANALYSIS

TWO CHARACTERISTICS

(A) ITEM DIFFICULTY
- "THE DIFFICULTY VALUE OF AN ITEM IS DEFINED AS THE PROPORTION OR PERCENTAGE OF THE EXAMINEES WHO HAVE ANSWERED THE ITEM CORRECTLY."
- J.P. GUILFORD

(B) DISCRIMINATION INDEX
- "INDEX OF DISCRIMINATION IS THAT ABILITY OF AN ITEM ON THE BASIS OF WHICH THE DISCRIMINATION IS MADE BETWEEN SUPERIORS AND INFERIORS."
- BLOOD AND BUDD (1972)
• ITEM DIFFICULTY = (NUMBER OF EXAMINEES WHO ANSWERED THE ITEM CORRECTLY ÷ TOTAL NUMBER OF EXAMINEES) × 100

• THE ITEM DIFFICULTY IS USUALLY EXPRESSED AS A PERCENTAGE.

EXAMPLE:
WHAT IS THE ITEM DIFFICULTY INDEX OF AN ITEM IF 25 STUDENTS ARE UNABLE TO
ANSWER IT CORRECTLY WHILE 75 ANSWERED IT CORRECTLY?

HERE THE TOTAL NUMBER OF STUDENTS IS 100, HENCE, THE ITEM DIFFICULTY
INDEX IS 75/100 OR 75%.
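
To make the computation above concrete, here is a minimal Python sketch; the function name item_difficulty is illustrative only and not part of the document.

def item_difficulty(num_correct, total_examinees):
    """Difficulty index: percentage of examinees who answered the item correctly."""
    return 100.0 * num_correct / total_examinees

print(item_difficulty(75, 100))  # 75.0, i.e. 75%, matching the example above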
Range of difficulty index    Interpretation      Action
0 – 0.25                     Difficult           Revise or discard
0.26 – 0.75                  Right difficulty    Retain
0.76 and above               Easy                Revise or discard


INDEX OF DISCRIMINATION

• TELLS WHETHER AN ITEM CAN DISCRIMINATE BETWEEN THOSE WHO KNOW AND THOSE WHO DO NOT KNOW THE ANSWER.
• THE DIFFICULTY INDEX IS COMPUTED SEPARATELY FOR:
  THE UPPER 25% OF THE CLASS
  THE LOWER 25% OF THE CLASS

INDEX OF DISCRIMINATION = DU – DL

• EXAMPLE: OBTAIN THE INDEX OF DISCRIMINATION OF AN ITEM IF THE UPPER 25% OF THE CLASS HAD A DIFFICULTY INDEX OF 0.60 (I.E., 60% OF THE UPPER 25% GOT THE CORRECT ANSWER) WHILE THE LOWER 25% OF THE CLASS HAD A DIFFICULTY INDEX OF 0.20.

DU = 0.60 while DL = 0.20, thus index of discrimination = 0.60 – 0.20 = 0.40.
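
The same computation in a short Python sketch (the function name is illustrative only):

def discrimination_index(du, dl):
    """Index of discrimination: upper-group difficulty minus lower-group difficulty."""
    return du - dl

print(discrimination_index(0.60, 0.20))  # 0.4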


Index Range      Interpretation                              Action
-1.0 – -0.50     Can discriminate but item is questionable   Discard
-0.55 – 0.45     Non-discriminating                          Revise
0.46 – 1.0       Discriminating item                         Include


INDEX OF DIFFICULTY

• The percentage of the total group who answered the item correctly.

FORMULA:  P = (Ru + Rl) / T × 100

where:
Ru – the number in the upper group who answered the item correctly
Rl – the number in the lower group who answered the item correctly
T – the total number who tried the item
INDEX OF ITEM DISCRIMINATION

• FORMULA:  P = R / T × 100

P – percentage who answered the item correctly (index of difficulty)
R – number who answered the item correctly
T – the total number who tried the item

• EXAMPLE: P = R / T × 100 = 40%

The smaller the percentage figure, the more difficult the item.

Estimate the item discriminating power using the formula:

D = (Ru – Rl) / (½T) = 0.40

The discriminating power of an item is reported as a decimal fraction; maximum discriminating power is indicated by an index of 1.00.
Maximum discrimination is usually found at the 50 percent level of difficulty.
0.00 – 0.20 = very difficult
0.21 – 0.80 = moderately difficult
0.81 – 1.00 = very easy
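
A minimal Python sketch of the two formulas above. The counts used here (6 of 10 correct in the upper group, 2 of 10 in the lower group, so T = 20) are assumptions chosen to reproduce the 40% and 0.40 figures shown above; they are not given in the document.

def index_of_difficulty(ru, rl, t):
    """P = (Ru + Rl) / T * 100 : percentage of the group answering correctly."""
    return 100.0 * (ru + rl) / t

def discriminating_power(ru, rl, t):
    """D = (Ru - Rl) / (T/2) : difference between upper- and lower-group proportions."""
    return (ru - rl) / (t / 2)

print(index_of_difficulty(6, 2, 20))   # 40.0 -> 40%, moderately difficult
print(discriminating_power(6, 2, 20))  # 0.4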
Validation

Is the extent to which a test measures what it intends to measure, or refers to the appropriateness, correctness, meaningfulness and usefulness of the specific decisions a teacher makes based on the test results.

Purpose: to determine the characteristics of the whole test itself, namely, the validity and reliability of the test.
Three main types of evidence that may be collected:

Content-related evidence of validity – refers to the content and format of the instrument.

Criterion-related evidence of validity – refers to the relationship between scores obtained using the instrument and scores obtained using one or more other tests (often called the criterion).

Construct-related evidence of validity – refers to the nature of the psychological construct or characteristic being measured by the test.
EXPECTANCY TABLE

                 GRADE POINT AVERAGE
Test Score    Very Good    Good    Needs Improvement
High              20         10            5
Average           10         25            5
Low                1         10           14
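
An expectancy table like the one above is simply a cross-tabulation of test-score levels against a criterion (here, grade point average). Below is a minimal Python sketch using a handful of hypothetical (score level, GPA category) pairs, not the actual data behind the table.

from collections import Counter

# Hypothetical (test score level, GPA category) pairs for a few students.
records = [
    ("High", "Very Good"), ("High", "Good"),
    ("Average", "Good"), ("Average", "Needs Improvement"),
    ("Low", "Needs Improvement"), ("Low", "Good"),
]

counts = Counter(records)  # tally each (row, column) cell of the expectancy table
rows = ["High", "Average", "Low"]
cols = ["Very Good", "Good", "Needs Improvement"]

print(f"{'Test Score':<12}" + "".join(f"{c:>20}" for c in cols))
for r in rows:
    print(f"{r:<12}" + "".join(f"{counts[(r, c)]:>20}" for c in cols))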
RELIABILITY

Refers to the consistency of the scores obtained.

RELIABILITY      INTERPRETATION
.90 and above    Excellent reliability; at the level of the best standardized tests.
.80 – .90        Very good for a classroom test.
.70 – .80        Good for a classroom test; in the range of most. There are probably a few items which could be improved.
.60 – .70        Somewhat low. This test needs to be supplemented by other measures (e.g., more tests) to determine grades. There are probably some items which could be improved.
.50 – .60        Suggests need for revision of the test, unless it is quite short (ten or fewer items). The test definitely needs to be supplemented by other measures (e.g., more tests) for grading.
Below .50        Questionable reliability. This test should not contribute heavily to the course grade.
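
The document does not say which reliability coefficient it assumes; as one common illustration, here is a minimal sketch of Cronbach's alpha (an internal-consistency coefficient) computed from 0/1 item scores. The data below are hypothetical.

import statistics

def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores is a list of items, each a list of examinee scores."""
    k = len(item_scores)                                          # number of items
    item_vars = sum(statistics.pvariance(item) for item in item_scores)
    totals = [sum(examinee) for examinee in zip(*item_scores)]    # total score per examinee
    return (k / (k - 1)) * (1 - item_vars / statistics.pvariance(totals))

# Hypothetical 0/1 scores: 3 items x 5 examinees.
items = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
]
print(round(cronbach_alpha(items), 2))  # coefficient for this tiny illustrative data set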
