Workshop: Polytomous IRT Models (# 144, Remo Ostini and Michael L. Nering)

The document summarizes several polytomous IRT models including the nominal response model (NRM). The NRM is introduced as a model for polytomous items with nominal, or unordered, response categories such as multiple choice items. The NRM directly models the item category response functions using an exponential form with slope and intercept parameters for each response category. One of the slope or intercept parameters is typically constrained to identify the model for estimation purposes. An example item is provided to illustrate the NRM parameterization.

Uploaded by

Ridwan Efendi

Workshop

Polytomous IRT models


(# 144, Remo Ostini and Michael L. Nering)

Jorge Tendeiro

16 April 2014

Workshop Polytomous IRT models, (# 144, Remo Ostini and Michael L. Nering) 1/49
Literature
Presentation based on the book:
Ostini, R., & Nering, M. L. (2006). Polytomous item response theory models.
Sage University Paper Series QASS. (“Little green book” # 144)

I also used a classic book:
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists.
Chapter 5.

Overview

1 Introduction
2 (Some) Polytomous IRT models
   • Nominal response model (NRM)
   • Partial credit model (PCM)
   • Generalized partial credit model (GPCM)
   • Rating scale model (RSM)
   • Graded response model (GRM)
3 Model selection
4 Software

Introduction

Item response theory (IRT): Main idea


Modeling the relationship item↔person by means of a mathematical
function:

$$P_{ic}(\theta) \equiv P(X_i = c \mid \theta) = f(\theta)$$

• $X_i$ = Item $i$ with discrete response categories.
• $c$ = Coded response category:
   • If $X_i$ is dichotomous, $c = 0, 1$;
   • If $X_i$ is polytomous, $c = 0, 1, \ldots, m$ ($m > 1$).
• $\theta$ = Person trait parameter.

This is the item response function (IRF).

IRT: Important property


Item location (to be defined shortly) and person trait are indexed on the
same metric.
Example: Dichotomous item

[Figure: IRF of a dichotomous item. Vertical axis: $P(X_i = 1)$; horizontal axis: θ (person ability). The curve passes through 0.5 at θ = b.]

• θ > b → person is more likely to answer $X_i = 1$.
• θ < b → person is more likely to answer $X_i = 0$.
IRT: Dichotomous models recap.

• Dichotomous items:
  $X_i = 0$ (incorrect, false) or $X_i = 1$ (correct, true).
• Most common models (logistic): 1PLM, 2PLM, 3PLM.
• These models typically relate θ and $P_{i1}(\theta)$:

$$P_{i1}(\theta) = f(\theta), \qquad P_{i0}(\theta) \equiv 1 - P_{i1}(\theta).$$

We usually simplify notation in the dichotomous case:

$$P_i(\theta) = P_{i1}(\theta).$$

1PLM

$$P_i(\theta) = \frac{1}{1 + \exp[-(\theta - b_i)]}$$

• $b_i$ = difficulty parameter.

[Figure: 1PLM IRFs for three items with $b_1 = -1$, $b_2 = 0.5$, $b_3 = 1.5$. Vertical axis: $P(X_i = 1)$; horizontal axis: θ (person ability).]
2PLM

$$P_i(\theta) = \frac{1}{1 + \exp[-a_i(\theta - b_i)]}$$

• $b_i$ = difficulty parameter, $a_i$ = discrimination parameter.

[Figure: 2PLM IRFs for three items with $(a_1, b_1) = (1, -1)$, $(a_2, b_2) = (2, 0.5)$, $(a_3, b_3) = (0.5, 1.5)$. Vertical axis: $P(X_i = 1)$; horizontal axis: θ (person ability).]
3PLM

$$P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp[-a_i(\theta - b_i)]}$$

• $b_i$ = difficulty parameter, $a_i$ = discrimination parameter, $c_i$ = guessing parameter.

[Figure: 3PLM IRFs for three items with $(a_1, b_1, c_1) = (1, -1, 0)$, $(a_2, b_2, c_2) = (2, 0.5, 0.2)$, $(a_3, b_3, c_3) = (0.5, 1.5, 0.25)$. Vertical axis: $P(X_i = 1)$; horizontal axis: θ (person ability).]
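
The three logistic models nest inside one another, which a small sketch can make concrete. The Python function below is an illustration only: the name `irf_3pl` and its default arguments are assumptions, not part of the slides. Setting c = 0 gives the 2PLM, and additionally a = 1 gives the 1PLM.

```python
import math

def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """P(X_i = 1 | theta) under the 3PLM (a = 1, c = 0 gives the 1PLM)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta = b the logistic part equals 0.5, so P = c + (1 - c) / 2.
p_1pl = irf_3pl(-1.0, b=-1.0)                # 1PLM item with b = -1
p_2pl = irf_3pl(0.5, a=2.0, b=0.5)           # 2PLM item with a = 2, b = 0.5
p_3pl = irf_3pl(1.5, a=0.5, b=1.5, c=0.25)   # 3PLM item with c = 0.25
print(p_1pl, p_2pl, p_3pl)
```

The parameter values are those of the plotted items; with guessing, the probability at θ = b lies above 0.5.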
IRT: Polytomous models


In this case $X_i = 0, 1, \ldots, m$, where $m > 1$.
Examples of items with multiple response categories:
• Rating scale
  (e.g., Likert-type items: ‘Strongly disagree’, ..., ‘Strongly agree’).
• Ability test items awarding partial credit.

Now we need to define models which allow estimating each $P_{ic}(\theta)$,
$c = 0, 1, \ldots, m$:

$$\begin{cases} P_{i0}(\theta) = f_0(\theta) \\ \quad\vdots \\ P_{im}(\theta) = f_m(\theta) \end{cases}$$

These are the item category response functions (ICRFs).

IRT: Polytomous models – Why?


Polytomous items...
• are extensively used in applied psychological measurement.
• measure across a wider range of the trait continuum θ.
• are associated with more statistical information than
  dichotomous items.
• (in some settings) may help reduce test length
  (time ↓, costs ↓, respondents’ motivation ↑).

Nominal response model (NRM)

NRM (Bock, 1972)

• Type of items: Polytomous with two or more nominal categories.
• Here, nominal categories = unordered in terms of the trait being
  measured.
• E.g.: Multiple choice items (namely the distractors).

The NRM is a “divide-by-total”, or “direct” model:
The ICRFs are modeled directly.

The ICRF for category $c$ ($c = 0, 1, \ldots, m$) is

$$P_{ic}(\theta) = \frac{\exp(\lambda_{ic}\theta + \zeta_{ic})}{\sum_{h=0}^{m} \exp(\lambda_{ih}\theta + \zeta_{ih})}.$$

• $\lambda_{ih}$ = slope associated with category $h$ of item $i$.
• $\zeta_{ih}$ = intercept associated with category $h$ of item $i$.

To identify the model (i.e., to make the parameters estimable), one of two
constraints is typically imposed:
• $\sum_{h=0}^{m} \lambda_{ih} = \sum_{h=0}^{m} \zeta_{ih} = 0$, or
• $\lambda_{i0} = \zeta_{i0} = 0$.
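
A minimal Python sketch of the NRM formula may help show why an identification constraint is needed. The function name `nrm_icrf` and the parameter values are hypothetical. The same item is written once under the sum-to-zero constraint and once with the first category as reference, and both versions give the same probabilities.

```python
import math

def nrm_icrf(theta, slopes, intercepts):
    """Return [P_i0(theta), ..., P_im(theta)] under the nominal response model."""
    z = [lam * theta + zeta for lam, zeta in zip(slopes, intercepts)]
    z_max = max(z)                        # subtract the max for numerical stability
    w = [math.exp(v - z_max) for v in z]
    total = sum(w)
    return [v / total for v in w]

# Hypothetical 3-category item under the sum-to-zero constraint.
slopes = [-0.5, 0.0, 0.5]
intercepts = [0.2, 0.3, -0.5]
probs = nrm_icrf(1.0, slopes, intercepts)

# Same item with the first category as reference (lambda_i0 = zeta_i0 = 0).
slopes_ref = [s - slopes[0] for s in slopes]
intercepts_ref = [z - intercepts[0] for z in intercepts]
probs_ref = nrm_icrf(1.0, slopes_ref, intercepts_ref)
print(probs, probs_ref)
```

Shifting every slope and every intercept by a common amount leaves the ICRFs unchanged, so without a constraint the parameters are not uniquely determined.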

NRM (Bock, 1972): Example


Item measuring student mathematical achievement (N ≈ 2,000).

               Response options
               A       B       C       D       Sum
$\lambda_i$    −.30    .81     −.31    −.20    .000
$\zeta_i$      .21     .82     −.09    −.94    .000

[Figure: NRM ICRFs for Responses A to D. Vertical axis: $P_i(X = c)$; horizontal axis: θ (person ability).]
Interpretation:
• Response B is the most popular for the more able respondents.
• Response A is the most popular for the less able respondents
(followed by Response C).
• Response D was not popular across the entire trait scale.

In general, for the NRM:
• The popularity of response categories across the entire trait scale
  is associated with the order of the intercepts $\zeta_{ic}$.

For the example, in increasing order of popularity:

Response D < Response C < Response A < Response B.
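
The ordering can be checked numerically for the example item. The sketch below averages the ICRFs over the trait scale, assuming a standard normal θ distribution on a grid from −6 to 6. Both the distribution and the grid are assumptions made for illustration, and the helper names are hypothetical.

```python
import math

# Slopes and intercepts of the example item (Responses A, B, C, D).
LABELS = ["A", "B", "C", "D"]
SLOPES = [-0.30, 0.81, -0.31, -0.20]
INTERCEPTS = [0.21, 0.82, -0.09, -0.94]

def nrm_icrf(theta, slopes, intercepts):
    """Category probabilities of the nominal response model at theta."""
    w = [math.exp(lam * theta + zeta) for lam, zeta in zip(slopes, intercepts)]
    total = sum(w)
    return [v / total for v in w]

def marginal_popularity(slopes, intercepts, lo=-6.0, hi=6.0, n=1201):
    """Average the ICRFs over an (assumed) standard normal theta on a grid."""
    step = (hi - lo) / (n - 1)
    sums = [0.0] * len(slopes)
    weight_total = 0.0
    for k in range(n):
        theta = lo + k * step
        weight = math.exp(-0.5 * theta * theta)   # unnormalised N(0, 1) density
        weight_total += weight
        for c, p in enumerate(nrm_icrf(theta, slopes, intercepts)):
            sums[c] += weight * p
    return [s / weight_total for s in sums]

popularity = marginal_popularity(SLOPES, INTERCEPTS)
ranking = [LABELS[c] for c in sorted(range(4), key=lambda c: popularity[c])]
by_intercept = [LABELS[c] for c in sorted(range(4), key=lambda c: INTERCEPTS[c])]
print(ranking, by_intercept)
```

Under these assumptions the ranking by averaged probability and the ranking by intercept coincide for this item.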

Partial credit model (PCM)

PCM (Masters, 1982)

• Type of items: Polytomous with two or more ordinal categories.
• Ideal when the answer to an item consists of an ordered
  sequence of steps.
• Partial credit can be given if the respondent correctly completed
  only the first (but not all) steps.
• Varying number of categories across items is possible.
• PCM = Applying the 1PLM to each pair of adjacent item
  response categories.
• The PCM is an extension of the 1PLM.

The PCM is a “divide-by-total”, or “direct” model:
The ICRFs are modeled directly.
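The ICRF for category $c$ ($c = 0, 1, \ldots, m$) is

$$P_{ic}(\theta) = \frac{\exp\left[\sum_{j=0}^{c}(\theta - \delta_{ij})\right]}{\sum_{h=0}^{m}\exp\left[\sum_{j=0}^{h}(\theta - \delta_{ij})\right]}.$$

• $\delta_{ij}$ ($j = 1, \ldots, m$): Item step difficulties, also known as
   • category boundaries;
   • category intersections.
• Notation: $\sum_{j=0}^{0}(\theta - \delta_{ij}) = 0$.

[Diagram: categories along the θ scale, separated by the step difficulties: $X_i = 0$ | $\delta_{i1}$ | $X_i = 1$ | $\delta_{i2}$ | $X_i = 2$ | $\delta_{i3}$ | $X_i = 3$.]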
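
The formula can be sketched in a few lines of Python. The function name `pcm_icrf` and the step difficulties are hypothetical; the last lines illustrate that with a single step (m = 1) the PCM reduces to the 1PLM.

```python
import math

def pcm_icrf(theta, deltas):
    """Return [P_i0, ..., P_im] for step difficulties deltas = [d_i1, ..., d_im]."""
    # Cumulative sums of (theta - delta_ij); the empty sum for c = 0 is 0.
    exponents = [0.0]
    for d in deltas:
        exponents.append(exponents[-1] + (theta - d))
    top = max(exponents)                  # stabilise the exponentials
    w = [math.exp(e - top) for e in exponents]
    total = sum(w)
    return [v / total for v in w]

# Hypothetical 4-category item (m = 3 steps).
probs = pcm_icrf(0.0, [-1.0, 0.0, 1.0])
print(probs)

# With one step the PCM is the 1PLM with b_i = delta_i1.
p_pcm = pcm_icrf(0.3, [-0.5])[1]
p_1plm = 1.0 / (1.0 + math.exp(-(0.3 - (-0.5))))
print(p_pcm, p_1plm)
```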

• $\delta_{ij}$ = θ-value at which two consecutive ICRFs intersect:

$$P_{i(j-1)}(\delta_{ij}) = P_{ij}(\delta_{ij}).$$

• The higher the $\delta_{ij}$, the more difficult a particular step is.
• The $\delta_{ij}$’s aren’t necessarily ordered in the same sequence as the
  categories. Such reversals indicate that the item is probably not
  functioning as intended.

Special restriction of the PCM:
There must exist responses in every response category.
(Problematic for sparse data.)

PCM (Masters, 1982): Example


Item from a survey of morality (N ≈ 1,000).
Five-point Likert-type rating scale.

Step difficulties
$\delta_{i1}$    $\delta_{i2}$    $\delta_{i3}$    $\delta_{i4}$
−1.618           −0.291           0.414            2.044

[Figure: PCM ICRFs for Categories 1 to 5, with the step difficulties marked on the horizontal axis. Vertical axis: $P_i(X = c)$; horizontal axis: θ (person ability).]
Interpretation:
• In this case the δij ’s are ordered, so adjacent ICRFs intersect at
locally optimal trait values.
• In particular, each answer option has the highest probability in
some subinterval of the θ-scale.
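
Both points can be verified for the example item with a short Python sketch. The helper `pcm_icrf` and the grid are assumptions made for illustration; categories are indexed 0 to 4 in the code, matching Categories 1 to 5 on the slide.

```python
import math

DELTAS = [-1.618, -0.291, 0.414, 2.044]   # step difficulties of the example item

def pcm_icrf(theta, deltas):
    """PCM category probabilities [P_i0, ..., P_im] at theta."""
    exponents = [0.0]
    for d in deltas:
        exponents.append(exponents[-1] + (theta - d))
    w = [math.exp(e) for e in exponents]
    total = sum(w)
    return [v / total for v in w]

# 1) Adjacent ICRFs cross exactly at the step difficulties.
gaps = []
for j, d in enumerate(DELTAS, start=1):
    p = pcm_icrf(d, DELTAS)
    gaps.append(abs(p[j - 1] - p[j]))

# 2) Because the deltas are ordered, every category is modal somewhere.
grid = [-4.0 + 0.01 * k for k in range(801)]
modal = set()
for theta in grid:
    p = pcm_icrf(theta, DELTAS)
    modal.add(p.index(max(p)))
print(max(gaps), sorted(modal))
```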

Generalized partial credit model (GPCM)

GPCM (Muraki, 1992)

• The GPCM is a generalization of the PCM.
• Idea: Add discrimination parameter (one per item).
• So, in a way, PCM→GPCM just like 1PLM→2PLM.

The GPCM is a “divide-by-total”, or “direct” model:
The ICRFs are modeled directly.

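The ICRF for category $c$ ($c = 0, 1, \ldots, m$) is

$$P_{ic}(\theta) = \frac{\exp\left[\sum_{j=0}^{c}\alpha_i(\theta - \delta_{ij})\right]}{\sum_{h=0}^{m}\exp\left[\sum_{j=0}^{h}\alpha_i(\theta - \delta_{ij})\right]}.$$

• $\delta_{ij}$ ($j = 1, \ldots, m$): Item step difficulties (category intersections).
• $\alpha_i$: Item discrimination (slope parameter).
• Notation: $\sum_{j=0}^{0}\alpha_i(\theta - \delta_{ij}) = 0$.

[Diagram: categories along the θ scale, separated by the step difficulties: $X_i = 0$ | $\delta_{i1}$ | $X_i = 1$ | $\delta_{i2}$ | $X_i = 2$ | $\delta_{i3}$ | $X_i = 3$.]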
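
As a sketch, the GPCM differs from the PCM code only by the factor $\alpha_i$ inside the sums. The function name `gpcm_icrf` and all parameter values below are hypothetical; the example compares a small and a large slope at the same θ.

```python
import math

def gpcm_icrf(theta, alpha, deltas):
    """Return [P_i0, ..., P_im] under the GPCM with slope alpha."""
    exponents = [0.0]                      # empty sum for category 0
    for d in deltas:
        exponents.append(exponents[-1] + alpha * (theta - d))
    top = max(exponents)
    w = [math.exp(e - top) for e in exponents]
    total = sum(w)
    return [v / total for v in w]

deltas = [-1.0, 0.0, 1.0]                  # hypothetical step difficulties
flat = gpcm_icrf(0.5, 0.5, deltas)         # small slope
peaked = gpcm_icrf(0.5, 2.0, deltas)       # large slope
print(max(flat), max(peaked))
```

With the larger slope the most likely category takes a larger share of the probability at this θ, so the probabilities are more concentrated.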

• $\delta_{ij}$ = θ-value at which two consecutive ICRFs intersect.
• $\alpha_i$: intuitive interpretation:
   • Small values (say, ≤ 1) → ‘flatter’ ICRFs.
   • Large values (say, ≥ 1.5) → more ‘peaked’ ICRFs.

In Muraki’s (1992, p. 162) words:
“[The $\alpha_i$’s] indicate the degree to which categorical
responses vary among items as θ level changes.”

GPCM (Muraki, 1992): Example

• Items from the Neuroticism Extraversion Openness Five-Factor
  Inventory (NEO-FFI; Costa & McCrae, 1992).
• Five-point Likert-type rating scale
  (0 = strongly disagree; ...; 4 = strongly agree).
• N = 350.

Let’s see three items.

                                            Response category
Item  Content                               0     1     2     3     4
5     Feels tense and jittery               17    111   97    101   24
6     Sometimes feels worthless             72    89    52    94    43
9     Feels discouraged, like giving up     27    128   66    95    34

GPCM (Muraki, 1992): Example (slope ≈ 1)

Item 6 ‘Sometimes feels worthless’.
(0 = 72, 1 = 89, 2 = 52, 3 = 94, 4 = 43).

Slope         Step difficulties
$\alpha_6$    $\delta_{61}$    $\delta_{62}$    $\delta_{63}$    $\delta_{64}$
1.073         −0.873           0.358            −0.226           1.547

[Figure: GPCM ICRFs for Categories 0 to 4 of item 6. Vertical axis: $P_6(X = c)$; horizontal axis: θ (person ability), with the step difficulties marked in the order $\delta_{61}$, $\delta_{63}$, $\delta_{62}$, $\delta_{64}$.]
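
For item 6 the step difficulties show a reversal ($\delta_{62} > \delta_{63}$). A small Python sketch can show one consequence of that reversal. The helper `gpcm_icrf` and the grid are assumptions made for illustration, not part of the slides.

```python
import math

ALPHA_6 = 1.073
DELTAS_6 = [-0.873, 0.358, -0.226, 1.547]   # delta_62 > delta_63: a reversal

def gpcm_icrf(theta, alpha, deltas):
    """GPCM category probabilities [P_i0, ..., P_im] at theta."""
    exponents = [0.0]
    for d in deltas:
        exponents.append(exponents[-1] + alpha * (theta - d))
    w = [math.exp(e) for e in exponents]
    total = sum(w)
    return [v / total for v in w]

# Record which category is most likely at each grid point.
grid = [-4.0 + 0.01 * k for k in range(801)]
modal = set()
for theta in grid:
    p = gpcm_icrf(theta, ALPHA_6, DELTAS_6)
    modal.add(p.index(max(p)))
never_modal = sorted(set(range(5)) - modal)
print(never_modal)
```

Category 2 would need θ above $\delta_{62}$ to beat category 1 and θ below $\delta_{63}$ to beat category 3, which cannot both hold, so it is never the most likely response on this grid.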
GPCM (Muraki, 1992): Example (slope < 1)

Item 5 ‘Feels tense and jittery’.
(0 = 17, 1 = 111, 2 = 97, 3 = 101, 4 = 24).

Slope         Step difficulties
$\alpha_5$    $\delta_{51}$    $\delta_{52}$    $\delta_{53}$    $\delta_{54}$
0.683         −3.513           −0.041           0.182            2.808

[Figure: GPCM ICRFs for Categories 0 to 4 of item 5. Vertical axis: $P_5(X = c)$; horizontal axis: θ (person ability), with the step difficulties marked in the order $\delta_{51}$, $\delta_{52}$, $\delta_{53}$, $\delta_{54}$.]
GPCM (Muraki, 1992): Example (slope ≈ 1.5)

Item 9 ‘Feels discouraged, like giving up’.
(0 = 27, 1 = 128, 2 = 66, 3 = 95, 4 = 34).

Slope         Step difficulties
$\alpha_9$    $\delta_{91}$    $\delta_{92}$    $\delta_{93}$    $\delta_{94}$
1.499         −1.997           0.210            0.103            1.627

[Figure: GPCM ICRFs for Categories 0 to 4 of item 9. Vertical axis: $P_9(X = c)$; horizontal axis: θ (person ability), with the step difficulties marked in the order $\delta_{91}$, $\delta_{93}$, $\delta_{92}$, $\delta_{94}$.]
Rating scale model (RSM)

RSM (Andrich, 1978)

• Type of items: Polytomous with two or more ordinal categories.
• Requirement: All items of the measurement instrument have the
  same response format,
  e.g., the set of response options is the same for all items.
• As a consequence, the response format is intended to function in
  the same way across all items.
• The RSM is an extension of the 1PLM.
  Moreover, the RSM can be seen as a special case of the PCM.

The RSM is a “divide-by-total”, or “direct” model:
The ICRFs are modeled directly.

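The ICRF for category $c$ ($c = 0, 1, \ldots, m$) is

$$P_{ic}(\theta) = \frac{\exp\left\{\sum_{j=0}^{c}[\theta - (\lambda_i + \delta_j)]\right\}}{\sum_{h=0}^{m}\exp\left\{\sum_{j=0}^{h}[\theta - (\lambda_i + \delta_j)]\right\}}.$$

• $\lambda_i$: Item location parameter.
• $\delta_j$ ($j = 1, \ldots, m$): Category threshold parameters.
• Notation: $\sum_{j=0}^{0}[\theta - (\lambda_i + \delta_j)] = 0$.

[Diagram: two items with 4 categories each on the θ scale. Item 1 is located at $\lambda_1$ and Item 2 at $\lambda_2$; the same thresholds $\delta_1$, $\delta_2$, $\delta_3$ are placed around each item location.]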
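
A short Python sketch may make the role of the shared thresholds concrete. The function name `rsm_icrf`, the thresholds, and the two item locations are hypothetical. Because items differ only in location, shifting both θ and the location by the same amount reproduces the same category probabilities.

```python
import math

def rsm_icrf(theta, location, thresholds):
    """Return [P_i0, ..., P_im] under the RSM (thresholds shared by all items)."""
    exponents = [0.0]                      # empty sum for category 0
    for d in thresholds:
        exponents.append(exponents[-1] + (theta - (location + d)))
    top = max(exponents)
    w = [math.exp(e - top) for e in exponents]
    total = sum(w)
    return [v / total for v in w]

thresholds = [-1.0, 0.0, 1.0]              # hypothetical, common to both items
item_1 = rsm_icrf(0.0, -0.5, thresholds)   # item located at -0.5
item_2 = rsm_icrf(1.0, 0.5, thresholds)    # location and theta both shifted by +1
print(item_1, item_2)
```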
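• Two consecutive categories intersect at $\theta = \lambda_i + \delta_j$:

$$P_{i(j-1)}(\lambda_i + \delta_j) = P_{ij}(\lambda_i + \delta_j).$$

• RSM is a special case of the PCM:
  Corresponding (across items) category intersections are equally
  spaced, i.e., the distance between consecutive intersections is the
  same for every item.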

RSM (Andrich, 1978): Example (NEO-FFI)


Thresholds: $\delta_1 = -1.600$, $\delta_2 = 0.224$, $\delta_3 = -0.184$, $\delta_4 = 1.560$.

Item 1: $\lambda_1 = -0.44$ (‘easiest’ of the 12 items in the scale)

[Figure: RSM ICRFs for Categories 0 to 4 of item 1. Vertical axis: $P_1(X = c)$; horizontal axis: θ (person ability).]

Item 11: $\lambda_{11} = 0.47$ (‘hardest’ of the 12 items in the scale)

[Figure: RSM ICRFs for Categories 0 to 4 of item 11. Vertical axis: $P_{11}(X = c)$; horizontal axis: θ (person ability).]
Graded response model (GRM)

GRM (Samejima, 1969)

• Type of items: Polytomous with two or more ordinal categories.
• Varying number of categories across items is possible.
• GRM = Applying the 2PLM at each category boundary
  (i.e., between two consecutive category responses).
• The GRM is an extension of the 2PLM.

The GRM is a “difference”, or “indirect” model:
The ICRFs are modeled indirectly.

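The ICRF for category $c$ ($c = 0, 1, \ldots, m$) is

$$P_{ic}(\theta) = P^*_{ic}(\theta) - P^*_{i(c+1)}(\theta),$$

where

$$P^*_{ic}(\theta) = P(X_i \ge c \mid \theta) = \frac{1}{1 + \exp[-\alpha_i(\theta - \beta_{ic})]} \quad \text{(the 2PLM)}.$$

(And $P^*_{i0} \equiv 1$, $P^*_{i(m+1)} \equiv 0$.)

For example, if $m = 3$ (i.e., $c = 0, 1, 2, 3$):

$$\begin{cases} P_{i0}(\theta) = 1 - P^*_{i1} \\ P_{i1}(\theta) = P^*_{i1} - P^*_{i2} \\ P_{i2}(\theta) = P^*_{i2} - P^*_{i3} \\ P_{i3}(\theta) = P^*_{i3} - 0. \end{cases}$$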
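
The “difference” construction can be sketched in Python as follows. The function name `grm_icrf` and the parameter values are hypothetical; the boundary curves are padded with 1 and 0 so that every category probability is a difference of two adjacent curves.

```python
import math

def grm_icrf(theta, alpha, betas):
    """Return [P_i0, ..., P_im] for ordered thresholds betas = [b_i1, ..., b_im]."""
    # Boundary curves P*_ic = P(X_i >= c | theta), padded with 1 and 0.
    cumulative = [1.0]
    for b in betas:
        cumulative.append(1.0 / (1.0 + math.exp(-alpha * (theta - b))))
    cumulative.append(0.0)
    # Category probabilities are differences of adjacent boundary curves.
    return [cumulative[c] - cumulative[c + 1] for c in range(len(betas) + 1)]

probs = grm_icrf(0.0, 1.2, [-1.5, 0.0, 1.5])   # hypothetical 4-category item
print(probs, sum(probs))
```

The differences are non-negative only when the thresholds are ordered, which is why the GRM requires ordered $\beta_{ic}$.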

• $\alpha_i$: Item slope parameter (one per item).
• $\beta_{ic}$: Category threshold parameters
  (one set $\{\beta_{i1}, \ldots, \beta_{im}\}$ per item).
  These are the θ-values of transition between response categories.
• The $\beta_{ic}$’s are necessarily ordered.

[Diagram: the category threshold $\beta_{ic}$ on the θ scale separates the category responses $X_i = c - 1$ and $X_i = c$; at $\theta = \beta_{ic}$, $P(X_i < c) = P(X_i \ge c) = .50$.]

GRM (Samejima, 1969): Example (NEO-FFI)


Item 4 ‘Rarely feels lonely, blue’.
(0 = 20, 1 = 90, 2 = 68, 3 = 125, 4 = 47).

Slope         Category thresholds
$\alpha_4$    $\beta_{41}$    $\beta_{42}$    $\beta_{43}$    $\beta_{44}$
1.31          −2.72           −0.81           0.04            1.85

[Figure: GRM ICRFs for Categories 0 to 4 of item 4, with a reference line at 0.5 and the thresholds $\beta_{41}$ and $\beta_{44}$ marked on the horizontal axis. Vertical axis: $P_4(X = c)$; horizontal axis: θ (person ability).]
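
The role of the first and last thresholds can be checked with the item 4 estimates. The Python sketch below uses hypothetical helper names. It evaluates the lowest category at $\beta_{41}$ and the highest category at $\beta_{44}$, where the corresponding boundary curve equals 0.5.

```python
import math

ALPHA_4 = 1.31
BETAS_4 = [-2.72, -0.81, 0.04, 1.85]   # category thresholds of item 4

def boundary(theta, alpha, beta):
    """P*_ic(theta) = P(X_i >= c | theta), a 2PLM curve."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))

def grm_icrf(theta, alpha, betas):
    """GRM category probabilities as differences of boundary curves."""
    cumulative = [1.0] + [boundary(theta, alpha, b) for b in betas] + [0.0]
    return [cumulative[c] - cumulative[c + 1] for c in range(len(betas) + 1)]

# Lowest category at theta = beta_41: 1 - P*_41 = 1 - 0.5.
p_low = grm_icrf(BETAS_4[0], ALPHA_4, BETAS_4)[0]
# Highest category at theta = beta_44: P*_44 - 0 = 0.5.
p_high = grm_icrf(BETAS_4[-1], ALPHA_4, BETAS_4)[-1]
print(p_low, p_high)
```

Only the two extreme categories have ICRFs that coincide with a single boundary curve (or its complement), so only they cross 0.5 exactly at a threshold.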
Model selection

• There are plenty of polytomous IRT models available
  (models + variants > 10).
• Choosing one model can be difficult.

Criteria to help choose the ‘best’ model:
1 Data characteristics
2 Measurement philosophy
3 Mathematical approaches to check fit

1 Data characteristics
   • Dichotomous vs polytomous item scores.
   • Nominal vs ordinal categories.
   • Number of response categories.
     E.g.: The RSM requires the same number across items.

2 Measurement philosophy
   • Does the model reflect the psychological reality that
     produced the data?
     E.g.: Can one conceptualize the answer to an item as being an
     ordered sequence of subtasks for which awarding partial credit
     to each is meaningful (i.e., PCM)?

3 Mathematical approaches to check fit
   • Check plots
      ↪ Compare model-predicted vs empirical response functions.
      ↪ Plot residuals.

   • Statistical fit tests
     These may vary depending on their level of generality
     (assessing fit of all items, of a specific group of items, or of
     individual items).
      ↪ Residual-based measures.
        Based on differences between observed and expected item scores.
      ↪ Multinomial distribution-based tests.
        Based on differences between observed and expected frequencies
        of response patterns.
      ↪ Response function-based tests.
        Based on differences between observed and expected
        log-likelihood of response patterns.
      ↪ Guttman error-based tests.
        Nonparametric approach based on the number of Guttman errors.
   • Goodness of fit
     Consider model fit ⊕ number of estimated parameters.
      ↪ Akaike’s information criterion (AIC; Akaike, 1977).
      ↪ Procedures based on the likelihood ratio of two competing models.
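
Both ideas reduce to simple arithmetic on the maximised log-likelihoods. In the Python sketch below, the log-likelihood values and the parameter counts are made up for illustration, and the function names are hypothetical. It uses the usual definitions AIC = 2k − 2 log L and, for nested models, G² = 2(log L_general − log L_restricted).

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: smaller values are preferred."""
    return 2.0 * n_params - 2.0 * log_likelihood

def lr_statistic(loglik_restricted, loglik_general):
    """Likelihood-ratio statistic for two nested models."""
    return 2.0 * (loglik_general - loglik_restricted)

# Made-up fits of a 12-item, 5-category scale (illustrative counts only):
# the PCM has 12 * 4 = 48 step parameters, the GPCM adds 12 slopes.
pcm = {"loglik": -5230.4, "k": 48}
gpcm = {"loglik": -5205.1, "k": 60}

aic_pcm = aic(pcm["loglik"], pcm["k"])
aic_gpcm = aic(gpcm["loglik"], gpcm["k"])
g2 = lr_statistic(pcm["loglik"], gpcm["loglik"])
df = gpcm["k"] - pcm["k"]                  # difference in number of parameters
print(aic_pcm, aic_gpcm, g2, df)
```

The PCM is nested in the GPCM (all slopes fixed to 1), so G² would typically be referred to a χ² distribution with `df` degrees of freedom, subject to the caveats about fit tests.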

Some problems of statistical fit tests:
• The sampling distributions are often unknown.
• Some tests require very large sample sizes (in the hundreds),
  especially χ²-based tests.
• Unknown influence of using estimated parameters or of mild
  model violations on the performance of the tests.
• Overly large sample sizes invariably lead to rejection of the null
  hypothesis (effect size?).

A final reassurance:
Some comparative studies of polytomous IRT models suggest that
results don’t vary much between models.
(E.g., Dodd, 1984; Maydeu-Olivares et al., 1994; Ostini, 2001; van Engelenburg,
1997; Verhelst et al., 1997.)
Software

• IRTPRO
• R: Several packages worth checking
  (see http://cran.r-project.org/web/views/Psychometrics.html):
  ltm, eRm, TAM, mcIRT, pcIRT, ...
