
Modeling Ordered Categorical Data

James J. Dignam

Department of Public Health Sciences


University of Chicago

J. Dignam (UChicago) Lecture 10 Feb. 11, 2020 1 / 27


Ordered Categorical Data

An ordered categorical variable (also called an ordinal variable) is a categorical variable (with more than two levels) whose categories have a natural ordering.
Examples:
1. In a clinical trial on pain relievers, the degree of pain control may be described as totally ineffective, poor control, moderate control, or good control.
2. Stages of cancer are ordered by extent of disease: stage I (localized), II, III, IV (metastatic).
3. Agreement level on a survey question: strongly disagree, disagree, neutral, agree, strongly agree.
The quantitative distance between levels may not be known and may not be the same.



Ordinal Response: notation

Let $C_1, C_2, \ldots, C_k$, $k \ge 2$, denote the $k$ ordered categories for the response (in increasing order).
Let $Y_i$ be the response variable for the $i$th individual, with $Y_i$ taking the value $j$ if the response is in category $C_j$, $j = 1, 2, \ldots, k$.
Define $p_{ij} = P(Y_i = j) = P[\text{individual } i \text{ responds in category } C_j]$.
The cumulative probability for $Y_i$: $\gamma_{ij} = P(Y_i \le j)$.
Hence $\gamma_{ij} = p_{i1} + p_{i2} + \cdots + p_{ij}$ and $\gamma_{ik} = \sum_{j=1}^{k} p_{ij} = 1$.

We now introduce an unobservable (latent) continuous random variable $Z_i$ such that

$$Y_i = j \quad \text{if } d_{j-1} < Z_i \le d_j,$$

where $-\infty = d_0 < d_1 < \cdots < d_k = \infty$. We refer to $d_1, d_2, \ldots, d_{k-1}$ as the cut points.
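The mapping from the latent variable to the observed category can be sketched in a few lines. This is an illustration in Python rather than the Stata used on these slides; the cut-point values and the function name `category_from_latent` are hypothetical and chosen only to show the rule $d_{j-1} < Z_i \le d_j$.

```python
import bisect

# Hypothetical cut points d_1 < d_2 < d_3 for k = 4 categories
# (d_0 = -infinity and d_4 = +infinity are implicit).
cut_points = [-0.9, 0.6, 1.6]

def category_from_latent(z, cuts=cut_points):
    """Return the category j such that d_{j-1} < z <= d_j."""
    # bisect_left counts the cut points strictly below z, so a value equal
    # to d_j still falls in category j (the interval is closed on the right).
    return bisect.bisect_left(cuts, z) + 1

for z in (-2.0, -0.9, 0.0, 1.0, 5.0):
    print(f"Z = {z:5.1f} -> category {category_from_latent(z)}")
```

A latent value exactly equal to a cut point is assigned to the lower category, matching the closed right endpoint in the definition.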



Latent Variable Idea

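Thus, $\gamma_{ij} = P(Y_i \le j) = P(Z_i \le d_j)$, $j = 1, 2, \ldots, k$.
Assume that $Z_i$ has a logistic distribution with mean $\mu_i$ and unit scale parameter (its standard deviation is then $\pi/\sqrt{3}$ rather than 1); then

$$\gamma_{ij} = P(Z_i \le d_j) = \frac{e^{d_j - \mu_i}}{1 + e^{d_j - \mu_i}} \tag{1}$$

It then follows that

$$\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \mu_i \tag{2}$$

We further suppose that $\mu_i$ is a linear combination of the explanatory variables for the $i$th individual, and set $\mu_i = \beta' x_i$, so that

$$\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta' x_i \tag{3}$$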



Proportional Odds Model: $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta' x_i$

This model uses cumulative probabilities up to a response threshold, thereby collapsing the whole range of ordinal categories into a binary outcome at that threshold.
The intercept $d_j$ is the log odds of falling into or below category $j$ when $x_i = 0$.
For two individuals with covariates $x_1$ and $x_2$ respectively,

$$\log\left(\frac{\gamma_{1j}}{1 - \gamma_{1j}}\right) = d_j - \beta' x_1, \qquad \log\left(\frac{\gamma_{2j}}{1 - \gamma_{2j}}\right) = d_j - \beta' x_2,$$

so that

$$\log\left(\frac{\gamma_{1j}/(1 - \gamma_{1j})}{\gamma_{2j}/(1 - \gamma_{2j})}\right) = -\beta'(x_1 - x_2) \tag{4}$$

(4) does not depend on the $d_j$: the log odds ratio of being in category $C_j$ or lower is proportional to the difference between $x_1$ and $x_2$, where $-\beta$ is the constant of proportionality (the same for $j = 1, 2, \ldots, k - 1$); hence the name "proportional odds model".
When $k = 2$, the model is $\log\left(\frac{\gamma_{i1}}{1 - \gamma_{i1}}\right) = d_1 - \beta' x_i$, where $\gamma_{i1} = p_{i1}$, and there is just one cut point. This reduces to the standard logistic regression model for binary data.
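The key property in (4), that the log odds ratio is the same at every cut point, can be checked numerically. The sketch below is in Python rather than Stata; the cut points, coefficients, and covariate vectors are hypothetical values chosen only for illustration.

```python
def cumulative_logit(d_j, beta, x):
    """log(gamma_j / (1 - gamma_j)) = d_j - beta'x for one individual."""
    return d_j - sum(b * v for b, v in zip(beta, x))

# Hypothetical values, not estimates from any data set.
cut_points = [-1.0, 0.3, 1.5]
beta = [0.8, -0.4]
x1 = [1.0, 2.0]
x2 = [0.0, 0.5]

# Log odds ratio (individual 1 vs. individual 2) at each cut point.
log_odds_ratios = [
    cumulative_logit(d, beta, x1) - cumulative_logit(d, beta, x2)
    for d in cut_points
]

# Equation (4): the common value is -beta'(x1 - x2), whatever the cut point.
expected = -sum(b * (a - c) for b, a, c in zip(beta, x1, x2))

for j, lor in enumerate(log_odds_ratios, start=1):
    print(f"cut point {j}: log odds ratio = {lor:.3f} (expected {expected:.3f})")
```

The cut point cancels in the subtraction, which is exactly why the covariate effect is constant across thresholds.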
Example: Small Cell Lung Cancer
In a clinical trial evaluating treatment of small cell lung cancer described by Holtbrugge and Schumacher (1991), two treatment strategies were compared: sequential therapy (the same combination of chemotherapeutic agents was administered in each treatment cycle) vs. alternating therapy (three different combinations were given, alternating between cycles). Data were obtained from 299 patients.

Table 1: Tumor response by sex and chemotherapy strategy.

Sex of       Therapy           Progressive   No change   Partial     Complete
patient      strategy          disease                   remission   remission
0 (Male)     0 (Sequential)         28           45          29          26
1 (Female)   0 (Sequential)          4           12           5           2
0 (Male)     1 (Alternating)        41           44          20          20
1 (Female)   1 (Alternating)        12            7           3           1
Total                               85          108          57          49

Response is naturally ordered from worst to best.


Lung Cancer: dataset organization for ordered analysis (long form)

. list, sepby(sex)

     +----------------------------------+
     | sex   therapy   category   count |
     |----------------------------------|
  1. |   0         0          1      28 |
  2. |   0         0          2      45 |
  3. |   0         0          3      29 |
  4. |   0         0          4      26 |
     |----------------------------------|
  5. |   1         0          1       4 |
  6. |   1         0          2      12 |
  7. |   1         0          3       5 |
  8. |   1         0          4       2 |
     |----------------------------------|
  9. |   0         1          1      41 |
 10. |   0         1          2      44 |
 11. |   0         1          3      20 |
 12. |   0         1          4      20 |
     |----------------------------------|
 13. |   1         1          1      12 |
 14. |   1         1          2       7 |
 15. |   1         1          3       3 |
 16. |   1         1          4       1 |
     +----------------------------------+



Lung Cancer: $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j$, $j = 1, 2, 3$

The ordered logit model can be used here.

. ologit category [w=count], nolog
(frequency weights assumed)

Ordered logistic regression                     Number of obs     =        299
Log likelihood = -399.98398                     Pseudo R2         =     0.0000

------------------------------------------------------------------------------
    category |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       /cut1 |  -.9233248   .1282092                      -1.17461   -.6720393
       /cut2 |   .5992511   .1208938                      .3623036    .8361986
       /cut3 |   1.629641   .1562311                      1.323433    1.935848
------------------------------------------------------------------------------

. estimates store Null

The estimates store name command saves the fitted model's results under the given name so that models can be compared later.



Lung Cancer: $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j$, $j = 1, 2, 3$

What do these parameters represent?

Param   Value     Estimates                                      Raw odds     Cumul.
                                                                 from table   prob.
cut1    -0.9233   log odds(PD vs. higher category)               0.397        0.28
cut2     0.5992   log odds(PD or SD vs. higher category)         1.821        0.65
cut3     1.6296   log odds(PD, SD, or PR vs. higher category)    5.10         0.84

(PD = progressive disease, SD = stable disease, i.e. no change, PR = partial remission.)

The model predicts the odds (and hence the probability) of being in a given category or lower vs. a higher category, so the reference event at each cut point is the set of remaining higher categories.
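The correspondence between the cut points and the raw table can be reproduced directly from the column totals of Table 1. The following Python sketch (not part of the original Stata session) recomputes the raw odds and cumulative proportions and compares them with the exponentiated cut-point estimates.

```python
import math

# Column totals from Table 1: progressive disease, no change,
# partial remission, complete remission.
counts = [85, 108, 57, 49]
total = sum(counts)

# Cut-point estimates reported by the intercept-only ologit fit.
cuts = [-0.9233248, 0.5992511, 1.629641]

rows = []
at_or_below = 0
for j, cut in enumerate(cuts, start=1):
    at_or_below += counts[j - 1]
    raw_odds = at_or_below / (total - at_or_below)   # category <= j vs. > j
    cum_prob = at_or_below / total
    rows.append((j, raw_odds, math.exp(cut), cum_prob))

for j, raw_odds, model_odds, cum_prob in rows:
    print(f"cut{j}: raw odds {raw_odds:.3f}, exp(cut) {model_odds:.3f}, "
          f"cumulative prob {cum_prob:.2f}")
```

With no covariates, each exponentiated cut point agrees with the observed cumulative odds at that threshold.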



Lung Cancer: $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{sex}$, $j = 1, 2, 3$

. ologit category sex [w=count], nolog
(frequency weights assumed)

Ordered logistic regression                     Number of obs     =        299
                                                LR chi2(1)        =       3.34
                                                Prob > chi2       =     0.0676
Log likelihood = -398.31341                     Pseudo R2         =     0.0042

------------------------------------------------------------------------------
    category |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sex |  -.5218702     .28707    -1.82   0.069    -1.084517    .0407767
-------------+----------------------------------------------------------------
       /cut1 |   -1.01504    .138661                     -1.286811   -.7432694
       /cut2 |   .5188191   .1285166                      .2669312     .770707
       /cut3 |   1.557141   .1609139                      1.241756    1.872527
------------------------------------------------------------------------------

. estimates store S



Lung Cancer: $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{therapy}$, $j = 1, 2, 3$

. ologit category therapy [w=count], nolog
(frequency weights assumed)

Ordered logistic regression                     Number of obs     =        299
                                                LR chi2(1)        =       7.31
                                                Prob > chi2       =     0.0068
Log likelihood = -396.32657                     Pseudo R2         =     0.0091

------------------------------------------------------------------------------
    category |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     therapy |  -.5699142   .2117716    -2.69   0.007    -.9849789   -.1548495
-------------+----------------------------------------------------------------
       /cut1 |   -1.21673   .1704333                     -1.550773   -.8826866
       /cut2 |   .3382206   .1542139                       .035967    .6404743
       /cut3 |   1.380296   .1801627                      1.027184    1.733409
------------------------------------------------------------------------------

. estimates store T

What does this model say?



Lung Cancer: $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{therapy}$, $j = 1, 2, 3$

. tab therapy category [fweight=count], row col

           |                  category
   therapy |         1          2          3          4 |     Total
-----------+--------------------------------------------+----------
         0 |        32         57         34         28 |       151
           |     21.19      37.75      22.52      18.54 |    100.00
           |     37.65      52.78      59.65      57.14 |     50.50
-----------+--------------------------------------------+----------
         1 |        53         51         23         21 |       148
           |     35.81      34.46      15.54      14.19 |    100.00
           |     62.35      47.22      40.35      42.86 |     49.50
-----------+--------------------------------------------+----------
     Total |        85        108         57         49 |       299
           |     28.43      36.12      19.06      16.39 |    100.00
           |    100.00     100.00     100.00     100.00 |    100.00

(Each cell shows the frequency, the row percentage, and the column percentage.)

For therapy 0, the observed odds of progression, stable disease, or partial response vs. complete response are 123/28 = 4.39. For therapy 1, the corresponding odds are 127/21 = 6.05.
The model predicts exp(1.38) = 3.97 and exp(1.38 − (−0.570)) = 7.03, respectively.
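The comparison between observed and model-based odds at the third cut point can be reproduced as follows. This is a Python sketch, not part of the Stata session; the helper names `observed_odds` and `model_odds` are introduced here only for illustration.

```python
import math

# Frequencies by therapy from the tabulation (categories 1..4).
observed = {0: [32, 57, 34, 28], 1: [53, 51, 23, 21]}

# Estimates from the therapy-only ologit fit.
cut3 = 1.380296
beta_therapy = -0.5699142

def observed_odds(freqs):
    """Odds of categories 1-3 versus category 4 from raw frequencies."""
    return sum(freqs[:3]) / freqs[3]

def model_odds(therapy):
    """Model odds of category 3 or below: exp(d_3 - beta * therapy)."""
    return math.exp(cut3 - beta_therapy * therapy)

for therapy in (0, 1):
    print(f"therapy {therapy}: observed {observed_odds(observed[therapy]):.2f}, "
          f"model {model_odds(therapy):.2f}")
```

The model values differ from the observed ones because a single therapy coefficient has to serve all three cut points at once.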



Lung Cancer: $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{sex} - \beta_2\,\text{therapy}$, $j = 1, 2, 3$

. ologit category sex therapy [w=count], nolog
(frequency weights assumed)

Ordered logistic regression                     Number of obs     =        299
                                                LR chi2(2)        =      10.91
                                                Prob > chi2       =     0.0043
Log likelihood = -394.52832                     Pseudo R2         =     0.0136

------------------------------------------------------------------------------
    category |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sex |  -.5413938   .2871816    -1.89   0.059    -1.104259    .0214717
     therapy |   -.580685   .2121478    -2.74   0.006     -.996487    -.164883
-------------+----------------------------------------------------------------
       /cut1 |  -1.318043   .1797769                     -1.670399   -.9656869
       /cut2 |   .2492335   .1613881                     -.0670813    .5655484
       /cut3 |   1.300056   .1849928                      .9374766    1.662635
------------------------------------------------------------------------------

. estimates store ST



Lung Cancer: $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{sex} - \beta_2\,\text{therapy} - \beta_3\,\text{sex} \times \text{therapy}$, $j = 1, 2, 3$

. gen st = sex*therapy
. ologit category sex therapy st [w=count], nolog
(frequency weights assumed)

Ordered logistic regression                     Number of obs     =        299
                                                LR chi2(3)        =      11.96
                                                Prob > chi2       =     0.0075
Log likelihood = -394.00492                     Pseudo R2         =     0.0149

------------------------------------------------------------------------------
    category |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sex |  -.2741906   .3873497    -0.71   0.479    -1.033382    .4850008
     therapy |   -.488071   .2305167    -2.12   0.034    -.9398754   -.0362666
          st |  -.5904159   .5791605    -1.02   0.308     -1.72555    .5447177
-------------+----------------------------------------------------------------
       /cut1 |  -1.275657    .184367                      -1.63701   -.9143045
       /cut2 |   .2957159   .1678283                     -.0332216    .6246534
       /cut3 |   1.345164   .1905977                      .9715991    1.718728
------------------------------------------------------------------------------

. estimates store SXT



Lung Cancer : Model Comparison via LR Tests
Each test compares a smaller model with a bigger model in which it is nested.

. lrtest S Null

Likelihood-ratio test                                 LR chi2(1)  =      3.34
(Assumption: Null nested in S)                        Prob > chi2 =    0.0676

. lrtest T Null

Likelihood-ratio test                                 LR chi2(1)  =      7.31
(Assumption: Null nested in T)                        Prob > chi2 =    0.0068

. lrtest ST T

Likelihood-ratio test                                 LR chi2(1)  =      3.60
(Assumption: T nested in ST)                          Prob > chi2 =    0.0579

. lrtest ST S

Likelihood-ratio test                                 LR chi2(1)  =      7.57
(Assumption: S nested in ST)                          Prob > chi2 =    0.0059

. lrtest SXT ST

Likelihood-ratio test                                 LR chi2(1)  =      1.05
(Assumption: ST nested in SXT)                        Prob > chi2 =    0.3062
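Each likelihood-ratio statistic is twice the difference between the log likelihoods of the bigger and smaller model, so the tests can be recomputed from the log likelihoods reported in the earlier ologit output. The Python sketch below (an illustration, not the Stata lrtest command) uses the fact that for 1 d.f. the chi-square tail probability equals erfc(sqrt(x/2)); the function name `lr_test_1df` is introduced here for illustration.

```python
import math

# Log likelihoods reported by the five ologit fits.
loglik = {
    "Null": -399.98398,
    "S": -398.31341,
    "T": -396.32657,
    "ST": -394.52832,
    "SXT": -394.00492,
}

def lr_test_1df(bigger, smaller):
    """LR statistic and p-value for nested models differing by one parameter."""
    stat = 2.0 * (loglik[bigger] - loglik[smaller])
    # For 1 d.f., P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

comparisons = [("S", "Null"), ("T", "Null"), ("ST", "T"), ("ST", "S"), ("SXT", "ST")]
for bigger, smaller in comparisons:
    stat, p = lr_test_1df(bigger, smaller)
    print(f"{bigger} vs {smaller}: LR chi2(1) = {stat:.2f}, p = {p:.4f}")
```

This shortcut only applies because every comparison here adds exactly one parameter; a comparison with more degrees of freedom needs the general chi-square tail.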



Lung Cancer : model comparison (continued)

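Based on these analyses using proportional odds models, we conclude that both sex and therapy affect tumor response, but there is no evidence that the interaction between sex and therapy is an important predictor of tumor response.
The fitted model based on $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{sex} - \beta_2\,\text{therapy}$, $j = 1, 2, 3$, can be written as:

$$\log\left(\frac{\hat\gamma_{i1}}{1 - \hat\gamma_{i1}}\right) = -1.318 + 0.541\,\text{sex} + 0.581\,\text{therapy}$$

$$\log\left(\frac{\hat\gamma_{i2}}{1 - \hat\gamma_{i2}}\right) = 0.249 + 0.541\,\text{sex} + 0.581\,\text{therapy}$$

$$\log\left(\frac{\hat\gamma_{i3}}{1 - \hat\gamma_{i3}}\right) = 1.300 + 0.541\,\text{sex} + 0.581\,\text{therapy}$$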



Lung Cancer: $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{sex} - \beta_2\,\text{therapy}$, $j = 1, 2, 3$

Odds Ratios

The parameter estimates $\hat d_j$, $j = 1, 2, 3$, are the estimated log odds of falling into or below category $j$ when sex = 0 (male) and therapy = 0 (sequential therapy).
The parameter estimates $\hat\beta_1, \hat\beta_2$ associated with sex and therapy can be interpreted in terms of odds ratios.
Example: $-\hat\beta_2$ (= 0.5807) gives the estimated log odds ratio for a response in category $C_j$ or worse, $j = 1, 2, 3$, comparing alternating therapy (therapy = 1) with sequential therapy (therapy = 0), adjusting for sex. The odds ratio is exp(0.5807) = 1.79, so alternating therapy is somewhat worse.
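The odds ratios and their confidence limits follow from exponentiating the negated coefficients and confidence limits in the additive ologit output. A short Python sketch of that arithmetic (the function name `odds_ratio_or_worse` is introduced here for illustration and is not from the slides):

```python
import math

# Coefficient and 95% confidence limits from the additive ologit fit.
estimates = {
    "sex": (-0.5413938, -1.104259, 0.0214717),
    "therapy": (-0.580685, -0.996487, -0.164883),
}

def odds_ratio_or_worse(coef, lower, upper):
    """Odds ratio (with CI) for category j or worse: exponentiate -beta."""
    # Negating the coefficient swaps the roles of the two confidence limits.
    return math.exp(-coef), math.exp(-upper), math.exp(-lower)

for name, (coef, lower, upper) in estimates.items():
    odds_ratio, ci_low, ci_high = odds_ratio_or_worse(coef, lower, upper)
    print(f"{name}: OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

The interval for sex includes 1 while the interval for therapy does not, in line with the p-values of 0.059 and 0.006 in the regression output.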



Lung Cancer : Predicted Category Probabilities

We would like to have the predicted probabilities for each category under each condition. The results are based on the additive model $\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{sex} - \beta_2\,\text{therapy}$, $j = 1, 2, 3$.

. quietly ologit category sex therapy [fweight=count], nolog
. predict p1 p2 p3 p4
(option pr assumed; predicted probabilities)
. table sex therapy, c(mean p1 mean p2 mean p3 mean p4) f(%5.2f)

----------------------
          |  therapy
      sex |    0     1
----------+-----------
        0 | 0.21  0.32
          | 0.35  0.37
          | 0.22  0.17
          | 0.21  0.13
          |
        1 | 0.32  0.45
          | 0.37  0.35
          | 0.18  0.12
          | 0.14  0.08
----------------------

Within each sex-by-therapy cell, the four rows are the predicted probabilities for categories 1 through 4.
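The predicted category probabilities can also be computed by hand from the fitted cut points and coefficients: evaluate the cumulative probability at each cut point and take successive differences. A minimal Python sketch of that calculation, using the estimates from the additive ologit fit (the function names are introduced here for illustration):

```python
import math

# Estimates from the additive ologit fit (sex + therapy).
cuts = [-1.318043, 0.2492335, 1.300056]
beta_sex = -0.5413938
beta_therapy = -0.580685

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(sex, therapy):
    """Return [p1, p2, p3, p4] for one covariate pattern."""
    mu = beta_sex * sex + beta_therapy * therapy
    # Cumulative probabilities at each cut point, then gamma_4 = 1.
    cum = [inv_logit(d - mu) for d in cuts] + [1.0]
    # Successive differences of cumulative probabilities give category probabilities.
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

for sex in (0, 1):
    for therapy in (0, 1):
        probs = category_probs(sex, therapy)
        print(f"sex={sex} therapy={therapy}:", " ".join(f"{p:.2f}" for p in probs))
```

Up to rounding, the four lines printed correspond to the four cells of the table of mean predicted probabilities.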



Lung Cancer : Predicted Cumulative Probabilities

. gen cp2 = p1+p2
. gen cp3 = cp2+p3
. gen cp4 = cp3+p4
. table sex therapy, c(mean p1 mean cp2 mean cp3 mean cp4) f(%5.2f)

----------------------
          |  therapy
      sex |    0     1
----------+-----------
        0 | 0.21  0.32
          | 0.56  0.70
          | 0.79  0.87
          | 1.00  1.00
          |
        1 | 0.32  0.45
          | 0.69  0.80
          | 0.86  0.92
          | 1.00  1.00
----------------------



Lung Cancer : Predicted Probabilities
Note: the lincom command can also be used to make predictions for specific covariate combinations (pay attention to the minus signs), together with a precision estimate.

. * estimate probability for PD vs better for female on alternating therapy
. lincom _b[/cut1] - 1*sex - 1*therapy

 ( 1)  - [category]sex - [category]therapy + [cut1]_cons = 0

------------------------------------------------------------------------------
    category |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |  -.1959642   .2892904    -0.68   0.498    -.7629629    .3710345
------------------------------------------------------------------------------

. display invlogit(-.1959642)
.45116513

. * lower bound
. display invlogit(-.7629629)
.31800333

. * upper bound
. display invlogit(.3710345)
.59170893
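The interval above is a Wald interval built on the logit scale and then transformed to the probability scale. The Python sketch below reproduces that arithmetic from the lincom estimate and standard error; the normal quantile 1.959964 is supplied by hand and the variable names are illustrative.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Linear combination and standard error reported by lincom.
estimate = -0.1959642
std_err = 0.2892904
z_975 = 1.959964  # 97.5th percentile of the standard normal

# Wald limits on the log-odds scale.
lower = estimate - z_975 * std_err
upper = estimate + z_975 * std_err

# Transform the estimate and both limits to the probability scale.
prob = inv_logit(estimate)
prob_lower = inv_logit(lower)
prob_upper = inv_logit(upper)

print(f"P(PD) = {prob:.3f}, 95% CI {prob_lower:.3f} to {prob_upper:.3f}")
```

Transforming the limits rather than the standard error keeps the interval inside (0, 1), which is why it is not symmetric about the point estimate.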



Testing the Proportional Odds Assumption

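The final proportional odds model

$$\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{sex} - \beta_2\,\text{therapy}, \quad j = 1, 2, 3 \tag{5}$$

assumes that the effects of therapy and of sex do not depend on which cut point between response categories we are considering:

$$\log\left(\frac{\hat\gamma_{i1}}{1 - \hat\gamma_{i1}}\right) = -1.318 + 0.541\,\text{sex} + 0.581\,\text{therapy}$$

$$\log\left(\frac{\hat\gamma_{i2}}{1 - \hat\gamma_{i2}}\right) = 0.249 + 0.541\,\text{sex} + 0.581\,\text{therapy}$$

$$\log\left(\frac{\hat\gamma_{i3}}{1 - \hat\gamma_{i3}}\right) = 1.300 + 0.541\,\text{sex} + 0.581\,\text{therapy}$$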



Testing the proportional odds assumption (cont.)

We can test this proportional odds assumption by allowing for a different covariate effect at each cut point, which motivates the following generalized ordered logit model.
Generalized ordered logit model:

$$\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_{j1}\,\text{sex} - \beta_{j2}\,\text{therapy}, \quad j = 1, 2, 3 \tag{6}$$

The proportional odds (ordered logit) model is nested within the generalized ordered logit model under the constraints

$$\beta_{11} = \beta_{21} = \beta_{31} \quad \text{and} \quad \beta_{12} = \beta_{22} = \beta_{32},$$

so we can use the difference in deviance (a likelihood-ratio test) to compare the two models and check this assumption.



The generalized ordered logit model: (continued)

In Stata, we need a user-written program (gologit or gologit2) to fit this model.
In contrast to the way the proportional odds model is parameterized in ologit, the generalized ordered logit model is parameterized in gologit2 as follows:

$$\log\left(\frac{1 - \gamma_{ij}}{\gamma_{ij}}\right) = d_j + \beta_{j1}\,\text{sex} + \beta_{j2}\,\text{therapy}, \quad j = 1, 2, 3 \tag{7}$$

The intercept estimates $d_j$ in (7) will therefore have the opposite sign to the cut points from the proportional odds model.



Lung Cancer: gologit2 model

. gologit2 category sex therapy [fweight=count]

Generalized Ordered Logit Estimates             Number of obs     =        299
                                                LR chi2(6)        =      14.10
                                                Prob > chi2       =     0.0285
Log likelihood = -392.93348                     Pseudo R2         =     0.0176

------------------------------------------------------------------------------
    category |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
         sex |  -.3645641   .3443519    -1.06   0.290    -1.039481    .3103533
     therapy |  -.7317005   .2633247    -2.78   0.005    -1.247807   -.2155936
       _cons |   1.373901   .2091236     6.57   0.000     .9640261    1.783775
-------------+----------------------------------------------------------------
2            |
         sex |  -.6700663   .3720467    -1.80   0.072    -1.399264    .0591318
     therapy |  -.5229065   .2458232    -2.13   0.033    -1.004711   -.0411019
       _cons |  -.2566465   .1745179    -1.47   0.141    -.5986952    .0854023
-------------+----------------------------------------------------------------
3            |
         sex |  -1.171411    .619793    -1.89   0.059    -2.386183     .043361
     therapy |  -.3439213   .3156354    -1.09   0.276    -.9625554    .2747128
       _cons |  -1.341935   .2158886    -6.22   0.000    -1.765069   -.9188013
------------------------------------------------------------------------------



Lung Cancer: gologit2 model vs. ologit model
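The fitted generalized ordered logit model ($\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_{j1}\,\text{sex} - \beta_{j2}\,\text{therapy}$, $j = 1, 2, 3$):

$$\log\left(\frac{\hat\gamma_{i1}}{1 - \hat\gamma_{i1}}\right) = -1.374 + 0.365\,\text{sex} + 0.732\,\text{therapy}$$

$$\log\left(\frac{\hat\gamma_{i2}}{1 - \hat\gamma_{i2}}\right) = 0.257 + 0.670\,\text{sex} + 0.523\,\text{therapy}$$

$$\log\left(\frac{\hat\gamma_{i3}}{1 - \hat\gamma_{i3}}\right) = 1.342 + 1.171\,\text{sex} + 0.344\,\text{therapy}$$

The previous fitted proportional odds model ($\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = d_j - \beta_1\,\text{sex} - \beta_2\,\text{therapy}$, $j = 1, 2, 3$):

$$\log\left(\frac{\hat\gamma_{i1}}{1 - \hat\gamma_{i1}}\right) = -1.318 + 0.541\,\text{sex} + 0.581\,\text{therapy}$$

$$\log\left(\frac{\hat\gamma_{i2}}{1 - \hat\gamma_{i2}}\right) = 0.249 + 0.541\,\text{sex} + 0.581\,\text{therapy}$$

$$\log\left(\frac{\hat\gamma_{i3}}{1 - \hat\gamma_{i3}}\right) = 1.300 + 0.541\,\text{sex} + 0.581\,\text{therapy}$$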
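Going from the gologit2 output to the cumulative-logit form used in these equations only requires negating every term, because gologit2 models the log odds of being above category $j$ rather than at or below it. A short Python sketch of that conversion, using the estimates from the gologit2 output (the helper name `to_cumulative_form` is introduced here for illustration):

```python
# gologit2 estimates for the three equations, as (b_sex, b_therapy, _cons).
gologit2_fit = {
    1: (-0.3645641, -0.7317005, 1.373901),
    2: (-0.6700663, -0.5229065, -0.2566465),
    3: (-1.171411, -0.3439213, -1.341935),
}

def to_cumulative_form(b_sex, b_therapy, cons):
    """Negate every term: log((1-g)/g) = c + b'x  <=>  log(g/(1-g)) = -c - b'x."""
    return {"intercept": -cons, "sex": -b_sex, "therapy": -b_therapy}

cumulative = {j: to_cumulative_form(*est) for j, est in gologit2_fit.items()}

for j, eq in cumulative.items():
    print(f"j={j}: log odds(category <= {j}) = {eq['intercept']:.3f} "
          f"+ {eq['sex']:.3f} sex + {eq['therapy']:.3f} therapy")
```

The printed coefficients can be compared with the three generalized-model equations; under proportional odds the sex and therapy terms would be equal across the three lines.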



Testing the proportional odds assumption: Lung Cancer data

Log likelihood = -394.52832 under the proportional odds (ordered logit) model.
Log likelihood = -392.93348 under the generalized ordered logit model.
We calculate $\chi^2 = -2 \times (-394.52832 - (-392.93348)) = 3.18968$ with 4 d.f. (p ≈ 0.53), which is not close to statistically significant.
This comparison suggests that the proportional odds assumption is plausible for the small cell lung cancer study.
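The test statistic and its p-value can be checked with a few lines of arithmetic. The Python sketch below uses the closed-form chi-square tail probability that holds for even degrees of freedom; the function name `chi2_sf_even_df` is introduced here for illustration.

```python
import math

loglik_po = -394.52832     # proportional odds (ologit) fit
loglik_gen = -392.93348    # generalized ordered logit (gologit2) fit
df = 4                     # 6 slopes in the generalized model minus 2 in ologit

lr_stat = -2.0 * (loglik_po - loglik_gen)

def chi2_sf_even_df(x, df):
    """P(chi2_df > x) for even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!."""
    half = x / 2.0
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)

p_value = chi2_sf_even_df(lr_stat, df)
print(f"LR chi2({df}) = {lr_stat:.5f}, p = {p_value:.3f}")
```

A p-value of about 0.53 gives no indication that the cut-point-specific coefficients improve on the common ones.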



Summary

Ordinal Logit Model


The logistic regression model can be extended to situations where the outcome variable has multiple ordered categories.
There are several choices for defining the outcome metric; here we examined cumulative logits, the odds of a given category or below vs. the categories above.
Working with the model and interpreting it can be more difficult than in the binary case.
We also need to check assumptions such as proportional odds.
Nonetheless, these models effectively support evaluation of multiple covariates in relation to ordered discrete outcomes.
