
Ordinal logistic regression using SPSS

Mike Crowson, Ph.D.


Created July 15, 2019

In this video, I provide a demonstration of how to carry out and interpret an ordinal logistic regression.

A link for the data used, as well as this PowerPoint, will be made available for download underneath the
video description. Additionally, a “running” document containing links to other videos on logistic regression
using other programs will be made available as well.

If you find the video and materials useful, please take the time to “like” the video and share the link with
others. Also, please consider subscribing to my YouTube channel.

YouTube video link: https://www.youtube.com/watch?v=rSCdwZD1DuM

For more videos and resources, check out my website: https://sites.google.com/view/statistics-for-the-real-world/home
Overview

Binary logistic regression is utilized when a researcher is modeling a predictive relationship between one or
more independent variables and a binary dependent variable. Although this is probably the most common
form of logistic regression in the research literature, there are other logistic regression models that can be
useful when your dependent variable has more than two categories, whether those categories are unordered
or ordered. Multinomial logistic regression (MLR) is generally used when you have more than two
categories on the dependent variable that are unordered. Ordinal logistic regression (OLR) is generally used
when the categories of the dependent variable are ordered (i.e., are ranked).

Although it is permissible to utilize MLR to analyze data involving an ordered categorical dependent
variable, OLR is generally preferable (Osborne, 2015). Unlike MLR (which produces multiple sets of
regression coefficients and associated tests), OLR yields only a single set of regression coefficients to
estimate relationships between independent and dependent variables. As such, OLR will provide a more
parsimonious representation of the data than MLR when the dependent variable is ordered. Nevertheless,
when the proportional odds assumption is violated, MLR provides a viable alternative to OLR (see
https://stats.idre.ucla.edu/spss/dae/ordinal-logistic-regression/). [The proportional odds assumption
essentially states that the relationship between the independent variable and dependent variable is
constant, irrespective of which groups are being compared on the dependent variable (see Osborne, 2015,
2017).]
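In equation form, the model being fit here can be written as a set of cumulative logits that share a single set of slopes (a standard formulation of the proportional odds model; the θⱼ are the threshold/intercept terms SPSS reports and the β's are the regression coefficients):

logit[P(Y ≤ j)] = θⱼ − (β₁X₁ + β₂X₂ + … + βₖXₖ),  j = 1, …, J−1

Only the thresholds θⱼ vary across the J−1 cumulative splits of the dependent variable; the slopes are constant across splits, which is exactly what the proportional odds assumption requires.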
Scenario: Let’s say you are a researcher studying predictors of student interest. You collect data from 200
students on several variables. (Below is a subset of the data). “Pass” indicates whether a student passed
(coded 1) or failed (coded 0) a previous subject matter test. “Masteryg” is mastery goals (higher scores
indicate greater mastery goals). “Fearfail” is fear of failure (higher scores indicate greater fear of failure).
“Masteryg” and “Fearfail” are treated as continuous variables. “Genderid” is a binary variable (like pass),
dummy coded 0=identified male, 1=identified female. “Interestlev” is an ordered, categorical variable
indicating students’ self-reported interest in the next topic in class. It is coded 1=low interest, 2=medium
interest, 3=high interest.
Ordinal logistic regression (using SPSS): Route 1
Here, we place the “Interestlev” variable in the Dependent box and
the remaining variables (IVs) in the Covariate(s) box. Although “pass”
and “genderid” are categorical variables, we can include them as
covariates because they are binary (dummy coded). However, if you
have categorical variables with more than two levels, then you must
use the Factor(s) box for them.
[FYI, I could have also entered the above variables as factors, but I
prefer having control over the designation of the reference
category; SPSS defaults to treating the category with the higher
value as the reference category.]
We’ll check (at minimum) these boxes. “Test of parallel lines” provides a test of the proportional odds
assumption.
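For those who prefer syntax to dialog boxes, a minimal sketch of the equivalent PLUM command follows (variable names taken from the dataset described earlier; TPARALLEL requests the test of parallel lines):

* Route 1: ordinal (cumulative logit) regression via PLUM.
* FIT = goodness-of-fit table; TPARALLEL = test of parallel lines.
PLUM interestlev WITH masteryg fearfail pass genderid
  /LINK = LOGIT
  /PRINT = FIT PARAMETER SUMMARY TPARALLEL.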
The Case Processing Summary tells you the proportion of cases falling at each level of the dependent
variable (Interestlev).

The Model Fitting Information (see right) contains the -2 Log Likelihood for an Intercept only (or null) model
and the Full Model (containing the full set of predictors). We also have a likelihood ratio chi-square test of
whether there is a significant improvement in fit of the Final model relative to the Intercept-only model.
In this case, we see a significant improvement in fit of the Final model over the null model [χ²(4)=30.249,
p<.001].
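For reference, this chi-square statistic is simply the difference between the two −2 log likelihood values in the table, with degrees of freedom equal to the number of predictors added:

χ²(LR) = (−2LL of Intercept-only model) − (−2LL of Final model),  df = 4 (the four predictors)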
The “Goodness of Fit” table contains the Deviance and Pearson chi-square tests, which are useful for
determining whether a model exhibits good fit to the data. Non-significant test results are indicators that the
model fits the data well (Field, 2018; Petrucci, 2009). [Note: The two tests do not always agree; when they
conflict, the evidence regarding model fit is mixed.]

In this analysis, both the Pearson chi-square test [χ²(394)=400.412, p=.401] and the deviance test
[χ²(394)=403.353, p=.362] were non-significant. These results suggest good model fit.
These are pseudo-R-square values that are treated as rough analogues to the R-square value in OLS
regression. In general, there is no strong guidance in the literature on how these should be used or
interpreted (Lomax & Hahs-Vaughn, 2012; Osborne, 2015; Pituch & Stevens, 2016; Smith & McKenna,
2013). As such, one should interpret these with caution.
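For reference, the three indices SPSS reports (Cox and Snell, Nagelkerke, and McFadden) are computed from the likelihoods of the null and full models, L₀ and Lₘ (standard formulas, not specific to this example; n is the sample size):

R²(Cox & Snell) = 1 − (L₀/Lₘ)^(2/n)
R²(Nagelkerke) = R²(Cox & Snell) / (1 − L₀^(2/n))
R²(McFadden) = 1 − ln(Lₘ)/ln(L₀)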
Here, we have the regression coefficients and significance tests for each of the independent variables in the model.
The regression coefficients are literally interpreted as the predicted change in log odds of being in a higher (as
opposed to a lower) group/category on the dependent variable (controlling for the remaining independent
variables) per unit increase on the independent variable. As such…

We interpret a positive Estimate (b) in the following way: For every one unit increase on an independent variable,
there is a predicted increase (of a certain amount) in the log odds of falling at a higher level of the dependent
variable. More generally, this indicates that as scores increase on an independent variable, there is an increased
probability of falling at a higher level on the dependent variable.
We interpret a negative Estimate (b) in the following way: For every one unit increase on an independent variable,
there is a predicted decrease (of a certain amount) in the log odds of falling at a higher level of the dependent
variable. More generally, this indicates that as scores increase on an independent variable, there is a decreased
probability of falling at a higher level on the dependent variable.
The Threshold estimates that are given in this table are intercepts. Osborne (2017) states that these
estimates can be interpreted as the “log odds of being in a particular group or lower when scores on the
other variable(s) are zero” (p. 147).
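To make that interpretation concrete: converting a threshold from the log odds scale to a probability gives the predicted chance of being at category j or lower when all predictors equal zero (this follows directly from the model equation given earlier):

P(Y ≤ j | all X = 0) = e^(θⱼ) / (1 + e^(θⱼ))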
1. Mastery goals was a significant positive predictor of Interest in the next topic. For every one unit increase on
mastery goals, there is a predicted increase of .026 in the log odds of a student being in a higher (as opposed
to lower) category on Interest. This indicates that students scoring higher on mastery goals were more likely
to indicate greater interest in the next topic.
2. Fear of failure was not a significant predictor in the model. [The coefficient is interpreted as follows: For every
one unit increase on fear of failure, there is a predicted decrease of .015 in the log odds of being in a higher
level of the dependent variable.]
3. Pass was a significant positive predictor of Interest. Since Pass is a binary variable, the slope represents the
difference in log odds between individuals in the “failed” group and the “passed” group. The log odds of being
in a higher level on Interest was .820 points higher on average for those who passed the previous subject
matter test as compared to those who failed the test.
4. Gender identification was not a significant predictor. [Again, because this is a binary variable, the slope can be
thought of as the difference in log odds between groups. On average, the log odds of being in a higher
Interest category was .232 points greater for persons identified as female than males.]
As mentioned previously, OLR assumes that the relationship between the IVs and the dependent variable is
the same “across all possible comparisons” (Osborne, 2017, p. 147) involving levels of the dependent
variable – an assumption referred to as proportional odds.

When the result of the Test of Parallel Lines (i.e., the test of the proportional odds assumption) is
non-significant, we interpret it to mean that the assumption is satisfied. Statistical significance is taken as an
indicator that the assumption is not satisfied.

In the results from our analysis, we interpret the test to mean that the assumption is satisfied (p=.854).
Ordinal logistic regression (using SPSS): Route 2 (using the generalized linear models option)
One downside of using the previous option is that we cannot get odds ratios (ORs), which reflect the
changing odds of a case falling at the next higher level on the dependent variable. Moreover, the test results
associated with the independent variables are based solely on Wald tests, which can be less powerful than
likelihood ratio chi-square tests. Using the Generalized Linear Models option, we can obtain all of this
additional information.
If you have “factor” variables, then you could include them in the Factors box. Unlike Route 1, you can
actually specify the reference category.

Include independent variables not treated as factors in the Covariates box.
Here, I have requested likelihood ratio chi-square statistics and odds ratios to be printed in the output.
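A minimal sketch of equivalent GENLIN syntax follows (again assuming the variable names from this dataset). DISTRIBUTION=MULTINOMIAL with LINK=CUMLOGIT specifies the cumulative logit model, ANALYSISTYPE=3(LR) requests likelihood ratio tests, and SOLUTION (EXPONENTIATED) prints the odds ratios:

* Route 2: ordinal regression via generalized linear models (GENLIN).
GENLIN interestlev (ORDER = ASCENDING) WITH masteryg fearfail pass genderid
  /MODEL masteryg fearfail pass genderid
    DISTRIBUTION = MULTINOMIAL LINK = CUMLOGIT
  /CRITERIA ANALYSISTYPE = 3(LR) CILEVEL = 95
  /PRINT CPS FIT SUMMARY SOLUTION (EXPONENTIATED).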
These are various goodness-of-fit statistics.

You’ll notice that although the Pearson chi-square and Deviance appear in this table, test results are not
provided (as we saw in the Goodness of Fit table via Route 1). Nevertheless, both values and degrees of
freedom are provided, which could be used to test for model fit using the chi-square distribution. (Of
course, it’s probably less work to obtain that information via Route 1.)
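If you did want to compute a p-value from those values yourself, here is a quick sketch using SPSS’s SIG.CHISQ function (which returns the upper-tail probability of a chi-square value), applied to the Pearson statistic from this analysis:

* Upper-tail p-value for a chi-square value and its df.
DATA LIST FREE / chisq df.
BEGIN DATA
400.412 394
END DATA.
COMPUTE p = SIG.CHISQ(chisq, df).
EXECUTE.
LIST.

This should reproduce the p = .401 reported for the Pearson test via Route 1.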
This is the Likelihood ratio chi-square test we saw
via Route 1. We see that our full model was a
significant improvement in fit over the null (no
predictors) model [χ²(4)=30.249, p<.001].
Running your logistic regression through this route will allow you to obtain both Wald tests of the
predictors (see test results under Parameter Estimates) and Likelihood ratio tests (see Tests of Model
Effects). For the most part, the p-values from both tables are very consistent.
A closer look at the table:
Here, you’ll see roughly the same information contained in the previous table of regression coefficients
through Route 1. One of the main differences is the Exp(B) column (and confidence interval). The Exp(B)
column contains odds ratios reflecting the multiplicative change in the odds of being in a higher category
on the dependent variable for every one unit increase on the independent variable, holding the remaining
independent variables constant. An odds ratio > 1 suggests an increasing probability of being in a higher
level on the dependent variable as values on an independent variable increase, whereas a ratio < 1
suggests a decreasing probability with increasing values on an independent variable. An odds ratio = 1
suggests no predicted change in the likelihood of being in a higher category as values on an independent
variable increase.
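The odds ratios are simply the exponentiated regression coefficients, OR = e^b. Applying this to the estimates reported in this analysis reproduces the Exp(B) column (up to rounding): e^0.026 ≈ 1.027, e^−0.015 ≈ 0.985, e^0.820 ≈ 2.270, and e^0.232 ≈ 1.261.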
As before, mastery goals was a significant positive predictor of Interest in the next topic. For every one unit
increase on mastery goals, there is a predicted increase of .026 in the log odds of a student being in a higher
level of the Interest (dependent) variable. This indicates that students scoring higher on mastery goals were
more likely to indicate greater interest in the next topic.

The odds ratio indicates that the odds of being in a higher category on Interest increases by a factor of 1.027 for
every one unit increase on mastery goals.
Fear of failure was not a significant predictor in the model. [The regression coefficient indicates that for every
one unit increase on fear of failure, there is a predicted decrease of .015 in the log odds of being in a higher level
of the dependent variable (controlling for the remaining predictors).]

The odds ratio indicates that the odds of being in a higher category on Interest change by a factor of .985 for
every one unit increase on fear of failure. [Given that the odds ratio is < 1, this indicates a decreasing probability
of being in a higher level on the Interest variable as scores increase on fear of failure.]
Pass was a significant positive predictor of Interest. The log odds of being in a higher level on Interest was .820
points higher on average for those who passed the previous subject matter test than those who failed the test.

The odds of students who passed (the previous subject matter test) being in a higher category on the dependent
variable were 2.270 times that of those who failed the test.

Gender identification was not a significant predictor. [On average, the log odds of being in a higher Interest
category was .232 points greater for females than males.]

The odds of a student identified as female being in a higher category on the dependent variable were 1.261 times
that of a student identified as male (although again, gender identification was not a significant predictor).
What if we analyze the data using MLR?
Model fit was good according to these results.
The test results in the Parameter Estimates table are of regression slopes for comparisons between level 1
(the reference category) and level 2, and between level 1 and level 3.
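For completeness, a minimal sketch of this MLR run via NOMREG (variable names as before; BASE=FIRST is assumed here so that level 1 serves as the reference category, matching the comparisons described above):

* MLR alternative, e.g., if proportional odds were violated.
NOMREG interestlev (BASE = FIRST ORDER = ASCENDING) WITH masteryg fearfail pass genderid
  /PRINT = PARAMETER SUMMARY LRT FIT.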
References and resources

Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). Los Angeles: Sage.

Lomax, R.G., & Hahs-Vaughn, D.L. (2012). An introduction to statistical concepts (3rd ed.). New York: Routledge.

Osborne, J.W. (2015). Best practices in logistic regression. Los Angeles: Sage.

Osborne, J.W. (2017). Regression and linear modeling: Best practices and modern methods. Thousand Oaks, CA: Sage.

Petrucci, C.J. (2009). A primer for social worker researchers on how to conduct a multinomial logistic regression. Journal
of Social Service Research, 35, 193-205.

Pituch, K.A., & Stevens, J.A. (2016). Applied multivariate statistics for the social sciences (6th ed.). New York: Routledge.

Smith, T.J., & McKenna, C.M. (2013). A comparison of logistic regression pseudo R2. Multiple Linear Regression
Viewpoints, 39, 17-26. Retrieved from http://www.glmj.org/archives/articles/Smith_v39n2.pdf on June 20, 2019.

Tabachnick, B.G., & Fidell, L.S. (2013). Using multivariate statistics (6th ed.). New York: Pearson.

UCLA: Statistical Consulting Group. Ordinal logistic regression: SPSS Examples. From
https://stats.idre.ucla.edu/spss/dae/ordinal-logistic-regression/
