Binary Logistic
with SPSS/PASW
Karl L. Wuensch
Dept of Psychology
East Carolina University
Download the Instructional
Document
http://core.ecu.edu/psyc/wuenschk/SPSS/SPSS-MV.htm .
Click on Binary Logistic Regression .
Save to desktop.
Open the document.
When to Use Binary Logistic Regression
The criterion variable is dichotomous.
Predictor variables may be categorical or
continuous.
If predictors are all continuous and nicely
distributed, may use discriminant function
analysis.
If predictors are all categorical, may use
logit analysis.
Wuensch & Poteat, 1998
Cats being used as research subjects.
Stereotaxic surgery.
Subjects pretend they are on university
research committee.
Complaint filed by animal rights group.
Vote to stop or continue the research.
Purpose of the Research
Cosmetic
Theory Testing
Meat Production
Veterinary
Medical
Predictor Variables
Gender
Ethical Idealism (9-point Likert)
Ethical Relativism (9-point Likert)
Purpose of the Research
Model 1: Decision = Gender
Decision 0 = stop, 1 = continue
Gender 0 = female, 1 = male
Model is: $\text{logit} = \ln(ODDS) = \ln\left(\frac{\hat{Y}}{1-\hat{Y}}\right) = a + bX$
$\hat{Y}$ is the predicted probability of the event
which is coded with 1 (continue the research)
rather than with 0 (stop the research).
Exponentiate Both Sides
Exponentiate both sides of the equation:
$e^{-.379} = .684 = \text{Exp}(B_0)$ = odds of deciding to
continue the research.
128 voted to continue the research, 187 to stop
it.
$\frac{\hat{Y}}{1-\hat{Y}} = \text{Exp}(-.379) = .684 = \frac{128}{187}$
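The intercept-only arithmetic can be checked directly from the two vote counts. What follows is a minimal Python sketch rather than SPSS output; the variable names are illustrative and not part of the original analysis.

```python
import math

# Vote counts reported in the slides: 128 continue, 187 stop.
n_continue = 128
n_stop = 187

# Odds of voting to continue.
odds = n_continue / n_stop

# The intercept of the intercept-only model is the natural log of those odds.
b0 = math.log(odds)

# Exponentiating the intercept recovers the odds.
odds_from_b0 = math.exp(b0)

print(f"odds    = {odds:.3f}")          # 0.684
print(f"B0      = {b0:.3f}")            # -0.379
print(f"Exp(B0) = {odds_from_b0:.3f}")  # 0.684
```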
Probabilities
Randomly select one participant.
P(votes continue) = 128/315 = 40.6%
P(votes stop) = 187/315 = 59.4%
Odds = 40.6/59.4 = .684
Repeatedly sample one participant and
guess how he or she will vote.
Humans vs. Goldfish
Humans Match Probabilities
(suppose p = .7, q = .3)
.7(.7) + .3(.3) = .49 + .09 = .58
Goldfish Maximize Probabilities
.7(1) = .70
The goldfish win!
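The comparison between matching and maximizing can be written as one expected-accuracy formula. This is a small Python sketch under the slide's assumption that p = .7; the function name `expected_accuracy` is hypothetical.

```python
def expected_accuracy(p_event: float, p_guess_event: float) -> float:
    """Expected proportion of correct guesses when the event occurs with
    probability p_event and the guesser independently predicts the event
    with probability p_guess_event."""
    q_event = 1.0 - p_event
    q_guess = 1.0 - p_guess_event
    return p_event * p_guess_event + q_event * q_guess

p = 0.7

# Humans match probabilities: guess the event 70% of the time.
matching = expected_accuracy(p, p)

# Goldfish maximize: always guess the more likely outcome.
maximizing = expected_accuracy(p, 1.0)

print(f"matching   = {matching:.2f}")    # 0.58
print(f"maximizing = {maximizing:.2f}")  # 0.70
```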
SPSS Model 0 vs. Goldfish
Look at the Classification Table for Block 0.
SPSS Predicts STOP for every participant.
SPSS is as smart as a Goldfish here.
Classification Table (a, b)
                                    Predicted
                                decision       Percentage
Observed                      stop  continue   Correct
Step 0  decision  stop         187         0   100.0
                  continue     128         0      .0
        Overall Percentage                      59.4
a. Constant is included in the model.
b. The cut value is .500
Block 1 Model
Gender has now been added to the model.
Model Summary: -2 Log Likelihood = how
poorly model fits the data.
Model Summary
Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      399.913 (a)         .078                   .106
a. Estimation terminated at iteration number 3 because parameter estimates changed by less than .001.
Block 1 Model
For intercept only, -2LL = 425.566.
Add gender and -2LL = 399.913.
Omnibus Tests: Drop in -2LL = 25.653 = Model $\chi^2$.
df = 1, p < .001.
Omnibus Tests of Model Coefficients
Step 1    Chi-square   df   Sig.
Step      25.653       1    .000
Block     25.653       1    .000
Model     25.653       1    .000
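The model chi-square can be reproduced by hand. The Python sketch below computes the intercept-only -2LL from the vote counts given earlier (128 continue, 187 stop), subtracts the Block 1 value copied from the Model Summary, and obtains p from the closed form for a chi-square on 1 df. It is an illustration only, not how SPSS computes its output, and the variable names are assumptions.

```python
import math

n_continue, n_stop = 128, 187
n = n_continue + n_stop

# Intercept-only model: every participant gets the same predicted
# probability, the overall proportion voting to continue.
p = n_continue / n
neg2ll_null = -2 * (n_continue * math.log(p) + n_stop * math.log(1 - p))

# -2LL after adding gender, copied from the Model Summary table.
neg2ll_gender = 399.913

# The drop in -2LL is the model chi-square, here on 1 df.
chi_square = neg2ll_null - neg2ll_gender

# For exactly 1 df, the upper-tail chi-square probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"-2LL, intercept only = {neg2ll_null:.3f}")
print(f"model chi-square(1)  = {chi_square:.3f}")
print(f"p < .001: {p_value < 0.001}")
```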
Variables in the Equation
$\ln(ODDS) = -.847 + 1.217 \cdot Gender$
$ODDS = e^{a + b \cdot Gender}$
Variables in the Equation
                      B      S.E.    Wald     df   Sig.   Exp(B)
Step 1 (a)  gender    1.217  .245    24.757   1    .000   3.376
            Constant  -.847  .154    30.152   1    .000    .429
a. Variable(s) entered on step 1: gender.
Odds, Women
A woman is only .429 as likely to decide to
continue the research as she is to decide
to stop it.
$ODDS = e^{-.847 + 1.217(0)} = e^{-.847} = 0.429$
Odds, Men
A man is 1.448 times as likely to vote to
continue the research as to stop it.
$ODDS = e^{-.847 + 1.217(1)} = e^{.37} = 1.448$
Odds Ratio
1.217 was the B (slope) for Gender, 3.376 is the
Exp(B), that is, the exponentiated slope, the
odds ratio.
The odds of voting to continue the research are
3.376 times as large for men as for women.
$\frac{\text{male\_odds}}{\text{female\_odds}} = \frac{1.448}{.429} = 3.376 = e^{1.217}$
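The odds for each gender and their ratio follow from the two coefficients. This Python sketch uses the rounded B values from the Variables in the Equation table, so the ratio may differ from SPSS's Exp(B) in the third decimal; the helper `predicted_odds` is a hypothetical name.

```python
import math

# Coefficients from the Variables in the Equation table (rounded to 3 decimals).
a = -0.847   # Constant
b = 1.217    # slope for gender (0 = female, 1 = male)

def predicted_odds(gender: int) -> float:
    """Predicted odds of voting to continue for the given gender code."""
    return math.exp(a + b * gender)

odds_female = predicted_odds(0)
odds_male = predicted_odds(1)

# The ratio of the two odds is algebraically equal to exp(b).
odds_ratio = odds_male / odds_female

print(f"odds, women = {odds_female:.3f}")  # 0.429
print(f"odds, men   = {odds_male:.3f}")    # 1.448
print(f"odds ratio  = {odds_ratio:.3f}")
```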
Convert Odds to Probabilities
For our women, $\hat{Y} = \frac{ODDS}{1 + ODDS} = \frac{0.429}{1.429} = 0.30$
For our men, $\hat{Y} = \frac{ODDS}{1 + ODDS} = \frac{1.448}{2.448} = 0.59$
Classification
Decision Rule: If Prob (event) > Cutoff,
then predict event will take place.
By default, SPSS uses .5 as Cutoff.
For every man, Prob(continue) = .59,
predict he will vote to continue.
For every woman Prob(continue) = .30,
predict she will vote to stop it.
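The decision rule can be sketched in a few lines of Python: convert each group's predicted odds to a probability, then compare it with the cutoff. The function names are hypothetical, and the .5 default mirrors the SPSS default mentioned in the slide.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability: odds / (1 + odds)."""
    return odds / (1.0 + odds)

def classify(probability: float, cutoff: float = 0.5) -> str:
    """Predict the event ("continue") only if its probability exceeds the cutoff."""
    return "continue" if probability > cutoff else "stop"

# Predicted odds for each gender, taken from the slides.
predicted_odds = {"female": 0.429, "male": 1.448}

for group, group_odds in predicted_odds.items():
    prob = odds_to_probability(group_odds)
    print(f"{group}: P(continue) = {prob:.2f}, predict {classify(prob)}")
```

Because gender is the only predictor, every participant of the same gender receives the same prediction.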
Overall Success Rate
Look at the Classification Table
SPSS beat the Goldfish!
$\frac{140 + 68}{315} = \frac{208}{315} = 66\%$
Classification Table (a)
                                    Predicted
                                decision       Percentage
Observed                      stop  continue   Correct
Step 1  decision  stop         140        47    74.9
                  continue      60        68    53.1
        Overall Percentage                      66.0
a. The cut value is .500
Sensitivity
P (correct prediction | event did occur)
P (predict Continue | subject voted to Continue)
Of all those who voted to continue the research,
for how many did we correctly predict that.
$\frac{68}{68 + 60} = \frac{68}{128} = 53\%$
Specificity
P (correct prediction | event did not occur)
P (predict Stop | subject voted to Stop)
Of all those who voted to stop the research, for
how many did we correctly predict that.
$\frac{140}{140 + 47} = \frac{140}{187} = 75\%$
False Positive Rate
P (incorrect prediction | predicted occurrence)
P (subject voted to Stop | we predicted Continue)
Of all those for whom we predicted a vote to Continue
the research, how often were we wrong.
$\frac{47}{47 + 68} = \frac{47}{115} = 41\%$
False Negative Rate
P (incorrect prediction | predicted nonoccurrence)
P (subject voted to Continue | we predicted Stop)
Of all those for whom we predicted a vote to Stop the
research, how often were we wrong.
$\frac{60}{60 + 140} = \frac{60}{200} = 30\%$
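All five classification statistics come from the same four cell counts. The Python sketch below recomputes them from the Step 1 Classification Table, using the definitions given in these slides, in which the false positive and false negative rates are conditioned on the prediction. The variable names are illustrative.

```python
# Cell counts from the Step 1 Classification Table.
# Naming: observed decision first, predicted decision second.
stop_stop = 140          # voted stop, predicted stop
stop_continue = 47       # voted stop, predicted continue
continue_stop = 60       # voted continue, predicted stop
continue_continue = 68   # voted continue, predicted continue

total = stop_stop + stop_continue + continue_stop + continue_continue

overall = (stop_stop + continue_continue) / total
sensitivity = continue_continue / (continue_continue + continue_stop)
specificity = stop_stop / (stop_stop + stop_continue)

# As defined in these slides, both error rates condition on the prediction.
false_positive_rate = stop_continue / (stop_continue + continue_continue)
false_negative_rate = continue_stop / (continue_stop + stop_stop)

for name, value in [
    ("overall success", overall),
    ("sensitivity", sensitivity),
    ("specificity", specificity),
    ("false positive rate", false_positive_rate),
    ("false negative rate", false_negative_rate),
]:
    print(f"{name:<20s} {value:.0%}")
```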
Pearson $\chi^2$
Analyze, Descriptive Statistics, Crosstabs
Gender → Rows; Decision → Columns
Crosstabs Statistics
Statistics, Chi-Square, Continue
Crosstabs Cells
Cells, Observed Counts, Row
Percentages
Crosstabs Output
Continue, OK
59% & 30% match the logistic regression predictions.
gender * decision Crosstabulation
                                     decision
                                  stop    continue   Total
gender  Female  Count              140        60      200
                % within gender   70.0%     30.0%   100.0%
        Male    Count               47        68      115
                % within gender   40.9%     59.1%   100.0%
Total           Count              187       128      315
                % within gender   59.4%     40.6%   100.0%
Crosstabs Output
Likelihood Ratio $\chi^2$ = 25.653, as with
logistic.
Chi-Square Tests
                      Value        df   Asymp. Sig. (2-sided)
Pearson Chi-Square    25.685 (b)   1    .000
Likelihood Ratio      25.653       1    .000
N of Valid Cases      315
a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 46.73.
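Both chi-square statistics can be recomputed from the four cell counts of the crosstabulation. The Python sketch below is an illustration with assumed variable names; it does not reproduce SPSS's own procedure.

```python
import math

# Observed counts from the gender * decision crosstabulation.
observed = [
    [140, 60],   # female: stop, continue
    [47, 68],    # male:   stop, continue
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

pearson = 0.0
likelihood_ratio = 0.0
min_expected = float("inf")

for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count under independence of gender and decision.
        expected = row_totals[i] * col_totals[j] / n
        min_expected = min(min_expected, expected)
        pearson += (count - expected) ** 2 / expected
        likelihood_ratio += 2 * count * math.log(count / expected)

print(f"Pearson chi-square     = {pearson:.3f}")           # 25.685
print(f"Likelihood ratio       = {likelihood_ratio:.3f}")  # 25.653
print(f"minimum expected count = {min_expected:.2f}")      # 46.73
```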
Model 2: Decision =
Idealism, Relativism, Gender
Analyze, Regression, Binary Logistic
Decision → Dependent
Gender, Idealism, Relatvsm →
Covariate(s)
Click Options and check Hosmer-
Lemeshow goodness of fit and CI for
exp(B) 95%.
Continue, OK.
Comparing Nested Models
With only intercept and gender,
-2LL = 399.913.
Adding idealism and relativism dropped
-2LL to 346.503, a drop of 53.41.
$\chi^2(2) = 399.913 - 346.503 = 53.41$, p = ?
Model Summary
Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      346.503 (a)         .222                   .300
a. Estimation terminated at iteration number 4 because parameter estimates changed by less than .001.
Obtain p
Transform, Compute
Target Variable = p
Numeric Expression =
1 - CDF.CHISQ(53.41,2)
p = ?
OK
Data Editor, Variable View
Set Decimal Points to 5 for p
p < .0001
Data Editor, Data View
p = .00000
Adding the ethical ideology variables
significantly improved the model.
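The same p value can be obtained outside SPSS. With exactly 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2), so the Python sketch below needs only the standard library. It is an equivalent of the expression 1 - CDF.CHISQ(53.41,2) for this case, not a general chi-square routine.

```python
import math

# Drop in -2LL from adding idealism and relativism to the gender-only model.
chi_square = 399.913 - 346.503

# For df = 2 only, the upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"chi-square(2) = {chi_square:.2f}")
print(f"p = {p_value:.2e}")
print(f"p < .0001: {p_value < 0.0001}")
```

The result is far smaller than .0001, which is why the SPSS Data View shows .00000 at five decimal places.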
Hosmer-Lemeshow
H