
STAT 3888

Semester 2 Statistical Machine Learning 2022

Tutorial Exercise 3.

1. Consider the following data collected by Erickson (1987) as part of a study on the
measurement of anaesthetic depth. The potency of an anaesthetic agent is measured
in terms of the minimum alveolar concentration (MAC) of the agent at which 50% of
patients exhibit no response to stimulation (i.e. do not move - moving means jerking
or twisting, not twitching or grimacing - in response to a surgical incision). Thirty
patients were administered an anaesthetic agent which was maintained at a predetermined
alveolar concentration (actually, anaesthetists refer to concentration when they mean
partial pressure, hence alveolar concentration is measured as a percentage of one
atmosphere) for 15 minutes before a single incision was made in each patient. For each
patient, the alveolar concentration of the anaesthetic agent and the patient’s response
to incision was recorded.
Consider the following R code:

> x
[1] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 1.0 1.0 1.0 1.0 1.0 1.2 1.2 1.2 1.2 1.2
1.2 1.4 1.4 1.4 1.4 1.4 1.4 1.6 1.6 1.6 1.6 2.5 2.5
> y
[1] 1 1 1 1 1 1 0 1 1 1 1 0 1 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0
> res <- glm(y~x, family = "binomial")
> res
Call: glm(formula = y ~ x, family = "binomial")
Coefficients:
(Intercept) x
6.469 -5.567
Degrees of Freedom: 29 Total (i.e. Null); 28 Residual
Null Deviance: 41.46
Residual Deviance: 27.75 AIC: 31.75

The alveolar concentration for each patient is stored in the R vector x above. If the patient
“responds to stimulation” then the corresponding element of y is 1, and 0 otherwise.
A summary of the fitted model is given below.

> summary(res)
Call:
glm(formula = y ~ x, family = "binomial")
Deviance Residuals:
Min 1Q Median 3Q Max
-2.06900 -0.68666 -0.03413 0.74407 1.76666
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 6.469 2.418 2.675 0.00748 **
x -5.567 2.044 -2.724 0.00645 **
---
Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 41.455 on 29 degrees of freedom
Residual deviance: 27.754 on 28 degrees of freedom
AIC: 31.754
Number of Fisher Scoring iterations: 5

Use the above R code and output to answer the following questions.

(a) State the fitted model.


Solution: The fitted model is

logit (P(y = 1)) = 6.469 − 5.567x

where logit(x) = log(x/(1 − x)). Alternatively,

P(y = 1) = [1 + exp(−(6.469 − 5.567x))]^(-1) = expit(6.469 − 5.567x)

where expit(x) = logit^(-1)(x) = 1/(1 + exp(−x)).
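
As a quick check of the fitted model in R, the estimated coefficients and fitted probabilities can be recovered directly; a minimal sketch, assuming the objects x and res created above:

beta <- coef(res)                        # (Intercept) = 6.469, x = -5.567
p.hat <- plogis(beta[1] + beta[2] * x)   # plogis() is the expit function
all.equal(unname(p.hat), unname(fitted(res)))   # should return TRUE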

(b) Estimate the MAC value corresponding to P(y = 1) = 0.5.


Solution: If P(y = 1) = 0.5 then

logit (P(y = 1)) = 0

so we need to solve for x the equation

0 = 6.469 − 5.567x

which gives us x = 6.469/5.567 ≈ 1.162 to 3 d.p.
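
In R this estimate is simply minus the ratio of the fitted coefficients; a minimal sketch, again assuming the object res from above:

-coef(res)[1] / coef(res)[2]    # = 6.469/5.567 ≈ 1.162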

(c) If x = 1.1, predict whether or not the subject responds to stimulation.
Solution: If x = 1.1 then

P(y = 1) = [1 + exp (−(6.469 − 5.567 × 1.1))]−1 = [1 + exp (−0.3453)]−1 ≈ 0.585

to 3 d.p. Since the predicted probability is greater than 0.5 we would predict
y = 1, that is, the subject responds to stimulation.
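
The same prediction can be obtained with predict(); a minimal sketch, assuming the object res from above:

predict(res, newdata = data.frame(x = 1.1), type = "response")   # approximately 0.585
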
(d) Interpret the effect of increasing x by one unit on response.
Solution: Increasing the value of x by 1 decreases the log-odds of y = 1 by
5.567. To see this, the log-odds at x = 1.1 is

log(P(y = 1)/P(y = 0)) = 6.469 − 5.567 × 1.1 = 0.3453

If we increase x to 2.1 the log-odds is


 
log(P(y = 1)/P(y = 0)) = 6.469 − 5.567 × 2.1 = −5.2217

The difference between these is a decrease of 5.567, the estimated coefficient for x.
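
Equivalently, on the odds scale a one-unit increase in x multiplies the odds of response by exp(−5.567); a minimal sketch, assuming the object res from above:

exp(coef(res)["x"])    # exp(-5.567) ≈ 0.0038, so the odds are multiplied by about 0.004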

(e) Let β1 be the coefficient corresponding to x in the logistic regression model. What
is the Wald statistic for testing the hypothesis H0 : β1 = 0? What is the cor-
responding p-value? Is the coefficient β1 statistically significantly different from
0?
Solution: The Wald Statistic for the hypotheses

H0 : β1 = 0 versus H1 : β1 ≠ 0

is W1 = β̂1/se(β̂1) = −5.567/2.044 = −2.724 to 3 d.p. The p-value is
2*pnorm(abs(-2.724), lower.tail=FALSE) ≈ 0.00645. Both values can be read from
the summary R output. Since the p-value is smaller than 0.05, we reject the null
hypothesis H0 : β1 = 0 and conclude that β1 is significantly different from 0.
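
The Wald statistic and its p-value can also be extracted from the coefficient table; a minimal sketch, assuming the object res from above:

coef(summary(res))["x", c("z value", "Pr(>|z|)")]                       # -2.724 and 0.00645
2 * pnorm(abs(coef(summary(res))["x", "z value"]), lower.tail = FALSE)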

2. A retrospective sample of males in a coronary heart disease (CHD) high-risk region
of the Western Cape, South Africa was collected. Samples were divided into cases (of
CHD) and controls. Many of the CHD positive men have undergone blood pressure
reduction treatment and other programs to reduce their risk factors after their CHD
event. The variables in the study are:

Variable   Description
sbp        systolic blood pressure
tobacco    cumulative tobacco (kg)
ldl        low density lipoprotein cholesterol
famhist    family history of heart disease (Present, Absent)
bmi        body mass index
alcohol    current alcohol consumption
age        age at onset
chd        response (CHD status)

Assume that the variables sbp, tobacco, ldl, famhist, bmi, alcohol, age and chd
have already been entered as columns of the R data frame dat.
Consider the R code:

res <- glm(chd ~ ., data = dat, family = binomial)


summary(res)
Call:
glm(formula = chd ~ ., family = "binomial", data = dat)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.7781 -0.8213 -0.4387 0.8889 2.5435
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.1507209 1.3082600 -4.701 2.58e-06 ***
sbp 0.0065040 0.0057304 1.135 0.256374
tobacco 0.0793764 0.0266028 2.984 0.002847 **
ldl 0.1739239 0.0596617 2.915 0.003555 **
adiposity 0.0185866 0.0292894 0.635 0.525700
famhistPresent 0.9253704 0.2278940 4.061 4.90e-05 ***
typea 0.0395950 0.0123202 3.214 0.001310 **
obesity -0.0629099 0.0442477 -1.422 0.155095
alcohol 0.0001217 0.0044832 0.027 0.978350
age 0.0452253 0.0121298 3.728 0.000193 ***
---
Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 596.11 on 461 degrees of freedom
Residual deviance: 472.14 on 452 degrees of freedom
AIC: 492.14
Number of Fisher Scoring iterations: 5

Use the above R code and output to answer the following questions.

(a) State the model corresponding to the R object res.
Solution: The model can be written as
 
log(P(chd = 1)/(1 − P(chd = 1))) = −6.1507209 + 0.0065040 × sbp + 0.0793764 × tobacco
    + 0.1739239 × ldl + 0.0185866 × adiposity
    + 0.9253704 × I(famhist = “Present”) + 0.0395950 × typea
    − 0.0629099 × obesity + 0.0001217 × alcohol
    + 0.0452253 × age

where I(famhist = “Present”) equals 1 if famhist is “Present” and 0 otherwise.

(b) Using model res, what is the probability of chd=1 for a patient with measurements
sbp=160, tobacco=12.00, ldl=5.73, adiposity=23.11, famhist="Present", typea=49,
obesity=25.30, alcohol=97.20, and age=52?

 
Solution: Substituting these values into the fitted model gives

log(P(chd = 1)/(1 − P(chd = 1))) = −6.1507209 + 0.0065040 × 160 + 0.0793764 × 12.00
    + 0.1739239 × 5.73 + 0.0185866 × 23.11
    + 0.9253704 + 0.0395950 × 49
    − 0.0629099 × 25.30 + 0.0001217 × 97.20
    + 0.0452253 × 52
    = 0.9060059

Hence, P(chd = 1) = 1/(1 + exp(−0.9060059)) = 0.71 to 2 d.p.


(c) Write the R command to make this prediction.
Solution:
newdata <- data.frame(
sbp=160,
tobacco=12.00,
ldl=5.73,
adiposity=23.11,
famhist="Present",
typea=49,
obesity=25.30,
alcohol=97.20,
age=52
)
predict(res, newdata, type="response")
which returns 0.7121829.
(d) Which variables are significantly different from zero?
Solution: The variables whose coefficients are significantly different from 0 (at the
5% level) are tobacco, ldl, famhist, typea and age.
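
These can also be read off programmatically from the coefficient table; a minimal sketch, assuming the object res fitted in this question (note the intercept is also flagged):

p.values <- coef(summary(res))[, "Pr(>|z|)"]
names(p.values)[p.values < 0.05]   # (Intercept), tobacco, ldl, famhistPresent, typea, age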

These questions assume knowledge of vector calculus
3. (Harder - not examinable) Suppose that y ∈ R^n, X ∈ R^{n×p}, β ∈ R^p, and λ > 0 is a tuning
parameter. Consider the penalized likelihood

pℓ(β, λ) = −(1/2)∥y − Xβ∥_2^2 − λ∥β∥_2^2 + constants

Show that the maximizer of pℓ(β, λ) for fixed λ is given by

β̂ = (X^T X + λI)^{-1} X^T y

Solution:

∂pℓ(β, λ)/∂β = −X^T Xβ + X^T y − λIβ

Setting the above to zero and rearranging we get

(X^T X + λI)β = X^T y

Premultiplying both sides by (X^T X + λI)^{-1}, which exists even when p > n because
λ > 0 makes X^T X + λI positive definite, gives the desired result.
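
A quick numerical check of the closed form in R; a minimal sketch on simulated data (all object names are illustrative, and the penalty is coded as (λ/2)∥β∥_2^2 so that its gradient is −λβ, matching the derivative used in the solution above):

set.seed(1)
n <- 50; p <- 5; lambda <- 2
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

# closed-form ridge solution (X^T X + lambda I)^{-1} X^T y
beta.ridge <- solve(crossprod(X) + lambda * diag(p), crossprod(X, y))

# direct numerical maximiser of the penalized likelihood
pl <- function(b) -0.5 * sum((y - X %*% b)^2) - 0.5 * lambda * sum(b^2)
beta.opt <- optim(rep(0, p), function(b) -pl(b), method = "BFGS")$par

max(abs(beta.ridge - beta.opt))   # should be close to zero
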
4. (Harder - not examinable) Suppose that y ∈ R^n, X ∈ R^{n×p}, β ∈ R^p, and λ > 0 is a tuning
parameter. Consider the penalized likelihood

pℓ(β, λ) = −(1/2)∥y − Xβ∥_2^2 − λ∥β∥_1 + constants

(a) For a single coefficient βj find the maximizer of pℓ(β, λ) holding all other param-
eters fixed.
Solution: We can rewrite pℓ(β, λ) as
pℓ(β, λ) = −(1/2)∥r_j − X_j β_j∥_2^2 − λ|β_j| + constants in β_j
where r_j = y − X_{−j} β_{−j}, X_{−j} is the matrix X with the jth column removed,
and β_{−j} is the vector β with the jth element removed. Then

∂pℓ(β, λ)/∂β_j = −β_j + r_j^T X_j − λ ∂|β_j|


assuming ∥X_j∥ = 1. This is the same problem as described in the lectures, whose
solution is

β̂_j = S_λ(r_j^T X_j)

where S_λ(x) = sign(x)(|x| − λ)_+ is the soft-thresholding operator.
(b) Use the above result to suggest an algorithm to maximize pℓ(β, λ) for fixed λ > 0.
Solution: Initialise β̂ (for example at 0). Then cycle over j = 1, . . . , p, applying the
following 2 steps (see the R sketch below):
• Set r_j = y − X_{−j} β̂_{−j}
• Set β̂_j = S_λ(r_j^T X_j)
Repeat until the change in the β̂_j values is sufficiently small.
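
A minimal R sketch of this coordinate-wise algorithm (coordinate descent), assuming the columns of X have been rescaled to satisfy ∥X_j∥ = 1; the function and object names are illustrative:

# soft-thresholding operator S_lambda(x) = sign(x) * (|x| - lambda)_+
soft <- function(x, lambda) sign(x) * pmax(abs(x) - lambda, 0)

lasso.cd <- function(X, y, lambda, n.iter = 100, tol = 1e-8) {
  p <- ncol(X)
  beta <- rep(0, p)                                    # initialise at zero
  for (it in 1:n.iter) {
    beta.old <- beta
    for (j in 1:p) {
      r.j <- y - X[, -j, drop = FALSE] %*% beta[-j]    # partial residual r_j
      beta[j] <- soft(sum(r.j * X[, j]), lambda)       # update beta_j by soft-thresholding
    }
    if (max(abs(beta - beta.old)) < tol) break         # stop when the changes are small
  }
  beta
}

# usage (columns rescaled to unit norm first):
# X.s <- sweep(X, 2, sqrt(colSums(X^2)), "/")
# beta.hat <- lasso.cd(X.s, y, lambda = 0.5)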
