Lecture 10: Logistic Regression - Two Introductory Examples

The data below are from a study conducted by Milicer and Szczotka on pre-teen and teenage
girls in Warsaw. The subjects were classified into 25 age categories. The number of girls in
each group (sample size) and the number that reached menarche (# RM) at the time of the
study were recorded. The age for a group corresponds to the midpoint for the age interval.

Sample size   # RM     Age      Sample size   # RM     Age

    376          0    9.21          106         67   13.33
    200          0   10.21          105         81   13.58
     93          0   10.58          117         88   13.83
    120          2   10.83           98         79   14.08
     90          2   11.08           97         90   14.33
     88          5   11.33          120        113   14.58
    105         10   11.58          102         95   14.83
    111         17   11.83          122        117   15.08
    100         16   12.08          111        107   15.33
     93         29   12.33           94         92   15.58
    100         39   12.58          114        112   15.83
    108         51   12.83         1049       1049   17.58
     99         47   13.08

The researchers were interested in whether the proportion of girls that reached menarche
(# RM / sample size) varied with age. One could perform a test of homogeneity by arranging
the data as a 2 by 25 contingency table with columns indexed by age and two rows: ROW1
= # RM and ROW2 = # that have not RM = sample size − # RM. A more powerful
approach treats these as regression data, using the proportion of girls reaching menarche as
the “response” and age as a predictor.
The data were imported into Stata using the infile command and labelled menarche,
total, and age. A plot of the observed proportion of girls that have reached menarche
(obtained in Stata with the two commands generate phat = menarche / total and
twoway (scatter phat age)) shows that the proportion increases as age increases, but that
the relationship is nonlinear.

[Figure 1: Estimated proportions p̂_i versus AGE_i, for i = 1, ..., 25. Vertical axis: phat (0 to 1); horizontal axis: age (10 to 18).]
The observed proportions, which are bounded between zero and one, have a lazy S-shape
(a sigmoidal function) when plotted against age. The change in the observed proportions
for a given change in age is much smaller when the proportion is near 0 or 1 than when
the proportion is near 1/2. This phenomenon is common with regression data where the
response is a proportion.
The trend is nonlinear so linear regression is inappropriate. A sensible alternative might
be to transform the response or the predictor to achieve near linearity. A better approach is
to use a non-linear model for the proportions. A common choice is the logistic regression
model.
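As a quick check on the description above, the observed proportions can be recomputed directly from the table. The following is a small Python sketch (the lecture itself uses Stata); the list names age, total, and menarche mirror the Stata variable names, and the helper name observed_proportions is an assumption made for this illustration.

```python
# Menarche data transcribed from the table above (25 age groups).
age = [9.21, 10.21, 10.58, 10.83, 11.08, 11.33, 11.58, 11.83, 12.08,
       12.33, 12.58, 12.83, 13.08, 13.33, 13.58, 13.83, 14.08, 14.33,
       14.58, 14.83, 15.08, 15.33, 15.58, 15.83, 17.58]
total = [376, 200, 93, 120, 90, 88, 105, 111, 100, 93, 100, 108, 99,
         106, 105, 117, 98, 97, 120, 102, 122, 111, 94, 114, 1049]
menarche = [0, 0, 0, 2, 2, 5, 10, 17, 16, 29, 39, 51, 47, 67, 81, 88,
            79, 90, 113, 95, 117, 107, 92, 112, 1049]


def observed_proportions(successes, sizes):
    """Return the observed proportion d_i / n_i for each group."""
    return [d / n for d, n in zip(successes, sizes)]


phat = observed_proportions(menarche, total)

# One line per age group: the proportions rise from 0 to 1 along an S-shape.
for a, p in zip(age, phat):
    print(f"age {a:5.2f}  phat {p:.3f}")
```

Printing the pairs shows the pattern described in the text: the proportions barely move at the youngest and oldest ages and change fastest around ages 12 to 14.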

The Simple Logistic Regression Model

The simple logistic regression model expresses the population proportion p of individuals
with a given attribute (called a success) as a function of a single predictor variable X. The
model assumes that p is related to X through

\[
\operatorname{logit}(p) = \log\left(\frac{p}{1-p}\right) = \alpha + \beta X \qquad (1)
\]

or, equivalently, as

\[
p = \frac{\exp(\alpha + \beta X)}{1 + \exp(\alpha + \beta X)}.
\]

[Figure 2: logit(p) and p as a function of X. Left panel: logit scale (log-odds versus X); right panel: probability scale (probability versus X). Curves are labelled + slope, 0 slope, and − slope.]

The logistic regression model is a binary response model, where the response for
each case falls into one of 2 exclusive and exhaustive categories, often called success (cases
with the attribute of interest) and failure (cases without the attribute of interest). In many
biostatistical applications, the success category is presence of a disease, or death from a
disease.
I will often write p as p(X) to emphasize that p is the proportion of all individuals with
score X that have the attribute of interest. In the menarche data, p = p(X) is the population
proportion of girls at age X that have reached menarche.
The odds of success are p/(1 − p). For example, the odds of success are 1 (or 1 to 1) when
p = 1/2. The odds of success are 2 (or 2 to 1) when p = 2/3. The logistic model assumes
that the log-odds of success is linearly related to X. Graphs of the logistic model relating p
to X are given above. The sign of the slope refers to the sign of β.
There are a variety of other binary response models that are used in practice. The probit
regression model or the complementary log-log regression model might be appropriate
when the logistic model does not fit the data.
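
The relationship between probabilities, odds, and log-odds used in this section can be sketched in a few lines of Python. The function names odds, logit, and inv_logit are chosen here for illustration and do not come from the lecture.

```python
import math


def odds(p):
    """Odds of success p / (1 - p) for a probability 0 <= p < 1."""
    return p / (1.0 - p)


def logit(p):
    """Log-odds of success, defined for 0 < p < 1."""
    return math.log(odds(p))


def inv_logit(eta):
    """Probability whose log-odds equal eta: exp(eta) / (1 + exp(eta))."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


print(odds(0.5))              # 1.0, i.e. odds of 1 to 1
print(round(odds(2 / 3), 6))  # 2.0, i.e. odds of 2 to 1
print(inv_logit(0.0))         # 0.5: log-odds of 0 correspond to p = 1/2
```

Writing inv_logit in two branches keeps the exponential from overflowing for large positive or negative log-odds; the two branches are algebraically the same formula.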

Data for Simple Logistic Regression

For the formulas below, I assume that the data are given in summarized or aggregate form:

X      n      D
X_1    n_1    d_1
X_2    n_2    d_2
...    ...    ...
X_m    n_m    d_m

where d_i is the number of individuals with the attribute of interest (number of diseased)
among n_i randomly selected or representative individuals with predictor variable value X_i.
The subscripts identify the group of cases in the data set. In many situations, the sample
size is 1 in each group, and for this situation d_i is 0 or 1.
For raw data on individual cases, the sample size column n is usually omitted and D
takes on 1 of two coded levels, depending on whether the case at X_i is a success or not. The
values 0 and 1 are typically used to identify “failures” and “successes” respectively.
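
The two data layouts carry the same information, which the following Python sketch illustrates by converting between them. The helper names expand_grouped and collapse_cases are hypothetical, and the three rows are taken from the menarche table purely as an example.

```python
from collections import Counter


def expand_grouped(groups):
    """Turn (x, n, d) rows into n individual (x, y) cases with y = 1 for
    each of the d successes and y = 0 for each of the n - d failures."""
    cases = []
    for x, n, d in groups:
        cases.extend([(x, 1)] * d)
        cases.extend([(x, 0)] * (n - d))
    return cases


def collapse_cases(cases):
    """Inverse operation: aggregate (x, y) cases back into (x, n, d) rows."""
    n = Counter(x for x, _ in cases)
    d = Counter(x for x, y in cases if y == 1)
    return [(x, n[x], d[x]) for x in sorted(n)]


# Three age groups from the menarche table, used only as an illustration.
grouped = [(10.83, 120, 2), (11.08, 90, 2), (11.33, 88, 5)]
raw = expand_grouped(grouped)
print(len(raw))                        # 298 individual cases
print(collapse_cases(raw) == grouped)  # True
```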

Estimating Regression Coefficients

The principle of maximum likelihood is commonly used to estimate the two unknown
parameters in the logistic model:

\[
\log\left(\frac{p}{1-p}\right) = \alpha + \beta X.
\]

The maximum likelihood estimates (MLEs) of the regression coefficients are computed
iteratively by maximizing the so-called Binomial likelihood function for the responses, or
equivalently, by minimizing the deviance function (also called the likelihood ratio LR
chi-squared statistic)

\[
LR = 2 \sum_{i=1}^{m} \left\{ d_i \log\left(\frac{d_i}{n_i p_i}\right)
     + (n_i - d_i) \log\left(\frac{n_i - d_i}{n_i - n_i p_i}\right) \right\}
\]

over all possible values of α and β, where the p_i satisfy

\[
\log\left(\frac{p_i}{1 - p_i}\right) = \alpha + \beta X_i.
\]

The ML method also gives standard errors and significance tests for the regression estimates.
The deviance is an analog of the residual sums of squares in linear regression. The choices
for α and β that minimize the deviance are the parameter values that make the observed
and fitted proportions as close together as possible in a “likelihood sense”.
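
To make the iterative estimation concrete, here is a minimal Python sketch of one common way to maximize the Binomial likelihood: Newton-Raphson steps with step halving. This is an illustration under stated assumptions, not a description of what Stata does internally. The function names, the centring of the predictor, and the convergence tolerance are all choices made for this sketch. Applied to the menarche data, it should land close to the estimates reported in the Stata output later in these notes.

```python
import math

# Menarche data from the table at the start of the lecture.
age = [9.21, 10.21, 10.58, 10.83, 11.08, 11.33, 11.58, 11.83, 12.08,
       12.33, 12.58, 12.83, 13.08, 13.33, 13.58, 13.83, 14.08, 14.33,
       14.58, 14.83, 15.08, 15.33, 15.58, 15.83, 17.58]
total = [376, 200, 93, 120, 90, 88, 105, 111, 100, 93, 100, 108, 99,
         106, 105, 117, 98, 97, 120, 102, 122, 111, 94, 114, 1049]
menarche = [0, 0, 0, 2, 2, 5, 10, 17, 16, 29, 39, 51, 47, 67, 81, 88,
            79, 90, 113, 95, 117, 107, 92, 112, 1049]


def inv_logit(eta):
    """exp(eta) / (1 + exp(eta)), written to avoid overflow."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def log_likelihood(a, b, x, n, d):
    """Binomial log-likelihood (up to a constant) at intercept a, slope b."""
    ll = 0.0
    for xi, ni, di in zip(x, n, d):
        eta = a + b * xi
        # log(1 + exp(eta)), written to avoid overflow for large |eta|
        softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        ll += di * eta - ni * softplus
    return ll


def fit_simple_logistic(x, n, d, tol=1e-10, max_iter=50):
    """Newton-Raphson with step halving for logit(p) = alpha + beta * x."""
    xbar = sum(ni * xi for xi, ni in zip(x, n)) / sum(n)
    xc = [xi - xbar for xi in x]           # centring improves conditioning
    a = b = 0.0
    for _ in range(max_iter):
        p = [inv_logit(a + b * xi) for xi in xc]
        w = [ni * pi * (1.0 - pi) for ni, pi in zip(n, p)]
        r = [di - ni * pi for di, ni, pi in zip(d, n, p)]
        g0 = sum(r)                                    # score, intercept
        g1 = sum(ri * xi for ri, xi in zip(r, xc))     # score, slope
        h00 = sum(w)                                   # information matrix
        h01 = sum(wi * xi for wi, xi in zip(w, xc))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, xc))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det               # Newton direction
        db = (h00 * g1 - h01 * g0) / det
        step, old = 1.0, log_likelihood(a, b, xc, n, d)
        while (step > 1e-8 and
               log_likelihood(a + step * da, b + step * db, xc, n, d) < old):
            step /= 2.0                                # halve until no worse
        a, b = a + step * da, b + step * db
        if abs(step * da) + abs(step * db) < tol:
            break
    return a - b * xbar, b                 # undo the centring


alpha_hat, beta_hat = fit_simple_logistic(age, total, menarche)
print(round(alpha_hat, 3), round(beta_hat, 3))
```

Maximizing the log-likelihood and minimizing the deviance give the same estimates, because the two differ only by a term that does not involve α or β.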
Suppose that α̂ and β̂ are the MLEs of α and β. The deviance evaluated at the MLEs:

\[
LR = 2 \sum_{i=1}^{m} \left\{ d_i \log\left(\frac{d_i}{n_i \hat{p}_i}\right)
     + (n_i - d_i) \log\left(\frac{n_i - d_i}{n_i - n_i \hat{p}_i}\right) \right\},
\]

where the fitted probabilities p̂_i satisfy

\[
\log\left(\frac{\hat{p}_i}{1 - \hat{p}_i}\right) = \hat{\alpha} + \hat{\beta} X_i,
\]

is used to test the adequacy of the model. The deviance is small when the data fit the
model, that is, when the observed and fitted proportions are close together. Large values
of LR occur when one or more of the observed and fitted proportions are far apart, which
suggests that the model is inappropriate.
If the logistic model holds and the group sizes n_i are reasonably large, then LR has
approximately a chi-squared distribution with m − r degrees of freedom, where m is the
number of groups and r (here 2) is the number of estimated regression parameters. A p-value
for the deviance is given by the area under the chi-squared curve to the right of LR. A small
p-value indicates that the data do not fit the model.
The Stata commands used below do not report the deviance statistic, but rather the Pearson
chi-squared test statistic, which is defined similarly to the deviance statistic and is
interpreted in the same manner:

\[
X^2 = \sum_{i=1}^{m} \frac{(d_i - n_i \hat{p}_i)^2}{n_i \hat{p}_i (1 - \hat{p}_i)}.
\]

This statistic can be interpreted as the sum of standardized, squared differences between
the observed number of successes d_i and expected number of successes n_i p̂_i for each
covariate value X_i. When what we expect to see under the model agrees with what we see,
the Pearson statistic is close to zero, indicating good model fit to the data. When the
Pearson statistic is large, we have an indication of lack of fit. Often the Pearson residuals

\[
r_i = \frac{d_i - n_i \hat{p}_i}{\sqrt{n_i \hat{p}_i (1 - \hat{p}_i)}}
\]

are used to determine exactly where lack of fit occurs. These residuals are obtained in Stata
using the predict command after the logistic command. Examining these residuals is
very similar to looking for large values of (O − E)^2 / E in a χ² analysis of a contingency
table as discussed in the last lecture. We will not talk further of logistic regression
diagnostics.

Age at Menarche Data: Stata Implementation

A logistic model for these data implies that the probability p of reaching menarche is related
to age through

\[
\log\left(\frac{p}{1-p}\right) = \alpha + \beta\,\mathrm{AGE}.
\]

If the model holds, then a slope of β = 0 implies that p does not depend on AGE, i.e.
the proportion of girls that have reached menarche is identical across age groups. However,
the power of the logistic regression model is that if the model holds, and if the proportions
change with age, then you have a way to quantify the effect of age on the proportion reaching
menarche. This is more appealing and useful than just testing homogeneity across age groups.

A logistic regression model with a single predictor can be fit using one of the many
commands available in Stata depending on the data type and desired results: logistic
(raw data, outputs odds ratios), logit (raw data, outputs model parameter estimates),
and blogit (grouped data). The logistic command has many more options than either
logit or blogit, but requires you to reformat the data into individual records, one
for each girl. For an example of how to do this, check out the online Stata help at
http://www.stata.com/support/faqs/stat/grouped.html. The Stata command blogit
menarche total age yields the following output:
Logit estimates                                   Number of obs   =       3918
                                                  LR chi2(1)      =    3667.18
                                                  Prob > chi2     =     0.0000
Log likelihood = -819.65237                       Pseudo R2       =     0.6911
------------------------------------------------------------------------------
    _outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.631968   .0589509    27.68   0.000     1.516427     1.74751
       _cons |  -21.22639   .7706558   -27.54   0.000    -22.73685   -19.71594
------------------------------------------------------------------------------

The output tabulates the MLEs of the parameters: α̂ = −21.23 and β̂ = 1.63. Thus, the
fitted or predicted probabilities satisfy:

\[
\log\left(\frac{\hat{p}}{1 - \hat{p}}\right) = -21.23 + 1.63\,\mathrm{AGE}
\]

or

\[
\hat{p}(\mathrm{AGE}) = \frac{\exp(-21.23 + 1.63\,\mathrm{AGE})}{1 + \exp(-21.23 + 1.63\,\mathrm{AGE})}.
\]

The p-value for testing H0 : β = 0 (i.e. the slope for the regression model is zero) based
upon the Wald z-test (column P>|z|) is reported as 0.000, i.e. less than 0.0005, which leads
to rejecting H0 at any of the usual test levels. Thus, the proportion of girls that have
reached menarche is not constant across age groups.
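
The fitted equation can be evaluated at any age. The Python sketch below plugs the unrounded coefficients from the blogit output into the formula for p̂(AGE). The constant and function names are illustrative, and the age at which the fitted probability equals 1/2 is an extra quantity computed here, not one discussed in the lecture.

```python
import math

ALPHA_HAT = -21.22639   # _cons from the blogit output above
BETA_HAT = 1.631968     # age coefficient from the blogit output above


def fitted_probability(age):
    """Estimated probability of having reached menarche at a given age."""
    eta = ALPHA_HAT + BETA_HAT * age
    return 1.0 / (1.0 + math.exp(-eta))


# Age at which the fitted probability is 1/2, i.e. where the log-odds are 0.
age_at_half = -ALPHA_HAT / BETA_HAT

for a in (10, 12, 13, 14, 16):
    print(a, round(fitted_probability(a), 3))
print(round(age_at_half, 2))   # 13.01
```

Setting the log-odds to zero gives AGE = −α̂/β̂, which is why the fitted curve crosses 1/2 at about age 13.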
The likelihood ratio test statistic of no logistic regression relationship (LR chi2(1) =
3667.18) and p-value (Prob > chi2 = 0.0000) give the logistic regression analogue of the
overall F-test in multiple regression that no predictors are important. In general, the
chi-squared statistic provided here is used to test the hypothesis that the regression
coefficients are zero for every predictor in the model. There is a single predictor here,
AGE, so this test and the test for the AGE effect are both testing H0 : β = 0.
To obtain the Pearson goodness of fit statistic and p-value we must reformat the data
and use the logistic command as described in the webpage above:
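* w0 = number not yet reached menarche, w1 = number reached menarche
generate w0 = total - menarche
rename menarche w1
generate id = _n
* one row per age group and outcome y (0 or 1), with frequency w
reshape long w, i(id) j(y)
logistic y age [fw=w]
* Pearson goodness-of-fit test
lfit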

We obtain the following output:
Logistic regression                               Number of obs   =       3918
                                                  LR chi2(1)      =    3667.18
                                                  Prob > chi2     =     0.0000
Log likelihood = -819.65237                       Pseudo R2       =     0.6911
------------------------------------------------------------------------------
           y | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   5.113931   .3014706    27.68   0.000     4.555917    5.740291
------------------------------------------------------------------------------

Logistic model for y, goodness-of-fit test
       number of observations =      3918
 number of covariate patterns =        25
             Pearson chi2(23) =     21.87
                  Prob > chi2 =    0.5281

Using properties of exponential functions, the odds of reaching menarche are exp(1.632) =
5.11 times larger for every year older a girl is. To see this, let p(Age + 1) and p(Age) be
probabilities of reaching menarche for ages one year apart. The odds ratio OR satisfies

\[
\begin{aligned}
\log(OR) &= \log\left(\frac{p(\mathrm{Age}+1)/(1 - p(\mathrm{Age}+1))}{p(\mathrm{Age})/(1 - p(\mathrm{Age}))}\right) \\
         &= \log\left(\frac{p(\mathrm{Age}+1)}{1 - p(\mathrm{Age}+1)}\right) - \log\left(\frac{p(\mathrm{Age})}{1 - p(\mathrm{Age})}\right) \\
         &= (\alpha + \beta(\mathrm{Age}+1)) - (\alpha + \beta\,\mathrm{Age}) \\
         &= \beta,
\end{aligned}
\]

so OR = e^β. If we considered ages 5 years apart, the same derivation would give us
OR = e^{5β} = (e^β)^5. You often see a continuous variable with a significant though
apparently small OR, but when you examine the OR for a reasonable range of values (by
raising to the power of the range in this way), then the OR is substantial.
You should pick out the estimated regression coefficient β̂ = 1.632 and the estimated
odds ratio exp(β̂) = exp(1.632) = 5.11 from the output obtained using the blogit and
logistic commands respectively. We would say, for example, that the odds of 15 year
old girls having reached menarche are between 4.6 and 5.7 times larger than for 14 year old
girls (the 95% confidence interval for the odds ratio).
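
The link between the two Stata outputs is simple arithmetic, which the following Python sketch reproduces from the blogit coefficient table. The variable names are illustrative. The standard error of the odds ratio is computed with the delta-method approximation OR × SE(β̂), which matches the value in the logistic output.

```python
import math

beta_hat = 1.631968               # age coefficient (logit scale)
se_beta = 0.0589509               # its standard error
ci_beta = (1.516427, 1.74751)     # 95% CI reported for the coefficient

odds_ratio = math.exp(beta_hat)                 # OR for a 1-year difference
ci_or = tuple(math.exp(b) for b in ci_beta)     # CI endpoints for the OR
se_or = odds_ratio * se_beta                    # delta-method SE of the OR
odds_ratio_5yr = math.exp(5 * beta_hat)         # OR for a 5-year difference

print(round(odds_ratio, 3))                  # 5.114
print(tuple(round(v, 2) for v in ci_or))     # (4.56, 5.74)
print(round(se_or, 4))                       # 0.3015
```

Exponentiating the endpoints of the coefficient's confidence interval gives the confidence interval for the odds ratio; the interval is not symmetric about 5.11 because the exponential is nonlinear.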
The Pearson chi-square statistic is 21.87 on 23 df, with a p-value of 0.5281. The large
p-value suggests no gross deficiencies with the logistic model.
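
The goodness-of-fit statistics can be recomputed from the grouped data and the reported coefficients. The Python sketch below applies the Pearson and deviance formulas from the section on estimating regression coefficients; the function names are assumptions made for this illustration. Because the coefficients are taken from the printed output, the Pearson value should agree with the 21.87 above only up to rounding.

```python
import math

age = [9.21, 10.21, 10.58, 10.83, 11.08, 11.33, 11.58, 11.83, 12.08,
       12.33, 12.58, 12.83, 13.08, 13.33, 13.58, 13.83, 14.08, 14.33,
       14.58, 14.83, 15.08, 15.33, 15.58, 15.83, 17.58]
total = [376, 200, 93, 120, 90, 88, 105, 111, 100, 93, 100, 108, 99,
         106, 105, 117, 98, 97, 120, 102, 122, 111, 94, 114, 1049]
menarche = [0, 0, 0, 2, 2, 5, 10, 17, 16, 29, 39, 51, 47, 67, 81, 88,
            79, 90, 113, 95, 117, 107, 92, 112, 1049]
ALPHA_HAT, BETA_HAT = -21.22639, 1.631968   # MLEs from the Stata output


def fitted_probability(x):
    """Fitted probability exp(a + b x) / (1 + exp(a + b x))."""
    return 1.0 / (1.0 + math.exp(-(ALPHA_HAT + BETA_HAT * x)))


def xlogy(x, y):
    """x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def pearson_and_deviance(x, n, d):
    """Return (Pearson X^2, deviance LR) for grouped binomial data."""
    pearson = deviance = 0.0
    for xi, ni, di in zip(x, n, d):
        p = fitted_probability(xi)
        expected = ni * p                      # expected successes n_i * p_i
        pearson += (di - expected) ** 2 / (expected * (1.0 - p))
        deviance += 2.0 * (xlogy(di, di / expected)
                           + xlogy(ni - di, (ni - di) / (ni - expected)))
    return pearson, deviance


pearson, deviance = pearson_and_deviance(age, total, menarche)
df = len(age) - 2          # m groups minus r = 2 estimated parameters
print(round(pearson, 2), round(deviance, 2), df)
```

The xlogy helper handles the groups in which no girl, or every girl, has reached menarche, where one of the two deviance terms is 0 × log(0) and is taken to be zero.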

Logistic Regression with Two Effects: Leukemia Data
Feigl and Zelen reported the survival time in weeks and the white blood cell count (WBC)
at time of diagnosis for 33 patients who eventually died of acute leukemia. Each person was
classified as AG+ or AG- (coded as IAG = 1 and 0, respectively), indicating the presence
or absence of a certain morphological characteristic in the white cells. The researchers are
interested in modelling the probability p of surviving at least one year as a function of WBC
and IAG. They believe that WBC should be transformed to a log scale, given the skewness in
the WBC values. With Live = 1 if the patient survived at least one year and Live = 0
otherwise, the data are:

IAG    WBC   Live     IAG    WBC   Live     IAG    WBC   Live
-------------------------------------------------------------
  1     75      1       1    230      1       1    430      1
  1    260      1       1    600      0       1   1050      1
  1   1000      1       1   1700      0       1    540      0
  1    700      1       1    940      1       1   3200      0
  1   3500      0       1   5200      0       1  10000      1
  1  10000      0       1  10000      0       0    440      1
  0    300      1       0    400      0       0    150      0
  0    900      0       0    530      0       0   1000      0
  0   1900      0       0   2700      0       0   2800      0
  0   3100      0       0   2600      0       0   2100      0
  0   7900      0       0  10000      0       0  10000      0

As an initial step in the analysis, consider the following model:

\[
\log\left(\frac{p}{1-p}\right) = \alpha + \beta_1\,\mathrm{LWBC} + \beta_2\,\mathrm{IAG},
\]

where LWBC = log WBC. This is a logistic regression model with 2 effects, fit using the
logistic command. The parameters α, β1 and β2 are estimated by maximum likelihood.

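The model is best understood by separating the AG+ and AG- cases. For AG- individuals,
IAG=0 so the model reduces to

\[
\log\left(\frac{p}{1-p}\right) = \alpha + \beta_1\,\mathrm{LWBC} + \beta_2 \cdot 0 = \alpha + \beta_1\,\mathrm{LWBC}.
\]

For AG+ individuals, IAG=1 and the model implies

\[
\log\left(\frac{p}{1-p}\right) = \alpha + \beta_1\,\mathrm{LWBC} + \beta_2 \cdot 1 = (\alpha + \beta_2) + \beta_1\,\mathrm{LWBC}.
\]
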
The model without IAG (i.e. β2 = 0) is a simple logistic model where the log-odds of
surviving one year is linearly related to LWBC, and is independent of AG. The reduced
model with β2 = 0 implies that there is no effect of the AG level on the survival probability
once LWBC has been taken into account.
Including the binary predictor IAG in the model implies that there is a linear
relationship between the log-odds of surviving one year and LWBC, with a constant slope for
the two AG levels. This model includes an effect for the AG morphological factor, but more
general models are possible. Thinking of IAG as a factor, the proposed model is a logistic
regression analog of ANCOVA.
The parameters are easily interpreted: α and α + β2 are intercepts for the population
logistic regression lines for AG- and AG+, respectively. The lines have a common slope, β1.
The β2 coefficient for the IAG indicator is the difference between intercepts for the AG+
and AG- regression lines. A picture of the assumed relationship is given below for β1 < 0.
The population regression lines are parallel on the logit (i.e. log-odds) scale only, but the
order between IAG groups is preserved on the probability scale.
The data are in the raw data form for individual cases. There are three columns: the
binary or indicator variable iag (with value 1 for AG+, 0 for AG-), wbc (continuous),
live (with value 1 if the patient lived at least 1 year and 0 if not). Note that a frequency
column is not needed with raw data (and hence using the logistic command) and that the
success category corresponds to surviving at least 1 year.
[Figure: the assumed equal slopes model for β1 < 0, with one curve for IAG=0 and one for IAG=1. Left panel: logit scale (log-odds versus LWBC); right panel: probability scale (probability versus LWBC).]

Before looking at output for the equal slopes model, note that the data set has 30 distinct
IAG and WBC combinations, or 30 “groups” or samples that could be constructed from the
33 individual cases. Only two samples have more than 1 observation. The majority of
the observed proportions surviving at least one year (number surviving ≥ 1 year / group
sample size) are 0 (i.e. 0/1) or 1 (i.e. 1/1). This sparseness of the data makes it difficult to
graphically assess the suitability of the logistic model (Why?). Although significance tests on
the regression coefficients do not require large group sizes, the chi-squared approximations to
the deviance and Pearson goodness-of-fit statistics are suspect in sparse data settings. With
small group sizes as we have here, most researchers would not interpret the p-values for the
deviance or Pearson tests literally. Instead, they would use the p-values to informally check
the fit of the model. Diagnostics would be used to highlight problems with the model.
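
The counts quoted above can be verified directly from the data table. The Python sketch below tallies the distinct (IAG, WBC) combinations, which Stata calls covariate patterns; the variable names are chosen for this illustration.

```python
from collections import Counter

# Leukemia cases transcribed from the data table above as (iag, wbc, live).
cases = [
    (1, 75, 1), (1, 230, 1), (1, 430, 1),
    (1, 260, 1), (1, 600, 0), (1, 1050, 1),
    (1, 1000, 1), (1, 1700, 0), (1, 540, 0),
    (1, 700, 1), (1, 940, 1), (1, 3200, 0),
    (1, 3500, 0), (1, 5200, 0), (1, 10000, 1),
    (1, 10000, 0), (1, 10000, 0), (0, 440, 1),
    (0, 300, 1), (0, 400, 0), (0, 150, 0),
    (0, 900, 0), (0, 530, 0), (0, 1000, 0),
    (0, 1900, 0), (0, 2700, 0), (0, 2800, 0),
    (0, 3100, 0), (0, 2600, 0), (0, 2100, 0),
    (0, 7900, 0), (0, 10000, 0), (0, 10000, 0),
]

# Group size and number of survivors for each covariate pattern.
patterns = Counter((iag, wbc) for iag, wbc, _ in cases)
survivors = Counter()
for iag, wbc, live in cases:
    survivors[(iag, wbc)] += live

repeated = {key: count for key, count in patterns.items() if count > 1}
# Patterns whose observed proportion surviving is exactly 0 or 1.
extreme = sum(1 for key in patterns if survivors[key] in (0, patterns[key]))

print(len(cases))      # 33 individual cases
print(len(patterns))   # 30 distinct (IAG, WBC) combinations
print(repeated)        # {(1, 10000): 3, (0, 10000): 2}
print(extreme)         # 29
```

Only the AG+ group at WBC = 10000 has an observed proportion strictly between 0 and 1 (one survivor out of three), which is why a plot of observed proportions against LWBC says little about the shape of the relationship.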
We obtain the following modified output:

. infile iag wbc live using c:/biostat/notes/leuk.txt

. generate lwbc = log(wbc)
. logistic live iag lwbc
. logit
. lfit
Logistic regression                               Number of obs   =         33
                                                  LR chi2(2)      =      15.18
                                                  Prob > chi2     =     0.0005
Log likelihood = -13.416354                       Pseudo R2       =     0.3613
------------------------------------------------------------------------------
        live | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         iag |   12.42316    13.5497     2.31   0.021     1.465017    105.3468
        lwbc |   .3299682   .1520981    -2.41   0.016     .1336942    .8143885
------------------------------------------------------------------------------

Logit estimates                                   Number of obs   =         33
                                                  LR chi2(2)      =      15.18
                                                  Prob > chi2     =     0.0005
Log likelihood = -13.416354                       Pseudo R2       =     0.3613
------------------------------------------------------------------------------
        live |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         iag |   2.519562   1.090681     2.31   0.021     .3818672    4.657257
        lwbc |  -1.108759   .4609479    -2.41   0.016      -2.0122   -.2053178
       _cons |   5.543349   3.022416     1.83   0.067     -.380477    11.46718
------------------------------------------------------------------------------

Logistic model for live, goodness-of-fit test
       number of observations =        33
 number of covariate patterns =        30
             Pearson chi2(27) =     19.81
                  Prob > chi2 =    0.8387

The large p-value (0.8387) for the lack-of-fit chi-square (i.e. the Pearson statistic)
indicates that there are no gross deficiencies with the model. Given that the model fits
reasonably well, a test of H0 : β2 = 0 might be a primary interest here. This checks whether
the regression lines are identical for the two AG levels, which is a test for whether AG
affects the survival probability, after taking LWBC into account. The test that H0 : β2 = 0
is equivalent to testing that the odds ratio exp(β2) is equal to 1: H0 : exp(β2) = 1. The
p-value for this test is 0.021. H0 is rejected at the 5% level (though not at the 1% level),
suggesting that the AG level affects the survival probability (assuming a very specific
model). In fact we estimate that the odds of surviving past a year in the AG+ population are
12.4 times the odds of surviving past a year in the AG- population, with a 95% CI of
(1.5, 105.3); see below for this computation carried out explicitly.
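The estimated survival probabilities satisfy

\[
\log\left(\frac{\hat{p}}{1 - \hat{p}}\right) = 5.54 - 1.11\,\mathrm{LWBC} + 2.52\,\mathrm{IAG}.
\]

For AG- individuals with IAG=0, this reduces to

\[
\log\left(\frac{\hat{p}}{1 - \hat{p}}\right) = 5.54 - 1.11\,\mathrm{LWBC},
\]

or equivalently,

\[
\hat{p} = \frac{\exp(5.54 - 1.11\,\mathrm{LWBC})}{1 + \exp(5.54 - 1.11\,\mathrm{LWBC})}.
\]

For AG+ individuals with IAG=1,

\[
\log\left(\frac{\hat{p}}{1 - \hat{p}}\right) = 5.54 - 1.11\,\mathrm{LWBC} + 2.52 \cdot 1 = 8.06 - 1.11\,\mathrm{LWBC},
\]

or

\[
\hat{p} = \frac{\exp(8.06 - 1.11\,\mathrm{LWBC})}{1 + \exp(8.06 - 1.11\,\mathrm{LWBC})}.
\]
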
Using the logit scale, the difference between AG+ and AG- individuals in the estimated
log-odds of surviving at least one year, at a fixed but arbitrary LWBC, is the estimated IAG
regression coefficient:

(8.06 − 1.11LWBC) − (5.54 − 1.11LWBC) = 2.52.

Using properties of exponential functions, the odds that an AG+ patient lives at least one
year are exp(2.5196) = 12.42 times the odds that an AG- patient lives at least one
year, regardless of LWBC.
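
The same computation can be carried out numerically. The Python sketch below uses the unrounded coefficients from the logit output. The function names and the three WBC values are illustrative choices, with WBC in the same units as the data table, and 1.959964 is the standard normal quantile used for a 95% Wald interval.

```python
import math

ALPHA_HAT = 5.543349    # _cons
BETA_LWBC = -1.108759   # lwbc coefficient
BETA_IAG = 2.519562     # iag coefficient
SE_IAG = 1.090681       # standard error of the iag coefficient


def log_odds(wbc, iag):
    """Estimated log-odds of surviving at least one year."""
    return ALPHA_HAT + BETA_LWBC * math.log(wbc) + BETA_IAG * iag


def survival_probability(wbc, iag):
    """Estimated probability of surviving at least one year."""
    return 1.0 / (1.0 + math.exp(-log_odds(wbc, iag)))


odds_ratio_iag = math.exp(BETA_IAG)
z = 1.959964   # 97.5th percentile of the standard normal distribution
ci_iag = (math.exp(BETA_IAG - z * SE_IAG), math.exp(BETA_IAG + z * SE_IAG))

# The log-odds gap between AG+ and AG- is the same at every WBC value,
# while the gap between the two probabilities is not.
for wbc in (100, 1000, 10000):
    gap = log_odds(wbc, 1) - log_odds(wbc, 0)
    print(wbc, round(gap, 3),
          round(survival_probability(wbc, 0), 3),
          round(survival_probability(wbc, 1), 3))
print(round(odds_ratio_iag, 2))   # 12.42
```

The confidence interval in ci_iag reproduces the interval for the iag odds ratio in the logistic output, roughly 1.47 to 105.3.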
Although the equal slopes model appears to fit well, a more general model might fit
better. A natural generalization here would be to add an interaction, or product term,
IAG ∗ LWBC to the model. The logistic model with an IAG effect and the IAG ∗ LWBC
interaction is equivalent to fitting separate logistic regression lines to the two AG groups.
This interaction model provides an easy way to test whether the slopes are equal across AG
levels. I will note that the interaction term is not needed here.
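
For reference, the interaction model mentioned above can be written out explicitly. The symbol β3 for the interaction coefficient is notation introduced here for illustration; it does not appear in the lecture.

```latex
\[
\log\left(\frac{p}{1-p}\right)
  = \alpha + \beta_1\,\mathrm{LWBC} + \beta_2\,\mathrm{IAG}
    + \beta_3\,(\mathrm{IAG} \times \mathrm{LWBC}).
\]
% AG- individuals (IAG = 0):
\[
\log\left(\frac{p}{1-p}\right) = \alpha + \beta_1\,\mathrm{LWBC}.
\]
% AG+ individuals (IAG = 1):
\[
\log\left(\frac{p}{1-p}\right) = (\alpha + \beta_2) + (\beta_1 + \beta_3)\,\mathrm{LWBC}.
\]
```

In this notation the two groups have separate intercepts and separate slopes, and the equal slopes model is the special case β3 = 0, so a test of H0 : β3 = 0 is the test of equal slopes.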
