0% found this document useful (0 votes)
8 views

Logistic Regression 223110099

Uploaded by

artmitv5
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
8 views

Logistic Regression 223110099

Uploaded by

artmitv5
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 40

Logistic regression

Main
Reference
Introduction

• How well your set of predictor variables predicts or explains the categorical dependent
variable.

• Logistic regression is used when y is categorical with only two outcomes, i.e.
dichotomous/binary.

• If you have only one x, it is called “simple” logistic regression, and if you have more than
one x, it is called “multiple” logistic regression.

• The x-variable(s) can be categorical (nominal/ordinal) and/or continuous (ratio/interval).


Odds ratio (OR)
• An odds ratio (OR) is a measure of association between an exposure and an outcome.

• A logistic regression is thus based on the fact that the outcome has only two possible
values: 0 (no) or 1 (Yes).

• Often, 1 is used to denote a “case” whereas 0 is then a “non-case”.

• What a “case” or “non-case” means depends on how the hypothesis is formulated.

• Logistic regression is used to predict the “odds” of being a “case” based on the values of
the x-variable(s). The odds ratio is interpreted in the following way: “for every one-unit
increase in x, y increases/decreases by [the odds ratio]”.
Formula
Cancer No cancer

Smoking a/c
Not smoking b/d
Cancer No cancer

1200
Smoking 40 20
=6
200
Not smoking 10 30
Simple versus multiple regression models
• The difference between simple and multiple regression models, is
that in a multiple regression each x-variable’s effect on y is estimated
while taking into account the other x-variables’ effects on y.

• We then say that these other x-variables are “held constant”, or


“adjusted for”, or “controlled for”.

• Multiple regression analysis is a way of dealing with the issue of


“confounding” variables
Confounding

Kanker paru-paru
Minum kopi
Merokok

A confounder is a variable that influences both the x-variable and the y-variable and, thus, makes you think that there is
an actual relationship between x and y (but it is due to z).
In data analysis, we commonly want to get rid of the confounding effects – in that context, we often talk about
“controlling” or “adjusting” for confounders
Simple logistic regression
For every additional day of unemployment, the odds of dying increases by a factor of 1.67.
Outpout

by a factor of 0.972
or inverse by 1/value
For every one-year decrease in age, the likelihood of having an active lifestyle increases by a factor of 1.03.
Having small children increases the likelihood to have a pet by a factor 1.49.
Family with small children were 1.49 more likely to have a pet than those without small children.
People with active lifestyle were 0.987 times less likely to be married than those without active lifestyle
aren’t
Model diagnostics
• Goodness of fit  if the estimated model (i.e. the model with one or
more x-variables) predicts the outcome better than the null model
(i.e. a model without any x-variables).
specificity
sensitivity
Exercise
• Dependent variable = sleeping problem (yes/no)
• Independent variable = sex (male/female), age (continuous), caffeine
(yes/no)
Pertanyaan?

Twitter: @kbarbahar
Instagram: @akbarbahar
Email: [email protected]

You might also like