
Machine Learning

Logistic Regression

Jeff Howbert, Introduction to Machine Learning, Winter 2012


Logistic regression

• The name is somewhat misleading: logistic regression is really a technique for classification, not regression.
  – "Regression" comes from the fact that we fit a linear model to the feature space.
• It involves a more probabilistic view of classification.



Different ways of expressing probability

• Consider a two-outcome probability space, where:
  – p( O1 ) = p
  – p( O2 ) = 1 − p = q
• Can express the probability of O1 as:

notation                                  range equivalents

standard probability    p                   0  …  0.5  …   1
odds                    p / q               0  …   1   …  +∞
log odds (logit)        log( p / q )       −∞  …   0   …  +∞
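As a quick illustration of these three equivalent representations (an added sketch, not part of the original slide), the following MATLAB lines convert one probability into odds and log odds:

    p = 0.8;                 % probability of outcome O1
    q = 1 - p;               % probability of outcome O2
    odds  = p / q;           % = 4
    logit = log(p / q);      % = log(4), approx. 1.386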


Log odds

• Numeric treatment of outcomes O1 and O2 is equivalent
  – If neither outcome is favored over the other, then log odds = 0.
  – If one outcome is favored with log odds = x, then the other outcome is disfavored with log odds = −x.
• Especially useful in domains where relative probabilities can be minuscule
  – Example: multiple sequence alignment in computational biology
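A small numeric check of this symmetry (added here for illustration): if O1 has probability 0.9, its log odds are +2.197 and the log odds of O2 are −2.197.

    p = 0.9;
    log(p / (1 - p))         % log odds of O1:  +2.1972
    log((1 - p) / p)         % log odds of O2:  -2.1972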
From probability to log odds (and back again)

z = log( p / (1 − p) )                          logit function

p / (1 − p) = e^z

p = e^z / (1 + e^z) = 1 / (1 + e^−z)            logistic function
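These two functions are easy to express as MATLAB anonymous functions; the small sketch below (not part of the original deck) shows that they invert each other:

    logit    = @(p) log(p ./ (1 - p));      % probability -> log odds
    logistic = @(z) 1 ./ (1 + exp(-z));     % log odds -> probability

    p = 0.75;
    z = logit(p);              % 1.0986
    logistic(z)                % returns 0.75, recovering p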


Standard logistic function

[Figure omitted: plot of the standard logistic function p = 1 / (1 + e^−z)]


Logistic regression

• Scenario:
  – A multidimensional feature space (features can be categorical or continuous).
  – Outcome is discrete, not continuous; we'll focus on the case of two classes.
  – It seems plausible that a linear decision boundary (hyperplane) will give good predictive accuracy.


Using a logistic regression model
• Model consists of a vector β in d-dimensional feature space (together with a scalar intercept α)

• For a point x in feature space, project it onto β to convert it into a real number z in the range −∞ to +∞:

      z = α + β · x = α + β1 x1 + … + βd xd

• Map z to the range 0 to 1 using the logistic function:

      p = 1 / (1 + e^−z)

• Overall, logistic regression maps a point x in d-dimensional feature space to a value in the range 0 to 1
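For concreteness, here is a minimal MATLAB sketch of this mapping, using made-up values for α, β, and x (none of these numbers come from the slides):

    alpha = 0.5;                    % intercept
    beta  = [2.0; -1.0];            % model vector in d = 2 dimensions
    x     = [1.5; 0.4];             % a point in feature space

    z = alpha + beta' * x;          % project x onto beta:  z = 3.1
    p = 1 / (1 + exp(-z));          % logistic map:  p approx. 0.957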
Using a logistic regression model

• Can interpret the prediction from a logistic regression model as:
  – A probability of class membership
  – A class assignment, by applying a threshold to the probability
    (the threshold represents a decision boundary in feature space)
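Continuing the sketch above, turning the probability into a class assignment is just a comparison against a chosen threshold (0.5 here is an assumption, not a value prescribed by the slide):

    threshold = 0.5;
    if p >= threshold
        predicted_class = 1;        % e.g. the "positive" class
    else
        predicted_class = 0;
    end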


Training a logistic regression model

• Need to optimize β (and α) so the model gives the best possible reproduction of training set labels
  – Usually done by numerical approximation of maximum likelihood
  – On really large datasets, may use stochastic gradient descent
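In MATLAB's Statistics Toolbox this maximum-likelihood fit is provided by glmfit with a binomial distribution; the snippet below is a sketch assuming a numeric feature matrix X (n × d) and a 0/1 label vector y already exist in the workspace:

    % X: n-by-d feature matrix, y: n-by-1 vector of 0/1 labels (assumed to exist)
    B = glmfit(X, y, 'binomial', 'link', 'logit');   % B(1) = alpha, B(2:end) = beta
    p = glmval(B, X, 'logit');                        % fitted probabilities on the training set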


Logistic regression in one dimension

[Figures omitted: logistic regression fits to one-dimensional data]


Logistic regression in one dimension

• Parameters control shape and location of the sigmoid curve
  – α controls location of the midpoint
  – β controls slope of the rise
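A small plotting sketch (with assumed parameter values, not the ones used in the original figures) makes the effect of α and β visible:

    z = linspace(-10, 10, 200);
    sigmoid = @(alpha, beta, x) 1 ./ (1 + exp(-(alpha + beta .* x)));

    plot(z, sigmoid(0, 1, z), ...   % baseline: midpoint at 0, moderate slope
         z, sigmoid(2, 1, z), ...   % larger alpha shifts the midpoint (here to x = -2)
         z, sigmoid(0, 4, z));      % larger beta steepens the rise
    legend('\alpha = 0, \beta = 1', '\alpha = 2, \beta = 1', '\alpha = 0, \beta = 4');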




Logistic regression in two dimensions

• Subset of Fisher iris dataset
  – Two classes
  – First two columns (SL, SW)

[Figure omitted: the two classes plotted in the (SL, SW) plane with the fitted decision boundary]
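A sketch of how such a two-class, two-feature fit might be reproduced with MATLAB's built-in Fisher iris data; the exact subset and class pairing used on the slide are not stated, so the choices below (setosa vs. versicolor on sepal length and width) are assumptions:

    load fisheriris                          % provides meas (150x4) and species (cell array)
    idx = strcmp(species, 'setosa') | strcmp(species, 'versicolor');
    X = meas(idx, 1:2);                      % first two columns: sepal length (SL), sepal width (SW)
    y = double(strcmp(species(idx), 'versicolor'));   % 0/1 labels for the two classes

    B = glmfit(X, y, 'binomial', 'link', 'logit');    % B(1) = alpha, B(2:3) = beta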


Logistic regression in two dimensions

Interpreting the model vector of coefficients

• From MATLAB: B = [ 13.0460 -1.9024 -0.4047 ]
• α = B( 1 ), β = [ β1 β2 ] = B( 2 : 3 )
• α, β define location and orientation of the decision boundary
  – −α / ‖β‖ gives the distance of the decision boundary from the origin
  – decision boundary is perpendicular to β
• magnitude of β defines gradient of probabilities between 0 and 1
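Given such a coefficient vector, the decision boundary is the line where α + β1·x1 + β2·x2 = 0; below is a short sketch of how it could be drawn (the x1 range is chosen only for illustration):

    alpha = B(1);  beta = B(2:3);            % e.g. B = [ 13.0460 -1.9024 -0.4047 ] from the slide
    x1 = linspace(4, 8, 100);                % assumed range of the first feature (SL)
    x2 = -(alpha + beta(1) * x1) / beta(2);  % solve alpha + beta1*x1 + beta2*x2 = 0 for x2

    hold on
    plot(x1, x2, 'k-');                      % the boundary line, perpendicular to beta
    hold off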


Heart disease dataset

• 13 attributes (see heart.docx for details)
  – 2 demographic (age, gender)
  – 11 clinical measures of cardiovascular status and performance
• 2 classes: absence ( 1 ) or presence ( 2 ) of heart disease
• 270 samples
• Dataset taken from the UC Irvine Machine Learning Repository:
  http://archive.ics.uci.edu/ml/datasets/Statlog+(Heart)
• Preformatted for MATLAB as heart.mat
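A hedged sketch of how the dataset might be used; heart.mat is not reproduced here, so the variable names X (270 × 13 feature matrix) and y (labels coded 1/2) are assumptions about its contents:

    load heart.mat                           % assumed to provide X (270x13) and y (labels 1 or 2)
    y01 = y - 1;                             % recode: 0 = absence, 1 = presence of heart disease
    B = glmfit(X, y01, 'binomial', 'link', 'logit');
    p = glmval(B, X, 'logit');               % predicted probability of heart disease per sample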


MATLAB interlude

matlab_demo_05.m



Logistic regression

• Advantages:
  – Makes no assumptions about distributions of classes in feature space
  – Easily extended to multiple classes (multinomial regression)
  – Natural probabilistic view of class predictions
  – Quick to train
  – Very fast at classifying unknown records
  – Good accuracy for many simple data sets
  – Resistant to overfitting
  – Can interpret model coefficients as indicators of feature importance
• Disadvantages:
  – Linear decision boundary
