Gradient Descent Regression / Logistic Regression

Gradient descent regression involves: 1) Formulating an error objective function and representing data with features. 2) Optimizing model parameters numerically through gradient descent to minimize the objective function. 3) Applying gradient descent to linear and logistic regression by taking the derivative of the objective function to determine the update rule for the model parameters in each iteration.


Gradient Descent Regression

Logistic Regression
Module 2: Learning with Gradient Descent
Module 2: Numerical Optimization
[Pipeline diagram: DATA (raw data: housing, spam; labels) → REPRESENTATION (features, selection, data processing, dimensions) → LEARNING (supervised learning, clustering, numerical optimization: logistic regression, perceptron, neural network) → PERFORMANCE (evaluation: train/test, error, accuracy, cross validation, ROC analysis, tuning).]

• formulate the problem with a model and its parameters
• formulate the error as a mathematical objective
• optimize the parameters numerically for the given objective
• usually an algebraic setup
- involves matrices and calculus
• probabilistic setup (likelihoods) in the next module
Module 2 Objectives / Gradient Descent Regression

• numerical methods primer, gradient descent
• regression using GD
- learning rate
- batch vs online modes
- compare with normal-equation regression (Module 1)
• logistic regression
• optional: Newton's optimization procedure
Gradient Descent

• finds a local minimum of the objective function J by guessing an initial set of parameters w and then repeatedly "walking" in the opposite direction of the gradient ∂J/∂w

• update rule (per dimension j, with learning rate α):

  w_j ← w_j − α · ∂J/∂w_j

Gradient Descent Example

• J(x) = (x − 2)^2 + 1, with initial guess x0 = 3 for the minimum; the gradient is J′(x) = 2(x − 2)

• GD iteration 1
• GD iteration 2
• GD iteration 3
(the three iterations are traced in the sketch below)
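As a concrete trace of these three iterations, here is a minimal sketch in Python; the learning rate α = 0.25 is an assumption, since the slide does not fix a step size:

# Gradient descent on J(x) = (x - 2)^2 + 1, starting from x0 = 3.
# The learning rate alpha = 0.25 is an assumption; the slide does not fix it.

def J(x):
    return (x - 2) ** 2 + 1

def dJ(x):                       # derivative J'(x) = 2(x - 2)
    return 2 * (x - 2)

alpha = 0.25
x = 3.0                          # initial guess x0
for t in range(1, 4):            # GD iterations 1..3
    x = x - alpha * dJ(x)        # update rule: x <- x - alpha * J'(x)
    print(f"iteration {t}: x = {x:.4f}, J(x) = {J(x):.4f}")

# iteration 1: x = 2.5000, J(x) = 1.2500
# iteration 2: x = 2.2500, J(x) = 1.0625
# iteration 3: x = 2.1250, J(x) = 1.0156

With this step size the iterates move from 3 toward the true minimizer x* = 2, and J(x) decreases monotonically toward its minimum value 1.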
Regression Goal

• housing data, two features (toy example)

• regressor = a linear predictor

  h(x) = w0 + w1·x1 + w2·x2

• such that h(x) approximates the label y = label(x) as closely as possible, measured by the squared error

  J(w) = (1/2) · Σ_i (h(x_i) − y_i)^2
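To make the setup concrete, a minimal Python sketch with a tiny made-up housing dataset; the feature values, prices, and candidate parameters below are illustrative only, not from the slides:

import numpy as np

# Tiny made-up housing set: each row is [x1, x2], e.g. size and rooms.
X = np.array([[1.0, 2.0],
              [1.5, 3.0],
              [2.0, 3.0]])
y = np.array([200.0, 280.0, 330.0])    # labels: prices

w0 = 10.0                              # candidate intercept
w = np.array([100.0, 20.0])            # candidate feature weights

def h(X):
    """Linear predictor h(x) = w0 + w1*x1 + w2*x2, row-wise."""
    return w0 + X @ w

def J():
    """Squared-error objective J = 1/2 * sum_i (h(x_i) - y_i)^2."""
    return 0.5 * np.sum((h(X) - y) ** 2)

print(h(X))   # predictions for the three houses
print(J())    # how badly these candidate parameters fit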
Regression Normal Equations (Module 1)

• linear regression has a well-known exact solution, given by linear algebra

• X = training matrix of feature values
• Y = corresponding vector of labels

• then the regression coefficients that minimize the objective J are

  w = (XᵀX)⁻¹ · XᵀY
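A sketch of this exact solution in NumPy, on illustrative toy data with the intercept w0 folded into a leading column of ones; solving the linear system (XᵀX)w = XᵀY is generally preferred over forming the inverse explicitly:

import numpy as np

# Same toy housing data, with a leading column of ones for the intercept.
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 1.5, 3.0],
              [1.0, 2.0, 3.0]])
Y = np.array([200.0, 280.0, 330.0])

# Normal equations: w = (X^T X)^(-1) X^T Y, computed by solving
# the linear system (X^T X) w = X^T Y instead of inverting explicitly.
w = np.linalg.solve(X.T @ X, X.T @ Y)
print(w)   # -> approximately [40, 100, 30]: an exact fit for this toy data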
Problems with the Exact Solution for Regression

• computing (XᵀX)⁻¹ is numerically very unstable
• impractical for large matrices
• slow
• undesirable in cases with many outliers
Gradient Descent for Linear Regression

• differentiate the objective with respect to each parameter w_j:

  ∂J/∂w_j = Σ_i (h(x_i) − y_i) · x_ij

• GD update rule for one datapoint (x_i, y_i):

  w_j ← w_j + α · (y_i − h(x_i)) · x_ij

• GD for all datapoints (batch): sum the per-datapoint updates (next slide)
GD for Linear Regression

• batch (all datapoints) update step:

  w_j ← w_j + α · Σ_i (y_i − h(x_i)) · x_ij

• alternative stochastic (online) update, one datapoint at a time:

  w_j ← w_j + α · (y_t − h(x_t)) · x_tj

- i or t indexes the datapoint
- j indexes the feature (column in the data)
(both modes are traced in the sketch below)
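A runnable sketch of both modes, on an illustrative exactly-linear toy dataset; the data, learning rate, and epoch count are assumptions chosen so both runs visibly converge:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy data (not from the slides): y = 1 + 2*x exactly.
# Column 0 of X is the constant 1 that absorbs the intercept w0.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

alpha = 0.05                       # learning rate (assumed)

# Batch mode: each step sums the gradient over ALL datapoints.
w = np.zeros(2)
for epoch in range(500):
    residual = y - X @ w           # (y_i - h(x_i)) for every datapoint i
    w += alpha * X.T @ residual    # w_j += alpha * sum_i residual_i * x_ij
print("batch GD:     ", w)

# Stochastic (online) mode: one update per single datapoint t.
w = np.zeros(2)
for epoch in range(500):
    for t in rng.permutation(len(y)):
        w += alpha * (y[t] - X[t] @ w) * X[t]
print("stochastic GD:", w)

Both runs approach w ≈ [1, 2], the parameters that generated the toy labels; the stochastic version updates after every datapoint rather than after a full pass over the data.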
Least Mean Square Objective Convexity

• GD "walks" the function argument towards a local minimum
- it is possible (and sometimes likely) to obtain a local minimum that is not the GLOBAL minimum

• however, this cannot happen for the regression objective, since it is convex
- verify convexity by looking at the second-derivative (Hessian) matrix
- the Hessian matrix of J(w) is H = XᵀX, which is positive semidefinite, which implies J is convex (numeric check below)
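A quick numeric check of that claim, reusing the toy X from the sketch above: a symmetric matrix is positive semidefinite exactly when all of its eigenvalues are ≥ 0 (equivalently, vᵀXᵀXv = ‖Xv‖² ≥ 0 for every v):

import numpy as np

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

H = X.T @ X                        # Hessian of the least-squares objective
eigenvalues = np.linalg.eigvalsh(H)
print(eigenvalues)                 # all >= 0, hence H is PSD and J is convex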
Logistic Regression for Classification

• logistic transformation:

  σ(z) = 1 / (1 + e^(−z))

• logistic differential:

  σ′(z) = σ(z) · (1 − σ(z))
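A minimal sketch verifying the differential identity numerically with central finite differences; the grid of test points and the tolerance are illustrative:

import numpy as np

def sigmoid(z):
    """Logistic transformation sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5, 5, 11)
analytic = sigmoid(z) * (1 - sigmoid(z))          # sigma'(z) = sigma(1 - sigma)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(np.allclose(analytic, numeric, atol=1e-8))  # True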
Logistic Regression

• logistic regression function:

  h(x) = σ(wᵀx) = 1 / (1 + e^(−wᵀx))

• transforms the linear outcome wᵀx into "probabilities":

  P(y = 1 | x) = h(x),  P(y = 0 | x) = 1 − h(x)

• objective = likelihood of the observations
- i.e. how likely is the observed data, given the regression model
- and take the log
Logistic Regression (continued)

• consider the likelihood of the observations, and take the log:

  L(w) = Π_i h(x_i)^(y_i) · (1 − h(x_i))^(1 − y_i)
  ℓ(w) = Σ_i [ y_i · log h(x_i) + (1 − y_i) · log(1 − h(x_i)) ]

• maximize the log likelihood using gradient ascent
- one-datapoint derivation:

  ∂ℓ/∂w_j = (y − h(x)) · x_j

• write down the update rules, batch or stochastic:

  batch:      w_j ← w_j + α · Σ_i (y_i − h(x_i)) · x_ij
  stochastic: w_j ← w_j + α · (y_t − h(x_t)) · x_tj

(a runnable sketch follows)
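Putting the pieces together, a minimal sketch of batch gradient ascent for logistic regression; the toy 1-D dataset, learning rate, and iteration count are assumptions:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative 1-D toy data with an intercept column; labels are 0/1.
X = np.array([[1.0, -2.0],
              [1.0, -1.0],
              [1.0,  1.0],
              [1.0,  2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

alpha = 0.5                          # learning rate (assumed)
w = np.zeros(2)

# Batch gradient ASCENT on the log likelihood:
# w_j += alpha * sum_i (y_i - h(x_i)) * x_ij
for step in range(1000):
    h = sigmoid(X @ w)
    w += alpha * X.T @ (y - h)

print("weights:", w)
print("P(y=1|x):", sigmoid(X @ w))   # near 0 for the negative-x points,
                                     # near 1 for the positive-x points

Because this toy data is perfectly separable, the weights keep growing with more iterations; in practice one stops early or adds regularization.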
