Computer Vision
Machine Learning for Vision
Steve Elston
Machine Learning and Computer Vision
• K-nearest neighbours
• Support vector machines
• Tree models and tree ensembles
• Naïve Bayes models
• Deep neural networks – more on these starting next week!
• …
Key points for this lesson
• The formulation of linear machine learning models
• Basic machine learning workflow
• Formulation of CV features for machine learning
• The relationship between bias, variance and model capacity
• Theory of binary classifiers
• Theory of multi-class classifiers
Review of Linear Models
Why linear models?
• Understandable and interpretable
• Generalize well, if properly fit
• Highly scalable – computationally efficient
• Can approximate fairly complex functions
• Applied widely to CV problems
• A basis of understanding complex models
– Many non-linear models are locally linear at convergence
– e.g. we can learn a lot about the convergence of DL and RL models from linear approximations
Review of Linear Models
• Given a feature matrix, A, we wish to compute a linear model to predict some labels, x
• Let A be n x p.
• Then, the model has a vector of p coefficients or weights, b
• We want to compute b so that we minimize errors, e
• The predictive model is then: x = A b + e
Review of Linear Models
• How can we compute b so that we minimize errors, e, for the model?
• The least-squares solution comes from the normal equations: b = (A^T A)^-1 A^T x
• Where:
– A^T A is a dense p x p matrix
– Computing (A^T A)^-1 is computationally intensive
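A minimal sketch of this computation (NumPy and the synthetic data here are illustrative assumptions, not from the slides); solving the normal equations directly works, though np.linalg.lstsq is preferred numerically:

import numpy as np

# Synthetic example: n = 100 samples, p = 3 features
rng = np.random.default_rng(42)
A = rng.normal(size=(100, 3))                 # feature matrix, n x p
b_true = np.array([1.5, -2.0, 0.5])           # true coefficients
x = A @ b_true + 0.1 * rng.normal(size=100)   # labels with noise, x = A b + e

# Normal equations: b = (A^T A)^-1 A^T x
# Solving the linear system avoids forming the inverse explicitly.
b_normal = np.linalg.solve(A.T @ A, A.T @ x)

# Preferred in practice: an SVD-based least-squares solver
b_lstsq, *_ = np.linalg.lstsq(A, x, rcond=None)

print(b_normal, b_lstsq)   # both close to b_true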
[Figure: l2 regularization – contours of J(W)MLE plotted against the constraint ||W||2 = constant; the minimum MinW(J(W)MLE) moves to where the constraint on the model parameters binds, shrinking the coefficients toward, but not exactly to, zero (B2 ~ 0).]
l2 Regularization
l2 regularization goes by many names
• Euclidean norm regularization
• First published by Andrey Tikhonov in the late 1940s
– Only published in English in 1977
– Known as Tikhonov regularization
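A minimal sketch of l2 regularization in closed form, assuming the penalized least-squares objective ||x − A b||2^2 + λ||b||2^2 (the NumPy implementation and synthetic data are assumptions beyond the slides):

import numpy as np

def ridge_fit(A, x, lam):
    """Closed-form l2-regularized least squares:
    b = (A^T A + lam * I)^-1 A^T x."""
    p = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(p), A.T @ x)

# Illustrative use: larger lam shrinks the coefficients toward zero
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 5))
x = A @ np.array([3.0, -1.0, 0.0, 2.0, 0.5]) + rng.normal(size=50)
for lam in (0.0, 1.0, 100.0):
    print(lam, np.round(ridge_fit(A, x, lam), 2))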
l1 Regularization
[Figure: l1 regularization – contours of J(W)MLE plotted against the constraint ||W||1 = constant; the minimum MinW(J(W)MLE) falls at a vertex of the constraint region, leaving a nonzero parameter value on one axis and driving B2 exactly to zero.]
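To see the sparsity this figure illustrates, here is a hedged sketch using scikit-learn's Lasso (the library choice and data are assumptions; the slide shows only the geometry):

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 10))
# Only 2 of 10 coefficients are truly nonzero
b_true = np.zeros(10)
b_true[0], b_true[3] = 2.0, -3.0
x = A @ b_true + 0.1 * rng.normal(size=200)

# The l1 penalty drives most fitted coefficients exactly to zero
model = Lasso(alpha=0.1).fit(A, x)
print(np.round(model.coef_, 2))   # sparse: zeros except near indices 0 and 3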
Bias, Variance and Model Capacity
The Bias-Variance Trade-Off
• The expected test error decomposes as: Test error = Bias^2 + Variance + Irreducible error
• Where:
– Bias = E[f̂(x)] − f(x), the systematic error of the model
– Variance = E[(f̂(x) − E[f̂(x)])^2], the model's sensitivity to the training sample
– Irreducible error = σ^2, the noise variance of the data
The Bias-Variance Trade-Off
[Figure: test error vs. model capacity – bias falls and variance rises as capacity increases; their sum, the test error, is minimized at an intermediate capacity.]
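A small simulation of the trade-off (the sine target, noise level, and polynomial fits are illustrative assumptions, not from the slides): a low-capacity model shows high bias, a high-capacity model high variance:

import numpy as np

rng = np.random.default_rng(7)
f = lambda x: np.sin(2 * np.pi * x)    # assumed "true" function
x_test = 0.25                          # evaluate bias/variance at one point

for degree in (1, 3, 9):               # increasing model capacity
    preds = []
    for _ in range(500):               # many independent training sets
        x_tr = rng.uniform(0, 1, 20)
        y_tr = f(x_tr) + 0.3 * rng.normal(size=20)
        coef = np.polyfit(x_tr, y_tr, degree)
        preds.append(np.polyval(coef, x_test))
    preds = np.array(preds)
    bias2 = (preds.mean() - f(x_test)) ** 2
    var = preds.var()
    print(f"degree {degree}: bias^2 = {bias2:.3f}, variance = {var:.3f}")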
Model Capacity, Bias and Variance
Example: capacity, bias and variance for a polynomial model linear in coefficients
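A hedged sketch of this example (the sine target, noise level, and NumPy polynomial fits are illustrative assumptions): training error falls monotonically with degree, while test error eventually rises as the model overfits:

import numpy as np

rng = np.random.default_rng(3)
f = lambda x: np.sin(2 * np.pi * x)    # assumed target function

# One training set and one held-out test set
x_train = rng.uniform(0, 1, 30)
y_train = f(x_train) + 0.2 * rng.normal(size=30)
x_test = rng.uniform(0, 1, 200)
y_test = f(x_test) + 0.2 * rng.normal(size=200)

for degree in (1, 3, 9, 15):
    # The polynomial is non-linear in x but linear in the coefficients
    coef = np.polyfit(x_train, y_train, degree)
    mse = lambda x, y: np.mean((np.polyval(coef, x) - y) ** 2)
    print(f"degree {degree:2d}: train MSE = {mse(x_train, y_train):.3f}, "
          f"test MSE = {mse(x_test, y_test):.3f}")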
Machine Learning Workflow
[Diagram: data from the database are split into train and test sets; the model is trained on the training split and the trained model is evaluated on the test split.]
[Diagram: training features, together with training labels (the known labels), are fed to model training, which minimizes errors to produce a trained model with learned parameters.]
Machine Learning Workflow
[Diagram: test features, with test labels, are fed to the trained model with its learned parameters to evaluate model performance.]
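A minimal end-to-end sketch of this workflow with scikit-learn (the synthetic dataset and logistic regression model are assumptions for illustration):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic "database" of features and known labels
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Split data to train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Train model: learn parameters by minimizing errors on the training set
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Evaluate model performance on the held-out test data
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))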
Machine Learning Workflow
How do we evaluate classification models?
• Classifiers perform a hypothesis test with four possible outcomes (tallied in the sketch after this list):
– True positive (TP): Positive cases are correctly classified
– True negative (TN): Negative cases are correctly classified
– False positive (FP): Negative case erroneously classified as positive
– False negative (FN): Positive case erroneously classified as negative
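A minimal NumPy sketch tallying these four outcomes and the metrics built from them (the example labels are made up for illustration):

import numpy as np

def outcome_counts(y_true, y_pred):
    """Tally the four hypothesis-test outcomes for a binary classifier."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))   # true positives
    tn = np.sum((y_true == 0) & (y_pred == 0))   # true negatives
    fp = np.sum((y_true == 0) & (y_pred == 1))   # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))   # false negatives
    return tp, tn, fp, fn

tp, tn, fp, fn = outcome_counts([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print("precision:", tp / (tp + fp))    # TP / (TP + FP)
print("recall:", tp / (tp + fn))       # TP / (TP + FN)
print("accuracy:", (tp + tn) / (tp + tn + fp + fn))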
[Figure: simplex diagram of count triples summing to 4 over three classes A, B and C, e.g. (4,0,0), (2,1,1), (0,0,4), illustrating how multi-class outcomes are distributed among the classes.]
Multi-Class Logistic Regression
How do we build a multinomial classifier?
1. Start with K classes, {1,2,3,…,K}, with labels one-hot encoded
2. The most probable class is: ŷ = argmax_k P(y = k | x)
3. The One vs. Rest method uses K-1 classifiers to compute probabilities
4. The probability of the Kth, or pivot, class is: P(y = K | x) = 1 − Σ_{k=1..K−1} P(y = k | x)
Multi-Class Logistic Regression
Use K-1 linear classifiers
Start with the K-1 log probability ratios with the pivot class:
ln( P(y_i = k | x_i) / P(y_i = K | x_i) ) = β_k · x_i, for k = 1, …, K−1
where:
– β_k = vector of coefficients for the kth model
– x_i = ith feature vector
– y_i = ith one-hot encoded label vector
Multi-Class Logistic Regression
Find the K-1 probabilities with the linear classifiers:
P(y_i = k | x_i) = exp(β_k · x_i) / (1 + Σ_{l=1..K−1} exp(β_l · x_i)), for k = 1, …, K−1
and the pivot class probability:
P(y_i = K | x_i) = 1 / (1 + Σ_{l=1..K−1} exp(β_l · x_i))
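A minimal NumPy sketch of these pivot-class probability formulas (the coefficient values and feature vector below are made-up illustrations):

import numpy as np

def pivot_class_probs(B, x):
    """Multinomial logistic probabilities using class K as the pivot.
    B: (K-1, p) array of coefficient vectors beta_k; x: length-p feature vector.
    Returns all K class probabilities."""
    scores = np.exp(B @ x)                  # exp(beta_k . x) for k = 1..K-1
    denom = 1.0 + scores.sum()
    return np.append(scores, 1.0) / denom   # pivot class: 1 / denom

# Illustrative example: K = 3 classes, p = 2 features
B = np.array([[0.5, -1.0],    # beta_1
              [2.0,  0.3]])   # beta_2
x = np.array([1.0, 2.0])
p = pivot_class_probs(B, x)
print(p, p.sum())             # probabilities sum to 1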