Jdavis Advice
Jesse Davis
Goals of this Lecture: Address Practical Aspects of Machine Learning
Decision trees:
Select the most promising feature at each node
The tree only contains a subset of the features
Input images:
N images
2500 features
The figure is misleading: it is best to think of the data as an N × 2500 matrix, i.e., |Examples| × |Features|
Reduce dimensionality: 2500 → 15
Problematic Data Set for PCA
PCA
Rotate the axes and sort new dimensions in order of “importance”
Discard low significance dimensions
Uses:
Get compact description
Ignore noise
Improve classification (hopefully)
Not magic:
Doesn’t know class labels
Can only capture linear variations
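
A minimal sketch of a reduction like the 2500 → 15 step above, using scikit-learn (the random data is an illustrative stand-in for the N × 2500 image matrix):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2500))        # stand-in for N=100 images x 2500 pixel features

pca = PCA(n_components=15)              # keep the 15 highest-variance directions
X_reduced = pca.fit_transform(X)        # rotate axes, sort by variance, discard the rest

print(X_reduced.shape)                  # (100, 15)
print(pca.explained_variance_ratio_)    # "importance" of each kept dimension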
Pro: Very fast, so scales to large feature sets or large data sets
Cons:
Misses feature interactions
May select many redundant features
Approach 1: Correlation
$$R(f_i, y) = \frac{\operatorname{cov}(f_i, y)}{\sqrt{\operatorname{var}(f_i)\,\operatorname{var}(y)}}$$

Estimated from the m training examples:

$$R(f_i, y) = \frac{\sum_{k=1}^{m} \left(f_{k,i} - \bar{f}_i\right)\left(y_k - \bar{y}\right)}{\sqrt{\sum_{k=1}^{m} \left(f_{k,i} - \bar{f}_i\right)^2 \sum_{k=1}^{m} \left(y_k - \bar{y}\right)^2}}$$
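
The same score in NumPy (a sketch; the feature matrix F and target y are illustrative stand-ins):

import numpy as np

rng = np.random.default_rng(0)
F = rng.normal(size=(200, 10))            # m = 200 examples, 10 features
y = F[:, 3] + 0.1 * rng.normal(size=200)  # target correlated with feature 3

Fc = F - F.mean(axis=0)                   # f_ki minus the mean of feature i
yc = y - y.mean()                         # y_k minus the mean of y
R = (Fc * yc[:, None]).sum(axis=0) / np.sqrt((Fc ** 2).sum(axis=0) * (yc ** 2).sum())

print(np.argsort(-np.abs(R)))             # features ranked by |correlation|; 3 comes first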
Approach 2: Single Variable Classifier
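
The slide leaves the details open; one common reading is to score each feature by the cross-validated performance of a classifier trained on that feature alone, sketched below (the dataset and the stump learner are illustrative choices):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

scores = []
for i in range(X.shape[1]):
    stump = DecisionTreeClassifier(max_depth=1)            # single-variable classifier
    acc = cross_val_score(stump, X[:, [i]], y, cv=5).mean()
    scores.append((acc, i))

for acc, i in sorted(scores, reverse=True)[:5]:            # five most predictive features
    print(f"feature {i}: accuracy {acc:.3f}")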
Operators:
Forward: add a feature
Backward: subtract a feature
Forward: faster in early steps because there are fewer features to test (e.g., add F3)
Backward: fast for choosing all but a small subset of the features (e.g., subtract F3)
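
Both operators are available in scikit-learn's SequentialFeatureSelector (a sketch; the estimator and dataset are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
est = LogisticRegression(max_iter=5000)

# Forward: start empty, greedily add features (cheap when keeping only a few).
fwd = SequentialFeatureSelector(est, n_features_to_select=5, direction="forward").fit(X, y)
# Backward: start full, greedily subtract features (cheap when dropping only a few).
bwd = SequentialFeatureSelector(est, n_features_to_select=25, direction="backward").fit(X, y)

print(fwd.get_support(indices=True))
print(bwd.get_support(indices=True))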
[Figure: data before and after dropping feature x2, plotted against x1]
Feature Selection in Practice
Good practices:
Pose a question / hypothesis and answer it
Also include a naive baseline such as
◼ Always predict majority class
◼ Return mean value in training data
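
scikit-learn ships both of these baselines directly; a sketch (the dataset is an illustrative stand-in):

from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier, DummyRegressor
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

majority = DummyClassifier(strategy="most_frequent")   # always predict majority class
print(cross_val_score(majority, X, y, cv=5).mean())    # any real model should beat this

mean_reg = DummyRegressor(strategy="mean")             # return mean value in training data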
Case Study: RPE for Professional Soccer
Given: GPS and accelerometer data from a player's training session
Predict: the player's Rate of Perceived Exertion (RPE)
Question: Is the model valid across seasons?

[Figure: MAE (0.00–1.20) per player for Neural Net and LASSO vs. a train-set-average baseline]
Results: Is an Individual Model More Accurate Than a Team Model?
[Figure: mean absolute error (0.65–0.90) for individual vs. team models, comparing Neural Net, Boosted Tree, and LASSO]
[Figure: AUCPR (0.00–0.50) vs. number of training databases (1–3) for TSFuse, TSFresh, and RNN]
Case Study: Energy-Efficient Prediction
Our approach: 4× more predictions on the same resource budget

[Figure: speedup factor (0.00–6.00) of our approach]
[Figure: Δ weighted accuracy (−0.01 to 0.02) vs. feature budget (0–1000) for IG and ΔCP]
Comparing Run Times Is A Dark Art
Differences due to
Programming languages
How optimized the code is (definitely relevant)
Evaluate Design Decisions: Ablation or Lesion Study
Which functionality matters? E.g., Wrist, Tibia, or Both
No-learning baselines: constant predictions, e.g., always predict the median RPE

[Figure: MAE of RPE (0.00–3.50) for Personalized, Median, No Normalization, and Normalization variants]
Case Study: Resource Monitoring
[Figure: simple features vs. simple features + learned patterns, over time]
Potential Problems or Pitfalls
Cross Validation Errors
[Figure: synthetic example of positive (+) and negative (−) points]
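
A classic version of this error, sketched on synthetic data (random labels, so the true accuracy is 50%): selecting features on the full data set before cross-validating leaks test information, while selecting inside each fold gives an honest estimate.

import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5000))    # pure noise features
y = rng.integers(0, 2, size=50)    # random labels: true accuracy is 0.5

# Wrong: select features on ALL data, then cross-validate -> optimistic score.
X_sel = SelectKBest(f_classif, k=20).fit_transform(X, y)
print(cross_val_score(LogisticRegression(), X_sel, y, cv=5).mean())

# Right: selection is redone inside each training fold -> score near 0.5.
pipe = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression())
print(cross_val_score(pipe, X, y, cv=5).mean())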
Idea 2: Manipulate the Learner
ROC
Precision / Recall
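
One standard way to manipulate the learner for these metrics (an illustrative sketch, not necessarily the slide's exact method) is cost-sensitive class weighting:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(class_weight="balanced").fit(X_tr, y_tr)  # upweight the rare class
p = clf.predict_proba(X_te)[:, 1]

print("ROC AUC:", roc_auc_score(y_te, p))                      # summarizes the ROC curve
print("Average precision:", average_precision_score(y_te, p))  # summarizes precision/recall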
My Model Is Not Accurate Enough
Question: What do I do?
More data
More / better features
Change the optimizer
Change the objective function
Change the model class
[Figure: train and test error curves]
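
The train/test error curves answer the question: a large gap between low train error and high test error points to more data or a simpler model class, while two high curves point to better features or a richer model class. A sketch of how to compute them (the dataset and model are illustrative):

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
sizes, train_scores, test_scores = learning_curve(
    DecisionTreeClassifier(random_state=0), X, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5))

for n, tr, te in zip(sizes, train_scores.mean(axis=1), test_scores.mean(axis=1)):
    print(f"n={n:4d}  train error={1 - tr:.3f}  test error={1 - te:.3f}")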