Final Presentation

(1) The document proposes a simple method for predictive uncertainty estimation using deep ensembles. It trains an ensemble of neural networks with different random initializations, evaluates them using proper scoring rules, and employs adversarial training to smooth the predictive distributions. (2) The method is evaluated on calibration and generalization tasks using metrics such as NLL, accuracy, and Brier score, and is shown to provide superior uncertainty estimates compared to Monte Carlo dropout. (3) In summary, the paper introduces a scalable approach for uncertainty quantification in neural networks based on ensembles and adversarial training, highlighting the potential of non-Bayesian methods for uncertainty estimation.


Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

Article Review
Introduction

Deep neural networks have achieved state-of-the-art performance on a wide variety of machine learning tasks.

NNs are poor at quantifying predictive uncertainty and tend to produce overconfident predictions.

Proper uncertainty quantification is crucial for practical applications.

Evaluation measures
Evaluating the quality of predictive uncertainties is challenging as the ‘ground truth’ uncertainty
estimates are usually not available.

Calibration: a frequentist notion of uncertainty which measures the discrepancy between subjective forecasts and (empirical) long-run frequencies.

Generalization of the predictive uncertainty to domain shift: also referred to as out-of-distribution examples, that is, measuring whether the network knows what it knows.
Alternative Approaches
Bayesian Neural Networks
This approach relies on the Bayesian formalism: a prior distribution is specified over the parameters of a NN and, given the training data, the posterior distribution over the parameters is computed and used to quantify predictive uncertainty.

Drawbacks
Bayesian NNs are often hard to implement and are computationally slower to train compared to non-Bayesian NNs.
Monte Carlo dropout
Estimates predictive uncertainty by using dropout at test time. MC-dropout is relatively simple to implement, leading to its popularity in practice. Interestingly, dropout may also be interpreted as ensemble model combination where the predictions are averaged over an ensemble of NNs (with parameter sharing).
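
As a rough illustration, here is a minimal PyTorch sketch of MC-dropout prediction; the layer sizes, dropout rate, and number of forward passes are placeholder assumptions, not the paper's exact settings.

# Minimal MC-dropout sketch (illustrative model and sample count).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 200), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(200, 10),
)

def mc_dropout_predict(model, x, num_samples=20):
    """Average softmax outputs over stochastic forward passes,
    keeping dropout active at test time."""
    model.train()  # leaves dropout layers enabled
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(num_samples)]
        )
    return probs.mean(dim=0)  # predictive mean; probs.var(0) reflects the spread

x = torch.randn(5, 784)           # dummy batch
p = mc_dropout_predict(model, x)  # shape (5, 10), rows sum to 1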
Summary of Contributions and Novelty

While ensembles of NNs and adversarial training are well known, the authors are the first to explore their potential for predictive uncertainty estimation.

They introduced a scalable method for obtaining predictive uncertainty estimates from neural networks (NNs) and investigated the impact of ensembles and adversarial training. They proposed tasks to assess the quality of predictive uncertainty in terms of calibration and generalization, and demonstrated superior performance against MC-dropout.

While most research emphasizes Bayesian deep learning for uncertainty, the authors' results highlight the potential of non-Bayesian methods, and they hope to inspire further exploration in this direction.
Deep Ensembles: A Simple Recipe for Predictive Uncertainty Estimation

Problem setup and high-level summary

The authors suggest a simple recipe (sketched in code below this list):
1. Use a proper scoring rule as the training criterion.
2. Use adversarial training to smooth the predictive distributions.
3. Train an ensemble.
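
A minimal PyTorch sketch of how the three steps might fit together for classification; the network size, learning rate, epsilon, and data loader are placeholder assumptions rather than the paper's exact configuration.

# Sketch of the deep-ensembles recipe: NLL as the proper scoring rule,
# adversarial augmentation, and M independently initialized networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_net():
    # A fresh network gives each ensemble member a different random initialization.
    return nn.Sequential(nn.Linear(784, 200), nn.ReLU(), nn.Linear(200, 10))

def train_deep_ensemble(loader, num_nets=5, epochs=1, eps=0.01, lr=1e-3):
    ensemble = []
    for _ in range(num_nets):                        # 3. train an ensemble
        net = make_net()
        opt = torch.optim.Adam(net.parameters(), lr=lr)
        for _ in range(epochs):
            for x, y in loader:
                x = x.clone().detach().requires_grad_(True)
                loss = F.cross_entropy(net(x), y)    # 1. NLL, a proper scoring rule
                grad, = torch.autograd.grad(loss, x)
                x_adv = (x + eps * grad.sign()).detach()  # 2. adversarial example
                opt.zero_grad()
                total = (F.cross_entropy(net(x.detach()), y)
                         + F.cross_entropy(net(x_adv), y))
                total.backward()
                opt.step()
        ensemble.append(net)
    return ensemble

Cross-entropy on the softmax output is the negative log-likelihood, i.e. the negative of the log score, so minimizing it corresponds to maximizing a proper scoring rule (see the proper scoring rules section below).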
Adversarial training

Adversarial examples are those which are 'close' to the original training examples but are misclassified by the NN.

Intuitively, the adversarial perturbation creates a new training example by adding a small perturbation along a direction in which the network is likely to increase its loss.

Assuming this perturbation is small enough, these adversarial examples can be used to augment the original training set by treating (x', y) as additional training examples.
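
The perturbation described above can be generated with the fast gradient sign method used in the paper, x' = x + eps * sign(grad_x loss(theta, x, y)). A minimal sketch, where the model and eps value are placeholders:

# FGSM sketch: x_adv = x + eps * sign(grad_x loss(theta, x, y)).
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=0.01):
    """Create an adversarially perturbed copy of x, keeping the original
    label y, so (x_adv, y) can augment the training set."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()   # step along the loss-increasing direction
    return x_adv.detach(), y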
Adversarial training to smooth predictive distributions

Adversarial training can be interpreted as a computationally efficient way to smooth the predictive distributions by increasing the likelihood of the target in an epsilon neighborhood of the observed training examples.

The authors investigate the use of adversarial training for predictive uncertainty estimation.
Proper Scoring Rules

Scoring rules measure the quality of predictive uncertainty. A scoring rule assigns a numerical score to a predictive distribution p_θ(y|x), rewarding better calibrated predictions over worse. We shall consider scoring rules where a higher numerical score is better. The expected scoring rule is then:

S(p_θ, q) = ∫ q(y, x) S(p_θ, (y, x)) dy dx

where:
S(p_θ, (y, x)) is a function that evaluates the quality of the predictive distribution p_θ(y|x) relative to an event y | x drawn from q(y|x).
q(y, x) denotes the true distribution on (y, x) tuples.

A proper scoring rule is one where S(p_θ, q) ≤ S(q, q), with equality if and only if p_θ(y|x) = q(y|x) for all p_θ and q. NNs can then be trained according to a measure that encourages calibration of predictive uncertainty by minimizing the loss L(θ) = −S(p_θ, q).
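
To make the 'proper' condition concrete, here is a tiny numerical check (with made-up categorical distributions, not from the paper) that the log score satisfies S(p, q) ≤ S(q, q):

# Toy check that the log score S(p, (y)) = log p(y) is proper: the expected
# score under the true categorical distribution q is maximized at p = q.
import numpy as np

q = np.array([0.7, 0.2, 0.1])   # "true" class distribution (invented)
p = np.array([0.3, 0.4, 0.3])   # a miscalibrated forecast (invented)

def expected_log_score(forecast, q):
    return np.sum(q * np.log(forecast))

print(expected_log_score(q, q))   # approx -0.802  (S(q, q))
print(expected_log_score(p, q))   # approx -1.146  (S(p, q) <= S(q, q))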
Ensembles: training and prediction

The authors focus only on the randomization-based approach, as it is better suited for distributed, parallel computation.

They concluded that training on the entire dataset with random initialization was better than bagging for deep ensembles. A sketch of combining the resulting ensemble's predictions is shown below.
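
For classification, the ensemble's predictive distribution at test time is the average of the member softmax outputs. A minimal sketch, assuming ensemble is a list of trained networks (e.g. from the training sketch earlier):

# Average the member predictions: p(y|x) = (1/M) * sum_m p_theta_m(y|x).
import torch

def ensemble_predict(ensemble, x):
    with torch.no_grad():
        probs = torch.stack([torch.softmax(net(x), dim=-1) for net in ensemble])
    return probs.mean(dim=0)

For regression, the paper instead treats the ensemble as a uniformly weighted mixture of Gaussians and combines the predicted means and variances; that case is omitted here.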
Experimental Results

Evaluation metrics and experimental setup

Evaluation Metrics (a sketch of computing them follows the setup details below):
Both Classification & Regression:
Negative Log Likelihood (NLL) for evaluating predictive uncertainty.
Classification:
Classification Accuracy.
Brier Score (BS): BS = (1/K) Σ_k (t_k − p(y = k | x))², where t_k = 1 if k is the true class and 0 otherwise.

Experimental Setup:
Batch size: 100.
Optimizer: Adam with a fixed learning rate of 0.1.
Adversarial Training: set ϵ to 0.01 times the range of the training data for each dimension.
Weight Initialization: Default in Torch.
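
A small sketch of how these metrics might be computed for a batch of classification predictions; the probability and label arrays below are made up purely for illustration.

# NLL, accuracy, and Brier score for a batch of softmax predictions.
import numpy as np

probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.5, 0.3]])   # predicted p(y = k | x), rows sum to 1
labels = np.array([0, 2])             # true classes

onehot = np.eye(probs.shape[1])[labels]

nll = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
accuracy = np.mean(probs.argmax(axis=1) == labels)
brier = np.mean(np.sum((onehot - probs) ** 2, axis=1) / probs.shape[1])

print(nll, accuracy, brier)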
Classification on MNIST

The authors used the following setup (a code sketch follows this list):

Model: MLP with 3 hidden layers.
Units: 200 hidden units per layer.
Activation: ReLU with batch normalization.
MC-dropout: dropout added after each ReLU.
Dropout Rate: 0.1.
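
A PyTorch sketch of the architecture described above; the paper used Torch, so this re-expression (and the exact layer ordering) is an assumption based on the bullet points.

# 3-hidden-layer MLP with 200 units, ReLU + batch norm; dropout (p=0.1)
# after each ReLU is only used for the MC-dropout baseline.
import torch.nn as nn

def mnist_mlp(use_mc_dropout=False, p=0.1):
    layers, in_dim = [], 784
    for _ in range(3):
        layers += [nn.Linear(in_dim, 200), nn.BatchNorm1d(200), nn.ReLU()]
        if use_mc_dropout:
            layers.append(nn.Dropout(p))
        in_dim = 200
    layers.append(nn.Linear(200, 10))
    return nn.Sequential(*layers)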
Classification on MNIST results

[Figure: results from the paper]

[Figure: our results]
The End
