Final Presentation

(1) The document proposes a simple method for predictive uncertainty estimation using deep ensembles. It trains an ensemble of neural networks with different random initializations, evaluates them using proper scoring rules, and employs adversarial training to smooth the predictive distributions. (2) The method is evaluated on calibration and generalization tasks using metrics such as NLL, accuracy, and Brier score, and is shown to provide superior uncertainty estimates compared to Monte Carlo dropout. (3) In summary, the paper introduces a scalable approach for uncertainty quantification in neural networks based on ensembles and adversarial training, highlighting the potential of non-Bayesian methods for uncertainty estimation.


Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

Article Review
Introduction

Deep neural networks have achieved state-of-the-art performance on a wide variety of machine learning tasks.

NNs are poor at quantifying predictive uncertainty and tend to produce overconfident predictions.

Proper uncertainty quantification is crucial for practical applications.

Evaluation measures
Evaluating the quality of predictive uncertainties is challenging as the ‘ground truth’ uncertainty
estimates are usually not available.

Calibration: a frequentist notion of uncertainty which measures the discrepancy between subjective forecasts and (empirical) long-run frequencies.

Generalization of the predictive uncertainty to domain shift: also referred to as out-of-distribution examples, that is, measuring whether the network knows what it knows.
Alternative Approaches
Bayesian Neural Networks
This approach relies on the Bayesian formalism: a prior distribution is specified over the parameters of a NN and, given the training data, the posterior distribution over the parameters is computed and used to quantify predictive uncertainty.

Drawbacks
Bayesian NNs are often hard to implement and are computationally slower to train compared to non-Bayesian NNs.
Monte Carlo dropout
Estimates predictive uncertainty by using dropout at test time. MC-dropout is relatively simple to implement, leading to its popularity in practice. Interestingly, dropout may also be interpreted as ensemble model combination where the predictions are averaged over an ensemble of NNs (with parameter sharing).
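
As a rough illustration, here is a minimal PyTorch sketch of MC-dropout prediction; the layer sizes, dropout rate, and number of forward passes are placeholder assumptions, not the paper's exact settings.

# Minimal MC-dropout sketch (illustrative model and sample count).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 200), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(200, 10),
)

def mc_dropout_predict(model, x, num_samples=20):
    """Average softmax outputs over stochastic forward passes,
    keeping dropout active at test time."""
    model.train()  # leaves dropout layers enabled
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(num_samples)]
        )
    return probs.mean(dim=0)  # predictive mean; probs.var(0) reflects the spread

x = torch.randn(5, 784)           # dummy batch
p = mc_dropout_predict(model, x)  # shape (5, 10), rows sum to 1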
Summary of Contributions and Novelty

While ensembles of NNs and adversarial training are well known, the authors are the first to explore their potential for predictive uncertainty estimation.

They introduced a scalable method for obtaining predictive uncertainty estimates from neural networks (NNs) and investigated the impact of ensembles and adversarial training. They proposed tasks to assess the quality of predictive uncertainty in terms of calibration and generalization, and demonstrated superior performance against MC-dropout.

While most research emphasizes Bayesian deep learning for uncertainty, the authors' results highlight the potential of non-Bayesian methods, and they hope to inspire further exploration in this direction.
Deep Ensembles: A Simple Recipe for Predictive Uncertainty Estimation

Problem setup and high-level summary

The authors suggest a simple recipe (sketched in code below this list):
1. Use a proper scoring rule as the training criterion.
2. Use adversarial training to smooth the predictive distributions.
3. Train an ensemble.
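
A minimal PyTorch sketch of how the three steps might fit together for classification; the network size, learning rate, epsilon, and data loader are placeholder assumptions rather than the paper's exact configuration.

# Sketch of the deep-ensembles recipe: NLL as the proper scoring rule,
# adversarial augmentation, and M independently initialized networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_net():
    # A fresh network gives each ensemble member a different random initialization.
    return nn.Sequential(nn.Linear(784, 200), nn.ReLU(), nn.Linear(200, 10))

def train_deep_ensemble(loader, num_nets=5, epochs=1, eps=0.01, lr=1e-3):
    ensemble = []
    for _ in range(num_nets):                        # 3. train an ensemble
        net = make_net()
        opt = torch.optim.Adam(net.parameters(), lr=lr)
        for _ in range(epochs):
            for x, y in loader:
                x = x.clone().detach().requires_grad_(True)
                loss = F.cross_entropy(net(x), y)    # 1. NLL, a proper scoring rule
                grad, = torch.autograd.grad(loss, x)
                x_adv = (x + eps * grad.sign()).detach()  # 2. adversarial example
                opt.zero_grad()
                total = (F.cross_entropy(net(x.detach()), y)
                         + F.cross_entropy(net(x_adv), y))
                total.backward()
                opt.step()
        ensemble.append(net)
    return ensemble

Cross-entropy on the softmax output is the negative log-likelihood, i.e. the negative of the log score, so minimizing it corresponds to maximizing a proper scoring rule (see the proper scoring rules section below).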
Adversarial training

Adversarial examples are those which are 'close' to the original training examples but are misclassified by the NN.

Intuitively, the adversarial perturbation creates a new training example by adding a small perturbation along a direction in which the network is likely to increase its loss.

Assuming this perturbation is small enough, these adversarial examples can be used to augment the original training set by treating (x', y) as additional training examples.
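
The perturbation described above can be generated with the fast gradient sign method used in the paper, x' = x + eps * sign(grad_x loss(theta, x, y)). A minimal sketch, where the model and eps value are placeholders:

# FGSM sketch: x_adv = x + eps * sign(grad_x loss(theta, x, y)).
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=0.01):
    """Create an adversarially perturbed copy of x, keeping the original
    label y, so (x_adv, y) can augment the training set."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()   # step along the loss-increasing direction
    return x_adv.detach(), y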
Adversarial training to smooth predictive distributions

Adversarial training can be interpreted as a computationally efficient way to smooth the predictive distributions by increasing the likelihood of the target in an epsilon neighborhood of the observed training examples.

The authors investigate the use of adversarial training for predictive uncertainty estimation.
Proper Scoring Rules

Scoring rules measure the quality of predictive uncertainty. A scoring rule assigns a numerical score to a predictive distribution p_θ(y|x), rewarding better calibrated predictions over worse. We shall consider scoring rules where a higher numerical score is better. The expected scoring rule is then:

S(p_θ, q) = ∫ q(y, x) S(p_θ, (y, x)) dy dx

where:
S(p_θ, (y, x)) is a function that evaluates the quality of the predictive distribution p_θ(y|x) relative to an event y | x drawn from q(y|x).
q(y, x) denotes the true distribution on (y, x) tuples.

A proper scoring rule is one where S(p_θ, q) ≤ S(q, q), with equality if and only if p_θ(y|x) = q(y|x) for all p_θ and q. NNs can then be trained according to a measure that encourages calibration of predictive uncertainty by minimizing the loss L(θ) = −S(p_θ, q).
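
To make the 'proper' condition concrete, here is a tiny numerical check (with made-up categorical distributions, not from the paper) that the log score satisfies S(p, q) ≤ S(q, q):

# Toy check that the log score S(p, (y)) = log p(y) is proper: the expected
# score under the true categorical distribution q is maximized at p = q.
import numpy as np

q = np.array([0.7, 0.2, 0.1])   # "true" class distribution (invented)
p = np.array([0.3, 0.4, 0.3])   # a miscalibrated forecast (invented)

def expected_log_score(forecast, q):
    return np.sum(q * np.log(forecast))

print(expected_log_score(q, q))   # approx -0.802  (S(q, q))
print(expected_log_score(p, q))   # approx -1.146  (S(p, q) <= S(q, q))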
Ensembles: training and prediction

The authors focus only on the randomization-based approach, as it is better suited for distributed, parallel computation.

They concluded that training on the entire dataset with random initialization was better than bagging for deep ensembles. A sketch of combining the resulting ensemble's predictions is shown below.
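
For classification, the ensemble's predictive distribution at test time is the average of the member softmax outputs. A minimal sketch, assuming ensemble is a list of trained networks (e.g. from the training sketch earlier):

# Average the member predictions: p(y|x) = (1/M) * sum_m p_theta_m(y|x).
import torch

def ensemble_predict(ensemble, x):
    with torch.no_grad():
        probs = torch.stack([torch.softmax(net(x), dim=-1) for net in ensemble])
    return probs.mean(dim=0)

For regression, the paper instead treats the ensemble as a uniformly weighted mixture of Gaussians and combines the predicted means and variances; that case is omitted here.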
Experimental Results

Evaluation metrics and experimental setup

Evaluation Metrics (a sketch of computing them follows the setup details below):
Both Classification & Regression:
Negative Log Likelihood (NLL) for evaluating predictive uncertainty.
Classification:
Classification Accuracy.
Brier Score (BS): BS = (1/K) Σ_k (t_k − p(y = k | x))², where t_k = 1 if k is the true class and 0 otherwise.

Experimental Setup:
Batch size: 100.
Optimizer: Adam with a fixed learning rate of 0.1.
Adversarial Training: set ϵ to 0.01 times the range of the training data for each dimension.
Weight Initialization: Default in Torch.
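
A small sketch of how these metrics might be computed for a batch of classification predictions; the probability and label arrays below are made up purely for illustration.

# NLL, accuracy, and Brier score for a batch of softmax predictions.
import numpy as np

probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.5, 0.3]])   # predicted p(y = k | x), rows sum to 1
labels = np.array([0, 2])             # true classes

onehot = np.eye(probs.shape[1])[labels]

nll = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
accuracy = np.mean(probs.argmax(axis=1) == labels)
brier = np.mean(np.sum((onehot - probs) ** 2, axis=1) / probs.shape[1])

print(nll, accuracy, brier)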
Classification on MNIST

The authors used the following setup (a code sketch follows this list):

Model: MLP with 3 hidden layers.
Units: 200 hidden units per layer.
Activation: ReLU with batch normalization.
MC-dropout: dropout added after each ReLU.
Dropout Rate: 0.1.
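
A PyTorch sketch of the architecture described above; the paper used Torch, so this re-expression (and the exact layer ordering) is an assumption based on the bullet points.

# 3-hidden-layer MLP with 200 units, ReLU + batch norm; dropout (p=0.1)
# after each ReLU is only used for the MC-dropout baseline.
import torch.nn as nn

def mnist_mlp(use_mc_dropout=False, p=0.1):
    layers, in_dim = [], 784
    for _ in range(3):
        layers += [nn.Linear(in_dim, 200), nn.BatchNorm1d(200), nn.ReLU()]
        if use_mc_dropout:
            layers.append(nn.Dropout(p))
        in_dim = 200
    layers.append(nn.Linear(200, 10))
    return nn.Sequential(*layers)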
Classification on MNIST results

[Figure: results from the paper]

[Figure: our results]
The End
