Variational Inference using Implicit Distributions
by Ferenc Huszár
MUPI journal club, 2018-10-04
Sources
● arXiv: Ferenc Huszár: Variational Inference using Implicit Distributions
  https://arxiv.org/abs/1702.08235 (Feb 2017 version)
● blog posts:
  https://www.inference.vc/variational-inference-with-implicit-probabilistic-models-part-1-2/
Reminder: VI & ELBO
q: approximation to the posterior p(z|x)
David M. Blei, Alp Kucukelbir, Jon D. McAuliffe: Variational Inference: A Review for Statisticians
(https://arxiv.org/abs/1601.00670)
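The equation on this slide did not survive extraction. As a hedged reminder in standard notation (the symbols q_ψ and L(ψ) are assumptions chosen to match the later slides, not recovered from the slide itself), the ELBO can be written as:

```latex
\log p(x) \;\ge\; \mathcal{L}(\psi)
  = \mathbb{E}_{z \sim q_\psi(z)}\big[\log p(x, z) - \log q_\psi(z)\big]
  = \log p(x) - \mathrm{KL}\big[\,q_\psi(z)\,\|\,p(z \mid x)\,\big]
```

Maximizing L(ψ) is therefore the same as minimizing the KL divergence from q_ψ to the true posterior.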
Explicit vs implicit
● The parametric assumptions we make in VI are often too strong.
● Implicit models are one way to relax them.
● With them we can model more complicated distributions.
Explicit example: the normal distribution (figure source: https://en.wikipedia.org/wiki/Normal_distribution)
Implicit distributions:
● we can sample from them
● we can take derivatives of samples w.r.t. the parameters
Example of an implicit distribution: noise ε ~ N(0,1) is passed through a neural network, and the network output is the sample.
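A minimal sketch of the two properties listed for implicit distributions, assuming a hypothetical one-hidden-layer network (the architecture, sizes, and function names are illustrative, not from the slides). Sampling is a forward pass on noise, and the derivative of each sample w.r.t. a parameter is available in closed form even though the density of z is never written down.

```python
import numpy as np

def sample_implicit(params, eps):
    """Push noise eps through a tiny network z = g_psi(eps)."""
    w1, b1, w2, b2 = params
    hidden = np.tanh(np.outer(eps, w1) + b1)   # shape (n, h)
    return hidden @ w2 + b2                    # shape (n,)

def dsample_dw2(params, eps):
    """Derivative of every sample w.r.t. the output weights w2."""
    w1, b1, _, _ = params
    # z_i = sum_j hidden_ij * w2_j + b2, so d z_i / d w2_j = hidden_ij
    return np.tanh(np.outer(eps, w1) + b1)

rng = np.random.default_rng(0)
h = 8  # hidden width (arbitrary choice)
params = (rng.normal(size=h), rng.normal(size=h), rng.normal(size=h), 0.0)

eps = rng.normal(size=1000)        # eps ~ N(0, 1)
z = sample_implicit(params, eps)   # samples from the implicit distribution
jac = dsample_dw2(params, eps)     # derivatives of the samples w.r.t. w2
```

In practice an automatic differentiation library would supply these derivatives for all parameters; the hand-written Jacobian here only keeps the sketch dependency-free.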
Notation; what is implicit/explicit
we usually would not see these parametrized
“If q or p are implicit the ELBO needs to be approximated differently, e.g., in terms of density log-ratios”
[1] Prior contrastive form of ELBO
[2] Joint contrastive form of ELBO
const
prior
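The two equations on this slide were lost in extraction. The following is a hedged reconstruction from the standard ELBO, using p_D for the data distribution (an assumed symbol). The slide annotations "prior" and "const" presumably mark p(z) in the first form and the ψ-independent term in the second.

```latex
% [1] prior-contrastive: the log ratio contrasts q_\psi(z|x) with the prior p(z)
\mathcal{L}(\psi) = \mathbb{E}_{x \sim p_D}\,\mathbb{E}_{z \sim q_\psi(z \mid x)}
  \left[\log p(x \mid z) - \log\frac{q_\psi(z \mid x)}{p(z)}\right]

% [2] joint-contrastive: the log ratio contrasts q_\psi(z|x)\,p_D(x) with the joint p(x,z)
\mathcal{L}(\psi) = -\,\mathbb{E}_{x \sim p_D}\,\mathbb{E}_{z \sim q_\psi(z \mid x)}
  \left[\log\frac{q_\psi(z \mid x)\,p_D(x)}{p(x, z)}\right]
  + \underbrace{\mathbb{E}_{x \sim p_D}\big[\log p_D(x)\big]}_{\text{const}}
```

Both forms equal the same ELBO. They differ in which densities must be evaluated explicitly and which appear only inside a ratio that can be estimated from samples.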
[1] Gradient of L (so r and s) w.r.t. ψ
the forward model has to be explicit
p(z) and q can be implicit
this we can optimize by reparametrizing z = g_ψ(x, ε)
reminder: this ratio we approximate and learn
[1] Gradient of L (so r and s) w.r.t. ψ
Observation: the gradient at some position ψ0 can be simplified so that r/s does not depend on ψ.
Optimization:
● Update log ratio
● Update ELBO
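A sketch of why the ratio can be held fixed, under the assumption that r is the exact log ratio r_ψ(x, z) = log q_ψ(z|x) − log p(z) (the notation r_{ψ0} for the ratio frozen at ψ0 is introduced here for illustration). The direct dependence of r on ψ contributes nothing in expectation:

```latex
\mathbb{E}_{z \sim q_\psi(z \mid x)}\big[\nabla_\psi \log q_\psi(z \mid x)\big]
  = \int \nabla_\psi\, q_\psi(z \mid x)\,\mathrm{d}z
  = \nabla_\psi \int q_\psi(z \mid x)\,\mathrm{d}z = 0

\Rightarrow\quad
\nabla_\psi \mathcal{L}(\psi)\Big|_{\psi_0}
  = \nabla_\psi\, \mathbb{E}_{\varepsilon}\Big[\log p\big(x \mid g_\psi(x,\varepsilon)\big)
      - r_{\psi_0}\big(x, g_\psi(x,\varepsilon)\big)\Big]\Big|_{\psi_0}
```

Only the dependence of z = g_ψ(x, ε) on ψ has to be differentiated, which is what makes the alternating scheme (update log ratio, then update ELBO) workable.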
notation:
Update log ratio
reminder:
logistic regression
empirical loss:
y = -1 ⇔ z ~ p
y = +1 ⇔ z ~ q
Learn log ratio by optimizing
logistic regression loss
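A minimal sketch of learning a log ratio with logistic regression, assuming two 1-D Gaussians chosen for illustration (q = N(1,1), p = N(0,1)), for which the true log ratio is log q(z)/p(z) = z − 0.5. The linear form r(z) = w·z + b and the plain gradient descent settings are assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
z_q = rng.normal(loc=1.0, scale=1.0, size=n)   # z ~ q, label y = +1
z_p = rng.normal(loc=0.0, scale=1.0, size=n)   # z ~ p, label y = -1

z = np.concatenate([z_q, z_p])
y = np.concatenate([np.ones(n), -np.ones(n)])
feats = np.stack([z, np.ones_like(z)], axis=1)  # r(z) = w * z + b

theta = np.zeros(2)
for _ in range(2000):
    margin = y * (feats @ theta)
    # sigmoid(-margin), written with tanh for numerical stability
    weights = 0.5 * (1.0 - np.tanh(0.5 * margin))
    # gradient of the empirical loss mean(log(1 + exp(-y * r(z))))
    grad = -(feats * (y * weights)[:, None]).mean(axis=0)
    theta -= 0.5 * grad

w, b = theta   # should be close to the true log ratio z - 0.5
```

With equal numbers of samples from both classes, the minimizer of this loss is the log density ratio, so the fitted (w, b) can be compared directly against (1, −0.5).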
Prior-contrastive adversarial VI
one step of ELBO optimization using reparametrization z = g_ψ(x, ε)
K steps of fitting the discriminator (log ratio)
Translate to GANs
● discriminator
● generator G = g_ψ(x, ε)
● training mode similar to GANs - K steps for D vs 1 step for G
● ⇾ adversarial variational Bayes
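The alternating scheme can be sketched on a toy problem. Everything below is an illustrative assumption rather than the setup from the paper: a conjugate model p(z) = N(0,1), p(x|z) = N(z,1) with one observation x = 2 (exact posterior N(1, 0.5)), a generator z = a + exp(c)·ε, and a discriminator with quadratic features, which is enough to represent a log ratio of two Gaussians.

```python
import numpy as np

def sigmoid(t):
    return 0.5 * (1.0 + np.tanh(0.5 * t))

def feats(z):
    # features [1, z, z^2] for the discriminator r(z) = w . feats(z)
    return np.stack([np.ones_like(z), z, z ** 2], axis=1)

rng = np.random.default_rng(0)
x = 2.0            # single observation; exact posterior is N(1, 0.5)
a, c = 0.0, 0.0    # psi = (a, c): z = g_psi(eps) = a + exp(c) * eps
w = np.zeros(3)    # discriminator weights, r(z) ~ log q(z|x) / p(z)
n, K = 512, 5

for step in range(3000):
    # K steps for D: logistic regression, q samples (y=+1) vs prior samples (y=-1)
    for _ in range(K):
        z_q = a + np.exp(c) * rng.normal(size=n)
        z_p = rng.normal(size=n)
        f_q, f_p = feats(z_q), feats(z_p)
        grad_w = (-(sigmoid(-(f_q @ w))[:, None] * f_q).mean(axis=0)
                  + (sigmoid(f_p @ w)[:, None] * f_p).mean(axis=0))
        w -= 0.1 * grad_w
    # 1 step for G: ascend E_eps[log p(x|z) - r(z)] with r held fixed
    eps = rng.normal(size=n)
    s = np.exp(c)
    z = a + s * eps
    dz = (x - z) - (w[1] + 2.0 * w[2] * z)   # d/dz [log p(x|z) - r(z)]
    a += 0.02 * dz.mean()                    # chain rule: dz/da = 1
    c += 0.02 * (dz * s * eps).mean()        # chain rule: dz/dc = s * eps

s = np.exp(c)   # learned posterior scale; compare with sqrt(0.5)
```

The step sizes, batch size, and K are untuned guesses. As with GANs, the result is noisy and depends on the discriminator keeping up with the generator.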
Denoiser-guided learning
ELBO:
approximate with
samples from q
reparametrization
chain rule
ok
ok ?
derivative
Learn
● Trick: denoiser solution contains gradient of generating distribution:
● Find denoiser numerically by optimizing:
● Extract gradients:
denoiser
added noise
generating distribution
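A minimal sketch of the three steps above on a case where the answer is known, assuming a Gaussian generating distribution N(m, s²) and Gaussian added noise with standard deviation σ (all values illustrative). For this case the optimal denoiser is linear, so a least-squares fit stands in for a neural denoiser, and (v(t) − t)/σ² recovers the score of the noise-smoothed distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
m, s, sigma = 2.0, 1.0, 0.3   # generating distribution N(m, s^2), noise std sigma
n = 200_000

z = rng.normal(loc=m, scale=s, size=n)        # samples from the generating distribution
z_noisy = z + sigma * rng.normal(size=n)      # added noise

# Find the denoiser numerically: least-squares fit of v(t) = alpha * t + beta
design = np.stack([z_noisy, np.ones(n)], axis=1)
(alpha, beta), *_ = np.linalg.lstsq(design, z, rcond=None)

def score_estimate(t):
    """Extract gradients: (v(t) - t) / sigma^2 approximates d/dt log q(t)."""
    return (alpha * t + beta - t) / sigma ** 2

def score_smoothed(t):
    """Exact score of the noise-smoothed Gaussian N(m, s^2 + sigma^2)."""
    return -(t - m) / (s ** 2 + sigma ** 2)
```

The estimate matches the score of the smoothed distribution, which approaches the score of the generating distribution itself as σ shrinks. Smaller σ reduces this bias but makes the estimate noisier.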
Denoiser-guided learning
ELBO:
approximate with
samples from q
reparametrization
chain rule
ok
ok
gradient from
denoiser
derivative
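The equation these annotations pointed at was lost. A hedged reconstruction of the chain-rule step, in the notation of the earlier slides, is given below. The "ok" labels presumably mark the terms that are directly computable and the "?" (later "gradient from denoiser") the score of q.

```latex
\nabla_\psi \mathcal{L}(\psi)
  = \mathbb{E}_{\varepsilon}\Big[
      \Big(\underbrace{\nabla_z \log p(x, z)}_{\text{ok}}
         - \underbrace{\nabla_z \log q_\psi(z \mid x)}_{\text{?}\;\to\;\text{gradient from denoiser}}
      \Big)\Big|_{z = g_\psi(x,\varepsilon)}
      \cdot \underbrace{\frac{\partial g_\psi(x,\varepsilon)}{\partial \psi}}_{\text{ok}}
    \Big]
```

The remaining term, the expectation under q_ψ of ∂/∂ψ log q_ψ(z|x) at fixed z, is zero, so it does not appear.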
Full algorithm
one step of ELBO optimization using the gradient formula
K steps of learning denoisers
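A toy sketch of the full loop, under stated assumptions that are not from the slides: the same conjugate model p(z) = N(0,1), p(x|z) = N(z,1) with x = 2 (exact posterior N(1, 0.5)), a generator z = a + exp(c)·ε, and a linear denoiser refit in closed form at every iteration. The closed-form least-squares fit is a swapped-in simplification that replaces the K gradient steps of denoiser learning.

```python
import numpy as np

rng = np.random.default_rng(0)
x = 2.0            # observation; p(z) = N(0, 1), p(x|z) = N(z, 1)
a, c = 0.0, 0.0    # psi = (a, c): z = a + exp(c) * eps
sigma, n = 0.1, 4000

for step in range(2000):
    s = np.exp(c)
    eps = rng.normal(size=n)
    z = a + s * eps                              # reparametrized samples from q
    # learn the denoiser on noise-corrupted samples (least squares, linear denoiser)
    z_noisy = z + sigma * rng.normal(size=n)
    design = np.stack([z_noisy, np.ones(n)], axis=1)
    (alpha, beta), *_ = np.linalg.lstsq(design, z, rcond=None)
    score_q = (alpha * z + beta - z) / sigma ** 2    # ~ d/dz log q(z)
    # one ELBO step: d/dz [log p(x|z) + log p(z) - log q(z)], then chain rule
    dz = (x - z) - z - score_q
    a += 0.02 * dz.mean()                # dz/da = 1
    c += 0.02 * (dz * s * eps).mean()    # dz/dc = s * eps

s = np.exp(c)   # expect a near 1 and s near sqrt(0.5 - sigma**2) = 0.7
```

Because the denoiser estimates the score of the noise-smoothed q, the learned variance is biased downward by roughly σ². A nonlinear q would need a more flexible denoiser than the linear one assumed here.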
Results