
Introduction to Deep Generative Modeling
Lecture #1
HY-673 – Computer Science Dept, University of Crete
Professors: Yannis Pantazis & Yannis Stylianou
TAs: Michail Raptakis & Michail Spanakis
What is this course about?
✓ Statistical Generative Models
✓ A Generative Model (GM) is defined as a probability distribution, p(x).
✓ A statistical GM is a trainable probabilistic model, p_θ(x).
✓ A deep GM is a statistical generative model parametrized by a neural network.
✓ p(x), and in many cases p_θ(x), are not analytically known. Only samples are available!
✓ Data (x): complex, (un)structured samples (e.g., images, speech, molecules, text, etc.)
✓ Prior knowledge: parametric form (e.g., Gaussian, mixture, softmax), loss function (e.g., maximum likelihood, divergence), optimization algorithm, invariance/equivariance, laws of physics, prior distribution, etc.
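A minimal sketch of what "trainable probabilistic model" means in practice (an illustration, not course code): fit a one-dimensional Gaussian p_θ(x) with θ = (μ, σ) by maximum likelihood, using only samples from an otherwise unknown p(x).

```python
# Minimal sketch (illustration only): a statistical GM p_θ(x) with θ = (μ, log σ),
# trained by maximum likelihood from samples alone — the true p(x) is never used.
import torch

data = torch.randn(1000) * 2.0 + 3.0          # samples x ~ p(x); toy stand-in for real data
mu = torch.zeros(1, requires_grad=True)       # trainable parameters θ
log_sigma = torch.zeros(1, requires_grad=True)

opt = torch.optim.Adam([mu, log_sigma], lr=0.05)
for step in range(500):
    opt.zero_grad()
    # Negative log-likelihood of the Gaussian p_θ(x) = N(x; μ, σ²), constant 0.5·log(2π) dropped.
    nll = (log_sigma + 0.5 * ((data - mu) / log_sigma.exp()) ** 2).mean()
    nll.backward()
    opt.step()

# Being generative, the fitted p_θ(x) can produce new samples:
new_samples = (mu + log_sigma.exp() * torch.randn(10)).detach()
print(mu.item(), log_sigma.exp().item())      # ≈ 3.0 and ≈ 2.0
```

Deep GMs replace the two scalar parameters with the weights of a neural network, but the training loop has the same shape.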
What is this course about?

✓ A dataset with images, e.g., of bedrooms (LSUN dataset)

[Figure: real samples from the data distribution vs. samples from the GM's distribution]

✓ Notation: data distribution p(x), also written p_data(x) or p_d(x); GM's distribution p_θ(x), also written p_g(x).

✓ Goal: Find θ ∈ Θ such that p_θ(x) ≈ p_data(x).

✓ It is generative because sampling from p_θ(x) generates new, unseen images: … ~ p_θ(x)
What is this course about?

[Figure: the parametric model space — given samples x_i ~ p_data, find the model p_θ that minimizes a distance d(p_data, p_θ)]

We will study:
✓ Families of Generative Models
✓ Algorithms to train these GMs
✓ Network architectures
✓ Loss functions & distances between probability density functions
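One identity worth keeping in mind (standard, not stated on the slide) ties the last two items together: training by maximum likelihood is the same as minimizing the KL divergence between the data distribution and the model.

```latex
\operatorname*{arg\,min}_{\theta} D_{\mathrm{KL}}\!\left(p_{\mathrm{data}} \,\|\, p_\theta\right)
= \operatorname*{arg\,min}_{\theta} \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log p_{\mathrm{data}}(x) - \log p_\theta(x)\right]
= \operatorname*{arg\,max}_{\theta} \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log p_\theta(x)\right]
```

The first term does not depend on θ, and the final expectation is estimated by averaging log p_θ(x_i) over training samples — exactly maximum likelihood. Choosing other distances d(p_data, p_θ) leads to the other families (e.g., f-divergences for GANs).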
What is this course about?

✓ Conditional Generative Models

✓ A conditional GM is defined as a conditional probability distribution, p(x|y).
✓ y: conditioning variable(s) (e.g., label/class, text, captions, speaker id, style, rotation, thickness, …)

[Figure: generated digit samples] … ~ p_θ(x|y), y: digit label
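A toy sketch of conditioning (hypothetical model, not course code): a class-conditional GM p_θ(x|y) where each digit label y owns its own Gaussian in a 2-D feature space, so fixing y picks which distribution we sample from.

```python
# Toy sketch (hypothetical parameters): a class-conditional GM p_θ(x|y),
# one Gaussian per digit label y in a 2-D feature space.
import torch

num_classes, dim = 10, 2
means = torch.randn(num_classes, dim)        # stand-ins for learned per-class means
scales = torch.ones(num_classes, dim) * 0.1  # stand-ins for learned per-class scales

def sample_conditional(y: int, n: int = 5) -> torch.Tensor:
    """Draw n samples x ~ p_θ(x|y) for the chosen label y."""
    return means[y] + scales[y] * torch.randn(n, dim)

digits_3 = sample_conditional(y=3)   # all samples share the conditioning label y = 3
print(digits_3.shape)                # torch.Size([5, 2])
```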


Discriminative vs Generative Models
Data: x    Label: y

"Cat"

✓ Discriminative Model
  ✓ Learn the probability distribution p(y|x)
✓ Generative Model
  ✓ Learn the probability distribution p(x)
✓ Conditional GM
  ✓ Learn p(x|y)
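For completeness, Bayes' rule (a standard fact, added here rather than on the slide) links the three objects: a conditional GM together with a label prior determines both the discriminative posterior and the marginal GM.

```latex
p(y \mid x) \;=\; \frac{p(x \mid y)\, p(y)}{p(x)},
\qquad
p(x) \;=\; \sum_{y} p(x \mid y)\, p(y)
```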
Families of Generative Models

✓ Energy-based Models (EBMs)
✓ Generative Adversarial Nets (GANs)
✓ Variational Auto-Encoders (VAEs)
✓ Normalizing Flows (NFs)
✓ Diffusion Probabilistic Models (DPMs)
✓ Deep Autoregressive Models (ARMs)
Families of Generative Models

GMs

✓ Exact likelihood
  ✓ ARMs: (R)NADE, WaveNet, WaveRNN, GPT, …
  ✓ NFs: Planar, Coupling, MAFs/IAFs, …
✓ Approximate likelihood
  ✓ VAEs: Vanilla, β-VAE, VQ-VAE, …
  ✓ EBMs: Belief nets, Boltzmann machines, …
  ✓ DPMs: diffusion, denoising, score, …
✓ Implicit
  ✓ GANs: Vanilla, WGAN, f-GAN, (f, Γ)-GAN
  ✓ GGFs: KALE, Lipschitz-reg., …

Lesser-known Families of GMs

✓ Generative Stochastic Networks (GSNs)
✓ Generative Gradient Flows (GGFs)
✓ Specific EBMs
  ✓ Deep Belief Networks
  ✓ Deep Boltzmann Machines
✓ Generative Flow Networks (GFlowNets)
...
Progress in Image Generation

✓ Face generation: rapid progress in image quality


Image Super-Resolution
✓ Several inverse problems can be solved with conditional GMs.
✓ Inverse problems: from measurements, calculate/infer the causes.

✓ P(high resolution | low resolution)

✓ Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network – Ledig et al. – CVPR 2017
✓ https://openaccess.thecvf.com/content_cvpr_2017/html/Ledig_Photo-Realistic_Single_Image_CVPR_2017_paper.html
Image Inpainting
✓ P(full image | masked image)
✓ DeepFill (v2): Free-Form Image Inpainting With Gated Convolution – Yu et al. – ICCV 2019
✓ https://openaccess.thecvf.com/content_ICCV_2019/html/Yu_Free-Form_Image_Inpainting_With_Gated_Convolution_ICCV_2019_paper.html
Image Colorization
✓ P(colored image | grayscale image)
✓ PalGAN: Image Colorization with Palette Generative Adversarial Networks – Wang et al. – ECCV 2022
✓ https://link.springer.com/chapter/10.1007/978-3-031-19784-0_16
Text2Image Translation

✓ Recent advancements:
  ✓ DALL-E 2
  ✓ Stable Diffusion
  ✓ Imagen
  ✓ GLIDE
  ✓ Midjourney

✓ P(image | text)

[Figure: "Théâtre D'opéra Spatial" by Jason Allen and Midjourney]


OpenAI's DALL-E 2
✓ Text → text embedding → image embedding → low-resolution image → medium-resolution image → high-resolution image

✓ P(high-res image | text caption) = P(image emb | text caption) × P(high-res image | image emb)

✓ Hierarchical Text-Conditional Image Generation with CLIP Latents – Ramesh et al. – 2022
✓ https://cdn.openai.com/papers/dall-e-2.pdf
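A schematic sketch of that two-stage factorization (all components below are toy stand-ins, not DALL-E 2's real models or API): a prior samples an image embedding given the text embedding, then a decoder samples an image given that embedding.

```python
# Toy sketch of the two-stage factorization above — random stand-ins, not DALL-E 2.
import torch

def clip_text_encoder(caption: str) -> torch.Tensor:
    # Stand-in: a deterministic "text embedding" derived from the caption.
    torch.manual_seed(abs(hash(caption)) % (2**31))
    return torch.randn(512)

def prior_sample(text_emb: torch.Tensor) -> torch.Tensor:
    # Stage 1: sample an image embedding ~ P(image emb | text caption).
    return text_emb + 0.1 * torch.randn_like(text_emb)

def decoder_sample(image_emb: torch.Tensor) -> torch.Tensor:
    # Stage 2: sample a (toy 64×64 RGB) image ~ P(image | image emb);
    # the real system then upsamples to medium and high resolution.
    return torch.sigmoid(torch.randn(3, 64, 64) + image_emb.mean())

image = decoder_sample(prior_sample(clip_text_encoder("a corgi playing a trumpet")))
print(image.shape)  # torch.Size([3, 64, 64])
```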
Image2Image Translation

✓ Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks (CycleGAN) – Zhu et al. – ICCV 2017
✓ https://openaccess.thecvf.com/content_iccv_2017/html/Zhu_Unpaired_Image-To-Image_Translation_ICCV_2017_paper.html
Speech & Audio Synthesis
✓ P(x_{t+1} | x_t, x_{t−1}, …, text)
✓ WaveNet, WaveRNN, Parallel WaveNet, MelGAN, WaveDiff, …

[Figure: "Text to Speech Synthesis" quality comparison (parametric, concatenative, WaveNet) and unconditional music generation — van den Oord et al., 2016]


(Natural) Language Generation
✓ P(next word | previous words)
✓ https://app.inferkit.com/demo
✓ GPT-3
  ✓ Generative Pre-trained Transformer
✓ https://deepai.org/machine-learning-model/text-generator
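A minimal sketch of the autoregressive loop behind this (the "model" below is a random stand-in, not GPT): repeatedly sample from P(next word | previous words) and append.

```python
# Toy sketch of autoregressive text generation — a random stand-in model, not GPT.
import torch

vocab = ["the", "cat", "sat", "on", "mat", "."]

def next_word_logits(context: list[str]) -> torch.Tensor:
    # Stand-in for a trained language model: unnormalized scores over the
    # vocabulary given the context; a real model would be a neural network.
    torch.manual_seed(len(context))
    return torch.randn(len(vocab))

context = ["the"]
for _ in range(5):
    probs = torch.softmax(next_word_logits(context), dim=0)  # P(next | previous)
    idx = torch.multinomial(probs, num_samples=1).item()     # sample one word
    context.append(vocab[idx])

print(" ".join(context))
```

The same loop, with waveform samples in place of words, is the WaveNet-style recursion on the previous slide.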
(Natural) Language Generation

✓ Enormous model size (trillions of parameters?)
✓ Enormous & diverse training data
✓ Multimodal capabilities
✓ In-context learning (a.k.a. prompting)
✓ Reinforcement learning
✓ Impressive performance
  ✓ Coherence, relevance, proficiency
✓ Safety & ethics
✓ A few steps from AGI
Geometric Design
✓ Just meshing around with GPT-4 (ESA proposal)
Molecule/Drug/Protein Design
✓ MolGAN: An implicit generative model for small molecular graphs – De Cao & Kipf – ICML 2018
Driving forces in GM progress
✓ Representation learning
  ✓ Leveraging the exponential growth of data & model parameters via self-supervised learning
  ✓ Gave rise to Foundation Models
✓ Computational resources are also growing exponentially.
✓ Better understanding of the models and algorithms acts as a key enabler.
✓ Unlocks human productivity & creativity.
✓ Ideally, it will accelerate the scientific discovery process.
Challenges in GMs
✓ Representation: How do we model the joint distribution of many random variables?
  ✓ Need compact & meaningful representations
✓ Learning (a.k.a. quality assessment): What are proper comparison metrics between probability distributions?
✓ Reliability: Can we trust the generated outcomes? Are they consistent?
✓ Alignment: Do they perform according to the input of the user?
Prerequisites

✓ Very good knowledge of probability theory, multivariate calculus & linear algebra.
✓ Intermediate knowledge of machine learning & neural networks.
✓ Proficiency in some programming language, preferably Python, is required.
Course Syllabus
✓ Basics in probability theory (1W)
✓ Shallow generative models – GMMs (1W)
✓ Exact (i.e., fully-observed) likelihood GMs
  ✓ AR models (1.5W)
  ✓ Normalizing flows (1.5W)
✓ Approximate likelihood
  ✓ VAEs (2W)
  ✓ Diffusion/Score-based models (2W)
  ✓ EBMs (1W)
✓ Implicit
  ✓ GANs (2W)

[Figure: GM taxonomy — Exact: ARMs, NFs; Approximate: VAEs, EBMs, DPMs; Implicit: GANs, GGFs]
Logistics
✓ Teaching Assistant: Michail Raptakis (PhD candidate)
✓ Weekly tutorial (Friday 10:00–12:00): Python/PyTorch basics, neural network architectures and training, practice problems to assist with the homework, and solutions to selected homework problems.
✓ Textbook: Probabilistic Machine Learning: Advanced Topics by Kevin P. Murphy
✓ https://probml.github.io/pml-book/book2.html
✓ Seminal papers will be distributed.
Grading policy
✓ Final exam (30% of total grade)
  ✓ Open notes
  ✓ NO internet
✓ 5–6 homework series (40% of total grade)
  ✓ Mix of theoretical and programming problems
  ✓ Equally weighted
✓ Project: paper implementation & presentation (30% of total grade)
  ✓ Implementation: 10%
  ✓ Final report: 10%
  ✓ Presentation: 10%
Project
✓ Select from a given list of papers or propose a paper (which has to be approved)
✓ Categories of papers:
  ✓ Application of deep generative models to a novel task/dataset
  ✓ Algorithmic improvements to the learning, inference and/or evaluation of deep generative models
  ✓ Theoretical analysis of any aspect of existing deep generative models
✓ Groups of up to 2 students per project
✓ Computational resources might be provided (Colab, local GPUs, etc.)
Introduction to Deep Generative Modeling – Lecture #1
HY-673 – Computer Science Dept, University of Crete
Professors: Yannis Pantazis & Yannis Stylianou
TAs: Michail Raptakis & Michail Spanakis
