Generative Adversarial Text to Image Synthesis
Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran
[GitHub] [Arxiv]
Slides by Víctor Garcia [GDoc]
Computer Vision Reading Group (30/09/2016)
Index
● Introduction
● State of the Art
● Method
○ Network Architecture
○ Losses
● Experiments
○ Qualitative Results
○ Sentence interpolation
○ Style Transfer
● Conclusions
Introduction
Text → Image
GANs
GANs
[Figure: the generator G maps noise z to a fake sample x' = G(z). The discriminator D(·) receives true-world samples x ~ q(x) and fake samples x', and outputs 1 (true) or 0 (fake).]
GANs
Discriminator objective: MAX → E[log(D(x))] + E[log(1 - D(G(z)))], where x ~ q(x) is a true-world sample and x' = G(z) is a fake sample from the generator.
GANs
Generator objective: MIN → E[log(1 - D(G(z)))], where x' = G(z) is the fake sample shown to the discriminator D(·).
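The two objectives above (MAX for the discriminator, MIN for the generator) can be checked numerically. This is a minimal sketch, not the authors' code: the scalar `discriminator` and `generator`, their parameters `w` and `shift`, and the sample values are all hypothetical stand-ins for the real networks and data.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def discriminator(x, w):
    # Hypothetical toy D(x): probability that the scalar x is real.
    return sigmoid(w * x)

def generator(z, shift):
    # Hypothetical toy G(z): moves the noise sample z by a fixed shift.
    return z + shift

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def gan_terms(real_xs, noise_zs, w, shift):
    # Sample estimates of E[log D(x)] and E[log(1 - D(G(z)))].
    real_term = mean(math.log(discriminator(x, w)) for x in real_xs)
    fake_term = mean(
        math.log(1.0 - discriminator(generator(z, shift), w)) for z in noise_zs
    )
    return real_term, fake_term

# Assumed toy data: real samples near +2, noise near 0.
real_xs = [1.5, 2.0, 2.5]
noise_zs = [-0.5, 0.0, 0.5]

# Discriminator side: a D that separates real from fake scores higher
# than a D that always answers 0.5.
blind = sum(gan_terms(real_xs, noise_zs, w=0.0, shift=-2.0))
sharp = sum(gan_terms(real_xs, noise_zs, w=1.0, shift=-2.0))

# Generator side: moving the fakes onto the real samples lowers the
# term that the generator minimizes.
_, fake_far = gan_terms(real_xs, noise_zs, w=1.0, shift=-2.0)
_, fake_near = gan_terms(real_xs, noise_zs, w=1.0, shift=2.0)
```

In this toy setting `sharp > blind` and `fake_near < fake_far`, which is the tug-of-war the two slides describe.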
GANs with Joint Distributions
How do we generate the image from text?
[Figure: the discriminator receives f(x,t) and f(x',t) and outputs 1/0.]
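GANs with Joint Distributions
[Figure: the generator takes the text as input and produces a generated image. The discriminator receives either (real image + text) or (generated image + text) and outputs 1/0.]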
Text Embedding
In order to represent the text as a vector...
[Equations: the objective to minimize (MIN) and the definitions of its terms (WHERE); one of the terms is the recurrent text encoder.]
Network Architecture
Losses - CLS
Do real images match the text content? The discriminator scores three kinds of pairs:
● log(D(x,t)): true image + true text
● log(1-D(G(z,t))): fake image + true text
● log(1-D(x_i,t_j)), i ≠ j: true image (i) + true text (j), unmatched
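The CLS terms above can be combined into one discriminator objective. This is a minimal sketch: the function name and the score arguments are hypothetical, the third score is taken on a true image paired with unmatched text (as the slide's labels indicate), and the 1/2 weighting of the two "reject" terms follows my reading of the paper's training algorithm rather than anything shown on the slide.

```python
import math

def cls_discriminator_objective(s_real, s_wrong, s_fake):
    """Quantity the discriminator maximizes under the matching-aware loss.

    s_real  = D(true image, true text)
    s_wrong = D(true image, unmatched text)
    s_fake  = D(fake image, true text)
    All scores are probabilities strictly between 0 and 1.
    """
    accept = math.log(s_real)
    # Assumed weighting: the two reject terms are averaged.
    reject = 0.5 * (math.log(1.0 - s_wrong) + math.log(1.0 - s_fake))
    return accept + reject

# A discriminator that accepts the matched real pair and rejects the other
# two does better than one that answers 0.5 for everything.
good = cls_discriminator_objective(s_real=0.9, s_wrong=0.1, s_fake=0.1)
undecided = cls_discriminator_objective(s_real=0.5, s_wrong=0.5, s_fake=0.5)

# Ignoring the text (accepting a real image with the wrong caption) is
# penalized, which is the point of the extra term.
ignores_text = cls_discriminator_objective(s_real=0.9, s_wrong=0.9, s_fake=0.1)
```

Without the unmatched-pair term, `good` and `ignores_text` would score the same, so the discriminator would have no explicit signal that the image must agree with the text.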
Losses - INT
The generator is also trained on interpolations between pairs of text embedding vectors (t1, t2), so it learns to fill the gaps on the data manifold.
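The interpolation itself is a convex combination of two embeddings. This is a minimal sketch with plain Python lists; the function name, the `beta` parameter, and its default of 0.5 are assumptions for illustration, not values taken from the slides.

```python
def interpolate_embeddings(t1, t2, beta=0.5):
    """Return beta * t1 + (1 - beta) * t2 for two equal-length vectors."""
    if len(t1) != len(t2):
        raise ValueError("embeddings must have the same dimension")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return [beta * a + (1.0 - beta) * b for a, b in zip(t1, t2)]

# Hypothetical 2-D embeddings of two captions.
t1 = [0.0, 2.0]
t2 = [2.0, 0.0]

midpoint = interpolate_embeddings(t1, t2)            # halfway between the captions
endpoint = interpolate_embeddings(t1, t2, beta=1.0)  # recovers t1
```

The interpolated vector has no caption or real image of its own, so only the generator/discriminator game supervises it; that is what lets the generator cover points between the training captions.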
Qualitative Results - Birds
Sentence Interpolation
[Figure: generated images for interpolated sentences with the noise held fixed: Gen.(z0 + Text1) → Gen.(z0 + Text2), and Gen.(z1 + Text3) → Gen.(z1 + Text4).]
Disentangling style and content
[Figure: the generator takes z + Text as input.]
If 'text' describes the content, what does 'z' describe?
Style → pose, background, ... Let's extract 'z'.
Disentangling style and content
[Figure: results for the style vectors z0, z1, z2, z3, z4, z5.]
Qualitative Results - Flowers
Qualitative Results - MSCOCO
Conclusions
[Figure: the discriminator outputs 1/0 for f(x,t) and f(x',t); x~t.]