https://doi.org/10.3390/engproc2022020016
Proceeding Paper
Text-to-Image Generation Using Deep Learning †
Sadia Ramzan 1, * , Muhammad Munwar Iqbal 1 and Tehmina Kalsum 2
1 Department of Computer Science, University of Engineering and Technology, Taxila 47050, Pakistan;
[email protected]
2 Department of Software Engineering, University of Engineering and Technology, Taxila 47050, Pakistan;
[email protected]
* Correspondence: [email protected]
† Presented at the 7th International Electrical Engineering Conference, Karachi, Pakistan, 25–26 March 2022.
Abstract: Text-to-image generation is a method used for generating images related to given tex-
tual descriptions. It has a significant influence on many research areas as well as a diverse set of
applications (e.g., photo-searching, photo-editing, art generation, computer-aided design, image re-
construction, captioning, and portrait drawing). The most challenging task is to consistently produce
realistic images according to given conditions. Existing algorithms for text-to-image generation create
pictures that do not properly match the text. We considered this issue in our study and built a deep
learning-based architecture for semantically consistent image generation: recurrent convolutional
generative adversarial network (RC-GAN). RC-GAN successfully bridges the advancements in text
and picture modelling, converting visual notions from words to pixels. The proposed model was
trained on the Oxford-102 flowers dataset, and its performance was evaluated using an inception
score and PSNR. The experimental results demonstrate that our model is capable of generating more
realistic photos of flowers from given captions, achieving an inception score of 4.15 and a PSNR value of
30.12 dB. In the future, we aim to train the proposed model on multiple datasets.
Keywords: convolutional neural network; recurrent neural network; deep learning; generative
adversarial networks; image generation
(CNNs) [5,6]. A GAN consists of two neural networks: one for generating data and the other for
classifying data as real or fake. GANs are based on game theory for learning generative models;
their major purpose is to train a generator (G) to generate samples and a discriminator (D)
to discern between true and false data. To generate better-quality, realistic images, we
performed text encoding using a recurrent neural network (RNN), and convolutional layers
were used for image decoding. We developed the recurrent convolutional GAN (RC-GAN),
a simple and effective framework for appealing image synthesis from human-written
textual descriptions. The model was trained on the Oxford-102 Flowers Dataset and ensures
the semantic consistency of the synthesized pictures. The key contributions of this research include
the following:
• Building a deep learning model RC-GAN for generating more realistic images.
• Generating more realistic images from given textual descriptions.
• Improving the inception score and PSNR value of images generated from text.
The rest of the paper is organized as follows: In Section 2, related work is
described. The dataset and its preprocessing are discussed in Section 3. Section 4 explains
the details of the research methodology and dataset used in this paper. The experimental
details and results are discussed in Section 5. Finally, the paper is concluded in Section 6.
2. Related Work
GANs were first introduced by Goodfellow et al. [7] in 2014, but Reed et al. [8] were the
first to use them for text-to-image generation in 2016. Salimans et al. [9] proposed training
stabilizing techniques for previously untrainable models and achieved better results on
the MNIST, CIFAR-10, and SVHN datasets. The attention-based recurrent neural network
was developed by Zia et al. [10]. In their model, word-to-pixel dependencies were learned
by an attention-based auto-encoder and pixel-to-pixel dependencies were learned by an
autoregressive-based decoder. Liu et al. [11] offered a diverse conditional image synthesis
model and performed large-scale experiments for different conditional generation tasks.
Gao et al. [12] proposed an effective approach known as lightweight dynamic conditional
GAN (LD-CGAN), which disentangled the text attributes and provided image features by
capturing multi-scale features. Dong et al. [13] trained a model for generating images from
text in an unsupervised manner. Berrahal et al. [14] focused on the development of text-
to-image conversion applications. They used deep fusion GAN (DF-GAN) for generating
human face images from textual descriptions. The cross-domain feature fusion GAN (CF-
GAN) was proposed by Zhang et al. [15] for converting textual descriptions into images
with more semantic detail. In general, existing methods of text-to-image generation use
large numbers of parameters and heavy computation to generate high-resolution images,
which results in unstable and costly training.
4. Proposed Methodology
This section describes the training details of deep learning-based generative models.
Conditional GANs were used with recurrent neural networks (RNNs) and convolutional
neural networks (CNNs) for generating meaningful images from a textual description. The
dataset used consisted of images of flowers and their relevant textual descriptions. For
generating plausible images from text using a GAN, preprocessing of textual data and
image resizing were performed. We took the textual descriptions from the dataset, preprocessed
the caption sentences, and built a vocabulary from them. The captions were then stored
in a list with their respective ids. The images were loaded and resized to a fixed
dimension. These data were then given as input to our proposed model. An RNN was used
to capture the contextual information of the text sequences by modelling the relationships
between words at different time steps. Text-to-image mapping was performed using an
RNN and a CNN. The CNN extracted useful features from the images without the
need for human intervention. An input sequence was given to the RNN, which converted
the textual descriptions into word embeddings with a size of 256. These word embeddings
were concatenated with a 512-dimensional noise vector. To train our model, we used a batch
size of 64 with a gated-feedback size of 128 and fed the input noise and text embedding to
the generator.
The architecture of the proposed model is presented in Figure 1.
Figure 1. Architecture of the proposed method, which can generate images from text descriptions.
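To make the text-encoding step concrete, the following minimal PyTorch sketch shows one plausible implementation: padded caption token ids are embedded, passed through a recurrent (GRU) encoder to obtain a 256-dimensional sentence embedding, and concatenated with a 512-dimensional noise vector before being passed to the generator. The class, variable, and vocabulary names are illustrative assumptions, not the authors' released code.

# Illustrative sketch (assumed names, not the authors' code): caption encoding
# for a conditional GAN, using PyTorch and a pre-built word-to-id vocabulary.
import torch
import torch.nn as nn

class CaptionEncoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # A GRU stands in for the gated-feedback recurrent text encoder.
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):              # token_ids: (batch, seq_len)
        x = self.embed(token_ids)               # (batch, seq_len, embed_dim)
        _, h = self.rnn(x)                      # h: (1, batch, hidden_dim)
        return h.squeeze(0)                     # 256-d sentence embedding

batch_size = 64
vocab = {"<pad>": 0, "the": 1, "flower": 2, "has": 3, "yellow": 4, "petals": 5}
encoder = CaptionEncoder(vocab_size=len(vocab))

tokens = torch.zeros(batch_size, 12, dtype=torch.long)       # padded caption ids
text_embedding = encoder(tokens)                              # (64, 256)
noise = torch.randn(batch_size, 512)                          # 512-d noise vector
generator_input = torch.cat([noise, text_embedding], dim=1)   # (64, 768)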
Semantic information from the textual description was used as input to the generator
model, which converted the feature information into pixels and generated the images. The
generated image was then used as input to the discriminator, along with real/mismatched
textual descriptions and real sample images from the dataset.
A sequence of distinct (picture, text) pairs is then provided as input to the
model to meet the goals of the discriminator: pairs of real images and real textual
descriptions, wrong images and mismatched textual descriptions, and generated images
and real textual descriptions. The real-photo and real-text combinations are provided so
that the model can determine whether a particular image and text combination align. An
incorrect picture with a real text description indicates that the image does not match the
caption. The discriminator is trained to distinguish real and generated images. At the start
of training, the discriminator was good at classifying real/wrong images. The loss was
calculated to update the weights and to provide training feedback to the generator and
discriminator models. As training proceeded, the generator produced more realistic images
and increasingly fooled the discriminator's attempts to distinguish between real and
generated images.
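The matching-aware training loop described above can be summarized by the following sketch of a single training step, which follows the standard conditional-GAN recipe of Reed et al. [8]: the discriminator is pushed towards labelling real/matching pairs as real and both wrong-image and generated-image pairs as fake, while the generator is updated to fool the discriminator. The generator and discriminator interfaces, optimizers, and loss weighting are assumptions made for illustration, not the authors' exact configuration.

# Illustrative sketch (assumed interfaces): one conditional-GAN training step
# using the three (image, text) pair types discussed in the text.
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, g_opt, d_opt,
               real_imgs, wrong_imgs, text_emb, noise_dim=512):
    batch = real_imgs.size(0)
    noise = torch.randn(batch, noise_dim, device=real_imgs.device)
    fake_imgs = generator(noise, text_emb)

    # Discriminator: real/matching pairs -> 1; wrong and generated pairs -> 0.
    d_real = discriminator(real_imgs, text_emb)
    d_wrong = discriminator(wrong_imgs, text_emb)
    d_fake = discriminator(fake_imgs.detach(), text_emb)
    ones, zeros = torch.ones_like(d_real), torch.zeros_like(d_real)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, ones)
              + 0.5 * F.binary_cross_entropy_with_logits(d_wrong, zeros)
              + 0.5 * F.binary_cross_entropy_with_logits(d_fake, zeros))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: try to make the discriminator label generated pairs as real.
    g_score = discriminator(fake_imgs, text_emb)
    g_loss = F.binary_cross_entropy_with_logits(g_score, torch.ones_like(g_score))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()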
The learning rate was 0.0003. The ground truths from the dataset and the images generated
from the input textual descriptions are shown in Figure 2. To evaluate the performance of
the proposed model, the inception score (IS) and PSNR values were calculated. The inception
score captures the diversity and quality of the generated images. PSNR measures the peak
signal-to-noise ratio, in decibels, between two images; this ratio is used to compare the quality
of the original and synthesized images. The PSNR value increases as the quality of the
synthesized image improves. The PSNR value is calculated using the following equation:
PSNR = 10 log₁₀(R²/MSE)    (1)
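Here, R is the maximum possible pixel value (255 for 8-bit images) and MSE is the mean squared error between the ground-truth and generated images. As an illustration only (the image arrays below are random stand-ins, not results from the model), the metric can be computed in a few lines of NumPy:

# PSNR of a generated image against its ground truth, per Equation (1).
import numpy as np

def psnr(reference, generated, max_value=255.0):
    reference = reference.astype(np.float64)
    generated = generated.astype(np.float64)
    mse = np.mean((reference - generated) ** 2)
    if mse == 0:                       # identical images
        return float("inf")
    return 10.0 * np.log10((max_value ** 2) / mse)

# Example with random 64x64 RGB images standing in for real data.
ref = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
gen = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
print(f"PSNR: {psnr(ref, gen):.2f} dB")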
Figure 2. Input textual descriptions and resultant generated images with ground truths.
To validate the proposed approach, the results are compared with those of existing
models including GAN-INT-CLS, StackGAN, StackGAN++, HDGAN, and DualAttn-GAN
on the Oxford-102 flowers dataset. This performance comparison in terms of inception
score is shown in Table 1 and that of the PSNR value is shown in Table 2. These results
show that our model is capable of generating clearer and more diverse photos than
the other models.
Table 1. Performance comparison of state-of-the-art methods vs. the methodology presented here by
inception score.
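For reference, the inception score reported in Table 1 is conventionally computed as IS = exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) are the class probabilities that a pretrained Inception-v3 classifier assigns to a generated image x and p(y) is their marginal over all generated images. The short sketch below illustrates this standard calculation with random probabilities standing in for real classifier outputs; it is not the authors' evaluation code.

# Inception score from classifier softmax outputs; probs has shape
# (num_images, num_classes). Illustrative only.
import numpy as np

def inception_score(probs, eps=1e-12):
    marginal = probs.mean(axis=0, keepdims=True)             # p(y)
    kl = probs * (np.log(probs + eps) - np.log(marginal + eps))
    return float(np.exp(kl.sum(axis=1).mean()))              # exp(E_x KL(p(y|x) || p(y)))

# Example with random "predictions" for 100 generated images and 1000 classes.
logits = np.random.randn(100, 1000)
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(f"Inception score: {inception_score(probs):.2f}")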
Author Contributions: The authors confirm their contributions to the paper as follows: study conception,
design and data collection: S.R.; analysis and interpretation of results: S.R., M.M.I., T.K.; draft
manuscript preparation: S.R., M.M.I. All authors have read and agreed to the published version of
the manuscript.
Funding: The authors did not receive funding for this research.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The dataset is publicly available.
Conflicts of Interest: The authors declare that they have no conflict of interest.
References
1. Kosslyn, S.M.; Ganis, G.; Thompson, W.L. Neural foundations of imagery. Nat. Rev. Neurosci. 2001, 2, 635–642. [CrossRef]
[PubMed]
2. Karpathy, A.; Fei-Fei, L. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3128–3137.
3. Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D. Show and tell: A neural image caption generator. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3156–3164.
4. Xu, K.; Ba, J.; Kiros, R.; Cho, K.; Courville, A.; Salakhudinov, R.; Zemel, R.; Bengio, Y. Show, attend and tell: Neural image caption
generation with visual attention. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July
2015; pp. 2048–2057.
5. Albawi, S.; Mohammed, T.A.; Al-Zawi, S. Understanding of a convolutional neural network. In Proceedings of the 2017
International Conference on Engineering and Technology (ICET), Antalya, Turkey, 21–23 August 2017; pp. 1–6.
6. Kim, P. Convolutional neural network. In MATLAB Deep Learning; Springer: Berlin/Heidelberg, Germany, 2017; pp. 121–147.
7. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial
networks. arXiv 2014, arXiv:1406.2661. [CrossRef]
8. Reed, S.; Akata, Z.; Yan, X.; Logeswaran, L.; Schiele, B.; Lee, H. Generative adversarial text to image synthesis. arXiv 2016,
arXiv:1605.05396.
9. Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved techniques for training gans. In
Proceedings of the Advances in Neural Information Processing Systems 29 (NIPS 2016), Barcelona, Spain, 5–10 December 2016.
10. Zia, T.; Arif, S.; Murtaza, S.; Ullah, M.A. Text-to-Image Generation with Attention Based Recurrent Neural Networks. arXiv 2020,
arXiv:2001.06658.
11. Liu, R.; Ge, Y.; Choi, C.L.; Wang, X.; Li, H. Divco: Diverse conditional image synthesis via contrastive generative adversarial
network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25
June 2021; pp. 16377–16386.
12. Gao, L.; Chen, D.; Zhao, Z.; Shao, J.; Shen, H.T. Lightweight dynamic conditional GAN with pyramid attention for text-to-image
synthesis. Pattern Recognit. 2021, 110, 107384. [CrossRef]
13. Dong, Y.; Zhang, Y.; Ma, L.; Wang, Z.; Luo, J. Unsupervised text-to-image synthesis. Pattern Recognit. 2021, 110, 107573. [CrossRef]
14. Berrahal, M.; Azizi, M. Optimal text-to-image synthesis model for generating portrait images using generative adversarial
network techniques. Indones. J. Electr. Eng. Comput. Sci. 2022, 25, 972–979. [CrossRef]
15. Zhang, Y.; Han, S.; Zhang, Z.; Wang, J.; Bi, H. CF-GAN: Cross-domain feature fusion generative adversarial network for
text-to-image synthesis. Vis. Comput. 2022, 1–11. [CrossRef]
16. Zhang, H.; Xu, T.; Li, H.; Zhang, S.; Wang, X.; Huang, X.; Metaxas, D.N. Stackgan: Text to photo-realistic image synthesis with
stacked generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy,
22–29 October 2017; pp. 5907–5915.
17. Zhang, H.; Xu, T.; Li, H.; Zhang, S.; Wang, X.; Huang, X.; Metaxas, D.N. Stackgan++: Realistic image synthesis with stacked
generative adversarial networks. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 1947–1962. [CrossRef] [PubMed]
18. Zhang, Z.; Xie, Y.; Yang, L. Photographic text-to-image synthesis with a hierarchically-nested adversarial network. In Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6199–6208.
19. Cai, Y.; Wang, X.; Yu, Z.; Li, F.; Xu, P.; Li, Y.; Li, L. Dualattn-GAN: Text to image synthesis with dual attentional generative
adversarial network. IEEE Access 2019, 7, 183706–183716. [CrossRef]