
DeepPov GAI

Uploaded by

Varad Ingale
Copyright
© © All Rights Reserved

Deep Point of View

Generative AI
Table of Contents
1. Executive Summary

2. Introduction

3. Common Techniques of Generative AI

4. Generative AI Market Potential

5. Industry Use Cases of Generative AI

6. Generative Adversarial Networks (GAN) Based Research Work

7. Challenges in Generative AI

8. Concluding Notes

9. Authors

10. References

©LTIMindtree | Privileged and Confidential 2022


01 Executive Summary
The term Artificial Intelligence (AI) was coined by John McCarthy in 1956, more than 60 years ago. AI is no longer just tech jargon: the field is evolving rapidly, and breakthroughs are reported almost daily. We are developing complex algorithms and computing systems that use AI to process and analyze massive volumes of data quickly, work that an average human could not complete in a single lifetime.

The focus has now shifted to AI-powered machines that can generate images, text, and similar multimedia content independently, with minimal human intervention. To make this a reality, researchers and programmers have developed an innovative concept called “Generative AI.”

Generative AI is an emerging technology that uses unsupervised learning algorithms to generate novel images, audio, video, text, or code. This next-gen AI discovers the underlying patterns in the input and builds new, realistic artifacts that reflect the properties of the training data. According to the MIT Technology Review, Generative AI is one of the most promising advancements in the field of artificial intelligence in the last decade.

By self-learning from each batch of data, Generative AI can create authentic artifacts that did not exist before, using a wide array of inputs. Advancements in neural networks and in machine learning algorithms developed specifically for data crunching and pattern analysis are key growth drivers in this domain. Deep research in Generative AI is expected to open new avenues for bulk data evaluation and analysis. Generative AI research is in its infancy, but early application trials have shown promising results that rival human competency.

At present, the applicability of Generative AI is confined to network models. However, with consistent research, new models are being created daily. The Generative Adversarial Network (GAN) is the most well-understood and heavily researched of the network models available today. It offers a plethora of use cases in the image and video processing domain.

This document aims to provide a high-level view of Generative AI, its building blocks, market analysis, and industry use cases. It also presents some ongoing research on generative models, thus providing a good starting point for anyone who wishes to dive deeper into this domain.

02 Introduction
In this age of artificial intelligence and deep neural networks, we often use computers to perform tasks beyond human capabilities. We rely on them for menial chores and for activities that leave no room for error.

Scientists wanted to develop technology that would liberate humans from mundane duties so that we could dedicate our lives to more creative and imagination-based work. Poetry, painting, craft, and similar pursuits are some of the creative options for human beings to excel at.

Our creativity went digital with the advent of powerful graphics cards. We transitioned from creating 3D models and digital photographs to converting them into digital art and selling it as NFTs. Let us look at some of the best examples of digital art created by incredible 3D artists and photographers!

Fig.1: Digital Art

What if we told you that every single one of them was made by a single 3D artist and photographer?

You would think she would be an artistic personality who grew up in a home that encouraged her creative side from an early age, right? Or that she has many designer friends and enjoys a nice cup of coffee while watching the sunset?

This artist, on the other hand, did not grow up in a creative home. She was raised in a laboratory. She has no designer friends. She has no friends. She only has researchers.

This artist is not a person at all. It is Google Imagen, one of the most capable AI systems ever produced in a Google lab, and it turns an arbitrary piece of text into art.

[Figure prompts:]
“An art gallery displaying Monet paintings. The art gallery is flooded. Robots are going around the art gallery using paddle boards.”
“A Pomeranian is sitting on the Kings throne wearing a crown. Two tiger soldiers are standing next to the throne.”
“A photo of a raccoon wearing an astronaut helmet, looking out of the window at night.”
“A giant cobra snake on a farm. The snake is made out of corn.”

Fig.2: Digital Art by Google Imagen

Google Imagen displays the power of what Generative AI can do with limited input. This tech is a game changer for image and language modeling. Such a Generative AI can read and analyze data inputs (in the form of text, pictures, audio, or video) and produce new and unique media while retaining the essence of the original data.

As discussed earlier, this tech employs unsupervised learning algorithms to create unique content from existing data. Simply put, it allows machines to recognize patterns in incoming data and use them to produce similar content.

As these models are given limited parameters during the training stage, each model forms its own interpretation and judgment about the unique and vital properties of the data. Because of this, the results from these Generative AI models are not shaped directly by human experience-based biases and mental processes, although they can still reflect biases present in the training data. A significant drawback to wide-scale adoption of Generative AI is currently cost. As this technology requires high processing power, the cost of deployment and operation is correspondingly higher.

03 Common Techniques of Generative AI
Artificial intelligence techniques have traditionally been used to clean data, enhance predictive analysis, compress data, and decrease the dimensionality of datasets for other algorithms. Novel generative AI techniques such as variational autoencoders (VAEs) push this a step further by reducing the error between the raw signal and its reconstruction.

Generative models are exceptionally good at producing near-original material from a small input vector. They also enable us to create previously non-existent material that may be used without licensing. Some Generative AI techniques are used when working with pictures or visual data. Certain other Generative AI models perform better in signal processing applications such as anomaly detection for predictive maintenance or security analytics. Let’s discuss some of these Generative AI techniques in this section.

Generative Adversarial Networks

Ian Goodfellow and colleagues at the University of Montreal pioneered the use of Generative Adversarial Networks (GANs) in 2014. GANs have shown enormous potential in producing many forms of realistic data. Yann LeCun, Meta's chief AI scientist, called GANs and their variants "the most interesting idea in the last 10 years in ML."

For starters, they have been used to produce realistic speech by mimicking human voices and matching voices to lip movements for better translations. They have also been used to interpret visuals, translate scenes between day and night, and map dance motions from one body to another. They are also used in conjunction with other AI approaches to increase security and create stronger AI classifiers.

GANs use two competing neural networks, a generator and a discriminator. The generator, also known as the generative network, is a neural network responsible for producing new data or content comparable to the original data. The discriminator, also known as the discriminative network, is a neural network that differentiates between source data and generated data. The two networks are trained in competition, each improving against the other, until the generator can create data that is indistinguishable from the original material.
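The competition between the two networks can be made concrete through the loss each one tries to minimize. The following is a minimal sketch, assuming the common binary cross-entropy formulation and the non-saturating generator loss; the function names and the sample discriminator outputs are illustrative assumptions and do not come from this document.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real samples should score 1, generated samples 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating loss: the generator wants its fakes scored as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability that a sample is real).
# A confident discriminator separates real from fake and has a low loss.
confident = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
# A fooled discriminator outputs 0.5 everywhere and has a higher loss.
fooled = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

In an actual GAN, both losses would be minimized alternately by gradient descent on the parameters of the two networks; the sketch only evaluates the losses for fixed outputs.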

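[Diagram: random noise feeds the Generator, which produces a fake image. The Discriminator receives the fake image together with samples from the training set and labels each input as real or fake.]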

Fig 3: Generative Adversarial Networks (Thalles Silva)

Models

DCGAN (Deep Convolutional GAN)
ProGAN (Progressively Growing GAN)
BigGAN

Transformer-based Models

Transformer-based models are mainly used to analyze data with a sequential structure (such as the sequence of words in a sentence). In recent years, transformer-based techniques have become a standard tool for modeling natural language.

The ability of transformer models to attend to various positions of the input sequence to compute a representation of that sequence is core to their architecture.
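Attending to positions of an input sequence can be illustrated with a small numeric example. The sketch below assumes scaled dot-product attention, which is one widely used form of attention; this document does not say which variant it has in mind, and the function names and toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every position in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        width = len(values[0])
        outputs.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)])
    return outputs

# Two positions with identical keys receive equal weight, so the
# representation is the plain average of their value vectors.
context = attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
```

A full transformer applies this operation with learned projections for the queries, keys, and values, typically across several attention heads.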

[Diagram: the input “Je suis etudiant” passes through the Transformer, which outputs “I am a student”.]

Fig 5: Transformer-based Models (Source: GitHub)

Models

BERT (Bidirectional Encoder Representations from Transformers)
RoBERTa (Robustly Optimized BERT)

Autoregressive Convolutional Neural Networks

Autoregressive refers to self-regression: forecasting future outcomes of a series based on previously observed values of that series. AR-CNNs model systems that change over time and assume that the likelihood of a specific data point depends only on what has happened before. To create reliable new data, they rely on past time-series data. RNNs and causal convolutional networks are the most common autoregressive designs.
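The assumption that an output depends only on what came before can be shown with a causal convolution. This is a minimal sketch with a hypothetical function name and a hand-picked kernel; real models such as WaveNet stack many such layers with learned weights.

```python
def causal_conv1d(series, kernel):
    """1-D causal convolution: output[t] uses series[t], series[t-1], ... only.

    kernel[0] weights the current value, kernel[1] the previous one, and so on.
    The series is zero-padded on the left so the output has the same length.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(series)
    return [
        sum(kernel[j] * padded[t + k - 1 - j] for j in range(k))
        for t in range(len(series))
    ]

# Average of the current and previous value at each time step.
smoothed = causal_conv1d([1.0, 2.0, 3.0, 4.0], [0.5, 0.5])

# Changing a future value leaves all earlier outputs untouched.
changed = causal_conv1d([1.0, 2.0, 3.0, 100.0], [0.5, 0.5])
```

Because no output looks ahead, such a layer can be used to generate a sequence one step at a time, feeding each new value back in as input.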

[Diagram: a stack of layers, from the input through three hidden layers to the output.]

Fig 4: Autoregressive Convolutional neural networks (Source: deepmind.com)

Models

PixelRNN PixelCNN WaveNet

Other Nascent Techniques

Bayesian Network

A Bayesian Network, or Bayes Network, is a generative probabilistic graphical model that allows efficient and effective representation of the joint probability distribution over a set of random variables. A Bayes Network consists of two main parts: structure and parameters. The structure is a directed acyclic graph (DAG), and the parameters consist of conditional probability distributions associated with each node. This network can be used for various applications, such as time series prediction, anomaly detection, and reasoning.
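The way a DAG and its conditional probability distributions define a joint distribution can be sketched with a toy network. The variables (rain, sprinkler, wet grass) and all probabilities below are hypothetical values chosen for illustration only.

```python
from itertools import product

# Structure (DAG): rain -> wet <- sprinkler.
# Parameters: one (conditional) probability table per node.
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.3, False: 0.7}
p_wet_given = {  # P(wet=True | rain, sprinkler)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.05,
}

def joint(rain, sprinkler, wet):
    """Joint probability as the product of each node's conditional distribution."""
    p_wet = p_wet_given[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (p_wet if wet else 1.0 - p_wet)

# The factorized joint distribution sums to 1 over all assignments.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))

# Reasoning example: marginal probability that the grass is wet.
p_wet_marginal = sum(joint(r, s, True) for r, s in product([True, False], repeat=2))
```

The three small tables replace a full table over all eight joint assignments, which is where the efficiency of the representation comes from as networks grow.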

Gaussian Mixture Model

A Gaussian Mixture Model (GMM) is a generative probabilistic model which assumes that all the data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters. GMMs are commonly used as a parametric model of the probability distribution of features in a biometric system, such as vocal tract-related spectral features in a speaker recognition system. GMM parameters are estimated from training data using the iterative Expectation-Maximization (EM) algorithm, or by Maximum a Posteriori (MAP) estimation from a well-trained prior model.
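The iterative EM procedure can be sketched for a one-dimensional mixture of two Gaussians. The data points, starting values, and function names below are assumptions made for illustration; practical implementations also add convergence checks and more careful initialization.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D Gaussian mixture."""
    # E-step: responsibility of each component for each data point.
    resp = []
    for x in data:
        likes = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        resp.append([like / total for like in likes])
    # M-step: re-estimate weights, means, and variances from responsibilities.
    new_w, new_m, new_v = [], [], []
    for k in range(len(weights)):
        r_k = sum(r[k] for r in resp)
        mean_k = sum(r[k] * x for r, x in zip(resp, data)) / r_k
        var_k = sum(r[k] * (x - mean_k) ** 2 for r, x in zip(resp, data)) / r_k
        new_w.append(r_k / len(data))
        new_m.append(mean_k)
        new_v.append(max(var_k, 1e-6))  # floor keeps the variance positive
    return new_w, new_m, new_v

# Hypothetical data drawn around -2 and +2.
data = [-2.1, -1.9, -2.0, -1.8, 2.0, 2.2, 1.9, 2.1]
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
for _ in range(20):
    weights, means, variances = em_step(data, weights, means, variances)
```

After the loop, the two estimated means settle near the centers of the two clusters in the toy data.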

Hidden Markov Model

A Hidden Markov Model (HMM) is a statistical model that can describe the evolution of observable events that depend on internal factors which are not directly observable. The model is popularly known for its effectiveness in modeling the correlations between adjacent symbols, domains, or events. HMMs have been extensively used in various fields, especially in speech recognition and digital communication. A Hidden Markov Model consists of two stochastic processes: an invisible process of hidden states and a visible process of observable symbols.
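The interplay of the hidden-state process and the observable process can be sketched with the forward algorithm, which computes the likelihood of an observation sequence. The states, symbols, and probabilities below are hypothetical.

```python
def forward(observations, start, trans, emit):
    """Likelihood of an observation sequence under an HMM (forward algorithm)."""
    states = list(start)
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit[s][obs] * sum(alpha[prev] * trans[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())

# Hidden process: the weather. Visible process: what a person is seen doing.
start = {"rainy": 0.6, "sunny": 0.4}
trans = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
emit = {
    "rainy": {"walk": 0.1, "shop": 0.9},
    "sunny": {"walk": 0.8, "shop": 0.2},
}

likelihood = forward(["walk", "shop"], start, trans, emit)
```

The same recursion underlies decoding and training procedures for HMMs, which replace the sum with a maximum or combine it with a backward pass.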

Latent Dirichlet Allocation (LDA)

Latent Dirichlet Allocation (LDA) is a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model in which each item of a collection is modeled as a finite mixture over an underlying set of topics. The model has applications to various problems, including collaborative filtering and content-based image retrieval.
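The generative story behind LDA, in which each document mixes a set of topics, can be sketched by sampling a toy document. The two topics, their word probabilities, and the Dirichlet parameter are hypothetical; fitting LDA to real data requires an inference procedure that is not shown here.

```python
import random

def sample_dirichlet(rng, alpha):
    """Draw from a Dirichlet distribution by normalizing gamma samples."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(rng, topics, alpha, length):
    """Sample one document: a topic mixture, then a topic and a word per position."""
    theta = sample_dirichlet(rng, alpha)  # per-document topic proportions
    words = []
    for _ in range(length):
        topic = rng.choices(range(len(topics)), weights=theta)[0]
        vocab, probs = zip(*topics[topic].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

# Hypothetical topics: a word distribution for "sports" and for "politics".
topics = [
    {"goal": 0.5, "team": 0.5},
    {"vote": 0.5, "party": 0.5},
]
rng = random.Random(0)
theta, words = generate_document(rng, topics, alpha=[0.5, 0.5], length=8)
```

The three levels of the hierarchy appear as the corpus-level parameters (`alpha` and `topics`), the document-level mixture `theta`, and the word-level topic choice.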

Variational Autoencoders (VAEs)

Variational Autoencoders (VAEs) have been one of the most popular approaches to unsupervised learning of complicated distributions. They are built on top of standard function approximators (neural networks) and can be trained with stochastic gradient descent. Applications of VAEs include generating various kinds of complex data, including handwritten digits, faces, and CIFAR images, as well as predicting the future from static images, and more.
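Two ingredients commonly used to make VAEs trainable with stochastic gradient descent are the reparameterization trick and a KL-divergence term that keeps the latent code close to a standard normal prior. The sketch below assumes a diagonal Gaussian encoder output; the function names are hypothetical and the encoder and decoder networks themselves are omitted.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1).

    Writing the sample this way keeps it a differentiable function of mu and
    log_var, which is what allows gradient-based training.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, diag(exp(log_var))) and N(0, I)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

rng = random.Random(0)
z = reparameterize(mu=[0.5, -0.5], log_var=[0.0, 0.0], rng=rng)

# The penalty is zero when the encoder output already matches the prior.
kl_at_prior = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
kl_shifted = kl_to_standard_normal([1.0], [0.0])
```

In training, this KL term is added to a reconstruction loss, and the sum is minimized over the parameters of both networks.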

04 Generative AI Market Potential
It is difficult to estimate the true market size of Generative AI because the technology is still at the experimental stage. As practical uses and adoption models are still being developed, it is difficult to gauge the exact range of applications. Thus, it is too early to put a number on the market potential.

The most reliable way to derive some insights and assess the market scenario is to study the Generative Design Market. This market is an offshoot of Generative AI that deals with CAD-based automated design.

Currently, the Generative Design Market is expected to surpass USD 529 Mn by 2027 at a CAGR of 19.4%.

Drawing on the above statistics, we believe that the market potential for non-niche Generative AI will be 10x that of the Generative Design market, with a tentative timeline of 3-5 years.
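The growth figures can be checked with simple compound-growth arithmetic. The sketch below assumes that the USD 128.4 million base figure shown in Fig 6 refers to 2019 and that growth compounds annually for eight years to 2027; the function and variable names are illustrative.

```python
def project(value, cagr, years):
    """Compound a starting value forward at a constant annual growth rate."""
    return value * (1.0 + cagr) ** years

def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1.0 / years) - 1.0

base_2019 = 128.4    # USD million (assumed base year: 2019)
target_2027 = 529.0  # USD million
years = 2027 - 2019

projected = project(base_2019, 0.194, years)       # roughly USD 530 million
rate = implied_cagr(base_2019, target_2027, years)  # roughly 19.4% per year

# The 10x estimate for non-niche Generative AI, in USD million.
generative_ai_estimate = 10 * target_2027
```

Under these assumptions the stated CAGR and the 2027 figure agree with each other, and the 10x estimate corresponds to roughly USD 5.3 billion.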

[Chart: Global Generative Design Market, 2020-2027. USD 128.4 million in 2019, growing at a CAGR of 19.4% to USD 529.0 million by 2027; 2020 is estimated (E) and 2027 is projected (P).]

Fig 6: Global Generative Design Market Potential (Source: Verified Market Research)

05 Industry Use Cases of Generative AI
According to Gartner, generative AI will account for 10% of all digital data produced by 2025, a dramatic increase from the roughly 1% it currently accounts for. Let us find out which industries will benefit the most from implementing this technology.

Healthcare

In this sector, Generative AI serves a dual purpose. First, it has the potential to improve patient care. Second, it can enhance patient data privacy. Here, synthetic and under-represented data is used to train and improve the Generative AI model. GANs, for example, can provide numerous viewpoints of an X-ray picture to show potential tumor development outcomes. They can also identify cancerous developments by comparing images of healthy organs from a databank with damaged ones. The second use of the technology focuses on data de-identification, which helps protect patient data against reversal of the de-identification process, although that protection is far from impenetrable.

Life Science

Here, Generative AI can help with drug discovery. The technology can produce molecular structures of medications used to treat various diseases. Because it can quickly search databases of substances for this purpose, it can also facilitate the treatment of novel ailments. This automated approach is faster than the manual procedure. Gartner predicts that generative AI will be used in 50% of drug development activities by 2025.

Media and Entertainment

Movie Restoration

Many vintage movies and classic Disney cartoons are treasures of world culture, but their quality sometimes falls short of the demands of our day. Generative AI can upscale them to 4K and beyond, generate 60 frames per second instead of the standard 23, reduce noise, and convert black-and-white footage to color.

Generation of Animated Models

Along with film, the video game business relies on moving pictures, and generative AI can help here too. When AI algorithms build 3D models for computer games, software developers' workload is lightened and development time is significantly reduced. Such models might be completely new or derived from previously supplied 2D photos. Additionally, the system can produce 2D images for use in certain game and animation genres, such as anime.

Audio synthesis

Generative AI is about more than just images. It can also be applied to sound. In cinematography and video gaming, it may be used to create foley components, ambient noises, voiceovers, and other audio effects that are an essential part of a movie or video game that spectators love. With Generative AI, it is possible to create voices that resemble human ones. Computer-generated voices help produce video voiceovers, audio clips, and narrations for companies and individuals.

Retail and e-commerce

People express their feelings about, and evaluate, the products they purchase and the services an organization supplies when they engage with them. AI algorithms may be trained to assess consumer-generated texts, audio samples, and facial expressions that provide insight into clients' attitudes toward the item in question.

Other generative AI techniques can monitor online consumers' web activity and evaluate user data to determine how enjoyable the UX is or how effective an advertisement or an overall marketing campaign was. Such information may then be used in client segmentation to identify different consumer groups and map out focused promotional programs, enhancing upselling and cross-selling potential.

Finance

Fraud detection

Several businesses already use automated fraud-detection practices that leverage the power of AI. These practices have helped them locate malicious and suspicious actions quickly and accurately. AI now detects illegal transactions through preset algorithms and rules and makes identity theft easier to detect.

Trend evaluation

Machine learning and AI technologies help predict future trends. These technologies provide valuable insights into trends that go beyond conventional calculative analysis.

IT Industry

Software development

Generative AI has also influenced the software development sector by automating manual coding. Rather than coding the software completely, IT professionals now have the flexibility to quickly develop a solution by describing to the AI model what they are looking for. For instance, the model-based tool GENIO can enhance a developer’s productivity multifold compared to manual coding. The tool helps citizen developers, or non-coders, develop applications specific to their requirements and business processes and reduces their dependency on the IT department.

Data Synthesis and augmentation

Data unavailable in the real world can be generated using generative AI. This may be used for research, such as testing new machine learning algorithms or deep learning architectures. The artificial data set generated by generative AI can be combined with original data to improve neural network performance, especially in deep learning. Data augmentation using generative AI can be used to enhance the quality of data. Generative AI can also help with the task of tuning the neurons in neural networks by automatically finding the best set of connections.

Algorithm Invention

One application of generative AI is to help researchers invent new machine learning algorithms. So far, this process has been done mostly by hand, but with the help of generative AI, it can be automated.

Other use cases

Artificial General Intelligence (AGI)

Artificial general intelligence (AGI) consists of algorithms that can successfully perform any intellectual task a human being can do. Humans have used tools for thousands of years to solve problems and create new things. Now we need to automate this process as well! Generative AI is a crucial step towards building AI that can design better machine learning algorithms and other forms of AI.

NFT Development

Non-fungible tokens are all the rage in today's digitally driven society, with sales exceeding $25 billion in 2021. NFT art is prevalent in the niche, with cartoons, memes, and paintings dominating. Generative AI technologies are well suited to creating such artworks, which may bring in large sums of money for their creators.

Text, Image, and Music Generation

AI text generators can create summaries of articles, generate product descriptions, write blog posts, and paraphrase text to prevent plagiarism. Images created by Generative AI can be used for research work and in computer graphics applications. Generative AI can also be used to create music. It is even possible to use generative AI algorithms to listen to the generated music and find the parts listeners like most, much as Pandora or Spotify do. This can also be used to improve the existing music experience.

Artificial Creativity

Artificial creativity is a subfield of generative AI where the main goal is not generating new data but creating something that did not exist before, for example, generating an abstract painting or a novel story without human input.

06 Generative Adversarial Networks (GAN) Based Research Work
Image Generation Using Datasets

Ian Goodfellow, et al., in their 2014 paper “Generative Adversarial Networks”, used GANs to generate new plausible sample images for the MNIST handwritten digit dataset, the CIFAR-10 small object photograph dataset, and the Toronto Face Database.

Along the same line of thought, Alec Radford, et al. in their 2015 paper titled “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks” (DCGAN) demonstrated how to train stable GANs at scale. They showed models for generating new examples of bedrooms.

Fig 7: Example of GAN-Generated Photographs of Bedrooms (Source: Arxiv)

Importantly, this paper demonstrates the ability to perform vector arithmetic with the
input to the GANs (in the latent space) both with generated bedrooms and with
generated faces.
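Vector arithmetic in the latent space can be illustrated with a toy example. The two-dimensional vectors below are hypothetical, with one axis standing for gender and the other for glasses; real latent vectors are high-dimensional, their axes are not individually interpretable, and the result would be passed through the generator to obtain an image.

```python
def subtract(a, b):
    """Element-wise difference of two latent vectors."""
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    """Element-wise sum of two latent vectors."""
    return [x + y for x, y in zip(a, b)]

# Hypothetical latent codes: [gender axis, glasses axis].
man_with_glasses = [1.0, 1.0]
man_without_glasses = [1.0, 0.0]
woman_without_glasses = [-1.0, 0.0]

# Isolate the "glasses" direction, then apply it to a different face.
glasses_direction = subtract(man_with_glasses, man_without_glasses)
woman_with_glasses = add(woman_without_glasses, glasses_direction)
```

The resulting vector keeps the gender component of the third code and gains the glasses component isolated from the first two.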

[Figure panels: man with glasses - man without glasses + woman without glasses = woman with glasses]

Fig 8: Example of Vector Arithmetic for GAN-Generated Faces (Source: Arxiv)

Generate Photographs of Human Faces

Tero Karras, et al. in their 2017 paper titled “Progressive Growing of GANs for Improved Quality, Stability, and Variation” demonstrated the generation of plausible, realistic photographs of human faces. The images look remarkably real, and the results were promising. Applications based on this research received a lot of media attention. The face generator was trained on celebrity examples, meaning that there are elements of existing celebrities in the generated faces, making them seem familiar, but not entirely.

Fig 9: Examples of Photorealistic GAN-Generated Faces. (Source: Arxiv)

Examples from this paper were also used in a 2018 report titled “The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation” to demonstrate the rapid progress of GANs from 2014 to 2017.

Fig 10: Example of the Progression in the Capabilities of GANs from 2014 to 2017 (Source: Arxiv)

Generate Realistic Photographs

Andrew Brock, et al. in their 2018 paper titled “Large Scale GAN Training for High Fidelity Natural Image Synthesis” demonstrate the generation of synthetic photographs with their technique BigGAN that are practically indistinguishable from real photos.

Fig 11: Example of Realistic Synthetic Photographs Generated with BigGAN (Source: Arxiv)

Generate Cartoon Characters

Yanghua Jin, et al., in their 2017 paper titled “Towards the Automatic Anime Characters Creation with Generative Adversarial Networks” demonstrate the training and use of a GAN for generating faces of anime characters (i.e., Japanese comic book characters). Inspired by the anime examples, several people have tried to create Pokemon characters, such as the pokeGAN project and the Generate Pokemon with DCGAN project, with limited success.

Fig 12: Example of GAN-Generated Pokemon Characters (Source: PokeGAN project)

Image-to-Image Translation

This is a bit of a catch-all task for those papers that present GANs that can do many image translation tasks. Phillip Isola, et al., in their 2016 paper titled “Image-to-Image Translation with Conditional Adversarial Networks” demonstrate GANs, specifically their pix2pix approach, for many image-to-image translation tasks.

Examples include translation tasks such as:

Translation of semantic images to photographs of cityscapes and buildings.
Translation of satellite photographs to Google Maps.
Translation of photos from day to night.
Translation of black and white photographs to color.
Translation of sketches to color photographs.

[Figure panels: input, ground truth, output]

Fig 13: Example of Photographs of Daytime Cityscapes to Night-time With pix2pix (Source: Arxiv)

Similarly, Jun-Yan Zhu, et al. in their 2017 paper titled “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks” introduce their famous CycleGAN and a suite of impressive image-to-image translation examples.
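The “cycle-consistent” part of the title refers to the requirement that translating an image to the other domain and back should reproduce the original, that is, F(G(x)) should be close to x. The sketch below computes this reconstruction error as a mean absolute difference; the toy translators and the tiny three-pixel “images” are hypothetical stand-ins for trained networks and real photographs.

```python
def cycle_consistency_loss(batch, G, F):
    """Mean absolute error between each input x and its reconstruction F(G(x))."""
    total = 0.0
    count = 0
    for x in batch:
        reconstruction = F(G(x))
        total += sum(abs(r - xi) for r, xi in zip(reconstruction, x))
        count += len(x)
    return total / count

# Toy translators: G brightens every pixel, F darkens by the same amount.
def brighten(image):
    return [p + 0.2 for p in image]

def darken(image):
    return [p - 0.2 for p in image]

def identity(image):
    return list(image)

images = [[0.1, 0.5, 0.7], [0.3, 0.3, 0.6]]

# F undoes G, so the round trip reproduces the input.
consistent = cycle_consistency_loss(images, brighten, darken)
# F ignores what G did, so every pixel is off by 0.2 after the round trip.
inconsistent = cycle_consistency_loss(images, brighten, identity)
```

In the full method this term is applied in both directions and combined with adversarial losses for each domain; only the forward cycle is sketched here.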

The example below demonstrates four image translation cases:

Translation from photograph to artistic painting style.
Translation of horse to zebra.
Translation of photographs from summer to winter.
Translation of satellite photographs to Google Maps view.

[Figure panels: input x, output G(x), reconstruction F(G(x))]

Fig 14: Example of Four Image-to-Image Translations Performed with CycleGAN (Source: Arxiv)

The paper also provides many other examples, such as:

Translation of painting to photograph.
Translation of sketch to photograph.
Translation of apples to oranges.
Translation of photographs to artistic painting.

Text-to-Image Translation (text2image)

Han Zhang, et al., in their 2016 paper titled “StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks” demonstrate the use of GANs, specifically their StackGAN, to generate realistic-looking photographs from textual descriptions of simple objects like birds and flowers.

[Figure: Stage-1 and Stage-2 generated images for the text descriptions “The small bird has a red head with feathers that fade from red to gray from head to tail” and “This bird is black with green and has a very short beak”.]

Fig 15: Example of Textual Descriptions and GAN-Generated Photographs of Birds (Source: Arxiv)

Scott Reed, et al., in their 2016 paper titled “Generative Adversarial Text to Image Synthesis” also provide an early example of text-to-image generation of small objects and scenes including birds, flowers, and more. Ayushman Dash, et al. gave even more examples on the same dataset in their 2017 paper titled “TAC-GAN – Text Conditioned Auxiliary Classifier Generative Adversarial Network”. Scott Reed, et al. in their 2016 paper titled “Learning What and Where to Draw” expand upon this capability and use GANs both to generate images from text and to use bounding boxes and key points as hints as to where to draw a described object, like a bird.

[Figure: images generated from the descriptions “This bird is completely black.”, “This bird is bright blue.”, and “A man in an orange jacket, black pants and a black cap wearing sunglasses skiing”, with position hints such as beak, belly, right leg, and head.]

Fig 16: Example of Photos of Object Generated from Text and Position Hints with a GAN (Source: Arxiv)

Semantic-Image-to-Photo Translation

Ting-Chun Wang, et al. in their 2017 paper titled “High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs” demonstrate the use of conditional GANs to generate photorealistic images given a semantic image or sketch as input.

[Figure panels: input labels, synthesized image]

Fig 17: Example of Semantic Image and GAN-Generated Cityscape Photograph (Source: Arxiv)

Specific examples included:

Cityscape photograph, given semantic image.
Bedroom photograph, given semantic image.
Human face photograph, given semantic image.
Human face photograph, given sketch.

Face Frontal View Generation

Rui Huang, et al. in their 2017 paper titled “Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis” demonstrate the use of GANs for generating frontal-view (i.e., face-on) photographs of human faces given photographs taken at an angle. The idea is that the generated front-on photos can then be used as input to a face verification or face identification system.

Fig 18: Example of GAN-based Face Frontal View Photo Generation (Source: Ieeexplore)

Generate New Human Poses

Liqian Ma, et al. in their 2017 paper titled “Pose Guided Person Image Generation” provide an example of generating new photographs of human models with new poses.

[Figure columns: condition image, target pose, target image (ground truth), and generated results from several model variants, ending with the authors' coarse result and refined result.]

Fig 19: Example of GAN-Generated Photographs of Human Poses (Source: Arxiv)

Photos to Emojis

Yaniv Taigman, et al. in their 2016 paper titled “Unsupervised Cross-Domain Image Generation” used a GAN to translate images from one domain to another, including from street numbers to MNIST handwritten digits, and from photographs of celebrities to what they call emojis or small cartoon faces.

Fig 20: Example of Celebrity Photographs and GAN-Generated Emojis (Source: Arxiv)

Photograph Editing

Guim Perarnau, et al. in their 2016 paper titled “Invertible Conditional GANs For Image Editing” use a GAN, specifically their IcGAN, to reconstruct photographs of faces with specified features, such as changes in hair color, style, facial expression, and even gender.

[Figure panels: real image, followed by reconstructed images with the attributes blonde, bangs, smile, and male]

Fig 21: Example of Face Photo Editing with IcGAN (Source: Arxiv)

Ming-Yu Liu, et al. in their 2016 paper titled “Coupled Generative Adversarial Networks” also explore the generation of faces with specific properties such as hair color, facial expression, and glasses. They also explore the generation of other images, such as scenes with varied color and depth.

Fig 22: Example of GANs used to Generate Faces with and Without Blond Hair (Source: Arxiv)

Andrew Brock, et al. in their 2016 paper titled “Neural Photo Editing with Introspective Adversarial Networks” present a face photo editor using a hybrid of variational autoencoders and GANs. The editor allows rapid realistic modification of human faces including changing the hair color, hairstyles, facial expression, pose, and adding facial hair.

Fig 23: Example of Face Editing Using the Neural Photo Editor Based on VAEs and GANs (Source: Arxiv)

He Zhang, et al. in their 2017 paper titled “Image De-raining Using a Conditional Generative Adversarial Network” use GANs for image editing, including examples such as removing rain and snow from photographs.

[Figure panels: (a), (b), (c), (d)]
Fig 24: Example of Using a GAN to Remove Rain from Photographs (Source: Arxiv)

Face Aging

Grigory Antipov, et al. in their 2017 paper titled “Face Aging with Conditional Generative
Adversarial Networks” use GANs to generate photographs of faces with different apparent ages,
from younger to older.

[Figure panels: Original; Initial Reconstruction; Reconstruction Optimization (Pixelwise, IP);
Face Aging across the age groups 0-18, 19-29, 30-39, 40-49, 50-59, 60+]

Fig 25: Example of Photographs of Faces Generated with a GAN With Different Apparent Ages (Source: Arxiv)
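
The age groups labelled in the figure hint at how a conditional GAN can be steered: the target age group is encoded as a condition and supplied to the generator together with the latent vector. The following is a minimal sketch of that input construction only, in plain Python. The helper names, the latent size of 8, and the one-hot encoding are illustrative assumptions, not details taken from the paper.

```python
import random

# Age groups as labelled in the figure; boundaries are read off those labels.
AGE_GROUPS = ["0-18", "19-29", "30-39", "40-49", "50-59", "60+"]
UPPER_BOUNDS = [18, 29, 39, 49, 59]  # inclusive upper bound of every group but the last


def age_to_group_index(age):
    """Return the index of the age group that contains `age`."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for index, upper in enumerate(UPPER_BOUNDS):
        if age <= upper:
            return index
    return len(AGE_GROUPS) - 1  # falls into "60+"


def one_hot(index, size):
    """Encode `index` as a one-hot list of length `size`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_generator_input(latent, target_age):
    """Concatenate a latent vector with a one-hot age condition (hypothetical helper)."""
    condition = one_hot(age_to_group_index(target_age), len(AGE_GROUPS))
    return list(latent) + condition


# Example: the same latent vector paired with two different target ages.
rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(8)]  # latent size 8 is an arbitrary choice
young = build_generator_input(z, 25)
old = build_generator_input(z, 65)
```

In this sketch, `young` and `old` share the same latent part and differ only in the condition part, which is the property a conditional generator would rely on to keep identity fixed while changing apparent age.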

Photo Blending

Huikai Wu, et al. in their 2017 paper titled “GP-GAN: Towards Realistic High-Resolution Image
Blending” demonstrate the use of GANs in blending photographs, specifically elements from
different photographs such as fields, mountains, and other large structures.

Fig 26: Example of GAN-based Photograph Blending (Source: Arxiv)

Super Resolution

Christian Ledig, et al. in their 2016 paper titled “Photo-Realistic Single Image Super-Resolution
Using a Generative Adversarial Network” demonstrate the use of GANs, specifically their
SRGAN model, to generate output images with higher, sometimes much higher, pixel resolution.

[Figure panels: bicubic, SRResNet, SRGAN, original]

Fig 27: Example of GAN-Generated Images with Super Resolution (Source: Arxiv)
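
Comparisons such as the one in the figure are often accompanied by a pixel-wise fidelity score. The sketch below, in plain Python, computes the peak signal-to-noise ratio (PSNR) between a reference image and an upscaled estimate. It uses nearest-neighbour upscaling as a deliberately simple stand-in for the bicubic baseline, and the toy pixel values and function names are illustrative assumptions. Pixel-wise scores of this kind do not necessarily reflect perceived visual quality.

```python
import math


def upscale_nearest(image, factor):
    """Upscale a 2D grayscale image (list of rows) by repeating each pixel."""
    upscaled = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        upscaled.extend([list(wide_row) for _ in range(factor)])
    return upscaled


def mse(a, b):
    """Mean squared error between two equally sized 2D images."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    error = mse(reference, estimate)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Toy example: a 2x2 low-resolution image upscaled to 4x4 and compared
# against a hypothetical 4x4 "original".
low_res = [[0, 100], [200, 50]]
original = [
    [0, 10, 100, 100],
    [0, 0, 100, 90],
    [200, 200, 50, 50],
    [190, 200, 50, 60],
]
baseline = upscale_nearest(low_res, 2)
score = psnr(original, baseline)
```

A higher `score` means the estimate is closer to the reference pixel by pixel; the same `psnr` function could be applied to the output of any upscaling method.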

Subeesh Vasu, et al. in their 2018 paper titled “Analysing Perception-Distortion Trade-off using
Enhanced Perceptual Super-Resolution Network” provide an example of GANs for creating
high-resolution photographs, focusing on street scenes.

[Figure panels for image 008 from Urban100: HR, Bicubic, SRCNN, EDSR, DBPN, ENet, BNet1, BNet3, EPSR1, EPSR3]

Fig 28: Example of High-Resolution GAN-Generated Photographs of Buildings (Source: Arxiv)

Photo Inpainting

Deepak Pathak, et al. in their 2016 paper titled “Context Encoders: Feature Learning by Inpainting”
describe the use of GANs, specifically Context Encoders, to perform photograph inpainting or hole
filling, that is, filling in an area of a photograph that was removed for some reason.

Fig 29: Example of GAN-Generated Photograph Inpainting Using Context Encoders (Source: Arxiv)
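
Inpainting setups of this kind are typically trained by hiding a region of an image and scoring the model on how well it fills that region in. The sketch below shows only the masking step and a reconstruction error restricted to the hidden region, in plain Python. The helper names, the square mask, and the constant "guess" are illustrative assumptions; the adversarial part of the training is not shown.

```python
def make_square_mask(height, width, top, left, size):
    """Return a 2D mask with 1 inside the hidden square and 0 elsewhere."""
    return [
        [1 if top <= r < top + size and left <= c < left + size else 0
         for c in range(width)]
        for r in range(height)
    ]


def apply_mask(image, mask, fill=0):
    """Hide the masked pixels by overwriting them with `fill`."""
    return [
        [fill if m else pixel for pixel, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def masked_mse(target, prediction, mask):
    """Mean squared error computed only over the hidden (mask == 1) pixels."""
    total = 0.0
    count = 0
    for t_row, p_row, m_row in zip(target, prediction, mask):
        for t, p, m in zip(t_row, p_row, m_row):
            if m:
                total += (t - p) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask hides no pixels")
    return total / count


# Toy 4x4 image with a 2x2 hole in the middle.
image = [[10 * (r + c) for c in range(4)] for r in range(4)]
mask = make_square_mask(4, 4, top=1, left=1, size=2)
corrupted = apply_mask(image, mask)

# A stand-in "prediction": keep the visible pixels and guess a constant
# for the hole. A trained generator would replace this step.
guess = 30
prediction = [
    [guess if m else pixel for pixel, m in zip(img_row, mask_row)]
    for img_row, mask_row in zip(image, mask)
]
hole_error = masked_mse(image, prediction, mask)
```

Restricting the error to the hidden pixels means the score reflects only what the model had to invent, not the visible context it was given.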

Raymond A. Yeh, et al. in their 2016 paper titled “Semantic Image Inpainting with Deep Generative
Models” use GANs to fill in and repair intentionally damaged photographs of human faces.

[Figure columns: Real, Input, Ours, NN]

Fig 30: Example of GAN-based Inpainting of Photographs of Human Faces (Source: Arxiv)

Video Prediction

Carl Vondrick, et al. in their 2016 paper titled “Generating Videos with Scene Dynamics” describe
the use of GANs for video prediction, specifically predicting up to a second of video frames with
success, for static elements of the scene.

[Figure rows: Static Input, followed by Generated Video at Frame 1, Frame 16, and Frame 32]

Fig 31: Example of Video Frames Generated with a GAN (Source: Arxiv)

3D Object Generation

Jiajun Wu, et al. in their 2016 paper titled “Learning a Probabilistic Latent Space of Object
Shapes via 3D Generative-Adversarial Modelling” demonstrate a GAN for generating new
three-dimensional objects (e.g., 3D models) such as chairs, cars, sofas, and tables.

[Figure columns: alternating High-res and Low-res renderings]

Fig 32: Example of GAN-Generated Three-Dimensional Objects (Source: Arxiv)

Matheus Gadelha, et al. in their 2016 paper titled “3D Shape Induction from 2D Views of Multiple
Objects” use GANs to generate three-dimensional models given two-dimensional pictures of objects
from multiple perspectives.

Fig 33: Example of Three-Dimensional Reconstructions of a Chair from Two-Dimensional Images (Source: Arxiv)

07 Challenges in Generative AI
When implementing Generative AI best practices, keep in mind potential bottlenecks and misconceptions.
Some of these include:

Safety

The use of Generative AI for cyber theft and criminal activities such as scamming and identity
theft is the biggest problem that emerges when it comes to Generative AI implementation. It has
been reported that people are already using this technology for scamming and theft.

Highly limited abilities

Generative AI algorithms require extensive training and a high amount of data to perform tasks like
creating digital art. Despite this, the content generated is not 100% new. Instead, these models
can only mix and match and sequence the data in the best way possible.

Unpredictable outcomes

Accuracy of the result is another challenge that crops up while implementing this technology.
GAN processes remain unstable and difficult to regulate, with the potential to produce completely
unexpected results. With some models it is easier to manage the behavior of Generative AI,
but in heavy applications they yield erroneous and unexpected results.
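
One commonly cited source of this instability is the adversarial training setup itself. For reference, the original GAN formulation trains a generator G and a discriminator D against each other through a two-player minimax objective, reproduced here in standard notation. The symbols p_data (the data distribution) and p_z (the noise prior) are not defined elsewhere in this document.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Because the two networks are optimized against each other rather than toward a single fixed target, training may oscillate instead of converging smoothly.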

Data Security

Verticals like healthcare and defense are reluctant to adopt Generative AI, as there are no parameters
available for data moderation, and Generative AI-based applications may create data security and
privacy issues.

Massive Data Sets Requirement

You cannot rely on generative AI algorithms to work well unless they have a substantial quantity of
input content. These programs can do miracles, but only within the constraints set by the training
data. They cannot generate fresh text or images out of thin air.

08 Concluding Notes
It is evident that Generative AI is an extremely nascent technology with limited industrial use
cases currently available. Due to this, clear market directives and revenue areas are yet to be
identified.

Despite the buzz around the technology and its huge market capitalization potential, it is too
early to predict the direction in which the market is heading. The timeline for Generative AI
technology and its industry use cases seems to be farther away on the horizon.

If a company is looking to gain a competitive advantage in this domain, it should focus on
understanding the fundamental AI models, delving deeper into the research as it stands today, and
striving to develop a proof of concept.

By following the above suggestions, you would be ready to capture a significant market share when
the market demands a full solution.

09 Authors

Sachin Jain
Head - Crystal & Deep POV

Bharat Trivedi
Principal – Enterprise Architecture

Manish Potdar
Head- Incubation & Industrialization

Hakimuddin Bawangaonwala
Research Analyst – Crystal & Deep POV

Soham Patankar
Lead Strategist- Incubation

10 References
[1] https://imagen.research.google/
[2] https://www.techopedia.com/definition/34633/generative-ai
[3] https://dresma.ai/what-is-generative-ai/
[4] https://arxiv.org/pdf/2205.11487.pdf
[5] https://www.researchgate.net/publication/230708329_Generative_Artificial_Intelligence
[6] https://medium.com/@bks46/introduction-to-generative-ai-models-6a5ebcebc168
[7] https://analyticsindiamag.com/7-types-of-generative-models-for-your-next-machine-learning-project/
[8] https://www.deepmind.com/blog/wavenet-a-generative-model-for-raw-audio
[9] https://jalammar.github.io/illustrated-transformer/
[10] https://www.marketdataforecast.com/market-reports/generative-design-market
[11] https://www.ventureradar.com/startup/Generative%20AI
[12] https://machinelearningmastery.com/impressive-applications-of-generative-adversarial-networks/
[13] https://alexrachnog.medium.com/gans-beyond-generation-7-alternative-use-cases-725c60ba95e8
[14] https://sunverasoftware.com/10-use-cases-for-generative-ai/
[15] https://coe-dsai.nasscom.in/generative-ai-exploring-the-world-of-possibilities/
[16] https://www.analyticsinsight.net/what-is-generative-ai-its-impacts-and-limitations/
[17] https://www.e-spincorp.com/what-are-the-challenges-of-generative-ai-adoption/
[18] https://www.tcs.com/bridging-the-human-machine-divide
[19] https://www.infosys.com/iki/perspectives/advanced-trends-ai.html
[20] https://www.cognizant.com/us/en/ai/evolutionary-ai
[21] https://www.hcltech.com/blogs/applied-ai-time-lead-front
[22] https://www.hcltech.com/next-ai

[23] https://research.ibm.com/interactive/generative-models/
[24] https://www.microsoft.com/en-us/ai/ai-lab-gen-studio
[25] https://aws.amazon.com/blogs/startups/synthesis-ais-fules-computer-vision-innovation/
[26] https://ai.google/
[27] https://arxiv.org/abs/1406.2661
[28] https://arxiv.org/abs/1511.06434
[29] https://arxiv.org/abs/1710.10196
[30] https://arxiv.org/abs/1802.07228
[31] https://arxiv.org/abs/1809.11096
[32] https://arxiv.org/abs/1708.05509
[33] https://github.com/moxiegushi/pokeGAN
[34] https://github.com/kvpratama/gan/tree/master/pokemon
[35] https://arxiv.org/abs/1611.07004
[36] https://arxiv.org/abs/1703.10593
[37] https://junyanz.github.io/CycleGAN/
[38] https://arxiv.org/abs/1612.03242
[39] https://arxiv.org/abs/1605.05396
[40] https://arxiv.org/abs/1703.06412
[41] https://arxiv.org/abs/1610.02454
[42] https://arxiv.org/abs/1711.11585
[43] https://arxiv.org/abs/1705.09368
[44] https://arxiv.org/abs/1611.02200
[45] https://arxiv.org/abs/1611.06355
[46] https://arxiv.org/abs/1606.07536
[47] https://arxiv.org/abs/1609.07093
[48] https://arxiv.org/abs/1701.05957
[49] https://ieeexplore.ieee.org/document/8296650

[50] https://arxiv.org/abs/1702.08423
[51] https://arxiv.org/abs/1703.07195
[52] https://arxiv.org/abs/1609.04802
[53] https://arxiv.org/abs/1707.00737
[54] https://arxiv.org/abs/1811.00344
[55] https://arxiv.org/abs/1604.07379
[56] https://arxiv.org/abs/1607.07539
[57] https://arxiv.org/abs/1704.05838
[58] https://arxiv.org/abs/1603.07442
[59] https://arxiv.org/abs/1609.02612
[60] https://arxiv.org/abs/1610.07584
[61] https://arxiv.org/abs/1612.05872
[62] https://www.itbusinessedge.com/data-center/what-is-generative-ai/#Challenges_of_Generative_AI
[63] https://nix-united.com/blog/pushing-the-technological-envelope-with-generative-ai/

LTIMindtree is a global technology consulting and digital solutions company that enables enterprises across industries to
reimagine business models, accelerate innovation, and maximize growth by harnessing digital technologies. As a digital
transformation partner to more than 750 clients, LTIMindtree brings extensive domain and technology expertise to help drive superior
competitive differentiation, customer experiences, and business outcomes in a converging world. Powered by nearly 90,000 talented and
entrepreneurial professionals across more than 30 countries, LTIMindtree — a Larsen & Toubro Group
company — combines the industry-acclaimed strengths of erstwhile Larsen and Toubro Infotech and Mindtree in solving the most complex
business challenges and delivering transformation at scale. For more information, please visit www.ltimindtree.com.

LTIMindtree Limited is a subsidiary of Larsen & Toubro Limited
