Gen-AI
The term AI, short for artificial intelligence, is an umbrella term that encompasses
several different subcategories, including generative AI. These subcategories are used
to perform different tasks. For example, reactive machines are used in self-driving
cars. Limited memory AI forecasts the weather. Theory of mind powers virtual
customer assistants. Narrow AI generates customized product suggestions for
e-commerce sites. Supervised learning identifies objects in things like images and
video. Unsupervised learning can detect fraudulent bank transactions, and
reinforcement learning can teach a machine how to play a game. These are only a
few of the subcategories, generative AI models fall into many of them,
and honestly, the list is only growing.
These other types of AI may still generate content, but they do it as a side effect of
their primary function. Generative AI is specifically designed to generate new content
as its primary output. Whether this is text, images, product suggestions,
whatever, that's what generative AI is designed to do. So, now that we know where
generative AI fits in the broader landscape, together, let's explore how it works.
You feed it thousands, millions, even trillions of pieces of content, and then you train
an algorithm to generate outputs and solutions as a result. Okay, now that we got AI
101 out of the way, let's get into generative AI 101. Let's use cars as an example. Just
like a Porsche has a different engine than a Mazda, under the umbrella term of
generative AI, there are a variety of different generative AI models. These AI models,
or car engines, are written and manufactured by groups of highly advanced
computer vision specialists, machine learning experts, and mathematicians. They're
built on years of open source machine learning research and generally funded by
companies and universities. Some of the big players in writing these generative AI
models, or engines, are OpenAI, NVIDIA, Google, Meta, and universities like UC
Berkeley and LMU Munich. They can either keep these models private, or they can
make them public, which we call open source, so that others can benefit from their
research.
All right, now that these complex generative models are written, meaning the
engines are made, what are we going to do with them? Depending on your level of
technical expertise, this can look a bit different. I'm going to paint a picture for
you with three different end users of these models. The first person is a business
leader who comes up with a product idea that involves a generative AI model, or
several. For the development of their tool, this business leader either uses free open
source generative AI models or enters into a partnership with a corporation to get
rights to their generative AI model, then their team creates their vision. To continue
the analogy, let's say this person owns the car factory. They direct where the
engine and chassis go, but don't actually work on the floor. The second person is a
creative person with an appetite for adventure. They might have some technical
knowledge, but they aren't an AI engineer. I mean, they can be if they want. This
person goes to a car engine showroom, where they pick a pre-made car engine or a
generative AI model from a repository like GitHub or Hugging Face. After that, they
go to a chassis manufacturer to pick their empty shell for their new engine, their
precious new engine. These chassis are called AI notebooks. Their purpose is to hold
and run the generative AI model code. The most widely used one is Google
Colab, but there are others like Jupyter Notebooks. And the third person would be
my mother, bless her heart. She has absolutely no technical pedigree, nor is she
interested in acquiring one. But this doesn't mean she cannot benefit from
generative AI. My mother would be buying an already-made car. She will have far
less control over the outcome of her car, but she will be able to drive, just like the
business leader and the creative technologist. People with no technical knowledge
can simply subscribe to an online service like OpenAI's ChatGPT or DALL-E, or
download Discord and play with Midjourney, or download Lensa AI and Avatar
Maker on their smartphone to play with the magic of generative AI. It all depends
on what you want to do, what you want to build, and how much technical
expertise you already have. Now that we have our car, our generative AI model, we
can start creating our own content and go for a drive.
Natural language models
GitHub Copilot is a generative AI service provided by GitHub to its users. The service
uses OpenAI Codex to suggest code and entire functions in real time, right
from the code editor. It allows users to search less for outside solutions, and it
also helps them type less with smarter code completion. Another example would be
Microsoft's Bing, which implemented ChatGPT into its search functionality, enabling
users to reach concise information in a shorter amount of time. Since OpenAI made
ChatGPT available to the public on November 30th, 2022, it reached 1 million
users in less than a week. I said in less than a week.
Now, let's compare that to other companies that hit 1 million users. It took Netflix 49
months to reach 1 million users. It took Twitter 24 months, it took Airbnb 30
months, Facebook, 10 months, and it took Instagram two-and-a half-months to
reach 1 million users. Let's remember, it took ChatGPT only one week. These figures
demonstrate how readily humans adapted their workflows to co-create with
generative AI-based tools and services. This is amazing. However, GPT has several
limitations, such as a lack of common sense, creativity, and understanding of the text
it generates. There are also biased datasets and the danger of normalizing mediocrity
when it comes to creative writing. Natural language models synthetically mimic
human capabilities, but, clearly, conscious contemplation is required before
developing generative AI tools. ChatGPT is a wonderful tool for factual and
computable information. However, I would advise us to approach it with
caution when inquiring about creative and opinion-based writing.
Text to image applications
First is Cuebric, Hollywood's first generative AI tool, created by our company, Seyhan
Lee, for streamlining the production of film backgrounds. A normal virtual production
workflow uses three-dimensional world building, which involves a bunch of people
building 3D worlds that are custom made for that film. It's time consuming,
expensive, and requires a lot of repetitive tasks. An alternative now is to augment 2D
backgrounds into 2.5D by involving generative AI in the picture creation process. The
second example would be Stitch Fix. When suggesting garments to discover their
customers' fashion style, they use real clothes along with clothes generated with
DALL-E. And finally, marketers and filmmakers use text to image models when
ideating for a concept in a film. And actually, they may later on continue to use it to
make storyboards and even use it in the production of the final art of their
campaigns and films, just like we have seen with Cuebric. A recent example from the
marketing world would be Martini, which used a Midjourney-generated image in their
campaign. Others would be Heinz and Nestlé, which used DALL-E in their
campaigns, and GoFundMe, which used Stable Diffusion in their artfully illustrated
film. Marketers prefer using generative AI in their creative process for two
reasons: first, for its time- and cost-saving efficiency, and second, for the unique
look and feel that you get from text to image based tools.
Generative Adversarial Networks (GANs)
Imagine a game between two players: a generator and a discriminator. The generator
creates a painting and the discriminator evaluates it, giving feedback to the generator
on how to improve the next iteration. The generator and the discriminator play this
game repeatedly until the generator creates a painting so realistic that the
discriminator can't tell the difference between it and a real painting. In the same
way, a GAN model has a generator and a discriminator. The two parts work together
in a competition. That's why it's called a generative adversarial network. In this way,
they improve the generator's ability to create realistic data, and over time, the
generator becomes better and better at it. The results yield products, assets, faces,
people that didn't exist before, just like we have seen with text-to-image in the
former section. The difference, though, is that with GANs, you input one type of
data, like pictures or bank transactions, and then you output the same type of
data. Let's now give three real-world examples where GANs
were used. We're going to start with Audi. They trained their own GANs to get
inspiration for their wheel designs. This process created lots of different wheel
designs that simply didn't exist before, and gave inspiration to Audi designers so
they could pick and choose which designs they wanted to use in their final
decisions. And remember, AI didn't design the final wheel.
AI was simply a tool that the wheel designers used to inspire themselves for the final
designs that they would make. Next is Beko, a European appliance brand that used
custom-trained GANs in its sustainability stand film, which also happens to be the
world's first brand-funded AI film, created and produced by Seyhan Lee. We used
GANs to generate lightning, leaves, roots, eyes, flowers, and created seamless
transitions to flow between humans and nature. GANs have this beautiful
transitional quality. And finally, in the context of financial fraud
detection, GAN models can be used to generate synthetic versions of fraudulent
transactions, which can then be used to train a fraud detection model. You know
what's really surprising with GANs? The same generative AI model can be used
for two very distinct professions. Here we are seeing it solve financial fraud
detection and create new wheel styles for Audi, and then, later on, the same AI
model makes impossibly beautiful visual effects for film. That versatility is the
greatest power of GAN models.
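The adversarial game described above can be sketched in a few lines of code. The following is a deliberately tiny, illustrative NumPy example, not any production GAN: a one-parameter-pair linear generator learns to mimic numbers drawn from a normal distribution centered at 4, while a logistic-regression discriminator tries to tell real samples from generated ones. All numbers and variable names here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def sample_real(n):
    # "Real" data: numbers drawn from a normal distribution around 4
    return rng.normal(4.0, 0.5, n)

# Generator g(z) = a*z + b, starting far from the real data
a, b = 1.0, 0.0
# Discriminator d(x) = sigmoid(w*x + c), a tiny logistic classifier
w, c = 0.1, 0.0

lr = 0.01
for _ in range(5000):
    # --- discriminator step: push d(real) up, d(fake) down ---
    x_real = sample_real(32)
    z = rng.normal(0.0, 1.0, 32)
    x_fake = a * z + b
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    # gradients of the loss -log d(real) - log(1 - d(fake))
    w -= lr * np.mean(-(1 - d_real) * x_real + d_fake * x_fake)
    c -= lr * np.mean(-(1 - d_real) + d_fake)

    # --- generator step: try to fool the discriminator (-log d(fake)) ---
    z = rng.normal(0.0, 1.0, 32)
    x_fake = a * z + b
    d_fake = sigmoid(w * x_fake + c)
    grad_out = -(1 - d_fake) * w  # gradient of the loss w.r.t. x_fake
    a -= lr * np.mean(grad_out * z)
    b -= lr * np.mean(grad_out)

# After training, the generator's output mean (b) should have drifted
# from 0 toward the real data's mean of 4.
print(round(b, 2))
```

The key point the sketch shows is the alternation: each round, the discriminator gets slightly better at spotting fakes, and the generator uses the discriminator's gradient to make its fakes slightly more convincing.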
Let's now move to talking about an application of generative AI that may not be as
obvious as its use in generating images, audio, or text, like we have seen earlier, but
it's still a very important application nonetheless: anomaly
detection. One of the main models that we use in this space is the Variational
Autoencoder, referred to as a VAE.
These models can be used for anomaly detection by training the model on a dataset
of normal data, and then using the trained model to identify instances that deviate
from the normal data. This can be used to detect anomalies in a wide range of
situations, like finding fraud in financial transactions, spotting flaws in
manufacturing or finding security breaches in a network.
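The train-on-normal, flag-deviations recipe can be sketched with a simplified stand-in. Below, a plain linear autoencoder (a real VAE adds a probabilistic latent space and neural encoder/decoder networks) is fitted to synthetic "normal" two-dimensional data; points that reconstruct badly from the learned latent space are flagged as anomalies. All data, names, and thresholds here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Normal" data: 2-D points lying close to the line y = 2x
x = rng.normal(0.0, 1.0, 500)
normal_data = np.column_stack([x, 2 * x + rng.normal(0.0, 0.1, 500)])

# "Train": fit a 1-D linear latent space to the normal data.
# (A real VAE would learn a nonlinear, probabilistic version of this.)
mean = normal_data.mean(axis=0)
_, _, vt = np.linalg.svd(normal_data - mean, full_matrices=False)
direction = vt[0]  # the direction capturing normal variation

def reconstruction_error(points):
    diff = points - mean
    codes = diff @ direction            # "encode" into the 1-D latent space
    recon = np.outer(codes, direction)  # "decode" back into 2-D
    return np.linalg.norm(diff - recon, axis=1)

# Threshold: a bit above the worst error seen on the normal data
threshold = reconstruction_error(normal_data).max() * 1.1

# A point far from the learned structure reconstructs badly -> anomaly
anomaly = np.array([[3.0, -6.0]])
print(reconstruction_error(anomaly)[0] > threshold)
```

The same logic scales up: train on normal transactions, product photos, or network traffic, then flag anything whose reconstruction error exceeds what was seen during training.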
For example, Uber has used VAEs for anomaly detection in their financial transactions
to detect fraud. Another example would be Google, which has used VAEs to detect
network intrusions via anomaly detection. And another real-world
application of VAEs is anomaly detection in industrial quality control. In this
scenario, a VAE can be trained on a dataset of images of normal products and then
used to identify images of products that deviate from the normal data. In this way, it
can be used to detect defects in products such as scratches, dents, or misalignments.
Another real-world example would be healthcare, where VAEs are used to detect
anomalies in medical imaging such as CT scans and MRIs. For instance, Children's
National Hospital in Washington, DC uses a generative AI model to analyze electronic
health records. The model uses data such as vital signs, laboratory results, and
demographic information to predict which patients are at risk of sepsis, allowing
healthcare providers to intervene early and improve patient outcomes. Variational
Autoencoders are flexible generative models that can not only detect anomalies but
are also part of the architecture of several other generative AI models.
Future predictions using GenAI
The best way to predict the future, as they say, is to invent it, so let's talk about the
future. In the next two to three years, in the gaming, film, and marketing sectors,
generative AI will continue to be used in computer graphics and animation to create
more realistic and believable characters and environments.