Gen-AI
The term AI, short for artificial intelligence, is an umbrella term that encompasses
several different subcategories, including generative AI. These subcategories are used
to perform different tasks. For example, reactive machines are used in self-driving
cars. Limited memory AI forecasts the weather. Theory of mind powers virtual
customer assistants. Narrow AI generates customized product suggestions for
e-commerce sites. Supervised learning identifies objects in things like images and
video. Unsupervised learning can detect fraudulent bank transactions, and
reinforcement learning can teach a machine how to play a game. These are only a
few of the subcategories, generative AI models fall into many of them,
and honestly, the list is only growing.
These other types of AI may still generate content, but they do it as a side effect of
their primary function. Generative AI is specifically designed to generate new content
as its primary output. Whether this is text, images, product suggestions,
whatever, that's what generative AI is designed to do. So, now that we know where
generative AI fits in the broader landscape, together, let's explore how it works.
You feed it thousands, millions, even trillions of pieces of content, and then you train
an algorithm to generate outputs and solutions as a result. Okay, now that we got AI
101 out of the way, let's get into generative AI 101. Let's use cars as an example. Just
like a Porsche has a different engine than a Mazda, under the umbrella term of
generative AI, there are a variety of different generative AI models. These AI models,
or car engines, are written and manufactured by groups of highly advanced
computer vision specialists, machine learning experts, and mathematicians. They're
built on years of open source machine learning research and generally funded by
companies and universities. Some of the big players in writing these generative AI
models, or engines, are OpenAI, NVIDIA, Google, Meta, and universities like UC
Berkeley and LMU Munich. They can either keep these models private, or they can
make them public, which we call open source, so that others can benefit from their
research.
All right, now that these complex generative models are written, meaning the
engines are made, what are we going to do with them? Depending on your level of
technical expertise, this can look a bit different. I'm going to paint a picture for
you with three different end users of these models. The first person is a business
leader who comes up with a product idea that involves a generative AI model, or
several. For the development of their tool, this business leader either uses free open
source generative AI models or enters into a partnership with a corporation to get
rights to their generative AI model, then their team creates their vision. To continue
the analogy, let's say this person owns the car factory. They direct where the
engine and chassis go, but don't actually work on the floor. The second person is a
creative person with an appetite for adventure. They might have some technical
knowledge, but they aren't an AI engineer. I mean, they can be if they want. This
person goes to a car engine showroom, where they pick a pre-made car engine or a
generative AI model from a repository like GitHub or Hugging Face. After that, they
go to a chassis manufacturer to pick their empty shell for their new engine, their
precious new engine. These chassis are called AI notebooks. Their purpose is to hold
and run the generative AI model code. The most widely used one is Google
Colab, but there are others like Jupyter Notebooks. And the third person would be
my mother, bless her heart. She has absolutely no technical pedigree, nor is she
interested in acquiring one. But this doesn't mean she cannot benefit from
generative AI. My mother would be buying an already-made car. She will have far
less control over the outcome of her car, but she will be able to drive, just like the
business leader and the creative technologist. People with no technical knowledge
can simply subscribe to an online service like OpenAI's ChatGPT or DALL-E, or
download Discord and play with Midjourney, or download Lensa AI and Avatar
Maker on their smartphone to play with the magic of generative AI. It all depends
on what you want to do, what you want to build, and how much technical
expertise you already have. Now that we have our car, our generative AI model, we
can start creating our own content and go for a drive.
Natural language models
GitHub Copilot is a generative AI service provided by GitHub to its users. The service
uses OpenAI Codex to suggest code and entire functions in real time, right
from the code editor. It allows users to search less for outside solutions, and it
also helps them type less with smarter code completion. Another example would be
Microsoft's Bing, which implemented ChatGPT into its search functionality, enabling
users to reach concise information in a shorter amount of time. Since OpenAI made
ChatGPT available to the public on November 30th, 2022, it reached 1 million
users in less than a week. I said in less than a week.
Now, let's compare that to other companies that hit 1 million users. It took Netflix 49
months to reach 1 million users. It took Twitter 24 months, it took Airbnb 30
months, Facebook, 10 months, and it took Instagram two-and-a half-months to
reach 1 million users. Let's remember, it took ChatGPT only one week. These figures
demonstrate how readily humans adapted their workflows to co-create with
generative AI-based tools and services. This is amazing. However, GPT has several
limitations, such as a lack of common sense, creativity, and understanding of the text
it generates. There are also biased datasets and the danger of normalizing mediocrity
when it comes to creative writing. Natural language models synthetically mimic
human capabilities, but, clearly, conscious contemplation is required before
developing generative AI tools. ChatGPT is a wonderful tool for factual and
computable information. However, I would advise us to approach it with
caution when inquiring about creative and opinion-based writing.
Text to image applications
First is Cuebric, Hollywood's first generative AI tool, created by our company, Seyhan
Lee, for streamlining the production of film backgrounds. A normal virtual production
workflow uses three-dimensional world building, which involves a bunch of people
building 3D worlds that are custom made for that film. It's time consuming,
expensive, and requires a lot of repetitive tasks. An alternative now is to augment 2D
backgrounds into 2.5D by involving generative AI in the picture creation process. The
second example would be Stitch Fix. When suggesting garments to discover their
customers' fashion style, they use real clothes along with clothes generated with
DALL-E. And finally, marketers and filmmakers use text to image models when
ideating for a concept in a film. And actually, they may later on continue to use it to
make storyboards and even use it in the production of the final art of their
campaigns and films, just like we have seen with Cuebric. A recent example from the
marketing world would be Martini, which used a Midjourney-generated image in their
campaign. Others would be Heinz and Nestlé, which used DALL-E in their
campaigns, and GoFundMe, which used Stable Diffusion in their artfully illustrated
film. Marketers prefer using generative AI in their creative process for two
reasons: first, for its time- and cost-saving efficiency, and second, for the unique
look and feel that you get from text to image based tools.
Generative Adversarial Networks (GANs)
Imagine a game between two players: a generator and a discriminator. The generator
creates a painting and the discriminator evaluates it, giving feedback to the generator
on how to improve the next iteration. The generator and the discriminator play this
game repeatedly until the generator creates a painting so realistic that the
discriminator can't tell the difference between it and a real painting. In the same
way, a GAN model has a generator and a discriminator. The two parts work together
in a competition. That's why it's called a generative adversarial network. In this way,
they improve the generator's ability to create realistic data, and over time, the
generator becomes better and better at it. The results yield products, assets, faces,
people that didn't exist before, just like we have seen with text-to-image in the
former section. The difference, though, is that with GANs, you input one type of
data, like pictures or bank transactions, and then you output the same type of
data. Let's now give three real-world examples where GANs
were used. We're going to start with Audi. They trained their own GANs to get
inspiration for their wheel designs. This process created lots of different wheel
designs that simply didn't exist before, and gave inspiration to Audi designers so
they could pick and choose which designs they wanted to use in their final
decisions. And remember, AI didn't design the final wheel.
AI was simply a tool that the wheel designers used to inspire themselves for the final
designs that they would make. Next is Beko, a European appliance brand that used
custom-trained GANs in its sustainability stand film, which also happens to be the
world's first brand-funded AI film, created and produced by Seyhan Lee. We used
GANs to generate lightning, leaves, roots, eyes, flowers, and created seamless
transitions to flow between humans and nature. GANs have this beautiful
transitional quality. And finally, in the context of financial fraud
detection, GAN models can be used to generate synthetic versions of fraudulent
transactions, which can then be used to train a fraud detection model. You know
what's really surprising with GANs? The same generative AI model can be used
for two very distinct professions. Here we are seeing it solve financial fraud
detection and create new wheel styles for Audi, and then, later on, the same AI
model makes impossibly beautiful visual effects for film. That versatility is the
greatest power of GAN models.
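The adversarial game described above can be sketched in a few lines of code. The following is a deliberately tiny, illustrative NumPy example, not any production GAN: a one-parameter-pair linear generator learns to mimic numbers drawn from a normal distribution centered at 4, while a logistic-regression discriminator tries to tell real samples from generated ones. All numbers and variable names here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def sample_real(n):
    # "Real" data: numbers drawn from a normal distribution around 4
    return rng.normal(4.0, 0.5, n)

# Generator g(z) = a*z + b, starting far from the real data
a, b = 1.0, 0.0
# Discriminator d(x) = sigmoid(w*x + c), a tiny logistic classifier
w, c = 0.1, 0.0

lr = 0.01
for _ in range(5000):
    # --- discriminator step: push d(real) up, d(fake) down ---
    x_real = sample_real(32)
    z = rng.normal(0.0, 1.0, 32)
    x_fake = a * z + b
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    # gradients of the loss -log d(real) - log(1 - d(fake))
    w -= lr * np.mean(-(1 - d_real) * x_real + d_fake * x_fake)
    c -= lr * np.mean(-(1 - d_real) + d_fake)

    # --- generator step: try to fool the discriminator (-log d(fake)) ---
    z = rng.normal(0.0, 1.0, 32)
    x_fake = a * z + b
    d_fake = sigmoid(w * x_fake + c)
    grad_out = -(1 - d_fake) * w  # gradient of the loss w.r.t. x_fake
    a -= lr * np.mean(grad_out * z)
    b -= lr * np.mean(grad_out)

# After training, the generator's output mean (b) should have drifted
# from 0 toward the real data's mean of 4.
print(round(b, 2))
```

The key point the sketch shows is the alternation: each round, the discriminator gets slightly better at spotting fakes, and the generator uses the discriminator's gradient to make its fakes slightly more convincing.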
Let's now move to talking about an application of generative AI that may not be as
obvious as its use in generating images, audio, or text, like we have seen earlier, but
it's still a very important application nonetheless: anomaly
detection. One of the main models that we use in this space is the Variational
Autoencoder, referred to as a VAE.
These models can be used for anomaly detection by training the model on a dataset
of normal data, and then using the trained model to identify instances that deviate
from the normal data. This can be used to detect anomalies in a wide range of
situations, like finding fraud in financial transactions, spotting flaws in
manufacturing or finding security breaches in a network.
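The train-on-normal, flag-deviations recipe can be sketched with a simplified stand-in. Below, a plain linear autoencoder (a real VAE adds a probabilistic latent space and neural encoder/decoder networks) is fitted to synthetic "normal" two-dimensional data; points that reconstruct badly from the learned latent space are flagged as anomalies. All data, names, and thresholds here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Normal" data: 2-D points lying close to the line y = 2x
x = rng.normal(0.0, 1.0, 500)
normal_data = np.column_stack([x, 2 * x + rng.normal(0.0, 0.1, 500)])

# "Train": fit a 1-D linear latent space to the normal data.
# (A real VAE would learn a nonlinear, probabilistic version of this.)
mean = normal_data.mean(axis=0)
_, _, vt = np.linalg.svd(normal_data - mean, full_matrices=False)
direction = vt[0]  # the direction capturing normal variation

def reconstruction_error(points):
    diff = points - mean
    codes = diff @ direction            # "encode" into the 1-D latent space
    recon = np.outer(codes, direction)  # "decode" back into 2-D
    return np.linalg.norm(diff - recon, axis=1)

# Threshold: a bit above the worst error seen on the normal data
threshold = reconstruction_error(normal_data).max() * 1.1

# A point far from the learned structure reconstructs badly -> anomaly
anomaly = np.array([[3.0, -6.0]])
print(reconstruction_error(anomaly)[0] > threshold)
```

The same logic scales up: train on normal transactions, product photos, or network traffic, then flag anything whose reconstruction error exceeds what was seen during training.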
For example, Uber has used VAEs for anomaly detection in their financial transactions
to detect fraud. Another example would be Google, which has used VAEs to detect
network intrusions via anomaly detection. And another real-world
application of VAEs is anomaly detection in industrial quality control. In this
scenario, a VAE can be trained on a dataset of images of normal products and then
used to identify images of products that deviate from the normal data. In this way, it
can be used to detect defects in products such as scratches, dents, or misalignments.
Another real-world example would be healthcare, where VAEs are used to detect
anomalies in medical imaging such as CT scans and MRIs. For instance, Children's
National Hospital in Washington, DC uses a generative AI model to analyze electronic
health records. The model uses data such as vital signs, laboratory results, and
demographic information to predict which patients are at risk of sepsis, allowing
healthcare providers to intervene early and improve patient outcomes. Variational
Autoencoders are flexible generative models that can not only detect anomalies but
are also part of the architecture of several other generative AI models.
Future predictions using GenAI
The best way to predict the future, as they say, is to invent it, so let's talk about the
future. In the next two to three years, in the gaming, film, and marketing sectors,
generative AI will continue to be used in computer graphics and animation to create
more realistic and believable characters and environments.