Mathematics of Generative AI

This document provides an introduction to generative AI and its underlying mathematics. It explains that generative AI maps an input like text or an image to an output like a new image, text, video or audio. It then discusses several key mathematical concepts used in generative AI models like distributions of high-dimensional data and dimensionality reduction via PCA. The document outlines several generative models like GANs, VAEs, normalizing flows, diffusion models, and transformers. It concludes by discussing future mathematical developments and advertising consulting services from Ensemble Control Inc. to build custom algorithms using this mathematics.
Mathematics of Generative AI

Abhishek Gupta

The Ohio State University and Ensemble Control Inc.

Material is taken from https://issuu.com/cmb321/docs/deep_learning_ebook

Email: [email protected]
Table of contents

1. Introduction to Generative AI
2. Basic Mathematical Ideas
3. Linear Transformation through PCA
4. Generative Adversarial Networks
5. Variational Autoencoders
6. Normalizing Flows
7. Diffusion Model
8. Transformer for Conditional Probability Approximation
9. Future Mathematical Development

Introduction to Generative AI
What is Generative AI?

A function that maps input to output

Input can be

• A high-dimensional random vector
• A text prompt (which is converted into a high-dimensional vector either via tokenization or via word2vec models)

Output can be

• An image (Stable Diffusion, DALL-E, etc.)
• Text (ChatGPT, LaMDA)
• A video
• An audio clip
• A video with audio
Why is GenAI useful?

People generally need

• Content
• Suggestions
• Nudges
• Images
• Videos
• Search results

$\underbrace{\text{Prompt} \implies \text{Content for Human Needs}}_{\text{Incredible value gets created}}$

Where are people using GenAI?

• Insurance underwriting
• Legal case search and contract drafting
• Search and recommendation
• Social media content generation
• Script ideation
• Marketing and branding material

Why should we know the math behind GenAI?

The underlying theory is immensely useful for other use cases, but almost no one is looking into these:

• User generation
• Automated app testing
• Automated penetration testing
• Marketing campaign generation

Let’s see why towards the end of the presentation

Basic Mathematical Ideas
Basic building block

Just like home prices in an area have a distribution, content also has a distribution, but in a high-dimensional space.

Image source: Wikipedia


Some distributions are interesting

Image source: Wikipedia


What are the data dimensions?

• A 1000 px × 800 px colored image: $x \in [0, 1]^{1000 \times 800 \times 3}$
• A 10 sec audio clip at 8000 Hz: $x \in [0, 1]^{10 \times 8000}$
• A 10 sec, 640 × 640 px colored video clip at 30 Hz: $x \in [0, 1]^{10 \times 640 \times 640 \times 3 \times 30}$
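To get a feel for these sizes, the exponents can simply be multiplied out. The snippet below is a minimal sketch of that arithmetic; the helper name `num_dimensions` and the dictionary labels are illustrative assumptions, not part of the original slides.

```python
from math import prod


def num_dimensions(shape):
    """Number of scalar coordinates in an array of the given shape."""
    return prod(shape)


# Shapes copied from the three examples above (hypothetical labels).
shapes = {
    "image": (1000, 800, 3),          # height x width x color channels
    "audio": (10, 8000),              # seconds x samples per second
    "video": (10, 640, 640, 3, 30),   # seconds x height x width x channels x frames per second
}

for name, shape in shapes.items():
    print(f"{name}: {num_dimensions(shape):,} dimensions")
```

Even the single image already lives in a space with millions of coordinates, and the short video clip in one with hundreds of millions.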

What about language?

Say a language has 50,000 words and punctuation marks. Say the vector embedding of each word, together with its positional encoding, is 128-dimensional. Say the choice of a word depends on the 24,000 words preceding it. Then $\Pr\{\text{next word} \mid \text{word}_1, \ldots, \text{word}_{24000}\}$ is a function that maps $[0, 1]^{24000 \times 128} \to [0, 1]^{50000}$.
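The shape of this map can be illustrated with a toy stand-in. The sketch below is only an assumption-laden illustration: it uses tiny sizes so it runs instantly, and a single random linear layer followed by a softmax in place of a real language model. The names `next_word_probs`, `softmax`, `CONTEXT`, `EMBED`, and `VOCAB` are hypothetical.

```python
import math
import random

# Toy sizes so the example runs instantly; the slide's sizes would be
# CONTEXT = 24000, EMBED = 128, VOCAB = 50000.
CONTEXT, EMBED, VOCAB = 4, 3, 5


def softmax(scores):
    """Turn arbitrary real scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_probs(context, weights):
    """Map a CONTEXT x EMBED input to a length-VOCAB probability vector.

    A single linear layer stands in for the real model (an assumption).
    """
    flat = [v for row in context for v in row]
    scores = [sum(w * v for w, v in zip(row, flat)) for row in weights]
    return softmax(scores)


rng = random.Random(0)
# Embedded context: one EMBED-dimensional vector in [0, 1] per preceding word.
context = [[rng.random() for _ in range(EMBED)] for _ in range(CONTEXT)]
# Random weights: one row of CONTEXT * EMBED coefficients per vocabulary entry.
weights = [[rng.uniform(-1, 1) for _ in range(CONTEXT * EMBED)]
           for _ in range(VOCAB)]

probs = next_word_probs(context, weights)
print(len(probs), round(sum(probs), 6))
```

At the sizes quoted above, even this single linear layer would need 24000 × 128 × 50000 = 153.6 billion weights, which hints at why such functions are approximated with more structured architectures.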

Note: In LLMs, people worry about tokens and not words.

What about long form audio and video?

$\Pr\{\text{frame and audio} \mid \text{old frames and audio}\}$ is a function that maps an extremely high-dimensional space to a very high-dimensional space.
Note: Such functions will be extremely hard to train and would require unimaginably large datasets for general use cases.

Linear Transformation through PCA
Generative Adversarial Networks
Variational Autoencoders
Normalizing Flows
Diffusion Model
Transformer for Conditional Probability Approximation
Future Mathematical Development
About Ensemble Control Inc.

Ensemble Control Inc. engages with late-stage startups and mature companies to build crucial data-driven algorithms for growth, retention, and improving customer experience.
We build custom algorithms, grounded in sophisticated statistical and mathematical literature, that integrate seamlessly with our clients' backends to yield demonstrable uplifts instantly.
Contact us at [email protected] to engage with us.
Our consulting rates start from USD 500 per hour.
