1-INTRODUCTION TO GENERATIVE AI
v=cZaNf2rA30k
INTRODUCTION TO GENERATIVE AI
VIDEO TRANSCRIPT
“Introduction to Generative AI”. Don't know what that is? Then you're in the
perfect place.
I'm Roger Martinez and I am a Developer Relations Engineer at Google Cloud.
It's my job to help developers learn to use Google Cloud.
In this course, I'll teach you to do four things:
1. Define Generative AI,
2. Explain how Generative AI works,
3. Describe Generative AI model types,
4. Describe Generative AI applications.
Let's get into it. So one way to think about it is that AI is a discipline, like how
physics is a discipline of science. Are you with me so far? Essentially, AI has
to do with the theory and methods to build machines that think and act like
humans. Pretty simple, right?
Now let's talk about Machine Learning (ML). Machine learning is a subfield of
AI. It is a program or system that trains a model from input data. The trained
model can make useful predictions from new (never-before-seen) data drawn
from the same distribution as the data used to train the model. This means that machine
learning gives the computer the ability to learn without explicit
programming.
So what do these Machine Learning models look like? Two of the most
common classes of machine learning models are unsupervised and
supervised ML models. The key difference between the two is that with
supervised models, we have labels. Labeled data is data that comes with a
tag, like a name, a type, or a number. Unlabeled data is data that comes
with no tag. So what can you do with supervised and unsupervised models?
Take an unsupervised example. Here, you want to look at tenure and income,
and then group or cluster employees, to see whether someone is on the fast
track. Unsupervised
problems are all about discovery, about looking at the raw data, and seeing
if it naturally falls into groups. This is a good start but let's go a little deeper
to show this difference graphically because understanding these concepts is
the foundation for your understanding of generative AI.
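To make the clustering idea concrete, here is a minimal sketch of k-means, one common unsupervised method. The transcript does not name an algorithm, so k-means is an assumption, and the employee numbers below are made up for illustration. Notice that the data carries no labels: the groups come only from the shape of the data.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group 2-D points into k clusters with plain k-means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = (
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                )
    return centroids, clusters

# Hypothetical employees: (tenure in years, income in thousands). No labels.
employees = [(1, 40), (2, 45), (1.5, 42), (2, 90), (3, 95), (2.5, 100)]
centroids, clusters = kmeans(employees, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

With this toy data the two groups separate mainly on income. On real data you would normally rescale the features first so that one column does not dominate the distance.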
In supervised learning, input values (“x”) from the training data are fed into
the model. The model outputs a prediction and compares it to the known label
for that input. If the predicted value and the actual value are far apart,
that gap is called the "error". The model tries to reduce this
error until the predicted and actual values are closer together. This is a
classic optimization problem. So let's check in. So far, we've explored the
differences between artificial intelligence and machine learning and
supervised and unsupervised learning. That's a good start.
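The error-reduction loop in supervised learning can be sketched in a few lines. This is only an illustration under stated assumptions: the model is a straight line, the error is mean squared error, and the optimizer is plain gradient descent. None of those choices come from the transcript, and the data points are made up.

```python
# Hypothetical labeled data: x is the input, y is the known label (here y = 2x + 1).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

def mean_squared_error(w, b):
    """Average squared gap between the model's predictions and the labels."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

w, b = 0.0, 0.0          # model parameters, starting from a poor guess
learning_rate = 0.05
initial_error = mean_squared_error(w, b)

for _ in range(2000):
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # Step against the gradient so the error shrinks.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_error = mean_squared_error(w, b)
print(round(w, 2), round(b, 2), final_error < initial_error)
```

Each pass predicts, measures the error against the labels, and nudges the parameters to make that error smaller, which is the optimization problem the transcript describes.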
But what's next? Let's briefly explore where deep learning fits as a subset of
ML methods. And then I promise we'll start talking about GenAI. While
machine learning is a broad field that encompasses many different
techniques, deep learning is a type of machine learning that uses artificial
neural networks, which allow it to process more complex patterns than
traditional machine learning.
Artificial neural networks are inspired by the human brain. Like your brain,
they are made up of many interconnected nodes, or neurons, that can learn
to perform tasks by processing data and making predictions. Deep learning
models typically have many layers of neurons, which allows them to learn
more complex patterns than traditional machine learning models. Neural
networks can use both labeled and unlabeled data. This is called semi-
supervised learning. In semi-supervised learning, a neural network is trained
on a small amount of labeled data and a large amount of unlabeled data. The
labeled data helps the neural network to learn the basic concepts of the
task, while the unlabeled data helps the neural network to generalize to
new examples.
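One simple way to picture semi-supervised learning is pseudo-labeling (also called self-training): fit on the few labeled examples, then let the model label the unlabeled ones and fold them back in. The sketch below is an assumption for illustration only. It uses a tiny nearest-centroid classifier on one-dimensional numbers instead of a neural network, and the data is made up.

```python
def centroid(values):
    """Mean of a list of numbers."""
    return sum(values) / len(values)

def classify(x, groups):
    """Pick the label whose current centroid is closest to x."""
    return min(groups, key=lambda label: abs(x - centroid(groups[label])))

# A small amount of labeled data and a larger amount of unlabeled data.
groups = {"small": [1.0, 1.2], "large": [9.0, 9.5]}
unlabeled = [0.8, 1.5, 2.0, 8.0, 8.7, 10.1]

# Pseudo-labeling: predict a label for each unlabeled value, then treat
# that prediction as if it were a real label when classifying later values.
for x in unlabeled:
    groups[classify(x, groups)].append(x)

print(groups)
print(classify(3.0, groups), classify(7.0, groups))
```

The labeled values fix what "small" and "large" mean, and the unlabeled values refine where the boundary between them falls.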
Now we finally get to where generative AI fits into this AI discipline! Gen AI is
a subset of deep learning, which means it uses Artificial Neural Networks,
and can process both labeled and unlabeled data, using supervised,
unsupervised, and semi-supervised methods.
Large Language Models are also a subset of Deep Learning. Deep learning
models (or machine learning models in general) can be divided into two
types – Generative and Discriminative. A discriminative model is a type of
model that is used to classify or predict labels for data points. Discriminative
models are typically trained on a dataset of labeled data points, and they
learn the relationship between the features of the data points and the labels.
Once a discriminative model is trained, it can be used to predict the label for
new data points.
A Generative model generates new data instances based on a learned
probability distribution of existing data. Generative models generate new
content. Take this example. Here, the discriminative model learns the
conditional probability distribution or the probability of “y” (our output) given
“x” (our input), that this is a dog and classifies it as a dog and not a cat,
which is great because I'm allergic to cats. The generative model learns the
joint probability distribution p(x, y) (the probability of x and y together),
predicts the conditional probability that this is a dog, and can then generate a
picture of a dog.
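The difference between p(y | x) and p(x, y) can be shown with a toy counting model. This is a sketch under loud assumptions: the "feature" is a single made-up attribute, not an image, and real generative models learn these distributions with neural networks, not lookup tables.

```python
import random
from collections import Counter

# Hypothetical labeled observations: (x = a simple feature, y = label).
data = [
    ("floppy ears", "dog"), ("floppy ears", "dog"), ("floppy ears", "dog"),
    ("pointy ears", "cat"), ("pointy ears", "cat"), ("pointy ears", "dog"),
]

counts = Counter(data)
total = len(data)

# Generative view: the joint distribution p(x, y) over whole (x, y) pairs.
joint = {pair: n / total for pair, n in counts.items()}

def conditional(y, x):
    """Discriminative view: p(y | x) = p(x, y) / p(x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint.get((x, y), 0.0) / p_x

def generate(rng):
    """Sample a new (x, y) pair from the joint distribution."""
    pairs = list(joint)
    return rng.choices(pairs, weights=[joint[p] for p in pairs], k=1)[0]

print(conditional("cat", "pointy ears"))  # p(cat | pointy ears)
print(generate(random.Random(0)))         # one sampled (x, y) pair
```

The conditional can only score a label for an input you hand it. The joint can also be sampled, which is what makes generating new instances possible.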
This illustration shows a good way to distinguish between what is GenAI and
what is not. It is NOT GenAI when the output (or “y”, or label) is a number, or
a class (for example - spam or not spam), or a probability. It IS GenAI when
the output is natural language (like speech or text), audio, or an image like
Fred from before, for example.
Let's use a little mathematics to really show the difference. Visualizing this
mathematically would look like this. If you haven't seen this for a while, the
y = f(x) equation calculates the dependent output of a process given different
inputs. The “y” stands for the model output, the “f” embodies the function
used in the calculation (or model), and the “x” represents the input or inputs
used for the formula.
As a reminder, inputs are the data, like comma-separated value files, text
files, audio files or image files like Fred. So, the model output is a function of
all the inputs. If the “y” is a number - like predicted sales - it is not
Generative AI. If “y” is a sentence, like the answer to “Define sales”, it is
generative, as the question would elicit a text response. The response would
be based on the massive amount of data the model was already trained on.
The Generative AI process can take training code, labeled data, and
unlabeled data of all data types and build a “foundation model”. The
foundation model can then generate new content. It can generate text, code,
images, audio, video, and more. We've come a long way from traditional
programming, to neural networks, to generative models! In traditional
programming, we used to have to hard code the rules for distinguishing a
cat: type: animal, legs: 4, ears: 2, fur: yes, likes: yarn, catnip, dislikes: Fred.
In the wave of neural networks, we could give the network pictures of cats
and dogs and ask: “Is this a cat?” And it would predict cat or not cat.
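As a rough picture of the traditional-programming approach, the hard-coded cat rules might look like the sketch below. The function name and dictionary layout are hypothetical. Only the rule values (type, legs, ears, fur, likes) come from the transcript.

```python
def is_cat(animal):
    """Hard-coded rules in the style of traditional programming."""
    return (
        animal.get("type") == "animal"
        and animal.get("legs") == 4
        and animal.get("ears") == 2
        and animal.get("fur") is True
        and "yarn" in animal.get("likes", [])
    )

cat = {"type": "animal", "legs": 4, "ears": 2, "fur": True, "likes": ["yarn", "catnip"]}
three_legged_cat = dict(cat, legs=3)

print(is_cat(cat))               # the rules match
print(is_cat(three_legged_cat))  # the rules miss a real cat
```

Every exception needs another hand-written rule, which is the limitation that learning from examples was meant to remove.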
What's cool is that in the Generative Wave, we - as users - can generate our
own content - whether it be text, images, audio, video, or more. For example,
models like Gemini (Google’s Multimodal AI Model) or LaMDA (Language
Model for Dialogue Applications) ingest very large amounts of data from
multiple sources across the Internet and build foundation language models we
can use simply
by asking a question - whether typing it into a prompt or verbally talking into
the prompt itself.
So, when you ask it “What’s a cat”, it can give you everything it has learned
about a cat. Now let's make things a little more formal with an official
definition. What is Generative AI? GenAI is a type of Artificial
Intelligence that creates new content based on what it has learned
from existing content. The process of learning from existing content is
called training and results in the creation of a statistical model.
When given a prompt, GenAI uses this statistical model to predict what an
expected response might be, and this generates new content. It learns the
underlying structure of the data and can then generate new samples that are
similar to the data it was trained on. Like I mentioned earlier, a generative
language model can take what it has learned from the examples it’s been
shown and create something entirely new based on that information.
That's why we use the word “Generative.” But Large Language Models,
which generate novel combinations of text in the form of natural-sounding
language, are only one type of Generative AI. A generative image model
takes an image as input and can output text, another image, or video. For
example, under the output text, you can get visual question answering.
Based on things learned from its training data, a generative language model
offers predictions of how to complete a sentence such as
“I'm making a sandwich with peanut butter and...
jelly.” So, given some text, it can predict what comes next. Thus, generative
language models are pattern-matching systems. They learn about patterns
based on the data you provide.
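The "predict what comes next" idea can be sketched with a bigram counter, which is far simpler than a real large language model but follows the same pattern-matching principle. The tiny corpus below is made up, and predicting from only the previous word is an assumption chosen to keep the example short.

```python
from collections import Counter, defaultdict

# Hypothetical miniature training corpus.
corpus = [
    "peanut butter and jelly",
    "peanut butter and jelly sandwich",
    "peanut butter and banana",
    "bread and butter",
]

# Count which word follows each word in the training text.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1

def predict_next(word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = next_word_counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

prompt = "I'm making a sandwich with peanut butter and"
print(predict_next(prompt.split()[-1]))  # prints "jelly"
```

The prediction reflects nothing more than the patterns in the training text: change the corpus and the completion changes with it.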
Hallucinations are words or phrases that are generated by the model that are
often nonsensical or grammatically incorrect. Hallucinations can be caused
by several factors, such as when the model is not trained on enough data, is
trained on noisy or dirty data, is not given enough context, or is not given
enough constraints. Hallucinations can be a problem for Transformers
because they can make the output text difficult to understand. They can also
make the model more likely to generate incorrect or misleading information.
So put simply... hallucinations are bad.
Let's pivot slightly and talk about prompts. A prompt is a short piece of text
that is given to a large language model, or LLM, as input, and it can be used
to control the output of the model in a variety of ways. Prompt design is the
process of creating a prompt that will generate the desired output from an
LLM.
Diffusion is one method used to generate images from text. There's also text-to-video and
text-to-3D. Text-to-video models aim to generate a video representation
from text input. The input text can be anything from a single sentence to a
full script, and the output is a video that corresponds to the input text.
Similarly, Text-to-3D models generate three-dimensional objects that
correspond to a user’s text description, for use in games or other 3D worlds.
And finally, there's Text-to-task. Text-to-task models are trained to perform a
defined task or action based on text input. This task can be a wide range of
actions such as answering a question, performing a search, making a
prediction, or taking some sort of action.
Let’s say you have a use case where you need to gather sentiments about
how your customers feel about your product or service. You can use the
sentiment analysis task model, which is a classification model. Same for vision tasks - if
you need to perform occupancy analytics, there is a task-specific model for
your use case. So those are some examples of foundation models we can
use, but can GenAI help with code for your apps? Absolutely! Shown here are
generative AI applications. You can see there's quite a lot.
Let’s look at an example of code generation shown in the second block under
the code at the top. In this example, I’ve input a code file conversion
problem - converting from Python to JSON. I use Gemini and insert into the
prompt box: “I have a Pandas DataFrame with two columns – one with the
filename and one with the hour in which it is generated. I am trying to
convert it into a JSON file in the format shown on screen.” Gemini returns the
steps I need to do this. And here my output is in a JSON format. Pretty cool,
huh? Well, get ready, it gets even better.
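For a sense of what such a conversion involves, here is a minimal sketch using only the Python standard library. The target JSON format is shown on screen in the video and is not in this transcript, so the layout below (one entry per hour, listing that hour's files) is purely an assumption. The rows are made-up stand-ins for the two-column DataFrame. With pandas, the same rows could be obtained from `df.to_dict(orient="records")`.

```python
import json
from collections import defaultdict

# Stand-in for the two-column DataFrame; these rows are made up.
rows = [
    {"filename": "a.csv", "hour": 9},
    {"filename": "b.csv", "hour": 9},
    {"filename": "c.csv", "hour": 14},
]

# Assumed target layout: one entry per hour, listing that hour's files.
files_by_hour = defaultdict(list)
for row in rows:
    files_by_hour[row["hour"]].append(row["filename"])

records = [
    {"hour": hour, "files": files}
    for hour, files in sorted(files_by_hour.items())
]
output = json.dumps(records, indent=2)
print(output)
```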
The first is Vertex AI Studio. Vertex AI Studio lets you quickly explore and
customize Generative AI models that you can leverage in your applications
on Google Cloud. Vertex AI Studio helps developers create and deploy
Generative AI models by providing a variety of tools and resources that make
it easy to get started. For example, there is a library of pre-trained models,
a tool for fine-tuning models, a tool for deploying models to production, and
a community forum for developers to share ideas and collaborate.
Next, we have Vertex AI, which is particularly helpful for all of you who don't
have much coding experience. You can build generative AI search and
conversations for customers and employees with Vertex AI Agent Builder
(formerly Vertex AI Search and Conversation). Build with little or no coding
and no prior machine learning experience. Vertex AI can help you create
your own chatbots, digital assistants, custom search engines, knowledge
bases, training applications, and more.