LLM Models

Uploaded by

mevi.programs

Large Language Models
Agenda
Introduction
Why LLMs?
About LLMs (Large Language Models)
NLP (Natural Language Processing)
LMs (Language Models)
Introduction
Why LLMs?
1. Versatility: Because they are trained at large scale on diverse textual data, LLMs can perform a wide range of natural language processing tasks, such as text generation, translation, summarization, and sentiment analysis.
2. Contextual Understanding: LLMs excel at understanding context within text, so their responses are contextually relevant and coherent, which leads to more human-like interactions.
3. Scalability: With their massive size and parallel processing capabilities, LLMs can handle large volumes of data efficiently, which makes them suitable for complex language tasks at scale.
4. Continuous Learning: LLMs can be fine-tuned on specific datasets or domains, which lets them adapt to new use cases and improve their performance on them.
Questions we hear about LLMs
• Is the LLM hype real? Is this an iPhone moment?
• Are LLMs a threat or an opportunity?
• How to leverage LLMs to gain a competitive advantage?
• How to quickly apply LLMs to my data?
LLMs are more than hype
They are revolutionizing every industry
• “Chegg shares drop more than 40% after company says ChatGPT is killing its business” (05/02/2023, Link)
• “[...] ask GitHub Copilot to explain a piece of code. Bump into an error? Have GitHub Copilot fix it. It’ll even generate unit tests so you can get back to building what’s next.” (03/22/2023*, Link)
• “[YouChat is an] AI search assistant that you can talk to right in your search results. It stays up-to-date with the news and cites its sources so that you can feel confident in its answers.” (12/23/2022, Link)
LLMs are not that new
Why should I care now?

Accuracy and effectiveness have hit a tipping point
• Many new use cases are unlocked!
• Accessible by all.

Readily available data and tooling
• Large datasets.
• Open-sourced model options.
• Powerful GPUs are required, but they are available in the cloud.
What is an LLM?
It’s a large language model trained on enormous amounts of data
What does an LLM do?
LLMs automate many human-led tasks
Choose the right LLM
There is no “perfect” model. Trade-offs are required.
Decision criteria
• Model quality
• Serving cost
• Serving latency
• Customizability
Primer on NLP

Natural Language Processing


What is NLP?
We use NLP every day
NLP is useful for a variety of domains
• Sentiment analysis (product reviews): “This book was terrible and went on and on about…” → Negative
• Translation: “I like this book.” → “Me gusta este libro.”
• Question answering (chatbots): “What’s the best scifi book ever?” → “It really depends on your preferences. Some of the top-rated ones include…”

Other use cases
Semantic similarity
• Literature search.
• Database querying.
• Question-Answer matching.
Summarization
• Clinical decision support.
• News article sentiments.
• Legal proceeding summary.
Text classification
• Customer review sentiments.
• Genre/topic classification.
Some useful NLP definitions
The moon, Earth's only natural satellite, has been a subject of fascination and wonder for thousands of years.

Token: basic building block
• The
• Moon
• ,
• Earth’s
• Only
• …..
• years

Sequence: sequential list of tokens
• The moon,
• Earth’s only natural satellite
• Has been a subject of
• ….
• Thousands of years

Vocabulary: complete list of tokens
{1:"The", 569:"moon", 122:",", 430:"Earth", 50:"**’s", …}
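The token, sequence, and vocabulary definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regex splitting rule and the ID numbering (IDs assigned in order of first appearance) are invented here and will not match the IDs shown on the slide, and real LLM tokenizers use learned subword schemes instead.

```python
import re

SENTENCE = ("The moon, Earth's only natural satellite, has been a subject of "
            "fascination and wonder for thousands of years.")


def tokenize(text):
    """Split text into word, "'s", and punctuation tokens (toy rule, assumed)."""
    return re.findall(r"'s|\w+|[^\w\s]", text)


def build_vocabulary(tokens):
    """Map each distinct token to an integer ID, in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab) + 1
    return vocab


def encode(tokens, vocab):
    """Turn a sequence of tokens into a sequence of vocabulary IDs."""
    return [vocab[token] for token in tokens]


tokens = tokenize(SENTENCE)       # the sequence: an ordered list of tokens
vocab = build_vocabulary(tokens)  # the vocabulary: every distinct token
ids = encode(tokens, vocab)       # the numeric form a model would consume

print(tokens[:6])
print(ids[:6])
```

Repeated tokens, such as the two commas, map to the same ID, which is why the vocabulary is smaller than the sequence.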
Types of sequence tasks
• Translation: sequence-to-sequence prediction
  Input (sequence of text): “I like this book.”
  Output (sequence of text): “Me gusta este libro.”
• Sentiment analysis (product reviews): sequence-to-non-sequence prediction
  Input (sequence of text): “This book was terrible and went on and on about…”
  Output (label): Negative
• Question answering (chatbots): sequence-to-sequence generation
  Input (sequence of text): “What’s the best scifi book ever?”
  Output (sequence of text): “It really depends on your preferences. Some of the top-rated ones include…”
NLP goes beyond text
• Speech recognition
• Image caption generation
• Image generation from text


Text interpretation is challenging
• “The ball hit the table and it broke.” Language is ambiguous. Context can change the meaning.
• “What’s the best sci-fi book ever?” There can be multiple good answers.

Input data format matters.
Lots of work has gone into text representation for NLP.
Model size matters.
Big models help to capture the diversity and complexity of human language.
Training data matters.
It helps to have high-quality data and lots of it.
Language Models: How to predict and analyze text
What is a Language Model?

The term Large Language Models is everywhere these days.
But let’s take a closer look at that term:

Large Language Model: What is a Language Model?
Large Language Model: What makes these models “larger” than other language models?
What is a Language Model?
LMs assign probabilities to word sequences: find the most likely word

Categories:
• Generative: find the most likely next word
• Classification: find the most likely classification/answer
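To make "assign probabilities to word sequences" concrete, here is a minimal sketch of the generative category using a bigram count model. It is an assumption-laden toy: the corpus is invented for illustration, the function names are hypothetical, and modern LLMs estimate these probabilities with neural networks, not raw counts.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
CORPUS = [
    "i like this book",
    "i like this movie",
    "i like sci-fi",
    "this book was terrible",
]


def train_bigram_counts(sentences):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, prev):
    """Estimate P(next word | previous word) from relative frequencies."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {word: count / total for word, count in followers.items()}


def most_likely_next(counts, prev):
    """Return the most probable next word, or None if prev was never seen."""
    probs = next_word_probabilities(counts, prev)
    return max(probs, key=probs.get) if probs else None


counts = train_bigram_counts(CORPUS)
print(next_word_probabilities(counts, "like"))
print(most_likely_next(counts, "like"))
```

A classification model follows the same idea, but the probabilities are over a fixed set of labels (for example, Positive or Negative) instead of over the next word.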
What is a Large Language Model?
Language Model | Description | “Large”? | Emergence
Bag-of-Words Model | Represents text as a set of unordered words, without considering sequence or context | No | 1950s-1960s
N-gram Model | Considers groups of N consecutive words to capture sequence | No | 1950s-1960s
Hidden Markov Models (HMMs) | Represents language as a sequence of hidden states and observable outputs | No | 1980s-1990s
Recurrent Neural Networks (RNNs) | Processes sequential data by maintaining an internal state, capturing context of previous inputs | No | 1990s-2010s
Long Short-Term Memory (LSTM) Networks | Extension of RNNs that captures longer-term dependencies | No | 2010s
Transformers | Neural network architecture that processes sequences of variable length using a self-attention mechanism | Yes | 2017-Present
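The difference between the first two entries in the comparison above can be shown with a short sketch. The two example sentences and the helper names are assumptions made up for illustration; the point is only that a bag-of-words representation discards word order, while N-grams keep some of it.

```python
from collections import Counter


def bag_of_words(text):
    """Count words while ignoring their order."""
    return Counter(text.lower().split())


def ngrams(text, n):
    """Return every run of n consecutive words, preserving local order."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


a = "the dog bit the man"
b = "the man bit the dog"

# Same words, same counts: bag-of-words cannot tell the sentences apart.
print(bag_of_words(a) == bag_of_words(b))

# The bigrams differ, so an N-gram model sees two different sentences.
print(ngrams(a, 2))
print(ngrams(b, 2))
```

N-grams only capture order within a window of N words, which is one reason later architectures in the list were developed to handle longer context.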
Natural Language Processing (NLP)
Let’s review
• NLP is a field of methods to process text.
• NLP is useful: summarization, translation, classification, etc.
• Language models (LMs) predict words by looking at word probabilities.
• Large LMs are just LMs with transformer architectures, but bigger.
• Tokens are the smallest building blocks of text. Each token is converted to a numerical vector, aka an N-dimensional embedding.
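The last review point, turning tokens into numerical vectors, can be sketched as a simple table lookup. Everything here is an assumption for illustration: the three-entry vocabulary, the dimension of 4, and the random vector values. In a trained model the embedding values are learned, and the dimension is typically in the hundreds or thousands.

```python
import random

EMBEDDING_DIM = 4  # tiny on purpose; an assumed value for illustration


def make_embedding_table(vocab, dim, seed=0):
    """Create one random vector per token ID (stand-in for learned values)."""
    rng = random.Random(seed)
    return {
        token_id: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for token_id in vocab.values()
    }


def embed(tokens, vocab, table):
    """Look up the embedding vector for each token in the sequence."""
    return [table[vocab[token]] for token in tokens]


vocab = {"The": 1, "moon": 2, ",": 3}  # hypothetical toy vocabulary
table = make_embedding_table(vocab, EMBEDDING_DIM)
vectors = embed(["The", "moon", ",", "moon"], vocab, table)

for token, vector in zip(["The", "moon", ",", "moon"], vectors):
    print(token, [round(x, 2) for x in vector])
```

The same token always maps to the same vector, so both occurrences of "moon" print identical values.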
Thank you
