The document provides an overview of Natural Language Processing (NLP), including its definition, stages, and key concepts such as language modeling, grammar, and tokenization. It discusses various techniques and tools used in NLP, such as morphological parsing, stemming, and popular libraries like NLTK and spaCy. Additionally, it highlights the importance of datasets, error detection, and the role of parts of speech in understanding language structure.

Part – A (2 Mark Questions)

1. What is Natural Language Processing (NLP)?

ANS:

NLP (Natural Language Processing) is how computers learn to understand and work with human language. It's what makes things like chatbots, voice assistants, and translation apps work.

For example:

 If you ask Siri a question, NLP helps Siri understand what you mean.

 When you type a sentence, autocorrect uses NLP to fix errors.

 Google Translate uses NLP to turn text from one language into
another.

2. List the stages involved in NLP.

ANS:

The stages involved in Natural Language Processing (NLP) are as follows:

1. Text Preprocessing

 Cleaning and preparing the text data for analysis.

 Common steps:

o Tokenization: Breaking text into smaller units like words or


sentences.

o Lowercasing: Converting text to lowercase for uniformity.

o Stopword Removal: Removing common words like "is," "the,"


etc., that don't add meaning.

o Lemmatization/Stemming: Reducing words to their base or


root form (e.g., "running" → "run").

2. Syntactic Analysis (Parsing)

 Analyzing the grammatical structure of sentences.

 Identifying parts of speech (nouns, verbs, etc.).

 Building parse trees to understand sentence structure.

3. Semantic Analysis
 Understanding the meaning of words and sentences.

 Resolving ambiguity (e.g., "bank" as a riverbank or a financial bank).

 Tasks include Named Entity Recognition (NER) and word-sense


disambiguation.

4. Discourse Analysis

 Understanding the context and relationships between sentences.

 Example: Determining whether "he" refers to a specific person


mentioned earlier.

3. Define language modeling in NLP?

ANS:

Language modeling in NLP is the process of building a statistical or machine learning model that predicts the likelihood of a sequence of words. It helps the computer understand how words and sentences are structured and how they occur in natural language.

Key Goal:

Predict the next word in a sequence. For example, in "I am going to the ___," the model might predict "store" or "park."

Types of Language Models:

1. Statistical Language Models:

o Based on probabilities of word sequences (e.g., n-grams).

2. Neural Language Models:

o Use deep learning to capture complex relationships (e.g.,


RNNs, LSTMs, Transformers).

Applications:

 Text prediction and autocomplete.

 Machine translation.

 Speech recognition.

 Chatbots.
4. What is the role of grammar in NLP?
ANS:

 Grammar plays a crucial role in Natural Language Processing (NLP) as it helps machines understand the structure and rules of human language.

 It provides a foundation for analyzing and generating text in a way that is syntactically correct and meaningful.

Key Roles of Grammar in NLP:

1. Syntactic Parsing:
o Grammar is used to analyze the structure of sentences,
identifying parts of speech (nouns, verbs, etc.) and their
relationships.
2. Disambiguation:
o Grammar helps resolve ambiguities in sentences by providing
context.
3. Sentence Generation:
o Grammar ensures that machines generate human-like text
that follows the rules of language.
4. Error Detection and Correction:
o Tools like spell checkers and grammar checkers rely on
grammatical rules to identify and fix errors in text.

5. Name two datasets commonly used in NLP.

ANS:
1. IMDB Reviews Dataset
 Description: A dataset of 50,000 movie reviews from the IMDB
website, labeled as positive or negative.
 Usage:
o Sentiment analysis.
o Text classification.
 Format: Plain text with binary labels (positive/negative).
2. SQuAD (Stanford Question Answering Dataset)
 Description: A reading comprehension dataset containing over
100,000 questions and answers based on Wikipedia articles.
 Usage:
o Question answering tasks.
o Evaluating machine comprehension of text.
 Format: JSON files with passages, questions, and answers.
6. What is the difference between morphological and syntactic
analysis?
ANS:

Morphological Analysis vs. Syntactic Analysis:

 Morphological analysis studies the structure and formation of words; syntactic analysis studies the structure and arrangement of words in sentences.

 Morphological analysis focuses on individual words; syntactic analysis focuses on phrases and whole sentences.

 Morphological analysis identifies root forms, prefixes, and suffixes; syntactic analysis determines how words are related and organized in sentences.

 The input to morphological analysis is a single word (e.g., "running"); the input to syntactic analysis is a complete sentence (e.g., "The cat sat on the mat.").

 Typical morphological techniques are stemming and lemmatization; typical syntactic techniques are context-free grammars and dependency parsing.

 Morphological analysis is useful for spell-checking and text normalization; syntactic analysis is used in machine translation and question answering.

 Example: "cats" → root "cat", plural "yes"; "The cat sat" → subject "cat", verb "sat".

7. What are augmented transition networks?


ANS:

Augmented Transition Networks (ATNs) are a state-based computational model used in NLP to parse natural language sentences. They extend finite state automata by incorporating conditions and actions on transitions, allowing for recursive handling of complex and context-sensitive grammatical structures.

 ATNs can represent complex, recursive grammars to handle nested structures in language.

 ATNs handle context-sensitive rules, enabling more accurate natural language parsing.

 ATNs explore multiple interpretations of a sentence until one satisfies all conditions.

 ATNs were widely used in early NLP systems for sentence parsing and machine translation.

8. Name two popular NLP libraries and their uses.

ANS:

 NLTK (Natural Language Toolkit): A comprehensive library for text processing, including tokenization, stemming, part-of-speech tagging, and syntactic parsing. It is widely used for teaching and research in NLP.

 spaCy: A fast and efficient NLP library focused on production use, offering pre-trained models for tasks like tokenization, named entity recognition (NER), dependency parsing, and text classification.
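As an illustration, here is a minimal sketch, assuming NLTK's tokenizer data and spaCy's small English model "en_core_web_sm" are already installed, showing tokenization with NLTK and named entity recognition with spaCy:

import nltk   # requires nltk.download("punkt") for the tokenizer data
import spacy  # requires: python -m spacy download en_core_web_sm

text = "Apple is looking at buying a U.K. startup for $1 billion."

# NLTK: word tokenization
tokens = nltk.word_tokenize(text)
print(tokens)

# spaCy: named entity recognition
nlp = spacy.load("en_core_web_sm")
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_)   # e.g. "Apple" ORG, "U.K." GPE, "$1 billion" MONEY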

9. What is a lexicon in NLP?

ANS:

A lexicon in NLP refers to a collection or database of words and their associated meanings, part-of-speech information, and other linguistic features (such as tense, pluralization, etc.). It serves as a dictionary for a language, helping models understand word usage and relationships.

10. Define the term "semantic analysis."

ANS:

Semantic analysis in NLP is the process of understanding the meaning of words, sentences, or entire texts. It involves resolving ambiguities (such as polysemy) and interpreting relationships and context to derive the underlying meaning, enabling machines to comprehend and generate human-like language.
11. Explain how morphological parsing is applied in
NLP.

ANS:

Morphological parsing in NLP is the process of analyzing and breaking down words into their constituent morphemes (the smallest meaningful units) to understand their structure and meaning. Here's how it is applied:

Tokenization: The text is split into words or tokens, which are then analyzed at the morphological level.

Root Identification: The system identifies the root form of words, removing prefixes or suffixes (e.g., "running" → "run").

Prefix and Suffix Analysis: The system recognizes common prefixes, suffixes, or inflections (e.g., "un-" in "unhappy," "-ed" in "walked").

Lemmatization and Stemming: It may reduce words to their lemma (the dictionary form, e.g., "better" → "good") or stem (e.g., "running" → "run").

Part-of-Speech Tagging: The parsed word is classified based on its role (e.g., noun, verb), which helps in further processing like sentence structure analysis.

Language Understanding: Morphological parsing aids in tasks like text normalization, named entity recognition (NER), and syntactic parsing, ensuring that the system comprehends different forms of a word.

12. How do regular expressions contribute to text


processing?

ANS:

 Regular expressions allow for searching and finding text that matches specific patterns, such as dates or keywords.

 They enable the extraction of specific parts of a text, like email addresses or phone numbers.

 Regular expressions can replace patterns in a text, useful for tasks like text cleaning or formatting.

 They are used to validate whether text matches a required format, such as checking for valid email addresses.

 Regular expressions make text processing efficient by quickly locating or modifying specific patterns in large datasets.
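A small illustration with Python's built-in re module (the patterns shown are simplified examples, not production-grade validators):

import re

text = "Contact us at support@example.com or sales@example.org before 31-12-2024."

# Extraction: find all email addresses
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)

# Substitution: mask dates during text cleaning
cleaned = re.sub(r"\d{2}-\d{2}-\d{4}", "<DATE>", text)

# Validation: check whether a string looks like an email address
is_valid = re.fullmatch(r"[\w.+-]+@[\w-]+\.[\w.]+", "user@example.com") is not None

print(emails)     # ['support@example.com', 'sales@example.org']
print(cleaned)
print(is_valid)   # True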

13. Differentiate between syntactic and semantic


analysis.

ANS:

Syntactic Analysis vs. Semantic Analysis:

 Syntactic analysis focuses on the structure of sentences and grammatical rules; semantic analysis focuses on the meaning of words, phrases, or entire sentences.

 Syntactic analysis identifies relationships between words (e.g., subject, verb); semantic analysis resolves word meanings and ambiguity based on context.

 Syntactic analysis produces a syntactic tree or grammatical structure; semantic analysis produces meanings, interpretations, or representations.

 Typical syntactic tasks are sentence parsing and part-of-speech tagging; typical semantic tasks are word sense disambiguation and named entity recognition.

 Syntactic analysis deals with how words are arranged in a sentence; semantic analysis deals with what the sentence or words mean in a specific context.

14. What is the significance of tokenization in NLP?

ANS:

Significance of Tokenization in NLP:

 Tokenization breaks text into smaller units like words or phrases, which is essential for further analysis and processing.

 It simplifies text data, making it easier to analyze, especially for tasks like part-of-speech tagging, named entity recognition, and machine learning.

 Tokenization helps in text preprocessing by standardizing and structuring input data for NLP models.
15. Describe the process of noise removal in a
generic NLP pipeline.
ANS:

 Lowercasing: Convert all text to lowercase to maintain consistency and avoid case-sensitivity issues.

 Removing Punctuation: Remove punctuation marks that do not add meaning to the analysis.

 Stopword Removal: Eliminate common words that do not contribute significant meaning to the text.

 Whitespace Removal: Remove extra spaces, tabs, or newline characters to clean the text.

 Spelling Correction: Correct misspelled words to standardize the text for further processing.

 Special Character Removal: Remove non-alphanumeric characters that are irrelevant to the analysis.

 Tokenization: Break text into smaller units and discard irrelevant parts or tokens.
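A minimal sketch of such a cleaning step in Python (the tiny stop-word list is illustrative; a real pipeline would typically use a fuller list such as NLTK's stopwords corpus):

import re
import string

STOPWORDS = {"the", "is", "in", "a", "an", "of", "and", "to"}  # illustrative subset

def clean_text(text):
    text = text.lower()                                                # lowercasing
    text = text.translate(str.maketrans("", "", string.punctuation))  # remove punctuation
    text = re.sub(r"\s+", " ", text).strip()                           # remove extra whitespace
    tokens = text.split()                                              # simple tokenization
    return [t for t in tokens if t not in STOPWORDS]                   # stop-word removal

print(clean_text("The cat, as usual, is sleeping in   the sun!"))
# ['cat', 'as', 'usual', 'sleeping', 'sun']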

16. How does spelling error detection enhance text


quality?
ANS:

Spelling error detection enhances text quality by improving readability, ensuring consistency, and reducing misunderstandings. Correcting spelling mistakes helps ensure that the text is easier to understand, more professional, and less likely to cause confusion, especially in tasks like text classification, machine translation, or sentiment analysis. It also aids in standardizing the text, making it more suitable for NLP models that rely on clean and accurate data.

17. Explain the importance of word classes in NLP.

ANS:

Word classes, or parts of speech (POS), are important in NLP because they help define the role of each word in a sentence, which is essential for understanding sentence structure and meaning. Identifying word classes allows NLP systems to perform tasks such as:

1. Syntactic Parsing: Understanding the grammatical structure and relationships between words (e.g., subject, verb, object).

2. Word Sense Disambiguation: Determining the correct meaning of a word based on its role in the sentence.

3. Text Classification: Identifying the category or intent of a text by recognizing patterns in word usage.

4. Named Entity Recognition (NER): Recognizing proper nouns (e.g., names, locations) by distinguishing them from other words.

5. Improving Accuracy: Helping NLP models make better predictions by recognizing the function of words in context.

18. Illustrate the use of finite state automata in word-


level analysis.

ANS:

Finite State Automata (FSA) are used in word-level analysis to model and process the structure of words in NLP. Here's how they work:

 Tokenization: FSAs identify word boundaries by detecting spaces or punctuation, breaking text into tokens.

 Stemming: FSAs reduce words to their root form by recognizing and removing suffixes (e.g., "running" → "run").

 Morphological Analysis: FSAs recognize different word forms, such as singular/plural or tense (e.g., "cats" → "cat").

 Spell-Checking: FSAs identify patterns or common misspellings in words and suggest corrections.

19. Describe the role of text pre-processing in NLP.


ANS:

 Noise Removal: Eliminate irrelevant characters like punctuation, special symbols, and stopwords from the text.

 Normalization: Standardize the text by converting it to lowercase, correcting spelling errors, and handling contractions.

 Tokenization: Split text into smaller units, such as words or sentences, for easier analysis.

 Lemmatization and Stemming: Reduce words to their base or root form to ensure consistency.

 Vectorization: Convert text into a numerical format suitable for machine learning models.

20. Why are datasets essential for language modeling?


Datasets are essential for language modeling because they provide the
necessary examples of natural language for training algorithms, helping
the model learn the statistical patterns, structure, and relationships
between words. This enables the model to understand syntax, predict the
next word in a sequence, and generate coherent language in tasks like
text generation, machine translation, and speech recognition.

21. Define stemming.


Stemming is a process in NLP where words are reduced to their base or
root form by removing prefixes and suffixes. The output may not always
be a valid word in the dictionary, but it helps to group similar words
together for text analysis (e.g., "jumping," "jumps," and "jumped" are all
reduced to "jump").

22. List three stemming algorithms.

1. Porter Stemmer: A widely used algorithm that applies a series of rules to remove common suffixes from English words.

2. Snowball Stemmer: An improvement on the Porter Stemmer, providing better accuracy and flexibility across different languages.

3. Lancaster Stemmer: A more aggressive stemming algorithm that tends to over-stem words but is faster than others.
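All three stemmers are available in NLTK, so a quick comparison might look like this sketch (the outputs noted in the comment are typical results):

from nltk.stem import PorterStemmer, SnowballStemmer, LancasterStemmer

words = ["running", "happiness", "flies", "studies"]

porter = PorterStemmer()
snowball = SnowballStemmer("english")
lancaster = LancasterStemmer()

for w in words:
    print(w, "->", porter.stem(w), snowball.stem(w), lancaster.stem(w))
# e.g. "happiness" -> "happi" with Porter/Snowball, while Lancaster is usually more aggressive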

23. What is the purpose of lemmatization?


The purpose of lemmatization is to reduce words to their base form
(lemma) based on their meaning, ensuring the word is contextually
correct and appears as it would in a dictionary (e.g., "running" becomes
"run," and "better" becomes "good"). Unlike stemming, lemmatization
ensures that the root word is always a valid word.
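For example, NLTK's WordNet-based lemmatizer (which requires the WordNet data to be downloaded) returns valid dictionary forms when given the word's part of speech:

from nltk.stem import WordNetLemmatizer   # requires nltk.download("wordnet")

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("running", pos="v"))  # 'run'  (verb)
print(lemmatizer.lemmatize("better", pos="a"))   # 'good' (adjective)
print(lemmatizer.lemmatize("flies", pos="n"))    # 'fly'  (noun)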
24. Differentiate between rule-based and machine-based
lemmatization.

Here's a simplified comparison of rule-based and machine-based lemmatization:

 Rule-based lemmatization uses predefined linguistic rules to convert words to their lemma; machine-based lemmatization uses machine learning models to predict the correct lemma based on context.

 Rule-based approaches can be less accurate because they follow fixed rules; machine-based approaches are generally more accurate because they adapt to context learned from data.

 Rule-based approaches have limited flexibility and rely on specific rules and dictionaries; machine-based approaches are more flexible and handle complex, ambiguous cases by learning from large datasets.

 Rule-based approaches struggle with exceptions or irregular word forms; machine-based approaches handle exceptions better by understanding context.

 Example: a rule-based lemmatizer maps "flies" → "fly" using a rule like removing "es"; a machine-based lemmatizer maps "better" → "good" based on context in a sentence.

25. What is the significance of stop-word removal in NLP?


Stop-word removal is significant because it reduces the size of the dataset
by removing common, high-frequency words that do not add much value
to text analysis, such as articles, prepositions, and conjunctions (e.g.,
"the," "is," "in"). By eliminating these words, the focus shifts to more
meaningful content, improving the accuracy and efficiency of tasks like
sentiment analysis, information retrieval, and text classification.

26. Define Parts of Speech (POS) tagging.

 Parts of Speech (POS) tagging is a process in Natural Language Processing (NLP) where each word in a sentence is assigned a tag that corresponds to its grammatical role (e.g., noun, verb, adjective, adverb).

 This is done based on the word's definition, syntactic position, and context within the sentence. POS tagging helps in understanding sentence structure, which is essential for various NLP tasks such as parsing, named entity recognition, and machine translation.
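A quick sketch with NLTK's off-the-shelf tagger (the Penn Treebank tags in the comment are typical output; the tagger's model data must be downloaded first):

import nltk   # requires nltk.download("punkt") and nltk.download("averaged_perceptron_tagger")

tokens = nltk.word_tokenize("The quick brown fox jumps over the lazy dog")
print(nltk.pos_tag(tokens))
# [('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'), ('fox', 'NN'), ('jumps', 'VBZ'), ...]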

27. What are N-grams in NLP?


N-grams are sequences of n contiguous elements (usually words or characters) from a text. These sequences are used in NLP for various tasks, such as text modeling and prediction. For example, in the sentence "I love programming," the following are N-grams:

 Unigrams: "I", "love", "programming"

 Bigrams: "I love", "love programming"

 Trigrams: "I love programming"

N-grams are essential for language modeling, capturing the relationships between words, and enabling better predictions in tasks like speech recognition and machine translation.
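Generating these N-grams in Python is straightforward, for example with NLTK's ngrams helper:

from nltk.util import ngrams

tokens = "I love programming".split()

unigrams = list(ngrams(tokens, 1))   # [('I',), ('love',), ('programming',)]
bigrams  = list(ngrams(tokens, 2))   # [('I', 'love'), ('love', 'programming')]
trigrams = list(ngrams(tokens, 3))   # [('I', 'love', 'programming')]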

28. What is the Bag of Words representation?

The Bag of Words (BoW) representation is a simplified text model used to represent text data for machine learning and NLP tasks.

It treats text as a "bag" (or collection) of words, ignoring grammar and word order but keeping track of the frequency of each word's occurrence.

This model is often used in text classification, sentiment analysis, and document clustering. While BoW is efficient, it doesn't preserve any syntactic structure or semantic meaning of the text.
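One common way to build a BoW matrix is scikit-learn's CountVectorizer; a minimal sketch (note that its default tokenizer lowercases text and drops single-character tokens):

from sklearn.feature_extraction.text import CountVectorizer

docs = ["I love NLP", "I love programming", "NLP loves data"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)        # sparse document-term count matrix

print(vectorizer.get_feature_names_out()) # the learned vocabulary
print(X.toarray())                        # one row of word counts per document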

29. Name two smoothing techniques for N-grams.

1. Additive Smoothing (Laplace Smoothing): Adds a small constant to all counts to avoid zero probabilities for unseen N-grams in a corpus. For example, adding 1 to all counts in a unigram model ensures no probability is zero.

2. Good-Turing Smoothing: Adjusts probabilities of N-grams based on the frequency of N-grams that appear only once. It redistributes probability mass from more frequent N-grams to those that appear less frequently, ensuring that rare or unseen N-grams are accounted for.
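A tiny worked example of add-one (Laplace) smoothing for unigram probabilities, using only the standard library:

from collections import Counter

corpus_tokens = "the cat sat on the mat the cat ate".split()
counts = Counter(corpus_tokens)
vocab_size = len(counts)      # 6 distinct words
total = len(corpus_tokens)    # 9 tokens

def unigram_prob(word, k=1.0):
    # Additive (Laplace) smoothing: add k to every count, including unseen words
    return (counts[word] + k) / (total + k * vocab_size)

print(unigram_prob("cat"))    # seen word:   (2 + 1) / (9 + 6) = 0.2
print(unigram_prob("dog"))    # unseen word: (0 + 1) / (9 + 6) ≈ 0.067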
30. Define dependency grammar in NLP.
Dependency grammar is a syntactic theory in NLP that represents the
relationships between words in a sentence, where each word is dependent
on another, typically a "head" word. In this structure, the main verb of a
sentence acts as the head, and other words, such as subjects, objects, or
adjectives, depend on it. For example, in the sentence "She eats an
apple," "eats" is the head, and "She" and "apple" are dependent on it.
Dependency grammar helps in understanding sentence structure and is
widely used for tasks such as syntactic parsing, machine translation, and
question answering.

31. Compare stemming and lemmatization with an example in a table.

Stemming vs. Lemmatization:

 Definition: Stemming reduces words to their root form by stripping suffixes or prefixes; lemmatization converts words to their base or dictionary form (lemma) based on context.

 Output: The stem may not always be a valid word; the lemma is always a valid word that exists in the dictionary (e.g., "running" → "run").

 Accuracy: Stemming may lead to non-dictionary words or over-stemming; lemmatization is more accurate as it considers word meaning and context.

 Speed: Stemming is faster because it applies simple rules; lemmatization is slower due to its reliance on dictionary lookup or algorithms.

 Example: "flies" → "fli" (the Porter Stemmer strips "es"); "flies" → "fly" (a lemmatizer considers context and returns the valid lemma).

32. How does the Porter Stemmer algorithm work?

 The Porter Stemmer algorithm works by applying a series of rule-based transformations to remove common English suffixes from words.
 The algorithm operates in multiple steps, progressively stripping
away suffixes like "-ing," "-ed," "-es," "-ly," and others based on the
word's context.
 The goal is to reduce a word to its base form or stem, which may not
always be a valid word in the dictionary (e.g., "happiness" →
"happi").
 The algorithm is efficient but can sometimes over-stem words.

33. Explain the significance of word embeddings in NLP.


Word embeddings represent words as dense vectors in a continuous
vector space, where words with similar meanings are located close to
each other. This allows machines to capture semantic relationships
between words, improving tasks such as sentiment analysis, machine
translation, and text classification. Unlike traditional one-hot encoding,
word embeddings preserve contextual meaning, handle synonyms
effectively, and improve the performance of NLP models by providing
richer word representations.

34. Differentiate between unsmoothed and smoothed N-grams in a table.

Unsmoothed N-grams vs. Smoothed N-grams:

 Definition: Unsmoothed probabilities are calculated based only on the frequency of observed N-grams; smoothed models add a small value to all N-grams (including unseen ones) to avoid zero probability.

 Handling unseen N-grams: Unsmoothed models assign zero probability to unseen N-grams, leading to poor performance; smoothed models redistribute some probability mass to them.

 Accuracy: Unsmoothed models can give inaccurate predictions for rare or unseen words; smoothed models improve accuracy by accounting for unseen or rare N-grams.

 Example: "I am running" might be rare, so its unsmoothed probability might be zero; with smoothing, its probability is adjusted to a small non-zero value.

 Common techniques: simple count-based probability (unsmoothed); additive (Laplace) and Good-Turing smoothing (smoothed).

35. How does TF-IDF enhance document representation?

TF-IDF (Term Frequency-Inverse Document Frequency) enhances document representation by weighing words according to their importance in a document relative to a corpus. It balances the frequency of a word in a document (TF) with how common or rare the word is across the entire corpus (IDF). Words that appear frequently in a document but rarely in other documents are given higher importance, helping to better represent the document's unique content and improve tasks like text classification and search ranking.
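A minimal sketch with scikit-learn's TfidfVectorizer, where words shared across documents (such as "the") receive lower weights than words distinctive to a single document:

from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)          # TF-IDF weighted document-term matrix

print(vectorizer.get_feature_names_out())
print(X.toarray().round(2))                 # higher values mark more distinctive words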

36. Describe the role of Word2Vec in vector representation.

Word2Vec is a model that transforms words into vector representations, capturing semantic relationships between them. Words with similar meanings are represented by vectors that are close in the vector space, enabling better understanding of word similarity and context.
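For illustration, a tiny skip-gram model can be trained with gensim (this sketch assumes gensim 4.x; a real model needs a far larger corpus than these toy sentences):

from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "log"],
    ["cats", "and", "dogs", "are", "animals"],
]

# sg=1 selects the skip-gram variant; vector_size is the embedding dimensionality
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=100)

print(model.wv["cat"][:5])                   # first few dimensions of the "cat" vector
print(model.wv.most_similar("cat", topn=3))  # nearest neighbours in the vector space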

37. What are the differences between stochastic and rule-based POS tagging?

 Stochastic POS tagging uses probabilistic models based on training data to predict part-of-speech tags, incorporating the likelihood of tags based on context.

 Rule-based POS tagging relies on a set of predefined linguistic rules to assign tags, typically using regular expressions and syntactic patterns.

38. Explain the role of parsing in NLP.


Parsing in NLP involves analyzing the syntactic structure of a
sentence, identifying the relationships between words, and
organizing them into a tree structure to understand grammatical
relationships and sentence meaning.

39. Why is evaluating N-grams critical in language


modeling?
Evaluating N-grams helps capture the likelihood of word sequences,
allowing models to predict the next word in a sequence by
considering the previous words, improving accuracy in language
tasks like text generation and translation.
40. Illustrate how dependency parsing helps understand
sentence structure.
Dependency parsing identifies how words in a sentence are related
to each other by establishing directed links between words. This
structure helps in understanding grammatical relationships like
subject-object and modifying actions, improving sentence
comprehension.
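As a sketch, spaCy's pre-trained pipeline (assuming "en_core_web_sm" is installed) exposes these directed links through each token's head and dependency label:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("She eats an apple")

for token in doc:
    # token.dep_ is the dependency relation; token.head is the word it depends on
    print(f"{token.text:>6} --{token.dep_}--> {token.head.text}")
# e.g. "She" --nsubj--> "eats" and "apple" --dobj--> "eats"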

41. Define semantic analysis.


Semantic analysis is the process of understanding and interpreting
the meanings of words, phrases, or sentences in context, going
beyond syntax to capture the true intent and meaning in a
language.

42. What is Word Sense Disambiguation?


Word Sense Disambiguation (WSD) is the task of determining which
meaning of a word is being used in a specific context, as many
words have multiple meanings depending on the situation.

43. What is lexical semantics in NLP?


Lexical semantics focuses on the meanings of words and their
relationships to one another, including concepts like synonyms,
antonyms, homonyms, and polysemy, to understand how words
convey meaning.

44. List two key features of WordNet.

 Synonym sets (synsets): Groups of words that share the same meaning.

 Hierarchical structure: Words are organized in a tree-like structure based on relationships like hypernyms (broader terms) and hyponyms (more specific terms).
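Both features can be explored through NLTK's WordNet interface (assuming the WordNet corpus has been downloaded):

from nltk.corpus import wordnet as wn   # requires nltk.download("wordnet")

synsets = wn.synsets("car")
first = synsets[0]
print(first.definition())    # gloss of the first sense of "car"
print(first.lemma_names())   # synonyms in that synset
print(first.hypernyms())     # broader terms, e.g. motor_vehicle.n.01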

45. Define sequence-to-sequence models in NLP.


Sequence-to-sequence models are designed to transform an input
sequence (like a sentence) into an output sequence (like a
translation). They typically use an encoder to process the input and
a decoder to generate the output.

46. What is meaning representation in NLP?


Meaning representation in NLP is the process of converting natural
language into a structured format (like logical forms or vectors) that
can be easily understood and manipulated by computers for tasks
like reasoning and question answering.

47. Name two approaches used for word sense


disambiguation.

 Supervised learning: Uses labeled datasets to train models to


predict word senses based on context.

 Unsupervised learning: Uses clustering or statistical methods to


infer word senses based on the context without labeled data.

48. What is the purpose of embeddings in NLP?


Embeddings represent words or phrases as dense vectors in a
continuous space, capturing semantic relationships and enabling
efficient processing by machine learning models for tasks like
classification, sentiment analysis, and translation.

49. Define the term “lexical semantics.”


Lexical semantics is the study of word meanings and how words
relate to each other, including concepts like word senses, synonyms,
antonyms, and polysemy.
50. What are the key components of a sequence-to-
sequence model?

Ans:

A sequence-to-sequence (Seq2Seq) model is a type of neural network


used to convert one sequence (like a sentence) into another sequence
(like a translated sentence). It has two parts:

1. Encoder: Reads and processes the input sequence (e.g., a sentence


in English).

2. Decoder: Uses the information from the encoder to generate the


output sequence (e.g., the translated sentence in French).
51. Explain the role of WordNet in semantic analysis.

WordNet is a lexical database that supports semantic analysis by organizing words into synonym sets (synsets), defining relationships between them (such as hypernyms and hyponyms), and helping systems understand word meanings and their connections.

52. Differentiate between lexical semantics and meaning representation.

Lexical semantics studies the meanings of individual words and the relationships between them (synonyms, antonyms, polysemy), whereas meaning representation converts whole utterances into a structured, machine-usable form (such as logical forms or vectors) that captures what the sentence as a whole means.

53. How do sequence-to-sequence models process natural language?

Sequence-to-sequence models process natural language with an encoder-decoder architecture: the encoder reads the input sequence and compresses it into a context representation, and the decoder generates the output sequence token by token from that representation (see also questions 45 and 50).

54. Why is word sense disambiguation important in NLP?

Word Sense Disambiguation (WSD) is a subtask of Natural Language Processing that deals with the problem of identifying the correct sense of a word in context. Many words in natural language have multiple meanings, and WSD aims to disambiguate the correct sense of a word in a particular context. For example, the word "bank" has different meanings in the sentences "I deposited money in the bank" and "The boat went down the river bank".

Ex-

55. Describe the process of building a semantic model.

 Collect relevant text data for the task or domain.

 Preprocess the text by cleaning, normalizing, and tokenizing it.

 Extract features using techniques like embeddings (Word2Vec, GloVe) or TF-IDF.

 Design a model architecture to learn semantic relationships.

 Train the model using labeled or unsupervised data.

 Evaluate the model using performance metrics like accuracy or BLEU.

 Fine-tune the model to improve accuracy and generalization.

 Deploy the model for tasks like text understanding, classification, or translation.

Example: classifying customer reviews as positive or negative.

56. How do embeddings contribute to semantic analysis?


Embeddings contribute to semantic analysis by representing words
as dense vectors that capture their meanings and relationships. This
helps models understand context, find similar words, and identify
relationships, enabling better performance in tasks like sentiment
analysis and topic modeling.

57. Explain the challenges of meaning representation in NLP.

 Ambiguity: Words and sentences often have multiple meanings depending on context
(e.g., "bank" as a riverbank or a financial institution).
 Complex Sentence Structures: Long and nested sentences make capturing meaning
difficult.
 Idiomatic Expressions: Phrases like "kick the bucket" don’t translate literally.
 Cultural Differences: Meaning can vary based on cultural context.

58. How does lexical semantics enhance NLP tasks?


Lexical semantics helps NLP tasks by providing richer word
representations, allowing models to better understand the
meaning of words in context, which improves performance
in tasks like machine translation and text classification.

59. Summarize the advantages of using deep learning for semantic analysis.

Deep learning models learn rich, contextual word and sentence representations directly from data (for example through embeddings and attention), so they capture meaning, handle ambiguity, and generalize better than hand-crafted rules, improving tasks such as sentiment analysis, machine translation, and question answering.
60. Illustrate the importance of word sense in text
understanding.

Word sense plays a crucial role in text understanding because many words have multiple meanings depending on the context in which they appear. Correctly identifying the intended meaning of a word helps a model accurately interpret the text. For example, the word "bank" could refer to a financial institution or the side of a river. Without understanding the correct sense, a system might misinterpret the text, leading to errors in tasks like machine translation, sentiment analysis, or information retrieval. Word sense disambiguation ensures that the meaning is understood in context, enabling more precise and reliable language processing.

61. Define Long Short-Term Memory (LSTM).

 Long Short-Term Memory (LSTM) is a type of recurrent neural


network (RNN) architecture designed to model sequential data.
 It is capable of learning and remembering long-range dependencies
by using memory cells, which help overcome the vanishing gradient
problem that traditional RNNs face.
 LSTMs are widely used in NLP tasks such as language modeling, text
generation, and machine translation.

62. What is a Bidirectional LSTM?


A Bidirectional LSTM (BiLSTM) is an extension of the LSTM model in which two LSTMs are trained simultaneously: one processes the input sequence from left to right, while the other processes it from right to left.

72. Explain the significance of Bidirectional LSTMs in NLP.

The significance is:

 They process sequences in both directions, improving understanding by considering both past and future context.

 They enhance performance in tasks like translation by using both preceding and succeeding words.

 They help resolve sentence ambiguities.

 They improve tasks like question answering and part-of-speech tagging by considering the full context of words.
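A minimal Keras sketch of a BiLSTM text classifier (the vocabulary size and layer widths are illustrative placeholders, not tuned values):

import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None,)),                        # variable-length sequences of token ids
    layers.Embedding(input_dim=10000, output_dim=64),     # token ids -> dense vectors
    layers.Bidirectional(layers.LSTM(32)),                # reads the sequence in both directions
    layers.Dense(1, activation="sigmoid"),                 # e.g. positive/negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()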
63. List two differences between GRU and LSTM.

Key differences between GRU and LSTM:

 Gates: a GRU has 2 gates (reset and update); an LSTM has 3 gates (input, forget, and output).

 Memory: a GRU does not have a separate memory cell; an LSTM has a distinct memory cell for storing long-term information.

 Complexity: a GRU is simpler, with fewer parameters to train; an LSTM is more complex due to the extra gate and memory cell.

 Training speed: a GRU is generally faster to train because of its simpler structure; an LSTM may take longer due to its complexity.

 Performance: GRUs often perform well on small datasets; LSTMs tend to perform better on tasks with complex or large datasets.

64. Define the Transformer model in NLP.

The Transformer model in NLP is a neural network architecture designed for sequence-to-sequence tasks, like translation and text generation. It uses self-attention mechanisms to process the entire input sequence at once. Transformers are the backbone of modern models like BERT and GPT.

65. What is the self-attention mechanism?

The self-attention mechanism is a key component of the Transformer model that helps it understand how different words in a sentence relate to each other.

It works by calculating attention scores for each word in relation to all other words in the sentence. These scores tell the model which words are important and should be given more focus when processing the sentence.
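A compact NumPy sketch of scaled dot-product self-attention over toy word embeddings (the random projection matrices stand in for learned weights):

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project inputs to queries, keys, values
    scores = Q @ K.T / np.sqrt(Q.shape[-1])    # pairwise attention scores
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ V                          # context-aware blend of the values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                     # 4 "words", embedding size 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)      # (4, 8)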
69. Define Bidirectional Encoder Representations from
Transformers (BERT).

BERT (Bidirectional Encoder Representations from Transformers) is a deep learning model used for NLP. It is designed to understand the context of words in a sentence by looking at the text in both directions (left-to-right and right-to-left). BERT is pre-trained on a large amount of text data and can be fine-tuned for specific tasks like question answering, sentiment analysis, and named entity recognition.

66. Name two characteristics of BERT.

1) BERT reads text in both directions (left-to-right and right-to-left) to understand the full context of a word.

2) It is first trained on large amounts of text and then fine-tuned for specific tasks like question answering or sentiment analysis.

3) BERT uses the Transformer architecture, which relies on attention mechanisms to understand the relationships between words.

67. What does GPT stand for in NLP?


GPT stands for Generative Pretrained Transformer, a type of
Transformer-based model that is pre-trained on large corpora of text data
and fine-tuned for specific tasks. It is primarily used for text generation,
completion, and other generative tasks.

98. How do self-attention mechanisms work in generative


models?

 Input Representation: Each word is turned into a vector.

 For each word, self-attention computes three vectors:

 Query (Q): Represents the word you are focusing on.

 Key (K): Represents the words you're comparing the query to.

 Value (V): Represents the content that will be passed forward based on the comparison of the query and key.

 Attention Scores: The model checks how much each word should pay attention to other words.

 Focus on Important Words: Blend information from all words, emphasizing the most relevant ones.

 Output: Generate the next word or understand the context using the blended information.

68. What are multi-head attention mechanisms?


Multi-head attention mechanisms are a key component of the Transformer
model. They allow the model to focus on different parts of a sequence
simultaneously. Each "head" in the mechanism computes attention
independently, capturing diverse relationships (e.g., word meanings,
positions) across the sequence. The outputs from all heads are then
combined, providing richer and more detailed context understanding.


71. Compare GRU and LSTM models.

Same as the answer to question 63 above.

73. How does self-attention improve NLP tasks?

Same as the answer to question 65 above.


74. Describe the difference between Transformers and RNNs.

Transformers vs. RNNs:

 Transformers use attention mechanisms to process the whole sequence in parallel; RNNs process sequences step by step, one token at a time.

 Transformers handle long-range dependencies effectively; RNNs struggle with long-range dependencies due to vanishing gradients.

 Transformers allow parallelization, speeding up training; RNNs cannot be parallelized across time steps, leading to slower training.

 Transformers capture relationships from the entire sequence; RNNs only capture context from previous tokens.

 Transformers scale efficiently with large datasets and models; RNN performance degrades with large datasets.

75. Illustrate how BERT processes text inputs.

 BERT processes text bidirectionally, looking at both the left and


right context of words. It uses self-attention to understand the
relationships between all words in a sentence, which enhances its
ability to handle tasks like question answering and sentiment
analysis.

76. Summarize the role of GPT models in text generation.

 GPT (Generative Pretrained Transformer) models are autoregressive


and generate text by predicting the next word based on the
previous ones. They are primarily used for text generation, such as
completing sentences, writing stories, or answering queries in a
conversational manner.

77. Explain the importance of multi-head attention mechanisms.


 Multi-head attention allows the model to focus on different parts
of the input sequence simultaneously, capturing multiple
relationships between words. This improves the model’s ability to
understand complex dependencies in tasks like translation and text
classification.

78. How do Transformer models improve upon traditional models?

 Transformers improve upon traditional models (like RNNs and


LSTMs) by allowing for parallel processing, capturing long-range
dependencies more effectively, and using attention mechanisms to
weigh the importance of different words in a sequence, leading to
better performance and faster training.

79. Describe the differences between BERT and GPT-3.

BERT vs. GPT-3:

 Model type: BERT is bidirectional (reads text in both directions); GPT-3 is unidirectional (reads text left to right).

 Training objective: BERT is trained to predict masked (missing) words; GPT-3 is trained to predict the next word.

 Use case: BERT is mainly used for language understanding; GPT-3 is mainly used for text generation and completion.

 Architecture: BERT is based on the Transformer encoder; GPT-3 is based on the Transformer decoder.

 Pretraining: BERT is pretrained on masked-text tasks; GPT-3 is pretrained on autoregressive text tasks.

 Fine-tuning: BERT is fine-tuned for specific tasks; GPT-3 can be fine-tuned for many tasks.

 Contextual understanding: BERT considers both past and future context; GPT-3 considers only past context.

80. What are the benefits of bidirectional encoding in NLP?

Bidirectional encoding helps models understand context by looking at both the words before and after a given word. This improves:

 Context understanding: The model gets a fuller picture of word meaning.

 Better results: It improves tasks like translation, sentiment analysis, and recognition.

 Word clarity: It helps the model figure out words with multiple meanings based on surrounding words.

81. What are Generative Adversarial Networks (GANs)?


GANs are a type of deep learning model consisting of two networks,
a generator and a discriminator, which compete against each
other. The generator creates fake data, and the discriminator tries
to distinguish between real and fake data. Through this competition,
the generator improves its ability to create realistic data.

82. What is the purpose of autoencoders in NLP?


Autoencoders are used for tasks like dimensionality reduction,
feature extraction, and noise reduction in NLP. They compress input
data into a lower-dimensional space (encoding) and then
reconstruct it (decoding), which helps in capturing important
features and reducing noise.

83. Name three evaluation metrics used in NLP.

 BLEU (Bilingual Evaluation Understudy)

 ROUGE (Recall-Oriented Understudy for Gisting Evaluation)

 Perplexity

84. What does BLEU stand for?


BLEU stands for Bilingual Evaluation Understudy. It is an
evaluation metric for machine-generated translations that compares
the n-grams (word sequences) of the generated text to reference
translations to measure translation quality.

85. Define the term "perplexity" in language modeling.


Perplexity is a measure of how well a language model predicts a
sequence of words. It is the inverse probability of the predicted word
sequence normalized by the number of words, with lower perplexity
indicating better model performance.

86. What is the Hugging Face API?


The Hugging Face API is a platform offering access to pre-trained
models for various NLP tasks. It allows users to easily use and fine-
tune models for tasks like text generation, translation, and
classification, through an easy-to-use interface.

87. List two key features of open-source LLMs.

Two key features are accessibility and transparency; see the summary below.

95. Summarize the importance of open-source LLMs.

LLMs (Large Language Models) are AI models trained on vast text data to understand and generate human language. They can perform tasks like text generation, translation, and summarization using deep learning techniques like transformers.

 Accessibility: Open-source LLMs make powerful AI tools available to everyone, reducing costs and barriers to entry.

 Transparency: They allow users to inspect and modify the code.

 Collaboration: Open-source LLMs encourage shared improvements, advancing AI research and development.

88. Define the self-attention mechanism in generative


models.
The self-attention mechanism in generative models allows each
token (word) to focus on and weigh other tokens in the input
sequence, capturing relationships between them regardless of their
positions in the sequence.
89. What are multi-head attention mechanisms used for?
Multi-head attention allows the model to focus on different parts of
the input sequence simultaneously. Each head learns to focus on
different aspects of the data, improving the model's ability to
understand complex relationships.

90. Define the term “Generative AI.”

 Generative AI refers to artificial intelligence systems capable of


creating new content such as text, images, audio, or video by
learning patterns and structures from existing data.
 It utilizes models like GPT (Generative Pre-trained Transformer) and
GANs (Generative Adversarial Networks) to produce coherent and
contextually relevant outputs.

91. Compare MRR and MAP metrics for evaluating NLP models.

MRR (Mean Reciprocal Rank) vs. MAP (Mean Average Precision):

 Focus: MRR looks at the rank of the first relevant item; MAP looks at the overall ranking quality of all relevant items.

 Purpose: MRR evaluates how well the model ranks the first relevant result; MAP measures how well all relevant items are ranked higher.

 Use case: MRR is commonly used for question answering; MAP is commonly used for document retrieval.

 Calculation: MRR is the reciprocal of the first relevant item's rank, averaged across queries; MAP is the average precision over all relevant items, averaged across queries.
92. Explain how ROUGE is used to evaluate summarization
models.

ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics that evaluate how much of the reference content is captured in the generated text. It focuses on:

 ROUGE-N: Measures the overlap of n-grams (e.g., unigrams, bigrams) between the generated and reference summaries.

 ROUGE-L: Measures the longest common subsequence (LCS) between the generated and reference texts.

 ROUGE-W: Measures weighted overlap of subsequences.

 ROUGE-S: Measures the overlap of skip-bigrams.
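A hand-rolled sketch of ROUGE-N recall (dedicated packages such as rouge-score compute the full metric; this only illustrates the n-gram overlap idea):

from collections import Counter

def rouge_n_recall(reference, generated, n=1):
    # Fraction of reference n-grams that also appear in the generated text
    def ngram_counts(text):
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    ref, gen = ngram_counts(reference), ngram_counts(generated)
    overlap = sum((ref & gen).values())
    return overlap / max(sum(ref.values()), 1)

print(rouge_n_recall("The quick brown fox jumps over the lazy dog",
                     "The quick fox jumps over the dog"))   # ROUGE-1 recall ≈ 0.78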

93. Describe the role of perplexity in language model evaluation.

 Use: Measures how well a language model predicts a sequence by


evaluating the uncertainty in its predictions.

 Task: Language Modeling.

 How It Works:

o Lower perplexity indicates that the model assigns higher


probabilities to the correct sequence of words.

o Calculated as the exponentiated average negative log-


likelihood of predicted tokens.

 Example:

o Model 1 predicts: "The cat sat on the mat." with high


probability.

o Model 2 predicts: "The cat ate the mat." with low


probability.

o Model 1 has lower perplexity, indicating better fluency and


coherence.

 Limitation:
Does not measure semantic correctness or alignment with real-
world meanings.
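A small worked example of the calculation (the per-token probabilities are made up for illustration):

import math

# Probabilities a hypothetical model assigns to each successive token of a sentence
token_probs = [0.2, 0.1, 0.05, 0.3, 0.25, 0.15]

# Perplexity = exp of the average negative log-likelihood per token
avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(avg_nll)
print(round(perplexity, 2))   # lower values mean the model is less "surprised"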
94. How does BLEU measure machine translation quality?

BLEU (Bilingual Evaluation Understudy)

 Use: Measures the similarity between machine-generated text and


reference text using n-gram overlap.

 Task: Machine Translation.

 How It Works:
Compares n-grams (sequences of words) in the generated text to
reference texts, focusing on precision.

 Example:

o Reference: "The cat sat on the mat."

o Generated: "The cat is sitting on the mat."

o Overlapping unigrams: "The," "cat," "on," "the," "mat."

o Simplified BLEU Score: ~0.67.

 Limitation:
Favors exact matches and penalizes semantically correct but
paraphrased outputs.
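For illustration, NLTK provides a sentence-level BLEU implementation (smoothing and low-order n-gram weights are used here only because the example sentences are very short):

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "sat", "on", "the", "mat"]]
candidate = ["the", "cat", "is", "sitting", "on", "the", "mat"]

smooth = SmoothingFunction().method1
score = sentence_bleu(reference, candidate, weights=(0.5, 0.5), smoothing_function=smooth)
print(round(score, 3))   # higher is closer to the reference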

96. Explain how Hugging Face APIs support NLP applications.

Hugging Face APIs simplify NLP by providing followings:

 Hugging Face APIs offer a wide range of pre-trained NLP models.

 They support various NLP tasks like text generation, sentiment analysis,
translation, summarization, and question-answering.

 The APIs allow easy integration of advanced language processing


features into applications.

 Provides efficient solutions for NLP tasks without extensive training.

 Developers can fine-tune models for specific tasks or datasets using the
APIs.
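A minimal sketch using the Hugging Face transformers library; each pipeline call downloads a default pre-trained model the first time, so network access is assumed:

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("I really enjoyed this course on NLP!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]

translator = pipeline("translation_en_to_fr")
print(translator("Natural language processing is fascinating."))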
97. What is the role of autoencoders in generative modeling?

 Autoencoders in generative modeling learn to compress input data


into a lower-dimensional latent space and reconstruct it. By
sampling from this latent space, they can generate new, similar
data. This ability allows autoencoders to be used for tasks like
image generation, data denoising, and anomaly detection, as they
capture and reproduce the underlying structure of the data.

99. Describe the importance of evaluation metrics in Generative


AI.

Evaluation metrics are crucial in Generative AI for:

 Evaluation metrics measure the quality of generated content.

 They allow for comparison between different models.

 Metrics provide feedback to improve model performance.

 They ensure that the generated content is relevant and meaningful.

 They help in detecting biases or errors in the generated content.

 They ensure the generated content meets user expectations.
Evaluation Metrics in Generative AI

1. BLEU (Bilingual Evaluation Understudy)

 Use: Measures the similarity between machine-generated text and


reference text using n-gram overlap.

 Task: Machine Translation.

 How It Works:
Compares n-grams (sequences of words) in the generated text to
reference texts, focusing on precision.

 Example:

o Reference: "The cat sat on the mat."

o Generated: "The cat is sitting on the mat."

o Overlapping unigrams: "The," "cat," "on," "the," "mat."

o Simplified BLEU Score: ~0.67.

 Limitation:
Favors exact matches and penalizes semantically correct but
paraphrased outputs.

2. ROUGE (Recall-Oriented Understudy for Gisting Evaluation)

 Use: Measures longest common subsequences (LCS) between


generated and reference texts.

 Task: Summarization.

 How It Works:
Evaluates how much of the reference content is captured in the
generated text.
 Example:

o Reference: "The quick brown fox jumps over the lazy dog."

o Generated: "The quick fox jumps over the dog."

o Overlapping bigrams: "The quick," "fox jumps," "over the."

o ROUGE Score (Recall): High, as most key phrases are


included.

 Limitation:
May favor verbose summaries over concise ones.

3. METEOR (Metric for Evaluation of Translation with Explicit


ORdering)

 Use: Considers synonyms, stemming, and word order for matching.

 Task: Translation, Text Generation.

 How It Works:
Computes precision and recall with an emphasis on semantic
matching, combining them into a harmonic mean.

 Example:

o Reference: "The boy is playing football."

o Generated: "A child is playing soccer."

o METEOR recognizes "boy" as a synonym for "child" and


"football" as "soccer," leading to a higher score.

4. Perplexity

 Use: Measures how well a language model predicts a sequence by


evaluating the uncertainty in its predictions.

 Task: Language Modeling.

 How It Works:

o Lower perplexity indicates that the model assigns higher


probabilities to the correct sequence of words.

o Calculated as the exponentiated average negative log-


likelihood of predicted tokens.

 Example:
o Model 1 predicts: "The cat sat on the mat." with high
probability.

o Model 2 predicts: "The cat ate the mat." with low


probability.

o Model 1 has lower perplexity, indicating better fluency and


coherence.

 Limitation:
Does not measure semantic correctness or alignment with real-
world meanings.
