NLP
ANS:
NLP (Natural Language Processing) is a field of AI that allows computers
to work with human language.
For example:
If you ask Siri a question, NLP helps Siri understand what you mean.
Google Translate uses NLP to turn text from one language into
another.
ANS:
1. Text Preprocessing
Cleaning and preparing raw text. Common steps: tokenization,
lowercasing, and stop-word removal.
2. Syntactic Analysis
Analyzing the grammatical structure of sentences.
3. Semantic Analysis
Understanding the meaning of words and sentences.
4. Discourse Analysis
Examining how sentences connect to form coherent text.
ANS:
Key Goal:
To enable computers to understand, interpret, and generate human
language.
Applications:
Machine translation.
Speech recognition.
Chatbots.
4. What is the role of grammar in NLP?
ANS:
1. Syntactic Parsing:
o Grammar is used to analyze the structure of sentences,
identifying parts of speech (nouns, verbs, etc.) and their
relationships.
2. Disambiguation:
o Grammar helps resolve ambiguities in sentences by providing
context.
3. Sentence Generation:
o Grammar ensures that machines generate human-like text
that follows the rules of language.
4. Error Detection and Correction:
o Tools like spell checkers and grammar checkers rely on
grammatical rules to identify and fix errors in text.
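As an illustration of syntactic parsing, here is a minimal
part-of-speech tagging sketch using NLTK (an assumption; the notes do
not name a library, and the downloads are one-time):

    import nltk
    nltk.download("punkt")                        # tokenizer model
    nltk.download("averaged_perceptron_tagger")   # POS tagger model

    sentence = "The quick brown fox jumps over the lazy dog."
    tokens = nltk.word_tokenize(sentence)   # split into words
    print(nltk.pos_tag(tokens))             # [('The', 'DT'), ('quick', 'JJ'), ...]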
ANS:
1. IMDB Reviews Dataset
Description: A dataset of 50,000 movie reviews from the IMDB
website, labeled as positive or negative.
Usage:
o Sentiment analysis.
o Text classification.
Format: Plain text with binary labels (positive/negative).
2. SQuAD (Stanford Question Answering Dataset)
Description: A reading comprehension dataset containing over
100,000 questions and answers based on Wikipedia articles.
Usage:
o Question answering tasks.
o Evaluating machine comprehension of text.
Format: JSON files with passages, questions, and answers.
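Both datasets are easy to load for experiments; a sketch with the
Hugging Face datasets library (an assumption; "imdb" and "squad" are
the public hub names):

    from datasets import load_dataset

    imdb = load_dataset("imdb")     # 50,000 reviews with positive/negative labels
    squad = load_dataset("squad")   # passages, questions, and answers

    print(imdb["train"][0]["label"])       # 0 = negative, 1 = positive
    print(squad["train"][0]["question"])   # first question in the training split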
6. What is the difference between morphological and syntactic
analysis?
ANS:
Morphological Analysis            Syntactic Analysis
Focuses on individual words.      Focuses on phrases.
ATNs (Augmented Transition Networks) were widely used in early NLP
systems for sentence parsing and machine translation.
ANS:
Tokenization: The text is split into words or tokens, which are then
analyzed at the morphological level.
ANS:
Regular expressions allow for searching and finding text that matches
specific patterns, such as dates or keywords.
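A minimal sketch with Python's re module (the date pattern is an
illustrative assumption):

    import re

    text = "Meeting on 2024-05-17; backup scheduled for 2024-06-01."
    dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)   # ISO-style dates
    print(dates)   # ['2024-05-17', '2024-06-01']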
ANS:
Syntax                                  Semantics
Deals with how words are arranged       Deals with what the sentence or
in a sentence.                          words mean in a specific context.
ANS:
Tokenization: Break text into smaller units and discard irrelevant parts
or tokens.
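A minimal sketch of tokenization with stop-word removal (the stop-word
list is an illustrative assumption):

    import re

    text = "The cats are sitting on the mat!"
    tokens = re.findall(r"[a-z]+", text.lower())   # split into lowercase word tokens
    stopwords = {"the", "are", "on"}               # tiny illustrative stop-word list
    content = [t for t in tokens if t not in stopwords]
    print(content)   # ['cats', 'sitting', 'mat']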
ANS:
It treats text as a "bag" (or collection) of words, ignoring grammar and
word order but keeping track of the frequency of each word's occurrence.
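A minimal bag-of-words sketch in plain Python:

    from collections import Counter

    docs = ["the cat sat on the mat", "the dog sat"]
    bags = [Counter(doc.split()) for doc in docs]   # order ignored, counts kept
    print(bags[0])   # Counter({'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1})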
          Stemming                            Lemmatization
Output    The result may not always be a      The result is a valid word that
          valid word (e.g., "studies" →       exists in the dictionary (e.g.,
          "studi").                           "running" → "run").
ANS: (5-marks answer)
Examples of challenges:
Ambiguity: Words and sentences often have multiple meanings depending on context
(e.g., "bank" as a riverbank or a financial institution).
Complex Sentence Structures: Long and nested sentences make capturing meaning
difficult.
Idiomatic Expressions: Phrases like "kick the bucket" don’t translate literally.
Cultural Differences: Meaning can vary based on cultural context.
Differences between GRU and LSTM:

GRU                                     LSTM
Does not have a separate memory         Has a distinct memory cell for
cell.                                   storing long-term information.
Simpler, with fewer parameters          More complex due to the extra
to train.                               gates and memory cell.
Generally faster to train because       May take longer to train due to
of its simpler structure.               its complexity.
It works by calculating attention scores for each word in relation to all
other words in the sentence. These scores tell the model which words are
important and should be given more focus when processing the sentence.
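A minimal scaled dot-product attention sketch in NumPy (toy sizes; real
models learn Q, K, and V with weight matrices):

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    rng = np.random.default_rng(0)
    Q = rng.normal(size=(4, 8))   # queries: one row per word
    K = rng.normal(size=(4, 8))   # keys
    V = rng.normal(size=(4, 8))   # values

    scores = Q @ K.T / np.sqrt(8)   # each word's similarity to every other word
    weights = softmax(scores)       # attention scores: each row sums to 1
    output = weights @ V            # context-blended representation per word
    print(weights.round(2))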
69. Define Bidirectional Encoder Representations from
Transformers (BERT).
ANS:
BERT is a transformer-based language model that reads text in both
directions at once, so each word's representation reflects its full
left and right context. It is pre-trained on large corpora (e.g., by
predicting masked words) and then fine-tuned for tasks such as question
answering and sentiment analysis.
In the attention mechanism such models rely on:
Query (Q): Represents the word currently being processed.
Key (K): Represents the words you're comparing the query to.
Value (V): Represents the information each word carries.
Output: Generate the next word or understand the context using the
blended information.
--------
                      Transformers               RNNs
Training Objective    Predict missing words.     Predict the next word.
Word-sense disambiguation: it helps the model resolve words with
multiple meanings based on surrounding words.
LLMs (Large Language Models) are AI models trained on vast text data
to understand and generate human language.
Common evaluation metrics, compared below by Focus, Purpose, Use Case,
and Calculation:
94. How does BLEU measure machine translation quality?
How It Works:
Compares n-grams (sequences of words) in the generated text to
reference texts, focusing on precision.
Example:
Limitation:
Favors exact matches and penalizes semantically correct but
paraphrased outputs.
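A minimal sketch with NLTK's sentence_bleu (smoothing is added because
short sentences can otherwise score zero):

    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    reference = "the cat is on the mat".split()
    candidate = "the cat sat on the mat".split()

    score = sentence_bleu([reference], candidate,
                          smoothing_function=SmoothingFunction().method1)
    print(round(score, 3))   # higher = more n-gram overlap with the reference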
They support various NLP tasks like text generation, sentiment analysis,
translation, summarization, and question-answering.
Developers can fine-tune models for specific tasks or datasets using the
APIs.
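For illustration, a minimal sketch with the Hugging Face transformers
pipeline API (an assumption; the notes do not name a specific provider):

    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")   # downloads a default model once
    print(classifier("I loved this movie!"))
    # [{'label': 'POSITIVE', 'score': 0.99...}]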
97. What is the role of autoencoders in generative modeling?
ANS:
Autoencoders learn a compressed latent representation of data: an
encoder maps the input to a low-dimensional code, and a decoder
reconstructs the input from that code. In generative modeling, variants
such as variational autoencoders (VAEs) allow sampling new points from
the latent space and decoding them into new, realistic data.
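A minimal autoencoder sketch in PyTorch (layer sizes are illustrative
assumptions):

    import torch
    import torch.nn as nn

    class AutoEncoder(nn.Module):
        def __init__(self, input_dim=784, latent_dim=32):
            super().__init__()
            # encoder compresses the input into a small latent code
            self.encoder = nn.Sequential(
                nn.Linear(input_dim, 128), nn.ReLU(),
                nn.Linear(128, latent_dim))
            # decoder reconstructs the input from the latent code
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 128), nn.ReLU(),
                nn.Linear(128, input_dim))

        def forward(self, x):
            return self.decoder(self.encoder(x))

    model = AutoEncoder()
    x = torch.rand(16, 784)                      # dummy batch
    loss = nn.functional.mse_loss(model(x), x)   # reconstruction loss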
2. ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
Task: Summarization.
How It Works:
Evaluates how much of the reference content is captured in the
generated text.
Example:
o Reference: "The quick brown fox jumps over the lazy dog."
Limitation:
May favor verbose summaries over concise ones.
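A minimal sketch with the rouge-score package (an assumption; installed
with "pip install rouge-score"):

    from rouge_score import rouge_scorer

    scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
    reference = "The quick brown fox jumps over the lazy dog."
    summary = "The quick fox jumps over the dog."
    print(scorer.score(reference, summary))   # precision, recall, F1 per metric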
3. METEOR
How It Works:
Computes precision and recall with an emphasis on semantic
matching, combining them into a harmonic mean.
Example:
4. Perplexity
How It Works:
Measures how well a model predicts a text sequence, computed as the
exponential of the average negative log-likelihood of the tokens;
lower perplexity means better prediction.
Example:
o Model 1 predicts: "The cat sat on the mat." with high
probability, so it has low perplexity on that sentence.
Limitation:
Does not measure semantic correctness or alignment with real-
world meanings.
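A tiny worked sketch of the calculation (the per-token probabilities are
illustrative assumptions):

    import math

    # probability the model assigned to each token of a sentence
    probs = [0.20, 0.10, 0.30, 0.25, 0.15, 0.40]

    avg_nll = -sum(math.log(p) for p in probs) / len(probs)
    perplexity = math.exp(avg_nll)   # lower = the model predicts the text better
    print(round(perplexity, 2))      # ≈ 4.72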