NLP_Answers
Significance in NLP:
Efficiency: Automates translation tasks, saving time and resources compared to manual
translation.
Cultural Exchange: Promotes the sharing of knowledge and culture across linguistic
boundaries.
2)
Aspect | RNN | LSTM | Transformer
Speed | Slow due to sequential processing | Still slow, but slightly faster than RNNs | Fast due to parallel processing
Use Cases | Simple tasks with short inputs | Moderate-length sequence tasks | Complex tasks with long inputs
3) 1. What is the Encoder-Decoder Architecture?
The encoder-decoder architecture is a framework used in Sequence-to-Sequence (Seq2Seq) models
for tasks like machine translation. It consists of two main components:
Encoder: Reads the input sentence and converts it into a fixed-length context vector (a
summary of the input).
Decoder: Uses the context vector to generate the output sentence in the target language.
1. The encoder processes the input sequence (e.g., a sentence in English) and converts it into a
hidden representation (context vector).
2. The decoder takes this hidden representation and generates the translated output (e.g., the
sentence in French), one word at a time.
3. During training, the model learns to map input sentences to their correct translations by
adjusting weights.
Handles Varying Sentence Lengths: Converts inputs and outputs of different lengths into a
common form (context vector).
Captures Context: The encoder creates a meaningful summary of the input for the decoder
to use.
Facilitates Learning: Allows the model to learn relationships between languages by working
on paired sentences (source and target).
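To make this flow concrete, here is a minimal sketch of a GRU-based encoder-decoder in PyTorch. The vocabulary sizes, hidden size, and special token IDs below are placeholder assumptions, not values from this answer, and greedy decoding is used only for simplicity:

```python
import torch
import torch.nn as nn

# Toy sizes and token IDs; these are illustrative assumptions.
SRC_VOCAB, TGT_VOCAB, EMB, HID = 1000, 1200, 32, 64
SOS, EOS = 1, 2

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(SRC_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)

    def forward(self, src_ids):
        # src_ids: (batch, src_len); the final hidden state is the context vector
        _, context = self.rnn(self.embed(src_ids))
        return context  # shape (1, batch, HID)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(TGT_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, TGT_VOCAB)

    def forward(self, prev_ids, hidden):
        # prev_ids: (batch, 1), the previously generated target token
        output, hidden = self.rnn(self.embed(prev_ids), hidden)
        return self.out(output), hidden  # logits over the target vocabulary

@torch.no_grad()
def greedy_translate(encoder, decoder, src_ids, max_len=20):
    """Encode the source once, then generate the target one word at a time."""
    hidden = encoder(src_ids)
    token = torch.full((src_ids.size(0), 1), SOS, dtype=torch.long)
    generated = []
    for _ in range(max_len):
        logits, hidden = decoder(token, hidden)
        token = logits.argmax(dim=-1)          # most likely next word
        generated.append(token)
        if (token == EOS).all():
            break
    return torch.cat(generated, dim=1)

# Untrained demo run on a random 5-token "sentence".
src = torch.randint(3, SRC_VOCAB, (1, 5))
print(greedy_translate(Encoder(), Decoder(), src))
```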
1. Encoder:
o The encoder is a neural network (usually a Recurrent Neural Network (RNN), Long
Short-Term Memory (LSTM), or Gated Recurrent Unit (GRU)) that processes the
input sequence (e.g., a sentence in the source language) word by word.
o It converts each word into a fixed-size vector (a word embedding) and passes it
through the network, updating its internal state at each step.
o After processing the entire sequence, the encoder outputs a context vector, a fixed-
length vector summarizing the entire input sequence.
2. Context Vector:
o The context vector is the output of the encoder. This vector is intended to capture all
relevant information from the source sequence and is used by the decoder to
generate the translated output.
3. Decoder:
o The decoder is another RNN/LSTM/GRU that generates the output sequence (the
translated sentence in the target language) one word at a time.
o The decoder takes the context vector from the encoder as input to start the
generation process. Each step of the decoder produces the next word in the
sequence, and the previous words generated are used as input for subsequent steps.
o The decoder produces the entire translated sentence based on this sequential
generation.
4) 1. Attention Mechanism:
o The attention mechanism computes a weighted sum of the encoder’s hidden states,
giving the decoder the ability to focus on relevant parts of the input sequence when
generating each word of the output (a numeric sketch of this weighted sum follows this answer).
2. Seq2Seq Models:
o The Seq2Seq model handles the entire sequence of words, learning the translation
from start to end, which allows it to capture long-range dependencies and
contextual nuances.
Key Differences:
Handling of Sequence: Traditional models typically used phrase-based or word-based
translation, while Seq2Seq models translate the entire sequence at once.
Model Architecture: Seq2Seq models use neural networks (e.g., RNNs, LSTMs), whereas
traditional methods use statistical or rule-based models.
Context Understanding: Seq2Seq models, especially with attention, can better capture the
context and relationships between words in the sequence, while traditional models often
struggle with this.
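To make the attention weighting described above concrete, here is a small NumPy sketch of dot-product attention; the hidden-state values are made up purely for illustration:

```python
import numpy as np

def dot_product_attention(encoder_states, decoder_state):
    """Score each encoder hidden state against the decoder state,
    normalize with softmax, and return the weighted-sum context vector."""
    scores = np.array([h @ decoder_state for h in encoder_states])  # e_i = h_i^T s
    weights = np.exp(scores) / np.exp(scores).sum()                 # softmax
    context = (weights[:, None] * np.array(encoder_states)).sum(axis=0)
    return scores, weights, context

# Made-up 2-dimensional hidden states, purely for illustration.
H = [np.array([0.1, 0.9]), np.array([0.7, 0.2]), np.array([0.4, 0.4])]
s = np.array([0.6, 0.4])
scores, weights, context = dot_product_attention(H, s)
print(scores, weights, context)
```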
5) 1. Idiomatic Expressions:
o These are phrases where the meaning is different from the literal words. For
example, “kick the bucket” means "to die," not literally kicking a bucket.
o Lack of Data: For a machine to translate idioms correctly, there needs to be enough
data showing how they are used in different languages. But many languages lack
enough examples of idiomatic expressions.
2. Syntactic Complexities:
o Different languages follow different sentence structures. For example, English uses
Subject-Verb-Object (SVO) order ("I like apples"), while languages like Japanese use
Subject-Object-Verb (SOV) order ("I apples like"). These differences can make
translation difficult.
o Word Order: Languages may have different sentence orders. For instance, adjectives
in English come before nouns ("red car"), but in Spanish, the adjective comes after
the noun ("coche rojo"). MT systems must adjust word order accordingly.
o Syntax Ambiguity: Some languages allow more flexible word orders (like Latin or
Russian), making it difficult for MT systems to understand the right structure.
o Complex Sentences: Sentences with multiple parts (like “She went to the store
because she needed milk”) are harder to translate, especially if the target language
has different sentence structures.
o Long-Distance Relationships: Some sentences link words that are far apart. Older
MT systems struggle with these long-distance dependencies.
How modern MT systems address these syntactic challenges:
o Neural Networks (RNNs and Transformers): Modern models like RNNs and
especially Transformers (used in models like GPT and BERT) are designed to
understand relationships between words in a sentence, even if they are far apart.
o Syntax-Aware Models: Some MT systems use syntax rules (like sentence trees) to
guide the translation and ensure it follows the correct sentence structure.
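As a quick illustration of the Transformer-based approach in the first bullet above, the sketch below runs a pretrained translation model through the Hugging Face transformers library; the checkpoint name Helsinki-NLP/opus-mt-en-es is an assumption (any English-to-Spanish MT checkpoint would do), and fetching it requires an internet connection:

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-es"  # assumed checkpoint name
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Sentences that exercise adjective-noun reordering and clause structure.
sentences = ["The red car is fast.",
             "She went to the store because she needed milk."]
batch = tokenizer(sentences, return_tensors="pt", padding=True)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```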
Soln:
2) Evaluate the attention scores, given encoder hidden states and a decoder hidden state, using the
dot-product scoring method e_i = h_i^T s. The encoder hidden states are h1 = [5], h2 = [0.6, 0.4],
h3 = [0.6, 0.3] and the decoder hidden state is s = [0.6, 0.4]. Compute the values of e_1, e_2, e_3.
Soln:
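A short sketch of the dot-product computation under the stated values; h1 is omitted here because its printed value does not have the same dimension as s:

```python
import numpy as np

# Dot-product score: e_i = h_i^T s
h2 = np.array([0.6, 0.4])
h3 = np.array([0.6, 0.3])
s  = np.array([0.6, 0.4])

e2 = h2 @ s   # 0.6*0.6 + 0.4*0.4 = 0.52
e3 = h3 @ s   # 0.6*0.6 + 0.3*0.4 = 0.48
print(e2, e3)
```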
3)
Soln:
4)
Soln: