NLP - Part II
Grammars for Natural Language
• Grammars for Natural Language are rule-
based systems used in Natural Language
Processing (NLP) to describe the structure and
syntax of a language.
• They help computers understand the patterns
and relationships between words in a
sentence, allowing for accurate parsing and
analysis of human language.
There are two main types of grammars used in
NLP:
• Context-free grammars (CFGs)
• Dependency grammars
• CFGs are a simple and powerful type of grammar
that can be used to represent the structure of most
natural languages. A CFG consists of a set of non-
terminal symbols, a set of terminal symbols, and a
set of production rules.
• Production rules specify how non-terminal
symbols can be rewritten as combinations of
other non-terminal symbols and terminal
symbols.
• For example, the following production rule specifies that the non-terminal symbol "sentence" (S) can be rewritten as the combination of the non-terminal symbol "noun phrase" (NP) and the non-terminal symbol "verb phrase" (VP):
S -> NP VP
Dependency grammars are a more complex type of grammar
that can represent the fine-grained syntactic relationships
between words in a sentence. A dependency grammar consists
of a set of nodes, a set of edges, and a set of dependency
relations.
• Nodes represent words in a sentence, and edges represent the syntactic relationships between words. For example, the following dependency relations represent the sentence "The dog ate the bone":
ate (root)
ate -> dog (subject), dog -> the (determiner)
ate -> bone (object), bone -> the (determiner)
• Similarly, in the sentence "The cat chased the mouse", the verb "chased" is the root, "cat" is its subject, and "mouse" is its object.
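• A minimal sketch of extracting such dependency relations in Python, assuming spaCy and its small English model (en_core_web_sm) are installed:

import spacy

# Load a pretrained English pipeline (must be downloaded beforehand with
# `python -m spacy download en_core_web_sm`).
nlp = spacy.load("en_core_web_sm")
doc = nlp("The dog ate the bone")

# Each token is a node; the (head -> token) pairs, labelled with the
# dependency relation, are the edges of the graph.
for token in doc:
    print(f"{token.head.text} -> {token.text} ({token.dep_})")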
Movement Phenomena in Language
• Movement phenomena in language refer to cases in which certain elements of a sentence change their position to form different sentence structures while still conveying the same meaning.
• These phenomena are important because
they affect the interpretation and meaning of
a sentence.
The different forms of movement phenomena in
language are :
1. wh-movement: Wh-movement is a movement
phenomenon that is used to form questions. In
English, wh-words, such as "who", "what", "when",
"where", and "why", are moved to the beginning of
the sentence to form questions.
• For example, the sentence "The man saw the dog"
becomes "Who did the man see?" after wh-
movement.
2. Topicalization: Topicalization is a movement phenomenon that is used to highlight a particular element in a sentence. In English, topicalization involves moving a phrase to the beginning of the sentence.
• For example, the sentence "The man saw the dog" becomes "The dog, the man saw" after topicalization of the object.
3. Adverb preposing : Adverb preposing occurs
when an adverbial phrase is moved from its
position within the sentence to the beginning
of the sentence.
• For example, the sentence "I will see you tomorrow" becomes "Tomorrow, I will see you."
4. Passivization : Passivization is a movement
phenomenon in NLP where the typical word
order of a sentence is changed from an active
voice to a passive voice. In the passive voice,
the subject of the active sentence becomes the
object, and the object becomes the subject.
• For example, the sentence "The man saw the dog" becomes "The dog was seen by the man." (A toy sketch of this transformation appears after this list.)
5. Extraposition : Extraposition is a movement
phenomenon in NLP that involves moving a
clause to the end of a sentence. This movement
is motivated by the need to avoid ambiguity or
to improve the fluency of the sentence.
• For example, the sentence "That you will win the race is surprising" becomes "It is surprising that you will win the race": the that-clause moves to the end of the sentence.
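• As noted under passivization above, here is a toy Python sketch of that transformation. It is purely illustrative: it assumes the input has the exact form "<two-word subject> <verb> <two-word object>", and the participle table is a made-up fragment, not a real lexicon:

PARTICIPLES = {"saw": "seen", "ate": "eaten", "chased": "chased"}  # toy lexicon

def passivize(sentence):
    # Assumes exactly: Det N V Det N, e.g. "The man saw the dog".
    words = sentence.rstrip(".").split()
    subject = " ".join(words[:2])
    verb = words[2]
    obj = " ".join(words[3:])
    # The object moves to subject position; the old subject is demoted
    # to a "by"-phrase, and the verb becomes auxiliary "was" + participle.
    return f"{obj.capitalize()} was {PARTICIPLES[verb]} by {subject.lower()}."

print(passivize("The man saw the dog"))  # The dog was seen by the man.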
Handling Questions in Context-Free Grammars
1. Question Word (Wh-Question) Handling:
Wh-questions start with question words like "who," "what,"
"where," "when," "why," "which," "how," etc. To handle wh-
questions in CFG, we introduce a new production rule for
generating the question word and incorporate it into sentence
structures.
• Sentence: "The cat chased the mouse."
• Wh-Question: "What chased the mouse?"
• Sentence: "He ate the pizza."
• Wh-Question: "Who ate the pizza?"
• Sentence: "He left for Delhi."
• Wh-Question: "Where did he leave for?"
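• A hedged sketch of such a production rule in Python with NLTK (the rule names WHQ, WH, and Aux are my own labels, not from the slides):

import nltk

# WHQ -> WH Aux NP V generates object wh-questions
# such as "what did the cat chase".
grammar = nltk.CFG.fromstring("""
WHQ -> WH Aux NP V
WH -> 'what' | 'who'
Aux -> 'did'
NP -> Det N
Det -> 'the'
N -> 'cat' | 'mouse'
V -> 'chase'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("what did the cat chase".split()):
    tree.pretty_print()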
2. Yes/No Questions: Yes/No questions are simple questions that can be answered with "yes" or "no." To handle them, we place the auxiliary verb at the beginning of the sentence to form the question.
• Example:
• Sentence: "The cat chased the mouse"
• Yes/No Question: "Did the cat chased the mouse?"
• Sentence: “ He ate the Pizza”
• Yes/No Question: “He eats Pizza”
• Sentence: “ Does he eats Pizza”
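• A similar sketch for yes/no questions, with the auxiliary fronted before the subject (rule names again my own):

import nltk

# YNQ -> Aux NP VP puts the auxiliary before the subject,
# as in "did the cat chase the mouse".
grammar = nltk.CFG.fromstring("""
YNQ -> Aux NP VP
Aux -> 'did'
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'cat' | 'mouse'
V -> 'chase'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("did the cat chase the mouse".split()):
    tree.pretty_print()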
3.Tag Questions: Tag questions are short questions added
to the end of a sentence to seek confirmation. To handle
tag questions, we introduce production rules for the tag
questions and combine them with the main sentence.
• Example:
• Sentence: " He ate the Pizza
• Yes/No Question: " He ate the Pizza, did he?"
Shift-Reduce Parsing Example
Parsing the sentence "The cat sat on the mat":
Action: shift                  Stack: The
Action: shift                  Stack: The cat
Action: reduce (NP -> The cat)  Stack: NP
Action: shift                  Stack: NP sat
Action: reduce (VP -> sat)      Stack: NP VP
Action: shift                  Stack: NP VP on
Action: shift                  Stack: NP VP on the
Action: shift                  Stack: NP VP on the mat
Action: reduce (NP -> the mat)  Stack: NP VP on NP
Action: reduce (PP -> on NP)    Stack: NP VP PP
Action: reduce (S -> NP VP PP)  Stack: S
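• A sketch of this trace in Python using NLTK's deterministic ShiftReduceParser, assuming the sentence is "The cat sat on the mat"; the flat rules VP -> V and S -> NP VP PP mirror the reductions above:

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP PP
NP -> Det N
VP -> V
PP -> P NP
Det -> 'the'
N -> 'cat' | 'mat'
V -> 'sat'
P -> 'on'
""")

# trace=2 prints each shift and reduce step as it happens.
parser = nltk.ShiftReduceParser(grammar, trace=2)
for tree in parser.parse("the cat sat on the mat".split()):
    print(tree)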
Deterministic Parsers
• Deterministic parsers in Natural Language Processing
(NLP) are parsing algorithms that follow a fixed set of
rules to determine the next action during the parsing
process. Unlike non-deterministic parsers, which may
have multiple possible actions at each step,
deterministic parsers make decisions based on a
predefined strategy or heuristic, leading to a unique
parsing path. Deterministic parsing is often more
efficient than non-deterministic parsing due to its
predictable nature. Here are two common examples of
deterministic parsers:
1. Shift-Reduce Parser:
• The Shift-Reduce parser is a deterministic parsing
algorithm that operates by repeatedly shifting a token
from the input to the stack and then reducing elements
on top of the stack based on predefined grammar rules.
• The parser makes decisions at each step solely based on
the current state of the stack and the input token,
following a fixed set of shift and reduce rules.
• Example: The shift-reduce trace shown earlier, parsing "The cat sat on the mat", illustrates this deterministic parsing process.
2. Earley Parser:
• The Earley parser is a chart-based parsing algorithm that uses dynamic programming and three operations: predictor, scanner, and completer.
• It works by constructing a chart that records all possible parse constituents for different
parts of the input sentence.
• The parser makes deterministic decisions by following the predictor, scanner, and
completer rules to add new constituents to the chart and combine them to form valid
parse trees.
• Example: Consider the following context-free grammar and the input sentence "the cat chased the mouse":
• Grammar:
S -> NP VP
NP -> "the" "cat" | "the" "mouse"
VP -> "chased" NP