
NLP

Part-II
Grammars for Natural Language
• Grammars for Natural Language are rule-
based systems used in Natural Language
Processing (NLP) to describe the structure and
syntax of a language.
• They help computers understand the patterns
and relationships between words in a
sentence, allowing for accurate parsing and
analysis of human language.
There are two main types of grammars used in
NLP:
• context-free grammars (CFGs)
• dependency grammars
• CFGs are a simple and powerful type of grammar
that can be used to represent the structure of most
natural languages. A CFG consists of a set of non-
terminal symbols, a set of terminal symbols, and a
set of production rules.
• Production rules specify how non-terminal
symbols can be rewritten as combinations of
other non-terminal symbols and terminal
symbols.
• For example, the following production rule specifies that the non-terminal symbol "sentence" (S) can be rewritten as the combination of the non-terminal symbol "noun phrase" (NP) and the non-terminal symbol "verb phrase" (VP):
• S -> NP VP
Dependency grammars are a more complex type of grammar
that can represent the fine-grained syntactic relationships
between words in a sentence. A dependency grammar consists
of a set of nodes, a set of edges, and a set of dependency
relations.
• Nodes represent words in a sentence, and edges represent the syntactic relationships between them. For example, a dependency graph for the sentence "The dog ate the bone" links "dog" to the verb "ate" as its subject and "bone" to "ate" as its object, with each "the" attached to its noun as a determiner.
• Similarly, in the sentence "The cat chased the mouse", "chased" is the root, "cat" is its subject, "mouse" is its object, and each "the" depends on the noun that follows it.
• For comparison, the same sentence as a phrase-structure (constituency) bracketing is:
• ( (S (NP the cat) (VP chased (NP the mouse)) ))
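• A minimal dependency-parsing sketch (assuming the spaCy library and its small English model en_core_web_sm are installed) that prints each word together with its dependency label and head:

# Print the dependency edges (word --label--> head) for the sentence.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The cat chased the mouse")
for token in doc:
    print(f"{token.text:<7} --{token.dep_}--> {token.head.text}")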
Movement Phenomena in Language
• Movement phenomena in language refer to cases in which certain elements of a sentence appear to move, changing their position to form different sentence structures while still conveying the same meaning.
• These phenomena are important because
they affect the interpretation and meaning of
a sentence.
The different forms of movement phenomena in
language are :
1. wh-movement: Wh-movement is a movement
phenomenon that is used to form questions. In
English, wh-words, such as "who", "what", "when",
"where", and "why", are moved to the beginning of
the sentence to form questions.
• For example, the sentence "The man saw the dog"
becomes "Who did the man see?" after wh-
movement.
2. Topicalization: Topicalization is a movement phenomenon that is used to highlight a particular element in a sentence. In English, topicalization involves moving a phrase to the beginning of the sentence; in some languages the fronted phrase is also marked with a special word called a topic marker.
• For example, the sentence "The man saw the dog" becomes "The dog, the man saw" after topicalization of the object "the dog."
3. Adverb preposing : Adverb preposing occurs
when an adverbial phrase is moved from its
position within the sentence to the beginning
of the sentence.
• For example, the sentence "I will see you tomorrow" becomes "Tomorrow, I will see you."
4. Passivization : Passivization is a movement
phenomenon in NLP where the typical word
order of a sentence is changed from an active
voice to a passive voice. In the passive voice,
the subject of the active sentence becomes the
object, and the object becomes the subject.
• For example, the sentence “The man saw the
dog” becomes “ The dog was seen by the
man.”
5. Extraposition : Extraposition is a movement
phenomenon in NLP that involves moving a
clause to the end of a sentence. This movement
is motivated by the need to avoid ambiguity or
to improve the fluency of the sentence.
• For example, the sentence "That you will win the race is certain" becomes "It is certain that you will win the race": the that-clause is moved to the end of the sentence and the dummy subject "it" fills its original position.
Handling Questions in Context-Free Grammars
1. Question Word (Wh-Question) Handling:
Wh-questions start with question words like "who," "what,"
"where," "when," "why," "which," "how," etc. To handle wh-
questions in CFG, we introduce a new production rule for
generating the question word and incorporate it into sentence
structures.
• Sentence: "The cat chased the mouse."
• Wh-Question: "What chased the mouse?"
• Sentence : “ He ate the Pizza”
• Wh-Question: “ Who ate the Pizza?”
• Sentence: "He left for Delhi."
• Wh-Question: "Where did he leave for?"
2. Yes/No Questions: Yes/No questions are simple
questions that can be answered with "yes" or "no." To
handle them, we use the auxiliary verb at the beginning of
the sentence to form the question.
• Example:
• Sentence: "The cat chased the mouse"
• Yes/No Question: "Did the cat chased the mouse?"
• Sentence: “ He ate the Pizza”
• Yes/No Question: “He eats Pizza”
• Sentence: “ Does he eats Pizza”
3.Tag Questions: Tag questions are short questions added
to the end of a sentence to seek confirmation. To handle
tag questions, we introduce production rules for the tag
questions and combine them with the main sentence.
• Example:
• Sentence: " He ate the Pizza
• Yes/No Question: " He ate the Pizza, did he?"

• Sentence: “ He leaves for delhi.”


• Wh-Question: “He leaves for delhi, does he?”
4. Negative Questions:
Negative questions include negation in the question. To handle them, we add production rules that place a negated auxiliary (such as "doesn't" or "aren't") at the beginning of the question.
• Example:
• Sentence: "He eats pizza."
• Negative Question: "Doesn't he eat pizza?"
• Sentence: "They are playing."
• Negative Question: "Aren't they playing?"
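• As a rough illustration of how such question rules can be added to a CFG (assuming NLTK; the rule names SQ and SWQ and the tiny lexicon are invented for this sketch), the grammar below handles a yes/no question and a subject wh-question:

import nltk

# SQ covers yes/no questions ("did the cat chase the mouse") and
# SWQ covers subject wh-questions ("what chased the mouse").
grammar = nltk.CFG.fromstring("""
    S   -> NP VP | SQ | SWQ
    SQ  -> Aux NP VBase NP
    SWQ -> Wh VP
    NP  -> Det N
    VP  -> V NP
    Det -> 'the'
    N   -> 'cat' | 'mouse'
    V   -> 'chased'
    VBase -> 'chase'
    Aux -> 'did'
    Wh  -> 'what' | 'who'
""")

parser = nltk.ChartParser(grammar)
for sentence in ["did the cat chase the mouse", "what chased the mouse"]:
    for tree in parser.parse(sentence.split()):
        print(tree)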
Hold Mechanisms in ATNs
• Hold mechanisms in ATNs (Augmented Transition
Networks) in NLP are used to temporarily suspend the
parsing of a sentence until certain conditions are met.
• Hold mechanisms can be a powerful tool for handling
ambiguous sentences and sentences that contain
incomplete information.
• ATNs utilize hold mechanisms to handle ambiguous input
and temporarily store partial results during parsing. The
hold mechanism allows the parser to suspend processing
a certain portion of the input and continue exploring
alternative parsing paths.
• For example, the following ATN can be used to parse the sentence "The dog chased the cat.": the network moves from the start state through an NP arc ("the dog"), then a VP arc ("chased"), then a final NP arc ("the cat").
• This ATN can parse the sentence correctly without needing the hold register, because the order of the words in the sentence is unambiguous.
• The following ATN can be used to parse the sentence "The dog chased the cat who.": the network again moves through an NP arc ("the dog"), a VP arc ("chased"), and an NP arc ("the cat"); when it reaches the relative pronoun "who", the word is placed on the hold register, suspending that part of the parse until the rest of the relative clause is available.
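• The sketch below is a toy illustration of the hold register (not a full ATN implementation; the function and the tiny pattern it accepts are invented for the example). A fronted wh-word such as "Who" in "Who did the man see?" is placed on HOLD and retrieved later when the parser reaches the empty object position:

# Toy illustration, assuming a hand-coded "network" that only accepts
# questions of the form "Wh did the NOUN VERB".
WH_WORDS = {"who", "what"}

def parse_wh_question(tokens):
    hold = []                                   # the HOLD register
    words = [t.lower().rstrip("?") for t in tokens]
    i = 0
    # A fronted wh-word is placed on HOLD instead of being attached here.
    if words[i] in WH_WORDS:
        hold.append(("NP", words[i]))
        i += 1
    assert words[i] == "did"; i += 1            # auxiliary verb
    assert words[i] == "the"; i += 1            # subject NP: "the" + noun
    subject = words[i]; i += 1
    verb = words[i]; i += 1                     # main verb
    # The object position is empty, so the filler is retrieved from HOLD.
    obj = hold.pop()[1] if hold else None
    return {"subject": subject, "verb": verb, "object": obj}

print(parse_wh_question("Who did the man see?".split()))
# -> {'subject': 'man', 'verb': 'see', 'object': 'who'}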
Gap Threading
• Gap threading is a technique used in Natural Language
Processing (NLP) to connect two or more separate pieces
of information within a text.
• Gaps can be caused by missing words, phrases, or clauses.
Gap threading algorithms work by identifying the gaps
in a text and then trying to fill them in using the
surrounding context.
• Gap threading is an essential NLP technique as it allows
systems to make connections between pieces of
information scattered across a text, enabling better
comprehension and context understanding.
Example:
• Text: "John is a passionate chef. He recently
opened his own restaurant. The restaurant is
getting rave reviews from customers."
• In this example, there are three separate
sentences, each containing information about
John, his restaurant, and the positive reviews.
Gap threading helps us connect the relevant
entities across these sentences:
Gap Threading:
Gap 1: "John" is connected to "chef."
Gap 2: "John" is connected to "his own
restaurant."
Gap 3: "restaurant" is connected to "rave
reviews."
Connected Information:
- John is a passionate chef who recently opened
his own restaurant.
- The restaurant, opened by John, is getting rave
reviews from customers.
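• A rough, hand-rolled sketch of the idea (the heuristic and the function name are invented for illustration, and it only covers the pronoun links, not the "restaurant"/"reviews" link): each pronoun is threaded back to the most recent capitalised mention that precedes it.

import re

def thread_gaps(text):
    # Link each pronoun to the most recent preceding capitalised mention.
    pronouns = {"he", "she", "it", "his", "her", "its"}
    last_entity = None
    links = []
    for word in re.findall(r"[A-Za-z']+", text):
        if word[0].isupper() and word.lower() not in pronouns and word != "The":
            last_entity = word
        elif word.lower() in pronouns and last_entity:
            links.append((word, last_entity))
    return links

text = ("John is a passionate chef. He recently opened his own restaurant. "
        "The restaurant is getting rave reviews from customers.")
print(thread_gaps(text))   # [('He', 'John'), ('his', 'John')]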
Human Preferences in Parsing
• Human preferences in parsing in NLP (Natural Language Processing)
are the ways in which humans prefer to parse sentences.
• For example, humans tend to prefer to parse sentences in a top-
down fashion, starting with the main clause and then moving on to
the subordinate clauses. This is because humans are able to quickly
identify the main idea of a sentence and then use that information
to understand the subordinate clauses.
• Consider the sentence "The raft floated down the river sank."
• This sentence is an example of a garden-path sentence because the initial interpretation leads the reader down an incorrect path: "floated" is first read as the main verb, so "sank" seems to say that the river sank, until the reader re-evaluates and finds that it is the raft that sank.
• Minimal Attachment:
• Minimal Attachment is a principle of parsing according to which readers prefer the parse tree with the fewest nodes, i.e. the structurally simplest analysis.
• In the sentence "The raft floated down the river sank," this preference leads readers to first build a simple Subject-Verb-Object structure in which "floated" is the main verb of "the raft."
• The correct analysis instead treats "floated down the river" as a reduced relative clause modifying "the raft": "The raft [that floated down the river] sank."
• Right Association:
• Right Association is a principle according to which readers tend to attach a new word to the immediately preceding element rather than to earlier elements in the sentence.
• In "The raft floated down the river sank," the verb "sank" appears directly after "the river," which may lead readers to attach it there and interpret the sentence as if the river sank.
• Lexical Preferences:
• Lexical Preferences refer to the tendency of readers to choose the most likely use of each individual word in a sentence.
• In this sentence, "floated" is much more commonly used as a simple past-tense main verb than as a passive participle introducing a reduced relative clause.
• This lexical preference reinforces the garden-path reading, again suggesting that it was the river that sank.
• Correct Interpretation:
• The correct interpretation of the sentence is
"The raft [that floated down the river] sank,"
where "sank" is associated with "the raft,"
indicating that the raft sank.
Shift-Reduce Parsers
• A shift-reduce parser is a type of parser that works by repeatedly shifting
words from the input sentence onto a stack and then reducing them into
larger constituents.
• The parser starts with the empty stack and the input sentence. It then
repeatedly applies the following two actions:
• Shift: The parser shifts the next word from the input sentence onto the
stack.
• Reduce: The parser reduces the top elements of the stack into a larger constituent, according to a grammar rule.
• The parser continues to apply these actions until the stack contains a single
constituent, which is the parse tree for the input sentence.
• Here is an example of how a shift-reduce parser would parse the sentence
"The cat sat on the mat."
Input: The cat sat on the mat.

1. Shift "The".                                     Stack: The
2. Shift "cat".                                     Stack: The cat
3. Reduce "The cat" into a noun phrase (NP).        Stack: NP
4. Shift "sat".                                     Stack: NP sat
5. Shift "on".                                      Stack: NP sat on
6. Shift "the".                                     Stack: NP sat on the
7. Shift "mat".                                     Stack: NP sat on the mat
8. Reduce "the mat" into a noun phrase (NP).        Stack: NP sat on NP
9. Reduce "on NP" into a prepositional phrase (PP). Stack: NP sat PP
10. Reduce "sat PP" into a verb phrase (VP).        Stack: NP VP
11. Reduce "NP VP" into a sentence (S).             Stack: S
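• A minimal sketch using NLTK's demonstration shift-reduce parser (assuming NLTK is installed; the grammar below mirrors the trace above):

import nltk

grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> Det N
    VP -> V PP
    PP -> P NP
    Det -> 'The' | 'the'
    N  -> 'cat' | 'mat'
    V  -> 'sat'
    P  -> 'on'
""")

# trace=2 prints each shift and reduce step as the parser applies it.
parser = nltk.ShiftReduceParser(grammar, trace=2)
for tree in parser.parse("The cat sat on the mat".split()):
    print(tree)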
Deterministic Parsers
• Deterministic parsers in Natural Language Processing
(NLP) are parsing algorithms that follow a fixed set of
rules to determine the next action during the parsing
process. Unlike non-deterministic parsers, which may
have multiple possible actions at each step,
deterministic parsers make decisions based on a
predefined strategy or heuristic, leading to a unique
parsing path. Deterministic parsing is often more
efficient than non-deterministic parsing due to its
predictable nature. Here are two common examples of
deterministic parsers:
1. Shift-Reduce Parser:
• The Shift-Reduce parser is a deterministic parsing
algorithm that operates by repeatedly shifting a token
from the input to the stack and then reducing elements
on top of the stack based on predefined grammar rules.
• The parser makes decisions at each step solely based on
the current state of the stack and the input token,
following a fixed set of shift and reduce rules.
• Example: The shift-reduce parse of "The cat sat on the mat" shown in the previous section is an example of deterministic parsing.
2. Earley Parser:
• The Earley parser is a chart-based parsing algorithm that uses dynamic programming and is built around three operations: the predictor, the scanner, and the completer.
• It works by constructing a chart that records all possible parse constituents for different
parts of the input sentence.
• The parser makes deterministic decisions by following the predictor, scanner, and
completer rules to add new constituents to the chart and combine them to form valid
parse trees.
• Example: Consider the following context-free grammar and input sentence:
• Grammar:
S -> NP VP
VP -> "chased" NP
NP -> "the" "cat"
NP -> "the" "mouse"

• Input Sentence: "the cat chased the mouse"
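• A minimal sketch of running this grammar through an Earley-style chart parser (assuming a recent NLTK release, which provides nltk.parse.earleychart.EarleyChartParser):

import nltk
from nltk.parse.earleychart import EarleyChartParser

grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    VP -> 'chased' NP
    NP -> 'the' 'cat'
    NP -> 'the' 'mouse'
""")

# trace=1 prints the chart edges (predict/scan/complete steps) as they are added.
parser = EarleyChartParser(grammar, trace=1)
for tree in parser.parse("the cat chased the mouse".split()):
    print(tree)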
