Lecture 2 Hierarchy of NLP & TF-IDF
MODULE III
Understanding Natural Languages
TF-IDF
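The introductory slides on TF-IDF are figures. As a minimal sketch, the weighting named in the lecture title can be computed with the common tf(t, d) × log(N / df(t)) variant (one of several TF and IDF weighting schemes), here applied to the three sentences from Q1 below:

```python
import math
from collections import Counter

def tf_idf(documents):
    """TF-IDF per document: tf(t,d) * log(N / df(t)), one common variant."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    df = Counter()                     # document frequency of each term
    for tokens in tokenized:
        df.update(set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        scores.append({t: (tf[t] / total) * math.log(n_docs / df[t])
                       for t in tf})
    return scores

scores = tf_idf([
    "This movie is very scary and long",
    "This movie is not scary and is slow",
    "This movie is spooky and good",
])
```

Note that a word appearing in every document (such as "movie") gets an IDF of log(3/3) = 0, so TF-IDF down-weights terms that carry no discriminating information.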
Q1. Apply the bag-of-words (BOW) method to the following sentences and
convert them to vector form:
Sentence 1: This movie is very scary and long
Sentence 2: This movie is not scary and is slow
Sentence 3: This movie is spooky and good
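A minimal sketch of Q1 in pure Python. The vocabulary is built in order of first occurrence (one common convention; any fixed ordering works), and each sentence becomes a vector of word counts over that vocabulary:

```python
from collections import Counter

def bag_of_words(sentences):
    """Build a shared vocabulary and count-vectorize each sentence."""
    tokenized = [s.lower().split() for s in sentences]
    vocab = []
    for tokens in tokenized:           # vocabulary in order of first occurrence
        for tok in tokens:
            if tok not in vocab:
                vocab.append(tok)
    counts = [Counter(tokens) for tokens in tokenized]
    vectors = [[c[w] for w in vocab] for c in counts]
    return vocab, vectors

vocab, vectors = bag_of_words([
    "This movie is very scary and long",
    "This movie is not scary and is slow",
    "This movie is spooky and good",
])
```

For Sentence 2, "is" occurs twice, so its vector has a 2 in the "is" position while every other entry is 0 or 1.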
Topic Modeling
• Uncovering hidden structures in sets of
texts or documents.
• Groups texts to discover latent topics.
• Assumes each document consists of a
mixture of topics and that each topic
consists of a set of words.
Topic Modeling
(Example)
Parsing
• Breaking down a given sentence into its
grammatical constituents.
• Example:
• “Who won the cricket worldcup in 2019?”
• “The swift black cat jumps over the wall”
Part-of-speech (POS) tagging
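The POS-tagging slide is a figure. As a minimal illustration, a toy lexicon-lookup tagger for the example sentence from the previous slide (real taggers such as those in NLTK or spaCy use statistical or neural models trained on annotated corpora; this lexicon is hand-written for the example):

```python
# Hand-written lexicon for illustration only; tags follow the
# Universal POS tag set (an assumption, not stated in the slides).
LEXICON = {
    "the": "DET", "swift": "ADJ", "black": "ADJ",
    "cat": "NOUN", "jumps": "VERB", "over": "ADP", "wall": "NOUN",
}

def pos_tag(sentence):
    """Tag each word by dictionary lookup; unknown words get UNK."""
    return [(w, LEXICON.get(w.lower(), "UNK")) for w in sentence.split()]

tags = pos_tag("The swift black cat jumps over the wall")
```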
Constituency parsing
• Need to identify and define commonly
seen grammatical patterns.
• Divide words into groups, called
constituents, based on their grammatical
role in the sentence.
• Example:
• ‘[Amitian] [read] [an article on Syntactic Analysis]’
Dependency Parsing
• Dependencies are established between
words themselves.
• Example:
• ‘Amitians attend classes’
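The dependencies for ‘Amitians attend classes’ can be written as head-dependent triples: the verb "attend" is the root, with "Amitians" and "classes" depending on it. The relation labels below (nsubj, obj) follow the Universal Dependencies convention, an assumption not stated in the slides:

```python
# Dependency structure of "Amitians attend classes" as
# (head, dependent, relation) triples.
ROOT = "attend"
DEPENDENCIES = [
    ("attend", "Amitians", "nsubj"),  # subject depends on the verb
    ("attend", "classes", "obj"),     # object depends on the verb
]

def dependents_of(head):
    """All words that depend directly on the given head."""
    return [dep for h, dep, _ in DEPENDENCIES if h == head]
```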
Co-reference resolution
• Coreference resolution is the task of
finding all expressions that refer to the
same entity in a text.
Example: a text mentioning ‘Michael Cohen’ and ‘Mr. Trump’ contains
two entities; later mentions such as ‘he’ must be resolved to the
correct one.
Word sense
disambiguation
• NLP involves resolving different kinds of
ambiguity.
• A word can take different meanings
making it ambiguous to understand.
• Word sense disambiguation (WSD) means
selecting the correct word sense for a
particular word.
Word sense
disambiguation
• Example:
• The word “bank”. It can refer to a financial
institution or the land alongside a river.
• These different meanings are called word
senses.
• Context can be used effectively to perform
WSD.
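One classic way to use context for WSD is the simplified Lesk algorithm: pick the sense whose dictionary gloss shares the most words with the sentence. The glosses below are hand-written for the "bank" example, not taken from a real lexicon:

```python
# Hand-written sense inventory for illustration only.
SENSES = {
    "bank": {
        "finance": "a financial institution that accepts deposits and lends money",
        "river": "the sloping land alongside a river or lake",
    }
}

def lesk(word, sentence):
    """Return the sense whose gloss overlaps most with the context."""
    context = set(sentence.lower().split())
    glosses = SENSES[word]
    return max(glosses, key=lambda s: len(context & set(glosses[s].split())))

sense = lesk("bank", "he sat on the bank of the river and watched the water")
```

Here the context words "the" and "river" overlap with the river gloss, so that sense wins; a sentence about loans and deposits would select the financial sense instead.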
Named entity
recognition
• Identification of named entities such as
persons, locations, organisations which
are denoted by proper nouns.
• Example:
• “Michael Jordan is a professor at
Berkeley.”
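A toy gazetteer-based recognizer for the example sentence. Real NER systems use sequence models; this dictionary scan, and the entity labels chosen here, are illustrative assumptions:

```python
# Illustrative gazetteer; labels are assumptions for this example.
GAZETTEER = {
    "Michael Jordan": "PERSON",
    "Berkeley": "ORGANIZATION",
}

def find_entities(text):
    """Return every gazetteer entry that occurs in the text."""
    return [(name, label) for name, label in GAZETTEER.items()
            if name in text]

entities = find_entities("Michael Jordan is a professor at Berkeley.")
```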
Context free grammars
• A context-free grammar (CFG) consists of rewrite rules with a
single non-terminal symbol on the left-hand side. Let us create a
grammar to parse the sentence
• “The bird pecks the grains”
Context free grammars
Context free grammars
• The parse tree breaks down the sentence
into structured parts so that the computer
can easily understand and process it.
• In order for the parsing algorithm to
construct this parse tree, a set of rewrite
rules, which describe what tree structures
are legal, need to be constructed.
Context free grammars
• These rules say that a certain symbol may
be expanded in the tree by a sequence of
other symbols.
• For example, one such rule states that if there are two strings, a
noun phrase (NP) and a verb phrase (VP), then the string formed by
NP followed by VP is a sentence (S → NP VP).
Context free grammars
• The rewrite rules for the sentence are as follows:
S → NP VP
NP → DET N
VP → V NP
DET → the
N → bird | grains
V → pecks | peck
Context free grammars
• The parse tree can be created as shown −
Context free grammars
• Now consider the above rewrite rules. Since V can be rewritten as
either "pecks" or "peck", a sentence such as "The bird peck the
grains" is wrongly permitted, i.e. the subject-verb agreement error
is accepted as correct.
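This demerit can be sketched in code. For a grammar of the form S → NP VP, NP → DET N, VP → V NP (an assumed toy grammar consistent with the example), every sentence flattens to the token-category pattern DET N V DET N, so a recognizer over that pattern accepts the agreement error exactly as described:

```python
# Lexicon for the example sentences; the grammar
#   S -> NP VP, NP -> DET N, VP -> V NP
# flattens to the category pattern DET N V DET N.
LEXICON = {"the": "DET", "bird": "N", "grains": "N",
           "pecks": "V", "peck": "V"}

def accepts(sentence):
    """Recognize sentences generated by the toy grammar."""
    categories = [LEXICON.get(word) for word in sentence.lower().split()]
    return categories == ["DET", "N", "V", "DET", "N"]

# The grammar carries no agreement or semantic features, so:
valid = accepts("The bird pecks the grains")           # grammatical
agreement_error = accepts("The bird peck the grains")  # still accepted
nonsense = accepts("The grains peck the bird")         # accepted too
```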
Context free grammars
• Merit: the simplest style of grammar, and therefore the most
widely used.
• Demerits:
They are not highly precise. For example, "The grains peck the
bird" is syntactically correct according to the parser, so even
though it makes no sense, the parser accepts it as a valid sentence.
Context free grammars
• Demerits
To achieve high precision, multiple grammars need to be prepared.
Completely different sets of rules may be required for singular and
plural variations, passive sentences, etc., which can lead to a
huge, unmanageable set of rules.
Transformational
Grammar
• These are grammars in which a sentence is represented structurally
in two stages.
• Obtaining different structures from sentences having the same
meaning is undesirable in language-understanding systems.
• Sentences with the same meaning should always correspond to the
same internal knowledge structures.
Transformational
Grammar
• In the first stage, the basic structure of the sentence is
analyzed to determine its grammatical constituent parts; the second
stage is just the reverse of the first.
• This second stage reveals the surface structure of the sentence:
the way the sentence is used in speech or in writing.
Transformational Grammar
• The two example sentences above are different sentences, but they
have the same meaning.
• Thus they are an example of transformational grammar.
• These grammars were never widely used in computational models of
natural language.
• Applications of this grammar include changing voice (active to
passive and passive to active), changing a question to declarative
form, etc.
TRANSITION NETWORK
• The transition from N1 to N2 will be made if
an article is the first input symbol.
• If successful, state N2 is entered.
• The transition from N2 to N3 can be made if
a noun is found next.
• If successful, state N3 is entered.
• The transition from N3 to N4 can be made if
an auxiliary is found and so on.
• Consider the sentence "A boy is eating a banana".
• Parsing it with the above transition network: ‘A’ is an article,
so the network moves from N1 to N2; "boy" is a noun (N2 to N3);
"is" is an auxiliary (N3 to N4); "eating" is a verb (N4 to N5);
"a" is an article (N5 to N6); and finally "banana" is a noun
(N6 to N7).
• Since the final state is reached at the end of the input, the
sentence is successfully parsed.
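The walkthrough above can be simulated with a small table of arcs. The state names N1..N7 and the arc categories below are taken from the description (the original network figure is not shown here, so this layout is an assumption consistent with the text):

```python
# (state, word category) -> next state
ARCS = {
    ("N1", "ART"):  "N2",
    ("N2", "NOUN"): "N3",
    ("N3", "AUX"):  "N4",
    ("N4", "VERB"): "N5",
    ("N5", "ART"):  "N6",
    ("N6", "NOUN"): "N7",
}
LEXICON = {"a": "ART", "boy": "NOUN", "is": "AUX",
           "eating": "VERB", "banana": "NOUN"}

def parse(sentence, start="N1", final="N7"):
    """Traverse the network word by word; succeed if the final state
    is reached exactly when the input is exhausted."""
    state = start
    for word in sentence.lower().split():
        state = ARCS.get((state, LEXICON.get(word)))
        if state is None:          # no arc for this word: parse fails
            return False
    return state == final

ok = parse("A boy is eating a banana")
```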
TYPES OF TRANSITION
NETWORK
• There are two main types of transition networks:
1. Recursive Transition Networks (RTN)
2. Augmented Transition Networks (ATN)
Recursive Transition Networks (RTN)
• An RTN is a modified version of the transition network.
• It permits arc labels to refer to other networks (which may in
turn refer back to the referring network), rather than only to
word categories.
Augmented Transition Network
(ATN)
• An ATN is a modified transition network.
• It is an extension of RTN.
• The ATN uses a top-down parsing procedure to gather various types
of information to be used later by the understanding system.
• It produces a data structure suitable for further processing and
capable of storing semantic details.
• An augmented transition network (ATN) is
a recursive transition network that can
perform tests and take actions during arc
transitions.
• An ATN uses a set of registers to store
information.
• A set of actions is defined for each arc and
the actions can look at and modify the
registers.
• An arc may have a test associated with it.
• The arc is traversed (and its action taken) only if the test
succeeds.
• When a lexical arc is traversed, the current word is placed in a
special variable (*).
• The ATN was first used in the LUNAR system.
• In an ATN, an arc can thus have an arbitrary test and an arbitrary
action attached to it.
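A hedged sketch of these ideas in code: registers hold partial results, a '*' variable holds the current word, and an arc test gates the transition. The slides describe tests and actions only abstractly, so the subject-verb agreement test below is an illustrative choice:

```python
# Minimal ATN-style pass over a two-word sentence.
LEXICON = {"birds": ("NOUN", "plural"), "bird": ("NOUN", "singular"),
           "peck": ("VERB", "plural"), "pecks": ("VERB", "singular")}

def atn_parse(words):
    """Fill registers word by word; a failed arc test aborts the parse."""
    registers = {}                      # ATN registers store partial results
    for star in words:                  # '*' holds the current word
        cat, number = LEXICON[star.lower()]
        if cat == "NOUN":
            # arc action: record the subject and its number
            registers["SUBJ"], registers["NUMBER"] = star, number
        elif cat == "VERB":
            # arc test: traverse only if the verb agrees with SUBJ
            if registers.get("NUMBER") != number:
                return None             # test fails, arc not traversed
            registers["VERB"] = star    # arc action: record the verb
    return registers

result = atn_parse(["birds", "peck"])
```

Unlike the plain CFG earlier, the register test rejects "bird peck", showing how ATN actions add precision without multiplying grammar rules.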
The structure of ATN