Natural Language Processing
Unit-I
Syed Rameem Zahra
(Assistant Professor)
Department of CSE, NSUT
Introduction to NLP
● Natural Language Processing (NLP) refers to AI methods for communicating
with intelligent systems using a human language (e.g. English), in speech
or in text.
● NLP-powered software helps us in our daily lives in various ways, for
example:
○ Personal assistants: Siri, Cortana, and Google Assistant.
○ Auto-complete: in search engines (e.g. Google).
○ Spell checking: almost everywhere, in your browser, your IDE (e.g. Visual
Studio), and desktop apps (e.g. Microsoft Word).
○ Machine Translation: Google Translate.
NLP: Real World Examples
Source: Wikipedia
Advantages of NLP
Applications of NLP
● Machine Translation (the translation of text or speech by a computer with no
human involvement).
● Information Retrieval (software that deals with the organization, storage,
retrieval and evaluation of information from document repositories, particularly
textual information).
● Question Answering (building systems that automatically answer questions
posed by humans in a natural language).
● Dialogue Systems (computer systems intended to converse with a human).
● Information Extraction (the automatic extraction of structured information
such as entities, relationships between entities, and attributes).
● Summarization (the technique of shortening long pieces of text).
● Sentiment Analysis (identifying and extracting opinions within a given text across
blogs, reviews, social media, forums, news, etc.).
Evolution of NLP
NLP Terminology
● Phonology − It is the study of organizing sounds systematically.
● Morphology − It is the study of the construction of words from primitive meaningful
units.
● Morpheme − It is the primitive unit of meaning in a language.
● Syntax − It refers to arranging words to make a sentence. It also involves
determining the structural role of words in the sentence and in phrases.
● Semantics − It is concerned with the meaning of words and how to combine
words into meaningful phrases and sentences.
● Pragmatics − It deals with using and understanding sentences in different
situations and how the interpretation of the sentence is affected.
● Discourse − It deals with how the immediately preceding sentence can affect
the interpretation of the next sentence.
● World Knowledge − It includes the general knowledge about the world.
Process of NLP
Natural Language Understanding (NLU)
Part-of-Speech (POS) Tagging
● Each word has a part-of-speech tag to describe its category.
● The part-of-speech tag of a word is one of the major word groups (or their
subgroups).
○ open classes -- noun, verb, adjective, adverb
○ closed classes -- prepositions, determiners, conjunctions, pronouns, participles
● POS taggers try to find POS tags for the words.
● Is duck a verb or a noun? (A morphological analyzer cannot make that decision.)
● A POS tagger may make that decision by looking at the surrounding words.
○ Duck! (verb)
○ Duck is delicious for dinner. (noun)
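As a minimal sketch, the same two examples can be run through NLTK's off-the-shelf tagger (this assumes the 'punkt' and 'averaged_perceptron_tagger' resources have been downloaded; the exact tags returned depend on the model and on capitalization):

# Sketch: tagging the two "duck" examples with NLTK's default tagger.
from nltk import pos_tag, word_tokenize

for sentence in ["Duck!", "Duck is delicious for dinner."]:
    tokens = word_tokenize(sentence)
    print(tokens, "->", pos_tag(tokens))
# The tagger uses the surrounding words (e.g. the subject position in the
# second sentence) to choose between the verb and noun readings.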
Lexical Processing
● The purpose of lexical processing is to determine the meanings of individual
words.
● The basic method is to look words up in a database of meanings -- a lexicon.
● We should also identify non-words such as punctuation marks.
● Word-level ambiguity -- words may have several meanings, and the
correct one cannot be chosen based solely on the word itself.
○ bank in English (financial institution vs. river bank)
○ yüz in Turkish (face, hundred, or swim)
● Solution -- resolve the ambiguity on the spot by POS tagging (if possible)
or pass the ambiguity on to the other levels.
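A small illustration of the lexicon-lookup idea is to list a few WordNet senses of the ambiguous word bank via NLTK (a sketch, assuming the 'wordnet' corpus has been downloaded):

# Sketch: looking up an ambiguous word in a lexical database (WordNet).
from nltk.corpus import wordnet as wn

for synset in wn.synsets("bank")[:5]:
    print(synset.name(), "-", synset.definition())
# The lexicon returns many candidate senses; choosing among them
# requires context (POS tagging, or later semantic processing).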
Syntactic Processing
Knowledge Representation for NLP
Stages of NLP
● Lexical Analysis:
➢It involves identifying and analyzing the structure of words.
➢The lexicon of a language is the collection of words and phrases in that language.
➢Lexical analysis divides the whole chunk of text into paragraphs, sentences, and
words (see the tokenization sketch after this list).
● Syntactic Analysis (Parsing):
➢It involves analyzing the words in the sentence for grammar and arranging the
words in a manner that shows the relationships among the words.
➢A sentence such as “The school goes to boy” is rejected by an English syntactic
analyzer.
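A minimal sketch of the lexical-analysis step, splitting text into sentences and words with NLTK (the 'punkt' resource is assumed to be downloaded; the example text is illustrative):

# Sketch: dividing a chunk of text into sentences and words.
from nltk import sent_tokenize, word_tokenize

text = "The boy goes to school. The school is far from his home."
for sentence in sent_tokenize(text):
    print(word_tokenize(sentence))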
Stages of NLP (Contd…)
● Semantic Analysis:
➢It draws the exact meaning or the dictionary meaning from the text.
➢The text is checked for meaningfulness.
➢It is done by mapping syntactic structures onto objects in the task domain.
➢The semantic analyzer disregards sentences such as “hot ice-cream”.
● Discourse Integration:
➢The meaning of any sentence depends upon the meaning of the sentence just
before it.
➢In addition, it also brings about the meaning of the immediately succeeding sentence.
Stages of NLP (Contd…)
● Pragmatic Analysis:
➢During this stage, what was said is re-interpreted in terms of what it actually meant.
➢It involves deriving those aspects of language which require real
world knowledge.
Why NLP is hard: Difficulties in NLU
Classical NLP Problems
● Mostly Solved:
● Text Classification (e.g. spam detection in Gmail).
● Part of Speech (POS) tagging: Given a sentence, determine the POS tag for each word (e.g. NOUN,
VERB, ADV, ADJ).
● Named Entity Recognition (NER): Given a sentence, determine named entities (e.g. person names,
locations, organizations).
● Making Solid Progress:
● Sentiment Analysis: Given a sentence, determine its polarity (e.g. positive, negative, neutral), or
emotions (e.g. happy, sad, surprised, angry, etc.)
● Co-reference Resolution: Given a sentence, determine which words (“mentions”) refer to the same
objects (“entities”). For example: (Manning is a great NLP professor, he worked in the field for over two
decades).
● Word Sense Disambiguation (WSD): Many words have more than one meaning; we have to select the
meaning which makes the most sense based on the context (e.g. I went to the bank to get some money),
here bank means a financial institution, not the land beside a river.
● Machine Translation (e.g. Google Translate).
● Still Challenging:
● Dialogue agents and chat-bots, especially open-domain ones.
● Question Answering.
● Abstractive Summarization.
● NLP for low-resource languages (e.g. African languages)
Morphology
● Morphology comes from a Greek word meaning “Shape” or “Form”
and is used in linguistics to denote the study of words, both with
regard to their internal structure (e.g. washing -> wash + ing) and
their combination or formulation to form new or larger units (e.g.
bat->bats :: rat->rats)
● Morphology tries to formulate rules.
● It helps in different domains such as spell checkers and machine
translation.
● A morphological analyzer and generator is a tool: the analyzer analyzes a given
word, and the generator produces a word given its stem and its
features (like affixes).
● It identifies how a word is produced through the use of morphemes.
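A full morphological analyzer is beyond a single slide, but the analysis side can be roughly sketched with NLTK's stemmer and lemmatizer (the 'wordnet' corpus is assumed to be downloaded; these tools return only the stem or lemma, not the affix features a real analyzer would output):

# Sketch: approximating morphological analysis with stemming/lemmatization.
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()
for word in ["washing", "bats", "rats"]:
    print(word, "->", stemmer.stem(word), "/", lemmatizer.lemmatize(word))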
Morpheme and its Types
● The morpheme is the smallest element of
a word that has grammatical function and
meaning.
● Types:
○ Free morpheme: A single free
morpheme can become a complete
word on its own. For instance, bus, bicycle,
and so forth.
○ Bound morpheme: It cannot stand
alone and must be joined to a free
morpheme to produce a word. -ing, un-,
and other bound morphemes are
examples.
Basic word classes (parts of speech)
● Content words (open-class):
– Nouns: student, university, knowledge,...
– Verbs: write, learn, teach,...
– Adjectives: difficult, boring, hard, ....
– Adverbs: easily, repeatedly,...
● Function words (closed-class):
– Prepositions: in, with, under,...
– Conjunctions: and, or,...
– Determiners: a, the, every,...
Words aren’t just defined by blanks
● Problem 1: Compounding
○ “ice cream”, “website”, “web site”, “New York-based”
● Problem 2: Other writing systems have no blanks, like
Chinese
● Problem 3: Clitics
○ English: “doesn’t”, “I’m”
○ Italian: “dirglielo” = dir + gli(e) + lo (meaning: tell + him + it)
How many words are there?
“Of course he wants to take the advanced course too. He already
took two beginners’ courses.”
● How many word tokens are there?
○ (16 to 19, depending on how we count punctuation)
● How many word types are there?
○ i.e. How many different words are there?
○ Again, this depends on how you count, but it’s usually much less than the number of
tokens
● The same (underlying) word can take different forms: course/courses, take/took
● We distinguish concrete word forms (take, taking) from abstract lemmas or dictionary forms
(take)
● Different words may be spelled/pronounced the same: of course vs. advanced course; two
vs. too
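The token/type distinction is easy to check programmatically; a minimal sketch (the tokenization and lowercasing choices here are assumptions, and changing them changes the counts):

# Sketch: counting word tokens vs. word types for the example sentence.
import re

text = ("Of course he wants to take the advanced course too. "
        "He already took two beginners' courses.")
tokens = re.findall(r"[\w']+", text.lower())   # crude tokenizer, drops punctuation
types = set(tokens)
print(len(tokens), "tokens,", len(types), "types")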
How many different words are there?
Derivational morphology
● Nominalization:
○ V + -ation: computerization
○ V+ -er: killer
○ Adj + -ness: fuzziness
● Negation:
○ un-: undo, unseen, …
○ mis-: mistake,...
● Adjectivization:
○ V+ -able: doable
○ N + -al: national
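These derivational patterns can be written down as simple (POS, suffix) -> POS rules; the table below is a hypothetical sketch, and real rules must also handle spelling changes (e.g. computerize + -ation -> computerization, fuzzy + -ness -> fuzziness):

# Sketch: a toy rule table for derivational morphology.
rules = {
    ("V", "ation"):  "N",    # nominalization
    ("V", "er"):     "N",
    ("Adj", "ness"): "N",
    ("V", "able"):   "Adj",  # adjectivization
    ("N", "al"):     "Adj",
}

def derive(stem, pos, suffix):
    new_pos = rules.get((pos, suffix))
    return (stem + suffix, new_pos) if new_pos else None

print(derive("do", "V", "able"))       # ('doable', 'Adj')
print(derive("fuzzy", "Adj", "ness"))  # naive concatenation gives 'fuzzyness', not 'fuzziness'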
Morphemes: stems, affixes
dis-grace-ful-ly
prefix-stem-suffix-suffix
● Many word forms consist of a stem plus a number of affixes (prefixes or
suffixes)
○ Infixes are inserted inside the stem.
○ Circumfixes surround the stem (German ge-…-en, as in gesehen)
● Morphemes: the smallest (meaningful/grammatical) parts of words.
○ Stems (grace) are often free morphemes.
○ Free morphemes can occur by themselves as words.
○ Affixes (dis-, -ful, -ly) are usually bound morphemes.
○ Bound morphemes have to combine with others to form words.
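As a toy illustration of splitting a word form into stem and affixes, here is a sketch with a hand-picked affix list (the prefix and suffix lists are assumptions; real systems use a lexicon and the finite-state machinery introduced below):

# Sketch: peeling illustrative prefixes/suffixes off "disgracefully".
prefixes = ["dis"]
suffixes = ["ly", "ful"]

def strip_affixes(word):
    found_prefixes, found_suffixes = [], []
    for p in prefixes:
        if word.startswith(p):
            found_prefixes.append(p)
            word = word[len(p):]
    stripped = True
    while stripped:
        stripped = False
        for s in suffixes:
            if word.endswith(s):
                found_suffixes.insert(0, s)
                word = word[:-len(s)]
                stripped = True
    return found_prefixes + [word] + found_suffixes

print(strip_affixes("disgracefully"))   # ['dis', 'grace', 'ful', 'ly']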
Morphemes and morphs
disgracefully
dis      grace     ful      ly
prefix   stem      suffix   suffix
NEG      grace+N   +ADJ     +ADV
Morphological generation
Finite-State Automata and Regular Languages: review
Automata and languages
● Simple patterns:
○ Standard characters match themselves: ‘a’, ‘1’
○ Character classes: ‘[abc]’, ‘[0-9]’, negation: ‘[^aeiou]’
○ (Predefined: \s (whitespace), \w (alphanumeric), etc.)
○ Any character (except newline) is matched by ‘.’
● Complex patterns: (e.g. ^[A-Z]([a-z])+\s )
○ Group: ‘(...)’
○ Repetition: 0 or more times: ‘*’, 1 or more times: ‘+’
○ Disjunction: ‘...|...’
○ Beginning of line ‘^’ and end of line ‘$’
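The same patterns can be tried directly with Python's re module; a small sketch (the test strings are illustrative):

# Sketch: the regular-expression notation above, in Python.
import re

print(re.findall(r"[0-9]+", "room 42, floor 7"))            # ['42', '7']
print(re.findall(r"[^aeiou\s]", "regex"))                   # non-vowels: ['r', 'g', 'x']
print(bool(re.match(r"^[A-Z]([a-z])+\s", "Hello world")))   # True
print(bool(re.match(r"^[A-Z]([a-z])+\s", "hello world")))   # False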
Finite-state methods for morphology
Stem changes
● Some irregular words require stem changes:
○ Past tense verbs: teach-taught, go-went, write-wrote
○ Plural nouns: mouse-mice, foot-feet, wife-wives
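Because such forms do not follow the regular affix rules, analyzers typically keep them in an exception lexicon; a minimal sketch (the entries are just the examples above, and the feature strings are illustrative):

# Sketch: an exception lexicon for irregular forms.
irregular = {
    "taught": ("teach", "+V +past"),
    "went":   ("go",    "+V +past"),
    "wrote":  ("write", "+V +past"),
    "mice":   ("mouse", "+N +pl"),
    "feet":   ("foot",  "+N +pl"),
    "wives":  ("wife",  "+N +pl"),
}

print(irregular.get("mice"))   # ('mouse', '+N +pl')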
FSAs for derivational morphology
Finite-state transducers
● A finite-state transducer T = ⟨Q, Σ, Δ, q0, F, δ, σ⟩ consists of:
○ A finite set of states Q = {q0, q1, ..., qn}
○ A finite alphabet Σ of input symbols (e.g. Σ = {a, b, c, ...})
○ A finite alphabet Δ of output symbols (e.g. Δ = {+N, +pl, ...})
○ A designated start state q0 ∈ Q
○ A set of final states F ⊆ Q
○ A transition function δ: Q × Σ → 2^Q (the power set of Q)
■ δ(q, w) = Q’ for q ∈ Q, Q’ ⊆ Q, w ∈ Σ
○ An output function σ: Q × Σ → Δ*
■ σ(q, w) = ω for q ∈ Q, w ∈ Σ, ω ∈ Δ*
■ If the current state is q and the current input is w, write ω.
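To make the definition concrete, here is a minimal sketch of an FST in Python that maps the surface form cats to the analysis cat +N +pl (the states, transitions, and output strings are all hypothetical):

# Sketch: a tiny finite-state transducer, following the definition above.
delta = {("q0", "c"): {"q1"}, ("q1", "a"): {"q2"},      # transition function δ
         ("q2", "t"): {"q3"}, ("q3", "s"): {"q4"}}
sigma = {("q0", "c"): "c", ("q1", "a"): "a",            # output function σ
         ("q2", "t"): "t", ("q3", "s"): " +N +pl"}
finals = {"q4"}                                         # final states F

def transduce(word, state="q0"):
    output = ""
    for symbol in word:
        next_states = delta.get((state, symbol))
        if not next_states:
            return None                  # no transition: the input is rejected
        output += sigma.get((state, symbol), "")
        state = next(iter(next_states))  # this toy machine happens to be deterministic
    return output if state in finals else None

print(transduce("cats"))   # cat +N +pl
print(transduce("dogs"))   # None (not covered by this toy machine)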
Finite-state transducers
Simplified Morphological Parsing FST
Some FST operations
● Inversion (T⁻¹):
○ The inversion T⁻¹ of a transducer switches its input and output
labels.
○ This can be used to switch from parsing words to generating
words.
● Composition (T ∘ T’): (Cascade)
○ Two transducers T = L1 ⨉ L2 and T’ = L2 ⨉ L3 can be composed
into a third transducer T’’ = L1 ⨉ L3.
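Inversion is easy to picture if the transducer is stored as a list of arcs: swapping the input and output labels turns a parser into a generator. A sketch with hypothetical arcs:

# Sketch: inverting a transducer by swapping input and output labels.
# Each arc is (state, input_symbol, output_symbol, next_state).
arcs = [("q0", "c", "c", "q1"), ("q1", "a", "a", "q2"),
        ("q2", "t", "t", "q3"), ("q3", "s", "+N +pl", "q4")]

inverted = [(q, out, inp, q_next) for (q, inp, out, q_next) in arcs]
print(inverted[-1])   # ('q3', '+N +pl', 's', 'q4'): reads the analysis, writes the surface form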
Problems in Morphological Analyzer
● Productivity
● False Analysis
● Bound Base Morphemes
Productivity
False analysis
Bound Base Morphemes