
NLP ASSIGNMENT NOTES

UNIT 1
1. Comparison: Boolean Model vs Vector Space Model vs
Probabilistic Model (for NLP & IR)

🔷 Boolean Model

 Basic Idea: Documents and queries are represented as sets of terms (present or absent).

 Operations: Uses Boolean logic (AND, OR, NOT).

 Strengths:

o Simple to implement.

o Fast for small datasets.

o Precise matching when query and document terms align exactly.

 Weaknesses:

o No partial matching – either a document is relevant or not.

o No ranking of results.

o Doesn’t consider term frequency or document length.

 Example Use: Early search engines and digital libraries.
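
To make the set-based view concrete, here is a minimal Boolean retrieval sketch in Python; the toy documents and terms are invented for illustration.

# Minimal Boolean retrieval sketch over a toy corpus (documents are made up).
docs = {
    "d1": "information retrieval with boolean queries",
    "d2": "vector space model for ranked retrieval",
    "d3": "probabilistic models estimate relevance",
}

# Build an inverted index: term -> set of document ids containing it.
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)

# Boolean logic maps directly onto set operations.
print(index["retrieval"] & index["boolean"])   # retrieval AND boolean -> {'d1'}
print(index["retrieval"] - index["boolean"])   # retrieval NOT boolean -> {'d2'}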

🔷 Vector Space Model (VSM)

 Basic Idea: Documents and queries are represented as vectors in a multi-dimensional space (each dimension = term).

 Similarity Measure: Cosine similarity is often used.

 Strengths:

o Supports ranking of documents based on similarity.

o Handles partial matches and term weighting (like TF-IDF).

o Simple mathematical foundation.

 Weaknesses:
o Ignores relationships between terms (no semantics).

o Assumes independence between terms.

 Example Use: Search engines using TF-IDF scoring.
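
A small sketch of VSM ranking, assuming scikit-learn is available; the toy documents and query are invented for illustration.

# Rank documents against a query with TF-IDF vectors and cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell sharply today",
]
query = ["cats on a mat"]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)    # one TF-IDF vector per document
query_vector = vectorizer.transform(query)      # query mapped into the same space

scores = cosine_similarity(query_vector, doc_vectors)[0]
ranking = scores.argsort()[::-1]                # best-matching document first
print(ranking, scores)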

🔷 Probabilistic Model

 Basic Idea: Assigns a probability that a document is relevant to a given query.

 Models: Includes the Binary Independence Model (BIM), BM25, and Language Models.

 Strengths:

o Considers uncertainty and estimates relevance.

o Can adapt and learn from user feedback.

o Supports probabilistic ranking.

 Weaknesses:

o Computationally more complex.

o Requires training data or relevance judgments.

 Example Use: Modern IR systems such as web search ranking and ad retrieval.
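
As an illustration of probabilistic ranking, here is a minimal BM25 scoring sketch using one common variant of the Okapi BM25 formula; k1 and b are typical default parameters and the toy corpus is invented.

import math

docs = [
    "the cat sat on the mat".split(),
    "dogs and cats are pets".split(),
    "stock markets fell sharply today".split(),
]
query = "cats mat".split()

N = len(docs)
avgdl = sum(len(d) for d in docs) / N   # average document length
k1, b = 1.5, 0.75                       # typical BM25 parameters

def bm25(query, doc):
    score = 0.0
    for term in query:
        n_q = sum(1 for d in docs if term in d)            # document frequency
        idf = math.log((N - n_q + 0.5) / (n_q + 0.5) + 1)  # smoothed IDF
        tf = doc.count(term)                               # term frequency in doc
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

print([round(bm25(query, d), 3) for d in docs])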

2. Classical NLP Models: Rule-Based vs Statistical vs Information Retrieval Models

🔹 Rule-Based Models

 How They Work:

o Use manually written linguistic rules.

o Depend on syntax, lexicons, and grammar rules.

 Strengths:

o Transparent and explainable.

o Precise for controlled language environments.


 Weaknesses:

o Hard to scale across domains and languages.

o Rigid and brittle – small changes break the system.

 Examples:

o Grammar checkers (like Grammarly).

o Early NER systems using regular expressions.

🔹 Statistical Models

 How They Work:

o Use data-driven approaches based on probability.

o Learn from annotated corpora (training data).

 Types:

o N-gram models, Hidden Markov Models (HMM), Conditional Random Fields (CRF).

 Strengths:

o Scalable and domain-adaptive.

o More robust than rule-based models.

 Weaknesses:

o Require large labeled datasets.

o May lack transparency in decision-making.

 Examples:

o POS tagging (HMM).

o Named Entity Recognition (CRF).

🔹 Information Retrieval (IR) Models

 How They Work:


o Match documents to queries using keyword overlap or statistical
scoring (like TF-IDF).

 Strengths:

o Fast and efficient on large-scale corpora.

o Good for unstructured document retrieval.

 Weaknesses:

o Lack deep understanding of language (semantic gap).

 Examples:

o Search engines (Lucene, Solr).

o Document recommendation systems.

3. Probabilistic Graphical Models (PGMs) and Their Role in NLP

🔷 Definition:

 PGMs are frameworks that use graph structures to represent and reason about uncertain variables.

 Combine graph theory + probability theory.

🔷 Two Main Types:

1. Bayesian Networks (Directed Acyclic Graphs)

o Represent causal relationships.

o Use conditional probabilities.

2. Markov Random Fields (Undirected Graphs)

o Represent symmetric relationships.

o Capture dependencies without direction.

🔷 Importance in NLP Tasks:

Task | Role of PGMs
POS Tagging | Hidden Markov Models (HMMs) estimate tag sequences.
NER / Chunking | CRFs (Conditional Random Fields) predict label sequences with context.
Topic Modeling | LDA (Latent Dirichlet Allocation) identifies hidden topics using a generative Bayesian model.
Parsing | Probabilistic CFGs (Context-Free Grammars) are used to assign tree structures.
Semantic Role Labeling | Graphical models capture dependencies between sentence elements.

🔷 Advantages:

 Handle uncertainty and hidden variables.

 Model dependencies and structure in language.

 Scalable to large datasets.

 Provide probabilistic predictions useful in many NLP applications.

🔷 Challenges:

 Require training data and parameter estimation.

 Inference can be computationally expensive.
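
As a concrete example of inference in a simple PGM, the sketch below runs Viterbi decoding for a two-tag HMM; all probabilities are invented for illustration, not estimated from a corpus.

# Toy Viterbi decoding for HMM-based PoS tagging (probabilities are made up).
tags = ["N", "V"]
start = {"N": 0.6, "V": 0.4}                       # P(tag at position 0)
trans = {"N": {"N": 0.3, "V": 0.7},                # P(next tag | current tag)
         "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"dog": 0.4, "barks": 0.1},           # P(word | tag)
        "V": {"dog": 0.05, "barks": 0.5}}

def viterbi(words):
    # best[t] = (probability, tag sequence) of the best path ending in tag t
    best = {t: (start[t] * emit[t].get(words[0], 1e-6), [t]) for t in tags}
    for word in words[1:]:
        new_best = {}
        for t in tags:
            prob, path = max(
                (best[p][0] * trans[p][t] * emit[t].get(word, 1e-6), best[p][1])
                for p in tags
            )
            new_best[t] = (prob, path + [t])
        best = new_best
    return max(best.values())[1]

print(viterbi(["dog", "barks"]))   # expected: ['N', 'V']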

UNIT 2
1. Key Differences between Phonetics and Phonology

Aspect | Phonetics | Phonology
Definition | Study of the physical sounds of speech | Study of how sounds function and are organized in a language
Focus | Physical articulation, acoustic properties, and perception | Abstract, rule-based sound patterns and systems in a language
Units | Phones (actual spoken sounds) | Phonemes (distinct sound units that can change meaning)
Scientific Area | More related to physics, biology, and physiology | More related to linguistics and mental representation
Example | [p] and [pʰ] are different phonetically (with/without aspiration) | /p/ and /b/ are phonemes in English – "pat" vs "bat"
Applications | Speech synthesis, recognition, forensic linguistics | Linguistic analysis, language teaching, computational linguistics

2. Definition of Morphology and Word Formation Processes

🔹 Morphology:

 The study of the structure and formation of words.

 Focuses on morphemes – the smallest meaningful units in a language.

🔹 Word Formation Processes (with examples):

1. Derivation:

o Adding prefixes/suffixes to form new words.

o happy → unhappiness (prefix: un-, suffix: -ness)

2. Inflection:

o Changes a word’s form to express tense, number, case, etc.

o walk → walked (past tense), cat → cats (plural)

3. Compounding:

o Joining two or more words to form a new word.

o tooth + brush → toothbrush

4. Conversion (Zero Derivation):

o Changing the word class without changing the form.

o run (verb) → a run (noun)

5. Clipping:
o Shortening a longer word.

o advertisement → ad

6. Blending:

o Combining parts of two words.

o breakfast + lunch → brunch

7. Acronyms and Initialisms:

o Forming words from initials.

o NASA (National Aeronautics and Space Administration)

8. Reduplication:

o Repeating part/all of a word.

o bye-bye, tick-tock

3. Role of Finite-State Transducers (FSTs) in Morphological Analysis

🔹 What is an FST?

 A Finite-State Transducer is a type of automaton that maps between two levels of representation, e.g., surface form ↔ lexical form.

 Used to model morphological rules in computational linguistics.

🔹 How FSTs Work in Morphology:

 FSTs take an input word and break it down into root + affix(es).

 They can also generate surface forms from lexical entries.

🔹 Example:

Suppose we have the word: "talked"

 Lexical form: talk + PAST

 Surface form: talked

✅ An FST can:

 Analyze: From talked → talk + PAST


 Generate: From talk + PAST → talked
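
A toy sketch of the analyze/generate idea in Python. A real system would compile rules into transducers with an FST toolkit such as HFST or foma; the dictionary here is only a stand-in for the lexical ↔ surface mapping.

# Toy analyzer/generator: the dict plays the role of a compiled transducer.
lexicon = {
    ("talk", "PAST"): "talked",
    ("walk", "PAST"): "walked",
    ("go", "PAST"): "went",        # irregular form handled by lookup
}
surface_to_lexical = {v: k for k, v in lexicon.items()}

def generate(root, feature):
    """Lexical form (root + feature) -> surface form."""
    return lexicon.get((root, feature))

def analyze(surface):
    """Surface form -> lexical form (root + feature)."""
    return surface_to_lexical.get(surface)

print(generate("talk", "PAST"))   # talked
print(analyze("talked"))          # ('talk', 'PAST')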

🔹 Practical Scenarios:

1. Spell-checkers:

o Recognize and correct inflected and derived forms of words.

2. Text-to-speech systems:

o Accurately pronounce morphologically complex words.

3. Search engines:

o Perform stemming and lemmatization to match different word forms.

4. Machine Translation:

o Accurately translate morphologically rich languages.

5. Language learning apps:

o Teach correct word forms and conjugations.

UNIT 3
1. Tokenization in NLP

✅ Definition:

Tokenization is the process of breaking down a text into smaller units called tokens (e.g., words, phrases, symbols).

✅ Steps Involved:

1. Input Text Processing:

o Raw text is taken as input for processing.

2. Language Identification (optional but useful):

o Detect the language to apply proper tokenization rules (since rules vary by language).

3. Sentence Segmentation:

o Divide the text into individual sentences.


o Example: "Hello world! How are you?" → ["Hello world!",
"How are you?"]

4. Word Tokenization:

o Each sentence is broken into words/tokens.

o Example: "Hello world!" → ["Hello", "world", "!"]

5. Punctuation Handling:

o Decide whether to keep/remove punctuation as separate tokens.

6. Special Handling (e.g., contractions):

o "I'm" → ["I", "am"] (depending on the tokenizer)

7. Language-Specific Rules:

o For agglutinative languages (e.g., Turkish), sub-word or morpheme tokenization may be applied.
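
A short tokenization sketch with NLTK, assuming the nltk package and its 'punkt' tokenizer data are installed.

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize

nltk.download("punkt", quiet=True)   # sentence tokenizer models, downloaded once

text = "Hello world! How are you?"
sentences = sent_tokenize(text)           # ['Hello world!', 'How are you?']
tokens = [word_tokenize(s) for s in sentences]
print(sentences)
print(tokens)   # [['Hello', 'world', '!'], ['How', 'are', 'you', '?']]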

✅ Importance:

 Acts as the first step in most NLP pipelines.

 Essential for:

o Text analysis

o Information retrieval

o Sentiment analysis

o Machine translation

2. PoS Tagging: Rule-Based vs Stochastic vs Lexical

Model Type | Description | Example | When It's Effective
Rule-Based | Uses hand-crafted linguistic rules to assign PoS tags. | If a word ends with "ly", tag it as an adverb. | Low-resource settings, languages with rich morphology.
Stochastic (Statistical) | Uses probabilistic models like HMMs and CRFs based on word and tag frequencies. | "The dog barks." → high probability that "dog" is a noun. | When trained on large, labeled corpora.
Lexical (Dictionary-Based) | Tags are assigned based on dictionaries/lexicons. | WordNet or large tagged corpora. | When working with standard vocabulary and limited context is needed.

✅ Examples:

 Rule-Based:

o "If the previous word is 'to' and the current word is a verb → mark it as the base form of the verb."

 Stochastic:

o Trained on a corpus: for "flies like ...", the tagger infers whether "flies" is a noun or a verb from context.

 Lexical:

o "run" is both a verb and a noun; a dictionary lists all possible tags, but context is not used.
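
A quick tagging sketch using NLTK's pretrained (stochastic) tagger, assuming the nltk package and its tokenizer/tagger data are installed.

import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = nltk.word_tokenize("The dog barks loudly")
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('dog', 'NN'), ('barks', 'VBZ'), ('loudly', 'RB')]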

3. Named Entity Recognition (NER)

✅ Definition:

NER is a subtask of NLP that identifies and classifies named entities in text into predefined categories such as:

 Person names

 Organizations

 Locations

 Dates

 Quantities
 Monetary values

 Events, etc.

✅ Example:

"Apple Inc. announced a new product in California on March 21, 2023."

NER Output:

 Apple Inc. → Organization

 California → Location

 March 21, 2023 → Date
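
A minimal NER sketch with spaCy, assuming the en_core_web_sm model is installed; the exact labels can vary slightly between model versions.

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple Inc. announced a new product in California on March 21, 2023.")

# Print each detected entity with its label.
for ent in doc.ents:
    print(ent.text, "->", ent.label_)
# Roughly: Apple Inc. -> ORG, California -> GPE, March 21, 2023 -> DATE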

✅ Applications:

Field | Use Case
Business | Extracting company names, financial info, executive movements from reports.
Journalism | Quickly tagging people, places, events in articles.
Healthcare | Identifying diseases, drug names, patient details from clinical notes.
Legal | Extracting case numbers, defendant/plaintiff names, dates from documents.
Social Media | Identifying trending people, brands, and places in tweets/posts.

UNIT 4
1. What is Syntactic Parsing?

Syntactic parsing (also called syntax analysis) is the process of analyzing a sentence to reveal its grammatical structure, often represented as a parse tree.
 It helps determine how words relate to each other in a sentence.

 Outputs phrase structure or dependency relations.

🟩 2. Why is Syntactic Parsing Important?

 Understands sentence structure for deeper NLP tasks.

 Aids in:

o Machine Translation

o Information Extraction

o Question Answering

o Grammar Checking

🟩 3. Types of Parsing

✅ A. Constituency Parsing (Phrase Structure Parsing)

 Breaks sentence into nested constituents (NP, VP, PP).

 Based on Context-Free Grammar (CFG).

Example:

(S
  (NP The quick brown fox)
  (VP jumps
    (PP over
      (NP the lazy dog))))

✅ B. Dependency Parsing

 Focuses on word-to-word relationships.


 Each word is a node, and arcs represent dependencies (subject, object,
etc.)

Example (Dependency Tree):

 jumps → root

 fox → subject of jumps

 over → modifier of jumps

 dog → object of over
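
A short dependency-parsing sketch with spaCy (assuming the en_core_web_sm model); it prints each word, its dependency relation, and its head.

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The quick brown fox jumps over the lazy dog")

for token in doc:
    print(f"{token.text:6} --{token.dep_}--> {token.head.text}")
# e.g. fox --nsubj--> jumps, over --prep--> jumps, dog --pobj--> over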

🟩 4. Parsing Techniques

✅ A. Rule-Based Parsing

 Uses hand-written grammar rules (e.g., CFGs)

 Parses sentence according to production rules.

 ✅ Transparent and explainable

 ❌ Not robust to real-world noisy text

✅ B. Statistical Parsing

 Learns parsing from treebank datasets (annotated corpora)

 Uses:

o PCFG (Probabilistic CFG): Assigns probabilities to CFG rules.

o Chart Parsers, CKY Algorithm

✅ C. Transition-Based Parsing (for Dependency Parsing)

 Builds parse tree incrementally using actions (SHIFT, REDUCE)

 Efficient and used in real-time NLP systems

✅ D. Neural Parsing

 Uses neural networks (e.g., LSTMs, Transformers)


 Learns from large corpora

 Highly accurate and generalizable

Libraries:

 spaCy

 Stanza (Stanford NLP)

 Benepar (Berkeley Neural Parser)

 AllenNLP

🟩 5. Treebanks – Role in Parsing

 Treebanks are annotated corpora with syntactic parse trees.

 Used for training and evaluating parsers.

Examples:

 Penn Treebank (for constituency parsing)

 Universal Dependencies (for dependency parsing)

🟩 6. Example Grammar and CFG Parsing

Grammar Rules (simple CFG):


S -> NP VP

NP -> Det N

VP -> V NP

Det -> 'the'

N -> 'dog' | 'cat'

V -> 'chased'

Sentence:

the dog chased the cat


Parse Tree:

(S
  (NP (Det the) (N dog))
  (VP (V chased)
    (NP (Det the) (N cat))))

🟩 7. Applications of Syntactic Parsing

 Grammar correction (e.g., Grammarly)

 Voice assistants: understanding commands

 Machine translation: structural disambiguation

 Question answering systems

 Summarization

What is a Linguistically Annotated Corpus?

A linguistically annotated corpus is a large collection of text that has been tagged or marked up with linguistic information such as:

 Part-of-Speech (PoS) tags

 Syntax (parse trees)

 Semantics (meaning)
 Named Entities

 Morphological information

 Coreference links

 Dependency relations

💡 Think of it as a text + expert linguistic labels for machines to learn from.

🟩 2. Purpose of Annotated Corpora

 Training and evaluating NLP models.

 Studying language structure.

 Developing tools like:

o Part-of-speech taggers

o Parsers

o Named Entity Recognizers

o Machine Translation systems

🟩 3. Common Types of Annotations

Annotation Type | Description | Example
PoS Tagging | Labels each word with its part of speech | dog/NN, run/VB
Morphological | Includes root word, tense, number, etc. | running → run + V + ing
Syntactic | Phrase structure or dependency trees | NP → Det + Adj + Noun
Semantic | Meaning-based roles or senses | "bank" → financial vs river
NER (Named Entities) | Tags names of persons, places, orgs, etc. | Obama → PERSON
Coreference | Links pronouns and their antecedents | She = Mary

🟩 4. Popular Linguistically Annotated Corpora

Corpus Name | Features | Use Case
Penn Treebank | PoS tags, phrase structure (CFG trees) | Syntax parsing
Universal Dependencies (UD) | Multilingual, dependency parsing | Cross-lingual NLP
Brown Corpus | One of the first annotated corpora | General linguistic research
OntoNotes | Syntax, semantics, coreference, NER | Semantic role labeling
SemCor | Word Sense Disambiguation | Lexical semantics

🟩 5. Example: Annotated Sentence (from Penn Treebank)

Raw Sentence:

"The quick brown fox jumps over the lazy dog."

Annotated with PoS:


[The/DT quick/JJ brown/JJ fox/NN]NP

[jumps/VBZ]VP
[over/IN the/DT lazy/JJ dog/NN]PP

Parse Tree (simplified):

(S
  (NP (DT The) (JJ quick) (JJ brown) (NN fox))
  (VP (VBZ jumps) ...))

🟩 6. Benefits in NLP

 Improves model accuracy via supervised learning.

 Provides ground truth for evaluation.

 Enables complex tasks like:

o Coreference Resolution

o Semantic Role Labeling

o Machine Translation

🟩 7. How They're Created

 Manual Annotation: By linguists or trained annotators.

 Semi-Automated Tools: Annotators use tools like Brat, Prodigy,


WebAnno.

 Crowdsourcing: Amazon Mechanical Turk for large datasets.

🟩 8. Tools to Use Annotated Corpora

 NLTK (Python): Comes with corpora like Treebank, Brown

 spaCy: Pretrained models from annotated corpora


 Stanza: Accesses Universal Dependencies
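
For example, NLTK ships a small sample of the Penn Treebank that can be loaded directly (assuming the nltk package and its 'treebank' data package).

import nltk

nltk.download("treebank", quiet=True)
from nltk.corpus import treebank

print(treebank.tagged_sents()[0][:5])   # first few (word, PoS-tag) pairs
print(treebank.parsed_sents()[0])       # a full constituency parse tree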

1. Treebanks and Their Role in Syntactic Parsing

✅ What is a Treebank?

 A treebank is a linguistically annotated corpus where each sentence is paired with a syntactic parse tree that shows its grammatical structure.

 The parse tree is created using a grammar (usually a Context-Free Grammar).

✅ Types of Treebanks:

 Constituency Treebanks: Show how words group into constituents (phrases).

 Dependency Treebanks: Show head-dependent relations between words.

✅ Examples:

 Penn Treebank (the most famous English treebank, using phrase-structure trees)

 Universal Dependencies (UD), covering multiple languages with dependency parses

✅ Role in Syntactic Parsing:

 Used to train and evaluate parsers.

 Helps NLP models learn grammatical structures and patterns.

 Crucial for applications like:

o Machine translation

o Grammar correction

o Question answering

2. Statistical Parsing vs. Probabilistic CFGs (PCFGs)


Aspect | Statistical Parsing | Probabilistic CFG (PCFG)
Definition | General parsing using machine-learned models | A CFG where each production rule has an associated probability
Basis | Data-driven, learned from treebanks | Probabilistic rules derived from corpus frequencies
Output | Most probable parse tree for a sentence | Same, but derived using a probabilistic CFG
Flexibility | Can use rich features and context | Limited to rules and probabilities
Examples | Neural parsers, transition-based parsers | Inside-Outside algorithm, CKY with probabilities
Use Case | Real-time parsing in large applications | Structured prediction in small grammar-based systems

3. Create Grammar Rules Using Context-Free Grammar (CFG)

✅ CFG Basics:

 A Context-Free Grammar is defined by:

o A set of non-terminals (e.g., S, NP, VP)

o A set of terminals (e.g., words)

o Production rules (e.g., S → NP VP)

o A start symbol (usually S)

✅ Sample Grammar for a Subset of English:

S → NP VP

NP → Det N | Det Adj N | Pronoun

VP → V NP | V

Det → "a" | "the"

N → "dog" | "cat" | "boy"


Adj → "happy" | "angry"

V → "chased" | "saw"

Pronoun → "he" | "she"

✅ Example Sentence:

"the happy dog chased a cat"

✅ Parse Tree (Structure):

(S
  (NP (Det the) (Adj happy) (N dog))
  (VP (V chased)
    (NP (Det a) (N cat))))

✅ Parsing Using Python (with NLTK):

import nltk
from nltk import CFG

# Define the grammar
grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N | Det Adj N | Pronoun
VP -> V NP | V
Det -> 'a' | 'the'
N -> 'dog' | 'cat' | 'boy'
Adj -> 'happy' | 'angry'
V -> 'chased' | 'saw'
Pronoun -> 'he' | 'she'
""")

# Create the parser
parser = nltk.ChartParser(grammar)

# Input sentence (already tokenized)
sentence = ['the', 'happy', 'dog', 'chased', 'a', 'cat']

# Parse the sentence and print every parse tree found
for tree in parser.parse(sentence):
    tree.pretty_print()

UNIT 5
1. First-Order Logic (FOL) and Description Logics (DLs)

✅ First-Order Logic (FOL)

 Definition: A formal system used to express statements with quantifiers, predicates, and logical connectives.

 Components:

o Constants: specific entities (e.g., John)

o Variables: general placeholders (e.g., x)


o Predicates: properties/relations (e.g., Loves(John, Mary))

o Quantifiers:

 Universal: ∀x (for all)

 Existential: ∃x (there exists)

o Logical connectives: AND (∧), OR (∨), NOT (¬), IMPLIES (→)

 Importance in Semantic Analysis:

o Captures meaning using formal logic.

o Enables inference and reasoning.

o Used in question answering, knowledge representation, and semantic parsing.
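
As a small illustration, NLTK's logic module can parse and manipulate FOL formulas (assuming the nltk package); the predicate names below are made up for the example.

from nltk.sem import Expression

read_expr = Expression.fromstring
# "Every person loves some person"
formula = read_expr("all x.(person(x) -> exists y.(person(y) & loves(x, y)))")
print(formula)
print(formula.free())   # empty set: no free variables, so the formula is closed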

✅ Description Logics (DLs)

 Definition: A subset of FOL focused on concepts (classes), roles (relationships), and individuals.

 Used primarily in ontology languages like OWL (Web Ontology Language).

 Example:

o Concept: Person

o Role: hasChild

o Assertion: Person ⊑ ∃hasChild.Person (every person has a child who is also a person)

 Importance in Semantic Analysis:

o Facilitates the semantic web, ontology building, and knowledge graphs.

o Balances expressiveness and computational tractability.

o Powers tools like reasoners (e.g., Pellet, HermiT) to infer new knowledge.

2. Report on Thematic Roles and Selectional Restrictions


✅ Thematic Roles (Semantic Roles)

 Define the relationship between a verb and its arguments.

 Common Roles:

o Agent: The doer of the action (John in “John kicked the ball”)

o Theme: The entity affected (the ball)

o Experiencer: One who feels or perceives (Mary in “Mary felt cold”)

o Instrument: Means by which the action is performed (knife in “cut with a knife”)

o Location, Goal, Source, etc.

✅ Selectional Restrictions

 Definition: Constraints that verbs place on their arguments based on semantic compatibility.

 Examples:

o eat expects an edible object: ✔️“eat an apple”, ❌“eat a table”

o drive expects a vehicle: ✔️“drive a car”, ❌“drive a banana”

✅ Importance:

 Ensures grammatical and semantic validity.

 Helps in word sense disambiguation.

 Enhances machine understanding of meaning by filtering implausible combinations.

3. Word Sense Disambiguation (WSD)

✅ Definition:

 The process of identifying which sense (meaning) of a word is used in a sentence when the word has multiple meanings.

✅ Example:

 Word: “bank”
o “I deposited money at the bank.” → financial institution

o “He sat by the river bank.” → river edge

✅ WSD Approaches:

1. Knowledge-Based:

o Use dictionaries/ontologies like WordNet

o Example: Lesk Algorithm (overlap of dictionary definitions and context)

2. Supervised Learning:

o Train classifiers (e.g., SVM, Naive Bayes) on labeled corpora.

o Requires annotated data.

3. Unsupervised Learning:

o Use clustering on word contexts.

o Doesn’t require labeled data.

4. Neural Approaches:

o Contextual embeddings (e.g., BERT, ELMo) capture word meaning in context.

o Example: Fine-tuned BERT model on WSD datasets.
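
A minimal knowledge-based WSD sketch using NLTK's simplified Lesk implementation (assuming nltk and its WordNet data); the returned sense depends on definition overlap, so it may differ from intuition.

import nltk

nltk.download("wordnet", quiet=True)
from nltk.wsd import lesk

sent = "I deposited money at the bank".split()
sense = lesk(sent, "bank", "n")     # pick a noun sense of "bank" from context
print(sense, "-", sense.definition() if sense else "no sense found")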

✅ Applications:

 Machine Translation (choose correct word in target language)

 Information Retrieval (more accurate search results)

 Text Mining (correct extraction of entities or topics)

PoS Tagging

🟩 1. What is PoS Tagging?

Part-of-Speech (PoS) tagging is the process of assigning a grammatical category (tag) to each word in a sentence based on its context.

✅ Example:
Sentence:
"The quick brown fox jumps over the lazy dog."

Tagged Output:

 The/DT

 quick/JJ

 brown/JJ

 fox/NN

 jumps/VBZ

 over/IN

 the/DT

 lazy/JJ

 dog/NN

Here, NN = Noun, JJ = Adjective, VBZ = Verb (3rd person singular present), etc.
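
A short sketch reproducing this kind of output with spaCy (assuming the en_core_web_sm model); token.tag_ returns Penn Treebank style tags.

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The quick brown fox jumps over the lazy dog.")

for token in doc:
    print(token.text, token.tag_)   # Penn Treebank tags: DT, JJ, NN, VBZ, ...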

🟩 2. Common PoS Tags (from Penn Treebank)

Tag | Meaning | Example
NN | Noun (singular) | dog, table
NNS | Noun (plural) | cats, trees
VB | Verb (base) | run, go
VBD | Verb (past tense) | ate, went
VBG | Verb (gerund/present participle) | running
JJ | Adjective | quick, lazy
RB | Adverb | quickly, silently
IN | Preposition/Subordinating Conjunction | on, over
DT | Determiner | the, a
PRP | Personal Pronoun | he, she

🟩 3. Approaches to PoS Tagging

✅ A. Rule-Based Tagging

 Uses handcrafted linguistic rules to determine tags.

 Example: If a word ends in “-ing”, tag as VBG.

 ✅ Advantage: Transparent, interpretable

 ❌ Limitation: Not scalable, brittle

✅ B. Stochastic (Statistical) Tagging

 Based on probability of tag sequences.

 Uses models like:

o Hidden Markov Models (HMM)

o Maximum Entropy Models

 Considers the likelihood of a tag given previous tags (n-grams).

Example:
If "can" is preceded by a noun and followed by a verb, it’s likely a modal
verb.

✅ C. Lexical/Dictionary-Based Tagging

 Uses pre-tagged corpora and dictionaries.

 Assigns the most frequent tag based on corpus data.

✅ D. Machine Learning and Deep Learning Methods

 Use models like:


o CRF (Conditional Random Fields)

o RNNs / LSTMs

o Transformer-based models (BERT, RoBERTa)

 Fine-tuned on large corpora (e.g., Universal Dependencies)

 ✅ High accuracy, context-aware tagging

 ❌ Require large datasets and computational resources

🟩 4. Importance of PoS Tagging

 Syntax analysis: Helps in parsing and grammar checking

 WSD (Word Sense Disambiguation): Aids in deciding meaning

 NER (Named Entity Recognition): Improves entity recognition

 Information Retrieval & Extraction: Enhances search precision

 Speech Synthesis & Translation: Guides prosody and structure

🟩 5. Tools for PoS Tagging

 NLTK (Python)

 spaCy

 Stanford NLP

 Flair

 BERT-based taggers
