
DELHI TECHNOLOGICAL UNIVERSITY

SE-316
NATURAL LANGUAGE PROCESSING

Submitted by
Bharat Mishra
Roll Number: 2K21/SE/54
Batch: SE-A1

Submitted to: Geetanjali Garg

Department of Software Engineering
Delhi Technological University
Bawana Road, Delhi-110042
INDEX

S. No.  Experiment                                                                                            Date
1.      Import nltk and download the ‘stopwords’ and ‘punkt’ packages                                        13-01-2024
2.      Import spacy and load the language model.                                                            19-01-2024
3.      WAP in Python to tokenize a given text.                                                              09-02-2024
4.      WAP in Python to get the sentences of a text document.                                               09-02-2024
5.      WAP in Python to tokenize text with stopwords as delimiters.                                         23-02-2024
6.      WAP in Python to add custom stop words in spaCy.                                                     05-03-2024
7.      WAP to remove punctuation, perform stemming, lemmatize the given text and extract usernames from emails   19-03-2024
8.      WAP to perform spell correction and extract all nouns, pronouns and verbs in a given text            26-03-2024
9.      WAP to find similarity between two words and classify a text as positive/negative sentiment          02-04-2024
EXPERIMENT-1
AIM : Import nltk and download the ‘stopwords’ and ‘punkt’
packages

CODE :
import nltk
nltk.download('stopwords')
nltk.download('punkt')

OUTPUT :
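As a quick check that the downloads worked, the two resources can be exercised together: ‘punkt’ backs word_tokenize and ‘stopwords’ supplies the English stop word list. A minimal sketch (the sample sentence is arbitrary):

from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Filter English stop words out of a tokenized sentence
stop_words = set(stopwords.words('english'))
tokens = word_tokenize("This is a simple sentence for testing the downloads.")
print([t for t in tokens if t.lower() not in stop_words])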
EXPERIMENT-2

AIM : Import spacy and load the language model

CODE :
import spacy
nlp_eng = spacy.load('en_core_web_sm')
nlp_multi = spacy.load('xx_ent_wiki_sm')

OUTPUT :
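A minimal sketch of the loaded models in use, assuming both are installed (e.g. via python -m spacy download en_core_web_sm): en_core_web_sm is a full English pipeline, while xx_ent_wiki_sm is a multilingual, entity-only model.

# Print named entities from the English pipeline loaded above
doc = nlp_eng("Apple is looking at buying a U.K. startup for $1 billion.")
for ent in doc.ents:
    print(ent.text, ent.label_)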
EXPERIMENT-3

AIM : WAP in Python to tokenize a given text

CODE :
from nltk import word_tokenize

text = "Last week, the University of Cambridge shared its own research that shows if everyone wears a mask outside home, dreaded ‘second wave’ of the pandemic can be avoided."
tokens = word_tokenize(text)
for t in tokens:
    print(t)

OUTPUT :
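For comparison, spaCy's rule-based tokenizer handles the same text; a minimal sketch, assuming en_core_web_sm is installed:

import spacy

nlp = spacy.load('en_core_web_sm')
# spaCy splits on linguistic rules, so punctuation becomes separate tokens too
doc = nlp(text)
print([token.text for token in doc])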
EXPERIMENT-4
AIM : WAP in Python to get the sentences of a text document.

CODE :
file = open('/content/demo.text')
input_text = file.read()
sentences = input_text.split('.')

for sentence in sentences:
    print(sentence, '\n')

OUTPUT :
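Splitting on '.' breaks on abbreviations such as "Dr." and discards the full stops themselves. NLTK's punkt-based sentence tokenizer (downloaded in Experiment 1) is a more robust alternative; a minimal sketch on the same file:

from nltk.tokenize import sent_tokenize

with open('/content/demo.text') as file:
    input_text = file.read()
# sent_tokenize uses the trained 'punkt' model, so abbreviations do not split sentences
for sentence in sent_tokenize(input_text):
    print(sentence, '\n')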
EXPERIMENT-5

AIM : WAP in Python to tokenize text with stopwords as delimiters.

CODE :
text = "Walter was feeling anxious. He was diagnosed today. He probably is the best
person I know."

stop_words_and_delims = ['was', 'is', 'the', '.', ',', '-', '!', '?']


for r in stop_words_and_delims:
text = text.replace(r, 'DELIM')

words = [t.strip() for t in text.split('DELIM')]


words_filtered = list(filter(lambda a: a not in [''], words))
for word in words_filtered:
print(word)

OUTPUT :
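The plain str.replace calls above also match inside longer words (for example, 'is' inside 'this'). A regex split with word boundaries avoids that; a sketch using only the standard library:

import re

text = "Walter was feeling anxious. He was diagnosed today. He probably is the best person I know."
# \b keeps 'is' from matching inside longer words; the character class covers the punctuation delimiters
pattern = r'\b(?:was|is|the)\b|[.,!?-]'
chunks = [c.strip() for c in re.split(pattern, text) if c.strip()]
for chunk in chunks:
    print(chunk)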
EXPERIMENT-6

AIM : WAP in Python to add custom stop words in spaCy.

CODE :
import spacy

nlp = spacy.load('en_core_web_sm')

custom_stop_words = ['was', 'is', 'the', 'JUNK', 'NIL', 'of', 'more', '.', ',', '-', '!', '?', 'a']
for word in custom_stop_words:
    nlp.vocab[word].is_stop = True

doc = nlp("Jonas was a JUNK great guy NIL Adam was evil NIL Martha JUNK was more of a fool")
for token in doc:
    if not token.is_stop:
        print(token.text, end=" ")

OUTPUT :
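The is_stop flag is writable in both directions, so a custom stop word can also be inspected and removed; a minimal sketch continuing from the code above:

# Check whether a word is currently treated as a stop word
print(nlp.vocab['JUNK'].is_stop)    # True after the loop above

# Revert the flag to remove the custom stop word again
nlp.vocab['JUNK'].is_stop = False
print(nlp.vocab['JUNK'].is_stop)    # False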
EXPERIMENT-7
AIM : WAP to remove punctuation, perform stemming, lemmatize the given text and extract usernames from emails

CODE :
punctuations = '''!()-[]{};:'"\\,<>./?@#$%^&*_~'''
string = "Jonas!!! great \\guy <> Adam --evil [Martha] ;;fool() ."
ans = ""
for char in string:
    if char not in punctuations:
        ans += char

print(ans)

from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

text = "Dancing is an art. Students should be taught dance as a subject in schools. I danced in many of my school functions. Some people are always hesitating to dance."
ans = ""
stemmer = PorterStemmer()
tokens = word_tokenize(text)
for token in tokens:
    ans += stemmer.stem(token)
    ans += " "
print(ans)

import nltk
from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize
from nltk.stem.wordnet import WordNetLemmatizer

nltk.download('wordnet')

lemmatizer = WordNetLemmatizer()
text = "Dancing is an art. Students should be taught dance as a subject in schools. I danced in many of my school functions. Some people are always hesitating to dance."
ans = ""
tokens = word_tokenize(text)
for token in tokens:
    ans += lemmatizer.lemmatize(token, wordnet.VERB)
    ans += " "
print(ans)

from nltk.tokenize import word_tokenize

text = "The new registrations are [email protected] , [email protected]. If you find any disruptions, kindly contact [email protected] or [email protected]"

text_list = word_tokenize(text)
usernames = []
for i in range(len(text_list)):
    if text_list[i] == "@":
        usernames.append(text_list[i-1])
print(usernames)

OUTPUT :
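Matching the "@" token depends on how the tokenizer happens to split a given address, so a regular expression over the raw string is a common alternative; a minimal sketch (the pattern assumes usernames consist of word characters and dots):

import re

# Capture the part before '@'; [\w.]+ is a simplifying assumption about username characters
usernames = re.findall(r'([\w.]+)@[\w.]+', text)
print(usernames)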
EXPERIMENT-8
AIM : WAP to perform spell correction and extract all nouns, pronouns and verbs in a
given text.

CODE :
from textblob import TextBlob
text="He is a gret person. He beleives in bod"
textb = TextBlob(text)
correct_text = textb.correct()
print(correct_text)

import nltk
from nltk import word_tokenize, pos_tag

text = "James works at Microsoft. She lives in manchester and likes to play the flute"
tokens = word_tokenize(text)
parts_of_speech = nltk.pos_tag(tokens)
nouns = list(filter(lambda x: x[1] == "NN" or x[1] == "NNP", parts_of_speech))
for noun in nouns:
    print(noun[0])

from nltk import pos_tag, word_tokenize

text = "I may bake a cake for my birthday. The talk will introduce reader about Use of baking"

words = word_tokenize(text)
tags = pos_tag(words)   # tag the sentence once instead of re-tagging inside the loop

verb_phrases = []
for i in range(len(words)):
    if i > 0 and tags[i][1] == 'VB':
        verb_phrases.append(words[i-1] + ' ' + words[i])

for phrase in verb_phrases:
    print(phrase)
OUTPUT :
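The AIM also asks for pronouns, which the Penn Treebank tagset marks as PRP and PRP$; a minimal sketch extending the first POS-tagging snippet above:

from nltk import word_tokenize, pos_tag

text = "James works at Microsoft. She lives in manchester and likes to play the flute"
parts_of_speech = pos_tag(word_tokenize(text))
# PRP = personal pronoun, PRP$ = possessive pronoun in the Penn Treebank tagset
pronouns = [word for word, tag in parts_of_speech if tag in ('PRP', 'PRP$')]
print(pronouns)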
EXPERIMENT-9
AIM : WAP to find similarity between two words and classify a text
as positive/negative sentiment

CODE :
import spacy

nlp = spacy.load('en_core_web_md')
words = "amazing terrible excellent"

tokens = nlp(words)

token1, token2, token3 = tokens[0], tokens[1], tokens[2]

print(f"Similarity between {token1} and {token2} : ", token1.similarity(token2))


print(f"Similarity between {token1} and {token3} : ", token1.similarity(token3))

from textblob import TextBlob

text = "It was a very pleasant day"
print(TextBlob(text).sentiment)

OUTPUT :
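TextBlob's sentiment is a named tuple with polarity in [-1, 1] and subjectivity in [0, 1], so the positive/negative classification asked for in the AIM can be read off the sign of the polarity; a minimal sketch (the cut-off of 0 is an assumption):

polarity = TextBlob(text).sentiment.polarity
# Treat non-negative polarity as positive; 0 is an assumed threshold
label = 'positive' if polarity >= 0 else 'negative'
print(label)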
