NLP Lab Complete
R POLYTECHNIC, GUDLAVALLERU
Department of ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
CERTIFICATE
Certified that this is the Bonafide Record Work of NATURAL LANGUAGE PROCESSING carried out by
Mr. / Ms.
A student of
Marks Awarded 40
Experiment-1 DATE:
AIM: INSTALL ANACONDA AND NLTK IN PYTHON.
2. Click Download; the site checks compatibility with your PC, and then the download starts.
5. Select the “Just Me” install unless you’re installing for all users (which requires Windows Administrator privileges) and click Next.
6. Select a destination folder to install Anaconda and click the Next button.
7. Click the Install button; after installation completes, click the Next button.
8. Finally, click the Finish button.
To install NLTK using the Anaconda prompt, you can follow these steps:
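A minimal sketch of the usual commands (assuming NLTK is available from your configured conda channels; pip install nltk is an alternative):
1. Open the Anaconda Prompt and run: conda install nltk
2. Start Python and run:
import nltk
nltk.download()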
After this you will get a GUI where you can download all the data.
RESULT:
Experiment-2 DATE:
AIM: EXECUTE TOKENIZATION BY WORD USING NLTK IN PYTHON.
Introduction
Tokenization is a fundamental step in natural language processing (NLP) that involves breaking down
text into smaller units called tokens. These tokens can be words, phrases, or symbols.
PROGRAM 1:
import nltk
nltk.download('punkt')  # tokenizer models needed by word_tokenize
from nltk.tokenize import word_tokenize
print(word_tokenize("this is nirmal kollipara"))
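word_tokenize returns a list of word tokens; the expected output here is:
['this', 'is', 'nirmal', 'kollipara']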
PROGRAM 2:
import nltk
from nltk.tokenize import word_tokenize

def token(file_path):
    # Read the file and tokenize its contents into words
    with open(file_path, 'r') as file:
        text = file.read()
    tokens = word_tokenize(text)
    return tokens

file_path = r'C:\Users\DELL\Desktop\abc.txt'
tokens = token(file_path)
print(tokens)
RESULT:
Experiment-3 DATE:
AIM: EXECUTE TOKENIZATION BY SENTENCE USING NLTK IN PYTHON.
Program 1
import nltk
nltk.download('punkt')  # sentence tokenizer models needed by sent_tokenize
from nltk.tokenize import sent_tokenize
t = "hi how are you. this nirmal kollipara"
print(sent_tokenize(t))
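sent_tokenize splits the text at sentence boundaries; the expected output here is:
['hi how are you.', 'this nirmal kollipara']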
Program 2
import nltk
from nltk.tokenize import sent_tokenize
file = open("C:/Users/DELL/Desktop/abc.txt", "r")
t = file.read()
file.close()
print(sent_tokenize(t))
RESULT:
Experiment-4 DATE:
AIM: EXERCISE TO FIND THE MINIMUM NUMBER OF EDITS REQUIRED TO CONVERT STR1 INTO STR2 USING PYTHON.
Introduction
The minimum edit distance is the lowest number of operations (insertions, deletions, and substitutions) needed to
transform one string into the other. It has many applications; in NLP, for example, it is used in spelling correction,
document similarity, and machine translation.
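For example, converting 'kitten' into 'sitting' takes a minimum of 3 edits: substitute 'k' with 's', substitute 'e' with 'i', and insert 'g'.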
Program 1:
def med(str1, str2, m, n):
    # Base cases: if one string is empty, the distance is the other's length
    if m == 0:
        return n
    if n == 0:
        return m
    # Last characters match: no edit needed for them
    if str1[m-1] == str2[n-1]:
        return med(str1, str2, m-1, n-1)
    # Otherwise, 1 edit plus the best of insert, delete, replace
    return 1 + min(med(str1, str2, m, n-1),
                   med(str1, str2, m-1, n),
                   med(str1, str2, m-1, n-1))

str1 = "GEEKSFORGEEKS"
str2 = "GEEXSFRGEEKKS"
print(med(str1, str2, len(str1), len(str2)))
Program 2:
def editDistance(str1, str2, m, n):
    if m == 0:
        return n
    if n == 0:
        return m
    if str1[m-1] == str2[n-1]:
        return editDistance(str1, str2, m-1, n-1)
    return 1 + min(editDistance(str1, str2, m, n-1),
                   editDistance(str1, str2, m-1, n),
                   editDistance(str1, str2, m-1, n-1))

str1 = "NLPPROGRAMM"
str2 = "DLPPROGRAMM"
print(editDistance(str1, str2, len(str1), len(str2)))
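The recursive version recomputes the same subproblems many times and runs in exponential time. A dynamic-programming table avoids this; the sketch below is illustrative (the function name edit_distance_dp is an assumption, not part of the original programs):

def edit_distance_dp(str1, str2):
    m, n = len(str1), len(str2)
    # dp[i][j] = minimum edits to convert str1[:i] into str2[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all i characters
    for j in range(n + 1):
        dp[0][j] = j  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if str1[i-1] == str2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = 1 + min(dp[i][j-1],    # insert
                                   dp[i-1][j],    # delete
                                   dp[i-1][j-1])  # replace
    return dp[m][n]

print(edit_distance_dp("GEEKSFORGEEKS", "GEEXSFRGEEKKS"))  # 3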
RESULT:
Experiment-5 DATE:
AIM: PRACTICE PART-OF-SPEECH TAGGING WITH STOP WORDS USING NLTK IN PYTHON.
INTRODUCTION
Stop words are a set of commonly used words in a language. Examples of stop words in English are “a,” “the,” “is,” “are,”
etc. Stop words are commonly used in Text Mining and Natural Language Processing (NLP) to eliminate words that are so
widely used that they carry very little useful information.
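For example, in the sentence "this is a sample sentence", the tokens "this", "is", and "a" are stop words.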
PROGRAM 1:
import nltk
from nltk.corpus import stopwords
# Requires the 'punkt', 'averaged_perceptron_tagger' and 'stopwords' NLTK data packages
text = "This is a sample sentence for practicing POS tagging."  # sample text assumed
words = nltk.word_tokenize(text)
stop_words = set(stopwords.words('english'))
pos_tags = nltk.pos_tag(words)
# Print tag:word for each word that is not a stop word
for word, tag in pos_tags:
    if word.lower() not in stop_words:
        print(f"{tag}:{word}")
PROGRAM 2:
import nltk
from nltk.corpus import stopwords
text = "This is a sample sentence for practicing POS tagging."  # sample text assumed
words = nltk.word_tokenize(text)
stop_words = set(stopwords.words('english'))
pos_tags = nltk.pos_tag(words)
# POS-tag only the stop words that occur in the text
stop = [word for word in words if word.lower() in stop_words]
stop = nltk.pos_tag(stop)
for word, tag in stop:
    print(f"{tag}:{word}")
print(stop_words)
RESULT:
Experiment-6 DATE:
The binning method is used to smooth data or to handle noisy data. In this method, the data is first sorted and the
sorted values are then distributed into a number of buckets or bins. Because binning methods consult the
neighbourhood of values, they perform local smoothing.
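For example, a sorted bin holding the values 4, 8, 9, 15 becomes 9, 9, 9, 9 under smoothing by bin means (the mean of the four values is 9) and 4, 4, 4, 15 under smoothing by bin boundaries (each value moves to the nearer boundary).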
Program
import numpy as np
from sklearn.datasets import load_iris

dataset = load_iris()
a = dataset.data
b = np.zeros(150)
# Take the second feature (sepal width) and sort it
for i in range(150):
    b[i] = a[i, 1]
b = np.sort(b)
# Create 30 bins of 5 values each
bin1 = np.zeros((30, 5))  # smoothing by bin means
bin2 = np.zeros((30, 5))  # smoothing by bin boundaries
bin3 = np.zeros((30, 5))  # smoothing by bin medians
# Bin mean
for i in range(0, 150, 5):
    k = int(i / 5)
    mean = (b[i] + b[i+1] + b[i+2] + b[i+3] + b[i+4]) / 5
    for j in range(5):
        bin1[k, j] = mean
# Bin boundaries: replace each value with the nearer bin boundary
for i in range(0, 150, 5):
    k = int(i / 5)
    for j in range(5):
        if (b[i+j] - b[i]) < (b[i+4] - b[i+j]):
            bin2[k, j] = b[i]
        else:
            bin2[k, j] = b[i+4]
# Bin median: the middle of 5 sorted values
for i in range(0, 150, 5):
    k = int(i / 5)
    for j in range(5):
        bin3[k, j] = b[i+2]
print("Bin means:\n", bin1)
print("Bin boundaries:\n", bin2)
print("Bin medians:\n", bin3)
RESULT:
Experiment-7 DATE:
Program
import nltk
# parse_tree is assumed to hold an nltk.Tree; a small example tree is used here
parse_tree = nltk.Tree.fromstring("(S (NP (DT The) (NN dog)) (VP (VBZ barks)))")
print(parse_tree)
parse_tree.pretty_print()
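A parse tree is usually produced by parsing a sentence against a grammar. A minimal sketch with NLTK's CFG and ChartParser (the grammar and sentence are assumptions, not the original program):

import nltk
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> DT NN
VP -> VBZ
DT -> 'The'
NN -> 'dog'
VBZ -> 'barks'
""")
parser = nltk.ChartParser(grammar)
# Enumerate every parse of the tokenized sentence
for parse_tree in parser.parse(['The', 'dog', 'barks']):
    parse_tree.pretty_print()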
RESULT:
Experiment-8 DATE:
Shallow parsing, also known as chunking, is a natural language processing (NLP) technique that aims to identify
and extract meaningful phrases, or chunks, from a sentence.
Unlike full parsing, which analyzes the complete grammatical structure of a sentence, shallow parsing focuses on
identifying individual phrases or constituents, such as noun phrases, verb phrases, and prepositional phrases.
Shallow parsing is an essential component of many NLP tasks, including information extraction, text classification,
and sentiment analysis.
Program
import nltk
from nltk import pos_tag
from nltk.tokenize import word_tokenize
from nltk.chunk import RegexpParser
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
def shallow_parse(sentence):
    tokens = word_tokenize(sentence)
    pos_tags = pos_tag(tokens)
    print("POS Tags:", pos_tags)
    chunk_grammar = r"""
    NP: {<DT>?<JJ>*<NN>}         # Noun Phrase
    VP: {<VB.*><NP|PP|CLAUSE>+$} # Verb Phrase
    PP: {<IN><NP>}               # Prepositional Phrase
    CLAUSE: {<NP><VP>}           # Clause
    """
    chunk_parser = RegexpParser(chunk_grammar)
    tree = chunk_parser.parse(pos_tags)
    print("\nShallow Parse Tree:")
    print(tree)
    tree.draw()

if __name__ == "__main__":
    sentence = "The quick brown fox jumps over the lazy dog."
    shallow_parse(sentence)
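With this grammar, determiner-adjective-noun sequences such as "the lazy dog" are grouped into NP chunks; the exact chunks depend on the tags the POS tagger assigns.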
RESULT:
Experiment-9 DATE:
The Fibonacci numbers are the numbers in the following integer sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...
In mathematical terms, the sequence F(n) of Fibonacci numbers is defined by the recurrence relation
F(n) = F(n-1) + F(n-2)
with seed values F(0) = 0 and F(1) = 1.
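For example, F(6) = F(5) + F(4) = 5 + 3 = 8.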
Program
def fibonacci_dp(n):
    # Edge cases for n = 0 and n = 1
    if n == 0:
        return 0
    elif n == 1:
        return 1
    # Bottom-up dynamic programming (completion assumed): keep only the last two values
    prev, curr = 0, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr

print(fibonacci_dp(10))  # 55
RESULT:
Experiment-10 DATE:
Syntax: TextBlob.correct()
Returns: the corrected sentence without spelling mistakes.
Program
from textblob import TextBlob

def correct_spelling_textblob(text):
    corrected_text = TextBlob(text).correct()  # correct() returns a TextBlob with spelling fixed
    return str(corrected_text)

text = "i havv a dreem that one day thea nation will rise up."
corrected_text = correct_spelling_textblob(text)
corrected_text1 = correct_spelling_textblob("helloo these is mee")
print(corrected_text)
print(corrected_text1)
RESULT:
Experiment-11 DATE:
Introduction
Chunk extraction, or partial parsing, is the process of extracting meaningful short phrases from a sentence tagged
with part-of-speech (POS) tags. Chunks are made up of words, and the kinds of words allowed are defined using
POS tags. One can even define a pattern of words that cannot be part of a chunk; such words are known as chinks.
Program
import nltk
from nltk import pos_tag, word_tokenize, RegexpParser
# Sample sentence
sentence = "The quick brown fox jumps over the lazy dog."
# Completion assumed: tag the tokens, then chunk noun phrases with a simple grammar
tagged = pos_tag(word_tokenize(sentence))
chunker = RegexpParser("NP: {<DT>?<JJ>*<NN>}")
print(chunker.parse(tagged))
RESULT:
Experiment-12 DATE:
INTRODUCTION
Chinking is the process of removing a sequence of tokens from a chunk; the removed sequence is called a chink.
Chink patterns are ordinary regular expressions over POS (Part-of-Speech) tags, designed to match the
sequences of tags that should be excluded from a chunk.
PROGRAM
import nltk
from nltk.chunk import RegexpParser
from nltk import pos_tag
from nltk.tokenize import word_tokenize
# Example sentence
sentence = "The quick brown fox jumps over the lazy dog."
# Completion assumed: chunk everything, then chink (remove) verbs and prepositions
grammar = r"""
Chunk: {<.*>+}
       }<VB.*|IN>{
"""
tagged = pos_tag(word_tokenize(sentence))
print(RegexpParser(grammar).parse(tagged))
RESULT:
Experiment-13 DATE:
INTRODUCTION
Lemmatization is the process of grouping together the different inflected forms of a word so they can
be analyzed as a single item. Lemmatization is similar to stemming, but it brings context to the
words: it links words with similar meanings to one word.
PROGRAM
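A minimal sketch using NLTK's WordNetLemmatizer (the sample words are illustrative assumptions):

import nltk
from nltk.stem import WordNetLemmatizer
nltk.download('wordnet')  # data required by the lemmatizer

lemmatizer = WordNetLemmatizer()
# The pos argument gives the lemmatizer context: 'v' = verb, 'a' = adjective
print(lemmatizer.lemmatize("running", pos="v"))  # run
print(lemmatizer.lemmatize("better", pos="a"))   # good
print(lemmatizer.lemmatize("rocks"))             # rock (default pos is noun)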
RESULT:
Experiment-14 DATE:
INTRODUCTION
Stemming is a method in text processing that eliminates prefixes and suffixes from words,
transforming them into their fundamental or root form. The main objective of stemming is to
streamline and standardize words, enhancing the effectiveness of natural language
processing tasks.
PROGRAM
from nltk.stem import PorterStemmer

porter_stemmer = PorterStemmer()
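A minimal usage sketch (the sample words are assumptions; the original listing ends at the constructor):

# Stem a few inflected forms down to their root
for word in ["running", "flies", "happily", "studies"]:
    print(word, "->", porter_stemmer.stem(word))
# running -> run, flies -> fli, happily -> happili, studies -> studi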
RESULT:
Experiment-15 DATE:
PROGRAM
import nltk
from nltk import FreqDist
from nltk.tokenize import word_tokenize
import matplotlib.pyplot as plt
# Sample text
text = "This is a sample text. This text is for testing the frequency distribution."
# Completion assumed: tokenize, build the frequency distribution, and plot it
tokens = word_tokenize(text.lower())
fdist = FreqDist(tokens)
print(fdist.most_common(5))  # the five most frequent tokens
fdist.plot(10)               # plot of the top 10 tokens
RESULT: