Batch 2

Uploaded by

Sulaksha BK

CCS369 - TEXT AND SPEECH ANALYSIS

1. Write a regular expression to search for a digit inside a string

PROGRAM:

import re

def find_all_digits_in_string(input_string):
    # \d matches any single decimal digit
    pattern = r'\d'
    matches = re.findall(pattern, input_string)
    if matches:
        print(f'Digits found: {", ".join(matches)}')
    else:
        print('No digits found in the string.')

user_input = input('Enter a string: ')
find_all_digits_in_string(user_input)

OUTPUT:
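A possible variation on the program for question 1, offered only as a sketch: `re.finditer` can also report where each digit sits, and the pattern `\d+` groups consecutive digits into whole numbers. The function names `digit_positions` and `digit_runs` and the sample string are illustrative assumptions, not part of the original exercise.

```python
import re

def digit_positions(text):
    """Return (index, digit) pairs for every single digit in text."""
    return [(m.start(), m.group()) for m in re.finditer(r'\d', text)]

def digit_runs(text):
    """Return each run of consecutive digits as one string."""
    return re.findall(r'\d+', text)

sample = 'Room 42, floor 7'          # hypothetical sample input
print(digit_positions(sample))       # [(5, '4'), (6, '2'), (15, '7')]
print(digit_runs(sample))            # ['42', '7']
```

Using `\d+` instead of `\d` is the usual choice when the goal is to pull out numbers rather than individual digits.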

2. Write a Python program to find the occurrence and position of substrings within a string
PROGRAM:

def find_substring_occurrences(main_string, substring):
    # Collect the start index of every occurrence of substring
    occurrences = []
    start_position = 0
    while True:
        position = main_string.find(substring, start_position)
        if position == -1:
            break
        occurrences.append(position)
        # Advance by one character so overlapping matches are also found
        start_position = position + 1
    return occurrences

def main():
    main_string = input("Enter the main string: ")
    substring = input("Enter the substring to find: ")
    positions = find_substring_occurrences(main_string, substring)
    if positions:
        print(f"The substring '{substring}' occurs at positions: {positions}")
    else:
        print(f"The substring '{substring}' does not occur in the main string.")

if __name__ == "__main__":
    main()

OUTPUT:

Enter the main string: hellohellohello

Enter the substring to find: hello

The substring 'hello' occurs at positions: [0, 5, 10]
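Because the program above advances the search by one character after each hit, it also reports overlapping occurrences. The following sketch, under the assumption that a regex-based version is acceptable, contrasts overlapping and non-overlapping searches. The function names are hypothetical.

```python
import re

def overlapping_positions(main_string, substring):
    """Start indices of all matches, overlapping ones included."""
    if not substring:
        return []
    # A lookahead matches without consuming characters, so matches may overlap
    pattern = f'(?={re.escape(substring)})'
    return [m.start() for m in re.finditer(pattern, main_string)]

def non_overlapping_positions(main_string, substring):
    """Start indices of matches that do not share characters."""
    if not substring:
        return []
    return [m.start() for m in re.finditer(re.escape(substring), main_string)]

print(overlapping_positions('aaaa', 'aa'))                     # [0, 1, 2]
print(non_overlapping_positions('aaaa', 'aa'))                 # [0, 2]
print(overlapping_positions('hellohellohello', 'hello'))       # [0, 5, 10]
```

For the sample input `hellohellohello` both approaches agree, since no two occurrences of `hello` overlap.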

3. Write a Python program that takes a string with some words. For two consecutive words in the said string,
check whether the first word ends with a vowel and the next word begins with a vowel. If the program
meets the condition, return true, otherwise false. Only one space is allowed between the words.

PROGRAM:
import re

def check_vowel_condition(input_string):
    # Extract the words, ignoring punctuation
    word_pattern = r'\b\w+\b'
    words = re.findall(word_pattern, input_string)
    # Compare each word with the one that follows it
    for i in range(len(words) - 1):
        first_word = words[i]
        second_word = words[i + 1]
        if first_word[-1].lower() in 'aeiou' and second_word[0].lower() in 'aeiou':
            return True
    return False

user_input = input('Enter a string with some words: ')
result = check_vowel_condition(user_input)
print(result)

OUTPUT:
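The program for question 3 extracts words with `\w+`, so it does not itself enforce the "only one space" rule from the problem statement. A stricter variant, sketched here as one possible reading of that rule, matches a vowel, exactly one space, and another vowel in a single pattern. The function name `vowel_boundary` is an assumption for illustration.

```python
import re

def vowel_boundary(text):
    """True if a word ending in a vowel is followed, after exactly
    one space, by a word starting with a vowel."""
    return re.search(r'[aeiou] [aeiou]', text, flags=re.IGNORECASE) is not None

print(vowel_boundary('These exercises are easy'))  # True  ("These exercises")
print(vowel_boundary('This program works'))        # False
print(vowel_boundary('the  apple'))                # False (two spaces)
```

Unlike the word-list approach, this version returns False when the two words are separated by punctuation or by more than one space.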

4. Write a Python program to find the frequency of each word from a text file using NLTK

PROGRAM:

from pprint import pprint
import re
from collections import Counter

import nltk
from nltk.corpus import stopwords

# Download the stop word lists if they are not already installed
nltk.download('stopwords')

def most_frequent_non_stop_words(text, num_words=10):
    # Lowercase the text and strip punctuation before splitting into words
    clean_text = re.sub(r'[^\w\s]', '', text.lower())
    words = clean_text.split()
    stop_words = set(stopwords.words('english'))
    word_counts = Counter([word for word in words if word not in stop_words])
    most_common_words = word_counts.most_common(num_words)
    return most_common_words

# Sample text; to process a file instead, read its contents into `text`
text = """Natural language processing (NLP) is a subfield of artificial intelligence that focuses on the interaction
between computers and humans through natural language. The ultimate goal of NLP is to read, decipher, understand,
and make sense of human language in a way that is both valuable and meaningful. NLP techniques are used to analyze
text, sentiment, language structure, and more. These techniques involve various tasks such as text classification,
named entity recognition, machine translation, and text generation.
Stop words are common words that are often removed from text during NLP preprocessing. They typically
include words like "the", "and", "in", "is", "of", "on", etc."""
top_words = most_frequent_non_stop_words(text, num_words=10)
pprint(top_words)

OUTPUT:

[('language', 4),
('nlp', 4),
('text', 4),
('words', 3),
('natural', 2),
('techniques', 2),
('processing', 1),
('subfield', 1),
('artificial', 1),
('intelligence', 1)]

5. Write an NLTK program for text classification using the Naïve Bayes algorithm
PROGRAM:
# Install NLTK first from a shell: pip install nltk
import random

import nltk
from nltk.corpus import movie_reviews
from nltk.classify import NaiveBayesClassifier
from nltk.classify.util import accuracy

nltk.download('movie_reviews')

# Pair each review's word list with its category
documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]
random.shuffle(documents)

# Use the 2000 most frequent words in the corpus as features
all_words = nltk.FreqDist(w.lower() for w in movie_reviews.words())
word_features = [word for word, _ in all_words.most_common(2000)]

def document_features(document):
    # Map each feature word to True/False depending on its presence
    document_words = set(document)
    features = {}
    for word in word_features:
        features[word] = (word in document_words)
    return features

featuresets = [(document_features(d), c) for (d, c) in documents]
train_set, test_set = featuresets[:1500], featuresets[1500:]
classifier = NaiveBayesClassifier.train(train_set)
accuracy_score = accuracy(classifier, test_set)
print(f'Accuracy: {accuracy_score:.2%}')

OUTPUT:

6. Write a Python NLTK program to get a list of common stop words in various languages

PROGRAM:
import nltk
from nltk.corpus import stopwords

# Download the stopwords data once if not already downloaded
nltk.download('stopwords')

def get_stopwords(language):
    # Keep the list form so the words print in corpus order
    return stopwords.words(language)

def main():
    # Specify the languages for which you want to get stop words
    languages = ['english', 'spanish', 'french', 'german', 'italian']
    for language in languages:
        stop_words = get_stopwords(language)
        print(f"\nStop words in {language.capitalize()}:\n{', '.join(stop_words)}")

if __name__ == "__main__":
    main()

OUTPUT:

Stop words in English:

i, me, my, myself, we, our, ours, ourselves, you, you're, you've, ...

7. Write a Python program to generate Bigrams of words from a given list of strings

PROGRAM:
import nltk
from nltk import word_tokenize
from nltk.util import bigrams

# Tokenizer data; newer NLTK releases may require 'punkt_tab' instead
nltk.download('punkt')

def generate_bigrams(strings):
    # Tokenize each sentence, then pair every token with its successor
    tokenized_strings = [word_tokenize(sentence) for sentence in strings]
    bigram_list = [list(bigrams(sentence)) for sentence in tokenized_strings]
    return bigram_list

input_strings = ["This is a sample sentence.", "Another example sentence."]
result = generate_bigrams(input_strings)
for i, sentence_bigrams in enumerate(result, 1):
    print(f"Bigrams for sentence {i}: {sentence_bigrams}")

OUTPUT:
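For comparison with the NLTK program for question 7, bigrams can also be built with the standard library alone by zipping a token list against itself shifted by one. This is only a sketch: it assumes plain whitespace splitting, so punctuation stays attached to words, unlike `word_tokenize`. The name `simple_bigrams` is hypothetical.

```python
def simple_bigrams(sentence):
    """Return adjacent word pairs using whitespace tokenization."""
    tokens = sentence.split()
    # zip stops at the shorter sequence, giving len(tokens) - 1 pairs
    return list(zip(tokens, tokens[1:]))

print(simple_bigrams('This is a sample sentence.'))
# [('This', 'is'), ('is', 'a'), ('a', 'sample'), ('sample', 'sentence.')]
```

A one-word or empty sentence yields an empty list, since there is no adjacent pair to form.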

8. Write a Python NLTK program to get the overview of the tagset, details of a specific tag in the tagset and
details on several related tagsets, using regular expression.
PROGRAM:
import nltk

# Tagset documentation; newer NLTK releases may require 'tagsets_json' instead
nltk.download('tagsets')

# Note: the nltk.help functions print their results and return None

def get_tagset_overview():
    # With no argument, upenn_tagset() prints every Penn Treebank tag
    print("Overview of the Penn Treebank tagset:\n")
    nltk.help.upenn_tagset()

def get_tag_details(tag):
    print(f"Details for the {tag} tag:\n")
    nltk.help.upenn_tagset(tag)

def get_related_tags(tag_pattern):
    # The argument is a regular expression, so 'NN.*' matches NN, NNS, NNP, ...
    print(f"Tags matching the pattern {tag_pattern}:\n")
    nltk.help.upenn_tagset(tag_pattern)
    nltk.help.brown_tagset(tag_pattern)

def main():
    specific_tag = 'NN'
    # Get an overview of the tagset
    get_tagset_overview()
    # Get details for a specific tag
    get_tag_details(specific_tag)
    # Get details on several related tags using a regular expression
    get_related_tags(specific_tag + '.*')

if __name__ == "__main__":
    main()

OUTPUT:

9. Write a Python program to count the occurrences of each word in a given sentence

PROGRAM:
from collections import Counter
import string

def count_word_occurrences(sentence):
    # Lowercase the sentence and remove punctuation before counting
    sentence = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    words = sentence.split()
    word_counts = Counter(words)
    return word_counts

input_sentence = "This is a sample sentence. Another sentence is here."
result = count_word_occurrences(input_sentence)
for word, count in result.items():
    print(f"{word}: {count}")

OUTPUT:

10. Write a Python NLTK program to compare the similarity of two given nouns.
OUTPUT:

11. Write a function that finds the 50 most frequently occurring words of a text that are not stop words.

LAB EXP NO 5

12. Write a Python program to generate word vectors using Word2Vec

LAB EXP NO 6

13. Write a Python program to implement your own word2vec (skip-gram) model in Python
LAB EXP NO 6
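The source does not reproduce the program for this exercise. As a hedged starting point only, and not the referenced lab program, the sketch below shows the data-preparation step of skip-gram: turning a token list into (center word, context word) training pairs. The function name `skipgram_pairs`, the `window` parameter, and the sample sentence are assumptions for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and each
    neighbour within `window` positions on either side."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

tokens = 'the quick brown fox'.split()   # hypothetical toy corpus
for pair in skipgram_pairs(tokens, window=1):
    print(pair)
```

A full skip-gram model would then train embeddings to predict the context word from the center word in each pair; that training loop is outside the scope of this sketch.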

14. Write a program to find the odd word among given words using Word2Vec embeddings

15. Implement a PyTorch Transformer Model for Input – Output Classification

16. Create an intelligent chatbot in Python using the spaCy NLP library

17. Create a Visual Dialog System

18. Write a program to convert text to speech in Python using win32com.client

19. Write a Program to Convert PDF File Text to Audio Speech using Python
PROGRAM:
import PyPDF2
from gtts import gTTS

# Replace this with the actual path to your PDF file
pdf_path = '/content/CCS ASSIGNMENT 2.pdf'

# Read text from PDF
with open(pdf_path, 'rb') as file:
    pdf_reader = PyPDF2.PdfReader(file)
    text = ''
    for page in pdf_reader.pages:
        # Guard against pages with no extractable text
        text += page.extract_text() or ''

# Convert text to speech
tts = gTTS(text=text, lang='en', slow=False)
output_path = 'output.mp3'
tts.save(output_path)

# Print information
print(f"Text extracted from PDF:\n{text}")
print(f"Audio file saved at: {output_path}")

20. Design a speech recognition system and find the error rate

LAB EXP 20
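The source gives no program for this exercise. One common way to measure a recognizer's error rate is the word error rate (WER): the word-level edit distance between the reference transcript and the recognizer's output, divided by the number of reference words. The sketch below computes it with the standard library only. The function name and the sample transcripts are hypothetical, and no actual recognizer is involved.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError('reference must contain at least one word')
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

reference = 'the cat sat on the mat'     # hypothetical ground truth
hypothesis = 'the cat sit on mat'        # hypothetical recognizer output
print(f'WER: {word_error_rate(reference, hypothesis):.2%}')  # WER: 33.33%
```

Here one substitution (sat/sit) and one deletion (the) give 2 errors over 6 reference words. WER can exceed 100% when the hypothesis contains many inserted words.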
