CSIT366-Lab File

The document contains a summary of 10 experiments conducted using Python: 1. Create line, bar, and histogram charts using Matplotlib library. 2. Create an n*k matrix to represent a linear function mapping k-dimensional vectors to n-dimensional vectors. 3. Import a dataset from Kaggle and perform tasks like viewing columns and dimensions. 4. Perform text analysis operations like tokenization, frequency distribution, stopword removal using NLTK. 5. Generate random words using HTTP requests and from a text file. 6. Explore using morphing to transform images. 7. Generate n-grams from text using a specified number of words. 8

INDEX

NO  TOPIC                                                             DATE        SIGN

1   To create Line chart, Bar chart and Histogram                     05/07/2023
2   Create an n * k matrix to represent a linear function that maps   12/07/2023
    k-dimensional vectors to n-dimensional vectors
3   Import a dataset from Kaggle and perform different tasks on it    26/07/2023
4   Write a program to perform different text analysis operations     02/08/2023
    using NLTK
5   Write a program to generate a random word using: HTTP, text file  09/08/2023
6   Morphing
7   Generate N-grams                                                  06/09/2023
8   POS Tagging: Hidden Markov Model                                  04/10/2023
    POS Tagging: Viterbi Decoding
9   Finding K-nearest neighbor                                        11/10/2023
10  Sentiment Analysis                                                11/10/2023

EXPERIMENT 1
AIM: To create Line chart, Bar chart and Histogram
SOFTWARE USED: Python, matplotlib library
CODE:

Line Graph

A line chart represents data points connected by straight lines. It is useful for showing trends or
changes over time or continuous data. The x-axis typically represents the independent variable,
while the y-axis represents the dependent variable.

import matplotlib.pyplot as plt

x = [1,2,3,4,5]
y = [2,4,6,8,10]

plt.plot(x,y)

plt.xlabel('X-AXIS')
plt.ylabel('Y-AXIS')
plt.title('Line Chart')

plt.show()
Bar Chart

A bar chart displays categorical data using rectangular bars of varying lengths. Each bar
represents a category, and the length of the bar corresponds to the value or frequency of that
category. Bar charts are suitable for comparing values between different categories.

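x = ['Maths','English','Hindi','Science','GK']
# One mark per subject: the original list had a sixth value (76) with no matching
# subject, which makes plt.bar fail with a shape-mismatch error
y = [80, 90, 81, 45, 67]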

plt.bar(x,y)

plt.xlabel('Subjects')
plt.ylabel('Marks')
plt.title('Bar Chart')

plt.show()
Histogram

A histogram visualizes the distribution of numerical data. It consists of adjacent rectangular bins
representing intervals or ranges of values, while the height of each bin represents the frequency
or count of values falling within that range. Histograms help analyze the shape, spread, and
central tendency of the data.

data = [1,1,2,3,4,4,4,5,5,6,7,7,7,8,9]

plt.hist(data, bins=10)

plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram')

plt.show()
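
The bin counts that the histogram draws can also be inspected as numbers. The following is a minimal sketch, assuming NumPy is installed; it uses np.histogram, which applies the same equal-width binning (10 bins between the smallest and largest value) to the data list from the listing above.

```python
import numpy as np

data = [1,1,2,3,4,4,4,5,5,6,7,7,7,8,9]

# counts[i] is the number of values falling between edges[i] and edges[i + 1]
counts, edges = np.histogram(data, bins=10)

# Every bin is half-open [left, right) except the last one, which also includes its right edge
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {count}")

# Every value lands in exactly one bin
print("Total:", counts.sum())
```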
EXPERIMENT 2
AIM: Create an n*k matrix to represent a linear function that maps k-dimensional vectors to n-dimensional vectors.
SOFTWARE USED: Python, NumPy library
CODE:
import random
import numpy as np

def create_matrix(n,k):
    # Build an n*k matrix (n rows, k columns) with random entries in [0, 1)
    matrix=[]
    for i in range(n):
        row=[]
        for j in range(k):
            row.append(random.random())
        matrix.append(row)
    return matrix

if __name__ == "__main__":
    matrix = create_matrix(3,2)
    print(matrix)
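
The listing above only builds the matrix. A minimal sketch of how such a matrix acts as a linear function, assuming NumPy and a fixed random seed (the seed, the input vector and the scalars a and b are illustrative choices, not part of the original program): multiplying the n*k matrix by a k-dimensional vector gives an n-dimensional vector, and the mapping respects linear combinations.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility

n, k = 3, 2
A = rng.random((n, k))           # n*k matrix with entries in [0, 1)

x = np.array([1.0, 2.0])         # a k-dimensional input vector
y = A @ x                        # its n-dimensional image under the map

print("A shape:", A.shape, "x shape:", x.shape, "y shape:", y.shape)

# Linearity check: A(a*u + b*v) equals a*A(u) + b*A(v)
u, v = rng.random(k), rng.random(k)
a, b = 2.0, -3.0
is_linear = np.allclose(A @ (a * u + b * v), a * (A @ u) + b * (A @ v))
print("Linear:", is_linear)
```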
EXPERIMENT 3
AIM: Import a dataset from Kaggle and perform different tasks on it.
SOFTWARE USED: Python, Kaggle
CODE:
import numpy as np
import pandas as pd

df = pd.read_csv("./exp_3/data2.csv")
print(df["User ID"])
print(df.head())
print(df.shape)
df.info()  # info() prints its report itself and returns None, so it is not wrapped in print()
EXPERIMENT 4

AIM: Write a program to perform different text analysis operations using NLTK
SOFTWARE USED: Python, matplotlib, nltk library
CODE:

import nltk
from nltk.tokenize import sent_tokenize
from nltk.tokenize import word_tokenize
from nltk.probability import FreqDist
import matplotlib.pyplot as plt
from nltk.corpus import stopwords

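nltk.download("punkt")
# Assumption: some newer NLTK releases load the tokenizer data from "punkt_tab" instead;
# if sent_tokenize raises a LookupError, run nltk.download("punkt_tab") as well.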

text = """Hello Dear Students, how are you doing today? Today we will study natural language concepts,
and implement the same on Python platform. Everyone has to write the program and make a practical file
too"""

tokenized_sent = sent_tokenize(text)
print(tokenized_sent)

tokenized_word = word_tokenize(text)
print(tokenized_word)

fdist = FreqDist(tokenized_word)
print(fdist.most_common(2))

# Frequency Distribution Plot
fdist.plot(30, cumulative=False)
plt.show()

nltk.download("stopwords")

stop_words = set(stopwords.words("english"))

print(stop_words)

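# Keep only the tokens that are not stopwords; the NLTK stopword list is
# lowercase, so each token is lowercased before the lookup
filtered_tokens = []
for w in tokenized_word:
    if w.lower() not in stop_words:
        filtered_tokens.append(w)

print("Tokenized Words:", tokenized_word)
print("Filtered Tokens:", filtered_tokens)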
EXPERIMENT 5
AIM: Write a program to generate a random word using:
1) HTTP Request
2) A text file
SOFTWARE USED: Python
CODE:

1) HTTP Request

import random
import string

def generate_random_word(length):
    # Build a word from `length` randomly chosen lowercase letters
    letters = string.ascii_lowercase
    return "".join(random.choice(letters) for _ in range(length))

def main():
    try:
        word_length = int(input("Enter the desired word length: "))
        num_words = int(input("Enter the number of words you need: "))

        if word_length <= 0 or num_words <= 0:
            print("Word length and number of words should be positive integers")
            return

        generated_words = [generate_random_word(word_length) for _ in range(num_words)]

        print("\nGenerated words: ")
        for word in generated_words:
            print(word)

    except ValueError:
        print(
            "Please enter valid positive integers for word length and number of words."
        )

if __name__ == "__main__":
    main()
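
The listing above builds words from random letters and does not send an HTTP request. A minimal sketch of an HTTP-based variant follows, assuming a URL that serves a plain-text word list with one word per line. The function names fetch_words and random_word_from_url are hypothetical, and no particular word-list service is implied; only the standard library is used.

```python
import random
import urllib.request

def fetch_words(url, timeout=10):
    """Download a newline-separated word list and return its non-empty lines."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]

def random_word_from_url(url):
    """Pick one word at random from the list served at `url`."""
    return random.choice(fetch_words(url))
```

A call would look like random_word_from_url("https://example.com/words.txt"), where the address is a placeholder for whichever word list is actually available.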

2) A text file

import random

def get_list_of_words(path):
    with open(path, "r", encoding="utf-8") as f:
        return f.read().splitlines()

words = get_list_of_words("/content/wordGenerate.txt")

print(words)
random_word = random.choice(words)

print(random_word)
EXPERIMENT 6
AIM: To explore how to use morphing to transform one image into another.
SOFTWARE USED: Python
CODE:
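
No listing is recorded for this experiment. The following is a minimal sketch of the simplest form of morphing, a cross-dissolve, assuming NumPy and two images of identical shape held as arrays. The function names and the synthetic 4x4 images are illustrative assumptions; a full morph would also warp the geometry between matching feature points, which this sketch does not attempt.

```python
import numpy as np

def cross_dissolve(src, dst, t):
    """Blend two equal-shaped images: t=0 returns src, t=1 returns dst."""
    if src.shape != dst.shape:
        raise ValueError("images must have the same shape")
    blended = (1.0 - t) * src.astype(np.float64) + t * dst.astype(np.float64)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)

def morph_sequence(src, dst, steps):
    """Return `steps` frames that move from src to dst, end points included."""
    return [cross_dissolve(src, dst, t) for t in np.linspace(0.0, 1.0, steps)]

# Synthetic 4x4 grayscale images stand in for real photographs
black = np.zeros((4, 4), dtype=np.uint8)
grey = np.full((4, 4), 200, dtype=np.uint8)

frames = morph_sequence(black, grey, steps=5)
print([int(frame[0, 0]) for frame in frames])  # pixel value rises frame by frame
```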

EXPERIMENT 7
AIM: To generate n-grams from a given text, where an n-gram is a sequence of 'n' words.
SOFTWARE USED: Python
CODE:
def generate_ngrams(text, WordsToCombine):
    words = text.split()
    output = []
    for i in range(len(words) - WordsToCombine + 1):
        output.append(words[i:i + WordsToCombine])
    return output

# Example usage:
text = 'this is a very good class to teach and interact'
WordsToCombine = 3

ngrams = generate_ngrams(text, WordsToCombine)

print(ngrams)
EXPERIMENT 8
AIM: To understand and implement Part-of-Speech (POS) tagging using a Hidden Markov
Model and to implement Viterbi decoding for POS tagging.
SOFTWARE USED: Python, Jupyter
CODE:

HIDDEN MARKOV MODEL

# Import libraries
import nltk
from nltk.tag import hmm
from nltk.corpus import treebank
nltk.download('treebank')

# Train the HMM POS tagger
train_data = treebank.tagged_sents()[:3000] # Training data
tagger = hmm.HiddenMarkovModelTrainer().train(train_data)

# POS tagging
test_sentence = "This is a sample sentence for POS tagging."
tokens = nltk.word_tokenize(test_sentence)
pos_tags = tagger.tag(tokens)

# Display the POS tags
print(pos_tags)
VITERBI DECODING

# Import libraries
import nltk
from nltk.tag import hmm
from nltk.corpus import treebank

# Train the HMM POS tagger
train_data = treebank.tagged_sents()[:3000] # Training data
tagger = hmm.HiddenMarkovModelTrainer().train(train_data)

# Viterbi decoding for POS tagging: this wrapper does not re-implement the
# algorithm; the NLTK HMM tagger's tag() method returns the most probable
# tag sequence, which it finds with Viterbi decoding internally
def viterbi(sentence, tagger):
    tokens = nltk.word_tokenize(sentence)
    viterbi_path = tagger.tag(tokens)
    return viterbi_path

# Test the Viterbi POS tagging
test_sentence = "This is a test sentence for Viterbi decoding."
viterbi_tags = viterbi(test_sentence, tagger)
print(viterbi_tags)
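
To show what Viterbi decoding does step by step, here is a minimal from-scratch sketch on a toy HMM. The three tags, the vocabulary and every probability below are made-up values for illustration, not figures estimated from the treebank corpus, and the function viterbi_decode is a hypothetical helper rather than part of NLTK. It assumes a non-empty observation sequence and works in log space to avoid underflow.

```python
import math

def viterbi_decode(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for a non-empty observation list."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0)) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Trace back from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy model with made-up probabilities, for illustration only
states = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emit_p = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "walk": 0.2, "park": 0.3},
    "VERB": {"walk": 0.7, "barks": 0.3},
}

tags = viterbi_decode(["the", "dog", "barks"], states, start_p, trans_p, emit_p)
print(tags)
```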
EXPERIMENT 9
AIM: To implement K-nearest neighbor (KNN) classification for data points.
SOFTWARE USED: Python, Jupyter
CODE:

# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# Load the Iris dataset
data = load_iris()

# Split the data into features (X) and labels (y)
X = data.data
y = data.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the KNN classifier
k = 3 # Number of neighbors
knn_classifier = KNeighborsClassifier(n_neighbors=k)
knn_classifier.fit(X_train, y_train)

# Make predictions on the test set
y_pred = knn_classifier.predict(X_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Visualize the Iris dataset
plt.figure(figsize=(8, 6))
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Set1, edgecolor='k')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.title('Iris Dataset (Sepal Length vs. Sepal Width)')
plt.show()
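
The classifier above hides the neighbor search inside scikit-learn. A minimal sketch of the underlying idea follows, assuming NumPy, Euclidean distance and a simple majority vote. The function knn_predict and the six-point dataset are illustrative assumptions, not the Iris data and not scikit-learn's implementation.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify one point by majority vote among its k nearest training points."""
    distances = np.linalg.norm(X_train - x_query, axis=1)  # Euclidean distance to every training point
    nearest = np.argsort(distances)[:k]                    # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Tiny made-up 2-D dataset with two well-separated classes
X_demo = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_demo = np.array([0, 0, 0, 1, 1, 1])

label_a = knn_predict(X_demo, y_demo, np.array([1.1, 1.0]))
label_b = knn_predict(X_demo, y_demo, np.array([5.0, 5.2]))
print(label_a, label_b)
```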
EXPERIMENT 10
AIM: To perform sentiment analysis on text data using a pre-trained model.
SOFTWARE USED: Python, Jupyter
CODE:

# Import libraries
from transformers import pipeline

# Load a pre-trained sentiment analysis model
# (with no model name given, the library picks its default for this task)
sentiment_analyzer = pipeline("sentiment-analysis")

# Analyze sentiment of text
text = "I love this product! It's amazing."
sentiment = sentiment_analyzer(text)

# Display sentiment: a list with one dict per input, holding 'label' and 'score'
print(sentiment)
