
KALLAM HARANADHAREDDY

INSTITUTE OF TECHNOLOGY
(Approved by AICTE New Delhi & Affiliated to JNTUK, Kakinada)
NH- 5, Chowdavaram, Guntur-522 019
An ISO 9001:2015 Certified Institution, Accredited by NAAC & NBA

KHIT

PRACTICAL RECORD

Name:……………………………………………………………………..
Roll No:……………..…... Year & Semester:…………..………..

Branch:…………………. Section:…………………………........

Lab:……………………………………………………………………..…
KALLAM HARANADHAREDDY
INSTITUTE OF TECHNOLOGY
(APPROVED BY AICTE NEW DELHI, AFFILIATED TO
JNTUK, KAKINADA) CHOWDAVARAM, GUNTUR-19

Roll No:

CERTIFICATE

This is to certify that this is the bonafide record of the laboratory work done by

Mr/Ms…………………………………………………………………………………………..

of……..B.Tech/M.Tech/Diploma……...Semester in ………..Branch has completed…..…..


experiments in ……………………………………………………….………………………..

Laboratory during the Academic year 20 -20

Faculty-in-charge Head of the Department

Internal Examiner External Examiner


INDEX

Ex. No | Date | Name of the Experiment | Page From | Page To | Marks | Signature
CSE DEPARTMENT VISION, MISSION, GOALS

Vision
Imparting quality technical education to learners in the field of Computer Science and Engineering to produce technically competent software personnel with the advanced skills, knowledge and behavior needed to meet global real-time computational challenges.

Mission

To impart quality technical education through training in:

M1: Fundamentals of software and hardware in the field of computer science and engineering, to global standards.
M2: To educate students to become software professionals and lifelong learners through professional training and practice.
M3: To develop professional ethics in students to lead life with good human values.
M4: To be a state-of-the-art research centre in the field of computer science & engineering, promoting innovation and research.

PROGRAM SPECIFIC OUTCOMES (PSOs)

PSO-1: Acquire knowledge of, and implement, concepts of programming languages, software engineering, computer networks, databases and computer automation to solve computing problems.

PSO-2: Understand, analyze, design, develop and test computer programs for problems related to Algorithms, Internet of Things, Data Science, Cloud Computing, Artificial Intelligence and Machine Learning.

PSO-3: Apply theoretical and practical knowledge, using modern software tools and techniques, to build application software.
Course Outcomes:

At the end of the course, students will be able to:

• Apply machine learning approaches to a given problem.

• Analyze and identify the need for machine learning techniques in a particular domain.
• Develop real-time applications and predict their outcomes using machine learning algorithms.

CO-PO Mapping
(3/2/1 indicates strength of correlation: 3 = Strong, 2 = Medium, 1 = Weak)

COs | PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 | PSO1 PSO2 PSO3

CO1 1 1 1 1 1 1 2

CO2 1 1 2 1 1 1 1 1 1 1
Exp. No: Date:

1. Install the Python software (Anaconda/Python), install useful packages for machine learning, load a sample dataset, and understand and visualize the data.

A) Installation of Anaconda/Python and useful packages for machine learning (installation screenshots omitted).

B) Loading, understanding and visualizing the data set.

import pandas as pd
import matplotlib.pyplot as plt
# importing the dataset
dataset = pd.read_csv('D:/Salary_Data.csv')
print(dataset.head())

Output:
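The head() call above only previews the first five rows. A minimal visualization sketch, assuming the common Salary_Data.csv layout with YearsExperience and Salary columns (an assumption, since the file contents are not shown), could be:

# scatter plot of the raw data (column names are an assumption)
plt.scatter(dataset['YearsExperience'], dataset['Salary'], color='blue')
plt.xlabel('Years of experience')
plt.ylabel('Salary')
plt.title('Salary data preview')
plt.show()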

Exp. No: Date:

2. Implement simple linear regression.

import numpy as np

import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)

    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)

    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x

    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)

    # predicted response vector
    y_pred = b[0] + b[1]*x

    # plotting the regression line
    plt.plot(x, y_pred, color="g")

    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')

    # function to show plot
    plt.show()

def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))

    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

Output:
Estimated coefficients:
b_0 = -0.0586206896552
b_1 = 1.45747126437
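As an optional sanity check (an addition, not part of the manual's listing), NumPy's built-in least-squares fit should reproduce the same coefficients when run at the end of main():

# np.polyfit returns coefficients from highest degree down: [slope, intercept]
slope, intercept = np.polyfit(x, y, deg=1)
print("polyfit check: b_0 = {}, b_1 = {}".format(intercept, slope))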

And the graph obtained looks like this:

Exp. No: Date:

3. Implement multivariate linear regression.

import pandas

from sklearn import linear_model

a = {
    'slips' : [2,4,6,8],
    'open' : [2,4,6,8],
    'marks' : [20,40,60,80]
}

df = pandas.DataFrame(a)

X = df[['slips', 'open']]

y = df['marks']

regr = linear_model.LinearRegression()

regr.fit(X, y)

#predict the Marks for slips=5 and open=5

predictedMarks= regr.predict([[5,5]])

print(predictedMarks)

Output:

[50.]
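Because the two features are identical columns (slips equals open), the individual coefficients are not uniquely determined; only their combined effect is. To see how the model distributed the weight, inspect the fitted parameters (an optional addition):

# for this data the two coefficients must sum to 10 and the intercept is 0,
# since marks = 10 * slips exactly
print(regr.coef_)
print(regr.intercept_)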

Exp. No: Date:

4. Implement simple Logistic Regression and Multivariate Logistic Regression.

import pandas as pd

from sklearn.linear_model import LogisticRegression

from sklearn.model_selection import train_test_split

from sklearn import metrics

a = {
    'slips' : [2,4,6,8,10,15,18],
    # the original listing gave only six labels for seven slips values;
    # a seventh label is assumed here so the column lengths match
    'pass_or_fail' : [0,0,1,1,1,1,1]
}

DF = pd.DataFrame(a)

#split dataset in features and target variable

feature_cols = ['slips']

X = DF[feature_cols] # Features

y = DF.pass_or_fail # Target variable

# split X and y into training and testing sets

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.5,random_state=0)

#instantiate the model (using the default parameters)

logreg = LogisticRegression()

# fit the model with data
logreg.fit(X_train, y_train)

y_pred=logreg.predict(X_test)

#import the metrics class

cnf_matrix = metrics.confusion_matrix(y_test, y_pred)

print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

print(cnf_matrix)

Output:

runfile('C:/Users/student/.spyder-py3/temp.py', wdir='C:/Users/student/.spyder-py3')

Accuracy: 0.75

[[0 0]

[1 3]]
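Beyond hard 0/1 predictions, the fitted model also exposes class probabilities, which make results like the confusion matrix above easier to interpret (an optional addition; scikit-learn may warn about missing feature names on the second call):

# probability of class 0 (fail) and class 1 (pass) for each test example
print(logreg.predict_proba(X_test))

# probability for a new student who wrote 12 slip tests (12 is an arbitrary example value)
print(logreg.predict_proba([[12]]))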

Exp. No: Date:

5. Implement Decision Trees.

# Load libraries
import pandas as pd

from sklearn.tree import DecisionTreeClassifier

# Import Decision Tree Classifier

from sklearn.model_selection import train_test_split


# Import train_test_split function

from sklearn import metrics

#Import scikit-learn metrics module for accuracy calculation


a = {
    'easy' : [0,1,1,0,0,0,0,0],
    'slips' : [0,0,2,2,4,6,8,10],
    'result' : [0,1,1,0,1,1,1,1]
}

pima = pd.DataFrame(a)

print(pima.head())

#split dataset in features and target variable

feature_cols = ['easy', 'slips']

X = pima[feature_cols] # Features

y = pima.result # Target variable

# Split dataset into training set and test set

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.25,random_state=1)

# 75% training and 25% test

# Create Decision Tree classifier object

clf = DecisionTreeClassifier()

# Train Decision Tree Classifier


clf = clf.fit(X_train,y_train)

#Predict the response for test dataset

y_pred =clf.predict(X_test)

# Model Accuracy, how often is the classifier correct?

print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

OUTPUT:
runfile('C:/Users/student/.spyder-py3/untitled1.py', wdir='C:/Users/student/.spyder-py
easy slips result
0 0 0 0
1 1 0 1
2 1 2 1
3 0 2 0
4 0 4 1
Accuracy: 1.0
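To inspect the rules the tree actually learned (an optional addition), scikit-learn can print the fitted tree as text; export_text is available in scikit-learn 0.21 and later:

from sklearn.tree import export_text

# textual view of the fitted tree, using the same feature names as above
print(export_text(clf, feature_names=feature_cols))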

Exp. No: Date:

6. Implement any 3 Classification Algorithms.

import pandas as pd

import matplotlib.pyplot as plt

# importing the dataset

dataset = pd.read_csv('Salary_Data.csv')

print(dataset.head())

# data preprocessing

X = dataset.iloc[:, :-1].values # independent variable array

y = dataset.iloc[:, 1].values # dependent variable vector

# splitting the dataset

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)

# fitting the regression model

from sklearn.linear_model import LinearRegression

regressor = LinearRegression()

regressor.fit(X_train, y_train)

# actually produces the linear eqn for the data

# predicting the test set results

y_pred = regressor.predict(X_test)

print(y_pred)

print(y_test)

# visualizing the results

# plot for the TRAIN


plt.scatter(X_train, y_train, color='red')

# plotting the observation line

plt.plot(X_train, regressor.predict(X_train), color='blue')

# plotting the regression line

plt.title("Salary vs Experience (Training set)")

# stating the title of the graph

plt.xlabel("Years of experience")

# adding the name of x-axis

plt.ylabel("Salaries")

# adding the name of y-axis

plt.show()

# specifies end of graph

# plot for the TEST

plt.scatter(X_test, y_test, color='red')

plt.plot(X_train, regressor.predict(X_train), color='blue')

# plotting the regression line

plt.title("Salary vs Experience (Testing set)")

plt.xlabel("Years of experience")

plt.ylabel("Salaries")

plt.show()

Output:


import pandas

from sklearn import linear_model

a = {
    'slips' : [2,4,6,8],
    'open' : [2,4,6,8],
    'marks' : [20,40,60,80]
}

df = pandas.DataFrame(a)

X = df[['slips', 'open']]

y = df['marks']

regr = linear_model.LinearRegression()

regr.fit(X, y)

#predict the Marks for slips=5 and open=5

predictedMarks= regr.predict([[5,5]])

print(predictedMarks)

Output:

[50.]


import pandas as pd

from sklearn.linear_model import LogisticRegression

from sklearn.model_selection import train_test_split

from sklearn import metrics

a = {
    'slips' : [2,4,6,8,10,15,18],
    # the original listing gave only six labels for seven slips values;
    # a seventh label is assumed here so the column lengths match
    'pass_or_fail' : [0,0,1,1,1,1,1]
}

DF = pd.DataFrame(a)

#split dataset in features and target variable

feature_cols = ['slips']

X = DF[feature_cols] # Features

y = DF.pass_or_fail # Target variable

# split X and y into training and testing sets

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.5,random_state=0)

#instantiate the model (using the default parameters)

logreg = LogisticRegression()

# fit the model with data
logreg.fit(X_train, y_train)

y_pred=logreg.predict(X_test)

#import the metrics class

cnf_matrix = metrics.confusion_matrix(y_test, y_pred)

print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

print(cnf_matrix)

Output:

runfile('C:/Users/student/.spyder-py3/temp.py', wdir='C:/Users/student/.spyder-py3')

Accuracy: 0.75

[[0 0]

[1 3]]
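Note that the three listings above repeat the regression programs from experiments 2, 3 and 4 rather than classification algorithms. A minimal sketch of three actual classifiers (K-nearest neighbours, Gaussian Naive Bayes and a linear SVM) on the built-in iris dataset is given below as an illustrative assumption, not as the manual's prescribed solution:

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=1)

# train and score three classifiers on the same split
for clf in (KNeighborsClassifier(n_neighbors=3), GaussianNB(), SVC(kernel='linear')):
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(type(clf).__name__, "accuracy:", accuracy_score(y_test, y_pred))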
Exp. No: Date:

7. Implement Random Forests Algorithm.

#Import scikit-learn dataset library

from sklearn import datasets

import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.ensemble import RandomForestClassifier

from sklearn import metrics

#Load dataset

iris = datasets.load_iris()

# print the label species(setosa, versicolor,virginica)

print(iris.target_names)

# print the names of the four features

print(iris.feature_names)

# print the iris data (top 5 records)

print(iris.data[0:5])

# print the iris labels (0:setosa, 1:versicolor, 2:virginica)

print(iris.target)

# Creating a DataFrame of given iris dataset.

data=pd.DataFrame({

'sepal length':iris.data[:,0],

'sepal width':iris.data[:,1],

'petal length':iris.data[:,2],

'petal width':iris.data[:,3],


'species':iris.target

})

data.head()

X = data[['sepal length', 'sepal width', 'petal length', 'petal width']]  # Features
y = data['species']  # Labels

# Split dataset into training set and test set

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3) # 70% training and 30% test

#Create a Random Forest classifier

clf=RandomForestClassifier(n_estimators=100)

#Train the model using the training sets

clf.fit(X_train,y_train)

y_pred=clf.predict(X_test)

#Import scikit-learn metrics module for accuracy calculation

# Model Accuracy, how often is the classifier correct?

print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

Output:
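A common follow-up, added here as an optional sketch using the variables defined above, is to rank the features by the importance scores the fitted forest assigns them:

# feature_importances_ sums to 1; higher values mean the feature was more useful for splitting
feature_imp = pd.Series(clf.feature_importances_, index=X.columns).sort_values(ascending=False)
print(feature_imp)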

Exp. No: Date:

8. Implement K-Means, KNN algorithms.

A) K-Means clustering.

#Create artificial data set

from sklearn.datasets import make_blobs

raw_data = make_blobs(n_samples = 200, n_features = 2, centers = 4, cluster_std = 1.8)

#Data imports

import pandas as pd

import numpy as np

#Visualization imports

import matplotlib.pyplot as plt

#Visualize the data

plt.scatter(raw_data[0][:,0], raw_data[0][:,1])

plt.scatter(raw_data[0][:,0], raw_data[0][:,1], c=raw_data[1])

#Build and train the model
from sklearn.cluster import KMeans

model = KMeans(n_clusters=4)

model.fit(raw_data[0])

#See the predictions

print(model.labels_)

print(model.cluster_centers_)

#PLot the predictions against the original data set

f, (ax1, ax2) = plt.subplots(1, 2, sharey=True,figsize=(10,6))

ax1.set_title('Our Model')

ax1.scatter(raw_data[0][:,0],raw_data[0][:,1],c=model.labels_)


ax2.set_title('Original Data')

ax2.scatter(raw_data[0][:,0], raw_data[0][:,1],c=raw_data[1])

Output:
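Here n_clusters=4 was known because the blobs were generated with four centers. When the number of clusters is unknown, a common heuristic (shown as an optional sketch) is the elbow plot of inertia against k:

# inertia_ is the within-cluster sum of squared distances; look for the 'elbow' in the curve
inertias = []
for k in range(1, 11):
    km = KMeans(n_clusters=k)
    km.fit(raw_data[0])
    inertias.append(km.inertia_)

plt.plot(range(1, 11), inertias, marker='o')
plt.xlabel('number of clusters k')
plt.ylabel('inertia')
plt.show()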

B) KNN classification.

#Common imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#Import the data set
raw_data = pd.read_csv('classified_data.csv', index_col = 0)
print(raw_data.columns)

#Import standardization functions from scikit-learn


from sklearn.preprocessing import StandardScaler
#Standardize the data set
scaler = StandardScaler()
scaler.fit(raw_data.drop('TARGET CLASS', axis=1))
scaled_features = scaler.transform(raw_data.drop('TARGET CLASS', axis=1))
scaled_data = pd.DataFrame(scaled_features, columns = raw_data.drop('TARGET CLASS',
axis=1).columns)
#Split the data set into training data and test data
from sklearn.model_selection import train_test_split
x = scaled_data
y = raw_data['TARGET CLASS']
x_training_data, x_test_data, y_training_data, y_test_data = train_test_split(x, y, test_size = 0.3)
#Train the model and make predictions
from sklearn.neighbors import KNeighborsClassifier
model = KNeighborsClassifier(n_neighbors = 1)
model.fit(x_training_data, y_training_data)
predictions = model.predict(x_test_data)
#Performance measurement
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix

print(classification_report(y_test_data, predictions))
print(confusion_matrix(y_test_data, predictions))
#Selecting an optimal K value
error_rates = []
for i in np.arange(1, 101):
    new_model = KNeighborsClassifier(n_neighbors=i)
    new_model.fit(x_training_data, y_training_data)
    new_predictions = new_model.predict(x_test_data)
    error_rates.append(np.mean(new_predictions != y_test_data))
plt.figure(figsize=(16,12))
plt.plot(error_rates)

Output:
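The error-rate curve can be reduced to a single recommended K (an optional addition):

# index of the smallest error rate; +1 because the K values start at 1
best_k = int(np.argmin(error_rates)) + 1
print("lowest error rate", min(error_rates), "at K =", best_k)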

Exp. No: Date:

9. Implement SVM on any applicable datasets.

#Import scikit-learn dataset library

from sklearn import datasets

from sklearn.model_selection import train_test_split

from sklearn import svm

from sklearn import metrics

#Load dataset

cancer=datasets.load_breast_cancer()

# print the names of the 30 features

print("Features: ", cancer.feature_names)

# print the label type of cancer('malignant' 'benign')

print("Labels: ", cancer.target_names)

# print data(feature)shape

cancer.data.shape

# print the cancer data features (top 5 records)

print(cancer.data[0:5])

# print the cancer labels (0:malignant, 1:benign)

print(cancer.target)

# Split dataset into training set and test set

X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target, test_size=0.3, random_state=109)  # 70% training and 30% test

#Create a svm Classifier

clf = svm.SVC(kernel='linear')
# Linear Kernel


#Train the model using the training sets


clf.fit(X_train, y_train)

#Predict the response for test dataset


y_pred = clf.predict(X_test)
# Model Accuracy: how often is the classifier correct?
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))
# Model Precision: what percentage of positive tuples are labeled as such?
print("Precision:",metrics.precision_score(y_test, y_pred))
# Model Recall: what percentage of positive tuples are labelled as such?

print("Recall:",metrics.recall_score(y_test, y_pred))

Output:
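For a fuller per-class summary (an optional addition), scikit-learn's classification report combines precision, recall and F1-score in one table:

from sklearn.metrics import classification_report

# per-class precision/recall/F1 for the SVM predictions above
print(classification_report(y_test, y_pred, target_names=cancer.target_names))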

Exp. No: Date:

10. Implement Neural Networks.

import numpy as np

class NeuralNetwork():

    def __init__(self):
        # seeding for random number generation
        np.random.seed(1)

        # converting weights to a 3 by 1 matrix with values from -1 to 1 and mean of 0
        self.synaptic_weights = 2 * np.random.random((3, 1)) - 1

    def sigmoid(self, x):
        # applying the sigmoid function
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        # computing derivative to the Sigmoid function
        return x * (1 - x)

    def train(self, training_inputs, training_outputs, training_iterations):
        # training the model to make accurate predictions while adjusting weights continually
        for iteration in range(training_iterations):
            # siphon the training data via the neuron
            output = self.think(training_inputs)

            # computing error rate for back-propagation
            error = training_outputs - output

            # performing weight adjustments
            adjustments = np.dot(training_inputs.T, error * self.sigmoid_derivative(output))
            self.synaptic_weights += adjustments

    def think(self, inputs):
        # passing the inputs via the neuron to get output
        # converting values to floats
        inputs = inputs.astype(float)
        output = self.sigmoid(np.dot(inputs, self.synaptic_weights))
        return output

if __name__ == "__main__":

    # initializing the neuron class
    neural_network = NeuralNetwork()

    print("Beginning Randomly Generated Weights:")
    print(neural_network.synaptic_weights)

    # training data consisting of 4 examples--3 input values and 1 output
    training_inputs = np.array([[0, 0, 1],
                                [1, 1, 1],
                                [1, 0, 1],
                                [0, 1, 1]])

    training_outputs = np.array([[0, 1, 1, 0]]).T

    # training taking place
    neural_network.train(training_inputs, training_outputs, 15000)

    print("Ending Weights After Training: ")
    print(neural_network.synaptic_weights)

    user_input_one = str(input("User Input One: "))
    user_input_two = str(input("User Input Two: "))
    user_input_three = str(input("User Input Three: "))

    print("Considering New Situation: ", user_input_one, user_input_two, user_input_three)
    print("New Output data: ")
    print(neural_network.think(np.array([user_input_one, user_input_two, user_input_three])))

    print("Wow, we did it!")

Output:
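Instead of the interactive prompts, the trained neuron can be exercised directly. Because the training outputs equal the first input column, the prediction for [1, 0, 0] should come out close to 1 (an optional, non-interactive check):

# scripted prediction; expected value is near 1 after 15000 training iterations
print(neural_network.think(np.array([1, 0, 0])))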

Exp. No: Date:

11. Implement PCA.

# import all libraries

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

%matplotlib inline

from sklearn.decomposition import PCA

from sklearn.preprocessing import StandardScaler

#import the breast cancer dataset

from sklearn.datasets import load_breast_cancer

data=load_breast_cancer()

data.keys()

# Check the output classes

print(data['target_names'])

# Check the input attributes

print(data['feature_names'])

# construct a dataframe using pandas

df1=pd.DataFrame(data['data'],columns=data['feature_names'])

# Scale data before applying PCA

scaling=StandardScaler()

# Use fit and transform method

scaling.fit(df1)

Scaled_data=scaling.transform(df1)


# Set the n_components=3

principal=PCA(n_components=3)

principal.fit(Scaled_data)

x=principal.transform(Scaled_data)

# Check the dimensions of data after PCA

print(x.shape)

# Check the values of eigenvectors produced by the principal components
principal.components_

plt.figure(figsize=(10,10))

plt.scatter(x[:,0],x[:,1],c=data['target'],cmap='plasma')

plt.xlabel('pc1')

plt.ylabel('pc2')

# import relevant libraries for 3d graph

from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(10,10))

# choose projection 3d for creating a 3d graph

axis = fig.add_subplot(111, projection='3d')

# x[:,0]is pc1,x[:,1] is pc2 while x[:,2] is pc3

axis.scatter(x[:,0],x[:,1],x[:,2], c=data['target'],cmap='plasma')

axis.set_xlabel("PC1", fontsize=10)

axis.set_ylabel("PC2", fontsize=10)

axis.set_zlabel("PC3", fontsize=10)


# check how much variance is explained by each principal component
print(principal.explained_variance_ratio_)

Output:

(569, 3)
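To choose n_components in general (an optional addition), the cumulative explained variance is the usual guide:

# running total of the variance captured by the first 1, 2, 3, ... components
print(np.cumsum(principal.explained_variance_ratio_))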

Exp. No: Date:

12. Implement anomaly detection and recommendation.

# import pandas library

import pandas as pd

# Get the data

column_names = ['user_id', 'item_id', 'rating', 'timestamp']

path = 'https://media.geeksforgeeks.org/wp-content/uploads/file.tsv'

df = pd.read_csv(path, sep='\t', names=column_names)

# Check the head of the data

df.head()

# Check out all the movies and their respective IDs

movie_titles = pd.read_csv('https://media.geeksforgeeks.org/wp-content/uploads/Movie_Id_Titles.csv')

movie_titles.head()

data = pd.merge(df, movie_titles, on='item_id')

data.head()

# Calculate mean rating of all movies

data.groupby('title')['rating'].mean().sort_values(ascending=False).head()

# Calculate count rating of all movies

data.groupby('title')['rating'].count().sort_values(ascending=False).head()

# creating dataframe with 'rating' count values

ratings = pd.DataFrame(data.groupby('title')['rating'].mean())

ratings['num of ratings'] = pd.DataFrame(data.groupby('title')['rating'].count())

ratings.head()


import matplotlib.pyplot as plt

import seaborn as sns

sns.set_style('white')

%matplotlib inline

# plot graph of 'num of ratings column'

plt.figure(figsize =(10, 4))

ratings['num of ratings'].hist(bins = 70)

# plot graph of 'ratings' column

plt.figure(figsize =(10, 4))

ratings['rating'].hist(bins = 70)

moviemat = data.pivot_table(index='user_id', columns='title', values='rating')

moviemat.head()

# Sorting values according to the 'num of ratings' column
ratings.sort_values('num of ratings', ascending=False).head(10)

# analysing correlation with similar movies

starwars_user_ratings = moviemat['Star Wars (1977)']

liarliar_user_ratings = moviemat['Liar Liar (1997)']

starwars_user_ratings.head()

# analysing correlation with similar movies

similar_to_starwars = moviemat.corrwith(starwars_user_ratings)

similar_to_liarliar = moviemat.corrwith(liarliar_user_ratings)


corr_starwars=pd.DataFrame(similar_to_starwars,columns=['Correlation'])

corr_starwars.dropna(inplace = True)

corr_starwars.head()

Output:

Exp. No: Date:

Anomaly detection

# import pandas library

import pandas as pd

# Get the data

column_names = ['user_id', 'item_id', 'rating', 'timestamp']

path = 'https://media.geeksforgeeks.org/wp-content/uploads/file.tsv'

df = pd.read_csv(path, sep='\t', names=column_names)

# Check the head of the data

df.head()

# Check out all the movies and their respective IDs

movie_titles = pd.read_csv('https://media.geeksforgeeks.org/wp-content/uploads/Movie_Id_Titles.csv')

movie_titles.head()

data = pd.merge(df, movie_titles, on='item_id')

data.head()

# Calculate mean rating of all movies

data.groupby('title')['rating'].mean().sort_values(ascending=False).head()

# Calculate count rating of all movies

data.groupby('title')['rating'].count().sort_values(ascending=False).head()

# creating dataframe with 'rating' count values

ratings = pd.DataFrame(data.groupby('title')['rating'].mean())

ratings['num of ratings'] = pd.DataFrame(data.groupby('title')['rating'].count())

ratings.head()


import matplotlib.pyplot as plt

import seaborn as sns

sns.set_style('white')

%matplotlib inline

# plot graph of 'num of ratings column'

plt.figure(figsize =(10, 4))

ratings['num of ratings'].hist(bins = 70)

# plot graph of 'ratings' column
plt.figure(figsize=(10, 4))

ratings['rating'].hist(bins = 70)

Output:
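The block above prepares and plots the ratings data but stops before any detector is applied. A minimal anomaly-detection sketch using scikit-learn's IsolationForest (an assumed choice of detector, not the manual's prescription) flags movies whose mean rating and rating count are jointly unusual:

from sklearn.ensemble import IsolationForest

# treat each movie's (mean rating, number of ratings) pair as a point;
# contamination=0.01 assumes roughly 1% of movies are anomalous
iso = IsolationForest(contamination=0.01, random_state=0)
labels = iso.fit_predict(ratings[['rating', 'num of ratings']])

# fit_predict returns -1 for anomalies and 1 for normal points
print(ratings[labels == -1].head())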
