ML Lab Record

Uploaded by

VENKATVYAS

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

MACHINE LEARNING LAB

INTRODUCTION TO LAB:

Machine learning is used for everything from automating mundane tasks to offering intelligent insights, and industries in every sector try to benefit from it. You may already be using a device that relies on it, for example a wearable fitness tracker like Fitbit or an intelligent home assistant like Google Home. There are many more examples of ML in use:

• Prediction: Machine learning can be used in prediction systems. In a loan example, to compute the probability of a default, the system needs to classify the available data into groups.
• Image recognition: Machine learning can be used for face detection in an image. There is a separate category for each person in a database of several people.
• Speech recognition: This is the translation of spoken words into text. It is used in voice searches and more. Voice user interfaces include voice dialing, call routing, and appliance control. It can also be used for simple data entry and the preparation of structured documents.
• Medical diagnosis: ML models are trained to recognize cancerous tissues.
• Financial industry and trading: Companies use ML in fraud investigations and credit checks.

Types of Machine Learning

Machine learning algorithms can be classified into 3 types:

1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning

Overview of Supervised Learning Algorithm

In supervised learning, an AI system is presented with labeled data, which means that each data point is tagged with the correct label.

The goal is to approximate the mapping function so well that when you have new input data (x), you can predict the output variables (Y) for that data.
For example, suppose we initially take some emails and mark each one as ‘Spam’ or ‘Not Spam’. This labeled data is used to train the supervised model.

Once the model is trained, we can test it on some new emails and check whether it predicts the right output.
Types of Supervised learning

• Classification: A classification problem is when the output variable is a category, such as “red” or “blue”, or “disease” and “no disease”.
• Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”.
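
As a rough illustration of approximating the mapping from an input x to an output Y, the sketch below fits a straight line to a few labeled points by least squares and then predicts Y for a new x. The training values and the function name fit_line are made up for illustration and are not part of the lab record.

```python
# Minimal supervised-learning sketch: learn y ~ slope * x + intercept
# from labeled examples, then predict for an unseen input.
# The training pairs below are hypothetical.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Labeled training data: each input x comes with its correct output y
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]   # generated from y = 2x + 1

slope, intercept = fit_line(train_x, train_y)

# Predict the output for a new input the model has not seen
new_x = 5.0
predicted_y = slope * new_x + intercept
print(slope, intercept, predicted_y)
```

Because the output here is a real value, this is a regression problem. A classification problem would follow the same train-then-predict pattern with a category as the output.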

Overview of Unsupervised Learning Algorithm

In unsupervised learning, an AI system is presented with unlabeled, uncategorized data and the
system’s algorithms act on the data without prior training. The output is dependent upon the coded
algorithms. Subjecting a system to unsupervised learning is one way of testing AI.

Types of Unsupervised learning:

• Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
• Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people that buy X also tend to buy Y.
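
The "people that buy X also tend to buy Y" idea can be made concrete with two common measures, support and confidence. The sketch below is a minimal illustration. The shopping baskets and the helper names count, support and confidence are hypothetical.

```python
# Minimal association-rule sketch: how strongly does buying X imply buying Y?
# The transactions are hypothetical shopping baskets.

transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for basket in transactions if itemset <= basket)

def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset) / len(transactions)

def confidence(antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    return count(antecedent | consequent) / count(antecedent)

rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

In this toy data, three of the four baskets that contain bread also contain butter, so the rule "bread implies butter" has a confidence of 3/4.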

Overview of Reinforcement Learning

A reinforcement learning algorithm, or agent, learns by interacting with its environment. The agent
receives rewards by performing correctly and penalties for performing incorrectly. The agent learns
without intervention from a human by maximizing its reward and minimizing its penalty. It is a type of
dynamic programming that trains algorithms using a system of reward and punishment.
For example, suppose the agent is given two options: a path with water or a path with fire. A reinforcement algorithm works on a reward system: if the agent takes the fire path, reward points are subtracted, and the agent learns that it should avoid the fire path. If it had chosen the water path (the safe path), points would have been added to its reward, so the agent learns which path is safe and which is not.

By leveraging the rewards obtained, the agent improves its knowledge of the environment in order to select the next action.
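
The fire and water example can be sketched as a tiny value-learning loop. This is only an illustration under stated assumptions: the reward values (+1 and -1), the learning rate, the exploration rate and the function name train are all hypothetical choices, not taken from the lab record.

```python
import random

# Minimal reinforcement-learning sketch for the two-path example:
# the agent repeatedly picks a path, receives a reward or a penalty,
# and keeps a running value estimate for each path.
# Reward values, learning rate and exploration rate are assumptions.

REWARDS = {"fire": -1.0, "water": 1.0}   # penalty for fire, reward for water
ALPHA = 0.1      # learning rate
EPSILON = 0.2    # probability of exploring a random path

def train(episodes=200, seed=0):
    rng = random.Random(seed)
    values = {"fire": 0.0, "water": 0.0}   # estimated value of each path
    for _ in range(episodes):
        if rng.random() < EPSILON:
            action = rng.choice(list(values))        # explore
        else:
            action = max(values, key=values.get)     # exploit best estimate
        reward = REWARDS[action]
        # Move the estimate a small step toward the observed reward
        values[action] += ALPHA * (reward - values[action])
    return values

values = train()
best_path = max(values, key=values.get)
print(values, best_path)
```

The estimate for the water path can only rise toward +1 and the estimate for the fire path can only fall toward -1, so the agent ends up preferring the water path.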
PROGRAM 1: The probability that it is Friday and that a student is absent is 3%. Since there are 5 school days in a week, the probability that it is Friday is 20%. What is the probability that a student is absent given that today is Friday? Apply Bayes' rule in Python to get the result.

SOURCE CODE:

# Joint probability that it is Friday and the student is absent, e.g. 0.03
abonfri = float(input("Enter the probability that it is Friday and a student is absent: "))
# Probability that a given day is Friday, e.g. 0.20
prothatfri = float(input("Enter the probability that a given day is Friday: "))
# Bayes' rule: P(absent | Friday) = P(absent and Friday) / P(Friday)
absgivenfri = abonfri / prothatfri
print("The probability that the student is absent given that today is Friday is", absgivenfri)

O/P:

Note: Write the remaining two example programs.
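
As a quick check of the arithmetic, the Friday problem can be worked without user input. The sketch below plugs the two values from the problem statement into the definition of conditional probability. The function name conditional_probability and its range checks are illustrative additions.

```python
# Worked check of the Friday example using the definition of
# conditional probability: P(A | B) = P(A and B) / P(B).

def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) given P(A and B) and P(B)."""
    if not 0 < p_b <= 1:
        raise ValueError("P(B) must be in (0, 1]")
    if not 0 <= p_a_and_b <= p_b:
        raise ValueError("P(A and B) must be between 0 and P(B)")
    return p_a_and_b / p_b

p_absent_and_friday = 0.03   # from the problem statement
p_friday = 0.20              # 1 school day out of 5
p_absent_given_friday = conditional_probability(p_absent_and_friday, p_friday)
print(f"P(absent | Friday) = {p_absent_given_friday:.2%}")
```

With these inputs the result is 0.03 / 0.20 = 0.15, so a student is absent on 15% of Fridays.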


PROGRAM 2: EXTRACT THE DATA FROM DATABASE USING PYTHON

Method - I
SOURCE CODE:
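import mysql.connector

# Connect to the MySQL server (adjust the credentials and database to your setup)
my_database = mysql.connector.connect(host="localhost", user="root",
                                      password="root", database="mysql")
cursor = my_database.cursor()
# Parameterized insert; assumes the table player1 already exists
sql = "insert into player1(name, jersey_no, age, score) values (%s, %s, %s, %s)"
player2 = [('sachin', 10, 20, 50), ('kohile', 20, 30, 100), ('dhoni', 40, 35, 110)]
cursor.executemany(sql, player2)
my_database.commit()
# Read the rows back and print them
cursor.execute("select * from player1")
for row in cursor.fetchall():
    print(row)
my_database.close()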

output:
Method - II

'''Aim: Extract the data from database using python

=================================
Explanation:
=================================

===> First, you need to create a table (students) in a MySQL database (SampleDB).

---> Open a command prompt and execute the following command to enter the MySQL prompt.

--> mysql -u root -p

Then execute the following commands at the MySQL prompt to create the table in the database.

--> create database SampleDB;

--> use SampleDB;

--> CREATE TABLE students (sid VARCHAR(10),sname VARCHAR(10),age int);

--> INSERT INTO students VALUES('s521','Jhon Bob',23);
--> INSERT INTO students VALUES('s522','Dilly',22);
--> INSERT INTO students VALUES('s523','Kenney',25);
--> INSERT INTO students VALUES('s524','Herny',26);

===> Next, open a command prompt and execute the following command to install the MySQL connector package, which lets Python connect to the MySQL database.

--> pip install mysql-connector-python (Windows)

--> pip3 install mysql-connector-python (Linux)

===============================
Source Code :
===============================. '''

import mysql.connector
# Create the connection object
myconn = mysql.connector.connect(host="localhost", user="root", passwd="", database="SampleDB")
# Creating the cursor object
cur = myconn.cursor()
# Executing the query
cur.execute("select * from students")
# Fetching the rows from the cursor object
result = cur.fetchall()
print("Student Details are :")
# Printing the result
for x in result:
    print(x)
# Close the connection (a SELECT changes nothing, so no commit is needed)
myconn.close()
Output:
PROGRAM 3: IMPLEMENT K-NEAREST NEIGHBORS CLASSIFICATION USING PYTHON

SOURCE CODE:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris

n_neighbors = 15
# Loading data
iris = load_iris()
X = iris.data[:, :2]  # We only take the first two features
y = iris.target
h = .2  # step size in the mesh
# create color maps
cmap_light = ListedColormap(['#FFAAAA', '#AAFFAA', '#AAAAFF'])
cmap_bold = ListedColormap(['#FF0000', '#00FF00', '#0000FF'])
for weights in ['uniform', 'distance']:
    # we create an instance of the neighbors classifier and fit the data.
    clf = KNeighborsClassifier(n_neighbors=n_neighbors, weights=weights)
    clf.fit(X, y)
    # Plot the decision boundary. We will assign a color to each
    # point in the mesh [x_min, x_max] x [y_min, y_max].
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.figure()
    plt.pcolormesh(xx, yy, Z, cmap=cmap_light, shading='auto')
    # Plot the training points also
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold)
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.title("3-Class classification (k=%i,weights='%s')" % (n_neighbors, weights))
# Show both figures (one per weighting scheme)
plt.show()

Output:
SOURCE CODE:
# Import necessary modules
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# Loading data
irisData = load_iris()
# Create feature and target arrays
X = irisData.data
y = irisData.target
# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print(X_test)
print(y_test)
knn = KNeighborsClassifier(n_neighbors=7, weights='distance')
knn.fit(X_train, y_train)
# Predict on data which the model has not seen before
print(knn.predict(X_test))

output:
PROGRAM 4: Given the following data, which specify classifications for nine combinations of VAR1 and VAR2, predict a classification for a case where VAR1=0.906 and VAR2=0.606, using the result of k-means clustering with 3 means (i.e., 3 centroids).

SOURCE CODE:
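import pandas as pd
from matplotlib import pyplot as plt
from sklearn.cluster import KMeans

# Raw string so the backslashes in the Windows path are not treated as escapes
data = pd.read_csv(r"D:\ML LAB\Lab_Final\Lab_Final\kmeansdata.csv")
print(data)
# Cluster on both variables, not on VAR1 alone
features = data[['VAR1', 'VAR2']]
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(features)
print(clusters)
# Cluster assigned to the new case VAR1=0.906, VAR2=0.606; its classification
# is then the majority class among the training rows in that cluster
new_case = pd.DataFrame([[0.906, 0.606]], columns=['VAR1', 'VAR2'])
print(kmeans.predict(new_case))
plt.scatter(data['VAR1'], data['VAR2'], c=clusters)
# Centroids: column 0 is VAR1, column 1 is VAR2
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='red')
plt.show()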
CSV FILE:

OUTPUT:
PROGRAM 5: The Following Training Examples Map Descriptions Of Individuals Onto High, Medium And Low Credit-Worthiness.

medium skiing design single twenties no ->highRisk

high golf trading married forties yes ->lowRisk

low speedway transport married thirties yes ->medRisk

medium football banking single thirties yes ->lowRisk

high flying media married fifties yes ->highRisk

low football security single twenties no ->medRisk

medium golf media single thirties yes ->medRisk

medium golf transport married forties yes ->lowRisk

high skiing banking single thirties yes ->highRisk

low golf unemployed married forties yes ->highRisk

SOURCE CODE:

import pandas as pd

# Raw string so the backslashes in the Windows path are not treated as escapes
data = pd.read_csv(r"D:\ML LAB\Lab_Final\Lab_Final\Credit-Worthiness.csv")

def unc_prob(val, attr):
    # P(attr = val): fraction of rows whose attr column equals val
    val_count = 0
    for ele in data[attr]:
        if ele == val:
            val_count += 1
    return val_count / len(data[attr])

def cond_prob(val1, attr1, val2, attr2):
    # P(attr1 = val1 | attr2 = val2), counted row by row
    val_count1, data_count = 0, 0
    for ele1, ele2 in zip(data[attr1], data[attr2]):
        if ele2 == val2:
            data_count += 1
            if ele1 == val1:
                val_count1 += 1
    return val_count1 / data_count

inp_value, inp_attr = input("Enter the value name and attribute name for which you want to find the unconditional probability, with a space in between: ").split()
inp_value1, inp_attr1, inp_value2, inp_attr2 = input("Enter the value name and attribute name for which you want to find the conditional probability, then the given value name and attribute name: ").split()
ele_unc_prob = unc_prob(inp_value, inp_attr)
ele_cond_prob = cond_prob(inp_value1, inp_attr1, inp_value2, inp_attr2)
print(ele_unc_prob)
print(ele_cond_prob)
CSV FILE:

OUTPUT:
PROGRAM 6: IMPLEMENT LINEAR REGRESSION USING PYTHON.

SOURCE CODE:
# A program to illustrate linear regression
# Import Libraries
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model

# Load Dataset
data=pd.read_csv(r'D:\ML LAB\Lab_Final\Lab_Final\weight-height.csv')  # raw string keeps the backslashes literal
# Quick view about the data
#print(data)
#data.plot(kind='scatter',x='Height',y='Weight')
#data.plot(kind='box')
#plt.show()
#print(data.corr())
# Change to DataFrame variables
Height=pd.DataFrame(data['Height'])
Weight=pd.DataFrame(data['Weight'])
print(Weight)
print(Height)
# Build Linear Regression Model
lm=linear_model.LinearRegression()
model=lm.fit(Height,Weight)
print(model.coef_)
print(model.intercept_)
print(model.score(Height,Weight))#Evaluate the model
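# New heights to predict for; the column name must match the one used in fit
Height_new=pd.DataFrame({'Height':[65,60,68]})
Weight_new=model.predict(Height_new)
Weight_new=pd.DataFrame(Weight_new,columns=['Weight'])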
df=pd.concat([Height_new,Weight_new],axis=1,keys=['Height_new','Weight_new'])
print(df)
# Visualize the result
data.plot(kind='scatter',x='Height',y='Weight')
#Plotting the regression line
plt.plot(Height,model.predict(Height),color='red',linewidth=2)
# Plotting the predicted values
plt.scatter(Height_new,Weight_new,color='red')
plt.show()
CSV FILE:

OUTPUT:
PROGRAM 7: IMPLEMENT NAÏVE BAYES THEOREM TO CLASSIFY THE ENGLISH TEXT

SOURCE CODE:

import numpy as np, pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import confusion_matrix, accuracy_score

sns.set() # use seaborn plotting style
# Load the dataset
data = fetch_20newsgroups()
print(data)
# Get the text categories
text_categories = data.target_names
# define the training set
train_data = fetch_20newsgroups(subset="train", categories=text_categories)
# define the test set
test_data = fetch_20newsgroups(subset="test", categories=text_categories)
print("We have {} unique classes".format(len(text_categories)))
print("We have {} training samples".format(len(train_data.data)))
print("We have {} test samples".format(len(test_data.data)))
# let's have a look at one of the test documents
print(test_data.data[5])
# Build the model
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
# Train the model using the training data
model.fit(train_data.data, train_data.target)
# Predict the categories of the test data
predicted_categories = model.predict(test_data.data)
print(np.array(test_data.target_names)[predicted_categories])
# plot the confusion matrix (transposed: rows are predictions, columns are true labels)
mat = confusion_matrix(test_data.target, predicted_categories)
sns.heatmap(mat.T, square=True, annot=True, fmt="d",
            xticklabels=train_data.target_names, yticklabels=train_data.target_names)
plt.xlabel("true labels")
plt.ylabel("predicted label")
plt.show()
print("The accuracy is {}".format(accuracy_score(test_data.target, predicted_categories)))

OUTPUT:
PROGRAM 8: IMPLEMENT AN ALGORITHM TO DEMONSTRATE THE SIGNIFICANCE OF GENETIC ALGORITHM
SOURCE CODE:
import numpy

def cal_pop_fitness(equation_inputs, pop):
    # Fitness of each solution: sum of products between its genes and the inputs
    fitness = numpy.sum(pop*equation_inputs, axis=1)
    return fitness

def select_mating_pool(pop, fitness, num_parents):
    # Pick the num_parents fittest individuals as parents
    parents = numpy.empty((num_parents, pop.shape[1]))
    for parent_num in range(num_parents):
        max_fitness_idx = numpy.where(fitness == numpy.max(fitness))
        max_fitness_idx = max_fitness_idx[0][0]
        parents[parent_num, :] = pop[max_fitness_idx, :]
        # Mark as used so it is not selected again (this modifies fitness in place)
        fitness[max_fitness_idx] = -99999999999
    return parents

def crossover(parents, offspring_size):
    offspring = numpy.empty(offspring_size)
    # Single-point crossover at the middle of the chromosome
    crossover_point = numpy.uint8(offspring_size[1]/2)
    for k in range(offspring_size[0]):
        parent1_idx = k%parents.shape[0]
        parent2_idx = (k+1)%parents.shape[0]
        # First half of the genes from parent 1, second half from parent 2
        offspring[k, 0:crossover_point] = parents[parent1_idx, 0:crossover_point]
        offspring[k, crossover_point:] = parents[parent2_idx, crossover_point:]
    return offspring

def mutation(offspring_crossover):
    # Add a random value to the gene at index 4 of each offspring
    # (so each solution needs at least 5 genes)
    for idx in range(offspring_crossover.shape[0]):
        random_value = numpy.random.uniform(-1.0, 1.0)
        offspring_crossover[idx, 4] = offspring_crossover[idx, 4] + random_value
    return offspring_crossover
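
The listing above defines the four operators but does not include the loop that calls them. The sketch below shows one way a complete run could look. It is written in plain Python rather than NumPy so that it runs on its own. The input values, population size, generation count and function names are assumptions for illustration. Unlike the listing, it mutates a randomly chosen gene instead of always using index 4.

```python
import random

# Pure-Python sketch of a full genetic-algorithm run: maximize
# sum(w_i * x_i) for fixed inputs x by evolving the weights w.
# The inputs, population size and generation count are assumptions.

EQUATION_INPUTS = [4, -2, 3.5, 5, -11, -4.7]   # hypothetical x values
POP_SIZE, NUM_PARENTS, NUM_GENERATIONS = 8, 4, 20
rng = random.Random(0)

def fitness(individual):
    # Weighted sum of the inputs; higher is better
    return sum(w * x for w, x in zip(individual, EQUATION_INPUTS))

def select_parents(population):
    # Keep the NUM_PARENTS fittest individuals
    return sorted(population, key=fitness, reverse=True)[:NUM_PARENTS]

def crossover(parents, num_offspring):
    # Single-point crossover at the middle of the chromosome
    point = len(EQUATION_INPUTS) // 2
    children = []
    for k in range(num_offspring):
        first = parents[k % len(parents)]
        second = parents[(k + 1) % len(parents)]
        children.append(first[:point] + second[point:])
    return children

def mutate(children):
    # Add a small random change to one randomly chosen gene per child
    for child in children:
        gene = rng.randrange(len(child))
        child[gene] += rng.uniform(-1.0, 1.0)
    return children

population = [[rng.uniform(-4.0, 4.0) for _ in EQUATION_INPUTS]
              for _ in range(POP_SIZE)]
initial_best = max(fitness(ind) for ind in population)

for _ in range(NUM_GENERATIONS):
    parents = select_parents(population)
    children = mutate(crossover(parents, POP_SIZE - NUM_PARENTS))
    population = parents + children   # parents survive unchanged (elitism)

final_best = max(fitness(ind) for ind in population)
print(initial_best, final_best)
```

Because the parents are carried into the next generation unchanged, the best fitness in the population can never decrease from one generation to the next.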
PROGRAM 9: IMPLEMENT THE FINITE WORDS CLASSIFICATION SYSTEM USING BACK-PROPAGATION ALGORITHM

SOURCE CODE:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score

msg = pd.read_csv('D:/python/backprapogation1.csv', names=['message', 'label'])
print("Total Instances of Dataset: ", msg.shape[0])
msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})
X = msg.message
y = msg.labelnum
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y)
# Convert the messages to word-count vectors
count_v = CountVectorizer()
Xtrain_dm = count_v.fit_transform(Xtrain)
Xtest_dm = count_v.transform(Xtest)
df = pd.DataFrame(Xtrain_dm.toarray(), columns=count_v.get_feature_names_out())
print(df[0:5])
# Note: the classifier used in this listing is multinomial Naive Bayes
clf = MultinomialNB()
clf.fit(Xtrain_dm, ytrain)
pred = clf.predict(Xtest_dm)
# Pair each test message with its predicted label
for doc, p in zip(Xtest, pred):
    p = 'pos' if p == 1 else 'neg'
    print("%s -> %s" % (doc, p))
print('Accuracy Metrics: \n')
print('Accuracy: ', accuracy_score(ytest, pred))
print('Recall: ', recall_score(ytest, pred))
print('Precision: ', precision_score(ytest, pred))
print('Confusion Matrix: \n', confusion_matrix(ytest, pred))
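
The listing above classifies the messages with a Naive Bayes model. For comparison, the sketch below shows what a back-propagation classifier for the same kind of task might look like: a network with one hidden layer, trained by gradient descent on bag-of-words vectors. It is only an illustration. The six messages, the hidden layer size, the learning rate and the epoch count are all made up, and it uses plain Python instead of the CSV file and scikit-learn.

```python
import math
import random

# Minimal back-propagation sketch: a one-hidden-layer network that
# labels short messages as positive (1) or negative (0).
# The messages, layer size and learning rate are assumptions.

MESSAGES = [("good great fun", 1), ("great movie", 1), ("fun good", 1),
            ("bad boring", 0), ("awful bad", 0), ("boring awful movie", 0)]
VOCAB = sorted({w for text, _ in MESSAGES for w in text.split()})

def encode(text):
    # Bag-of-words vector: 1.0 if the vocabulary word occurs in the text
    words = set(text.split())
    return [1.0 if w in words else 0.0 for w in VOCAB]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)
HIDDEN, LR = 3, 0.5
w1 = [[rng.uniform(-0.5, 0.5) for _ in VOCAB] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
b2 = 0.0

def forward(x):
    # Forward pass: input -> hidden layer -> single output probability
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output

def mean_loss():
    # Average cross-entropy over the training messages
    total = 0.0
    for text, label in MESSAGES:
        _, out = forward(encode(text))
        total -= label * math.log(out) + (1 - label) * math.log(1 - out)
    return total / len(MESSAGES)

initial_loss = mean_loss()
for _ in range(500):
    for text, label in MESSAGES:
        x = encode(text)
        hidden, out = forward(x)
        # Backward pass: error at the output, then propagated to the hidden layer
        delta_out = out - label
        delta_hidden = [delta_out * w2[j] * hidden[j] * (1 - hidden[j])
                        for j in range(HIDDEN)]
        for j in range(HIDDEN):
            w2[j] -= LR * delta_out * hidden[j]
            b1[j] -= LR * delta_hidden[j]
            for i in range(len(x)):
                w1[j][i] -= LR * delta_hidden[j] * x[i]
        b2 -= LR * delta_out
final_loss = mean_loss()
predictions = [round(forward(encode(text))[1]) for text, _ in MESSAGES]
print(initial_loss, final_loss, predictions)
```

The printed loss should fall over training as the weights adapt. On real data the predictions would be checked on a held-out test set, as in the listing above, not on the training messages.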

CSV FILE:
OUTPUT:
