AIML Exp 9

The document summarizes an experiment conducted by a student to predict cereal ratings using a neural network model. The student imports a cereal dataset and splits it into training and test sets. They scale the data and fit a neural network to predict ratings based on calories, proteins, fat, sodium, and fiber. Cross-validation is performed by varying the training set length from 10 to 65 samples. Median RMSE is calculated for each length, showing the model accuracy increases with more training data.

Uploaded by

Kartik Guleria
Copyright: © All Rights Reserved

Experiment 9

Student Name: Shadman Ahmed
UID: 19BCS2289
Branch: CSE
Section/Group: 9 - A
Semester: 5
Date of Performance: 21/11/2021
Subject Name: AI ML Lab
Subject Code: CSP - 303

1. Aim/Overview of the practical:

Import the cereal dataset shared by Carnegie Mellon University (CMU). The details of the
dataset are at the following link: http://lib.stat.cmu.edu/DASL/Datafiles/Cereals.htm
The objective is to predict the rating of the cereals from variables such as calories, protein, fat,
etc. Train and test the model using a neural network.

2. Steps for experiment/practical/Code:

1. Creating index variable.


# Read the data
data = read.csv("C:\\Users\\Asus\\Desktop\\cereals.csv", header=T)

# Random sampling
samplesize = 0.60 * nrow(data)
set.seed(80)
index = sample(seq_len(nrow(data)), size = samplesize)

# Create training and test set
datatrain = data[index, ]
datatest = data[-index, ]
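
A quick sanity check on such a split can be sketched as follows. It uses a small synthetic data frame (`toy`, with made-up values, an assumption for illustration and not the cereal data) so that it runs without the CSV file. The `floor()` call is also an addition here; it only makes the integer sample size explicit.

```r
# Toy data frame standing in for the cereal data (hypothetical values)
set.seed(80)
toy = data.frame(x = 1:20, y = rnorm(20))

# Same 60/40 split as above; floor() makes the sample size an explicit integer
samplesize = floor(0.60 * nrow(toy))
index = sample(seq_len(nrow(toy)), size = samplesize)

toytrain = toy[index, ]
toytest = toy[-index, ]

# The two sets should be disjoint and together cover every row
stopifnot(nrow(toytrain) + nrow(toytest) == nrow(toy))
stopifnot(length(intersect(toytrain$x, toytest$x)) == 0)
```

Negative indexing (`toy[-index, ]`) keeps exactly the rows that were not sampled, which is why the two sets cannot overlap.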

2. Scale data for neural network:

max = apply(data, 2, max)
min = apply(data, 2, min)
scaled = as.data.frame(scale(data, center = min, scale = max - min))
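
The scaling above is min-max normalization: each column x is mapped to (x - min) / (max - min), so every column ends up in the range 0 to 1. A minimal sketch on a toy data frame (hypothetical values, not the cereal data) shows the effect. The names `lo` and `hi` are chosen here only to avoid masking the base functions `min()` and `max()`.

```r
# Toy numeric data frame (hypothetical values, not the cereal data)
toy = data.frame(a = c(2, 4, 6, 10), b = c(100, 150, 125, 200))

# Column-wise minimum and maximum
lo = apply(toy, 2, min)
hi = apply(toy, 2, max)

# scale() subtracts `center` and divides by `scale`, column by column
toy_scaled = as.data.frame(scale(toy, center = lo, scale = hi - lo))

# Each column of the scaled data now runs from 0 to 1
print(sapply(toy_scaled, range))

# Same result computed by hand for column a
manual_a = (toy$a - min(toy$a)) / (max(toy$a) - min(toy$a))
stopifnot(isTRUE(all.equal(toy_scaled$a, manual_a)))
```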
3. Fit neural network.

# install library
install.packages("neuralnet")

# load library
library(neuralnet)

# creating training and test set
trainNN = scaled[index, ]
testNN = scaled[-index, ]

# fit neural network
set.seed(2)
NN = neuralnet(rating ~ calories + protein + fat + sodium + fiber, trainNN, hidden = 3, linear.output = T)

# plot neural network
plot(NN)


4. Prediction using neural network

# columns 1 to 5 are assumed to hold calories, protein, fat, sodium and fiber
predict_testNN = compute(NN, testNN[, c(1:5)])

# convert the scaled predictions back to the original rating units
predict_testNN = (predict_testNN$net.result * (max(data$rating) - min(data$rating))) + min(data$rating)

plot(datatest$rating, predict_testNN, col='blue', pch=16, ylab = "predicted rating NN", xlab = "real rating")
abline(0,1)

# Calculate Root Mean Square Error (RMSE)
RMSE.NN = (sum((datatest$rating - predict_testNN)^2) / nrow(datatest)) ^ 0.5
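
The two calculations in this step, mapping scaled predictions back to rating units and computing the RMSE, can be sketched as small helper functions. The names `unscale` and `rmse` and all numbers below are hypothetical, introduced only for illustration. In this sketch the minimum and maximum are taken from the toy ratings themselves.

```r
# Map values from the 0 to 1 scale back to the original units
unscale = function(x_scaled, lo, hi) x_scaled * (hi - lo) + lo

# Root mean square error between observed and predicted values
rmse = function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Hypothetical ratings and scaled predictions, for illustration only
actual = c(30, 40, 50, 60)
lo = min(actual)
hi = max(actual)
pred_scaled = c(0.1, 0.3, 0.6, 1.0)

# Predictions in rating units: 33, 39, 48, 60
pred = unscale(pred_scaled, lo, hi)

# Errors are -3, 1, 2, 0, so the RMSE is sqrt(14 / 4) = sqrt(3.5)
print(rmse(actual, pred))
```

Using `sqrt(mean(...))` is equivalent to the `(sum(...) / nrow(...)) ^ 0.5` form used above.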
5. Cross validation of neural network model.

# install relevant libraries
install.packages("boot")
install.packages("plyr")

# Load libraries
library(boot)
library(plyr)

# Initialize variables
set.seed(50)
k = 100
RMSE.NN = NULL

List = list( )

# Fit neural network model within nested for loop
for(j in 10:65){
    for (i in 1:k) {
        index = sample(1:nrow(data),j)

        trainNN = scaled[index,]
        testNN = scaled[-index,]
        datatest = data[-index,]

        NN = neuralnet(rating ~ calories + protein + fat + sodium + fiber, trainNN, hidden = 3, linear.output= T)
        predict_testNN = compute(NN,testNN[,c(1:5)])
        predict_testNN = (predict_testNN$net.result*(max(data$rating)-min(data$rating)))+min(data$rating)

        RMSE.NN [i]<- (sum((datatest$rating - predict_testNN)^2)/nrow(datatest))^0.5
    }
    List[[j]] = RMSE.NN
}

Matrix.RMSE = do.call(cbind, List)
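
Because the list is filled only at positions 10 to 65, positions 1 to 9 stay NULL, and `cbind()` ignores NULL arguments. The resulting matrix therefore has 56 columns, and column c holds the results for training length c + 9. A minimal sketch with a shorter range and placeholder values (`toyList` and `toyMatrix` are hypothetical names) illustrates this:

```r
# A list filled only at positions 10..12, mimicking List[[j]] for j starting at 10
toyList = list()
for (j in 10:12) {
  toyList[[j]] = rep(j, 3)   # three placeholder values per entry
}

# Positions 1..9 exist but are NULL
stopifnot(length(toyList) == 12, is.null(toyList[[1]]))

# cbind() ignores NULL arguments, so only the three filled entries become columns
toyMatrix = do.call(cbind, toyList)
print(dim(toyMatrix))   # 3 rows, 3 columns

# Column c corresponds to j = c + 9
stopifnot(all(toyMatrix[, 1] == 10), all(toyMatrix[, 3] == 12))
```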

6. Prepare boxplot

# column 56 corresponds to a training set of length 65 (lengths 10 to 65 give 56 columns)
boxplot(Matrix.RMSE[,56], ylab = "RMSE", main = "RMSE BoxPlot (length of training set = 65)")
7. Variation of median RMSE

install.packages("matrixStats")
library(matrixStats)

med = colMedians(Matrix.RMSE)
X = seq(10,65)

plot (med~X, type = "l", xlab = "length of training set", ylab = "median RMSE", main = "Variation of RMSE with length of training set")
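
If installing matrixStats is not an option, the column medians can also be computed with base R. The sketch below is an alternative under that assumption, not the method used above. The matrix `toyRMSE` holds random placeholder values rather than real RMSE results, and `train_len` is a hypothetical name.

```r
# Toy RMSE matrix: 5 repetitions (rows) for 3 training-set lengths (columns)
# Values are random placeholders, not results from the experiment
set.seed(1)
toyRMSE = cbind(rnorm(5, mean = 9), rnorm(5, mean = 7), rnorm(5, mean = 6))

# Column-wise median using base R only
med_base = apply(toyRMSE, 2, median)

# One median per training-set length
train_len = 10:12
stopifnot(length(med_base) == length(train_len))
print(data.frame(train_len = train_len, median_RMSE = med_base))
```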
5. Observations/Discussions/ Complexity Analysis:

A neural network is inspired by the biological nervous system. As in the nervous system,
information is passed through layers of processors. The significance of each variable is represented
by the weights of its connections, and the backpropagation algorithm is used to assign these weights.
In this experiment we implement a neural network in R using a publicly available dataset shared by
CMU. The aim is to predict the rating of cereals from information such as calories, fat, and protein.
After constructing the neural network, we evaluate the model for accuracy and robustness: we compute
the RMSE and perform a cross-validation analysis. In the cross-validation, we check how model accuracy
varies as the length of the training set is changed. We consider training sets of length 10 to 65. For
each length, 100 random samples are drawn and the median RMSE is calculated. The model accuracy
increases as the training set grows. Before using the model for prediction, it is
important to check the robustness of its performance through cross-validation.
Evaluation Grid (To be created as per the SOP and Assessment guidelines by the faculty):

Sr. No. Parameters Marks Obtained Maximum Marks


1.
2.
3.
