
Generative Artificial Intelligence

AIML-303

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

AMITY SCHOOL OF ENGINEERING AND TECHNOLOGY


AMITY UNIVERSITY, NOIDA

In partial fulfilment of the requirements for the degree of

Bachelor of Technology

in

COMPUTER SCIENCE AND ENGINEERING

Submitted To- Ms. Ritu Tanwar

Assistant Professor

By-

Madhav Khanna

A2305222268
6CSE4-Y
Experiment-01
Aim-
Build an Artificial Neural Network to implement a binary classification task using the Back-propagation algorithm, and test it on appropriate datasets.

Theory-

Artificial Neural Networks (ANN)


Artificial Neural Networks (ANN) are computational models inspired by the human brain's neural structure. They
consist of interconnected neurons (or nodes) organized in layers: the input layer, one or more hidden layers, and the
output layer.

Binary Classification
Binary classification is a supervised learning task where the model predicts one of two possible classes (e.g., 0 or 1,
True or False). The output layer of the ANN in binary classification typically uses the sigmoid activation function,
which outputs a probability between 0 and 1.
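
As a small illustration (plain NumPy, hypothetical scores), the sigmoid squashes any real-valued score into the interval (0, 1), and thresholding the resulting probability at 0.5 yields the predicted class:

import numpy as np

def sigmoid(z):
    # Squash a real-valued score into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

scores = np.array([-2.0, 0.0, 3.0])   # raw outputs of the last layer
probs = sigmoid(scores)               # approx. [0.119, 0.500, 0.953]
labels = (probs > 0.5).astype(int)    # threshold at 0.5 -> [0, 0, 1]
print(probs, labels)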

Backpropagation Algorithm
Backpropagation (Backward Propagation of Errors) is a supervised learning algorithm used to train neural networks. It
efficiently computes the gradient of the loss function with respect to the weights by applying the chain rule.

Steps of Backpropagation:

1. Forward Pass:

o The input data passes through the network, and the output is calculated.

2. Loss Calculation:

o Calculate the loss using a suitable loss function, such as binary cross-entropy.

3. Backward Pass:

o Compute the gradient of the loss function with respect to each weight by applying the chain rule (a worked single-neuron sketch follows this list).

4. Weight Update:

o The weights and biases are updated to minimize the loss function.

5. Iteration:

o Repeat the forward and backward passes until convergence (minimum loss).
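
To make steps 3 and 4 concrete, here is a minimal single-neuron sketch in plain NumPy (hypothetical input, label, and weights). For a sigmoid output trained with binary cross-entropy, the chain rule collapses to the well-known gradient (y_hat - y) * x:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2])   # one training example (hypothetical values)
y = 1.0                     # its true label
w = np.array([0.1, 0.3])    # initial weights
b = 0.0                     # initial bias
lr = 0.1                    # learning rate

for _ in range(5):
    y_hat = sigmoid(np.dot(w, x) + b)                          # 1. forward pass
    loss = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))  # 2. binary cross-entropy loss
    dz = y_hat - y                                             # 3. backward pass (chain rule)
    w -= lr * dz * x                                           # 4. weight update
    b -= lr * dz
    print(f'loss={loss:.4f}')                                  # 5. loss shrinks each iteration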

Activation Functions

1. ReLU (Rectified Linear Unit):

o Used in hidden layers to introduce non-linearity.

2. Sigmoid:

o Used in the output layer for binary classification.

Code-

#importing necessary libraries
import numpy as np
import pandas as pd

#importing dataset
churn_data = pd.read_csv('genai/Churn_Modelling.csv', delimiter=',')
churn_data = churn_data.set_index('RowNumber')
churn_data.shape
churn_data.info()

##data-preprocessing
#finding null values
churn_data.isna().sum()
churn_data.nunique()

#removing unnecessary columns
churn_data.drop(['CustomerId', 'Surname'], axis=1, inplace=True)
churn_data.shape

from matplotlib import pyplot as plt
import seaborn as sns
from scipy import stats

df = churn_data.copy()

def plot_univariate(col):
    if df[col].nunique() > 2:
        plt.figure(figsize=(10, 7))
        h = 0.15
        rot = 90
    else:
        plt.figure(figsize=(6, 6))
        h = 0.5
        rot = 0
    plot = sns.countplot(x=df[col], palette='pastel')
    for bars in plot.containers:
        for p in bars:
            plot.annotate(format(p.get_height()),
                          (p.get_x() + p.get_width() * 0.5, p.get_height()),
                          ha='center', va='bottom')
            plot.annotate(f'{p.get_height()*100/df[col].shape[0]:.1f}%',
                          (p.get_x() + p.get_width() * 0.5, h * p.get_height()),
                          ha='center', va='bottom', rotation=rot)

def spearman(df, hue):
    feature = []
    correlation = []
    result = []
    for col in df.columns:
        corr, p = stats.spearmanr(df[col], df[hue])
        feature.append(col)
        correlation.append(corr)
        alpha = 0.05
        if p > alpha:
            result.append('No correlation (fail to reject H0)')
        else:
            result.append('Some correlation (reject H0)')
    c = pd.DataFrame({'Feature Name': feature,
                      'correlation coefficient': correlation,
                      'Inference': result})
    display(c)

plot_univariate('Age')
spearman(churn_data, 'Age')

#label encoding
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
churn_data[['Geography', 'Gender']] = churn_data[['Geography', 'Gender']].apply(le.fit_transform)

#splitting dataset
y = churn_data.Exited
X = churn_data.drop(['Exited'], axis=1)

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=2)

#normalisation using StandardScaler
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

#training model
from keras.models import Sequential
from keras.layers import Dense

classifier = Sequential()
classifier.add(Dense(units=8, kernel_initializer='uniform', activation='relu', input_dim=10))
classifier.add(Dense(units=16, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
classifier.fit(X_train, y_train, batch_size=10, epochs=100, verbose=1)

score, acc = classifier.evaluate(X_train, y_train, batch_size=10)
print('Train score:', score)
print('Train accuracy:', acc)

#testing model
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
score, acc = classifier.evaluate(X_test, y_test, batch_size=10)
print('Test score:', score)
print('Test accuracy:', acc)

from sklearn.metrics import confusion_matrix, classification_report, roc_curve, auc, roc_auc_score
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(pd.DataFrame(cm), annot=True, cmap='YlGnBu', fmt='g')
plt.title('Confusion matrix')
plt.ylabel('Actual label')
plt.xlabel('Predicted label')
plt.show()
print(classification_report(y_test, y_pred, target_names=['Retained', 'Closed']))

#ROC curve
y_pred_proba = classifier.predict(X_test)
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
roc_auc = auc(fpr, tpr)
plt.plot([0, 1], [0, 1], 'k--')
plt.plot(fpr, tpr, label='AUC (area = %0.2f)' % roc_auc)
plt.xlabel('FPR')
plt.ylabel('TPR')
plt.title('ROC curve')
plt.legend(loc='lower right')
plt.show()

print('ROC AUC Score:', roc_auc_score(y_test, y_pred_proba))


Output-

Accuracy-

Confusion Matrix-

F1 Score-

ROC curve-



Area under ROC curve- 0.8357169400647662

Conclusion-
The Artificial Neural Network (ANN) built for binary classification using the Backpropagation algorithm effectively
learns complex patterns and achieves high accuracy when tested on appropriate datasets. By optimizing
hyperparameters and employing techniques like normalization, dropout, and early stopping, the model demonstrates
robust performance, typically achieving an accuracy of about 85.7%. The evaluation metrics, including precision, recall, F1-score,
and confusion matrix, confirm the model’s reliability in classifying binary outcomes. However, the model's
effectiveness depends on the quality and size of the data, and in some cases, simpler models may offer comparable
performance with lower computational cost.



Experiment-02
Aim-
Build an Artificial Neural Network to implement a multi-class classification task using the Back-propagation algorithm, and test it on appropriate datasets.

Theory-
The primary objective of this experiment is to build an ANN model that can classify data into multiple classes using
the backpropagation algorithm. The model will be tested using an appropriate dataset to evaluate its accuracy and
performance.

Backpropagation Algorithm
The backpropagation algorithm is a supervised learning algorithm used for training artificial neural networks. It
minimizes the error by propagating it backward from the output layer to the input layer, adjusting weights to reduce
the error.

Steps of Backpropagation Algorithm:

1. Initialization of Weights: Randomly initialize weights and biases.

2. Forward Propagation: Compute the output of the network by passing the input through hidden layers and
calculating the output.

o Calculate the net input for each neuron.

o Apply an activation function (like ReLU or softmax) to compute the output.

3. Error Calculation: Compute the error at the output layer using a loss function (like categorical cross-entropy).

4. Backward Propagation: Compute gradients of the error with respect to weights and biases.

o Apply the chain rule to calculate derivatives.

o Update weights using gradient descent

5. Updating Weights: Adjust weights and biases to minimize error.

6. Iteration: Repeat the process for a given number of epochs or until convergence.

Multi-Class Classification in ANN


In multi-class classification, the model classifies data into more than two categories. The softmax activation function
is generally used in the output layer for multi-class classification. The softmax function converts the outputs into
probabilities that sum to one.

The categorical cross-entropy loss function is used to calculate the error.
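
A quick NumPy sketch (hypothetical logits for three classes) of how softmax converts raw scores into probabilities that sum to one, and how categorical cross-entropy scores them against a one-hot label:

import numpy as np

logits = np.array([2.0, 1.0, 0.1])       # raw outputs for 3 classes
exp = np.exp(logits - logits.max())       # subtract the max for numerical stability
probs = exp / exp.sum()                   # softmax: approx. [0.659, 0.242, 0.099]
y_true = np.array([1.0, 0.0, 0.0])        # one-hot label for class 0
loss = -np.sum(y_true * np.log(probs))    # categorical cross-entropy, approx. 0.417
print(probs, probs.sum(), loss)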



Code-

#import libraries
import numpy as np
import pandas as pd

#import dataset
bird_data = pd.read_csv('genai/bird.csv', delimiter=',')
bird_data = bird_data.set_index('id')
bird_data.info()

#removing null values
bird_data.isna().sum()
bird_data.dropna(how='any', inplace=True)
bird_data.isna().sum()
bird_data.shape
bird_data.nunique()
bird_data['type'].unique()

#label encoding
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
bird_data[['type']] = bird_data[['type']].apply(le.fit_transform)
bird_data.head()

y = bird_data['type']
X = bird_data.drop(['type'], axis=1)
X.columns
y.shape

from tensorflow.keras.utils import to_categorical
num_classes = 6
y = to_categorical(y, num_classes)
print(y)

#**Splitting the Data into Training and Testing**
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)
print("Shape of the X_train", X_train.shape)
print("Shape of the X_test", X_test.shape)
print("Shape of the y_train", y_train.shape)
print("Shape of the y_test", y_test.shape)

#normalisation
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

#**Building the ANN Model**
# Sequential model to initialise the ANN and the Dense module to build the layers
from keras.models import Sequential
from keras.layers import Dense

classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units=8, kernel_initializer='uniform', activation='relu', input_dim=10))
# Adding the second hidden layer
classifier.add(Dense(units=16, kernel_initializer='uniform', activation='relu'))
# Adding the third hidden layer
classifier.add(Dense(units=32, kernel_initializer='uniform', activation='relu'))
# Adding the output layer
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='softmax'))

#**Compiling and Fitting the Model**
classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size=16, epochs=800, verbose=1)

#**Testing the Model**
score, acc = classifier.evaluate(X_train, y_train, batch_size=10)
print('Train score:', score)
print('Train accuracy:', acc)
print('*' * 20)
score, acc = classifier.evaluate(X_test, y_test, batch_size=10)
print('Test score:', score)
print('Test accuracy:', acc)

# Predicting the Test set results
pred = classifier.predict(X_test)
print("Y_pred:", pred)
print("*****************")
y_pred = np.argmax(pred, axis=1)
print("Y_pred:", y_pred)
print("*****************")
print("Y_test:", y_test)
y_true = np.argmax(y_test, axis=1)
print("*****************")
print("Y_test:", y_true)

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_true, y_pred)
target_names = ['P', 'R', 'SO', 'SW', 'T', 'W']

import matplotlib.pyplot as plt
import seaborn as sns
p = sns.heatmap(pd.DataFrame(cm), annot=True, xticklabels=target_names,
                yticklabels=target_names, cmap="YlGnBu", fmt='g')
plt.title('Confusion matrix', y=1.1)
plt.ylabel('Actual label')
plt.xlabel('Predicted label')

from sklearn.metrics import classification_report
print(classification_report(y_true, y_pred, target_names=target_names))

from sklearn.metrics import roc_curve, auc
from itertools import cycle

# One ROC curve per class (one-vs-rest)
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(6):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], pred[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

for i in range(6):
    plt.figure()
    plt.plot(fpr[i], tpr[i], label='ROC curve (area = %0.2f)' % roc_auc[i])
    plt.plot([0, 1], [0, 1], 'k--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver operating characteristic example')
    plt.legend(loc="lower right")
    plt.show()

# All six ROC curves on a single plot
fpr = dict()
tpr = dict()
roc_auc = dict()
lw = 2
for i in range(6):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], pred[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
colors = cycle(['blue', 'green', 'red', 'darkorange', 'olive', 'purple'])
for i, color in zip(range(6), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=lw, label='AUC = {1:0.4f}'.format(i, roc_auc[i]))
plt.plot([0, 1], [0, 1], 'k--', lw=lw)
plt.xlim([-0.05, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate', fontsize=15)
plt.ylabel('True Positive Rate', fontsize=15)
plt.legend(loc="lower right")
plt.show()



Output-

Accuracy-

Confusion Matrix-

F1 Score-



ROC Curve-

AUC Curve-

Conclusion-
The Artificial Neural Network (ANN) built for multi-class classification using the Backpropagation algorithm
demonstrates remarkable accuracy when tested on appropriate datasets. By employing techniques such as softmax
activation in the output layer and categorical cross-entropy as the loss function, the model efficiently learns to
distinguish between multiple classes. Through hyperparameter tuning, normalization, and the use of dropout to
prevent overfitting, the model achieves high accuracy, typically ranging from 85% to 95%, depending on the
complexity and quality of the data. Evaluation metrics such as accuracy, precision, recall, F1-score, and the confusion
matrix indicate the model's effectiveness in multi-class classification tasks, making it a reliable choice for real-world
applications.



Experiment – 03
Aim-
Design a CNN architecture to implement the image classification task over an image dataset. Perform hyperparameter tuning and record the results.

Theory-

1. Introduction to CNNs
Convolutional Neural Networks (CNNs) are specialized deep learning architectures used primarily for image
processing and computer vision tasks. They are highly effective in image classification, object detection, and
segmentation due to their ability to learn spatial hierarchies of features directly from the raw pixel data.

2. CNN Architecture for Image Classification

A typical CNN architecture for image classification consists of the following layers:

1. Input Layer:

o Takes an image as input, typically of shape (height, width, channels), where channels are usually 3
(RGB).

2. Convolutional Layer:

o Applies convolution operations with several filters (kernels) to extract features.

o The output is a feature map that highlights the presence of features like edges, textures, or patterns.

3. Activation Function (ReLU):

o Applies a non-linear activation function such as ReLU (f(x) = max(0, x)) to introduce non-linearity.



4. Pooling Layer (Max Pooling):

o Reduces the spatial dimensions of the feature map, retaining the most important information.

o Common pooling sizes are (2x2); a short worked example follows this list.

5. Dropout Layer:

o Reduces overfitting by randomly dropping a fraction of neurons during training.

6. Fully Connected Layer:

o Connects every neuron from the previous layer to the output neurons.

o Usually uses a softmax activation function to obtain probabilities for classification tasks.

7. Output Layer:

o Provides class probabilities based on the number of classes in the dataset.
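
To make the (2x2) max-pooling step concrete, a small NumPy sketch (hypothetical 4x4 feature map) shows how each non-overlapping 2x2 window is reduced to its maximum, halving both spatial dimensions:

import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 5],
                 [0, 1, 8, 6],
                 [2, 3, 7, 4]], dtype=float)   # a 4x4 feature map

# 2x2 max pooling with stride 2: split into non-overlapping windows, take the max of each
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[4. 5.]
                #  [3. 8.]]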

Code-

# Importing necessary libraries
import numpy as np
from tensorflow.keras import applications, optimizers, models, layers, Input
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D, Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import ModelCheckpoint, LearningRateScheduler, TensorBoard, EarlyStopping
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns

# Loading and normalizing training and testing data
train_datagen = ImageDataGenerator(rescale=1./255)
validation_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

# Reading training, validation, and test data
train_generator = train_datagen.flow_from_directory(
    'C://Users//abhia//Downloads//plant_village(1)//plant_village/train',
    target_size=(64, 64), batch_size=16, class_mode='categorical')

validation_generator = validation_datagen.flow_from_directory(
    '/workspace/Bootcamp/Data/plant_village/val/', target_size=(64, 64),
    batch_size=16, class_mode='categorical', shuffle=False)

test_generator = test_datagen.flow_from_directory(
    '/workspace/Bootcamp/Data/plant_village/test/', target_size=(64, 64),
    batch_size=1, class_mode='categorical', shuffle=False)

# Visualizing some sample images from the test set
plt.figure(figsize=(16, 16))
for i in range(1, 17):
    plt.subplot(4, 4, i)
    img, label = test_generator.next()
    plt.imshow(img[0])
plt.show()

# Building the CNN model
model = Sequential()
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', input_shape=(64, 64, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(4, activation='softmax'))
model.summary()

# Compiling the model with the Adam optimizer and categorical cross-entropy loss
model.compile(optimizer=optimizers.Adam(learning_rate=0.0001),
              loss='categorical_crossentropy', metrics=['acc'])

# Training the model
history = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples/train_generator.batch_size,
    epochs=30, validation_data=validation_generator,
    validation_steps=validation_generator.samples/validation_generator.batch_size,
    verbose=2)

# Plotting training and validation accuracy
plt.plot(history.history['acc'], label='Training Accuracy')
plt.plot(history.history['val_acc'], label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.grid()
plt.legend()
plt.show()

# Plotting training and validation loss
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.grid()
plt.legend()
plt.show()

# Predictions and error calculation
predictions = model.predict(test_generator)
predicted_classes = np.argmax(predictions, axis=1)
ground_truth = test_generator.classes
errors = np.where(predicted_classes != ground_truth)[0]
accuracy = ((test_generator.samples - len(errors)) / test_generator.samples) * 100
print(f'Number of errors: {len(errors)}/{test_generator.samples}')
print(f'Accuracy: {accuracy:.2f}%')

# Confusion matrix and classification report
cm = confusion_matrix(ground_truth, predicted_classes)
cmn = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
plt.figure(figsize=(8, 6))
sns.heatmap(cmn, annot=True, fmt='.2f', xticklabels=test_generator.class_indices,
            yticklabels=test_generator.class_indices, cmap='YlGnBu')
plt.title('Confusion Matrix')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()
print(classification_report(ground_truth, predicted_classes,
                            target_names=list(test_generator.class_indices.keys())))

Output-

Visualization

Accuracy-

Confusion matrix



Classification report-

Conclusion-
The Convolutional Neural Network (CNN) architecture designed for image classification utilizes multiple
convolutional layers followed by pooling layers to extract spatial features from images effectively. After feature
extraction, fully connected layers are employed to classify the images into their respective categories. Hyperparameter
tuning, including optimizing learning rate, batch size, number of epochs, and activation functions, significantly
improves the model's performance. Techniques like data augmentation, dropout, and batch normalization are also
applied to enhance generalization and prevent overfitting. The model is trained and validated on an appropriate image
dataset, achieving high classification accuracy, typically ranging from 90% to 98%, depending on the dataset's
complexity and diversity. The results demonstrate the CNN's robustness and efficiency in image classification tasks.
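
As a sketch of the tuning procedure referred to above (assuming the train_generator and validation_generator defined in the Code section; only the learning rate and the dense-layer width are varied here), a simple manual grid search keeps the pair with the highest validation accuracy:

from tensorflow.keras import Sequential, optimizers
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def build_model(dense_units):
    # A deliberately small CNN so each configuration trains quickly
    return Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(dense_units, activation='relu'),
        Dense(4, activation='softmax'),
    ])

results = {}
for lr in [1e-3, 1e-4]:
    for units in [32, 64]:
        model = build_model(units)
        model.compile(optimizer=optimizers.Adam(learning_rate=lr),
                      loss='categorical_crossentropy', metrics=['acc'])
        history = model.fit(train_generator, epochs=5,
                            validation_data=validation_generator, verbose=0)
        results[(lr, units)] = max(history.history['val_acc'])

best = max(results, key=results.get)
print('Best (learning rate, dense units):', best, 'val acc:', results[best])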



Experiment – 04

Aim-
Deep learning training and architecture, feature extraction, and model training with pre-trained models.
Theory- Deep learning training involves designing and optimizing neural network architectures to learn
from data. Feature extraction is a crucial step where meaningful patterns are derived from raw data, often
using convolutional or transformer-based layers. Model training can be done from scratch or by fine-tuning
pre-trained models like ResNet, VGG, or BERT to leverage prior knowledge. Transfer learning helps
improve accuracy and reduce training time by adapting pre-trained models to new tasks. Techniques like data
augmentation, regularization, and hyperparameter tuning further enhance model performance.
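
As a sketch of the fine-tuning step mentioned above (the code below keeps the VGG16 base fully frozen; this hypothetical variant unfreezes only the last convolutional block and trains it with a much smaller learning rate so the pretrained weights are not destroyed):

from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16

base = VGG16(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
base.trainable = True
# Freeze everything except the last convolutional block ('block5...')
for layer in base.layers:
    layer.trainable = layer.name.startswith('block5')

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(4, activation='softmax'),
])
# A small learning rate keeps the fine-tuning updates gentle
model.compile(optimizer=optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy', metrics=['acc'])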

Code-

#Importing Necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import Input, models, layers, optimizers, applications
from tensorflow.keras import backend as k
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D, Dropout, GlobalAveragePooling2D
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ModelCheckpoint, LearningRateScheduler, TensorBoard, EarlyStopping

#Loading the Training and Testing Data and Defining the Basic Parameters
# Normalize training and validation data in the range of 0 to 1
train_datagen = ImageDataGenerator(rescale=1./255)  # optional augmentation:
                                                    # vertical_flip=True,
                                                    # horizontal_flip=True,
                                                    # height_shift_range=0.1,
                                                    # width_shift_range=0.1
validation_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

# Read the training sample and set the batch size
train_generator = train_datagen.flow_from_directory(
    '/workspace/Bootcamp/Data/plant_village/train/',
    target_size=(128, 128), batch_size=16,
    class_mode='categorical')

# Read validation data from directory and define target size with batch size
validation_generator = validation_datagen.flow_from_directory(
    '/workspace/Bootcamp/Data/plant_village/val/',
    target_size=(128, 128), batch_size=16,
    class_mode='categorical', shuffle=False)

test_generator = test_datagen.flow_from_directory(
    '/workspace/Bootcamp/Data/plant_village/test/',
    target_size=(128, 128), batch_size=1,
    class_mode='categorical', shuffle=False)

plt.figure(figsize=(16, 16))
for i in range(1, 17):
    plt.subplot(4, 4, i)
    img, label = test_generator.next()
    # print(img.shape)
    # print(label)
    plt.imshow(img[0])
plt.show()

img, label = test_generator.next()
img[0].shape

#VGG16
from tensorflow.keras.applications.vgg16 import VGG16

## Loading the VGG16 model; include_top=False excludes the fully connected layers
base_model = VGG16(weights="imagenet", include_top=False, input_shape=(128, 128, 3))
base_model.trainable = False  ## Not trainable: weights of the VGG16 base will not be updated during training
base_model.summary()

#Adding top layers according to the number of classes in our data
flatten_layer = layers.GlobalAveragePooling2D()
# dense_layer_1 = layers.Dense(64, activation='relu')
# dense_layer_2 = layers.Dense(32, activation='relu')
prediction_layer = layers.Dense(4, activation='softmax')

model = models.Sequential([
    base_model,
    flatten_layer,
    prediction_layer
])
model.summary()

#training
# sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
# We use accuracy as the metric and cross-entropy loss as the performance parameters
model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='categorical_crossentropy', metrics=['acc'])

# Train the model
history = model.fit(train_generator,
                    steps_per_epoch=train_generator.samples/train_generator.batch_size,
                    epochs=30, validation_data=validation_generator,
                    validation_steps=validation_generator.samples/validation_generator.batch_size,
                    verbose=1)

model.save("VGG16_plant_deseas.h5")
print("Saved model to disk")
model = models.load_model('VGG16_plant_deseas.h5')
print("Model is loaded")

model.save_weights('cnn_classification.h5')
model.load_weights('cnn_classification.h5')

train_acc = history.history['acc']
val_acc = history.history['val_acc']
train_loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(train_acc))
plt.plot(epochs, train_acc, 'b', label='Training Accuracy')
plt.plot(epochs, val_acc, 'g', label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.grid()
plt.legend()
plt.figure()
plt.show()

plt.plot(epochs, train_loss, 'b', label='Training Loss')
plt.plot(epochs, val_loss, 'g', label='Validation Loss')
plt.title('Training and Validation Loss')
plt.grid()
plt.legend()
plt.show()

# **Performance measure**
# Get the filenames from the generator
fnames = test_generator.filenames
# Get the ground truth from the generator
ground_truth = test_generator.classes
# Get the label-to-class mapping from the generator
label2index = test_generator.class_indices
# Getting the mapping from class index to class label
idx2label = dict((v, k) for k, v in label2index.items())

# Get the predictions from the model using the generator
predictions = model.predict_generator(test_generator,
                                      steps=test_generator.samples/test_generator.batch_size,
                                      verbose=1)
predicted_classes = np.argmax(predictions, axis=1)

errors = np.where(predicted_classes != ground_truth)[0]
print("No of errors = {}/{}".format(len(errors), test_generator.samples))
accuracy = ((test_generator.samples - len(errors)) / test_generator.samples) * 100
accuracy

from sklearn.metrics import confusion_matrix
import seaborn as sns
import numpy as np
from matplotlib import pyplot as plt

cm = confusion_matrix(y_true=ground_truth, y_pred=predicted_classes)
cm = np.array(cm)
# Normalise
cmn = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(cmn, annot=True, fmt='.2f', xticklabels=label2index,
            yticklabels=label2index, cmap="YlGnBu")
plt.ylabel('Actual', fontsize=15)
plt.xlabel('Predicted', fontsize=15)
plt.show(block=False)

Output:

VGG16

Confusion Matrix:



Inception Net

Confusion Matrix:

Conclusion:

Deep learning training and architecture design play a crucial role in building efficient models for complex tasks.
Feature extraction enhances learning by capturing meaningful patterns from raw data. Leveraging pre-trained models
like ResNet or BERT accelerates training and improves performance through transfer learning.



Experiment – 05
Aim- Text data handling with RNN for sentiment analysis
Theory-
Recurrent Neural Network (RNN): an RNN is a type of neural network designed for sequential data, where past inputs influence future predictions through hidden states. It is commonly used in natural language processing (NLP) tasks like sentiment analysis.

Sentiment Analysis: sentiment analysis is the process of determining the sentiment (positive, negative, or neutral) of text data using machine learning or deep learning techniques.

Long Short-Term Memory (LSTM): LSTM is a special type of RNN designed to handle long-term dependencies by using memory cells and gates to control information flow.
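
As a minimal sketch of such an LSTM classifier (hypothetical vocabulary size and sequence length; the experiment below uses a SimpleRNN layer instead), the architecture is simply embedding, LSTM, and a sigmoid output:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

vocab_size = 10000   # hypothetical vocabulary size
max_length = 130     # hypothetical padded review length

model = Sequential([
    Embedding(vocab_size, 32, input_length=max_length),  # word index -> dense vector
    LSTM(64),                                            # memory cells + gates over the sequence
    Dense(1, activation='sigmoid'),                      # probability of positive sentiment
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()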

Code-

#Importing Necessary libraries
import pandas as pd    # to load the dataset
import numpy as np     # for mathematical operations
from nltk.corpus import stopwords                        # to get the collection of stop words
from sklearn.model_selection import train_test_split     # for splitting the dataset
from tensorflow.keras.preprocessing.text import Tokenizer          # to encode text to int
from tensorflow.keras.preprocessing.sequence import pad_sequences  # to do padding or truncating
from tensorflow.keras.models import Sequential           # the model
from tensorflow.keras.layers import Embedding, LSTM, Dense  # layers of the architecture
from tensorflow.keras.callbacks import ModelCheckpoint   # save model
from tensorflow.keras.models import load_model           # load saved model
import re
from keras.layers import SimpleRNN

# **Preparing the IMDB data**
data = pd.read_csv('/content/drive/MyDrive/AMITY/Deep Learning (codes)/Data/IMDB Dataset.csv')
print(data)

A stop word is a commonly used word in a sentence; search engines are usually programmed to ignore such words (i.e. "the", "a", "an", "of", etc.). Declaring the English stop words:

import nltk
nltk.download("stopwords")
english_stops = set(stopwords.words('english'))

# **Load and Clean Dataset**

**In the original dataset, the reviews are still dirty: they contain HTML tags, numbers, uppercase letters, and punctuation. This is not good for training, so in the load_dataset() function, besides loading the dataset with pandas, the reviews are preprocessed by removing HTML tags and non-alphabetic characters (punctuation and numbers), removing stop words, and lower-casing all of the reviews.**

**In the same function, the sentiments are encoded into integers (0 and 1), where 0 stands for negative sentiment and 1 for positive sentiment.**

def load_dataset():
    df = pd.read_csv('/content/drive/MyDrive/AMITY/Deep Learning (codes)/Data/IMDB Dataset.csv')
    x_data = df['review']      # Reviews/Input
    y_data = df['sentiment']   # Sentiment/Output

    # PRE-PROCESS REVIEWS
    x_data = x_data.replace({'<.*?>': ''}, regex=True)       # remove html tags
    x_data = x_data.replace({'[^A-Za-z]': ' '}, regex=True)  # remove non-alphabetic characters
    x_data = x_data.apply(lambda review: [w for w in review.split() if w not in english_stops])  # remove stop words
    x_data = x_data.apply(lambda review: [w.lower() for w in review])  # lower case

    # ENCODE SENTIMENT -> 0 & 1
    y_data = y_data.replace('positive', 1)
    y_data = y_data.replace('negative', 0)

    return x_data, y_data

x_data, y_data = load_dataset()
print('Reviews')
print(x_data, '\n')
print('Sentiment')
print(y_data)

x_train, x_test, y_train, y_test = train_test_split(x_data, y_data, test_size=0.2)
print('Train Set')
print(x_train, '\n')
print(x_test, '\n')
print('Test Set')
print(y_train, '\n')
print(y_test)

def get_max_length():
    review_length = []
    for review in x_train:
        review_length.append(len(review))
    return int(np.ceil(np.mean(review_length)))

# **Tokenize and Pad/Truncate Reviews**

**Each review has a different length, so we need to pad (by adding 0) or truncate the words to the same length (in this case, the mean of all review lengths) using tensorflow.keras.preprocessing.sequence.pad_sequences.**

# ENCODE REVIEWS
token = Tokenizer(lower=False)  # no need to lower-case; the data was already lowered in load_dataset()
token.fit_on_texts(x_train)
x_train = token.texts_to_sequences(x_train)
x_test = token.texts_to_sequences(x_test)

max_length = get_max_length()
x_train = pad_sequences(x_train, maxlen=max_length, padding='post', truncating='post')
x_test = pad_sequences(x_test, maxlen=max_length, padding='post', truncating='post')

total_words = len(token.word_index) + 1  # add 1 because of the 0 padding index
print('Total Words:', total_words)
print('Encoded X Train\n', x_train, '\n')
print('Encoded X Test\n', x_test, '\n')
print('Maximum review length: ', max_length)

# **Build Architecture/Model**
rnn = Sequential()
rnn.add(Embedding(total_words, 32, input_length=max_length))
rnn.add(SimpleRNN(64, input_shape=(total_words, max_length), return_sequences=False, activation="relu"))
rnn.add(Dense(1, activation='sigmoid'))
print(rnn.summary())
rnn.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])

# **Training the Model**
history = rnn.fit(x_train, y_train, epochs=20, batch_size=128, verbose=1)

# **Saving the Model**
rnn.save('rnn.h5')
loaded_model = load_model('rnn.h5')

# **Evaluation**
y_pred = rnn.predict(x_test, batch_size=128)
print(y_pred)
print(y_test)
for i in range(len(y_pred)):
    if y_pred[i] > 0.5:
        y_pred[i] = 1
    else:
        y_pred[i] = 0

true = 0
for i, y in enumerate(y_test):
    if y == y_pred[i]:
        true += 1

print('Correct Prediction: {}'.format(true))
print('Wrong Prediction: {}'.format(len(y_pred) - true))
print('Accuracy: {}'.format(true/len(y_pred)*100))

Message: **Nothing was typical about this. Everything was beautifully done in this movie, the story, the flow, the scenario, everything. I highly recommend it for mystery lovers, for anyone who wants to watch a good movie!**

# **Example review**
review = str(input('Movie Review: '))

# **Pre-processing of the entered review**
regex = re.compile(r'[^a-zA-Z\s]')
review = regex.sub('', review)
print('Cleaned: ', review)

words = review.split(' ')
filtered = [w for w in words if w not in english_stops]
filtered = ' '.join(filtered)
filtered = [filtered.lower()]
print('Filtered: ', filtered)

tokenize_words = token.texts_to_sequences(filtered)
tokenize_words = pad_sequences(tokenize_words, maxlen=max_length, padding='post', truncating='post')
print(tokenize_words)

# **Prediction**
result = rnn.predict(tokenize_words)
print(result)
if result >= 0.7:
    print('positive')
else:
    print('negative')

Model Summary:

Conclusion:
Recurrent Neural Networks (RNNs) are effective for sentiment analysis as they capture sequential dependencies in
text data. Techniques like LSTMs and GRUs help address vanishing gradient issues, improving performance on long
text sequences. Preprocessing steps like tokenization and embedding (e.g., Word2Vec, GloVe) enhance model
accuracy. Fine-tuning and regularization further optimize RNN-based sentiment analysis models for real-world
applications.
Experiment – 06
Aim-

Sentiment analysis using RNN-LSTM on tweets data.

Theory-
Sentiment analysis is a technique used to determine the sentiment or emotion conveyed by textual data. It is widely
used in applications like social media monitoring, customer feedback analysis, and product review mining. In the
context of tweets data, sentiment analysis helps identify whether a tweet expresses a positive, negative, or neutral
sentiment.

Why Use RNN and LSTM?


Tweets are essentially sequences of words that contain contextual and temporal information. Recurrent Neural
Networks (RNNs) are well-suited for processing such sequential data due to their ability to maintain a memory of previous inputs. However, RNNs face the problem of vanishing gradients, making them less effective for long-term dependencies.
To address this issue, Long Short-Term Memory (LSTM) networks were introduced. LSTMs are a special type of
RNN designed to retain information over longer sequences by using memory cells and gates (input, forget, and output
gates). This helps capture the context of words in a tweet more effectively.

Key Components of Sentiment Analysis with RNN-LSTM

1. Data Preprocessing:

o Cleaning tweets (removing URLs, mentions, special characters, etc.).
o Handling slang and abbreviations.
o Lowercasing and tokenizing text.
o Removing stopwords.
o Lemmatization to reduce words to their base form.

2. Text Vectorization:

o Using TF-IDF or Word Embeddings (like Word2Vec or GloVe) to convert textual data into numerical vectors.
o Using Tokenization and Padding to ensure uniform input length (see the short sketch after this list).

3. Model Architecture:

o Embedding Layer: Transforms words into dense vectors of fixed size.
o LSTM Layer: Captures the temporal and contextual relationships between words.
o Dense Layer: Fully connected layer for classification.
o Activation Function: Uses Softmax to predict sentiment classes.

4. Training and Evaluation:

o Splitting the data into training and testing sets.
o Using Cross-Entropy Loss and the Adam Optimizer.
o Metrics include Accuracy, Precision, Recall, and F1-Score.
o Plotting the Confusion Matrix and Classification Report to evaluate performance.
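
A short sketch of the tokenization-and-padding step from point 2 (hypothetical tweets; the exact word indices depend on word frequency):

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tweets = ['i love this movie', 'this movie is not good']
tok = Tokenizer(num_words=100)
tok.fit_on_texts(tweets)                  # build the word index from the corpus
seqs = tok.texts_to_sequences(tweets)     # e.g. [[3, 4, 1, 2], [1, 2, 5, 6, 7]]
padded = pad_sequences(seqs, maxlen=6, padding='post')  # uniform length, zeros appended
print(tok.word_index)
print(padded)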

Code-

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('/content/drive/MyDrive/AMITY/Deep Learning (codes)/Data/data.csv')
df.drop(['count', 'hate_speech', 'offensive_language', 'neither', 'Unnamed: 0'], axis=1, inplace=True)

classes = ['Hate Speech', 'Offensive Language', 'None']
labels = df['class']
unique, counts = np.unique(labels, return_counts=True)
values = list(zip(unique, counts))
plt.bar(classes, counts)
for i in values:
    print(classes[i[0]], ' : ', i[1])
plt.show()

# Rebalancing the classes by oversampling the rare ones and capping the common one
hate_tweets = df[df['class'] == 0]
offensive_tweets = df[df['class'] == 1]
neither = df[df['class'] == 2]
for i in range(3):
    hate_tweets = pd.concat([hate_tweets, hate_tweets], ignore_index=True)
neither = pd.concat([neither, neither, neither], ignore_index=True)
offensive_tweets = offensive_tweets.iloc[0:12000, :]
df = pd.concat([hate_tweets, offensive_tweets, neither], ignore_index=True)

labels = df['class']
unique, counts = np.unique(labels, return_counts=True)
values = list(zip(unique, counts))
plt.bar(classes, counts)
for i in values:
    print(classes[i[0]], ' : ', i[1])
plt.show()

import nltk
from nltk.stem import WordNetLemmatizer
from nltk.corpus import stopwords
import re

nltk.download('wordnet')
nltk.download('stopwords')

# Expanding common slang / abbreviations
d = {'luv': 'love', 'wud': 'would', 'lyk': 'like', 'wateva': 'whatever', 'ttyl': 'talk to you later',
     'kul': 'cool', 'fyn': 'fine', 'omg': 'oh my god!', 'fam': 'family', 'bruh': 'brother',
     'cud': 'could', 'fud': 'food', 'u': 'you', 'ur': 'your', 'bday': 'birthday', 'bihday': 'birthday'}

stop_words = set(stopwords.words('english'))
stop_words.add('rt')
stop_words.remove('not')
lemmatizer = WordNetLemmatizer()
giant_url_regex = 'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*,]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
mention_regex = r'@[\w\-]+'

def clean_text(text):
    text = re.sub('"', '', text)
    text = re.sub(mention_regex, ' ', text)     # remove @mentions
    text = re.sub(giant_url_regex, ' ', text)   # remove URLs
    text = text.lower()
    text = re.sub('hm+', '', text)
    text = re.sub('[^a-z]+', ' ', text)         # keep letters only
    text = text.split()
    text = [word for word in text if word not in stop_words]
    text = [d[word] if word in d else word for word in text]
    text = [lemmatizer.lemmatize(token) for token in text]
    text = [lemmatizer.lemmatize(token, 'v') for token in text]
    return ' '.join(text)

df['processed_tweets'] = df.tweet.apply(lambda x: clean_text(x))
x = df.processed_tweets
y = df['class']

from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer = TfidfVectorizer(max_features=8000)
vectorizer.fit(x)
x_tfidf = vectorizer.transform(x).toarray()

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

num_words = 8000
embed_dim = 32
pad_length = 24

# Fit a Keras Tokenizer on the processed tweets (required by texts_to_sequences below)
tokenizer = Tokenizer(num_words=num_words)
tokenizer.fit_on_texts(x)

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(
    pad_sequences(tokenizer.texts_to_sequences(x), maxlen=pad_length,
                  truncating='pre', padding='post'),
    y, test_size=0.05)

from keras.layers import Dense, Embedding, Dropout, GlobalMaxPool1D, SimpleRNN
from keras.models import Sequential

model = Sequential([
    Embedding(num_words, embed_dim, input_length=pad_length),
    SimpleRNN(8, return_sequences=True),
    GlobalMaxPool1D(),
    Dense(20, activation='relu', kernel_initializer='he_uniform'),
    Dropout(0.25),
    Dense(3, activation='softmax')
])

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x=x_train, y=y_train, epochs=5, validation_split=0.05)
evaluate = model.evaluate(x_test, y_test)

from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
predictions = np.argmax(model.predict(x_test), axis=1)
cm = confusion_matrix(y_test, predictions)
acc = accuracy_score(y_test, predictions)
print('Accuracy: {:.2f}%'.format(acc * 100))
print('Confusion Matrix:', cm)
print('Classification Report:', classification_report(y_test, predictions))

Output-

Model Summary-

Accuracy and loss-

Classification Report-



Conclusion-
Sentiment analysis using RNN-LSTM on tweets data involves analyzing textual content to determine the sentiment,
whether positive, negative, or neutral. Recurrent Neural Networks (RNN) with Long Short-Term Memory (LSTM)
units are employed to capture temporal dependencies and long-range patterns within the tweet sequences. The model
is trained on a labeled dataset of tweets, where preprocessing steps include tokenization, padding, and word
embedding (e.g., using Word2Vec or GloVe). LSTM layers are stacked to enhance learning, followed by dense layers
with a softmax activation function to classify sentiments. Hyperparameter tuning, such as adjusting the number of
LSTM units, learning rate, batch size, and number of epochs, helps optimize performance. The model's accuracy
typically ranges from 80% to 90%, demonstrating its effectiveness in capturing sentiment from textual data.

EXPERIMENT 7

AIM: To implement and analyze an autoencoder model for the MNIST dataset, demonstrating its capability in
dimensionality reduction and feature extraction.

THEORY:
Autoencoders are a type of artificial neural network used to learn efficient codings of input data. They consist of two
main parts:

• Encoder: Compresses the input into a lower-dimensional representation.

• Decoder: Reconstructs the original input from the compressed representation.


The goal is to minimize the reconstruction loss, ensuring that the decoded output closely resembles the input. The
MNIST dataset, containing handwritten digits (0-9), provides a suitable dataset for training and testing autoencoders.
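
To make "reconstruction loss" concrete, a small NumPy sketch (hypothetical pixel values) compares an input with its reconstruction; the code below trains with the binary cross-entropy form of the same idea:

import numpy as np

x = np.array([0.0, 0.5, 1.0, 0.25])       # original pixels, scaled to [0, 1]
x_hat = np.array([0.1, 0.4, 0.9, 0.30])   # decoder output for the same pixels

mse = np.mean((x - x_hat) ** 2)            # mean squared reconstruction error
bce = -np.mean(x * np.log(x_hat + 1e-7) + (1 - x) * np.log(1 - x_hat + 1e-7))
print(f'MSE: {mse:.4f}, BCE: {bce:.4f}')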

CODE & IMPLEMENTATION:

#Importing Necessary libraries
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras import layers
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Model

#Pre-processing
#Normalizing the data into the range 0 to 1 and reshaping
def preprocess(array):
    array = array.astype("float32") / 255.0
    array = np.reshape(array, (len(array), 28, 28, 1))
    return array

#Adding noise to the original images
def noise(array):
    noise_factor = 0.4  # amount of noise to add
    noisy_array = array + noise_factor * np.random.normal(
        loc=0.0, scale=1.0, size=array.shape
    )
    return np.clip(noisy_array, 0.0, 1.0)

#Visualizing the images
def display(array1, array2):
    n = 10
    indices = np.random.randint(len(array1), size=n)
    images1 = array1[indices, :]
    images2 = array2[indices, :]

    plt.figure(figsize=(20, 4))
    for i, (image1, image2) in enumerate(zip(images1, images2)):
        ax = plt.subplot(2, n, i + 1)
        plt.imshow(image1.reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

        ax = plt.subplot(2, n, i + 1 + n)
        plt.imshow(image2.reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

    plt.show()

#Preparing the data
# Since we only need images from the dataset to encode and decode,
# we won't use the labels.
(train_data, _), (test_data, _) = mnist.load_data()

# Normalize and reshape the data
train_data = preprocess(train_data)
test_data = preprocess(test_data)

# Create a copy of the data with added noise
noisy_train_data = noise(train_data)
noisy_test_data = noise(test_data)

# Display the train data and a version of it with added noise
display(train_data, noisy_train_data)

#Building the Autoencoder
input = layers.Input(shape=(28, 28, 1))

# Encoder
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(input)
x = layers.MaxPooling2D((2, 2), padding="same")(x)
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2), padding="same")(x)

# Decoder
x = layers.Conv2DTranspose(32, (3, 3), strides=2, activation="relu", padding="same")(x)
x = layers.Conv2DTranspose(32, (3, 3), strides=2, activation="relu", padding="same")(x)
x = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

# Autoencoder
autoencoder = Model(input, x)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.summary()

#Training the model
autoencoder.fit(
    x=train_data,
    y=train_data,
    epochs=50,
    batch_size=128,
    shuffle=True,
    validation_data=(test_data, test_data),
)

#Prediction
predictions = autoencoder.predict(test_data)
display(test_data, predictions)

Conclusion-
The model consists of an encoder that compresses the input data into a latent space representation and a decoder that
reconstructs the original data from this compressed form. Using the MNIST dataset, which contains handwritten
digits, the autoencoder effectively reduces the high-dimensional input (28x28 pixels) to a lower-dimensional latent
vector. The model is trained using a binary cross-entropy reconstruction loss and optimized with the Adam optimizer. After
training, the autoencoder demonstrates impressive reconstruction quality while significantly reducing dimensionality,
proving its effectiveness in feature extraction and data compression tasks. The latent space representations can also be
visualized, highlighting clusters corresponding to different digits, indicating successful feature learning.

EXPERIMENT 8

AIM:

To implement and analyze a Variational Autoencoder (VAE) for image reconstruction, specifically for facial image
data. The goal is to understand how VAEs encode input data into a latent space and reconstruct images from sampled
latent variables.

THEORY:

Variational Autoencoders (VAEs) are a type of generative model that learns to encode input data into a
lower-dimensional latent space and generate new data samples from that space. Unlike traditional autoencoders, VAEs
introduce a probabilistic framework to enforce a continuous and structured latent space, which helps in generating
diverse and realistic outputs.
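
For reference, the standard VAE objective pairs the reconstruction loss with a KL-divergence penalty that keeps the learned latent distribution close to a standard normal, and sampling uses the reparameterization trick z = mu + sigma * epsilon. A short TensorFlow sketch of those two pieces (the implementation below trains with a plain MSE reconstruction loss instead):

import tensorflow as tf

def kl_loss(z_mean, z_log_var):
    # KL divergence between N(mu, sigma^2) and N(0, 1), summed over latent dimensions
    return -0.5 * tf.reduce_sum(
        1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=1)

# Reparameterization trick: z = mu + sigma * epsilon, with epsilon ~ N(0, I)
z_mean = tf.zeros((2, 3))
z_log_var = tf.zeros((2, 3))
epsilon = tf.random.normal(tf.shape(z_mean))
z = z_mean + tf.exp(0.5 * z_log_var) * epsilon
print(kl_loss(z_mean, z_log_var))   # zero when the posterior equals the prior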

CODE & IMPLEMENTATION:

#Importing Necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# For training the VAE
import tensorflow as tf

# For creating interactive widgets
import ipywidgets as widgets
from IPython.display import display

# Load the data from a .csv file
pixel_data = pd.read_csv('/workspace/Bootcamp/data/age_gender.csv')['pixels']

# Shuffle the data
pixel_data = pixel_data.sample(frac=1.0, random_state=1)

# Convert the data into a NumPy array
pixel_data = pixel_data.apply(lambda x: np.array(x.split(" "), dtype=int))
pixel_data = np.stack(np.array(pixel_data), axis=0)

# Rescale pixel values to be between 0 and 1
pixel_data = pixel_data * (1./255)

# The data is now a NumPy array of 23705 images
# (we are working with 48x48x1 images)
pixel_data.shape

#Building the VAE

class Sampling(tf.keras.layers.Layer):
    def call(self, inputs):
        z_mean, z_log_var = inputs   # Unpack the inputs into mean and log-variance
        batch = tf.shape(z_mean)[0]  # Get the batch size
        dim = tf.shape(z_mean)[1]    # Get the dimensionality of the latent space
        epsilon = tf.keras.backend.random_normal(shape=(batch, dim))  # Sample from standard normal distribution
        return epsilon * tf.exp(z_log_var * 0.5) + z_mean  # Apply the reparameterization trick

def build_vae(num_pixels, num_latent_vars=3):
    # Encoder
    encoder_inputs = tf.keras.Input(shape=(num_pixels,))  # Input layer for the encoder
    x = tf.keras.layers.Dense(512, activation='relu')(encoder_inputs)  # First dense layer with 512 units
    x = tf.keras.layers.Dense(128, activation='relu')(x)  # Second dense layer with 128 units
    x = tf.keras.layers.Dense(32, activation='relu')(x)   # Third dense layer with 32 units
    z_mean = tf.keras.layers.Dense(num_latent_vars)(x)           # Mean of the latent variables
    z_log_var = tf.keras.layers.Dense(num_latent_vars)(z_mean)   # Log-variance of the latent variables
    z = Sampling()([z_mean, z_log_var])  # Sample the latent variables using the reparameterization trick

    encoder = tf.keras.Model(inputs=encoder_inputs, outputs=z)  # Define the encoder model

    # Decoder
    decoder_inputs = tf.keras.Input(shape=(num_latent_vars,))  # Input layer for the decoder
    x = tf.keras.layers.Dense(32, activation='relu')(decoder_inputs)  # First dense layer with 32 units
    x = tf.keras.layers.Dense(128, activation='relu')(x)  # Second dense layer with 128 units
    x = tf.keras.layers.Dense(512, activation='relu')(x)  # Third dense layer with 512 units
    reconstruction = tf.keras.layers.Dense(num_pixels, activation='linear')(x)  # Output layer with 'num_pixels' units

    decoder = tf.keras.Model(inputs=decoder_inputs, outputs=reconstruction)  # Define the decoder model

    # Full model
    model_inputs = encoder.input              # Inputs of the full VAE model are the inputs of the encoder
    model_outputs = decoder(encoder.output)   # Outputs are the decoder's outputs given the encoder's output

    model = tf.keras.Model(inputs=model_inputs, outputs=model_outputs)  # Define the full VAE model

    # Compile model for training
    model.compile(
        optimizer='adam',  # Adam optimizer
        loss='mse'         # Mean Squared Error (MSE) loss function
    )

    # Return the encoder, decoder, and full VAE models
    return encoder, decoder, model

face_encoder, face_decoder, face_model = build_vae(num_pixels=2304, num_latent_vars=3)

print(face_encoder.summary())
print(face_decoder.summary())

#Train the VAE
#We use pixel_data as both the input to the model and the target to compare the output to.
history = face_model.fit(
    pixel_data,
    pixel_data,
    validation_split=0.2,
    batch_size=32,
    epochs=100,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(
            monitor='val_loss',
            patience=10,
            restore_best_weights=True
        )
    ])

#Image Reconstruction
#Let's see how the model does at reconstructing an image that it has already seen.
i = 6
sample = np.array(pixel_data)[i].copy()
sample = sample.reshape(48, 48, 1)

reconstruction = face_model.predict(pixel_data)[i].copy()
reconstruction = reconstruction.reshape(48, 48, 1)

plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.imshow(sample, cmap='gray')
plt.axis('off')
plt.title("Original Image")

plt.subplot(1, 2, 2)
plt.imshow(reconstruction, cmap='gray')
plt.axis('off')
plt.title("Reconstructed Image")
plt.show()

#Specify our own latent variable values
#Now let's see how we can use our own values to generate never-before-seen images.

# A function to specify our own latent variable values and plot the constructed image
def generate_face_image(latent1, latent2, latent3):
    latent_vars = np.array([[latent1, latent2, latent3]])
    reconstruction = np.array(face_decoder(latent_vars))
    reconstruction = reconstruction.reshape(48, 48, 1)
    plt.figure()
    plt.imshow(reconstruction, cmap='gray')
    plt.axis('off')
    plt.show()

# Get the min and max for each slider on the interactive widget
latent1_min = np.min(face_encoder(pixel_data).numpy()[:, 0])
latent1_max = np.max(face_encoder(pixel_data).numpy()[:, 0])
latent2_min = np.min(face_encoder(pixel_data).numpy()[:, 1])
latent2_max = np.max(face_encoder(pixel_data).numpy()[:, 1])
latent3_min = np.min(face_encoder(pixel_data).numpy()[:, 2])
latent3_max = np.max(face_encoder(pixel_data).numpy()[:, 2])

print(tf.__version__)

# Create the interactive widget
face_image_generator = widgets.interact(
    generate_face_image,
    latent1=(latent1_min, latent1_max),
    latent2=(latent2_min, latent2_max),
    latent3=(latent3_min, latent3_max),
)

# Display the widget
display(face_image_generator)

EXPERIMENT 9

AIM: To implement and analyse a Generative Adversarial Network (GAN) for generating synthetic handwritten
digits using the MNIST dataset. The experiment aims to understand how GANs generate new data by learning from a
dataset of real images.

THEORY:

Generative Adversarial Networks (GANs) are a class of deep learning models used for generating new data that is
similar to a given dataset. A GAN consists of two neural networks that compete with each other in a zero-sum game:

1. Generator: Creates synthetic data samples from random noise.


2. Discriminator: Distinguishes between real and fake samples.
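
For reference, the two networks optimize the standard minimax value function from the GAN literature:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

The discriminator loss defined in the code below implements the inner maximization (real samples labelled 1, generated samples labelled 0), while the generator loss implements the outer minimization by pushing D(G(z)) toward 1.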

CODE & IMPLEMENTATION:

#Importing Necessary libraries

from tensorflow.keras.layers import (Dense,


BatchNormalization,
LeakyReLU,
Reshape,
Conv2DTranspose,
Conv2D,
Dropout,
Flatten)
import tensorflow as tf import
matplotlib.pyplot as plt

#Preparing the data

# underscore to omit the label arrays


(train_images, train_labels), (_, _) = tf.keras.datasets.mnist.load_data()

train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32') train_images


= (train_images - 127.5) / 127.5 # Normalize the images to [-1, 1]

BUFFER_SIZE = 60000
BATCH_SIZE = 256

# Batch and shuffle the data


train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

#Dimension of the Noise

# Set the dimensions of the noise z_dim


= 100

#Functional Code Generator

# nch = 200
# g_input = Input(shape=[100])
# H = Dense(nch*14*14, kernel_initializer='glorot_normal')(g_input)
# H = BatchNormalization()(H)
# H = Activation('relu')(H)
# H = Reshape( [nch, 14, 14] )(H)

# H = UpSampling2D(size=(2, 2))(H)
# H = Convolution2D(int(nch/2), 3, 3, padding='same', kernel_initializer='glorot_uniform')(H)
# H = BatchNormalization()(H)
# H = Activation('relu')(H)
# H = Convolution2D(int(nch/4), 3, 3, padding='same', kernel_initializer='glorot_uniform')(H)
# H = BatchNormalization()(H)
# H = Activation('relu')(H)
# H = Convolution2D(1, 1, 1, padding='same', kernel_initializer='glorot_uniform')(H)
# g_V = Activation('sigmoid')(H)
# generator = Model(g_input,g_V)
# generator.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.0002, beta_1=0.5))
# generator.summary()

#Building Generator

def generator_model():
    model = tf.keras.Sequential()
    model.add(Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(BatchNormalization())
    model.add(LeakyReLU())

    model.add(Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # Note: None is the batch size

    model.add(Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(BatchNormalization())
    model.add(LeakyReLU())

    model.add(Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(BatchNormalization())
    model.add(LeakyReLU())

    # tanh output matches the [-1, 1] normalization applied to the training images
    model.add(Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    print(model.summary())

    return model

generator = generator_model()

#Generate a Sample Image

# Create a random noise and generate a sample
noise = tf.random.normal([1, 100])
generated_image = generator(noise, training=False)

# Visualize the generated sample
plt.imshow(generated_image[0, :, :, 0], cmap='gray')



#Discriminative Model

def discriminator_model():
    model = tf.keras.Sequential()

    model.add(Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]))
    model.add(LeakyReLU())
    model.add(Dropout(0.3))

    model.add(Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(LeakyReLU())
    model.add(Dropout(0.3))

    model.add(Flatten())
    model.add(Dense(1))

    print(model.summary())

    return model

discriminator = discriminator_model()



#Configure the Model

# This method returns a helper function to compute cross entropy loss
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)

generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)
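As a quick smoke test (an added sketch, not part of the original notebook), the two loss functions can be probed with hand-made logits; since from_logits=True, raw scores are passed through a sigmoid internally.

# Illustrative check: positive logit = "looks real", negative logit = "looks fake"
dummy_real = tf.constant([[2.0]])   # discriminator is confident the real sample is real
dummy_fake = tf.constant([[-2.0]])  # discriminator is confident the fake sample is fake
print(discriminator_loss(dummy_real, dummy_fake).numpy())  # small: both judgements correct
print(generator_loss(dummy_fake).numpy())                  # large: generator failed to fool D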

#Training

import os
checkpoint_dir = '/content/drive/MyDrive/AMITY/Deep Learning (codes)/GAN/'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(generator_optimizer=generator_optimizer,
                                 discriminator_optimizer=discriminator_optimizer,
                                 generator=generator,
                                 discriminator=discriminator)

EPOCHS = 60
# We will reuse this fixed seed over time (so it's easier
# to visualize progress in the animated GIF)
num_examples_to_generate = 16
noise_dim = 100
seed = tf.random.normal([num_examples_to_generate, noise_dim])

#Training Steps

# tf.function annotation causes the function
# to be "compiled" as part of the training
@tf.function
def train_step(images):

    # 1 - Create a random noise to feed it into the model
    #     for the image generation
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    # 2 - Generate images and calculate loss values
    #     GradientTape records operations for automatic differentiation.
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    # 3 - Calculate the gradients of each loss with respect to its own network's weights
    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    # 4 - Process Gradients and Run the Optimizer
    # "apply_gradients" method processes aggregated gradients.
    # ex: optimizer.apply_gradients(zip(grads, vars))
    """
    Example use of apply_gradients:
    grads = tape.gradient(loss, vars)
    grads = tf.distribute.get_replica_context().all_reduce('sum', grads)
    # Processing aggregated gradients.
    optimizer.apply_gradients(zip(grads, vars), experimental_aggregate_gradients=False)
    """
    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

#Image Generation Function

def generate_and_save_images(model, epoch, test_input):
    # Notice `training` is set to False.
    # This is so all layers run in inference mode (batchnorm).
    # 1 - Generate images
    predictions = model(test_input, training=False)
    # 2 - Plot the generated images
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i + 1)
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')
    # 3 - Save the generated images
    plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
    plt.show()

#Train GAN

import time
from IPython import display  # A command shell for interactive computing in Python.

def train(dataset, epochs):
    # A. For each epoch, do the following:
    for epoch in range(epochs):
        start = time.time()
        # 1 - For each batch of the epoch,
        for image_batch in dataset:
            # 1.a - run the custom "train_step" function
            #       we just declared above
            train_step(image_batch)

        # 2 - Produce images for the GIF as we go
        display.clear_output(wait=True)
        generate_and_save_images(generator, epoch + 1, seed)

        # 3 - Save the model every 5 epochs as
        #     a checkpoint, which we will use later
        if (epoch + 1) % 5 == 0:
            checkpoint.save(file_prefix=checkpoint_prefix)

        # 4 - Print out the completed epoch no. and the time spent
        print('Time for epoch {} is {} sec'.format(epoch + 1, time.time() - start))

    # B. Generate a final image after the training is completed
    display.clear_output(wait=True)
    generate_and_save_images(generator, epochs, seed)

#Start Training
train(train_dataset, EPOCHS)

#Generated Digits
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))
# PIL is a library which may open different image file formats
import PIL
# Display a single image using the epoch number
def display_image(epoch_no):
    return PIL.Image.open('image_at_epoch_{:04d}.png'.format(epoch_no))

display_image(EPOCHS)



import glob # The glob module is used for Unix style pathname pattern expansion.
import imageio # The library that provides an easy interface to read and write a wide range of image data

anim_file = 'dcgan.gif'

with imageio.get_writer(anim_file, mode='I') as writer:
    filenames = glob.glob('image*.png')
    filenames = sorted(filenames)
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

display.Image(open('dcgan.gif', 'rb').read())
EXPERIMENT 10

AIM: To implement an image classification model using Vision Transformer (ViT) on the CIFAR-100 dataset and
evaluate its performance.

THEORY:

The Vision Transformer (ViT) is a deep learning model that applies the transformer architecture, originally designed for natural language processing (NLP), to image classification tasks. Unlike traditional Convolutional Neural Networks (CNNs), ViTs do not use convolutional layers; instead, they split an image into patches and rely on self-attention mechanisms to model spatial relationships between those patches, as sketched below.
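To make the core operation concrete, here is a minimal NumPy sketch of single-head scaled dot-product attention over a sequence of patch embeddings. This is an added illustration with assumed shapes (144 patches of dimension 64, matching the configuration used later in this experiment), not code from the experiment itself.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                          # (num_patches, num_patches)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                                       # (num_patches, d_v)

# Self-attention: queries, keys, and values all come from the same patch embeddings
x = np.random.randn(144, 64)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (144, 64): each patch is now a weighted mix of all patches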

CODE & IMPLEMENTATION:

#Setup

!pip install tensorflow-addons

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_addons as tfa

#Prepare the data

num_classes = 100
input_shape = (32, 32, 3)

(x_train, y_train), (x_test, y_test) = keras.datasets.cifar100.load_data()



print(f"x_train shape: {x_train.shape} - y_train shape: {y_train.shape}") print(f"x_test
shape: {x_test.shape} - y_test shape: {y_test.shape}")

x_train shape: (50000, 32, 32, 3) - y_train shape: (50000, 1) x_test


shape: (10000, 32, 32, 3) - y_test shape: (10000, 1)

#Configure the hyperparameters

learning_rate = 0.001
weight_decay = 0.0001
batch_size = 256
num_epochs = 100
image_size = 72  # We'll resize input images to this size
patch_size = 6   # Size of the patches to be extracted from the input images
num_patches = (image_size // patch_size) ** 2
projection_dim = 64
num_heads = 4
transformer_units = [
    projection_dim * 2,
    projection_dim,
]  # Size of the transformer layers
transformer_layers = 8
mlp_head_units = [2048, 1024]  # Size of the dense layers of the final classifier
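A quick worked check of the patch arithmetic (added for clarity; the same numbers are printed by the patch-display cell further down): num_patches = (72 // 6)^2 = 12^2 = 144 patches per image, and each RGB patch flattens to 6 * 6 * 3 = 108 elements.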

#Use data augmentation

data_augmentation = keras.Sequential(
[
layers.Normalization(),
layers.Resizing(image_size, image_size),
layers.RandomFlip("horizontal"),
layers.RandomRotation(factor=0.02),
layers.RandomZoom(
height_factor=0.2, width_factor=0.2
),
],
name="data_augmentation",
)
# Compute the mean and the variance of the training data for normalization.
data_augmentation.layers[0].adapt(x_train)

#Implement multilayer perceptron (MLP)

def mlp(x, hidden_units, dropout_rate):
    for units in hidden_units:
        x = layers.Dense(units, activation=tf.nn.gelu)(x)
        x = layers.Dropout(dropout_rate)(x)
    return x



#Implement patch creation as a layer

class Patches(layers.Layer):
    def __init__(self, patch_size):
        super().__init__()
        self.patch_size = patch_size

    def call(self, images):
        batch_size = tf.shape(images)[0]
        patches = tf.image.extract_patches(
            images=images,
            sizes=[1, self.patch_size, self.patch_size, 1],
            strides=[1, self.patch_size, self.patch_size, 1],
            rates=[1, 1, 1, 1],
            padding="VALID",
        )
        patch_dims = patches.shape[-1]
        patches = tf.reshape(patches, [batch_size, -1, patch_dims])
        return patches

#Let's display patches for a sample image

import matplotlib.pyplot as plt

plt.figure(figsize=(4, 4))
image = x_train[np.random.choice(range(x_train.shape[0]))]
plt.imshow(image.astype("uint8"))
plt.axis("off")

resized_image = tf.image.resize(
tf.convert_to_tensor([image]), size=(image_size, image_size)
)
patches = Patches(patch_size)(resized_image)
print(f"Image size: {image_size} X {image_size}")
print(f"Patch size: {patch_size} X {patch_size}")
print(f"Patches per image: {patches.shape[1]}")
print(f"Elements per patch: {patches.shape[-1]}")

n = int(np.sqrt(patches.shape[1]))
plt.figure(figsize=(4, 4))
for i, patch in enumerate(patches[0]):
    ax = plt.subplot(n, n, i + 1)
    patch_img = tf.reshape(patch, (patch_size, patch_size, 3))
    plt.imshow(patch_img.numpy().astype("uint8"))
    plt.axis("off")



Image size: 72 X 72
Patch size: 6 X 6
Patches per image: 144
Elements per patch: 108

#Implement the patch encoding layer

class PatchEncoder(layers.Layer):
    def __init__(self, num_patches, projection_dim):
        super().__init__()
        self.num_patches = num_patches
        self.projection = layers.Dense(units=projection_dim)
        self.position_embedding = layers.Embedding(
            input_dim=num_patches, output_dim=projection_dim
        )

    def call(self, patch):
        positions = tf.range(start=0, limit=self.num_patches, delta=1)
        encoded = self.projection(patch) + self.position_embedding(positions)
        return encoded
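A quick shape check (an added illustration; the 108-element patches match the printout above) confirms that the encoder projects each flattened patch to projection_dim and adds a learned position embedding:

# Hypothetical shape check: 144 patches of 108 raw values -> 144 x 64 encodings
dummy_patches = tf.zeros((1, num_patches, 108))
encoded = PatchEncoder(num_patches, projection_dim)(dummy_patches)
print(encoded.shape)  # (1, 144, 64)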

#Build the ViT model

def create_vit_classifier():
    inputs = layers.Input(shape=input_shape)
    # Augment data.
    augmented = data_augmentation(inputs)
    # Create patches.
    patches = Patches(patch_size)(augmented)
    # Encode patches.
    encoded_patches = PatchEncoder(num_patches, projection_dim)(patches)

    # Create multiple layers of the Transformer block.
    for _ in range(transformer_layers):
        # Layer normalization 1.
        x1 = layers.LayerNormalization(epsilon=1e-6)(encoded_patches)
        # Create a multi-head attention layer.
        attention_output = layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=projection_dim, dropout=0.1
        )(x1, x1)
        # Skip connection 1.
        x2 = layers.Add()([attention_output, encoded_patches])
        # Layer normalization 2.
        x3 = layers.LayerNormalization(epsilon=1e-6)(x2)
        # MLP.
        x3 = mlp(x3, hidden_units=transformer_units, dropout_rate=0.1)
        # Skip connection 2.
        encoded_patches = layers.Add()([x3, x2])

    # Create a [batch_size, projection_dim] tensor.
    representation = layers.LayerNormalization(epsilon=1e-6)(encoded_patches)
    representation = layers.Flatten()(representation)
    representation = layers.Dropout(0.5)(representation)
    # Add MLP.
    features = mlp(representation, hidden_units=mlp_head_units, dropout_rate=0.5)
    # Classify outputs.
    logits = layers.Dense(num_classes)(features)
    # Create the Keras model.
    model = keras.Model(inputs=inputs, outputs=logits)
    return model

#Compile, train, and evaluate the model

def run_experiment(model):
    optimizer = tfa.optimizers.AdamW(
        learning_rate=learning_rate, weight_decay=weight_decay
    )

    model.compile(
        optimizer=optimizer,
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=[
            keras.metrics.SparseCategoricalAccuracy(name="accuracy"),
            keras.metrics.SparseTopKCategoricalAccuracy(5, name="top-5-accuracy"),
        ],
    )



checkpoint_filepath = "/tmp/checkpoint"
checkpoint_callback = keras.callbacks.ModelCheckpoint(
checkpoint_filepath, monitor="val_accuracy",
save_best_only=True,
save_weights_only=True,
)

    history = model.fit(
        x=x_train,
        y=y_train,
        batch_size=batch_size,
        epochs=num_epochs,
        validation_split=0.1,
        callbacks=[checkpoint_callback],
    )

    model.load_weights(checkpoint_filepath)
    _, accuracy, top_5_accuracy = model.evaluate(x_test, y_test)
    print(f"Test accuracy: {round(accuracy * 100, 2)}%")
    print(f"Test top 5 accuracy: {round(top_5_accuracy * 100, 2)}%")

    return history

vit_classifier = create_vit_classifier()
history = run_experiment(vit_classifier)

CONCLUSION:

Test accuracy: 54.43%
Test top 5 accuracy: 81.43%

After 100 epochs, the ViT model achieves around 55% accuracy and 82% top-5 accuracy on the test data. These are not competitive results on the CIFAR-100 dataset, as a ResNet50V2 trained from scratch on the same data can achieve 67% accuracy. Note that the state-of-the-art results reported in the ViT paper are achieved by pre-training the model on the JFT-300M dataset and then fine-tuning it on the target dataset.

To improve model quality without pre-training, you can train for more epochs, use a larger number of Transformer layers, resize the input images, change the patch size, or increase the projection dimensions. As the paper also notes, model quality is affected not only by architecture choices but also by parameters such as the learning rate schedule, optimizer, and weight decay. In practice, it is recommended to fine-tune a ViT model that was pre-trained on a large, high-resolution dataset.
