LP - V - Lab Manual - DL
Laboratory Manual
Journal Guidelines
1. The laboratory assignments are to be submitted by the student in the form of a journal with
handwritten write-ups.
2. The journal may consist of:
a. Index
b. Certificate
c. Assignment Write - ups
c.1) Aim - Title
c.2) Software & Programming Language Used - Java / Python
c.3) Theory - Descriptive write-up (extensive detail is not required)
c.4) Algorithm - Depends on Assignment
c.5) Mathematical Model - As per program
c.6) Conclusion
3. Program codes with sample outputs of all performed assignments are to be submitted in a file.
Deep Learning Assignment List
1. Linear regression using a deep neural network: Implement the Boston housing price
prediction problem by linear regression using a deep neural network. Use the Boston House
Price prediction dataset.
Assignment No: 1
Title of the Assignment: Linear regression using a deep neural network: Implement the Boston housing
price prediction problem by linear regression using a deep neural network. Use the Boston House Price
prediction dataset.
Objective of the Assignment: Students should be able to perform linear regression using a
deep neural network on the Boston House price dataset.
Prerequisite:
1. Basics of a programming language
2. Concept of Linear Regression
3. Concept of Deep Neural Network
---------------------------------------------------------------------------------------------------------------
Contents for Theory:
1. What is Linear Regression
2. Example of Linear Regression
3. Concept of Deep Neural Network
4. How Deep Neural Networks Work
5. Code Explanation with Output
---------------------------------------------------------------------------------------------------------------
What is Linear Regression?
Linear regression is a statistical approach that is commonly used to model the relationship between a
dependent variable and one or more independent variables. It assumes a linear relationship between the
variables and uses mathematical methods to estimate the coefficients that best fit the data.
Deep neural networks are a type of machine learning algorithm that are modeled after the structure and
function of the human brain. They consist of multiple layers of interconnected neurons that process data
and learn from it to make predictions or classifications.
Linear regression using deep neural networks combines the principles of linear regression with the power
of deep learning algorithms. In this approach, the input features are passed through one or more layers of
neurons to extract features and then a linear regression model is applied to the output of the last layer to
make predictions. The weights and biases of the neural network are adjusted during training to optimize
the performance of the model.
This approach can be used for a variety of tasks, including predicting numerical values, such as stock
prices or housing prices, and classifying data into categories, such as detecting whether an image contains
a particular object or not. It is often used in fields such as finance, healthcare, and image recognition.
Boston House Price Prediction is a common example used to illustrate how a deep neural network can
work for regression tasks. The goal of this task is to predict the price of a house in Boston based on
various features such as the number of rooms, crime rate, and accessibility to public transportation.
Here's how a deep neural network can work for Boston House Price Prediction:
1. Data preprocessing: The first step is to preprocess the data. This involves normalizing the input
features to have a mean of 0 and a standard deviation of 1, which helps the network learn more
efficiently. The dataset is then split into training and testing sets.
2. Model architecture: A deep neural network is then defined with multiple layers. The first layer is
the input layer, which takes in the normalized features. This is followed by several hidden layers,
which can be deep or shallow. The last layer is the output layer, which predicts the house price.
3. Model training: The model is then trained using the training set. During training, the weights and
biases of the nodes are adjusted based on the error between the predicted output and the actual
output. This is done using an optimization algorithm such as stochastic gradient descent.
4. Model evaluation: Once the model is trained, it is evaluated using the testing set. The
performance of the model is measured using metrics such as mean squared error or mean absolute
error.
5. Model prediction: Finally, the trained model can be used to make predictions on new data, such as
predicting the price of a new house in Boston based on its features.
By using a deep neural network for Boston House Price Prediction, we can obtain accurate
predictions based on a large set of input features. This approach is scalable and can be used for
other regression tasks as well.
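Source Code with Output-
The listing below starts from data.info(), so the loading step is not shown. A minimal sketch, assuming the classic scikit-learn Boston loader (removed in scikit-learn 1.2; reading the same data from the original CSV source works equally well):
import pandas as pd
import numpy as np
from sklearn.datasets import load_boston # removed in scikit-learn >= 1.2

boston = load_boston()
data = pd.DataFrame(boston.data, columns=boston.feature_names)
data['PRICE'] = boston.target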
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 506 entries, 0 to 505
Data columns (total 14 columns):
# Column Non-Null Count Dtype
#Checking the correlation of the independent feature with the dependent feature
# Correlation is a statistical technique that can show whether, and how strongly,
# pairs of variables are related. An intelligent correlation analysis can lead to a
# greater understanding of your data.
#checking Correlation of the data
correlation = data.corr()
correlation.loc['PRICE']
CRIM -0.388305
ZN 0.360445
INDUS -0.483725
CHAS 0.175260
NOX -0.427321
RM 0.695360
AGE -0.376955
DIS 0.249929
RAD -0.381626
TAX -0.468536
PTRATIO -0.507787
B 0.333461
LSTAT -0.737663
PRICE 1.000000
Name: PRICE, dtype: float64
# plotting the heatmap
import matplotlib.pyplot as plt
import seaborn as sns
fig, axes = plt.subplots(figsize=(15, 12))
sns.heatmap(correlation, square=True, annot=True)
# By looking at the correlation plot, LSTAT is negatively correlated with the price (-0.74),
# RM is positively correlated (0.70), and PTRATIO is negatively correlated (-0.51)
# Checking the scatter plot with the most correlated features
plt.figure(figsize=(20, 5))
features = ['LSTAT', 'RM', 'PTRATIO']
for i, col in enumerate(features):
    plt.subplot(1, len(features), i + 1)
    x = data[col]
    y = data.PRICE
    plt.scatter(x, y, marker='o')
    plt.title("Variation in House prices")
    plt.xlabel(col)
    plt.ylabel('House prices in $1000')
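The statistics computed on the next two lines use X_train, but the train/test split itself is not shown in the listing. A minimal sketch, assuming scikit-learn's train_test_split with an 80/20 split (the ratio is an assumption):
from sklearn.model_selection import train_test_split
X = data.drop('PRICE', axis=1)
y = data['PRICE']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)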
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
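The listing computes the mean and standard deviation but never applies them; the standardization step would look like this:
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std # reuse the training-set statistics for the test set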
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(128, activation='relu', input_dim=13))
model.add(Dense(64,activation = 'relu'))
model.add(Dense(32,activation = 'relu'))
model.add(Dense(16,activation = 'relu'))
model.add(Dense(1))
#model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.compile(optimizer = 'adam',loss ='mean_squared_error',metrics=['mae'])
!pip install ann_visualizer
!pip install graphviz
from ann_visualizer.visualize import ann_viz;
#Build your model here
ann_viz(model, title="DEMO ANN");
history = model.fit(X_train, y_train, epochs=100, validation_split=0.05)
# By plotting both loss and mean absolute error, we can see that our model was
# capable of learning patterns in our data without overfitting taking place (as
# shown by the validation set curves)
from plotly.subplots import make_subplots
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Scattergl(y=history.history['loss'],
name='Train'))
fig.add_trace(go.Scattergl(y=history.history['val_loss'],
name='Valid'))
fig.update_layout(height=500, width=700,
xaxis_title='Epoch',
yaxis_title='Loss')
fig.show()
fig = go.Figure()
fig.add_trace(go.Scattergl(y=history.history['mae'],
name='Train'))
fig.add_trace(go.Scattergl(y=history.history['val_mae'],
name='Valid'))
fig.update_layout(height=500, width=700,
xaxis_title='Epoch',
yaxis_title='Mean Absolute Error')
fig.show()
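The listing next fits a plain linear regression baseline; an evaluation of the trained network itself on the test split is not shown. A minimal sketch (variable names follow the code above):
mse_nn, mae_nn = model.evaluate(X_test, y_test)
print('Neural network MSE:', mse_nn, 'MAE:', mae_nn)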
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Fit a plain linear regression model as a baseline for comparison
lr_model = LinearRegression()
lr_model.fit(X_train, y_train)
y_pred_lr = lr_model.predict(X_test)
mse_lr = mean_squared_error(y_test, y_pred_lr)
mae_lr = mean_absolute_error(y_test, y_pred_lr)
Conclusion- In this way we can Predict the Boston House Price using Deep Neural Network.
Assignment Question
Assignment No: 2A
Title of the Assignment: Binary classification using Deep Neural Networks. Example: Classify movie
reviews into "positive" reviews and "negative" reviews, just based on the text content of the reviews. Use
the IMDB dataset.
Objective of the Assignment: Students should be able to classify movie reviews into "positive" and
"negative" reviews on the IMDB dataset.
Prerequisite:
1. Basics of a programming language
2. Concept of Classification
3. Concept of Deep Neural Network
---------------------------------------------------------------------------------------------------------------
Contents for Theory:
1. What is Classification
2. Example of Classification
3. How Deep Neural Networks Work on Classification
4. Code Explanation with Output
---------------------------------------------------------------------------------------------------------------
What is Classification?
Classification is a type of supervised learning in machine learning that involves categorizing data into
predefined classes or categories based on a set of features or characteristics. It is used to predict the class
of new, unseen data based on the patterns learned from the labeled training data.
In classification, a model is trained on a labeled dataset, where each data point has a known class label.
The model learns to associate the input features with the corresponding class labels and can then be used
to predict the class of new, unseen data.
For example, we can use classification to identify whether an email is spam or not based on its content
and metadata, to predict whether a patient has a disease based on their medical records and symptoms, or
to classify images into categories such as animals or vehicles.
Classification algorithms can vary in complexity, ranging from simple models such as decision trees and
k-nearest neighbors to more complex models such as support vector machines and neural networks. The
choice of algorithm depends on the nature of the data, the size of the dataset, and the desired level of
accuracy and interpretability.
Classification is a common task in deep neural networks, where the goal is to predict the class of an
input based on its features. Here's an example of how classification can be performed in a deep neural
network on the MNIST dataset.
The MNIST dataset contains 60,000 training images and 10,000 testing images of handwritten digits
from 0 to 9. Each image is a grayscale 28x28 pixel image, and the task is to classify each image into one
of the 10 digit classes.
We can use a convolutional neural network (CNN) to classify the MNIST dataset. A CNN is a type of
deep neural network that is commonly used for image classification tasks.
IMDB Dataset-The IMDB dataset is a large collection of movie reviews collected from the IMDB
website, which is a popular source of user-generated movie ratings and reviews. The dataset consists of
50,000 movie reviews, split into 25,000 reviews for training and 25,000 reviews for testing.
Each review is represented as a sequence of words, where each word is represented by an integer index
based on its frequency in the dataset. The labels for each review are binary, with 0 indicating a negative
review and 1 indicating a positive review.
The IMDB dataset is commonly used as a benchmark for sentiment analysis and text classification tasks,
where the goal is to classify the movie reviews as either positive or negative based on their text content.
The dataset is challenging because the reviews are often highly subjective and can contain complex
language and nuances of meaning, making it difficult for traditional machine learning approaches to
accurately classify them.
Deep learning approaches, such as deep neural networks, have achieved state-of-the-art performance on
the IMDB dataset by automatically learning to extract relevant features from the raw text data and map
them to the correct output class. The IMDB dataset is widely used in research and education for natural
language processing and machine learning, as it provides a rich source of labeled text data for training
and testing deep learning models.
Source Code and Output-
# The IMDB sentiment classification dataset consists of 50,000 movie reviews from IMDB users that are
# labeled as either positive (1) or negative (0).
# The reviews are preprocessed and each one is encoded as a sequence of word indexes in the form of integers.
# The words within the reviews are indexed by their overall frequency within the dataset. For example,
# the integer "2" encodes the second most frequent word in the data.
# The 50,000 reviews are split into 25,000 for training and 25,000 for testing.
# Text is processed word by word at different timesteps (you may use an RNN, LSTM or GRU).
# Convert the input text to a vector representation of the input text.
# DOMAIN: Digital content and entertainment industry
# CONTEXT: The objective of this project is to build a text classification model that analyses the
# customer's sentiments based on their reviews in the IMDB database. The model uses a complex deep
# learning model to build an embedding layer followed by a classification algorithm to analyse the
# sentiment of the customers.
# DATA DESCRIPTION: The dataset of 50,000 movie reviews from IMDB, labelled by sentiment
# (positive/negative).
# Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers).
# For convenience, the words are indexed by their frequency in the dataset, meaning the word that has
# index 1 is the most frequent word.
# Use the first 20 words from each review to speed up training, using a max vocabulary size of 10,000.
# As a convention, "0" does not stand for a specific word, but instead is used to encode any unknown word.
# PROJECT OBJECTIVE: Build a sequential NLP classifier which can use input text parameters to
# determine the customer sentiments.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
#loading imdb data with most frequent 10000 words
from keras.datasets import imdb
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=10000)
# Keep only the top 10,000 most frequently used words in the reviews; other words are discarded
# Consolidating data for EDA. Exploratory data analysis (EDA) is used by data scientists to analyze and
# investigate data sets and summarize their main characteristics.
data = np.concatenate((X_train, X_test), axis=0)
# axis 0 runs vertically downwards across rows; axis 1 runs horizontally across columns
label = np.concatenate((y_train, y_test), axis=0)
X_train.shape
(25000,)
X_test.shape
(25000,)
y_train.shape
(25000,)
y_test.shape
(25000,)
print("Review is ",X_train[0]) # series of no converted word to vocabulory associated with index
print("Review is ",y_train[0])
Review is [1, 194, 1153, 194, 8255, 78, 228, 5, 6, 1463, 4369, 5012, 134, 26, 4, 715, 8, 118, 1634, 14,
394, 20, 13, 119, 954, 189, 102, 5, 207, 110, 3103, 21, 14, 69, 188, 8, 30, 23, 7, 4, 249, 126, 93, 4, 114,
9, 2300, 1523, 5, 647, 4, 116, 9, 35, 8163, 4, 229, 9, 340, 1322, 4, 118, 9, 4, 130, 4901, 19, 4, 1002, 5,
89, 29, 952, 46, 37, 4, 455, 9, 45, 43, 38, 1543, 1905, 398, 4, 1649, 26, 6853, 5, 163, 11, 3215, 2, 4,
1153, 9, 194, 775, 7, 8255, 2, 349, 2637, 148, 605, 2, 8003, 15, 123, 125, 68, 2, 6853, 15, 349, 165,
4362, 98, 5, 4, 228, 9, 43, 2, 1157, 15, 299, 120, 5, 120, 174, 11, 220, 175, 136, 50, 9, 4373, 228, 8255,
5, 2, 656, 245, 2350, 5, 4, 9837, 131, 152, 491, 18, 2, 32, 7464, 1212, 14, 9, 6, 371, 78, 22, 625, 64,
1382, 9, 8, 168, 145, 23, 4, 1690, 15, 16, 4, 1355, 5, 28, 6, 52, 154, 462, 33, 89, 78, 285, 16, 145, 95]
Review is 0
vocab=imdb.get_word_index() # Retrieve the word index file mapping words to indices
print(vocab)
{'fawn': 34701, 'tsukino': 52006, 'nunnery': 52007, 'sonja': 16816, 'vani': 63951, 'woods': 1408, 'spiders': 16115, ...}
y_train
array([1, 0, 0, ..., 0, 1, 0])
y_test
array([0, 1, 1, ..., 0, 0, 0])
# Function to perform the relevant sequence padding / vectorization on the data
# Now it is time to prepare our data. We will vectorize every review and fill it with zeros so that it
# contains exactly 10,000 numbers.
# That means every review that uses fewer than 10,000 distinct word indices is filled with zeros.
# We do this because every input for our neural network needs to have the same size.
# We also transform the targets into floats.
# Binary vectorization code: see the sketch just before the model summary below.
test_x = data[:10000]
test_y = label[:10000]
train_x = data[10000:]
train_y = label[10000:]
test_x.shape
(10000,)
test_y.shape
(10000,)
train_x.shape
(40000,)
train_y.shape
(40000,)
print("Categories:", np.unique(label))
print("Number of unique words:", len(np.unique(np.hstack(data))))
# The hstack() function is used to stack arrays in sequence horizontally (column wise).
Categories: [0 1]
Number of unique words: 9998
length = [len(i) for i in data]
print("Average Review length:", np.mean(length))
print("Standard Deviation:", round(np.std(length)))
# The whole dataset contains 9998 unique words and the average review length is 234 words, with a
# standard deviation of 173 words.
Average Review length: 234.75892
Standard Deviation: 173
# If you look at the data you will realize it has been already pre-processed.
# All words have been mapped to integers and the integers represent the words sorted by their frequency.
# This is very common in text analysis to represent a dataset like this.
# So 4 represents the 4th most used word,
# 5 the 5th most used word and so on...
# The integer 1 is reserved for the start marker,
# the integer 2 for an unknown word and 0 for padding.
print("Label:", label[0])
Label: 1
print("Label:", label[1])
Label: 0
print(data[0])
# Retrieves a dict mapping words to their index in the IMDB dataset.
index = imdb.get_word_index() # word to index
# Create an inverted index, i.e. a dictionary mapping each integer id back to the word it represents
reverse_index = dict([(value, key) for (key, value) in index.items()]) # id to word
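The listing jumps from the word-index mapping to the model summary below; neither the decoding of a review, the binary vectorization described in the comments above, nor the network that produces the 505,201 parameters is shown. A minimal sketch that reproduces that parameter count, assuming three dense hidden layers of 50 units each (the layer sizes and dropout rates are assumptions):
# Decode the first review back to text (indices are offset by 3 for the reserved tokens)
decoded = " ".join(reverse_index.get(i - 3, "#") for i in data[0])
print(decoded)

from keras.models import Sequential
from keras.layers import Dense, Dropout

# Multi-hot encode each review into a 10,000-dimensional 0/1 vector
def vectorize(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1
    return results

train_x = vectorize(train_x)
test_x = vectorize(test_x)
train_y = np.array(train_y).astype("float32")
test_y = np.array(test_y).astype("float32")

model = Sequential()
model.add(Dense(50, activation="relu", input_shape=(10000,))) # 10000*50 + 50 = 500,050 parameters
model.add(Dropout(0.3))
model.add(Dense(50, activation="relu")) # 50*50 + 50 = 2,550 parameters
model.add(Dropout(0.2))
model.add(Dense(50, activation="relu")) # 2,550 parameters
model.add(Dense(1, activation="sigmoid")) # 50 + 1 = 51 parameters
model.summary()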
=================================================================
Total params: 505,201
Trainable params: 505,201
Non-trainable params: 0
# For early stopping
# Stop training when a monitored metric has stopped improving.
# monitor: Quantity to be monitored.
# patience: Number of epochs with no improvement after which training will be stopped.
import tensorflow as tf
callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)
# We use the "adam" optimizer, an algorithm that changes the weights and biases during training.
# We also choose binary crossentropy as the loss (because we deal with binary classification)
# and accuracy as our evaluation metric.
model.compile(
optimizer = "adam",
loss = "binary_crossentropy",
metrics = ["accuracy"]
)
results = model.fit(
    train_x, train_y,
    epochs=2,
    batch_size=500,
    validation_data=(test_x, test_y),
    callbacks=[callback]
)
# Let's check mean accuracy of our model
print(np.mean(results.history["val_accuracy"]))
# Evaluate the model
score = model.evaluate(test_x, test_y, batch_size=500)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
20/20 [==============================] - 1s 24ms/step - loss: 0.2511 - accuracy:
0.8986
Test loss: 0.25108325481414795
Test accuracy: 0.8985999822616577
#Let's plot training history of our model.
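A minimal plotting sketch using matplotlib (the Plotly style used in Assignment 1 would work equally well):
import matplotlib.pyplot as plt
plt.plot(results.history['accuracy'], label='Train')
plt.plot(results.history['val_accuracy'], label='Valid')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()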
Conclusion- In this way we can Classify the Movie Reviews by using DNN.
Assignment Question
Assignment No: 2B
Title of the Assignment: Multiclass classification using Deep Neural Networks: Example: Use the OCR
letter recognition dataset https://archive.ics.uci.edu/ml/datasets/letter+recognition
Objective of the Assignment: Students should be able to solve a multiclass classification problem using
Deep Neural Networks.
Prerequisite:
In multiclass classification, each input sample is associated with a single class label, and the goal of the
model is to learn a function that can accurately predict the correct class label for new, unseen input data.
Multiclass classification can be approached using a variety of machine learning algorithms, including
decision trees, support vector machines, and deep neural networks.
Some examples of multiclass classification problems include image classification, where the goal is to
classify images into one of several categories (e.g., animals, vehicles, buildings), and text classification,
where the goal is to classify text documents into one of several categories (e.g., news topics, sentiment
analysis).
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt
from sklearn import metrics
# Load the OCR dataset
# X_train and X_test are our arrays of images, while y_train and y_test are our arrays of labels for each image.
# The first tuple contains the training set features (X_train) and the training set labels (y_train).
# The second tuple contains the testing set features (X_test) and the testing set labels (y_test).
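The load call itself does not appear in the listing; using the Keras loader imported above:
(x_train, y_train), (x_test, y_test) = mnist.load_data()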
# For example, if the image shows a handwritten 7, then the label will be the integer 7.
# Image data is just an array of digits. You can almost make out a 5 from the pattern of the digits in the array.
# Each image is an array of 28x28 values.
# A grayscale pixel is stored as a digit between 0 and 255, where 0 is black, 255 is white and values in
# between are different shades of gray.
# Therefore, each value in the [28][28] array tells the computer which color to put in that position.
# We reformat our X_train array and our X_test array because they do not have the correct shape.
# Reshape the data to fit the model
print("X_train shape", x_train.shape)
print("y_train shape", y_train.shape)
print("X_test shape", x_test.shape)
print("y_test shape", y_test.shape)
# Here you can see that the training sets have 60,000 elements and the testing sets have 10,000 elements.
# y_train and y_test only have 1-dimensional shapes because they are just the labels of each element.
# x_train and x_test have 3-dimensional shapes because they have a width and height (28x28 pixels) for each element.
# (60000, 28, 28): the 1st value in the tuple shows how many images we have; the 2nd and 3rd values are
# the pixel dimensions from x to y (28x28). The pixel value varies between 0 and 255.
# (60000,): Training labels with integers from 0-9 with dtype of uint8. It has the shape (60000,).
# (10000, 28, 28): Testing data that consists of grayscale images. It has the shape (10000, 28, 28) and
# dtype uint8. The pixel value varies between 0 and 255.
# (10000,): Testing labels that consist of integers from 0-9 with dtype uint8. It has the shape (10000,).
X_train shape (60000, 28, 28)
y_train shape (60000,)
X_test shape (10000, 28, 28)
y_test shape (10000,)
# X: Training data of shape (n_samples, n_features)
# y: Training label values of shape (n_samples, n_labels)
# A 2D array of height and width, 28 pixels by 28 pixels, will just become 784 pixels (28 squared).
# Remember that X_train has 60,000 elements, each with 784 total pixels, so it will become shape (60000, 784).
# Whereas X_test has 10,000 elements, each with 784 total pixels, so it will become shape (10000, 784).
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
# Use 32-bit precision when training a neural network, so at one point the training data will have to be
# converted to 32-bit floats. Since the dataset fits easily in RAM, we might as well convert to float immediately.
x_test = x_test.astype('float32')
x_train /= 255 # Each image has Intensity from 0 to 255
x_test /= 255
# Regarding the division by 255: this is the maximum value of a byte (the input feature's type before the
# conversion to float32), so this will ensure that the input features are scaled between 0.0 and 1.0.
# Convert class vectors to binary class matrices (one-hot encoding)
num_classes = 10
y_train = np.eye(num_classes)[y_train]
# np.eye returns a 2-D array with ones on the diagonal and zeros elsewhere; indexing it with the
# integer labels marks the matching category as 1 and all other positions as 0 in each row
y_test = np.eye(num_classes)[y_test]
# Define the model architecture
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
# The input consists of 784 neurons (one per pixel), feeding 512 neurons in the first hidden layer
model.add(Dropout(0.2)) # dropout ratio 20%
model.add(Dense(512, activation='relu')) # returns a sequence of vectors of dimension 512
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax')) # 10 neurons, i.e. output nodes in the output layer
# Compile the model
model.compile(loss='categorical_crossentropy', # for a multi-class classification problem
optimizer=RMSprop(),
metrics=['accuracy'])
# Train the model
batch_size = 128 # the batch_size argument passed to fit() defines how many samples are processed per gradient update
epochs = 20
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1, # verbose=1 will show an animated progress bar, e.g. [==========]
                    validation_data=(x_test, y_test))
# Using validation_data means you are providing the training set and validation set yourself
# 60000 images / 128 = 469 batches per epoch
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
Test loss: 0.08541901409626007
Test accuracy: 0.9851999878883362
Assignment Question
2. What is Dropout?
3. What is RMSprop?
Assignment No: 3B
Title of the Assignment: Use MNIST Fashion Dataset and create a classifier to classify fashion clothing
into categories.
Objective of the Assignment: Students should be able to classify fashion clothing images into categories
using the MNIST Fashion dataset.
Prerequisite:
1. Basics of a programming language
2. Concept of Classification
3. Concept of Deep Neural Network
---------------------------------------------------------------------------------------------------------------
Contents for Theory:
1. What is Classification
2. Example of Classification
3. What is CNN?
4. How Deep Neural Networks Work on Classification
5. Code Explanation with Output
---------------------------------------------------------------------------------------------------------------
What is Classification?
Classification is a type of supervised learning in machine learning that involves categorizing data into
predefined classes or categories based on a set of features or characteristics. It is used to predict the class
of new, unseen data based on the patterns learned from the labeled training data.
In classification, a model is trained on a labeled dataset, where each data point has a known class label.
The model learns to associate the input features with the corresponding class labels and can then be used
to predict the class of new, unseen data.
For example, we can use classification to identify whether an email is spam or not based on its content
and metadata, to predict whether a patient has a disease based on their medical records and symptoms, or
to classify images into categories such as animals or vehicles.
Classification algorithms can vary in complexity, ranging from simple models such as decision trees and
k-nearest neighbors to more complex models such as support vector machines and neural networks. The
choice of algorithm depends on the nature of the data, the size of the dataset, and the desired level of
accuracy and interpretability.
Example- Classification is a common task in deep neural networks, where the goal is to predict the class
of an input based on its features. Here's an example of how classification can be performed in a deep
neural network on the MNIST dataset.
The MNIST dataset contains 60,000 training images and 10,000 testing images of handwritten digits
from 0 to 9. Each image is a grayscale 28x28 pixel image, and the task is to classify each image into one
of the 10 digit classes.
We can use a convolutional neural network (CNN) to classify the MNIST dataset. A CNN is a type of
deep neural network that is commonly used for image classification tasks.
What is CNN?
Convolutional Neural Networks (CNNs) are commonly used for image classification tasks, and they are
designed to automatically learn and extract features from input images. Let's consider an example of
using a CNN to classify images of handwritten digits.
In a typical CNN architecture for image classification, there are several layers, including convolutional
layers, pooling layers, and fully connected layers. Here's a diagram of a simple CNN architecture for the
digit classification task:
The input to the network is an image of size 28x28 pixels, and the output is a probability distribution
over the 10 possible digits (0 to 9).
The convolutional layers in the CNN apply filters to the input image, looking for specific patterns and
features. Each filter produces a feature map that highlights areas of the image that match the filter. The
filters are learned during training, so the network can automatically learn which features are most
relevant for the classification task.
The pooling layers in the CNN down sample the feature maps, reducing the spatial dimensions of the
data. This helps to reduce the number of parameters in the network, while also making the features more
robust to small variations in the input image.
The fully connected layers in the CNN take the flattened output from the last pooling layer and perform
a classification task by outputting a probability distribution over the 10 possible digits.
During training, the network learns the optimal values of the filters and parameters by minimizing a loss
function. This is typically done using stochastic gradient descent or a similar optimization algorithm.
Once trained, the network can be used to classify new images by passing them through the network and
computing the output probability distribution.
Overall, CNNs are powerful tools for image recognition tasks and have been used successfully in many
applications, including object detection, face recognition, and medical image analysis.
CNNs have a wide range of applications in various fields, some of which are:
Image classification: CNNs are commonly used for image classification tasks, such as identifying
objects in images and recognizing faces.
Object detection: CNNs can be used for object detection in images and videos, which involves
identifying the location of objects in an image and drawing bounding boxes around them.
Semantic segmentation: CNNs can be used for semantic segmentation, which involves partitioning an
image into segments and assigning each segment a semantic label (e.g., "road", "sky", "building").
Natural language processing: CNNs can be used for natural language processing tasks, such as
sentiment analysis and text classification.
Medical imaging: CNNs are used in medical imaging for tasks such as diagnosing diseases from X-rays
and identifying tumors from MRI scans.
Autonomous vehicles: CNNs are used in autonomous vehicles for tasks such as object detection and
lane detection.
Video analysis: CNNs can be used for tasks such as video classification, action recognition, and video
captioning.
Overall, CNNs are a powerful tool for a wide range of applications, and they have been used
successfully in many areas of research and industry.
How Deep Neural Network Work on Classification using CNN-
Deep neural networks using CNNs work on classification tasks by learning to automatically extract
features from input images and using those features to make predictions. Here's how it works:
Input layer: The input layer of the network takes in the image data as input.
Convolutional layers: The convolutional layers apply filters to the input images to extract relevant
features. Each filter produces a feature map that highlights areas of the image that match the filter.
Activation functions: An activation function is applied to the output of each convolutional layer to
introduce non-linearity into the network.
Pooling layers: The pooling layers down sample the feature maps to reduce the spatial dimensions of the
data.
Dropout layer: Dropout is used to prevent overfitting by randomly dropping out a percentage of the
neurons in the network during training.
Fully connected layers: The fully connected layers take the flattened output from the last pooling layer
and perform a classification task by outputting a probability distribution over the possible classes.
Softmax activation function: The softmax activation function is applied to the output of the last fully
connected layer to produce a probability distribution over the possible classes.
Loss function: A loss function is used to compute the difference between the predicted probabilities and
the actual labels.
Optimization: An optimization algorithm, such as stochastic gradient descent, is used to minimize the
loss function by adjusting the values of the network parameters.
Training: The network is trained on a large dataset of labeled images, adjusting the values of the
parameters to minimize the loss function.
Prediction: Once trained, the network can be used to classify new images by passing them through the
network and computing the output probability distribution.
MNIST Dataset-
The MNIST Fashion dataset is a collection of 70,000 grayscale images of 28x28 pixels, representing 10
different categories of clothing and accessories. The categories include T-shirts/tops, trousers, pullovers,
dresses, coats, sandals, shirts, sneakers, bags, and ankle boots.
The dataset is often used as a benchmark for testing image classification algorithms, and it is considered
a more challenging version of the original MNIST dataset which contains handwritten digits. The
MNIST Fashion dataset was released by Zalando Research in 2017 and has since become a popular
dataset in the machine learning community.
The MNIST Fashion dataset is a collection of 70,000 grayscale images of 28x28 pixels each. These
images represent 10 different categories of clothing and accessories, with each category containing 7,000
images. The categories are as follows:
T-shirt/tops
Trousers
Pullovers
Dresses
Coats
Sandals
Shirts
Sneakers
Bags
Ankle boots
The images were obtained from Zalando's online store and are preprocessed to be normalized and
centered. The training set contains 60,000 images, while the test set contains 10,000 images. The goal of
the dataset is to accurately classify the images into their respective categories.
The MNIST Fashion dataset is often used as a benchmark for testing image classification algorithms,
and it is considered a more challenging version of the original MNIST dataset which contains
handwritten digits. The dataset is widely used in the machine learning community for research and
educational purposes.
Here are the general steps to perform Convolutional Neural Network (CNN) on the MNIST Fashion
dataset:
● Import the necessary libraries, including TensorFlow, Keras, NumPy, and Matplotlib.
● Load the dataset using Keras' built-in function, keras.datasets.fashion_mnist.load_data(). This
will provide the training and testing sets, which will be used to train and evaluate the CNN.
● Preprocess the data by normalizing the pixel values between 0 and 1, and reshaping the images to
be of size (28, 28, 1) for compatibility with the CNN.
● Define the CNN architecture, including the number and size of filters, activation functions, and
pooling layers. This can vary based on the specific problem being addressed.
● Compile the model by specifying the loss function, optimizer, and evaluation metrics. Common
choices include categorical cross-entropy, Adam optimizer, and accuracy metric.
● Train the CNN on the training set using the fit() function, specifying the number of epochs and
batch size.
● Evaluate the performance of the model on the testing set using the evaluate() function. This will
provide metrics such as accuracy and loss on the test set.
● Use the trained model to make predictions on new images, if desired, using the predict()
function.
Source Code with Output-
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow import keras
import numpy as np
# There are 10 image classes in this dataset and each class has a mapping corresponding to the following
labels:
#0 T-shirt/top
#1 Trouser
#2 pullover
#3 Dress
#4 Coat
#5 sandals
#6 shirt
#7 sneaker
#8 bag
#9 ankle boot
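The dataset loading and scaling steps do not appear in the listing; a minimal sketch using the Keras Fashion-MNIST loader:
fashion_mnist = keras.datasets.fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
# Scale pixel intensities from 0-255 to 0-1
x_train = x_train / 255.0
x_test = x_test / 255.0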
plt.imshow(x_train[1])
plt.imshow(x_train[0])
# Next, we will preprocess the data by scaling the pixel values to be between 0 and 1, and then reshaping
# the images to be 28x28x1 pixels.
# 28, 28 comes from width and height; 1 comes from the number of channels
# -1 means that the length in that dimension is inferred.
# This is done based on the constraint that the number of elements in an ndarray or Tensor when reshaped
# must remain the same.
# Each image is a row vector (784 elements) and there are lots of such rows (let it be n, so there are 784n
# elements). So TensorFlow can infer that -1 is n.
# Converting the training_images array to a 4-dimensional array with sizes 60000, 28, 28, 1 for the 0th to 3rd dimension.
x_train.shape
(60000, 28, 28)
x_test.shape
(10000, 28, 28, 1)
y_train.shape
(60000,)
y_test.shape
(10000,)
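The 4-D reshape described in the comments above does not appear in the listing; before training, both arrays need an explicit channel dimension (a minimal sketch):
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)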
# We will use a convolutional neural network (CNN) to classify the fashion items.
# The CNN will consist of multiple convolutional layers followed by max pooling,
# dropout, and dense layers. Here is the code for the model:
model = keras.Sequential([
keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
# 32 filters (default), randomly initialized
# 3*3 is Size of Filter
# 28,28,1 size of Input Image
# No zero-padding: every output 2 pixels less in every dimension
# The 320 parameters shown are (3x3 filter weights x 1 input channel x 32 filters) + 32 biases
# 32*3*3 = 288 weights + 32 biases = 320
keras.layers.MaxPooling2D((2,2)),
# Output is a 13 x 13 feature map with 32 channels (the 26x26 conv output halved by 2x2 pooling)
keras.layers.Dropout(0.25),
# Reduce Overfitting of Training sample drop out 25% Neuron
keras.layers.Conv2D(64, (3,3), activation='relu'),
# Deeper layers use 64 filters
# 3*3 is Size of Filter
# Observe how the 13x13x32 feature map from the previous block is transformed to an 11x11x64 feature map
# 13 (size) - 3 (filter size) + 1 = 11 for width and height, with 64 channels (filters)
# The 18,496 parameters shown are (3x3 filter weights x 32 input channels x 64 filters) + 64 biases
# 3*3*32*64 = 18,432 weights + 64 biases = 18,496
keras.layers.MaxPooling2D((2,2)),
# Output is a 5 x 5 feature map with 64 channels (the 11x11 conv output halved by 2x2 pooling)
keras.layers.Dropout(0.25),
keras.layers.Conv2D(128, (3,3), activation='relu'),
# Deeper layers use 128 filters
# 3*3 is Size of Filter
# Observe how the 5x5x64 feature map is transformed to a 3x3x128 feature map
# 5 (size) - 3 (filter size) + 1 = 3 for width and height, with 128 channels (filters)
# The 73,856 parameters shown are (3x3 filter weights x 64 input channels x 128 filters) + 128 biases
# 3*3*64*128 = 73,728 weights + 128 biases = 73,856
# To classify the images, we still need a Dense and Softmax layer.
# We need to flatten the 3x3x128 feature map to a vector of size 1152
keras.layers.Flatten(),
keras.layers.Dense(128, activation='relu'),
# 128 nodes in this dense layer
# The 147,584 parameters shown are 1152*128 weights + 128 biases
keras.layers.Dropout(0.25),
keras.layers.Dense(10, activation='softmax')
# 10 nodes (one per class) in the output dense layer
# The 1,290 parameters shown are 128*10 weights + 10 biases
])
model.summary()
Model: "sequential"
=================================================================
Total params: 241,546
Trainable params: 241,546
Non-trainable params: 0
# Compile and Train the Model
# After defining the model, we will compile it and train it on the training data.
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
# 1875 is the number of batches. By default a batch contains 32 samples: 60000 / 32 = 1875
# Finally, we will evaluate the performance of the model on the test data.
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)
313/313 [==============================] - 3s 10ms/step - loss: 0.2606 - accuracy: 0.9031
Test accuracy: 0.9031000137329102
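As a final step, the trained model can be used for prediction, as described in the procedure above. A minimal sketch (the class-name list follows the label mapping given earlier):
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
probs = model.predict(x_test[:1]) # probability distribution over the 10 classes
print('Predicted class:', class_names[np.argmax(probs[0])])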
Conclusion- In this way we can Classify fashion clothing into categories using CNN.
Assignment Question
Objective: To build a deep neural network model that can recognize the identities of celebrities in the
"CelebA" dataset.
Theory - Human face recognition using deep neural networks involves building a neural network model that
can take an image of a human face as input and accurately recognize the person in the image. Here is a sample
code using TensorFlow to implement a basic face recognition system using a deep neural network:
Human face recognition using deep neural networks (DNNs) involves training a neural network to identify and
distinguish between different faces. The process typically involves the following steps:
Data collection: A large dataset of face images is collected, including images of different individuals under
varying poses, expressions, and lighting conditions.
Data preprocessing: The face images are preprocessed to remove noise, align the faces, and normalize the
illumination.
Feature extraction: The preprocessed face images are then fed into a deep neural network to extract
high-level features that capture the important characteristics of a face. The neural network typically consists of
several layers of convolutional and pooling operations, followed by fully connected layers that produce a
feature vector.
Training: The extracted features are then used to train the neural network to distinguish between different
faces. This is typically done using a supervised learning approach, where the network is trained on a labeled
dataset of face images.
Testing: After the neural network has been trained, it can be tested on a separate dataset to evaluate its
performance. This typically involves measuring the accuracy of the network in correctly identifying the
individuals in the test images.
Deployment: Once the neural network has been trained and tested, it can be deployed in a real-world
application for face recognition. This typically involves capturing a face image, preprocessing it, and then
feeding it into the neural network to obtain a feature vector. The feature vector is then compared to a database
of known faces to find the closest match.
Overall, human face recognition using DNNs is a complex process that requires a large amount of data,
sophisticated neural network architectures, and careful preprocessing and training. However, with the
increasing availability of large datasets and powerful computing resources, DNN-based face recognition
systems have become increasingly accurate and practical.
Example- One example of human face recognition using DNNs is the FaceNet algorithm, which was
developed by researchers at Google in 2015. FaceNet is a deep neural network that is trained to directly
optimize the embedding of face images into a high-dimensional feature space, where distances between faces
directly correspond to a measure of face similarity.
The FaceNet architecture consists of a deep convolutional neural network that takes a raw face image as input
and produces a 128-dimensional feature vector as output. The network is trained on a large dataset of face images.
During training, the FaceNet network is optimized to minimize the distance between the feature vectors of
images that depict the same person and maximize the distance between feature vectors of images that depict
different people. This is done using a loss function called the triplet loss, which compares the distance between
the feature vectors of an anchor image, a positive image (of the same person as the anchor), and a negative
image (of a different person). The goal is to minimize the distance between the anchor and positive images
while maximizing the distance between the anchor and negative images.
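A minimal numeric sketch of the triplet loss described above (the margin value and the embedding vectors are illustrative assumptions, not values taken from the FaceNet paper):
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Squared Euclidean distances in the embedding space
    pos_dist = np.sum((anchor - positive) ** 2)
    neg_dist = np.sum((anchor - negative) ** 2)
    # The loss is zero once the negative is farther from the anchor than the positive by at least the margin
    return max(pos_dist - neg_dist + margin, 0.0)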
After training, the FaceNet network can be used for face recognition by capturing a face image, preprocessing
it, and then feeding it into the network to obtain a 128-dimensional feature vector. This feature vector is then
compared to a database of known faces by computing the Euclidean distance between the feature vector and
the feature vectors of the faces in the database. The closest matching face in the database is then returned as the
recognized identity.
Overall, FaceNet is an example of a DNN-based face recognition system that has achieved high accuracy and
robustness in real-world applications. It has been used in a variety of applications, including security systems
and photo-tagging tools.
import tensorflow as tf
import numpy as np
import pandas as pd
import cv2
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
# Define constants
img_height = 128
img_width = 128
batch_size = 32
epochs = 10
num_classes = 10
df = pd.read_csv('list_attr_celeba.csv')
df = df.sample(frac=1).reset_index(drop=True) # shuffle the dataset
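The generators below read from df_train, df_val and df_test, which are not defined in the listing. A minimal sketch of the split, assuming an 80/10/10 partition and that the 'Smiling' attribute is stored as -1/1 in the CSV (both are assumptions):
df['Smiling'] = (df['Smiling'] == 1).astype(str) # flow_from_dataframe with class_mode='binary' expects string labels
n = len(df)
df_train = df[:int(0.8 * n)]
df_val = df[int(0.8 * n):int(0.9 * n)]
df_test = df[int(0.9 * n):]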
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_dataframe(
dataframe=df_train,
directory='img_align_celeba',
x_col='image_id',
y_col='Smiling',
target_size=(img_height, img_width),
batch_size=batch_size,
class_mode='binary')
val_generator = val_datagen.flow_from_dataframe(
dataframe=df_val,
directory='img_align_celeba',
x_col='image_id',
y_col='Smiling',
target_size=(img_height, img_width),
batch_size=batch_size,
class_mode='binary')
test_generator = test_datagen.flow_from_dataframe(
dataframe=df_test,
directory='img_align_celeba',
x_col='image_id',
y_col='Smiling',
target_size=(img_height, img_width),
batch_size=batch_size,
class_mode='binary')
# The Conv2D layers are missing from the manual's listing; the filter counts below are assumptions
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2,2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2,2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2,2)),
    Flatten(),
    Dropout(0.5),
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid') # the generators use class_mode='binary', so a single sigmoid output is used here
])
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
history = model.fit(train_generator,
epochs=epochs,
validation_data=val_generator)
img = cv2.imread('sample_image.jpg')
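The listing ends after reading a sample image; a minimal sketch of running the trained model on it (the file name and preprocessing steps are assumptions):
img = cv2.resize(img, (img_width, img_height))
img = img.astype('float32') / 255.0
img = np.expand_dims(img, axis=0) # add the batch dimension expected by the model
pred = model.predict(img)
print('Predicted probability of the target attribute:', pred[0][0])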
Assignment Question