CS601 Machine Learning Unit 3

Uploaded by

okchaitanya568
Chameli Devi Group of Institutions, Indore

Department of Computer Science and Engineering


CS601-Machine Learning
Unit-3
Unit-3 Syllabus
Convolutional neural network, flattening, subsampling, padding, stride,
convolution layer, pooling layer, loss layer, dense layer, 1x1 convolution,
inception network, input channels, transfer learning, one-shot learning,
dimension reduction, implementation of CNN with TensorFlow, Keras, etc.

Course Outcome:
Students will be able to design CNN algorithms to solve related real-life
problems.
Introduction to Convolutional Neural Network
Convolutional neural networks, also known as convnets or CNNs, are a special
kind of neural network for processing data that has a known grid-like topology,
such as time series data (1D) or images (2D).
Layers in a convolutional neural network (CNN):
1. Convolution Layer
2. Pooling Layer
3. Fully Connected Layer

Inspiration:
• Visual Cortex of our brain.

Introduced by
• Yann LeCun in 1998 at AT&T Labs (for bank cheque scanning).
• Later, Microsoft developed many OCR and handwritten character recognition tools.
• Nowadays CNNs are used in facial recognition and self-driving cars.
Why not use ANN?
1. High computation cost
2. Overfitting
3. Loss of important information, such as the spatial arrangement of pixels
CNN Intuition
Convolution Operation
Basics of Images

• Grayscale image: single channel
• RGB colour image: 3 channels


Edge Detection Operation
Edge detection = Change in intensity

[Figure: a 6x6 low-resolution B/W image is convolved with a 3x3 filter/kernel (a horizontal edge detector) to produce the feature map.]
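
The edge detection step can be sketched in NumPy. This is a minimal illustration, not the slide's exact example: the 6x6 image (bright top half, dark bottom half) and the filter values are assumptions, and the operation is the sliding multiply-and-sum that CNN libraries call convolution.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    n_rows = image.shape[0] - kernel.shape[0] + 1
    n_cols = image.shape[1] - kernel.shape[1] + 1
    out = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            patch = image[i:i + kernel.shape[0], j:j + kernel.shape[1]]
            out[i, j] = np.sum(patch * kernel)
    return out

# Assumed 6x6 image: bright top half (255), dark bottom half (0).
image = np.zeros((6, 6))
image[:3, :] = 255

# Assumed horizontal edge detector: +1 on the top row, -1 on the bottom row.
horizontal_edge = np.array([[1, 1, 1],
                            [0, 0, 0],
                            [-1, -1, -1]])

feature_map = convolve2d_valid(image, horizontal_edge)
print(feature_map.shape)  # (4, 4)
print(feature_map)        # non-zero only in the rows that straddle the edge
```

The feature map is zero where the intensity is constant and large where the filter covers the change from bright to dark.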
Working with colour images
Padding
• The output of a convolution is smaller than its input. To keep the output the
same size as the input, we use padding.
• Padding is the process of adding zeros symmetrically around the border of the
input matrix. In the following example, the extra grey blocks denote the padding.
Formula After Padding
For an n x n input, an f x f filter and padding p (stride 1), the output size is (n + 2p - f + 1) x (n + 2p - f + 1).

Padding code: https://colab.research.google.com/drive/1jdFjETiCG0_csGo363m-dF88nXMQ9bki?usp=sharing
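
As a small check of the standard stride-1 output size, n + 2p - f + 1, the sketch below pads a 6x6 matrix with zeros using `np.pad`. The helper name `output_size_with_padding` and the sample matrix are illustrative assumptions, not taken from the linked notebook.

```python
import numpy as np

def output_size_with_padding(n, f, p):
    """Output width for an n x n input, f x f filter, padding p, stride 1."""
    return n + 2 * p - f + 1

# Assumed 6x6 input; one ring of zeros is added on every side.
image = np.arange(36).reshape(6, 6)
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

print(padded.shape)                       # (8, 8)
print(output_size_with_padding(6, 3, 0))  # 4: output shrinks without padding
print(output_size_with_padding(6, 3, 1))  # 6: output matches the input size
```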


Stride
• Stride denotes how many pixels the filter moves at each step of the convolution. By default, it is one.

Stride code: https://colab.research.google.com/drive/1EdS-sUM0uo64plB_ZHOg77E_-hefg1t9?usp=sharing

Why are strides required?
1. You want high-level features.
2. To reduce the computation required (a larger stride value produces a smaller output).
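
The effect of stride on output size can be sketched with the usual formula floor((n + 2p - f) / s) + 1. The function name and the 7x7 example are assumptions chosen for illustration.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output width for an n x n input, f x f filter, padding p and stride s."""
    return (n + 2 * p - f) // s + 1

# Assumed 7x7 input with a 3x3 filter and no padding.
for stride in (1, 2, 3):
    print(stride, conv_output_size(n=7, f=3, p=0, s=stride))
# stride 1 -> 5, stride 2 -> 3, stride 3 -> 2
```

A larger stride gives a smaller feature map, so later layers have less data to process.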
ReLU Layer (Combined with Convolution Layer)
ReLU stands for rectified linear unit. Once the feature maps are
extracted, the next step is to pass them to a ReLU layer.
ReLU performs an element-wise operation and sets all negative values to
0. It introduces non-linearity to the network, and the generated output is a
rectified feature map.

The original image is scanned with multiple convolution and ReLU layers to
locate the features.
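
A minimal sketch of the element-wise ReLU operation, using made-up feature map values:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: negative values become 0, the rest pass through."""
    return np.maximum(x, 0)

# Assumed feature map values, for illustration only.
feature_map = np.array([[-765.0, 0.0, 765.0],
                        [120.0, -30.0, 5.0]])

rectified = relu(feature_map)
print(rectified)
```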
Pooling in CNN
• In Convolutional Neural Networks (CNNs), pooling layers downsample feature maps, reducing
spatial dimensions while retaining crucial information, which helps to reduce computational
complexity and prevent overfitting.
Two types of pooling are commonly used:
1. Max pooling: This works by selecting the maximum value from every pool. Max
pooling retains the most prominent features of the feature map, and the returned
image is sharper than the original image.
2. Average pooling: This works by taking the average of every pool.
Average pooling retains the average values of the features of the feature map. It
smooths the image while keeping the essence of the features in the image.
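
Both pooling types can be sketched in NumPy. The helper `pool2d`, the 2x2 window and the sample values below are assumptions for illustration; the windows do not overlap (stride equals window size).

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Pool non-overlapping size x size windows with max or average."""
    rows = feature_map.shape[0] // size
    cols = feature_map.shape[1] // size
    # Split the map into a grid of (rows x cols) windows of size x size.
    windows = feature_map[:rows * size, :cols * size].reshape(rows, size, cols, size)
    if mode == "max":
        return windows.max(axis=(1, 3))
    return windows.mean(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 4],
                        [5, 7, 6, 8],
                        [9, 2, 1, 0],
                        [3, 4, 5, 6]], dtype=float)

max_pooled = pool2d(feature_map, mode="max")      # [[7, 8], [9, 6]]
avg_pooled = pool2d(feature_map, mode="average")  # [[4, 5], [4.5, 3]]
print(max_pooled)
print(avg_pooled)
```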
Flattening
The next step in the process is flattening, where all the resulting 2D
arrays from the pooled feature maps are converted into a single,
continuous linear vector.
This flattened vector is then passed as input to the fully connected
layer for image classification.
The flatten layer
The flatten layer is a component of convolutional neural networks (CNNs). A
complete convolutional neural network can be broken down into two parts:
• CNN: The convolutional neural network that comprises the convolutional layers.

• ANN: The artificial neural network that comprises dense layers.

The flatten layer lies between the CNN and the ANN, and its job is to convert the
output of the CNN into an input that the ANN can process, as we can see in the
diagram below.
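
Flattening amounts to a reshape. The sketch below assumes pooled feature maps of size 2x2 with 3 channels and a batch of 5 images; these shapes are illustrative only.

```python
import numpy as np

# Assumed pooled output for one image: height 2, width 2, 3 channels.
pooled = np.arange(2 * 2 * 3).reshape(2, 2, 3)
flat = pooled.reshape(-1)          # single vector of 12 values
print(flat.shape)                  # (12,)

# For a batch, keep the first (sample) axis and flatten the rest.
batch = np.zeros((5, 2, 2, 3))
batch_flat = batch.reshape(batch.shape[0], -1)
print(batch_flat.shape)            # (5, 12)
```

Each row of `batch_flat` is what a dense layer would receive as input for one image.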
Dense layer
• In neural networks, a "dense layer," also known as a "fully connected layer," is a
layer where each neuron receives input from every neuron in the preceding layer,
forming a complete network of connections
Softmax/Logistic Layer
The Softmax or Logistic layer is the final layer in a CNN, positioned at the end of
the fully connected layer. This layer plays a key role in classification.
• Logistic Function: Used for binary classification tasks, where there are only two
possible classes (e.g., cat or not cat). It outputs a probability score between 0 and
1, helping to decide which of the two classes the input belongs to.
• Softmax Function: Used for multi-class classification tasks, where there are
more than two possible classes (e.g., identifying digits 0–9 in MNIST). Softmax
converts the output into a probability distribution across all classes, with the
highest probability indicating the predicted class.

In summary, the choice between Logistic and Softmax depends on the type of
classification required — binary or multi-class.
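
A minimal NumPy sketch of the two functions. The input scores (logits) are made-up values; subtracting the maximum before exponentiating is a common numerical-stability step and does not change the result.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps a score to a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Softmax: maps a vector of scores to a probability distribution."""
    shifted = z - np.max(z)   # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Assumed raw scores for three classes.
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)

print(sigmoid(0.0))        # 0.5
print(probs, probs.sum())  # probabilities that sum to 1
print(np.argmax(probs))    # index of the predicted class
```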
Output Layer
The output layer provides the final prediction as a one-hot encoded label.
In one-hot encoding, each possible class is represented as a unique vector
where only one element is “1” (indicating the chosen class) and all others
are “0.” For example, in a classification task with three classes (say, “cat,”
“dog,” and “bird”), the output for each class would be represented as
follows:
• Cat: [1, 0, 0]
• Dog: [0, 1, 0]
• Bird: [0, 0, 1]
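
In practice the network produces a probability for each class, and the one-hot label is obtained by picking the largest one. A small sketch, with made-up probabilities:

```python
import numpy as np

classes = ["cat", "dog", "bird"]
one_hot = np.eye(len(classes), dtype=int)  # row i is the one-hot vector of class i

# Assumed softmax output for one image.
probs = np.array([0.1, 0.7, 0.2])

predicted_index = int(np.argmax(probs))
predicted = classes[predicted_index]
predicted_one_hot = one_hot[predicted_index]

print(predicted, predicted_one_hot)
```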
Loss Layer
In Convolutional Neural Networks (CNNs), the loss layer, or loss function, quantifies
the difference between the model's predictions and the actual (ground truth) labels,
guiding the training process by measuring how well the model is performing and
enabling adjustments to minimize errors.
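
One common loss for classification is categorical cross-entropy; the slide does not name a specific loss, so this choice is an assumption. The sketch compares a good and a bad prediction for the same one-hot label, using made-up probabilities.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0, 1, 0])           # ground truth: second class
good = np.array([0.1, 0.8, 0.1])       # confident and correct
bad = np.array([0.7, 0.2, 0.1])        # confident and wrong

good_loss = categorical_cross_entropy(y_true, good)
bad_loss = categorical_cross_entropy(y_true, bad)
print(good_loss, bad_loss)             # the wrong prediction has the larger loss
```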
1x1 Convolution
1x1 convolutions were first proposed in Network-in-Network (NiN) and were later used heavily in the
GoogLeNet architecture. Main features of such layers:
• Reduce or increase dimensionality
• Apply nonlinearity again after convolution
• Can be considered as “feature pooling”
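
A 1x1 convolution applies the same small dense layer to the channel vector at every pixel. The sketch below, with assumed shapes (8x8 map, 64 channels reduced to 16) and random weights, shows the dimensionality reduction followed by a ReLU non-linearity.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 64))   # height, width, input channels (assumed)
w = rng.normal(size=(64, 16))     # one weight vector per output channel

# 1x1 convolution: mix the channels at each pixel, then apply ReLU.
y = np.maximum(x @ w, 0)

print(x.shape, "->", y.shape)     # (8, 8, 64) -> (8, 8, 16)
```

The spatial size is unchanged; only the number of channels changes.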
Inception Network
A more powerful deep neural network can be built by increasing the number of
layers in the network. Two problems with this approach are:
1. Increasing the number of layers of a neural network may lead to overfitting,
especially if you have limited labeled training data.
2. The computational requirement increases.

Inception networks were created with the idea of increasing the capability of a
deep neural network while efficiently using computational resources.
Input channels
In convolutional neural networks (CNNs), input channels are the number of feature
maps (or channels) in the input data (e.g., RGB images have 3 input channels), and
output channels are the number of feature maps generated by a convolutional
layer, determined by the number of filters used.

Example:
If you have an input image with 3 channels (RGB) and use a convolutional layer with
32 filters, the output will have 32 channels (feature maps).
These 32 feature maps will be the input to the next layer in the network.
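
The number of input and output channels also determines how many parameters a convolutional layer has. The sketch assumes 3x3 filters and one bias per filter; the kernel size is not given in the example above.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Trainable parameters of a conv layer: weights plus one bias per filter."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    biases = out_channels
    return weights + biases

# RGB input (3 channels), 32 filters of assumed size 3x3.
print(conv_params(3, 32, 3))    # 896

# A following layer sees 32 input channels; 64 filters is an assumed value.
print(conv_params(32, 64, 3))   # 18496
```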
Transfer Learning (Fine Tuning vs Feature Extraction)
Problems with training your own model
1. Models are data hungry: you may need millions of labeled examples.
2. Training the model takes a lot of time.
Solution
• Use models pre-trained on datasets such as ImageNet or MNIST.
• Transfer learning is a machine learning technique in which knowledge gained through one task or
dataset is used to improve model performance on another related task and/or different dataset.
• In other words, transfer learning uses what has been learned in one setting to improve
generalization in another setting.
One Shot Learning
• One-shot learning is a machine learning approach where a model learns to
classify or recognize objects from a single training example (or very few
examples) per class, making it particularly useful in scenarios with limited
data.
How it works:
• One-shot learning often employs techniques like Siamese Networks, which
pass two inputs through two identical networks that share weights and learn a
similarity metric for comparing the inputs.
Related terms:
• Few-shot learning is the case where the model learns from a small number of
examples per class, and zero-shot learning is the case where the model must
recognize classes for which it has seen no examples.
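
The comparison step of a Siamese network can be sketched as follows. This is a toy illustration under strong assumptions: the weight matrix `w` stands in for a trained embedding network, and the support examples and query are made-up vectors. Both inputs go through the same `embed` function, and the query takes the class of the closest stored example.

```python
import numpy as np

# Hypothetical "learned" weights; a real Siamese network would train these.
w = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])

def embed(x):
    """Shared embedding applied to every input (same weights for both branches)."""
    return np.tanh(x @ w)

def distance(a, b):
    """Euclidean distance between two embeddings (smaller means more similar)."""
    return float(np.linalg.norm(a - b))

# One labeled example per class (the "one shot").
support = {"A": np.array([1.0, 0.0, 0.0, 0.0]),
           "B": np.array([0.0, 0.0, 0.0, 1.0])}
query = np.array([0.9, 0.1, 0.0, 0.0])

distances = {label: distance(embed(query), embed(example))
             for label, example in support.items()}
predicted = min(distances, key=distances.get)
print(distances, predicted)
```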
Dimension Reduction
• Dimensionality reduction is a method for representing a given dataset using a
lower number of features (that is, dimensions) while still capturing the original
data's meaningful properties.
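
One widely used dimensionality reduction method is principal component analysis (PCA); the slide does not name a method, so this choice is an assumption. The sketch below implements PCA with NumPy's SVD on random data and reduces 5 features to 2.

```python
import numpy as np

def pca_reduce(data, n_components):
    """Project data onto its top principal components using SVD."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 5))   # assumed dataset: 100 samples, 5 features

reduced = pca_reduce(data, n_components=2)
print(data.shape, "->", reduced.shape)   # (100, 5) -> (100, 2)
```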
Implementation of CNN with TensorFlow, Keras
Handwritten Digit (0-9) Classification

%pip install tensorflow

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense, Conv2D, MaxPooling2D

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train

x_train.shape

x_train[0].shape
x_train[0]

import matplotlib.pyplot as plt


plt.imshow(x_train[0])

y_train

y_train.shape

y_train[0].shape

y_train[0]

x_test

x_test.shape
x_test[0].shape

x_test[0]

import matplotlib.pyplot as plt


plt.imshow(x_test[0])

y_test

y_test.shape

y_test[0].shape

y_test[0]

x_test = x_test/255
x_train = x_train/255
import numpy as np

IMG_SIZE = 28
# Add a channel dimension: (samples, 28, 28) -> (samples, 28, 28, 1)
x_trainr = np.array(x_train).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
x_testr = np.array(x_test).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
print('Training sample dimension:', x_trainr.shape)
print('Test sample dimension:', x_testr.shape)

model = Sequential()
# 1st conv layer
model.add(Conv2D(64, (3,3), activation='relu', input_shape=x_trainr.shape[1:]))
model.add(MaxPooling2D((2,2)))
# 2nd conv layer
model.add(Conv2D(64, (3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))
# 3rd conv layer
model.add(Conv2D(64, (3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))
# Fully connected layer #1
model.add(Flatten())
model.add(Dense(64, activation='relu'))
# Fully connected layer #2
model.add(Dense(32, activation='relu'))
# Output layer: 10 classes (digits 0-9)
model.add(Dense(10, activation='softmax'))
model.summary()

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(x_trainr, y_train, epochs=5, validation_split = 0.3)

predictions = model.predict(x_testr)

print(predictions)

print(np.argmax(predictions[0]))

plt.imshow(x_test[0])

%pip install opencv-python

import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('two.png')

plt.imshow(img)

img.shape

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

gray.shape

resized = cv2.resize(gray, (28,28), interpolation = cv2.INTER_AREA)

resized.shape

newimg = resized / 255 # 0 to 1 scaling, same as applied to the training data

newimg = np.array(newimg).reshape(-1, IMG_SIZE, IMG_SIZE, 1) # add batch and channel dimensions expected by the convolution layer

newimg.shape

predictions = model.predict(newimg)

print(np.argmax(predictions))
