0% found this document useful (0 votes)
10 views

Implementing A Convolutional Neural Network CNN 1718899610

Uploaded by

kalexan
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views

Implementing A Convolutional Neural Network CNN 1718899610

Uploaded by

kalexan
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 10

Implementing a

Custom CNN (Convolutional Neural Network)


for Image Classification
from Scratch in Python

1 ANSHUMAN JHA
Implementing a Custom CNN (Convolutional Neural Network)
for Image Classification from Scratch in Python

Table of Contents
1. Introduction to CNN (Convolutional Neural Network)
2. The Structure of a Convolutional Neural Network
3. Implementation in Python
a. Image Preprocessing
b. Building the Convolutional Layer
c. Implementing the Pooling Layer
d. Creating the Fully Connected Layer
e. Putting It All Together
f. Training and Evaluation
4. Conclusion

2 ANSHUMAN JHA
Implementing a Custom CNN (Convolutional Neural Network)
for Image Classification from Scratch in Python

1. Introduction to Convolutional Neural Networks


Convolutional Neural Networks (CNNs) are a type of deep learning algorithm specifically designed for image
recognition and classification tasks. They have achieved remarkable success in computer vision, powering
applications like facial recognition, self-driving cars, and medical image analysis. This post will guide you
through building a custom CNN from scratch in Python, focusing on image preprocessing, convolutional layers,
pooling layers, and fully connected layers without relying on high-level libraries such as TensorFlow or
PyTorch.

2. The Structure of a Convolutional Neural Networks


The Structure illustrates the following components and their connections:

1. Input Layer:
- Input image of size 64x64x3.

2. First Convolutional Layer:


- Convolution operation with filters of size 3x3x3 and 8 filters.
- ReLU activation function.
- Padding: 1, Stride: 1.

3. First Pooling Layer:


- Max-pooling operation with a 2x2 filter and stride of 2.

4. Second Convolutional Layer:


- Convolution operation with filters of size 3x3x8 and 16 filters.
- ReLU activation function.
- Padding: 1, Stride: 1.

5. Second Pooling Layer:


- Max-pooling operation with a 2x2 filter and stride of 2.

6. Flatten Layer:
- Flatten the output of the second pooling layer into a 1D array.

7. Fully Connected Layer:


- Fully connected layer with an output size of 10 (assuming 10 classes for classification).

8. Output Layer:
- Softmax activation function to produce a probability distribution over the 10 classes.

3 ANSHUMAN JHA
Implementing a Custom CNN (Convolutional Neural Network)
for Image Classification from Scratch in Python

4 ANSHUMAN JHA
Implementing a Custom CNN (Convolutional Neural Network)
for Image Classification from Scratch in Python
3. Implementation in Python
Let's implement a simple CNN (Convolutional Neural Network) in Python to classify image.

Step 1: Image Preprocessing


Before feeding images into a CNN, they need to be preprocessed. This includes resizing, normalization, and
converting them into a suitable format.

Loading and Preprocessing Images

import numpy as np
import cv2
import os

def load_images_from_folder(folder, img_size):


images = []
labels = []
for filename in os.listdir(folder):
img = cv2.imread(os.path.join(folder, filename))
if img is not None:
img = cv2.resize(img, (img_size, img_size))
img = img / 255.0 # Normalize to [0,1]
images.append(img)
labels.append(filename.split('_')[0]) # Assuming filenames contain labels
return np.array(images), np.array(labels)

# Example usage
images, labels = load_images_from_folder('path_to_images', 64)

In this example, cv2 is used to read and resize images, while the pixel values are normalized to the range [0, 1].

5 ANSHUMAN JHA
Implementing a Custom CNN (Convolutional Neural Network)
for Image Classification from Scratch in Python

Step 2: Building the Convolutional Layer


The convolutional layer is the core building block of a CNN. It applies a number of filters to the input image to
extract features.

Convolutional Layer Implementation

def conv2d(X, W, b, stride=1, padding=0):


(n_H_prev, n_W_prev, n_C_prev) = X.shape
(f, f, n_C_prev, n_C) = W.shape
n_H = int((n_H_prev - f + 2 * padding) / stride) + 1
n_W = int((n_W_prev - f + 2 * padding) / stride) + 1
Z = np.zeros((n_H, n_W, n_C))

X_pad = np.pad(X, ((padding, padding), (padding, padding), (0, 0)), mode='constant')

for h in range(n_H):
for w in range(n_W):
for c in range(n_C):
vert_start = h * stride
vert_end = vert_start + f
horiz_start = w * stride
horiz_end = horiz_start + f

X_slice = X_pad[vert_start:vert_end, horiz_start:horiz_end, :]


Z[h, w, c] = np.sum(X_slice * W[:, :, :, c]) + b[c]

return Z

# Example usage
X = np.random.randn(64, 64, 3) # Example input
W = np.random.randn(3, 3, 3, 8) # 8 filters of size 3x3x3
b = np.random.randn(8)
Z = conv2d(X, W, b)

This function performs a 2D convolution operation with the specified filters, stride, and padding. The input X is
padded to ensure the output dimensions match the desired size.

6 ANSHUMAN JHA
Implementing a Custom CNN (Convolutional Neural Network)
for Image Classification from Scratch in Python

Step 3: Implementing the Pooling Layer


Pooling layers reduce the spatial dimensions of the input, which helps in reducing the computational
complexity and prevents overfitting.

Max-Pooling Layer Implementation

def max_pooling(X, f=2, stride=2):


(n_H_prev, n_W_prev, n_C) = X.shape
n_H = int(1 + (n_H_prev - f) / stride)
n_W = int(1 + (n_W_prev - f) / stride)
Z = np.zeros((n_H, n_W, n_C))

for h in range(n_H):
for w in range(n_W):
for c in range(n_C):
vert_start = h * stride
vert_end = vert_start + f
horiz_start = w * stride
horiz_end = horiz_start + f

X_slice = X[vert_start:vert_end, horiz_start:horiz_end, c]


Z[h, w, c] = np.max(X_slice)

return Z

# Example usage
X = np.random.randn(64, 64, 8) # Example input
Z = max_pooling(X)

This function performs max-pooling, which extracts the maximum value from each patch of the input.

Step 4: Creating the Fully Connected Layer

The fully connected (FC) layer is a standard neural network layer where each neuron is connected to every
neuron in the previous layer. This layer combines features learned by convolutional and pooling layers to make
predictions.

Fully Connected Layer Implementation

This function performs a linear transformation of the input X using weights W and biases b.
def fully_connected(X, W, b):
Z = np.dot(W, X) + b
return Z

# Example usage
X = np.random.randn(64 * 64 * 8) # Flattened input
W = np.random.randn(10, 64 * 64 * 8) # 10 output classes
b = np.random.randn(10)
Z = fully_connected(X, W, b)

7 ANSHUMAN JHA
Implementing a Custom CNN (Convolutional Neural Network)
for Image Classification from Scratch in Python

Step 5: Putting It All Together


We will now combine the above components to build a simple CNN model.

CNN Model Implementation

def cnn_forward(X, parameters):


W1, b1, W2, b2, W3, b3 = parameters
Z1 = conv2d(X, W1, b1, stride=1, padding=1)
A1 = np.maximum(0, Z1) # ReLU activation
P1 = max_pooling(A1, f=2, stride=2)

Z2 = conv2d(P1, W2, b2, stride=1, padding=1)


A2 = np.maximum(0, Z2) # ReLU activation
P2 = max_pooling(A2, f=2, stride=2)

F = P2.flatten()
Z3 = fully_connected(F, W3, b3)

return Z3

# Initialize parameters
W1 = np.random.randn(3, 3, 3, 8)
b1 = np.random.randn(8)
W2 = np.random.randn(3, 3, 8, 16)
b2 = np.random.randn(16)
W3 = np.random.randn(10, 16 * 16 * 16)
b3 = np.random.randn(10)

parameters = (W1, b1, W2, b2, W3, b3)

# Example usage
X = np.random.randn(64, 64, 3) # Example input
Z3 = cnn_forward(X, parameters)

In this example, we define a simple CNN model with two convolutional layers followed by ReLU activation
and max-pooling, and a final fully connected layer for classification.

8 ANSHUMAN JHA
Implementing a Custom CNN (Convolutional Neural Network)
for Image Classification from Scratch in Python

Step 6: Training and Evaluation


Training a CNN involves optimizing the weights and biases using backpropagation and gradient descent. For
simplicity, we'll use random values for our input and target, and demonstrate the forward pass.

Training Loop (Simplified)

def compute_loss(Y_hat, Y):


m = Y.shape[0]
loss = -np.sum(Y * np.log(Y_hat + 1e-8)) / m
return loss

def softmax(Z):
expZ = np.exp(Z - np.max(Z))
return expZ / expZ.sum(axis=0, keepdims=True)

# Example usage
# Example target (one-hot encoded)
Y = np.eye(10)[np.random.choice(10, 1)].T
Y_hat = softmax(Z3) # Applying softmax to the output of the CNN
loss = compute_loss(Y_hat, Y)

This code computes the cross-entropy loss, which is commonly used for classification tasks.

7. Conclusion
In this post, we built a custom CNN from scratch in Python, implementing each layer manually. This process
helps in understanding the inner workings of CNNs and provides a foundation for using higher-level libraries
for more complex tasks
By following this guide, you should have a clear understanding of how to build and understand a CNN from
scratch, providing a solid foundation for further learning and development in the field of deep learning and
computer vision.

9 ANSHUMAN JHA
Implementing a Custom CNN (Convolutional Neural Network)
for Image Classification from Scratch in Python

Constructive comments and feedback are welcomed

10 ANSHUMAN JHA

You might also like