Image Classifier Report
Abstract:
This report presents an overview and detailed explanation of a Python script for image classification
using Convolutional Neural Networks (CNNs). The script leverages the TensorFlow and Keras libraries to
build, train, and evaluate a CNN for classifying flower images. The dataset used consists of various
flower categories, and the model is designed to predict the class of a given image.
1. Introduction:
Image classification is a fundamental task in computer vision, with applications ranging from object
recognition to medical imaging. Convolutional Neural Networks have proven to be highly effective for
image classification tasks due to their ability to automatically learn hierarchical features from the input
data.
2. Dataset:
The script uses a flower image dataset, accessible through a specified URL. The dataset is downloaded,
extracted, and split into training and validation sets. The images are preprocessed, and pixel values are
normalized to ensure consistency and stability during training.
3. Code Breakdown:
1. Importing Libraries:
```
import matplotlib.pyplot as plt
import numpy as np
import PIL
import tensorflow as tf
import pathlib
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
```
- Import the necessary libraries: `matplotlib`, `numpy`, `PIL` (Pillow), and TensorFlow with Keras (`keras`, `layers`, and `Sequential` are all used later in the script).
- `pathlib` is used for working with file paths.
2. Loading the Dataset:
```
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
# Download and extract the archive; data_dir points to the extracted folder
data_dir = tf.keras.utils.get_file('flower_photos.tar', origin=dataset_url, extract=True)
data_dir = pathlib.Path(data_dir).with_suffix('')
image_count = len(list(data_dir.glob('*/*.jpg')))

batch_size = 32
img_height = 180
img_width = 180
```
```
train_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
```
- Images are resized to `(img_height, img_width)` and split into training and validation sets.
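The later model definition refers to `class_names`, which the excerpt above never defines. A minimal sketch of recovering it, assuming the standard `class_names` attribute exposed by `image_dataset_from_directory`:
```
# Class names are inferred from the sub-directory names of data_dir
class_names = train_ds.class_names
print(class_names)  # e.g. ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']
```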
3. Data Normalization and Caching:
```
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

normalization_layer = layers.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
print(np.min(first_image), np.max(first_image))
```
- Pixel values of images are normalized to be in the range `[0, 1]` using the `Rescaling` layer.
- A sample batch is normalized as an example, and its minimum and maximum pixel values are printed to confirm the `[0, 1]` range.
```
num_classes = len(class_names)

model = Sequential([
  layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
  # Three convolution blocks, each followed by max pooling
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
```
- A sequential model is defined with convolutional and pooling layers, followed by dense layers.
- The model is compiled using the Adam optimizer, sparse categorical crossentropy loss, and accuracy as
the metric.
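A quick architecture check (not in the original listing) can be added after compilation to verify layer output shapes and parameter counts:
```
# Print a layer-by-layer overview of the compiled model
model.summary()
```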
```
epochs = 15
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)
```
- The model is trained on the training dataset, and validation performance is monitored over 15 epochs.
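The `history` object returned by `fit` records the per-epoch metrics. As an illustrative sketch (not part of the excerpted script), the accuracy and loss curves can be plotted with the already-imported `matplotlib`:
```
# Plot training vs. validation accuracy and loss over the epochs
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)

plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.show()
```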
```
data_augmentation = keras.Sequential(
  [
    layers.RandomFlip("horizontal",
                      input_shape=(img_height, img_width, 3)),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
  ]
)
```
- A data augmentation pipeline is defined with random horizontal flip, rotation, and zoom operations.
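To see what the pipeline produces, a few augmented variants of one training image can be previewed. This visualization is a sketch and is not part of the excerpted script:
```
# Display nine augmented variants of the first image in the first batch
plt.figure(figsize=(6, 6))
for images, _ in train_ds.take(1):
  for i in range(9):
    augmented_images = data_augmentation(images)
    plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_images[0].numpy().astype("uint8"))
    plt.axis("off")
plt.show()
```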
```
model = Sequential([
  data_augmentation,
  layers.Rescaling(1./255),
  # Same convolution blocks as the first model, with dropout added
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Dropout(0.2),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes, name="outputs")
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
```
- A new model is created that reuses the same convolutional architecture, now preceded by the data augmentation pipeline and with a dropout layer added before the classifier head.
- The model is compiled with the same optimizer, loss function, and metrics.
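The excerpt does not show the second training run. Assuming it mirrors the first call (the report mentions 15 epochs), it would look roughly like this:
```
# Train the augmented model; augmentation and dropout help curb overfitting
epochs = 15
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)
```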
```
sunflower_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/592px-Red_sunflower.jpg"
sunflower_path = tf.keras.utils.get_file('Red_sunflower', origin=sunflower_url)

img = tf.keras.utils.load_img(
    sunflower_path, target_size=(img_height, img_width)
)
img_array = tf.keras.utils.img_to_array(img)
img_array = tf.expand_dims(img_array, 0)  # Create a batch of one image

predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])

print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
)
```
- The trained model is used to predict the class and confidence score for the input image.
This breakdown provides a detailed explanation of each part of the script, from data loading and
preprocessing to model training and prediction.
4. Model Architecture:
The CNN model is defined using the Keras Sequential API. It comprises a rescaling layer that normalizes pixel values, three convolution blocks (each a `Conv2D` layer followed by `MaxPooling2D`), a `Flatten` layer, a fully connected `Dense` layer with 128 units and ReLU activation, and a final `Dense` output layer that produces one logit per flower class.
5. Training:
The model is compiled using the Adam optimizer and sparse categorical crossentropy loss function. The
training process involves iterating over batches of images, and the model is evaluated on a validation set
to monitor performance. The training history is stored for later analysis.
6. Data Augmentation:
A second model is created that prepends the data augmentation pipeline to the same convolutional architecture and adds a dropout layer. This model is then compiled and trained in the same way, with augmentation applied on the fly to each training batch. Data augmentation helps the model generalize better to new, unseen images.
7. Prediction:
A sample image of a red sunflower is downloaded, preprocessed, and passed through the trained model for prediction. The predicted class and confidence score are printed.
8. Conclusion:
The script demonstrates the end-to-end process of building and training a CNN for image classification
using TensorFlow and Keras. By incorporating data augmentation, the model's performance is
enhanced, showcasing the importance of preprocessing techniques in improving generalization. This
report provides a comprehensive understanding of the code's components and their contributions to
the overall image classification task.