Assignment 1
Assignment # 01
Submitted by
Yousra Iman
Submitted to
Dr Aamir Arsalan
Reg no
2020-BSE-032
Dataset # 01:
https://www.kaggle.com/datasets/gpiosenka/100-bird-species/data
1. Introduction
The dataset comprises 525 bird species with 84,635 high-quality training images, 2,625 test
images, and 2,625 validation images. Each image is 224x224x3 pixels in JPG format, with the
bird typically occupying at least 50% of the pixels. The dataset ensures no leakage between
train, test, and validation sets, with duplicates and low-information images removed. Images
were gathered from internet searches and hand-selected for quality. Cropping and resizing
ensure sufficient information for accurate classification, with expected high accuracies even
with moderately complex models. The training set is unbalanced, with a varying number
of files per species, but each species has at least 130 training images. A more significant
shortcoming is the gender imbalance: 80% of the images show male birds and 20% show female
birds. Because the test and validation images are also predominantly male, the classifier may
perform less accurately on images of female birds.
2. Libraries
This code imports the libraries and modules required to build the bird classification
model. The steps that follow load and preprocess the dataset, define the model
architecture, train the model, and evaluate its performance.
3. Configuration variables
• The batch_size variable is set to 128, the number of images in each batch during training
and evaluation.
• The image_size variable is set to (224, 224), the height and width of the model's input
images. This matches the resolution of the dataset images and is the size commonly used by
models trained on ImageNet, which makes it a good choice for transfer learning.
• The seed variable is set to 42 and is used to initialize the random number generator. Fixing
the seed makes shuffling and augmentation repeatable, which helps reproduce results.
• The data_augmentation variable is set to True, so data augmentation is applied during
training. Data augmentation artificially increases the effective size of the training dataset
by applying random transformations such as rotation, shifting, and flipping to the images.
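The configuration above can be sketched as follows. The four values come from the text; the helper set_global_seed is a hypothetical name introduced here for illustration, and it assumes NumPy and TensorFlow are seeded only when they happen to be installed.

```python
import random

# Values taken from the configuration described above.
batch_size = 128           # images per training/evaluation batch
image_size = (224, 224)    # (height, width) of the model input
seed = 42                  # seed for the random number generators
data_augmentation = True   # apply augmentation to the training images


def set_global_seed(value):
    """Seed the available random number generators (hypothetical helper)."""
    random.seed(value)
    try:
        import numpy as np
        np.random.seed(value)
    except ImportError:
        pass  # NumPy is optional for this sketch
    try:
        import tensorflow as tf
        tf.random.set_seed(value)
    except ImportError:
        pass  # TensorFlow is optional for this sketch


set_global_seed(seed)
```

Seeding makes the Python-level randomness repeatable; some GPU operations can still introduce small run-to-run differences.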
4. Data exploration
• Number of classes
This function provides a visual representation of a randomly selected image from the training
dataset, allowing users to get a glimpse of the data and verify its integrity before training the
model.
By arranging multiple subplots in a grid, this code allows for the visualization of multiple
random images from the training dataset in a single figure. This can provide a more
comprehensive overview of the dataset and the different bird species present.
This function provides a way to obtain a random image from the training dataset, which can be
useful for various purposes such as data exploration, visualization, or debugging.
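A minimal sketch of such a function is shown below. It assumes the training data is stored as one folder per species with the image files inside; the name random_image_path and the list of accepted file suffixes are illustrative choices, not taken from the original code.

```python
import random
from pathlib import Path

# Assumed set of image file extensions (the dataset itself uses JPG).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def random_image_path(train_dir, rng=random):
    """Return (class_name, path) for one randomly chosen training image.

    A class folder is picked first and then an image inside it, so rare
    species are as likely to be shown as common ones.
    """
    class_dirs = sorted(p for p in Path(train_dir).iterdir() if p.is_dir())
    if not class_dirs:
        raise ValueError(f"no class folders found in {train_dir}")
    class_dir = rng.choice(class_dirs)
    images = sorted(
        p for p in class_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not images:
        raise ValueError(f"no images found in {class_dir}")
    return class_dir.name, rng.choice(images)
```

Passing a seeded random.Random instance as rng makes the selection repeatable, which is convenient when debugging.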
Printing the image shape shows the height, width, and number of channels of the images in
the dataset. It helps confirm that the image dimensions are consistent and match the input
shape expected by the model architecture used for classification.
This function provides a convenient way to analyze the distribution of images across different
classes in the dataset. The resulting sorted list can help identify classes with more or fewer
images, which is valuable for understanding class imbalance and planning appropriate data
preprocessing strategies.
The formatted print statement provides a clear and concise summary of the class distribution
in the dataset, including the class names and the corresponding counts of images. This
information is helpful for understanding the dataset's characteristics and potential class
imbalances.
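The class-count function and the formatted summary described above could look like the sketch below. It assumes one folder per class inside the dataset directory; the function names and the column widths are illustrative.

```python
from pathlib import Path


def count_images_per_class(data_dir):
    """Return [(class_name, image_count), ...] sorted from most to fewest images."""
    counts = {
        class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
        for class_dir in Path(data_dir).iterdir()
        if class_dir.is_dir()
    }
    # Sort by descending count; ties are broken alphabetically by class name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def print_class_summary(sorted_counts, limit=10):
    """Print the first `limit` rows of the sorted counts in aligned columns."""
    print(f"{'Class':<30}{'Images':>8}")
    for name, count in sorted_counts[:limit]:
        print(f"{name:<30}{count:>8}")
```

Reversing the sorted list before printing shows the smallest classes instead, which is where class imbalance is most likely to matter.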
This code provides a visual representation of the distribution of images across the specified
number of classes in the training dataset. It helps visualize class imbalances and understand the
dataset's characteristics better.
5. Data frame creation
This code sets up the data generators with appropriate preprocessing and augmentation
configurations, enabling efficient and flexible data pipeline management for training,
validation, and testing phases of the machine learning model.
This code establishes data generators for training, validation, and testing datasets, enabling
efficient batch-wise processing of data during model training and evaluation.
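One way to set up such generators is sketched below, assuming the Keras ImageDataGenerator class and a base directory with train, valid, and test subfolders. The augmentation ranges (20 degrees, 10% shifts) and the helper names are assumptions made for illustration; the original values are not given in the text.

```python
import os


def generator_kwargs(augment):
    """Keyword arguments for ImageDataGenerator; augmentation values are assumed."""
    kwargs = {"rescale": 1.0 / 255}  # scale pixel values to [0, 1]
    if augment:
        kwargs.update(
            rotation_range=20,       # random rotation, in degrees
            width_shift_range=0.1,   # random horizontal shift (fraction of width)
            height_shift_range=0.1,  # random vertical shift (fraction of height)
            horizontal_flip=True,    # random left-right flip
        )
    return kwargs


def build_generators(base_dir, image_size=(224, 224), batch_size=128,
                     seed=42, augment=True):
    """Return (train, valid, test) generators reading from base_dir subfolders."""
    # Imported lazily so generator_kwargs can be used without TensorFlow installed.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    train_datagen = ImageDataGenerator(**generator_kwargs(augment))
    eval_datagen = ImageDataGenerator(**generator_kwargs(False))  # never augmented

    def flow(datagen, subset, shuffle):
        return datagen.flow_from_directory(
            os.path.join(base_dir, subset),  # assumed folder names
            target_size=image_size,
            batch_size=batch_size,
            class_mode="categorical",        # one-hot labels
            shuffle=shuffle,
            seed=seed,
        )

    return (
        flow(train_datagen, "train", True),
        flow(eval_datagen, "valid", False),
        flow(eval_datagen, "test", False),
    )
```

Validation and test images are only rescaled, so the reported metrics reflect unmodified images.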
6. Model
This code defines a CNN model suitable for bird species classification, incorporating
convolutional layers for feature extraction and dense layers for classification. The model's
architecture is designed to capture and learn discriminative features from input images for
accurate classification.
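The exact architecture is not reproduced in this report, so the following is only a sketch of a CNN of the kind described: stacked convolution and pooling blocks for feature extraction, followed by dense layers for classification. The filter counts, the dense layer width, and the dropout rate are assumptions.

```python
NUM_CLASSES = 525            # number of bird species in the dataset
INPUT_SHAPE = (224, 224, 3)  # height, width, channels
CONV_FILTERS = (32, 64, 128) # assumed filters per convolution block


def feature_map_size(input_size, num_pools):
    """Spatial size after num_pools 2x2 max-pooling layers.

    The 'same'-padded convolutions below keep the size unchanged,
    so only the pooling layers halve it.
    """
    size = input_size
    for _ in range(num_pools):
        size //= 2
    return size


def build_model(num_classes=NUM_CLASSES, input_shape=INPUT_SHAPE,
                conv_filters=CONV_FILTERS):
    """Build an illustrative CNN classifier (not the original architecture)."""
    # Imported lazily so the helper above works without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    for filters in conv_filters:
        # Feature extraction: convolution followed by downsampling.
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D())
    model.add(layers.GlobalAveragePooling2D())
    # Classification head.
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model
```

With three pooling layers, a 224x224 input is reduced to 28x28 feature maps before global average pooling, which feature_map_size(224, 3) confirms.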
7. Callbacks
These callbacks provide mechanisms for monitoring the model's performance during training,
preventing overfitting, and saving the best model weights for future use. They enhance the
training process by introducing early stopping and checkpointing functionalities.
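To illustrate how early stopping and checkpointing interact, the sketch below replays a list of validation losses in plain Python. It is a simplified model of the mechanism, not the Keras implementation, and the function name and the patience value are assumptions.

```python
def simulate_callbacks(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for per-epoch validation losses.

    Epochs are numbered from 1. best_epoch is the last epoch at which a
    "save best only" checkpoint would have been written.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Improvement: remember it (checkpoint) and reset the counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # stop early
    return len(val_losses), best_epoch    # ran through every epoch


losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.69]
print(simulate_callbacks(losses, patience=3))  # prints (6, 3)
```

In this example training stops after epoch 6 because the loss has not improved for three epochs, and the checkpoint from epoch 3 holds the best weights. The later improvement at epoch 7 is never reached, which is the trade-off of a small patience value.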
8. Training
These hyperparameters are crucial for training the neural network effectively. They influence the
model's convergence, performance, and generalization ability. Adjusting these hyperparameters may be
necessary to achieve optimal performance based on the dataset characteristics and the complexity of the
classification task.
Compiling the model with the specified optimizer, loss function, and evaluation metrics prepares it for
training by configuring the necessary components for optimization and performance evaluation.
This code initiates the model training process, where the model learns to classify bird species
images by adjusting its weights based on the training data and evaluates its performance on the
validation dataset after each epoch. The specified callbacks further enhance the training process
by providing early stopping and model checkpointing functionalities.
Saving the model in the HDF5 format with a descriptive filename ensures that the trained model
can be easily accessed, shared, and reused for various purposes in future workflows.
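The compile, fit, and save steps described in this section can be sketched as one function. The hyperparameter values, the Adam optimizer, the file names, and the function name are assumptions; the loss assumes one-hot labels, as produced by generators with a categorical class mode.

```python
# Assumed hyperparameters; the original values are not stated in the text.
HYPERPARAMS = {"epochs": 20, "learning_rate": 1e-3, "patience": 5}
CHECKPOINT_PATH = "best_bird_classifier.weights.h5"  # best weights during training
MODEL_PATH = "bird_classifier.h5"                    # full model in HDF5 format


def train_and_save(model, train_gen, valid_gen, hp=HYPERPARAMS):
    """Compile, train with callbacks, and save the model (illustrative sketch)."""
    # Imported lazily so the constants above can be inspected without TensorFlow.
    from tensorflow import keras

    callbacks = [
        # Stop when validation loss stops improving and restore the best weights.
        keras.callbacks.EarlyStopping(
            monitor="val_loss",
            patience=hp["patience"],
            restore_best_weights=True,
        ),
        # Keep only the weights from the best epoch so far.
        keras.callbacks.ModelCheckpoint(
            CHECKPOINT_PATH,
            monitor="val_loss",
            save_best_only=True,
            save_weights_only=True,
        ),
    ]
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=hp["learning_rate"]),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    history = model.fit(
        train_gen,
        validation_data=valid_gen,  # evaluated after each epoch
        epochs=hp["epochs"],
        callbacks=callbacks,
    )
    model.save(MODEL_PATH)
    return history
```

The returned history object records the loss and accuracy per epoch, which can be plotted to check for overfitting.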