KYK_Final_Project_Report


ELL409 - Machine Intelligence and Learning Semester Project Report

Cuisine Prediction
Kriti Garg, Yagya Goyal, Kartikey Agarwal

November 24, 2024

Abstract

This project explores cuisine prediction using a machine learning approach applied to the dataset provided. By leveraging a feedforward neural network with TF-IDF feature extraction and class balancing via SMOTE, the model predicts the cuisine type based on a list of ingredients. The results demonstrate the potential of simple neural network architectures for multi-class classification tasks in text-based datasets. Our code and data can be found at this link: Cuisine Prediction ELL409

1 Problem Statement

The goal of this project is to develop a machine learning model capable of predicting the type of cuisine for a given recipe based on its list of ingredients. Using a dataset containing recipes with ingredients and their corresponding cuisines, the task involves training an Artificial Neural Network (ANN) to classify cuisines like Indian, Mexican, Moroccan, etc. The dataset is provided in JSON format, with ingredients as input features and cuisine type as the target variable. The final evaluation metric is overall accuracy, measured by comparing predictions on a test dataset to true labels. Results are submitted in a specified CSV format.

2 Introduction

Cuisine prediction is a multi-class classification problem where the goal is to predict the type of cuisine based on the ingredients in a recipe. This project focuses on developing an efficient machine learning pipeline to handle the task, starting from data preprocessing and feature extraction to training a feedforward neural network. The provided dataset is used as a benchmark, which includes recipes labeled with their respective cuisines and ingredient lists. This report highlights the methodology, experiments, and outcomes of this approach.

3 Methodology

3.1 Data Preprocessing

The dataset comprises recipes with their ingredients and corresponding cuisines. We observed that the dataset was highly imbalanced. To address class imbalance, SMOTE (Synthetic Minority Oversampling Technique) was applied to the training data, ensuring all classes were equally represented.

1. Ingredient tokens were preprocessed by replacing spaces with underscores (e.g., "soy sauce" becomes "soy_sauce") to ensure uniformity for text analysis and to ensure that each multi-word ingredient is treated as a single token. For example, "grated parmesan cheese" counts as one ingredient rather than three.

2. Each recipe's list of ingredients was joined into a single string (e.g., "sugar salt butter"). This was done separately for the training and test datasets.

3. TF-IDF (Term Frequency-Inverse Document Frequency) was used to transform the ingredient lists into a vectorized numerical format (limited to 4508 ingredients/features for this case).

3.2 Model Architecture

Feedforward: the forward pass

The feedforward neural network was designed with:

1. Input layer matching the number of TF-IDF features.

2. Two hidden layers with 128 and 64 neurons, using ReLU activations and dropout for regularization.

3. A softmax output layer for multi-class classification.

Every layer includes an implicit linear (fully connected) component, which is essential to transform the data in a learnable, flexible way.

While this is the final architecture, we went through several steps to establish which one was best suited to our problem statement. These steps are described in the following section.

Figure 1: Imbalance in dataset

The backward pass. Apart from the feedforward architecture, here is a brief description of our backpropagation setup. We set up an Adam optimizer and used it to update the weights from the backpropagated gradients for each batch in each epoch, i.e., we used mini-batch gradient descent.

3.3 Training, Optimization and Evaluation

Data Splitting and Conversion

1. Training and Validation Split: train_test_split divides the resampled training data into training and validation sets (80%-20% split).

2. Tensor Conversion: The split data is converted into PyTorch tensors for use in the neural network.

3. Dataloaders: PyTorch DataLoader objects are created for both training and validation sets to manage batching during training.

Training the Model

1. Loss Function: CrossEntropyLoss is used for multi-class classification.

2. Optimizer: Adam optimizer with a learning rate of 0.001 is used for training.

3. Training Loop: For a set number of epochs, the model is trained batch by batch using the training DataLoader. The number of epochs is kept large enough to see the accuracy converge to a value.

4. After each epoch: The validation set is evaluated, and metrics such as loss and accuracy are calculated.

Testing and Prediction:

1. The trained model predicts the cuisines for the test data (X_test).

2. The predicted labels are converted back to cuisine names using the LabelEncoder.

Evaluation Metrics: Although many evaluation metrics could have been chosen, the project aim clearly specified accuracy as the evaluation metric, and hence we did not run many trials in this domain.

Here are a few models that we trained and compared:

1. Balanced vs. imbalanced data.

2. Preprocessed vs. untouched data.

3. Training with vs. without a dropout layer.

4. Different numbers of layers and of neurons per layer.

5. Different batch sizes.

6. Different maximum numbers of features taken into account.

The plots and accuracy results can be found in the next section.

4 Results

Here are the plots for each of our comparisons.

Figure 2: Without Data Balancing
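The accuracy curves in this section come from evaluating the model after every epoch. As a rough sketch of how the setup in Section 3.3 could be wired together in PyTorch (an illustration under stated assumptions, not necessarily the exact project code), the block below builds the 128/64 network, trains it with Adam and CrossEntropyLoss, and records training and validation accuracy per epoch. The number of classes, the dropout probability, the batch size, the epoch count, the names CuisineNet, accuracy and train, and the random stand-in data are all assumptions made for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

N_FEATURES = 4508  # TF-IDF feature count from Section 3.1
N_CLASSES = 20     # assumption: the report does not state the number of cuisines
DROPOUT_P = 0.5    # assumption: the dropout probability is not stated
BATCH_SIZE = 64    # assumption: the final batch size is not stated


class CuisineNet(nn.Module):
    """Two hidden layers (128, 64) with ReLU and dropout; returns raw logits."""

    def __init__(self, n_features, n_classes, p=DROPOUT_P):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(), nn.Dropout(p),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(p),
            nn.Linear(64, n_classes),  # softmax is applied inside CrossEntropyLoss
        )

    def forward(self, x):
        return self.layers(x)


def accuracy(model, loader):
    """Fraction of correctly classified samples, with dropout disabled."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for xb, yb in loader:
            correct += (model(xb).argmax(dim=1) == yb).sum().item()
            total += yb.numel()
    return correct / total


def train(model, train_loader, val_loader, epochs):
    """Mini-batch training with Adam; returns per-epoch accuracy history."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    history = {"train_acc": [], "val_acc": []}
    for _ in range(epochs):
        model.train()
        for xb, yb in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            optimizer.step()
        history["train_acc"].append(accuracy(model, train_loader))
        history["val_acc"].append(accuracy(model, val_loader))
    return history


# Random stand-in data with an 80%-20% split (not the real recipe dataset).
X = torch.rand(200, N_FEATURES)
y = torch.randint(0, N_CLASSES, (200,))
train_loader = DataLoader(TensorDataset(X[:160], y[:160]),
                          batch_size=BATCH_SIZE, shuffle=True)
val_loader = DataLoader(TensorDataset(X[160:], y[160:]), batch_size=BATCH_SIZE)

model = CuisineNet(N_FEATURES, N_CLASSES)
history = train(model, train_loader, val_loader, epochs=2)
```

Because nn.CrossEntropyLoss applies log-softmax internally, the sketch returns raw logits from the last linear layer. Class probabilities, if needed, can be obtained by applying torch.softmax to the output.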

Figure 3: Without Preprocessing

Figure 4: Without Dropout

Figure 5: Using 3 hidden layers

Figure 6: Using total of 1 layer

Figure 7: Using Batch Size = 10

Figure 8: Changed Number of Neurons to (256,64)

Observations

1. As we can clearly see, data balancing saved us from the hazard of overfitting on the imbalanced data.

2. Changing the batch size to 10 did not have any effect on the accuracy; it just took a lot longer to train. Hence, the bigger batch sizes work better.

3. While the other changes did not bring considerable changes in accuracy, we observed that when judged on overfitting, underfitting, efficiency and accuracy together, our final model performed best, and we anticipate it to perform the best among this lot on unseen data.

4. With our final model, trained on the 80-20 split data, we got a training accuracy of 96.94% and a validation accuracy of 95.42%.

Figure 9: Final Model accuracy results

5 Discussion
1. TF-IDF Effectiveness: TF-IDF proved effective in capturing text features for multi-class classification tasks. Limiting the feature size to 4508 balanced performance and efficiency. TF-IDF assigns each ingredient (feature) a weight according to its frequency of occurrence. Qualitatively speaking, an ingredient that is more common across recipes will have a lower weight.
2. Handling Imbalanced Data: SMOTE significantly improved class representation, ensuring fair learning across cuisines.

3. Model Simplicity and Performance: Despite being a simple feedforward neural network, the model achieved competitive accuracy, highlighting its suitability for this task.
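To make the weighting described in point 1 concrete, here is a small sketch using scikit-learn's TfidfVectorizer on three made-up recipes. The recipes, the helper name to_document, and the use of default vectorizer settings other than max_features are assumptions for illustration. Only the underscore joining and the 4508-feature cap come from Section 3.1.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up recipes for illustration (not taken from the project dataset).
recipes = [
    ["soy sauce", "salt", "rice"],
    ["salt", "butter", "sugar"],
    ["salt", "grated parmesan cheese", "butter"],
]


def to_document(ingredients):
    """Join multi-word ingredients with underscores, then join the recipe."""
    return " ".join(ing.replace(" ", "_") for ing in ingredients)


documents = [to_document(r) for r in recipes]

# Cap the vocabulary as in Section 3.1; other settings are library defaults.
vectorizer = TfidfVectorizer(max_features=4508)
X = vectorizer.fit_transform(documents)

# Map each ingredient token to its inverse-document-frequency weight.
idf = {term: vectorizer.idf_[i] for term, i in vectorizer.vocabulary_.items()}

for term in sorted(idf, key=idf.get):
    print(f"{term:25s} idf = {idf[term]:.3f}")
```

In this toy example salt appears in every recipe and therefore receives the lowest inverse-document-frequency value, while soy_sauce, which appears only once, receives a higher one. The underscore joining keeps grated_parmesan_cheese as a single feature.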

Conclusions
Despite being a relatively straightforward architecture, the model proved to be suitable for the multi-class classification task, highlighting the importance of careful preprocessing and data balancing in text-based machine learning tasks.

Acknowledgements
We would like to thank Anant sir for guiding us through the project with valuable insights that helped us get the best out of it.
