Fine-Tuning Hyperparameters
• The settings that you adjust across successive training runs (as opposed to the weights the model learns) are called hyperparameters.
• In recent times, there have been several research advancements in both deep
learning and neural networks
• These deep neural networks help developers to achieve more sustainable and
high-quality results.
• An artificial neural network (ANN), or a simple traditional neural network,
aims to solve trivial tasks with straightforward data.
• These networks usually consist of an input layer, one to two hidden layers,
and an output layer.
• For more complex problems, we use deep neural networks, which often have a
complex hidden-layer structure with a wide variety of layer types.
• These additional layers help the model understand the problem better and
produce better solutions for complex tasks.
• A deep neural network has more layers (more depth) than a traditional ANN, and each
layer adds complexity to the model.
# Importing the necessary functionality
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Conv2D
from tensorflow.keras.layers import Flatten, MaxPooling2D
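As a minimal sketch of how these imports fit together (the input shape, layer sizes, and number of output classes are illustrative assumptions, not values given above), a small convolutional model could look like:

# A minimal, illustrative model using the imports above
# (assumes 28x28 grayscale inputs and 10 output classes).
model = Sequential([
    Input(shape=(28, 28, 1)),
    Conv2D(32, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()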
Training DNNs:
• There are certain practices in Deep Learning that are highly recommended
in order to train Deep Neural Networks efficiently.
• Training data:
• A lot of ML practitioners are in the habit of throwing raw training data at any Deep Neural
Network (DNN), assuming that any DNN would (presumably) still give good results.
• In practice, given the right kind of data, a fairly simple model will often provide better and faster
results than a complex DNN.
• So, whether you are working with Computer Vision, Natural Language
Processing, Statistical Modelling, etc., try to preprocess your raw data.
A few measures one can take to get better training data:
• Get your hands on as large a dataset as possible (DNNs are quite data-hungry: more is
better).
• Remove any training sample with corrupted data (short texts, highly distorted images,
spurious output labels, features with lots of null values, etc.); a small cleanup sketch follows this list.
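A minimal sketch of this kind of cleanup (the file name, column names, and thresholds are assumptions for illustration, not part of the text):

import pandas as pd

# Load a hypothetical raw dataset (file and column names are illustrative).
df = pd.read_csv("raw_training_data.csv")

# Drop rows whose features are mostly null (keep rows with >= 80% non-null values).
df = df.dropna(thresh=int(0.8 * df.shape[1]))

# Remove very short texts (assumes a 'text' column).
df = df[df["text"].str.len() >= 20]

# Keep only rows whose label belongs to the known label set (assumes a 'label' column).
valid_labels = {"positive", "negative", "neutral"}
df = df[df["label"].isin(valid_labels)]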
• For years, sigmoid activation functions were the preferred choice. But a sigmoid
function is inherently limited by two drawbacks: 1. saturation of the sigmoid at its
tails (which in turn causes the vanishing gradient problem), and 2. outputs that are not
zero-centered, which slows down gradient-based learning.
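This is one reason ReLU-style activations are now the common default for hidden layers, with sigmoid typically reserved for a binary output layer. A minimal Keras sketch (the layer sizes and input dimension are assumptions):

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

# ReLU in the hidden layers avoids the saturation that sigmoid suffers at its tails;
# sigmoid is kept only for the final layer of a binary classifier.
model = Sequential([
    Input(shape=(100,)),             # 100 input features (assumed)
    Dense(64, activation='relu'),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid'),  # binary output
])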
• Number of Hidden Units and Layers:
• Keeping a somewhat larger number of hidden units than the optimal number is generally a
safe choice.
• On the other hand, with a smaller number of hidden units (than the optimal
number), there is a higher chance of underfitting the model.
• By increasing the number of hidden units, the model will have the required flexibility to filter
out the most appropriate information from these pre-trained representations (a small sketch of varying width and depth follows).
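One convenient way to experiment with width and depth is to make them parameters of a small builder function. This is a sketch; the sizes shown are assumptions, not recommendations:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

def build_mlp(num_hidden_layers=2, hidden_units=128, input_dim=100, num_classes=10):
    """Build a simple MLP whose depth and width are easy to vary."""
    model = Sequential([Input(shape=(input_dim,))])
    for _ in range(num_hidden_layers):
        model.add(Dense(hidden_units, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# A wider model than you think you need is usually a safer starting point
# than one that is too narrow and underfits.
wide_model = build_mlp(hidden_units=256)
narrow_model = build_mlp(hidden_units=16)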
• Hyperparameter Tuning: Grid Search & Random Search:
1. Based on your prior experience, you can manually tune some common
hyperparameters like learning rate, number of layers, etc.
• “Training a Deep Learning model for multiple epochs will result in a better model” - we
have heard this a few times, but how do we quantify how many epochs are enough? One practical
answer is to stop training once validation performance stops improving, as in the sketch below.
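A minimal sketch of random search over two hyperparameters (learning rate and number of layers), with early stopping deciding the number of epochs automatically. The search ranges, toy data, and trial budget are assumptions for illustration:

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

# Toy data standing in for a real training set (1000 samples, 20 features).
x_train = np.random.rand(1000, 20)
y_train = np.random.randint(0, 2, size=1000)

best_val, best_config = 0.0, None
for trial in range(10):                          # 10 random trials (assumed budget)
    lr = 10 ** np.random.uniform(-4, -2)         # learning rate sampled log-uniformly
    num_layers = np.random.randint(1, 4)         # 1 to 3 hidden layers

    model = Sequential([Input(shape=(20,))])
    for _ in range(num_layers):
        model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss='binary_crossentropy', metrics=['accuracy'])

    # Early stopping answers "how many epochs?": stop when validation loss
    # stops improving instead of fixing the epoch count in advance.
    stopper = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3,
                                               restore_best_weights=True)
    history = model.fit(x_train, y_train, validation_split=0.2, epochs=50,
                        callbacks=[stopper], verbose=0)

    val_acc = max(history.history['val_accuracy'])
    if val_acc > best_val:
        best_val = val_acc
        best_config = {'learning_rate': lr, 'num_layers': num_layers}

print("Best configuration found:", best_config, "val_accuracy:", best_val)

Grid search differs only in that the candidate values come from a fixed grid instead of being sampled at random; random search tends to cover wide ranges of continuous hyperparameters (like the learning rate) more efficiently.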