
Fine Tuning of Neural Network Hyperparameters


• Hyperparameter Tuning (Hypertuning):
• Building machine learning models is an iterative process that involves
optimizing the model’s performance and compute resources.

• The settings that you adjust during each iteration are called hyperparameters.

• The process of searching for optimal hyperparameters is called hyperparameter tuning or hypertuning, and is essential in any machine learning project.

• Hypertuning helps boost performance and reduces model complexity by removing unnecessary parameters (e.g., number of units in a dense layer).
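
For concreteness, a small illustrative snippet (not part of the original notes) showing the kind of settings adjusted between iterations; the specific names and values below are assumptions chosen for illustration:

# Illustrative example of hyperparameters: settings chosen before training,
# as opposed to parameters (weights) that the model learns from the data.
hyperparameters = {
    "learning_rate": 1e-3,     # step size used by the optimizer
    "batch_size": 32,          # samples per gradient update
    "hidden_units": 128,       # number of units in a dense layer
    "num_hidden_layers": 2,    # depth of the network
    "epochs": 20,              # passes over the training data
}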
UNIT-2
DNN (Deep Neural Networks)

• Deep neural networks have changed the landscape of artificial intelligence in the modern era.

• In recent times, there have been several research advancements in both deep learning and neural networks, which have dramatically increased the quality of projects related to artificial intelligence.

• These deep neural networks help developers to achieve more sustainable and
high-quality results.
• An artificial neural network (ANN), or simple traditional neural network, aims to solve trivial tasks with straightforward data.

• An artificial neural network is loosely inspired by biological neural networks. It is a collection of layers that perform a specific task.

• Each layer consists of a collection of nodes that operate together.

• These networks usually consist of an input layer, one to two hidden layers,
and an output layer.

• While it is possible for such networks to solve simple mathematical and computational problems, it is tough for them to solve complicated image processing, computer vision, and natural language processing tasks.

• For these problems, we utilize deep neural networks, which often have a complex hidden layer structure with a wide variety of different layers, such as convolutional, max-pooling, dense, and other unique layers.

• These additional layers help the model to understand problems better and
provide optimal solutions to complex projects.

• A deep neural network has more layers (more depth) than an ANN, and each layer adds complexity to the model.
# Importing the necessary functionality
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Conv2D
from tensorflow.keras.layers import Flatten, MaxPooling2D
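
As a minimal sketch (not from the original notes), the imported layers might be combined into a small deep network as shown below; the input shape, filter counts, and unit counts are illustrative assumptions, not prescribed values.

# A minimal illustrative DNN using the imports above: convolutional,
# max-pooling, flatten, and dense layers. All sizes are assumptions.
model = Sequential([
    Input(shape=(28, 28, 1)),                        # e.g., grayscale 28x28 images (assumed)
    Conv2D(32, kernel_size=3, activation="relu"),    # convolutional layer
    MaxPooling2D(pool_size=2),                       # max-pooling layer
    Flatten(),                                       # flatten feature maps into a vector
    Dense(64, activation="relu"),                    # dense (fully connected) hidden layer
    Dense(10, activation="softmax"),                 # output layer for 10 classes (assumed)
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()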
Training DNN:
• There are certain practices in Deep Learning that are highly recommended in order to efficiently train Deep Neural Networks.
• Training data:

• A lot of ML practitioners are in the habit of throwing raw training data into any Deep Neural Net (DNN). And why not? Any DNN would (presumably) still give good results.

• However, given the right type of data, a fairly simple model will provide better and faster results than a complex DNN.

• So, whether you are working with Computer Vision, Natural Language Processing, Statistical Modelling, etc., try to preprocess your raw data.
A few measures one can take to get better training data:

• Get your hands on as large a dataset as possible (DNNs are quite data-hungry: more is better).

• Remove any training sample with corrupted data (short texts, highly distorted images, spurious output labels, features with lots of null values, etc.); a sketch of such filtering appears after this list.
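
The kind of filtering described above might look like the following sketch; the field names, label set, and thresholds are assumptions made purely for illustration.

# Illustrative data-cleaning sketch: drop corrupted / low-quality training samples.
# Field names ("text", "label", "features") and thresholds are assumed for demonstration.
def is_valid_sample(sample, min_text_len=5, max_null_ratio=0.2):
    """Keep a sample only if its text is long enough, its label is known,
    and it does not contain too many null feature values."""
    text, label, features = sample["text"], sample["label"], sample["features"]
    if text is None or len(text.split()) < min_text_len:       # short / empty texts
        return False
    if label not in {0, 1}:                                     # spurious labels (binary task assumed)
        return False
    null_ratio = sum(f is None for f in features) / max(len(features), 1)
    return null_ratio <= max_null_ratio                         # too many null values

raw_data = [
    {"text": "a perfectly reasonable training sentence", "label": 1, "features": [0.3, 0.7, 0.1]},
    {"text": "bad", "label": 1, "features": [0.2, None, None]},  # too short, too many nulls
    {"text": "another clean example for the model", "label": 0, "features": [0.9, 0.4, 0.5]},
]
clean_data = [s for s in raw_data if is_valid_sample(s)]
print(f"kept {len(clean_data)} of {len(raw_data)} samples")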

Choose appropriate activation functions:


• Activation functions are one of the vital components of any neural network.

• For years, the sigmoid activation function was the preferred choice. But a sigmoid is inherently cursed by two drawbacks: 1. Saturation of sigmoids at the tails (further causing the vanishing gradient problem), and 2. Outputs that are not zero-centered, which can slow down convergence. ReLU and its variants are therefore the more common choice for hidden layers today.
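
As an illustrative sketch (a toy two-hidden-layer binary classifier; the layer sizes and input dimension are assumptions), the activation choice is simply an argument on each Keras layer:

# Choosing activation functions in Keras: ReLU in hidden layers,
# sigmoid only as the output activation of a binary classifier.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

model = Sequential([
    Input(shape=(20,)),                 # 20 input features (assumed)
    Dense(64, activation="relu"),       # ReLU avoids sigmoid's saturation at the tails
    Dense(64, activation="relu"),
    Dense(1, activation="sigmoid"),     # sigmoid remains fine as a binary output activation
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])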
• Number of Hidden Units and Layers:

• Keeping a larger number of hidden units than the optimal number is generally a safe choice.

• On the other hand, with a smaller number of hidden units (than the optimal number), there are higher chances of underfitting the model.

• Also, when employing unsupervised pre-trained representations, the optimal number of hidden units is generally kept even larger.

• This is because unsupervised representations might contain a lot of irrelevant information.

• By increasing the number of hidden units, the model will have the required flexibility to filter out the most relevant information from these pre-trained representations.
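
A minimal sketch of how depth and width can be exposed as tunable settings; the helper name build_mlp and its default sizes are assumptions for illustration, not from the notes.

# Hypothetical helper: build a fully connected network whose number of hidden
# layers and hidden units are hyperparameters, so they can be varied during tuning.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

def build_mlp(input_dim, hidden_units=128, hidden_layers=2, num_classes=10):
    layers = [Input(shape=(input_dim,))]
    for _ in range(hidden_layers):
        layers.append(Dense(hidden_units, activation="relu"))   # width per hidden layer
    layers.append(Dense(num_classes, activation="softmax"))     # output layer
    model = Sequential(layers)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# A slightly "too wide" model is usually safer than a too-narrow, underfitting one.
wide_model = build_mlp(input_dim=100, hidden_units=256, hidden_layers=3)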
• Hyperparameter Tuning: Grid Search & Random Search:

• Grid Search has been prevalent in classical machine learning.



• But, Grid Search is not at all efficient in finding optimal hyperparameters for DNNs.

• Primarily, because of the time taken by a DNN in trying out different hyperparameter
combinations.

• As the number of hyperparameters keeps on increasing, the computation required for Grid Search also increases exponentially.
• There are two ways to go about it:

1. Based on your prior experience, you can manually tune some common
hyperparameters like learning rate, number of layers, etc.

2. Instead of Grid Search, use Random Search / Random Sampling for choosing optimal hyperparameters (a sketch follows below).
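
An illustrative Random Search sketch: sample a few hyperparameter combinations at random, train briefly, and keep the best. The search space, trial count, synthetic data, and model builder below are all assumptions for demonstration; libraries such as KerasTuner provide ready-made tuners of this kind.

# Illustrative Random Search over learning rate, width, and depth.
import random
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

search_space = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "hidden_units": [32, 64, 128, 256],
    "hidden_layers": [1, 2, 3],
}

# Tiny synthetic dataset purely for demonstration.
x_train = np.random.rand(500, 20).astype("float32")
y_train = np.random.randint(0, 2, size=(500,))
x_val = np.random.rand(100, 20).astype("float32")
y_val = np.random.randint(0, 2, size=(100,))

def build_model(hp):
    layers = [Input(shape=(20,))]
    for _ in range(hp["hidden_layers"]):
        layers.append(Dense(hp["hidden_units"], activation="relu"))
    layers.append(Dense(1, activation="sigmoid"))
    model = Sequential(layers)
    model.compile(optimizer=tf.keras.optimizers.Adam(hp["learning_rate"]),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

best_acc, best_hp = 0.0, None
for trial in range(5):                                    # number of random trials (assumed)
    hp = {k: random.choice(v) for k, v in search_space.items()}
    model = build_model(hp)
    model.fit(x_train, y_train, epochs=3, verbose=0)      # short training per trial
    _, acc = model.evaluate(x_val, y_val, verbose=0)
    if acc > best_acc:
        best_acc, best_hp = acc, hp
print("best validation accuracy:", best_acc, "with", best_hp)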
• Number of Epochs:

• “Training a Deep Learning model for multiple epochs will result in a better model” - we have heard it a couple of times, but how do we quantify “many”?
