
Machine Learning Notes

Phenil Buch
[email protected]

17 July, 2019


Contents I

1 Contents

2 Python

3 Mathematics for Deep Learning

4 Deep Learning Concepts

5 My Projects


Python Introduction I

• Decorators in Python are used to modify or inject code in functions or classes. Using decorators, you can wrap a class or function method call so that a piece of code is executed before or after the original code runs. Decorators can be used to check for permissions, modify or track the arguments passed to a method, log the calls to a specific method, etc. (see the sketch below).
• Two common tools for finding bugs in Python are Pylint and Pychecker. Pylint verifies whether a module satisfies the coding standards, while Pychecker is a static analysis tool that helps find bugs in source code.
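As a small illustration, here is a minimal logging decorator; the name log_calls and the timed function are purely illustrative:

```python
import functools
import time

def log_calls(func):
    """Wrap a function so that each call is timed and printed."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        print(f"{func.__name__}{args} took {time.perf_counter() - start:.6f}s")
        return result
    return wrapper

@log_calls
def add(a, b):
    return a + b

add(2, 3)  # prints a line like: add(2, 3) took 0.000001s
```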


Python Introduction II

• PEP 8 is a set of coding guidelines for the Python language that programmers can use to write readable code, which makes it easier for other users to work with.
• Pylab is a package that combines NumPy, SciPy, and Matplotlib into a single namespace.
• The difference between a list and a tuple is that a list is mutable while a tuple is not. A tuple can be hashed, e.g. used as a key for a dictionary.
• List comprehension is a concise way of creating a list while performing some operation on each element of an iterable (see the example below).
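A quick illustration of the last two points, using a toy comprehension and a tuple-keyed dictionary:

```python
# List comprehension: squares of the even numbers from 0 to 9
squares = [x * x for x in range(10) if x % 2 == 0]   # [0, 4, 16, 36, 64]

# A tuple is hashable, so it can be used as a dictionary key; a list cannot.
distances = {(0, 0): 0.0, (3, 4): 5.0}
print(squares, distances[(3, 4)])
```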


Python Introduction III


• If the data is stored in HDF5 format and you want to find out how it is structured, you can list the names of the HDF5 keys with hf.keys(), where the HDF5 file has been loaded by h5py as hf (see the sketch below).
• Lambda is a single-expression anonymous function, often used as an inline function.
• unittest is a testing framework in Python. It supports sharing of setup and shutdown code for tests, test automation, aggregation of tests into collections, etc.
• A Python documentation string is known as a docstring; it is a way of documenting Python functions, modules, and classes.
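A minimal sketch of both ideas; the file name data.h5 is purely illustrative:

```python
import h5py

# Inspect the structure of an HDF5 file loaded with h5py
with h5py.File("data.h5", "r") as hf:
    print(list(hf.keys()))          # names of the top-level groups/datasets

# Lambda as an inline function, e.g. as a sort key
points = [(2, 5), (1, 9), (3, 1)]
points.sort(key=lambda p: p[1])     # sort by the second element
print(points)                       # [(3, 1), (2, 5), (1, 9)]
```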


Pandas I

• There are two data structures supported by the pandas library, Series and DataFrame. Both are built on top of NumPy. A Series is the one-dimensional data structure in pandas and a DataFrame is the two-dimensional data structure (see the sketch below).
• Some of the features of the pandas library are:
• Data alignment
• Memory efficiency
• Data manipulation and data analysis
• Reshaping, merge and join
• Time series
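A minimal sketch of both structures; the column names and values are made up:

```python
import pandas as pd

# Series: one-dimensional, labelled data
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# DataFrame: two-dimensional, built here from a dictionary of columns
df = pd.DataFrame({
    "city": ["Mumbai", "Pune", "Surat"],
    "temp_c": [31.0, 28.5, 33.2],
})
print(s["b"], df["temp_c"].mean())
```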


Pandas II

• A categorical variable takes on a limited, and usually fixed, number of possible values.
• Some ways by which we can create a DataFrame: from lists, dictionaries, and arrays.
• A time series is an ordered sequence of data which basically represents how some quantity changes over time. Pandas contains extensive capabilities and features for working with time series data for all domains. Pandas supports:
• Parsing time series information from various sources and formats
• Manipulating and converting datetimes with timezone information


Pandas III

• Resampling or converting a time series to a particular frequency
• Performing date and time arithmetic with absolute or relative time increments


Scikit-Learn I

• Data pre-processing:
• Standardization - StandardScaler - subtract the mean and divide by the standard deviation, a.k.a. the Z-score. Always standardize after generating polynomial features (see the sketch below).
• Normalization - Normalizer - rescales each sample to unit norm.
• Discretization and binarization - KBinsDiscretizer and Binarizer - turn continuous values into categorical or binary ones.
• Encoding categorical variables - LabelEncoder
• Imputing missing values - SimpleImputer (formerly Imputer) with a strategy such as the mean
• Generating polynomial features and word vectors - PolynomialFeatures, CountVectorizer, TfidfVectorizer
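A minimal sketch of the standardization-after-polynomial-features pattern:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Generate polynomial features first, then standardize the expanded matrix
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
X_std = StandardScaler().fit_transform(X_poly)   # zero mean, unit variance per column
print(X_std.mean(axis=0), X_std.std(axis=0))
```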


Scikit-Learn II
• Supervised learning estimators - Naive Bayes classifier, K-Nearest Neighbors, Support Vector Machines, Linear Regression (a short example follows below)
• Unsupervised learning estimators - Principal Component Analysis (PCA) and K-means, for unsupervised learning tasks like clustering, dimensionality reduction, embedding, and representing the data with a distribution (density estimation)
• Classification metrics - accuracy score, classification report (precision, recall, F1, support), confusion matrix
• Regression metrics - Mean Absolute Error (MAE), Mean Squared Error (MSE), R-squared
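A minimal end-to-end sketch using a built-in dataset, a supervised estimator, and the classification metrics listed above:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
y_pred = clf.predict(X_test)

print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
```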


Scikit-Learn III

• Clustering metrics - Adjusted Rand Index, homogeneity, V-measure
• Model tuning - grid search with GridSearchCV, randomized parameter optimization with RandomizedSearchCV
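A minimal GridSearchCV sketch; the parameter grid is arbitrary:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```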


Keras I

• Keras is a powerful and easy-to-use deep learning library for Theano and TensorFlow that provides a high-level neural networks API to develop and evaluate deep learning models.
• Data preprocessing - sequence padding with sequence.pad_sequences, one-hot encoding with to_categorical
• Model architecture:
• Sequential
• Multi-Layer Perceptron - binary classification (sigmoid activation), multi-class classification (softmax activation), regression (no activation specified); other activations - ReLU, LeakyReLU, tanh (see the sketch below)
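A minimal Sequential MLP sketch using the tf.keras API; the layer sizes and input dimension are arbitrary:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Binary classifier on 20 input features
model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(20,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # sigmoid output for binary classification
])
model.summary()
```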


Keras II

• CNN - Convolution is a mathematical operation where you summarize a tensor, matrix, or vector into a smaller one. Conv1D - 1-dimensional signals such as audio data, Conv2D - 2-dimensional image data, Conv3D - 3-dimensional video data. Same padding and valid padding. Max pooling and average pooling. Strides. The output produced by applying a learned filter is called a feature map. Fully connected layer.
• Input shape for Conv1D - (batch_size, W, channels). Example: a 1-second stereo voice signal sampled at 44100 Hz has shape (batch_size, 44100, 2).
• Input shape for Conv2D - (batch_size, H, W, channels). Example: a 32x32 RGB image has shape (batch_size, 32, 32, 3) (see the sketch below).
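A small Conv2D sketch for 32x32 RGB inputs (channels-last layout, arbitrary layer sizes):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, kernel_size=3, padding="same", activation="relu",
                  input_shape=(32, 32, 3)),   # (H, W, channels); the batch dimension is implicit
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
print(model.output_shape)   # (None, 10)
```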


Keras III

• Input shape for Conv3D - (batch_size, D, H, W, channels). Example: a 1-second video of 32x32 RGB frames at 24 fps has shape (batch_size, 24, 32, 32, 3).
• RNN - Embedding layer, LSTM layer (can return sequences or a vector), GRU, encoder-decoder model with an attention layer, Bidirectional LSTM wrapper, TimeDistributed Dense layer. Model types - one-to-one, many-to-one, one-to-many, many-to-many.
• Model.compile - specifies the optimizer, the loss function, and the metrics:


Keras IV

• Optimizer (includes the learning rate and decay; its job is to reduce the loss function) - gradient descent (batch, mini-batch, stochastic), SGD with momentum, Nesterov Accelerated Gradient (NAG), Adaptive Gradient (AdaGrad) which reduces the need to fine-tune the learning rate, Root Mean Square Propagation (RMSProp) which uses moving averages, Adaptive Moment Estimation (Adam) which combines AdaGrad and RMSProp, and Nadam which combines NAG and Adam
• Loss function - Mean Absolute Error (MAE) a.k.a. L1, Mean Squared Error (MSE) a.k.a. L2, negative log likelihood a.k.a. log loss a.k.a. cross-entropy, KL divergence
• Metrics - accuracy for classification, Mean Absolute Error for regression
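A minimal compile sketch with an explicit optimizer, loss, and metric; the model and hyperparameters are illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(32, activation="relu", input_shape=(10,)),
    layers.Dense(3, activation="softmax"),
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),   # optimizer with its learning rate
    loss="categorical_crossentropy",                       # cross-entropy loss for multi-class
    metrics=["accuracy"],
)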


Keras V

• Model.fit - epochs; verbose (0 for no progress, 1 for an animated progress bar, 2 for summarized progress); batch size; callbacks. A callback is a set of functions to be applied at given stages of the training procedure. Built-in callbacks include BaseLogger, History, ModelCheckpoint, EarlyStopping, RemoteMonitor, TensorBoard, LambdaCallback, and custom callbacks.
• Model.evaluate - returns the score (loss and metrics)
• Model.predict and Model.predict_classes
• Model.save and load_model
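A sketch tying fit, callbacks, evaluate, and save together; the data is synthetic and the file names are illustrative:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.randn(1000, 10)
y = keras.utils.to_categorical(np.random.randint(0, 3, size=1000), num_classes=3)

model = keras.Sequential([
    layers.Dense(32, activation="relu", input_shape=(10,)),
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=3),
    keras.callbacks.ModelCheckpoint("best_model.h5", save_best_only=True),
]
history = model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2,
                    verbose=2, callbacks=callbacks)

loss, acc = model.evaluate(X, y, verbose=0)
model.save("final_model.h5")
```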


TensorFlow I

• TensorFlow is basically a software library for numerical computation using data flow graphs, where nodes in the graph represent mathematical operations and edges represent the multidimensional data arrays (called tensors) communicated between them. The tensor is the central unit of data in TensorFlow.
• TensorFlow Core is the low-level API of TensorFlow, recommended for ML researchers. tf.contrib.learn is an example of a high-level API.


TensorFlow II
• Step 1: Create a computational graph using variables and placeholders. By creating a computational graph, we mean defining the nodes. TensorFlow provides different types of nodes for a variety of tasks. Each node takes zero or more tensors as inputs and produces a tensor as an output.
• TensorFlow has Variable nodes which can hold mutable data. They are mainly used to hold and update the parameters of a training model. A graph can be parameterized to accept external inputs, known as placeholders. A placeholder is a promise to provide a value later.
• Step 2: In order to run the computational graph, we need to create a session. We can invoke the run method of the session object to perform computations on any node (see the sketch below).
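A minimal graph-and-session sketch in the TensorFlow 1.x style described here (in TensorFlow 2.x the same calls live under tf.compat.v1):

```python
import tensorflow as tf

# Step 1: build the computational graph
W = tf.Variable([0.5], dtype=tf.float32)    # trainable parameter
b = tf.Variable([-0.5], dtype=tf.float32)
x = tf.placeholder(tf.float32)              # a promise to provide values later
linear_model = W * x + b

# Step 2: run the graph inside a session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(linear_model, feed_dict={x: [1.0, 2.0, 3.0]}))
```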

TensorFlow III

• TensorBoard is a stack of web apps: a suite of visualization tools that lets you visualize the graph and plot quantitative metrics. It also helps in inspecting and understanding TensorFlow runs.
• Using the tf.convert_to_tensor() operation, we can convert Python objects like lists and NumPy arrays into tensors (see the example below).
• TensorFlow is used for image, video, text, and speech related applications.
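A one-line illustration of the conversion:

```python
import numpy as np
import tensorflow as tf

t = tf.convert_to_tensor(np.arange(6).reshape(2, 3), dtype=tf.float32)
print(t.shape)   # a 2x3 tensor
```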


PyTorch I

• Dynamic computation graphs - Instead of predefined graphs with specific functionalities, PyTorch provides a framework to build computational graphs as we go, and even change them during runtime. This is valuable for situations where we don't know how much memory is going to be required for creating a neural network.
• Autograd module - PyTorch uses a technique called automatic differentiation: a recorder records the operations that are performed and then replays them backward to compute the gradients. Because the graph needed for differentiation is recorded during the forward pass itself, the parameters' gradients come out of a single backward pass, which saves time in each epoch.


PyTorch II

• Optim module - torch.optim is a module that implements various optimization algorithms used for building neural networks.
• nn module - torch.nn is PyTorch's high-level neural network API, playing a role similar to the one Keras plays for TensorFlow (see the sketch below).
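A small training-loop sketch tying nn, autograd, and optim together; the data and layer sizes are made up:

```python
import torch
from torch import nn, optim

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

x = torch.randn(32, 10)          # synthetic batch
y = torch.randn(32, 1)

for step in range(100):
    optimizer.zero_grad()        # clear the previous gradients
    loss = criterion(model(x), y)
    loss.backward()              # autograd replays the recorded ops backward
    optimizer.step()             # update the parameters
```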


TensorFlow vs. PyTorch I

• Both frameworks operate on tensors and view any model as a directed acyclic graph (DAG), but they differ drastically in how you can define it. In TensorFlow you define the graph statically before a model can run. In PyTorch things are more imperative and dynamic: you can define, change, and execute nodes as you go, with no special session interfaces or placeholders.
• Since the computation graph in PyTorch is defined at runtime, you can use your favorite Python debugging tools such as pdb, ipdb, the PyCharm debugger, or old trusty print statements.


TensorFlow vs. PyTorch II

• The TensorBoard competitor on the PyTorch side is visdom. Integrations with TensorBoard also exist, and you are free to use standard plotting tools such as matplotlib and seaborn.
• If we start talking about deployment, TensorFlow is a clear winner for now: it has TensorFlow Serving, a framework to deploy your models on a specialized gRPC server. Mobile is also supported.
• For deployment with PyTorch, we may use Flask or another alternative to code up a REST API on top of the model. This could be done with TensorFlow models as well.


OpenCV
• Face recognition algorithms:
• Haar cascades
• Eigenfaces
• Fisherfaces
• ML algorithms available - Normal Bayes classifier, K-Nearest Neighbors, Support Vector Machines, decision trees, boosting, gradient boosted trees, random trees, extremely randomized trees
• Image filters - averaging, Gaussian filtering, median filtering, bilateral filtering
• Video filters - color conversion, thresholding, smoothing, morphology, gradients, Canny edge detection, contours, histograms
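A small sketch of the filtering and edge-detection steps listed above; the file names are illustrative:

```python
import cv2

img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)          # load as greyscale
blurred = cv2.GaussianBlur(img, (5, 5), 0)                    # Gaussian filtering
edges = cv2.Canny(blurred, threshold1=100, threshold2=200)    # Canny edge detection
cv2.imwrite("edges.jpg", edges)
```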

NLTK, spaCy and Gensim I

• NLTK supports classification, tokenization, stemming, lemmatization, POS tagging, parsing, and semantic reasoning functionalities.
• It also has functionality for n-grams and collocations, tree models, text chunking, and named-entity recognition.
• spaCy focuses on providing software for production usage, unlike NLTK, which is widely used for teaching and research.
• spaCy also supports deep learning workflows that allow connecting statistical models trained by popular machine learning libraries like TensorFlow, Keras, Scikit-learn, or PyTorch.

NLTK, spaCy and Gensim II


• spaCy has pre-trained word vectors, part-of-speech tagging, labelled dependency parsing, syntax-driven sentence segmentation, text classification, built-in visualizers for syntax and named entities, and deep learning integration.
• Gensim is a great package for processing texts, working with word vector models (such as Word2Vec, FastText, etc.) and for building topic models.
• Topic modeling is a technique to extract the underlying topics from large volumes of text. Gensim provides algorithms like Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis & Indexing (LSA & LSI) and the necessary sophistication to build high-quality topic models (see the sketch below).
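A tiny LDA sketch with Gensim on a toy, already tokenized corpus (the documents are made up):

```python
from gensim import corpora, models

texts = [
    ["machine", "learning", "model", "training"],
    ["neural", "network", "deep", "learning"],
    ["stock", "market", "price", "time", "series"],
]

dictionary = corpora.Dictionary(texts)                 # map tokens to ids
corpus = [dictionary.doc2bow(doc) for doc in texts]    # bag-of-words representation

lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
for topic in lda.print_topics():
    print(topic)
```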


NLTK, spaCy and Gensim III

• A significant advantage of Gensim is that it lets you handle large text files without having to load the entire file into memory.
• Gensim also helps in computing word vector similarity metrics like cosine similarity and soft cosine similarity.
• With Gensim, text can be summarized and important keywords can be extracted from a paragraph. You can also perform Non-negative Matrix Factorization (NMF).


Linear Algebra I

• Scalars: A scalar is just a single number, in contrast to most of the other objects studied in linear algebra, which are usually arrays of multiple numbers.
• Vectors: A vector is an array of numbers arranged in order. We can identify each individual number by its index in that ordering.
• Matrices: A matrix is a 2-D array of numbers where each number is identified by two indices.
• Tensors: In some cases we will need an array with more than two axes. An array of numbers arranged on a regular grid with a variable number of axes is known as a tensor.


Linear Algebra II
• We usually measure the size of vectors using a function called a norm.
• The most widely used kind of matrix decomposition is called eigendecomposition, in which we decompose a matrix into a set of eigenvectors and eigenvalues. Eigendecomposition is only defined for square matrices; eigenvectors are vectors and eigenvalues are scalars.
• The Singular Value Decomposition (SVD) provides another way to factorize a matrix, into singular vectors and singular values.
• The Moore-Penrose pseudoinverse - normally, matrix inversion is not defined for matrices that are not square; the pseudoinverse generalizes the inverse to such matrices (see the sketch below).
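A quick NumPy illustration of these decompositions and the pseudoinverse:

```python
import numpy as np

# Norm of a vector
v = np.array([3.0, 4.0])
print(np.linalg.norm(v))                    # 5.0

# Eigendecomposition (square matrices only)
A = np.array([[2.0, 0.0], [0.0, 3.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Singular value decomposition works for any shape
B = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
U, s, Vt = np.linalg.svd(B)

# Moore-Penrose pseudoinverse of a non-square matrix
B_pinv = np.linalg.pinv(B)
print(B_pinv.shape)                         # (3, 2)
```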


Probability I

• The set of all possible outcomes is called the sample space.
• A random variable x is a variable which randomly takes on values from a sample space.
• To describe the likelihood of each possible value of a random variable x, we specify a probability distribution.
• Discrete random variables are described with a Probability Mass Function (PMF). A PMF maps each value in the variable's sample space to a probability.
• One such PMF is the uniform distribution over n possible outcomes, where each outcome is equally likely and P(x) = 1/n.


Probability II

• Another common discrete distribution is the Bernoulli. A Bernoulli distribution specifies the probability for a random variable which can take on one of two values: 1/0, heads/tails, true/false, etc. The two probabilities are given by p and 1 - p.
• Continuous random variables are described by Probability Density Functions (PDF).
• The most famous continuous distribution is the Gaussian distribution, a.k.a. the normal distribution. The Gaussian distribution (colloquially called the bell curve) can be used to model several natural phenomena.


Probability III

• The Gaussian distribution is parameterized by two values: the mean (mu) and the variance (sigma squared). The mean specifies the center of the distribution, and the variance specifies the width of the distribution.
• The probability of sampling one specific number from a continuous distribution is 0, because there are infinitely many possible values in a continuous distribution. What we actually want is the probability of a range of values within the continuous distribution (see the sketch below).
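A short SciPy illustration: the probability mass inside a range of a standard normal distribution:

```python
from scipy.stats import norm

dist = norm(loc=0.0, scale=1.0)   # mean 0, standard deviation 1

# A single point has probability 0; a range has positive probability.
p_within_one_sigma = dist.cdf(1.0) - dist.cdf(-1.0)
print(p_within_one_sigma)          # roughly 0.68
```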


Probability IV
• A distribution over multiple random variables is called a Joint Probability Distribution. We can write a collection of random variables as a vector x. A joint distribution over x specifies the probability of any particular setting of all the random variables contained in x.
• A Conditional Probability Distribution gives the probability of an event given that another event has already been observed.
• The Chain Rule and Bayes' Rule give the relationship between conditional probability and joint probability: P(x, y) = P(x | y) P(y) and P(x | y) = P(y | x) P(x) / P(y) (a small worked example follows below).
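A worked Bayes' rule example with made-up numbers: a test with 99% sensitivity, 95% specificity, and 1% prevalence:

```python
p_pos_given_disease = 0.99          # P(positive | disease)
p_pos_given_healthy = 0.05          # P(positive | healthy)
p_disease = 0.01                    # P(disease)

# Total probability of a positive test
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes' rule: P(disease | positive)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)          # roughly 0.17
```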


Probability V

• If P(x | y) = P(x), this implies variable independence. Two variables x and y are said to be independent if P(x, y) = P(x) P(y).
• A similar concept is that of conditional independence. Two variables x and y are called conditionally independent given another variable z if P(x, y | z) = P(x | z) P(y | z).
• Variance is a measure of how much random values vary from their mean. Covariance is a measure of how linearly related two random variables (or functions on random variables) are with each other.


Probability VI

• Covariance is a measure used to indicate the extent to which two random variables change in tandem, while correlation is a measure of how strongly two random variables are related; correlation is essentially a normalized covariance.
• Probability distributions:
• Bernoulli: models the outcome of coin flips and other binary events
• Binomial: models a series of Bernoulli trials (a series of coin flips, etc.)
• Geometric: models how many flips are necessary before you get a success


Probability VII

• Multinomial: a generalization of the Binomial to more than two outcomes (like a die roll)
• Poisson: models the number of events that occur in a certain interval
• These well-formed distributions are more like templates than anything else. The true distribution of your data is probably not so nice and may even be changing over time.
• We can learn a mapping from X to Y in various ways. First, you could learn P(Y | X), that is to say, a probability distribution over possible values of Y given that you have observed a new sample X. Machine learning algorithms that find this distribution are called Discriminative.


Probability VIII

• Alternatively, we could instead try to learn P(X | Y), the probability distribution over inputs given labels. Algorithms for doing this are called Generative.


Calculus

• A Gradient is a vector that stores the partial derivatives of a multivariable function. It helps us calculate the slope at a specific point for functions with multiple independent variables.
• A gradient measures how much the output of a function changes if you change the inputs a little bit.
• In functions with 2 or more variables, a Partial Derivative is the derivative with respect to one variable while all the others are held constant. If we change x but hold all other variables constant, how does f(x, z) change? That is one partial derivative (see the sketch below).
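A finite-difference sketch of the partial derivatives of a toy two-variable function:

```python
import numpy as np

def f(x, z):
    return x ** 2 + 3 * x * z

def numerical_gradient(func, x, z, eps=1e-6):
    """Approximate the partial derivatives of func at (x, z) with central differences."""
    df_dx = (func(x + eps, z) - func(x - eps, z)) / (2 * eps)   # hold z constant
    df_dz = (func(x, z + eps) - func(x, z - eps)) / (2 * eps)   # hold x constant
    return np.array([df_dx, df_dz])

print(numerical_gradient(f, 2.0, 1.0))   # analytically [2x + 3z, 3x] = [7, 6]
```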


CNN I

• Convolution layer (CONV) - uses filters that perform convolution operations as they scan the input with respect to its dimensions. Its hyperparameters include the filter size and stride. The resulting output is called a feature map or activation map.
• Pooling layer (POOL) - a downsampling operation, typically applied after a convolution layer.
• Fully Connected (FC) - operates on a flattened input where each input is connected to all neurons. If present, FC layers are usually found towards the end of CNN architectures.
• Padding modes - valid, same, and full


CNN II

• Batch Normalization - we normalize the input layer by adjusting and scaling the activations to speed up learning. If the input layer benefits from it, why not do the same thing for the values in the hidden layers, which are changing all the time? Batch normalization allows each layer of a network to learn a little bit more independently of the other layers.
• Object Detection - 3 types - image classification (traditional CNN), classification with localization (simplified You Only Look Once YOLO, Regions with CNN Features R-CNN), object detection (YOLO, Faster R-CNN).


CNN III
• R-CNN works with regions of the image which probably contain objects. YOLO looks at the entire image at once and divides it into grids.
• Object Detection Algorithms - bounding box detection or landmark detection; anchor boxes, Intersection over Union (IoU), and non-max suppression (removal of duplicate bounding boxes using IoU; a short IoU sketch appears at the end of the next section).
• Face Verification and Face Recognition - one-shot learning (learn a similarity function for verification with a limited dataset), Siamese network (learning how to encode images in order to quantify how different two images are, somewhat like word embeddings but for images), triplet loss (a loss function computed on the embedding representations of a triplet of images: anchor, positive, and negative).


CNN IV

• ResNets - skip connections, or shortcuts, to jump over some layers. The Residual Network architecture uses residual blocks with a high number of layers, meant to decrease the training error.
• Inception Networks - using multiple filter sizes together. This architecture uses inception modules and aims at trying different convolutions in order to increase performance through feature diversification.
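As promised above, a short IoU sketch; the boxes are given in (x1, y1, x2, y2) format with made-up coordinates:

```python
def iou(box_a, box_b):
    """Intersection over Union for two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))   # 1 / 7, roughly 0.14
```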


RNN I

• The Vanishing and Exploding Gradient phenomena are often encountered in the context of RNNs. They happen because it is difficult to capture long-term dependencies: the multiplicative gradient can be exponentially decreasing or increasing with respect to the number of layers (time steps).
• Gradient clipping - a technique used to cope with the exploding gradient problem sometimes encountered when performing backpropagation. By capping the maximum value of the gradient, this phenomenon is controlled in practice (see the sketch below).
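In Keras, gradient clipping can be requested on the optimizer itself; a minimal sketch:

```python
from tensorflow import keras

# clipnorm rescales gradients whose norm exceeds 1.0;
# clipvalue would instead clip each gradient component element-wise.
optimizer = keras.optimizers.SGD(learning_rate=0.01, clipnorm=1.0)
```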


RNN II

• GRU/LSTM - Gated Recurrent Units (GRU) and Long Short-Term Memory units (LSTM) deal with the vanishing gradient problem encountered by traditional RNNs, with LSTM being a generalization of GRU (a small sketch follows below).
• Word Embeddings - Word2Vec: the skip-gram model and the CBOW (Continuous Bag of Words) model; GloVe; negative sampling.
• Machine Translation - beam search, beam width, length normalization, BLEU score, error analysis to tell a faulty beam search from a faulty RNN.
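A minimal many-to-one sketch combining an Embedding layer and a Bidirectional LSTM in tf.keras; the vocabulary size and dimensions are arbitrary:

```python
from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 10000   # illustrative vocabulary size

model = keras.Sequential([
    layers.Embedding(vocab_size, 64),        # token ids -> dense vectors
    layers.Bidirectional(layers.LSTM(32)),   # returns a single vector (not sequences)
    layers.Dense(1, activation="sigmoid"),   # e.g. a sentiment score
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```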


RNN III

• Attention model - this model allows an RNN to pay attention to specific parts of the input that are considered important, which improves the performance of the resulting model in practice. The attention weights quantify the amount of attention that the output y<t> should pay to the activation a<t'>.


TimeSeries I
• A time series can be taken on any variable that changes over time. In investing, it is common to use a time series to track the price of a security over time.
• Time series analysis can be used to examine how the changes associated with the chosen data point compare to shifts in other variables over the same time period.
• Time intervals between data points can be either regular or irregular.
• The data typically arrives in time order but may need to be re-ordered properly as part of data cleaning.
• Dependence: refers to the association of two observations of the same variable at prior time points.

TimeSeries II

• Stationarity: the mean value of the series remains constant over a time period; if past effects accumulate and the values increase toward infinity, then stationarity is not met.
• Differencing: used to make the series stationary, to de-trend, and to control the auto-correlations; however, some time series analyses do not require differencing, and over-differenced series can produce inaccurate estimates.
• Specification: may involve the testing of the linear or non-linear relationships of dependent variables by using models such as ARIMA, ARCH, GARCH, VAR, co-integration, etc.


TimeSeries III
• ARIMA stands for AutoRegressive Integrated Moving Average. This method is also known as the Box-Jenkins method (a short ARIMA sketch follows after this list).
• Tools for investigating time-series data include:
• Consideration of the autocorrelation function and the spectral density function
• Cross-correlation functions and cross-spectral density functions
• Performing a Fourier transform to investigate the series in the frequency domain
• Principal component analysis (or empirical orthogonal function analysis)
• Artificial neural networks


TimeSeries IV

• Support vector machines
• Fuzzy logic
• Gaussian processes
• Hidden Markov models
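A short ARIMA fit-and-forecast sketch on synthetic data, assuming statsmodels >= 0.12 for the location of the ARIMA class:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic monthly series, purely for illustration
idx = pd.date_range("2015-01-01", periods=60, freq="M")
series = pd.Series(np.cumsum(np.random.randn(60)) + 100, index=idx)

model = ARIMA(series, order=(1, 1, 1))   # (p, d, q): AR order, differencing, MA order
result = model.fit()
print(result.forecast(steps=5))          # forecast the next five periods
```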


Devanagari Handwritten Character Recognition I

• The dataset was made up of 46 character classes and had 78,200 total images of size 32 x 32.
• Created a NumPy feature matrix X from the images using PIL, and a label matrix Y from the folder names in the dataset's directory listing.
• Performed a train/test split and then normalized the training and testing feature matrices.
• Created and compiled three different traditional neural networks using different numbers of hidden layers and neurons. All three models used the Adam optimizer and the sparse categorical cross-entropy loss function.


Devanagari Handwritten Character Recognition II

• Trained all models for 5 epochs and visualized the comparison of model accuracies.
• Created a fourth, CNN-based model with 9 Conv2D layers, 3 MaxPool2D layers, and 2 Dense layers with softmax at the end, and trained it for 10 epochs using the Adam optimizer.
• Learned that if the Y labels had been converted to one-hot vectors, then simple categorical cross-entropy could have been used as the loss function instead of sparse categorical cross-entropy.


Facial Expression Recognition I

• Used the Facial Expression Recognition FER2013 Kaggle challenge dataset, which is in CSV format: 28,700 training images, 3,570 cross-validation images, and 3,570 testing images, all of dimensions 48 x 48 x 1 (greyscale).
• Recreated the images from the pixel intensity values to visualize them for reference purposes and as a fun learning topic.
• Created NumPy arrays from the pixel intensity values and reshaped the training and testing arrays.
• Created and compiled a CNN architecture with 3 Conv2D layers with same padding, 3 MaxPool2D layers, and 2 Dense layers with softmax at the end.


Facial Expression Recognition II

• Performed batch normalization and used LeakyReLU in the CONV layers. Used categorical cross-entropy as the loss function and the Adam optimizer.
• Trained for 10 epochs using a GPU and obtained 51% accuracy.
• Added more CONV layers and used ReLU instead of LeakyReLU, obtaining 57.68% accuracy at the end, which is equivalent to 16th place on the private leaderboard of the competition.


Thank You
