Machine Learning Essentials
Part 2: Artificial Neural Networks
Lior King
Lior.King@gmail.com
Previously on “Machine Learning Essentials...”
Linear regression
Finding the relation between the age
and the salary.
Predicting the salary for any given age
[Figure: historical data points plotted as salary (dependent) versus experience.]
Minimize the error
The error (or residual) is the offset of the actual value of the dependent variable from the value predicted by the line.
The goal of any regression is to minimize the error for the training data and to FIND THE OPTIMAL LINE (or curve in case of logistic regression).
[Figure: historical data points and the fitted line, salary (dependent) versus experience (independent); the error is the vertical distance between a data point and the line.]
Minimize the error – sum of square diffs
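The error = $\sum_{i=1}^{N} (y_i - \hat{y}_i)^2$
where $y_i$ is the actual value of data point $i$ and $\hat{y}_i$ is the value predicted by the line.
[Figure: the error shown as the gap between y and $\hat{y}$ for one data point, plotted against experience.]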
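The sum of square diffs above can be sketched in a few lines of Python. This is a minimal illustration only; the function name and the salary figures are hypothetical and are not taken from the slides.

```python
def sum_of_squared_errors(actual, predicted):
    """Return the sum over all points of (y_i - y_hat_i) squared."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))


# Hypothetical salaries (in thousands) and the values a candidate line predicts.
actual_salaries = [30.0, 35.0, 42.0, 50.0]
predicted_salaries = [31.0, 36.0, 41.0, 48.0]

sse = sum_of_squared_errors(actual_salaries, predicted_salaries)
print(sse)
```

A line that fits the data better gives a smaller value, which is what the minimization is after.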
Minimize the error with Stochastic Gradient
Descent (SGD)
Error = $\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$
N -> number of historical data points
1. Initialize some value for the slope and intercept.
2. Find the current value of the error function.
3. Find the slope of the error function at the current point (its partial derivatives with respect to the slope and the intercept) and move slightly downhill, in the direction that reduces the error.
4. Repeat until you reach a minimum OR stop after a certain number of iterations.
[Figure: the error plotted as a surface over the slope and the intercept.]
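The four steps can be sketched as follows. This is a minimal sketch under stated assumptions: it uses every data point on each iteration (plain gradient descent), whereas a stochastic variant would use a random sample or mini-batch per step. The data, learning rate and iteration count are hypothetical.

```python
def mean_squared_error(xs, ys, slope, intercept):
    """Step 2: the current value of the error function."""
    n = len(xs)
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / n


def gradient_descent(xs, ys, learning_rate=0.01, iterations=5000):
    slope, intercept = 0.0, 0.0          # step 1: initial values
    n = len(xs)
    for _ in range(iterations):          # step 4: stop after a fixed number of iterations
        # Step 3: partial derivatives of the mean squared error.
        residuals = [(slope * x + intercept) - y for x, y in zip(xs, ys)]
        d_slope = (2.0 / n) * sum(r * x for r, x in zip(residuals, xs))
        d_intercept = (2.0 / n) * sum(residuals)
        # Move slightly downhill.
        slope -= learning_rate * d_slope
        intercept -= learning_rate * d_intercept
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
slope, intercept = gradient_descent(xs, ys)
print(slope, intercept, mean_squared_error(xs, ys, slope, intercept))
```

With a learning rate that is too large the updates overshoot and the error grows instead of shrinking, so the step size usually needs some tuning.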
Minimize the error
The iterative SGD process gradually changes the slope and the intercept until the error is minimal.
[Figure: historical data points and the fitted line, salary (dependent) versus experience.]
Multiple Linear Regression
• Simple linear regression:
$Y = b_0 + b_1 x_1$
• Multiple linear regression:
$Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$
Important note:
Exclude variables that “mess up” the prediction and keep only the ones that actually help predict the desired result.
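As a small illustration of the multiple linear regression formula, the prediction is the intercept plus one coefficient times each variable. The coefficients and feature values below are hypothetical.

```python
def predict(intercept, coefficients, features):
    """Y = b0 + b1*x1 + b2*x2 + ... + bn*xn."""
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Hypothetical model: b0 = 20, b1 = 2 (per year of experience), b2 = 0.5 (per year of age).
salary = predict(20.0, [2.0, 0.5], [5.0, 30.0])
print(salary)
```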
Polynomial Linear Regression
Simple linear regression:
$Y = b_0 + b_1 x_1$
Polynomial linear regression:
$Y = b_0 + b_1 x_1 + b_2 x_1^2 + \dots + b_n x_1^n$
Quadratic: degree = 2
Cubic: degree = 3
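The polynomial form uses a single variable raised to increasing powers. A minimal sketch, with hypothetical coefficients:

```python
def predict_polynomial(coefficients, x):
    """coefficients[k] is b_k, the multiplier of x to the power k."""
    return sum(b * x ** k for k, b in enumerate(coefficients))


quadratic = predict_polynomial([1.0, 2.0, 3.0], 2.0)      # degree 2: 1 + 2*2 + 3*4
cubic = predict_polynomial([1.0, 0.0, 0.0, 1.0], 2.0)     # degree 3: 1 + 8
print(quadratic, cubic)
```

The model is still called linear because it is linear in the coefficients b, even though it is not linear in x.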
Artificial Neural Networks - ANN
“Traditional” ML vs. “Representation” ML
• “Traditional” ML based systems rely on experts to decide what features to pay
attention to.
• “Representation” ML based systems figure out by themselves what features to pay
attention to.
• The most common representation ML algorithm is the Artificial Neural Network (ANN).
• ANNs are commonly used for:
• Image/video/audio processing
• Speech recognition
• Natural language processing (NLP)
• Games
The neuron
[Figure: a biological neuron, with its dendrites, axon and synapse labeled.]
Artificial Neural Networks - ANN
• Inspired by the neurons in the human brain.
• Can learn from and organize data, and thus build an understanding of the relationships in it.
Artificial Neuron
• Inputs: Input Signal 1 (X1), Input Signal 2 (X2), …, Input Signal n (Xn). These are the independent variables.
• Each input has its own weight: W1, W2, …, Wn.
• Output: the Output Signal. This is the dependent variable, and it can be:
• Continuous (price)
• Binary (Yes/No)
• Categorical
• The neuron behaves like a function.
The neural network flow
In neural networks, the activation functions are non-linear.
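A single artificial neuron can be sketched as a weighted sum of the inputs passed through a non-linear activation function. The sigmoid activation and the bias term are assumptions chosen for illustration; the slide does not name a specific activation at this point.

```python
import math


def sigmoid(z):
    """A common non-linear activation; squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias=0.0, activation=sigmoid):
    """Multiply each input signal by its weight, add them up, then activate."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


# Hypothetical inputs X1, X2 and weights W1, W2.
output = neuron([1.0, 2.0], [0.5, -0.25])
print(output)
```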
Activation functions
MNIST Example
• NIST = US National Institute of Standards and Technology
• MNIST (Modified NIST): a subset of NIST’s handwritten digit data sets
• Consists of a training set of 60,000 samples and a test set of 10,000 samples.
• 28x28-pixel grayscale images and a digit label for each image.
• http://yann.lecun.com/exdb/mnist
MNIST
MNIST example – starting with simple ANN
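[Figure: the 28x28-pixel image is flattened into 784 pixels (0 to 783). Every pixel is connected to each of the 10 output nodes (digits 0 to 9) through its own weight, from W(0, 0) to W(783, 9), giving 784 x 10 = 7840 weights.]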
[Figure: the inference function. The 784 pixel values $X_0 \dots X_{783}$ are multiplied by a weight matrix with 784 rows (one for every pixel) and 10 columns (one for each of the 10 digits), from $W_{0,0}$ to $W_{783,9}$. For each digit $j$ the products are summed and a bias is added (1 bias per digit, $b_0 \dots b_9$): $\sum_i X_i W_{i,j} + b_j$. The softmax activation function then turns the 10 sums into the 10 outputs.]
Output for one sample before training:
Digit:   9    8    7    6    5    4    3    2    1    0
Output:  0.2  0.4  0.1  0.3  0.1  0.8  0.2  0.6  0.1  0.2
Wrong!
Using “softmax” activation function
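• In this example we will use the “softmax” activation function:
• Good for classification problems.
• Turns the raw outputs into values between 0 and 1 that sum to 1, and increases the differences between them, so the largest output gets closer to 1 and the others closer to 0.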
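A minimal sketch of the softmax function follows. Subtracting the largest score before exponentiating is a common numerical-stability step and is an addition of this sketch, not something shown on the slide; the input scores are hypothetical.

```python
import math


def softmax(scores):
    """Exponentiate each score and divide by the total, so the results sum to 1."""
    largest = max(scores)                       # shift for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([2.0, 1.0, 0.1])
print(probabilities)
```

The ordering of the scores is preserved, but the gap between the largest score and the rest is widened by the exponentials.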
Loss/error measurement function
Digit:                                 9    8    7    6    5    4    3    2    1    0
A (“one hot” actual probabilities):    0    1    0    0    0    0    0    0    0    0
Y (computed probabilities):            0.2  0.4  0.1  0.3  0.1  0.8  0.2  0.6  0.1  0.2
Cross entropy error measurement function: $-\sum_i A_i \log(Y_i)$
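The cross entropy formula can be sketched directly. Because the actual probabilities are “one hot”, only the computed probability of the correct class contributes to the sum. The three-class vectors below are hypothetical.

```python
import math


def cross_entropy(actual, computed):
    """-sum(A_i * log(Y_i)); terms where A_i is 0 contribute nothing and are skipped."""
    return -sum(a * math.log(y) for a, y in zip(actual, computed) if a > 0)


one_hot = [0, 1, 0]                                        # the correct class is index 1
loss_bad = cross_entropy(one_hot, [0.7, 0.2, 0.1])         # low probability on the correct class
loss_good = cross_entropy(one_hot, [0.05, 0.9, 0.05])      # high probability on the correct class
print(loss_bad, loss_good)
```

The loss shrinks toward 0 as the computed probability of the correct class approaches 1.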
Minimize the error with Gradient Descent
Optimization Function
Error = $\frac{1}{N} \sum_{i=1}^{N} e_i^2$
N -> number of historical data points
$e_i$ -> the error of data point $i$
1. Initialize some value for the slope and intercept.
2. Find the current value of the error function.
3. Find the slope of the error function at the current point (its partial derivatives) and move slightly downhill, in the direction that reduces the error.
4. Repeat until you reach a minimum OR stop after a certain number of iterations.
[Figure: the error plotted as a surface over the slope and the intercept.]
Training the neural network
• How can we know what the weights and biases should be?
• Through training the network.
• The code will figure out the correct values BY ITSELF.
• How does the training work?
1. Starting with zero weights and biases, we multiply the input values by the weights and add the biases.
2. We get an incorrect output. But we know what the correct output should be.
3. The system measures the difference between the incorrect output and the correct output. This is called the “loss measurement function”.
• The loss measurement function calculates how big the error is.
4. Now the system changes the weights and biases to reduce the error. This is called the “optimization function”. The process then goes back to step 1 and repeats until the error cannot be reduced any more.
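The training steps can be sketched for a tiny softmax classifier. This is a sketch under stated assumptions, not the demo code from the talk: it uses two inputs and two classes instead of 784 pixels and 10 digits, hypothetical training data, and the standard result that for softmax with cross entropy the gradient with respect to each pre-softmax sum is (computed probability minus actual probability).

```python
import math


def softmax(scores):
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, weights, biases):
    """Inference: multiply inputs by weights, add biases, apply softmax."""
    scores = [sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
              for j in range(len(biases))]
    return softmax(scores)


def train(samples, labels, n_inputs, n_outputs, learning_rate=0.5, epochs=200):
    weights = [[0.0] * n_outputs for _ in range(n_inputs)]   # step 1: start from zeros
    biases = [0.0] * n_outputs
    for _ in range(epochs):
        for x, label in zip(samples, labels):
            y = predict(x, weights, biases)                  # step 2: (initially wrong) output
            for j in range(n_outputs):
                # Step 3: difference between computed and one-hot actual probability.
                delta = y[j] - (1.0 if j == label else 0.0)
                # Step 4: adjust weights and biases to reduce the error.
                for i in range(n_inputs):
                    weights[i][j] -= learning_rate * delta * x[i]
                biases[j] -= learning_rate * delta
    return weights, biases


# Hypothetical data: input [1, 0] belongs to class 0, input [0, 1] to class 1.
samples = [[1.0, 0.0], [0.0, 1.0]]
labels = [0, 1]
weights, biases = train(samples, labels, n_inputs=2, n_outputs=2)
print(predict([1.0, 0.0], weights, biases))
```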
Back propagation - adjusting the weights
The training loop:
1. Get input values.
2. Multiply input values by the weights and add biases.
3. Run the activation function and get predictions.
4. Calculate the distance from the correct results.
5. Apply optimization on the weights to reduce the error, then return to step 1.

Computed probabilities after successive passes through the loop, compared with the “one hot” actual probabilities:
Digit:    9    8    7    6    5    4    3    2    1    0
Actual:   0    1    0    0    0    0    0    0    0    0
Pass 1:   0.2  0.4  0.1  0.3  0.1  0.8  0.2  0.6  0.1  0.2
Pass 2:   0.2  0.5  0.1  0.3  0.1  0.7  0.2  0.4  0.1  0.1
Pass 3:   0.1  0.6  0.1  0.2  0.1  0.6  0.2  0.3  0.1  0.1
Pass 4:   0.1  0.7  0.1  0.2  0.1  0.4  0.1  0.2  0.1  0.1
Pass 5:   0.1  0.9  0.1  0.1  0    0.1  0.1  0.1  0.1  0.1
Correct!
[Figure: the same inference function after training: 784 pixel values, a weight matrix with 784 rows (one for every pixel) and 10 columns (one for each of the 10 digits), 1 bias per digit, and the softmax activation function.]
Output for the same sample after training:
Digit:   9    8    7    6    5    4    3    2    1    0
Output:  0.2  0.9  0.1  0.3  0.1  0.2  0.1  0.2  0.1  0.2
Correct!
TensorFlow
Tensor
• An n-dimensional array or list used to represent data
• Defined by 3 properties:
• Rank: Scalar (number), Vector (1-dim array), Matrix (2-dim array), Cube, etc.
• Shape
• Type

Example                                    Rank         Shape      Type
1                                          0 (scalar)   []         Int32
[1, 5, 3, 6, 2]                            1 (vector)   [5]        Int32
[[1, 5, 3, 8, 4], [3, 2, 6, 4, 7]]         2 (matrix)   [2, 5]     Int32
[[[1, 6, 3], [2, 4, 3]],                   3 (cube)     [3, 2, 3]  Int32
 [[2, 6, 2], [3, 7, 4]],
 [[1, 9, 2], [4, 8, 3]]]
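Rank and shape can be illustrated without any library by treating nested Python lists as tensors. This is only a sketch of the two concepts, assuming regular (non-ragged, non-empty) nesting; it is not TensorFlow's own API, and the helper names are hypothetical.

```python
def shape(tensor):
    """Length of each nesting level, outermost first; a plain number has shape []."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]          # assumes every level is non-empty and regular
    return dims


def rank(tensor):
    """Number of dimensions: 0 for a scalar, 1 for a vector, 2 for a matrix, ..."""
    return len(shape(tensor))


scalar = 1
vector = [1, 5, 3, 6, 2]
matrix = [[1, 5, 3, 8, 4], [3, 2, 6, 4, 7]]
cube = [[[1, 6, 3], [2, 4, 3]], [[2, 6, 2], [3, 7, 4]], [[1, 9, 2], [4, 8, 3]]]

for t in (scalar, vector, matrix, cube):
    print(rank(t), shape(t))
```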
What is TensorFlow
• One of the most popular Python libraries for building machine learning models, mainly neural networks (NN).
• Initially developed by Google; today it is open source.
• Provides a library of predefined versions of many common ML algorithms, but also lets you flexibly create your own algorithm.
• Can harness GPUs.
• Scalable: using an “execution master”, you can run on a laptop as well as on a large-scale cluster of remote servers.
Tensor Features and Tools
• Name property: used to identify elements in the graph.
• Name Scope property: used for grouping elements (like “conv1” for the 1st conv layer).
• Summary class: has methods for writing summaries to log files. Can capture how elements change over time.
• TensorBoard: a web server that uses the log files to visualize the computation graph and training progress. Can be used from remote desktops.
• Common add-ons (for easier development):
• TFLearn: simplifies the use of TensorFlow only, and works with TF data types.
• Keras: a simplification layer that supports multiple frameworks (including Microsoft CNTK).
Training neural networks with TensorFlow
With TensorFlow you need to define the following:
1. The input data:
• “Placeholders”: the input training data.
• “Variables”: what we ask TF to compute through training. With neural networks these are the weights and biases.
2. The inference function (which applies the weights and biases to the input data).
3. The loss/error measurement function (example: “Cross Entropy”).
4. The optimization function to minimize the loss (example: “Gradient Descent”).
TensorFlow - MNIST demo
Concept                      Implementation
Prepared data                MNIST data
Inference                    Sum(X * weight) + bias -> Activation
Loss measurement             Cross entropy
Optimize to minimize loss    Gradient descent optimizer
TensorFlow DEMO
Why Convolutional Neural Networks (CNN)
• Problem – Flattening the images caused us to lose the shape information.
• When we see a digit, we recognize the lines and curves.
• We need to “zoom out” slowly from the picture.
Hidden layers
Deep neural networks
Deep Learning
• The use of multi-layered neural networks is called Deep Learning.
• Some applications:
• Natural language processing (NLP)
• Face recognition
• Image analysis (what’s in the picture)
• Image search
• Voice analysis
• Video analysis
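Convolutional Neural Network (CNN): X’s and O’s
• Says whether a picture is of an X or an O: a two-dimensional array of pixels goes into the CNN, and “X” or “O” comes out.
• For example: [Figure: an image of an X goes through the CNN and is labeled X; an image of an O is labeled O.]
• Trickier cases: [Figure: less regular images of an X and an O, which the CNN should still label X and O.]
Deciding is hard
[Figure: two 9x9 images of an X, pixel value 1 for the stroke and -1 for the background. Are they the same letter?]
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1

-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 1 -1 -1 -1
-1 -1 1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 1 -1 -1
-1 -1 -1 1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Computers are literal
[Figure: compared pixel by pixel, the two images do not match.]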
ConvNets match pieces of the image
[Figure: pieces of the two X images match each other even though the whole images do not.]
Features match pieces of the image
The three 3x3 features used in this example:
 1 -1 -1      -1 -1  1       1 -1  1
-1  1 -1      -1  1 -1      -1  1 -1
-1 -1  1       1 -1 -1       1 -1  1
[Figure: each feature placed over the parts of the 9x9 X image that it matches.]
Filtering: The math behind the match
1. Line up the feature and the image patch.
2. Multiply each image pixel by the corresponding feature pixel.
3. Add them up.
4. Divide by the total number of pixels in the feature.
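The four filtering steps can be sketched as a short function. The function name is hypothetical; the feature and the image patch are the diagonal feature and the center of the 9x9 X image from the slides.

```python
def match_score(feature, patch):
    """Multiply corresponding pixels, add them up, divide by the number of pixels."""
    total = 0
    count = 0
    for feature_row, patch_row in zip(feature, patch):    # step 1: line them up
        for f, p in zip(feature_row, patch_row):
            total += f * p                                # steps 2 and 3
            count += 1
    return total / count                                  # step 4


diagonal = [[1, -1, -1],
            [-1, 1, -1],
            [-1, -1, 1]]
center_patch = [[1, -1, 1],
                [-1, 1, -1],
                [1, -1, 1]]

perfect = match_score(diagonal, diagonal)        # every product is 1
partial = match_score(diagonal, center_patch)    # two of the nine products are -1
print(perfect, partial)
```

A score of 1 means the patch matches the feature exactly; lower scores mean a weaker match.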
Worked example 1: the feature lined up with an image patch that matches it exactly.
Feature:       Image patch:    Pixel-by-pixel products:
 1 -1 -1        1 -1 -1        1 1 1
-1  1 -1       -1  1 -1        1 1 1
-1 -1  1       -1 -1  1        1 1 1
Sum of the products = 9; divided by 9 pixels = 1

Worked example 2: the same feature lined up with the patch at the center of the X.
Feature:       Image patch:    Pixel-by-pixel products:
 1 -1 -1        1 -1  1         1 1 -1
-1  1 -1       -1  1 -1         1 1  1
-1 -1  1        1 -1  1        -1 1  1
Sum of the products = 5; divided by 9 pixels = 0.55
Convolution: Trying every possible match
Lining a feature up with every possible 3x3 patch of the 9x9 image turns the image into a 7x7 map of match scores, one map per feature.
Feature:
1 -1 -1
-1 1 -1
-1 -1 1
Filtered image:
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
Feature:
-1 -1 1
-1 1 -1
1 -1 -1
Filtered image:
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
Feature:
1 -1 1
-1 1 -1
1 -1 1
Filtered image:
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
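Trying every possible match amounts to sliding the feature over the image and recording the match score at each position. A minimal sketch, assuming a square feature and no padding; the function name is hypothetical, and the image and feature are the 9x9 X and the diagonal feature from the slides.

```python
IMAGE = [
    [-1, -1, -1, -1, -1, -1, -1, -1, -1],
    [-1,  1, -1, -1, -1, -1, -1,  1, -1],
    [-1, -1,  1, -1, -1, -1,  1, -1, -1],
    [-1, -1, -1,  1, -1,  1, -1, -1, -1],
    [-1, -1, -1, -1,  1, -1, -1, -1, -1],
    [-1, -1, -1,  1, -1,  1, -1, -1, -1],
    [-1, -1,  1, -1, -1, -1,  1, -1, -1],
    [-1,  1, -1, -1, -1, -1, -1,  1, -1],
    [-1, -1, -1, -1, -1, -1, -1, -1, -1],
]
DIAGONAL = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]


def convolve(image, feature):
    """Return the map of match scores for every position where the feature fits."""
    size = len(feature)                       # assumes a square feature
    filtered = []
    for top in range(len(image) - size + 1):
        row = []
        for left in range(len(image[0]) - size + 1):
            total = sum(feature[r][c] * image[top + r][left + c]
                        for r in range(size) for c in range(size))
            row.append(total / (size * size))
        filtered.append(row)
    return filtered


filtered = convolve(IMAGE, DIAGONAL)
for row in filtered:
    print(" ".join(f"{value:5.2f}" for value in row))
```

The slides show these scores cut to two decimals, so 7/9 appears as 0.77 there, while the two-decimal formatting above rounds it to 0.78.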
Convolution layer
• One image becomes a stack of filtered images
[Figure: the 9x9 X image on one side and, on the other, the stack of three 7x7 filtered images produced by the three features.]
Pooling: Shrinking the image stack
1. Pick a window size (usually 2 or 3).
2. Pick a stride (usually 2). A stride = step.
3. Walk your window across your filtered images.
4. From each window, take the maximum value.
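The pooling steps can be sketched as below. The function name is hypothetical. Letting the windows at the right and bottom edges be smaller than the window size is an assumption of this sketch; it is what turns a 7x7 filtered image into a 4x4 result with a window of 2 and a stride of 2. The input is the first filtered image from the slides.

```python
FILTERED = [
    [0.77, -0.11, 0.11, 0.33, 0.55, -0.11, 0.33],
    [-0.11, 1.00, -0.11, 0.33, -0.11, 0.11, -0.11],
    [0.11, -0.11, 1.00, -0.33, 0.11, -0.11, 0.55],
    [0.33, 0.33, -0.33, 0.55, -0.33, 0.33, 0.33],
    [0.55, -0.11, 0.11, -0.33, 1.00, -0.11, 0.11],
    [-0.11, 0.11, -0.11, 0.33, -0.11, 1.00, -0.11],
    [0.33, -0.11, 0.55, 0.33, 0.11, -0.11, 0.77],
]


def max_pool(image, window=2, stride=2):
    """Walk a window across the image and keep the maximum value from each window."""
    height, width = len(image), len(image[0])
    pooled = []
    for top in range(0, height, stride):
        row = []
        for left in range(0, width, stride):
            values = [image[r][c]
                      for r in range(top, min(top + window, height))
                      for c in range(left, min(left + window, width))]
            row.append(max(values))
        pooled.append(row)
    return pooled


pooled = max_pool(FILTERED)
for row in pooled:
    print(row)
```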
Pooling
Max pooling the first filtered image with a window size of 2 and a stride of 2 (the windows at the right and bottom edges are partial):
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
becomes
1.00 0.33 0.55 0.33
0.33 1.00 0.33 0.55
0.55 0.33 1.00 0.11
0.33 0.55 0.11 0.77
Pooling the other two filtered images in the same way gives:
0.33 0.55 1.00 0.77
0.55 0.55 1.00 0.33
1.00 1.00 0.11 0.55
0.77 0.33 0.55 0.33
and
0.55 0.33 0.55 0.33
0.33 1.00 0.55 0.11
0.55 0.55 0.55 0.11
0.33 0.11 0.11 0.33
Pooling layer
• A stack of images becomes a stack of smaller images.
[Figure: the stack of three 7x7 filtered images becomes a stack of three 4x4 pooled images.]
Normalization
• Keep the math from breaking by tweaking each of the values just a bit.
• Change everything negative to zero.
Rectified Linear Units (ReLUs)
Applying ReLU to the first filtered image:
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
becomes
0.77 0 0.11 0.33 0.55 0 0.33
0 1.00 0 0.33 0 0.11 0
0.11 0 1.00 0 0.11 0 0.55
0.33 0.33 0 0.55 0 0.33 0.33
0.55 0 0.11 0 1.00 0 0.11
0 0.11 0 0.33 0 1.00 0
0.33 0 0.55 0.33 0.11 0 0.77
ReLU layer
• A stack of images becomes a stack of images with no negative values.
The other two filtered images after ReLU:
0.33 0 0.11 0 0.11 0 0.33
0 0.55 0 0.33 0 0.55 0
0.11 0 0.55 0 0.55 0 0.11
0 0.33 0 1.00 0 0.33 0
0.11 0 0.55 0 0.55 0 0.11
0 0.55 0 0.33 0 0.55 0
0.33 0 0.11 0 0.11 0 0.33
and
0.33 0 0.55 0.33 0.11 0 0.77
0 0.11 0 0.33 0 1.00 0
0.55 0 0.11 0 1.00 0 0.11
0.33 0.33 0 0.55 0 0.33 0.33
0.11 0 1.00 0 0.11 0 0.55
0 1.00 0 0.33 0 0.11 0
0.77 0 0.11 0.33 0.55 0 0.33
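“Change everything negative to zero” is a one-line operation. A minimal sketch, with a hypothetical function name, applied to the first two rows of the first filtered image:

```python
def relu(image):
    """Replace every negative value with zero; leave the other values unchanged."""
    return [[max(0.0, value) for value in row] for row in image]


rows = [
    [0.77, -0.11, 0.11, 0.33, 0.55, -0.11, 0.33],
    [-0.11, 1.00, -0.11, 0.33, -0.11, 0.11, -0.11],
]
rectified = relu(rows)
for row in rectified:
    print(row)
```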
Layers get stacked
• The output of one becomes the input of the next.
[Figure: the 9x9 X image passes through the stacked layers and comes out as the three 4x4 images.]
Deep stacking
• Layers can be repeated several (or many) times.
[Figure: after repeated layers, the 9x9 X image is reduced to three 2x2 images:]
1.00 0.55      0.55 1.00      1.00 0.55
0.55 1.00      1.00 0.55      0.55 0.55
Fully connected layer
• Every value gets a vote.
[Figure: the values of the three 2x2 images are rearranged into a single list of 12 feature values:]
1.00 0.55 0.55 1.00 1.00 0.55 0.55 0.55 0.55 1.00 1.00 0.55
• Vote depends on how strongly a value predicts X or O.
[Figure: each of the 12 values is connected to the two answers, X and O. Two example lists of feature values are shown:]
1.00 0.55 0.55 1.00 1.00 0.55 0.55 0.55 0.55 1.00 1.00 0.55
0.55 1.00 1.00 0.55 0.55 0.55 0.55 0.55 1.00 0.55 0.55 1.00
• Future values vote on X or O.
[Figure: the feature values of a new image vote on X or O:]
0.9 0.65 0.45 0.87 0.96 0.73 0.23 0.63 0.44 0.89 0.94 0.53
• A list of feature values becomes a list of votes.
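The voting can be sketched as a weighted sum per answer: each feature value is multiplied by a voting weight for X and a voting weight for O, and the answer with the larger total wins. The weights below are hypothetical and only four of the twelve feature values are used, to keep the example short; in a real network the weights are learned during training.

```python
def fully_connected(values, weights_by_answer):
    """Return the total vote for each answer: the weighted sum of the feature values."""
    return {answer: sum(v * w for v, w in zip(values, weights))
            for answer, weights in weights_by_answer.items()}


feature_values = [0.9, 0.65, 0.45, 0.87]       # first four values of the new image
voting_weights = {                             # hypothetical weights
    "X": [1.0, 0.0, 0.0, 1.0],
    "O": [0.0, 1.0, 1.0, 0.0],
}

votes = fully_connected(feature_values, voting_weights)
winner = max(votes, key=votes.get)
print(votes, winner)
```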
Putting it all together
• A set of pixels becomes a set of votes.
[Figure: the 9x9 X image passes through Layer 1, Layer 2, Layer 3, Layer 4 and Layer 5 and comes out as votes for X and O.]
Gradient descent
• For each feature pixel and voting weight, adjust it up and down a bit and see how the error changes.
[Figure: the error plotted against the value of a weight.]
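“Adjust it up and down a bit and see how the error changes” can be sketched as a numerical estimate of the slope, followed by a small step in the downhill direction. The error function here is a hypothetical toy with a known minimum, standing in for the CNN's real error; in practice the gradients are computed analytically by back propagation rather than by nudging each weight.

```python
def numeric_gradient(error_fn, weights, index, nudge=1e-4):
    """Estimate how the error changes when one weight is nudged up and down."""
    up = list(weights)
    up[index] += nudge
    down = list(weights)
    down[index] -= nudge
    return (error_fn(up) - error_fn(down)) / (2 * nudge)


def descend(error_fn, weights, learning_rate=0.1, steps=200):
    """Repeatedly move every weight a little in the direction that lowers the error."""
    weights = list(weights)
    for _ in range(steps):
        gradient = [numeric_gradient(error_fn, weights, i) for i in range(len(weights))]
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


def toy_error(weights):
    """Hypothetical error surface with its minimum at weights (3, -1)."""
    return (weights[0] - 3.0) ** 2 + (weights[1] + 1.0) ** 2


best = descend(toy_error, [0.0, 0.0])
print(best, toy_error(best))
```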
Tuning the CNN
• Architecture
• How many of each type of layer?
• In what order?
• Convolution
• Number of features
• Size of features
• Pooling
• Window size
• Window stride
• Fully Connected
• Number of neurons
CNN - Not just for images
CNNs suit any data in which things closer together are more closely related than things far away:
• 2D Images.
• 3D Images.
• Audio
• Video
• Signal processing
• NLP – semantic parsing, sentence modelling and more.
• Drug discovery: chemical interactions.
MNIST demo using CNN
Machine Learning in the near future
There is a lot of research around ML in academia and in commercial companies, and a lot of money is invested there.
• ML will be adopted at a much greater scale across almost every industry.
• ML will be embedded everywhere
• Specialized hardware for ML will enable deeper and faster learning
• Machine Learning as a Service (MLaaS) market will grow substantially.
• ML will save more lives.
• ML will automate more repetitive tasks.
Why should developers/data engineers/DBAs invest time in ML?
• Data is the fuel of every ML system, and it comes from the data platforms DBAs manage.
• Data preparation before training is the most time-consuming part.
  • DBAs can definitely assist here.
• ML is not just for data scientists (up to a certain level):
  • Developers already use ML.
  • Data engineers use ML.
  • ML can be used by DBAs too – why not?
• ML will become easier and easier to use:
  • Azure ML
  • AWS ML
The future of AI
?
Thank you!
Lior.King@gmail.com
aditichinar
 
new ppt artificial intelligence historyyy
new ppt artificial intelligence historyyynew ppt artificial intelligence historyyy
new ppt artificial intelligence historyyy
PianoPianist
 
DSP and MV the Color image processing.ppt
DSP and MV the  Color image processing.pptDSP and MV the  Color image processing.ppt
DSP and MV the Color image processing.ppt
HafizAhamed8
 
Compiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptxCompiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptx
RushaliDeshmukh2
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
Introduction to FLUID MECHANICS & KINEMATICS
Introduction to FLUID MECHANICS &  KINEMATICSIntroduction to FLUID MECHANICS &  KINEMATICS
Introduction to FLUID MECHANICS & KINEMATICS
narayanaswamygdas
 
Compiler Design_Lexical Analysis phase.pptx
Compiler Design_Lexical Analysis phase.pptxCompiler Design_Lexical Analysis phase.pptx
Compiler Design_Lexical Analysis phase.pptx
RushaliDeshmukh2
 
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E..."Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
Infopitaara
 
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G..."Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
Infopitaara
 
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdfRICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
MohamedAbdelkader115
 
π0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalizationπ0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalization
NABLAS株式会社
 
Value Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous SecurityValue Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous Security
Marc Hornbeek
 
Machine learning project on employee attrition detection using (2).pptx
Machine learning project on employee attrition detection using (2).pptxMachine learning project on employee attrition detection using (2).pptx
Machine learning project on employee attrition detection using (2).pptx
rajeswari89780
 
Data Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptxData Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptx
RushaliDeshmukh2
 

Machine Learning Essentials Demystified part2 | Big Data Demystified

  • 1. Machine Learning Essentials Part 2: Artificial Neural Networks. Lior King, Lior.King@gmail.com
  • 2. Previously on “Machine Learning Essentials...”
  • 3. Linear regression: Finding the relation between the age and the salary. Predicting the salary for any given age. [Figure: historical data points, salary versus experience]
  • 4. Minimize the error: The error (or residual) is the offset between the observed value of the dependent variable and the value the model predicts for it. The goal of any regression is to minimize the error for the training data and to FIND THE OPTIMAL LINE (or curve, in the case of logistic regression). [Figure: historical data points, salary (dependent) versus experience (independent), with the error marked as the vertical distance from a point to the line]
  • 5. Minimize the error – sum of squared differences: $\text{Error} = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, where $y_i$ is the observed value and $\hat{y}_i$ is the predicted value.
  • 6. Minimize the error with Stochastic Gradient Descent (SGD): $\text{Error} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, where N is the number of historical data points. 1. Initialize some value for the slope and intercept. 2. Find the current value of the error function. 3. Find the slope of the error function at the current point (partial derivatives) and move slightly in the downhill direction. 4. Repeat until you reach a minimum OR stop after a certain number of iterations.
  • 7. Minimize the error: The iterative SGD process slowly changes the slope and the intercept until the error is minimal.
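The four steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data, the learning rate, and the name `fit_line` are hypothetical, and the loop uses every data point on each step (full-batch gradient descent) rather than a random sample as the stochastic variant would.

```python
def fit_line(xs, ys, learning_rate=0.05, iterations=2000):
    """Fit y = intercept + slope * x by gradient descent on the mean squared error."""
    n = len(xs)
    slope, intercept = 0.0, 0.0  # step 1: initial guess
    for _ in range(iterations):  # step 4: stop after a fixed number of iterations
        # Step 2: residuals (observed minus predicted) define the current error.
        residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
        # Step 3: partial derivatives of the mean squared error.
        grad_slope = -2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_intercept = -2.0 / n * sum(residuals)
        # Move a small step against the gradient, i.e. downhill.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Hypothetical data: salary (in thousands) versus years of experience.
experience = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [35.0, 40.0, 45.0, 50.0, 55.0]
slope, intercept = fit_line(experience, salary)
print(round(slope, 3), round(intercept, 3))
```

Because the made-up data lies exactly on a line, the fitted slope and intercept approach the values used to generate it (5 and 30).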
  • 8. Multiple Linear Regression • Simple linear regression: $Y = b_0 + b_1 x_1$ • Multiple linear regression: $Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$ Important note: You need to exclude variables that will “mess up” the prediction and keep the ones that actually help predict the desired result.
  • 9. Polynomial Linear Regression • Simple linear regression: $Y = b_0 + b_1 x_1$ • Polynomial linear regression: $Y = b_0 + b_1 x_1 + b_2 x_1^2 + \dots + b_n x_1^n$ • Quadratic: degree = 2 • Cubic: degree = 3
  • 11. “Traditional” ML vs. “Representation” ML • “Traditional” ML-based systems rely on experts to decide what features to pay attention to. • “Representation” ML-based systems figure out by themselves what features to pay attention to. • The most common representation ML algorithm is the Artificial Neural Network (ANN). • ANNs are commonly used for: • Image/video/audio processing • Speech recognition • Natural language processing (NLP) • Games
  • 13. Artificial Neural Networks - ANN • Inspired by the neurons in the human mind. • Can learn and organize data and thus create an understanding of relationships.
  • 14. Artificial Neuron • Inputs (independent variables): input signals X1, X2, …, Xn, each with its own weight W1, W2, …, Wn. • Output (dependent variable): one output signal, which can be continuous (price), binary (Yes/No), or categorical. • The neuron behaves like a function.
  • 15. The neural network flow: In neural networks, the activation functions are non-linear.
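A single neuron "behaving like a function" can be sketched as follows. The input values, the weights, the bias term, and the choice of a sigmoid as the non-linear activation are all assumptions made for illustration; the slides do not fix any of them at this point.

```python
import math


def sigmoid(z):
    """A common non-linear activation that squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the input signals plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# Hypothetical input signals X1..X3 and weights W1..W3.
output = neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], bias=0.1, activation=sigmoid)
print(output)
```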
  • 17. MNIST Example • NIST = US National Institute of Standards and Technology • MNIST – a subset of NIST’s handwritten digit data set • Consists of a training set of 60,000 samples and a test set of 10,000 samples. • 28x28-pixel grayscale images and a digit label for each image. • http://yann.lecun.com/exdb/mnist
  • 19. MNIST example – starting with a simple ANN [Figure: the 28x28 image is flattened into 784 input pixels (0 to 783); each pixel is connected to 10 output nodes (digits 0 to 9) by weights W(0, 0) … W(783, 9), 7840 weights in total]
  • 20. [Figure: inference function. The 784 pixels $X_0 \dots X_{783}$ are multiplied by a weight matrix $W$ with 784 rows (one per pixel) and 10 columns (one per digit), and one bias per digit ($b_0 \dots b_9$) is added, so the sum for digit $j$ is $\sum_i X_i W_{i,j} + b_j$. The softmax activation function turns the 10 sums into probabilities. Output for digits 9 down to 0: 0.2 0.4 0.1 0.3 0.1 0.8 0.2 0.6 0.1 0.2. Wrong!]
  • 21. Using the “softmax” activation function • In this example we will use the “softmax” activation function: • Good for classification problems. • Increases the differences so the output gets closer to 1 or closer to 0.
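As a rough illustration of how softmax "increases the differences", the sketch below exponentiates each raw score and divides by the total, so the results are positive and sum to 1. The three scores are hypothetical; subtracting the maximum before exponentiating is a standard numerical-stability trick, not something the slides mention.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 2.0, 4.0]  # hypothetical raw outputs for three classes
probabilities = softmax(scores)
print(probabilities)
```

The largest score ends up with most of the probability mass, while the smaller ones are pushed toward 0.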
  • 22. Loss/error measurement function • Actual (“one hot”) probabilities A, for digits 9 down to 0: 0 1 0 0 0 0 0 0 0 0 • Computed probabilities Y, for digits 9 down to 0: 0.2 0.4 0.1 0.3 0.1 0.8 0.2 0.6 0.1 0.2 • Cross entropy error measurement function: $-\sum_i A_i \log(Y_i)$
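To make the cross entropy formula concrete, here is a small sketch that applies it to the two rows of numbers on this slide. The function name `cross_entropy` and the use of the natural logarithm are assumptions; the second prediction list is made up to show that a better prediction gives a smaller loss.

```python
import math


def cross_entropy(actual, computed):
    """Return -sum(A_i * log(Y_i)); terms with A_i == 0 contribute nothing."""
    return -sum(a * math.log(y) for a, y in zip(actual, computed) if a > 0)


# Values from the slide, listed for digits 9 down to 0.
actual = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # one-hot: the digit is 8
computed = [0.2, 0.4, 0.1, 0.3, 0.1, 0.8, 0.2, 0.6, 0.1, 0.2]
loss = cross_entropy(actual, computed)  # only the 0.4 for digit 8 matters

# Hypothetical better prediction: 0.9 for digit 8.
better = [0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
better_loss = cross_entropy(actual, better)
print(loss, better_loss)
```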
  • 23. Minimize the error with Gradient Descent Optimization Function: $\text{Error} = \frac{1}{N} \sum_{i=1}^{N} e_i^2$, where N is the number of historical data points. 1. Initialize some value for the slope and intercept. 2. Find the current value of the error function. 3. Find the slope of the error function at the current point (partial derivatives) and move slightly in the downhill direction. 4. Repeat until you reach a minimum OR stop after a certain number of iterations.
  • 24. Training the neural network • How can we know what the weights and biases should be? • Through training the network. • The code will figure out the correct values BY ITSELF. • How does the training work? 1. Starting with zero weights and biases, we multiply the input values by the weights and add the biases. 2. We get an incorrect output, but we know what the correct output should be. 3. The system measures the difference between the incorrect output and the correct output. This is called the “loss measurement function”; it calculates how big the error is. 4. The system changes the weights and biases to reduce the error. This is called the “optimization function”. The process then repeats from step 1 with the updated weights and biases until the error cannot be reduced anymore.
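The training steps can be sketched end to end on a toy problem. Everything here is an assumption for illustration: the six two-pixel "images", the three classes, the learning rate, and the number of passes are made up, and the update rule used (predicted probability minus the one-hot target, times the input) is the standard gradient of cross entropy with softmax rather than anything the slides spell out.

```python
import math


def softmax(scores):
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, weights, biases):
    """Inference: multiply inputs by weights, add biases, apply softmax."""
    scores = [sum(x[i] * weights[i][k] for i in range(len(x))) + biases[k]
              for k in range(len(biases))]
    return softmax(scores)


def classify(x, weights, biases):
    probs = predict(x, weights, biases)
    return probs.index(max(probs))


def train(samples, labels, n_classes, learning_rate=0.5, epochs=500):
    n_inputs = len(samples[0])
    weights = [[0.0] * n_classes for _ in range(n_inputs)]  # start at zero
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, label in zip(samples, labels):
            probs = predict(x, weights, biases)
            # Error per class: computed probability minus the one-hot target.
            errors = [probs[k] - (1.0 if k == label else 0.0)
                      for k in range(n_classes)]
            # Optimization: nudge weights and biases against the error.
            for k in range(n_classes):
                for i in range(n_inputs):
                    weights[i][k] -= learning_rate * errors[k] * x[i]
                biases[k] -= learning_rate * errors[k]
    return weights, biases


# Hypothetical toy data: two inputs per sample, three classes.
samples = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 1.0], [0.9, 0.9]]
labels = [0, 0, 1, 1, 2, 2]
weights, biases = train(samples, labels, n_classes=3)
predictions = [classify(x, weights, biases) for x in samples]
print(predictions)
```

The MNIST case has the same shape, only with 784 inputs and 10 classes instead of 2 and 3.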
  • 25. Back propagation - adjusting the weights: Get input values → Multiply input values by the weights and add biases → Run activation function and get predictions → Calculate the distance from the correct results → Apply optimization on the weights to reduce the error → (repeat)
  • 26. Back propagation - adjusting the weights (same loop). Actual (“one hot”) probabilities for digits 9 down to 0: 0 1 0 0 0 0 0 0 0 0. Computed probabilities: 0.2 0.4 0.1 0.3 0.1 0.8 0.2 0.6 0.1 0.2
  • 27. Back propagation - adjusting the weights (same loop, next pass). Computed probabilities: 0.2 0.5 0.1 0.3 0.1 0.7 0.2 0.4 0.1 0.1
  • 28. Back propagation - adjusting the weights (same loop, next pass). Computed probabilities: 0.1 0.6 0.1 0.2 0.1 0.6 0.2 0.3 0.1 0.1
  • 29. Back propagation - adjusting the weights (same loop, next pass). Computed probabilities: 0.1 0.7 0.1 0.2 0.1 0.4 0.1 0.2 0.1 0.1
  • 30. Back propagation - adjusting the weights (same loop, next pass). Computed probabilities: 0.1 0.9 0.1 0.1 0 0.1 0.1 0.1 0.1 0.1. Correct!
  • 31. [Figure: the inference diagram from slide 20 (784 pixels, weights, biases, softmax), now with trained weights. Softmax output for digits 9 down to 0: 0.2 0.9 0.1 0.3 0.1 0.2 0.1 0.2 0.1 0.2. Correct!]
  • 33. Tensor • An n-dimensional array or list used to represent data • Defined by 3 properties: • Rank: Scalar (number), Vector (1-dim array), Matrix (2-dim array), Cube, etc. • Shape • Type
      Example | Rank | Shape | Type
      1 | 0 (scalar) | [] | Int32
      [1, 5, 3, 6, 2] | 1 (vector) | [5] | Int32
      [[1, 5, 3, 8, 4], [3, 2, 6, 4, 7]] | 2 (matrix) | [2, 5] | Int32
      [[[1, 6, 3], [2, 4, 3]], [[2, 6, 2], [3, 7, 4]], [[1, 9, 2], [4, 8, 3]]] | 3 (cube) | [3, 2, 3] | Int32
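Rank and shape can be read off a nested Python list directly, as the sketch below shows for the four examples from this slide. The helper name `shape_of` is hypothetical, and it assumes a regular, non-empty nesting (every sub-list at the same depth has the same length).

```python
def shape_of(tensor):
    """Return the shape of a regularly nested list; a plain number has shape []."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0]  # descend one level; assumes non-empty, regular lists
    return shape


examples = [
    1,
    [1, 5, 3, 6, 2],
    [[1, 5, 3, 8, 4], [3, 2, 6, 4, 7]],
    [[[1, 6, 3], [2, 4, 3]], [[2, 6, 2], [3, 7, 4]], [[1, 9, 2], [4, 8, 3]]],
]
for example in examples:
    shape = shape_of(example)
    print("rank", len(shape), "shape", shape)  # the rank is the length of the shape
```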
  • 34. What is TensorFlow • One of the most popular Python libraries for building machine learning algorithms – mainly NN. • Initially developed by Google; today it is open source. • Provides a library of predefined versions of many common ML algorithms, but also lets you flexibly create your own algorithm. • Can harness GPUs. • Scalable – using an “execution master” you can run on a laptop as well as on a large-scale cluster of remote servers.
  • 35. Tensor Features and Tools • Name property – used to identify elements in the graph. • Name Scope property – used for grouping elements (like “conv1” for the 1st conv layer). • Summary class – has methods for writing summaries to log files. Can capture how elements change over time. • TensorBoard – A web server that uses the log files to visualize the computation graph and training progress. Can be used from remote desktops. • Common add-ons (for easier development): • TFLearn – A simplification layer for TensorFlow only; it works directly with TF data types. • Keras – A simplification layer that supports multiple frameworks (including Microsoft CNTK).
  • 36. Training neural networks with TensorFlow: With TensorFlow you need to define the following: 1. The input data: • “Placeholders” – The input training data. • “Variables” – What we ask TF to compute through training. With a neural network these are the weights and biases. 2. The inference function (which is applied on the weights and biases). 3. Loss/error measurement function (example: “Cross Entropy”). 4. Optimization function to minimize loss (example: “Gradient Descent”).
  • 37. TensorFlow - MNIST demo
      Concept | Implementation
      Prepared data | MNIST data
      Inference | Sum(X * weight) + bias -> Activation
      Loss measurement | Cross entropy
      Optimize to minimize loss | Gradient descent optimizer
  • 39. Why Convolutional Neural Networks (CNN) • Problem – Flattening the images caused us to lose the shape information. • When we see a digit, we recognize the lines and curves. • We need to “zoom out” slowly from the picture.
  • 42. Deep Learning • The use of a multi-layered neural network is called Deep Learning. • Some applications: • Natural language processing (NLP) • Face recognition • Image analysis (what’s in the picture) • Image search • Voice analysis • Video analysis
  • 43. Convolutional Neural Network (CNN): X’s and O’s • Says whether a picture is of an X or an O. [Figure: a two-dimensional array of pixels goes into the CNN, which answers “X” or “O”]
  • 47. Computers are literal [Figure: two 9x9 pixel grids (1 = drawn pixel, -1 = background), a clean X and a distorted X, compared pixel by pixel]
  • 48. ConvNets match pieces of the image
  • 49. Features match pieces of the image [Figure: three 3x3 features, written row by row: a diagonal (1 -1 -1 / -1 1 -1 / -1 -1 1), the opposite diagonal (-1 -1 1 / -1 1 -1 / 1 -1 -1), and a small X (1 -1 1 / -1 1 -1 / 1 -1 1)]
  • 50–54. [Figure: the same three 3x3 features, matched against different parts of the image]
  • 55. Filtering: The math behind the match [Figure: the diagonal 3x3 feature next to the 9x9 image of the X]
  • 56. Filtering: The math behind the match 1. Line up the feature and the image patch. 2. Multiply each image pixel by the corresponding feature pixel. 3. Add them up. 4. Divide by the total number of pixels in the feature.
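The four filtering steps can be sketched for a single patch as follows. The 9x9 image and the diagonal 3x3 feature are the ones shown on the surrounding slides; the function name `match` and the `top`/`left` parameters are hypothetical.

```python
# The 9x9 image of an X from the slides: 1 where "X" appears, -1 elsewhere.
ROWS = [
    ".........",
    ".X.....X.",
    "..X...X..",
    "...X.X...",
    "....X....",
    "...X.X...",
    "..X...X..",
    ".X.....X.",
    ".........",
]
image = [[1 if ch == "X" else -1 for ch in row] for row in ROWS]

# The diagonal 3x3 feature from the slides.
feature = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]


def match(image, feature, top, left):
    """Line up the feature at (top, left), multiply pixel pairs, add, and average."""
    size = len(feature)
    total = 0
    for r in range(size):
        for c in range(size):
            total += image[top + r][left + c] * feature[r][c]
    return total / (size * size)


perfect = match(image, feature, 1, 1)  # all nine pixels agree
partial = match(image, feature, 0, 0)  # eight pixels agree, one disagrees
print(perfect, partial)
```

A patch where every pixel agrees scores 9 / 9, and each disagreeing pixel lowers the sum by 2.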
  • 57–67. Filtering: The math behind the match [Figure: the feature is lined up with a matching image patch and the pixel products are filled in one at a time; all nine products are 1, so the sum is 9 and the match is 9 / 9 = 1.00]
  • 68–71. Filtering: The math behind the match [Figure: the feature is lined up with a different patch; the pixel products are 1 1 -1 / 1 1 1 / -1 1 1, so the sum is 5 and the match is 5 / 9 = 0.55]
  • 72. Convolution: Trying every possible match [Figure: sliding the diagonal 3x3 feature over every position of the 9x9 image gives a 7x7 map of match values:]
      0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
      -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
      0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
      0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
      0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
      -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
      0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
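"Trying every possible match" amounts to repeating the filtering calculation at every position where the feature fits inside the image. The sketch below does this for the 9x9 X image and the diagonal feature from the slides; the name `convolve` is hypothetical, and no padding is applied, which is why a 9x9 image and a 3x3 feature give a 7x7 result. The slides appear to truncate the values to two decimals (7 / 9 is shown as 0.77), so printed values may differ in the last digit.

```python
ROWS = [
    ".........",
    ".X.....X.",
    "..X...X..",
    "...X.X...",
    "....X....",
    "...X.X...",
    "..X...X..",
    ".X.....X.",
    ".........",
]
image = [[1 if ch == "X" else -1 for ch in row] for row in ROWS]
feature = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]


def convolve(image, feature):
    """Slide the feature over the image and record the average match at each position."""
    size = len(feature)
    out_rows = len(image) - size + 1
    out_cols = len(image[0]) - size + 1
    result = []
    for top in range(out_rows):
        row = []
        for left in range(out_cols):
            total = sum(image[top + r][left + c] * feature[r][c]
                        for r in range(size) for c in range(size))
            row.append(total / (size * size))
        result.append(row)
    return result


filtered = convolve(image, feature)
for row in filtered:
    print(" ".join(f"{value:5.2f}" for value in row))
```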
  • 73. [Figure: the 9x9 image convolved with each of the three 3x3 features, giving three 7x7 maps of match values]
  • 74. Convolution layer • One image becomes a stack of filtered images. [Figure: the image, the three features, and the three resulting 7x7 maps]
  • 75. Convolution layer • One image becomes a stack of filtered images. [Figure: the image and the stack of three 7x7 filtered images]
  • 76. Pooling: Shrinking the image stack 1. Pick a window size (usually 2 or 3). 2. Pick a stride (usually 2). A stride = step. 3. Walk your window across your filtered images. 4. From each window, take the maximum value.
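The pooling steps can be sketched as below, using the 7x7 filtered image from the convolution slide as input. The name `max_pool` is hypothetical; windows that hang over the edge are simply clipped to the image, an assumption that reproduces the 4x4 result shown on the following slides.

```python
def max_pool(image, window=2, stride=2):
    """Walk a window across the image and keep the maximum of each window."""
    pooled = []
    for top in range(0, len(image), stride):
        row = []
        for left in range(0, len(image[0]), stride):
            # Clip the window at the image border.
            values = [image[r][c]
                      for r in range(top, min(top + window, len(image)))
                      for c in range(left, min(left + window, len(image[0])))]
            row.append(max(values))
        pooled.append(row)
    return pooled


# The 7x7 filtered image from the convolution slide.
filtered = [
    [0.77, -0.11, 0.11, 0.33, 0.55, -0.11, 0.33],
    [-0.11, 1.00, -0.11, 0.33, -0.11, 0.11, -0.11],
    [0.11, -0.11, 1.00, -0.33, 0.11, -0.11, 0.55],
    [0.33, 0.33, -0.33, 0.55, -0.33, 0.33, 0.33],
    [0.55, -0.11, 0.11, -0.33, 1.00, -0.11, 0.11],
    [-0.11, 0.11, -0.11, 0.33, -0.11, 1.00, -0.11],
    [0.33, -0.11, 0.55, 0.33, 0.11, -0.11, 0.77],
]
pooled = max_pool(filtered)
for row in pooled:
    print(row)
```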
  • 77–82. Pooling [Figure: a 2x2 window with stride 2 walks across the first 7x7 filtered image and keeps the maximum of each window, filling in a 4x4 result one value at a time:]
      1.00 0.33 0.55 0.33
      0.33 1.00 0.33 0.55
      0.55 0.33 1.00 0.11
      0.33 0.55 0.11 0.77
  • 83. [Figure: the three 7x7 filtered images and their three 4x4 pooled versions]
  • 84. Pooling layer • A stack of images becomes a stack of smaller images. [Figure: the three 7x7 filtered images and their three 4x4 pooled versions]
  • 85. Normalization • Keep the math from breaking by tweaking each of the values just a bit. • Change everything negative to zero.
  • 86-89. Rectified Linear Units (ReLUs) [Figure: a 7×7 filtered image is processed one value at a time; each negative value becomes 0 and every other value is kept. For example, the row 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 becomes 0.77 0 0.11 0.33 0.55 0 0.33.]
  • 90. ReLU layer • A stack of images becomes a stack of images with no negative values. [Figure: three 7×7 filtered images before and after ReLU.]
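The ReLU step is small enough to sketch directly. The example below applies it to the first row of the filtered image shown on the slides; the function name `relu` is just a label chosen for this sketch.

```python
def relu(image):
    """Replace every negative value in a 2D list with 0; keep the rest."""
    return [[max(0.0, value) for value in row] for row in image]


# First row of the 7x7 filtered image from the slides.
row = [0.77, -0.11, 0.11, 0.33, 0.55, -0.11, 0.33]

rectified = relu([row])[0]
print(rectified)  # [0.77, 0.0, 0.11, 0.33, 0.55, 0.0, 0.33]
```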
  • 91. Layers get stacked • The output of one becomes the input of the next. [Figure: the 9×9 input image (pixels of -1 and 1) is transformed by the stacked layers into three 4×4 images.]
  • 92. Deep stacking • Layers can be repeated several (or many) times. [Figure: the 9×9 input image is reduced by repeated layers to three 2×2 images.]
  • 93. Fully connected layer • Every value gets a vote. [Figure: the three 2×2 images are rearranged into a single list of 12 values.]
  • 94-95. Fully connected layer • Vote depends on how strongly a value predicts X or O. [Figure: a list of 12 values connected to the two outputs, X and O, shown for two different inputs.]
  • 96-101. Fully connected layer • Future values vote on X or O. [Figure: a list of 12 feature values connected to the two outputs, X and O.]
  • 102. Fully connected layer • A list of feature values becomes a list of votes. [Figure: feature values 0.9 0.65 0.45 0.87 0.96 0.73 0.23 0.63 0.44 0.89 0.94 0.53 connected to the outputs X and O.]
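One way to picture the voting is as a weighted sum per output. The sketch below uses the 12 feature values from the slide; the voting weights are made up for the example (the slides do not list them), and `fully_connected` is a hypothetical helper name.

```python
def fully_connected(values, weights_per_class):
    """Return one vote per class: the weighted sum of all input values."""
    votes = {}
    for label, weights in weights_per_class.items():
        if len(weights) != len(values):
            raise ValueError("each class needs one weight per input value")
        votes[label] = sum(w * v for w, v in zip(weights, values))
    return votes


# Feature values taken from the slide.
features = [0.9, 0.65, 0.45, 0.87, 0.96, 0.73,
            0.23, 0.63, 0.44, 0.89, 0.94, 0.53]

# Hypothetical voting weights, invented for this example only.
weights = {
    "X": [1.0, 0.2, 0.1, 0.9, 0.8, 0.3, 0.1, 0.2, 0.1, 0.9, 0.7, 0.2],
    "O": [0.2, 0.9, 0.8, 0.1, 0.2, 0.7, 0.9, 0.8, 0.9, 0.1, 0.2, 0.8],
}

votes = fully_connected(features, weights)
prediction = max(votes, key=votes.get)  # the class with the largest vote
print(votes, prediction)
```

With these invented weights the X output collects the larger total. In a trained network the weights would be learned rather than written by hand.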
  • 103. Putting it all together • A set of pixels becomes a set of votes. [Figure: the 9×9 input image passes through Layer 1 to Layer 5 and yields votes for X and O.]
  • 104-105. Gradient descent • For each feature pixel and voting weight, adjust it up and down a bit and see how the error changes. [Figure: error plotted against weight.]
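The "adjust it up and down a bit" idea can be sketched as a finite-difference estimate of the slope, followed by a small step downhill. Everything named here (`numeric_gradient`, `gradient_descent`, the toy error function, the learning rate and step count) is an assumption made for illustration. Real frameworks usually compute gradients with backpropagation rather than by nudging each weight.

```python
def numeric_gradient(error_fn, weights, index, eps=1e-4):
    """Nudge one weight up and down and measure how the error changes."""
    up = list(weights)
    up[index] += eps
    down = list(weights)
    down[index] -= eps
    return (error_fn(up) - error_fn(down)) / (2 * eps)


def gradient_descent(error_fn, weights, learning_rate=0.1, steps=200):
    """Repeatedly move every weight a little in the downhill direction."""
    weights = list(weights)
    for _ in range(steps):
        grads = [numeric_gradient(error_fn, weights, i)
                 for i in range(len(weights))]
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
    return weights


def toy_error(w):
    """Hypothetical error surface with its minimum at w = [3, -1]."""
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2


tuned = gradient_descent(toy_error, [0.0, 0.0])
print(tuned)  # approaches [3.0, -1.0]
```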
  • 106. Tuning the CNN • Architecture: how many of each type of layer, and in what order? • Convolution: number of features, size of features • Pooling: window size, window stride • Fully connected: number of neurons
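These choices interact through the image size that each layer hands to the next. The sketch below traces that size through a stack. It assumes a convolution without padding (output size `n - feature_size + 1`) and pooling whose edge windows may be partial (output size `ceil(n / stride)`). The layer tuples and the function `trace_sizes` are hypothetical conventions for this example.

```python
import math


def trace_sizes(input_size, layers):
    """Return the image side length after each layer in the stack."""
    sizes = [input_size]
    for layer in layers:
        kind, n = layer[0], sizes[-1]
        if kind == "conv":
            # No padding: an f x f feature fits in n - f + 1 positions.
            n = n - layer[1] + 1
        elif kind == "pool":
            # Windows start every `stride` pixels; edge windows may be partial.
            n = math.ceil(n / layer[1])
        elif kind != "relu":
            # ReLU leaves the size unchanged; anything else is unknown.
            raise ValueError(f"unknown layer type: {kind}")
        sizes.append(n)
    return sizes


# Assumed stack: 3x3 features, then ReLU, then pooling with a stride of 2.
sizes = trace_sizes(9, [("conv", 3), ("relu",), ("pool", 2)])
print(sizes)  # [9, 7, 7, 4]
```

With a 9×9 input this gives 7×7 after convolution and 4×4 after pooling, which matches the image sizes shown in the slides.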
  • 107. CNN - Not just for images • CNNs suit any data in which things closer together are more closely related than things far away: • 2D images • 3D images • Audio • Video • Signal processing • NLP: semantic parsing, sentence modelling and more • Drug discovery: chemical interactions
  • 108. MNIST demo using CNN
  • 109. Machine Learning in the near future There is a lot of ML research in academia and in commercial companies, and a lot of money is being invested in it. • ML will be adopted at much greater scale across almost every industry. • ML will be embedded everywhere. • Specialized hardware for ML will enable deeper and faster learning. • The Machine Learning as a Service (MLaaS) market will grow substantially. • ML will save more lives. • ML will automate more repetitive tasks.
  • 110. Why should developers/data engineers/DBAs invest time in ML? • Data is the fuel of every ML system, and it comes from the data platforms that DBAs manage. • Data preparation before training is the most time-consuming part. • DBAs can definitely assist here. • ML is not just for data scientists (up to a certain level): • Developers already use ML. • Data engineers use ML. • ML can be used by DBAs too, so why not? • ML is becoming increasingly easy to use: • Azure ML • AWS ML