Neural Networks and Deep Learning

Models of a neuron

Deterministic model:

1) Synapses / connecting links
-> input signal
-> weight

2) Adder
-> linear combiner

3) Activation function / squashing function
-> limits the permissible amplitude range of the output to a finite value.
-> the net input of the activation function can be lowered or increased through a bias.
-> the bias modifies the relationship between the induced local field and the activation function.
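A minimal sketch of this deterministic neuron model in Python (NumPy assumed; the weights, bias, and sigmoid used here are illustrative, not values from the notes):

    import numpy as np

    def neuron(x, w, b):
        """Deterministic neuron model: adder (linear combiner) + bias + activation."""
        v = np.dot(w, x) + b                # induced local field: weighted sum plus bias
        return 1.0 / (1.0 + np.exp(-v))     # squashing function limits output to (0, 1)

    # illustrative values (not from the notes)
    x = np.array([0.5, -1.2, 3.0])          # input signals arriving over the synapses
    w = np.array([0.4, 0.1, -0.6])          # synaptic weights
    b = 0.2                                 # bias shifts the induced local field
    print(neuron(x, w, b))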

Types of activation functions

1) Threshold function (Heaviside function)
2) Piecewise linear function (unity amplification)
3) Sigmoid function

Sigmoid: phi(v) = 1 / (1 + exp(-a v)), where a is the slope parameter.
-> As a approaches infinity, the sigmoid approaches the threshold function.
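A small Python sketch of the three activation functions above (NumPy assumed; the slope value is illustrative):

    import numpy as np

    def threshold(v):
        # Heaviside / threshold function
        return np.where(v >= 0, 1.0, 0.0)

    def piecewise_linear(v):
        # unity amplification in the linear region, clipped to [0, 1]
        return np.clip(v + 0.5, 0.0, 1.0)

    def sigmoid(v, a=2.0):
        # a is the slope parameter; for large a this approaches the threshold function
        return 1.0 / (1.0 + np.exp(-a * v))

    v = np.linspace(-2, 2, 5)
    print(threshold(v), piecewise_linear(v), sigmoid(v))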

Memory

-> Memory and learning are intricately connected.
-> The spatial pattern of neural activities inside the memory contains information about the stimulus pattern.
-> Thus memory transforms an activity pattern in the input space into another activity pattern in the output space. (It is done through matrix multiplication: Y = MX.)
-> The weight matrix defines the overall connectivity between the input and output layers of the associative memory.

Correlation Matrix Memory

M = sum over k = 1 to q of y_k x_k^T

Recall: consider x_j to be a normalized key vector (||x_j|| = 1). Then

y = M x_j
  = sum over k of (x_k^T x_j) y_k
  = y_j + sum over k != j of (x_k^T x_j) y_k
  = y_j + e_j

i.e. the desired response y_j plus an error term e_j contributed by the other stored patterns.
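A small Python sketch of storing and recalling pairs with a correlation matrix memory (NumPy assumed; the vectors here are random and purely illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    q, n, m = 3, 8, 5                                 # illustrative sizes: q stored pairs
    X = rng.normal(size=(q, n))
    X /= np.linalg.norm(X, axis=1, keepdims=True)     # normalized key vectors, ||x_k|| = 1
    Y = rng.normal(size=(q, m))                       # stored response vectors

    # correlation matrix memory: M = sum_k y_k x_k^T
    M = sum(np.outer(Y[k], X[k]) for k in range(q))

    # recall with key x_j: y = M x_j = y_j + error from the other stored patterns
    j = 1
    print(M @ X[j])
    print(Y[j])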
Failure in learning

-> The training algorithm may not find the solution parameters.
-> The training algorithm may choose the wrong function due to overfitting.

CNN

-> Captures spatial features from an image.
-> Spatial features help identify the object and its location more accurately.

RNN

-> Intermediate results are fed back to the layer to predict the outcome.
-> Information from the previous time-step is remembered by a memory function.
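A minimal sketch of this feedback loop as a simple recurrent step in Python (NumPy assumed; the sizes, random weights, and tanh activation are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hidden = 4, 6
    Wx = rng.normal(scale=0.1, size=(n_hidden, n_in))
    Wh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
    b = np.zeros(n_hidden)

    h = np.zeros(n_hidden)                    # memory carried across time-steps
    for x_t in rng.normal(size=(5, n_in)):    # a short input sequence
        # the intermediate result h is fed back into the layer at the next time-step
        h = np.tanh(Wx @ x_t + Wh @ h + b)
    print(h)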

Gradient based learning

-> The optimization algorithm uses model gradients to update parameters during training.
-> Enables learning complex representations of data.
-> Gradient info / sensitivity info is needed to determine the search directions.
-> Spatial and temporal partial derivatives are used to estimate flow at every position.
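A minimal gradient-descent sketch in Python for a least-squares model (NumPy assumed; the data, learning rate, and iteration count are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

    w = np.zeros(3)
    lr = 0.1
    for _ in range(200):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of the mean squared error
        w -= lr * grad                          # update parameters along the search direction
    print(w)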


Width of a Neural Network

-> No. of neurons/units in each layer of a NN.
-> More width -> more capacity to capture complex patterns.
-> More width -> better for memorizing and fitting complex data.
-> But more width may lead to overfitting.

Depth of a Neural Network

-> No. of layers in a NN.
-> More depth -> better at hierarchical data.
-> More depth -> better at abstract features.
-> May face vanishing and exploding gradient problems.
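An illustrative sketch in Python (assuming TensorFlow/Keras is available; the layer sizes and counts are arbitrary) contrasting width (units per layer) with depth (number of layers):

    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(20,)),
        keras.layers.Dense(64, activation="relu"),   # width: 64 units in this layer
        keras.layers.Dense(64, activation="relu"),   # each extra layer adds depth
        keras.layers.Dense(1),
    ])
    model.summary()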

Activation Functions

-> Introduce non-linear properties into the Neural Network.

Restricted Boltzmann Machines

-> Used for generative modelling and unsupervised learning.
-> Stochastic learning processes.
-> Statistical in nature.
-> Also used in supervised learning.
-> The state of each individual neuron is taken into account.
-> They have fixed weights. However, the weights are to be set; the objective is to maximize the consensus function.
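A minimal sketch of the stochastic, neuron-by-neuron sampling in an RBM in Python (NumPy assumed; the sizes, random weights, and the one-step contrastive-divergence style update are illustrative assumptions, not a procedure stated in the notes):

    import numpy as np

    rng = np.random.default_rng(0)
    n_visible, n_hidden = 6, 3
    W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
    a = np.zeros(n_visible)   # visible biases
    b = np.zeros(n_hidden)    # hidden biases

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample(p):
        # each neuron's binary state is sampled stochastically from its activation probability
        return (rng.random(p.shape) < p).astype(float)

    v0 = rng.integers(0, 2, size=n_visible).astype(float)    # an illustrative binary input
    ph0 = sigmoid(v0 @ W + b);  h0 = sample(ph0)
    pv1 = sigmoid(h0 @ W.T + a); v1 = sample(pv1)
    ph1 = sigmoid(v1 @ W + b)

    # one-step contrastive-divergence style weight update (illustrative)
    lr = 0.1
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))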
AutoEncoders

-> ANNs capable of learning dense representations of the input data.
-> These dense representations are called 'latent representations' or 'codings'.
-> It works unsupervised; it learns to copy inputs to outputs.
-> Codings typically have a lower dimensionality than the input.
-> They also act as feature detectors.
-> Can be used for unsupervised pretraining of neural networks.
-> Some autoencoders are generative models.
-> These are capable of generating random new data that looks very similar to the training data.
-> Noise can be added, or the size of the latent representations can be limited.
-> This forces the model to learn efficient ways of representing the data.
-> It always has two parts: (i) Encoder (recognition network), (ii) Decoder (generation network).
-> No. of neurons in the output layer = No. of neurons in the input layer, since the outputs are called reconstructions.
-> The cost function contains a reconstruction loss.
-> Because the internal representation of the autoencoder has lower dimensionality than the input data, the autoencoder is said to be undercomplete.
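A minimal undercomplete autoencoder sketch in Python (assuming TensorFlow/Keras; the 784/32 dimensions and the MSE reconstruction loss are illustrative):

    from tensorflow import keras

    # encoder (recognition network): compresses the input into a lower-dimensional coding
    encoder = keras.Sequential([
        keras.Input(shape=(784,)),
        keras.layers.Dense(32, activation="relu"),   # coding has lower dimensionality -> undercomplete
    ])
    # decoder (generation network): reconstructs the input from the coding
    decoder = keras.Sequential([
        keras.Input(shape=(32,)),
        keras.layers.Dense(784, activation="sigmoid"),
    ])
    autoencoder = keras.Sequential([encoder, decoder])

    # the cost function is a reconstruction loss: outputs are compared with the inputs themselves
    autoencoder.compile(optimizer="adam", loss="mse")
    # autoencoder.fit(X_train, X_train, epochs=10)   # note: the inputs are also the targets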
-> Thus, a full layer of neurons using the same filter outputs a feature map.
-> A feature map highlights the areas in an image that activate the filter the most.
-> During training, the convolution layer will automatically learn the most useful filters for its task, and the layers above it will learn to combine them into complex patterns.
-> Each convolution layer has multiple filters and outputs one feature map for each filter.
-> It has one neuron per pixel in the feature map.
-> All neurons of a given feature map share the same parameters.
-> Thus, a convolutional layer applies multiple trainable filters to its inputs, making it capable of detecting multiple features anywhere in its inputs.

-> Valid padding: No zero padding. Each neuron's receptive field lies strictly within valid positions inside the input.
-> Same padding: Inputs are padded with enough zeroes on all sides to ensure that output feature maps end up with the same size as the inputs.
-> If stride > 1, then the output size will not be equal to the input size, even with same padding.
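An illustrative shape check of these padding rules in Python (assuming TensorFlow/Keras; the 28x28 input, filter count, and kernel size are arbitrary):

    from tensorflow import keras

    inp = keras.Input(shape=(28, 28, 1))
    valid = keras.layers.Conv2D(8, 3, padding="valid")(inp)              # no zero padding: 28 -> 26
    same = keras.layers.Conv2D(8, 3, padding="same")(inp)                # zero padded: stays 28 x 28
    strided = keras.layers.Conv2D(8, 3, strides=2, padding="same")(inp)  # stride > 1: 28 -> 14
    print(valid.shape, same.shape, strided.shape)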
(iv) Output gate o(t): controls the output gate, which decides what parts of the long-term state should be read and output at this time step, both to h(t) and y(t).

So an LSTM cell can:
(i) learn to recognize an important input,
(ii) store that important input in the long-term state,
(iii) preserve it for as long as needed.
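An illustrative Python sketch (assuming TensorFlow/Keras; the sizes are arbitrary) of an LSTM layer exposing both the per-step outputs y(t) and the final short-term state h and long-term state c:

    from tensorflow import keras

    inp = keras.Input(shape=(None, 10))      # a sequence of 10-dimensional inputs
    y, h, c = keras.layers.LSTM(32, return_sequences=True, return_state=True)(inp)
    model = keras.Model(inp, [y, h, c])
    # y: output at every time step; h: final short-term state; c: final long-term (cell) state
    model.summary()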

GRU cells

-> A simplified version of the LSTM cell.
· Both state vectors are merged into h(t).
· A single gate controller controls both the forget gate and the input gate. When one opens, the other closes.
· No output gate. The full state vector is output at every time step.
· A new gate controller r(t) tells what part of the previous state will be shown to g(t).

However good LSTM and GRU cells are, they still can't tackle very long sequences. So input sequences are shortened using 1D convolution layers.
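An illustrative sketch in Python (assuming TensorFlow/Keras; the kernel size, stride, and unit counts are arbitrary) of shortening a sequence with a 1D convolution layer before a recurrent layer:

    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(None, 8)),
        # stride 2 roughly halves the sequence length before the recurrent layer sees it
        keras.layers.Conv1D(filters=16, kernel_size=4, strides=2, padding="valid"),
        keras.layers.GRU(32),
        keras.layers.Dense(1),
    ])
    model.summary()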
