Rec03 - Deep Architectures

The document provides an overview of machine learning techniques, focusing on deep neural network architectures such as Convolutional Neural Networks (CNN) for image processing and Recurrent Neural Networks (RNN) for sequential data. It discusses various applications, including object detection and image segmentation, and introduces interpretability methods like LIME and SHAP. Additionally, it touches on the evolution of machine learning workflows with the advent of large-scale datasets and foundation models.

Uploaded by

Toyba


Machine Learning & Neural Networks
Deep Neural Network Architectures
Topics
• Images via CNN

• Sequential data via RNN & Transformers

• Dealing with other data types

• A brief introduction to Interpretability (LIME & SHAP)

Images
A vector of pixels
CNN
Convolutional Neural Network
CNN
Full architecture
[Figure: an input image and its convolved feature (or activation map). Image rows recovered from the figure:]
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 1 1 0
CNN – part 1:
Convolution Layer
With padding:
Padding 1 => N_new = N + 2*1 = 9 => output size = (N_new - F)/S + 1 = (9 - 3)/3 + 1 = 3 (kernel size F = 3, stride S = 3)
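The arithmetic above can be checked with a small helper. This is a minimal sketch, not part of the original slides: the function name `conv_output_size` is hypothetical, and the original width of 7 is an assumption inferred from "padding 1 => N_new = 9".

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension.

    Computes (n + 2*padding - kernel) // stride + 1, the same arithmetic
    as the (9 - 3) / 3 + 1 example on the slide.
    """
    padded = n + 2 * padding
    if padded < kernel:
        raise ValueError("kernel is larger than the padded input")
    return (padded - kernel) // stride + 1


# Slide example, assuming an unpadded width of 7 (padding 1 gives 9).
print(conv_output_size(7, kernel=3, stride=3, padding=1))   # 3
# A 3x3 kernel with stride 1 and padding 1 preserves a 28-pixel width.
print(conv_output_size(28, kernel=3, stride=1, padding=1))  # 28
```

The same helper applies to pooling: a 2x2 window with stride 2 halves a 28-pixel width to 14.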
CNN – part 2:
Pooling Layer
Pooling
• Decreases the computational power required to process the data
• Extracts dominant features

Max pooling
Is there a good match with the feature anywhere in the window? (One match is enough.)

Avg pooling
What is the average match with the pattern over the whole window?
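To make the difference between the two pooling types concrete, here is a minimal pure-Python sketch. The function `pool2d` and the toy activation map are illustrative assumptions, not taken from the slides.

```python
def pool2d(image, size=2, stride=2, mode="max"):
    """Apply max or average pooling to a 2D list of numbers."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect the values inside the current pooling window.
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            if mode == "max":
                pooled_row.append(max(window))
            else:
                pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled


# Toy 4x4 activation map (made-up values).
activation = [
    [1, 0, 2, 3],
    [4, 6, 6, 8],
    [3, 1, 1, 0],
    [1, 2, 2, 4],
]
print(pool2d(activation, mode="max"))  # [[6, 8], [3, 4]]
print(pool2d(activation, mode="avg"))  # [[2.75, 4.75], [1.75, 1.75]]
```

Max pooling keeps the single strongest response per window, while average pooling summarizes the whole window.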
CNN – part 3:
Fully Connected Layer(s)
• The flattened vector represents the input’s features
• Build a non-linear classifier (MLP) on top of it
Implementation
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    def __init__(self, in_channels, num_classes=10):
        """
        in_channels: int
            The number of channels in the input image. For MNIST, this is 1 (grayscale images).
        num_classes: int
            The number of classes we want to predict, in our case 10 (digits 0 to 9).
        """
        super(CNN, self).__init__()
        # 1st conv layer: in_channels input channels (1 for MNIST), 8 output channels, 3x3 kernel, stride 1, padding 1
        self.conv1 = nn.Conv2d(in_channels=in_channels, out_channels=8, kernel_size=3, stride=1, padding=1)
        # Max pooling layer: 2x2 window, stride 2 (halves height and width)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        # 2nd conv layer: 8 input channels, 16 output channels, 3x3 kernel, stride 1, padding 1
        self.conv2 = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, stride=1, padding=1)
        # Fully connected layer: 16*7*7 input features (28x28 input after two 2x2 poolings), num_classes output features
        self.fc1 = nn.Linear(16 * 7 * 7, num_classes)

    def forward(self, x):
        x = F.relu(self.conv1(x))      # Apply first convolution and ReLU activation
        x = self.pool(x)               # Apply max pooling
        x = F.relu(self.conv2(x))      # Apply second convolution and ReLU activation
        x = self.pool(x)               # Apply max pooling
        x = x.reshape(x.shape[0], -1)  # Flatten the tensor to (batch_size, 16*7*7)
        x = self.fc1(x)                # Apply fully connected layer
        return x
Vanilla CNN

This is the starting point!

Skip connection

ResNet-152?
CNN - hyperparameters
• Number of layers
• Size of kernel
• Number of kernels
• Stride
• Padding
Applications
What to do with CNN architecture?
- Object classification
- Object detection
- Image segmentation
Example Task #1 - Object detection
• Identifying and locating objects within an image.
• Object detection provides both (i) the class and (ii) the bounding-box coordinates for each object detected in the image.
• This makes it a more complex and information-rich task (vs. simply detecting whether a certain class is present).
YOLO (You Only Look Once)
• An example of an advanced CNN architecture for object detection.
• Divides the image into a grid and predicts bounding boxes and class probabilities for each grid cell.
• Known for its good real-time performance.
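The grid idea can be illustrated with a small sketch. It assumes the layout of the original YOLO formulation (a 7x7 grid, 2 boxes per cell, 20 classes, and the cell containing an object's center being responsible for it). The function names are hypothetical and later YOLO versions differ in detail.

```python
def responsible_cell(center_x, center_y, image_w, image_h, grid_size=7):
    """Return (row, col) of the grid cell that contains a box center."""
    col = min(int(center_x / image_w * grid_size), grid_size - 1)
    row = min(int(center_y / image_h * grid_size), grid_size - 1)
    return row, col


def predictions_per_cell(boxes_per_cell=2, num_classes=20):
    """Numbers predicted per grid cell under the assumed layout."""
    # Each box carries x, y, w, h and a confidence score (5 numbers),
    # followed by one probability per class.
    return boxes_per_cell * 5 + num_classes


# An object centered at (224, 100) in a 448x448 image.
print(responsible_cell(224, 100, image_w=448, image_h=448))  # (1, 3)
print(predictions_per_cell())                                # 30
```

All cells are predicted in a single forward pass, which is consistent with the real-time performance noted above.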
Example Task #2 - Image segmentation

• Partitioning an image into multiple segments.
• The goal is to assign a class label to each pixel in the image.
• Semantic segmentation
• Instance segmentation
• Panoptic segmentation
Segment Anything Model (SAM)

https://segment-anything.com/
Sequential data
Sometimes our data comes in the form of a sequence:
● Spike trains
● Stocks
● Sentences & Speech
The previous approach to analyzing sequential data was window-based classifiers:
● Sliding windows (avg., sum)
● For spike train data, we discretize time into bins of fixed width and count the number of events that occur in each time bin.
The problem: how to choose the right window size?
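The binning and sliding-window steps can be sketched as follows. The spike times, the 100 ms bin width and the function names are made-up illustrations. Integer milliseconds are used to avoid floating-point edge cases at bin boundaries.

```python
def bin_spike_counts(spike_times_ms, bin_width_ms, duration_ms):
    """Count spikes in consecutive fixed-width time bins."""
    n_bins = -(-duration_ms // bin_width_ms)  # ceiling division
    counts = [0] * n_bins
    for t in spike_times_ms:
        if 0 <= t < duration_ms:
            counts[t // bin_width_ms] += 1
    return counts


def sliding_window_mean(values, window):
    """Average of each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


spikes = [12, 48, 130, 135, 320, 410, 499]  # hypothetical spike times in ms
counts = bin_spike_counts(spikes, bin_width_ms=100, duration_ms=500)
print(counts)                          # [2, 2, 0, 1, 2]
print(sliding_window_mean(counts, 2))  # [2.0, 1.0, 0.5, 1.5]
```

Changing `bin_width_ms` or `window` changes the resulting features, which is exactly the window-size problem raised above.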
Sequential data - Text (NLP; written lang.)
• Can be done at character-level / word-level / document-level:
Sparse vectors: one-hot encoding, e.g., Bag of Words
Length of vector = number of words in dictionary (e.g., words occurring fewer than 10 times are grouped as ‘other’)
Dense vectors: word embeddings (e.g., Word2Vec)
Length of vector = a different number of learned features in the embedded space
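A minimal sketch of the sparse representations, assuming whitespace tokenization and a two-sentence toy corpus (both are illustrative choices, not from the slides). Dense embeddings such as Word2Vec are learned from data and are not shown here.

```python
def build_vocabulary(sentences):
    """Map each distinct word to an index, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def one_hot(word, vocab):
    """Sparse vector with a single 1 at the word's index."""
    vector = [0] * len(vocab)
    vector[vocab[word]] = 1
    return vector


def bag_of_words(sentence, vocab):
    """Word counts for a sentence; the vector length equals the vocabulary size."""
    vector = [0] * len(vocab)
    for word in sentence.lower().split():
        if word in vocab:
            vector[vocab[word]] += 1
    return vector


corpus = ["the cat sat", "the dog sat on the mat"]
vocab = build_vocabulary(corpus)
print(vocab)                                          # {'the': 0, 'cat': 1, 'sat': 2, 'dog': 3, 'on': 4, 'mat': 5}
print(one_hot("dog", vocab))                          # [0, 0, 0, 1, 0, 0]
print(bag_of_words("the dog sat on the mat", vocab))  # [2, 0, 1, 1, 1, 1]
```

Both vectors grow with the dictionary, whereas a dense embedding has a fixed, usually much smaller, number of learned features.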
Sequential data – Audio (spoken lang.)
Two different domains:
• Time domain
• Frequency domain
Common in neuroscience:
• Discrete signal - Spike train data (1/0)
• Continuous signal – EEG, LFP
Recurrent Neural Network (RNN)
[Figure: a chain of five RNN cells]

Vanilla (or Elman’s) RNN

Note:
The parameters do not change as a function of t.
Only the hidden state changes.
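The note can be illustrated with a minimal sketch of the commonly used vanilla RNN update, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h). The weight values and function names below are arbitrary assumptions, and the output layer is omitted.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step: h_t = tanh(W_xh x_t + W_hh h_prev + b_h)."""
    pre_activation = [a + b + c for a, b, c in
                      zip(matvec(W_xh, x_t), matvec(W_hh, h_prev), b_h)]
    return [math.tanh(p) for p in pre_activation]


def run_rnn(inputs, h0, W_xh, W_hh, b_h):
    """Apply the same weights at every time step; only h changes."""
    h = h0
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
        states.append(h)
    return states


# Arbitrary toy parameters: 2 input features, hidden state dimension 2.
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.2, 0.0], [0.0, 0.2]]
b_h = [0.0, 0.0]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

states = run_rnn(sequence, [0.0, 0.0], W_xh, W_hh, b_h)
for t, h in enumerate(states):
    print(t, [round(v, 3) for v in h])
```

The loop reuses `W_xh`, `W_hh` and `b_h` unchanged at every step, so the parameter count does not depend on the sequence length.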
RNN Layers

[Figure: RNN cells arranged in layers]

Hyperparameters of Vanilla RNN
• Number of layers
• Hidden state dimension

• Note that the input and output dimensions of an RNN are not hyperparameters!
They depend on the embeddings, the type of task, etc.
Which architecture will we use?
[Figure panels: Classification | Image Captioning | Sentiment Analysis | Machine Translation, Summarization | Entity Recognition]
Example Task #1 – Image captioning

Example captions:
• A man and a girl sit on the ground and eat
• A man and a little girl are sitting on a sidewalk near a blue bag eating
• A man wearing a black shirt and a little girl wearing an orange dress share a treat
Model:
• Use CNN+FC to convert the image into a single vector representation
• Use an RNN to generate the output sentence, using the image vector as another input
Pros and cons of RNN
• Pros:
  • Can process input of any length
  • In theory, the computation at the current step can use information from many steps back
  • Model size doesn’t increase for longer input contexts, as the same weights are applied at every step
• Cons:
  • Recurrent computation is slow (the steps must be computed one after another)
  • In practice, it is difficult to access information from many steps back
RNN model without attention
RNN model with attention
Sequential data modeling
Transformer architecture
[Figure: encoder and decoder]
Transformers
• Attention Is All You Need (Vaswani et al., 2017).
• The ‘main’ ideas:
1) Positional encoding
2) Multi-head attention
3) Layer normalization (vs. batch norm)
Tweak #1 - Positional encoding

• The raw index value is poorly suited to representing an item’s position in transformer models because, for long sequences, the indices can grow large in magnitude.
• Positional encoding describes the location or position of an entity in a sequence so that each position is assigned a unique representation.
https://machinelearningmastery.com/a-gentle-introduction-to-positional-encoding-in-transformer-models-part-1/
Example of positional encoding
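One concrete instance is the sinusoidal scheme from Vaswani et al. (2017), where even dimensions use a sine and odd dimensions a cosine of pos / n^(2i/d_model) with n = 10000. The sketch below assumes that scheme. The function name and the small sizes are illustrative.

```python
import math


def positional_encoding(seq_len, d_model, n=10000.0):
    """Sinusoidal positional encoding: one d_model-sized vector per position."""
    encoding = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            # i is the even dimension index, so i / d_model equals 2k / d_model.
            angle = pos / (n ** (i / d_model))
            encoding[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                encoding[pos][i + 1] = math.cos(angle)
    return encoding


for pos, row in enumerate(positional_encoding(seq_len=4, d_model=4)):
    print(pos, [round(v, 3) for v in row])
```

Every value stays within [-1, 1] regardless of sequence length, unlike a raw index that grows with position.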

Tweak #2 – from attention…
● Each decoded token in the target sequence focuses on different tokens from the source sequence.
… to a single self-attention
… to multi-head attention!
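A minimal sketch of scaled dot-product self-attention and a simplified multi-head variant. As a simplifying assumption, the learned projection matrices (queries, keys, values and the output projection) are omitted, so each head just attends over a slice of the token features. All names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(q . k / sqrt(d_k)) weighted sum of values."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
        all_weights.append(weights)
    return outputs, all_weights


def multi_head_self_attention(tokens, num_heads):
    """Run attention independently on feature slices, then concatenate the heads."""
    head_dim = len(tokens[0]) // num_heads
    head_outputs = []
    for h in range(num_heads):
        sliced = [tok[h * head_dim:(h + 1) * head_dim] for tok in tokens]
        out, _ = attention(sliced, sliced, sliced)  # self-attention: Q = K = V
        head_outputs.append(out)
    return [sum((head[t] for head in head_outputs), []) for t in range(len(tokens))]


tokens = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
_, weights = attention(tokens, tokens, tokens)
print([round(w, 3) for w in weights[0]])  # how much token 0 attends to each token
print(multi_head_self_attention(tokens, num_heads=2))
```

Each row of `weights` sums to 1, so every output token is a weighted average of the value vectors.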
Tweak #3 - Layer Normalization
• Normalize the units in a particular layer so that they have the same distribution across all features.
• We compute the layer norm statistics across all the hidden units in the same layer:

$\mu = \frac{1}{H}\sum_{i=1}^{H} a_i, \qquad \sigma = \sqrt{\frac{1}{H}\sum_{i=1}^{H} (a_i - \mu)^2}$

where H denotes the number of hidden units in a layer and a_i is the input to hidden unit i.

All the hidden units in a layer share the same normalization terms μ and σ.
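A minimal sketch of this normalization for a single layer's activations. It assumes the usual definitions (μ is the mean and σ the standard deviation over the H hidden units) and adds a small `eps` for numerical stability. The learned gain and bias used in practice are omitted.

```python
import math


def layer_norm(hidden, eps=1e-5):
    """Normalize one layer's H activations with a shared mean and std."""
    H = len(hidden)
    mu = sum(hidden) / H                              # shared mean over the layer
    variance = sum((a - mu) ** 2 for a in hidden) / H
    sigma = math.sqrt(variance + eps)                 # shared std (eps avoids division by zero)
    return [(a - mu) / sigma for a in hidden]


activations = [2.0, 4.0, 6.0, 8.0]  # made-up activations of one layer for one sample
normalized = layer_norm(activations)
print([round(v, 3) for v in normalized])
```

Because the statistics are computed per sample across hidden units, the result does not depend on the batch size, which is the contrast with batch norm.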
From Machine Learning to Foundation Models
• To learn a certain task, the classic workflow in machine learning was:
  • Collect labeled data
  • Train the model on the training data
  • Generalize / infer on new test data

• Today, large-scale datasets have changed this classic workflow:
  • Collect a large dataset (labeled or unlabeled)
  • Learn a representation, a type of “prior” for learning
  • Use the model for downstream tasks
Types of transformers
• Encoder models – all tokens can “see the future” without masking (e.g., BERT by Devlin et al., 2019)
• Decoder models – can only observe the past to generate text
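The difference between the two types can be sketched as an attention mask, where entry [i][j] = 1 means token i may attend to token j. The function names are illustrative assumptions.

```python
def encoder_mask(seq_len):
    """Encoder-style: every token may attend to every token, including later ones."""
    return [[1] * seq_len for _ in range(seq_len)]


def causal_mask(seq_len):
    """Decoder-style: token i may attend only to positions j <= i (the past)."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]


print(encoder_mask(3))  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(causal_mask(3))   # [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
```

In an attention layer, positions with a 0 are typically set to a very negative score before the softmax so that they receive essentially zero weight.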
Tabular data - example
• Tables - a vector of discrete/continuous features (with/without labels)
Networks – example
LIME - Local Interpretable Model-agnostic Explanations
• LIME can be applied to any model.
• Which variable caused the prediction?
• Provides a local interpretation / explanation, i.e., perturb the input samples and use a simple model to understand how the predictions change

Understanding model predictions with LIME | by Lars Hulstaert | Towards Data Science
Example - classification of a tree frog
• Step 1:
Divide the original image into interpretable components, “superpixels”: groups of pixels that look similar (image segmentation)
• Step 2:
Generate a data set of perturbed instances by turning some of the superpixels “off” (gray mask)
• Step 3:
Get the model’s prediction, here the probability of it being a tree frog, for each perturbed instance
• Step 4:
Learn a simple model on this data set and present the superpixels with the highest positive weights as an explanation, graying out everything else.
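Steps 2 to 4 can be sketched on a toy problem. Everything here is an assumption for illustration: `black_box` is a hypothetical stand-in for the image classifier, there are only 4 superpixels, all 16 on/off masks are enumerated instead of sampled, and the surrogate is an unweighted least-squares fit (LIME itself also weights perturbed samples by their proximity to the original instance).

```python
import itertools


def black_box(mask):
    """Hypothetical classifier: P(tree frog) given which superpixels are on."""
    return 0.05 + 0.6 * mask[0] + 0.3 * mask[2]


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_linear_surrogate(masks, predictions):
    """Least-squares fit of prediction ~ intercept + sum_i weight_i * mask_i."""
    X = [[1.0] + [float(m) for m in mask] for mask in masks]
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y for row, y in zip(X, predictions)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


n_superpixels = 4
masks = list(itertools.product([0, 1], repeat=n_superpixels))  # Step 2: perturbed instances
predictions = [black_box(m) for m in masks]                    # Step 3: model predictions
coefficients = fit_linear_surrogate(masks, predictions)        # Step 4: simple model
weights = coefficients[1:]
ranked = sorted(range(n_superpixels), key=lambda i: weights[i], reverse=True)
print(ranked[:2])  # [0, 2]
```

The two superpixels with the highest positive weights (0 and 2) are the ones the toy classifier actually depends on; the explanation would keep them visible and gray out the rest.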
[Figure labels: pool table, balloon]
LIME - Local Interpretable Model-agnostic Explanations
• LIME can be applied to any model.
• It answers which data points (superpixels) caused the prediction.
• Provides a local interpretation / explanation, i.e., perturb the input samples and use a simple model to understand how the predictions change

• Cons:
  • Explains only simple linear relations
  • Often simple perturbations are not enough!
SHAP values - SHapley Additive exPlanations
• Based on Shapley values (Game Theory), where:
  • The game = reproducing a single prediction/outcome of the model
  • The players = features included in the model
• SHAP values quantify the contribution of each player to a single game.

• Requires training many models: 2^F models for F features (e.g., with 50 features, 2^50 = 1,125,899,906,842,624 models).
• Solution: approximate and sample per feature (implementation)

https://towardsdatascience.com/shap-explained-the-way-i-wish-someone-explained-it-to-me-ab81cc69ef30
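For a handful of features the exact Shapley values can be computed by brute force, which also shows why the cost explodes. This is a minimal sketch of the classic Shapley formula, not the SHAP library: `toy_model_value` is a hypothetical value function standing in for "the model's prediction when only these features are known", and the feature names are made up.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: weighted average marginal contribution over all subsets."""
    n = len(players)
    phi = {}
    for player in players:
        others = [q for q in players if q != player]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_player = value(set(subset) | {player})
                without_player = value(set(subset))
                total += weight * (with_player - without_player)
        phi[player] = total
    return phi


def toy_model_value(features):
    """Hypothetical prediction when only the given features are known."""
    v = 0.0
    if "age" in features:
        v += 10
    if "income" in features:
        v += 20
    if "age" in features and "income" in features:
        v += 6  # interaction effect, split equally between the two players
    return v


phi = shapley_values(["age", "income", "city"], toy_model_value)
print({name: round(val, 6) for name, val in phi.items()})
print(2 ** 50)  # 1125899906842624 feature subsets for 50 features
```

The contributions add up to the difference between the full prediction and the empty-set baseline (36 here), which is the "additive" part of the name.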
