Bayesian Networks (Part I) : 10-601 Introduction To Machine Learning
Bayesian Networks
(Part I)
Matt Gormley
Lecture 22
April 10, 2017

Graphical Model Readings:
Murphy 10 – 10.2.1
Bishop 8.1, 8.2.2
HTF --
Mitchell 6.11
1
Reminders
• Peer Tutoring
• Homework 7: Deep Learning
– Release: Wed, Apr. 05
– Part I due Wed, Apr. 12 (Start Early)
– Part II due Mon, Apr. 17
2
CONVOLUTIONAL NEURAL NETS
3
Deep Learning Outline
• Background: Computer Vision
– Image Classification
– ILSVRC 2010–2016
– Traditional Feature Extraction Methods
– Convolution as Feature Extraction
• Convolutional Neural Networks (CNNs)
– Learning Feature Abstractions
– Common CNN Layers:
• Convolutional Layer
• Max-Pooling Layer
• Fully-connected Layer (w/ tensor input)
• Softmax Layer
• ReLU Layer
– Background: Subgradient
– Architecture: LeNet
– Architecture: AlexNet
• Training a CNN
– SGD for CNNs
– Backpropagation for CNNs
4
Convolutional Neural Network (CNN)
• Typical layers include:
– Convolutional layer
– Max-pooling layer
– Fully-connected (Linear) layer
– ReLU layer (or some other nonlinear activation function)
– Softmax
• These can be arranged into arbitrarily deep topologies
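For concreteness, here is a minimal sketch (not from the lecture, and assuming PyTorch, which the slides do not name) of how these layer types can be stacked; the channel counts, kernel sizes, and the 10-class output are illustrative choices.

```python
import torch
import torch.nn as nn

# Illustrative LeNet-style stack over 1x28x28 inputs; all sizes are arbitrary choices.
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5),  # convolutional layer
    nn.ReLU(),                                                # nonlinear activation
    nn.MaxPool2d(kernel_size=2),                              # max-pooling layer
    nn.Conv2d(6, 16, kernel_size=5),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),
    nn.Flatten(),                                             # tensor input -> vector
    nn.Linear(16 * 4 * 4, 10),                                # fully-connected layer
)

x = torch.randn(8, 1, 28, 28)           # a batch of 8 grayscale 28x28 images
scores = model(x)                       # unnormalized class scores, shape (8, 10)
probs = torch.softmax(scores, dim=1)    # softmax layer over the 10 classes
```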
5
Convolutional Layer
CNN key idea:
Treat convolution matrix as
parameters and learn them!
Input Image (7x7):
0 0 0 0 0 0 0
0 1 1 1 1 1 0
0 1 0 0 1 0 0
0 1 0 1 0 0 0
0 1 1 0 0 0 0
0 1 0 0 0 0 0
0 0 0 0 0 0 0

Learned Convolution (3x3):
θ11 θ12 θ13
θ21 θ22 θ23
θ31 θ32 θ33

Convolved Image (5x5):
.4 .5 .5 .5 .4
.4 .2 .3 .6 .3
.5 .4 .4 .2 .1
.5 .6 .2 .1 0
.4 .3 .1 0 0
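A minimal NumPy sketch of this operation (an illustration, not the lecture's code); the θ values below are placeholders rather than learned weights, so the output values will not match the convolved image above, only its 5x5 shape.

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2D convolution as used in CNNs (i.e. cross-correlation):
    slide the kernel over the image and take a weighted sum at each position."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

image = np.array([[0,0,0,0,0,0,0],
                  [0,1,1,1,1,1,0],
                  [0,1,0,0,1,0,0],
                  [0,1,0,1,0,0,0],
                  [0,1,1,0,0,0,0],
                  [0,1,0,0,0,0,0],
                  [0,0,0,0,0,0,0]], dtype=float)

theta = np.full((3, 3), 1.0 / 9.0)   # placeholder weights; in a CNN these are learned
print(conv2d(image, theta).shape)    # (5, 5), matching the convolved image size above
```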
6
Downsampling by Averaging
• Downsampling by averaging used to be a common approach
• This is a special case of convolution where the weights are fixed to a
uniform distribution
• The example below uses a stride of 2
Input Image (6x6):
1 1 1 1 1 0
1 0 0 1 0 0
1 0 1 0 0 0
1 1 0 0 0 0
1 0 0 0 0 0
0 0 0 0 0 0

Convolution (fixed uniform 2x2 weights):
1/4 1/4
1/4 1/4

Convolved Image (3x3):
3/4 3/4 1/4
3/4 1/4 0
1/4 0 0
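A small NumPy sketch of the same idea, assuming a 2x2 window with stride 2 as in the example; on the 6x6 input above it reproduces the 3x3 convolved image shown.

```python
import numpy as np

def downsample_by_averaging(image, size=2, stride=2):
    """Average pooling: a convolution whose weights are fixed to a uniform
    distribution (1/4 per cell of the 2x2 window here), applied with stride 2."""
    out = np.zeros((image.shape[0] // stride, image.shape[1] // stride))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = image[i*stride:i*stride + size, j*stride:j*stride + size].mean()
    return out

image = np.array([[1,1,1,1,1,0],
                  [1,0,0,1,0,0],
                  [1,0,1,0,0,0],
                  [1,1,0,0,0,0],
                  [1,0,0,0,0,0],
                  [0,0,0,0,0,0]], dtype=float)

print(downsample_by_averaging(image))
# [[0.75 0.75 0.25]
#  [0.75 0.25 0.  ]
#  [0.25 0.   0.  ]]
```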
7
Max-Pooling
• Max-pooling is another (common) form of downsampling
• Instead of averaging, we take the max value within the same range as the equivalently-sized convolution
• The example below uses a stride of 2
Input Image (6x6):
1 1 1 1 1 0
1 0 0 1 0 0
1 0 1 0 0 0
1 1 0 0 0 0
1 0 0 0 0 0
0 0 0 0 0 0

Max-pooling (take the max over each 2x2 window):
xi,j    xi,j+1
xi+1,j  xi+1,j+1

Max-Pooled Image (3x3):
1 1 1
1 1 0
1 0 0
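The same sketch with max in place of the mean (again assuming a 2x2 window and stride 2) reproduces the max-pooled image above.

```python
import numpy as np

def max_pool(image, size=2, stride=2):
    """Max-pooling: same windows as the averaging example, but keep the
    maximum value in each window instead of the mean."""
    out = np.zeros((image.shape[0] // stride, image.shape[1] // stride))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = image[i*stride:i*stride + size, j*stride:j*stride + size].max()
    return out

image = np.array([[1,1,1,1,1,0],
                  [1,0,0,1,0,0],
                  [1,0,1,0,0,0],
                  [1,1,0,0,0,0],
                  [1,0,0,0,0,0],
                  [0,0,0,0,0,0]], dtype=float)

print(max_pool(image))
# [[1. 1. 1.]
#  [1. 1. 0.]
#  [1. 0. 0.]]
```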
8
Multi-Class Output

[Figure: feed-forward network with an input layer, one hidden layer, and a multi-class output layer]
10
Multi-Class Output

(F) Loss: J = Σ_{k=1}^K y*_k log(y_k), where y* is the true (one-hot) label

Softmax Layer: y_k = exp(b_k) / Σ_{l=1}^K exp(b_l)

(D) Output (linear): b_k = Σ_{j=0}^D β_{kj} z_j, ∀k

(C) Hidden (nonlinear): z_j = σ(a_j), ∀j

(B) Hidden (linear): a_j = Σ_{i=0}^M α_{ji} x_i, ∀j

(A) Input: Given x_i, ∀i
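A minimal NumPy sketch of the forward computation (A)-(F) above, assuming a sigmoid for σ, one-hot true labels y*, and random placeholder weights α and β.

```python
import numpy as np

def forward(x, alpha, beta, y_star):
    """Forward pass for layers (A)-(F). x: length-M input; alpha: (J, M+1) hidden
    weights; beta: (K, J+1) output weights; y_star: one-hot true label.
    A constant 1 is prepended to handle the i=0 / j=0 bias terms."""
    a = alpha @ np.concatenate(([1.0], x))      # (B) hidden, linear
    z = 1.0 / (1.0 + np.exp(-a))                # (C) hidden, nonlinear (sigmoid)
    b = beta @ np.concatenate(([1.0], z))       # (D) output, linear
    y = np.exp(b) / np.sum(np.exp(b))           # softmax layer
    J = np.sum(y_star * np.log(y))              # (F) loss as written above;
    return y, J                                 # its negative is the cross-entropy

rng = np.random.default_rng(0)
x = rng.normal(size=4)                          # M = 4 input features
alpha = rng.normal(size=(3, 5))                 # 3 hidden units (plus bias column)
beta = rng.normal(size=(2, 4))                  # K = 2 output classes
y_star = np.array([1.0, 0.0])
y, J = forward(x, alpha, beta, y_star)
```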
11
Training a CNN
Whiteboard
– SGD for CNNs
– Backpropagation for CNNs
12
Common CNN Layers
Whiteboard
– ReLU Layer
– Background: Subgradient
– Fully-connected Layer (w/ tensor input)
– Softmax Layer
– Convolutional Layer
– Max-Pooling Layer
13
Convolutional Layer
14
Convolutional Layer
15
Max-Pooling Layer
16
Max-Pooling Layer
17
Convolutional Neural Network (CNN)
• Typical layers include:
– Convolutional layer
– Max-pooling layer
– Fully-connected (Linear) layer
– ReLU layer (or some other nonlinear activation function)
– Softmax
• These can be arranged into arbitrarily deep topologies
18
Architecture #2: AlexNet
CNN for Image Classification
(Krizhevsky, Sutskever & Hinton, 2012)
15.3% error on ImageNet LSVRC-2012 contest
Input image (pixels) → five convolutional layers (w/ max-pooling) → three fully connected layers → 1000-way softmax
19
[Figure 2 from Krizhevsky et al. (2012): illustration of the CNN architecture]
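As an aside (not part of the slides), torchvision ships a slightly modified AlexNet; printing it shows the same overall shape described above: a conv/max-pool feature stack followed by three fully connected layers with a 1000-way output.

```python
import torchvision

model = torchvision.models.alexnet()   # untrained weights by default
print(model.features)                  # convolutional + max-pooling layers
print(model.classifier)                # the three fully connected layers
```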
CNNs for Image Recognition
21
Mini-Batch SGD
22
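A minimal sketch of a generic mini-batch SGD loop (not the whiteboard derivation from lecture): shuffle the data, take a small batch, average its per-example gradients, and step.

```python
import numpy as np

def minibatch_sgd(grad_fn, theta, data, lr=0.01, batch_size=32, epochs=10):
    """Generic mini-batch SGD: shuffle each epoch, average the per-example
    gradients over each batch, and take a gradient step."""
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        order = rng.permutation(len(data))
        for start in range(0, len(data), batch_size):
            batch = [data[i] for i in order[start:start + batch_size]]
            g = np.mean([grad_fn(theta, ex) for ex in batch], axis=0)
            theta = theta - lr * g
    return theta

# Toy usage: minimize mean squared distance to some 1-D points (optimum = their mean).
data = [np.array([2.0]), np.array([4.0]), np.array([6.0])]
grad_fn = lambda theta, x: 2.0 * (theta - x)
print(minibatch_sgd(grad_fn, np.array([0.0]), data, lr=0.1, batch_size=2, epochs=100))
# approximately [4.]
```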
CNN VISUALIZATIONS
23
3D Visualization of CNN
http://scs.ryerson.ca/~aharley/vis/conv/
Convolution of a Color Image
• Color images consist of 3 floats per pixel for RGB (red, green, blue) color values
• Convolution must also be 3-dimensional

A closer look at spatial dimensions:
[Figure: convolving a 32x32x3 image with a 5x5x3 filter produces a 28x28x1 activation map]
25
Figure from Fei-Fei Li & Andrej Karpathy & Justin Johnson (CS231N)
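A minimal NumPy sketch of this 3-dimensional convolution: a single 5x5x3 filter applied to a 32x32x3 image yields a 28x28 activation map.

```python
import numpy as np

def conv_color(image, filt):
    """Convolve an HxWx3 color image with a kxkx3 filter: at each spatial
    position, take a weighted sum over the window and all 3 color channels,
    giving a single 2-D activation map."""
    H, W, _ = image.shape
    k = filt.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + k, j:j + k, :] * filt)
    return out

image = np.random.rand(32, 32, 3)        # 32x32x3 image (RGB floats)
filt = np.random.rand(5, 5, 3)           # 5x5x3 filter
print(conv_color(image, filt).shape)     # (28, 28) activation map
```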
Animation of 3D Convolution
http://cs231n.github.io/convolutional-‐networks/
26
Figure from Fei-‐Fei Li & Andrej Karpathy & Justin Johnson (CS231N)
MNIST Digit Recognition with CNNs
(in your browser)
https://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html
27
Figure from Andrej Karpathy
CNN Summary
CNNs
– Are used for all aspects of computer vision, and
have won numerous pattern recognition
competitions
– Able to learn interpretable features at different levels of abstraction
– Typically consist of convolutional layers, pooling layers, nonlinearities, and fully connected layers
Other Resources:
– Readings on course website
– Andrej Karpathy, CS231n Notes
http://cs231n.github.io/convolutional-‐networks/
28
BAYESIAN NETWORKS
29
Bayes Nets Outline
• Motivation
– Structured Prediction
• Background
– Conditional Independence
– Chain Rule of Probability
• Directed Graphical Models
– Writing Joint Distributions
– Definition: Bayesian Network
– Qualitative Specification
– Quantitative Specification
– Familiar Models as Bayes Nets
• Conditional Independence in Bayes Nets
– Three case studies
– D-separation
– Markov blanket
• Learning
– Fully Observed Bayes Net
– (Partially Observed Bayes Net)
• Inference
– Sampling directly from the joint distribution
– Gibbs Sampling
31
MOTIVATION: STRUCTURED PREDICTION
32
Structured Prediction
• Most of the models we’ve seen so far were
for classification
– Given observations: x = (x1, x2, …, xK)
– Predict a (binary) label: y
• Many real-‐world problems require
structured prediction
– Given observations: x = (x1, x2, …, xK)
– Predict a structure: y = (y1, y2, …, yJ)
• Some classification problems benefit from
latent structure
33
Structured Prediction Examples
• Examples of structured prediction
– Part-of-speech (POS) tagging
– Handwriting recognition
– Speech recognition
– Word alignment
– Congressional voting
• Examples of latent structure
– Object recognition
34
Dataset for Supervised Part-of-Speech (POS) Tagging

Data: D = {x^(n), y^(n)}_{n=1}^N
Sample 1:
  y(1): n v p d n
  x(1): time flies like an arrow

Sample 2:
  y(2): n n v d n
  x(2): time flies like an arrow

Sample 3:
  y(3): n v p n n
  x(3): flies fly with their wings

Sample 4:
  y(4): p n n v v
  x(4): with time you will see
35
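One plausible in-memory representation of this dataset (an illustration, not the course's data format): parallel word and tag sequences of equal length.

```python
# Each example pairs a word sequence x with a tag sequence y of the same length.
D = [
    (["time", "flies", "like", "an", "arrow"],   ["n", "v", "p", "d", "n"]),
    (["time", "flies", "like", "an", "arrow"],   ["n", "n", "v", "d", "n"]),
    (["flies", "fly", "with", "their", "wings"], ["n", "v", "p", "n", "n"]),
    (["with", "time", "you", "will", "see"],     ["p", "n", "n", "v", "v"]),
]
for x, y in D:
    assert len(x) == len(y)   # structured output: one tag per input token
```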
Dataset for Supervised Handwriting Recognition

Data: D = {x^(n), y^(n)}_{n=1}^N
Sample 1:
  y(1): u n e x p e c t e d
  x(1): [image of the handwritten word]

Sample 2:
  x(2): [image of the handwritten word]

Sample 3:
  y(3): e m b r a c e s
  x(3): [image of the handwritten word]
Sample 1:
  y(1): h# dh ih s w uh z iy z iy
  x(1): [spectrogram of the utterance]

[Figure: extrinsic (top) and intrinsic (bottom) spectral representations for the utterance "This was easy for us." (TIMIT)]
37
Figures from (Jansen & Niyogi, 2013)
Application:
Word Alignment / Phrase Extraction
• Variables (boolean):
– For each (Chinese phrase,
English phrase) pair,
are they linked?
• Interactions:
– Word fertilities
– Few “jumps” (discontinuities)
– Syntactic reorderings
– "ITG constraint" on alignment
– Phrases are disjoint (?)
40
Case Study: Object Recognition
Data consists of images x and labels y.
[Figure: example images x(1), x(2), x(3), x(4) with a label such as "leopard"]

• Define graphical model with these latent variables in mind
• z is not observed at train or test time
42
Case Study: Object Recognition
Data consists of images x and labels y.
• Preprocess data into "patches"
• Posit a latent labeling z describing the object's parts (e.g. head, leg, tail, torso, grass)
• Define graphical model with these latent variables in mind

[Figure: patches X2, X7 with latent part labels Z2, Z7]
43
Case Study: Object Recognition
Data consists of images x and labels y.
• Preprocess data into "patches"
• Posit a latent labeling z describing the object's parts (e.g. head, leg, tail, torso, grass)
• Define graphical model with these latent variables in mind

[Figure: patches X2, X7 with latent part labels Z2, Z7, connected by factors ψ2, ψ3, ψ4]
44
Structured Prediction
Preview of challenges to come…
• Consider the task of finding the most probable
assignment to the output
45
Machine Learning

• The data inspires the structures we want to predict (Domain Knowledge)
• Our model defines a score for each structure, and it also tells us what to optimize (Mathematical Modeling)
• Inference finds the best structure (Combinatorial Optimization)

[Figure: ML at the overlap of Domain Knowledge, Mathematical Modeling, and Combinatorial Optimization, with example structures for "Alice saw Bob on a hill with a telescope" and "time flies like an arrow"]

Machine Learning

• Data (e.g. sentences such as "time flies like an arrow")
• Model (defined over variables X1, ..., X5)
• Objective
• Inference
• Learning (inference is usually called as a subroutine in learning)
47
BACKGROUND
48
Background
Whiteboard
– Chain Rule of Probability
– Conditional Independence
49
Background: Chain Rule
of Probability
50
Background: Conditional Independence

Random variables A and B are conditionally independent given C if:

  P(A, B | C) = P(A | C) P(B | C)    (1)

or equivalently:

  P(A | B, C) = P(A | C)    (2)

We write this as:

  A ⊥ B | C    (3)

Later we will also see other notation for conditional independence.
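A small numeric sketch of this definition: build a toy joint distribution that factorizes through C (all numbers made up) and check that P(A, B | C) = P(A | C) P(B | C) for every assignment.

```python
import itertools

# Toy joint P(A, B, C) over binary variables, constructed to factorize through C,
# so A and B are conditionally independent given C.
P_C = {0: 0.5, 1: 0.5}
P_A_given_C = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}   # P_A_given_C[c][a]
P_B_given_C = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.4, 1: 0.6}}   # P_B_given_C[c][b]
P = {(a, b, c): P_C[c] * P_A_given_C[c][a] * P_B_given_C[c][b]
     for a, b, c in itertools.product([0, 1], repeat=3)}

def marginal(keep):
    """Sum out every variable not named in `keep` (a subset of 'abc')."""
    out = {}
    for (a, b, c), p in P.items():
        key = tuple(v for v, name in zip((a, b, c), "abc") if name in keep)
        out[key] = out.get(key, 0.0) + p
    return out

P_AC, P_BC, P_Cm = marginal("ac"), marginal("bc"), marginal("c")
for a, b, c in itertools.product([0, 1], repeat=3):
    lhs = P[(a, b, c)] / P_Cm[(c,)]                              # P(A, B | C)
    rhs = (P_AC[(a, c)] / P_Cm[(c,)]) * (P_BC[(b, c)] / P_Cm[(c,)])
    assert abs(lhs - rhs) < 1e-12                                # A ⊥ B | C holds
```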
52
Example: Tornado Alarms
1. Imagine that
you work at the
911 call center
in Dallas
2. You receive six
calls informing
you that the
Emergency
Weather Sirens
are going off
3. What do you
conclude?
53
Figure from https://www.nytimes.com/2017/04/08/us/dallas-emergency-sirens-hacking.html
Directed Graphical Models
(Bayes Nets)
Whiteboard
– Example: Tornado Alarms
– Writing Joint Distributions
• Idea #1: Giant Table
• Idea #2: Rewrite using chain rule
• Idea #3: Assume full independence
• Idea #4: Drop variables from RHS of conditionals
– Definition: Bayesian Network
– Observed Variables in Graphical Models
55
Bayesian Network
[Graph: X1 → X2, X2 → X4, X3 → X4, X3 → X5]

p(X1, X2, X3, X4, X5) = p(X5|X3) p(X4|X2, X3) p(X3) p(X2|X1) p(X1)
56
Bayesian Network
Definition:

  P(X1, ..., Xn) = ∏_{i=1}^n P(Xi | parents(Xi))

[Graph: X1 → X2, X2 → X4, X3 → X4, X3 → X5]
57
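A minimal sketch (not from the lecture) of this factorization applied to the example graph above; the CPT numbers are made up, and the joint probability is just the product of the local conditionals.

```python
def joint(x, cpts, parents):
    """P(X1,...,Xn) = product over i of P(Xi | parents(Xi)).
    x: dict of variable values; cpts[name](value, parent_values) returns the
    local conditional probability; parents[name] lists each variable's parents."""
    p = 1.0
    for name, value in x.items():
        parent_vals = tuple(x[pa] for pa in parents[name])
        p *= cpts[name](value, parent_vals)
    return p

# Structure from the example graph: X1 -> X2, X2 -> X4, X3 -> X4, X3 -> X5.
parents = {"X1": [], "X2": ["X1"], "X3": [], "X4": ["X2", "X3"], "X5": ["X3"]}

# Made-up CPTs over binary variables.
cpts = {
    "X1": lambda v, pa: 0.6 if v == 1 else 0.4,
    "X2": lambda v, pa: (0.7 if v == 1 else 0.3) if pa[0] == 1 else (0.2 if v == 1 else 0.8),
    "X3": lambda v, pa: 0.5,
    "X4": lambda v, pa: 0.9 if v == (pa[0] or pa[1]) else 0.1,
    "X5": lambda v, pa: 0.8 if v == pa[0] else 0.2,
}

print(joint({"X1": 1, "X2": 1, "X3": 0, "X4": 1, "X5": 0}, cpts, parents))
# = P(X1=1) P(X2=1|X1=1) P(X3=0) P(X4=1|X2=1,X3=0) P(X5=0|X3=0)
# = 0.6 * 0.7 * 0.5 * 0.9 * 0.8 = 0.1512
```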