CS 601 Machine Learning Unit 5

LNCT GROUP OF COLLEGES

Unit: 5
Topic: Support Vector Machines, Bayesian learning, applications of machine learning in computer vision, speech processing, natural language processing, etc. Case Study: the ImageNet Competition

What is a Support Vector Machine?


The objective of the support vector machine (SVM) algorithm is to find a hyperplane in an N-dimensional space (N being the number of features) that distinctly classifies the data points.

Possible hyperplanes

To separate the two classes of data points, there are many possible hyperplanes that could be chosen. Our objective is to find the plane that has the maximum margin, i.e. the maximum distance between data points of both classes. Maximizing the margin provides some reinforcement so that future data points can be classified with more confidence.

Hyperplanes and Support Vectors


[Figure: hyperplanes in 2-D and 3-D feature space]

Hyperplanes are decision boundaries that help classify the data points. Data points falling on either side of the hyperplane can be attributed to different classes. The dimension of the hyperplane depends on the number of features: if the number of input features is 2, the hyperplane is just a line; if the number of input features is 3, the hyperplane becomes a two-dimensional plane. It becomes difficult to visualize when the number of features exceeds 3.

Support vectors are the data points closest to the hyperplane; they influence the position and orientation of the hyperplane. Using these support vectors, we maximize the margin of the classifier. Deleting a support vector will change the position of the hyperplane. These are the points that help us build our SVM.
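To make this concrete, here is a minimal sketch using scikit-learn's SVC on a made-up two-dimensional dataset; the toy points are illustrative assumptions, not from the text:

import numpy as np
from sklearn.svm import SVC

# Two linearly separable classes in 2-D (toy data, chosen for illustration).
X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)  # large C approximates a hard margin
clf.fit(X, y)

print("support vectors:\n", clf.support_vectors_)  # the points defining the margin
print("hyperplane: w =", clf.coef_[0], ", b =", clf.intercept_[0])
# The decision boundary is w . x + b = 0; the margin width is 2 / ||w||.
print("margin width:", 2 / np.linalg.norm(clf.coef_[0]))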

Reference: https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47

Bayesian Learning

Bayesian machine learning is a particular set of approaches to probabilistic machine learning (for other probabilistic models, see Supervised Learning).

Bayesian learning treats model parameters as random variables; parameter estimation amounts to computing posterior distributions for these random variables based on the observed data.

Bayesian learning typically involves generative models; one notable exception is Bayesian linear regression, which is a discriminative model.

Bayesian models

Bayesian modeling treats the two problems of learning (estimating parameters) and inference (computing distributions over unknowns) as one.

We first have a prior distribution over our parameters (i.e. what are the likely parameters?), $P(\theta)$.

From this we compute a posterior distribution, which combines both inference and learning:

$$P(y_1,\dots,y_n,\theta \mid x_1,\dots,x_n) = \frac{P(x_1,\dots,x_n,y_1,\dots,y_n \mid \theta)\,P(\theta)}{P(x_1,\dots,x_n)}$$

Then prediction is to compute the conditional distribution of the new data point given our observed data, which is the marginal over the latent variables and the parameters:

$$P(x_{n+1} \mid x_1,\dots,x_n) = \int P(x_{n+1} \mid \theta)\,P(\theta \mid x_1,\dots,x_n)\,d\theta$$

Classification then is to compute the distribution of the new data point given the data from each class, then find the class which maximizes it:

$$P(x_{n+1} \mid x_1^c,\dots,x_n^c) = \int P(x_{n+1} \mid \theta_c)\,P(\theta_c \mid x_1^c,\dots,x_n^c)\,d\theta_c$$
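As a concrete instance of the prediction integral above, here is a toy sketch for a Bernoulli model with a Beta prior, where the posterior and the integral have closed forms (the prior values and the data are illustrative assumptions):

# Bayesian prediction for coin flips x_i in {0, 1} under a Beta prior.
a, b = 2.0, 2.0            # Beta(a, b) prior over theta = P(x = 1)
x = [1, 0, 1, 1, 0, 1]     # observed data x_1, ..., x_n

heads, tails = sum(x), len(x) - sum(x)
# By conjugacy the posterior is Beta(a + heads, b + tails), and the
# predictive integral P(x_{n+1} = 1 | x_1..x_n) equals the posterior mean:
p_next = (a + heads) / (a + b + len(x))
print("P(x_{n+1} = 1 | data) =", p_next)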

Hidden Markov Models

HMMs can be thought of as clustering over time; that is, each state is a "cluster".

The data points and latent variables are sequences; $\pi_k$ becomes the transition probability given the state (cluster) $k$, and $\theta^*_k$ becomes the emission distribution for $x$ given state $k$ (using the notation of the model-based clustering section below).
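A minimal sketch of this "clustering over time" view: sample a state sequence from a transition matrix, then emit a Gaussian observation per state. The transition matrix and emission parameters are made up for illustration:

import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1],      # A[k] = transition probabilities out of state k
              [0.2, 0.8]])
means, stds = [0.0, 5.0], [1.0, 1.0]   # emission parameters theta*_k per state

z = [0]                                # initial hidden state
for _ in range(9):                     # z_{t+1} | z_t ~ Discrete(A[z_t])
    z.append(rng.choice(2, p=A[z[-1]]))
x = [rng.normal(means[k], stds[k]) for k in z]   # x_t | z_t ~ N(mu_k, sigma_k)

print("states:   ", z)
print("emissions:", np.round(x, 2))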

Model-based clustering

• model data from heterogeneous unknown sources
• $K$ unknown sources (clusters)
• each cluster/source is modelled using a parametric model (e.g. a Gaussian distribution)

For a given data point $i$, we have:

$$z_i \mid \pi \sim \mathrm{Discrete}(\pi)$$

where $z_i$ is the cluster label indicating which cluster data point $i$ belongs to. This is the latent variable we want to discover.

$\pi$ is the vector of mixing proportions, i.e. the probabilities of each class $k$:

$$\pi = (\pi_1,\dots,\pi_K) \mid \alpha \sim \mathrm{Dirichlet}\!\left(\tfrac{\alpha}{K},\dots,\tfrac{\alpha}{K}\right)$$

That is, $\pi_k = P(z_i = k)$.

We also model each data point $x_i$ as being drawn from a source (cluster), where $F$ is however we are modeling the cluster (e.g. a Gaussian), parameterized by $\theta^*_{z_i}$, the parameters of the cluster labeled $z_i$:

$$x_i \mid z_i, \theta^*_k \sim F(\theta^*_{z_i})$$

(Note that the star, as in $\theta^*$, is used to denote the optimal solution for $\theta$.)

For this approach we have two priors over the parameters of the model:

• For the mixing proportions, we typically use a Dirichlet prior (above), because it has the nice property of being a conjugate prior to the multinomial distribution.
• For each cluster $k$ we use some prior $H$, that is $\theta^*_k \mid H \sim H$.

[Figure: model-based clustering plate model]
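The generative process above can be sketched in a few lines of Python; the hyperparameter values and the Gaussian choices for $F$ and $H$ are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(1)
K, n, alpha = 3, 8, 1.0
pi = rng.dirichlet([alpha / K] * K)      # pi | alpha ~ Dirichlet(alpha/K, ..., alpha/K)
mu = rng.normal(0, 10, size=K)           # theta*_k ~ H (here H is a broad Gaussian)

z = rng.choice(K, size=n, p=pi)          # z_i | pi ~ Discrete(pi)
x = rng.normal(mu[z], 1.0)               # x_i | z_i ~ F(theta*_{z_i}), a Gaussian
print("pi =", np.round(pi, 2), "\nz =", z, "\nx =", np.round(x, 2))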



Naive Bayes

The main assumption of Naive Bayes is that all features are independent effects of the label. This is a very strong simplifying assumption, but nevertheless Naive Bayes performs well in many cases.

Naive Bayes is also statistically efficient, meaning it doesn't need a lot of data to learn what it needs to learn.

If we were to draw it out as a Bayes' net:

$$Y \to F_1, \quad Y \to F_2, \quad \dots, \quad Y \to F_n$$

where $Y$ is the label and $F_1, F_2, \dots, F_n$ are the features.

The model is simply:

$$P(Y \mid F_1,\dots,F_n) \propto P(Y) \prod_i P(F_i \mid Y)$$

This just comes from the Bayes' net described above.

Naive Bayes learns $P(Y, f_1, f_2, \dots, f_n)$, which we can normalize (divide by $P(f_1,\dots,f_n)$) to get the conditional probability $P(Y \mid f_1,\dots,f_n)$:

$$P(Y, f_1,\dots,f_n) = \begin{bmatrix} P(y_1, f_1,\dots,f_n) \\ P(y_2, f_1,\dots,f_n) \\ \vdots \\ P(y_k, f_1,\dots,f_n) \end{bmatrix} = \begin{bmatrix} P(y_1)\prod_i P(f_i \mid y_1) \\ P(y_2)\prod_i P(f_i \mid y_2) \\ \vdots \\ P(y_k)\prod_i P(f_i \mid y_k) \end{bmatrix}$$

So the parameters of Naive Bayes are $P(Y)$ and $P(F_i \mid Y)$ for each feature.
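A minimal sketch of estimating these parameters by counting on a toy binary-feature dataset, then scoring classes with the product rule above; the data and the Laplace smoothing constant are illustrative assumptions:

import numpy as np

# Toy data: rows of F are binary feature vectors, y holds the labels.
F = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 0]])
y = np.array([1, 1, 0, 0])

def predict(f_new, F, y, smooth=1.0):
    scores = {}
    for c in np.unique(y):
        Fc = F[y == c]
        prior = len(Fc) / len(F)                       # P(Y = c)
        # P(F_i = 1 | Y = c), with Laplace smoothing to avoid zero counts
        p = (Fc.sum(axis=0) + smooth) / (len(Fc) + 2 * smooth)
        lik = np.prod(np.where(f_new == 1, p, 1 - p))  # prod_i P(f_i | c)
        scores[c] = prior * lik                        # unnormalized posterior
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}   # normalize over classes

print(predict(np.array([1, 0, 1]), F, y))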

Inference in Bayesian models

Maximum a posteriori (MAP) estimation

As a Bayesian alternative to MLE, we can estimate probabilities using maximum a posteriori estimation, where we instead choose the probability (a point estimate) that is most likely given the observed data:

$$\tilde{\pi}_{\mathrm{MAP}} = \operatorname*{argmax}_\pi P(\pi \mid X) = \operatorname*{argmax}_\pi \frac{P(X \mid \pi)P(\pi)}{P(X)} = \operatorname*{argmax}_\pi P(X \mid \pi)P(\pi)$$

$$P(y \mid X) \approx P(y \mid \tilde{\pi}_{\mathrm{MAP}})$$
So unlike MLE, MAP estimation uses Bayes' rule, so the estimate can use prior knowledge ($P(\pi)$) about what we expect $\pi$ to be.

Again, this may be done with log-likelihoods:

$$\theta_{\mathrm{MAP}} = \operatorname*{argmax}_\theta p(\theta \mid x) = \operatorname*{argmax}_\theta \left[\log p(x \mid \theta) + \log p(\theta)\right]$$

Maximum Likelihood and Bayesian Parameter Estimation

The likelihood function $L(\theta)$ is the probability of the data $D$ as a function of the parameters $\theta$.

This often takes very small values, so typically we work with the log-likelihood function instead:

$$\ell(\theta) = \log L(\theta)$$

The maximum likelihood criterion simply involves choosing the parameter $\theta$ to maximize $\ell(\theta)$. This can (sometimes) be done analytically by computing the derivative, setting it to zero, and solving; this yields the maximum likelihood estimate.

MLE's weakness is that it can overfit if you have only a little training data. This problem is known as data sparsity. For example, suppose you flip a coin twice and it happens to land on heads both times. Your maximum likelihood estimate for $\theta$ (the probability that the coin lands on heads) would be 1! If we then generalize this estimate to another dataset and test it by measuring the log-likelihood on the test set, a single tails in the test set gives a test log-likelihood of $-\infty$.

We can instead use Bayesian techniques for parameter estimation. In Bayesian parameter estimation, we treat the parameters $\theta$ as random variables as well, so we learn a joint distribution $p(\theta, D)$.

We first require a prior distribution $p(\theta)$ and the likelihood $p(D \mid \theta)$ (as with maximum likelihood).

We want to compute $p(\theta \mid D)$, which is accomplished using Bayes' rule:

$$p(\theta \mid D) = \frac{p(\theta)\,p(D \mid \theta)}{\int p(\theta')\,p(D \mid \theta')\,d\theta'}$$

Though we work with only the numerator for as long as possible (i.e. we delay normalization until it's necessary):

$$p(\theta \mid D) \propto p(\theta)\,p(D \mid \theta)$$

The more data we observe, the less uncertainty there is around the parameter, and the
likelihood term comes to dominate the prior - we say that the data overwhelm the prior.

We also have the posterior predictive distribution $p(D' \mid D)$, which is the distribution over future observables given past observations. This is computed by computing the posterior over $\theta$ and then marginalizing out $\theta$:

$$p(D' \mid D) = \int p(\theta \mid D)\,p(D' \mid \theta)\,d\theta$$

The normalization step is often the most difficult, since we must compute an integral over
potentially many, many parameters.

We can instead formulate Bayesian learning as an optimization problem, allowing us to avoid this integral. In particular, we can use the maximum a posteriori (MAP) approximation.

Whereas with the previous Bayesian approach (the "full Bayesian" approach) we learn a distribution over $\theta$, with the MAP approximation we simply get a point estimate (that is, a single value rather than a full distribution). In particular, we get the parameters that are most likely under the posterior:

$$\hat{\theta}_{\mathrm{MAP}} = \operatorname*{argmax}_\theta p(\theta \mid D) = \operatorname*{argmax}_\theta p(\theta, D) = \operatorname*{argmax}_\theta p(\theta)\,p(D \mid \theta) = \operatorname*{argmax}_\theta \left[\log p(\theta) + \log p(D \mid \theta)\right]$$
Reference: https://frnsys.com/ai_notes/machine_learning/bayesian_learning.html

Applications of Machine Learning in Computer Vision

It is not just the performance of machine learning/deep learning models on benchmark problems that is most interesting; it is the fact that a single model can learn meaning from images and perform vision tasks, obviating the need for a pipeline of specialized and hand-crafted methods.

We will look at the following computer vision problems where deep learning has been used:

1. Image Classification
2. Image Classification With Localization
3. Object Detection
4. Object Segmentation
5. Image Style Transfer
6. Image Colorization
7. Image Reconstruction
8. Image Super-Resolution
9. Image Synthesis
10. Other Problems

Image Classification
Image classification involves assigning a label to an entire image or photograph.

This problem is also referred to as "object classification" and perhaps more generally as "image recognition," although the latter may refer to a much broader set of tasks related to classifying the content of images.
Some examples of image classification include:

• Labeling an x-ray as cancer or not (binary classification).
• Classifying a handwritten digit (multiclass classification).
• Assigning a name to a photograph of a face (multiclass classification).

A popular example of image classification used as a benchmark problem is the MNIST dataset.
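A minimal sketch of the task, using scikit-learn's small digits dataset (8×8 grayscale images, 10 classes) as a stand-in for MNIST, and a simple multiclass baseline rather than a deep model:

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)            # each row: 64 pixel intensities
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=2000)        # a simple multiclass baseline
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te)) # one label per whole image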

Image Classification With Localization

Image classification with localization involves assigning a class label to an image and
showing the location of the object in the image by a bounding box (drawing a box around the
object).

This is a more challenging version of image classification.

Some examples of image classification with localization include:

• Labeling an x-ray as cancer or not and drawing a box around the cancerous region.
• Classifying photographs of animals and drawing a box around the animal in each scene.

A classical dataset for image classification with localization is the PASCAL Visual Object Classes dataset, or PASCAL VOC for short (e.g. VOC 2012). These datasets have been used in computer vision challenges over many years.

The task may involve adding bounding boxes around multiple examples of the same object in the image. As such, this task may sometimes be referred to as "object detection."

[Figure: image classification with localization of multiple chairs, from VOC 2012]

The ILSVRC2016 dataset for image classification with localization is a popular dataset comprising 150,000 photographs across 1,000 categories of objects.
Some examples of papers on image classification with localization include:

• Selective Search for Object Recognition, 2013.
• Rich feature hierarchies for accurate object detection and semantic segmentation, 2014.
• Fast R-CNN, 2015.
Object Detection

Object detection is the task of image classification with localization, although an image may
contain multiple objects that require localization and classification.

This is a more challenging task than simple image classification or image classification with
localization, as often there are multiple objects in the image of different types.

Often, techniques developed for image classification with localization are used and
demonstrated for object detection.

Some examples of object detection include:

• Drawing a bounding box and labeling each object in a street scene.
• Drawing a bounding box and labeling each object in an indoor photograph.
• Drawing a bounding box and labeling each object in a landscape.

The PASCAL Visual Object Classes dataset, or PASCAL VOC for short (e.g. VOC 2012), is a common dataset for object detection.
Another dataset for multiple computer vision tasks is Microsoft’s Common Objects in
Context Dataset, often referred to as MS COCO.

[Figure: object detection with Faster R-CNN on the MS COCO dataset]



Some examples of papers on object detection include:

• OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks, 2014.
• Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015.
• You Only Look Once: Unified, Real-Time Object Detection, 2015.

Object Segmentation

Object segmentation, or semantic segmentation, is the task of object detection where a line is drawn around each object detected in the image. Image segmentation is the more general problem of splitting an image into segments.
Object detection is also sometimes referred to as object segmentation. Unlike object detection
that involves using a bounding box to identify objects, object segmentation identifies the
specific pixels in the image that belong to the object. It is like a fine-grained localization.
More generally, “image segmentation” might refer to segmenting all pixels in an image into
different categories of object. Again, the VOC 2012 and MS COCO datasets can be used for
object segmentation.

Style Transfer

Style transfer or neural style transfer is the task of learning style from one or more images
and applying that style to a new image.

This task can be thought of as a type of photo filter or transform that may not have an
objective evaluation.

Examples include applying the style of specific famous artworks (e.g. by Pablo Picasso or
Vincent van Gogh) to new photographs.

Datasets often involve using famous artworks that are in the public domain and photographs
from standard computer vision datasets.

[Figure: neural style transfer from famous artworks to a photograph, taken from "A Neural Algorithm of Artistic Style"]

Image Colorization

Image colorization or neural colorization involves converting a grayscale image to a full-color image.

This task can be thought of as a type of photo filter or transform that may not have an
objective evaluation.

Examples include colorizing old black and white photographs and movies.

Datasets often involve using existing photo datasets and creating grayscale versions of photos
that models must learn to colorize.

Reference: https://machinelearningmastery.com/applications-of-deep-learning-for-computer-vision/

Speech Processing

As a sub-field of Artificial Intelligence (AI), machine learning is a method of data analysis that constructs analytical models automatically. It is a promising technology for providing optimal support to businesses in a variety of real-world applications, such as speech recognition and image recognition.

Machine learning uses iterative algorithms to learn from data, allowing the computer to find information and hidden patterns that are not explicitly programmed. This iterative aspect is important because, when these models are exposed to new data, they can adapt independently. Machine learning systems can quickly apply knowledge and training from large datasets to perform face recognition, speech recognition, and more.

Image Recognition

One of the most common uses of machine learning is image recognition. There are many situations where you want to classify an object appearing in a digital image. For digital images, the measurements describe the outputs of each pixel in the image.

In the case of a black and white image, the intensity of each pixel serves as one measurement. So if a black and white image has N×N pixels, the total number of pixels, and hence of measurements, is N².

In a colored image, each pixel is considered as providing 3 measurements, corresponding to the intensities of the 3 main color components (RGB). So for an N×N colored image there are 3N² measurements.
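A tiny sketch confirming these counts with NumPy arrays (N = 28 is an arbitrary illustrative choice):

import numpy as np

N = 28
gray = np.zeros((N, N))        # one intensity per pixel
rgb = np.zeros((N, N, 3))      # three intensities (R, G, B) per pixel
print(gray.size, rgb.size)     # 784 and 2352, i.e. N^2 and 3 * N^2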

• For face detection – The categories might be face versus no face present. There
might be a separate category for each person in a database of several individuals.
• For character recognition – We can segment a piece of writing into smaller
images, each containing a single character. The categories might consist of the 26
letters of the English alphabet, the 10 digits, and some special characters.

Image recognition systems based on machine learning are used by Google in products such as Google Photos, Google Search, and Google Drive to optimize image detection through users' keyword searches.
Speech Recognition

Speech recognition (SR) is the translation of spoken words into text. It is also known as "automatic speech recognition" (ASR), "computer speech recognition", or "speech to text" (STT).

In speech recognition, a software application recognizes spoken words. The measurements in this application might be a set of numbers that represent the speech signal. We can segment the signal into portions that contain distinct words or phonemes. In each segment, we can represent the speech signal by the intensities or energy in different time-frequency bands.

Although the details of signal representation are outside the scope of this program, we can represent the signal by a set of real values, as sketched below.
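A sketch of such a representation: computing the energy in time-frequency bands of a signal with a short-time Fourier transform (STFT). The synthetic sine wave stands in for a real speech segment:

import numpy as np
from scipy.signal import stft

fs = 8000                                   # sample rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 440 * t)             # a stand-in for a speech segment

f, times, Z = stft(x, fs=fs, nperseg=256)   # complex spectrogram
energy = np.abs(Z) ** 2                     # energy per (frequency band, frame)
print(energy.shape)                         # (num_bands, num_frames) feature grid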

Speech recognition applications include voice user interfaces such as voice dialing, call routing, and domotic (home) appliance control. It can also be used for simple data entry, preparation of structured documents, speech-to-text processing, and direct voice input in aircraft.

Using machine learning, Baidu's research and development department has created a tool called Deep Voice, a deep neural network capable of producing artificial voices that are difficult to distinguish from a real human voice. The network can "learn" features of rhythm, voice, pronunciation, and vocalization to recreate a speaker's voice. In addition, Google uses machine learning in other voice-related products and translation services such as Google Translate, Google Text-to-Speech, and Google Assistant.
Besides applications in audio recognition and image recognition, machine learning is also applied in areas such as medical analysis, sorting and classification, and data analysis and forecasting, in fields such as healthcare, financial services, transportation, and marketing and sales. In the near future, devices and applications based on machine learning technology may appear in all aspects of human life.

FPT.AI – New Generation Conversation Platform and Virtual Assistant

In order to catch up with modern technology trends, FPT has been using machine learning in most FPT applications and technology products, such as FPT.AI (a new-generation conversation platform and virtual assistant), people identification in FPT Shop, autonomous cars, and Human Machine Interface products (TTS, STT).

Reference: https://techinsight.com.vn/language/en/image-recognition-speech-recognition-machine-learning-applications-real-world/

Natural Language Processing

NLP is a field of machine learning concerned with the ability of a computer to understand, analyze, manipulate, and potentially generate human language.

NLP in Real Life

• Information Retrieval (Google finds relevant and similar results).
• Information Extraction (Gmail structures events from emails).
• Machine Translation (Google Translate translates text from one language to another).
• Text Simplification (Rewordify simplifies the meaning of sentences). Shashi Tharoor's tweets could be used (pun intended).
• Sentiment Analysis (Hater News gives us the sentiment of the user).
• Text Summarization (Smmry or Reddit's autotldr gives a summary of sentences).
• Spam Filter (Gmail filters spam emails separately); a toy sketch appears after this list.
• Auto-Predict (Google Search predicts user search results).
• Auto-Correct (Google Keyboard and Grammarly correct misspelled words).
• Speech Recognition (Google Web Speech or Vocalware).
• Question Answering (IBM Watson answers queries).
• Natural Language Generation (generation of text from image or video data).
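As a concrete example of one item above, the spam filter, here is a toy sketch with bag-of-words features and multinomial Naive Bayes in scikit-learn; the tiny corpus is made up for illustration:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["win money now", "cheap prizes click here",
         "meeting at noon", "project update attached"]
labels = [1, 1, 0, 0]                          # 1 = spam, 0 = ham

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)                       # learn P(Y) and P(word | Y)
print(model.predict(["click to win money"]))   # expected: [1], i.e. spam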

References:
1. Christopher M. Bishop, "Pattern Recognition and Machine Learning", Springer-Verlag New York Inc., 2nd Edition, 2011.
2. Tom M. Mitchell, "Machine Learning", McGraw Hill Education, First Edition, 2017.
3. Ian Goodfellow, Yoshua Bengio and Aaron Courville, "Deep Learning", MIT Press.

Reference: https://towardsdatascience.com/natural-language-processing-nlp-for-machine-learning-d44498845d5b
