Classification
Machine Learning
Supervised Learning:
 Classification: Predict a discrete value (label) associated with a feature vector.
 Regression: Predict a real number associated with a feature vector. E.g., use linear regression to fit a curve to data (a minimal sketch follows).
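A minimal sketch of the regression case, assuming numpy (the slide names no library; the data points and values are made up for illustration):

```python
# Fit a straight line to noisy points and predict a real value.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])      # roughly y = 2x + 1

slope, intercept = np.polyfit(x, y, deg=1)   # least-squares line fit
print(round(slope * 5 + intercept, 2))       # predicted value at x = 5
```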
Example:
Distance Matrix:
Using Distance Matrix for Classification:
 Simplest approach is probably nearest neighbors:
 remember the training data;
 when predicting the label of a new example,
 find the nearest example in the training data, and
 predict the label associated with that example (see the sketch below).
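A minimal nearest-neighbor sketch matching the bullets above; the function name and toy data are mine, not from the lecture:

```python
# Nearest neighbor: store the training data, and predict the label of
# the single closest training example (Euclidean distance).
import math

def nearest_neighbor(train, new_point):
    """train: list of (feature_vector, label) pairs."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    _, label = min(train, key=lambda ex: dist(ex[0], new_point))
    return label

train = [((1.0, 1.0), "+"), ((5.0, 5.0), "-"), ((6.0, 4.0), "-")]
print(nearest_neighbor(train, (1.5, 0.5)))   # '+': nearest is (1.0, 1.0)
```

K-nearest neighbors (next slides) generalizes this by taking a majority vote over the k closest training examples instead of just one.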
Distance Matrix:
Hand-Written Character Recognition:
K-nearest neighbors
Advantages and Disadvantages of KNN:
Advantages:
 Learning is fast: there is no explicit training phase.
 No theory required.
 Easy to explain the method and its results.
Disadvantages:
 Memory intensive, and predictions can take a long time.
 No model to shed light on the process that generated the data.
Naïve Bayes Text Classification:
Why?
 Learn which news articles are of interest.
 Learn to classify web pages by category.
Basic Intuition:
 Simple (naïve) classification method based on Bayes rule.
 Relies on a very simple representation of documents:
 the bag of words.
Bag of words representation:
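The representation itself is just a word-count mapping; a short sketch (the example sentence is taken from the movie-review slides later in this deck):

```python
# Bag of words: keep word counts, discard word order.
from collections import Counter

bag = Counter("I loved the movie".lower().split())
print(bag)   # Counter({'i': 1, 'loved': 1, 'the': 1, 'movie': 1})
```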
Naïve Bayes Text Classification:
Bayes Rule:
For a document d and class c
Goal of Classifier:
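Both equations on this slide are restated here in their standard form (an assumption; the slide's own equation images are not in this extract):

Bayes rule, for a document d and class c:
P(c|d) = P(d|c) · P(c) / P(d)

Goal of the classifier: pick the most probable class,
cMAP = argmax over c in C of P(c|d) = argmax over c in C of P(d|c) · P(c)
(P(d) is the same for every class, so it can be dropped.)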
Learn to Classify Text using Naïve Bayes:
Target concept Interesting?: Document → {+, -}
 Represent each document by a vector of words:
 one attribute per word position in the document.
 Learning: use the training examples to estimate
P(+), P(-), P(doc|+), P(doc|-).
Naïve Bayes conditional independence assumption (restated below), where P(ai = Wk|Vj) is the probability that the word in position i is Wk, given class Vj.
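In its standard form (restated here, since the slide's equation image is not in this extract), the assumption factors the document probability into a product over word positions:

P(doc|Vj) = ∏i P(ai = Wk|Vj), for i = 1 … length(doc)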
An example: Movie Review
Dictionary: 10 unique words:
<I, loved, the, movie, hated, a, great, good, poor, acting>
Steps:
 Convert the documents into feature sets, where the attributes are the possible words and the values are the number of times a word occurs in the given document:
Doc | I | loved | the | movie | hated | a | great | good | poor | acting | Class
----|---|-------|-----|-------|-------|---|-------|------|------|--------|------
 1  | 1 |   1   |  1  |   1   |       |   |       |      |      |        |   +
 2  | 1 |       |  1  |   1   |   1   |   |       |      |      |        |   -
 3  |   |       |     |   2   |       | 1 |   1   |  1   |      |        |   +
 4  |   |       |     |       |       |   |       |      |  1   |   1    |   -
 5  |   |       |     |   1   |       | 1 |   1   |  1   |      |   1    |   +
Let us look at the probabilities per outcome (+ or -).
Naïve Bayes…
 Documents with positive outcomes:
P(+) = 3/5 = 0.6
Compute: P(I|+), P(loved|+), P(the|+), P(movie|+), P(a|+), P(great|+), P(good|+), P(acting|+)
Let n be the total number of words in the (+) documents (here n = 14), and nk the number of times word k occurs in those documents.
With add-one (Laplace) smoothing: P(Wk|+) = (nk + 1)/(n + |Vocabulary|)
Doc | I | loved | the | movie | hated | a | great | good | poor | acting | Class
----|---|-------|-----|-------|-------|---|-------|------|------|--------|------
 1  | 1 |   1   |  1  |   1   |       |   |       |      |      |        |   +
 3  |   |       |     |   2   |       | 1 |   1   |  1   |      |        |   +
 5  |   |       |     |   1   |       | 1 |   1   |  1   |      |   1    |   +
Naïve Bayes…
P(I|+) = 0.0833        P(acting|+) = 0.0833
P(loved|+) = 0.0833    P(poor|+) = 0.0417
P(the|+) = 0.0833      P(hated|+) = 0.0417
P(movie|+) = 0.2083    P(great|+) = 0.1250
P(a|+) = 0.1250        P(good|+) = 0.1250
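As a check on where these values come from (n = 14 words in the positive documents, |Vocabulary| = 10): "movie" occurs 4 times in the positive documents, so P(movie|+) = (4 + 1)/(14 + 10) = 5/24 ≈ 0.2083; "hated" never occurs there, so P(hated|+) = (0 + 1)/24 ≈ 0.0417.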
 Now, the documents with the negative class: P(-) = 2/5 = 0.4

Doc | I | loved | the | movie | hated | a | great | good | poor | acting | Class
----|---|-------|-----|-------|-------|---|-------|------|------|--------|------
 2  | 1 |       |  1  |   1   |   1   |   |       |      |      |        |   -
 4  |   |       |     |       |       |   |       |      |  1   |   1    |   -
P(I|-) = 0.1250        P(acting|-) = 0.1250
P(loved|-) = 0.0625    P(poor|-) = 0.1250
P(the|-) = 0.1250      P(hated|-) = 0.1250
P(movie|-) = 0.1250    P(great|-) = 0.0625
P(a|-) = 0.0625        P(good|-) = 0.0625
Now, let's classify a new sentence w.r.t. our training samples.
Test document: "I hated the poor acting"
If Vj = +: P(+) · P(I|+) · P(hated|+) · P(the|+) · P(poor|+) · P(acting|+) = 6.03 × 10^(-7)
If Vj = -: P(-) · P(I|-) · P(hated|-) · P(the|-) · P(poor|-) · P(acting|-) = 1.22 × 10^(-5)
Since 1.22 × 10^(-5) > 6.03 × 10^(-7), the classifier assigns the test document to the negative (-) class.
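The whole example is small enough to verify in a few lines of Python. This is an illustrative sketch, not code from the lecture: the training sentences are reconstructed from the count table above (word order does not matter to the model), and all names are mine.

```python
# Naive Bayes with add-one (Laplace) smoothing, reproducing the
# movie-review example above.
from collections import Counter

# Training documents reconstructed from the count table (bags of words).
train = [("i loved the movie", "+"),
         ("i hated the movie", "-"),
         ("a great movie good movie", "+"),
         ("poor acting", "-"),
         ("great acting a good movie", "+")]

vocab = {w for doc, _ in train for w in doc.split()}   # 10 unique words

def score(test_doc, label):
    words = [w for doc, l in train if l == label for w in doc.split()]
    counts = Counter(words)                    # nk for each word k
    n = len(words)                             # 14 for '+', 6 for '-'
    prior = sum(1 for _, l in train if l == label) / len(train)
    s = prior
    for w in test_doc.split():
        s *= (counts[w] + 1) / (n + len(vocab))    # Laplace smoothing
    return s

test = "i hated the poor acting"
print(score(test, "+"))   # ~6.03e-07
print(score(test, "-"))   # ~1.22e-05, so the document is classified '-'
```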