0% found this document useful (0 votes)

68 views

Classification

Classification is a supervised machine learning method where a model predicts labels for input data. There are two types of classifiers: eager learners, which build a model during training, and lazy learners, which do not build a model and instead search training data for predictions. K-nearest neighbors (KNN) is an example of a lazy learner that predicts labels based on similarity to training examples. Model evaluation metrics include accuracy, precision, recall, and F-measures, which are calculated using a confusion matrix comparing predicted and actual labels.

Uploaded by

lekha

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

68 views

Classification

Uploaded by

lekha

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

You are on page 1/ 58

Classification

• Classification is a supervised machine learning method

where the model tries to predict the correct label of a
given input data.
• In classification, the model is fully trained using the
training data, and then it is evaluated on test data before
being used to perform prediction on new unseen data.
• Two types of learners in classification: lazy and eager
learners.
• Eager learners are machine learning algorithms that first
build a model from the training dataset before making any
prediction on future datasets.
• They spend more time during the training process because of
their eagerness to have a better generalization during the
training from learning the weights
• But they require less time to make predictions.
• Most machine learning algorithms are eager learners, and
below are some examples:
• Logistic Regression.
• Support Vector Machine.
• Decision Trees.
• Artificial Neural Networks.
Lazy learners
•Lazy learners or instance-based learners, on the other hand, do not

create any model immediately from the training data, and this is where

the lazy aspect comes from.

•They just memorize the training data, and each time there is a need to

make a prediction, they search for the nearest neighbor from the whole

training data, which makes them very slow during prediction.

Some examples of this kind are:

K-Nearest Neighbor(KNN)

• K-Nearest Neighbour is one of the simplest Machine Learning algorithms based on

Supervised Learning technique.
• K-NN algorithm assumes the similarity between the new case/data and available cases
and put the new case into the category that is most similar to the available categories.
• K-NN algorithm stores all the available data and classifies a new data point based on
the similarity. This means when new data appears then it can be easily classified into a
well suite category by using K- NN algorithm.
• K-NN algorithm can be used for Regression as well as for Classification but mostly it
is used for the Classification problems.
• K-NN is a non-parametric algorithm, which means it does not make
any assumption on underlying data.

• It is also called a lazy learner algorithm because it does not learn

from the training set immediately instead it stores the dataset and at
the time of classification, it performs an action on the dataset.

• KNN algorithm at the training phase just stores the dataset and when it
gets new data, then it classifies that data into a category that is much
similar to the new data.
• Suppose, we have an image of a creature that looks similar to cat and
dog, but we want to know either it is a cat or dog. So for this
identification, we can use the KNN algorithm, as it works on a
similarity measure. Our KNN model will find the similar features of
the new data set to the cats and dogs images and based on the most
similar features it will put it in either cat or dog category.
Why do we need a K-NN Algorithm?
• Suppose there are two categories, i.e., Category A and Category B, and
we have a new data point x1, so this data point will lie in which of
these categories.

• To solve this type of problem, we need a K-NN algorithm.

• With the help of K-NN, we can easily identify the category or class of
a particular dataset.
How does K-NN work?

• The K-NN working can be explained on the basis of the below algorithm:

• Step-1: Select the number K of the neighbors

• Step-2: Calculate the Euclidean distance of K number of neighbors

• Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.

• Step-4: Among these k neighbors, count the number of the data points in each
category.

• Step-5: Assign the new data points to that category for which the number of the
neighbor is maximum.

• Step-6: Our model is ready.

Suppose we have a new data point and we need to put it in the
required category.
• Firstly, we will choose the number of neighbors, so we
will choose the k=5.
• Next, we will calculate the Euclidean
distance between the data points. The Euclidean
distance is the distance between two points, which we
have already studied in geometry.
• It can be calculated as:
By calculating the Euclidean distance we got the nearest neighbors, as three nearest
neighbors in category A and two nearest neighbors in category B. Consider the
below image:

•As we can see the 3 nearest neighbors are from category A,

hence this new data point must belong to category A.
How do we select the right value of K?

• We don’t have a particular method for determining the correct

value of K.

• The value of K that delivers the best accuracy for both training
and testing data is selected.
• It is recommended to always select an odd value of K

• When the value of K is set to even, a situation may arise in which the
elements from both groups are equal. In the diagram below, elements
from both groups are equal in the internal “Red” circle (k == 4).

• In this condition, the model would be unable to do the correct

classification. Here the model will randomly assign any of the two classes
to this new unknown data.
• Choosing an odd value for K is preferred because such a
state of equality between the two classes would never
occur here.

• Due to the fact that one of the two groups would still
be in the majority, the value of K is selected as odd.
• The impact of selecting a smaller or larger K value on the model
• Larger K value: The case of underfitting occurs when the value of
k is increased. In this case, the model would be unable to
correctly learn on the training data.
• Smaller k value: The condition of overfitting occurs when the
value of k is smaller. The model will capture all of the training
data, including noise. The model will perform poorly for the test
data in this scenario.
Advantages of KNN Algorithm:
• It is simple to implement.
• It is robust to the noisy training data
• It can be more effective if the training data is large.
Disadvantages of KNN Algorithm:
• Always needs to determine the value of K which may
be complex some time.
• The computation cost is high because of calculating the
distance between the data points for all the training
samples.
Case-based reasoning
• Case-based reasoning is any kind of problem-solving
approach that uses past solutions to solve similar problems.
• It assumes that knowledge can be acquired through past
experiences, and can help warn you of avenues that will lead
to failure or to help you think of successful past solutions that
could be adapted to the problem at hand.
• For example, Google Maps uses case-based reasoning to tell
you how long your journey will take by examining the
patterns of past users to see how long it took them to get from
point A to point B. Even if your path is from two slightly
different points, it makes inferences on how long your
journey will take.
Model Evaluation and Selection
 Evaluation metrics: How can we measure accuracy? Other metrics to consider?
 Use validation test set of class-labeled tuples instead of training set when assessing
accuracy
 Methods for estimating a classifier’s accuracy:
 Holdout method, random subsampling
 Cross-validation
 Bootstrap
 Comparing classifiers:
 Confidence intervals
 Cost-benefit analysis and ROC Curves

37
Classifier Evaluation Metrics: Confusion
Matrix
Confusion Matrix:
Actual class\Predicted class C1 ¬ C1
C1 True Positives (TP) False Negatives (FN)
¬ C1 False Positives (FP) True Negatives (TN)

Example of Confusion Matrix:

Actual class\Predicted buy_computer buy_computer Total
class = yes = no
buy_computer = yes 6954 46 7000
buy_computer = no 412 2588 3000
Total 7366 2634 10000

 Given m classes, an entry, CMi,j in a confusion matrix indicates

# of tuples in class i that were labeled by the classifier as class j
 May have extra rows/columns to provide totals
38
Classifier Evaluation Metrics: Accuracy,
Error Rate, Sensitivity and Specificity
A\P C ¬C  Class Imbalance Problem:
C TP FN P
 One class may be rare, e.g.
¬C FP TN N
fraud, or HIV-positive
P’ N’ All
 Significant majority of the

 Classifier Accuracy, or negative class and minority of

recognition rate: percentage of the positive class
test set tuples that are correctly  Sensitivity: True Positive
classified recognition rate
 Sensitivity = TP/P
Accuracy = (TP + TN)/All
 Specificity: True Negative
 Error rate: 1 – accuracy, or
Error rate = (FP + FN)/All recognition rate
 Specificity = TN/N

39
Classifier Evaluation Metrics:
Precision and Recall, and F-measures
 Precision: exactness – what % of tuples that the classifier
labeled as positive are actually positive

 Recall: completeness – what % of positive tuples did the

classifier label as positive?
 Perfect score is 1.0
 Inverse relationship between precision & recall
 F measure (F1 or F-score): harmonic mean of precision and
recall,

 Fß: weighted measure of precision and recall

 assigns ß times as much weight to recall as to precision

40
Classifier Evaluation Metrics: Example

Actual Class\Predicted class cancer = yes cancer = no Total Recognition(%)

cancer = yes 90 210 300 30.00 (sensitivity
cancer = no 140 9560 9700 98.56 (specificity)
Total 230 9770 10000 96.40 (accuracy)

 Precision = 90/230 = 39.13% Recall = 90/300 = 30.00%

41
Evaluating Classifier Accuracy:
Holdout & Cross-Validation Methods
 Holdout method
 Given data is randomly partitioned into two independent sets

 Training set (e.g., 2/3) for model construction

 Test set (e.g., 1/3) for accuracy estimation

 Random sampling: a variation of holdout

 Repeat holdout k times, accuracy = avg. of the accuracies

obtained
 Cross-validation (k-fold, where k = 10 is most popular)
 Randomly partition the data into k mutually exclusive subsets,

each approximately equal size


At i-th iteration, use Di as test set and others as training set
 Leave-one-out: k folds where k = # of tuples, for small sized

data
 *Stratified cross-validation*: folds are stratified so that class

dist. in each fold is approx. the same as that in the initial data
42
Evaluating Classifier Accuracy: Bootstrap
 Bootstrap
 Works well with small data sets
 Samples the given training tuples uniformly with replacement
 i.e., each time a tuple is selected, it is equally likely to be selected
again and re-added to the training set
 Several bootstrap methods, and a common one is .632 boostrap
 A data set with d tuples is sampled d times, with replacement, resulting in
a training set of d samples. The data tuples that did not make it into the
training set end up forming the test set. About 63.2% of the original data
end up in the bootstrap, and the remaining 36.8% form the test set (since
(1 – 1/d)d ≈ e-1 = 0.368)
 Repeat the sampling procedure k times, overall accuracy of the model:

43
Estimating Confidence Intervals:
Classifier Models M1 vs. M2
 Suppose we have 2 classifiers, M1 and M2, which one is better?
 Use 10-fold cross-validation to obtain and
 These mean error rates are just estimates of error on the true population of future
data cases
 What if the difference between the 2 error rates is just attributed to chance?
 Use a test of statistical significance
 Obtain confidence limits for our error estimates

44
Estimating Confidence Intervals:
Null Hypothesis
 Perform 10-fold cross-validation
 Assume samples follow a t distribution with k–1 degrees of freedom (here, k=10)
 Use t-test (or Student’s t-test)
 Null Hypothesis: M1 & M2 are the same
 If we can reject null hypothesis, then
 we conclude that the difference between M1 & M2 is statistically significant
 Chose model with lower error rate

45
Estimating Confidence Intervals: t-test

 If only 1 test set available: pairwise comparison

 For ith round of 10-fold cross-validation, the same cross partitioning is used to
obtain err(M1)i and err(M2)i
 Average over 10 rounds to get and
 t-test computes t-statistic with k-1 degrees of freedom:
where

 If two test sets available: use non-paired t-test

where

where k1 & k2 are # of cross-validation samples used for M1 & M2, resp.
46
Estimating Confidence Intervals:
Table for t-distribution

 Symmetric
 Significance level,
e.g., sig = 0.05 or
5% means M1 & M2
are significantly
different for 95% of
population
 Confidence limit, z
= sig/2

47
Estimating Confidence Intervals:
Statistical Significance
 Are M1 & M2 significantly different?
 Compute t. Select significance level (e.g. sig = 5%)
 Consult table for t-distribution: Find t value corresponding to k-1 degrees of
freedom (here, 9)
 t-distribution is symmetric: typically upper % points of distribution shown → look
up value for confidence limit z=sig/2 (here, 0.025)
 If t > z or t < -z, then t value lies in rejection region:
 Reject null hypothesis that mean error rates of M & M are same
1 2

 Conclude: statistically significant difference between M1 & M2

 Otherwise, conclude that any difference is chance

48
Model Selection: ROC Curves
 ROC (Receiver Operating
Characteristics) curves: for visual
comparison of classification models
 Originated from signal detection theory
 Shows the trade-off between the true
positive rate and the false positive rate
 The area under the ROC curve is a  Vertical axis
measure of the accuracy of the model represents the true
positive rate
 Rank the test tuples in decreasing  Horizontal axis rep.
order: the one that is most likely to the false positive rate
belong to the positive class appears at  The plot also shows a
the top of the list diagonal line
 The closer to the diagonal line (i.e., the  A model with perfect
closer the area is to 0.5), the less accuracy will have an
accurate is the model area of 1.0
49
Issues Affecting Model Selection
 Accuracy
 classifier accuracy: predicting class label
 Speed
 time to construct the model (training time)
 time to use the model (classification/prediction time)
 Robustness: handling noise and missing values
 Scalability: efficiency in disk-resident databases
 Interpretability
 understanding and insight provided by the model
 Other measures, e.g., goodness of rules, such as decision tree
size or compactness of classification rules
50
Chapter 8. Classification: Basic Concepts

 Classification: Basic Concepts

 Decision Tree Induction
 Bayes Classification Methods
 Rule-Based Classification
 Model Evaluation and Selection
 Techniques to Improve Classification Accuracy:
Ensemble Methods
 Summary
51
Ensemble Methods: Increasing the Accuracy

 Ensemble methods
 Use a combination of models to increase accuracy

 Combine a series of k learned models, M , M , …, M , with

1 2 k
the aim of creating an improved model M*
 Popular ensemble methods
 Bagging: averaging the prediction over a collection of

classifiers
 Boosting: weighted vote with a collection of classifiers

 Ensemble: combining a set of heterogeneous classifiers

52
Bagging: Boostrap Aggregation
 Analogy: Diagnosis based on multiple doctors’ majority vote
 Training
 Given a set D of d tuples, at each iteration i, a training set D of d tuples is
i
sampled with replacement from D (i.e., bootstrap)
 A classifier model M is learned for each training set D
i i
 Classification: classify an unknown sample X
 Each classifier M returns its class prediction
i
 The bagged classifier M* counts the votes and assigns the class with the
most votes to X
 Prediction: can be applied to the prediction of continuous values by taking
the average value of each prediction for a given test tuple
 Accuracy
 Often significantly better than a single classifier derived from D

 For noise data: not considerably worse, more robust

 Proved improved accuracy in prediction

53
Boosting
 Analogy: Consult several doctors, based on a combination of
weighted diagnoses—weight assigned based on the previous
diagnosis accuracy
 How boosting works?
 Weights are assigned to each training tuple
 A series of k classifiers is iteratively learned
 After a classifier Mi is learned, the weights are updated to
allow the subsequent classifier, Mi+1, to pay more attention to
the training tuples that were misclassified by Mi
 The final M* combines the votes of each individual classifier,
where the weight of each classifier's vote is a function of its
accuracy
 Boosting algorithm can be extended for numeric prediction
 Comparing with bagging: Boosting tends to have greater accuracy,
but it also risks overfitting the model to misclassified data 54
Adaboost (Freund and Schapire, 1997)
 Given a set of d class-labeled tuples, (X1, y1), …, (Xd, yd)
 Initially, all the weights of tuples are set the same (1/d)
 Generate k classifiers in k rounds. At round i,
 Tuples from D are sampled (with replacement) to form a training set Di
of the same size
 Each tuple’s chance of being selected is based on its weight
 A classification model Mi is derived from Di
 Its error rate is calculated using Di as a test set
 If a tuple is misclassified, its weight is increased, o.w. it is decreased
 Error rate: err(Xj) is the misclassification error of tuple Xj. Classifier Mi error
rate is the sum of the weights of the misclassified tuples:

 The weight of classifier Mi’s vote is

55
Random Forest (Breiman 2001)
 Random Forest:
 Each classifier in the ensemble is a decision tree classifier and is

generated using a random selection of attributes at each node to

determine the split
 During classification, each tree votes and the most popular class is

returned
 Two Methods to construct Random Forest:
 Forest-RI (random input selection): Randomly select, at each node, F

attributes as candidates for the split at the node. The CART methodology
is used to grow the trees to maximum size
 Forest-RC (random linear combinations): Creates new attributes (or

features) that are a linear combination of the existing attributes (reduces

the correlation between individual classifiers)
 Comparable in accuracy to Adaboost, but more robust to errors and outliers
 Insensitive to the number of attributes selected for consideration at each
split, and faster than bagging or boosting
56
Classification of Class-Imbalanced Data Sets
 Class-imbalance problem: Rare positive example but numerous
negative ones, e.g., medical diagnosis, fraud, oil-spill, fault, etc.
 Traditional methods assume a balanced distribution of classes
and equal error costs: not suitable for class-imbalanced data
 Typical methods for imbalance data in 2-class classification:
 Oversampling: re-sampling of data from positive class

 Under-sampling: randomly eliminate tuples from negative

class
 Threshold-moving: moves the decision threshold, t, so that

the rare class tuples are easier to classify, and hence, less
chance of costly false negative errors
 Ensemble techniques: Ensemble multiple classifiers

introduced above
 Still difficult for class imbalance problem on multiclass tasks
57
Chapter 8. Classification: Basic Concepts

 Classification: Basic Concepts

Credit Card Fraud Detection Using Machine Learning: Ruttala Sailusha V. Gnaneswar
No ratings yet
Credit Card Fraud Detection Using Machine Learning: Ruttala Sailusha V. Gnaneswar
7 pages
Machine Learning and Web Scraping Lecture 03
No ratings yet
Machine Learning and Web Scraping Lecture 03
22 pages
Algorithm
No ratings yet
Algorithm
27 pages
K Nearest Neighbors
No ratings yet
K Nearest Neighbors
19 pages
FPA unit 2
No ratings yet
FPA unit 2
20 pages
Artificial Intelligence and Machine Learning
No ratings yet
Artificial Intelligence and Machine Learning
12 pages
UNIT 2 - Notes
No ratings yet
UNIT 2 - Notes
31 pages
DW&M Unit 3 Part I
No ratings yet
DW&M Unit 3 Part I
101 pages
UNIT 3 - Final
No ratings yet
UNIT 3 - Final
37 pages
AIML Unit-IV & V
100% (1)
AIML Unit-IV & V
47 pages
ML-Unit 5
No ratings yet
ML-Unit 5
40 pages
Lecture 4
No ratings yet
Lecture 4
31 pages
Unit 4 Supervised Learning
100% (1)
Unit 4 Supervised Learning
75 pages
K-Nearest Neighbor (KNN) Algorithm For Machine Learning - Javatpoint
No ratings yet
K-Nearest Neighbor (KNN) Algorithm For Machine Learning - Javatpoint
18 pages
K-Nearest Neighbor (KNN) Algorithm For Machine Learning
No ratings yet
K-Nearest Neighbor (KNN) Algorithm For Machine Learning
17 pages
Untitled 9
No ratings yet
Untitled 9
17 pages
ML-UNIT-2
No ratings yet
ML-UNIT-2
46 pages
CH 04 Classification Techniques
No ratings yet
CH 04 Classification Techniques
89 pages
Distance-Based Methods - KNN
No ratings yet
Distance-Based Methods - KNN
8 pages
ML Unit-4
No ratings yet
ML Unit-4
9 pages
DM - MP (1)
No ratings yet
DM - MP (1)
15 pages
Classification
No ratings yet
Classification
53 pages
Introduction to Classification and Classification Algorithms
No ratings yet
Introduction to Classification and Classification Algorithms
9 pages
ML UNIT - III-Complete
No ratings yet
ML UNIT - III-Complete
52 pages
Unit - 2 ML
No ratings yet
Unit - 2 ML
32 pages
ML Unit 3
No ratings yet
ML Unit 3
12 pages
statistic inference unit 2 notes
No ratings yet
statistic inference unit 2 notes
34 pages
7.classification Before
No ratings yet
7.classification Before
27 pages
Lectures 7 and 8 - Data Anaysis in Management - MBM
No ratings yet
Lectures 7 and 8 - Data Anaysis in Management - MBM
78 pages
ML Unit-2
No ratings yet
ML Unit-2
24 pages
Data Sciene - Unit 5 Material
No ratings yet
Data Sciene - Unit 5 Material
15 pages
Classification
No ratings yet
Classification
50 pages
Lecture 3 Basics of Clssification
No ratings yet
Lecture 3 Basics of Clssification
53 pages
NLP Chapter 2
No ratings yet
NLP Chapter 2
79 pages
ch2
No ratings yet
ch2
30 pages
Co-2 ML 2019
No ratings yet
Co-2 ML 2019
71 pages
Lecture Week 2 KNN and Model Evaluation PDF
100% (1)
Lecture Week 2 KNN and Model Evaluation PDF
53 pages
K-NN
No ratings yet
K-NN
25 pages
Machine Learning Unit-3.1
No ratings yet
Machine Learning Unit-3.1
20 pages
MachineLearning Unit-III Ppt
No ratings yet
MachineLearning Unit-III Ppt
26 pages
U3 KNN
No ratings yet
U3 KNN
6 pages
Unit Ii
No ratings yet
Unit Ii
102 pages
ML CH 3
No ratings yet
ML CH 3
88 pages
Unit - II
No ratings yet
Unit - II
37 pages
ML Unit-2
No ratings yet
ML Unit-2
26 pages
2. Classification and clustering algorithms
No ratings yet
2. Classification and clustering algorithms
108 pages
ML Mid2 Ans
No ratings yet
ML Mid2 Ans
24 pages
KNN (K-Nearest Neighbours) Is A Supervised Learning and Non-Parametric Algorithm That Can
No ratings yet
KNN (K-Nearest Neighbours) Is A Supervised Learning and Non-Parametric Algorithm That Can
4 pages
Class10 14 PatternClassification - 13 24sept2019
No ratings yet
Class10 14 PatternClassification - 13 24sept2019
50 pages
Classification
No ratings yet
Classification
7 pages
When Do We Use KNN Algorithm?
No ratings yet
When Do We Use KNN Algorithm?
7 pages
Topics in Module-3-: ML & Cloud Computing For Iot
No ratings yet
Topics in Module-3-: ML & Cloud Computing For Iot
149 pages
Data Mining 4th Is
No ratings yet
Data Mining 4th Is
24 pages
KNN Evaluation
No ratings yet
KNN Evaluation
51 pages
Lecture 2 Final
No ratings yet
Lecture 2 Final
90 pages
Supervised Learning - SVM - DT
No ratings yet
Supervised Learning - SVM - DT
43 pages
Module Iii
No ratings yet
Module Iii
15 pages
Unit 5 Classification PDF
No ratings yet
Unit 5 Classification PDF
131 pages
Machine Learning unit 3
No ratings yet
Machine Learning unit 3
40 pages
3.1 K Nearest Neighbour Classifier (1)
No ratings yet
3.1 K Nearest Neighbour Classifier (1)
24 pages
K Nearest Neighbor Algorithm: Fundamentals and Applications
From Everand
K Nearest Neighbor Algorithm: Fundamentals and Applications
Fouad Sabry
No ratings yet
CS3492 Database Management Systems Two Mark Questions 1
100% (1)
CS3492 Database Management Systems Two Mark Questions 1
38 pages
Unit I
No ratings yet
Unit I
159 pages
OLAP Operations
No ratings yet
OLAP Operations
28 pages
NS CRYPTO LAB Final
No ratings yet
NS CRYPTO LAB Final
28 pages
Logical Database Design: Unit 7
No ratings yet
Logical Database Design: Unit 7
52 pages
Test Driven Machine Learning - Sample Chapter
100% (1)
Test Driven Machine Learning - Sample Chapter
25 pages
Data Analytics Unit III
No ratings yet
Data Analytics Unit III
88 pages
Image Guarder
No ratings yet
Image Guarder
6 pages
Influence of Radiographic Acquisition Methods and Visualization Software Programs On The Detection of Misfits at The Implant-Abutment Interface An Ex Vivo Study
No ratings yet
Influence of Radiographic Acquisition Methods and Visualization Software Programs On The Detection of Misfits at The Implant-Abutment Interface An Ex Vivo Study
7 pages
Ai Mini Project Report
No ratings yet
Ai Mini Project Report
41 pages
ML File
No ratings yet
ML File
37 pages
MSC Proposal - Quantitative - 2019
No ratings yet
MSC Proposal - Quantitative - 2019
32 pages
ICCAKM-2022 Paper 46
No ratings yet
ICCAKM-2022 Paper 46
6 pages
Barakat
No ratings yet
Barakat
7 pages
CSDS 440: Machine Learning: Soumya Ray (
No ratings yet
CSDS 440: Machine Learning: Soumya Ray (
20 pages
Terwee Quality Criteria PDF
No ratings yet
Terwee Quality Criteria PDF
9 pages
Phishing Detection With Machine Learning
No ratings yet
Phishing Detection With Machine Learning
9 pages
SNAPPE II Score: Predictor of Mortality in NICU: Original Research Article
No ratings yet
SNAPPE II Score: Predictor of Mortality in NICU: Original Research Article
5 pages
Evaluation From Precision Recall and F-Factor To R
No ratings yet
Evaluation From Precision Recall and F-Factor To R
25 pages
Download Full Deep Learning 1st Edition Dulani Meedeniya PDF All Chapters
100% (2)
Download Full Deep Learning 1st Edition Dulani Meedeniya PDF All Chapters
50 pages
Iranian Churn
No ratings yet
Iranian Churn
16 pages
Ijbsv 17 P 1581
No ratings yet
Ijbsv 17 P 1581
7 pages
Detection of Malicious Urls Using Machine Learning: Nuria Reyes Dorta Pino Caballero Gil Carlos Rosa Remedios
No ratings yet
Detection of Malicious Urls Using Machine Learning: Nuria Reyes Dorta Pino Caballero Gil Carlos Rosa Remedios
18 pages
Cakir2021 Lactatealbumin Ratio Is More Effective Than Lactate or Albumin Alone in Predicting Clinical Outcomes in Intensive Care Patients With Sepsis
No ratings yet
Cakir2021 Lactatealbumin Ratio Is More Effective Than Lactate or Albumin Alone in Predicting Clinical Outcomes in Intensive Care Patients With Sepsis
6 pages
Ai-900 5
100% (1)
Ai-900 5
5 pages
Astm E3327 - E3327m - 21
No ratings yet
Astm E3327 - E3327m - 21
29 pages
Predicting Autism Spectrum Disorder Using Machine Learning Classifiers
No ratings yet
Predicting Autism Spectrum Disorder Using Machine Learning Classifiers
4 pages
ROC Curves
100% (1)
ROC Curves
3 pages
Pelod Vs Sofa Scoring in Pediatric
No ratings yet
Pelod Vs Sofa Scoring in Pediatric
6 pages
Model 1 - Customer Churn Prediction in Telecom Using ML
No ratings yet
Model 1 - Customer Churn Prediction in Telecom Using ML
24 pages
Chap3 Part1 Classification
No ratings yet
Chap3 Part1 Classification
38 pages
A Model For Predicting Music Popularity On Streami
No ratings yet
A Model For Predicting Music Popularity On Streami
10 pages
Developing A Global Atmospheric Turbulence Decision Support System For Aviation
No ratings yet
Developing A Global Atmospheric Turbulence Decision Support System For Aviation
6 pages
How Eastern Bank Uses Big Data To Better Serve & Protect Its Customers!
No ratings yet
How Eastern Bank Uses Big Data To Better Serve & Protect Its Customers!
36 pages

Classification

Uploaded by

Classification

Uploaded by

Classification

• Classification is a supervised machine learning method

the lazy aspect comes from.

training data, which makes them very slow during prediction.

Some examples of this kind are:

• K-Nearest Neighbour is one of the simplest Machine Learning algorithms based on

• It is also called a lazy learner algorithm because it does not learn

• To solve this type of problem, we need a K-NN algorithm.

• Step-1: Select the number K of the neighbors

• Step-2: Calculate the Euclidean distance of K number of neighbors

• Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.

• Step-6: Our model is ready.

•As we can see the 3 nearest neighbors are from category A,

• We don’t have a particular method for determining the correct

• In this condition, the model would be unable to do the correct

Example of Confusion Matrix:

 Given m classes, an entry, CMi,j in a confusion matrix indicates

 Classifier Accuracy, or negative class and minority of

 Recall: completeness – what % of positive tuples did the

 Fß: weighted measure of precision and recall

Actual Class\Predicted class cancer = yes cancer = no Total Recognition(%)

 Precision = 90/230 = 39.13% Recall = 90/300 = 30.00%

 Training set (e.g., 2/3) for model construction

 Test set (e.g., 1/3) for accuracy estimation

 Random sampling: a variation of holdout

 Repeat holdout k times, accuracy = avg. of the accuracies

each approximately equal size

 If only 1 test set available: pairwise comparison

 If two test sets available: use non-paired t-test

 Conclude: statistically significant difference between M1 & M2

 Classification: Basic Concepts

 Combine a series of k learned models, M , M , …, M , with

 Ensemble: combining a set of heterogeneous classifiers

 For noise data: not considerably worse, more robust

 Proved improved accuracy in prediction

 The weight of classifier Mi’s vote is

generated using a random selection of attributes at each node to

features) that are a linear combination of the existing attributes (reduces

 Under-sampling: randomly eliminate tuples from negative

 Classification: Basic Concepts

You might also like