ML Notes UT-2
CLASSIFICATION
Supervised Machine Learning algorithms can be broadly classified into Regression
and Classification algorithms. Regression algorithms predict the output for continuous
values, but to predict categorical values, we need Classification algorithms.
2.1 Classification Algorithm:
The Classification algorithm is a Supervised Learning technique that is used to
identify the category of new observations based on training data. In Classification, a
program learns from the given dataset or observations and then classifies new
observations into one of several classes or groups, such as Yes or No, 0 or 1, Spam or Not
Spam, cat or dog, etc. Classes can also be called targets, labels, or categories.
Unlike regression, the output variable of Classification is a category, not a value, such as
"Green or Blue", "fruit or animal", etc. Since the Classification algorithm is a Supervised
Learning technique, it takes labelled input data, meaning each input comes with its
corresponding output.
In a classification algorithm, a discrete output function y = f(x) maps the input variable x to a class label.
The main goal of the Classification algorithm is to identify the category of a given dataset,
and these algorithms are mainly used to predict the output for categorical data.
Classification algorithms can be better understood using the diagram below, in which
there are two classes, Class A and Class B. Points within a class have features that are
similar to each other and dissimilar to those of the other class.
The algorithm which implements the classification on a dataset is known as a classifier.
If the input feature vector to the classifier is a real vector x, then the output score is
y = f(w · x), where w is a vector of weights and f is a function that converts the weighted
sum into the predicted class.
2.2 Performance Evaluation:
• Confusion Matrix: As the target variable is not continuous, a binary classification
model predicts the probability that the target variable is Yes or No. To evaluate such
a model, a metric called the confusion matrix is used, also called the classification
or coincidence matrix. With the help of a confusion matrix, we can calculate the
following important performance measures:
• Accuracy: Accuracy is the simple ratio of the number of correctly classified points
to the total number of points.
Accuracy = (TP + TN) / (TP + FP + TN + FN)
This term tells us how many correct classifications were made out of all classifications,
i.e. how many TPs and TNs there were out of TP + TN + FP + FN. It is the ratio of
"True"s to the sum of "True"s and "False"s.
Use case: Out of all the patients who visited the doctor, how many were correctly
diagnosed as Covid positive and Covid negative.
• Precision: Precision is the fraction of true positive examples among all the
examples that the model classified as positive. In other words, it is the number of
true positives divided by the sum of true positives and false positives:
Precision = TP / (TP + FP)
Low precision: the more False positives the model predicts, the lower the precision.
Use case: Let's take another example of a classification algorithm that marks emails as
spam or not spam. Here, positive means spam, so if important emails get wrongly marked
as positive, useful emails will end up in the "Spam" folder, which is dangerous. Hence, the
classification model with the lowest FP count needs to be selected; in other words, the
model with the highest precision needs to be selected among all the models.
• Recall or Sensitivity: Recall is the fraction of actual positive instances that the
model correctly classifies as positive. It is the number of true positives divided by
the sum of true positives and false negatives:
Recall = TP / (TP + FN)
Low recall: the more False Negatives the model predicts, the lower the recall.
Use case: Out of all the actual Covid patients who visited the doctor, how many were
correctly diagnosed as Covid positive? Hence, the classification model with the lowest
FN count needs to be selected; in other words, the model with the highest recall value
needs to be selected among all the models.
Precision helps us understand how useful the results are. Recall helps us understand how
complete the results are.
• ROC Curves: A Receiver Operating Characteristic curve, or ROC curve, is created by
plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) at various
threshold settings. The ROC curve is generated by plotting the cumulative distribution
function of the True Positive Rate on the y-axis versus the cumulative distribution
function of the False Positive Rate on the x-axis.
• F-Measure: Once precision and recall have been calculated for a binary classification
problem, the two scores can be combined into the calculation of the F-Measure.
The traditional F measure is calculated as follows:
F-Measure = (2 * Precision * Recall) / (Precision + Recall)
This is the harmonic mean of the two fractions. This is sometimes called the F-Score or
the F1-Score and might be the most common metric used on imbalanced classification
problems.
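A minimal sketch (assuming scikit-learn is available) of computing the measures above from a set of true labels and model predictions; the label arrays below are made up purely for illustration, with 1 standing for the positive (e.g. Covid-positive) class.

# Compute confusion matrix, accuracy, precision, recall, and F1 with scikit-learn
from sklearn.metrics import (confusion_matrix, accuracy_score,
                             precision_score, recall_score, f1_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # actual classes (illustrative)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # classes predicted by some model

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))

print("Accuracy :", accuracy_score(y_true, y_pred))    # (TP + TN) / total
print("Precision:", precision_score(y_true, y_pred))   # TP / (TP + FP)
print("Recall   :", recall_score(y_true, y_pred))      # TP / (TP + FN)
print("F1-score :", f1_score(y_true, y_pred))          # harmonic mean of the two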
2.3 Multi-class Classification: Multiclass classification is a classification task with more
than two classes, where each sample can only be labelled as one class. Each training point
belongs to one of N different classes, and the goal is to construct a function which, given a
new data point, will correctly predict the class to which the new point belongs. (There are
also scenarios in which a given point can belong to multiple categories at once; in its most
basic form, that multi-label problem decomposes trivially into a set of unlinked binary
problems, which can be solved naturally using our techniques for binary classification.)
For example, classification using features extracted from a set of images of fruit, where
each image may either be of an orange, an apple, or a pear. Each image is one sample and
is labelled as one of the 3 possible classes. Multiclass classification assumes that each
sample is assigned to one and only one label - one sample cannot, for example, be both a
pear and an apple.
2.3.1 Binary vs Multiclass Classification:
2. One vs Rest: One-vs-rest (OvR for short, also referred to as One-vs-All or OvA) is a
heuristic method for using binary classification algorithms for multi-class
classification. It involves splitting the multi-class dataset into multiple binary
classification problems: we train C binary classifiers fc(x), where the data from
class c is treated as positive and the data from all other classes is treated as
negative. A prediction is then made using the model that is the most confident, as
in the sketch below.
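A brief sketch of the one-vs-rest idea using scikit-learn (assumed available); the iris dataset merely stands in for a three-class problem like the fruit example above. OneVsRestClassifier trains one binary logistic regression per class and predicts with the most confident one.

# One-vs-rest: one binary classifier per class, most confident classifier wins
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)          # 3 classes, so 3 binary problems
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X, y)

print(len(ovr.estimators_))                # 3 underlying binary classifiers
print(ovr.predict(X[:5]))                  # most confident class per sample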
2.4 Linear Models:
Linear modelling in a classification context consists of regression followed by a
transformation that returns a categorical output, thereby producing a decision
boundary. The two most commonly used linear classification algorithms are logistic regression and
linear support vector machines. In the field of machine learning, the goal of statistical
classification is to use an object's characteristics to identify which class (or group) it
belongs to. A linear classifier achieves this by making a classification decision based on
the value of a linear combination of the characteristics.
The mathematical formula used to make a prediction in binary classification is given below:
ŷ = w[0] * x[0] + w[1] * x[1] + … + w[p] * x[p] + b > 0
The formula is quite like the one used in linear regression, but here, instead of returning
the weighted sum of the features directly, the predicted value is thresholded at zero. If the
weighted sum is less than zero, the class is predicted as -1, and if it is greater than zero,
the class is predicted as +1. This common rule is used in all linear models for
classification.
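A small sketch of this decision rule; the weights w, bias b, and input x below are hypothetical values chosen only to illustrate the thresholding, not the output of any trained model.

# Threshold the weighted sum of the features at zero
import numpy as np

w = np.array([0.5, -1.2, 0.3])   # learned weights (hypothetical values)
b = 0.1                          # learned bias / intercept (hypothetical)
x = np.array([1.0, 0.4, 2.0])    # one input feature vector

score = np.dot(w, x) + b         # weighted sum of the features plus bias
y_hat = 1 if score > 0 else -1   # threshold at zero, as described above
print(score, y_hat)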
2.4.1 Linear Support Vector Machines (SVM):
Support Vector Machine or SVM is one of the most popular Supervised Learning
algorithms, which is used for Classification as well as Regression problems. However,
primarily, it is used for Classification problems in Machine Learning. The goal of the SVM
algorithm is to create the best line or decision boundary that can segregate n-dimensional
space into classes so that we can easily put the new data point in the correct category in
the future. This best decision boundary is called a hyperplane. SVM chooses the extreme
points/vectors that help in creating the hyperplane. These extreme cases are called
support vectors, and hence the algorithm is termed Support Vector Machine. Consider the
diagram below, in which two different categories are classified using a decision boundary
or hyperplane:
Example: SVM can be understood with the example that we used for the KNN classifier.
Suppose we see a strange cat that also has some features of a dog, and we want a model
that can accurately identify whether it is a cat or a dog. Such a model can be created
using the SVM algorithm. We first train our model with lots of images of cats and dogs so
that it can learn their different features, and then we test it with this strange creature.
The SVM creates a decision boundary between the two classes (cat and dog) and chooses
the extreme cases (support vectors) of each class. Based on the support vectors, it will
classify the creature as a cat. Consider the diagram below:
Hence, the SVM algorithm helps to find the best line or decision boundary; this
best boundary or region is called a hyperplane. The SVM algorithm finds the closest
points of the two classes to the line; these points are called support vectors.
The distance between the support vectors and the hyperplane is called the margin, and
the goal of SVM is to maximize this margin. The hyperplane with the maximum margin is
called the optimal hyperplane.
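A short sketch (assuming scikit-learn) of reading the support vectors and the margin off a fitted linear SVM; the synthetic blob data and the large C value are illustrative choices, and the margin width is computed as 2/||w|| for a linear hyperplane.

# Fit a linear SVM and inspect its support vectors and margin
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=6)
svm = SVC(kernel="linear", C=1000).fit(X, y)

print(svm.support_vectors_)                     # the extreme points defining the margin
w = svm.coef_[0]                                # weight vector of the hyperplane
print("margin width:", 2 / np.linalg.norm(w))   # distance between the two margin lines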
If the data is not linearly separable in its original two dimensions, a third dimension z
can be added (for example, z = x² + y²), and SVM will then divide the dataset into classes
in the following way. Consider the image below: since we are now in 3-D space, the decision
boundary looks like a plane parallel to the x-axis. If we convert it back to 2-D space with
z = 1, it becomes a circular boundary in the original plane:
Suppose we are given two hyperplanes, one with 100% accuracy (HP1) on the left side
and another with >90% accuracy (HP2) on the right side. Which one would you think is
the correct classifier? Most of us would pick HP2 because of its larger margin, but that is
the wrong answer. The Support Vector Machine would choose HP1, even though it has a
narrow margin, because HP2 goes against the constraint that each data point must lie on
the correct side of the margin and there should be no misclassification. This is the hard
constraint that the Support Vector Machine follows throughout.
HP1 is a hard-margin SVM (left side) while HP2 is a soft-margin SVM (right side). By
default, the Support Vector Machine implements hard-margin SVM, which works well only
if our data is linearly separable. Hard-margin SVM does not allow any misclassification to
happen, so if our data is non-separable/nonlinear, it will not return any hyperplane, as it
will not be able to separate the data. This is where soft-margin SVM comes to the rescue:
it allows some misclassification to happen by relaxing the hard constraints of the Support
Vector Machine. Soft-margin SVM is implemented with the help of the regularization
parameter C, which tells us how much misclassification we want to avoid (see the sketch
below):
– Hard-margin SVM generally corresponds to large values of C.
– Soft-margin SVM generally corresponds to small values of C.
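A sketch of the effect of C with scikit-learn's SVC (a linear kernel is assumed here); the synthetic data and the specific C values are illustrative only. A very large C approximates the hard margin, while a small C yields a soft margin that tolerates more misclassification.

# Compare a (nearly) hard margin with a soft margin via the C parameter
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

hard_like = SVC(kernel="linear", C=1e6).fit(X, y)   # very large C: almost no slack allowed
soft = SVC(kernel="linear", C=0.01).fit(X, y)       # small C: wider, more tolerant margin

# A softer margin typically relies on more support vectors.
print(len(hard_like.support_vectors_), len(soft.support_vectors_))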
2.6 SVM Kernel to handle non-linear data:
SVM can be extended to solve nonlinear classification tasks when the set of samples
cannot be separated linearly. By applying kernel functions, the samples are mapped onto
a high-dimensional feature space, in which the linear classification is possible.
• Gaussian Radial Basis Function (RBF): This is one of the most preferred and widely
used kernel functions in SVM. It is usually chosen for non-linear data, as it helps to
make a proper separation when there is no prior knowledge of the data. It is defined
as K(x1, x2) = exp(-γ ||x1 - x2||²), where the value of γ (gamma) typically varies
from 0 to 1 and must be provided manually.
• Gaussian Kernel: The Gaussian kernel is a very popular kernel function used in
many machine learning algorithms, especially in support vector machines (SVMs).
It is more often used than polynomial kernels when learning from nonlinear
datasets and is usually employed in formulating the classical SVM for nonlinear
problems. The Gaussian kernel function allows the separation of nonlinearly
separable data by mapping the input vectors into a Hilbert space. The Gaussian kernel
is an exponential function involving a norm and a real constant.
• Polynomial: In general, the polynomial kernel is defined as K(x1, x2) = (x1 · x2 + c)^d,
where d is the degree of the polynomial and c is a constant. A short sketch of the RBF
and polynomial kernels follows.
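A sketch (scikit-learn assumed) of fitting the RBF and polynomial kernels on data that is not linearly separable; the gamma, degree, and coef0 values are illustrative and not tuned.

# Kernel SVMs on non-linear data (concentric circles)
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles are not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=200, noise=0.1, factor=0.4, random_state=0)

rbf = SVC(kernel="rbf", gamma=0.5).fit(X, y)            # Gaussian RBF kernel
poly = SVC(kernel="poly", degree=3, coef0=1).fit(X, y)  # polynomial kernel of degree 3

print("RBF training accuracy :", rbf.score(X, y))
print("Poly training accuracy:", poly.score(X, y))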
2.7 Logistic Regression: Logistic regression is one of the most popular Machine Learning
algorithms, which comes under the Supervised Learning technique. It is used for
predicting the categorical dependent variable using a given set of independent variables.
Logistic regression predicts the output of a categorical dependent variable. Therefore, the
outcome must be a categorical or discrete value. It can be Yes or No, 0 or 1, True or
False, etc., but instead of giving the exact value 0 or 1, it gives probabilistic values
which lie between 0 and 1. Logistic Regression is much like Linear Regression except
in how it is used: Linear Regression is used for solving regression problems,
whereas Logistic Regression is used for solving classification problems. In Logistic
regression, instead of fitting a regression line, we fit an "S" shaped logistic function, which
predicts two maximum values (0 or 1). The curve from the logistic function indicates the
likelihood of something such as whether the cells are cancerous or not, a mouse is obese
or not based on its weight, etc. Logistic Regression is a significant machine learning
algorithm because it can provide probabilities and classify new data using continuous
and discrete datasets. Logistic Regression can be used to classify the observations using
different types of data and can easily determine the most effective variables used for the
classification. This type of statistical model (also known as logit model) is often used for
classification and predictive analytics. Logistic regression estimates the probability of an
event occurring, such as voted or didn’t vote, based on a given dataset of independent
variables. Since the outcome is a probability, the dependent variable is bounded between
0 and 1. In logistic regression, a logit transformation is applied on the odds—that is, the
probability of success divided by the probability of failure.
The logistic model starts from the straight-line equation y = b0 + b1*x1 + b2*x2 + … + bn*xn.
In Logistic Regression, y can only be between 0 and 1, so to allow for this let's divide the
above equation by (1 - y), which gives the odds:
y / (1 - y), which is 0 for y = 0 and infinity for y = 1.
But we need a range between -infinity and +infinity, so taking the logarithm of the
equation, it becomes:
log(y / (1 - y)) = b0 + b1*x1 + b2*x2 + … + bn*xn
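To connect the two directions of the transformation, here is a small sketch of the logistic (sigmoid) function that inverts the logit above, mapping the unbounded linear score back into a probability between 0 and 1; the coefficients and inputs are hypothetical.

# The sigmoid maps the log-odds back to a probability in (0, 1)
import numpy as np

def sigmoid(z):
    # p = 1 / (1 + e^(-z)), the inverse of the logit log(p / (1 - p)) = z
    return 1.0 / (1.0 + np.exp(-z))

b0, b1 = -1.0, 0.8                    # hypothetical coefficients
x = np.array([-3.0, 0.0, 1.25, 4.0])  # hypothetical input values
z = b0 + b1 * x                       # linear combination (the log-odds)
print(sigmoid(z))                     # probabilities strictly between 0 and 1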
2.7.2 Steps in Logistic Regression: To implement the Logistic Regression using Python,
we will use the same steps as we have done in the previous Regression topics. The steps
are listed below, followed by a compact sketch:
• Data Pre-processing step
• Fitting Logistic Regression to the Training set
• Predicting the test result
• Test accuracy of the result (Creation of Confusion matrix)
• Visualizing the test set result.
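A compact sketch of these steps with scikit-learn (assumed available); the breast-cancer dataset is only a stand-in for whichever dataset the notes use, and the visualization step is left as a comment.

# Logistic Regression following the steps above
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score

# 1. Data pre-processing: split into train/test sets and scale the features
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# 2. Fit Logistic Regression to the training set
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 3. Predict the test results
y_pred = clf.predict(X_test)

# 4. Test accuracy of the result (confusion matrix)
print(confusion_matrix(y_test, y_pred))
print("Test accuracy:", accuracy_score(y_test, y_pred))

# 5. Visualizing the test-set result would follow here (e.g. with matplotlib)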