
ARTIFICIAL INTELLIGENCE LEC 2

The document outlines a course on Advanced Artificial Intelligence focusing on supervised learning and classification algorithms. It explains the process of supervised learning, key classification terminologies, types of classification algorithms, and their use cases. Additionally, it discusses the evaluation methods for classification models, including log loss and confusion matrix, and provides insights into decision trees as a classification method.

Uploaded by

Kunal Kumar
Copyright
© All Rights Reserved

ARTIFICIAL INTELLIGENCE(ADVANCED)

A Course under the Centre of Excellence, an Initiative of the Department of Science and Technology, Government of Bihar

GOVERNMENT POLYTECHNIC SAHARSA


Presenter:
Prof. Shubham
HoD(Computer Science and Engineering)
Today's Class
➢Introduction to Supervised Learning
➢Classification Algorithms
➢Types of Classification Algorithms
Supervised Machine Learning
Supervised learning is the type of machine learning in which machines are
trained using well "labelled" training data and, on the basis of that data,
predict the output. Labelled data means that some input data is already tagged
with the correct output.
In supervised learning, the training data provided to the machines works as the
supervisor that teaches the machines to predict the output correctly. It applies
the same concept as a student learning under the supervision of a teacher.
Supervised learning is a process of providing input data as well as correct output
data to the machine learning model. The aim of a supervised learning algorithm
is to find a mapping function that maps the input variable (x) to the output
variable (y).
In the real world, supervised learning can be used for risk assessment, image
classification, fraud detection, spam filtering, etc.
Steps Involved in Supervised Learning:
First, determine the type of training dataset.

Collect/gather the labelled training data.

Split the labelled dataset into a training dataset, a test dataset, and a validation dataset.

Determine the input features of the training dataset, which should carry enough information for
the model to predict the output accurately.

Determine a suitable algorithm for the model, such as a support vector machine, decision tree,
etc.

Execute the algorithm on the training dataset. Sometimes we need validation sets to tune the control
parameters; these sets are held-out subsets of the training data.

Evaluate the accuracy of the model on the test set. If the model predicts the correct
outputs for it, the model is accurate.
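The splitting and evaluation steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the 60/20/20 ratio, the fixed random seed, and the helper names `split_dataset` and `accuracy` are chosen for the example and do not come from the lecture.

```python
import random

def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle labelled samples and split them into train/validation/test lists.

    The 60/20/20 ratio is an assumed default, not a rule from the lecture.
    """
    shuffled = list(samples)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains becomes the test set
    return train, val, test

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the correct labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Toy labelled data: (input x, label y), where y is 1 when x >= 5.
data = [(x, int(x >= 5)) for x in range(10)]
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))
```

Keeping the test set apart until the very end is what makes the final accuracy figure an honest estimate of how the model behaves on unseen data.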
Classification Terminologies In Machine Learning
•Classifier – It is an algorithm that is used to map the input data to a specific category.
•Classification Model – The model draws a conclusion from the input data
given for training and predicts the class or category for new data.
•Feature – A feature is an individual measurable property of the phenomenon being
observed.
•Binary Classification – It is a type of classification with two outcomes, e.g. either
true or false.
•Multi-Class Classification – Classification with more than two classes. In multi-class
classification, each sample is assigned to one and only one label or target.
•Multi-label Classification – This is a type of classification where each sample is
assigned to a set of labels or targets.
•Initialize – It is to assign the classifier to be used for the
•Train the Classifier – Each classifier in scikit-learn uses the fit(X, y) method to fit the
model to the training data X and the training labels y.
•Predict the Target – For an unlabeled observation X, the predict(X) method returns the
predicted label y.
•Evaluate – This means evaluating the model, e.g. with a classification report,
accuracy score, etc.
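The initialize, train, predict, and evaluate sequence can be illustrated without any external library. The sketch below uses a hypothetical class, `TinyCentroidClassifier`, which is not part of scikit-learn; it only mirrors the `fit(X, y)` / `predict(X)` convention described above, and the nearest-centroid rule and toy data are assumptions made for the example.

```python
class TinyCentroidClassifier:
    """Hypothetical classifier: predicts the class whose mean feature vector is closest."""

    def fit(self, X, y):
        # Group training rows by label, then average each group feature by feature.
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        # For every row, pick the label of the nearest class centroid.
        return [
            min(self.centroids_, key=lambda label: sq_dist(row, self.centroids_[label]))
            for row in X
        ]

# Initialize
clf = TinyCentroidClassifier()
# Train
X_train = [[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]]
y_train = ["small", "small", "large", "large"]
clf.fit(X_train, y_train)
# Predict
X_new = [[2.0, 1.0], [8.5, 9.0]]
y_pred = clf.predict(X_new)
# Evaluate
y_true = ["small", "large"]
acc = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)
print(y_pred, acc)
```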
Classification
The Classification algorithm is a Supervised Learning technique that is used to identify the
category of new observations on the basis of training data.

In Classification, a program learns from the given dataset or observations and then classifies
new observations into a number of classes or groups, such as Yes or No, 0 or 1, Spam or Not
Spam, cat or dog, etc. Classes can also be called targets, labels, or categories.

Unlike regression, the output variable of Classification is a category, not a value, such as
"Green or Blue", "fruit or animal", etc. Since the Classification algorithm is a Supervised
learning technique, it takes labeled input data, which means it contains input with
the corresponding output.
In a classification algorithm, the input variable (x) is mapped to a discrete output (y).
A well-known example of an ML classification algorithm is the Email Spam Detector.
Classification
The main goal of the Classification algorithm is to identify the category of a given observation,
and these algorithms are mainly used to predict the output for categorical data.
Classification algorithms can be better understood using the diagram below. In the
diagram, there are two classes, Class A and Class B. The members of each class have features that are
similar to each other and dissimilar to those of the other class.
Classification
The algorithm that implements classification on a dataset is known as a classifier.
There are two types of classifiers:
•Binary Classifier: If the classification problem has only two possible outcomes, then it is
called a Binary Classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
•Multi-class Classifier: If a classification problem has more than two outcomes, then it is
called a Multi-class Classifier.
Examples: Classification of types of crops, classification of types of music.
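The distinction between the two kinds of classifier comes down to the number of distinct outcomes in the labels. A small sketch, where the function name `classifier_type` and the sample labels are hypothetical and chosen only to illustrate the definition:

```python
def classifier_type(labels):
    """Return 'binary' for two distinct outcomes, 'multi-class' for more than two."""
    n_classes = len(set(labels))
    if n_classes < 2:
        raise ValueError("a classification problem needs at least two classes")
    return "binary" if n_classes == 2 else "multi-class"

spam_labels = ["SPAM", "NOT SPAM", "SPAM", "NOT SPAM"]
crop_labels = ["wheat", "rice", "maize", "rice"]
print(classifier_type(spam_labels))
print(classifier_type(crop_labels))
```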
Types of Classification Algorithms
Classification algorithms can be broadly divided into two main categories:
•Linear Models
• Logistic Regression
• Support Vector Machines
•Non-linear Models
• K-Nearest Neighbours
• Kernel SVM
• Naïve Bayes
• Decision Tree Classification
• Random Forest Classification
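As one concrete instance from the non-linear group, K-Nearest Neighbours can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance, k = 3, and a simple majority vote; the function name `knn_predict` and the toy points are made up for the example.

```python
from collections import Counter
import math

def knn_predict(X_train, y_train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's distance to the query with its label, nearest first.
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(X_train, y_train)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

X_train = [[1, 1], [1, 2], [2, 1], [6, 6], [7, 6], [6, 7]]
y_train = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(X_train, y_train, [1.5, 1.5]))
print(knn_predict(X_train, y_train, [6.5, 6.5]))
```

No explicit model is fitted here: the training data itself acts as the model, so the decision boundary can take whatever shape the data dictates, unlike the single straight boundary of a linear model.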
Use cases of Classification Algorithms
Classification algorithms can be used in different places. Below are some popular use
cases of Classification Algorithms:
•Email Spam Detection
•Speech Recognition
•Identification of cancer tumor cells
•Drugs Classification
•Biometric Identification, etc.
Evaluating a Classification model:
Once our model is complete, it is necessary to evaluate its performance, whether it is a Classification or
a Regression model. For evaluating a Classification model, we have the following ways:
1. Log Loss or Cross-Entropy Loss:
•It is used for evaluating the performance of a classifier whose output is a probability value between 0
and 1.
•For a good binary Classification model, the value of log loss should be close to 0.
•The value of log loss increases as the predicted probability diverges from the actual label.
•A lower log loss indicates a more accurate model.
•For Binary classification, cross-entropy can be calculated as $-\big(y \log(p) + (1 - y)\log(1 - p)\big)$, where y is the actual label (0 or 1) and p is the predicted probability that the label is 1.
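A minimal sketch of the binary log loss, assuming the standard definition −(y log p + (1 − y) log(1 − p)) averaged over all samples. The function name `binary_log_loss`, the clipping constant `eps`, and the sample probabilities are assumptions made for the example.

```python
import math

def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Average binary cross-entropy; p_pred holds predicted probabilities of class 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)   # clip so that log(0) is never evaluated
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]   # probabilities close to the actual labels
poor = [0.4, 0.6, 0.3, 0.7]        # probabilities leaning the wrong way
print(binary_log_loss(y_true, confident))
print(binary_log_loss(y_true, poor))
```

The first value printed is the smaller of the two, which matches the rule that log loss grows as predictions drift away from the actual labels.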
2. Confusion Matrix:
•The confusion matrix provides a matrix/table as output that describes the performance of
the model.
•It is also known as the error matrix.
•The matrix presents the prediction results in summarized form, giving the total number of
correct and incorrect predictions. For a binary classifier, its four cells count the true positives, false positives, false negatives, and true negatives.
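For a binary problem, the four cells of the confusion matrix can be counted directly. The sketch below assumes that label 1 is the positive class; the function name `confusion_matrix_binary` and the sample labels are hypothetical.

```python
def confusion_matrix_binary(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive:
            # Predicted positive: correct if the actual label is positive too.
            counts["TP" if t == positive else "FP"] += 1
        else:
            # Predicted negative: a miss if the actual label was positive.
            counts["FN" if t == positive else "TN"] += 1
    return counts

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
cm = confusion_matrix_binary(y_true, y_pred)
accuracy = (cm["TP"] + cm["TN"]) / len(y_true)
print(cm, accuracy)
```

Correct predictions are TP + TN and incorrect predictions are FP + FN, so accuracy follows from the matrix, while the matrix additionally shows which kind of mistake the model makes.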
DECISION TREE
•The decision tree algorithm builds the classification model in the form of a tree
structure.
•It utilizes if-then rules that are mutually exclusive and exhaustive in
classification.
•The process goes on by breaking the data down into smaller and smaller subsets while
incrementally building the associated decision tree.
• The final structure looks like a tree with nodes and leaves.
•The rules are learned sequentially, one at a time, from the training data.
• Each time a rule is learned, the tuples covered by the rule are removed. The
process continues on the training set until the termination condition is met.
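The idea of repeatedly breaking the data into smaller subsets until a tree of if-then rules emerges can be sketched as follows. The lecture does not name a splitting criterion, so this example assumes Gini impurity, threshold tests on numeric features, and a depth limit as the termination condition; all function names and the toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of labels (0 means the node is pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (feature index, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        for threshold in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= threshold]
            right = [lab for row, lab in zip(X, y) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:   # keep only splits that reduce impurity
                best, best_score = (f, threshold), score
    return best

def build_tree(X, y, depth=0, max_depth=3):
    """Recursively split the data; a leaf is simply the majority label."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]
    f, threshold = split
    left = [(r, l) for r, l in zip(X, y) if r[f] <= threshold]
    right = [(r, l) for r, l in zip(X, y) if r[f] > threshold]
    return {
        "feature": f,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth),
    }

def predict_one(tree, row):
    """Follow the if-then rules from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Toy data: [hours studied, attended revision class (0/1)] -> exam result.
X = [[1, 0], [2, 0], [3, 1], [6, 1], [7, 0], [8, 1]]
y = ["fail", "fail", "fail", "pass", "pass", "pass"]
tree = build_tree(X, y)
print(tree)
print(predict_one(tree, [2, 1]), predict_one(tree, [9, 0]))
```

Every path from the root to a leaf corresponds to one if-then rule, and because each row follows exactly one path, the rules are mutually exclusive and together cover every possible input.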
DECISION TREE
Advantages and Disadvantages
• A decision tree has the advantage of being simple to
understand and visualize, and it requires very little data
preparation. Its disadvantage is that it can create complex trees that may not
categorize efficiently. Decision trees can also be quite unstable, because
even a small change in the data can alter the whole
structure of the tree.
Use Cases
• Data exploration
• Pattern Recognition
• Option pricing in finance
• Identifying disease and risk threats
