
Available online at www.sciencedirect.com

ScienceDirect

Procedia Computer Science 215 (2022) 422–431
www.elsevier.com/locate/procedia

4th International Conference on Innovative Data Communication Technology and Application

A Comparative Analysis of Machine Learning Algorithms for Classification Purpose

Vraj Sheth^a, Urvashi Tripathi^a, Ankit Sharma^a*

^a Electronics and Instrumentation Engineering Department, Institute of Technology, Nirma University, Ahmedabad, Gujarat, India

Abstract

A few of the popular data-mining techniques are clustering, classification, and association. The classification process simplifies the process of identifying and accessing data. Classification of data is crucial for risk management, compliance, and data security. Classifying data facilitates its searchability and traceability by categorising the information. Each data mining model has a distinct level of information. The success of a model is solely determined by the datasets being used, as there is no such thing as an excellent or a poor model. As a part of this study, we examine how accurate different classification algorithms are on diverse datasets. On five different datasets, four classification models are compared: Decision tree, SVM, Naive Bayesian, and K-nearest neighbor. The Naive Bayesian algorithm is proven to be the most effective among the algorithms considered.

© 2023 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of the 4th International Conference on Innovative Data Communication Technologies and Application
DOI: 10.1016/j.procs.2022.12.044

Keywords: Naive Bayes; K-Nearest Neighbour; Decision Tree; Support Vector Machine

* Corresponding author. Tel.: +91-787-432-7003.
E-mail address: [email protected]

1. Introduction

Data is one of the most significant, long-term, performance-defining assets of an organisation. The process of predicting outcomes by analysing the anomalies, patterns, and correlations in huge data sets is called data mining. Classification is the term for a predictive modelling procedure that predicts a class label based on input data. As part of the modelling process, classifiers require a training dataset that contains many inputs and outputs from which to learn. We can compute the best approach for mapping samples of raw data to the predefined class labels using the training data. As a result, the training dataset must be representative of the issue and include a substantial number of sources for each class label.

There is an ensemble of classification approaches for creating a classifier model. Many studies have been conducted to evaluate various algorithms in order to determine which one is the best. The literature study reveals that there is no one solution, as evidenced by the fact that various investigations provide diverse outcomes. One fundamental reason for this might be that the outcome of these algorithms is influenced by a variety of parameters such as dataset size, application, feature selection, and so on.
For the proposed work, we have considered some of the most prominent classification methods, including Naive Bayesian, K-nearest neighbour, SVM, and Decision Trees, over different datasets to obtain a comprehensive understanding of the algorithms' performance and to choose the most suitable one.
Section 2 outlines previous work, and Section 3 describes the various classification algorithms. Section 4 covers the datasets adopted for the study, and Section 5 discusses the evaluation metrics and the assessment of the prediction models. The results are summarised in Section 6.

2. Related Work

In this domain, a great deal of research has been done using various methodologies. This section seeks to provide
a brief analysis of the current classification model research initiatives. The section presents a survey of the literature
in the domain of classifier implementation. The first input in this area comes from Matthew Anyanwu (2009) [1], who applied a comparative analysis of three decision tree algorithms, namely C4.5, CART, and ID3, to internal assessment data to predict students' performance in an examination. The performance is analysed based on accuracy
and time taken to derive the tree. C4.5 is proven to be the best of all for small-scale datasets. SPRINT and SLIQ
decision tree algorithms are suitable for larger datasets. M. J. Muzammil (2013) [2] proposed a novel approach to
compare different classifiers adapted for a statistical IDS, whose performance is evaluated in WEKA. The classification algorithms employed were Naïve Bayesian, C4.5 Decision Tree, Decision Table, ZeroR, and OneR. To evaluate the models' performance, several performance metrics were used, including True Positive, True Negative,
False Positive, False Negative, Model Building Time, and Margin Curve. Anuradha (2015) [3] proposed a novel
approach to implementing classification techniques, namely the C4.5 (J48) decision tree technique, Bayesian classifiers, the KNN algorithm, and two rule learner algorithms, JRip and OneR, to predict and analyse students’
performance in examinations using data mining. The dataset is retrieved from the college database and a structured
questionnaire. Bayesian classifiers such as Naïve Bayes and BayesNet are proven to perform the best, with accuracy greater than 70%, followed by the JRip and J48 classifiers. JRip results in the highest accuracy for the Distinction class. R. Muhamedyev (2015) [4] implemented learning-curve experiments to evaluate the learning rate during machine learning training using a Feedforward Artificial Neural Network (ANN), Naïve Bayes, and
k-Nearest-Neighbors (k-NN). For error detection, accuracy, weighted mean precision and weighted mean recall were
used. The information comes from 30 boreholes in the Inkai uranium deposit. All other algorithms are outperformed
by the ANN algorithm. Amit Tate (2016) [5] presented and compared different classification models for disease prediction, namely Naive Bayes (NB), Support Vector Machine (SVM), and Random Forest (RF). The dataset is classified using Weka. The performances are compared by estimating accuracy, training time, precision, and recall. The results are
comparable but the random forest algorithm outperforms the other models. Rafet Duriqi (2016) [6] proposed a novel
approach to evaluate different classification algorithms such as Random Forest, Naive Bayes, and K* on three
different datasets using the WEKA tool. The data is obtained from the UCI Machine Learning repository. The results
suggest that the dataset, particularly the quantity of attributes in the dataset, has an impact on a classifier's performance.
Wesley Becari (2016) [7] compares the categorisation methods used by the iCub platform's humanoid hand tactile
sensors in considerable detail. Support Vector Machines (SVM), k-Nearest Neighbors Classifiers (kNN), and Decision
Trees were used as classifiers. The classification with two fingers performed well. With an accuracy of 97.4%, the
Gaussian SVM kernel is proven to be the best as it resulted in the highest percentage of correct answers. P. Srikanth
(2016) [8] proposed a novel approach to predict diabetes disease using data mining classification techniques, namely the Decision Tree, Bayes, and Rule-based algorithms. The applied classification methods result in high accuracy. Archanaa R (2017) [9] presented a detailed
comparative analysis of distinct machine learning algorithms including Bayes classifiers, Rule learning, Decision
trees, and Ensemble classifiers. The dataset is acquired from the University of Queen Mary repository. The Ensemble
classifiers show the best results with a classification accuracy of over 99%. The Decorate algorithm belonging to the
ensemble classifiers performs the best. Saeed M. Alqahtani (2017) [10] proposed a comparative analysis of four kinds of classification techniques, namely Decision tree
(DT), OneR, Naive Bayes (NB), and K-nearest neighbor (KNN). The ISCX dataset is employed and Accuracy,
Sensitivity, Precision, F-measure and Specificity were estimated to evaluate the best classifier algorithm. All other
algorithms are outperformed by the decision tree classifier. Preeti Nair (2017) [11] put forward a novel approach for a detailed analysis of various classification algorithms such as Decision tree (DT), OneR, Naive Bayes (NB), and K-
nearest neighbor (KNN) using binary classification problems. The datasets are derived from the UCI data repository.
The model is evaluated with a confusion matrix and Accuracy, Precision, Recall, and Specificity are calculated to
assess each model's performance. The Naive Bayesian classification model is observed to achieve the highest values on the greatest number of performance metrics. S. Sharma (2018) [12] put forward a detailed analysis of different
multi-label classification models namely BR, CC, PS, LS, and Random Forest using the MEKA tool. The dataset
(multi-label) employed for the study is retrieved from the engineering students’ database of a private university.
Random forest outperforms all the models with an accuracy of 96%. Muhammad Alghobiri (2018) [13] implemented
a comprehensive approach to compare and analyse several classification algorithms including Decision tree (DT),
Naive Bayes (NB) and Support vector machines (SVM). Ten different datasets have been considered for the
assessment. The evaluation metrics employed are Accuracy, Precision, and F-Measure. SVM is proven to be the best.
Siddhi Velankar (2018) [14] proposed a comparative analysis of Bayesian regression (BR), Generalised Linear Model
(GLM), and Random Forest (RF) along with a combination of five mean normalisation techniques to find out the best
possible combination for bitcoin price forecasting. The database is collected from Quandl and CoinmarketCap to
retrieve bitcoin values for a five-year frame. Karunya Rathan (2019) [15] proposed a comparative analysis of LR and
Decision Trees (DT) for bitcoin price forecasting. Data for the proposed approach was downloaded from quandl.com.
LR outperformed Decision Trees by having an accuracy of 97.59%. A. Demir (2019) [16] presented various machine
learning algorithms, including artificial neural networks (ANN), long-short term memory (LSTM), decision trees,
Naive Bayes (NB), the nearest neighbour algorithm, and support vector machines (SVM). The dataset was retrieved
from KAGGLE. The implementation resulted in LSTM outperforming all the other algorithms with an accuracy of
97.2%. Dr. Pasumpon Pandian (2019) [17] presented a review of big-data analytics and machine learning methods for analysing high-volume data to extract valuable information. Dr. T. Vijaya Kumar (2019) [18] proposed a CapsNet-based classification system that can be trained using a smaller number of datasets to detect the type of cancerous tumours in the brain. Reaz Chowdhury (2020) [19] proposed a unique approach for
projecting the closing price of cryptocurrency. Gradient boosted trees (GBT), k-Nearest Neighbor (K-NN), Neural net (NN), an ensemble learning approach, and other machine learning algorithms have all been subjected to a thorough comparison. Coinmarketcap.com provided the data for this analysis. The three approaches outperformed the state-of-
the-art models, with an accuracy of 92.4 percent and an RMSE of 0.2 percent for the ensemble learning method. N.
N. Qomariyah (2020) [20] used the WEKA data mining tool to implement a pairwise comparison. This is used to
build a decision tree for learning user preferences. The performance of the DT models J48, ID3, RandomForest, and RandomTree is evaluated using 10-fold cross-validation and a hold-out technique. J48 performed best when the
data was split into 65 percent training and 35 percent test sets. I. S. Balabanova (2020) [21] presented a comparative analysis of indicators in the development of machine-learning models for detecting RMS noise levels of tones with different frequencies. Decision trees, k-NN, and Naïve Bayes are implemented, with k-NN achieving accuracy in the range of 89.800% to 91.050%. Decision tree and k-NN perform well with great
efficiency. Mayukh Samaddar (2021) [22] carried out a well-framed comparative analysis of many machine learning
algorithms with neural network algorithms taken as convolutional neural network (CNN), artificial neural network
(ANN) and recurrent neural network (RNN) and supervised learning algorithms like Random Forest (RF) and k-
nearest neighbors (k-NN). The dataset is sourced from Kaggle and scaled afterward using MinMaxScaler. CNN
outperforms all the other models with higher accuracy and the least loss. RNN also performs well. The non-deep-learning algorithms are observed to be less accurate.

3. Classification Algorithms

This section offers a concise explanation of all of the classification algorithms used in the proposed work.

3.1. Naive Bayesian

Bayes' theorem is used to generate naive Bayes classifiers, which are a family of classification methods. These algorithms all share the same principle: every pair of input features is assumed to be independent, conditional on the class. NB (Naive Bayes) uses the Bayes rule as follows:

P(y \mid X) = \frac{P(X \mid y) \, P(y)}{P(X)} \qquad (1)
where y is a class variable and X is an n-dimensional dependent feature vector:

X = (x_1, x_2, x_3, \ldots, x_n)

The class variable y has only two outcomes in our case: yes or no. The classification may be multi-class in some circumstances. As a result, we must find the class y with the highest probability:
y = \arg\max_{y} \, p(y) \prod_{i=1}^{n} p(x_i \mid y) \qquad (2)
The precision of the projected output values is used to define the procedure's error. If the goal values are categorical,
the error is presented as an error rate. The error rate is the proportion of times the prediction was wrong. The Bayes
error rate is the lowest error rate that any classifier of a random outcome can provide.
Naive Bayes is easy to set up, produces good results, scales proportionally with the number of predictors and data
points, requires less training data, manages discrete and continuous data, can tackle binary and multi-class
classification problems, and makes stochastic recommendations. Data can be processed in a continuous or
discontinuous manner. It is unaffected by non-essential features. Naive Bayes assumes conditional independence, meaning that all input features are treated as independent of one another given the class.
The following are the drawbacks of naive Bayes: Naive Bayes models are often too simplistic, and models that have been properly trained and optimised often outperform them. If one of the features is a continuous variable (such as time), it is complicated to implement Naive Bayes effectively; even if "buckets" for continuous variables can be created, they are not fully accurate. Because there is no genuine online option for Naive Bayes, all data must be saved in order to retrain the model. When the number of attributes is very high, such as more than 100K, it does not scale. In comparison to SVM or simple logistic regression, it requires more runtime memory for prediction, and it is time-consuming to compute, especially for models with many variables.
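As a minimal sketch of how such a classifier is used in practice (assuming scikit-learn and synthetic placeholder data, not the datasets studied in this paper), a Gaussian Naive Bayes model can be fitted and queried as follows:

# Minimal Gaussian Naive Bayes sketch, assuming scikit-learn.
# X and y are synthetic placeholders, not the paper's datasets.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))               # 200 samples, 4 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # binary class label (yes/no)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB()                # models p(x_i | y) with a Gaussian per feature
nb.fit(X_train, y_train)         # estimates class priors P(y) and likelihoods p(x_i | y)

print(nb.predict_proba(X_test[:3]))   # posteriors via Bayes' rule, Eq. (1)
print(nb.predict(X_test[:3]))         # class with the highest posterior, Eq. (2)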

3.2. Decision Tree

A Decision Tree is a supervised learning technique that can be used to perform classification and regression tasks, though it is most typically employed for classification.
A decision tree has a root node, branch nodes, and leaf nodes, with each node representing a characteristic or attribute, each branch representing a decision or rule, and each leaf representing a result. Decision tree algorithms are used to split the features; at each node, the candidate split is tested to see whether it is the most suited for the respective classes. A decision tree is a graphical layout that allows you to obtain all of the various answers for a decision based on the current situation. Each node focuses on one question, and the tree is split into subtrees based on the answer.
The following are some of the benefits of using a Decision Tree: it is effective for both regression and classification problems, it is easy to interpret, it can fill incomplete attribute values with the most likely value, and it handles both categorical and quantitative values. It also has superior productivity due to the efficiency of the tree traversal algorithm. Over-fitting is a problem that a Decision Tree may experience, and the answer is Random Forest, which is based on an ensemble modelling technique.
The following are the downsides of using a Decision Tree: it is unstable, the tree size is difficult to manage, it is prone to sampling errors, and it provides a locally optimal answer rather than a globally ideal solution.
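To make the over-fitting remedy mentioned above concrete, the following sketch (assuming scikit-learn and synthetic data, not the paper's datasets) contrasts a depth-limited decision tree with a Random Forest ensemble:

# Sketch: a single depth-limited decision tree vs. a Random Forest ensemble.
# Assumes scikit-learn; the dataset is synthetic, not one used in this study.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=5, random_state=0)   # depth cap curbs over-fitting
forest = RandomForestClassifier(n_estimators=100, random_state=0)

for model in (tree, forest):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))  # held-out accuracy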

3.3. K-Nearest Neighbour

K-nearest neighbours (KNN) are supervised machine learning algorithms that can be utilised to solve both
classification and regression problems. With the K-NN model, fresh data can be quickly sorted into well-defined
categories. To estimate the values of any new data points, the KNN algorithm makes use of "feature similarity." It
evaluates the distances between a query and each example in the data, picks the K examples that are closest to the
query, and then selects the label with the highest frequency (in the case of classification) or averages the labels (in the
case of regression).

KNN compares a given test tuple with comparable training tuples in the process of learning. An n-dimensional pattern space is used to hold all of the training tuples. When given an unidentified tuple, a k-nearest-neighbour classifier searches the pattern space for the k training tuples that are nearest to it. These k training tuples are the unknown tuple's k "nearest neighbours" [2].
Advantages of the KNN algorithm are the following: It is a simple technique that may be implemented quickly, and it is inexpensive to construct the model. It is a very adaptable categorisation technique that is ideal for multi-modal classes and for records with several class labels. Its error rate is bounded by twice the Bayes error rate. It is sometimes the most effective method; when it came to predicting protein function based on expression profiles, KNN outperformed SVM.
Disadvantages of KNN are the following: It is relatively costly to classify unknown records, as it requires calculating the distance from each unknown record to the training tuples to find the k nearest neighbours. The algorithm becomes more computationally costly as the size of the training set grows. Accuracy will degrade as a result of noisy or irrelevant features.
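The distance-then-vote procedure described above is short enough to write out directly; the following is a bare-bones illustrative sketch (plain NumPy, Euclidean distance, majority vote), not the implementation used in this study:

# Bare-bones k-NN classifier: Euclidean distance plus majority vote.
# Illustrative only; not the implementation used for the experiments here.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    dists = np.linalg.norm(X_train - query, axis=1)  # distance to every training tuple
    nearest = np.argsort(dists)[:k]                  # indices of the k nearest neighbours
    return Counter(y_train[nearest]).most_common(1)[0][0]  # most frequent label

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [8.0, 8.0], [7.9, 8.2]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9])))  # -> 0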

3.4. Support Vector Machine

In supervised learning, Support Vector Machines (SVMs) are widely used for dealing with classification and regression problems. The purpose of SVM is to find the optimal line or decision boundary that divides the n-dimensional feature space into sections so that subsequent data points may be classified conveniently. These boundaries are known as hyperplanes. SVM can handle unstructured, semi-structured, and structured data, and kernel functions ease the complexity of handling different data types.
This algorithm is divided into two categories: linear data and non-linear data. Mathematical programming and
kernel functions are the two main implementations of SVM technology. In a high-dimensional space, the hyperplane
divides data points of distinct kinds [4].
SVM has a number of limitations, including the following: Because of the longer training time, it performs poorly when working with large data sets. The correct kernel function can be difficult to find. When a dataset is noisy, SVM does not perform well. Probability estimates are not provided directly by SVM. It is difficult to interpret the final SVM model.
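As a sketch of how the kernel choice plays out (assuming scikit-learn; the concentric-circles data is synthetic and chosen because it is not linearly separable), the snippet below fits SVMs with a linear and an RBF kernel:

# Sketch of SVM with linear vs. RBF kernels, assuming scikit-learn.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Concentric circles are not linearly separable, so the kernel choice matters.
X, y = make_circles(n_samples=400, factor=0.4, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel)      # hyperplane is found in the kernel's feature space
    clf.fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))   # the RBF kernel should clearly win here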

4. Datasets

The different datasets used for the classification and testing of the algorithms are split into test and training sets, with 70% as test and 30% as training data. The machine learning model is fitted using the training dataset, and the test dataset is used to assess how well the model fits unseen data.
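As a sketch of this split (assuming scikit-learn; X and y are placeholders standing in for any of the five datasets, and test_size=0.7 mirrors the split as stated above, although a 70% training / 30% test split is the more common convention):

# Sketch of the train/test split described above, assuming scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(215, 14)           # placeholder feature matrix
y = np.random.randint(0, 2, 215)      # placeholder binary labels

# test_size=0.7 follows the 70% test / 30% training split stated in the text;
# use test_size=0.3 for the more common 70% training convention.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.7, random_state=42)
print(X_train.shape, X_test.shape)    # (64, 14) (151, 14)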

1. Placement Dataset
The dataset consists of information including gender, SSC percentage, board of education, HSC percentage, specialisation, degree info, work experience, employability test percentage, and salary. This dataset has 15 attributes and 215 entries. "salary" and "ssc_p" are two relevant features for predicting a student's placement status, and "workex" and "specialisation" are two further important features for predicting status.
2. Heart Disease Dataset
The dataset consists of information including age, sex, chest pain, restBP, chol, MaxHR, Exang, Oldpeak, Slope, FBS, RestECG, Ca, and thal. This dataset has 15 attributes and 303 entries and is used to classify whether a person has a certain heart disease or not. It uses attributes such as chest pain and thal, which are later differentiated by sex and thal value, which helps in prediction.
3. Wine Quality Dataset
The dataset contains several numerical values that represent information about various wines, such as fixed acidity, residual sugar, pH, sulphates, alcohol, chlorides, free sulphur dioxide, volatile acidity, total sulphur dioxide, citric acid, and density. This dataset is also quite large, with over 1000 rows and no null entries. The numerical values in each variable differ substantially, possibly due to the units, so if a model is sensitive to these differences, the dataset may need to be standardised. Wines low in both citric acid and alcohol are of poor quality, whereas good quality wine is generally high in both.

4. Glass Quality Dataset
The dataset consists of the refractive index and measures of various elements in the glass: ID number, RI (refractive index), Na (sodium), Mg (magnesium), Al (aluminium), Si (silicon), K (potassium), Ca (calcium), Ba (barium), and Fe (iron). This dataset has 10 attributes and 214 entries. Using the values of the elements present in each glass type, glasses with different element compositions can be predicted.
5. Classification of Jobs Dataset
The dataset mainly has information about a job profile, including ID, JobFamily, JobFamilyDescription, JobClass, JobClassDescription, PayGrade, EducationLevel, Experience, OrgImpact, problem-solving, supervision, contact level, financial budget, and PG. This dataset has 14 attributes and 66 entries.

5. Analysing the Classifiers

Quantitative metrics and qualitative metrics are the two fundamental approaches used for data analysis. In quantitative metrics, numerical data constitute the foundation of the information: ratios, percentages, averages, currency values, and other straightforward expressions of quantitative measures are frequently used. In qualitative metrics, by contrast, non-numerical facts constitute the foundation of the information. Inaccuracy in a dataset is referred to as noise, and noisy data can greatly impact the prediction of any important information. A train-test split and cross-validation were used in the proposed work.
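As a sketch of these two evaluation strategies (assuming scikit-learn; the classifier and data are placeholders rather than the exact experimental setup):

# Sketch: hold-out (train-test split) vs. k-fold cross-validation, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hold-out evaluation: fit on one split, score on the other.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
print(GaussianNB().fit(X_train, y_train).score(X_test, y_test))

# 5-fold cross-validation: average score over five train/test partitions.
print(cross_val_score(GaussianNB(), X, y, cv=5).mean())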
For the proposed work, we chose five datasets at random from Kaggle and ran each dataset through each classifier, comparing the results to see which classifier provided the most accurate value in the majority of cases. For each dataset, a confusion matrix is produced, and we calculated and reported each performance measure in the tables using the confusion matrix generated by each classifier. The flowchart for the ML classifier is shown in Fig. 1.

Fig. 1: Flowchart for the Classifier
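The metrics reported in the tables below can be derived from each classifier's confusion matrix; the following sketch (assuming scikit-learn, with placeholder label vectors) shows the computation:

# Sketch: deriving accuracy, precision, recall, and F1 from predictions,
# assuming scikit-learn; y_true and y_pred are placeholder label vectors.
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

print(confusion_matrix(y_true, y_pred))        # rows: actual; columns: predicted
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1 score :", f1_score(y_true, y_pred))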


Table 1, Table 2, Table 3, and Table 4 depict the classification accuracy, precision, recall, and F1 score, respectively, for each algorithm used on the datasets. The values for each metric vary because the datasets used are random and have different classes and categories, which generates variation in the output of the algorithms.
Fig. 2, Fig. 3, Fig. 4, and Fig. 5 depict the classification accuracy, precision, recall, and F1 score, respectively, for each algorithm used on the datasets. The proportion of true positive examples among those that the model classified as positive is known as precision. Recall, commonly referred to as sensitivity, is the proportion of correctly identified positive examples among all actual positive examples. The F1-score integrates the model's recall and precision as the harmonic mean of the two. All of the classifiers perform admirably, and the accuracy scores are high, with relatively few variations between the results obtained. However, when we compared these figures to determine which model was the best, we concluded that the Naive Bayes and SVM models produced the best results, with the highest accuracy values in two datasets each. Even though k-NN's accuracy values are close to the highest values, it does not give the maximum value for any dataset, while Decision Tree delivers the highest value in one case. As a result of this observation, we can infer that Naive Bayes and SVM outperform Decision Tree and k-NN.

Fig. 2: Accuracy (in percentage) of the algorithms applied on each dataset

The results show that the k-NN algorithm has the highest precision values in percentage form, with values that are extremely close to those of Naive Bayes. SVM and Decision Tree perform similarly; however, neither produces the highest precision numbers in any circumstance.

Table 1: Accuracy (in percentage) of the algorithms applied on each dataset

Table 2: Precision (in percentage) of the algorithms applied on each dataset

The results show that the Naive Bayes method performs the best, with the highest recall percentage values, followed by the SVM algorithm, which delivers the highest value for one dataset. With the lowest values for recall percentage, the decision tree performs poorly.

Table 3: Recall (in percentage) of the algorithms applied on each dataset

Fig. 3: Precision (in percentage) of the algorithms applied on each dataset

Table 4: F1 Score (in percentage) of the algorithms applied on each dataset

We discovered that Naive Bayes and Decision Tree did the best in terms of F1 score, with SVM coming in second,
while k-NN performed the worst, with the lowest F1 score value.

Fig. 4: Recall (in percentage) of the algorithms applied on each dataset

Fig. 5: F1 Score (in percentage) of the algorithms applied on each dataset

6. Conclusion

In this paper, the prediction of classes is handled by classification algorithms. There are different classification models available, which are based on a variety of logics and methodologies. We compiled several datasets and compared the accuracy, recall, precision, and F1 score of the four most commonly used classifiers, namely decision trees, k-NN,
SVM, and Naive Bayes. Each model performs differently depending on the size and characteristics of the data sets.
There were only a few minor differences in performance measurements between the algorithms. A table including
each performance measure against each dataset and each algorithm was created to determine which algorithm was the
most effective overall. To gain a better understanding of the scores, we created a graphical representation of it in
percentage form. After analysing the data, we ascertained that the Naive Bayesian classification model surpasses the
others in terms of accuracy, recall, precision, and F1 score. The second-best classifier is SVM, followed by K-Nearest Neighbor and Decision Tree.

The primary goal of this research has been to choose the best classifier from among the most popular techniques. However, other models may be considered in future work for comparison and selection. Various noise reduction strategies, besides the one mentioned, could be applied to enhance the results of this study. Another aspect that would be intriguing to employ in any further research is the usage of other measures to compare the performance of the algorithms.

References

[1] Anyanwu, Matthew & Shiva, S. (2009). Comparative Analysis of Serial Decision Tree Classification Algorithms. International Journal of
Computer Science and Security. 3(3).
[2] M. J. Muzammil, S. Qazi and T. Ali, "Comparative analysis of classification algorithms performance for a statistical-based intrusion detection
system," 2013 3rd IEEE International Conference on Computer, Control and Communication (IC4), 2013, pp. 1-6, DOI:
10.1109/IC4.2013.6653738.
[3] Anuradha, C. & Velmurugan, T. (2015). A Comparative Analysis on the Evaluation of Classification Algorithms in the Prediction of Students Performance. Indian Journal of Science and Technology, 8(15), ISSN 0974-6846. DOI: 10.17485/ijst/2015/v8i15/74555.
[4] R. Muhamedyev, K. Yakunin, S. Iskakov, S. Sainova, A. Abdilmanova and Y. Kuchin, "Comparative analysis of classification algorithms,"
2015 9th International Conference on Application of Information and Communication Technologies (AICT), 2015, pp. 96-101, DOI:
10.1109/ICAICT.2015.7338525.
[5] Amit Tate, Bajrangsingh Rajpurohit, Jayanand Pawar, Ujwala Gavhane, Gopal B. Deshmukh, "Comparative Analysis of Classification Algorithms Used for Disease Prediction in Data Mining", International Journal of Engineering and Techniques (IJET), Vol. 2, Issue 6 (Nov-Dec 2016), ISSN: 2395-1303, www.ijetjournal.org
[6] R. Duriqi, V. Raca and B. Cico, "Comparative analysis of classification algorithms on three different datasets using WEKA," 2016 5th
Mediterranean Conference on Embedded Computing (MECO), 2016, pp. 335-338, DOI: 10.1109/MECO.2016.7525775.
[7] W. Becari, L. Ruiz, B. G. P. Evaristo and F. J. Ramirez-Fernandez, "Comparative analysis of classification algorithms on tactile sensors," 2016
IEEE International Symposium on Consumer Electronics (ISCE), 2016, pp. 1-2, DOI: 10.1109/ISCE.2016.7797324.
[8] P. Srikanth and D. Deverapalli, "A Critical Study of Classification Algorithms Using Diabetes Diagnosis," 2016 IEEE 6th International
Conference on Advanced Computing (IACC), 2016, pp. 245- 249, DOI: 10.1109/IACC.2016.54.
[9] R. Archanaa, V. Athulya, T. Rajasundari and M. V. K. Kiran, "A comparative performance analysis on network traffic classification using
supervised learning algorithms," 2017 4th International Conference on Advanced Computing and Communication Systems (ICACCS), 2017,
pp. 1-5, DOI: 10.1109/ICACCS.2017.8014634.
[10] S. M. Alqahtani and R. John, "A comparative analysis of different classification techniques for cloud intrusion detection systems' alerts and
fuzzy classifiers," 2017 Computing Conference, 2017, pp. 406-415, DOI: 10.1109/SAI.2017.8252132.
[11] P. Nair, Journal of Basic and Applied Engineering Research, p-ISSN: 2350-0077, e-ISSN: 2350-0255, Volume 4, Issue 3, April-June 2017, pp. 180-184. Krishi Sanskriti Publications, http://www.krishisanskriti.org/Publication.html
[12] S. Sharma and D. Mehrotra, "Comparative Analysis of Multi-label Classification Algorithms," 2018 First International Conference on Secure
Cyber Computing and Communication (ICSCCC), 2018, pp. 35-38, DOI: 10.1109/ICSCCC.2018.8703285.
[13] Alghobiri, M. (2018). A Comparative Analysis of Classification Algorithms on Diverse Datasets. Engineering, Technology & Applied Science
Research. 8. 2790-2795. 10.48084/etasr.1952.
[14] S. Velankar, S. Valecha, and S. Maji, "Bitcoin price prediction using machine learning," 2018 20th International Conference on Advanced
Communication Technology (ICACT), 2018, pp. 144-147, DOI: 10.23919/ICACT.2018.8323676.
[15] K. Rathan, S. V. Sai and T. S. Manikanta, "Crypto-Currency price prediction using Decision Tree and Regression techniques," 2019 3rd
International Conference on Trends in Electronics and Informatics (ICOEI), 2019, pp. 190-194, DOI: 10.1109/ICOEI.2019.8862585.
[16] A. Demir, B. N. Akılotu, Z. Kadiroğlu and A. Şengür, "Bitcoin Price Prediction Using Machine Learning Methods," 2019 1st International
Informatics and Software Engineering Conference (UBMYK), 2019, pp. 1-4, DOI: 10.1109/UBMYK48245.2019.8965445.
[17] Pandian, P. (2019). "Review of Machine Learning Techniques for Voluminous Information Management". Journal of Soft Computing Paradigm, 2019, 103-112. DOI: 10.36548/jscp.2019.2.005.
[18] Kumar, T. V. (2019). "Classification of Brain Cancer Type Using Machine Learning". Journal of Artificial Intelligence and Capsule Networks, 2019. DOI: 10.36548/jaicn.2019.2.006.
[19] Reaz Chowdhury, M. Arifur Rahman, M. Sohel Rahman, M.R.C. Mahdy, An approach to predict and forecast the price of constituents and index of cryptocurrency using machine learning, Physica A: Statistical Mechanics and its Applications, Volume 551, 2020, 124569, ISSN 0378-4371, https://doi.org/10.1016/j.physa.2020.124569.
[20] N. N. Qomariyah, E. Heriyanni, A. N. Fajar and D. Kazakov, "Comparative Analysis of Decision Tree Algorithm for Learning Ordinal Data
Expressed as Pairwise Comparisons," 2020 8th International Conference on Information and Communication Technology (ICoICT), 2020, pp.
1- 4, DOI: 10.1109/ICoICT49345.2020.9166341.
[21] I. S. Balabanova, V. I. Markova, S. S. Kostadinova and G. I. Georgiev, "Comparative Analysis between Machine Learning Methods in Tones
Classification," 2020 28th National Conference with International Participation (TELECOM), 2020, pp. 45-48, DOI:
10.1109/TELECOM50385.2020.9299535.
[22] M. Samaddar, R. Roy, S. De and R. Karmakar, "A Comparative Study of Different Machine Learning Algorithms on Bitcoin Value Prediction,"
2021 International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT), 2021, pp. 1-
7, DOI: 10.1109/ICAECT49130.2021.9392629.
