
BENGAL COLLEGE OF ENGINEERING AND TECHNOLOGY

A REPORT ON
Comparative Analysis of K-Nearest Neighbor and
Bayesian Networks as Learning Mechanisms

Submitted by
Himanshu Kumar

University roll no.: 12500221075

Subject: Pattern Recognition

Dept. of Information Technology, 3rd Year

Under the guidance of
Mrs. Aparna
(IT Department)
ACKNOWLEDGMENT

At the very outset of this report, I would like to extend my sincere and heartfelt gratitude to everyone who helped me in this endeavour. Without their active guidance, help, and cooperation, I would not have made headway in this project. I am deeply indebted to my faculty guide, Mrs. Aparna, for her conscientious guidance, encouragement, and valuable support in completing this report in its present form.
TABLE OF CONTENTS

 ABSTRACT

 INTRODUCTION

 METHODOLOGY

 DISCUSSION

 CONCLUSION

 APPLICATION

 BIBLIOGRAPHY
ABSTRACT

This report presents a comprehensive comparison of two prominent machine learning algorithms: K-Nearest Neighbor (KNN) and Bayesian Networks. Both methods are widely used for classification and prediction tasks in various fields. The report highlights the fundamental principles, strengths, weaknesses, and applications of KNN and Bayesian Networks, providing insight into their suitability for different scenarios.

To conclude, the report also highlights the future scope of machine learning algorithms and artificial intelligence, and their roles in automation and holistic development, in technological as well as humanitarian aspects, before closing with conclusions derived from this research.
INTRODUCTION

Machine learning plays a pivotal role in data analysis, pattern recognition, and decision-making processes. KNN and Bayesian Networks represent two distinct approaches to learning from data. KNN is an instance-based learning algorithm that relies on the proximity of data points, while Bayesian Networks are probabilistic graphical models that capture dependencies among variables.

K-Nearest Neighbor (KNN):

Principles: KNN is a simple and intuitive algorithm that classifies data points based on the majority class of their k nearest neighbors. The algorithm's decision is heavily influenced by the choice of distance metric and the value of k.
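As a minimal illustration of this principle (a sketch, not the report's own implementation; the toy data and function name are assumptions), the following Python snippet classifies a new point by majority vote among its k nearest neighbors:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=5):
    """Classify x_new by majority vote among its k nearest neighbors."""
    # Euclidean distance from x_new to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Majority class among those neighbors
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy data (assumed for illustration): two 2-D classes, 'A' and 'B'
X_train = np.array([[1, 1], [1, 2], [2, 1], [6, 6], [7, 7], [6, 7]])
y_train = np.array(['A', 'A', 'A', 'B', 'B', 'B'])

print(knn_predict(X_train, y_train, np.array([2, 2]), k=3))  # -> 'A'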
METHODOLOGY

Working of the KNN algorithm

The ‘K’ in K-NN refers to the count of neighbors of the new data point. Deciding a suitable value for K is the foremost step in this algorithm. For better accuracy, it is imperative to choose an appropriate value of K, a process called parameter tuning. A very low value of K, like 1 or 2, can lead to noisy results, whereas a very high value can create confusion at times, depending on the data set [10].

There is no fixed value for K; however, one of the standard values that K often assumes is 5, i.e., for the majority voting process, the 5 neighbors closest to the new data point are considered. To avoid confusion between two classes of the data set, an odd value of K is generally preferred. Another, formula-based, choice of K is:

K = √n    (1)

where n is the overall count of data points.
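A small sketch of this rule of thumb (assuming the √n heuristic reconstructed in equation (1) above), which also nudges K to an odd value as the text recommends:

import math

def choose_k(n):
    """Rule-of-thumb K = sqrt(n), rounded and forced odd to avoid ties."""
    k = max(1, round(math.sqrt(n)))
    return k if k % 2 == 1 else k + 1

print(choose_k(100))  # -> 11 (sqrt(100) = 10, bumped to the nearest odd value)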

Euclidean distance is calculated as shown in Fig. 3; for two points (x1, y1) and (x2, y2), it is d = √((x2 − x1)² + (y2 − y1)²).

Upon calculating the Euclidean distances of all the points from the new data point, one should observe the category to which the majority of the nearest neighbors belong (say, at K = 5) and assign that class to the data point awaiting classification. As in Fig. 4, it can be concluded that the point goes to class A, since 3 of its nearest neighbors (the majority) belong to that category [10].
Fig. 3. Calculation of Euclidean Distance b/w two points [13].
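A one-line check of the distance formula from Fig. 3 (a sketch with example points assumed for illustration):

import math

def euclidean(p, q):
    """Distance between two 2-D points, as in Fig. 3."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

print(euclidean((1, 2), (4, 6)))  # -> 5.0 (a 3-4-5 triangle)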

2.2. Comparison of logistic regression, Naive Bayes and KNN machine learning algorithms for credit card fraud detection — recent application

2.2.1. Background of the recent work

Credit cards are a widely adopted method of payment these days owing to the relentless advancement of internet technology. At the same time, banking scams are heard of far more commonly than before, and they have indelibly affected many segments of the population, be it individuals or institutions. With every advanced security feature, the
DISCUSSION

Decision tree with a continuous target variable

A decision tree can also have a continuous variable as its target. Example: whether a person can repay a loan or not. If the bank does not have income details, which is a significant variable in this case, a decision tree could be built to predict a person's monthly income on the basis of various factors such as assets, living standard, occupation, etc. Here the values being predicted are continuous in nature.
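A minimal sketch of such a regression tree in Python, assuming scikit-learn is available; the feature values and column meanings are invented purely for illustration:

from sklearn.tree import DecisionTreeRegressor

# Hypothetical training data: [assets_score, living_standard_score, occupation_code]
X = [[3, 2, 1], [8, 7, 2], [5, 5, 2], [9, 8, 3], [2, 1, 1], [7, 6, 3]]
y = [25000, 90000, 55000, 120000, 18000, 80000]  # monthly income (assumed units)

# A shallow tree keeps the example interpretable and limits overfitting
model = DecisionTreeRegressor(max_depth=2, random_state=0)
model.fit(X, y)

print(model.predict([[6, 5, 2]]))  # predicted monthly income for a new applicant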

5.2. Decision tree terminologies

 Root Node: The initial node of the decision tree, from which the entire data set starts getting divided into various possible homogeneous subsets.

 Leaf Node: The final outcome node, beyond which no further segregation of the tree is possible.

 Splitting: The process of dividing a node into sub-nodes according to the given constraints.

 Sub-Tree: A branch produced by splitting part of the hierarchy.

 Pruning: The elimination of superfluous branches of the decision tree in order to get optimal results.

 Parent and Child Node: The node from which sub-nodes are split is called the parent node; the resulting sub-nodes are called child nodes [40].
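To make pruning concrete, here is a small sketch using scikit-learn's cost-complexity pruning (an assumed library choice, not one named by the report); larger ccp_alpha values prune away more branches:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

full = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X, y)

# Pruning removes superfluous branches, so the pruned tree has fewer nodes
print(full.tree_.node_count, pruned.tree_.node_count)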
5.3. Attribute selection measures

An attribute selection measure (ASM) governs the selection of the optimum attribute for the root node as well as for the sub-nodes. The two major measures used for ASM are information gain and the Gini index.

5.3.2. Gini index

The Gini index measures the impurity (or, conversely, the purity) of a node during the creation of a decision tree. When taking a splitting decision, the decision tree algorithm prefers attributes with a small Gini index over attributes possessing a larger one. The Gini index can be calculated using the expression:

Gini index = 1 − Σj (Pj)²    (4)

where Pj is the proportion of samples belonging to class j in the node.
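A tiny sketch of formula (4) in Python (the class labels are illustrative):

from collections import Counter

def gini_index(labels):
    """Gini index = 1 - sum of squared class proportions."""
    n = len(labels)
    return 1 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini_index(['A', 'A', 'A', 'B']))  # -> 0.375, a somewhat impure node
print(gini_index(['A', 'A', 'A', 'A']))  # -> 0.0, a pure node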

Steps for making a decision tree

The root node, say X, which contains the entire data set, is the starting point of the tree.

Using an ASM, look for the best matching attribute in the data set.

Split X into subsets containing the possible values of that best attribute.

Develop the decision tree nodes using the ideal attribute.

Recursively keep developing new decision tree nodes from the resulting subsets until no further splitting is possible; the final nodes are the leaf nodes. A compact sketch of these steps is given below.
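As referenced above, a compact sketch of these steps using scikit-learn (an assumed library choice; the standard iris data set is used purely for illustration):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# criterion='gini' selects attributes by the Gini index, as in Section 5.3.2
tree = DecisionTreeClassifier(criterion='gini', max_depth=3, random_state=0)
tree.fit(X_train, y_train)

print(export_text(tree))           # the learned splits, from root to leaves
print(tree.score(X_test, y_test))  # held-out accuracy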
APPLICATION

 Image and speech recognition.
 Anomaly detection.
 Recommender systems.
1. Bayesian Networks

1.1 Principles: Bayesian Networks model the probabilistic relationships between variables using a directed acyclic graph (DAG). Conditional dependencies and independencies are represented explicitly, making them suitable for reasoning under uncertainty. A small worked sketch of this factorization follows the applications list below.

1.2 Strengths:
 Effective handling of uncertainty and incomplete data.
 Explicit modeling of variable dependencies.
 Facilitates intuitive representation and interpretation.

1.3 Weaknesses:
 Complexity increases with the number of variables.
 Dependency on accurate prior probabilities.

1.4 Applications:
 Medical diagnosis.
 Fraud detection.
 Natural language processing.
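As mentioned under 1.1, here is a small worked sketch of the DAG factorization P(A, B, C) = P(A) · P(B|A) · P(C|B) in plain Python; the three binary variables and all probability values are assumptions chosen only to illustrate the chain-rule decomposition encoded by a network's structure:

# A hypothetical three-node chain: Rain -> Sprinkler -> WetGrass.
# All numbers below are made up for illustration.
p_rain = {True: 0.2, False: 0.8}                      # P(Rain)
p_sprinkler = {True: {True: 0.01, False: 0.99},       # P(Sprinkler | Rain)
               False: {True: 0.40, False: 0.60}}
p_wet = {True: {True: 0.90, False: 0.10},             # P(WetGrass | Sprinkler)
         False: {True: 0.05, False: 0.95}}

def joint(rain, sprinkler, wet):
    """Chain-rule factorization given by the DAG Rain -> Sprinkler -> WetGrass."""
    return p_rain[rain] * p_sprinkler[rain][sprinkler] * p_wet[sprinkler][wet]

# Marginal P(WetGrass = True) by summing the joint over the hidden variables
p_wet_true = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print(round(p_wet_true, 4))  # -> 0.3237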
2. Comparative Analysis

2.1 Performance Metrics:
 Accuracy, precision, recall.
 Robustness to noise and outliers.
 Computational efficiency.
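A minimal sketch of how such metrics can be computed, assuming scikit-learn's metrics module; the ground truth and the two models' predictions are made up for illustration:

from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical labels and predictions from the two classifiers being compared
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_knn  = [1, 0, 1, 0, 0, 1, 1, 0]
y_bn   = [1, 0, 1, 1, 0, 0, 0, 0]

for name, y_pred in [("KNN", y_knn), ("Bayesian Network", y_bn)]:
    print(name,
          "accuracy:", accuracy_score(y_true, y_pred),
          "precision:", precision_score(y_true, y_pred),
          "recall:", recall_score(y_true, y_pred))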
CONCLUSION

In conclusion, both KNN and Bayesian Networks offer unique advantages and challenges. The choice between them depends on the nature of the data, the problem at hand, and computational considerations.

KNN excels in simplicity and adaptability, while Bayesian Networks provide a principled approach to modeling uncertainties and dependencies. Ultimately, the selection should be based on the specific requirements of the task and the characteristics of the dataset.

Future Directions: Further research can explore hybrid approaches that combine the strengths of KNN and Bayesian Networks, leveraging the simplicity of KNN for local decisions and the probabilistic modeling capabilities of Bayesian Networks for capturing global dependencies. Additionally, advancements in handling large-scale datasets and optimization techniques can contribute to the scalability of both algorithms.
BIBLIOGRAPHY

 Class notes
 www.google.com
 https://www.javatpoint.com/machine-learning
 https://images.app.goo.gl/eLBR6gBjRGnSyJ7S9
