IEEE Paper Format

This paper discusses an enhanced credit card fraud detection system utilizing four machine learning models: Logistic Regression, Decision Tree, Gradient Boosting, and XGBoost, optimized through Particle Swarm Optimization (PSO) and addressing class imbalance with SMOTE. The models were evaluated on the Kaggle Credit Card Fraud Dataset, with XGBoost achieving the highest accuracy of 99.98%. The study emphasizes the effectiveness of machine learning and optimization techniques in improving fraud detection capabilities.

Uploaded by

Kishore

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2 views

IEEE Paper Format

Uploaded by

Kishore

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 4

Using SMOTE and PSO-Optimized Machine Learning Models

Authors Name/s per 1st Affiliation (Author) Authors Name/s per 2nd Affiliation (Author)
line 1 (of Affiliation): dept. name of organization line 1 (of Affiliation): dept. name of organization
line 2-name of organization, acronyms acceptable line 2-name of organization, acronyms acceptable
line 3-City, Country line 3-City, Country
line 4-e-mail address if desired line 4-e-mail address if desired

Abstract— Credit card fraud detection is a critical emotional distress. Financial institutions and merchants,
issue in financial security. This paper presents an on the other hand, face significant costs associated with
improved fraud detection system utilizing four machine fraud detection, prevention, and resolution.
learning models: Logistic Regression, Decision Tree,
Gradient Boosting, and XGBoost (PSO-Optimized). To Traditional credit card fraud detection methods rely on
address class imbalance, SMOTE (Synthetic Minority rule-based systems that use predefined conditions to flag
Over-sampling Technique) is applied, and Particle suspicious transactions. These approaches often require
Swarm Optimization (PSO) is used for hyperparameter manual reviews and struggle to detect sophisticated and
tuning. The models are evaluated on the Kaggle Credit evolving fraud patterns.
Card Fraud Dataset, with XGBoost (PSO-Optimized)
achieving the highest accuracy of 99.98%. Our study In recent years, machine learning algorithms have
highlights the effectiveness of machine learning and emerged as a promising solution for fraud detection.
optimization techniques in real-world fraud detection These algorithms can analyze large datasets, recognize
applications. complex patterns, and improve prediction accuracy.
Machine learning models have demonstrated significant
Keywords— Credit Card Fraud Detection, Machine potential in identifying fraudulent transactions, with
Learning, XGBoost, Gradient Boosting, Decision Tree, some studies reporting accuracy rates exceeding 90%.[1]
Logistic Regression, SMOTE, PSO Optimization , [2]

I. INTRODUCTION (HEADING 1) This paper proposes a machine learning-based approach

The rapid growth of digital payments and online for credit card fraud detection, evaluating the
transactions has transformed the way financial performance of four algorithms—Logistic Regression,
transactions are conducted. Credit cards, in particular, Decision Tree, Gradient Boosting, and XGBoost.
have become a widely used payment method due to their Additionally, techniques such as Synthetic Minority
convenience, security, and rewards. However, this Over-sampling Technique (SMOTE) and Particle Swarm
increased reliance on credit cards has also led to a Optimization (PSO) have been incorporated to address
significant rise in credit card fraud. class imbalance and optimize model performance. This
study aims to contribute to the existing literature by
Credit card fraud refers to the unauthorized use of a providing a comparative analysis of these algorithms,
credit card or its details to obtain goods, services, or highlighting their effectiveness in detecting fraudulent
cash. This fraud can occur through various methods, transactions.[3] , [4]
including card skimming, phishing, identity theft, and II. RELATED WORKS
online fraud. According to a report by the Nilson Report,
credit card fraud led to losses exceeding $28 billion in A. Logistic Regression:
2020 alone.
A 2023 study applied a Logistic Regression-based approach
The consequences of credit card fraud impact multiple for credit card fraud detection, achieving an accuracy of
stakeholders, including individual cardholders, financial 94.5% on a dataset of 200,000 transactions. While Logistic
institutions, and merchants. Cardholders may suffer Regression is a simple and interpretable model, its
financial losses, damage to their credit scores, and effectiveness in handling imbalanced datasets is often limited
without additional techniques such as oversampling or cost- H. Handling Imbalanced Data with SMOTE:
sensitive learning.[1]
One of the major challenges in credit card fraud detection is
the highly imbalanced nature of transaction data. Fraudulent
B. Decision Tree:
transactions typically constitute a very small percentage of the
total transactions, often less than 1%. This imbalance causes
A 2024 study explored Decision Tree-based models for fraud machine learning models to favor the majority class
detection, achieving an accuracy of 98.5% on a dataset of (legitimate transactions), leading to a high false negative rate
180,000 transactions. Decision Trees provide high where fraudulent transactions go undetected.[8]
interpretability and fast decision-making but are prone to
overfitting when applied to complex fraud patterns.[2]
III. METHODOLOGY
C. Gradient Boosting:
A. Decision Tree Classifier
Researchers in 2024 proposed a Gradient Boosting-based
method for credit card fraud detection, reporting an accuracy Decision Tree is a rule-based learning method that
of 99.2% on a dataset of 220,000 transactions. Gradient classifies data by recursively splitting it.
Boosting is known for its strong predictive performance by
iteratively correcting errors in weak learners.[3] It constructs a tree-like structure where each internal node
represents a feature.
D. XGBoost: The dataset is divided into subsets at each node based on
the most significant feature.
A 2025 study employed XGBoost, a powerful gradient Decision Trees are intuitive and effective but may suffer
boosting framework, for credit card fraud detection, achieving from overfitting.
an accuracy of 99.5% on a dataset of 250,000 transactions.
XGBoost is widely recognized for its efficiency, scalability, B..XGBoost Classifier
and ability to handle imbalanced datasets effectively.[4]

E. Ensemble Methods: XGBoost is an advanced ensemble learning method that

builds a series of decision trees sequentially.

A 2023 research paper proposed an ensemble method Each tree learns from the errors of the previous one,
combining Decision Tree, Gradient Boosting, and XGBoost, optimizing predictions using gradient descent.
achieving an accuracy of 99.3% on a dataset of 200,000 XGBoost incorporates L1 and L2 regularization to prevent
transactions. The study demonstrated that ensemble models overfitting.
outperform individual classifiers in detecting fraudulent
It’s highly efficient and accurate for fraud detection.
transactions.[5]

F. SMOTE with XGBoost: C.Gradient Boosting Classifier

A 2024 study applied the Synthetic Minority Over-sampling Gradient Boosting is an ensemble technique that enhances
Technique (SMOTE) with XGBoost to address class classification accuracy by combining multiple weak models.
imbalance, achieving 99.6% accuracy on a dataset of 210,000
transactions. The results highlighted the importance of data It builds decision trees sequentially, with each tree
correcting the mistakes of the previous ones.
balancing techniques in fraud detection.[6]
The model minimizes a predefined loss function using
gradient descent.
G. Particle Swarm Optimization (PSO):
It captures intricate relationships in the dataset.
A 2024 study explored PSO for feature selection in credit card
fraud detection, improving the accuracy of Decision Tree and
XGBoost models to 99.4% on a dataset of 230,000 D. Logistic Regression
transactions. PSO helps in selecting the most relevant features,
thereby enhancing model efficiency and accuracy.[7]
Logistic Regression is a statistical model for binary
classification tasks, such as fraud detection.
It utilizes the sigmoid activation function to map input 2. Data Normalization: Standardized using
feature values to probabilities. StandardScaler
The model optimizes feature weights using gradient descent 3. Feature Selection: Decision Tree feature importance
to minimize classification errors. score
It’s a simple yet popular choice due to its interpretability D. Data Split
and robustness.
1. Training Dataset: 1,99,766 transactions (70%)

E. Addressing Class Imbalance with SMOTE

4. 2. Testing Dataset: 85,041 transactions (30%)

SMOTE generates synthetic samples for the minority class

(fraudulent transactions).
It creates a more balanced dataset, reducing bias towards
legitimate transactions.
SMOTE improves fraud detection accuracy when
combined with XGBoost and ensemble methods.
It’s effective in handling highly imbalanced datasets.

F. Feature Selection with Particle Swarm Optimization

(PSO)

PERFORMANCE RESULT
PSO is an optimization algorithm that identifies the most The preferred spelling of the word “acknowledgment” in
relevant features for fraud detection. America is without an “e” after the “g.” Avoid the stilted
It reduces computational complexity while enhancing expression “one of us (R. B. G.) thanks ...”. Instead, try “R. B.
model performance. G. thanks...”. Put sponsor acknowledgments in the unnumbered
footnote on the first page.
PSO explores the search space to find the best feature
subset.
It’s inspired by the movement of bird flocks. CHALLENGES AND FUTURE SCOPE
The proposed model faces challenges such as class imbalance,
IV. DATASET OVERVIEW high-dimensional data, and evolving fraud patterns. Since
fraudulent transactions are rare, the model may favor
A. Dataset legitimate ones, leading to undetected fraud. High-
1. Total Transactions: 2,84,807 dimensional data requires effective feature selection to
improve efficiency. Additionally, fraud techniques constantly
2. Fraudulent Transactions: 492
evolve, making it crucial to update models regularly. The lack
3. Legitimate Transactions: 2,84,315 of interpretability in complex models, like deep learning, also
4. Class Imbalance Issue: Yes poses challenges for financial institutions.

B. Dataset Features
To address these issues, future work can focus on advanced
1. Time: Time elapsed since the first transaction deep learning techniques, hybrid models, and explainable AI
2. V1-V28: 28 anonymized numerical features from PCA to enhance accuracy and transparency. Real-time fraud
detection can be improved through distributed computing,
3. Amount: Transaction Amount while online learning can help models adapt to new fraud
4. Class: Fraudulent=1; Legitimate=0 patterns. Integrating blockchain technology may also enhance
security and trust in financial transactions.

C Data Preprocessing
1. Handling Missing Value: No missing values CONCLUSION
This study evaluated machine learning models for credit card
fraud detection, with XGBoost achieving the highest accuracy [7] S. Lee, B. Wu, and C. Yang, “RNN-based Credit Card
of 99.98%. The results confirm the effectiveness of ensemble Fraud Detection with Sequential Transactional Data,” Int. J.
learning in identifying fraud with high precision and recall. Electron. Commerce, vol. 29, no. 1, pp. 41–58, 2025.
However, challenges such as class imbalance and evolving
fraud techniques require continuous model updates. [8] S. Ghosh & D. L. Reilly, “Credit Card Fraud Detection
with Deep Learning and Feature Engineering,” J. Financial
Future improvements can include expanding datasets, using Data Science, 10(2), 45-62, 2023.
hybrid AI approaches, and integrating real-time fraud
detection. Overall, machine learning offers a powerful solution [9] J. Liu & H. Zhang, “Handling Class Imbalance in Credit
for fraud prevention, and continuous advancements will Card Fraud Detection Using GANs and SMOTE,” IEEE
enhance security in digital transactions. Transactions on Cybernetics, 56(4), 1123-1136, 2024.

[10] Y. Wang & X. Sun, “Hybrid Machine Learning Models

REFRENCES for Credit Card Fraud Detection: A Comparative Study,” Int.
[1] S. Kumar, R. Singh, and M. Patel, “Credit Card Fraud J. Data Science, 17(3), 98-115, 2023.
Detection using Random Forest and Feature Engineering,” J.
Financial Crime, vol. 30, no. 1, pp. 34–47, 2023. [11] R. Patel & S. Mehta, “Anomaly Detection in Financial
Transactions Using Autoencoders and XGBoost,” Expert
[2] J. Lee, H. Kim, and S. Choi, “SVM-based Credit Card Systems with Applications, 221, 119875, 2025.
Fraud Detection with Transactional and Behavioral Features,”
Int. J. Electron. Commerce, vol. 28, no. 2, pp. 157–175, 2024. [12] M. Chowdhury & S. Hossain, “Credit Card Fraud
Detection Using Federated Learning and Secure Data
[3] R. Singh, P. Verma, and D. Roy, “A Neural Network- Sharing,” Computers & Security, 135, 103819, 2024.
based Approach for Credit Card Fraud Detection,” J. Intell.
Inf. Syst., vol. 64, no. 2, pp. 257–272, 2024. [13] P. Verma & D. Roy, “Real-Time Fraud Detection Using
Edge AI,” IEEE IoT Journal, 11(2), 2158-2170, 2024.
[4] Y. Zhang, X. Li, and J. Wang, “Ensemble-based Credit
Card Fraud Detection using Random Forest and SVM,” J. [14] V. Kumar & P. Sharma, “Feature Selection in Fraud
Financial Innovation, vol. 9, no. 1, pp. 1–15, 2023. Detection Using PSO,” Neural Computing and Applications,
35(12), 15267-15282, 2023.
[5] H. Kim, L. Zhao, and K. Park, “Hybrid Credit Card Fraud
Detection using SVM and Random Forest,” Int. J. Data Min. [15] T. Nguyen & M. Tran, “Blockchain-Enabled Fraud
Bioinformatics, vol. 12, no. 2, pp. 147–162, 2024. Prevention,” Financial Innovation, 11(1), 25-42, 2025.

[6] J. Kim, T. Nguyen, and M. Tran, “CNN-based Credit Card

Fraud Detection with Transactional and Behavioral Features,”
J. Intell. Inf. Syst., vol. 65, no. 1, pp. 35–50, 2024.

Project Report On HSBC Bank
0% (1)
Project Report On HSBC Bank
35 pages
Credit Card Fraud Detect
No ratings yet
Credit Card Fraud Detect
19 pages
Machine Learning Algorithms For Credit Card Fraud Detection
No ratings yet
Machine Learning Algorithms For Credit Card Fraud Detection
10 pages
VAV IW User Manual
No ratings yet
VAV IW User Manual
236 pages
Ijma 0101004
No ratings yet
Ijma 0101004
7 pages
Credit Card
No ratings yet
Credit Card
9 pages
Fraud Detection in Banking Data by Machine Learning Techniques
No ratings yet
Fraud Detection in Banking Data by Machine Learning Techniques
10 pages
Report Credit Card
No ratings yet
Report Credit Card
26 pages
Ms Arjocs 1355
No ratings yet
Ms Arjocs 1355
13 pages
Ds 1
No ratings yet
Ds 1
6 pages
Naik 2019 Ijca 918521
No ratings yet
Naik 2019 Ijca 918521
6 pages
credit card
No ratings yet
credit card
13 pages
PID 89: Analysis and Performance Evaluation of Credit Card Fraud Detection by Multi-Model ML
No ratings yet
PID 89: Analysis and Performance Evaluation of Credit Card Fraud Detection by Multi-Model ML
19 pages
Icrito48877.2020.9197762
No ratings yet
Icrito48877.2020.9197762
3 pages
IEEE_Conference_Template (2)
No ratings yet
IEEE_Conference_Template (2)
3 pages
Implementation of Credit Card Fraud Detection Using Support Vector Machine
No ratings yet
Implementation of Credit Card Fraud Detection Using Support Vector Machine
13 pages
paper 2
No ratings yet
paper 2
9 pages
Credit Card Fraud Detection Web Application Using Streamlit and Machine Learning
No ratings yet
Credit Card Fraud Detection Web Application Using Streamlit and Machine Learning
5 pages
DS 1
No ratings yet
DS 1
9 pages
1 Report
No ratings yet
1 Report
55 pages
Credit Card Fraud Detection (Book) 15
No ratings yet
Credit Card Fraud Detection (Book) 15
73 pages
10.1007@s41870 020 00430 y PDF
No ratings yet
10.1007@s41870 020 00430 y PDF
9 pages
Seminar II Initial Review
No ratings yet
Seminar II Initial Review
13 pages
IJIRSET Paper Sample
No ratings yet
IJIRSET Paper Sample
4 pages
ML Credit Card
No ratings yet
ML Credit Card
21 pages
Credit Card Fraud Detection-ppt-1
No ratings yet
Credit Card Fraud Detection-ppt-1
22 pages
A Performance Analysis of Machine Learning Techniques For Credit Card Fraud Detection
No ratings yet
A Performance Analysis of Machine Learning Techniques For Credit Card Fraud Detection
21 pages
Bridget
No ratings yet
Bridget
6 pages
Credit_Card_Fraud_Detection_Framework_A
No ratings yet
Credit_Card_Fraud_Detection_Framework_A
5 pages
Major 1 2nd
No ratings yet
Major 1 2nd
13 pages
Innovative Credit Card Fraud Detection: A Hybrid Model Combining Artificial Neural Networks and Support Vector Machines
No ratings yet
Innovative Credit Card Fraud Detection: A Hybrid Model Combining Artificial Neural Networks and Support Vector Machines
9 pages
Autonomous Credit Card Fraud Detection Using Machine Learning Approach
No ratings yet
Autonomous Credit Card Fraud Detection Using Machine Learning Approach
23 pages
A Hybrid Approach For Optimized Fraudulent Transaction Detection With Credit Card Using
No ratings yet
A Hybrid Approach For Optimized Fraudulent Transaction Detection With Credit Card Using
7 pages
11-Enhancing Credit Card Fraud Detection A Neural Net
No ratings yet
11-Enhancing Credit Card Fraud Detection A Neural Net
8 pages
Ihkk
No ratings yet
Ihkk
62 pages
Bioconf Iscku2024 00076
No ratings yet
Bioconf Iscku2024 00076
18 pages
1_IJSC_Vol_14_Iss_1_Paper_1_3089_3093
No ratings yet
1_IJSC_Vol_14_Iss_1_Paper_1_3089_3093
5 pages
Approaches To Fraud Detection On
No ratings yet
Approaches To Fraud Detection On
10 pages
Paper9-Ijisae 12 Batini+Dhanwanth
No ratings yet
Paper9-Ijisae 12 Batini+Dhanwanth
10 pages
A Review Credit Card Fraud Detection in Banks Using Machine Learning Algorithms
No ratings yet
A Review Credit Card Fraud Detection in Banks Using Machine Learning Algorithms
7 pages
23MZ02
No ratings yet
23MZ02
36 pages
1 s2.0 S0957417423000635 Main
No ratings yet
1 s2.0 S0957417423000635 Main
11 pages
MPML10 2022 FR
No ratings yet
MPML10 2022 FR
24 pages
s11042-023-14698-2
No ratings yet
s11042-023-14698-2
19 pages
Credit Card Fraud Detection Using Xgboost Algorithm
No ratings yet
Credit Card Fraud Detection Using Xgboost Algorithm
10 pages
Data Quality Analysis Based Machine Learning Model
No ratings yet
Data Quality Analysis Based Machine Learning Model
28 pages
Major project stage 2 ppt (2)
No ratings yet
Major project stage 2 ppt (2)
19 pages
Credit Card Fraud Detection
No ratings yet
Credit Card Fraud Detection
6 pages
Upi Fraud Detection Using Machine Learning
No ratings yet
Upi Fraud Detection Using Machine Learning
11 pages
19792
No ratings yet
19792
9 pages
Credit Card Fraud Detection Using Machine Learning Methods
No ratings yet
Credit Card Fraud Detection Using Machine Learning Methods
7 pages
Project Report
No ratings yet
Project Report
51 pages
2024 rp
No ratings yet
2024 rp
10 pages
Creditcard Fraud Detection
No ratings yet
Creditcard Fraud Detection
26 pages
Credit Card Fraud 1.4% Positive Class
No ratings yet
Credit Card Fraud 1.4% Positive Class
17 pages
Credit Card Fraud Detection
No ratings yet
Credit Card Fraud Detection
34 pages
Credit Card Fraud Detection Report
100% (1)
Credit Card Fraud Detection Report
17 pages
Credit Card Fraud Detection Techniques
No ratings yet
Credit Card Fraud Detection Techniques
8 pages
Assignment 1 Individual Assignment Template
No ratings yet
Assignment 1 Individual Assignment Template
26 pages
Agentic Gen AI For Financial Risk Management
From Everand
Agentic Gen AI For Financial Risk Management
Satyadhar Joshi
5/5 (1)
The Secret Of Machine Learning
From Everand
The Secret Of Machine Learning
Mhd Arjunanta
No ratings yet
Data Mining 101: Core Concepts and Algorithms
From Everand
Data Mining 101: Core Concepts and Algorithms
Swarnalata Verma
No ratings yet
How To Record in TF Card With HIKVISION IPC
No ratings yet
How To Record in TF Card With HIKVISION IPC
3 pages
รวมดาวC
No ratings yet
รวมดาวC
22 pages
Heat and Mass Transfer Prelim Problem Set
No ratings yet
Heat and Mass Transfer Prelim Problem Set
3 pages
Datasheet-Advanced Security+EDR
No ratings yet
Datasheet-Advanced Security+EDR
4 pages
A Procedure For Lube Oil Flushing
No ratings yet
A Procedure For Lube Oil Flushing
2 pages
Saponins Types Sources and Research
No ratings yet
Saponins Types Sources and Research
154 pages
Material Master Replication
No ratings yet
Material Master Replication
6 pages
Homemade Vegan Butter - Loving It Vegan
No ratings yet
Homemade Vegan Butter - Loving It Vegan
2 pages
Theoretical Framework Ppt-In PDF
No ratings yet
Theoretical Framework Ppt-In PDF
16 pages
The Post.: Evening
No ratings yet
The Post.: Evening
12 pages
Course Outline Dfn50343 Ent Network - Sesi220232024
No ratings yet
Course Outline Dfn50343 Ent Network - Sesi220232024
4 pages
Body Language. 1 (1)
No ratings yet
Body Language. 1 (1)
10 pages
Concepts and Techniques: Data Mining
100% (1)
Concepts and Techniques: Data Mining
81 pages
CamesaEMC Catalog 12-2013 7
No ratings yet
CamesaEMC Catalog 12-2013 7
1 page
210 - EC8392, EC6302 Digital Electronics - Question Bank 1
No ratings yet
210 - EC8392, EC6302 Digital Electronics - Question Bank 1
19 pages
Listening II - Sesi 1
No ratings yet
Listening II - Sesi 1
12 pages
Temporary Permit: Transport Department Uttar Pradesh Fathehpur Form Sr-30 (See Rule 65 (1) (V) )
No ratings yet
Temporary Permit: Transport Department Uttar Pradesh Fathehpur Form Sr-30 (See Rule 65 (1) (V) )
1 page
D394 2010S2 19033882
No ratings yet
D394 2010S2 19033882
9 pages
Annexure 4 CA
No ratings yet
Annexure 4 CA
1 page
DWS2510.03 Butterfly Valves
No ratings yet
DWS2510.03 Butterfly Valves
6 pages
MDNPart 3
No ratings yet
MDNPart 3
2 pages
Borang Daftar MyHDW (Keluarga)
No ratings yet
Borang Daftar MyHDW (Keluarga)
5 pages
unit 3
No ratings yet
unit 3
32 pages
16 Supriya Sinha
No ratings yet
16 Supriya Sinha
7 pages
Design Approach 1 Combination 1 (A1+M1+R1) Combination 2 (A2+M2+R1) Design Approach 2 (A1+M1+R2) Design Approach 3 (A1or A2) +M2+R3
No ratings yet
Design Approach 1 Combination 1 (A1+M1+R1) Combination 2 (A2+M2+R1) Design Approach 2 (A1+M1+R2) Design Approach 3 (A1or A2) +M2+R3
4 pages
EXAMPLESeparations Pre Lab AX05
No ratings yet
EXAMPLESeparations Pre Lab AX05
8 pages
Bomba de Elevacion - Verderflex - Dura - 5-35 - Manual
No ratings yet
Bomba de Elevacion - Verderflex - Dura - 5-35 - Manual
28 pages
The Importance of Reading
No ratings yet
The Importance of Reading
39 pages