Fraud Detection in Banking Data by Machine Learning Techniques
2023
ABSTRACT As technology advanced and e-commerce services expanded, credit cards became one of the
most popular payment methods, resulting in an increase in the volume of banking transactions. Furthermore,
the significant increase in fraud leads to high banking transaction costs. As a result, detecting fraudulent
activities has become a fascinating topic. In this study, we consider the use of class weight-tuning hyper-
parameters to control the weight of fraudulent and legitimate transactions. We use Bayesian optimization
in particular to optimize the hyperparameters while accounting for practical issues such as unbalanced data.
We propose weight-tuning as a pre-process for unbalanced data, as well as CatBoost and XGBoost to
improve the performance of the LightGBM method by accounting for the voting mechanism. Finally, in order
to improve performance even further, we use deep learning to fine-tune the hyperparameters, particularly
our proposed weight-tuning one. We perform some experiments on real-world data to test the proposed
methods. To better cover unbalanced datasets, we use recall-precision metrics in addition to the standard
ROC-AUC. CatBoost, LightGBM, and XGBoost are evaluated separately using a 5-fold cross-validation
method. Furthermore, the majority voting ensemble learning method is used to assess the performance of
the combined algorithms. According to the results, LightGBM and XGBoost achieve the best values of
ROC-AUC = 0.95, precision = 0.79, recall = 0.80, F1 score = 0.79, and MCC = 0.79. By using deep learning
and the Bayesian optimization method to tune the hyperparameters, we also achieve ROC-AUC = 0.94,
precision = 0.80, recall = 0.82, F1 score = 0.81, and MCC = 0.81. This is a significant improvement over
the state-of-the-art methods we compared against.
INDEX TERMS Bayesian optimization, data mining, deep learning, ensemble learning, hyperparameter,
unbalanced data, machine learning.
for purchases in a physical or digital manner. In digital transactions, fraud can happen over the line or the web, since the cardholders usually provide the card number, expiration date, and card verification number by telephone or website [6].

There are two mechanisms, fraud prevention and fraud detection, that can be exploited to avoid fraud-related losses. Fraud prevention is a proactive method that stops fraud from happening in the first place. On the other hand, fraud detection is needed when a fraudster attempts a fraudulent transaction [7].

Fraud detection in banking is considered a binary classification problem in which data is classified as legitimate or fraudulent [8]. Because banking data is large in volume, with datasets containing a large amount of transaction data, manually reviewing and finding patterns for fraudulent transactions is either impossible or takes a long time. Therefore, machine learning-based algorithms play a pivotal role in fraud detection and prediction [9]. Machine learning algorithms and high processing power increase the capability of handling large datasets and detecting fraud in a more efficient manner. Machine learning algorithms and deep learning also provide fast and efficient solutions to real-time problems [10].

In this paper, we propose an efficient approach for detecting credit card fraud that has been evaluated on publicly available datasets and uses the optimised algorithms LightGBM, XGBoost, CatBoost, and logistic regression individually, as well as majority voting combined methods, deep learning, and hyperparameter settings. An ideal fraud detection system should detect more fraudulent cases, and the precision of detecting fraudulent cases should be high, i.e., all results should be correctly detected, which will lead to the trust of customers in the bank, and on the other hand, the bank will not suffer losses due to incorrect detection.

The main contributions of this paper are summarized as follows:
• We adopt Bayesian optimization for fraud detection and propose to use the weight-tuning hyperparameter to solve the unbalanced data issue as a pre-process step. We also suggest using CatBoost and XGBoost alongside LightGBM to improve performance. We use the XGBoost algorithm due to the high speed of training in big data as well as the regularization term, which overcomes overfitting by measuring the complexity of the tree, and it does not require much time to set the hyperparameters. We also use the CatBoost algorithm because there is no need to adjust hyperparameters for overfitting control, and it also obtains good results without changing hyperparameters compared to other machine learning algorithms.
• We propose a majority-voting ensemble learning approach to combine CatBoost, XGBoost, and LightGBM and review the effect of the combined methods on the performance of fraud detection on real, unbalanced data. We also propose to use deep learning for adjusting and fine-tuning the hyperparameters.
• To evaluate the performance of the proposed methods, we perform extensive experiments on real-world data. To better cover the unbalanced datasets, we use recall-precision in addition to the typically used ROC-AUC. We also evaluate the performance using F1_score and MCC metrics. According to the results, the proposed methods outperform the existing and baseline methods. For evaluations, we use publicly available datasets and also publish the source codes¹ with public access to be used by other researchers.

¹ The codes are available at https://github.com/khadijehHashemi/Fraud-Detection-in-Banking-Data-by-Machine-Learning-Techniques

The remainder of this paper is organized as follows: In Section II we review the related state-of-the-art. The proposed approach for credit card fraud detection, including the dataset, pre-processing, feature extraction and feature selection, algorithms, framework, and evaluation metrics, is presented in Section III. Section IV discusses the evaluation results of the experiments performed, and finally Section V concludes the paper.

II. RELATED WORKS
In order to prevent fraudulent transactions and detect credit card fraud, several methods have been proposed by researchers. A review of state-of-the-art related works is presented in the following.

Halvaiee & Akbari study a new model called the AIS-based fraud detection model (AFDM). They use the Immune System Inspired Algorithm (AIRS) to improve fraud detection accuracy. The presented results of their paper show that their proposed AFDM improves accuracy by up to 25%, reduces costs by up to 85%, and reduces system response time by up to 40% compared to basic algorithms [11].

Bahnsen et al. developed a transaction aggregation strategy and created a new set of features based on the periodic behaviour analysis of the transaction time by using the von Mises distribution. In addition, they propose a new cost-based criterion for evaluating credit card fraud detection models and then, using a real credit card dataset, examine how different feature sets affect results. More precisely, they extend the transaction aggregation strategy to create new features based on an analysis of the periodic behaviour of transactions [12].

Randhawa et al. study the application of machine learning algorithms to detect fraud in credit cards. They first use Naive Bayes, random forest and decision trees, neural networks, linear regression (LR), and logistic regression, as well as support vector machine standard models, to evaluate the available datasets. Further, they propose a hybrid method by applying AdaBoost and majority voting. In addition, they add noise to the data samples for robustness evaluation. They perform experiments on publicly available datasets and show that majority voting is effective in detecting credit card fraud cases [6].

Porwal and Mukund propose an approach that uses clustering methods to detect outliers in a large dataset and is resistant
to changing patterns [13]. The idea behind their proposed approach is based on the assumption that the good behaviour of users does not change over time and that the data points that represent good behaviour have a consistent spatial signature under different groupings. They show that fraudulent behaviours can be detected by identifying the changes in this data. They also show that the area under the precision-recall curve is better than ROC as an evaluation criterion [13].

The authors in [14] propose an ensemble learning framework based on partitioning and clustering of the training set. Their proposed framework has two goals: 1) to ensure the integrity of the sample features, and 2) to solve the high imbalance of the dataset. The main feature of their proposed framework is that every base estimator can be trained in parallel, which improves the effectiveness of their framework.

Itoo et al. use three different ratios of datasets and an oversampling method to deal with the problem of data imbalance. The authors use three machine learning algorithms: logistic regression, Naive Bayes, and K-nearest neighbor. The performance of the algorithms is measured based on accuracy, sensitivity, specificity, precision, F1-score, and area under the curve. They show that the logistic regression-based model outperforms the other commonly used fraud detection algorithms in the paper [15].

The authors in [16] propose a framework that combines the potential of meta-learning ensemble techniques and a cost-sensitive learning paradigm for fraud detection. They perform some evaluations, and the results obtained from classifying unseen data show that the cost-sensitive ensemble classifier has an acceptable AUC value and is efficient compared to the performance of ordinary ensemble classifiers.

Altyeb et al. propose an intelligent approach for detecting fraud in credit card transactions [17]. Their proposed Bayesian-based hyperparameter optimization algorithm is used to tune the parameters of a LightGBM. They perform experiments on publicly available credit card transaction datasets. These datasets consist of fraudulent and legitimate transactions. Their evaluation results are reported in terms of accuracy, area under the receiver operating characteristic curve (ROC-AUC), precision, and F1-score metrics.

Xiong et al. propose a learning-based approach to tackle the fraud detection problem. They use feature engineering techniques to boost the proposed model's performance. The model is trained and evaluated on the IEEE-CIS fraud dataset. Their experiments show that the model outperforms traditional machine-learning-based methods like Bayes and SVM on the used dataset [18].

Vairam et al. evaluate the performance of Naive Bayes and voting classifier algorithms. They demonstrate that in terms of evaluated metrics, particularly accuracy, the voting classifier outperforms the Naive Bayes algorithm [19].

Verma and Tyagi investigate machine learning algorithms in order to determine the best supervised ML-based algorithm for credit card fraud detection in the presence of an imbalanced dataset. They evaluate five classification techniques and show that the support vector classifier and logistic regression classifier outperform other algorithms in an imbalanced dataset [20]. The summary of the literature review is presented in Fig. 1.

FIGURE 1. Summary of the related works on fraud detection in banking industry with machine learning techniques.

III. PROPOSED APPROACH TO DETECTING CREDIT CARD FRAUD
The proposed framework for fraud detection is presented in Fig. 2. As this figure shows, we first apply the desired pre-processing on the data and further divide the data into two sections, training and testing, followed by performing Bayesian optimization on the training data to find the best hyperparameters that lead to the improvement of the performance. We use the cross-validation method to obtain a performance comparison in an unbalanced set and then examine the algorithms using different evaluation metrics, including accuracy, precision, recall, the Matthews correlation coefficient (MCC), the F1-score, and AUC diagrams. These steps are explained in detail as follows:

A. DATASET
In this paper, we use a real dataset so that the outcome of the proposed algorithm can be used in practice. We consider a dataset named "creditcard" that contains 284,807 records of two days of transactions made by credit card holders in September 2013. There are 492 fraudulent transactions, and the rest of the transactions are legitimate. The positive class (frauds) accounts for 0.172% of all transactions; hence, the dataset is highly imbalanced. This dataset is available and can be accessed through https://www.kaggle.com/mlg-ulb/creditcardfraud.

This dataset contains only numerical input variables resulting from a principal component analysis (PCA) transformation. Unfortunately, the original features and background information about the data are not given due to confidentiality and privacy considerations. PCA yielded the following principal components: V1, V2, ..., V28. The features not transformed by PCA are "Time" and "Amount". The "Time" column contains the time (in seconds) elapsed between each transaction and the first transaction in the dataset. The feature "Amount" shows the transaction amount. Feature "Class" is the response variable, and it takes the value 1 in case of fraud and 0 otherwise. The summary of the variables and features is presented in Table 1.

TABLE 1. Features of the credit-card fraud dataset that is used in this paper.

B. DATA PRE-PROCESSING
As illustrated in Table 2, the total number of fraudulent transactions is significantly lower than the total number of legitimate transactions, indicating that the data distribution is unbalanced. In real datasets for credit card fraud detection, unbalanced data is expected. This data imbalance causes performance issues in machine learning algorithms, and having a class with the majority of the samples influences the evaluation results [6]. Therefore, in many studies, under-sampling and over-sampling methods are used to solve the data imbalance problem [15]. Using under-sampling methods leads to data loss [21]. Besides, using over-sampling methods leads to the production of duplicate data that doesn't provide information (data and information are different, and the subject is discussed under "Entropy"). Some researchers use synthetic minority oversampling (SMOTE) as a solution, which avoids the drawbacks of under- and over-sampling [5], [17], [22]. However, the SMOTE method causes an increase in the false-positive rate, which is not acceptable in banking
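To make the scale of the imbalance concrete, the following minimal sketch derives the fraud share and two common starting values for a class-weight hyperparameter from the record counts reported in Section III-A. The helper name `balanced_class_weights` is hypothetical, and the inverse-frequency heuristic is only an illustrative assumption: in this paper the class weight is tuned by Bayesian optimization rather than fixed by such a formula.

```python
# Record counts reported for the "creditcard" dataset in Section III-A.
N_TOTAL = 284_807
N_FRAUD = 492
N_LEGIT = N_TOTAL - N_FRAUD

# Share of the positive (fraud) class among all transactions.
fraud_fraction = N_FRAUD / N_TOTAL

# Ratio of legitimate to fraudulent samples. A value like this is often used
# as a starting point for a positive-class weight (for example, the
# scale_pos_weight parameter exposed by XGBoost and LightGBM).
neg_pos_ratio = N_LEGIT / N_FRAUD


def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * count_per_class).

    Hypothetical helper for illustration; not the tuned weights of the paper.
    """
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


weights = balanced_class_weights({0: N_LEGIT, 1: N_FRAUD})

print(f"fraud share:        {fraud_fraction:.4%}")
print(f"legit/fraud ratio:  {neg_pos_ratio:.1f}")
print(f"balanced weights:   {weights}")
```

Values obtained this way can serve as the centre of the search range that the optimizer explores, instead of resampling the data.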
first), it makes use of the "max depth" parameter and starts tree pruning from the backward direction, which significantly improves the computational performance and speed of XGBoost [28]. XGBoost employs a more regularised model formalization to control over-fitting and achieve better performance [29]. The tuned hyperparameters include learning rate, number of trees, and maximum tree depth, as well as applying weight to classes.

4) CATBOOST
Category Boosting (CatBoost) is a new gradient boosting algorithm proposed by Prokhorenkova et al. [29]. CatBoost is a competitive candidate in the realm of classifiers for highly unbalanced data [30]. The CatBoost machine learning algorithm is a particular type of gradient boosting on decision trees, as it can handle categorical, ordered features, and the over-fitting of the model is taken care of by Bayesian estimators [31]. CatBoost doesn't require extensive data training like other machine learning models and can be successfully applied to diverse types and formats of data [29], [30]. CatBoost has both CPU and GPU implementations; the GPU implementation allows for much faster training and is faster than both state-of-the-art open-source GBDT GPU implementations, XGBoost and LightGBM, on ensembles of similar sizes [32]. CatBoost uses a more efficient strategy that reduces over-fitting and allows the use of the whole dataset for training. We perform a random permutation of the dataset, and also, for data imbalance problems, we use a class weight hyperparameter.

5) MAJORITY VOTING
Ensemble learning (EL), which is a type of machine learning, combines several classifiers, minimises the error of the classifiers, and achieves more reasonable results than a single technique. A majority voting classifier is not a real classifier, but a wrapper around several classifiers that are trained and evaluated in parallel in order to use the different features of each algorithm. We can train different algorithms on the data and combine them to predict the final output. The final result of the prediction is determined by a majority of votes according to two different strategies: hard voting and soft voting. If voting is hard, it uses the predicted class labels for majority rule voting. Otherwise, if the vote is soft, it predicts the class label based on the argmax of the sum of the predicted probabilities, which is recommended for a set of well-calibrated classifiers. In this case, the probability vector is averaged over all classifiers for each predicted class. The winning class is the one with the highest value [27], [33].

\hat{y} = \arg\max \frac{1}{N_{\text{Classifiers}}} \sum_{\text{Classifiers}} (p_1, \ldots, p_n) \quad (1)

6) DEEP LEARNING
Deep learning algorithms are a class of machine learning algorithms where multiple hidden layers are used to improve the outcome. Deep learning is shown to be a very promising solution to deal with fraud in financial transactions, making the best use of banks' big data [34]. Deep learning is a generic term that refers to machine learning using a deep multi-layer artificial neural network (ANN). It is a biologically inspired model of human neurons, composed of multi-level hidden layers of nonlinear processing units, where each neuron is able to send data to a connected neuron within the hidden layers. These processing units discover intermediate representations in a hierarchical manner. The features discovered in one layer form the basis for the processing of the succeeding layer. In this way, deep learning algorithms learn intermediate concepts between raw input and target knowledge [34].

In this paper, we use a sequential model, which is a linear stack of layers, to construct an artificial neural network model. Our model uses the Dense layer class, which is a very common and frequently used layer. In the neural network, the activation function is used to increase the predictive power. This function maps input signals to output signals. We use the ReLU activation function, and in the last layer, we use "Sigmoid", since our output is binary. The Sigmoid function generates values in the range between zero and one. In the ReLU function, if the value x is smaller than or equal to zero, the output is zero; otherwise, the output is x. The behaviour of the ReLU activation function is in many ways similar to that of biological neurons.

Neural networks require initial weighting. We use a kernel initializer, which defines the method of determining the initial random weights of the Keras layers. To overcome the unbalanced data problem, we consider the ratio of 1 to 4 for the weight of the majority class to the minority class. This causes an increase in the processing speed as well as increasing the efficiency of the model. The size of the input layer is equal to the number of features plus the extracted features. We also remove the "time" feature. To build the Keras model, we optimise the number of layers and neurons, the number of epochs, and the batch size, which leads to an increase in speed. Commonly, batch size is set to 32 or 128. However, our dataset is highly unbalanced, and by choosing the common batch size, there may be no fraud cases in the batch during training. Therefore, our range is chosen so that we can see fraudulent samples in each batch. Also, by choosing a larger batch size, the processing is faster, and we also need less memory. An unsuitable number of epochs can result in either over- or under-fitting. Therefore, selecting the appropriate range for optimization not only increases the efficiency of the algorithm but also reduces the time required to find the optimal points. By performing Bayesian optimization, the number of neurons in the first hidden layer is set to 86, the number of epochs is set to 117, and the batch size is set to 1563. The details of our model are presented in Table 3.

Using the Keras compile method with the Adam optimizer, we perform weight updates and use binary cross-entropy as the loss function, which finalises the configuration of the learning and training process.
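The interaction between the binary cross-entropy loss and the 1-to-4 class weighting described above can be illustrated with a small, framework-free sketch. It assumes that a class weight simply scales each sample's loss term before averaging, which is our reading of how per-class weights usually enter a weighted loss; the function name `weighted_bce`, the labels, and the predicted probabilities are hypothetical and are not taken from the authors' Keras code.

```python
import math


def weighted_bce(y_true, y_prob, w_legit=1.0, w_fraud=4.0, eps=1e-7):
    """Mean binary cross-entropy with a per-class weight on each sample.

    w_legit : w_fraud = 1 : 4 mirrors the majority-to-minority ratio in the text.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += (w_fraud if y == 1 else w_legit) * loss
    return total / len(y_true)


# Hypothetical mini-batch: three legitimate samples and one missed fraud case.
labels = [0, 0, 0, 1]
probs = [0.1, 0.2, 0.1, 0.3]

plain = weighted_bce(labels, probs, w_fraud=1.0)  # unweighted loss
weighted = weighted_bce(labels, probs)            # fraud errors count four times

print(f"unweighted loss: {plain:.4f}")
print(f"weighted loss:   {weighted:.4f}")
```

With the weighting in place, a poorly scored fraud sample contributes four times as much to the batch loss as an equally poorly scored legitimate one, so the optimizer is pushed harder to correct it.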
TABLE 3. Details of our deep learning model used in the paper are
provided. The total parameters are set to 7593, and all are trainable.
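The batch-size argument for the deep learning model can be checked with a short calculation. The sketch below assumes that each sample in a mini-batch is drawn independently with the overall fraud rate of the dataset (492 of 284,807 records), which only approximates shuffled mini-batches drawn without replacement; the function names are hypothetical.

```python
# Overall fraud rate of the dataset described in Section III-A.
N_TOTAL = 284_807
N_FRAUD = 492
p_fraud = N_FRAUD / N_TOTAL


def prob_no_fraud(batch_size, p=p_fraud):
    """Probability that a batch contains no fraud sample (independent draws)."""
    return (1.0 - p) ** batch_size


def expected_frauds(batch_size, p=p_fraud):
    """Expected number of fraud samples per batch."""
    return batch_size * p


# 32 and 128 are the common batch sizes named in the text;
# 1563 is the value selected by Bayesian optimization.
for b in (32, 128, 1563):
    print(f"batch={b:5d}  P(no fraud)={prob_no_fraud(b):.3f}  "
          f"E[frauds]={expected_frauds(b):.2f}")
```

Under these assumptions, the large majority of batches of size 32 contain no fraud case at all, while a batch of 1563 samples is expected to contain between two and three, which is consistent with choosing the search range so that fraudulent samples appear in each batch.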
F. EVALUATION METRICS
We apply a cross-validation test to evaluate the performance of the proposed model for credit card fraud detection. Similar to [6], [17], we use a stratified 5-fold validation test to obtain a reliable performance comparison in the unbalanced set. The dataset is divided randomly into five separate subsets of equal size, where the samples of each class are distributed in equal proportions across the subsets. In all steps of validation, a single subset (20% of the dataset) is reserved as the validation data to test the performance of the proposed approach, while the remaining four subsets (80% of the dataset) are employed as the training data. We repeat this process five times until all subsets are used. The average performances of the five test subsets are calculated, and the final result is the performance of the proposed approach on a 5-fold cross-validation test.

To be fair in our comparisons, we use the common metrics for our evaluations, including accuracy, precision, recall, the Matthews correlation coefficient (MCC), the F1-score, and AUC diagrams. The positive class represents fraudulent transactions in our experiments, while the negative class represents legitimate ones. True positive (TP) represents fraudulent transactions that have been classified as such. False positive (FP) indicates the number of legitimate transactions misclassified as fraudulent. True negative (TN) represents legitimate transactions classified as legitimate, and false negative (FN) indicates the fraudulent transactions misclassified as legitimate [15]. The mathematical expressions for the metrics used are given in Eq. (2) to Eq. (6).

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (2)

\text{Recall} = \frac{TP}{TP + FN} \quad (3)

\text{Precision} = \frac{TP}{TP + FP} \quad (4)

\text{F1-Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (5)

\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \quad (6)

Accuracy quantifies the overall performance of the classifier and is defined as the fraction of correct predictions made by the model. When dealing with data that isn't balanced, this criterion doesn't give good results because it remains high even when few or no fraudulent transactions are detected. Recall shows the efficiency of the classifier in detecting actual fraudulent transactions. Precision measures the reliability of the classifier, and F1-Score is the harmonic mean of recall and precision, which considers both false negatives and false positives.

ROC-AUC is a measure of separability that demonstrates the model's ability to differentiate between classes [15]. The ROC curve is a graphical plot of the true positive rate (TPR) against the false positive rate (FPR) at different possible thresholds [17], and ROC-AUC is the area under it. The area under the ROC curve is not a suitable criterion for evaluating fraud detection methods on severely unbalanced data, since the false positive rate is dominated by the large number of legitimate transactions.

FIGURE 4. ROC_AUC curve.

The precision and recall curves are commonly used to compare classifiers in terms of precision and recall. Usually, in this two-dimensional graph, the precision rate is plotted on the y-axis and the recall is plotted on the x-axis. It is difficult to describe the true and false positives and negatives using one indicator. One good solution is to use MCC, which measures the quality of a two-class classification, taking into account the true and false positives and negatives. It is a balanced measure, even when the classes are of different sizes [6].

IV. EXPERIMENTAL RESULTS AND DISCUSSION
We use the stratified 5-fold cross validation method and the boosting algorithms with the Bayesian optimization method to evaluate the performance of the proposed framework. We extract the hyperparameters and evaluate each algorithm individually before using the majority voting method. We examine the algorithms in combinations of three and of two. The comparison results are presented in Table 5.

Most studies in the literature rely on AUC diagrams to evaluate performance. However, as can be seen from the ROC-AUC curve in Fig. 4, the value of AUC in severely unbalanced data is not a good evaluation metric. It is influenced by the real positives and considers the negatives
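The behaviour of these metrics on unbalanced data can be illustrated with a short sketch that computes the standard definitions of accuracy, recall, precision, F1-score, and MCC from confusion-matrix counts. The function name `metrics` and both sets of counts are hypothetical and chosen only for illustration (they match the dataset size of 284,807 records with 492 frauds); they are not results reported in this paper.

```python
import math


def metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics; undefined ratios are reported as 0.0."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1, "mcc": mcc}


# Hypothetical classifier that labels every transaction as legitimate.
all_legit = metrics(tp=0, tn=284_315, fp=0, fn=492)

# Hypothetical classifier that finds 400 frauds and raises 100 false alarms.
detector = metrics(tp=400, tn=284_215, fp=100, fn=92)

for name, m in (("all legitimate", all_legit), ("detector", detector)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

The first classifier reaches an accuracy above 99.8% without detecting a single fraud, while its recall, F1-score, and MCC are all zero; this is why the evaluation relies on precision, recall, F1-score, and MCC rather than on accuracy alone.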
TABLE 6. Performance comparison of the proposed approach and the method presented in [17].
FIGURE 9. Performance comparison of the proposed approach with the paper [17] based on the different evaluation criteria.

The diagram of the Precision-Recall curve is shown in Fig. 8, and shows the value as 0.7922.

The evaluation results of the proposed approach using different pre-processing and class weight hyperparameter tuning to deal with the problem of data unbalance compared to the paper [17] are shown in Fig. 9. The results show improvement of both methods compared to the method presented in [17].

According to Table 6, the proposed methods outperform the intelligent approach presented in [17] using common metrics and a public dataset.

V. CONCLUSION AND FUTURE WORK
In this paper, we studied the credit card fraud detection problem in real unbalanced datasets. We proposed a machine-learning approach to improve the performance of fraud detection. We used a publicly available "credit card" dataset

REFERENCES
[1] J. Nanduri, Y.-W. Liu, K. Yang, and Y. Jia, "Ecommerce fraud detection through fraud islands and multi-layer machine learning model," in Proc. Future Inf. Commun. Conf., in Advances in Information and Communication. San Francisco, CA, USA: Springer, 2020, pp. 556–570.
[2] I. Matloob, S. A. Khan, R. Rukaiya, M. A. K. Khattak, and A. Munir, "A sequence mining-based novel architecture for detecting fraudulent transactions in healthcare systems," IEEE Access, vol. 10, pp. 48447–48463, 2022.
[3] H. Feng, "Ensemble learning in credit card fraud detection using boosting methods," in Proc. 2nd Int. Conf. Comput. Data Sci. (CDS), Jan. 2021, pp. 7–11.
[4] M. S. Delgosha, N. Hajiheydari, and S. M. Fahimi, "Elucidation of big data analytics in banking: A four-stage delphi study," J. Enterprise Inf. Manage., vol. 34, no. 6, pp. 1577–1596, Nov. 2021.
[5] M. Puh and L. Brkić, "Detecting credit card fraud using selected machine learning algorithms," in Proc. 42nd Int. Conv. Inf. Commun. Technol., Electron. Microelectron. (MIPRO), May 2019, pp. 1250–1255.
[6] K. Randhawa, C. K. Loo, M. Seera, C. P. Lim, and A. K. Nandi, "Credit card fraud detection using AdaBoost and majority voting," IEEE Access, vol. 6, pp. 14277–14284, 2018.
[7] N. Kumaraswamy, M. K. Markey, T. Ekin, J. C. Barner, and K. Rascati, "Healthcare fraud data mining methods: A look back and look ahead," Perspectives Health Inf. Manag., vol. 19, no. 1, p. 1, 2022.
[8] E. F. Malik, K. W. Khaw, B. Belaton, W. P. Wong, and X. Chew, "Credit card fraud detection using a new hybrid machine learning architecture," Mathematics, vol. 10, no. 9, p. 1480, Apr. 2022.
[9] K. Gupta, K. Singh, G. V. Singh, M. Hassan, G. Himani, and U. Sharma, "Machine learning based credit card fraud detection—A review," in Proc. Int. Conf. Appl. Artif. Intell. Comput. (ICAAIC), 2022, pp. 362–368.
[10] R. Almutairi, A. Godavarthi, A. R. Kotha, and E. Ceesay, "Analyzing credit card fraud detection based on machine learning models," in Proc. IEEE Int. IoT, Electron. Mechatronics Conf. (IEMTRONICS), Jun. 2022, pp. 1–8.
[11] N. S. Halvaiee and M. K. Akbari, "A novel model for credit card fraud detection using artificial immune systems," Appl. Soft Comput., vol. 24, pp. 40–49, Nov. 2014.
[12] A. C. Bahnsen, D. Aouada, A. Stojanovic, and B. Ottersten, "Feature engineering strategies for credit card fraud detection," Expert Syst. Appl., vol. 51, pp. 134–142, Jun. 2016.
[13] U. Porwal and S. Mukund, "Credit card fraud detection in e-commerce: An outlier detection approach," 2018, arXiv:1811.02196.
[14] H. Wang, P. Zhu, X. Zou, and S. Qin, "An ensemble learning framework for credit card fraud detection based on training set partitioning and clustering," in Proc. IEEE SmartWorld, Ubiquitous Intell. Comput., Adv. Trusted Comput., Scalable Comput. Commun., Cloud Big Data Comput., Internet People Smart City Innov. (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), Oct. 2018, pp. 94–98.
[15] F. Itoo, M. Meenakshi, and S. Singh, "Comparison and analysis of logistic regression, Naïve Bayes and knn machine learning algorithms for credit card fraud detection," Int. J. Inf. Technol., vol. 13, no. 4, pp. 1503–1511, 2021.
[16] T. A. Olowookere and O. S. Adewale, "A framework for detecting credit card fraud with cost-sensitive meta-learning ensemble approach," Sci. Afr., vol. 8, Jul. 2020, Art. no. e00464.
[17] A. A. Taha and S. J. Malebary, "An intelligent approach to credit card fraud detection using an optimized light gradient boosting machine," IEEE Access, vol. 8, pp. 25579–25587, 2020.
[18] X. Kewei, B. Peng, Y. Jiang, and T. Lu, "A hybrid deep learning model for online fraud detection," in Proc. IEEE Int. Conf. Consum. Electron. Comput. Eng. (ICCECE), Jan. 2021, pp. 431–434.
[19] T. Vairam, S. Sarathambekai, S. Bhavadharani, A. K. Dharshini, N. N. Sri, and T. Sen, "Evaluation of Naïve Bayes and voting classifier algorithm for credit card fraud detection," in Proc. 8th Int. Conf. Adv. Comput. Commun. Syst. (ICACCS), Mar. 2022, pp. 602–608.
[20] P. Verma and P. Tyagi, "Analysis of supervised machine learning algorithms in the context of fraud detection," ECS Trans., vol. 107, no. 1, p. 7189, 2022.
[21] J. Zou, J. Zhang, and P. Jiang, "Credit card fraud detection using autoencoder neural network," 2019, arXiv:1908.11553.
[22] D. Almhaithawi, A. Jafar, and M. Aljnidi, "Example-dependent cost-sensitive credit cards fraud detection using SMOTE and Bayes minimum risk," Social Netw. Appl. Sci., vol. 2, no. 9, pp. 1–12, Sep. 2020.
[23] J. Cui, C. Yan, and C. Wang, "Learning transaction cohesiveness for online payment fraud detection," in Proc. 2nd Int. Conf. Comput. Data Sci., Jan. 2021, pp. 1–5.
[24] M. Rakhshaninejad, M. Fathian, B. Amiri, and N. Yazdanjue, "An ensemble-based credit card fraud detection algorithm using an efficient voting strategy," Comput. J., vol. 65, no. 8, pp. 1998–2015, Aug. 2022.
[25] A. H. Victoria and G. Maragatham, "Automatic tuning of hyperparameters using Bayesian optimization," Evolving Syst., vol. 12, no. 1, pp. 217–223, Mar. 2021.
[26] H. Cho, Y. Kim, E. Lee, D. Choi, Y. Lee, and W. Rhee, "Basic enhancement strategies when using Bayesian optimization for hyperparameter tuning of deep neural networks," IEEE Access, vol. 8, pp. 52588–52608, 2020.
[27] F. N. Khan, A. H. Khan, and L. Israt, "Credit card fraud prediction and classification using deep neural network and ensemble learning," in Proc. IEEE Region 10 Symp. (TENSYMP), Jun. 2020, pp. 114–119.
[28] W. Liang, S. Luo, G. Zhao, and H. Wu, "Predicting hard rock pillar stability using GBDT, XGBoost, and LightGBM algorithms," Mathematics, vol. 8, no. 5, p. 765, May 2020.
[29] S. B. Jabeur, C. Gharib, S. Mefteh-Wali, and W. B. Arfi, "CatBoost model and artificial intelligence techniques for corporate failure prediction," Technol. Forecasting Social Change, vol. 166, May 2021, Art. no. 120658.
[30] J. Hancock and T. M. Khoshgoftaar, "Medicare fraud detection using CatBoost," in Proc. IEEE 21st Int. Conf. Inf. Reuse Integr. Data Sci. (IRI), Aug. 2020, pp. 97–103.
[31] B. Dhananjay and J. Sivaraman, "Analysis and classification of heart rate using CatBoost feature ranking model," Biomed. Signal Process. Control, vol. 68, Jul. 2021, Art. no. 102610.
[32] Y. Chen and X. Han, "CatBoost for fraud detection in financial transactions," in Proc. IEEE Int. Conf. Consum. Electron. Comput. Eng. (ICCECE), Jan. 2021, pp. 176–179.
[33] A. Goyal and J. Khiari, "Diversity-aware weighted majority vote classifier for imbalanced data," in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2020, pp. 1–8.
[34] A. Roy, J. Sun, R. Mahoney, L. Alonzi, S. Adams, and P. Beling, "Deep learning detecting fraud in credit card transactions," in Proc. Syst. Inf. Eng. Design Symp. (SIEDS), Apr. 2018, pp. 129–134.

SEYEDEH KHADIJEH HASHEMI received the B.Sc. and M.Sc. degrees in computer engineering. She is currently a Former Student with the Department of Electrical and Computer Engineering, Kharazmi University. Her master's thesis has been performed on fraud detection for banking with machine learning techniques. Her research interest includes application of machine learning techniques, focusing on banking.

SEYEDEH LEILI MIRTAHERI is currently a Faculty Member with the Department of Electrical and Computer Engineering, Kharazmi University, Tehran, Iran. She is researching next-generation high-performance computing systems and GPU computing. She has published more than 50 papers in credible conferences and journals. Her research interests include distributed and parallel systems, exascale computing, cluster computing, mathematics, and scientific computing. She worked on distributed systems and done several successful industrial experiments in these areas. She received an Exemplary Professor of Kharazmi University, in 2020, and also she received a Leading Young Researcher in Alborz Province, in 2020. She received the First Award of Inventions at National Science Foundation Invention Festival, in 2011, the Iran University of Science and Technology (IUST) Awards for Excellence in Researching, in 2009, the Second Level Reward of National Science Foundation in Ph.D., in 2009, the First Award for presenting "CSharifi: Kernel Level Cluster Management System Software," at the Khwarizmi Young Awards, in 2008, the Grant of Excellent Researcher of National Science Foundation, in 2008, and the Iranian Organization of Scientific and Industrial Research appreciation to cooperating and presenting "A Cluster Management System Software" at the Khwarizmi International Awards, in 2007.

SERGIO GRECO is currently a Full Professor with the Department of Informatics, Modeling, Electronics and System Engineering (DIMES), University of Calabria, Rende, Italy. He has written over 220 papers, including more than 60 journal papers in prestigious conferences and journals. His research interests include database theory, data integration and exchange, inconsistent data, incomplete data, data mining, knowledge representation, logic programming, and computational logic and argumentation theory.