A Survey Paper On Credit Card Fraud Detection Techniques
All content following this page was uploaded by Derar Eleyan on 19 October 2021.
Index Terms: Detection Techniques, Machine Learning, Credit Card, Fraud Detection.
————————————————————
IJSTR©2021
www.ijstr.org
INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME 10, ISSUE 09, SEPTEMBER 2021 ISSN 2277-8616
1.2 Credit Card Fraud Detection
Electronic payment services make payments more convenient, seamless, and simple to use; however, we must not overlook the losses associated with electronic commerce. Organizations and banks deploy security solutions to address these issues, but fraudsters' techniques evolve over time. As a result, it is critical to keep improving detection and prevention techniques [7]. To combat fraud effectively, it is essential to understand the mechanisms by which it is carried out. The mechanism for identifying credit card fraud depends on the fraud method itself [13]. To accomplish this, the transaction details are passed to a verification module, which classifies each transaction as either fraud or non-fraud. A transaction classified as fraudulent is rejected; otherwise, it is accepted [14]. Fraud detection techniques such as statistical data analysis and artificial intelligence can be used to distinguish between the two. AI techniques include data mining, which can classify, group, and segment data, searching through millions of transactions to find patterns and detect fraud. Machine learning is a technique for automatically detecting fraud characteristics. Fraud is dealt with through both prevention and detection. The primary goal of fraud detection and prevention is to tell legitimate transactions from fraudulent ones and to prevent fraudulent activity. Using historical data, the user's patterns and behavior are analysed to determine whether a transaction is fraudulent. When the system fails to prevent fraudulent activity, fraud detection takes over [15]. In supervised fraud detection systems, new transactions are classified as fraudulent or genuine based on the characteristics of deceptive and legitimate activities, whereas in unsupervised fraud detection systems, outlier transactions are flagged as potentially fraudulent. A point-by-point comparison of supervised and unsupervised machine learning techniques can be found in the literature. A variety of studies have examined methods for solving the card fraud detection problem. These approaches include ANN, K-means clustering, DT, etc. [16].

1.3 Fraud types in Card-based transactions
1) Physical Card Fraud: Most POS (point of sale) transactions require the cardholder to physically present the card to the merchant to carry out the transaction. The customer's card can be stolen and misused by fraudsters without the customer's knowledge.
2) Virtual Card Fraud: Most online shopping transactions need no physical card; instead, the card number, expiry date, and CVV number are used to perform the transaction. Fraudsters can steal this information and use it to perform fraudulent online transactions [17].

2. LITERATURE REVIEW
Prajal Save et al. [18] proposed a model based on a decision tree and a combination of Luhn's and Hunt's algorithms. Luhn's algorithm validates the credit card number supplied as input, as a first check on whether an incoming transaction is fraudulent. Address Mismatch and Degree of Outlierness are used to assess the deviation of each incoming transaction from the cardholder's normal profile. In the final step, the general belief is strengthened or weakened using Bayes' theorem, followed by recombination of the calculated probability with the initial belief of fraud using an advanced combination heuristic.

Vimala Devi J. et al. [19] presented and implemented three machine-learning algorithms to detect counterfeit transactions: Support Vector Machine, Random Forest, and Decision Tree. Many measures are used to evaluate the performance of classifiers or predictors; these metrics are either prevalence-dependent or prevalence-independent. The techniques are applied to credit card fraud detection, and the results of the algorithms are compared.

Popat and Chaudhary [20] presented supervised algorithms. Deep learning, Logistic Regression, Naive Bayesian, Support Vector Machine (SVM), Neural Network, Artificial Immune System, K Nearest Neighbour, Data Mining, Decision Tree, Fuzzy logic based System, and Genetic Algorithm are some of the techniques used. Credit card fraud detection algorithms identify transactions that have a high probability of being fraudulent. The authors compared machine-learning algorithms for prediction, clustering, and outlier detection.

Shiyang Xuan et al. [21] used the Random Forest classifier to train on the behavioral characteristics of credit card transactions. Two types of random forest are used to train the normal and fraudulent behavior features: a random forest based on random trees and a random forest based on CART. Performance measures are computed to assess the model's effectiveness.

Dornadula and Geetha S. [5] used the Sliding-Window method to aggregate the transactions into respective groups, i.e., some features were extracted from the window to find the cardholder's behavioral patterns. Available features include the maximum amount, the minimum amount of a transaction, the average amount in the window, and the time elapsed.

Sangeeta Mittal et al. [22] selected some popular machine-learning algorithms in the supervised and unsupervised categories to evaluate the underlying problems. A range of supervised learning algorithms, from classical to modern, were considered, including tree-based algorithms, classical and deep neural networks, hybrid algorithms, and Bayesian approaches. The effectiveness of machine-learning algorithms in detecting credit card fraud was assessed by evaluating a number of popular algorithms in the supervised, ensemble, and unsupervised categories on various metrics. The authors conclude that unsupervised algorithms handle dataset skewness better and thus perform well across all metrics, both in absolute terms and in comparison to other techniques.

Deepa and Akila [17] used different algorithms for fraud detection, such as Anomaly Detection Algorithm, K-Nearest Neighbor, Random Forest, K-Means, and Decision Tree. Based on a given scenario, they presented several techniques and predicted the best algorithm for detecting fraudulent transactions. To predict the fraud result, the system used various rules and algorithms to generate a fraud score for each transaction.

Xiaohan Yu et al. [23] proposed a deep neural network algorithm for detecting credit card fraud. The paper describes the neural network approach and deep neural network applications, along with the preprocessing methods and the focal loss used to resolve data skew in the dataset.

Siddhant Bagga et al. [24] presented several techniques for determining whether a transaction is real or fraudulent. They evaluated and compared the performance of 9 techniques on credit card fraud data, including logistic regression, KNN, RF, quadratic discriminant
analysis, naive Bayes, multilayer perceptron, AdaBoost, ensemble learning, and pipelining, using different parameters and metrics. The ADASYN method is used to balance the dataset. Accuracy, recall, F1 score, Balanced Classification Rate, and Matthews correlation coefficient are used to assess classifier performance, in order to determine which technique best solves the problem across the various metrics.

Carrasco and Urban [25] tested and measured the ability of deep neural networks to detect false positives by processing alerts generated by a fraud detection system (FDS). Ten neural network architectures classified a set of alerts triggered by an FDS as either valid alerts, representing real fraud cases, or incorrect alerts, representing false positives. The optimal configuration achieved an alert reduction rate of 35.16 percent when capturing 91.79 percent of fraud cases, and a reduction rate of 41.47 percent when capturing 87.75 percent of fraud cases.

Kibria and Sevkli [26] built a deep learning model using the grid search technique. The model is applied to a credit card data set, and its performance is compared with that of two traditional machine-learning algorithms: logistic regression (LR) and support vector machine (SVM).

Borse, Suhas and Dhotre [27] used Naive Bayes classification to predict whether transactions are normal or fraudulent. The accuracy, recall, precision, F1 score, and AUC score of the Naive Bayes classifier are all calculated.

Asha R B et al. [14] proposed a deep learning-based method for detecting fraud in credit card transactions, using machine-learning algorithms such as support vector machine, k-nearest neighbor, and artificial neural network to predict the occurrence of fraud.

3. RESEARCH METHODOLOGY
A systematic literature review (SLR) is a methodology for reviewing the literature on a specific topic, in this case fraud detection. A systematic review's primary goal in this context is to identify, evaluate, and interpret the available studies in the literature that address the authors' research questions. A secondary goal is to identify research gaps and opportunities in the area of interest. In this paper, we attempted to walk through the activities proposed by Kitchenham: preparing the review, conducting it, and reporting it, in iterations [28].

3.1 Selection of Primary Studies
To identify primary studies for selection, keywords were passed to the search engines; the keywords were chosen to surface research that helps answer the study questions. The only Boolean operators that could be used were AND and OR. The search terms were:

("machine-learning" OR "machine learning") AND "fraud detection"

The platforms searched were:
- IEEE Xplore Digital Library
- Google Scholar
- Elsevier ScienceDirect
- Web sites
Depending on the search platform, the title, keywords, and abstract were searched. We conducted the searches on March 28, 2021, and considered all studies published up to that date. The search results were refined using the criteria described in Section 3.2, resulting in a workable set of results.

3.2 Inclusion and Exclusion Criteria
Studies of modern technological fraud detection, case studies, and commentary on how to improve existing mechanisms by building a hybrid approach could all be considered for inclusion in this SLR. Papers must be written in English. Results from Google Scholar are checked for compliance, since Google Scholar can return lower-grade papers. Only the most recent version of a study is included in this SLR. Table 1 lists the most important inclusion and exclusion criteria.

Table 1
INCLUSION AND EXCLUSION CRITERIA FOR THE PRIMARY STUDIES

| Inclusion | Exclusion |
|---|---|
| Must contain information related to fraud detection and machine learning technologies. | Centering on the social or legal ramifications of fraud. |
| The paper must include empirical data on credit card fraud as well as the use of machine learning techniques for detection. | Paper on detecting fraud on individuals and public sites. |
| The paper must have been published in a journal or a conference. | Written in a language other than English. |

3.3 Selection Results
The initial keyword searches against the selected platforms yielded 68 studies. After duplicate studies were removed, this was reduced to 52. After screening against the inclusion/exclusion criteria, 45 papers were left to read. These 45 papers were read in their entirety, and after applying the inclusion/exclusion criteria a second time, 37 papers remained. As a result, the SLR comprises 37 papers in total, as illustrated in the diagram below:
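The selection counts reported in Section 3.3 can be tallied stage by stage. The following is a minimal Python sketch of that arithmetic; the stage labels and the `funnel` helper are our own illustrative names, not terminology from the review protocol.

```python
# Counts as reported in Section 3.3; the stage labels are illustrative shorthand.
stages = [
    ("keyword search", 68),
    ("after duplicate removal", 52),
    ("after first inclusion/exclusion screening", 45),
    ("after full reading and second screening", 37),
]


def funnel(stages):
    """Return (label, remaining, dropped_at_this_stage) for each stage."""
    rows = []
    previous = None
    for label, remaining in stages:
        dropped = 0 if previous is None else previous - remaining
        rows.append((label, remaining, dropped))
        previous = remaining
    return rows


rows = funnel(stages)
for label, remaining, dropped in rows:
    print(f"{label:<45} remaining={remaining:>3} dropped={dropped:>3}")
```

Read this way, the per-stage drops add up to the difference between the 68 studies retrieved and the 37 retained.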
the end, the node is divided into two leaves (accepted and
declined offers).
deal with. When a new data point occurs, the Random Forest classifier predicts the conclusion based on the majority of outcomes.

4.3 Logistic Regression
Logistic Regression is an algorithm that can be used for both regression and classification tasks, but it is most commonly used for classification. It is used to predict a categorical dependent variable from independent variables. Consider two classes and a new data point that must be assigned to one of them. The algorithm computes a probability value between 0 and 1. To map its output into this range, Logistic Regression uses the Sigmoid Function, also known as the Logistic Function [33]. LR also does not require independent variables to be linearly related, nor does it require equal variance within each group, making it a less stringent statistical analysis procedure. As a result, logistic regression has been used to predict the likelihood of credit card fraud [34]. The working of LR can be clarified through the following scenario: the target variable is y = 1 when a tumor is malignant, and the x variable could be a measurement of the tumor, such as its size. The logistic function converts the x-values of the dataset's various instances into a range of 0 to 1. The tumor is classified as malignant if the probability exceeds 0.5 (as indicated by the horizontal line). As shown in the figure below:

network receives an input variable, the weight of that input is assigned at random. The weight of each input indicates its importance in predicting the result; the bias parameter, on the other hand, allows you to fine-tune the activation function curve to achieve precise results. After the inputs have been assigned weights, the product of each input and its weight is calculated. Adding all of these products together gives the Weighted Sum; the summation function accomplishes this.
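The logistic function with its 0.5 decision threshold, and the weighted sum with a bias term described for the neural network, can be illustrated together, since a single logistic unit is a weighted sum passed through the sigmoid. The following is a minimal Python sketch under stated assumptions: the function names `sigmoid`, `weighted_sum`, and `predict` are our own, and the weight and bias values are made up for illustration rather than fitted to any data.

```python
import math


def sigmoid(z):
    """Logistic function: maps a real number z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def weighted_sum(inputs, weights, bias):
    """Summation step: add up the input*weight products, then add the bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def predict(inputs, weights, bias, threshold=0.5):
    """Return (probability, label); label is 1 only if probability exceeds threshold."""
    p = sigmoid(weighted_sum(inputs, weights, bias))
    return p, int(p > threshold)


# Hypothetical one-feature model: x is a tumor size in cm.
# The weight and bias below are assumed values, not learned parameters.
weights, bias = [1.5], -4.5

for size in (1.0, 3.0, 5.0):
    p, label = predict([size], weights, bias)
    print(f"size={size:.1f} cm  p(malignant)={p:.3f}  class={label}")
```

In practice the weights and bias would be learned from labelled data; changing the bias shifts the point along the x-axis at which the probability crosses 0.5.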
[13] M. Kanchana, V. Chadda, and H. Jain, "Credit card fraud detection," Int. J. Adv. Sci. Technol., vol. 29, no. 6, pp. 2201–2215, 2020, doi: 10.17148/ijarcce.2016.5109.
[14] A. RB and S. K. KR, "Credit Card Fraud Detection Using Artificial Neural Network," Glob. Transitions Proc., pp. 0–8, 2021, doi: 10.1016/j.gltp.2021.01.006.
[15] R. R. Popat and J. Chaudhary, "A Survey on Credit Card Fraud Detection Using Machine Learning," Proc. 2nd Int. Conf. Trends Electron. Informatics, ICOEI 2018, vol. 25, no. 01, pp. 1120–1125, 2018, doi: 10.1109/ICOEI.2018.8553963.
[16] O. Adepoju, J. Wosowei, S. Lawte, and H. Jaiman, "Comparative Evaluation of Credit Card Fraud Detection Using Machine Learning Techniques," 2019 Glob. Conf. Adv. Technol. GCAT 2019, pp. 1–6, 2019, doi: 10.1109/GCAT47503.2019.8978372.
[17] M. Deepa and D. Akila, "Survey Paper for Credit Card Fraud Detection Using Data Mining Techniques," Int. J. Innov. Res. Appl. Sci. Eng., vol. 3, no. 6, p. 483, 2019, doi: 10.29027/ijirase.v3.i6.2019.483-489.
[18] P. Save, P. Tiwarekar, K. N., and N. Mahyavanshi, "A Novel Idea for Credit Card Fraud Detection using Decision Tree," Int. J. Comput. Appl., vol. 161, no. 13, pp. 6–9, 2017, doi: 10.5120/ijca2017913413.
[19] J. Vimala Devi and K. S. Kavitha, "Fraud Detection in Credit Card Transactions by using Classification Algorithms," Int. Conf. Curr. Trends Comput. Electr. Electron. Commun. CTCEEC 2017, pp. 125–131, 2018, doi: 10.1109/CTCEEC.2017.8455091.
[20] R. R. Popat and J. Chaudhary, "A Survey on Credit Card Fraud Detection Using Machine Learning," Proc. 2nd Int. Conf. Trends Electron. Informatics, ICOEI 2018, no. Icoei, pp. 1120–1125, 2018, doi: 10.1109/ICOEI.2018.8553963.
[21] S. Xuan, G. Liu, Z. Li, L. Zheng, S. Wang, and C. Jiang, "Random forest for credit card fraud detection," ICNSC 2018 - 15th IEEE Int. Conf. Networking, Sens. Control, pp. 1–6, 2018, doi: 10.1109/ICNSC.2018.8361343.
[22] S. Mittal and S. Tyagi, "Performance evaluation of machine learning algorithms for credit card fraud detection," Proc. 9th Int. Conf. Cloud Comput. Data Sci. Eng. Conflu. 2019, pp. 320–324, 2019, doi: 10.1109/CONFLUENCE.2019.8776925.
[23] X. Yu, X. Li, Y. Dong, and R. Zheng, "A Deep Neural Network Algorithm for Detecting Credit Card Fraud," Proc. - 2020 Int. Conf. Big Data, Artif. Intell. Internet Things Eng. ICBAIE 2020, pp. 181–183, 2020, doi: 10.1109/ICBAIE49996.2020.00045.
[24] S. Bagga, A. Goyal, N. Gupta, and A. Goyal, "Credit Card Fraud Detection using Pipeling and Ensemble Learning," Procedia Comput. Sci., vol. 173, pp. 104–112, 2020, doi: 10.1016/j.procs.2020.06.014.
[25] R. San Miguel Carrasco and M.-A. Sicilia-Urban, "Evaluation of Deep Neural Networks for Reduction of Credit Card Fraud Alerts," IEEE Access, vol. 8, pp. 186421–186432, 2020, doi: 10.1109/access.2020.3026222.
[26] G. Kibria and M. Sevkli, "Application of Deep Learning for Credit Card Approval: A Comparison with Two Machine Learning Techniques," no. January, pp. 0–5, 2021, doi: 10.18178/ijmlc.2021.11.4.1049.
[27] D. D. Borse, P. S. H. Patil, and S. Dhotre, "Credit Card Fraud Detection Using Naïve Bayes and C4," vol. 10, no. 1, pp. 423–429, 2021.
[28] P. J. Taylor, T. Dargahi, A. Dehghantanha, R. M. Parizi, and K. K. R. Choo, "A systematic literature review of blockchain cyber security," Digit. Commun. Networks, vol. 6, no. 2, pp. 147–156, 2020, doi: 10.1016/j.dcan.2019.01.005.
[29] V. Patil and U. Kumar Lilhore, "A Survey on Different Data Mining & Machine Learning Methods for Credit Card Fraud Detection," Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol. © 2018 IJSRCSEIT, vol. 5, no. 10, pp. 320–325, 2018, doi: 10.13140/RG.2.2.22116.73608.
[30] "Machine Learning Decision Tree Classification Algorithm - Javatpoint." https://www.javatpoint.com/machine-learning-decision-tree-classification-algorithm (accessed Apr. 03, 2021).
[31] "Machine Learning Random Forest Algorithm - Javatpoint." https://www.javatpoint.com/machine-learning-random-forest-algorithm (accessed Apr. 03, 2021).
[32] A. Mishra and C. Ghorpade, "Credit Card Fraud Detection on the Skewed Data Using Various Classification and Ensemble Techniques," 2018 IEEE Int. Students' Conf. Electr. Electron. Comput. Sci. SCEECS 2018, pp. 1–5, 2018, doi: 10.1109/SCEECS.2018.8546939.
[33] "Introduction to Logistic Regression | by Ayush Pant | Towards Data Science." https://towardsdatascience.com/introduction-to-logistic-regression-66248243c148 (accessed Apr. 03, 2021).
[34] S. Venkata Suryanarayana, G. N. Balaji, and G. Venkateswara Rao, "Machine learning approaches for credit card fraud detection," Int. J. Eng. Technol., vol. 7, no. 2, pp. 917–920, 2018, doi: 10.14419/ijet.v7i2.9356.
[35] "Artificial Neural Networks for Machine Learning - Every aspect you need to know about - DataFlair." https://data-flair.training/blogs/artificial-neural-networks-for-machine-learning (accessed Apr. 03, 2021).
[36] "K-Nearest Neighbor(KNN) Algorithm for Machine Learning - Javatpoint." https://www.javatpoint.com/k-nearest-neighbor-algorithm-for-machine-learning (accessed Apr. 03, 2021).
[37] "K-Means Clustering Algorithm for Machine Learning | by Madison Schott | Capital One Tech | Medium." https://medium.com/capital-one-tech/k-means-clustering-algorithm-for-machine-learning-d1d7dc5de882 (accessed Apr. 03, 2021).