Credit Card Fraud Detection Using Machine Learning
II. LITERATURE REVIEW
Fraud
act
and its preset value. Unconventional techniques such as a hybrid
data mining/complex network classification algorithm are able to
detect illegal instances in an actual card transaction data set.
Based on a network reconstruction algorithm that allows building
representations of the deviation of one instance from a reference
group, they have generally proven efficient on medium-sized online
transaction data. There have also been efforts to approach the
problem from a completely new angle. Attempts have been made to
improve the alert-feedback interaction in case of a fraudulent
transaction: the authorised system would be alerted and feedback
would be sent to deny the ongoing transaction. The Artificial
Genetic Algorithm, one of the approaches that shed new light on
this domain, countered fraud from a different direction. It proved
accurate in finding fraudulent transactions and minimizing the
number of false alerts. Even though, if

This graph shows that the number of fraudulent transactions is much
lower than the legitimate ones.
First, we obtained our dataset from Kaggle, a data analysis website
that provides datasets. There are 31 columns in this dataset, 28 of
which are named V1-V28 to protect sensitive data. The other columns
represent Time, Amount and Class. Time shows the time elapsed
between the first transaction in the dataset and the given one.
Amount is the amount of money transacted. Class 0 stands for a valid
transaction and 1 for a fraudulent one. We draw different charts to
check the data set for inconsistencies and to understand it visually:

This graph shows the times at which transactions were done within
two days. It can be seen that the fewest transactions were made
during night time and the most during the day.
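As a rough illustration of the first consistency check on such a file, the sketch below counts the rows per Class label. It is not the authors' code: the inline sample, the name `class_distribution` and all values are made up for illustration, and only two of the V columns are shown.

```python
import csv
import io
from collections import Counter

# Made-up miniature sample in the column layout described above.
# Only V1 and V2 are shown; the described dataset has V1 through V28.
SAMPLE_CSV = """Time,V1,V2,Amount,Class
0,-1.2,0.3,120.00,0
1,0.8,-0.1,15.50,0
3,-0.4,1.1,60.25,0
7,1.5,0.2,9.99,0
9,-3.9,2.7,1.00,1
"""

def class_distribution(csv_text):
    """Count how many rows carry each Class label (0 = valid, 1 = fraud)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(int(row["Class"]) for row in reader)

counts = class_distribution(SAMPLE_CSV)
fraud_share = counts[1] / sum(counts.values())
print("valid:", counts[0], "fraud:", counts[1], "fraud share:", fraud_share)
```

On a real file the same function could be fed the file contents; a very small fraud share is what the first chart visualises.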
and machine learning. It has various classification, clustering and
regression algorithms and is designed to work with numerical and
scientific libraries.
We used the Jupyter Notebook platform to create a program in
Python to demonstrate the approach proposed in this paper.
This program can also be run in the cloud on the Google Colab
platform, which supports all Python notebook files. Detailed
explanations of the modules, with pseudocode for their algorithms
and output graphs, are given as follows:
A. Local Outlier Factor
It is an unsupervised outlier detection algorithm. "Local
Outlier Factor" refers to the anomaly score of each sample. It
measures the local deviation of the sample data with respect to its
neighbors.
More specifically, locality is given by the k-nearest neighbors,
whose distance is used to estimate the local density. By comparing
the local density of a sample with that of its neighbors, one can
identify samples whose density is significantly lower than that of
their neighbors. These samples are anomalous and are considered
outliers. The pseudocode for this algorithm is as follows:
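The authors' pseudocode is not reproduced in this text. As a stand-in, here is a minimal sketch of how Local Outlier Factor could be applied with scikit-learn's `LocalOutlierFactor`. The synthetic points and the values chosen for `n_neighbors` and `contamination` are assumptions for illustration, not settings taken from the paper.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(1)

# Synthetic stand-in data: 200 points in one dense cluster ("valid")
# plus 4 isolated points placed far away from it ("fraud-like").
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
planted = np.array([[8.0, 8.0], [-9.0, 7.5], [10.0, -8.0], [-8.5, -9.0]])
X = np.vstack([inliers, planted])

# n_neighbors sets the size of the "locality"; contamination is the
# assumed share of outliers and fixes the decision threshold.
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.02)
labels = lof.fit_predict(X)             # +1 = inlier, -1 = outlier
scores = lof.negative_outlier_factor_   # more negative = more anomalous

# Convert to the 0/1 convention used for the Class column.
y_pred = (labels == -1).astype(int)
print("flagged samples:", int(y_pred.sum()))
```

The converted `y_pred` array uses 1 for a suspected fraud and 0 for a valid transaction, so it can be compared directly with the Class values.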
Since the dataset is very large, we used only a fraction of it in
our tests to reduce processing times. The final result with the
complete dataset processed is also determined and is given in the
results section of this paper.
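Drawing such a fraction reproducibly can be sketched as follows. The helper `sample_fraction`, the 10% fraction and the seed are hypothetical choices for illustration; the text does not state which fraction was used.

```python
import random

def sample_fraction(n_rows, frac, seed=1):
    """Return sorted row indices of a random subset holding `frac` of the rows."""
    if not 0.0 < frac <= 1.0:
        raise ValueError("frac must be in (0, 1]")
    n_keep = max(1, int(round(n_rows * frac)))
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return sorted(rng.sample(range(n_rows), n_keep))

# Hypothetical example: keep 10% of 1000 rows.
idx = sample_fraction(n_rows=1000, frac=0.1)
print(len(idx), "rows selected")
```

Because fraudulent rows are rare, a small random subset may contain very few of them, which is one reason to also report results on the full dataset.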
The Isolation Forest 'isolates' observations by randomly
selecting a feature and then randomly selecting a split
value between the maximum and minimum values of the
selected feature.
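The isolation step can be sketched in plain Python as below. This is a simplified illustration of the idea, not the full Isolation Forest algorithm and not the authors' code: it skips subsampling and score normalisation, and the names `isolation_depth` and `mean_isolation_depth` are hypothetical.

```python
import random

def isolation_depth(point, data, rng, depth=0, max_depth=50):
    """Count the random splits needed until `point` is alone in its partition."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    feature = rng.randrange(len(point))      # randomly select a feature
    lo = min(row[feature] for row in data)
    hi = max(row[feature] for row in data)
    if lo == hi:                             # constant feature: stop (simplification)
        return depth
    split = rng.uniform(lo, hi)              # random split value between min and max
    # Keep only the side of the split that contains the point, then recurse.
    if point[feature] < split:
        side = [row for row in data if row[feature] < split]
    else:
        side = [row for row in data if row[feature] >= split]
    return isolation_depth(point, side, rng, depth + 1, max_depth)

def mean_isolation_depth(point, data, n_trees=200, seed=1):
    """Average the isolation depth over many independent random partitionings."""
    rng = random.Random(seed)
    return sum(isolation_depth(point, data, rng) for _ in range(n_trees)) / n_trees

# Synthetic stand-in data: a dense cluster, one central point, one far point.
gen = random.Random(0)
cluster = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]
center, far = (0.0, 0.0), (8.0, 8.0)
data = cluster + [center, far]

depth_far = mean_isolation_depth(far, data)
depth_center = mean_isolation_depth(center, data)
print("far point:", depth_far, "central point:", depth_center)
```

Points that need fewer splits to be separated from the rest are the more anomalous ones, which is the quantity an isolation-based anomaly score builds on.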
Recursive partitioning can be represented
fraudulent transaction.
This result is compared to the class values to check for false
positives.
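A minimal sketch of that comparison, assuming predictions and Class values are both given as 0/1 lists, is shown below. The labels are made up for illustration and `confusion_counts` is a hypothetical helper name.

```python
def confusion_counts(y_true, y_pred):
    """Compare predictions with Class values (0 = valid, 1 = fraud)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1   # fraud correctly flagged
        elif pred == 1 and truth == 0:
            fp += 1   # valid transaction wrongly flagged (false positive)
        elif pred == 0 and truth == 0:
            tn += 1   # valid transaction correctly passed
        else:
            fn += 1   # fraud that was missed (false negative)
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}

# Made-up labels for illustration only.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]
y_pred = [0, 1, 0, 0, 0, 0, 1, 0, 0, 1]
result = confusion_counts(y_true, y_pred)
print(result)
```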
share information because of market competition, and also for legal
reasons and to protect the data of their users. So we looked up some reference
papers that followed similar approaches and collected their results. As stated in one
of these reference works:
algorithm, the accuracy increases to 33%. This high percentage of
accuracy is to be expected due to the large imbalance between the
number of valid and fraudulent transactions.
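The effect of class imbalance on accuracy can be seen with a small worked example. The counts below are hypothetical and are not taken from the dataset: a detector that flags nothing at all still scores a very high accuracy while catching no fraud.

```python
def accuracy(y_true, y_pred):
    """Share of transactions whose predicted label matches the Class value."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Share of actual frauds that were flagged as fraud."""
    frauds = sum(y_true)
    caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return caught / frauds if frauds else 0.0

# Hypothetical imbalanced sample: 998 valid and 2 fraudulent transactions.
y_true = [0] * 998 + [1] * 2
# A trivial "detector" that labels every transaction as valid.
y_pred = [0] * 1000

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print("accuracy:", acc, "recall:", rec)
```

This is why accuracy alone says little on such data, and why false positives and missed frauds are worth reporting alongside it.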
Since the entire dataset consists of only two days' transaction records,
it is only a fraction of the data that could be made available if this
project were to be used on a commercial scale. Being based on machine
learning algorithms, the program will only increase its efficiency over
time as more data is put into it.

VII. FUTURE ENHANCEMENTS
Although we couldn't reach the goal of 100% accuracy in fraud detection,
we ended up developing a system that, given enough time and data, can
come very close to that goal. As with any project of this nature, there
is still room for improvement.
The nature of this project allows multiple algorithms to be integrated
as modules and their results to be combined to increase the accuracy of
the final result. This model can be further improved by adding more
algorithms. However, the output of these algorithms must be in the same
format as the others. Once this condition is met, the modules can be
easily added in the code. This gives the project a high degree of
modularity and versatility.
Further possibilities for improvement can be found in the data set. As
previously shown, the accuracy of the algorithms increases as the size
of the data set increases. Therefore, more data will certainly make the
model more accurate in detecting fraud and reducing the number of false
positives. However, this requires official support from the banks
themselves.

REFERENCES
[1] "Detection of credit card fraud based on transaction behavior" by
John Richard D. Kho and Larry A. Vea, published by Proc. of the 2017 IEEE
Region 10 Conference (TENCON), Malaysia, November 5-8, 2017
[2] Clifton Phua, Vincent Lee, Kate Smith and Ross Gayler, "A
comprehensive investigation into data mining based fraud detection
Research," published by the School of Business Systems, Faculty of
Information Technology, Monash University, Wellington Road, Clayton,
Victoria 3800, Australia
[3] "Credit Card Fraud Detection Survey Paper" by Suman, Research
Scholar, GJUS&T Hisar HCE, Sonepat, published by International Journal
of Advanced Research in Computer Engineering and Technology (IJARCET),
Volume 3, Issue 3, March 2014
[4] "Research on credit card fraud detection model based on distance
sum" by Wen-Fang Yu and Na Wang, published in 2009 International Joint
Conference on Artificial Intelligence
[5] "Detection of credit card fraud through parenclitic network
analysis" by Massimiliano Zanin, Miguel Romance, Regino Criado and
Santiago Moral, published by Hindawi Complexity, Volume 2018, Article ID
5764370, 9 pages
[6] "Detecting Credit Card Fraud: A Realistic Modeling and Novel
Learning Strategy," published by IEEE Transactions on Neural Networks
and Learning Systems, Vol. 29, No. 8, August 2018
[7] "Credit Card Fraud Detection" by Ishu Trivedi, Monika, Mrigya and
Mridushi, published by the International Journal of Advanced Research in
Computer and Communication Technology, Vol. 5, Issue 1, January 2016
[8] David J. Weston, David J. Hand, M. Adams, Whitrow and Piotr
Juszczak, "Plastic card fraud detection using peer group analysis,"
Springer, Edition 2008.