Python Program Final Pp122222

Uploaded by

neilmangwani
Acknowledgement

This project was prepared in partial fulfilment of the requirements for the Master of Business Administration in Information Technology. The satisfaction of completing this task would be incomplete without heartfelt thanks to the people whose consent, guidance and support made this work successful. In carrying out this undergraduate project I have been fortunate to receive help, support and encouragement from many people, and I would like to acknowledge their help and cooperation. My first thanks go to SPPU University for designing such a worthy syllabus and making us do this project. My next thanks go to the Pune Institute of Management Science and Entrepreneurship, without whose help this project would not have been possible. This list includes Dr. Sheena Abraham, without whose guidance my project would have been impossible to complete. This project has been a wonderful experience in which I have learnt and experienced many beneficial things.

ABSTRACT

It is vital that credit card companies are able to identify fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Such problems can be tackled with data science, and its importance, along with that of machine learning, cannot be overstated. This project intends to illustrate the modelling of a data set using machine learning for credit card fraud detection. The credit card fraud detection problem involves modelling past credit card transactions together with the knowledge of which ones turned out to be fraudulent. This model is then used to recognize whether a new transaction is fraudulent or not. Our objective here is to detect 100% of the fraudulent transactions while minimizing the incorrect fraud classifications. Credit card fraud detection is a typical example of classification. In this process, we have focused on analysing and preprocessing data sets as well as the deployment of multiple anomaly detection algorithms, such as Local Outlier Factor, on the PCA-transformed credit card transaction data. The significance of the techniques reviewed here lies in the minimization of credit card fraud. Yet there are still ethical issues when genuine credit card customers are misclassified as fraudulent.

Chapter Topic

1 Introduction
Company profile
1.1 Introduction
1.2 Objectives
1.3 Motivation
1.4 Overview of the Project
1.5 Chapter wise Summary

2 Analysis and Design
2.1 Functional Requirements
2.2 Non-Functional Requirements
2.3 Architecture
2.4 Use case diagram
2.5 Sequence Diagram

3 Implementation
3.1 Modules Description
3.2 Implementation Details
3.3 Tools used
3.4 Instructor Notes

4 Test results/experiments/verification
4.1 Testing
4.2 Results
4.3 Verification

5 Conclusions

6 Reference

CHAPTER 1

INTRODUCTION

COMPANY PROFILE

It is important that you read this notice, together with any other privacy notice we may provide on specific
occasions when we are collecting or processing personal information about you so that you are aware of
how and why we are using such information.

A “data controller” is responsible for deciding how to hold and use personal data. Personal data is
information or data from which you can be identified and is about you. Shield Security Services LTD (“the
Company”) is, therefore, a “data controller” in relation to the personal data that we receive in connection with
your instructions for the provision of our services.

We are required under data protection legislation to notify you of the information contained in this privacy notice, and it is important that you understand it. If there is anything in this notice that you do not understand, please speak to our Data Protection Appointed Person, who can be contacted via email at [email protected].

THE INFORMATION THAT WE HOLD ABOUT YOU

In order that we can provide our services to you, we will collect, store and use some or all of the
following categories of personal information about you depending on your instructions and the services we
provide:

Category: Examples
Personal Contact Details: Name, title, addresses, telephone numbers, personal email addresses
Biographical Data: Date of birth, gender, marital status
Financial Data: Bank account details, payment card details

We do not normally collect, store or use more Sensitive Personal Data or “special categories of data”.

Modules

Bankruptcy credit card fraud:

In bankruptcy, fraud usually occurs at the expense of a creditor. For instance, suppose you aren’t completely honest about your income when you apply for a credit card. Or maybe you hide money from your business partner. Or perhaps when you closed your restaurant, you sold off all the tables, commercial ovens, and the like. If you put large charges on your credit card shortly before filing for bankruptcy, your credit card company might argue that the charges shouldn’t be wiped out by your bankruptcy discharge.

For more on issues that arise regarding your credit cards in bankruptcy, see Credit Card Debt & Bankruptcy.

Bankruptcy law states that any debt obtained by fraud, misrepresentation, or false pretenses is nondischargeable in bankruptcy. A common type of fraud is presumptive credit card abuse. Using your credit cards for luxury items (such as expensive purses, jewelry, or going to the theater) during the 90 days before you file for bankruptcy is presumed fraudulent. You can, however, use credit for necessities of life, such as food and car repairs.

Other types of fraud also exist, for example fiduciary fraud (such as fraud between business partners), larceny, and embezzlement.

Application fraud

Application fraud is a type of identity theft that involves opening a credit card account in another person’s name. By opening an account using stolen information, an individual can cause numerous problems for the victim, including debt and negative credit ratings. In the United States, identity theft is a federal offense. Several government agencies and consumer protection organizations have developed strategies to help victims, and potential victims, deal with fraud on credit card applications. Typically, the applicant either pretends to be someone else or uses his or her real name with false contact information. The applicant might use stolen documents or have copies of the victim’s personal information, such as utility bills or other financial statements. Credit card application fraud can have many negative consequences for its victims. Once a fake application is approved, someone can accumulate potentially limitless charges using another person’s identity.

Counterfeit credit card fraud:

It is usually committed through skimming. This means that a fake magnetic strip holds all your card details. This fake strip is then used to create a fraudulent card that is fully functional. Essentially, it is an exact copy, which means fraudsters can simply swipe it in a machine to pay for goods.

This type of fraud can also be committed by someone who knows your card details. They can use this information to create a so-called ‘fake plastic’. Here, the magnetic strip or the chip on the card doesn’t actually work. However, it is often easy enough to convince a merchant that there is something wrong with the card, at which point they will enter the transaction by hand.

CNP (Card Not Present) Fraud:

If somebody knows the expiry date and account number of your card, they can commit CNP fraud against you. This can be done by phone, mail or internet. It essentially means that somebody uses your card without actually being in physical possession of it. More and more often, merchants require the card verification code, making CNP fraud slightly more difficult, but if a fraudster can get your account number, they probably know that number too. Additionally, there are only 1,000 possible combinations for a three-digit verification code (000 to 999). As such, many criminals attempt to order items of very low value until they figure out the right number. Be on the lookout, therefore, for small payments on your statements.
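
The advice to watch for small payments can be sketched in code. The example below is only an illustration and is not part of the project: the function name `flag_card_testing`, the `(card_id, timestamp, amount)` tuple layout and the thresholds (payments of at most 5.00, three or more within ten minutes) are all assumptions chosen for demonstration.

```python
from collections import defaultdict


def flag_card_testing(transactions, max_amount=5.00, min_count=3, window_seconds=600):
    """Return the set of card IDs with at least `min_count` small payments
    inside any window of `window_seconds`.

    `transactions` is an iterable of (card_id, timestamp_seconds, amount).
    All thresholds are illustrative assumptions, not industry values.
    """
    small_times = defaultdict(list)
    for card_id, timestamp, amount in transactions:
        if amount <= max_amount:
            small_times[card_id].append(timestamp)

    flagged = set()
    for card_id, times in small_times.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Shrink the window until it spans at most window_seconds.
            while times[end] - times[start] > window_seconds:
                start += 1
            if end - start + 1 >= min_count:
                flagged.add(card_id)
                break
    return flagged


# Hypothetical sample data: card_A shows three small payments in five minutes.
sample = [
    ("card_A", 0, 1.00), ("card_A", 120, 0.50), ("card_A", 300, 2.00),
    ("card_B", 0, 1.00), ("card_B", 5000, 1.50), ("card_B", 9000, 80.00),
]
print(flag_card_testing(sample))
```

A rule like this would only be a first filter; a real system would tune the thresholds on historical data.
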
Phishing:

Phishing is a type of fraud that involves stealing personal information such as Customer ID, IPIN, credit/debit card number, card expiry date, CVV number, etc. through emails that appear to be from a legitimate source, say HDFC Bank. Nowadays, phishers also use phone calls (voice phishing) and SMS (smishing). Phishing is when thieves pretend to represent legitimate companies, contact consumers and extract their credit card information.

Then the phishers go shopping. For the victims, it’s not phunny. The consequences can be dire. If you fall victim to a credit card phishing scam, the perpetrators can gain access to your credit card numbers and a lot of other personal information. They can use your credit card to go on shopping sprees, and they can use your personal information to steal your identity.

Skimming:

Credit card skimming is a type of fraud in which a small electronic device known as a ‘skimmer’ is used to steal credit card information. This usually occurs when the credit card is inserted into an electronic device such as a Point of Sale (PoS) terminal. Credit card skimmers are devices that enable thieves to steal card data and use it for fraudulent transactions. They are added to card reader devices to capture your information. Victims of credit card skimming are completely blindsided by the theft. They notice fraudulent charges or money withdrawn from their accounts, but their credit and debit cards never left their possession.

Card-not-present fraud:

It is a type of credit card scam in which the customer does not physically present the card to the merchant during the fraudulent transaction. Card-not-present fraud can occur with transactions that are conducted online or over the phone. In this article, we’ll take a closer look at this disturbing trend, and at ways that merchants can protect themselves.

The term “card not present,” or CNP, refers to transactions that depend on manual entry of credit card information without the physical presence of the credit card itself.

Problem Statement

The credit card fraud detection problem includes modeling past credit card transactions with the knowledge of the ones that turned out to be fraud. This model is then used to identify whether a new transaction is fraudulent or not. Our aim here is to detect 100% of the fraudulent transactions while minimizing the incorrect fraud detection classification.
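
The two goals in this problem statement pull against each other, which a small sketch can make concrete. In standard terminology, the share of fraud cases caught is the recall, and the share of fraud alarms that are correct is the precision. The function name and the two hypothetical models below are assumptions for illustration, not results from this project.

```python
def recall_precision_false_alarms(y_true, y_pred):
    """Compute recall, precision and the false-alarm count for binary labels
    where 1 means fraud and 0 means valid."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision, fp


# Hypothetical labels: 2 fraud cases among 10 transactions.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
cautious = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]    # flags little, misses one fraud
aggressive = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # catches all fraud, two false alarms

print(recall_precision_false_alarms(y_true, cautious))
print(recall_precision_false_alarms(y_true, aggressive))
```

The aggressive model reaches the 100% detection goal, but only by raising false alarms against genuine customers.
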

Overview of the Project

SPD Group was contacted by an e-commerce and financial services company, whose products and services can be paid for using Mobile Money or a bank card (e.g., Visa and MasterCard), to make its platform a safer place for its customers’ online transactions. Credit card fraud detection based on the analysis of cardholders’ existing purchase data is a promising way to reduce the rate of successful credit card fraud. The scope of the research is focused on implementing a credit card fraud detection system to combat the increasing cyber-crime faced by our country.

Motivation

Nowadays most transactions take place online, meaning that credit cards and other online payment systems are involved. This method is convenient both for the company and for the consumer. Consumers save time because they don’t have to go to the store to make their purchases, and companies save money by not owning physical stores and avoiding expensive rental payments. It seems that the digital age brought some highly useful features which changed the way that companies and consumers interact with each other, but at one cost: companies need to hire skilled software engineers and penetration testers to make sure that all the transactions are legal and non-fraudulent.

Those people design the company’s servers in a way that the client has no control over critical transaction parts such as the payment amount. With careful design most (if not all) of the problems can be eliminated, but even the framework which was used to create the server is not perfect. For example, if you follow the Django framework release notes, you will see that there are many bug fixes among versions, thus a company should not rely only on its engineering skills.

CHAPTER 2

Analysis and Design

2.1 Functional Requirements

A functional requirement defines a function of a system or its component. A function is described as a set of inputs, the behavior, and outputs. Functional requirements may be calculations, technical details, data manipulation and processing, and other specific functionality that define what a system is supposed to accomplish.

2.2 Non-Functional Requirements (NFR)

A non-functional requirement specifies a quality attribute of a software system. Such requirements judge the software system based on responsiveness, usability, security, portability and other non-functional standards that are critical to the success of the software system. An example of a non-functional requirement is “how fast does the website load?” Failing to meet non-functional requirements can result in systems that fail to satisfy user needs.

Non-functional requirements allow you to impose constraints or restrictions on the design of the system across the various agile backlogs. For example, the site should load within 3 seconds when the number of simultaneous users is greater than 10,000. Describing non-functional requirements is just as critical as describing functional requirements.

Merchants are assigned to either the high-risk or the low-risk category by credit card processors. Since most processors opt to do business with low-risk merchants, there aren’t a whole lot of merchant services that work with high-risk merchants.

The Credit Card Processing System (also known as a Credit Card Payment Gateway) is the subject, i.e. the system under design or consideration. The primary actor for the system is a Merchant’s Credit Card Processing System. The merchant submits a credit card transaction request to the credit card payment gateway on behalf of a customer.

The instances of class objects involved in this UML sequence diagram of the Credit Card Approval System are as follows. This is the UML sequence diagram of Training Management System, which shows the interaction between the objects of Training, Job, Knowledge, College, Qualification.

This is a component diagram of the Credit Card Approval System, which shows components, provided and required interfaces, ports, and relationships between the Limits, Credit Card, Consumer, Document and Application. This type of diagram is used in Component-Based Development (CBD) to describe systems with Service-Oriented Architecture (SOA).
CHAPTER 3

Implementation

3.1. Modules Description

Credit Card: A credit card is a thin rectangular slab of plastic issued by a financial company that lets cardholders borrow funds with which to pay for goods and services.

Fraudulent: Obtained, done by, or involving deception, especially criminal deception.

Verification: The process of establishing the truth, accuracy, or validity of something.

Consumer: A person who purchases goods and services for personal use.

Transaction: An instance of buying or selling something.

Purchase: Acquire (something) by paying for it; buy.

3.2. Implementation Details


Importing all the necessary Libraries.
Loading the Data.
Understanding the Data.
Describing the Data.

1. Multiple libraries and frameworks need to be imported to perform tasks in a data science case study. Every time a data scientist or analyst starts a new Jupyter notebook or any other IDE, they need to import all the libraries their work requires. Writing the same import statements over and over again can be frustrating. We see that our data is highly imbalanced. Therefore, we cannot use a supervised learning algorithm directly, because it will overfit to the ‘Normal’ examples. I encourage you to execute the code snippets on your local machine to see the results better.

2. The blue line indicates the actual Gaussian distribution, while the red one shows our data’s probability density function. We see that almost every feature comes from the normal (or Gaussian) distribution except the ‘Time’ one. Therefore, this will be our motivation to do fraud detection with the multivariate Gaussian distribution. This method works only for features from the Gaussian distribution, thus if your data is not Gaussian-like you can transform it to be with a method that I describe in an article. For example, you can try this technique on features ‘V26’, ‘V4’, ‘V1’ and see if you get improvements in the final score.

3. The ‘Time’ feature comes from a bimodal distribution, which can’t be transformed to be Gaussian-like, thus we will discard it. Another reason to discard the ‘Time’ feature is that it doesn’t seem to contain extreme values like the features shown in the other graphs.
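
The Gaussian-based detection idea mentioned in this section can be sketched as follows. This is a simplified illustration, not the project's code: it treats the features as independent (one Gaussian per feature, i.e. a diagonal covariance), which is an assumption made so that only the standard library is needed. The toy rows, the helper names and the way the threshold `epsilon` is picked are all hypothetical.

```python
import math
import statistics


def fit_gaussian(rows):
    """Estimate a mean and a standard deviation for each feature column."""
    columns = list(zip(*rows))
    means = [statistics.fmean(column) for column in columns]
    stds = [statistics.pstdev(column) for column in columns]  # must be non-zero
    return means, stds


def log_density(row, means, stds):
    """Log of the product of the per-feature Gaussian densities."""
    total = 0.0
    for x, mu, sigma in zip(row, means, stds):
        total += -0.5 * math.log(2 * math.pi * sigma ** 2)
        total -= (x - mu) ** 2 / (2 * sigma ** 2)
    return total


# Hypothetical 'Normal' transactions with two Gaussian-like features.
normal_rows = [(0.1, -0.2), (-0.3, 0.4), (0.2, 0.1),
               (0.0, -0.1), (-0.1, 0.3), (0.3, -0.4)]
means, stds = fit_gaussian(normal_rows)

# Illustrative threshold: slightly below the least likely training row.
epsilon = min(log_density(row, means, stds) for row in normal_rows) - 1.0


def is_anomaly(row):
    """Flag a transaction whose log density falls below the threshold."""
    return log_density(row, means, stds) < epsilon


print(is_anomaly((0.0, 0.0)))   # close to the training data
print(is_anomaly((5.0, -6.0)))  # far from the training data
```

In practice the threshold would be chosen on a labelled validation set rather than from the training rows.
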

3.3. Tools used

Google Colab notebook (online).
Python.
Jupyter Notebook.

Jupyter Notebook:

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.

3.4 Instructor Notes


Some meta-notes about this analysis:

● If using the full data, this analysis is extremely computationally intensive. My laptop’s fan was working overtime. In general, RStudio Cloud is not really up to this task due to the memory limit.

● To alleviate some of the stress of repeating these computations while checking the knitted output, the results are cached. This is done in the first chunk of the analysis. Beware that caching results can sometimes lead to odd issues that are best solved by deleting the cache and running from scratch again.

● Another method used to reduce the computational burden is the use of parallel operations. The doParallel and parallel packages are used to set up a parallel backend, which caret automatically leverages if it exists.

● SMOTE was used instead of ROSE, as issues were uncovered when combining ROSE with glm. It seems to be related to how various functions are defining the positive class. (And also whether factor coercion is happening within those functions.)

● While not tried here, subsampling within random forests for class imbalance might perform well in this scenario.

● In practice, this application would likely involve streaming data.

● The createDataPartition function from the caret package was used to do the data splitting. This allows for stratified sampling within the classes of the response variable. Often, this might not really be necessary, but with such imbalanced data, it guarantees that the imbalance is the same in both datasets. As a result of the splitting, the test data had only 98 positive (fraud) examples.
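
The stratified splitting described in the last note refers to an R function. Since the code in this report is Python, the idea can be sketched with the standard library as below; the function name `stratified_split` and the toy labels are assumptions for illustration. In scikit-learn, passing the labels to the `stratify` parameter of `train_test_split` serves the same purpose.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices) so that each class contributes
    roughly the same fraction of its rows to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical imbalanced labels: 990 valid (0) and 10 fraud (1).
labels = [0] * 990 + [1] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
print(sum(labels[i] for i in train_idx), sum(labels[i] for i in test_idx))
```

Both parts keep the same 1% fraud share, which a purely random split of such a small minority class would not guarantee.
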

Credit Card Fraud Detection

The challenge is to recognize fraudulent credit card transactions so that the customers of credit card companies are not charged for items that they did not purchase.

The main challenges involved in credit card fraud detection are:

Enormous amounts of data are processed every day, and the model built must be fast enough to respond to the scam in time.
Imbalanced data, i.e. most of the transactions (99.8%) are not fraudulent, which makes it really hard to detect the fraudulent ones.
Data availability, as the data is mostly private.
Misclassified data can be another major issue, as not every fraudulent transaction is caught and reported.
Adaptive techniques used against the model by the scammers.

How to tackle these challenges?

The model used must be simple and fast enough to detect the anomaly and classify it as a fraudulent transaction as quickly as possible.
Imbalance can be dealt with by properly using some methods which we will talk about in the next paragraph.
For protecting the privacy of the user, the dimensionality of the data can be reduced.
A more trustworthy source must be taken which double-checks the data, at least for training the model.
We can make the model simple and interpretable so that when the scammer adapts to it, with just some tweaks we can have a new model up and running to deploy.
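
One common way to deal with the imbalance listed above is random oversampling of the minority class. The sketch below shows that technique only as an example; it is not necessarily the method the author had in mind, and the function name `random_oversample` and the toy data are assumptions.

```python
import random


def random_oversample(rows, labels, minority_label=1, seed=0):
    """Duplicate randomly chosen minority rows until both classes have the
    same number of rows. Intended for training data only."""
    rng = random.Random(seed)
    minority = [row for row, label in zip(rows, labels) if label == minority_label]
    majority_count = sum(1 for label in labels if label != minority_label)

    new_rows, new_labels = list(rows), list(labels)
    if not minority:
        return new_rows, new_labels  # nothing to duplicate

    for _ in range(max(0, majority_count - len(minority))):
        new_rows.append(rng.choice(minority))
        new_labels.append(minority_label)
    return new_rows, new_labels


# Hypothetical data: 8 valid rows and 2 fraud rows.
rows = [(i,) for i in range(10)]
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

Oversampling should be applied after the train/test split, so that duplicated fraud rows do not leak into the test set.
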

# import the necessary packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import gridspec

Code : Loading the Data

# Load the dataset from the csv file using pandas
# best way is to mount the drive on colab and
# copy the path for the csv file
data = pd.read_csv("credit.csv")

Code : Understanding the Data

# Grab a peek at the data
data.head()

Code : Describing the data

# Print the shape of the data
# data = data.sample(frac = 0.1, random_state = 48)
print(data.shape)
print(data.describe())

Code : Imbalance in the data

Time to explain the data we are dealing with.

# Determine number of fraud cases in dataset
fraud = data[data['Class'] == 1]
valid = data[data['Class'] == 0]
outlierFraction = len(fraud)/float(len(valid))
print(outlierFraction)
print('Fraud Cases: {}'.format(len(fraud)))
print('Valid Transactions: {}'.format(len(valid)))

Only 0.17% of all the transactions are fraudulent. The data is highly unbalanced. Let’s first apply our models without balancing it, and only if we don’t get good accuracy will we find a way to balance this dataset.
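
A short calculation shows why accuracy alone can mislead at this level of imbalance. The sketch below uses hypothetical counts chosen to match the 0.17% figure (17 fraud cases in 10,000 transactions) and a trivial "model" that labels everything as valid; the helper names are assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual fraud cases (label 1) that were predicted as fraud."""
    fraud_total = sum(1 for t in y_true if t == 1)
    fraud_caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fraud_caught / fraud_total if fraud_total else 0.0


# Hypothetical data set: 17 fraud cases out of 10,000 transactions (0.17%).
y_true = [1] * 17 + [0] * 9983
always_valid = [0] * 10000  # a 'model' that never predicts fraud

print(accuracy(y_true, always_valid))
print(recall(y_true, always_valid))
```

The trivial model is correct on 99.83% of transactions while catching no fraud at all, so recall and precision are worth reading alongside accuracy.
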

Code : Print the amount details for Fraudulent Transaction

print("Amount details of the fraudulent transaction")
print(fraud.Amount.describe())

Code : Print the amount details for Normal Transaction

print("details of valid transaction")
print(valid.Amount.describe())

As we can clearly see from this, the average transaction amount is higher for the fraudulent transactions. This makes the problem crucial to deal with.

Code : Plotting the Correlation Matrix

The correlation matrix graphically gives us an idea of how features correlate with each other and can help us predict which features are most relevant for the prediction.

# Correlation matrix
corrmat = data.corr()
fig = plt.figure(figsize = (12, 9))
sns.heatmap(corrmat, vmax = .8, square = True)
plt.show()

In the heatmap we can clearly see that most of the features do not correlate with other features, but there are some features that have either a positive or a negative correlation with each other. For example, V2 and V5 are highly negatively correlated with the feature called Amount. We also see some correlation between V20 and Amount. This gives us a deeper understanding of the data available to us.
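
Each cell of such a heatmap is a correlation coefficient between two columns. Assuming the default Pearson method, the value for one pair of features can be computed by hand as sketched below. The column values are made up for illustration and are not taken from the data set.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical columns: one falls as the amount rises, one is unrelated.
amount = [10, 20, 30, 40, 50]
falling_feature = [5, 4, 3, 2, 1]
unrelated_feature = [1, -1, 0, -1, 1]

print(pearson(amount, falling_feature))
print(pearson(amount, unrelated_feature))
```

A value near -1 corresponds to the strongly negative cells in the heatmap, and a value near 0 to the cells showing no linear relationship.
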

Code : Separating the X and the Y values

Dividing the data into input parameters and output value format.

# dividing the X and the Y from the dataset
X = data.drop(['Class'], axis = 1)
Y = data["Class"]
print(X.shape)
print(Y.shape)
# getting just the values for the sake of processing
# (it's a numpy array with no columns)
xData = X.values
yData = Y.values

Training and Testing Data Bifurcation

We will be dividing the dataset into two main groups: one for training the model and the other for testing our trained model’s performance.

# Using Scikit-learn to split data into training and testing sets
from sklearn.model_selection import train_test_split
# Split the data into training and testing sets
xTrain, xTest, yTrain, yTest = train_test_split(
    xData, yData, test_size = 0.2, random_state = 42)

Code : Building a Random Forest Model using scikit learn

# Building the Random Forest Classifier (RANDOM FOREST)
from sklearn.ensemble import RandomForestClassifier
# random forest model creation
rfc = RandomForestClassifier()
rfc.fit(xTrain, yTrain)
# predictions
yPred = rfc.predict(xTest)

Code : Building all kinds of evaluating parameters

# Evaluating the classifier
# printing every score of the classifier
from sklearn.metrics import classification_report, accuracy_score
from sklearn.metrics import precision_score, recall_score
from sklearn.metrics import f1_score, matthews_corrcoef
from sklearn.metrics import confusion_matrix

n_outliers = len(fraud)
n_errors = (yPred != yTest).sum()
print("The model used is Random Forest classifier")

acc = accuracy_score(yTest, yPred)
print("The accuracy is {}".format(acc))

prec = precision_score(yTest, yPred)
print("The precision is {}".format(prec))

rec = recall_score(yTest, yPred)
print("The recall is {}".format(rec))

f1 = f1_score(yTest, yPred)
print("The F1-Score is {}".format(f1))

MCC = matthews_corrcoef(yTest, yPred)
print("The Matthews correlation coefficient is {}".format(MCC))
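
To show what the last two scores measure, the sketch below computes the F1-score and the Matthews correlation coefficient directly from confusion-matrix counts using their standard definitions. The counts are hypothetical and are not results from this project; the function name `f1_and_mcc` is also an assumption.

```python
import math


def f1_and_mcc(tp, fp, fn, tn):
    """Compute the F1-score and Matthews correlation coefficient from the
    four confusion-matrix counts of a binary classifier."""
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return f1, mcc


# Hypothetical counts: 100 fraud cases, 80 of them caught, 20 false alarms.
f1, mcc = f1_and_mcc(tp=80, fp=20, fn=20, tn=9880)
print(f1)
print(mcc)
```

Unlike accuracy, neither score is inflated by the large number of correctly classified valid transactions alone, since both drop to 0 when no fraud is caught.
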

Code : Visualizing the Confusion Matrix

# printing the confusion matrix
LABELS = ['Normal', 'Fraud']
conf_matrix = confusion_matrix(yTest, yPred)
plt.figure(figsize = (12, 12))
sns.heatmap(conf_matrix, xticklabels = LABELS,
            yticklabels = LABELS, annot = True, fmt = "d")
plt.title("Confusion matrix")
plt.ylabel('True class')
plt.xlabel('Predicted class')
plt.show()

CHAPTER 4

Test results/experiments/verification

Card testing happens when fraudsters test stolen credit card details by making small online purchases. The fraudsters need to check the validity of the credit card details, and once they confirm that the credit card is valid they proceed to make larger fraudulent purchases.

The data is divided into input parameters and output values, and the dataset is split into two main groups: one for training the model and the other for testing the trained model’s performance. Implementing a robust fraud detection system will result in additional benefits, such as better analytics and predictive forecasts, since credit card frauds often bear a recognizable pattern.

Results

The code prints out the number of false positives it detected and compares it with the actual values. This is used to calculate the accuracy score and precision of the algorithms. The fraction of data we used for faster testing is 10% of the entire dataset. The complete dataset is also used at the end, and both results are printed. These results, along with the classification report for each algorithm, are given in the output, where class 0 means the transaction was determined to be valid and 1 means it was determined to be a fraudulent transaction. This result is matched against the class values to check for false positives.
Results when 10% of the dataset is used:

Clearly, credit card fraud is an act of criminal dishonesty. This article has reviewed recent findings in the credit card field. This paper has identified the different types of fraud, such as bankruptcy fraud, counterfeit fraud, theft fraud, application fraud and behavioral fraud, and discussed measures to detect them. Such measures have included pairwise matching, decision trees, clustering techniques, neural networks, and genetic algorithms. From an ethical perspective, it can be argued that banks and credit card companies should attempt to detect all fraudulent cases. Yet the unprofessional fraudster is unlikely to operate on the scale of the professional fraudster, and so the cost to the bank of detecting them may be uneconomic.

The bank would then be faced with an ethical dilemma. Should they try to detect such fraudulent cases, or should they act in shareholder interests and avoid uneconomic costs? As the next step in this research program, the focus will be upon the implementation of a ‘suspicious’ scorecard on a real data set and its evaluation. The main tasks will be to build scoring models to predict fraudulent behavior, taking into account the fields of behavior that relate to the different types of credit card fraud identified in this paper, and to evaluate the associated ethical implications. The plan is to take one of the European countries, probably Germany, and then to extend the research to other EU countries.

CONCLUSION

Creating a website for credit card fraud detection.

Credit card fraud is without a doubt an act of criminal dishonesty. This article has listed the most common methods of fraud along with their detection methods and reviewed recent findings in this field. This paper has also explained in detail how machine learning can be applied to get better results in fraud detection, along with the algorithm, pseudocode, an explanation of its implementation and experimentation results. While the algorithm does reach over 99.6% accuracy, its precision remains only at 28% when a tenth of the data set is taken into consideration. However, when the entire dataset is fed into the algorithm, the precision rises to 33%. This high accuracy is to be expected due to the huge imbalance between the number of valid and the number of fraudulent transactions.

Further Scope:

While we couldn’t reach our goal of 100% accuracy in fraud detection, we did end up creating a system that can, with enough time and data, get very close to that goal. As with any such project, there is some room for improvement here. The very nature of this project allows multiple algorithms to be integrated together as modules, and their results can be combined to increase the accuracy of the final result. This model can be further improved by adding more algorithms to it. However, the output of these algorithms needs to be in the same format as the others. Once that condition is satisfied, the modules are easy to add, as done in the code. This provides a great degree of modularity and versatility to the project. More room for improvement can be found in the dataset. As demonstrated before, the precision of the algorithms increases when the size of the dataset is increased. Hence, more data is likely to make the model more accurate in detecting frauds and to reduce the number of false positives. However, this requires official support from the banks themselves.
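
The idea of combining several algorithms whose outputs share one format can be sketched as a simple majority vote. This is only one possible way to combine results and is not taken from the project code; the function name `majority_vote` and the three hypothetical models are assumptions. Here the shared format is a list of 0/1 labels of equal length.

```python
def majority_vote(predictions_per_model):
    """Combine equal-length lists of 0/1 predictions, one list per model.
    A transaction is labelled fraud (1) when more than half the models say so."""
    n_models = len(predictions_per_model)
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(1 if sum(votes) * 2 > n_models else 0)
    return combined


# Hypothetical outputs of three models for the same four transactions.
model_a = [1, 0, 0, 1]
model_b = [1, 1, 0, 0]
model_c = [0, 1, 0, 1]

print(majority_vote([model_a, model_b, model_c]))
```

Adding another module then only requires appending its prediction list, which matches the modularity described above.
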

REFERENCES

1. Clifton Phua, Vincent Lee, Kate Smith & Ross Gayler, “A Comprehensive Survey of Data Mining-based Fraud Detection Research”, School of Business Systems, Faculty of Information Technology, Monash University, Wellington Road, Clayton, Victoria 3800, Australia.

2. Suman, “Survey Paper on Credit Card Fraud Detection”, Research Scholar, GJUS&T Hisar HCE, Sonepat, International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), Volume 3, Issue 3, March 2014.

3. https://www.kaggle.com/renjithmadhavan/credit-card-fraud-detection-using-pythoN

4. https://colab.research.google.com/notebooks/intro.ipynb#recent=true

5. https://en.wikipedia.org/wiki/Machine_learning
