International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1359
Credit Card Fraud Detection Using Machine Learning & Data Science
Ishika Sharma1, Shivjyoti Dalai2, Venktesh Tiwari3, Ishwari Singh4, Seema Kharb5
1,2,3 Students, Computer Science Engineering, SRM University, Sonipat
4Asst. Professor, Dept. of Computer Science Engineering, SRM University, Haryana,
5Asst. Professor, Dept. of Computer Science Engineering, SRM University, Haryana, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - This study develops a method for credit card fraud
detection. The number of scammers grows every day, credit
cards are used for fraudulent transactions, and there are
several sorts of fraud. To tackle this problem, several
techniques are applied: Logistic Regression, Random Forest,
and Naive Bayes. Each technique is evaluated individually,
and whichever works best is carried forward. The primary
purpose is to detect fraud by comparing the aforementioned
techniques in order to achieve a better outcome.
Key Words: Credit Card, Fraud Detection, Random Forest,
Naïve Bayes, Logistic Regression.
1. INTRODUCTION
Credit card fraud is a broad term for theft and fraud
committed using a credit card as the means of payment. The
goal may be to buy something without paying for it or to
withdraw money from an account without permission. Identity
theft is often accompanied by credit card fraud. According to
the Federal Trade Commission of the United States, the rate
of identity theft remained steady during the mid-2000s but
jumped by 21% in 2008. Credit card fraud, the crime most
people connect with ID theft, nevertheless fell as a fraction
of total ID theft complaints. In 2000, roughly 10 million
transactions, or one out of every 1300, were fraudulent. In
addition, 0.05 percent (5 out of 10,000) of all monthly active
accounts were fraudulent. Today, fraud detection systems
target about a twelfth of one percent of all transactions
performed, a share that still translates into billions of
dollars in losses. Credit card fraud is one of the most serious
issues facing businesses today. However, to detect fraud
successfully, it is necessary first to understand how fraud is
carried out. Fraudsters use a variety of methods to perpetrate
credit card fraud. Credit card fraud is described as "when an
individual uses another person's credit card for personal
reasons while the card owner and the card issuer are unaware
that the card is being used." Card fraud begins with the theft
of either the physical card or the critical data linked with
the account, such as the card account number or other
information that must be given to a merchant during a valid
transaction. The card number, usually the Primary Account
Number (PAN), is often printed on the card, and the data is
stored in machine-readable format on a magnetic stripe on
the reverse.
2. METHODOLOGY
A. Data Collection
The data-gathering phase is the first step in the project; this
dataset comprises a collection of transactions, some of which
are real and others fraudulent.
B. Credit Card Dataset
A credit card transaction data set was obtained from Kaggle;
it comprises a total of 284,807 credit card transactions from
a European bank. Transactions are divided into a "positive
class" (fraudulent) and a "negative class" (genuine). The data
set is highly skewed: just 492 of the 284,807 transactions,
roughly 0.172 percent, are fraudulent, and the rest are
genuine. We therefore oversampled to balance the data set,
resulting in 60% fraud transactions and 40% genuine ones.
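The quoted fraud rate can be checked with a line of arithmetic. The sketch below assumes the record count of 284,807 given in the conclusions of this paper together with the 492 fraudulent records stated above; the variable names are illustrative only.

```python
# Counts as reported in the paper (assumed: 284,807 records, 492 of them fraudulent).
total_transactions = 284_807
fraud_transactions = 492

# Share of fraudulent transactions, expressed as a percentage.
fraud_percent = 100 * fraud_transactions / total_transactions
genuine_transactions = total_transactions - fraud_transactions

print(f"fraud share: {fraud_percent:.3f}%")
print(f"genuine transactions: {genuine_transactions}")
```

The result is a little over 0.17 percent, which is why accuracy alone says little on the raw data: a model that labels every transaction genuine would already be right more than 99.8% of the time.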
C. Preprocessing of Dataset
Selected data is formatted, cleaned, and sampled in this
module. The following are some of the data pre-processing
steps:
a) Formatting: The chosen data might not be in the correct
format. We may prefer data in a file format over a relational
database or vice versa.
b) Cleaning: Cleaning is the process of removing or correcting
missing data. The dataset may contain records that are
incomplete or have null values. Such records must be deleted.
c) Sampling: The class distribution in credit card
transactions is uneven because the frauds in the dataset are
far fewer than the genuine transactions. As a
result, the sampling approach is utilized to tackle this
problem.
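The cleaning and sampling steps above might look as follows in Pandas. This is a minimal sketch on a made-up five-row table; the column names and values are assumptions for illustration, not the actual dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of a transaction table; column names are illustrative.
raw = pd.DataFrame(
    {
        "Amount": [12.5, np.nan, 250.0, 19.99, 3.0],
        "Time": [0, 1, 2, None, 4],
        "Class": [0, 0, 1, 0, 0],
    }
)

# Cleaning: drop every record that has at least one missing value.
cleaned = raw.dropna().reset_index(drop=True)

# Sampling check: count how many records fall in each class
# to see how uneven the class distribution is.
class_counts = cleaned["Class"].value_counts().to_dict()

print(cleaned)
print(class_counts)
```

Dropping rows is the simplest policy and matches the text; imputing missing values instead would be a different design choice.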
D. Loading of Dataset
The dataset is loaded after it has been pre-processed.
Various library functions can be used to load the dataset. In
this case, we used the read_csv function of Python's Pandas
library to load the dataset from a CSV file (an Excel workbook
would need read_excel instead); the resulting object is
called a DataFrame:
dataset = pd.read_csv('creditcard.csv')
E. Splitting of Dataset
To compensate for the dataset's imbalance, we used the
ADASYN oversampling technique, which generates synthetic
samples of the minority (fraudulent) class until the numbers
of positive and negative samples are nearly equal. After the
dataset is oversampled, the samples are split into Train and
Test data. A suitable split ratio must be chosen for the model
(usually 70% for Train data and 30% for Test data, although
any other ratio may be chosen). The train dataset can be
further split into train data and validation data.
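The oversample-then-split sequence can be sketched with the standard library alone. Note the substitution: ADASYN synthesizes new minority-class points, whereas the sketch below uses plain random duplication of existing minority rows to stay dependency-free. The function names, the toy rows, and the seed are all hypothetical.

```python
import random


def random_oversample(rows, label_key="Class", seed=0):
    """Duplicate minority-class rows at random until both classes are equal in size.

    This is simple random oversampling, NOT ADASYN.
    """
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    minority, majority = sorted([positives, negatives], key=len)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


def split_train_test(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and cut them into (train, test) at the given fraction."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy data: 95 genuine rows (Class 0) and 5 fraudulent rows (Class 1).
rows = [{"Amount": float(i), "Class": 0} for i in range(95)]
rows += [{"Amount": 1000.0 + i, "Class": 1} for i in range(5)]

balanced = random_oversample(rows)
train, test = split_train_test(balanced, test_fraction=0.3)
print(len(balanced), len(train), len(test))
```

The order of operations follows the text (oversample first, then split). With that order, copies of the same minority row can land in both the train and test sets.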
F. Building Model
After the data has been split into train and test data (70%
and 30%, respectively), the training data is used for model
building. The dataset contains 31 columns, of which 30 are
the independent features and the last column, called CLASS,
is the dependent feature. The dataset is therefore split into
four parts: xtrain, ytrain, xtest, and ytest, representing the
independent training features, the dependent training
feature, the independent test features, and the dependent
test feature.
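The four-way split described here could be produced as sketched below. It assumes scikit-learn's train_test_split is used, which the paper does not state. The 200-row random table, the feature names f0..f29, and the label column name "Class" are placeholders for the real data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real data: 30 feature columns plus a label column.
feature_names = [f"f{i}" for i in range(30)]
data = pd.DataFrame(rng.normal(size=(200, 30)), columns=feature_names)
data["Class"] = rng.integers(0, 2, size=200)

# Independent features (30 columns) and dependent feature (the label column).
X = data.drop(columns=["Class"])
y = data["Class"]

# 70% train / 30% test; stratify keeps the class ratio the same in both parts.
xtrain, xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print(xtrain.shape, xtest.shape, ytrain.shape, ytest.shape)
```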
G. Algorithms
a) Logistic Regression - Logistic Regression is a regression
model that analyses the relationship between multiple
independent variables and a categorical dependent variable.
There are several kinds of logistic regression model,
including binary, multinomial, and ordinal logistic models.
The Binary Logistic Regression model calculates the
likelihood of a binary response based on one or more
predictors.
Fig 1- Logistic Regression expression
The above equation represents the logistic regression in
mathematical form.
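In its standard form, binary logistic regression passes a weighted sum of the predictors through the sigmoid function: p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn))). The notation in Fig 1 may differ. The sketch below evaluates that expression directly; the coefficients are made up for illustration and are not fitted values from this study.

```python
import math


def predict_probability(features, weights, bias):
    """Logistic model: sigmoid of the weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def predict_class(features, weights, bias, threshold=0.5):
    """Label the example 1 (positive) when the probability reaches the threshold."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0


# Illustrative (made-up) coefficients for a model with two features.
weights = [0.8, -0.4]
bias = -0.2

p = predict_probability([1.0, 2.0], weights, bias)
label = predict_class([1.0, 2.0], weights, bias)
print(p, label)
```

Here the weighted sum is -0.2, so the probability falls just below 0.5 and the example is labeled negative.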
b) Random Forest - Random Forest is a tree-based algorithm
that creates several trees and combines their results to
improve the model's generalization ability. Combining trees
in this way is an ensemble method: ensembling is nothing more
than putting together a group of weak learners (individual
trees) to create a strong learner. Random Forests can be used
to solve problems involving regression, where the dependent
variable is continuous, and classification, where the
dependent variable is categorical. They also provide a
natural way to rank the importance of variables in either
kind of problem.
Fig 2- Random Forest expression
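The combining step of the ensemble can be illustrated by majority voting. In the sketch below each "tree" is reduced to a hand-written threshold rule, and the feature names (amount, hour, foreign) are hypothetical. A real random forest would instead learn its trees from bootstrap samples of the training data.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Classify `sample` by majority vote over the individual trees' predictions."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical weak learners, each a single hand-written rule.
trees = [
    lambda s: 1 if s["amount"] > 1000 else 0,
    lambda s: 1 if s["hour"] < 5 else 0,
    lambda s: 1 if s["amount"] > 500 and s["foreign"] else 0,
]

suspicious = {"amount": 1500, "hour": 14, "foreign": True}
ordinary = {"amount": 40, "hour": 3, "foreign": False}

print(forest_predict(trees, suspicious))
print(forest_predict(trees, ordinary))
```

An odd number of voters avoids ties in the two-class case.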
c) Naïve Bayes - A Bayesian classifier is a statistical method
that uses Bayes' theorem to calculate the probability that a
sample belongs to a specific class. It is referred to as naive
because it assumes that the probabilities of the individual
features are independent of one another, which is extremely
unlikely to occur in the real world. The probability of an
event occurring is calculated by considering the likelihood
of another event occurring. It's possible to write it as:
Fig 3- Naïve Bayes expression
where the posterior probability P(c|X) of target class c is
calculated from P(c), P(X|c), and P(X).
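Bayes' theorem gives P(c|X) = P(X|c) * P(c) / P(X), and under the naive independence assumption P(X|c) is the product of the per-feature likelihoods. The following sketch works through that calculation; the priors and likelihoods are invented numbers chosen only to make the arithmetic easy to follow.

```python
def naive_bayes_posterior(priors, likelihoods):
    """Return P(c|X) for every class c.

    priors:      {class: P(c)}
    likelihoods: {class: [P(x_i|c) for each observed feature x_i]}
    """
    # Numerator of Bayes' theorem under the independence assumption:
    # P(c) multiplied by the product of P(x_i|c).
    joint = {}
    for c, prior in priors.items():
        p = prior
        for lik in likelihoods[c]:
            p *= lik
        joint[c] = p
    evidence = sum(joint.values())  # P(X), summed over all classes
    return {c: joint[c] / evidence for c in joint}


# Made-up numbers: 1% of transactions are fraud; two observed features.
priors = {"fraud": 0.01, "genuine": 0.99}
likelihoods = {"fraud": [0.8, 0.5], "genuine": [0.1, 0.2]}

posterior = naive_bayes_posterior(priors, likelihoods)
print(posterior)
```

Even though both features are far more likely under fraud, the small prior keeps the posterior probability of fraud at roughly 17% in this example.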
H. Training of Model
After building the model, the model is trained using Train
data and validation data. The model is trained using the
library's fit() function.
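A minimal sketch of training the three classifiers with fit() follows, assuming scikit-learn estimators. The paper does not name the library, the Naive Bayes variant, or any hyperparameters, so GaussianNB, max_iter=1000, and n_estimators=50 are assumptions. The data is synthetic, and the printed scores say nothing about the results reported in this paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the transaction data: 500 samples, 30 features.
X, y = make_classification(n_samples=500, n_features=30, random_state=0)
xtrain, xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Assumed estimator choices; the hyperparameters are illustrative.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Naive Bayes": GaussianNB(),
}

scores = {}
for name, model in models.items():
    model.fit(xtrain, ytrain)                 # train on the training split
    scores[name] = model.score(xtest, ytest)  # mean accuracy on the test split

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```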
I. Evaluating the Model
The model can be evaluated using various metrics. These are:
a) Interpreting Loss and Validation Loss - Loss is the result
of a bad prediction: a number indicating how bad the model's
prediction was on a single example. It is reported as
training loss (on the training data) and validation loss (on
the validation data).
b) Interpreting Accuracy and Validation Accuracy - In a good
model, the training accuracy and the validation accuracy
converge.
c) Confusion Matrix - A confusion matrix summarizes
classification problem prediction results. The correct and
incorrect predictions are totaled and broken down by class
using count values; this breakdown is the key to the
confusion matrix. The confusion matrix depicts the various
ways in which a classification model gets confused when
making predictions. It reveals both the number of errors made
by a classifier and the types of errors made.
Here,
•Class 1: Positive
•Class 2: Negative
Classification Rate/Accuracy - Classification Rate or
Accuracy is given by the relation:
Fig 4- Accuracy Expression
Recall - Recall is the ratio of the number of correctly
classified positive examples to the total number of positive
examples. High Recall indicates the class is correctly
recognized (a small number of FN).
Fig 5- Recall Expression
Precision - To get the precision value, we divide the number
of correctly classified positive examples by the total
number of predicted positive examples. High precision
indicates that an example labeled as positive is indeed
positive (a small number of FP).
Fig 6- Precision Expression
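Using the standard definitions, Accuracy = (TP + TN) / (TP + TN + FP + FN), Recall = TP / (TP + FN), and Precision = TP / (TP + FP). The sketch below counts the four confusion-matrix cells from a pair of label lists and derives the three metrics. The label lists are invented, and the function names are illustrative.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1   # predicted positive, actually positive
            else:
                fp += 1   # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1   # predicted negative, actually positive
            else:
                tn += 1   # predicted negative, actually negative
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, recall and precision from the four confusion-matrix cells."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Made-up labels: 1 = fraudulent (positive), 0 = genuine (negative).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
metrics = classification_metrics(tp, tn, fp, fn)
print((tp, tn, fp, fn), metrics)
```

The guards against empty denominators return 0.0, which is one convention among several for the undefined case.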
J. Saving the Model
After building the model, the model is saved to our device.
The model can be saved in .pkl format or .h5 format. To save
the model in .pkl format, Python provides a library named
pickle, and to save it in .h5 format, the TensorFlow library is
used. We used the pickle library and saved the models in
.pkl format.
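Saving and reloading with pickle can be sketched as follows. The dictionary stands in for a trained model object, and the file name model.pkl in a temporary directory is a placeholder; any picklable object, including a fitted estimator, is handled the same way.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model; any picklable object works the same way.
model = {"name": "random_forest", "threshold": 0.5}

# Write the object to a .pkl file (binary mode is required).
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Load it back, for example in a later session.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.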
3. MODEL AND ANALYSIS
Fig 7- Logistic Regression
Fig 8- RandomForest Classifier
Fig 9- Naïve Bayes model
4. RESULTS AND DISCUSSION
The accuracy results obtained from the three algorithms are
shown in the table below.
Table- Accuracy Results
Training vs. Test Data in Dataset
Fig 10- Training vs. Test Data
Normal vs. Fraud Transaction after Oversampling
Fig 11- Fraudulent vs. non-Fraudulent
Correlation matrix
Fig 12- Correlation Matrix
5. CONCLUSIONS
Various machine learning algorithms for detecting fraud in
credit card transactions were reviewed in this paper. The
accuracy, precision, and specificity metrics are used to
evaluate their performance. To classify a transaction as
fraudulent or authorized, we used three supervised learning
techniques: Logistic Regression, Random Forest, and Naive
Bayes. Using feedback and delayed supervised training, these
classifiers were trained on a sample dataset of 284,807
transaction records. Due to the massive imbalance, the
dataset was subjected to an oversampling technique, which
made the numbers of fraud and normal transactions nearly
equal. The three models were trained and tested on the
training and test data, and the results were obtained. The
accuracy of the Random Forest, Logistic Regression, and
Naive Bayes models was 99.27%, 91.20%, and 89.40%,
respectively. From this project, it can be concluded that the
Random Forest model is somewhat trustworthy, and its
accuracy could be improved further with a larger and more
balanced dataset. If other algorithms can be combined with
this one to form a hybrid algorithm, the results may be
better still.
ACKNOWLEDGEMENT
The success and outcome of this project required a lot of
guidance and assistance from many people, and we are
highly privileged to have received it throughout the
completion of our project. All that we have done is only due
to such supervision and assistance, and we will not forget to
thank them.
We are extremely grateful to Dr. Paramjit S. Jaswal,
Vice-Chancellor, SRM University, and Dr. Puneet Goswami, Head
of the Department, Department of Computer Science and
Engineering, for providing all the required resources for the
completion of our seminar.
Our heartfelt gratitude goes to our guide, Dr. Ishwari Singh,
for valuable suggestions and guidance in preparing the
research paper. Last but not least, we express our
obligation to all the people who have worked extensively on
the topic and made the content available for free to all the
aspiring people who want to grow in their community. We
hope this report can be helpful to any aspiring student
who wants to gain an overall idea about how
high-performance computing works in practical life.
REFERENCES
[1] T. Mohana Priya, Dr. M. Punithavalli & Dr. R. Rajesh
Kanna, Machine Learning Algorithm forDevelopmentof
Enhanced Support Vector MachineTechniquetoPredict
Stress, Global Journal of Computer Science and
Technology: C Software & Data Engineering, Volume20,
Issue 2, No. 2020, pp 12-20
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1363
[2] Ganesh Kumar and P.Vasanth Sena, “Novel Artificial
Neural Networks and Logistic Approach for Detecting
Credit Card Deceit,” International Journal of Computer
Science and Network Security, Vol. 15, issue 9, Sep.
2015, pp. 222-234
[3] Gyusoo Kim and Seulgi Lee, “2014 Payment Research”,
Bank of Korea, Vol. 2015, No. 1, Jan. 2015.
[4] Chengwei Liu, Yixiang Chan, Syed Hasnain Alam Kazmi,
Hao Fu, “Financial Fraud Detection Model: Based on
Random Forest,” International Journal ofEconomicsand
Finance, Vol. 7, Issue. 7, pp. 178-188, 2015.
[5] Hitesh D. Bambhava, Prof. Jayeshkumar Pitroda, Prof.
Jaydev J. Bhavsar (2013), “A Comparative Study on
Bamboo Scaffolding And Metal Scaffolding in
Construction Industry Using Statistical Methods,"
International Journal of Engineering Trends and
Technology (IJETT) – Volume 4, Issue 6, June 2013,
Pg.2330-2337.
[6] P. Ganesh Prabhu, D. Ambika, "Study on Behaviour of
Workers in Construction Industry to Improve
Production Efficiency," International Journal of Civil,
Structural, Environmental and Infrastructure
Engineering Research and Development (IJCSEIERD),
Vol. 3, Issue 1, Mar 2013, 59-66
[7] Manideep, A. P. S., and Seema Kharb. "A Comparative
Analysis of Machine Learning Prediction Techniquesfor
Crop Yield Prediction in India." Turkish Journal of
Computer andMathematicsEducation(TURCOMAT) 13.2
(2022): 120-133.
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
IRJET Journal
 
FIR filter-based Sample Rate Convertors and its use in NR PRACH
FIR filter-based Sample Rate Convertors and its use in NR PRACHFIR filter-based Sample Rate Convertors and its use in NR PRACH
FIR filter-based Sample Rate Convertors and its use in NR PRACH
IRJET Journal
 
Kiona – A Smart Society Automation Project
Kiona – A Smart Society Automation ProjectKiona – A Smart Society Automation Project
Kiona – A Smart Society Automation Project
IRJET Journal
 
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
IRJET Journal
 
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
Invest in Innovation: Empowering Ideas through Blockchain Based CrowdfundingInvest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
IRJET Journal
 
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
IRJET Journal
 
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUBSPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
IRJET Journal
 
AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...
AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...
AR Application: Homewise VisionMs. Vaishali Rane, Om Awadhoot, Bhargav Gajare...
IRJET Journal
 
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1359

Credit Card Fraud Detection Using Machine Learning & Data Science

Ishika Sharma1, Shivjyoti Dalai2, Venktesh Tiwari3, Ishwari Singh4, Seema Kharb5
1,2,3 Students, Computer Science Engineering, SRM University, Sonipat
4 Asst. Professor, Dept. of Computer Science Engineering, SRM University, Haryana
5 Asst. Professor, Dept. of Computer Science Engineering, SRM University, Haryana, India

Abstract - This study develops a method for credit card fraud detection. As the number of scammers grows every day, credit cards are increasingly used for fraudulent transactions, and there are several sorts of fraud. As a result, various techniques such as Logistic Regression, Random Forest, and Naive Bayes are used to tackle this problem. Each technique is evaluated individually, and whichever works best is carried out. The primary purpose is to detect fraud by comparing the aforementioned techniques in order to achieve a better outcome.

Key Words: Credit Card, Fraud Detection, Random Forest, Naïve Bayes, Logistic Regression.

1. INTRODUCTION

Credit card fraud is a broad term for theft and fraud perpetrated using a credit card at the moment of payment. The goal may be to buy something without paying for it or to withdraw money from an account without permission. Identity theft is often accompanied by credit card fraud. According to the Federal Trade Commission of the United States, the rate of identity theft remained steady during the mid-2000s, but it jumped by 21% in 2008.
Even though credit card fraud, the crime most people connect with ID theft, fell to a fraction of total ID theft complaints, in 2000 roughly 10 million transactions, or one out of every 1,300, were fraudulent. In addition, 0.05 percent (5 out of 10,000) of all monthly active accounts were fraudulent. Today, fraud detection systems hold fraud to about a twelfth of one percent of all transactions processed, which still results in billions of dollars in losses. Credit card fraud is one of the most serious issues facing businesses today. However, to detect fraud successfully, it is first necessary to understand how fraud is carried out. Fraudsters use a variety of methods to perpetrate credit card fraud. Credit card fraud is described as "when an individual uses another person's credit card for personal reasons while the card owner and the card issuer are unaware that the card is being used." Card fraud begins with theft of either the physical card or the critical data linked with the account, such as the card account number or other information that must be given to a merchant during a valid transaction. The card number, usually the Primary Account Number (PAN), is often printed on the card, and the data is stored in machine-readable format on a magnetic stripe on the reverse.

2. METHODOLOGY

A. Data Collection
Data gathering is the first step in the project. The dataset comprises a collection of transactions, some of which are genuine and others fraudulent.

B. Credit Card Dataset
A credit card transaction dataset was obtained from Kaggle; it comprises a total of 284,807 credit card transactions from a European bank. It divides transactions into a "positive class" and a "negative class." The dataset is highly skewed: roughly 0.172 percent of transactions are fraudulent, meaning that just 492 of the 284,807 transactions are fraudulent and the rest are genuine. We therefore oversampled to balance the dataset, resulting in 60% fraudulent transactions and 40% genuine ones.

C. Preprocessing of Dataset
Selected data is formatted, cleaned, and sampled in this module. The data pre-processing steps include:
a) Formatting: The chosen data might not be in the required format. We may prefer data in a file format over a relational database, or vice versa.
b) Cleaning: Cleaning is the process of removing or correcting missing data. The dataset may contain records that are incomplete or have null values. Such records must be deleted.
c) Sampling: The class distribution in credit card transactions is uneven because the number of frauds in the dataset is far smaller than the total number of transactions. As a result, a sampling approach is used to tackle this problem.

D. Loading of Dataset
The dataset is loaded after it has been pre-processed. Various library functions can be used to load the dataset. In this case, we used the read_csv method of Python's Pandas library to load the dataset, which is in CSV format; in Pandas, the loaded object is called a DataFrame.

import pandas as pd
dataset = pd.read_csv('creditcard.csv')

E. Splitting of Dataset
To compensate for the dataset's imbalance, we used the ADASYN oversampling technique, which oversamples the fraudulent and genuine transactions to a specified number, so that the positive and negative classes are nearly equal in size. After the dataset is oversampled, the samples are split into train and test data. A suitable ratio must be chosen for the model (usually 70% for train data and 30% for test data, although any ratio can be chosen). The train dataset can be further split into train data and validation data.

F. Building Model
After the data has been split into train and test data (70% and 30%, respectively), the training data is used for model building. The dataset contains 31 features, of which 30 columns are the independent features, and the last column, called the CLASS column, is the dependent feature. The dataset is therefore split into four parts: xtrain, ytrain, xtest, and ytest, representing independent training features, dependent training features, independent test features, and dependent test features.

G. Algorithms
a) Logistic Regression - Logistic regression is a regression model that analyses the relationship between multiple independent variables and a categorical dependent variable.
There are many different logistic regression models, including binary, multiple, and binomial logistic models. The binary logistic regression model calculates the likelihood of a binary response based on one or more predictors.
Fig 1- Logistic Regression expression
The above equation represents logistic regression in mathematical form.
b) Random Forest - Random forest provides a natural way to rank the importance of variables in a regression or classification problem. It is a tree-based algorithm that creates several trees and combines their results to improve the model's generalization ability. An ensemble method is a technique for combining trees: ensembling is nothing more than putting together a group of weak learners (individual trees) to create a strong learner. Random forests can be used to solve both regression and classification problems. The dependent variable in regression problems is continuous; in classification problems it is categorical.
Fig 2- Random Forest expression
c) Naïve Bayes - A Bayesian classifier is a statistical method that uses Bayes' theorem to calculate the probability that a record belongs to a specific class. It is referred to as naive because it assumes that the probabilities of individual features are independent of one another, which is extremely unlikely to occur in the real world. The probability of an event occurring is calculated by considering the likelihood of another event occurring. It can be written as:
Fig 3- Naïve Bayes expression
where the posterior probability P(c|X) of target class c is calculated from P(c), P(X|c), and P(X).

H. Training of Model
After the model is built, it is trained using the train data and validation data. The model is trained using the library function fit().

I. Evaluating the Model
The model can be evaluated using various metrics:
a) Interpreting Loss and Validation Loss - Loss is the result of a bad prediction. It is a number indicating how bad the model's prediction was on a single example. Loss can be training loss or validation loss.
b) Interpreting Accuracy and Validation Accuracy - In a good model, accuracy and validation accuracy should converge.
c) Confusion Matrix - A confusion matrix summarizes the prediction results of a classification problem. The correct and incorrect predictions are totaled using count values and broken down by class; this is the key to the confusion matrix. It depicts the various ways in which a classification model becomes confused when making predictions, revealing both the number of errors made by a classifier and the types of errors made. Here,
• Class 1: Positive
• Class 2: Negative
Classification Rate/Accuracy - The classification rate, or accuracy, is given by the relation:
Fig 4- Accuracy Expression
Recall - Recall is the ratio of the number of correctly classified positive examples to the total number of positive examples. High recall indicates that the class is correctly recognized (a small number of FN).
Fig 5- Recall Expression
Precision - Precision is obtained by dividing the number of correctly classified positive examples by the total number of predicted positive examples. High precision indicates that an example labeled as positive is indeed positive (a small number of FP).
Fig 6- Precision Expression

J. Saving the Model
After the model is built, it is saved to our device. The model can be saved in .pkl format or .h5 format. To save the model in .pkl format, Python provides a library named pickle, and to save it in .h5 format, the TensorFlow library is used. We used the pickle library and saved the models in .pkl format.

3. MODEL AND ANALYSIS
Fig 7- Logistic Regression
Fig 8- Random Forest Classifier
Fig 9- Naïve Bayes model

4. RESULTS AND DISCUSSION
The accuracy results obtained from the three algorithms are shown in the table below.
Table- Accuracy Results
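The accuracy, recall, and precision definitions in Section 2.I can be made concrete with a small worked example. The following is a minimal sketch in plain Python, not the authors' code: the function names `confusion_counts` and `classification_metrics` and the toy labels are illustrative assumptions, and label 1 is assumed to mark the positive (fraud) class.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels, with 1 = positive (fraud)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1  # fraud correctly flagged
        elif predicted == 1 and actual == 0:
            fp += 1  # genuine transaction wrongly flagged
        elif predicted == 0 and actual == 0:
            tn += 1  # genuine transaction correctly passed
        else:
            fn += 1  # fraud missed (assumes labels are only 0 or 1)
    return tp, fp, tn, fn


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, recall, and precision from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Correctly classified positives over all actual positives.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Correctly classified positives over all predicted positives.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical toy labels: 3 frauds and 7 genuine transactions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
scores = classification_metrics(*counts)
print(counts)
print(scores)
```

On this toy data the counts are TP=2, FP=1, TN=6, FN=1, giving an accuracy of 0.8 and a recall and precision of 2/3 each. Recall and precision matter on skewed data: with 492 frauds in 284,807 transactions, a classifier that labels every transaction as genuine would still reach about 99.83% accuracy while its recall would be zero.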
Training vs. Test Data in Dataset
Fig 10- Training vs. Test Data
Normal vs. Fraud Transaction after Oversampling
Fig 11- Fraudulent vs. non-Fraudulent
Correlation matrix
Fig 12- Correlation Matrix

5. CONCLUSIONS
Various machine learning algorithms for detecting fraud in credit card transactions were reviewed in this paper. The accuracy, precision, and specificity metrics are used to evaluate the performance of each technique. To classify a transaction as fraudulent or authorized, we used three supervised learning techniques: Logistic Regression, Random Forest, and Naive Bayes. Using feedback and delayed supervised training, these classifiers were trained on a sample dataset of almost 284,807 transaction records. Due to the massive class imbalance, the dataset was subjected to an oversampling technique, which made the numbers of fraudulent and normal transactions nearly equal. The three models were trained and tested on the training and test data, and the results were obtained. The accuracy of Random Forest, Logistic Regression, and Naive Bayes was 99.27%, 91.20%, and 89.40%, respectively. From this project, it can be concluded that the Random Forest model is somewhat trustworthy, and its accuracy could be improved further with a larger and more balanced dataset. If other algorithms can be combined with this one to form a hybrid algorithm, the results will be even better.

ACKNOWLEDGEMENT
The success and outcome of this project required a great deal of guidance and assistance from many people, and we are highly privileged to have received it throughout the completion of our project. All that we have done is due to such supervision and assistance, and we will not forget to thank them.
We are extremely grateful to Dr. Paramjit S. Jaswal, Vice-Chancellor, SRM University, and Dr. Puneet Goswami, Head of the Department, Department of Computer Science and Engineering, for providing all the resources required for the completion of our seminar. Our heartfelt gratitude goes to our guide, Dr. Ishwari Singh, for valuable suggestions and guidance in preparing the research paper. Last but not least, we express our gratitude to all the people who have worked extensively on the topic and made the content freely available to all the aspiring people who want to grow in their community. We hope this report can be helpful to any aspiring student who wants to gain an overall idea of how high-performance computing works in practical life.