
V Semester

Artificial Intelligence and Machine Learning


Open-Ended Project Work (21AI52)

Department of Computer Science & Engineering,

RV College of Engineering, Bengaluru.


Project Title
Transactional Fraud Detection
Student names & USNs

Shaurya Jain 1RV21CS152


Shaurya Raghuvanshi 1RV21CS153

Faculty Incharge 1 (Theory): Dr Shantha Rangaswamy

Faculty Incharge 2 (Lab): Dr Suma B Rao

Agenda

• Identification of Problem Domain and Detailed Analysis


• Literature Review
• Research gap
• Objectives
• Methodology
• High Level Design
• Detailed Design
• Implementation
• Experimental Results & Analysis
• Conclusion & Future Work
• Demonstration of the project
• Paper Publication
Identification of Problem and Detailed Analysis

• Problem domain
Transactional fraud detection systems are designed to identify and prevent fraudulent activities occurring
within financial transactions. The problem domain encompasses various aspects of detecting, preventing, and
mitigating fraudulent behavior across different types of transactions, such as credit card transactions, online
payments, wire transfers, and more.

• Potential for AI in the problem domain with industrial relevance


AI-powered machine learning models can analyze large volumes of transactional data to identify complex
patterns indicative of fraudulent behavior. Deep learning algorithms, such as neural networks, can
automatically learn and adapt to evolving fraud tactics, improving detection accuracy over time.


• Problem Statement
To design and train a machine learning model to detect fraudulent credit card transactions as well as fraudulent
bank account transfers.
• Detailed and extensive explanation of the purpose & uniqueness of the project
The purpose of this project is to develop a robust machine learning model capable of accurately detecting fraudulent
credit card transactions and bank account transfers. By combining the detection of fraudulent credit card
transactions and bank account transfers into a single model, we aim to provide a holistic solution that can
effectively identify fraudulent activities across multiple channels, enhancing the overall security posture of the
financial ecosystem. The uniqueness of this project lies in its utilization of diverse data sources and advanced
machine learning techniques to capture the nuanced patterns of fraudulent behavior specific to credit card
transactions and bank transfers.

Literature Review

1. Learning Transactional Behavioral Representations for Credit Card Fraud Detection
Author/s and publication details: Yu Xie, Guanjun Liu, Chungang Yan; IEEE, 2022
Objectives of the work carried out: In this paper, we propose a new model to learn the transactional behavioral
representations for each transaction record of users. A time-aware gate is augmented to LSTM to learn the
behavioral changes brought by different time intervals.
Results obtained: Achieved a training accuracy of 92.6% after training TH-LSTM.
Gaps identified: The model used is of very high complexity and is tough to export to a third-party service or
application.

2. A Model Based on Convolutional Neural Network for Online Transaction Fraud Detection
Author/s and publication details: Xinxin Zhou, Xiabo Zhang; Hindawi Journals, 2020
Objectives of the work carried out: The CNN model based on feature rearrangement constructed in this paper has
excellent experimental performance with good stability. The model needs neither high-dimensional input features
nor derivative variables.
Results obtained: Obtained an accuracy of 98.5% after experimenting with different feature sequences.
Gaps identified: For different sequences of features, the model has different effects. More attention is needed on
the discovery of sequence characteristics of transactions.

3. Transaction Fraud Detection Based on Total Order Relation and Behavior Diversity
Author/s and publication details: Lutao Zheng, Guanjun Liu; IEEE, 2019
Objectives of the work carried out: In this paper, we propose a way to extract users' BPs based on their
transaction records, which is used to detect transaction fraud in the online shopping scenario. OM overcomes the
disadvantage of Markov process models since it characterizes the diversity of user behaviors.
Results obtained: Achieved a training accuracy of 92.6% after training TH-LSTM.
Gaps identified: The model struggles if data is highly varied in terms of types of transaction as well as format of
transaction.

4. Financial Fraud Detection With Anomaly Feature Detection
Author/s and publication details: Dongxu Huang, Dejun Mu; IEEE, 2018
Objectives of the work carried out: CoDetect, which can perform fraud detection on a graph-based similarity matrix
and a feature matrix simultaneously. It introduces a new way to reveal the nature of financial activities from
fraud patterns.
Results obtained: Obtained an accuracy of almost 97% on the outliers using subspace clustering.
Gaps identified: For different sequences of features, the model has different effects. More attention is needed on
the discovery of sequence characteristics of transactions.

Objectives
Objectives

• To train a deep learning model to detect fraudulent transactions made in the case of bank transfers or other
modes of online payments.

• To train another deep learning model to detect fraudulent transactions made in the case of credit card
payments.

• Appropriate datasets must be procured to train the models. The datasets have been obtained from Kaggle, and a
Convolutional Neural Network model has been selected for our purposes.

Methodology

This research investigates the potential of applying Convolutional Neural Networks (CNNs) in conjunction with an
oversampling technique to improve the accuracy and effectiveness of transactional fraud detection.

1) Data Collection and Preprocessing:

Data Acquisition: We will acquire a comprehensive dataset of labeled transaction records, encompassing both
legitimate and fraudulent activities. This data may be obtained from financial institutions, public datasets, or
simulated scenarios.

Data Preprocessing: The raw data will undergo cleaning and preprocessing steps to ensure its consistency and
quality. This may involve handling missing values, formatting inconsistencies, and feature engineering relevant to
the CNN architecture.
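
As an illustration of the cleaning steps described above, the sketch below fills missing numeric values with the column median and rescales each column to the range [0, 1]. The slides do not say which imputation or scaling methods were used, so median imputation and min-max scaling are assumptions here. The field names (`amount`, `old_balance`) and the sample rows are hypothetical and are not taken from the project's datasets.

```python
from statistics import median


def fill_missing(rows, column):
    """Replace None in `column` with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fallback = median(observed)
    return [{**r, column: fallback if r[column] is None else r[column]} for r in rows]


def min_max_scale(rows, column):
    """Rescale `column` linearly so that its values lie in [0, 1]."""
    values = [r[column] for r in rows]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid division by zero for constant columns
    return [{**r, column: (r[column] - low) / span} for r in rows]


# Hypothetical sample records (assumption: not from PaySim or Sparkov).
records = [
    {"amount": 100.0, "old_balance": 500.0},
    {"amount": None, "old_balance": 1500.0},
    {"amount": 300.0, "old_balance": 1000.0},
]

cleaned = fill_missing(records, "amount")
for col in ("amount", "old_balance"):
    cleaned = min_max_scale(cleaned, col)
```

Scaling matters for neural networks because features with very different ranges (for example, amounts versus balances) can otherwise dominate the gradient updates.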


2) Oversampling for Imbalanced Data:

Imbalanced Class Problem: Transactional fraud data often exhibits class imbalance, where the number of fraudulent
transactions is far smaller than the number of legitimate ones. This imbalance can hinder the learning process of
CNNs, because a model can reach high accuracy simply by predicting the majority class.

Oversampling Technique:
SMOTE (Synthetic Minority Oversampling Technique): This method creates synthetic data points by interpolating
existing data points in the minority class, specifically focusing on features suitable for CNNs (e.g., numerical
transaction amounts, timestamps).

How SMOTE Works:

• Identify the Minority Class: The algorithm first identifies the class with the fewest data points (minority class) in
the dataset. In the context of fraud detection, this is typically the class representing fraudulent transactions.
• Randomly Select a Minority Class Instance: The algorithm randomly selects an instance from the minority class.
• Identify k-Nearest Neighbors: It then identifies k nearest neighbors of the selected instance from the minority class
based on a chosen distance metric (e.g., Euclidean distance).
• Randomly Select a Neighbor: A neighbor from the k nearest neighbors is randomly chosen.
• Generate Synthetic Data Point: The algorithm creates a new synthetic data point by linearly interpolating between
the selected minority class instance and its chosen neighbor.
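
The SMOTE steps above can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the function name `smote`, its parameters, and the sample points are hypothetical. It assumes the minority class has already been separated out. In practice a library implementation such as `SMOTE` from imbalanced-learn would normally be used instead.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate `n_synthetic` points by interpolating between minority samples.

    `minority` is a list of equal-length numeric feature lists.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other points
    synthetic = []
    for _ in range(n_synthetic):
        # Randomly select a minority class instance.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Identify its k nearest minority neighbors by Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        # Randomly select one of those neighbors.
        neighbor = rng.choice(neighbors)
        # Interpolate at a random position on the segment from base to neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical two-feature fraud samples, for illustration only.
fraud = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
new_points = smote(fraud, n_synthetic=6, k=2)
```

Every synthetic point lies on a line segment between two real minority samples, so the new points stay within the region already occupied by the minority class.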

3) CNN model used:

• Dense (16 units, ReLU activation): The first layer takes an input with six features and maps it to a hidden
layer with 16 neurons. The ReLU (Rectified Linear Unit) activation function introduces non-linearity, allowing the
model to learn complex relationships between features.
• Dense (24 units, ReLU activation): The second layer further processes the output of the first layer with 24
neurons and ReLU activation.
• Dropout (0.5): This layer randomly sets 50% of the preceding layer's outputs to zero during training, which reduces
overfitting and improves the model's generalization to unseen data.
• Dense (20 units, ReLU activation): The third dense layer has 20 neurons with ReLU activation.
• Dense (24 units, ReLU activation): Another layer with 24 neurons and ReLU activation further refines the learned
features.
• Dense (1 unit, sigmoid activation): The final layer has only one neuron and uses the sigmoid activation function,
which outputs a value between 0 and 1, suitable for binary classification tasks.
The model is compiled with the Adam optimizer, which efficiently updates the model weights during training. The
loss function is set to binary cross-entropy, appropriate for binary classification problems. Additionally, the model
tracks accuracy as a metric to evaluate its performance.
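
The layer stack and compile settings described above could be written as in the following sketch. The slides do not name the framework, so the use of Keras is an assumption, as are the names `DENSE_LAYERS`, `count_parameters`, and `build_model`. Note that every layer listed is a Dense (fully connected) layer, so the sketch builds a feed-forward network with no convolutional layers. The parameter count is computed without TensorFlow so that it can be checked by hand.

```python
# (units, activation) for each Dense layer, in the order described in the slides.
DENSE_LAYERS = [(16, "relu"), (24, "relu"), (20, "relu"), (24, "relu"), (1, "sigmoid")]
DROPOUT_AFTER = 1    # index of the Dense layer that is followed by Dropout(0.5)
INPUT_FEATURES = 6   # the first layer takes six input features


def count_parameters(input_dim=INPUT_FEATURES, layers=DENSE_LAYERS):
    """Trainable parameters: weights (inputs * units) plus biases (units) per layer."""
    total, fan_in = 0, input_dim
    for units, _ in layers:
        total += fan_in * units + units
        fan_in = units
    return total


def build_model(input_dim=INPUT_FEATURES):
    """Build and compile the network; calling this requires TensorFlow."""
    from tensorflow import keras  # imported lazily so the rest runs without TensorFlow

    model = keras.Sequential([keras.Input(shape=(input_dim,))])
    for index, (units, activation) in enumerate(DENSE_LAYERS):
        model.add(keras.layers.Dense(units, activation=activation))
        if index == DROPOUT_AFTER:
            model.add(keras.layers.Dropout(0.5))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


print(count_parameters())  # 1549
```

The total of 1,549 parameters is the sum 112 + 408 + 500 + 504 + 25 over the five Dense layers. The Dropout layer adds no parameters.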

Design

Experimental Results & Analysis

The two datasets chosen to train the model are:

1) PaySim Dataset:

PaySim is a synthetic dataset made by the Norwegian University of Science and Technology. It contains
6,362,620 records, of which 8,213 are fraudulent transactions. The dataset includes details such as account
number, transaction amount, account balances of both parties, and a class label (0 = not a fraud, 1 = fraud).

2) Sparkov Credit Card Fraud Dataset:

This dataset was generated using the Sparkov simulation, which was run for the period from 1 Jan 2019 to 31 Dec 2020.
It contains 1,296,675 records, of which 7,506 are fraudulent. The dataset includes details such as credit card
number, amount, transaction time, locations of both parties, and a class label (0 = not a fraud, 1 = fraud).
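
To make the degree of class imbalance concrete, the short calculation below uses the record counts quoted above. The function name `imbalance_summary` is hypothetical. The "all-legitimate baseline" is the accuracy of a trivial model that labels every transaction as not a fraud, which shows why accuracy alone can be misleading on such data.

```python
datasets = {
    "PaySim": {"records": 6_362_620, "fraud": 8_213},
    "Sparkov": {"records": 1_296_675, "fraud": 7_506},
}


def imbalance_summary(records, fraud):
    """Summarize how rare the fraud class is in a dataset."""
    legit = records - fraud
    return {
        "fraud_rate_pct": 100 * fraud / records,
        "legit_per_fraud": legit / fraud,
        # Accuracy of a trivial model that labels every transaction "not a fraud".
        "all_legit_accuracy_pct": 100 * legit / records,
    }


for name, counts in datasets.items():
    s = imbalance_summary(**counts)
    print(
        f"{name}: fraud rate {s['fraud_rate_pct']:.2f}%, "
        f"about {s['legit_per_fraud']:.0f} legitimate per fraud, "
        f"all-legitimate baseline accuracy {s['all_legit_accuracy_pct']:.2f}%"
    )
```

For PaySim this gives a fraud rate of about 0.13% (roughly 774 legitimate transactions per fraud), and for Sparkov about 0.58% (roughly 172 per fraud). A model that never flags fraud would therefore already score above 99% accuracy on both datasets.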

Demonstration of the project

Model 1:

Model 2:

Values closer to 0 indicate not a fraud, and values closer to 1 indicate fraud.
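
Each model outputs a sigmoid score between 0 and 1, so a decision threshold is needed to turn a score into a fraud / not-a-fraud label. The sketch below assumes a threshold of 0.5, which the slides do not state. The function name `label_transactions` and the sample scores are hypothetical.

```python
def label_transactions(scores, threshold=0.5):
    """Map sigmoid scores in [0, 1] to labels: 1 = fraud, 0 = not a fraud."""
    return [1 if score >= threshold else 0 for score in scores]


# Hypothetical model outputs, for illustration only.
scores = [0.02, 0.47, 0.51, 0.98]
print(label_transactions(scores))       # [0, 0, 1, 1]
print(label_transactions(scores, 0.9))  # [0, 0, 0, 1]
```

Raising the threshold flags fewer transactions as fraud, which trades missed fraud cases for fewer false alarms.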

Conclusion & Future Work

Leveraging oversampling techniques and CNNs holds promise for enhancing transactional fraud detection. While
existing research shows positive results, further exploration is necessary to identify the most effective combinations
and assess their generalizability in real-world scenarios. By addressing class imbalance and utilizing the feature
learning capabilities of CNNs, this approach can contribute to developing more robust and accurate fraud detection
systems.

• Explore the impact of different oversampling techniques (SMOTE, ADASYN) on CNN performance for fraud
detection.
• Investigate the effectiveness of combining oversampling with advanced CNN architectures (e.g., residual
networks).
• Analyze the generalizability of these approaches on real-world imbalanced fraud datasets from financial
institutions.
