
Phase 2-AI Credit Card Fraud Detection System-1-2

The document outlines a project focused on detecting fraudulent credit card transactions using AI, aiming for a fraud detection rate of 99.5% with less than 0.5% false positives. It details the project's significance, objectives, dataset, data preprocessing, exploratory analysis, feature engineering, model performance, and team contributions. The project utilizes Python and various libraries, with XGBoost achieving the highest performance metrics.

Uploaded by

cr2485958

Phase 2: AI Credit Card Fraud Detection System

Student Details
Name: DHANUSH A

Register Number: 513323106010

Institution: University College of Engineering Arni

Department: Electronics and Communication Engineering

Date of Submission: 10-05-2025

GitHub Repository: https://github.com/Gooooooooookuuuuuuulllllllllll/Ai-fraud-detection-project-

1. Problem Statement
Classification Problem: Detect fraudulent credit card transactions in real-time with less
than 0.5% false positives.

Significance:

● Prevents financial losses exceeding ₹1,200 crore annually in Indian banking (RBI
2023 Report).
● Ensures compliance with PCI-DSS 4.0 standards for transaction security.

2. Project Objectives

| # | Objective | Success Metric |
|---|-----------|----------------|
| 1 | Maximize fraud detection rate | 99.5% recall (stratified cross-validation) |
| 2 | Minimize false alerts | Precision-recall curve optimization |
| 3 | Maintain model transparency | SHAP value interpretation |
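
One way to read "precision-recall curve optimization" is as a choice of decision threshold: among the thresholds that reach a required recall, keep the one with the highest precision. The sketch below illustrates that idea in plain Python. The function names, the toy labels and scores, and the recall target are assumptions made for illustration; the report does not say how its threshold was chosen.

```python
def precision_recall_at(labels, scores, threshold):
    """Precision and recall when every score >= threshold is flagged as fraud."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_threshold(labels, scores, min_recall):
    """Among thresholds meeting min_recall, return (threshold, precision, recall)
    with the highest precision, or None if no threshold qualifies."""
    best = None
    for t in sorted(set(scores)):
        p, r = precision_recall_at(labels, scores, t)
        if r >= min_recall and (best is None or p > best[1]):
            best = (t, p, r)
    return best


# Hypothetical toy data: 1 = fraud, 0 = legitimate.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.8, 0.9]
best = pick_threshold(labels, scores, min_recall=1.0)
print(best)
```

On real data the same search would run over the model's predicted fraud probabilities on a validation split, with `min_recall` set to the project's recall target.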

3. Workflow Diagram
4. Dataset Overview
Source: Kaggle (284,807 transactions with 492 fraud cases)

Key Attributes:

● Time: Seconds elapsed since first transaction


● Amount: Transaction value (₹)
● V1-V28: Anonymized PCA components
● Class Distribution: 0.172% fraudulent (highly imbalanced)
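
The class distribution follows directly from the two counts above. The short check below recomputes it and also shows why plain accuracy is a poor metric here: a model that never flags fraud would still be right on almost every transaction. The variable names are illustrative only.

```python
total_transactions = 284_807
fraud_cases = 492

fraud_rate = fraud_cases / total_transactions
# Accuracy of a trivial model that labels every transaction as legitimate.
baseline_accuracy = 1 - fraud_rate

print(f"Fraud rate: {fraud_rate:.4%}")
print(f"Always-legitimate accuracy: {baseline_accuracy:.2%}")
```

The rate works out to roughly 0.1727%, which matches the 0.172% quoted above when truncated to three decimals.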

5. Data Preprocessing Pipeline


1. Missing Values: Confirmed no null entries

2. Outlier Handling: IQR-based winsorization on Amount

3. Class Imbalance: SMOTE oversampling (fraud:non-fraud = 1:1)

4. Feature Scaling: RobustScaler applied to monetary values
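
The steps above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the project's code: the 1.5 × IQR fence, the function names, and the toy data are hypothetical. `robust_scale` mirrors the default behaviour of scikit-learn's RobustScaler (centre on the median, divide by the IQR), and `smote_like` is a simplified stand-in for SMOTE, not the imbalanced-learn implementation.

```python
import numpy as np


def winsorize_iqr(values, k=1.5):
    """Clip values to the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return np.clip(values, q1 - k * iqr, q3 + k * iqr)


def robust_scale(values):
    """Centre on the median and divide by the IQR."""
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    return (values - median) / (iqr if iqr > 0 else 1.0)


def smote_like(minority, n_new, k=3, seed=0):
    """Simplified SMOTE: each synthetic point lies on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    # Pairwise Euclidean distances; a sample is never its own neighbour.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]
    base = rng.integers(0, len(minority), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return minority[base] + gap * (minority[picked] - minority[base])


# Hypothetical toy data.
amounts = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 1000.0])
clipped = winsorize_iqr(amounts)   # the 1000.0 outlier is pulled to the upper fence
scaled = robust_scale(clipped)

fraud_points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
synthetic = smote_like(fraud_points, n_new=10)
print(clipped, scaled, synthetic.shape)
```

Oversampling is normally applied to the training split only, so that synthetic fraud points do not leak into the data used for evaluation.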

6. Exploratory Data Analysis Insights


● Temporal Pattern: 63% of frauds occur between 00:00 and 04:00 IST
● Feature Correlation: Strong negative association between V17 and fraud (-0.82)
● Transaction Amount: 89% of frauds involve amounts below ₹10,000
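
A correlation between a numeric feature and the binary Class label, such as the V17 figure above, is an ordinary Pearson coefficient computed against 0/1 labels. The sketch below shows the calculation on made-up values; the numbers are hypothetical and are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical feature values: strongly negative for the two fraud rows.
feature = [0.5, 0.2, -0.1, 0.3, -4.0, -5.5]
is_fraud = [0, 0, 0, 0, 1, 1]
r = pearson(feature, is_fraud)
print(round(r, 2))
```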

7. Feature Engineering

| Derived Feature | Methodology | Performance Impact |
|-----------------|-------------|--------------------|
| Transaction_Hour | Time decomposition | +7% recall |
| Transaction_Frequency | 2-hour rolling window | +12% precision |
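
Both derived features can be computed from the Time column alone. The sketch below is one possible reading, with several assumptions: the hour is counted from the first transaction, because Time is seconds elapsed and mapping it to clock time would need a start timestamp the report does not give; the rolling count covers all transactions, because the listed attributes include no card identifier; and the function names are hypothetical.

```python
from bisect import bisect_left


def transaction_hour(time_seconds):
    """Hour of day (0-23), counted from the first transaction."""
    return int(time_seconds // 3600) % 24


def transaction_frequency(times, window_seconds=7200):
    """For each transaction, count the transactions in the preceding window,
    up to and including the current one. `times` must be sorted ascending."""
    counts = []
    for i, t in enumerate(times):
        start = bisect_left(times, t - window_seconds)
        counts.append(i - start + 1)
    return counts


# Hypothetical Time values in seconds.
times = [0, 3600, 7200, 7201, 90000]
hours = [transaction_hour(t) for t in times]
frequency = transaction_frequency(times)
print(hours, frequency)
```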

8. Model Performance Comparison


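| Algorithm | Precision | Recall | F1-Score | Training Time |
|-----------|-----------|--------|----------|---------------|
| Logistic Regression | 0.85 | 0.78 | 0.81 | 45 sec |
| Random Forest | 0.91 | 0.87 | 0.89 | 3 min 12 sec |
| XGBoost | 0.96 | 0.92 | 0.94 | 2 min 18 sec |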
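
The F1-scores in the comparison are the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes them from the reported precision and recall. The dictionary layout is just a convenience for the example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as listed in the comparison.
reported = {
    "Logistic Regression": (0.85, 0.78, 0.81),
    "Random Forest": (0.91, 0.87, 0.89),
    "XGBoost": (0.96, 0.92, 0.94),
}

recomputed = {name: round(f1_score(p, r), 2) for name, (p, r, _) in reported.items()}
for name, (_, _, f1) in reported.items():
    print(f"{name}: reported {f1}, recomputed {recomputed[name]}")
```

All three recomputed values agree with the reported F1-scores to two decimals.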

9. Key Visualization
![Confusion Matrix](assets/cm.png)

Figure 1: Performance evaluation of XGBoost classifier (F1=0.94)

10. Technology Stack


● Programming Language: Python 3.9
● Core Libraries: Pandas, NumPy, Scikit-learn, XGBoost, SHAP
● Development Environment: Google Colab Pro (T4 GPU acceleration)

11. Team Contributions

| Name | Register No. | Role | Key Responsibilities |
|------|--------------|------|----------------------|
| GOKUL PRAKASH G | 513323106002 | Data Engineer | Designed data pipeline, implemented SMOTE |
| GOWTHAM KUMAR R | 513323106008 | ML Engineer | Optimized XGBoost hyperparameters |
| DHANUSH A | 513323106010 | Data Analyst | Created EDA visuals, SHAP interpretation |
