Fraud Detection Project Report

The document discusses detecting fraudulent vehicle insurance claims using machine learning. It first provides background on the problem of insurance fraud and focuses on vehicle insurance fraud. The authors then describe analyzing insurance claim data to understand characteristics of fraudulent claims and training machine learning models like random forests and decision trees to detect fraud. Key results included models achieving over XX% accuracy in detecting fraudulent claims. The document concludes future work could involve more advanced techniques and expanding the models to other insurance fraud types.

Detection of Fraud Insurance Claims in Vehicles

FRANCESCO BUZZI, POONGKUNDRAN THAMARAISELVAN, and ROSHAN VELPULA

Insurance fraud has been a problem since the inception of insurance. However, the methods used for committing fraud and the
frequency of these incidents have increased in recent years. Vehicle insurance fraud often involves making false or exaggerated
claims for damages or injuries resulting from an accident. Examples of this type of fraud include staged accidents, the use of phantom
passengers, and false personal injury claims. In this paper, we analyze data to understand the characteristics of fraudulent claims and
use machine learning algorithms to detect this type of fraud.

Additional Key Words and Phrases: Random Forest, Decision Trees, Exploratory data analysis, Fraud detection

1 INTRODUCTION
Insurance fraud is a pervasive problem that has been affecting the insurance industry for many years. One of the most
common types of insurance fraud is vehicle insurance fraud, which involves making false or exaggerated claims for
damages or injuries resulting from a car accident. In recent years, the volume and frequency of vehicle insurance fraud
incidents have increased markedly, leading to substantial losses for insurance companies.
The purpose of this project is to create a model using machine learning algorithms to detect vehicle insurance fraud. One
challenge in using machine learning for fraud detection is that fraud is much less common than legitimate insurance
claims, which can make it difficult for the model to accurately identify fraudulent activity. To develop a successful
model, it is important to balance the cost of false alerts against the potential savings from avoiding losses due to fraud.
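The trade-off between false alerts and missed fraud can be made concrete with a small cost calculation. The sketch below is illustrative only: the two cost constants, the toy scores, and the candidate thresholds are assumptions and do not come from our dataset. It picks the alert threshold that minimizes total cost.

```python
# Hypothetical costs, chosen only for illustration (not taken from the paper).
COST_FALSE_ALERT = 50.0      # cost of investigating a legitimate claim
COST_MISSED_FRAUD = 1000.0   # loss from paying an undetected fraudulent claim


def total_cost(scores, labels, threshold):
    """Total cost when every claim with score >= threshold is flagged."""
    cost = 0.0
    for score, is_fraud in zip(scores, labels):
        flagged = score >= threshold
        if flagged and not is_fraud:
            cost += COST_FALSE_ALERT      # false alert
        elif not flagged and is_fraud:
            cost += COST_MISSED_FRAUD     # missed fraud
    return cost


def best_threshold(scores, labels, candidates):
    """Return the candidate threshold with the lowest total cost."""
    return min(candidates, key=lambda t: total_cost(scores, labels, t))


# Toy model scores (estimated fraud probability) and true labels (1 = fraud).
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 0, 1, 1]
candidates = [0.25, 0.50, 0.75]

chosen = best_threshold(scores, labels, candidates)
print(chosen, total_cost(scores, labels, chosen))
```

With these assumed costs, a low threshold wins because a single missed fraud costs far more than several false alerts. Different cost assumptions would shift the choice.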
Insurance fraud can take many forms, including arranging accidents, misrepresenting the circumstances of an accident,
and exaggerating the extent of damages or injuries. Machine learning can help improve the accuracy of fraud detection
and allow insurance companies to more effectively identify and prevent fraudulent activity.

2 METHODOLOGY
The first step in our project was to collect a large dataset of past insurance claims, both fraudulent and legitimate. We
obtained this dataset from Kaggle, which provided us with anonymized data on a variety of claims made over a period
of several years. The dataset included information on the type of claim, the amount of the claim, the date of the claim,
and other relevant details.
Once we had collected the dataset, we performed basic data analysis to understand the characteristics of fraudulent
claims. This analysis allowed us to identify key features that are often associated with fraudulent claims, such as the
amount of the claim, the type of claim, and the date of the claim. We also looked at other factors, such as the location of
the accident and the number of people involved, to see if they had any impact on the likelihood of fraud.
With this information in hand, we proceeded to train a machine learning model to detect fraudulent claims. We used a
variety of algorithms, including logistic regression, decision trees, and random forests, to develop the model. We trained
the model on the dataset of past claims, using the identified features as inputs and the known fraudulent and legitimate
labels as outputs.
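To illustrate the idea behind the tree-based models, the sketch below shows the core step of a decision tree: choosing a threshold on one feature so that the two resulting groups are as pure as possible. Gini impurity is used here as one common splitting criterion. This is a simplified illustration, not the implementation used in this project, and the toy feature values are assumptions. A random forest repeats this kind of split search over many trees built on resampled data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Find the threshold on one numeric feature that minimizes weighted Gini.

    Returns (threshold, impurity). The threshold is None when no split
    improves on the impurity of the unsplit data.
    """
    n = len(values)
    best = (None, gini(labels))
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        if not left or not right:
            continue  # a split must put at least one sample on each side
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Toy data: a hypothetical "number of past claims" feature and fraud labels.
past_claims = [0, 0, 1, 1, 2, 3, 4, 4]
is_fraud = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(past_claims, is_fraud)
print(threshold, impurity)
```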
Once the model was trained, we tested it on a separate dataset of claims to see how well it performed. We found that
the model detected fraudulent claims well, achieving an overall accuracy of over XX percent.
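Because fraud is rare, a held-out test set should contain roughly the same share of fraudulent claims as the training set. The sketch below shows one common way to achieve this, a stratified split. It is an illustration under assumptions: the function name, the 25% test fraction, and the toy labels are hypothetical and do not describe how the test set in this project was actually formed.

```python
import random


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio similar in both."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class, at least one sample each.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels: 16 legitimate claims (0) and 4 fraudulent claims (1).
labels = [0] * 16 + [1] * 4
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```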

Authors’ address: Francesco Buzzi; Poongkundran Thamaraiselvan; Roshan Velpula.

Our work can be divided into four main components:

• Exploratory Data Analysis: This involves examining the data to understand its characteristics and identify any
patterns or trends.
• Data Preprocessing: This involves cleaning and preparing the data for modeling, such as by handling missing
values, transforming variables, and scaling the data.
• Data Modeling: This involves building and fitting statistical or machine learning models to the data to make
predictions or classify data points.
• Model Evaluation: This involves assessing the performance of the model using metrics such as accuracy,
precision, and recall, and making adjustments to improve the model as needed.
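The evaluation metrics named above can be computed directly from the counts of correct and incorrect predictions. The sketch below uses toy labels (an assumption, not our data) to show why accuracy alone can mislead when fraud is rare: a model that never flags fraud still scores 90% accuracy here, while its precision and recall are both zero.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 = fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for the fraud class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Define precision and recall as 0.0 when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy example: 18 legitimate claims, 2 fraudulent claims.
y_true = [0] * 18 + [1] * 2
always_legit = [0] * 20  # a "model" that never predicts fraud

accuracy, precision, recall = metrics(y_true, always_legit)
print(accuracy, precision, recall)
```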

2.1 Exploratory Data Analysis

Exploratory Data Analysis (EDA) is a crucial step in any data science project, including this project on detecting
fraudulent vehicle insurance claims. EDA involves analyzing and summarizing the characteristics of the data, identifying any trends
or patterns, and checking for inconsistencies or anomalies.
The goal of EDA is to gain a better understanding of the data and to identify any potential problems or opportunities
that could affect the success of the project. This involves examining the distribution of the data, looking for correlations
between variables, and visualizing the data using charts and plots.
Our first goal was to get familiar with the dataset. We found that the data has 33 columns, including our dependent
column 'FraudFound'. Our data consists of a total of 9 numerical and 24 categorical columns with no missing values.
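A first pass over a dataset like this one typically classifies each column as numerical or categorical and counts missing values. The sketch below shows one way to do that in plain Python. The helper name and the toy rows are assumptions. The column names are taken from the figures in this report, but the values are invented, and one value is left blank only to show how the missing-value check behaves.

```python
def summarize_columns(rows):
    """Map each column to (kind, missing_count).

    kind is "numerical" if every non-missing value is an int or float,
    otherwise "categorical". Empty strings and None count as missing.
    """
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        present = [v for v in values if v is not None and v != ""]
        missing = len(values) - len(present)
        numerical = all(isinstance(v, (int, float)) for v in present)
        summary[column] = ("numerical" if numerical else "categorical", missing)
    return summary


# Toy rows with invented values.
rows = [
    {"Make": "Honda", "AgeOfPolicyHolder": 34, "DayOfWeek": "Friday", "FraudFound": 0},
    {"Make": "Toyota", "AgeOfPolicyHolder": 41, "DayOfWeek": "Monday", "FraudFound": 1},
    {"Make": "Mazda", "AgeOfPolicyHolder": 29, "DayOfWeek": "", "FraudFound": 0},
]

summary = summarize_columns(rows)
for column, (kind, missing) in summary.items():
    print(column, kind, missing)
```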

The following plots show the distribution of the variables and some important pairwise comparisons between our
dependent and independent variables.

Fig. 1. Distribution of Variables

Fig. 2. FraudFound vs Make

Analysis: Mercedes and Acura have a higher probability of fraudulent claims, most likely because claims on these
costlier cars offer a higher return.

Fig. 3. FraudFound vs DayOfWeek

Analysis: Fraudulent claims are more frequent closer to the weekend.

Fig. 4. FraudFound vs AgeOfPolicyHolder

Analysis: Fraudulent claims are generally made by policyholders aged 30 to 40.

Fig. 5. FraudFound vs AgeOfVehicle

Analysis: Many fraudulent claims involve new vehicles and vehicles between 2 and 4 years old.

Fig. 6. FraudFound vs AccidentArea

Fig. 7. FraudFound vs PastClaims

Fig. 8. FraudFound vs WitnessPresent

Fig. 9. FraudFound vs TypeOfPolicy

Figures 6 to 9 show other interesting findings from our EDA.
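Pairwise comparisons such as FraudFound vs Make can also be summarized numerically as a fraud rate per category, which corrects for categories that simply have more claims. The sketch below is a minimal illustration. The helper name and the toy rows are assumptions, and it is not necessarily how the figures in this section were produced.

```python
from collections import defaultdict


def fraud_rate_by(rows, column, target="FraudFound"):
    """Return {category: share of rows in that category with target == 1}."""
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        frauds[row[column]] += row[target]
    return {category: frauds[category] / totals[category] for category in totals}


# Toy rows with invented values: 4 Honda claims (1 fraud), 2 Toyota claims (1 fraud).
rows = [
    {"Make": "Honda", "FraudFound": 0},
    {"Make": "Honda", "FraudFound": 0},
    {"Make": "Honda", "FraudFound": 1},
    {"Make": "Honda", "FraudFound": 0},
    {"Make": "Toyota", "FraudFound": 1},
    {"Make": "Toyota", "FraudFound": 0},
]

rates = fraud_rate_by(rows, "Make")
print(rates)
```

In this toy data both makes have one fraudulent claim, yet the fraud rate for Toyota is twice that of Honda because it has fewer claims overall.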

3 DATA PREPROCESSING AND MODELING

4 MODEL EVALUATION
<Model Evaluation and Results>

5 CONCLUSION AND FUTURE SCOPE

While our model performed well in this study, there is still room for improvement. In the future, we plan to explore more
advanced machine learning techniques, such as deep learning, to see if we can achieve even better performance. We also
plan to expand the scope of the model to include other types of insurance fraud, such as health and life insurance fraud.
In conclusion, our project shows that machine learning algorithms have the potential to play a significant role in the
fight against insurance fraud. By providing insurance companies with a powerful tool for detecting fraudulent claims,
we can help them reduce their losses and improve the overall efficiency of the insurance industry.
