23MZ02
FINAL REVIEW
The challenge is to build a system that automatically detects fraudulent credit card
transactions and, using Explainable AI (XAI) techniques, provides clear explanations for
flagged transactions, thereby enhancing customer trust, ensuring regulatory compliance,
and improving fraud detection accuracy over time.
Motivation
Experian's 2024 Fraud Forecast, for instance, discusses the increasing use of generative
AI for deepfakes and synthetic identity fraud, both of which are growing concerns in the
financial sector. Companies like Google and IBM are also heavily involved in developing
AI-driven fraud detection solutions with a strong emphasis on transparency and
explainability. These companies use XAI principles to ensure that their fraud detection
models are not only effective but also interpretable, fostering greater trust and compliance
in financial services.
Abstract
• This study aims to investigate and evaluate XAI methods for credit card fraud detection.
To achieve this aim, three objectives are stated as follows:
• Objective 1: Investigate the current application of ML methods and XAI methods in the
area of credit card fraud detection to gain background knowledge for the implementation
of objectives 2 and 3.
• Objective 2: Implement two ML methods and apply them to the credit card fraud dataset
and evaluate the performance of these two ML methods in terms of their accuracy,
recall, precision, and F1 score.
• Objective 3: Implement SHAP and LIME and apply them to the results obtained in
objective 2.
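The four metrics named in Objective 2 can be computed with scikit-learn; the labels and predictions below are illustrative placeholders, not results from this study:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Illustrative ground-truth labels and model predictions (1 = fraud)
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # fraction of correct predictions
precision = precision_score(y_true, y_pred)  # flagged transactions that are truly fraud
recall = recall_score(y_true, y_pred)        # true frauds that were caught
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(accuracy, precision, recall, f1)  # 0.8 0.75 0.75 0.75
```

On a heavily imbalanced fraud dataset, recall and F1 matter more than raw accuracy, since a model that predicts "not fraud" everywhere still scores high accuracy.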
Literature Survey
System Design
1. Data Collection Layer
• Transaction Data: Collect data related to credit card transactions including features
such as transaction amount, transaction location, time, merchant details, etc.
• User Data: Information about the cardholder such as their spending patterns, location,
and device information.
4. Explainability Layer
• LIME (Local Interpretable Model-agnostic Explanations): Use LIME to explain individual
predictions by approximating the model locally with an interpretable model.
• SHAP (SHapley Additive exPlanations): Use SHAP to provide consistent and locally
accurate explanations for the predictions made by the models.
The system uses machine learning models such as XGBoost and Decision Trees to
analyze transaction data for credit card fraud detection. The dataset includes features
such as distance, transaction amount, and PIN or chip usage. The LIME and SHAP
explainability methods are applied to enhance transparency and trust in the models'
decisions, aiming to improve both fraud detection accuracy and interpretability.
Dataset Description
The dataset used in the study focuses on credit card transactions and is designed to analyze
the various trends and circumstances that contribute to fraudulent activity in digital payments.
Data Dictionary
• trans_date_trans_time: Transaction DateTime
• merchant: Merchant Name
• category: Category of Merchant
• amt: Amount of Transaction
• city: City of Credit Card Holder
• state: State of Credit Card Holder
• lat: Latitude Location of Purchase
• long: Longitude Location of Purchase
• city_pop: Credit Card Holder's City Population
• job: Job of Credit Card Holder
• dob: Date of Birth of Credit Card Holder
• trans_num: Transaction Number
• merch_lat: Latitude Location of Merchant
• merch_long: Longitude Location of Merchant
• is_fraud: Whether Transaction is Fraud (1) or Not (0)
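The paired coordinates above (lat/long for the purchase, merch_lat/merch_long for the merchant) make it possible to derive a purchase-to-merchant distance feature; the helper below is a sketch of one common way to do this (the function name and its use as a feature are my assumption, not part of the dataset itself):

```python
import math

def haversine_km(lat1, long1, lat2, long2):
    """Great-circle distance in kilometres between two (lat, long) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(long2 - long1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Example: distance between a purchase location and a distant merchant
d = haversine_km(40.7128, -74.0060, 34.0522, -118.2437)  # New York to Los Angeles
print(round(d))  # roughly 3936 km
```

An unusually large distance between the cardholder's usual location and the merchant is a classic fraud signal, which is consistent with "distance" appearing among the model's input features.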
Feature Selection
• Based on the previous findings, we use the XGBoost Classifier model, paying
particular attention to its hyperparameter settings because of the class
imbalance in the dataset.
import shap

# Explainer for the trained XGBoost model
explainer = shap.TreeExplainer(model_XGB)

# Global explanations over the test set
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test, feature_names=features)
shap.plots.heatmap(explainer(X_test))  # heatmap expects an Explanation object

# Local explanation for a single test instance
shap_values = explainer(X_test.iloc[[instance_idx]])
• Let’s choose two indices (8 and 28) corresponding to the two cases (is_not_fraud and
is_fraud) in order to locally explain their predictions.
• From the force plot, we can see that the features shown in blue push the prediction
lower (to the left of the base value), while features shown in red push it higher (to the
right). In other words, the plot shows which features move the prediction in each direction.
In this particular case, amt (amount) and trans_date_trans_time_unix pushed
the prediction to the left of the base value.
instance_idx = 28
• In this case, we can see how the ‘amt’ feature mainly pushed the prediction to the right of the base value.
from lime.lime_tabular import LimeTabularExplainer

lime_explainer = LimeTabularExplainer(
    X_train.values,
    mode="classification",
    feature_names=features,
    class_names=["is_not_fraud", "is_fraud"])

explanation = lime_explainer.explain_instance(
    X_test.values[instance_idx],
    model_XGB.predict_proba,
    num_features=len(X_test.columns))
explanation.show_in_notebook()
• Now, let’s apply LIME to the same local instances (8, 28).
In this case, we can see the impact of the ‘amt’ feature on the prediction, pushing the
predicted probability toward classifying the transaction as fraud.
SHAP vs LIME
• Let’s consider the first instance (index = 8) and compare its local SHAP and LIME results.
• From the plots, we can see that both methods capture the impact of the most important
features, such as ‘amt’, ‘trans_date_trans_time_unix’, ‘category_encoded’, and ‘merch_long’.
• In conclusion, both methods appear valid for explaining this local prediction.
TREE-BASED FEATURE IMPORTANCE (GINI INDEX):
Formula
Gini Index = 1 − Σ (pᵢ)²
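As a quick worked example of the Gini index (the split counts here are illustrative): a node with 90 legitimate and 10 fraudulent transactions has Gini = 1 − (0.9² + 0.1²) = 0.18.

```python
def gini_index(counts):
    """Gini impurity: 1 - sum of squared class proportions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(gini_index([90, 10]))   # about 0.18 (fairly pure node)
print(gini_index([50, 50]))   # 0.5, maximally impure for two classes
print(gini_index([100, 0]))   # 0.0, perfectly pure node
```

A tree-based model prefers splits that lower the weighted Gini impurity of the child nodes, and a feature's total impurity reduction across all splits is what yields its importance score.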
Importance of Features

Feature       Importance
amt           64.91
merch_long     5.14
long           5.14
city_pop       5.14
merch_lat      5.14
lat            5.14
Others         9.19
Out of a total feature importance score of 100, the amount (amt) feature stands out
substantially with a contribution of 64.91, indicating that transaction amount is the most
important factor in the model's decision-making process for detecting fraudulent
transactions.
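Importance scores like those in the table above can be read from a fitted tree model's feature_importances_ attribute. A minimal sketch using scikit-learn's DecisionTreeClassifier on synthetic data (XGBoost's scikit-learn wrapper exposes the same attribute; the data and feature names here are invented for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 1000

# Synthetic data: 'amt' fully determines the label, the other feature is noise
amt = rng.uniform(1, 500, n)
noise = rng.normal(size=n)
X = np.column_stack([amt, noise])
y = (amt > 400).astype(int)  # "fraud" when the amount is large

model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Gini-based importances are normalized to sum to 1;
# multiply by 100 to get scores on the scale of the table above
for name, imp in zip(["amt", "noise"], model.feature_importances_):
    print(name, round(imp * 100, 2))
```

Because the label depends only on the amount here, essentially all of the importance mass lands on 'amt', mirroring (in exaggerated form) the dominance of amt in the real model.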
References
1. N. Dhieb, H. Ghazzai, H. Besbes and Y. Massoud, "A Secure AI-Driven Architecture for Automated Insurance
Systems: Fraud Detection and Risk Measurement," in IEEE Access, vol. 8, pp. 58546-58558, 2020, doi:
10.1109/ACCESS.2020.2983300.
2. R. Li, Z. Liu, Y. Ma, D. Yang and S. Sun, "Internet Financial Fraud Detection Based on Graph Learning," in IEEE
Transactions on Computational Social Systems, vol. 10, no. 3, pp. 1394-1401, June 2023, doi:
10.1109/TCSS.2022.3189368.
3. Z. Zhang, L. Chen, Q. Liu and P. Wang, "A Fraud Detection Method for Low-Frequency Transaction," in IEEE
Access, vol. 8, pp. 25210-25220, 2020, doi: 10.1109/ACCESS.2020.2970614.
4. M. N. Ashtiani and B. Raahemi, "Intelligent Fraud Detection in Financial Statements Using Machine Learning and
Data Mining: A Systematic Literature Review," in IEEE Access, vol. 10, pp. 72504-72525, 2022, doi:
10.1109/ACCESS.2021.3096799.
5. Kotagiri, A. (2023). Mastering Fraudulent Schemes: A Unified Framework for AI-Driven US Banking Fraud
Detection and Prevention. International Transactions in Artificial Intelligence, 7(7), 1–19. Retrieved from
https://isjr.co.in/index.php/ITAI/article/view/197
6. Md Rokibul Hasan, Md Sumon Gazi, & Nisha Gurung. (2024). Explainable AI in Credit Card Fraud Detection:
Interpretable Models and Transparent Decision-making for Enhanced Trust and Compliance in the USA. Journal of
Computer Science and Technology Studies, 6(2), 01–12. https://doi.org/10.32996/jcsts.2024.6.2.1
7. Sabharwal, R., Miah, S.J., Wamba, S.F. et al. Extending application of explainable artificial intelligence for
managers in financial organizations. Ann Oper Res (2024). https://doi.org/10.1007/s10479-024-05825-9
8. Sai, Chaithanya Vamshi and Das, Debashish and Elmitwally, Nouh and Elezaj, Ogerta and Islam, Md Baharul,
Explainable Ai-Driven Financial Transaction Fraud Detection Using Machine Learning and Deep Neural Networks.
Available at SSRN: https://ssrn.com/abstract=4439980 or http://dx.doi.org/10.2139/ssrn.4439980
9. Raufi, B., Finnegan, C., Longo, L. (2024). A Comparative Analysis of SHAP, LIME, ANCHORS, and DICE for
Interpreting a Dense Neural Network in Credit Card Fraud Detection. In: Longo, L., Lapuschkin, S., Seifert, C. (eds)
Explainable Artificial Intelligence. xAI 2024. Communications in Computer and Information Science, vol 2156.
Springer, Cham. https://doi.org/10.1007/978-3-031-63803-9_20
10. Kennedy, R.K.L., Salekshahrezaee, Z., Villanustre, F. et al. Iterative cleaning and learning of big highly-imbalanced
fraud data using unsupervised learning. J Big Data 10, 106 (2023). https://doi.org/10.1186/s40537-023-00750-3
11. Chhabra, R., Goswami, S. & Ranjan, R.K. A voting ensemble machine learning based credit card fraud detection
using highly imbalance data. Multimed Tools Appl 83, 54729–54753 (2024).
https://doi.org/10.1007/s11042-023-17766-9
12. A. A. Taha and S. J. Malebary, "An Intelligent Approach to Credit Card Fraud Detection Using an Optimized Light
Gradient Boosting Machine," in IEEE Access, vol. 8, pp. 25579-25587, 2020, doi: 10.1109/ACCESS.2020.2971354.
13. Krishnavardhan, N., Govindarajan, M. & Rao, S.V.A. An intelligent credit card fraudulent activity detection using
hybrid deep learning algorithm. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18793-w
14. Talaat, F.M., Aljadani, A., Badawy, M. et al. Toward interpretable credit scoring: integrating explainable artificial
intelligence with deep learning for credit card default prediction. Neural Comput & Applic 36, 4847–4865 (2024).
https://doi.org/10.1007/s00521-023-09232-2
15. T. Awosika, R. M. Shukla and B. Pranggono, "Transparency and Privacy: The Role of Explainable AI and Federated
Learning in Financial Fraud Detection," in IEEE Access, vol. 12, pp. 64551-64560, 2024, doi:
10.1109/ACCESS.2024.3394528.