Mini Project


P2P LENDING LOAN DEFAULT PREDICTION USING MACHINE LEARNING

Submitted by:-
❖ Anindita Sinha - 35311502721
❖ Ayush Goel - 08511502721
❖ Daksh Kaushik - 08711502721
❖ Vipul Varshney - 08211502721

Under the Supervision of:
Dr. Deepika Kumar (HOD CSE)

TABLE OF CONTENTS

➢ Introduction
➢ Benefits of P2P lending
➢ Need of AI model in P2P Lending
➢ Related Work
➢ Objectives
➢ Research Methodology
➢ Dataset
➢ References
INTRODUCTION

What is P2P Lending?


● Peer-to-peer (P2P) lending is a form of financial technology that allows people to lend or borrow money from one another without going through a bank.

● P2P lending websites connect borrowers directly to investors. The site sets the rates and terms and enables the transactions.

● P2P lenders are individual investors who want to get a better return on their cash savings than they would get from a bank savings account or certificate of deposit.

Figure 1. P2P lending process in general [9]


BENEFITS OF P2P LENDING

• Lower Interest Rates

• Increased Lending Opportunities

• Increased Transparency and Control

• Increased Financial System Diversity

• Convenience and Accessibility [19]

Figure 2. Number of lenders and credit recipients of P2P platforms in Lithuania (created by authors based on data from Tarasevičienė 2019; The Central Bank of Lithuania 2020).

NEED OF AI MODEL IN P2P LENDING

Without reliable default prediction, loan defaults lead to:
• Financial Losses
• Increased Non-Performing Assets (NPAs)
• Impact on Capital Adequacy
• Increased Interest Rates for Borrowers
• Reduced Lending Capacity
• Reputation Risk
• Legal and Administrative Costs
• Market Perception
Figure 3. Operation status of P2P platforms in China [17]

RELATED WORK

Author: Jing Zhou, Wei Li, Jiaxin Wang, Shuai Ding, Chengyi Xia (2019)
Research title: Default prediction in P2P lending from high-dimensional data based on machine learning
Methodology: Heterogeneous ensemble learning (GBDT, XGBoost and LightGBM)
Results: Accuracy: 0.84
Limitations: The study does not focus on optimizing the parameters or conducting sensitivity analyses; the authors recommend that future studies deploy algorithms to automate parameter optimization for better results.
Source: [1]

Author: Junhui Xu, Zekai Lu, Ying Xie (2021)
Research title: Loan default prediction of Chinese P2P market: a machine learning methodology
Methodology: Synthetic minority oversampling technique (SMOTE), gradient boosting model (GBM), NN, extreme gradient boosting tree (XGBT) and random forest (RF)
Results: Accuracy: 0.91
Limitations: The study does not consider changes in macroeconomic factors and regulatory policies.
Source: [2]

Author: J. D. Turiel and T. Aste (2020)
Research title: Peer-to-peer loan acceptance and default prediction with artificial intelligence
Methodology: LR and SVM models, DNN, two-phase model
Results: Accuracy: 0.75
Limitations: The integration of the present model with predictive modelling based on information filtering network techniques is not discussed.
Source: [3]

Author: Beibei Niu, Jinzheng Ren and Xiaotao Li
Research title: Credit Scoring Using Machine Learning by Combining Social Network Information: Evidence from Peer-to-Peer Lending
Methodology: Logistic Regression (LR), Random Forest, AdaBoost, LightGBM
Results: Accuracy: 0.66
Limitations: The authors were not able to collect other social network data, such as the frequency of calls, whether they are incoming or outgoing, and the strength of social network ties.
Source: [4]

Author: Xiaojun Ma, Jinglan Sha, Dehua Wang, Yuanbo Yu (2018)
Research title: Study on a prediction of P2P network loan default based on the machine learning LightGBM and XGboost algorithms according to different high dimensional data cleaning
Methodology: LightGBM and XGboost algorithms
Results: Accuracy: 0.86
Limitations: The method is relatively novel, its scope of application is not very extensive, and articles related to it are very rare.
Source: [5]

Author: Li-Hua Li, Alok Kumar Sharma, Ramli Ahmad, Rung-Ching Chen
Research title: Predicting the Default Borrowers in P2P Platform Using Machine Learning Models
Methodology: KNN, Logistic Regression, Random Forest
Results: Accuracy: 0.95
Limitations: P2P lending faces challenges in its development, such as asymmetric information and improper risk-handling methods.
Source: [6]

Author: Yuejin Zhang, Haifeng Li, Mo Hai, Aihua Li (2017)
Research title: Determinants of loan funded successful in online P2P Lending
Methodology: Binary Logistic Regression
Results: Accuracy: 0.77
Limitations: To get reliable results when predicting loan performance, a pre-selection of variables on the basis of credit grades should be undertaken. More precise results can then be expected. This could resolve the discordant results delivered by previous research and give deeper insight into ex-post risk in P2P lending.
Source: [7]

Author: An-Hsing Chang, Li-Kai Yang, Rua-Huan Tsaih (2022)
Research title: Machine learning and artificial neural networks to construct P2P lending credit-scoring model: A case using Lending Club data
Methodology: XGBoost, LightGBM
Results: Accuracy: 0.88
Limitations: Future work could use other linear classification techniques, such as LDA. It would be an interesting extension to exploit the XGBoost algorithm by using many variable selection techniques in statistics.
Source: [8]

Author: Ryan Randy Suryono, Betty Purwandari, and Indra Budi
Research title: Peer to Peer (P2P) Lending Problems and Potential Solutions: A Systematic Literature Review
Methodology: The study produces a table of P2P lending problem identification and alternative solutions by employing a systematic literature review (SLR) of 81 publications.
Limitations: Many cases of improper billing and awareness of privacy data can be investigated in further research. This relates to the feasibility of P2P lending platforms as a significant concern. In addition, there is very limited work on analyzing positive and negative sentiments on P2P lending.
Source: [9]

Author: Hongke Zhao, et al.
Research title: P2P Lending Survey: Platforms, Recent Advances and Prospects
Methodology: Provides a comprehensive survey on P2P lending. Specifically, it summarizes some mainstream P2P lending platforms in the world and provides a systematic taxonomy for them.
Limitations: Suggests several future research directions, including the pricing problem, mechanism improvement, risk management, privacy preserving, and personalization.
Source: [10]

Author: Cheuk Hang Au, Barney Tan, and Yuan Sun
Research title: Developing a P2P lending platform: stages, strategies and platform configurations
Methodology: The study hints at the strategies that can facilitate the various stages. The model can potentially serve as the foundation for formulating guidelines for the managers of P2P lending platforms, so that they are able to optimize the development of their platforms.
Limitations: Generalizability is a potential limitation of the study. Future work will be directed toward extending and validating the process model with the collection and analysis of additional data from Tuodao and possibly other P2P lending platforms.
Source: [11]

Author: Khakan Najaf, Ravichandran K. Subramaniam, and Osama F. Atayah
Research title: Understanding the implications of FinTech Peer-to-Peer (P2P) lending during the COVID-19 pandemic
Methodology: The study examines the impact of the COVID-19 pandemic on the determinants of FinTech Peer-to-Peer (P2P) lending.
Limitations: As the P2P lending market is a considerably new market and still in the development stage, further analysis with a longer time-frame and more diverse macroeconomic conditions will check the robustness of the results.
Source: [12]

Author: An-Hsing Chang, Li-Kai Yang, Rua-Huan Tsaih and Shih-Kuei Lin
Research title: Machine learning and artificial neural networks to construct P2P lending credit-scoring model: A case using Lending Club data
Methodology: Artificial Neural Network (ANN), logistic regression (LR), decision tree, random forest, XGBoost, LightGBM and 2-layer neural networks
Results: Accuracy: 0.88
Limitations: Since the data contains description features, such as the reasons for the loans and the credit document from the lender, the text information could be converted into numerical form via natural language processing, such as sentiment analysis.
Source: [13]

Author: Dong-Her Shih, Ting-Wei Wu, Po-Yuan Shih, Nai-An Lu and Ming-Hung Shih
Research title: A Framework of Global Credit-Scoring Modeling Using Outlier Detection and Machine Learning in a P2P Lending Platform
Methodology: Naive Bayesian (NB), logistic regression (LR), and random forest (RF)
Results: Accuracy: 0.958
Limitations: Only the isolation forest outlier detection method is used. Various other outlier detection methods could have been used to find common outliers.
Source: [14]

Author: Zhida Liu, Zhenyu Zhang, Hongwei Yang, Guoqiang Wang, Zhenwei Xu
Research title: An innovative model fusion algorithm to improve the recall rate of peer-to-peer lending default customers
Methodology: Random Forest, Extra Trees, XGBoost, LightGBM, CatBoost, Artificial Neural Network (ANN), Logistic Regression, RF-ET-GBM-CAT-XGB-Stacking model and LGB-XGB-Stacking model
Results: Accuracy: 0.8899
Limitations: More types of data, such as time series data and geographic location data, could be used to provide more accurate default forecasts.
Source: [15]

Author: Lailatul Nikmah, Dwika Ananda Agustina Pertiwi, Subhan, Jumanto, Yosza Dasril, Iswanto
Research title: New model combination meta-learner to improve accuracy prediction P2P lending with stacking ensemble learning
Methodology: KNN, SVM, Random Forest, Stacking-AdaBoost, Stacking-LightGBM, Stacking-XGBoost, LGBFS-StackingXGBoost
Results: Accuracy: 0.9998
Limitations: Future work includes conducting experiments on larger datasets or datasets from different countries and tuning new models to achieve better performance.
Source: [16]

OBJECTIVES

1) To review prior studies on loan default prediction.

2) To collect and curate a dataset of loan defaulters with their financial information (ratios), and then pre-process the data.

3) To apply feature-transformation techniques to improve model accuracy.

4) To build machine learning models using various algorithms to classify loan risk.

5) To analyse whether the model works irrespective of external factors such as recession or pandemic.
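
One way objective 5 could be examined is to compare model accuracy across time periods. The sketch below is only an illustration under assumptions: the function name `accuracy_by_period`, the period labels, and the toy records are hypothetical and are not taken from the project or its dataset.

```python
from collections import defaultdict


def accuracy_by_period(records):
    """Return accuracy per period.

    `records` is an iterable of (period, actual, predicted) tuples,
    where `actual` and `predicted` are class labels (e.g. 1 = default).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for period, actual, predicted in records:
        totals[period] += 1
        hits[period] += int(actual == predicted)
    return {period: hits[period] / totals[period] for period in totals}


# Hypothetical toy predictions, for illustration only (not project results).
records = [
    ("pre-pandemic", 1, 1),
    ("pre-pandemic", 0, 0),
    ("pre-pandemic", 0, 1),
    ("pre-pandemic", 1, 1),
    ("pandemic", 1, 0),
    ("pandemic", 0, 0),
]
scores = accuracy_by_period(records)
```

A large gap between the per-period scores would suggest that the model is sensitive to external conditions.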

Research Methodology

Machine Learning Process Flow

DATASET

Dataset taken from the Bondora public reports site (https://www.bondora.com/en/public-reports)


Data Pre-processing and Cleaning

● Removal of features unnecessary for model training
● Removal of highly correlated features
● Removal of features with more than 50% null values
● Separation of numerical and categorical data
● Imputation of null values with a simple imputer using the mean strategy
● Encoding of categorical data with a label encoder
● Z-score normalization
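
The steps above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the function name `preprocess`, the 0.9 correlation threshold, the choice of which feature in a correlated pair to drop, and the toy column names are all assumptions. Mean imputation and label encoding are written with plain pandas operations rather than any particular library's imputer or encoder.

```python
import numpy as np
import pandas as pd


def preprocess(df, drop_cols=(), corr_limit=0.9, null_limit=0.5):
    """Apply the listed cleaning steps in order (illustrative sketch)."""
    # 1. Remove features that are unnecessary for model training.
    df = df.drop(columns=list(drop_cols))

    # 2. Remove the later feature of each highly correlated numerical pair.
    #    The 0.9 default threshold is an assumption.
    corr = df.select_dtypes(include="number").corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    redundant = [c for c in upper.columns if (upper[c] > corr_limit).any()]
    df = df.drop(columns=redundant)

    # 3. Remove features in which more than 50% of the values are null.
    df = df.loc[:, df.isna().mean() <= null_limit].copy()

    # 4. Differentiate numerical and categorical columns.
    num_cols = df.select_dtypes(include="number").columns
    cat_cols = df.columns.difference(num_cols)

    # 5. Fill remaining numerical nulls with the column mean.
    df[num_cols] = df[num_cols].fillna(df[num_cols].mean())

    # 6. Label-encode categorical columns as integer codes
    #    (a null category would be encoded as -1).
    for col in cat_cols:
        df[col] = df[col].astype("category").cat.codes

    # 7. Z-score normalization: subtract the mean, divide by the std.
    df[num_cols] = (df[num_cols] - df[num_cols].mean()) / df[num_cols].std(ddof=0)
    return df


# Hypothetical toy table; the column names are illustrative only.
raw = pd.DataFrame({
    "LoanId": ["a", "b", "c", "d"],
    "Amount": [100.0, 200.0, None, 400.0],
    "AmountCopy": [101.0, 201.0, 301.0, 401.0],
    "MostlyNull": [None, None, None, 1.0],
    "Country": ["EE", "FI", "EE", "ES"],
    "Age": [30, 45, 40, 22],
})
out = preprocess(raw, drop_cols=["LoanId"])
```

In this toy run, `AmountCopy` is dropped as highly correlated with `Amount`, and `MostlyNull` is dropped for exceeding the null limit. A constant numerical column would have zero standard deviation and would need separate handling before step 7.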

Machine Learning (Results)

Deep Learning (ANN Results)

LOSS ACCURACY VAL_LOSS VAL_ACCURACY

1.0848 0.9551 0.2568 0.9830

Deep Learning (LSTM Results)

LOSS ACCURACY VAL_LOSS VAL_ACCURACY

0.2503 0.5469 0.2568 0.5830

REFERENCES
[1] Zhou, Jing, et al. "Default prediction in P2P lending from high-dimensional data based on machine learning." Physica A: Statistical Mechanics and its Applications 534 (2019): 122370.

[2] Xu, Junhui, Zekai Lu, and Ying Xie. "Loan default prediction of Chinese P2P market: a machine learning methodology." Scientific Reports 11.1 (2021): 18759.

[3] Turiel, J. D., and T. Aste. "Peer-to-peer loan acceptance and default prediction with artificial intelligence." Royal Society open science 7.6 (2020): 191649.

[4] Niu, Beibei, et al. "Lender trust on the P2P lending: Analysis based on sentiment analysis of comment text." Sustainability 12.8 (2020): 3293.

[5] Ma, Xiaojun, et al. "Study on a prediction of P2P network loan default based on the machine learning LightGBM and XGboost algorithms according to different high dimensional data cleaning." Electronic Commerce Research and Applications 31 (2018): 24-39.

[6] Li, Li-Hua, et al. "Predicting the default borrowers in P2P platform using machine learning models." International Conference on Artificial Intelligence and Sustainable Computing. Cham: Springer International Publishing, 2021.

[7] Zhang, Yuejin, et al. "Determinants of loan funded successful in online P2P Lending." Procedia computer science 122 (2017): 896-901.

[8] Chang, An-Hsing, et al. "Machine learning and artificial neural networks to construct P2P lending credit-scoring model: A case using Lending Club data." Quantitative Finance and Economics 6.2 (2022): 303-325.

[9] Suryono, Ryan Randy, Betty Purwandari, and Indra Budi. "Peer to peer (P2P) lending problems and potential solutions: A systematic literature review." Procedia Computer Science 161 (2019): 204-214.

[10] Zhao, Hongke, et al. "P2P lending survey: Platforms, recent advances and prospects." ACM Transactions on Intelligent Systems and Technology (TIST) 8.6 (2017): 1-28.

[11] Au, Cheuk Hang, Barney Tan, and Yuan Sun. "Developing a P2P lending platform: stages, strategies and platform configurations." Internet Research 30.4 (2020): 1229-1249.

[12] Najaf, Khakan, Ravichandran K. Subramaniam, and Osama F. Atayah. "Understanding the implications of FinTech Peer-to-Peer (P2P) lending during the COVID-19 pandemic." Journal of Sustainable Finance & Investment 12.1 (2022): 87-102.

[13] Chang, An-Hsing, et al. "Machine learning and artificial neural networks to construct P2P lending credit-scoring model: A case using Lending Club data." Quantitative Finance and Economics 6.2 (2022): 303-325.

[14] Shih, Dong-Her, et al. "A Framework of Global Credit-Scoring Modeling Using Outlier Detection and Machine Learning in a P2P Lending Platform." Mathematics 10.13 (2022): 2282.

[15] Liu, Zhida, et al. "An innovative model fusion algorithm to improve the recall rate of peer-to-peer lending default customers." Intelligent Systems with Applications 20 (2023): 200272.

[16] Muslim, M. A., et al. "New model combination meta-learner to improve accuracy prediction P2P lending with stacking ensemble learning." Intelligent Systems with Applications 18 (2023): 200204.

[17] Yoon, Yeujun, Yu Li, and Yan Feng. "Factors affecting platform default risk in online peer-to-peer (P2P) lending business: an empirical study using Chinese online P2P platform data." Electronic Commerce Research 19 (2019): 131-158.

[18] https://www.investopedia.com/terms/p/peer-to-peer-lending.asp

[19] "What Are the Biggest Benefits of P2P Lending?" (financemagnates.com)

