
Loan Prediction Using Artificial Intelligence and Machine Learning

This document is a project report submitted by four students for their B.E. degree in Electronics and Communication Engineering. The project aims to develop a model for predicting loan approvals using machine learning and artificial intelligence techniques. The report includes an introduction describing the problem statement, objectives and methodology. It also includes sections on the dataset used, model development and results, and a conclusion comparing different models. The students developed the model under the guidance of their project supervisor Dr. Urvashi Bansal to partially fulfill the requirements for their undergraduate degree.

Uploaded by

Sumit

B.E.

PROJECT ON

(LOAN PREDICTION USING ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING)

Submitted by:

Lakshay Singh (Roll No. 2018UEC2006)

Archit Salriwal (Roll No. 2018UEC2018)

Sumit Kumar Chaudhary(Roll No. 2018UEC2013)

Kunal (Roll No. 2018UEC2044)

Under the Guidance of

Dr. Urvashi Bansal

Project-I in partial fulfillment of requirement for the award of

B.E. in

Electronics & Communication Engineering

Division of Electronics & Communication Engineering


NETAJI SUBHAS INSTITUTE OF TECHNOLOGY
(UNIVERSITY OF DELHI)
NEW DELHI-110078
YEAR 2021-2022
CERTIFICATE

This is to certify that the report entitled “Loan prediction using machine
learning and artificial intelligence” being submitted by LAKSHAY SINGH,
ARCHIT SALRIWAL, SUMIT KUMAR CHAUDHARY & KUNAL to the
Division of Electronics and Communication Engineering, NSIT, for the award of
the bachelor’s degree of engineering, is the record of the bona fide work carried out
by them under our supervision and guidance. The results contained in the report
have not been submitted either in part or in full to any other university or
institute for the award of any degree or diploma.

SUPERVISOR

Dr. URVASHI BANSAL


ECE Division

ACKNOWLEDGMENT
We offer our sincere thanks to our supervisor, Dr. Urvashi Bansal, for her valuable
assistance, direction, recommendations, information and wholehearted cooperation
over the course of the project. She helped us overcome the challenges that arose
during the development of the project.

We would like to express our sincere thanks to the Head of Department for allowing us
to present the project on the topic “LOAN PREDICTION USING ARTIFICIAL
INTELLIGENCE AND MACHINE LEARNING” at our department as a part of our
Project Thesis for the partial fulfillment of the requirements for the award of the degree
of Bachelor of Technology.

We take this opportunity to thank all our professors who have helped us in our project.

Lakshay Singh (2018UEC2006)

Archit Salriwal (2018UEC2018)

Sumit Chaudhary (2018UEC2013)

Kunal (2018UEC2044)

ABSTRACT
In our banking system, banks have many products to sell, but the main source of
income for any bank is its credit line: banks earn from the interest on the loans
they extend. A bank's profit or loss depends to a large extent on loans, i.e. on
whether customers pay back their loans or default. By predicting loan defaulters,
a bank can reduce its Non-Performing Assets, which makes the study of this
phenomenon very important. Previous research in this area has shown that there
are many methods to study the problem of controlling loan default. Because
correct predictions are very important for maximizing profit, it is essential to
study the nature of the different methods and compare them. An important
approach in predictive analytics, the logistic regression model, is used here to
study the problem of predicting loan defaulters. The data is collected from
Kaggle. Logistic regression models are fitted and different performance measures
are computed. The models are compared on the basis of performance measures such
as sensitivity and specificity. The final results show that the models produce
different results. The model that includes variables other than checking account
information (which indicates the wealth of a customer), namely personal
attributes of the customer such as age, purpose, credit history, credit amount
and credit duration, is marginally better, because these variables should be
taken into account to calculate the probability of default on a loan correctly.
Therefore, with a logistic regression approach, the right customers to target
for granting loans can be identified by evaluating their likelihood of default.
The model concludes that a bank should not only target rich customers for
granting loans but should also assess the other attributes of a customer, which
play a very important part in credit granting decisions and in predicting loan
defaulters.

LIST OF CONTENTS
ACKNOWLEDGEMENT

ABSTRACT

INDEX

LIST OF FIGURES

LIST OF TABLES

CHAPTER 1: INTRODUCTION
1.1 OBJECTIVE
1.2 THE CLASSIFICATION PROBLEM

1.3 STEPS INVOLVED IN MACHINE LEARNING

CHAPTER 2: LITERATURE SURVEY

CHAPTER 3: DATASET

3.1 FEATURES
3.2 LABELS

3.3 VISUALIZING DATA


3.4 EXPLANATION OF THE MAIN CODE

CHAPTER 4: RESULT
4.1 MODELS OF TRAINING AND TESTING THE DATASET

4.2 LOAN PREDICTION USING LOGISTIC REGRESSION

CHAPTER 5: CONCLUSION

5.1 LOAN PREDICTION MODELS COMPARISON


REFERENCES

CHAPTER 1: INTRODUCTION
1. Loan prediction is the process by which a machine learning algorithm predicts
whether a person will get a loan or not.

2. Understanding the problem statement is the first and foremost step. It gives an
intuition of what lies ahead. Let us look at the problem statement.

3. Dream Housing Finance company deals in all home loans. It has a presence across all
urban, semi-urban and rural areas. A customer first applies for a home loan, after which
the company validates the customer's eligibility for the loan. The company wants to
automate the loan eligibility process (in real time) based on the customer details provided
while filling in the online application form. These details are Gender, Marital Status,
Education, Number of Dependents, Income, Loan Amount, Credit History and others. To
automate this process, the company has posed the problem of identifying the customer
segments that are eligible for a loan amount, so that it can specifically target these
customers.

4. It is a classification problem where we have to predict whether a loan would be approved
or not. In a classification problem, we have to predict discrete values based on a given set of
independent variable(s). Classification can be of two types:

5. Binary Classification: here we have to predict one of two given classes. For example:
classifying the gender as male or female, predicting the result as win or loss, etc.
Multiclass Classification: here we have to classify the data into three or more classes.
For example: classifying a movie's genre as comedy, action or romance, classifying
fruits as oranges, apples, or pears, etc.

6. Loan prediction is a very common real-life problem that each retail bank faces at least
once in its lifetime. If done correctly, it can save a retail bank a lot of man-hours.

Steps involved in machine learning

1. Data Collection
● The quantity and quality of the data dictate how accurate the model is

● The outcome of this step is generally a representation of data which we will use for
training

● Using pre-collected data, by way of datasets from Kaggle, UCI, etc., still fits into this
step.

2. Data Preparation

● Wrangle data and prepare it for training

● Clean that which may require it (remove duplicates, correct errors, deal with missing
values, normalization, data type conversions, etc.)

● Randomize data, which erases the effects of the particular order in which we collected
and/or otherwise prepared our data.

3. Choose a Model
● Different algorithms are for different tasks; choose the right one

4. Train the Model


● The goal of training is to answer a question or make a prediction
correctly as often as possible
● Linear regression example: algorithm would need to learn values for m
(or W) and b (x is input, y is output)
● Each iteration of process is a training step.

5. Evaluate the Model


● Uses some metric or combination of metrics to "measure" objective
performance of model
● Test the model against previously unseen data
● This unseen data is meant to be somewhat representative of model
performance in the real world, but still helps tune the model (as opposed
to test data, which does not)
● A good train/evaluate split is 80/20, 70/30, or similar, depending on
domain, data availability, dataset particulars, etc.

6. Parameter Tuning
● This step refers to hyper-parameter tuning, which is an "art form" as
opposed to a science
● Tune model parameters for improved performance.
● Simple model hyper-parameters may include: number of training steps,
learning rate, initialization values and distribution, etc
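
The steps above can be sketched on a toy problem. The following is a minimal illustration, not the project's code: it assumes synthetic data generated from y = 3x + 2 (an assumption made only for this sketch), shuffles it, makes an 80/20 train/evaluate split, and learns m and b by gradient descent. The learning rate and the number of training steps are the hyper-parameters that step 6 would tune.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Data collection (assumption: synthetic data, y = 3x + 2 plus small noise)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 2.0 + rng.normal(0.0, 0.1, size=200)

# 2. Data preparation: randomize the order, then make an 80/20 split
order = rng.permutation(len(x))
x, y = x[order], y[order]
split = int(0.8 * len(x))
x_train, y_train = x[:split], y[:split]
x_eval, y_eval = x[split:], y[split:]

# 3./4. Model y = m*x + b, trained by gradient descent on mean squared error
m, b = 0.0, 0.0
learning_rate = 0.1   # hyper-parameter (step 6)
n_steps = 2000        # hyper-parameter (step 6)
for _ in range(n_steps):          # each iteration is one training step
    error = (m * x_train + b) - y_train
    m -= learning_rate * 2.0 * np.mean(error * x_train)
    b -= learning_rate * 2.0 * np.mean(error)

# 5. Evaluation on data the model has not seen during training
eval_mse = float(np.mean((m * x_eval + b - y_eval) ** 2))
print(f"m={m:.3f}, b={b:.3f}, eval MSE={eval_mse:.4f}")
```

The learned m and b should land close to the values used to generate the data, and the evaluation error should be close to the noise level.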

CHAPTER 2: LITERATURE SURVEY


We start our literature review with more general systematic literature reviews that
focus on the application of machine learning in the general field of banking risk
management. Since the global financial crisis, risk management has taken a major
role in shaping decision-making at banks. A major part of risk management is the
approval of loans to promising candidates. However, the black-box nature of machine
learning algorithms makes many loan providers wary of the results. The extensive
report by Martin Leo, Suneel Sharma and K. Maddulety [1] explored where machine
learning is being used in the fields of credit risk, market risk, operational risk,
and liquidity risk, only to conclude that the existing research falls short and
that more extensive research is required in the field.

We could not find any literature review on which specific machine learning
algorithms to use for loan prediction, which would have been a possible starting
point for our paper. Instead, since loan prediction is a classification problem, we
went with popular classification algorithms used for similar problems. Ashlesha
Vaidya [2] used logistic regression as a probabilistic and predictive approach to
loan approval prediction. The author pointed out that artificial neural networks and
logistic regression are the most used methods for loan prediction, as they are
comparatively easier to develop and provide the most accurate predictive analysis.
One of the reasons given is that other algorithms are generally bad at predicting
from non-normalized data, whereas nonlinear effects and power terms are easily
handled by logistic regression, as there is no need for the independent variables on
which the prediction is based to be normally distributed.
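
To make the "probabilistic" part concrete, here is a minimal sketch of how a fitted logistic regression turns applicant features into an approval probability. The coefficients, the bias and the two features are hypothetical values chosen for illustration; they are not taken from [2] or from this project's model.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients and bias (assumptions for illustration only)
weights = np.array([1.5, -0.8])
bias = -0.2

# Hypothetical applicant: [credit history flag, scaled loan amount]
applicant = np.array([1.0, 0.5])

z = float(weights @ applicant + bias)   # linear score
p_approved = sigmoid(z)                 # probability of the positive class
decision = "approve" if p_approved >= 0.5 else "reject"
print(f"score={z:.2f}, probability={p_approved:.3f}, decision={decision}")
```

The 0.5 cut-off is a common default; a bank could choose a stricter threshold if false approvals are more costly than false rejections.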

Logistic regression still has its limitations: it requires a large sample of data for
parameter estimation. It also requires that the explanatory variables be
independent of each other; otherwise the model tends to overweigh the importance of
the mutually dependent variables.

A solution to this multicollinearity problem among the categorical explanatory
variables is the use of a categorical principal component analysis, as used by
Guilder and Ozlem [3] in a case study on housing loan approval data. The goal of
principal component analysis (PCA) is to reduce a set of m variables, many of which
are highly correlated with each other, to a smaller set of n uncorrelated variables
called principal components, which account for most of the variance in the original
m variables. Methods such as PCA are known as dimension reduction methods. PCA is
suitable for scaled continuous variables, but it is not an appropriate method of
dimension reduction for categorical variables. Thus, the authors used a version of
PCA adapted to categorical data called CATPCA, or categorical (nonlinear) principal
components analysis, which is specifically developed for cases where the variables
are a mix of nominal, ordinal, or numeric data that may not have linear
relationships with each other. CATPCA works by using an optimal scaling process to
convert the categorical variables into numeric variables.
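
The reduction from m correlated variables to n uncorrelated components can be sketched as follows. This is ordinary PCA on synthetic continuous data (an assumption for illustration), not the CATPCA procedure used in [3], which additionally rescales categorical variables before the decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with m = 3 variables; the first two are strongly correlated
base = rng.normal(size=500)
X = np.column_stack([
    base + 0.05 * rng.normal(size=500),
    2.0 * base + 0.05 * rng.normal(size=500),
    rng.normal(size=500),
])

# Standardize each variable, then diagonalize the covariance matrix
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
cov = np.cov(Xs, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]           # sort by explained variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep n = 2 principal components
n = 2
components = Xs @ eigvecs[:, :n]
explained = float(eigvals[:n].sum() / eigvals.sum())
comp_corr = np.corrcoef(components, rowvar=False)

print(f"shape: {components.shape}, variance explained: {explained:.3f}")
print(f"correlation between components: {comp_corr[0, 1]:.2e}")
```

Because two of the three variables carry nearly the same information, two components retain almost all of the variance, and the components themselves are uncorrelated.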

In a related vein, Zaghdoudi, Djebali & Mezni [4] compared the use of Linear
Discriminant Analysis versus Logistic Regression for credit scoring and for
foreseeing the default risk of small and medium enterprises. Linear Discriminant
Analysis (LDA) is like PCA in that it performs dimensionality reduction, but instead
of looking for the directions of greatest variation, LDA focuses on maximizing the
separability among the known categories. The resulting subspace, which separates the
classes well, is usually one in which a linear classifier can be learned. The
difference between the two methods in classifying the enterprises correctly into
their original groups was inconsequential, with Logistic regression having a 0.3%
better accuracy score than LDA.
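
The idea of projecting onto the direction that best separates known classes can be sketched for two classes with Fisher's linear discriminant. The data below is synthetic and the class means are assumptions chosen for illustration; this is not the data or the exact procedure of [4].

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic classes in 2-D (assumption: e.g. "no default" vs "default")
class0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
class1 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(200, 2))

mu0, mu1 = class0.mean(axis=0), class1.mean(axis=0)

# Within-class scatter: sum of the per-class covariance matrices
Sw = np.cov(class0, rowvar=False) + np.cov(class1, rowvar=False)

# Fisher direction: maximizes between-class over within-class spread
w = np.linalg.solve(Sw, mu1 - mu0)
w /= np.linalg.norm(w)

# Project to 1-D and classify with the midpoint between the projected means
threshold = 0.5 * (mu0 + mu1) @ w
pred0 = class0 @ w > threshold   # True means "predicted class 1"
pred1 = class1 @ w > threshold
accuracy = float((np.sum(~pred0) + np.sum(pred1)) / 400)
print(f"direction={w.round(3)}, accuracy={accuracy:.3f}")
```

Unlike PCA, the direction here is chosen using the class labels, which is why a simple threshold on the one-dimensional projection already acts as a linear classifier.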

Another approach, by T. Sunitha and colleagues [5], was to predict loan status
using Logistic Regression and a Binary Tree. A Decision Tree is an algorithm for a
predictive machine learning model.

Classification and Regression Trees, referred to as CART for short, were introduced
by Leo Breiman. They suit both predictive and decision modeling problems. This
Binary Tree methodology uses a greedy method to select the best split. Although
Decision Trees gave a similar accuracy, their benefit in this case was that they
gave equal importance to both accuracy and prediction. The model succeeded in
making fewer false predictions, which reduces the risk factor.

Rajiv Kumar and Vinod Jain [6] proposed a model using machine learning
algorithms to predict the loan approval of customers. They applied three machine
learning algorithms, Logistic Regression (LR), Decision Tree (DT), and Random
Forest (RF) using Python on a test data set. From the results, they concluded that the
Decision Tree machine learning algorithm performs better than Logistic Regression
and Random Forest machine learning approaches. It also opens other areas on which
the Decision Tree algorithm is applicable.

Some machine learning models give different weights to each factor but in practice
sometimes loans can be sanctioned based on a single strong factor only. To eliminate
this problem J. Tejaswini and T. Mohana Kavya [7] in their research paper have built
a loan prediction system that automatically calculates the weight of each feature
taking part in loan processing and on new test data the same features are processed
concerning their associated weight. They have implemented six machine learning
classification models using R for choosing the deserving loan applicants. The models
include Decision Trees, Random Forest, Support Vector Machine, Linear Models,
Neural Network and Adaboost. The authors concluded that the accuracy of the
Decision Tree is highest among all models and performs better on the loan prediction
system.

Predicting loan defaulters is an important process of the banking system as it directly
affects profitability. However, the loan default data sets available are highly
imbalanced, which results in poor performance of the algorithms. Lifeng Zhou and Hong
Wang [8] in their paper made loan default predictions on imbalanced data sets using
an improved random forests approach. In this approach, the authors employed weights
in decision tree aggregation. The weights are calculated and assigned to each tree
in the forest during the forest construction process using Out-of-bag (OOB) errors.
The experimental results show that the improved algorithm performs better and has
better accuracy than the original random forest and other popular classification
algorithms such as SVM, KNN, and C4.5. The research leaves room for improving the
efficiency of the algorithm if parallel random forests are used in further work.

Anchal Goyal and Ranpreet Kaur [9] discuss various ensemble algorithms. An ensemble
algorithm is a supervised machine learning algorithm that combines two or more
algorithms to get better predictive performance. They carried out a systematic
literature review to compare ensemble models with various stand-alone models such
as neural networks, SVM, regression, etc. After reviewing the literature, the
authors concluded that the ensemble model performs better than the stand-alone
models and that combining algorithms also improves the accuracy of the model.

Data mining is also becoming popular in the banking sector as it extracts
information from tremendous amounts of accumulated data. Aboobyda Jafar Hamid and
Tarig Mohammed Ahmed [10] focused on implementing data mining techniques using
three models, J48, bayesNet, and naiveBayes, for classifying loan risk in the
banking sector. The authors implemented and tested the models using the Weka
application. In their work, they compared these algorithms in terms of accuracy in
classifying the data correctly. The data was split so that 80% represented the
training dataset and 20% represented the testing dataset. After analyzing the
results, the authors concluded that the best algorithm among the three is J48, in
terms of high accuracy and low mean absolute error.

CHAPTER 3: DATASET
● Here we have two datasets: train_dataset.csv and test_dataset.csv.
● These are datasets of loan applications whose features include annual
income, marital status, whether there are dependents, education, whether
a credit history is present, loan amount, etc.
● The outcome is represented by the Loan_Status column in the train
dataset.
● This column is absent in test_dataset.csv, as we need to assign the loan
status with the help of the training dataset.
● These two datasets are already uploaded to Google Colab.

FEATURES PRESENT IN LOAN PREDICTION


● Loan_ID – The ID number generated by the bank that is giving the loan.
● Gender – Whether the person taking the loan is male or female.
● Married – Whether the person is married or unmarried.
● Dependents – Family members who stay with the person.
● Education – Educational qualification of the person taking the loan.
● Self_Employed – Whether the person is self-employed or not.
● ApplicantIncome – The basic salary or income of the applicant per
month.
● CoapplicantIncome – The basic income of family members applying with
the applicant.
● LoanAmount – The amount of the loan applied for.
● Loan_Amount_Term – How much time the loan applicant takes to
pay back the loan.
● Credit_History – Whether the loan applicant has previously taken a loan
from the same bank.
● Property_Area – The area where the person stays
(Rural/Semiurban/Urban).

LABELS
● LOAN_STATUS – Based on the features mentioned above, the machine
learning algorithm decides whether the person should be given a loan or not.

Visualizing data
Code and output
#Importing required libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from scipy.stats import norm
from sklearn.preprocessing import StandardScaler
from scipy import stats
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

df_train = pd.read_csv('train_dataset.csv')
# take a look at the top 5 rows of the train set, notice the column "Loan_Status"
df_train.head()
# Histograms of applicant income for applicants grouped by gender and marital status
# (seaborn 0.9 renamed the FacetGrid argument `size` to `height`)
grid = sns.FacetGrid(df_train, row='Gender', col='Married', height=2.2, aspect=1.6)
grid.map(plt.hist, 'ApplicantIncome', alpha=.5, bins=10)
grid.add_legend()

# Graphs plotted based on categories gender and education
grid = sns.FacetGrid(df_train, row='Gender', col='Education', height=2.2, aspect=1.6)
grid.map(plt.hist, 'ApplicantIncome', alpha=.5, bins=10)
grid.add_legend()

# Graphs plotted based on categories marriage and education
grid = sns.FacetGrid(df_train, row='Married', col='Education', height=2.2, aspect=1.6)
grid.map(plt.hist, 'ApplicantIncome', alpha=.5, bins=10)
grid.add_legend()
# Histogram and normal probability plot
# (distplot is deprecated in recent seaborn releases; it is kept here for fit=norm)
sns.distplot(df_train['ApplicantIncome'], fit=norm)
fig = plt.figure()
res = stats.probplot(df_train['ApplicantIncome'], plot=plt)

# Correlation matrix; numeric_only=True (pandas >= 1.5) skips the text columns
corrmat = df_train.corr(numeric_only=True)
f, ax = plt.subplots(figsize=(12, 9))
sns.heatmap(corrmat, vmax=.8, square=True)

# Applicant income grouped by marital status and number of dependents
grid = sns.FacetGrid(df_train, row='Married', col='Dependents', height=3.2, aspect=1.6)
grid.map(plt.hist, 'ApplicantIncome', alpha=.5, bins=10)
grid.add_legend()

# Distributions of applicant income, coapplicant income and loan amount
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(14, 6))
sns.distplot(df_train['ApplicantIncome'], ax=axes[0]).set_title('ApplicantIncome Distribution')
axes[0].set_ylabel('ApplicantIncome Count')
sns.distplot(df_train['CoapplicantIncome'], color="r", ax=axes[1]).set_title('CoapplicantIncome Distribution')
axes[1].set_ylabel('CoapplicantIncome Count')
sns.distplot(df_train['LoanAmount'], color="g", ax=axes[2]).set_title('LoanAmount Distribution')
axes[2].set_ylabel('LoanAmount Count')
plt.tight_layout()
plt.show()
plt.gcf().clear()

# Counts of applicants by education, self-employment status and property area
fig, axes = plt.subplots(ncols=3, figsize=(12, 6))

# Pass the column as x= (recent seaborn versions expect keyword arguments)
g = sns.countplot(x=df_train["Education"], ax=axes[0])
plt.setp(g.get_xticklabels(), rotation=90)
g = sns.countplot(x=df_train["Self_Employed"], ax=axes[1])
plt.setp(g.get_xticklabels(), rotation=90)
g = sns.countplot(x=df_train["Property_Area"], ax=axes[2])
plt.setp(g.get_xticklabels(), rotation=90)
plt.tight_layout()
plt.show()
plt.gcf().clear()
Explanation of the Main Code
1. Logistic Regression model

# Importing required Libraries


import pandas as pd
import numpy as np # For mathematical calculations
import seaborn as sns # For data visualization
import matplotlib.pyplot as plt # For plotting graphs
# Importing dataset
train = pd.read_csv('train_dataset.csv')
test = pd.read_csv('test_dataset.csv')
# Converting the '3+' category to a number
# (assignment is used instead of inplace=True on a column, which newer pandas versions warn about)
train['Dependents'] = train['Dependents'].replace('3+', 3)
test['Dependents'] = test['Dependents'].replace('3+', 3)
# Take a look at the top 5 rows of the train set; notice the column "Loan_Status"
train.head()

# Take a look at the top 5 rows of the test set; notice the absence of "Loan_Status", which we will predict
test.head()

# Handling Missing Values

# Check how many null values there are in each column
train.isnull().sum()
# Train: categorical variables, fill missing values with the most frequent value
train['Gender'] = train['Gender'].fillna(train['Gender'].mode()[0])
train['Married'] = train['Married'].fillna(train['Married'].mode()[0])
train['Dependents'] = train['Dependents'].fillna(train['Dependents'].mode()[0])
train['Self_Employed'] = train['Self_Employed'].fillna(train['Self_Employed'].mode()[0])
train['Credit_History'] = train['Credit_History'].fillna(train['Credit_History'].mode()[0])
# Train: numerical variables, fill missing values with the mode or the median
train['Loan_Amount_Term'] = train['Loan_Amount_Term'].fillna(train['Loan_Amount_Term'].mode()[0])
train['LoanAmount'] = train['LoanAmount'].fillna(train['LoanAmount'].median())
# Train: check whether any null values remain
train.isnull().sum()
# Test: check how many null values there are in each column
test.isnull().sum()
# Test: categorical variables, fill missing values with the most frequent value
test['Gender'] = test['Gender'].fillna(test['Gender'].mode()[0])
test['Married'] = test['Married'].fillna(test['Married'].mode()[0])
test['Dependents'] = test['Dependents'].fillna(test['Dependents'].mode()[0])
test['Self_Employed'] = test['Self_Employed'].fillna(test['Self_Employed'].mode()[0])
test['Credit_History'] = test['Credit_History'].fillna(test['Credit_History'].mode()[0])
# Test: numerical variables, fill missing values with the mode or the median
test['Loan_Amount_Term'] = test['Loan_Amount_Term'].fillna(test['Loan_Amount_Term'].mode()[0])
test['LoanAmount'] = test['LoanAmount'].fillna(test['LoanAmount'].median())
# Test: check whether any null values remain
test.isnull().sum()
# Outlier treatment: a log transform compresses very large loan amounts
train['LoanAmount'] = np.log(train['LoanAmount'])
test['LoanAmount'] = np.log(test['LoanAmount'])
# Separating the variables into independent (X) and dependent (y)
# Column 0 (Loan_ID) is dropped; the last column (Loan_Status) is the label
X = train.iloc[:, 1:-1].values
y = train.iloc[:, -1].values
# Converting categorical variables into integer codes
from sklearn.preprocessing import LabelEncoder
labelencoder_X = LabelEncoder()
# Gender
X[:,0] = labelencoder_X.fit_transform(X[:,0])
# Marriage
X[:,1] = labelencoder_X.fit_transform(X[:,1])
# Education
X[:,3] = labelencoder_X.fit_transform(X[:,3])
# Self Employed
X[:,4] = labelencoder_X.fit_transform(X[:,4])
# Property Area
X[:,-1] = labelencoder_X.fit_transform(X[:,-1])
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Fitting Logistic Regression to our training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)
# Predicting the results
y_pred = classifier.predict(X_test)
# Printing values of whether loan is accepted or rejected
y_pred[:100]

# import classification_report
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))

# implementing the confusion matrix


from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)
# f, ax = plt.subplots(figsize=(9, 6))
sns.heatmap(cm, annot=True, fmt="d")
plt.title('Confusion matrix of the classifier')
plt.xlabel('Predicted')
plt.ylabel('True')
# Check Accuracy
from sklearn.metrics import accuracy_score
accuracy_score(y_test,y_pred)
0.8373983739837398

# Applying k-Fold Cross Validation
from sklearn.model_selection import cross_val_score
accuracies = cross_val_score(estimator=classifier, X=X_train, y=y_train, cv=10)
accuracies.mean()
# accuracies.std()

0.8024081632653062
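
The abstract names sensitivity and specificity as performance measures. As a minimal sketch, they can be derived from the four confusion-matrix counts as shown below. The labels here are hypothetical values made up for illustration; they are not the predictions of the model above.

```python
# Hypothetical true and predicted loan statuses ("Y" = approved, "N" = rejected)
y_true = ["Y", "Y", "Y", "Y", "Y", "Y", "N", "N", "N", "N"]
y_pred = ["Y", "Y", "Y", "Y", "Y", "N", "Y", "Y", "N", "N"]

pairs = list(zip(y_true, y_pred))
tp = sum(t == "Y" and p == "Y" for t, p in pairs)  # approved, predicted approved
tn = sum(t == "N" and p == "N" for t, p in pairs)  # rejected, predicted rejected
fp = sum(t == "N" and p == "Y" for t, p in pairs)  # rejected, predicted approved
fn = sum(t == "Y" and p == "N" for t, p in pairs)  # approved, predicted rejected

sensitivity = tp / (tp + fn)            # share of true "Y" cases that are caught
specificity = tn / (tn + fp)            # share of true "N" cases that are caught
accuracy = (tp + tn) / len(pairs)

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} accuracy={accuracy:.3f}")
```

For a lender, low specificity means many applicants who should be rejected are predicted as approvals, which accuracy alone can hide when most applicants in the data are approved.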

CHAPTER 5: CONCLUSION

Loan prediction models comparison          | Accuracy           | Accuracy using K-fold Cross Validation
Loan Prediction Using Logistic Regression  | 0.8373983739837398 | 0.8024081632653062

The task of this machine learning project is to train a model that accepts or
rejects loan applications. There are 3 models that can be trained and tested to
predict whether other applicants could get a loan or not. The first model uses
logistic regression, for which the accuracy is 0.8374 and the accuracy using k-fold
cross validation is 0.8024. Among all the models, logistic regression gives the
best accuracy. This logistic regression model was trained on one dataset and
tested on another.
REFERENCES
1. https://www.researchgate.net/publication/242579096
2. https://www.kaggle.com/datasets/altruistdelhite04/loan-prediction
3. https://datahack.analyticsvidhya.com/contest
4. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
5. https://ieeexplore.ieee.org/document/8962457
