Research Article
Predicting Divorce Prospect Using Ensemble Learning: Support Vector Machine, Linear Model, and Neural Network
Mian Muhammad Sadiq Fareed,1 Ali Raza,2 Na Zhao,3 Aqil Tariq,4 Faizan Younas,2 Gulnaz Ahmed,2 Saleem Ullah,2 Syeda Fizzah Jillani,5 Irfan Abbas,6 and Muhammad Aslam7
1 Department of Software Engineering, University of Central, Punjab 54000, 1-Khayaban-e-Jinnah Road, Johar Town, Lahore, Pakistan
2 Department of Computer Science, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan 64200, Pakistan
3 State Key Laboratory of Resources and Environmental Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
4 State Key Laboratory of Information Engineering in Surveying Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
5 Department of Physics, Physical Sciences Building, Aberystwyth University, Aberystwyth SY23, UK
6 School of Agricultural Equipment Engineering, Jiangsu University, Zhenjiang 212013, China
7 School of Computing Engineering and Physical Sciences, University of West of Scotland, Paisley, UK
Received 17 March 2022; Revised 20 April 2022; Accepted 23 May 2022; Published 11 July 2022
Copyright © 2022 Mian Muhammad Sadiq Fareed et al. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
A divorce is a legal step taken by married people to end their marriage. It occurs after a couple decides to no longer live together as husband and wife. Globally, the divorce rate more than doubled between 1970 and 2008, with divorces per 1,000 married people rising from 2.6 to 5.5. Divorce occurs at a rate of 16.9 per 1,000 married women. According to experts, over half of all marriages in the United States end in divorce or separation. A novel ensemble learning technique based on advanced machine learning algorithms is proposed in this study. The support vector machine (SVM), passive aggressive classifier, and neural network (MLP) are applied in the context of divorce prediction. A question-based dataset was created by a field specialist. The responses to the questions provide important information about whether a marriage is likely to end in divorce in the future. Cross-validation is applied in 5 folds, and the performance results of the evaluation metrics are examined. The accuracy score is 100%, and the Receiver Operating Characteristic (ROC) curve score, recall score, precision score, and F1 score are close to 97%. Our findings identify the key indicators for divorce and the factors that are most significant when predicting divorce.
article. There are several ways to achieve the aim of having a court issue an absolute divorce ruling. For convenience, it has been the usual practice in law to classify each of these procedures as a different type of divorce, which we will do here.

The following figures summarize divorce data for the United States. There have been 2,015,603 weddings, and marriage occurs at a rate of 6.1 per 1,000 of the total population. There have been 746,971 divorces, and divorce occurs at a rate of 2.7 per 1,000 people (45 reporting states) [1].

Divorce occurs at a rate of 16.9 per 1,000 married women. Many experts believe that this is a far more authentic representation of the genuine divorce rate [2] than the raw number. The divorce rate per 1,000 married women is about double what it was in 1960; nonetheless, it is lower than the all-time high of 22.6 in the early 1980s. In the United States, about half of all marriages end in separation or divorce. According to the researchers, 41% of all first marriages result in divorce, about 60% of second marriages fail, and about 73% of third marriages end in divorce. The United States has the world’s sixth highest divorce rate [3].

Machine learning is an artificial intelligence (AI) technique that enables computers to learn and improve automatically without being explicitly programmed. Machine learning [4] is concerned with the development of computer programs that can access data and use it to learn on their own. Text classification [5] is a machine learning approach that assigns tags or categories to text automatically. Using natural language processing (NLP) [7], text classifiers can evaluate and categorize text by sentiment [6], subject, and consumer intent more quickly and more accurately than people.

Ensemble modeling is an effective method for improving the performance of a model. It typically pays to use ensemble learning in addition to any other models we may be developing. Ensemble learning techniques [8] are a kind of machine learning methodology that combines numerous base techniques to create the best prediction technique.

Divorce prospect prediction is the core objective of this research study. The main contributions of this research are the following:

(i) A novel research study on divorce prospect prediction using a questionnaire dataset is proposed in this paper.
(ii) Three advanced machine learning models, the support vector machine (SVM), passive aggressive classifier (PAC), and neural network (multilayer perceptron classifier), are utilized for the prediction task. The employed techniques are fully hyperparameter tuned.
(iii) An enhanced novel ensemble learning approach based on the three machine learning techniques is employed to predict the divorce prospect of the couple.
(iv) Divorce exploratory data analysis (DEDA) is conducted to get fruitful insights from the dataset and to determine the major factors that cause divorce.
(v) Cross-validation (CV) is applied in 5 folds, and the evaluation metrics of the proposed approach are examined.
(vi) A comparative analysis of model performance is conducted among the three employed SVM, PAC, and neural network approaches.

The rest of the paper is organized as follows. The divorce-related work is examined in Section 2. The methodology of our proposed research approach is analyzed in Section 3. The applied advanced machine learning techniques are examined in Section 4. Then, a novel ensemble learning approach based on three machine learning techniques is discussed in Section 5. The results and evaluation of the proposed approaches are explained and discussed in Section 6. Finally, Section 7 contains the conclusion of this research study.

2. Related Work

The authors used Yöntem’s findings to construct 56 questions as divorce predictors. Furthermore, they employed four automated learning models (perceptron, logistic regression, neural networks, and randomized forest) as well as three hybrid models based on voting criteria. Each of these models was trained in 5 distinct scenarios, resulting in a total of 35 tests. The accuracy, sensitivity, and specificity attained were 0.98, 1.0, and 0.96, respectively, for the perceptron model and a hybrid model [9].

Categorization approaches were used to forecast divorce in Turkey in an investigation carried out in 2019. The authors determined that the ANN technique paired with correlation-based feature space selection performs best, with an accuracy score of 98% and a Kappa value of 0.97. The SVM model training time is also less than the ANN model training time [10].

In another study, the authors retained the significant characteristics by deleting duplicate features that do not help with the prediction and applied an improved machine learning technique to the standard dataset to forecast divorce. They were able to reach 99% accuracy. This technique may also be utilized as evidence by family counseling professionals on a couple’s emotional and psychological well-being [11].

In a further study, divorce prediction was performed utilizing the Divorce Predictors Scale (DPS) based on Gottman couples therapy. The success of the DPS was explored utilizing multilayer perceptron (MLP) neural networks and decision tree algorithms. The study also seeks to identify the most important features of the Divorce Predictors Scale that influence divorce. When the direct classification learning methods were applied to the divorce dataset, the RBF neural network had the greatest success rate of 98%. This scale can be used by family counselors and family therapists to help with case formulation and intervention planning. Furthermore, the predictors of divorce in Gottman couples therapy were verified in the Turkish samples [12].

A long-term, prospective longitudinal study explored the predictability of divorce. During the 14-year research period, prediction was attainable with a technique that incorporated marital happiness, concerns of
Computational Intelligence and Neuroscience 3
the marriage breakup, and emotional interaction in both talks. The algorithm correctly predicted divorce 93% of the time [13].

An artificial neural network (ANN) technique was created and employed in [14] to predict whether or not a couple will divorce. The prediction is based on several questions that the couple answered, and the answers to those questions served as the input data to the ANN model. The model was trained repeatedly over training and validation cycles until it achieved 100% accuracy [14].

The authors of [15] presented a study on the prediction of divorce cases using available machine learning techniques. They compared the accuracy of the perceptron, random forest, decision tree, Naive Bayes, support vector machine, and K-nearest neighbor classifiers for divorce case prediction. Following training, the algorithm forecasts whether or not the divorce will materialize. This allows the therapist to assess how stressful a couple’s condition is and properly counsel them. With the perceptron model, the authors attained an accuracy of 98% [15].

The detection of COVID-19 based on a blood test was proposed in [16]. An ensemble-learning-based approach was developed for the prediction of COVID-19. In the first stage of the research, a deep-learning-based convolutional neural network (CNN) classifier was utilized. The dataset came from the San Raffaele Hospital. In the second stage, 15 different machine-learning-based classifiers were applied. The findings show that the ensemble learning model achieved an accuracy score of 99%.

Malware detection based on ensemble learning techniques was proposed in [17]. A fully connected convolutional neural network (CNN)-based classifier was developed for base-stage classification, and 15 machine-learning-based classifiers were utilized for end-stage classification. A dataset of Windows Portable Executable (PE) malware was used for model training and testing. The research findings show that the fully connected CNN ensemble model and the machine-learning-based extra trees classifier achieved an accuracy score of 100%.

In conclusion, our proposed research study is based on the prediction of divorce prospects using ensemble learning techniques. The comparative analysis with past studies shows that our study outperforms them by utilizing advanced techniques. The results are efficient, validated, and higher than those of the past approaches. We have revealed the key indicators for divorce and the factors that are the most significant when predicting divorce in this research study.

3. Methodology

The methodology of the proposed research study is presented in this section, and the workflow of our research is elaborated here.

The questionnaire dataset is analyzed and useful insights are taken from it. Feature engineering is applied to build a predictive model with the best-fit features in the context of divorce prediction. Data normalization is applied to bring the dataset into a suitable form for our proposed model. Dataset splitting is then applied to split the dataset into two portions: 80% of the data is used for model training and 20% is used for model testing and performance evaluation. The three models are applied with the ensemble learning approach. Finally, the ensemble learning model prediction is used for predicting divorce.

The research methodology is shown in Figure 1, which visualizes the workflow of the complete research study. In the first step, the questionnaire dataset is analyzed by exploratory data analysis (EDA). In the next step, feature engineering is applied to get the useful features for the ensemble learning model. Then, data normalization is applied, followed by dataset splitting. The training portion is given to the model, and the trained model is then evaluated on the test portion. After all these steps, a predictive ensemble learning model is formed and ready to predict the divorce of a couple.

[Figure 1: Workflow diagram with the blocks Questionnaire Dataset Analysis, Feature Engineering, Normalize, Dataset Splitting (80% train, 20% test), SVM, Linear Model, Neural Network, Prediction, and Divorce Predict.]

3.1. Dataset. The dataset is based on the questions asked by the specialists to the married couples [18]. The answers to these 54 questions are used to predict the chance of divorce between them. The questions are graded on a scale of 0 to 4, with 0 being the worst and 4 being the best. The last column indicates whether or not the couple has divorced. Table 1 contains the descriptive dataset analysis.

3.2. Divorce Exploratory Data Analysis. Divorce exploratory data analysis (DEDA) refers to the essential process of conducting preliminary investigations on the data to spot anomalies. By applying DEDA, hidden data patterns can be uncovered, hypotheses can be tested, and assumptions can be validated using graphical representations and summary statistics.

The bar plot in Figure 2 is drawn on the Divorce_Y_N column. In the bar plot, 0 represents the no-divorce class and 1 represents the divorce class. The bar plot shows the total number of rows in each class: 86 for divorce and 84 for no divorce. The bar chart shows that the dataset is balanced, as both classes have approximately the same number of rows.

The violin charts in Figures 3 to 5 are plotted on the dataset to explore the causes of divorce. A violin graph is a cross between a kernel density plot and a box plot that visualizes the data peaks. It is used to display how numerical data points are distributed in the dataset.

In contrast to a box plot, which only conveys summary statistics, violin graphs visualize summary statistics as well as the frequency of every value. In the violin plot of the I’m_not_wrong (51) column, we observe that as the value increases, the number of divorces increases, and as the value decreases, the number of divorces decreases. The analysis graph also shows that this feature has a great impact on the Divorce_Y_N column.

In Figure 3, the violin plot is also explored for the love (16), common_goals (10), and enjoy_holiday (8) columns. The graph shows how the divorce and no-divorce cases are distributed as the scale value changes. The violin plot is also drawn for the happy (17), always_never (32), trust (21), and you’re_inadequate (53) columns in Figure 4 and shows how the cause of divorce changes when the scale changes. The violin plot of argue_then_leave (42), humiliate (36), and friends_social (30) is analyzed in Figure 5, where we explore whether the change in divorce outcome is linked with the scale change.

All these divorce analyses prove to be very fruitful for getting useful insights from the dataset and its related features.

The histogram charts of the dataset are shown in Figures 6 and 7. A histogram is a data representation tool that resembles a bar chart and buckets a range of outcomes along the x-axis. The count or percentage of value occurrences in the dataset for every column is represented on the y-axis.

We plot the histogram of the features 2_stranger (7), silence_instead_of_discussion (45), I’m_not_wrong (51), good_to_leave_home (44), I’m_not_guilty (50), humiliate (36), not_calm (37), negative_personality (33), and know_well (29) and obtain the total number of counts for the different scale values. The histogram is also plotted for insult (35), common_goals (10), no_home_time (6), special_time (5), contact (4), begin_correct (3), ignore_diff (2), incompetence (54), and always_never (32) by counting the number of different scale values. The histogram of the features friends_social (30), know_well (29), hopes_wishes (28), current_stress (27), anxieties (26), inner_world (25), fav_food (23), care_sick (22), and likes (21) shows the total number of counts on the y-axis and the 0 to 4 scale on the x-axis. The histogram chart is also plotted on the trust, roles, marriage, love, and dreams columns, with the number of counts on the y-axis and the scale on the x-axis.

From Figure 6, we find that the feature I’m_not_wrong (51) has higher rank values than all others. This shows that this question is a major cause of divorce, which is why it has higher ranked scale values.

This divorce histogram analysis is based on the prominent questions present in the dataset and their scale ranks. These questions are analyzed to get their feature importance and to determine the relationship between divorce causes. These features are used for model training and for getting divorce predictions from the model.

A correlation graph displays the correlations for the variables present in the dataset. The correlation matrix emphasizes the relationship between all possible pairs of variables in a dataset. It is a powerful tool for summarizing a large dataset as well as for visualizing and identifying trends in the data. We draw the correlation matrix on the dataset in Figure 8. The visualized features are those with correlation values of at least 0.7; features with low correlation values are not present in the feature display map. The correlation matrix shows that all displayed features are highly related. All features are important to use for the training of our model.

3.3. Feature Engineering. Feature engineering is the technique of transforming the raw dataset into a prominent feature space that describes the root problem well for predictive techniques, which improves the accuracy of the employed model on unseen data. The 54 features of the divorce questionnaire dataset are used as input (independent) features, and the target feature containing the class label is used as the output in this research study. The top 10 absolute correlation features are examined in Figure 9. The fav_food (24), know_well (30), freedom_value (12), marriage (18), special_time (5), roles (19), harmony (11), happy (17), enjoy_travel (9), insult (36), humiliate (37), and trust (21) are the top correlated features.
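The correlation screening described in Sections 3.2 and 3.3 can be sketched as follows. This is a minimal, standard-library-only illustration and not the authors' code: the helper names (`pearson`, `top_correlated`, `strongly_related_pairs`), the `noise` column, and all answer values are assumptions made for the example. Only the idea of ranking features by absolute correlation and the 0.7 threshold come from the text.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric columns (neither constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def top_correlated(features, target, k=10):
    """Rank feature columns by absolute correlation with the target column."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]

def strongly_related_pairs(features, threshold=0.7):
    """List feature pairs whose absolute correlation reaches the threshold."""
    names = list(features)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson(features[a], features[b])) >= threshold
    ]

# Hypothetical toy answers on the 0-4 scale; not values from the study's dataset.
features = {
    "trust":    [0, 1, 0, 3, 4, 4],
    "fav_food": [0, 0, 0, 4, 4, 4],
    "noise":    [2, 0, 4, 1, 3, 2],
}
target = [0, 0, 0, 1, 1, 1]  # stand-in for the Divorce_Y_N column

print(top_correlated(features, target, k=2))
print(strongly_related_pairs(features, threshold=0.7))
```

On the real questionnaire, `features` would hold the 54 answer columns and `k=10` would correspond to the "top 10 absolute correlation features" mentioned above.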
[Figure 2: Bar plot of the Divorce_Y_N column (x-axis: Divorce_Y_N, values 0 and 1; y-axis: count; legend: No Divorce, Divorce).]
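The class-balance check behind the Divorce_Y_N bar plot can be reproduced with a simple count. The sketch below is illustrative only: the label list is rebuilt from the two counts quoted in the text (84 and 86) rather than read from the dataset file, and the function name `class_balance` is an assumption.

```python
from collections import Counter

def class_balance(labels):
    """Return per-class row counts and the minority/majority ratio."""
    counts = Counter(labels)
    ratio = min(counts.values()) / max(counts.values())
    return dict(counts), ratio

# Labels rebuilt from the counts quoted in the text: 84 rows of class 0, 86 of class 1.
labels = [0] * 84 + [1] * 86
counts, ratio = class_balance(labels)
print(counts, round(ratio, 3))
```

A ratio close to 1 indicates that neither class dominates, which is what the text means by a balanced dataset.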
Figure 3: Divorce analysis by I’m_not_wrong, enjoy_holiday, love, and common_goals features. (a) The violin graph analysis of the I’m_not_wrong feature among the Divorced and not Divorced categories, (b) the violin graph analysis of the enjoy_holiday feature among the Divorced and not Divorced categories, (c) the violin graph analysis of the love feature among the Divorced and not Divorced categories, and (d) the violin graph analysis of the common_goals feature among the Divorced and not Divorced categories. In each panel, the x-axis is Divorce_Y_N (No Divorced, Divorced) and the y-axis is the answer value of the feature.
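A violin graph is, in essence, a per-class view of how often each answer value occurs. The sketch below computes that underlying summary for a single question, namely the frequency of each 0 to 4 answer and the median per class, without any plotting library. The function name `distribution_by_class` and the toy answers are assumptions for illustration; a plotting library would normally draw the smoothed density from the same grouped values.

```python
from collections import Counter
from statistics import median

def distribution_by_class(answers, labels, scale=range(5)):
    """Group one question's answers by class and summarise each group."""
    summary = {}
    for cls in sorted(set(labels)):
        values = [a for a, y in zip(answers, labels) if y == cls]
        freq = Counter(values)
        summary[cls] = {
            "counts": [freq[v] for v in scale],  # rows answering 0, 1, 2, 3, 4
            "median": median(values),            # centre line of the box-plot part
        }
    return summary

# Hypothetical toy answers for one question; not taken from the study's dataset.
answers = [0, 1, 0, 1, 2, 3, 4, 4, 3, 4]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 0 = not divorced, 1 = divorced
summary = distribution_by_class(answers, labels)
for cls, stats in summary.items():
    print(cls, stats)
```

In this toy example the answers of class 1 concentrate at the high end of the scale, which is the kind of shift the violin graphs are used to reveal.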
Figure 4: Divorce analysis by happy, always_never, trust, and you’re_inadequate features. (a) The violin graph analysis of the happy feature among the Divorced and not Divorced categories, (b) the violin graph analysis of the always_never feature among the Divorced and not Divorced categories, (c) the violin graph analysis of the trust feature among the Divorced and not Divorced categories, and (d) the violin graph analysis of the you’re_inadequate feature among the Divorced and not Divorced categories. In each panel, the x-axis is Divorce_Y_N (No Divorced, Divorced) and the y-axis is the answer value of the feature.

3.4. Dataset Splitting. Dataset splitting is a requirement for removing bias from training data in machine learning systems. The dataset is split into two sets: the training set, which is used by the model to learn an efficient mapping of inputs to outputs, and the test set, which is used to assess the performance of the proposed model. This division helps guard against overfitting [19]. The dataset splitting utilized in this research has a ratio of 80:20. The 80% portion of the dataset is used to train the ensemble learning models, and the 20% portion is used for testing and evaluating the ensemble model. The random state used for splitting is 42.

4. Proposed Approaches

4.1. Passive Aggressive Classifier. The passive-aggressive classifier [20] is one of the available incremental learning methods and uses a closed-form update rule. Passive-aggressive algorithms are akin to perceptron models in the sense that they do not require a learning rate. They do, however, contain a regularization parameter. The classifier updates its weight vector for each misclassified training sample it receives in an attempt to correct it. The tuned hyperparameters of the passive-aggressive algorithm are listed in Table 2.

4.2. Support Vector Machine. The support vector machine (SVM) [21] is a supervised learning model that is used to solve regression and classification problems. It is mostly employed for classification problems. Every data item is represented as a point in n-dimensional space, where n is the number of features, and the value of every feature is the value of a particular coordinate. Classification is then achieved by finding the hyperplane that best separates the two classes of the dataset. The SVM hyperparameters are listed in Table 3.

4.3. Neural Networks. A feedforward artificial neural network (ANN) that generates a set of outputs from a set of inputs is referred to as a multilayer perceptron (MLP) neural network [22]. An MLP consists of several layers of nodes connected as a directed graph between the input and output layers, which means that the signal travels across the nodes in a single direction only. Except for the input nodes, every node has a nonlinear activation function.

Backpropagation [23] is a supervised machine learning technique utilized by an MLP to train the network. The MLP is a deep-learning-based approach since it uses several layers of neurons. The
Figure 5: The divorce analysis by argue_then_leave, humiliate, and friends_social features. (a) The violin graph analysis of the argue_then_leave feature among the Divorced and not Divorced categories, (b) the violin graph analysis of the humiliate feature among the Divorced and not Divorced categories, and (c) the violin graph analysis of the friends_social feature among the Divorced and not Divorced categories. In each panel, the x-axis is Divorce_Y_N (No Divorced, Divorced) and the y-axis is the answer value of the feature.
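The 80:20 split with a fixed random state described in Section 3.4 can be sketched with the standard library alone. This is an illustration under stated assumptions: the function `train_test_split` below is a hand-written helper (not a library call), the rows are random stand-ins for the 170 questionnaire records implied by the class counts (84 + 86), and a seed of 42 in Python's `random` module will not reproduce the exact partition produced by whatever library the authors used.

```python
import random

def train_test_split(rows, labels, test_fraction=0.2, seed=42):
    """Shuffle row indices with a fixed seed and hold out a test fraction."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)          # same seed, same partition
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test

# Hypothetical stand-in data: 170 rows of 54 answers on the 0-4 scale.
rng = random.Random(0)
rows = [[rng.randint(0, 4) for _ in range(54)] for _ in range(170)]
labels = [0] * 84 + [1] * 86

(train_x, train_y), (test_x, test_y) = train_test_split(rows, labels)
print(len(train_x), len(test_x))
```

Fixing the seed makes the partition repeatable, so reported test scores refer to the same held-out rows on every run.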
Figure 6: The divorce histogram analysis of the scale ranks of 15 prominent questions. Each panel shows the ranked scale analysis, from lowest to highest, for one feature (x-axis: Scale, 0 to 4; y-axis: Counts): (a) Ignore_diff, (b) incompetence, (c) Always_never, (d) friends_social, (e) hopes_wishes, (f) current_stress, (g) anxieties, (h) inner_world, (i) fav_food, (j) care_sick, (k) likes, (l) trust, (m) roles, (n) marriage, and (o) love.
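The closed-form update described for the passive-aggressive classifier in Section 4.1 can be illustrated with a short sketch. It assumes the PA-I variant, in which the step size is the hinge loss divided by the squared norm of the sample and capped by the regularization parameter C; the source does not state which variant or which hyperparameter values were used (those are in Table 2), so the function names, C = 1.0, the epoch count, and the toy data are all assumptions. Note that the update fires whenever the hinge loss is positive, which includes misclassified samples.

```python
def pa_fit(rows, labels, C=1.0, epochs=5):
    """Train a binary PA-I classifier; labels must be -1 or +1."""
    w = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            loss = max(0.0, 1.0 - margin)        # hinge loss on this sample
            norm_sq = sum(xi * xi for xi in x)
            if loss == 0.0 or norm_sq == 0.0:
                continue                         # passive: nothing to correct
            tau = min(C, loss / norm_sq)         # aggressive step, capped by C
            w = [wi + tau * y * xi for wi, xi in zip(w, x)]
    return w

def pa_predict(w, x):
    """Predict +1 or -1 from the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

# Hypothetical toy data: two answers per row, centred by subtracting 2.
rows = [[2, 2], [1, 2], [-2, -1], [-1, -2]]
labels = [1, 1, -1, -1]

w = pa_fit(rows, labels)
print([pa_predict(w, x) for x in rows])
```

No learning rate appears anywhere in the update: the step size `tau` is computed from the sample itself, and `C` only limits how far a single sample can move the weights.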
Figure 7: The divorce histogram analysis of the scale ranks of another 15 prominent questions. Each panel shows the ranked scale analysis, from lowest to highest, for one feature (x-axis: Scale, 0 to 4; y-axis: Counts): (a) dreams, (b) incompetence, (c) Always_never, (d) friends_social, (e) hopes_wishes, (f) current_stress, (g) anxieties, (h) inner_world, (i) fav_food, (j) care_sick, (k) likes, (l) trust, (m) roles, (n) marriage, and (o) love.
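The 5-fold cross-validation listed among the contributions can be sketched as below. This is a generic illustration, not the authors' evaluation code: `k_fold_indices` and `cross_val_accuracy` are hand-written helpers, the one-feature threshold model is a deliberately trivial placeholder for the real classifiers, and the data are synthetic. In practice the rows would usually be shuffled before being divided into folds.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def cross_val_accuracy(fit, predict, rows, labels, k=5):
    """Train on k-1 folds, score on the remaining fold, repeat for every fold."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(rows), k):
        model = fit([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(predict(model, rows[i]) == labels[i] for i in val_idx)
        scores.append(hits / len(val_idx))
    return scores

def fit_threshold(rows, labels):
    """Toy model: threshold halfway between the two class means of feature 0."""
    mean0 = sum(r[0] for r, y in zip(rows, labels) if y == 0) / labels.count(0)
    mean1 = sum(r[0] for r, y in zip(rows, labels) if y == 1) / labels.count(1)
    return (mean0 + mean1) / 2

def predict_threshold(threshold, row):
    return 1 if row[0] > threshold else 0

# Synthetic, perfectly separable stand-in data (170 rows, alternating classes).
labels = [i % 2 for i in range(170)]
rows = [[4 * y] for y in labels]
scores = cross_val_accuracy(fit_threshold, predict_threshold, rows, labels, k=5)
print(scores, sum(scores) / len(scores))
```

Every row is used for validation exactly once, so the mean of the five fold scores is a less split-dependent estimate than a single 80:20 hold-out.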
[Figure 8: Correlation matrix heatmap of the dataset features; annotated cell values range from 0.7 to 1.0 (colour bar from 0.75 to 1.00). Rows in this part of the figure: Sorry_end, Special_time, enjoy_holiday, enjoy_travel, common_goals, harmony, freedom_value, entertain, people_goals, dreams, love, happy, marriage, roles, trust, likes, care_sick, fav_food, stresses, inner_world, anxieties, current_stress, hopes_wishes, know_well, friends_social, negative_personality, offensive_expressions, insult, humiliate, and hate_subjects.]
sudden_discussion 0.8 0.9 0.8 0.9 0.8 0.9 0.8 0.8 0.8 0.8 0.8 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.9 0.8 0.9 0.8 0.9 0.8 0.9 0.9 0.9 0.9 0.9 1.0 0.9 0.9 0.8 0.9
idk_what’s_going_on 0.8 0.9 0.8 0.9 0.8 0.9 0.8 0.8 0.8 0.9 0.8 0.9 0.8 0.9 0.9 0.8 0.8 0.8 0.8 0.9 0.8 0.9 0.8 0.9 0.8 0.9 0.9 0.9 0.9 0.9 0.9 1.0 0.9 0.8 0.9
calm_breaks 0.8 0.9 0.8 0.9 0.8 0.9 0.8 0.8 0.8 0.8 0.8 0.9 0.8 0.9 0.9 0.8 0.8 0.9 0.8 0.9 0.8 0.9 0.8 0.9 0.8 0.9 0.9 0.9 0.9 0.9 0.9 0.9 1.0 0.8 0.9
incompetence 0.8 0.8 0.8 0.8 0.7 0.8 0.7 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.7 0.8 0.8 0.8 0.8 0.9 0.8 0.9 0.9 0.8 0.8 0.8 0.8 1.0 0.8
Divorce_Y_N 0.9 0.9 0.9 0.9 0.8 0.9 0.9 0.8 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.9 0.9 0.9 0.8 0.9 0.9 0.9 0.8 0.9 0.9 0.9 0.9 0.9 0.9 0.8 1.0
0.70
enjoy_holiday
care_sick
negative_personality
love
happy
marriage
roles
trust
likes
fav_food
stresses
inner_world
anxieties
current_stress
hopes_wishes
know_well
friends_social
incompetence
Sorry_end
Special_time
enjoy_travel
common_goals
harmony
freeom_value
entertain
people_goals
dreams
offensive_expressions
insult
humiliate
sudden_discussion
Divorce_Y_N
hate_subjects
idk_what's_going_on
calm_breaks
[Figure: plot over feature pairs (axis label: Features). Pairs shown: fav_food:know_well, freeom_value:marriage, Special_time:roles, harmony:happy, Special_time:happy, enjoy_travel:dreams, harmony:roles, insult:humiliate, marriage:trust, happy:roles.]
[Figure: diagram labels recovered from the layout: Voting Function, Testing Data, Divorce Prediction.]
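The voting function named in the diagram labels can be illustrated with a minimal sketch of hard (majority) voting over class labels. The prediction lists, the label encoding (1 = divorce, 0 = no divorce), the tie-break rule, and the function name `hard_vote` are all assumptions made for illustration; they are not taken from the authors' implementation.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Return the majority class label per sample.

    predictions_per_model: list of label lists, one list per model,
    all of the same length.
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    # zip(*...) groups the votes of all models for each sample
    for sample_votes in zip(*predictions_per_model):
        counts = Counter(sample_votes)
        top = max(counts.values())
        # Tie-break (an assumption): pick the smallest label among the winners
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


# Hypothetical test-set predictions from the three base models
svm_pred = [1, 0, 1, 1, 0]
pac_pred = [1, 0, 0, 1, 0]
mlp_pred = [1, 1, 1, 1, 0]

ensemble_pred = hard_vote([svm_pred, pac_pred, mlp_pred])
print(ensemble_pred)
```

With three voters and two classes a tie cannot occur, so the tie-break only matters for an even number of models. Libraries such as scikit-learn expose the same rule through `VotingClassifier(voting="hard")`; which implementation the authors used is not stated here.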
Table 5: Comparison of the selected methods before and after hyperparameter tuning.

Proposed technique                    Before hyperparameter tuning                   After hyperparameter tuning
                                      Accuracy score (%)   Training time (seconds)   Accuracy score (%)   Training time (seconds)
Support vector machine (SVM)          97                   0.004660367965698242      100                  0.0017824172973632812
Passive aggressive classifier (PAC)   97                   0.0012810230255126953     97                   0.002166748046875
Neural network (MLP)                  97                   0.9576735496520996        100                  0.4841580390930176
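To make the before/after comparison in Table 5 easier to read, the following sketch computes the accuracy change and the training-time ratio for each model. The numbers are copied from the table; the dictionary layout and the helper name `tuning_effect` are illustrative assumptions. A time ratio above 1 means training became faster after tuning.

```python
# Values copied from Table 5: (accuracy in %, training time in seconds)
table5 = {
    "SVM": {"before": (97, 0.004660367965698242), "after": (100, 0.0017824172973632812)},
    "PAC": {"before": (97, 0.0012810230255126953), "after": (97, 0.002166748046875)},
    "MLP": {"before": (97, 0.9576735496520996), "after": (100, 0.4841580390930176)},
}


def tuning_effect(before, after):
    """Compare one model before and after hyperparameter tuning."""
    acc_before, time_before = before
    acc_after, time_after = after
    return {
        "accuracy_gain": acc_after - acc_before,   # percentage points
        "time_ratio": time_before / time_after,    # > 1 means faster after tuning
    }


effects = {name: tuning_effect(row["before"], row["after"]) for name, row in table5.items()}

for name, e in effects.items():
    print(f"{name}: accuracy {e['accuracy_gain']:+d} points, time ratio {e['time_ratio']:.2f}")
```

On these numbers, PAC is the only model whose training time increased after tuning, and its accuracy stayed at 97%.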
[Figure: confusion matrices for the four models, with rows and columns labeled 0 and 1 and a color bar from 0.0 to 20.0. Cell counts by panel (first row, second row): SVM: 14, 0 / 0, 20. PAC: 14, 0 / 1, 19. MLP: 14, 0 / 0, 20. EL: 14, 0 / 0, 20.]
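The cell counts in the four confusion-matrix panels above (SVM, PAC, MLP, EL) can be turned into accuracy, precision, recall, and F1 with a few lines of arithmetic. This is a sketch under stated assumptions: rows are taken as actual classes, columns as predicted classes, and class 1 as the positive class. The figure residue does not confirm this orientation, and the function name `metrics_from_confusion` is hypothetical.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Compute standard metrics from the four cells of a binary confusion matrix."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts read from the panels as (TN, FP, FN, TP).
# ASSUMPTION: rows = actual class, columns = predicted class, positive class = 1.
panels = {
    "SVM": (14, 0, 0, 20),
    "PAC": (14, 0, 1, 19),
    "MLP": (14, 0, 0, 20),
    "EL": (14, 0, 0, 20),
}

results = {name: metrics_from_confusion(*cells) for name, cells in panels.items()}

for name, m in results.items():
    print(name, {k: round(v, 4) for k, v in m.items()})
```

Under this orientation, the single PAC error is a false negative, so it lowers recall rather than precision. If the axes are the other way round, the same error is a false positive and the two metrics swap.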
5. Ensemble Learning

The ensemble learning approach is examined and applied in this research. The architecture of the applied ensemble approach is shown in Figure 10. The training dataset is used to train the three classification models utilized in this research. The SVM, linear model, and neural network model are trained and tested in parallel using the ensemble learning pipeline; the architecture is based on training and testing all underlying models in parallel. The test predictions of the three models are then passed to the “hard” voting function, which takes the majority class label as the ensemble prediction and is used to compute the accuracy of the ensemble. We applied hard voting because our classification output consists of class labels, together with the weight associated with each classifier. The higher accuracy score is taken as our best prediction value.

6. Results and Evaluation

All performance evaluation metrics utilized in this research are examined in this section. The ensemble learning model's accuracy score value, ROC accuracy score value, recall score value,
precision score value, and F1 score value are the performance evaluation metrics employed in this research study. One parameter for assessing the classification models is accuracy. The accuracy score is the percentage of correct predictions made by our proposed model. The accuracy of our proposed technique is 100%. Formally, accuracy is given by the following equation:

\[ \text{accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions}}. \tag{1} \]

The ROC curve is a probability curve that plots the true positive rate (TPR) against the false positive rate (FPR) at numerous threshold settings, separating the signal from the noise. The area under the curve (AUC) measures a classifier's ability to discriminate between classes and is used to summarize the ROC curve. The ROC AUC of our proposed technique is 97%. The ROC AUC score is expressed as

\[ \text{ROC AUC} = \int_{0}^{1} \int_{-\infty}^{+\infty} \mathrm{ROC}_{x}(t)\, dF_{(X \mid D=1)}(x)\, dt, \tag{2} \]

\[ \text{ROC AUC} = \int_{-\infty}^{+\infty} \mathrm{AUC}_{x}\, dF_{(X \mid D=1)}(x). \tag{3} \]

Precision is the ratio of true positive outcomes to all predicted positive outcomes. Recall measures how well our model identifies the actual positives. In our case, both have a 97% score. Precision and recall are expressed as

\[ \text{precision} = \frac{\text{true positive}}{\text{true positive} + \text{false positive}}, \tag{4} \]

\[ \text{recall} = \frac{\text{true positive}}{\text{true positive} + \text{false negative}}. \tag{5} \]

several methods in which the classification technique gets perplexed when making predictions. It is critical to assess the model’s performance once it has been trained using some training data. When we developed a confusion matrix, we had several components:

(i) Positive (P): the predicted outcome is positive (e.g., the couple gets a divorce).
(ii) Negative (N): the predicted outcome is negative (e.g., the couple does not get a divorce).
(iii) True positive (TP): the predicted and actual values are both 1 (true).
(iv) True negative (TN): the predicted and actual values are both 0 (false).
(v) False negative (FN): the predicted value is 0 (N) while the actual value is 1 (P). The two values do not correspond, so it is an FN.

7. Conclusion

The prediction of divorce using machine learning and ensemble learning techniques is the core motive of this research study. The findings of our study are based on key indicators for divorce and the factors that are most significant when predicting divorce. The support vector machine (SVM), passive aggressive classifier, and neural network (MLP) are applied to predict divorce. Cross-validation and performance evaluation techniques are used to evaluate the proposed models. Our proposed EL technique achieved the highest accuracy of 100%. Regarding limitations and future directions, we will try to enhance the questionnaire dataset by adding more questions to get clearer results and will also apply data augmentation techniques. To reduce overfitting, we will explore different deep learning models.
out data curation, funding acquisition, and project administration. All authors have read and agreed to the published version of the manuscript.

Acknowledgments

This research was funded by the National Natural Science Foundation of China, grant number 42071374.