Predicting Autism Spectrum Disorder Using Machine Learning Classifiers

Abstract— Autism Spectrum Disorder (ASD) is on the rise and constantly growing. Early identification of ASD, followed by proper care, gives a person the best chance of staying safe and healthy. It is hard for humans to estimate the present condition and stage of ASD from primary symptoms alone, so it is necessary to develop a method that provides a reliable measurement of ASD. This paper evaluates several measurements implemented in several classifiers. Among them, the Support Vector Machine (SVM) provides the best result, and among the SVM kernels, the Gaussian Radial Basis kernel gives the best result. The proposed classifier achieves 95% accuracy on a publicly available standard ASD dataset.

Keywords— ASD, SVM, Classifier, ROC, Accuracy.

I. INTRODUCTION

Analyzing data for different purposes (such as prediction or measuring performance) is called the data mining process. It has several tasks, such as association rule mining, classification, prediction and clustering. Researchers have found suitable uses of data mining in many fields of research. In this paper, we predict autism spectrum disorder (ASD) by applying a data mining technique, specifically the Support Vector Machine (SVM).

Autism spectrum disorder (ASD) is a neurological and developmental disorder that begins early in childhood and lasts throughout a person's life. It affects how a person acts and interacts with others. It is called a "spectrum" disorder because people with ASD can have a range of symptoms [1]. People with ASD might have problems talking with others, or they might not make eye contact when someone talks to them. They may also have restricted interests and repetitive behaviors. Research suggests that both genes and environment play important roles. There is currently no standard treatment for ASD, but there are many ways to improve a person's ability to develop normally. We therefore need to examine a person's behavior and symptoms to determine whether he or she is autistic, and we need a strong dataset on which to apply a technique that identifies autism spectrum disorder. However, very limited autism datasets associated with clinical screening are available, and most of them are genetic in nature.

A suitable autism spectrum disorder (ASD) dataset has been selected that combines three different datasets: ASD Screening Data for Child, ASD Screening Data for Adolescent and ASD Screening Data for Adult [1]. We merged these three datasets into one for our study. The merged dataset consists of ten individual characteristics, with data types such as String, Number and Boolean, and ten behavioral features of binary data type. Using the Python programming language and libraries such as sklearn, pandas, keras, numpy and matplotlib, we obtained our prediction results from SVM in both graphical and numeric form.

II. LITERATURE SURVEY

A. Literature Review

The use of modern machine learning and data mining technologies to explore data efficiently is on the rise. Several researchers have used machine learning techniques to analyze data as accurately as possible, implementing datasets in several classifiers and algorithms and reporting the outcome as an accuracy rate. For example, a 2016 study used machine learning algorithms for breast cancer risk prediction and diagnosis [2]. It compared different machine learning algorithms, namely Support Vector Machine (SVM), Decision Tree (C4.5), Naive Bayes (NB) and k-Nearest Neighbors (k-NN), implemented on the Wisconsin Breast Cancer datasets [2]. Their experimental results showed that SVM gave the highest accuracy with the lowest error rate; all experiments were executed in a simulation environment using the WEKA data mining tool [2].

Many more similar studies have been carried out on several topics and datasets. One of them addressed Magnetic Resonance Imaging (MRI) stroke classification with the Support Vector Machine (SVM) [3]. The authors presented a method to classify MRI images of stroke-related brain damage: Gabor filters and histograms were used to extract features from the images, and the features were classified using SVM with various kernels. Their experimental results showed that the presented method achieves satisfactory classification accuracy, which was best for the linear kernel.

From the research mentioned above, we gathered the knowledge and related information used in our work. We aim to improve the prediction of an individual's ASD screening result through analysis of data mining techniques.

B. Data Understanding

1) Attributes:

Our work uses publicly available standard data sets [4]. Scores such as A1_Score and A2_Score are the results of a questionnaire survey by the Autism Research Centre at the University of Cambridge, UK [5].

Authorized licensed use limited to: Machakos University College. Downloaded on May 20,2021 at 10:33:05 UTC from IEEE Xplore. Restrictions apply.
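The merge of the three screening datasets and the later removal of instances with missing values can be sketched with pandas, which the paper lists among its libraries. This is a minimal sketch: the tiny inline frames below stand in for the real CSV files (in practice they would come from `pd.read_csv` on the three downloaded datasets), and the column subset shown is only illustrative.

```python
import pandas as pd

# Stand-ins for the Child, Adolescent and Adult screening files;
# in practice each would be loaded with pd.read_csv(...).
child = pd.DataFrame({"A1_Score": [1, 0], "gender": ["m", "f"], "Class/ASD": ["YES", "NO"]})
adolescent = pd.DataFrame({"A1_Score": [0, 1], "gender": ["f", "m"], "Class/ASD": ["NO", "YES"]})
adult = pd.DataFrame({"A1_Score": [1, None], "gender": ["m", "f"], "Class/ASD": ["YES", "NO"]})

# Stack the three screening datasets into one, as the paper describes.
combined = pd.concat([child, adolescent, adult], ignore_index=True)

# The paper removes instances with missing values rather than imputing them.
cleaned = combined.dropna()

print(len(combined), len(cleaned))  # 6 rows before cleaning, 5 after
```

`pd.concat` requires the frames to share a column layout, which holds here because all three screening datasets use the same attribute schema.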
The dataset consists of nine individual characteristics and ten behavioral features, listed below.

Table 1: BEHAVIOURAL FEATURES

Attribute           Type
gender              String
ethnicity           String
jaundice            Boolean (yes or no)
PDD                 Boolean (yes or no)
relation            String
country_of_res      String
did_the_qn_before   Boolean (yes or no)
age_desc            Integer
A1_Score            Binary (0, 1)
A2_Score            Binary (0, 1)
A3_Score            Binary (0, 1)
A4_Score            Binary (0, 1)
A5_Score            Binary (0, 1)
A6_Score            Binary (0, 1)
A7_Score            Binary (0, 1)
A8_Score            Binary (0, 1)
A9_Score            Binary (0, 1)
A10_Score           Binary (0, 1)
Class/ASD           Boolean (yes = 1 or no = 0)

2) Missing Value:

The dataset contains many missing values, and dealing with them was the biggest challenge since we have 19 variables. There are a few ways to handle missing values: for example, replace them with averaged values, or delete the instances that contain them. Since we have 19 variables, we decided to remove the instances with missing values.

III. APPROACHES

A. Data Classification

There are two steps in data classification. The first step is known as the learning step: a given set of classes is described based on the analysis of a set of data instances, each of which belongs to a predefined class. In the second step, the data set is tested using various machine learning techniques, which are used to calculate the classification accuracy, AUC value, precision, recall, etc. A model is then designed that predicts future outcomes based on the historical or recorded data.

There are various machine learning techniques for classification. We used the machine learning classifiers listed in the section 'Classifiers'.

B. Classifiers

We applied the following machine learning classifiers to the data set. First, we divided our data set into two parts: training (70%) and testing (30%). After splitting the dataset, we used these classifiers to determine the results of the evaluation metrics.

• Naïve Bayes.
• K-Nearest Neighbor (kNN).
• Logistic Regression.
• Gradient Boosting.
• Support Vector Machine.
• Decision Tree.
• MLP Classifier.

C. Evaluation Metrics

We selected the following evaluation metrics to evaluate our results. These metrics, especially accuracy and AUC value, help us find the best classifier for the data set.

• Accuracy.
• AUC Value.
• Precision.
• Recall.
• F1 Score.

IV. EXPERIMENT RESULT AND FINDINGS

A. Comparison between different classifiers

In order to establish which classifier performs better than the others, we need a comparison between them.

1) Results: To compare the classifiers we need proper measurements of their accuracy rate, AUC value, Precision, Recall and F1-Score.

Table 2: CLASSIFIERS RESULTS

            NB     kNN    LR     GB     SVM    DT     MLP
Acc         0.76   0.92   0.915  0.921  0.95   0.87   1
AUC         0.78   0.81   0.90   0.88   0.87   0.86   0.67
Pre (no)    0.86   0.87   0.93   0.89   0.89   0.90   1
Pre (yes)   0.69   0.73   0.89   0.92   0.91   0.84   0.47
Rec (no)    0.81   0.84   0.94   0.96   0.95   0.92   0.37
Rec (yes)   0.76   0.78   0.87   0.80   0.79   0.82   1
F1 (no)     0.83   0.86   0.93   0.93   0.92   0.91   0.54
F1 (yes)    0.72   0.75   0.88   0.85   0.84   0.83   0.64

From Table 2, Naïve Bayes provides the lowest accuracy, 0.76, and the MLP Classifier shows the highest accuracy, 1.00. But the MLP Classifier provides the lowest AUC value, only 0.686, and its Precision, Recall and F1-score are not even close to those of the other classifiers. The MLP Classifier over-fits the dataset, which may be because MLP does not work well with small datasets. Setting the MLP Classifier aside for that reason, the Support Vector Machine (SVM) is the best performer for this kind of task, with the second-best accuracy rate of 0.950, i.e. 95%. Though Logistic Regression provides the highest AUC value, 0.903, the Support Vector Machine shows a decent overall result across all evaluation metrics. Comparing these aspects, we found the Support Vector Machine very appropriate for this kind of task.

2) ROC Curve: The ROC curve is a graphical diagram that illustrates the probability curve of a binary classifier. In figure 1, four ROC curves of four classifiers are plotted according to their AUC values. A perfect test has an AUC value of 1.0, whereas random chance gives an AUC value of 0.5 or lower. Our classifiers have AUC values of 0.782 (Naïve Bayes), 0.809 (k-Nearest Neighbor), 0.866 (Decision Tree) and 0.878 (Gradient Boosted Trees). Gradient Boosting gives the best AUC value if we compare only these four classifiers. This is the advantage of ROC curve measurement: we can compare two or more sensitive tests visually and simultaneously in one figure.
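The 70/30 comparison behind Table 2 can be sketched with scikit-learn, which the paper names among its libraries. This is a minimal sketch under stated assumptions: a synthetic binary dataset stands in for the merged screening data (with 19 features to match the paper's variable count), only four of the seven classifiers are shown, and the scores will not match Table 2.

```python
# Sketch of a 70/30 train/test comparison of several classifiers,
# in the style of Table 2. Synthetic data stands in for the real dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score

X, y = make_classification(n_samples=400, n_features=19, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)  # 70% training / 30% testing

results = {}
for name, clf in [("NB", GaussianNB()),
                  ("kNN", KNeighborsClassifier()),
                  ("LR", LogisticRegression(max_iter=1000)),
                  ("SVM", SVC(probability=True))]:
    clf.fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
    results[name] = (acc, auc)

for name, (acc, auc) in results.items():
    print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")
```

`probability=True` is needed on `SVC` so that `predict_proba` is available for the AUC computation; the remaining classifiers of Table 2 would be added to the list in the same way.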
Now we want to compare Logistic Regression, SVM and the MLP Classifier. According to figure 2, Logistic Regression gives the best result in the ROC curve, the MLP Classifier gives the worst, and SVM gives the best appreciable result in accuracy.

In figure 2, three ROC curves are plotted according to their AUC values. To compare them we need to check their AUC values: Logistic Regression provides an AUC of 0.903, the MLP Classifier 0.686, and SVM 0.870. In the ROC curve, Logistic Regression gives the best result and the MLP Classifier the worst. SVM falls slightly below Logistic Regression in the ROC curve, but it gives the best appreciable result in accuracy.

A. Findings

The MLP Classifier provides 100% accuracy, but its AUC value, Precision, Recall and F1-Score are very low, so it does not fit our dataset. Logistic Regression and SVM provide almost the same values, and both fit this kind of dataset very well. SVM works slightly better than Logistic Regression when evaluated on the accuracy metrics, so it is better to select SVM for this kind of dataset.

B. Prediction using Support Vector Machine

The Support Vector Machine algorithm uses several sets of mathematical functions that are defined as SVM kernels. The function of these kernels is to take data as input and transform it into the required form; different kinds of SVM algorithms use different types of kernel functions [6]. The most used SVM kernels are the Linear kernel, the Polynomial kernel, the Gaussian Radial Basis kernel and the Sigmoid kernel. Our data does not suit the linear kernel, since the linear kernel concerns one dependent response that is linearly related to one or more independent variables, whereas our data has two dependent responses.
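Trying the four kernels named above is a one-line change with scikit-learn's `SVC`. A minimal sketch, assuming synthetic data: `make_moons` gives a small dataset that is deliberately not linearly separable, standing in for the screening data, so the exact scores are illustrative only.

```python
# Sketch of comparing the four SVM kernels discussed above.
# make_moons provides a small non-linearly-separable stand-in dataset.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)

scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:  # rbf = Gaussian Radial Basis
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    scores[kernel] = accuracy_score(y_test, clf.predict(X_test))

for kernel, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{kernel}: {acc:.3f}")
```

On data like this, the Gaussian Radial Basis (`rbf`) kernel typically outperforms the linear and sigmoid kernels, in line with the paper's observation that its data does not suit the linear kernel.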
1) SVM Kernel Results: We need to find which SVM kernel performs best with our dataset, so we compare their measured accuracy rate, AUC value, Precision, Recall and F1-Score. From Table 3 we can see that the Sigmoid kernel provides a very poor result, always less than half (0.5). On the other hand, the Polynomial kernel and the Gaussian Radial Basis kernel perform well; among them, the Gaussian Radial Basis kernel provides a better result than the Polynomial kernel.

Table 3: SVM KERNEL RESULTS

3) SVM Finding: The SVM Sigmoid kernel gives a very poor result for this dataset, whereas the Gaussian Radial Basis and Polynomial kernels perform almost equally well, and both suit our dataset. However, the Gaussian Radial Basis kernel performs slightly better than the Polynomial kernel, so we can say that the SVM Gaussian Radial Basis kernel is the best-performing algorithm, and classifier, for this kind of dataset. Figure 4 shows a sample comparison of the actual results and the results of the SVM kernels.
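A per-kernel ROC comparison like the one in Fig. 3 can be sketched via `decision_function`, which yields the continuous margin scores that `roc_curve` needs. This is a sketch on synthetic stand-in data, not the paper's dataset, so the AUC values are illustrative.

```python
# Sketch of ROC/AUC per SVM kernel, in the spirit of Fig. 3.
# Synthetic data stands in for the screening dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_curve, roc_auc_score

X, y = make_classification(n_samples=400, n_features=19, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1)

aucs = {}
for kernel in ["poly", "rbf", "sigmoid"]:
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    score = clf.decision_function(X_test)   # continuous margin, not 0/1 labels
    fpr, tpr, _ = roc_curve(y_test, score)  # points of the ROC curve
    aucs[kernel] = roc_auc_score(y_test, score)
    # matplotlib's plt.plot(fpr, tpr, label=kernel) would draw the curves

print(aucs)
```

Using the decision function rather than hard 0/1 predictions is what gives the ROC curve its intermediate points; thresholding first would collapse each curve to a single corner.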
Fig. 3. ROC Curve of SVM Kernels.

REFERENCES

[1] Thabtah, F. (2019). Machine learning in autistic spectrum disorder behavioral research: A review and ways forward. Informatics for Health and Social Care, 44(3), 278-297.
[2] Asri, H., Mousannif, H., Al Moatassime, H., & Noel, T. (2016). Using machine learning algorithms for breast cancer risk prediction and diagnosis. Procedia Computer Science, 83, 1064-1069.
[3] Shanthi, A. S. (2014). Support Vector Machine for MRI Stroke Classification. International Journal on Computer Science and Engineering (IJCSE), ISSN: 0975-3397.
[4] Thabtah, F. F. Autism Screening Adult Data Set. Department of Digital Technology, Manukau Institute of Technology, Auckland, New Zealand. https://archive.ics.uci.edu/ml/datasets/Autism+Screening+Adult
[5] National Institute of Health Research, Autism Research Centre. http://docs.autismresearchcentre.com/tests/AQ10.pdf
[6] Achirul Nanda, M., Boro Seminar, K., Nandika, D., & Maddu, A. (2018). A comparison study of kernel functions in the support vector machine and its application for termite detection. Information, 9(5).