Paper 26-Early Detection of Autism Spectrum Disorder
Abstract—Autism Spectrum Disorder (ASD) is a mental disorder among children that is difficult to diagnose at an early age. People with ASD have difficulty functioning in areas such as communication, social interaction, motor skills, and emotional regulation. They may also have difficulty processing sensory information and understanding language, which can lead to further difficulty in socializing. Early detection can help with learning coping skills, communication strategies, and other interventions that can make it easier for them to interact with the world. This kind of disorder is not curable, but it is possible to reduce the symptoms of ASD. Early detection of ASD helps to start therapies corresponding to the ASD symptoms. The detection of ASD symptoms at an early age is our main problem. Traditional machine learning algorithms, namely Support Vector Machine, Logistic Regression, K-nearest neighbour, and Random Forest classifiers, have been applied to parents' dialogues to understand the sentiment of each statement about their child. After these models complete their predictions, each positive ASD symptom-related sentence is used in the cosine similarity model for the detection of ASD problems. Samples of parents' dialogues have been collected from social networks and special child training institutes. The data have been prepared according to the model for sentiment analysis. The accuracies of the proposed classifiers are 71%, 71%, 62%, and 69%, respectively, on the prepared data. Another dataset has been prepared in which each sentence refers to a particular categorical ASD problem, and it has been used in the cosine similarity calculation for ASD problem detection.

Keywords—Support vector; logistic regression; cosine similarity; K-nearest neighbor; random forest

I. INTRODUCTION

People with ASD [1] often have difficulty in understanding the social cues and expectations that are necessary for meaningful conversations and relationships with others. This can lead to isolation, difficulty in forming relationships, and, in some cases, difficulty in gaining recognition in society as in [2]. Early detection can help identify the illness sooner, allowing for personalized treatments or preventive measures to be put in place that can help reduce the severity of the illness and improve the chances of recovery as in [3]. ASD is caused by a combination of genetic and environmental factors that affect the development of the brain. It is characterized by difficulty in social interaction, communication, and repetitive behaviors. Research has been done to identify the causes of this syndrome, which include genetic predisposition, environmental factors, and lifestyle choices. Although the exact cause is still unknown, the available evidence shows that it is a multi-faceted condition. In addition, the lack of trained professionals and resources to diagnose and treat ASD [1] has created a huge gap in access to care. Furthermore, due to the complexity of the disorder, it can be difficult to diagnose and properly classify it, leading to misdiagnosis or delayed diagnosis. This is because autism is a complex disorder, and it can manifest itself differently in each affected individual [4]. As such, it is difficult to create a single biomarker that can accurately detect the disorder. Additionally, research into developing tools and applications, data analysis, and pattern recognition [5][6] to help identify children with autism is challenging, as it requires creating a comprehensive program that can detect subtle signs of autism across a range of contexts as in [7]. People with autism may struggle with understanding social cues, interpreting and responding to others' emotions, and forming relationships. They may also have difficulty with processing sensory information or have strong interests in certain topics or activities. Diagnosis is based on observed behavior, and the process can involve interviews and questionnaires, cognitive assessments, physical examinations, and genetic and neurological tests. All of these evaluations can take time and money, and the cost can be prohibitive for some families. These tests are designed to identify patterns of behavior and symptoms associated with autism by asking parents and professionals to observe the individual. The assessors then analyze the responses and compare them to a set of criteria established to identify autism or other developmental disorders. For example, if a person is using a metal detector, they must have an understanding of the type of metal they are looking for and the size of the object they are searching for. The quality of the metal detector will also have an impact on the accuracy and efficiency of the screening method. Such systems can use algorithms to analyze large amounts of data and detect patterns with high accuracy, potentially leading to earlier and more accurate diagnoses. Additionally, such systems can help to automate certain labor-intensive tasks and reduce the amount of time needed to complete diagnostic tests. This is because machine learning algorithms can analyze large amounts of data and identify patterns and correlations that would be difficult or impossible for humans to find. The algorithms can then be used to develop predictive models that can accurately identify potential diagnoses and suggest
231 | P a g e
www.ijacsa.thesai.org
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 14, No. 6, 2023
therapies as in [8]. Some research scholars have worked on ASD diagnosis using machine learning. The aim of this research is to reduce the classification time of the ASD diagnosis process after the detection of the most influential ASD diagnosis items as in [9][10][11][12]. Machine learning (ML) is a powerful tool that can be used to analyze vast amounts of data and identify patterns that can be used to detect mental health issues. ML can also be used to develop personalized treatments based on individual patient characteristics. This could potentially lead to more targeted and effective treatments for mental health issues. Through the use of data-driven techniques, ML enables the analysis of large amounts of data to uncover previously unknown patterns, trends, and correlations. ML can be used to develop predictive models or to recommend interventions that may be tailored to individual needs. Challenges remain, including the need to ensure responsible data collection and storage, to develop equitable access to ML-enabled solutions, to ensure ethical and responsible use of ML and AI, and to ensure that privacy and confidentiality are maintained as in [13].

The proposed work is based on the detection of ASD symptoms from the parents' dialogue. Parents of autistic children have the best experience with their autistic children's symptoms. The data have been collected from many social sites and organizations for special children. The data consist of parents' dialogue in text form, and a dataset has been prepared using these parents' text inputs. Traditional machine learning models, namely SVM, Logistic Regression, K-nearest neighbor (KNN), and Random Forest, have been used to detect the symptoms from the parents' text. The sentiment analysis process has been used to detect sentences from the parents' text. After completion of the prediction using the proposed machine learning models, the positive sentences have been used as input to the cosine similarity model. This model calculates the cosine similarity of input sentences and ASD symptom sentences to detect ASD problems. Many machine learning-based applications related to mental disorders are discussed in Section II. The proposed dataset, the detailed architecture of the proposed system, and the machine learning models are discussed in Section III. The results of the proposed system are discussed in Section IV. The limitations are given in Section V, the conclusion is discussed in Section VI, and the paper ends with the future work in Section VII.

II. RELATED WORKS

Today, Autism Spectrum Disorder (ASD) is a highly prevalent disorder among children. It is now one of the main concerns in the healthcare domain, and much research has been done using Artificial Intelligence (AI). A few important AI-based research works on mental health related issues have been included in this related work section. These NLP software tools use a combination of natural language processing (NLP) algorithms and domain-specific ontologies to identify and extract biomedical concepts from unstructured texts. The ontologies provide an organized representation of biomedical concepts and the NLP algorithms enable the software to accurately identify the concepts in the text. This is due to the fact that the existing literature on these disorders is often written in a complex, highly technical language that is difficult to parse and interpret with natural language processing tools. Additionally, many of the diseases are multi-faceted and involve a variety of clinical terms that need to be identified by the NLP tools in order to accurately extract relevant information. The authors evaluated the predictive performance using precision, recall, and F1 score. They also ran a manual evaluation to compare the manual annotation of ASD-related terms with the tools' extracted terms, and found that CLAMP outperformed the other two tools in terms of precision, recall, and F1 score on both the abstracts and full-text articles. The F1 score combines the precision and recall of a system, so it takes into account both the accuracy and completeness of the system. In this case, CLAMP had the highest F1 score, consistent with its higher precision and higher recall than the other two systems. This type of analysis protocol allows researchers to better identify, classify, and quantify the symptoms of a disorder, even when there is not a well-defined terminology set to describe it. This makes it easier to compare the presentation of the disorder across different populations and can help to identify potential biomarkers for the disorder as in [14].

People with ASD had more difficulty in expressing emotions and abstract concepts than typically developing individuals, as well as difficulty in using language to describe events and convey information. This suggests that impairments in the use of pragmatic language are an important aspect of ASD and should be addressed in interventions. It also suggests that the differences in narrative production between ASD and control groups are related to difficulties in understanding and expressing emotions, as well as in producing more abstract language. The individuals with typical development had a more varied range of vocabulary, which included more words with both positive and negative sentiments, while the participants with ASD displayed a limited vocabulary, resulting in a greater tendency to use negative words. The lower level of language abstraction in the ASD narratives could be due to the limitation of their vocabulary and the difficulty of expressing abstract concepts. This suggests that language abstraction and emotional polarity can be used to measure the narrative abilities of individuals with ASD without relying on age or IQ scores. The strong positive correlation between linguistic abstraction and emotional polarity indicates that the more abstract the language used, the more likely it is to contain emotional content. The difference in emotional polarity between the two groups could be due to the fact that individuals with ASD may have difficulty recognizing and expressing emotions. In addition, they may have difficulty understanding abstract language concepts, which could explain why they used fewer abstract words in their narratives as in [15].

One of the most promising areas for developing assistive tools is the use of artificial intelligence (AI) and machine learning (ML) algorithms. These algorithms can be used to analyze data from various sources and can provide insights that may help diagnose ASD earlier and more accurately. The proposed approach is expected to find the underlying patterns in the eye-tracking records which can be used to accurately diagnose the disorders. The results of this study could provide clinicians with a powerful tool that could potentially improve the
accuracy and speed of diagnosis. By applying NLP methods to the raw eye-tracking data, the study was able to extract meaningful features from the data that could be used to train classification models. The experiment showed that using these features could yield better results than using the raw data alone. The authors [16] used a customized loss function to adjust the weights of the model, which allowed them to achieve a high level of accuracy. Additionally, the authors [16] utilized transfer learning to fine-tune the model, allowing them to further improve the accuracy of the model. Their approach could realize a promising accuracy of classification (ROC-AUC up to 0.8) as in [16].

Social behavior issues are often the most noticeable in children with autism, and they may include difficulty forming relationships, lack of eye contact, and difficulty understanding nonverbal communication. Clinical tests can also be used to look for developmental delays, such as difficulty with speech and language, as well as repetitive behaviours like hand flapping or rocking. The assessment process is designed to identify key characteristics of autism in individuals, such as difficulty in communication and social interaction, and to determine the severity of the condition. By using semi-structured data posted on Twitter, the team of doctors can gain insight into the individual's behavior, which can then be used to develop a more accurate and effective assessment. Analyzing the tweets allows researchers to detect the sentiment of people's opinions on autism, the topics that are most commonly discussed, and the language used to discuss autism. This helps researchers gain a better understanding of how people think and talk about autism, and can help inform policy decisions. NLP and topic modeling allow for more efficient processing of data by automatically recognizing patterns and keywords, saving time and effort. Furthermore, the results of the analysis are highly accurate, making these methods an ideal choice for studying topics such as genetic analysis, the effect of vaccination, and behavior analysis. The 10k tweets dataset is enough to provide in-depth analysis and insight into these topics. The analytical results are used to learn the genetic impact on ASD and the vaccination effect on ASD, and also to learn the behavior changes and population of autistic children as in [17].

ADHD is characterized by a persistent pattern of inattention and/or hyperactivity-impulsivity that interferes with functioning or development. It is often accompanied by other mental health disorders, such as anxiety and depression, which can further impair functioning and quality of life. The authors applied a CNN model to EEG data in order to distinguish between ADHD patients and healthy controls. The CNN was able to classify the EEG data with an accuracy of 90.3%, significantly outperforming other methods. The data consisted of event-related potentials (ERP) from ADHD patients (n = 20) and healthy controls (n = 20) collected during the Flanker Task, with 2800 samples for each group. By exploiting invariances, deep networks are able to classify data even when there are variations in the data, such as changes in lighting or orientation of an image. Compositional features are combinations of basic elements that form a more complex representation of the data, such as edges and shapes in an image. Deep networks are able to identify these features, which enables them to accurately classify data. This was achieved by using a Convolutional Neural Network (CNN) that was trained on EEG data from patients with Parkinson's Disease in order to classify them as either having the disease or not. The CNN was able to extract relevant features from the data without any manual input, resulting in a higher accuracy than other machine learning approaches. This is because CNNs can learn more complex patterns from the data and have the ability to generalize to new data. Event-related spectrograms capture more information about the events of interest, which can be used to extract more accurate features than resting state EEG spectrograms. This suggests that these techniques can be used to identify and visualize the underlying physiological differences between neurological disorders and healthy brains, potentially leading to a better understanding of their underlying pathophysiology. Deep networks are useful because they can extract meaningful patterns from EEG signals and are capable of handling large amounts of data. These results suggest that deep networks can also be used to analyze EEG dynamics from smaller datasets, which could be used to develop biomarkers for clinical use as in [18].

EEG can provide valuable information to help diagnose ADHD in children because it can measure electrical activity in the brain and detect any abnormal electrical activity that may be indicative of ADHD. Additionally, EEG can help to differentiate ADHD from other mental disorders that may be present in the child. Symptoms of ADHD include difficulty paying attention, impulsivity, and hyperactivity. These symptoms can interfere with a child's ability to learn, manage emotions, and interact with peers. Video long-range EEG monitoring can provide more accurate and detailed information about the brain activity of children with ADHD compared to ambulatory EEG monitoring, as it allows for more frequent data collection and better visualization of the EEG data. It also helps to identify abnormal brain electrical activities which may be associated with ADHD, thus aiding in the diagnosis of the condition. By doing this, the authors were able to accurately identify children with ADHD and study their behavioral patterns in order to better understand and treat the disorder. This allowed for a more precise and detailed analysis than traditional methods of observation. Comparing the results of various models can help to identify which model is best suited for recognizing signs of ADHD in EEG data. By selecting the most accurate and appropriate model, researchers can then use it to build a recognition method that can diagnose children with ADHD more accurately. This is because long-term video EEG can detect the abnormal EEG patterns associated with ADHD, such as slow wave activity, and can also detect the degree of attention fluctuation in children with ADHD as in [19].

With the recent advances in artificial intelligence, computers can now analyze EEG data and provide results much faster than a neurologist. This has enabled the field of neurology to become much more efficient and provide more accurate results in a fraction of the time. This is made possible because AI is able to quickly analyze and process large amounts of data. It can quickly identify patterns and draw conclusions from the data that would take humans hours or even days when detecting ADHD. Additionally, AI can look for indicators of diseases or abnormalities that would be difficult for humans to find on their own. This is because it can automate the process of analyzing EEG signals, thus allowing neurologists to quickly and accurately identify
patterns associated with different neurological diseases. Furthermore, this technology can also help neurologists to identify subtle changes in EEG signals that could potentially signal the onset of a neurological disorder. The ML model can process the EEG signals quickly and accurately to detect patterns that may indicate ADHD. By making use of the data generated from the EEG signals, the ML model can diagnose ADHD more accurately and quickly than traditional methods. Additionally, the ML model can be trained to recognize these patterns more quickly and accurately than traditional methods. With the right pre-processing techniques and machine learning algorithms, the ML model can provide a more accurate diagnosis of ADHD than traditional methods as in [20].

This allows individuals to stay connected with their friends and family and to keep up with what is going on in the world. Additionally, it makes it easier to stay in touch with people who are not in the same physical location, making it a great way to stay connected during this time. The pandemic has had a negative impact on the mental health of many people, and it has become harder for them to access in-person support. As a result, online tools and resources have become more important than ever for those struggling with mental health issues, allowing them to get the help they need even when they are unable to leave their homes. Mental health conditions can have a significant impact on an individual's overall well-being, affecting their ability to work and their relationships with others. Additionally, research has found that mental illnesses can increase an individual's risk of developing chronic physical health conditions, such as heart disease and diabetes. AI methods can help mental health providers to detect patterns in patient data that might otherwise go unnoticed, as well as to generate insights into the patient's current state. This can lead to more accurate diagnoses and better treatment plans, leading to better overall outcomes for the patient. AI can help to analyze patient data quickly and accurately, identify patterns and correlations, and make predictions about the best course of action for a patient's diagnosis and treatment. AI can also help reduce the time and resources required for manual data analysis and provide more efficient and cost-effective care. The models were tested on a labeled dataset of Reddit posts from users with self-reported mental illnesses and compared against a baseline model. The results showed that the machine learning, deep learning, and transfer learning models outperformed the baseline model in correctly classifying the different mental illnesses. This will help to reduce the amount of time it takes to identify and respond to medical emergencies, which will ultimately lead to more lives being saved. Additionally, it will also help to reduce the burden on healthcare workers, which will make the public health system more efficient and cost-effective as in [21].

A variety of factors can contribute to depression, such as genetics, brain chemistry, environmental influences, traumatic experiences, and other medical conditions. Additionally, depression can be caused by a combination of these factors, making it difficult to pinpoint a single cause. Genetics and brain chemistry can predispose someone to depression, while environmental factors and traumatic experiences can trigger its onset. Other medical conditions such as chronic illnesses can also be associated with depression. Recognizing the early signs of depression can help to identify and address the issue before it becomes a more serious problem. The CNN is used to extract high-level features from speech signals, while the SVM classifier is used to classify the extracted features. The hybrid model is trained on a dataset of Arabic speech from people with depression and those without, to produce a model that is capable of distinguishing between the two. The hybrid model uses a combination of convolutional neural networks (CNNs) and support vector machines (SVMs) to analyze the data and make predictions, while 30% of the data were used to test the proposed model. The hybrid model (CNN + SVM) attained accuracy rates of 90.0% and 91.60% in predicting depression. This combination of techniques allows the model to process the data quickly and accurately, resulting in the high accuracy rates it achieved. This is likely because the hybrid model combines the strengths of both models. The RNN can accurately make predictions based on the context of the data, while the CNN can detect the most important features in the data. By combining both models, the predictive power of the hybrid model is enhanced. The RNN achieved accuracy rates of 80.70% and 81.60%. This indicates that the combined model was more effective in classifying depression than either of the individual models alone. The results suggest that incorporating multiple models into one prediction system can increase the accuracy of the diagnosis. The achieved findings can be used to identify key indicators of depression in spoken Arabic, such as speech patterns, intonation, and pauses. These indicators can then be used to identify individuals who may be suffering from depression and help physicians, psychiatrists, and psychologists provide more effective treatment as in [22].

Mental health issues, such as depression and anxiety, are becoming more common, and people are recognizing the need to prioritize their mental health as well as their physical health. Additionally, with the development of telehealth services, it has become easier for people to access mental health services regardless of their location. Even so, most people who suffer from mental health issues are unable to get access to the right diagnosis and treatment, resulting in an overall decrease in the mental health of the population. The model will be trained on a dataset of speech samples from people with and without depression. Exploring the acoustic features and patterns in the speech samples of people with depression will help to identify the differences between those with and without depression. By doing so, it will be possible to detect signs of depression in an individual and provide an initial diagnosis of mental health problems. This model uses Natural Language Processing (NLP) techniques to analyze the text and determine the sentiment of the posts. The sentiment of the posts is then used to assess an individual's mental health status as in [23].

A comparative analysis has been done on the proposed systems that are equipped with machine learning models and similar types of systems that are also based on machine learning models. Table I contains 'Models' as the first attribute, where each model name is defined. The 'Description' attribute contains details about the models. The third attribute is 'Dataset', which refers to the dataset details, and the fourth attribute is 'Accuracy', where each model's accuracy has been given. The last attribute is 'Remarks' about each model. Fig. 1
shows the accuracy graph of similar machine learning models and proposed machine learning models.
TABLE I. COMPARATIVE STUDY OF PROPOSED MODELS WITH SIMILAR TYPE MODELS IN MENTAL DISORDERS
Sl. No. | Models | Description | Dataset | Accuracy | Remarks
Similar Type Machine Learning Models in Mental Disorders
1 | CNN, RNN, SNN [18] | Deep learning CNN, Recurrent Neural Network (RNN), and SNN are used for classification and comparison to detect attention deficit hyperactivity disorder (ADHD). | EEG data has been used. | 88%, 86%, and 78% | EEG is a medical test that measures electrical activity in the brain. The data are very high in volume, and time and cost-effective.
2 | Fully connected neural network model [19] | Neural network-based deep learning model to detect disorders like ADHD. | Deep learning long-range EEG big data. | 97.7% | The data is long-range EEG big data, which is very high-volume data for analysis.
3 | KNN, SVM, and RF [20] | KNN, SVM, and RF models are trained with the EEG signals data to detect ADHD. | EEG signals data of ADHD | 69%, 72%, and 74% | Much time has to be given to preprocessing to improve the quality of EEG signals.
4 | Linear Support Vector Classifier, LR, NB, and RF [21] | Depression, anxiety, bipolar disorder, ADHD, and PTSD detection from unstructured data. | Unstructured user data on the Reddit platform has been used. | 79%, 79%, 74%, and 75% | The Reddit post dataset cleaning process involves removing personal information, punctuation marks, and URLs.
5 | CNN + SVM [22] | Intelligent system to detect depressive symptoms using speech analysis | Basic Arabic Vocal Emotions Dataset (BAVED) | 90 and 91.60 | The dataset has been prepared from the audio format for sentiment analysis.
6 | RNN + CNN [22] | Intelligent system to detect depressive symptoms using speech analysis | Basic Arabic Vocal Emotions Dataset (BAVED) | 88.50 and 86.60 | The dataset has been prepared from the audio format for sentiment analysis.
Proposed Models in Mental Disorder (Autism Spectrum Disorder)
7 | Proposed SVM | SVM model to predict positive ASD symptoms from parents' dialogue. | Parents' dialogues of autistic children in text format from SAHAS-Durgapur, India, and social sites. | 71% | The data has been collected in text form. The parents' dialogues about their autistic children are very useful because they shared their experiences and thoughts about their autistic children. A parent of an autistic child is the best source to understand the ASD symptom patterns.
8 | Proposed Logistic Regression | Logistic Regression model to predict positive ASD symptoms from parents' dialogue. | Parents' dialogues of autistic children in text format from SAHAS-Durgapur, India, and social sites. | 71% | Same as row 7.
9 | Proposed K Nearest Neighbor (KNN) | KNN model to predict positive ASD symptoms from parents' dialogue. | Parents' dialogues of autistic children in text format from SAHAS-Durgapur, India, and social sites. | 62% | Same as row 7.
10 | Proposed Random Forest | Random Forest model to predict positive ASD symptoms from parents' dialogue. | Parents' dialogues of autistic children in text format from SAHAS-Durgapur, India, and social sites. | 69% | Same as row 7.
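Rows 7 to 10 of Table I concern classifiers that label each parent sentence as describing an ASD symptom (1) or not (0). As a rough illustration of that labeling step only, the sketch below implements a small k-nearest-neighbor classifier. It is not the authors' implementation: the Jaccard word-overlap similarity, the whitespace tokenization, the toy training sentences, and all function names are assumptions introduced for this example.

```python
from collections import Counter

# Hypothetical toy training data (not the paper's dataset).
# Label 1 = sentence describes an ASD symptom, 0 = it does not.
TRAIN = [
    ("my son avoids eye contact and does not respond to his name", 1),
    ("my daughter is three and still non verbal", 1),
    ("he flaps his hands and rocks back and forth", 1),
    ("can anyone suggest a good school nearby", 0),
    ("thank you all for the kind support", 0),
    ("we are looking for a double pushchair", 0),
]


def tokens(text):
    """Lowercase the sentence and split it into a set of words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two token sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(sentence, train=TRAIN, k=3):
    """Majority vote over the k training sentences most similar to the input."""
    query = tokens(sentence)
    ranked = sorted(
        train,
        key=lambda item: jaccard(query, tokens(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict("my son is still non verbal"))
    print(knn_predict("can anyone suggest a good pushchair"))
```

With an odd k and two classes the vote cannot tie. A realistic setup would use a much larger labeled set and a held-out test split to estimate accuracy.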
[Fig. 1. Accuracy graph of similar machine learning models and proposed machine learning models. X-axis: machine learning models; Y-axis: accuracy (0 to 100).]
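The cosine similarity step named in the Introduction, in which sentences predicted as positive are compared with ASD symptom sentences, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the bag-of-words term counts, the whitespace tokenization, the reference sentences in ASD_PROBLEMS, and the function names are all assumptions made for the example. For term-count vectors a and b, the score is cos(a, b) = (a . b) / (||a|| ||b||).

```python
import math
from collections import Counter

# Hypothetical reference sentences, one per ASD problem category.
# These are illustrative only and do not come from the paper's dataset.
ASD_PROBLEMS = {
    "communication": "child is non verbal and does not speak",
    "social interaction": "child avoids eye contact and does not play with others",
    "repetitive behavior": "child flaps hands and rocks back and forth",
}


def term_counts(text):
    """Bag-of-words vector: word -> number of occurrences."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence matches nothing
    return dot / (norm_a * norm_b)


def detect_problem(sentence, references=ASD_PROBLEMS):
    """Return the best-matching problem category and its similarity score."""
    query = term_counts(sentence)
    scores = {
        name: cosine_similarity(query, term_counts(ref))
        for name, ref in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    print(detect_problem("my little girl is 3 and a half and still non verbal"))
```

Raw term counts let frequent words such as "and" influence the score, so a fuller version would probably remove stop words or weight terms (for example with TF-IDF) before comparing sentences.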
III. ARCHITECTURE OF PROPOSED MODELS
A few traditional machine learning classifiers have been used to identify ASD symptoms from parents' dialogues. SVM has been used as the first classifier to identify the symptoms from the parents' dialogue. Logistic regression is the second classifier that identifies ASD symptoms from the given dataset. KNN and Random Forest are the last two classifiers used to identify ASD symptoms from the given dataset.

A. Dataset of Proposed System
The dataset has been prepared from parents' dialogues in which parents describe their thoughts and experiences about their own autistic child. These data have been collected from several social networks and from organizations where special children take therapies for communication, speech, and behavior. A few example dialogues are given in Table II. Parents' dialogues are very important data from which possible symptoms of ASD can be identified. The given dialogues are used to build the dataset for training and testing the proposed machine learning models.

TABLE II. EXAMPLE OF PARENTS' DIALOGUES
Sl. No. | Parents' Dialogues
1. | My second son is 4 and also autistic; he's on the move always and always into something and he's also a big momma's boy, loves hugging and cuddling me. I'm nervous about bringing baby home. Idk how he'll handle it. Any advice?
2. | Hi. Please I need some advice. My son is 10 and from a few years is very hard to make him do some activities (writing and staff like that) At school he refuse. They are not able to make him do anything. At school just play and if say no to him he just scream. He doesn't want to do anything; (in terms of studying or activities). I really don't know what to do.
3. | I'm currently having problems with washing my (almost 2 year old) daughter's hair. Whenever i try, she basically goes ballistic and throws a fit. She's scared and I'm trying to figure out how to support her and make her feel safe because she does have to get hair washed. Any suggestions and things that have worked for you?
4. | My youngest with autism, learning disabilities and is non-verbal, will be 4. She has to be in a pushchair whilst out and about for safety as has zero sense of danger. I'm struggling to find a double pushchair suitable for a newborn and my will be 4 year old. If anyone can send any links or pictures that would be great.
5. | From few days my son eye movements strangely like keeping head down n seeing up and moving eye balls to the corners of the eyes. Can anyone suggest why he is doing so? Please... thanks!

The dataset has been prepared from the text in Table II. Each sentence has been considered to identify whether it is a symptom of ASD or not. There are no fixed symptoms in ASD for identification. Collecting more dialogues from people who are actually parents of autistic children can help to identify more symptoms and to train the machine learning models for better accuracy. A few examples of data from the proposed dataset are given in Table III.

TABLE III. EXAMPLE DATA IN THE PROPOSED DATASET
Sl. No. | Comments | Sentiment
1. | because all they do there is play with toys with him every time | 1
2. | I'm confused guys help my son is 3years old now | 0
3. | My little girl is 3 and a half and still non verbal | 1
4. | he does is mumbles only no proper words | 1
5. | I was really surprised when he came home with iep papers | 0

The dataset structure in the proposed research is described in Table III, where the first column is Serial Number, the second column is Comments, and the third column is Sentiment. Paragraph text from parents' dialogues has been taken to prepare the dataset. Each sentence is taken from the paragraph text and labeled according to whether it is a symptom of ASD: 1 (true) if it is, otherwise 0 (false). According to Table III, the sentences in the Comments column with serial numbers 1, 3, and 4 are true symptoms of ASD, whereas serial numbers 2 and 5 are not. This ASD symptom-based dataset has been prepared to train traditional machine learning models such as SVM, Logistic Regression, KNN, and Random Forest.

dataset and it generates good predictive results according to the problem. SVM is based on finding the best hyperplane that divides the data points into two classes or multiple classes. The proposed approach is binary classification, where each data point is either true (1) or false (0).
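As a minimal sketch of the labelling scheme described for Table III, the five example rows can be stored as comment/label pairs and the positive (symptom) sentences filtered out. The sketch uses only the Python standard library; the helper name `load_labelled_sentences` and the in-memory CSV text are illustrative assumptions, not the authors' data-loading code.

```python
import csv
import io

# Rows copied from Table III; the column names follow the table header.
TABLE_III_CSV = """Comments,Sentiment
because all they do there is play with toys with him every time,1
I'm confused guys help my son is 3years old now,0
My little girl is 3 and a half and still non verbal,1
he does is mumbles only no proper words,1
I was really surprised when he came home with iep papers,0
"""


def load_labelled_sentences(text):
    """Parse CSV text into (sentence, label) pairs with integer labels."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Comments"], int(row["Sentiment"])) for row in reader]


rows = load_labelled_sentences(TABLE_III_CSV)
# Keep only the sentences labelled 1 (ASD symptom).
positives = [sentence for sentence, label in rows if label == 1]
print(len(rows), "sentences,", len(positives), "labelled as ASD symptoms")
```

Keeping the label as an integer column mirrors the binary target (1 or 0) that the four classifiers are trained on.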
# kfolds sends the data to the SVM model in stratified folds.
kfolds = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
# Build the pipeline that feeds the vectorized text into the SVM model.
pipeline_svm = make_pipeline(vectorizer, SVC(probability=True,
    kernel="linear", class_weight="balanced"))
# SVM model initialization with parameters
grid_svm = GridSearchCV(pipeline_svm,
    param_grid={'svc__C': [0.01, 0.1, 1]},
    cv=kfolds,
    scoring="roc_auc",
    verbose=1,
    n_jobs=-1)
# Fit the training data to train the model
grid_svm.fit(X_train, y_train)
Step 6. Predict the result using SVM model.
model = grid_svm.best_estimator_
prediction = model.predict(X_test)

The result of this proposed algorithm is discussed in the Result and Discussion section.

C. Logistic Regression
The next approach is logistic regression, which is able to identify ASD symptoms from user text. It is another machine-learning algorithm for binary classification problems. The logistic regression model produces a value that is bounded between 0 and 1. Logistic regression does not assume a linear relationship between the input and output variables, because a nonlinear (log) transformation is applied to the odds ratio. Logistic regression can be defined as

$$\log\left(\frac{p(M)}{1 - p(M)}\right) = \beta_0 + \beta_1 X$$

Fig. 3. Sigmoid function according to the equation.

The proposed algorithm based on logistic regression is given below.

Proposed Logistic Regression Algorithm:
Pseudo Code:
Step 1: Read data from CSV file.
Step 2: X=data from csv
x1=[a1,a2,a3,a4,a5,...,an] is the user text column inside the dataset.
x2=[r1,r2,r3,r4,r5,...,rn] is the label data column inside the dataset.
Step 3: Feature generation using the Vectorizer function.
# The vectorizer converts string values to numeric values.
vectorizer = CountVectorizer(
    analyzer='word',
    lowercase=False)
# Feature creation using vectorizer.fit_transform
features = vectorizer.fit_transform(x1)
# Feature array creation
features_nd = features.toarray()
Step 4: Model creation and training
# X_train, X_test, y_train, y_test are assumed to come from a train/test
# split of features_nd and x2; the split step is not shown in the source.
# Logistic model creation
log_model = LogisticRegression()
# Logistic model training
log_model = log_model.fit(X=X_train, y=y_train)
Step 5: Prediction using Logistic Regression model
y_pred = log_model.predict(X_test)

The output of this proposed algorithm is discussed in the Result and Discussion section.

D. K-Nearest Neighbor (KNN)
KNN is the third approach to identifying ASD symptoms from user text. KNN is a supervised algorithm that can be used in classification problems. It uses feature similarity to predict the value for a new data point that comes as input. KNN measures the similarity between the new data point and the available labeled data points, and assigns the new point to the category of the most similar data points. KNN is very popular in binary classification. Fig. 4 shows the new data point plotted on a graph before KNN prediction, where two categories of data points are present. Category A and Category B have been classified according to the nearest data points. According to Fig. 5, after applying the KNN algorithm, the new data point has been assigned to Category B because its nearest neighbor is a data point of Category B.

Fig. 4. Before the KNN algorithm is applied on a new data point.

Fig. 5. After the KNN algorithm is applied on a new data point.

Hamming distance functions can be used here to calculate the distance. The proposed algorithm is based on KNN and uses the proposed dataset.

Proposed KNN Algorithm:
Pseudocode:
Step 1: Read data from CSV file.
Step 2: X=data from csv
x1=[a1,a2,a3,a4,a5,...,an] is the user text column inside the dataset.
y1=[r1,r2,r3,r4,r5,...,rn] is the label data column inside the dataset.
# Split the data into train and test sets
x_train, x_test, y_train, y_test = train_test_split(x1, y1, stratify=y1, test_size=0.33)
Step 3: String value to vector transformation.
# Vectorizer declaration
vectorizer = CountVectorizer()
# Vector transformation of x_train
x_train_bow = vectorizer.fit_transform(x_train)
# Vector transformation of x_test
x_test_bow = vectorizer.transform(x_test)
Step 4: KNN model creation
grid_params = {'n_neighbors': [40, 50, 60, 70, 80, 90], 'metric': ['manhattan']}
knn = KNeighborsClassifier()
Step 5: KNN model training using prepared dataset
clf = RandomizedSearchCV(knn, grid_params,
    random_state=0, n_jobs=-1, verbose=1)
clf.fit(x_train_bow, y_train)
Step 6: Prediction using KNN model
# predict_proba returns class probabilities; clf.predict would return 0/1 labels.
prediction = clf.predict_proba(x_test_bow)

The result of this proposed KNN-based algorithm is discussed in the Result and Discussion section.

E. Random Forest
The last approach is the Random Forest machine learning algorithm to identify ASD symptoms from user text. It is one of the important machine learning algorithms and is constructed from decision trees. The Random Forest algorithm is used to solve regression and classification problems. It is trained through bagging, which is an ensemble technique used to improve the accuracy of machine learning algorithms. The outcome of the random forest is based on the predictions of its decision trees: the mean of the outputs of the various decision trees is used to calculate the prediction value. Each decision tree generates its prediction from a series of feature-based splits, starting from a root node and ending in a leaf node with a decision. Feature selection and the splitting process depend on the impurity, which measures how mixed the 'yes' and 'no' outcomes are at a node. To measure the impurity of the dataset, the Gini index [25] is a good option and can be written mathematically-
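As a hedged illustration, the standard Gini impurity of a node with class proportions p_i is G = 1 - Σ p_i², which may differ in notation from the form used in [25]. The sketch below computes it for a list of labels and for a candidate binary split; the function names `gini_impurity` and `split_impurity` are illustrative and are not taken from the paper.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_i^2) over the class proportions p_i in `labels`."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of a candidate binary split."""
    total = len(left) + len(right)
    left_part = (len(left) / total) * gini_impurity(left)
    right_part = (len(right) / total) * gini_impurity(right)
    return left_part + right_part


# Sentiment labels of the five example rows in Table III.
labels = [1, 0, 1, 1, 0]
print(gini_impurity(labels))               # impurity before any split
print(split_impurity([1, 1, 1], [0, 0]))   # a split that separates the classes
```

A decision tree prefers the split with the lowest weighted impurity, so a split that separates the two classes completely scores zero.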
F. Proposed System Flow
Fig. 6 shows the overall architectural diagram of the proposed system to identify ASD symptoms from user text. The proposed system reads data from the ASD symptoms dataset in the first step. Each sentence is passed through NLP tasks such as tokenization, stop word removal, and text-to-vector transformation. Sentences are split into tokens, and stop words, meaning unwanted words (tokens) such as 'am', 'is', 'a', and 'an', are removed from the sentence. The final task is to transform each token into a vector. These vectors, together with the labels, are the main input to each machine-learning model. After vector transformation, the data are separated into training and testing sets. According to Fig. 6, the SVM, Logistic Regression, KNN, and Random Forest models are trained with the training data, and their predictions are tested with the test data. After model training and testing, the proposed system is ready to accept new paragraph text from the user and to identify the positive sentences in the given text that denote ASD symptoms. Each sentence is predicted as either positive (1) or negative (0). The proposed system selects only the positive sentences, and the negative sentences are discarded in the next step. The selected positive sentences are the input to the spaCy Cosine Similarity Model. This model reads each positive sentence from the ASD symptoms dataset (Table V) and calculates its cosine similarity with the input sentence. The model finds the dataset sentence that has the highest cosine similarity score with the input sentence, and the system selects the Label of that sentence. The Label indicates the ASD problem according to Table IV. Each input sentence is handled by this cosine similarity model to identify ASD problems. The algorithm is given below.

Proposed Cosine Similarity algorithm:
Pseudo code:
Step 1: # Declare Python and spaCy packages
import spacy
import pandas as pd
nlp = spacy.load('en_core_web_lg')
Step 2: # Load the positive ASD symptoms data into a DataFrame
df = pd.read_csv("ASD_Smptoms.csv")
Step 3: Define Cosine Similarity Calculation Method
def Spacy_Cosine(strs):
    # Three lists store each comment, its label, and its cosine similarity value.
    # They are created inside the function so that scores from earlier inputs
    # do not carry over into the current comparison.
    comments = []
    sentiment = []
    cosine_value = []
    for ind in df.index:
        sen1 = nlp(df['Comments'][ind])
        sen2 = nlp(strs)
        # Remove stop words from both sentences before comparing them
        sen1_no_stop_words = nlp(' '.join([str(t) for t in sen1 if not t.is_stop]))
        sen2_no_stop_words = nlp(' '.join([str(t) for t in sen2 if not t.is_stop]))
        comments.append(df['Comments'][ind])
        sentiment.append(df['Sentiment'][ind])
        score = sen2_no_stop_words.similarity(sen1_no_stop_words)
        # score = sen2.similarity(sen1)
        cosine_value.append(score)
    dfc = pd.DataFrame(
        {'Comments': comments,
         'Sentiment': sentiment,
         'Cosine_Scores': cosine_value
        })
    dfc.to_csv(r'ASD_Cosine_Data.csv')
    dfc['Cosine_Scores'] = dfc['Cosine_Scores'].astype('float64')
    # Return the label of the dataset sentence with the highest similarity
    i = dfc['Cosine_Scores'].idxmax()
    return dfc['Sentiment'][i]
Step 4: # Select only predicted positive (1) sentences as input
strs = List of predicted positive sentences
for st in strs['Comments']:
    result = Spacy_Cosine(st)
    print(st, "=", result)

The output of this proposed algorithm is given and discussed in the Result and Discussion section.

IV. RESULT AND DISCUSSION
The proposed system uses multiple traditional machine learning models, namely SVM, Logistic Regression, KNN, and Random Forest. The proposed dataset has been used to train and test these models. The result of each model on the dataset is discussed here one by one.

A. Result and Discussion of SVM Model
Table VI shows the SVM model metrics after training and testing.

TABLE VI. SVM MODEL METRICS
Sl. No. | Metrics | Value
1. | AUC | 0.77
2. | F1 | 0.74
3. | Accuracy | 0.71
4. | Precision | 0.71
5. | Recall | 0.77

The SVM model has multiple metrics for understanding the model's performance and scalability. According to Table VI, the AUC score is 77%, which is a reasonably good score for a trained SVM model. The AUC refers to the area under the ROC curve, which is a popular metric for SVM. If AUC = 1, the model can distinguish correctly between positive and negative. If 0.5 < AUC < 1, there is a high chance that the model can distinguish between positive and negative. The F1 score of the proposed SVM model is 74%; it combines the precision and recall scores, which are 71% and 77% respectively. The overall accuracy of the proposed SVM model is 71%. In the ROC curve, a higher Y-axis value (true positive rate) denotes more true positives relative to false negatives, and a higher X-axis value (false positive rate) denotes more false positives relative to true negatives. According to Fig. 7, the ROC curve of the proposed SVM model shows a higher true positive rate than false positive rate. This signifies that the proposed model is able to generate good prediction results.

Fig. 7. ROC curve of proposed SVM model.

According to Fig. 8, the training score line on the graph lies between approximately 0.94 and 0.99, and the cross-validation score line lies between approximately 0.70 and 0.79. The gap between the two score lines is not very high. The proposed model is able to generate good prediction results according to Fig. 8.
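The AUC interpretation given for Table VI can be made concrete: AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The following is a minimal plain-Python sketch of that pairwise definition; the function name `roc_auc` and the labels and scores are made-up illustration values, not outputs of the proposed SVM model.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive score and a negative score counts as half a pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and classifier scores, for illustration only.
y_true = [1, 1, 1, 0, 0]
y_score = [0.9, 0.6, 0.4, 0.5, 0.2]
print(roc_auc(y_true, y_score))
```

In this toy case five of the six positive/negative pairs are ordered correctly. A classifier that ranks every positive above every negative would reach 1.0, and uninformative scores give about 0.5.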
A few sentences have been sent to the proposed SVM model for prediction. According to Fig. 9, the proposed model shows the output as 1 or 0, which is attached to each sentence. One (1) refers to a positive sentence regarding ASD detection whereas zero (0) refers to a negative sentence.

Fig. 10. Confusion matrix of logistic regression model.

TABLE VII. LOGISTIC REGRESSION MODEL METRICS
Sl. No. | Metrics | Value
1. | AUC | 0.69
2. | F1 | 0.63
3. | Accuracy | 0.71
4. | Precision | 0.72
5. | Recall | 0.56
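F1 is the harmonic mean of precision and recall, so the F1 values reported in Tables VI and VII can be re-derived from the precision and recall entries of the same tables. The sketch below does this check; `f1_from_precision_recall` is a local helper written for this illustration, not a library function.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in Table VI (SVM) and Table VII (logistic regression).
svm_f1 = f1_from_precision_recall(0.71, 0.77)
lr_f1 = f1_from_precision_recall(0.72, 0.56)
print(round(svm_f1, 2), round(lr_f1, 2))
```

Rounded to two decimals, the results agree with the reported F1 scores of 0.74 for SVM and 0.63 for logistic regression.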
[6] A. L. Georgescu, J. C. Koehler, J. Weiske, K. Vogeley, N. Koutsouleris, C. Falter-Wagner, "Machine Learning to Study Social Interaction Difficulties in ASD", Computational Approaches for Human-Human and Human-Robot Social Interactions, 2019.
[7] Shomona Gracia Jacob, Majdi Mohammed Bait Ali Sulaiman, Bensujin Bennet, "Algorithmic Approaches to Classify Autism Spectrum Disorders: A Research Perspective", Procedia Computer Science, vol. 201, pp. 470–477, 2022.
[8] Fadi Thabtah, David Peebles, "A new machine learning model based on induction of rules for autism detection",
[9] D. P. Wall, R. Dally, R. Luyster, et al., "Use of artificial intelligence to shorten the behavioral diagnosis of autism", PLoS ONE, 2012.
[10] M. Duda, R. Ma, N. Haber, et al., "Use of machine learning for behavioral distinction of autism and ADHD", Transl Psychiat, vol. 9(6), 2016.
[11] A. Pratap, C. S. Kanimozhiselvi, R. Vijayakumar, et al., "Predictive assessment of autism using unsupervised machine learning models", Int J Adv Intell Paradig, vol. 6(2), pp. 113–121, 2014.
[12] M. Al-Diabat, "Fuzzy data mining for autism classification of children", Int J Adv Comput Sci Appl, vol. 9(7), pp. 11–17, 2018.
[13] Anja Thieme, Danielle Belgrave, Gavin Doherty, "Machine Learning in Mental Health: A Systematic Review of the HCI Literature to Support the Development of Effective and Implementable ML Systems", Trans. Comput.-Hum. Interact, vol. 27(5), Article 34, 2020.
[14] Jacqueline Peng, Mengge Zhao, James Havrilla, Cong Liu, Chunhua Weng, Whitney Guthrie, Robert Schultz, Kai Wang, Yunyun Zhou, "Natural language processing (NLP) tools in extracting biomedical concepts from research articles: a case study on autism spectrum disorder", BMC Med Inform Decis Mak, pp. 1-9, 2020.
[15] Izabela Chojnicka, Aleksander Wawer, "Social language in autism spectrum disorder: A computational analysis of sentiment and linguistic abstraction", PLOS ONE, pp. 1-16, 2020.
[16] Mahmoud Elbattah, Jean-Luc Guérin, Romuald Carette, Federica Cilia, Gilles Dequen, "NLP-Based Approach to Detect Autism Spectrum Disorder in Saccadic Eye Movement", IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1581-1587, 2020.
[17] T. Lakshmi Praveena, N. V. Muthu Lakshmi, "Sentiment Analysis on Autism Spectrum Disorder using Twitter Data", International Journal of Recent Technology and Engineering (IJRTE), vol. 7(4), pp. 204-208, 2018.
[18] Laura Dubreuil-Vall, Giulio Ruffini, Joan A. Camprodon, "Deep Learning Convolutional Neural Networks Discriminate Adult ADHD From Healthy Individuals on the Basis of Event-Related Spectral EEG", Front. Neurosci, vol. 14, pp. 1-12, 2020.
[19] Dingfu Zhou, Zhihang Liao, Rong Chen, "Deep Learning Enabled Diagnosis of Children's ADHD Based on the Big Data of Video Screen Long-Range EEG", Journal of Healthcare Engineering, pp. 1-9, 2022.
[20] Shubham Dhuri, Nitin Ahire, Deepak Kamat, Sunil Nayak, Bhavesh Maurya, "ADHD EEG signal analysis using Machine Learning", International Research Journal of Engineering and Technology (IRJET), vol. 8(5), pp. 2572-2575, 2021.
[21] Iqra Ameer, Muhammad Arif, Grigori Sidorov, Helena Gomez-Adorno, Alexander Gelbukh, "Mental Illness Classification on Social Media Texts using Deep Learning and Transfer Learning", arXiv:2207.01012, pp. 1-12, 2022.
[22] Tanzila Saba, Amjad Rehman Khan, Ibrahim Abunadi, Saeed Ali Bahaj, Haider Ali, Maryam Alruwaythi, "Arabic Speech Analysis for Classification and Prediction of Mental Illness due to Depression Using Deep Learning", Computational Intelligence and Neuroscience, vol. 2022, pp. 1-9, 2022.
[23] Amanda Sun, Zhe Wu, "Early detection of mental disorder via social media posts using deep learning models", Proceedings of Asia Pacific Computer Systems Conference, pp. 149-158, 2021.
[24] Anshul Saini, "Support Vector Machine(SVM): A Complete guide for beginners", https://www.analyticsvidhya.com/blog/2021/10/support-vector-machinessvm-a-complete-guide-for-beginners/, 2023.
[25] Himanshi Singh, "How to select Best Split in Decision trees using Gini Impurity", https://www.analyticsvidhya.com/blog/2021/03/how-to-select-best-split-in-decision-trees-gini-impurity/, 2021.

AUTHORS' PROFILE
Prasenjit Mukherjee has 14 years of experience in academics and industry. He completed his Ph.D. in Computer Science and Engineering in the area of Natural Language Processing from the National Institute of Technology (NIT), Durgapur, India under the Visvesvaraya PhD Scheme from 2015 to 2020. Presently, he is working as a Data Scientist at Vodafone Intelligent Solutions, Pune, Maharashtra, India, and doing his Post Doctoral (D.Sc.) in Computer Science from Manipur International University, Imphal, Manipur, India.

Sourav Sadhukhan has above 5 years of experience in Law and Management. He completed his Graduation in LLB from Calcutta University, Kolkata, India, and Post Graduate Diploma in Management from Pune Institute of Business Management, Pune, India. Presently he is a student of Executive Post Graduation in Data Science and Analytics from the Indian Institute of Management, Amritsar, India.

Dr. Manish Godse has 27 years of experience in academics and industry. He holds a Ph.D. from Indian Institute of Technology, Bombay (IITB). He is currently working as an IT Consultant in the Bizamica Software, Pune in the area of Artificial Intelligence and Analytics. His research areas of interest include automation, machine learning, natural language processing and business analytics. He has multiple research papers indexed at IEEE, ELSEVIER, etc.

Dr. Baisakhi Chakraborty received the Ph.D. degree in 2011 from National Institute of Technology, Durgapur, India in Computer Science and Engineering. Her research interests include knowledge systems, knowledge engineering and management, database systems, data mining, natural language processing, and software engineering. She has several research scholars under her guidance. She has more than 60 international publications. She has a decade of industrial and 22 years of academic experience.