Final Project
1. INTRODUCTION
Health care is one of the fields that stands to benefit most from reduced processing time: the speed and efficiency of diagnosing human health conditions matter greatly. Long diagnosis times are a major challenge for many conditions, especially Autism. Firmly diagnosing a child with autism can take up to six months because of the lengthy process involved, during which the child must see several different specialists, including developmental pediatricians, neurologists, psychiatrists, and psychologists.
The time needed to finalize an Autism diagnosis in the traditional way is relatively long, so machine learning methods can make a significant difference in accelerating the process. Early intervention is known to be the key to improving outcomes for autistic children, which makes faster diagnosis even more crucial in Autism cases. Big data and machine learning technologies can make enormous progress in predicting and speeding up the complex, time-consuming processes of diagnosis and treatment.
A machine learning system can be developed that uses the massive amount of available health and medical data for predictive modeling and predictive analysis. In this project, several machine learning techniques and models are tested, compared, and analyzed. The data is pre-processed so that predictions can be made about whether test subjects should be classified as Autistic. Many existing classification algorithms can be applied, and each classifier differs in how it accumulates and filters data, extracts features, and feeds these results to the model for learning.
A second stage attempts to use machine learning algorithms to identify the most relevant diagnostic questions of a traditional Autism diagnostic questionnaire (AQ), using the extended version of the test containing fifty questions. The results are then analyzed so they can be used to further improve Autism predictive models. This second stage is important for future research in the Autism diagnosis field, since the model designed to check question relevancy can be used in many different ways. Once more data is collected, this model can help place a child's Autism severity on the scale. Further research with a larger dataset would be very useful for updating the Autism test questions based on the model's output, which in turn would improve the machine learning process.
Autism Spectrum Disorder (ASD) is a complex, heterogeneous neurodevelopmental disability that may cause behavioral, social, or communication challenges. The term "spectrum" indicates that the severity of the symptoms varies between individuals, and the symptoms themselves differ because of the heterogeneity of the condition.
The Centers for Disease Control and Prevention (CDC) report that the prevalence of autism has been increasing over the past two decades. In 2000, one in every 150 children in the US was diagnosed with ASD; by 2014 it was one in every 59 children, and by 2016 one in every 54, according to CDC prevalence reports from 2000 to 2016. Throughout this period, researchers have been using various methods to help diagnose people with ASD and to understand what may cause the condition.
The proposed system uses Random Forest (RF), Support Vector Machine (SVM), Decision Tree, and AdaBoost algorithms to predict autism spectrum disorder in an individual, evaluated in terms of accuracy, specificity, sensitivity, precision, and F1-score. The results are measured using the confusion matrix and the classification report.
In this research we use machine learning to determine a set of conditions that together prove predictive of Autism Spectrum Disorder. This will be of great use to physicians, helping them detect Autism Spectrum Disorder at a much earlier stage.
1.2. System specification
1.3. Software description
Python
Software development companies prefer the Python language because of its versatile features and the smaller amount of code it requires. Roughly 14% of programmers use it on operating systems such as UNIX, Linux, Windows, and macOS.
Python has diversified applications in software development companies, such as gaming, web frameworks and applications, language development, prototyping, and graphic design applications. This gives the language a clear advantage over other programming languages used in the industry. Some of its advantages are:
• Large Standard Library
Python provides a large standard library covering areas such as string operations, the Internet, web service tools, operating system interfaces, and protocols. Most commonly used programming tasks are already scripted into it, which limits the amount of code that has to be written in Python.
• Presence of Third Party Modules
The Python Package Index (PyPI) contains numerous third-party modules that make
Python capable of interacting with most of the other languages and platforms.
• Integration Feature
Python supports Enterprise Application Integration, making it easy to develop Web services by invoking COM or CORBA components. It has powerful control capabilities, as it can call directly into C, C++, or Java code. Python also processes XML and other markup languages, and it runs on all modern operating systems through the same byte code.
The language's extensive support libraries and clean object-oriented design can increase a programmer's productivity two- to tenfold compared with languages such as Java, VB, Perl, C, C++, and C#.
• Productivity
Its strong process integration features, unit testing framework, and enhanced control capabilities all contribute to increased speed and productivity for most applications, and it is a great option for building scalable multi-protocol network applications.
Further, its development is driven by a community that collaborates through hosting conferences and mailing lists, and provides numerous modules for it.
• Learning Ease and Support Available
Python offers excellent readability and uncluttered simple-to-learn syntax which helps
beginners to utilize this programming language. The code style guidelines, PEP 8, provide a
set of rules to facilitate the formatting of code. Additionally, the wide base of users and active
developers has resulted in a rich internet resource bank to encourage development and the
continued adoption of the language.
Python has built-in list and dictionary data structures which can be used to construct
fast runtime data structures. Further, Python also provides the option of dynamic high-level
data typing which reduces the length of support code that is needed.
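A minimal illustration of those built-in structures and of dynamic typing; the names `scores` and `counts` are just examples, not identifiers from this project.

```python
# Built-in list and dict used as fast runtime data structures.
scores = [1, 0, 1, 1]          # list: an ordered sequence of answers
counts = {"yes": 0, "no": 0}   # dict: tallies keyed by label

for s in scores:
    counts["yes" if s == 1 else "no"] += 1

# Dynamic typing: no declarations were needed for either structure.
print(counts)
```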
• The syntax of the Python language is very simple; its rules are easy to remember.
• Python's elegant syntax makes the language easy to learn.
• Python can be learned directly, without prior knowledge of any other programming language.
• Python's simple yet powerful syntax lets programmers express their business logic in fewer lines of code.
• Platform Independent
• High Level Language
• Python is Embeddable
• Python code can be embedded into other languages such as C, C++, and Java.
• Python code can be used inside other languages to provide scripting capabilities to them.
• Applications of Python
• GUI based desktop applications
• Image processing and graphic design applications
• Scientific and computational applications
• Games
• Web frameworks and web applications
• Enterprise and business applications
• Operating systems
• Language development
• Prototyping
2. SYSTEM STUDY
According to the Centers for Disease Control and Prevention (CDC), around one in 68 children has been identified with some form of ASD. The main problem addressed by this research is to expedite Autism diagnosis by providing a machine learning system that uses different machine learning algorithms to build an Autism predictive model with the highest possible accuracy.
The proposed solution is a high-accuracy predictive model that can predict whether a child has Autism using the Autism Quotient (AQ) questionnaire test. The aim is to take a traditional Autism diagnosis method and transform it into a machine learning model that can use the massive amount of data collected to make predictions and observations, leading to better ways of discovering Autism at the earliest age possible. Ideally, more observations and data analysis in the field will suggest new methods of improvement.
2.2. Proposed system
In the proposed system, three machine learning algorithms are used to predict autism spectrum disorder: Logistic Regression, KNN, and Random Forest; these algorithms can achieve high accuracy in predicting autism spectrum disorder. Using machine learning, the application can analyze whether a child is at risk for developmental delay or autism. It can help identify red flags in development and also help analyze progress along normal development.
• Machine learning algorithms can analyze a large amount of data to assist medical professionals in making more informed decisions cost-effectively.
• Machine learning algorithms allowed us to analyze clinical data, draw relationships between diagnostic variables, design the predictive model, and test it against new cases. The predictive model achieved an accuracy of 89.4 percent using the Random Forest classifier's default settings.
• Finally, the model we built to predict autism spectrum disorder can save enormous medical bills, improve diagnosis capability on a large scale, and most importantly save lives.
• By mapping this activity over time across the brain's many regions, the algorithm generates neural activity "fingerprints."
• Although unique to each individual, just like real fingerprints, these brain fingerprints nevertheless share similar features, allowing them to be sorted and classified.
• With the help of machine learning, researchers hope brain data can help identify mental health issues.
• By applying specially designed algorithms to brain scans, labs could identify distinctive features that determine a patient's optimal treatment.
3. SYSTEM DESIGN AND DEVELOPMENT
3.1. Input design
Input design is the process of entering data into the system. Its goal is to get data into the computer as accurately as possible, so the inputs are designed to minimize errors made during operation. The inputs to the system have been designed so that manual forms and inputs are coordinated, with the data elements common to both the source document and the input. The input is acceptable and understandable to the users who work with it.
Help messages are also provided whenever the user enters a new field, so that he or she can understand what is to be entered. Whenever the user enters erroneous data, an error message is displayed, and the user can move to the next field only after entering correct data.
Data pre-processing removes outliers and noise from the raw data and makes it available for training the model. Simply put, data pre-processing is the major step in obtaining the best accuracy.
3.2. Output design
Output design is the process of converting computer data into output that is understood by all. The various outputs have been designed to match the format the office and management are used to.
Computer output is the most important and direct source of information for the user. Efficient, intelligible output design improves the system's relationship with the user and helps in decision making. A major form of output is hardcopy from the printer.
Output requirements are designed during system analysis, and a good starting point for output design is the Data Flow Diagram (DFD). Human factors raise design issues that involve addressing internal controls to ensure readability.
The output of the system is delivered either on screen or as hard copy. Output design aims at communicating the results of processing to the users, so the reports are generated to suit their needs, at appropriate levels of detail.
3.3. System development
• Dataset Collection
The dataset for this study was gathered from the UCI Repository, which is open to the public. Dataset name: ASD Screening Data for Adult; attribute types: continuous and binary; number of instances: 704. The dataset has sixteen characteristics, a combination of categorical and numerical data, including: Questions 1-10, Age, Gender, whether the person was born with jaundice, whether any family member has ASD, who is completing the test, and Class. The AQ-10 screening questions cover a variety of domains, including attention switching, imagination, communication, and social interaction. The questions are graded using a one-point scoring system: on each of the ten questions, the user earns 0 or 1 point depending on their response.
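The one-point-per-question scoring can be sketched as follows. The cut-off of 6 is the commonly cited AQ-10 screening threshold and is an assumption here, not a value taken from this project's dataset.

```python
# Sketch of AQ-10 scoring: one point per question, ten questions.
# The threshold of 6 is the commonly cited AQ-10 cut-off (an assumption here).
def aq10_score(answers):
    """answers: list of ten 0/1 item scores."""
    if len(answers) != 10:
        raise ValueError("AQ-10 expects exactly ten answers")
    return sum(answers)

def screen_positive(answers, threshold=6):
    """True when the total score reaches the screening threshold."""
    return aq10_score(answers) >= threshold

print(screen_positive([1, 1, 1, 0, 1, 1, 1, 0, 0, 1]))  # 7 points -> True
```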
• Preprocessing of Data
Data pre-processing removes outliers and noise from the raw data and makes it available for training the model; it is the major step in obtaining the best accuracy. Raw data is thus converted into something usable and understandable. Real-world data is frequently incomplete and inaccurate, containing many errors and outliers. Several methods exist to handle such data, including handling incomplete data, outlier analysis, data reduction, and discretization. The missing values in these datasets were resolved using the imputation method.
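A hedged sketch of the imputation step, using a toy data frame in place of the real dataset. `SimpleImputer` with mean imputation is one common choice; the project does not specify which imputer was used.

```python
# Sketch: fill missing values before training. The toy frame below
# stands in for the real screening data.
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({"age": [25.0, None, 31.0, 28.0],
                   "result": [7.0, 5.0, None, 9.0]})

imputer = SimpleImputer(strategy="mean")  # median/most_frequent also common
filled = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)

print(filled.isna().sum().sum())  # no missing values remain
```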
• Transformation of Data
Transforming the data into forms appropriate for data mining is the third step. Transformation refers to putting each feature involved in data mining into the right format, which is decided according to the techniques selected for data mining.
First, a data mining algorithm appropriate for extracting the patterns in the data is chosen. In this step, the data mining model is built: the model is trained on the training dataset and tested on the testing set. Data mining models include clustering, classification, and prediction. Using an 80:20 ratio, the complete dataset is divided into a training set and a testing set respectively. On this basis, several supervised learning systems are implemented: random forests (RF), KNN, and Logistic Regression.
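The 80:20 split and the three supervised models named above can be sketched like this, with synthetic data standing in for the screening dataset.

```python
# Sketch: 80:20 train/test split and the three models named above,
# fitted on synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0)  # 80:20 split

models = {
    "RF": RandomForestClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "LR": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, round(model.score(X_test, y_test), 2))
```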
In this step, the models developed for data mining are evaluated for performance and accuracy of results. The evaluation methods and metrics differ for each data mining technique. In addition to assessing accuracy, sensitivity, specificity, and precision, the proposed model was also tested using the leave-one-out strategy on the AQ-10 dataset. As part of the validation process, field observations were conducted at various places using forms, collecting over 189 ASD cases and 515 cases without ASD from a special education institute for people with special needs.
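The leave-one-out strategy mentioned above holds out each sample once and predicts it with a model trained on all the others. A sketch with scikit-learn's `LeaveOneOut` splitter, on synthetic data rather than the AQ-10 dataset:

```python
# Sketch: leave-one-out cross-validation. Each of the 40 samples is
# held out once; synthetic data stands in for the AQ-10 dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=40, n_features=10, random_state=1)
scores = cross_val_score(KNeighborsClassifier(), X, y, cv=LeaveOneOut())
print(len(scores), scores.mean())  # one 0/1 score per held-out sample
```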
Here we evaluate the accuracy of three algorithms: Decision Tree, Logistic Regression, and KNN.
Methods
Here we use several classifiers to compare accuracy scores; after comparing their effectiveness, we select and fine-tune the best classifier.
Logistic regression
Random Forest
GUI Prediction
Finally, a Tkinter GUI application has been developed specifically for the general public. The user answers closed-ended questions to receive a result regarding autism. The input data is collected from the graphical screen, fed to the trained model, and the prediction result is displayed on screen. Accuracy is calculated for three models: Logistic Regression, Random Forest, and KNN.
4. TESTING AND IMPLEMENTATION
System testing is the process of exercising software with the intent of finding and ultimately correcting errors. This fundamental philosophy does not change for web applications; because Web-based systems and applications reside on a network and interoperate with many different operating systems, browsers, hardware platforms, and communication protocols, the search for errors represents a significant challenge.
System testing is actually a series of different tests whose primary purpose is to fully exercise the computer-based system. It is the stage of implementation aimed at assuring that the system works accurately and efficiently. Testing is vital to the success of the system, and system testing makes the logical assumption that if all parts of the system are correct, the goal will be successfully achieved.
• Testing is the process of executing a program with the intent of finding an error.
• A successful test is one that uncovers an as-yet undiscovered error.
Testing Issues
Testing Methodologies
System testing is the stage of implementation aimed at ensuring that the system works accurately and efficiently as expected before live operation commences. It certifies that the whole set of programs hangs together. System testing requires a test plan that consists of several key activities and steps for program, string, system, and user acceptance testing. Implementing the newly designed package correctly is important to adopting a successful new system.
The testing phase in the development cycle validates the code against the functional specification. Testing is vital to achieving the system's goals; its objective is to discover errors. To fulfill this objective, a series of test steps (unit, integration, validation, and system tests) were planned and executed.
• Unit testing
• Integration testing
• Functional testing
• System testing
• White box testing
• Black box testing
Unit testing involves the design of test cases that validate that the internal program logic is functioning properly and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application, done after the completion of an individual unit and before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive.
Unit tests perform basic tests at component level and test a specific business process,
application, and/or system configuration. Unit tests ensure that each unique path of a business
process performs accurately to the documented specifications and contains clearly defined
inputs and expected results.
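As an illustration, not the project's actual test suite, a unit test for one small unit of the system might target a hypothetical AQ-10 scoring helper like the one below, with clearly defined inputs and expected results:

```python
# Illustrative unit test for a hypothetical AQ-10 scoring helper.
import unittest

def aq10_score(answers):
    """One point per question; total is the sum of the 0/1 answers."""
    return sum(answers)

class TestAq10Score(unittest.TestCase):
    def test_all_zero(self):
        self.assertEqual(aq10_score([0] * 10), 0)

    def test_mixed_answers(self):
        self.assertEqual(aq10_score([1, 0, 1, 1, 0, 0, 1, 0, 1, 1]), 6)

if __name__ == "__main__":
    unittest.main(argv=["aq10_tests"], exit=False)
```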
Functional tests provide systematic demonstrations that functions tested are available
as specified by the business and technical requirements, system documentation, and user
manuals.
Functional testing is centered on the following items:
4.2.4 System Testing
System testing ensures that the entire integrated software system meets requirements.
It tests a configuration to ensure known and predictable results. An example of system testing
is the configuration oriented system integration test. System testing is based on process
descriptions and flows, emphasizing pre-driven process links and integration points.
White box testing is testing in which the software tester has knowledge of the inner workings, structure, and language of the software, or at least its purpose. It is used to test areas that cannot be reached from a black box level.
Black box testing is testing the software without any knowledge of the inner workings, structure, or language of the module being tested. Black box tests, like most other kinds of tests, must be written from a definitive source document, such as a specification or requirements document. The software under test is treated as a black box: you cannot "see" into it. The test provides inputs and responds to outputs without considering how the software works.
Quality Assurance
• Correctness
The extent to which the program meets system specifications and user objectives.
• Reliability
The degree to which the system performs its intended functions overtime.
• Efficiency
• Usability
• Maintainability
• Testability
• Portability
• Accuracy
Genetic Risk
Genetic factors are estimated to contribute 40 to 80 percent of ASD risk. The risk from gene variants, combined with environmental risk factors such as parental age, birth complications, and others that have not yet been identified, determines an individual's risk of developing this complex condition. Risk factors may include a sibling with autism, older parents, and certain genetic conditions such as Down, fragile X, and Rett syndromes.
Any system developed should be secured and protected against possible hazards. Security measures are provided to prevent unauthorized access to the database at various levels. An uninterrupted power supply should be provided so that power failures or voltage fluctuations do not erase the data in the files.
Password protection and simple procedures to prevent unauthorized access are provided to the users. The system allows the user to enter product management and order status entry only through the login utility; the user has to enter a user name and password.
A multi-layered security architecture comprising firewalls, filtering routers, encryption, and digital certification must be ensured in this project so that, in real time, order and payment details are protected from unauthorized access. The customer can access order status only by using his customer code and order number.
Implementation is the stage in the project where the theoretical design is turned into a working system. The most crucial part of this stage is achieving a successful new system and giving the user confidence that the new system will work efficiently and effectively.
Implementation Procedures
The implementation phase is less creative than system design. A system project may be dropped at any time prior to implementation, although this becomes more difficult once the project reaches the design phase.
The final report to the implementation phase includes procedural flowcharts, record
layouts, report layouts, and a workable plan for implementing the candidate system design
into an operational one. Conversion is one aspect of implementation.
Several procedures are unique to the conversion phase. They include the following:
First, frame selection is performed: frames with a sufficient number of blocks are selected. Next, only some predetermined low-frequency DCT coefficients are permitted to hide data. The average energy of the block is then expected to be greater than a predetermined threshold. In the final stage, the energy of each coefficient is compared against another threshold.
The unselected blocks are labeled as erasures and are not processed. Each selected block contains a variable number of coefficients, which are used to embed and decode a single message bit by employing a multi-dimensional form of FZDH that uses a cubic lattice as its base quantizer.
User Manual
User Training
User training is designed to prepare the user for testing and accepting the system.
They are:
• User Manual.
• Help Screens.
• Training Demonstration.
1) User Manual
The summary of important functions about the system and software can be provided
as a document to the user.
2) Help Screens
This feature is now available in every software package, especially when used with a menu. The user selects the "Help" option from the menu, and the system accesses the necessary description or information for user reference.
3) Training Demonstration
Maintenance is expensive. One way to reduce maintenance costs is through maintenance management and software modification audits.
Corrective Maintenance
Perfective Maintenance
Preventive Maintenance
5. CONCLUSION AND FUTURE ENHANCEMENT
5.1. Conclusion
This research provides a prediction model developed to predict autism traits. Using the AQ-10 dataset, the proposed model can predict autism with 92.89%, 96.20%, 100.00%, and 79.14% accuracy for the Decision Tree, Random Forest, AdaBoost, and SVM algorithms respectively. Comparing all four supervised machine learning algorithms, AdaBoost and Random Forest are the most efficient for predicting ASD. This result shows better performance than other existing approaches to screening for autism. Moreover, the proposed model can predict autism traits for age groups below 3 years, a feature many other existing approaches lack.
A user-friendly Web application has been developed for end users based on the proposed prediction model, so that individuals can easily use the application to predict autism traits.
5.2. Future Enhancement
The world of computers is not static; it is always subject to change, and today's technology may be outdated the very next day. To keep abreast of technological improvements, the system needs refinement, so it will be improved with further enhancements whenever the user needs an additional feature.
The outcome of this research provides an effective and efficient approach to detecting autism traits for age groups 3 years and below. Since diagnosing autism traits is a costly and lengthy process, it is often delayed by the difficulty of detecting autism in toddlers. With the help of an autism screening application, an individual can be guided at an early stage, preventing the situation from getting worse and reducing the costs associated with delayed diagnosis.
APPENDICES
A) System Flow Diagram
(Flowchart: Start → Input Data (Autism Spectrum Data Set) → Pre-Processing → Data Partition (split data into training data and test data) → Model Building)
B) Sample Coding
data = pd.read_csv(r'csv_result-Autism_Data.csv')
n_records = len(data.index)
n_asd_yes = len(data[data['Class/ASD'] == 'YES'])
n_asd_no = len(data[data['Class/ASD'] == 'NO'])
yes_percent = float(n_asd_yes) / n_records * 100
data.dropna(inplace=True)
# Mapping tables documenting the intended binary encodings
# (the LabelEncoder loop below is what is actually applied)
gender_n = {"m": 1, "f": 0}
jundice_n = {"yes": 1, "no": 0}
austim_n = {"yes": 1, "no": 0}
used_app_before_n = {"yes": 1, "no": 0}
result_n = {"YES": 1, "NO": 0}
# Encode every column into numeric labels
for column in data.columns:
    le = LabelEncoder()
    data[column] = le.fit_transform(data[column])
features = ['A1_Score', 'A2_Score', 'A3_Score', 'A4_Score', 'A5_Score', 'A6_Score',
            'A7_Score', 'A8_Score', 'A9_Score', 'A10_Score', 'ethnicity',
            'contry_of_res', 'relation']
predicted = ['Class/ASD']
X = data[features].values
y = data[predicted].values
split_test_size = 0.20
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=split_test_size, random_state=0)
# Set up a KNN classifier and fit the model
knn = KNeighborsClassifier()
knn.fit(X_train, y_train.ravel())
from PIL import Image, ImageTk
top = Tk()
# Set the geometry attribute to change the root window's size
top.geometry("700x700")  # make the app window 700x700
top.resizable(0, 0) # Don't allow resizing in the x or y direction
top.title('Autism Spectrum Disorder Classification Tool')
top.option_add("*Button.Background", "grey")
top.option_add("*Button.Foreground", "black")
photo = ImageTk.PhotoImage(Image.open(r'ribbon.png'))
logo = Label(top, image=photo)
logo.pack()
label_pos_x = 30
label_pos_y = 50
entry_pos_x = 200
entry_pos_y = 50
e1 = Entry(top)
e1.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e2 = Entry(top)
e2.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e3 = Entry(top)
e3.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e4 = Entry(top)
e4.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e5 = Entry(top)
e5.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e6 = Entry(top)
e6.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e7 = Entry(top)
e7.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e8 = Entry(top)
e8.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e9 = Entry(top)
e9.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e10 = Entry(top)
e10.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e11 = Entry(top)
e11.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e12 = Entry(top)
e12.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
e13 = Entry(top)
e13.place(x=entry_pos_x, y=entry_pos_y)
entry_pos_y += 20
entryText = StringVar()
prediction_entry = Entry(top, textvariable=entryText, width=60)
prediction_entry.place(x=200, y=460)
entry_pos_y += 20
entryAccuracy = StringVar()
accuracy_entry = Entry(top, textvariable=entryAccuracy, width=60)
accuracy_entry.place(x=200, y=460+20)
entry_pos_y += 20
label_pos_x = 400
label_pos_y = 20
label_pos_x = 100
label_pos_y = 460
acc= []
acc1 = []
total_accuracy = {}
def accuracy(model):
    pred = model.predict(X_test)
    pred = (pred > 0.5)
    accu = metrics.accuracy_score(y_test, pred)
    errors = abs(pred - y_test)
    print('Model Performance')
    print("\nAccuracy Of the Model: ", accu)
    entryAccuracy.set('Accuracy Of the Model: ' + str(accu))
    print("\nAverage Error: {:0.2f} degrees.".format(np.mean(errors)))
    total_accuracy[str(model).split('(')[0]] = accu
    model_test = model.predict(X_test)
    # Confusion matrix: rows are true classes, columns are predictions
    cm = confusion_matrix(y_test, model_test)
    total1 = sum(sum(cm))
    sensitivity1 = cm[1, 1] / (cm[1, 0] + cm[1, 1])   # true positive rate
    specificity1 = cm[0, 0] / (cm[0, 0] + cm[0, 1])   # true negative rate (was missing)
    print('Sensitivity Of the Model: ', sensitivity1, '\n')
    print('Specificity Of the Model: ', specificity1, '\n')
    acc.append([accu, sensitivity1, specificity1])
def classify_using_knn():
    A1_Score = float(e1.get())
    A2_Score = float(e2.get())
    A3_Score = float(e3.get())
    A4_Score = float(e4.get())
    A5_Score = float(e5.get())
    A6_Score = float(e6.get())
    A7_Score = float(e7.get())
    A8_Score = float(e8.get())
    A9_Score = float(e9.get())
    A10_Score = float(e10.get())
    A11_Score = float(e11.get())
    A12_Score = float(e12.get())
    A13_Score = float(e13.get())
    best_grid = KNeighborsClassifier(n_neighbors=14, metric='hamming')
    best_grid.fit(X_train, y_train.ravel())
    # Predict from the values entered on the screen (this call was missing)
    prediction = best_grid.predict([[A1_Score, A2_Score, A3_Score, A4_Score,
                                     A5_Score, A6_Score, A7_Score, A8_Score,
                                     A9_Score, A10_Score, A11_Score, A12_Score,
                                     A13_Score]])
    if prediction[0] == 0:
        entryText.set('Not pre-diagnosed with ASD')
    else:
        entryText.set('Pre-diagnosed with ASD, seek clinician for further assistance')
def classify_using_rf():
    A1_Score = float(e1.get())
    A2_Score = float(e2.get())
    A3_Score = float(e3.get())
    A4_Score = float(e4.get())
    A5_Score = float(e5.get())
    A6_Score = float(e6.get())
    A7_Score = float(e7.get())
    A8_Score = float(e8.get())
    A9_Score = float(e9.get())
    A10_Score = float(e10.get())
    A11_Score = float(e11.get())
    A12_Score = float(e12.get())
    A13_Score = float(e13.get())
    # RF3 is assumed to be a RandomForestClassifier fitted earlier in the
    # full script; its training code is not shown in this sample
    prediction = RF3.predict([[A1_Score, A2_Score, A3_Score, A4_Score, A5_Score,
                               A6_Score, A7_Score, A8_Score, A9_Score, A10_Score,
                               A11_Score, A12_Score, A13_Score]])
    accuracy(RF3)
    if prediction[0] == 0:
        entryText.set('Not pre-diagnosed with ASD')
    else:
        entryText.set('Pre-diagnosed with ASD, seek clinician for further assistance')
def classify_using_lr():
    from sklearn.linear_model import LogisticRegression
    A1_Score = float(e1.get())
    A2_Score = float(e2.get())
    A3_Score = float(e3.get())
    A4_Score = float(e4.get())
    A5_Score = float(e5.get())
    A6_Score = float(e6.get())
    A7_Score = float(e7.get())
    A8_Score = float(e8.get())
    A9_Score = float(e9.get())
    A10_Score = float(e10.get())
    A11_Score = float(e11.get())
    A12_Score = float(e12.get())
    A13_Score = float(e13.get())
    # Create standardizer and scale the training data
    standardizer = StandardScaler()
    X_train_std = standardizer.fit_transform(X_train)
    # Create logistic regression models over a range of C values;
    # the model from the last value tried is used for the prediction
    for c in [0.00001, 0.0001, 0.001, 0.1, 1, 10]:
        logRegModel = LogisticRegression(C=c)
        logRegModel.fit(X_train_std, y_train.ravel())
    # Predict from the values entered on the screen (this call was missing)
    prediction = logRegModel.predict(standardizer.transform(
        [[A1_Score, A2_Score, A3_Score, A4_Score, A5_Score, A6_Score, A7_Score,
          A8_Score, A9_Score, A10_Score, A11_Score, A12_Score, A13_Score]]))
    if prediction[0] == 0:
        entryText.set('Not pre-diagnosed with ASD')
    else:
        entryText.set('Pre-diagnosed with ASD, seek clinician for further assistance')
Button(top, text="Classify using K-Nearest Neighbors", command=classify_using_knn,
       activebackground="pink",
       activeforeground="blue").place(x=200, y=420)
Button(top, text="Classify using Random Forest", command=classify_using_rf,
       activebackground="pink",
       activeforeground="blue").place(x=200, y=385)
Button(top, text="Classify using Logistic Regression", command=classify_using_lr,
       activebackground="pink",
       activeforeground="blue").place(x=200, y=350)
top.mainloop()
C) Sample Input Forms
Sample Screen
Login page
D) Sample Output Forms
Classifier performance
Model performance