Report Format Major 5 -4
A PROJECT REPORT
on
“HIGH-POTENCY MOLECULE PREDICTION
USING AI-DRIVEN COMPUTATIONAL MODEL
FOR DRUG DISCOVERY”
Submitted by
BACHELOR OF ENGINEERING
in
SAHYADRI
College of Engineering & Management
An Autonomous Institution
MANGALURU
2024 - 25
COMPUTER SCIENCE AND ENGINEERING
(ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING)
CERTIFICATE
This is to certify that the Project entitled “High-Potency Molecule Prediction
Using AI-Driven Computational Model for Drug Discovery” has been carried
out by Anurag R Poojary (4SF21AD008), B Sri Satya Shravan
(4SF21AD013), Rayson Minin Fernandes (4SF21AD043) and Shashank S K
(4SF21AD048), the bonafide students of Sahyadri College of Engineering &
Management in partial fulfillment of the requirements for the award of Bachelor of
Engineering in Artificial Intelligence and Data Science of Visvesvaraya
Technological University, Belagavi during the year 2024 - 25. It is certified that all
corrections/suggestions indicated for Internal Assessment have been incorporated in the
report deposited in the departmental library. The project report has been approved as
it satisfies the academic requirements in respect of project work prescribed for the said
degree.
External Viva-Voce
1. ......................................... .........................................
2. ......................................... .........................................
DECLARATION
We hereby declare that the entire work embodied in this Project Report titled
“High-Potency Molecule Prediction Using AI-Driven Computational Model
for Drug Discovery” has been carried out by us at Sahyadri College of Engineering
and Management, Mangaluru under the supervision of Mr. Sharathchandra N R for
the award of Bachelor of Engineering in Artificial Intelligence and Data
Science. This report has not been submitted to this or any other University.
The integration of machine learning (ML) techniques into drug discovery has
significantly transformed the pharmaceutical research landscape. ML methods are now
widely used to accelerate various stages of drug development, including target
identification, compound screening, and lead optimization. The ability of ML algorithms
to analyze complex datasets, such as protein-ligand interactions, chemical structures,
and biological responses, has enabled more efficient and accurate predictions in drug
design. Applications range from virtual screening and de novo drug design to drug
repurposing and toxicity prediction. Emerging areas, such as deep learning and
graph-based models, have further enhanced the predictive capabilities of ML,
facilitating the discovery of novel therapeutics. Additionally, advancements in
computational power, such as GPU-accelerated computing, have supported the
implementation of large-scale ML models, enabling the integration of diverse datasets
for a more holistic approach to drug discovery. This abstract highlights the
revolutionary role of ML in modern pharmaceutical research, emphasizing its potential
to address critical challenges in the development of effective and safe therapies.
Acknowledgement
It is with great satisfaction and euphoria that we are submitting the Project Report on
“High-Potency Molecule Prediction Using AI-Driven Computational Model
for Drug Discovery”. We have completed it as part of the curriculum of
Visvesvaraya Technological University, Belagavi, for the award of Bachelor of
Engineering in Artificial Intelligence and Data Science.
We are profoundly indebted to Dr. Duddela Sai Prashanth, Associate Professor and
Project Work Coordinator, Department of Computer Science and Engineering (AI&ML),
for their invaluable support and guidance.
We express our sincere gratitude to Dr. Pushpalatha K, Professor & Head of the
Department of CSE (AI&ML), for her invaluable support and guidance.
Finally, yet importantly, we express our heartfelt thanks to our family and friends for
their wishes and encouragement throughout the work.
Table of Contents
Abstract
Acknowledgement
Table of Contents
List of Figures
List of Tables
1 Introduction
1.1 Overview
1.2 Purpose
1.3 Scope
2 Literature Survey
3 Problem Formulation
3.1 Problem Statement
3.2 Problem Introduction
3.3 Objectives
5 System Design
5.1 System Architecture Diagram
5.2 Class Diagram
5.3 State Diagram
5.4 Use Case Diagram
6 Implementation
6.1 Algorithm
6.1.1 Random Forest
6.2 Flow Chart
6.3 Implementation Codes
6.3.1 Bioactivity data concising
6.3.2 Polymerase basic protein2 (PB2) Exploratory Data Analysis
6.3.3 Descriptor Dataset Preparation
6.3.4 Random Forest Regressor implementation
6.3.5 Streamlit Application for Predicting Potency of the molecule
Reference Inference
List of Figures
Chapter 1
Introduction
1.1 Overview
Bioinformatics serves as a cornerstone in biopharmacy, particularly in drug discovery
and development, by merging biological insights with computational innovation. It
enables researchers to analyze vast datasets, such as genomic, proteomic, and
metabolomic information, to identify novel drug targets and understand disease
mechanisms. Using advanced algorithms and machine learning, bioinformatics tools
High-Potency Molecule Prediction Using AI-Driven Computational Model Chapter 1
1.2 Purpose
Bioinformatics is essential in biopharmacy for drug discovery, as it combines biological
data with computational tools to identify and optimize new drug candidates. It helps find
potential drug targets, screen large compound libraries, design effective molecules, and
predict how these compounds will behave in the body. By analyzing clinical trial data and
predicting a drug’s safety and efficacy, bioinformatics makes the drug development process
faster, more efficient, and cost-effective. Additionally, the integration of bioinformatics
ensures a higher success rate in clinical trials by improving the accuracy of preclinical
predictions and identifying biomarkers for patient stratification.
Moreover, bioinformatics empowers researchers to address unmet medical needs by
exploring alternative therapeutic strategies, such as drug repurposing or combination
therapies. Its application extends beyond drug discovery to include vaccine design, gene
therapy development, and the study of antimicrobial resistance, making it a versatile tool
in modern biopharmacy.
1.3 Scope
The scope of this project involves using bioinformatics tools to enhance drug discovery and
development. It includes identifying potential drug targets through the analysis of genetic
and protein data, screening and designing drug candidates, and predicting their behavior
and safety in the body. The project will also focus on optimizing these compounds to
improve their effectiveness and minimize side effects. By integrating bioinformatics into
every stage of the drug development process, the project aims to accelerate the creation of
new, safe, and effective medications while reducing costs and improving patient outcomes.
Furthermore, the project will explore the use of machine learning algorithms and
advanced data visualization techniques to gain actionable insights from large datasets.
It will also evaluate the impact of structural modifications on drug efficacy and safety
profiles, providing a comprehensive understanding of compound behavior. The
application of bioinformatics in identifying biomarkers for precision medicine will also
be a critical aspect, ensuring that treatments are tailored to individual patients. This
comprehensive approach will not only enhance the efficiency of drug discovery but also
contribute to advancements in personalized medicine and global healthcare.
Literature Survey
Machine Learning in Drug Discovery: A Review. Dara et al. (2021) discussed the
transformative impact of ML on drug discovery processes, focusing on its use in target
identification, lead optimization, and clinical trial analysis. They emphasized how ML
models handle large-scale datasets efficiently, enabling better predictions and reducing
drug development costs. The study also highlights the potential of combining ML with
genomics data to identify personalized therapeutic targets for complex diseases.
Structure-Based Drug Discovery with Deep Learning. Özçelik et al. (2023) focused
on the application of deep learning in structure-based drug discovery. They discussed how
convolutional and recurrent neural networks are used to predict protein-ligand binding
affinities with high accuracy. The study also explored the use of transfer learning for
adapting pre-trained models to new drug targets, enhancing the efficiency of virtual
screening. Their findings underscore the potential of AI in automating complex tasks in
the drug discovery pipeline.
models design novel compounds with desired properties. The authors discussed challenges
in training generative models, including the need for large, high-quality datasets. Their
findings highlight the potential of generative ML to revolutionize the creation of new
drugs.
Problem Formulation
goal of this study is to harness the power of the ChEMBL database to predict and rank
promising drug candidates by analyzing bioactivity, structural properties, and
pharmacokinetic profiles. Through this approach, the aim is to enhance the efficiency of
the drug discovery pipeline and accelerate the development of innovative therapeutic
solutions to address unmet medical needs. This work seeks to integrate data-driven
methods with biological research to bring forth effective and impactful advancements in
drug development.
3.3 Objectives
• Data Mining and Bioactivity Analysis: Perform systematic data mining to identify
small molecules with significant bioactivity against selected targets associated with
specific diseases or biological processes.
4.1 Introduction
The software requirements for this project outline the functionalities, interfaces, and
system components essential for developing a bioinformatics system for drug discovery.
These requirements ensure that the system meets the needs of researchers and
pharmaceutical stakeholders by providing accurate, timely, and user-friendly
predictions. The requirements are divided into functional requirements, which define
the system’s specific behaviors, and non-functional requirements, which establish
performance and quality criteria.
4.1.1 Requirements
1. Data Collection
Bioactivity data is collected from the ChEMBL database by searching for the target
protein of the virus and extracting the relevant features.
2. Data Preprocessing
The collected data is cleaned and preprocessed to address missing values, rectify
inconsistencies, and standardize inputs, ensuring high-quality data for analysis.
The system implements and trains a Random Forest machine learning model to
predict drug-target interactions and optimize lead compounds.
4. User Interface
5. Prediction Output
The system generates concise and informative reports on predicted drug efficacy,
complemented by visualizations to support decision-making.
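As a rough illustration of the Data Preprocessing requirement, the sketch below drops incomplete records and removes duplicate structures. The record layout, the field name `standard_value`, the helper name `clean_records`, and the sample values are assumptions made for illustration; they are not taken from the project's dataset.

```python
def clean_records(records):
    """Drop incomplete or non-numeric records; keep the first record per SMILES."""
    seen = set()
    cleaned = []
    for rec in records:
        smiles = rec.get("canonical_smiles")
        value = rec.get("standard_value")
        # Skip records where a required field is missing or empty.
        if not smiles or value in (None, ""):
            continue
        try:
            value = float(value)
        except (TypeError, ValueError):
            continue  # activity value is not numeric
        if smiles in seen:
            continue  # duplicate structure
        seen.add(smiles)
        cleaned.append({**rec, "standard_value": value})
    return cleaned


# Illustrative records only (hypothetical values).
raw = [
    {"canonical_smiles": "CCO", "standard_value": "750.0"},
    {"canonical_smiles": "CCO", "standard_value": "800.0"},    # duplicate structure
    {"canonical_smiles": None, "standard_value": "12.0"},      # missing SMILES
    {"canonical_smiles": "c1ccccc1", "standard_value": ""},    # missing value
    {"canonical_smiles": "CC(=O)O", "standard_value": "n/a"},  # non-numeric value
    {"canonical_smiles": "CCN", "standard_value": "42"},
]
cleaned = clean_records(raw)
print([(r["canonical_smiles"], r["standard_value"]) for r in cleaned])
```

Keeping the first record per structure is only one possible policy; averaging repeated measurements would be an equally reasonable choice.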
4.2 Purpose
• Enhance the ability to predict drug efficacy and safety by analyzing biological data
and leveraging advanced machine learning techniques.
1. Researchers
Role: Scientists engaged in analyzing biological data to identify drug targets and evaluate
candidate compounds.
Technical Proficiency: Moderate to high, with familiarity in bioinformatics, data
analysis, and computational tools.
3. System Administrators
Role: IT professionals responsible for maintaining the system infrastructure and
ensuring data security and system reliability.
Technical Proficiency: High, with expertise in database management, server
operations, and deployment of bioinformatics software systems.
4. Data Scientists
Role: Experts in machine learning and data modeling tasked with optimizing predictive
algorithms for drug discovery.
Technical Proficiency: High, with skills in programming, machine learning,
bioinformatics data processing, and statistical analysis.
4.4 Interfaces
• User Interface (UI): An intuitive platform developed using Python and Streamlit to
input biological datasets and view predictions, including interactive visualizations
of drug-target interactions.
System Design
The process begins by collecting biological data, such as DNA, RNA, or protein
sequences, from databases or laboratory experiments. This data includes genomic
• InputFile: This class is responsible for handling the input data in the form of
SMILES strings and their associated chemical identifiers (chemblID). It includes
the following:
– Attributes:
– Methods:
– Attributes:
– Methods:
• FeatureSelection: This class is responsible for selecting relevant features from the
generated descriptors. It includes:
– Attributes:
– Methods:
– Attributes:
– Methods:
– Attributes:
– Methods:
• Result: This class manages the output and presentation of predicted pIC50 values.
It includes:
– Attributes:
– Methods:
transitions to “Prediction,” where users can input new data to receive predictions and
insights. Finally, the system enters the “Knowledge Discovery” state, generating
actionable insights and hypotheses for further research.
This workflow highlights the progression from raw data to actionable biological
insights, ensuring a robust and dynamic bioinformatics analysis pipeline.
Implementation
6.1 Algorithm
An algorithm is a step-by-step, systematic procedure or set of rules designed to perform a
specific task or solve a problem. Algorithms are fundamental to computer science and
are used to manipulate data, perform calculations, or automate reasoning tasks.
Random Forest is an ensemble learning algorithm that builds multiple decision trees and
combines their results to improve predictive accuracy. In this project it is used to model
the relationship between molecular descriptors and bioactivity, predicting the pIC50
value of a compound from its descriptors.
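The ensemble idea can be sketched in plain Python. This is a deliberately simplified stand-in, not the project's model: each "tree" is a depth-1 regression stump trained on a bootstrap sample and on one randomly chosen feature, and predictions are averaged. The function names and the toy descriptor and potency values are hypothetical.

```python
import random
from statistics import mean


def fit_stump(X, y, feature):
    """Fit a depth-1 regression tree on one feature (best split by squared error)."""
    best = None
    values = sorted({row[feature] for row in X})
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold between two distinct values
        left = [yi for row, yi in zip(X, y) if row[feature] <= t]
        right = [yi for row, yi in zip(X, y) if row[feature] > t]
        lm, rm = mean(left), mean(right)
        sse = sum((yi - lm) ** 2 for yi in left) + sum((yi - rm) ** 2 for yi in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # feature is constant in this sample: predict the mean
        m = mean(y)
        return (feature, float("inf"), m, m)
    _, t, lm, rm = best
    return (feature, t, lm, rm)


def fit_forest(X, y, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(n_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Average the predictions of all stumps."""
    return mean(lm if row[f] <= t else rm for f, t, lm, rm in forest)


# Toy data: two descriptor-like features and a potency-like target (made up).
X = [[150.0, 1.0], [200.0, 1.5], [250.0, 2.0], [300.0, 2.5], [350.0, 3.0], [400.0, 3.5]]
y = [4.8, 5.1, 5.6, 6.2, 6.9, 7.4]
forest = fit_forest(X, y)
print(predict(forest, [150.0, 1.0]), predict(forest, [400.0, 3.5]))
```

A full Random Forest grows deeper trees and samples a subset of features at every split; the averaging over bootstrap-trained trees shown here is the part the two share.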
2. Split dataset into training (X_train, y_train) and testing (X_test, y_test) sets
R² ← r2_score(y_test, y_pred)
8. end procedure
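The dataset split in the procedure above can be illustrated with a small standard-library sketch. The helper name `split_dataset`, the 80/20 ratio, and the seed are assumptions chosen for illustration; the project code imports scikit-learn's `train_test_split` for this step.

```python
import random


def split_dataset(X, y, test_fraction=0.2, seed=42):
    """Shuffle row indices reproducibly and split them into train and test parts."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(X) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Toy data: ten samples with one feature each.
X = [[float(i)] for i in range(10)]
y = [2.0 * i for i in range(10)]
X_train, X_test, y_train, y_test = split_dataset(X, y)
print(len(X_train), len(X_test))
```

Shuffling indices rather than the rows themselves keeps each feature vector paired with its target value.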
• Data Collection: Retrieve bioactivity data from the ChEMBL database, which
provides details about compound activities against biological targets.
• Prediction: Use the trained model to predict the activity of new compounds based
on their molecular descriptors.
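Because the model predicts pIC50, raw IC50 measurements need to be put on a negative logarithmic scale first. The sketch below uses the standard definition pIC50 = -log10(IC50 in mol/L) and assumes the input is given in nanomolar; the function name is hypothetical.

```python
import math


def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(x * 1e-9) = 9 - log10(x).
    return 9.0 - math.log10(ic50_nm)


for ic50 in (1.0, 1000.0, 10000.0):
    print(ic50, "nM ->", round(pic50_from_ic50_nm(ic50), 2))
```

A higher pIC50 means a more potent compound: a tenfold drop in IC50 raises pIC50 by one unit.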
! pip install chembl_webresource_client

df
# Clean and extract the longest SMILES for compounds with multiple representations
smiles = []
for i in df.canonical_smiles.tolist():
    cpd = str(i).split('.')
    cpd_longest = max(cpd, key=len)
    smiles.append(cpd_longest)
    baseData = np.arange(1, 1)
    i = 0
    for mol in moldata:
        desc_MolWt = Descriptors.MolWt(mol)
        desc_MolLogP = Descriptors.MolLogP(mol)
        desc_NumHDonors = Lipinski.NumHDonors(mol)
        desc_NumHAcceptors = Lipinski.NumHAcceptors(mol)

    columnNames = ["Molecular_Weight", "LogP", "Num_H_Donors", "Num_H_Acceptors"]
    descriptors = pd.DataFrame(data=baseData, columns=columnNames)
    return descriptors
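The four descriptors computed above are the inputs to Lipinski's rule of five, a common drug-likeness heuristic (molecular weight at most 500, LogP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The sketch below counts violations for already-computed descriptor values; the function name and the sample rows are hypothetical, and the report does not state that the project applies this filter.

```python
def lipinski_violations(mol_wt, log_p, h_donors, h_acceptors):
    """Count how many of the four rule-of-five limits a compound exceeds."""
    return sum([
        mol_wt > 500,       # molecular weight limit
        log_p > 5,          # lipophilicity limit
        h_donors > 5,       # hydrogen-bond donor limit
        h_acceptors > 10,   # hydrogen-bond acceptor limit
    ])


# Made-up descriptor rows in the column order used above:
# Molecular_Weight, LogP, Num_H_Donors, Num_H_Acceptors
rows = {
    "compound_a": (320.4, 2.1, 2, 5),
    "compound_b": (640.8, 6.3, 4, 12),
}
for name, row in rows.items():
    print(name, lipinski_violations(*row))
```

Compounds with at most one violation are usually treated as passing the rule.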
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

_06_bioactivity_data_3class_pEC50_pubchem_fp.csv')

# Calculate R^2 score
r2 = model.score(X_test, Y_test)
r2
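For a regressor, `model.score` returns the coefficient of determination, R² = 1 - SS_res / SS_tot. The sketch below computes the same quantity by hand on made-up values; the function name `r_squared` is hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1 - ss_res / ss_tot


# Made-up pIC50-like values for illustration.
y_true = [5.0, 6.0, 7.0]
print(r_squared(y_true, [5.0, 6.0, 7.0]))  # perfect predictions
print(r_squared(y_true, [6.0, 6.0, 6.0]))  # always predicting the mean
```

A score of 1 means perfect predictions, 0 means the model does no better than always predicting the mean, and negative values mean it does worse.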
import streamlit as st
import pandas as pd
from PIL import Image
import subprocess
import os
import base64
import pickle

# File download
def filedownload(df):
    csv = df.to_csv(index=False)
    b64 = base64.b64encode(csv.encode()).decode()  # strings <-> bytes conversions
    href = f'<a href="data:file/csv;base64,{b64}" download="prediction.csv">Download Predictions</a>'
    return href

# Model building
def build_model(input_data):
    # Reads in saved regression model; the context manager closes the file afterwards
    with open('acetylcholinesterase_model.pkl', 'rb') as f:
        load_model = pickle.load(f)
    predictions = load_model.predict(input_data)
    return predictions
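The download helper works by embedding the CSV text in a base64 data URI. The standard-library sketch below builds such a link without pandas and decodes it back to show that the round trip is lossless; the function name, column names, and sample row are hypothetical.

```python
import base64
import csv
import io


def csv_download_link(rows, header, filename="prediction.csv"):
    """Serialize rows to CSV text and wrap it in a base64 data-URI anchor tag."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    b64 = base64.b64encode(buf.getvalue().encode()).decode()
    return f'<a href="data:file/csv;base64,{b64}" download="{filename}">Download Predictions</a>'


link = csv_download_link([["CHEMBL_EXAMPLE_1", 6.5]], ["molecule_name", "pIC50"])
# Decode the payload again to confirm the CSV survives the encoding.
payload = link.split("base64,")[1].split('"')[0]
print(base64.b64decode(payload).decode())
```

Encoding to base64 keeps commas, quotes, and newlines in the CSV from breaking the HTML attribute.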
7.1 Outcomes
The final results of the drug discovery project for predicting pIC50 values demonstrate
the effectiveness of integrating molecular descriptors with advanced machine learning
techniques. Using Random Forest and the LazyPredict model-comparison library, the framework achieves
high prediction accuracy, showcasing its ability to identify compounds with desired
bioactivity. Key features such as molecular fingerprints and standardized descriptors
were leveraged to ensure robust predictions. These results highlight the potential of
utilizing cheminformatics and bioinformatics approaches to accelerate drug discovery for
target proteins, promoting efficiency and precision in pharmaceutical research.
Furthermore, the framework’s performance was validated against diverse datasets,
ensuring its adaptability and reliability across different chemical structures. This
versatility underscores the model’s potential as a valuable tool for researchers working
with compounds of varying complexity. By leveraging the predictive capabilities of
machine learning, the project significantly reduces reliance on time-consuming
experimental screening processes. This approach not only saves resources but also
enhances the prioritization of high-potential candidates for further investigation.
This project focuses on predicting pIC50 values for chemical compounds with potential
drug activity using a streamlined computational approach. By analyzing molecular
features derived from SMILES strings and processed via PaDEL-Descriptor, the tool
provides researchers with a robust resource for evaluating compound potency. The
The system accepts a text file as input, containing a list of SMILES strings along with
their corresponding ChEMBL IDs. The following steps outline the predictive workflow:
• Input File: Users provide a text file with the chemical structures represented as
SMILES and linked to their respective ChEMBL IDs.
• Machine Learning Model: The processed molecular descriptors are fed into a
pre-trained Random Forest model to predict pIC50 values for the input compounds.
Prediction Process: After uploading the input file, the system validates chemical
structures, generates molecular descriptors, and applies machine learning techniques to
predict the pIC50 values.
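The input-handling step can be sketched as follows. The sketch assumes one compound per line, with the SMILES string first and the ChEMBL ID second, separated by whitespace; this layout, the function name `parse_input`, and the placeholder IDs are assumptions, since the report does not specify the exact file format.

```python
def parse_input(text):
    """Parse lines of 'SMILES ID' into (smiles, chembl_id) tuples, skipping blank lines."""
    entries = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 'SMILES ID', got {line!r}")
        smiles, chembl_id = parts
        entries.append((smiles, chembl_id))
    return entries


# Placeholder IDs, not real ChEMBL records.
sample = "CCO CHEMBL_EXAMPLE_1\n\nc1ccccc1 CHEMBL_EXAMPLE_2\n"
print(parse_input(sample))
```

Rejecting malformed lines early gives the user a clear error before descriptor generation starts. Checking that each SMILES string is chemically valid is a separate step that would need a cheminformatics toolkit.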
7.1.3 Result
Project timeline, 2024 (months April to December). Tasks: Selection of Topic;
Literature Review; Synopsis Report and PPT Preparation; Experimenting with Potential
Methodologies; Presentation to Panel; Model Training and Selection; Model Testing;
Model Evaluation; Model Parameters Optimization; Validation; Preparation of Project
Report.
The following table maps the outcomes of the drug discovery project to their
corresponding objectives and provides measurements of achievement.
Achievement: The project successfully built a machine learning model that predicts
pIC50 values for chemical compounds using molecular descriptors generated from
SMILES strings. The Random Forest model demonstrated high predictive accuracy,
supporting reliable estimation of compound potency.
Achievement: The model was tested on extensive datasets, demonstrating its ability
to scale and maintain predictive accuracy. This scalability ensures applicability to
real-world drug discovery scenarios involving large chemical libraries.
Achievement: The model’s robustness was evaluated using external datasets with
varied chemical structures. Consistent performance across these datasets confirmed
the model’s generalizability.
Achievement: The project incorporates mechanisms for updating the model with
new data, enabling continuous learning and improvement over time.
Achievement: The project bridges the gap between cheminformatics and machine
learning, offering a cohesive framework for data processing, feature generation, and
prediction.
Achievement: The results showcase how AI-driven tools can complement traditional
methods, accelerating the early phases of the drug discovery pipeline.
Achievement: The virtual screening approach significantly reduces the need for
physical experiments, contributing to more sustainable and eco-friendly drug
discovery practices.
During the development of the drug discovery application for predicting pIC50 values
based on molecular descriptors derived from SMILES strings, several challenges, both
This project highlights the critical role of advanced computational tools and machine
learning in drug discovery, specifically in predicting pIC50 values based on molecular
descriptors derived from SMILES strings. By employing data-driven methodologies,
researchers can efficiently identify potential drug candidates, significantly reducing the
time and cost associated with traditional experimental methods. The integration of
cheminformatics with predictive modeling provides actionable insights into molecular
properties, enabling targeted optimization of compounds. This approach not only
accelerates the drug discovery pipeline but also enhances the precision of candidate
selection, contributing to more efficient and sustainable pharmaceutical research
practices. Overall, the project demonstrates the potential of AI-driven strategies in
addressing complex challenges in drug discovery, offering both scientific and economic
benefits.
The predictive framework developed in this study has proven to be a valuable tool
for prioritizing compounds with high bioactivity. Its success demonstrates the
importance of combining computational innovation with domain expertise in tackling
the multifaceted challenges of drug development. The project’s outcomes also
underscore the value of integrating cheminformatics techniques with modern machine
learning approaches to create robust and interpretable models. This approach has the
potential to complement experimental efforts, paving the way for breakthroughs in
pharmaceutical research and development.
To further enhance the accuracy and applicability of our predictive model, future
work will focus on addressing current limitations and expanding the project’s scope.
This includes developing a more robust dataset, refining feature selection, and integrating
advanced algorithms to improve prediction accuracy. The proposed future work includes:
By addressing these areas, the project aims to advance the field of computational
drug discovery, making predictive modeling tools more accurate, reliable, and accessible
to researchers worldwide. This will contribute to the discovery of safer and more effective
therapeutic agents, ultimately benefiting global healthcare. Furthermore, the insights
and methodologies developed through this project could inspire similar initiatives in
other domains, such as environmental chemistry, agrochemicals, and materials science,
showcasing the broad applicability of computational approaches in scientific research.
[1] Quazi, S., Fatima, Z. Role of Artificial Intelligence and Machine Learning in Drug
Discovery and Drug Repurposing. In IGI Global eBooks 2023, pp. 1394–1405.
https://doi.org/10.4018/979-8-3693-3026-5.ch062
[2] Shahab, M., Danial, M., Duan, X., Khan, T., Liang, C., Gao, H.,
Chen, M., Wang, D., Zheng, G. Machine Learning-based Drug Design for
Identification of Thymidylate Kinase Inhibitors as a Potential Anti-Mycobacterium
Tuberculosis. Journal of Biomolecular Structure and Dynamics 2023, 1–13.
https://doi.org/10.1080/07391102.2023.2216278
[3] Özçelik, R., Van Tilborg, D., Jiménez-Luna, J., Grisoni, F. Structure-
based Drug Discovery with Deep Learning. ChemBioChem 2023, 24(13).
https://doi.org/10.1002/cbic.202200776
[4] Husnain, A., Rasool, S., Saeed, A., Hussain, H. K. Revolutionizing Pharmaceutical
Research: Harnessing Machine Learning for a Paradigm Shift in Drug Discovery.
International Journal of Multidisciplinary Sciences and Arts 2023, 2(2), 149–157.
https://doi.org/10.47709/ijmdsa.v2i2.2897
[5] Siebenmorgen, T., Menezes, F., Benassou, S., Merdivan, E., Didi, K., Mourão, A.
S. D., Kitel, R., Liò, P., Kesselheim, S., Piraud, M., Theis, F. J., Sattler, M.,
Popowicz, G. M. MISATO: machine learning dataset of protein–ligand complexes for
structure-based drug discovery. Nature Computational Science 2024, 4(5), 367–378.
https://doi.org/10.1038/s43588-024-00627-2
[7] Pandey, M., Fernandez, M., Gentile, F., Isayev, O., Tropsha, A., Stern, A.
C., Cherkasov, A. The Transformational Role of GPU Computing and Deep
Learning in Drug Discovery. Nature Machine Intelligence 2022, 4(3), 211–221.
https://doi.org/10.1038/s42256-022-00463-x
[8] Akondi, V. S., Menon, V., Baudry, J., Whittle, J. Novel Big Data-Driven Machine
Learning Models for Drug Discovery Application. Molecules 2022, 27(3), 594.
https://doi.org/10.3390/molecules27030594
[9] Patel, V., Shah, M. Artificial Intelligence and Machine Learning in Drug
Discovery and Development. Intelligent Medicine 2022, 2(3), 134–140.
https://doi.org/10.1016/j.imed.2021.10.001
[10] Dara, S., Dhamercherla, S., Jadav, S. S., Babu, C. M., Ahsan, M. J. Machine
Learning in Drug Discovery: A Review. Artificial Intelligence Review 2021, 55(3),
1947–1999. https://doi.org/10.1007/s10462-021-10058-4
[11] Gaudelet, T., Day, B., Jamasb, A. R., Soman, J., Regep, C., Liu, G.,
Hayter, J. B. R., Vickers, R., Roberts, C., Tang, J., Roblin, D., Blundell,
T. L., Bronstein, M. M., Taylor-King, J. P. Utilizing graph machine learning
within drug discovery and development. Briefings in Bioinformatics 2021, 22(6).
https://doi.org/10.1093/bib/bbab159
[13] Patel, L., Shukla, T., Huang, X., Ussery, D. W., Wang, S. Machine
Learning Methods in Drug Discovery. Molecules 2020, 25(22), 5277.
https://doi.org/10.3390/molecules25225277
[14] Rajula, H. S. R., Verlato, G., Manchia, M., Antonucci, N., Fanos, V.
Comparison of Conventional Statistical Methods with Machine Learning in
Medicine: Diagnosis, Drug Development, and Treatment. Medicina 2020, 56(9), 455.
https://doi.org/10.3390/medicina56090455
[16] Rodrigues, T., Bernardes, G. J. Machine Learning for Target Discovery in
Drug Development. Current Opinion in Chemical Biology 2020, 56, 16–22.
https://doi.org/10.1016/j.cbpa.2019.10.003
[18] Scheeder, C., Heigwer, F., Boutros, M. Machine Learning and Image-Based
Profiling in Drug Discovery. Current Opinion in Systems Biology 2018, 10, 43–52.
https://doi.org/10.1016/j.coisb.2018.05.004
[19] Agarwal, S., Dugar, D., Sengupta, S. Ranking Chemical Structures for Drug
Discovery: A New Machine Learning Approach. Journal of Chemical Information
and Modeling 2010, 50(5), 716–731. https://doi.org/10.1021/ci9003865
[20] Burbidge, R., Trotter, M., Buxton, B., Holden, S. Drug Design by Machine Learning:
Support Vector Machines for Pharmaceutical Data Analysis. Computers & Chemistry
2001, 26(1), 5–14. https://doi.org/10.1016/s0097-8485(01)00094-8