Chemoinformatics Applications in Agrochemical Discovery
Chemical Space: Small Molecules in Organic Chemistry

            Stars        Small Molecules
Existing    10^22        10^7
Virtual     0            10^60 (?)
Mode        Real         Virtual
Access      Difficult    "Easy"
Registration of safer chemicals
Proportion of pesticide active ingredients that are considered to be safer (biological chemicals and reduced-risk conventional chemicals) has steadily increased over the last several years.
Source: EPA, 1999.
Plant biotechnology opens new markets/solutions
The development of the agrochemical in vivo screening
Overall Outline
1. Introduction
2. Molecular Representations
3. Chemical Data and Databases
4. Molecular Similarity
5. Chemical Reactions
6. Machine Learning and Other Predictive
Methods
7. Molecular Docking and Drug Discovery
What is Chemoinformatics?
• It encompasses the design, creation,
organisation, management, retrieval, analysis,
dissemination, visualization and use of chemical
information
• It is the mixing of information resources to
transform data into information and information
into knowledge, for the intended purpose of
making better decisions faster in the arena of
drug lead identification and optimization
What is Chemoinformatics?
• “the set of computer algorithms and tools
to store and analyse chemical data in the
context of drug discovery and design
projects”
• Chemoinformatics is the application of
informatics methods to solve chemical
problems
Resources
Books:
J. Gasteiger and T. Engel (Editors) (2003).
Chemoinformatics: A Textbook. Wiley.
A.R. Leach and V. J. Gillet (2005). An Introduction to
Chemoinformatics. Springer.
Journal:
Journal of Chemical Information and Modeling
Web:
http://cdb.ics.uci.edu
and many more………
History of Chemoinformatics
The first, and still the core, journal for the subject, the Journal of Chemical
Documentation, started in 1961 (the name changed to
the Journal of Chemical Information and Computer Sciences in 1975).
The first book appeared in 1971 (Lynch, Harrison, Town and Ash,
Computer Handling of Chemical Structure Information).
Substructure Searching
Searching Databases
3D Substructure Searching
Structure and applications of chemoinformatics
Database design and programming
Representation and searching of chemical structures
Structure, substructure & similarity searching in 2D & 3D
Markush and reaction searching
Representation and searching of biological databases
Chemoinformatics software
Data analysis techniques
Clustering;
Evolutionary algorithms;
Graph theory;
Neural networks;
Chemical information sources
Cheminformatics applications
Techniques used to design bioactive compounds
Molecular simulation and design
Drug discovery process; QSAR; Combi-chem; SBDD
Spectroscopy and crystallography in cheminformatics
Kinds of chemistry databases
• Small-molecule databases
– Databases of commercially available compounds (e.g. ACD,
http://www.mdl.com/products/experiment/available_chem_dir/index.jsp)
– Proprietary chemical structure databases
– Literature databases
– Patent databases
– Small project-specific databases
• Protein databases
– Public, online databases (e.g. PDB, http://www.pdb.org)
– Proprietary and project-specific databases
Software Companies
Accelrys - Large chemoinformatics company
ACD/Labs - Analytical informatics & predictions
BCI - 2D fingerprinting, clustering toolkits & software
Bioreason - HTS data analysis software
Cambridgesoft - 2D drawing tools & E-notebooks
CAS - Produce Scifinder Scholar searching software
ChemAxon - Java based toolkits and software
Daylight - 2D representation & searching software
Leadscope - 2D structure and property tools
Lion Bioscience - Produce LeadNavigator
MDL - Large chemoinformatics company
Openeye - Fast 3D docking, structure generation, toolkits
Quantum Pharmaceuticals - Prediction, docking, screening
Sage Informatics - ChemTK 2D analysis software
Tripos - Large chemoinformatics company
Journals & Magazines
Acetaminophen
[Structure: acetaminophen, heavy atoms numbered 1 to 11]
SMILES Representation
• Aliphatic atoms: capital letters
• Aromatic atoms: small letters
• Rings: by giving a number
• Double bonds: "=" sign
• Parentheses: branching in the molecule
c1c(O)ccc(NC(=O)C)c1
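The SMILES conventions given for acetaminophen (capital letters for aliphatic atoms, small letters for aromatic atoms, digits for rings, "=" for double bonds, parentheses for branches) can be illustrated with a short sketch. This is only a character-by-character labeller written for this slide, not a full SMILES parser: the function name `classify_smiles` is hypothetical, and the sketch assumes single-letter element symbols, so it would mislabel strings containing Cl, Br or bracket atoms.

```python
from collections import Counter


def classify_smiles(smiles):
    """Label each character of a simple SMILES string using the slide's rules.

    Assumption: every atom is a single-letter symbol (no Cl, Br, [Na+], ...).
    """
    labels = []
    for ch in smiles:
        if ch.isupper():
            labels.append((ch, "aliphatic atom"))
        elif ch.islower():
            labels.append((ch, "aromatic atom"))
        elif ch.isdigit():
            labels.append((ch, "ring closure"))
        elif ch == "=":
            labels.append((ch, "double bond"))
        elif ch in "()":
            labels.append((ch, "branch"))
        else:
            labels.append((ch, "other"))
    return labels


acetaminophen = "c1c(O)ccc(NC(=O)C)c1"
summary = Counter(kind for _, kind in classify_smiles(acetaminophen))
for kind, count in sorted(summary.items()):
    print(f"{kind}: {count}")
```

For the acetaminophen string, the aromatic and aliphatic atom counts add up to the 11 heavy atoms of the molecule.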
• X-ray crystallography
• NMR spectroscopy
DRAWING AND DEPICTING 2D STRUCTURES
Web-based drawing tools
JME (http://www.molinspiration.com/cgi-bin/properties) is a clean, simple Java drawing tool.
Draw your structure and click on the smiley face to show the SMILES.
Marvin Sketch is a Java applet that allows you to draw structures, and export them as
SMILES, MDL MOL files or others.
• Additive schemes
[Figure: Searchable structure databases. Labels: chemical(s) of concern; chemical-specific data; structural analogue; property analogue; biological or mechanistic analogue]
Data mining Structure activity relationships
Chemometrics
[Figure: neural network with input, hidden and output layers]
Computer-Assisted Structure Elucidation (CASE)
1. Substructure searching
2. Similarity searching
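Similarity searching can be sketched in a few lines. The slides do not name a similarity measure, so the example below assumes the Tanimoto coefficient, a common choice for fingerprint comparison. Fingerprints are modelled as plain Python sets of feature identifiers, and the function names, molecule names and feature values are all hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared features divided by all features present."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    union = fp_a | fp_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / len(union)


def similarity_search(query_fp, database, threshold=0.5):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in database.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: each integer stands for one structural feature.
database = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2},
    "mol_c": {7, 8},
}
for name, score in similarity_search({1, 2, 3}, database):
    print(name, round(score, 2))
```

Substructure searching, by contrast, asks a yes/no question (is the query graph contained in the target?), so it yields a hit list with no ranking score.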
Applications of Chemoinformatics
1. Chemical Information
3. Bioactive molecules
• analysis of high-throughput data
4. Organic Chemistry
5. Analytical Chemistry
Teaching Chemoinformatics
Toxicity Prediction for chemical Q
[Figure: flowchart. Labels: chemical Q; class assignment; class-based SAR model; global toxicity model; analogue search; supporting information; hypothesis generation; data collection; toxicity prediction; weight of evidence of toxicity; presentation]
Institutes are Offering Courses on Chemoinformatics
SAR Application
Minimize toxicity
FURAPIOLE ANALOGUES
[Structure: furapiole analogue with variable substituent R]
SESAMOL ETHERS
[Structures: sesamol ethers with variable substituent R]
log SF = 0.153 D2 + 0.240 D1 - 1.711 σI - 0.429 RM + 0.070 L - 0.384
n s r F
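A regression of this form is applied by substituting a compound's descriptor values into the equation. The sketch below does this for the log SF equation: the coefficients are copied from the slide, while the dictionary keys, the function name `predict_log_sf` and the example descriptor values are hypothetical and chosen only for illustration (the slide does not define the descriptors or give data).

```python
# Coefficients copied from the log SF equation on the slide.
COEFFICIENTS = {
    "D2": 0.153,
    "D1": 0.240,
    "sigma_I": -1.711,
    "RM": -0.429,
    "L": 0.070,
}
INTERCEPT = -0.384


def predict_log_sf(descriptors):
    """Evaluate the linear QSAR equation for one compound."""
    return INTERCEPT + sum(
        coefficient * descriptors[name]
        for name, coefficient in COEFFICIENTS.items()
    )


# Hypothetical descriptor values, not taken from the source.
example_compound = {"D2": 1.0, "D1": 0.0, "sigma_I": 0.1, "RM": 0.5, "L": 4.0}
print(round(predict_log_sf(example_compound), 4))
```

Because the model is linear, each term shows directly how a descriptor shifts the prediction: here a larger σI or RM lowers log SF, while a larger L raises it.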
Pollination Control System
Male sterility

Eq.  Equation                                                                r     s      F (p %)*
3    43.74Fp - 3.04ΣMR + 0.36MW - 5.63D - 0.71                               0.81  12.10  10.41 (0.01)
4    44.61Fp - 2.93ΣMR + 0.65MW - 5.78D + 8.02ΣEs - 56.94                    0.86  10.80  12.05 (0.00)
5    35.56Fp - 2.96ΣMR + 0.85MW - 4.94D + 10.36ΣEs - 10.00Σπ - 96.48         0.90  9.37   14.49 (0.00)
* p values (%)
J. Agri. Food Chem. 2003, 51, 992-998
Agrophore Group
[Structure: agrophore group; substituents F / Br / CF3 / CN]
QSAR equations for 2-pyridones analogues
[Structure: 2-pyridone analogue with substituent X]

Equations (Ms =) and statistics: n, s, r, F (p %)

Ms = -3.21ΣMR + 57.18Fp - 3.77Rp - 5.38D + 93.98lnMw - 459.06
    n = 26, s = 7.77, r = 0.91, F = 18.38 (0.00)
Ms = -3.43ΣMR + 38.60Fp - 4.79D + 210.64lnMw + 10.42ΣEs - 1113.06
    n = 26, s = 7.24, r = 0.92, F = 21.82 (0.00)
Ms = -3.00ΣMR + 49.50Fp - 7.87D + 211.67lnMw + 12.19ΣEs - 6.87ΣEs(m) - 1117.35
    s = 6.37, r = 0.94, F = 22.94
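The statistics reported with these equations are linked: for a multiple linear regression with an intercept, n compounds and k descriptors, F = (r²/k) / ((1 - r²)/(n - k - 1)). The sketch below is a rough consistency check and assumes that the reported r is the multiple correlation coefficient of such a regression; the function name `f_statistic` is hypothetical. Since r is rounded to two decimals on the slide, only approximate agreement with the reported F values should be expected.

```python
def f_statistic(r, n, k):
    """F ratio of a multiple linear regression with intercept.

    r: multiple correlation coefficient
    n: number of compounds
    k: number of descriptors in the equation
    """
    r_squared = r * r
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


# First two 2-pyridone equations: n = 26 compounds, k = 5 descriptors each.
print(round(f_statistic(0.91, 26, 5), 2))  # slide reports F = 18.38
print(round(f_statistic(0.92, 26, 5), 2))  # slide reports F = 21.82
```

Both computed values land within about one unit of the reported F, which is the level of agreement that rounding r to two decimals allows.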
[…]l organisms for target identifica[…]
[…]st systems for targets in UHTBS
UHTVS - Automated evaluation of activity of compounds
The virtual discovery cycle
Unique research platform – Network of complementary technologies to meet the challenges in compound discovery
Discovery of the target proteins of novel fungicides
De novo target discovery by functional genomics and the steps aiming to develop and perform high throughput biochemical tests
Gene expression profiling, a revolutionary tool in
herbicide discovery
The principle of Gene Expression Profiling.