Structural bioinformatics
MAHRUKH ZAKIR
MIC-403
COMPUTATIONAL MICROBIOLOGY
STRUCTURAL BIOINFORMATICS
Structural bioinformatics is a sub-discipline of bioinformatics that focuses on the analysis and prediction of the
three-dimensional (3D) structure of biological macromolecules, such as proteins, DNA, and RNA.
It integrates computational techniques with structural biology to understand the relationship between structure and
function in biological molecules.
PROTEIN STRUCTURE PREDICTION
METHODS OF PROTEIN STRUCTURE PREDICTION
Experimental Methods:
• X-ray Crystallography: Provides high-resolution structures but is time-consuming and requires protein
crystals.
• Nuclear Magnetic Resonance (NMR) Spectroscopy: Useful for small proteins, but limited for larger
ones.
• Cryo-Electron Microscopy (Cryo-EM): Effective for large complexes, with increasing resolution in
recent years.
COMPUTATIONAL METHODS
Homology modeling, also known as comparative modeling, is a method used to predict the 3D structure of
a protein with an unknown structure by using the known structure of a homologous protein.
The method relies on the fact that the 3D structure of proteins is often better conserved than their amino acid
sequence. Therefore, proteins with similar sequences are likely to have similar structures.
STEPS OF HOMOLOGY MODELING
Template Identification: Search for proteins with known structures (templates) that have high sequence similarity to the target
protein.
Tools like BLAST and HHpred are commonly used to identify suitable templates.
Alignment of Sequences: Align the target protein sequence with the selected template sequence to ensure conserved regions are
mapped correctly.
Tools: Clustal Omega, MUSCLE.
Model Building: Use the template structure to construct a 3D model of the target protein by copying conserved regions and
modeling variable regions (loops).
Tools:
• SWISS-MODEL: A web-based tool that automates the homology modeling process, providing both alignment and model
generation.
• MODELLER: A flexible and widely used tool for generating models based on alignment.
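The template identification and alignment steps above usually come down to a sequence identity figure: the higher the identity between target and template, the more trustworthy the model tends to be (a commonly quoted rule of thumb is that reliability drops sharply below roughly 30% identity). The following is a minimal sketch of that calculation for an alignment that has already been produced. The two aligned strings are hypothetical, and the choice to count identity over non-gap columns only is an assumption, since tools differ in how they normalize.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # gap column: not counted in the denominator
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical pairwise alignment of a target against a candidate template.
target = "MKT-AYIAKQR"
template = "MKTLAYLAKQ-"
print(f"identity: {percent_identity(target, template):.1f}%")
```

In practice the alignment would come from a tool such as Clustal Omega or MUSCLE rather than being typed in by hand.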
Model Optimization: Refine the generated model using energy minimization and molecular
dynamics simulations to improve its stability and accuracy.
Tools: GROMACS, AMBER.
Model Validation/Evaluation: Assess the quality of the predicted model using statistical and
structural parameters.
Tools: PROCHECK (Ramachandran plots), Verify3D, ERRAT.
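A Ramachandran plot, as produced by PROCHECK, places each residue according to its backbone dihedral angles phi and psi. Phi is the dihedral defined by the atoms C(i-1), N(i), CA(i), C(i), and psi by N(i), CA(i), C(i), N(i+1). The sketch below shows how a single dihedral angle could be computed from four atomic positions. The coordinates are made-up test points rather than atoms from a real structure, and the function is an illustration of the geometry, not PROCHECK's implementation.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points, in (-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)  # central bond
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: the outer atoms are a quarter turn apart.
print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

Applying this to every residue of a model and checking how many (phi, psi) pairs fall in the favoured regions is the basic idea behind Ramachandran-based validation.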
AB INITIO (DE NOVO) MODELING
Ab initio, or de novo, modeling predicts the 3D structure of a protein entirely from its amino acid sequence,
without relying on template structures. This approach is based on the principles of physics and energy
minimization, assuming that the native conformation of a protein corresponds to the lowest free energy state.
Primary Sequence Input: Start with the amino acid sequence of the protein as the only input.
Conformational Sampling: The algorithm generates multiple possible 3D conformations (folding pathways) of the protein.
Scoring and Optimization: Each conformation is scored based on its energy state, and the model with the lowest energy is selected or refined further.
Refinement: Refine the predicted structure to make it more biologically plausible.
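The sample-then-score idea can be illustrated with a deliberately tiny toy: the 2D HP lattice model, in which each residue is either hydrophobic (H) or polar (P), the chain is a self-avoiding walk on a square grid, and each non-bonded H-H contact lowers the energy by one unit. This is a simplification swapped in for illustration, not how Rosetta or any production tool works. For short chains the sketch enumerates every conformation rather than sampling, which only stays feasible for a handful of residues. The example sequence is hypothetical.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def conformations(n):
    """Yield every self-avoiding walk of n residues on a 2D square lattice."""
    def extend(path):
        if len(path) == n:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in path:  # self-avoidance: one residue per lattice site
                path.append(nxt)
                yield from extend(path)
                path.pop()
    yield from extend([(0, 0)])


def energy(sequence, path):
    """-1 for each H-H pair adjacent on the lattice but not along the chain."""
    total = 0
    for i in range(len(path)):
        for j in range(i + 2, len(path)):
            if sequence[i] == "H" and sequence[j] == "H":
                (xi, yi), (xj, yj) = path[i], path[j]
                if abs(xi - xj) + abs(yi - yj) == 1:
                    total -= 1
    return total


def fold(sequence):
    """Return (lowest energy, conformation) found by exhaustive enumeration."""
    best = min(conformations(len(sequence)), key=lambda p: energy(sequence, p))
    return energy(sequence, best), best


best_energy, best_path = fold("HPHPPHHPHH")  # hypothetical HP sequence
print(best_energy, best_path)
```

Real ab initio methods replace the enumeration with stochastic sampling (for example Monte Carlo moves) and the contact count with a much richer energy function, but the select-the-lowest-energy logic is the same.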
TOOLS FOR AB INITIO MODELING
Rosetta: Rosetta is a powerful software suite for protein structure prediction and design.
• Applications:
• Protein folding prediction.
• Protein design and engineering.
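AlphaFold: Developed by DeepMind, AlphaFold is an AI-based (deep learning) tool for protein structure prediction. Unlike physics-based ab initio methods, it infers structure from patterns learned from known structures and from multiple sequence alignments.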
• Applications:
• Predicting structures of previously unresolved proteins.
• Structural bioinformatics, drug discovery, and functional annotation.
THREADING (FOLD RECOGNITION)
Threading, also known as fold recognition, is a protein structure prediction method used when no
template with detectable sequence similarity is available.
It works by matching the sequence of a target protein to a library of known structural folds.
The approach is based on the principle that a limited number of protein folds exist in nature and that
many proteins with low sequence similarity still share similar structural features.
Steps in Threading (Fold Recognition)
1. Input the Query Sequence: Start with the amino acid sequence of the target protein.
2. Search Known Fold Libraries: Compare the sequence against a library of known protein structures (e.g., SCOP, CATH) to identify potential structural
folds.
3. Sequence-Structure Alignment: Align the query sequence to candidate folds using secondary structure predictions, residue interactions, and solvent
accessibility patterns.
4. Scoring and Ranking: Evaluate each alignment using energy-based or statistical scoring functions and rank the folds based on compatibility.
5. Model Building and Validation: Generate a 3D structural model for the best-matching fold, refine it, and validate its quality using tools like
Ramachandran plots or Verify3D.
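Steps 3 and 4 above can be sketched with a toy scoring function. Here each fold in a made-up library is reduced to a string of residue environments (B for buried, E for exposed), and a query scores +1 wherever a hydrophobic residue sits in a buried position or a polar residue in an exposed one, and -1 otherwise. The fold names, the environment strings, the hydrophobic residue set, and the ungapped equal-length alignment are all assumptions made for illustration. Real threading programs use gapped alignment and far richer statistical potentials.

```python
HYDROPHOBIC = set("AVILMFWC")  # assumed hydrophobic set for this sketch


def threading_score(sequence, environment):
    """Score an ungapped placement of a sequence onto a fold's environment string."""
    if len(sequence) != len(environment):
        raise ValueError("sequence and environment string must have equal length")
    score = 0
    for residue, env in zip(sequence, environment):
        is_hydrophobic = residue in HYDROPHOBIC
        is_buried = env == "B"
        score += 1 if is_hydrophobic == is_buried else -1
    return score


def rank_folds(sequence, library):
    """Return (fold name, score) pairs sorted from most to least compatible."""
    scored = [(name, threading_score(sequence, env)) for name, env in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fold library and query sequence.
library = {
    "fold_A": "BBEEBBEE",
    "fold_B": "EEEEBBBB",
    "fold_C": "BEBEBEBE",
}
query = "VLKDIAES"
for name, score in rank_folds(query, library):
    print(name, score)
```

The top-ranked fold would then serve as the template for model building and validation in step 5.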
RNA STRUCTURE PREDICTION
RNA structure prediction involves predicting the three-dimensional structure of an RNA molecule based on its
nucleotide sequence. This is critical because the function of RNA is closely related to its shape.
Accurate predictions can help in understanding various biological processes such as protein synthesis, regulation,
and catalysis.
Secondary Structure Prediction
Secondary structure refers to the local folding of RNA that forms regular patterns of base pairs (nucleotide bases
like A, U, G, C pairing up). These structures typically include:
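• Hairpins: A base-paired stem closed at one end by a loop of unpaired bases, formed when a strand folds back on itself.
• Bulges: Unpaired bases on one strand of a stem that stick out from the structure.
• Internal loops: Unpaired regions on both strands within a base-paired stem.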
To predict RNA secondary structures:
• RNAfold: This tool uses a thermodynamic model to predict the most stable secondary structure of RNA based on
the sequence. The algorithm minimizes the free energy of the structure.
• Mfold: Similar to RNAfold, this tool predicts the RNA secondary structure by minimizing the free energy.
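To give a feel for the dynamic programming behind such tools, the sketch below implements the Nussinov algorithm, which maximizes the number of nested base pairs. This is a simpler stand-in for free energy minimization and is not the algorithm used by RNAfold or Mfold, which rely on experimentally derived thermodynamic parameters. The allowed pairs (Watson-Crick plus G-U wobble), the minimum hairpin loop size of 3, and the example sequence are assumptions.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=3):
    """Fill dp[i][j] = maximum number of nested base pairs within seq[i..j]."""
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp


def traceback(seq, dp, min_loop=3):
    """Recover one optimal structure in dot-bracket notation."""
    structure = ["."] * len(seq)
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # interval too short to contain a pair
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)


rna = "GGGAAAUCCC"  # hypothetical sequence expected to form a small hairpin
table = nussinov(rna)
print(rna)
print(traceback(rna, table))
```

Matching parentheses in the output mark paired bases and dots mark unpaired ones, the same notation RNAfold reports.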
PREDICTING PROTEIN-PROTEIN INTERACTIONS (PPIs)
MOLECULAR DOCKING
Molecular docking is a computational technique used to predict the preferred orientation of two molecules (typically a
ligand and a receptor protein) when they bind together to form a stable complex.
It is widely applied in drug discovery, biological research, and understanding molecular interactions at the atomic
level.
Docking simulations help predict how small molecules, such as potential drugs, interact with their target proteins, guiding
the design of new therapeutic compounds.
AutoDock
• AutoDock is one of the most widely used molecular docking software programs.
• It is well-suited for small molecules and allows for both rigid and flexible docking.
• AutoDock also provides various scoring functions to evaluate the quality of docking results.
Dock
• Dock is another widely used software that specializes in rigid-body docking and can handle flexible docking of ligands.
• It uses grid-based energy evaluation, making it suitable for high-throughput docking simulations.
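The core loop of a rigid-body docking search (generate a pose, score it, keep the best) can be sketched in a few lines. The example below tries ligand translations on a coarse grid and scores each pose with a Lennard-Jones style 12-6 term summed over all receptor-ligand atom pairs. It is not the scoring function or search strategy of AutoDock or Dock: rotations, ligand flexibility, electrostatics, and solvation are all ignored, and the atom coordinates and the `r_min` and `eps` parameters are hypothetical.

```python
import math
from itertools import product


def lj_energy(r, r_min=4.0, eps=1.0):
    """12-6 potential with its minimum of -eps at separation r_min."""
    ratio6 = (r_min / r) ** 6
    return eps * (ratio6 * ratio6 - 2.0 * ratio6)


def pose_energy(receptor, ligand):
    """Sum the pair potential over every receptor-ligand atom pair."""
    total = 0.0
    for r_atom in receptor:
        for l_atom in ligand:
            r = math.dist(r_atom, l_atom)
            if r < 1e-6:
                return math.inf  # overlapping atoms: reject the pose
            total += lj_energy(r)
    return total


def rigid_grid_dock(receptor, ligand, half_width=6, step=1.0):
    """Try every grid translation of the rigid ligand; return (energy, shift)."""
    best_energy, best_shift = math.inf, None
    offsets = range(-half_width, half_width + 1)
    for ix, iy, iz in product(offsets, repeat=3):
        shift = (ix * step, iy * step, iz * step)
        moved = [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in ligand]
        e = pose_energy(receptor, moved)
        if e < best_energy:
            best_energy, best_shift = e, shift
    return best_energy, best_shift


# Hypothetical three-atom receptor pocket and two-atom ligand.
receptor = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (2.0, 3.5, 0.0)]
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
print(rigid_grid_dock(receptor, ligand))
```

Production docking programs precompute the receptor's contribution on an energy grid so that each pose can be scored by lookup, which is what makes high-throughput screening practical.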
APPLICATIONS OF STRUCTURAL BIOINFORMATICS IN DRUG DISCOVERY
Virtual Screening: Identifying potential drug candidates by predicting how small molecules interact
with target proteins or RNA structures.
Biomarker Discovery: Identifying key protein-RNA interactions or mutations that could serve as
biomarkers for diseases.
Therapeutic Targeting: Designing molecules that can either block or activate protein-protein or
protein-small molecule interactions, aiding in the treatment of diseases.