Structural bioinformatics

The document discusses protein structure prediction, a key area in bioinformatics that determines the 3D structure of proteins from their amino acid sequences. It outlines various methods for structure prediction, including experimental techniques like X-ray crystallography and computational methods such as homology modeling and ab initio modeling. Additionally, it covers RNA structure prediction, protein-protein interactions, molecular docking, and applications of structural bioinformatics in drug discovery.

Uploaded by

alriyanmalik6

PROTEIN STRUCTURE PREDICTION

MAHRUKH ZAKIR
EMAIL: [email protected]
MIC-403
COMPUTATIONAL MICROBIOLOGY

STRUCTURAL BIOINFORMATICS

 Structural bioinformatics is a sub-discipline of bioinformatics that focuses on the analysis and prediction of the
three-dimensional (3D) structure of biological macromolecules, such as proteins, DNA, and RNA.
 It integrates computational techniques with structural biology to understand the relationship between structure and
function in biological molecules.

PROTEIN STRUCTURE PREDICTION

 Protein structure prediction is a critical area of bioinformatics and structural biology that focuses on
determining the three-dimensional structure of a protein from its amino acid sequence.
Levels of Protein Structure:
• Primary Structure: The sequence of amino acids.
• Secondary Structure: Local folding into structures like α-helices and β-
sheets, stabilized by hydrogen bonds.
• Tertiary Structure: The overall 3D shape of a single polypeptide chain.
• Quaternary Structure: The assembly of multiple polypeptide chains.

METHODS OF PROTEIN STRUCTURE PREDICTION

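Experimental Methods (these determine structures directly and provide the templates that computational prediction builds on):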
• X-ray Crystallography: Provides high-resolution structures but is time-consuming and requires protein
crystals.
• Nuclear Magnetic Resonance (NMR) Spectroscopy: Useful for small proteins, but limited for larger
ones.
• Cryo-Electron Microscopy (Cryo-EM): Effective for large complexes, with increasing resolution in
recent years.

COMPUTATIONAL METHODS

 Homology modeling, also known as comparative modeling, is a method used to predict the 3D structure of
a protein with an unknown structure by using the known structure of a homologous protein.
 The method relies on the fact that the 3D structure of proteins is often better conserved than their amino acid
sequence. Therefore, proteins with similar sequences are likely to have similar structures.

STEPS OF HOMOLOGY MODELING
Template Identification: Search for proteins with known structures (templates) that have high sequence similarity to the target
protein.
Tools like BLAST and HHpred are commonly used to identify suitable templates.
Alignment of Sequences: Align the target protein sequence with the selected template sequence to ensure conserved regions are
mapped correctly.
Tools: Clustal Omega, MUSCLE.
Model Building: Use the template structure to construct a 3D model of the target protein by copying conserved regions and
modeling variable regions (loops).
Tools:
• SWISS-MODEL: A web-based tool that automates the homology modeling process, providing both alignment and model
generation.
• MODELLER: A flexible and widely used tool for generating models based on alignment.
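A minimal sketch of how candidate templates might be compared once target-template alignments are available. The function names (percent_identity, rank_templates), the sequences, and the template labels are hypothetical and chosen only for illustration. Real searches with BLAST or HHpred also weigh E-values and alignment coverage, not identity alone.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # gap column: not counted as aligned
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


def rank_templates(alignments: dict) -> list:
    """Sort templates by identity to the target, highest first.

    `alignments` maps a template name to (aligned_target, aligned_template).
    """
    scored = [
        (name, percent_identity(target, template))
        for name, (target, template) in alignments.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical pairwise alignments of one target against two templates.
example_alignments = {
    "template_A": ("MKT-AYIAKQR", "MKTWAYIGKQR"),
    "template_B": ("MKTAYIAKQR", "MRSAYLAK-R"),
}
ranking = rank_templates(example_alignments)
```

The top entry of `ranking` would be the template carried forward to model building in this simplified view.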
CONT.

 Model Optimization: Optimize the generated model using energy minimization and molecular
dynamics simulations to improve its stability and accuracy.
Tools: GROMACS, AMBER.
 Model Validation/Evaluation: Evaluate the quality of the predicted model using statistical and
structural parameters.
Tools: PROCHECK (Ramachandran plots), Verify3D, ERRAT.
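A Ramachandran plot is built from the backbone dihedral angles phi and psi of each residue. The sketch below shows one common way to compute a dihedral angle from four atom positions. The function name and the coordinates are hypothetical, and validation tools such as PROCHECK do considerably more than this single calculation. For phi the four atoms are C(i-1), N(i), CA(i), C(i); for psi they are N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3) -> float:
    """Signed dihedral angle in degrees defined by four points (-180 to 180)."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the first and last points sit on the same side of the
# central bond (cis arrangement), so the angle is expected to be close to 0.
example_angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
```

Plotting the (phi, psi) pair of every residue and checking how many fall in sterically favoured regions is the idea behind the Ramachandran check.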

AB INITIO (DE NOVO) MODELING

 Ab initio, or de novo, modeling predicts the 3D structure of a protein entirely from its amino acid sequence,
without relying on template structures. This approach is based on the principles of physics and energy
minimization, assuming that the native conformation of a protein corresponds to the lowest free energy state.

1. Primary Sequence Input: Start with the amino acid sequence of the protein as the only input.
2. Conformational Sampling: The algorithm generates multiple possible 3D conformations (folding pathways) of the protein.
3. Scoring and Optimization: Each conformation is scored based on its energy state, and the model with the lowest energy is selected or refined further.
4. Refinement: Refine the predicted structure to make it more biologically plausible.
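A toy illustration of the sample, score, and select idea, using the 2D HP lattice model (H = hydrophobic, P = polar) instead of a real force field. Because the example chain is very short, every self-avoiding conformation is enumerated; this exhaustive search stands in for the stochastic sampling that real tools need for full-size proteins. All names below are hypothetical.

```python
from itertools import product

# Unit steps on a square lattice: Up, Down, Left, Right.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(moves):
    """Convert a sequence of moves into lattice coordinates.

    Returns None if the chain would visit the same lattice site twice.
    """
    coords = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        nxt = (coords[-1][0] + dx, coords[-1][1] + dy)
        if nxt in coords:
            return None  # not self-avoiding
        coords.append(nxt)
    return coords


def energy(sequence, coords):
    """-1 for each pair of H residues adjacent on the lattice but not in the chain."""
    total = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                distance = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
                if distance == 1:
                    total -= 1
    return total


def best_fold(sequence):
    """Enumerate all conformations and keep the first one with the lowest energy."""
    best = None
    for moves in product("UDLR", repeat=len(sequence) - 1):
        coords = fold(moves)
        if coords is None:
            continue
        e = energy(sequence, coords)
        if best is None or e < best[0]:
            best = (e, "".join(moves), coords)
    return best


lowest_energy, lowest_moves, lowest_coords = best_fold("HPPH")
```

The number of conformations grows exponentially with chain length, which is why practical methods sample rather than enumerate.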
TOOLS FOR AB INITIO MODELING

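 Rosetta: Rosetta is a software suite for protein structure prediction and design. Its ab initio protocol assembles short structural fragments and searches for low-energy conformations with Monte Carlo sampling.
• Applications:
• Protein folding prediction.
• Protein design and engineering.
 AlphaFold: Developed by DeepMind, AlphaFold is a deep-learning tool for protein structure prediction. It is grouped with ab initio tools here because it does not need a close structural template, although it learns from known structures and multiple sequence alignments rather than from physical energy functions alone.
• Applications:
• Predicting structures of previously unresolved proteins.
• Structural bioinformatics, drug discovery, and functional annotation.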

THREADING (FOLD RECOGNITION)

 Threading, also known as fold recognition, is a protein structure prediction method used when no
homologous templates are available.
 It works by matching the sequence of a target protein to a library of known structural folds.
 The approach is based on the principle that a limited number of protein folds exist in nature and that
many proteins with low sequence similarity still share similar structural features.

Steps in Threading (Fold Recognition)
1. Input the Query Sequence: Start with the amino acid sequence of the target protein.

2. Search Known Fold Libraries: Compare the sequence against a library of known protein structures (e.g., SCOP, CATH) to identify potential structural
folds.

3. Sequence-Structure Alignment: Align the query sequence to candidate folds using secondary structure predictions, residue interactions, and solvent
accessibility patterns.

4. Scoring and Ranking: Evaluate each alignment using energy-based or statistical scoring functions and rank the folds based on compatibility.

5. Model Building and Validation: Generate a 3D structural model for the best-matching fold, refine it, and validate its quality using tools like
Ramachandran plots or Verify3D.
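The scoring and ranking steps above can be sketched with a deliberately simple model. Each fold in a hypothetical library is reduced to a string of residue environments (B = buried, E = exposed), and a query scores well when hydrophobic residues land in buried positions and polar residues in exposed ones. The residue grouping, the +1/-1 scores, the gapless placement, and the fold names are all assumptions made for illustration; real threading programs use gapped alignments and much richer scoring functions.

```python
# Assumed grouping of hydrophobic residues for this toy score.
HYDROPHOBIC = set("AVILMFWC")


def position_score(residue: str, environment: str) -> int:
    """+1 when residue type and environment agree, -1 otherwise."""
    buried = environment == "B"
    hydrophobic = residue in HYDROPHOBIC
    return 1 if buried == hydrophobic else -1


def thread(sequence: str, environments: str):
    """Best gapless placement of the sequence on one fold.

    Returns (score, offset), or None if the fold is shorter than the sequence.
    """
    best = None
    for offset in range(len(environments) - len(sequence) + 1):
        score = sum(
            position_score(residue, environments[offset + i])
            for i, residue in enumerate(sequence)
        )
        if best is None or score > best[0]:
            best = (score, offset)
    return best


def rank_folds(sequence: str, library: dict) -> list:
    """Rank folds by their best threading score, highest first."""
    results = []
    for name, environments in library.items():
        hit = thread(sequence, environments)
        if hit is not None:
            results.append((name, hit[0], hit[1]))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical fold library and query sequence.
fold_library = {
    "fold_a": "EBEBEBBE",
    "fold_b": "BBBBBBBB",
    "fold_c": "EEBBEE",
}
fold_ranking = rank_folds("LKVEIL", fold_library)
```

The best-ranked fold would then serve as the scaffold for model building in step 5.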

RNA STRUCTURE PREDICTION

 RNA structure prediction involves predicting the three-dimensional structure of an RNA molecule based on its
nucleotide sequence. This is critical because the function of RNA is closely related to its shape.
 Accurate predictions can help in understanding various biological processes such as protein synthesis, regulation,
and catalysis.

CONT.
Secondary Structure Prediction
 Secondary structure refers to the local folding of RNA into regular patterns of base pairs (A pairs with U, G pairs with C, and G-U wobble pairs also occur). These structures typically include:
• Hairpins: A base-paired stem closed by a loop of unpaired bases, formed when a strand folds back on a complementary stretch of itself.
• Bulges: Unpaired bases on one strand of a stem that stick out from the structure.
• Internal loops: Unpaired regions on both strands within a base-paired stem.
To predict RNA secondary structures:
• RNAfold: Part of the ViennaRNA package, this tool uses a thermodynamic model to predict the most stable secondary structure of an RNA
from its sequence. The algorithm minimizes the free energy of the structure.
• Mfold: Similar to RNAfold, this tool predicts the RNA secondary structure by minimizing the free energy.
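The dynamic-programming idea behind these tools can be sketched with the Nussinov algorithm, which maximizes the number of base pairs. This is a simplified stand-in for free energy minimization: RNAfold and Mfold score structures with thermodynamic parameters, not pair counts. The function name and the minimum loop size of 3 are assumptions made for this illustration.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq: str, min_loop: int = 3):
    """Return (number of pairs, dot-bracket string) for a nested structure
    with the maximum number of base pairs."""
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = maximum number of pairs within seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback to recover one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


pair_count, dot_bracket = nussinov("GGGAAACCC")
```

In the dot-bracket string, matching parentheses mark paired bases and dots mark unpaired ones, so a hairpin appears as a run of "(" followed by dots and a run of ")".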
CONT.

Tertiary Structure Prediction


 Tertiary structure is the three-dimensional folding of the RNA molecule. It involves interactions between parts of the
molecule that are far apart in the sequence but close in the 3D structure. Tertiary structure prediction is much more
challenging than secondary structure prediction because it must account for complex spatial arrangements, including:
• Long-range interactions: Where distant parts of the RNA fold together.
• Interactions with ions, small molecules, or proteins: Such as the binding of metal ions or interaction with
ribosomes in translation.
Tools for tertiary structure prediction:
• RosettaRNA: Part of the Rosetta suite of tools, this helps model RNA tertiary structure by taking the secondary
structure as the starting point and then applying energy minimization to reach a low-energy 3D structure.
• RNA 3D: Another tool for predicting 3D RNA structure based on the secondary structure.

CONT.

RNA-Protein Interaction Prediction


 RNA molecules don't function alone; they often interact with proteins, and understanding how they interact is key
to understanding their function. Predicting these interactions requires knowing not only the RNA sequence and
structure but also how the RNA binds to specific proteins.
 Several resources support the prediction of RNA-protein interactions:
• PRIDB is a database of protein-RNA interfaces taken from known complex structures, and RPISeq is a sequence-based
prediction tool. Together they let researchers explore which proteins might bind to a specific RNA molecule.

PREDICTING PROTEIN-PROTEIN INTERACTIONS (PPIs)

 Predicting protein-protein interactions (PPIs) is a critical aspect of understanding cellular processes, as
proteins rarely function in isolation. Instead, they often work together in complex networks, influencing biological
functions like signaling, metabolism, and cell regulation.
 Predicting these interactions can help in identifying drug targets, understanding disease mechanisms, and
designing new therapies.

MOLECULAR DOCKING
 Molecular docking is a computational technique used to predict the preferred orientation of two molecules (typically a
ligand and a receptor protein) when they bind together to form a stable complex.

 It is widely applied in drug discovery, biological research, and understanding molecular interactions at the atomic
level.

 Docking simulations help predict how small molecules, such as potential drugs, interact with their target proteins, guiding
the design of new therapeutic compounds.

Ligand and Receptor:


 Ligand: A small molecule (like a drug, peptide, or ion) that binds to a target protein (receptor) in a specific binding site.
 Receptor: A macromolecule, typically a protein or nucleic acid, that interacts with the ligand, forming a complex. The binding of the
ligand to the receptor often induces a biological effect.
Docking Process: The goal of molecular docking is to predict the best possible binding mode of a ligand to a receptor by
evaluating different poses and orientations based on several factors, including interaction energy and binding affinity.

STEPS IN MOLECULAR DOCKING
Preparation of Ligand and Receptor:
o Ligand: The ligand molecule is prepared by optimizing its geometry, removing any water molecules carried over
from its source structure, and assigning proper atomic charges.
o Receptor: The receptor protein is prepared by removing water molecules, adding hydrogen atoms, and
assigning appropriate charges to the residues. It may also be pre-processed to identify and define active sites or
binding pockets.
Docking Simulation:
 The ligand is placed into the binding site of the receptor in different orientations and positions. The docking
software will generate many possible poses by translating and rotating the ligand within the receptor’s binding
pocket.
 The docking algorithm evaluates the ligand-receptor interactions (e.g., van der Waals forces, electrostatic
interactions, hydrogen bonds) and assigns a score based on the predicted strength of the interaction.
Scoring and Ranking:
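 The poses are then ranked based on their scores, using scoring functions that estimate binding affinity. The direction
of the ranking depends on the program: energy-based scores such as AutoDock's estimated binding energy are better when
more negative, while fitness scores such as those in GOLD are better when higher.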
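A toy sketch of the pose generation, scoring, and ranking loop described above. The ligand is treated as a rigid set of points that is rotated about the z axis and translated; each pose is scored with a simple 12-6 Lennard-Jones term summed over all ligand-receptor atom pairs. The function names, the parameter values (r_min, well_depth), and the coordinates are assumptions for illustration only, and real docking programs use far richer scoring functions and search strategies. Because this score is an energy, poses are sorted with the lowest value first; a program that reports a fitness score would sort the other way.

```python
import math
from itertools import product


def rotate_z(point, angle_deg):
    """Rotate a 3D point about the z axis."""
    a = math.radians(angle_deg)
    x, y, z = point
    return (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a), z)


def place(ligand, angle_deg, shift):
    """Rigid-body pose: rotate every ligand atom, then translate it."""
    return [
        tuple(c + s for c, s in zip(rotate_z(atom, angle_deg), shift))
        for atom in ligand
    ]


def lj_energy(pose, receptor, r_min=3.5, well_depth=0.2):
    """Sum of 12-6 Lennard-Jones terms over all atom pairs (toy units)."""
    total = 0.0
    for a in pose:
        for b in receptor:
            r = max(math.dist(a, b), 0.5)  # clamp to avoid division by zero
            ratio6 = (r_min / r) ** 6
            total += well_depth * (ratio6 ** 2 - 2.0 * ratio6)
    return total


def dock(ligand, receptor, angles, shifts):
    """Score every (angle, shift) pose and return them sorted, best first."""
    poses = []
    for angle, shift in product(angles, shifts):
        pose = place(ligand, angle, shift)
        poses.append((lj_energy(pose, receptor), angle, shift))
    return sorted(poses)


# Made-up system: a three-atom pocket and a two-atom rigid ligand.
receptor_atoms = [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0), (3.5, 6.0, 0.0)]
ligand_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
candidate_shifts = [(x, y, 0.0) for x in (2.0, 3.5, 5.0) for y in (1.0, 2.5, 4.0)]
ranked_poses = dock(ligand_atoms, receptor_atoms, (0, 90, 180, 270), candidate_shifts)
```

The first entry of `ranked_poses` is the pose this toy function considers most favourable, which is the pose that would be inspected in the visualization step.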
Analysis and Visualization:
 Tools like PyMOL, Chimera, or VMD are used for visualization.
COMMONLY USED MOLECULAR DOCKING SOFTWARE

AutoDock
• AutoDock is one of the most widely used molecular docking software programs.
• It is well-suited for small molecules and allows for both rigid and flexible docking.
• AutoDock also provides various scoring functions to evaluate the quality of docking results.

DOCK
DOCK is another widely used software that specializes in rigid-body docking and can handle flexible docking of ligands. It uses
grid-based energy evaluation, making it suitable for high-throughput docking simulations.

GOLD (Genetic Optimization for Ligand Docking)
GOLD is a flexible docking software that uses a genetic algorithm to optimize ligand poses. It is known for its accurate
scoring functions and its ability to dock ligands with large conformational flexibility.

APPLICATIONS OF STRUCTURAL BIOINFORMATICS IN DRUG DISCOVERY

 Virtual Screening: Identifying potential drug candidates by predicting how small molecules interact
with target proteins or RNA structures.
 Biomarker Discovery: Identifying key protein-RNA interactions or mutations that could serve as
biomarkers for diseases.
 Therapeutic Targeting: Designing molecules that can either block or activate protein-protein or
protein-small molecule interactions, aiding in the treatment of diseases.
