Fold Recognition (Threading) : Lecture-02

Protein threading, also known as fold recognition, is a structure prediction method used for proteins that have the same overall fold as proteins with known structures, but do not share significant sequence homology. It works by "threading" the amino acid sequence of the target protein into structurally similar protein templates to find the best fit and then models the target structure based on the alignment. Protein threading is based on the observations that there are a limited number of protein folds in nature and many new structures have similar folds to those already known. It differs from homology modeling which is used for proteins with homologous sequences and structures.

Uploaded by

Raksha Sandilya

Lecture-02

Course Instructor: Sri A.K.Singh, Guest Faculty

Deptt. of BS&L(SMCA,P&L)
(COMPUTER APPLICATION)
Course No.: Biotech-308, Credit Hours: 2+1, Biotech
Course title: (Computational Biology)

Fold recognition (threading)
Protein threading, also known as fold recognition, is a protein modeling method used for
proteins that have the same fold as proteins of known structure but have no homologous
protein of known structure. It differs from homology modeling in this respect: threading is used
for proteins with no homologous structure deposited in the Protein Data Bank (PDB), whereas
homology modeling is used for proteins that do have one. Threading works by using statistical
knowledge of the relationship between the structures deposited in the PDB and the sequence of the
protein one wishes to model.
The prediction is made by "threading" (i.e. placing, aligning) each amino acid in the target sequence
to a position in the template structure, and evaluating how well the target fits the template. After
the best-fit template is selected, the structural model of the sequence is built based on the
alignment with the chosen template. Protein threading is based on two basic observations: that the
number of different folds in nature is fairly small (approximately 1300); and that 90% of the new
structures submitted to the PDB in the past three years have similar structural folds to ones already
in the PDB.
Classification of proteins

The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive
description of the structural and evolutionary relationships between proteins of known structure. Proteins are
classified to reflect both structural and evolutionary relatedness. Many levels exist in the hierarchy,
but the principal levels are family, superfamily and fold, as described below.
Family (clear evolutionary relationship): Proteins clustered together into families are clearly
evolutionarily related. Generally, this means that pairwise residue identities between the proteins
are 30% and greater. However, in some cases similar functions and structures provide definitive
evidence of common descent in the absence of high sequence identity; for example,
many globins form a family though some members have sequence identities of only 15%.
Superfamily (probable common evolutionary origin): Proteins that have low sequence identities,
but whose structural and functional features suggest that a common evolutionary origin is
probable, are placed together in superfamilies. For example, actin, the ATPase domain of the heat
shock protein, and hexokinase together form a superfamily.

Fold (major structural similarity): Proteins are defined as having a common fold if they have the
same major secondary structures in the same arrangement and with the same topological
connections. Different proteins with the same fold often have peripheral elements of secondary
structure and turn regions that differ in size and conformation. In some cases, these differing
peripheral regions may comprise half the structure. Proteins placed together in the same fold
category may not have a common evolutionary origin: the structural similarities could arise just
from the physics and chemistry of proteins favoring certain packing arrangements and chain
topologies.

Method

A general paradigm of protein threading consists of the following four steps:

The construction of a structure template database: Select protein structures from the protein
structure databases as structural templates. This generally involves selecting protein structures
from databases such as PDB, FSSP, SCOP, or CATH, after removing protein structures with high
sequence similarities.

The design of the scoring function: Design a good scoring function to measure the fitness between
target sequences and templates based on the knowledge of the known relationships between the
structures and the sequences. A good scoring function should contain mutation potential,
environment fitness potential, pairwise potential, secondary structure compatibilities, and gap
penalties. The quality of the energy function is closely related to the prediction accuracy, especially
the alignment accuracy.

Threading alignment: Align the target sequence with each of the structure templates by optimizing
the designed scoring function. This step is one of the major tasks of every threading-based structure
prediction program whose scoring function takes the pairwise contact potential into account; when
that term is left out, a dynamic programming algorithm can solve the alignment.

Threading prediction: Select the threading alignment that is statistically most probable as the
threading prediction. Then construct a structure model for the target by placing the backbone
atoms of the target sequence at their aligned backbone positions of the selected structural template.
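
The four steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the scoring function of any real threading program: each template is reduced to a hypothetical string of environment labels ('B' for buried, 'E' for exposed), the `fitness` function and the gap penalty are invented for the example, and the pairwise contact potential is left out so that dynamic programming finds the optimal alignment. The template with the highest raw score is selected, which is a simplification of choosing the statistically most probable alignment.

```python
# Toy threading sketch: align a target sequence to templates described by a
# 1-D environment string, with no pairwise term, so dynamic programming is exact.

# Assumption for illustration: these residues count as hydrophobic.
HYDROPHOBIC = set("AVLIMFWC")


def fitness(residue, environment):
    """Illustrative environment fitness: +1 if a hydrophobic residue sits in a
    buried position or a polar residue in an exposed one, otherwise -1."""
    buried = environment == "B"
    hydrophobic = residue in HYDROPHOBIC
    return 1.0 if buried == hydrophobic else -1.0


def thread(target, template_env, gap=-2.0):
    """Global (Needleman-Wunsch style) alignment score of a target sequence
    against a template environment string, using a linear gap penalty."""
    n, m = len(target), len(template_env)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + fitness(target[i - 1], template_env[j - 1]),
                score[i - 1][j] + gap,  # target residue left unaligned
                score[i][j - 1] + gap,  # template position left empty
            )
    return score[n][m]


def best_template(target, library):
    """Score the target against every template and return the best name
    together with all scores."""
    scores = {name: thread(target, env) for name, env in library.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    library = {"t1": "BEBEB", "t2": "EEEEE"}  # hypothetical template library
    print(best_template("LKLKL", library))
```

A fuller scoring function would add the mutation, pairwise, and secondary structure terms listed in step two; the pairwise term is the one that breaks the simple recurrence used here.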

Comparison with homology modeling

Homology modeling and protein threading are both template-based methods, and there is no
rigorous boundary between them in terms of prediction techniques. Their targets differ, however.
Homology modeling is for targets that have homologous proteins of known structure (usually
from the same family), while protein threading is for targets for which only fold-level
homology is found. In other words, homology modeling is for "easier" targets and
protein threading is for "harder" targets.

Homology modeling treats the template in an alignment as a sequence, and only sequence
homology is used for prediction. Protein threading treats the template in an alignment as a
structure, and both sequence and structure information extracted from the alignment are used for
prediction. When there is no significant homology found, protein threading can make a prediction
based on the structure information. That also explains why protein threading may be more effective
than homology modeling in many cases.

In practice, when the sequence identity in a sequence-sequence alignment is low (i.e. <25%),
homology modeling may not produce a significant prediction. In this case, if there is distant
homology found for the target, protein threading can generate a good prediction.
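
The 25% rule of thumb above can be made concrete with a small helper. This is a minimal sketch that assumes the two sequences are already aligned and use '-' for gaps; percent identity is computed over columns where neither sequence has a gap, which is only one of several conventions in use. The function names and the `suggested_method` helper are hypothetical and exist only for this illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def suggested_method(identity, threshold=25.0):
    """Apply the rule of thumb from the text: below the threshold, try threading."""
    return "homology modeling" if identity >= threshold else "protein threading"


if __name__ == "__main__":
    identity = percent_identity("ACDE-G", "ACQEFG")
    print(identity, suggested_method(identity))
```

The threshold is a guideline rather than a hard boundary, in line with the remark above that the two methods are not rigorously separated.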

More about threading

Fold recognition methods can be broadly divided into two types: 1, those that derive a 1-D profile
for each structure in the fold library and align the target sequence to these profiles; and 2, those
that consider the full 3-D structure of the protein template. A simple example of a profile
representation would be to take each amino acid in the structure and simply label it according to
whether it is buried in the core of the protein or exposed on the surface. More elaborate profiles
might take into account the local secondary structure (e.g. whether the amino acid is part of
an alpha helix) or even evolutionary information (how conserved the amino acid is). In the 3-D
representation, the structure is modeled as a set of inter-atomic distances, i.e. the distances are
calculated between some or all of the atom pairs in the structure. This is a much richer and far more
flexible description of the structure, but is much harder to use in calculating an alignment. The
profile-based fold recognition approach was first described by Bowie, Lüthy and David Eisenberg in
1991. The term threading was first coined by David Jones, William R. Taylor and Janet Thornton in
1992, and originally referred specifically to the use of a full 3-D structure atomic representation of
the protein template in fold recognition. Today, the terms threading and fold recognition are
frequently (though somewhat incorrectly) used interchangeably.
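
The two template representations described above can be contrasted in code. This is a sketch under assumptions: one coordinate per residue stands in for the structure, and burial is approximated by counting neighbours within a cutoff, a crude substitute for a proper solvent accessibility calculation. The cutoff and neighbour threshold are illustrative values, not established parameters.

```python
import math


def burial_profile(coords, radius=10.0, min_neighbors=3):
    """1-D profile: label each residue 'B' (buried) or 'E' (exposed) according
    to how many other residues lie within `radius` of it."""
    labels = []
    for i, p in enumerate(coords):
        neighbors = sum(
            1 for j, q in enumerate(coords) if j != i and math.dist(p, q) <= radius
        )
        labels.append("B" if neighbors >= min_neighbors else "E")
    return "".join(labels)


def distance_matrix(coords):
    """3-D representation: all pairwise distances between residue coordinates."""
    return [[math.dist(p, q) for q in coords] for p in coords]


if __name__ == "__main__":
    # Four residues packed together and one far away (made-up coordinates).
    coords = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (100, 0, 0)]
    print(burial_profile(coords, radius=2.0))
    print(distance_matrix(coords)[0])
```

The profile is a string of length N that can be aligned to a sequence position by position, whereas the distance matrix holds N × N values in which every entry couples two positions. That coupling is what makes the 3-D representation richer but harder to align against.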
Fold recognition methods are widely used and effective because it is believed that there are a
strictly limited number of different protein folds in nature, mostly as a result of evolution but also
due to constraints imposed by the basic physics and chemistry of polypeptide chains. There is,
therefore, a good chance (currently 70-80%) that a protein which has a similar fold to the target
protein has already been studied by X-ray crystallography or nuclear magnetic resonance (NMR)
spectroscopy and can be found in the PDB. Currently there are nearly 1300 different protein folds
known, but new folds are still being discovered every year due in significant part to the
ongoing structural genomics projects.
Many different algorithms have been proposed for finding the correct threading of a sequence onto
a structure, though many make use of dynamic programming in some form. For full 3-D threading,
the problem of identifying the best alignment is very difficult (it is an NP-hard problem for some
models of threading). Researchers have made use of many combinatorial optimization methods,
such as conditional random fields, simulated annealing, branch and bound, and linear
programming, to arrive at heuristic solutions. It is interesting to compare threading
methods to methods which attempt to align two protein structures (protein structural alignment),
and indeed many of the same algorithms have been applied to both problems.
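
To see why a pairwise term makes the search hard, consider the simplest possible case: gapless threading, where the target is slid along the template without insertions or deletions. The sketch below enumerates every offset and evaluates a contact energy; the contact list, the `toy_pair_energy` function, and the sequences are all hypothetical. It is an exhaustive search over a deliberately tiny space, not one of the optimization methods named above.

```python
HYDROPHOBIC = set("AVLIMFWC")  # assumption for illustration


def toy_pair_energy(a, b):
    """Invented potential: favourable (-1) when two hydrophobic residues touch."""
    return -1.0 if a in HYDROPHOBIC and b in HYDROPHOBIC else 0.0


def contact_energy(target, offset, contacts, pair_energy):
    """Energy of a gapless threading in which template position k is filled by
    target[offset + k]; `contacts` lists template position pairs in contact."""
    total = 0.0
    for i, j in contacts:
        total += pair_energy(target[offset + i], target[offset + j])
    return total


def best_gapless_threading(target, template_length, contacts, pair_energy):
    """Try every offset and return the one with the lowest contact energy."""
    offsets = range(len(target) - template_length + 1)
    return min(offsets, key=lambda o: contact_energy(target, o, contacts, pair_energy))


if __name__ == "__main__":
    # A two-position template whose positions 0 and 1 are in contact.
    print(best_gapless_threading("KKLLKK", 2, [(0, 1)], toy_pair_energy))
```

Without gaps there are only as many threadings as offsets. Once gaps are allowed, the energy of each aligned position depends on which residues occupy its contact partners, so the problem no longer decomposes position by position, which is why heuristic methods are used.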
Protein threading software

• HHpred is a popular threading server which runs HHsearch, a widely used program for
remote homology detection based on pairwise comparison of hidden Markov models.
• RAPTOR is an integer programming based protein threading program. It has
been replaced by a new protein threading program, RaptorX, software for protein modeling and
analysis, which applies probabilistic graphical models and statistical inference to both single
template and multi-template based protein threading. RaptorX significantly outperforms RAPTOR
and is especially good at aligning proteins with sparse sequence profiles. The RaptorX server is free
to the public.
• Phyre is a popular threading server combining HHsearch with ab initio and multiple-
template modelling.
• MUSTER is a standard threading algorithm based on dynamic programming and sequence
profile-profile alignment. It also combines multiple structural resources to assist the sequence
profile alignment.
• SPARKS X performs probabilistic sequence-to-structure matching between the predicted one-
dimensional structural properties of the query and the corresponding native properties of templates.
• BioShell is a threading algorithm using an optimized profile-to-profile dynamic programming
algorithm combined with predicted secondary structure.
