Homology Modelling

Uploaded by

juhiyaadav
Protein structure prediction through homology modelling using SWISS-MODEL.

INTRODUCTION
Protein bioinformatics helps in understanding protein structure and function. Basic
protein bioinformatics involves:

• Homology search and the functions of homologs
• Prediction of protein secondary structure
• Identification of functionally important residues
• Homology modeling
• Understanding problems such as the consequences of disease-causing mutations and
the properties of ligand-binding sites
• Filling the gap between known sequences and known structures
• Protein engineering to alter the function of a protein
• Rational drug design

Protein structures may be organized into four levels of complexity:

Primary structure is the sequence of amino acid residues in the polypeptide chain.
Secondary structure refers to highly regular local sub-structures formed mainly through
hydrogen bonds between backbone atoms. The two main types of secondary structure are the
alpha helix and the beta strand.
Tertiary structure describes the packing of alpha-helices, beta-sheets and random coils with
respect to each other on the level of one whole polypeptide chain.
Quaternary structure exists only if more than one polypeptide chain is present in a protein
complex; it describes the spatial organization of the chains.
Proteins are frequently described as consisting of several structural units:

• A structural domain is an element of the protein’s overall structure that is
self-stabilizing and often folds independently of the rest of the protein chain.
• Structural and sequence motifs are short segments of protein three-dimensional
structure or amino acid sequence that are found in a large number of different
proteins. (www.bioinfo.org.cn)
HOMOLOGY SEARCH
Identifying homologs helps in extracting structural and functional information about a protein.
Protein sequence analysis provides greater specificity and less noise than nucleic acid analysis
when searching for similarities, because the two codes differ in information content: nucleic
acids use a 4-letter alphabet and proteins a 20-letter alphabet, and the genetic code is
degenerate (several codons encode the same amino acid).
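The effect of codon degeneracy can be illustrated with a minimal Python sketch. It translates two DNA fragments that differ at most nucleotide positions yet encode the same peptide. The codon table is a deliberately partial excerpt of the standard genetic code, and the sequences and function names are made up for this illustration.

```python
# Partial excerpt of the standard genetic code (DNA codons): leucine and serine only.
CODON_TABLE = {
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}

def translate(dna):
    """Translate a DNA string codon by codon using CODON_TABLE."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def fraction_identical(a, b):
    """Fraction of positions at which two equal-length strings match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Hypothetical example fragments.
seq_a = "CTTAGC"
seq_b = "TTGTCA"

print(translate(seq_a), translate(seq_b))
print(fraction_identical(seq_a, seq_b))
print(fraction_identical(translate(seq_a), translate(seq_b)))
```

The two fragments share only one of six nucleotides, while their translations are identical, which is why the relationship is easier to detect at the protein level.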
HOMOLOGY MODELING
Homology modeling is based on the assumption that two homologous proteins will share very
similar structures. Because a protein fold is more evolutionarily conserved than its amino acid
sequence, a target sequence can be modeled with reasonable accuracy on a very distantly related
template, provided that the relationship between target and template can be discerned through
sequence alignment. It has been suggested that the primary bottleneck in homology modeling
arises from difficulties in obtaining a good alignment rather than from errors in structure
prediction. Given a good alignment, homology modeling can be performed with good accuracy.
A number of tools are available for predicting the 3-D structure of a protein, such as (PS)2:
Protein Structure Prediction Server (http://ps2.life.nctu.edu.tw/), SWISS-MODEL
(http://swissmodel.expasy.org/) and I-TASSER (http://zhanglab.ccmb.med.umich.edu/I-TASSER/).
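Since model accuracy depends on the target-template alignment, it can be useful to check the sequence identity of an alignment by hand. The following is a minimal Python sketch, not the method used by any of the servers above. It assumes identity is counted over columns where neither sequence has a gap (tools differ in the denominator they use); the function name and example sequences are hypothetical.

```python
def percent_identity(target_aln, template_aln, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: the denominator is the number of gap-free columns;
    other tools may normalise by a different length.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    aligned = [(a, b) for a, b in zip(target_aln, template_aln)
               if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(a == b for a, b in aligned)
    return 100.0 * matches / len(aligned)

# Hypothetical aligned target and template fragments.
target = "MKT-AYIAK"
template = "MKTQAYLAK"
print(percent_identity(target, template))
```

Here eight columns are gap-free and seven of them match.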

Home page

Result
The SWISS-MODEL Homology Modelling Report offers a summary of all Models built in the
project.
Note: The report is accessible (i) per model via a drop down menu, next to the model in the
Models view or (ii) for all models in report.html in the downloaded file when choosing to
download the project by pressing the download button below the project title.
It is structured in the following sections:

• Model building Report: Contains the project name, project date and references. The target
sequence is in Table T1 of the Report.
• Results: Gives the version of the SWISS-MODEL template library and the PDB release. All
identified templates are listed in Table T2.
• Models: Models are listed sequentially, with each entry showing a picture of the model, a
link to the PDB file, the version of the modelling engine, the oligomeric state, the ligands
(if any), the global model quality estimate and the QMEAN score.
A graphical representation of the QMEAN score and its four terms, the local
quality estimate plot and the comparison with a non-redundant set of PDB structures are
also provided. For the template, a link to the template itself is provided together with the
following information: the title of the structure, the target sequence coverage, the
sequence identity to the target, the experimental method used to obtain the structure (and
the resolution, if applicable), the oligomeric state, the ligands (if any), the sequence
similarity to the target and the template search method used.
• Save Project Locally: Allows the project to be downloaded as a zip file.
The main folder contains the Model report (report.html), an images folder (banner for the
Report) and the model folder. Each model has its own subfolder.

GMQE
GMQE (Global Model Quality Estimation) is a quality estimation which combines properties
from the target-template alignment. The resulting GMQE score is expressed as a number
between 0 and 1, reflecting the expected accuracy of a model built with that alignment and
template. Higher numbers indicate higher reliability. Once a model is built, the GMQE ((1) in the
figure above) gets updated for this specific case by also taking into account the QMEAN score of
the obtained model in order to increase reliability of the quality estimation.

QMEAN
QMEAN (Benkert et al.) is a composite scoring function based on different geometrical
properties. It provides both global (i.e. for the entire structure) and local (i.e. per residue)
absolute quality estimates on the basis of one single model.
The QMEAN Z-score provides an estimate of the ‘degree of nativeness’ of the structural features
observed in the model and indicates whether the model is of comparable quality to experimental
structures. Higher QMEAN Z-scores indicate better agreement between the model structure and
experimental structures of similar size. Scores of -4.0 or below indicate models of very low
quality; this is also highlighted by a change of the "thumbs-up" symbol to a
"thumbs-down" symbol next to the score.
QMEAN consists of four individual terms, which are listed alongside the global QMEAN
quality score. The white area in the bar-plots (numerical values close to zero)
indicates that the property is similar to what is observed in experimental structures. Positive
values indicate that the model scores higher than experimental structures on average; negative
values indicate that the model scores lower than experimental structures on average. The
QMEAN Z-score itself is shown on top. The individual Z-scores compare the interaction
potential between Cβ atoms only, the interaction potential between all atoms, the solvation
potential and the torsion angle potential.
The “Local Quality” plot shows, for each residue of the model (reported on the x-axis), the
expected similarity to the native structure (y-axis). Typically, residues showing a score below 0.6
are expected to be of low quality. Different model chains are shown in different colours. If the
model is downloaded, the local score is reported in the B-factor column of the PDB file. The
local quality can also be visualized by choosing the colour scheme "QMEAN".
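Because the local score is stored in the B-factor column, low-scoring residues can be listed directly from a downloaded model. The following Python sketch assumes the standard fixed-column PDB layout (B-factor in columns 61-66) and reads one score per residue from the CA atom records. The helper make_ca_line and the three example residues are hypothetical and only serve to build test input.

```python
def low_quality_residues(pdb_text, cutoff=0.6):
    """Return (chain, residue number, residue name, score) for residues below cutoff.

    Assumption: the per-residue score is read from the B-factor field
    (columns 61-66) of each CA atom record.
    """
    flagged = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            score = float(line[60:66])
            if score < cutoff:
                flagged.append((line[21], int(line[22:26]), line[17:20], score))
    return flagged

def make_ca_line(serial, res, chain, num, score):
    """Hypothetical helper: build one fixed-column CA ATOM record for the example."""
    return ("ATOM".ljust(6) + f"{serial:5d}" + " " + "CA".center(4) + " "
            + f"{res:>3}" + " " + chain + f"{num:4d}" + " " * 4
            + f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}" + f"{1.0:6.2f}{score:6.2f}")

# Made-up scores for three residues of chain A.
example = "\n".join([
    make_ca_line(1, "MET", "A", 1, 0.82),
    make_ca_line(2, "LYS", "A", 2, 0.45),
    make_ca_line(3, "THR", "A", 3, 0.71),
])
print(low_quality_residues(example))
```

Only the second residue falls below the 0.6 cutoff mentioned above, so it is the only one reported.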
In the “Comparison” plot, model quality scores of individual models are expressed as 'Z-scores'
in comparison to scores obtained for high-resolution crystal structures. The x-axis shows the
length (in amino acids) of the proteins. The y-axis is the normalized QMEAN score. Every dot
represents one protein structure. The darkest dots are structures with a global QMEAN Z-score
(the same score as 2 and 3 in the figure above) between -1 and 1; structures with a |Z-score|
between 1 and 2 are grey, and those with a |Z-score| of more than 2 are light grey. A red
star represents the model.
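The Z-score thresholds described in this section can be written down as a short Python sketch. It is not the SWISS-MODEL implementation: the z_score function uses the generic definition (value minus reference mean, divided by the sample standard deviation), the reference values are invented, and the handling of values exactly on a bin boundary is an assumption.

```python
from statistics import mean, stdev

def z_score(value, reference):
    """Generic standard score of value relative to a reference sample."""
    return (value - mean(reference)) / stdev(reference)

def shade(z):
    """Dot shade in the comparison plot for a structure with Z-score z.

    Assumption: |z| exactly equal to 1 or 2 falls in the grey bin.
    """
    magnitude = abs(z)
    if magnitude < 1:
        return "dark"
    if magnitude <= 2:
        return "grey"
    return "light grey"

def is_very_low_quality(z):
    """Scores of -4.0 or below are flagged with the thumbs-down symbol."""
    return z <= -4.0

# Invented reference scores standing in for experimental structures.
reference_scores = [0.7, 0.8, 0.9]
z = z_score(0.5, reference_scores)
print(round(z, 2), shade(z), is_very_low_quality(z))
```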
After a predicted model is generated with no other refinements, the program PROCHECK is
used to evaluate the quality of this model based on the G-factor. The G-factor provides a
measure of how "normal", or alternatively how "unusual", a given stereochemical property is; a
low or negative G-factor indicates that the property corresponds to a low-probability
conformation. Finally, the predicted model is displayed by Chime and automatically sent to
the user. To use this server, go to the home page and paste the query protein sequence.
Validation of generated model
The quality of the generated 3-D model can be checked using the Ramachandran plot. A
Ramachandran plot is a way to visualize the backbone dihedral angles φ and ψ of the amino acid
residues in a protein structure.
For most residues, the favoured values of φ lie roughly between -60° and -150°, while the
favoured values of ψ fall in regions centered about -60° (α-helical) and +120° (β-strand).
Good models have most of the residues clustered tightly in the most-favored regions with very
few outliers.
Good but low-resolution models may have less pronounced clustering, but still have few
outliers.
Poor models have no clustering and many outliers.
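The φ and ψ values plotted in a Ramachandran plot are ordinary dihedral angles computed from four consecutive backbone atoms: C(i-1), N(i), CA(i), C(i) for φ and N(i), CA(i), C(i), N(i+1) for ψ. The following Python sketch shows one common way to compute a signed dihedral angle from four points (cis = 0°, trans = ±180°). The function names and test coordinates are chosen for illustration and do not come from any validation server.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Simple geometric checks with made-up coordinates.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis arrangement
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # quarter turn
```

Applying the function to the backbone atoms of every residue gives the (φ, ψ) pairs that are then compared with the favoured regions.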
The ProFunc server (http://www.ebi.ac.uk/thornton-srv/databases/profunc/index.html) is one of the
tools used for checking the generated model. The ProFunc server was developed to
help identify the likely biochemical function of a protein from its three-dimensional structure. It
uses both sequence- and structure-based methods to try to provide clues as to the protein's likely or
possible function.
To use this server, go to the home page and upload the query structure:

Home page
Result

Analysis table:
Model | PDB Id of the template | Organism | Description | Protein chain | Coverage | Seq. identity | RMSD with the template
Model 1 | | | | | | |
Model 2 | | | | | | |
Validation: https://www.doe-mbi.ucla.edu/verify3d/
https://prosa.services.came.sbg.ac.at/prosa.php
Model | Procheck | QMEANDisCo | ProQ | ProSA | QMEAN | Verify 3D
1 | | | | | |
2 | | | | | |
*Screenshots of Verify 3D, ProQ and ProSA
Exercises:

S. No | Accession number
1 | BAA23662.1
2 | AAD32855.1
3 | AAG24881.1
4 | ASM56427.1
5 | AAQ89896.1
6 | AAX61156.1
7 | ALP44177
8 | ALO20316
9 | ALO18797
10 | Q6JRS3
