Lecture 7- Score Matrix

ISC 211

Introduction to Bioinformatics
Lecture 7 – Scoring Matrix
Dr. Athira B
Asst. Professor, CSE
IIIT Kottayam
Scoring Matrix

• A scoring matrix contains a set of values quantifying the likelihood of one residue being substituted by another in a sequence alignment.
• A simple scheme:
  • A positive value or high score is given for a match.
  • A negative value or low score is given for a mismatch or a gap.
• This assignment is based on the assumption that the frequencies of mutation are equal for all bases.
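
The simple scheme above can be sketched in a few lines of Python. The function name `score_alignment` and the values +1, -1 and -2 are illustrative assumptions, not values from the lecture; the input is assumed to be two sequences that are already aligned, with `-` marking a gap.

```python
def score_alignment(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Sum the column scores of two already-aligned sequences.

    The default scores are illustrative assumptions only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            total += gap        # a gap in either sequence
        elif a == b:
            total += match      # identical residues
        else:
            total += mismatch   # every mismatch is penalised equally
    return total


# 4 matches, 1 mismatch, 1 gap: 4*1 + (-1) + (-2)
print(score_alignment("ACGT-A", "ACCTGA"))  # prints 1
```

Every mismatch receives the same penalty here, which is exactly the equal-mutation-frequency assumption stated above.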
Scoring matrix for aligning DNA sequences

• Transitions: substitutions in which a purine is replaced by another purine (A ↔ G) or a pyrimidine is replaced by another pyrimidine (C ↔ T).
• Transversions: substitutions between a purine and a pyrimidine (A or G ↔ C or T).
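
Because transitions are generally observed more often than transversions, a DNA scoring matrix can penalise the two classes differently. The sketch below builds such a 4 × 4 matrix; the names `substitution_type` and `build_dna_matrix` and the scores (+1, -1, -2) are hypothetical choices for illustration, not values from the lecture.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Illustrative scores (an assumption): transversions cost more than transitions.
SCORES = {"match": 1, "transition": -1, "transversion": -2}


def substitution_type(x, y):
    """Classify a pair of bases as match, transition or transversion."""
    if x == y:
        return "match"
    pair = {x, y}
    # Both purines or both pyrimidines means a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def build_dna_matrix(scores=SCORES):
    """Return a dict mapping every (base, base) pair to its score."""
    bases = "ACGT"
    return {(x, y): scores[substitution_type(x, y)] for x in bases for y in bases}


matrix = build_dna_matrix()
print(matrix[("A", "G")], matrix[("A", "C")])  # transition score, transversion score
```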
Scoring Matrices for Amino Acids

• An amino-acid scoring matrix is a 20 × 20 table indexed by amino acids, so that position (X, Y) in the table gives the score of aligning amino acid X with amino acid Y.
• Identity matrix: exact matches receive one score and non-exact matches a different score (1 on the diagonal, 0 everywhere else).
• Mutation data matrix: a scoring matrix compiled from observed protein mutation rates; some mutations are observed more often than others (PAM, BLOSUM).
• Physical properties matrix: amino acids with similar biophysical properties receive a high score.
• Genetic code matrix: amino acids are scored based on similarities in their coding triplets (codons).
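
As a minimal sketch of the first type, the identity matrix can be stored as a lookup table keyed by residue pairs. The dictionary layout and the name `identity_matrix` are assumptions made for illustration; the residues are the 20 standard one-letter codes.

```python
# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def identity_matrix(residues=AMINO_ACIDS):
    """Return a 20 x 20 table: 1 on the diagonal, 0 everywhere else."""
    return {(x, y): 1 if x == y else 0 for x in residues for y in residues}


identity = identity_matrix()
print(identity[("A", "A")], identity[("A", "G")])  # diagonal entry, off-diagonal entry
```

A mutation data matrix such as PAM or BLOSUM has the same shape; only the stored values differ.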
PAM Matrix

• Point Accepted Mutation (PAM) matrix: proposed by Margaret Dayhoff (known as the mother of bioinformatics).
• A PAM matrix is a matrix in which each row and each column represents one of the 20 amino acids. PAM matrices are regularly used as substitution matrices to score sequence alignments for proteins.
PAM Matrix computation

• Step-1: Construct a multiple sequence alignment (MSA) for the dataset (a collection of amino acid sequences).
• Step-2: Construct a phylogenetic tree from the MSA; this gives an idea of how the mutations are happening.

• Step-3: For each amino acid residue, compute the frequency of substitutions with other residues, $F_{ij}$.
  • $F_{GA}$ = {count of G → A and A → G} = 3
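
The counting in Step 3 can be sketched as follows. The lecture's tree is not reproduced in these notes, so the `events` list is hypothetical data, chosen only so that the G/A count comes out as 3; the helper names `count_substitutions` and `F` are likewise assumptions.

```python
from collections import Counter


def count_substitutions(events):
    """Count substitutions per unordered residue pair.

    `events` is a list of (ancestral, derived) residues read off the tree
    branches. G -> A and A -> G fall into the same bin.
    """
    counts = Counter()
    for old, new in events:
        if old != new:
            counts[frozenset((old, new))] += 1
    return counts


def F(counts, i, j):
    """Symmetric substitution count F_ij."""
    return counts[frozenset((i, j))]


# Hypothetical substitution events (not the lecture's actual tree).
events = [("G", "A"), ("A", "G"), ("G", "A"), ("A", "D"), ("C", "S"), ("L", "I")]
counts = count_substitutions(events)
print(F(counts, "G", "A"))  # prints 3
```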
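
• Step-4: Compute the relative mutability $m_j$ of each residue $j$:
  $m_j = \dfrac{\sum_i F_{ij}}{2 \times N \times f_j \times 100}$
  • $m_j = \dfrac{4}{2 \times 6 \times 0.159 \times 100} = 0.0209$
  • where
    • $\sum_i F_{ij} = 4$ is the number of substitutions involving residue $j$,
    • $N = 6$ is the total number of mutations in the tree,
    • $f_j = 0.159$ is the frequency of residue $j$ in the dataset.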

• Step-5: Compute the mutation probability $M_{ij}$:
  $M_{ij} = \dfrac{m_j \, F_{ij}}{\sum_i F_{ij}}$
  • $M_{GA} = \dfrac{0.0209 \times 3}{4} = 0.0156$
  • where $F_{ij} = 3$ and $\sum_i F_{ij} = 4$.
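
A short numerical check of Step 5, assuming the mutation probability is the relative mutability multiplied by the share of the residue's substitutions that go to the other residue. The function name and argument names are hypothetical.

```python
def mutation_probability(m_j, F_ij, total_F_j):
    """M_ij = m_j * F_ij / (sum over i of F_ij)."""
    return m_j * F_ij / total_F_j


# Values quoted in the slides: m = 0.0209, F_ij = 3, sum of F_ij = 4.
M_GA = mutation_probability(0.0209, 3, 4)
print(M_GA)  # about 0.0157, quoted as 0.0156 in the slides
```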

• Step-6: Each non-diagonal entry is calculated as a log-odds score:
  $R_{ij} = \log_{10}\left(\dfrac{M_{ij}}{f_i}\right)$
  • $R_{GA} = \log_{10}\left(\dfrac{0.0156}{0.1587}\right) = -1.01$
  • where $f_i = f_G$ = 10 G residues divided by 63 total residues = 0.1587.
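
The log-odds value in Step 6 can be reproduced as below. The slides write only "log"; base 10 is assumed here because it reproduces the quoted value of -1.01. The function name `log_odds` is hypothetical.

```python
import math


def log_odds(M_ij, f_i):
    """R_ij = log10(M_ij / f_i), with base 10 assumed."""
    return math.log10(M_ij / f_i)


f_G = 10 / 63                # 10 G residues out of 63 residues in total
R_GA = log_odds(0.0156, f_G)
print(round(R_GA, 2))        # prints -1.01
```

A negative score means the substitution is observed less often than expected by chance, given the background frequency of the residue.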

• Step-7: For diagonal entries, set $M_{jj} = 1 - m_j$ and repeat step 6.
• What is the probability that a residue of type j will be replaced by i in M?
• The answer can be obtained from $R_{ij}$ of the PAM-1 matrix.
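
A sketch of the diagonal case in Step 7, under one reading of the slide: the probability that a residue stays unchanged is 1 minus its relative mutability, and that probability is then converted to a log-odds score as in Step 6 (base-10 logarithm assumed). The function names are hypothetical, and the inputs 0.0209 and 0.159 are the values quoted in Step 4.

```python
import math


def diagonal_probability(m_j):
    """Probability that residue j is unchanged, assumed to be 1 - m_j."""
    return 1.0 - m_j


def diagonal_score(m_j, f_j):
    """Log-odds score for the diagonal entry, as in Step 6."""
    return math.log10(diagonal_probability(m_j) / f_j)


M_jj = diagonal_probability(0.0209)
R_jj = diagonal_score(0.0209, 0.159)
print(M_jj, round(R_jj, 2))
```

Under these assumptions the diagonal score is positive, since a residue is far more likely to be conserved than to be replaced.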
Reading Assignment

• Understand BLOSUM
• PAM-1, PAM-250, PAM-1000
