Early sequence alignment
1. Types of alignment
• Global alignment:
§ Tries to align the entire sequence.
§ Aligns all letters of the query (reference) sequence with the target sequence.
§ Suitable for closely related sequences.
§ Needleman-Wunsch method.
• Local alignment:
§ Aligns the regions with the highest similarity.
§ Aligns a substring of the target with a substring of the query.
§ Suitable for more divergent sequences.
§ Smith-Waterman method.
2. Global alignment
2. Initialization step:
a. Starting from the upper-left cell (point 0), assign a reward or penalty to each position in the first
row. The penalties or rewards accumulate along the row.
b. Starting from the upper-left cell (point 0), assign a reward or penalty to each position in the first
column. The penalties or rewards accumulate down the column.
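The initialization can be sketched in a few lines of Python. This is a minimal sketch: it uses the gap penalty of -2 from these notes, the function name init_global_matrix and the sequences ATGCT and AGCT are illustrative choices only, and one sequence is placed along the columns and the other along the rows.

```python
GAP = -2  # gap penalty used in these notes


def init_global_matrix(seq_cols, seq_rows, gap=GAP):
    """Create the score matrix and fill only the first row and first column."""
    rows, cols = len(seq_rows) + 1, len(seq_cols) + 1
    matrix = [[0] * cols for _ in range(rows)]
    # First row: the penalty accumulates from left to right.
    for j in range(1, cols):
        matrix[0][j] = matrix[0][j - 1] + gap
    # First column: the penalty accumulates from top to bottom.
    for i in range(1, rows):
        matrix[i][0] = matrix[i - 1][0] + gap
    return matrix


m = init_global_matrix("ATGCT", "AGCT")
print(m[0])                   # [0, -2, -4, -6, -8, -10]
print([row[0] for row in m])  # [0, -2, -4, -6, -8]
```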
3. Matrix filling step:
a. Start with the first cell where the two sequences overlap (first letter of each sequence), regardless
of whether the letters are equal or not.
b. Calculate the left value: take the value from the left neighbor and add the gap penalty (-2), which
applies to every vertical or horizontal movement.
c. Calculate the up value: take the value from the upper neighbor and add the gap penalty (-2), which
applies to every vertical or horizontal movement.
d. Calculate the diagonal value: take the value from the diagonal neighbor and, as for every diagonal
movement, add a match reward if the letters are the same or a mismatch penalty (-1) if they are
different.
e. Take the maximum of the left, up, and diagonal values and assign it to the cell.
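The filling rule above can be written as a short Python sketch. The gap penalty (-2) and mismatch penalty (-1) come from these notes; the match reward of +1 is an assumption, since the notes do not state its value. The function name fill_global and the example sequences are illustrative only.

```python
MATCH = 1      # assumed value: the notes do not give the match reward
MISMATCH = -1  # mismatch penalty from the notes
GAP = -2       # gap penalty from the notes


def fill_global(seq_cols, seq_rows, match=MATCH, mismatch=MISMATCH, gap=GAP):
    """Return the filled global-alignment score matrix."""
    rows, cols = len(seq_rows) + 1, len(seq_cols) + 1
    score = [[0] * cols for _ in range(rows)]
    # Initialization: accumulated gap penalties in the first row and column.
    for j in range(1, cols):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, rows):
        score[i][0] = score[i - 1][0] + gap
    # Filling: each cell takes the best of its left, up, and diagonal candidates.
    for i in range(1, rows):
        for j in range(1, cols):
            left = score[i][j - 1] + gap
            up = score[i - 1][j] + gap
            step = match if seq_rows[i - 1] == seq_cols[j - 1] else mismatch
            diag = score[i - 1][j - 1] + step
            score[i][j] = max(left, up, diag)
    return score


for row in fill_global("ATGCT", "AGCT"):
    print(row)
# Last row printed: [-8, -5, -2, -3, -1, 2]
```

With these assumed scores, the bottom-right cell holds 2, which corresponds to four matches (+4) and one gap (-2).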
4. Traceback:
§ Begin the traceback at the bottom-right cell of the matrix, which holds the score of the full alignment,
and continue up to the upper-left corner, following back the arrows that lead to the starting point.
§ There is an easy way to do it:
o If the letters match, the traceback goes diagonally.
o If the letters do not match, the traceback goes towards the neighbor with the higher value
(diagonally, horizontally, or vertically).
a. Starting from the bottom-right corner: since there is a match (T=T), go diagonally.
b. From this position, since there is a match (C=C), go diagonally.
c. From this position, since there is a match (G=G), go diagonally.
d. From this position, since there is a mismatch (A≠T), go to the higher value in any direction
(in this case left, horizontally).
e. From this position, since there is a match (A=A), go diagonally.
To build the alignment, read the arrows, not the values:
• A diagonal arrow from the larger to the smaller value → match.
• A diagonal arrow from the smaller to the larger value, or between two equal values → mismatch.
• A horizontal or vertical arrow, whatever the values → gap.
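The traceback can be sketched in Python as follows. Instead of the shortcut based on the higher neighbor, this sketch recovers each arrow by checking which neighbor actually produced the cell's score, which is the information the arrows record. The match reward of +1 is an assumption (the notes give only the gap and mismatch penalties), and the function name needleman_wunsch and the example sequences are illustrative.

```python
MATCH, MISMATCH, GAP = 1, -1, -2  # MATCH = +1 is an assumed value


def needleman_wunsch(seq_cols, seq_rows):
    """Return (aligned seq_cols, aligned seq_rows, score) for a global alignment."""
    rows, cols = len(seq_rows) + 1, len(seq_cols) + 1
    score = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):
        score[0][j] = score[0][j - 1] + GAP
    for i in range(1, rows):
        score[i][0] = score[i - 1][0] + GAP
    for i in range(1, rows):
        for j in range(1, cols):
            step = MATCH if seq_rows[i - 1] == seq_cols[j - 1] else MISMATCH
            score[i][j] = max(score[i][j - 1] + GAP,
                              score[i - 1][j] + GAP,
                              score[i - 1][j - 1] + step)

    # Traceback: from the bottom-right cell back to the upper-left corner.
    i, j = rows - 1, cols - 1
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = MATCH if seq_rows[i - 1] == seq_cols[j - 1] else MISMATCH
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + step:
            # Diagonal arrow: match or mismatch, both letters are consumed.
            top.append(seq_cols[j - 1])
            bottom.append(seq_rows[i - 1])
            i, j = i - 1, j - 1
        elif j > 0 and score[i][j] == score[i][j - 1] + GAP:
            # Horizontal arrow: gap in the row sequence.
            top.append(seq_cols[j - 1])
            bottom.append("-")
            j -= 1
        else:
            # Vertical arrow: gap in the column sequence.
            top.append("-")
            bottom.append(seq_rows[i - 1])
            i -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[-1][-1]


print(needleman_wunsch("ATGCT", "AGCT"))  # ('ATGCT', 'A-GCT', 2)
```

For these example sequences the path is three diagonal moves (T, C, G), one horizontal move (a gap opposite T), and a final diagonal move (A), the same pattern as the walk-through above.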
4. Local alignment
• It is similar to the Needleman-Wunsch algorithm (global alignment), except that every negative value
obtained while building the matrix becomes 0.
2. Initialization:
a. Starting from the upper-left cell (point 0), assign a reward or penalty to each position in the first
row. The penalties or rewards accumulate along the row.
b. Starting from the upper-left cell (point 0), assign a reward or penalty to each position in the first
column. The penalties or rewards accumulate down the column.
c. Change every negative value to 0. With a gap penalty of -2, the whole first row and first column
therefore become 0.
3. Matrix filling step:
a. Start with the first cell where the two sequences overlap (first letter of each sequence), regardless
of whether the letters are equal or not.
b. Calculate the left value: take the value from the left neighbor and add the gap penalty (-2), which
applies to every vertical or horizontal movement.
c. Calculate the up value: take the value from the upper neighbor and add the gap penalty (-2), which
applies to every vertical or horizontal movement.
Value from up → -2
d. Calculate the diagonal value: take the value from the diagonal neighbor and, as for every diagonal
movement, add a match reward if the letters are the same or a mismatch penalty (-1) if they are
different.
e. Change any negative left, up, or diagonal value to 0.
f. Take the maximum of the left, up, and diagonal values and assign it to the cell.
g. Proceed to the next cell until the matrix is completely filled.
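The local filling rule differs from the global one only in the clipping to 0, as this Python sketch shows. The gap (-2) and mismatch (-1) penalties come from these notes; the match reward of +1 is an assumption, and the function name fill_local and the example sequences are illustrative.

```python
MATCH = 1      # assumed value: the notes do not give the match reward
MISMATCH = -1  # mismatch penalty from the notes
GAP = -2       # gap penalty from the notes


def fill_local(seq_cols, seq_rows, match=MATCH, mismatch=MISMATCH, gap=GAP):
    """Return the filled local-alignment score matrix (no negative values)."""
    rows, cols = len(seq_rows) + 1, len(seq_cols) + 1
    # The first row and column stay at 0: their negative penalties are clipped.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if seq_rows[i - 1] == seq_cols[j - 1] else mismatch
            # Each candidate is clipped to 0 before taking the maximum.
            left = max(0, score[i][j - 1] + gap)
            up = max(0, score[i - 1][j] + gap)
            diag = max(0, score[i - 1][j - 1] + step)
            score[i][j] = max(left, up, diag)
    return score


for row in fill_local("ATGCT", "AGCT"):
    print(row)
# Last row printed: [0, 0, 1, 0, 0, 3]
```

Clipping each candidate and then taking the maximum gives the same result as taking the maximum of 0 and the three unclipped candidates.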
4. Traceback:
• Start from the cell with the highest score, which may be the lower-right cell or any other cell of
the matrix, and sequentially draw vertical, horizontal, or diagonal arrows towards the neighbor cell
with the highest value.
• Once you reach a 0, stop: you have found the motif shared by these two sequences.
• You can repeat the process to find other matching motifs after a mismatch, starting from another
positive number and continuing until you reach another 0 → multiple local alignments.
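A minimal Python sketch of the local traceback, under the same assumptions as before (gap -2 and mismatch -1 from these notes, match reward +1 assumed). It returns a single local alignment, starting from the highest-scoring cell and stopping at the first 0; the function name smith_waterman and the example sequences are illustrative. Each arrow is recovered by checking which neighbor produced the cell's score.

```python
MATCH, MISMATCH, GAP = 1, -1, -2  # MATCH = +1 is an assumed value


def smith_waterman(seq_cols, seq_rows):
    """Return (aligned part of seq_cols, aligned part of seq_rows, score)."""
    rows, cols = len(seq_rows) + 1, len(seq_cols) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            step = MATCH if seq_rows[i - 1] == seq_cols[j - 1] else MISMATCH
            score[i][j] = max(0,
                              score[i][j - 1] + GAP,
                              score[i - 1][j] + GAP,
                              score[i - 1][j - 1] + step)
            # Remember the highest-scoring cell: the traceback starts there.
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    i, j = best_pos
    top, bottom = [], []
    # Follow the arrows until a 0 is reached.
    while i > 0 and j > 0 and score[i][j] > 0:
        step = MATCH if seq_rows[i - 1] == seq_cols[j - 1] else MISMATCH
        if score[i][j] == score[i - 1][j - 1] + step:
            top.append(seq_cols[j - 1])
            bottom.append(seq_rows[i - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i][j - 1] + GAP:
            top.append(seq_cols[j - 1])
            bottom.append("-")
            j -= 1
        else:
            top.append("-")
            bottom.append(seq_rows[i - 1])
            i -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), best


print(smith_waterman("ATGCT", "AGCT"))  # ('GCT', 'GCT', 3)
```

To collect several local alignments, the same traceback could be restarted from other positive cells that were not visited by the first path.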