03 Backtrace For Computing Alignments 5-55

Uploaded by

idhitappu

Knowing the edit distance between two strings is important, but it turns out not to be sufficient. We often need something more, which is the alignment between the two strings. We want to know which symbol in string x corresponds to which symbol in string y, and this is going to be important for any application we have of edit distance, from spell checking to machine translation, even in computational biology.
The way we compute this alignment is we keep a backtrace. A backtrace is simply a pointer, stored when we enter each cell in the matrix, that tells us where we came from. When we reach the end, the upper right corner of our matrix, we can use that pointer and then trace back through all the pointers to read off the alignment. Let's see how this works in practice. Again, I've given you the equation for each cell in edit distance, and I'll start by putting in some of the values that we saw earlier.
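For reference, the per-cell equation can be reconstructed from the arithmetic worked in this lecture: insertions and deletions cost one, a substitution costs two, and a match costs zero. The symbols D, x_i, and y_j are notation assumed here for illustration, with D(i, j) standing for the distance between the first i symbols of x and the first j symbols of y:

```latex
D(i,0) = i, \qquad D(0,j) = j

D(i,j) = \min
\begin{cases}
D(i-1,\,j) + 1 & \text{(deletion)} \\
D(i,\,j-1) + 1 & \text{(insertion)} \\
D(i-1,\,j-1) +
  \begin{cases}
  2 & \text{if } x_i \neq y_j \\
  0 & \text{if } x_i = y_j
  \end{cases} & \text{(substitution or match)}
\end{cases}
```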
All right. So we can ask, how did we get to this value two? This two is the distance between the string I and the string E, and we got it by picking the minimum of three values. It's either the alignment between nothing and E, plus the insertion of an extra I, so that's a distance of one plus one, which is two; or zero plus two, which is two; or one plus one, which is two. So we had three different candidates, and if we ask which minimum path we came from, really they're all the same. We could have come from any of them.

That's going to be true for this value of three as well. We computed it as the minimum of two plus one, one plus two, or two plus one, so this could have come from here, here, or here. And similarly, that's going to be true for this cell too. I didn't work out the arithmetic for you, but you can work it out for yourself.

Here we have a difference. For the distance between I N T E and E, we could take what it cost us to convert I N T E to nothing and then add another insertion for E. That would be silly, because four plus one is five, and there's a cheaper way to get from I N T E to E: it costs us nothing to match this E to that E. So to our previous alignment between I N T and nothing we can add zero to three to get three. The minimum path for this three came from that three. So while in some cases a cell came from many places, in this case it only came from this previous three.
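The cell-by-cell arithmetic above can be checked with a small sketch. It assumes the costs used in the lecture's arithmetic (one for an insertion or deletion, two for a substitution, zero for a match). The function names `candidates` and `edit_distance_table`, and the labels on the three candidates, are hypothetical and chosen only for this illustration.

```python
def candidates(D, x, y, i, j, sub_cost=2):
    """Return the three values whose minimum becomes D[i][j]."""
    diag_cost = 0 if x[i - 1] == y[j - 1] else sub_cost
    return {
        "deletion": D[i - 1][j] + 1,          # drop x[i-1]
        "insertion": D[i][j - 1] + 1,         # add y[j-1]
        "diagonal": D[i - 1][j - 1] + diag_cost,  # substitute or match
    }


def edit_distance_table(x, y, sub_cost=2):
    """Fill the full distance table for x (rows) and y (columns)."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i  # i deletions turn x[:i] into the empty string
    for j in range(1, m + 1):
        D[0][j] = j  # j insertions turn the empty string into y[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(candidates(D, x, y, i, j, sub_cost).values())
    return D


if __name__ == "__main__":
    D = edit_distance_table("inte", "e")
    print(candidates(D, "inte", "e", 1, 1))  # cell for "i" versus "e"
    print(candidates(D, "inte", "e", 4, 1))  # cell for "inte" versus "e"
```

When all three candidates tie, the cell could have come from any neighbor; when only the diagonal candidate is minimal, the cell has a single possible origin.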
So we're going to do this for every cell in the array, and the result will look something like this, where we have, for every cell, every place it could have come from. You'll see that in a lot of cases any path could have worked, so this six could have come from any place. But, crucially, for this final alignment, this eight that tells us the final edit distance between intention and execution, our traceback tells us it came from the best alignment between intentio and executio, which came from the best alignment between intenti and executi, and so on. And so we can trace back this alignment and get ourselves an alignment that tells us that this N matches this N, this O matches this O, and so on, but maybe here we have an insertion rather than a clean lining up.

Computing the backtrace is very simple. We take the same minimum edit distance algorithm that we have seen, and here I have labeled the cases for you. When we are looking at a cell we're either deleting, inserting, or substituting, and we simply add pointers. In a case where we are inserting we point left, in a case where we are deleting we point down, and in a case where we are substituting we point diagonally. I showed you those arrows on the previous slide.

So we can look at this distance matrix and think about the paths from the origin here to the end of the matrix. Any non-decreasing path that goes from the origin to the point (N, M) corresponds to some alignment of the two sequences. An optimal alignment, then, is composed of optimal sub-alignments, and that's the idea that makes it possible to use dynamic programming for this task.
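As a rough sketch of the pointer bookkeeping described above, the following fills the distance matrix and, for each cell, records every direction that achieves the minimum. It assumes the same costs as the lecture's arithmetic (one, one, and two). The function name `min_edit_with_backtrace` and the string constants are hypothetical names for this illustration, and the matrix is stored with the origin at `D[0][0]` and the final cell at `D[n][m]`.

```python
LEFT, DOWN, DIAG = "LEFT", "DOWN", "DIAG"


def min_edit_with_backtrace(x, y, sub_cost=2):
    """Return (D, ptr): the distance matrix and, per cell, the set of
    directions (LEFT, DOWN, DIAG) the minimum could have come from."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[set() for _ in range(m + 1)] for _ in range(n + 1)]

    for i in range(1, n + 1):      # first column: only deletions are possible
        D[i][0] = i
        ptr[i][0] = {DOWN}
    for j in range(1, m + 1):      # first row: only insertions are possible
        D[0][j] = j
        ptr[0][j] = {LEFT}

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = {
                DOWN: D[i - 1][j] + 1,   # deletion
                LEFT: D[i][j - 1] + 1,   # insertion
                DIAG: D[i - 1][j - 1] + (0 if x[i - 1] == y[j - 1] else sub_cost),
            }
            D[i][j] = min(options.values())
            # Keep every direction that ties for the minimum.
            ptr[i][j] = {d for d, cost in options.items() if cost == D[i][j]}
    return D, ptr


if __name__ == "__main__":
    D, ptr = min_edit_with_backtrace("intention", "execution")
    print(D[9][9], ptr[9][9])
```

Storing a set per cell keeps all optimal paths available; storing a single direction per cell would be enough to recover one optimal alignment.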
So the result of our backtrace is two strings and the alignment between them. We'll know which things line up exactly, which things line up with substitutions, and where we should have insertions or deletions.
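A minimal sketch of reading the alignment off the pointers might look like the following. Several choices here are assumptions made for illustration and do not come from the lecture: the function name `align`, the use of `*` to mark a gap, keeping a single pointer per cell, and breaking ties in the order diagonal, down, left.

```python
def align(x, y, sub_cost=2):
    """Return (distance, gapped_x, gapped_y, ops) for one optimal alignment."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0], ptr[i][0] = i, "DOWN"
    for j in range(1, m + 1):
        D[0][j], ptr[0][j] = j, "LEFT"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = D[i - 1][j - 1] + (0 if x[i - 1] == y[j - 1] else sub_cost)
            # min() returns the first minimal item, so list order is the tie-break.
            options = [(diag, "DIAG"), (D[i - 1][j] + 1, "DOWN"), (D[i][j - 1] + 1, "LEFT")]
            D[i][j], ptr[i][j] = min(options, key=lambda option: option[0])

    # Walk the pointers from the final cell back to the origin.
    top, bottom, ops = [], [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == "DIAG":
            top.append(x[i - 1])
            bottom.append(y[j - 1])
            ops.append("match" if x[i - 1] == y[j - 1] else "substitute")
            i, j = i - 1, j - 1
        elif move == "DOWN":
            top.append(x[i - 1])
            bottom.append("*")
            ops.append("delete")
            i -= 1
        else:  # LEFT
            top.append("*")
            bottom.append(y[j - 1])
            ops.append("insert")
            j -= 1

    # The walk produced the alignment back to front, so reverse it.
    return D[n][m], "".join(reversed(top)), "".join(reversed(bottom)), ops[::-1]


if __name__ == "__main__":
    distance, top, bottom, ops = align("intention", "execution")
    print(distance)
    print(top)
    print(bottom)
```

A different tie-break order can yield a different alignment with the same total cost, since many cells have more than one optimal predecessor.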
What's the performance of this algorithm? In time it's order NM, because our distance matrix is of size NM and we're filling in each cell one time. The same is true for space. And for the backtrace, in the worst case, if we had N deletions and M insertions, we'd have to touch N plus M cells, but not more than that. So that's our backtrace algorithm for computing alignments.
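These bounds can be observed with a small counting sketch. The function name `fill_and_trace_counts` is hypothetical, the costs are the ones assumed throughout (one, one, and two), and ties are deliberately broken in favor of deletions and insertions so that the worst case for the backtrace shows up.

```python
def fill_and_trace_counts(x, y, sub_cost=2):
    """Return (cells_filled, backtrace_steps) for aligning x with y."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0], ptr[i][0] = i, "DOWN"
    for j in range(1, m + 1):
        D[0][j], ptr[0][j] = j, "LEFT"

    cells_filled = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = D[i - 1][j - 1] + (0 if x[i - 1] == y[j - 1] else sub_cost)
            # DOWN and LEFT are listed first, so they win ties against DIAG.
            options = [(D[i - 1][j] + 1, "DOWN"), (D[i][j - 1] + 1, "LEFT"), (diag, "DIAG")]
            D[i][j], ptr[i][j] = min(options, key=lambda option: option[0])
            cells_filled += 1  # each inner cell is computed exactly once

    steps, i, j = 0, n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move != "LEFT":   # DOWN and DIAG both consume a symbol of x
            i -= 1
        if move != "DOWN":   # LEFT and DIAG both consume a symbol of y
            j -= 1
        steps += 1
    return cells_filled, steps


if __name__ == "__main__":
    # No shared symbols: the backtrace uses 3 deletions and 2 insertions.
    print(fill_and_trace_counts("abc", "xy"))
```

Every step of the walk decreases i, j, or both, which is why the number of steps can never exceed N plus M.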
