BioAlg02

The document provides an overview of bioinformatics algorithms, focusing on physical mapping and restriction mapping techniques. It discusses the discovery of restriction enzymes, the construction of restriction maps, and the challenges associated with reconstructing DNA sequences from fragment sizes. Additionally, it covers methods like gel electrophoresis, double digest mapping, and various computational problems related to restriction mapping.

An Introduction to Bioinformatics Algorithms www.bioalgorithms.info

Physical Mapping –
Restriction Mapping

Molecular Scissors

Molecular Cell Biology, 4th edition


Discovering Restriction Enzymes

• HindII, the first restriction enzyme, was discovered accidentally in 1970 during a study of how the bacterium Haemophilus influenzae takes up DNA from a virus
• It recognizes and cuts DNA at the sequences:
GTGCAC
GTTAAC

Discovering Restriction Enzymes

Werner Arber, Daniel Nathans, Hamilton Smith

• Werner Arber: discovered restriction enzymes
• Daniel Nathans: pioneered the application of restriction enzymes for the construction of genetic maps
• Hamilton Smith: showed that a restriction enzyme cuts DNA in the middle of a specific sequence

"My father has discovered a servant who serves as a pair of scissors. If a foreign king invades a bacterium, this servant can cut him in small fragments, but he does not do any harm to his own king. Clever people use the servant with the scissors to find out the secrets of the kings. For this reason my father received the Nobel Prize for the discovery of the servant with the scissors".
Daniel Nathans’ daughter (from Nobel lecture)

Recognition Sites of Restriction Enzymes

Molecular Cell Biology, 4th edition


Restriction Maps
• A map showing positions of restriction sites in a DNA sequence
• If the DNA sequence is known, then construction of a restriction map is a trivial exercise
• In the early days of molecular biology, DNA sequences were often unknown
• Biologists had to solve the problem of constructing restriction maps without knowing DNA sequences
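
To illustrate why the known-sequence case is trivial, here is a minimal Python sketch that scans a sequence for recognition sites. The function name `restriction_map` and the toy sequence are assumptions made for illustration; only the two HindII recognition sequences come from the earlier slide.

```python
def restriction_map(sequence, sites):
    """Return the sorted 0-based start positions of every recognition site."""
    positions = []
    for site in sites:
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            # Continue the search one character further to catch later matches.
            start = sequence.find(site, start + 1)
    return sorted(positions)


# Hypothetical toy sequence containing one copy of each HindII site.
dna = "AAGTGCACTTGTTAACGG"
print(restriction_map(dna, ["GTGCAC", "GTTAAC"]))  # [2, 10]
```

The sketch reports where each site starts, not the exact bond the enzyme cuts, which depends on the enzyme.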

Physical map
• Definition: Let S be a DNA sequence. A physical map consists of a set M of markers and a function p : M → N that assigns each marker in M a position in S.

• N denotes the set of nonnegative integers

Restriction mapping problem

• For a set X of points on the line, let ΔX = { |x1 - x2| : x1, x2 ∈ X } denote the multiset of all pairwise distances between points in X. In the restriction mapping problem, a subset E ⊆ ΔX (of experimentally obtained fragment lengths) is given and the task is to reconstruct X from E.
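
As a small illustration of the definition, the following Python sketch computes ΔX for a set of points. The helper name `delta` is an assumption, and the sketch counts each unordered pair of distinct points once, which is the convention the partial digest slides use.

```python
from collections import Counter
from itertools import combinations


def delta(points):
    """Multiset of pairwise distances between distinct points, as a Counter."""
    return Counter(abs(a - b) for a, b in combinations(points, 2))


X = [0, 2, 4, 7, 10]
print(sorted(delta(X).elements()))  # [2, 2, 3, 3, 4, 5, 6, 7, 8, 10]
```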

Full Restriction Digest

• Cutting DNA at each restriction site creates multiple restriction fragments:

Is it possible to reconstruct the order of the fragments from the sizes of the fragments {3, 5, 5, 9}?

Full Restriction Digest: Multiple Solutions

• Alternative orderings of the restriction fragments give the same fragment sizes:

[Figure: one ordering of the fragments vs. another ordering]
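
The ambiguity can be made concrete with a short Python sketch that enumerates every distinct left-to-right ordering of the fragments {3, 5, 5, 9} and the cut positions each ordering implies. The function name `candidate_maps` is an assumption for illustration, and mirror-image orderings are counted separately here.

```python
from itertools import accumulate, permutations


def candidate_maps(fragments):
    """Map each distinct fragment ordering to the interior cut positions it implies."""
    maps = {}
    for order in set(permutations(fragments)):
        # Running sums give cut positions; the last sum is the sequence end, not a cut.
        maps[order] = list(accumulate(order))[:-1]
    return maps


maps = candidate_maps([3, 5, 5, 9])
print(len(maps))           # 12
print(maps[(3, 5, 5, 9)])  # [3, 8, 13]
print(maps[(5, 3, 9, 5)])  # [5, 8, 17]
```

All twelve orderings produce the same full-digest fragment sizes, so the sizes alone cannot single out one map.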

Measuring Length of Restriction Fragments

• Restriction enzymes break DNA into restriction fragments.
• Gel electrophoresis is a process for separating DNA by size and measuring the sizes of restriction fragments.
• It can separate DNA fragments that differ in length by only 1 nucleotide, for fragments up to 500 nucleotides long.

Gel Electrophoresis
• DNA fragments are injected into a gel positioned in an electric field
• DNA is negatively charged near neutral pH
    The deoxyribose-phosphate backbone of each nucleotide is acidic, so DNA has an overall negative charge
• DNA molecules move towards the positive electrode
• DNA fragments of different lengths are separated according to size
    Smaller molecules move through the gel matrix more readily than larger molecules
• The gel matrix restricts random diffusion, so molecules of different lengths separate into different bands

Gel Electrophoresis: Example

Direction of DNA
movement

Smaller fragments
travel farther

Molecular Cell Biology, 4th edition


Visualization of DNA: Autoradiography and Fluorescence
• Autoradiography:
    • The DNA is radioactively labeled. The gel is laid against a sheet of photographic film in the dark, exposing the film at the positions where the DNA is present.
• Fluorescence:
    • The gel is incubated with a solution containing the fluorescent dye ethidium, which binds to the DNA.
    • The DNA lights up when the gel is exposed to ultraviolet light.

Three different problems

1. the double digest problem (DDP)
2. the partial digest problem (PDP)
3. the simplified partial digest problem (SPDP)

Double Digest Mapping

Use two restriction enzymes, A and B, and perform three full digests:
1. a complete digest of S using A,
2. a complete digest of S using B, and
3. a complete digest of S using both A and B.

• Computationally, the Double Digest problem is more complex than the Partial Digest problem

Double Digest: Example


Double Digest: Example

Without the information about X (i.e. A+B, the digest with both enzymes), it is impossible to solve the double digest problem, as this diagram illustrates

Double Digest Problem

Input:  dA: fragment lengths from the complete digest with enzyme A.
        dB: fragment lengths from the complete digest with enzyme B.
        dX: fragment lengths from the complete digest with both A and B.

Output: A: location of the cuts in the restriction map for the enzyme A.
        B: location of the cuts in the restriction map for the enzyme B.
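
The formulation above can be sketched as an exhaustive search in Python. This is only an illustration under stated assumptions: the function names are invented here, the instance at the bottom is a hypothetical toy example, and the search tries every ordering of dA and dB, so it takes factorial time and is usable only on very small inputs.

```python
from itertools import accumulate, permutations


def cuts(order):
    """Interior cut positions implied by a left-to-right fragment order."""
    return list(accumulate(order))[:-1]


def fragments(cut_positions, total):
    """Sorted fragment lengths obtained by cutting [0, total] at the given positions."""
    pts = [0] + sorted(cut_positions) + [total]
    return sorted(b - a for a, b in zip(pts, pts[1:]))


def double_digest(dA, dB, dX):
    """Return every (A, B) pair of cut-position lists consistent with the three digests."""
    total = sum(dA)
    if sum(dB) != total or sum(dX) != total:
        return []
    solutions = []
    for pa in set(permutations(dA)):
        A = cuts(pa)
        for pb in set(permutations(dB)):
            B = cuts(pb)
            # Cutting with both enzymes uses the union of the two cut sets.
            if fragments(set(A) | set(B), total) == sorted(dX):
                solutions.append((A, B))
    return solutions


# Hypothetical instance: A cuts at 3 and 7, B cuts at 5, on a sequence of length 10.
print(double_digest([3, 4, 3], [5, 5], [2, 2, 3, 3]))  # [([3, 7], [5])]
```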

Double Digest: Multiple Solutions


Double digest
• The decision problem of the DDP is NP-complete.
• Known algorithms run into trouble with more than about 10 restriction sites for each enzyme.
• A solution may not be unique, and the number of solutions grows exponentially.
• DDP is a favorite mapping method since the experiments are easy to conduct.

DDP is NP-complete
1. DDP is in NP (easy).
2. Given a set of integers X = {x1, . . . , xl}, the Set Partitioning Problem (SPP) is to determine whether we can partition X into two subsets X1 and X2 such that

$\sum_{x \in X_1} x = \sum_{x \in X_2} x$

3. SPP is known to be NP-complete.

DDP is NP-complete
• Let X be the input of the SPP, assuming that the sum of all elements of X is even. Then set
dA = X,
dB = {K/2, K/2} with $K = \sum_{x \in X} x$, and
dAB = dA, where dAB is the double digest (dX in the problem statement).
• If this DDP instance has a solution in which the fragments of dA appear in the order $x_{j_1}, \dots, x_{j_l}$, then there exists an index n0 with

$\sum_{i=1}^{n_0} x_{j_i} = \sum_{i=n_0+1}^{l} x_{j_i}$

because of the choice of dB and dAB. Thus a solution for the SPP exists.
• Thus SPP is a DDP in which one of the two enzymes produces only two fragments of equal length.

Partial Restriction Digest

• The sample of DNA is exposed to the restriction enzyme for
only a limited amount of time to prevent it from being cut at
all restriction sites
• This experiment generates the set of all possible restriction
fragments between every two (not necessarily consecutive)
cuts
• This set of fragment sizes is used to determine the positions
of the restriction sites in the DNA sequence

Multiset of Restriction Fragments

• We assume that the multiplicity of a fragment can be detected, i.e., the number of restriction fragments of the same length can be determined (e.g., by observing twice as much fluorescence intensity for a double fragment than for a single fragment)

Multiset: {3, 5, 5, 8, 9, 14, 14, 17, 19, 22}

Partial Digest Fundamentals

X: the set of n integers representing the location of all cuts in the restriction map, including the start and end

n: the total number of cuts

ΔX: the multiset of integers representing the lengths of each of the fragments produced from a partial digest

One More Partial Digest Example

ΔX |  0   2   4   7  10
---+--------------------
 0 |      2   4   7  10
 2 |          2   5   8
 4 |              3   6
 7 |                  3
10 |

Representation of ΔX = {2, 2, 3, 3, 4, 5, 6, 7, 8, 10} as a two-dimensional table, with the elements of X = {0, 2, 4, 7, 10} along both the top and left side. The element at (i, j) in the table is xj – xi for 1 ≤ i < j ≤ n.

Partial Digest Problem: Formulation

Goal: Given all pairwise distances between points on a line, reconstruct the positions of those points

• Input: The multiset of pairwise distances L, containing n(n-1)/2 integers
• Output: A set X of n integers such that ΔX = L

Partial Digest: Multiple Solutions

• It is not always possible to uniquely reconstruct a set X based only on ΔX.
• For example, the set
X = {0, 2, 5}
and
(X + 10) = {10, 12, 15}
both produce ΔX = {2, 3, 5} as their partial digest set.
• The sets {0,1,2,5,7,9,12} and {0,1,5,7,8,10,12} present a less trivial example of non-uniqueness. They both digest into:
{1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 5, 6, 7, 7, 7, 8, 9, 10, 11, 12}

Homometric Sets
   |  0   1   2   5   7   9  12
---+----------------------------
 0 |      1   2   5   7   9  12
 1 |          1   4   6   8  11
 2 |              3   5   7  10
 5 |                  2   4   7
 7 |                      2   5
 9 |                          3
12 |

   |  0   1   5   7   8  10  12
---+----------------------------
 0 |      1   5   7   8  10  12
 1 |          4   6   7   9  11
 5 |              2   3   5   7
 7 |                  1   3   5
 8 |                      2   4
10 |                          2
12 |
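
The claim that these two sets are homometric can be checked mechanically. The Python sketch below is an illustration only; the helper name `delta` is an assumption. It also confirms that the second set is not simply the mirror image of the first.

```python
from collections import Counter
from itertools import combinations


def delta(points):
    """Multiset of pairwise distances between distinct points, as a Counter."""
    return Counter(abs(a - b) for a, b in combinations(points, 2))


A = [0, 1, 2, 5, 7, 9, 12]
B = [0, 1, 5, 7, 8, 10, 12]

print(delta(A) == delta(B))                    # True
print(sorted(12 - b for b in B) == sorted(A))  # False
```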

Partial Digest: Brute Force

1. Find the restriction fragment of maximum length M. M is the length of the DNA sequence.

2. For every possible set

X = {0, x2, … , xn-1, M}

compute the corresponding ΔX

• If ΔX is equal to the experimental partial digest L, then X is the correct restriction map

BruteForcePDP
BruteForcePDP(L, n):
    M ← maximum element in L
    for every set of n – 2 integers 0 < x2 < … < xn-1 < M
        X ← {0, x2, …, xn-1, M}
        Form ΔX from X
        if ΔX = L
            return X
    output “no solution”
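
A direct Python rendering of the pseudocode might look as follows. This is a sketch, not a reference implementation: the names `brute_force_pdp` and `delta` are assumptions, and positions are assumed to be integers, as in the slides.

```python
from collections import Counter
from itertools import combinations


def delta(points):
    """Multiset of pairwise distances between distinct points, as a Counter."""
    return Counter(abs(a - b) for a, b in combinations(points, 2))


def brute_force_pdp(L, n):
    """Try every set of n - 2 interior integer positions strictly between 0 and M."""
    target = Counter(L)
    M = max(L)
    for interior in combinations(range(1, M), n - 2):
        X = [0, *interior, M]
        if delta(X) == target:
            return X
    return None  # corresponds to "no solution"


print(brute_force_pdp([2, 2, 3, 3, 4, 5, 6, 7, 8, 10], 5))  # [0, 2, 4, 7, 10]
```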

Efficiency of BruteForcePDP
• BruteForcePDP takes O(M^(n-2)) time since it must examine all possible sets of positions.

• One way to improve the algorithm is to limit the values of xi to only those values which occur in L.


AnotherBruteForcePDP
AnotherBruteForcePDP(L, n)
    M ← maximum element in L
    for every set of n – 2 integers 0 < x2 < … < xn-1 < M from L
        X ← {0, x2, …, xn-1, M}
        Form ΔX from X
        if ΔX = L
            return X
    output “no solution”
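
The only change from BruteForcePDP is where the candidate positions come from. Every point xi lies at distance xi from 0, so xi must itself appear in L. A hedged Python sketch, with assumed names and an extra counter of examined sets added purely for illustration:

```python
from collections import Counter
from itertools import combinations


def delta(points):
    """Multiset of pairwise distances between distinct points, as a Counter."""
    return Counter(abs(a - b) for a, b in combinations(points, 2))


def another_brute_force_pdp(L, n):
    """Like the brute force search, but draw interior positions only from values in L."""
    target = Counter(L)
    M = max(L)
    candidates = sorted({x for x in L if 0 < x < M})
    tried = 0
    for interior in combinations(candidates, n - 2):
        tried += 1
        X = [0, *interior, M]
        if delta(X) == target:
            return X, tried
    return None, tried


print(another_brute_force_pdp([2, 998, 1000], 3))  # ([0, 2, 1000], 1)
```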

Efficiency of AnotherBruteForcePDP

• It’s more efficient, but still slow

• If L = {2, 998, 1000} (n = 3, M = 1000), BruteForcePDP tries up to 999 candidate positions, whereas AnotherBruteForcePDP tries at most 2
• Fewer sets are examined, but runtime is still exponential: O(n^(2n-4)), since n – 2 positions are chosen from the n(n-1)/2 < n^2 values in L

Branch and Bound Algorithm for PDP

1. Begin with X = {0}

2. Remove the largest element in L and place it in X
3. See if the element fits on the right or left side of the
restriction map
4. When it fits, find the other lengths it creates and remove
those from L
5. Go back to step 1 until L is empty

WRONG ALGORITHM: the slides flag the outline above as incorrect. As stated, it never backtracks when a placement fails. The PartialDigest procedure that follows tries both placements and backtracks.

Defining D(y, X)

• Before describing PartialDigest, first define

D(y, X)
as the multiset of all distances between point y and all other
points in the set X

D(y, X) = {|y – x1|, |y – x2|, …, |y – xn|}

for X = {x1, x2, …, xn}


PartialDigest Algorithm

PartialDigest(L):
    width ← Maximum element in L
    DELETE(width, L)
    X ← {0, width}
    PLACE(L, X)

PartialDigest Algorithm (cont’d)

PLACE(L, X)
    if L is empty
        output X
        return
    y ← maximum element in L
    if D(y, X) ⊆ L
        Add y to X and remove lengths D(y, X) from L
        PLACE(L, X)
        Remove y from X and add lengths D(y, X) to L
    if D(width-y, X) ⊆ L
        Add width-y to X and remove lengths D(width-y, X) from L
        PLACE(L, X)
        Remove width-y from X and add lengths D(width-y, X) to L
    return

Note: y is not deleted from L before the subset tests, because D(y, X) already contains y itself (the distance from y to 0) and D(width-y, X) contains y as the distance to width.
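
The procedure can be sketched in Python as below. This is one possible rendering under stated assumptions, not the authors' code: the multiset L is held in a `Counter`, the helper names are invented, and y is left in L until D(y, X) is removed. The sketch explores both symmetric branches at the root, so it reports a solution together with its mirror image.

```python
from collections import Counter


def partial_digest(L):
    """Return every sorted point set X whose pairwise distances equal the multiset L."""
    remaining = Counter(L)
    width = max(remaining)
    remaining[width] -= 1
    X = [0, width]
    solutions = []

    def distances(y):
        # D(y, X): distances from y to every point currently in X.
        return Counter(abs(y - x) for x in X)

    def fits(d):
        # Multiset inclusion test: D(y, X) is contained in the remaining lengths.
        return all(remaining[k] >= c for k, c in d.items())

    def place():
        if sum(remaining.values()) == 0:
            solutions.append(sorted(X))
            return
        y = max(k for k, c in remaining.items() if c > 0)
        # Try the point at distance y from the left end, then from the right end.
        for candidate in sorted({y, width - y}, reverse=True):
            d = distances(candidate)
            if fits(d):
                X.append(candidate)
                remaining.subtract(d)
                place()
                remaining.update(d)  # backtrack
                X.remove(candidate)

    place()
    return solutions


print(sorted(partial_digest([2, 2, 3, 3, 4, 5, 6, 7, 8, 10])))
# [[0, 2, 4, 7, 10], [0, 3, 6, 8, 10]]
```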

An Example
L = { 2, 2, 3, 3, 4, 5, 6, 7, 8, 10 }
X = { 0 }

Remove 10 from L and insert it into X. We know this must be the length of the DNA sequence because it is the largest fragment.

An Example
L = { 2, 2, 3, 3, 4, 5, 6, 7, 8 }
X = { 0, 10 }

Take 8 from L and make y = 2 or 8. Since the two cases are symmetric, we can assume y = 2.

An Example
L = { 2, 2, 3, 3, 4, 5, 6, 7, 8 }
X = { 0, 10 }

We find that the distances from y = 2 to the other elements in X are D(y, X) = {8, 2}, so we remove {8, 2} from L and add 2 to X.

An Example
L = { 2, 3, 3, 4, 5, 6, 7 }
X = { 0, 2, 10 }

Take 7 from L and make y = 7 or y = 10 – 7 = 3. We will explore y = 7 first, so D(y, X) = {7, 5, 3}.

An Example
L = { 2, 3, 3, 4, 5, 6, 7 }
X = { 0, 2, 10 }

For y = 7, D(y, X) = {7, 5, 3} = {|7 – 0|, |7 – 2|, |7 – 10|}. Therefore we remove {7, 5, 3} from L and add 7 to X.

An Example
L = { 2, 3, 4, 6 }
X = { 0, 2, 7, 10 }

Take 6 from L and make y = 6. Unfortunately D(y, X) = {6, 4, 1, 4}, which is not a subset of L. Therefore we won’t explore this branch.

An Example
L = { 2, 3, 4, 6 }
X = { 0, 2, 7, 10 }

This time make y = 10 – 6 = 4. D(y, X) = {4, 2, 3, 6}, which is a subset of L, so we will explore this branch. We remove {4, 2, 3, 6} from L and add 4 to X.

An Example
L = { }
X = { 0, 2, 4, 7, 10 }

L is now empty, so we have a solution, which is X.

An Example
L = { 2, 3, 4, 6 }
X = { 0, 2, 7, 10 }

To find other solutions, we backtrack.

An Example
L = { 2, 3, 3, 4, 5, 6, 7 }
X = { 0, 2, 10 }

We backtrack further. This time we will explore y = 3. D(y, X) = {3, 1, 7}, which is not a subset of L, so we won’t explore this branch.

An Example
L = { 2, 2, 3, 3, 4, 5, 6, 7, 8 }
X = { 0, 10 }

We have backtracked to the root. Therefore we have found all the solutions (up to the mirror image that arises from choosing y = 8 at the first step).

Analyzing PartialDigest Algorithm

• Still exponential in the worst case, but very fast on average

• Informally, let T(n) be the time PartialDigest takes to place n cuts
    No branching case: T(n) < T(n-1) + O(n), which is quadratic
    Branching case: T(n) < 2T(n-1) + O(n), which is exponential
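
The two bounds follow from unrolling the recurrences. The short derivation below is a sketch that assumes the O(n) term is at most cn for some constant c and that T(0) = 0; neither assumption is stated on the slide.

```latex
% No branching: one recursive call per placed cut.
T(n) \le T(n-1) + cn
     \le c\,\bigl(n + (n-1) + \dots + 1\bigr)
     = c\,\frac{n(n+1)}{2} = O(n^2)

% Branching at every level: two recursive calls per placed cut.
T(n) \le 2\,T(n-1) + cn
     \le c \sum_{k=0}^{n-1} 2^{k}\,(n-k)
     = c\,\bigl(2^{n+1} - n - 2\bigr) = O(2^n)
```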

PDP analysis
• No polynomial time algorithm is known for PDP.
In fact, the complexity of PDP is an open
problem.
• S. Skiena devised a simple backtracking
algorithm that performs well in practice, but
may require exponential time.
• This approach is not a popular mapping method,
as it is difficult to reliably produce all pairwise
distances between restriction sites.

Simplified partial digest problem

• Given a target sequence S and a single restriction enzyme A, two different experiments are performed on two sets of copies of S:
1. In the short experiment, the time span is chosen so that each copy of the target sequence is cut precisely once by the restriction enzyme.
2. In the long experiment, a complete digest of S by A is performed.

SPDP
• Let Γ = {γ1, . . . , γ2N} be the multi-set of all fragment lengths obtained by the short experiment, and
• let Λ = {λ1, . . . , λN+1} be the multi-set of all fragment lengths obtained by the long experiment,
• where N is the number of restriction sites in S.
• Here is an example: Given these (unknown) restriction sites (in kb) at 2, 8, 9 and 13 on a sequence of length 16,
• we obtain Λ = {2kb, 6kb, 1kb, 4kb, 3kb}.
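
Both experiments can be simulated for this example with a few lines of Python. The sketch assumes that the numbers 2, 8, 9 and 13 are the restriction sites and that 16 is the total length, which is the reading consistent with the long-experiment lengths listed above. The function name `spdp_digests` is likewise an assumption.

```python
def spdp_digests(sites, length):
    """Return (short, long) fragment-length lists for the two SPDP experiments."""
    points = [0] + sorted(sites) + [length]
    # Long experiment: complete digest, so consecutive cut points bound each fragment.
    long_exp = [b - a for a, b in zip(points, points[1:])]
    # Short experiment: each copy is cut exactly once, giving two fragments per site.
    short_exp = []
    for s in sorted(sites):
        short_exp += [s, length - s]
    return short_exp, long_exp


short_exp, long_exp = spdp_digests([2, 8, 9, 13], 16)
print(long_exp)           # [2, 6, 1, 4, 3]
print(sorted(short_exp))  # [2, 3, 7, 8, 8, 9, 13, 14]
```

With N = 4 sites, the short experiment yields 2N = 8 lengths and the long experiment yields N + 1 = 5 lengths.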
