Sequencing

DNA sequencing determines the order of bases in a DNA molecule. There are two main methods - Maxam-Gilbert sequencing uses chemical degradation to cleave the DNA strand at specific bases, while Sanger sequencing uses enzymatic inhibition of DNA synthesis. Both methods require reaction products to share a common endpoint to allow size separation by electrophoresis. Automation of DNA sequencing is accelerating progress in genome sequencing projects.


DNA Sequencing
Introductory article

Susan H Hardin, University of Houston, Texas, USA

Article Contents
. Introduction
. Maxam–Gilbert DNA Sequencing (Chemical Degradation)
. Sanger DNA Sequencing (Enzymatic Synthesis)
. Automated DNA Sequencing
. Genome Sequencing
. Prospects

DNA sequencing is the determination of base order in a DNA molecule. Methods for determining base order involve either chemical degradation or, more commonly, enzymatic synthesis of the region that is being sequenced. Automation of the DNA sequencing process is accelerating the progress of the Human Genome Project.

Introduction
The development of methods that allow one to quickly and reliably determine the order of bases, or the ‘sequence’, in a fragment of DNA is a key technical advance, the importance of which cannot be overstated. Knowledge of DNA sequence enables a greater understanding of the molecular basis of life. DNA sequence information provides information critical to understanding a wide range of biological processes. The order of bases in DNA specifies the order of bases in RNA, the molecule within the cell that directly encodes the informational content of proteins. Scientists routinely use the DNA sequence information to deduce protein sequence information. Base order dictates DNA structure and its function, and provides a molecular programme that can specify normal development, manifestation of a genetic disease, or cancer.

Knowledge of DNA sequence and the ability to manipulate these sequences has accelerated the development of biotechnology and has led to the development of molecular techniques that provide the tools for asking and answering important scientific questions. The polymerase chain reaction (PCR), an important biotechnique that facilitates sequence-specific detection of nucleic acid, relies on sequence information. DNA sequencing methods allow scientists to determine whether a change has been introduced into the DNA, and to assay the effect of the change on the biology of the organism, regardless of the type of organism that is being studied. Ultimately, DNA sequence information may provide a way to identify individuals uniquely.

DNA sequencing has become so commonplace that the technique itself is often taken for granted. However, this has not always been the case. It was, in fact, almost required that scientists publish or present DNA sequence data before a sequence was considered reliable. Furthermore, the length of the DNA information that it is possible to obtain and the number of sequences that are analysed on a single gel have increased by an order of magnitude. This article provides an overview of DNA sequencing development.

To understand the DNA sequencing process, one must recall several facts about DNA. First, a DNA molecule is composed of four bases, adenine (A), guanine (G), cytosine (C) and thymine (T). These bases interact with each other in very specific ways through hydrogen bonds, such that A interacts with T, and G interacts with C. These specific interactions between the bases are referred to as base pairings. The two strands of a DNA molecule occur in an antiparallel orientation in which one strand is positioned in the 5’ to 3’ direction and the other strand is positioned in the 3’ to 5’ direction. The terms 5’ and 3’ refer to the directionality of the DNA backbone, and are critical to describing the order of the bases. The convention for describing base order in a DNA sequence uses the 5’ to 3’ direction, and is written from left to right. Thus, if one knows the sequence of one DNA strand, the complementary sequence can be deduced.

There are two methods that are typically used to determine DNA sequence; the development of each method resulted in the award of a Nobel Prize. The first uses chemicals to specifically degrade the DNA strand, and is referred to as Maxam–Gilbert DNA sequencing in honour of the inventors, A. Maxam and W. Gilbert. The second method involves specific inhibition of enzymatic DNA synthesis and is referred to as Sanger sequencing in honour of its inventor, F. Sanger. These two sequencing methods are described in more detail below.

Both methods require that the reaction products share a common endpoint. This requirement stems from the separation method used to visualize the reaction products. These reaction products are size-separated by applying an electric current through a gel matrix (electrophoresis), and a common end is necessary to keep the reaction products in register with respect to size mobility, so that the smaller products migrate more rapidly on the gel relative to the larger products. More specifically, either the 5’ or the 3’ end can define the fragment endpoint in a Maxam–Gilbert sequencing reaction, while only the 5’ end defines the fragment endpoint in a Sanger sequencing reaction. The reason for this difference is clarified below.
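
The base-pairing rule (A with T, G with C) and the 5’ to 3’ writing convention can be sketched in a few lines of Python. This is only an illustration: the function name `reverse_complement` and the dictionary `PAIRS` are assumptions made for this example, and the input is the short sequence quoted in the Figure 1 caption.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    The two strands are antiparallel, so the complement of each base is
    taken while walking the input strand from its 3' end to its 5' end.
    """
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    strand = "GGTACGCCTGA"  # written 5' to 3'
    print("given strand   5'", strand, "3'")
    print("deduced strand 5'", reverse_complement(strand), "3'")
```

Applying the function twice returns the original strand, which is a quick check that the pairing table is symmetric.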

ENCYCLOPEDIA OF LIFE SCIENCES / © 2001 Nature Publishing Group / www.els.net



Maxam–Gilbert DNA Sequencing (Chemical Degradation)

In this method, a singly end-labelled DNA fragment (typically labelled with a radioactive marker) is exposed to base-specific damage. The chemicals used to induce DNA damage are dimethylsulfate (attacks G), sodium hydroxide (attacks A), formic acid (attacks G and A), hydrazine (attacks C and T), and hydrazine in the presence of sodium chloride (attacks C). These treatments are limited so that on average only one of the bases in the strand is damaged. Modification and, ultimately, elimination of the base (but not the sugar) produces a weak point in the DNA molecule that is susceptible to cleavage. Next, the DNA is exposed to piperidine at high temperature to break the strand at the weakened position. Since these reactions are performed on a population of molecules, electrophoresis of the reaction produces a ladder of cleavage products that correspond to the positions of that base along the DNA strand. Products from the different chemical modifications are electrophoresed in adjoining lanes on the gel and read to determine the DNA sequence (Figure 1). The smaller fragments indicate the identity of bases closest to the labelled end, and successively larger products indicate the identity of bases farther from this end. Depending on whether the label is located at the 5’ or the 3’ end of the DNA strand, the base order is read from the bottom to the top of the autoradiogram as either 5’ to 3’ or 3’ to 5’, respectively.

Figure 1 Schematic view of Maxam–Gilbert reaction products. G, A, R, Y and C represent the specific chemical reactions that identify the relative positions of guanine (G), adenine (A), purine (‘R’; G and A), pyrimidine (‘Y’; C and T) and cytosine (C) bases, respectively. In this example the fragment is labelled at the 5’ end. Reading from the bottom towards the top of the gel, the banding pattern corresponds to the sequence 5’ GGTACGCCTGA 3’.

Maxam–Gilbert sequencing is not routinely used by most investigators for several reasons. First, data produced in chemical sequencing reactions are typically more ambiguous than data produced in enzymatic sequencing reactions. One reason for this is that the chemical reactivity of the bases is influenced by reaction impurities. Therefore, when one reads the sequence from this type of reaction, the relative intensities of the reaction products must be analysed for proper interpretation of base identity. Additionally, this procedure uses hazardous chemicals and high levels of radioactivity. When compared with enzymatic DNA sequencing, Maxam–Gilbert sequencing produces relatively shorter sequence information and the procedures required to generate this information are more labour-intensive.

Sanger DNA Sequencing (Enzymatic Synthesis)

Sanger sequencing is currently the most commonly used method for sequencing DNA. The method exploits several features of a DNA polymerase: its ability to make an exact copy of a DNA molecule; its directionality of enzymatic synthesis (5’ to 3’); its requirement for a DNA strand (a ‘primer’) from which to begin synthesis; and its requirement for a 3’ OH at the end of the primer. If a 3’ OH is not available, the DNA strand cannot be extended by the polymerase. If a dideoxynucleotide (ddNTP: ddATP, ddTTP, ddGTP or ddCTP), a base analogue lacking a 3’ OH, is added into an enzymatic sequencing reaction, it is incorporated into the growing strand by the polymerase. However, once the ddNTP is incorporated, the polymerase is unable to add any additional bases to the end of the strand. Importantly, ddNTPs are incorporated into the DNA strand by the polymerase using the same base incorporation rules that dictate incorporation of natural nucleotides, where A specifies incorporation of T, and G specifies incorporation of C (and vice versa).

Once the polymerase incorporates a ddNTP, chain extension stops. To determine DNA sequence, one performs four reactions per template, where each reaction is used to determine the relative position of a specific base along the DNA strand. This is accomplished by adding a different dideoxynucleotide into each reaction containing the DNA polymerase, the DNA primer, and the dNTPs. The ratio between ddNTP and dNTP is critical for determining how many nucleotides (on average) the polymerase is able to incorporate into the DNA molecule before incorporating a ddNTP, thereby terminating chain elongation. The DNA primer, the ddNTPs, or the dNTPs can be either radiolabelled or otherwise tagged to allow detection of the newly synthesized DNA strands. Since these reactions are performed on a population of molecules, electrophoresis of the reaction produces a


ladder of extension products that correspond to the positions of that base along the DNA strand. Products from the different reaction vessels are electrophoresed in adjoining lanes on a sequencing gel, detected, and read to determine the DNA sequence (Figure 2). The smaller fragments indicate the identity of bases closest to the 5’ end and, since the DNA polymerase only incorporates bases in the 5’ to 3’ direction, reading the identity of the successively larger products provides the 5’ to 3’ sequence of the extended DNA strand. Typically, approximately 300–400 bases can be determined in a single (manual) reaction.

Automated DNA Sequencing

A major advance in determining DNA sequence information occurred with the introduction of automated DNA sequencing machines. The automated sequencer is used to separate sequencing reaction products, detect and collect (via computer) the data from the reactions, and analyse the order of the bases to automatically deduce the base sequence of a DNA fragment. Automated sequencers detect extension products containing a fluorescent tag, allowing researchers to eliminate radioactivity from the DNA sequencing process. Sequence lengths that can be read using an automated sequencer are dependent upon a variety of parameters, but typically range between 500 and 1000 bases.

As described for Sanger-type sequencing reactions using (primarily) isotopes to detect the extension products, some automated sequencers use four lanes to collect the data from the reactions. However, some machines use differently coloured fluorescent tags to indicate base identity (Figure 3). This approach enables a single lane to contain the data for a DNA template and increases fourfold the

Figure 2 Data produced using Sanger sequencing reaction. G, A, T and C represent the sequencing reaction products resulting from inclusion of ddGTP, ddATP, ddTTP or ddCTP. Since enzymatic synthesis proceeds 5’ to 3’, the smaller fragments identify bases that are closer to the primer (5’ end of the sequence information). (a) Schematic view of Sanger reaction products. The DNA sequence identified by this pattern of bands is indicated. (b) Photograph of corresponding sequence data.
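
The four-lane read-out summarized in Figure 2 can be sketched as follows. This is an idealized model, not a description of any laboratory software: it assumes every possible termination product is present and cleanly resolved, and the function names `sanger_lanes` and `read_gel` as well as the example strand are hypothetical.

```python
from typing import Dict, List


def sanger_lanes(extended: str) -> Dict[str, List[int]]:
    """Idealized band positions for the four ddNTP reactions.

    `extended` is the newly synthesized strand (5' to 3', primer excluded).
    A band of length n appears in lane X when base n of the strand is X,
    because a ddNTP of that base incorporated there terminates the chain.
    """
    lanes: Dict[str, List[int]] = {base: [] for base in "GATC"}
    for length, base in enumerate(extended.upper(), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: Dict[str, List[int]]) -> str:
    """Read bands from the smallest fragment upward (5' to 3')."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


if __name__ == "__main__":
    lanes = sanger_lanes("ATGCCGTA")  # hypothetical extension product
    for base, lengths in lanes.items():
        print(f"dd{base}TP lane: bands at lengths {lengths}")
    print("sequence read from the gel:", read_gel(lanes))
```

Because all products share the primer as their common 5’ end, sorting the bands by length is enough to recover the base order; the lane in which each band appears supplies the base identity.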


Figure 3 (a) Raw sequence data collected on an automated DNA sequencer (Perkin-Elmer ABI PRISM Model 377). The four colours indicate the relative position of the bases in the DNA fragment. Each four-colour vertical line corresponds to a different sequence reaction. The smaller fragments (nearer the positive electrode) identify bases that are closer to the primer (5’ end of the sequence information). (b) Portions of a representative, analysed sequence determined by the automated sequencer.
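
The idea behind four-colour detection in a single lane (Figure 3) can be illustrated with a toy base caller. Everything here is an assumption made for the example: the intensity values are invented, the function name `call_bases` and the `min_ratio` threshold are hypothetical, and real instrument software performs additional signal processing that is not modelled.

```python
from typing import Dict, List


def call_bases(peaks: List[Dict[str, float]], min_ratio: float = 2.0) -> str:
    """Call one base per peak from four fluorescence channels.

    `peaks` is ordered from the smallest fragment to the largest; each entry
    maps a base (its dye channel) to a signal intensity. The strongest
    channel wins, unless it fails to exceed the runner-up by `min_ratio`,
    in which case the position is reported as ambiguous ("N").
    """
    calls = []
    for intensities in peaks:
        ranked = sorted(intensities.items(), key=lambda kv: kv[1], reverse=True)
        (best_base, best), (_, second) = ranked[0], ranked[1]
        clear = second == 0 or best / second >= min_ratio
        calls.append(best_base if clear else "N")
    return "".join(calls)


if __name__ == "__main__":
    # Invented intensities for four consecutive peaks in one lane.
    peaks = [
        {"G": 900, "A": 40, "T": 30, "C": 20},
        {"G": 50, "A": 800, "T": 60, "C": 10},
        {"G": 30, "A": 20, "T": 700, "C": 45},
        {"G": 400, "A": 20, "T": 10, "C": 350},  # two channels nearly equal
    ]
    print(call_bases(peaks))
```

The last peak shows why a threshold is useful: when two dyes give similar signal, reporting an ambiguous call is safer than guessing.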

amount of data contained on a gel. This single-lane approach is made possible by the development of fluorescent tags that can be attached either to the DNA primer or to the ddNTP. Since four-colour chemistry is used by more researchers, it is discussed in more detail below.

Dye primer chemistry

When dye primer chemistry is used to detect the sequencing products, fluorescent tags are attached to the sequencing primer. With this chemistry, the primer is synthesized four times and a different tag (corresponding to a different base identity) is attached to the primer during each synthesis. Subsequently, the researcher assembles four separate reactions, each containing the DNA template, a specific ddNTP and colour-coded primer, and the reagents necessary to produce extension products. The primer defines the beginning (5’ end) of the extension product, and the incorporated ddNTP defines base identity at the 3’ end of the molecule. After the reaction is completed, the colour-coded products are pooled and prepared for loading into a single lane on an automated sequencer.

Dye terminator chemistry

When dye terminator chemistry is used to detect the sequencing products, base identity is determined by the fluorescent tag attached to the ddNTP. This type of reaction chemistry is performed in a single tube that


contains the DNA template, the primer, all four fluorescently labelled ddNTPs, and the reagents necessary to produce extension products. The primer defines the beginning (5’ end) of the extension product and the incorporated, colour-coded ddNTP defines base identity at the 3’ end of the molecule. After the reaction is complete, the extension products are prepared for loading into a single lane on an automated sequencer. An advantage of dye terminator chemistry is that extension products are visualized only if they terminate with a dye-labelled ddNTP; prematurely terminated products are not detected. Thus, reduced background signal typically results with this chemistry.

Genome Sequencing

Very often, a researcher needs to determine the sequence of a DNA fragment that is larger than the 500–1000 base average sequencing read length. Not surprisingly, strategies to accomplish this have been developed. These strategies are divided into two major classes, random or directed. Strategy choice is influenced by the size of the fragment to be sequenced.

In random, or shotgun, DNA sequencing, a large DNA fragment (typically one larger than 20 000 base pairs) is broken into smaller fragments that are inserted into a cloning vector. It is assumed that the sum of information contained within these smaller clones is equivalent to that contained within the original DNA fragment. Numerous smaller clones are randomly selected, DNA templates are prepared for sequencing reactions, and fluorescently-labelled primers that will base-pair with the vector DNA sequence bordering the insert are used to begin the sequencing reaction. Subsequently, the sequence of the original DNA fragment is reconstructed by computer assembly of the sequences obtained from the smaller DNA fragments. This strategy is being used extensively to determine the sequence of ordered fragments that represent the entire human genome [http://www.nhgri.nih.gov/HGP/]. However, this random approach is typically not sufficient to complete sequence determination, since gaps in the sequence often remain after computer assembly. A directed strategy (described below) is usually used to complete the sequence project.

A directed, or primer-walking, sequencing strategy can be used to fill gaps remaining after the random phase of large-fragment sequencing, and as an efficient approach for sequencing smaller DNA fragments. This strategy uses DNA primers that anneal to the template at a single site and act as a start site for chain elongation. This approach requires knowledge of some sequence information to design the primer. The sequence obtained from the first reaction is used to design the primer for the next reaction and these steps are repeated until the complete sequence is determined. Thus, a primer-based strategy involves repeated sequencing steps from known into unknown DNA regions; this process minimizes redundancy, and it does not require additional cloning steps. However, this strategy requires the synthesis of a new primer for each round of sequencing.

The necessity of designing and synthesizing new primers, coupled with the expense and the time required for their synthesis, has limited the routine application of primer-walking for sequencing large DNA fragments. Researchers have proposed using a library of short primers to eliminate the requirement for custom primer synthesis. The availability of a primer library would minimize waste of primer, since each primer could be used to prime multiple reactions, and would allow immediate access to the next sequencing primer.

Prospects

One of the original goals of the Human Genome Project was to complete sequence determination of the entire human genome by 2005. However, the project is ahead of schedule and it is expected to produce a ‘working draft’ of the human genome by 2001. The completed genome sequence is expected by 2003, at least two years ahead of schedule. Technological advances are responsible for the rapid progress of this ambitious project. Progress in all aspects involving DNA manipulation (especially manipulation and propagation of large DNA fragments), evolution of faster and better DNA sequencing methods, development of computer hardware and software capable of manipulating and analysing the data (bioinformatics), and automation of procedures associated with generating and analysing DNA sequences are responsible for this acceleration.

Further Reading

Ball S, Reeve MA, Robinson PS, Hill F, Brown DM and Loakes D (1998) The use of tailed octamer primers for cycle sequencing. Nucleic Acids Research 26: 5225–5227.
Burbelo PD and Iadarola MJ (1994) Rapid plasmid DNA sequencing with multiple octamer primers. BioTechniques 16: 645–650.
Collins FS, Patrinos A, Jordan E, Chakravarti A, Gesteland R, Walters L, the members of the DOE and NIH planning groups (1998) New goals for the US Human Genome Project: 1998–2003. Science 282: 682–689.
Hardin SH, Jones LB, Homayouni R and McCollum JC (1996) Octamer primed cycle sequencing: design of an optimal primer library. Genome Research 6: 545–550.
Jones LB and Hardin SH (1998a) Octamer-primed cycle sequencing using dye-terminator chemistry. Nucleic Acids Research 26: 2824–2826.
Jones LB and Hardin SH (1998b) Octamer sequencing technology: optimization using fluorescent chemistry. ABRF News 9(2): 6–10.


Kieleczawa J, Dunn JJ and Studier FW (1992) DNA sequencing by primer walking with strings of contiguous hexamers. Science 258: 1787–1791.
Kotler LE, Zevin-Sonkin D, Sobolev IA, Beskin AD and Ulanovsky LE (1993) DNA sequencing: modular primers assembled from a library of hexamers or pentamers. Proceedings of the National Academy of Sciences of the USA 90: 4241–4245.
Maxam AM and Gilbert W (1977) A new method for sequencing DNA. Proceedings of the National Academy of Sciences of the USA 74: 560–564.
Raja MC, Zevin-Sonkin D, Shwartzburd J et al. (1997) DNA sequencing using differential extension with nucleotide subsets (DENS). Nucleic Acids Research 25: 800–805.
Sanger F, Nicklen S and Coulson AR (1977) DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences of the USA 74: 5463–5467.
Siemieniak DR and Slightom JL (1990) A library of 3342 useful nonamer primers for genome sequencing. Gene 96: 121–124.
Smith LM, Sanders JZ, Kaiser RJ et al. (1986) Fluorescence detection in automated DNA sequence analysis. Nature 321: 674–679.
Studier FW (1989) A strategy for high-volume sequencing of cosmid DNAs: random and directed priming with a library of oligonucleotides. Proceedings of the National Academy of Sciences of the USA 86: 6917–6921.
