Sequencing
ladder of extension products that correspond to the positions of that base along the DNA
strand. Products from the different reaction vessels are electrophoresed in adjoining lanes
on a sequencing gel, detected, and read to determine the DNA sequence (Figure 2). The smaller
fragments indicate the identity of bases closest to the 5’ end and, since DNA polymerase
incorporates bases only in the 5’ to 3’ direction, reading the identity of the successively
larger products provides the 5’ to 3’ sequence of the extended DNA strand. Typically, about
300–400 bases can be determined in a single (manual) reaction.

Figure 2 Data produced using Sanger sequencing reactions. G, A, T and C represent the
sequencing reaction products resulting from inclusion of ddGTP, ddATP, ddTTP or ddCTP.
Products migrate through the gel from the cathode (–) towards the anode (+). Since enzymatic
synthesis proceeds 5’ to 3’, the smaller fragments identify bases that are closer to the
primer (5’ end of the sequence information). (a) Schematic view of Sanger reaction products.
The DNA sequence identified by this pattern of bands is indicated. (b) Photograph of
corresponding sequence data.

Automated DNA Sequencing

A major advance in determining DNA sequence information occurred with the introduction of
automated DNA sequencing machines. The automated sequencer is used to separate sequencing
reaction products, detect and collect (via computer) the data from the reactions, and analyse
the order of the bases to automatically deduce the base sequence of a DNA fragment. Automated
sequencers detect extension products containing a fluorescent tag, allowing researchers to
eliminate radioactivity from the DNA sequencing process. Sequence lengths that can be read
using an automated sequencer depend on a variety of parameters, but typically range between
500 and 1000 bases.

Figure 3 (a) Raw sequence data collected on an automated DNA sequencer (Perkin-Elmer ABI
PRISM Model 377). The four colours indicate the relative position of the bases in the DNA
fragment. Each four-colour vertical line corresponds to a different sequencing reaction. The
smaller fragments (nearer the anode) identify bases that are closer to the primer (5’ end of
the sequence information). (b) Portions of a representative, analysed sequence determined by
the automated sequencer.

As described for Sanger-type sequencing reactions using (primarily) isotopes to detect the
extension products, some automated sequencers use four lanes to collect the data from the
reactions. However, some machines use differently coloured fluorescent tags to indicate base
identity (Figure 3). This approach enables a single lane to contain the data for a DNA
template and increases fourfold the
amount of data contained on a gel. This single-lane approach is made possible by the
development of fluorescent tags that can be attached either to the DNA primer or to the
ddNTP. Since four-colour chemistry is used by more researchers, it is discussed in more
detail below.

Dye primer chemistry

When dye primer chemistry is used to detect the sequencing products, fluorescent tags are
attached to the sequencing primer. With this chemistry, the primer is synthesized four times
and a different tag (corresponding to a different base identity) is attached to the primer
during each synthesis. Subsequently, the researcher assembles four separate reactions, each
containing the DNA template, a specific ddNTP and colour-coded primer, and the reagents
necessary to produce extension products. The primer defines the beginning (5’ end) of the
extension product, and the incorporated ddNTP defines base identity at the 3’ end of the
molecule. After the reaction is completed, the colour-coded products are pooled and prepared
for loading into a single lane on an automated sequencer.

Dye terminator chemistry

When dye terminator chemistry is used to detect the sequencing products, base identity is
determined by the fluorescent tag attached to the ddNTP. This type of reaction chemistry is
performed in a single tube that
contains the DNA template, the primer, all four fluorescently labelled ddNTPs, and the
reagents necessary to produce extension products. The primer defines the beginning (5’ end)
of the extension product and the incorporated, colour-coded ddNTP defines base identity at
the 3’ end of the molecule. After the reaction is complete, the extension products are
prepared for loading into a single lane on an automated sequencer. An advantage of dye
terminator chemistry is that extension products are visualized only if they terminate with a
dye-labelled ddNTP; prematurely terminated products are not detected. Thus, reduced
background signal typically results with this chemistry.

Genome Sequencing

Very often, a researcher needs to determine the sequence of a DNA fragment that is larger
than the 500–1000 base average sequencing read length. Not surprisingly, strategies to
accomplish this have been developed. These strategies are divided into two major classes,
random or directed. Strategy choice is influenced by the size of the fragment to be
sequenced.

In random, or shotgun, DNA sequencing, a large DNA fragment (typically one larger than
20 000 base pairs) is broken into smaller fragments that are inserted into a cloning vector.
It is assumed that the sum of information contained within these smaller clones is equivalent
to that contained within the original DNA fragment. Numerous smaller clones are randomly
selected, DNA templates are prepared for sequencing reactions, and fluorescently labelled
primers that will base-pair with the vector DNA sequence bordering the insert are used to
begin the sequencing reaction. Subsequently, the sequence of the original DNA fragment is
reconstructed by computer assembly of the sequences obtained from the smaller DNA fragments.
This strategy is being used extensively to determine the sequence of ordered fragments that
represent the entire human genome [http://www.nhgri.nih.gov/HGP/]. However, this random
approach is typically not sufficient to complete sequence determination, since gaps in the
sequence often remain after computer assembly. A directed strategy (described below) is
usually used to complete the sequence project.

A directed, or primer-walking, sequencing strategy can be used to fill gaps remaining after
the random phase of large-fragment sequencing, and as an efficient approach for sequencing
smaller DNA fragments. This strategy uses DNA primers that anneal to the template at a single
site and act as a start site for chain elongation. This approach requires knowledge of some
sequence information to design the primer. The sequence obtained from the first reaction is
used to design the primer for the next reaction and these steps are repeated until the
complete sequence is determined. Thus, a primer-based strategy involves repeated sequencing
steps from known into unknown DNA regions; this process minimizes redundancy, and it does not
require additional cloning steps. However, this strategy requires the synthesis of a new
primer for each round of sequencing.

The necessity of designing and synthesizing new primers, coupled with the expense and the
time required for their synthesis, has limited the routine application of primer-walking for
sequencing large DNA fragments. Researchers have proposed using a library of short primers to
eliminate the requirement for custom primer synthesis. The availability of a primer library
would minimize waste of primer, since each primer could be used to prime multiple reactions,
and would allow immediate access to the next sequencing primer.

Prospects

One of the original goals of the Human Genome Project was to complete sequence determination
of the entire human genome by 2005. However, the project is ahead of schedule and it is
expected to produce a ‘working draft’ of the human genome by 2001. The completed genome
sequence is expected by 2003, at least two years ahead of schedule. Technological advances
are responsible for the rapid progress of this ambitious project. Progress in all aspects
involving DNA manipulation (especially manipulation and propagation of large DNA fragments),
evolution of faster and better DNA sequencing methods, development of computer hardware and
software capable of manipulating and analysing the data (bioinformatics), and automation of
procedures associated with generating and analysing DNA sequences are responsible for this
acceleration.
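
As a rough illustration of the computer assembly step in the random (shotgun) strategy described under Genome Sequencing, the following Python sketch merges overlapping reads greedily, always joining the pair with the largest overlap. It is a toy under stated assumptions, not the method used by any real assembly program: the function names (overlap, drop_contained, greedy_assemble), the min_len threshold and the example reads are all hypothetical, and the reads are assumed to be error-free and to come from a single strand. When more than one contig is returned, no usable overlaps were left, which corresponds to the gaps that a directed (primer-walking) strategy would then be used to fill.

```python
from itertools import permutations


def overlap(left: str, right: str, min_len: int) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if the best overlap is shorter than `min_len` (an assumed cutoff).
    """
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def drop_contained(seqs: list[str]) -> list[str]:
    """Remove duplicates and sequences that lie entirely inside another one."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != o and s in o for o in unique)]


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap.

    Returns the remaining contigs; more than one contig means gaps remain.
    """
    contigs = drop_contained(reads)
    while len(contigs) > 1:
        best_size, best_pair = 0, None
        for a, b in permutations(range(len(contigs)), 2):
            size = overlap(contigs[a], contigs[b], min_len)
            if size > best_size:
                best_size, best_pair = size, (a, b)
        if best_pair is None:
            break  # no overlap of at least min_len bases is left
        a, b = best_pair
        merged = contigs[a] + contigs[b][best_size:]
        rest = [c for i, c in enumerate(contigs) if i not in (a, b)]
        contigs = drop_contained(rest + [merged])
    return contigs


if __name__ == "__main__":
    # Hypothetical reads sampled from the made-up fragment ATGCGTACGTTAGC.
    reads = ["GTACGTTA", "GTTAGC", "ATGCGTAC"]
    print(greedy_assemble(reads))
    # Two reads that do not overlap by at least 3 bases stay as separate contigs.
    print(greedy_assemble(["ATGCGTAC", "TTAGCCAT"]))
```

With the three example reads the sketch rebuilds the single contig ATGCGTACGTTAGC, whereas the second call leaves two contigs separated by a gap. A greedy merge is chosen here only because it is short and easy to follow; it can join repeated regions incorrectly.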