The document discusses next-generation sequencing (NGS) technologies and their methodologies, highlighting advancements from first to third-generation sequencing. It details various sequencing platforms, their mechanisms, and the evolution of assembly algorithms for reconstructing genomes from short reads. Additionally, it addresses challenges in genome assembly and provides an overview of algorithmic approaches to improve accuracy and efficiency in the assembly process.

Uploaded by

guanngom.phoatt.uong

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

100% found this document useful (8 votes)

79 views

Next Generation Sequencing and Sequence Assembly Methodologies and Algorithms pdf epub

Uploaded by

guanngom.phoatt.uong

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 14

Next Generation Sequencing and Sequence Assembly

Methodologies and Algorithms


Ali Masoudi-Nejad, Zahra Narimani, Nazanin Hosseinkhan

Ali Masoudi-Nejad
Laboratory of Systems Biology and Bioinformatics (LBB)
Institute of Biochemistry and Biophysics
University of Tehran
Tehran
Iran

Nazanin Hosseinkhan
Laboratory of Systems Biology and Bioinformatics (LBB)
Institute of Biochemistry and Biophysics
University of Tehran
Tehran
Iran

Zahra Narimani
Laboratory of Systems Biology and Bioinformatics (LBB)
Institute of Biochemistry and Biophysics
University of Tehran
Tehran
Iran

ISSN 2193-4746 ISSN 2193-4754 (electronic)

ISBN 978-1-4614-7725-9 ISBN 978-1-4614-7726-6 (eBook)
DOI 10.1007/978-1-4614-7726-6
Springer New York Heidelberg Dordrecht London

Library of Congress Control Number: 2013938267

© The Author(s) 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed. Exempted from this legal reservation are brief
excerpts in connection with reviews or scholarly analysis or material supplied specifically for the
purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the
work. Duplication of this publication or parts thereof is permitted only under the provisions of
the Copyright Law of the Publisher’s location, in its current version, and permission for use must
always be obtained from Springer. Permissions for use may be obtained through RightsLink at the
Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt
from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Dedicated to our loving family
Preface

DNA sequencing is a fast-moving science with technologies and platforms being
updated at breathtaking speed. The hallmark of next generation sequencing (NGS)
has been a massive increase in throughput and a decrease in price compared with
previous technologies. The first next-generation DNA sequencing machine was
introduced to the market by 454 Life Sciences (Basel, Switzerland) in 2005. The
technology is based on a large-scale parallel pyrosequencing system, which relies
on fixing nebulized and adapter-ligated DNA fragments to small DNA-capture
beads in a water-in-oil emulsion. Illumina’s (CA, USA) Genome Analyzer
was released in 2007 and marked a true revolution for genome sequencing in
which short reads became significant to genomic applications. The technology is
based on reversible dye terminators. DNA molecules are first attached to primers
on a slide and amplified so that local clonal colonies are formed. Life Technologies’
(CA, USA) SOLiD™ technology employs sequencing by ligation. In this
technology, a pool of all possible oligonucleotides of a fixed length is labeled
according to the sequenced position. Oligonucleotides are annealed and ligated;
the preferential ligation by DNA ligase for matching sequences results in a signal
that is informative of the nucleotide at that position.
So-called ‘third-generation’ technologies directly sequence individual DNA
molecules rather than relying on any amplification prior to sequencing. The
recently released PacBio system can produce 35–45 Mb of data per cell with an
average read length of 1,500 bp. The Ion Torrent Personal Genome Machine
(PGM) is another third-generation platform that uses standard sequencing chem-
istry, but with a novel, semiconductor-based detection system. This technology
already claims read lengths of approximately 200 bp with high accuracy, and the
latest PGM 318 chip can produce 1.0 Gb of data in a 2-h run. When the implications
of NGS technology became apparent, several assemblers were designed to
deal with the new problem, i.e., the assembly of short NGS reads in order to
reconstruct the longer original sequences. Assembly can be done either
with a reference genome available (mapping) or without one
(de novo assembly). De novo assembly algorithms, discussed in
more detail in this book, can be classified into three main categories: greedy
algorithms, Overlap-Layout-Consensus (OLC) methods, and de Bruijn graph
approaches. The Euler assembler was the first to employ de Bruijn graphs for
whole genome shotgun (WGS) assembly, and proved capable of assembling
bacterial genomes. Velvet and ALLPATHS improved assembly in terms of speed,
contig and scaffold length, and avoidance of misassembly. ABySS followed the
innovations with de Bruijn methods, but also introduced a distributed represen-
tation of the graph, allowing message passing interface parallelization. The
CABOG and variant MSR-CA pipelines are updates of the Celera overlap-based
assembler designed for a combination of read types, which showed some success
with short-read data for genomes in the 100 Mb range. The String Graph
Assembler (SGA) is the first to make assembly of mammalian-sized genomes
practical using the string graph approach. The current tradeoff
between accuracy and continuity suggests avenues for future improvements in
assembly. There is room for other improvements at the scaffolding stage, where, as
has happened at the assembly stage, we witness a move from naïve and greedy
algorithms to more subtle graph-based techniques.
In this book, we briefly introduce the history of first-, second-, and third-generation
sequencing technologies and describe the drawbacks of the older techniques,
which are no longer suitable because of their cost and because they could not be
fully automated. In Sect. 2 the major NGS methods, namely
Roche/454 FLX, Illumina/Solexa Genome Analyzer, and Applied Biosystems
SOLiD System, among others, are described in detail. The latest and
most prominent NGS technologies, nanopore DNA sequencing and Pacific Biosciences
single molecule real time (SMRT) DNA sequencing, which do not need an
amplification step, are then described. The last subsections of this section are devoted to
sequencing costs, output file formats, a comparison of
methods and their drawbacks, and finally applications of NGS technologies. The
next two sections, Sects. 3 and 4, give an algorithmic
view of the assembly problem. Our main focus in these two sections is on de
novo assembly algorithms for NGS reads. In Sect. 3, we define the
assembly problem in general terms and describe the challenges involved in the assembly process,
including errors propagated from the sequencing process as well as computational
challenges. The appropriate use of paired-end read data, which helps to overcome the
challenges posed by short read lengths, and preprocessing, which helps to
eliminate other issues caused by inaccurate data, are the next topics discussed
in this section. Even when all these techniques are used, errors remain
in the assembly, and assembly algorithms need to be validated in
a standard way; these are the final topics discussed in Sect. 3.
Finally, Sect. 4 gives a precise view of the assembly algorithms: how the
problem can be mapped to a graph and how different kinds of graphs are treated in
finding the solution, which is the final assembled genome. For each of the
assembly approaches, several example algorithms are described in detail, and
finally a comparison of these methods is provided.
Contents

1 Next-Generation Sequencing Methodologies
1.1 Introduction
1.1.1 A Brief History of the Discovery of DNA Structure and Function
1.2 Advent of Sequencing Technologies
1.2.1 First-Generation DNA Sequencers
1.3 Some Drawbacks of the Sanger Technique
1.3.1 Short Size Fragments
1.3.2 Needs for Amplification and Fragment Assembly Steps
1.3.3 Problems with Parallelization
1.3.4 Cost
1.3.5 Need for Complete Automation
References

2 Emergence of Next-Generation Sequencing
2.1 454 Pyrosequencing
2.2 Illumina (Solexa) Genome Analyzer
2.3 Applied Biosystems SOLiD Sequencing
2.4 Ion Semiconductor (Ion Torrent Sequencing)
2.5 Polonator Technology
2.6 Heliscope (Single Molecule Sequencing)
2.7 Latest Developments in Next-Generation Sequencing Methods
2.7.1 Nanopore Sequencing
2.7.2 Single Molecule Real Time DNA Sequencing
2.8 Comparison of Available Next-Generation Sequencing Techniques
2.9 DNA Sequencing Costs
2.10 Sequencing Status
2.11 Shortcoming of NGS Techniques: Short-Reads and Reads Accuracy Issues
2.12 NGS File Formats
2.13 NGS Applications
2.14 Summary
References

3 The Assembly of Sequencing Data
3.1 What is De Novo Genome Sequence Assembly?
3.2 Challenges of Genome Assembly
3.3 Use of Paired-End Reads in the Assembly
3.4 Data Preprocessing Methods and Sequence Read Correction Methods
3.5 Assembly Errors
3.6 Evaluation of Assembly Methods
3.7 Summary
References

4 De Novo Assembly Algorithms
4.1 Mapping Assembly to a Graph Problem
4.1.1 The Overlap Graph Approach
4.1.2 De Bruijn Graph Approach
4.2 Classification of De Novo Assembly Algorithms
4.2.1 Greedy Algorithms
4.2.2 Overlap Layout Consensus (OLC) Algorithms
4.2.3 De Bruijn Graph-Based Algorithms
4.3 Comparison of Algorithms
4.4 Summary
References

Index

Chapter 1
Next-Generation Sequencing
Methodologies

1.1 Introduction

1.1.1 A Brief History of the Discovery of DNA Structure and Function

Although many people believe that the American biologist James Watson and
English physicist Francis Crick were the first to discover DNA in the 1950s, DNA
was actually discovered by the Swiss chemist Friedrich Miescher in the late 1860s
during his attempts to isolate the protein components of leukocytes. When he
isolated a substance that, unlike proteins, was resistant to proteolysis and had
chemical properties different from those of proteins, including a much higher phosphorus
content, he realized that he had discovered a new substance [1]. He called this new
substance “nuclein.”
Miescher’s finding was not considered particularly important until the twentieth
century, when the chemical nature of nuclein was studied by the Russian bio-
chemist Phoebus Levene. He was the first to discover: (1) the order of three major
components of a single nucleotide (phosphate-sugar-base) (Fig. 1.1); (2) the
carbohydrate component of RNA (ribose) and DNA (deoxyribose); and (3) the
way RNA and DNA molecules are put together. In 1919 Levene proposed that
nucleic acids were composed of a series of nucleotides and that each nucleotide
was in turn composed of one of four nitrogen-containing bases, a sugar
molecule, and a phosphate group.
Erwin Chargaff, an Austrian biochemist, continued these studies and
uncovered additional details about the structure
of DNA. He reached two major conclusions [3]: First, he stated that the nucleotide
composition of DNA varies among species, and second, he concluded that the
amount of the base adenine (A) is usually similar to the amount of thymine (T);
this is also true about the amount of guanine (G) and cytosine (C). The latter is
known as Chargaff’s rule (Fig. 1.2).
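Chargaff's rule can be checked numerically on a toy example. The following is a minimal sketch, not taken from the book: the function name `duplex_base_counts` and the example strand are hypothetical, and the duplex is assumed to be perfectly base-paired, so every base on one strand faces its complement on the other.

```python
from collections import Counter

# Watson-Crick pairing partners.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def duplex_base_counts(strand):
    """Count bases over both strands of a perfectly paired duplex.

    Only one strand is given; the other is assumed to carry the
    complementary base at every position.
    """
    opposite = Counter(COMPLEMENT[base] for base in strand)
    return Counter(strand) + opposite


counts = duplex_base_counts("ATGCGGCTAAT")
purines = counts["A"] + counts["G"]
pyrimidines = counts["C"] + counts["T"]
print(counts["A"], counts["T"], counts["G"], counts["C"])
print(purines, pyrimidines)
```

In this idealized duplex the A and T counts match, as do the G and C counts, so the purine and pyrimidine totals are also equal. Real measurements, as the text notes, are only "usually similar".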

A. Masoudi-Nejad et al., Next Generation Sequencing and Sequence Assembly,
SpringerBriefs in Systems Biology, DOI: 10.1007/978-1-4614-7726-6_1,
© The Author(s) 2013

Fig. 1.1 Three components of each nucleotide: the nitrogenous base that can basically belong to
two categories (single ring: pyrimidines, or two-linked rings: purines), a pentose sugar (ribose in
RNA and deoxyribose in DNA), and a phosphate group [2]

Fig. 1.2 Chargaff’s rule: the total amount of purines is equal to the total amount of
pyrimidines [2]

Chargaff’s finding that A = T and C = G, along with some vital crystallography
results obtained by the English researchers Rosalind Franklin and Maurice
Wilkins, established a strong basis for the discovery of a three-dimensional,
double-helical model for the structure of DNA proposed by Watson and Crick
(Fig. 1.3).
Each chain of a double-helix DNA molecule is made up of the phosphodiester
links between nucleotides. Two strands of a DNA molecule have different directionality.
The two different ends of a single strand are called 3′ and 5′, and the
direction of DNA synthesis is 5′→3′; this means that the free 3′ hydroxyl (OH)
group from the growing strand of DNA attacks the phosphate on the next base to
be added (Fig. 1.4). Pyrophosphate is released and the new base forms a phosphodiester
bond with the growing strand of DNA. The free 3′ hydroxyl group is
then free to attack the next base to be added. This reaction is catalyzed by DNA
polymerases.
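The 5′→3′ direction of synthesis can be illustrated with a short sketch. It is a simplified model rather than a description of polymerase chemistry: the function name `synthesize` is hypothetical, priming is ignored, and the template is assumed to be copied in full.

```python
# Watson-Crick pairing partners.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize(template_5to3):
    """Return the newly synthesized strand, written 5'->3'.

    The template is read from its 3' end toward its 5' end, and each new
    nucleotide is appended to the 3' end of the growing strand.
    """
    growing = []  # index 0 is the 5' end; the last item carries the free 3'-OH
    for base in reversed(template_5to3):
        growing.append(COMPLEMENT[base])  # extension happens only at the 3' end
    return "".join(growing)


new_strand = synthesize("ATGC")
print(new_strand)  # GCAT
```

Because the two strands are antiparallel, the new strand written 5′→3′ is the reverse complement of the template written 5′→3′.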

Fig. 1.3 Double-helical structure of DNA. The chains of sugar-phosphate groups are linked
together by complementary bases [2]

Fig. 1.4 DNA synthesis direction. The 5′ end of the new nucleotide is linked to
the 3′-OH of the last nucleotide of the growing chain by DNA polymerase
action. During this reaction, a pyrophosphate group is released
[http://www.prism.gatech.edu/~gh19/b1510/dnarep.htm]

1.2 Advent of Sequencing Technologies

Knowing about the order (sequence) of nucleotides in DNA, the molecule in which
the genetic information of all organisms is stored, has revolutionized biology and
resulted in our better understanding of life’s secrets (BBSRC Review of Next-
Generation Sequencing—final version).
The first two DNA sequencing techniques, which are known as first-generation
DNA sequencers, were developed independently by Frederick Sanger (1977,
University of Cambridge) and by Allan Maxam and Walter Gilbert (1976–1977, Harvard
University). Sanger’s method, which earned him a Nobel Prize in
Chemistry in 1980, became popular, and in fact was the dominant method for DNA
sequencing for three decades, as a result of its lesser technical complexity and
lesser amount of toxic chemicals used, compared to the Maxam–Gilbert method,
which was based on the chemical modification of DNA and subsequent cleavage at
specific bases. In the Sanger sequencing method, which is also known as “chain
termination” or the “dideoxy method,” modified nucleotides (fluorescently
labeled dideoxynucleotides) are used in the reaction in addition to normal nucleotides.
This method was gradually improved and became automated (the first
automatic sequencing machine, AB370, was introduced in 1987 by Applied
Biosystems), and it has therefore been the method of choice for large-scale
sequencing projects, e.g., whole-genome sequencing for various species, for about
30 years [4].

1.2.1 First-Generation DNA Sequencers

1.2.1.1 Sanger Sequencing Technology

In classical Sanger sequencing technology, which is a sequencing-by-synthesis
method, the sequencing reaction is performed in the presence of the
single-stranded DNA template, DNA primers, DNA polymerase, four normal DNA
nucleotides, and four fluorescently labeled modified nucleotides (ddATP, ddCTP,
ddGTP and ddTTP).
The DNA template is initially divided into four separate sequencing reactions
containing primers, polymerase and normal nucleotides. Each reaction also contains
a small amount of one of the four modified nucleotides, which lack the 3′-OH
group required for extension. The modified nucleotide is randomly incorporated into the
growing strands, where it terminates DNA elongation and so produces DNA fragments of
various lengths. The DNA fragments are then separated by size through
high-resolution polyacrylamide gel electrophoresis (capillary electrophoresis), with
each of the four reactions run in one of four individual lanes (lanes A, C, G and T).
DNA bands that correspond to DNA fragments of differing lengths are then
visualized using UV light or X-ray autoradiography, and the order of nucleotides
can be determined from the relative positions of the DNA bands among the four
lanes (Fig. 1.5).
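The way a sequence is read off the four lanes can be sketched in a few lines. This is an idealized illustration under stated assumptions, not a model of the chemistry: every possible termination position is assumed to be represented, primer length is ignored, and the function names `terminated_fragment_lengths` and `read_gel` are hypothetical. The example sequence is the one shown in Fig. 1.5.

```python
def terminated_fragment_lengths(new_strand):
    """Map each ddNTP lane to the lengths of the fragments it terminates.

    A fragment of length n appears in lane X when position n of the newly
    synthesized strand is base X, because the ddX nucleotide stopped
    elongation at that position.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence by visiting bands from smallest to largest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = terminated_fragment_lengths("TCGAAGACGTATC")
print(lanes["G"])       # fragment lengths seen in the ddGTP lane
print(read_gel(lanes))  # sequence recovered from the band order
```

Each fragment length occurs in exactly one lane, so sorting the bands by size and noting the lane of each band recovers the order of nucleotides.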

1.2.1.2 Maxam-Gilbert Chemical Degradation DNA Sequencing Technique

The Maxam-Gilbert technique relies on the cleaving of nucleotides by chemicals
and is most efficient with small nucleotide polymers (Fig. 1.6). Chemical treatment
generates breaks at a small proportion of one or two of the four nucleotide
bases in each of four reactions (G, A + G, C, C + T). Due to the advancements in
chain termination methodology, the Maxam-Gilbert method has become obsolete:
it is less practical to perform, and it is also
considered unsafe because of the extensive use of toxic chemicals.
[Figure: (a) four reactions with ddGTP, ddATP, ddCTP and ddTTP; (b) gel lanes G, A, C and T, with fragments ordered from largest to smallest; sequence shown: TCGAAGACGTATC]

Fig. 1.5 Sanger sequencing procedure. a Four distinct reactions take place in the presence
of all materials required for DNA synthesis. In addition, a distinct type of
fluorescently labeled dideoxynucleotide is added to each reaction, which, after the DNA synthesis cycles are complete,
yields DNA strands each terminated by the specific dideoxynucleotide present in
that reaction. b After the reactions are complete, the contents of the four separate reactions are electrophoresed
on a high-resolution polyacrylamide gel (www.Wikipedia.org)

As a result of using less toxic chemicals and lower amounts of radioactivity
than the Maxam and Gilbert method, and because of its comparative ease, the
Sanger method was soon automated and was the method used in the first generation
of DNA sequencers.

1.3 Some Drawbacks of the Sanger Technique

1.3.1 Short Size Fragments

The Sanger method can only be performed for DNA fragments with a fairly short
length, i.e., 100–1,000 base pairs. This is due to the limited power of
discrimination between fragment sizes during capillary electrophoresis, which
restricts the size of the DNA that can be reliably sequenced to ~1,000 base pairs
(for larger DNA fragments, longer gels are required). Larger sequences, for
example an entire chromosome, must first be fragmented into smaller pieces and
amplified to obtain a large number of copies of each individual fragment. After
the sequencing reactions are performed, the reads from these fragments must be reassembled to produce
the original sequence.
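The reassembly step can be pictured with a toy greedy merge of overlapping fragments. This is only a sketch under strong assumptions (error-free reads, all from the same strand, no repeats), and it is not the procedure used by any particular assembler; the names `overlap` and `greedy_assemble` and the parameter `min_len` are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 when no overlap of at least min_len bases exists.
    """
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing left to merge; the result stays in several pieces
        merged = frags[best_i] + frags[best_j][best_size:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


contigs = greedy_assemble(["ACGTATC", "TCGAAGA", "AAGACGT"])
print(contigs)
```

With these three overlapping fragments the merge yields a single contig. On real data, repeats and sequencing errors make the greedy choice unreliable, which is one reason more careful assembly methods are needed.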

Fig. 1.6 Maxam-Gilbert chemical degradation sequencing technique. a Double-stranded DNA is
labeled at 5′ ends. b Single-stranded DNA fragment is produced. c DNA fragments are distributed
in four parallel test tubes. Each test tube is subjected to a specific base degrading chemical. The
content of each tube will be electrophoresed in the next step for fragment size separation

1.3.2 Needs for Amplification and Fragment Assembly Steps

The procedure mentioned for fragmentation and amplification can be conducted by
two distinct approaches: map-based sequencing (also known as back-to-back or
hierarchical sequencing) and shotgun sequencing.
The map-based method is accomplished by using a large number of bacterial
artificial chromosomes (BAC) (>20,000), each of which contains a large DNA
fragment (approximately 100 kb), which collectively provide an overlapping
100% (7)
Intermolecular and Surface Forces Revised Edition, 3rd Edition Reference Book Download
14 pages
Cyclic Nucleotide Signaling, 1st Edition Fast eBook Download
100% (9)
Cyclic Nucleotide Signaling, 1st Edition Fast eBook Download
14 pages
Thyroid Cancer - 1st Edition Scribd Full Download
100% (5)
Thyroid Cancer - 1st Edition Scribd Full Download
15 pages
Infections in Hematology Best Quality Download
100% (3)
Infections in Hematology Best Quality Download
16 pages
Genetics, Health Care and Public Policy An Introduction to Public Health Genetics 1st Edition Premium eBook Download
100% (6)
Genetics, Health Care and Public Policy An Introduction to Public Health Genetics 1st Edition Premium eBook Download
14 pages
Computational Network Theory Theoretical Foundations and Applications Theoretical Foundations and Applications, 1st Edition Full-Feature Download
100% (4)
Computational Network Theory Theoretical Foundations and Applications Theoretical Foundations and Applications, 1st Edition Full-Feature Download
16 pages
Hematopoietic Stem Cell Transplantation, 1st Edition Digital EPUB Download
100% (6)
Hematopoietic Stem Cell Transplantation, 1st Edition Digital EPUB Download
15 pages
Where can buy Algorithms for next generation sequencing 1st Edition Sung ebook with cheap price
100% (5)
Where can buy Algorithms for next generation sequencing 1st Edition Sung ebook with cheap price
60 pages
Early Life Origins of Health and Disease - 1st Edition Digital Download
100% (9)
Early Life Origins of Health and Disease - 1st Edition Digital Download
16 pages
Flawed Convictions "Shaken Baby Syndrome" and the Inertia of Injustice Multiformat Download
100% (6)
Flawed Convictions "Shaken Baby Syndrome" and the Inertia of Injustice Multiformat Download
17 pages
Stories of Sickness 2nd Edition full download
100% (7)
Stories of Sickness 2nd Edition full download
17 pages
Growth Hormone Deficiency Physiology and Clinical Management Research PDF Download
100% (5)
Growth Hormone Deficiency Physiology and Clinical Management Research PDF Download
15 pages
JIMD Reports, Volume 18 All Sections Download
100% (6)
JIMD Reports, Volume 18 All Sections Download
14 pages
Most Downloaded Foolproof Preserving and Canning A Guide to Small Batch Jams, Jellies, Pickles, and Condiments pdf docx
100% (13)
Most Downloaded Foolproof Preserving and Canning A Guide to Small Batch Jams, Jellies, Pickles, and Condiments pdf docx
16 pages
1 s2.0 S001393510500112X Main
No ratings yet
1 s2.0 S001393510500112X Main
8 pages
Biochem Module 1 7 Reviewer
No ratings yet
Biochem Module 1 7 Reviewer
18 pages
Alkanes: Alkanes Alkanes Alkenes Hydrocarbons As Fuels Arenes
No ratings yet
Alkanes: Alkanes Alkanes Alkenes Hydrocarbons As Fuels Arenes
23 pages
Haier JC 160gd
No ratings yet
Haier JC 160gd
15 pages
UPVC Submittal
No ratings yet
UPVC Submittal
19 pages
Product List (For Export)
No ratings yet
Product List (For Export)
9 pages
Description Dr. Fixit Pidifin 2K
No ratings yet
Description Dr. Fixit Pidifin 2K
4 pages
BC 34.1 E2
No ratings yet
BC 34.1 E2
7 pages
Week#1 - Intro To Petro Global Market
No ratings yet
Week#1 - Intro To Petro Global Market
39 pages
Molykote Longterm2 Plus Grease MSDS
No ratings yet
Molykote Longterm2 Plus Grease MSDS
8 pages
Thesis
No ratings yet
Thesis
57 pages
CurcuminReviewrevised2015 04 17
No ratings yet
CurcuminReviewrevised2015 04 17
18 pages
SAIC-A-2008 Rev 6 Verify Test Medium For Hydrostatic Testing and Lay Up
No ratings yet
SAIC-A-2008 Rev 6 Verify Test Medium For Hydrostatic Testing and Lay Up
2 pages
Atomic Structure of Group 7 Elements
No ratings yet
Atomic Structure of Group 7 Elements
6 pages
Questions For Polar Bear Cartoon
No ratings yet
Questions For Polar Bear Cartoon
3 pages
DPP - 07 - Substitution Reaction
No ratings yet
DPP - 07 - Substitution Reaction
5 pages
Acme Generics LLP
100% (2)
Acme Generics LLP
7 pages
IVAPM Dosage Chart 2017
No ratings yet
IVAPM Dosage Chart 2017
2 pages
Yue Et Al., 2021 (Organic Geochemistry)
No ratings yet
Yue Et Al., 2021 (Organic Geochemistry)
14 pages
191 - 2007 - Copper
No ratings yet
191 - 2007 - Copper
14 pages
HRC 13 Info Sheet #1
No ratings yet
HRC 13 Info Sheet #1
13 pages
Safety Data Sheet: SECTION 1: Identification of The Substance/mixture and of The Company/undertaking
No ratings yet
Safety Data Sheet: SECTION 1: Identification of The Substance/mixture and of The Company/undertaking
11 pages
asish2_merged (2)
No ratings yet
asish2_merged (2)
27 pages
Hardener 450
No ratings yet
Hardener 450
7 pages
MSDS W2600 Tape Edge Sealer
No ratings yet
MSDS W2600 Tape Edge Sealer
21 pages
US10465235_Multiplexed Proximity Ligation Assay
No ratings yet
US10465235_Multiplexed Proximity Ligation Assay
23 pages
Concrete Mix Design With Fly Ash and Silica Fumes
No ratings yet
Concrete Mix Design With Fly Ash and Silica Fumes
8 pages
Preview Farmako Katzung 4
No ratings yet
Preview Farmako Katzung 4
35 pages
HPLC Determination of Fructo-Oligosaccharides in Dairy Products
No ratings yet
HPLC Determination of Fructo-Oligosaccharides in Dairy Products
5 pages

ISSN 2193-4746 ISSN 2193-4754 (electronic)

Library of Congress Control Number: 2013938267

© The Author(s) 2013

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

DNA sequencing is a fast-moving science with technologies and platforms being

whole genome shotgun (WGS) assembly, and proved capable of assembling

1 Next-Generation Sequencing Methodologies

2 Emergence of Next-Generation Sequencing

2.12 NGS File Formats

3 The Assembly of Sequencing Data

4 De Novo Assembly Algorithms

1.1.1 A Brief History of the Discovery of DNA Structure

A. Masoudi-Nejad et al., Next Generation Sequencing and Sequence Assembly, 1

Fig. 1.2 Chargaff’s rule: the

Chargaff’s finding that A = T and C = G, along with some vital crystallog-

Fig. 1.4 DNA synthesis

1.2 Advent of Sequencing Technologies

1.2.1 First-Generation DNA Sequencers

1.2.1.1 Sanger Sequencing Technology

In classical Sanger sequencing technology, which is sequencing by the synthesis

1.2.1.2 Maxam-Gilbert Chemical Degradation DNA Sequencing

The Maxam-Gilbert technique relies on the cleaving of nucleotides by chemicals

ddGTP ddATP ddCTP ddTTP

As a result of using less toxic chemicals and lower amounts of radioactivity

1.3 Some Drawbacks of the Sanger Technique

1.3.1 Short Size Fragments

Fig. 1.6 Maxam-Gilbert chemical degradation sequencing technique. a Double-stranded DNA is

1.3.2 Needs for Amplification and Fragment Assembly Steps

The procedure mentioned for fragmentation and amplification can be conducted by
