7a Genomics 2-24 PDF

The document provides a brief overview of the evolution and history of DNA sequencing technologies. It discusses early methods from the 1970s like chromatography and Sanger dideoxy sequencing. It then covers major developments like the first genome sequenced in 1977 (φX174), large scale automated sequencing in the 1990s, the first human genome draft in 2001, and the introduction of next generation sequencing technologies starting in the 2000s including Illumina, Ion Torrent, and PacBio.


All science is either physics or stamp collecting...

- Ernest Rutherford
As quoted in Rutherford at Manchester

Data! Data! Data! ...
I can't make bricks without clay.

- Sherlock Holmes
The Adventure of the Copper Beeches
http://www.perkydesigns.com

ACGTGACTGAGGACCGTG
CGACTGAGACTGACTGGGT
CTAGCTAGACTACGTTTTA
TATATATATACGTCGTCGT
ACTGATGACTAGATTACAG
ACTGATTTAGATACCTGAC
TGATTTTAAAAAAATATT

Evolution of
sequencing

Archaic sequencing methods

Early 70s: chromatography

First nucleotide sequencing

First DNA sequencing

Proc. Nat. Acad. Sci. USA
Vol. 70, No. 12, Part I, pp. 3581-3584, December 1973

The Nucleotide Sequence of the lac Operator

(regulation/protein-nucleic acid interaction/DNA-RNA sequencing/oligonucleotide priming)

WALTER GILBERT AND ALLAN MAXAM

Department of Biochemistry and Molecular Biology, Harvard University, Cambridge, Massachusetts 02138

Communicated by J. D. Watson, August 9, 1973

ABSTRACT
The lac repressor protects the lac operator
against digestion with deoxyribonuclease. The protected
fragment is double-stranded and about 27 base-pairs
long. We determined the sequence of RNA transcription
copies of this fragment and present a sequence for 24
base pairs. It is:
5'--TGGAATTGTGAGCGGATAACAATT--3'
3'--ACCTTAACACTCGCCTATTGTTAA--5'
The sequence has 2-fold symmetry regions; the two longest
are separated by one turn of the DNA double helix.

The lactose repressor selects one out of six million nucleotide
sequences in the Escherichia coli genome and binds to it to
prevent the expression of the genes for lactose metabolism.
How does this protein, a 150,000-dalton tetramer of identical
subunits, recognize its target? To answer this question we have
determined the sequence of the repressor-binding site: the
operator.
Genetically the operator is the locus of operator constitutive

bind again to the repressor, and is about 27 base-pairs long.

Here we shall describe its sequence.
METHODS
Sonicated DNA Fragments. Sonicated [32P]DNA fragments
were made by growing a temperature-inducible lysogen of
λcI857plac5S7 at 34° in a glucose-50 mM Tris·HCl or TES
(pH 7.4) medium in 3 mM phosphate, heating at 42° for 15
min at a cell density of 4 × 10^8/ml, then washing and resuspending the cells at a density of 8 × 10^8/ml in the same medium with 0.1 mM phosphate. 100 mCi of neutralized H3[32P]O4
was added to 10 ml of cells, and the incorporation was continued for 2 hr at 34°. The cells were washed, suspended in 2
ml of TE buffer [10 mM Tris·HCl (pH 7.5)-1 mM EDTA],
sonicated with six 15-sec bursts, and extracted with phenol.
The aqueous phase was extracted with ether, and the residual
ether was removed with a stream of N2. The mixture of radio-

First Genome Sequence

Sanger dideoxy sequencing

First DNA genome sequenced in 1977: φX174.

1990s: Large scale automated sequencing

Generation 1: Gel based or capillary

First automated sequencing

Capillary Sequencing

1995: Haemophilus influenzae

2001 Human Genome

Human Genome
Not a single individual
Was a hack job
Refined over the next 5 yrs

Reference
assembly

Next generation sequencing

Massively Parallel Signature Sequencing (MPSS)
1990s: created by Lynx Therapeutics, which was later absorbed into Solexa/Illumina

Illumina Video
https://www.youtube.com/watch?v=womKfikWlxM

Early next-gen sequencers

Table 1. Comparison of different sequencing technologies, taken from [34].

Sequencer: ABI 3730; Roche 454; Solexa (a); SOLiD (mp, frag) (b); HeliScope (c)
Read length: 600-900; 400-500; 75-100; 25-35
Run time: 6-10 h; 10 h; 2-10 d; (4-7 d, 8-14 d)
Yield (Mbp): 0.01; 2,300-3,500/d; (500, 1,000); 105-140/h
Cloning bias: Yes
Mate pair information: Yes

a Based on the GA IIx. See full specifications at: http://www.illumina.com/systems/genome_analyzer.ilmn.
b mp, mate pair; frag, fragment. See https://products.appliedbiosystems.com/ SOLiD 3 Plus System.
c See: http://www.helicosbio.com/Products/HelicosregGeneticAnalysisSystem/HeliScopetradeSequencer/tabid/87/Default.aspx.
doi:10.1371/journal.pcbi.1000667.t001

processing the data. See Table 1 for a comparison between the
yield, fragment length, and run times of the different sequencers.
In pyrosequencing methods (Figure 2) [28,29] such as Roche
454 [30], sequencing is performed by polymerase extension of a
primed template. A single nucleotide species is added at each
cycle. If the particular nucleotide species added to the polymerase
reaction pairs with the one on the template, the incorporation
causes a luciferase-based light reaction. The reaction chamber is
then washed, and the cycle repeated. Several hundreds of
thousands of wells containing material for sequencing are typically
used in a single reaction.

Second is the inability to read long
mononucleotide repeats correctly.
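The flow cycle described above can be sketched in a few lines of Python. This is an idealized toy model, not the Roche 454 base caller: the function names, the flow order "TACG", and the assumption that light signal equals the exact number of incorporated bases are all illustrative choices made here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrosequence(template, flow_order="TACG", cycles=10):
    """Simulate idealized pyrosequencing flows over a template strand.

    Returns a list of (nucleotide, signal) pairs. The signal is the number
    of bases incorporated during that flow, standing in for light intensity.
    flow_order and cycles are hypothetical parameters for this sketch.
    """
    pos = 0
    flows = []
    for _ in range(cycles):
        for nuc in flow_order:
            signal = 0
            # The polymerase keeps extending while the flowed nucleotide
            # pairs with the next template base (homopolymer runs add up).
            while pos < len(template) and COMPLEMENT[template[pos]] == nuc:
                signal += 1
                pos += 1
            flows.append((nuc, signal))
            if pos == len(template):
                return flows
    return flows


def call_bases(flows):
    """Turn a flowgram back into the synthesized sequence."""
    return "".join(nuc * signal for nuc, signal in flows)


flows = pyrosequence("AAACG")
print(flows)
print(call_bases(flows))
```

In this model a run of three identical bases gives a signal of exactly 3. On a real instrument the light intensity has to be rounded to an integer, and the relative difference between, say, 7 and 8 incorporations is small, which is one way to see why long mononucleotide repeats are hard to read.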

Metagenomic Sample Coverage

Coverage. Coverage of a genome is defined as the mean
number of times a nucleotide is being sequenced. Thus, 5×
coverage means that each nucleotide in the genome is sequenced a
mean number of five times. If we could sequence a genome in a
single read, then 1× coverage would suffice for sequencing.
Shorter read lengths (25-700, depending on sequencing
technologies, see Table 1), necessitate more coverage, to ensure
all reads overlap, and that those overlaps are unique enough to

Next-gen sequencers
Current fashion:
Illumina
IonTorrent
Around the corner
Real Time (PacBio)
Nanopore (Oxford)

Sequencing Overview
genomic segment
cut many times at
random (Shotgun)

Get one or two reads from each segment

~900 bp

Reconstructing the Sequence

(Fragment Assembly)

reads

Cover region with high redundancy

Overlap & extend reads to reconstruct the original genomic region

Some Terminology

read: a 500-900 long word that comes out of sequencer
mate pair: a pair of reads from two ends of the same insert fragment
contig: a contiguous sequence formed by several overlapping reads with no gaps
supercontig (scaffold): an ordered and oriented set of contigs, usually by mate pairs
consensus sequence: sequence derived from the multiple alignment of reads in a contig

Steps to Assemble a Genome

1. Find overlapping reads
2. Merge some "good" pairs of reads into longer contigs
3. Link contigs to form supercontigs
4. Derive consensus sequence

..ACGATTACAATAGGTT..
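As a rough illustration of steps 1, 2 and 4, here is a toy greedy overlap-and-merge assembler with a majority-vote consensus. It is a sketch under strong assumptions (error-free reads, exact suffix/prefix overlaps, a single strand, pre-aligned reads for the consensus) and is not how Phrap, Celera or Arachne work. Step 3, linking contigs with mate pairs, is left out. All function names and the `min_len` parameter are made up for this example.

```python
from collections import Counter


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        # Step 1: find overlapping reads (all ordered pairs).
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            return contigs  # nothing left to merge
        # Step 2: merge the best pair into a longer contig.
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)]
        contigs.append(merged)


def consensus(aligned_reads):
    """Step 4: majority base per column; '-' marks positions a read misses.

    Assumes every column is covered by at least one read.
    """
    out = []
    for column in zip(*aligned_reads):
        bases = [b for b in column if b != "-"]
        out.append(Counter(bases).most_common(1)[0][0])
    return "".join(out)


print(greedy_assemble(["ACGATTAC", "TTACAATA", "AATAGGTT"]))
print(consensus(["ACGATTAC--", "--GATTACAA", "ACGTTTAC--"]))
```

Greedy merging is easy to follow but can be misled by repeats, where a wrong overlap looks as good as the right one. That is one reason the consensus step uses many overlapping reads per position rather than trusting any single read.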

Definition of Coverage

Length of genomic segment: G
Number of reads: N
Length of each read: L

Definition: Coverage C = NL/G
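A short worked example of the definition, using the same symbols G, N and L. The numbers (a 1,000,000 bp segment and 500 bp reads) are assumptions picked for round arithmetic, and `reads_needed` is a hypothetical helper that simply rearranges C = NL/G.

```python
import math


def coverage(num_reads, read_length, genome_length):
    """C = N * L / G."""
    return num_reads * read_length / genome_length


def reads_needed(target_coverage, read_length, genome_length):
    """Smallest N such that N * L / G >= target coverage."""
    return math.ceil(target_coverage * genome_length / read_length)


G = 1_000_000  # assumed segment length in bp
L = 500        # assumed read length in bp

print(coverage(20_000, L, G))   # coverage from 20,000 reads
print(reads_needed(8, L, G))    # reads required for 8X coverage
```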

How much coverage is enough?

Lander-Waterman model: Prob[ not covered bp ] = e^(-C)
Assuming uniform distribution of reads, C=10 results in 1
gapped region /1,000,000 nucleotides

Draft sequencing of full genome
6 to 8X coverage

SNP finding
>= 20x coverage
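To get a feel for these numbers, the sketch below evaluates e^(-C) for a few coverage values and the Lander-Waterman estimate of the number of gaps, N * e^(-C), in its simplest form (ignoring any minimum overlap needed to join two reads). The segment length of 1,000,000 bp and the read length of 500 bp are assumptions chosen here; with them, C = 10 gives roughly one gap per million nucleotides.

```python
import math


def prob_uncovered(c):
    """Probability that a given base is covered by no read: e^(-C)."""
    return math.exp(-c)


def expected_gaps(num_reads, c):
    """Simplified Lander-Waterman estimate of the number of gaps: N * e^(-C)."""
    return num_reads * math.exp(-c)


G = 1_000_000  # assumed segment length in bp
L = 500        # assumed read length in bp

for c in (6, 8, 10, 20):
    n = c * G // L  # number of reads needed for coverage c
    print(f"C={c:2d}  P(uncovered)={prob_uncovered(c):.2e}  "
          f"uncovered bp={G * prob_uncovered(c):8.1f}  "
          f"expected gaps={expected_gaps(n, c):6.2f}")
```

Real reads are not uniformly distributed, so these values are best read as optimistic lower bounds on how much coverage a project needs.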

Assembly
Join reads to larger sequence: "contigs".

Reference based assembly

De Novo assembly

Publicly available de novo assemblers

Phrap (www.phrap.org)
Celera (wgs-assembler.sf.net)
Paracel (www.paracel.com)
Arachne (ftp://ftp.broadinstitute.org/pub/crd/ARACHNE/)
CAP3 (http://seq.cs.iastate.edu/)

Gene prediction

Evidence based gene calling: BLAST

Ab initio gene calling; no homolog required:
GeneMark, Glimmer, MetaGene.

ORFans
Open Reading Frames (ORFs) with no similarity to any
sequence in the database.
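To make the idea of an open reading frame concrete, here is a naive ORF scan: it looks for an ATG followed in the same frame by a stop codon. This is only a sketch. GeneMark, Glimmer and MetaGene use statistical models of coding sequence rather than a plain scan, and this version checks the forward strand only. The function name and the `min_codons` threshold are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    start is the index of the A in ATG; end is one past the stop codon, so
    seq[start:end] is the ORF including its stop. min_codons counts the
    start and stop codons too.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


dna = "CCATGAAATTTTAGGG"
for start, end, frame in find_orfs(dna):
    print(frame, dna[start:end])
```

An ORF found this way becomes an ORFan only after a database search, for example with BLAST, returns no similar sequence.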

Annotation
Finding function of a gene

Next-gen sequencing
Whole Genome Sequencing
RNA-Seq
Exome
ChIP-Seq
Methylation (Bisulfite sequencing)

Lior Pachter's list

https://liorpachter.wordpress.com/seq/

Personal Genomes

Gonzaga-Jauregui 2012

Personal Genomes

~14.6 million non-redundant SNPs

Each genome vs. reference assembly: ~3.5 million SNPs and ~1000 CNVs
