1181 (2014) France Denoeud: Science Et Al

Source: Geoffrey Mohan, LA Times; September 4, 2014 "Researchers have pieced together the genetic atlas of the parent of the most commonly cultivated species of coffee plant and uncovered a rather independent streak in its evolution. Coffee developed its caffeine-generating capacity independently from its cousin, cacao, according to the first whole genome study of the plant behind the brew quaffed every morning by about 100 million Americans, published online Thursday in the journal Science. There’s been a lot of genetic sleuthing on coffee, most of it far from the tree. We have a good idea about how caffeine affects animal (particularly human) genes and alters brain chemistry. We know which of our own genes seem to draw us toward consuming coffee, tea or chocolate as well. And there’s also been a heady, if somewhat contradictory, brew of studies purporting to demonstrate caffeine’s beneficial and deleterious effects on humans. But how caffeine production got started has been as hard to see as a spoon in a demitasse of espresso. “Coffee has been kind of an orphan crop," said UC Davis geneticist Juan F. Medrano, who was not involved in the study. "It has been kind of forgotten in terms of DNA research. Perhaps this opens the door to expand that area.” The international team that spent years piecing together coffee's massive genome suggests that a caffeine chemical factory developed independently at least twice, in cacao and coffee, in what's known as convergent evolution. (Koalas and humans, for instance, have fingerprints, and widely divergent animals have developed prickly outsides to protect their gooey insides.) Compared with its close relatives, coffee harbors larger families of the genes linked to aroma and bitterness and has a wider array of genes linked to caffeine production, the study found. How those new genes popped up and proliferated appears to be a series of small, fortuitous accidents, the study suggests. 
Neighboring genes were duplicated by a process roughly equivalent to erratic coding and processing in a computer. Unlike computers, biological systems are ruthless housekeepers, shucking duplicates like excess baggage. Sometimes duplicates develop their own specialty, which appears to be what happened in the case of coffee, the authors suggest. “A small percentage of them survive, either by splitting functions or evolving new ones," said study coauthor Victor A. Albert, an evolutionary biologist at the University at Buffalo, part of the State University of New York. "In the case of caffeine genes, we have a series of duplications that occurred all next to each other, which gave rise to enzymes that catalyze different steps" in caffeine production. Evolution favored caffeine production because the compound repels insects that prey on leaves and halts the germination of seeds from competing plants, giving coffee species a niche in which to thrive. Recent research also has suggested caffeine can help orient beneficial pollinators toward the coffee flower, Albert said. Duplication of an entire genome is thought to be a primary driving force in the rise of new species and the wide diversification of life. But coffee appears to have taken a slower, piecemeal approach of small duplications. That could mean biologists have been underestimating the contribution of narrow, sequential duplication to species diversity, Albert said. The common ancestor of coffee and such plants as cacao, tomatoes, grapes, papaya, soybeans, strawberries, peaches and poplars experienced no such whole-genome duplication. "Yet the coffee family is the fourth-largest family of flowering plants and it’s very diverse in flower, plant and fruit form,” Albert said. 
“Here’s a diversification without a whole genome duplication having stimulated it.” The largely French team of researchers used crushed stems, leaves and flower parts from Coffea canephora, one of the parents of the hybrid Coffea arabica, from w

The coffee genome provides insight into the convergent evolution of caffeine biosynthesis
France Denoeud et al.
Science 345, 1181 (2014); DOI: 10.1126/science.1255274

PLANT GENOMICS
The coffee genome provides insight
into the convergent evolution of
caffeine biosynthesis
France Denoeud,1,2,3 Lorenzo Carretero-Paulet,4 Alexis Dereeper,5 Gaëtan Droc,6 Romain Guyot,7 Marco Pietrella,8 Chunfang Zheng,9 Adriana Alberti,1 François Anthony,5 Giuseppe Aprea,8 Jean-Marc Aury,1 Pascal Bento,1 Maria Bernard,1 Stéphanie Bocs,6 Claudine Campa,7 Alberto Cenci,5,10 Marie-Christine Combes,5 Dominique Crouzillat,11 Corinne Da Silva,1 Loretta Daddiego,12 Fabien De Bellis,6 Stéphane Dussert,7 Olivier Garsmeur,6 Thomas Gayraud,7 Valentin Guignon,10 Katharina Jahn,9,13,14 Véronique Jamilloux,15 Thierry Joët,7 Karine Labadie,1 Tianying Lan,4,16 Julie Leclercq,6 Maud Lepelley,11 Thierry Leroy,6 Lei-Ting Li,17 Pablo Librado,18 Loredana Lopez,12 Adriana Muñoz,19,20 Benjamin Noel,1 Alberto Pallavicini,21 Gaetano Perrotta,12 Valérie Poncet,7 David Pot,6 Priyono,22 Michel Rigoreau,11 Mathieu Rouard,10 Julio Rozas,18 Christine Tranchant-Dubreuil,7 Robert VanBuren,17 Qiong Zhang,17 Alan C. Andrade,23 Xavier Argout,6 Benoît Bertrand,24 Alexandre de Kochko,7 Giorgio Graziosi,21,25 Robert J Henry,26 Jayarama,27 Ray Ming,17 Chifumi Nagai,28 Steve Rounsley,29 David Sankoff,9 Giovanni Giuliano,8 Victor A. Albert,4* Patrick Wincker,1,2,3* Philippe Lashermes5*
Coffee is a valuable beverage crop due to its characteristic flavor, aroma, and the
stimulating effects of caffeine. We generated a high-quality draft genome of the species
Coffea canephora, which displays a conserved chromosomal gene order among asterid
angiosperms. Although it shows no sign of the whole-genome triplication identified in
Solanaceae species such as tomato, the genome includes several species-specific gene
family expansions, among them N-methyltransferases (NMTs) involved in caffeine
production, defense-related genes, and alkaloid and flavonoid enzymes involved in
secondary compound synthesis. Comparative analyses of caffeine NMTs demonstrate that
these genes expanded through sequential tandem duplications independently of genes
from cacao and tea, suggesting that caffeine in eudicots is of polyphyletic origin.
With more than 2.25 billion cups consumed every day, coffee is one of the most important crops on Earth, cultivated across more than 11 million hectares. Coffee belongs to the Rubiaceae family, which is part of the Euasterid I clade and the fourth largest family of angiosperms, consisting of more than 11,000 species in 660 genera (1). We sequenced Coffea canephora (2n = 2x = 22 chromosomes), an outcrossing, highly heterozygous diploid, and one of the parents of C. arabica (2n = 4x = 44 chromosomes), which was derived from hybridization between C. canephora and C. eugenioides (2). A total of 54.4 million Roche 454 single and mate-pair reads and 143,605 Sanger bacterial artificial chromosome–end reads were generated from a doubled haploid accession, representing ~30× coverage of the 710-Mb genome (3). Additional Illumina sequencing data (60×) were used to improve the assembly (table S1) (4). The resulting assembly consists of 25,216 contigs and 13,345 scaffolds with a total length of 568.6 Mb (80% of 710 Mb), including 97 Mb (17%) of intercontig gaps. Eighty percent of the assembly is in 635 scaffolds, and the scaffold N50 (the scaffold size above which 50% of the total length of the sequence assembly can be found) is 1.26 Mb (table S2). A high-density genetic map covering 349 scaffolds and comprising ~64% of the assembly (364 Mb) and 86% of the annotated genes was anchored to the 11 C. canephora chromosomes (4). More than 96% of the scaffolds larger than 1 Mb were anchored (Fig. 1A).
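The N50 definition and the assembly percentages quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `n50` and the toy scaffold lengths are hypothetical, and only the figures 568.6 Mb, 710 Mb, and 97 Mb come from the text.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths (hypothetical, not taken from the paper).
toy_scaffolds = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)

# Fractions reported in the text, recomputed from the quoted sizes (in Mb).
assembled_fraction = 568.6 / 710   # share of the estimated genome that is assembled
gap_fraction = 97 / 568.6          # share of the assembly made of intercontig gaps

print(toy_n50, round(assembled_fraction, 2), round(gap_fraction, 2))
```

With the toy input, the two longest scaffolds already hold more than half of the 150 total units, so the N50 is the length of the second one.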
We annotated 25,574 protein-coding genes (4) (table S6), 92 microRNA precursors, and 2573 organellar-to-nuclear genome transfers (4). Transposable elements account for ~50% of the genome (4), of which ~85% are long terminal repeat (LTR) retrotransposons. Large-scale comparison between C. canephora LTR retrotransposons and those of reference plant genomes shows outstanding conservation of several Copia groups across distantly related genomes, suggesting that horizontal mobile element transfers may be more frequent than generally recognized (5–8).
Structurally, the coffee genome shows no sign of a whole-genome polyploidization in its lineage since the γ triplication at the origin of the core eudicots (9) (Fig. 1B). Coffee contains exactly three paralogous regions for each of the seven pre-γ ancestral chromosomes (Fig. 1B). Coffee chromosomal regions show unique one-to-one correspondences with grapevine chromosomes (Fig. 1C and fig. S12) and a one-to-three correspondence with the tomato genome, which underwent a second lineage-specific triplication during its evolutionary history (10). Although grapevine, a rosid, is the most conservative core eudicot in terms of integrity of gross chromosomal structure, coffee displays less gene-order divergence to all other rosids, despite being an asterid itself (9). Coffee also shows little syntenic divergence relative to other sequenced asterids (Fig. 1D, table S17, and supplementary text).
SCIENCE sciencemag.org 5 SEPTEMBER 2014 VOL 345 ISSUE 6201 1181
1. Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France.
2. CNRS, UMR 8030, CP5706, Evry, France.
3. Université d'Evry, UMR 8030, CP5706, Evry, France.
4. Department of Biological Sciences, 109 Cooke Hall, University at Buffalo (State University of New York), Buffalo, NY 14260, USA.
5. Institut de Recherche pour le Développement (IRD), UMR Résistance des Plantes aux Bioagresseurs (RPB) [Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), IRD, UM2], BP 64501, 34394 Montpellier Cedex 5, France.
6. CIRAD, UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), F-34398 Montpellier, France.
7. IRD, UMR Diversité Adaptation et Développement des Plantes (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France.
8. Italian National Agency for New Technologies, Energy and Sustainable Development (ENEA) Casaccia Research Center, Via Anguillarese 301, 00123 Roma, Italy.
9. Department of Mathematics and Statistics, University of Ottawa, 585 King Edward Avenue, Ottawa, Ontario K1N 6N5, Canada.
10. Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France.
11. Nestlé Research and Development Centre, 101 Avenue Gustave Eiffel, Notre-Dame-d'Oé, BP 49716, 37097 Tours Cedex 2, France.
12. ENEA Trisaia Research Center, 75026 Rotondella, Italy.
13. Center for Biotechnology, Universität Bielefeld, Universitätsstraße 27, D-33615 Bielefeld, Germany.
14. AG Genominformatik, Technische Fakultät, Universität Bielefeld, 33594 Bielefeld, Germany.
15. Institut National de la Recherche Agronomique (INRA), Unité de Recherches en Génomique-Info (UR INRA 1164), Centre de Recherche de Versailles, 78026 Versailles Cedex, France.
16. Department of Biology, Chongqing University of Science and Technology, 4000042 Chongqing, China.
17. Department of Plant Biology, 148 Edward R. Madigan Laboratory, MC-051, 1201 West Gregory Drive, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
18. Departament de Genètica and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Diagonal 643, Barcelona 08028, Spain.
19. Department of Mathematics, University of Maryland, Mathematics Building 084, University of Maryland, College Park, MD 20742, USA.
20. School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward Avenue, Ottawa, Ontario K1N 6N5, Canada.
21. Department of Life Sciences, University of Trieste, Via Licio Giorgieri 5, 34127 Trieste, Italy.
22. Indonesian Coffee and Cocoa Institute, Jember, East Java, Indonesia.
23. Laboratório de Genética Molecular, Núcleo de Biotecnologia (NTBio), Embrapa Recursos Genéticos e Biotecnologia, Final Av. W/5 Norte, Parque Estação Biológica, Brasília-DF 70770-917, Brazil.
24. CIRAD, UMR RPB (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France.
25. DNA Analytica Srl, Via Licio Giorgieri 5, 34127 Trieste, Italy.
26. Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia 4072, Australia.
27. Central Coffee Research Institute, Coffee Board, Coffee Research Station (Post) - 577 117 Chikmagalur District, Karnataka State, India.
28. Hawaii Agriculture Research Center, Post Office Box 100, Kunia, HI 96759-0100, USA.
29. BIO5 Institute, University of Arizona, 1657 Helen Street, Tucson, AZ 85721, USA.
*Corresponding author. E-mail: [email protected] (V.A.A.); [email protected] (P.W.); [email protected] (P.L.)
RESEARCH | REPORTS
To classify gene families in the C. canephora genome, we ran OrthoMCL on inferred protein sequences from coffee, grapevine, tomato, and Arabidopsis (4), generating 16,917 groups of orthologous genes (fig. S5). To examine coffee-specific gene family expansions with potential adaptive value, we fit different branch models implemented in BadiRate (11) to these orthogroups (4). In the coffee lineage, 202 orthogroups clustering 1270 genes were supported as expanded (Akaike information criterion > 2.7). Among gene ontology (GO) terms annotating these, 98 out of 4300 generic terms were significantly over- or underrepresented (table S14). Most GOs enriched in C. canephora (P < 0.05) belonged to two main functional categories: defense response and metabolic process, the latter including different catalytic activities (table S15).
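The model comparison behind the "supported as expanded" call can be sketched as an Akaike information criterion (AIC) difference between two competing branch models. The sketch below is not BadiRate's interface: the function names, the log-likelihoods, and the parameter counts are hypothetical, and treating 2.7 as a cutoff on the AIC difference is an assumption about how the reported threshold was applied.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def supports_expansion(lnl_global, k_global, lnl_branch, k_branch, threshold=2.7):
    """Compare a global-rates model with a model that gives the focal
    (coffee) branch its own turnover rates.

    Returns (is_supported, delta_aic), where delta_aic is how much lower
    the branch-specific model's AIC is. The 2.7 threshold is assumed.
    """
    delta = aic(lnl_global, k_global) - aic(lnl_branch, k_branch)
    return delta > threshold, delta


# Hypothetical fits for one orthogroup (values invented for illustration).
supported, delta = supports_expansion(
    lnl_global=-120.0, k_global=2,
    lnl_branch=-116.0, k_branch=4,
)
print(supported, delta)
```

In this toy case the branch-specific model gains 4 log-likelihood units for 2 extra parameters, so its AIC is lower by 4 and the orthogroup would pass the assumed cutoff.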
Among defense response functions, there is a clear expansion of nucleotide binding site disease-resistance genes (12, 13) in the C. canephora genome (4). Most genes that grouped together within single orthogroups were tandemly arrayed, suggesting that R genes evolved by tandem duplication and divergence of linked gene families (supplementary text). Several gene functions involved in secondary metabolite biosynthesis are significantly expanded in the C. canephora genome, including enzymes associated with the production of phenylpropanoids such as flavonoids and isoflavones (naringenin 3-dioxygenase, isoflavone 2′-hydroxylase), alkaloids (strictosidine synthase, tropine dehydrogenase), monoterpenes (e.g., menthol dehydrogenase), and caffeine [N-methyltransferases (NMTs)] (Fig. 2). For example, indole alkaloids such as the monoamine oxidase inhibitor yohimbine and antimalaria drug quinine are prominent secondary compounds of the coffee family and its parent order, Gentianales (14), and the GO term "indole biosynthetic process" was highly enriched (P < 0.001) in coffee relative to tomato, grapevine, and Arabidopsis.
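A GO-term enrichment P value of the kind quoted above is commonly obtained from a one-sided hypergeometric test. The text does not state which test the authors used, so the sketch below is an assumption, and the gene counts in the example are toy values rather than numbers from the paper.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric P value, P(X >= hits).

    population: genes in the background set
    annotated:  background genes carrying the GO term
    sample:     genes in the set of interest (e.g., expanded families)
    hits:       genes in the set of interest carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy example: 20 genes, 5 carry the term, 4 genes sampled, 2 carry the term.
p_toy = hypergeom_enrichment_p(population=20, annotated=5, sample=4, hits=2)
print(p_toy)
```

A term would be called enriched when this P value falls below the chosen cutoff; in practice a correction for testing many GO terms would normally be applied as well.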
Caffeine is a purine alkaloid synthesized by several eudicot plants, including coffee, cacao (Theobroma cacao), and tea (Camellia sinensis) (Fig. 2). Caffeine is synthesized in both coffee leaves, where it has insecticidal properties (15), and fruits and seeds, where it inhibits seed germination of competing species (16). The late steps in caffeine biosynthesis are mediated by a series of NMTs (Fig. 2A) (17).
Among coffee-expanded genes, NMT activity is one of the more highly enriched GO terms (table S15). A single gene family (ORTHOMCL170) clusters 23 genes in coffee, but none in grapevine, tomato, or Arabidopsis (table S12), and this cluster contains genes encoding known enzymes of the caffeine biosynthetic pathway (18, 19). Maximum likelihood (ML) phylogenetic analysis of ORTHOMCL170 with tea and cacao NMTs that have similar activities reveals species-specific gene clades (Fig. 2C). We analyzed these relationships in a broader evolutionary context by including genome-wide samples of NMTs from coffee, cacao, and other eudicot species. ML trees show that the genes encoding the closest Arabidopsis NMT relatives of coffee caffeine biosynthetic enzymes are involved in benzoic, salicylic, and nicotinic functions (4) (supplementary text). Caffeine biosynthetic NMTs from coffee nested within a gene clade distinct from those of cacao or tea, which group together as sister lineages. Thus, a minimum of two independent origins of caffeine biosynthetic NMT activity can be inferred, as proposed previously (20).

Fig. 1. Structure of the C. canephora genome. (A) Alignment of the pseudochromosome 1 sequence with the genetic map of C. canephora and genomic overview. Correspondences between the genetic linkage map and the DNA pseudomolecule are shown at left (oriented and nonoriented scaffolds are indicated in blue and green, respectively; gray lines denote consistent data; orange lines indicate markers with an approximate genetic location). The relative proportions (percentage of nucleotides) in sliding windows (1-Mb size, 500-kb step) of transposable elements (Copia in red, Gypsy in green) and genes (exons in blue, introns in dark blue) are shown at right. (B) Coffee chromosomal blocks descending from the seven ancestral core eudicot chromosomes. The three paralogous descendants of the seven ancestral chromosomes are shown in shared colors but different textures. (C) Comparison of three grapevine chromosomes (descendants of the prehexaploidization core eudicot chromosome) mapped to a single coffee chromosome and three regions in the tomato genome. (D) Phylogeny and genome duplication history of core eudicots. Arrowheads indicate tetraploidization (blue) or hexaploidization (green) events. Red lines trace lineages of six species that have not undergone further polyploidization. Bar graphs and colors reflect gene-order differences (table S17) between each of the six species (column labels) and the entire set, showing the gene order conservatism of coffee, especially among asterids, and of peach and cacao among rosids.
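The Fig. 1A caption describes feature densities computed in sliding windows of 1 Mb with a 500-kb step. A minimal sketch of that windowing is given below; the function names and the example feature coordinates are hypothetical, and the features are assumed to be non-overlapping half-open intervals.

```python
def sliding_windows(seq_length, size=1_000_000, step=500_000):
    """Yield half-open (start, end) windows along a sequence.
    The last windows are truncated at the sequence end."""
    start = 0
    while start < seq_length:
        yield start, min(start + size, seq_length)
        start += step


def fraction_covered(window, features):
    """Fraction of the window's bases covered by the given features.
    Assumes features are non-overlapping (start, end) intervals."""
    w_start, w_end = window
    covered = sum(
        max(0, min(w_end, f_end) - max(w_start, f_start))
        for f_start, f_end in features
    )
    return covered / (w_end - w_start)


# Toy 2-Mb pseudochromosome with two hypothetical annotated elements.
windows = list(sliding_windows(2_000_000))
elements = [(0, 250_000), (900_000, 1_200_000)]
densities = [fraction_covered(w, elements) for w in windows]
print(windows)
print(densities)
```

Because the step is half the window size, consecutive windows overlap by 500 kb, which smooths the density profile along the chromosome.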
Microsynteny analyses of ORTHOMCL170, which includes three tandem arrays, show that some known and putative coffee caffeine synthase genes (CcXMT, encoding xanthosine N-methyltransferase; CcMTL; and CcNMT3) form a tight assemblage of coexpressed tandem duplicates (Fig. 2D) reminiscent of a metabolic gene cluster (21, 22). Given that some plant metabolic gene clusters are of relatively recent origin (23), we sought to further unravel the role of gene duplication in the expansion of the coffee NMT gene family (Fig. 2D) (supplementary text). The three main coffee NMT clades in ORTHOMCL170 are distributed among a minimum of three genomic blocks; however, some phylogenetically recent tandem duplicates have moved away from their original positions via block rearrangements (Fig. 2D). One such movement involving the putative metabolic cluster appears to have left the CcDXMT gene (encoding 3,7-dimethylxanthine methyltransferase) behind, physically separated from its ancestral tandem array. In cacao, the functionally characterized TcBCS1 gene has a tandem duplicate, but this pair of genes evolved independently from the NMT tandem arrays found in C. canephora (fig. S29). We also examined the role of positive selection (PS) in the evolution of caffeine biosynthesis among coffee, tea, and cacao (4) (supplementary text). We found significant evidence for PS [likelihood ratio test for PAML (Phylogenetic Analysis by Maximum Likelihood) branch-site test, P = 5.78 × 10⁻³ (24)] only for the coffee NMT lineage, indicating that the independent evolution of caffeine biosynthesis in coffee was adaptive and probably involved specific amino acid changes fixed by PS. These results highlight the distinct acquisition of caffeine biosynthesis in the coffee plant, providing an example of convergent evolution of secondary metabolic pathways encoded by tandemly duplicated genes.
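The likelihood ratio test mentioned above compares the log-likelihood of a null model with that of an alternative model allowing positive selection on the focal branch. The sketch below assumes the test statistic is referred to a chi-square distribution with one degree of freedom, a common convention for branch-site tests; the log-likelihood values are hypothetical and are not taken from the paper.

```python
from math import erfc, sqrt


def lrt_pvalue_df1(lnl_null, lnl_alt):
    """Likelihood ratio test with 1 degree of freedom.

    Returns (statistic, p_value), where statistic = 2 * (lnL_alt - lnL_null).
    For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0:
        # The alternative fits no better than the null: no evidence.
        return stat, 1.0
    return stat, erfc(sqrt(stat / 2.0))


# Hypothetical log-likelihoods for the null and branch-site alternative models.
stat, p_value = lrt_pvalue_df1(lnl_null=-5234.10, lnl_alt=-5230.10)
print(stat, p_value)
```

A gain of 4 log-likelihood units gives a statistic of 8, which corresponds to a P value a little below 0.005 under the one-degree-of-freedom assumption.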
Genomic functional diversification via tandem duplication may have helped shape other aspects of coffee bean chemical composition. Linoleic acid, which is produced by the oleate desaturase FAD2, is the major polyunsaturated fatty acid in the coffee bean (25, 26), where it contributes to aroma composition and flavor retention after roasting (4). Coffee has six FAD2 genes compared with one in Arabidopsis, and most of these have arisen from tandem duplications on chromosome 1 (fig. S33). RNA sequencing data suggest transcriptional specialization for two of the six FAD2 copies, with CcFAD2.3 being actively transcribed in developing endosperm (supplementary text). Peak transcript abundance coincides with the dramatic increase in linoleic acid content that occurs during seed development at the perisperm-endosperm transition (27).

Fig. 2. Evolution of caffeine biosynthesis. (A) The principal caffeine biosynthetic pathway. Three methylation steps are necessary to produce caffeine from xanthosine, involving the successive action of three NMTs: xanthosine methyltransferase (XMT), theobromine synthase [7-methylxanthine methyltransferase (MXMT)], and caffeine synthase [3,7-dimethylxanthine methyltransferase (DXMT)]. SAM, S-adenosylmethionine; SAH, S-adenosylhomocysteine. (B) Evolutionary position of caffeine-producing plants with respect to other eudicots (phylogeny adapted from www.mobot.org/MOBOT/research/APweb/). (C) ML phylogeny of coffee, tea, and cacao NMTs. Bootstrap support values (percentages) from 1000 replicates are shown next to relevant clades. Branch lengths are proportional to expected numbers of nucleotide substitutions per site. Colors identify genes assignable to the genomic blocks denoted in (D). (D) (Left) A model summarizing the duplication history of coffee NMT genes, following the phylogeny in (C). Three distinct tandem gene arrays evolved in situ on chromosome 1 from nearby gene duplicates (bold squares). The red and green blocks, colored as in (C), translocated (to chromosome 9) or rearranged (to elsewhere on chromosome 1) from their ancestral locus (blue region), respectively. (Right) Gene orders on modern chromosomes. Translocation of the red block, containing the putative caffeine NMT metabolic cluster, left the phylogenetically derived CcDXMT gene behind. Similarly, CcNMT19 is a derived gene within its own NMT clade that remained in place following movement of the green block. Numbers at branches indicate relative times since major duplication events or diversification times of the tandem arrays, calculated from approximately neutral synonymous substitution rates. (E) Expression profiles (reads per kilobase per million reads mapped) of known Coffea canephora NMTs. The genes in the putative metabolic cluster (along with CcDXMT and CcMXMT) exhibit similar expression patterns, higher in perisperm than endosperm. Data are plotted as log2 values. DAP, days after pollination.
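The three methylation steps named in the Fig. 2A caption can be written down as a small data structure, which makes the enzyme order and the SAM consumption explicit. The class and variable names are hypothetical. The enzyme names and their substrates follow the caption, while the intermediate 7-methylxanthosine and the ribose-removal step come from general biochemistry rather than from this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethylationStep:
    enzyme: str
    substrate: str
    product: str
    methyl_donor: str = "SAM"   # S-adenosylmethionine
    byproduct: str = "SAH"      # S-adenosylhomocysteine


# Order of N-methyltransferase action, as listed in the caption.
# Between steps 1 and 2, ribose is removed from 7-methylxanthosine to give
# 7-methylxanthine; that conversion is not a methylation and is not modeled.
CAFFEINE_PATHWAY = (
    MethylationStep("XMT", "xanthosine", "7-methylxanthosine"),
    MethylationStep("MXMT", "7-methylxanthine", "theobromine (3,7-dimethylxanthine)"),
    MethylationStep("DXMT", "3,7-dimethylxanthine", "caffeine"),
)


def sam_consumed(pathway):
    """Count the methyl groups donated by SAM along the pathway."""
    return sum(1 for step in pathway if step.methyl_donor == "SAM")


for step in CAFFEINE_PATHWAY:
    print(f"{step.enzyme}: {step.substrate} -> {step.product}")
print("SAM molecules consumed:", sam_consumed(CAFFEINE_PATHWAY))
```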
Our analysis of the adaptive genomic landscape of C. canephora identifies the convergent evolution of caffeine biosynthesis among plant lineages and establishes coffee as a reference species for understanding the evolution of genome structure in asterid angiosperms.
REFERENCES AND NOTES
1. E. Robbrecht, J. F. Manen, Syst. Geogr. Plants 76, 85–146 (2006).
2. P. Lashermes et al., Mol. Gen. Genet. 261, 259–266 (1999).
3. M. Noirot et al., Ann. Bot. (London) 92, 709–714 (2003).
4. Materials and methods are available as supplementary materials on Science Online.
5. S. Schaack, C. Gilbert, C. Feschotte, Trends Ecol. Evol. 25, 537–546 (2010).
6. A. Roulin et al., BMC Evol. Biol. 9, 58 (2009).
7. M. El Baidouri et al., Genome Res. 24, 831–838 (2014).
8. C. Moisy, A. H. Schulman, R. Kalendar, J. P. Buchmann, F. Pelsy, Theor. Appl. Genet. 127, 1223–1235 (2014).
9. O. Jaillon et al., Nature 449, 463–467 (2007).
10. S. Sato et al., Nature 485, 635–641 (2012).
11. P. Librado, F. G. Vieira, J. Rozas, Bioinformatics 28, 279–281 (2012).
12. S. H. Hulbert, C. A. Webb, S. M. Smith, Q. Sun, Annu. Rev. Phytopathol. 39, 285–312 (2001).
13. L. McHale, X. Tan, P. Koehl, R. W. Michelmore, Genome Biol. 7, 212 (2006).
14. F. Gleason, R. Chollet, Plant Biochemistry (Jones and Bartlett, Sudbury, MA, 2011).
15. J. A. Nathanson, Science 226, 184–187 (1984).
16. A. Pacheco, J. Pohlan, M. Schulz, Allelopathy J. 21, 39–56 (2008).
17. H. Ashihara, H. Sano, A. Crozier, Phytochemistry 69, 841–856 (2008).
18. A. A. McCarthy, J. G. McCarthy, Plant Physiol. 144, 879–889 (2007).
19. M. Ogawa, Y. Herai, N. Koizumi, T. Kusano, H. Sano, J. Biol. Chem. 276, 8213–8218 (2001).
20. E. Pichersky, E. Lewinsohn, Annu. Rev. Plant Biol. 62, 549–566 (2011).
21. B. Field, A. E. Osbourn, Science 320, 543–547 (2008).
22. M. Matsuno et al., Science 325, 1688–1692 (2009).
23. B. Field et al., Proc. Natl. Acad. Sci. U.S.A. 108, 16116–16121 (2011).
24. J. Zhang, R. Nielsen, Z. Yang, Mol. Biol. Evol. 22, 2472–2479 (2005).
25. D. Villarreal et al., J. Agric. Food Chem. 57, 11321–11327 (2009).
26. S. Dussert, A. Laffargue, A. de Kochko, T. Joët, Phytochemistry 69, 2950–2960 (2008).
27. T. Joët et al., New Phytol. 182, 146–162 (2009).
ACKNOWLEDGMENTS
We acknowledge the following sources for funding: ANR-08-GENM-022-001 (to P.L.); ANR-09-GENM-014-002 (to P.W.); Australian Research Council (to R.J.H.); Natural Sciences and Engineering Research Council of Canada (to D.S.); CNR-ENEA Agrifood Project A2 C44 L191 (to G.Gi.); FINEP-Qualicafé, INCT-CAFÉ (to A.C.A.); NSF grants 0922742 (to V.A.A.) and 0922545 (to R.M.); and the College of Arts and Sciences, University at Buffalo (to V.A.A.). We thank P. Facella (ENEA) for Roche 454 sequencing and Instituto Agronômico do Paraná (Paraná, Brazil) for fruit RNA. This work was supported by the high-performance cluster of the SouthGreen Bioinformatics platform (UMR AGAP) CIRAD (www.southgreen.fr). The C. canephora genome assembly and gene models are available on the Coffee Genome Hub (http://coffee-genome.org) and the CoGe platform (www.genomevolution.org). Sequencing data are deposited in the European Nucleotide Archive under the accession numbers CBUE020000001 to CBUE020025216 (contigs), HG739085 to HG752429 (scaffolds), and HG974428 to HG974439 (chromosomes). Gene family alignments and phylogenetic trees for BAHD acyltransferases and NMTs are available in the GreenPhylDB (www.greenphyl.org/cgi-bin/index.cgi) under the gene family IDs CF158535 and CF158539 to CF158545, respectively. We declare no competing financial interests.
SUPPLEMENTARY MATERIALS
www.sciencemag.org/content/345/6201/1181/suppl/DC1
Materials and Methods
Supplementary Text
Figs. S1 to S33
Tables S1 to S27
References (28–175)
28 April 2014; accepted 29 July 2014
10.1126/science.1255274
GENOME EDITING
Prevention of muscular dystrophy in mice by CRISPR/Cas9–mediated editing of germline DNA
Chengzu Long,1* John R. McAnally,1* John M. Shelton,2 Alex A. Mireault,1 Rhonda Bassel-Duby,1 Eric N. Olson1†
Duchenne muscular dystrophy (DMD) is an inherited X-linked disease caused by mutations in the gene encoding dystrophin, a protein required for muscle fiber integrity. DMD is characterized by progressive muscle weakness and a shortened life span, and there is no effective treatment. We used clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9)–mediated genome editing to correct the dystrophin gene (Dmd) mutation in the germline of mdx mice, a model for DMD, and then monitored muscle structure and function. Genome editing produced genetically mosaic animals containing 2 to 100% correction of the Dmd gene. The degree of muscle phenotypic rescue in mosaic mice exceeded the efficiency of gene correction, likely reflecting an advantage of the corrected cells and their contribution to regenerating muscle. With the anticipated technological advances that will facilitate genome editing of postnatal somatic cells, this strategy may one day allow correction of disease-causing mutations in the muscle tissue of patients with DMD.
Duchenne muscular dystrophy (DMD) is caused by mutations in the gene for dystrophin on the X chromosome and affects approximately 1 in 3500 boys. Dystrophin is a large cytoskeletal structural protein essential for muscle cell membrane integrity. Without it, muscles degenerate, causing weakness and myopathy (1). Death of DMD patients usually occurs by age 25, typically from breathing complications and cardiomyopathy. Hence, therapy for DMD necessitates sustained rescue of skeletal, respiratory, and cardiac muscle structure and function. Although the genetic cause of DMD was identified nearly three decades ago (2), and several gene- and cell-based therapies have been developed to deliver functional Dmd alleles or dystrophin-like protein to diseased muscle tissue, numerous therapeutic challenges have been encountered, and no curative treatment exists (3).
RNA-guided, nuclease-mediated genome editing, based on type II CRISPR (clustered regularly interspaced short palindromic repeat)/Cas (CRISPR-associated) systems, offers a new approach to alter the genome (4–6). In brief, Cas9, a nuclease guided by single-guide RNA (sgRNA), binds to a targeted genomic locus next to the protospacer adjacent motif (PAM) and generates a double-strand break (DSB). The DSB is then repaired either by nonhomologous end-joining (NHEJ), which leads to insertion/deletion (indel) mutations, or by homology-directed repair (HDR), which requires an exogenous template and can generate a precise modification at a target locus (7). Unlike other gene therapy methods, which add a functional, or partially functional, copy of a gene to a patient's cells but retain the original dysfunctional copy of the gene, this system can remove the defect. Genetic correction using engineered nucleases (8–12) has been demonstrated in immortalized myoblasts derived from DMD patients in vitro (9), and rodent models of rare diseases (13), but not yet in animal models of relatively common and currently incurable diseases, such as DMD.
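The requirement that Cas9 bind next to a PAM can be illustrated by scanning a DNA sequence for candidate target sites. The sketch below assumes the NGG PAM and 20-nucleotide protospacer typical of Streptococcus pyogenes Cas9; this excerpt does not state the PAM sequence or guide length used, so both are assumptions, and the function name and example sequence are hypothetical. Only the forward strand is scanned.

```python
import re


def find_spcas9_targets(seq, protospacer_len=20):
    """Return (position, protospacer, pam) for every forward-strand site
    where a protospacer is immediately followed by an NGG PAM.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical sequence: 20 nt of protospacer followed by a TGG PAM.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for position, protospacer, pam in find_spcas9_targets(example):
    print(position, protospacer, pam)
```

A real guide-design workflow would also scan the reverse complement and screen candidates for off-target matches, which this sketch leaves out.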
The objective of this study was to correct the genetic defect in the Dmd gene of mdx mice by CRISPR/Cas9–mediated genome editing in vivo. The mdx mouse (C57BL/10ScSn-Dmd^mdx/J) contains a nonsense mutation in exon 23 of the Dmd gene (14, 15) (Fig. 1A). We injected Cas9, sgRNA, and HDR template into mouse zygotes to correct the disease-causing gene mutation in the germ line (16, 17), a strategy that has the potential to correct the mutation in all cells of the body, including myogenic progenitors. Safety and efficacy of CRISPR/Cas9–based gene therapy was also evaluated.
1. Department of Molecular Biology and Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.
2. Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.
*These authors contributed equally to this work. †To whom correspondence should be addressed. E-mail: eric.olson@utsouthwestern.edu