Pnas 2208261119
Edited by Jack Szostak, Massachusetts General Hospital, Boston, MA; received May 22, 2022; accepted September 20, 2022
The ability of nucleic acids to catalyze reactions (as well as store and transmit information)
is important for both basic and applied science, the first in the context of molecular evolution
and the origin of life and the second for biomedical applications. However, the catalytic power
of standard nucleic acids (NAs) assembled from just four nucleotide building blocks is limited
when compared with that of proteins. Here, we assess the evolutionary potential of libraries of
nucleic acids with six nucleotide building blocks as reservoirs for catalysis. We compare the
outcomes of in vitro selection experiments toward RNA-cleavage activity of two nucleic acid
libraries: one built from the standard four independently replicable nucleotides and the other
from six, with the two added nucleotides coming from an artificially expanded genetic
information system (AEGIS). Results from comparative experiments suggest that DNA libraries
with increased chemical diversity, higher information density, and larger searchable sequence
spaces are one order of magnitude richer reservoirs of molecules that catalyze the cleavage of
a phosphodiester bond in RNA than DNA libraries built from a standard four-nucleotide alphabet.
Evolved AEGISzymes with nitro-carrying nucleobase Z appear to exploit a general acid–base
catalytic mechanism to cleave that bond, analogous to the mechanism of the ribonuclease A family
of protein enzymes and heavily modified DNAzymes. The AEGISzyme described here represents a new
type of catalyst evolved from libraries built from expanded genetic alphabets.

expanded genetic alphabets | RNA-cleaving DNAzymes | in vitro evolution | information density

Significance
The ability of nucleic acids to catalyze reactions is important in the context of the origin of
life and biomedical applications. However, the catalytic power of standard nucleic acids
assembled from just four nucleotide building blocks is limited when compared with that of
proteins. Using an artificially expanded genetic information system (AEGIS) carrying extra
synthetic nucleotides, we show that DNA libraries with increased chemical diversity, higher
information density, and larger searchable sequence spaces are at least one order of magnitude
richer reservoirs of molecules able to catalyze the cleavage of RNA than DNA libraries built
from a standard four-nucleotide alphabet. The AEGISzyme described here represents the first
time that catalysts have been evolved from libraries built from expanded genetic alphabets.

Downloaded from https://www.pnas.org by UNIVERSITETSBIIBL I TRONDHE on November 24, 2022 from IP address 129.241.230.0.

In vitro selection was conceived independently in 1990 by the laboratories of Larry
Gold, Jack Szostak, and Gerald Joyce (1–3) as a way to obtain nucleic acid ligands and
catalysts without the need to command chemical theory sufficient for direct design and
without the directed trial and error that characterizes medicinal chemistry. When the
process is seen to be “selection” without “evolution”, it presumes that a library of DNA
or RNA (nucleic acid [NA]) molecules already contains one or more ligands for the target
receptor or one or more catalysts for the target reaction. The experiment must only
extract the desired ligands or catalysts from the library by exploiting their binding or
catalytic powers. Then, PCR would make them in large amounts. If the presumption
was incorrect, in vitro selection still could deliver, by allowing mutations during PCR
to explore sequences not present in the original library.
Since it was introduced, the technology has successfully delivered aptamers for
many targets. It has also been applied to evolve catalytic molecules (RNAzymes and
DNAzymes), including RNA kinases, ligases, polymerases, and others (4–13).
Nevertheless, despite its success, as the concept developed, the question of initial
library composition, both in the context of prebiotic molecular evolution and biotechnology,
also arose. It was appreciated from the beginning that standard NAs contained
fewer building blocks with less functional diversity than proteins. It was expected that
this might mean that in vitro selection on standard NA platforms would not deliver
performances comparable to those of contemporary proteins.
The last 20+ years of literature are rich with work attempting to understand requirements
related to the composition of libraries built from four standard building blocks
(for a comprehensive review, see ref. (14)). Studies of landscapes with standard RNA
(15, 16) have revealed that, due to frustrated landscapes composed of largely disconnected
islands of active sequences, chance emergence of an active RNA motif made of
four building blocks would be more important in evolution than its optimization by
natural selection, at least in the case of short sequences.
Depending on the specific function and other factors, the probability of finding a
sequence that performs that function at an acceptable level in a pool of NA sequences
with four building blocks ranges from 10⁻⁵ for mere binding to 10⁻³⁰ for more-complex
tasks, such as catalysis (14, 17).

Author affiliations: aFoundation for Applied Molecular Evolution, Alachua, FL 32615; and
bFirebird Biomolecular Sciences, LLC, Alachua, FL 32615
Author contributions: E.B. designed research; C.A.J. and E.B. performed research; S.H.
contributed new reagents/analytic tools; C.A.J., K.M.B., S.A.B., and E.B. analyzed data; and
S.A.B. and E.B. wrote the paper.
The authors declare no competing interest.
This article is a PNAS Direct Submission.
Copyright © 2022 the Author(s). Published by PNAS. This article is distributed under Creative
Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).
1To whom correspondence may be addressed. Email: [email protected].
This article contains supporting information online at
http://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2208261119/-/DCSupplemental.
Published October 24, 2022.
libraries (23–25). Substituting polymerase-based amplification
with ligation, the Liu laboratory was recently able to functionalize
NA polymers with up to 32 variants of the building blocks
(26). Following a different rationale, Gold’s group at SomaLogic
obtained slow off-rate modified aptamers by appending
hydrophobic side chains (benzyl, naphthyl, tryptamino, and
isobutyl) to nucleobases (27, 28).
As an alternative approach, we and others have sought to
improve the intrinsic functional value of NA libraries by
increasing within them the number of independently replicating
building blocks (29–32). This approach offers the possibility
of adding functional groups more “lightly” to library
components, as adding separate building blocks carrying
new functional groups, rather than adding them to existing
nucleobases, overcomes the issue of “overfunctionalization”,
where the NA molecule no longer behaves like an NA molecule
(33). Further, the information density of the library is
increased by the number of building blocks, as reported in
Table 1. This, in turn, may provide more control over folding,
limit the possibility of inactive folds competing energetically
with active folds (34), and allow for new kinds of folds
(30, 35, 36).
Many experiments from the Hirao, Benner, and Romesberg
groups have shown that such addition is possible. The number
of nucleotides in DNA and RNA is not constrained to four
and can be increased to 6, 8, 10, and possibly 12 nucleotides
(37–47). To date, however, expanded genetic alphabets have
been examined only with respect to creating binders, avoiding
the larger challenges presented by catalysis.
In this work, we assess, through in vitro selection, the evolutionary
potential of libraries of NAs with different informational
architectures as reservoirs for catalysis. Specifically, we compare
in vitro selection experiments of two NA libraries carrying different
information density, targeting the cleavage of RNA as a
model reaction (48–50). This reaction has been widely explored
with in vitro selection (for a review, see refs. (51–53)). The
added information density and chemical diversity of the libraries
come by adding two synthetic nucleotides from an artificially
expanded genetic information system (AEGIS) developed in
our laboratory.

Fig. 1. AEGIS and standard components of the libraries used in this study,
showing hydrogen bond donors (in blue) and acceptors (in red).

2 of 11 https://doi.org/10.1073/pnas.2208261119 pnas.org

The 25-nucleotide (nt) length for the random region was
chosen for its being, for standard DNA, the largest library for
which a nearly complete sequence space can be conveniently
investigated with manageable scales of DNA material. Thus, a
library with 25 random nucleotides built from standard nucleotides
has 4²⁵, or ∼10¹⁵, possible sequences (Table 1). This
sequence space can be, in principle, covered by ∼2 nmol of
DNA; this amount would contain, on average, one exemplar of
every possible sequence.
For six-letter AEGIS libraries, 25-nt-long random regions
have 6²⁵ (2.8 × 10¹⁹) different sequences. This would require
47 μmol of library to similarly cover the sequence space. This
scale is not conveniently attainable in the laboratory.
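As an illustrative sketch (not part of the original study), the arithmetic above can be reproduced in a few lines of Python. The function names are hypothetical, and the calculation assumes that "covering" the sequence space means supplying, on average, one molecule per possible sequence.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def sequence_space(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** length


def moles_for_one_copy_each(alphabet_size: int, length: int) -> float:
    """Moles of library holding, on average, one molecule per possible sequence."""
    return sequence_space(alphabet_size, length) / AVOGADRO


for name, letters in (("standard (GACT)", 4), ("AEGIS (GACTZP)", 6)):
    n_sequences = sequence_space(letters, 25)
    nmol = moles_for_one_copy_each(letters, 25) * 1e9
    print(f"{name}: {n_sequences:.3e} sequences, {nmol:.3g} nmol for one copy of each")
```

Under these assumptions, the four-letter library needs roughly 2 nmol and the six-letter library roughly 47 μmol, in line with the figures quoted above.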
This mathematics captures the challenge of comparing
“apples and oranges” outcomes from four-letter and six-letter
evolution experiments. It shows that, with a four-letter, 25-nt
library, the experiment actually seeks pre-existing catalysts
already present in the library; no sequence evolution is possible.
In the second case, only the initial rounds require that the
library contain a selectable catalyst. Then, sequence evolution is
possible and is a mechanism for obtaining receptors/catalysts
that are improved over those originally in the library.
Exploiting this analysis, we chose to prepare 0.5 nmol of a
standard DNA library as the starting point. This theoretically
covered ∼25% of all possible sequences of this 25-nt length.
We also prepared 0.5 nmol of the six-letter AEGIS library.

Fig. 2. AEGIS versus standard RNA-cleaving DNAzyme selection progression.
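The coverage of a 0.5-nmol starting sample can be estimated as follows. This is a minimal sketch with hypothetical helper names. The simple ratio of molecules to possible sequences is the kind of figure quoted in the text; the Poisson estimate of the fraction of distinct sequences actually present is an added assumption (uniform, independent synthesis of every molecule) and is not a calculation reported in the study.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_sequence(nmol: float, alphabet_size: int, length: int) -> float:
    """Average number of molecules per possible sequence (the simple coverage ratio)."""
    return nmol * 1e-9 * AVOGADRO / alphabet_size ** length


def expected_distinct_fraction(nmol: float, alphabet_size: int, length: int) -> float:
    """Expected fraction of sequences present at least once, assuming Poisson sampling."""
    ratio = molecules_per_sequence(nmol, alphabet_size, length)
    return -math.expm1(-ratio)  # 1 - exp(-ratio), stable for very small ratios


for name, letters in (("standard", 4), ("AEGIS", 6)):
    ratio = molecules_per_sequence(0.5, letters, 25)
    distinct = expected_distinct_fraction(0.5, letters, 25)
    print(f"{name}: ratio = {ratio:.3g}, expected distinct fraction = {distinct:.3g}")
```

For the six-letter library the two estimates are practically identical, because almost no sequence is expected to occur twice in so sparse a sample.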
Fig. 3. Sequences and appearance of clusters during AEGIS DNAzyme selection. (A) Schematic representation of cluster sequences. In blue: RNA target;
underlined: primer binding regions. Locations of AEGIS P (green) and Z (red) are marked. (B) Plot of sequence abundance (expressed as No. of reads) versus
selection cycle. (C) Table of cluster appearance color coded by type of cleavage. Blue: type 1, green: type 2, red: type 3 (see text for details), gray: no appar-
ent cleavage.
Deep sequencing after each selection cycle provided a panoramic
view of sequence dynamics during the evolution experiment
(Fig. 3B). Thus, some of the clusters (C1 and C2) were observed
in cycle 1 and were evidently represented in the six-letter library.
Others appeared later, in rounds 4, 6, 7, and 8. These might have
been too infrequent in the starting pool for HTS to capture them
or might have arisen by actual evolution.
The ability of these sequences to increase the rate of cleavage at
those two RNA sites may be compared with other evolved sequences
that do not have this effect on the same RNA target. This is
especially interesting because we observe no obvious sequence similarity
among clusters C2, C6, and C7 compared with sequences
that show specific cleavage (but see below). This effect is likely
due to the local structural environment enforced on these two
ribonucleotides by the specific surrounding sequence; this might
increase the time they spend sampling optimal in-line conformations
for spontaneous nucleophilic attack of the phosphate
by the adjacent 2′-OH. These were not analyzed further in the
present study.
It is noteworthy that the originally evolvable region in clusters
C1 and C2 shared a 14-nt-long sequence containing the
AEGIS nucleotides with 79% sequence identity (52% overall),
where 11 out of 14 nt are identical, with two of the four Zs in
C1 shared by C2 while the other two Zs are not (SI Appendix,
Fig. S6). Since C1 has cleavage activity but C2 does not, this
might indicate that the two internal Zs lacking in C2 but present
in C1 are essential for catalysis. Clusters C8 and C10,
both showing type 2 cleavage, also presented 56% overall
sequence homology in the variable region, with 11 matching
residues, including one Z. Moreover, when C8 and C10
sequences were run through the Dynalign web server (59)
(https://rna.urmc.rochester.edu), which predicts shared secondary
structures for two sequences, this calculated a stable (ΔG =
−19.7 kcal/mol) stem-loop two-dimensional (2D) conformation
common to the two molecules (SI Appendix, Fig. S7).
The two clusters sharing highest overall sequence identity,
68% including two consecutive Zs, were C6 and C7. Neither
of these showed specific cleavage within the RNA substrate.
However, both presented, as mentioned, a level of increased
hydrolysis at U6 and U13.
Interestingly, C1, which displayed a type 1 cleavage and had
four Z nucleotides, was present in the initial library, persisted
during all selection cycles, and was also the most-active cleaver
when tested free in solution (∼27% cleavage after 4 h).
Assuming that the synthesis was random with respect to
catalytic activity, these results offer prima facie evidence that the
six-letter library was a richer reservoir for functional molecules
than the standard four-letter library. Indeed, the density of catalysts
meeting a certain threshold in the AEGIS library is here a
positive number (∼10 in 10¹⁹), and that number is higher than
the density measured in the standard library (0 in 10¹⁵ =
exactly zero).
Enzymatic digestions were performed to assess a 2D conformation
prediction produced with mFold (60) (Fig. 5A). Here,
deoxyribonuclease I (DNase I), which nonspecifically cleaves
DNA, and ribonuclease H (RNase H), which cleaves the RNA
portion in RNA:DNA hybrid helices (61), were used in separate
time courses (Fig. 6). As clearly shown by RNase H digestion
results, part of the RNA substrate is entrapped in a 7-nt-long
double helix (with one G-U wobble), presenting the cleavable
nucleotide to what would appear to be an unstructured region
including the four Zs (Fig. 5A). In the active 3D conformation,
it is conceivable that this region is most likely not unstructured
but folding over the cleavage site in an optimal configuration.
C8 and C10, showing type 2 cleavage, appeared together at
the same cycle and showed high sequence similarity. They had
somewhat different cleavage efficiencies, with ∼21% and 8%
cleaved after 4 h incubation for C8 and C10. Type 3 cleaving
cluster C9 appeared only at later cycles and gave ∼20% cleavage
after a 4-h incubation. These results show that the AEGIS system
actually was evolving during the process to deliver new catalysts
that were not present in the original library.
We then performed a detailed kinetic analysis of AEGISzyme
C1 self-cleavage under selection conditions. Experimental data
showed a bimodal rate profile where the overall maximum
Fig. 5. Cluster 1 AEGISzyme cleavage. (A) Two-dimensional rendering of AEGISzyme C1 generated with mfold (60) and RNA/DNA construct primary
sequence. RNA target in blue, binding sites in green and underlined, Zs in red, and cleavage site in yellow. The RNA:DNA duplex portion of the molecule was
confirmed by enzymatic digestion. (B) Denaturing 18% PAGE of cleavage reaction time course, with time points marked in minutes (‘) and hours (h). RNA
bases and sizes in nucleotides are indicated on the Left. (C) Plot of percent cleavage versus time in minutes. Data were fitted to a double exponential equa-
tion. Data points represent the mean value of at least three replicates.
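The double-exponential model named in the caption can be written out explicitly. The sketch below is illustrative only: the two rate constants are the values reported in the text for C1 (kfast = 0.126 min⁻¹, kslow = 1.65 × 10⁻³ min⁻¹), whereas the split of the ∼55% maximum cleavage into a fast and a slow amplitude is a hypothetical choice, since the individual amplitudes are not given here. Function and constant names are likewise assumptions.

```python
import math


def biphasic_fraction_cleaved(t_min, amp_fast, k_fast, amp_slow, k_slow):
    """Double-exponential cleavage: two populations reacting at different rates."""
    fast = amp_fast * (1.0 - math.exp(-k_fast * t_min))
    slow = amp_slow * (1.0 - math.exp(-k_slow * t_min))
    return fast + slow


def half_life(k_per_min):
    """Half-life in minutes of a first-order process with rate constant k."""
    return math.log(2) / k_per_min


K_FAST = 0.126     # min^-1, reported for C1
K_SLOW = 1.65e-3   # min^-1, reported for C1
AMP_FAST = 0.30    # hypothetical amplitude; only the ~0.55 total is reported
AMP_SLOW = 0.25    # hypothetical amplitude

for t in (0, 5, 30, 240):
    f = biphasic_fraction_cleaved(t, AMP_FAST, K_FAST, AMP_SLOW, K_SLOW)
    print(f"t = {t:>3} min: fraction cleaved = {f:.3f}")
```

Setting the slow amplitude to zero reduces the expression to the single-exponential form that applies to monophasic cleavers.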
Fig. 6. Enzymatic digestion of AEGISzyme C1 in native conditions. Eighteen percent, 7M urea PAGE. Left: time course of DNase I digestion. Right: time course
of RNase H digestion. Time points are reported in minutes (‘). NR: nonreacted molecule. “DNA ladder”: 10-bp DNA ladder (New England Biolabs). “DNase I
DNA ladder”: 10-bp DNA ladder digested with DNase I. Lengths in base pairs for the DNA ladders are in blue. RNA bases and sizes in nucleotides are indi-
cated in black for ladders and DNA:RNA hybrid AEGISzyme C1. G8*: band resulting from the self-cleavage of C1 at its cleavage site during the digestion reac-
tion, which is not a product of DNase I or RNase H digestion. U9**: hot spot for spontaneous hydrolysis of the RNA target, which is not a product of DNase I
or RNase H digestion.
fraction cleaved was ∼55% and the observed rates were kfast =
1.26 × 10⁻¹ min⁻¹ and kslow = 1.65 × 10⁻³ min⁻¹. In contrast,
clusters C8, C9, and C10 showed monophasic cleavage rates
that fit a single exponential. The C8 and C10 representatives,
cleaving at A16, had values for kobs of 0.76 × 10⁻³ and 0.34 ×
10⁻³ min⁻¹, respectively, with maximum cleavage of ∼25% (C8)
and ∼31% (C10). The type 3 AEGISzyme from cluster C9
cleaved at A10 with kobs = 0.83 × 10⁻³ min⁻¹ and maximum cleavage
at ∼23%. The rate of 0.126 min⁻¹ for C1 is fast for a primary
NA enzyme lacking any postselection optimization or
secondary reselection. Typically, first-generation self-cleaving
DNAzymes arising directly from selection, before any optimization,
present rates in the ∼10⁻³ min⁻¹ range or lower (25, 56, 62).
For example, C1’s kobs is at least one order of magnitude faster
than the first-generation cleavers initially obtained by Perrin
and coworkers when using a heavily modified 40N library carrying
RNase A–mimicking modifications (25). Here, we
obtained better rates with just one extra nucleotide in a 25N
evolvable random region.
Standard versions of each AEGISzyme, where every Z was
substituted with C and every P with G, were tested for activity
in parallel with the original AEGISzymes (Fig. 7 and SI
Appendix, Figs. S4 and S5). For every AEGISzyme tested,
removal of the AEGIS components destroyed cleavage activity,
indicating that AEGIS nucleotides were essential for activity.

Fig. 7. (A) Self-cleavage of AEGISzymes C1, C8, C9, and C10. Data points were fitted to a double exponential equation and represent mean values of at least
three replicates. (B) Eighteen percent denaturing PAGE showing an example of kinetic reactions for AEGISzyme C9 versus DNAzyme C9 – no ZPs, where
AEGIS nucleotides were removed. T-OH: partial alkaline hydrolysis of the DNA/RNA ladder hybrid molecule used as forward primer during selection and
here as a ladder.

Magnesium Dependence of Cleavage by AEGISzyme C1. The
dependence of self-cleavage activity of AEGISzyme C1 was
tested at different magnesium concentrations, from 0 mM to
20 mM. Kinetic reactions were followed for 26 h. Results are
shown in SI Appendix, Fig. S8. The highest catalytic rate was
found when [Mg²⁺] = 5 mM. This suggests a requirement of
at least one (but not two) bound Mg²⁺, either catalytic or,
more likely, structural. Further, the bell-shaped curve in SI
Appendix, Fig. S8B suggests that the AEGISzyme can bind with
weaker affinity to additional Mg²⁺ ions to give complexes that
are inactive or have lower activity or where the cation directly
contributes to stabilizing the transition states of the spontaneous
hydrolysis of any of the 12 nt in the target RNA.

pH Dependence of Cleavage by AEGISzyme C1. The dependence
on pH of self-cleavage activity of AEGISzyme C1 was
tested over the range pH 6.0–9.0 (Fig. 8). Here, the log(kobs) values
were seen to increase linearly in the range 6.0–7.0, with a
slope of +0.93, and to decrease in the range 7.0–9.0, with a
slope of −0.233 (SI Appendix, Fig. S9). The bell-shaped curve is
consistent with a kinetic model involving acid–base catalysis with
at least two ionizable species, one acting as proton donor and the
other acting as a proton acceptor. As the data did not fit well to a
simple symmetrical Gaussian equation (SI Appendix, Fig. S9B),
the ionizable species participating in the cleavage likely had two
different pKas, perhaps in the range between 7 and 8 and influenced
by local environments in the active site fold.
This profile is consistent with an SN2-type transesterification
reaction typical of the well-known protein RNase A. It is noteworthy
that the original selection experiment was performed at
pH 7.8, which is the pKa of nucleobase Z free in solution.
Here, it is conceivable that at least two of the Z nucleobases in
AEGISzyme C1 are the ionizable participants in the reaction,
with structurally induced perturbed pKas between ∼7 and 8.
This would reflect an active site analogous to that of RNase A.

Discussion

This work compares libraries built from standard DNA nucleotides
with libraries built from an Artificially expanded genetic
Information System for their ability to deliver molecules that
cleave a phosphodiester linkage within an RNA substrate 12 nt
long (DNAzymes or AEGISzymes, respectively). Here, we
chose to start with 25-nt random sequence libraries made of
either four standard or six standard + AEGIS different building
blocks.
Our working hypothesis premised that laboratory in vitro evolution
applied to the second library would better deliver cleavers
able to survive, be enriched, and evolve during selection than the
first library, even though the second covered only 0.0011%
(0.000011 mol fraction) of the available sequence space (Table 1).
Here, the results were clear. The standard library delivered
no cleavers that survived to be enriched in our selection conditions.
As the original pool covered 25% of the possible sequences,
little evolution was possible in principle.
In contrast, the AEGIS library delivered several catalysts after
only nine cycles of selection. At least one of these appears to
have been present early and likely in the original library.
Assuming that the library samples randomly with respect to
cleavage activity, this implies that the total library would have
Fig. 8. (A) Self-cleavage of AEGISzyme C1 at different pHs. Each data set was fitted with a biphasic, double-exponential equation. Data points are mean
values of two to five replicates. (B) Plot of AEGISzyme C1 log(kfast) versus pH.
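A bell-shaped pH-rate profile of the kind plotted in panel B is commonly rationalized with a two-ionization model, in which the active form requires one group to be deprotonated (general base) and another to be protonated (general acid). The sketch below illustrates that textbook model only; it is not the fit used in the study. The pKa values are hypothetical placeholders chosen within the 7–8 window discussed in the text, and the function names are assumptions.

```python
import math


def relative_rate(ph, pka_base, pka_acid):
    """Fraction of molecules with the base deprotonated and the acid protonated.

    Assumes two independent ionizable groups; the observed rate would be this
    fraction multiplied by an intrinsic rate constant.
    """
    frac_base_deprotonated = 1.0 / (1.0 + 10.0 ** (pka_base - ph))
    frac_acid_protonated = 1.0 / (1.0 + 10.0 ** (ph - pka_acid))
    return frac_base_deprotonated * frac_acid_protonated


def log_slope(ph_low, ph_high, pka_base, pka_acid):
    """Average slope of log10(rate) versus pH between two pH values."""
    low = relative_rate(ph_low, pka_base, pka_acid)
    high = relative_rate(ph_high, pka_base, pka_acid)
    return (math.log10(high) - math.log10(low)) / (ph_high - ph_low)


PKA_BASE = 7.2  # hypothetical
PKA_ACID = 7.8  # hypothetical

for ph in (6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0):
    print(f"pH {ph:.1f}: relative rate = {relative_rate(ph, PKA_BASE, PKA_ACID):.3f}")
```

In this idealized form the curve peaks midway between the two pKa values and its limiting log-slopes are +1 and −1. A shallower descending limb, such as the one reported here, would therefore point to behavior beyond this simple symmetric model.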
strategy used here and in most other studies effectively prevents to the manufacturer protocol and made single stranded with brief alka-
the selection of the best catalysts, as these are lost during selec- line treatments to remove the nonbiotinylated strand. Five milligrams
tion steps at conditions that are optimal for RNA cleavage (how- streptavidin beads (500 μL suspension) were used in cycle 1 for each
ever, as this is true for both selections, it does not affect our library type, and 1.5 mg (150 μL suspension) were used in subsequent
comparative analysis). cycles. This resulted in the single-stranded construct 50 -biotin-DNA(7 nt)-
In at least one case in Perrin’s work on heavily modified RNase- RNA(12 nt)-DNA library (55 nt)-30 being bound to the solid substrate via
like DNAzymes, reselection of 40N evolved molecules with 15% its 50 end and able to be subjected to self-cleavage.
mutagenesis and error-prone PCR was needed to achieve the same (3) A self-cleavage reaction was started by resuspending the library/target-
rates of cleavage that we obtained from crude evolution experi- coated magnetic beads in a solution containing 2 mM MgCl2, 150 mM
NaCl, and 50 mM TrisHCl at pH 7.8 and incubating at 37 °C for 30 min.
ments with just two extra nucleotides (from kobs = ∼102 min1
(4) After the prescribed incubation time, supernatants containing molecules
to kobs fast = 0.21 min1 for Dz7-45–28 (25)).
released from the beads were collected, concentrated, and amplified with
Granted, this postselection strategy on densely functionalized the same primers used as in step 1, at which point a selection cycle was
material also provided a DNAzyme that was metal free with a completed.
kobs fast = 4.9 min1 (Dz7-38-32). Here, each molecule carries a
modification on each of three nucleobases (dAim, dUga, Negative cycles for this selection were embedded in the cycles themselves: for
and dCaa). Thus, the rate of the C1 AEGISzyme (kobs fast of RNA cleavage, a negative cycle would require the incubation of the RNA
0.126 min1) is especially noteworthy, since it requires no post- target–DNA library hybrid at conditions other than the ones set for cleavage
selection and carries only four nitro-containing extra Zs in a and recovery of anything that did not cleave. This exactly was done at every
74-nt molecule. These comparisons also reinforce the point that cycle in this scheme when the single-stranded library construct coupled to
AEGIS-PCR and AEGIS in vitro evolution carry embedded in the beads was washed with NaOH followed by three TrisHCl, pH 7.8 washes
before adding the cleavage buffer and incubating at 37 °C (between steps 2
themselves evolutionary features that standard selection do not
and 3).
present, needing artifices like mutagenesis and error-prone rese- Transliteration and deep sequencing. GACTZP libraries were sequenced using
lection to sample a larger set of sequence space that AEGIS a transliteration method improved from the one described by Yang et al.
in vitro evolution experiments have largely pre-embedded.
Downloaded from https://www.pnas.org by UNIVERSITETSBIIBL I TRONDHE on November 24, 2022 from IP address 129.241.230.0.
(54, 55). Survivors from each cycle were PCR amplified under two separate
conditions that transliterate each Z:P pair into C:G (condition 1) or a 1:1 mix-
Materials and Methods ture of C:G and T:A pairs (condition 2) using an error-prone polymerase meth-
odology developed for this purpose. Tags were then added by 12 cycles of
Materials. Standard oligonucleotides and libraries were purchased from Inte- PCR, and primers included barcodes specific for the two conversion conditions.
grated DNA Technologies. AEGIS nucleotide triphosphates were purchased from The amplicon mixture was purified by native PAGE, recovered by gel extraction,
Firebird Bio. Radiolabeled [α-32P]ATP was from PerkinElmer. Dynal streptavidin- and submitted for deep sequencing. The full-length recovered sequences were
coated magnetic beads (M-270) were from Invitrogen. clustered using a custom algorithm that considers only the variable nucleotide
Polynucleotide kinase, DNase I, RNase H, and relevant buffers were bought region by grouping those with a single base change between sequence reads,
from New England Biolabs. Takara Taq HF polymerase was from Takara. Other starting with the most common read and proceeding toward the least com-
general chemicals were from Sigma-Aldrich and Fisher Scientific. mon, and iterating until all sequences were grouped. Clustered sequences
were then separated into sets by barcode and variable sites compared between
Methods.
Oligonucleotide synthesis. Oligonucleotides and libraries containing AEGIS each set. Sites in the aligned sequences that are consistently G, A, C, and T in
nucleotides were prepared as previously reported (40, 41). The randomized sites both conditions were assigned as G, A, C, and T, respectively. Conversely, we
in the library were prepared by coupling with a 1:1:1:1:1:1 mixture of the six assigned Z residues to sites in the aligned sequences that showed ∼1:1 mix-
(GACTZP) nucleoside phosphoramidites. The synthetic oligonucleotides and tures of C and T in condition 2 and P residues to sites in the aligned sequences
library were purified on denatured polyacrylamide gel electrophoresis (PAGE) that will show ∼1:1 mixtures of G and A.
Resynthesis of cluster representatives. AEGIS DNA/RNA hybrid cluster repre-
(7 M urea) and then desalted using Sep-Pac Plus C18 cartridges (Waters).
In vitro selection (SI Appendix, Fig. S1).
sentative constructs were prepared following either of two procedures: (i) by
AEGIS PCR amplification followed by streptavidin beads–based strand separa-
(1) The starting randomized libraries of interest (25N ACTG or ACTGZP, 2 nmol tion using reverse-complement AEGIS DNA templates synthesized at Firebird
each starting material) were subjected to five cycles of PCR with a forward Biomolecular Sciences, LLC, and primers as during selection but with biotin
primer and a biotinylated reverse primer containing the 12-nt RNA target on the forward strand, with consequent recovery of the single-stranded DNA/
and a 7-nt unstructured DNA spacer between the RNA and a biotin tag. Spe- RNA construct free in solution. These were then 50 -32P labeled and purified by
cifically, the primer construct was composed of, in 50 to 30 orientation, (i) a gel extraction to remove any degradation product due to the hydrolysis of the
biotin molecule; (ii) a 7-nt DNA spacer, with sequence d(GGAAAAA); (iii) the RNA portion during manipulation; (ii) by primer extension with a 50 -32P-
target RNA sequence, r(GUAACUAGAGAU); and (iv) a 15-nt-long DNA labeled DNA/RNA reverse primer, with subsequent strand separation and
sequence complementary to the 30 primer binding region of the starting purification by denaturing PAGE (16%, 7M urea).
library (reverse DNA/RNA hybrid primer). While sequences (i)–(iii) are the same as used in the original Santoro–Joyce selection (49), both 15-nt-long PBSs on the library differed in the two pools, to avoid cross-contamination between selections. The primer pairs were switched in the second reiteration of the standard DNAzyme experiment. Primer extensions, and subsequently PCR amplifications for each cycle, were performed with Takara Taq HS DNA polymerase and a modified buffer, previously shown to be able to (a) read through an RNA/DNA template and (b) efficiently copy AEGIS DNA (40–43). The amplified products were gel purified in 10% native PAGE before subjecting them to treatment with streptavidin magnetic beads. For each starting library in cycle 1, 0.5 nmol of this material was used, corresponding to 25% and 0.0011% sequence space coverage for standard and AEGIS DNA, respectively.
(2) Primer extended/amplified, internally 32P-labeled, double-stranded molecules were then bound to streptavidin-coated magnetic beads according

Enzymatic digestions. Enzymatic digestions of AEGISzyme C1 in native conditions were performed as follows: 5′-[γ-32P]-labeled AEGISzyme C1 was denatured in water at 95 °C for 30 , after which the cleavage buffer was added to 1× final concentration and the tube placed on ice. Following this, enzyme-specific buffers and nucleases were added and a ∼10 s time point collected. The tubes were then transferred to 37 °C. Five-microliter time points were collected at 10 s, 30 s, 1 min, 2 min, 5 min, 8 min, 12 min, 16 min, and 20 min and quenched in 10 μL denaturing gel loading buffer (95% formamide, 10 mM ethylenediaminetetraacetic acid (EDTA), 0.025% bromophenol blue, and 0.025% xylene cyanol). Reaction conditions were as follows. DNase I digestion: 0.2 μM 5′-[γ-32P]-labeled AEGISzyme C1 in 1× cleavage buffer, 0.08 U/μL DNase I, 10 mM Tris-HCl, 2.5 mM MgCl2, 0.5 mM CaCl2, pH 7.6 at 25 °C. RNase H digestion: 0.2 μM 5′-[γ-32P]-labeled AEGISzyme C1 in 1× cleavage buffer, 0.2 U/μL RNase H, 50 mM Tris-HCl, 75 mM KCl, 3 mM MgCl2, 10 mM dithiothreitol, pH 8.3 at 25 °C. Final reaction volumes were 50 μL.
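The sequence-space coverage quoted for the cycle 1 libraries (0.5 nmol of material; 25% and 0.0011% for standard and AEGIS DNA) can be sanity-checked with a short calculation. The sketch below is illustrative only: it assumes a 25-nt randomized region, a length that is not stated in this section, and it treats coverage as the simple ratio of molecules to possible sequences, which is an upper bound because it ignores duplicate sequences in the pool. The function name `coverage` and the constant `RANDOM_LENGTH` are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def coverage(amount_mol, alphabet_size, random_length):
    """Ratio of molecules in the pool to the number of possible sequences.

    This is an upper bound on true coverage: it assumes every molecule
    in the pool carries a distinct sequence.
    """
    molecules = amount_mol * AVOGADRO
    sequence_space = alphabet_size ** random_length
    return molecules / sequence_space

# ASSUMPTION: 25 randomized positions (not given in this section).
RANDOM_LENGTH = 25
AMOUNT_MOL = 0.5e-9  # 0.5 nmol, as stated for each cycle 1 library

for label, letters in (("standard (4 letters)", 4), ("AEGIS (6 letters)", 6)):
    fraction = coverage(AMOUNT_MOL, letters, RANDOM_LENGTH)
    print(f"{label}: {fraction * 100:.4f}% of sequence space")
```

Under these assumptions the ratios come out at roughly 27% for the four-letter library and roughly 0.0011% for the six-letter library, close to the figures quoted in the text. The gap of about four orders of magnitude arises because the six-letter space is (6/4)^25 times larger than the four-letter one.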
1. C. Tuerk, L. Gold, Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249, 505–510 (1990).
2. D. L. Robertson, G. F. Joyce, Selection in vitro of an RNA enzyme that specifically cleaves single-stranded DNA. Nature 344, 467–468 (1990).
3. A. D. Ellington, J. W. Szostak, In vitro selection of RNA molecules that bind specific ligands. Nature 346, 818–822 (1990).
4. D. P. Bartel, J. W. Szostak, Isolation of new ribozymes from a large pool of random sequences [see comment]. Science 261, 1411–1418 (1993).
5. E. Biondi, A. W. R. Maxwell, D. H. Burke, A small ribozyme with dual-site kinase activity. Nucleic Acids Res. 40, 7528–7540 (2012).
6. A. Akoopie, J. T. Arriola, D. Magde, U. F. Müller, A GTP-synthesizing ribozyme selected by metabolic coupling to an RNA polymerase ribozyme. Sci. Adv. 7, eabj7487 (2021).
7. E. H. Ekland, D. P. Bartel, RNA-catalysed RNA polymerization using nucleoside triphosphates. Nature 382, 373–376 (1996).
8. W. K. Johnston, P. J. Unrau, M. S. Lawrence, M. E. Glasner, D. P. Bartel, RNA-catalyzed RNA polymerization: Accurate and general RNA-templated primer extension. Science 292, 1319–1325 (2001).
9. A. Wochner, J. Attwater, A. Coulson, P. Holliger, Ribozyme-catalyzed transcription of an active ribozyme. Science 332, 209–212 (2011).
10. E. H. Ekland, J. W. Szostak, D. P. Bartel, Structurally complex and highly active RNA ligases derived from random RNA sequences. Science 269, 364–370 (1995).
11. J. Attwater, A. Raguram, A. S. Morgunov, E. Gianni, P. Holliger, Ribozyme-catalysed RNA synthesis using triplet building blocks. eLife 7, e35255 (2018).
12. C. P. M. Scheitl, M. G. Maghami, A. K. Lenz, C. Hobartner, Site-specific RNA methylation by a methyltransferase ribozyme. Nature 587, 663–667 (2020).
13. B. Seelig, A. Jäschke, A small catalytic RNA motif with Diels-Alderase activity. Chem. Biol. 6, 167–176 (1999).
14. K. Pobanz, A. Luptak, Improving the odds: Influence of starting pools on in vitro selection outcomes. Methods 106, 14–20 (2016).
15. A. D. Pressman et al., Mapping a systematic ribozyme fitness landscape reveals a frustrated evolutionary network for self-aminoacylating RNA. J. Am. Chem. Soc. 141, 6213–6223 (2019).
16. J. I. Jimenez, R. Xulvi-Brunet, G. W. Campbell, R. Turk-MacLeod, I. A. Chen, Comprehensive experimental fitness landscape and evolutionary network for small RNA. Proc. Natl. Acad. Sci. U.S.A. 110, 14984–14989 (2013).
17. J. R. Lorsch, J. W. Szostak, Chance and necessity in the selection of nucleic acid catalysts. Acc. Chem. Res. 29, 103–110 (1996).
18. T. R. Battersby et al., Quantitative analysis of receptors for adenosine nucleotides obtained via in vitro selection from a library incorporating a cationic nucleotide analog. J. Am. Chem. Soc. 121, 9781–9789 (1999).
19. T. M. Tarasow, B. E. Eaton, Dressed for success: Realizing the catalytic potential of RNA. Biopolymers 48, 29–37 (1998).
20. F. Tolle, G. Mayer, Dressed for success - applying chemistry to modulate aptamer functionality. Chem. Sci. (Camb.) 4, 60–67 (2013).
21. F. Pfeiffert, M. Rosenthal, J. Siegl, J. Ewers, G. Mayer, Customised nucleic acid libraries for enhanced aptamer selection and performance. Curr. Opin. Biotechnol. 48, 111–118 (2017).
22. M. Kimoto, M. Nakamura, I. Hirao, Post-ExSELEX stabilization of an unnatural-base DNA aptamer targeting VEGF165 toward pharmaceutical applications. Nucleic Acids Res. 44, 7487–7494 (2016).
23. S. Paul, A. A. W. L. Wong, L. T. Liu, D. M. Perrin, Selection of M2+-independent RNA-cleaving DNAzymes with side-chains mimicking arginine and lysine. ChemBioChem 23, e202100600 (2022).
24. M. Hollenstein, C. J. Hipolito, C. H. Lam, D. M. Perrin, Toward the combinatorial selection of chemically modified DNAzyme RNase A mimics active against all-RNA substrates. ACS Comb. Sci. 15, 174–182 (2013).
25. Y. Wang, E. Liu, C. H. Lam, D. M. Perrin, A densely modified M2+-independent DNAzyme that cleaves RNA efficiently with multiple catalytic turnover. Chem. Sci. (Camb.) 9, 1813–1821 (2018).
26. Z. Chen, P. A. Lichtor, A. P. Berliner, J. C. Chen, D. R. Liu, Evolution of sequence-defined highly functionalized nucleic acid polymers. Nat. Chem. 10, 420–427 (2018).
27. L. Gold et al., Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS One 5, e15004 (2010).
28. S. Kraemer et al., From SOMAmer-based biomarker discovery to diagnostic and clinical applications: A SOMAmer-based, streamlined multiplex proteomic assay. PLoS One 6, e26332 (2011).
29. J. A. Piccirilli, T. Krauch, S. E. Moroney, S. A. Benner, Enzymatic incorporation of a new base pair into DNA and RNA extends the genetic alphabet. Nature 343, 33–37 (1990).
30. C. Switzer, S. E. Moroney, S. A. Benner, Enzymatic Incorporation of a New Base Pair into DNA and RNA. J. Am. Chem. Soc. 111, 8322–8323 (1989).
31. A. Rich, “On the problems of evolution and biochemical information transfer” in Horizons in Biochemistry, M. Kasha, B. Pullmann, Eds. (Academic Press, New York, 1962), pp. 103–126.
32. G. Zubay, “A case for an additional RNA base pair in early evolution” in The Roots of Modern Biochemistry, H. Kleinkauf, H. von Döhren, L. Jaenicke, Eds. (Walter de Gruyter and Co., Berlin, 1988), pp. 911–916.
33. A. Roychowdhury, H. Illangkoon, C. L. Hendrickson, S. A. Benner, 2′-deoxycytidines carrying amino and thiol functionality: Synthesis and incorporation by Vent (exo-) polymerase. Org. Lett. 6, 489–492 (2004).
34. M. A. Carrigan, A. Ricardo, D. N. Ang, S. A. Benner, Quantitative analysis of a RNA-cleaving DNA catalyst obtained via in vitro selection. Biochemistry 43, 11446–11459 (2004).
35. S. Hoshika et al., “Skinny” and “Fat” DNA: Two new double helices. J. Am. Chem. Soc. 140, 11655–11660 (2018).
36. M. F. Matsuura, H. J. Kim, D. Takahashi, K. A. Abboud, S. A. Benner, Crystal structures of deprotonated nucleobases from an expanded DNA alphabet. Acta Crystallogr. C Struct. Chem. 72, 952–959 (2016).
37. Y. Zhang et al., A semisynthetic organism engineered for the stable expansion of the genetic alphabet. Proc. Natl. Acad. Sci. U.S.A. 114, 1317–1322 (2017).
38. M. Kimoto, R. Yamashige, K. Matsunaga, S. Yokoyama, I. Hirao, Generation of high-affinity DNA aptamers using an expanded genetic alphabet. Nat. Biotechnol. 31, 453–457 (2013).
39. K. I. Matsunaga, M. Kimoto, I. Hirao, High-affinity DNA aptamer generation targeting von Willebrand factor A1-domain by genetic alphabet expansion for systematic evolution of ligands by exponential enrichment using two types of libraries composed of five different bases. J. Am. Chem. Soc. 139, 324–334 (2017).
40. L. Zhang et al., Evolution of functional six-nucleotide DNA. J. Am. Chem. Soc. 137, 6734–6737 (2015).
41. K. Sefah et al., In vitro selection with artificial expanded genetic information systems. Proc. Natl. Acad. Sci. U.S.A. 111, 1449–1454 (2014).
42. L. Zhang et al., Aptamers against cells overexpressing glypican 3 from expanded genetic systems combined with cell engineering and laboratory evolution. Angew. Chem. Int. Ed. Engl. 55, 12372–12375 (2016).
43. E. Biondi et al., Laboratory evolution of artificially expanded DNA gives redesignable aptamers that target the toxic form of anthrax protective antigen. Nucleic Acids Res. 44, 9565–9577 (2016).
44. E. Biondi, S. A. Benner, Artificially Expanded Genetic Information Systems for New Aptamer Technologies. Biomedicines 6, 53 (2018).
45. S. Hoshika et al., Hachimoji DNA and RNA: A genetic system with eight building blocks. Science 363, 884–887 (2019).
46. K. Futami, M. Kimoto, Y. W. S. Lim, I. Hirao, Genetic alphabet expansion provides versatile specificities and activities of unnatural-base DNA aptamers targeting cancer cells. Mol. Ther. Nucleic Acids 14, 158–170 (2019).
47. V. T. Dien et al., Progress toward a semi-synthetic organism with an unrestricted expanded genetic alphabet. J. Am. Chem. Soc. 140, 16115–16123 (2018).
48. R. R. Breaker, G. F. Joyce, A DNA enzyme that cleaves RNA. Chem. Biol. 1, 223–229 (1994).
49. S. W. Santoro, G. F. Joyce, A general purpose RNA-cleaving DNA enzyme. Proc. Natl. Acad. Sci. U.S.A. 94, 4262–4266 (1997).
50. S. K. Silverman, In vitro selection, characterization, and application of deoxyribozymes that cleave RNA. Nucleic Acids Res. 33, 6151–6163 (2005).
51. W. Zhou, J. Liu, Multi-metal-dependent nucleic acid enzymes. Metallomics 10, 30–48 (2018).
52. P. J. J. Huang, J. Liu, In vitro selection of chemically modified DNAzymes. ChemistryOpen 9, 1046–1059 (2020).
53. M. Hollenstein, DNA catalysis: The chemical repertoire of DNAzymes. Molecules 20, 20777–20804 (2015).
54. Z. Yang, F. Chen, J. B. Alvarado, S. A. Benner, Amplification, mutation, and sequencing of a six-letter synthetic genetic system. J. Am. Chem. Soc. 133, 15105–15112 (2011).
55. Z. Yang et al., Conversion strategy using an expanded genetic alphabet to assay nucleic acids. Anal. Chem. 85, 4705–4712 (2013).
56. A. V. Sidorov, J. A. Grasby, D. M. Williams, Sequence-specific cleavage of RNA in the absence of divalent metal ions by a DNAzyme incorporating imidazolyl and amino functionalities. Nucleic Acids Res. 32, 1591–1601 (2004).
57. U. Kaukinen, S. Lyytikäinen, S. Mikkola, H. Lönnberg, The reactivity of phosphodiester bonds within linear single-stranded oligoribonucleotides is strongly dependent on the base sequence. Nucleic Acids Res. 30, 468–474 (2002).
58. A. Bibillo, M. Figlerowicz, K. Ziomek, R. Kierzek, The nonenzymatic hydrolysis of oligoribonucleotides. VII. Structural elements affecting hydrolysis. Nucleosides Nucleotides Nucleic Acids 19, 977–994 (2000).
59. Y. Fu, G. Sharma, D. H. Mathews, Dynalign II: Common secondary structure prediction for RNA homologs with domain insertions. Nucleic Acids Res. 42, 13939–13948 (2014).
60. M. Zuker, Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31, 3406–3415 (2003).
61. S. J. Schultz, J. J. Champoux, RNase H activity: Structure, specificity, and function in reverse transcription. Virus Res. 134, 86–103 (2008).
62. A. I. Taylor et al., Catalysts from synthetic genetic polymers. Nature 518, 427–430 (2015).
63. Z. Yang, F. Chen, S. G. Chamberlin, S. A. Benner, Expanded genetic alphabets in the polymerase chain reaction. Angew. Chem. Int. Ed. Engl. 49, 177–180 (2010).
64. S. W. Santoro, G. F. Joyce, K. Sakthivel, S. Gramatikova, C. F. Barbas III, RNA cleavage by a DNA enzyme with extended chemical functionality. J. Am. Chem. Soc. 122, 2433–2439 (2000).
65. A. Mir, B. L. Golden, Two active site divalent ions in the crystal structure of the hammerhead ribozyme bound to a transition state analogue. Biochemistry 55, 633–636 (2016).
66. E. M. Moody, J. T. Lecomte, P. C. Bevilacqua, Linkage between proton binding and folding in RNA: A thermodynamic framework and its experimental application for investigating pKa shifting. RNA 11, 157–172 (2005).
67. B. Gong et al., Direct measurement of a pK(a) near neutrality for the catalytic cytosine in the genomic HDV ribozyme using Raman crystallography. J. Am. Chem. Soc. 129, 13335–13342 (2007).
68. S. Nakano, D. M. Chadalavada, P. C. Bevilacqua, General acid-base catalysis in the mechanism of a hepatitis delta virus ribozyme. Science 287, 1493–1497 (2000).
69. J. L. Wilcox, A. K. Ahluwalia, P. C. Bevilacqua, Charged nucleobases and their potential for RNA catalysis. Acc. Chem. Res. 44, 1270–1279 (2011).
70. A. Luptak, A. R. Ferre-D’Amare, K. Zhou, K. W. Zilm, J. A. Doudna, Direct pK(a) measurement of the active-site cytosine in a genomic hepatitis delta virus ribozyme. J. Am. Chem. Soc. 123, 8447–8452 (2001).
71. S. Kath-Schorr et al., General acid-base catalysis mediated by nucleobases in the hairpin ribozyme. J. Am. Chem. Soc. 134, 16717–16724 (2012).