Clonally Resolved Single Cell Multi Omics Identifies Routes of 2023 Cell ST
Correspondence
carsten.mueller-tidow@med.uni-heidelberg.de (C.M.-T.),
[email protected] (L.V.)
In brief
Velten and colleagues develop
CloneTracer, a computational method to
identify clones in single-cell RNA-seq
data. Applied to immature cells from 19
acute myeloid leukemia patients,
CloneTracer shows that dormant
hematopoietic stem cells (HSCs) are
healthy or preleukemic. Leukemic stem
cells resemble healthy active HSCs but
give rise to aberrant progenitors.
Highlights
• CloneTracer extracts clonal information from single-cell RNA-seq data
Resource
Clonally resolved single-cell multi-omics identifies
routes of cellular differentiation
in acute myeloid leukemia
Sergi Beneyto-Calabuig,1,3,15 Anne Kathrin Merbach,2,4,15 Jonas-Alexander Kniffka,2,16 Magdalena Antes,2,5,6,16
Chelsea Szu-Tu,1,16 Christian Rohde,2,4 Alexander Waclawiczek,5,6 Patrick Stelmach,2,6 Sarah Gräßle,7,8,9 Philip Pervan,1
Maike Janssen,2,4 Jonathan J.M. Landry,10 Vladimir Benes,10 Anna Jauch,11 Michaela Brough,11 Marcus Bauer,12
Birgit Besenbeck,2 Julia Felden,2 Sebastian Bäumer,13 Michael Hundemer,2 Tim Sauer,2 Caroline Pabst,2,4
Claudia Wickenhauser,12 Linus Angenendt,13,14 Christoph Schliemann,13 Andreas Trumpp,5,6 Simon Haas,5,6,7,8,9
Michael Scherer,1 Simon Raffel,2 Carsten Müller-Tidow,2,4,* and Lars Velten1,3,17,*
1Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
2Department of Medicine, Hematology, Oncology and Rheumatology, University Hospital Heidelberg, 69120 Heidelberg, Germany
3Universitat Pompeu Fabra (UPF), Barcelona, Spain
4Molecular Medicine Partnership Unit, European Molecular Biology Laboratory (EMBL), University of Heidelberg, 69117 Heidelberg, Germany
5Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGmbH), 69120 Heidelberg, Germany
6Division of Stem Cells and Cancer, Deutsches Krebsforschungszentrum (DKFZ) and DKFZ-ZMBH Alliance, 69120 Heidelberg, Germany
SUMMARY
Inter-patient variability and the similarity of healthy and leukemic stem cells (LSCs) have impeded the characterization of LSCs in acute myeloid leukemia (AML) and their differentiation landscape. Here, we introduce CloneTracer, a novel method that adds clonal resolution to single-cell RNA-seq datasets. Applied to samples from 19 AML patients, CloneTracer revealed routes of leukemic differentiation. Although residual healthy and preleukemic cells dominated the dormant stem cell compartment, active LSCs resembled their healthy counterpart and retained erythroid capacity. By contrast, downstream myeloid progenitors constituted a highly aberrant, disease-defining compartment: their gene expression and differentiation state affected both the chemotherapy response and leukemia’s ability to differentiate into transcriptomically normal monocytes. Finally, we demonstrated the potential of CloneTracer to identify surface markers misregulated specifically in leukemic cells. Taken together, CloneTracer reveals a differentiation landscape that mimics its healthy counterpart and may determine biology and therapy response in AML.
Cell Stem Cell 30, 706–721, May 4, 2023 © 2023 The Authors. Published by Elsevier Inc.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Figure 1. CloneTracer and Optimized 10x enable clonal tracking in droplet-based scRNA-seq
(A) Scheme of Optimized 10x.
(B) Normalized coverage across the mitochondrial genome obtained by default and Optimized 10x.
(C) Coverage of nuclear mutations from various AML patients. Only immature and early myeloid cells are included. See also Figure S1K.
(D) Illustration of the statistical challenge addressed by CloneTracer.
(E) Overview of two AML cohorts, see also Methods S1.
(F) Overview of longitudinal sampling in cohort B. Pie charts indicate clinical blast counts. DA/VA indicate treatment with Daunorubicin/Ara-C and Venetoclax/
Azacitidine, respectively.
(G) Top row: inferred clonal hierarchy for patient A.8. Middle row: stacked bar chart illustrating each cell’s probability to derive from the different clones shown in
top panel. Bottom rows: heatmap depicting the variant allele frequency of all clonal markers in all cells.
(H) Like (G), except that data from patient A.6 is shown.
(I) Clonal hierarchy of patient A.6 identified from sequencing of single-cell-derived colonies, see STAR Methods.
(J) Like (G), except that data from patient B.2 is shown. For CNVs, the scaled number of counts on the specified chromosome are shown.
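The per-cell clone probabilities shown in (G) can be illustrated with a toy calculation. The sketch below is not the CloneTracer model, which also compares clonal hierarchies and handles CNVs and mtSNVs (see Methods S1). It assumes a single heterozygous SNV, two clones (healthy and leukemic), and binomial read sampling. The function name, the prior, the mutant read fraction, and the false positive rate are all hypothetical values chosen for illustration.

```python
from math import comb


def clone_posterior(mut_reads, ref_reads, prior_leukemic=0.5,
                    mutant_fraction=0.5, false_positive_rate=0.01):
    """Posterior probability that a cell belongs to the leukemic clone.

    Illustrative assumptions (not taken from the paper):
    - leukemic cells are heterozygous, so each read is mutant with
      probability `mutant_fraction`;
    - healthy cells show a mutant read only through technical error,
      with probability `false_positive_rate`;
    - `prior_leukemic` could be informed by bulk allele frequencies.
    """
    n = mut_reads + ref_reads

    def binomial_likelihood(p):
        # Probability of observing `mut_reads` mutant reads out of n.
        return comb(n, mut_reads) * p ** mut_reads * (1 - p) ** ref_reads

    like_leukemic = binomial_likelihood(mutant_fraction)
    like_healthy = binomial_likelihood(false_positive_rate)

    numerator = prior_leukemic * like_leukemic
    denominator = numerator + (1 - prior_leukemic) * like_healthy
    return numerator / denominator


if __name__ == "__main__":
    # (mutant reads, reference reads) for four hypothetical cells
    for mut, ref in [(0, 0), (3, 0), (0, 1), (0, 10)]:
        print(mut, ref, round(clone_posterior(mut, ref), 4))
```

Under these assumptions, a cell without any coverage keeps its prior, and a cell with a single reference read is not confidently healthy, because allelic dropout of the mutant allele remains plausible. Only with many reference reads does the posterior approach zero.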
protocols are inherently noisy, as illustrated by frequent allelic dropout even of highly expressed genes such as NPM1 (see Figures 1C and S1K). For a confident interpretation of the data and quantitative analyses, statistical methods are needed that identify the most likely hierarchy among the mutations and thereby, for example, clarify whether a mitochondrial mutation or CNV is present in all cancer cells or demarcates a sub-clone. Furthermore, dropout and false positive rates (FPRs) need to be systematically accounted for when assigning cells to (sub-)clones.
We therefore developed CloneTracer, a Bayesian model that identifies the hierarchical relationship between mutations and assigns the cells to the clones. Our model considers previous information, such as allele frequencies from exome sequencing (exome-seq), and most importantly, it accounts for the technical noise associated with single-cell measurements of CNVs, SNVs, and mtSNVs (Figure 1D; see Methods S1 for detail). Thereby, it first compares possible clonal hierarchies. Second, for the mutational hierarchy with the highest evidence, it computes the posterior probability of each cell to belong to any particular clone.
We applied CloneTracer to 19 AML patients from two cohorts (Figures 1E and 1F; Table S3). Cohort A consisted of diagnostic bone marrow samples that were subjected to a CD34 enrichment before single-cell CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing); a median of 2,232 single cells per patient passed stringent quality control filters. Somatic variants were identified a priori by ATAC-seq (for mitochondrial variants) and exome-seq (for nuclear variants) of myeloid and T cells (Figure S1G; Tables S3 and S5). Cohort B included paired longitudinal samples from four individual patients at the time of diagnosis, after therapy, and (in one case) at the time of relapse. A median of 12,034 single cells per patient passed quality filters. Somatic variants were identified a priori by panel sequencing. In both cohorts, cells were stained with CITE-seq surface antibodies (see Table S1). Overall, we analyzed 88,602 single cells from 25 specimens.
To demonstrate the performance of CloneTracer, we chose to highlight three representative patients (A.8, A.6, and B.2). Detailed analyses of all patients are described in Methods S1.
Patient A.8 represents the performance of CloneTracer in cases with well-covered leukemic mutations:
Here, a mutation in NPM1 was covered in 94% of the cells. Mutations in RPS29 and a mitochondrial gene co-occurred with this mutation. A preleukemic DNMT3A mutation occurred upstream of NPM1 but displayed a higher dropout rate. CloneTracer confidently assigned cells as part of the leukemic clone if any of the downstream (NPM1, RPS29, or the mtSNV) mutations were observed (Figure 1G). In their absence, there was often no conclusive evidence if the cell was healthy or preleukemic, due to the dropout of DNMT3A. We grouped these cells as ‘healthy’ in downstream analyses and followed up on preleukemia for selected patients (below, see Figures 4 and S5). Similar results were obtained for 11 further patients with a well-covered mutation (mostly NPM1 or a CNV) on top of the clonal hierarchy.
Patient A.6 represents the behavior of CloneTracer in cases with moderately covered leukemic mutations and co-occurring mitochondrial markers:
Here, a nuclear SNV with a high allele frequency in bulk exome-seq was located in MPO, which was covered in 22.8% of cells. The mutant MPO allele was only observed in cells carrying a mitochondrial mutation (3019G>C) (Figure 1H). The mitochondrial mutation was a suitable clonal marker, as it had likely occurred before the nuclear mutation or there were only cells carrying both mutations. To verify these results, we grew single-cell-derived colonies and genotyped MPO and the mitochondrial mutations, confirming that the mitochondrial mutation is a high-confidence clonal leukemia marker (Figure 1I). Similar to patient A.6, mitochondrial mutations drove the assignment of leukemic and healthy cells in patient B.3.
In the case of patient B.2, we observed several sub-clonal mitochondrial mutations downstream of the co-occurring IDH2 and DNMT3A mutations that, when occurring together, likely constitute a leukemic, and not preleukemic, event24 (Figure 1J). Since these genes displayed high dropout and no reliable clonal marker was identified, there was often considerable uncertainty regarding the assignment of healthy vs. leukemic cells. Hence, this patient was excluded from the clonal analysis. In the remaining 4 patients, there were no well-covered clonal markers (i.e., mtSNVs, CNVs, or well-covered SNVs).
Together, these examples illustrate the importance of using statistical models when interpreting single-cell genotyping data. All subsequent analyses involving clonal identities pertain to the 14 patients with high-confidence healthy/leukemic assignments. Importantly, “leukemic” is here defined purely by the presence of a mutation not observed in the T cell lineage (usually NPM1 or a CNV, and in some cases, a mitochondrial marker mutation) and not by a functional ability to induce leukemia. Recent work has used clonal tracking to demonstrate that not all stem cells carrying a leukemic driver are functionally leukemogenic.25
Validation of CloneTracer
Overall, 91% of the cells from the 14 patients could be assigned as healthy or leukemic with high confidence (Figure 2A; STAR Methods). We validated these assignments using established parameters for AML diagnosis: In AML, most myeloid cells are leukemic, whereas lymphoid cells are usually healthy.26,27 We identify lymphoid and myeloid cells from scRNA-seq data and used this assignment as an indicator for healthy and leukemic, respectively. Under this assumption, the median area under the receiver-operating characteristics curve (AUROC) of CloneTracer was 0.96 (range 0.88–1). The discretized CloneTracer assignments had a median FPR across patients of 9% (range 0.1%–20%) and a median false negative rate (FNR) of 1% (range 0%–20%, Figures 2B and 2C). For patients with SNVs as clonal markers, statistically naive assignments8,28 that classify a cell as leukemic if at least one mutant allele is observed and otherwise as healthy if at least one healthy allele is observed, sub-optimally balanced between FPR and FNR (Figure 2C).
Notably, not all myeloid cells in AML patients are leukemic. We therefore considered leukemia-associated immunophenotypes (LAIPs) to distinguish leukemic vs. healthy myeloid cells. LAIPs are cell state-specific, aberrantly expressed markers identified during the routine clinical flow cytometric analysis of AML diagnostic samples.29,30 Three patients carried a significant number of residual healthy cells along the full myeloid differentiation spectrum, as well as a clinically described LAIP. In these
patients, we projected each cell to a pseudotime of myeloid differentiation (see also below) and plotted the protein expression of clinically identified LAIP markers over differentiation pseudotime separately for leukemic and healthy cells. As expected, leukemic, but not residual healthy cells, expressed LAIP in a differentiation state-dependent manner (Figures 2D and 2E). For example, CD7 was only expressed by leukemic stem-like cells in patient B.1. The statistical power for identifying LAIP markers was increased by using CloneTracer, compared with the statistically naive assignment (Figure 2F).
Together, these analyses demonstrate that CloneTracer correctly assigned cells as healthy and leukemic and outperformed statistically naive assignments.
Differentiation hierarchies in leukemia
To identify differentiation landscapes, we integrated gene expression data from all 19 patients with data from two healthy individuals (A.0 from this study and C.3 from Triana et al.4). The integration strategy was selected to preserve real biological differences between samples (Figure S2D) while accounting for technical batch effects (Figure S2E; see STAR Methods). This analysis showed that cells from the healthy individuals arrange according to the known differentiation trajectories (Figures 3A and S3A, points in color). By contrast, cells from leukemic patients abundantly existed in cell states not observed in healthy patients (Figure 3A, gray points). Many of these cell states were observed in single or few patients, highlighting inter-patient heterogeneity (Figures 3B, 3C, and S3B).
Highlighting CloneTracer assignments on the uMAP showed that most lymphoid cells from AML patients were healthy and myeloid cells were leukemic (Figure 3D; see also Figure 2B). Few lymphoid cells assigned as leukemic likely constituted false positive calls. By contrast, significant numbers of healthy monocytes and healthy HSC/MPP (multipotent progenitor)-like cells occurred in 6 and 5 patients, respectively (Figures 3D and S3B).
We next aimed to identify leukemic and healthy stem cells. We observed a cluster, C6 (see Figure S4A for exact cluster labels), that contained HSCs from the healthy reference individuals (Figure 3A), as well as both healthy and leukemic cells from the different AML patients (Figures 3B and 3D). Interestingly, C6 contained cells from 16 of 19 AML patients, whereas most other progenitor clusters were dominated by cells from only one patient each (Figures 3B and 3C). Other progenitor populations
appeared to “emerge” from cluster C6, which was more evident in a 3D uMAP (see https://veltenlab.crg.eu/clonetracer/). Compared with other progenitor cells, cells from cluster C6 tended to express stem cell surface markers (Figure 3E: expressions of CD34, CD90, and CD49f and lower expressions of CD38 and CD45RA). Interestingly, C6 overexpressed genes that were identified as upregulated in label-retaining AML cells (LRCs) during xenotransplant assays, compared with the non-label-retaining fraction31 (Figure 3F; Table S2). LRCs were characterized as the population responsible for causing disease in xenotransplants and for drug resistance.31 Across the different AML patients, cluster C6 contained a median of 1% (range 0.1%–17%) of the total bone marrow (Figure 3C, inset), which is higher than the LSC number estimated from xenotransplants.35
To evaluate the similarity of cells from C6 to healthy stem cells, we projected all cells to a healthy reference4 and assigned each cell to the most similar healthy cell state (see STAR Methods; Figure 3G) and a score that quantifies the similarity (Figure 3H). Leukemic cells from cluster C6 were very similar to healthy HSCs, whereas leukemic cells outside of C6 mapped to HSCs or downstream myeloid progenitor states but displayed lower similarity. At the level of monocytes and dendritic cells, the transcriptomic similarity between leukemic cells and healthy cells increased (Figure 3H, inset). We therefore conceptually structure leukemic differentiation in three stages, putative healthy-like stem cells (“C6”), highly heterogeneous and aberrant progenitors, and mature, healthy-like monocytes/dendritic cells. Importantly, most leukemic cells in cluster C6 were marked by mutations in NPM1 or CNVs, which are typically associated with leukemia and not preleukemia27 (Figures S3C and S3D).
The dormant stem cell compartment is healthy or preleukemic
To further characterize the putative stem cell cluster C6, we focused on the six patients for whom both healthy and leukemic cells occurred within this cluster. Healthy and leukemic cells in C6 were generally separated by the principal component analysis based on gene expression (Figure 4A). Healthy cells expressed genes characterized as “dormant HSC” genes in (1) a recent scRNA-seq study of highly purified human HSCs,32 (2) label retention assays in mice,33 and (3) “low-output HSC” genes identified using clonal tracking36 (Figures 4B and S4B; Table S2). A dormant HSC gene expression signature robustly separated healthy from the large majority of leukemic C6 cells across patients (Figure 4C); cells expressing the dormant signature were consistently CD34+CD38− (Figure S4C). These results suggest that in AML, the dormant stem cell compartment, where present or observed, contains healthy stem cells (dHSCs). By contrast, the active stem cell compartment was predominantly leukemic: these cells were predominantly healthy only in 2 of the 14 investigated patients (A.9, A.13) (Figure 4C). Of note, five of the seven patients where we observe dormant healthy HSCs belong to the karyotypically normal, NPM1 mutant subtype (Figure 4C).
To evaluate the potential leukemic and preleukemic content of the dormant stem cell population, we increased the numbers of analyzed cells.
First, we focused on 3 patients with NPM1 mutations where CD34 expression was rare and nearly exclusive to the dormant stem cell population (Figure 4C: patients A.10, A.11, and A.12). We sorted CD34+ cells and performed MutaSeq,21 a well-based single-cell method that allows us to efficiently capture mutations in lowly expressed genes such as DNMT3A. The sorting strategy resulted in a significant number of cells expressing the dormancy gene signature (Figure 4D). For all of these dormant stem cells, genotype data indicated that they were healthy or preleukemic (Figure 4E).
To follow up on dormant stem cells in two patients where CD34 expression was not exclusive to the dormant stem cell compartment, we sequenced 23,110 additional CD34+ and total BM cells from two patients, A.2 and A.9 (Figure S5A). In the case of A.9, we thereby identified 267 C6 cells, a subset of which expressed the dormancy signature (Figures S5B and S5D). All but two C6 cells here were healthy or preleukemic (i.e., DNMT3A-mutant). In the case of A.2, we identified 1,109 cluster C6 cells that were mostly leukemic and lacked the expression of the dormancy score (Figures S5B and S5C). These data further allowed us to demonstrate that the main dataset was sufficiently powered to detect all cell states (Figures S5C and S5D).
Taken together, our data suggest that the dormant stem cell compartment is predominantly healthy or preleukemic. By contrast, the active stem cell compartment was leukemic in 12 of the 14 patients. Our results cannot rule out the existence of rare leukemic dormant stem cells that might be relevant for relapse.
LSCs retain erythroid capacity
We next investigated the leukemic fraction of C6 and its routes of differentiation. In some patients, leukemic cells from C6 expressed “active HSC” genes32,33 or “high-output HSC” genes36 relative to healthy cells from C6 (Figures 4B and S4B; Table S2). To identify whether these cells are truly distinct from other leukemic progenitors, we performed differential expression testing, contrasting leukemic C6 cells to other leukemic myeloid progenitor cells from the same patient. Although most leukemic cells expressed genes associated with lymphomyeloid priming, leukemic C6 cells highly expressed genes associated with early erythromyeloid (erythroid, megakaryocytic, and eosinophilic/basophilic) priming,1 AP-1 transcription factors, and genes associated with LRCs31 (Figures 5A, S4D, and S4E; Table S2). In line with this observation, in 10 of the 14 patients with confident CloneTracer assignments, we observed erythroid progenitors carrying leukemic mutations (Figure 5B). Typically, these cells carried NPM1 mutations and/or CNVs and were hence derived from the leukemic clone, and not from a preleukemic clone (Figures S3C and S3D). The abundance of mutation-carrying erythroid progenitor cells correlated with the abundance of C6
(G) Cells from leukemia patients were projected to the healthy reference4. Color: most similar healthy cell type for each cell (STAR Methods). See Figure S3A for
color legend.
(H) uMAP highlighting the similarity score, i.e., similarity to the 5 most similar healthy reference cells (STAR Methods). Inset: smoothened average of the similarity
score over pseudotime.
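The similarity score in (H) is described as the similarity to the 5 most similar healthy reference cells. A minimal sketch of that idea follows. It assumes that cells are represented as numeric vectors (for example, coordinates in a low-dimensional embedding) and that similarity is measured by cosine similarity. Both choices, and the function names, are assumptions made for illustration; the actual metric and projection are defined in STAR Methods.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_score(cell, reference_cells, k=5):
    """Mean similarity of `cell` to its k most similar reference cells."""
    similarities = sorted(
        (cosine(cell, ref) for ref in reference_cells), reverse=True
    )
    nearest = similarities[:k]
    return sum(nearest) / len(nearest)


if __name__ == "__main__":
    # Hypothetical two-dimensional reference with two cell states.
    reference = [[1.0, 0.0]] * 5 + [[0.0, 1.0]] * 5
    print(similarity_score([1.0, 0.0], reference))  # matches a reference state
    print(similarity_score([1.0, 1.0], reference))  # lies between both states
```

In this toy setting, a cell that coincides with a healthy state scores higher than a cell lying between states, which is the sense in which aberrant progenitors receive lower scores.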
(Figure 5B). These results indicate that most leukemias, specifically C6, can differentiate into the erythroid lineage at low rates. Together, these results allowed us to designate leukemic cells from C6 as aLSCs, a rare population of stem-like cells that exists in most AML patients and that often retain erythroid capacity.
The state and extent of the differentiation block determine the phenotypic manifestation of AML
We next focused on the leukemic progenitors downstream of LSCs. To determine the healthy cell state most strongly resembling these cells, their transcriptome was projected onto healthy progenitor cells, ranging from early MPPs to promyelocytes4 (see also Figures 3G and 3H). We thereby obtained an average pseudotime (i.e., a value describing each cell’s progression along the stem cell to monocyte trajectory). The average projected pseudotime was associated with therapy response (Figure 5C; only n = 14 patients treated with anthracycline and cytarabine induction therapies were included here). We found that patients with the most immature leukemic progenitors had blast persistence or died during the first induction therapy, whereas patients with LMPP (lymphomyeloid primed progenitors)-like leukemic progenitors went into complete remission (p = 0.04, Wilcoxon test). Although the cohort size underlying these analyses was small, the results are consistent with a recent report studying the relationship between differentiation arrest and survival in a large bulk RNA-seq cohort.37 The presence or size of the C6 stem cell population was not correlated with chemotherapy response. Taken together, these results suggest that the stage of the differentiation block may play a role in determining chemotherapy response.
We next investigated the ability of the leukemic progenitors to give rise to mature monocytes, which was highly variable between patients. As expected, the genotype (e.g., NPM1 mutant) only partly explained the degree of monocytic differentiation. We hypothesized that leukemic progenitors with a larger resemblance to their healthy equivalent might display a weaker differentiation block. We computed for each progenitor cell a similarity score, describing how close it resembled the most similar healthy cell (see Figure 3H). We found that this score, after accounting for the genotype, correlated closely with the fraction of mature monocytes or dendritic cells in the bone marrow (Figure 5D). Patients with aberrant progenitor cells had a few mature myeloid cells. By contrast, patients with progenitors more closely resembling their healthy counterparts had large numbers of mature cells. Taken together, these results suggest that the “degree” of the differentiation block, together with leukemia’s genotype, determines the fraction of monocytes in the bone marrow.
Of note, we observed patients with mostly healthy monocytes and other patients with mostly leukemic monocytes (see Figure S3B). The fraction of healthy monocytes correlated inversely with the overall number of monocytes (Figure 5E): in leukemia cells with a few monocytes (e.g., FAB [French-American-British AML classification] M0, M1), these monocytes were derived from residual healthy stem cells. An unsupervised analysis of gene expressions revealed that monocytes exist in two cell states. Leukemia-derived monocytes were enriched in a cell state with higher expressions of MHC-II and interferon response genes (IFITM1 and IFITM3) (Figure 5F; Table S2). A similar signature was recently described for monocytes in clonal hematopoiesis.38
Together, these results suggest that leukemia-derived monocytes originated from incomplete differentiation blocks at the progenitor level and matured along normal differentiation pathways. The stage of the differentiation block at MPP, LMPP, or promyelocyte stages was linked to the first-line chemotherapy response. Furthermore, the strength of the differentiation block was also encoded at the progenitor level and determined the degree of monocytic differentiation. The stage and the degree of the differentiation block are independent properties.
Our results suggested that the stage of the differentiation block was an important feature of the AML and thus may be subject to clonal selection. To investigate this hypothesis, we focused on three patients with co-existing sub-clones marked by relevant driver mutations (Figure S6). In these patients, sub-clones shifted toward more immature differentiation blocks, compared with the parental clones, possibly because evolutionary pressures may favor differentiation blocks at more immature states. Given the small patient number available for the analyses of sub-clones, we cannot rule out that other properties or genetic drift led to the expansion of the sub-clones.
CloneTracer enables the discovery of leukemia and healthy specific markers
Our results indicated that LSCs, as well as leukemia-derived monocytes, are rather healthy-like and difficult to distinguish from their healthy counterparts. This raised the question of whether specific markers can be used to identify healthy vs. leukemic cells of various differentiation stages, including stem cells and monocytes. Such markers can possibly be identified by comparing healthy and leukemic cells from the same patient, thereby avoiding batch effects, genetic background, and other variables typically confounding healthy-cancer comparisons. To investigate this idea, we first used the CITE-seq data of cohort B, since a larger number of surface markers and cells per patient were covered. We asked whether there are markers that are overexpressed or depleted in leukemic cells of various differentiation stages, compared with healthy cells of the same stage. These comparisons identified CD11c
(B) Volcano plot as in Figure 3F, comparing healthy and leukemic cells from C6. n = 6 patients were analyzed, as in (A). Genes are colored by human HSC gene
signatures.32 phyper: hypergeometric test for enrichment.
(C) PCA of cells from cluster 6 performed jointly across all patients but using exclusively genes from the dHSC signature.32,33 Cells to the right of the dotted line are
putative dormant stem cells.
(D) Rare putatively healthy dormant CD34+ stem cells from three patients were sorted and subjected to a well-based scRNA-seq protocol.21 uMAP plots, from left
to right, show: (i) Projection on original uMAP from Figure 3, (ii) uMAP of Smart-Seq2 data, (iii) CD34 expression, (iv) HLF34 expression, and (v) a dormancy score
computed from the gene list in refs. 32,33.
(E) uMAPs as in (D), highlighting the variant allele frequency of relevant preleukemic and leukemic mutations.
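The legend of (B) refers to a hypergeometric test for enrichment of HSC signature genes among differentially expressed genes. A minimal sketch of such a test, using only the standard library, is given below. The function and argument names are hypothetical, and the gene counts in the usage example are invented; they are not results from the paper.

```python
from math import comb


def hypergeom_enrichment_p(overlap, signature_size, hits, universe):
    """Upper-tail hypergeometric p value, P(X >= overlap).

    X is the number of signature genes among `hits` genes drawn without
    replacement from `universe` genes, of which `signature_size` belong
    to the signature.
    """
    max_overlap = min(signature_size, hits)
    total = comb(universe, hits)
    tail = sum(
        comb(signature_size, k) * comb(universe - signature_size, hits - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Invented numbers: 40 of 200 differentially expressed genes fall in a
    # 300-gene signature, out of 15,000 tested genes.
    print(hypergeom_enrichment_p(40, 300, 200, 15000))
```

In R, the same upper tail corresponds to `phyper(overlap - 1, signature_size, universe - signature_size, hits, lower.tail = FALSE)`, because `lower.tail = FALSE` returns P(X > q) rather than P(X >= q).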
as overexpressed by leukemic cells and CD49f as enriched in healthy cells (Figure S7A). Accordingly, we observed an enrichment of leukemic and healthy cells in the CD11c+CD49f− and CD11c−CD49f+ fraction, respectively (Figures 6A and 6B). Since CD11c expression changed as a function of differentiation (Figure S7B), enrichment analyses of leukemic and healthy cells were performed for the immature (CD14−) and mature (CD14+) compartments separately (Figures 6A and 6B).
We next confirmed the specificity of the CD11c/CD49f marker combination by FACS sorting followed by fluorescent in situ hybridization (FISH). In patient B.4, leukemic cells carrying the monosomy 7 were enriched in the CD11c+CD49f− fraction, whereas healthy cells diploid for chromosome 7 were enriched in the CD11c−CD49f+ fraction (Figures 6C and S7C) (p = 10⁻¹⁰). Similar enrichments were demonstrated in an independent patient with trisomy 8 (Figure S7D).
To further demonstrate that CD11c and CD49f can be used to enrich for functionally healthy cells, we performed xenotransplantation assays. We sorted CD34+CD14− peripheral blood and bone marrow cells from three de novo AML patients into CD11c+CD49f− and CD11c−CD49f+ fractions and transplanted each fraction into two immunocompromised mice. We observed that the putatively healthy CD11c−CD49f+ fractions gave rise to both myeloid and lymphoid engraftment in 8/10 NSG mice from 3 of the 3 patients, indicating healthy hematopoiesis (Figures 6D and 6E). The putatively leukemic CD11c+CD49f− fractions did not engraft (Figures 6D and 6E), as is frequently observed for de novo AML samples.39
Thus, the marker combination CD11c and CD49f identified CD34+ progenitor populations in AML specimens, which repopulate NSG mice with healthy cells. In the samples studied, the marker combination could not be used to enrich rare LSCs.
Finally, we evaluated these markers in two larger cohorts with different techniques:
We used immunohistochemistry for tissue microarrays (TMAs) from 86 AML patients and analyzed expression of CD14, CD34, CD11c, and CD49f (Figures 6F–6H and S7E). A distinct cohort of 87 AML patients was analyzed by flow cytometry for expression of CD14, CD34, and CD11c (Figures 6I and 6J). Together, these results allowed us to provide a perspective on the specificity of CD11c and CD49f and their potential relevance as markers for healthy vs. leukemic cells. In particular, we observed that in more differentiated leukemias (FAB M2–M5), the putatively leukemia-derived CD14+ cells were predominantly CD11c+CD49f−. In undifferentiated leukemias (M0), which contain only a small number of monocytes, however, residual, putatively healthy CD14+ cells showed a CD11c−CD49f+ phenotype (TMA data, Figures 6F–6H and S7E) and decreased CD11c expression (Figures 6I and 6J). Thus, at the level of monocytes, CD11c and CD49f constituted a robust combination of markers to quantitate the fraction of leukemia content. This might be particularly helpful in AML diagnosis if large numbers of monocytes are present.
In stem cells, the CD11c+/CD49f− combination was informative in a subset of patients. In 5 of the 6 patients analyzed by CloneTracer who contained healthy and leukemic cells in cluster C6, CD49f was more highly expressed by healthy (i.e., dormant) stem cells40 (Figure 6K). The expression of CD11c on CD34+ cells was variable across and within genotypes (Figures 6L and S7F). Accordingly, data integration by single-cell transcriptomics was overall superior in identifying the multipotent leukemia stem cell cluster, as well as stem-like progenitor cells, compared with flow cytometry.
DISCUSSION
To investigate routes of cellular differentiation in AML, we have introduced CloneTracer, a computational method for adding clonal resolution and identifying leukemic and healthy cells in scRNA-seq data. Tailored to scRNA-seq, CloneTracer extends on DNA-seq-specific error models,40–42 as well as models that require previous knowledge of the clonal hierarchy.43 In the AML context, CloneTracer confidently identified healthy and leukemic cells in 14/19 patients. CloneTracer assignments relied on the presence of a clonal CNV (observed in 5 patients), a clonal mutation in a highly expressed nuclear gene (observed in 7 patients), and/or a clonal mitochondrial mutation (observed in 6 patients; full detail for all patients is provided in the Methods S1). By combining the three layers of information, CloneTracer outperformed methods that look at individual layers only.8,14,17 Previous knowledge of the mutations is required to run CloneTracer, and we recommend calling these mutations from bulk data (e.g., exome sequencing, bulk ATAC-seq for mitochondrial variants, and karyotyping), although mitochondrial variants and CNVs can also be called de novo from single-cell data.14,17
The availability of clonal information enabled us to clarify routes of leukemic differentiation. Of note, through the integration of data from all patients, we identified a cluster of stem cells that consisted of dormant, healthy, or preleukemic stem cells and predominantly leukemic, active SCs with retained erythroid potential and a gene expression signature resembling label-retaining cells in xenotransplants.31 Since both dHSCs/dpre (dormant preleukemic cells)-LSCs and aLSCs exhibited relatively consistent gene expression signatures across patients and healthy individuals, data integration of scRNA-seq datasets represented a robust strategy for their identification. By contrast, these cells are difficult to enrich through flow sorting strategies: the large degree of inter-patient heterogeneity at the level of progenitors renders it difficult to develop universal purification schemes.
Downstream of LSCs, we observed a highly heterogeneous and aberrant compartment of immature myeloid cells. The
(H) Representative mid-optical sections of a CD14, CD11c, and CD49f stained tissue microarray used for quantification in (G) (see also Figure S7E for scale bar).
Arrows, upper row: CD14+/CD11c−/CD49f+ cell. Arrow, lower row: CD14+/CD11c+/CD49f− cell.
(I) Scheme illustrating the flow cytometry experiment.
(J) Scatterplot relating the CD11c mean fluorescent intensity on CD14+ cells to the fraction of bone marrow that is CD14+, and the genotype. Flow cytometry data
from n = 59 individuals of selected genotypes is shown.
(K) Boxplot comparing the expression of CD49f in healthy and leukemic cells from cluster C6.
(L) Bar chart relating the expression of CD11c on CD34+ cells to genotype across n = 87 patients profiled by flow cytometry.
patient-specific stage of the differentiation arrest observed in this compartment determined the initial chemotherapy response of the patient. The strength of the block independently determined the overall degree of monocytic differentiation. Unlike cellular hierarchies identified from bulk data,37 the availability of single-cell resolution allowed us to distinguish stem cells (C6) from various immature, “stem-like” progenitors and pinpoint a poor first-line chemotherapy response to the latter.
Figure 7 summarizes our model of leukemic differentiation pathways and suggests overall similarities to healthy hematopoietic differentiation, but with an aberrant myeloid progenitor compartment. Our model suggests that leukemic mutations, although present in stem cells and monocytes, may prominently exert their effect in the cellular context of progenitors: Seven of the ten most commonly mutated AML driver genes44 were more strongly expressed in progenitors, compared with stem cells and monocytes (Figure S7G). Differences in expression levels might lead to specific effects of the mutated gene in each cellular compartment.
This implies that AML evolution requires mutations in slowly dividing stem cells, although selection occurs at the level of progenitors. Such a model would be in line with the low number of genetic aberrations observed in most AMLs. Of note, our data are static and cannot exclude the possibility that progenitor cells in AML might also de-differentiate to give rise to stem cells.
In sum, these data may carry implications for the future development of therapeutic strategies: our results indicate that in most if not all AML patients, there is a stem cell compartment distinct from the most immature progenitor cells. Hence, targeted therapies aimed at immature progenitors may increase the initial therapeutic response, but unless these therapies also target the actual stem cell compartment, the effect on relapse and long-term survival might be limited.

Limitations of the study
At the level of the specific single-cell methodology employed for clonal tracking, a limitation is that in droplet-based scRNA-seq protocols, SNVs in lowly expressed genes such as TET2 are difficult to amplify and large chromosomal inversions or translocations in non-coding regions cannot be mapped. For DNMT3A, coverage was obtained in approximately 20% of the CD34+ cells; hence, preleukemic cells were difficult to distinguish from healthy cells with high confidence. Preleukemic cells were therefore followed up on with well-based protocols (Figures 4D and 4E) or simply by sequencing larger numbers of cells (Figure S5). In the future, methods that combine DNA-based genotyping45 with RNA-seq in droplets might overcome the limitations of CloneTracer but might initially suffer from worse quality of the RNA-seq data.
At the level of the cohort analyzed, a limitation of the study is that statements are drawn from only 19 patients (14 of whom have clonal tracking information). Hence, the results may not be entirely representative of the large heterogeneity of AML genotypes and phenotypes observed, and studies with larger cohorts are going to systematically link genotypes and scRNA-seq phenotypes.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead contact
  ◦ Materials availability
  ◦ Data and code availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ Human subjects
  ◦ Animals
• METHOD DETAILS
  ◦ Collection of bone marrow
  ◦ Panel, exome, and bulk ATAC sequencing
  ◦ Antibody-oligo conjugation
  ◦ CITEseq surface labeling and FACS sorting
  ◦ Single-cell RNA sequencing
  ◦ Optimized 10x: Mitochondrial libraries
  ◦ Optimized 10x: Targeted genotyping libraries
  ◦ Plate-based single-cell RNA-seq (MutaSeq)
  ◦ Genotyping of single cell derived cultures
  ◦ Raw 10x Genomics data processing
  ◦ Analysis of single cell gene expression data
  ◦ Dormancy score calculation
  ◦ Analysis of DNAseq from single-cell derived colonies
  ◦ Processing of MutaSeq scRNAseq data
  ◦ Raw data processing of MAESTER data
  ◦ Fluorescent In Situ Hybridization
  ◦ Tissue microarrays
  ◦ Large cohort flow cytometry analysis
  ◦ Xenotransplantations
• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ CloneTracer model
  ◦ Differential expression testing
  ◦ Data visualization
• ADDITIONAL RESOURCES

SUPPLEMENTAL INFORMATION

Supplemental information can be found online at https://doi.org/10.1016/j.stem.2023.04.001.

ACKNOWLEDGMENTS

We thank Fengbiao Zhou, Anna Mathioudaki, and Judith Zaugg for discussions and Laleh Haghverdi and Valerie Marot-Laussazaie for providing feedback on the mathematical model. We thank all members of GeneCore (EMBL) for assistance with the CITE-seq experiments, the DKFZ Single-Cell Open Lab (scOpenLab) for assistance with the MutaSeq/SmartSeq2 experiment, the FACS facilities of DKFZ, Clinics HD, and CRG/UPF, and the genomics units at CRG and CNAG. We thank the NCT CLB, Sektion Cell Biobanking, for processing and providing bone marrow samples. Figure 1A was created with BioRender.com. Icons used in the graphical abstract are from Servier Medical Art and licensed under CC-BY 3.0. This work was financially supported by the German Bundesministerium für Bildung und Forschung (BMBF) through the Juniorverbund in der Systemmedizin “LeukoSyStem” (FKZ 01ZX1911D to L.V. and S.R.) as well as the Verbundprojekt SMART-CARE (031L0212A to C.M.-T.), the Emerson foundation grant 643577 (to L.V.), grant PID2019-108082GA-I00 and PRE2020-093229 by the Spanish Ministry of Science, Innovation and Universities (MCIU/AEI/FEDER, UE), the German Research Foundation (DFG; projects MU1328/18-1 and MU1328/21-1 and MU1328/23-1 to C.M.-T.), and the German Cancer Aid (DKH; project 70113908 to C.M.-T.). L.V. acknowledges support of the Spanish Ministry of Science and Innovation to the EMBL partnership, the Centro de Excelencia Severo Ochoa and the CERCA Programme/Generalitat de Catalunya. C.M.-T., A.K.M., and J.-A.K. gratefully acknowledge the data storage service SDS@hd supported by the Ministry of Science, Research and the Arts Baden-Württemberg (MWK) and the German Research Foundation (DFG) through grant INST 35/1314-1 FUGG and INST 35/1503-1 FUGG. J.-A.K. acknowledges support of the Deutsche Gesellschaft für Hämatologie und Medizinische Onkologie e.V. (DGHO) and Deutsche José Carreras Leukämie-Stiftung e.V. through the José Carreras-DGHO-Promotionsstipendium.

AUTHOR CONTRIBUTIONS

A.K.M., S.R., C.M.-T., and L.V. conceived the project. C.S.-T., A.K.M., M.A., J.-A.K., M.J., A.W., P.S., and M.B. generated the data and developed the laboratory protocols. S.B.-C., J.-A.K., A.K.M., M.B., P.P., and L.V. analyzed the data with support from M.S. and C.R. S.B.-C. and L.V. developed the statistical model. J.J.-M.L. and V.B. generated and processed raw sequencing data. S.B. advised on antibody-oligo conjugation. A.W. performed xenotransplant experiments. A.J. and M.B. performed FISH. L.V. and C.M.-T. supervised research. All other authors were involved in the acquisition and characterization of clinical specimen. L.V., A.K.M., S.B.-C., and C.M.-T. wrote the manuscript and generated the figures. All authors have read and commented on the manuscript.

DECLARATION OF INTERESTS

The Department of Medicine V (Director C.M.-T.) receives research funding from multiple pharmaceutical and biotech companies especially for clinical trials but also for translational research.

Received: July 29, 2022
Revised: February 5, 2023
Accepted: March 30, 2023
Published: April 24, 2023

REFERENCES

1. Velten, L., Haas, S.F., Raffel, S., Blaszkiewicz, S., Islam, S., Hennig, B.P., Hirche, C., Lutz, C., Buss, E.C., Nowak, D., et al. (2017). Human haematopoietic stem cell lineage commitment is a continuous process. Nat. Cell Biol. 19, 271–281. https://doi.org/10.1038/ncb3493.
2. Paul, F., Arkin, Y., Giladi, A., Jaitin, D., Kenigsberg, E., Keren-Shaul, H., Winter, D., Lara-Astiaso, D., Gury, M., Weiner, A., et al. (2015). Transcriptional heterogeneity and lineage commitment in myeloid progenitors. Cell 163, 1663–1677. https://doi.org/10.1016/j.cell.2015.11.013.
3. Tusi, B.K., Wolock, S.L., Weinreb, C., Hwang, Y., Hidalgo, D., Zilionis, R., Waisman, A., Huh, J.R., Klein, A.M., and Socolovsky, M. (2018). Population snapshots predict early haematopoietic and erythroid hierarchies. Nature 555, 54–60. https://doi.org/10.1038/nature25741.
4. Triana, S., Vonficht, D., Jopp-Saile, L., Raffel, S., Lutz, R., Leonce, D., Antes, M., Hernández-Malmierca, P., Ordoñez-Rueda, D., Ramasz, B., et al. (2021). Single-cell proteo-genomic reference maps of the hematopoietic system enable the purification and massive profiling of precisely defined cell states. Nat. Immunol. 22, 1577–1589. https://doi.org/10.1038/s41590-021-01059-0.
5. Perié, L., Duffy, K.R., Kok, L., de Boer, R.J., and Schumacher, T.N. (2015). The branching point in erythro-myeloid differentiation. Cell 163, 1655–1662. https://doi.org/10.1016/j.cell.2015.11.059.
6. Rodriguez-Fraticelli, A.E., Wolock, S.L., Weinreb, C.S., Panero, R., Patel, S.H., Jankovic, M., Sun, J., Calogero, R.A., Klein, A.M., and Camargo, F.D. (2018). Clonal analysis of lineage fate in native haematopoiesis. Nature 553, 212–216. https://doi.org/10.1038/nature25168.
7. Notta, F., Zandi, S., Takayama, N., Dobson, S., Gan, O.I., Wilson, G., Kaufmann, K.B., McLeod, J., Laurenti, E., Dunant, C.F., et al. (2016). Distinct routes of lineage development reshape the human blood hierarchy across ontogeny. Science 351, aab2116. https://doi.org/10.1126/science.aab2116.
8. Nam, A.S., Kim, K.T., Chaligne, R., Izzo, F., Ang, C., Taylor, J., Myers, R.M., Abu-Zeinah, G., Brand, R., Omans, N.D., et al. (2019). Somatic mutations and cell identity linked by Genotyping of transcriptomes. Nature 571, 355–360. https://doi.org/10.1038/s41586-019-1367-0.
9. van Egeren, D., Escabi, J., Nguyen, M., Liu, S., Reilly, C.R., Patel, S., Kamaz, B., Kalyva, M., DeAngelo, D.J., Galinsky, I., et al. (2021). Reconstructing the lineage histories and differentiation trajectories of individual cancer cells in myeloproliferative neoplasms. Cell Stem Cell 28, 514–523.e9. https://doi.org/10.1016/j.stem.2021.02.001.
10. Nam, A.S., Dusaj, N., Izzo, F., Murali, R., Mouhieddine, T.H., Myers, R.M., Sotelo, J., Benbarche, S., Gaiti, F., Tahri, S., et al. (2022). Single-cell multi-omics in human clonal hematopoiesis reveals that DNMT3A R882 mutations perturb early progenitor states through selective hypomethylation. https://doi.org/10.1101/2022.01.14.476225.
11. Izzo, F., Lee, S.C., Poran, A., Chaligne, R., Gaiti, F., Gross, B., Murali, R.R., Deochand, S.D., Ang, C., Jones, P.W., et al. (2020). DNA methylation disruption reshapes the hematopoietic differentiation landscape. Nat. Genet. 52, 378–387. https://doi.org/10.1038/s41588-020-0595-4.
phylogeny reconstruction via integrative use of single-cell and bulk sequencing data. Genome Res. 29, 1860–1877. https://doi.org/10.1101/gr.234435.118.
42. Zafar, H., Navin, N., Chen, K., and Nakhleh, L. (2019). SiCloneFit: Bayesian inference of population structure, genotype, and phylogeny of tumor clones from single-cell genome sequencing data. Genome Res. 29, 1847–1859. https://doi.org/10.1101/gr.243121.118.
43. McCarthy, D.J., Rostom, R., Huang, Y., Kunz, D.J., Danecek, P., Bonder, M.J., Hagai, T., Lyu, R.; HipSci Consortium, and Wang, W., et al. (2020). Cardelino: computational integration of somatic clonal substructure and single-cell transcriptomes. Nat. Methods 17, 414–421. https://doi.org/10.1038/s41592-020-0766-3.
44. Papaemmanuil, E., Gerstung, M., Bullinger, L., Gaidzik, V.I., Paschka, P., Roberts, N.D., Potter, N.E., Heuser, M., Thol, F., Bolli, N., et al. (2016). Genomic classification and prognosis in acute myeloid leukemia. N. Engl. J. Med. 374, 2209–2221. https://doi.org/10.1056/NEJMoa1516192.
45. Miles, L.A., Bowman, R.L., Merlinsky, T.R., Csete, I.S., Ooi, A.T., Durruthy-Durruthy, R., Bowman, M., Famulare, C., Patel, M.A., Mendez, P., et al. (2020). Single-cell mutation analysis of clonal evolution in myeloid malignancies. Nature 587, 477–482. https://doi.org/10.1038/s41586-020-2864-x.
46. Hennig, B.P., Velten, L., Racke, I., Tu, C.S., Thoms, M., Rybin, V., Besir, H., Remans, K., and Steinmetz, L.M. (2018). Large-scale low-cost NGS library preparation using a robust Tn5 purification and tagmentation protocol. G3 (Bethesda) 8, 79–89. https://doi.org/10.1534/g3.117.300257.
47. Gong, H., Holcomb, I., Ooi, A., Wang, X., Majonis, D., Unger, M.A., and Ramakrishnan, R. (2016). Simple method to prepare oligonucleotide-conjugated antibodies and its application in multiplex protein detection in single cells. Bioconjug. Chem. 27, 217–225. https://doi.org/10.1021/acs.bioconjchem.5b00613.
48. Paczulla, A.M., Rothfelder, K., Raffel, S., Konantz, M., Steinbacher, J., Wang, H., Tandler, C., Mbarga, M., Schaefer, T., Falcone, M., et al. (2019). Absence of NKG2D ligands defines leukaemia stem cells and mediates their immune evasion. Nature 572, 254–259. https://doi.org/10.1038/s41586-019-1410-1.
49. Wolock, S.L., Lopez, R., and Klein, A.M. (2019). Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 8, 281–291.e9. https://doi.org/10.1016/j.cels.2018.11.005.
50. Macosko, E.Z., Basu, A., Satija, R., Nemesh, J., Shekhar, K., Goldman, M., Tirosh, I., Bialas, A.R., Kamitaki, N., Martersteck, E.M., et al. (2015). Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214. https://doi.org/10.1016/j.cell.2015.05.002.
51. Mölder, F., Jablonski, K.P., Letcher, B., Hall, M.B., Tomkins-Tinch, C.H., Sochat, V., Forster, J., Lee, S., Twardziok, S.O., Kanitz, A., et al. (2021). Sustainable data analysis with Snakemake. F1000Res 10, 33. https://doi.org/10.12688/f1000research.29032.2.
52. Kiselev, V.Y., Yiu, A., and Hemberg, M. (2018). scmap: projection of single-cell RNA-seq data across data sets. Nat. Methods 15, 359–362. https://doi.org/10.1038/nmeth.4644.
53. Hie, B., Bryson, B., and Berger, B. (2019). Efficient integration of heterogeneous single-cell transcriptomes using Scanorama. Nat. Biotechnol. 37, 685–691. https://doi.org/10.1038/s41587-019-0113-3.
54. Stuart, T., Butler, A., Hoffman, P., Hafemeister, C., Papalexi, E., Mauck, W.M., Hao, Y., Stoeckius, M., Smibert, P., and Satija, R. (2019). Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21. https://doi.org/10.1016/j.cell.2019.05.031.
55. Luecken, M.D., Büttner, M., Chaichoompu, K., Danese, A., Interlandi, M., Mueller, M.F., Strobl, D.C., Zappia, L., Dugas, M., Colomé-Tatché, M., et al. (2022). Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods 19, 41–50. https://doi.org/10.1038/s41592-021-01336-8.
56. Bauer, M., Vaxevanis, C., Bethmann, D., Massa, C., Pazaitis, N., Wickenhauser, C., and Seliger, B. (2020). Multiplex immunohistochemistry as a novel tool for the topographic assessment of the bone marrow stem cell niche. Methods Enzymol. 635, 67–79. https://doi.org/10.1016/bs.mie.2019.05.055.
57. Bingham, E., Chen, J.P., Jankowiak, M., Obermeyer, F., Pradhan, N., Karaletsos, T., Singh, R., Szerlip, P., Horsfall, P., and Goodman, N.D. (2018). Pyro: deep universal probabilistic programming. https://doi.org/10.48550/arXiv.1810.09538.
58. Finak, G., McDavid, A., Yajima, M., Deng, J., Gersuk, V., Shalek, A.K., Slichter, C.K., Miller, H.W., McElrath, M.J., Prlic, M., et al. (2015). MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278. https://doi.org/10.1186/s13059-015-0844-5.
STAR★METHODS
Continued
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Additional data: MutaSeq data (Figures 4D and 4E) | Figshare | Figshare: https://doi.org/10.6084/m9.figshare.21982424
All single cell RNA-seq datasets, raw data | EGA | EGA: EGAS00001007078
Reference data4 | Figshare | Figshare: https://doi.org/10.6084/m9.figshare.13397987.v3
Reference data31 | Zenodo | Zenodo: https://doi.org/10.5281/zenodo.6496279

Experimental models: Organisms/strains
NOD.Prkdcscid.Il2rgnull (NSG) mice | Jackson Laboratory | 005557

Oligonucleotides
For a complete list of oligonucleotides used in this study, see Table S4. | N/A | N/A
FISH probes: 6q21/8q24 | MetaSystems | D-5802-100-OG
FISH probes: 7cen/7q22/7q36 | MetaSystems | D-5043-100-TC

Software and algorithms
CloneTracer and Primer Design code | Zenodo | Zenodo:
ComplexHeatmap | CRAN | v. 2.6.2
FlowJo | FlowJo, LLC | v. 10.8.1, 10.6.1
ggplot2 | CRAN | v. 3.3.5
Htseq (https://pypi.org/project/HTSeq/) | PyPI | v. 2.02
mitoClone | Zenodo | Zenodo: https://doi.org/10.5281/zenodo.4443074
Mgatk (https://github.com/caleblareau/mgatk) | Github | v. 0.1.1
Pheatmap | CRAN | v. 1.0.12
PhISCS (https://github.com/sfu-compbio/PhISCS) | Github | v. 1.0.0
R | CRAN | v. 4.0.2
Seurat | CRAN | v. 4.3.0
Scrublet (https://github.com/swolock/scrublet) | Github | v. 0.2.3
Scmap | Bioconductor | https://doi.org/10.18129/B9.bioc.scmap
Scanorama (https://github.com/brianhie/scanorama) | Github | v. 1.7.3
Spectre (https://github.com/ImmuneDynamics/Spectre) | Github | v. 1.0.0
STAR (https://github.com/alexdobin/STAR) | Github | v. 2.5.4

Other
StemSpan SFEM media | Stem Cell Technologies | 09650
RESOURCE AVAILABILITY
Lead contact
Requests for further information, resources and reagents should be directed to and will be fulfilled by the lead contact, Lars Velten
([email protected]).
Materials availability
This study did not generate new unique reagents.
Human subjects
Bone marrow samples from AML patients were obtained at the Heidelberg University Hospital after informed written consent using
ethics application number S-169/2017. For demographic characteristics of sample donors, see Table S3. All experiments involving
human samples were approved by the ethics committee of the University Hospital Heidelberg and were in accordance with the
Declaration of Helsinki.
Animals
NOD.Prkdcscid.Il2rgnull (NSG) mice were bred and housed under specific pathogen-free conditions in the central animal facility of
the German Cancer Research Center (DKFZ). Animal experiments were approved and performed in accordance with all regulatory
guidelines of the official committee (Regierungspräsidium Karlsruhe). Immune compromised, healthy, female NSG mice 8-12 weeks
of age and with an average weight of 18-25 g were sublethally irradiated (175 cGy) 24 h before xenotransplantation assays.
METHOD DETAILS
Antibody-oligo conjugation
For markers where no commercial conjugates were available, azide-modified oligonucleotides were conjugated to purified anti-
bodies (anti-human CD166, Clone 3A6 (Biolegend, 343902); anti-human GPR56, Clone 4C3 (Biolegend, 391902)) by the use of a
DBCO-PEG5-NHS Ester (Santa Cruz Biotechnology, Dallas, USA) in a copper-free click reaction.47
In brief, azide-containing storage buffer of purified antibodies was exchanged to PBS (pH 8.5) using the Amicon Ultra-0.5 NMWL
30 kDa Centrifugal Filter (EMD Millipore, Billerica, USA).
100 µg of PBS-buffered antibody was incubated with 2 mM DBCO-PEG5-NHS in a final reaction volume of 100 µL for 30 minutes at
room temperature. The reaction was stopped by the addition of 100 mM Tris HCl (pH 8) for 5 minutes at room temperature and
non-reactive DBCO-PEG5-NHS was removed using the Amicon Ultra-0.5 NMWL 30 kDa Centrifugal Filter.
Azide-modified oligonucleotides were reconstituted in PBS before adding 30 pmol per 1 µg DBCO-functionalized antibody. The
click reaction was conducted at 4 °C for 18 hours. Unreacted oligonucleotides were removed using the Amicon Ultra-0.5 NMWL
50 kDa Centrifugal Filter and the final volume was adjusted to 100 µL using PBS (pH 8.5).
Conjugation products were confirmed on ethidium bromide (EtBr) stained 2% agarose gels, Coomassie brilliant blue (CBB) stained
4-12% polyacrylamide gels and by absorbance spectroscopy.
Azide-modified oligonucleotides were purchased from Biolegio (Biolegio, Nijmegen, Netherlands) and contained an antibody-spe-
cific barcode (bold), a PCR handle (italic) and a capture sequence (underlined). * indicates a phosphorothioated bond to prevent
nuclease degradation:
CD166: 5’/Azide/CCTTGGCACCCGAGAATTCCACATTAACAGCGCCAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA*A*A
GPR56: 5’/Azide/CCTTGGCACCCGAGAATTCCATCATATCCGTTGTCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA*A*A
used as a second sorting marker to enrich stem cells from the CD34- fraction. Population frequencies were recorded and accounted
for in quantitative analysis of the single-cell RNA-seq data set. Sorted cells were loaded onto the Next GEM Chip G for a targeted cell
recovery of 5000 cells following the manufacturer’s instructions (10x Genomics, CG000206 Rev D).
In cohort B, fluorophore-tagged antibodies against human CD34 (clone 581) and CD45 (clone HI30) (patients B.1, B.2, B.3), or
CD56 (clone QA17A16) and CD45 (patient B.4) were used to enrich for the following populations:
Population frequencies were recorded and accounted for in quantitative analysis of the single-cell RNA-seq data set. In particular,
in analyses that investigate the absolute frequencies of cell types in bone marrow (Figure 5D) the frequency of the cell type was
computed per sorted population, and multiplied with the frequency of the sorting gates in the bone marrow sample.
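The reweighting described above can be illustrated with a minimal Python sketch: the frequency of each cell type within a sorted population is multiplied by the frequency of the corresponding sorting gate in the bone marrow sample. The function name, the gate labels, and all numbers are hypothetical, and summing the contributions across gates is an assumption of this sketch rather than a detail stated in the text.

```python
def absolute_frequencies(gate_fractions, celltype_fractions_per_gate):
    """Estimate bone marrow frequencies of cell types from sorted populations.

    gate_fractions: fraction of bone marrow cells falling into each sorting gate.
    celltype_fractions_per_gate: for each gate, the fraction of sequenced cells
        assigned to each cell type (computed per sorted population).
    """
    totals = {}
    for gate, gate_fraction in gate_fractions.items():
        for cell_type, within_gate in celltype_fractions_per_gate[gate].items():
            # Per-gate frequency multiplied by the frequency of the gate itself.
            contribution = gate_fraction * within_gate
            # Assumption: contributions from different gates are added up.
            totals[cell_type] = totals.get(cell_type, 0.0) + contribution
    return totals


# Hypothetical example: two sorting gates with made-up frequencies.
gate_fractions = {"CD34+": 0.1, "CD34-": 0.9}
within_gate = {
    "CD34+": {"HSC": 0.2, "Monocyte": 0.05},
    "CD34-": {"HSC": 0.0, "Monocyte": 0.4},
}
result = absolute_frequencies(gate_fractions, within_gate)
```

In this made-up example, stem cells make up 20% of the CD34+ gate, which itself covers 10% of the bone marrow, giving an estimated absolute frequency of about 2%.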
In cases where different biological samples were combined in the same GEM generation run, cells were labeled additionally with
oligonucleotide coupled cell hashing antibodies (Biolegend, San Diego, USA). FACS sorting of live bone marrow cells was performed
using DRAQ7 (1:1000; Biolegend, San Diego, USA) and Incucyte Caspase3/7 Red (1:5000; VWR International, Radnor, Pennsylvania,
USA) on a BD FACSAria™ Fusion equipped with a 100 µm nozzle. Sorted cells were loaded onto the Next GEM Chip G for a targeted
cell recovery of 8000 cells following the manufacturer’s instructions (10x Genomics, CG000206 Rev D).
for 3 min followed by a 10 °C hold. After the PCR, all reactions were pooled and underwent two successive rounds of bead cleanup. In
the first bead cleanup, 0.6X (v/v) CleanPCR beads were used, followed by two 80% ethanol washes and elution in 50 µL EB. In the
second cleanup, 0.6X (v/v) CleanPCR beads were again used, followed by two 80% ethanol washes and elution in 15 µL EB. The final
library was then quantified by Qubit and QC’ed on the Bioanalyzer. Representative Bioanalyzer traces are shown in Figure S1B.
Lin− or Lin−CD34+ single cells were index-sorted into U-bottom 96-well plates (Sarstedt) containing 100 µL StemSpan SFEM media
(Stem Cell Technologies). Media was supplemented with penicillin/streptomycin (100 ng/mL), UM729 (1 µM, Stem Cell Technologies)
and the following human cytokines (all from Peprotech): SCF (20 ng/mL), Flt3-L (20 ng/mL), TPO (50 ng/mL), IL-3 (20 ng/mL), IL-6
(20 ng/mL). After two weeks at 5% CO2 and 37 °C, colonies were imaged by microscopy, and harvested in 12 µL buffer RLT (Qiagen)
for subsequent DNA isolation.
Tissue microarrays
The frequency of different cell subsets in the bone marrow microenvironment in AML patients was analyzed by multispectral imaging
(MSI). Formalin-fixed and paraffin embedded (FFPE), decalcified bone marrow samples were stained as described elsewhere.56 The
marker panel used for staining included antibodies directed against CD34, CD14, CD11c, CD49f. For antibody clones and dilutions,
see Table S1. All primary antibodies were incubated for 30 min. Tyramide signal amplification (TSA) visualization was performed using
the Opal seven-color IHC kit containing fluorophores Opal 520, Opal 540, Opal 570, Opal 690 (Akoya Biosciences, Marlborough, MA,
USA), and DAPI. Stained slides were imaged employing the PerkinElmer Vectra Polaris platform. To unify the spatial distribution
analysis, ×20 MSI fields (1872 × 1404 pixels, 0.5 µm/pixel) were analyzed. Cell segmentation and phenotyping of the cell
subpopulations were performed using the inForm software (PerkinElmer Inc., USA). The frequency of all immune cell populations analyzed
and the cartographic coordinates of each stained cell type were obtained.
Xenotransplantations
Female NSG mice 8-12 weeks of age were sublethally irradiated (175 cGy) 24 h before xenotransplantation assays. FACS sorted
primary AML samples were injected into the femoral BM cavity of sublethally irradiated mice. Mice were monitored daily, and femur
bone marrow aspirates were taken at 16 weeks to determine engraftment and lineage potential. Human leukemic engraftment in
mouse BM was evaluated by flow cytometry using anti-human-CD45-FITC (clone HI30), anti-human-CD34-BUV395 (clone 581),
anti-human-CD19-FITC (clone HIB19), anti-human-CD33-PE-Cy7 (clone WM53), CD3-BV510 (clone OKT3), and anti-mouse-
CD45-Alexa700 (clone 30-F11).
CloneTracer model
See Methods S1 for a full description of the CloneTracer model. Posterior predictive checks57 were used to determine if the data met
the assumptions of the statistical model, as detailed in the Methods S1.
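For readers unfamiliar with posterior predictive checks, the general idea can be sketched in a few lines of Python. This is not the CloneTracer model (which is described in Methods S1); it is a deliberately simple beta-binomial toy in which every cell is assumed to share a single mutant read fraction. The function name, the prior, the test statistic (number of cells without mutant reads), and the example counts are all assumptions made for illustration.

```python
import random


def posterior_predictive_pvalue(mutant_reads, total_reads,
                                n_draws=500, seed=0, a=1.0, b=1.0):
    """Toy posterior predictive check for a shared-rate binomial model.

    Returns the fraction of replicated datasets in which the number of cells
    with zero mutant reads is at least as large as in the observed data.
    """
    rng = random.Random(seed)
    k_sum = sum(mutant_reads)
    n_sum = sum(total_reads)
    # Test statistic on the observed data.
    observed_stat = sum(1 for k in mutant_reads if k == 0)
    exceed = 0
    for _ in range(n_draws):
        # Draw the shared mutant fraction from its Beta posterior
        # (Beta(a, b) prior combined with binomial read counts).
        theta = rng.betavariate(a + k_sum, b + n_sum - k_sum)
        # Simulate a replicated dataset with the same coverage per cell.
        replicated = [sum(rng.random() < theta for _ in range(n))
                      for n in total_reads]
        rep_stat = sum(1 for k in replicated if k == 0)
        if rep_stat >= observed_stat:
            exceed += 1
    return exceed / n_draws


# Hypothetical counts: half of the cells carry no mutant reads at all,
# which a single shared rate cannot reproduce.
p = posterior_predictive_pvalue([0, 0, 0, 0, 0, 5, 6, 4, 5, 6], [10] * 10)
```

A value of `p` close to 0 indicates that data simulated from the fitted model rarely look like the observed data, which in this toy case reflects a mixture of mutant and wild-type cells that the single-rate model does not capture.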
Data visualization
All plots were generated using the ggplot2 (v. 3.3.5), ComplexHeatmap (v. 2.6.2) and pheatmap (v. 1.0.12) packages in R 4.0.2 or
FlowJo (v. 10.6.1, BD). Boxplots are defined as follows: the middle line corresponds to the median; the lower and upper hinges
correspond to the first and third quartiles, respectively; the upper whisker extends from the hinge to the largest value no further than 1.5 times the
inter-quartile range (the distance between the first and third quartiles) from the hinge, and the lower whisker extends from the hinge
to the smallest value at most 1.5 times the inter-quartile range from the hinge. Data beyond the end of the whiskers are called ‘outlying’ points
and are plotted individually.
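The boxplot definition above can be made concrete with a short Python sketch that computes the hinges, whiskers, and outlying points for a list of values. The function name and the example data are hypothetical, and quartiles are computed with the standard library's inclusive method, which is an assumption of this sketch and may differ slightly from the hinge computation used by the plotting packages.

```python
import statistics


def boxplot_stats(values, coef=1.5):
    """Return hinges, whiskers, and outlying points for a boxplot."""
    data = sorted(values)
    # Lower hinge, median, and upper hinge (first, second, third quartile).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - coef * iqr
    upper_fence = q3 + coef * iqr
    # Whiskers end at the most extreme values still within 1.5 x IQR of the hinges.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": inside[0],
        "upper_whisker": inside[-1],
        # Points beyond the whiskers are plotted individually.
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

For this made-up input, the hinges are 3.25 and 7.75, so the inter-quartile range is 4.5, the upper whisker ends at 9, and the value 30 is drawn as an outlying point.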
Detail on statistical tests used in the different figures and definition of relevant summary statistics are included in the figure legends.
ADDITIONAL RESOURCES
Interactive 2D and 3D versions of most UMAPs from this paper are available at https://veltenlab.crg.eu/clonetracer/.