Comprehensive Characterization of The Multiple Mye
analysis
Gonzalez-Kozlova2, Surendra Dasari4, Mark A. Fiala1, Yered Pita-Juarez6, Michael Strausbauch4, Geoffrey Kelly2,
Beena E. Thomas3, Shaji K. Kumar4, Hearn Jay Cho2,5, Emilie Anderson4, Michael C. Wendl1, Travis Dawson2,
Darwin D’souza2, Stephen T. Oh1, Giulia Cheloni6, Ying Li4, John F. DiPersio1, Adeeb H. Rahman2, Kavita M.
Dhodapkar3, Seunghee Kim-Schulze2, Ravi Vij1, Ioannis S. Vlachos6, Shaadi Mehr5, Mark Hamilton5, Daniel Auclair5,
Taxiarchis Kourelis4*, David Avigan6*, Madhav V. Dhodapkar3*, Sacha Gnjatic2*, Manoj K. Bhasin3*, Li Ding1*
*corresponding authors
Running title:
Correspondence to:
Li Ding, Ph.D, Washington University School of Medicine, Saint Louis, MO, 63110. Email: [email protected]. Phone:
3142861848
Competing interests
Keywords
Abstract
As part of the Multiple Myeloma Research Foundation (MMRF) immune atlas pilot project, we
compared immune cells of Multiple Myeloma (MM) bone marrow samples from 18 patients
assessed by single-cell RNA-seq (scRNA-seq), mass cytometry (CyTOF), and Cellular Indexing
measurements among single-cell techniques. Cell type abundances are relatively consistent
across the three approaches, while variations are observed in T cells, macrophages, and
monocytes. Concordance and correlation analysis of cell type marker gene expression across
different modalities highlighted the importance of choosing cell type marker genes best suited to
particular modalities. By integrating data from these three assays, we found that International
Staging System (ISS) stage 3 patients exhibited a decreased CD4+ T/CD8+ T cell ratio.
Moreover, we observed upregulation of RAC2 and PSMB9 in NK cells of fast progressors (FP)
multiple single cell technologies revealed markers associated with MM rapid progression which
Significance
scRNA-seq, CyTOF, and CITE-seq are increasingly used for evaluating cellular heterogeneity.
Understanding their concordances is of great interest. To date, this study is the most
Introduction
Single-cell sequencing technologies offer advantages over traditional bulk methods in cancer
genomics research for evaluating cellular heterogeneity and investigating evolution of cellular
subpopulations between the tumor and its microenvironment. For example, single-cell methods
marked by uncontrolled clonal expansion of plasma cells. Single-cell RNA sequencing (scRNA-
seq) has been used to examine tumor and immune cell populations [1,2] and mass cytometry
(CyTOF) to evaluate the impact of drugs on immune populations in MM [3]. The third technology,
proteins. All three approaches enable identification of cell types, cell states, and characterization
In addition, the bone marrow microenvironment plays an important role in the evolution of
cells and CD16+ monocytes [2]. Specifically, the percentage of CD4+ T cells was significantly
reduced in bone marrow of MM patients, leading to altered CD4+ T/CD8+ T ratio [4]. When
comparing the clinical status, the ratio decreased in International Staging System (ISS) stage 3
patients compared to stage 1 patients [5]. With respect to treatment, the proportion of CD3+ T
cells was lower in treated patients compared to chemo-naive MM patients [6]. Further work is
needed to expand initial findings using various assays and reveal candidate markers for
Combining the timeliness of the technology concordance question with furtherance of MM
research, we subjected bone marrow samples from 18 MM patients to scRNA-seq, CyTOF, and
CITE-seq, examining the similarities across the aforementioned single cell techniques. We used
the results to investigate the relationship between immune population compositional alterations
and disease stages and revealed a set of markers associated with MM rapid progression.
All procedures performed in studies involving human participants were in accordance with the
ethical standards of the MMRF research committee. These samples provided by MMRF were all
from the MMRF’s CoMMpass clinical trial (NCT01454297). Written informed patient
consent was obtained from all patients for the collection and analysis of their samples by the
MMRF. The CoMMpass study was conducted in accordance with recognized ethical guidelines
in the US and EU. The Institutional Review Board at each participating center approved the
study protocol.
BMA samples were obtained from subjects enrolled in the MMRF CoMMpass study
(NCT01454297). Any blood clots were removed from BMA samples via passage through
a 70 µm cell strainer. BMA samples were divided into 5 mL aliquots in 50 mL conical
tubes and 45 mL of 0.22 µm-filtered ACK lysing buffer (155 mM Ammonium Chloride/10 mM
tube gently inverted several times to mix. Tubes were then centrifuged at 400xg for 5
minutes. The supernatant was removed and the cell pellet resuspended with 5 mL of
RPMI-1640 and transferred to a clean tube. All aliquots of ACK-lysed BMA aliquots were
combined into 1x 50 mL tube, the volume adjusted to 50 mL with RPMI-1640. The cells
were then mixed by gentle inversion and the tube centrifuged at 400xg for 5 minutes.
The supernatant was then removed by aspiration. Depending on the size of the BMA cell
saline (PBS) containing 2%FBS (v/v) and 1mM EDTA (PBS/FCS/EDTA buffer). 25mL of
from subjects enrolled in the MMRF CoMMpass study (NCT01454297) were
isolated via negative selection from CD138-positive (CD138+) myeloma cells using the
Selection Kit: Stem Cell Technologies) in accordance with the manufacturer's protocol.
Briefly, 100 × 10⁶ cells/mL bone marrow mononuclear cells (MNC) in a sterile 17x100mm
(14mL) tube were gently mixed and incubated with 100 µL/mL CD138 selection antibody
nanoparticles was then added to the cell suspension, gently mixed, and incubated for a
further 10 minutes at room temperature. The volume of the cell suspension was then
adjusted to 8mL with phosphate-buffered saline (PBS) containing 2%FBS (v/v) and 1mM
EDTA (PBS/FCS/EDTA buffer) and the cell suspension mixed by gentle pipetting (2-3x).
The tube was then placed in the magnetic separator. After 5 minutes incubation at room
temperature, the magnet and tube were carefully inverted to pour off the supernatant
into a sterile 50mL conical tube. This supernatant contains the heterogeneous CD138-
negative immune cell mononuclear population (MNC). The tube was then removed from
the magnet and an additional 8mL of PBS/FCS/EDTA added, gently mixed, and returned
to the magnetic separator. Again, after 5 minutes incubation in the magnetic separator,
the tube and magnet were carefully inverted to pour off the supernatant into the 50mL
collection tube. This PBS/FCS/EDTA ‘wash’ step was repeated once more resulting in
~24mL suspension of CD138-negative bone marrow MNC cells. CD138- MNC cells were
then pelleted by centrifugation at 400xg for 5 minutes and the supernatant removed by
Processing of BMMC and library prep from MMRF CoMMpass study for scRNA-seq at
WUSTL
WUSTL Cell Thawing: Multiple Myeloma bone marrow mononuclear cells (BMMC) aliquots were
thawed in 37⁰C water bath. Cells were then pelleted by centrifugation at 300g for 5 min and all
supernatant was removed. To prepare cells for the Miltenyi Dead Cell Removal Kit, cells were
resuspended in 100 uL of beads and incubated at room temperature for 15 minutes. Dead cells
were depleted using the autoMACS®Pro Separator. Live cells were pelleted by centrifugation at
450g for 5 minutes. Cells were finally resuspended in ice-cold phosphate-buffered saline (PBS)
with 0.5% BSA and loaded onto the 10x Genomics Chromium Controller using the
Chromium Next GEM Single Cell 3’ GEM, Library & Gel Bead Kit v3.3. Utilizing the 10x
Genomics Chromium Single Cell 3’v3 Library Kit and Chromium instrument, approximately
16,500 to 20,000 cells were partitioned into nanoliter droplets to achieve single cell resolution
for a maximum of 10,000 individual cells per sample. The resulting cDNA was tagged with a
common 16nt cell barcode and 10nt Unique Molecular Identifier during the RT reaction. Full
length cDNA from poly-A mRNA transcripts was enzymatically fragmented and size selected to
optimize the cDNA amplicon size (approximately 400 bp) for library construction (10x
Genomics). The concentration of the 10x single cell library was accurately determined through
qPCR (Kapa Biosystems) to produce cluster counts appropriate for the HiSeq 4000 or NovaSeq
6000 platform (Illumina). 26x98bp (3'v2 libraries) sequence data were generated targeting
between 25K-50K read pairs/cell, which provided digital gene expression profiles for each
individual cell.
ISMMS BMMC processing differences from WUSTL: BMMC aliquots were partially
the partially thawed BMMC aliquot and the entire volume was transferred to a 15 mL conical
tube containing 10 mL of warm thawing media. The empty BMMC tube was rinsed with
another 1 mL of thawing media which was then also transferred to the 15 mL conical tube.
Cells were processed using the EasySep Dead Cell Removal (Annexin V) Kit (StemCell
For single cell RNA-seq analysis, the proprietary software tool Cell Ranger v3.0.0 from 10x
Genomics was used for de-multiplexing sequence data into FASTQ files, aligning reads to the
Seurat v3.0.0 [7,8] was used for all subsequent analysis. First, a series of quality filters was
applied to the data to remove barcodes that fell into any one of the following categories
recommended by Seurat: too few total transcript counts (< 300); possible debris, with too few
genes expressed (< 200) and too few UMIs (< 1,000); possible multiplets, with too
many genes expressed (> 50,000) and too many UMIs (> 10,000); and possible dead cells or cells
showing signs of stress and apoptosis, with too high a proportion of mitochondrial gene expression
relative to total transcript counts (> 20%). Finally, predicted doublets were removed using Scrublet
v0.2.3.
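The barcode-level filters listed above can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline (which used Seurat): the `BarcodeMetrics` type, its field names, and the `passes_qc` helper are hypothetical. The sketch also assumes that each threshold is applied independently, so a barcode failing any single bound is removed, and that the "< 300 total transcript counts" rule is subsumed by the UMI lower bound. The thresholds themselves are copied from the text.

```python
from dataclasses import dataclass


@dataclass
class BarcodeMetrics:
    """Per-barcode QC summary (hypothetical structure for this sketch)."""
    barcode: str
    n_genes: int      # number of genes detected
    n_umis: int       # total UMI count
    pct_mito: float   # percent of counts from mitochondrial genes


# Thresholds as stated in the text.
MIN_GENES, MAX_GENES = 200, 50_000
MIN_UMIS, MAX_UMIS = 1_000, 10_000
MAX_PCT_MITO = 20.0


def passes_qc(m: BarcodeMetrics) -> bool:
    """Return True if the barcode survives every filter.

    Assumption: each bound is applied on its own, so failing any one
    of them removes the barcode.
    """
    if m.n_genes < MIN_GENES or m.n_umis < MIN_UMIS:
        return False  # possible debris
    if m.n_genes > MAX_GENES or m.n_umis > MAX_UMIS:
        return False  # possible multiplet
    if m.pct_mito > MAX_PCT_MITO:
        return False  # possible dead or stressed cell
    return True


# Toy usage with made-up barcodes.
cells = [
    BarcodeMetrics("AAAC", 1500, 4000, 5.0),
    BarcodeMetrics("AAAG", 150, 400, 3.0),
    BarcodeMetrics("AAAT", 1800, 5200, 31.0),
]
kept = [c.barcode for c in cells if passes_qc(c)]
```

Doublet prediction is a separate step in the text and is not reproduced here.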
We constructed a Seurat object using the unfiltered feature-barcode matrix for each sample.
Each sample was scaled and normalized using Seurat’s ‘SCTransform’ function to correct for
return.only.var.genes = F).
Cell types were assigned to each cluster by manually reviewing the expression of marker genes.
The marker genes for main cell types were CD79A, CD79B, MS4A1 (B cells); CD8A, CD8B,
CD7, CD3E (CD8+ T cells); CD4, IL7R, CD7, CD3E (CD4+ T cells); NKG7, GNLY, KLRD1,
NCAM1 (NK cells); MZB1, SDC1, IGHG1 (Plasma cells); CLEC4C, IL3RA, IRF8, GZMB
(Dendritic cells); FCGR3A (Macrophages); CD14, LYZ, S100A8, S100A9 (Monocytes); AZU1,
ELANE, MPO (Neutrophils); COL1A1, COL3A1, TNC, S100A4 (Fibroblasts); and AHSP1, HBA,
HBB (Erythrocytes). Detailed cell type markers are listed in Supplementary Table S1A. All cells
that were labeled as erythrocytes and plasma cells were removed from subsequent analysis.
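Cell types in this study were assigned by manual review of marker expression. As a rough aid to that kind of review, the sketch below ranks candidate cell types for one cluster by the mean expression of their marker genes. The marker lists are taken from the text (a subset is shown); the function name `rank_cell_types`, the input format, and the simple averaging score are assumptions made for illustration, not the authors' method.

```python
# Marker genes for a subset of the main cell types, as listed in the text.
MARKERS = {
    "B cells": ["CD79A", "CD79B", "MS4A1"],
    "CD8+ T cells": ["CD8A", "CD8B", "CD7", "CD3E"],
    "CD4+ T cells": ["CD4", "IL7R", "CD7", "CD3E"],
    "NK cells": ["NKG7", "GNLY", "KLRD1", "NCAM1"],
    "Plasma cells": ["MZB1", "SDC1", "IGHG1"],
    "Monocytes": ["CD14", "LYZ", "S100A8", "S100A9"],
}


def rank_cell_types(cluster_mean_expr):
    """Rank cell types by mean marker expression in one cluster.

    cluster_mean_expr maps gene symbol -> average normalized expression
    in the cluster (hypothetical input format). Genes missing from the
    mapping are treated as not expressed (0.0).
    """
    scores = {}
    for cell_type, genes in MARKERS.items():
        values = [cluster_mean_expr.get(g, 0.0) for g in genes]
        scores[cell_type] = sum(values) / len(values)
    # Highest-scoring cell type first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy cluster with made-up expression values.
toy_cluster = {"CD79A": 3.0, "CD79B": 2.5, "MS4A1": 2.0, "CD3E": 0.1}
ranking = rank_cell_types(toy_cluster)
```

A ranking like this only suggests a label; shared markers such as CD7 and CD3E mean that closely related T cell subsets still need inspection by eye.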
Samples were thawed in a water bath at 37°C for 2-3 min, and cell concentration and viability
were determined using a Bio-Rad T20 Cell Counter (Cat# 145-0102). Samples were blocked by
incubation with TruStain fcX (Biolegend, cat# 422301) in a 50 μL cell labeling buffer. Next,
samples were labeled with Total-seq antibodies (Biolegend, Supplementary Table S1B) for 30
minutes. Cells were washed and resuspended to obtain a cell concentration of 700-1,200
cells/μl and gently mixed by pipetting with a regular-bore pipette tip until a single-cell suspension was
achieved. We then proceeded immediately to Single Cell Gene Expression Library (3’GEX)
construction using 10X Chromium Single Cell 3' Reagent Kits v3 (Cat# 1000075) and Chromium
i7 Sample Index Plate with Barcoding technology for Cell Surface Protein. For each sample,
5,000 cells were injected for CITE-seq. The libraries were sequenced on the NovaSeq S4 platform
using paired-end sequencing with a single index and at least 50,000 read pairs per cell.
We used Cell Ranger to demultiplex, map to the human reference genome (GRCh38), and count
out cells with more than 10% UMIs from mitochondrially-encoded genes or less than 1,200
mRNA UMIs in total. We then constructed a Seurat object using the feature-barcode matrix for
each sample (Seurat v3.0.0). Each sample was scaled and normalized using Seurat’s
levels were added to the Seurat object, followed by normalization and scaling for ADT assay.
altExp_name = "ADT", transform = "log"). We then integrated the RNA and ADT matrices using an
integration algorithm called similarity network fusion (SNF) and clustered cells by Louvain
clustering. Then, cell types were assigned to each cluster by manually reviewing the expression
of marker genes at RNA levels (same as scRNA-seq, Supplementary Table S1A) and ADT
levels (if available). All cells that were labeled as erythrocytes and plasma cells were removed
BMMC aliquots were thawed in a 37⁰C water bath and immediately transferred into RPMI+10%
FBS. Cells were pelleted by centrifugation at 300g for 5 minutes and all supernatant was
removed. Cells were then incubated for 20 minutes in a 37⁰C water bath with Cell-ID Rh103
Intercalator (Fluidigm, Cat# 201103A) to label non-viable cells. Samples were then blocked with
Fc receptor blocking solution (Biolegend, Cat# 422302) and stained with a cocktail of surface
antibodies for 30 minutes on ice. All antibodies were either conjugated in-house using
Fluidigm's ×8 polymer conjugation kits or purchased commercially from Fluidigm. Next, samples
into a single tube. The pooled sample was then fixed and permeabilized using BD's
cocktail of intracellular antibodies. Finally, the sample was re-fixed with freshly diluted 2.4%
formaldehyde in PBS containing 0.02% saponin and Cell-ID Intercalator-Ir (Fluidigm, Cat#
201192A) to label nucleated cells. The sample was then stored as a pellet in PBS until
acquisition.
Immediately prior to acquisition, the pooled sample was washed with Cell Staining Buffer and
Cell Acquisition Solution (Fluidigm, Cat# 201240) and resuspended in Cell Acquisition Solution
(Fluidigm, Cat# 201078). The sample was acquired on the Fluidigm Helios mass cytometer
using the wide bore injector configuration at an acquisition speed of < 400 cells per second.
BMMC aliquots were thawed in a 37⁰C water bath and immediately transferred into 15ml tubes
Aldrich; Catalog Number-E1014-5KU; 250U/mL). Cells were pelleted by centrifugation (all spins
at 500g for 5 minutes) and supernatant was removed. Cells were then incubated for 1hr in a
37⁰C water bath in 10mL of RPMI+10% FBS. Cells were counted and 3-4 million cells were
aliquoted into microfuge 2mL conical tubes, pelleted and washed 2X with 2mL CSB Maxpar®
Cell Staining Buffer (Fluidigm; Catalog number-201068; 500 mL) and resuspended in 300uL of
Cell-ID™ Cisplatin (Fluidigm; Catalog Number: 201064) for 5 min at room temperature to label dead cells.
The reaction was immediately quenched with 1.5 mL CSB, and cells were pelleted and washed twice with CSB.
For staining, the cell pellet was gently resuspended in 50uL CSB and the addition of an equal
reaction was incubated on a rocker platform for 45min at RT. 1mL of CSB was used to wash
and pellet the cells 2X. Cell pellet was resuspended in the residual volume and then gently
resuspended in 500uL of 1X PBS. An equal volume of 4% PFA in PBS was added to fix cells for
a minimum of 20 minutes at a final concentration of 2% PFA in PBS. The sample was labeled
overnight at 4⁰C on a rocker platform with Cell-ID Intercalator-Ir (Fluidigm, Cat# 201192A) in
Maxpar Fix and Perm Buffer (Fluidigm; Catalog Number-201067; 100 mL) to label nucleated
cells.
The following day the sample was washed 1X with CSB (all cell pelleting performed at 800g for
5 minutes after fixation) and twice with Cell Acquisition Solution (Fluidigm, Cat# 201240). Final
resuspension was in Cell Acquisition Solution at a concentration of 0.7 million cells per mL
containing a 1:10 dilution of EQ normalization beads (Fluidigm, Cat# 201078). The sample was
acquired on the Fluidigm Helios mass cytometer using the wide bore injector configuration at a
targeted acquisition speed of 300 events per second. A cryopreserved specimen of 3-4 million
Ficoll enriched PBMC derived from a pool of 4 anonymous platelet donors was included with
every batch of MMRF samples [24]. This sample was treated and analyzed in parallel
BMMC aliquots were thawed in a 37⁰C water bath and immediately transferred into RPMI+10%
FBS. Cells were pelleted by centrifugation at 300g for 5 minutes and all supernatant was
removed. Cells were then incubated for 20 minutes in a 37⁰C incubator. Cells were pelleted by
centrifugation at 300g for 5 minutes and all supernatant was removed. Cells were resuspended
in PBS and incubated with cisplatin for 1 min (Fluidigm, Cat# 201195) to label non-viable cells.
with a cocktail of surface antibodies for 15 minutes at room temperature. All antibodies were
commercially from Fluidigm. Next, samples were washed, then fixed and permeabilized with the TF
Fix/Perm and Perm/Wash Kit (BD Pharmingen, Cat# 51-9008100 and # 51-9008102) using
Perm/Wash with a cocktail of intracellular antibodies. After washing and centrifugation at 800g
for 5 minutes, the sample was re-fixed with Maxpar Fix I buffer (Fluidigm, Cat# 201065) and
Cell-ID Intercalator-Ir (Fluidigm, Cat# 201192A) to label nucleated cells. The sample was then
stored as a pellet in PBS until acquisition. Immediately prior to acquisition, the sample was
washed with Cell Staining Buffer and Maxpar Water (Fluidigm, Cat# 201069) and resuspended
normalization beads (Fluidigm, Cat# 201078). The sample was acquired on the Fluidigm Helios
mass cytometer using the HT injector configuration at an acquisition speed of < 500 cells per
second.
The resulting FCS files were normalized and concatenated using Fluidigm's CyTOF software
Cytobank by removing EQ beads, low DNA debris, and gaussian multiplets. Barcoding
multiplets were also removed based on the Mahalanobis distance and barcode separation
gating out cells/debris with outlier cisplatin and DNA intercalator staining. Cell populations were
ISMMS and Mayo used the same gating strategy: CD4+CD8- (CD4+ T cells); CD8+CD4- (CD8+
T cells), Emory: CD4+CD8- (CD4+ T cells); CD8+CD4- (CD8+ T cells); CD45RO-CCR7+ (Naive
T cells). Next, we performed t-SNE analysis for 18 samples from ISMMS. We used the scaled
expression of markers, including CD57, CD11c, Ki67, CD19, CD45RA, KLRG1, CD4, CD8,
ICOS, CD16, CD127, CD1c, CD123, CD66b, TIGIT, TIM3, CD27, PD-L1, CD33, CD14, CD56,
NKG2A, CD5, CD45RO, NKG2D, CD25, CCR7, CD3, Tbet, CD38, CD39, CD28, DNAM1, HLA-DR,
PD-1, GranzymeB, and CD11b. For expression normalization in the CyTOF analysis, we followed
the Cytobank instructions and used transformed ratios, in which each value is compared to its control, defined as
the table’s minimum of the channel medians (described at https://support.cytobank.org/hc/en-us/articles/206147637-How-to-create-and-configure-a-Heatmap).
Bland-Altman analysis
Differential expression analysis was performed using the default test (Wilcoxon Rank Sum test)
of function FindMarkers (from the Seurat package) with the specified parameters: min.pct=0.25,
The sequence data generated in this study has been submitted to the NCBI BioProject
Results
cell” fractions from patients enrolled in the Multiple Myeloma Research Foundation (MMRF)
CoMMpass study (NCT01454297). Nine were fast progressors (FP, progressed within 6 months)
and nine were non-progressors (NP, progressed >6 months but within 5 years) with patient
ages ranging from 37 to 83 years. Twelve patients were in the International Staging System
(ISS) stage III, eight underwent autologous stem cell transplantation (ASCT), eleven were
female, and fifteen were Caucasian (Fig. 1A and Supplementary Table S1C). Each sample
was subjected to scRNA-seq, CyTOF, and CITE-seq at three different respective academic
research centers, namely Washington University in St. Louis (WUSTL), Icahn School of
Medicine at Mount Sinai (ISMMS), and Beth Israel Deaconess Medical Center (BIDMC). All
sites received aliquots from the same sample and technical replicates were conducted for 2
To assess immune cell composition of MM patients, bone marrow (BM) baseline samples
(collected at the initial diagnosis) from these eighteen patients were subjected to scRNA-seq,
with immune cells clustered based on their transcriptome profiles using the Louvain clustering
algorithm implemented by Seurat [7,8] (Fig. 1B). We then investigated immune cells of these
same samples by mass cytometry (CyTOF) using a 39-marker panel (Supplementary Table
S1D). Cell populations were characterized by expression of markers, clustered by the FlowSOM
algorithm [9], and visualized with viSNE in the Cytobank [10] platform (Fig. 1C). Given the
discordance between RNA expression and protein expression that is known to exist [11], it is
informative to characterize cell populations by measuring RNA and protein at the same time.
proteins. Following standard scRNA-seq quality filtering protocols, immune cells were clustered
average, 1,051 immune cells/sample using scRNA-seq, >64K CD45+ cells/sample using
transcriptional profiles alone (Fig. 1E). Interestingly, most cell types, including B cells,
monocytes, macrophages, neutrophils, and plasmacytoid dendritic cells (pDCs), formed distinct
clusters, while T cell subtypes mixed together. To further understand the difference of cell type
marker expression between the RNA and protein levels, we visualized the expression of some
canonical markers in Uniform Manifold Approximation and Projection (UMAP) and investigated
the concordance of the sample-level average expression of the 29 CITE-seq protein markers
between RNA level and Antibody-Derived Tags (ADT) level (Fig. 1F, G; Supplementary Fig.
S1A). As expected, expression levels of markers are generally concordant (R = 0.72, p < 10⁻⁴),
with some exceptions where protein-level expression is higher than RNA-level expression and
vice versa. One striking example is CD4 (Fig. 1F, G), which is highly expressed in the ADT
measurement but minimally expressed at the RNA level, mainly because mRNAs are produced
at much lower rates and have much shorter half-lives than proteins [13]. This observation is
consistent with previous studies showing low CD4 mRNA expression compared to surface CD4
protein [14]. Last, since Naive CD8+ T cells were clustered together with CD4+ T cells based on
transcriptome profiles (Fig. 1E), we investigated whether reclustering T cells alone could help to
among T cells [14] and different surface protein markers could be encoded by the same gene
[15], reclustering CD4+ and naive CD8+ T cells did not provide additional resolution of T cell
subtypes (Fig. 1H). Consistent with a published study about renal T subtype identification using
expression of cell type markers for MM T cell subtype identification in CITE-seq as compared to
standard scRNA-seq.
compared between technical replicates for 2 samples in each assay. The technical replicate
pairs are strongly correlated in all 3 assays (average Pearson correlation coefficient r=0.94 in
scRNA-seq, 0.89 in CyTOF, and 0.92 in CITE-seq) (Supplementary Fig. S1B-D). Next, to
examine the consistency of immune cell populations measured by the same techniques at
different sites, we evaluated the percentage of immune populations captured by three centers
using four samples. scRNA-seq data was generated in ISMMS, WUSTL and BIDMC using
aliquots of the same samples and CyTOF data was generated in ISMMS, Mayo Clinic and
Emory University (panels are shown in Supplementary Table S1D-F). BIDMC scRNA-seq data
is from CITE-seq data analyzed with RNA signal alone (Supplementary Fig. S1E). We
observed that the percentages of B cells, pre-B cells, NK cells, pDCs, monocytes and
macrophages are generally consistent, while the T cell subset varies across centers in scRNA-seq
measurement (Supplementary Fig. S1F). This suggests that T cell composition could vary
between aliquots and with sample-processing differences across centers, while other cell types
are more similar in scRNA-seq measurement. The cell type abundance measured by CyTOF is
less variable than that measured by scRNA-seq, with smaller differences observed in T cell
analysis, shown in Supplementary Table S1G). Moreover, cell subset abundances of ISMMS
samples tend to have less variation likely due to the benefit of barcoding samples (Methods).
The cell type frequencies calculated by one center (Emory) tend to be lower overall compared to
other centers in CyTOF, probably because wide bore injector assembly with cell acquisition
solution was not used to maintain cell integrity (Methods). It is worth noting that including
reference samples in CyTOF is very helpful for identifying potential artifacts. For example, we
observed a large proportion of CD66b/CD3+ cells in patient samples while these were absent in
the reference sample from a healthy donor (data not shown). We hypothesized that this CD66b
staining artifact (CD66b is not expressed on CD3+ T cells) was likely due to non-specific
staining from dead cells. Indeed, the percentage of CD66b/CD3+ cells dropped dramatically
after dead cell depletion. Lastly, to evaluate the similarity of expression profiles across different
samples and centers, we calculated the Pearson correlation coefficient of expression of the B
cell markers between populations detected from different centers using scRNA-seq
(Supplementary Fig. S1H). We observed that B cells clustered according to patients instead of
are potential reservoirs of plasma cells [17]. Overall, we observed that cell type abundances are
generally consistent across centers for most cell types and that similarity of transcriptome
effects across centers. These observations imply that our cross-technique comparisons should
be valid.
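The replicate and cross-center comparisons above rest on the Pearson correlation coefficient. As a minimal sketch of that calculation, the function below computes r for two equally long vectors, for example the cell-type fractions of a sample and its technical replicate. The name `pearson_r` and the toy fractions are illustrative assumptions, not values or code from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Toy cell-type fractions for one sample and its replicate (made up).
replicate_1 = [0.40, 0.25, 0.15, 0.12, 0.08]
replicate_2 = [0.38, 0.27, 0.14, 0.13, 0.08]
r = pearson_r(replicate_1, replicate_2)
```

Because r measures linear association only, two replicates can correlate strongly while still differing by a constant offset, which is one reason the abundance comparisons between techniques are reported as mean differences as well.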
Comparisons of cell type abundances and correlations of cell type marker expression
To evaluate the concordance of cell type composition determined by the three methods, we
calculated the cell subset frequency of each immune population relative to the CD45+
stronger concordance between scRNA-seq and CITE-seq for all cell types except NK cells
Cell type abundance is especially consistent for B cells, plasmacytoid dendritic cells (pDC), and
neutrophils. Interestingly, the cell frequency decreased and increased for T cells and
The mean differences between CyTOF and CITE-seq were -13.6% (95% CI: -24.02% to -3.11%)
for T cells and 11.07% (95% CI: 3.19% to 18.95%) for macrophages/monocytes. This finding is
consistent with a previous study where fewer T cells were detected in CyTOF compared to
scRNA-seq in healthy bone marrow samples [18]. To further investigate which subpopulations
were discordant, the frequencies of T cell subsets, monocytes, and macrophages were
evaluated (Fig. 2B, mean difference calculated by Bland-Altman analysis). Interestingly, CITE-
seq detected far more CD4+ T cells compared to CyTOF and scRNA-seq, while CyTOF
detected far fewer CD8+ T cells compared to the other two techniques. In terms of T cell
subtypes, regulatory T (Treg) cell frequency increased and memory CD8+ T cells reduced in
than the other 2 methods, while monocyte frequency was the lowest in CyTOF.
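The mean differences and 95% confidence intervals quoted above come from Bland-Altman analysis of paired per-sample frequencies. The sketch below shows one common way to compute them, under stated assumptions: the function name `bland_altman`, the returned keys, and the toy percentages are hypothetical, and the Student t critical value is passed in by the caller because the Python standard library has no t distribution (for 18 paired samples, 17 degrees of freedom, it is about 2.110). The source does not state how the authors computed their intervals, so this is not necessarily their exact procedure.

```python
import math
import statistics


def bland_altman(method_a, method_b, t_crit):
    """Bias, 95% CI of the bias, and limits of agreement for paired data.

    t_crit: two-sided 95% Student t critical value for n - 1 degrees
    of freedom, supplied by the caller (assumption of this sketch).
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with >= 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    bias = statistics.mean(diffs)   # mean difference between methods
    sd = statistics.stdev(diffs)    # sample SD of the differences
    half_width = t_crit * sd / math.sqrt(n)
    return {
        "bias": bias,
        "ci95": (bias - half_width, bias + half_width),
        "limits_of_agreement": (bias - 1.96 * sd, bias + 1.96 * sd),
    }


# Toy T cell percentages for four samples (made-up numbers).
cytof = [40.0, 35.5, 52.1, 47.3]
cite_seq = [55.2, 48.0, 60.4, 63.9]
result = bland_altman(cytof, cite_seq, t_crit=3.182)  # t for 3 d.f.
```

A negative bias here means the first method reports lower frequencies than the second, which is the direction described for T cells in CyTOF relative to CITE-seq.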
of cell type marker genes, including both the RNA and ADT levels. Average expressions of each
marker gene at the transcriptional level (blue dots) between scRNA-seq and CITE-seq are
generally concordant (Fig. 2C). By contrast, we observed drastic differences of some marker
genes between RNA and ADT expression in CITE-seq, probably due to the RNA dropout [19]
and shorter half-lives of mRNAs versus proteins [13]. For example, expression of CD4_adt is
higher than that of transcriptional CD4, whereas CD127/IL7R tends to be highly expressed at
the transcriptional level. This dynamic explains why IL7R is often differentially expressed in
observations highlight the importance of choosing cell type marker genes best suited to
particular modalities.
We also correlated expressions of marker genes among scRNA-seq, CyTOF, and CITE-seq.
The vast majority are positively correlated in protein-protein comparison (Fig. 2D) and RNA-
RNA comparison (Fig. 2E). Next, we investigated the correlations of expressions of marker
genes between the transcriptome and protein levels (Fig. 2F, G; Supplementary Fig. S2A, B).
As expected, the overall correlation between different modalities is lower than that of the same
modalities. We observed significant correlation for some markers, including CCR7 in CD4+
naive T cells, IL7R in CD4+ memory T cells, and FCGR3A in NK cells, between RNA and
protein level of CITE-seq, while no markers are significantly correlated between scRNA-seq and
CyTOF (Fig. 2G). We also found that FCGR3A in macrophages has a strong correlation, while
some markers are significantly anti-correlated between CITE-seq transcriptional level and
CyTOF, such as CD3D, CD3G, IL7R, CD8A, etc. (Supplementary Fig. S2A, C; Supplementary
Table S1I).
Decreased ratio of CD4+/CD8+ T cells from ISS stage 2 to ISS stage 3 patients and fast
Further, we sought to investigate the relationship between clinical features and immune cell
normal controls and the ratio decreased with the MM progression [5]. By integrating 3 assays,
we found the ratio tends to decrease from ISS stage 2 to ISS stage 3 patients (Fig. 3A). Further,
patients, suggesting that CD8+ T cells tend to be activated rather than naive in stage 3 patients
(Fig. 3B). In addition, we identified several DEGs of NK cells from FPs relative to NPs,
including ARPC5, XAF1, RAC2 and PSMB9, as revealed by both scRNA-seq and CITE-seq
assays (Fig. 3C). ARPC5, Actin-Related Protein 2/3 Complex Subunit 5, has been revealed to
be highly expressed in patients with poor overall survival and could be treated as an
independent biomarker for patients with MM [20], consistent with our observations. A previous
microarray-based study found that RAC2, Rac Family Small GTPase 2, is significantly
upregulated in MM as compared to MGUS [21]. PSMB9, a subunit of the proteasome, was
remarkably highly expressed in cell groups with t(4;14) translocations versus cells from MGUS
[22]. In summary, previous studies indicated RAC2 and PSMB9 are associated with disease
development from MGUS to MM and our analysis suggested that they might also be related to
MM progression. Taken together, we observed the ratio of CD4+ T/CD8+ T cells decreased in
stage 3 patients relative to stage 2 patients, suggesting an increased population of CD8+ T cells
in bone marrow microenvironment (BMME) of patients in stage 3. We also found that RAC2 and
PSMB9 are upregulated in NK cells in FPs relative to NPs at transcriptional level, which could
Discussion
Single-cell sequencing technologies have been widely used in studying tissue heterogeneity,
transcriptome, proteome and other multi-omics profiles of single cells [23]. However, the
similarities of measurements across the various single cell techniques remain to be fully
using scRNA-seq, >64K CD45+ cells/sample using CyTOF, and 718 immune cells/sample using
CITE-seq. By clustering cells with or without protein profiles in CITE-seq, we showed the
markers when characterizing T cell subtypes in MM (Fig. 1E, H). This observation is in line with
Next, to examine the consistency of cell populations measured by the same techniques at
different sites, we evaluated the cell subset abundances captured by three centers using four
effect across centers and there are some important factors to consider in order to obtain
help identify marker non-specific staining artifacts; 2) Barcoding samples, the sample delivery
mechanism, and the use of lyophilized panels are important in CyTOF experiments. Further, cross-
scRNA-seq, CyTOF, and CITE-seq are generally concordant, except some variations in T cells,
macrophages, and monocytes (Fig. 2A, B). Analysis revealed relatively high correlations of most
markers between the same modalities, though some markers are negatively correlated (Fig.
2C-G). This observation highlights the importance of choosing marker genes best suited to
particular modalities.
donors and these ratios are further decreased in ISS stage 3 versus ISS stage 1 patients [5].
Here, we confirmed this trend using three single-cell technologies, finding that this ratio tends to
decrease even in stage 3 versus stage 2 patients (Fig. 3A). We also observed a decreased
ratio in stage 2 compared to stage 1 patients based on CyTOF and CITE-seq measurements but
not in scRNA-seq, probably due to the limited number of stage 1 patients. Future studies with
expanded sample sizes could further investigate how immune cell composition changes across ISS
stages. In addition, we observed upregulation of ARPC5, XAF1, RAC2, and PSMB9 in NK
cells of FPs compared to those of NPs, as suggested by both scRNA-seq and CITE-seq RNA
measurements (Fig. 3C). RAC2 and PSMB9 have been reported to be associated with disease
development from MGUS to MM [21,22], and our analysis suggests that they might also be
related to rapid MM progression, as supported by both scRNA-seq and CITE-seq. Due to the
limited number of protein markers in CITE-seq, we were unable to evaluate the protein-level
This analysis is only a small sample of the larger work being conducted by the MMRF and its
associated academic research centers to provide a sufficiently broad, deep, and technologically
diverse dataset for accurately characterizing the BMME and to help interrogate the MM tumor
microenvironment (TME) using different single-cell technologies. We hope this study will help
researchers refine cell population characterization strategies and provide insights to those
questions.
This study was funded through the Multiple Myeloma Research Foundation (MMRF) Immune
Atlas initiative. We thank the multiple myeloma patients, families, and professionals who have
contributed to this study. We thank Upadhyaya Bhaskar, Nicolas Fernandez, and Laura
Walker for their contributions to the initial work of the MMRF immune atlas pilot study. We thank
Author contributions
L.D., M.B., S.G., M.D., D.A., and S.D. led project design. L.Y. led data analysis, interpretation,
and figure generation. R.J., B.L., S.S.B., M.F., D.B.D., M.S., T.D., S.K., H.C., E.A., G.K., and
D.D. contributed to sample processing and data generation. L.Y., R.J., S.S.B., T.K., E.G.,
D.B.D., T.D., G.K., D.D., B.T., Y.P., and W.P. contributed to data processing and data
interpretation. S.O., J.D., A.R., S.K., R.V., I.V., S.M., M.H., K.D., and D.A. contributed to study
design and supervision. L.Y. wrote the manuscript and made figures. M.A.W. helped polish
figures. R.J., M.W., T.K., and D.B.D. helped edit the manuscript. L.D. and S.G. reviewed the
manuscript.
References
1. Liu R, Gao Q, Foltz SM, Fowles JS, Yao L, Wang JT, et al. Co-evolution of tumor and
immune cells during progression of multiple myeloma. Nat Commun. 2021;12: 2559.
3. Adams HC 3rd, Stevenaert F, Krejcik J, Van der Borght K, Smets T, Bald J, et al. High-
Parameter Mass Cytometry Evaluation of Relapsed/Refractory Multiple Myeloma Patients
Treated with Daratumumab Demonstrates Immune Modulation as a Novel Mechanism of
Action. Cytometry A. 2019;95: 279–289.
9. Van Gassen S, Callebaut B, Van Helden MJ, Lambrecht BN, Demeester P, Dhaene T, et
al. FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry
data. Cytometry A. 2015;87: 636–645.
10. Kotecha N, Krutzik PO, Irish JM. Web-based analysis and publication of flow cytometry
experiments. Curr Protoc Cytom. 2010;Chapter 10: Unit10.17.
11. Vogel C, Marcotte EM. Insights into the regulation of protein abundance from proteomic
and transcriptomic analyses. Nat Rev Genet. 2012;13: 227–232.
12. Kim HJ, Lin Y, Geddes TA, Yang JYH, Yang P. CiteFuse enables multi-modal analysis of
CITE-seq data. Bioinformatics. 2020;36: 4137–4143.
14. Ding J, Smith SL, Orozco G, Barton A, Eyre S, Martin P. Characterisation of CD4+ T-cell
subtypes using single cell RNA sequencing and the impact of cell number and sequencing
depth. Sci Rep. 2020;10: 19825.
16. Krebs CF, Reimers D, Zhao Y, Paust H-J, Bartsch P, Nuñez S, et al. Pathogen-induced
tissue-resident memory TH17 (TRM17) cells amplify autoimmune kidney disease. Sci
17. Calame KL. Plasma cells: finding new light at the end of B cell development. Nat Immunol.
2001;2: 1103–1108.
18. Oetjen KA, Lindblad KE, Goswami M, Gui G, Dagur PK, Lai C, et al. Human bone marrow
assessment by single-cell RNA sequencing, mass cytometry, and flow cytometry. JCI
Insight. 2018;3. doi:10.1172/jci.insight.124928
19. Qiu P. Embracing the dropouts in single-cell RNA-seq analysis. Nat Commun. 2020;11:
1169.
20. Xiong T, Luo Z. The Expression of Actin-Related Protein 2/3 Complex Subunit 5 (ARPC5)
Expression in Multiple Myeloma and its Prognostic Significance. Med Sci Monit. 2018;24:
6340–6348.
21. Liu Z, Huang J, Zhong Q, She Y, Ou R, Li C, et al. Network-based analysis of the molecular
mechanisms of multiple myeloma and monoclonal gammopathy of undetermined
significance. Oncol Lett. 2017;14: 4167–4175.
22. Jang JS, Li Y, Mitra AK, Bi L, Abyzov A, van Wijnen AJ, et al. Molecular signatures of
multiple myeloma progression through single cell RNA-Seq. Blood Cancer J. 2019;9: 2.
23. Tang X, Huang Y, Lei J, Luo H, Zhu X. The single-cell sequencing: new developments and
medical applications. Cell Biosci. 2019;9: 53.
24. Dietz AB, Bulur PA, Emery RL, Winters JL, Epps DE, Zubair AC, et al. A novel source of
viable peripheral blood mononuclear cells from leukoreduction system chambers.
Transfusion. 2006;46: 2083–2089.
25. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods
of clinical measurement. Lancet. 1986;1: 307–310.
Figure legends
A) Patient characteristics and single-cell data collection. FP and NP denote fast progressors
Cell Transplantation.
B) UMAP projection of integrated scRNA-seq data, with cells colored by immune cell types.
C) t-SNE projection of integrated CyTOF data, with cells colored by immune cell types.
E) UMAP projection of integrated CITE-seq data, with cells clustered by transcriptional level
F) Comparison of canonical cell type marker gene expressions between protein level (ADT, top)
and transcriptional level (RNA, bottom). Cells are colored by normalized expression.
RNA level and ADT level. The grey shaded area represents the 95% confidence interval around
H) UMAP projection of CD4+ T cells and naive CD8+ T cells, which are a subset of the integrated
data in panel E, with cells clustered by transcriptional level alone, colored by immune cell
A) Main immune cell population (CD45+) frequencies observed by CITE-seq, CyTOF, and
scRNA-seq. Each boxplot is colored by assay. CITE-seq populations are determined based on
B) Immune cell subtype frequencies for CITE-seq, CyTOF, and scRNA-seq. Each boxplot is
colored by assay. CITE-seq populations are determined based on integrated RNA and ADT
expressions.
C) Concordance of sample-level average expressions of canonical cell type markers in main cell
subsets between scRNA-seq and CITE-seq. CITE-seq RNA and protein (ADT) level
D) Spearman correlation coefficients of protein level expressions of cell type markers between
CyTOF and CITE-seq. Each dot represents a marker gene and the color of the dot represents
0.05.
between scRNA-seq and CITE-seq. Each dot represents a marker gene and the color of the dot
represents the p value of correlation. Markers are highlighted with an outer circle if the p value is
F) Spearman correlation coefficients of cell type markers between transcriptional level and
protein level expressions in CITE-seq. Each dot represents a marker gene and the color of the
dot represents the p value of correlation. Markers are highlighted with an outer circle if the p
expressions from scRNA-seq and protein level expressions from CyTOF. Each dot represents a
marker gene and the color of the dot represents the p value of correlation. Markers are
Fig. 3. Ratio of CD4+ T/CD8+ T of patients in different ISS stages and markers associated
A) Violin plots showing the CD4+ T/CD8+ T cell ratio of patients in ISS stages 2 and 3 in scRNA-seq,
CyTOF, and CITE-seq. Horizontal lines indicate the median of the data points in each group.
B) Violin plots showing single cell-level normalized expression of CD45RA in CITE-seq ADT
measurement and CyTOF. The difference is significant at p ≤ 0.0001 based on the Wilcoxon
rank-sum test.
measurement (left) and scRNA-seq measurement (right). The samples are ordered based on
Expression values are scaled such that for each gene, the average of the scaled expression is 0
and the standard deviation is 1. Adjusted p values and log fold changes in CITE-seq and
scRNA-seq are shown on the left and right sides of the DEGs, respectively. FC = fold change.
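The per-gene scaling described in this legend (mean 0 and standard deviation 1 for each gene) can be sketched as follows. This is a minimal illustration only: the expression values are hypothetical, the helper name scale_gene is ours, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the legend does not state which estimator was used.

```python
from statistics import mean, stdev


def scale_gene(values):
    """Z-score one gene across samples: subtract the mean, divide by the SD.

    Assumption: sample standard deviation (n - 1 denominator). A gene with
    zero variance is mapped to all zeros to avoid division by zero.
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical average expression of two genes across four samples.
expr = {
    "RAC2": [1.0, 2.0, 3.0, 4.0],
    "PSMB9": [0.5, 0.5, 1.5, 1.5],
}

# Scale each gene independently so that genes with different dynamic
# ranges become comparable on the same heatmap color scale.
scaled = {gene: scale_gene(values) for gene, values in expr.items()}

for gene, values in scaled.items():
    print(gene, [round(v, 2) for v in values])
```

Scaling each gene separately means that colors in a heatmap reflect how a sample compares with other samples for that gene, not absolute expression differences between genes.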
Downloaded from http://aacrjournals.org/cancerrescommun/article-pdf/doi/10.1158/2767-9764.CRC-22-0022/3204323/crc-22-0022.pdf by guest on 01 September 2022