Code

The document outlines a series of data processing steps and analyses using various bioinformatics techniques, including reciprocal PCA, SCTransform, CCA, and harmony for single-cell RNA sequencing data. It details the loading of datasets, quality control measures, and the identification of cell type markers across different immune cell populations. Additionally, it emphasizes the importance of version control and the use of specific libraries to minimize bugs during analysis.

Uploaded by

janicepdudas54

---

title: "00.data"
output: html_notebook
editor_options:
  chunk_output_type: console
---

2019-9-30
Use reciprocal PCA + reference-based integration + SCTransform + CCA + harmony.
Load as few libraries as possible and call functions by explicit namespace, to reduce bugs.
Find markers in each replicate separately.

2020-04-29
Revised version.

2020-09-17
BMC Biology revision.

2020-10-10
1. PLOS Biology mouse spleen immune cell dataset,
2. Nature Immunology mouse spleen NK cell dataset,
3. our DNT dataset

2020-11-19
Make sure to save all versions.

2020-1-7 version
**Use reciprocal PCA + reference-based integration + SCTransform + CCA + harmony to speed up the analysis.**
**Load as few libraries as possible and call functions by explicit namespace, to reduce bugs.**
**Find markers in each replicate separately.**
**Zemmour_code: using it to remove contaminated cells was not a good choice; it cannot classify the cell types well in our datasets.**
**DoubletDecon: about half of the cells were marked as doublets, yet the expression of Cd74 did not differ, so cluster 4 is not caused by doublets.**
**2020-02-07: Show the results before and after QC; after running DoubletDecon, the expression of cluster 4 does not differ.**
**Scrublet: the Python version is too hard to use.**
**2020-02-11: DoubletFinder is good at removing heterotypic doublets. Cluster 4 still expresses Cd74.**
**2020-02-12: Take the union or intersection of the DoubletDecon and DoubletFinder results.**

# library
```{r}
Sys.setenv(LANGUAGE = "en")
options(warn = -1)
memory.limit(size = 64670)
library(Seurat)
library(harmony)
library(MAST)
library(dplyr)
library(tidyselect)
library(RColorBrewer)
library(future)
library(ggplot2)
library(org.Mm.eg.db)
library(EnsDb.Mmusculus.v79)
library(cowplot)
library(data.table)
library(clusterProfiler)
library(DoubletDecon)
library(DoubletFinder)
```
# parallelization set
```{r}
# plan("multiprocess", workers = 4)
# options(future.globals.maxSize = 1024^4)
# plan()
# plan("sequential")
```
# theme set
```{r}
theme_set(theme_cowplot(font_size = 8))
theme.text = theme_cowplot(font_size = 8)
cols = c(brewer.pal(9, "Set1"),
brewer.pal(8, "Set2")[-c(2,4,8)],
brewer.pal(12, "Set3")[-9],
brewer.pal(12, "Paired"),
brewer.pal(8, "Dark2"),
brewer.pal(11, "Spectral")[-6],
brewer.pal(11, "BrBG")[-6]
)
```
# load data
```{r}
load("01.sct.dnt.Rdata")
load("01.sct.nk.Rdata")
load("01.sct.sp.Rdata")
load("01.sct.kept.Rdata")

load("01.harmony.dnt.nk.Rdata")
load("01.harmony.dnt.nk.meta.Rdata")

# load("01.harmony.dnt.Rdata")
# load("01.harmony.dnt.cc.marker.Rdata")

load("01.FindMarkers.between.DNT.and.CD4.CD8.NK.Rdata")
load("01.FindMarkers.between.DNT.and.CD4.CD8.NK.join.Rdata")

load("01.harmony.all.Rdata")
load("01.harmony.all.meta.all.Rdata")

load("01.cca.kept.Rdata")
load("01.cca.kept.cc.marker.Rdata")
load("01.cca.kept.cc.marker.48nk.Rdata")

load("01.cca.unfilter.Rdata")
load("01.cca.unfilter.meta.all.Rdata")

# load("01.cc.marker.DNT.Rdata")
# load("01.cca.kept.cc.marker.DNT.48nk.Rdata")

load("01.nDNT.decon.Rdata")
load("01.DoubletDecon.DoubletFinder.Rdata")

load("01.trans.marker.Rdata")
```
# prepare data (sct)
```{r}
##### run sct #####
sct = mapply(function(sample, data.dir){
print(paste("Preparing", sample))
##### read the 10x files of different formats #####
if (sample %in% c("CD4", "CD8", "TCRab", "nDNT_rep1", "nDNT_rep2", "aDNT_rep1",
"aDNT_rep2")) {
sct = Read10X(data.dir = data.dir)
} else if (sample %in% c("NK_sp1", "NK_sp2", "NK_sp3")) {
sct = data.table::fread(file = data.dir, header = "auto", sep = ",") %>%
tibble::column_to_rownames("V1")
} else if (sample %in% c("sp3_rep1", "sp3_rep2", "sp4_rep1", "sp4_rep2")){
sct = data.table::fread(file = data.dir, header = "auto", sep = ",") %>%
dplyr::mutate(SYMBOL = AnnotationDbi::mapIds(x = EnsDb.Mmusculus.v79, keys = V1,
column = "SYMBOL", keytype = "GENEID")) %>%
na.omit() %>%
dplyr::filter(!SYMBOL == "") %>%
dplyr::select(-V1)
dup = sct$SYMBOL[duplicated(sct$SYMBOL)]
z1 = sct %>%
dplyr::filter(SYMBOL %in% dup) %>%
dplyr::group_by(SYMBOL) %>%
dplyr::summarise_all(sum)
# long time, but should work
z2 = sct %>%
dplyr::filter(!SYMBOL %in% dup)
sct = rbind(z1, z2) %>%
tibble::column_to_rownames("SYMBOL")
}
print(head(sct[,1:5]))
print(head(sct[,(ncol(sct) - 5):ncol(sct)]))
##### create the seurat objects #####
sct = sct %>%
CreateSeuratObject(project = sample) %>%
PercentageFeatureSet(pattern = "^mt-",
col.name = "percent.mt") %>%
RenameCells(add.cell.id = sample)
print(dim(sct)) # print the cell number before qc
##### subset the seurat objects #####
sct = subset(sct, subset = percent.mt < 10 & nFeature_RNA > 500)
print(dim(sct)) # print the cell number after qc
print(median(sct$nCount_RNA)) # print the median UMIs after qc
print(max(sct$nCount_RNA)) # print the max UMIs after qc
##### scale each samples #####
sct = SCTransform(object = sct,
vars.to.regress = "percent.mt",
verbose = F,
conserve.memory = F )
##### return the objects #####
return(sct)
},
sample = c(list.files(path = "../01.processed_data/processed_data/") %>%
sub("_Tcell", "", x = .),
"NK_sp1", "NK_sp2", "NK_sp3",
"sp3_rep1", "sp3_rep2", "sp4_rep1", "sp4_rep2"),
data.dir = c(list.files(path = "../01.processed_data/processed_data/",
full.names = T),
list.files(path = "../01.processed_data/NK/NK_mouse_spleen/",
full.names = T),
list.files(path = "../01.processed_data/mouse_spleen_2019_plos.biology/",
pattern = "mouse_[3-4]", full.names = T)),
SIMPLIFY = F)
##### split and save the sct files #####
sct.dnt = sct[1:7]
sct.nk = sct[8:10]
sct.sp = sct[11:14]
save(sct.dnt, file = "01.sct.dnt.Rdata")
save(sct.nk, file = "01.sct.nk.Rdata")
save(sct.sp, file = "01.sct.sp.Rdata")
rm(sct)
gc()
##### sct.kept #####
sct.kept = mapply(function(sct, sample){
DefaultAssay(sct) = "RNA"
print(sample)
if (sample %in% c("nDNT_rep1", "nDNT_rep2", "aDNT_rep1", "aDNT_rep2")) {
sct.kept = subset(x = sct, subset = (Cd3d>0|Cd3e>0|Cd3g>0)&Cd4==0&Cd8b1==0&Klrb1c==0)
} else if (sample %in% "CD4") {
sct.kept = subset(x = sct, subset = (Cd3d>0|Cd3e>0|
Cd3g>0)&Cd8a==0&Cd8b1==0&Klrb1c==0 )
} else if (sample %in% "CD8") {
sct.kept = subset(x = sct, subset = (Cd3d>0|Cd3e>0|Cd3g>0)&Cd4==0&Klrb1c==0 )
} else if (sample %in% "TCRab") {
sct.kept = subset(x = sct, subset = (Cd3d>0|Cd3e>0|Cd3g>0) )
} else if (sample %in% c("NK_sp1", "NK_sp2", "NK_sp3")) {
sct.kept = subset(x = sct, subset = Cd3d == 0 & Cd3g == 0)
}
sct.kept = SCTransform(object = sct.kept,
vars.to.regress = "percent.mt",
verbose = F,
conserve.memory = F )
return(sct.kept)
}, sct = c(sct.dnt, sct.nk), sample = c(names(sct.dnt), names(sct.nk)), SIMPLIFY = F)
save(sct.kept, file = "01.sct.kept.Rdata")
```
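The `sp3`/`sp4` branch above maps Ensembl IDs to gene symbols and then sums the rows that end up sharing a symbol. A minimal base-R sketch of that collapsing step, on a made-up count table, is shown here; `rowsum()` is swapped in for the `group_by()`/`summarise_all(sum)` approach used in the chunk, and the names `counts` and `collapsed` are illustrative only.

```r
# Toy count table: one row per mapped gene symbol, one column per cell (made-up values).
counts <- data.frame(
  SYMBOL = c("Cd3e", "Cd4", "Cd3e", "Klrb1c"),
  cell_1 = c(1, 0, 2, 0),
  cell_2 = c(0, 3, 1, 5)
)

# rowsum() adds up rows that share a group label; reorder = FALSE keeps
# the order in which each symbol first appears.
collapsed <- rowsum(as.matrix(counts[, -1]), group = counts$SYMBOL, reorder = FALSE)
print(collapsed)
```

Because `rowsum()` works on a numeric matrix in one pass, it may avoid the long runtime noted in the chunk, although that has not been checked on the real data.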
# harmony CD4 CD8 NK DNT kept cells
```{r run CD4 CD8 NK DNT harmony}
##### merge the sct files #####
harmony.dnt.nk = merge(x = sct.kept$aDNT_rep1, y = sct.kept[2:10])
rm(sct.kept)
gc()
plan("sequential")
harmony.dnt.nk = harmony.dnt.nk %>%
SCTransform(vars.to.regress = "percent.mt", conserve.memory = F) %>%
# uses a lot of memory; takes about 40 min.
RunPCA(verbose = F)
##### change the core #####
plan("multiprocess", workers = 4)
options(future.globals.maxSize = 1024^4)
##### set the batch #####
table(harmony.dnt.nk$orig.ident)
harmony.dnt.nk$batch = factor(
harmony.dnt.nk$orig.ident,
labels = c("b1", "b2", rep("b1", 3), "b2", rep("b3", 3), "b1"))
table(harmony.dnt.nk$batch, harmony.dnt.nk$orig.ident)
##### run the harmony #####
harmony.dnt.nk = harmony.dnt.nk %>%
RunHarmony(group.by.vars = "batch", assay.use = "SCT") %>%
RunUMAP(dims = 1:30, reduction = "harmony") %>%
FindNeighbors(dims = 1:30, reduction = "harmony") %>%
FindClusters(resolution = .1)
##### set idents to be celltype #####
harmony.dnt.nk$celltype = sub("(_rep[1-2])|(_sp[1-3])", "", harmony.dnt.nk$orig.ident)
harmony.dnt.nk$celltypesplit = sub("_sp[1-3]", "", harmony.dnt.nk$orig.ident)
table(harmony.dnt.nk$celltype)
table(harmony.dnt.nk$celltypesplit)
Idents(harmony.dnt.nk) = "celltype"
##### ScaleData for findmarkers #####
plan("sequential")
DefaultAssay(harmony.dnt.nk) = "RNA"
harmony.dnt.nk = harmony.dnt.nk %>%
NormalizeData(., normalization.method = "LogNormalize", scale.factor = 10000) %>%
FindVariableFeatures(., selection.method = "vst", nfeatures = 2000) %>%
ScaleData(., vars.to.regress = "percent.mt")
##### save the results #####
save(harmony.dnt.nk, file = "01.harmony.dnt.nk.Rdata")
```
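The batch assignment above relies on `factor(..., labels = ...)`, which pairs labels with the sorted levels of `orig.ident` by position. A possible alternative, sketched here under the assumption that the levels sort as `aDNT_rep1, aDNT_rep2, CD4, CD8, nDNT_rep1, nDNT_rep2, NK_sp1..3, TCRab`, is a named lookup vector that does not depend on level order. The names `batch_of` and `orig_ident` are illustrative, and the toy vector stands in for `harmony.dnt.nk$orig.ident`.

```r
# Explicit sample -> batch map, mirroring the positional labels used in the chunk above.
batch_of <- c(aDNT_rep1 = "b1", aDNT_rep2 = "b2", CD4 = "b1", CD8 = "b1",
              nDNT_rep1 = "b1", nDNT_rep2 = "b2",
              NK_sp1 = "b3", NK_sp2 = "b3", NK_sp3 = "b3", TCRab = "b1")

# Toy per-cell sample labels (stand-in for the orig.ident metadata column).
orig_ident <- c("CD8", "NK_sp2", "aDNT_rep2", "TCRab", "nDNT_rep1")

# Look up each cell's batch by name, so the result does not depend on level order.
batch <- factor(unname(batch_of[orig_ident]), levels = c("b1", "b2", "b3"))
stopifnot(!anyNA(batch))  # fails loudly if a sample is missing from the map
print(table(batch, orig_ident))
```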
## markers
```{r}
object = harmony.dnt.nk
##### FindMarkers between DNT and CD4, CD8, NK respectively #####
markers = mapply(function(ident.1, ident.2){
FindMarkers(object = object, ident.1 = ident.1, ident.2 = ident.2, group.by = "celltypesplit", assay
= "RNA", slot = "data", only.pos = T) %>%
tibble::rownames_to_column("gene") %>%
dplyr::mutate(type = paste(ident.1, "vs", ident.2, sep = ".")) %>%
dplyr::arrange(desc(avg_logFC))
},
ident.1 = rep(c("nDNT_rep1", "nDNT_rep2", "aDNT_rep1", "aDNT_rep2"), 3),
ident.2 = rep(c("CD4", "CD8", "NK"), c(4, 4, 4)),
SIMPLIFY = F, USE.NAMES = F)
names(markers) = paste(rep(c("nDNT_rep1", "nDNT_rep2", "aDNT_rep1", "aDNT_rep2"), 3), "vs",
rep(c("CD4", "CD8","NK"), c(4,4,4)), sep = ".")
##### save results #####
save(markers, file = "01.FindMarkers.between.DNT.and.CD4.CD8.NK.Rdata")
##### join the replication result #####
genelist = list(
nDNT.vs.CD4 = intersect(markers$nDNT_rep1.vs.CD4$gene, markers$nDNT_rep2.vs.CD4$gene),
nDNT.vs.CD8 = intersect(markers$nDNT_rep1.vs.CD8$gene, markers$nDNT_rep2.vs.CD8$gene),
nDNT.vs.NK = intersect(markers$nDNT_rep1.vs.NK$gene, markers$nDNT_rep2.vs.NK$gene ),
aDNT.vs.CD4 = intersect(markers$aDNT_rep1.vs.CD4$gene, markers$aDNT_rep2.vs.CD4$gene),
aDNT.vs.CD8 = intersect(markers$aDNT_rep1.vs.CD8$gene, markers$aDNT_rep2.vs.CD8$gene),
aDNT.vs.NK = intersect(markers$aDNT_rep1.vs.NK$gene, markers$aDNT_rep2.vs.NK$gene )
)
markers.both = mapply(function(ident.1, ident.2){
FindMarkers(object = object, ident.1 = ident.1, ident.2 = ident.2, group.by = "celltype", assay =
"RNA", slot = "data", only.pos = T) %>%
tibble::rownames_to_column("gene") %>%
dplyr::mutate(type = paste(ident.1, "vs", ident.2, sep = ".")) %>%
dplyr::arrange(desc(avg_logFC))
},
ident.1 = rep(c("nDNT", "aDNT"), 3),
ident.2 = rep(c("CD4", "CD8","NK"), c(2,2,2)),
SIMPLIFY = F, USE.NAMES = F)
names(markers.both) = paste(rep(c("nDNT", "aDNT"), 3), "vs", rep(c("CD4", "CD8","NK"),c(2,2,2)),
sep = ".")
markers.join = lapply(names(markers.both), function(i){
markers.both[[i]] %>%
dplyr::filter(gene %in% genelist[[i]])
})
names(markers.join) = names(markers.both)
##### save results #####
save(genelist, markers.both, markers.join,
file = "01.FindMarkers.between.DNT.and.CD4.CD8.NK.join.Rdata")
##### write the csv #####
rbind(markers.join$nDNT.vs.CD4, markers.join$nDNT.vs.CD8, markers.join$nDNT.vs.NK) %>%
write.csv(file = "TableS2.nDNT.vs.CD4.CD8.NK.csv", x = ., row.names = F)
rbind(markers.join$aDNT.vs.CD4, markers.join$aDNT.vs.CD8, markers.join$aDNT.vs.NK) %>%
write.csv(file = "TableS3.aDNT.vs.CD4.CD8.NK.csv", x = ., row.names = F)
#####
```
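The joining step above keeps a pooled marker only if it was also found in each replicate on its own. A small base-R sketch of that logic follows; the data frames are made up and `rep1`, `rep2`, `pooled`, `shared` and `joined` are illustrative names, not objects from this notebook.

```r
# Toy marker tables for two replicates and for the pooled comparison (made-up values).
rep1   <- data.frame(gene = c("Gzmb", "Cxcr6", "Cd74"),  avg_logFC = c(1.2, 0.8, 0.5))
rep2   <- data.frame(gene = c("Cxcr6", "Gzmb", "Eomes"), avg_logFC = c(0.9, 1.0, 0.4))
pooled <- data.frame(gene = c("Gzmb", "Cxcr6", "Cd74", "Eomes"),
                     avg_logFC = c(1.1, 0.85, 0.3, 0.2))

# Genes called as markers in both replicates.
shared <- intersect(rep1$gene, rep2$gene)

# Keep pooled statistics only for genes supported by every replicate.
joined <- pooled[pooled$gene %in% shared, ]
print(joined)
```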
# harmony DNT kept cells
```{r run nDNT, aDNT harmony [NOT USED]}
names(sct.kept)
plan("sequential")
harmony.dnt = mapply(function(i, j, lambda, theta, resolution){
##### run the harmony #####
object = merge(i, j) %>%
SCTransform(vars.to.regress = "percent.mt", conserve.memory = F) %>%
RunPCA(verbose = F) %>%
RunHarmony(group.by.vars = "orig.ident", assay.use = "SCT",
lambda = lambda, theta = theta) %>%
RunUMAP(dims = 1:30, reduction = "harmony") %>%
FindNeighbors(dims = 1:30, reduction = "harmony") %>%
FindClusters(resolution = resolution)
##### ScaleData for findmarkers #####
DefaultAssay(object) = "RNA"
object = object %>%
NormalizeData(., normalization.method = "LogNormalize", scale.factor = 10000) %>%
FindVariableFeatures(., selection.method = "vst", nfeatures = 2000) %>%
ScaleData(., vars.to.regress = "percent.mt")
##### return object #####
return(object)
}, SIMPLIFY = F,
i = list("nDNT" = sct.kept$nDNT_rep1, "aDNT" = sct.kept$aDNT_rep1),
j = list("nDNT" = sct.kept$nDNT_rep2, "aDNT" = sct.kept$aDNT_rep2),
lambda = c(1, .05),
theta = c(2, 3),
resolution = c(.05, .01))
##### run the harmony of naDNT #####
# harmony.naDNT = merge(sct.kept$aDNT_rep1, sct.kept[c(2,5:6)]) %>%
# SCTransform(vars.to.regress = "percent.mt", conserve.memory = F) %>%
# RunPCA(verbose = F)
# harmony.naDNT$batch = sub("[n|a]DNT_", "", harmony.naDNT$orig.ident)
# table(harmony.naDNT$batch, harmony.naDNT$orig.ident)
# harmony.naDNT = harmony.naDNT %>%
# RunHarmony(group.by.vars = "batch", assay.use = "SCT") %>%
# RunUMAP(dims = 1:30, reduction = "harmony") %>%
# FindNeighbors(dims = 1:30, reduction = "harmony") %>%
# FindClusters(resolution = .1)
# DimPlot(harmony.naDNT, split.by = "orig.ident", ncol = 2, label = T, cols = "Paired")
# FeaturePlot(harmony.naDNT, c("Il17a","Cxcr6", "Eomes", "Gzmb", "Cd74"), order = T,label = T)
##### save the results #####
save(harmony.dnt, file = "01.harmony.dnt.Rdata")
```
## markers
```{r [NOT USED]}
cc.marker = mapply(function(object){
lapply(sort(unique(object$seurat_clusters)), function(i){
if (min(table(object$orig.ident[object$seurat_clusters == i])) > 30) {
FindConservedMarkers(object, ident.1 = i, grouping.var = "orig.ident",
assay = "RNA", slot = "data", only.pos = T) %>%
tibble::rownames_to_column("gene") %>%
dplyr::filter(max_pval < .05 & minimump_p_val < .05) %>%
dplyr::mutate(cluster = i)
}}) %>%
data.table::rbindlist() %>%
dplyr::left_join(y = FindAllMarkers(object, assay = "RNA", slot = "data", only.pos = T), by =
c("cluster", "gene")) %>%
dplyr::arrange(desc(avg_logFC))
}, object = harmony.dnt, SIMPLIFY = F)
names(cc.marker)
save(cc.marker, file = "01.harmony.dnt.cc.marker.Rdata")
```
### 48nk.marker
```{r [NOT USED]}
object = harmony.dnt.nk
head(object@meta.data)
df1 = rbind(harmony.dnt$nDNT@meta.data %>% tibble::rownames_to_column("cells") %>%
dplyr::select(cells, seurat_clusters_before = seurat_clusters),
harmony.dnt$aDNT@meta.data %>% tibble::rownames_to_column("cells") %>%
dplyr::select(cells, seurat_clusters_before = seurat_clusters))
table(df1$seurat_clusters_before)
head(df1)
df2 = cca.kept$DNT@meta.data %>% tibble::rownames_to_column("cells") %>%
dplyr::select(cells, integrated_snn_res.0.01)
table(df2$integrated_snn_res.0.01)
meta = object@meta.data %>%
tibble::rownames_to_column("cells") %>%
dplyr::left_join(y = df1, by = "cells") %>%
dplyr::left_join(y = df2, by = "cells") %>%
tibble::column_to_rownames("cells")
head(meta)
meta$cluster = paste0(meta$celltype, meta$seurat_clusters_before) %>% sub("NA", "", .) %>%
factor(., levels = c(paste0("nDNT", 0:4), paste0("aDNT", 0:1), "CD4", "CD8", "NK", "TCRab"))
table(meta$cluster)
meta$recluster = paste0(sub("^[n|a]", "", meta$celltype), meta$integrated_snn_res.0.01) %>%
sub("NA", "", .) %>%
factor(., levels = c(paste0("DNT", 0:1), "CD4", "CD8", "NK", "TCRab"))
table(meta$recluster)
save(meta, file = "01.harmony.dnt.nk.meta1.Rdata")

# object@meta.data = meta
# cc.marker.48nk = mapply(function(ident.1, group.by, cc.marker){
#
# x = mapply(function(ident.2){
# FindMarkers(object, ident.1, ident.2, group.by = group.by, only.pos = T, assay = "RNA", slot = "data",
# features = cc.marker) %>%
# tibble::rownames_to_column("gene") %>%
# dplyr::filter(p_val_adj < .05)
# }, SIMPLIFY = F,
# ident.2 = c("CD4", "CD8", "NK"))
# dplyr::inner_join(x$CD4, x$CD8, by = "gene", suffix = c(".CD4", ".CD8")) %>%
# dplyr::inner_join(., x$NK, by = "gene", suffix = c("", ".NK")) %>%
# dplyr::mutate(type = ident.1)
#
# }, SIMPLIFY = F,
# ident.1 = c(paste0("nDNT",0:4), paste0("aDNT",0:1), paste0("DNT",0:1)),
# group.by = c(rep("cluster",7), rep("recluster",2)),
# cc.marker = sapply(cc.marker, function(i){
# sapply(sort(unique(i$cluster)), function(j){
# (i %>% dplyr::filter(cluster == j))$gene
# })
# }) %>%
# unlist(., recursive = F)) %>%
# data.table::rbindlist()
# head(cc.marker.48nk)
# save(cc.marker.48nk, file = "01.cca.kept.cc.marker.48nk.Rdata")
```
# harmony all
```{r run harmony all}
##### merge sct files #####
harmony.all = merge(x = sct.dnt$aDNT_rep1, y = c(sct.dnt[2:7], sct.nk, sct.sp))
rm(sct.dnt, sct.nk, sct.sp)
gc()
plan("sequential")
harmony.all = harmony.all %>%
SCTransform(vars.to.regress = "percent.mt", conserve.memory = F) %>%
# uses a lot of memory; takes about 40 min.
RunPCA(verbose = F)
##### change the core #####
plan("multiprocess", workers = 4)
options(future.globals.maxSize = 1024^4)
##### set the batch #####
table(harmony.all$orig.ident)
harmony.all$batch = factor(
harmony.all$orig.ident,
labels = c("b1", "b2", rep("b1", 3), "b2", rep("b3", 3), rep("b4", 4), "b1"))
table(harmony.all$batch, harmony.all$orig.ident)
##### run the harmony #####
harmony.all@meta.data = meta.all
harmony.all = RunHarmony(object = harmony.all, group.by.vars = "batch", assay.use = "SCT",
lambda = .1, theta = 3) %>%
RunUMAP(dims = 1:30, reduction = "harmony") %>%
FindNeighbors(dims = 1:30, reduction = "harmony") %>%
FindClusters(resolution = .1)
DimPlot(harmony.all, group.by = "celltype.converge", label = T, cols = cols)
##### set idents to be celltype #####
harmony.all$celltype = factor(
harmony.all$orig.ident,
labels = c("aDNT", "aDNT", "CD4", "CD8", "nDNT", "nDNT",
rep("NK", 3),
rep("sp", 4),
"TCRab"))
harmony.all$celltypesplit = factor(
harmony.all$orig.ident,
labels = c("aDNT_rep1", "aDNT_rep2", "CD4", "CD8", "nDNT_rep1", "nDNT_rep2",
rep("NK", 3),
rep("sp", 4),
"TCRab"))
table(harmony.all$celltype)
table(harmony.all$celltypesplit)
Idents(harmony.all) = "celltype"
##### ScaleData for findmarkers #####
plan("sequential")
DefaultAssay(harmony.all) = "RNA"
harmony.all = harmony.all %>%
NormalizeData(., normalization.method = "LogNormalize", scale.factor = 10000) %>%
FindVariableFeatures(., selection.method = "vst", nfeatures = 2000) %>%
ScaleData(., vars.to.regress = "percent.mt")
##### save the results #####
save(harmony.all, file = "01.harmony.all.Rdata")
```
```{r modify harmony all metadata}
##### save the pre metadata #####
metapre = harmony.all@meta.data
save(metapre, file = "01.harmony.all.metapre.Rdata")
##### import the markers from plos dataset #####
plos.meta1 = fread(input = "../01.processed_data/mouse_spleen_2019_plos.biology/other information/pbio.3000528.s013.csv") # Mouse spleen scRNA-seq clustering data.
##### modify the plos metadata #####
meta.plos = mapply(function(i, j){
plos.meta1[grep(i, plos.meta1$cell), ] %>%
dplyr::mutate(cell.rename = paste(j, cell, sep = "_"))
},
i = c("mouse_3.1", "mouse_3.2", "mouse_4.1", "mouse_4.2"),
j = c("sp3_rep1", "sp3_rep2", "sp4_rep1", "sp4_rep2"),
SIMPLIFY = F) %>%
rbindlist() %>%
dplyr::mutate(converged.cell.type.simple = sub(".[1-4]$", "", converged.cell.type)) %>%
dplyr::select(cell.rename, converged.cell.type.simple)
head(meta.plos)
colnames(meta.plos)
table(meta.plos$converged.cell.type.simple)
##### gene expression #####
DefaultAssay(harmony.all) = "RNA"
meta.gene = FetchData(object = harmony.all, c("Cd3d", "Cd3e", "Cd3g", "Cd4", "Cd8a", "Cd8b1",
"Klrb1c"), slot = "data") %>%
tibble::rownames_to_column("cell.rename")
head(meta.gene)
##### merge the pre metadata with DNT and plos metadata #####
meta.all = metapre %>%
tibble::rownames_to_column("cell.rename") %>%
# dplyr::left_join(y = meta.DNT, by = c("cell.rename")) %>%
dplyr::left_join(y = meta.plos, by = c("cell.rename")) %>%
dplyr::left_join(y = meta.gene, by = c("cell.rename")) %>%
tibble::column_to_rownames("cell.rename")
head(meta.all)
##### put celltype inside converged.cell.type.simple #####
meta.all$converged.cell.type.simple[is.na(meta.all$converged.cell.type.simple)] =
as.character(meta.all$celltypesplit)[is.na(meta.all$converged.cell.type.simple)]
table(meta.all$converged.cell.type.simple)
##### celltype.converge #####
meta.all$celltypesplit.converge = sub("_rep[12]", "", meta.all$converged.cell.type.simple)
meta.all$celltype.converge = sub("_rep[12].*", "", meta.all$converged.cell.type.simple)
table(meta.all$celltypesplit.converge)
table(meta.all$celltype.converge)
##### T identity #####
meta.all$Tid = (meta.all$Cd3d > 0 |meta.all$Cd3e > 0| meta.all$Cd3g > 0) &
meta.all$celltype.converge %in% c("aDNT", "CD4", "CD8", "nDNT", "TCRab",
"Memory-CD4-T-cell", "Memory-CD8-T-cell",
"Naive-CD4-T-cell", "Naive-CD8-T-cell",
"NK-T-cell")
table(meta.all$Tid, meta.all$celltype.converge)

meta.all$CD4id = (meta.all$Cd3d > 0 |meta.all$Cd3e > 0| meta.all$Cd3g > 0) &

meta.all$Cd8a == 0 & meta.all$Cd8b1 == 0 & meta.all$Klrb1c == 0 &
meta.all$celltype.converge %in% c("CD4",
"Memory-CD4-T-cell",
"Naive-CD4-T-cell")
table(meta.all$CD4id, meta.all$celltype.converge)

meta.all$CD8id = (meta.all$Cd3d > 0 |meta.all$Cd3e > 0| meta.all$Cd3g > 0) &

meta.all$Cd4 == 0 & meta.all$Klrb1c == 0 &
meta.all$celltype.converge %in% c("CD8",
"Memory-CD8-T-cell",
"Naive-CD8-T-cell")
table(meta.all$CD8id, meta.all$celltype.converge)

meta.all$DNTid = (meta.all$Cd3d > 0 |meta.all$Cd3e > 0| meta.all$Cd3g > 0) &

meta.all$Cd4 == 0 & meta.all$Klrb1c == 0 & meta.all$Cd8b1 == 0 &
meta.all$celltype.converge %in% c("aDNT", "nDNT")
table(meta.all$DNTid, meta.all$celltype.converge)

meta.all$NKid = (meta.all$Cd3d == 0 & meta.all$Cd3g == 0) &

meta.all$celltype.converge %in% c("NK", "NK-cell")
table(meta.all$NKid, meta.all$celltype.converge)

meta.all$contaminate = (meta.all$Tid == FALSE &

meta.all$celltype.converge %in% c("aDNT", "CD4", "CD8", "nDNT", "TCRab",
"Memory-CD4-T-cell", "Memory-CD8-T-cell",
"Naive-CD4-T-cell", "Naive-CD8-T-cell",
"NK-T-cell")) |
(meta.all$CD4id == FALSE &
meta.all$celltype.converge %in% c("CD4",
"Memory-CD4-T-cell",
"Naive-CD4-T-cell")) |
(meta.all$CD8id == FALSE &
meta.all$celltype.converge %in% c("CD8",
"Memory-CD8-T-cell",
"Naive-CD8-T-cell")) |
(meta.all$DNTid == FALSE &
meta.all$celltype.converge %in% c("aDNT", "nDNT")) |
(meta.all$NKid == FALSE &
meta.all$celltype.converge %in% c("NK", "NK-cell"))
table(meta.all$contaminate, meta.all$celltype.converge)
##### save modified metadata #####
save(meta.all, file = "01.harmony.all.meta.all.Rdata")

```
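The identity flags above combine marker detection (expression above zero) with the label each cell was sorted or annotated as, and a cell is marked as contaminating when it fails the rule for its own population. The sketch below reproduces that logic for two populations on a made-up metadata table; `has_cd3`, `is_dnt` and `is_nk` are illustrative names and the expression values are invented.

```r
# Toy metadata: normalized expression (0 = not detected) plus the sample label.
meta <- data.frame(
  row.names = c("c1", "c2", "c3", "c4"),
  celltype  = c("nDNT", "nDNT", "NK", "NK"),
  Cd3d      = c(0, 0, 0, 0.7),
  Cd3e      = c(1.2, 0, 0, 0),
  Cd3g      = c(0, 0, 0, 0),
  Cd4       = c(0, 0, 0, 0),
  Cd8b1     = c(0, 0, 0, 0),
  Klrb1c    = c(0, 0, 2.1, 1.5)
)

# Rules matching the chunk above: DNT = CD3+ CD4- CD8b1- Klrb1c-; NK = Cd3d- Cd3g-.
has_cd3 <- meta$Cd3d > 0 | meta$Cd3e > 0 | meta$Cd3g > 0
is_dnt  <- has_cd3 & meta$Cd4 == 0 & meta$Cd8b1 == 0 & meta$Klrb1c == 0
is_nk   <- meta$Cd3d == 0 & meta$Cd3g == 0

# A cell is flagged when it fails the rule for the population it was labelled as.
meta$contaminate <- (meta$celltype == "nDNT" & !is_dnt) |
                    (meta$celltype == "NK"   & !is_nk)
print(table(meta$contaminate, meta$celltype))
```

Note that "not detected" in sparse single-cell data can also reflect dropout, so a zero-count rule like this is a strict filter rather than proof of absence.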
# cca kept
```{r}
plan("sequential")
cca.kept = mapply(function(object.list){
anchor.features = SelectIntegrationFeatures(object.list = object.list,
nfeatures = 3000)
object.integrated = PrepSCTIntegration(object.list = object.list,
anchor.features = anchor.features) %>%
lapply(FUN = RunPCA, features = anchor.features, verbose = F) %>%
FindIntegrationAnchors(object.list = .,
reduction = "rpca",
anchor.features = anchor.features,
normalization.method = "SCT",
reference = 1) %>% # long time
IntegrateData(anchorset = .,
normalization.method = "SCT") %>%
RunPCA(object = .) %>%
RunUMAP(object = .,
dims = 1:30) %>%
FindNeighbors(dims = 1:30) %>%
FindClusters(resolution = .05)
##### scale for finding markers #####
DefaultAssay(object.integrated) = "RNA"
object.integrated = NormalizeData(object.integrated,
normalization.method = "LogNormalize",
scale.factor = 10000)
object.integrated = FindVariableFeatures(object.integrated,
selection.method = "vst",
nfeatures = 2000)
object.integrated = ScaleData(object.integrated,
vars.to.regress = "percent.mt")
return(object.integrated)
},
object.list = list("nDNT" = sct.kept[5:6], "aDNT" = sct.kept[1:2], "DNT" = sct.kept[c(5:6, 1:2)]),
SIMPLIFY = F)
DefaultAssay(cca.kept$DNT) = "integrated"
cca.kept$DNT = FindClusters(cca.kept$DNT, resolution = .01)
save(cca.kept, file = "01.cca.kept.Rdata")
```
## markers
```{r}
cc.marker = mapply(function(object){
print(table(object$seurat_clusters, object$orig.ident))
lapply(sort(unique(object$seurat_clusters)), function(i){
if (min(table(object$orig.ident[object$seurat_clusters == i])) > 30){
FindConservedMarkers(object, ident.1 = i, grouping.var = "orig.ident",
assay = "RNA", slot = "data", only.pos = T) %>%
tibble::rownames_to_column("gene") %>%
dplyr::filter(max_pval < .05 & minimump_p_val < .05) %>%
dplyr::mutate(cluster = i)
}}) %>%
data.table::rbindlist() %>%
dplyr::left_join(y = FindAllMarkers(object, assay = "RNA", slot = "data", only.pos = T), by =
c("cluster", "gene")) %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T)
}, object = cca.kept, SIMPLIFY = F)
names(cc.marker)
save(cc.marker, file = "01.cca.kept.cc.marker.Rdata")
```
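The chunk above only tests a cluster for conserved markers when every replicate contributes more than 30 cells to it. A minimal sketch of that guard on made-up cluster assignments is given below; `cluster`, `replicate`, `min_cells`, `smallest` and `testable` are illustrative names.

```r
# Toy per-cell metadata: cluster assignment and replicate of origin.
cluster   <- factor(c(rep(0, 80), rep(1, 45)))
replicate <- factor(c(rep("rep1", 40), rep("rep2", 40),   # cluster 0: 40 + 40 cells
                      rep("rep1", 40), rep("rep2", 5)))   # cluster 1: 40 + 5 cells

min_cells <- 30  # same threshold as in the chunk above

# Smallest per-replicate cell count within each cluster.
smallest <- apply(table(cluster, replicate), 1, min)

# Clusters that clear the threshold in every replicate.
testable <- names(smallest)[smallest > min_cells]
print(testable)
```

Clusters that fail the check are skipped silently in the chunk, so printing `smallest` first can help explain why a cluster has no conserved markers in the output.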
### 48nk.marker
```{r}
object = harmony.dnt.nk
# df1 = rbind(cca.kept$nDNT@meta.data %>% tibble::rownames_to_column("cells") %>%
# dplyr::select(cells, integrated_snn_res.0.05),
# cca.kept$aDNT@meta.data %>% tibble::rownames_to_column("cells") %>%
# dplyr::select(cells, integrated_snn_res.0.05))
# table(df1$integrated_snn_res.0.05)
# head(df1)
# df2 = cca.kept$DNT@meta.data %>% tibble::rownames_to_column("cells") %>%
# dplyr::select(cells, integrated_snn_res.0.01)
# table(df2$integrated_snn_res.0.01)
# meta = object@meta.data %>%
# tibble::rownames_to_column("cells") %>%
# dplyr::left_join(y = df1, by = "cells") %>%
# dplyr::left_join(y = df2, by = "cells") %>%
# tibble::column_to_rownames("cells")
# head(meta)
# meta$cluster = paste0(meta$celltype, meta$integrated_snn_res.0.05) %>% sub("NA", "", .) %>%
# factor(., levels = c(paste0("nDNT", 0:4), paste0("aDNT", 0:1), "CD4", "CD8", "NK", "TCRab"))
# table(meta$cluster)
# meta$recluster = paste0(sub("^[na]", "", meta$celltype), meta$integrated_snn_res.0.01) %>%
# sub("NA", "", .) %>%
# factor(., levels = c(paste0("DNT", 0:1), "CD4", "CD8", "NK", "TCRab"))
# table(meta$recluster)
# save(meta, file = "01.harmony.dnt.nk.meta.Rdata")

object@meta.data = meta
cc.marker.48nk = mapply(function(ident.1, group.by, cc.marker){

x = mapply(function(ident.2){
FindMarkers(object, ident.1, ident.2, group.by = group.by, only.pos = T, assay = "RNA", slot =
"data",
features = cc.marker) %>%
tibble::rownames_to_column("gene") %>%
dplyr::filter(p_val_adj < .05)
}, SIMPLIFY = F,
ident.2 = c("CD4", "CD8", "NK"))
dplyr::inner_join(x$CD4, x$CD8, by = "gene", suffix = c(".CD4", ".CD8")) %>%
dplyr::inner_join(., x$NK, by = "gene", suffix = c("", ".NK")) %>%
dplyr::mutate(type = ident.1)

}, SIMPLIFY = F,
ident.1 = c(paste0("nDNT",0:4), paste0("aDNT",0:1), paste0("DNT",0:1)),
group.by = c(rep("cluster",7), rep("recluster",2)),
cc.marker = sapply(cc.marker, function(i){
sapply(sort(unique(i$cluster)), function(j){
(i %>% dplyr::filter(cluster == j))$gene
})
}) %>%
unlist(., recursive = F)) %>%
data.table::rbindlist()
head(cc.marker.48nk)
save(cc.marker.48nk, file = "01.cca.kept.cc.marker.48nk.Rdata")
```
### up and downregulate
```{r}
object = cca.kept$DNT
object$celltype = sub("_rep[12]", "", object$orig.ident) %>% factor(., levels = c("nDNT", "aDNT"))
table(object$integrated_snn_res.0.01, object$orig.ident)
DefaultAssay(object) = "RNA"
df = mapply(function(ident.1, ident.2, subset.ident){
x = FindMarkers(object, ident.1 = ident.1, ident.2 = ident.2, subset.ident = subset.ident,
group.by = "orig.ident", assay = "RNA", slot = "data") %>%
tibble::rownames_to_column("gene") %>%
dplyr::filter(p_val_adj < .05) %>%
dplyr::mutate(type = paste0(ident.1,".vs.",ident.2), trans = avg_logFC > 0) %>%
dplyr::arrange(desc(avg_logFC))
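# set levels explicitly so the labels stay correct when a comparison has only up or only down genes
x$trans = factor(x$trans, levels = c(FALSE, TRUE), labels = c("down", "up"))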
return(x)
},
ident.1 = rep(paste0("aDNT_rep",1:2),2),
ident.2 = rep(paste0("nDNT_rep",1:2),2),
subset.ident = c(0,0,1,1),
SIMPLIFY = F)
head(df[[1]])
trans.marker = mapply(function(df){
x = dplyr::inner_join(df[[1]], df[[2]], by = c("gene", "trans"), suffix = c(".rep1", ".rep2"))
}, SIMPLIFY = F, df = list("DNT0" = df[1:2], "DNT1" = df[3:4]))
head(trans.marker[[1]])
save(trans.marker, file = "01.trans.marker.Rdata")
```
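The chunk above labels each gene as up- or down-regulated per replicate and then keeps only the genes that change in the same direction in both. The sketch below shows the same idea in base R on made-up tables, with `merge()` swapped in for `dplyr::inner_join()`; the helper `direction()` and the data frames are illustrative only.

```r
# Toy differential-expression tables for two replicate comparisons (made-up values).
rep1 <- data.frame(gene = c("Gzmb", "Cd74", "Il17a"), avg_logFC = c(1.5, -0.7, 0.4))
rep2 <- data.frame(gene = c("Gzmb", "Cd74", "Il17a"), avg_logFC = c(1.1, -0.9, -0.3))

# Giving the levels explicitly keeps the labels right even if a table holds
# only up-regulated (or only down-regulated) genes.
direction <- function(logfc) {
  factor(logfc > 0, levels = c(FALSE, TRUE), labels = c("down", "up"))
}
rep1$trans <- direction(rep1$avg_logFC)
rep2$trans <- direction(rep2$avg_logFC)

# Joining on gene AND direction drops genes whose sign flips between replicates.
consistent <- merge(rep1, rep2, by = c("gene", "trans"), suffixes = c(".rep1", ".rep2"))
print(consistent)
```

Here `Il17a` is dropped because it is up in one replicate and down in the other, which is the behaviour the two-column join is meant to give.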
# cca unfilter
```{r}
plan("sequential")
names(sct.dnt)
cca.unfilter = mapply(function(object.list){
anchor.features = SelectIntegrationFeatures(object.list = object.list,
nfeatures = 3000)
object.integrated = PrepSCTIntegration(object.list = object.list,
anchor.features = anchor.features) %>%
lapply(FUN = RunPCA, features = anchor.features, verbose = F) %>%
FindIntegrationAnchors(object.list = .,
reduction = "rpca",
anchor.features = anchor.features,
normalization.method = "SCT",
reference = 1) %>% # long time
IntegrateData(anchorset = .,
normalization.method = "SCT") %>%
RunPCA(object = .) %>%
RunUMAP(object = .,
dims = 1:30) %>%
FindNeighbors(dims = 1:30) %>%
FindClusters(resolution = .05)
##### scale for finding markers #####
DefaultAssay(object.integrated) = "RNA"
object.integrated = NormalizeData(object.integrated,
normalization.method = "LogNormalize",
scale.factor = 10000)
object.integrated = FindVariableFeatures(object.integrated,
selection.method = "vst",
nfeatures = 2000)
object.integrated = ScaleData(object.integrated,
vars.to.regress = "percent.mt")
return(object.integrated)
},
object.list = list("nDNT" = sct.dnt[5:6], "aDNT" = sct.dnt[1:2], "DNT" = sct.dnt[c(5:6, 1:2)]),
SIMPLIFY = F)
# object = cca.unfilter$DNT
# DefaultAssay(object) = "integrated"
# object = FindClusters(object, resolution = .02)
# DimPlot(object)
save(cca.unfilter, file = "01.cca.unfilter.Rdata")
```
```{r modify the metadata}
meta.all = mapply(function(object){
##### gene expression #####
DefaultAssay(object) = "RNA"
meta.gene = FetchData(object = object,
vars = c("Cd3d", "Cd3e", "Cd3g", "Cd4", "Cd8a", "Cd8b1", "Klrb1c"),
slot = "data") %>%
tibble::rownames_to_column("cell.rename")
head(meta.gene)
##### save the pre metadata #####
metapre = object@meta.data
##### merge pre and modified metadata #####
meta.all = metapre %>%
tibble::rownames_to_column("cell.rename") %>%
dplyr::left_join(x = ., y = meta.gene, by = "cell.rename") %>%
tibble::column_to_rownames("cell.rename")
head(meta.all)
##### T identity #####
if (("Cd4" %in% colnames(meta.all)) == F){
meta.all$DNTid = (meta.all$Cd3d > 0 |meta.all$Cd3e > 0| meta.all$Cd3g > 0) &
meta.all$Klrb1c == 0 & meta.all$Cd8b1 == 0
} else {
meta.all$DNTid = (meta.all$Cd3d > 0 |meta.all$Cd3e > 0| meta.all$Cd3g > 0) &
meta.all$Cd4 == 0 & meta.all$Klrb1c == 0 & meta.all$Cd8b1 == 0
}
table(meta.all$DNTid, meta.all$integrated_snn_res.0.05)
meta.all$contaminate = meta.all$DNTid == FALSE
table(meta.all$contaminate, meta.all$integrated_snn_res.0.05)
meta.all$contaminate = factor(meta.all$contaminate, levels = c(FALSE, TRUE), labels = c("kept", "removed"))
return(meta.all)
}, object = cca.unfilter, SIMPLIFY = F)
save(meta.all, file = "01.cca.unfilter.meta.all.Rdata")
```
## markers
```{r}
cc.marker = mapply(function(object){
print(table(object$seurat_clusters, object$orig.ident))
lapply(sort(unique(object$seurat_clusters)), function(i){
if (min(table(object$orig.ident[object$seurat_clusters == i])) > 30){
FindConservedMarkers(object, ident.1 = i, grouping.var = "orig.ident",
assay = "RNA", slot = "data", only.pos = T) %>%
tibble::rownames_to_column("gene") %>%
dplyr::filter(max_pval < .05 & minimump_p_val < .05) %>%
dplyr::mutate(cluster = i)
}}) %>%
data.table::rbindlist() %>%
dplyr::left_join(y = FindAllMarkers(object, assay = "RNA", slot = "data", only.pos = T), by =
c("cluster", "gene")) %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T)
}, object = cca.unfilter, SIMPLIFY = F)
names(cc.marker)
save(cc.marker, file = "01.cca.unfilter.cc.marker.Rdata")
```

# DoubletFinder and DoubletDecon

## DoubletDecon use sct results
```{r}
source("../26.MAIT/DoubletDecon.f.r")
############################################################
# Remove doublets using DoubletDecon. The dataset needs to be clustered first and must have
# more than two clusters; the plot window should be maximized.
############################################################
if (!is.null(dev.list())) dev.off() # close the open graphics device, if there is one
object = lapply(sct.kept[5:6], function(i){
i = i %>%
RunPCA() %>%
RunUMAP(dims = 1:30) %>%
FindNeighbors() %>%
FindClusters(resolution = .1)})
nDNT.decon = mapply(DoubletDecon.f,
object = object,
filename = c("nDNT_rep1.sct", "nDNT_rep2.sct"),
rhop = 1,
SIMPLIFY = F)
save(nDNT.decon, file = "01.nDNT.decon.Rdata")
```
## DoubletFinder union Decon use sct results
```{r}
source("../26.MAIT/DoubletFinder.f.r")
doublet.finder.res = mapply(DoubletFinder.f, object = nDNT.decon, SIMPLIFY = F)
save(doublet.finder.res, file = "01.DoubletDecon.DoubletFinder.Rdata")
```

---
title: "00.plot2"
output: html_notebook
editor_options:
  chunk_output_type: console
---

2020-11-20 edit

# library
```{r}
library(Seurat)
library(harmony)
library(MAST)
library(dplyr)
library(tidyselect)
library(RColorBrewer)
library(future)
library(ggplot2)
library(org.Mm.eg.db)
library(EnsDb.Mmusculus.v79)
library(cowplot)
library(data.table)
library(clusterProfiler)
library(DoubletDecon)
library(DoubletFinder)
Sys.setenv(LANGUAGE = "en")
options(warn = -1)
memory.limit(size = 64670)
```
# theme set
```{r}
theme_set(theme_cowplot(font_size = 8))
theme.text = theme_cowplot(font_size = 8)
cols = c(brewer.pal(9, "Set1"),
brewer.pal(8, "Set2")[-c(2,4,8)],
brewer.pal(12, "Set3")[-9],
brewer.pal(12, "Paired"),
brewer.pal(8, "Dark2"),
brewer.pal(11, "Spectral")[-6],
brewer.pal(11, "BrBG")[-6]
)
```
# load data
```{r}
load("01.cca.kept.Rdata")
load("01.cca.kept.cc.marker.Rdata")
load("01.cca.kept.cc.marker.48nk.Rdata")
load("01.harmony.dnt.nk.Rdata")
load("01.harmony.dnt.nk.meta.Rdata")
load("01.DoubletDecon.DoubletFinder.Rdata")
```
# load function
```{r}
source("../24.publish_least_library/function/heatmap.f.r")
source("../24.publish_least_library/function/dotplot.f.r")

```
# fig 2a dimplot
```{r}
object = cca.kept$nDNT
##### save the png #####
(DimPlot(object, cols = "Paired", pt.size = .001, order = T) +
theme_nothing() +
theme(aspect.ratio = 1)) %>%
ggsave(filename = "tmp.png", plot = ., units = "cm", width = 6, height = 6, dpi = 1200)
##### plots #####
((DimPlot(object, pt.size = NA, cols = "Paired", combine = F))[[1]] +
theme.text +
labs(color = paste0("nDNT\ncells\n(", prettyNum(ncol(object), big.mark = ","), ")"), tag = "a") +
annotation_raster(png::readPNG("tmp.png"), -Inf, Inf, -Inf, Inf) +
annotation_custom(grob = ggplotGrob((DimPlot(object, label = T, label.size = 2.5, pt.size = NA,
combine = F))[[1]] +
theme_nothing() +
theme(aspect.ratio = 1))) +
theme(aspect.ratio = 1)) %>%
ggsave(filename = "Fig2a.pdf", plot = ., units = "cm", width = 6, height = 5)

```
# fig 2b heatmap
```{r}
colnames(cc.marker$nDNT)
object = cca.kept$nDNT
df = cc.marker$nDNT %>%
dplyr::filter((pct.1 > .5 & cluster == 4 & avg_logFC > .5) | !(cluster == 4) & p_val_adj < .05) %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T)
tail(df)
table(df$cluster)
write.csv(df, file = "TableS4.nDNT.subgroup.markers.new.csv", row.names = F)
heatmap.f(object = object,
features = df$gene,
type = df$cluster,
xlab = paste0("nDNT cells (", prettyNum(ncol(object), big.mark = ","),")"),
ylab = paste0("Subgroup markers of nDNT (", prettyNum(nrow(df), big.mark = ",") ,")"),
tag = "b",
pal = c(brewer.pal(5, "Paired"),
brewer.pal(5, "Paired")),
axis.text.y = element_blank()
) %>%
ggsave(filename = "Fig2b.pdf", plot = ., units = "cm", width = 8, height = 7)
```
# fig 2c dotplot
```{r}
##### set the parameters #####
df = cc.marker$nDNT %>%
dplyr::filter((pct.1 > .5 & cluster == 4 & avg_logFC > 1) | !(cluster == 4) & p_val_adj < .05) %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T) %>%
dplyr::mutate(type = paste0("nDNT", cluster)) %>%
dplyr::left_join(y = cc.marker.48nk %>%
dplyr::filter(type %in% paste0("nDNT", 0:4)),
by = c("gene", "type"),
suffix = c("", ".48nk")) %>%
dplyr::mutate(type = "Higher than CD4 CD8 NK")
colnames(df)
head(df)
df$type[is.na(df$avg_logFC.48nk)] = "Not higher than CD4 CD8 NK"
head(df[,c("gene", "type", "avg_logFC", "cluster")])
##### annotate #####
ensembl = biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")
mm_go_anno = biomaRt::getBM(
attributes = c("external_gene_name", "description", "go_id", "name_1006",
"namespace_1003", "definition_1006"),
filters = c("external_gene_name"),
values = unique(as.character(df$gene)),
mart = ensembl) %>%
dplyr::filter(namespace_1003 %in% c("cellular_component", "molecular_function"))
cs = sort(unique(mm_go_anno$external_gene_name[grep("component of membrane",
mm_go_anno$name_1006)]))
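# the pattern is built with paste() so that no line break ends up inside the regular expression
cs.neg = sort(unique(mm_go_anno$external_gene_name[grep(paste(
"mitochondrial", "integral component of Golgi membrane",
"intracellular membrane-bounded organelle", "nuclear membrane", sep = "|"),
mm_go_anno$name_1006)]))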
cs = cs[!(cs %in% cs.neg)]
secreted = sort(unique(mm_go_anno$external_gene_name[
c(grep("extracellular space", mm_go_anno$name_1006),
grep("granzyme", mm_go_anno$description))]))
tf = sort(unique(mm_go_anno$external_gene_name[
grep(paste(
"transcription factor activity", "transcription activator activity",
"transcription repressor activity", "transcription coactivator activity",
"DNA binding", "DNA-binding", sep = "|"),
mm_go_anno$name_1006)]))
intersect(cs, secreted)
intersect(cs, tf)
intersect(secreted, tf)
cs = cs[!(cs %in% tf)]
secreted = secreted[(!(secreted %in% cs)) & (!(secreted %in% tf))]
ng = sort(unique(df$gene[!(df$gene %in% c(cs, secreted, tf))]))
anno.df = data.frame(
gene = c(cs, secreted, tf, ng),
anno = c(rep("cell surface", length(cs)),
rep("secreted", length(secreted)),
rep("transcription factor", length(tf)),
rep("", length(ng)))) %>%
dplyr::filter(!(gene %in% c("Actn1", "Actn2")))
df.plot =
dplyr::left_join(df, anno.df, by = c("gene")) %>%
dplyr::select(gene, cluster, type, anno, avg_logFC) %>%
dplyr::mutate(cluster = paste0("nDNT", cluster))
df.plot$anno = factor(
df.plot$anno,
levels = c("cell surface", "secreted", "transcription factor", ""))
head(df.plot)
##### object #####
object = harmony.dnt.nk
object@meta.data = meta
object = subset(object, cells = colnames(object)[object$celltype %in% c("CD4", "CD8", "NK",
"nDNT")])
table(object$cluster)
object$cluster = factor(object$cluster, levels = c(paste("nDNT",0:4,sep = ""), "CD4", "CD8", "NK"))
##### plot #####
plan("sequential")
p = mapply(function(i, j){
x = df.plot %>%
dplyr::filter(anno == i) %>%
dplyr::group_by(cluster) %>%
dplyr::top_n(10, avg_logFC) %>%
dplyr::group_by(cluster, type) %>%
dplyr::arrange(gene, .by_group = T)
head(x)
p = dotplot.f(object, features = x$gene, group.by = "cluster", facet = x$type,
axis.text.x = element_text(angle = 60, hjust = 1, vjust = 1)) +
labs(x = "", title = j, y = "Subgroup marker genes")
return(p)
},
i = c("cell surface", "secreted", "transcription factor", ""),
j = c("Cell\nsurface","Secreted","Transcription\nfactor","Remaining\ngenes"),
SIMPLIFY = F,
USE.NAMES = F)
p[[1]] = p[[1]] + NoLegend()
p[[2]] = p[[2]] + theme(axis.title.y = element_blank()) + NoLegend()
p[[3]] = p[[3]] + theme(axis.title.y = element_blank())
p[[4]] = p[[4]] + theme(axis.title.y = element_blank()) + NoLegend()
##### change the strip color #####
for (i in 1:4) {
pal = c(brewer.pal(3, "Pastel1")[1:2])
g = ggplotGrob(p[[i]])
strips = grep("strip-", g$layout$name)
for (x in seq_along(strips)) {
k = which(grepl("rect", g$grobs[[strips[[x]]]]$grobs[[1]]$childrenOrder))
g$grobs[[strips[[x]]]]$grobs[[1]]$children[[k]]$gp$fill = pal[x]
}
p[[i]] = g
}
##### save plots #####
plot_grid(plotlist = p, nrow = 1, labels = "c", label_size = 8, rel_widths = c(1,.9,.85,1.1)) %>%
ggsave(filename = "Fig2c.pdf", plot = ., units = "cm", width = 19, height = 16)
```
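The annotation step above sorts marker genes into cell surface, secreted and transcription factor groups by matching GO term names, and resolves overlaps in a fixed order (transcription factor before cell surface before secreted). A simplified base-R sketch of that priority rule, assuming a made-up annotation table (gene names and terms are hypothetical, and the `cs.neg` exclusion is left out):

```r
# Hypothetical GO annotation rows (gene, GO term name); values are made up for illustration.
go = data.frame(
  gene = c("GeneA", "GeneA", "GeneB", "GeneC", "GeneC", "GeneD"),
  term = c("integral component of membrane", "DNA binding",
           "extracellular space",
           "integral component of membrane", "extracellular space",
           "cytoplasm"),
  stringsAsFactors = FALSE)
# Helper: all genes with at least one term matching the pattern.
genes.with = function(pattern) sort(unique(go$gene[grepl(pattern, go$term)]))
tf = genes.with("DNA binding|transcription factor activity")
cs = setdiff(genes.with("component of membrane"), tf)             # transcription factor wins over cell surface
secreted = setdiff(genes.with("extracellular space"), c(cs, tf))  # cell surface wins over secreted
other = setdiff(unique(go$gene), c(cs, secreted, tf))             # everything left is unannotated
anno = c(setNames(rep("cell surface", length(cs)), cs),
         setNames(rep("secreted", length(secreted)), secreted),
         setNames(rep("transcription factor", length(tf)), tf),
         setNames(rep("", length(other)), other))
print(anno)
```

Each gene ends up in exactly one group, which is what lets the dot plot panels be drawn without repeated genes.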
# fig S6 dimplot
```{r}
object = cca.kept$nDNT
##### save pngs #####
mapply(function(ident, num){
(DimPlot(object, cols = "Paired", pt.size = .1,
cells = colnames(object)[object$orig.ident == ident]) +
theme_nothing()) %>%
ggsave(filename = paste0("tmp", num, ".png"), plot = ., units = "cm", width = 6, height = 6, dpi = 1200)
}, ident = c("nDNT_rep1", "nDNT_rep2"), num = 1:2, SIMPLIFY = F)

##### annotate function #####

annotation_custom2 <- function(grob, xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf, data){
layer(data = data, stat = StatIdentity, position = PositionIdentity,
geom = ggplot2:::GeomCustomAnn,
inherit.aes = TRUE, params = list(grob = grob,
xmin = xmin, xmax = xmax,
ymin = ymin, ymax = ymax))}
##### plots #####
p = DimPlot(object, cols = "Paired", pt.size = NA, split.by = "orig.ident", combine = F)[[1]]
(p +
annotation_custom2(grid::rasterGrob(png::readPNG("tmp1.png")), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "nDNT_rep1")[1],]) +
annotation_custom2(ggplotGrob(DimPlot(subset(object, cells = colnames(object)[object$orig.ident == "nDNT_rep1"]),
pt.size = NA, label = T, label.size = 2.5) + NoAxes() + NoLegend()), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "nDNT_rep1")[1],]) +
annotation_custom2(grid::rasterGrob(png::readPNG("tmp2.png")), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "nDNT_rep2")[1],]) +
annotation_custom2(ggplotGrob(DimPlot(subset(object, cells = colnames(object)[object$orig.ident == "nDNT_rep2"]),
pt.size = NA, label = T, label.size = 2.5) + NoAxes() + NoLegend()), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "nDNT_rep2")[1],]) +
labs(color = paste0("nDNT\ncells\n(", prettyNum(ncol(object), big.mark = ","),")")) +
theme.text +
theme(aspect.ratio = 1)) %>%
ggsave(filename = "FigS6.pdf", plot = ., units = "cm", width = 12, height = 6)
```
# fig S7 heatmap
```{r}
##### set the parameters #####
df = cc.marker$nDNT %>%
dplyr::filter((pct.1 > .5 & cluster == 4 & avg_logFC > 1) | !(cluster == 4) & p_val_adj < .05) %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T) %>%
dplyr::mutate(type = paste0("nDNT", cluster)) %>%
dplyr::left_join(y = cc.marker.48nk %>%
dplyr::filter(type %in% paste0("nDNT", 0:4)),
by = c("gene", "type"),
suffix = c("",".NK")) %>%
dplyr::mutate(type = "Higher than CD4 CD8 NK")
colnames(df)
df$type[is.na(df$pct.2.NK)] = "Not higher than CD4 CD8 NK"
head(df[,c("gene", "type")])
df = df[df$type == "Higher than CD4 CD8 NK",]
df$gene = factor(df$gene, levels = rev(sort(unique(as.character(df$gene)))))
##### write csv #####
write.csv(df, file = "TableS5.nDNT.subgroup.CD48NK.markers.csv", row.names = F)
##### object #####
object = harmony.dnt.nk
object = subset(object, cells = colnames(object)[object$celltype %in% c("CD4", "CD8", "NK",
"nDNT")])
object$id.dot = as.character(object$celltype)
object$id.dot[grep("nDNT", object$id.dot)] = paste("nDNT",
cca.kept$nDNT$integrated_snn_res.0.05, sep = "")
object$id.dot = factor(object$id.dot, levels = c(paste("nDNT",0:4,sep = ""), "CD4", "CD8", "NK"))
df1 = df[1:95,]
df2 = df[96:(96 + 95),]
df3 = df[(96 + 96):nrow(df),] # start after the last row of df2 so that row 191 is not plotted twice
##### plot #####
p1 = heatmap.f(object = object,
features = df1$gene,
type = df1$cluster,
group.by = "id.dot",
angle.x = 90,
axis.text.y = element_text(face = "italic", hjust = 0, size = 7),
pal = c(brewer.pal(8, "Paired"),
brewer.pal(8, "Paired")[1:2]),
xlab = "",
ylab = "",
tag = "",
space = "free_y")
p2 = heatmap.f(object = object,
features = df2$gene,
type = df2$cluster,
group.by = "id.dot",
angle.x = 90,
axis.text.y = element_text(face = "italic", hjust = 0, size = 7),
pal = c(brewer.pal(8, "Paired"),
brewer.pal(8, "Paired")[2:3]),
xlab = "",
ylab = "",
tag = "",
space = "free_y")
p3 = heatmap.f(object = object,
features = df3$gene,
type = df3$cluster,
group.by = "id.dot",
angle.x = 90,
axis.text.y = element_text(face = "italic", hjust = 0, size = 7),
pal = c(brewer.pal(8, "Paired"),
brewer.pal(8, "Paired")[3:5]),
xlab = "",
ylab = paste0("Subgroup markers of nDNT higher than CD4 CD8 NK (", nrow(df), ")"),
tag = "",
space = "free_y")
#####
plot_grid(p1, p2, p3, nrow = 1) %>%
ggsave(filename = "FigS7.pdf", plot = ., units = "cm", width = 19, height = 25)
```
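The heatmap above is drawn as three side-by-side panels, so the marker table has to be cut into consecutive blocks without sharing a row between panels. A minimal sketch of one way to do that, assuming a made-up 10-row table (`df` and `n.panels` here are hypothetical stand-ins, and the block sizes differ from the 95/96-row blocks used above):

```r
# Hypothetical marker table with 10 rows standing in for the full marker list.
df = data.frame(gene = paste0("Gene", 1:10), stringsAsFactors = FALSE)
n.panels = 3
# ceiling(row index / block size) assigns every row to exactly one panel.
block = ceiling(seq_len(nrow(df)) / ceiling(nrow(df) / n.panels))
panels = split(df, block)
print(sapply(panels, nrow))
```

Because every row index maps to a single block, no gene can appear in two panels, whatever the table length.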
# fig S8 go
```{r cluego}
##### read the data #####
df = read.table(file = "FigS8.cluego.txt", header = T, sep = "\t") %>%
setNames(c("Term", "Source", "AdjustP", "AssociatedGenes", "Num.Genes", "Genes", "Cluster")) %>%
dplyr::group_by(Cluster, Source) %>%
dplyr::arrange(Term, .by_group = T)
df$Term = factor(df$Term, levels = unique(df$Term))
head(df)
##### plot #####
p = ggplot(df, aes(Cluster, Term, color = AdjustP, size = AssociatedGenes)) +
geom_point() +
facet_grid(rows = vars(Source), scales = "free", space = "free") +
scale_color_distiller(palette = "Spectral", direction = 1, guide = guide_colorbar(default.unit =
"cm", barwidth = .2, barheight = 1.5)) +
scale_size_area(max_size = 2.5, guide = guide_legend(title = "Associated\nGenes (%)")) +
labs(x = "nDNT cluster") +
theme.text +
theme(strip.text.y = element_text(angle = 0))
##### change the strip color #####
g = ggplotGrob(p)
pal = brewer.pal(5, "Set3")
strips = grep("strip-", g$layout$name)
for (i in seq_along(strips)) {
k = which(grepl("rect",
g$grobs[[strips[i]]]$grobs[[1]]$childrenOrder))
g$grobs[[strips[i]]]$grobs[[1]]$children[[k]]$gp$fill = pal[i]
}
##### save plot #####
ggsave(filename = "FigS8.cluego.pdf", plot = plot_grid(g), units = "cm", width = 16, height = 9)
```

# fig S10 doubletfinder

```{r}
##### set parameters #####
meta = lapply(doublet.finder.res, function(i){
i@meta.data %>%
tibble::rownames_to_column("cells") %>%
dplyr::select(cells, isADoublet, contains("DF.classifications")) %>%
setNames(c("cells", "isADoublet", "rm","DoubletFinder")) %>%
dplyr::select(-rm)
}) %>%
data.table::rbindlist() # get the doublets information
object = cca.kept$nDNT
object@meta.data = object@meta.data %>%
tibble::rownames_to_column("cells") %>%
dplyr::left_join(y = meta, by = "cells") %>%
tibble::column_to_rownames("cells") %>%
dplyr::mutate(doublets = paste(isADoublet, DoubletFinder)) # put doublets information into cca results
head(object@meta.data)
# "FALSE Singlet" # singlets type
##### plots #####
mapply(function(features){
(FeaturePlot(object, features, pt.size = .01, order = T) +
scale_color_distiller(palette = "Spectral") +
theme_nothing() +
theme(panel.background = element_rect(fill = "lightgrey", colour = NA))) %>%
ggsave(filename = "tmp.png", plot = ., units = "cm", width = 6, height = 6, dpi = 1200)
mapply(function(object, subtitle){
p1 = FeaturePlot(object, features, pt.size = NA) +
scale_color_distiller(palette = "Spectral", guide = guide_colorbar(title = "Expression",
title.position = "left", title.theme = element_text(angle = 90, size = 7, hjust = .5, vjust = .5),
default.unit = "cm", barwidth = .1, barheight = 1.5)) +
labs(subtitle = paste0(subtitle, " doublets removed")) +
theme.text +
theme(aspect.ratio = 1,
plot.title = element_text(face = "italic", hjust = .5, size = 8),
plot.subtitle = element_text(hjust = .5)) +
NoAxes() +
annotation_raster(png::readPNG("tmp.png"), -Inf, Inf, -Inf, Inf) +
annotation_custom(ggplotGrob(FeaturePlot(object, features, pt.size = NA, label = T, label.size =
2.5, repel = T) + theme_nothing()))
p2 = VlnPlot(object, features, pt.size = F) +
scale_fill_brewer(palette = "Paired") +
labs(x = "nDNT") +
theme.text +
NoLegend() +
theme(plot.title = element_blank())
plot_grid(p1, p2, ncol = 1, rel_heights = c(2, 1))
}, object = list(object, subset(object, cells = colnames(object)[object$doublets == "FALSE Singlet"])),
SIMPLIFY = F, subtitle = c("Before", "After")) %>%
plot_grid(plotlist = ., nrow = 1)
},
features = c("Cd74", "H2-Ab1", "H2-Eb1", "Spi1"),
SIMPLIFY = F) %>%
plot_grid(plotlist = ., labels = "auto", label_size = 8) %>%
ggsave(filename = "FigS10.pdf", plot = ., units = "cm", width = 19, height = 18)

```
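The "after" panels above keep only cells that both tools call singlets, that is, cells whose combined label is `"FALSE Singlet"`. A small sketch of how the two calls are combined, assuming a made-up call table (the `calls` data frame is hypothetical):

```r
# Hypothetical doublet calls: DoubletDecon (logical) and DoubletFinder (character) per cell.
calls = data.frame(
  isADoublet    = c(FALSE, TRUE, FALSE, TRUE),
  DoubletFinder = c("Singlet", "Singlet", "Doublet", "Doublet"),
  row.names = paste0("cell", 1:4),
  stringsAsFactors = FALSE)
# Paste both calls into one label; a cell is kept only if neither tool flags it.
calls$doublets = paste(calls$isADoublet, calls$DoubletFinder)
singlets = rownames(calls)[calls$doublets == "FALSE Singlet"]
print(table(calls$doublets))
print(singlets)
```

This removes the union of both doublet sets, so a cell flagged by either tool is dropped.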

---
title: "00.plot3"
output: html_notebook
editor_options:
  chunk_output_type: console
---

2020-11-24

# library
```{r}
library(Seurat)
library(ggplot2)
library(cowplot)
library(dplyr)
library(grid)
```
# theme set
```{r}
theme.text = theme(text = element_text(size = 8),
axis.text = element_text(size = 8),
axis.title = element_text(size = 8),
plot.title = element_text(size = 8, face = "plain"),
plot.tag = element_text(size = 8, face = "bold"))
theme_set(theme_cowplot())
```
# fig 3c barplot
```{r}
N2 = 17
N4 = 10.1
N0 = 70.9*82.8/100
N1 = 70.9*82.8/100
N3 = 7.88*70.9/100
df = data.frame("percent" = c(N0,N1,N2,N3,N4),
"cluster" = 0:4,
"type" = "flow")
df$percent = round(df$percent/sum(df$percent)*100, digits = 1)
df1 = (prop.table(table(cca.kept$nDNT$integrated_snn_res.0.05, cca.kept$nDNT$orig.ident),
2)*100) %>% round(., 1) %>%
data.frame() %>%
setNames(c("cluster", "type", "percent"))
df = rbind(df, df1)
head(df)
# keep the bar plot in `c` so that it can be printed here and placed on the Fig. 3 page
c = ggplot(df, aes(type, percent, fill = cluster)) +
geom_bar(stat = "identity", width = .7) +
scale_fill_brewer(palette = "Paired") +
labs(x = "", y = "percentage (%)", tag = "c") +
theme(axis.text.x = element_text(angle = 60, hjust = 1, vjust = 1))
ggsave(filename = "Fig3c.pdf", plot = c, units = "cm", width = 3.5, height = 4.5)
# df = df %>%
# dplyr::mutate(lab.pos = cumsum(percent) - .1*percent)
#c=
# ggplot(data = df,
# mapping = aes(x = "", y = percent, fill = type)) +
# geom_bar(stat = "identity") +
# coord_polar("y", start = 0) +
# scale_fill_brewer(palette = "Paired") +
# NoAxes() +
# geom_text(aes(y = lab.pos,
# label = scales::percent(percent/100, accuracy = 1)),
# size = 3) +
# theme.text +
# theme(plot.title = element_text(hjust = .5)) +
# labs(fill = "Cell Type",
# title = "nDNT cells",
# tag = "c")
c
```
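The scRNA-seq bars above come from a column-wise proportion table: within each replicate, the cluster counts are divided by the replicate total. A minimal sketch on made-up labels (the vectors `cluster` and `rep.id` are hypothetical):

```r
# Hypothetical per-cell labels.
cluster = c("0", "0", "1", "0", "1", "1", "1")
rep.id  = c("rep1", "rep1", "rep1", "rep2", "rep2", "rep2", "rep2")
# margin = 2 normalises each replicate (column) to 100 %.
pct = round(prop.table(table(cluster, rep.id), margin = 2) * 100, 1)
# Long format, one row per cluster and replicate, as needed for a stacked bar plot.
pct.df = as.data.frame(pct, responseName = "percent")
print(pct.df)
```

Rounding to one decimal can make a column sum differ slightly from 100, which is harmless for plotting.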
# fig 3d heatmap
```{r}
load("01.cc.marker.Rdata")
df = read.table("../11.王崧验证结果/Naive01234.txt",
header = T,
stringsAsFactors = F) %>%
dplyr::select(N0 = Naive_0,
N1 = Naive_2,
N2 = Naive_1,
N3 = Naive_3,
N4 = Naive_4) %>%
t() %>%
as.data.frame() %>%
tibble::rownames_to_column("cell") %>%
reshape::melt("cell") %>%
dplyr::group_by(variable) %>%
dplyr::mutate(value = scale(value)) %>%
dplyr::select(gene = variable, cell, value) %>%
dplyr::left_join(
y = cc.marker$nDNT[, c("gene", "cluster")],
by = c("gene")
)
df$gene = factor(df$gene,
levels = rev(sort(unique(df$gene))))
d = ggplot(data = df,
mapping = aes(x = cell, y = gene, fill = value)) +
geom_tile(color = "white", size = 1) +
facet_grid(cluster~., scales = "free", space = "free") +
scale_fill_distiller(
palette = "RdBu",
guide = guide_colorbar(
title = expression(paste(Scaled, " ", Delta, Ct)),
title.position = "top",
barwidth = 1.5,
barheight = .2,
default.unit = "cm"
),
breaks = seq(-2, 2)
)+
labs(x = "nDNT cluster",
y = "",
tag = "d") +
scale_x_discrete(labels = as.character(0:4)) +
theme.text +
theme(legend.position = "bottom",
legend.justification = c(1,1),
axis.line = element_blank(),
axis.ticks = element_blank(),
axis.text.y = element_text(face = "italic"),
strip.text.y = element_text(angle = 0))
d
```
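The heatmap above shows values scaled per gene, so that each gene has mean 0 and standard deviation 1 across the five subsets. A base-R sketch of the same per-group scaling, assuming made-up values (the `dct` data frame is hypothetical; the chunk above uses `dplyr::group_by()` with `scale()` instead):

```r
# Hypothetical delta-Ct style values for two genes in three cell subsets.
dct = data.frame(
  gene  = rep(c("GeneA", "GeneB"), each = 3),
  cell  = rep(c("N0", "N1", "N2"), times = 2),
  value = c(1, 2, 3, 10, 20, 30))
# ave() applies the z-score within each gene and returns values in the original row order.
dct$scaled = ave(dct$value, dct$gene, FUN = function(v) (v - mean(v)) / sd(v))
print(dct)
```

After scaling, both genes share the same range although their raw values differ tenfold, which is why one colour scale works for all rows.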
# fig 3e facs stat subset
```{r FACS.stat.subset}
files = list.files(pattern = "FACS.*.csv", path = "../26.MAIT/")
files
celltype = c("nDNT0", "nDNT1", "nDNT2", "nDNT3", "nDNT4")
genetype = c("IKZF2", "Ly-6C", "IL17A", "GZMB")
df = lapply(grep("Helios.csv|Ly6c.csv|Il17a|Gzmb.v2", files, value = T), function(file){
df = read.csv(file = paste0("../26.MAIT/", file)) %>%
dplyr::mutate(gene = sub("Helios", "IKZF2",
sub("Gzmb", "GZMB",
sub("Ly6C", "Ly-6C",
sub("_[1-5]", "", X))))) %>%
tibble::column_to_rownames(var = "X") %>%
reshape2::melt(id.vars = "gene", value.name = "percentage") %>%
dplyr::mutate(percentage = as.numeric(sub("%", "", percentage)))
}) %>%
dplyr::bind_rows() %>%
dplyr::filter(variable %in% celltype)
df$gene = factor(df$gene, levels = genetype)
df$variable = factor(df$variable, levels = celltype)
head(df)
df.sum = df %>%
dplyr::group_by(gene, variable) %>%
dplyr::mutate(avg = mean(percentage), sd = sd(percentage)) %>%
dplyr::select(-percentage, -variable) %>%
unique()
df.sum$gene = factor(df.sum$gene, levels = genetype)
df.sum$variable = factor(df.sum$variable, levels = celltype)
df.sum
pvalue = c(
sapply(celltype[2:5], function(i){
t.test(subset(df, gene == "IKZF2" & variable == i)$percentage,
subset(df, gene == "IKZF2" & variable == "nDNT0")$percentage,
paired = T)$p.value
}),
sapply(celltype[2:5], function(i){
t.test(subset(df, gene == "Ly-6C" & variable == i)$percentage,
subset(df, gene == "Ly-6C" & variable == "nDNT0")$percentage,
paired = T)$p.value
}),
sapply(celltype[c(1, 3:5)], function(i){
t.test(subset(df, gene == "IL17A" & variable == i)$percentage,
subset(df, gene == "IL17A" & variable == "nDNT1")$percentage,
paired = T)$p.value
}),
sapply(celltype[c(1:3, 5)], function(i){
t.test(subset(df, gene == "GZMB" & variable == i)$percentage,
subset(df, gene == "GZMB" & variable == "nDNT3")$percentage,
paired = F)$p.value
}))
pvalue
pvalue = sapply(pvalue, function(i) {
if (i > .05) "ns" else if (i > .01) "*" else if (i > .001) "**" else "***"
})
pvalue
max(df$percentage[df$gene == "IL17A"])
anno = data.frame(x1 = c(rep(0, 8),
0, 1, 1, 1,
0, 1, 2, 3) + 1,
x2 = c(1:4, 1:4 ,
1, 2, 3, 4,
3, 3, 3, 4) + 1,
y1 = c(seq(77, 200, 10)[1:4],
seq(77, 200, 10)[1:4],
seq(13, 200, 10)[1:4],
seq(75, 200, 10)[1:4]),
lab = pvalue,
gene = rep(genetype, c(4, 4, 4, 4))) %>%
dplyr::mutate(y2 = y1 + 3,
ystar = y1 + 6,
xstar = (x1 + x2)/2)
anno
p1 = ggplot(data = df.sum, mapping = aes(x = variable, y = avg)) +
geom_bar(stat = "identity", fill = NA, width = .6, aes(color = variable)) +
geom_errorbar(aes(ymin = avg - sd, ymax = avg + sd, color = variable), width = .3) +
geom_jitter(data = df, mapping = aes(x = variable, y = percentage, color = variable),
size = .5, width = .35) +
facet_grid(gene~., scales = "free_y", drop = F) +
geom_text(data = anno, aes(x = xstar, y = ystar, label = lab), size = 2) +
geom_segment(data = anno, aes(x = x1, xend = x1, y = y1, yend = y2), colour = "black") +
geom_segment(data = anno, aes(x = x2, xend = x2, y = y1, yend = y2), colour = "black") +
geom_segment(data = anno, aes(x = x1, xend = x2, y = y2, yend = y2), colour = "black") +
NoLegend() +
theme.text +
theme(strip.text.y.right = element_text(angle = 90, hjust = .5, vjust = .5)) +
labs(x = "nDNT", y = "FACS Percentage (%)") +
scale_x_discrete(labels = 0:4) +
scale_color_brewer(
palette = "Set1",
guide = guide_legend(label.hjust = 0, default.unit = "mm", keywidth = 3, keyheight = 3))
p1
# ggsave(filename = "Figure3e.png", plot = p1, units = "cm", width = 4, height = 8)
# pdf("Figure3e.pdf", useDingbats = F, width = 4/2.54, height = 6/2.54)
# print(p1)
# dev.off()
```
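The p values above are turned into significance labels with the cut-offs 0.05, 0.01 and 0.001. A vectorised sketch of the same mapping using `cut()` (the helper name `p.to.stars` is hypothetical):

```r
# Map p values to labels: p > .05 "ns", (.01, .05] "*", (.001, .01] "**", p <= .001 "***".
p.to.stars = function(p){
  as.character(cut(p,
                   breaks = c(-Inf, .001, .01, .05, Inf),
                   labels = c("***", "**", "*", "ns"),
                   right = TRUE)) # intervals are closed on the right, so p = .05 is "*"
}
print(p.to.stars(c(.2, .05, .03, .01, .0005, NA)))
```

Unlike an `if`/`else` chain, this returns `NA` for a missing p value instead of stopping with an error.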
```{r vlnplot ikzf2 ly6c2}
# load("01.cca.Rdata")
object = cca$nDNT
DefaultAssay(object) = "SCT"
df = FetchData(object, c("Ikzf2", "Ly6c1", "Ly6c2", "Il17a", "Gzmb","seurat_clusters")) %>%
dplyr::rowwise() %>%
dplyr::mutate(Ly6c = sum(c(Ly6c1, Ly6c2))) %>%
reshape2::melt(id.vars = "seurat_clusters") %>%
dplyr::filter(!variable %in% c("Ly6c1", "Ly6c2"))
head(df)
table(df$variable)
df$variable = factor(df$variable, levels = c("Ikzf2", "Ly6c", "Il17a", "Gzmb"))
striplabel = c("Ikzf2", "Ly6c1 + Ly6c2", "Il17a", "Gzmb")
names(striplabel) = levels(df$variable)
p2 = ggplot(data = df, mapping = aes(x = seurat_clusters, y = value, color = seurat_clusters)) +
geom_violin(scale = "width") +
facet_grid(variable~., scales = "free", labeller = as_labeller(striplabel)) +
scale_color_brewer(palette = "Set1") +
theme.text +
theme(strip.text.y.right = element_text(angle = 90, hjust = .5, vjust = .5, face = "italic")) +
NoLegend() +
labs(x = "nDNT", y = "Expression level")
p2
p = plot_grid(p1, p2, nrow = 1)
ggsave(filename = "Figure3e.png", plot = p, units = "cm", width = 8, height = 9)
pdf("Figure3e.pdf", useDingbats = F, width = 8/2.54, height = 9/2.54)
print(p)
dev.off()
```
```{r ly6c1 ly6c2}
object = cca$nDNT
DefaultAssay(object) = "SCT"
df = FetchData(object, c("Ly6c1", "Ly6c2", "seurat_clusters")) %>%
reshape2::melt(id.vars = "seurat_clusters")
head(df)
table(df$variable)
p2 = ggplot(data = df, mapping = aes(x = seurat_clusters, y = value, color = seurat_clusters)) +
geom_violin(scale = "width") +
facet_grid(~variable, scales = "free") +
scale_color_brewer(palette = "Set1") +
theme.text +
theme(strip.text.x = element_text(angle = 0, hjust = .5, vjust = .5, face = "italic")) +
NoLegend() +
labs(x = "nDNT", y = "Expression level")
p2
ggsave(filename = "Figure3g.png", plot = p2, units = "cm", width = 4, height = 3)
pdf("Figure3g.pdf", useDingbats = F, width = 4/2.54, height = 3/2.54)
print(p2)
dev.off()
```

# fig 3
```{r}
pdf(file = "Fig.3.pdf",
width = 21/2.54,
height = 29.7/2.54,
useDingbats = F)
grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow = 297, ncol = 210)))
print(c, vp = viewport(layout.pos.row = 140:190,
layout.pos.col = 10:80))
print(d, vp = viewport(layout.pos.row = 100:220,
layout.pos.col = 100:145))
dev.off()
```

---
title: "00.plot4"
output: html_notebook
editor_options:
  chunk_output_type: console
---

2020-11-21 edit

# library
```{r}
Sys.setenv(LANGUAGE = "en")
library(Seurat)
library(harmony)
library(MAST)
library(dplyr)
library(tidyselect)
library(RColorBrewer)
library(future)
library(ggplot2)
library(org.Mm.eg.db)
library(EnsDb.Mmusculus.v79)
library(cowplot)
library(data.table)
library(clusterProfiler)
library(DoubletDecon)
library(DoubletFinder)
options(warn = -1)
memory.limit(size = 64670)
```
# theme set
```{r}
theme_set(theme_cowplot(font_size = 8))
theme.text = theme_cowplot(font_size = 8)
cols = c(brewer.pal(9, "Set1"),
brewer.pal(8, "Set2")[-c(2,4,8)],
brewer.pal(12, "Set3")[-9],
brewer.pal(12, "Paired"),
brewer.pal(8, "Dark2"),
brewer.pal(11, "Spectral")[-6],
brewer.pal(11, "BrBG")[-6]
)
```
# load function
```{r}
source("../24.publish_least_library/function/heatmap.f.r")
source("../24.publish_least_library/function/dotplot.f.r")

```
# load data
```{r}
load("01.cca.kept.Rdata")
load("01.cca.kept.cc.marker.Rdata")
load("01.cca.kept.cc.marker.48nk.Rdata")

load("01.harmony.dnt.nk.Rdata")
load("01.harmony.dnt.nk.meta.Rdata")
```
# fig 4a dimplot
```{r}
object = cca.kept$aDNT
##### save the png #####
(DimPlot(object, cols = "Paired", pt.size = .001, order = T) +
theme_nothing() +
theme(aspect.ratio = 1)) %>%
ggsave(filename = "tmp.png", plot = ., units = "cm", width = 6, height = 6, dpi = 1200)
##### plots #####
p1 = (DimPlot(object, pt.size = NA, cols = "Paired", combine = F))[[1]] +
theme.text +
labs(color = paste0("aDNT\ncells\n(", prettyNum(ncol(object), big.mark = ","), ")"), tag = "a") +
annotation_raster(png::readPNG("tmp.png"), -Inf, Inf, -Inf, Inf) +
annotation_custom(grob = ggplotGrob((DimPlot(object, label = T, label.size = 2.5, pt.size = NA,
combine = F))[[1]] +
theme_nothing() +
theme(aspect.ratio = 1))) +
theme(aspect.ratio = 1)
```
```{r barplot}
df = (prop.table(table(cca.kept$aDNT$integrated_snn_res.0.05, cca.kept$aDNT$orig.ident),
2)*100) %>% round(., 1) %>%
data.frame() %>%
setNames(c("cluster", "type", "percent"))
head(df)
p2 = ggplot(df, aes(type, percent, fill = cluster)) +
geom_bar(stat = "identity", width = .7) +
scale_fill_brewer(palette = "Paired") +
labs(x = "", y = "percentage (%)", tag = "") +
theme(axis.text.x = element_text(angle = 60, hjust = 1, vjust = 1),
aspect.ratio = 2)
plot_grid(p1, p2, nrow = 1, rel_widths = c(2, 1)) %>%
ggsave(filename = "Fig4a.pdf", width = 9, height = 6, units = "cm")
```

# fig 4b heatmap
```{r}
object = cca.kept$aDNT
colnames(cc.marker$aDNT)
df = cc.marker$aDNT %>%
dplyr::filter((pct.1 > .5 & cluster == 4 & avg_logFC > 1) | !(cluster == 4) & p_val_adj < .05) %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T)
tail(df)
table(df$cluster)
##### write.csv #####
write.csv(df, file = "TableS7.aDNT.subgroup.markers.csv", row.names = F)
##### plots #####
heatmap.f(object = object,
features = df$gene,
type = df$cluster,
xlab = paste0("aDNT cells (", prettyNum(ncol(object), big.mark = ","),")"),
ylab = paste0("Subgroup markers of aDNT (", prettyNum(nrow(df), big.mark = ",") ,")"),
tag = "b",
pal = c(brewer.pal(3, "Paired")[1:2],
brewer.pal(3, "Paired")[1:2]),
axis.text.y = element_blank()
) %>%
ggsave(filename = "Fig4b.pdf", plot = ., units = "cm", width = 8, height = 7)
```
# fig 4c dotplot
```{r}
##### set the parameters #####
df = cc.marker$aDNT %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T) %>%
dplyr::mutate(type = paste0("aDNT", cluster)) %>%
dplyr::left_join(y = cc.marker.48nk %>%
dplyr::filter(type %in% paste0("aDNT", 0:4)),
by = c("gene", "type"),
suffix = c("", ".48nk")) %>%
dplyr::mutate(type = "Higher than\nCD4 CD8 NK")
df$type[is.na(df$avg_logFC.48nk)] = "Not higher than\nCD4 CD8 NK"
head(df[,c("gene", "type", "avg_logFC", "cluster")])
##### annotate #####
ensembl = biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")
mm_go_anno = biomaRt::getBM(
attributes = c("external_gene_name", "description", "go_id", "name_1006",
"namespace_1003", "definition_1006"),
filters = c("external_gene_name"),
values = unique(as.character(df$gene)),
mart = ensembl) %>%
dplyr::filter(namespace_1003 %in% c("cellular_component", "molecular_function"))
cs = sort(unique(mm_go_anno$external_gene_name[grep("component of membrane",
mm_go_anno$name_1006)]))
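# the pattern is built with paste() so that no line break ends up inside the regular expression
cs.neg = sort(unique(mm_go_anno$external_gene_name[grep(paste(
"mitochondrial", "integral component of Golgi membrane",
"intracellular membrane-bounded organelle", "nuclear membrane", sep = "|"),
mm_go_anno$name_1006)]))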
cs = cs[!(cs %in% cs.neg)]
secreted = sort(unique(mm_go_anno$external_gene_name[
c(grep("extracellular space", mm_go_anno$name_1006),
grep("granzyme", mm_go_anno$description))]))
tf = sort(unique(mm_go_anno$external_gene_name[
grep(paste(
"transcription factor activity", "transcription activator activity",
"transcription repressor activity", "transcription coactivator activity",
"DNA binding", "DNA-binding", sep = "|"),
mm_go_anno$name_1006)]))
intersect(cs, secreted)
intersect(cs, tf)
intersect(secreted, tf)
cs = cs[!(cs %in% tf)]
secreted = secreted[(!(secreted %in% cs)) & (!(secreted %in% tf))]
ng = sort(unique(df$gene[!(df$gene %in% c(cs, secreted, tf))]))
anno.df = data.frame(
gene = c(cs, secreted, tf, ng),
anno = c(rep("cell surface", length(cs)),
rep("secreted", length(secreted)),
rep("transcription factor", length(tf)),
rep("", length(ng)))) %>%
dplyr::filter(!(gene %in% c("Actn1", "Actn2")))
df.plot =
dplyr::left_join(df, anno.df, by = c("gene")) %>%
dplyr::select(gene, cluster, type, anno, avg_logFC) %>%
dplyr::mutate(cluster = paste0("aDNT", cluster))
df.plot$anno = factor(
df.plot$anno,
levels = c("cell surface", "secreted", "transcription factor", ""))
##### object #####
object = harmony.dnt.nk
object@meta.data = meta
object = subset(object, cells = colnames(object)[object$celltype %in% c("CD4", "CD8", "NK",
"aDNT")])
table(object$cluster)
object$cluster = factor(object$cluster, levels = c(paste("aDNT",0:1,sep = ""), "CD4", "CD8", "NK"))
##### plot #####
p = mapply(function(i, j){
x = df.plot %>%
dplyr::filter(anno == i) %>%
dplyr::group_by(cluster) %>%
dplyr::top_n(10, avg_logFC) %>%
dplyr::group_by(cluster, type) %>%
dplyr::arrange(gene, .by_group = T)
head(x)
p = dotplot.f(object, features = x$gene, group.by = "cluster", facet = x$type,
axis.text.x = element_text(angle = 60, hjust = 1, vjust = 1)) +
labs(x = "", title = j, y = "Subgroup marker genes")
return(p)
},
i = c("cell surface", "secreted", "transcription factor", ""),
j = c("Cell\nsurface","Secreted","Transcription\nfactor","Remaining\ngenes"),
SIMPLIFY = F,
USE.NAMES = F)
p[[1]] = p[[1]] + NoLegend()
p[[2]] = p[[2]] + theme(axis.title.y = element_blank()) + NoLegend()
p[[3]] = p[[3]] + theme(axis.title.y = element_blank())
p[[4]] = p[[4]] + theme(axis.title.y = element_blank()) + NoLegend()
##### change the strip color #####
for (i in 1:4) {
pal = c(brewer.pal(3, "Pastel1")[1:2])
g = ggplotGrob(p[[i]])
strips = grep("strip-", g$layout$name)
for (x in seq_along(strips)) {
k = which(grepl("rect", g$grobs[[strips[[x]]]]$grobs[[1]]$childrenOrder))
g$grobs[[strips[[x]]]]$grobs[[1]]$children[[k]]$gp$fill = pal[x]
}
p[[i]] = g
}
##### save plots #####
plot_grid(plotlist = p, nrow = 1, labels = "c", label_size = 8, rel_widths = c(1,.9,.85,1.1)) %>%
ggsave(filename = "Fig4c.pdf", plot = ., units = "cm", width = 16, height = 8)
```
# fig 4d go
```{r}
##### set genelist #####
df = cc.marker$aDNT %>%
dplyr::select(gene, cluster) %>%
dplyr::mutate(cluster = paste0("aDNT",cluster))
head(df)
combinations = mapply(function(i){df$gene[df$cluster == i]}, i = paste0("aDNT", 0:1), SIMPLIFY =
F) %>%
lapply(., function(i){
AnnotationDbi::mapIds(x = org.Mm.eg.db, keys = i, column = "ENTREZID", keytype = "SYMBOL")
})
names(combinations)
### go #####
go = clusterProfiler::compareCluster(geneClusters = combinations, # list names needed
fun = "enrichGO",
ont = "ALL",
OrgDb = org.Mm.eg.db,
readable = T,
pool = T)
head(go@compareClusterResult)
### kegg #####
kegg = clusterProfiler::compareCluster(geneClusters = combinations,
fun = "enrichKEGG",
keyType = "ncbi-geneid",
organism = "mmu",
use_internal_data = T)
kegg@readable = F
kegg = setReadable(x = kegg, OrgDb = "org.Mm.eg.db", keyType = "ENTREZID")
kegg@compareClusterResult$ONTOLOGY = "KEGG"
head(kegg@compareClusterResult)
##### plot #####
lapply(c(go, kegg), function(i){
enrichplot::dotplot(object = i, showCategory = 5, font.size = 8) +
facet_grid(ONTOLOGY ~ ., scales = "free", space = "free") +
scale_size_area(max_size = 2.5) +
scale_color_distiller(palette = "Spectral", direction = 1, guide = guide_colorbar(
default.unit = "mm", barwidth = 2)) +
theme.text +
theme(axis.text.x = element_text(angle = 60, hjust = 1, vjust = 1))
}) %>%
plot_grid(plotlist = ., ncol = 1, labels = "d", label_size = 8) %>%
ggsave(filename = "Fig4d.pdf", plot = ., units = "cm", width = 9, height = 10)
```
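
`compareCluster()` takes a named list of gene vectors, one element per cluster (the "list names needed" comment above). A minimal base-R sketch of building such a list, assuming a toy table with gene symbols chosen only for illustration; ID conversion and the enrichment call itself are left out.

```r
# Hypothetical marker table (symbols chosen for illustration only).
toy = data.frame(
  gene = c("Gzmb", "Ifng", "Cd74", "Ccr7"),
  cluster = c("aDNT0", "aDNT0", "aDNT1", "aDNT1"),
  stringsAsFactors = FALSE
)

# split() returns one character vector per cluster, named by cluster.
gene.list = split(toy$gene, toy$cluster)
names(gene.list)
lengths(gene.list)
```

The `mapply()` call in the chunk above gets its names the same way, because the vector it iterates over is a character vector.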
```{r cluego}
##### read the data #####
df = read.table(file = "Fig4d.cluego.txt", header = T, sep = "\t") %>%
setNames(c("Term", "Source", "AdjustP", "AssociatedGenes", "Num.Genes", "Genes", "Cluster")) %>%
dplyr::group_by(Cluster, Source) %>%
dplyr::arrange(AdjustP, .by_group = T)
df$Term = factor(df$Term, levels = unique(df$Term))
df$Cluster = as.character(df$Cluster)
head(df)
##### plot #####
p = ggplot(df, aes(Cluster, Term, color = AdjustP, size = AssociatedGenes)) +
geom_point() +
facet_grid(rows = vars(Source), scales = "free", space = "free") +
scale_color_distiller(palette = "Spectral", direction = 1, guide = guide_colorbar(default.unit =
"cm", barwidth = .2, barheight = 1.5)) +
scale_size_area(max_size = 3, guide_legend(title = "Associated\nGenes (%)")) +
labs(x = "aDNT cluster") +
theme.text
p
##### change the strip color #####
g = ggplotGrob(p)
pal = brewer.pal(4, "Set3")
strips = grep("strip-", g$layout$name)
for (i in seq_along(strips)) {
k = which(grepl("rect",
g$grobs[[strips[i]]]$grobs[[1]]$childrenOrder))
g$grobs[[strips[i]]]$grobs[[1]]$children[[k]]$gp$fill = pal[i]
}
##### save plot #####
ggsave(filename = "Fig4d.cluego.pdf", plot = plot_grid(g), units = "cm", width = 10.5, height = 8)
```
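
The strip recolouring above edits the plot's gtable directly. A minimal sketch on a built-in dataset, assuming the strip grobs are laid out as in the ggplot2 version this notebook was written with; this relies on ggplot2 internals, so the indexing may need adjusting in other versions. `p.toy`, `g.toy` and `pal.toy` are hypothetical names.

```r
library(ggplot2)

# Toy faceted plot on mtcars, standing in for the enrichment plot.
p.toy = ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(rows = vars(cyl))

g.toy = ggplotGrob(p.toy)
pal.toy = c("#8DD3C7", "#FFFFB3", "#BEBADA")

# Strip grobs are found by layout name; the background rectangle is a child of each strip.
strips = grep("strip-", g.toy$layout$name)
for (i in seq_along(strips)) {
  k = which(grepl("rect", g.toy$grobs[[strips[i]]]$grobs[[1]]$childrenOrder))
  g.toy$grobs[[strips[i]]]$grobs[[1]]$children[[k]]$gp$fill = pal.toy[i]
}

# Draw the modified gtable.
grid::grid.newpage()
grid::grid.draw(g.toy)
```

Because the result is a gtable rather than a ggplot object, it is wrapped with `plot_grid(g)` before `ggsave()` in the chunk above.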

# fig S11 dimplot and featureplot

```{r dimplot}
object = cca.kept$aDNT
##### save pngs #####
mapply(function(ident, num){
(DimPlot(object, cols = "Paired", pt.size = .1, cells = colnames(object)[object$orig.ident ==
ident]) + theme_nothing()) %>% ggsave(filename = paste0("tmp",num,".png"), plot = ., units =
"cm", width = 6, height = 6, dpi = 1200)
}, ident = c("aDNT_rep1", "aDNT_rep2"), num = 1:2, SIMPLIFY = F)

##### annotate function #####
```

# fig S12 heatmap

```{r}
##### set the parameters #####
df = cc.marker$aDNT %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T) %>%
dplyr::mutate(type = paste0("aDNT", cluster)) %>%
dplyr::left_join(y = cc.marker.48nk %>%
dplyr::filter(type %in% paste0("aDNT", 0:1)),
by = c("gene", "type"),
suffix = c("", ".NK")) %>%
dplyr::mutate(type = "Higher than CD4 CD8 NK")
df$type[is.na(df$pct.2.NK)] = "Not higher than CD4 CD8 NK"
head(df[,c("gene", "type", "pct.2.NK")])
df = df[df$type == "Higher than CD4 CD8 NK",]
df$gene = factor(df$gene, levels = rev(sort(unique(as.character(df$gene)))))
table(df$type)
##### write csv #####
# write.csv(df, file = "TableS8.aDNT.subgroup.CD48NK.markers.csv", row.names = F)
##### object #####
object = harmony.dnt.nk
object = subset(object, cells = colnames(object)[object$celltype %in% c("CD4", "CD8", "NK",
"aDNT")])
object$id.dot = as.character(object$celltype)
object$id.dot[grep("aDNT", object$id.dot)] = paste("aDNT",
cca.kept$aDNT$integrated_snn_res.0.05, sep = "")
object$id.dot = factor(object$id.dot, levels = c(paste("aDNT",0:1,sep = ""), "CD4", "CD8", "NK"))
##### plot #####
heatmap.f(object = object,
features = df$gene,
type = df$cluster,
group.by = "id.dot",
angle.x = 90,
axis.text.y = element_text(face = "italic", hjust = 0, size = 5),
pal = c(brewer.pal(8, "Paired"),
brewer.pal(8, "Paired")[3:5]),
xlab = paste0(prettyNum(ncol(object), big.mark = ",")," cells"),
ylab = paste("Subgroup markers of aDNT higher than CD4 CD8 NK (", nrow(df), ")", sep = ""),
tag = "",
space = "free") %>%
ggsave(filename = "FigS12.pdf", plot = ., units = "cm", width = 8, height = 20)
```
---
title: "00.plot5"
output: html_notebook
editor_options:
  chunk_output_type: console
---

2020-11-22 edit

# library
```{r}
Sys.setenv(LANGUAGE = "en")
library(Seurat)
library(harmony)
library(MAST)
library(dplyr)
library(tidyselect)
library(RColorBrewer)
library(future)
library(ggplot2)
library(org.Mm.eg.db)
library(EnsDb.Mmusculus.v79)
library(cowplot)
library(data.table)
library(clusterProfiler)
library(DoubletDecon)
library(DoubletFinder)
options(warn = -1)
memory.limit(size = 64670)
```
# theme set
```{r}
theme.text = theme_cowplot(font_size = 8)
cols = c(brewer.pal(9, "Set1"),
brewer.pal(8, "Set2")[-8],
brewer.pal(12, "Set3")[-9],
brewer.pal(12, "Paired"),
brewer.pal(8, "Dark2"),
brewer.pal(11, "Spectral")[-6],
brewer.pal(11, "BrBG")[-6]
)
```
# load function
```{r}
source("../24.publish_least_library/function/heatmap.f.r")
source("../24.publish_least_library/function/dotplot.f.r")

```
# load data
```{r}
load("01.cca.kept.Rdata")

load("01.harmony.dnt.nk.Rdata")
load("01.cc.marker.DNT.Rdata")
load("01.cca.kept.cc.marker.DNT.48nk.Rdata")

```
# fig 5a dimplot
```{r}
##### set parameters #####
object = cca.kept$DNT
DefaultAssay(object) = "integrated"
object = FindClusters(object, resolution = .01)
object$group = paste0("DNT", object$integrated_snn_res.0.01)
##### save png #####
(DimPlot(object, group.by = "group", cols = "Set2", pt.size = .1) +
theme_nothing()) %>%
ggsave(filename = "tmp.png", plot = ., units = "cm", width = 6, height = 6, dpi = 1200)
##### plots #####
p1 = (DimPlot(object, group.by = "group", cols = "Set2", pt.size = NA) +
theme.text +
labs(color = paste0("nDNT & aDNT\ncells (", prettyNum(ncol(object), big.mark = ","),")"), tag =
"a") +
theme(aspect.ratio = 1) +
NoAxes() +
annotation_raster(png::readPNG("tmp.png"), -Inf, Inf, -Inf, Inf) +
annotation_custom(ggplotGrob(DimPlot(object, group.by = "group", pt.size = NA, label = T,
label.size = 2.5) +
theme_nothing())))
p1
#####
meta = mapply(function(object, celltype){
object@meta.data %>%
tibble::rownames_to_column("cells") %>%
dplyr::mutate(recluster = paste0(celltype, integrated_snn_res.0.05)) %>%
dplyr::select(cells, recluster)
}, object = cca.kept[1:2], SIMPLIFY = F, celltype = names(cca.kept)[1:2]) %>%
data.table::rbindlist()
object@meta.data = object@meta.data %>%
tibble::rownames_to_column("cells") %>%
dplyr::left_join(y = meta, by = "cells") %>%
tibble::column_to_rownames("cells")
object$celltype = sub("_rep[1-2]", "", object$orig.ident) %>%
factor(x = ., levels = c("nDNT", "aDNT"))
object$recluster = factor(object$recluster, levels = c(paste0("nDNT", 0:4),paste0("aDNT", 0:1)))
##### bar plot #####
df = prop.table(table(object$recluster, object$group), 1) %>%
round(., 2) %>%
data.frame()
p3 = ggplot(df, aes(Var1, Freq, fill = Var2)) +
geom_bar(stat = "identity", position = "stack", width = .6) +
scale_fill_brewer(palette = "Set2") +
labs(x = "", y = "Percentage", fill = "") +
theme.text +
theme(aspect.ratio = 1,
axis.text.x = element_text(angle = 60, hjust = 1, vjust = 1))
p3
##### save pngs #####
mapply(function(ident, num, color){
(DimPlot(object, pt.size = .1, group.by = "recluster",
cells = colnames(object)[object$celltype == ident]) +
scale_color_manual(values = brewer.pal(7, "Paired")[color]) +
theme_nothing()) %>%
ggsave(filename = paste0("tmp",num,".png"), plot = ., units = "cm", width = 6, height = 6, dpi =
1200)
}, ident = c("nDNT", "aDNT"), num = 1:2, color = list(1:5, 6:7), SIMPLIFY = F)
##### annotate function #####
annotation_custom2 <- function(grob, xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf, data){
layer(data = data, stat = StatIdentity, position = PositionIdentity,
geom = ggplot2:::GeomCustomAnn,
inherit.aes = TRUE, params = list(grob = grob,
xmin = xmin, xmax = xmax,
ymin = ymin, ymax = ymax))}
##### plots #####
p = (DimPlot(object, cols = "Paired", pt.size = NA, split.by = "celltype", group.by = "recluster"))
p2 = (p +
annotation_custom2(grid::rasterGrob(png::readPNG("tmp1.png")), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$celltype == "nDNT")[1],]) +
annotation_custom2(ggplotGrob(DimPlot(subset(object, cells = colnames(object)
[object$celltype == "nDNT"]), pt.size = NA, label = T, label.size = 2.5, group.by = "recluster") +
NoAxes() + NoLegend()), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$celltype == "nDNT")[1],]) +
annotation_custom2(grid::rasterGrob(png::readPNG("tmp2.png")), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$celltype == "aDNT")[1],]) +
annotation_custom2(ggplotGrob(DimPlot(subset(object, cells = colnames(object)
[object$celltype == "aDNT"]), pt.size = NA, label = T, label.size = 2.5, group.by = "recluster") +
NoAxes() + NoLegend()), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$celltype == "aDNT")[1],]) +
labs(color = paste0("nDNT & aDNT\ncells (", prettyNum(ncol(object), big.mark = ","),")")) +
theme.text +
NoAxes() +
theme(aspect.ratio = 1))
p2
##### save plots #####
plot_grid(plot_grid(p1, p3, nrow = 1, rel_widths = c(1.4, 1)), p2, ncol = 1) %>%
ggsave(filename = "Fig5a.pdf", plot = ., units = "cm", width = 10, height = 8)
#####
```
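
The stacked bar plot above is fed by a row-normalised contingency table. A minimal base-R sketch of that step, assuming made-up labels for eight cells (`cells` is hypothetical):

```r
# Hypothetical labels for eight cells (made up for illustration).
cells = data.frame(
  recluster = c("nDNT0", "nDNT0", "nDNT1", "nDNT1", "aDNT0", "aDNT0", "aDNT0", "aDNT1"),
  group = c("DNT0", "DNT0", "DNT1", "DNT0", "DNT0", "DNT1", "DNT1", "DNT1")
)

# margin = 1 normalises each row, so every recluster level sums to 1.
tab = prop.table(table(cells$recluster, cells$group), margin = 1)
rowSums(tab)

# data.frame() on a table gives long format: one row per (recluster, group) pair,
# with the default column names Var1, Var2 and Freq used by the ggplot call.
df.toy = data.frame(round(tab, 2))
colnames(df.toy)
```

Rounding before plotting means a bar can sum to slightly more or less than 1 (for example 0.33 + 0.67 is fine, but three thirds round to 0.99).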
# fig 5b heatmap
```{r}
##### set the conserved marker genes #####
table(object$celltype, object$group)
DefaultAssay(object) = "RNA"
table(Idents(object))
# m = FindAllMarkers(object, only.pos = T)
# head(m)
# df = FindConservedMarkers(object, ident.1 = 0, ident.2 = 1, grouping.var = "orig.ident") %>%
# dplyr::filter(max_pval < .05) %>%
# tibble::rownames_to_column("gene") %>%
# dplyr::left_join(y = m, by = "gene") %>%
# dplyr::arrange(desc(avg_logFC))
# colnames(df)
# cc.marker.DNT = df
# save(cc.marker.DNT, file = "01.cc.marker.DNT.Rdata")
#####
df = cc.marker.DNT
head(df)
write.csv(df, file = "TableS10.nDNT.aDNT.conserved.marker.csv", row.names = F)
table(object$group)
heatmap.f(object = object,
features = df$gene,
type = df$cluster,
group.by = "group",
xlab = paste0("nDNT & aDNT cells (", prettyNum(ncol(object), big.mark = ","),")"),
ylab = paste0("Conserved subgroup markers\nof nDNT & aDNT cells (",
prettyNum(nrow(df), big.mark = ",") ,")"),
tag = "b",
pal = c(brewer.pal(4, "Set2")[1:2],
brewer.pal(4, "Set2")[1:2]),
axis.text.y = element_blank(),
angle.x = 90
) %>%
ggsave(filename = "Fig5b.pdf", plot = ., units = "cm", width = 8, height = 7)
#####
```
# fig 5c dotplot
```{r}
object = harmony.dnt.nk
object@meta.data = meta
object = subset(object, cells = colnames(object)[object$celltype %in% c("CD4", "CD8", "NK",
"nDNT", "aDNT")])
table(object$recluster)
object$recluster = factor(object$recluster, levels = c(paste0("DNT", 0:1), "CD4", "CD8", "NK"))
```
```{r}
##### set the parameters #####
df = cc.marker$DNT %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T) %>%
dplyr::mutate(type = paste0("DNT", cluster)) %>%
dplyr::left_join(y = cc.marker.48nk %>%
dplyr::filter(type %in% paste0("DNT", 0:1)),
by = c("gene", "type"),
suffix = c("", ".48nk")) %>%
dplyr::mutate(type = "Higher than\nCD4 CD8 NK")
df$type[is.na(df$avg_logFC.48nk)] = "Not higher than\nCD4 CD8 NK"
head(df[,c("gene", "type", "avg_logFC", "cluster")])
##### annotate #####
# ensembl = biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")
mm_go_anno = biomaRt::getBM(
attributes = c("external_gene_name", "description", "go_id", "name_1006",
"namespace_1003", "definition_1006"),
filters = c("external_gene_name"),
values = unique(as.character(df$gene)),
mart = ensembl) %>%
dplyr::filter(namespace_1003 %in% c("cellular_component", "molecular_function"))
cs = sort(unique(mm_go_anno$external_gene_name[grep("component of membrane",
mm_go_anno$name_1006)]))
cs.neg = sort(unique(mm_go_anno$external_gene_name[grep("mitochondrial|integral component of Golgi membrane|intracellular membrane-bounded organelle|nuclear membrane",
mm_go_anno$name_1006)]))
cs = cs[!(cs %in% cs.neg)]
secreted = sort(unique(mm_go_anno$external_gene_name[
c(grep("extracellular space", mm_go_anno$name_1006),
grep("granzyme", mm_go_anno$description))]))
# pattern kept on one line: a line break inside the string would become part of the regex
tf = sort(unique(mm_go_anno$external_gene_name[
grep("transcription factor activity|transcription activator activity|transcription repressor activity|transcription coactivator activity|DNA binding|DNA-binding",
mm_go_anno$name_1006)]))
intersect(cs, secreted)
intersect(cs, tf)
intersect(secreted, tf)
cs = cs[!(cs %in% tf)]
secreted = secreted[(!(secreted %in% cs)) & (!(secreted %in% tf))]
ng = sort(unique(df$gene[!(df$gene %in% c(cs, secreted, tf))]))
anno.df = data.frame(
gene = c(cs, secreted, tf, ng),
anno = c(rep("cell surface", length(cs)),
rep("secreted", length(secreted)),
rep("transcription factor", length(tf)),
rep("", length(ng)))) %>%
dplyr::filter(!(gene %in% c("Actn1", "Actn2")))
df.plot =
dplyr::left_join(df, anno.df, by = c("gene")) %>%
dplyr::select(gene, cluster, type, anno, avg_logFC) %>%
dplyr::mutate(cluster = paste0("DNT", cluster))
df.plot$anno = factor(
df.plot$anno,
levels = c("cell surface", "secreted", "transcription factor", ""))
head(df.plot)
##### plot #####
p = mapply(function(i, j){
x = df.plot %>%
dplyr::filter(anno == i) %>%
dplyr::group_by(cluster) %>%
dplyr::top_n(10, avg_logFC) %>%
dplyr::group_by(cluster, type) %>%
dplyr::arrange(gene, .by_group = T)
head(x)
p = dotplot.f(object, features = x$gene, group.by = "recluster", facet = x$type,
axis.text.x = element_text(angle = 60, hjust = 1, vjust = 1)) +
labs(x = "", title = j, y = "Conserved subgroup markers\nof nDNT & aDNT cells")
return(p)
},
i = c("cell surface", "secreted", "transcription factor", ""),
j = c("Cell\nsurface","Secreted","Transcription\nfactor","Remaining\ngenes"),
SIMPLIFY = F,
USE.NAMES = F)
p[[1]] = p[[1]] + NoLegend()
p[[2]] = p[[2]] + theme(axis.title.y = element_blank()) + NoLegend()
p[[3]] = p[[3]] + theme(axis.title.y = element_blank())
p[[4]] = p[[4]] + theme(axis.title.y = element_blank()) + NoLegend()
##### change the strip color #####
for (i in 1:4) {
pal = c(brewer.pal(3, "Pastel1")[1:2])
g = ggplotGrob(p[[i]])
strips = grep("strip-", g$layout$name)
for (x in seq_along(strips)) {
k = which(grepl("rect", g$grobs[[strips[[x]]]]$grobs[[1]]$childrenOrder))
g$grobs[[strips[[x]]]]$grobs[[1]]$children[[k]]$gp$fill = pal[x]
}
p[[i]] = g
}
##### save plots #####
plot_grid(plotlist = p, nrow = 1, labels = "c", label_size = 8, rel_widths = c(1.2,.85,.85,1.1)) %>%
ggsave(filename = "Fig5c.pdf", plot = ., units = "cm", width = 15, height = 7)
```
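
The annotation step above assigns each gene to exactly one category, with transcription factor taking precedence over cell surface, and cell surface over secreted. A minimal base-R sketch of that precedence, assuming made-up gene sets in place of the GO keyword searches (all names are hypothetical):

```r
# Hypothetical gene sets, as if returned by the GO keyword searches (made up).
cs = c("A", "B", "C")        # membrane component
secreted = c("B", "D", "E")  # extracellular space
tf = c("C", "E", "F")        # transcription factor terms
all.genes = c("A", "B", "C", "D", "E", "F", "G")

# Same order of operations as in the chunk above.
cs = cs[!(cs %in% tf)]
secreted = secreted[(!(secreted %in% cs)) & (!(secreted %in% tf))]
ng = sort(unique(all.genes[!(all.genes %in% c(cs, secreted, tf))]))

anno.toy = data.frame(
  gene = c(cs, secreted, tf, ng),
  anno = c(rep("cell surface", length(cs)),
           rep("secreted", length(secreted)),
           rep("transcription factor", length(tf)),
           rep("", length(ng)))
)
anno.toy
```

Since the sets are made disjoint before the table is built, the later `left_join()` by gene cannot duplicate rows.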
# fig 5d go
```{r cluego}
##### read the data #####
df = read.table(file = "Fig5d.cluego.txt", header = T, sep = "\t") %>%
setNames(c("Term", "Source", "AdjustP", "AssociatedGenes", "Num.Genes", "Genes", "Cluster")) %>%
dplyr::group_by(Cluster, Source) %>%
dplyr::arrange(AdjustP, .by_group = T)
df$Term = factor(df$Term, levels = unique(df$Term))
df$Cluster = as.character(df$Cluster)
head(df)
##### plot #####
p = ggplot(df, aes(Cluster, Term, color = AdjustP, size = AssociatedGenes)) +
geom_point() +
facet_grid(rows = vars(Source), scales = "free", space = "free") +
scale_color_distiller(palette = "Spectral", direction = 1, guide = guide_colorbar(default.unit =
"cm", barwidth = .2, barheight = 1.5)) +
scale_size_area(max_size = 3, guide_legend(title = "Associated\nGenes (%)")) +
labs(x = "DNT cluster", tag = "d") +
theme.text
p
##### change the strip color #####
g = ggplotGrob(p)
pal = brewer.pal(4, "Set3")
strips = grep("strip-", g$layout$name)
for (i in seq_along(strips)) {
k = which(grepl("rect",
g$grobs[[strips[i]]]$grobs[[1]]$childrenOrder))
g$grobs[[strips[i]]]$grobs[[1]]$children[[k]]$gp$fill = pal[i]
}
##### save plot #####
ggsave(filename = "Fig5d.cluego.pdf", plot = plot_grid(g), units = "cm", width = 10, height = 5.5)
```
# fig S13 dimplot
```{r}
##### set parameters #####
object = cca.kept$DNT
meta = mapply(function(object, celltype){
object@meta.data %>%
tibble::rownames_to_column("cells") %>%
dplyr::mutate(recluster = paste0(celltype, integrated_snn_res.0.05)) %>%
dplyr::select(cells, recluster)
}, object = cca.kept[1:2], SIMPLIFY = F, celltype = names(cca.kept)[1:2]) %>%
data.table::rbindlist()
object@meta.data = object@meta.data %>%
tibble::rownames_to_column("cells") %>%
dplyr::left_join(y = meta, by = "cells") %>%
tibble::column_to_rownames("cells")
object$celltype = sub("_rep[1-2]", "", object$orig.ident) %>%
factor(x = ., levels = c("nDNT", "aDNT"))
object$recluster = factor(object$recluster, levels = c(paste0("nDNT", 0:4),paste0("aDNT", 0:1)))
table(object$recluster)
##### save pngs #####
mapply(function(ident, num, color){
(DimPlot(object, pt.size = .1, group.by = "recluster",
cells = colnames(object)[object$orig.ident == ident]) +
scale_color_manual(values = brewer.pal(7, "Paired")[color]) +
theme_nothing()) %>%
ggsave(filename = paste0("tmp",num,".png"), plot = ., units = "cm", width = 6, height = 6, dpi =
1200)
},
ident = c("nDNT_rep1", "nDNT_rep2", "aDNT_rep1", "aDNT_rep2"),
num = 1:4,
color = list(1:5, 1:5, 6:7, 6:7),
SIMPLIFY = F)
##### annotate function #####
annotation_custom2 <- function(grob, xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf, data){
layer(data = data, stat = StatIdentity, position = PositionIdentity,
geom = ggplot2:::GeomCustomAnn,
inherit.aes = TRUE, params = list(grob = grob,
xmin = xmin, xmax = xmax,
ymin = ymin, ymax = ymax))}
##### plots #####
object$orig.ident = factor(object$orig.ident, levels = c("nDNT_rep1", "nDNT_rep2", "aDNT_rep1",
"aDNT_rep2"))
p = (DimPlot(object, cols = "Paired", pt.size = NA, split.by = "orig.ident", group.by = "recluster",
ncol = 2))
(p +
annotation_custom2(grid::rasterGrob(png::readPNG("tmp1.png")), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "nDNT_rep1")[1],]) +
annotation_custom2(ggplotGrob(DimPlot(subset(object, cells = colnames(object)
[object$orig.ident == "nDNT_rep1"]), pt.size = NA, label = T, label.size = 2.5, group.by =
"recluster") + NoAxes() + NoLegend()), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "nDNT_rep1")[1],]) +
annotation_custom2(grid::rasterGrob(png::readPNG("tmp2.png")), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "nDNT_rep2")[1],]) +
annotation_custom2(ggplotGrob(DimPlot(subset(object, cells = colnames(object)
[object$orig.ident == "nDNT_rep2"]), pt.size = NA, label = T, label.size = 2.5, group.by =
"recluster") + NoAxes() + NoLegend()), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "nDNT_rep2")[1],]) +
annotation_custom2(grid::rasterGrob(png::readPNG("tmp3.png")), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "aDNT_rep1")[1],]) +
annotation_custom2(ggplotGrob(DimPlot(subset(object, cells = colnames(object)
[object$orig.ident == "aDNT_rep1"]), pt.size = NA, label = T, label.size = 2.5, group.by =
"recluster") + NoAxes() + NoLegend()), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "aDNT_rep1")[1],]) +
annotation_custom2(grid::rasterGrob(png::readPNG("tmp4.png")), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "aDNT_rep2")[1],]) +
annotation_custom2(ggplotGrob(DimPlot(subset(object, cells = colnames(object)
[object$orig.ident == "aDNT_rep2"]), pt.size = NA, label = T, label.size = 2.5, group.by =
"recluster") + NoAxes() + NoLegend()), -Inf, Inf, -Inf, Inf,
data = p$data[which(p$data$orig.ident == "aDNT_rep2")[1],]) +
labs(color = paste0("nDNT & aDNT\ncells (", prettyNum(ncol(object), big.mark = ","),")")) +
theme.text +
NoAxes() +
theme(aspect.ratio = 1)) %>%
ggsave(filename = "FigS13.pdf", plot = ., units = "cm", width = 10, height = 8)
#####
```
# fig S14 heatmap
```{r}
##### object #####
object = cca.kept$DNT
DefaultAssay(object) = "integrated"
object = FindClusters(object, resolution = .01)
df = object@meta.data %>%
tibble::rownames_to_column("cells") %>%
dplyr::select(cells, integrated_snn_res.0.01)
head(df)
object = harmony.dnt.nk
object = subset(object, cells = colnames(object)[object$celltype %in% c("CD4", "CD8", "NK",
"nDNT", "aDNT")])
meta = object@meta.data %>%
tibble::rownames_to_column("cells") %>%
dplyr::left_join(y = df, by = "cells") %>%
tibble::column_to_rownames("cells")
head(meta)
meta$cluster = paste0(sub("a|n", "", meta$celltype), meta$integrated_snn_res.0.01) %>%
sub("NA", "", .) %>%
factor(., levels = c(paste0("DNT", 0:1), "CD4", "CD8", "NK"))
table(meta$cluster)
object@meta.data = meta
##### set the parameters #####
df = cc.marker.DNT %>%
dplyr::group_by(cluster) %>%
dplyr::arrange(desc(avg_logFC), .by_group = T) %>%
dplyr::mutate(type = paste0("DNT", cluster)) %>%
dplyr::left_join(y = cc.marker.DNT.48nk %>%
dplyr::filter(type %in% paste0("DNT", 0:1)),
by = c("gene", "type"),
suffix = c("", ".NK")) %>%
dplyr::mutate(type = "Higher than CD4 CD8 NK")
df$type[is.na(df$pct.1.NK)] = "Not higher than CD4 CD8 NK"
head(df[,c("gene", "type", "avg_logFC")])
df = df[df$type == "Higher than CD4 CD8 NK",]
df$gene = factor(df$gene, levels = rev(sort(unique(as.character(df$gene)))))
##### write csv #####
write.csv(df, file = "TableS11.nDNT.aDNT.conserved.marker.vs.CD48NK.csv", row.names = F)
##### plot #####
heatmap.f(object = object,
features = df$gene,
type = df$cluster,
group.by = "cluster",
angle.x = 90,
axis.text.y = element_text(face = "italic", hjust = 0, size = 7),
pal = c(brewer.pal(5, "Paired"),
brewer.pal(5, "Paired")[1:2]),
xlab = paste0(prettyNum(ncol(object), big.mark = ",")," cells"),
ylab = paste("Conserved subgroup markers of nDNT & aDNT\nhigher than CD4 CD8 NK (",
nrow(df), ")", sep = ""),
tag = "",
space = "free") %>%
ggsave(filename = "FigS14.pdf", plot = ., units = "cm", width = 8, height = 12)
```
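
The "Higher than CD4 CD8 NK" label above comes from a left join: a marker with no match in the comparison table gets `NA` in the suffixed columns. A minimal sketch with dplyr, assuming two made-up tables (`markers` and `vs.ref` are hypothetical):

```r
library(dplyr)

# Hypothetical subgroup markers and the subset also found against a reference (values made up).
markers = data.frame(gene = c("g1", "g2", "g3"), type = "DNT0", avg_logFC = c(1.0, 0.8, 0.5))
vs.ref = data.frame(gene = c("g1", "g3"), type = "DNT0", avg_logFC = c(0.7, 0.4))

# suffix = c("", ".NK") keeps the left-hand column names and tags the right-hand ones.
flagged = dplyr::left_join(markers, vs.ref, by = c("gene", "type"), suffix = c("", ".NK"))

# Unmatched genes have NA in the .NK column.
flagged$label = ifelse(is.na(flagged$avg_logFC.NK),
                       "Not higher than CD4 CD8 NK",
                       "Higher than CD4 CD8 NK")
flagged
```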

---
title: "00.plot6"
output: html_notebook
editor_options:
  chunk_output_type: console
---

2020-11-23 edit

# library
```{r}
Sys.setenv(LANGUAGE = "en")
library(Seurat)
library(harmony)
library(MAST)
library(dplyr)
library(tidyselect)
library(RColorBrewer)
library(future)
library(ggplot2)
library(org.Mm.eg.db)
library(EnsDb.Mmusculus.v79)
library(cowplot)
library(data.table)
library(clusterProfiler)
library(DoubletDecon)
library(DoubletFinder)
options(warn = -1)
memory.limit(size = 64670)
```
# theme set
```{r}
theme_set(theme_cowplot(font_size = 8))
theme.text = theme_cowplot(font_size = 8)
cols = c(brewer.pal(9, "Set1"),
brewer.pal(8, "Set2")[-8],
brewer.pal(12, "Set3")[-9],
brewer.pal(12, "Paired"),
brewer.pal(8, "Dark2"),
brewer.pal(11, "Spectral")[-6],
brewer.pal(11, "BrBG")[-6]
)
```
# load function
```{r}
source("../24.publish_least_library/function/heatmap.f.r")
source("../24.publish_least_library/function/dotplot.f.r")

```
# load data
```{r}
load("01.cca.kept.Rdata")

# load("01.harmony.dnt.nk.Rdata")
# load("01.cc.marker.DNT.Rdata")
# load("01.cca.kept.cc.marker.DNT.48nk.Rdata")

load("01.trans.marker.Rdata")
ensembl = biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")

```
# fig 6a scatter plot plus euler venn
```{r euler venn}
##### venn1 #####
combinations = list("DNT0" = trans.marker$DNT0$gene[trans.marker$DNT0$trans == "up"],
"DNT1" = trans.marker$DNT1$gene[trans.marker$DNT1$trans == "up"])
venn1 = plot(eulerr::euler(
combinations = combinations),
fills = list(fill = brewer.pal(6, "Set2")[c(6,4,3)], alpha = .6),
quantities = list(fontsize = 8, font = 1),
labels = list(fontsize = 8, font = 1,
labels = c(paste0("DNT0\n(", length(combinations$DNT0), ")"),
paste0("DNT1\n(", length(combinations$DNT1), ")"))),
main = list(labels = paste0("Subgroup upregulated genes\nafter activation (",
length(unique(c(combinations$DNT0, combinations$DNT1))), ")"),
fontsize = 6, font = 1),
adjust_labels = T)
##### venn2 #####
combinations = list("DNT0" = trans.marker$DNT0$gene[trans.marker$DNT0$trans == "down"],
"DNT1" = trans.marker$DNT1$gene[trans.marker$DNT1$trans == "down"])
venn2 = plot(eulerr::euler(
combinations = combinations),
fills = list(fill = brewer.pal(6, "Set2")[c(2,5,1)], alpha = .6),
quantities = list(fontsize = 8, font = 1),
labels = list(fontsize = 8, font = 1,
labels = c(paste0("DNT0\n(", length(combinations$DNT0), ")"),
paste0("DNT1\n(", length(combinations$DNT1), ")"))),
main = list(labels = paste0("Subgroup downregulated genes\nafter activation (",
length(unique(c(combinations$DNT0, combinations$DNT1))), ")"),
fontsize = 6, font = 1),
adjust_labels = T)
#####
plot_grid(venn1, venn2, nrow = 1) %>%
ggsave(filename = "Fig6a.s.pdf", plot = ., units = "cm", width = 6, height = 4)
#####
```
```{r scatter plot}
##### set data #####
df = dplyr::full_join(trans.marker$DNT0, trans.marker$DNT1, by = c("gene"), suffix = c(".DNT0",
".DNT1")) %>%
dplyr::rowwise() %>%
dplyr::mutate(avg_logFC.DNT0 = mean(c(avg_logFC.rep1.DNT0, avg_logFC.rep2.DNT0)),
avg_logFC.DNT1 = mean(c(avg_logFC.rep1.DNT1, avg_logFC.rep2.DNT1)),
type.DNT0 = gsub("_rep1","0",type.rep1.DNT0),
type.DNT1 = gsub("_rep1","1",type.rep1.DNT1)) %>%
dplyr::select(gene, avg_logFC.DNT0, avg_logFC.DNT1, type.DNT0, type.DNT1, contains("trans"))
head(df)
df$avg_logFC.DNT0[is.na(df$avg_logFC.DNT0)] = 0
df$avg_logFC.DNT1[is.na(df$avg_logFC.DNT1)] = 0
df = dplyr::mutate(df, type = paste(type.DNT0, trans.DNT0, type.DNT1, trans.DNT1, sep = ".")) %>%
dplyr::select(gene, contains("avg"), type)
df$type = factor(df$type,
labels = c("aDNT0.vs.nDNT0 down\naDNT1.vs.nDNT1 down (81)",
"aDNT0.vs.nDNT0 down (114)",
"aDNT0.vs.nDNT0 up\naDNT1.vs.nDNT1 up (234)",
"aDNT0.vs.nDNT0 up (237)",
"aDNT1.vs.nDNT1 down (40)",
"aDNT1.vs.nDNT1 up (99)"))
table(df$type)
head(df)
write.csv(df, file = "TableS13.up-down-regulated.marker.csv", row.names = F)
##### plot #####
scatterplot = (ggplot(df, aes(avg_logFC.DNT0, avg_logFC.DNT1, color = type)) +
geom_point(alpha = .4, size = .5) +
geom_hline(yintercept = c(1, -1), color = "red", linetype = "dashed", size = .1) +
geom_vline(xintercept = c(1, -1), color = "red", linetype = "dashed", size = .1) +
ggrepel::geom_label_repel(data = df %>% dplyr::filter(abs(avg_logFC.DNT0) >= 1 |
abs(avg_logFC.DNT1) >= 1),
aes(avg_logFC.DNT0, avg_logFC.DNT1, label = gene), alpha = 1, size = 2, fontface =
"italic", segment.size = .1, min.segment.length = 0, seed = 1) +
scale_color_brewer(palette = "Set2") +
labs(x = "avg_logFC (aDNT0.vs.nDNT0)", y = "avg_logFC (aDNT1.vs.nDNT1)", color = "", tag = "a")
+
theme(aspect.ratio = 1,
legend.position = c(.65, 0.15)))
plot_grid(venn2, scatterplot, venn1, nrow = 1, rel_widths = c(.2, 1, .2)) %>%
ggsave(filename = "Fig6a.pdf", plot = ., units = "cm", width = 12*1.4, height = 12)
```
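
The scatter plot data above is a full join of the two subgroup tables, with genes missing from one subgroup placed at 0 on that axis. A minimal sketch with dplyr, assuming two made-up result tables (`dnt0` and `dnt1` are hypothetical and much simpler than `trans.marker`):

```r
library(dplyr)

# Hypothetical per-subgroup results (values made up).
dnt0 = data.frame(gene = c("g1", "g2"), avg_logFC = c(1.2, -0.8), trans = c("up", "down"))
dnt1 = data.frame(gene = c("g2", "g3"), avg_logFC = c(-1.1, 0.6), trans = c("down", "up"))

# Shared column names other than the key receive the suffixes.
joined = dplyr::full_join(dnt0, dnt1, by = "gene", suffix = c(".DNT0", ".DNT1"))

# Genes reported in only one subgroup get NA in the other; they are plotted at 0.
joined$avg_logFC.DNT0[is.na(joined$avg_logFC.DNT0)] = 0
joined$avg_logFC.DNT1[is.na(joined$avg_logFC.DNT1)] = 0

# Combined label; paste() turns a missing direction into the string "NA".
joined$type = paste(joined$trans.DNT0, joined$trans.DNT1, sep = ".")
joined
```

Because `paste()` writes missing values as the text "NA", the factor levels built from such labels sort alphabetically with "NA" included, which is worth checking against the hand-written `labels` vector.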
# fig 6b dotplot
```{r}
##### set data #####
up = df[grep("up", df$type),]
up$type = factor(up$type, levels = sort(unique(up$type))[c(2,1,3)]) %>%
gsub(" up","", .) %>%
gsub(" \\(.*\\)", "", .)
table(up$type)
##### annotate #####
mm_go_anno = biomaRt::getBM(
attributes = c("external_gene_name", "description", "go_id", "name_1006",
"namespace_1003", "definition_1006"),
filters = c("external_gene_name"),
values = unique(as.character(up$gene)),
mart = ensembl) %>%
dplyr::filter(namespace_1003 %in% c("cellular_component", "molecular_function"))
cs = sort(unique(mm_go_anno$external_gene_name[grep("component of membrane",
mm_go_anno$name_1006)]))
cs.neg = sort(unique(mm_go_anno$external_gene_name[grep("mitochondrial|integral component of Golgi membrane|intracellular membrane-bounded organelle|nuclear membrane",
mm_go_anno$name_1006)]))
cs = cs[!(cs %in% cs.neg)]
secreted = sort(unique(mm_go_anno$external_gene_name[
c(grep("extracellular space", mm_go_anno$name_1006),
grep("granzyme", mm_go_anno$description))]))
# pattern kept on one line: a line break inside the string would become part of the regex
tf = sort(unique(mm_go_anno$external_gene_name[
grep("transcription factor activity|transcription activator activity|transcription repressor activity|transcription coactivator activity|DNA binding|DNA-binding",
mm_go_anno$name_1006)]))
intersect(cs, secreted)
intersect(cs, tf)
intersect(secreted, tf)
cs = cs[!(cs %in% tf)]
secreted = secreted[(!(secreted %in% cs)) & (!(secreted %in% tf))]
ng = sort(unique(df$gene[!(df$gene %in% c(cs, secreted, tf))]))
anno.df = data.frame(
gene = c(cs, secreted, tf, ng),
anno = c(rep("cell surface", length(cs)),
rep("secreted", length(secreted)),
rep("transcription factor", length(tf)),
rep("", length(ng)))) %>%
dplyr::filter(!(gene %in% c("Actn1", "Actn2")))
df.plot =
dplyr::left_join(up, anno.df, by = c("gene"))
head(df.plot)
df.plot$anno = factor(
df.plot$anno,
levels = c("cell surface", "secreted", "transcription factor", ""))
head(df.plot)
table(df.plot$anno)
##### object #####
object = cca.kept$DNT
object$cluster = paste0(sub("_rep[1|2]", "", object$orig.ident), object$integrated_snn_res.0.01) %>%
factor(., levels = paste0(c("nDNT","aDNT"), c(0,0,1,1)))
table(object$cluster)
##### plot #####
p = mapply(function(i, j){
x = df.plot %>%
dplyr::rowwise() %>%
dplyr::mutate(avg_logFC = mean(c(avg_logFC.DNT0, avg_logFC.DNT1))) %>%
dplyr::filter(anno == i) %>%
dplyr::ungroup() %>%
dplyr::group_by(type) %>%
dplyr::top_n(10, avg_logFC) %>%
dplyr::arrange(gene, .by_group = T)
head(x)
p = dotplot.f(object, features = x$gene, group.by = "cluster", facet = x$type,
axis.text.x = element_text(angle = 60, hjust = 1, vjust = 1)) +
labs(x = "", title = j, y = "Subgroup upregulated genes\nafter activation")
return(p)
},
i = c("cell surface", "secreted", "transcription factor", ""),
j = c("Cell\nsurface","Secreted","Transcription\nfactor","Remaining\ngenes"),
SIMPLIFY = F,
USE.NAMES = F)
p[[1]] = p[[1]] + NoLegend()
p[[2]] = p[[2]] + theme(axis.title.y = element_blank()) + NoLegend()
p[[3]] = p[[3]] + theme(axis.title.y = element_blank())
p[[4]] = p[[4]] + theme(axis.title.y = element_blank()) + NoLegend()
##### change the strip color #####
for (i in 1:4) {
pal = brewer.pal(6, "Set2")[c(6,3,4)]
g = ggplotGrob(p[[i]])
strips = grep("strip-", g$layout$name)
for (x in seq_along(strips)) {
k = which(grepl("rect", g$grobs[[strips[[x]]]]$grobs[[1]]$childrenOrder))
g$grobs[[strips[[x]]]]$grobs[[1]]$children[[k]]$gp$fill = pal[x]
}
p[[i]] = g
}
##### save plots #####
plot_grid(plotlist = p, nrow = 1, labels = "b", label_size = 8, rel_widths = c(1.2,1,1,1.1)) %>%
ggsave(filename = "Fig6b.pdf", plot = ., units = "cm", width = 15, height = 12)
```
# fig S14 go
```{r cluego}
##### read the data #####
df = read.table(file = "FigS14.cluego.txt", header = T, sep = "\t") %>%
setNames(c("Term", "Source", "AdjustP", "AssociatedGenes", "Num.Genes", "Genes", "Cluster"))
df$Cluster = factor(df$Cluster, levels = c("DNT0.up.only", "DNT0.DNT1.up", "DNT1.up.only",
"DNT0.down.only", "DNT0.DNT1.down", "DNT1.down.only"))
df$Source = factor(df$Source, levels = c("BP", "CC", "MF", "KEGG", "WikiPathways"))
df = df %>%
dplyr::group_by(Cluster, Source) %>%
dplyr::arrange(Term, .by_group = T) %>%
dplyr::mutate(Type = sub(".*(up|down).*", "\\1", Cluster))
df$Term = factor(df$Term, levels = unique(df$Term))
df$Type = factor(df$Type, levels = c("up", "down"))
head(df)
##### plot #####
p = ggplot(df, aes(Cluster, Term, color = AdjustP, size = AssociatedGenes)) +
geom_point() +
facet_grid(rows = vars(Source), cols = vars(Type), scales = "free", space = "free") +
scale_color_distiller(palette = "Spectral", direction = 1, guide = guide_colorbar(default.unit =
"cm", barwidth = .2, barheight = 1.5)) +
scale_size_area(max_size = 2.5, guide_legend(title = "Associated\nGenes (%)")) +
labs(x = "", tag = "c") +
theme.text +
theme(axis.text.x = element_text(angle = 60, hjust = 1, vjust = 1))
p
##### change the strip color #####
g = ggplotGrob(p)
pal = brewer.pal(11, "Set3")
strips = grep("strip-", g$layout$name)
for (i in seq_along(strips)) {
k = which(grepl("rect",
g$grobs[[strips[i]]]$grobs[[1]]$childrenOrder))
g$grobs[[strips[i]]]$grobs[[1]]$children[[k]]$gp$fill = pal[i]
}
##### save plot #####
ggsave(filename = "FigS14.cluego.pdf", plot = plot_grid(g), units = "cm", width = 10, height = 10)
```
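The `Type` column in the `cluego` chunk is derived from the cluster labels with a regular expression. A minimal sketch of that step in isolation is given here, using base R only. The helper name `cluster_direction` is hypothetical and not part of the original notebook. Labels that contain neither "up" nor "down" come back unchanged from `sub()` and therefore become `NA` in the factor.

```{r direction_sketch}
# Hypothetical helper (an illustration, not the notebook's own function):
# extract the regulation direction ("up" or "down") from a cluster label.
cluster_direction = function(cluster) {
  # as.character() makes the factor-to-string conversion explicit
  direction = sub(".*(up|down).*", "\\1", as.character(cluster))
  # labels without "up"/"down" are returned unchanged by sub() and become NA here
  factor(direction, levels = c("up", "down"))
}

clusters = c("DNT0.up.only", "DNT0.DNT1.up", "DNT1.up.only",
             "DNT0.down.only", "DNT0.DNT1.down", "DNT1.down.only")
directions = cluster_direction(clusters)
table(directions)
```

Checking `table(directions)` before plotting is a cheap way to confirm that every cluster ended up in one of the two facet columns.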
