14-Integration Default Lognorm Pipeline-22-02-2024
Integration
Basics
• Integration of single-cell sequencing datasets, for example across
experimental batches, donors, or conditions, is often an important
step in scRNA-seq workflows.
• Integrative analysis can help to match shared cell types and states
across datasets, which can boost statistical power, and most
importantly, facilitate accurate comparative analysis across datasets.
Integration goals
• We often refer to this procedure as integration/alignment. When aligning two genome sequences,
identifying shared/homologous regions also helps to interpret the differences between the
sequences.
• Similarly, for scRNA-seq integration our goal is not to remove biological differences across
conditions, but to learn shared cell types/states in an initial step, specifically because that will
enable us to compare stimulated and control profiles for these individual cell types.
• The Seurat integration procedure aims to return a single dimensional reduction that captures the
shared sources of variance across multiple layers, so that cells in a similar biological state will cluster together.
• The method returns a dimensional reduction (e.g. integrated.cca) which can be used for visualization
and unsupervised clustering analysis.
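The layers and the integrated.cca reduction mentioned above correspond to the layer-based workflow of Seurat v5. A minimal sketch of that workflow, assuming Seurat v5 is installed and that `obj` is one Seurat object holding all samples in a single joined RNA layer, with the sample label in `orig.ident`. The helper name `integrate_layers_sketch` and the choice of `dims = 1:30` are illustrative and not taken from the slides:

```r
library(Seurat)

# Hypothetical helper (not from the slides): layer-based CCA integration.
# Assumes `obj` contains cells from all samples and that `orig.ident`
# records the sample of origin for each cell.
integrate_layers_sketch <- function(obj, dims = 1:30) {
  # Split the RNA assay into one layer per sample
  obj[["RNA"]] <- split(obj[["RNA"]], f = obj$orig.ident)

  # Standard log-normalization workflow, applied per layer
  obj <- NormalizeData(obj)
  obj <- FindVariableFeatures(obj)
  obj <- ScaleData(obj)
  obj <- RunPCA(obj, verbose = FALSE)

  # CCA-based integration; the shared embedding is stored as "integrated.cca"
  obj <- IntegrateLayers(
    object = obj, method = CCAIntegration,
    orig.reduction = "pca", new.reduction = "integrated.cca",
    verbose = FALSE
  )

  # Re-join the per-sample layers once integration is done
  obj[["RNA"]] <- JoinLayers(obj[["RNA"]])

  # Cluster and embed using the integrated reduction
  obj <- FindNeighbors(obj, reduction = "integrated.cca", dims = dims)
  obj <- FindClusters(obj, resolution = 0.5)
  obj <- RunUMAP(obj, reduction = "integrated.cca", dims = dims)
  obj
}
```

The pipeline shown in the following slides uses the older anchor-based calls (FindIntegrationAnchors and IntegrateData) instead, which produce an integrated assay rather than an integrated reduction.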
Default pipeline
Perform integration
control data
2nd data
• h6.data <- Read10X(data.dir = "outs/filtered_feature_bc_matrix", gene.column = 1)
Stimulus data
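The pipeline below works on the objects h0.dataset and h6.dataset, but the slides do not show how they are built from the count matrices. A minimal sketch of one way to do this with log-normalization, assuming each matrix returned by Read10X contains gene expression only. The helper name `make_lognorm_object`, the project labels, and the name `h0.data` for the control matrix are assumptions for illustration:

```r
library(Seurat)

# Hypothetical helper (not from the slides): wrap a count matrix in a Seurat
# object and apply the default log-normalization steps.
make_lognorm_object <- function(counts, project) {
  # `project` becomes the orig.ident label of every cell in this object
  obj <- CreateSeuratObject(counts = counts, project = project)
  # Log-normalize: scale each cell to 10,000 counts, then log1p
  obj <- NormalizeData(obj, normalization.method = "LogNormalize",
                       scale.factor = 10000)
  # Per-dataset variable features, later used to pick integration features
  obj <- FindVariableFeatures(obj, selection.method = "vst", nfeatures = 2000)
  obj
}

# Usage (h0.data is assumed to be loaded the same way as h6.data):
# h0.dataset <- make_lognorm_object(h0.data, project = "h0")
# h6.dataset <- make_lognorm_object(h6.data, project = "h6")
```

Giving each object its own project label is what later makes `group.by = 'orig.ident'` separate the two samples in the plots.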
Pipeline
• SE.features <- SelectIntegrationFeatures(object.list = list(h0.dataset, h6.dataset), nfeatures = 3000)
• SE.anchors <- FindIntegrationAnchors(object.list = list(h0.dataset, h6.dataset),
anchor.features = SE.features, verbose = FALSE)
• SE.integrated <- IntegrateData(anchorset = SE.anchors, verbose = FALSE)
• SE.integrated <- ScaleData(SE.integrated)
• SE.integrated <- RunPCA(SE.integrated, verbose = FALSE)
• SE.integrated <- RunUMAP(SE.integrated, dims = 1:50)
• SE.integrated <- FindNeighbors(SE.integrated, dims = 1:50)
• SE.integrated <- FindClusters(SE.integrated, resolution = 0.5)
Visualize
• DefaultAssay(SE.integrated) <- 'RNA'
• DimPlot(SE.integrated, label = TRUE)
• DimPlot(SE.integrated, group.by = 'orig.ident')
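Besides the plots, one way to check how well the samples mix after integration is to tabulate, for each cluster, the fraction of cells that come from each sample. A minimal base R sketch; the function name `cluster_sample_fractions` is illustrative and not from the slides:

```r
# Hypothetical helper (not from the slides): per-cluster sample composition.
# `clusters` and `samples` are vectors with one entry per cell, in the same order.
cluster_sample_fractions <- function(clusters, samples) {
  counts <- table(cluster = clusters, sample = samples)
  # Normalize each row so the fractions within a cluster sum to 1
  prop.table(counts, margin = 1)
}

# Usage with the integrated object from the pipeline:
# cluster_sample_fractions(Idents(SE.integrated), SE.integrated$orig.ident)
```

A cluster made up almost entirely of one sample may reflect a condition-specific cell state or incomplete integration, so it is worth inspecting before annotation.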
How to annotate?
Known marker genes will help
clusters
Known markers
Knowledge from cluster and markers
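Checking known markers against the clusters can be sketched as follows. The helper name `plot_known_markers_sketch` and the default marker list (CD14 for monocytes, MS4A1 for B cells) are assumptions for illustration. Note that the slides read the matrix with gene.column = 1, which for standard Cell Ranger output selects the gene ID column, so gene symbols may need to be replaced by the corresponding IDs. Depending on the Seurat version, the RNA layers may also need to be joined first:

```r
library(Seurat)

# Hypothetical helper (not from the slides): plot known marker genes per cluster.
plot_known_markers_sketch <- function(obj, markers = c("CD14", "MS4A1")) {
  # Plot expression from the RNA assay rather than the integrated assay
  DefaultAssay(obj) <- "RNA"
  # Keep only markers that are actually present as feature names
  present <- intersect(markers, rownames(obj))
  if (length(present) == 0) {
    stop("None of the requested markers are present in the RNA assay")
  }
  list(
    dot = DotPlot(obj, features = present),        # expression per cluster
    feature = FeaturePlot(obj, features = present) # expression on the UMAP
  )
}

# Usage:
# plots <- plot_known_markers_sketch(SE.integrated)
# plots$dot
```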
Annotate
• SE.integrated <- RenameIdents(SE.integrated,
`0` = 'CD14',
`5` = 'B',
……..)
Annotate map
Next part
• Identify conserved cell type markers
• Identify differentially expressed genes across conditions
• Trajectory analysis