Merging integrated replicates #9677

TomaBt · 2025-02-11T08:43:19Z

Hello Seurat Team,

Thank you for providing all these vignettes and resources on single-cell analysis—they have been extremely helpful!

I have some questions regarding the integration of multiple 10x multiome (ATAC/RNA) datasets. Specifically, I have performed a multiome assay on the same cell type under four different conditions, with two biological replicates per condition, resulting in a total of eight datasets.

Initial Analysis & Integration
I started by analyzing each condition independently, merging the two replicates per condition. However, I observed batch effects, so I proceeded with integrating the replicates within each condition. My approach was as follows:

For RNA:
Merged Seurat object → SCTransform → RunPCA
Integration: IntegrateLayers (CCA on PCA reduction) → JoinLayers
Clustering & Visualization: FindNeighbors (reduction = "integrated.cca") → FindClusters → RunUMAP

For ATAC:
Defined a common peak set
FindTopFeatures → RunTFIDF → RunSVD
FindNeighbors (reduction = "lsi") → FindClusters → RunUMAP

For Joint Analysis:
FindMultiModalNeighbors(reduction.list = list("integrated.cca", "lsi"))
FindClusters → RunUMAP
This approach successfully corrected batch effects within each pair of replicates.
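
For reference, the per-condition steps above can be sketched roughly as follows. This is a minimal sketch, not my exact script: the function name `integrate_condition`, the assay names "RNA" and "ATAC", the UMAP reduction names, and the dimension ranges (1:30 for RNA, 2:30 for LSI, dropping the first LSI component) are all assumptions made for illustration.

```r
library(Seurat)
library(Signac)

# Hypothetical helper. `obj` is assumed to be the merge of the two replicates
# of one condition, with an "RNA" assay holding one layer per replicate and an
# "ATAC" assay built on the common peak set.
integrate_condition <- function(obj, rna_dims = 1:30, atac_dims = 2:30) {
  # RNA: SCTransform -> RunPCA -> IntegrateLayers (CCA) -> JoinLayers
  DefaultAssay(obj) <- "RNA"
  obj <- SCTransform(obj, verbose = FALSE)
  obj <- RunPCA(obj, verbose = FALSE)
  obj <- IntegrateLayers(
    object = obj,
    method = CCAIntegration,
    normalization.method = "SCT",
    orig.reduction = "pca",
    new.reduction = "integrated.cca",
    verbose = FALSE
  )
  obj[["RNA"]] <- JoinLayers(obj[["RNA"]])
  obj <- FindNeighbors(obj, reduction = "integrated.cca", dims = rna_dims)
  obj <- FindClusters(obj)
  obj <- RunUMAP(obj, reduction = "integrated.cca", dims = rna_dims,
                 reduction.name = "umap.rna")

  # ATAC: FindTopFeatures -> RunTFIDF -> RunSVD, then cluster on "lsi"
  DefaultAssay(obj) <- "ATAC"
  obj <- FindTopFeatures(obj, min.cutoff = "q0")
  obj <- RunTFIDF(obj)
  obj <- RunSVD(obj)
  obj <- FindNeighbors(obj, reduction = "lsi", dims = atac_dims)
  obj <- FindClusters(obj)  # note: overwrites the seurat_clusters column
  obj <- RunUMAP(obj, reduction = "lsi", dims = atac_dims,
                 reduction.name = "umap.atac")

  # Joint: weighted nearest neighbours on both reductions
  obj <- FindMultiModalNeighbors(
    obj,
    reduction.list = list("integrated.cca", "lsi"),
    dims.list = list(rna_dims, atac_dims)
  )
  obj <- FindClusters(obj, graph.name = "wsnn", algorithm = 3)
  obj <- RunUMAP(obj, nn.name = "weighted.nn", reduction.name = "umap.wnn")
  obj
}
```

In this sketch the batch correction lives only in the "integrated.cca" reduction; the counts and SCT values are not modified.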

Merging All Integrated Datasets
Now, I want to merge all integrated samples into a single Seurat object to generate a UMAP representing all conditions together. Here are my concerns:

Effect of IntegrateLayers:
Since IntegrateLayers performs integration based on PCA reductions rather than creating an "integrated" assay with corrected expression values, does this mean that merging all conditions after integration will not retain the batch correction benefits? In other words, would the final merged object behave similarly to a simple merge of the original eight assays without prior integration?

Alternative Integration Approach:
To address this, I am testing an alternative pipeline that creates an "integrated" assay, in the hope that it will preserve the integration after all conditions are merged:

For RNA: SelectIntegrationFeatures → FindIntegrationAnchors → IntegrateData → RunPCA → RunUMAP → FindNeighbors → FindClusters
For ATAC: FindTopFeatures → RunTFIDF → RunSVD → FindIntegrationAnchors → IntegrateEmbeddings → RunUMAP
For Joint Analysis: FindMultiModalNeighbors → RunUMAP → RunSPCA
I have noticed that the UMAP generated from IntegrateData differs from the one produced with IntegrateLayers, even though both use CCA integration. Is that expected?
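
The RNA part of this alternative pipeline can be sketched as below. Again this is only an illustrative sketch under assumptions: the function name `integrate_rna_anchors` is hypothetical, `obj_list` is assumed to be a list of per-replicate Seurat objects, and log-normalization is assumed for simplicity (with SCTransform, an additional PrepSCTIntegration step would be needed before finding anchors).

```r
library(Seurat)

# Hypothetical helper: anchor-based integration that returns an object with an
# "integrated" assay holding corrected expression values.
integrate_rna_anchors <- function(obj_list, dims = 1:30) {
  # Normalize each replicate and pick variable features independently
  obj_list <- lapply(obj_list, function(x) {
    x <- NormalizeData(x, verbose = FALSE)
    FindVariableFeatures(x, verbose = FALSE)
  })

  # Features that are repeatedly variable across the replicates
  features <- SelectIntegrationFeatures(object.list = obj_list)

  # CCA-based anchors, then corrected expression values
  anchors <- FindIntegrationAnchors(
    object.list = obj_list,
    anchor.features = features,
    reduction = "cca",
    dims = dims
  )
  combined <- IntegrateData(anchorset = anchors, dims = dims)

  # Downstream analysis runs on the corrected "integrated" assay
  DefaultAssay(combined) <- "integrated"
  combined <- ScaleData(combined, verbose = FALSE)
  combined <- RunPCA(combined, verbose = FALSE)
  combined <- RunUMAP(combined, reduction = "pca", dims = dims)
  combined <- FindNeighbors(combined, reduction = "pca", dims = dims)
  combined <- FindClusters(combined)
  combined
}
```

Here the "integrated" assay contains only the selected integration features, and the PCA is recomputed on the corrected values, whereas IntegrateLayers corrects the existing PCA embedding directly.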

Best Strategy for Merging Integrated Replicates:

Would you recommend integrating the replicates first and then merging the integrated objects, or should I merge all eight samples first and then perform integration?

I am concerned that integrating all conditions together might remove meaningful biological variability arising from the different experimental conditions. Is this a valid concern?

I am new to single-cell analysis, and handling this dataset has been quite challenging. I appreciate any insights or recommendations you can provide.

Thank you very much for your help!

Best,
Thomas
