Hello Seurat Team,
Thank you for providing all these vignettes and resources on single-cell analysis—they have been extremely helpful!
I have some questions regarding the integration of multiple 10x multiome (ATAC/RNA) datasets. Specifically, I have performed a multiome assay on the same cell type under four different conditions, with two biological replicates per condition, resulting in a total of eight datasets.
Initial Analysis & Integration
I started by analyzing each condition independently, merging the two replicates per condition. However, I observed batch effects, so I proceeded with integrating the replicates within each condition. My approach was as follows:
For RNA:
Merged Seurat object → SCTransform → RunPCA
Integration: IntegrateLayers (CCA on PCA reduction) → JoinLayers
Clustering & Visualization: FindNeighbors (reduction = "integrated.cca") → FindClusters → RunUMAP
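For reference, a minimal sketch of the RNA steps above in Seurat v5 syntax. The helper name `integrate_replicates_rna`, the assay name "RNA", the `dims` range, and the `umap.rna` reduction name are assumptions for illustration; `obj` is assumed to be the merge of the two replicates, with the RNA assay still split into one layer per replicate.

```r
library(Seurat)

# Hypothetical helper: integrate two merged replicates on the RNA modality.
# Assumes `obj` is a merged Seurat v5 object whose RNA assay is split by replicate.
integrate_replicates_rna <- function(obj, dims = 1:30) {
  # Normalize each layer with SCTransform, then compute an uncorrected PCA
  obj <- SCTransform(obj, assay = "RNA", verbose = FALSE)
  obj <- RunPCA(obj, verbose = FALSE)

  # CCA-based integration: writes a corrected low-dimensional embedding
  # ("integrated.cca"); it does not create corrected expression values
  obj <- IntegrateLayers(
    object = obj,
    method = CCAIntegration,
    normalization.method = "SCT",
    orig.reduction = "pca",
    new.reduction = "integrated.cca",
    verbose = FALSE
  )

  # Rejoin the per-replicate RNA layers after integration
  obj[["RNA"]] <- JoinLayers(obj[["RNA"]])

  # Cluster and visualize on the integrated embedding
  obj <- FindNeighbors(obj, reduction = "integrated.cca", dims = dims)
  obj <- FindClusters(obj)
  obj <- RunUMAP(obj, reduction = "integrated.cca", dims = dims,
                 reduction.name = "umap.rna")
  obj
}
```

Because the correction lives only in the `integrated.cca` reduction, downstream steps have to point at that reduction explicitly.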
For ATAC:
Defined a common peak set → FindTopFeatures → RunTFIDF → RunSVD
FindNeighbors (reduction = "lsi") → FindClusters → RunUMAP
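The ATAC steps above could look roughly like the following sketch. The helper name `process_atac`, the assay name "ATAC", the `min.cutoff` value, and the choice to skip the first LSI component (which often tracks sequencing depth) are assumptions, not settings taken from my actual run. As listed, this uses the uncorrected `lsi` reduction.

```r
library(Seurat)
library(Signac)

# Hypothetical helper: LSI-based processing of the ATAC modality.
# Assumes the assay named by `assay` was built on the common peak set.
process_atac <- function(obj, assay = "ATAC", dims = 2:30) {
  DefaultAssay(obj) <- assay

  # Feature selection, TF-IDF normalization, and SVD (together: LSI)
  obj <- FindTopFeatures(obj, min.cutoff = "q0")
  obj <- RunTFIDF(obj)
  obj <- RunSVD(obj)

  # Cluster and visualize on the LSI embedding
  obj <- FindNeighbors(obj, reduction = "lsi", dims = dims)
  obj <- FindClusters(obj, algorithm = 3)
  obj <- RunUMAP(obj, reduction = "lsi", dims = dims,
                 reduction.name = "umap.atac")
  obj
}
```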
For Joint Analysis:
FindMultiModalNeighbors(reduction.list = list("integrated.cca", "lsi")) → FindClusters → RunUMAP
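A sketch of the joint (weighted nearest neighbor) step, assuming the object already holds the `integrated.cca` and `lsi` reductions. The helper name `joint_wnn`, the dimension ranges, and the `wnn.umap` reduction name are illustrative assumptions.

```r
library(Seurat)

# Hypothetical helper: weighted nearest neighbor (WNN) analysis on RNA + ATAC.
# Assumes reductions "integrated.cca" (RNA) and "lsi" (ATAC) exist in `obj`.
joint_wnn <- function(obj, rna_dims = 1:30, atac_dims = 2:30) {
  # Build the weighted neighbor graphs ("wknn", "wsnn") and the
  # "weighted.nn" neighbor object from both modalities
  obj <- FindMultiModalNeighbors(
    obj,
    reduction.list = list("integrated.cca", "lsi"),
    dims.list = list(rna_dims, atac_dims)
  )

  # Cluster on the weighted SNN graph and embed from the weighted neighbors
  obj <- FindClusters(obj, graph.name = "wsnn", algorithm = 3)
  obj <- RunUMAP(obj, nn.name = "weighted.nn",
                 reduction.name = "wnn.umap", reduction.key = "wnnUMAP_")
  obj
}
```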
This approach successfully corrected batch effects within each pair of replicates.
Merging All Integrated Datasets
Now, I want to merge all integrated samples into a single Seurat object to generate a UMAP representing all conditions together. Here are my concerns:
Effect of IntegrateLayers:
Since IntegrateLayers performs integration based on PCA reductions rather than creating an "integrated" assay with corrected expression values, does this mean that merging all conditions after integration will not retain the batch correction benefits? In other words, would the final merged object behave similarly to a simple merge of the original eight assays without prior integration?
Alternative Integration Approach:
To tackle this issue, I am testing an alternative pipeline that creates an "integrated" assay, so that the integration is preserved after merging all conditions together:
For RNA:
SelectIntegrationFeatures → FindIntegrationAnchors → IntegrateData → RunPCA → RunUMAP → FindNeighbors → FindClusters
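A rough sketch of this anchor-based RNA pipeline, assuming a list of per-sample objects and log-normalization. The helper name `integrate_rna_anchors` and the `dims` range are assumptions. The sketch adds the NormalizeData, FindVariableFeatures, and ScaleData calls that the steps above leave implicit; an SCTransform-based version would instead need PrepSCTIntegration and `normalization.method = "SCT"`.

```r
library(Seurat)

# Hypothetical helper: anchor-based integration that produces an
# "integrated" assay with corrected expression values.
# Assumes `obj_list` is a list of per-sample Seurat objects with an RNA assay.
integrate_rna_anchors <- function(obj_list, dims = 1:30) {
  # Per-sample normalization and variable-feature selection
  obj_list <- lapply(obj_list, function(x) {
    DefaultAssay(x) <- "RNA"
    x <- NormalizeData(x, verbose = FALSE)
    FindVariableFeatures(x, verbose = FALSE)
  })

  # Find anchors across samples and integrate the expression data
  features <- SelectIntegrationFeatures(object.list = obj_list)
  anchors <- FindIntegrationAnchors(object.list = obj_list,
                                    anchor.features = features, dims = dims)
  combined <- IntegrateData(anchorset = anchors, dims = dims)

  # Downstream analysis runs on the corrected "integrated" assay
  DefaultAssay(combined) <- "integrated"
  combined <- ScaleData(combined, verbose = FALSE)
  combined <- RunPCA(combined, verbose = FALSE)
  combined <- RunUMAP(combined, reduction = "pca", dims = dims)
  combined <- FindNeighbors(combined, reduction = "pca", dims = dims)
  combined <- FindClusters(combined)
  combined
}
```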
For ATAC:
FindTopFeatures → RunTFIDF → RunSVD → FindIntegrationAnchors → IntegrateEmbeddings → RunUMAP
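A sketch of the ATAC integration above, following the reciprocal LSI pattern. The helper name `integrate_atac_lsi`, the assay name "ATAC", the dimension ranges, and the `integrated_lsi` reduction name are assumptions. It also assumes each object already has its own `lsi` reduction computed on the common peak set, and that cell names are unique across objects.

```r
library(Seurat)
library(Signac)

# Hypothetical helper: integrate LSI embeddings across ATAC datasets.
# Assumes every object in `obj_list` shares the same peak set, has an "lsi"
# reduction from FindTopFeatures/RunTFIDF/RunSVD, and has unique cell names.
integrate_atac_lsi <- function(obj_list, assay = "ATAC", dims = 2:30) {
  obj_list <- lapply(obj_list, function(x) {
    DefaultAssay(x) <- assay
    x
  })

  # LSI on the merged object gives the embedding that will be corrected
  combined <- merge(obj_list[[1]], y = obj_list[-1])
  combined <- FindTopFeatures(combined, min.cutoff = "q0")
  combined <- RunTFIDF(combined)
  combined <- RunSVD(combined)

  # Anchors are found between the per-object LSI spaces (reciprocal LSI)
  anchors <- FindIntegrationAnchors(
    object.list = obj_list,
    anchor.features = rownames(obj_list[[1]]),
    reduction = "rlsi",
    dims = dims
  )

  # Correct the merged LSI embedding rather than the peak counts
  integrated <- IntegrateEmbeddings(
    anchorset = anchors,
    reductions = combined[["lsi"]],
    new.reduction.name = "integrated_lsi",
    dims.to.integrate = 1:max(dims)
  )
  RunUMAP(integrated, reduction = "integrated_lsi", dims = dims)
}
```

Like IntegrateLayers, IntegrateEmbeddings corrects a low-dimensional embedding, not the underlying assay.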
For Joint Analysis:
FindMultiModalNeighbors → RunUMAP → RunSPCA
I have noticed that the UMAP generated from IntegrateData differs from the one produced using IntegrateLayers, even though both use CCA integration. Is that expected?
Best Strategy for Merging Integrated Replicates:
Would you recommend integrating the replicates first and then merging the integrated objects, or should I merge all eight samples first and then perform integration?
I am concerned that integrating all conditions together might remove meaningful biological variability arising from the different experimental conditions. Is this a valid concern?
I am new to single-cell analysis, and handling this dataset has been quite challenging. I appreciate any insights or recommendations you can provide.
Thank you very much for your help!
Best,
Thomas