
KJ optionally use h5seurat instead of rds #110

Open
kathrinjansen wants to merge 17 commits into master from KJ_h5seurat


@kathrinjansen
Collaborator

  • Add optional use of 'begin.h5seurat' as input instead of 'begin.rds'; the SeuratDisk package is used for this.

  • New yml option for this: input_format = rds | h5seurat. The option is used by multiple tasks throughout pipeline_seurat.py.

  • For option 'h5seurat': instead of writing out flat files (e.g. for scaled data or metadata), begin.h5seurat is converted to an anndata object (begin.h5ad), which is then used by the other tasks (there is a new R script for this).

  • Fixes to the code for the diffusion map task (it was running only once per components value, not for each cluster resolution?).

  • Tests done for the following combinations: from h5seurat + loom velocity, from rds + loom velocity, and from raw mm format using h5seurat.
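
The input_format switch described in the list above could be sketched roughly as follows. This is a minimal illustration, not the code in pipeline_seurat.py: the function name `python_export_plan`, the step labels and the dictionary keys are hypothetical. Only the option values rds and h5seurat and the file names begin.rds, begin.h5seurat and begin.h5ad come from this PR.

```python
# Hypothetical sketch: how one yml option could select the data hand-off
# to the Python tasks. Names below are illustrative, not the pipeline's own.
VALID_INPUT_FORMATS = ("rds", "h5seurat")


def python_export_plan(input_format):
    """Return which input file and hand-off step an input_format implies."""
    if input_format not in VALID_INPUT_FORMATS:
        raise ValueError(
            "input_format must be one of %s, got %r"
            % (", ".join(VALID_INPUT_FORMATS), input_format)
        )

    if input_format == "rds":
        # rds input: flat files (e.g. scaled data, metadata) are written out.
        return {"input": "begin.rds", "step": "export_flat_files", "anndata": None}

    # h5seurat input: the object is converted to an anndata file instead.
    return {
        "input": "begin.h5seurat",
        "step": "convert_to_h5ad",
        "anndata": "begin.h5ad",
    }
```

Validating the option once, in a single helper, would keep downstream tasks from each re-implementing the rds versus h5seurat check.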

@kathrinjansen kathrinjansen requested a review from snsansom July 18, 2020 10:24
Contributor

@snsansom snsansom left a comment

Hi Kathrin,

A few initial comments:

Can we please move the code for loading and saving of Seurat objects into functions in the tenxutils R library?

Please only require(SeuratDisk) if it is needed (i.e. within the functions).

Thanks,
Steve

@kathrinjansen
Collaborator Author

There are now two new functions in tenxutils, loadSeurat and saveSeurat, which load from (or save to) rds or h5seurat depending on the file ending or the 'format' option. The package requirement has been removed from all other scripts.

Some more fixes for issues with s@misc in h5seurat: it is saved as a list if the h5seurat is created by the pipeline, or, if a conversion happens, the gene_id to gene_name mapping is not in s@misc but in @assays$RNA@meta.features instead.
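
The format dispatch described for loadSeurat and saveSeurat can be sketched as below. It is written in Python purely for illustration: the real functions are R code in tenxutils, and their signatures are not shown in this thread. The helper name `resolve_seurat_format` and the rule that an explicit `format` takes precedence over the file ending are assumptions.

```python
import os

# Formats named in this PR.
KNOWN_FORMATS = ("rds", "h5seurat")


def resolve_seurat_format(path, format=None):
    """Pick the on-disk format from an explicit option or the file ending.

    Hypothetical helper: assumes an explicit `format` overrides the ending.
    """
    if format is not None:
        chosen = format.lower()
    else:
        # "begin.h5seurat" -> ".h5seurat" -> "h5seurat"
        chosen = os.path.splitext(path)[1].lstrip(".").lower()

    if chosen not in KNOWN_FORMATS:
        raise ValueError(
            "cannot determine Seurat object format for %r (got %r)" % (path, chosen)
        )
    return chosen
```

In the R functions, the h5seurat branch is presumably the only place where SeuratDisk needs to be loaded, which is what lets scripts that only read rds files run without the package.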

kathrinjansen and others added 17 commits November 4, 2020 23:16
… Now either the exportForPython task is run (input = rds) or a conversion from h5seurat to h5ad (input = h5ad). This is specified as a new option in pipeline.yml and downstream tasks use this option.
…map if .X is present, see issue here: scverse/scanpy#1318 . Now edited to match the previous script: create new anndata object without .X.
… in conversion from anndata) in seurat_FindMarkers.R
…are used in all R scripts to decide about the input format
@snsansom
Contributor

snsansom commented Nov 5, 2020

I have rebased on master, but the branch is not running on the test datasets (geneset analysis fails on mouse datasets; export anndata fails with input format h5seurat).
