Parallel processing in subsample.py by SichongP · Pull Request #197 · Magdoll/cDNA_Cupcake

SichongP · 2022-02-24T14:09:07Z

This PR parallelizes subsampling in subsample.py, because the script currently takes too long to run.

I tested the new script with 10,000 total reads, a step size of 100 reads, and 100 iterations:

With the original script:

35.3 s ± 70.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

With the parallel script (5 threads):

12.8 s ± 171 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
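
As a quick reading of these two timings (a back-of-the-envelope calculation on the numbers above, not an additional measurement), the speedup and per-worker efficiency can be worked out as below. The variable names are only illustrative.

```python
# Mean wall-clock times per loop, copied from the two timings reported above.
serial_s = 35.3
parallel_s = 12.8
workers = 5  # number of threads used in the parallel run

speedup = serial_s / parallel_s   # how many times faster the parallel run is
efficiency = speedup / workers    # fraction of ideal linear scaling achieved

print(f"speedup: {speedup:.2f}x")                # speedup: 2.76x
print(f"parallel efficiency: {efficiency:.0%}")  # parallel efficiency: 55%
```

So on this small test the five workers deliver a bit more than half of ideal linear scaling.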

The improvement should be more pronounced on real samples, where each subsampling task does more work and the multiprocessing overhead becomes negligible by comparison.

rvolden and others added 17 commits November 4, 2021 10:55

Initial commit

de9e75c

Merge pull request Magdoll#175 from velociroger-pb/master

faefbc3

Added deconcat scripts

Initial commit, only isoform counts

187141e

Merge branch 'master' of https://github.com/velociroger-pb/cDNA_Cupcake

735f401

Merge pull request Magdoll#176 from velociroger-pb/master

654d099

Added script to make Seurat input

Ignore .swp files

d116375

Added support for multiprocessing SAM files (previously only worked for BAM)

4d96c39

Changed input and output files to be user specified instead of hardcoded

3217505

Added case for cell bc and UMI tags switching places after dedup

fe7b60d

Allow for in/out and switched UMI/BC placement

0190d22

Merge pull request Magdoll#181 from velociroger-pb/master

29478d6

Making collapse_isoforms_by_sam.py compatible with SAM file multiprocessing

Initial commit

3a664fd

Fixed args

5e53b10

Fix num entries in matrix file (previously overcounting)

34ced56

Merge pull request Magdoll#196 from velociroger-pb/master

4ad367a

Fix seurat input script

Parallel processing in subsample.py

e237af5

Adds parallelization to subsampling as this script takes too long to run right now

Actually print out results

7151e62

Magdoll force-pushed the master branch from 4ad367a to a418731 Compare August 23, 2022 20:37

SichongP wants to merge 17 commits into Magdoll:master from SichongP:master
