Setting `chunkSize = 100` splits the genome into chunks of 100 nucleotides and processes each one separately, which explains the slowness. The smaller the chunk, the less memory is required; the bigger it is, the faster the processing. You can try increasing the number significantly, or even removing the argument altogether, as you have quite a bit of memory available. If you run out of memory, that means you need smaller chunks.
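To put rough numbers on this, here is a back-of-the-envelope sketch in base R. The genome length is an assumed round figure for hg38 (about 3.1 billion nucleotides), and the larger chunk size of 10 million nucleotides is just a hypothetical value for comparison, not a recommendation from the QDNAseq documentation.

```r
# Assumed approximate length of the hg38 genome in nucleotides (round figure).
genome_length <- 3.1e9

# Number of chunks that have to be processed for a given chunk size (in nt).
n_chunks <- function(chunk_size) ceiling(genome_length / chunk_size)

chunks_small <- n_chunks(100)   # chunkSize used in the script from the question
chunks_large <- n_chunks(1e7)   # hypothetical, much larger chunk size

print(chunks_small)
print(chunks_large)
```

With these assumptions, a chunk size of 100 means tens of millions of separate passes over the BAM file, versus a few hundred with the larger value. In the script from the question, this corresponds to raising or dropping the `chunkSize = 100` argument of `binReadCounts()`.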
(I switched fields years ago, so I haven't been involved with any of this in a long time. There's not much more I can say, but that one number jumped out at me as this popped into my email.)
I am trying to use QDNAseq to perform copy number calling on a pseudobulk BAM file (sorted and indexed, ~50 GB in size). I want to start by analyzing at a 100 kb bin size.
The script has been taking WEEKS just to extract read counts from this BAM file.
This is part of the code I am using:
library(QDNAseq)
library(QDNAseq.hg38)
library(future)
setwd("/home/u855h/chromothripsis/urja/LFS02CP_output/outs/clean_bam_merged")
bins <- getBinAnnotations(binSize=100, genome="hg38")
future::plan("multisession")
readCounts <- binReadCounts(bins, bamfiles="LFS02CP_merged_sorted.bam", chunkSize = 100)
print("binReadCounts: Done!")
print(readCounts)
I submit the job on the cluster with 200 GB of RAM allocated. What am I doing wrong?