Nextflow workflow for checking quality of NGS data
cidgoh_qc is a Nextflow-based bioinformatics workflow for quality control (QC) analysis of NGS data.
- Install Nextflow (>=21.04.0)
TIPS: You can load nextflow on the Cedar cluster like this:
$ module load nextflow/21.04.3
- Install any of Docker, Singularity, or Conda as the package manager.
TIPS: Docker and Conda are not allowed on the Cedar cluster. Singularity is among the default tools there, so you don't need to install it on the Cedar cluster.
- Download the source code from GitHub
$ git clone https://github.com/cidgoh/cidgoh_qc.git
$ cd cidgoh_qc
TIPS: We have set up a default version on the Cedar cluster at
/project/rrg-whsiao-ab/shared_tools/cidgoh_qc
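The install requirements above (Nextflow >=21.04.0 plus one of Singularity, Docker, or Conda) can be checked before a first run. The following is a minimal sketch and not part of cidgoh_qc: the helper names `version_ge` and `find_manager` are hypothetical, the comparison relies on GNU `sort -V`, and the parsing assumes that `nextflow -version` prints a line containing `version X.Y.Z`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not shipped with cidgoh_qc.

# Succeeds if version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints the first supported package manager found on PATH.
find_manager() {
    for tool in singularity docker conda; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

required="21.04.0"
if command -v nextflow >/dev/null 2>&1; then
    # Assumption: the version banner contains "version X.Y.Z".
    installed="$(nextflow -version 2>/dev/null | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' | head -n 1)"
    if version_ge "$installed" "$required"; then
        echo "nextflow $installed is recent enough"
    else
        echo "nextflow version '$installed' is older than $required or could not be parsed"
    fi
else
    echo "nextflow not found on PATH"
fi

if manager="$(find_manager)"; then
    echo "package manager available: $manager"
else
    echo "none of singularity, docker, or conda found on PATH"
fi
```

On the Cedar cluster, run `module load nextflow/21.04.3` first so that `nextflow` is on the PATH.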
Use singularity, docker or conda to manage dependencies
$ nextflow run ./main.nf -profile <conda/singularity> --input samplesheet.csv --adapter_trim_mode <cutadapt/trimgalore/fastp> --kraken2_db [$dbname]
Use slurm to submit jobs
$ nextflow run ./main.nf -profile slurm --input samplesheet.csv --adapter_trim_mode <cutadapt/trimgalore/fastp> --kraken2_db [$dbname]
TIPS: If you run jobs on the Cedar cluster, you don't need to add --workDir because we have set up a default work folder at
/project/rrg-whsiao-ab/misc/tmp_work_nextflow
The Nextflow reports are in the "Reports" directory of your result folder.
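In the commands above, `<cutadapt/trimgalore/fastp>` is a placeholder for exactly one of the three trimming modes. As a sketch of how a run could be scripted, the hypothetical wrapper below rejects any other value and prints the resulting command. The function names and the database path `/path/to/kraken2_db` are illustrative assumptions, not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper; not shipped with cidgoh_qc.

# Succeeds only for the three trimming modes named in this README.
valid_trim_mode() {
    case "$1" in
        cutadapt|trimgalore|fastp) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the nextflow command for a profile ($1), trimming mode ($2),
# and Kraken2 database ($3). It only prints the command; it does not run it.
build_cmd() {
    if ! valid_trim_mode "$2"; then
        echo "unknown adapter_trim_mode: $2" >&2
        return 1
    fi
    echo "nextflow run ./main.nf -profile $1 --input samplesheet.csv --adapter_trim_mode $2 --kraken2_db $3"
}

# /path/to/kraken2_db is a placeholder; substitute your own database.
build_cmd slurm fastp /path/to/kraken2_db
```

Printing the command first makes it easy to review before submitting; pipe the output to `sh` or copy it to the prompt once it looks right.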
TIPS: Depending on the resources your jobs actually use, you can adjust the default resource requests in
conf/slurm.config
For example:
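params {
    account = "xxxx"
    runTime = 2.h          // base wall-time request
    singleCPUMem = 1.GB    // base memory per CPU
}
// withName selectors belong inside the process scope
process {
    withName:fastqc {
        cpus = 4
        // memory and time grow with each retry (task.attempt)
        memory = {params.singleCPUMem * 4 * task.attempt}
        time = {params.runTime * task.attempt}
    }
}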
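To see what the example requests from Slurm, the arithmetic of the two closures can be worked through by hand. The sketch below mirrors it with the values shown above (1 GB per CPU, a factor of 4, and 2 hours); the function names are hypothetical. Note that `task.attempt` only rises above 1 when retries are enabled for the process (for example through `errorStrategy` and `maxRetries`), which this README does not show.

```shell
#!/usr/bin/env bash
# Illustration only: values copied from the example config (GB and hours).
single_cpu_mem_gb=1
mem_factor=4
run_time_h=2

# Mirrors: memory = params.singleCPUMem * 4 * task.attempt
mem_for_attempt() {
    echo $(( single_cpu_mem_gb * mem_factor * $1 ))
}

# Mirrors: time = params.runTime * task.attempt
time_for_attempt() {
    echo $(( run_time_h * $1 ))
}

for attempt in 1 2 3; do
    echo "attempt $attempt: $(mem_for_attempt "$attempt") GB, $(time_for_attempt "$attempt") h"
done
# attempt 1: 4 GB, 2 h
# attempt 2: 8 GB, 4 h
# attempt 3: 12 GB, 6 h
```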