Pipeline for the alignment, variant calling, copy number calling and annotation of whole-genome sequencing data, built on top of the sarek Nextflow pipeline. Please refer to the sarek documentation for details.
run_wgs_pipeline.sh [-g|--genome <arg>] [-h|--help] [--version] <samplesheet> <outdir>
<samplesheet>: path to the samplesheet CSV file
<outdir>: path to the output directory where to store the results
-g, --genome: genome to be used (default: 'GATK.GRCh37')
-h, --help: Prints help
--version: Prints version
where
- samplesheet is a CSV file
- outdir is the path where the results will be stored
- genome is the reference genome to be used
For example:
run_wgs_pipeline.sh --genome GRCm38 design.csv wgs
will run the pipeline on the samples listed in the design.csv file, store the results in the wgs folder, and align the reads to the GRCm38 genome.
See the sarek documentation for details on the required input.
An example input file is provided: test_input.csv.
Supported genomes:
- GATK.GRCh37 (from Broad Institute)
- GRCm38 (from Ensembl)
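
The invocation and genome names above can be combined into a small pre-flight check before launching a run. The following is only a sketch: the helper name `build_wgs_command` is hypothetical and not part of the pipeline, and it assumes that the two genomes listed above are the only accepted values and that paths contain no spaces. It prints the command line rather than executing it.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with the pipeline.
# Checks the inputs and prints the run_wgs_pipeline.sh command line.
build_wgs_command() {
    local genome="$1" samplesheet="$2" outdir="$3"

    # Fail early if the samplesheet is missing or empty.
    if [ ! -s "$samplesheet" ]; then
        echo "error: samplesheet '$samplesheet' not found or empty" >&2
        return 1
    fi

    # Assumption: only the genomes named in this README are valid.
    case "$genome" in
        GATK.GRCh37|GRCm38) ;;
        *) echo "error: unsupported genome '$genome'" >&2; return 1 ;;
    esac

    # Print the command instead of running it (dry run).
    printf 'run_wgs_pipeline.sh --genome %s %s %s\n' "$genome" "$samplesheet" "$outdir"
}
```

Calling `build_wgs_command GRCm38 design.csv wgs` prints the example command shown earlier if design.csv exists and is non-empty, and returns a non-zero status otherwise.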