Release v2.1.0 (#24)
* Add rule and sample info to slurm log files

* Make naming convention of log output dir consistent

* Add pipeline tests (#14)

* Add test dataset

* Add info on test dataset

* Automate setting up to run test dataset

* Speedups (#15)

* Use full parabricks germline pipeline + other standalone tools - provides speedups

* Thread parabricks rules

* Remove snakemake wrapper and thread fastqc

* Add fastqc conda env now that snakemake wrapper has been removed

* Fixes

* fix error due to file target that isn't created

* forgot to add parabricks rule to local rule list

* remove flag that causes error

* allow dynamic inclusion of recal resources - also stop need for user … (#17)

* allow dynamic inclusion of recal resources - also stop need for user to manually write the flags

* clarify you can directly pass adapters to trim galore

* move existing helper functions to one place in snakefile

* simplify flags for WES settings

* account for when someone doesn't use WES settings

* Simplify code (#19)

* Functionize code (#20)

Move dynamic stuff (like if-else statements) into functions to avoid having global variables

* Docs (#22)

* separate docs for running in different situations

* add images

* fix links to images

* add section about getting data on nesi

* remove incomplete docs for running pipeline on NeSi for now

* fix fastqc/multiqc error

* improve documentation

* fix file path that makes download from google cloud bucket not work

* improve docs

* discourage using home dir in docs

* add more information about pipeline

* fix sample wildcard error for rules without sample wildcard

* clarify output files

* add link to discussions

* remove g.vcf that causes error in vcf_annotation_pipeline (#25)
leahkemp authored Apr 19, 2022
1 parent 9b5c8a0 commit 1a6f3d0
Showing 5 changed files with 6 additions and 5 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -72,7 +72,7 @@ Cohort samples:
- `results/mapped/sample1_recalibrated.bam`
- `results/mapped/sample2_recalibrated.bam`
- `results/mapped/sample3_recalibrated.bam`
-- `results/called/proband1_raw_snps_indels.g.vcf`
+- `results/called/proband1_raw_snps_indels.vcf`

## Prerequisites

@@ -94,6 +94,7 @@ See the docs for a walkthrough guide for running [human_genomics_pipeline](https

- Raise issues in [the issues page](https://github.com/ESR-NZ/human_genomics_pipeline/issues)
- Create feature requests in [the issues page](https://github.com/ESR-NZ/human_genomics_pipeline/issues)
+- Start a discussion in [the discussion page](https://github.com/ESR-NZ/human_genomics_pipeline/discussions)
- Contribute your code! Create your own branch from the [development branch](https://github.com/ESR-NZ/human_genomics_pipeline/tree/dev) and create a pull request to the [development branch](https://github.com/ESR-NZ/human_genomics_pipeline/tree/dev) once the code is on point!

Contributions and feedback are always welcome! :blush:
2 changes: 1 addition & 1 deletion docs/running_on_a_hpc.md
@@ -209,7 +209,7 @@ Set the maximum number of GPUs to be used per rule/sample for gpu-accelerated r
GPU: 1
```

-It is a good idea to consider the number of samples that you are processing. For example, if you set `THREADS: "8"` and set the maximum number of cores to be used by the pipeline in the run script to `-j 32` (see step 6), a maximum of 3 samples will be able to run at one time for these rules (if they are deployed at the same time), but each sample will complete faster. In contrast, if you set `THREADS: "1"` and `-j 32`, a maximum of 32 samples could be run at one time, but each sample will take longer to complete. This also needs to be considered when setting `MAXMEMORY` + `--resources mem_mb` and `GPU` + `--resources gpu`.
+It is a good idea to consider the number of samples that you are processing. For example, if you set `THREADS: "8"` and set the maximum number of cores to be used by the pipeline in the run script to `-j/--cores 32` (see [step 8](#8-modify-the-run-scripts)), a maximum of 3 samples will be able to run at one time for these rules (if they are deployed at the same time), but each sample will complete faster. In contrast, if you set `THREADS: "1"` and `-j/--cores 32`, a maximum of 32 samples could be run at one time, but each sample will take longer to complete. This also needs to be considered when setting `MAXMEMORY` + `--resources mem_mb` and `GPU` + `--resources gpu`.
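To make the interaction concrete, here is a minimal Python sketch of how these settings might be assembled into a snakemake invocation; the concrete values (32 cores, 150000 MB, 2 GPUs) are illustrative assumptions, not recommended defaults:

```python
# Hypothetical run-script settings combining the limits discussed above.
# All concrete values here are illustrative assumptions.
config_threads = "8"  # THREADS: "8" in config.yaml (per-rule/per-sample threads)

run_script_args = [
    "snakemake",
    "--cores", "32",                 # -j/--cores: total core budget
    "--resources", "mem_mb=150000",  # paired with MAXMEMORY in config.yaml
    "gpu=2",                         # paired with GPU in config.yaml
]
command = " ".join(run_script_args)
```

With 8 threads per rule inside a 32-core budget, several samples run side by side; halving `THREADS` roughly doubles how many threaded rules fit at once, at the cost of per-sample speed.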

#### Trimming

2 changes: 1 addition & 1 deletion docs/running_on_a_single_machine.md
@@ -208,7 +208,7 @@ Set the maximum number of GPUs to be used per rule/sample for gpu-accelerated r
GPU: 1
```

-It is a good idea to consider the number of samples that you are processing. For example, if you set `THREADS: "8"` and set the maximum number of cores to be used by the pipeline in the run script to `-j 32` (see step 6), a maximum of 3 samples will be able to run at one time for these rules (if they are deployed at the same time), but each sample will complete faster. In contrast, if you set `THREADS: "1"` and `-j 32`, a maximum of 32 samples could be run at one time, but each sample will take longer to complete. This also needs to be considered when setting `MAXMEMORY` + `--resources mem_mb` and `GPU` + `--resources gpu`.
+It is a good idea to consider the number of samples that you are processing. For example, if you set `THREADS: "8"` and set the maximum number of cores to be used by the pipeline in the run script to `-j/--cores 32` (see [step 7](#7-modify-the-run-scripts)), a maximum of 3 samples will be able to run at one time for these rules (if they are deployed at the same time), but each sample will complete faster. In contrast, if you set `THREADS: "1"` and `-j/--cores 32`, a maximum of 32 samples could be run at one time, but each sample will take longer to complete. This also needs to be considered when setting `MAXMEMORY` + `--resources mem_mb` and `GPU` + `--resources gpu`.

#### Trimming

2 changes: 1 addition & 1 deletion workflow/Snakefile
@@ -163,7 +163,7 @@ if config['DATA'] == "Cohort" or config['DATA'] == 'cohort':
input:
"../results/qc/multiqc_report.html",
expand("../results/mapped/{sample}_recalibrated.bam", sample = SAMPLES),
-expand("../results/called/{family}_raw_snps_indels.g.vcf", family = FAMILIES)
+expand("../results/called/{family}_raw_snps_indels.vcf", family = FAMILIES)

##### Load rules #####

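The renamed target from the hunk above can be sketched in plain Python; `proband1` is a family name assumed from the README example, not a pipeline default:

```python
# Plain-Python equivalent of the corrected expand() call in the Snakefile.
# "proband1" is an assumed family name for illustration.
FAMILIES = ["proband1"]

targets = [
    f"../results/called/{family}_raw_snps_indels.vcf"  # .vcf now, not .g.vcf
    for family in FAMILIES
]
```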
2 changes: 1 addition & 1 deletion workflow/rules/gatk_GenotypeGVCFs.smk
@@ -3,7 +3,7 @@ rule gatk_GenotypeGVCFs:
gvcf = "../results/called/{family}_raw_snps_indels_tmp_combined.g.vcf",
refgenome = expand("{refgenome}", refgenome = config['REFGENOME'])
output:
-protected("../results/called/{family}_raw_snps_indels.g.vcf")
+protected("../results/called/{family}_raw_snps_indels.vcf")
params:
maxmemory = expand('"-Xmx{maxmemory}"', maxmemory = config['MAXMEMORY']),
tdir = config['TEMPDIR'],
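The extension change matches what the tool produces: GATK's GenotypeGVCFs consumes a combined GVCF and emits a genotyped VCF, so the rule's output should not carry a `.g.vcf` extension. As a sketch only, the file naming this rule implies can be traced like this, with `proband1` and `reference.fasta` as assumed placeholder values:

```python
# Sketch of the file naming rule gatk_GenotypeGVCFs implies for one family.
# "proband1" and "reference.fasta" are assumed placeholders, not pipeline values.
family = "proband1"
gvcf = f"../results/called/{family}_raw_snps_indels_tmp_combined.g.vcf"  # rule input
vcf = f"../results/called/{family}_raw_snps_indels.vcf"                  # rule output

# Shape of the underlying call (arguments assumed, not the rule's exact shell line):
cmd = ["gatk", "GenotypeGVCFs", "-R", "reference.fasta", "-V", gvcf, "-O", vcf]
```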
