variant calling workflow + testing #371
Open: fridells51 wants to merge 37 commits into master from vc-wf
- add info about other species
- point out human-specific parts
- improve comments
- more canonical code
add function creating required dictionary of input files for snpeff rules
This pull request adds an end-to-end Snakemake variant calling workflow to lcdb-wf. The Snakefile handles references, read mapping, and QC, and includes a GATK best practices pipeline for germline and somatic variant calling. The workflow supports whole genome sequencing (WGS) and targeted sequencing inputs and returns analysis-ready, annotated VCFs.
This PR also updates the conda environment to include packages for variant calling. The lcdb-wf docs now include a comprehensive overview of the workflow and describe the configuration options users can adjust to fit their analysis. The workflow is not organism-specific, and the docs detail how to call variants on non-human organisms. References can be provided to the workflow externally, but this PR also expands the existing references workflow in lcdb-wf to automatically include the new reference types needed for variant calling.
The VCF annotation portion of the workflow supports attaching annotations from databases like dbNSFP using SnpEff.
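As a rough illustration of how annotation inputs could be assembled, here is a minimal sketch of a helper that builds a dictionary of named input files for a SnpEff-style rule. It is not the function added in this PR: the function name `snpeff_input_files`, the `results/filtered/` path layout, and the `snpeff`/`dbnsfp` config keys are all hypothetical, and the `.tbi` index path assumes a bgzipped, tabix-indexed database file.

```python
def snpeff_input_files(sample, config):
    """Return a dict of named input files for a hypothetical SnpEff rule.

    All paths and config keys here are illustrative assumptions,
    not the layout used by lcdb-wf.
    """
    # The VCF to annotate (hypothetical path pattern).
    inputs = {"vcf": f"results/filtered/{sample}.vcf.gz"}

    # Optionally attach an external annotation database such as dbNSFP.
    dbnsfp = config.get("snpeff", {}).get("dbnsfp")
    if dbnsfp:
        inputs["dbnsfp"] = dbnsfp
        # Assumes the database is tabix-indexed alongside the data file.
        inputs["dbnsfp_tbi"] = dbnsfp + ".tbi"
    return inputs


# With a database configured, the dict gains the database and its index.
with_db = snpeff_input_files("sampleA", {"snpeff": {"dbnsfp": "refs/dbNSFP.txt.gz"}})
# Without one, only the VCF is required.
without_db = snpeff_input_files("sampleA", {})
```

In a Snakefile, a dictionary like this would typically be returned from an input function and expanded with Snakemake's `unpack()`, so the rule can refer to `input.vcf` and, when present, `input.dbnsfp`.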
The workflow also runs MultiQC to aggregate QC checks on input FASTQ data, variant calling metrics, and annotation summary files.
Test data for variant calling have been generated and are hosted at https://github.com/lcdb/lcdb-wf-variant-calling-test-data. CircleCI runs the workflow on these test data to check the conda environments and workflow execution whenever the workflow changes. This guards against deprecations and against bugs introduced by future updates.
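To make the idea of an automated output check concrete, here is a minimal sketch of the kind of sanity check a CI run could apply to a VCF produced from the test data. It is an assumption about what such a check might look like, not the test code in this PR; the function name `summarize_vcf` and the example records are hypothetical.

```python
def summarize_vcf(lines):
    """Return (has_fileformat, has_column_header, n_records) for VCF text lines."""
    has_fileformat = False
    has_column_header = False
    n_records = 0
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            # The VCF spec requires the fileformat line to come first.
            if i == 0 and line.startswith("##fileformat=VCF"):
                has_fileformat = True
        elif line.startswith("#CHROM"):
            has_column_header = True
        else:
            # Anything that is not a header line is counted as a variant record.
            n_records += 1
    return has_fileformat, has_column_header, n_records


# Tiny hand-written example; real test outputs would be read from a file.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n",
]
summary = summarize_vcf(example)
```

A check like this only confirms that the output is structurally a VCF with at least one record; it says nothing about whether the calls themselves are correct.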