Showing 2 changed files with 101 additions and 14 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
# CoverageProfile WDL Workflow | ||
|
||
## Overview | ||
|
||
The `coverageProfile` workflow calculates the depth of coverage of an input sample and optionally visualizes it. It supports two tools for depth calculation: Samtools and DepthOfCoverage (GATK). Visualization is currently available only for the Samtools branch and only for exome data. | ||
|
||
## Inputs | ||
|
||
- `String sampleName` - The name of the sample. | ||
- `String coverageTool` - The tool to use for coverage calculation. Default is "Samtools". | ||
- `File alignedBam` - BAM file with aligned reads. | ||
- `File alignedBamIndex` - Index file for the BAM. | ||
- `File referenceFasta` - Reference genome in FASTA format. | ||
- `File referenceDict` - Dictionary file for the reference genome. | ||
- `File referenceFai` - Index file for the reference genome. | ||
- `File intervals` - Interval list file (converted to BED format for the Samtools branch). | ||
- `File? interval_GCcontent_track` - (Optional) GC content track for visualization. | ||
- `Int MinBaseQuality` - Minimum base quality for coverage calculation. Default is 20. | ||
- `Int MinMappingQuality` - Minimum mapping quality for coverage calculation. Default is 20. | ||
- `Boolean visualise_coverage` - Whether to visualize the coverage. Default is false. | ||
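
The inputs above can be collected into an inputs JSON file for a WDL runner. The following is a minimal sketch, assuming the common `<workflow name>.<input name>` key convention; every path and the sample name are hypothetical placeholders, and the optional `interval_GCcontent_track` is left out.

```python
import json

# Hypothetical sample name and file paths; replace them with real values.
# Keys assume the "<workflow name>.<input name>" convention for WDL inputs.
inputs = {
    "coverageProfile.sampleName": "sample1",
    "coverageProfile.coverageTool": "Samtools",
    "coverageProfile.alignedBam": "/data/sample1.bam",
    "coverageProfile.alignedBamIndex": "/data/sample1.bam.bai",
    "coverageProfile.referenceFasta": "/ref/genome.fasta",
    "coverageProfile.referenceDict": "/ref/genome.dict",
    "coverageProfile.referenceFai": "/ref/genome.fasta.fai",
    "coverageProfile.intervals": "/ref/targets.interval_list",
    "coverageProfile.MinBaseQuality": 20,
    "coverageProfile.MinMappingQuality": 20,
    "coverageProfile.visualise_coverage": False,
}


def to_json(values):
    """Serialise the inputs dictionary as indented JSON text."""
    return json.dumps(values, indent=2)


if __name__ == "__main__":
    print(to_json(inputs))
```

The printed text can be saved to a file and passed to the workflow runner as its inputs file.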
|
||
## Outputs | ||
|
||
- `File? DepthOfCoverageIntervalCov` - Depth of coverage interval summary (only for DepthOfCoverage tool). | ||
- `Float? DepthOfCoverageMeanCoverage` - Mean coverage from DepthOfCoverage tool. | ||
- `File? SamtoolsDepthProfile` - Depth profile generated by Samtools. | ||
- `File? SamtoolsCovProfilePlot` - Coverage profile plot generated by visualization task. | ||
- `Float? SamtoolsAvgCovMean` - Mean coverage derived from the Samtools depth profile. | ||
|
||
|
||
## Workflow | ||
|
||
### Main Workflow | ||
|
||
1. **Choose Tool:** Based on the `coverageTool` input, the workflow branches to use either Samtools or DepthOfCoverage. | ||
|
||
2. **Samtools Workflow:** | ||
- Convert intervals to BED format using `IntervalListToBed`. | ||
- Calculate depth using `SamtoolsDepth`. | ||
- If `visualise_coverage` is true, visualize the coverage using `CovProfileViz`. The visualization requires the depth profile and a GC content track. It currently works only with exome data. | ||
|
||
3. **DepthOfCoverage Workflow:** | ||
- Calculate depth using `DepthOfCoverage`. | ||
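
The branching described above can be summarised as a small Python sketch that lists which tasks run for a given combination of inputs. This illustrates the control flow only and is not the workflow's WDL code; `plan_tasks` is a hypothetical helper, and the selector string `"DepthOfCoverage"` is an assumption, since only the default value `"Samtools"` is documented.

```python
def plan_tasks(coverage_tool="Samtools", visualise_coverage=False):
    """Return the task names that would be called, in order."""
    if coverage_tool == "Samtools":
        tasks = ["IntervalListToBed", "SamtoolsDepth"]
        # Visualization exists only on the Samtools branch.
        if visualise_coverage:
            tasks.append("CovProfileViz")
        return tasks
    # Assumed selector value for the GATK branch.
    if coverage_tool == "DepthOfCoverage":
        return ["DepthOfCoverage"]
    raise ValueError(f"Unsupported coverageTool: {coverage_tool!r}")


if __name__ == "__main__":
    print(plan_tasks("Samtools", visualise_coverage=True))
```

Note that `visualise_coverage` has no effect on the DepthOfCoverage branch in this sketch, which matches the description above.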
|
||
### Tasks | ||
|
||
#### DepthOfCoverage | ||
|
||
Calculates the depth of coverage using GATK's DepthOfCoverage tool. | ||
|
||
**Inputs:** | ||
- Various inputs for the BAM file, reference genome, intervals, and quality thresholds. | ||
|
||
**Outputs:** | ||
- `File sample_interval_summary` - Interval summary. | ||
- `Float mean_coverage` - Mean coverage. | ||
|
||
#### IntervalListToBed | ||
|
||
Converts an interval list to a BED file. | ||
|
||
**Inputs:** | ||
- `File intervals` - Intervals file. | ||
|
||
**Outputs:** | ||
- `File bed_intervals` - Converted BED file. | ||
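
To make the conversion concrete, here is a minimal Python sketch of the coordinate change involved. It assumes the input is a Picard-style interval list (header lines starting with `@`, then tab-separated sequence, start, end, strand, and name, with 1-based inclusive coordinates). The task itself may well wrap an existing tool; `interval_list_to_bed` is a hypothetical helper, and the score value `0` is a placeholder.

```python
def interval_list_to_bed(lines):
    """Convert interval-list lines to 6-column BED lines."""
    bed = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the SAM-style header block.
        if not line or line.startswith("@"):
            continue
        chrom, start, end, strand, name = line.split("\t")[:5]
        # Interval lists are 1-based and inclusive; BED is 0-based and
        # half-open, so only the start coordinate shifts by one.
        bed_start = str(int(start) - 1)
        bed.append("\t".join([chrom, bed_start, end, name, "0", strand]))
    return bed


if __name__ == "__main__":
    example = ["@HD\tVN:1.6", "chr1\t100\t200\t+\ttarget_1"]
    print("\n".join(interval_list_to_bed(example)))
```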
|
||
#### SamtoolsDepth | ||
|
||
Calculates depth of coverage using Samtools. | ||
|
||
**Inputs:** | ||
- Various inputs for the BAM file, intervals, and quality thresholds. | ||
|
||
**Outputs:** | ||
- `File depth_profile` - Depth profile. | ||
|
||
#### CovProfileViz | ||
|
||
Visualizes the coverage profile. | ||
|
||
**Inputs:** | ||
- `File SamtoolsDepthProfile` - Depth profile from Samtools. | ||
- `File? GCcontentTrack` - Optional GC content track. | ||
- Other parameters for customization. | ||
|
||
**Outputs:** | ||
- `File cov_profile_plot` - Coverage profile plot. | ||
- `File avg_chr_cov_per_chr_plot` - Plot of average coverage per chromosome. | ||
- `Float avg_chr_cov_std` - Standard deviation of average coverage per chromosome. | ||
- `File avg_chr_cov_per_chr` - Average coverage per chromosome. | ||
- `Float avg_cov_mean` - Mean average coverage. | ||
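
The summary statistics listed above can be illustrated with a short Python sketch over `samtools depth` output, which has three tab-separated columns: chromosome, position, and depth. The exact definitions used by `CovProfileViz` are not stated here, so this sketch assumes that the mean and standard deviation are taken across the per-chromosome averages and that the standard deviation is the population form; `summarise_depth` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarise_depth(lines):
    """Return (per-chromosome mean depth, mean of means, std of means)."""
    totals = defaultdict(lambda: [0, 0])  # chrom -> [depth sum, positions]
    for line in lines:
        if not line.strip():
            continue
        chrom, _pos, depth = line.split("\t")[:3]
        totals[chrom][0] += int(depth)
        totals[chrom][1] += 1
    if not totals:
        raise ValueError("no depth records found")
    per_chr = {chrom: s / n for chrom, (s, n) in totals.items()}
    values = list(per_chr.values())
    # Assumption: population standard deviation across chromosomes.
    return per_chr, mean(values), pstdev(values)


if __name__ == "__main__":
    demo = ["chr1\t1\t10", "chr1\t2\t20", "chr2\t1\t30"]
    print(summarise_depth(demo))
```

By default `samtools depth` omits positions with zero coverage, so averages computed this way cover only reported positions unless the profile was generated with all positions included.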
|