
Releases: pdimens/harpy

1.16.3

30 Jan 14:59
25fa6ed

Bugs Fixed

  • fixed a potential error in zgrep-based sample demultiplexing caused by an insufficiently precise search pattern #188
  • the last plot in the harpy qc report is now bounded between 0 and 100

1.16.2

29 Jan 16:16

Changes

  • inline_to_haplotag.py uses pysam for FASTQ handling
    • reads shorter than the barcode length are no longer output
  • inline_to_haplotag.py leverages executemany() instead of many individual execute() SQL calls (see the sketch after this list)
    • more initial RAM overhead, but overall performance should be much better
  • progress bars should be better in Jupyter
  • wording is more consistent in places
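
A minimal sketch of the executemany() change, using a hypothetical barcode table; the schema and queries here are illustrative, not inline_to_haplotag.py's actual code:

```python
import sqlite3

# Hypothetical table; names are illustrative, not inline_to_haplotag.py's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE barcodes (nucleotide TEXT PRIMARY KEY, haplotag TEXT)")

records = [("ATGCATGCATGC", "A01C01B01D01"), ("GGGGACGTACGT", "A01C01B01D02")]

# One execute() per record: many passes through the SQL layer
for nucleotide, haplotag in records:
    conn.execute("INSERT INTO barcodes VALUES (?, ?)", (nucleotide, haplotag))

# executemany(): the whole batch is submitted in a single call, which costs
# more memory up front but is considerably faster for large barcode lists
conn.executemany("INSERT OR REPLACE INTO barcodes VALUES (?, ?)", records)
conn.commit()
```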

1.16.1

22 Jan 15:16

Fixes

  • All error titles are now standardized to start with lowercase
  • restored the manual deconvolution script

1.16.0

21 Jan 15:18
81f254c

New

  • turns out LEVIATHAN doesn't do any kind of internal deconvolution, so a new shim script was added to the leviathan workflow to deconvolve the BX tags in the input alignments based on the [already deconvolved] MI tags.
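
A minimal sketch of what such a deconvolution shim might look like with pysam, assuming the MI value is simply appended to the BX barcode; the actual script in the leviathan workflow may format the tags differently:

```python
import pysam

# Sketch only: make identical barcodes from different molecules distinct by
# appending the (already deconvolved) MI tag to the BX tag. Filenames and the
# "BX-MI" suffix format are assumptions, not the workflow's exact convention.
with pysam.AlignmentFile("input.bam", "rb") as bam_in, \
     pysam.AlignmentFile("deconvolved.bam", "wb", template=bam_in) as bam_out:
    for record in bam_in:
        if record.has_tag("BX") and record.has_tag("MI"):
            bx = record.get_tag("BX")
            mi = record.get_tag("MI")
            record.set_tag("BX", f"{bx}-{mi}", value_type="Z")
        bam_out.write(record)
```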

Breaking

  • this has been a long time coming: --conda was swapped for --container, meaning conda-based workflow dependency handling is now the default and you can opt in to the container-based method
  • .conda has been renamed .environments and now houses the conda environments and/or singularity container, simplifying where harpy stores its software dependencies

1.15.0

08 Jan 18:03
b2e2796

New

  • Quarto has replaced RMarkdown/Flexdashboard
    • no changes for the user to worry about, but the reports will look a little different
  • NXX plots for phasing report
  • Introduced new scripts for development installation using Conda and Pixi
  • Harpy's console output during runtime is sleeker now

Internal

  • Streamlined Snakemake command execution for the different workflows
  • Improved logging and error handling in various modules
  • molecule_coverage.py now uses a sqlite3 backend, which dramatically reduces the amount of required RAM (see the sketch after this list)
  • Refactored a few Snakemake workflow files
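
A rough sketch of why the sqlite3 backend lowers memory use: per-molecule intervals are streamed into an on-disk table and aggregated in SQL instead of being accumulated in a Python dictionary. The schema and function names here are hypothetical, not molecule_coverage.py's actual code:

```python
import sqlite3

# Hypothetical schema: one row per observed read span of a molecule (MI tag).
conn = sqlite3.connect("molecules.sqlite3")
conn.execute("CREATE TABLE IF NOT EXISTS molecules (mi INTEGER, start INTEGER, stop INTEGER)")

def add_interval(mi: int, start: int, stop: int) -> None:
    # Each interval lands on disk, so RAM stays flat regardless of molecule count.
    conn.execute("INSERT INTO molecules VALUES (?, ?, ?)", (mi, start, stop))

def molecule_spans():
    # Aggregation happens in SQL; only the summary rows come back into Python.
    return conn.execute("SELECT mi, MIN(start), MAX(stop) FROM molecules GROUP BY mi").fetchall()
```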

Bug Fixes

  • fixed a small bug that reported the wrong value in one of the valueboxes

1.14.3

20 Dec 15:35
f3ab27c

Bugs fixed

  • restored the haplotag barcode script that went missing after squashing commits and broke demultiplexing

Changed

  • added rule priorities to some workflows so they prioritize creating the output files over calculating metrics and writing reports
    • this means that, for example, align bwa will prioritize creating all the output bam files, rather than running a single sample through everything

Full Changelog: 1.14.2...1.14.3

1.14.2

13 Dec 15:32

Changed

This is a bugfix for #176 with a better help string for downsample, clarifying that the BX:Z tag must be the last tag in the record

Added

  • bx_to_end.py to help preprocess FASTQ/BAM files for downsample
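
A hedged sketch of the idea behind such preprocessing for BAM input, using pysam to move the BX:Z tag to the end of each record; this is not bx_to_end.py itself and the filenames are placeholders:

```python
import pysam

with pysam.AlignmentFile("input.bam", "rb") as bam_in, \
     pysam.AlignmentFile("bx_terminal.bam", "wb", template=bam_in) as bam_out:
    for record in bam_in:
        tags = record.get_tags(with_value_type=True)
        bx = [t for t in tags if t[0] == "BX"]
        others = [t for t in tags if t[0] != "BX"]
        record.set_tags(others + bx)  # the BX tag, if present, goes last
        bam_out.write(record)
```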

1.14.1

11 Dec 20:11
2e8c1b0

Never too proud to admit I was wrong. I didn't want downsample to be a snakemake workflow, but with the increased complexity of what I wanted it to do, I found myself writing an increasingly complex python script that was essentially doing all the things Snakemake already does. So:

New

  • Introduced a command-line utility for extracting barcodes from SAM/BAM files
  • Enhanced phasing statistics reporting with new metrics (N50, N75, N90); see the sketch after this list
  • LRez is now part of the main Harpy installation and accessible to the user
  • adapter removal in the qc module now accepts an argument, one of:
    • auto for automatic adapter detection
    • a FASTA file of adapters
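
For the NXX metrics, a generic worked example of how N50/N75/N90 are computed from phase-block lengths; this is the standard definition, not necessarily the exact code in Harpy's report:

```python
def nxx(lengths: list[int], xx: int) -> int:
    """Length of the block at which the cumulative sum of blocks, sorted
    largest-first, first reaches xx% of the total length."""
    threshold = sum(lengths) * xx / 100
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return 0

blocks = [50_000, 40_000, 30_000, 20_000, 10_000]         # phase block sizes (bp)
print(nxx(blocks, 50), nxx(blocks, 75), nxx(blocks, 90))  # 40000 30000 20000
```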

Changed

  • Downsampling is now a snakemake workflow
  • downsample handles invalid barcodes in a much more intuitive (and sensible) way

Full Changelog: 1.14...1.14.1

1.14

06 Dec 21:42
e1867c2

New

  • added a convenience script, separate_singletons, to split a BAM file into singletons and non-singletons
  • harpy downsample module to downsample FASTQ/BAM by barcodes
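
A hedged sketch of barcode-level downsampling, assuming the approach is to sample a subset of barcodes and keep every alignment carrying one of them; the harpy downsample workflow (and its handling of invalid barcodes) may differ:

```python
import random
import pysam

random.seed(42)

# Pass 1: collect the barcodes present in the file (filenames are placeholders)
with pysam.AlignmentFile("input.bam", "rb") as bam:
    barcodes = {rec.get_tag("BX") for rec in bam if rec.has_tag("BX")}

kept = set(random.sample(sorted(barcodes), k=len(barcodes) // 2))  # keep ~50% of barcodes

# Pass 2: write only alignments whose barcode was sampled
with pysam.AlignmentFile("input.bam", "rb") as bam_in, \
     pysam.AlignmentFile("downsampled.bam", "wb", template=bam_in) as bam_out:
    for rec in bam_in:
        if rec.has_tag("BX") and rec.get_tag("BX") in kept:
            bam_out.write(rec)
```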

Breaking changes

  • singletons are now calculated such that both reads of a paired-end pair count as "one read" for a barcode (see the sketch below)
    • which means unpaired reads now contribute properly to this value
    • overall, this is a more accurate way of calculating this metric
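
An illustration of that counting rule, assuming reads are tallied per barcode by query name so a read pair contributes once and unpaired reads also count; this is not the workflow's exact code:

```python
from collections import defaultdict
import pysam

names_per_barcode = defaultdict(set)

with pysam.AlignmentFile("sample.bam", "rb") as bam:  # filename is a placeholder
    for rec in bam:
        if rec.is_secondary or rec.is_supplementary or not rec.has_tag("BX"):
            continue
        # R1 and R2 of a pair share a query name, so the pair counts once
        names_per_barcode[rec.get_tag("BX")].add(rec.query_name)

singletons = sum(1 for names in names_per_barcode.values() if len(names) == 1)
print(f"{singletons} singleton barcodes")
```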

Fixes

  • separate_validbx has a breaking usage change; however, this script is not used by any of the workflows, so there should be no appreciable difference
  • alignment reports have text that clarifies which math applies to non-singletons
  • multiplex reads (i.e. reads that aren't linked-read singletons) are now just referred to as non-singletons

1.13

26 Nov 15:39

New Features

  • new view command to view workflow log, snakefile, or configuration file.
  • conda environment recipes are now stored in outdir/workflow/envs for more self-contained workflow directories
    • also improves workflow-specific troubleshooting

Breaking Changes

  • stitchparams has been renamed imputeparams

Internal

  • improved handling of conda environments across various commands, allowing for better configuration and dependency management.
  • Updated environment directory paths for better organization and clarity across all workflows
  • local simuG replaced with conda installation
    • Removed dependency on the simuG.pl script for several simulation workflows, streamlining the execution process
    • renamed rules and improved the directory structure for simulate variants

Bug Fixes

  • Improved regular expression handling in file processing to enhance clarity and prevent issues.
  • Corrected typos in align_stats.Rmd and routines for handling no valid barcodes

Full Changelog: 1.12...1.13