dorado sometimes output single base reads with '*' quality #344

Closed
ymcki opened this issue Aug 20, 2023 · 3 comments

Comments

@ymcki

ymcki commented Aug 20, 2023

This happened with 0.2.1. Doesn't seem to happen much with 0.3.x.

Anyway, such reads can cause trouble with pysam and potentially other software.

pysam-developers/pysam#1211

Maybe the best approach is simply not to output single-base reads? They just make the average/median read length stats look bad.

@tijyojwad
Collaborator

Dorado shouldn't be outputting single base reads at all. We filter out all reads less than 5 bp long by default - https://github.com/nanoporetech/dorado/blob/master/dorado/utils/parameters.h#L31

Are you seeing this in simplex or duplex?

@ymcki
Author

ymcki commented Aug 21, 2023

Good. I am only seeing it in 0.2.1 duplex output via the duplex_tools workflow. It is great that it is fixed in the newer version.

I think I will just have to filter them out myself when using an older version.
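
For anyone needing the same workaround, here is a minimal sketch of such a filter. It assumes the reads are available as SAM text (for example piped from `samtools view -h`). It avoids pysam because of the parsing problem linked above. The function name `filter_short_reads` and the constant `MIN_READ_LEN` are hypothetical. The threshold of 5 mirrors the default mentioned earlier in this thread and is not read from dorado itself.

```python
import sys

# Assumed threshold: mirrors the "less than 5 bp" default mentioned in this thread.
MIN_READ_LEN = 5


def filter_short_reads(lines, min_len=MIN_READ_LEN):
    """Yield SAM lines, dropping records whose SEQ field is shorter than min_len.

    Header lines (starting with '@') and lines that do not look like
    SAM records (fewer than 10 tab-separated fields) pass through unchanged.
    """
    for line in lines:
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            yield line
            continue
        seq = fields[9]  # SEQ is the 10th mandatory SAM column
        # A SEQ of "*" (sequence not stored) has length 1 and is dropped too.
        if len(seq) < min_len:
            continue
        yield line


# Example use as a stream filter (not executed here):
#   sys.stdout.writelines(filter_short_reads(sys.stdin))
```

If the example use is enabled in a script saved as, say, `filter_short.py` (a hypothetical name), it could sit in a pipeline such as `samtools view -h in.bam | python filter_short.py | samtools view -b -o filtered.bam -`.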

@tijyojwad
Collaborator

Got it! May I ask why you're using 0.2.1 and not one of the newer versions?

2 participants