In SAM, in general a QUAL field of * indicates that base qualities are not available. Historically no-one has particularly cared about single-base reads and it is unspecified whether in that case * indicates unavailable or a base quality of 9. That ambiguity is samtools/hts-specs#715, on which you may wish to express an opinion.
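To see where the "base quality of 9" reading comes from: SAM stores each base quality as a printable character using the Phred+33 encoding, so a literal `*` character decodes to 9. The helper name below is hypothetical and only illustrates the arithmetic.

```python
# SAM encodes a Phred quality q as the ASCII character chr(q + 33).
PHRED_OFFSET = 33


def phred_from_char(ch: str) -> int:
    """Decode one QUAL character into its Phred score (hypothetical helper)."""
    return ord(ch) - PHRED_OFFSET


# '*' is ASCII 42, so as a quality character it means 42 - 33 = 9.
print(phred_from_char("*"))
```

For a one-base read, the QUAL string `*` is therefore textually identical to the "qualities unavailable" marker, which is the ambiguity described above.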
However there is no such ambiguity in BAM. If the “ubam file” you are actually feeding to this script is a BAM file, then getting None here indicates that the record really does have QUAL absent. (However the qs:i:9 tag, if it is indeed an average base quality score, suggests that this data originated from a SAM file that intended QUAL = * to mean base quality 9…)
Pysam is not crashing here. Your script is crashing when read.query_qualities returns None, which your script is not dealing with. This property is None when the QUAL field is absent, and your script probably needs to deal with this possibility.
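A minimal sketch of one way a script could guard against this, assuming it wants a mean base quality per read. The function name `mean_base_quality` is hypothetical, and the stand-in objects below only mimic the `query_qualities` attribute so the example runs without a BAM file.

```python
from types import SimpleNamespace


def mean_base_quality(read):
    """Return the mean base quality, or None when QUAL is absent.

    `read` is anything with a `query_qualities` attribute, such as a
    pysam.AlignedSegment. The attribute is None when QUAL is absent.
    """
    quals = read.query_qualities
    if quals is None or len(quals) == 0:
        return None  # caller decides: skip the read, or fall back to a tag
    return sum(quals) / len(quals)


# Stand-ins for real reads (assumption: illustration only).
with_qual = SimpleNamespace(query_qualities=[9, 11, 10])
without_qual = SimpleNamespace(query_qualities=None)

for read in (with_qual, without_qual):
    mean_q = mean_base_quality(read)
    if mean_q is None:
        continue  # e.g. drop reads that carry no base qualities
    print(mean_q)
```

Whether to skip such reads or substitute a value is a decision for the analysis; the point is only that the `None` case is handled explicitly.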
Are these single-base reads important in your analysis, or are they a handful of degenerate reads that could be filtered out without adverse effect? Are there other single-base reads present in your data with other characters in their QUAL fields?
Thanks for your reply. I didn't know that the spec leaves this situation undefined. Intuitively, I would read it as a way to specify a single-base read with quality 9.
I encountered this situation while using ONT's dorado basecaller. Anyway, I will relay this ambiguity to the dorado basecaller team and see what they want to do with it.
I have an unmapped BAM with many entries whose basecalls consist of only one base and a quality of "*".
pysam 0.21.0 then crashes whenever it reaches a line like that.
The simple program that allows you to replicate the problem is:
This is the content of the ubam file, shown in SAM format, that I used to reproduce the error:
My guess is that pysam treats a lone "*" as an empty QUAL field rather than as a single base with quality character "*".