Commit
Fix reading UTF-8 encoded sample names when char is signed
The trick used in bcf_hdr_parse_sample_line() to rapidly find tabs and newlines could be defeated by UTF-8 characters outside the Basic Latin range on platforms where "char" is signed (like x86). It's currently not clear if VCF intends to allow these, but the 4.3 specification does allow UTF-8 and it's easy enough to support.

Fix by casting to unsigned when making the comparison.

Modifies formatcols.vcf to include a UTF-8 character for a round-trip test.

Fixes samtools/bcftools#1408
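
To illustrate the failure mode described above, here is a minimal sketch in C. It is not the htslib code: the helper names `field_len_signed` and `field_len_unsigned` are hypothetical, and the loop assumes the "trick" amounts to skipping every byte greater than '\n' so that tab, newline and NUL all end the scan. The first helper forces a signed interpretation so the bug shows up regardless of the platform's default for "char".

```c
#include <stddef.h>

/* Hypothetical helper (not from htslib): length of the leading field,
 * scanning while the byte compares greater than '\n'. Bytes are read as
 * signed, mimicking plain "char" on platforms where it is signed. */
static size_t field_len_signed(const char *s)
{
    const signed char *p = (const signed char *)s;
    size_t n = 0;
    /* UTF-8 lead and continuation bytes (0x80..0xFF) are negative here,
     * so the comparison fails and the scan stops in the middle of a name. */
    while (p[n] > '\n')
        n++;
    return n;
}

/* Same scan with the bytes read as unsigned, which is the kind of cast
 * the commit message describes. Only '\t', '\n', NUL and other control
 * bytes at or below 0x0A end the field. */
static size_t field_len_unsigned(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n = 0;
    while (p[n] > '\n')
        n++;
    return n;
}
```

For a sample name such as "séq" (bytes 's', 0xC3, 0xA9, 'q') followed by a tab, the signed variant stops at the 0xC3 byte while the unsigned variant runs to the tab. For pure ASCII names the two agree, which is presumably why the problem went unnoticed.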