Commit e3f0ff1

Doc updates for 1.14.9 release.
1 parent 4516dd0 commit e3f0ff1

3 files changed: +128 -40 lines changed

CHANGES

+72
@@ -1,3 +1,75 @@
+Version 1.14.9 (9th February 2017)
+--------------
+
+Updates:
+
+* BAM: Added CRC checking. Bizarrely this was absent here and in most
+  other BAM implementations too. Pure BAM decode of an uncompressed
+  BAM is around 9% slower and compressed BAM to compressed BAM is
+  almost identical. The most significant hit is reading uncompressed
+  BAM (and doing nothing else) which is 120% slower as CRC dominates.
+  Options are available to disable the CRC checking incase this is an
+  issue (scramble -!).
+
+* CRAM: Now supports bgziped fasta references.
+
+* CRAM/SAM: Headers are now kept in the same basic type order while
+  transcoding. (Eg all @PG before all @SQ, or vice versa, depending on
+  input ordering.)
+
+* CRAM: Compression level 1 is now faster but larger. (The old -1 and
+  -2 were too similar.)
+
+* CRAM: Improved compression efficiency in some files, when switching
+  from sorted to unsorted data.
+
+* CRAM: Speedups and improvements to memory handling under GNU
+  malloc. See the scram_init() function.
+
+* CRAM: Sped up the rANS codecs on x86_64 platforms (assembly code).
+
+* CRAM: Improved multi-threading performance during decode.
+
+* CRAM: Block CRC checks are now only done when the block is used,
+  speeding up multi-threading and tools that do not decode all blocks
+  (eg flagstat).
+
+* Scramble -g and -G options to generate and reuse bgzip indices when
+  reading and writing BAM files.
+
+* Scramble -q option to omit updating the @PG header records.
+
+* Experimental cram_filter tool has been added, to rapidly produce
+  cram subsets.
+
+* Migrated code base to git. Use github for primary repository.
+  Dropped ChangeLog file (recommend git clone and "git log
+  --abbrev-commit --pretty=medium --stat" for an svn similar log
+  style).
+
+* BAM: minor improvements to gcc SIMD auto-vectorisation.
+
+* Minor improvements to dstring memory usage (potentially reducing
+  memory usage when loading very large BAM headers).
+
+Bug fixes:
+
+* BAM: Fixed the bin value calculation for placed but unmapped reads.
+
+* CRAM: Fixed file descriptor leak in refs_load_fai().
+
+* CRAM: Fixed a crash in MD5 calculation for sequences beyond the
+  reference end.
+
+* CRAM: Bug fixes when encoding malformed @SQ records.
+
+* CRAM: Fixed a rare renormalisation bug in rANS codec.
+
+* Fixed tests so make -j worked.
+
+* Removed ancient, broken and unused popen() code.
+
+
 Version 1.14.8 (22nd April 2016)
 --------------
 
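Note on the BAM CRC entry above: each BGZF block in a BAM file ends with a gzip-style trailer holding a little-endian CRC32 of the uncompressed data followed by its length (ISIZE). The sketch below shows what such a check involves. It is only an illustration and is not taken from io_lib: the names crc32_bytes, le32 and block_crc_ok are hypothetical, and a real implementation would more likely call zlib's crc32() than use this bit-at-a-time loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), the checksum used by
 * gzip. Slow but dependency-free; zlib's crc32() is the usual choice. */
static uint32_t crc32_bytes(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Read a little-endian 32-bit value, as stored in the block trailer. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 1 if the inflated payload matches the
 * 8-byte trailer (CRC32 then ISIZE), 0 otherwise. */
static int block_crc_ok(const unsigned char *data, size_t len,
                        const unsigned char trailer[8])
{
    if (le32(trailer + 4) != (uint32_t)len)
        return 0;                      /* length mismatch */
    return le32(trailer) == crc32_bytes(data, len);
}
```

Because the checksum covers every uncompressed byte, its cost is most visible when nothing else is done with the data, which is consistent with the slowdown figures quoted in the entry.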

Makefile.am

+1-1
@@ -113,7 +113,7 @@ pkginclude_HEADERS = \
 
 bin_SCRIPTS = io_lib-config
 
-EXTRA_DIST = README COPYRIGHT ChangeLog CHANGES man options.mk bootstrap \
+EXTRA_DIST = README.md COPYRIGHT CHANGES man options.mk bootstrap \
 	docs/ZTR_format docs/Hash_File_Format io_lib-config.in io_lib/os.h.in
 
 dist-hook:

README.md

+55-39
@@ -1,15 +1,20 @@
-Io_lib: Version 1.14.8
+Io_lib: Version 1.14.9
 =======================
 
 Io_lib is a library of file reading and writing code to provide a general
-purpose trace file (and Experiment File) reading interface. The programmer
-simply calls the (eg) read_reading to create a "Read" C structure with the
-data loaded into memory. It has been compiled and tested on a variety
-of unix systems, MacOS X and MS Windows.
+purpose SAM/BAM/CRAM, trace file (and Experiment File) reading
+interface. Programmatically {S,B,CR}AM can be manipulated using the
+scram_*() API functions while DNA Chromatogram ("trace") files can be
+read using the read_reading() function.
+
+It has been compiled and tested on a variety of unix systems, MacOS X
+and MS Windows.
 
 The directories below here contain the io_lib code. These support the
 following file formats:
 
+	SAM/BAM sequence files
+	CRAM sequence files
 	SCF trace files
 	ABI trace files
 	ALF trace files
@@ -18,62 +23,73 @@ following file formats:
 	SRF trace archives
 	Experiment files
 	Plain text files
-	SAM/BAM sequence files
-	CRAM sequence files
 
 These link together to form a single "libstaden-read" library supporting
 all the file formats via a single read_reading (or fread_reading or
 mfread_reading) function call and analogous write_reading functions
 too. See the file include/Read.h for the generic 'Read' structure.
 
-See the CHANGES for a summary of older updates or ChangeLog for the
+See the CHANGES for a summary of older updates or git logs for the
 full details.
 
-Version 1.14.8 (22nd April 2016)
+Version 1.14.9 (9th February 2017)
 --------------
 
-* SAM: Small speed up to record parsing.
+Updates:
+
+* BAM: Added CRC checking. Bizarrely this was absent here and in most
+  other BAM implementations too. Pure BAM decode of an uncompressed
+  BAM is around 9% slower and compressed BAM to compressed BAM is
+  almost identical. The most significant hit is reading uncompressed
+  BAM (and doing nothing else) which is 120% slower as CRC dominates.
+  Options are available to disable the CRC checking incase this is an
+  issue (scramble -!).
+
+* CRAM: Now supports bgziped fasta references.
+
+* CRAM/SAM: Headers are now kept in the same basic type order while
+  transcoding. (Eg all @PG before all @SQ, or vice versa, depending on
+  input ordering.)
+
+* CRAM: Compression level 1 is now faster but larger. (The old -1 and
+  -2 were too similar.)
+
+* CRAM: Improved compression efficiency in some files, when switching
+  from sorted to unsorted data.
+
+* CRAM: Various speedups relating to memory handling,
+  multi-threaded performance and the rANS codec.
+
+* CRAM: Block CRC checks are now only done when the block is used,
+  speeding up multi-threading and tools that do not decode all blocks
+  (eg flagstat).
 
-* CRAM: Scramble now has -p and -P options to control whether to
-  force the BAM auxiliary sizes (8 vs 16 vs 32-bit integer quantities)
-  rather than reducing to smallest size required, and whether to
-  preserve the order of auxiliary tags including RG, NM and MD.
+* Scramble -g and -G options to generate and reuse bgzip indices when
+  reading and writing BAM files.
 
-  This latter option requires storing these values verbatim instead of
-  regenerating them on-the-fly, but note this only preserves tag order
-  with Scramble / Htslib. Htsjdk will still produce these fields out
-  of order.
+* Scramble -q option to omit updating the @PG header records.
 
-* CRAM no longer stores data in the CORE block, permitting greater
-  flexibility in choosing which fields to decode. (This change is
-  also mirrored in htslib and htsjdk.)
+* Experimental cram_filter tool has been added, to rapidly produce
+  cram subsets.
 
-* CRAM: ref.fai files in a different order to @SQ headers should now
-  work correctly.
+* Migrated code base to git. Use github for primary repository.
 
-* CRAM required-fields parameters no longer forces quality decoding
-  when asking for sequence.
+Bug fixes:
 
-* CRAM: More robustness / safety checks during decoding; itf8 bounds
-  checks, running out of memory, bounds checks in BETA codec, and
-  more.
+* BAM: Fixed the bin value calculation for placed but unmapped reads.
 
-* CRAM auto-generated read names are consistent regardless of range
-  queries. They also now match those produced by htslib.
+* CRAM: Fixed file descriptor leak in refs_load_fai().
 
-* A few compiler warnings in cram_dump / cram_size have gone away.
-  Many small CRAM code tweaks to aid comparisons to htslib. It should
-  also be easier to build under Microsoft Visual Studio (although no
-  project file is provided).
+* CRAM: Fixed a crash in MD5 calculation for sequences beyond the
+  reference end.
 
-* CRAM: the rANS codec should now be slightly faster at decoding.
+* CRAM: Bug fixes when encoding malformed @SQ records.
 
-* CRAM bug fix: removed potential (but unobserved) possibility of
-  8-bit quantities stored as a 16-bit value in BAM being converted
-  incorrectly within CRAM.
+* CRAM: Fixed a rare renormalisation bug in rANS codec.
 
-* SAM bug fix: no more complaining about "unknown" sort order.
+* Fixed tests so make -j worked.
 
+* Removed ancient, broken and unused popen() code.
 
 
 Building
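Note on the "bin value calculation for placed but unmapped reads" fix listed above: the BAM bin comes from the reg2bin() function given in the SAM/BAM specification, and, as I understand that specification, an unmapped read that still carries a position is treated as an alignment of length one. The sketch below shows that rule. The reg2bin() body follows the specification; the wrapper bam_bin_for and its parameters are hypothetical names used for illustration, and none of this is copied from io_lib's actual fix.

```c
/* reg2bin() as given in the SAM/BAM specification: the smallest bin that
 * fully contains the zero-based, half-open interval [beg, end). */
static int reg2bin(int beg, int end)
{
    --end;
    if (beg >> 14 == end >> 14) return ((1 << 15) - 1) / 7 + (beg >> 14);
    if (beg >> 17 == end >> 17) return ((1 << 12) - 1) / 7 + (beg >> 17);
    if (beg >> 20 == end >> 20) return ((1 << 9) - 1) / 7 + (beg >> 20);
    if (beg >> 23 == end >> 23) return ((1 << 6) - 1) / 7 + (beg >> 23);
    if (beg >> 26 == end >> 26) return ((1 << 3) - 1) / 7 + (beg >> 26);
    return 0;
}

/* Hypothetical helper: bin for a read at 1-based position pos1 covering
 * ref_len reference bases. A placed but unmapped read has no CIGAR, so
 * ref_len is 0 and the read is treated as covering a single base. */
static int bam_bin_for(int pos1, int ref_len)
{
    int beg = pos1 - 1;                    /* BAM positions are 0-based */
    int len = ref_len > 0 ? ref_len : 1;   /* length-one rule */
    return reg2bin(beg, beg + len);
}
```

Under this rule a placed but unmapped read always lands in one of the smallest (16 kbp) bins, so it sorts and indexes alongside its mapped mate.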
