- Io_lib: Version 1.14.8
+ Io_lib: Version 1.14.9
=======================

Io_lib is a library of file reading and writing code to provide a general
- purpose trace file (and Experiment File) reading interface. The programmer
- simply calls the (eg) read_reading to create a "Read" C structure with the
- data loaded into memory. It has been compiled and tested on a variety
- of unix systems, MacOS X and MS Windows.
+ purpose SAM/BAM/CRAM, trace file (and Experiment File) reading
+ interface. Programmatically {S,B,CR}AM can be manipulated using the
+ scram_*() API functions while DNA Chromatogram ("trace") files can be
+ read using the read_reading() function.
+
+ It has been compiled and tested on a variety of unix systems, MacOS X
+ and MS Windows.

The directories below here contain the io_lib code. These support the
following file formats:

+ SAM/BAM sequence files
+ CRAM sequence files
SCF trace files
ABI trace files
ALF trace files
@@ -18,62 +23,73 @@ following file formats:
SRF trace archives
Experiment files
Plain text files
- SAM/BAM sequence files
- CRAM sequence files

These link together to form a single "libstaden-read" library supporting
all the file formats via a single read_reading (or fread_reading or
mfread_reading) function call and analogous write_reading functions
too. See the file include/Read.h for the generic 'Read' structure.

- See the CHANGES for a summary of older updates or ChangeLog for the
+ See the CHANGES for a summary of older updates or git logs for the
full details.

- Version 1.14.8 (22nd April 2016)
+ Version 1.14.9 (9th February 2017)
--------------

- * SAM: Small speed up to record parsing.
+ Updates:
+
+ * BAM: Added CRC checking. Bizarrely this was absent here and in most
+ other BAM implementations too. Pure BAM decode of an uncompressed
+ BAM is around 9% slower and compressed BAM to compressed BAM is
+ almost identical. The most significant hit is reading uncompressed
+ BAM (and doing nothing else) which is 120% slower as CRC dominates.
+ Options are available to disable the CRC checking in case this is an
+ issue (scramble -!).
+
+ * CRAM: Now supports bgzipped fasta references.
+
+ * CRAM/SAM: Headers are now kept in the same basic type order while
+ transcoding. (Eg all @PG before all @SQ, or vice versa, depending on
+ input ordering.)
+
+ * CRAM: Compression level 1 is now faster but larger. (The old -1 and
+ -2 were too similar.)
+
+ * CRAM: Improved compression efficiency in some files, when switching
+ from sorted to unsorted data.
+
+ * CRAM: Various speedups relating to memory handling,
+ multi-threaded performance and the rANS codec.
+
+ * CRAM: Block CRC checks are now only done when the block is used,
+ speeding up multi-threading and tools that do not decode all blocks
+ (eg flagstat).

- * CRAM: Scramble now has -p and -P options to control whether to
- force the BAM auxiliary sizes (8 vs 16 vs 32-bit integer quantities)
- rather than reducing to smallest size required, and whether to
- preserve the order of auxiliary tags including RG, NM and MD.
+ * Scramble -g and -G options to generate and reuse bgzip indices when
+ reading and writing BAM files.

- This latter option requires storing these values verbatim instead of
- regenerating them on-the-fly, but note this only preserves tag order
- with Scramble / Htslib. Htsjdk will still produce these fields out
- of order.
+ * Scramble -q option to omit updating the @PG header records.

- * CRAM no longer stores data in the CORE block, permitting greater
- flexibility in choosing which fields to decode. (This change is
- also mirrored in htslib and htsjdk.)
+ * Experimental cram_filter tool has been added, to rapidly produce
+ cram subsets.

- * CRAM: ref.fai files in a different order to @SQ headers should now
- work correctly.
+ * Migrated code base to git. Use github for primary repository.
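
The BAM CRC checking described in the updates above can be illustrated with a minimal sketch. BGZF, the blocked gzip container used by BAM, stores a CRC-32 of each block's uncompressed data in the block trailer, and a reader can recompute it after inflating the block. The names crc32_bytes and block_crc_ok are hypothetical and are not io_lib functions. The bitwise loop is chosen for brevity only; a production reader would more likely use a table-driven routine or zlib's crc32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 using the reflected polynomial 0xEDB88320, the same
 * CRC that gzip (and therefore BGZF) stores in each member trailer.
 * Hypothetical helper, written for illustration only.
 */
uint32_t crc32_bytes(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        crc ^= buf[i];
        for (bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Returns 1 if the uncompressed block payload matches the CRC value
 * read from the block trailer, 0 otherwise. Hypothetical helper.
 */
int block_crc_ok(const unsigned char *data, size_t len, uint32_t stored_crc)
{
    return crc32_bytes(data, len) == stored_crc;
}
```

A reader would call block_crc_ok() once per inflated block and treat a zero return as a corrupt file. Skipping that call is the kind of saving an option to disable CRC checking would give.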

- * CRAM required-fields parameters no longer forces quality decoding
- when asking for sequence.
+ Bug fixes:

- * CRAM: More robustness / safety checks during decoding; itf8 bounds
- checks, running out of memory, bounds checks in BETA codec, and
- more.
+ * BAM: Fixed the bin value calculation for placed but unmapped reads.

- * CRAM auto-generated read names are consistent regardless of range
- queries. They also now match those produced by htslib.
+ * CRAM: Fixed file descriptor leak in refs_load_fai().

- * A few compiler warnings in cram_dump / cram_size have gone away.
- Many small CRAM code tweaks to aid comparisons to htslib. It should
- also be easier to build under Microsoft Visual Studio (although no
- project file is provided).
+ * CRAM: Fixed a crash in MD5 calculation for sequences beyond the
+ reference end.

- * CRAM: the rANS codec should now be slightly faster at decoding.
+ * CRAM: Bug fixes when encoding malformed @SQ records.

- * CRAM bug fix: removed potential (but unobserved) possibility of
- 8-bit quantities stored as a 16-bit value in BAM being converted
- incorrectly within CRAM.
+ * CRAM: Fixed a rare renormalisation bug in rANS codec.

- * SAM bug fix: no more complaining about "unknown" sort order.
+ * Fixed tests so make -j worked.

+ * Removed ancient, broken and unused popen() code.
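
For the BAM bin fix listed above, the changelog does not say what the corrected calculation looks like. As an illustration only, the sketch below follows the SAM specification's reg2bin() scheme, with the convention that an unmapped read carrying a position is binned as if it were a one-base alignment. bin_for_placed_unmapped is a hypothetical helper name and not an io_lib function. Coordinates are assumed to be 0-based and half-open.

```c
#include <assert.h>

/*
 * Compute the UCSC-style bin for the 0-based half-open region [beg, end),
 * following the reg2bin() scheme given in the SAM specification.
 */
int reg2bin(int beg, int end)
{
    assert(beg >= 0 && end > beg);

    --end; /* work with the last covered base */
    if (beg >> 14 == end >> 14) return ((1 << 15) - 1) / 7 + (beg >> 14);
    if (beg >> 17 == end >> 17) return ((1 << 12) - 1) / 7 + (beg >> 17);
    if (beg >> 20 == end >> 20) return ((1 << 9) - 1) / 7 + (beg >> 20);
    if (beg >> 23 == end >> 23) return ((1 << 6) - 1) / 7 + (beg >> 23);
    if (beg >> 26 == end >> 26) return ((1 << 3) - 1) / 7 + (beg >> 26);
    return 0;
}

/*
 * Hypothetical helper: a placed but unmapped read has no alignment
 * length, so it is treated here as covering a single base at pos0.
 */
int bin_for_placed_unmapped(int pos0)
{
    return reg2bin(pos0, pos0 + 1);
}
```

Under this convention a placed but unmapped read always falls in one of the smallest (16 kb) bins, so it sorts and indexes alongside mapped reads at the same position.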


Building