Htscodecs 1.2.1
This release contains the following minor changes. Please see the "git log" for the full details.
Improvements / changes:
- Speed up of rANS4x16 order-0. We now use a branchless encoder renormalisation step. For complex data this gives a speed up of between 13% and 50%, depending on the compiler.
- Improve rANS4x16 compute_shift estimates. The entropy calculation is now more accurate. This leads to more frequent use of the 10-bit frequency mode, at the expense of up to 1% size growth.
- Speed improvements to the striped rANS mode, both encoding and decoding. The encoder gains ~8% and the decoder ~5%, although this varies considerably by compiler and data.
- Added new var_put_u64_safe and var_put_u32_safe interfaces. These are automatically used by var_put_u64 and var_put_u32 when near the end of the buffer, but may also be called directly.
- Small speed ups to the hist8 and hist1_4 functions.
- Minor speed up to RLE decoding.
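To illustrate what a branchless encoder renormalisation step can look like, here is a minimal sketch in C. It is not the htscodecs source: the function names, the backwards-growing 16-bit output pointer and the threshold used in the self-check are all assumptions made for illustration. The idea is to store the low 16 bits unconditionally and then let the comparison result (0 or 1) decide whether the pointer moves and whether the state is shifted.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form: emit the low 16 bits only when the state is too large.
 * The output pointer grows downwards (assumed layout for this sketch). */
static uint32_t renorm_branchy(uint32_t x, uint32_t x_max, uint16_t **pptr) {
    if (x >= x_max) {
        *--(*pptr) = (uint16_t)(x & 0xffff);
        x >>= 16;
    }
    return x;
}

/* Branchless form: always store, then keep or discard the store by
 * moving the pointer by c (0 or 1) and shifting by c*16 (0 or 16).
 * Requires at least one writable slot below *pptr. */
static uint32_t renorm_branchless(uint32_t x, uint32_t x_max, uint16_t **pptr) {
    uint32_t c = (x >= x_max);                 /* 0 or 1 */
    (*pptr)[-1] = (uint16_t)(x & 0xffff);      /* speculative store */
    *pptr -= c;                                /* commit only if c == 1 */
    return x >> (c * 16);
}

/* Self-check: both forms must yield the same states and output words.
 * The threshold and states are arbitrary illustrative values. */
int renorm_forms_agree(void) {
    uint16_t buf_a[8] = {0}, buf_b[8] = {0};
    uint16_t *pa = buf_a + 8, *pb = buf_b + 8;
    const uint32_t x_max = 1u << 27;           /* hypothetical threshold */
    const uint32_t states[] = { 1u << 15, (1u << 27) - 1,
                                1u << 27, 0x7fffabcdu };

    for (int i = 0; i < 4; i++) {
        uint32_t a = renorm_branchy(states[i], x_max, &pa);
        uint32_t b = renorm_branchless(states[i], x_max, &pb);
        assert(pb > buf_b);                    /* slack for the next store */
        if (a != b || (pa - buf_a) != (pb - buf_b))
            return 0;
    }
    /* Compare only the words that were actually committed. */
    for (uint16_t *p = pa, *q = pb; p < buf_a + 8; p++, q++)
        if (*p != *q)
            return 0;
    return 1;
}
```

Calling `renorm_forms_agree()` returns 1 when the two forms match on the sample states. Whether the compiler really emits branch-free machine code for the second form depends on the compiler and flags, which is consistent with the compiler-dependent gains noted above.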
Bug fixes:
- Work around an icc-2021 compiler bug, and also speed up the varint encoding (#29).
- Fix an off-by-one error in the initial size check in arith_dynamic. This meant the very smallest of blocks could fail to decode. Reported by Divon Lan.
- Fixed hist1_4 to also count the last byte when computing the T0 array.
- Fixed overly harsh bounds checking in the fqzcomp read_array function, which meant it failed to decode some configurations.
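As background for the varint items in this release (the `_safe` writers and the encoding speed up in #29), the following is a rough sketch of how a bounds-checked varint writer and a fast-path wrapper might fit together. It is not the htscodecs implementation. The `sketch_` names, the 7-bit most-significant-group-first byte layout, the return value of 0 on overflow and the 5-byte fast-path threshold are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 7-bit groups needed to represent v (always at least one). */
static int sketch_varint_len(uint32_t v) {
    int n = 1;
    while (v >>= 7)
        n++;
    return n;
}

/* Unchecked writer: most significant group first, with the top bit set
 * on every byte except the last (assumed layout for this sketch). */
static int sketch_put_u32_unchecked(uint8_t *cp, uint32_t v) {
    int n = sketch_varint_len(v);
    for (int i = n - 1; i >= 0; i--)
        *cp++ = (uint8_t)(((v >> (7 * i)) & 0x7f) | (i ? 0x80 : 0));
    return n;
}

/* Bounds-checked writer: returns the number of bytes written, or 0 if
 * the encoded value would not fit before endp (assumed convention). */
int sketch_put_u32_safe(uint8_t *cp, const uint8_t *endp, uint32_t v) {
    if (endp - cp < sketch_varint_len(v))
        return 0;
    return sketch_put_u32_unchecked(cp, v);
}

/* Wrapper: a 32-bit value needs at most 5 bytes, so with 5 or more bytes
 * of room the unchecked path is safe; otherwise use the checked path. */
int sketch_put_u32(uint8_t *cp, const uint8_t *endp, uint32_t v) {
    if (endp - cp < 5)
        return sketch_put_u32_safe(cp, endp, v);
    return sketch_put_u32_unchecked(cp, v);
}
```

The design point is that the per-value length check is only paid near the end of the buffer; elsewhere a single remaining-space comparison covers the worst case.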