Unaligned bit arrays on the JavaScript target #3946
+2,559 −570
Summary 📘
This PR adds support for unaligned bit arrays on the JavaScript target. Specifically:
- `<<1:4>>`, `<<12:9-little>>`, `<<12:29-big>>`, `<<1234:100-little>>`
- `bits` segments: `<<<<0xABCD:15>>:bits-10>>`
- `int` segments: `let assert <<_:7, i:19-little-signed>> = <<0xABCDEF12:26>>`
- `bits` segments: `let assert <<_:7, a:bits-3, b:size(14)-bits, c:bits>> = <<0xABCDEF:24, 0x1234:16>>`
- `gleam.toml` specifies a version < v1.7.0
Implementation Details 🛠️
- The `BitArray` class in the prelude now has a `bitSize: number` field, which stores its true size in bits, in addition to its `buffer: Uint8Array`.
- The `BitArray` class in the prelude has been reworked in a few ways:
  - `equals()`, `slice()`, `sliceToFloat()`, `sliceToInt()`.
  - `floatFromSlice()`, `intFromSlice()`, `binaryFromSlice()`, `sliceAfter()`. No one should be relying on these as they were marked `@internal`, so perhaps they could be fully removed? They currently call the new public functions listed above.
- JSDoc annotations have been added to all functions, allowing type-checking by adding `// @ts-check` to the top of the file.
- `BitArray.sliceToInt()` has internal variants for aligned and unaligned access, as well as variants for both `number` and `BigInt`. The `number` variant is used when the size is <= 53 bits. `BigInt` is typically 5-10x slower, hence the decision to support both paths.
Implications for `@external` JavaScript code 🌍
`bitSize` up to a multiple of 8, look at the undefined low bits in the final byte, and give the wrong result.
Implications for `gleam/stdlib` 🤝
`gleam/stdlib` ready to go, mostly affecting `gleam/bit_array`. It can only be merged once this PR goes in, as its tests don't run on Gleam 1.6.2. It may be necessary to run the new stdlib tests on nightly for a short period, with them segregated into their own file so they can be included or excluded depending on the active Gleam version. I'll sort that out once this PR makes it through review.
Testing 🧪
There's certainly some complexity and tricky bitwise operations here, mostly in the JavaScript prelude. The following has been done to ensure correctness:
- `language_tests.gleam`, and `test/javascript_prelude`.
- `BitArray.slice()`, `BitArray.sliceToInt()`, `BitArray.sliceToFloat()` is covered by at least one test.
Limitations 🤔
The main limitation is that there is no allowance for unused high bits in the first byte of a bit array's buffer.
The motivation for allowing this would be to make bit array slices O(1) in all cases. Currently a slice is O(1) only if its start offset is byte-aligned (the end offset doesn't matter). If the start offset isn't byte-aligned then a slice is O(N) due to requiring a copy.
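To make the aligned/unaligned distinction concrete, here is a minimal JavaScript sketch. It is a model only and not the prelude's code: the class name `SketchBitArray` and its `slice(start, end)` signature are assumptions made for illustration. It assumes bits are numbered from the most significant bit of the first byte, with no unused high bits in that byte.

```javascript
// Hypothetical model, not the Gleam prelude's implementation.
// Bit 0 is the most significant bit of buffer[0]; the first byte never has
// unused high bits, which is the limitation described above.
class SketchBitArray {
  constructor(buffer, bitSize) {
    this.buffer = buffer;   // Uint8Array holding the bits
    this.bitSize = bitSize; // true size in bits
  }

  // Returns the bits in [start, end) as a new SketchBitArray.
  slice(start, end) {
    const bitSize = end - start;
    const byteLength = Math.ceil(bitSize / 8);
    const firstByte = Math.floor(start / 8);
    const shift = start % 8;

    if (shift === 0) {
      // Byte-aligned start: O(1), a view over the same memory.
      // Any low bits of the final byte beyond bitSize are simply ignored.
      const view = this.buffer.subarray(firstByte, firstByte + byteLength);
      return new SketchBitArray(view, bitSize);
    }

    // Unaligned start: O(N), every output byte is assembled from two
    // neighbouring input bytes so the result starts at bit 0 again.
    const out = new Uint8Array(byteLength);
    for (let i = 0; i < byteLength; i++) {
      const hi = this.buffer[firstByte + i] << shift;
      const lo = (this.buffer[firstByte + i + 1] ?? 0) >> (8 - shift);
      out[i] = (hi | lo) & 0xff;
    }
    return new SketchBitArray(out, bitSize);
  }
}

const source = new SketchBitArray(new Uint8Array([0xab, 0xcd, 0xef]), 24);
const aligned = source.slice(8, 20);  // shares memory with source
const unaligned = source.slice(4, 20); // copies two bytes
```

In this model an unaligned end offset costs nothing because `bitSize` records where the data stops, whereas an unaligned start forces a shift-and-copy so that the result again begins at the top bit of its first byte.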
This makes the following O(N²) on JavaScript, but O(N) on Erlang:
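As a rough illustration of where the quadratic cost comes from (this is a hypothetical cost model, not code from the PR), the JavaScript sketch below counts the bytes copied when a bit array is consumed from the front a fixed number of bits at a time. The function name `bytesCopiedWhenConsuming` is made up for this example. It assumes, per the paragraph above, that each slice with a non-byte-aligned start copies everything that remains.

```javascript
// Hypothetical cost model, not part of the prelude.
// Counts the bytes copied when a bit array of `totalBits` bits is consumed
// `step` bits at a time by repeatedly slicing off the front. A slice is
// assumed to copy its result only when its start offset is not byte-aligned.
function bytesCopiedWhenConsuming(totalBits, step) {
  let copied = 0;
  let remaining = totalBits;
  while (remaining >= step) {
    remaining -= step;
    if (step % 8 !== 0) {
      // Unaligned start: the whole remainder is copied into a new buffer,
      // which then starts at bit 0 again, so the next slice is unaligned too.
      copied += Math.ceil(remaining / 8);
    }
  }
  return copied;
}

const oneBitSteps = bytesCopiedWhenConsuming(1000, 1);
const oneBitStepsDoubled = bytesCopiedWhenConsuming(2000, 1);
const oneByteSteps = bytesCopiedWhenConsuming(2000, 8);
```

Under these assumptions, doubling the input roughly quadruples the bytes copied when stepping one bit at a time, while stepping a whole byte at a time copies nothing.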
This could be addressed at a later date, albeit with another round of impact on JavaScript FFI code that would need updating. So maybe it's better to bite the bullet now? Or maybe it's not important enough to warrant the additional complexity. There's also a reasonably good chance that any folks affected by this would be able to rework their code to avoid the performance issue (if they realise what the problem is).
Thoughts?
✨✨✨