Unaligned bit arrays on the JavaScript target #3946

Open
wants to merge 1 commit into
base: main

@richard-viney richard-viney commented Dec 3, 2024

Summary 📘

This PR adds support for unaligned bit arrays on the JavaScript target. Specifically:

  • In expressions:
    • Arbitrary sized integer segments: <<1:4>>, <<12:9-little>>, <<12:29-big>>, <<1234:100-little>>
    • Arbitrary sized bits segments: <<<<0xABCD:15>>:bits-10>>
  • In patterns:
    • Arbitrary sized int segments: let assert <<_:7, i:19-little-signed>> = <<0xABCDEF12:26>>
    • Sized and unsized bits segments: let assert <<_:7, a:bits-3, b:size(14)-bits, c:bits>> = <<0xABCDEF:24, 0x1234:16>>
  • There is a warning if the above features are used when gleam.toml specifies a version < v1.7.0

Implementation Details 🛠️

  • The BitArray class in the prelude now has a bitSize: number field in addition to its buffer: Uint8Array, which stores its true size in bits.
  • The values of any unused low bits in the final byte are undefined. They will be zero in many common use cases, but leaving them undefined allows additional optimisations in some operations.
  • The BitArray class in the prelude has been reworked in a few ways:
    • New public functions: equals(), slice(), sliceToFloat(), sliceToInt().
    • Deprecated internal functions: floatFromSlice(), intFromSlice(), binaryFromSlice(), sliceAfter(). No one should be relying on these as they were marked @internal, so perhaps they could be fully removed? They currently call the new public functions listed above.
    • JSDoc annotations have been added to all functions allowing type-checking by adding // @ts-check to the top of the file.
    • BitArray.sliceToInt() has internal variants for aligned and unaligned access, as well as variants for both number and BigInt. The number variant is used when the size is <= 53 bits. BigInt is typically 5-10x slower, hence the decision to support both paths.
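
To make the number/BigInt split concrete, here is a minimal sketch of reading an unsigned big-endian integer from an arbitrary bit range. It is not the prelude's code: `readUnsignedNumber`, `readUnsignedBigInt` and `sliceToUnsignedInt` are hypothetical names, and plain `{ buffer, bitSize }` objects stand in for `BitArray`. Only the two field names and the 53-bit threshold come from the description above.

```javascript
// Hypothetical sketch, not the prelude implementation.
// Bits are numbered from the most significant bit of buffer[0].

/** Reads bits [start, end) as an unsigned big-endian number (size <= 53). */
function readUnsignedNumber(buffer, start, end) {
  let value = 0;
  let bit = start;
  while (bit < end) {
    const offsetInByte = bit & 7;
    // Take as many bits as this byte can still provide.
    const take = Math.min(8 - offsetInByte, end - bit);
    const shift = 8 - offsetInByte - take;
    const chunk = (buffer[bit >> 3] >> shift) & ((1 << take) - 1);
    // Multiplication instead of <<, because << truncates to 32 bits.
    value = value * 2 ** take + chunk;
    bit += take;
  }
  return value;
}

/** Same traversal, but accumulating into a BigInt for wide segments. */
function readUnsignedBigInt(buffer, start, end) {
  let value = 0n;
  let bit = start;
  while (bit < end) {
    const offsetInByte = bit & 7;
    const take = Math.min(8 - offsetInByte, end - bit);
    const shift = 8 - offsetInByte - take;
    const chunk = (buffer[bit >> 3] >> shift) & ((1 << take) - 1);
    value = (value << BigInt(take)) | BigInt(chunk);
    bit += take;
  }
  return value;
}

/** Picks the cheaper number path whenever the result fits in 53 bits. */
function sliceToUnsignedInt(bitArray, start, end) {
  if (start < 0 || end < start || end > bitArray.bitSize) {
    throw new RangeError("slice is outside the bit array");
  }
  return end - start <= 53
    ? readUnsignedNumber(bitArray.buffer, start, end)
    : Number(readUnsignedBigInt(bitArray.buffer, start, end));
}
```

Converting the BigInt result back to a number in the wide case is an assumption of this sketch; values above 2^53 lose precision at that step.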

Implications for @external JavaScript code 🌍

  • Existing JavaScript FFI code that operates on bit arrays needs to be updated. If it isn't, then when fed an unaligned bit array it will effectively round bitSize up to a multiple of 8, read the undefined low bits in the final byte, and give the wrong result.
  • However, no existing code can be using such bit arrays on JavaScript at present, so nothing breaks. Still, there could be code that is now valid on the JavaScript target, which wasn't valid previously, and which won't give the correct result for unaligned bit arrays. I'm not sure it's possible to warn about this.
  • I can make relevant updates to any affected packages, but haven't yet tried to assemble a list of them. There probably isn't a huge number.
  • Consider noting this impact on JavaScript FFI code somewhere in the v1.7 release info.
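
As an illustration of the failure mode, the sketch below contrasts a byte-oriented FFI helper with one that respects `bitSize`. The helpers (`countSetBitsNaive`, `countSetBits`, `popcountByte`) are hypothetical, and a plain `{ buffer, bitSize }` object stands in for a prelude `BitArray`; only those two field names are taken from this PR.

```javascript
// Hypothetical FFI helpers, for illustration only.

// A 10-bit array: 1111000010, with junk in the 6 unused low bits
// of the final byte (their values are undefined per this PR).
const unaligned = {
  buffer: new Uint8Array([0b11110000, 0b10111111]),
  bitSize: 10,
};

function popcountByte(byte) {
  let count = 0;
  while (byte) {
    count += byte & 1;
    byte >>= 1;
  }
  return count;
}

// Pre-1.7 style: assumes every bit in the buffer is data.
function countSetBitsNaive(bitArray) {
  let count = 0;
  for (const byte of bitArray.buffer) count += popcountByte(byte);
  return count;
}

// Updated style: whole bytes first, then only the used high bits
// of the final partial byte.
function countSetBits(bitArray) {
  const { buffer, bitSize } = bitArray;
  const fullBytes = bitSize >> 3;
  const trailingBits = bitSize & 7;
  let count = 0;
  for (let i = 0; i < fullBytes; i++) count += popcountByte(buffer[i]);
  if (trailingBits > 0) {
    const mask = (0xff << (8 - trailingBits)) & 0xff;
    count += popcountByte(buffer[fullBytes] & mask);
  }
  return count;
}

console.log(countSetBitsNaive(unaligned)); // 11, counts the junk bits
console.log(countSetBits(unaligned)); // 5
```

For byte-aligned input the two functions agree, which is why existing code keeps working until it is handed an unaligned bit array.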

Implications for gleam/stdlib 🤝

  • I have the updates for gleam/stdlib ready to go, mostly affecting gleam/bit_array. It can only be merged once this PR goes in as its tests don't run on Gleam 1.6.2. It may be necessary to run the new stdlib tests on nightly for a short period, with them segregated into their own file so they can be included/excluded depending on the active Gleam version. I'll sort that out once this PR makes it through review.
  • Future stdlib versions that support unaligned bit arrays on JavaScript will work fine on Gleam versions < 1.7.0, so there are no compatibility concerns there.
  • We could look into printing a warning if unaligned bit arrays are used on JavaScript and the package's stdlib version doesn't support them. Should we do this? If so, I'd prefer to implement it in a follow-up PR if that's acceptable.

Testing 🧪

There's certainly some complexity and tricky bitwise operations here, mostly in the JavaScript prelude. The following has been done to ensure correctness:

  • Many new tests added to language_tests.gleam, and test/javascript_prelude.
  • Every path and branch through BitArray.slice(), BitArray.sliceToInt(), BitArray.sliceToFloat() is covered by at least one test.
  • Extensive fuzzing has been performed on bit array construction, slicing, and slicing to ints and floats.
    • This validated millions of combinations of bit array contents, segment sizes, offsets, endianness, signedness, etc. on JavaScript against the result on the Erlang target.
    • Issues found by this testing were fixed and added to the language tests and prelude tests.
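
The fuzzing harness itself isn't part of this description. As a rough sketch of the same differential idea, the code below checks one implementation against a slow, obviously correct oracle on random inputs. The oracle here is a bit-by-bit JavaScript reader rather than the Erlang target, and every name (`makeRng`, `readBitByBit`, `readViaBigInt`, `fuzz`) is hypothetical.

```javascript
// Hypothetical differential fuzz sketch, not the harness used for this PR.

// Small deterministic PRNG so that failures are reproducible from a seed.
function makeRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Oracle: reads bits [start, end) one at a time, MSB first.
function readBitByBit(buffer, start, end) {
  let value = 0n;
  for (let bit = start; bit < end; bit++) {
    const bitValue = (buffer[bit >> 3] >> (7 - (bit & 7))) & 1;
    value = value * 2n + BigInt(bitValue);
  }
  return value;
}

// Implementation under test: whole buffer as one BigInt, then shift and mask.
function readViaBigInt(buffer, start, end) {
  let all = 0n;
  for (const byte of buffer) all = (all << 8n) | BigInt(byte);
  const totalBits = BigInt(buffer.length * 8);
  const mask = (1n << BigInt(end - start)) - 1n;
  return (all >> (totalBits - BigInt(end))) & mask;
}

// Runs random cases and returns the first mismatch, if any.
function fuzz(iterations, seed) {
  const random = makeRng(seed);
  const randomInt = (max) => Math.floor(random() * (max + 1)); // 0..max
  for (let i = 0; i < iterations; i++) {
    const buffer = Uint8Array.from({ length: 1 + randomInt(15) }, () =>
      randomInt(255),
    );
    const bitLength = buffer.length * 8;
    const start = randomInt(bitLength);
    const end = start + randomInt(bitLength - start);
    const expected = readBitByBit(buffer, start, end);
    const actual = readViaBigInt(buffer, start, end);
    if (expected !== actual) {
      return { ok: false, buffer, start, end, expected, actual };
    }
  }
  return { ok: true };
}
```

Returning the failing case rather than throwing makes it easy to paste the mismatch into a regression test, in the spirit of the last bullet above.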

Limitations 🤔

The main limitation is that there is no allowance for unused high bits in the first byte of a bit array's buffer.

The motivation for allowing this would be to make bit array slices O(1) in all cases. Currently a slice is O(1) only if its start offset is byte-aligned (the end offset doesn't matter). If the start offset isn't byte-aligned then a slice is O(N) due to requiring a copy.

This makes the following O(N²) on JavaScript, but O(N) on Erlang:

pub fn print_bits(bits: BitArray) -> Nil {
  case bits {
    <<b:1, rest:bits>> -> {
      b |> int.to_string |> io.print
      print_bits(rest)
    }
    _ -> io.println("")
  }
}

This could be addressed at a later date, albeit with another round of impact on JavaScript FFI code that would need updating. So maybe it's better to bite the bullet now? Or maybe it's not important enough to warrant the additional complexity. There's also a reasonably good chance that any folks affected by this would be able to rework their code to avoid the performance issue (if they realise what the problem is).
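
The cost difference between aligned and unaligned start offsets can be sketched as follows. This is a simplified model, not the prelude's `slice()`: `sliceBits` and `bytesCopiedWalkingBitByBit` are hypothetical names, and the `copied` flag exists only to make the cost visible.

```javascript
// Hypothetical model of the slicing cost described above.

/** Returns bits [start, end) of `buffer` as { buffer, bitSize, copied }. */
function sliceBits(buffer, start, end) {
  const bitSize = end - start;
  const byteCount = Math.ceil(bitSize / 8);
  const startByte = start >> 3;
  const shift = start & 7;

  if (shift === 0) {
    // Byte-aligned start: O(1) view over the same memory. The end offset
    // doesn't matter, since bitSize records where the data stops.
    const view = buffer.subarray(startByte, startByte + byteCount);
    return { buffer: view, bitSize, copied: false };
  }

  // Unaligned start: O(N) copy that shifts every byte left by `shift` bits.
  // Unused low bits of the last output byte are left as whatever follows.
  const out = new Uint8Array(byteCount);
  for (let i = 0; i < byteCount; i++) {
    const next = startByte + i + 1;
    const high = buffer[startByte + i] << shift;
    const low = next < buffer.length ? buffer[next] >> (8 - shift) : 0;
    out[i] = (high | low) & 0xff;
  }
  return { buffer: out, bitSize, copied: true };
}

/** Mimics print_bits: repeatedly drops one leading bit, counting copied bytes. */
function bytesCopiedWalkingBitByBit(buffer) {
  let current = { buffer, bitSize: buffer.length * 8 };
  let copiedBytes = 0;
  while (current.bitSize > 0) {
    const rest = sliceBits(current.buffer, 1, current.bitSize);
    if (rest.copied) copiedBytes += rest.buffer.length;
    current = rest;
  }
  return copiedBytes;
}

console.log(bytesCopiedWalkingBitByBit(new Uint8Array(2))); // 22
console.log(bytesCopiedWalkingBitByBit(new Uint8Array(4))); // 76
```

Doubling the input from 2 to 4 bytes roughly quadruples the bytes copied, which is the quadratic behaviour in question. Tracking a start bit offset alongside `bitSize` would let the unaligned branch return a view as well, at the cost of every consumer having to honour that offset.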

Thoughts?


✨✨✨

@richard-viney richard-viney marked this pull request as ready for review December 3, 2024 12:22
@richard-viney richard-viney force-pushed the js-unaligned-bit-arrays branch 2 times, most recently from c77ded4 to 2496b81 Compare December 4, 2024 08:35