Unaligned bit arrays on the JavaScript target #3946

richard-viney · 2024-12-03T11:54:30Z

Summary 📘

This PR adds support for unaligned bit arrays on the JavaScript target.

In expressions:

Arbitrary sized integer segments:

<<1:4>>
<<12:9-little>>
<<12:29-big>>
<<1234:100-little>>

Arbitrary sized bits segments:
```
<<<<0xABCD:15>>:bits-10>>
```

In patterns:

Arbitrary sized int segments:

let assert <<_:7, i:19-little-signed>> = <<0xABCDEF12:26>>

Sized and unsized bits segments:

let assert <<_:7, a:bits-3, b:size(14)-bits, c:bits>> = <<0xABCDEF:24, 0x1234:16>>

There is a compiler warning if the above features are used when gleam.toml specifies a version < v1.9.0.

Implementation Details 🛠️

The BitArray class in the prelude now has bitSize, byteSize, bitOffset, and rawBuffer properties.
- The value of any unused high bits in the buffer's first byte, and any unused low bits in the buffer's final byte, are undefined.
- Public API for use by FFI code is now: bitSize, byteSize, bitOffset, rawBuffer, byteAt(index).
- Deprecated APIs still used by existing FFI code: get buffer(), get length(). Using these emits a deprecation warning at runtime, and throws an exception if the bit array is unaligned (i.e. bitOffset isn't zero or bitSize isn't a multiple of 8).
JSDoc annotations have been added to allow type-checking by adding // @ts-check to the top of the file.

Implications for `@external` JavaScript code 🌍

All existing JavaScript FFI code that operates on bit arrays needs to be updated to support unaligned bit arrays.
Code that isn't updated will:
- Continue to work, provided it is only used with aligned bit arrays.
- Emit deprecation warnings at runtime on use of BitArray.length and BitArray.buffer.
- Throw a runtime exception if these deprecated APIs are called on a unaligned bit array.
No existing code breaks because unaligned bit arrays on JavaScript weren't previously possible.

Implications for `gleam/stdlib` 🤝

I have the updates for gleam/stdlib ready, mostly affecting gleam/bit_array. It can only be merged once this PR goes in as its tests won't run on stable Gleam. It may be necessary to run the new stdlib tests on nightly for a short period, with them segregated into their own file so they can be included/excluded depending on the active Gleam version. I'll sort that out once this PR makes it through review.
Future stdlib versions that support unaligned bit arrays on JavaScript work fine on earlier Gleam versions, there are no compatibility concerns there.

Testing 🧪

There's certainly some complexity and tricky bitwise operations here, mostly in the JavaScript prelude. The following has been done to ensure correctness:

Many new tests added to language_tests.gleam, and test/javascript_prelude.
Every path and branch through the code that performs slicing, concatenation, and conversion to ints/floats is covered by at least one test.
Extensive fuzzing has been performed on these operations which validated millions of combinations of bit array contents, segment sizes, offsets, endianness, signedness, etc. on JavaScript against the result on the Erlang target.
Issues found during fuzz testing were fixed and added to the language tests and prelude tests.

lpil

Wow, what a fantastic bit of work! Thank you!

I've not digested these changes properly yet, but I have some initial questions from my first review.

There's a lot of public functions in the API of the bit array class now, but generally the aim is to have none. What is the motivation for having these?

Similar for the deprecated functions, we can remove them.

There's quite a lot of code in the class. Could we move them to free-standing functions and have the generated code only use them if absolutely necessary? That would help with JavaScript bundlers performing dead code elimination.

Some existing tests have been removed or changed, why is this? New features shouldn't alter existing tests, it makes it harder to review and to prevent regressions.

Thanks again!

compiler-core/src/javascript/expression.rs

richard-viney · 2024-12-23T00:41:05Z

Thanks for taking a look!

There's a lot of public functions in the API of the bit array class now, but generally the aim is to have none. What is the motivation for having these?

These have been reduced down to the following: bitSize, byteSize, and byteAt().

Similar for the deprecated functions, we can remove them.

I've removed all except the BitArray.length and BitArray.buffer accessors, because removing these would break most/all JS FFI code that operates on bit arrays.

There's quite a lot of code in the class. Could we move them to free-standing functions and have the generated code only use them if absolutely necessary? That would help with JavaScript bundlers performing dead code elimination.

Done.

Some existing tests have been removed or changed, why is this? New features shouldn't alter existing tests, it makes it harder to review and to prevent regressions.

The diff on the tests looked a bit more complex in places than it actually was so I've moved some things around to make it easier to digest. Some existing tests did need to be tweaked or removed, e.g. those that were testing for compilation errors if unaligned bit arrays were used on the JS target.

richard-viney · 2025-01-04T02:53:15Z

This PR has been updated to make slicing bit arrays an O(1) operation under all circumstances, matching Erlang's performance characteristics. The BitArray.bitOffset member was added to enable this.

The main write-up has been updated for this change.

lpil

Incredible work, thank you so much! Sorry for taking so long to get back to you here, I'm still playing catch-up after some family matters.

This all looks great but I have left one discussion point below RE JavaScript dead code elimination.

lpil · 2025-02-09T13:47:22Z

compiler-core/templates/prelude.mjs

+  }
+
+  /** @internal */
+  sliceToInt(start, end, isBigEndian, isSigned) {


These methods mean that the large functions will never be removed by a JS bundler, isn't that right? Given their large size we want it to be that when unused there's no references to these functions so they get removed during bundling and minification so Gleam users don't need to pay a price for a feature they are not using.

ad8f1c0 restructures things such that the three slicing functions are removable by esbuild. It's an unrelated change to the rest of this PR as the previous slicing functions also weren't able to be removed, but yes they've gotten larger so it matters a bit more now.

There's still an increase in JS code size due to toBitArray() now being more complicated, the need for a BitArray.equals() function, the new deprecation warnings on access of BitArray.buffer and BitArray.length, etc. This code:

pub fn main() { <<>> }

Now bundles and minifies to 2965 bytes (1095 bytes gzip'ed) whereas previously it was 1560 bytes (676 bytes gzip'ed).

You're the best 💜

lpil

Beautiful! Thank you so much!!! I've one question inline RE unused JS functions

lpil · 2025-02-10T17:21:38Z

...src/javascript/tests/snapshots/gleam_core__javascript__tests__bit_arrays__discard_sized.snap

@@ -11,10 +11,10 @@ fn go(x) {


 ----- COMPILED JAVASCRIPT
-import { makeError } from "../gleam.mjs";
+import { makeError, bitArraySliceToInt } from "../gleam.mjs";


How come this gets imported? I can't see it used here

Fixed. It was happening because of the discarded integer segment.

lpil

This is incredible work! Thank you so much!

And also thank you for waiting as it took me quite some time to merge this in.

richard-viney · 2025-02-19T01:22:09Z

No worries, happy to help.

FYI the companion PR for stdlib is ready for a look: gleam-lang/stdlib#761

richard-viney marked this pull request as ready for review December 3, 2024 12:22

richard-viney force-pushed the js-unaligned-bit-arrays branch 3 times, most recently from 2496b81 to bb69dd9 Compare December 5, 2024 01:45

richard-viney mentioned this pull request Dec 5, 2024

Unaligned bit arrays on the JavaScript target gleam-lang/stdlib#761

Merged

richard-viney force-pushed the js-unaligned-bit-arrays branch 3 times, most recently from b347126 to 9aa5446 Compare December 13, 2024 09:32

lpil reviewed Dec 22, 2024

View reviewed changes

compiler-core/src/javascript/expression.rs Outdated Show resolved Hide resolved

richard-viney force-pushed the js-unaligned-bit-arrays branch 6 times, most recently from 544c327 to 0f6fbdb Compare December 23, 2024 00:38

richard-viney force-pushed the js-unaligned-bit-arrays branch 4 times, most recently from 9c7c0dc to 5561409 Compare December 29, 2024 22:30

richard-viney force-pushed the js-unaligned-bit-arrays branch 3 times, most recently from da179df to 0847df6 Compare December 31, 2024 23:06

richard-viney marked this pull request as draft January 1, 2025 09:27

richard-viney force-pushed the js-unaligned-bit-arrays branch 6 times, most recently from 40047c3 to 945ed06 Compare January 3, 2025 21:19

richard-viney force-pushed the js-unaligned-bit-arrays branch from 945ed06 to 7651919 Compare January 4, 2025 02:50

richard-viney marked this pull request as ready for review January 4, 2025 02:53

richard-viney force-pushed the js-unaligned-bit-arrays branch 6 times, most recently from 4988b97 to c23f5a4 Compare January 10, 2025 12:22

richard-viney force-pushed the js-unaligned-bit-arrays branch 5 times, most recently from 9928d33 to 30be6f1 Compare January 23, 2025 23:39

richard-viney force-pushed the js-unaligned-bit-arrays branch 2 times, most recently from 74a82c0 to 49e0009 Compare February 8, 2025 03:32

lpil reviewed Feb 9, 2025

View reviewed changes

richard-viney force-pushed the js-unaligned-bit-arrays branch 3 times, most recently from c89b876 to e60bc99 Compare February 10, 2025 00:44

lpil reviewed Feb 10, 2025

View reviewed changes

richard-viney added 3 commits February 12, 2025 10:29

Unaligned bit arrays on the JavaScript target

8b5c1f7

Make removal of JS bit array functions possible when they're not used

ad8f1c0

Fix discarded bit array pattern segments resulting in unused imports

2516b0b

richard-viney force-pushed the js-unaligned-bit-arrays branch from b011d2d to 2516b0b Compare February 11, 2025 21:29

Minor corrections to JS prelude types

80d7675

lpil approved these changes Feb 18, 2025

View reviewed changes

lpil merged commit 3acc384 into gleam-lang:main Feb 18, 2025
12 checks passed

richard-viney deleted the js-unaligned-bit-arrays branch February 18, 2025 13:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Unaligned bit arrays on the JavaScript target #3946

Unaligned bit arrays on the JavaScript target #3946

Uh oh!

richard-viney commented Dec 3, 2024 •

edited

Loading

Uh oh!

lpil left a comment

Uh oh!

Uh oh!

richard-viney commented Dec 23, 2024 •

edited

Loading

Uh oh!

richard-viney commented Jan 4, 2025 •

edited

Loading

Uh oh!

lpil left a comment

Uh oh!

lpil Feb 9, 2025

Uh oh!

richard-viney Feb 10, 2025 •

edited

Loading

Uh oh!

lpil Feb 10, 2025

Uh oh!

lpil left a comment

Uh oh!

lpil Feb 10, 2025

Uh oh!

richard-viney Feb 10, 2025

Uh oh!

lpil left a comment

Uh oh!

Uh oh!

richard-viney commented Feb 19, 2025

Uh oh!

Uh oh!

Uh oh!

Unaligned bit arrays on the JavaScript target #3946

Unaligned bit arrays on the JavaScript target #3946

Uh oh!

Conversation

richard-viney commented Dec 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary 📘

Implementation Details 🛠️

Implications for @external JavaScript code 🌍

Implications for gleam/stdlib 🤝

Testing 🧪

Uh oh!

lpil left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

richard-viney commented Dec 23, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

richard-viney commented Jan 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lpil left a comment

Choose a reason for hiding this comment

Uh oh!

lpil Feb 9, 2025

Choose a reason for hiding this comment

Uh oh!

richard-viney Feb 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lpil Feb 10, 2025

Choose a reason for hiding this comment

Uh oh!

lpil left a comment

Choose a reason for hiding this comment

Uh oh!

lpil Feb 10, 2025

Choose a reason for hiding this comment

Uh oh!

richard-viney Feb 10, 2025

Choose a reason for hiding this comment

Uh oh!

lpil left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

richard-viney commented Feb 19, 2025

Uh oh!

Uh oh!

richard-viney commented Dec 3, 2024 •

edited

Loading

Implications for `@external` JavaScript code 🌍

Implications for `gleam/stdlib` 🤝

richard-viney commented Dec 23, 2024 •

edited

Loading

richard-viney commented Jan 4, 2025 •

edited

Loading

richard-viney Feb 10, 2025 •

edited

Loading