WIP: Bindings IR (v2) #2457

Open
wants to merge 1 commit into
base: main

@bendk bendk commented Feb 27, 2025

Adding a new system for bindings generation, modeled after a compiler pipeline with a lot of small steps. This is a WIP continuation of #2333.

Planning out the work

This time I only implemented the general parts of the IR, not the Python pipeline. My next step is to try to get the Gecko JS bindings to use this. I'm pretty confident that I can get Python working again so I'd rather work on a new language. Also, I think the Gecko JS bindings could really benefit from this system.

In the meantime, I expect this to be a long-lived PR/branch. Feel free to comment on it whenever you get some time to take a look. I don't expect many merge errors, since this is basically just adding new code.

I'm thinking that we should give external bindings authors plenty of time to see this before we force them to switch over. If we decide to go with this and migrate all the builtin bindings generators to the new system, I'd still like to see at least 2 UniFFI versions that support the current ComponentInterface-based generation before we force users to move to the new one. If it's not much work to keep supporting the legacy system, then I'd like it to be more like 4 versions.

The IRs

The IR types are mostly similar to before, with one big difference: IRs contain items from every crate involved. My feeling is that this will set us up to finally tackle issues that require cross-module coordination like #1896 and #2430.
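To make the cross-module point concrete, here is a minimal sketch of why a single IR spanning every crate helps. All of the type and field names (`Root`, `Module`, `Record`, `TypeRef`, `resolve`) are assumptions made up for illustration and are not taken from the PR's actual IR.

```rust
use std::collections::HashMap;

// Hypothetical, heavily simplified IR. None of these names come from the PR.
#[derive(Debug)]
struct Root {
    // One module per crate, keyed by crate name.
    modules: HashMap<String, Module>,
}

#[derive(Debug)]
struct Module {
    records: Vec<Record>,
}

#[derive(Debug)]
struct Record {
    name: String,
    fields: Vec<Field>,
}

#[derive(Debug)]
struct Field {
    name: String,
    ty: TypeRef,
}

// A reference to a record that may live in a different crate.
#[derive(Debug, Clone, PartialEq)]
struct TypeRef {
    module: String,
    name: String,
}

impl Root {
    // Because the IR holds every crate, a lookup can cross module boundaries.
    fn resolve(&self, ty: &TypeRef) -> Option<&Record> {
        self.modules
            .get(&ty.module)?
            .records
            .iter()
            .find(|r| r.name == ty.name)
    }
}

fn sample_root() -> Root {
    let point = Record { name: "Point".to_string(), fields: vec![] };
    let line = Record {
        name: "Line".to_string(),
        fields: vec![Field {
            name: "start".to_string(),
            ty: TypeRef { module: "geometry".to_string(), name: "Point".to_string() },
        }],
    };
    let mut modules = HashMap::new();
    modules.insert("geometry".to_string(), Module { records: vec![point] });
    modules.insert("drawing".to_string(), Module { records: vec![line] });
    Root { modules }
}

fn main() {
    let root = sample_root();
    let line = &root.modules["drawing"].records[0];
    for field in &line.fields {
        let target = root.resolve(&field.ty);
        println!("{}.{} -> {:?}", line.name, field.name, target.map(|r| &r.name));
    }
}
```

With a per-crate IR, the `drawing` module would have no way to inspect `geometry::Point`. Here the lookup is an ordinary map access.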

The passes

I tried sooooo many ways to express the logic of each pass in a nice way. In the end I was inspired by the nanopass framework, where you have a bunch of little passes. That keeps the logic modular and readable. When you read one of the ir::general::pass modules, it's doing a single thing. Before, we had one big pass that converted everything, which was harder to understand.

The downside of a nanopass framework is that it's pretty easy to make rustc explode. My initial try created a separate set of IR node types for each pass, but once I got to a dozen or so the build time for uniffi_bindgen went to about 30s. Then it started getting killed by the OOM killer and I decided it was time for a new direction.

To avoid this, the new system creates a single IR for a pass, but then splits the pass up into multiple steps. Even though each step works on the same IR, it still kind of feels like they're transforming the types. Instead of adding a new field, they now set a value for a field that was previously empty.
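A rough sketch of the "single IR, many steps" idea, under stated assumptions: the `Function` type, its fields, and the step names below are hypothetical and only illustrate the pattern of steps filling in previously empty fields.

```rust
// Hypothetical node type; the field names are illustrative, not the PR's.
#[derive(Debug, Default)]
struct Function {
    name: String,
    module: String,
    // Empty until `add_ffi_names` runs.
    ffi_name: Option<String>,
    // Empty until `add_checksum_names` runs.
    checksum_name: Option<String>,
}

// Every step has the same signature because every step works on the same IR.
type Step = fn(&mut [Function]);

fn add_ffi_names(funcs: &mut [Function]) {
    for f in funcs.iter_mut() {
        f.ffi_name = Some(format!("ffi_{}_{}", f.module, f.name));
    }
}

fn add_checksum_names(funcs: &mut [Function]) {
    for f in funcs.iter_mut() {
        // This step depends on a field that an earlier step filled in.
        let ffi_name = f.ffi_name.as_deref().expect("add_ffi_names must run first");
        f.checksum_name = Some(format!("{ffi_name}_checksum"));
    }
}

// Run the steps in order over the one shared IR.
fn run_steps(funcs: &mut [Function], steps: &[Step]) {
    for &step in steps {
        step(&mut *funcs);
    }
}

fn main() {
    let mut funcs = vec![Function {
        name: "add".to_string(),
        module: "arithmetic".to_string(),
        ..Default::default()
    }];
    run_steps(&mut funcs, &[add_ffi_names, add_checksum_names]);
    println!("{:#?}", funcs);
}
```

Only one set of node types gets compiled, no matter how many steps there are. The cost is that ordering between steps is checked at runtime (the `expect` above) rather than by the type system.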

This achieves more or less the same effect as the nanopass system, but with much faster compile times than the one-IR-per-pass attempt. My not-very-scientific measurement is that it went from 12s before to 17s now. That's not great, but if this is successful we can hopefully delete a bunch of the old code and get times down to something reasonable again.

The CLI

I replaced the peek and diff commands with the pipeline command. If you have time, read the internals doc about it and take it for a test drive. IMO it's pretty fun walking through each pipeline step.

@bendk bendk requested a review from a team as a code owner February 27, 2025 22:36
@bendk bendk requested review from badboy and removed request for a team February 27, 2025 22:36
@bendk bendk force-pushed the push-zwspykqxzzrx branch from ea4e1ff to 8dec2cd Compare February 27, 2025 23:01
@bendk bendk changed the title Bindings IR (v2) WIP: Bindings IR (v2) Feb 28, 2025
@bendk bendk force-pushed the push-zwspykqxzzrx branch 14 times, most recently from 99f3818 to f1d15da Compare March 11, 2025 13:27
@bendk bendk force-pushed the push-zwspykqxzzrx branch 4 times, most recently from 8453674 to 6a3b744 Compare March 14, 2025 20:51

bendk commented Mar 14, 2025

Updating this PR to use fewer macros and less syn parsing. Now, instead of wrapping everything with ir! and using ir_pass! to generate the pass IR, there are just derive macros. One implements the node trait and one generates the macros to construct the nodes. This means instead of having ir_pass! auto-generate everything, you need to write out the pass IR types. These changes were based on my experience with JS, which was mostly positive but showed some of the issues with the last design.

The main issue was that when we were auto-generating the pass IR types, we would copy all the #[derive] attributes from the output IR. This caused issues since I wanted to #[derive(rinja::Template)] for the Gecko-JS IR types, but it doesn't make sense for the pass IR types and in fact errors out because there's no filters module.

The new system requires extra boilerplate compared to the last, but it also feels much less magical. I like this trade-off overall, even when you ignore the issue with rinja::Template.

There's no longer specialized code to convert uniffi_meta types into the initial IR. We just derive Node for those types and then we can automatically do the conversion. This means that the foundational IR pipeline code needed to move out of uniffi_bindgen, since uniffi_meta can't depend on uniffi_bindgen without creating a circular dependency. I created the uniffi_pipeline crate to hold that code.

The other improvement is that the new Node types can operate more dynamically, i.e. the new pipeline is based around Box<dyn Node> rather than any concrete type. I think this makes the pipeline code nicer.
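As a rough illustration of what a pipeline built around `Box<dyn Node>` can look like, here is a minimal sketch. The `Node` trait below, with its `as_any_mut` and `children_mut` methods and the `visit_mut` helper, is an assumption written for this example. The PR's derived `Node` trait may have a different shape.

```rust
use std::any::Any;

// Hypothetical trait: the PR's real `Node` trait is derived and may look quite different.
trait Node: Any {
    fn as_any_mut(&mut self) -> &mut dyn Any;
    fn children_mut(&mut self) -> Vec<&mut dyn Node>;
}

#[derive(Debug)]
struct Module {
    name: String,
    functions: Vec<Function>,
}

#[derive(Debug)]
struct Function {
    name: String,
}

impl Node for Module {
    fn as_any_mut(&mut self) -> &mut dyn Any {
        self
    }
    fn children_mut(&mut self) -> Vec<&mut dyn Node> {
        self.functions.iter_mut().map(|f| f as &mut dyn Node).collect()
    }
}

impl Node for Function {
    fn as_any_mut(&mut self) -> &mut dyn Any {
        self
    }
    fn children_mut(&mut self) -> Vec<&mut dyn Node> {
        Vec::new()
    }
}

// Apply `f` to every node of type `T` in the tree, without knowing the root's concrete type.
fn visit_mut<T: Node>(node: &mut dyn Node, f: &mut dyn FnMut(&mut T)) {
    if let Some(typed) = node.as_any_mut().downcast_mut::<T>() {
        f(typed);
    }
    for child in node.children_mut() {
        visit_mut::<T>(child, &mut *f);
    }
}

fn sample_module() -> Module {
    Module {
        name: "arithmetic".to_string(),
        functions: vec![
            Function { name: "add".to_string() },
            Function { name: "sub".to_string() },
        ],
    }
}

fn main() {
    // The pipeline only ever sees a type-erased root.
    let mut root: Box<dyn Node> = Box::new(sample_module());
    visit_mut::<Function>(&mut *root, &mut |f: &mut Function| {
        f.name = format!("ffi_{}", f.name);
    });
    visit_mut::<Module>(&mut *root, &mut |m: &mut Module| {
        println!("{}: {:?}", m.name, m.functions);
    });
}
```

A step written this way names only the node type it cares about, so the same driver code works for any IR whose types implement the trait.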

@bendk bendk force-pushed the push-zwspykqxzzrx branch 5 times, most recently from fb63312 to c219ada Compare March 15, 2025 00:41
@bendk bendk force-pushed the push-zwspykqxzzrx branch 24 times, most recently from 991d967 to f1aa1f4 Compare March 21, 2025 15:07
@bendk bendk force-pushed the push-zwspykqxzzrx branch from f1aa1f4 to 03512a3 Compare March 24, 2025 12:54

bendk commented Mar 24, 2025

After working more with the Gecko JS bindings, I realized we can avoid having a separate pass IR, which simplifies a lot of things.

The only reason to have a pass IR is to remove fields/variants, but we can get away with only defining new fields. In other words, each IR is a superset of the last one. The main exception in the old code was when transforming the uniffi_meta::Metadata items. I changed things so this happens before transforming to the initial IR. After that, it's all additive. I think it should still be possible to remove fields/variants if we really needed to. However, we can just treat that as the exception rather than the rule.
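The additive relationship between IRs can be sketched as follows. The `initial` and `general` modules, the `Function` fields, and the `lower` helper are hypothetical names chosen for this example. The point is only that the later IR keeps every earlier field and adds new ones that start out empty.

```rust
// Hypothetical IRs; the module and field names are illustrative only.
mod initial {
    #[derive(Debug, Clone)]
    pub struct Function {
        pub name: String,
        pub is_async: bool,
    }
}

mod general {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Function {
        // Carried over unchanged from `initial::Function`.
        pub name: String,
        pub is_async: bool,
        // New in this IR: starts empty and is filled in by a later step.
        pub ffi_name: String,
    }

    // Since the new IR is a superset, conversion is a plain field copy plus defaults.
    impl From<super::initial::Function> for Function {
        fn from(f: super::initial::Function) -> Self {
            Self {
                name: f.name,
                is_async: f.is_async,
                ffi_name: String::new(),
            }
        }
    }

    pub fn add_ffi_names(funcs: &mut [Function]) {
        for f in funcs.iter_mut() {
            let prefix = if f.is_async { "ffi_async" } else { "ffi" };
            f.ffi_name = format!("{prefix}_{}", f.name);
        }
    }
}

// Convert to the next IR, then run the steps that populate the new fields.
fn lower(funcs: Vec<initial::Function>) -> Vec<general::Function> {
    let mut out: Vec<general::Function> = funcs.into_iter().map(Into::into).collect();
    general::add_ffi_names(&mut out);
    out
}

fn main() {
    let funcs = vec![
        initial::Function { name: "add".to_string(), is_async: false },
        initial::Function { name: "fetch".to_string(), is_async: true },
    ];
    for f in lower(funcs) {
        println!("{:?}", f);
    }
}
```

No intermediate pass IR is needed here because nothing is removed: the conversion never has to represent a state where a field exists in neither the input nor the output shape.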

@bendk bendk force-pushed the push-zwspykqxzzrx branch 2 times, most recently from 293c0db to dc0ac2b Compare March 24, 2025 16:43
@bendk bendk force-pushed the push-zwspykqxzzrx branch from dc0ac2b to 19f0fc4 Compare March 24, 2025 16:56