From 83aa13b895c4cd70ab441d9b980a3378051fac54 Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Tue, 12 Dec 2023 12:16:08 -0800 Subject: [PATCH 01/13] RFC: patchable-function-entry Support adding a user-specified number of nops before functions and before the prelude. --- text/3543-patchable-function-entry.md | 193 ++++++++++++++++++++++++++ 1 file changed, 193 insertions(+) create mode 100644 text/3543-patchable-function-entry.md diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md new file mode 100644 index 00000000000..d8e3b37a114 --- /dev/null +++ b/text/3543-patchable-function-entry.md @@ -0,0 +1,193 @@ +- Feature Name: `patchable_function_entry` +- Start Date: 2023-12-12 +- RFC PR: [rust-lang/rfcs#3543](https://github.com/rust-lang/rfcs/pull/3543) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +This RFC proposes support for `patchable-function-entry` as present in [`clang`](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fpatchable-function-entry) and [`gcc`](https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fpatchable-function-entry). This feature is generally used to allow hotpatching and instrumentation of code. + +# Motivation +[motivation]: #motivation + +The Linux kernel uses `-fpatchable-function-entry` heavily, including for [`ftrace`](https://www.kernel.org/doc/html/v6.6/trace/ftrace.html) and [`FINEIBT` for x86](https://github.com/torvalds/linux/blob/26aff849438cebcd05f1a647390c4aa700d5c0f1/arch/x86/Kconfig#L2464). Today, enabling these features alongside Rust will lead to confusing or broken behavior (`ftrace` will fail to trace Rust functions when developing, `FINEIBT` will conflict with the `kcfi` sanitizer, etc.). It also uses the `clang` and `gcc` attribute `patchable_function_entry` to disable this padding on fragile functions or those used for instrumentation. + +Integrating Rust code into this and other large projects which expect all native code to have these nop buffers will be made easier by allowing them to request the same treatment of native functions they get in C and C++. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +`patchable-function-entry` provides configurable nop padding before function symbols and after function symbols but before any generated code. We refer to the former as `prefix` padding and the latter as `entry` padding. For example, if we had a function `f` with `prefix` set to 3 and `entry` to 2, we'd expect to see: + +``` +nop +nop +nop +f: +nop +nop +// Code goes here +``` + +To set this for all functions in a crate, use `-Z patchable-function-entry=nop_count,offset` where `nop_count = prefix + entry`, and `offset = prefix`. Usually, you'll want to copy this value from a corresponding `-fpatchable-function-entry=` being passed to the C compiler in your project. + +To set this for a specific function, use `#[patchable_function_entry(prefix(m), entry(n))]` to pad with m nops before the symbol and n after the symbol, but before the prelude. This will override the flag value. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +`patchable-function-entry` provides configurable nop padding before function symbols and after function symbols but before any generated code. We refer to the former as `prefix` padding and the latter as `entry` padding. For example, if we had a function `f` with `prefix` set to 3 and `entry` to 2, we'd expect to see: + +``` +f_pad: +nop +nop +nop +f: +nop +nop +// Code goes here +``` + +Nop padding may not be supported on all architectures. As of the time of writing, support includes: + +- aarch64 +- aarch64\_be +- loongarch32 +- loongarch64 +- riscv32 +- riscv64 +- i686 +- x86\_64 + +`f_pad` addresses for every padded symbol are aggregated in the `__patchable_function_entries` section of the resulting object. +This is not a real symbol, just a collected location. + +## Compiler flag `-Z patchable-function-entry` + +This flag comes in two forms: + +- `-Z patchable-function-entry=nop_count,offset` +- `-Z patchable-function-entry=nop_count` + +In the latter, offset is assumed to be zero. `nop_count` must be greater than or equal to `offset`, or it will be rejected. + +If unspecified, the current behavior is maintained, which is equivalent to `=0` here. + +This flag sets the default nop padding for all functions in the crate. Notably, this default *only applies to codegenned functions*. If a function is monomorphized during the compilation of another crate or any similar scenario, it will use the default from that crate's compilation. In most cases, all crates in a compilation should use the same value of `-Z patchable-function-entry` to reduce confusion. + +`prefix` is calculated as `offset`. `entry` is calculated as `nop_count - offset`. This unusual mode of specification is intended to mimic the compiler flags of `clang` and `gcc` for ease of build system integration. + +## Attribute `#[patchable_function_entry]` + +This attribute allows specification of either the `prefix` or `entry` values or both, using the format `#[patchable_function_entry(prefix(n), entry(n))]`. If either is left unspecified, it overrides them to a default value of 0. + +As this is specified via an attribute, it will persist across crate boundaries unlike the compiler flag. + +# Drawbacks +[drawbacks]: #drawbacks + +Not currently aware of any other than the complexity that comes from adding anything. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +## Implementation Levels +### Status Quo +If we keep to the status quo, we need to go through the Linux kernel making Rust support disable a variety of features which depend on this codegen feature. While I have not taken a complete inventory, this includes debugging features (e.g. `ftrace`) and hardening features (e.g. `FINEIBT`). + +This alternative runs the risk of the Rust-for-Linux experiment not leaving experiment status, and similar systems with introspection considering Rust unsuitable. + +The primary advantage of this design is that it does not require us to do anything. + +### Only compiler flag +In this design, we only add the `-Z patchable-function-entry` flag and not the attribute. This is enough for today - it would allow Rust to participate in these schemes, and in the event that a user *deeply* needed an uninstrumented function, they could build it as a separate crate. + +This design has two drawbacks: + +- It requires users to artificially structure their code as a form of annotation. +- The caveats around polymorphic functions using their codegen environment's flags could be tricky or surprising. + +The primary advantage of this design is that it is purely a compiler feature, with no change to the language. + +## Compiler flag and no-padding attribute +In this design, we add the compiler flag and an attribute that zeroes out padding for a function. This covers all the use cases I see in the Linux kernel today, so the only real downside is missing the opportunity to match `gcc` and `clang`'s capabilities with only a small bit more code. + +Some other project might use explicit padding configuration per-function, but a quick search across github only finds the `patchable_function_entry` attribute set to `(0, 0)` other than in compiler tests. + +## Everything (proposed design) +The only real downside I see here is the complexity of adding one more thing to the language. + +# Argument style + +There are two basic ways being used today to specify this nop padding: + +- `nop_count`,`offset`, used by the attributes and flags in `gcc` and `clang`. +- `prefix`, `entry`, used by the *LLVM* attributes after translation from the language level attributes and flags. + +The primary advantage of the first format is that it is used in `gcc` and `clang`. This means that existing documentation will not mislead users and tooling will have an easier time feeding the correct flag to Rust. + +The advantage of the second style is that `prefix` and `entry` don't have validity constraints (`nop_count` must be greater than `offset`) and it's more obvious what the user is asking for. + +## Copy `gcc`/`clang` everywhere + +This approach has the advantage of matching all existing docs and programmers coming over not being confused. + +## Use LLVM-style everywhere + +This format doesn't require validation and is likely easier to understand for users not already exposed to this concept. + +## Use `gcc`/`clang` for flags, LLVM-style for arguments (proposed) + +Build systems tend to interact with our flag interface, and they already have `nop_count,offset` format flags constructed for their C compilers, so this is likely the easiest way for them to interface. + +Users are unlikely to be directly copying code with a manual attribute, and usually are just going to be disabling padding per a github search for the attribute. Setting padding to `(0, 0)` is compatible across both styles, and setting `prefix` and `entry` manually is likely to be more understandable for a new user. + +## Use `gcc`/`clang` for flags, Support both styles for arguments + +Our attribute system is more powerful than `clang` and `gcc`, so we have the option to support: + +- `prefix(n)` +- `entry(n)` +- `nop_count(n)` +- `offset(n)` + +as modifiers to the attribute. We could make `prefix`/`entry` vs `nop_count`/`offset` an exclusive choice, and support both. This would provide the advantage of allowing users copying from or familiar with the other specification system to continue using it. The disadvantages would be more complex attribute parsing and potential confusion for people reading code. + +## Support both styles for flags and arguments + +In addition to supporting `nop_count`/`offset` for attributes, we could support this on the command line as well. This would have three forms: + +- `-Z patchable-function-entry=m` (`nop_count=m`, `offset=0`, compat format) +- `-Z patchable-function-entry=m,n` (`nop_count=m`, `offset=n`, compat format) +- `-Z patchable-function-entry=nop_count=m,offset=n` (`nop_count=m`, `offset=n`, modern format, offset optional) +- `-Z patchable-function-entry=prefix=m,entry=n` (`prefix=m`, `entry=n`, modern format, either optional) + +This would have the benefit of making it more clear what's being specified and allowing users to employ the simpler format on the command line if not integrating with an existing build. + +The primary disadvantage of this is having many ways to say the same thing. + +## Use LLVM-style for flags, `gcc`/`clang` for arguments + +I'm not sure why we would do this. + +# Prior art +[prior-art]: #prior-art + +- Linux uses this flag and attribute extensively +- `clang` [implements the flag](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fpatchable-function-entry) +- `clang` [implements the attribute](https://clang.llvm.org/docs/AttributeReference.html#patchable-function-entry) +- `gcc` [implements the flag](https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fpatchable-function-entry) +- `gcc` [implements the attribute](https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-patchable_005ffunction_005fentry-function-attribute) + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +- Should we use LLVM or `gcc`/`clang` style for a per-function attribute? Should we support both styles? +- Should we support a more explicit command line argument style? + +# Future possibilities +[future-possibilities]: #future-possibilities + +We could potentially use these for dynamic tracing of rust programs, similar to `#[instrument]` in the `tracing` today, but with more configurable behavior and even lower overhead (since there will be no conditionals to check, just a nop sled to go down). From f0d3599f76bf4b05c2476dc7ec88e2ed878c04fd Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Wed, 13 Dec 2023 06:45:06 -0800 Subject: [PATCH 02/13] Describe the flag as though it were stable --- text/3543-patchable-function-entry.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index d8e3b37a114..9e82b1b8c58 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -30,7 +30,7 @@ nop // Code goes here ``` -To set this for all functions in a crate, use `-Z patchable-function-entry=nop_count,offset` where `nop_count = prefix + entry`, and `offset = prefix`. Usually, you'll want to copy this value from a corresponding `-fpatchable-function-entry=` being passed to the C compiler in your project. +To set this for all functions in a crate, use `-C patchable-function-entry=nop_count,offset` where `nop_count = prefix + entry`, and `offset = prefix`. Usually, you'll want to copy this value from a corresponding `-fpatchable-function-entry=` being passed to the C compiler in your project. To set this for a specific function, use `#[patchable_function_entry(prefix(m), entry(n))]` to pad with m nops before the symbol and n after the symbol, but before the prelude. This will override the flag value. @@ -64,18 +64,18 @@ Nop padding may not be supported on all architectures. As of the time of writing `f_pad` addresses for every padded symbol are aggregated in the `__patchable_function_entries` section of the resulting object. This is not a real symbol, just a collected location. -## Compiler flag `-Z patchable-function-entry` +## Compiler flag `-C patchable-function-entry` This flag comes in two forms: -- `-Z patchable-function-entry=nop_count,offset` -- `-Z patchable-function-entry=nop_count` +- `-C patchable-function-entry=nop_count,offset` +- `-C patchable-function-entry=nop_count` In the latter, offset is assumed to be zero. `nop_count` must be greater than or equal to `offset`, or it will be rejected. If unspecified, the current behavior is maintained, which is equivalent to `=0` here. -This flag sets the default nop padding for all functions in the crate. Notably, this default *only applies to codegenned functions*. If a function is monomorphized during the compilation of another crate or any similar scenario, it will use the default from that crate's compilation. In most cases, all crates in a compilation should use the same value of `-Z patchable-function-entry` to reduce confusion. +This flag sets the default nop padding for all functions in the crate. Notably, this default *only applies to codegenned functions*. If a function is monomorphized during the compilation of another crate or any similar scenario, it will use the default from that crate's compilation. In most cases, all crates in a compilation should use the same value of `-C patchable-function-entry` to reduce confusion. `prefix` is calculated as `offset`. `entry` is calculated as `nop_count - offset`. This unusual mode of specification is intended to mimic the compiler flags of `clang` and `gcc` for ease of build system integration. @@ -102,7 +102,7 @@ This alternative runs the risk of the Rust-for-Linux experiment not leaving expe The primary advantage of this design is that it does not require us to do anything. ### Only compiler flag -In this design, we only add the `-Z patchable-function-entry` flag and not the attribute. This is enough for today - it would allow Rust to participate in these schemes, and in the event that a user *deeply* needed an uninstrumented function, they could build it as a separate crate. +In this design, we only add the `-C patchable-function-entry` flag and not the attribute. This is enough for today - it would allow Rust to participate in these schemes, and in the event that a user *deeply* needed an uninstrumented function, they could build it as a separate crate. This design has two drawbacks: @@ -159,10 +159,10 @@ as modifiers to the attribute. We could make `prefix`/`entry` vs `nop_count`/`of In addition to supporting `nop_count`/`offset` for attributes, we could support this on the command line as well. This would have three forms: -- `-Z patchable-function-entry=m` (`nop_count=m`, `offset=0`, compat format) -- `-Z patchable-function-entry=m,n` (`nop_count=m`, `offset=n`, compat format) -- `-Z patchable-function-entry=nop_count=m,offset=n` (`nop_count=m`, `offset=n`, modern format, offset optional) -- `-Z patchable-function-entry=prefix=m,entry=n` (`prefix=m`, `entry=n`, modern format, either optional) +- `-C patchable-function-entry=m` (`nop_count=m`, `offset=0`, compat format) +- `-C patchable-function-entry=m,n` (`nop_count=m`, `offset=n`, compat format) +- `-C patchable-function-entry=nop_count=m,offset=n` (`nop_count=m`, `offset=n`, modern format, offset optional) +- `-C patchable-function-entry=prefix=m,entry=n` (`prefix=m`, `entry=n`, modern format, either optional) This would have the benefit of making it more clear what's being specified and allowing users to employ the simpler format on the command line if not integrating with an existing build. From 3b8eb1a546e36504c91af9fb0f7f0b755dda06d3 Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Wed, 13 Dec 2023 07:54:07 -0800 Subject: [PATCH 03/13] Added error-on-unsupported constraint --- text/3543-patchable-function-entry.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index 9e82b1b8c58..64cc05eaaa3 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -79,12 +79,16 @@ This flag sets the default nop padding for all functions in the crate. Notably, `prefix` is calculated as `offset`. `entry` is calculated as `nop_count - offset`. This unusual mode of specification is intended to mimic the compiler flags of `clang` and `gcc` for ease of build system integration. +Specifying the compiler flag for a backend or architecture which does not support this feature will result in an error. + ## Attribute `#[patchable_function_entry]` This attribute allows specification of either the `prefix` or `entry` values or both, using the format `#[patchable_function_entry(prefix(n), entry(n))]`. If either is left unspecified, it overrides them to a default value of 0. As this is specified via an attribute, it will persist across crate boundaries unlike the compiler flag. +Specifying any amount of padding other than 0 in an attribute will result in an error on backends or architectures which do not support this feature. + # Drawbacks [drawbacks]: #drawbacks From 0b0e4286adf24294047e2a8b55fb9655f34b2d1c Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Wed, 13 Dec 2023 07:54:18 -0800 Subject: [PATCH 04/13] Added incompatible padding link error question --- text/3543-patchable-function-entry.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index 64cc05eaaa3..2662a981643 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -190,6 +190,7 @@ I'm not sure why we would do this. - Should we use LLVM or `gcc`/`clang` style for a per-function attribute? Should we support both styles? - Should we support a more explicit command line argument style? +- Should we reject linking crates with different default padding configurations? # Future possibilities [future-possibilities]: #future-possibilities From aafb1d328c33ebaa11cc11e7c876228bacd43c47 Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Tue, 2 Jan 2024 19:10:47 +0000 Subject: [PATCH 05/13] Change parameter specification style This should begin to address concerns about confusions between different flag vs attribute. Specifically: * Require command line flag to specify parameter names * Switch attribute name to `patchable` * Use metalist format for `patchable`, and require at least one parameter * Add `unpatchable` alias --- text/3543-patchable-function-entry.md | 28 +++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index 2662a981643..1501f2fc8cd 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -30,9 +30,9 @@ nop // Code goes here ``` -To set this for all functions in a crate, use `-C patchable-function-entry=nop_count,offset` where `nop_count = prefix + entry`, and `offset = prefix`. Usually, you'll want to copy this value from a corresponding `-fpatchable-function-entry=` being passed to the C compiler in your project. +To set this for all functions in a crate, use `-C patchable-function-entry=nop_count=m,offset=n` where `nop_count = prefix + entry`, and `offset = prefix`. Usually, you'll want to copy this value from a corresponding `-fpatchable-function-entry=` being passed to the C compiler in your project. -To set this for a specific function, use `#[patchable_function_entry(prefix(m), entry(n))]` to pad with m nops before the symbol and n after the symbol, but before the prelude. This will override the flag value. +To set this for a specific function, use `#[patchable(prefix = m, entry = n)]` to pad with m nops before the symbol and n after the symbol, but before the prelude. This will override the flag value. To disable padding for a specific function, for example because it is part of the instrumentation framework, use `#[unpatchable]`. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -68,12 +68,12 @@ This is not a real symbol, just a collected location. This flag comes in two forms: -- `-C patchable-function-entry=nop_count,offset` -- `-C patchable-function-entry=nop_count` +- `-C patchable-function-entry=nop_count=m,offset=n` +- `-C patchable-function-entry=nop_count=m` In the latter, offset is assumed to be zero. `nop_count` must be greater than or equal to `offset`, or it will be rejected. -If unspecified, the current behavior is maintained, which is equivalent to `=0` here. +If unspecified, the current behavior is maintained, which is equivalent to `nop_count=0` here. This flag sets the default nop padding for all functions in the crate. Notably, this default *only applies to codegenned functions*. If a function is monomorphized during the compilation of another crate or any similar scenario, it will use the default from that crate's compilation. In most cases, all crates in a compilation should use the same value of `-C patchable-function-entry` to reduce confusion. @@ -81,14 +81,16 @@ This flag sets the default nop padding for all functions in the crate. Notably, Specifying the compiler flag for a backend or architecture which does not support this feature will result in an error. -## Attribute `#[patchable_function_entry]` +## Attributes `#[patchable]` and `#[unpatchable]`. -This attribute allows specification of either the `prefix` or `entry` values or both, using the format `#[patchable_function_entry(prefix(n), entry(n))]`. If either is left unspecified, it overrides them to a default value of 0. +This attribute allows specification of either the `prefix` or `entry` values or both, using the format `#[patchable(prefix = m, entry = n)]`. If either is left unspecified, it overrides them to a default value of 0. Specifying neither prefix nor entry is an error, but explicitly setting them both to 0 is allowed. As this is specified via an attribute, it will persist across crate boundaries unlike the compiler flag. Specifying any amount of padding other than 0 in an attribute will result in an error on backends or architectures which do not support this feature. +`#[unpatchable]` is a built-in alias for `#[patchable(prefix = 0, entry = 0)]`. + # Drawbacks [drawbacks]: #drawbacks @@ -152,19 +154,17 @@ Users are unlikely to be directly copying code with a manual attribute, and usua Our attribute system is more powerful than `clang` and `gcc`, so we have the option to support: -- `prefix(n)` -- `entry(n)` -- `nop_count(n)` -- `offset(n)` +- `prefix = n` +- `entry = n` +- `nop_count = n` +- `offset = n` as modifiers to the attribute. We could make `prefix`/`entry` vs `nop_count`/`offset` an exclusive choice, and support both. This would provide the advantage of allowing users copying from or familiar with the other specification system to continue using it. The disadvantages would be more complex attribute parsing and potential confusion for people reading code. ## Support both styles for flags and arguments -In addition to supporting `nop_count`/`offset` for attributes, we could support this on the command line as well. This would have three forms: +In addition to supporting `nop_count`/`offset` for attributes, we could support this on the command line as well. This would have two forms: -- `-C patchable-function-entry=m` (`nop_count=m`, `offset=0`, compat format) -- `-C patchable-function-entry=m,n` (`nop_count=m`, `offset=n`, compat format) - `-C patchable-function-entry=nop_count=m,offset=n` (`nop_count=m`, `offset=n`, modern format, offset optional) - `-C patchable-function-entry=prefix=m,entry=n` (`prefix=m`, `entry=n`, modern format, either optional) From b5cb2f16425698191f691c186ec4e5cea4fdd7aa Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Tue, 2 Jan 2024 19:19:56 +0000 Subject: [PATCH 06/13] Fix incorrect heading levels --- text/3543-patchable-function-entry.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index 1501f2fc8cd..60fdc6efb61 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -117,15 +117,15 @@ This design has two drawbacks: The primary advantage of this design is that it is purely a compiler feature, with no change to the language. -## Compiler flag and no-padding attribute +### Compiler flag and no-padding attribute In this design, we add the compiler flag and an attribute that zeroes out padding for a function. This covers all the use cases I see in the Linux kernel today, so the only real downside is missing the opportunity to match `gcc` and `clang`'s capabilities with only a small bit more code. Some other project might use explicit padding configuration per-function, but a quick search across github only finds the `patchable_function_entry` attribute set to `(0, 0)` other than in compiler tests. -## Everything (proposed design) +### Everything (proposed design) The only real downside I see here is the complexity of adding one more thing to the language. -# Argument style +## Argument style There are two basic ways being used today to specify this nop padding: @@ -136,21 +136,21 @@ The primary advantage of the first format is that it is used in `gcc` and `clang The advantage of the second style is that `prefix` and `entry` don't have validity constraints (`nop_count` must be greater than `offset`) and it's more obvious what the user is asking for. -## Copy `gcc`/`clang` everywhere +### Copy `gcc`/`clang` everywhere This approach has the advantage of matching all existing docs and programmers coming over not being confused. -## Use LLVM-style everywhere +### Use LLVM-style everywhere This format doesn't require validation and is likely easier to understand for users not already exposed to this concept. -## Use `gcc`/`clang` for flags, LLVM-style for arguments (proposed) +### Use `gcc`/`clang` for flags, LLVM-style for arguments (proposed) Build systems tend to interact with our flag interface, and they already have `nop_count,offset` format flags constructed for their C compilers, so this is likely the easiest way for them to interface. Users are unlikely to be directly copying code with a manual attribute, and usually are just going to be disabling padding per a github search for the attribute. Setting padding to `(0, 0)` is compatible across both styles, and setting `prefix` and `entry` manually is likely to be more understandable for a new user. -## Use `gcc`/`clang` for flags, Support both styles for arguments +### Use `gcc`/`clang` for flags, Support both styles for arguments Our attribute system is more powerful than `clang` and `gcc`, so we have the option to support: @@ -161,7 +161,7 @@ Our attribute system is more powerful than `clang` and `gcc`, so we have the opt as modifiers to the attribute. We could make `prefix`/`entry` vs `nop_count`/`offset` an exclusive choice, and support both. This would provide the advantage of allowing users copying from or familiar with the other specification system to continue using it. The disadvantages would be more complex attribute parsing and potential confusion for people reading code. -## Support both styles for flags and arguments +### Support both styles for flags and arguments In addition to supporting `nop_count`/`offset` for attributes, we could support this on the command line as well. This would have two forms: @@ -172,7 +172,7 @@ This would have the benefit of making it more clear what's being specified and a The primary disadvantage of this is having many ways to say the same thing. -## Use LLVM-style for flags, `gcc`/`clang` for arguments +### Use LLVM-style for flags, `gcc`/`clang` for arguments I'm not sure why we would do this. From 8aaa21a296fc9258a7af9af21e34ea8b6610580b Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Tue, 2 Jan 2024 22:18:05 +0000 Subject: [PATCH 07/13] Added notes on inlining --- text/3543-patchable-function-entry.md | 69 +++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index 60fdc6efb61..fb5d6341ac7 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -91,6 +91,10 @@ Specifying any amount of padding other than 0 in an attribute will result in an `#[unpatchable]` is a built-in alias for `#[patchable(prefix = 0, entry = 0)]`. +## Optimization Notes + +Neither `#[patchable]` nor `-C patchable-function-entry` imply any restriction on inlining by themselves. If it is critical that patched code in the `entry` section be executed on *every* function invocation, not only in an advisory capacity, annotate the relevant functions with `#[inline(never)]` in addition. + # Drawbacks [drawbacks]: #drawbacks @@ -176,6 +180,71 @@ The primary disadvantage of this is having many ways to say the same thing. I'm not sure why we would do this. +## Inlining + +Inlining a function will prevent code in the `entry` patchable section from being executed. This raises the question of whether we should suppress or lint about inlining around this attribute or flag. + +Existing support in `gcc` and `clang` does not suppress inlining at all, but `rustc` makes much heavier use of inlining than they do by default, making it possible that we might want to make a different call. + +Linux's usage of this flag does not consider inlining suppression to be desirable. The two primary usages are: + +- Hardening, where only indirect calls are considered, and so inlining is a non-issue +- Tracing, where inlined calls are explicitly out of scope, and `noinline` is already explicitly added to C code which should be traced. + +Possible signals we could consider include beyond whether any padding is present: + +- Whether the `entry` padding is nonzero, not considering `prefix` - `prefix` padding would not be executed by a direct call anyways +- Whether the padding was specified by an attribute or a flag +- Whether an explicit inlining annotation is present + +Possible actions we could take include: + +- Nothing +- Warning/linting +- Suppress inlining implicitly + +Since we don't have a way to "reset" inlining to default, any plan involving suppression of inlining also needs to come with additional configuration to suppress the suppression. + +### Inline suppresssion +If the function has nonzero `entry` padding, prevent inlining. + +Add `-C allow-patchable-function-inlining` to disable this behavior. + +Add `#[patchable(inlinable = yes)]` to suppress inline suppression in the attribute. + +The advantage of this approach is that any instrumentation will always trigger when the function is called. + +Disadvantages: + +- When the flag is passed, we will disable inlining *nearly everywhere*. This would be disasterous for performance, given the number of functions Rust depends on inlining to optimize. +- This does not match C/C++ behavior, which means most existing use cases will be surprised. +- We need to add flag complexity to match existing use cases. + +We could mitigate a portion of the disadvantages of this approach by only suppressing for the attribute rather than the flag. This would prevent the use of a flag to trace all function invocations. + +### Lint on attribute +If the function has nonzero `entry` padding specified via attribute, and `#[inline]` is not explicitly set, trigger a lint. + +Use `#[allow]` to accept the inlinability, the same as any other lint. + +The advantage of this approach is that if the attribute is explicitly set, it will surface to the user to think about inlining. By using a lint, we avoid introducing new syntax, allow it to be ignored crate-wide if needed, and avoid user surprise. + +Disadvantages: + +- There are no instances of the C/C++ side variants of this attribute in the wild being used with nonzero entry padding, so we don't know if this behavior would actually be unexpected. +- There is no way for a user to use load-bearing entry padding on the whole program without annotating every function. +- The user would not be informed when patchability was triggered via a compilation flag. + +### Do not suppress inlining (proposed) +Take no action on inlining other than mentioning it in the reference. + +This approach mirrors what C/C++ does today. It doesn't close the door on taking the lint approach in the future, but we wouldn't be able to do suppression in the future without reversing the sense of the extra flags. + +Disadvantages: + +- There is no way for a user to use load-bearing entry padding on the whole program without annotating every function. +- Users not familiar with the C/C++ usage of the flag might be surprised when Rust's more aggressive inlining fails to run an `entry` prelude in some scenarios. + # Prior art [prior-art]: #prior-art From 6d1b0a2a21a1e154f554c2bcd397b9be44df8ae3 Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Tue, 23 Jan 2024 22:25:19 +0000 Subject: [PATCH 08/13] Relax mismatched patchability behavior Explicitly allow either an error or code generation at any patchability configuration in the crate graph. This allows all of: * Disallowing this (detect it and emit an error) * Doing what will happen by default today (functions will have the patchability of their codegen site) * Adding additional support down the line to use the patchability of the declaring crate. This is unlikely to be needed, but this leaves our options open for the future. --- text/3543-patchable-function-entry.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index fb5d6341ac7..619fbbe2b10 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -75,7 +75,7 @@ In the latter, offset is assumed to be zero. `nop_count` must be greater than or If unspecified, the current behavior is maintained, which is equivalent to `nop_count=0` here. -This flag sets the default nop padding for all functions in the crate. Notably, this default *only applies to codegenned functions*. If a function is monomorphized during the compilation of another crate or any similar scenario, it will use the default from that crate's compilation. In most cases, all crates in a compilation should use the same value of `-C patchable-function-entry` to reduce confusion. +This flag sets the default nop padding for all functions in the crate. In most cases, all crates in a compilation should use the same value of `-C patchable-function-entry` to reduce confusion. If not all crates in the compilation graph share the same `patchable-function-entry` configuration, the compiler may produce an error *or* use any patchability specification present in the graph as the default for any function. `prefix` is calculated as `offset`. `entry` is calculated as `nop_count - offset`. This unusual mode of specification is intended to mimic the compiler flags of `clang` and `gcc` for ease of build system integration. From 5c3fd167e3df0a4b6e4b9c4827d1c7741a7e90f3 Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Tue, 23 Jan 2024 22:37:31 +0000 Subject: [PATCH 09/13] Explain entry/prefix values may be restricted Some existing codegens (notably ELFv2+GCC+PPC) can only do some values of `prefix`/`entry`. Call out that it is possible for some values to cause errors, even if the target/backend combination supports patchability. --- text/3543-patchable-function-entry.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index 619fbbe2b10..cf79992b8b6 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -79,7 +79,7 @@ This flag sets the default nop padding for all functions in the crate. In most c `prefix` is calculated as `offset`. `entry` is calculated as `nop_count - offset`. This unusual mode of specification is intended to mimic the compiler flags of `clang` and `gcc` for ease of build system integration. -Specifying the compiler flag for a backend or architecture which does not support this feature will result in an error. +Specifying the compiler flag for a backend or architecture which does not support this feature will result in an error. Some backend / architecture combinations may only support some values of `entry` and `prefix`, in which case an error will also be generated for invalid values. ## Attributes `#[patchable]` and `#[unpatchable]`. @@ -87,7 +87,7 @@ This attribute allows specification of either the `prefix` or `entry` values or As this is specified via an attribute, it will persist across crate boundaries unlike the compiler flag. -Specifying any amount of padding other than 0 in an attribute will result in an error on backends or architectures which do not support this feature. +Specifying any amount of padding other than 0 in an attribute will result in an error on backends or architectures which do not support this feature. Some architecture/backend combinations may only support a subset of prefix and entry nop counts, and may generate errors when other counts are requested. `#[unpatchable]` is a built-in alias for `#[patchable(prefix = 0, entry = 0)]`. From eb061c9b39c523a1f9e8c8fa29b377ae1ac4d58e Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Tue, 23 Jan 2024 22:50:44 +0000 Subject: [PATCH 10/13] Naming bikeshedding --- text/3543-patchable-function-entry.md | 26 ++++++++++++-------------- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index cf79992b8b6..cbc71c01968 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -18,7 +18,7 @@ Integrating Rust code into this and other large projects which expect all native # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -`patchable-function-entry` provides configurable nop padding before function symbols and after function symbols but before any generated code. We refer to the former as `prefix` padding and the latter as `entry` padding. For example, if we had a function `f` with `prefix` set to 3 and `entry` to 2, we'd expect to see: +`patchable-function-entry` provides configurable nop padding before function symbols and after function symbols but before any generated code. We refer to the former as `prefix` padding and the latter as `entry` padding. For example, if we had a function `f` with `prefix_nops` set to 3 and `entry_nops` to 2, we'd expect to see: ``` nop @@ -30,14 +30,14 @@ nop // Code goes here ``` -To set this for all functions in a crate, use `-C patchable-function-entry=nop_count=m,offset=n` where `nop_count = prefix + entry`, and `offset = prefix`. Usually, you'll want to copy this value from a corresponding `-fpatchable-function-entry=` being passed to the C compiler in your project. +To set this for all functions in a crate, use `-C patchable-function-entry=total_nops=m,prefix_nops=n` where `total_nops = prefix_nops + entry_nops`. Usually, you'll want to copy this value from a corresponding `-fpatchable-function-entry=` being passed to the C compiler in your project - `total_nops` will match the first parameter used by your C compiler, and the optional `offset` parameter passed to the C compiler will match `prefix_nops`. -To set this for a specific function, use `#[patchable(prefix = m, entry = n)]` to pad with m nops before the symbol and n after the symbol, but before the prelude. This will override the flag value. To disable padding for a specific function, for example because it is part of the instrumentation framework, use `#[unpatchable]`. +To set this for a specific function, use `#[patchable_function_entry(prefix_nops = m, entry_nops = n)]` to pad with m nops before the symbol and n after the symbol, but before the prelude. This will override the flag value. To disable padding for a specific function, for example because it is part of the instrumentation framework, use `#[patchable_function_entry(entry_nops = 0, prefix_nops = 0)]`. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -`patchable-function-entry` provides configurable nop padding before function symbols and after function symbols but before any generated code. We refer to the former as `prefix` padding and the latter as `entry` padding. For example, if we had a function `f` with `prefix` set to 3 and `entry` to 2, we'd expect to see: +`patchable-function-entry` provides configurable nop padding before function symbols and after function symbols but before any generated code. We refer to the former as `prefix` padding and the latter as `entry` padding. For example, if we had a function `f` with `prefix_nops` set to 3 and `entry_nops` to 2, we'd expect to see: ``` f_pad: @@ -68,29 +68,27 @@ This is not a real symbol, just a collected location. This flag comes in two forms: -- `-C patchable-function-entry=nop_count=m,offset=n` -- `-C patchable-function-entry=nop_count=m` +- `-C patchable-function-entry=total_nops=m,prefix_nops=n` +- `-C patchable-function-entry=total_nops=m` -In the latter, offset is assumed to be zero. `nop_count` must be greater than or equal to `offset`, or it will be rejected. +In the latter, `prefix_nops` is assumed to be zero. `total_nops` must be greater than or equal to `prefix_nops`, or it will be rejected. -If unspecified, the current behavior is maintained, which is equivalent to `nop_count=0` here. +If unspecified, the current behavior is maintained, which is equivalent to `total_nops=0` here. This flag sets the default nop padding for all functions in the crate. In most cases, all crates in a compilation should use the same value of `-C patchable-function-entry` to reduce confusion. If not all crates in the compilation graph share the same `patchable-function-entry` configuration, the compiler may produce an error *or* use any patchability specification present in the graph as the default for any function. -`prefix` is calculated as `offset`. `entry` is calculated as `nop_count - offset`. This unusual mode of specification is intended to mimic the compiler flags of `clang` and `gcc` for ease of build system integration. +`entry_nops` is calculated as `total_nops - prefix_nops`. This unusual mode of specification is intended to mimic the compiler flags of `clang` and `gcc` for ease of build system integration. The first mandatory parameter to their flags matches `total_nops`, and the optional parameter matches `prefix_nops`. -Specifying the compiler flag for a backend or architecture which does not support this feature will result in an error. Some backend / architecture combinations may only support some values of `entry` and `prefix`, in which case an error will also be generated for invalid values. +Specifying the compiler flag for a backend or architecture which does not support this feature will result in an error. Some backend / architecture combinations may only support some values of `entry_nops` and `prefix_nops`, in which case an error will also be generated for invalid values. -## Attributes `#[patchable]` and `#[unpatchable]`. +## Attribute `#[patchable_function_entry]` -This attribute allows specification of either the `prefix` or `entry` values or both, using the format `#[patchable(prefix = m, entry = n)]`. If either is left unspecified, it overrides them to a default value of 0. Specifying neither prefix nor entry is an error, but explicitly setting them both to 0 is allowed. +This attribute allows specification of either the `prefix_nops` or `entry_nops` values or both, using the format `#[patchable_function_entry(prefix_nops = m, entry_nops = n)]`. If either is left unspecified, it overrides them to a default value of 0. Specifying neither `prefix_nops` nor `entry_nops` is an error, but explicitly setting them both to 0 is allowed. As this is specified via an attribute, it will persist across crate boundaries unlike the compiler flag. Specifying any amount of padding other than 0 in an attribute will result in an error on backends or architectures which do not support this feature. Some architecture/backend combinations may only support a subset of prefix and entry nop counts, and may generate errors when other counts are requested. -`#[unpatchable]` is a built-in alias for `#[patchable(prefix = 0, entry = 0)]`. - ## Optimization Notes Neither `#[patchable]` nor `-C patchable-function-entry` imply any restriction on inlining by themselves. If it is critical that patched code in the `entry` section be executed on *every* function invocation, not only in an advisory capacity, annotate the relevant functions with `#[inline(never)]` in addition. From b8953114e0c930b1d9587423e4e5ba87f548d4f1 Mon Sep 17 00:00:00 2001 From: Matthew Maurer Date: Tue, 23 Jan 2024 22:54:39 +0000 Subject: [PATCH 11/13] Move non-load-bearing decisions to future work --- text/3543-patchable-function-entry.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index cbc71c01968..6dbc05375bc 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -263,3 +263,7 @@ Disadvantages: [future-possibilities]: #future-possibilities We could potentially use these for dynamic tracing of rust programs, similar to `#[instrument]` in the `tracing` today, but with more configurable behavior and even lower overhead (since there will be no conditionals to check, just a nop sled to go down). + +We could consider adding `#[unpatchable]` as a shorthand for `#[patchable_function_entry(entry_nops = 0, prefix-nops = 0)]`. + +We could define the behavior around differing default patchability in the crate graph more narrowly (either require a hard error, or require that the compiler follows the declaring crate's padding spec). From d6200f7831d40790678814c9954aad0961f40b7d Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Wed, 27 Mar 2024 00:57:54 +0000 Subject: [PATCH 12/13] Fix missing word in RFC 3543 Under "future possibilities", a sentence was missing the word "crate". Let's fix that. --- text/3543-patchable-function-entry.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index 6dbc05375bc..2bcb34ffaa1 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -262,7 +262,7 @@ Disadvantages: # Future possibilities [future-possibilities]: #future-possibilities -We could potentially use these for dynamic tracing of rust programs, similar to `#[instrument]` in the `tracing` today, but with more configurable behavior and even lower overhead (since there will be no conditionals to check, just a nop sled to go down). +We could potentially use these for dynamic tracing of rust programs, similar to `#[instrument]` in the `tracing` crate today, but with more configurable behavior and even lower overhead (since there will be no conditionals to check, just a nop sled to go down). We could consider adding `#[unpatchable]` as a shorthand for `#[patchable_function_entry(entry_nops = 0, prefix-nops = 0)]`. From 8b925c57c6685a5087a8c70d1e219473a7eb9ee8 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Wed, 27 Mar 2024 00:43:20 +0000 Subject: [PATCH 13/13] Prepare RFC 3543 for merge RFC 3543 has been accepted. We've created the tracking issue. Let's add that to the RFC. --- text/3543-patchable-function-entry.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3543-patchable-function-entry.md b/text/3543-patchable-function-entry.md index 2bcb34ffaa1..d468457c7ba 100644 --- a/text/3543-patchable-function-entry.md +++ b/text/3543-patchable-function-entry.md @@ -1,7 +1,7 @@ - Feature Name: `patchable_function_entry` - Start Date: 2023-12-12 - RFC PR: [rust-lang/rfcs#3543](https://github.com/rust-lang/rfcs/pull/3543) -- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) +- Tracking Issue: [rust-lang/rust#123115](https://github.com/rust-lang/rust/issues/123115) # Summary [summary]: #summary