Skip to content

Support function vs code pointer relocations on CHERI-RISC-V #788

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Aug 22, 2025

Conversation

jrtc27
Copy link
Member

@jrtc27 jrtc27 commented Aug 21, 2025

  • [NFC][Target][AsmPrinter] Introduce and use lowerCheriCodeReference
  • [BinaryFormat][RISCV] Reserve R_RISCV_CHERI_CAPABILITY_CODE
  • [RISCV] Add MC support for .chericap %code(expr)
  • [ELF][RISCV] Support R_RISCV_CHERI_CAPABILITY_CODE and R_RISCV_FUNC_RELATIVE
  • [RISCV] Use new %code(expr) for lowerCheriCodeReference

@jrtc27 jrtc27 requested a review from dpgao August 21, 2025 13:09
@arichardson
Copy link
Member

I've not really been following these changes so this is from an outside view: the name code is somewhat confusing. What is the difference between function and code? Would a name something like local_func or local_code make more sense to distinguish trampoline vs no trampoline?

@jrtc27
Copy link
Member Author

jrtc27 commented Aug 21, 2025

The names are what we settled on for Morello, and are documented in aaelf64-morello. "Code" is intended to capture that you are referring to a specific point in the code, i.e. an instruction, whereas "function" is intended to capture that it's a more abstract reference where you just want to be able to call it (whether for a direct call or a function pointer), you don't actually care about the specific instruction you're pointing to, and therefore it's appropriate to insert trampolines.

@arichardson
Copy link
Member

The names are what we settled on for Morello, and are documented in aaelf64-morello. "Code" is intended to capture that you are referring to a specific point in the code, i.e. an instruction, whereas "function" is intended to capture that it's a more abstract reference where you just want to be able to call it (whether for a direct call or a function pointer), you don't actually care about the specific instruction you're pointing to, and therefore it's appropriate to insert trampolines.

Thanks for the explanation. That makes sense to me. I do feel like it would be good to capture that it's a direct reference, but of course that conflicts with the need to keep names short.

@jrtc27
Copy link
Member Author

jrtc27 commented Aug 21, 2025

The names are what we settled on for Morello, and are documented in aaelf64-morello. "Code" is intended to capture that you are referring to a specific point in the code, i.e. an instruction, whereas "function" is intended to capture that it's a more abstract reference where you just want to be able to call it (whether for a direct call or a function pointer), you don't actually care about the specific instruction you're pointing to, and therefore it's appropriate to insert trampolines.

Thanks for the explanation. That makes sense to me. I do feel like it would be good to capture that it's a direct reference, but of course that conflicts with the need to keep names short.

Yeah. Given the cryptic names that feature in psABIs and assembly syntax I think that's more a case of ensuring we have adequate documentation?

@arichardson
Copy link
Member

Ultimately it's just a name and doesn't matter but I feel it would be good to make it obvious to readers less familiar with the intricacies of these relocations. But yeah having good documentation should address that.

@jrtc27
Copy link
Member Author

jrtc27 commented Aug 22, 2025

Ultimately it's just a name and doesn't matter but I feel it would be good to make it obvious to readers less familiar with the intricacies of these relocations. But yeah having good documentation should address that.

Are you happy for this to be merged then once Dapeng can confirm I didn't screw something up?

(At some point soon I'll go update our psABI to document the relocations)

Copy link
Member

@arichardson arichardson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code all looks good to me. Only minor suggestion is maybe we should add a doc comment somewhere to ensure future readers understand the distinction.

@@ -60,12 +60,14 @@ struct CheriCapRelocLocation {
};

struct CheriCapReloc {
bool isCode;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment here or on addCapReloc() might be nice, just to make future readers don't think they need to set isCode=true for functions.

@@ -56,6 +56,9 @@ class RISCVELFTargetObjectFile : public TargetLoweringObjectFileELF {
const TargetMachine &TM) const override;

int getCheriCapabilitySize(const TargetMachine &TM) const override;

const MCExpr *lowerCheriCodeReference(const MCSymbol *Sym,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And maybe a comment here?

Copy link

@dpgao dpgao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me.

@jrtc27
Copy link
Member Author

jrtc27 commented Aug 22, 2025

Code all looks good to me. Only minor suggestion is maybe we should add a doc comment somewhere to ensure future readers understand the distinction.

Please check if you think the comments I've added make it clear

@arichardson
Copy link
Member

Code all looks good to me. Only minor suggestion is maybe we should add a doc comment somewhere to ensure future readers understand the distinction.

Please check if you think the comments I've added make it clear

Looks great! Thanks for adding them.

jrtc27 added 5 commits August 22, 2025 19:10
This will allow targets to create a target-specific MCExpr for referring
to code, bypassing any use of trampolines in the presence of a c18n
runtime.
For non-c18n runtimes this is the same as R_RISCV_CHERI_CAPABILITY, but
for c18n runtimes this marks capabilities that must not be wrapped in a
trampoline as they must point directly to the code in question.
…ELATIVE

Note that we only use R_RISCV_FUNC_RELATIVE for CheriABI binaries as
it's redundant for plain RISC-V and we would not be able to produce
binaries that work with existing upstream RISC-V runtimes.
@jrtc27
Copy link
Member Author

jrtc27 commented Aug 22, 2025

(Updated to s/pointer/symbol/ as "symbol is a function pointer" is ambiguous as to whether that means "of type STT_FUNC" or "of type STT_OBJECT that holds a pointer", even if clear from the context)

@jrtc27 jrtc27 merged commit 7642871 into dev Aug 22, 2025
6 checks passed
@jrtc27 jrtc27 deleted the func-code-relocs branch August 22, 2025 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants