Skip to content

[libunwind] Add c18n support on RISC-V #785

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from
Open

[libunwind] Add c18n support on RISC-V #785

wants to merge 2 commits into from

Conversation

dpgao
Copy link

@dpgao dpgao commented Aug 12, 2025

This extends libunwind's c18n support to RV64Y.

Although c18n for RV64Y hasn't officially landed yet, this libunwind implementation does not depend on any implementation detail on CheriBSD's side because it merely uses APIs that the existing Morello implementation relies on. And before CheriBSD sets _LIBUNWIND_HAS_CHERI_LIB_C18N for RV64Y, none of the unwinding logic would be compiled in.

Stop treating the trusted stack pointer as a register because it is
unnecessary and some architectures like RISC-V do not in fact store it
in a dedicated register.
@dpgao dpgao requested review from jrtc27 and dstolfa August 12, 2025 20:48
#ifdef _LIBUNWIND_HAS_CHERI_LIB_C18N
// Preserve the argument in a callee-saved register instead of pushing it onto
// the stack because stack unwinding will switch the stack.
cmv cs0, ca0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cs0 is the frame pointer; is that a problem? Even if not, probably more friendly to use a different register here?

@@ -4554,10 +4552,21 @@ class _LIBUNWIND_HIDDEN Registers_riscv {
void setSP(reg_t value) { _registers[2] = value; }
reg_t getIP() const { return _registers[0]; }
void setIP(reg_t value) { _registers[0] = value; }
reg_t getFP() const { return _registers[8]; }
void setFP(reg_t value) { _registers[8] = value; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, I would argue there's no need to encode FP, it's just a saved register, and may not be being used as a frame pointer. Then you don't need to extend each Registers_arch with knowledge about the FP.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trusted stack frame treats fp as a special register and does not store it in the array of callee-saved registers. So without the setter, I'd need some other code that does the equivalent thing, which complicates unwindIfAtBoundary.

I suppose we can gate the getter/setter by _LIBUNWIND_HAS_CHERI_LIB_C18N?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean yes, part of my point is that the interface should really have put fp in regs. But what's wrong with setRegister(UNW_ARCH_[XC]N, state.fp) for now until we change the interface to be more sensible? You already need to list the UNW_ARCH_[XC]N for all the other callee-saved registers.

Copy link
Author

@dpgao dpgao Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works for me. The reason fp is separately stored in the trusted frame is so that the trusted frame would 'look like' a normal frame. When a special flag is passed to RTLD, the trampoline will not clear the tag of fp when entering a callee compartment, allowing anyone to follow fp to cross compartment boundaries and trace the whole call stack.

This should be useful when using pmcstat, but we have never actually tried it...

#ifdef _LIBUNWIND_TARGET_AARCH64
reg_num = UNW_ARM64_C19 + i;
#elif defined(_LIBUNWIND_TARGET_RISCV)
static constexpr int callee_saved[] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to hoist this out, define an equivalent array for Morello, and static assert that it's the same length as state.regs.

@@ -1273,6 +1273,19 @@ DEFINE_LIBUNWIND_FUNCTION(__unw_getcontext)
.endr
# endif

#ifdef _LIBUNWIND_HAS_CHERI_LIB_C18N
// Store the trusted stack pointer
caddi csp, csp, -32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT these two caddis are the only reason this code is RVY-specific. It'd be nice if it worked for Xcheri; we accept addi as an alias for cincoffset(imm) these days. But I see the current RVY spec is stupid and says it should be add not addi, and so that's what Codasip's LLVM implements. I'll go fix the spec, and someone should then bug them to update their assembler. But then you should be able to write assembly that works for both.

// Store the trusted stack pointer
caddi csp, csp, -32
sc ca0, 0(csp)
sc cra, 16(csp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use __SIZEOF_CHERI_CAPABILITY__ here (and for the frame sizes)? We don't have an RV32 platform for this but there's no reason not to make it work when it's this easy (and you've already used __SIZEOF_CHERI_CAPABILITY__ for the final store, and for the restore path).

// Use 240 for ECSP (executive stack pointer). ECSP is not a real DWARF
// register, but we need it to implement c18n-aware unwinding. We pick 240
// because it is far enough away from the range of reserved registers on Arm.
UNW_ARM64_ECSP = 240,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess in an earlier iteration of the patch we really did need it to be a DWARF register?..

@dpgao dpgao changed the title Add c18n support on RISC-V [libunwind] Add c18n support on RISC-V Aug 18, 2025
@dpgao dpgao requested a review from jrtc27 August 18, 2025 12:35
struct dl_c18n_compart_state state;

#ifdef _LIBUNWIND_TARGET_AARCH64
static constexpr int fp_num = UNW_AARCH64_FP;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM style is camelCase

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fp_num was deleted but callee_saved is still snake_case not camelCase.

@@ -41,7 +59,7 @@ struct CompartmentInfo {
registers.setTrustedStack(tf);
CHERI_DBG("C18N: SET TRUSTED STACK %#p\n", (void *)tf);

registers.setFP((pint_t)state.fp);
registers.setCapabilityRegister(fp_num, (pint_t)state.fp);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean can we not make this field only exist for Morello, and on CHERI-RISC-V treat it as any other saved register?

@dpgao
Copy link
Author

dpgao commented Aug 18, 2025

@jrtc27 It turns out to be not difficult at all to change the layout of the trusted frame. For RISC-V, it now looks like

struct dl_c18n_compart_state {
        /*
         * s1 to s11, then s0
         * This is so that s0 precedes pc as required for a RISC-V frame.
         */
        void *regs[12];
        void *pc;
        /*
         * INVARIANT: This field contains the top of the caller's stack when the
         * caller made the call.
         */
        void *sp;
};

cfp (or cs0) is now placed in the regs array right before pc so the kernel unwinder can still interpret the fields correctly if we so wish.

@jrtc27
Copy link
Member

jrtc27 commented Aug 18, 2025

@jrtc27 It turns out to be not difficult at all to change the layout of the trusted frame. For RISC-V, it now looks like

struct dl_c18n_compart_state {
        /*
         * s1 to s11, then s0
         * This is so that s0 precedes pc as required for a RISC-V frame.
         */
        void *regs[12];
        void *pc;
        /*
         * INVARIANT: This field contains the top of the caller's stack when the
         * caller made the call.
         */
        void *sp;
};

cfp (or cs0) is now placed in the regs array right before pc so the kernel unwinder can still interpret the fields correctly if we so wish.

The current API uses the same struct layout as what's stored on the trusted stack, but that need not be the same. Store the data in whatever format is most convenient for the OS. But that should not dictate what the API exposed is; it's an implementation detail that this API abstracts over.


private:
// _registers[0] holds the pc
#ifdef __CHERI_PURE_CAPABILITY__
// _registers[32] holds the trusted stack pointer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just add another field after? PC is in element 0 because it makes indexing the array a bit easier for all the other registers, but there's no benefit AFAICT to (ab)using a 33rd register for the trusted stack pointer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constructor Registers_riscv(const void *registers) assumes that floating point registers directly follow the _registers array, so it would be quite ugly to adapt the memcpy that initializes the floating point registers.

An alternative is to put the field after the floating point registers. But then the assembly code which saves and restores the registers would have to condition the offset of the trusted stack register on whether __riscv_flen. Again not very ideal.

Registers_arm64 does things properly by wrapping the GPR registers in a structure, so we can just add an element there. I decided that the closest thing to this on RISC-V would be to append to the _registers array. If upstream one day decides to convert this to a structure, then we wouldn't have to do much to pull that change in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# if defined(__riscv_flen)
  const uint8_t *floats = static_cast<const uint8_t *>(registers);
#  if defined(__CHERI_PURE_CAPABILITY__)
  floats += sizeof(_whatever_you_call_the_field);
#  endif
  floats += sizeof(_registers);
  memcpy(_floats, floats, sizeof(_floats));
# endif

?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants