Skip to content

Implement basic per-CPU / CPU-local storage #917

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Apr 6, 2023

Conversation

kevinaboos
Copy link
Member

@kevinaboos kevinaboos commented Apr 2, 2023

  • Not yet used anywhere
  • Only supported on x86_64

Reminder: remove temp tests and log statements before merging

@kevinaboos kevinaboos requested a review from tsoutsman April 2, 2023 03:45
Copy link
Member

@tsoutsman tsoutsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've proposed a slightly different API that would decrease the amount of unsafe syncing we have to do between PerCpuData and FixedCpuLocal.

Just to make sure we're on the same page gs is a register that isn't used in regular code and so we can use it for a different purpose right?

@kevinaboos
Copy link
Member Author

Just to make sure we're on the same page gs is a register that isn't used in regular code and so we can use it for a different purpose right?

Correct, the GS segment register is unused and is frequently used for CPU-local storage on x86. It's similar to how the FS segment register is used for thread-local storage.

@kevinaboos
Copy link
Member Author

@tsoutsman I attempted a few different ways to use traits in the CpuLocal type, with the goal being to make the RAII declaration of a CPU-local variable as simple as possible. It's pretty hard to make the type signature simple, but I was able to get it down to a single generic type parameter (which itself has one type parameter). I didn't push it to this branch because I don't love it, but take a look here: 3b24b1d. At the very least, it removes/hides the const OFFSET parameter from CpuLocal, which I do think was a bit of a flawed/leaky design. This design lets the caller do something like:

static PREEMPTION_COUNT: CpuLocal<CpuLocalPreemptionCount<PreemptionCount>> = unsafe { CpuLocal::new() };

The only other option I was able to get to work was to parameterize the CpuLocalVariable trait with a type, like so:

pub trait CpuLocalVariable<T: Sized> { ... }

which works, but you then need to have both T and V parameters for CpuLocal:

pub struct CpuLocal<T: Sized, V: CpuLocalVariable<T>> {
    _typ: PhantomData<T>,
    _var: PhantomData<V<T>>,
}

impl<T: Sized, V: CpuLocalVariable<T>> CpuLocal<T, V> {
    pub const unsafe fn new(_variable_marker: V) -> Self {
        Self { _typ: PhantomData, _var: PhantomData }
    }

This is clearer but ergonomically worse, as you have to declare/instantiate it like so:

static PREEMPTION_COUNT: CpuLocal<PreemptionCount, CpuLocalVariable<PreemptionCount>> = unsafe { CpuLocal::new(CpuLocalPreemptionCount) };

If you have any ideas on how to further simplify this, please do let me know.

Signed-off-by: Klim Tsoutsman <[email protected]>
@tsoutsman
Copy link
Member

Apologies force push was because I pushed to the wrong remote repo.

Here's a quick enum implementation I made:
b0ce230

It removes size and alignment from cpu_local, removes OFFSET const parameter, prevents CpuLocal from being instantiated with arbitrary offsets and types. The one bad thing is that per_cpu now needs to define a bunch of wrapper types so that it can implement Field for them.

@tsoutsman
Copy link
Member

tsoutsman commented Apr 4, 2023

right but that introduces a circular dependency hmm

we could have the Field impls in the crated where the type is defined e.g. cpu for CpuId

@kevinaboos
Copy link
Member Author

Thanks, that enum impl is similar to what I tried, but the addition of the unsafe trait with its associated const does help. I was trying to avoid an unsafe trait -- i wonder if there's a clever way to prevent other crates from implementing the Field trait.

The one bad thing is that per_cpu now needs to define a bunch of wrapper types so that it can implement Field for them.

This is fine, but we should impl Deref+DerefMut on those wrappers so the crates that use cpu_local don't have to be aware of said newtype wrappers.

we could have the Field impls in the crates where the type is defined, e.g. cpu for CpuId

Yeah, i don't really have any other ideas for workarounds here, so I think this is fine. But we should expose it via a macro like define_cpu_local_type!(<type>) such that it's not easy/recommended to directly impl Field. This ties into trying to prevent someone from implementing Field, but I do realize that macros wouldn't prevent that.

I suppose you could also define a proc_macro that would implement hidden traits for a cpu-local type, but that might be too complex right now. My future idea for TLS-style CPU locals incorporates proc_macros for custom attributes, but we can discuss that later.


Finally, you left this comment, which I agree with, but yes that's one of the many reasons why CpuLocal::new* was unsafe -- the type provided had to match the offset.

// NOTE: The fact that this previously compiled in cpu_local is
// indicative of an unsafe API, as the offsets (and types) are
// completely arbitrary. There's nothing stopping us from trying to
// access CpuLocal::<16, String> which would be undefined behaviour.

@kevinaboos
Copy link
Member Author

Here's a quick enum implementation I made: b0ce230

It removes size and alignment from cpu_local, removes OFFSET const parameter, prevents CpuLocal from being instantiated with arbitrary offsets and types. The one bad thing is that per_cpu now needs to define a bunch of wrapper types so that it can implement Field for them.

Thanks, I have merged/included that commit into this PR branch, and am going from there.

we could have the Field impls in the crated where the type is defined e.g. cpu for CpuId

Working on this now. I've also addressed some naming issues.


Will re-request a review from you when done.

@kevinaboos kevinaboos requested a review from tsoutsman April 5, 2023 23:42
  that define the inner type being wrapped.
* CpuId is a special case.
* We can't yet impl CpuLocalField for PreemptionCount, as that
  requires merging the `cpu_local` and `preemption` crates.
@kevinaboos
Copy link
Member Author

Made the changes, thanks for the suggestion. Though not included in this changeset, I confirmed it's possible to instantiate a CPU-local task type, i.e., this works:

static TASK_SWITCH_PREEMPTION_GUARD: cpu_local::CpuLocal<TaskSwitchPreemptionGuard> = cpu_local::CpuLocal::new(PerCpuField::TaskSwitchPreemptionGuard);
pub fn test_me() -> Option<PreemptionGuard> {
    TASK_SWITCH_PREEMPTION_GUARD.with_mut(|t| t.0.take())
}

see my latest commit msg for other details

@kevinaboos kevinaboos requested a review from tsoutsman April 6, 2023 03:30
Copy link
Member

@tsoutsman tsoutsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small doc change.

@kevinaboos kevinaboos merged commit fd5081a into theseus-os:theseus_main Apr 6, 2023
github-actions bot pushed a commit that referenced this pull request Apr 6, 2023
* Only supported on x86_64 currently; aarch64 support to come.

* x86_64 implementation is based on the GS segment register
  and the GsBase MSR, as typically used on other OSes.

* Tested working, but not yet used anywhere in Theseus.

Co-authored-by: Klim Tsoutsman <[email protected]> fd5081a
github-actions bot pushed a commit to tsoutsman/Theseus that referenced this pull request Apr 6, 2023
* Only supported on x86_64 currently; aarch64 support to come.

* x86_64 implementation is based on the GS segment register
  and the GsBase MSR, as typically used on other OSes.

* Tested working, but not yet used anywhere in Theseus.

Co-authored-by: Klim Tsoutsman <[email protected]> fd5081a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants