Implement more ergonomic CPU-local storage #1012

tsoutsman · 2023-07-21T05:52:55Z

MVP of a cpu_local attribute.

Limitations that will be addressed in a future PR:

* Only integer types supported
* CLS section offsets must be manually specified
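To make the second limitation concrete, here is a minimal sketch of what a manually specified offset has to agree with. The `PerCpuData` layout, its field names, and the two offset constants are hypothetical and are not taken from the Theseus source; the helper only checks that a hand-written offset matches where the field really sits.

```rust
/// Hypothetical per-CPU storage block; the fields are illustrative only.
#[repr(C)]
struct PerCpuData {
    cpu_id: u32,
    preemption_count: u8,
}

/// Offsets written by hand, which is what the current limitation requires.
/// With `#[repr(C)]`, `cpu_id` starts at byte 0 and `preemption_count` at byte 4.
const CPU_ID_OFFSET: usize = 0;
const PREEMPTION_COUNT_OFFSET: usize = 4;

/// Returns the byte offset of `field` from the start of `base`.
/// `field` is expected to be a reference to a field inside `base`.
fn offset_from_base<T, F>(base: &T, field: &F) -> usize {
    (field as *const F as usize) - (base as *const T as usize)
}

/// Checks that the hand-written offsets agree with the real layout.
fn manual_offsets_are_correct(data: &PerCpuData) -> bool {
    offset_from_base(data, &data.cpu_id) == CPU_ID_OFFSET
        && offset_from_base(data, &data.preemption_count) == PREEMPTION_COUNT_OFFSET
}
```

If a field is later inserted or reordered, a hand-written constant like `PREEMPTION_COUNT_OFFSET` silently goes stale, which is presumably why this is listed as something to remove.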

~~Draft PR because the macro needs to be implemented for aarch64.~~

Signed-off-by: Klimenty Tsoutsman <[email protected]>

kernel/cls/cls_macros/src/lib.rs

kevinaboos

Generally looks good; I left a few questions.

Cargo.toml

kernel/cls/cls_macros/src/lib.rs

kevinaboos

Left one major question about the atomicity of increment+load and load+decrement.

Also, can you add (or restore) typical things like crate-level docs and the authorship/description fields in the `Cargo.toml`? Thanks.

Everything else looks good.
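The atomicity question above can be illustrated with a small sketch. It uses `std::sync::atomic::AtomicU8` purely as a stand-in for a CPU-local counter, and the function names and the `interloper` callback are hypothetical; this is not the CLS implementation from the PR. It shows why "increment, then load" as two steps can observe a value changed by something that ran in between, while a single `fetch_add` hands back the previous value as part of the same operation.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Two-step version: increment, then load separately.
/// `interloper` stands in for anything that runs between the two steps
/// (for example an interrupt handler); it is a hypothetical hook for the demo.
fn increment_then_load(counter: &AtomicU8, interloper: impl FnOnce(&AtomicU8)) -> u8 {
    counter.fetch_add(1, Ordering::Relaxed);
    interloper(counter);
    // The value read here may no longer be "previous + 1".
    counter.load(Ordering::Relaxed)
}

/// Single-step version: `fetch_add` returns the previous value, so the new
/// value is derived from the same operation that performed the increment.
fn fetch_add_new_value(counter: &AtomicU8) -> u8 {
    counter.fetch_add(1, Ordering::Relaxed).wrapping_add(1)
}
```

With the two-step version, a caller that expects to see its own increment can instead see the effect of whatever ran in the gap; the single-step version has no such gap.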

kernel/preemption/src/lib.rs

kevinaboos

Left one comment as a suggestion to try out in #1015.

kevinaboos · 2023-07-27T17:51:36Z

kernel/preemption/src/lib.rs

-use crate::cpu_local::{CpuLocal, CpuLocalField, PerCpuField};
-use no_drop::NoDrop;
+#![no_std]
+#![feature(negative_impls, thread_local)]


Note for the next PR: I think we can avoid requiring every user of cls_macro to declare `#![feature(thread_local)]` by using the `#[allow_internal_unstable(...)]` attribute in the cls_macro crate, either on the proc macro definition or as a crate-level attribute. I recall using that attribute in the thread_local_macro crate, based on std's `thread_local!()` macro trickery.

This isn't a blocker to merging/approving this PR, but we should try to improve that rough edge, especially once we have dozens of CPU-local variables everywhere.

(I just realized this attribute is intended for regular macros, not procedural macros, but you should try it anyway.)

* The new `cls` crate offers a `#[cpu_local]` attribute that can be applied to a static variable to declare it as Cpu-Local Storage (CLS), much like `#[thread_local]` and Thread-Local Storage (TLS).
* The `cls_macro` crate provides a procedural macro that implements the `#[cpu_local]` attribute. Currently, this only supports basic primitive types: `u8`, `u16`, `u32`, `u64`. More coming later.
* The API for accessing CPU-local primitive types is quite simple: only `load`, `fetch_add`, `fetch_sub`. Note that these functions are implemented as atomic with respect to ONLY the single current CPU, not atomic with respect to all memory accesses across all other CPUs. This is intentional as it is more efficient, and CPU-local variables are only accessible to a single CPU and not by another "foreign" CPU.
* Currently this is only used for the preemption counter on each CPU, but it will be used for more variables in future PRs.
* This simplifies the implementation of `preemption` management, as we can rely on the `#[cpu_local]` API's semantics to be atomic with respect to the current CPU.
* This also avoids a minor latent bug that existed in the old preemption implementation which made it theoretically possible for a task migration to occur between accessing the CPU-local preemption count and modifying that count. This wasn't a problem since Theseus does not yet perform task migration, but the new design makes this interleaving impossible and is thus future-proof for later task migration support.
* Re-separate `preemption` and `cpu_local` into two different crates, as they previously were. This is possible because the preemption count now uses the new `#[cpu_local]` attribute, which allows `preemption` and `cpu_local` to not have cyclic dependencies on each other.
* This is responsible for most of the changes in this PR: changing `cpu_local_preemption` back to `cpu_local` or `preemption`.
* Another current limitation is that we must use the `#[cpu_local]` attribute to _manually_ specify the offset of each CPU-local variable from the start of the `PerCpuData` storage type. This will be removed in a future PR.

Signed-off-by: Klimenty Tsoutsman <[email protected]>
Co-authored-by: Kevin Boos <[email protected]>

1f620ac
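As a rough illustration of how a preemption counter can sit on top of a `load` / `fetch_add` / `fetch_sub` style API, here is a minimal sketch. Everything in it is an assumption made for the example: `CpuLocalU8` is a plain `Cell<u8>` wrapper standing in for a real CPU-local variable, and the guard and helper functions are simplified illustrations rather than the code merged in this PR.

```rust
use std::cell::Cell;

/// Stand-in for a CPU-local `u8` (hypothetical type for this sketch).
/// `Cell` is not `Sync`, so it cannot be shared across threads, loosely
/// mirroring a value that only its own CPU can access.
struct CpuLocalU8(Cell<u8>);

impl CpuLocalU8 {
    const fn new(value: u8) -> Self {
        Self(Cell::new(value))
    }

    /// Returns the current value.
    fn load(&self) -> u8 {
        self.0.get()
    }

    /// Adds `n` (wrapping) and returns the previous value.
    fn fetch_add(&self, n: u8) -> u8 {
        let old = self.0.get();
        self.0.set(old.wrapping_add(n));
        old
    }

    /// Subtracts `n` (wrapping) and returns the previous value.
    fn fetch_sub(&self, n: u8) -> u8 {
        let old = self.0.get();
        self.0.set(old.wrapping_sub(n));
        old
    }
}

/// Keeps preemption "disabled" for as long as it is alive.
struct PreemptionGuard<'a> {
    count: &'a CpuLocalU8,
}

/// Increments the counter and returns a guard that decrements it on drop.
fn hold_preemption(count: &CpuLocalU8) -> PreemptionGuard<'_> {
    count.fetch_add(1);
    PreemptionGuard { count }
}

impl Drop for PreemptionGuard<'_> {
    fn drop(&mut self) {
        self.count.fetch_sub(1);
    }
}

/// In this sketch, preemption counts as enabled only when no guard is held.
fn preemption_enabled(count: &CpuLocalU8) -> bool {
    count.load() == 0
}
```

Nested guards simply stack: each `hold_preemption` call bumps the count, and preemption is treated as enabled again only once every guard has been dropped.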

tsoutsman added 2 commits July 21, 2023 15:45

Implement on x86_64

161f4f5

Signed-off-by: Klimenty Tsoutsman <[email protected]>

Reorder preemption counter decrement

d2e8fd6

Signed-off-by: Klimenty Tsoutsman <[email protected]>

tsoutsman marked this pull request as draft July 21, 2023 05:53

Add support for aarch64

cc1fe17

Signed-off-by: Klimenty Tsoutsman <[email protected]>

tsoutsman commented Jul 23, 2023

kernel/cls/cls_macros/src/lib.rs (resolved)

Cleanup

b8e6404

Signed-off-by: Klimenty Tsoutsman <[email protected]>

tsoutsman marked this pull request as ready for review July 23, 2023 10:43

tsoutsman requested a review from kevinaboos July 23, 2023 22:21

tsoutsman mentioned this pull request Jul 24, 2023

Add replace and set methods for CPU local variables #1015

Closed

tsoutsman added 2 commits July 24, 2023 19:34

Add docs to macro

c2b2940

Signed-off-by: Klimenty Tsoutsman <[email protected]>

Make generated struct a ZST

7a93f8f

Signed-off-by: Klimenty Tsoutsman <[email protected]>

kevinaboos requested changes Jul 26, 2023

Cleanup

8be4124

Signed-off-by: Klimenty Tsoutsman <[email protected]>

tsoutsman requested a review from kevinaboos July 26, 2023 23:32

kevinaboos requested changes Jul 27, 2023

kernel/preemption/src/lib.rs (resolved)

kernel/preemption/src/lib.rs (outdated, resolved)

kernel/preemption/src/lib.rs (outdated, resolved)

tsoutsman added 4 commits July 27, 2023 17:37

Implement fetch_* instructions for CLS

30d5a44

Signed-off-by: Klimenty Tsoutsman <[email protected]>

Remove arch dependent impl of fetch_sub

6edfe3b

Signed-off-by: Klimenty Tsoutsman <[email protected]>

Use correct asm attributes

3a382a0

Signed-off-by: Klimenty Tsoutsman <[email protected]>

Restore typical things

ddf4d98

Signed-off-by: Klimenty Tsoutsman <[email protected]>

tsoutsman force-pushed the cls-2 branch from 3b41d3f to ddf4d98 Compare July 27, 2023 08:12

tsoutsman requested a review from kevinaboos July 27, 2023 08:14

kevinaboos added 2 commits July 27, 2023 11:13

clarify preemption logic comments

61d758e

more comments in preemption

5d472fc

kevinaboos approved these changes Jul 27, 2023

kevinaboos merged commit 1f620ac into theseus-os:theseus_main Jul 27, 2023
