Simplify `core::hint::spin_loop` #115547

WaffleLapkin · 2023-09-04T21:14:13Z

The grouping was inconsistent and not really helpful.

r? t-libs

rustbot · 2023-09-04T21:14:17Z

Failed to set assignee to t-libs: invalid assignee

Note: Only org members, users with write permissions, or people who have commented on the PR may be assigned.

WaffleLapkin · 2023-09-04T21:17:24Z

r? libs

WaffleLapkin · 2023-09-04T21:18:46Z

library/core/src/hint.rs

+    {
+        // SAFETY: the `cfg` attr ensures that we only execute this on arm targets
+        // with support for the v6 feature.
+        unsafe { crate::arch::arm::__yield() };
    }
 }


As a side note, I've noticed that the hexagon target has a pause instruction tailored for spin loops, but the asm! macro doesn't seem to support hexagon yet, so I'm not sure if we can do anything here...

I was told that we actually do support hexagon ASM. It's just unstable and not documented (besides the tracking issue: #93335). I could try adding a pause here for it then.

WaffleLapkin · 2023-09-04T21:25:34Z

library/core/src/hint.rs

+    {
+        // SAFETY: the `cfg` attr ensures that we only execute this on arm targets
+        // with support for the v6 feature.
+        unsafe { crate::arch::arm::__yield() };


This answer on SO seems to suggest that __isb(SY) is better than __yield() (for some definition of "better" I guess). It seems like both of those are available for both aarch64 and arm. Should we use __isb(SY) for arm too?

skotchandsoda · 2023-09-07

The SO answer is mistaken. At least, somewhat mistaken. ISB, the "Instruction Synchronization Barrier," flushes the instruction pipeline on the calling processor. It isn't explicitly for use in something as general-purpose as a spin-loop:

> Instruction Synchronization Barrier flushes the pipeline in the PE and is a context synchronization event.

When used with the SY option it can cause a "full system barrier operation," which could only be more expensive than plain ISB.

I worry that __isb(SY) may only appear to improve performance, and only on select implementations.

The instruction that best fits the intent of a spin-loop hint is YIELD:

> YIELD is a hint instruction. Software with a multithreading capability can use a YIELD instruction to indicate to the PE that it is performing a task, for example a spin-lock, that could be swapped out to improve overall system performance. The PE can use this hint to suspend and resume multiple software threads if it supports the capability.

Expanding the use of __isb(SY) from aarch64 to both targets seems misguided. If anything, the change from __yield() to __isb(SY) on aarch64 should be reverted.
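For readers who want to see where the hint under discussion sits in practice, here is a minimal spin-lock sketch that calls `std::hint::spin_loop()` while waiting. `SpinLock`, `with_lock`, and `count_with_threads` are hypothetical names made up for this illustration; they are not part of the PR or of the standard library. Which instruction the hint lowers to on a given target (YIELD, ISB, or something else) is decided inside `core::hint::spin_loop`, not by the caller.

```rust
use std::cell::UnsafeCell;
use std::hint;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// A deliberately minimal spin lock (hypothetical type, for illustration only).
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// SAFETY: all access to `value` goes through `with_lock`, which serializes
// it using the `locked` flag.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub const fn new(value: T) -> Self {
        SpinLock {
            locked: AtomicBool::new(false),
            value: UnsafeCell::new(value),
        }
    }

    /// Runs `f` with exclusive access to the protected value.
    /// Note: if `f` panics, the lock is never released (kept simple on purpose).
    pub fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            // Tell the CPU we are busy-waiting; the exact instruction emitted
            // is target-specific and chosen by `core::hint::spin_loop`.
            hint::spin_loop();
        }
        // SAFETY: we hold the lock, so no other thread can access `value`.
        let result = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// Increments a shared counter from several threads and returns the total.
pub fn count_with_threads(threads: usize, per_thread: usize) -> usize {
    let lock = Arc::new(SpinLock::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let lock = Arc::clone(&lock);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    lock.with_lock(|n| *n += 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    lock.with_lock(|n| *n)
}
```

The sketch says nothing about whether YIELD or ISB performs better; it only shows the call site whose lowering is being debated.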

WaffleLapkin · 2023-09-04T21:27:31Z

library/core/src/hint.rs

+        // SAFETY: the `cfg` attr ensures that we only execute this on arm targets
+        // with support for the v6 feature.


I'm not sure those (or any of the above) comments make sense. As far as I can tell, all the functions in core::arch::... are cfg-ed anyway, so we would not be able to call them on a wrong arch. Should I delete them?

bjorn3 · 2023-09-05

Functions in core::arch::... can be called just fine even if the relevant cfg(target_feature = "...") is not true at compile time. In that case, however, you need to do runtime detection to check that the feature is supported before calling them.

WaffleLapkin · 2023-09-05

I'm not sure how this could be true.

__yield is defined in arm_shared. arm_shared is marked as #[cfg(any(target_arch = "arm", target_arch = "aarch64", doc))]. __yield itself is marked as #[cfg(any(target_feature = "v6", target_arch = "aarch64", doc))].

So this results in __yield being available if cfg(all(any(target_arch = "arm", target_arch = "aarch64", doc), any(target_feature = "v6", target_arch = "aarch64", doc))). doc and target_arch = "aarch64" are not true in this context, so we are left with cfg(all(any(target_arch = "arm"), any(target_feature = "v6"))), which can be simplified to cfg(all(target_arch = "arm", target_feature = "v6")).

Unless I'm missing something, this is exactly the same as the cfg on this block?...
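The cfg reduction worked through above can be checked mechanically. The sketch below models each cfg predicate as a plain boolean (an assumption made purely for illustration; real `cfg` evaluation happens at compile time) and compares the combined module-plus-item condition with the simplified form when `doc` and `target_arch = "aarch64"` are both false. The function names are hypothetical.

```rust
/// Models the combined condition under which `__yield` exists:
/// the `arm_shared` module cfg AND the cfg on `__yield` itself.
fn yield_available(arch_arm: bool, arch_aarch64: bool, feat_v6: bool, doc: bool) -> bool {
    // #[cfg(any(target_arch = "arm", target_arch = "aarch64", doc))]
    let module_cfg = arch_arm || arch_aarch64 || doc;
    // #[cfg(any(target_feature = "v6", target_arch = "aarch64", doc))]
    let item_cfg = feat_v6 || arch_aarch64 || doc;
    module_cfg && item_cfg
}

/// The simplified form: cfg(all(target_arch = "arm", target_feature = "v6")).
fn simplified(arch_arm: bool, feat_v6: bool) -> bool {
    arch_arm && feat_v6
}

/// Checks every arm/v6 combination with `aarch64` and `doc` fixed to false.
fn reduction_holds() -> bool {
    [false, true].iter().all(|&arm| {
        [false, true]
            .iter()
            .all(|&v6| yield_available(arm, false, v6, false) == simplified(arm, v6))
    })
}
```

With `aarch64` and `doc` false, the two predicates agree on all four combinations, which matches the conclusion that the function's own cfg and the cfg on the calling block coincide here.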

bjorn3 · 2023-09-05

Looks like you are right in this case. For x86 intrinsics this is definitely not the case, though. Those are never cfg-ed on target features.
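To make the x86 point concrete, here is a small sketch of the runtime-detection pattern described in this thread: the intrinsic exists regardless of compile-time target features, so the caller has to confirm support before calling it. The choice of `popcnt` and the wrapper name `popcount` are assumptions made for the example and are unrelated to the PR.

```rust
/// Counts set bits, using the x86_64 `popcnt` intrinsic when the running CPU
/// supports it and a portable fallback otherwise (hypothetical helper).
pub fn popcount(x: u64) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        // The intrinsic is not cfg-ed on `target_feature`, so it is always
        // nameable on x86_64; support must be checked at runtime.
        if std::arch::is_x86_feature_detected!("popcnt") {
            // SAFETY: we just verified that the CPU supports `popcnt`.
            return unsafe { std::arch::x86_64::_popcnt64(x as i64) } as u32;
        }
    }
    // Portable path for other architectures or CPUs without `popcnt`.
    x.count_ones()
}
```

Here the SAFETY comment carries real weight, because nothing at compile time rules out calling the intrinsic on a CPU that lacks the feature. That is the opposite of the arm `__yield` case discussed above.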

Mark-Simulacrum · 2023-09-17T23:45:34Z

@bors r+ rollup=iffy

I think this seems fine. There are some comments here on possible changes to the code, but I don't think we need to make them in this PR.

bors · 2023-09-17T23:45:36Z

📌 Commit 1811fe6 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

bors · 2023-09-18T00:02:43Z

⌛ Testing commit 1811fe6 with merge 8a7cab8...

bors · 2023-09-18T01:46:23Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 8a7cab8 to master...

rust-timer · 2023-09-18T03:47:23Z

Finished benchmarking commit (8a7cab8): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.2% | [-0.2%, -0.2%] | 1 |
| Improvements ✅ (secondary) | -0.4% | [-0.5%, -0.3%] | 4 |
| All ❌✅ (primary) | -0.2% | [-0.2%, -0.2%] | 1 |

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.4% | [2.4%, 2.4%] | 1 |
| Regressions ❌ (secondary) | 3.5% | [3.3%, 3.7%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.4% | [2.4%, 2.4%] | 1 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 634.064s -> 634.575s (0.08%)
Artifact size: 318.47 MiB -> 318.43 MiB (-0.01%)

Simplify core::hint::spin_loop

1811fe6

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Sep 4, 2023

rustbot assigned Mark-Simulacrum Sep 4, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 17, 2023

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 18, 2023

bors merged commit 8a7cab8 into rust-lang:master Sep 18, 2023
7 of 11 checks passed

rustbot added this to the 1.74.0 milestone Sep 18, 2023

WaffleLapkin deleted the spin_looping branch November 19, 2023 00:29
