
[LangRef] Clarify semantics of masked vector load/store #82469

Merged
1 commit merged into llvm:main on Aug 3, 2024

Conversation

@RalfJung (Contributor) commented Feb 21, 2024

Basically, these operations are equivalent to a loop that iterates all elements and then does a getelementptr (without inbounds!) plus load/store only for the masked-on elements.
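As an editorial illustration of this equivalence (not part of the commit; the function name and the 2-lane width are made up), here is the per-element loop from the description, unrolled for a 2-lane `llvm.masked.load`: each masked-on lane does a plain (non-inbounds) `getelementptr` plus a scalar `load`, and masked-off lanes keep the passthru value and touch no memory.

```llvm
define <2 x i32> @masked_load_equiv(ptr %p, <2 x i1> %mask, <2 x i32> %passthru) {
entry:
  %m0 = extractelement <2 x i1> %mask, i32 0
  br i1 %m0, label %load0, label %next0

load0:
  %p0 = getelementptr i32, ptr %p, i64 0      ; lane 0 address (no 'inbounds')
  %v0 = load i32, ptr %p0
  %ins0 = insertelement <2 x i32> %passthru, i32 %v0, i32 0
  br label %next0

next0:
  %acc0 = phi <2 x i32> [ %ins0, %load0 ], [ %passthru, %entry ]
  %m1 = extractelement <2 x i1> %mask, i32 1
  br i1 %m1, label %load1, label %done

load1:
  %p1 = getelementptr i32, ptr %p, i64 1      ; lane 1 address (no 'inbounds')
  %v1 = load i32, ptr %p1
  %ins1 = insertelement <2 x i32> %acc0, i32 %v1, i32 1
  br label %done

done:
  %res = phi <2 x i32> [ %ins1, %load1 ], [ %acc0, %next0 ]
  ret <2 x i32> %res
}
```

The plain (non-inbounds) `getelementptr` matters because it keeps the address computation for masked-off lanes well-defined even when those lanes lie outside the allocation.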

@llvmbot (Member) commented Feb 21, 2024

@llvm/pr-subscribers-llvm-ir

Author: Ralf Jung (RalfJung)

Changes

This is based on what I think has to follow from the statement about preventing exceptions. But I don't actually know what LLVM IR passes will do with these intrinsics, so this requires careful review by someone who does. :)

@nikic do you know these passes / know who knows these passes to do the review?

Also, there's an open question that remains: for the purpose of noalias, do these operations access the masked-off lanes or not? I sure hope they don't, but I realized that while data races are mentioned, noalias is not.


Full diff: https://github.com/llvm/llvm-project/pull/82469.diff

1 file affected:

  • (modified) llvm/docs/LangRef.rst (+2)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index fd2e3aacd0169c..496773c4d3d386 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -23752,6 +23752,7 @@ Semantics:
 
 The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
 The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
+In particular, this means that only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
 
 
 ::
@@ -23794,6 +23795,7 @@ Semantics:
 
 The '``llvm.masked.store``' intrinsics is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked store and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
 The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
+In particular, this means that only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
 
 ::
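To make the added sentence concrete, here is a minimal editorial LLVM IR sketch (not part of the patch; the function name and buffer size are made up): a 4-lane masked store over a 3-element buffer, where only the three masked-on lanes need to be inbounds of the allocation.

```llvm
declare void @llvm.masked.store.v4i32.p0(<4 x i32>, ptr, i32 immarg, <4 x i1>)

define void @store_first_three(<4 x i32> %val) {
  %buf = alloca [3 x i32]                      ; room for lanes 0-2 only
  ; Lane 3 would fall past the end of %buf, but it is masked off and therefore
  ; never accessed; lanes 0-2 are masked on and inbounds of %buf.
  call void @llvm.masked.store.v4i32.p0(<4 x i32> %val, ptr %buf, i32 4,
                                        <4 x i1> <i1 1, i1 1, i1 1, i1 0>)
  ret void
}
```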
 

Review thread on llvm/docs/LangRef.rst, on the sentence added to the llvm.masked.load semantics ("In particular, this means that only the masked-on lanes of the vector need to be inbounds of an allocation ..."):
@RalfJung (Contributor, Author):
Is "masked-on" the opposite of "masked-off"? Or is there some other term I could use?

nikic requested a review from topperc on February 21, 2024, 08:07.
@nikic (Contributor) commented Feb 21, 2024

I would rephrase this in terms of something like this:

> However, these intrinsics behave as if the masked-off lanes are not accessed.

That should tell us everything necessary about their semantics. We can then continue to clarify that this means no exceptions / data races / etc.

@RalfJung (Contributor, Author):
That doesn't quite say everything -- there's the question of whether this Rust PR should say offset (aka getelementptr inbounds) or offset_wrapping (aka getelementptr) when describing how the pointers to the individual elements being loaded are computed.

@programmerjake (Contributor):
> That doesn't quite say everything -- there's the question of whether this Rust PR should say offset (aka getelementptr inbounds)

There is the additional caveat that LLVM is allowed to create a poison value without UB (which is what happens with getelementptr inbounds and an out-of-bounds index), but Rust defines an out-of-bounds offset to be immediate UB, rather than deferring the UB to the load/store.

A major difference between the two choices is that doing a masked load on a pointer before the beginning of its allocation is disallowed with inbounds, but allowed without inbounds, as long as the vector elements are masked off until the offset is large enough to be within the allocation's bounds.
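For concreteness, a minimal LLVM IR sketch of the inbounds distinction described above (an editorial illustration, not from the thread; the function name is made up and %base is assumed to point into a small allocation, so index 100 is out of bounds for it):

```llvm
define void @gep_contrast(ptr %base) {
  ; 'inbounds' GEP whose result is out of bounds: yields poison,
  ; which is UB only if the poison pointer is actually dereferenced.
  %a = getelementptr inbounds i32, ptr %base, i64 100
  ; Plain GEP: a well-defined (if dangling) pointer value.
  %b = getelementptr i32, ptr %base, i64 100
  ret void
}
```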

@RalfJung (Contributor, Author) commented Feb 21, 2024

> A major difference between the two choices is that doing a masked load on a pointer before the beginning of its allocation is disallowed with inbounds, but allowed without inbounds, as long as the vector elements are masked off until the offset is large enough to be within the allocation's bounds.

Yes, that is indeed the key point: if the first half of the vector is masked off, and that first half is actually out-of-bounds, then the pointer itself is conceptually out-of-bounds, and "computing the pointer to the actually loaded element" would be a non-inbounds pointer computation. I expect this use case to be allowed, which is why I added the following in this PR:

> Only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
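A hypothetical LLVM IR example of exactly this scenario (editorial, not from the thread; names and sizes are made up): the base pointer itself points before the allocation, but every masked-on lane lands inside it.

```llvm
declare <4 x i32> @llvm.masked.load.v4i32.p0(ptr, i32 immarg, <4 x i1>, <4 x i32>)

define <4 x i32> @load_tail_only() {
  %buf = alloca [2 x i32]
  %p = getelementptr i32, ptr %buf, i64 -2     ; points before %buf; no 'inbounds'
  ; Lanes 0 and 1 would be out of bounds, so they are masked off; lanes 2 and 3
  ; fall inside %buf (their values are merely uninitialized here).
  %v = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr %p, i32 4,
                                                 <4 x i1> <i1 0, i1 0, i1 1, i1 1>,
                                                 <4 x i32> zeroinitializer)
  ret <4 x i32> %v
}
```

Under the wording added by this PR, such a call would be allowed, since the masked-off lanes are never accessed and all masked-on lanes are inbounds of the same allocation.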

RalfJung force-pushed the vector-masked branch 2 times, most recently from 32f4ea4 to 9c21fa7, on May 2, 2024, 07:07.
@RalfJung (Contributor, Author) commented May 2, 2024

@nikic I have updated the wording to:

> The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask, except that the masked-off lanes are not accessed.

This is followed by clarifications regarding exceptions, noalias, and data races. Does that work for you?

@nikic (Contributor) left a review:
LGTM, but a second opinion wouldn't hurt.

nikic changed the title from "clarify semantics of masked vector load/store" to "[LangRef] Clarify semantics of masked vector load/store" on May 3, 2024.
nikic requested a review from preames on May 3, 2024, 03:32.
@RalfJung (Contributor, Author):
@llvm/pr-subscribers-llvm-ir this PR has one review but "a second opinion wouldn't hurt" -- would be nice if someone could take a look. :)

@RalfJung (Contributor, Author) commented Jul 22, 2024

@nikic any recommendation for how one could get a second opinion for this PR? I don't know how to navigate the LLVM review process to move this PR forwards...

@RalfJung (Contributor, Author) commented Aug 2, 2024

@nunoplopes any chance you could take a look at this? :)

@appujee (Contributor) commented Aug 2, 2024

cc: @fhahn and @alexey-bataev, who are code owners of the autovectorizer

@alexey-bataev (Member):
No objections from my side

@RalfJung (Contributor, Author) commented Aug 3, 2024

Thanks! Could someone merge this please then? :)

@nikic (Contributor) commented Aug 3, 2024

> Thanks! Could someone merge this please then? :)

Sure, but you need to update the PR description first, which becomes the commit message.

@RalfJung (Contributor, Author) commented Aug 3, 2024

@nikic I updated the description, does that work?

@nunoplopes (Member):
LGTM

nikic merged commit 79f7630 into llvm:main on Aug 3, 2024.
5 checks passed
banach-space pushed a commit to banach-space/llvm-project that referenced this pull request on Aug 7, 2024.
kstoimenov pushed a commit to kstoimenov/llvm-project that referenced this pull request on Aug 15, 2024.