[LangRef] Clarify semantics of masked vector load/store #82469
@llvm/pr-subscribers-llvm-ir
Author: Ralf Jung (RalfJung)
Changes
This is based on what I think has to follow from the statement about preventing exceptions. But I don't actually know what LLVM IR passes will do with these intrinsics, so this requires careful review by someone who does. :) @nikic do you know these passes / know who knows these passes to do the review?
Also, there's an open question that remains: for the purpose of
Full diff: https://github.com/llvm/llvm-project/pull/82469.diff
1 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index fd2e3aacd0169c..496773c4d3d386 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -23752,6 +23752,7 @@ Semantics:
The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
+In particular, this means that only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
::
@@ -23794,6 +23795,7 @@ Semantics:
The '``llvm.masked.store``' intrinsics is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked store and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
+In particular, this means that only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
::
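For illustration only, here is a minimal Rust sketch of the store semantics the diff describes: only masked-on lanes are written, so masked-off lanes cannot fault or race. The function name `masked_store_model`, the fixed width of 4 lanes, and the `i32` element type are assumptions made for this example; it is a scalar model, not how LLVM lowers `llvm.masked.store`.

```rust
/// Scalar model of a 4-lane masked store of `i32` values (illustrative only).
///
/// # Safety
/// For every lane with `mask[lane] == true`, `base.wrapping_add(lane)` must be
/// valid for writes and properly aligned. Masked-off lanes are never touched.
pub unsafe fn masked_store_model(base: *mut i32, mask: [bool; 4], value: [i32; 4]) {
    for lane in 0..4 {
        if mask[lane] {
            // `wrapping_add` does not require the result to be in bounds,
            // which mirrors a `getelementptr` without `inbounds`.
            let lane_ptr = base.wrapping_add(lane);
            // Only masked-on lanes perform a memory access.
            unsafe { lane_ptr.write(value[lane]) };
        }
    }
}
```

Unlike a load-modify-store sequence over the whole vector, this model never reads or writes the memory behind a masked-off lane.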
llvm/docs/LangRef.rst (Outdated)
In particular, this means that only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
Is "masked-on" the opposite of "masked-off"? Or is there some other term I could use?
I would rephrase this in terms of something like this:
Which should tell us everything necessary about their semantics. Then we can continue to clarify that this means no exceptions / data races / etc.
That doesn't quite say everything -- there's the question of whether this Rust PR should say |
there is the additional caveat that LLVM is allowed to create a a major difference between the two choices is that doing a masked load on a pointer before the beginning of its allocation is disallowed with |
Yes, that is indeed the key point: if the first half of the vector is masked-off, and that first half is actually out-of-bounds, then the pointer itself is conceptually out-of-bounds and "computing the pointer to the actually loaded element" would be a non-inbounds pointer computation. I expect this use case to be allowed, which is why I added the following in this PR:
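As an illustration of this use case only (this is not the wording added in the PR), the following Rust sketch models a 4-lane load whose first two lanes lie before the start of the allocation. The function name `load_with_oob_base` and the 4-lane `i32` layout are hypothetical. Rust's `wrapping_sub`/`wrapping_add` stand in for a `getelementptr` without `inbounds`; using `ptr::sub` for the base would correspond to an `inbounds` computation and would not be allowed here.

```rust
/// Loads lanes 2 and 3 of a conceptual 4-lane vector whose lanes 0 and 1
/// lie before the start of `tail`'s allocation (illustrative only).
pub fn load_with_oob_base(tail: &[i32; 2], passthru: [i32; 4]) -> [i32; 4] {
    // The vector's base pointer is two elements before the allocation.
    // It is computed with wrapping arithmetic and never dereferenced.
    let base = tail.as_ptr().wrapping_sub(2);
    let mask = [false, false, true, true];

    let mut result = passthru;
    for lane in 0..4 {
        if mask[lane] {
            // Per-lane address, again without any in-bounds requirement
            // on the intermediate pointer.
            let lane_ptr = base.wrapping_add(lane);
            // SAFETY: lanes 2 and 3 map back onto tail[0] and tail[1],
            // which are inside the allocation `base` was derived from.
            result[lane] = unsafe { lane_ptr.read() };
        }
    }
    result
}
```

All masked-on lanes fall inside the same allocation, which matches the requirement stated in the diff; the masked-off lanes are never accessed.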
@nikic I have updated the wording to
Followed by clarification regarding exceptions, noalias, and data races. Does that work for you?
LGTM, but a second opinion wouldn't hurt.
@llvm/pr-subscribers-llvm-ir this PR has one review but "a second opinion wouldn't hurt" -- would be nice if someone could take a look. :)
@nikic any recommendation for how one could get a second opinion for this PR? I don't know how to navigate the LLVM review process to move this PR forwards...
@nunoplopes any chance you could take a look at this? :)
cc: @fhahn and @alexey-bataev who are code owners of Autovectorizer
No objections from my side
Thanks! Could someone merge this please then? :)
Sure, but you need to update the PR description first, which becomes the commit message.
@nikic I updated the description, does that work?
LGTM
Basically, these operations are equivalent to a loop that iterates over all elements and then does a `getelementptr` (without `inbounds`!) plus `load`/`store` only for the masked-on elements.
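The loop described above can be sketched in Rust for the load case. This is a scalar model under stated assumptions, not LLVM's implementation: the name `masked_load_model`, the 4-lane width, and the `i32` element type are hypothetical, and `wrapping_add` is used as the Rust analogue of a `getelementptr` without `inbounds`.

```rust
/// Scalar model of a 4-lane masked load of `i32` values (illustrative only).
///
/// # Safety
/// For every lane with `mask[lane] == true`, `base.wrapping_add(lane)` must be
/// valid for reads and properly aligned. Masked-off lanes are never accessed.
pub unsafe fn masked_load_model(
    base: *const i32,
    mask: [bool; 4],
    passthru: [i32; 4],
) -> [i32; 4] {
    // Masked-off lanes keep their passthru value.
    let mut result = passthru;
    for lane in 0..4 {
        if mask[lane] {
            // Address computation with no in-bounds requirement.
            let lane_ptr = base.wrapping_add(lane);
            // The load happens only for masked-on lanes.
            result[lane] = unsafe { lane_ptr.read() };
        }
    }
    result
}
```

A caller would pass the vector's base pointer, the mask, and the passthru values, and receives the loaded values in masked-on lanes and the passthru values elsewhere.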