Conflicting SPIR-V versions when linking to atomic-ops.spir #868
Comments
@fcharras Can you share the environment details? Our internal CI build server is now reliably reproducing the issue, but I am unable to do so on my development system.
I didn't try to reproduce with the conda install; here are the details of our custom development build where it can be replicated:
The image with everything installed is available on Docker Hub, with a guide to load into the container here and the Dockerfile there. But I'd be surprised if this can't be reproduced with the conda install, for which we have instructions here. Also, simple kernels don't trigger the issue. It probably only triggers when using advanced features (local or private memory, dpex funcs, or some combination of those...).
Until you find an adequate fix for the issue, here is our workaround
@fcharras I have applied your workaround to main for the time being. I am following up with our C++ compiler team to look for a better fix.
Thank you for the follow-up. I'd be curious to know if there are any performance implications in using one SPIR-V version rather than another (or why not just use the latest).
@fcharras The performance implications are worrying me too. I am reaching out to our SPIR-V experts in the dpcpp team. I will update once I hear back.
Fixed in #1103
I'm having trouble using atomics from `atomic-ops.spir`, with the following error message:
What could cause such a version mismatch? I'm trying to get a minimal reproducer, but it seems the error does not trigger for all atomics calls - will update.
I'm using a custom `numba-dpex` build from `0.19.0`, with an up-to-date environment (2023 oneAPI releases, dpctl >= 0.14.1dev1). (I don't think there are differences between my build environment and the runtime environment. I'm using `spirv-tools` binaries from the Ubuntu jammy repositories.)
For GPU, the error can be circumvented by using native atomics.
Edit: it seems it's a bug that can be summed up this way: the `atomic_ops.spir` binary has a SPIR-V version that is determined at build time, and in some cases the JIT can produce a different SPIR-V version for the kernels. Modules with different versions are not compatible, and linking them crashes the linker. In my case, the SPIR-V version of `atomic_ops.spir` is `1.0`, and I can fix the bug by passing `--spirv-max-version 1.0` to the `llvm-spirv` call at https://github.com/IntelPython/numba-dpex/blob/main/numba_dpex/spirv_generator.py#L83 . I am not, however, able to explain why `llvm-spirv` suddenly starts outputting SPIR-V 1.3 for some of my kernels 🤔
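For anyone who wants to check which versions are involved, here is a minimal sketch that reads the version straight from the header of a SPIR-V binary. It relies only on the documented header layout (first word is the magic number `0x07230203`, second word encodes the version as `0x00 | major | minor | 0x00`). The helper `max_version_args` is hypothetical and not part of numba-dpex; it just formats the flag in the form quoted above, so the exact syntax should be checked against the installed `llvm-spirv`.

```python
import struct
from typing import List, Tuple

SPIRV_MAGIC = 0x07230203  # first word of every SPIR-V module


def spirv_version(module: bytes) -> Tuple[int, int]:
    """Return (major, minor) read from the header of a SPIR-V binary."""
    if len(module) < 8:
        raise ValueError("too short to hold a SPIR-V header")
    # The byte order of the magic number tells us the module's endianness,
    # so try little-endian first, then big-endian.
    for fmt in ("<II", ">II"):
        magic, version = struct.unpack_from(fmt, module)
        if magic == SPIRV_MAGIC:
            # Version word layout: 0x00 | major | minor | 0x00
            return (version >> 16) & 0xFF, (version >> 8) & 0xFF
    raise ValueError("not a SPIR-V module (bad magic number)")


def max_version_args(library: bytes) -> List[str]:
    """Hypothetical helper: extra llvm-spirv arguments that cap the output
    version at the version of a precompiled library such as the atomics one."""
    major, minor = spirv_version(library)
    return ["--spirv-max-version", f"{major}.{minor}"]


if __name__ == "__main__":
    # Synthetic 8-byte headers stand in for real files in this demo.
    lib = struct.pack("<II", SPIRV_MAGIC, 0x00010000)     # version 1.0
    kernel = struct.pack("<II", SPIRV_MAGIC, 0x00010300)  # version 1.3
    print(spirv_version(lib), spirv_version(kernel))
    print(max_version_args(lib))
```

With real files, the bytes would come from something like `open(path, "rb").read()` on the precompiled atomics binary and on a kernel module dumped by the JIT; comparing the two tuples shows whether the linker is being handed mismatched versions.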