Output LLVM IR if NUMBA_DPEX_DEBUG environment variable is set #924
@@ -139,6 +139,20 @@ def compile(
self._llvm_module = kernel.module.__str__()
self._module_name = kernel.name

# Dump LLVM IR if DEBUG flag is set.
if config.DEBUG:
Can we use a new flag, NUMBA_DPEX_DUMP_KERNEL_LLVM? The DEBUG flag seems to turn on debug symbols in some places.
Added. I could not find any place where the documentation needs to be updated with the env var. If I have missed it, let me know.
Tested the change with the small code sample from #906; the LLVM IR is printed.
5afc7e9 to dd68966
@adarshyoga Thank you! It is good to go, but we need a test case, just to verify that setting the config actually generates the IR file in the right location and that the file is not empty. Refer to: https://docs.pytest.org/en/6.2.x/tmpdir.html#the-tmpdir-fixture
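A rough sketch of what such a test could look like, using pytest's `tmp_path` fixture (the `pathlib` counterpart of `tmpdir`, documented on the same page). `dump_ir_if_enabled` and `SAMPLE_IR` are hypothetical stand-ins, not numba-dpex APIs. A real test would compile a kernel with the config flag set and point the output at the temporary directory.

```python
from pathlib import Path

# Placeholder IR text; a real test would obtain IR from a compiled kernel.
SAMPLE_IR = "define void @kernel() {\n  ret void\n}\n"


def dump_ir_if_enabled(llvm_ir: str, kernel_name: str, out_dir: Path, enabled: bool) -> None:
    """Hypothetical stand-in for the compile step: write IR only when the flag is on."""
    if enabled:
        (out_dir / f"llvm_kernel_{kernel_name}.ll").write_text(llvm_ir)


def test_ir_dumped_when_flag_set(tmp_path: Path) -> None:
    # Positive case: exactly one non-empty .ll file appears in the expected directory.
    dump_ir_if_enabled(SAMPLE_IR, "k", tmp_path, enabled=True)
    files = list(tmp_path.glob("llvm_kernel_*.ll"))
    assert len(files) == 1
    assert files[0].stat().st_size > 0


def test_no_ir_when_flag_unset(tmp_path: Path) -> None:
    # Negative case: nothing is written when the flag is off.
    dump_ir_if_enabled(SAMPLE_IR, "k", tmp_path, enabled=False)
    assert list(tmp_path.glob("*.ll")) == []
```

Because `tmp_path` is unique per test, the two cases cannot see each other's files.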
dd68966 to eee597b
Added a test case that covers both the positive and negative cases.
Output LLVM IR if NUMBA_DPEX_DEBUG environment variable is set f0adad1
Adds functionality to output LLVM IR to a file. The output file name contains a hash generated from the LLVM module name. This allows us to write different kernels to separate files, since the hash is expected to be unique.
Tested with Blackscholes in dpbench. The generated file name is
llvm_kernel_ce84b7ebc25a6c5f71403d77de797be75067dce745a14ae28ca34dc5e3365998.ll
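The naming scheme described above could be sketched roughly as follows. The 64-hex-digit name is consistent with SHA-256, but the description does not say which hash is used, so `hashlib.sha256` here is an assumption. The helper names `kernel_ir_filename` and `dump_kernel_ir` are also hypothetical.

```python
import hashlib
from pathlib import Path


def kernel_ir_filename(module_name: str) -> str:
    """Build a file name from a hash of the module name (SHA-256 is assumed)."""
    digest = hashlib.sha256(module_name.encode("utf-8")).hexdigest()
    return f"llvm_kernel_{digest}.ll"


def dump_kernel_ir(llvm_ir: str, module_name: str, out_dir: Path) -> Path:
    """Write the IR text into out_dir and return the path that was written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / kernel_ir_filename(module_name)
    path.write_text(llvm_ir)
    return path
```

Hashing the module name gives a fixed-length name that is safe to use on any file system, and kernels with different module names land in different files unless the hash collides.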
Have you tested your changes locally for CPU and GPU devices?
Tested on GPU
Have you made sure that new changes do not introduce compiler warnings?
If this PR is a work in progress, are you filing the PR as a draft?