[C++][Gandiva] Migration JIT engine from MCJIT to LLJIT #37848
Comments
Thanks for opening this issue.
Here is the discussion thread: https://lists.apache.org/thread/fphzvtr1jrc069z7kv78oopgr4zrjfgl UPDATE:
…/LLJIT (#39098)

### Rationale for this change

Gandiva currently employs MCJIT as its internal JIT engine. However, LLVM has introduced a newer JIT API known as ORC v2/LLJIT since LLVM 7.0, and it has several advantages over MCJIT. In particular, MCJIT is not actively maintained and is slated for eventual deprecation and removal.

### What changes are included in this PR?

* This PR replaces the MCJIT JIT engine with the ORC v2 engine, using the `LLJIT` API.
* This PR adds a new JIT linker option, `JITLink` (https://llvm.org/docs/JITLink.html), which can be used together with `LLJIT` for LLVM 14+ on Linux/macOS platforms. It is turned off by default but can be turned on with the environment variable `GANDIVA_USE_JIT_LINK`.

### Are these changes tested?

Yes, they are covered by existing unit tests.

### Are there any user-facing changes?

* The `Configuration` class has a new option called `dump_ir`. Users who would like to call the `DumpIR` API of `Projector` and `Filter` have to set the `dump_ir` option first.
* Closes: #37848

Authored-by: Yue Ni <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…CJIT to ORC v2/LLJIT (apache#39098)" This reverts commit 83cba25.
…e from MCJIT to ORC v2/LLJIT (apache#39098)"" This reverts commit 2f34923.
### Rationale for this change

#37848 upgraded the JIT compiler for LLVM/Gandiva code, which introduced linking errors with newer versions of LLVM. Some Gandiva tests were disabled, and here at Dremio I am running into the same linking problem when trying to build with an updated Arrow library. After reading some threads on the LLVM Discord server, it appears that updating to LLVM 18.1 will fix the symbol issue. I tested locally and was able to re-enable the disabled Java tests that were showing the unexported ORC symbol issue. More discussion in apache/arrow-java#63.

### What changes are included in this PR?

Updating vcpkg and pinning LLVM to 18.1.

Notably, I encountered some build problems using the newest vcpkg update, which appeared to be related to the updated gRPC libraries. My Arrow jar CI build was timing out in this case with no clear error in the logs. The vcpkg version included here has the LLVM 18 update but not the gRPC update (which isn't needed for this issue).

### Are these changes tested?

Covered by existing tests. Will also re-enable the disabled Java tests in a future change.

### Are there any user-facing changes?

No.

* GitHub Issue: #45132

Lead-authored-by: Logan Riggs <[email protected]>
Co-authored-by: lriggs <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Description
Gandiva currently employs MCJIT as its internal JIT engine. However, LLVM has introduced a newer JIT API known as ORC v2/LLJIT [1], which presents several advantages over MCJIT:
In my project, I've experimented with this migration and got it to work in a prototype. However, transitioning Gandiva to this new API is a substantial undertaking. I'm keen to gauge the community's interest in migrating to this new JIT engine API and would greatly appreciate any feedback or insights. Thank you.
Proposal
* `Projector` and `Filter` APIs remain the same
* `LLVMGenerator` and `Engine` classes
  * The constructors of the `LLVMGenerator` and `Engine` classes are expected to take an optional additional `GandivaObjectCache` reference, because LLJIT requires the object cache mechanism to be set up during initialization of the LLJIT instance
  * The `Engine` class implementation, since it currently interfaces with MCJIT directly; the MCJIT-related APIs will be replaced with the LLJIT-related APIs
* `Configuration` class, which is expected to gain a new configuration option called `needs_ir_dumping`, because LLJIT doesn't allow retrieving the IR from a module at any time. Gandiva has an API called `DumpIR` that previously allowed dumping IR at any time, so this new option is needed to indicate that IR dumping is required, and the IR can then be stored up front for later dumping

References
[1] https://llvm.org/docs/ORCv2.html
[2] https://github.com/llvm/llvm-project/commits/c4e764ea24eb02b6ec34038061cee8ff94c0f34c/llvm/include/llvm/ExecutionEngine/Orc/LLJIT.h?after=c4e764ea24eb02b6ec34038061cee8ff94c0f34c+34
[3] LLVM release dates, https://releases.llvm.org
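
To make the `needs_ir_dumping` point in the proposal more concrete, here is a minimal sketch of the "store the IR up front" pattern. It deliberately uses no LLVM or Gandiva headers: `SketchConfiguration`, `SketchEngine`, `FinalizeModule`, and `HandOffToJit` are hypothetical stand-ins invented for illustration, and the IR is modelled as a plain string. It is not how Gandiva implements this, only one way the idea could look.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for Gandiva's Configuration; only the flag matters here.
struct SketchConfiguration {
  bool needs_ir_dumping = false;
};

// Hypothetical stand-in for Engine. The IR is modelled as a std::string
// instead of an llvm::Module so the sketch stays self-contained.
class SketchEngine {
 public:
  explicit SketchEngine(SketchConfiguration config) : config_(config) {}

  // Called once, while the engine still owns the module.
  void FinalizeModule(std::string module_ir) {
    if (config_.needs_ir_dumping) {
      // Keep a copy now, because the module is not available afterwards.
      cached_ir_ = module_ir;
    }
    // Assumption: after this point the JIT owns the module and the
    // engine can no longer read its IR.
    HandOffToJit(std::move(module_ir));
  }

  // Returns the IR captured at finalization time, or throws if the
  // option was not set before the module was finalized.
  const std::string& DumpIR() const {
    if (!cached_ir_.has_value()) {
      throw std::logic_error("needs_ir_dumping must be set before finalization");
    }
    return *cached_ir_;
  }

 private:
  // Placeholder for moving the module into the JIT; does nothing here.
  static void HandOffToJit(std::string module_ir) { (void)module_ir; }

  SketchConfiguration config_;
  std::optional<std::string> cached_ir_;
};
```

The design choice this illustrates is that the caller must opt in before the module is built, which avoids paying for an IR copy on every expression when dumping is rarely needed.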
Component(s)
C++ - Gandiva