building with triton support? #166
@Tobias-Fischer Are you aware of any quick example demonstrating the usage of triton? No worries if not, I will look upstream. I am now back at home base and can help debug this further.
The first code snippet is an easy example: https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html
Alright, it fails with:
@h-vetinari and @hmaarrfk, just FYI. My current assessment is that we will likely need to wait for triton 2.x, then simply add it as a run dep and see if things work out. I tried adding
I just confirmed that this example works: https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html
Ah, something surprising (?): it also works without installing triton, just by having a
Made some progress - see conda-forge/triton-feedstock#6. The open question in my mind is still the circular dependency, from torch to triton and from triton to torch. Not sure how best to deal with it ... a
Probably with a
I keep running into `cannot find -lcuda: No such file or directory` when using it. Any idea what might be going on?
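A common cause of this linker error: `-lcuda` refers to the driver library, which the CUDA toolkit ships only as a stub under a `stubs` directory, so the linker cannot find it unless that directory is on the search path. A hedged sketch of locating it (the candidate paths below are common conventions, not guaranteed):

```python
# Find the CUDA driver stub directory and prepend it to LIBRARY_PATH so that
# `-lcuda` resolves at build time without a real driver installed.
import os

candidates = [
    os.path.join(os.environ.get("CONDA_PREFIX", ""), "lib", "stubs"),
    "/usr/local/cuda/lib64/stubs",
]
stub_dirs = [d for d in candidates if os.path.isdir(d)]
if stub_dirs:
    os.environ["LIBRARY_PATH"] = os.pathsep.join(
        stub_dirs + [os.environ.get("LIBRARY_PATH", "")]
    )
print("driver stub dirs found:", stub_dirs or "none")
```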
We need conda-forge/triton-feedstock#6
Bumping this since conda-forge/triton-feedstock#6 was merged. Thanks for the good work!
Can we quickly test this? Just use the package we have and install triton. Do things magically work, or do we need more tweaks in this feedstock?
Anecdotal experience, but pytorch and triton from conda-forge seem to pick up each other fine today (as in, no segfaults when calling torch.compile with the inductor backend). I believe special care should be taken with dependency versions: both triton and the support for it in torch are still experimental and probably have quite narrow ranges of versions in which they are supposed to work together.
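Given how tightly torch releases pin their triton range, a quick way to see what pairing an environment actually has is to query installed versions side by side. A small sketch (package names are the PyPI/conda-forge ones; nothing here validates the pairing itself):

```python
# Report installed torch/triton versions, returning None for absent packages.
from importlib import metadata

def installed_version(name):
    """Return the installed version of `name`, or None if not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None

versions = {pkg: installed_version(pkg) for pkg in ("torch", "triton")}
print(versions)
```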
@RaulPPelaez good idea on 2.0.1. Would you be interested in submitting a PR? I could start one now...
See #172 - we could also include formal triton support in that one.
Has anyone looked at this again recently?
Here's what we've done at Anaconda for v2.3.0: https://github.com/AnacondaRecipes/triton-feedstock There are some notes in the meta.yaml about the various choices we made. Let me know if you've got any questions.
I'm going to try making a new pull request for 3.1.0, as that's the version required by PyTorch 2.5.1. @danpetry, thanks. Curiously enough, I've just tried diffing the PyPI 3.1.0 package against the one provided by PyTorch, and, at least as far as
I think pytorch call(ed) their conda package torchtriton too?
I probably need to re-check the logic of that statement, but that's the conclusion I came to when I worked on it. At least, I wanted to be cautious and say, "this should not be used except with pytorch, with which it has been explicitly integration tested".
It uses LLVM at runtime, rather than at build time.
The issue was that triton did not have wheels, and even when it did, releases took time to land. There was also an issue with ROCm support not getting merged. All of these have been resolved, I think.
OK, so we can now use the wheels rather than the git repo to build? IIRC this wasn't possible.
In conda packaging? We don't want to use pre-compiled wheels in conda-build.
Yeah, I've already noticed that it works with neither the LLVM 18 nor the 19 release. That said, I've assumed this applies to all triton releases, whether coming from PyPI triton or PyTorch's builds. I'm going to try looking at triton's main later, to see whether I can find a commit that actually works with an LLVM release, and then let you know how "far" it is from 2.1.0. Perhaps it wouldn't be that hard to make it compatible with LLVM 19, or we could try packaging a newer snapshot that would also be compatible with PyTorch.
I think my key issue was that they weren't tagging commits in the GitHub repo, there weren't any sdists, and I didn't want to release a GitHub commit and call it a release version.
Yes, that is still the case. However, they do release versioned wheels, and I've confirmed that the contents of
@atalman, since you seem to be a maintainer of
It's here AFAIU, but NB that triton still doesn't have this tagged.
Whether that's good enough re traceability for conda-forge, I don't know. For anaconda main, we decided it wasn't. Triton can force-push the branch it's on and remove the commit - and indeed they have in the past. (I suppose they can change tags too. Ideal is a GitHub release.)
I believe it's one and the same commit, coordinated by him, but he can confirm.
The commit fixing compatibility with LLVM 19 is triton-lang/triton@46550ab. I'm going to see how much we would actually need to patch to get it to work on top of 3.1.0.
I've pushed my WIP to conda-forge/triton-feedstock#26.
Worth bearing in mind that LLVM is only used by triton at runtime, to compile CUDA kernels. And in the end, the binary format is, I guess, determined by the CUDA compiler rather than LLVM. So keeping it vendored in isn't an issue as far as compatibility with the rest of the distro is concerned, I think.
I don't know if anyone else can confirm this?
I think the bigger issue here is that triton either downloads a prebuilt LLVM version if it detects a supported platform, or uses the system LLVM (expecting that specific commit) when it doesn't.
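For packagers, the relevant knob is the set of environment variables triton's setup.py has consulted to build against a system LLVM instead of downloading its pinned snapshot. A hedged sketch follows; the variable names reflect upstream around the 3.x releases, and should be verified against the setup.py of the exact triton version being packaged:

```python
# Point a triton source build at a system LLVM via environment variables.
# CONDA_PREFIX as the LLVM install prefix is an assumption for a conda build.
import os

llvm_prefix = os.environ.get("CONDA_PREFIX", "/usr")
build_env = {
    "LLVM_SYSPATH": llvm_prefix,
    "LLVM_INCLUDE_DIRS": os.path.join(llvm_prefix, "include"),
    "LLVM_LIBRARY_DIR": os.path.join(llvm_prefix, "lib"),
}
os.environ.update(build_env)  # would be set before running `pip install .`
print(build_env)
```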
Cc @amjames. Andrew, you had some useful insights into the PyTorch -> Triton -> LLVM coupling, so you may be interested in this topic and in conda-forge/triton-feedstock#26. Making triton compatible with a proper LLVM release could be very useful for conda-forge (and probably other distros as well). conda-forge/triton-feedstock#26 (comment) summarizes how this was achieved for triton 3.1.0 with the LLVM 19 release. Some manual testing seems to confirm success at the "builds and seems to compile stuff with nvcc" level. Perhaps you have some suggestions on what subset of the PyTorch test suite to run to confirm that PyTorch + triton works as designed?
There's a smoke test which exercises torch.compile with CUDA (if an environment variable is appropriately set).
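In the spirit of that opt-in check, here is a hedged sketch of a smoke test gated by an environment variable. The variable name `RUN_COMPILE_SMOKE_TEST` is an assumption for illustration, not the feedstock's actual one:

```python
# Opt-in torch.compile smoke test, skipped unless explicitly enabled.
import os

def run_smoke_test():
    import torch  # imported lazily so the gate can skip without torch

    def f(x):
        return torch.sin(x) + torch.cos(x)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    compiled = torch.compile(f)  # inductor backend; needs triton for CUDA
    x = torch.randn(4, device=device)
    assert torch.allclose(compiled(x), f(x), atol=1e-6)
    print("torch.compile smoke test passed on", device)

if os.environ.get("RUN_COMPILE_SMOKE_TEST") == "1":
    run_smoke_test()
else:
    print("smoke test skipped; set RUN_COMPILE_SMOKE_TEST=1 to run")
```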
Thanks. Looks like I was wrong, and some patching is necessary for the regular CC to be able to find the CUDA headers:
@rgommers Thanks!
What kind of time limits are we working with? Full coverage is probably a non-starter, but the inductor tests will be the best place to focus. I would start with:
Some of these tests will require a specific GPU, like an A100, but in general the tests should be annotated to skip when special requirements like that are not met.
The plot thickens. After fixing the path errors, I'm getting:
My guess would be that it doesn't support CUDA 12.6 for some reason. But 12.0 is too old, and I don't think we can cleanly do 12.5 just for Triton without changing PyTorch. Will try to figure out a good solution tomorrow, but I'd appreciate any hints. |
Are you sure it's not picking up your system CUDA (12.5?) and then competing against the conda CUDA (12.6?)?
It might be a problem that the feedstock uses
Actually, it turns out we need one more upstream patch for CUDA 12.6 support. But I also need to fix the path search; I'll make a pull request later.
Hmm, is that actually wrong? I didn't think
AFAIU... But... Maybe it's to do with the "conda-build" conditional in the link above, so the -I/-L flags aren't being added?
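To illustrate the $PREFIX-vs-search-path point: header and library locations can be derived from Python's `sysconfig` rather than hard-coded as `$PREFIX/include`, which is exactly why a "conda-build" conditional that skips adding -I/-L flags would matter. A minimal sketch:

```python
# Query the interpreter's own install paths; in a conda environment both of
# these resolve under the environment prefix ($PREFIX at build time).
import sysconfig

paths = sysconfig.get_paths()
print("python headers:", paths["include"])  # e.g. <prefix>/include/pythonX.Y
print("install prefix:", paths["data"])     # the environment root
```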
Already solved via conda-forge/triton-feedstock#28. Just wondering if I should update the dependencies while I'm at it.
Originally posted by @Tobias-Fischer in #165 (comment)
--
More background: #151