Supporting tensor parallelism for int8 weight only quant #939
Conversation
CI status: ✅ No failures as of commit b113eda with merge base 64719d5.
Summary: following https://github.com/pytorch/ao/blob/main/tutorials/developer_api_guide/tensor_parallel.py we can support tensor parallelism for int8 weight only quant; this is needed for torchchat.

Test Plan: python test/dtypes/test_affine_quantized_tensor_parallel.py
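The reason tensor parallelism composes cleanly with int8 weight-only quantization is that the per-channel scales can be sharded along the same dimension as the quantized weight, so each rank can dequantize its local shard exactly. The sketch below illustrates that idea in plain Python; it is a simplified illustration, not torchao's actual `AffineQuantizedTensor` implementation, and the function names are hypothetical.

```python
# Hypothetical sketch of per-channel int8 weight-only quantization and
# row-wise (output-channel) sharding. NOT torchao's implementation.

def quantize_int8_per_channel(weight):
    """weight: list of rows (output channels) of floats.
    Returns (int8-range rows, one scale per row)."""
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row) or 1.0  # guard all-zero rows
        scale = max_abs / 127.0
        q_rows.append([round(v / scale) for v in row])
        scales.append(scale)
    return q_rows, scales

def dequantize(q_rows, scales):
    """Recover approximate float rows from quantized rows and scales."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]

def shard_rowwise(q_rows, scales, rank, world_size):
    """Each rank takes a contiguous slice of the quantized rows AND the
    matching slice of scales. Because scale is per output channel, local
    dequantization on a shard is exact -- this alignment is what makes
    tensor parallelism straightforward for weight-only quant."""
    n = len(q_rows) // world_size
    lo, hi = rank * n, (rank + 1) * n
    return q_rows[lo:hi], scales[lo:hi]
```

Concatenating each rank's locally dequantized shard reproduces the full dequantized weight, which is the property the TP support relies on.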
Force-pushed from 41fc6f9 to 7139466.
LGTM!
Wow, didn't think that would be so straightforward!
Nice!
Commits:
* [WIP] Supporting tensor parallelism for int8 weight only quant
* implement tp for aqt
* fixes
* import fix
* remove cpu test
* fix
* fix
* fix test
* device
* change transpose impl
* Skip compiled TP test for torch version < 2.5
* version util
* fix
* fix version

Co-authored-by: Ke Wen <[email protected]>
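The commit list mentions skipping the compiled TP test for torch versions below 2.5 and adding a version util. A minimal sketch of such a gate is below; the helper name and parsing rules are assumptions for illustration, not torchao's actual utility.

```python
# Hypothetical version gate, e.g. to skip a compiled TP test when the
# installed torch is older than 2.5. Not torchao's actual util.

def version_at_least(version_str, minimum):
    """Compare major.minor, ignoring local build suffixes like '+cu121'."""
    def parse(v):
        return tuple(int(p) for p in v.split("+")[0].split(".")[:2])
    return parse(version_str) >= parse(minimum)
```

A test could then guard itself with something like `if not version_at_least(torch.__version__, "2.5"): skip(...)`.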