Supporting tensor parallelism for int8 weight only quant #939
Conversation
CI status: ✅ No failures as of commit b113eda with merge base 64719d5.
Summary: following https://github.com/pytorch/ao/blob/main/tutorials/developer_api_guide/tensor_parallel.py we can support tensor parallelism for int8 weight only quant; this is needed for torchchat.

Test Plan: python test/dtypes/test_affine_quantized_tensor_parallel.py
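The reason tensor parallelism composes cleanly with int8 weight-only quantization is that the per-channel scales can be sharded along the same dimension as the quantized weight, so each rank can dequantize its local shard exactly. The sketch below illustrates that idea in plain Python; it is a simplified illustration, not torchao's actual `AffineQuantizedTensor` implementation, and the function names are hypothetical.

```python
# Hypothetical sketch of per-channel int8 weight-only quantization and
# row-wise (output-channel) sharding. NOT torchao's implementation.

def quantize_int8_per_channel(weight):
    """weight: list of rows (output channels) of floats.
    Returns (int8-range rows, one scale per row)."""
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row) or 1.0  # guard all-zero rows
        scale = max_abs / 127.0
        q_rows.append([round(v / scale) for v in row])
        scales.append(scale)
    return q_rows, scales

def dequantize(q_rows, scales):
    """Recover approximate float rows from quantized rows and scales."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]

def shard_rowwise(q_rows, scales, rank, world_size):
    """Each rank takes a contiguous slice of the quantized rows AND the
    matching slice of scales. Because scale is per output channel, local
    dequantization on a shard is exact -- this alignment is what makes
    tensor parallelism straightforward for weight-only quant."""
    n = len(q_rows) // world_size
    lo, hi = rank * n, (rank + 1) * n
    return q_rows[lo:hi], scales[lo:hi]
```

Concatenating each rank's locally dequantized shard reproduces the full dequantized weight, which is the property the TP support relies on.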
Force-pushed from 41fc6f9 to 7139466.
LGTM!
Wow, didn't think that would be so straightforward!
Nice!
Commits:
* [WIP] Supporting tensor parallelism for int8 weight only quant
* implement tp for aqt
* fixes
* import fix
* remove cpu test
* fix
* fix
* fix test
* device
* change transpose impl
* Skip compiled TP test for torch version < 2.5
* version util
* fix
* fix version

Co-authored-by: Ke Wen <[email protected]>
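The commit list mentions skipping the compiled TP test for torch versions below 2.5 and adding a version util. A minimal sketch of such a gate is below; the helper name and parsing rules are assumptions for illustration, not torchao's actual utility.

```python
# Hypothetical version gate, e.g. to skip a compiled TP test when the
# installed torch is older than 2.5. Not torchao's actual util.

def version_at_least(version_str, minimum):
    """Compare major.minor, ignoring local build suffixes like '+cu121'."""
    def parse(v):
        return tuple(int(p) for p in v.split("+")[0].split(".")[:2])
    return parse(version_str) >= parse(minimum)
```

A test could then guard itself with something like `if not version_at_least(torch.__version__, "2.5"): skip(...)`.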