Adding tests for save/load support #16

Merged
2 commits merged on Nov 28, 2023

Conversation

HDCharles
Contributor

@HDCharles HDCharles commented Nov 28, 2023

Stack from ghstack (oldest at bottom):

Summary: We can now save a model quantized with a tensor subclass by
saving its state dict. Later, we load the model with meta tensors (i.e.
only tensor metadata is loaded, not the actual parameters), apply the
quantization API, and then load the quantized model's state dict.

We change the dtype of the subclass to match the dtype of its
dequantized form, both to align with the subclass design guidelines and
to make this flow work.

Test Plan: python test/test.py

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
HDCharles added a commit that referenced this pull request Nov 28, 2023
ghstack-source-id: e02cdf5cd182c06241ddba73d579189e4ff3ba69
Pull Request resolved: #16
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 28, 2023
@HDCharles HDCharles merged commit 4ad210d into gh/HDCharles/5/base Nov 28, 2023
@HDCharles HDCharles deleted the gh/HDCharles/5/head branch November 28, 2023 05:41
HDCharles added a commit that referenced this pull request Nov 28, 2023
dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024
jerryzh168 pushed a commit that referenced this pull request Sep 4, 2024
* initial flow for autoround
* update flow
* use int4 kernel
* remove debug code
* update the forward
* clean code
* e2e example
* refine code
* add requirements for test
* update test
* update the readme
* add readme
* update the filenames
* update the np version
* add demo
* format
* add more docs
* format
* add doc
* use `AffineQuantizedTensor`
* impl ar using multensors
* clean code
* use hook + multensors
* separate mul_tensors into a new file
* fix typos
* rename mul_tensor to multi_tensor
* enable amp
* eval model
* add gen examples
* add warmup to benchmark
* add benchmark
* clean code
* format code
* use tiny kernel
* add more note
* format
* correct typos
* remove hard code
* use intx
* enable offload for multitensor
* update the default config
* refine note
* update the version check
* format
* update
* add ut
* format
* add scripts
* format code
* format
* update
* fix typo
* refine bench code
* Enable `use_optimized_layer_output` and AO' llama (#12)
* Refine the Doc (#14)
* add more docstring
* add paper link
* correct some note
* add cmd
* update the scripts
* revert some change
* Add a lightweight configuration for quick benchmarking (#15)
* update quant method name
* Wrap model's buffers and params to `MultiTensor` & update the results (#16)
  * wrap model's buffers and params to `MultiTensor` and update the results

Signed-off-by: yiliu30 <[email protected]>
jerryzh168 pushed a commit to jerryzh168/ao that referenced this pull request Sep 4, 2024