Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unifying TBE API using List (Backend) #3563

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

spcyppt
Copy link
Contributor

@spcyppt spcyppt commented Jan 11, 2025

Summary:
As the number of arguments in TBE keeps growing, some of the optimizers run into number of arguments limitation (i.e., 64) during pytorch operation registration.

For long-term growth and maintenance, we hence redesign TBE API by packing some of the arguments into list. Note that not all arguments are packed.

We pack the arguments as a list for each type.
For common arguments, we pack

  • weights and arguments of type Momentum into TensorList
  • other tensors and optional tensors to list of optional tensors aux_tensor
  • int arguments into aux_int
  • float arguments into aux_float
  • bool arguments into aux_bool.

Similarly for optimizer-specific arguments, we pack

  • arguments of type Momentum that are not optional into TensorList
  • optional tensors to list of optional tensors optim_tensor
  • int arguments into optim_int
  • float arguments into optim_float
  • bool arguments into optim_bool.

We see issues with pytorch registration across packing SymInt in python-C++, so we unroll and pass SymInt arguments individually.

This significantly reduces number of arguments. For example, split_embedding_codegen_lookup_rowwise_adagrad_with_counter_function, which currently has 61 arguments only have 26 arguments with this API design.

Please refer to the design doc on which arguments are packed and signature.
Design doc:
https://docs.google.com/document/d/1dCBg7dcf7Yq9FHVrvXsAmFtBxkDi9o6u0r-Ptd4UDPE/edit?tab=t.0#heading=h.6bip5pwqq8xb

Full signature for each optimizer lookup function will be provided shortly.

Differential Revision: D68054868

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D68054868

Copy link

netlify bot commented Jan 11, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit 8d6fa66
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/6789a63f6229cd0008797b20
😎 Deploy Preview https://deploy-preview-3563--pytorch-fbgemm-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

spcyppt added a commit to spcyppt/FBGEMM that referenced this pull request Jan 16, 2025
Summary:
X-link: facebookresearch/FBGEMM#649


As the number of arguments in TBE keeps growing, some of the optimizers run into number of arguments limitation (i.e., 64) during pytorch operation registration. 

**For long-term growth and maintenance, we hence redesign TBE API by packing some of the arguments into list. Note that not all arguments are packed.**

We pack the arguments as a list for each type.
For **common** arguments, we pack 
- weights and arguments of type `Momentum` into TensorList
- other tensors and optional tensors to list of optional tensors `aux_tensor`
- `int` arguments into `aux_int`
- `float` arguments into `aux_float`
- `bool` arguments into `aux_bool`.

Similarly for **optimizer-specific** arguments, we pack
- arguments of type `Momentum` that are *__not__ optional* into TensorList
- *optional* tensors to list of optional tensors `optim_tensor`
- `int` arguments into `optim_int`
- `float` arguments into `optim_float`
- `bool` arguments into `optim_bool`.

We see issues with pytorch registration across packing SymInt in python-C++, so we unroll and pass SymInt arguments individually. 

**This significantly reduces number of arguments.** For example, `split_embedding_codegen_lookup_rowwise_adagrad_with_counter_function`, which currently has 61 arguments only have 26 arguments with this API design. 

Please refer to the design doc on which arguments are packed and signature.
Design doc:
https://docs.google.com/document/d/1dCBg7dcf7Yq9FHVrvXsAmFtBxkDi9o6u0r-Ptd4UDPE/edit?tab=t.0#heading=h.6bip5pwqq8xb

Full signature for each optimizer lookup function will be provided shortly.

Differential Revision: D68054868
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D68054868

spcyppt added a commit to spcyppt/FBGEMM that referenced this pull request Jan 16, 2025
Summary:
X-link: facebookresearch/FBGEMM#649


As the number of arguments in TBE keeps growing, some of the optimizers run into number of arguments limitation (i.e., 64) during pytorch operation registration. 

**For long-term growth and maintenance, we hence redesign TBE API by packing some of the arguments into list. Note that not all arguments are packed.**

We pack the arguments as a list for each type.
For **common** arguments, we pack 
- weights and arguments of type `Momentum` into TensorList
- other tensors and optional tensors to list of optional tensors `aux_tensor`
- `int` arguments into `aux_int`
- `float` arguments into `aux_float`
- `bool` arguments into `aux_bool`.

Similarly for **optimizer-specific** arguments, we pack
- arguments of type `Momentum` that are *__not__ optional* into TensorList
- *optional* tensors to list of optional tensors `optim_tensor`
- `int` arguments into `optim_int`
- `float` arguments into `optim_float`
- `bool` arguments into `optim_bool`.

We see issues with pytorch registration across packing SymInt in python-C++, so we unroll and pass SymInt arguments individually. 

**This significantly reduces number of arguments.** For example, `split_embedding_codegen_lookup_rowwise_adagrad_with_counter_function`, which currently has 61 arguments only have 26 arguments with this API design. 

Please refer to the design doc on which arguments are packed and signature.
Design doc:
https://docs.google.com/document/d/1dCBg7dcf7Yq9FHVrvXsAmFtBxkDi9o6u0r-Ptd4UDPE/edit?tab=t.0#heading=h.6bip5pwqq8xb

Full signature for each optimizer lookup function will be provided shortly.

Reviewed By: sryap

Differential Revision: D68054868
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D68054868

Summary:
X-link: facebookresearch/FBGEMM#649


As the number of arguments in TBE keeps growing, some of the optimizers run into number of arguments limitation (i.e., 64) during pytorch operation registration. 

**For long-term growth and maintenance, we hence redesign TBE API by packing some of the arguments into list. Note that not all arguments are packed.**

We pack the arguments as a list for each type.
For **common** arguments, we pack 
- weights and arguments of type `Momentum` into TensorList
- other tensors and optional tensors to list of optional tensors `aux_tensor`
- `int` arguments into `aux_int`
- `float` arguments into `aux_float`
- `bool` arguments into `aux_bool`.

Similarly for **optimizer-specific** arguments, we pack
- arguments of type `Momentum` that are *__not__ optional* into TensorList
- *optional* tensors to list of optional tensors `optim_tensor`
- `int` arguments into `optim_int`
- `float` arguments into `optim_float`
- `bool` arguments into `optim_bool`.

We see issues with pytorch registration across packing SymInt in python-C++, so we unroll and pass SymInt arguments individually. 

**This significantly reduces number of arguments.** For example, `split_embedding_codegen_lookup_rowwise_adagrad_with_counter_function`, which currently has 61 arguments only have 26 arguments with this API design. 

Please refer to the design doc on which arguments are packed and signature.
Design doc:
https://docs.google.com/document/d/1dCBg7dcf7Yq9FHVrvXsAmFtBxkDi9o6u0r-Ptd4UDPE/edit?tab=t.0#heading=h.6bip5pwqq8xb

Full signature for each optimizer lookup function will be provided shortly.

Reviewed By: sryap

Differential Revision: D68054868
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D68054868

@facebook-github-bot
Copy link
Contributor

Hi @spcyppt!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants