Unifying TBE API using List (Backend) #3563

spcyppt · 2025-01-11T11:06:00Z

Summary:
As the number of arguments in TBE keeps growing, some of the optimizers run into the argument-count limit (i.e., 64) during PyTorch operator registration.

For long-term growth and maintenance, we therefore redesign the TBE API by packing some of the arguments into lists. Note that not all arguments are packed.

We pack the arguments as a list for each type.
For **common** arguments, we pack

- weights and arguments of type `Momentum` into `TensorList`
- other tensors and optional tensors into a list of optional tensors, `aux_tensor`
- `int` arguments into `aux_int`
- `float` arguments into `aux_float`
- `bool` arguments into `aux_bool`.

Similarly, for **optimizer-specific** arguments, we pack

- arguments of type `Momentum` that are *not* optional into `TensorList`
- *optional* tensors into a list of optional tensors, `optim_tensor`
- `int` arguments into `optim_int`
- `float` arguments into `optim_float`
- `bool` arguments into `optim_bool`.
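The per-type packing described above can be sketched in plain Python. This is only an illustration under stated assumptions: `FakeTensor`, `PackedArgs`, `pack_by_type`, and every argument name in the example dictionary are hypothetical stand-ins, and the real grouping and ordering are defined in the design doc, not here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor so the sketch runs without PyTorch."""

    def __init__(self, name: str) -> None:
        self.name = name


@dataclass
class PackedArgs:
    """One list per argument type, mirroring the *_tensor/_int/_float/_bool idea."""

    tensors: List[Optional[FakeTensor]] = field(default_factory=list)
    ints: List[int] = field(default_factory=list)
    floats: List[float] = field(default_factory=list)
    bools: List[bool] = field(default_factory=list)


def pack_by_type(args: Dict[str, Any]) -> PackedArgs:
    """Route each argument into the list for its type, keeping insertion order."""
    packed = PackedArgs()
    for name, value in args.items():
        # bool must be tested before int because bool is a subclass of int.
        if isinstance(value, bool):
            packed.bools.append(value)
        elif isinstance(value, int):
            packed.ints.append(value)
        elif isinstance(value, float):
            packed.floats.append(value)
        elif value is None or isinstance(value, FakeTensor):
            # None models an optional tensor that was not provided.
            packed.tensors.append(value)
        else:
            raise TypeError(f"unsupported type for argument {name!r}")
    return packed


# Hypothetical optimizer-specific arguments (names are illustrative only).
optimizer_args = {
    "optional_state": None,
    "row_state": FakeTensor("row_state"),
    "halflife": 1000,
    "eps": 1e-8,
    "learning_rate": 0.01,
    "use_feature": True,
}

packed = pack_by_type(optimizer_args)
print(len(optimizer_args), "flat arguments packed into 4 lists")
```

Because positions in each list replace argument names, the caller and the callee have to agree on a fixed order per list; that agreement is what the design doc's signatures pin down.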

We see issues with PyTorch registration when packing SymInt arguments across the Python/C++ boundary, so we unroll them and pass SymInt arguments individually.

This significantly reduces the number of arguments. For example, `split_embedding_codegen_lookup_rowwise_adagrad_with_counter_function`, which currently has 61 arguments, has only 26 arguments with this API design.
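As a rough picture of what a packed signature looks like, the sketch below declares a function that takes the typed lists plus two size arguments passed individually, then counts its parameters. The function name `lookup_packed` and the arguments `sym_size_a` and `sym_size_b` are hypothetical, `object` and `int` stand in for `torch.Tensor` and `SymInt`, and the list names are borrowed from the summary. It does not reproduce the real 26-argument signature.

```python
import inspect
from typing import List, Optional

# Plain Python stand-in so the sketch runs without PyTorch.
Tensor = object


def lookup_packed(  # hypothetical name, not the real FBGEMM operator
    weights: List[Tensor],
    aux_tensor: List[Optional[Tensor]],
    aux_int: List[int],
    aux_float: List[float],
    aux_bool: List[bool],
    sym_size_a: int,  # hypothetical SymInt-like argument, passed on its own
    sym_size_b: int,  # hypothetical SymInt-like argument, passed on its own
) -> None:
    """Signature-only sketch; the body is intentionally empty."""


MAX_SCHEMA_ARGS = 64  # the registration limit quoted in the summary

num_args = len(inspect.signature(lookup_packed).parameters)
print(f"{num_args} arguments, limit {MAX_SCHEMA_ARGS}")
```

Adding another `int` or `float` option under this scheme grows a list rather than the signature, so the parameter count stays fixed as options are added.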

Please refer to the design doc for which arguments are packed and for the resulting signatures.
Design doc:
https://docs.google.com/document/d/1dCBg7dcf7Yq9FHVrvXsAmFtBxkDi9o6u0r-Ptd4UDPE/edit?tab=t.0#heading=h.6bip5pwqq8xb

Full signature for each optimizer lookup function will be provided shortly.

Differential Revision: D68054868

facebook-github-bot · 2025-01-11T11:06:08Z

This pull request was exported from Phabricator. Differential Revision: D68054868

netlify · 2025-01-11T11:07:01Z

✅ Deploy Preview for pytorch-fbgemm-docs ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | `8d6fa66` |
| 🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/6789a63f6229cd0008797b20 |
| 😎 Deploy Preview | https://deploy-preview-3563--pytorch-fbgemm-docs.netlify.app |



facebook-github-bot · 2025-01-16T06:20:06Z

This pull request was exported from Phabricator. Differential Revision: D68054868

Summary:
X-link: facebookresearch/FBGEMM#649

As the number of arguments in TBE keeps growing, some of the optimizers run into the argument-count limit (i.e., 64) during PyTorch operator registration.

**For long-term growth and maintenance, we therefore redesign the TBE API by packing some of the arguments into lists. Note that not all arguments are packed.**

We pack the arguments as a list for each type.
For **common** arguments, we pack

- weights and arguments of type `Momentum` into `TensorList`
- other tensors and optional tensors into a list of optional tensors, `aux_tensor`
- `int` arguments into `aux_int`
- `float` arguments into `aux_float`
- `bool` arguments into `aux_bool`.

Similarly, for **optimizer-specific** arguments, we pack

- arguments of type `Momentum` that are *not* optional into `TensorList`
- *optional* tensors into a list of optional tensors, `optim_tensor`
- `int` arguments into `optim_int`
- `float` arguments into `optim_float`
- `bool` arguments into `optim_bool`.

We see issues with PyTorch registration when packing SymInt arguments across the Python/C++ boundary, so we unroll them and pass SymInt arguments individually.

**This significantly reduces the number of arguments.** For example, `split_embedding_codegen_lookup_rowwise_adagrad_with_counter_function`, which currently has 61 arguments, has only 26 arguments with this API design.

Please refer to the design doc for which arguments are packed and for the resulting signatures.
Design doc:
https://docs.google.com/document/d/1dCBg7dcf7Yq9FHVrvXsAmFtBxkDi9o6u0r-Ptd4UDPE/edit?tab=t.0#heading=h.6bip5pwqq8xb

Full signature for each optimizer lookup function will be provided shortly.

Reviewed By: sryap

Differential Revision: D68054868

facebook-github-bot · 2025-01-16T19:48:26Z

This pull request was exported from Phabricator. Differential Revision: D68054868


facebook-github-bot · 2025-01-17T00:37:42Z

This pull request was exported from Phabricator. Differential Revision: D68054868

facebook-github-bot · 2025-01-27T01:14:06Z

Hi @spcyppt!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

facebook-github-bot added the cla signed label Jan 11, 2025

facebook-github-bot added the fb-exported label Jan 11, 2025

spcyppt force-pushed the export-D68054868 branch from 5f02b4b to f906f10 Compare January 16, 2025 06:19

spcyppt force-pushed the export-D68054868 branch from f906f10 to e7408fa Compare January 16, 2025 19:48

spcyppt force-pushed the export-D68054868 branch from e7408fa to 8d6fa66 Compare January 17, 2025 00:37
