Tuple as device_type input to support Heterogenous Sharding of tables across different device_typestable #2600

Open
faran928 wants to merge 2 commits into main

faran928 commented Dec 2, 2024

Summary: To support heterogeneous sharding across different device types (cuda, cpu, etc.), device_type_from_sharding_info is passed as a tuple with one device type per shard: the entry at each index is the device_type for the shard at that index.

Differential Revision: D65933148
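
A minimal sketch of the per-shard tuple described in the summary, written in plain Python rather than against the TorchRec code. The `DeviceTypes` alias and the `device_type_for_shard` helper are hypothetical names used only for illustration; the assumption is that a single string still means "one device type for every shard", while a tuple carries one entry per shard.

```python
from typing import Tuple, Union

# Hypothetical alias (not from the PR): either one device type shared by all
# shards, or a tuple holding one device type per shard.
DeviceTypes = Union[str, Tuple[str, ...]]


def device_type_for_shard(device_types: DeviceTypes, shard_index: int) -> str:
    """Return the device type for one shard (illustrative helper only)."""
    if isinstance(device_types, str):
        # Homogeneous case: the same device type applies to every shard.
        return device_types
    if not 0 <= shard_index < len(device_types):
        raise IndexError(
            f"shard {shard_index} has no entry in {device_types!r}"
        )
    # Heterogeneous case: index i holds the device type of shard i.
    return device_types[shard_index]


# Example: shard 0 on cuda, shard 1 on cpu.
mixed = ("cuda", "cpu")
shard_devices = [device_type_for_shard(mixed, i) for i in range(len(mixed))]
```

Accepting both forms in one helper is a design choice of this sketch, so that callers passing a single string would keep working; whether the PR keeps that compatibility is not stated in the summary.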

facebook-github-bot added the CLA Signed label Dec 2, 2024

facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D65933148

faran928 pushed a commit to faran928/torchrec that referenced this pull request Dec 4, 2024
… across different device_typestable (pytorch#2600)

Summary:

To support heterogeneous sharding across different device types (cuda, cpu, etc.), device_type_from_sharding_info is passed as a tuple with one device type per shard: the entry at each index is the device_type for the shard at that index.

Differential Revision: D65933148
Faran Ahmad added 2 commits December 3, 2024 18:14
Summary:

Unify the InferRwSequenceEmbedding modules for GPU and CPU.

The implementations of InferRwSequenceEmbedding and InferCPURwSequenceEmbedding do not appear to differ much.

Heterogeneous sharding requires merging them into one module.

Also introduces device_type_from_sharding_info to propagate the correct device for the output dist.

Reviewed By: jiayisuse

Differential Revision: D65859663
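
The idea in the commit message above, one module parameterized by device type instead of separate GPU and CPU classes, can be sketched roughly as follows. This is not the TorchRec implementation: `RwSequenceEmbeddingSketch` and `output_dist_devices` are hypothetical names, and the rule that CPU shards share the plain `"cpu"` device while other device types get one indexed device per rank is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RwSequenceEmbeddingSketch:
    """Illustrative stand-in for a unified GPU/CPU module (hypothetical)."""

    device_type_from_sharding_info: str
    world_size: int

    def output_dist_devices(self) -> List[str]:
        # Assumption for this sketch: every CPU shard uses the plain "cpu"
        # device, while other device types use one indexed device per rank.
        if self.device_type_from_sharding_info == "cpu":
            return ["cpu"] * self.world_size
        return [
            f"{self.device_type_from_sharding_info}:{rank}"
            for rank in range(self.world_size)
        ]


# The same class serves both cases; only the device type argument differs.
gpu_devices = RwSequenceEmbeddingSketch("cuda", 2).output_dist_devices()
cpu_devices = RwSequenceEmbeddingSketch("cpu", 2).output_dist_devices()
```

Keeping the device decision in one place is what lets a single class replace the two variants in this sketch.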
… across different device_typestable (pytorch#2600)

Summary:

To support heterogeneous sharding across different device types (cuda, cpu, etc.), device_type_from_sharding_info is passed as a tuple with one device type per shard: the entry at each index is the device_type for the shard at that index.

Differential Revision: D65933148
faran928 pushed a commit to faran928/torchrec that referenced this pull request Dec 5, 2024
… across different device_typestable (pytorch#2600)

Labels: CLA Signed, fb-exported

2 participants