Update NeMo and Megatron to TOT #424

Merged

@pstjohn (Collaborator) commented Nov 12, 2024

Updates NeMo and Megatron to top of tree (TOT) and bumps the Transformer Engine version to 1.11.

Includes a number of fixes needed for compatibility with these newer versions, including changes to IOMixin.

Branches off of #302.

Includes the NeMo fix for validation batches after restart (NVIDIA/NeMo#11029).

@pstjohn force-pushed the pstjohn/update-3rdparty-esm2-pretraining branch 3 times, most recently from 51926b4 to bfbb625, on November 12, 2024 21:22

@pstjohn (Collaborator, Author) commented Nov 13, 2024

/build-ci

@pstjohn force-pushed the pstjohn/update-3rdparty-esm2-pretraining branch from 468507f to d0be804 on November 14, 2024 16:01
@pstjohn changed the title from "Pstjohn/update 3rdparty esm2 pretraining" to "Update NeMo and Megatron to TOT" on Nov 14, 2024

@pstjohn (Collaborator, Author) commented Nov 14, 2024

/build-ci

@pstjohn enabled auto-merge (squash) on November 14, 2024 17:03
@pstjohn merged commit 35af747 into NVIDIA:main on Nov 14, 2024
4 checks passed

@polinabinder1 (Collaborator)

/build-ci
