
ESM2 Golden Value Testing #85

Merged
farhadrgh merged 32 commits into v2-main from farhadr/esm2_debug on Aug 14, 2024

Conversation

farhadrgh (Collaborator)

No description provided.

farhadrgh and others added 14 commits August 7, 2024 18:09
Comment on lines 162 to 164
tokens = tokenizer.tokenizer([row[1] for row in sample_data], return_tensors="pt", padding=True)
tokens["input_ids"] = tokens["input_ids"].to(device)
tokens["attention_mask"] = tokens["attention_mask"].to(device)
Collaborator:

It would be clearer if we called this inputs or batch or something. Later, when we write **tokens, it looks for a second like we're just unpacking the tokens themselves. Also, doesn't the return object from Hugging Face allow you to call batch.to(device)? I think it's a special dictionary type that has a to method defined. If so, that would be nicer to use.
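A minimal sketch of the suggested rename (assuming tokenizer.tokenizer is a Hugging Face tokenizer whose output is a transformers BatchEncoding, which defines a to() method that moves every contained tensor at once; the name batch is just illustrative):

# Same tokenizer call as above, only renamed to batch:
batch = tokenizer.tokenizer([row[1] for row in sample_data], return_tensors="pt", padding=True)
# BatchEncoding.to() replaces the two per-key .to(device) calls:
batch = batch.to(device)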

model = esm2_650M_config_hiddens.configure_model(tokenizer).to(device)
model.eval()
hiddens = model(tokens["input_ids"], tokens["attention_mask"])
embeddings = reduce_hiddens(torch.transpose(hiddens, 0, 1).float(), tokens["attention_mask"])
Collaborator:

Someday we should probably do this transpose inside of forward if the user wants this as their final output. Fine for now.
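As a rough illustration of that suggestion, forward could take an opt-in flag and return batch-first hidden states itself. This is only a sketch, assuming the hidden states come out in [sequence, batch, hidden] order (as the torch.transpose(hiddens, 0, 1) call above implies); _forward_impl and the flag name are hypothetical:

def forward(self, input_ids, attention_mask, batch_first_output: bool = False):
    # Hypothetical: compute hidden states as usual, shaped [seq, batch, hidden].
    hiddens = self._forward_impl(input_ids, attention_mask)
    if batch_first_output:
        # Return [batch, seq, hidden] so callers no longer transpose outside the model.
        hiddens = hiddens.transpose(0, 1).contiguous()
    return hiddens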

Comment on lines 34 to 43
bionemo2_root: Path = (
# esm2 module's path is the most dependable --> don't expect this to change!
Path(esm2.__file__)
# This gets us from 'sub-packages/bionemo-esm2/src/bionemo/esm2/__init__.py' to 'sub-packages/bionemo-esm2'
.parent.parent.parent.parent
# From here, we want to get to the root of the repository: _before_ sub-packages/
.parent.parent
).absolute()
assert bionemo2_root != Path("/")
nemo1_checkpoint_path: Path = bionemo2_root / "models/protein/esm2nv/esm2nv_650M_converted.nemo"
Collaborator:

Consider moving this into a conftest.py file in the top level esm dir. I think @skothenhill-nv did this for geneformer but my internet is acting up so I am not sure. That would allow you to use this in other tests as we grow the package.

Collaborator Author:

@jstjohn @skothenhill-nv
Geneformer is using the same pattern here. Am I looking at the right test file?
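For reference, a minimal sketch of what such a shared fixture might look like if it were moved later; the conftest.py location, the fixture name, and the "from bionemo import esm2" import are assumptions based on the snippet above:

# Hypothetical sub-packages/bionemo-esm2/tests/conftest.py
from pathlib import Path

import pytest

from bionemo import esm2  # assumed import path for the esm2 module used above


@pytest.fixture(scope="session")
def bionemo2_root() -> Path:
    # esm2.__file__ sits at sub-packages/bionemo-esm2/src/bionemo/esm2/__init__.py;
    # six .parent hops climb from there to the repository root, above sub-packages/.
    root = Path(esm2.__file__).parent.parent.parent.parent.parent.parent.absolute()
    assert root != Path("/")
    return root

Tests that need paths like nemo1_checkpoint_path could then take bionemo2_root as a fixture argument instead of recomputing it.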

farhadrgh changed the title from "[WIP] ESM2 Golden Value Testing" to "ESM2 Golden Value Testing" on Aug 12, 2024
pstjohn (Collaborator) left a comment:

Left a few comments, but otherwise LGTM.

farhadrgh (Collaborator Author):

/build-ci

jstjohn (Collaborator) left a comment:

Can you remove the new layers if they are not needed, along with the TODO comments that make it look like they should be used? At least one of them (TorchLinear) would break tensor parallelism if a user used it.

farhadrgh (Collaborator Author):

/build-ci

(several similar /build-ci comments followed)

farhadrgh merged commit f402d11 into v2-main on Aug 14, 2024 (2 checks passed).
farhadrgh deleted the farhadr/esm2_debug branch on August 14, 2024 at 19:01.