
ESM2 Golden Value Testing #85

Merged
farhadrgh merged 32 commits into v2-main from farhadr/esm2_debug on Aug 14, 2024

Conversation

farhadrgh (Collaborator)

No description provided.

farhadrgh and others added 14 commits August 7, 2024 18:09
Comment on lines 162 to 164
tokens = tokenizer.tokenizer([row[1] for row in sample_data], return_tensors="pt", padding=True)
tokens["input_ids"] = tokens["input_ids"].to(device)
tokens["attention_mask"] = tokens["attention_mask"].to(device)
Collaborator:

It would be clearer if we called this inputs or batch or something. Later, when we write **tokens, it looks for a second like we're just unpacking the tokens themselves. Also, doesn't the return object from Hugging Face allow you to call batch.to(device)? I think it's a special dictionary type that has a to method defined. If so, that would be nicer to use.
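A minimal sketch of the suggested rename (assuming tokenizer.tokenizer is a Hugging Face tokenizer whose output is a transformers BatchEncoding, which defines a to() method that moves every contained tensor at once; the name batch is just illustrative):

# Same tokenizer call as above, only renamed to batch:
batch = tokenizer.tokenizer([row[1] for row in sample_data], return_tensors="pt", padding=True)
# BatchEncoding.to() replaces the two per-key .to(device) calls:
batch = batch.to(device)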

model = esm2_650M_config_hiddens.configure_model(tokenizer).to(device)
model.eval()
hiddens = model(tokens["input_ids"], tokens["attention_mask"])
embeddings = reduce_hiddens(torch.transpose(hiddens, 0, 1).float(), tokens["attention_mask"])
Collaborator:

Someday we should probably do this transpose inside of forward if the user wants this as their final output. Fine for now.
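As a rough illustration of that suggestion, forward could take an opt-in flag and return batch-first hidden states itself. This is only a sketch, assuming the hidden states come out in [sequence, batch, hidden] order (as the torch.transpose(hiddens, 0, 1) call above implies); _forward_impl and the flag name are hypothetical:

def forward(self, input_ids, attention_mask, batch_first_output: bool = False):
    # Hypothetical: compute hidden states as usual, shaped [seq, batch, hidden].
    hiddens = self._forward_impl(input_ids, attention_mask)
    if batch_first_output:
        # Return [batch, seq, hidden] so callers no longer transpose outside the model.
        hiddens = hiddens.transpose(0, 1).contiguous()
    return hiddens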

Comment on lines 34 to 43
bionemo2_root: Path = (
# esm2 module's path is the most dependable --> don't expect this to change!
Path(esm2.__file__)
# This gets us from 'sub-packages/bionemo-esm2/src/bionemo/esm2/__init__.py' to 'sub-packages/bionemo-esm2'
.parent.parent.parent.parent
# From here, we want to get to the root of the repository: _before_ sub-packages/
.parent.parent
).absolute()
assert bionemo2_root != Path("/")
nemo1_checkpoint_path: Path = bionemo2_root / "models/protein/esm2nv/esm2nv_650M_converted.nemo"
Collaborator:

Consider moving this into a conftest.py file in the top level esm dir. I think @skothenhill-nv did this for geneformer but my internet is acting up so I am not sure. That would allow you to use this in other tests as we grow the package.

Collaborator Author:

@jstjohn @skothenhill-nv
Geneformer is using the same pattern here. Am I looking at the right test file?
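For reference, a minimal sketch of what such a shared fixture might look like if it were moved later; the conftest.py location, the fixture name, and the "from bionemo import esm2" import are assumptions based on the snippet above:

# Hypothetical sub-packages/bionemo-esm2/tests/conftest.py
from pathlib import Path

import pytest

from bionemo import esm2  # assumed import path for the esm2 module used above


@pytest.fixture(scope="session")
def bionemo2_root() -> Path:
    # esm2.__file__ sits at sub-packages/bionemo-esm2/src/bionemo/esm2/__init__.py;
    # six .parent hops climb from there to the repository root, above sub-packages/.
    root = Path(esm2.__file__).parent.parent.parent.parent.parent.parent.absolute()
    assert root != Path("/")
    return root

Tests that need paths like nemo1_checkpoint_path could then take bionemo2_root as a fixture argument instead of recomputing it.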

farhadrgh changed the title from "[WIP] ESM2 Golden Value Testing" to "ESM2 Golden Value Testing" on Aug 12, 2024
pstjohn (Collaborator) left a comment:

Left a few comments, but otherwise LGTM.

farhadrgh (Collaborator Author):

/build-ci

jstjohn (Collaborator) left a comment:

Can you remove the new layers if they are not needed, along with the TODO comments that make it look like they should be used? At least one of them (TorchLinear) would break tensor parallelism if a user used it.

farhadrgh (Collaborator Author):

/build-ci

(several similar /build-ci comments followed)

farhadrgh merged commit f402d11 into v2-main on Aug 14, 2024 (2 checks passed).
farhadrgh deleted the farhadr/esm2_debug branch on August 14, 2024 at 19:01.