Add text to vision embedding #6282
Conversation
Signed-off-by: tangy5 <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: tangy5 <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: tangy5 <[email protected]>
for more information, see https://pre-commit.ci
Hi @wyli, it would be great if you could take a look at this PR when you get time. The PR is ready for review, but I struggled to get the flake8 check to pass, and it does not seem to be a formatting issue in the new script itself. Could you help take a look? Thank you so much, and let me know if there are any comments.
/black
Great, how can I create this "prototype"? A unit test workflow? Basically, the two PRs belong together: the network in #6283 uses the module from this PR. I plan to make these two classes reusable for most network backbones, e.g. UNet, SwinUNETR, so that if users want to use the "text embedding" with their network, it can be safely concatenated to the vision features predicted by CNN/Transformer backbones. I feel a complete unit test or an integration test would be better to show the prototype here. Thank you.
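A minimal sketch of the concatenation idea described above, assuming a 3D feature map from a backbone such as UNet or SwinUNETR; the function name and tensor shapes are illustrative, not the PR's final API:

```python
import torch

def concat_text_to_vision(vision_feat: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
    """Broadcast a text embedding spatially and concatenate it to vision features.

    vision_feat: (B, C_vis, H, W, D) feature map from a CNN/Transformer backbone.
    text_emb:    (C_txt,) embedding vector (e.g. CLIP-derived or random).
    """
    b, _, h, w, d = vision_feat.shape
    # Reshape to (1, C_txt, 1, 1, 1) and expand over batch and spatial dims.
    text_map = text_emb.view(1, -1, 1, 1, 1).expand(b, -1, h, w, d)
    # Fuse along the channel dimension: (B, C_vis + C_txt, H, W, D).
    return torch.cat([vision_feat, text_map], dim=1)
```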
Signed-off-by: tangy5 <[email protected]>
Hi @wyli, thank you for the suggestion. For the prototype you mentioned, a complete workflow and application is here: https://github.com/ljwztc/CLIP-Driven-Universal-Model. Later, I'd like to build another pipeline for a partially supervised learning workflow to showcase how to use the text embedding as a plug-and-play module. I'm thinking of this as two parts: 1. the text_embedding class, which can load pre-trained embeddings or add any text embedding to any network module. These two modules are reusable designs.
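A sketch of what such a text embedding class might look like, assuming the pre-trained CLIP text embeddings are stored on disk as a (num_classes, embed_dim) tensor; the class name, constructor arguments, and file format are all assumptions, not the module merged in this PR:

```python
import torch
from torch import nn

class TextEmbedding(nn.Module):
    """Per-class text embeddings: CLIP pre-trained (frozen) or random (learnable)."""

    def __init__(self, num_classes: int, embed_dim: int = 512, pretrained_path: str = ""):
        super().__init__()
        if pretrained_path:
            # Assumed format: a saved (num_classes, embed_dim) tensor of CLIP embeddings.
            weights = torch.load(pretrained_path, map_location="cpu")
            self.embedding = nn.Parameter(weights, requires_grad=False)
        else:
            # Random, learnable embedding table.
            self.embedding = nn.Parameter(torch.randn(num_classes, embed_dim))

    def forward(self) -> torch.Tensor:
        return self.embedding  # (num_classes, embed_dim)
```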
Signed-off-by: tangy5 <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: tangy5 <[email protected]>
@wyli, thank you so much for the suggestions, they are very helpful. Some changes have been made according to your suggestions; could you take another look? Thanks.
Signed-off-by: tangy5 <[email protected]>
Thanks @tangy5, looks good to me. There are issues with the CPU-only tests, could you please revise? (@yiheng-wang-nv knows more if you need help)
Signed-off-by: tangy5 <[email protected]>
for more information, see https://pre-commit.ci
Thanks. The test failures should be because the loaded pre-trained weights were always mapped to GPU; I modified the loading and added map_location.
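For reference, the usual fix for this kind of CPU-only test failure is passing map_location to torch.load so that tensors saved from CUDA are remapped (the file name below is a placeholder):

```python
import torch

# Without map_location, a checkpoint saved from CUDA tensors fails to load
# on a CPU-only machine; mapping to "cpu" makes the tests device-agnostic.
weights = torch.load("text_embedding_weights.pth", map_location=torch.device("cpu"))
```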
Signed-off-by: monai-bot <[email protected]>
Signed-off-by: Wenqi Li <[email protected]>
/build
This is part of the text-to-vision encoder for medical image analysis.
It supports CLIP pre-trained embeddings and random text embeddings.
Linked issue: #6177
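Putting the two sketches above together, a hypothetical end-to-end usage might look like this (all names come from the illustrative sketches in this thread, not the merged API):

```python
import torch

# Random, learnable text embeddings for 32 classes; pass pretrained_path
# instead to load frozen CLIP embeddings.
text_emb = TextEmbedding(num_classes=32, embed_dim=512)

vision_feat = torch.randn(2, 48, 16, 16, 16)  # stand-in for backbone features
fused = concat_text_to_vision(vision_feat, text_emb()[0])  # use class 0's vector
print(fused.shape)  # torch.Size([2, 560, 16, 16, 16])
```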