feat: add three new open clip roberta base models #860

Merged
merged 34 commits into from
Nov 29, 2022
Changes from 26 commits
Commits
34 commits
5524f46
feat: bump openclip to v2.5.0 (#859)
OrangeSodahub Nov 16, 2022
06ec06f
fix: remove roberta model
OrangeSodahub Nov 16, 2022
7fcb813
fix: model name
OrangeSodahub Nov 16, 2022
c4beeca
fix: add :: to model name
OrangeSodahub Nov 16, 2022
5fbfb57
fix: add transformers
OrangeSodahub Nov 16, 2022
d5dd1ce
fix: remove is_init_value
OrangeSodahub Nov 16, 2022
d937f13
fix: remove transformers
OrangeSodahub Nov 16, 2022
fe2745b
fix: not use flash-attn on cpu
OrangeSodahub Nov 17, 2022
7ddd51b
fix: add assert description
OrangeSodahub Nov 17, 2022
cce60df
fix: housekeeping
ZiniuYu Nov 17, 2022
1ef019d
Merge branch 'bump-openclip-v2.50' of https://github.com/jina-ai/clip…
ZiniuYu Nov 17, 2022
a93d738
fix: cleanup
OrangeSodahub Nov 17, 2022
cc3514d
fix: cleanup
OrangeSodahub Nov 17, 2022
1e94a6e
fix: allow to set precision
OrangeSodahub Nov 18, 2022
0d2f871
fix: gpu-test
OrangeSodahub Nov 18, 2022
5b0a65a
fix: add roberta model test
OrangeSodahub Nov 18, 2022
43b7a31
fix: dtype
OrangeSodahub Nov 18, 2022
9efb369
fix: dtype
OrangeSodahub Nov 18, 2022
848f0aa
fix: remove optional
OrangeSodahub Nov 18, 2022
b247228
fix: housekeeping
ZiniuYu Nov 18, 2022
b1fc4d6
fix: housekeeping
ZiniuYu Nov 18, 2022
5478afc
fix: typo
ZiniuYu Nov 20, 2022
ba3aa44
fix: refactor
ZiniuYu Nov 21, 2022
2b7af82
fix: refactor
ZiniuYu Nov 21, 2022
06ad96a
fix: refactor
ZiniuYu Nov 21, 2022
8ef2090
fix: type hint
ZiniuYu Nov 21, 2022
8a36b8a
fix: address comments
ZiniuYu Nov 21, 2022
cf9595d
fix: visiontransformer
OrangeSodahub Nov 21, 2022
e43990b
fix: d_model and n_head
OrangeSodahub Nov 21, 2022
1bf83ad
fix: texttransformer
OrangeSodahub Nov 21, 2022
f918d65
fix: change model to fix oom gh action
ZiniuYu Nov 21, 2022
d910ee8
fix: class init and param name
ZiniuYu Nov 21, 2022
423f788
fix: black
ZiniuYu Nov 21, 2022
04c130a
chore: bump open-clip version
ZiniuYu Nov 21, 2022
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -113,6 +113,7 @@ jobs:
pip install --no-cache-dir "server/[onnx]"
pip install --no-cache-dir "server/[transformers]"
pip install --no-cache-dir "server/[search]"
pip install --no-cache-dir "server/[transformers]"
- name: Test
id: test
run: |
9 changes: 6 additions & 3 deletions server/clip_server/model/flash_attention.py
@@ -57,9 +57,12 @@ def attention(
key_padding_mask: a bool tensor of shape (B, S)

"""
- assert not need_weights
- assert q.dtype in [torch.float16, torch.bfloat16]
- assert q.is_cuda
+ assert not need_weights, "not allowed to return weights."
+ assert q.dtype in [
+     torch.float16,
+     torch.bfloat16,
+ ], f"flash attention only support torch.float16 or torch.bfloat16 but got {q.dtype}."
+ assert q.is_cuda, "flash attention only support cuda."
OrangeSodahub marked this conversation as resolved.
Member

I would suggest removing these asserts. They make the code safer, but they also degrade performance a bit.
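A minimal sketch of one way to weigh the trade-off raised in this comment: Python drops `assert` statements when code is compiled with optimization (as with `python -O`), so the checks can stay in the source without costing anything in an optimized run. The function `attention_guard` and its string/bool arguments are hypothetical stand-ins for the real tensor checks, not code from this PR.

```python
# Hypothetical guard mirroring the dtype/device asserts in the diff above.
SOURCE = '''
def attention_guard(dtype_name, is_cuda):
    assert dtype_name in ("float16", "bfloat16"), f"unsupported dtype {dtype_name}"
    assert is_cuda, "CUDA tensor required"
    return "ok"
'''


def load_guard(optimize):
    """Compile SOURCE at the given optimization level and return the function."""
    namespace = {}
    exec(compile(SOURCE, "<guard>", "exec", optimize=optimize), namespace)
    return namespace["attention_guard"]


checked = load_guard(optimize=0)    # asserts are kept
unchecked = load_guard(optimize=1)  # asserts are stripped, as under `python -O`
```

With `checked`, an unsupported input raises `AssertionError`; with `unchecked`, the same call passes straight through, so the safety of the checks depends on how the server process is launched.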

Member

What's more, judging from the function's seq_len parameter, it seems that the flash-attention implementation can only be used for the text encoder. Can it also be applied to a vision transformer?

Contributor Author

Yes, it can. Every image tensor is first converted to a sentence-like tensor (a sequence of tokens) before it is fed into the model.
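To illustrate the reply above, here is a rough sketch of how an image becomes a "sentence-like" sequence. It uses plain Python lists and a single-channel image to stay dependency-free; the helper `patchify` is hypothetical, and a real vision transformer would additionally apply a learned projection to each patch and typically prepend a class token.

```python
def patchify(image, patch):
    """Split an H x W grid (list of rows) into a sequence of flattened patches."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    sequence = []
    # Walk the image patch by patch, row-major, flattening each patch to a vector.
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            sequence.append(
                [
                    image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)
                ]
            )
    return sequence


# A 4x4 "image" with 2x2 patches yields a sequence of 4 tokens, each of length 4.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = patchify(image, 2)
```

The resulting `tokens` list has the same (sequence length, feature) layout as a tokenized sentence after embedding, which is why a sequence-based attention kernel can in principle serve both encoders.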


if cu_seqlens is None:
max_s = seqlen