feat: add three new open clip roberta base models (#860)
* feat: bump openclip to v2.5.0 (#859)

* feat: bump openclip to v2.5.0

* fix: conflicts

* fix: default fp32 on cpu and fp16 on gpu (see the precision sketch below the commit log)

* feat: add two new models

* fix: remove debug

* fix: add roberta models (test)

* fix: model name xlm

* fix: (wip)

* fix: remove roberta model

* fix: model name

* fix: add :: to model name

* fix: add transformers

* fix: remove is_init_value

* fix: remove transformers

* fix: not use flash-attn on cpu

* fix: add assert description

* fix: housekeeping

* fix: cleanup

* fix: cleanup

* fix: allow to set precision

* fix: gpu-test

* fix: add roberta model test

* fix: dtype

* fix: dtype

* fix: remove optional

* fix: housekeeping

* fix: housekeeping

* fix: typo

* fix: refactor

* fix: refactor

* fix: refactor

* fix: type hint

* fix: address comments

* fix: visiontransformer

* fix: d_model and n_head

* fix: texttransformer

* fix: change model to fix oom gh action

* fix: class init and param name

* fix: black

* chore: bump open-clip version

Co-authored-by: ZiniuYu <[email protected]>
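
The precision-related commits above ("fix: default fp32 on cpu and fp16 on gpu", "fix: allow to set precision") describe a simple rule: full precision on CPU, half precision on GPU, unless the caller overrides it. A minimal sketch of that rule follows; the helper name resolve_dtype and its signature are illustrative assumptions, not the repository's actual API.

    from typing import Optional

    import torch

    # Hypothetical helper (not clip_server's real API) showing the precision
    # default described in the commit log: fp32 on CPU, fp16 on GPU, unless an
    # explicit precision is requested.
    def resolve_dtype(device: str, precision: Optional[str] = None) -> torch.dtype:
        if precision == 'fp32':
            return torch.float32
        if precision == 'fp16':
            return torch.float16
        # No explicit precision: half precision only makes sense on CUDA.
        return torch.float16 if device.startswith('cuda') else torch.float32

Called as resolve_dtype('cpu') this returns torch.float32, and resolve_dtype('cuda') returns torch.float16, matching the default described in the commit message.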
OrangeSodahub and ZiniuYu authored Nov 29, 2022
1 parent 67f551c commit f251539
Showing 7 changed files with 280 additions and 555 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -113,6 +113,7 @@ jobs:
          pip install --no-cache-dir "server/[onnx]"
          pip install --no-cache-dir "server/[transformers]"
          pip install --no-cache-dir "server/[search]"
          pip install --no-cache-dir "server/[transformers]"
      - name: Test
        id: test
        run: |
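
The CI step above installs the server package with several optional extras (onnx, transformers, search). As a rough sketch of how such extras are typically declared, assuming a setup.py layout and dependency lists that are purely illustrative (the real definitions live in the repository's own setup.py):

    from setuptools import setup, find_packages

    # Illustrative only: the name and dependency lists are assumptions, not
    # the repository's actual packaging metadata.
    setup(
        name='clip-server',
        packages=find_packages(),
        install_requires=['jina', 'open_clip_torch'],
        extras_require={
            'onnx': ['onnx', 'onnxruntime'],
            'transformers': ['transformers'],
            'search': ['annlite'],
        },
    )

With such a declaration, pip install "server/[transformers]" resolves to the base requirements plus whatever the transformers extra lists.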
3 changes: 0 additions & 3 deletions server/clip_server/model/flash_attention.py
@@ -57,9 +57,6 @@ def attention(
        key_padding_mask: a bool tensor of shape (B, S)
        """
        assert not need_weights
        assert q.dtype in [torch.float16, torch.bfloat16]
        assert q.is_cuda

        if cu_seqlens is None:
            max_s = seqlen
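
The asserts visible in this hunk reflect the hardware constraints of the flash-attention kernels: they accept only half-precision tensors that live on a CUDA device, which is also why the commit log includes "fix: not use flash-attn on cpu". A minimal sketch of such a capability check is below; the function name is hypothetical and not part of clip_server/model/flash_attention.py.

    import torch

    # Hypothetical guard (not the repository's code): flash attention needs
    # CUDA tensors in fp16/bf16, so any other input should fall back to the
    # standard scaled-dot-product attention path.
    def can_use_flash_attention(q: torch.Tensor) -> bool:
        return q.is_cuda and q.dtype in (torch.float16, torch.bfloat16)

Performing this check before dispatching to the kernel lets CPU inference fall back cleanly instead of tripping an assertion.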
