feat: add three new open clip roberta base models (#860)
* feat: bump openclip to v2.5.0 (#859)

* feat: bump openclip to v2.5.0

* fix: conflicts

* fix: default fp32 on cpu and fp16 on gpu (see the precision sketch below the commit log)

* feat: add two new models

* fix: remove debug

* fix: add roberta models (test)

* fix: model name xlm

* fix: (wip)

* fix: remove roberta model

* fix: model name

* fix: add :: to model name

* fix: add transformers

* fix: remove is_init_value

* fix: remove transformers

* fix: not use flash-attn on cpu

* fix: add assert description

* fix: housekeeping

* fix: cleanup

* fix: cleanup

* fix: allow to set precision

* fix: gpu-test

* fix: add roberta model test

* fix: dtype

* fix: dtype

* fix: remove optional

* fix: housekeeping

* fix: housekeeping

* fix: typo

* fix: refactor

* fix: refactor

* fix: refactor

* fix: type hint

* fix: address comments

* fix: visiontransformer

* fix: d_model and n_head

* fix: texttransformer

* fix: change model to fix oom gh action

* fix: class init and param name

* fix: black

* chore: bump open-clip version

Co-authored-by: ZiniuYu <[email protected]>
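
The precision-related commits above ("fix: default fp32 on cpu and fp16 on gpu", "fix: allow to set precision") describe a simple rule: full precision on CPU, half precision on GPU, unless the caller overrides it. A minimal sketch of that rule follows; the helper name resolve_dtype and its signature are illustrative assumptions, not the repository's actual API.

    from typing import Optional

    import torch

    # Hypothetical helper (not clip_server's real API) showing the precision
    # default described in the commit log: fp32 on CPU, fp16 on GPU, unless an
    # explicit precision is requested.
    def resolve_dtype(device: str, precision: Optional[str] = None) -> torch.dtype:
        if precision == 'fp32':
            return torch.float32
        if precision == 'fp16':
            return torch.float16
        # No explicit precision: half precision only makes sense on CUDA.
        return torch.float16 if device.startswith('cuda') else torch.float32

Called as resolve_dtype('cpu') this returns torch.float32, and resolve_dtype('cuda') returns torch.float16, matching the default described in the commit message.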
OrangeSodahub and ZiniuYu authored Nov 29, 2022
1 parent 67f551c commit f251539
Showing 7 changed files with 280 additions and 555 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -113,6 +113,7 @@ jobs:
          pip install --no-cache-dir "server/[onnx]"
          pip install --no-cache-dir "server/[transformers]"
          pip install --no-cache-dir "server/[search]"
          pip install --no-cache-dir "server/[transformers]"
      - name: Test
        id: test
        run: |
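
The CI step above installs the server package with several optional extras (onnx, transformers, search). As a rough sketch of how such extras are typically declared, assuming a setup.py layout and dependency lists that are purely illustrative (the real definitions live in the repository's own setup.py):

    from setuptools import setup, find_packages

    # Illustrative only: the name and dependency lists are assumptions, not
    # the repository's actual packaging metadata.
    setup(
        name='clip-server',
        packages=find_packages(),
        install_requires=['jina', 'open_clip_torch'],
        extras_require={
            'onnx': ['onnx', 'onnxruntime'],
            'transformers': ['transformers'],
            'search': ['annlite'],
        },
    )

With such a declaration, pip install "server/[transformers]" resolves to the base requirements plus whatever the transformers extra lists.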
3 changes: 0 additions & 3 deletions server/clip_server/model/flash_attention.py
@@ -57,9 +57,6 @@ def attention(
        key_padding_mask: a bool tensor of shape (B, S)
        """
        assert not need_weights
        assert q.dtype in [torch.float16, torch.bfloat16]
        assert q.is_cuda

        if cu_seqlens is None:
            max_s = seqlen
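
The asserts visible in this hunk reflect the hardware constraints of the flash-attention kernels: they accept only half-precision tensors that live on a CUDA device, which is also why the commit log includes "fix: not use flash-attn on cpu". A minimal sketch of such a capability check is below; the function name is hypothetical and not part of clip_server/model/flash_attention.py.

    import torch

    # Hypothetical guard (not the repository's code): flash attention needs
    # CUDA tensors in fp16/bf16, so any other input should fall back to the
    # standard scaled-dot-product attention path.
    def can_use_flash_attention(q: torch.Tensor) -> bool:
        return q.is_cuda and q.dtype in (torch.float16, torch.bfloat16)

Performing this check before dispatching to the kernel lets CPU inference fall back cleanly instead of tripping an assertion.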
