docs: update hub readme #794

Merged · 3 commits · Aug 4, 2022
41 changes: 25 additions & 16 deletions .github/README-exec/onnx.readme.md
@@ -1,7 +1,7 @@
# CLIPOnnxEncoder

**CLIPOnnxEncoder** is the executor implemented in [clip-as-service](https://github.com/jina-ai/clip-as-service).
It serves OpenAI released [CLIP](https://github.com/openai/CLIP) models with ONNX runtime (🚀 **3x** speed up).
**CLIPOnnxEncoder** is the executor implemented in [CLIP-as-service](https://github.com/jina-ai/clip-as-service).
The various `CLIP` models implemented by [OpenAI](https://github.com/openai/CLIP) and [OpenCLIP](https://github.com/mlfoundations/open_clip) are supported with the ONNX runtime (🚀 **3x** speed up).
The introduction of the CLIP model [can be found here](https://openai.com/blog/clip/).

- 🔀 **Automatic**: Auto-detect image and text documents depending on their content.
@@ -11,19 +11,28 @@ The introduction of the CLIP model [can be found here](https://openai.com/blog/c

## Model support

Open AI has released 9 models so far. `ViT-B/32` is used as default model. Please also note that different model give **different size of output dimensions**.
`ViT-B-32::openai` is used as the default model. To use a specific pretrained model provided by `open_clip`, use `::` to separate the model name and the pretrained weight name, e.g. `ViT-B-32::laion2b_e16` (see the sketch after the model table below). Please also note that **different models give different output dimensions**.

| Model | ONNX | Output dimension |
|----------------|-----| --- |
| RN50 | ✅ | 1024 |
| RN101 | ✅ | 512 |
| RN50x4 | ✅ | 640 |
| RN50x16 | ✅ | 768 |
| RN50x64 | ✅ | 1024 |
| ViT-B/32 | ✅ | 512 |
| ViT-B/16 | ✅ | 512 |
| ViT-L/14 | ✅ | 768 |
| ViT-L/14@336px | ✅ | 768 |
| Model | ONNX | Output dimension |
|---------------------------------------|------|------------------|
| RN50 | ✅ | 1024 |
| RN101 | ✅ | 512 |
| RN50x4 | ✅ | 640 |
| RN50x16 | ✅ | 768 |
| RN50x64 | ✅ | 1024 |
| ViT-B-32 | ✅ | 512 |
| ViT-B-16 | ✅ | 512 |
| ViT-B-16-plus-240                     | ✅   | 640              |
| ViT-L-14 | ✅ | 768 |
| ViT-L-14@336px | ✅ | 768 |

✅ = First class support

The full list of `open_clip` models and weights can be found [here](https://github.com/mlfoundations/open_clip#pretrained-model-interface).

```{note}
For model definitions with the `-quickgelu` postfix, please use the corresponding model name without `-quickgelu`.
```
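
As a minimal sketch of the model selection described above, assuming the executor is available on Executor Hub under this name and exposes the model through a `name` parameter (as the default Flow configurations in `clip_server` suggest), a Jina Flow could be wired up like this:

```python
from jina import Flow

# hypothetical setup: pick an open_clip checkpoint via the assumed `name` parameter
f = Flow(port=51000).add(
    uses='jinahub+docker://CLIPOnnxEncoder',
    uses_with={'name': 'ViT-B-32::laion2b_e16'},
)

with f:
    f.block()  # serve until interrupted
```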

## Usage

@@ -116,7 +125,7 @@ From the output, you will see all the text and image docs have `embedding` attac
╰─────────────────────────────────────────────────────────────────╯
```

👉 Access the embedding playground in **clip-as-service** [doc](https://clip-as-service.jina.ai/playground/embedding), type sentence or image URL and see **live embedding**!
👉 Access the embedding playground in the **CLIP-as-service** [doc](https://clip-as-service.jina.ai/playground/embedding), type a sentence or an image URL and see the **live embedding**!
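
For reference, a minimal client-side sketch (assuming a server is already running at `grpc://0.0.0.0:51000`) that embeds both sentences and an image URL could look like:

```python
from clip_client import Client

c = Client('grpc://0.0.0.0:51000')  # address of your running server (assumption)

r = c.encode(
    [
        'First do it',
        'then do it right',
        'https://clip-as-service.jina.ai/_static/favicon.png',  # any reachable image URL works
    ]
)
print(r.shape)  # e.g. (3, 512) for the default ViT-B-32 model
```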

### Ranking

@@ -174,4 +183,4 @@ d = Document(
)
```

👉 Access the ranking playground in **clip-as-service** [doc](https://clip-as-service.jina.ai/playground/reasoning/). Just input the reasoning texts as prompts, the server will rank the prompts and return sorted prompts with scores.
👉 Access the ranking playground in the **CLIP-as-service** [doc](https://clip-as-service.jina.ai/playground/reasoning/). Just input the reasoning texts as prompts; the server will rank them and return the sorted prompts with scores.
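
As a rough sketch (again assuming a running server at `grpc://0.0.0.0:51000`), cross-modal ranking puts the candidate prompts into the `matches` of a query `Document` and calls `rank`:

```python
from clip_client import Client
from docarray import Document

c = Client('grpc://0.0.0.0:51000')

d = Document(
    uri='https://clip-as-service.jina.ai/_static/favicon.png',  # the query image
    matches=[
        Document(text='a colorful logo on a white background'),
        Document(text='a photo of a cat'),
        Document(text='a city skyline at night'),
    ],
)

r = c.rank([d])
for m in r[0].matches:  # matches come back sorted, best first
    print(m.text, m.scores)
```
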
47 changes: 31 additions & 16 deletions .github/README-exec/torch.readme.md
@@ -1,7 +1,7 @@
# CLIPTorchEncoder

**CLIPTorchEncoder** is the executor implemented in [clip-as-service](https://github.com/jina-ai/clip-as-service).
It serves OpenAI released [CLIP](https://github.com/openai/CLIP) models with PyTorch runtime.
**CLIPTorchEncoder** is the executor implemented in [CLIP-as-service](https://github.com/jina-ai/clip-as-service).
The various `CLIP` models implemented by [OpenAI](https://github.com/openai/CLIP), [OpenCLIP](https://github.com/mlfoundations/open_clip), and [MultilingualCLIP](https://github.com/FreddeFrallan/Multilingual-CLIP) are supported with the PyTorch runtime.
The introduction of the CLIP model [can be found here](https://openai.com/blog/clip/).

- 🔀 **Automatic**: Auto-detect image and text documents depending on their content.
@@ -12,19 +12,34 @@ With advances of ONNX runtime, you can use `CLIPOnnxEncoder` (see [link](https:/

## Model support

Open AI has released **9 models** so far. `ViT-B/32` is used as default model. Please also note that different models give **the different sizes of output dimensions**.
`ViT-B-32::openai` is used as the default model. To use a specific pretrained model provided by `open_clip`, use `::` to separate the model name and the pretrained weight name, e.g. `ViT-B-32::laion2b_e16` (a sketch follows the model tables below). Please also note that **different models give different output dimensions**.

| Model | PyTorch | Output dimension |
|---------------------------------------|---------|------------------|
| RN50 | ✅ | 1024 |
| RN101 | ✅ | 512 |
| RN50x4 | ✅ | 640 |
| RN50x16 | ✅ | 768 |
| RN50x64 | ✅ | 1024 |
| ViT-B-32 | ✅ | 512 |
| ViT-B-16 | ✅ | 512 |
| ViT-B-16-plus-240                     | ✅      | 640              |
| ViT-L-14 | ✅ | 768 |
| ViT-L-14@336px | ✅ | 768 |
| M-CLIP/XLM_Roberta-Large-Vit-B-32 | ✅ | 512 |
| M-CLIP/XLM-Roberta-Large-Vit-L-14 | ✅ | 768 |
| M-CLIP/XLM-Roberta-Large-Vit-B-16Plus | ✅ | 640 |
| M-CLIP/LABSE-Vit-L-14 | ✅ | 768 |

✅ = First class support


The full list of `open_clip` models and weights can be found [here](https://github.com/mlfoundations/open_clip#pretrained-model-interface).

```{note}
For model definitions with the `-quickgelu` postfix, please use the corresponding model name without `-quickgelu`.
```

| Model | PyTorch | Output dimension |
|----------------|---------|------------------|
| RN50 | ✅ | 1024 |
| RN101 | ✅ | 512 |
| RN50x4 | ✅ | 640 |
| RN50x16 | ✅ | 768 |
| RN50x64 | ✅ | 1024 |
| ViT-B/32 | ✅ | 512 |
| ViT-B/16 | ✅ | 512 |
| ViT-L/14 | ✅ | 768 |
| ViT-L/14@336px | ✅ | 768 |

## Usage

@@ -118,7 +133,7 @@ From the output, you will see all the text and image docs have `embedding` attac
╰─────────────────────────────────────────────────────────────────╯
```

👉 Access the embedding playground in **clip-as-service** [doc](https://clip-as-service.jina.ai/playground/embedding), type sentence or image URL and see **live embedding**!
👉 Access the embedding playground in the **CLIP-as-service** [doc](https://clip-as-service.jina.ai/playground/embedding), type a sentence or an image URL and see the **live embedding**!
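
A minimal `DocumentArray`-based sketch, assuming a running server and that `encode` also accepts a `DocumentArray` and fills in the embeddings, could be:

```python
from clip_client import Client
from docarray import Document, DocumentArray

c = Client('grpc://0.0.0.0:51000')  # address of your running server (assumption)

docs = DocumentArray(
    [
        Document(text='Ein Hund spielt im Schnee'),  # non-English text, for a multilingual model
        Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
    ]
)

c.encode(docs)
print(docs.embeddings.shape)  # (2, output_dimension)
```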

### Ranking

@@ -176,4 +191,4 @@ d = Document(
)
```

👉 Access the ranking playground in **clip-as-service** [doc](https://clip-as-service.jina.ai/playground/reasoning/). Just input the reasoning texts as prompts, the server will rank the prompts and return sorted prompts with scores.
👉 Access the ranking playground in the **CLIP-as-service** [doc](https://clip-as-service.jina.ai/playground/reasoning/). Just input the reasoning texts as prompts; the server will rank them and return the sorted prompts with scores.
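
To read the sorted prompts back, a sketch along these lines should work (the exact score key is an assumption; inspect `scores` on a returned match to confirm it):

```python
from clip_client import Client
from docarray import Document

c = Client('grpc://0.0.0.0:51000')

d = Document(
    uri='https://clip-as-service.jina.ai/_static/favicon.png',
    matches=[Document(text=p) for p in ['a logo', 'a cat', 'a skyline at night']],
)

r = c.rank([d])
for m in r[0].matches:  # already sorted, most relevant prompt first
    # 'clip_score' is assumed to be the score key; print m.scores to see what the server returns
    print(m.text, m.scores)
```
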
2 changes: 1 addition & 1 deletion client/setup.py
@@ -5,7 +5,7 @@
from setuptools import setup

if sys.version_info < (3, 7, 0):
raise OSError(f'clip-as-service requires Python >=3.7, but yours is {sys.version}')
raise OSError(f'CLIP-as-service requires Python >=3.7, but yours is {sys.version}')

try:
pkg_name = 'clip-client'
2 changes: 1 addition & 1 deletion scripts/benchmark.py
@@ -30,7 +30,7 @@ def __init__(
**kwargs,
):
"""
@param server: the clip-as-service server URI
@param server: the CLIP-as-service server URI
@param batch_size: number of batch sample
@param num_iter: number of repeat run per experiment
@param image_sample: uri of the test image
5 changes: 5 additions & 0 deletions server/clip_server/model/clip_onnx.py
@@ -146,6 +146,11 @@
('ViT-L-14@336px/textual.onnx', '78fab479f136403eed0db46f3e9e7ed2'),
('ViT-L-14@336px/visual.onnx', 'f3b1f5d55ca08d43d749e11f7e4ba27e'),
),
# MultilingualCLIP models
# 'M-CLIP/LABSE-Vit-L-14': (
# ('M-CLIP-LABSE-Vit-L-14/textual.onnx', 'b5b649f9e064457c764874e982bca296'),
# ('M-CLIP-LABSE-Vit-L-14/visual.onnx', '471951562303c9afbb804b865eedf149'),
# ),
}


2 changes: 1 addition & 1 deletion server/setup.py
@@ -5,7 +5,7 @@
from setuptools import setup

if sys.version_info < (3, 7, 0):
raise OSError(f'clip-as-service requires Python >=3.7, but yours is {sys.version}')
raise OSError(f'CLIP-as-service requires Python >=3.7, but yours is {sys.version}')

try:
pkg_name = 'clip-server'