Releases: jina-ai/clip-as-service
💫 Patch v0.8.3
Release Note (0.8.3)
Release time: 2023-12-20 04:13:18
🙇 We'd like to thank all contributors for this new release! In particular,
Zihao Jing, Han Xiao, Nick de Silva, Ziniu Yu, Jina Dev Bot, 🙇
🐞 Bug fixes
📗 Documentation
- [ca2b25b7] - remove jina self-hosted parts (#942) (Zihao Jing)
- [6e418fe6] - replace free service docs with inference docs (#918) (Ziniu Yu)
🍹 Other Improvements
💫 Patch v0.8.2
Release Note (0.8.2)
Release time: 2023-04-19 08:23:45
🙇 We'd like to thank all contributors for this new release! In particular,
Ziniu Yu, Yang Ruiyi, YangXiuyu, Jie Fu, zawabest, Girish Chandrashekar, Jina Dev Bot, 🙇
🆕 New Features
- [cce3b05a] - set prefetch in client for traffic control (#897) (Ziniu Yu)
- [dabbe8bc] - add cn clip model (#888) (Yang Ruiyi)
- [1fe3a5a0] - add fp16 inference support (torch/onnx) (#871) (YangXiuyu)
- [1eebdd7f] - add custom tracing spans with jina>=3.12.0 (#861) (Girish Chandrashekar)
- [f2515394] - add three new open clip roberta base models (#860) (YangXiuyu)
- [e4717a35] - Integrate flash attention (#853) (YangXiuyu)
🐞 Bug fixes
- [280b925e] - fix docarray at v1 (#911) (Ziniu Yu)
- [35733a0b] - replace transform ndarray with transform blob (#910) (Ziniu Yu)
- [d70f2382] - onnx package conflict during setup (#894) (Ziniu Yu)
- [8a576c58] - install pytorch cu116 for server docker image (#882) (Ziniu Yu)
- [0b293ec8] - dynamic convert onnx model to fp16 during start session (#876) (YangXiuyu)
- [fd16e5ab] - check dtype when loading models (#872) (Ziniu Yu)
- [67f551ca] - torchvision version to avoid compatibility issue (#866) (Jie Fu)
- [0223e6fa] - add pip installable flash attention (#863) (YangXiuyu)
📗 Documentation
- [1888ef65] - fix broken link in client doc (#909) (Ziniu Yu)
- [f4eed3bc] - add link and intro to inference api (#900) (Ziniu Yu)
- [702fff88] - default model suggestion (#874) (Jie Fu)
🍹 Other Improvements
- [19b4fa51] - remove docsqa html (#899) (Ziniu Yu)
- [aa07d257] - remove docsqa (#898) (Ziniu Yu)
- [f3421f7c] - bump open-clip-torch to v2.8.0 (#883) (Ziniu Yu)
- [c7af9f71] - fix configuration file for the search flow doc (#869) (zawabest)
- [53cd0630] - hide changelog in docs (#864) (Ziniu Yu)
- [9bb7d1f4] - version: the next version will be 0.8.2 (Jina Dev Bot)
💫 Patch v0.8.1
Release Note (0.8.1)
Release time: 2022-11-15 11:15:48
This release contains 1 new feature, 1 performance improvement, 2 bug fixes and 4 documentation improvements.
🆕 Features
Allow custom callback in clip_client (#849)
This feature lets clip-client users send a request to a server and then process the response with a custom callback function. Three callbacks can be set: on_done, on_error and on_always.
The following code snippet shows how to send a request to a server and save the response to a database.
    from clip_client import Client

    db = {}

    def my_on_done(resp):
        # Store every returned document, keyed by its id
        for doc in resp.docs:
            db[doc.id] = doc

    def my_on_error(resp):
        # Append the failed response to a log file; convert it to str before writing
        with open('error.log', 'a') as f:
            f.write(str(resp) + '\n')

    def my_on_always(resp):
        print(f'{len(resp.docs)} docs processed')

    c = Client('grpc://0.0.0.0:12345')
    c.encode(
        ['hello', 'world'], on_done=my_on_done, on_error=my_on_error, on_always=my_on_always
    )
For more details, please refer to the CLIP client documentation.
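To make the roles of the three callbacks concrete, here is a minimal, self-contained sketch of how such a dispatch could work. It is not the clip_client implementation: the `dispatch` helper and its `send` argument are hypothetical, and the sketch assumes that on_done runs on success, on_error on failure, and on_always after either outcome.

```python
from typing import Any, Callable, Optional

Callback = Optional[Callable[[Any], None]]


def dispatch(
    send: Callable[[], Any],
    on_done: Callback = None,
    on_error: Callback = None,
    on_always: Callback = None,
) -> None:
    """Hypothetical dispatcher, for illustration only (not part of clip_client)."""
    try:
        resp = send()
    except Exception as exc:
        # Failure path: hand the error to on_error if one was given.
        if on_error is not None:
            on_error(exc)
        resp = exc
    else:
        # Success path: hand the response to on_done if one was given.
        if on_done is not None:
            on_done(resp)
    # on_always runs after either outcome.
    if on_always is not None:
        on_always(resp)


def failing_request():
    raise RuntimeError('boom')


events = []
dispatch(
    lambda: 'ok',
    on_done=lambda r: events.append(('done', r)),
    on_always=lambda r: events.append(('always', r)),
)
dispatch(
    failing_request,
    on_error=lambda e: events.append(('error', str(e))),
    on_always=lambda e: events.append(('always', str(e))),
)
print(events)
```

The real client may pass different objects to each callback; the CLIP client documentation remains the reference for the exact signatures.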
🚀 Performance
Integrate flash attention (#853)
We have integrated the flash attention module as a faster replacement for nn.MultiheadAttention. To take advantage of this feature, you will need to install the flash attention module manually:
    pip install git+https://github.com/HazyResearch/flash-attention.git
If flash attention is present, clip_server will automatically try to use it.
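The "use it if present" behaviour can be pictured as an optional-dependency check. The sketch below is an assumption about the general pattern, not the clip_server code: the helper names are hypothetical, and `flash_attn` is assumed to be the import name of the flash attention package.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pick_attention_backend() -> str:
    # 'flash_attn' is the assumed import name of the flash attention package.
    return 'flash' if is_installed('flash_attn') else 'default'


print(pick_attention_backend())
```

Checking with `find_spec` avoids importing a heavy GPU package just to learn whether it exists.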
The table below compares CLIP performance with and without the flash attention module. We conducted all tests on a Tesla T4 GPU and timed how long it took to encode a batch of documents 100 times.
Model | Input data | Input shape | w/o flash attention | flash attention | Speedup |
---|---|---|---|---|---|
ViT-B-32 | text | (1, 77) | 0.42692 | 0.37867 | 1.1274 |
ViT-B-32 | text | (8, 77) | 0.48738 | 0.45324 | 1.0753 |
ViT-B-32 | text | (16, 77) | 0.4764 | 0.44315 | 1.07502 |
ViT-B-32 | image | (1, 3, 224, 224) | 0.4349 | 0.40392 | 1.0767 |
ViT-B-32 | image | (8, 3, 224, 224) | 0.47367 | 0.45316 | 1.04527 |
ViT-B-32 | image | (16, 3, 224, 224) | 0.51586 | 0.50555 | 1.0204 |
Based on our experiments, performance improvements vary depending on the model and GPU. In the ViT-B-32 results above, the speedup ranges from roughly 1.02x to 1.13x.
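The Speedup column is the time without flash attention divided by the time with it. The short sketch below recomputes it from the published timings; the `speedup` helper is only illustrative.

```python
def speedup(baseline: float, optimized: float) -> float:
    """Ratio of baseline time to optimized time; values above 1 mean faster."""
    return baseline / optimized


# (input data, input shape, time without flash attention, time with flash attention)
timings = [
    ('text', (1, 77), 0.42692, 0.37867),
    ('text', (8, 77), 0.48738, 0.45324),
    ('text', (16, 77), 0.4764, 0.44315),
    ('image', (1, 3, 224, 224), 0.4349, 0.40392),
    ('image', (8, 3, 224, 224), 0.47367, 0.45316),
    ('image', (16, 3, 224, 224), 0.51586, 0.50555),
]

ratios = [speedup(base, flash) for _, _, base, flash in timings]
for (kind, shape, _, _), ratio in zip(timings, ratios):
    print(f'{kind:5s} {str(shape):18s} {ratio:.4f}')
```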
🐞 Bug Fixes
Increase timeout at startup for Executor docker images (#854)
During Executor initialization, downloading model parameters can take a long time. If a model is very large and downloads slowly, the Executor may time out before it even starts. We have increased the timeout to 3000000 ms (50 minutes).
Install transformers for Executor docker images (#851)
We have added the transformers package to Executor docker images, in order to support the multilingual CLIP model.
📗 Documentation Improvements
- Update Finetuner docs (#843)
- Add tips for client parallelism usage (#846)
- Move benchmark conclusion to beginning (#847)
- Add instructions for using clip server hosted by Jina (#848)
🤟 Contributors
We would like to thank all contributors to this release:
- Ziniu Yu (@ZiniuYu)
- Jie Fu (@jemmyshin)
- felix-wang (@numb3r3)
- YangXiuyu (@OrangeSodahub)
💫 Release v0.8.0
Release Note (0.8.0)
Release time: 2022-10-12 08:11:40
This release contains 3 new features, 1 performance improvement, and 1 documentation improvement.
🆕 Features
Support large ONNX model files (#828)
Before this release, ONNX model files were limited to 2GB. We now support large ONNX models archived as zip files, in which several smaller ONNX files store the subgraphs. As a result, we are now able to serve all of the CLIP models via onnxruntime.
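As a rough illustration of the archive idea, the sketch below unpacks a zip file and lists the ONNX files inside it using only the standard library. It is not the clip_server loader: the function name, the archive layout and the file names `part_0.onnx` and `part_1.onnx` are all hypothetical, and the files contain placeholder bytes rather than real models.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def extract_onnx_archive(archive: Path, target_dir: Path) -> List[Path]:
    """Unpack a zip archive and return the .onnx files it contained, sorted by path."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
    return sorted(target_dir.rglob('*.onnx'))


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    archive = tmp_path / 'model.zip'
    # Build a tiny stand-in archive; real archives would hold actual ONNX subgraphs.
    with zipfile.ZipFile(archive, 'w') as zf:
        zf.writestr('part_0.onnx', b'placeholder')
        zf.writestr('part_1.onnx', b'placeholder')
    parts = extract_onnx_archive(archive, tmp_path / 'unpacked')
    part_names = [p.name for p in parts]
    print(part_names)
```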
Support ViT-B-32, ViT-L-14, ViT-H-14 and ViT-g-14 trained on laion-2b (#825)
Users can now serve four new CLIP models from OpenCLIP trained on the Laion-2B dataset:
- ViT-B-32::laion2b-s34b-b79k
- ViT-L-14::laion2b-s32b-b82k
- ViT-H-14::laion2b-s32b-b79k
- ViT-g-14::laion2b-s12b-b42k
The ViT-H-14 model achieves 78.0% zero-shot top-1 accuracy on ImageNet and 73.4% on zero-shot image retrieval at Recall@5 on MS COCO. At the time of this release, it was the best-performing open-source CLIP model. To use the new models, simply specify the model name, e.g., ViT-H-14::laion2b-s32b-b79k, in the Flow YAML. For example:
    jtype: Flow
    version: '1'
    with:
      port: 51000
    executors:
      - name: clip_t
        uses:
          jtype: CLIPEncoder
          with:
            name: ViT-H-14::laion2b-s32b-b79k
          metas:
            py_modules:
              - clip_server.executors.clip_torch
Please refer to model support to see the full list of supported models.
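The model names listed above appear to follow an `architecture::pretrained-tag` pattern. The sketch below splits a name on that assumption; `ModelSpec` and `parse_model_name` are hypothetical helpers, not part of clip_server.

```python
from typing import NamedTuple, Optional


class ModelSpec(NamedTuple):
    architecture: str
    pretrained: Optional[str]


def parse_model_name(name: str) -> ModelSpec:
    """Split 'ARCH::TAG' into its parts; the tag is None when no '::' is present."""
    architecture, separator, tag = name.partition('::')
    return ModelSpec(architecture, tag if separator else None)


spec = parse_model_name('ViT-H-14::laion2b-s32b-b79k')
print(spec.architecture, spec.pretrained)
```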
In-place result in clip_client; preserve output order by uuid (#815)
The clip_client module now supports in-place embedding. This means the result of a call to the CLIP server to get embeddings is stored in the input DocumentArray, instead of creating a new DocumentArray. Consequently, the DocumentArray returned by a call to Client.encode now has the same order as the input DocumentArray.
This could cause a breaking change if code depends on Client.encode to return a new DocumentArray instance.
If you run the following code, you can verify that the input DocumentArray now contains the embeddings and that the order is unchanged.
    from docarray import DocumentArray, Document
    from clip_client import Client

    c = Client('grpc://0.0.0.0:51000')

    # Wrap the inputs in a DocumentArray so that .embeddings is available afterwards
    da = DocumentArray(
        [
            Document(text='she smiled, with pain'),
            Document(uri='apple.png'),
            Document(uri='apple.png').load_uri_to_image_tensor(),
            Document(blob=open('apple.png', 'rb').read()),
            Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
            Document(
                uri=''
            ),
        ]
    )

    # Embeddings are written into da in place
    c.encode(da)

    print(da.embeddings)
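The idea of filling results in place while keeping input order can be sketched without a server. The example below is a simplified model of the technique, not the clip_client code: the `Doc` class and `fill_in_place` helper are hypothetical, and results are matched to inputs by their `id`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    id: str
    text: str
    embedding: Optional[List[float]] = None


def fill_in_place(inputs: List[Doc], results: List[Doc]) -> None:
    """Copy each result's embedding onto the input with the same id."""
    by_id = {doc.id: doc for doc in inputs}
    for result in results:
        by_id[result.id].embedding = result.embedding


inputs = [Doc('a', 'hello'), Doc('b', 'world'), Doc('c', '!')]
# Simulate responses arriving in a different order than the inputs were sent.
results = [Doc('c', '!', [3.0]), Doc('a', 'hello', [1.0]), Doc('b', 'world', [2.0])]

fill_in_place(inputs, results)
print([doc.id for doc in inputs])
print([doc.embedding for doc in inputs])
```

Because the inputs are mutated rather than replaced, the caller's list keeps its original order no matter how the responses were batched or reordered.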
🚀 Performance
Drop image content to boost latency (#824)
Calls to Client.encode no longer return the input image with the embedding. Since embeddings are now inserted into the original DocumentArray instance, returning the image would be unnecessary network traffic. As a result, the system is now faster and more responsive. The size of the improvement depends on the image size and network bandwidth.
📗 Documentation Improvements
CLIP benchmark on zero-shot classification and retrieval tasks (#832)
We now provide benchmark information for CLIP models on zero-shot classification and retrieval tasks. This information should help users to choose the best CLIP model for their specific use-cases. For more details, please read the Benchmark page in the CLIP-as-Service User Guide.
🤟 Contributors
We would like to thank all contributors to this release:
- felix-wang (@numb3r3)
- Ziniu Yu (@ZiniuYu)
- Jie Fu (@jemmyshin)
💫 Patch v0.7.0
Release Note (0.7.0)
Release time: 2022-09-13 13:47:54
🙇 We'd like to thank all contributors for this new release! In particular,
numb3r3, felix-wang, Jie Fu, Ziniu Yu, Jina Dev Bot, 🙇
🆕 New Features
🐞 Bug fixes
- [213ecc28] - always return docarray as search result (#821) (felix-wang)
- [eca57745] - readme: use new demo server (#819) (felix-wang)
📗 Documentation
- [8d9725fb] - update clip search (#820) (felix-wang)
- [fa7e5776] - docs for retrieval (#808) (Jie Fu)
- [47144c23] - enable horizontal scrolling in wide tables (#818) (Ziniu Yu)
🍹 Other Improvements
💫 Patch v0.6.2
Release Note (0.6.2)
Release time: 2022-09-01 04:16:27
🙇 We'd like to thank all contributors for this new release! In particular,
Ziniu Yu, Jina Dev Bot, felix-wang, 🙇
🐞 Bug fixes
📗 Documentation
🍹 Other Improvements
💫 Patch v0.6.1
Release Note (0.6.1)
Release time: 2022-08-30 13:57:32
🙇 We'd like to thank all contributors for this new release! In particular,
felix-wang, Jina Dev Bot, numb3r3, 🙇
🐞 Bug fixes
🍹 Other Improvements
💫 Patch v0.6.0
Release Note (0.6.0)
Release time: 2022-08-30 04:19:21
🙇 We'd like to thank all contributors for this new release! In particular,
numb3r3, Ziniu Yu, felix-wang, Jina Dev Bot, 🙇
🆕 New Features
- [3c43eed3] - do not send blob from server when it is loaded in client (#804) (Ziniu Yu)
- [f852dfc8] - add warning if input is too large (#796) (Ziniu Yu)
- [65032f02] - encode text first when both text and uri are presented (#795) (Ziniu Yu)
🐞 Bug fixes
📗 Documentation
- [a5893c70] - update jcloud gpu usage (#809) (Ziniu Yu)
- [b4fb0dd2] - fix hub table typo (#803) (Ziniu Yu)
🍹 Other Improvements
💫 Patch v0.5.1
Release Note (0.5.1)
Release time: 2022-08-08 05:11:18
🙇 We'd like to thank all contributors for this new release! In particular,
Ziniu Yu, Jina Dev Bot, numb3r3, 🙇
🆕 New Features
📗 Documentation
🍹 Other Improvements
💫 Patch v0.5.0
Release Note (0.5.0)
Release time: 2022-08-03 05:13:06
🙇 We'd like to thank all contributors for this new release! In particular,
numb3r3, Ziniu Yu, Alex Shan, felix-wang, Sha Zhou, Jina Dev Bot, Han Xiao, 🙇
🆕 New Features
- [3402b1d1] - replace traversal_paths with access_paths (#791) (Ziniu Yu)
- [87928a7b] - update onnx models and md5 (#785) (Ziniu Yu)
- [8bd83896] - support onnx backend for openclip (#781) (felix-wang)
- [f043b4d9] - update openclip loader (#782) (Alex Shan)
- [fa62d8e9] - support openclip&mclip models + refactor model loader (#774) (Alex Shan)
- [32b11cd6] - allow model selection in client (#775) (Ziniu Yu)
- [0ff4e252] - allow credential in client (#765) (Ziniu Yu)
- [ee7da10d] - support custom onnx file and update model signatures (#761) (Ziniu Yu)
- [ed1b92d1] - docs: add qabot (#759) (Sha Zhou)
🐞 Bug fixes
- [e48a7a38] - change onnx and trt default model name to ViT-B-32::openai (#793) (Ziniu Yu)
- [8b8082a9] - mclip cuda device (#792) (felix-wang)
- [8681b88e] - fp16 inference (#790) (felix-wang)
- [ab00c2ae] - upgrade jina (#788) (felix-wang)
- [1db43b48] - no allow client to change server batch size (#787) (Ziniu Yu)
- [58772079] - add models and md5 (#783) (Ziniu Yu)
- [7c8285bb] - async progress bar does not display (#779) (Ziniu Yu)
- [79e85eed] - miscalling clip_server in clip_client (Han Xiao)
📗 Documentation
- [c67a7f59] - add model support (#784) (Alex Shan)
- [bc6b72e6] - add finetuner docs (#771) (Ziniu Yu)
- [2b78b12e] - improve model support (#768) (Ziniu Yu)