feat: support custom onnx file and update model signatures #761
Conversation
Codecov Report
@@            Coverage Diff             @@
##             main     #761      +/-   ##
==========================================
+ Coverage   78.31%   79.03%   +0.71%
==========================================
  Files          17       17
  Lines        1213     1240      +27
==========================================
+ Hits          950      980      +30
+ Misses        263      260       -3
server/clip_server/model/clip.py
@@ -335,6 +336,7 @@ def tokenize(
     eot_token = _tokenizer.encoder['<|endoftext|>']
     all_tokens = [[sot_token] + _tokenizer.encode(text) + [eot_token] for text in texts]
     result = torch.zeros(len(all_tokens), context_length, dtype=torch.long)
+    attention_masks = torch.zeros(len(all_tokens), context_length, dtype=torch.long)
I think we need to support variable input length here.
this is blocked
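For context, a minimal sketch of the variable-length masking suggested in the comment above. The helper name `build_attention_masks` and the truncation policy are illustrative assumptions, not the PR's final implementation:

```python
import torch

def build_attention_masks(all_tokens, context_length=77):
    # Hypothetical helper: pad/truncate each token sequence to
    # context_length and record which positions hold real tokens.
    result = torch.zeros(len(all_tokens), context_length, dtype=torch.long)
    attention_masks = torch.zeros(len(all_tokens), context_length, dtype=torch.long)
    for i, tokens in enumerate(all_tokens):
        if len(tokens) > context_length:
            tokens = tokens[:context_length]  # assumed truncation policy
        result[i, : len(tokens)] = torch.tensor(tokens)
        attention_masks[i, : len(tokens)] = 1  # 1 = real token, 0 = padding
    return result, attention_masks
```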
📝 Docs are deployed on https://ft-improve_onnx--jina-docs.netlify.app 🎉
This PR allows the user to set a file path to a pretrained custom ONNX model, and unifies the signatures of the TensorRT, ONNX, and Hugging Face CLIP runtimes.
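A hypothetical usage sketch (the class name `CLIPOnnxModel` matches the repo, but the parameter names shown here are assumptions for illustration, not necessarily the merged API):

```python
# Assumed usage; exact parameter names may differ from the merged PR.
from clip_server.model.clip_onnx import CLIPOnnxModel

model = CLIPOnnxModel(
    name='ViT-B/32',                        # pretrained model identifier
    model_path='/path/to/custom-clip.onnx', # custom ONNX file (assumed parameter)
)
```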
We also add MD5 verification to make sure users download the latest, correct models.
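A minimal sketch of the kind of MD5 check described (the helper name is hypothetical):

```python
import hashlib

def md5_matches(file_path: str, expected_md5: str) -> bool:
    # Stream the file in chunks so large model weights never sit
    # fully in memory; compare the digest against the published hash.
    h = hashlib.md5()
    with open(file_path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    return h.hexdigest() == expected_md5
```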
Since TensorRT models are extremely sensitive to their build environment and dependency versions, we no longer host prebuilt TensorRT models; instead, the runtime first downloads the latest ONNX model, converts it to a TensorRT engine at runtime, and saves the engine for later use.
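A sketch of that download-convert-cache flow, assuming TensorRT >= 8 (the function name and caching scheme are illustrative, not the PR's exact code):

```python
import os

def get_trt_engine(onnx_path: str, engine_path: str) -> bytes:
    # Reuse a previously built engine if one is cached on disk;
    # otherwise build one from the downloaded ONNX model and save it.
    import tensorrt as trt

    if os.path.exists(engine_path):
        with open(engine_path, 'rb') as f:
            return f.read()

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, 'rb') as f:
        if not parser.parse(f.read()):
            raise RuntimeError('failed to parse the ONNX model')
    config = builder.create_builder_config()
    engine_bytes = builder.build_serialized_network(network, config)
    with open(engine_path, 'wb') as f:
        f.write(engine_bytes)
    return bytes(engine_bytes)
```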
TODO: