feat: do not send blob from server when it is loaded in client #804
Conversation
Codecov Report
@@ Coverage Diff @@
## main #804 +/- ##
==========================================
+ Coverage 81.52% 84.10% +2.57%
==========================================
Files 21 21
Lines 1440 1466 +26
==========================================
+ Hits 1174 1233 +59
+ Misses 266 233 -33
Force-pushed from 6edbfde to 4d61e28
client/clip_client/client.py (Outdated)

@@ -119,6 +121,11 @@ def encode(self, content, **kwargs):
            **self._get_post_payload(content, kwargs),
            on_done=partial(self._gather_result, results=results),
        )

        for c in content:
            if isinstance(c, Document) and c.tags.pop('__loaded_by_CAS__', False):
Suggested change:
- if isinstance(c, Document) and c.tags.pop('__loaded_by_CAS__', False):
+ if hasattr(c, 'tags') and c.tags.pop('__loaded_by_CAS__', False):
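For reference, a minimal sketch of the client-side bookkeeping this hunk implements. The helper names and the docarray `load_uri_to_blob()` call are illustrative assumptions, not the PR's actual code: when the client itself loads a blob from a uri, it marks the Document with a private tag so that, after posting, it can strip the blob again and leave the caller's input unmodified.

```python
from docarray import Document


def _mark_and_load(doc: Document) -> Document:
    # If the user supplied only a uri, load the bytes on the client side and
    # remember that *we* added the blob (hypothetical helper, for illustration).
    if doc.uri and not doc.blob:
        doc.load_uri_to_blob()                # assumed docarray helper
        doc.tags['__loaded_by_CAS__'] = True
    return doc


def _restore_inputs(content) -> None:
    # After posting, drop only the blobs the client added; pop() also removes
    # the marker, so user-set tags are left untouched.
    for c in content:
        if isinstance(c, Document) and c.tags.pop('__loaded_by_CAS__', False):
            c.blob = None
```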
This is a breaking change.
The benefits of the change above:
- reduces the memory footprint (to address OOM when processing a large number of documents);
- leaves the user's inputs unmodified.
A better long-term approach would be to support in-place updates in encode(docs).
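As a purely hypothetical illustration of that idea (not part of this PR), an in-place `encode(docs)` would copy only the embeddings from the response back onto the caller's own Documents, so neither blobs nor duplicated Documents need to travel back:

```python
from docarray import DocumentArray


def _update_in_place(original_docs: DocumentArray, response_docs: DocumentArray) -> None:
    # Copy only the (small) embedding field back; everything else stays exactly
    # as the caller created it.
    for src, dst in zip(response_docs, original_docs):
        dst.embedding = src.embedding
```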
Please add a unit test to check the server-side behavior.
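One possible shape for such a test, as a sketch only (the `flow` fixture, endpoint, and file name are assumptions, not the repo's actual test setup):

```python
from docarray import Document, DocumentArray


def test_server_does_not_return_blob(flow):  # 'flow' is an assumed pytest fixture serving the encoder
    da = DocumentArray([Document(uri='test.jpg')])
    with flow:
        results = flow.post(on='/', inputs=da)
    for d in results:
        assert d.embedding is not None  # embeddings are returned
        assert not d.blob               # blobs are not echoed back by the server
```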
LGTM
If the user encodes a document that has a uri but no blob, the current implementation loads the blob in the client and sends it to the server.
This PR saves bandwidth by not sending the blob back from the server.
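Conceptually, the server-side half could look like the sketch below, assuming a jina Executor with an illustrative class name (this is not the actual clip_server code): embeddings are computed as usual, and the large image bytes are cleared before the response is returned.

```python
from jina import Executor, requests
from docarray import DocumentArray


class CLIPEncoderSketch(Executor):
    @requests
    def encode(self, docs: DocumentArray, **kwargs):
        # ... compute embeddings into each doc.embedding ...
        for d in docs:
            d.blob = None  # strip the image bytes so they are not sent back
```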
I ran some tests to measure the benefit of this change.
Server config: default minibatch_size=32, model: ViT-L/14@336px, replicas: 3
Client config: default batch_size=8, input DocumentArray of 80 Documents, each with a URI pointing to an image of ~1.5 MB
Before: send 90 MB, recv 90 MB, takes 3-13 min (server hosted on a Beijing GPU: 3-7 min / a Berlin GPU: 6-13 min)
After: send 90 MB, recv 250 KB, takes 1-3 min
In general, the time varies because of remote network communication, but the received payload size drops substantially.