Why do you want the first notification presented to the user? And the uring API doesn't promise there will be one or that there will be just one.
It's my understanding that a request is "done" when the final CQE is received. Can you give an example of, or some color on, how a user would interpret an intermediate CQE (one with the more bit set) if we made that available? I don't use those commands myself, but I know of at least one case where a bytes value is used to increment a total, so that when the final CQE is received, the total bytes written can be reported to the user.
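To make the accumulation case concrete, here is a minimal sketch of how a driver might fold several CQEs into one user-visible result. `Cqe`, `MultiCqe` and `on_cqe` are hypothetical names for illustration, not tokio-uring types; only `IORING_CQE_F_MORE` comes from the io_uring UAPI header.

```rust
/// Set by the kernel on a CQE when more CQEs will follow for the same request.
const IORING_CQE_F_MORE: u32 = 1 << 1;

/// Hypothetical, simplified view of a completion queue entry.
#[derive(Debug, Clone, Copy)]
struct Cqe {
    /// Bytes transferred, or a negative errno.
    res: i32,
    flags: u32,
}

/// Hypothetical accumulator for a request that completes over several CQEs.
#[derive(Debug, Default)]
struct MultiCqe {
    total: u64,
}

impl MultiCqe {
    /// Feed one CQE. Returns `None` for an intermediate CQE and
    /// `Some(result)` once the request is considered done.
    fn on_cqe(&mut self, cqe: Cqe) -> Option<Result<u64, i32>> {
        if cqe.res < 0 {
            // Simplification: report the first error immediately,
            // even if the kernel would still post further CQEs.
            return Some(Err(-cqe.res));
        }
        self.total += cqe.res as u64;
        if cqe.flags & IORING_CQE_F_MORE != 0 {
            None // intermediate CQE: keep waiting
        } else {
            Some(Ok(self.total)) // final CQE: report the running total
        }
    }
}
```

With this shape the user never sees the intermediate CQEs, only the total reported at the final one.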
Then your definition of "done" differs depending on whether you use the ZC or the non-ZC variant. Let's look at a very slow but fresh (as in, nothing has been sent yet) TCP connection.

With the non-ZC call, when do we get the result? When the buffer has been written to the in-kernel TCP socket buffers, which is immediately in our case, because there is room for our buffer.

With the ZC call, when do we get the result? Currently at a very different time. We get the result when the second CQE is received, the one that notifies us that the buffer is no longer in use. For our TCP connection that will be when the remote peer ACKs our buffer. This difference is what I am asking to eliminate. ZC should be just an optimization, not a change of semantics.

What might this look like? I propose that the ZC methods return two futures: one for the "main" CQE and a second for the buffer-free notification. The current behaviour would then be equivalent to waiting for both of them, one after the other.
That is clearly suboptimal, because the second future can take an arbitrarily long time and we want to make progress with our app. A better option is to hand it to some buffer manager, make sure to poll that manager in the main loop, and let it free the buffers.
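A rough sketch of the two-notification shape, using `std::sync::mpsc` receivers as stand-ins for the two futures so that it runs without a runtime. `ZcSend`, `FakeKernel` and this `send_zc` are hypothetical and are not the tokio-uring API; the point is only that the byte count becomes available before the buffer does.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical pair handed back by a zero-copy send.
struct ZcSend {
    /// Stand-in for the "main CQE" future: yields the byte count.
    result: Receiver<usize>,
    /// Stand-in for the "buffer free" future: yields the buffer back.
    buffer_free: Receiver<Vec<u8>>,
}

/// Hypothetical stand-in for the kernel side; it holds the buffer
/// until the notification CQE is posted.
struct FakeKernel {
    result_tx: Sender<usize>,
    free_tx: Sender<Vec<u8>>,
    buf: Option<Vec<u8>>,
}

fn send_zc(buf: Vec<u8>) -> (ZcSend, FakeKernel) {
    let (result_tx, result) = channel();
    let (free_tx, buffer_free) = channel();
    (
        ZcSend { result, buffer_free },
        FakeKernel { result_tx, free_tx, buf: Some(buf) },
    )
}

impl FakeKernel {
    /// First CQE: the data was queued on the socket.
    fn main_cqe(&self) {
        let len = self.buf.as_ref().map_or(0, |b| b.len());
        let _ = self.result_tx.send(len);
    }

    /// Second CQE: the peer ACKed, so the buffer is released.
    fn notif_cqe(&mut self) {
        if let Some(buf) = self.buf.take() {
            let _ = self.free_tx.send(buf);
        }
    }
}

/// Walks through the timeline: result first, buffer later.
fn demo() -> (usize, bool, Vec<u8>) {
    let (op, mut kernel) = send_zc(b"hello".to_vec());
    kernel.main_cqe();
    let sent = op.result.recv().unwrap(); // available right away
    let still_busy = op.buffer_free.try_recv().is_err(); // not released yet
    kernel.notif_cqe();
    let buf = op.buffer_free.recv().unwrap(); // released only now
    (sent, still_busy, buf)
}
```

Blocking on both receivers back to back corresponds to the current merged behaviour; an application that only needs the result can stop after the first one and leave the second to whoever manages buffers.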
When using the zero-copy operations SEND_ZC and SENDMSG_ZC, two CQEs are returned: one for the operation itself and a second at some later point, when the buffer is released and can be reused.
Here is a good description of it from an LWN article:
Currently tokio-uring merges the two notifications into a single future, with the negative consequence that the user-visible future completes much later for the ZC version than for the non-ZC one. This happens because the non-ZC future completes when its CQE arrives, that is, when the data has been written to the TCP buffers, for instance. For the ZC call, the second CQE (and therefore the user-visible future) completes only when all layers in the kernel are done with the buffer, which in the TCP case is when the buffer has been ACKed by the remote.
IMHO it makes sense for send[msg]_zc to return two futures: one for the first, "operation-completed" CQE, matching the non-ZC counterpart, and a second, "buffer-is-free" future for the second CQE, indicating when the buffer can be released.
The user can then use the second future to track the lifecycle of the buffer or, better, tokio-uring could provide a buffer tracker helper that owns buffers together with their corresponding "buffer-is-free" futures and frees each buffer when its future completes. It is the user's responsibility to poll the buffer tracker periodically so that it can do its work. For fixed buffers, the fixed buffer registry could become such a tracker.
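As a sketch of what such a tracker could look like, assuming a non-blocking `poll` that the application calls from its main loop: `BufferTracker` and its methods are hypothetical, and a `std::sync::mpsc` channel stands in for the "buffer-is-free" future so the example has no runtime dependency.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical tracker that owns in-flight buffers together with
/// their "buffer-is-free" signal.
struct BufferTracker {
    in_flight: Vec<(Vec<u8>, Receiver<()>)>,
    free: Vec<Vec<u8>>,
}

impl BufferTracker {
    fn new() -> Self {
        BufferTracker { in_flight: Vec::new(), free: Vec::new() }
    }

    /// Take ownership of a buffer that a zero-copy send is using.
    /// The returned sender is what the completion path would signal
    /// when the notification CQE arrives.
    fn track(&mut self, buf: Vec<u8>) -> Sender<()> {
        let (tx, rx) = channel();
        self.in_flight.push((buf, rx));
        tx
    }

    /// Non-blocking; meant to be called periodically from the main loop.
    /// Moves every released buffer to the free list and returns how many
    /// were reclaimed in this call.
    fn poll(&mut self) -> usize {
        let mut reclaimed = 0;
        let mut i = 0;
        while i < self.in_flight.len() {
            let signalled = self.in_flight[i].1.try_recv().is_ok();
            if signalled {
                let (buf, _) = self.in_flight.swap_remove(i);
                self.free.push(buf);
                reclaimed += 1;
            } else {
                // Not signalled yet (or the sender vanished):
                // conservatively keep treating the buffer as in use.
                i += 1;
            }
        }
        reclaimed
    }

    /// Hand a reclaimed buffer back for reuse, if one is available.
    fn take_free(&mut self) -> Option<Vec<u8>> {
        self.free.pop()
    }
}
```

The design choice here is that nothing ever blocks on the release notification: the send path only waits for the operation result, and buffers trickle back whenever `poll` happens to run.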
/cc @ollie-etl as original author of zerocopy support.