Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in checksum extension after cancelled requests #63

Closed
mrandrewer opened this issue May 21, 2019 · 12 comments
Closed

Error in checksum extension after cancelled requests #63

mrandrewer opened this issue May 21, 2019 · 12 comments
Assignees
Labels

Comments

@mrandrewer
Copy link

Checksum verification does not occur if request was cancelled, this leads to verification error on next chunk upload.

Steps to reproduce with TusDiskStore:

  1. Abort request (cancellation is requested in PatchHandler)
  2. All received data is stored, but chunkComplete is not set and chunkStart retains current value
  3. Resume upload
  4. ChunkStart is not updated and still points to offset from previous request
  5. After chunk is completed metadata extension uses old offset from ChunkStart for calculation
  6. Checksums do not match.

Protocol states that "The Server SHOULD always attempt to store as much of the received data as possible.", but it seems more viable to discard uncompleted chunk when using checksum extension.

@smatsson
Copy link
Collaborator

This behavior is by design and by the specification.

"Once the entire request has been received, the Server MUST verify the uploaded chunk against the provided checksum using the specified algorithm." (https://tus.io/protocols/resumable-upload.html#checksum). If the client disconnects, then the entire request has not been received.

Doing it the way you propose would break resumability of all uploads that use the checksum extension as the checksum will never match if the client disconnects.

I don't really follow your steps. Number 6 should not happen if you follow the protocol and continue the upload (chunked or not) from the server's offset. Do you have a reproducible example where this happens?

@mrandrewer
Copy link
Author

Hello,
sorry if my explanation was unclear.

Doing it the way you propose would break resumability of all uploads that use the checksum extension as the checksum will never match if the client disconnects.

This is the exact problem i get. Current implementation does not discard uncompleted request data when cancellationToken raises cancel flag:

  • TusDiskStore stops buffering data at line 110 and flushes buffer to disk
  • WriteFileHandler exits at line 82

Data, that is already written stays on disk and checksum extension method is not invoked.

I don't have an example to illustrate problem on hand (we use custom store), but will try to construct something.

@smatsson
Copy link
Collaborator

The thing is that it's not supposed to discard data if the client disconnects. Basically the sequence of events is something like below. Remember that "chunks" have no special meaning in tus. It is just a workaround for proxies, hardware limitations etc.

So let's say we have a 100 MB file that we split into 1 MB chunks. The client would split the file in 100 chunks, calculate the checksum of each chunk and then start the uploading.

Client: Here is my first chunk with checksum X.
Server: Perfect thanks! The chunk has been saved and the checksum is OK.
Client: Here is my second chu... [disconnected]
Server: OK, client disconnected. I saved 90% of the chunk.
Client: Hi it's me again! What's the offset of my file?
Server: It's 1.9 MB
Client: Hmm... yeah that's the second chunk and 90% in. Ok got it! Here's the rest of the data + the checksum for chunk 2.
Server: Thanks! The chunk has been saved and the checksum is OK.
[continue until all chunks are done]

Checksums would basically be worthless if the data was discarded at each disconnect as there will never be a checksum that matches in that case. It would just be far easier to say that all client disconnects should discard the last chunk, which is not the case.

@smatsson
Copy link
Collaborator

After reading the spec again I find some of the parts regarding chunks and checksum to be a bit contradicting so I have created a issue over at the protocol's github: tus/tus-resumable-upload-protocol#143

@mrandrewer
Copy link
Author

mrandrewer commented May 30, 2019

Hello,
Our client soft uses tusdotnetclient package. It does not save state and just continues from new offset like this:
...
Client: Hi it's me again! What's the offset of my file?
Server: It's 1.9 MB
Client: Hmm... here is new 1 MB chunk starting from 1.9 MB offset + checksum
Server: Thanks! But the chunk has not been saved, sice your checksum for data in range 1MB-2.9MB is not OK.

It seems correct for last (uncompleted) chunk to be discarded when checksum extension is in use. It makes sense, since HEAD response does not contain any data to sync client and server in this regard. And it is possible that upload can be continued from other client instance (unexpected shutdown), or after web page reload.
Thank you!

@smatsson
Copy link
Collaborator

smatsson commented Jun 1, 2019

I get your argument and can agree to a certain degree. I will have to wait for an answer in the linked issue before taking actions on this issue as it might break stuff for other clients. If you need to move forward I would suggest disabling checksums on your server as checksums bring very little (close to nothing IMO) to the table if the server runs HTTPS. HTTPS already have a checksum implementation of sorts since it's encrypted. If the data is modified during transfer the server will not be able to decrypt the data and hence the request will fail.

Disable checksums by removing the UploadChecksum header from the request:

app.Use((httpContext, next) =>
{
    httpContext.Request.Headers.Remove(tusdotnet.Constants.HeaderConstants.UploadChecksum);
    return next();
});

app.UseTus(httpContext => ...);

Side note: I looked at the client you're using. It's unfortunate that it forces you to use chunks as you are missing the key feature of tus, namely streamed uploads, where only a single HTTP request is used. Not much to do about it though (except writing your own ;) ).

@smatsson
Copy link
Collaborator

smatsson commented Jun 1, 2019

Confirmed as a bug by the Tus team. I will make a patch and release a new version as soon as possible. Until then, use the work around in the above comment.

@smatsson smatsson self-assigned this Jun 1, 2019
@smatsson smatsson added the bug label Jun 1, 2019
@mrandrewer
Copy link
Author

mrandrewer commented Jun 4, 2019

Hello!
Thank you for clarifying proper request flow on my behalf!

Side note: I looked at the client you're using. It's unfortunate that it forces you to use chunks as you are missing the key feature of tus, namely streamed uploads, where only a single HTTP request is used. Not much to do about it though (except writing your own ;) ).

Unfortunately, we have hardware VPN with GOST encryption somewhere in the middle, and large requests are just dropped. So chunks are a nessisity.

Disable checksums by removing the UploadChecksum header from the request:

For now, I made a dirty fix, moving checksum calculation before cancellation token check in WriteFileHandler

var checksumMatched = await MatchChecksum();
if (guardedStream.CancellationToken.IsCancellationRequested) {
    return;
}
// other checks
if (!checksumMatched) {
    await Response.Error((HttpStatusCode)460, "Header Upload-Checksum does not match the checksum of the file");
    return;
}

Confirmed as a bug by the Tus team. I will make a patch and release a new version as soon as possible. Until then, use the work around in the above comment.

Looking forward to it. Thank you againg for your time!

@smatsson
Copy link
Collaborator

smatsson commented Jun 4, 2019

Thanks for your patience on this issue! I have pushed a fix now and a new version should be on nuget shortly. :)

Unfortunately, we have hardware VPN with GOST encryption somewhere in the middle, and large requests are just dropped. So chunks are a nessisity.

I see! Doing chunks without the checksum extension would give you slightly better resumability as the last chunk does not need to be discarded but could be continued where it was left of. Not sure how much of an improvement that would be though if you are already using fairly small chunks.

For now, I made a dirty fix, moving checksum calculation before cancellation token check in WriteFileHandler

My solution is actually close to that so I wouldn't call it a dirty fix ;)

Looking forward to it. Thank you againg for your time!

No worries! Thanks for using tusdotnet!

@smatsson
Copy link
Collaborator

smatsson commented Jun 4, 2019

@smatsson
Copy link
Collaborator

smatsson commented Jun 7, 2019

@mrandrewer Feel free to close this issue once you have confirmed that it's working in your setup.

@mrandrewer
Copy link
Author

mrandrewer commented Jun 10, 2019

Hello!
Everything works fine so far, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants