Failed to upload the chunk / connection timed out errors using Azure #361

Closed
Kyle2123 opened this issue Feb 15, 2018 · 27 comments
Comments

@Kyle2123

While running the initial backup from macOS and Linux computers running 2.0.10 to Azure storage, the process eventually fails with a message like this:

Failed to upload the chunk f041b71f8ae1e436ea2e6ce647c0491426a3b146bd2538308cf186fcea5d3e48: Put https://[my storage account].blob.core.windows.net/[my storage container]/chunks/f0/41b71f8ae1e436ea2e6ce647c0491426a3b146bd2538308cf186fcea5d3e48: read tcp 192.168.1.145:59117->52.239.153.4:443: read: connection timed out
Incomplete snapshot saved to /mnt/USBHD/.duplicacy/incomplete

Whenever this happens I restart the backup, and eventually it finishes. The problem is that these are initial backups, so they are huge: instead of running for a day, each one takes far longer because it fails after an hour or two and isn't restarted until I notice.

I am not sure whether this happens on Windows, because I haven't tried from a Windows PC yet, but I assume it does. I am also not sure whether this only happens when writing to Azure storage or whether it affects other clients too.

@gilbertchen
Owner

How many threads are you using?

@Kyle2123
Author

I haven't specified the threads option so it is whatever the default is. The command I am using is "duplicacy backup -stats"

@gilbertchen
Owner

Can you change this line:

    "github.com/Azure/azure-sdk-for-go/storage"

to:

    "github.com/gilbertchen/azure-sdk-for-go/storage"

and then rebuild from the source?

This will pick up the change I made to retry on timeout errors: gilbertchen/azure-sdk-for-go@53194c2

If you would rather have a prebuilt binary, please let me know.

@Kyle2123
Author

Sorry for the delayed response; I was out of town last week. I don't have an easy way to do the builds myself. Can you help me get a binary to test this with (for Mac)?

@gilbertchen
Owner

Can you try this build: https://acrosync.com/duplicacy/duplicacy_osx_x64_2.0.10e?

@jeffaco
Contributor

jeffaco commented Mar 8, 2018

Hey, I'm having this identical problem, see here for details. I, too, am running on Mac OS/X.

Any problem if I test this fix as well?

@gilbertchen
Owner

@jeffaco please test the fix and let me know if it works.

@jeffaco
Contributor

jeffaco commented Mar 9, 2018

Minor immediate problem:

After doing duplicacy init, and entering the storage password (twice, as expected), there was no \r sent, so the next lines started halfway across the screen. The same occurred with duplicacy backup.

I also observed, after getting the number of patterns loaded, there is a LONG delay. I have a 1.6TB file system, so I imagine it's looking at files. But with this many files, it takes 5-10 (or maybe more) minutes of no output while doing this step. Since I did a duplicacy backup -stats (-stats was specified), it probably makes sense for you to output that you're looking for files, and then output the number of files (and total size) that will be included in the backup. This, for me, would help verify that my include/exclude pattern(s) were correct. Right now, I actually have no idea if things are correct or not.

Beyond that, this problem is pretty intermittent. I'll get back to you if/when it completes, thanks!

@jeffaco
Contributor

jeffaco commented Mar 9, 2018

Looks like the new version did NOT fix the problem:

Uploaded chunk 2172 size 6020488, 10.01MB/s 2 days 19:52:26 0.4%
Uploaded chunk 2173 size 3207576, 10.01MB/s 2 days 19:51:16 0.4%
Uploaded chunk 2174 size 9252079, 10.01MB/s 2 days 19:51:42 0.4%
Failed to upload the chunk fbeb565b10c965341c1e3ec763bffef7140b3c89f096e28c4439459b1dac34b5: Put https://XXXXX.blob.core.windows.net/XXXXX/chunks/fb/eb565b10c965341c1e3ec763bffef7140b3c89f096e28c4439459b1dac34b5: read tcp 192.168.1.125:64943->52.191.176.36:443: read: connection reset by peer
Incomplete snapshot saved to /Volumes/Storage/.duplicacy/incomplete

Here's the version I'm running:

-rwxr-xr-x@ 1 jeff  staff  21929840 Mar  8 16:52 /Users/jeff/Applications/duplicacy_osx_x64_2.0.10e

Note that the error is slightly different this time: read: connection reset by peer.

Please let me know how to proceed, thanks!

@gilbertchen
Owner

The new build https://acrosync.com/duplicacy/duplicacy_osx_x64_2.0.10f should be able to retry on this error.

@jeffaco
Contributor

jeffaco commented Mar 9, 2018

I didn't want to leave you hanging. I did install this last night, just around 8:00 PM (just shortly after you posted the new image).

This is a big backup, but so far so good:

Uploaded chunk 82140 size 7684312, 10.31MB/s 2 days 07:17:40 16.4%
Uploaded chunk 82141 size 2868619, 10.31MB/s 2 days 07:17:39 16.4%
Uploaded chunk 82142 size 3777330, 10.31MB/s 2 days 07:17:42 16.4%

This is promising, but certainly not conclusive since these problems were intermittent to begin with. I'll get back with more definitive information within 2 days, 7 hours! 😄

@jeffaco
Contributor

jeffaco commented Mar 10, 2018

Duplicacy aborted again backing up to Azure. I got MUCH farther this time, farther than ever before, but it did abort:

Uploaded chunk 276986 size 11695421, 10.37MB/s 1 day 05:27:16 55.2%
Uploaded chunk 276987 size 16777216, 10.37MB/s 1 day 05:27:14 55.2%
Uploaded chunk 276988 size 4153994, 10.37MB/s 1 day 05:27:13 55.2%
Failed to upload the chunk c0e124194f30984d1c7c8abc3ac3847eb5d3e01ae2d3d23a272913b9189e674b: Put https://XXXXX.blob.core.windows.net/XXXXX/chunks/c0/e124194f30984d1c7c8abc3ac3847eb5d3e01ae2d3d23a272913b9189e674b: write tcp 192.168.1.125:54142->52.191.176.36:443: write: broken pipe
Incomplete snapshot saved to /Volumes/Storage/.duplicacy/incomplete

The error this time is different than before: write: broken pipe.

If you can let me know how to proceed, that would be great. Thanks so much.

@gilbertchen
Owner

The previous fix only retries on temporary errors, but this write: broken pipe error wasn't treated as a temporary error. Can you run the backup again to see if the error will keep happening?
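For readers following along, here is a minimal sketch of why a "temporary errors only" check can miss a broken pipe. It assumes the check relies on Go's standard `net.Error.Temporary()` classification (deprecated since Go 1.18 but still present); the thread does not show the actual code in the fork. The helper names `isTemporary`, `writeError`, `timedOutError` and `brokenPipeError` are hypothetical, and `errors.As` requires Go 1.13 or later.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// isTemporary reports whether err (or an error it wraps) is a net.Error
// that the standard library classifies as temporary.
func isTemporary(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Temporary()
}

// writeError builds a *net.OpError shaped like the ones in the logs above,
// carrying the given errno. Hypothetical helper, for illustration only.
func writeError(errno syscall.Errno) error {
	return &net.OpError{Op: "write", Net: "tcp", Err: os.NewSyscallError("write", errno)}
}

// timedOutError simulates "connection timed out" (ETIMEDOUT).
func timedOutError() error { return writeError(syscall.ETIMEDOUT) }

// brokenPipeError simulates "broken pipe" (EPIPE).
func brokenPipeError() error { return writeError(syscall.EPIPE) }

func main() {
	fmt.Println("timed out is temporary:  ", isTemporary(timedOutError()))
	fmt.Println("broken pipe is temporary:", isTemporary(brokenPipeError()))
}
```

Under this classification a timeout counts as temporary and a broken pipe does not, so a retry policy keyed only on `Temporary()` would let the broken pipe abort the backup.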

@jeffaco
Contributor

jeffaco commented Mar 13, 2018

I can retry, but this is a 2+ day upload. I'd like to see a full upload happen at least once (in one shot) to know that communications are good. I have no reason to believe that this won't come up again, at least at times.

A broken pipe is a synchronization issue between essentially two processes (in this case, no doubt between Azure and Duplicacy). Since you know exactly what chunk you were trying to store, why wouldn't you just retry the operation? It's harmless to retry, as the new chunk would just replace the old chunk, right?
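The retry suggested here could look roughly like the sketch below. It assumes, as the comment does, that re-uploading a chunk is idempotent. `retryUpload` and `flakyUpload` are hypothetical names, not Duplicacy functions, and the doubling delay is just one common backoff choice.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryUpload calls upload until it succeeds or maxAttempts is reached,
// doubling the wait between attempts. It returns the number of attempts made.
func retryUpload(maxAttempts int, initialDelay time.Duration, upload func() error) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1 // always try at least once
	}
	delay := initialDelay
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = upload(); err == nil {
			return attempt, nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return maxAttempts, fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

// flakyUpload returns a stand-in for a chunk upload that fails the first
// `failures` calls with a broken-pipe style error and then succeeds.
func flakyUpload(failures int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errors.New("write: broken pipe")
		}
		return nil
	}
}

func main() {
	attempts, err := retryUpload(5, 10*time.Millisecond, flakyUpload(2))
	fmt.Println("attempts:", attempts, "error:", err)
}
```

Capping the number of attempts keeps a permanently failing storage endpoint from stalling the backup forever, while a short backoff gives a dropped connection time to be re-established.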

@jeffaco
Contributor

jeffaco commented Mar 15, 2018

I ran the backup again and it did reoccur with the same error:

Uploaded chunk 353936 size 3290997, 10.07MB/s 20:18:07 70.1%
Uploaded chunk 353937 size 1644254, 10.07MB/s 20:18:07 70.1%
Uploaded chunk 353938 size 8457462, 10.07MB/s 20:18:06 70.1%
Uploaded chunk 353939 size 4476809, 10.07MB/s 20:18:06 70.1%
Failed to upload the chunk 2dcef97eac6562136f35c411e15a1ed7b193cc35ff7950e4f1b42ea5059297dd: Put https://XXXXX.blob.core.windows.net/XXXXX/chunks/2d/cef97eac6562136f35c411e15a1ed7b193cc35ff7950e4f1b42ea5059297dd: write tcp 192.168.1.125:59725->52.191.176.36:443: write: broken pipe
Uploaded chunk 353940 size 2489875, 10.07MB/s 20:18:06 70.1%
Uploaded chunk 353941 size 3310368, 10.07MB/s 20:18:06 70.1%
Uploaded chunk 353942 size 2714835, 10.07MB/s 20:18:05 70.1%
Uploaded chunk 353943 size 2139599, 10.07MB/s 20:18:05 70.1%
Incomplete snapshot saved to /Volumes/Storage/.duplicacy/incomplete

Same error: write: broken pipe. Can you change the code to retry this error?

(It took about two days before this error occurred ...)

/Jeff

@jeffaco
Contributor

jeffaco commented Mar 20, 2018

> The previous fix only retries on temporary errors, but this write: broken pipe error wasn't treated as a temporary error. Can you run the backup again to see if the error will keep happening?

Any guidance on how to proceed here? It feels like large uploads to Azure aren't very reliable here, which is concerning since I have some large uploads to do. Thanks for any advice.

@gilbertchen
Owner

This build https://acrosync.com/duplicacy/duplicacy_osx_x64_2.1.0a should handle the broken pipe error more gracefully.

@jeffaco
Contributor

jeffaco commented Mar 21, 2018

Awesome, thank you so much!

I've restarted the backup and will let you know how it goes (if all goes well, it will take up to 3 days, I estimate).

@jeffaco
Contributor

jeffaco commented Mar 23, 2018

I don't think the new image (duplicacy_osx_x64_2.1.0a) handles broken pipes properly:

Uploaded chunk 104338 size 3529671, 10.48MB/s 2 days 03:50:47 20.7%
Uploaded chunk 104339 size 8244389, 10.48MB/s 2 days 03:50:47 20.7%
Uploaded chunk 104340 size 3537824, 10.48MB/s 2 days 03:50:46 20.7%
Uploaded chunk 104341 size 6205297, 10.48MB/s 2 days 03:50:47 20.7%
Uploaded chunk 104342 size 3466820, 10.48MB/s 2 days 03:50:45 20.7%
Uploaded chunk 104343 size 2707617, 10.48MB/s 2 days 03:50:48 20.7%
Failed to upload the chunk ca54f9fffce3158899e587c0f249794cb7c94b0475698cc4672296dfa8604f33: Put https://XXXXX.blob.core.windows.net/XXXXX/chunks/ca/54f9fffce3158899e587c0f249794cb7c94b0475698cc4672296dfa8604f33: write tcp 192.168.1.125:53710->52.191.176.36:443: write: broken pipe
Uploaded chunk 104344 size 2578663, 10.48MB/s 2 days 03:50:47 20.7%
Incomplete snapshot saved to /Volumes/Storage/.duplicacy/incomplete

Same error: write tcp 192.168.1.125:53710->52.191.176.36:443: write: broken pipe.

This seems to be maddeningly consistent 😞.

Any guidance on what to try next would be appreciated, thanks!

@gilbertchen
Owner

This build (https://acrosync.com/duplicacy/duplicacy_osx_x64_2.1.0b) should work. The previous fix didn't work because the broken pipe error seems to be a url.Error but only net.OpError was handled. This new build retries on both url.Error and net.OpError.
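To illustrate the mismatch described here, the sketch below builds a `*url.Error` that wraps a `*net.OpError`, which is the shape Go's HTTP client gives to a failed request. A direct type assertion to `*net.OpError` misses it, while a check that accepts both types catches it. This is not the code from the build: the function names and the example URL are hypothetical, and `errors.As` (Go 1.13+) postdates this thread.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
	"os"
	"syscall"
)

// directOpError mimics a check that only recognizes a bare *net.OpError.
func directOpError(err error) bool {
	_, ok := err.(*net.OpError)
	return ok
}

// isRetryable accepts either a *url.Error or a *net.OpError, looking
// through any wrapping layers.
func isRetryable(err error) bool {
	var urlErr *url.Error
	var opErr *net.OpError
	return errors.As(err, &urlErr) || errors.As(err, &opErr)
}

// wrappedBrokenPipe builds a broken-pipe write error wrapped in a url.Error.
// The URL is a placeholder, not a real storage endpoint.
func wrappedBrokenPipe() error {
	inner := &net.OpError{Op: "write", Net: "tcp", Err: os.NewSyscallError("write", syscall.EPIPE)}
	return &url.Error{Op: "Put", URL: "https://example.invalid/chunks/ca/54f9", Err: inner}
}

func main() {
	err := wrappedBrokenPipe()
	fmt.Println("direct *net.OpError check:", directOpError(err))
	fmt.Println("url.Error or net.OpError: ", isRetryable(err))
}
```

Treating every `url.Error` as retryable is deliberately broad in this sketch; a stricter policy might inspect the wrapped error before deciding.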

@jeffaco
Contributor

jeffaco commented Mar 26, 2018

Backup finally finished successfully, with a 0 exit status and the following final output:

Backup for /Volumes/Storage at revision 1 completed
Files: 228587 total, 2411G bytes; 228587 new, 2411G bytes
File chunks: 0 total, 2411G bytes; 479287 new, 2314G bytes, 2292G bytes uploaded
Metadata chunks: 31 total, 132,553K bytes; 31 new, 132,553K bytes, 60,852K bytes uploaded
All chunks: 31 total, 2411G bytes; 479318 new, 2315G bytes, 2292G bytes uploaded
Total running time: 2 days 20:44:01
Office-iMac:Storage jeff$

I would say that your latest fix does indeed resolve the broken pipe problem! Thanks so much!

@gilbertchen
Owner

While I'm glad the new build worked for you, I noticed there was a bug: the total number of file chunks can't be 0. This is likely to be a bug introduced after the 2.1.0 release. I'll look into that.

@jeffaco
Contributor

jeffaco commented Mar 26, 2018

Does that mean that this backup is invalid (it does pass a duplicacy check command), or is it likely just a problem in output?

@gilbertchen
Owner

It is just a problem in output. The backup should be fine as long as it passes the check command.

@jeffaco
Contributor

jeffaco commented Mar 26, 2018

Awesome, thanks!

@gilbertchen
Owner

The 0 file chunk bug has been fixed by 5d2242d.

@jeffaco
Contributor

jeffaco commented Mar 28, 2018

Thanks for taking care of this! This issue can be closed since the problems reported in it are fixed. I would have closed it myself, but don't have permission.
