Call to download() results in zero sized files #9

Open
jjohanson opened this issue Feb 20, 2013 · 4 comments

Comments

@jjohanson

Hello!

I am using the latest version of easywebdav with the following setup:

easywebdav 1.0.7
requests 1.1.0
webdav server: mod_dav, apache2
ubuntu 12.10

I have a problem downloading files. When I call the download() method, the file is created on the local machine with the correct name, but it is empty (zero size).

Looking at the code, I can see that response.raw is passed to shutil.copyfileobj(). However, I think the requests documentation states that 'stream=True' must be used in the call to session.request for .raw to be valid (http://docs.python-requests.org/en/latest/api/):

"raw = None
File-like object representation of response (for advanced usage).
Requires that 'stream=True' on the request."

I have tried setting stream=True (line 77 in client.py), and the received file then contains data! However, looking at the response headers, I can see that my WebDAV server is gzipping the files, so the downloaded files have to be unzipped.

*** client-new.py   2013-02-20 08:27:29.941967542 +0100
--- client-org.py   2013-02-18 14:17:59.114443000 +0100
***************
*** 76,78 ****
          url = self._get_url(path)
!         response = self.session.request(method, url, allow_redirects=False, stream=True, **kwargs)
          if isinstance(expected_code, Number) and response.status_code != expected_code \
--- 76,78 ----
          url = self._get_url(path)
!         response = self.session.request(method, url, allow_redirects=False, **kwargs)
          if isinstance(expected_code, Number) and response.status_code != expected_code \

To get around this problem I have uncommented 'f.write(response.content)' on line 126 in client.py and commented out 'shutil.copyfileobj(response.raw, f)' on line 127.

As far as I can see, however, using response.raw with something like shutil.copyfileobj() is necessary to be able to download really large files, because response.content loads the whole body into memory first.
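One way to keep memory use bounded without touching response.raw directly is to iterate over the body in chunks. The following is only a sketch, not easywebdav's code: `download_in_chunks` and its parameters are hypothetical names, and `session` is assumed to behave like a `requests.Session`. It relies on `iter_content()`, which yields the body piece by piece and handles gzip/deflate decoding.

```python
def download_in_chunks(session, url, local_path, chunk_size=64 * 1024):
    """Stream `url` to `local_path` without holding the whole body in memory.

    Hypothetical helper: `session` is assumed to be a requests.Session
    (or anything exposing the same get() interface).
    Returns the number of bytes written.
    """
    # stream=True defers fetching the body until it is iterated over.
    response = session.get(url, stream=True)
    try:
        response.raise_for_status()
        written = 0
        with open(local_path, 'wb') as f:
            # iter_content() yields decoded chunks of at most chunk_size bytes.
            for chunk in response.iter_content(chunk_size=chunk_size):
                if chunk:  # skip empty chunks
                    f.write(chunk)
                    written += len(chunk)
        return written
    finally:
        # Release the connection even if writing fails.
        response.close()
```

With a real session this would be called as `download_in_chunks(requests.Session(), url, 'local.bin')`; the chunk size is an arbitrary choice.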

@jjohanson
Author

Hello again,

I have looked some more at using 'response.raw'. As mentioned above, you may end up with gzipped files when setting 'stream=True' and using 'response.raw'. However, it turns out that the 'read()' method has an optional parameter 'decode_content' that can be set to True to have the response decoded (unzipped in this case):

response.raw.read(decode_content=True)

So the following code uses 'stream=True' and response.raw, with response.raw.read(decode_content=True) called in a loop to copy all data.

The documentation on .read() states that you cannot specify an amount to read and set 'decode_content=True' at the same time. So this still leaves the question of how to read (very) large files. I have tried reading files up to about 600MB and I can see that the file is transferred in a single call to .read() (the body of the while loop runs only once). So, how to transfer really large files?

Jørgen

--- /home/jojo/Downloads/easywebdav-1.0.7/easywebdav/client.py  2012-11-13 23:31:47.000000000 +0100
+++ /usr/local/lib/python2.7/dist-packages/easywebdav-1.0.7-py2.7.egg/easywebdav/client.py  2013-02-24 09:51:50.638723255 +0100
@@ -74,7 +74,7 @@
             self.session.auth = (username, password)
     def _send(self, method, path, expected_code, **kwargs):
         url = self._get_url(path)
-        response = self.session.request(method, url, allow_redirects=False, **kwargs)
+        response = self.session.request(method, url, allow_redirects=False, stream=True, **kwargs)
         if isinstance(expected_code, Number) and response.status_code != expected_code \
             or not isinstance(expected_code, Number) and response.status_code not in expected_code:
             raise OperationFailed(method, path, expected_code, response.status_code)
@@ -124,7 +124,12 @@
         response = self._send('GET', remote_path, 200)
         with open(local_path, 'wb') as f:
             #f.write(response.content)
-            shutil.copyfileobj(response.raw, f)
+            #shutil.copyfileobj(response.raw, f)
+            line = response.raw.read(decode_content=True)
+            while line:
+                f.write(line)
+                line = response.raw.read(decode_content=True)
+
     def ls(self, remote_path='.'):
         headers = {'Depth': '1'}
         response = self._send('PROPFIND', remote_path, (207, 301), headers=headers)

read(): http://urllib3.readthedocs.org/en/latest/helpers.html#module-urllib3.response

http://docs.python-requests.org/en/latest/user/advanced/
http://docs.python-requests.org/en/latest/api/
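On the large-file question: gzip data does not have to be decoded in one piece. The sketch below uses only the standard library (`zlib`), not requests or urllib3, and `gunzip_stream` is a hypothetical name. It illustrates the general technique of reading a bounded number of compressed bytes at a time and decoding them incrementally, assuming `src` and `dst` are binary file-like objects.

```python
import zlib


def gunzip_stream(src, dst, chunk_size=64 * 1024):
    """Copy gzip-compressed bytes from `src` to `dst`, decoding on the fly.

    Hypothetical helper: at most `chunk_size` compressed bytes are read
    per iteration, so the compressed input is never held in memory whole.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(decoder.decompress(chunk))
    # Emit anything still buffered inside the decompressor.
    dst.write(decoder.flush())
```

Here `src` could be a file that was saved in its gzipped form, opened in 'rb' mode, with `dst` a second file opened in 'wb' mode.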

@amnong
Owner

amnong commented May 30, 2014

I'm going to release v1.0.8 after a really long while - please let me know if the problem persists with the new version.

@blootsvoets

Works for me with
requests==2.7.0
easywebdav==1.2.0
python==2.7.10

@Jay54520

This worked for me:

            response.raw.decode_content = True
            shutil.copyfileobj(response.raw, out_file)

because

 the response.raw file-like object will not, by default, decode compressed responses (with GZIP or deflate). You can force it to decompress for you anyway by setting the decode_content attribute to True

ref:
https://www.codementor.io/tips/3443978201/how-to-download-image-using-requests-in-python
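For context, the two lines above could sit in a small function like the one below. This is a sketch under assumptions: `save_raw_decoded` and `buffer_size` are hypothetical names, and `response` is assumed to come from a request made with stream=True so that `response.raw` still holds the unread body.

```python
import shutil


def save_raw_decoded(response, local_path, buffer_size=64 * 1024):
    """Write the body of a streamed response to `local_path`.

    Hypothetical helper: assumes `response` was obtained with stream=True.
    """
    # Ask the raw stream to undo gzip/deflate content encoding while reading.
    response.raw.decode_content = True
    with open(local_path, 'wb') as out_file:
        # copyfileobj reads buffer_size bytes at a time, so memory stays bounded.
        shutil.copyfileobj(response.raw, out_file, buffer_size)
```

Because copyfileobj() reads in fixed-size pieces, this keeps the original streaming approach while avoiding gzipped output files.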
