-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX #6413 pip install <url> allow directory traversal #6418
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the report and patch! Some comments.
news/6413.bugfix
Outdated
@@ -0,0 +1 @@ | |||
Fixed pip install <url> allow directory traversal when a malicious server (or a network MitM if downloading over HTTP) send a Content-Disposition header with filename which contains "../ or ..\". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be in the imperative tone (so "Fix pip install <url>
..."): https://pip.pypa.io/en/stable/development/contributing/#contents-of-a-news-entry
You should also wrap lines to 80 characters and surround words that should be code with double backticks before and after (since it is reStructuredText).
src/pip/_internal/download.py
Outdated
_filename = params.get('filename') | ||
if _filename: | ||
_filename = os.path.basename(_filename) | ||
filename = _filename or filename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To increase testability, I would define a parse_content_disposition(headers, default_filename)
function that returns the file name to use, and then add a couple unit tests of the function using @pytest.mark.parametrize
.
tests/unit/test_download.py
Outdated
} | ||
session.get.return_value = resp | ||
|
||
temp_dir = mkdtemp() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use the pytest tmpdir
fixture so you won't need to manually create and delete a temp dir.
tests/unit/test_download.py
Outdated
temp_dir = mkdtemp() | ||
temp_sub_dir = mkdtemp(dir=temp_dir, prefix="sub-dir-") | ||
hashes = None | ||
progress_bar = "on" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can pass these two variables as keyword argument to save two lines.
tests/unit/test_download.py
Outdated
@@ -157,6 +157,43 @@ def test_unpack_http_url_bad_downloaded_checksum(mock_unpack_file): | |||
rmtree(download_dir) | |||
|
|||
|
|||
def test_download_http_url_with_directory_traversal(data): | |||
""" | |||
It should download file to temp_sub_dir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be more explicit in the docstring and say something like, "Test downloading a file when the content-disposition header contains a filename with a ".." path part."
src/pip/_internal/download.py
Outdated
type, params = cgi.parse_header(content_disposition) | ||
filename = params.get('filename') | ||
if filename: | ||
filename = os.path.basename(filename).split("\\")[-1].rstrip(".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing this implementation, it might actually be better to add a second function (called something like sanitize_filename()
) and put the bulk of your unit tests in the unit tests of that function. Also, why is it necessary to split on \
after calling basename()
, and why splitting only with that character as opposed to others? Some code comments are probably warranted here explaining what we are protecting against. Lastly, what about left-stripping .
, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added .split("\\")[-1]
because os.path.basename("dir\\file")
return "dir\file" in linux.
Later I found out that "dir \ file" is a valid filename in linux. so .split("\\")[-1]
can be removed.
I added .rstrip(".")
because os.path.basename("dir/..") return ".." and ".." is not a valid filename,
But only ".." can not lead to directory traversal. so ``.rstrip(".")``` can also be removed.
So only os.path.basename(filename)
could fix this security issue already, so I decided to remove .split("\\")[-1]
and .rstrip(".")
, is this ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't something like that still needed for Windows, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In windows both os.path.basename("dir\\file")
and os.path.basename("dir/file")
return "file".
There is no other thing need to do in Windows, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your updates (and I agree with using os.path.basename()
). Some more comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some final(?) comments. Thanks for your work on this.
any problems? |
Hello! I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the |
No, just haven't gotten to this. But looks like you need to do a rebase in the meantime.. |
Hello! I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the |
a2ea2a2
to
42dfa4f
Compare
@gzpan123 I made some formatting and wording tweaks before planning to merge this, if that's okay. |
Thanks so much for all your work on this, @gzpan123, and for sticking with it! 👍 |
@gzpan123 In the course of reviewing your PR, I noticed a related issue that looks like it could use fixing. It seems like the "default" filename (from the >>> from pip._internal.models.link import Link
>>> link = Link('https://example.com/..%2Ffoo.txt')
>>> link.filename
'../foo.txt' This is because pip/src/pip/_internal/models/link.py Lines 61 to 68 in 5776ddd
|
Fixes #6413.