Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolvelib, test: handle relative URLs on 503-style indices #411

Merged
merged 2 commits into from
Nov 17, 2022

Conversation

woodruffw
Copy link
Member

This should fix #410 -- I don't have a third-party index with relative URLs to test against, but I've surmised the behavior based on that report (and following the PEP).

cc @fi0rini -- could you give these changes a try and let me know if they work for you?

Signed-off-by: William Woodruff [email protected]

@woodruffw woodruffw added the component:dep-sources Dependency sources label Nov 17, 2022
@woodruffw woodruffw requested review from di and tetsuo-cpp November 17, 2022 21:06
@woodruffw woodruffw self-assigned this Nov 17, 2022
Signed-off-by: William Woodruff <[email protected]>
Copy link
Member

@di di left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I would wait to merge for testing, either by the reporter or by us.

@woodruffw
Copy link
Member Author

woodruffw commented Nov 17, 2022

Yep, confirmed locally, with a fake index like this:

packages:
Flask-2.0.1-py3-none-any.whl

simple:
Flask

where simple/Flask:

<a href="../packages/Flask-2.0.1-py3-none-any.whl">
Flask-2.0.1-py3-none-any.whl</a><br/>

this resulted in a correctly resolved GET.

Edit: tested with:

python -m pip_audit --index-url http://localhost:8000/simple -r <(echo 'flask')

@woodruffw
Copy link
Member Author

Confirmed locally, so I'm going to merge. @f10rini let us know if you get a chance to test it on your end as well 🙂

@woodruffw woodruffw merged commit 9a88593 into main Nov 17, 2022
@woodruffw woodruffw deleted the ww/503-index-relative-urls branch November 17, 2022 21:31
@fi0rini
Copy link

fi0rini commented Nov 19, 2022

Hey guys, sorry to report that the command is still failing for us.

Gonna post the pip-audit output followed by some tracing related to the relative URLs so that you can see the difference. Looks like our path join is off for whatever reason.

The debug logs below show it going from
https://custom-repo.com:443/api/pypi/pypi-all/simple/alabaster/
to
https://custom-repo.com:443/api/pypi/simple/alabaster/0.7.10/alabaster-0.7.10-py2.py3-none-any.whl

There's no pypi-all there anymore so it went up one too many levels?

Here's the pip-audit output and some tracing

DEBUG:pip_audit._cli:parsed arguments: Namespace(local=False, requirements=[<_io.TextIOWrapper name='/home/nick/test_requirements.txt' mode='r' encoding='UTF-8'>], project_path=None, format=<OutputFormatChoice.Json: 'json'>, vulnerability_service=<VulnerabilityServiceChoice.Pypi: 'pypi'>, dry_run=False, strict=False, desc=<VulnerabilityDescriptionChoice.Auto: 'auto'>, cache_dir=PosixPath('/cache'), progress_spinner=<ProgressSpinnerChoice.Off: 'off'>, timeout=15, paths=[], verbose=True, fix=False, require_hashes=False, index_url='https://custom-repo.com/api/pypi/pypi-all/simple', extra_index_urls=[], skip_editable=False, no_deps=False, output=<_io.TextIOWrapper name='/report/pip-audit.json' mode='w' encoding='UTF-8'>, ignore_vulns=[])
DEBUG:cachecontrol.controller:Looking up "https://custom-repo.com/api/pypi/pypi-all/simple/alabaster" in the cache
DEBUG:cachecontrol.controller:No cache entry available
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): custom-repo.com:443
DEBUG:urllib3.connectionpool:https://custom-repo.com:443 "GET /api/pypi/pypi-all/simple/alabaster HTTP/1.1" 302 0
DEBUG:cachecontrol.controller:Status code 302 not in (200, 203, 300, 301, 308)
DEBUG:cachecontrol.controller:Looking up "https://custom-repo.com:443/api/pypi/pypi-all/simple/alabaster/" in the cache
DEBUG:cachecontrol.controller:No cache entry available
DEBUG:urllib3.connectionpool:https://custom-repo.com:443 "GET /api/pypi/pypi-all/simple/alabaster/ HTTP/1.1" 200 None
DEBUG:cachecontrol.controller:Updating cache with response from "https://custom-repo.com:443/api/pypi/pypi-all/simple/alabaster/"
DEBUG:cachecontrol.controller:Looking up "https://custom-repo.com/api/pypi/simple/alabaster/0.7.10/alabaster-0.7.10-py2.py3-none-any.whl" in the cache
DEBUG:cachecontrol.controller:No cache entry available
DEBUG:urllib3.connectionpool:https://custom-repo.com:443 "GET /api/pypi/simple/alabaster/0.7.10/alabaster-0.7.10-py2.py3-none-any.whl HTTP/1.1" 404 None
DEBUG:cachecontrol.controller:Status code 404 not in (200, 203, 300, 301, 308)
Traceback (most recent call last):
  File "/home/pipaudit/.local/bin/pip-audit", line 8, in <module>
    sys.exit(audit())
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_cli.py", line 434, in audit
    for (spec, vulns) in auditor.audit(source):
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_audit.py", line 66, in audit
    for dep, vulns in self._service.query_all(specs):
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_service/interface.py", line 155, in query_all
    for spec in specs:
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_dependency_source/requirement.py", line 114, in collect
    for _, dep in self._collect_cached_deps(filename, reqs):
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_dependency_source/requirement.py", line 326, in _collect_cached_deps
    for req, resolved_deps in self._resolver.resolve_all(iter(req_values)):
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_dependency_source/interface.py", line 87, in resolve_all
    yield (req, self.resolve(req))
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_dependency_source/resolvelib/resolvelib.py", line 75, in resolve
    result = self.resolver.resolve([req])
  File "/home/pipaudit/.local/lib/python3.9/site-packages/resolvelib/resolvers.py", line 521, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/home/pipaudit/.local/lib/python3.9/site-packages/resolvelib/resolvers.py", line 402, in resolve
    failure_causes = self._attempt_to_pin_criterion(name)
  File "/home/pipaudit/.local/lib/python3.9/site-packages/resolvelib/resolvers.py", line 238, in _attempt_to_pin_criterion
    criteria = self._get_updated_criteria(candidate)
  File "/home/pipaudit/.local/lib/python3.9/site-packages/resolvelib/resolvers.py", line 228, in _get_updated_criteria
    for requirement in self._p.get_dependencies(candidate=candidate):
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 392, in get_dependencies
    return candidate.dependencies
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 121, in dependencies
    self._dependencies = list(self._get_dependencies())
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 103, in _get_dependencies
    deps: List[str] = self.metadata.get_all("Requires-Dist", [])
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 94, in metadata
    self._metadata = self._get_metadata_for_wheel()
  File "/home/pipaudit/.local/lib/python3.9/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 132, in _get_metadata_for_wheel
    with ZipFile(BytesIO(data)) as z:
  File "/usr/local/lib/python3.9/zipfile.py", line 1266, in __init__
    self._RealGetContents()
  File "/usr/local/lib/python3.9/zipfile.py", line 1333, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

# if I try keeping the pypi-all here and trying request
$ wget https://custom-repo.com:443/api/pypi/pypi-all/simple/alabaster/0.7.10/alabaster-0.7.10-py2.py3-none-any.whl
--2022-11-18 19:33:45--  https://custom-repo.com/api/pypi/pypi-all/simple/alabaster/0.7.10/alabaster-0.7.10-py2.py3-none-any.whl
Resolving custom-repo.com (custom-repo.com)... 
Connecting to custom-repo.com (custom-repo.com)... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12847 (13K) [application/octet-stream]
Saving to: ‘alabaster-0.7.10-py2.py3-none-any.whl’

100%[===================================================================>] 12,847      --.-K/s   in 0s      

2022-11-18 19:33:45 (61.8 MB/s) - ‘alabaster-0.7.10-py2.py3-none-any.whl’ saved [12847/12847]
# example for djang
$ curl -L  https://custom-repo.com/api/pypi/pypi-all/simple/django
<!-- ... --!>
<a href="../../packages/packages/b8/89/cf535c221e477545bc9f6fe00f644bb0bd51948dc22b0c7cc066ba9cd899/Django-4.1rc1.tar.gz#sha256=3c4204490ad3a7252873bc24c6aa2a32375baa06b4c8beb9b9cfda1f986741de" data-requires-python="&gt;=3.8" rel="internal">Django-4.1rc1.tar.gz</a><br />


$ curl -L  'https://custom-repo.com/api/pypi/pypi-all/simple/django/../../packages/packages/b8/89/cf535c221e477545bc9f6fe00f644bb0bd51948dc22b0c7cc066ba9cd899/Django-4.1rc1.tar.gz#sha256=3c4204490ad3a7252873bc24c6aa2a32375baa06b4c8beb9b9cfda1f986741de' --output django-test.tar.gz

$ file django-test.tar.gz 
django-test.tar.gz: gzip compressed data, was "Django-4.1rc1.tar", last modified: Tue Jul 19 04:51:40 2022, max compression

@woodruffw
Copy link
Member Author

Dang. Thanks for checking.

I think I know what happened: we're doing a urljoin on a base that looks like this:

https://custom-repo.com/api/pypi/pypi-all/simple/django

...and a path that looks like this:

../../packages/packages/b8/89/cf535c221e477545bc9f6fe00f644bb0bd51948dc22b0c7cc066ba9cd899/Django-4.1rc1.tar.gz

intuitively that should produce this:

https://custom-repo.com/api/pypi/pypi-all/packages/packages/b8/89/cf535c221e477545bc9f6fe00f644bb0bd51948dc22b0c7cc066ba9cd899/Django-4.1rc1.tar.gz

...but it actually removes three components instead of two, since urljoin treats any final component that doesn't end with a / as a file instead of a directory component.

Should be a one-line fix.

@fi0rini
Copy link

fi0rini commented Nov 19, 2022

alright makes sense to me

@woodruffw
Copy link
Member Author

@fi0rini I opened #412, which should actually fix it 🙂. Could you give that a try?

@fi0rini
Copy link

fi0rini commented Nov 19, 2022

yoo its workinn @woodruffw good stuff B-) awesome dude thanks!

@woodruffw
Copy link
Member Author

Awesome, thanks again for reporting and helping us debug!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:dep-sources Dependency sources
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pip-audit failure using --index-url
3 participants