Better workaround for cache poisoning (see #3025) #7319
This approach seems cleaner than setting the python tag explicitly. In both setups we're caching a wheel per interpreter, but here we preserve all the information that a well-behaved wheel build might want to communicate and it will let us remove some workarounds in our current implementation.
My big concern here is that this is basically invalidating all existing caches, without clearing them out.
I would consider that as covered by #6956.
This change makes it so that all existing cached wheels would no longer be considered when installing, building a wheelhouse, etc. Providing functionality to trim the size of your cached wheels (#6956) is different and would not do anything toward addressing that. I'd rather that we stop spewing out the more specific tags in our 'pip install' wheels, since we're at a point where it's OK to say something along the lines of: You're doing dynamic things in your Basically (1) from #7296 (comment).
Requested changes done. Unfortunately @pradyunsg, a possible issue with the more aggressive approach is that #3025 reveals itself for downstream users of broken sdists. Since such broken sdists are probably unmaintained nowadays, the only actionable option for users will be to disable caching or adapt, say, their tox configurations to use a different cache per python implementation. So I tend to think some lost disk space and a cache rebuild will generate fewer complaints. I've no strong opinion on this though.
I added two commits to remove unused code related to
This PR would mean that pip 20 would have a completely incompatible cache with previous versions. I think
I rebased and added a commit with the new key construction and hashing algorithm.
I added support for legacy cache entries, following @xavfernandez's suggestion. The diff is becoming somewhat big, but individual commits should remain easy to review.
And I think a
I've a last comment. Otherwise, it looks really good 👍
A few minor followup comments (only one actionable right now), otherwise this looks good to me.
    path = self.get_path_for_link(link)
    if os.path.isdir(path):
        candidates.extend(os.listdir(path))
    # TODO remove legacy path lookup in pip>=21
We should make an issue to add a deprecation warning around here when this is merged.
What kind of deprecation warning do you have in mind?
I'm wondering the same, as I think pip._internal.utils.deprecation.deprecated would be too noisy for such a thing...
Maybe something like pip/src/pip/_internal/utils/deprecation.py, lines 101 to 102 in b3aced9:

    if gone_in is not None and parse(current_version) >= parse(gone_in):
        raise PipDeprecationWarning(message)
To not forget about things to do in the future, you could maybe use GitHub milestones and assign such TODO issues to future milestones. Now that pip has a regular release cadence it might be easy enough to manage.
I'll try to take a look at this tomorrow.
Hello! I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the |
Make sure ``pip wheel`` never outputs pure python wheels with a python implementation tag. Better fix/workaround for `pypa#3025 <https://github.com/pypa/pip/issues/3025>`_ by using a per-implementation wheel cache instead of caching pure python wheels with an implementation tag in their name. Fixes pypa#7296
Instead of building a URL-ish string that could be complex to describe and reproduce, generate a dictionary that is hashed with a simple algorithm.
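As a rough illustration of that idea, here is a minimal sketch of hashing a dictionary of key parts. The function name `hash_key_parts`, the dictionary keys, the example URL, and the choice of SHA-224 over a JSON serialization are assumptions made for this sketch, not a statement of what the PR actually does.

```python
import hashlib
import json


def hash_key_parts(key_parts):
    """Return a stable hex digest for a dictionary of cache key parts."""
    # sort_keys plus fixed separators make the serialization deterministic,
    # so the same parts give the same digest whatever their insertion order.
    serialized = json.dumps(
        key_parts, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return hashlib.sha224(serialized.encode("ascii")).hexdigest()


# Hypothetical key parts: a source URL plus the interpreter identity.
key_parts = {
    "url": "https://example.com/pkg-1.0.tar.gz",
    "interpreter_name": "cp",
    "interpreter_version": "38",
}
digest = hash_key_parts(key_parts)
```

Because the interpreter name is part of the hashed dictionary, two implementations building the same sdist end up with different cache entries, which is the property the per-implementation cache relies on.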
Pip 20 changes the cache key format to include the interpreter name. To avoid invalidating all existing caches, we continue using existing cache entries that were computed with the legacy algorithm. This should not regress issue pypa#3025 because wheels cached in such legacy entries should have the python implementation tag set.
Co-Authored-By: Pradyun Gedam <[email protected]>
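The legacy fallback described in this commit message could look roughly like the sketch below. It assumes the new-format and legacy-format cache directories have already been computed; `list_candidates`, the directory names, and the wheel file name are hypothetical and only serve the illustration.

```python
import os
import tempfile


def list_candidates(new_dir, legacy_dir):
    """Collect (file name, directory) pairs, new-format entry first."""
    candidates = []
    for directory in (new_dir, legacy_dir):
        # A cache entry only exists if something was built for that key.
        if os.path.isdir(directory):
            for name in sorted(os.listdir(directory)):
                candidates.append((name, directory))
    return candidates


# Simulate a cache that only holds an entry written by an older pip.
with tempfile.TemporaryDirectory() as root:
    legacy_dir = os.path.join(root, "legacy")
    os.makedirs(legacy_dir)
    open(os.path.join(legacy_dir, "pkg-1.0-cp38-none-any.whl"), "w").close()
    new_dir = os.path.join(root, "new")  # never created
    found = list_candidates(new_dir, legacy_dir)
```

Keeping the directory next to each file name lets the caller tell which entry a wheel came from once both lookups are merged into one list.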
I rebased to resolve a merge conflict.
@xavfernandez I think all your comments have been handled. Can you update your review?
@sbidoul Sorry, this PR was approved in my head ^^
Ping @pradyunsg, any issues with this approach?
Two things, that I think we should consider tackling in follow ups prior to 20.0:
- add the deprecation notice, for the legacy path usage -- I'm not 100% sure whether we should be printing a not-actionable notice for something like the cache which pip's generating.
- switch to ask-for-forgiveness-not-permission paradigm for file system access here.
Neither of these are blocking concerns though, so green tick it is. :)
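For reference, the ask-for-forgiveness style mentioned in the second point might look like the following minimal sketch. The helper name `list_cache_dir` is hypothetical, and returning an empty list on failure is an assumption about the desired behaviour, not something decided in this thread.

```python
import os


def list_cache_dir(path):
    """Return the entries of path, or an empty list if it cannot be listed."""
    try:
        # Attempt the operation directly instead of checking os.path.isdir first,
        # which avoids a race between the check and the listing.
        return os.listdir(path)
    except OSError:
        # Covers a missing directory, a non-directory path and permission errors.
        return []
```

Compared with the `os.path.isdir` check in the reviewed snippet, this performs a single file system call on the happy path and stays correct if the directory disappears between the check and the listing.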
LOL, thanks GitHub.
Thanks for all your work here @sbidoul! Thanks for the reviews @chrahunt and @xavfernandez! Thanks again for the ping @chrahunt. :)
Thanks all!
Make sure `pip wheel` never outputs pure python wheels with a python implementation tag. Better fix/workaround for #3025 by using a per-implementation wheel cache instead of caching pure python wheels with an implementation tag in their name.

Fixes #7296