Restrict deleting packages #112
Yeah, I totally agree it's worth adding such a restriction. It's good practice to deprecate an existing release by publishing a new one, but a recently published release shouldn't simply be deleted.
Given pypa/pip#3634, it might also be worthwhile to limit adding files to packages that already exist (say, to the same 24-hour span), as that pip issue means that new files could potentially break anybody with locked-down hashes in their requirements.
It's a foreseeable problem. I hope PyPI will adopt a cautious policy before Python suffers its own left-pad-style fiasco. I remain concerned that a malicious (or hacked) package maintainer could upload malware to an already-published version of a popular package (#75).
FWIW, using hashes does prevent that attack, and that's probably a better approach anyway if you really want to make sure that the packages you download haven't been subject to tampering of any sort. It's just that, without pypa/pip#3634, a well-meaning maintainer who uploads a wheel after the fact that should be useful to me instead makes my build fail, which is not ideal.
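For readers unfamiliar with pip's hash-checking mode: a requirements file can pin each dependency to one or more expected artifact digests, where a digest is just the SHA-256 of the downloaded file. A minimal sketch of computing such a digest (the function name is mine, not pip's):

```python
import hashlib


def artifact_sha256(path: str) -> str:
    """SHA-256 hex digest of a downloaded artifact, read in chunks so
    large wheels/sdists aren't loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

In a requirements file this ends up as lines like `somepkg==1.0 --hash=sha256:<digest>`; pip then refuses to install any artifact whose digest doesn't match, which is exactly why a maintainer uploading a new wheel after the fact breaks such builds.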
I've added a note to #75 pointing out that the potential for sdist replacement should be resolved now, given the policy changes in PEP 527. If that isn't the case, then it's a bug in Warehouse's enforcement of that policy, and is better handled as a direct issue report against Warehouse rather than as a question here.

When it comes to unpublishing, we've long had the policy that using PyPI directly for institutional purposes without a caching proxy in front of it is an irresponsible development practice that fails to account for the complete lack of any form of contractual relationship between the software publisher and the software user (and also wastes the bandwidth donated by the PSF's CDN provider, Fastly). This means we come down firmly on the side of publishers here: while we consider silently replacing old releases to be a security concern, unpublishing them is entirely reasonable in the absence of any written guarantees regarding future availability.

PSF sponsors may be able to make a legitimate case for changing the PyPI terms of service to prohibit removal of previously published releases (rather than merely allowing indefinite caching by others, as they do now), but that would need to be in the context of those sponsors actually making ongoing contributions to PyPI's sustaining engineering. (This would be comparable to NPM changing their own policy to account for the needs of their commercial customers.)

At a technical level, pypi/warehouse#720 covers the possible introduction of a different approach to release management that would permit all of the artifacts in a release to be staged and then published as a single coherent unit. Adopting such an approach would also allow end users to make informed software consumption decisions based on the release model that particular projects use.
I'm not sure that's sufficient. It's not just that a caching proxy is required – the caching proxy must also have an extremely long cache expiry. For example, for devpi, the default is only 30 minutes – perfectly reasonable for alleviating load on PyPI, but not going to prevent this sort of problem.

Also, this issue isn't specific to institutional use. Open-source projects aren't going to ignore best practices around locking down concrete dependencies, so they'll suffer the same pain. To me, this isn't really a question of security – it's a question of usability in any case.

I have a number of packages on PyPI myself, but ultimately there are many more users than there are package maintainers. If, say, a maintainer of a commonly-used package such as six or Django loses control of his or her PyPI account or otherwise unpublishes the package, it's going to cause a huge amount of pain to the entire Python community.
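To make the devpi point concrete: the mirror cache lifetime is configurable, so a site that wants any insurance against unpublishing has to raise it explicitly. A hedged sketch – the option name below is taken from devpi-server's documented settings, but check the version you actually run:

```shell
# Default is 1800 seconds (30 minutes): fine for shedding load from
# PyPI, useless as protection against deleted releases.
devpi-server --mirror-cache-expiry 31536000   # keep mirror index data ~1 year
```

Even then, a very long expiry only helps for releases the proxy has already seen; it is a mitigation, not a policy fix.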
Right, but the PSF has zero paid support staff for PyPI, and the volunteer admins already receive more support requests from for-profit companies than they have the ability to handle. We're not going to institute any policies that would create more work for the existing volunteers in the absence of ongoing funding that allows management of those support queues to be handed off to paid PSF staff instead.
Deciding what to allow or disallow comes down to a balancing act between giving package authors control over managing their software's lifecycle as they see fit, and restrictions that give end users a set of expectations they know cannot be invalidated. There are really two separate issues here.

The first is that uploading to older releases (or even the most recent release) can cause a different artifact to be found than what was previously fetched. This generally isn't an issue except when the new artifact is broken in some way, or when you're using hash features in pip or similar. For broken artifacts, we're unlikely to do much about them: we expect authors to manage those edge cases in whatever way makes the most sense for their project (likely removing the broken artifact). The hash issue is really a tooling-specific thing, and in pip specifically we should probably just exclude anything that doesn't match one of our expected hashes for the dependency.

The other issue is that of deletions. There are a lot of pros and cons to a variety of different options here. I don't think we can change this at this point without a discussion and rough consensus on distutils-sig (and possibly a PEP, if it doesn't seem like we can get rough consensus without one).
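The "just exclude anything that doesn't match" idea can be sketched as a filtering step over the available artifacts for a pinned requirement. The names and data structures below are illustrative only, not pip's actual internals:

```python
from typing import Iterable, List, NamedTuple, Set


class Candidate(NamedTuple):
    """One downloadable artifact (wheel or sdist) for a requirement."""
    filename: str
    sha256: str


def usable_candidates(candidates: Iterable[Candidate],
                      expected_hashes: Set[str]) -> List[Candidate]:
    """Drop artifacts whose digest isn't among the pinned hashes.

    With this behaviour, a wheel uploaded to an old release after the
    requirements were locked is simply never selected, instead of
    causing the whole install to fail.
    """
    candidates = list(candidates)
    if not expected_hashes:           # nothing pinned: keep everything
        return candidates
    return [c for c in candidates if c.sha256 in expected_hashes]
```

The design choice here is "ignore unknown artifacts" rather than "abort on unknown artifacts", which trades a little strictness for resilience to after-the-fact uploads.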
In pypi/warehouse#720 (comment), I've suggested that we might be able to tie a policy change to the introduction of the staging mode, such that publishers can choose between two ways of working:
New projects would start out with the immutable release model by default (but could opt out if they really wanted to), while existing projects would have to opt in to switching over from the status quo.
I'll move this discussion to the mailing list, as it seems like that's a better venue per #112 (comment). Sorry to bounce this around so much. Hopefully I don't screw up using the mailing list and end up causing more problems. 🤞
Sorry to introduce the hashes thing here – that's a bit of a distraction.
(Was pypi/legacy#738, thanks to @ewdurbin for pointing me in the right direction)
Earlier, a number of users encountered broken builds when [email protected], originally published on 2017-11-13, was unpublished on 2017-11-23. This is because those following best practices around fully locking down dependencies (e.g. via `Pipfile.lock`) were pointed at the no-longer-existing v3.5.0.

Some time ago, there was a similar problem in the npm ecosystem around the `left-pad` package getting unpublished: https://www.theregister.co.uk/2016/03/23/npm_left_pad_chaos/, http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm

As a consequence, npm adopted a policy that prohibited deleting versions more than 24 hours old without contacting support: http://blog.npmjs.org/post/141905368000/changes-to-npms-unpublish-policy
I believe PyPI should adopt a similar policy – perhaps exactly the same one.