pip freeze with a hash #4732
Comments
Is there at least some way to easily script this? E.g., can I loop over a |
PIP would need to calculate and keep the hash somewhere as it installs the package. When doing a freeze, it'd retrieve the information. |
This would be an awesome feature, indeed. |
This sounds like a good idea, although I am not sure how it'll work. As @max-wittig pointed out, the hash needs to be computed when the installation occurs, when the installation source is downloaded. |
You can get the hash from the cached wheel in ~/.cache/pip/wheels/ |
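A minimal sketch of that idea, assuming you already have a path to an archive on disk (the helper name is made up for illustration). It streams the file through hashlib and returns the `--hash=...` form that requirements files accept. Keep in mind that a wheel pip built locally from an sdist will not hash to the same value as the file published on the index.

```python
import hashlib
from pathlib import Path


def requirement_hash_option(archive: Path, algorithm: str = "sha256") -> str:
    """Return a `--hash=<algorithm>:<hexdigest>` option for one archive file.

    Hypothetical helper, not part of pip. The file is read in chunks so that
    large archives do not need to fit in memory.
    """
    digest = hashlib.new(algorithm)
    with archive.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return f"--hash={algorithm}:{digest.hexdigest()}"
```

Calling `requirement_hash_option(Path("some-cached.whl"))` on each archive and appending the result to the matching `name==version` line is roughly the loop the first comment asks about.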
It looks like pipenv is getting the hashes directly from the warehouse api https://github.com/pypa/pipenv/blob/master/pipenv/utils.py#L468-L508 |
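For illustration only (this is not pipenv's code), a sketch of reading hashes from the PyPI JSON API, assuming the per-release endpoint `https://pypi.org/pypi/<name>/<version>/json` and its `urls[].digests.sha256` fields. The function names are hypothetical. Parsing is kept separate from the network call so it can be tried on a saved payload.

```python
import json
from typing import List
from urllib.request import urlopen


def hashes_from_release_json(payload: dict) -> List[str]:
    """Collect `sha256:<digest>` strings for every file of one release."""
    return sorted(
        f"sha256:{entry['digests']['sha256']}"
        for entry in payload.get("urls", [])
        if "sha256" in entry.get("digests", {})
    )


def fetch_release_hashes(name: str, version: str) -> List[str]:
    """Fetch one release from the index and return the hashes of its files.

    Assumption: the index is pypi.org and serves the JSON layout noted above.
    """
    url = f"https://pypi.org/pypi/{name}/{version}/json"
    with urlopen(url) as resp:
        return hashes_from_release_json(json.load(resp))
```

This trusts whatever the index reports at freeze time, which is the concern raised further down the thread about fetching hashes from the remote.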
@andrewchambers perhaps instead of the slight barbs consider sending a PR? |
It appears that this user story (Python developer wanting to hash their dependencies) is addressed by pipenv, a distinct PyPA project. See https://docs.pipenv.org/basics/#pipfile-lock-security-features for details. So I'm closing this issue assuming that this user story is out-of-scope for pip itself, and best handled by a "higher-level" tool. Other readers also might be interested in: |
Are you saying you think the entirety of pip freeze is out of scope for pip
now?
Because if not, this seems like a very logical thing for pip. Not all of us
use any current higher level tool, and it's pip itself that introduced the
possibility of having hashes in requirements files.
Without this feature it's pretty unfeasible to generate those.
Saying "patches welcome" seems very reasonable, but closing not so much.
…On Mon, Aug 6, 2018, 17:26 d❤vid ***@***.***> wrote:
Closed #4732.
|
@Julian note that it was the OP who closed the issue, not the pip developers. The option for someone to create a PR for this remains available to anyone interested in the feature. |
Ah, indeed, thanks! Great, glad to hear it's not being designated as out of scope. |
See the proposal for a |
I've updated the ticket description with the proposed solution (as I understand it). Note that Pipfile-based dependencies are usable today if you use pipenv. |
See today's convoluted workaround at your handy peterbe/hashin#100 |
I just switched to Pipenv, which supports this workflow. Sadly it's still not included in the default Python package. |
Is there any roadmap or concrete discussion about implementing the proposed |
This will generate a
Note that this will directly modify the existing requirements.txt file. You can install |
would still be good to have this directly via |
|
What about generating the lock file at install-time, like npm, yarn, pipenv, poetry, Cargo, and Conan do? (sorry if I missed any)
On updates to
This directly supports the stated use case:
but it avoids a lot of the extra work that is being described in #8519. |
@NoahGorny WRT hashes:
Don't trust remote, but don't necessarily trust local either. Verify they match to give the user assurance that the right thing is installed. If they don't match, generate an error; provide a way to force the install if they don't match, but by default uninstall/rollback if they don't match (or depending where/when you're generating the hashes...don't install to start with, which would be even better). |
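A rough sketch of the "verify before installing" order of operations described above, assuming the expected hashes come from a lock or requirements file. The exception and function names are hypothetical, and this is not how pip's hash-checking mode is implemented.

```python
import hashlib
from pathlib import Path
from typing import Iterable


class HashMismatch(Exception):
    """Raised when a downloaded archive matches none of the expected hashes."""


def verify_archive(archive: Path, allowed_sha256: Iterable[str]) -> str:
    """Return the archive's sha256 hex digest, or raise before anything is installed."""
    actual = hashlib.sha256(archive.read_bytes()).hexdigest()
    allowed = {h.lower() for h in allowed_sha256}
    if actual not in allowed:
        raise HashMismatch(
            f"{archive.name}: got sha256:{actual}, expected one of {sorted(allowed)}"
        )
    return actual
```

Because the check runs on the downloaded file itself, a mismatch stops the process before extraction, so there is nothing to uninstall or roll back.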
We can not always generate the hash locally after installation, that's why we create the new HASH file. However, I am not sure we should fetch hashes from remote each time we freeze the environment...
This requires users to actively generate lockfiles in installations, and only works if the user is installing from requirements file in the first place. This is a good option for such users, but in other use cases I think it does not work just as well |
The approach suggested by @chrahunt in #4732 (comment) is also valuable in a lot of situations. It has complexities to think through too, for instance when the install command is used to update an existing environment, and when pip decides it does not need to reinstall some already installed dependencies. In such cases we'd still need a way to obtain information about the hashes of installed distributions. |
This use case from the original issue assumes we have a requirements file, and several comments refer to Pipfile support, which would work in the same way. I think there may be some people who would want to get their environment set up and then generate a lock file for it, but IMO we risk not actually satisfying this issue adequately if we try to solve that one at the same time.
Good point. It would be worthwhile to see how other dependency managers behave in that situation. If it turns out it's common (and generally agreed to be necessary) to store hashes with the installed packages, then that could be turned right around and included in the PEP itself. :) |
@chrahunt to give confidence that the right thing is being installed, I would think you'd want something generated before it's installed that could easily be verified. Question: what all is getting hashed? (or is proposed to be hashed) |
See https://pip.pypa.io/en/stable/reference/pip_install/#hash-checking-mode -- it's the entire files.
|
That doesn't really say what gets hashed, just requirement around hashing. If it's the generated file, there wouldn't be an issue with hashing wheels. A hash would also be verifiable against what is downloaded vs what is installed. Something is off. |
When you |
@uranusjr if you're checking the hash of the archive prior to extracting it, then it doesn't matter what happens after. If you're hashing what is actually put into the system, then of course it's going to change all the time but that's also an extremely bad design since you cannot have deterministic hashing behavior. Honestly, Python/Pip should follow the package hashing done by RPM, Deb, and others. You hash the package itself, not its installed data. This provides deterministic behavior and can be verified before an install is ever done. IOW - there should be no need to regenerate a wheel from the installation; you're not hashing the installation but the package itself. |
What you describe is exactly what pip is currently doing. The problem in this thread is the other way around: people are looking for a way to generate hashes from installed data, and the pip developers are trying to explain we don’t know how this can be done. |
I had an attempt at #8519 which got stale... |
@uranusjr if that's the case then there is certainly an answer - an emphatic @NoahGorny that doesn't really clarify anything. I did leave a comment about one aspect. |
I personally think pip freeze with hash is desirable, and would facilitate common workflows. It is feasible if we record the hash of the distribution that was downloaded for installation (not the wheel we possibly built locally). It is not trivial because we have the (wheel) cache in between. And adding information in |
Is it that painful to put the locking up front and then use the lock to control the environment? Rather than controlling the environment then reaching back up the data path to get the hashes later? |
@sbidoul if you record the hash of the file that was downloaded (the package) whether wheel or otherwise it's easy to verify. If it's a VCS download, then a driver for the VCS should take the VCS location (git URL, svn URL, etc) and some repo data (git hash, svn revision, etc) to create a hash which could then be standardized and easily used. Trying to generate from the installed data is very problematic from numerous aspects:
A single package (wheel, bdist, sdist) should have exactly 1 hash that would match it. |
@altendky I'd say it is cumbersome today. And it seems the tools that automate it have to hack pip internals or reimplement a sizeable portion of it to achieve that goal. So my feeling is that pip would help broader adoption of hash checking if it exposed mechanisms to facilitate hash discovery. pip freeze with hashes is one such mechanism. Another is to let pip report more information about what it does when installing (or dry-run install), such as the (hashes of) distributions it downloaded to perform the install. |
@BenjamenMeyer I don't think anyone is attempting to do that indeed. Regarding VCS, I'd say we don't really need anything special. I would simply relax a little bit pip's hash checking mode to consider that commit references for VCS that have immutable commit refs (git shas, ...) are sufficient as a hash mechanism. |
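To make the VCS suggestion concrete, here is a sketch of how a tool might decide that a `git+...@<ref>` requirement is pinned to an immutable commit. The function name is hypothetical, the check assumes 40-character SHA-1 commit ids, and nothing here reflects current pip behavior.

```python
import re

# Assumption: a full SHA-1 commit id (40 hex characters) counts as immutable.
_FULL_GIT_SHA = re.compile(r"[0-9a-f]{40}")


def is_pinned_to_git_commit(requirement_url: str) -> bool:
    """True if a `git+...@<ref>` URL names a full 40-character commit SHA."""
    if not requirement_url.startswith("git+"):
        return False
    # Drop any `#egg=...` or `#subdirectory=...` fragment first.
    location = requirement_url.split("#", 1)[0]
    # Skip the scheme and host so a `user@host` part is not mistaken for a ref.
    rest = location.split("://", 1)[-1]
    _host, _, path = rest.partition("/")
    if "@" not in path:
        return False
    ref = path.rsplit("@", 1)[1].lower()
    return _FULL_GIT_SHA.fullmatch(ref) is not None
```

Branch names, tags, and abbreviated SHAs all return False here, since any of them could resolve to different content later.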
@sbidoul, I didn't say pip shouldn't support it, just that perhaps the order of operations should be slightly different than requested here. |
How about an alternative UX:
As it turns out, right now it does almost this, except it prints only one requirement hash on each run of |
Theoretically yes (well it can print out all the hashes it knows; theoretically there are infinite possible hashes), but pip does not currently have the mechanism to do so. A PR exploring this would be much welcomed. |
Would this perhaps be something which would be a good fit with (I suppose it would also be possible to write a small tool which takes the JSON output and converts it to a |
The idea of using JSON format is precisely so that such tools are easy to write without needing changes to pip, so yes, that would be the recommended approach. |
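A sketch of such a converter, under the assumption that the JSON in question looks like pip's installation report: a top-level `install` list whose items carry `metadata.name`, `metadata.version`, and `download_info.archive_info` with either a `hashes` mapping or a legacy `hash` string of the form `sha256=<digest>`. Those field names are my reading of the format and should be checked against the pip documentation; the function name is made up.

```python
from typing import Dict, List


def requirements_from_report(report: dict) -> List[str]:
    """Turn a parsed JSON report into `name==version --hash=...` lines."""
    lines = []
    for item in report.get("install", []):
        meta = item["metadata"]
        archive = item.get("download_info", {}).get("archive_info", {})
        hashes: Dict[str, str] = dict(archive.get("hashes", {}))
        # Fall back to the older single-hash field, written as "<algo>=<digest>".
        if not hashes and "hash" in archive:
            algo, sep, digest = archive["hash"].partition("=")
            if sep:
                hashes[algo] = digest
        line = f"{meta['name']}=={meta['version']}"
        for algo, digest in sorted(hashes.items()):
            line += f" --hash={algo}:{digest}"
        lines.append(line)
    return sorted(lines)
```

Entries without archive hashes (for example VCS installs) come out as a bare `name==version` line, which hash-checking mode would then reject, so a real tool would need to decide how to handle them.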
Any way forward on this issue? Is there any way someone could help? What is left to be done/discussed? |
Description:
User story: I am a Python developer with an existing requirements.txt file. I want to add hashes to the file, so that future installations are more secure.
What I've run:
At the moment I need to:
pip hash /path/to/package
requirements.txt
It would be great if instead I could:
pip freeze --hash
requirements.txt
Today's solution:
Pipfile is a replacement for requirements.txt that includes hashes in a file called Pipfile.lock. pipenv is a tool for managing your virtualenv based on Pipfile, including checks against the hashes defined in Pipfile.lock. (It can also convert a requirements.txt file.)
Suggested solution:
Supporting Pipfile at the pip layer (rather than a higher-level tool) is on the PyPA roadmap, see https://github.com/pypa/pipfile#pip-integration-eventual :
The implication is that this is the preferred solution to supporting hashes (rather than adding them to requirements.txt or pip freeze). The current status is "Deferred till PR" (see this ticket). See also #6925