Skip to content

Commit

Permalink
Add a dedicated topic page on "Secure Installs"
Browse files Browse the repository at this point in the history
This moves the content about hash checking mode into here, along with
adding minor additional recommendation of using `--no-binary :all:`.
  • Loading branch information
pradyunsg committed Feb 11, 2022
1 parent 6e30407 commit e512b34
Show file tree
Hide file tree
Showing 4 changed files with 105 additions and 158 deletions.
160 changes: 3 additions & 157 deletions docs/html/cli/pip_install.rst
Original file line number Diff line number Diff line change
Expand Up @@ -230,164 +230,10 @@ This is now covered in :doc:`../topics/caching`.

This is now covered in :doc:`../topics/caching`.

.. _`hash-checking mode`:
.. rubric:: Hash checking mode
:name: hash-checking-mode

Hash-Checking Mode
------------------

Since version 8.0, pip can check downloaded package archives against local
hashes to protect against remote tampering. To verify a package against one or
more hashes, add them to the end of the line::

FooProject == 1.2 --hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 \
--hash=sha256:486ea46224d1bb4fb680f34f7c9ad96a8f24ec88be73ea8e5a6c65260e9cb8a7

(The ability to use multiple hashes is important when a package has both
binary and source distributions or when it offers binary distributions for a
variety of platforms.)

The recommended hash algorithm at the moment is sha256, but stronger ones are
allowed, including all those supported by ``hashlib``. However, weaker ones
such as md5, sha1, and sha224 are excluded to avoid giving a false sense of
security.

Hash verification is an all-or-nothing proposition. Specifying a ``--hash``
against any requirement not only checks that hash but also activates a global
*hash-checking mode*, which imposes several other security restrictions:

* Hashes are required for all requirements. This is because a partially-hashed
requirements file is of little use and thus likely an error: a malicious
actor could slip bad code into the installation via one of the unhashed
requirements. Note that hashes embedded in URL-style requirements via the
``#md5=...`` syntax suffice to satisfy this rule (regardless of hash
strength, for legacy reasons), though you should use a stronger
hash like sha256 whenever possible.
* Hashes are required for all dependencies. An error results if there is a
dependency that is not spelled out and hashed in the requirements file.
* Requirements that take the form of project names (rather than URLs or local
filesystem paths) must be pinned to a specific version using ``==``. This
prevents a surprising hash mismatch upon the release of a new version
that matches the requirement specifier.
* ``--egg`` is disallowed, because it delegates installation of dependencies
to setuptools, giving up pip's ability to enforce any of the above.

.. _`--require-hashes`:

Hash-checking mode can be forced on with the ``--require-hashes`` command-line
option:

.. tab:: Unix/macOS

.. code-block:: console
$ python -m pip install --require-hashes -r requirements.txt
...
Hashes are required in --require-hashes mode (implicitly on when a hash is
specified for any package). These requirements were missing hashes,
leaving them open to tampering. These are the hashes the downloaded
archives actually had. You can add lines like these to your requirements
files to prevent tampering.
pyelasticsearch==1.0 --hash=sha256:44ddfb1225054d7d6b1d02e9338e7d4809be94edbe9929a2ec0807d38df993fa
more-itertools==2.2 --hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0
.. tab:: Windows

.. code-block:: console
C:\> py -m pip install --require-hashes -r requirements.txt
...
Hashes are required in --require-hashes mode (implicitly on when a hash is
specified for any package). These requirements were missing hashes,
leaving them open to tampering. These are the hashes the downloaded
archives actually had. You can add lines like these to your requirements
files to prevent tampering.
pyelasticsearch==1.0 --hash=sha256:44ddfb1225054d7d6b1d02e9338e7d4809be94edbe9929a2ec0807d38df993fa
more-itertools==2.2 --hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0
This can be useful in deploy scripts, to ensure that the author of the
requirements file provided hashes. It is also a convenient way to bootstrap
your list of hashes, since it shows the hashes of the packages it fetched. It
fetches only the preferred archive for each package, so you may still need to
add hashes for alternatives archives using :ref:`pip hash`: for instance if
there is both a binary and a source distribution.

The :ref:`wheel cache <Wheel cache>` is disabled in hash-checking mode to
prevent spurious hash mismatch errors. These would otherwise occur while
installing sdists that had already been automatically built into cached wheels:
those wheels would be selected for installation, but their hashes would not
match the sdist ones from the requirements file. A further complication is that
locally built wheels are nondeterministic: contemporary modification times make
their way into the archive, making hashes unpredictable across machines and
cache flushes. Compilation of C code adds further nondeterminism, as many
compilers include random-seeded values in their output. However, wheels fetched
from index servers are the same every time. They land in pip's HTTP cache, not
its wheel cache, and are used normally in hash-checking mode. The only downside
of having the wheel cache disabled is thus extra build time for sdists, and
this can be solved by making sure pre-built wheels are available from the index
server.

Hash-checking mode also works with :ref:`pip download` and :ref:`pip wheel`.
See :doc:`../topics/repeatable-installs` for a comparison of hash-checking mode
with other repeatability strategies.

.. warning::

Beware of the ``setup_requires`` keyword arg in :file:`setup.py`. The
(rare) packages that use it will cause those dependencies to be downloaded
by setuptools directly, skipping pip's hash-checking. If you need to use
such a package, see :ref:`Controlling
setup_requires <controlling-setup_requires>`.

.. warning::

Be careful not to nullify all your security work when you install your
actual project by using setuptools directly: for example, by calling
``python setup.py install``, ``python setup.py develop``, or
``easy_install``. Setuptools will happily go out and download, unchecked,
anything you missed in your requirements file—and it’s easy to miss things
as your project evolves. To be safe, install your project using pip and
:ref:`--no-deps <install_--no-deps>`.

Instead of ``python setup.py develop``, use...

.. tab:: Unix/macOS

.. code-block:: shell
python -m pip install --no-deps -e .
.. tab:: Windows

.. code-block:: shell
py -m pip install --no-deps -e .
Instead of ``python setup.py install``, use...

.. tab:: Unix/macOS

.. code-block:: shell
python -m pip install --no-deps .
.. tab:: Windows

.. code-block:: shell
py -m pip install --no-deps .
Hashes from PyPI
^^^^^^^^^^^^^^^^

PyPI provides an MD5 hash in the fragment portion of each package download URL,
like ``#md5=123...``, which pip checks as a protection against download
corruption. Other hash algorithms that have guaranteed support from ``hashlib``
are also supported here: sha1, sha224, sha384, sha256, and sha512. Since this
hash originates remotely, it is not a useful guard against tampering and thus
does not satisfy the ``--require-hashes`` demand that every package have a
local hash.
This is now covered in :doc:`../topics/secure-installs`.

.. rubric:: Local Project Installs
:name: local-project-installs
Expand Down
2 changes: 1 addition & 1 deletion docs/html/reference/requirements-file-format.md
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ The options which can be applied to individual requirements are:

- {ref}`--install-option <install_--install-option>`
- {ref}`--global-option <install_--global-option>`
- `--hash` (for {ref}`Hash-Checking mode`)
- `--hash` (for {ref}`Hash-checking mode`)

## Referring to other requirements files

Expand Down
1 change: 1 addition & 0 deletions docs/html/topics/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,5 +16,6 @@ configuration
dependency-resolution
local-project-installs
repeatable-installs
secure-installs
vcs-support
```
100 changes: 100 additions & 0 deletions docs/html/topics/secure-installs.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
# Secure installs

By default, pip does not perform any checks to protect against remote tampering and involves running arbitrary code from distributions. It is, however, possible to use pip in a manner that changes these behaviours, to provide a more secure installation mechanism.

This can be achieved by doing the following:

- Enable {ref}`Hash-checking mode`, by passing {any}`--require-hashes`
- Disallow source distributions, by passing {any}`--only-binary :all: <--only-binary>`

(Hash-checking mode)=

## Hash-checking Mode

```{versionadded} 8.0
```

This mode uses local hashes, embedded in a requirements.txt file, to protect against remote tampering and network issues. These hashes are specified using a `--hash` [per requirement option](../../reference/requirements-file-format#per-requirement-options).

Note that hash-checking is an all-or-nothing proposition. Specifying `--hash` against _any_ requirement will activate this mode globally.

To add hashes for a package, add them to line as follows:

```
FooProject == 1.2 \
--hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 \
--hash=sha256:486ea46224d1bb4fb680f34f7c9ad96a8f24ec88be73ea8e5a6c65260e9cb8a7
```

### Additional restrictions

- Hashes are required for _all_ requirements.

This is because a partially-hashed requirements file is of little use and thus likely an error: a malicious actor could slip bad code into the installation via one of the unhashed requirements.

Note that hashes embedded in URL-style requirements via the `#md5=...` syntax suffice to satisfy this rule (regardless of hash strength, for legacy reasons), though you should use a stronger hash like sha256 whenever possible.

- Hashes are required for _all_ dependencies.

If there is a dependency that is not spelled out and hashed in the requirements file, it will result in an error.

- Requirements must be pinned (either to a URL, filesystem path or using `==`).

This prevents a surprising hash mismatch upon the release of a new version that matches the requirement specifier.

### Forcing Hash-checking mode

It is possible to force the hash checking mode to be enabled, by passing `--require-hashes` command-line option.

This can be useful in deploy scripts, to ensure that the author of the requirements file provided hashes. It is also a convenient way to bootstrap your list of hashes, since it shows the hashes of the packages it fetched. It fetches only the preferred archive for each package, so you may still need to add hashes for alternatives archives using {ref}`pip hash`: for instance if there is both a binary and a source distribution.

### Hash algorithms

The recommended hash algorithm at the moment is sha256, but stronger ones are allowed, including all those supported by `hashlib`. However, weaker ones such as md5, sha1, and sha224 are excluded to avoid giving a false sense of security.

### Multiple hashes per package

It is possible to use multiple hashes for each package. This is important when a package offers binary distributions for a variety of platforms or when it is important to allow both binary and source distributions.

### Interaction with caching

The {ref}`locally-built wheel cache <wheel-caching>` is disabled in hash-checking mode to prevent spurious hash mismatch errors.

These would otherwise occur while installing sdists that had already been automatically built into cached wheels: those wheels would be selected for installation, but their hashes would not match the sdist ones from the requirements file.

A further complication is that locally built wheels are nondeterministic: contemporary modification times make their way into the archive, making hashes unpredictable across machines and cache flushes. Compilation of C code adds further nondeterminism, as many compilers include random-seeded values in their output.

However, wheels fetched from index servers are required to be the same every time. They land in pip's HTTP cache, not its wheel cache, and are used normally in hash-checking mode. The only downside of having the wheel cache disabled is thus extra build time for sdists, and this can be solved by making sure pre-built wheels are available from the index server.

### Using hashes from PyPI (or other index servers)

PyPI (and certain other index servers) provides a hash for the distribution, in the fragment portion of each download URL, like `#sha256=123...`, which pip checks as a protection against download corruption.

Other hash algorithms that have guaranteed support from `hashlib` are also supported here: sha1, sha224, sha384, sha256, and sha512. Since this hash originates remotely, it is not a useful guard against tampering and thus does not satisfy the `--require-hashes` demand that every package have a local hash.

## Repeatable installs

Hash-checking mode also works with {ref}`pip download` and {ref}`pip wheel`. See {doc}`../topics/repeatable-installs` for a comparison of hash-checking mode with other repeatability strategies.

```{warning}
Beware of the `setup_requires` keyword arg in {file}`setup.py`. The (rare) packages that use it will cause those dependencies to be downloaded by setuptools directly, skipping pip's hash-checking. If you need to use such a package, see {ref}`controlling setup_requires <controlling-setup_requires>`.
```

## Do not use setuptools directly

Be careful not to nullify all your security work by installing your actual project by using setuptools' deprecated interfaces directly: for example, by calling `python setup.py install`, `python setup.py develop`, or `easy_install`.

These will happily go out and download, unchecked, anything you missed in your requirements file and it’s easy to miss things as your project evolves. To be safe, install your project using pip and {any}`--no-deps`.

Instead of `python setup.py install`, use:

```{pip-cli}
$ pip install --no-deps .
```

Instead of `python setup.py develop`, use:

```{pip-cli}
$ pip install --no-deps -e .
```

0 comments on commit e512b34

Please sign in to comment.