Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pip_audit: dedupe vulnerability reports by alias #232

Merged
merged 8 commits into from
Feb 7, 2022

Conversation

woodruffw
Copy link
Member

Only implemented for PyPI so far. OSV probably needs similar changes.

Closes #231.

@woodruffw woodruffw requested a review from di February 4, 2022 16:37
@woodruffw woodruffw added enhancement New feature or request component:vuln-sources Components that provide sources of vulnerability information labels Feb 4, 2022
This will make normalizing OSV's results a little easier,
and gives us the ability to add more detail to the
output formats in the future.
@woodruffw woodruffw requested a review from tetsuo-cpp February 4, 2022 22:16
@di
Copy link
Member

di commented Feb 4, 2022

I was kind of imagining that this would either happen here or here in a way that isn't specific to the service, in a way that would take a list of VulnerabilityResults and condense ones with a shared alias into a MultiVulnerabilityResult (or some other name) that contains multiple VulnerabilityResults.

Additionally, this would make it easier to provide an option to display all the correlated vulnerability IDs when outputting these, e.g.

$ pip-audit --show-dupes
Found 2 known vulnerabilities in 1 package
Name  Version ID                              Fix Versions
----  ------- ------------------------------- ------------
Flask 0.5     PYSEC-2019-179, FOOBAR-2019-666 1.0
Flask 0.5     PYSEC-2018-66                   0.12.3

@woodruffw
Copy link
Member Author

woodruffw commented Feb 4, 2022 via email

@tetsuo-cpp
Copy link
Contributor

I'd also like the deduplication logic to be moved out of the vuln service if possible. But other than that, it looks good.

@woodruffw
Copy link
Member Author

Should be good for another review. We now consider each vulnerability's id when evaluating aliases, per #231 (comment)

One open question: should we collect all aliases for a vulnerability and update the "canonical" vulnerability with them? For example, the following result list:

        [
            VulnerabilityResult(
                id="PYSEC-0",
                description="fake",
                fix_versions=[Version("1.1.0")],
                aliases={"CVE-XXXX-YYYYY"},
            ),
            VulnerabilityResult(
                id="FAKE-1",
                description="fake",
                fix_versions=[Version("1.1.0")],
                aliases={"CVE-XXXX-YYYYY"},
            ),
            VulnerabilityResult(
                id="CVE-XXXX-YYYYY",
                description="fake",
                fix_versions=[Version("1.1.0")],
                aliases={"FAKE-1"},
            ),
        ]

results in one vulnerability with an id of PYSEC-0 and an aliases set of just {"CVE-XXXX-YYYYY"}. Should it include FAKE-1 as well?

@di
Copy link
Member

di commented Feb 7, 2022

Yes, I think so. Let's assume the alias-id relationship to be bi-directional even if we don't see both directions.

@woodruffw
Copy link
Member Author

Sounds good, I'll adjust for that in a moment.

Copy link
Member

@di di left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks a lot cleaner than the first pass. I think this LGTM once the comment regarding aliases/ids is resolved.

@woodruffw
Copy link
Member Author

Added alias merging and a corresponding test; let me know what you think.

@woodruffw woodruffw merged commit 6a6b1fc into main Feb 7, 2022
@woodruffw woodruffw deleted the ww/dedupe-on-aliases branch February 7, 2022 19:20
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Feb 20, 2022
### Added

* CLI: The `--fix` flag has been added, allowing users to attempt to
  automatically upgrade any vulnerable dependencies to the first safe version
  available ([#212](pypa/pip-audit#212),
  [#222](pypa/pip-audit#222))

* CLI: The combination of `--fix` and `--dry-run` is now supported, causing
  `pip-audit` to perform the auditing step but not any resulting fix steps
  ([#223](pypa/pip-audit#223))

* CLI: The `--require-hashes` flag has been added which can be used in
  conjunction with `-r` to check that all requirements in the file have an
  associated hash ([#229](pypa/pip-audit#229))

* CLI: The `--index-url` flag has been added, allowing users to use custom
  package indices when running with the `-r` flag
  ([#238](pypa/pip-audit#238))

* CLI: The `--extra-index-url` flag has been added, allowing users to use
  multiple package indices when running with the `-r` flag
  ([#238](pypa/pip-audit#238))

### Changed

* `pip-audit`'s minimum Python version is now 3.7.

* CLI: The default output format is now correctly pluralized
  ([#221](pypa/pip-audit#221))

* Output formats: The SBOM output formats (`--format=cyclonedx-xml` and
  `--format=cyclonedx-json`) now use CycloneDX
  [Schema 1.4](https://cyclonedx.org/docs/1.4/xml/)
  ([#216](pypa/pip-audit#216))

* Vulnerability sources: When using PyPI as a vulnerability service, any hashes
  provided in a requirements file are checked against those reported by PyPI
  ([#229](pypa/pip-audit#229))

* Vulnerability sources: `pip-audit` now uniques each result based on its
  alias set, reducing the amount of duplicate information in the default
  columnar output format
  ([#232](pypa/pip-audit#232))

* CLI: `pip-audit` now prints its output more frequently, including when
  there are no discovered vulnerabilities but packages were skipped.
  Similarly, "manifest" output formats (JSON, CycloneDX) are now emitted
  unconditionally
  ([#240](pypa/pip-audit#240))

### Fixed

* CLI: A regression causing excess output during `pip audit -r`
  was fixed ([#226](pypa/pip-audit#226))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:vuln-sources Components that provide sources of vulnerability information enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Review how pip-audit handles multiple vulnerabilities with the same alias
3 participants