Setting file.metadata.selection to none still results in files in the SBOM
#2989
Comments
I can confirm there seems to be something unexpected happening here:
It results in a files section with no metadata or any other information such as digests:
For what it's worth: I think it might make sense for this flag to prevent metadata from being captured, rather than preventing files from being captured. Perhaps we should think about introducing a new configuration option for the entire file section to disable all file data collection, e.g.:

file:
  # enable file cataloging
  enabled: true
  - or -
  selection: ...

  metadata:
    # select which files should be captured by the file-metadata cataloger and included in the SBOM.
    # Options include:
    #  - "all": capture all files from the search space
    #  - "owned-by-package": capture only files owned by packages
    #  - "none", "": do not capture any files (env: SYFT_FILE_METADATA_SELECTION)
    selection: 'owned-by-package'

    # the file digest algorithms to use when cataloging files (options: "md5", "sha1", "sha224", "sha256", "sha384", "sha512") (env: SYFT_FILE_METADATA_DIGESTS)
    digests:
      - 'sha1'
      - 'sha256'
... |
I agree that someone could interpret setting this to |
I have some questions about the changes here:
@tomersein besides the surprise of the setting not doing what you expected, is there a reason you don't want a |
I actually want a way to disable this cataloger. |
You mean the file metadata cataloger? |
Yes, but when I set it to "none" I still see results, which makes me think it is still running. |
I think we want to agree on what Syft currently does before we decide to change it. For that reason, I made 3 SBOMs with the possible settings, like this:

SYFT_FILE_METADATA_SELECTION=none syft alpine:latest -o json > syft-none.json
SYFT_FILE_METADATA_SELECTION=owned-by-package syft alpine:latest -o json > syft-owned.json
SYFT_FILE_METADATA_SELECTION=all syft alpine:latest -o json > syft-all.json

Then we can use
So we can see here that when the file metadata cataloger is set to "all", it gets metadata for all the files in the image. But why do "syft-none" and "syft-owned" have the same number of files? It's because when Syft finds files that are owned by a package, it emits a relationship of type "contains" for that package, where the parent is the package and the child is the file. So does setting
So we can see that less information is captured in the

This brings us to the PR feedback @wagoodman had and the question I had: When Syft finds a package that owns files, this information is encoded in the SBOM via relationships. (If the metadata cataloger is set to "none", then these files lack metadata like digest and mode, but they are still present.)

The feedback on the PR is: We need to have the files and the relationships that point to them, or neither.

The question I have is: Are we willing to let the metadata cataloger selection interfere with other settings that rely on file relationships, like removing binaries that overlap by file ownership with OS packages?

I've added needs discussion, since I think the right course of action here isn't obvious. |
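To make the coupling between file entries and relationships described above concrete, here is a minimal Python sketch that could be run against the three SBOMs. It assumes the Syft JSON output has a top-level "files" array whose entries carry "id", "metadata", and "digests", and a top-level "artifactRelationships" array whose entries carry "parent", "child", and "type". The helper names (load, summarize, drop_files) are hypothetical and not part of Syft.

```python
import json


def load(path):
    """Read one SBOM produced with `-o json`."""
    with open(path) as fh:
        return json.load(fh)


def summarize(doc):
    """Count file entries and the 'contains' relationships that point at them."""
    files = doc.get("files") or []
    file_ids = {f["id"] for f in files}
    rels = doc.get("artifactRelationships") or []
    return {
        "files": len(files),
        "files_with_metadata": sum(1 for f in files if f.get("metadata")),
        "files_with_digests": sum(1 for f in files if f.get("digests")),
        "contains_to_file": sum(
            1
            for r in rels
            if r.get("type") == "contains" and r.get("child") in file_ids
        ),
    }


def drop_files(doc):
    """Return a shallow copy with no file entries and no relationship touching one.

    This keeps the document consistent: files and the relationships that
    point to them are removed together, never one without the other.
    """
    file_ids = {f["id"] for f in doc.get("files") or []}
    out = dict(doc)
    out["files"] = []
    out["artifactRelationships"] = [
        r
        for r in doc.get("artifactRelationships") or []
        if r.get("parent") not in file_ids and r.get("child") not in file_ids
    ]
    return out
```

Calling summarize(load("syft-none.json")) and summarize(load("syft-owned.json")) would show whether the two documents differ only in how many file entries carry metadata and digests. drop_files is a post-processing workaround only; it does nothing about settings inside Syft that rely on file relationships.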
Based off of the discussion on the live stream about using the existing cataloger selection facilities in syft, I have a separate proposal for being able to augment how to turn off file catalogers: #3505 Essentially this would allow for |
Hi @tomersein! Thanks for your patience on this issue. We did some digging, and there are a few places file objects are created besides the file metadata cataloger:
So if your goal is to have no
I don't know that emitting these file relationships takes very much time. You can see which catalogers are taking time by passing
You can also set
I hope this helps! Let me know how you'd like to proceed with this issue and the related PR. |
I think the next step here is to get #3505 merged. I'm removing |
What happened:
When I set file.metadata.selection to "none", I still get a "files" entry in the final JSON.
What you expected to happen:
If I set it to "none", the "files" entry should be removed.
Steps to reproduce the issue:
use this config.yaml:
Scan an image or directory.
Anything else we need to know?:
I did a quick check, and the issue appears to be in this function:
func toFile(s sbom.SBOM) []model.File
I think that in the case of "none" it shouldn't enter this function, or it should skip entries where all fields (metadata, digests, etc.) are empty.
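The skip rule suggested above can be modeled in a few lines. This is a Python illustration of the logic only, not the Go code of toFile: the names has_file_data and to_files are hypothetical, and the field names ("metadata", "digests", "contents", "licenses", "executable") are assumptions based on the JSON output.

```python
# Fields that, when present and non-empty, mean a file entry carries real data.
# This list is an assumption for illustration.
DATA_FIELDS = ("metadata", "digests", "contents", "licenses", "executable")


def has_file_data(entry):
    """True when a file entry holds anything beyond its id and location."""
    return any(entry.get(field) for field in DATA_FIELDS)


def to_files(entries, skip_empty=True):
    """Model of the proposed behavior: omit entries with no data when skipping."""
    return [e for e in entries if not skip_empty or has_file_data(e)]
```

Note that skipping entries this way would leave any relationship that references a skipped file pointing at nothing, so relationships would need the same treatment.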
Environment:
syft version: 1.8.0
OS (cat /etc/os-release or similar): macOS