Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Package JSON manifest should support generic attributes #385

Open
nacl opened this issue Jul 14, 2021 · 3 comments
Open

Package JSON manifest should support generic attributes #385

nacl opened this issue Jul 14, 2021 · 3 comments
Assignees
Labels
P3 An issue that we are not working on but will review quarterly

Comments

@nacl
Copy link
Collaborator

nacl commented Jul 14, 2021

I like the idea of the JSON manifest format -- it allows us to create backends for a generic packaging frontends.

I would like to adopt this for pkg_rpm at some point, but in order to do this, the manifest would need to support a generic "attributes" format to support file tags (like %config). This will also be needed for supporting packages in Windows.

So, instead of separate mode, user, and group fields, perhaps we should have an attributes object that can contain arbitrary information, mirroring pkg_attributes in the pkg_filegroup framework?

@nacl nacl added the P2 An issue that should be worked on when time is available label Jul 14, 2021
@aiuto
Copy link
Collaborator

aiuto commented Jul 15, 2021

We can do whatever we want. Since there is, so far, 1 method that writes the manifest, and one that decodes an entry, and they are both private interfaces, changing the format is fair game as we need it.

Another reason for moving to that now might be that we may have to add more user/owner types for #370 . Rather than add more fields at the top level, add them to the embedded struct.

@aiuto
Copy link
Collaborator

aiuto commented Aug 9, 2021

One more requirement: We need operators around this

  • combine - which combines two attribute sets. For example, one has owner + group and the other has mode, yielding one with owner, group, & mode.
  • compare for equality. - when we have the same file put in an archive twice, that is fine if the modes are the same. If the modes conflict, it should raise an error.

@nacl
Copy link
Collaborator Author

nacl commented Nov 20, 2021

Been thinking about this on and off. Initial thoughts:

  • The manifest is a JSON array containing a series of objects. Each object represents an individual installable entity, as is known by Bazel. Each key represents an attribute of the entity, each value represents the value of that attribute.
  • Each entity type is represented by a type field (a string or enumerated value). This type dictates how the "core" attributes are interpreted.
  • "Core" attributes include things like sources, in-package destinations, basic UNIX-style permissions (either octal or strings), and UNIX-style ownership.
  • "Extended" attributes are related to specific entities (e.g. properties of UNIX-style device nodes), or specific packaging rules (e.g. Windows installers, or custom file tags in pkg_rpm)

So, something like this:

[
  {
    "label": "//:foo-config",
    "type": "FILE",
    "source": "bazel-out/k8-linux-fastbuild/foo", // or whatever bazel provides us
    "destination": "{PREFIX}/etc/foo", // Templated destination
    "mode": "rw-r--r--", // or 0644
    "user": "root",
    "group": "root",
    "rpm_filetag": "%config" // Custom attribute for pkg_rpm
    "zip_compression": "STORE" // Theoretical custom attribute for pkg_zip
    "unix_extended_permissions": "+i" // Theoretical custom attribute for UNIX-style installers (set immutable bit) 
   },
   ...
]

This is simple enough to meet all the above requirements and is also relatively futureproof. It also has the unfortunate consequences of requiring more code and redoing a bit of the logic that we already have for the manifest system as used in pkg_zip and pkg_tar. WDYT?

@nacl nacl self-assigned this Dec 20, 2021
@nacl nacl added P1 An issue that must be resolved. Must have an assignee and removed P2 An issue that should be worked on when time is available labels Mar 7, 2022
@aiuto aiuto added P3 An issue that we are not working on but will review quarterly and removed P1 An issue that must be resolved. Must have an assignee labels Sep 12, 2022
nacl pushed a commit to nacl/rules_pkg that referenced this issue Jan 13, 2023
The current way that rules_pkg communicates with must packagers is using a
manifest file, which is currently a JSON data structure based on a an array of
arrays.  While generally readable, it looks strange, as it was to reduce Bazel
resource usage (JSON strings in memory).  Further, our Python code is directly
bound to this data structure format.

However, if we want to add more or change this, it becomes cumbersome on both
the Starlark and Python sides. This change alleviates concerns generally by:

- Converting all manifests to a JSON "object" style, improving readability.
  Numerous "golden" tests were updated to support this.
- Replace the `collections.namedtuple`-based `ManifestEntry` object with one
  that is a little more flexible and type-safe.
- Providing a function (`read_entries_from`) that converts a file-like object
  into a list of `ManifestEntry`s, and replacing all JSON reading in packagers
  (`tar`, `zip`, `install`) and their tests with this function.

Other convenience factors or things addressed:

- `ManifestEntry.entry_type` is now just `ManifestEntry.type`
- Bazel 6 now stringifies repository-local labels with a preceding `@`, unlike
  prior versions.  Adapt to this in the manifest writer.

Future changes will extend this interface to allow for custom attributes to be
passed from `pkg_files` and friends, which, among other things, will be
necessary to more generically support `pkg_rpm`.

Provides groundwork for (but doesn't resolve) bazelbuild#385.
nacl pushed a commit that referenced this issue Feb 8, 2023
Genericize package manifest system and interface

The current way that rules_pkg communicates with must packagers is using a
manifest file, which is currently a JSON data structure based on a an array of
arrays.  While generally readable, it looks strange, as it was to reduce Bazel
resource usage (JSON strings in memory).  Further, our Python code is directly
bound to this data structure format.

However, if we want to add more or change this, it becomes cumbersome on both
the Starlark and Python sides. This change alleviates concerns generally by:

- Converting all manifests to a JSON "object" style, improving readability.
  Numerous golden tests were updated to support this.
- Replace the `collections.namedtuple`-based `ManifestEntry` object with one
  that is a little more flexible and type-safe.
- Providing a function (`read_entries_from`) that converts a file-like object
  into a list of `ManifestEntry`s, and replacing all JSON reading in packagers
  (`tar`, `zip`, `install`) and their tests with this function.

Other convenience factors or things addressed:

- `ManifestEntry.entry_type` is now just `ManifestEntry.type`
- Bazel 6 now stringifies repository-local labels with a preceding `@`, unlike
  prior versions.  Adapt to this in the manifest writer.

Future changes will extend this interface to allow for custom attributes to be
passed from `pkg_files` and friends, which, among other things, will be
necessary to more generically support `pkg_rpm`.

Provides groundwork for (but doesn't resolve) #385.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 An issue that we are not working on but will review quarterly
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants