[PURL-TYPE: golang] fix type spec regarding path segments #308
Comments
I don't think this is right either. Go module names are case sensitive, including the first path element. Uppercase characters are forbidden for the first path element, so lowercasing is unnecessary for valid module names and can turn invalid module names into valid module names. It's easier and more accurate to just leave the module name how it is.
|
Option C: namespace must be empty with encoded names (any casing)
This closely mimics the capabilities supported for go module names. Since this is a breaking change, a new package type such as |
Isn't Option C the same as Option B? Related to namespaces: #294 It's not possible to just make a new go package type to avoid versioning PURL. This would create distinct PURLs for both go and golang which refer to the same package, but only in certain contexts, and likely lead to unexpected and inconsistent normalization where some software translates golang PURLs into go PURLs and other software considers the normalized PURL to be a distinct package. This is a problem especially for software that tries to match PURLs from different sources. |
@matt-phylum, option B has a detailed specification about the name property. My proposal is not to have any opinion. purl doesn't have a concept of versioning built in, so the producer and the consumer both need to agree on the exact version to follow. Distinct package types avoid this problem. The benefit is that it can be used as a precedent to improve npm, pypi etc. |
Creating distinct package types avoids one problem by creating a bigger problem. Creating several slightly different types with slightly different behaviors defeats the purpose of having a standardized way of naming packages. If all possible type variations are valid simultaneously, all implementations need to support all ways to refer to packages relevant to that implementation. Existing software would not understand the new types until updated (similar to versioning PURL?). Humans working with PURLs will need to remember which rules apply to which types. Normalizing from one type to a preferred type to alleviate this issue would be a significant change to normalization that would cause issues with interoperability between products and compatibility with existing data.
I think removing the namespace for Go or other package types and instead putting a percent encoded path into the name, whether with a new type or a new version, would be a disaster because it would break compatibility with almost all existing Go PURLs and PURL implementations.
There's no standard format for deconstructed PURLs so it's safe to change the spec so Go packages do not have namespaces as long as the path is used without percent encoding, resulting in the same serialized representation. It'd probably be best to do this across all package types at the same time so PURL implementations can be simplified by combining the two components instead of having an extra case for namespace+name combined. |
re: #308 (comment)
Oh there is. see https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst
this would be against existing purl spec. Existing PURL spec: thing is: AFAIK
you are completely wrong here. The opposite is the case:
the namespace-segments and name are to be escaped per PURL spec - regardless of new or old go PURL. nothing changes here. and to distinguish between new and old ... well the one has at least one namespace-segment, the other does not. and downstream usage example: I just wanted to give ideas how this could be solved and how hard it might be. Anyway, I do not want to alter the core PURL spec. All it takes is "fixing" the type spec. |
Neither PHP nor NPM has namespaces. The name of |
you are wrong here.
but all of this does not matter for this discussion here, sorry, please stick to the topic.
|
There's no difference between how NPM and PHP do/don't have namespaces and how Go does/doesn't have namespaces. In all of these cases, the name of the package in the native ecosystem contains slashes, and for PURL the native name is pulled apart into a namespace+name combination that results in the serialized form containing the native name. |
We already have this problem. For example, nixos can wrap a pypi package and build it slightly differently and have a similar package name that may or may not have the same vulnerabilities. Many OS distros also operate similarly. |
I don't see package namespaces in the screenshot. The first arrow looks like it's pointing at "main Composer repository", but the repository is not related to the package name. The other two arrows are pointing at package names.
https://getcomposer.org/doc/01-basic-usage.md#the-require-key
https://getcomposer.org/doc/01-basic-usage.md#package-names Some package types do have namespaces.
composer, docker, golang, huggingface, npm, swift create a PURL namespace by splitting the native package name/id on the last slash such that writing out the PURL in its canonical form gives the appearance of PURL using the native package name/id, despite PURL actually forcing a namespace+name. nuget is actually similar to npm, but handled differently by PURL. NuGet packages usually have a name prefix, but NuGet uses periods as delimiters, and There are a few more I'm not sure about, but the rest forbid namespaces. I think it would be a mistake to create a package type which normally puts slashes in its PURL name because it makes PURLs that are difficult for humans and it creates complications if namespaces are removed from the core specification (possible without breaking existing PURLs). |
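The "split on the last slash" behavior described above can be sketched in Go as follows. This is only an illustration of the idea, not code from any PURL implementation; the function name `splitNativeName` is hypothetical, and percent-encoding of the individual segments is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// splitNativeName is a hypothetical helper: it splits a native package
// name on its last slash into a PURL-style namespace and name.
// A name without any slash yields an empty namespace.
func splitNativeName(native string) (namespace, name string) {
	i := strings.LastIndex(native, "/")
	if i < 0 {
		return "", native
	}
	return native[:i], native[i+1:]
}

func main() {
	for _, native := range []string{
		"packages.example.com/MyOrg/foO", // Go-style module path
		"@angular/core",                  // npm scoped package
		"left-pad",                       // no slash: empty namespace
	} {
		ns, name := splitNativeName(native)
		fmt.Printf("native=%q namespace=%q name=%q\n", native, ns, name)
	}
}
```

Joining the namespace and name with a slash gives back the native name, which is why the serialized PURL appears to contain the native name unchanged.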
reminder: this is about the current Each ecosystem has own requirements, each ecosystem is facing different standards and constraints. |
@jkowalleck, we are seeing similar issues and potential workarounds across other package types, which is what we are trying to convey here. I think the next step could be for the core maintainers to digest the information and come up with something authoritative. |
re: #308 (comment) I see, but this does not help this particular problem. In the meantime, this particular issue for PS: nuff said. will unfollow this issue, since i am not really affected as a non- |
re: Option B, and to @jkowalleck 's point about namespaces: #63 (comment) |
see PURL spec : https://github.com/package-url/purl-spec/blob/b33dda1cf4515efa8eabbbe8e9b140950805f845/PURL-SPECIFICATION.rst#rules-for-each-purl-component
see PURL-TYPE spec for golang: purl-spec/PURL-TYPES.rst, lines 300 to 314 in b33dda1
Problem
According to the PURL-TYPE spec for golang, "The namespace and name must be lowercased."
This means that every URL path-part of a hosted Go module MUST be lowercased for PURL namespaces.
URL path-parts are case-sensitive by definition.
Therefore, the TYPE spec is not helpful: by modifying the URL path-part it renders it unusable in namespaces, because it makes PURLs indistinguishable and unusable for package retrieval.
see also: google/deps.dev#93
see also: https://www.youtube.com/watch?v=Lts4NjHqKIw&t=1004s
Example
Module with the topic of preserving a thing: hosted at https://example.com/pakages/Preserve would have a purl pkg:golang/example.com/pakages/preserve.
Module with the topic of an event before serving a thing: hosted at https://example.com/pakages/preServe would have a purl pkg:golang/example.com/pakages/preserve.
Issue A: Both PURLs are the same, but the modules are not.
Issue B: none of the PURL namespace/name segments are usable to build the original/actual distribution/source URL from it.
Possible Solution
Option A: simply allow case-sensitivity
When converting a URL to a PURL namespace, the host-part of the URL MUST be lowercased, and the path-part segments of the URL MUST NOT be modified.
Example:
https://packages.EXAMPLE.com/MyOrg/foO --> PURL pkg:golang/packages.example.com/MyOrg/foO
https://packages.example.com/ACME/foo --> PURL pkg:golang/packages.example.com/ACME/foo
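A rough Go sketch of the Option A conversion, using only the standard library. The helper name `optionAPurl` is made up for illustration, and the sketch assumes that the path of the input URL is already percent-encoded the way the PURL should carry it; version, qualifiers and subpath are ignored.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// optionAPurl is a hypothetical helper for Option A: the host-part is
// lowercased, the path-part is taken over unmodified.
func optionAPurl(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if u.Host == "" {
		return "", fmt.Errorf("no host in %q", rawURL)
	}
	// EscapedPath keeps the path in its encoded form, including its case.
	return "pkg:golang/" + strings.ToLower(u.Host) + u.EscapedPath(), nil
}

func main() {
	for _, in := range []string{
		"https://packages.EXAMPLE.com/MyOrg/foO",
		"https://packages.example.com/ACME/foo",
	} {
		purl, err := optionAPurl(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(in, "-->", purl)
	}
}
```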
In case the proposed solution above is considered a breaking change: deprecate the existing PURL-TYPE golang and create a new PURL-TYPE go (see #67), and define the PURL TYPE as proposed above.
Option B: no namespaces, all encoded name
This would definitely be a breaking change, so it requires deprecating TYPE golang and coming up with a reboot: go (see "Go is called Go, not Golang" #67).
namespace must be empty
name must be the lowercased host-part of the distribution URL followed by the unmodified path-part of the distribution URL
subpath is used to point to a case-sensitive subpath inside a package.
version is often empty when a commit is not specified and should be the commit in most cases when available.
Example:
https://packages.EXAMPLE.com/MyOrg/foO%26bar --> PURL pkg:golang/packages.example.com%2FMyOrg%2FfoO%2526bar
https://packages.example.com/ACME/foo --> PURL pkg:golang/packages.example.com%2FACME%2Ffoo
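The Option B encoding could be sketched in Go like this. `optionBPurl` is a hypothetical name, and the sketch keeps the type `golang` only because the examples above use it. It assumes the name is built from the lowercased host plus the path exactly as it appears in the URL, and that the result is then percent-encoded as a single segment, so that "/" becomes %2F and an existing "%" becomes %25.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// optionBPurl is a hypothetical helper for Option B: empty namespace,
// and a name made of the lowercased host followed by the unmodified
// path, percent-encoded as one single segment.
func optionBPurl(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if u.Host == "" {
		return "", fmt.Errorf("no host in %q", rawURL)
	}
	name := strings.ToLower(u.Host) + u.EscapedPath()
	// url.PathEscape encodes "/" as %2F and "%" as %25.
	return "pkg:golang/" + url.PathEscape(name), nil
}

func main() {
	for _, in := range []string{
		"https://packages.EXAMPLE.com/MyOrg/foO%26bar",
		"https://packages.example.com/ACME/foo",
	} {
		purl, err := optionBPurl(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(in, "-->", purl)
	}
}
```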
Please bear with me, I am just the person who happened to write this report. I do not know much about the golang ecosystem, but I know something about PURL.