Go suggestions for PURL-TYPES.rst #196

Open. Wants to merge 6 commits into base: master. Showing changes from 4 commits.
10 changes: 4 additions & 6 deletions PURL-TYPES.rst
@@ -258,15 +258,13 @@ golang

- There is no default package repository: this is implied in the namespace
using the ``go get`` command conventions.
- The ``namespace`` and `name` must be lowercased.
Member:

Where did this definition of namespace/name go?
See #308.

@tiegz (Author), Oct 17, 2024:

@jkowalleck your issue adds more nuance to this problem, so I left a link there to an argument against namespaces for Go: #63 (comment)

To expedite this PR, I'll just revert that removal for now.

- The ``subpath`` is used to point to a subpath inside a package.
- The ``version`` is often empty when a commit is not specified and should be
the commit in most cases when available.
- The ``subpath`` is used to point to a package inside a module.
- The ``version`` for modules should be semver with a v prefix, or a pseudo-version that points to an untagged commit.
@jdalton, Aug 1, 2024:

pseudo-version is a bit wiggly language.

Suggested change
- The ``version`` for modules should be semver with a v prefix, or a pseudo-version that points to an untagged commit.
- The ``version`` may be empty. Otherwise, it may start with a lowercased "v" followed by a valid semver version or consist of a sha-1 or short sha-1 git commit hash.

Contributor:

pseudo-version is an explicitly defined thing in the ecosystem. However, pseudo-versions also start with v so it's not clear to people who aren't familiar with the idiosyncrasies of Go that you may get something like v0.0.0-20170915032832-14c0d48ead0c which is not semver and has weak ordering with other v0.0.0 versions and cannot be compared with non-v0.0.0 versions. It'd probably be a good idea to link to the Go documentation or give more details here.

@jdalton, Aug 1, 2024:

@matt-phylum Snipping off the "v" gives:

0.0.0-20170915032832-14c0d48ead0c

And then running it through the regexp provided here:
https://semver.org/#is-there-a-suggested-regular-expression-regex-to-check-a-semver-string

const regexSemverNumberedGroups =
    /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/

Results in...

regexSemverNumberedGroups.exec('0.0.0-20170915032832-14c0d48ead0c')
// =>['0.0.0-20170915032832-14c0d48ead0c', '0', '0', '0', '20170915032832-14c0d48ead0c', undefined, index: 0, input: '0.0.0-20170915032832-14c0d48ead0c', groups: undefined]

a valid parse. Semver allows a trailing hyphenated part such as -20170915032832-14c0d48ead0c. It is the pre-release identifier (build metadata would follow a + instead).

> which is not semver and has weak ordering with other v0.0.0 versions and cannot be compared with non-v0.0.0 versions.

There are handy utils for parsing, coercing, and comparing those too:
https://www.npmjs.com/package/semver

@matt-phylum (Contributor), Aug 1, 2024:

Edit: this is wrong and you can skip reading it.

It looks like semver, but it's not semver. Even if it can be parsed using a semver parser, it has none of the semantics of semantic versioning.

Pseudo-versions and regular versions can never be compared without access to the repository history. If a repository has a v1.0.0, there is almost certainly a v0.0.0-* pseudo-version which is logically greater than the v1.0.0.

Pseudo-versions can be compared with other pseudo-versions based on date, but dates do not have the same semantics as version numbers because they don't capture branching information. If there's a problem affecting v1.0.0..v1.0.1 and v2.0.0..v2.0.1, that problem is also affecting the pseudo-versions for those commits and any unreleased commits within those version ranges, and date ordering fails for commits that are made on branches parallel to the fix, or for commits that are made on the v1 series after the v2 series has been fixed but before the v1 series has been fixed. If you sorted the pseudo-versions and visualized whether they are affected or not, the result would be more like a noisy gradient than the sharp edge normally implied by those version ranges.

It's an inherent problem with the tooling trying to create versions for arbitrary commits, not a lack of library support.

Comment:

Here are some docs on pseudo-versions: https://go.dev/ref/mod#pseudo-versions
We could link to them from the docs to clarify what the language refers to.

Contributor:

I was wrong about this. Until today, I had only ever seen pseudo-versions that started with v0.0.0. Even in the documentation about pseudo-versions, every example given starts with v0.0.0. However, there are actually pseudo-versions that start with other prefixes--they are just much less common in the data that I normally deal with. You can treat it as regular semver.

Example of a non-v0.0.0 pseudo-version:

$ go get github.com/spf13/cobra@756ba6dad61458cbbf7abecfc502d230574c57d2
go: downloading github.com/spf13/cobra v1.8.2-0.20240728161807-756ba6dad614

1.8.2-0.20240728161807-756ba6dad614 follows after v1.8.1. There is not yet a tagged 1.8.2 release. This follows the usual semver behavior of being a prerelease greater than the previous release and less than the next release.

It's still not really semver because in this example you can't infer whether one 1.8.2 pseudo-version comes before or after another in commit order, but that's a much more minor difference. In most cases it doesn't matter and it's probably not worth calling attention to beyond just linking to the pseudo-version documentation or just mentioning that it may be a pseudo-version.
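
To make the ordering claim above concrete, here is a minimal sketch of semver precedence (semver.org, section 11) applied to the cobra pseudo-version from this comment. The helper names `parse`, `compareIdentifiers`, and `compareVersions` are made up for illustration and are not part of any purl or Go tooling; the sketch assumes well-formed input and ignores build metadata.

```javascript
// Split "vMAJOR.MINOR.PATCH[-pre][+build]" into numeric core and pre-release identifiers.
// Assumption: input is well-formed; the leading "v" and any "+build" suffix are dropped.
function parse(version) {
  const [withoutBuild] = version.replace(/^v/, '').split('+');
  const dash = withoutBuild.indexOf('-');
  const core = dash === -1 ? withoutBuild : withoutBuild.slice(0, dash);
  const pre = dash === -1 ? [] : withoutBuild.slice(dash + 1).split('.');
  return { core: core.split('.').map(Number), pre };
}

// Compare two pre-release identifiers: numeric ones compare numerically and
// sort below alphanumeric ones, which compare in ASCII order.
function compareIdentifiers(a, b) {
  const aNum = /^\d+$/.test(a);
  const bNum = /^\d+$/.test(b);
  if (aNum && bNum) return Math.sign(Number(a) - Number(b));
  if (aNum) return -1;
  if (bNum) return 1;
  return a < b ? -1 : a > b ? 1 : 0;
}

// Returns -1, 0, or 1 following semver precedence.
function compareVersions(x, y) {
  const a = parse(x);
  const b = parse(y);
  for (let i = 0; i < 3; i++) {
    if (a.core[i] !== b.core[i]) return Math.sign(a.core[i] - b.core[i]);
  }
  // With equal cores, a version without a pre-release outranks one with a pre-release.
  if (a.pre.length === 0 && b.pre.length === 0) return 0;
  if (a.pre.length === 0) return 1;
  if (b.pre.length === 0) return -1;
  const n = Math.min(a.pre.length, b.pre.length);
  for (let i = 0; i < n; i++) {
    const c = compareIdentifiers(a.pre[i], b.pre[i]);
    if (c !== 0) return c;
  }
  return Math.sign(a.pre.length - b.pre.length);
}

const pseudo = 'v1.8.2-0.20240728161807-756ba6dad614';
console.log(compareVersions('v1.8.1', pseudo)); // -1: the pseudo-version follows v1.8.1
console.log(compareVersions(pseudo, 'v1.8.2')); // -1: and precedes a future v1.8.2 tag
```

The pseudo-version sorts as a pre-release of 1.8.2, which matches the behavior described in the comment.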

Comment:

@matt-phylum Cool! I think I covered that on the JS implementation here.

Comment:

This has been codified in packageurl-js's v2.0.0 release 🎉

@tiegz (Author), Oct 17, 2024:

> - The version may be empty.

I think this is implicit in PURL-SPECIFICATION: "The version is prefixed by a '@' separator when not empty." If we added it here, it seems it should be added to every other type too (other than OCI and Swift, apparently).

> Otherwise, it may start with a lowercased "v" followed by a valid semver version or consist of a sha-1 or short sha-1 git commit hash.

Since Go supports multiple version control systems, how about something like this?

The `version` may start with a lowercased "v" followed by: a semantic version, or a 
Go "pseudo-version", which consists of a semantic version followed by a timestamp 
and revision identifier.
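
As a rough illustration of the proposed wording, the sketch below distinguishes a tagged semantic version from a Go pseudo-version. It assumes the three pseudo-version shapes described in the Go Modules Reference (a base version, a 14-digit UTC timestamp, and a 12-character revision identifier). `classifyGoVersion`, `TAGGED`, and `PSEUDO` are hypothetical names, and the `TAGGED` pattern is deliberately looser than the full semver grammar.

```javascript
// Tagged release: "v" + MAJOR.MINOR.PATCH with optional pre-release and build suffixes.
// Simplified: the suffix character classes are looser than the semver grammar.
const TAGGED = /^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// Pseudo-version shapes (assumed from the Go Modules Reference):
//   vX.0.0-TIMESTAMP-REVISION
//   vX.Y.Z-pre.0.TIMESTAMP-REVISION
//   vX.Y.Z-0.TIMESTAMP-REVISION
const PSEUDO = /^v\d+\.(0\.0-|\d+\.\d+-([^+]*\.)?0\.)(\d{14})-([0-9A-Za-z]{12})(\+[0-9A-Za-z.-]+)?$/;

// Pseudo-versions also satisfy TAGGED, so they must be tested first.
function classifyGoVersion(version) {
  const m = PSEUDO.exec(version);
  if (m) return { kind: 'pseudo-version', timestamp: m[3], revision: m[4] };
  if (TAGGED.test(version)) return { kind: 'semver' };
  return { kind: 'other' };
}

console.log(classifyGoVersion('v1.1.1').kind);                               // semver
console.log(classifyGoVersion('v0.0.0-20170915032832-14c0d48ead0c').kind);   // pseudo-version
console.log(classifyGoVersion('v1.8.2-0.20240728161807-756ba6dad614').kind); // pseudo-version
console.log(classifyGoVersion('234fd47e07d1004f0aed9c').kind);               // other
```

A bare commit hash falls into the `other` bucket here, which is the case the existing examples in PURL-TYPES.rst still rely on.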

- Examples::

pkg:golang/github.com/gorilla/context@234fd47e07d1004f0aed9c
pkg:golang/github.com/gorilla/context@v1.1.1
pkg:golang/google.golang.org/genproto#googleapis/api/annotations
Comment:

☝️ This still leaves an example with an empty version while removing the mention of it from the version description.

Author:

I mentioned this above, but PURL-SPECIFICATION indicates that versions can be empty.
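
For readers following the examples, here is a simplified sketch of how the optional `@version` and `#subpath` parts separate from the rest of a golang purl. `splitGolangPurl` is a hypothetical helper, not the packageurl-js API, and it assumes no qualifiers and no percent-encoding.

```javascript
// Simplified splitter for pkg:golang purls.
// Assumptions: no "?qualifiers" section and no percent-encoded characters.
function splitGolangPurl(purl) {
  const prefix = 'pkg:golang/';
  if (!purl.startsWith(prefix)) throw new Error('not a golang purl');
  let rest = purl.slice(prefix.length);

  // The subpath, if any, follows the first "#".
  let subpath = null;
  const hash = rest.indexOf('#');
  if (hash !== -1) {
    subpath = rest.slice(hash + 1);
    rest = rest.slice(0, hash);
  }

  // The version, if any, follows the last "@"; it is simply absent when empty.
  let version = null;
  const at = rest.lastIndexOf('@');
  if (at !== -1) {
    version = rest.slice(at + 1);
    rest = rest.slice(0, at);
  }

  // The last path segment is the name; everything before it is the namespace.
  const slash = rest.lastIndexOf('/');
  const namespace = slash === -1 ? null : rest.slice(0, slash);
  const name = slash === -1 ? rest : rest.slice(slash + 1);
  return { namespace, name, version, subpath };
}

console.log(splitGolangPurl('pkg:golang/google.golang.org/genproto#googleapis/api/annotations'));
console.log(splitGolangPurl('pkg:golang/github.com/gorilla/context@v1.1.1'));
```

The first call yields a `null` version, which is the empty-version case discussed in this thread.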

pkg:golang/github.com/gorilla/context@234fd47e07d1004f0aed9c#api
Comment:

Please keep one of the GitHub SHA-1 hash examples.

Contributor:

At least for modern Go tools, it should be a pseudo-version, not a bare commit ID like it is now: https://go.dev/doc/modules/version-numbers#in-development

pkg:golang/golang.org/x/[email protected]#collate

hackage
-------