Clarify encoding of colon between scheme and type #361

Draft
wants to merge 1 commit into
base: master
5 changes: 3 additions & 2 deletions PURL-SPECIFICATION.rst
@@ -247,8 +247,9 @@ Use these rules for percent-encoding and decoding ``purl`` components:
- the '#', '?', '@' and ':' characters must NOT be encoded when used as
separators. They may need to be encoded elsewhere
Contributor

@gernot-h gernot-h Dec 10, 2024

To fully address the confusion of #39, I would probably also change the last sentence in the preceding paragraph:

Suggested change
- separators. They may need to be encoded elsewhere
+ separators. Some of them need to be encoded elsewhere as specified in the rules below.

I think this would also make it clearer where they need to be encoded.


- - the ':' ``scheme`` and ``type`` separator does not need to and must NOT be encoded.
-   It is unambiguous unencoded everywhere
+ - The colon ':' separator between ``scheme`` and ``type`` MUST NOT be encoded.
+   For example, in the PURL snippet ``pkg:npm`` the colon ':' MUST NOT be encoded,
+   and the PURL snippet ``pkg%3Anpm`` is invalid.

Member

@gernot-h @pombredanne Consider adding at the top of the file, perhaps as a new one-line paragraph following the current first paragraph, something along the lines of the following:

This specification uses RFC 2119 (https://datatracker.ietf.org/doc/html/rfc2119), as clarified in RFC 8174 (https://datatracker.ietf.org/doc/html/rfc8174), for the interpretation of certain terms, e.g., MUST NOT.

Or perhaps a slight modification to the example provided by RFC 2119:

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119, as clarified in RFC 8174.

(Note that the core spec currently contains a great deal of language that will need to be modified to implement RFC 2119/8174.)

- the '/' used as ``type``/``namespace``/``name`` and ``subpath`` segments separator
does not need to and must NOT be percent-encoded. It is unambiguous unencoded
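
To illustrate the colon rule proposed in this hunk, here is a minimal Python sketch of how a parser might reject a percent-encoded separator between ``scheme`` and ``type``. The function name `split_scheme` is hypothetical and not part of any purl library, and the exact, case-sensitive comparison against `pkg` is an assumption made to keep the example short.

```python
from typing import Tuple


def split_scheme(purl: str) -> Tuple[str, str]:
    """Hypothetical helper: split a purl string at the scheme separator.

    Returns (scheme, remainder). Raises ValueError when the literal ':'
    separator is missing, e.g. because it was percent-encoded as %3A.
    """
    # Split on the first literal colon only; an encoded colon (%3A) is
    # deliberately NOT treated as a separator.
    scheme, sep, remainder = purl.partition(":")
    if not sep:
        raise ValueError("missing literal ':' after scheme: %r" % purl)
    # Assumption for this sketch: the scheme must be exactly 'pkg'.
    if scheme != "pkg":
        raise ValueError("unexpected scheme %r in %r" % (scheme, purl))
    return scheme, remainder


if __name__ == "__main__":
    print(split_scheme("pkg:npm"))
    try:
        split_scheme("pkg%3Anpm")
    except ValueError as exc:
        print("rejected:", exc)
```

With this approach, ``pkg:npm`` splits into the scheme and the remainder, while ``pkg%3Anpm`` raises an error because no literal colon is present.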