-
Notifications
You must be signed in to change notification settings - Fork 9.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify how Schema Objects require full-document parsing (3.1.1) #3758
Conversation
JSON Schema draft 2020-12 includes numerous keywords that require parsing the entire document prior to deeming a reference unresolvable. This makes that more clear and outlines several approaches. The practice of embedding OpenAPI fragments in other formats is deemed to have implementation-defined (non-interoperable) behavior, as the potential complications that might arise are not predictable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job ! @handrews
This PR IMHO is very clear. I am very much tempted to merge this before the TDC call. @lornajane , @ralfhandl thoughts? |
@miqui this one has some pretty deep impacts so I'd rather leave it open for longer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! 👍
Discussed during TDC 5/2/2024 - It was decided we need 2 more TSC approvals. |
This goes into more detail and uses "undefined" instead of "implementation-defined" as the behavior is likely to be incorrect (rather than just a different interpretationof an ambiguous requirement), and may result in security concerns as well.
@ralfhandl I've pushed a new commit with the following commit message: Rework the guidance around fragmentary parsing. This goes into more detail and uses "undefined" instead of I don't know that it's a simplification, but I think it's more explicit and should be easier to follow. Please do continue to suggest simplified wording - it's something I struggle with sometimes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good explanation!
|
||
Implementations that parse referenced fragments of OpenAPI content without regard for the content of the rest of the containing document will miss keywords that change the meaning and behavior of the reference target. | ||
In particular, failing to take into account keywords that change the base URI introduces security risks by causing references to resolve to unintended URIs, with unpredictable results. | ||
While some implementations support this sort of parsing due to the requirements of past versions of this specification, in version 3.1, the result of parsing fragments in isolation is _undefined_ and likely to contradict the requirements of this specification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be missing some distinction, but this seems like the same behavior that's described as "implementation defined" in https://github.com/OAI/OpenAPI-Specification/pull/3732/files#diff-b92507e7acda65ae00a02236c555cefc68b6fca4661077b84c2fb9ab150e5e17R151
This section particularly addresses schema objects but the other PR seems to encompass schema objects too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@notEthan eh... (wobbles hand back and forth in a wishy-washy way). There's an overlap, but it's not quite the same. This is where we get into the "it's too hard to even explain" that I mentioned in the undefined/implementation-defined PR, but this ones not one of the worst so here's an explanation:
If you have a document that consists of an empty JSON object, {}
, that document is syntactically valid as several different Objects. Parsing that document from different reference contexts is still full-document parsing, so it does not fall into this undefined behavior. But it is parsing using multiple conflicting contexts, so it falls into the implementation-defined behavior of the other PR.
It's a pretty fine hair to split, but it is deterministic. The whole-document case is implementation-defined because at least one implementor I've spoken with did not think it required a fix in the spec. I'm trying to invalidate as few existing implementations as possible, which is tricky to balance with the scenarios that result in outright wrong behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither of these changes get into the deeper question of whether documents are parsed as JSON/YAML once or each time, or whether, once parsed as JSON/YAML, the resulting structures are further parsed as OAS Objects once or each time. That would start getting into whether various parsing steps are cached, and what is done when a cached Object doesn't agree with a new parsing context. There are many ways this might or might not be detected, and many strategies one could pick to handle it - new each time, first wins, last wins, any conflict is an error, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an overlap, but it's not quite the same.
I'm happy to accept that, and appreciate the explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured it might help folks to see what "complicated to explain" looks like :-)
All reviewers seem to be happy with the latest state of the PR |
Looks good 👎 |
Please review even though this is closed! It should have been a draft PR and was merged prematurely.
This is the next step in un-blocking OASComply. Unlike the previous OASComply-related PR, this PR does not apply to v3.0.4. What, if any, wording about document parsing needs to go in 3.0.4 is something we can sort out later.
The implied requirement might be surprising to some, but is an unambiguous result of normatively citing JSON Schema 2020-12. Resolution of references to
$id
values that do not match the document's location (e.g. are resolved as URIs rather than URLs, including by noticing them in already-parsed content) was affirmed multiple times by TSC decisions during the work on OAS 3.1:Unfortunately, it seems like we never did get around to adding explicit wording highlighting this. Which does not change the fact that a.) it was decided and re-affirmed, and b.) normatively citing JSON Schema 2020-12 mandates this behavior as the only possible way to satisfy its requirements.
There might be further changes related to this in the "Resolving Relative References in URIs", but I'll deal with that when I fix #3545, which covers both that section and "Resolving Relative References in URLs".
I also expect we'll want to add further guidance for both implementors and description authors on the Learn site. I tried to keep the text here as minimal as possible while heading off the misunderstandings we've seen as folks implement 3.1.
Commit message:
JSON Schema draft 2020-12 includes numerous keywords that require parsing the entire document prior to deeming a reference unresolvable. This makes that more clear and outlines several approaches.
The practice of embedding OpenAPI fragments in other formats is deemed to have implementation-defined (non-interoperable) behavior, as the potential complications that might arise are not predictable.