feat(compartment-mapper): Custom parser support #2304
Design direction seems fine. I like to do `import`, `archive`, and `import-archive` together for any feature that applies to all three, to make sure the design is sound. A scaffold test should verify that an application has the same behavior through `import` or `archive` + `import-archive` using custom parsers. Any commentary on why `archive` and `import-archive` should not support custom parsers would be welcome. Please request a review again when this PR is out of draft.
Thank you for sharing `LiteralUnion`. There are many places where I imagine this pattern should be applied. I welcome a comment from @turadg on whether this should become integral to our style for enum literals.
Is "import-archive" the "unarchive" to the "archive" case? If so, I'm assuming that the same custom parser would need to be provided to both invocations. What happens (or should happen) if this is not true? Would we need to do something like add the parser itself to the archive?
Indeed.
It’s good enough to require that the same custom parsers be present for |
Analogously, |
/** @type {Record<string, Language>} */
const customLanguageForExtension = Object.create(null);
for (const { parser, extensions, language } of parsers) {
  if (
    language in parserForLanguage &&
    parserForLanguage[language] !== parser
  ) {
    throw new Error(`Parser for language ${q(language)} already defined`);
  }
  parserForLanguage[language] = parser;
  for (const extension of extensions) {
    if (
      extension in customLanguageForExtension &&
      customLanguageForExtension[extension] !== language
    ) {
      throw new Error(
        `Extension ${q(extension)} already assigned language ${q(customLanguageForExtension[extension])}`,
      );
    }
    customLanguageForExtension[extension] = language;
  }
}
I’m beginning to suspect it’s better to thread separate “defaults for all compartments” options for `parserForLanguage` and `languageForExtension` than to merge them into `parsers` and then unmerge them here. In #2294, I would have to make `@endo/compartment-mapper/import-parsers.js` express an opinion about the language for each extension, and there is no clear opinion at that juncture.
This branch captures my investigation and I am convinced that threading `parserForLanguage` and `languageForExtension` as separate options (instead of consolidated custom `parsers`) is robust with less ceremony and will compose better with changes I have coming for #400.
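To make the "separate options" idea concrete, here is a minimal sketch of resolving a file extension to a parser through two independent tables. The parser functions and the `parseByExtension` helper are hypothetical stand-ins invented for illustration; real compartment-mapper parsers have a different signature, and only the option names `parserForLanguage` and `languageForExtension` come from the discussion above.

```javascript
const { freeze } = Object;

// Hypothetical stand-in parsers (NOT the real @endo/compartment-mapper
// parser shape): each takes source text and returns a small record.
const parseText = source => ({ kind: 'text', body: source });
const parseShout = source => ({ kind: 'shout', body: source.toUpperCase() });

// Two independent tables rather than one merged `parsers` list.
// `__proto__: null` gives each record a null prototype, so lookups
// never see inherited keys such as `toString`.
const parserForLanguage = freeze({
  __proto__: null,
  text: parseText,
  shout: parseShout,
});
const languageForExtension = freeze({
  __proto__: null,
  txt: 'text',
  md: 'shout',
});

// Hypothetical helper: extension -> language -> parser.
const parseByExtension = (extension, source) => {
  const language = languageForExtension[extension];
  if (language === undefined) {
    throw new Error(`No language for extension ${JSON.stringify(extension)}`);
  }
  const parser = parserForLanguage[language];
  if (parser === undefined) {
    throw new Error(`No parser for language ${JSON.stringify(language)}`);
  }
  return parser(source);
};

const result = parseByExtension('md', 'hello');
```

Keeping the tables separate means a caller can override which language an extension maps to without touching the parser table, and vice versa, which appears to be the composition benefit the comment describes.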
e5f88ee to 491a795
@kriskowal OK, this is ready for Proper Review. Please take a looksee
will update |
const parserForLanguage = freeze(
  /** @type {const} */ ({
    mjs: parserArchiveMjs,
    'pre-mjs-json': parserArchiveMjs,
    cjs: parserArchiveCjs,
    'pre-cjs-json': parserArchiveCjs,
    json: parserJson,
    text: parserText,
    bytes: parserBytes,
  }),
);
This (and the other one) shouldn't be mutated. Calling it `const` isn't enough 😄
UPDATE: There are at least three. I hope I didn't miss one.
You caught them all.
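For readers following the thread, a small sketch of why `const` alone does not protect a shared parser table, and how freezing does. The table contents here are placeholder strings rather than real parser objects, and the variable names are made up for illustration.

```javascript
// `const` only fixes the binding; the object it names stays mutable.
const mutableTable = { json: 'parserJson' };
mutableTable.json = 'replaced'; // succeeds silently

// A frozen table rejects writes. Reflect.set reports failure as `false`
// instead of throwing, so this works in both sloppy and strict code.
const frozenTable = Object.freeze({ json: 'parserJson' });
const writeSucceeded = Reflect.set(frozenTable, 'json', 'replaced');

// To add a parser for one test or one call, build a new record
// instead of mutating the shared one.
const extendedTable = Object.freeze({ ...frozenTable, md: 'parserMarkdown' });
```

Building a fresh record per use is one way to avoid the kind of state leaking between tests that is mentioned elsewhere in this review.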
/** @import {ParserForLanguage} from './types.js' */
/** @import {ReadFn} from './types.js' */
/** @import {ReadPowers} from './types.js' */
/** @import {SomeObject} from './types.js' */
thank you, copilot
44ed66f to c756eba
@@ -62,6 +62,12 @@ const q = JSON.stringify;
   {
     globals: ['a', {}],
   },
+  {
+    options: {
Do we care that it's possible to put unserializable junk in here? Should that be restricted via types or at runtime or both or neither?
I defer to @naugtur
@kriskowal Note: there's no explicit test for the archive use-case. I am not sure if it's needed?
/** @type {ParserForLanguage} */
const finalParserForLanguage = Object.create(null);

for (const [language, parser] of [
I noticed that putting stuff in `parserForLanguage` seems to leak out of tests, so a) I stopped doing that, and b) I froze them.
    finalLanguageForExtension[extension] = language;
  }
  for (const [extension, language] of entries(languageForExtension)) {
    finalLanguageForExtension[extension] = language;
I'm not sure if I should write anything back to the `CompartmentDescriptor`.
No, this is fine.
/** @type {Record<string, Language>} */
const finalLanguageForExtension = Object.create(null);
for (const [extension, language] of entries(customLanguageForExtension)) {
  if (extension in languageForExtension) {
Should I be using `has()` instead of this? It's habit (because TypeScript seems to prefer it).
Since they’re equivalent for `Object.create(null)`, I’m inclined to overlook the question. Unless I’m wrong about them being equivalent. @erights for second opinion.
In boneskull/native-parser...kriskowal-native-parser, I did some additional work to ensure that `in` behaves as I would expect. In the case of `options` where a devious user might provide an object with a prototype chain that includes enumerable properties, the difference between `has` and `in` would be observable, but that can be avoided by capturing options with `freeze(create(null), option)`.
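The observable difference between `in` and an own-property check can be sketched as follows. The `capture` helper is an assumption about the intent of the expression quoted above: it copies own enumerable properties onto a null-prototype object with `Object.assign` before freezing, which may not match the code in the referenced branch.

```javascript
const { create, freeze, assign, hasOwn } = Object;

// An options object whose prototype carries an enumerable property.
const inheritedOptions = create({ md: 'markdown' });
inheritedOptions.txt = 'text';

// `in` walks the prototype chain; an own-property check does not.
const seenByIn = 'md' in inheritedOptions; // true
const seenByHasOwn = hasOwn(inheritedOptions, 'md'); // false

// Hypothetical capture helper: Object.assign copies only own enumerable
// properties, and the null prototype leaves nothing to inherit, so `in`
// and own-property checks agree on the captured record.
const capture = options => freeze(assign(create(null), options));
const captured = capture(inheritedOptions);

const capturedHasMd = 'md' in captured; // false
const capturedHasTxt = 'txt' in captured; // true
```

On a record created this way, `'toString' in captured` is also `false`, which is why the two checks are interchangeable for `Object.create(null)` tables.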
This is close to ready.
I investigated changing the `parsers` option to separate `parserForLanguage` and `languageForExtension` options and I like the results. I offer these commits for your consideration. I would like to subsume it into this PR in some fashion or another and I am not concerned about preserving my authorship metadata. boneskull/native-parser...kriskowal-native-parser
We typically use a rebase and merge commit and preserve the narrative of the commit history. Let me know if it was your intention to squash+merge the PR or if you want me to review the narrative of the commit history.
@@ -200,7 +205,7 @@ export const assertPackagePolicy = (allegedPackagePolicy, path, url) => {
  * It also moonlights as a type guard.
  *
  * @param {unknown} allegedPolicy - Alleged `Policy` to test
- * @returns {asserts allegedPolicy is import('./types.js').Policy|undefined}
+ * @returns {asserts allegedPolicy is SomePolicy|undefined}
Is there a precedent for `Some*` types already? Can we consider `Maybe*` instead?
It's not a "maybe" type. This playground illustrates why this is a thing.
If I were to explain this in a nutshell: when we use `Policy` without a type argument, it artificially narrows the type to `Policy<void, void, void, unknown>`. We may be able to safely assume this currently, but that is not guaranteed. Using `SomePolicy` prevents future headaches, essentially.
The reasoning behind this is that a custom parser may want to use the `CompartmentDescriptor` to do things like policy enforcement.
This swaps out the "native module" fixture for a "markdown" fixture. The test parser implementation is now a "markdown parser". While the native module fixture was useful, it presents some practical portability issues.
Due to the type parameter defaults, these assertions _actually_ said, e.g.,

```ts
asserts value is Policy<void, void, void, void>
```

...which is not necessarily true. This fixes the problem by adding `SomePolicy` and `SomePackagePolicy`.
This adds an optional property, `options`, to the `PackagePolicy` type. Here, consumers can add any fields or metadata they need to the policy. Any values will pass through the policy validator _alive, and unspoiled_. Endo should not read the contents of this property directly. The default type of `unknown` means that access to the value must have an explicit type assertion; it is the responsibility of the consumer to provide a proper type for this field.
Changed `policy` prop to `SomePackagePolicy`.
Co-authored-by: Kris Kowal <[email protected]>
- Adds tests for more situations
- Fixes prevention of overriding builtin parsers
- Types for various options consolidated
…guage and languageForExtension
7ff4779 to 5aeb5e6
@kriskowal OK, I've cherry-picked your changes, then fixed a problem in |
Co-authored-by: Kris Kowal <[email protected]>
Great, thank you! And thanks for offering to squash+merge. No notes on commit structure or messages.
## Description

This exposes `captureFromMap()` in `capture-lite.js`. This function is similar to e.g., `makeArchiveFromMap()` in `archive-lite.js`; but rather than creating a `.zip` archive, it simply returns the fully-completed `CompartmentMapDescriptor`, `Sources`, and a mapping of filename to compartment map name. This information is needed for next-gen-lavamoat-node ("endomoat")'s automatic policy generation.

Another commit disables the hardcoded check for parsers in the compartment map validation functions (which are no longer necessary after #2304).

### Questions

- Should this be split into two PRs?
- Should any of this be renamed?
- Internal functions were copy/pasted from `archive-lite.js` into `capture-lite.js`. Should these be extracted into a shared module?

### Security Considerations

None that I'm aware of.

### Scaling Considerations

If anything, it may shave a few nanoseconds off of compartment map validation.

### Documentation Considerations

Probably should be added to `NEWS.md`.

### Testing Considerations

- [ ] The compartment map validation is currently not tested in isolation and probably should be (removing the parser-name assertion did not cause a test to fail)
- [x] `captureFromMap()` needs some sort of basic round-trip test. I think a snapshot of the return value may suffice?

### Compatibility Considerations

None

### Upgrade Considerations

None

---------

Co-authored-by: Kris Kowal <[email protected]>
Closes: #2303

## Description

This PR adds custom parser support as discussed in #2303. To support the feature:

- `options` was added to the `PackagePolicy` type.
- `compartmentDescriptor` objects, so that parsers can enforce policy

### Security Considerations

See #2303

### Scaling Considerations

See #2303

### Documentation Considerations

- `parsers` for e.g., `importLocation`
- `options` in `PackagePolicy`

### Testing Considerations

- `options` is ignored by policy validator

### Compatibility Considerations

None

### Upgrade Considerations

- `NEWS.md`