Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: change default resolver to not throw on unknown scheme #47824

Merged
94 changes: 47 additions & 47 deletions doc/api/esm.md
Original file line number Diff line number Diff line change
Expand Up @@ -1029,28 +1029,6 @@ and there is no security.
// https-loader.mjs
import { get } from 'node:https';

export function resolve(specifier, context, nextResolve) {
const { parentURL = null } = context;

// Normally Node.js would error on specifiers starting with 'https://', so
// this hook intercepts them and converts them into absolute URLs to be
// passed along to the later hooks below.
if (specifier.startsWith('https://')) {
return {
shortCircuit: true,
url: specifier,
};
} else if (parentURL && parentURL.startsWith('https://')) {
return {
shortCircuit: true,
url: new URL(specifier, parentURL).href,
};
}

// Let Node.js handle all other specifiers.
return nextResolve(specifier);
}

export function load(url, context, nextLoad) {
// For JavaScript to be loaded over the network, we need to fetch and
// return it.
Expand Down Expand Up @@ -1091,9 +1069,7 @@ prints the current version of CoffeeScript per the module at the URL in
#### Transpiler loader

Sources that are in formats Node.js doesn't understand can be converted into
JavaScript using the [`load` hook][load hook]. Before that hook gets called,
however, a [`resolve` hook][resolve hook] needs to tell Node.js not to
throw an error on unknown file types.
JavaScript using the [`load` hook][load hook].

This is less performant than transpiling source files before running
Node.js; a transpiler loader should only be used for development and testing
Expand All @@ -1109,25 +1085,6 @@ import CoffeeScript from 'coffeescript';

const baseURL = pathToFileURL(`${cwd()}/`).href;

// CoffeeScript files end in .coffee, .litcoffee, or .coffee.md.
const extensionsRegex = /\.coffee$|\.litcoffee$|\.coffee\.md$/;

export function resolve(specifier, context, nextResolve) {
if (extensionsRegex.test(specifier)) {
const { parentURL = baseURL } = context;

// Node.js normally errors on unknown file extensions, so return a URL for
// specifiers ending in the CoffeeScript file extensions.
return {
shortCircuit: true,
url: new URL(specifier, parentURL).href,
};
}

// Let Node.js handle all other specifiers.
return nextResolve(specifier);
}

export async function load(url, context, nextLoad) {
if (extensionsRegex.test(url)) {
// Now that we patched resolve to let CoffeeScript URLs through, we need to
Expand Down Expand Up @@ -1220,6 +1177,50 @@ loaded from disk but before Node.js executes it; and so on for any `.coffee`,
`.litcoffee` or `.coffee.md` files referenced via `import` statements of any
loaded file.

#### Overriding loader
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested this code? Can you add it to the examples in https://github.com/nodejs/loaders-test?

We should probably have an example that needs to use both resolve and load, so that we demonstrate how the two hooks work together. Ideally it would be as realistic a use case as possible.

Speaking of realistic examples, another “resolve-only” use case discussed in the design docs for the loaders API was to rewrite specifiers like lodash into URLs like https://esm.sh/lodash. Perhaps this might be a better example than this “overriding loader,” as it would presumably be a lot shorter of an example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested this code? Can you add it to the examples in https://github.com/nodejs/loaders-test?

I have (in my repo that I link to in the issue), but I can definitely copy it to https://github.com/nodejs/loaders-test.

Speaking of realistic examples, another “resolve-only” use case discussed in the design docs for the loaders API was to rewrite specifiers like lodash into URLs like https://esm.sh/lodash

The example I gave is very realistic as it is a simplification of import maps, where you can map bare specifiers to specific files. I can definitely use your use-case, but I think in terms of complexity, they're roughly the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this lands (hopefully...😊), I'll not only add this loader to the loaders-test, but also modify the http-loader and typescript-loader to not need resolvers.


The above two loaders hooked into the "load" phase of the module loader.
This loader hooks into the "resolution" phase. This loader reads an
`overrides.json` file that specifies which specifiers to override to another
url.
giltayar marked this conversation as resolved.
Show resolved Hide resolved

```js
// overriding-loader.js
import fs from 'node:fs/promises';

const overrides = JSON.parse(await fs.readFile('overrides.json'));

export async function resolve(specifier, context, nextResolve) {
if (specifier in overrides) {
return nextResolve(overrides[specifier], context);
}

return nextResolve(specifier, context);
}
```

Let's assume we have these files:

```js
// main.js
import 'a-module-to-override';
```

```json
// overrides.json
{
"a-module-to-override": "./module-override.js"
}
giltayar marked this conversation as resolved.
Show resolved Hide resolved
```

```js
// module-override.js
console.log('module overridden!');
```

If you run `node --experimental-loader ./overriding-loader.js main.js`
the output will be `module overriden!`.

## Resolution algorithm

### Features
Expand Down Expand Up @@ -1506,9 +1507,9 @@ _isImports_, _conditions_)
> 7. If _pjson?.type_ exists and is _"module"_, then
> 1. If _url_ ends in _".js"_, then
> 1. Return _"module"_.
> 2. Throw an _Unsupported File Extension_ error.
> 2. return **undefined**.
guybedford marked this conversation as resolved.
Show resolved Hide resolved
> 8. Otherwise,
> 1. Throw an _Unsupported File Extension_ error.
> 1. return **undefined**.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a breaking change to the resolution algorithm. cc @guybedford

I think Node could and still should throw this error, it would just be as part of resolution and loading, as opposed to just resolution. So all we really need to do is rename the section “Module resolution and loading algorithm,” and you add the steps for loading and move the error to there. From the consumer’s perspective there would be no breaking changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we’re moving the check for unsupported schemes into load, then we should move the check for unsupported file types/extensions into load as well (or does that already happen as part of load?)

@GeoffreyBooth already happens today

All the more reason then to extend this algorithm definition to include the loading phase, to document which errors get thrown during that phase too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. But I think this is out of scope for this issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. But I think this is out of scope for this issue.

I don’t think so, as this PR is all about moving an error into the loading phase. The loading phase is so simple that we don’t currently define that algorithm (it’s essentially “load the source from the resolved URL”) so it’s really just adding the validation checks that happen before that step.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth arent loaders still experimental?
the docs on the resolve hook are pretty clear https://nodejs.org/api/esm.html#resolvespecifier-context-nextresolve

The loaders API is being redesigned. This hook may disappear or its signature may change. Do not rely on the API described below.

so I think it should ok to land this and adjust the docs in a follow-up PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably have an example that needs to use both resolve and load, so that we demonstrate how the two hooks work together. Ideally it would be as realistic a use case as possible.

My ESM mocking loader has both, but I don't think this can be put in an example in the docs. It's too large, even if I simplify it.

Copy link
Member

@GeoffreyBooth GeoffreyBooth May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arent loaders still experimental?

Loaders are experimental, but the built-in ESM Loader’s resolution algorithm is stable and has been such for years. What we document about it here is the basis for bundlers that replicate the algorithm, so it’s important that the documentation be complete. Those other tools replicate the error flows too.

If this PR ships without the follow-up that fixes the docs, those other tools might think we’re no longer erroring on invalid schemes, which is wrong. The point of spelling out the algorithm here is to provide a “spec” for other tools to match, and for our own code to be judged against (so that Node’s internal code isn’t itself “the spec” when it might have bugs).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll write it down, as suggested by @GeoffreyBooth (by looking at the code and trying to replicate the same "algorithm style" as in the resolver algorithm).


**LOOKUP\_PACKAGE\_SCOPE**(_url_)

Expand Down Expand Up @@ -1581,7 +1582,6 @@ for ESM specifiers is [commonjs-extension-resolution-loader][].
[custom https loader]: #https-loader
[load hook]: #loadurl-context-nextload
[percent-encoded]: url.md#percent-encoding-in-urls
[resolve hook]: #resolvespecifier-context-nextresolve
[special scheme]: https://url.spec.whatwg.org/#special-scheme
[status code]: process.md#exit-codes
[the official standard format]: https://tc39.github.io/ecma262/#sec-modules
Expand Down
31 changes: 31 additions & 0 deletions lib/internal/modules/esm/load.js
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,8 @@ async function defaultLoad(url, context = kEmptyObject) {
source,
} = context;

throwIfUnsupportedURLScheme(new URL(url), experimentalNetworkImports);

if (format == null) {
format = await defaultGetFormat(url, context);
}
Expand All @@ -102,6 +104,35 @@ async function defaultLoad(url, context = kEmptyObject) {
};
}

/**
* throws an error if the protocol is not one of the protocols
* that can be loaded in the default loader
* @param {URL} parsed
* @param {boolean} experimentalNetworkImports
*/
function throwIfUnsupportedURLScheme(parsed, experimentalNetworkImports) {
// Avoid accessing the `protocol` property due to the lazy getters.
const protocol = parsed?.protocol;
if (
protocol &&
protocol !== 'file:' &&
protocol !== 'data:' &&
protocol !== 'node:' &&
(
!experimentalNetworkImports ||
(
protocol !== 'https:' &&
protocol !== 'http:'
)
)
) {
const schemes = ['file', 'data', 'node'];
if (experimentalNetworkImports) {
ArrayPrototypePush(schemes, 'https', 'http');
}
throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(parsed, schemes);
}
}

/**
* For a falsy `format` returned from `load`, throw an error.
Expand Down
36 changes: 0 additions & 36 deletions lib/internal/modules/esm/resolve.js
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
const {
ArrayIsArray,
ArrayPrototypeJoin,
ArrayPrototypePush,
ArrayPrototypeShift,
JSONStringify,
ObjectGetOwnPropertyNames,
Expand Down Expand Up @@ -51,7 +50,6 @@ const {
ERR_PACKAGE_PATH_NOT_EXPORTED,
ERR_UNSUPPORTED_DIR_IMPORT,
ERR_NETWORK_IMPORT_DISALLOWED,
ERR_UNSUPPORTED_ESM_URL_SCHEME,
} = require('internal/errors').codes;

const { Module: CJSModule } = require('internal/modules/cjs/loader');
Expand Down Expand Up @@ -941,37 +939,6 @@ function throwIfInvalidParentURL(parentURL) {
}
}

function throwIfUnsupportedURLProtocol(url) {
// Avoid accessing the `protocol` property due to the lazy getters.
const protocol = url.protocol;
if (protocol !== 'file:' && protocol !== 'data:' &&
protocol !== 'node:') {
throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(url);
}
}

function throwIfUnsupportedURLScheme(parsed, experimentalNetworkImports) {
// Avoid accessing the `protocol` property due to the lazy getters.
const protocol = parsed?.protocol;
if (
protocol &&
protocol !== 'file:' &&
protocol !== 'data:' &&
(
!experimentalNetworkImports ||
(
protocol !== 'https:' &&
protocol !== 'http:'
)
)
) {
const schemes = ['file', 'data'];
if (experimentalNetworkImports) {
ArrayPrototypePush(schemes, 'https', 'http');
}
throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(parsed, schemes);
}
}

function defaultResolve(specifier, context = {}) {
let { parentURL, conditions } = context;
Expand Down Expand Up @@ -1048,7 +1015,6 @@ function defaultResolve(specifier, context = {}) {
// This must come after checkIfDisallowedImport
if (parsed && parsed.protocol === 'node:') return { __proto__: null, url: specifier };

throwIfUnsupportedURLScheme(parsed, experimentalNetworkImports);

const isMain = parentURL === undefined;
if (isMain) {
Expand Down Expand Up @@ -1095,8 +1061,6 @@ function defaultResolve(specifier, context = {}) {
throw error;
}

throwIfUnsupportedURLProtocol(url);

return {
__proto__: null,
// Do NOT cast `url` to a string: that will work even when there are real
Expand Down
2 changes: 2 additions & 0 deletions test/es-module/test-esm-import-meta-resolve.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@ assert.strictEqual(
code: 'ERR_INVALID_ARG_TYPE',
})
);
assert.strictEqual(import.meta.resolve('http://some-absolute/url'), 'http://some-absolute/url');
assert.strictEqual(import.meta.resolve('some://weird/protocol'), 'some://weird/protocol');
assert.strictEqual(import.meta.resolve('baz/', fixtures),
fixtures + 'node_modules/baz/');

Expand Down
31 changes: 31 additions & 0 deletions test/es-module/test-esm-loader-default-resolver.mjs
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
import { spawnPromisified } from '../common/index.mjs';
import * as fixtures from '../common/fixtures.mjs';
import assert from 'node:assert';
import { execPath } from 'node:process';
import { describe, it } from 'node:test';

describe('default resolver', () => {
it('should accept foreign schemas without exception (e.g. uyyt://something/or-other', async () => {
const { code, stdout, stderr } = await spawnPromisified(execPath, [
'--no-warnings',
'--experimental-loader',
fixtures.fileURL('/es-module-loaders/uyyt-dummy-loader.mjs'),
fixtures.path('/es-module-loaders/uyyt-dummy-loader-main.mjs'),
]);
assert.strictEqual(code, 0);
assert.strictEqual(stdout.trim(), 'index.mjs!');
assert.strictEqual(stderr, '');
});

it('should resolve foreign schemas by doing regular url absolutization', async () => {
const { code, stdout, stderr } = await spawnPromisified(execPath, [
'--no-warnings',
'--experimental-loader',
fixtures.fileURL('/es-module-loaders/uyyt-dummy-loader.mjs'),
fixtures.path('/es-module-loaders/uyyt-dummy-loader-main2.mjs'),
]);
assert.strictEqual(code, 0);
assert.strictEqual(stdout.trim(), '42');
assert.strictEqual(stderr, '');
});
});
18 changes: 0 additions & 18 deletions test/fixtures/es-module-loaders/http-loader.mjs
Original file line number Diff line number Diff line change
@@ -1,23 +1,5 @@
import { get } from 'http';

export function resolve(specifier, context, nextResolve) {
const { parentURL = null } = context;

if (specifier.startsWith('http://')) {
return {
shortCircuit: true,
url: specifier,
};
} else if (parentURL?.startsWith('http://')) {
return {
shortCircuit: true,
url: new URL(specifier, parentURL).href,
};
}

return nextResolve(specifier);
}

export function load(url, context, nextLoad) {
if (url.startsWith('http://')) {
return new Promise((resolve, reject) => {
Expand Down
1 change: 1 addition & 0 deletions test/fixtures/es-module-loaders/uyyt-dummy-loader-main.mjs
giltayar marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
import 'uyyt://1/index.mjs';
giltayar marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
import 'uyyt://1/index2.mjs';
giltayar marked this conversation as resolved.
Show resolved Hide resolved
24 changes: 24 additions & 0 deletions test/fixtures/es-module-loaders/uyyt-dummy-loader.mjs
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
export function load(url, context, nextLoad) {
switch (url) {
case 'uyyt://1/index.mjs':
return {
source: 'console.log("index.mjs!")',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
case 'uyyt://1/index.mjs':
return {
source: 'console.log("index.mjs!")',
case 'uyyt://0/0':
case 'uyyt://1/1':
return {
source: 'console.log(import.meta.url)',

format: 'module',
shortCircuit: true,
};
case 'uyyt://1/index2.mjs':
return {
source: 'import c from "./sub.mjs"; console.log(c);',
giltayar marked this conversation as resolved.
Show resolved Hide resolved
format: 'module',
shortCircuit: true,
};
case 'uyyt://1/sub.mjs':
return {
source: 'export default 42',
format: 'module',
shortCircuit: true,
};
giltayar marked this conversation as resolved.
Show resolved Hide resolved
default:
return nextLoad(url, context);
}
}