Skip to content

Commit

Permalink
fix(module-source): Sub non-conforming ZWJ prefixes with CGJ (#2436)
Browse files Browse the repository at this point in the history
## Description

This PR changes the `HIDDEN_PREFIX` of `ModuleSource` from the
non-conforming `$h\u200D_` zero-width joiner (`ZWJ`) notation to the
conforming `$h\u034F_` combining grapheme joiner (`CGJ`) notation.

A future PR may address further changes to a `$\u034F`-prefixed and
`\u034F$`-suffixed format as was suggested by @michaelfig in
discussions.

### Motivation

This change is motivated after encountering a parsing error when using
`rollup` which was traced back to the `$h\u200D_`-prefixed identifier in
an `endoScript` bundle. More importantly, this is also motivated by the
subsequent discovery that `rollup`'s implementation was actually
conforming to the ECMAScript Specification when it was throwing this
error.

To elaborate, while runtimes today will accept the special identifier
notation that is currently being introduced by the `ModuleSource`
rewrites, the current `$h\u200D_` zero-width joiner (`ZWJ`) notation
does not conform to the specifications defined in the [ECMAScript
Lexical Grammar](https://tc39.es/ecma262/#sec-names-and-keywords). In
essence, what the specifications entail is that the character sequence
for [Identifier Names](https://tc39.es/ecma262/#sec-identifier-names)
once unescaped would be expected to match the
`/^[$_\p{ID_Start}][$_\p{ID_Continue}]*$/u` pattern, aside from the
additional `#` character prefix required in the case of private fields.

As such, one can test this in the console by evaluating the following:

```js
Object.fromEntries([String.raw`$h\u200D_`, String.raw`$h\u034F_`].map(id => [id, /^[$_\p{ID_Start}][$_\p{ID_Continue}]*$/u.test(JSON.parse(`"${id}"`))]))
```

The above would yield the following object in a runtime where the
unicode escape sequences are retained:

```js
{$h\u200D_: false, $h\u034F_: true}
```

Digging closer in the Unicode Standard, it seems that the zero-width
joiner (`ZWJ`) may indeed be used in a conforming notation per [Emoji
Profile in Annex #31 of the Unicode
Standard](http://www.unicode.org/reports/tr31/#Emoji_Profile), however
this is not applicable for this purpose as it would require the use of
emojis.

At this point, my suggestion to instead use the combining grapheme
joiner (`CGJ`) is best articulated with this excerpt that I am borrowing
from its canonical Wikipedia entry:

> However, in contrast to the zero-width joiner and similar characters,
the `CGJ` does not affect whether the two letters are rendered
separately or as a ligature or cursively joined—the default behavior for
this is determined by the font.[^1]
>
>
> [^1]: https://en.wikipedia.org/wiki/Combining_grapheme_joiner

The Wikipedia article offers additional nuances about the differences,
while the [Proposal for addition of COMBINING GRAPHEME
JOINER](https://www.unicode.org/L2/L2000/00274-N2236-grapheme-joiner.htm)
offers the necessary context about its intent.

It is fair to note that there are many uses of the zero-width joiner
(`ZWJ`) already in the wild, and in fact there are currently `test262`
tests for its occurrence. That said, unless those uses are conforming to
the ECMAScript Specification and the Unicode Standard, they will limit
code portability and adoption by users who may end up confused by
failures similar to the one encountered with `rollup`.

Ultimately, with the reasonable recommendations to exercise caution when
it comes to bundling `ses` and related sources that are best bundled
with `bundleSource` instead, those sources may still need to be parsed
with tools like `rollup` for different purposes that would be aligned
with the expectations that they are being handled safely.

### Approach

#### Substituting the invisible joiner character

A search across the monorepo for `(?:\u200d|\\u200d)_` yields only 3
files of interest:

- `packages/module-source/TESTS.md`
- `packages/module-source/src/hidden.js`
- `packages/module-source/test/module-source.test.js`

While making changes to the 3 files of interest, a distinction is made
between matching `\$h\\u200d_` and `\$h\u200d_` where the replacements
are respectively `$h\\u034f_` and `$h\u034f_`, along with their `$c`
equivalents.

The search across the monorepo for `(?:\u200d|\\u200d)_` yields another
978 files that are not of interest found in:

-
`packages/test262-runner/test262/test/language/expressions/class/elements`
-
`packages/test262-runner/test262/test/language/statements/class/elements`

All those files remain unchanged.

#### Ensuring generic wording is used

For testing and other purposes where descriptive phrases are used to
refer to the use of `ZWJ`, `CGJ` or other characters for this same
intent, the phrase *"invisible joiner character"* is suggested.

### Security Considerations

**Does not apply to my knowledge**

### Scaling Considerations

**Does not apply to my knowledge**

### Documentation Considerations

**Does not apply to my knowledge**

### Testing Considerations

**See**:
#2436 (comment)

### Compatibility Considerations

While the changes do not affect compatibility when the generated code is
evaluated at runtime, there can potentially be compatibility concerns
with tools that have been specifically designed to work with the current
notation.

### Upgrade Considerations

**Does not apply to my knowledge**
  • Loading branch information
SMotaal authored Sep 1, 2024
2 parents 0e74a70 + 29ef8a5 commit cc82132
Show file tree
Hide file tree
Showing 4 changed files with 46 additions and 45 deletions.
2 changes: 1 addition & 1 deletion packages/module-source/DESIGN.md
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,7 @@ the module needs to be linked to and how to link their imports and exports.

The functor source, the module transformed into a program, has the following
shape.
The names are additionally obscured with Unicode zero-width-joiners to avoid
The names are additionally obscured with invisible joiner characters to avoid
collisions with sensibly constructed modules, and the transformation preserves
line numbers.

Expand Down
72 changes: 36 additions & 36 deletions packages/module-source/TESTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ Pairs of module source vs functor source. The output was generated.
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const { default: $c_default } = { default: bb };$h‍_once.default($c‍_default);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const { default: $c͏_default } = { default: bb };$_once.default($c͏_default);
})
```

Expand All @@ -24,7 +24,7 @@ Pairs of module source vs functor source. The output was generated.
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const { default: $c_default } = { default: class {valueOf() {return 45;}} };$h‍_once.default($c‍_default);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const { default: $c͏_default } = { default: class {valueOf() {return 45;}} };$_once.default($c͏_default);
})
```

Expand All @@ -36,8 +36,8 @@ export default 123
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); //#! /usr/bin/env node
const { default: $c_default } = { default: 123 };$h‍_once.default($c‍_default);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); //#! /usr/bin/env node
const { default: $c͏_default } = { default: 123 };$_once.default($c͏_default);
})
```

Expand All @@ -48,7 +48,7 @@ const { default: $c‍_default } = { default: 123 };$h‍_once.default($c‍_def
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const { default: $c_default } = { default: arguments };$h‍_once.default($c‍_default);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const { default: $c͏_default } = { default: arguments };$_once.default($c͏_default);
})
```

Expand All @@ -59,7 +59,7 @@ const { default: $c‍_default } = { default: 123 };$h‍_once.default($c‍_def
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const { default: $c_default } = { default: this };$h‍_once.default($c‍_default);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const { default: $c͏_default } = { default: this };$_once.default($c͏_default);
})
```

Expand All @@ -77,11 +77,11 @@ export const ghi = 789;
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); let abc = 123;$h‍_once.abc(abc);
let $c‍_def = 456;$h‍_live.def($c‍_def);
let def2 = def;$h‍_once.def2(def2);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); let abc = 123;$_once.abc(abc);
let $c͏_def = 456;$_live.def($c͏_def);
let def2 = def;$_once.def2(def2);
def++;
const ghi = 789;$h‍_once.ghi(ghi);
const ghi = 789;$_once.ghi(ghi);
})
```

Expand All @@ -94,8 +94,8 @@ export const { def, nest: [, ghi, ...nestrest], ...rest } = { def: 456, nest: [
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const abc = 123;$h‍_once.abc(abc);
const { def, nest: [, ghi, ...nestrest], ...rest } = { def: 456, nest: ['skip', 789, 'a', 'b'], other: 999, and: 998 };$h‍_once.def(def);$h‍_once.ghi(ghi);$h‍_once.nestrest(nestrest);$h‍_once.rest(rest);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const abc = 123;$_once.abc(abc);
const { def, nest: [, ghi, ...nestrest], ...rest } = { def: 456, nest: ['skip', 789, 'a', 'b'], other: 999, and: 998 };$_once.def(def);$_once.ghi(ghi);$_once.nestrest(nestrest);$_once.rest(rest);
})
```

Expand All @@ -107,8 +107,8 @@ export const abc = 123;
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const abc2 = abc;
const abc = 123;$h‍_once.abc(abc);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const abc2 = abc;
const abc = 123;$_once.abc(abc);
})
```

Expand All @@ -120,8 +120,8 @@ export let abc = 123;
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const abc2 = abc;
let abc = 123;$h‍_once.abc(abc);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const abc2 = abc;
let abc = 123;$_once.abc(abc);
})
```

Expand All @@ -134,9 +134,9 @@ export const abc3 = abc;
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []);$h‍_live.abc(); const abc2 = abc;$h‍_once.abc2(abc2);
var $c‍_abc = 123;abc = $c‍_abc;
const abc3 = abc;$h‍_once.abc3(abc3);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []);$_live.abc(); const abc2 = abc;$_once.abc2(abc2);
var $c͏_abc = 123;abc = $c͏_abc;
const abc3 = abc;$_once.abc3(abc3);
})
```

Expand All @@ -151,11 +151,11 @@ export const fn3 = fn;
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []);Object.defineProperty($c‍_fn, 'name', {value: "fn"});$h‍_live.fn($c‍_fn); const fn2 = fn;$h‍_once.fn2(fn2);
function $c‍_fn() {
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []);Object.defineProperty($c͏_fn, 'name', {value: "fn"});$_live.fn($c͏_fn); const fn2 = fn;$_once.fn2(fn2);
function $_fn() {
return 'foo';
}
const fn3 = fn;$h‍_once.fn3(fn3);
const fn3 = fn;$_once.fn3(fn3);
})
```

Expand All @@ -167,8 +167,8 @@ export class C {} if (C) { count += 1; }
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); let $c‍_count = 0;$h‍_live.count($c‍_count);{
class C {}$h‍_live.C(C);}if (C) {count += 1;}
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); let $c͏_count = 0;$_live.count($c͏_count);{
class C {}$_live.C(C);}if (C) {count += 1;}
})
```

Expand All @@ -179,7 +179,7 @@ export class C {} if (C) { count += 1; }
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const { default: $c_default } = { default: class C {} };$h‍_once.default($c‍_default);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const { default: $c͏_default } = { default: class C {} };$_once.default($c͏_default);
})
```

Expand All @@ -190,7 +190,7 @@ export class C {} if (C) { count += 1; }
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const { default: $c_default } = { default: class {} };$h‍_once.default($c‍_default);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const { default: $c͏_default } = { default: class {} };$_once.default($c͏_default);
})
```

Expand All @@ -201,7 +201,7 @@ export class C {} if (C) { count += 1; }
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const { default: $c_default } = { default: class {} };$h‍_once.default($c‍_default);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const { default: $c͏_default } = { default: class {} };$_once.default($c͏_default);
})
```

Expand All @@ -213,8 +213,8 @@ export function F(arg) { return arg; }
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []);Object.defineProperty($c‍_F, 'name', {value: "F"});$h‍_live.F($c‍_F); F(123);
function $c‍_F(arg) {return arg;}
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []);Object.defineProperty($_F, 'name', {value: "F"});$_live.F($_F); F(123);
function $_F(arg) {return arg;}
})
```

Expand All @@ -225,7 +225,7 @@ function $c‍_F(arg) {return arg;}
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const { default: $c_default } = { default: async function F(arg) {return arg;} };$h‍_once.default($c‍_default);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const { default: $c͏_default } = { default: async function F(arg) {return arg;} };$_once.default($c͏_default);
})
```

Expand All @@ -236,7 +236,7 @@ function $c‍_F(arg) {return arg;}
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([]), []); const { default: $c_default } = { default: async function (arg) {return arg;} };$h‍_once.default($c‍_default);;
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([]), []); const { default: $c͏_default } = { default: async function (arg) {return arg;} };$_once.default($c͏_default);;
})
```

Expand All @@ -249,7 +249,7 @@ function $c‍_F(arg) {return arg;}
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { let ns;$h‍_imports(new Map([["module", new Map([["*", [$h‍_a => (ns = $h‍_a)]]])]]), []);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { let ns;$_imports(new Map([["module", new Map([["*", [$_a => (ns = $h͏_a)]]])]]), []);
})
```

Expand All @@ -260,7 +260,7 @@ function $c‍_F(arg) {return arg;}
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { let foo,bar;$h‍_imports(new Map([["module", new Map([["foo", [$h‍_a => (foo = $h‍_a)]],["bar", [$h‍_a => (bar = $h‍_a)]]])]]), []);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { let foo,bar;$_imports(new Map([["module", new Map([["foo", [$_a => (foo = $h͏_a)]],["bar", [$_a => (bar = $h͏_a)]]])]]), []);
})
```
#### Case 3
Expand All @@ -270,7 +270,7 @@ function $c‍_F(arg) {return arg;}
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { let myName;$h‍_imports(new Map([["module", new Map([["default", [$h‍_a => (myName = $h‍_a)]]])]]), []);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { let myName;$_imports(new Map([["module", new Map([["default", [$_a => (myName = $h͏_a)]]])]]), []);
})
```

Expand All @@ -282,7 +282,7 @@ function $c‍_F(arg) {return arg;}
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { let myName,other;$h‍_imports(new Map([["module", new Map([["default", [$h‍_a => (myName = $h‍_a)]],["otherName", [$h‍_a => (other = $h‍_a)]]])]]), []);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { let myName,other;$_imports(new Map([["module", new Map([["default", [$_a => (myName = $h͏_a)]],["otherName", [$_a => (other = $h͏_a)]]])]]), []);
})
```

Expand All @@ -294,6 +294,6 @@ function $c‍_F(arg) {return arg;}
```

```js
(({ imports: $h‍_imports, liveVar: $h‍_live, onceVar: $h‍_once, }) => { $h‍_imports(new Map([["module", new Map([])]]), []);
(({ imports: $h͏_imports, liveVar: $h͏_live, onceVar: $h͏_once, }) => { $_imports(new Map([["module", new Map([])]]), []);
})
```
4 changes: 2 additions & 2 deletions packages/module-source/src/hidden.js
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
export const HIDDEN_PREFIX = '$h\u200d_';
export const HIDDEN_CONST_VAR_PREFIX = '$c\u200d_';
export const HIDDEN_PREFIX = '$h\u034f_';
export const HIDDEN_CONST_VAR_PREFIX = '$c\u034f_';
export const HIDDEN_A = `${HIDDEN_PREFIX}a`;
export const HIDDEN_IMPORT = `${HIDDEN_PREFIX}import`;
export const HIDDEN_IMPORT_SELF = `${HIDDEN_PREFIX}importSelf`;
Expand Down
13 changes: 7 additions & 6 deletions packages/module-source/test/module-source.test.js
Original file line number Diff line number Diff line change
Expand Up @@ -442,22 +442,22 @@ export default async function (arg) { return arg; }
t.is(await ret, 'foo', 'F returns correctly');
});

test('zero width joiner is reserved', t => {
test('invisible joiner character is reserved', t => {
t.throws(() => {
const _ = new ModuleSource(`const $h\u200d_import = 123; $h\u200d_import`);
const _ = new ModuleSource(`const $h\u034f_import = 123; $h\u034f_import`);
});
});

test('zero width joiner in constified variable is reserved', t => {
test('invisible joiner character in constified variable is reserved', t => {
t.throws(() => {
const _ = new ModuleSource(`const $c\u200d_myVar = 123; $c\u200d_myVar`);
const _ = new ModuleSource(`const $c\u034f_myVar = 123; $c\u034f_myVar`);
});
});

test('zero width joiner is allowed in non-reserved words', t => {
test('invisible joiner character is allowed in non-reserved words', t => {
const { namespace } = initialize(
t,
`const $h\u200d_import2 = 123; export default $h\u200d_import2`,
`const $h\u034f_import2 = 123; export default $h\u034f_import2`,
);
const { default: name } = namespace;
t.is(name, 123);
Expand Down Expand Up @@ -729,6 +729,7 @@ test('source map generation', t => {
const { __syncModuleProgram__ } = new ModuleSource(`'Hello, World!'`, {
sourceUrl: 'must-appear-in-source.js',
sourceMapUrl: 'must-not-appear-in-source.js',
// @ts-expect-error SourceMapHookDetails do not have those properties!
sourceMapHook(sourceMap, { sourceUrl, sourceMapUrl, source }) {
t.log(sourceMap);
t.is(sourceMapUrl, 'must-not-appear-in-source.js');
Expand Down

0 comments on commit cc82132

Please sign in to comment.