Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(commonjs): reconstruct real es module from __esModule marker #537

Merged

Conversation

LarsDenBakker
Copy link
Contributor

@LarsDenBakker LarsDenBakker commented Aug 12, 2020

Rollup Plugin Name: commonjs

This PR contains:

  • bugfix
  • feature
  • refactor
  • documentation
  • other

Are tests included?

  • yes (bugfixes and features will not be merged without tests)
  • no

Breaking Changes?

  • yes (breaking changes will not be merged unless absolutely necessary)
  • no

If yes, then include "BREAKING CHANGES:" in the first commit message body, followed by a description of what is breaking.

List any relevant issue numbers: #481

Description

By @lukastaegert: The description has been updated to reflect the final state of this PR after my changes

This implements "improvement 2" described in #481. When a module is compiled from es module to commonjs it is a common practice to use the __esModule marker to indicate that the module follows esm semantics.

With this change, when a module is marked with either:

exports.__esModule = true;

or

Object.defineProperty(exports, '__esModule', { value: true });

And it does not otherwise perform any unchecked operations such as assignments to module.exports, it will be reconstructed without any wrapper or interop code. In contrast to the previous implementation of this PR, it will still provide the same return value when required by another commonjs module and also still support synthetic exports.

If there is an assignment to exports.default or module.exports.default, this assignment will become the default export. Otherwise the default export will be the commonjs namespace, i.e. the return value when requiring the module. This is to ensure that if there is no explicit default export, we keep compatibility with Node.

Moreover, this PR is now close to a complete rewrite of core parts of this plugin, making it hopefully much more maintainable, and also faster. In short, the previous version of the plugin did two complete AST traversals as well as two iterations over all top level statements while now, there is only one top level iteration and one AST traversal.

The leading principle here was to focus on gather information and collecting specific nodes during this traversal and only perform transformations that need to be performed in any case. Then only after the traversal is complete and we have a complete picture of the code, the remaining code transformations are performed.

The remaining code then is roughly grouped in export generation and import generation, both of which were extracted to separate modules.

There are some small improvements like not adding interop code in more situations, but also some bugs were fixed:

  • if there are multiple assignments to the same exports.foo, the plugin will no longer generate invalid code
  • if there are multiple requires of the same module in the code, it will now be ensured that the variable name chosen for the import does not cause a conflict in ANY of the require locations.
  • the check for the peer dependency version of Rollup will now actually work again (and there is a test to ensure it does not break again in the future)
  • tests that throw errors are now guaranteed to turn red. Previously, errors were sometimes swallowed, resulting in green tests. For that reason, there is now a skipped test for dynamic requires that is unfortunately not easily fixable without a complete rethinking of the dynamic require logic.

One last change I added here is that in case no dynamic requires are used, the plugin uses a simplified commonjs wrapper, because it really bugged me that the message Dynamic requires are not currently supported by @rollup/plugin-commonjs became part of everyones code base while hardly anyone would ever see it. Now this message and the corresponding require handler will only be embedded if the code explicitly references require or module.require.

@@ -110,7 +108,7 @@ export default function commonjs(options = {}) {
ast
);

setIsCjsPromise(id, isEsModule ? false : Boolean(transformed));
setIsCjsPromise(id, isEsModule || isCompiledEsModule ? false : Boolean(transformed));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This avoids creating a proxy module for these, which I think is correct. I'm not entirely sure if this creates other unintended side effects?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if you are reconstructing as an es module, and that es module will still have require calls with dynamic arguments, then the transformation should still take place - after the reconstruction.

"@typescript-eslint/no-unused-vars": "off"
"@typescript-eslint/no-unused-vars": "off",
"camelcase": "off",
"no-underscore-dangle": "off"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to add some eslint ignore, what is the prefered approach here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to add some eslint ignore, what is the prefered approach here?

If I understand correctly, you did this due to the const at the top.
It's pretty standard for consts to be upper-underscore-case, and it looks like eslint supports that out of the box:

If ESLint decides that the variable is a constant (all uppercase), then no warning will be thrown

(https://eslint.org/docs/rules/camelcase)

So it's weird you had to do this.
Anyway I'm for using a local eslint-disable-line on the same line. So other instances will still warn like it usually does, to keep the style aligned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lintings errors come from generated output in the test fixtures, I can't make those const or add linting comments unfortunately.

Copy link
Contributor

@danielgindi danielgindi Aug 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh so you can exclude the tests folder, or override settings for the tests folder, and that's what you did.
I missed the fact that this was under tests.
Maybe you can change the variable generation algorithm to avoid collisions with other people's eslint with generated code?

return {
code,
map,
syntheticNamedExports: isEsModule || isCompiledEsModule ? false : '__moduleExports'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm avoiding creating __moduleExports for compiled es modules. This follows the pattern of turning the commonjs module into a real es module. I'm not sure if this can create issues in other places though.

@lukastaegert
Copy link
Member

Thank you already for your work! So I tried to compile Rollup with this new plugin and it failed by creating an empty namespace for a dependency. So here is a test case that should be considered:

// input: dep.js
Object.defineProperty(exports, '__esModule', { value: true });
const foo = { test: 42 };
module.exports = foo;

// actual output (contains no exports at all)
const foo = { test: 42 };
var dep = foo;

So what I would expect is that the plugin just falls back to the old logic if anything is assigned to module.exports because here, named export detection will fail.

Another case that we should definitely think about:

// input
// main.js
import test from './dep.js'
console.log(test);

// dep.js
Object.defineProperty(exports, '__esModule', { value: true });
exports.test = 42;

Now this one fails because dep.js has no default export. Which is kind of true EXCEPT that this example could run perfectly in Node because every CJS file has a default export. Now of course just adding a default export would be inconsistent but I wonder if there is a way this could be fixed via an option. At the moment, this behaviour change makes this PR breaking.

A simple fix that would as far as I can tell be consistent with the current behaviour, which will be using this interop:

function getDefaultExportFromCjs (x) {
	return x && x.__esModule && Object.prototype.hasOwnProperty.call(x, 'default') ? x['default'] : x;
}

would be to replicate this interop in so far as if there is no default export, export the namespace of this module as default. This would fix the above scenario and avoid the breaking change. Now of course this would require either creating a circular dependency or manually creating an object for the default export.

@LarsDenBakker
Copy link
Contributor Author

LarsDenBakker commented Aug 16, 2020

@lukastaegert Thanks for the notes.

module.exports reassignment

This is very unusual, because it would overwrite the __esModule property as well. Is this done in the wild? Anyhow I've adjusted the code to completely bail out of special handling for this case.

namespace export

I added a namespace export back when there is no default export. I think I got it in a way that didn't require a lot of extra logic or helpers. But please confirm :)

Copy link
Member

@lukastaegert lukastaegert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So thank you for keeping up with my comments so far, and sorry this is taking a while. So I went for another round of testing, and here are some things I found by using a CJS module as an entry point:

  • "Proper" compiled ES module

    // input
    exports.foo = 'foo';
    exports.default = 'default';
    Object.defineProperty(exports, '__esModule', { value: true });
    
    // output
    var foo = 'foo';
    var _default = 'default';
    export default _default;
    export { foo };

    looks good.

  • Using module.exports

    // input
    module.exports = { foo: 'foo', default: 'default' };
    Object.defineProperty(exports, '__esModule', { value: true });
    
    // output new plugin
    var main = { foo: 'foo', default: 'default' };
    export default main;
    
    // output old plugin
    // ... helper code
    var main = createCommonjsModule(function (module, exports) {
      module.exports = { foo: 'foo', default: 'default' };
      Object.defineProperty(exports, '__esModule', { value: true });
    });
    var main$1 = /*@__PURE__*/getDefaultExportFromCjs(main);
    export default main$1;

    So here, the new plugin gives us the wrong default export. As there is a default property in the object, it should definitely give us that as the default export. I still think we really need to completely deoptimize in that situation and do what the old plugin did here to not break code. This also includes keeping the property definition. And regarding the question if that pattern is used, anymatch, which is a transitive dependency of Rollup itself, uses this pattern.

  • No default export

    // input
    exports.foo = 'foo';
    Object.defineProperty(exports, '__esModule', { value: true });
    
    // output
    var foo = 'foo';
    var main = {
      foo: foo
    };
    export default main;
    export { foo };

    While the output here is generally not bad, we could actually "hide" the missing default export in entry points by adding syntheticNamedExports: 'default' to transpiled ES modules that do not have an explicit (and only those!) default export. That would still make it possible to import the default internally but it would not pollute the signature of entry points:

    // output
    var foo = 'foo';
    export { foo };

// eslint-disable-next-line consistent-return
return val.value;
return { key: key.value, value: valueProperty.value };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you run this across code with Object.defineProperty(exports, 'someNonsense', {value: true}) then the plugin will crash because one usage of this function does not yet use the new signature:
https://github.com/rollup/plugins/pull/537/files#diff-425cdc831ee23ba9fbb89366518fb8b6L460-L461
Maybe you also want to adjust the name slightly to reflect that it is giving you now the property definition and not just the name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also crashes if you use a non-literal or falsy value.

@lukastaegert
Copy link
Member

lukastaegert commented Aug 20, 2020

So my general feeling at the moment is that this PR is a little too eager. There may be other edge cases I did not test yet. Considering the huge importance of this plugin for the ecosystem and the amount of tools relying on it (e.g. SnowPack), maybe a "conservative by default" approach would be better here. This would mean

  • If there is any reason where the old logic would have decided that it shouldWrap everything in a createCommonJSModule EXCEPT if there is a SINGLE Object.defineProperty on exports or module.exports for __esModule with a truthy value, then we should still wrap everything and keep any Object.defineProperty and handle it exactly like before.

  • If there is a proper Object.defineProperty but the module is assigning anything to module.exports we should again wrap everything, keep the property definition and deoptimize as before. That way, module.exports = {default: 'foo', __esModule: true} would also be handled correctly.

  • Only if there is no fishy business except the single property definition do we remove this definition.

  • Making the isCjs return false for such a module may also be a bad idea. I think we really want to make sure we do not change behaviour here. See this example:

    // input
    // main.js
    const dep = require('./dep.js');
    console.log(dep);
    
    // dep.js
    exports.foo = 'foo';
    Object.defineProperty(exports, '__esModule', { value: true });
    
    // output new plugin
    var foo = 'foo';
    var dep = {
      foo: foo
    }
    var dep$1 = /*#__PURE__*/Object.freeze({
      __proto__: null,
      'default': dep,
      foo: foo
    });
    console.log(dep$1);
    var main = {};
    export default main;
    
    // output old plugin
    // ... helper code
    var dep = createCommonjsModule(function (module, exports) {
      exports.foo = 'foo';
      Object.defineProperty(exports, '__esModule', { value: true });
    });
    console.log(dep);
    var main = {};
    export default main;
    
    // what might be better for the new plugin
    var foo = 'foo';
    var dep = {
      foo: foo
    }
    console.log(dep);
    var main = {};
    export default main;

    So there is no reason to create a proper namespace object here, we could just forward the fake namespace which is what a real CommonJS module in Node what also get. To that end, we could just again add the __moduleExports export and the syntheticNamedExports: "__moduleExports" property and mark this as a CommonJS module. That way, this release really would not be a breaking change at all because everything would behave as before. It is just that we get more optimized code in some situations. But we should still check if there are issues with unwrapping when the property definition is missing. Maybe we also add it to the fake default export and thus __moduleExports.

@lukastaegert
Copy link
Member

Here is another scenario to consider: Assume we previously created a CJS module that has another module as an external dependency and now we are bundling both together:

// input
// main.js
var dep = require('./dep.js');
function _interopDefault(e) {
  return e && e.__esModule ? e : { default: e };
}
var dep__default = /*#__PURE__*/ _interopDefault(dep);
console.log(dep__default['default']); // should log "foo"

// dep.js
exports.default = 'foo';
Object.defineProperty(exports, '__esModule', { value: true });

// output new plugin
var _default = 'foo';
var dep = /*#__PURE__*/Object.freeze({
  __proto__: null,
  'default': _default
});
function _interopDefault(e) {
  return e && e.__esModule ? e : { default: e };
}
var dep__default = /*#__PURE__*/ _interopDefault(dep);
console.log(dep__default['default']); // actually logs "{ default: 'foo' }"
var main = {};
export default main;

// output old plugin
// ... helper code
var dep = createCommonjsModule(function (module, exports) {
  exports.default = 'foo';
  Object.defineProperty(exports, '__esModule', { value: true });
});
var main = createCommonjsModule(function (module) {
  function _interopDefault(e) {
    return e && e.__esModule ? e : { default: e };
  }
  var dep__default = /*#__PURE__*/ _interopDefault(dep);
  console.log(dep__default['default']); // logs "foo" as expected
});
var main$1 = /*@__PURE__*/getDefaultExportFromCjs(main);
module.exports = main$1;

So how to fix this? We could make sure that internal require statements always receive the "old" namespace by adding a __moduleExports export as before that exports something like

export const __moduleExports = Object.defineProperty({
  default: foo
}, '__esModule', {value: true});

So basically wrapping the artificial namespace object in a "defineProperty" to keep semantics. But I fear this may already be a problem when importing an actual ES module that I did not anticipate, so maybe this could also be "fixed" in the proxy in some way.

@lukastaegert
Copy link
Member

Ok, so there is definitely a bug in the current implementation where importing an ES module from CommonJS does not properly resolve the default export due to the missing property. This should actually be solved separately, hope I find some time soon. My best idea so far is to

import * as namespace from 'module';
export default /*#__PURE__*/Object.defineProperty(/*#__PURE__*/Object.assign({}, namespace), '__esModule', {value: true});

in the proxy, hm

@LarsDenBakker
Copy link
Contributor Author

@lukastaegert Thanks for the review.

I thought I had already covered the assignment to exports case, will double check the test cases.

I was indeed unsure about not marking the module as CJS. I was hoping we could do a clean CJS -> ESM -> CJS, but clearly there is a lot more to consider here.

@lukastaegert
Copy link
Member

I was indeed unsure about not marking the module as CJS. I was hoping we could do a clean CJS -> ESM -> CJS, but clearly there is a lot more to consider here

Maybe let's first wait for #552 as it might have some implications here.

@lukastaegert
Copy link
Member

Hi @LarsDenBakker, if you do not mind, I would offer to take over this PR so that you can focus on the IMO more important #540. Background is also that once this last open commonjs PR is merged, I would like to do a larger refactoring PR to improve the code structure and modularization of this plugin. There are some very important future improvements I hope to tackle based on this.

@LarsDenBakker
Copy link
Contributor Author

No problem, go for it!

@lukastaegert lukastaegert force-pushed the feat/improve-compiled-esm-handling branch 2 times, most recently from 28deeca to 2493fe4 Compare October 19, 2020 04:28
@guybedford
Copy link
Contributor

The new approach here is looking great @lukastaegert.


switch (node.type) {
case 'Literal':
return node.value;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be return !!node.value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would not make a difference but we can do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah just to be consistent so you can can always expect a boolean, later you might use this function for other cases

@@ -60,12 +61,56 @@ export function hasCjsKeywords(code, ignoreGlobal) {
return firstpass.test(code);
}

function isTrueNode(node) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that all of these AST helpers beg for a new utility file 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely. Only wondering if I should do these things in this PR or go for a separate refactoring PR, currently leaning towards the latter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with you. This way we can review this PR without wondering what actually changed

@lukastaegert lukastaegert marked this pull request as draft October 20, 2020 03:57
@lukastaegert
Copy link
Member

Converting this one to a draft because I still want to check some stuff here.

@lukastaegert lukastaegert force-pushed the feat/improve-compiled-esm-handling branch from 2493fe4 to 375ba48 Compare October 20, 2020 05:03
@lukastaegert lukastaegert force-pushed the feat/improve-compiled-esm-handling branch from 1d59b3a to 627f801 Compare November 24, 2020 06:12
@lukastaegert
Copy link
Member

lukastaegert commented Nov 24, 2020

I already started work on the next big refactoring PR that will add circular dependency support here: #658. To avoid merge cascades, I would hope to put any larger changes beyond regression fixing into the other PR.

@shellscape shellscape merged commit 21c51e0 into rollup:master Nov 28, 2020
@shellscape
Copy link
Collaborator

👏 great work everyone

vasttiankliu added a commit to vasttiankliu/plugins that referenced this pull request Nov 17, 2022
…537)

BREAKING CHANGES: rollup/plugins#537 (comment)

* feat(commonjs): reconstruct real es module from __esModule marker

* fix(commonjs): handle module.exports reassignment

* fix(commonjs): preserve namespace default export fallback

* fix(commonjs): mark redefined module.exports as CJS

* chore(commonjs): fix tests

* Make tests actually throw when there is an error in the code and skip broken test

* chore(commonjs): Improve how AST branche are skipped

* fix(commonjs): Fix Rollup peer dependency version check

* refactor(commonjs): Use a switch statement for top level analysis

* refactor(commonjs): Restructure transform slightly

* refactor(commonjs): Extract helpers

* refactor(commonjs): Extract helpers

* refactor(commonjs): Move __esModule detection logic entirely within CJS transformer

* refactor(commonjs): Add __moduleExports back to compiled modules

* refactor(commonjs): Move more parts into switch statement

* refactor(commonjs): Completely refactor require handling

Finally remove second AST walk

* fix(commonjs): Handle nested and multiple __esModule definitions

* fix(commonjs): Handle shadowed imports for multiple requires

* fix(commonjs): Handle double assignments to exports

* chore(commonjs): Further cleanup

* refactor(commonjs): extract logic to rewrite imports

* feat(commonjs): only add interop if necessary

* refactor(commonjs): Do not add require helper unless used

* refactor(commonjs): Inline dynamic require handling into loop

* refactor(commonjs): Extract import logic

* refactor(commonjs): Extract export generation

* refactor(commonjs): Avoid empty lines before wrapped code

* Do not remove leading comments in files

* refactor(commonjs): Remove unused code

* fix(commonjs): Improve error message for unsupported dynamic requires

Co-authored-by: Lukas Taegert-Atkinson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants