Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature suggestion: optimizing var definitions #75

Open
chrisaljoudi opened this issue Dec 22, 2012 · 25 comments
Open

Feature suggestion: optimizing var definitions #75

chrisaljoudi opened this issue Dec 22, 2012 · 25 comments

Comments

@chrisaljoudi
Copy link

Hi!
I came across a case where there's a possible trick that can reduce size significantly. Take the following piece of code:

var x = 12345678, y = 12345678;

It compresses to:

var x=12345678,y=12345678;

Can't it be optimized to the following:

var x=12345678,y=x;

Or is there something I'm missing? I think it should be available as an unsafe optimization.

Thanks!

@mal
Copy link

mal commented Dec 22, 2012

Also, if you could determine that x was read only and never used in scope after y had been mutated, or vice versa (or even that both were read only) you could potentially optimise one of the two variables away completely.

Have to be careful of inner closures, probably pretty unsafe, could be fun though! 😛

@ForbesLindesay
Copy link
Contributor

The optimisation above would be completely safe though

@staabm
Copy link

staabm commented Dec 23, 2012

The optimization is only safe when the vars contain primitives, isn't it?

@chrisaljoudi
Copy link
Author

@staabm I think it's safe when the expression-node being assigned to the variables has .has_side_effects() false.

@ForbesLindesay
Copy link
Contributor

In general it is safe to optimise:

var x = E, y = E;

to

var x = E, y = x;

For all E provided that E:

  1. has no side effects
  2. returns the same value every time

e.g. The following would not be safe to optimise even though Math.random() has no side effects:

var x =  Math.random(), y =  Math.random();

The following would be though:

function get() { return 5; }
var x =  get(), y =  get();

The follwoing would not because {} === {} is false

function get() { return {}; }
var x =  get(), y =  get();

but the following would be:

var val = {};
var x = val, y = val;

@chrisaljoudi
Copy link
Author

@ForbesLindesay You're definitely right, sorry for being inaccurate.

@ForbesLindesay
Copy link
Contributor

Interestingly it would be safe with certain very specific side effects, but this is probably more in depth analysis than we'd want to do:

var got = false;
function get() {
  return got = true;
}
var x =  get(), y =  get();

because although get has side effects, it doesn't have a side effect if you run it a second time having not reset got to false.

@ForbesLindesay
Copy link
Contributor

No worries @abody these things are tricky at the best of times, my initial intuition which I started writing down was completely false.

@chrisaljoudi
Copy link
Author

@ForbesLindesay Yeah that's too deep an analysis -- you'd expect the person who's coding the above would optimize it himself/herself. I initially just thought about this optimization because I had initialized lots of variables to null.

@chrisaljoudi
Copy link
Author

@ForbesLindesay There are also cases where defined getters might have side effects, but it's too hard to spot those. That's why I suggested to just make the optimization disabled by default/unsafe.

@ForbesLindesay
Copy link
Contributor

You wouldn't want it to run for anything that could be a getter, because that's way too likely to be unsafe. Just code the optimisation so it only does it when we know it's safe (i.e. expressions that don't contain property accessors, function calls, object literals, array literals or regexp literals).

Perhaps more significantly, it slows runtime performance if not done carefully: http://jsperf.com/optimizing-var-definitions (needs more runs to be statistically significant).

I suspect it will have negligible or no affect on file size once gzip has done it's thing. It may ultimately not be worth optimising, unless you want to handle the case of a call to a function which has no side affects and always returns the same value. e.g. the performance gain of optimising the following would be huge:

function factorial(n) {
  if (n === 0) return 1;
  else if (n > 0) return n * factorial(n - 1);
}
var a = factorial(50), b = factorial(50);

How often such code actually appears though I'm not sure.

Do we know whether UglifyJS already tracks which functions are 'pure' functions (i.e. no side effects and always return the exact same value for the same input)? If it does, this may be doable, if not then it's a fair bit of work for minimal gain.

@chrisaljoudi
Copy link
Author

@ForbesLindesay Oh interesting benchmark; I'm running Chrome 23 and actually the separate declaration is slower than either the remaining two. You can never really know if there's a getter defined; and it needn't be a property accessor to be unsafe (think about the case where you do window.__defineGetter__("something", someFunctionThatDoesEvilStuff)).

Perhaps you're right about the negligible difference in file size. I'm not sure whether UglifyJS keeps track of such 'pure' functions, perhaps @mishoo can clarify?

@ForbesLindesay
Copy link
Contributor

OK, anything that can be a getter then, which should include all property accessors and any variable that's not found in the current scope (and therefore could be a property on the window)

@ForbesLindesay
Copy link
Contributor

this.defineGetter wouldn't make any difference, I've already said we should exempt property accessors and 'global' variables, this.name is a property accessor.

@ForbesLindesay
Copy link
Contributor

It might need to also be disabled inside with blocks or where there's evals and new Functions, but virtually all optimisations are disabled inside with blocks etc.

@chrisaljoudi
Copy link
Author

@ForbesLindesay Oh. Never mind about my previous comment. I completely missed the point. Sorry!

@ForbesLindesay
Copy link
Contributor

For anyone else reading:

function test() {
    this.__defineGetter__("x", function() {
        alert("This is evil.");
        return 5;
    });
    return x;
}
console.log(new test());

just throws a reference error and if you're in stict mode then so does the above code. The above code in non-strict mode is just assigning x as a property of the global object so the following is also fine:

(function () {
  var x = 10;
  function test() {
    this.__defineGetter__("x", function() {
        alert("This is evil.");
        return 5;
    });
    return x;
  }
  console.log(test()); //logs 10
}())

@mishoo
Copy link
Owner

mishoo commented Dec 23, 2012

This might seem simple enough to wonder “is there a good reason not to do it?”, but it's in fact quite complicated in the general case.

Doing it only for the case var i = XXX, j = XXX is fairly simple, indeed, but won't probably save one byte in 99.9% of cases. Doing it for the case var i = XXX; ...; var j = XXX requires data flow analysis, and there's some serious theory behind this term. I investigated the possibilities but for the near future I don't think we'll have this (it's complicated by two issues: (1) closures, and (2) JS is a complex beast; normally compilers do DFA on a very simple intermediate language, not on the original source).

This comment also applies to #68 and #27 — data flow analysis would help dealing with them too.

@chrisaljoudi
Copy link
Author

@mishoo I see. Well, I just tested with the Google Closure Compiler (it actually took me some time to work around it optimizing away the variables and replacing them with their values). It appears it doesn't do such an optimization, either. Thanks for your explanation!

@SoniEx2
Copy link

SoniEx2 commented Sep 7, 2015

This could be easily doable by using register-based bytecode, instead of AST, as IR. You'd lose comments but you could always add a bytecode for comments if you wanted to :P

@michaelficarra
Copy link
Contributor

@SoniEx2 Unless you're very careful with your design, transformations on that bytecode would result in behaviour that cannot be translated back to JavaScript.

@SoniEx2
Copy link

SoniEx2 commented Sep 7, 2015

@michaelficarra You can turn Lua bytecode back and forth, at least... altho slightly inefficiently for some things... Maybe optimize based on bytecode and minify based on JS? (JS -> bytecode -> optimize -> JS -> minify)

@michaelficarra
Copy link
Contributor

Sure, it can be done, but that optimisation step is going to be tricky. You have to make sure the bytecode's overall structure remains representable in JavaScript, or at least can be filled in to be representable. As an example, a bytecode with arbitrary jumps can jump into the middle of a "loop", but that behaviour cannot be represented in JavaScript; at least, not easily.

@SoniEx2
Copy link

SoniEx2 commented Sep 7, 2015

I see what you mean. But if you want to optimize efficiently an IR is usually the way to go.

@Mrgaton
Copy link

Mrgaton commented Oct 11, 2024

this isnt fixed yet?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants