Put *Sync methods behind a flag in some future major version #1665
Comments
I think synchronous methods do have their place, and shouldn't be moved behind a flag. You just shouldn't be using synchronous calls in hot code. |
-1. Sync functions are very useful in various cases.
|
@cjihrig I was not talking about @vkurchatkin Checked the code. Isn't it possible to create an actual lock instead of a loop? By the way, |
@ChALkeR I'm not sure how lock can help? To run async functions you need an event loop. It can't be the main loop, as unrelated requests would be processed as well. Something like this would work: nodejs/node-v0.x-archive#7323 |
+1 Only good use for sync functions is for one-line scripts, so adding a flag for it seems reasonable. But usage of sync functions in any public modules should definitely be discouraged. Not only because of the event loop blocking, but because sync functions can't be replaced with async functions should the need for it arise. For example, Jade uses Even synchronous require is a big pain in the ass if you want to require something from a remote http server or a zip archive instead of a filesystem. We've talked about that before. I believe sync functions should be removed completely when ES7 |
@vkurchatkin nodejs/node-v0.x-archive#7323 looks like a cleaner solution to me than the current deasync module. Once again: I do not propose to remove This might require a better way to build a sync version of an async function than the deasync module, if it is not good enough. |
I'd rather have an async-only mode that warns/throws (an option for both). Creating CLI scripts that require runtime flags is a pain in the ass. |
I'm -1 on putting Sync methods behind a flag. There are plenty of valid use cases for them. |
@jonathanong That wouldn't do anything. It would not lower You could just grep for |
@piscisaureus They introduce problems because they seem to be an easy solution at first glance but result in huge problems later (see 2, 3). Even for use cases that look valid there might be problems (see 4). Such a flag might be useful, but this is a different question from the one that I want to discuss here. A relevant thing would be to propose some other way of discouraging newcomers (and not-so-newcomers) from (mis-)using @piscisaureus, @vkurchatkin Please, list specific use cases for this to be constructive. |
Relevant: #5 (comment) |
To name some: scripting, bootstrapping, logging, using IO inside synchronous code. Sometimes blocking is fine. Also, point by point: 1 -
No, they couldn't 2 - let's fix the docs 3 -
This point as against removing 4 - this looks bad, actually 5 - See 2 6 -
No, it's not. |
|
I think we're forgetting how much userland code this sort of hard-deprecation and putting behind a flag breaks. Tons of packages use the Wouldn't it be nicer to just discourage them in the docs more or soft-deprecate them? Our developers aren't retards, most io developers understand what blocking and non-blocking code means. Removing these just seems cruel and I'm not really sure what problem it actually solves. |
@benjamingr Ok, I guess we need some stats here. Discouraging in the docs is required, but it would not solve the problem completely. For example, there could be some intermediate step where those methods cause runtime warnings (for the first usage of every function) if the program was started without a flag. Another way would be to allow enabling that flag per |
I don't understand something here though:
There are tens of thousands of usages for |
@benjamingr I repeat: this is not about right away, it is about changing the docs at some point, then moving that under a flag in some future version when all the prerequisites are fulfilled (Promises in the core, a good deasync implementation). It should be done gracefully enough and after making sure that all problems could be solved. About breaking compatibility: check how many packages are there that require a version of And for reference, there are 302 results for And no, soft-deprecation will not be as effective. People will still use |
I agree with @piscisaureus. I'm -1 on putting them behind a flag as well |
Regarding @piscisaureus's But it's not sufficient. It gets a little old telling every new developer to NEVER EVER EVER, no seriously, NEVER use a My solution is to use a module like Further, if someone did want/need to use a |
@CrabDude Do I read it correctly that you propose to loosen the restriction a bit compared to what I initially proposed and to allow using If so, sounds like a reasonable change to me for the transition period (which could be indefinitely long). But it would not fix the scripts like the one mentioned in (4). Maybe allowing |
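The "first tick only" idea discussed in this exchange can be sketched as a small guard. This is only an illustration under stated assumptions: `createStartupGuard`, `wrap` and `end` are hypothetical names, not io.js APIs, and treating "startup" as everything that runs before the first `setImmediate` callback is an approximation chosen for this sketch.

```javascript
'use strict';
const fs = require('fs');

// Hypothetical helper (not a core API): tracks whether the program is still
// in its startup phase and wraps functions so they throw once it has ended.
function createStartupGuard() {
  let startup = true;
  function end() { startup = false; }
  // Assumption: startup ends when the first setImmediate callback fires.
  setImmediate(end);
  return {
    end: end,
    wrap: function (name, fn) {
      return function () {
        if (!startup) {
          throw new Error(name + ' called after startup; use the async version');
        }
        return fn.apply(this, arguments);
      };
    }
  };
}

// Usage: a guarded copy of fs.readFileSync that only works during startup.
const guard = createStartupGuard();
const readFileSyncAtStartup = guard.wrap('readFileSync', fs.readFileSync);
```

As a later comment points out, a guard like this does not fit programs that legitimately block after startup, such as child processes doing blocking work in response to messages.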
-1 from me. What about all the people pushing blocking code to child processes? They have to wait for async messages to come in, so only allowing them in the first and last tick doesn't work. Some things simply don't work effectively as async. Certain crypto functions or machine learning code are good examples as there's so much data interaction that the thread hops kill performance. My suggestion would be to simply wait for async/await and a promisified core, then deprecate whichever sync functions are no longer useful. Regardless of what we decide to do, this would probably not happen for quite awhile. |
But some functions that have a
That's why it's titled «in some future major version» ☺. |
I'm -1, whilst I agree the sync functions are bad. There is no advantage in doing this, because you still have the added complexity of supporting sync in iojs (only now with some additional flag logic). Plus, whilst I know sync is bad, there are tons of times when performance is not a priority. Admittedly it's a systemic issue in my experience (I use sync because something I want to use was also sync). I think I would be +1 on deprecating sync when the ES7 await keyword lands. |
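For readers wondering what "wait for async/await and a promisified core" would look like in practice, here is a minimal sketch. It assumes a much later Node.js release that ships `fs.promises` and `async`/`await`, neither of which existed in io.js when this thread was written; `countLines` and `main` are illustrative names only.

```javascript
'use strict';
const fs = require('fs');
const os = require('os');
const path = require('path');

// Reads a file without blocking the event loop, yet reads top to bottom
// much like the *Sync version would.
async function countLines(file) {
  const text = await fs.promises.readFile(file, 'utf8');
  return text.split('\n').length;
}

async function main() {
  const file = path.join(os.tmpdir(), 'sync-flag-demo-' + process.pid + '.txt');
  await fs.promises.writeFile(file, 'a\nb\nc');
  try {
    console.log(await countLines(file)); // prints 3
  } finally {
    // Clean up the temporary file whether or not the read succeeded.
    await fs.promises.unlink(file);
  }
}

main().catch(function (err) {
  console.error('demo failed:', err.message);
  process.exitCode = 1;
});
```

Errors surface through ordinary `try`/`catch`/`finally`, which is the property several commenters cite as the reason to revisit the *Sync methods only once this style is available.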
@Qard Check this out:
Sync:
'use strict';
var zlib = require('zlib');
var data = 'abcdefghijklmnopqrstuvwxyz';
var gzipped = zlib.gzipSync(data);
var time = process.hrtime();
var count = 0, limit = 800000, block = 20000;
function call() {
  var contents = zlib.gunzipSync(gzipped);
  count++;
  if (count % block === 0) {
    var t = process.hrtime(time);
    console.log(count + ': ' + block / (t[0] + t[1] / 1e9) + ' per second ' + JSON.stringify(process.memoryUsage()));
    time = process.hrtime();
  }
}
for (var i = 0; i < limit; i++) {
  call();
}
Results:
Async:
'use strict';
var zlib = require('zlib');
var data = 'abcdefghijklmnopqrstuvwxyz';
var gzipped = zlib.gzipSync(data);
var time = process.hrtime();
var count = 0, limit = 800000, block = 20000;
function call() {
  zlib.gunzip(gzipped, function(err, contents) {
    count++;
    if (count % block === 0) {
      var t = process.hrtime(time);
      console.log(count + ': ' + block / (t[0] + t[1] / 1e9) + ' per second ' + JSON.stringify(process.memoryUsage()));
      time = process.hrtime();
    }
    if (count > limit) {
      process.exit(0);
    }
    setImmediate(call);
  });
}
for (var i = 0; i < 20; i++) {
  call();
}
Results:
Guess which one is faster (putting aside the fact that the sync one consumes 3.8 GiB of memory and crashes)? |
@meandmycode That would work, I think, and will also solve the problem. I would be happy if that happens. I agree that there is nowhere to rush on this issue now, this has to wait until there is good Promises support in the core. |
To be fair, gunzipping a single thing nearly a million times synchronously is not very realistic code. I can take down the process even easier with a simple I could fork bomb my own dev server. Does that mean I should be telling the bash devs to remove spawn support? |
And that's not explained in the docs. Re-read the reasoning again — the methods are there, people expect them to work.
That's called a testcase, that's why it's a single short thing gunzipped nearly a million times. |
They do work. It's the garbage collector that does not work as expected. Removing the sync methods does nothing to stop the thousands of other ways I can blow up a process by blocking the garbage collector. The only fix for that is documentation. |
I accept that point.
Well, that was expected. I'm glad that there is some discussion on this, though. A flag to trace But the issue is initially about discouraging using |
I labeled it tc-agenda. A unanimous 'no' from the TC (if that is what happens) might nip a lot of discourse (and discord) in the bud. |
Perhaps better intro docs on how to use async javascript? |
Let's just put the performance issue to bed. Depending on the use case, it can be faster to use the
'use strict';
const ITER = 1e4;
const LEN = 1024 * 64;
var fs = require('fs');
var fd = fs.openSync('/tmp/tmp.tmp', 'r');
var b = new Buffer(LEN);
var cntr = 0;
var t, i;
// Async Test
/*
t = process.hrtime();
(function r() {
  if (++cntr === ITER)
    return printTime();
  fs.read(fd, b, 0, LEN, 0, r);
}());
/* */
// Sync Test
/*
t = process.hrtime();
for (i = 0; i < ITER; i++) {
  fs.readSync(fd, b, 0, LEN, 0);
}
printTime();
/* */
function printTime() {
  t = process.hrtime(t);
  t = t[0] * 1e9 + t[1];
  console.log((t / ITER).toFixed(1) + ' ns/op');
}
So the only real question is should we allow an operation that blocks the event loop. IMO developers are intelligent enough to make the call whether they should or not. And don't forget one important use case. If there's an uncaught exception and I want to log information about the state of the application before it's brought down. I can't create an async write because the event loop is possibly in a bad state. It's critical that I'm able to synchronously log this information out. |
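The crash-logging use case described here can be sketched as follows. It is a minimal illustration, not anyone's production handler: `writeCrashReport` is a hypothetical helper, and the fields in the report are chosen only for the example. The point is that `fs.writeSync` returns only after the data has been handed to the operating system, so nothing depends on the event loop still being healthy.

```javascript
'use strict';
const fs = require('fs');

// Hypothetical helper: serialize some state about the failure and write it
// to the given file descriptor with a blocking call.
function writeCrashReport(fd, err) {
  const report = JSON.stringify({
    time: new Date().toISOString(),
    message: err.message,
    stack: err.stack,
    memory: process.memoryUsage()
  }) + '\n';
  fs.writeSync(fd, report); // blocking write, no callback needed
  return report;
}

process.on('uncaughtException', function (err) {
  writeCrashReport(2, err); // file descriptor 2 is stderr
  process.exit(1);          // state is unknown, so do not keep running
});
```

An asynchronous write in the same handler might never complete, because the process is about to exit and the callback would need another turn of the event loop.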
@ChALkeR The issue you raise about a beginner using Since there are way better ways to solve this issue (suggested above like better documentation, proper warnings¹ etc.), I also -1 this suggestion. A lot of functions in the standard library could be misused. We can't deprecate them because beginners are likely to misuse them. ¹ - See how browsers warn in the console about the use of synchronous ajax requests? |
It'd be pretty trivial to write a userland module that, when required,
|
Writing such a module is trivial. However, forcing all the people who misuse Sync to install it is not as easy. :( |
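One possible shape for such a userland module, combined with the "warn on the first usage of every function" idea raised earlier in the thread, is sketched below. `warnOnSync` is a hypothetical name and the warning text is made up for the example; the sketch only relies on the fact that the object returned by `require()` for a core module can be monkey-patched in CommonJS code.

```javascript
'use strict';

// Hypothetical helper (not a core API): replaces every function on `mod`
// whose name ends in "Sync" with a wrapper that warns once per method and
// then delegates to the original. Returns the names it wrapped.
function warnOnSync(mod, modName, warn) {
  warn = warn || console.warn;
  const wrapped = [];
  Object.keys(mod).forEach(function (key) {
    const original = mod[key];
    if (typeof original !== 'function' || !/Sync$/.test(key)) return;
    let warned = false;
    mod[key] = function () {
      if (!warned) {
        warned = true;
        warn(modName + '.' + key + ' blocks the event loop');
      }
      return original.apply(this, arguments);
    };
    wrapped.push(key);
  });
  return wrapped;
}

module.exports = warnOnSync;
```

An application would opt in with something like `warnOnSync(require('fs'), 'fs')` before loading its dependencies. As the reply above notes, this only helps people who choose to install it, and it cannot see code that captured a reference to the original function earlier.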
@Qard I'm sure you're aware, but please read this thread on why that still isn't foolproof: #1674. Here's a crazy hack as a proof-of-concept to get around that:
['read', 'write'].forEach(function(m) {
  var old_method = process.binding('fs')[m];
  process.binding('fs')[m] = function() {
    // Inspect the last argument actually passed (index length - 1).
    if (typeof arguments[arguments.length - 1] !== 'function')
      console.warn('did not pass callback', (new Error()).stack.substr(6));
    return old_method.apply(null, arguments);
  };
}); |
@trevnorris Your example demonstrates my point on the common misperception and doesn't address the reasoning for this issue (performance degradation in high-concurrency scenarios). Here's an example that does:
var http = require('http')
var foo = require('./packageWithBuriedSync').foo
// Note the ignorant all-too-common omission of Sync in the name
var bar = require('./packageWithBuriedSync').bar
function asyncHandler(req, res) {
  var i = 3
  foo(sendEnd)
  foo(sendEnd)
  foo(sendEnd)
  function sendEnd() {
    if (!--i) res.end()
  }
}
function syncHandler(req, res) {
  bar()
  bar()
  bar()
  res.end()
  // To eliminate performance questions about closure creation
  function sendEndNoop() {}
}
http.createServer(asyncHandler).listen(8000)
// packageWithBuriedSync.js
var child_process = require('child_process')
var exec = child_process.exec
var execSync = child_process.execSync
module.exports = {
  foo: function (callback) {
    exec('ls', callback)
  },
  bar: function () {
    return execSync('ls')
  }
}
Using exec('ls') Time taken for tests:
execSync('ls') Time taken for tests:
Conclusion: For IO-bound cases[1], an ignorant buried @Qard Yes. See @rlidwka's comment though. @benjamingr This is solved. The issue is the current stance is hostile to performance and an anti-pattern for new developers, thus the reasoning for proposing inclusion in core. [1] Due to my MBA's flash drive, filesystem latency was extremely low and async performance had relative parity with |
My preference is to close this in favor of a more pragmatic solution like #1674. |
Indeed. It doesn't really stop people from using sync functions where they Like I said, not a fix, but it's something that can be done right now to
|
Please let's not have an argument about who does or doesn't know more node.. |
@Fishrock123 you're right, for what it's worth I was not making fun of @CrabDude nor was I particularly critical of his level of expertise. I was just really amused by the tone of his comments. Still am, but going to keep it at a more professional tone from now on. @CrabDude I'm sorry if I offended you, but you got to see where I'm coming from. I never once criticized you or your ability to write serverside JavaScript. The only thing my comments were about were the tone you replied to trev (and later me). @ChALkeR for what it's worth it was never personal, I don't recall interacting with him before. |
@CrabDude You successfully demonstrated a case where using a sync alternative is slower. Those examples are plentiful. The point of my benchmark is to 1) demonstrate that async is not always faster and 2) show we shouldn't pretend we know what's best for the app developers. There are legitimate use cases for sync operations. Recall again that if you want to log information in an
|
I don't think you two actually disagree about anything at this point. @trevnorris demonstrated the Neither of you disagreed about those points. Both of you agree that it would be nice to track these performance issues in real code. I think we should focus on #1674 at this point. |
I think that @trevnorris is correct here:
I initially wanted some discussion on this and I am glad that happened and that #1674 was created. But it misses the problem behind this issue that I specifically highlighted in the first (actually second) sentence of the issue. And the discussion for some reason almost missed the documentation changes that I proposed, so I opened a separate issue for that: #1684. I guess this issue could be closed now if there are no objections. |
Let's make sure we're talking about the same things...
Stupid to remove
Unnecessary inflammatory language aside,
First, facts are not theory. It is factually correct, demonstrated above and acknowledged by yourself, that blocking calls are a major performance penalty for IO-bound tasks, which arguably account for the vast majority of node's use (numbers would be useful here). Second, the addition of
Precisely. So we're in agreement that
At face value, this seems reasonable, yet in this context, it's misguided. Reductio ad absurdum:
You'll notice for all of the above, core does not preclude userland from implementing their own features, blocking calls, http alternatives, or module/package systems. It does however ship solutions to further enable the best non-blocking JavaScript IO runtime, which
Addressed at the beginning of the conversation. #1674 and #1684 are excellent steps in the right direction, though I would prefer to see the logic that continues to support the proliferation of |
No, I am not writing this from a mental hospital, please read till the bottom.
The purpose of this is to discourage using *Sync versions of the methods that could be async.
I propose to put all or at least a part of *Sync methods behind a runtime flag in some future major version of io.js and to split documentation, moving all *Sync methods to a separate page. This of course would be semver-major.
Atm, there are *Sync methods defined in zlib, fs, child_process, crypto modules and additionally used in repl and module modules.
To allow usage inside io.js itself (i.e. for require), they could be moved to «private» methods (beginning with a _ sign).
1. *Sync methods suggest bad practices (see 2), but when someone with full understanding of the consequences needs them, they could be constructed from userspace. See https://github.com/abbr/deasync, it creates synchronous methods from async methods. If the deasync module is not good enough for this, this could be done once there is a good enough solution.
2. When a newcomer begins writing something using io.js, he or she goes to the documentation looking how to do something, sees *Sync methods without any warnings there and almost certainly begins with using them, because that's what he or she is used to. That's easier for a newcomer than spending a few minutes reading how he or she should actually do stuff. And don't blame the newcomer, it's the presence of *Sync methods in the documentation that suggests to him or her that it's an ok way to do things. This results in a big pile of bad code by the time the person understands that it should be rewritten. And people don't like to rewrite code for no visible reason, leaving this code to be legacy (see 3).
3. When someone writes synchronous code (see 2) it limits how he or she can use async functions without rewriting most of the logic, so he or she comes complaining that there should be a *Sync version of everything out there (as the presence of *Sync versions in the core suggests it). See "Is there a sync version encapsulating libmagic in node.js ?" (mscdex/mmmagic#32) and the motivation behind https://github.com/abbr/deasync.
4. It's completely broken either way. Even for «simple one-time scripts». See "zlib: memory leak with gunzipSync" (#1479): a person was doing something like files.forEach( …zlib.gzipSync(…) …), in what I suppose was a simple script. What could possibly go wrong? Memory usage has gone completely bad in his script. And even manual calls to gc() do not help. Testcase:
5. People go to the doc, see *Sync versions (see 2), use them, and then everyone tells them that they are using io.js wrong just based on that fact: "gripe: deprecating fs.exists/existsSync" (#1592 (comment)).
6. One can argue again that using *Sync versions of methods is ok in scripts and that that's simpler, but: *Sync-based code. Promises-based code with accurate error handling is much cleaner than *Sync-based code with accurate error handling. Using Promisify won't be needed once Promises go to the core (see "Feature Request: Every async function returns Promise", #11).
This will require all public modules that are using *Sync methods either to rewrite things using async methods or require/include something like https://github.com/abbr/deasync to be compatible.
This can be done separately for various core modules/methods. For example, zlib.*Sync are maybe the worst of them and it looks to me that they are not actively used in public modules.