Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MakeCallback is dangerous, not only or best way of calling callback; provide direct method #284

Closed
metamatt opened this issue Feb 18, 2015 · 57 comments
Labels

Comments

@metamatt
Copy link

NodeJS lets you invoke a "callback" (really, invoke JS code from C++ code) 2 ways:

  • calling the callback directly; if you have a Handle<Function> foo you can just foo->Call()
  • calling the callback via MakeCallback (in src/node.cc)

The former ("direct") case just calls from C++ into JS synchronously, and returns back into C++ when the JS code returns.

The latter (MakeCallback) case calls the JS callback function, but then also ends the current tick, calling process._tickCallback() and any other callbacks registered via nextTick(). This form is intended only for use by C++ code that was dispatched from the libuv event loop, and not for C++ code that was invoked from JS code. Because in that case (JS code "A" running, calls into C++ code in a node addon, the addon's C++ code calls MakeCallback, MakeCallback dispatches the nextTick callbacks) the JS code marked "A" has been preempted, and state changed by the nextTick callbacks can be corrupted. See TooTallNate/node-weak#35 for an example of how dangerous this is.

Use cases for node addons (that would want to use nan) wanting to invoke JS code from C++ code that was invoked directly from JS (not from a libuv event callback) include

For these cases, use of MakeCallback is not only incorrect but extremely dangerous. The right answer is to call the callback directly. But nan doesn't expose a way of doing this; NanCallback::Call is actually wrapping MakeCallback. So is the right answer to avoid NanCallback altogether? I don't think so; it serves a very real need to deal more easily with V8's memory management; using Persistent<Function> got harder in newer V8 (node 0.11/0.12) and so it's much nicer to use NanCallback instead of raw Handle<Function>. So I think NanCallback should provide separate methods for calling a callback directly and calling a callback via MakeCallback, and make the difference really obvious so people don't accidentally do the latter. (Note in my patch for the node-weak instance of this, I do use NanCallback for memory management, but then have to go to some lengths to extract the real Handle<Function> to call directly.)

I think the problem here is exacerbated by (a) nebulous terminology like "callback" (is any function that gets called a "callback"? Not all callbacks are the same) and (b) the difficulty of using Persistent<Function> without NanCallback. But when you do use NanCallback and because of (a), it's just too tempting to accidentally call NanCallback::Call and thus MakeCallback and then preempt your JS code and then suffer all manner of heisenbugs.

So my suggestion and request is:

  • add a new NanCallback::CallDirect that wraps Handle<Function>::Call
  • rename NanCallback::Call to NanCallback::CallAndEndTick (or something similarly explicit that's difficult to confuse)
  • in a perfect world, remove NanCallback::Call. That will probably be too annoying to addon authors already using it correctly. Then, at least deprecate it (have it generate a compile-time warning, and then act as CallAndEndTick?)
@kkoopa
Copy link
Collaborator

kkoopa commented Feb 18, 2015

Hmm, this is actually the first time I've heard of this problem. My experience is completely opposite in that not using node::MakeCallback as of 0.11 would lead to a whole bunch of strange errors exactly because process._tickCallback() was not called and the same goes for domain support. Normally, every call would want to be done through node::MakeCallback for this reason.

Might introduce NanCallback:: CallDirect for convenience, but renaming or deprecating NanCallback::Call seems less useful. The documentation could however be improved.

@metamatt
Copy link
Author

See also nodejs/node-v0.x-archive#9245.

@metamatt
Copy link
Author

@kkoopa, can you elaborate on the "whole bunch of strange errors"?

Barring evidence to the contrary, I'm pretty adamant that inviting people to use MakeCallback over-eagerly is a really bad idea. See https://gist.github.com/metamatt/4ce96adbf14de9a97d41, https://gist.github.com/metamatt/99ef2a5e12de2acd5e39. I don't know anything about (haven't tried to use) domain support, so maybe I'm missing something there. But if you call MakeCallback in the (I would assume fairly standard/common) case where your addon C++ code was invoked from JS in the same domain, you can preempt the JS code that invoked you and break NodeJS's entire threadless/consistency/concurrency model.

I don't mean to sound too harsh here. But this is not a theoretical concern; this was what actually happened with TooTallNate/node-weak#35 and it caused all manner of heisenbugs in every part of our system that were extremely difficult to track down, and were all caused by this one thing (and all went away when I patched node-weak to not do that).

@kkoopa
Copy link
Collaborator

kkoopa commented Feb 18, 2015

In some situations, not using MakeCallback would lead to strange errors such as the next tick never happening, causing all manners of normal code to fail. Now I don't know if this only applied to code invoked from libuv. I do believe this is a genuine concern, but want the common case adjusted for what is normal. It's not exactly nice to have to know, and keep track of, using one way of calling in certain places and other ways in other sections. What I would like is one single way of calling a JS function from C++ that always works, no matter what.

@metamatt
Copy link
Author

Let's see if we get any traction on nodejs/node-v0.x-archive#9245 then... in TooTallNate/node-weak#36 (comment) @trevnorris said that MakeCallback should be safe for use (modulo a small performance hit), or at least that's the intent. I observe otherwise today, but if that becomes true, then we should have your "one single way that always works".

@kkoopa
Copy link
Collaborator

kkoopa commented Feb 18, 2015

Yes, let's indeed see what happens with it. I had also been under the impression that MakeCallback should always be safe, which is why it is used everywhere. Having to correctly choose between two different ways of calling is not really acceptable, it will lead to so many hard-to-trace runtime bugs either way one (wrongly) chooses to make the call.

@saper
Copy link
Contributor

saper commented Apr 25, 2015

I actually ran into the problem (sass/node-sass#857) where my application uses uv threadpool to launch worker threads against a C/C++ library. The C++ library invokes certain JS subroutines. In case you are using asynchronous filesystem I/O in those you can end up in a situation where all threadpool threads are blocked (because fs I/O also uses uv threadpool worker threads at the point where non are available/runnable).

So actually I need to have next ticks invoked since otherwise my application might get stuck forever.
Yes, we are using NanCallback but that does not seem to help.

@kkoopa
Copy link
Collaborator

kkoopa commented Apr 25, 2015

Nothing can be done for that here as far as I know. NanCallback should be the correct choice, as it invokes node::MakeCallback under the hood, which handles the tickCallback. Perhaps your problem is due to something else? How does it behave under Node 0.10? It might be a bug in libuv or Node for all I know, but I'm not qualified to determine exactly what the reason is.

@saper
Copy link
Contributor

saper commented Apr 25, 2015

Yes, there is nothing nan can do better. We block threads in the uv threadpool; I really wish I could have coroutines there and bility to suspend worker's execution. My solution will probably a separate worker thread communicating with the library. By the way I have an idea for a better fix to TooTallNate/node-weak#35, using uv_async_t.

@kkoopa kkoopa added the bug label May 13, 2015
wbyoung added a commit to wbyoung/node-sqlite3 that referenced this issue Jun 30, 2015
I believe this is the most appropriate way to handle exceptions. Using
v8::Function::Call rather than node::MakeCallback (via NanMakeCallback)
means that these functions will not work with domains, but that seems
appropriate.

I don't know if nodejs/node-v0.x-archive#9245 or nodejs/nan#284 should be a concern
here or not.
@kkoopa kkoopa added this to the 1.9 milestone Jul 19, 2015
@hankbao
Copy link

hankbao commented Jul 27, 2015

I've just encountered a related issue. We're currently using electron to pack our web app. There's a native node addon providing some functionality for the electron app. Our web app will call the addon and pass a js callback to it. The native addon store the callback in a NanCallback. It setups a TryCatch before calling the callback. However the TryCatch always fails to catch the exception thrown by the js callback. It turns out that if I call the callback with Function::Call instead of with NanCallback::Call(), exceptions can be caught.

I guess my problem may related to this issue.

@kkoopa
Copy link
Collaborator

kkoopa commented Jul 27, 2015

MakeCallback does its own error handling.

On July 27, 2015 1:07:34 PM CEST, Hank BAO [email protected] wrote:

I've just encountered a related issue. We're currently using electron
to pack our web app. There's a native node addon providing some
functionality. Our web app will call the addon and pass a js callback
to it. The native addon store the callback in a NanCallback. It setups
a TryCatch before calling the callback. However the TryCatch always
fails to catch the exception thrown by the js callback. It turns out
that if I call the callback with Function::Call instead of with
NanCallback::Call(), the exception can be caught.

I guess my problem may related to this issue.


Reply to this email directly or view it on GitHub:
#284 (comment)

@hankbao
Copy link

hankbao commented Jul 27, 2015

Unfortunately, when an exception was thrown in the js callback and caught by MakeCallback in my addon, codes doing cleanup stuffs after calling the callback won't be executed.

@joshperry
Copy link

This issue and nodejs/node-v0.x-archive#9245 seem to contradict eachother about whether this is an actual bug...

I need to make some synchronous callbacks. If this is an issue, that's fine, however, there are two different ways to call a Function depending on the node version. The only place I can find in Nan that glosses over these differences is in Nan::Callback.

However Nan::Callback calls node::MakeCallback, possibly invoking this bug, but also dispatching the call until after the next tick (not to mention the overhead of a persistent handle that I don't need).

Does Nan have a primitive for making synchronous function calls from C++ to JS that takes the new and old methods into account?

@saper
Copy link
Contributor

saper commented Oct 15, 2015

@joshperry how does your C++ code get started? (I mean the code that you'd like to do a sync jump in to JavaScript from) ? Are you running pure C++ in the v8 event loop? Or is it some code invoked via libuv that runs outside of the v8 event loop (I/O callbacks etc.)?

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 15, 2015

Yes, you can always call the v8::Function without using MakeCallback.

@joshperry
Copy link

I'm working on a usb hardware module that interfaces with libusb. When a js user wants to get a list of devices, I ask libusb for that list, reify it into js objects, then (I'd like to) call the provided callback synchronously. I need it to be synchronous because, after the callback, I need to free the device list.

@joshperry
Copy link

@kkoopa That's fine, I was just wondering if I needed my own ifdefs to take into account the difference in node <0.12 for the new isolate and context arguments or if Nan had something for me there.

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 15, 2015

No isolates are used for calling functions.

Nan::Callback cb;
*cb->Call();

alternatively

v8::Local<v8::Function> cb;
cb->Call(); 

@joshperry
Copy link

Should I not worry about the deprecation of that Call overload? The non-deprecated Call requires a context (gotten from the current isolate usually I think) and returns a Maybe. Perhaps this is something that Nan will address in the future?

https://v8docs.nodesource.com/io.js-3.0/d4/da0/v8_8h_source.html#l03063

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 26, 2015

Aww, patching V8 is not really possible, save for internal leak testing. Maybe it is possible to hack something up by wrapping a v8::HandleScope in a Nan::SealHandleScope and doing the checking by hand. I'll take a look, but don't feel optimistic. I really wish C++ had a proper way of overriding access modifiers.

@trevnorris
Copy link
Collaborator

@kkoopa sorry, the patch is actually only for v0.12. don't have a working one for v0.10. I can submit it to land, but w/o v0.10 and v0.8 support I don't think it'll be worth it.

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 26, 2015

Nah, not here and now, but do submit it for 0.12 anyway, hopefully we will get to drop 0.8 and 0.10 in 12-18 months (and start forcing Isolate passing everywhere and other good stuff).

On October 26, 2015 10:29:11 PM GMT+02:00, Trevor Norris [email protected] wrote:

@kkoopa sorry, the patch is actually only for v0.12. don't have a
working one for v0.10. I can submit it to land, but w/o v0.10 and v0.8
support I don't think it'll be worth it.


Reply to this email directly or view it on GitHub:
#284 (comment)

@bnoordhuis
Copy link
Member

Apropos v0.8, what's the reason nan still supports it? Upstream no longer does. I'd just sunset the support and say "nan version $x is the last one to support v0.8."

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 27, 2015

It's so similar to 0.10 that it is trivial to support. Npm also support 0.8 for a while more. My plan was to drop both 0.8 and 0.10 at the same time, since nothing is gained by dropping only 0.8.

On October 27, 2015 1:47:21 PM GMT+02:00, Ben Noordhuis [email protected] wrote:

Apropos v0.8, what's the reason nan still supports it? Upstream no
longer does. I'd just sunset the support and say "nan version $x is
the last one to support v0.8."


Reply to this email directly or view it on GitHub:
#284 (comment)

@bnoordhuis
Copy link
Member

It would give people an incentive to upgrade to a version that's still supported by upstream. Although if you're still stuck on v0.8 at this point, you're probably not a very upgrade-y person anyway.

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 27, 2015

Yeah, but it is not so much for end-users as for developers. Their modules can run on 0.8, even if the majority of their users couldn't care less. That makes upgrading their NAN version a non-issue, giving support for new versions at no expense.

From NAN's perspective there are three versions at the moment: 0.8 to 0.10, 0.12 to io.js 2, and io.js 3 to Node 5+. These correspond to: ancient, isolates, maybes.

On October 27, 2015 1:55:03 PM GMT+02:00, Ben Noordhuis [email protected] wrote:

It would give people an incentive to upgrade to a version that's still
supported by upstream. Although if you're still stuck on v0.8 at this
point, you're probably not a very upgrade-y person anyway.


Reply to this email directly or view it on GitHub:
#284 (comment)

bcrusu added a commit to bcrusu/mesos-node that referenced this issue Nov 6, 2015
@LinusU
Copy link
Contributor

LinusU commented Nov 16, 2015

I got bit by this today as well, I would love to see NanCallback::CallDirect implemented. Would a pull request be accepted?

I'm also in favor for renaming Call to something more explicit, but that's step 2. I think we should add CallDirect either way...

@kkoopa
Copy link
Collaborator

kkoopa commented Nov 16, 2015

I don't think adding that is the right approach. Just get the underlying function and use Nan::Call on it once that has been added from another PR.

Nan::Call(*callback, Nan::GetCurrentContext()->Global(), 0, 0);

On November 16, 2015 6:59:00 PM GMT+02:00, "Linus Unnebäck" [email protected] wrote:

I got bit by this today as well, I would love to see
NanCallback::CallDirect implemented. Would a pull request be
accepted?

I'm also in favor for renaming Call to something more explicit, but
that's step 2. I think we should add CallDirect either way...


Reply to this email directly or view it on GitHub:
#284 (comment)

@LinusU
Copy link
Contributor

LinusU commented Nov 17, 2015

@kkoopa Perfect, that's exactly what I need!

@kkoopa
Copy link
Collaborator

kkoopa commented Nov 17, 2015

For reference, Nan:Call is #496, which is merely a wrapper for v8::Function::Call. Nan::Callback already supports the * operator as a shorthand for Nan::Callback::GetFunction.

@kkoopa
Copy link
Collaborator

kkoopa commented Feb 21, 2016

I'll close this, as it should no longer be an issue. Nan::Call is included in 2.2 and node::MakeCallback is now reentrant.

@kkoopa kkoopa closed this as completed Feb 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

9 participants