spawn with pipes #18016
Comments
How about this:
'use strict'
const { spawn } = require('child_process')
let p1 = spawn('find', ['.', '-type', 'f'])
// p2 reads directly from p1's stdout; its stderr goes to the parent's stderr
let p2 = spawn('grep', ['foo'], {stdio: [p1.stdout, 'pipe', process.stderr]})
p2.stdout.on('data', (d) => {
  console.log(d.toString())
})
p1.on('exit', (code, signal) => console.log('p1 done', code, signal))
p2.on('exit', (code, signal) => console.log('p2 done', code, signal))
I'm not sure that I'm catching the difference here. The main thing that I've tried to start with your bit, cleaning it up a little (I don't have Here's the current version I tried -- added a
"use strict";
const { spawn } = require("child_process");
let p1 = spawn("find", [".", "-type", "f"], {stdio: [process.stdin, "pipe", process.stderr]});
p1.on("error", e => console.log("p1 error:", e));
p1.on("exit", (code, signal) => console.log("p1 done", code, signal));
let p2 = spawn("grep", ["foo"], {stdio: [p1.stdout, "pipe", process.stderr]});
p2.on("error", e => console.log("p2 error:", e));
p2.on("exit", (code, signal) => console.log("p2 done", code, signal));
let p3 = spawn("sed", ["-e","s/^/-/"], {stdio: [p2.stdout, process.stdout, process.stderr]});
p3.on("error", e => console.log("p3 error:", e));
p3.on("exit", (code, signal) => console.log("p3 done", code, signal));
// p3.stdout.on("data", d => console.log(d.toString()));
This skips a line as I described -- and if I change the last output to
Can you share a failing scenario with me? I am not able to observe the
Sorry, I tried recreating it with a toy directory but it doesn't show the problem there. Above, I'm running it in a pretty big directory, which is where I see the problem.
I tried it again on my home directory, and the discrepancies are very frequent, to the point that two consecutive runs rarely produce the same output. I also tried it without the direct I'm guessing that you should be able to see this if running in a big enough directory.
yes, I tried quite a big one but could not see anything wrong. But as you state you are seeing it consistently, let me try again. Couple of sanity checks:
thanks for the clarification. Now the only differences would be:
@elibarzilay - is this still outstanding?
@gireeshpunathil Yes. In fact I just tested it again recently, and it is still very broken. To see it without scanning a directory tree with Save the following in
And put this in
With this I'm getting:
Thanks, I understand what you are saying, and I am able to reproduce it too. Can you try this?
$ cat 18016.js
"use strict";
const { spawn } = require("child_process");
// With the default stdio, every child gets its own pipes to the parent
let p1 = spawn("./b", ["10000"])
let p2 = spawn("grep", ["7"])
// Connect the stages explicitly in JS instead of passing p1.stdout as stdio
p1.stdout.pipe(p2.stdin);
let p3 = spawn("wc", ["-l"], {stdio: ['pipe', process.stdout, 'pipe']})
p2.stdout.pipe(p3.stdin);
Explanation to the behavior we see:
With this setup when When you do I am too puzzled to recommend anything. :) /cc @nodejs/child_process @nodejs/streams
@gireeshpunathil Sounds like a serious problem then -- why are q1/q2 And most importantly, if they are created, then what's the point of BTW, I didn't get to try your suggested solution, but assuming it works:
@elibarzilay - sorry, I missed your comment at the time.
Because each child is created by Node independently, each spawn opens up new pipes. Closing one end may have side effects, so piping them as suggested would be a good workaround. #9413 shares the same problem as this one, and #9413 (comment) has a summary of the root cause, design constraints and a potential workaround:
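As an illustration only (this is not the summary from #9413), the explicit-piping workaround suggested earlier in this thread could be generalized roughly as below. `runPipeline` is a hypothetical helper name, not a Node API; the sketch assumes every stage is spawned with the default `'pipe'` stdio and that the parent does the copying between stages with `stream.pipe()`.

```javascript
"use strict";
const { spawn } = require("child_process");

// Hypothetical helper (not part of Node): run [cmd, args] stages as a
// pipeline, copying data between stages in the parent via stream.pipe().
function runPipeline(stages) {
  return new Promise((resolve, reject) => {
    // Default stdio: each child gets its own stdin/stdout/stderr pipes.
    const procs = stages.map(([cmd, args]) => spawn(cmd, args));
    for (const p of procs) {
      p.on("error", reject);         // e.g. command not found
      p.stdin.on("error", () => {}); // ignore EPIPE if a stage exits early
      p.stderr.pipe(process.stderr, { end: false });
    }
    procs[0].stdin.end(); // the first stage gets no input
    for (let i = 0; i + 1 < procs.length; i++) {
      procs[i].stdout.pipe(procs[i + 1].stdin);
    }
    const last = procs[procs.length - 1];
    const chunks = [];
    last.stdout.on("data", (d) => chunks.push(d));
    // 'close' fires after the process has ended and its stdio is closed.
    last.on("close", (code) => {
      if (code === 0) resolve(Buffer.concat(chunks).toString());
      else reject(new Error(`last stage exited with code ${code}`));
    });
  });
}

// Usage: same shape as the ./b | grep 7 | wc -l example, with stand-in input.
runPipeline([["printf", ["a7\nb\nc7\n"]], ["grep", ["7"]], ["wc", ["-l"]]])
  .then((out) => console.log(out.trim()));
```

The trade-off is the one raised in the original report: the data passes through the Node process instead of going straight from one child to the next.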
@gireeshpunathil I still think that the problem I described stands: if My question was basically: if a pipe is created unconditionally, then is there a way for user code to get that pipe to be used with a different process (assuming that a different process can be started with a given pipe as its input, rather than having the same race condition on that side too).
@elibarzilay - I have put in a PR to address this issue. If you have a system to build node from source, please try the patch and see how it goes.
@gireeshpunathil I finally got to try it now, and indeed it looks like it's fixing the problem.
when t0 and t1 are spawned with t0's outputstream [1, 2] is piped into t1's input, a new pipe is created which uses a copy of the t0's fd. This leaves the original copy in Node parent, unattended. Net result is that when t0 produces data, it gets bifurcated into both the copies

Detect the passed handle to be of 'wrap' type and close after the native spawn invocation by which time piping would have been over.

Fixes: nodejs#9413
Fixes: nodejs#18016
PR-URL: nodejs#21209
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
@gireeshpunathil -- thanks! Quick question: when, or what version will this be available in?
@elibarzilay: Some gestation period in the master and then
No specific need, I just have some personal mini-project that got stuck on that... Thanks again!
Something bothers me in this issue - from what I recall from my early unix days processes should typically be spawned right to left, not left to right (otherwise there's a race condition chance when the left process outgrows the pipe buffer before the right process had the chance to start consuming it). Does this bug only cause issues with the left-to-right pattern?
TBH, I don't even remember what was the original order that made me run into this problem, but I played with many variations, including changing the order. Don't take my code as something that should or should not be done. If you read through the discussion, you'll see that the problem was an independent race condition where output would only sometimes reach the pipe on the JS side. (As a sidenote, I don't know how things were in the very early days, but the order should usually not matter, because when the output buffer is full, the kernel suspends the process until there is more space. Otherwise, you'd have issues with any
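To make the sidenote about the kernel suspending a writer concrete, here is a small sketch, assuming a POSIX `sh` with `seq`, `sleep` and `wc` on the PATH. The function name is made up for illustration. The writer produces far more than a typical pipe buffer holds while the reader sleeps for a second before consuming anything, and the line count should still come out complete, because the writer just blocks until there is room.

```javascript
"use strict";
const { spawnSync } = require("child_process");

// Hypothetical demo helper. Assumes POSIX sh, seq, sleep and wc exist.
// `seq` writes n lines into a pipe whose reader waits 1s before reading.
function countLinesWithSlowReader(n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new TypeError("n must be a positive integer");
  }
  const script = `seq 1 ${n} | (sleep 1; wc -l)`;
  const res = spawnSync("sh", ["-c", script], { encoding: "utf8" });
  if (res.error) throw res.error;
  return Number(res.stdout.trim());
}

// 100000 lines is several hundred KB, more than a typical pipe buffer.
console.log(countLinesWithSlowReader(100000));
```

If the kernel dropped data when the buffer filled up, the count would come out short. This only demonstrates the OS-level pipe behavior, it does not exercise the Node race discussed in this issue.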
Disclaimer: the below is using my understanding of the docs which might
be wrong.
I'm trying to figure out the best way to create a pipeline of two
processes, and there seem to be some subtle issues that are either bugs
or maybe a result of under-documentation... I'll describe the three
variants that I tried below -- note that I'm trying to avoid the
explicit piping in JS that the "very elaborate" example in the docs is
doing, since my goal is to let the system do its thing when possible.
The main question here is whether the first code sample should work, and
if not, then what is the right way to do this.
0. Setup
In all of these examples the current directory has two files: foo.js with the code, and a bar file. This is just an artificial setup to make sure that the resulting output works.
1. Simple code
The first thing I tried is what I thought should work fine:
On Windows (using cygwin) this works as expected:
but on Linux it sometimes works fine (as above) and sometimes it fails
inexplicably:
To make things weirder, uncommenting the console.log("!!!") line
makes it work fine (on both platforms). This led me to believe that
there might be some race condition somewhere.
2. Delayed start of the second process
Following the above guess, I tried to delay the second process to make
sure that the pipe gets generated in a consistent way:
This either fails as before (rarely), or (more commonly) fails with an error:
This might be a different error where the stream gets into a state where
it cannot be used as an input. Note that:
setTimeout()
leads to the same behavior previously
3. Starting the processes from the end
I finally tried creating the processes going from the end. This was
close, but when the p1 process ends, it doesn't close its pre-existing
stdout, so I have to do that manually:
This seems like it works fine, but I dislike closing the pipe in my own
code, since it is a step towards doing more work in node, instead of
letting the OS manage the pipe in a more natural way.
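For comparison, one way to let the OS own the pipes entirely is to hand the whole pipeline to a shell, so that Node only sees the final output. This is an alternative technique, not one of the three variants above, and it assumes a POSIX `sh` is available. `shellPipeline` is a hypothetical helper name.

```javascript
"use strict";
const { spawnSync } = require("child_process");

// Hypothetical helper: pass a whole pipeline to `sh`, so the shell creates
// the pipes between stages and Node only captures the last stage's stdout.
function shellPipeline(script) {
  const res = spawnSync("sh", ["-c", script], {
    stdio: ["ignore", "pipe", "inherit"], // no stdin, capture stdout, share stderr
    encoding: "utf8",
  });
  if (res.error) throw res.error;
  return res.stdout;
}

// In the setup above (a directory holding foo.js and bar), this lists bar.
process.stdout.write(shellPipeline("ls | grep bar"));
```

The script string is interpreted by the shell, so this is only reasonable when the command line does not include untrusted input.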