
spawn with pipes #18016

Closed
elibarzilay opened this issue Jan 6, 2018 · 22 comments · Fixed by #21209
Labels
child_process Issues and PRs related to the child_process subsystem. stream Issues and PRs related to the stream subsystem.

Comments

@elibarzilay

  • Version: v9.3.0
  • Platform: mainly Linux, compared with Windows
  • Subsystem: child_process

Disclaimer: the below is based on my understanding of the docs, which
might be wrong.

I'm trying to figure out the best way to create a pipeline of two
processes, and there seem to be some subtle issues that are either bugs
or the result of under-documentation. I'll describe the three variants
that I tried below -- note that I'm trying to avoid the explicit piping
in JS that the "very elaborate" example in the docs does, since my goal
is to let the system do its thing when possible.

The main question here is whether the first code sample should work and,
if not, what the right way to do this is.

0. Setup

In all of these examples the current directory has two files: foo.js
with the code, and a bar file. This is just an artificial setup to
make sure that the resulting output works.

1. Simple code

The first thing I tried is what I thought should work fine:

"use strict";
const { spawn } = require("child_process");

let p1 = spawn("find", [".", "-type", "f"],
               {stdio: [process.stdin, "pipe", process.stderr]});
p1.on("error", e => console.log("p1 error:", e));
p1.on("exit", (code, signal) => console.log("p1 done", code, signal));
// console.log("!!!");
let p2 = spawn("grep", ["foo"],
               {stdio: [p1.stdio[1], process.stdout, process.stdout]});
p2.on("error", e => console.log("p2 error:", e));
p2.on("exit", (code, signal) => console.log("p2 done", code, signal));

On Windows (using cygwin) this works as expected:

./foo.js
p1 done 0 null
p2 done 0 null

but on Linux it sometimes works fine (as above) and sometimes it fails
inexplicably:

p1 done 0 null
p2 done 1 null

To make things weirder, uncommenting the console.log("!!!") line
makes it work fine (on both platforms). This led me to believe that
there might be some race condition somewhere.

2. Delayed start of the second process

Following the above guess, I tried to delay the second process to make
sure that the pipe gets generated in a consistent way:

"use strict";
const { spawn } = require("child_process");

let p1 = spawn("find", [".", "-type", "f"],
               {stdio: [process.stdin, "pipe", process.stderr]});
p1.on("error", e => console.log("p1 error:", e));
p1.on("exit", (code, signal) => console.log("p1 done", code, signal));

setImmediate(() => {
  let p2 = spawn("grep", ["foo"],
                 {stdio: [p1.stdio[1], process.stdout, process.stdout]});
  p2.on("error", e => console.log("p2 error:", e));
  p2.on("exit", (code, signal) => console.log("p2 done", code, signal));
});

This either fails as before (rarely), or (more commonly) fails with an error:

p1 done 0 null
internal/child_process.js:917
      throw new errors.TypeError('ERR_INVALID_OPT_VALUE', 'stdio',
      ^

TypeError [ERR_INVALID_OPT_VALUE]: The value "Socket {
  connecting: false, ... [Symbol(bytesRead)]: 15 }" is invalid for option "stdio"
    at internal/child_process.js:917:13
    at Array.reduce (<anonymous>)
    at _validateStdio (internal/child_process.js:843:17)
    at ChildProcess.spawn (internal/child_process.js:283:11)
    at exports.spawn (child_process.js:499:9)
    at Immediate.setImmediate (/home/eli/work/ms/x/foo.js:10:12)
    at runCallback (timers.js:773:18)
    at tryOnImmediate (timers.js:734:5)
    at processImmediate [as _immediateCallback] (timers.js:711:5)

This might be a different error where the stream gets into a state where
it cannot be used as an input. Note that:

  1. Using setTimeout() leads to the same behavior
  2. The failure is only on Linux -- on Windows it works fine as it did
    previously

3. Starting the processes from the end

I finally tried creating the processes going from the end. This was
close, but when the p1 process ends, it doesn't close its pre-existing
stdout, so I have to do that manually:

"use strict";
const { spawn } = require("child_process");

let p2 = spawn("grep", ["foo"],
               {stdio: ["pipe", process.stdout, process.stdout]});
p2.on("error", e => console.log("p2 error:", e));
p2.on("exit", (code, signal) => console.log("p2 done", code, signal));

let p1 = spawn("find", [".", "-type", "f"],
               {stdio: [process.stdin, p2.stdio[0], process.stderr]});
p1.on("error", e => console.log("p1 error:", e));
p1.on("exit", (code, signal) => {
  console.log("p1 done", code, signal);
  p2.stdio[0].end();
});

This seems to work fine, but I dislike closing the pipe in my own
code, since it is a step towards doing more work in node instead of
letting the OS manage the pipe in a more natural way.

@gireeshpunathil
Member

how about this:

'use strict'
const { spawn } = require('child_process')

let p1 = spawn('find', ['.', '-type', 'f'])
let p2 = spawn('grep', ['foo'], {stdio: [p1.stdout, 'pipe', process.error]})

p2.stdout.on('data', (d) => {
  console.log(d.toString())
})

p1.on('exit', (code, signal) => console.log('p1 done', code, signal))
p2.on('exit', (code, signal) => console.log('p2 done', code, signal))

@elibarzilay
Author

I'm not sure that I'm catching the difference here. The main thing that
I can see is that you're using pipe and a manual loop at the end of
the chain instead of just connecting it to process.stdout. If that's
the case, then I think that there is some problem with process.stdout.

I've tried starting from your snippet, cleaning it up a little (I don't
have process.error, so I replaced it with process.stderr). I don't get
the previous process failures now (even with my original sample, on the
same v9.3), but every once in a while it drops one line from the
expected output. This bogus dropping does not happen when I add the
manual printout that you use, which is why I suspect that there's some
problem with process.stdout. Maybe something with the process output
happens in a way that is not safe with the REPL.

Here's the current version I tried -- added a sed to ensure that the
printouts are all going through the chain and not getting printed
earlier somehow:

"use strict";
const { spawn } = require("child_process");

let p1 = spawn("find", [".", "-type", "f"], {stdio: [process.stdin, "pipe", process.stderr]});
p1.on("error", e => console.log("p1 error:", e));
p1.on("exit", (code, signal) => console.log("p1 done", code, signal));

let p2 = spawn("grep", ["foo"], {stdio: [p1.stdout, "pipe", process.stderr]});
p2.on("error", e => console.log("p2 error:", e));
p2.on("exit", (code, signal) => console.log("p2 done", code, signal));

let p3 = spawn("sed", ["-e","s/^/-/"], {stdio: [p2.stdout, process.stdout, process.stderr]});
p3.on("error", e => console.log("p3 error:", e));
p3.on("exit", (code, signal) => console.log("p3 done", code, signal));

// p3.stdout.on("data", d => console.log(d.toString()));

This skips a line as I described -- and if I change the last output to
"pipe" and un-comment the manual printout, it indeed works
consistently.

@gireeshpunathil
Member

Can you share a failing scenario with me? I am not able to observe the dropped-stdout scenario.

@elibarzilay
Author

Sorry, I tried recreating it with a toy directory but it doesn't show the problem there. Above, I'm running it in a pretty big directory, which is where I see the problem.

@elibarzilay
Author

I tried it again on my home directory, and the discrepancies are very frequent, to the point that two consecutive runs rarely produce the same output. I also tried it without the direct process.stdout (with the manual output printing thing) and that fails to produce consistent results too.

I'm guessing that you should be able to see this if running in a big enough directory.

@gireeshpunathil
Member

yes, I tried quite a big one but could not see anything wrong. But as you state that you are seeing it consistently, let me try again. A couple of sanity checks:

  • hope you are not redirecting output to a file -- in which case process.stdout changes its identity.

  • for that matter, you are also not counting output lines through | wc -l etc.?

  • hope you are taking into account the re-ordering of file names that the find command performs based on its internal logic.

@elibarzilay
Author

  1. I am redirecting to a file, but like I said, the differences happen
    either way. (I also saw these differences without a redirection.)

  2. Using wc would be a valid way to test it, since lines get omitted.
    But I'm mainly comparing the files to verify that there are actual
    differences.

  3. And yes, I do account for possible differences in ordering; it's not
    the problem.

@gireeshpunathil
Member

thanks for the clarification. Now the only differences would be:

  • in terms of platform. Can you please confirm your complete OS version?
  • the number of files: roughly how many files does the find command bring forth?

@gireeshpunathil
Member

@elibarzilay - is this still outstanding?

@elibarzilay
Author

@gireeshpunathil Yes. In fact I just tested it again recently, and it is still very broken. To see it without scanning a directory tree with find, do this:

Save the following in b and chmod +x it:

#!/usr/bin/env bash
for (( x=0; x < $1; x++ )); do
  echo xxx $x xxx
done

And put this in x.js:

"use strict";
const { spawn } = require("child_process");

let p1 = spawn("./b", ["10000"], {stdio: [process.stdin, "pipe", process.stderr]});
p1.on("error", e => console.log("p1 error:", e));
let p2 = spawn("grep", ["7"], {stdio: [p1.stdio[1], "pipe", process.stdout]});
p2.on("error", e => console.log("p2 error:", e));
let p3 = spawn("wc", ["-l"], {stdio: [p2.stdio[1], process.stdout, process.stdout]});
p3.on("error", e => console.log("p3 error:", e));

With this I'm getting:

$ ./b 10000 | grep 7 | wc -l               # shows the expected result
3439
$ node x.js; node x.js; node x.js; node x.js
2140
2146
2796
2143

@gireeshpunathil
Member

thanks, I understand what you are saying. And I am able to reproduce it too:

Can you try this?

$ cat 18016.js

"use strict";
const { spawn } = require("child_process");
let p1 = spawn("./b", ["10000"])
let p2 = spawn("grep", ["7"])
p1.stdout.pipe(p2.stdin);
let p3 = spawn("wc", ["-l"], {stdio: ['pipe', process.stdout, 'pipe']})
p2.stdout.pipe(p3.stdin);

@gireeshpunathil
Member

Explanation of the behavior we see:

  1. When 'b' is spawned, a pipe is created for child's stdout.
  2. As pipes would have it, it has 2 ends, p1 and p2. p1 is with the parent (x.js) and p2 with the child, 'b'.
  3. When 'grep' is spawned, p1 is passed as stdin for it.
  4. The expectation is that all data coming to p1 be routed to grep.
  5. In reality, another pipe is created, say with ends q1 and q2. q1 is controlled by the parent, x.js.
  6. The forked child (grep) uses a copy of p1, say q2.
  7. The parent x.js also still holds the original copy of p1.
  8. All p1, p2, q1, q2 are sockets.

With this setup, when find starts writing data, there are two potential consumers for it -- q2 (p1.stdio[1] duplicated in grep) and the original p1 that is still with the parent. Depending on which one attempts a read when data is available, the data is bifurcated, and grep gets only a handful of it!

When you do p1.stdout.pipe(p2.stdin) the missing pieces are returned to grep.

I am too puzzled to recommend anything. :)

/cc @nodejs/child_process @nodejs/streams

@gireeshpunathil gireeshpunathil added child_process Issues and PRs related to the child_process subsystem. stream Issues and PRs related to the stream subsystem. labels May 20, 2018
@elibarzilay
Author

@gireeshpunathil Sounds like a serious problem then -- why are q1/q2
created? Can I access them somehow to connect the two processes?

And most importantly, if they are created, then what's the point of
having the pipe option to begin with? Not only does it sound
questionable, it seems that if a pipe specification comes in addition
to the internal creation of pipes, then it's pretty much guaranteed to
get race problems like the one I have?

BTW, I didn't get to try your suggested solution, but assuming it works:
it's wasting some resources in having the communication go through node,
right? My understanding of pipe is that it would allow running two
processes and having the OS shovel the data between them to consume
fewer resources. I'm coming from a background in a language
implementation (Racket) where the implementation of subprocesses does
exactly that: allow starting two processes with the created output of
one being used as the input of another, and establishing this kind of
cheap OS communication between them.

@gireeshpunathil
Member

@elibarzilay - sorry, I missed reading your comment in time.

why are q1/q2 created? Can I access them somehow to connect the two processes?

Because each child is created by Node independently, a new pipe is opened for each one. Closing one end may have side effects, so piping them as suggested would be a good workaround.

#9413 shares the same problem as this one, and #9413 (comment) has a summary of the root cause, design constraints and a potential workaround:

  • if you want Node to control both (or many) child processes directly, the pipes are created between Node and the child independent of other relationships. So we have to manually pipe extra ends to the desired destinations, as suggested in spawn with pipes #18016 (comment) [ this can be argued at the design level though ]

  • if you want to spawn a number of child processes with direct data-flow dependencies between them, you could issue a single child-process command that combines all of them together (e.g. exec('find . | grep')); that has a single pipe from Node, and the internal piping is managed by the shell that spawned the child processes.

@elibarzilay
Author

@gireeshpunathil I still think that the problem I described stands: if "pipe" creates a second pipe with a race condition then it looks like a less-than-useless feature (less because it can lead to bugs).

My question was basically: if a pipe is created unconditionally, then is there a way for user code to get that pipe to be used with a different process (assuming that a different process can be started with a given pipe as its input, rather than having the same race condition on that side too).

@gireeshpunathil
Member

@elibarzilay - I have put in a PR to address this issue; if you have a system to build node from source, please try the patch and see how it goes.

@elibarzilay
Author

@gireeshpunathil I finally got to try it now, and indeed it looks like it's fixing the problem.

gireeshpunathil added a commit to gireeshpunathil/node that referenced this issue Feb 6, 2019
When t0 and t1 are spawned and t0's output stream [1, 2] is piped into
t1's input, a new pipe is created which uses a copy of t0's fd.
This leaves the original copy in the Node parent, unattended. The net
result is that when t0 produces data, it gets bifurcated into both
copies.

Detect the passed handle to be of 'wrap' type and close it after the
native spawn invocation, by which time the piping would have been over.

Fixes: nodejs#9413
Fixes: nodejs#18016

PR-URL: nodejs#21209
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
addaleax pushed a commit that referenced this issue Feb 6, 2019
@elibarzilay
Author

@gireeshpunathil -- thanks!

Quick question: when, or what version will this be available in?

@gireeshpunathil
Member

@elibarzilay: Some gestation period in master, and then v11.10.0 if that is cut, or else v12.0.0. Do you have any specific requirement?

@elibarzilay
Author

No specific need, I just have some personal mini-project that got stuck on that... Thanks again!

@arcanis
Contributor

arcanis commented Feb 27, 2019

Something bothers me in this issue - from what I recall from my early unix days, processes should typically be spawned right to left, not left to right (otherwise there's a chance of a race when the left process outgrows the pipe buffer before the right process has had a chance to start consuming it). Does this bug only cause issues with the left-to-right pattern?

@elibarzilay
Copy link
Author

TBH, I don't even remember what the original order was when I ran into this problem, but I played with many variations, including changing the order. Don't take my code as something that should or should not be done. If you read through the discussion, you'll see that the problem was an independent race condition where output would only sometimes reach the pipe on the JS side.

(As a sidenote, I don't know how things were in the very early days, but the order should usually not matter, because when the output buffer is full, the kernel suspends the process until there is more space. Otherwise, you'd have issues with any fast-process | slow-process.)
