Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

makeDuplexPair should be the default stream paradigm #29986

Closed
awwright opened this issue Oct 15, 2019 · 11 comments
Closed

makeDuplexPair should be the default stream paradigm #29986

awwright opened this issue Oct 15, 2019 · 11 comments
Labels
feature request Issues that request new features to be added to Node.js. stale stream Issues and PRs related to the stream subsystem.

Comments

@awwright
Copy link
Contributor

Is your feature request related to a problem? Please describe.

The Readable and Writable objects are not well encapsulated: I can inject data into a Readable stream by calling Readable#push, and I can read data from a Writable stream by monkey-patching Writable#_read.

Further, these interfaces are not symmetrical. If I want to make data available on a readable side, why not use the write API? This is how it works on every other application: If a client creates a TCP stream, there's one "writable" side, and one "readable" side; yet if I want to make both sides in the same process, I have to use a different API, for some reason.

Describe the solution you'd like

the Node.js stream library should include DuplexPair and SimplexPair objects, or equivalent factory functions.

interface SimplexPair {
   Readable readable;
   Writable writable;
}
interface DuplexPair {
   Duplex client;
   Duplex server;
}

SimplexPair

SimplexPair creates two different but related objects, one Readable and one Writable; anything written to the Writable side is made available on the Readable side. For encapsulation, the objects would not have public references to each other, and there would be no way to make data available on the readable side without access to the writable side.

DuplexPair

DuplexPair is the same, but both sides are writable and readable.

PassThrough streams

This paradigm should not be new to Node.js developers; a PassThrough stream is just a special case of SimplexPair where both sides are exposed on a single Duplex object.

Transform streams

Transform streams generate two pairs, and returns one from each:

function ROT13Pair(){
  const input = new SimplexPair;
  const output = new SimplexPair;
  input.readable.on('data', function(buf){
    output.writable.write(buf.toString().replace(/[a-zA-Z]/g, function(c){
      const d = c.charCodeAt(0) + 13;
      return String.fromCharCode( ((c<="Z")?90:122)>=d ? d : d-26 );
    }));
  });
  return {
    writable: input.writable,
    readable: output.readable,
  };
}
const { writable, readable } = new ROT13Pair;
process.stdin.pipe(writable);
readable.pipe(process.stdout);

This pattern is repeatable to any level:

function ROT26Pair(){
  const input = new ROT13Pair;
  const output = new ROT13Pair;
  input.readable.pipe(output.writable);
  return {
    writable: input.writable,
    readable: output.readable,
  };
}
const { writable, readable } = new ROT26Pair;
process.stdin.pipe(writable);
readable.pipe(process.stdout);

... although I'm not sure how practical this particular example would be in production.

I would bet this style could also result in a modest performance improvement, since much of the logic around buffering and flow control could be re-implemented.

@addaleax addaleax added feature request Issues that request new features to be added to Node.js. stream Issues and PRs related to the stream subsystem. labels Oct 15, 2019
@addaleax
Copy link
Member

/cc @nodejs/streams

Without voicing a strong preference about whether this should be the “default” stream paradigm, I’d be okay with exposing the DuplexPair code that we already ship as an internal utility anyway.

@awwright As far as I am concerned, feel free to open a PR if you like – the current implementation of DuplexPair is in lib/internal/streams/duplexpair.js (and there is one in test/common/duplexpair.js that could be removed if we make the functionality a public API). Aside maybe from documentation, the PR wouldn’t have to be complex, and it’s usually easier to have a debate over a concrete suggestion.

@awwright
Copy link
Contributor Author

awwright commented Oct 15, 2019

@addaleax Sure, I'll be doing that in the near future (exposing the current DuplexPair implementation, and adding a SimplexPair). Feedback on naming and such is welcome first.

Uh, the transform examples aren't that great (they don't really offer anything beyond a Transform stream), but the idea is there would be another API that would create a Duplex stream from one Readable and one Writable:

process.stdin.pipe(new Transform(new ROT26Pair)).pipe(process.stdout);

And second, while I think exposing DuplexPair is an improvement, part of the request is better encapsulation—having a reference to Readable by itself shouldn't let me write data to that stream. This might require a re-implementation of some logic, which is why I figure much of the existing logic around buffering/flow control/object streams could be optimized away.

@mcollina
Copy link
Member

I'm fine in exposing DuplexPair, I don't think we should change the other streams API right now.

@ronag
Copy link
Member

ronag commented Nov 1, 2019

I'm not sure I agree with this. We've quite recently had problems with DuplexPair and I don't think the; how we want it to work, how it does work and how it can work is quite consistent.

I'm worried further extending the streams public API is going to make it even more difficult to make sure streams behave and are used correctly and can be reasoned about.

There is already a npm package that exposes this functionality where semver can be applied.

@awwright
Copy link
Contributor Author

awwright commented Nov 2, 2019

@ronag Well, I reported that bug, and my impression is it happened because regular streams (with the asymmetric API) are way too complicated for one person to understand how they function. And the same bug is in that "duplexpair" package, too, because it depends on a Node.js-derived stream to work.

Do you have a critique of this proposal per se (by itself), or is the concern simply that the change is unwarranted for now?

And is encapsulation in Node.js simply not a priority? I've noticed a lot of other stuff has been getting ported to symbol properties, for instance.

@ronag
Copy link
Member

ronag commented Nov 2, 2019

@awwright: My main objection is from a personal perspective. I find the existing paradigms flawed but good enough for now. If we are going to introduce another overlapping paradigm to our users it should to according to me at least be significantly better and something we actively encourage and document. Not something we just add in.

are way too complicated for one person to understand how they function

Yes, and DuplexPair is still built on top of those. For me it's just another layer which makes it even harder to reason about. If we built DuplexPair from scratch and with well defined semantics I would be more positive but I don't believe that's the way to go either. Or if it in someway allowed us to get around some difficult stream related edge cases that are difficult to fix due to compat concerns.

Right now I'm not sure of either how DuplexPair works exactly or how it is supposed to work (even though I was involved with sorting out that bug).

Do you have a critique of this proposal per se (by itself), or is the concern simply that the change is unwarranted for now?

I like the proposal from the theoretical perspective but I think it's unwarranted right now from a practical stand-point. I would rather focus on trying to sort out the existing API's and make them more consistent and easier to use.

And is encapsulation in Node.js simply not a priority?

This issue is for me not about encapsulation. We could theoretically hide push behind private symbols (not sure what you mean about Writable._read, that doesn't exist). So it's not really about the fact that we have Readable and Writable but more of an implementation detail.

I have a lot of opinions on this but they are soft. I don't have any hard objections to this but would advise against it.

I would encourage you to discuss with @Fishrock123 who is working on a larger overhaul of streams and might make use of some of your feedback.

@awwright
Copy link
Contributor Author

awwright commented Nov 3, 2019

@ronag Fair enough, thanks for the explanation! I'll check that out for sure. Maybe it solves the same things.

@addaleax I'm guessing you have a differing opinion, may I hear?

@addaleax
Copy link
Member

addaleax commented Nov 3, 2019

@awwright I mean, in the end it boils down to exposing DuplexPair from streams, right? I don’t really see any issue with that. It doesn’t have to be a paradigm change, and we don’t have to adopt it elsewhere in core, but if people want to use it, why not.

(And regarding

Right now I'm not sure of either how DuplexPair works exactly or how it is supposed to work (even though I was involved with sorting out that bug).

… that can be put into words relatively directly: What is written to one of the Duplex streams is read from the other.)

@ronag
Copy link
Member

ronag commented Nov 3, 2019

I mean, in the end it boils down to exposing DuplexPair from streams, right? I don’t really see any issue with that. It doesn’t have to be a paradigm change, and we don’t have to adopt it elsewhere in core, but if people want to use it, why not.

Then why not use the npm package? If we expose something from node core at least in my view we officially supporting and encourage usage. It's a mark of this component is mature and thought through and usable without having to give extra thought to edge cases that we might take into account when using it in core. To me it's more than "just exposing it".

… that can be put into words relatively directly: What is written to one of the Duplex streams is read from the other.)

That's a bit too simplistic though in my opinion... there are a lot of edge cases that might need to be thought through...

  • What happens when one side errors while the other doesn't?
  • How are empty chunks handled, what is the ordering when pushing empty chunks? Right now I think empty chunk ordering is broken, i.e. the callback is invoked immediately instead of waiting for any pending chunks.
  • What happens when on side is closed and the other isn't?
  • What is the order of different events such as close and error between the two sides.
  • What happens when one side is destroyed?
  • Is the reading side flushed or dumped on error on the writing side?
  • Is the writing side flushed or dumped on error on the reading side?

I haven't given this enough thought and some of the above points might be nonsense but I hope that clarifies my concern.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 4, 2022

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

@github-actions github-actions bot added the stale label Mar 4, 2022
@targos targos moved this to Pending Triage in Node.js feature requests Mar 6, 2022
@targos targos moved this from Pending Triage to Stale in Node.js feature requests Mar 6, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Apr 5, 2022

There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment.

For more information on how the project manages feature requests, please consult the feature request management document.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Issues that request new features to be added to Node.js. stale stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

No branches or pull requests

4 participants