Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node 16 Behavior Changes #44

Closed
claytongentry opened this issue May 1, 2022 · 2 comments
Closed

Node 16 Behavior Changes #44

claytongentry opened this issue May 1, 2022 · 2 comments

Comments

@claytongentry
Copy link
Member

claytongentry commented May 1, 2022

Hey @reconbot (nice to virtually meet!),

Thanks for all the great work on bluestream. I've been working to bring it up to full compat with Node 16 (#43), and I was wondering if you could do us a solid and offer some insight into a couple of issues I've seen running on 16 specifically.

1. "Callback called multiple times" errors

The error is reproducible by running any of the transform tests on master using Node 16.

It appears that async transform functions that call a callback in Node 16 will hit a "Callback called multiple times" error due to some since-deprecated streams internals behavior specific to Node 16 (i.e. no issue in 14 nor 18). I found a helpful diagnosis of the problem here: kylefarris/clamscan#88, where the recommended resolution was to simply return a promise chain.

Given there are no awaits in the bluestream transform handler, I simply removed the async declaration to resolve: b49aee0

Do you forecast any issues with that change?

2. stream.read() no longer increments the _eventsCount property

There are three tests in the readAsync suite failing on Node 16, including this one:

it('resolvers a buffer with a number bytes from a buffer stream', async () => {

The failure is on the countEvents assertion. On 16, the _eventsCount property at the conclusion of the test is 0, but the test asserts 1.

I narrowed this down to the stream.read() call here: https://github.com/bustle/bluestream/blob/master/lib/readAsync.ts#L5 On 14, I see the eventsCount property incremented, and on 16 I don't.

Do you have any insight as to that behavioral variance or any thoughts on how to resolve?

Thanks again!

@reconbot
Copy link
Contributor

reconbot commented May 1, 2022

Hey @claytongentry,
Happy to tell you what I remember.

The difference between a sync function returning a promise and an async function returning a promise is that the sync one can throw an exception whereas the async one can't. What Node does internally with them.. you got me. I'm not really sure when/where you call user code but you'll need to wrap it with a try/catch and return Promise.reject() in those cases so you don't get unhandled exceptions. Other than that it should be fine. 🤷

I honestly don't remember anything about counting read events.

Since node 10 all nodejs streams are async iterators and their performance has gotten on par with raw streams for most workloads. (Including workloads that I wrote bluestream for) and so as soon as I could I've moved most of my code to use for await and built out an async iterator toolchain called streaming-iterables. It's got a similar api, but it's a lot more simple under the hood and with it the problems bluestream were created to solve aren't relevant. (Async functions having harsh error cases with nodejs streams, and doing concurrent async work for ETLs.)

  • TrasnformStream is exactly like transform
  • pipe is exactly like pipeline
  • tap, reduce, merge, throttle, flatten, etc do what you'd expect.

It's gotten some traction and I haven't looked back.

Hope that helps!

@claytongentry
Copy link
Member Author

Extremely helpful. Thanks man. Maybe we'll just move toward streaming-iterables in place of our current bluestream usage. Appreciate the insight in either case!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants