Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: allow typed arrays to be written and read #22427

Closed
wants to merge 11 commits into from

Conversation

SirR4T
Copy link
Contributor

@SirR4T SirR4T commented Aug 21, 2018

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

Partially fixing #1826

Caveat

  • Any TypedArrays can be written, now, but in non-objectMode, the .write() method still receives raw Buffers.

I'm not completely sure that is the behaviour we want, but I'm willing to work towards fixing it, with some mentoring.

@nodejs-github-bot nodejs-github-bot added the stream Issues and PRs related to the stream subsystem. label Aug 21, 2018
@SirR4T SirR4T changed the title Fix1826 stream typed arrays stream: allow typed arrays to be written and read Aug 21, 2018
@SirR4T
Copy link
Contributor Author

SirR4T commented Aug 21, 2018

on second thought, the Uint8Array support for streams does the same thing as well, so 🤷‍♂️

@SirR4T SirR4T mentioned this pull request Aug 21, 2018
4 tasks
lib/stream.js Outdated
'[object Float32Array]',
'[object Float64Array]',
'[object DataView]'
].includes(Object.prototype.toString.call(obj));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be pretty slow. Surely there is a better way...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will .indexOf() be faster? Open to other suggestions as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just use something like ArrayBuffer.isView()? It seems to be available since at least node v4. The reason the string check was used before was because we were explicitly checking for only Uint8Array.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On util.types are a lot of checks that can be used as well.

Copy link
Contributor

@mscdex mscdex Aug 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BridgeAR the relevant function from util.types is already being used earlier in this file if it exists. This implementation here is used if both the function from util.types and the function from the util binding are not available.

Copy link
Contributor Author

@SirR4T SirR4T Aug 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mscdex so ArrayBuffer.isView() could still be available, even though util.types and util binding are not available?
In other PRs dealing with typed arrays, i have been using:

const { isArrayBufferView } = require('internal/util/types');

Same can be used here too?
Wasn't sure whether compatibility with 0.x versions was specifically required here, and how to add that support.

Copy link
Contributor

@mscdex mscdex Aug 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so ArrayBuffer.isView() could still be available, even though util.types and util binding are not available?

My guess is that a string-based check was used previously for better compatibility with other, non-node platforms that use the same streams interface? I do not know for sure. Perhaps @addaleax might remember why they added that kind of implementation?

EDIT: judging by the comments from earlier in this file, it could also be useful for node v6.x, which does not have require('util').types, process.binding('util').isArrayBufferView, or require('internal/util/types').

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So i'm guessing current PR stays put?

Or is there any merit in adding a patch of this sort:

diff --git a/lib/stream.js b/lib/stream.js
index c6861b3464..58aab7c16d 100644
--- a/lib/stream.js
+++ b/lib/stream.js
@@ -42,6 +42,15 @@ Stream.finished = eos;
 Stream.Stream = Stream;

 // Internal utilities
+try {
+  const { isArrayBufferView } = require('internal/util/types');
+  if (typeof isArrayBufferView === 'function') {
+    Stream._isArrayBufferView = isArrayBufferView;
+  }
+} catch (e) {
+}
+
+if (!Stream._isArrayBufferView) {
   try {
     const types = require('util').types;
     if (types && typeof types.isArrayBufferView === 'function') {
@@ -53,6 +62,7 @@ try {
     }
   } catch (e) {
   }
+}

 if (!Stream._isArrayBufferView) {
   Stream._isArrayBufferView = function _isArrayBufferView(obj) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SirR4T That's more or less the exact same thing that's being done below that with require('util').types. I still think using ArrayBuffer.isView() as the fallback is probably best. I suppose the function could be cached at startup like internal/util/types does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use ArrayBuffer.isView. Although do verify, if simply having it in a constant will cache it? Or should the const isTypedArray be moved to module scope?

Also, shouldn't we be checking for existence of ArrayBuffer.isView first, and then be falling back to other methods (util.types)?

lib/stream.js Outdated
@@ -45,37 +45,50 @@ Stream.Stream = Stream;
try {
const types = require('util').types;
if (types && typeof types.isUint8Array === 'function') {
Stream._isUint8Array = types.isUint8Array;
// Stream._isUint8Array = types.isUint8Array;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments should be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

lib/stream.js Outdated
'[object Float32Array]',
'[object Float64Array]',
'[object DataView]'
].includes(Object.prototype.toString.call(obj));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On util.types are a lot of checks that can be used as well.

lib/stream.js Outdated
if (!Stream._uint8ArrayToBuffer) {
Stream._uint8ArrayToBuffer = function _uint8ArrayToBuffer(chunk) {
if (!Stream._typedArrayToBuffer) {
Stream._typedArrayToBuffer = function _typedArrayToBuffer(chunk) {
return Buffer.prototype.slice.call(chunk);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will likely not work with anything besides Uint8Array. So any other TypedArray would fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated ._typedArrayToBuffer to use Buffer.from().

lib/stream.js Outdated
@@ -45,37 +45,50 @@ Stream.Stream = Stream;
try {
const types = require('util').types;
if (types && typeof types.isUint8Array === 'function') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be updated to check for the correct function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor Author

@SirR4T SirR4T left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @BridgeAR , does this look ok, for _typedArrayToBuffer()?

lib/stream.js Outdated
@@ -87,7 +87,7 @@ if (version[0] === 0 && version[1] < 12) {

if (!Stream._typedArrayToBuffer) {
Stream._typedArrayToBuffer = function _typedArrayToBuffer(chunk) {
return Buffer.prototype.slice.call(chunk);
return Buffer.from(chunk.buffer, chunk.byteOffset, chunk.byteLength);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the Buffer.prototype.slice() would have made a different Buffer. With Buffer.from(), the underlying memory will still be shared. Will that be a problem?

lib/stream.js Outdated
if (!Stream._isArrayBufferView) {
// Cached to make sure no userland code can tamper with it.
const isArrayBufferView = ArrayBuffer.isView;
Stream._isArrayBufferView = isArrayBufferView;
Copy link
Contributor

@mscdex mscdex Aug 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, in this case we don't actually need to do any caching. I think we can just do

Stream._isArrayBufferView = ArrayBuffer.isView;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@SirR4T
Copy link
Contributor Author

SirR4T commented Aug 27, 2018

@mscdex / @BridgeAR : any other changes suggested / required here?

@mscdex
Copy link
Contributor

mscdex commented Aug 27, 2018

@SirR4T
Copy link
Contributor Author

SirR4T commented Aug 29, 2018

rebased against master. does that warrant a new CI?

@SirR4T
Copy link
Contributor Author

SirR4T commented Aug 30, 2018

Hi @mscdex , sorry for the commit and CI noise, but I had rebased this branch against master, after the earlier CI run had completed.

For this to be merged in, should another CI run be initiated?

Also, kind of off topic:
I personally prefer to keep my feature branches updated with (rebased against) master, so that builds across different feature branches may benefit simultaneously. Is this practice frowned upon? Or are there other ways to achieve the same?

@mcollina mcollina added semver-minor PRs that contain new features and should be released in the next minor version. dont-land-on-v6.x labels Sep 3, 2018
@mcollina
Copy link
Member

mcollina commented Sep 3, 2018

I've added the dont-land-on-6 and dont-land-on-8 labels out of caution.

lib/stream.js Outdated
if (types && typeof types.isUint8Array === 'function') {
Stream._isUint8Array = types.isUint8Array;
if (types && typeof types.isArrayBufferView === 'function') {
Stream._isArrayBufferView = types.isArrayBufferView;
} else {
// This throws for Node < 4.2.0 because there's no util binding and
// returns undefined for Node < 7.4.0.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment would need to be updated. When has isArrayBufferView  been added?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v7.4.0.

* `chunk` {Buffer|TypedArray|DataView|string|null|any} Chunk of data to push
into the read queue. For streams not operating in object mode, `chunk` must
be a string, `Buffer` or `Uint8Array`. For object mode streams, `chunk` may
be any JavaScript value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this sentence is not up-to-date. It's not just string, Buffer or Uint8Array anymore.

Copy link
Contributor Author

@SirR4T SirR4T Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but the typed arrays and data views can be passed received only in object mode. So thought that the current statement still remains true.

It might be better to explicitly document this behaviour though. Any thoughts on the same?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I misunderstood this PR then. I thought the whole point of this is to support all TypedArray and DataView  in binary mode.

This looks like from looking at the code as !state.objectMode && Stream._isArrayBufferView works when objectMode is `false.

Copy link
Contributor Author

@SirR4T SirR4T Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea... I wasn't sure of the behaviour we required here, which is why I had raised the PR with a caveat.

The PR introducing support for Uint8Array in streams, too, asserts a similar behaviour, so I had let this PR follow similarly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. Still, this PR make accepting TypedArray and DataView  everywhere so it should be reflected in the docs.

lib/stream.js Outdated
return Buffer.prototype.slice.call(chunk);
if (!Stream._typedArrayToBuffer) {
Stream._typedArrayToBuffer = function _typedArrayToBuffer(chunk) {
return Buffer.from(chunk.buffer, chunk.byteOffset, chunk.byteLength);
};
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can clean up this whole file. As long as it exports the two functions we need, it's ok to not have all those checks. readable-stream@3 supports Node 6+ anyway.

@mcollina
Copy link
Member

mcollina commented Sep 3, 2018

@addaleax
Copy link
Member

addaleax commented Sep 3, 2018

* `chunk` {Buffer|TypedArray|DataView|string|null|any} Chunk of data to push
into the read queue. For streams not operating in object mode, `chunk` must
be a string, `Buffer` or `Uint8Array`. For object mode streams, `chunk` may
be any JavaScript value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. Still, this PR make accepting TypedArray and DataView  everywhere so it should be reflected in the docs.

test/parallel/test-stream-typedarrays.js Show resolved Hide resolved
@SirR4T SirR4T force-pushed the fix1826-stream-TypedArrays branch from 38fd0fc to 865f456 Compare September 4, 2018 10:52
@SirR4T
Copy link
Contributor Author

SirR4T commented Sep 4, 2018

  • removed legacy compatibility code for lib/stream.js (cc: @mcollina , @mscdex , and @addaleax due to comments here and here)
  • added tests for streams where writeable is piped to readable, and with transforms,
  • updated docs for typed arrays and data views,
  • added docs explicitly stating conversion of typed arrays and data views in non-object mode.
  • rebased against current master,

// NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
// OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
// USE OR OTHER DEALINGS IN THE SOFTWARE.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this header is not required


const src = new PassThrough({ objectMode: true });
const tx = new PassThrough({ objectMode: true });
const dest = new PassThrough({ objectMode: true });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the whole point of this change is see how they work with objectMode: false.

let ticks = testBuffers.length;

const rs = new Readable({
objectMode: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be objectMode: false. Also everywhere else.

@SirR4T
Copy link
Contributor Author

SirR4T commented Sep 5, 2018

@mcollina : pushed changes with objectMode: false, and some changes to internal _stream_readable.js and _stream_writable.js to remove checks which I think are unnecessary now. They might have been necessary before Buffer was a subclass of Uint8Array.

One test is still failing, which raises this question: how should encoding work, for any .write() methods where chunk is not a string? As far as I understand, it works with Buffers and Uint8Arrays currently (before this PR), but will not work with any other typed array or data view.

@mcollina
Copy link
Member

mcollina commented Sep 5, 2018

pushed changes with objectMode: false, and some changes to internal _stream_readable.js and _stream_writable.js to remove checks which I think are unnecessary now. They might have been necessary before Buffer was a subclass of Uint8Array.

Are you sure? Can you please add some checks in the tests that everything that is emitted/returned by on('data') and read() are Buffers? Also, can you check that everything that gets into _write() and _writev() are indeed Buffer?

One test is still failing, which raises this question: how should encoding work, for any .write() methods where chunk is not a string? As far as I understand, it works with Buffers and Uint8Arrays currently (before this PR), but will not work with any other typed array or data view.

I think this is why you need the conversion that you have commented out.

@lundibundi
Copy link
Member

ping @SirR4T.

@SirR4T
Copy link
Contributor Author

SirR4T commented Oct 7, 2018 via email

@lundibundi lundibundi added the wip Issues and PRs that are still a work in progress. label Oct 8, 2018
@SirR4T SirR4T force-pushed the fix1826-stream-TypedArrays branch from 06b494a to dbfcade Compare October 17, 2018 09:26
@antsmartian
Copy link
Contributor

@SirR4T Ping, any update on this?

@SirR4T
Copy link
Contributor Author

SirR4T commented Nov 17, 2018

fixed the merge conflicts for now, but if someone could guide me on correcting the test cases, would appreciate the help.

@SirR4T SirR4T force-pushed the fix1826-stream-TypedArrays branch from 9f31601 to 7841da2 Compare November 17, 2018 07:37
write: common.mustCall((chunk, encoding, cb) => {
assert(!(chunk instanceof Buffer));
assert(ArrayBuffer.isView(chunk));
assert.deepStrictEqual(chunk, nthWrittenBuffer(n++));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instances of Buffer could not be deepStrictEqual to those of TypedArray, could they?

Maybe try:

assert.deepStrictEqual(Buffer.from(chunk.buffer, chunk.byteOffset, chunk.length), nthWrittenBuffer(n++));

@BridgeAR
Copy link
Member

@SirR4T what is the status here?

@jasnell
Copy link
Member

jasnell commented Jul 7, 2020

Given that this is out of date and there has been no activity in over a year, closing. Can reopen if it is picked back up again

@jasnell jasnell closed this Jul 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-minor PRs that contain new features and should be released in the next minor version. stream Issues and PRs related to the stream subsystem. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants