Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Buffer inconsistent throws/coercion #5323

Closed
trevnorris opened this issue Apr 17, 2013 · 17 comments
Closed

Buffer inconsistent throws/coercion #5323

trevnorris opened this issue Apr 17, 2013 · 17 comments

Comments

@trevnorris
Copy link

Working on #4964 and noticed that Buffers have inconsistencies of when arguments are thrown and when they're coerced. This is a proposal to solidify how they should be handled. It has taken care to follow existing practices by browsers in regards to generic argument handling and that of the Typed Array specification:

  • Required numeric arguments throw if undefined.
  • Optional numeric arguments are set as default if undefined.
  • Arguments expected to be numeric will coerce all other values.
  • Any double will be floored to int.
  • Throw for all out of range indexes (index > buffer.length || index < 0) for any read/write methods.
  • buffer.slice() and buffer.copy() should follow the specification for arraybuffer.slice() (with a few modification for .copy()).

As an example of how to handle this, here are a couple verification functions:

inline uint32_t GetOptionalUintArg(Handle<Value> arg) {
  int32_t val_i = arg->Int32Value();
  if (val_i < 0)
    return ThrowRangeError("index is less than zero");
  return static_cast<uint32_t>(val_i);
}

inline uint32_t GetRequiredUintArg(Handle<Value> arg) {
  if (arg->IsUndefined())
    return ThrowTypeError("argument is undefined");
  return GetOptionalUintArg(arg);
}
@bnoordhuis
Copy link
Member

I second this motion.

Arguments expected to be numeric will coerce all other values.

My beef with this would be that it could hide silly type errors in core, i.e. that it could mask expensive type coercions. Not insurmountable: if all type checking logic is centralized, I can always patch my tree to throw instead of coerce and run the tests (and hope they excise that particular behavior.)

@isaacs
Copy link

isaacs commented Apr 18, 2013

My beef with this would be that it could hide silly type errors in core

I'm inclined to agree with this. Where would we be coercing to a numeric type?

@trevnorris
Copy link
Author

@bnoordhuis I feel the same way. My take on it is that type coercion in js is usually cheap, and if an unexpected value goes through it'll show up as a DEOPT while testing. I understand this isn't optimal, but can't think of a much better way.

For cc-land, not completely sure. The best idea I've had is to create a global set of argument parsing functions, like the two above, which would make it easy to patch for testing.

@isaacs For any functions that users aren't supposed to touch, no need for the coercion. It's geared towards the public API. And it's not just for numbers specifically (though I forgot to mention that). It's for any primitive. Remember 0468861? It's pretty standard to auto-coerce any expected primitive argument to be what is required.

Here's an example:

var dv = new DataView(new ArrayBuffer(8));
dv.setFloat64('0','1.1');
dv.getFloat64('0');
// output: 1.1

@trevnorris
Copy link
Author

These are easy primitive coercion methods that follow ECMA 262.

  • !! - convert value to Boolean (sec. 9.2)
  • + - convert value to Number (sec. 9.3)
  • ~~ - convert value to Int32 (sec. 9.5)
  • +'' - convert value to string (sec. 9.8)

Here is an example using several of the above:

// use: fillRange(val, [start][, end][, noAssert]);
function fillRange(val, start, end, noAssert) {
  val = +val;
  // null-ish will default to 0
  start = ~~start;
  // null-ish will default to "data_length"
  end = end == null ? data_length : ~~end;

  // js api's generally fix this for developers
  if (end > start)
    end = start;

  // important to throw for index check when js is trying to
  // access raw memory (e.g. Buffers/Typed Arrays)
  if (!noAssert && (start < 0 || end > total_length))
    throw new RangeError("out of range index");

  // ...
}

Now as far as the c++ side, definitely want @bnoordhuis opinion on that. I'd like to be able to replicate the above for consistency, but value coercion is much more expensive on the native side. What about making a few argument value parsing functions included in node.h. Possibly setting a #define for debug mode so it'll throw instead of auto-coercing the values?

@bnoordhuis
Copy link
Member

What about making a few argument value parsing functions included in node.h. Possibly setting a #define for debug mode so it'll throw instead of auto-coercing the values?

I'm okay with that but said functions should go into node_internals.h, not node.h.

@trevnorris
Copy link
Author

Sounds good to me. Other than that, have any comments on any other arguments that should be thrown?

@trevnorris
Copy link
Author

@isaacs Want your opinion on one thing. Right now the Buffer .slice() will coerce all arguments. This sort of make sense because it follows the same pattern as Array. Though since the user isn't just getting a copy back, imho, out of range indexes should throw. So, change tha API, or make an exception for that case?

@trevnorris
Copy link
Author

Ok. Consensus has been reached.

General:

  • Expected arguments throw if value === undefined.
  • arguments.length is too expensive (in js) and so not used to check.
  • Optional arguments are set to their defaults if === undefined.
  • Primitives are coerced before value validation occurs. (e.g. length = +length)

General Numeric rules:

  • Expected Number values are coerced:
    • JS with +arg.
    • CC with NumberValue(arg).
  • Expected Int/Uint values are coerced:
    • JS with ~~arg.
    • CC with Int32Value(arg)
  • When Int/Uint are expected values are allowed to overflow wrap, but coercion must be done before checking.
  • Uint values are first coerced to Int to check for negative values. Then cast to size_t.

Other general rules:

  • String arguments are coerced using + ''.
  • Boolean arguments are coerced using !!.

Exception:

  • Buffer instantiation expects Uint but evaluates as the floor of Number.

General Buffer method rules:

  • If end > start then end = start.
  • All write*/read* will throw if index (plus offset for written value) is out of range.
  • Buffer.prototype.slice will behave like ArrayBuffer.prototype.slice.

One TBD:

  • Buffer.prototype.copy is problematic for the following:
    • I was an idiot and added the following to the docs: "All values passed that are undefined/NaN or are out of bounds are set equal to their respective defaults."
    • Currently throws when "sourceEnd is negative", "sourceStart is greater than sourceEnd" or "attempt to copy after targetBuffer". Which could contradict the "out of bounds" line above. Depending on how you read it.

Other notes:

  • In cc-land the function ParseArrayIndex has been added to node_internals.h used to parse all array indexes in cc.

Have a hard time believing it's come out this complicated, but this should address most, if not all, argument parsing concerns.

/cc @isaacs @bnoordhuis just for one last review

@bnoordhuis
Copy link
Member

I'm not sure if your comment addresses this but:

Expected Int/Uint values are coerced:

Can lead to unexpected results (where 'unexpected' === 'most people won't realize that', not 'undefined'.)

> 0x7fffffff === ~~0x7fffffff  // 2147483647
true
> 0x80000000 === ~~0x80000000  // -2147483648
false

This is more of a general observation because you can't have buffers that big anyway.

@trevnorris
Copy link
Author

Was thinking about that myself, but then realized that no one else does it as well:

new ArrayBuffer(~~(0xffffffff+3))

@trevnorris
Copy link
Author

But I did make an exception for Buffer instantiation. I floor the Number
which will leave it in floating point range. So in that one case it'll
throw regardless of the size of n.

@domenic
Copy link

domenic commented Jun 30, 2013

Sorry to jump in here, but

Optional arguments are set to their defaults if == null.

Is not really the way to go. It should be === undefined to match ES6 semantics of default arguments and all built-in ES5 functions.

Otherwise the general movement toward coercion is great and matches the ES5 built-ins.

@trevnorris
Copy link
Author

Good assessment. There are parts of the API where null is a valid but not default parameter. Oversight on my part. Thanks.

@trevnorris
Copy link
Author

@isaacs I'd suggest during v0.13 cleanup we make this standard across all of core as much as possible.

@domenic domenic mentioned this issue Aug 21, 2013
h4ck3rm1k3 pushed a commit to h4ck3rm1k3/node that referenced this issue Nov 1, 2013
As documented in nodejs#3042 and in [1], the existing vm implementation has
many problems. All of these are solved by @brianmcd's [contextify][2]
package. This commit uses contextify as a conceptual base and its code
core to overhaul the vm module and fix its many edge cases and caveats.

Functionally, this fixes nodejs#3042. In particular:

- A context is now indistinguishable from the object it is based on
  (the "sandbox"). A context is simply a sandbox that has been marked
  by the vm module, via `vm.createContext`, with special internal
  information that allows scripts to be run inside of it.
- Consequently, items added to the context from anywhere are
  immediately visible to all code that can access that context, both
  inside and outside the virtual machine.

This commit also smooths over the API very slightly:

- Parameter defaults are now uniformly triggered via `undefined`, per
  ES6 semantics and previous discussion at [3].
- Several undocumented and problematic features have been removed, e.g.
  the conflation of `vm.Script` with `vm` itself, and the fact that
  `Script` instances also had all static `vm` methods. The API is now
  exactly as documented (although arguably the existence of the
  `vm.Script` export is not yet documented, just the `Script` class
  itself).

In terms of implementation, this replaces node_script.cc with
node_contextify.cc, which is derived originally from [4] (see [5]) but
has since undergone extensive modifications and iterations to expose
the most useful C++ API and use the coding conventions and utilities of
Node core.

The bindings exposed by `process.binding('contextify')`
(node_contextify.cc) replace those formerly exposed by
`process.binding('evals')` (node_script.cc). They are:

- ContextifyScript(code, [filename]), with methods:
  - runInThisContext()
  - runInContext(sandbox, [timeout])
- makeContext(sandbox)

From this, the vm.js file builds the entire documented vm module API.

node.js and module.js were modified to use this new native binding, or
the vm module itself where possible. This introduces an extra line or
two into the stack traces of module compilation (and thus into most
stack traces), explaining the changed tests.

The tests were also updated slightly, with all vm-related simple tests
consolidated as test/simple/test-vm-* (some of them were formerly
test/simple/test-script-*). At the same time they switched from
`common.debug` to `console.error` and were updated to use
`assert.throws` instead of rolling their own error-testing methods.

New tests were also added, of course, demonstrating the new
capabilities and fixes.

[1]: http://nodejs.org/docs/v0.10.16/api/vm.html#vm_caveats
[2]: https://github.com/brianmcd/contextify
[3]: nodejs#5323 (comment)
[4]: https://github.com/kkoopa/contextify/blob/bf123f3ef960f0943d1e30bda02e3163a004e964/src/contextify.cc
[5]: https://gist.github.com/domenic/6068120
@jasnell
Copy link
Member

jasnell commented May 26, 2015

@trevnorris ... is this still an issue?

@jasnell
Copy link
Member

jasnell commented Aug 26, 2015

@trevnorris ... ping :-)

@trevnorris
Copy link
Author

going to say no.

@jasnell jasnell closed this as completed Aug 27, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants