Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: throw if string is not a valid HEX string #3773

Closed

Conversation

thefourtheye
Copy link
Contributor

As it is, if an invalid HEX string is passed to Buffer constructor,
it will only use the valid HEX values and ignore the rest. But, it also
throws an error when the length of the string is odd in length. This
patch throws an error if the string is not a valid HEX string.

Fixes: #3770

cc @vkurchatkin @trevnorris @bnoordhuis

@thefourtheye thefourtheye added buffer Issues and PRs related to the buffer subsystem. lib / src Issues and PRs related to general changes in the lib or src directory. labels Nov 11, 2015
if (StringBytes::IsValidString(env->isolate(), str, encoding) == false) {
if (encoding == HEX)
return env->ThrowTypeError("Invalid hex string");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems rather wasteful to call IsValidString() and only act on the result when encoding == HEX.

EDIT: What I mean is that IsValidString() may be cheap now but that can change when more checks are added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to send a follow-up or for base 64 strings

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to compare a known boolean value to false. if (!… would do.

@bnoordhuis
Copy link
Member

The problem I see with this approach is that it allocates memory for and writes out the string twice in the common case.

@thefourtheye
Copy link
Contributor Author

Would it be better to do it in js itself?

@trevnorris
Copy link
Contributor

Couldn't we just optimistically write the hex value and do the final check of buffer_length * 2 == string_length? I'd prefer we optimize for the common case.

Now we'd also have to check for utf-8 characters. I'd recommend doing something like the following:

size_t string_length;
if (string->IsOneByte())
  string_length = string->Length();
else
  string_length = string->Utf8Length();

Now String::IsOneByte() can have false negatives, but it is faster than String::ContainsOnlyOneByte(). Which scans the entire string. Probably want to do some benchmarking, but I believe this is a fair trade-off.


bool StringBytes::IsValidString(Isolate* isolate,
Local<String> string,
enum encoding enc) {
if (enc == HEX && string->Length() % 2 != 0)
if (enc == HEX && IsValidHexString(isolate, string) == false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!IsValidHexString(isolate, string)

@ChALkeR ChALkeR added the semver-major PRs that contain breaking changes and should be released in the next major version. label Nov 11, 2015
@ChALkeR
Copy link
Member

ChALkeR commented Nov 11, 2015

+1 on this change, there is no sense in both throwing on invalid length and ignoring (actually stopping at) invalid content. Also, it seems like that behavior was never documented.

Could you add a note to the documentation that this method throws on invalid input (after making sure that it does that for other encodings, too)?

A minor notice regarding the commit message: «hex» is not an abbrevation, it's a short word for «hexademical».

Also, this is probably a semver-major change.

As it is, if an invalid HEX string is passed to `Buffer` constructor,
it will only use the valid HEX values and ignore the rest. But, it also
throws an error when the length of the string is odd in length. This
patch throws an error if the string is not a valid HEX string.

Fixes: nodejs#3770
@thefourtheye thefourtheye force-pushed the buffer-hex-string-fix branch from 97fc4c3 to 0d5f47a Compare January 30, 2016 12:44
@jasnell
Copy link
Member

jasnell commented Mar 22, 2016

What's the status on this one?

@jasnell jasnell added the stalled Issues and PRs that are stalled. label Mar 27, 2016
@estliberitas estliberitas force-pushed the master branch 2 times, most recently from 7da4fd4 to c7066fb Compare April 26, 2016 05:22
@thefourtheye thefourtheye mentioned this pull request May 17, 2016
4 tasks
@jasnell
Copy link
Member

jasnell commented Aug 6, 2016

@thefourtheye ... are you still interested in doing this?

@jasnell
Copy link
Member

jasnell commented Feb 28, 2017

ping @thefourtheye

@jasnell
Copy link
Member

jasnell commented Mar 24, 2017

Closing due to lack of forward progress on this. Can reopen and revisit if necessary

@jasnell jasnell closed this Mar 24, 2017
@thefourtheye thefourtheye deleted the buffer-hex-string-fix branch March 28, 2017 12:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. lib / src Issues and PRs related to general changes in the lib or src directory. semver-major PRs that contain breaking changes and should be released in the next major version. stalled Issues and PRs that are stalled.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants