
crypto: add HMAC to crypto.timingSafeEqual() #38488

Closed
wants to merge 1 commit into from

Conversation

Trott
Member

@Trott Trott commented Apr 30, 2021

Add HMAC to crypto.timingSafeEqual(). This makes things slower but also
makes them far more timing safe.

Refs: #38226 (comment)
Fixes: #38226
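
In rough terms, the change amounts to something like the following sketch, assuming OpenSSL's one-shot HMAC(), RAND_bytes(), and CRYPTO_memcmp(); the function name and key size here are illustrative, not the PR's exact code:

#include <openssl/crypto.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <openssl/rand.h>

// Hypothetical helper: HMAC both inputs with a fresh random key, then compare
// the MACs with CRYPTO_memcmp(). Any data-dependent timing in the comparison
// then leaks information about the MACs, not about the original buffers.
bool TimingSafeEqualSketch(const unsigned char* a,
                           const unsigned char* b,
                           size_t len) {
  unsigned char key[32];
  if (RAND_bytes(key, sizeof(key)) != 1) return false;

  unsigned char mac_a[EVP_MAX_MD_SIZE];
  unsigned char mac_b[EVP_MAX_MD_SIZE];
  unsigned int mac_a_len = 0;
  unsigned int mac_b_len = 0;
  if (HMAC(EVP_sha256(), key, sizeof(key), a, len, mac_a, &mac_a_len) == nullptr ||
      HMAC(EVP_sha256(), key, sizeof(key), b, len, mac_b, &mac_b_len) == nullptr) {
    return false;
  }

  // Both MACs have the same, fixed length for a given digest.
  return mac_a_len == mac_b_len &&
         CRYPTO_memcmp(mac_a, mac_b, mac_a_len) == 0;
}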

@Trott Trott added the wip Issues and PRs that are still a work in progress. label Apr 30, 2021
@github-actions github-actions bot added c++ Issues and PRs that require attention from people who are familiar with C++. crypto Issues and PRs related to the crypto subsystem. needs-ci PRs that need a full CI run. labels Apr 30, 2021
@Trott Trott force-pushed the hmacify branch 2 times, most recently from f2a0d3a to a080a53 Compare April 30, 2021 19:59

@Trott
Member Author

Trott commented Apr 30, 2021

Here's a stress test of this PR: https://ci.nodejs.org/job/node-stress-single-test/294/

And here's one for current master branch: https://ci.nodejs.org/job/node-stress-single-test/295/

@Trott Trott force-pushed the hmacify branch 2 times, most recently from feede9c to b3c3a0c Compare April 30, 2021 20:34
@Trott Trott removed the wip Issues and PRs that are still a work in progress. label Apr 30, 2021
@Trott Trott changed the title wip: add HMAC to crypto.timingSafeEqual() src: add HMAC to crypto.timingSafeEqual() Apr 30, 2021
@Trott Trott changed the title src: add HMAC to crypto.timingSafeEqual() crypto: add HMAC to crypto.timingSafeEqual() Apr 30, 2021
@Trott
Member Author

Trott commented Apr 30, 2021

Stress test results are enough for me to say "This is a good thing." At least, it fixes test flakiness (beyond the tiny bit of flakiness inherent in a probabilistic test).

My C++ was never very good and it's only gotten rustier over the last several years, so apologies in advance.

@nodejs/crypto @not-an-aardvark

@panva
Member

panva commented Apr 30, 2021

This makes things slower

How much slower? Is the existing implementation not timing safe enough? Should this be a semver-major? Or better yet introduced as a flag to the existing method?

Comment on lines 50 to 61
char key[kKeySize];
snprintf(key, sizeof(key), "%04x%04x%04x%04x%04x%04x%04x%04x",
bufKey[0],
bufKey[1],
bufKey[2],
bufKey[3],
bufKey[4],
bufKey[5],
bufKey[6],
bufKey[7]);
Member

Why not just use a static random key here rather than generating one each time? The key can be generated randomly when the crypto binding is loaded and just reused.
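
A minimal sketch of this suggestion, assuming OpenSSL's RAND_bytes() and std::call_once for the one-time initialization (the names are hypothetical, not actual binding code):

#include <array>
#include <mutex>
#include <openssl/rand.h>

// Hypothetical: generate the HMAC key once, the first time it is needed,
// and reuse it for every later timingSafeEqual() call.
static std::array<unsigned char, 32> g_timing_safe_key;
static std::once_flag g_timing_safe_key_once;

static const unsigned char* GetTimingSafeKey() {
  std::call_once(g_timing_safe_key_once, []() {
    RAND_bytes(g_timing_safe_key.data(),
               static_cast<int>(g_timing_safe_key.size()));
  });
  return g_timing_safe_key.data();
}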

Member Author

Why not just

Because I am not particularly fluent at C++ and copied this implementation from the uuid generation in our inspector code. Happy to take informed comments to improve it, but the answer to any "why not just" queries is going to be "Because I'm really bad at this."

Member Author

And I guess I should state, rather than imply: Thanks for the suggestion, and I'll look into doing that later. (Focused on something else at the moment, but not focused enough to ignore GitHub comments, go figure.)

Member

:-) ... ok, I just wanted to make sure there wasn't a specific technical reason. By avoiding generating the random key on every call we can save at least some of the additional performance cost introduced here.

Contributor

I think the main advantage of generating a random key every time is that it would become much harder for someone to get the function to return true as a result of an HMAC-SHA-256 collision, in the event that SHA-256 becomes less collision-resistant in the future.

As a secondary, smaller benefit, it also prevents an attacker from knowing either of the inputs to CRYPTO_memcmp. More explicitly, if the computation is CRYPTO_memcmp(sha256hmac(someKey, a), sha256hmac(someKey, b)) where someKey is a static constant, then an attacker could control at least the first few bytes of sha256hmac(someKey, a) by changing a via brute force. Given that this is intended as a defense-in-depth measure in the scenario that CRYPTO_memcmp is insufficient to address side-channels, in this scenario the attacker could use a timing side-channel to discover the first few bytes of sha256hmac(someKey, b). It's not clear how this would be useful in most cases, but it would give a nonzero amount of information about b.

Comment on lines +61 to +64
std::array<unsigned char, EVP_MAX_MD_SIZE> hash1;
std::array<unsigned char, EVP_MAX_MD_SIZE> hash2;
Member

why not just unsigned char hash1[EVP_MAX_MD_SIZE] here?

Member Author

Sure, I don't know the difference, so that works for me. 😬


Afaik, a C-style array lacks the higher-level functionality of std::array; in particular, it does not track its own size, so you need to manage size information manually. So for this particular case, I think it's better to keep std::array, since it keeps track of its own size.
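
For illustration only (not code from the PR), the difference being discussed is roughly:

#include <array>
#include <openssl/evp.h>

std::array<unsigned char, EVP_MAX_MD_SIZE> hash1;  // hash1.size() is always available
unsigned char hash2[EVP_MAX_MD_SIZE];              // size is sizeof(hash2) only until the
                                                   // array decays to a pointer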

@tniessen
Member

If this really isn't timing-safe, isn't that a bug in CRYPTO_memcmp?

@Trott
Member Author

Trott commented Apr 30, 2021

This makes things slower

How much slower?

We don't have any formal benchmarks so I'd have to write one to say with certainty.

Using test/pummel/test-crypto-timing-safe-equal-benchmarks.js as a proxy for a benchmark:

  • On LinuxONE CI, master takes about 2.5 seconds per run and this PR takes about 3.8 seconds per run.
  • On my 8-year-old laptop, master takes about 5 or 6 seconds and this PR takes about 14 or 15 seconds per run.

Is the existing implementation not timing safe enough?

Short answer is "Yes, not timing safe enough."

Longer answer: Unless there's a bug in test/pummel/test-crypto-timing-safe-equal-benchmarks.js, the answer would be "The current implementation fails a lot on some specific hosts in CI, so yes, not timing safe enough." You can see in https://ci.nodejs.org/job/node-stress-single-test/294/nodes=rhel7-s390x/console that with this implementation, the test passed 1000 times out of 1000 runs on LinuxONE (which is very fast in CI; it might be our fastest CI host). In comparison, the master branch (which you can see at https://ci.nodejs.org/job/node-stress-single-test/295/nodes=rhel7-s390x/console) is still running as of this writing, but it has failed more than 50% of the 700+ runs so far.

Should this be a semver-major? Or better yet introduced as a flag to the existing method?

Good questions. I'm interested in what others think.

@Trott
Member Author

Trott commented Apr 30, 2021

If this really isn't timing-safe, isn't that a bug in CRYPTO_memcmp?

I assumed we're using the OpenSSL memcmp under the hood and didn't even consider that there might be a bug in memcmp, but now that you mention it, I suppose this all warrants more investigation.

@jasnell
Member

jasnell commented Apr 30, 2021

Unless there's a bug in test/pummel/test-crypto-timing-safe-equal-benchmarks.js

Bug.... no.... but there could be something at play here with the use of process.hrtime(). The LinuxOne machine is very fast, and I do wonder if there's some precision that's being lost here in the timing logic in the benchmark.

@Trott
Member Author

Trott commented Apr 30, 2021

The LinuxOne machine is very fast, and I do wonder if there's some precision that's being lost here in the timing logic in the benchmark.

I could be wrong, but I would think lost precision should make the test more robust.

@Trott
Member Author

Trott commented Apr 30, 2021

If this really isn't timing-safe, isn't that a bug in CRYPTO_memcmp?

There are also several lines of code inside TimingSafe() before it gets to the memcmp(), so maybe that's where the problem lies? Maybe there are optimizations that occur when the two buffers have the same contents that result in a timing difference that is still measurable, especially since the test is testing with one-character buffers?

@Trott
Member Author

Trott commented May 1, 2021

Here's a stress test on a branch where I removed memcmp() and always return true instead. (I had to alter a comparison in the test too.)

Stress test: https://ci.nodejs.org/job/node-stress-single-test/296/nodes=rhel7-s390x/
Branch diff vs. master: https://github.com/nodejs/node/compare/master...Trott:no-memcmp?expand=1

If this fails a lot, then the timing issue doesn't involve memcmp().

If it succeeds 100%, then that strongly suggests that the issue is with memcmp(). 😱

@Trott
Member Author

Trott commented May 1, 2021

Here's a stress test on a branch where I removed memcmp() and always return true instead. (I had to alter a comparison in the test too.)

Stress test: https://ci.nodejs.org/job/node-stress-single-test/296/nodes=rhel7-s390x/
Branch diff vs. master: https://github.com/nodejs/node/compare/master...Trott:no-memcmp?expand=1

If this fails a lot, then the timing issue doesn't involve memcmp().

If it succeeds 100%, then that strongly suggests that the issue is with memcmp(). 😱

So far, no failures in 50 runs. So that strongly suggests (to me at least) that the problem is indeed memcmp(). Do you agree, @tniessen? What do we do next?

@Trott
Member Author

Trott commented May 1, 2021

So far, no failures in 50 runs. So that strongly suggests (to me at least) that the problem is indeed memcmp(). Do you agree, @tniessen? What do we do next?

Looks like we're not even using a wrapper around OpenSSL's CRYPTO_memcmp(). We are calling it directly.

@Trott
Member Author

Trott commented May 1, 2021

So far, no failures in 50 runs. So that strongly suggests (to me at least) that the problem is indeed memcmp(). Do you agree, @tniessen? What do we do next?

Looks like we're not even using a wrapper around OpenSSL's CRYPTO_memcmp(). We are calling it directly.

Maybe the way I modified the C++ function allowed the optimizer to remove some of the other code. So maybe the conclusion I draw here isn't correct. I should have thought of that before...

@not-an-aardvark
Contributor

Re. "Is the existing implementation not timing safe enough?", I think:

  • There are a lot of layers of abstraction in the benchmark test, including crypto.timingSafeEqual. One of them is causing the test to fail.

  • At this point, my best guess is that the culprit is branch prediction within the benchmark test itself. V8 compiles the benchmark in some way involving lots of fancy inlining and complicated runtime heuristics. Maybe this results in a faster compilation of the "run a benchmark with hardcoded equal inputs" code than the "run a benchmark with hardcoded unequal inputs" code, despite the use of things like eval and randomized ordering to throw off the optimizer. If this is the case, then the test is just flaky and crypto.timingSafeEqual is fine.

    (As for why the breakage just started recently: I doubt it's substantively related to Trott@845adb7, but maybe that pushed something like memory usage over a particular threshold that caused things to be compiled/arranged differently.)

  • Another possibility, which is plausible albeit maybe far-fetched, is that crypto.timingSafeEqual or the benchmark code is quicker to return false than to return true based on some artifact of code generation, but takes the same amount of time to return false for unequal buffers regardless of how "similar" they are. In this case, there would technically be a bug but it would be unlikely to be exploitable as a security issue.

  • At this point we can't completely rule out the possibility that there's a problem with CRYPTO_memcmp. Since there's some uncertainty, I think it would be reasonable to add an HMAC as a defense-in-depth mechanism.

If we want to investigate further, there are a few routes we could try:

  • Update the benchmark to compare random buffers that are read from I/O somehow, rather than generating effectively-constant buffers in JS. This would more accurately simulate real-world conditions, in which a buffer might come in over the network, and would probably rule out the possibility of V8 shenanigans affecting the benchmark. (More specifically, a harness could create two files that either (a) both contain the same random buffer, or (b) both have different random buffers, and then run a Node program that uses crypto.timingSafeEqual to check whether the buffers are equal, and compare the runtime between these cases.) This test would take a lot longer to run, but seems like it would be fairly conclusive about whether the problem is in our benchmark or not.
    • A potentially-easier version of this would be to implement (part of) the test in native code, and have it call crypto.timingSafeEqual via N-API (or just call the native binding used by crypto.timingSafeEqual); a rough sketch of such a harness follows this list.
    • If that test identifies a timing issue, it would also be interesting to compare the timing between cases where (a) both files contain different random buffers, and (b) both files contain the same random buffer except that the last byte is different.
  • It could be worth examining the assembly generated by CRYPTO_memcmp in the compiled node binary for this platform, although this wouldn't rule out bugs caused by hardware-level optimizations.
  • In theory it would be possible to examine the code that V8 generates somehow, but I'm not aware of how to do this.
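
For what it's worth, a very rough sketch of the native-code variant mentioned above might look like the following. This is an assumption about the shape of such a harness, not code from the PR; it calls OpenSSL's CRYPTO_memcmp() directly instead of going through crypto.timingSafeEqual, and it would be run once with two files holding identical random buffers and once with two different random buffers, comparing the wall-clock times of the two invocations:

#include <openssl/crypto.h>
#include <chrono>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <vector>

static std::vector<unsigned char> ReadFile(const char* path) {
  std::ifstream f(path, std::ios::binary);
  return std::vector<unsigned char>(std::istreambuf_iterator<char>(f), {});
}

int main(int argc, char** argv) {
  if (argc != 3) return 1;
  std::vector<unsigned char> a = ReadFile(argv[1]);
  std::vector<unsigned char> b = ReadFile(argv[2]);
  if (a.empty() || a.size() != b.size()) return 1;

  // Time many comparisons; an outer script compares the elapsed time of the
  // "equal files" run against the "unequal files" run.
  const int kIterations = 1000000;
  int sink = 0;
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < kIterations; i++)
    sink |= CRYPTO_memcmp(a.data(), b.data(), a.size());
  auto end = std::chrono::steady_clock::now();
  std::chrono::duration<double> elapsed = end - start;
  std::printf("result=%d elapsed=%f s\n", sink, elapsed.count());
  return 0;
}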

@Trott
Member Author

Trott commented May 1, 2021

  • Update the benchmark to compare random buffers that are read from I/O somehow, rather than generating effectively-constant buffers in JS.

Perhaps using crypto.randomUUID() would be sufficient? EDIT: Er, ok, not sufficient, but maybe an incremental improvement....

Trott added a commit to Trott/io.js that referenced this pull request May 1, 2021
Avoid possible V8 optimizations that may invalidate the benchmark. This is
done by writing random UUIDs to files and reading the files again as
needed, rather than having hardcoded strings for the buffer contents.

Refs: nodejs#38488 (comment)
@Trott
Member Author

Trott commented May 1, 2021

create two files that either (a) both contain the same random buffer, or (b) both have different random buffers, and then run a Node program that uses crypto.timingSafeEqual to check whether the buffers are equal, and compare the runtime between these cases.)

I tried to do a lightweight version of this approach in #38493
