Investigate "Balloon Hash" as alternative for Argon2 or Scrypt #40

cipriancraciun · 2024-11-15T15:13:05Z

See:

hakavlad · 2024-11-20T13:37:13Z

https://github.com/samuel-lucas6/Cryptography-Guidelines?tab=readme-ov-file#password-hashingpassword-based-key-derivation

Avoid (not in order because they’re all bad):

Balloon hashing: arguably better than Argon2 since it's similar in strength whilst having a more impressive design (e.g. no separate variants, resistance to cache attacks, easy to implement with standard cryptographic hash functions, and performant). Unfortunately, it has seen virtually no adoption. There seems to be no information on recommended parameters, the reference implementation is henrycg/balloon#5 (comment), there are no official test vectors, there's no RFC draft, and only a handful of people have implemented the algorithm, with it not being in any popular libraries. Therefore, just use Argon2, which has now been standardised and widely adopted.

cipriancraciun · 2024-11-20T14:48:22Z

I'll take one-by-one your arguments:

arguably better than Argon2 [...]

at this moment, I don't think any PBKDF algorithm can be called "the best" based on actual facts and research; yes, Argon2 has won the password hashing contest, but since then there have been many more contenders (yescrypt for example, Balloon hashing another one), there have been some research papers, but nothing definitive...
to conclude, all we have is mainly suggestions and best-practices;

[...] easy to implement with standard cryptographic hash functions, and performant [...]

and this is exactly why I consider Balloon hashing better than Argon2:
- it can use any hash function (and I do like Blake3), thus I can reduce the amount of cryptographic primitives in my code;
- I can actually understand what is happening in there;
- as opposed to Argon2 which (before being standardized in a RFC) had quite a few missing details from the paper, and is very convoluted;

Unfortunately, it has seen virtually no adoption.

the same was true for both Bcrypt, Scrypt, and Argon2 before they became best practices;

there seems to be no information on recommended parameters

I don't see this as a problem, because in the end it doesn't matter what the "recommended parameters" are, if I can't run it on a RaspberryPi Zero (which has only 512 MiB of RAM), or it takes half a minute to execute;
the best parameters are thus identified through trial and error, until they "work" for a given use-case;

the reference implementation is [no-longer maintained]
[...] and only a handful of people have implemented the algorithm, with it not being in any popular libraries.

it doesn't matter, as the algorithm is quite trivial to implement;

there are no official test vectors

it doesn't matter, as there isn't yet a standard one;

there's no RFC draft

indeed, but there is active work on one;
most of my tool doesn't follow an RFC, thus there isn't much additional risk; (i.e. I already have a lot of experimental constructs;)

As such, I don't see yet any compelling reason why not to use Balloon Hash.

hakavlad · 2024-11-20T14:53:08Z

I'll take one-by-one your arguments:

BTW it's arguments of Samuel Lucas

hakavlad · 2024-11-20T14:58:37Z

@samuel-lucas6 Could you comments this, please?

samuel-lucas6 · 2024-11-21T17:50:24Z

I'd recommend sticking with Argon2, at least for the time being. It's among the best we have in practice but not perfect. It should be stronger and significantly faster than Balloon/BKDF. I don't see any reason to use scrypt over Argon2.

I don't think it's a good idea to use algorithms that don't have a proper specification or that have a specification that's subject to change. This doesn't have to be an RFC, but it needs to be documented in some fashion with rationale.

I technically did do a version of the Internet-Draft for the Rust Crypto implementation of Balloon, but it's not the ideal way to specify Balloon. Then BKDF is a work in progress that I'd like further feedback on, just cryptographers/engineers aren't particularly interested. There doesn't seem to be as much research/reading on password hashing compared to some other topics, and it feels like most people have given up after Argon2. The non-academic, cache-hardness proposals like Pufferfish2 and bscrypt have the same issue with no specification and a lack of peer review. They shouldn't be considered suitable for production.

I agree with the remainder of what you've said.

cipriancraciun · 2024-11-21T17:55:25Z

This doesn't have to be an RFC, but it needs to be documented in some fashion with rationale.

Well, I assume that the original paper describing the algorithm does fit this bill, right?

samuel-lucas6 · 2024-11-22T08:33:21Z

Well, I assume that the original paper describing the algorithm does fit this bill, right?

Nope because they didn't take the time to properly specify the algorithm. It leaves lots of implementation details unspecified, meaning you'll end up with non-interoperable implementations that potentially offer different security properties/levels of security.

cipriancraciun · 2024-11-22T09:25:08Z

[...] It leaves lots of implementation details unspecified [...]

Besides not specifying which hash algorithm to use, I don't see any meaningful missing details. Could you point to some?

[...] you'll end up with non-interoperable implementations [...]

To put things into perspective, this project offers an "experimental" encryption scheme based on ChaCha20, Blake3, X25519, and Argon2, but it doesn't implement any existing standard, outside of "best practices". Thus, interoperability is not an issue.

samuel-lucas6 · 2024-11-23T10:44:22Z

Besides not specifying which hash algorithm to use, I don't see any meaningful missing details. Could you point to some?

The paper pseudocode doesn't define things like ints_to_block(), the size of the counter, the max parameters, the encoding, and how to avoid canonicalization attacks. Then there's Balloon and Balloon-M as well as other variants of Balloon in older versions of the paper (e.g., Double-Buffer).

To put things into perspective, this project offers an "experimental" encryption scheme based on ChaCha20, Blake3, X25519, and Argon2, but it doesn't implement any existing standard, outside of "best practices". Thus, interoperability is not an issue.

Those are all algorithms with a specification. That sounds more like a protocol than a scheme. I don't think there's anything wrong with doing something like ChaCha20-BLAKE3 as an AEAD with justification, but there's very little justification for using Balloon currently. If you're certain you want to implement it, I'd recommend writing a specification for your version of Balloon.

cipriancraciun · 2024-11-23T11:00:18Z

[I find all of the issues a bit like nit-picking...]

The paper pseudocode doesn't define things like ints_to_block()

Well, let's assume they are 64 bits in size, and just represent them as big-endian bytes and concatenate them.

the size of the counter

64 bit big-endian again.

the max parameters

Which "max parameters"? The rounds and memory size can be arbitrarily large without any issues.

the encoding

The "encoding" of what?

and how to avoid canonicalization attacks.

I believe this is a solved problem, especially since the algorithm works only with counters or indices (64 bit thus) or full blocks.

Then there's Balloon and Balloon-M as well as other variants of Balloon in older versions of the paper (e.g., Double-Buffer).

Indeed, this is the biggest issue, but one could use the latest variant described in the paper, because most other papers have chosen the latest variant to review.

With regard to the parallel version (I'm assuming it's the -M) version, I would just skip it for the purpose of this project, because my target includes low-power devices (think of RaspberryPi Zero), thus parallelism doesn't actually work.

samuel-lucas6 · 2024-11-23T11:18:37Z

[I find all of the issues a bit like nit-picking...]

In that case, I'm going to unsubscribe from this conversation as there's no point going around in circles. I will just leave you with the following:

Little-endian is generally preferred. Ascon is getting changed from big-endian to little-endian.
The rounds and memory size can't be arbitrarily large because the counter will overflow, even with a 64-bit counter.
The counter and any integer values need to be converted to bytes.
The password and salt are variable-length so canonicalization attacks aren't a solved problem.
to_int() also isn't defined, which I skimmed over.
Using the Balloon paper suffers from other issues like salt-based memory accesses, modulo bias, poor performance, a lack of variable-length output, etc.

cipriancraciun · 2024-11-23T12:48:35Z

In that case, I'm going to unsubscribe from this conversation as there's no point going around in circles.

I am sorry, I didn't meant to offend you.

The rounds and memory size can't be arbitrarily large because the counter will overflow, even with a 64-bit counter.

Here by "arbitrarily" I meant some limit that does make sense in the real world. One would have to apply limits in a practical sense, else the actual code wouldn't be able to run (wouldn't be able to actually allocate memory), or wouldn't finish (if the iterations are too large).

With regard to the 64bit counter overflow, given what I've read from the paper, I believe its main purpose is to aid the security proof, because wherever hash is used, it's inputs already depend on a previous step, and thus can't be parallelized even without the counter.

However, the counter can be then increased to a 128 bit integer, or even let to wrap, as I don't believe the whole strength of the algorithm lies in the counter. (Even with 128 bits, if one chooses the max 64 bit value for both memory and size, it can overflow, as for each iteration it's incremented 1+delta*2 times.)

Using the Balloon paper suffers from other issues like salt-based memory accesses

Indeed, but there are two choices, neither of which are good:

depend on salt -- side-channels might leak information about the salt;
depend on password -- side-channels might leak information about the password;
plus, do both -- side-channels might leak some (less) information about both;

modulo bias

With regard to modulo bias, mod is used only twice:

once to choose the "previous" block wrapping around the blocks vector if needed, thus no modulo bias here;
and once when choosing the other blocks, where indeed modulo bias might be an issue, but the inputs to mod is a hash, and the outcome of the mod is again hashed, thus perhaps evening the odds; (thus I doubt it really impacts the overall quality;)

poor performance

Compared to what?

a lack of variable-length output, etc.

Not an issue for my particular use-case. (I need a function that takes a 32 byte key (salt), some data (password), munge on it to thwart brute-force guessing attacks, and spit out another 32 bytes.)

hakavlad · 2024-11-23T18:07:35Z

neither of which are good

depend on salt -- side-channels might leak information about the salt;

Salt is not secret. Then what is the problem in this case?

cipriancraciun · 2024-11-23T18:10:11Z

Salt is not secret. Then what is the problem in this case?

From what I gather, the general consensus is that an attacker could observe which salt is being used, kind of fingerprinting if not exactly the exact value of the salt, for example by noticing patterns in memory access.

Thus, it can expose "which" user is login-in, or which key is being encrypted / decrypted.

hakavlad · 2024-11-23T18:34:00Z

IMHO revealing non-secret data is better than revealing secret data.

or which key is being encrypted

This is like metadata leakage, not secret data leakage.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Investigate "Balloon Hash" as alternative for Argon2 or Scrypt #40

Investigate "Balloon Hash" as alternative for Argon2 or Scrypt #40

cipriancraciun commented Nov 15, 2024 •

edited

Loading

hakavlad commented Nov 20, 2024

cipriancraciun commented Nov 20, 2024

hakavlad commented Nov 20, 2024

hakavlad commented Nov 20, 2024

samuel-lucas6 commented Nov 21, 2024

cipriancraciun commented Nov 21, 2024

samuel-lucas6 commented Nov 22, 2024

cipriancraciun commented Nov 22, 2024

samuel-lucas6 commented Nov 23, 2024

cipriancraciun commented Nov 23, 2024

samuel-lucas6 commented Nov 23, 2024

cipriancraciun commented Nov 23, 2024

hakavlad commented Nov 23, 2024

cipriancraciun commented Nov 23, 2024

hakavlad commented Nov 23, 2024

Investigate "Balloon Hash" as alternative for Argon2 or Scrypt #40

Investigate "Balloon Hash" as alternative for Argon2 or Scrypt #40

Comments

cipriancraciun commented Nov 15, 2024 • edited Loading

hakavlad commented Nov 20, 2024

cipriancraciun commented Nov 20, 2024

hakavlad commented Nov 20, 2024

hakavlad commented Nov 20, 2024

samuel-lucas6 commented Nov 21, 2024

cipriancraciun commented Nov 21, 2024

samuel-lucas6 commented Nov 22, 2024

cipriancraciun commented Nov 22, 2024

samuel-lucas6 commented Nov 23, 2024

cipriancraciun commented Nov 23, 2024

samuel-lucas6 commented Nov 23, 2024

cipriancraciun commented Nov 23, 2024

hakavlad commented Nov 23, 2024

cipriancraciun commented Nov 23, 2024

hakavlad commented Nov 23, 2024

cipriancraciun commented Nov 15, 2024 •

edited

Loading