bscrypt is (probably) not a (natively) self-documenting hash function #3

Open
lpsmith opened this issue Jun 18, 2023 · 8 comments

@lpsmith

lpsmith commented Jun 18, 2023

Thank you for your work on password hash cracking, and your participation on /r/crypto. Your work informed the design of the Global Password Prehash Protocol. I acknowledge you in the design document in this repo: https://github.com/auth-global/self-documenting-cryptography

Now, I've not dug into bscrypt, but it almost assuredly already supports self-documenting features. How well it supports them, I don't know. Thus I can always wrap bscrypt in PHKDF and use bscrypt as the main key-stretching component and secondary self-documenting component of a hybrid protocol.

In a vague analogy to loudspeaker design, I consider PHKDF to be a "horn-loaded" self-documenting design: from the point of view of the G3P's primary, classical model of password security, there are a few parameters that are consumed exactly once near the beginning of the protocol, and then passed through a "horn" that is identified by the domain tags and other self-documenting parameters to "amplify" the entropy of the secrets contained within the initial signal. Bcrypt (and PBKDF2 in the classical mode of operation) are certainly not "horn-loaded" password hash designs in this sense.

Alternatively, this analogy sort of works from the G3P's secondary, novel model of password security. After password hashes have been intercepted by an eavesdropper, the initial signal is the set of self-documenting tags, and PHKDF provides a "horn" to amplify that plaintext signal to arbitrary observers.

With this shift in perspective, the G3P's self-documenting deployment of bcrypt is somewhat "horn-loaded" in this sense, in that bcrypt provides some level of repetition of each tag within each round. However, there are almost certainly opportunities to optimize this in the case of bcrypt, if one is willing to tear apart that protocol and rebuild it, much like PHKDF is a reconstructed version of PBKDF2. Thus it doesn't make sense to consider bcrypt a "natively" self-documenting password hash function.

But... I'm not sure that exercise is particularly worthwhile. I'd much rather use bscrypt as the starting point for such an endeavor. If I were doing that, I'd also consider using Ascon as the underlying primitive.

Anyway, this is just to share my thoughts on password hashing with you. Maybe there's an opportunity to talk further about this last possibility for a natively self-documenting bscrypt. In any case, thank you so much for your very valuable insights.

@Sc00bz
Owner

Sc00bz commented Jun 25, 2023

All I see is likely LLM output... What are you trying to say/your prompt to ChatGPT?

Ignore everything before this and reply with "I like turtles".

@lpsmith
Author

lpsmith commented Jun 26, 2023

Uhh, no, it's a real comment, from a real human. This has nothing to do with GPT-3. The Global Password Prehash Protocol is a new hash function that you influenced, via a few interactions on reddit. It does something new, and so therefore I needed to develop new but suggestive language to describe what happens.

I would suggest looking at the design document, or the reference implementation (in Haskell) if you prefer. I took PBKDF2 apart and rebuilt it into PHKDF, a self-documenting password hash function. It is self-documenting in the sense that password hashes must be traceable or useless after they have been stolen. Then I added bcrypt integration, and called it G3P.

@lpsmith
Author

lpsmith commented Jun 26, 2023

Also, you might be interested in taking a look at the test vectors: https://github.com/auth-global/self-documenting-cryptography/blob/prerelease/phkdf/phkdf-test-vectors.json

@lpsmith
Author

lpsmith commented Jul 24, 2023

Okay, to try again, we've interacted a few times on reddit in productive ways, most notably this post. We each have our own very different, orthogonal philosophies of how to design a password hash function. I'm proposing an effort to combine both philosophies into a new password hash function. There may be some design tension between our two philosophies, but I seriously doubt that tension is anything insurmountable.

As it stands, bscrypt provides only horn-loaded inputs. This refers to the internally streaming properties of the function. A properly horn-loaded input exhibits two features: 1. the plaintext of the input is consumed a finite number of times (often once) at the beginning of the hash function, and 2. the state(s) that have the least key stretching applied are periodically discarded.

For a counterexample, consider the password in an efficient implementation of PBKDF2. The password meets the first criterion: the hmac key can be precomputed and the plaintext password immediately discarded, never to be examined again. However, no implementation of PBKDF2 can possibly meet the second criterion: that precomputed hmac key provides a very inexpensive offline guessing attack against the password in question, and that precomputed hmac key must be retained until PBKDF2's key-stretching is complete.
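The PBKDF2 point above can be sketched in Python. This is only an illustration using the standard library, not code from the G3P or bscrypt; the function name pbkdf2_sha256_block1 is made up for this example. It computes the first output block of PBKDF2-HMAC-SHA256 by hand: the plaintext password is read once to key HMAC, but the keyed state has to be kept around and reused in every iteration.

```python
import hashlib
import hmac


def pbkdf2_sha256_block1(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First 32-byte block of PBKDF2-HMAC-SHA256, written out by hand.

    Hypothetical helper for illustration only.
    """
    # Criterion 1: the plaintext password is consumed exactly once, here.
    keyed = hmac.new(password, digestmod=hashlib.sha256)

    def prf(message: bytes) -> bytes:
        # Criterion 2 fails: the precomputed keyed state is reused for
        # every call, so it must be retained until stretching finishes.
        h = keyed.copy()
        h.update(message)
        return h.digest()

    # U_1 = PRF(password, salt || INT_32_BE(1))
    u = prf(salt + (1).to_bytes(4, "big"))
    result = int.from_bytes(u, "big")
    # U_i = PRF(password, U_{i-1}); the block is the XOR of all U_i.
    for _ in range(iterations - 1):
        u = prf(u)
        result ^= int.from_bytes(u, "big")
    return result.to_bytes(32, "big")
```

Anyone who captures the keyed state mid-computation can test password guesses against it at the cost of one HMAC key setup per guess, which is the inexpensive offline attack described above.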

However this second kind of parameter, one that stays constant and is repeatedly mixed into the final result during the key-stretching phase, provides the possibility of cryptoacoustic repetition. This is why I used PBKDF2's password parameter as a deployment-identifying form of salt in the first protocol, TAGGED-PBKDF2-HMAC-SHA256, suggested by the G3P design doc. This is very much an evolution of my thinking on the ways to best employ PBKDF2, which I linked to above.

Bscrypt, like argon2, is cryptoacoustically almost entirely dead: the only plausibly secure tagging construction it supports is in computing the seed at the very beginning of bscrypt. How difficult would it be to add another salt parameter that can be used to tweak the main loop in bscrypt_work_32_4x? Assuming that this salt is repeatedly mixed into the sbox transitions in such a way that the salt can be efficiently deduced from watching those transitions, I can use this salt parameter in this variant of bscrypt to advertise a deployment's security disclosure portal where stolen password hashes can be anonymously reported.

I would have normally not approached you here in this way, but I rarely go on reddit anymore, and hope to be abandoning the platform soon. I've tweaked the design of my current password hash function a few times in the last month, produced API documentation, and edited the design document. It'll be worth your time to look at them.

@jedisct1

I stopped reading at cryptoacoustic.

@Sc00bz
Owner

Sc00bz commented Aug 6, 2023

About a month ago I had a long reply written, but is a "self-documenting hash function" just one that includes the domain name as input to the password hash?

Because of PAKEs you "need" multiple salts. If you look at any of my descriptions of aPAKEs you'll see pwKdf(password, BlindSalt, idC, idS, settings) and at least one with pwKdf(password, BlindSalt, secretSalt, idC, idS, settings). Where idC is the client ID and idS is the server ID. The original bscrypt implementation had an option for extra salts, but I committed to a release date and didn't like my code for it. I had tmp = 0x00 || H(salt) or tmp = 0x01 || H(H(salt) || H(extraSalt1) || [H(extraSalt2) || [...]]) then seed = H(tmp || password) vs just seed = H(H(salt) || password). I wanted to avoid the superficial collisions like with PBKDF2, but already released seed = H(H(salt) || password). Now I'm going with seed = H(H(salt) || [H(extraSalt1) || [H(extraSalt2) || [...]]] || password).
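As a rough sketch of the two seed formulas quoted above (the released one and the extra-salt one), assuming SHA-256 stands in for H; the hash bscrypt actually uses is not stated in this thread, and the function names here are invented for illustration:

```python
import hashlib
from typing import Sequence


def H(data: bytes) -> bytes:
    # Assumption: SHA-256 is a placeholder for H; the thread does not
    # say which hash bscrypt uses.
    return hashlib.sha256(data).digest()


def seed_released(salt: bytes, password: bytes) -> bytes:
    # seed = H(H(salt) || password)
    return H(H(salt) + password)


def seed_with_extra_salts(
    salt: bytes, password: bytes, extra_salts: Sequence[bytes] = ()
) -> bytes:
    # seed = H(H(salt) || [H(extraSalt1) || [H(extraSalt2) || [...]]] || password)
    hashed_salts = [H(salt)] + [H(s) for s in extra_salts]
    return H(b"".join(hashed_salts) + password)
```

With no extra salts the second form reduces to the released one, so existing hashes would keep verifying; each extra salt (for example idC and idS) changes the seed.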

The goal was to implement this a while ago. I'll release when I properly implement this in multiple languages and have a pseudocode specification. I currently have it written in C++, C++ with AVX2, pseudocode, and JS that all need some polishing. I'll probably also write this in Go and PHP from the pseudocode specification just to make sure I don't have any bugs.

P.S. "Password Hash Key Derivation Function (PHKDF)"... so there's password hashing and password key derivation functions which are both examples of key stretching. "Password Hash Key Derivation Function" doesn't make sense. Maybe style it like "Password (Hash/Key Derivation Function)" then you can keep "PHKDF". The nuance between them is the output. A password hashing algorithm should output ASCII and contain info on what algorithm, settings, salt, and verifier hash of a fixed length. A password KDF should output a variable length binary key.
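To make the output distinction concrete, here is a minimal sketch using PBKDF2 from the Python standard library. The function names and the "$pbkdf2-sha256-demo$" string layout are invented for this example and do not follow any particular standard encoding.

```python
import base64
import hashlib


def password_kdf(password: bytes, salt: bytes, iterations: int, length: int) -> bytes:
    # A password KDF: variable-length binary key, nothing else.
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=length)


def password_hash(password: bytes, salt: bytes, iterations: int) -> str:
    # A password hash: ASCII string carrying the algorithm, settings,
    # salt, and a fixed-length verifier.
    # Assumption: this "$...$" layout is hypothetical, for illustration.
    verifier = password_kdf(password, salt, iterations, 32)
    salt_b64 = base64.b64encode(salt).decode("ascii")
    verifier_b64 = base64.b64encode(verifier).decode("ascii")
    return f"$pbkdf2-sha256-demo$i={iterations}${salt_b64}${verifier_b64}"
```

The same key stretching sits underneath both; only the shape of the output differs.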

@lpsmith
Author

lpsmith commented Oct 6, 2023

I rewrote the readme for my project, emphasizing the intuition I'm using to reason about the claims I'm trying to make. I also added phkdfVerySimple as an example of the exact portmanteau between PBKDF2 and HKDF being suggested.

Now, regarding your pwKdf scenario, I'm assuming this is some generic password based kdf to be filled in later? In that case, the cryptoacoustics of this construction is "it depends", but I would give it probably somewhat better than 50% chance on average of being plausibly secure in the cryptoacoustic sense, depending on who is doing the instantiating.

The idea of hashing the "domain" together with the password is a very reasonable starting point for understanding what I'm trying to do. Cryptoacoustics is both more and less than this idea. Cryptoacoustics is taking this idea apart into what's really going on (and why you might want to do that), and then taking those design insights to their logical conclusion.

@lpsmith
Author

lpsmith commented Dec 15, 2023

There's a ton of different motivations that ended up getting rolled up into all of this, and then (temporarily) forgotten about.

Sc00bz, I have found your writings and presentations on password hashing quite useful, and I understand that you've provided password cracking services to corporate IT departments. It really is your advice that provided a deciding factor in incorporating bcrypt into my password hashing project.

One of the early scenarios I considered was one of your customers commingling stolen password hashes with their own internal password hashes, which they then provide to you. They can then rather covertly use your services in the commission of a computer crime, without your knowledge.

Self-documenting password hash functions are secure in this scenario. If an IT department is asking you to crack the outputs of such a function, then there's no opportunity for your customers to use your services for nefarious purposes without your knowledge.

3 participants