Hacker News new | past | comments | ask | show | jobs | submit login

Using "hash" to mean "strongly collision-free function" is a valid terminological choice; in a wide-readership piece like this it should ideally be pointed out in order to avoid confusion, but there are many fields where that definition would be assumed without statement.

Getting confused between hash functions and password-based key derivation functions, on the other hand, is absolutely inexcusable; that very confusion is why there are so many people making the mistake of using SHA256(salt . password) as a key derivation function. People hear that MD5 is broken but SHA256 isn't; not realizing that MD5's breakage as a hash function does not impact its security as a PBKDF, and not realizing that MD5 and SHA256, being not designed as PBKDFs are both entirely inappropriate for use in that context.

Also, for reference: MD5(12 printable ASCII characters) ~ bcrypt(8 printable ASCII characters) ~ scrypt(6 printable ASCII characters) in terms of cost-to-crack on sophisticated (ASIC/FPGA/GPU) hardware.




Not to quip, but surely some of the blame for that confusion has to lie on the cryptography experts who insist on using lung-spasm-inducing hairballs like "PBDKF" (edit: typo there was, I swear, unintentional!) to represent concepts better explained in English as "expensive to compute".


"Password based key derivation function".

The problem here isn't that cryptographers are inflicting a terrible name at you, it's that security people have repurposed a function intended for one purpose (generating crypto keys) for another (storing passwords).

Also: don't blame us, blame the PKCS standards group.


Sure sure. What I'm really saying is that the core concepts here are "collision resistance" and "susceptibility to brute force", and that those are comparatively easy to grok for a typical developer. But they get scared to death when they read stuff like (verbatim from cperciva above) "MD5's breakage as a hash function does not impact its security as a PBKDF, and not realizing that MD5 and SHA256, being not designed as PBKDFs" which makes it sound more subtle than it is.


To clarify: the clear English way to write that would be something like "Colliding passwords aren't a problem, because it's cheaper to search the password space anyway, so you can use MD5 there securely. But MD5 and SHA256 were designed to be cheap to compute, so they're easy to brute force. Use a password function designed to be expensive instead." Charging off down the jargon road can only complicate things. And doing so in a post about people being confused (!) about the subject is frankly hurting, not helping.


But what he's saying is very straightforward:

The MD5 vulnerabilities that this post talked about don't change the security of PBKDF-MD5 (or any iterated MD5 password hash), because password hashes aren't used to authenticate data. Similarly (this is a little mindbending): HMAC-MD5 has no known viable attacks, so many crypto protocols that use MD5 don't have viable attacks.

and

Because they all seem like acronyms, people get MD5 and SHA256 (which are core hash functions, "primitives" in a cryptosystem) with PBKDF, which is a standard for turning passwords into crypto keys.

(I know you know both, I'm just trying to restate).


I don't see that it was repurposed; "key derivation function" != "encryption key derivation function", and I prefer the term "login key" over "password hash".


Since there are good password hashes that are also bad KDFs, I'm not sure I agree with the equivalence.


There are?


... I think so?

They don't all work this way but this is one of those disagreements where you're more likely to be right than me.


Good point. Isn't the only PBKDF requirement artificial and technically unnecessary computational delays? I mean far beyond what is necessary to actually produce a strongly collision free function.


The key requirement for a password-based key derivation function is that performing a brute force search is expensive. The PBKDF2 and bcrypt functions approximate this by scaling the amount of computation required; but this is imperfect since they can both be implemented on small circuits, making highly parallel attacks using GPUs/FPGAs/ASICs feasible. The scrypt function scales the amount of RAM required -- and thus the circuit size -- as well, making it far more expensive to attack since you can't use the same sort of cheap highly-parallel crunching.

The best reference for this analysis is the scrypt paper: http://www.tarsnap.com/scrypt/scrypt.pdf


Also PBKDF2 is not designed to be hard to crack on a GPU. PBKDF2 is actually notoriously easy to implement on a GPU or FPGA due to its small fixed amount of ram. It's probably not a huge deal since taking a large fraction of a second on a CPU to validate a login would translate into taking a large fraction of 1/150 of a second per password crack attempt on a GPU so it still is decent security.

For something designed to defeat specialized hardware, see scrypt.

[edit] Just noticed who I was replying to. Now that's funny.


A bigger reason PBKDF2 is easy to implement on GPUs is that it's (almost always) built out of primitives that are already widely available in GPU implementations, like SHA1 and SHA256.

That's why I recommend PBKDF2 with the PRF set to the authentication tag from FEAL4 in GCM mode. Nobody's got that in a GPU! Arrrrr!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: