Dear tech journalists, please stop saying stuff like "But we have yet to find ou...

mquander · on June 8, 2012

I just googled "PBKDF2 PHP" and the first page was full of free implementations. But maybe it's cheating, since I know what "PBKDF2" is. I tried to simulate what a totally ignorant person would do, and googled "PHP password." The second result was the PHP manual page on passwords, where it explains in eleven different languages, using simple words, exactly what the deal is with password hashing, and refers people to two built-in functions (crypt() and hash()) that handle both bcrypt and PBKDF2.
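For readers who want to see roughly what "securely hashing my passwords" amounts to, here is a minimal sketch in Python rather than PHP. It uses only the standard library's PBKDF2. The iteration count, salt length, and function names are illustrative assumptions, not taken from the PHP manual or from any particular site's code.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a new password, using a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Store the salt and digest together with the user record, and call `verify_password` at login.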

Exactly how much easier does it need to get? Shall we print out the manual page and put it under people's doorsteps?

It would take like a maximum of twenty minutes for anyone at all, armed with Google and Stack Overflow, to go from "I know nothing at all about password hashing" to "I am securely hashing my passwords" in PHP or any other language. I think it's fair to wonder what the fuck is wrong when, in companies full of tens or hundreds of presumed-competent programmers, nobody does that, ever.

capsule_toy · on June 8, 2012

LinkedIn was launched, in what, 2003? If you googled the general advice back then, it was pretty much just use MD5 or, if you were really cutting edge, SHA1. Salting wasn't common at all. Salting eventually started becoming common and now you're silly if you don't use bcrypt.

phkamp · on June 8, 2012

This is more a reflection on where you got your advice in 2003 than on what was considered best practice.

Salting became best practice in the 1980s, but the "lost generation" of dot-com wizards never bothered reading "all that old stuff", so they are doomed to repeat the mistakes.

xtracto · on June 8, 2012

I remember reading about salting back in the '90s when I got my first copy of FreeBSD. It definitely is an old concept.

mquander · on June 8, 2012

I wasn't doing web development in 2003, so I can't really argue. But it's been 9 years since 2003, and there's been a tremendous amount of light and noise about the dangers of weak hashing strategies during that time. I'm sure that LinkedIn has a zillion programmers who follow programming blogs, read HN, and so on, so I can't understand why none of them have just sat down and fixed it. Even if it takes half a day once you add in documentation, QA, deployment, and so on, this seems like a completely obviously worthwhile half-day.

espeed · on June 8, 2012

Dear tech journalists, please stop saying stuff like "But we have yet to find out why nobody objected to them protecting 150+ million user passwords with 1970s methods."

Poul-Henning Kamp (http://en.wikipedia.org/wiki/Poul-Henning_Kamp) is not a "tech journalist."

willvarfar · on June 8, 2012

Poul-Henning Kamp is many things, but journalist?

He is allowed to say stuff like "But we have yet to find out why nobody objected to them protecting 150+ million user passwords with 1970s methods."

And this is LinkedIn. They should know and do better.

I actually imagine that their very gifted developers are running around wondering how they themselves didn't audit this.

willvarfar · on June 8, 2012

> I actually imagine that their very gifted developers are running around wondering how they themselves didn't audit this.

Or perhaps it's that some 3rd party can authenticate users using SHA1 passwords, i.e. internally LinkedIn passwords are scrypted or something, but this dump came from a MitM between a 3rd-party plugin and LinkedIn?

lurker14 · on June 8, 2012

I can't imagine that the person responsible for the database can look his colleagues in the eye. He must have called in sick the day after the leak and is not coming back to the office.

You can only imagine how many times someone noticed that passwords weren't salted (by comparing stored passwords to a leaked set of hashes or rainbow tables after another announcement from some company being hacked) and complained, and got brushed off.
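A small Python sketch of why that comparison works: an unsalted hash of a given password is the same everywhere, so a stored value can be looked up directly in someone else's leaked list. The passwords and helper names below are made up for illustration, and the salted variant only demonstrates the effect of the salt; a single fast hash is still not an adequate password scheme.

```python
import hashlib
import os


def unsalted_sha1(password: str) -> str:
    """Unsalted SHA-1: identical passwords always give identical hashes."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def salted_sha256(password: str, salt: bytes) -> str:
    """Salted hash: the same password gives a different hash per salt."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Hypothetical set of hashes leaked from some other site.
leaked = {unsalted_sha1(p) for p in ("password", "123456", "linkedin")}

# A value stored in our own (unsalted) table can be matched by plain lookup.
stored = unsalted_sha1("123456")
found_in_leak = stored in leaked

# With per-user salts, two users with the same password no longer collide.
salt_a, salt_b = os.urandom(16), os.urandom(16)
same_with_salt = salted_sha256("123456", salt_a) == salted_sha256("123456", salt_b)
```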

phkamp · on June 8, 2012

I think you're setting the bar too high for tech journalists; let's aim for them knowing the difference between "md5" and "md5crypt" first.

But no, I don't think it is at all obvious why LinkedIn used unsalted SHA1.

LinkedIn went through an IPO, which implies that a number of companies have audited them from head to tail several times along the way.

If the commodity you buy is millions of user accounts, shouldn't you, as an investor, at least check that there was a lock on the door to the warehouse?

Argorak · on June 8, 2012

> So suppose you are developing an agile product, someone loses access to their account and asks for a new password, you type `head -c 9 /dev/urandom | base64`....`UPDATE users`

I don't think I ever want to be _that_ agile. My agile projects usually have a set of application functions exposed as scripts immediately. And yes, proper password change is one of them. (besides, how about just using `pwgen 16` and not some trickery with head and random?)

Second goal: establish a process that flags anyone who tries to change things through phpMyAdmin when a proper equivalent exists in your scripting toolkit. Agility is no excuse for sloppiness. If the agile crowd still insists on being agile to death, call the whole thing MVT (Minimum Viable Toolkit).

Using a framework where all this can be done from a REPL also helps a lot.

phkamp · on June 8, 2012

... and since you are not Danish, you don't realize that base64 emits the Danish word "badeanstalts", and you therefore fall to even the most trivial dictionary attack.

dchest · on June 8, 2012

9 random bytes encoded as 12 base64 characters is still 72 bits of random data, i.e. 2^72 equally likely outputs. You'll be hit by a meteorite much sooner than you randomly generate "badeanstalts".
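The arithmetic can be checked with a short Python sketch that mirrors `head -c 9 /dev/urandom | base64`. The variable names are illustrative, and `os.urandom` stands in for reading `/dev/urandom` directly.

```python
import base64
import os

# 9 random bytes, as in `head -c 9 /dev/urandom`.
raw = os.urandom(9)

# 9 bytes is a multiple of 3, so base64 yields 12 characters with no padding.
token = base64.b64encode(raw).decode("ascii")

bits = len(raw) * 8      # bits of randomness in the token
outcomes = 2 ** bits     # number of equally likely tokens

# "badeanstalts" is 12 valid base64 characters, so it is exactly one of
# those outcomes: it decodes to one specific 9-byte string.
target = base64.b64decode("badeanstalts")
chance_of_target = 1 / outcomes
```

So the word can come out, but only with probability 1 in 2^72 per generated password.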

ma2rten · on June 8, 2012

I can see why people don't use bcrypt/PBKDF2: they don't know or it's not a priority. Your reason, however, doesn't strike me as a particularly good one: you could just write a quick password reset tool in PHP or, even better, write a quick shell script that spits out the password reset query.
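As a rough illustration of such a tool, here is a sketch in Python rather than PHP or shell. Everything specific in it is an assumption: the `users` table with `salt`, `password`, and `id` columns, the `%s` placeholder style, the iteration count, and the `build_reset` name. It only builds the query and its parameters; it does not talk to a database.

```python
import hashlib
import os
import secrets

# Assumed work factor for illustration only.
ITERATIONS = 200_000


def build_reset(user_id: int) -> tuple[str, str, tuple]:
    """Generate a new random password plus the UPDATE that stores its hash.

    Returns (plaintext_password, sql, params). The table and column names
    are hypothetical; adapt them to the real schema.
    """
    new_password = secrets.token_urlsafe(12)  # 12 random bytes -> 16 characters
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", new_password.encode("utf-8"), salt, ITERATIONS
    )
    sql = "UPDATE users SET salt = %s, password = %s WHERE id = %s"
    params = (salt.hex(), digest.hex(), user_id)
    return new_password, sql, params
```

The plaintext goes to the user once; only the salt and digest end up in the query.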

And I think LinkedIn really has no excuse.

drostie · on June 8, 2012

I'm, I guess, a little hesitant to follow up on these because the original post is getting strongly downvoted, but I tend to agree somewhat. This is the same as I was telling people at the company -- "just use the PHP API we've developed!"

It turns out that this is a bit complicated, as my colleagues readily pointed out to me. So for example you are basically saying "write a .PHP file and execute it locally," which is perfectly fine, as long as the problem comes from your boss or one of your testers -- it is risky when it occurs on a production server (because the script you're generating is insecure). On your production server you really do want to execute the action from within a MySQL prompt if it's possible, and so it's a sort of two-sided game of "I'm going to reset my password over here and then update their (salt, password) with the result of my local PHP queries," and that's a bit weird as a process.

The other tender point is that once you've made a choice, it's very hard to change it. So, "all of our existing passwords use the old system, we're not changing!" was a very strong argument and I did have to spend a bunch of time creating a fall-back for legacy passwords.
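One common shape for such a fall-back is to verify against the old scheme at login and re-hash on success. The Python sketch below assumes the legacy scheme was unsalted SHA-1 and that each record carries a `scheme` field; both are assumptions for illustration, not a description of the system discussed here.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only.
ITERATIONS = 200_000


def pbkdf2(password: bytes, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS)


def check_and_upgrade(record: dict, password: str) -> bool:
    """Verify a login; upgrade legacy records in place when the password matches.

    `record` is a hypothetical user row with keys "scheme", "salt", "hash".
    """
    pw = password.encode("utf-8")
    if record["scheme"] == "pbkdf2":
        return hmac.compare_digest(pbkdf2(pw, record["salt"]), record["hash"])

    # Legacy path (assumed here to be unsalted SHA-1).
    if not hmac.compare_digest(hashlib.sha1(pw).digest(), record["hash"]):
        return False

    # The plaintext is only available at login, so this is the moment to re-hash.
    salt = os.urandom(16)
    record.update(scheme="pbkdf2", salt=salt, hash=pbkdf2(pw, salt))
    return True
```

Accounts that never log in again keep their legacy hash, so a scheme like this usually needs a separate plan for dormant users.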

I would agree, however, with this: in general there is a reasonable expectation of, "if we're doing this so much that it bogs us down, then the app is mature enough for a proper email-sending password-reset tool; and as long as it doesn't bog us down we'll do it the hard way." But convincing people to make the hard way even harder is a tricky proposal even on a good day. It's like telling people, "no, leave that code, I know it does 2^n operations but n is always small and it's not actually the part that slows down our system and it's more readable this way." The intangible -- security/readability -- is being traded against the tangible -- dev-ease/speed. I had trouble selling it to the other devs.

mkup · on June 8, 2012

Why don't you write a small program in C and call it as an external process from your PHP code to perform the resource-intensive computations? The database should have nothing to do with computing user password hashes.