I thought the standard way of migrating your PW hashing function was that you could only do it during a login, because that's the only time you have the PW in plain text. No?
> I thought the standard way of migrating your PW hashing function was that you could only do it during a login, because that's the only time you have the PW in plain text. No?
No. Well, I mean, it's common to do that, but it's a bad idea and it's not necessary to keep all of those vulnerable hashes around.
Say you have a bunch of MD5 password hashes stored, and you want to upgrade your password hashing to bcrypt. Don't wait for the user to log in so you can rehash their plaintext password.
Instead, bcrypt all the MD5s right now. The authentication check becomes bcrypt(MD5(plaintext)) == stored_bcrypt. Then upgrade each account to straight bcrypt the next time its user logs in, since that's when you have the plaintext.
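For anyone who wants to see the shape of it, here's a rough sketch of the wrap-then-upgrade flow. Big assumptions: I'm using PBKDF2-HMAC-SHA256 from Python's standard library as a stand-in for bcrypt so it runs without extra packages, the legacy scheme is plain unsalted MD5 stored as hex, and the record layout and the names `wrap_legacy` and `verify_and_upgrade` are made up for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for the stand-in slow hash


def slow_hash(data: bytes, salt: bytes) -> bytes:
    # Stand-in for bcrypt: PBKDF2-HMAC-SHA256 from the standard library.
    return hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)


def legacy_md5(password: str) -> bytes:
    # Hypothetical old scheme: unsalted MD5, stored as a hex string.
    return hashlib.md5(password.encode()).hexdigest().encode()


def wrap_legacy(md5_hex: bytes) -> dict:
    # Offline migration step: needs only the stored MD5, not the plaintext.
    salt = os.urandom(16)
    return {"scheme": "slow(md5)", "salt": salt, "hash": slow_hash(md5_hex, salt)}


def verify_and_upgrade(record: dict, password: str) -> bool:
    # Recompute the hash the same way the record was built.
    if record["scheme"] == "slow(md5)":
        candidate = slow_hash(legacy_md5(password), record["salt"])
    else:
        candidate = slow_hash(password.encode(), record["salt"])
    if not hmac.compare_digest(candidate, record["hash"]):
        return False
    if record["scheme"] == "slow(md5)":
        # We have the plaintext right now, so drop the MD5 layer.
        salt = os.urandom(16)
        record.update(scheme="slow", salt=salt,
                      hash=slow_hash(password.encode(), salt))
    return True
```

The wrapped value is the hex form of the MD5 rather than the raw bytes. With real bcrypt that matters, because many implementations stop reading the input at a NUL byte.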
> That doesn't really have the same security properties as the original login information.
Well, obviously. For instance, it lacks the property of being trivially vulnerable to a GPU hashing the entire dictionary with a given salt in 30 seconds flat.
No, you just take the original MD5 of lower(pass)+username and run that through your new bcrypt. You will have a more complicated authentication process, but it's worth it.
Yes, in addition to migrating to pure bcrypt/scrypt at next login. I've implemented this migration strategy for two different web sites (not listed in my HN profile) with success.
Not so complicated if you plan for this sort of stuff ahead of time, or even at migration time. You can prefix each stored hash with the name of the algorithm used, and have a library that automatically dispatches on that prefix.
This also makes generating test data easier, as our library has a "plaintext" hash type, so we can just insert "$plaintext$password" rather than having to run a hash on the password.
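Roughly what that kind of library could look like, as a sketch only. The `$scheme$payload` layout follows the "$plaintext$password" example above; the `md5` and `pbkdf2` scheme names, the payload layout for `pbkdf2`, and every function name here are my own assumptions, not the commenter's actual library.

```python
import hashlib
import hmac


def _check_plaintext(payload: str, password: str) -> bool:
    # For test fixtures only; never enable this scheme in production.
    return hmac.compare_digest(payload.encode(), password.encode())


def _check_md5(payload: str, password: str) -> bool:
    digest = hashlib.md5(password.encode()).hexdigest()
    return hmac.compare_digest(payload.encode(), digest.encode())


def _check_pbkdf2(payload: str, password: str) -> bool:
    # Assumed payload layout: iterations$salt_hex$hash_hex
    iterations, salt_hex, hash_hex = payload.split("$")
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(derived.hex().encode(), hash_hex.encode())


CHECKERS = {
    "plaintext": _check_plaintext,
    "md5": _check_md5,
    "pbkdf2": _check_pbkdf2,
}


def verify(stored: str, password: str) -> bool:
    # stored looks like "$scheme$payload"; split only on the first two "$".
    parts = stored.split("$", 2)
    if len(parts) != 3 or parts[0] != "":
        raise ValueError("stored hash has no scheme prefix")
    checker = CHECKERS.get(parts[1])
    if checker is None:
        raise ValueError("unknown hash scheme: " + parts[1])
    return checker(parts[2], password)
```

With this, `verify("$plaintext$password", "password")` succeeds without hashing anything, and adding a new algorithm is one more entry in `CHECKERS`.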
The entropy in the resulting hash is almost the same as a salted MD5 though, is it not? You have more bits, but not significantly more possible hashes.
edit: I guess most people only start out with short passwords, thus limiting the possible entropy anyway...
> The entropy in the resulting hash is almost the same as a salted MD5 though, is it not? You have more bits, but not significantly more possible hashes.
The entropy isn't the significant problem with MD5. The problem is how damn fast you can perform 2 billion MD5 hashes on modern machines.
1: Vulnerability to a pre-computed table. This hits hard if you are using plain-old MD5 without a salt.
2: Vulnerability to a newly computed table. This will hit you if you use a global salt and a fast hashing/encryption algorithm.
3: Vulnerability to high-throughput custom hardware. This will hit you if you use a fast hashing algorithm like MD5. Modern GPU-based crackers can churn through hundreds of billions of hashes per second on fairly inexpensive hardware. That means that when attacking, say, 20 million accounts, it's possible to run through billions of candidate passwords (a full dictionary attack, the top million most common passwords, plus common ways of combining words to make passwords, plus short random alphanumeric strings) for each and every account in only a month of work.
4: Vulnerability to very weak passwords. This will still hit you even if you use a slow hashing algorithm and have no global vulnerability because you use per-record salts (or bcrypt/scrypt).
4 is where you want to be, always. You can't protect against weak passwords, but you can make it expensive for weak passwords to be revealed, and prohibitively expensive for stronger passwords to be systematically cracked.
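To put numbers on point 3, here's the back-of-envelope arithmetic as a sketch. The 100 billion hashes/second and 20 million accounts are taken from the figures above (low end of "hundreds of billions"); the 100,000 hashes/second rate for a slow hash is a purely hypothetical number I picked for contrast, not a benchmark.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 seconds in a 30-day month
ACCOUNTS = 20_000_000

FAST_RATE = 100_000_000_000  # MD5 guesses/second, from the comment above
SLOW_RATE = 100_000          # hypothetical rate for a slow hash like bcrypt


def guesses_per_account(rate: int, seconds: int, accounts: int) -> int:
    # With per-record salts every guess only counts against one account,
    # so the total work is split evenly across all accounts.
    return rate * seconds // accounts


fast = guesses_per_account(FAST_RATE, SECONDS_PER_MONTH, ACCOUNTS)
slow = guesses_per_account(SLOW_RATE, SECONDS_PER_MONTH, ACCOUNTS)

print(fast)  # 12960000000 guesses per account in a month
print(slow)  # 12960 guesses per account in a month
```

Under those assumptions a month of work buys about 13 billion guesses per account against MD5, versus about 13 thousand against the slow hash. The second number is only enough to catch the very weak passwords, which is point 4.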
The entropy of the hashes isn't really relevant since it's the repeatability of the hashing that's at issue. For example, the hash 7c6a180b36896a0a8c02787eeafb0e4c seems to have plenty of entropy, but since it's the MD5 of the string "password1" and you can just google that hash to find out that information, the theoretical entropy is moot.
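A tiny sketch of why repeatability is the issue: an unsalted hash can be looked up in a table someone built once, while the same password with a per-record salt won't be in that table. The four-entry word list, the salt value, and the function names are all made up for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode()).hexdigest()


# A precomputed table: built once, reusable against every unsalted dump.
COMMON_PASSWORDS = ["123456", "password", "password1", "qwerty"]
TABLE = {md5_hex(p): p for p in COMMON_PASSWORDS}


def lookup(hash_hex: str):
    # Returns the plaintext if the hash is in the table, else None.
    return TABLE.get(hash_hex)


def salted_md5_hex(salt: str, password: str) -> str:
    # Per-record salt: the same password hashes differently per account.
    return md5_hex(salt + password)


# The hash quoted in the comment above can be fed straight into the lookup.
print(lookup("7c6a180b36896a0a8c02787eeafb0e4c"))
print(lookup(salted_md5_hex("a1b2c3", "password1")))
```

Note that the salt only defeats the shared table. Salted MD5 is still fast, so it stays exposed to point 3.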
Entropy is not really the issue here. For a dictionary attack with a non-reversible hash, what matters is how computationally intensive it is to generate the hash. MD5 hashes can be generated very quickly, while bcrypt hashes take much, much longer to generate.
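If you want to see the cost gap on your own machine, something like this works as a rough measurement. It's a sketch: PBKDF2-HMAC-SHA256 with 200,000 iterations stands in for bcrypt (an assumption so it runs on the standard library alone), and the absolute timings will vary a lot by hardware.

```python
import hashlib
import time


def seconds_per_call(fn, repeats: int) -> float:
    # Average wall-clock time of one call to fn.
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def compare():
    # One fast hash per guess versus one deliberately slow hash per guess.
    fast = seconds_per_call(
        lambda: hashlib.md5(b"somesalt" + b"password1").digest(), 1000
    )
    slow = seconds_per_call(
        lambda: hashlib.pbkdf2_hmac("sha256", b"password1", b"somesalt", 200_000),
        3,
    )
    return fast, slow


if __name__ == "__main__":
    fast, slow = compare()
    print(f"MD5:    {fast:.9f} s per guess")
    print(f"PBKDF2: {slow:.9f} s per guess")
    print(f"ratio:  {slow / fast:.0f}x")
```

Every guess in a dictionary attack pays that per-hash cost, so the ratio is more or less the factor by which the attack slows down.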
You could delete the old hash and require users to generate a new one through the password reset mechanism. Generally you would want to combine both approaches, so that only users who do not log in during the transfer window have to deal with setting a new password.
Bingo. All they had to do was delete all the login tokens. Users with what would otherwise have been a valid cookie would have had to log in again, but that's a minor inconvenience that's expected to happen from time to time.