The Death Throes of crypt()
dex writes "Tom Perrine and Devin Kowatch of the San Diego Supercomputer Center have issued "Teracrack:
Password cracking using TeraFLOP and PetaByte Resources" (PDF, HTML version via Google). Using SDSC's
prodigious computing facilities, they precomputed 207 billion crypt() hashes in
80 minutes."
I wonder if this will spur Sunto finally make the default password encryption algorithm on Solaris something other than crypt...
Actually with most Unixish systems going to other password formats such as MD5 and Blowfish I'd think that this goes to show that (NSA notwithstanding) crypt() has had a long, healthy existance. Rather than saying 'crypt() is dead' they should be saying 'it took 30ish years but crypt() is at the end of its useful life'.
Not many pieces of code will be able to boast that lifespan.
Trolling is a art,
Unless they release these hashes out into the wild, the average cracker/hacker does not have access to this type of resource...
Definately cool though for proof of concept!
80 Minutes? Obviously we just are not using enough power.
30% Troll, 50% Underrated, 10% Interesting
Score:5, Troll
In the wake of stories like this, and yesterday's story about RSA-576 being cracked (here), is this a message that we need more secure forms of encryption than we already have? RSA is great so far, but how long until 1024 is broken? Or any other schemes, like the MD5 hashing that's used for digital signatures?
Topics that people with lots of credentials behind their names are going to have to solve. Keeping ahead of the crackers is a big concern not only for security of transactions, but for personal privacy as well.
Erioll
Mirrored here:
http://home.twcny.rr.com/scooper2/teracrack.pdf
- Sometimes you're the pidgeon, sometimes you're the statue.
Remember that many glibc2 "overloads" crypt() to be able to deal with MD5 hashes. I wonder how many MD5 hashes they can generate...
That over time, any encryption alghorythm may be broken by superior computer. 50 years from now, normal computers will put anything we have to shame, and supercomputers will make current ones look like calculators.
Crypt is already supplantable by many improved techniques, but even if it is used, are they going to make these keys available to the world?
If not, now that it's known a really faster computer can solve then, perhaps the next step in spammy-crackers' arsenal will be to take their virussed drones away from attacking anti-spam sites and focus them at generating crypt or other password solutions? How many drones working P2P-style (you create these hashes, I'll create these ones) would it take to equal this supercomputer?
Clearly, crypt() was meant to die: just look at its name!
As Schneier says on the first page of Chapter 1 of "Applied Cryptography",
They've got the tables on their ftp server, but it seems slashdotted because it's going really slow... my computer says "downloaded 4194304 bytes of 1209462790550 bytes (0.00034%)"
Anyone have a bit torrent for this thing?
HIV Crosses Species Barrier... into Muppets
If I understand the article correctly, they're using serious computer power to develop a database of all passwords and their resulting hashes. In doing so, they can reverse engineer any hash back into a password via a 1 to 1 lookup. The only problem is that the attacker still needs physical access to your password hashes. In other words, this is much more serious for "insider" hacks where a system user wants root control.
Correct me if I'm wrong, but wasn't the point of moving the hashes from passwd to another root protected file (like "master.passwd") to prevent this sort of problem?
Javascript + Nintendo DSi = DSiCade
The ability to generate lots of crypt strings only helps you if you have the original crypt string to compare against. Most modern UNIX systems store crypt strings in /etc/shadow which is only readable by root. The crypt string is never passed across the net during most auth sequences. (Certain types of LDAP auth being the exception here.)
The problem occurs if someone manages to break into a machine, achieve root, and pick up the /etc/shadow file. They can now brute-force all the passwords given enough time, and it appears that the amount of time needed is shrinking.
This is a good argument for using different passwords on untrusted boxese and changing your password often.
Even so, using a 10 character input of about 72 possible input chars, isn't 207 billion still only like .0000055% of the total search space?
So that 20000 * 80minutes gives ~1% of the space cracked?
2000000 * 80 minutes = 304 years to fully close the space.
With a perfect distribution, the mean of about 150 years seems like a long time.
Someone please check my assumptions here.
Sure, MD5 strings are a little longer, but still (I'm pretty sure) less than 80 characters.
md5 strings are 32 characters. And most linux distros use md5 for their shadow files already.
I've already rooted all your boxen and converted them to a worldwide Beowulf cluster.
Time to crack some pr0n passwords...
Moral of the story: Pick a good password.
Funny, we did this work almost a year ago, and someone finally notices :-(
:-)
Yes, this is a very small part of the total theorectical key space.
But its exactly the part of the space that is most "interesting"; this is the part that will most likely be searched by an attacker AND the part that is most likely to have a real, user-selected password.
The original goal (years ago) was to allow us to verify that our users weren't using passwords that would be likely to be found by an attacker.
Yes, passwords are *supposed* to be stored in shadow files that are not accessible, except by root, but in practice, it is often discovered out in the Real World, especially at larger multi-vendor sites, that user password hashes are copied between machines, stored in databases, and in general available to an attacker who does not yet have root.
These are the same sites that often can't or haven't converted to md5() hashes, as they have older legacy OSes that don't do md5(). Note that even though Sun "supports" md5() hashes, they don't support them everywhere and it certainly isn't seamless. Don't get me started about AIX. Linux and the *BSD folks are way ahead of most of the commercial UNIX variants.
As for the scalability, read the whole paper. We used the largest single machine at SDSC, but its rather dated in terms of crypt() performance. A distributed.net-style project using typical home machines would win, IF you could get a thousand or so people to cooperate.
The Terabytes of storage in a single filesystem didn't hurt in the sort/merge phases either.
Personally, I'm a Kerberos fan
With a password hash, all hashes generated are of the same length. Therefore someone bruteforcing the hash will take the same amount of time no matter what the password is. A longer password though may save you from a dictionary attack, especially passphrases, as it greatly reduces the chance of a brute force dictionary crack finding your password.
Well, for starters, you should avoiding telling people the length of your password...
as far as i'm concerned, sun's pam configuration and flexible login setup only applies to solaris 9 and above. after trying to come up with a sitewide md5 solution i found solaris 8 to be terribly difficult to work with. dont get me started on their broken ldap libraries.
several datacenters i work with independently only offer solaris 8 so "why aren't you using the latest sun distro's" falls on my deaf ears. the huge body of vendor supplied software which calls for solaris 8 just makes it worse.
nis implementations that pass these crypt values around the network just makes keeping them inaccessible to users a nightmare.
crypt-for-passwords is one of those "standard unix" methodologies that needs to have already died a horrible death. the original title of this topic was hopefully appropriate.
I am going to go convert two of my physical binary decision devices into a cup of coffee.
But my point still holds. Sun wants to maintain backward compatibility. Period. That's separate from suck-add implementations of the PAM authentication method. And you're lucky. Last time I administrated Solaris boxes (other than my home workstation), I was stuck with Solaris 7. Trust me, that hurt.
nis implementations that pass these crypt values around the network just makes keeping them inaccessible to users a nightmare.
I feel your pain. That's going to be a difficult one to solve though. How does a user authenticate on a network if they can't pass their hash? One could add public key encryption on top of the authentication protocol, but one slip up in key management and the whole deck of cards collapses. After all, how long will it be until we see a worm that uses distributed power to begin generating all possible MD5 hashes? A daunting task to be sure, but certainly not beyond the current state of computing power.
Javascript + Nintendo DSi = DSiCade
Reading the article there is no way that teracrack is going to deal with a strong password, the hash won't be present in it's table.
Regardless of algorithm, the weak passwords will allways be the first to fall. We can all stop using crypt() and start using md5 hashes, but the same techniques can be applied again, and again the first passwords to fall will be the weak ones.
I hate to sound like a Luddite, but technical problems aren't allways best fixed with more technology. The best use of teracrack that I can see, is the same use that it's predecessor had, to identify weak passwords and identify them to the user and admin to ensure that this core problem is addressed.
"Talk minus action equals nothing" - Joey Shithead, D.O.A.
"Talk minus action equals
Not a new concept, but novel given the use of modern computing resources.
:-) wrote software in 1991/1992 to do this: unfortunately sun4 MP's and about a gigabyte of disk were the best we had.
:-). The output was also stored in bitwise numerical order - matching divide/2 matching very fast.
:-).
I (and probably others, I claim no novelty, only an inventive step
Rather than precompute the entire crypt() space, we precomputed the space for well known words (and combinations of those words with different prefix and suffix), because for any individual word, there are only 32 [I think from memory - 5 bits?] combinations that it can crypt to given the random salt that was possible.
Because of available disk space, we couldn't store the entire precomputed output: so we chose to only store the first N bits of it. This was configurable. I cannot remember the exact figure - sure I could dig the code up out of an old CDROM archive
So the password cracker would then mmap() the couple of gigabyte file, then easily find (or not find) a candidate prefix. If it found the prefix, it ran a few trial crypt()'s to ensure an exact match. In practice, because of the lack of diversity in passwords, there were few false candidate matches: so the password cracker had some extremely large hundreds/millions of equivalent cracks per second as a result of mostly just not finding comparisons, and a few trial runs for succesful targets (I think the running rate of the other two popular crackers at that time was about 100K cracks/second).
Anyway - that was a long time ago - fun and games as a student. I still have the source code
You're overlooking the random salt value, which is different on each machine, or even the same machine for different accounts. So, generally you would have a hash of "$salt-$password".
The problem mentioned with crypt() is that the $salt value is very small (1.5 bytes), thus making dictionary attacks feasible. If MySQL is not using salt values, then they risk similar attacks.
Most writers regard truth as their most valuable possession, and therefore are most economical in its use - Mark Twain
Back in 1994 I wrote the MD5 based crypt() which it seems almost everybody has adopted from FreeBSD by now: *BSD, Linux/GLIBC, Cisco and appearantly even Solaris.
:-)
The important properties of a good crypt() algorithm are still:
1. Input password is not length or charset restricted.
2. The algorithm is complex enough to not lend itself easily to hardware implementations (FPGAs etc).
3. The Salt is big enough that precomputing dictionaries is not feasible.
You will notice that apart from #1, these are not quantified, DES-based crypt() fulfilled #2 and #3 back in its days, but no longer does so. In a similar way, some day we will declare my MD5 based password scrambler as failing one or both of those criteria because password scrambling is not a case of finding "the" algorithm and putting a checkmark in the box, it needs to on periodically evaluated and improved every decade or so.
That is why I put the $1$ prefix on my MD5-based crypt(), so that you can update to a better algorithm when you need to. OpenBSD has already added a couple of stronger SHA based algorithms ($2...) and more can be added in the future.
In the absense of any other "IANA" for this, I would appreciate if you would register your "magic strings" with me so we don't have collisions.
If you're still using DES-based crypt(), switch to MD5 based crypt() now. Don't wait any longer, you are already 9 years late (IMO).
It's even "free as in free beer"
Poul-Henning Kamp -- FreeBSD since before it was called that...
In standard crypt() format, the salt is the first two characters of the hash.
The canonical way hash a password using crypt():
The characters "xo" are chosen at random when the password is first hashed. Note how they are the first two characters of the hash. The canonical way to check if a given password matches a hash is:Note how I use the entire hash as the salt. Only the first two characters of the salt are actually used by crypt(). Actually, only twelve bits from the first two characters are used for the salt: Two different salts resulted in the same hashes: this shows that crypt() does not use the entire 16 bits of the two characters (indeed, not even the entire 14 bits of the characters as US-ASCII). Only twelve bits are in fact used.Also, the entire password is not used: in fact, only the first eight characters of the password are used:
Since slashcode strips un-american characters, I cannot demonstrate the the top bit of each character in the passphrase is discarded.
Now, we can do some math: if 12 bits of salt is used, we have 4096 possible salts (2 ^ 12). If 7 bits of 8 characters are used, we have 7 * 8 = 56 bits of possible password. Thus, we have 2 ^ 56 * 4096 = 295147905179352825856 possible passwords (295 quintillion).
Now, these numbers don't match up with what's reported in the article description (207 billion hashes). It's possible that some combinations of passwords and salts produce identical hashes, but I would never expect nearly this many...time to read the article.
OK, I skimmed the article. They did not cover the entire keyspace of passwords. They only created a list of candidate passwords from a system dictionary using Crack's password generation routines. There are 1425835290 (1.4 billion) times more possible passwords than they tested. If they tried hashing all possible passwords, it would have taken them 217022 years at the rate they're going (80 minutes per 207 billion passwords). The storage of these hashes is out of the question (I don't know my metrics that high :).
Actually, on second thought, I can imagine a compression scheme that could drastically cut down on the storage involved: but this is irrelevant, the CPU time is still overwhelming.
Lesson learned: worry more about the quality of your passwords than the quality of your password hashing algorithm.
[Perhaps I missed something? Anyone care to check my arithmetic?]