Cracking PGP In the Cloud
pariax writes "So you wanna build your own massively distributed password cracking infrastructure? Electric Alchemy has published a writeup detailing their experiences cracking PGP ZIP archives using brute force computing power provided by Amazon EC2 and a distributed password cracker from Elcomsoft."
If only they'd thought of using distributed computing for the first post, instead of password cracking!
I was under the impression that crypto like PGP was based on stuff which would (in theory) take millions of years to crack even with every machine on earth dedicated to it?
Yes, obviously cracking passwords scales linearly, we've known that for a long time. Oh, you could get 100 machines brute forcing instead of one, but what good is that? Either the password is crap and you crack it easily, or it's helluva complex and scaling it up 100x won't do a damn thing. In this case it looks like they just picked some random range and said "Hey, this is unfeasible on a single machine and doable on a cloud, let's do that" but they haven't produced any credible evidence that real passwords fall in this range. Not unless "semi-complex password" happens to match their corporate password policy or whatever.
Live today, because you never know what tomorrow brings
They will want to be careful or else they just might get arrested.
> So you wanna build your own massively distributed password cracking infrastructure?
No
> I was under the impression that crypto like PGP was based on stuff which
> would (in theory) take millions of years to crack even with every machine on
> earth dedicated to it?
That's true if everything's equal. Including your passphrase. If the cipher
for encryption is 128-bit strong, then your password/passphrase needs to match
that. If it doesn't it's the weakest link, easier to attack than the actual
crypto algorithm and will take accordingly less time to crack.
Example: For a password composed only of lower-case a-z English characters,
you'd need 28 characters chosen in a true random fashion (think scrabble tiles
pulled out of a hat) to actually achieve a strength of 128-bit, that matches a
128-bit crypto or hash algorithm.
The strength of TFA 'sweetspot' passwords was somewhere around 60-bits.
Since even RC5 has been broken at 64-bits (distributed.net - though it took
some time), such passwords are OK for low-priority stuff but not, if say, the
NSA is after you ;-)
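For anyone who wants to check the 28-character figure: the strength of a uniformly random password is length times log2(alphabet size). A rough Python sketch, where the function names are my own and the whole thing assumes every character is drawn independently and uniformly:

```python
import math

def strength_bits(length: int, alphabet_size: int) -> float:
    """Bits of strength for a password of `length` uniformly random symbols."""
    return length * math.log2(alphabet_size)

def chars_needed(target_bits: int, alphabet_size: int) -> int:
    """Smallest length whose strength reaches `target_bits`."""
    return math.ceil(target_bits / math.log2(alphabet_size))

# Lower-case a-z only, aiming for the strength of a 128-bit cipher.
print(chars_needed(128, 26))
# Same target with mixed-case letters plus digits (62 symbols).
print(chars_needed(128, 62))
```

None of this holds for a passphrase a human made up; it only describes truly random draws.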
> I was under the impression that crypto like PGP was based on stuff which would (in theory) take millions of years to crack even with every machine on earth dedicated to it?
Yes, but the search space is significantly lower if you assume a password that's 1-8 Latin alphanumeric characters, as this exercise did.
It's still 122 days on 10 VMs. One tenth of that on 100 VMs.
Assuming I want to use some spare CPU cycles for some purpose or other, where should I apply to make sure it's OK? So far, we have protein folding and alien hunting as acceptable: what are some other uses of spare CPU that are acceptable and unacceptable?
Shutting down free speech with violence isn't fighting fascism. It IS fascism!
They are only talking about "characters" in a password, which is a bit dubious. The important information is how many bits of entropy the password provides. For a discussion on this see, for example: http://world.std.com/~reinhold/dicewarefaq.html#howlong For this reason and others, I'll take their "report" with a grain
One of the advertised features of ElcomSoft Distributed Password Recovery is that all network communications between password recovery clients and the server are securely encrypted. How is that possible, I wonder.
How do you know they weren't cracking a PGP'd zip archive containing secret documents about alien protein folding technology?
First of all, the article is a very nice summary of the issues involved with setting up a cloud to crack passwords - the nuts and bolts, if you will. I liked that the authors took the time to look into the economics of trying to crack passwords, how much money it would cost vs. how long it would take. Password cracking is one example of massively scalable computing, which is presumably why the NSA allegedly has had to keep upgrading the electrical infrastructure at their headquarters. Elcomsoft certainly made a splash with their PGP-cracking software and managing to harness the power of cheap GPU cards (which are set up for parallel processing) was a bit of genius. That said, even massive horsepower runs into a brick wall once the passphrases become long and the encryption algorithm is good.
On page 2 of the article, the authors nicely summarize the cost of cracking longer and longer passwords. Once passwords start incorporating special characters (per SPEC), the cost shoots sky high even for relatively short passwords (i.e. $10MM+ for a 9 character password, $1BN for a 10-character password, the US national debt for a 12-character password). The article so clearly lays out why the various law enforcement agencies have been focusing on being able to force folk to disclose their encryption keys. The cost of cracking a well-executed encryption scheme combined with a good password is simply too high. So, go ahead and use those special characters, upper and lowercase, etc. to make life interesting for would-be snoops. But realize that unless trends in privacy rights swing the other way, law enforcement will simply compel key disclosure, as they have for years in the UK, for example.
Lastly, the article underscores the value of keychain-type schemes that allow many long passphrases to be stored in an accessible format. Make it easy to have long, complex passphrases and it becomes more likely that people will actually use them.
I wonder why key strengthening is not used more; even '40-bit' passwords would be pretty tough to crack if each password took 1s to test.
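To put a number on that: key strengthening just means running the password through a deliberately slow derivation before using it as a key. A minimal sketch with PBKDF2 from Python's standard library; the iteration count, salt size, and sample passphrase are illustrative assumptions on my part, not what PGP or TFA use:

```python
import hashlib
import os

def stretch(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key; every guess an attacker makes pays the same cost."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, iterations, dklen=32)

def exhaustive_search_years(bits: int, seconds_per_guess: float) -> float:
    """Time to try every one of 2**bits passwords at a fixed cost per guess."""
    return (2 ** bits) * seconds_per_guess / (365 * 24 * 3600)

salt = os.urandom(16)                      # random per-file salt
key = stretch("example passphrase", salt)  # hypothetical passphrase
print(len(key), "byte key derived")

# A '40-bit' password at one second per test, on a single machine.
print(round(exhaustive_search_years(40, 1.0)), "years to exhaust")
```

That comes out to roughly 35,000 machine-years for the full space, which still divides by however many machines the attacker can rent.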
Schneier had an interesting piece on deriving a limit of the necessary key length from thermodynamics. ... assuming your password is only bruteforce-able ... otherwise http://xkcd.com/538/
http://www.schneier.com/blog/archives/2009/09/the_doghouse_cr.html
NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
People who would try to hack your password probably will not use Amazon's EC2, but something far more trivial, such as a botnet. Botnets are free, all you need is some time.
This is why I don't think that 11-12 character bound has any meaning in practice. A more meaningful boundary is the one which cannot possibly be cracked in reasonable time with a big computer network, regardless of the cost of operating it.
I use a composite password approach that gives a best balance between security and ease of use: use a good random generator for say 16 characters. Print it. This is not your password yet. You will come up with a few easy to remember, but hard to guess transforms, such as inserting several characters in a place only you know, using the printed sequence in a different order than linear left-to-right, and replacing all instances of printed "a" with "3" for example.
This ensures brute-forcing over the network will not be possible, as your password is truly random, and long. Finding the list would not lead to instant hack either as there's still reasonable information withheld from what's printed. You can also frequently change the "seed", the random characters you print, and apply the same rules you remember from before, to arrive at a completely new password, without having to remember anything new.
If you are cracking through the cloud, then you are also vulnerable, and someone can use your efforts to get into the system before you...
In this case, it sounds like the customer was pretty glad they'd used weak passwords.
The implication is that they'd locked some files up in an encrypted zip, forgotten the password, and wanted the contents back.
If they'd chosen a stronger key, they'd not have got their files back.
TFA:
This analysis may be insightful as you develop your enterprise password policies, or choose your personal passwords.
(A good password policy is: don't forget your passwords!)
> you'd need 28 characters chosen in a true random fashion (think scrabble tiles
> pulled out of a hat) to actually achieve a strength of 128-bit, that matches a
> 128-bit crypto or hash algorithm.
Scrabble tiles would be an exceptionally bad choice.
If the encryption software works as advertised, they would need the private key file to exploit this. So as long as you keep your private key file to yourself you should still be safe for a while.
Sure, if you're limiting it to 1 per second, but they can get into the hundreds of millions of passwords per second with GPU setups.
I'm also a bit confused. I've never used PGP to make an encrypted zip file, but I use GnuPG to encrypt emails all the time and I, too, was under the impression that it was infeasible in practice to brute force the encryption.
Is the difference that with PGP/GnuPG email encryption, our passwords are merely decrypting our keys which are themselves fully 128 or 256 bits long or whatever? Whereas in this situation with the ZIP file there was no separate key - the password was the key? (I haven't read all of TFA)
What a chore that they need to use Windows. For a brute force password guesser, most Slashdotters could write it in 10 lines of perl.
I have an idea : how about a self destructing key? There would be a physical USB key that would have your passphrases on it. The passphrases would be quite lengthy strings of randomly generated characters, effectively un-forcable unless there's a massive weakness in the encryption algorithm.
The key would have a small CPU and lithium ion battery. All the components would be potted in epoxy, and you would be able to put an outer shell around the key resembling a common brand of USB stick.
In order to use the key, you'd have to enter a small password to unlock it. If the key has not been used in roughly 2 weeks of real time, it erases the passphrase from itself.
So if you get arrested or compelled to give up your password, you just have to keep silent for a couple weeks. Then, it's gone!
The company surely did have the private PGP key lying around. They just forgot the password.
As an analogy, think of a safe. A good safe is hard to break in if you don't have the key. If you have the key, it's quite easy. Now you fear that someone could break in your house, get the key and open your safe. Therefore you put the key for the big safe into another, smaller safe. If you need to open the big safe, you first open the small safe, take out the key of the big safe and then open that.
Now if you have lost the key for the small safe, and the small safe is less secure than the big safe, you'll certainly not crack the big safe, but just the small safe in order to get the key of the big safe.
Now, the key for the small safe is your password, and the key of the big safe is the PGP key. If someone has access to the small safe (the password-protected PGP key), then the security of whatever is in the big safe is certainly limited by the security of the small safe.
Now with emails, the point is that the big safe (the encrypted email) is out in the public, while the small safe (the password-protected PGP key) is in your home (i.e. on your computer, which hopefully itself has appropriate protection against intruders).
So the security of your PGP encrypted mail is limited by the combination of the security of your computer and the security of your PGP password. If your computer is basically unprotected, and your PGP password is weak, then anyone can read your encrypted mail by simply breaking into your computer, copying the private PGP key, and breaking the password. If your computer is well-secured, the attacker will have a hard time getting your private PGP key, and if your PGP password is strong, the attacker will have a hard time breaking it if he manages to get the PGP private key.
The Tao of math: The numbers you can count are not the real numbers.
I thought the problem was that there was an infinite number of matching passphrases producing invalid results. Like, only a very simple hash or CRC - 1 or 2 bytes checks the validity of the passphrase to protect from common typos, but if you try even semi-hard, you will get a hash collision, the data decrypts, but it decrypts to garbage - a standard GIGO filter with a very weak anti-garbage protection on input.
This way, on top of one correct result you should get an infinite number of incorrect results and unless you have a clue how the correct result should look like and use some heuristics to distinguish it from garbage, you'll be no wiser than before... (and if it was additionally encrypted with anything that makes it look like white noise, there is simply no way to tell it apart from pure garbage.)
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
Damnit, my password is all vowels again!
which is totally what she said
scrabble tiles pulled out of a hat is a bad example :)
They are not evenly distributed, so you'll have a higher occurrence of certain letters (namely the vowels and other common letters like s.)
and here is why (from Wikipedia): English-language editions of Scrabble contain 100 letter tiles, in the following distribution:
* 2 blank tiles (scoring 0 points)
* 1 point: E ×12, A ×9, I ×9, O ×8, N ×6, R ×6, T ×6, L ×4, S ×4, U ×4
* 2 points: D ×4, G ×3
* 3 points: B ×2, C ×2, M ×2, P ×2
* 4 points: F ×2, H ×2, V ×2, W ×2, Y ×2
* 5 points: K ×1
* 8 points: J ×1, X ×1
* 10 points: Q ×1, Z ×1
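How much that skew actually costs can be estimated from the tile counts above. A back-of-the-envelope Python sketch, assuming the two blanks are set aside and each tile goes back in the hat after it is drawn (both assumptions are mine):

```python
import math

# Letter counts from the distribution listed above (blanks left out).
TILES = {"E": 12, "A": 9, "I": 9, "O": 8, "N": 6, "R": 6, "T": 6,
         "L": 4, "S": 4, "U": 4, "D": 4, "G": 3,
         "B": 2, "C": 2, "M": 2, "P": 2,
         "F": 2, "H": 2, "V": 2, "W": 2, "Y": 2,
         "K": 1, "J": 1, "X": 1, "Q": 1, "Z": 1}

def shannon_bits(counts: dict) -> float:
    """Average bits of entropy per draw from a weighted pool."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

per_draw = shannon_bits(TILES)
uniform = math.log2(26)
print(f"full set: {per_draw:.2f} bits/draw, one tile per letter: {uniform:.2f}")
print("draws for 128 bits:", math.ceil(128 / per_draw), "vs", math.ceil(128 / uniform))
```

So the full set loses a few tenths of a bit per draw, which means a couple of extra tiles for the same strength rather than a disaster.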
> such passwords are OK for low-priority stuff but not, if say, the NSA is after you ;-)
If the NSA is after you, I would think the strength of your passwords is the least of your worries.
Much more horse power and it don't cost you anything. Thank god for windoze lusers.
Nope. Aliens use bittorrent.
To pick a trivial example.
Your password is 'password'.
Cracking algorithm attempts to open your encrypted archive using a list of, say, 20,000 english words. 'password' is 5th on the list. After 5 iterations, you notice that your decryption attempt has yielded data that looks like a valid zip archive, or contains english words. Result. You win the internets.
You can refine this.
1. Attempt a password list crack.
2. Attempt a Markov-chain based crack, looking for english-like words generated by your Markov Chain algorithm. Like, say. 'bibble' or 'foglet'. Tr
3. Repeat the above for all letter case combinations, and number/letter replacements - like B1bb7e, or f0Glet.
Et cetera,
The edge you have is that people often choose known words as passwords, or easy-to-remember nonsense words.
This reduces your password search space *hugely*.
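Step 3 is easy to sketch in Python. The substitution table and the `try_password` callback below are hypothetical stand-ins of mine (a real cracker would attempt the decryption there); this is an illustration, not how Elcomsoft's tool works:

```python
from itertools import product

# Illustrative digit-for-letter substitutions; real rule sets are far larger.
LEET = {"a": "4", "e": "3", "i": "1", "l": "7", "o": "0", "s": "5"}

def variants(word: str):
    """Yield every upper/lower-case and digit-substitution variant of `word`."""
    choices = []
    for ch in word.lower():
        options = {ch, ch.upper()}
        if ch in LEET:
            options.add(LEET[ch])
        choices.append(sorted(options))
    for combo in product(*choices):
        yield "".join(combo)

def crack(wordlist, try_password):
    """Return the first variant accepted by `try_password`, or None."""
    for word in wordlist:
        for candidate in variants(word):
            if try_password(candidate):
                return candidate
    return None

# 'bibble' expands to a couple of hundred candidates, B1bb7e among them.
print(len(list(variants("bibble"))))
print("B1bb7e" in set(variants("bibble")))
```

Even with every variant expanded, a 20,000-word list stays tiny next to a full brute-force space.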
For example, say your pgp doodad accepts up to 10 character passwords formed from any combination of letter case or number. 26 lowercase letters, 26 uppercase letters, 10 numbers. Your maximum search space would be the sum of all (26+26+10)^n, where n iterates from 1 to 10, or 853,058,371,866,181,866, or 8.5x10^17. This is the size of the set of all possible mixed case alphanumeric passwords up to a maximum length of 10. You would have to try each of these combinations to fully search this space. This is called 'brute forcing'.
It is a *much* larger number of passwords than the 20,000 in your dictionary list....
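The figure is easy to reproduce, since Python integers don't overflow. A quick sketch; the function name is mine:

```python
def search_space(alphabet_size: int, max_len: int) -> int:
    """Number of passwords of length 1..max_len over the given alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))

total = search_space(26 + 26 + 10, 10)
print(f"{total:,}")                       # full brute-force space
print(f"{total // 20_000:,} times the 20,000-word dictionary")
```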
So, you use the search space limiting techniques *first*, which will yield a result in 95% of all cases. Then, you try brute force, or give up.
Really, the set of scrabble tiles in a standard box is a bad example, it wouldn't be real hard to sort out 1 tile of each letter and use those 26 tiles to generate the password (placing the tiles back in the hat and shaking it a bit after each draw).
If we are going to split hairs, we might as well do it all the way.
Nerd rage is the funniest rage.
I looked at EC2 for raw processing power earlier this year (my company needs to train a lot of neural nets) and it just isn't worth it, unless you only need the power short term. A high-performance EC2 node gives you 8 cores running at (very roughly) the equivalent of a 2GHz P4, and costs $0.68/hr == about $460 per month, which is only a little less than what an equivalent box (probably a 2.83GHz Core 2 Quad or similar) would cost you. Put power to run that box down at about $0.05 per hour and you can build your own local cluster of equivalent performance for around the same amount of money as you'll save in your first month and a half of operation.
> (from wikipedia) English-language editions of Scrabble contain 100 letter
> tiles
I meant using scrabble tiles in principle. So obviously 26 a-z
characters/tiles, not 100 with uneven and therefore non-random distribution. :-)
> If the encryption software works as advertised, they would need the private
> key file to exploit this.
You are confusing public key encryption (1 private key & 1 public key) with
conventional/symmetric encryption (gpg -c) where no separate key per se is
required. The encrypted file is all you have.
Let's see, he already did some 'folding' in the bathroom, while browsing an old old issue of FHM, and
there were no aliens in his morning cereal (just spaceships in purply artificial grape flavor). Came
to work, punched in, and got started. Yup, on time, and right on schedule. boss.
WARNING: Smartphones have side effects--most of them undocumented.
> scrabble tiles pulled out of a hat is a bad example :)
> They are not evenly distributed, so you'll have a higher occurrence of certain letters (namely the vowels and other common letters like s.)
Unless of course you know not to do that and just use the 26 letters comprising the alphabet.
You could even generate a one time pad that way.
That's only a problem if you have no idea what the encrypted data might be. But in most reality-based cases, that's not the problem. You almost always have the clues you need.
In this case, for example, the file is a ZIP archive. That means the archive contains in the clear the original file names including any extensions, such as .jpeg, .bmp, .doc, .pdf, or whatever. All those file types have artifacts you can test for. They all have specific formats. They'll have version numbers, dimensions that must fall within reasonable boundaries, or other attributes that simply won't produce a coherent file unless they're correct.
For example, a JPEG image file is a container and is filled with markers identifying all the different sections. They all must be right or it won't display. So you'd start by looking for the SOI marker as the first two bytes of the file (0xffd8) or you'd throw it out. After the SOI you'd have to find another valid JPEG marker (two more bytes beginning with 0xFF). So that's three bytes you'd have to match exactly, and the fourth byte would have to be on the list of valid markers. After you find the next marker, it'll probably be followed by a length (two or four more bytes). If that length is greater than your file size, it's a fail. Sure, if all that passes you'd have to decrypt more data to figure out if you're still in a valid file, but the chances of a false positive are now only about 1 in 16 million keys tested. You then farm all these "potentials" to a machine or other process dedicated to deeper examination of the candidates.
If I were writing this, I'd have enough smarts in the key tester to look for all possibilities within the first blocksize of the cypher. Anything that looked reasonable at that point would be exported to the "evaluate potentials" system.
Every data file has its structure. You just have to look for it.
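A first-pass filter along those lines might look like this in Python. It assumes the candidate plaintext is the raw JPEG bytes (a real ZIP member would usually need decompressing first), and the marker list is a small illustrative subset rather than the full table:

```python
# Second-marker bytes commonly seen right after SOI: APP0..APP15, DQT,
# SOF0, SOF2, DHT and COM. A deliberately incomplete, illustrative set.
PLAUSIBLE_MARKERS = set(range(0xE0, 0xF0)) | {0xDB, 0xC0, 0xC2, 0xC4, 0xFE}

def looks_like_jpeg(block: bytes, file_size: int) -> bool:
    """Cheap sanity check on the first decrypted bytes of a candidate file."""
    if len(block) < 6:
        return False
    if block[0:2] != b"\xff\xd8":          # SOI marker
        return False
    if block[2] != 0xFF or block[3] not in PLAUSIBLE_MARKERS:
        return False
    # Segment length is two bytes, big-endian, and counts itself.
    segment_length = int.from_bytes(block[4:6], "big")
    return 2 <= segment_length <= file_size

jfif_start = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
print(looks_like_jpeg(jfif_start, 50_000))
print(looks_like_jpeg(bytes(16), 50_000))
```

Anything that passes gets handed to the slower "evaluate potentials" stage; everything else is thrown away after a handful of byte compares.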
John
A much less geeky/costly solution than using a GPS-integrated self-destruct mechanism is: ...have two passwords. One decrypts the data, the other erases it.
Actually, some ATMs have a similar ability: your PIN lets you access your bank account, while entering your PIN backwards does the same thing but calls the cops at the same time. If you're mugged at the ATM and forced to reveal your PIN, you give/use it backwards to notify police while the perp is busy emptying your savings.
Can we get a "-1 Wrong" moderation option?
It wasn't carbon, but the fuel consumed that was my first thought. Back when distributed.net was busy burning energy to win these pointless challenges, I did some rough calculations on the electricity required to solve it.
Turns out that the energy spent breaking RC5-64 used somewhere between 2 and 50 *trains* full of coal.
And that was only the energy directly consumed by the computers involved, and not any of the heating or cooling costs associated with it. And sure, more modern CPUs are more energy efficient, and I extrapolated the figures from a lot of published sources and made a lot of assumptions. But regardless of CO2 or greenhouse gasses or dirty coal or any of that environmental stuff, that's a lot of irreplaceable fossil fuel that's now gone.
I don't think it's sad or tainted to consider the overall impact of what you do. Saying "oh, I want to help search for E.T." is one thing. It may cost you an extra 1440 kWh/day, but you have the money, no big deal. But understanding that SETI@HOME is causing tens of thousands of people around the globe to collectively burn tons of fuel every day might make some of the volunteers rethink their decision. Ignoring that is the kind of perspective that thoughtlessly sucks up our finite resources.
And no, I don't consider alien hunting a valuable use of energy, at least not at this time in our history. Once we have fusion reactors or some other form of "free energy", all that will change.
Go ahead and crack keys, search for Extra Terrestrials, or fold proteins, or whatever you want to do with your box. Leave your lights on 24x7. Run the furnace and the air conditioner together. Just understand that what you do today has an impact, and consider the value of the outcome.
John
Sounds like someone was doing 'Difficult Data Retrieval'
FTA, they mention that Amazon didn't allow them to create more than 9 instances, so they couldn't crack the passwords in less than 122 days. (a request to get suitable amounts of computing power was made, but takes time, is not enabled by default, and wasn't available at the time of writing?)
Dear Sir, Thank you for submitting your request to increase your Amazon EC2 limit. It is our intention to meet your needs. We will review your case and contact you within 3 - 5 business days.
> You win the internets.
The Internets is full of furries, 4chan, and people having sex with dogs! I don't want this! Take it back!
Support my political activism on Patreon.
That is just nonsense.
If the customer had used a proper PKI with key recovery/escrow this could have been avoided. The solution is NOT to make weak passwords so that you can crack them when you lose your passphrase. How on earth is this modded informative?!
what if we covered the moon with graphics cards?
i bet we could break any password then, huh guys
No problem, I've got a monitor full with post-it notes. So my policy must be excellent.
You take one of each letter, put them in a bag, jiggle the bag and pull out a single tile. Drop the tile back in the bag and repeat.
You can even get 52 characters out of it: if your thumb covers the letter when you draw it out, capital. If it covers the blank side, lowercase.
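If you'd rather not buy a Scrabble set, the same procedure (one tile per letter, drawn with replacement) is a few lines of Python. This is only a sketch; `draw_password` is a name I made up, and it leans on the standard `secrets` module for the randomness:

```python
import secrets
import string

def draw_password(length: int, mixed_case: bool = False) -> str:
    """Draw `length` letters uniformly, like one-of-each tiles put back each time."""
    # mixed_case=True plays the role of the thumb trick: 52 symbols instead of 26.
    alphabet = string.ascii_letters if mixed_case else string.ascii_lowercase
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(draw_password(28))                   # 28 lower-case draws
print(draw_password(23, mixed_case=True))  # 52-symbol alphabet needs fewer draws
```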
Can you be Even More Awesome?!
An irrelevant note I might add. All PGP/GPG encrypted data is symmetrically encrypted using a randomly generated key. It is only that resulting key that is then encrypted using the public key, for speed reasons.
The security of your data depends heavily on the random number source used for generating these session keys.
- Michael T. Babcock (Yes, I blog)
This is called a known plain text attack. Most modern algorithms are hardened against this. So this technique wouldn't help unless the algorithm had a known weakness to this. But then again, maybe people still encrypt their archives using enigma machines these days. The world is a crazy place.
If using a cloud, where you pay by CPU-Hour, wouldn't it make sense to use as many VMs as it takes to get it done in.. an hour? (if that many are available)
If I'm understanding right, there's no cost difference, but you get your results right away, instead of waiting half a year
Can you be Even More Awesome?!
> An irrelevant note I might add. All PGP/GPG encrypted data is symmetrically
> encrypted using a randomly generated key. It is only that resulting key that
> is then encrypted using the public key, for speed reasons.
Purely from method you're correct. But the distinction made prior between
public key and straight-symmetric is quite relevant to this discussion. If the
files were encrypted with public key encryption and the private key is lost,
you have no other choice but brute-force attacking the cipher with associated
cracking-time. Attacking the password is not even an option anymore, as
opposed to having the files symmetrically encrypted where you can still choose
between attacking the cipher or the passwords.
What's the carbon footprint of your post? Mine has fewer electrons than yours.
I'd say pulling tiles out of a bag from a typical Scrabble set would be optimal to guessing a lot of passwords.
Ever wonder why the people who win at Hangman always pick E,T,A,I,O,N first ?
So why not reverse the distributions, so that lots of Q,W,X,Y are present, and only a handful of E,T,A,I,O,N, and use that to generate a random password ?
If you can provision 30k CPUs, sure. As mentioned in the article this type of password cracking is trivially parallel.
This sig is intentionally left blank
How about we all go back to living in the trees and eating coconuts ?
What is with this techno-guilt-trip lately ? Everything online now has to be weighed in terms of how much energy it is using ? Google datacenters, key cracking attempts, posts to Slashdot ? (Kind of ironic his post was tl;dr;, just think of the energy he could have saved if he'd been less verbose).
You want to do something concrete to save energy ? Turn down your aircon, stop driving that 5 litre SUV, invest in a set of solar panels for your house and get off the grid etc.
Don't try to pretend you're making a contribution by saving a couple of milliwatts on CPU use. For every post made online, you already saved tons more energy in not producing newspapers, not burning gas so that some truck can deliver them to the paper-shop, not even having a paper-shop consuming energy to stay open and sell you your daily news.
Everything we do consumes energy, but these sensational claims about how much we could save if we used black screens and white text are nonsense. We already saved far more energy by being online in the first place.
The problem is that people make the mistaken assumption that the easy-to-remember passwords have to be the same length as the ideal password.
It's true that nonsense words generated by markov chain do not use all of the bits available in their given output, but there is no need to limit pronounceable passwords to eight or ten characters. If they're easy enough to remember, you can go almost arbitrarily high.
A little back of the envelope calculation shows that adding characters on the end is almost linearly equivalent to adding characters to the depth, especially where your character depth is already greater than your password length.
I can tell you which one is easier to remember, though.
Can you be Even More Awesome?!
Someone attacking a password they believe to be random isn't going to worry about English letter frequencies (adding duplicate symbols to the pool doesn't really increase its randomness; think in terms of 'ab' vs 'aab' or 'aaaab', if I discover information about the pool you used, 'ab' is going to be 'more' resistant than the other two (in that it will take two guesses half of the time, instead of 1/3 or 1/5 of the time)).
Nerd rage is the funniest rage.
> why not reverse the distributions, so that lots of Q,W,X,Y are present, and
> only a handful of E,T,A,I,O,N, and use that to generate a random password ?
When some characters have more chance to appear than others then it's by
definition more longer 'random'. Random is, when they all have equal chances
of being drawn, so you want 26 tiles, one per letter of the alphabet.
No longer 'random' I meant to say ;-)
Or even faster on 8000 VMs (8 cores * 1000 servers). Which a lot of educational institutions have. Google "Rocks Clusters" and look for their cluster registrations.
If my power plant is burning any coal, something is very wrong.
> No problem, I've got a monitor full with post-it notes. So my policy must be excellent.
I didn't know my old manager read Slashdot!
Posts not to be taken literally. Almost everything is sarcasm.
> Saying "oh, I want to help search for E.T." is one thing. It may cost you an extra 1440 kWh/day, but you have the money, no big deal. But understanding that SETI@HOME is causing tens of thousands of people around the globe to collectively burn tons of fuel every day might make some of the volunteers rethink their decision.
If your computer uses 1440 kWh/day = 1.4 MWh/day = 60kW, you ought to have a better use for it than SETI@home. My house, for example, has only 100A service, good for about 24 kW.
> Turns out that the energy spent breaking RC5-64 used somewhere between 2 and 50 *trains* full of coal.
Except now we all think you have a factor of 1000 error in that estimate.
USB Flash - Linux-OS+Apps and 256b...1024b... random number generator ... plus..., anyone (I think) can have a very complex key-generator.
Isolated (air-gap) key-generator loads USB-flash drives, then sneaker distribution.... There may be other options... for information sharing, but (IMO) someone passing unbreakable encrypted files within an encrypted VPN, that is not a known business or government, would create a signature of interest on the W3. Governments have really good tools, and can domestically shutUdn (post hypothesis, evidence collection, and foundation...).
An old-school key-book, snail-mail, and postal-bulk camouflage, may be better, if you need to hide information.
Unaccountable leaders are masters, and unrepresented people are slaves. How do US and EU fare?
> In this case, it sounds like the customer was pretty glad they'd used weak passwords.
> The implication is that they'd locked some files up in an encrypted zip, forgotten the password, and wanted the contents back.
> If they'd chosen a stronger key, they'd not have got their files back.
> TFA:
> This analysis may be insightful as you develop your enterprise password policies, or choose your personal passwords.
> (A good password policy is: don't forget your passwords!)
Alternatively, a disgruntled former employee may have refused to divulge the password. A third possibility is that this individual became unable to reveal the password through injury or death.
Eagles may soar, but weasels don't get sucked into jet engines.
They never say that they actually found the password?
Or how long it took?
Are they still looking for it?
I read both pages and found no conclusion whatsoever.
UGH.. can't.. understand.. lacks.. cars..
... but if you have a thermostat, you save in the 1440 kWh/day you would otherwise have spent on just warming up your apartment. Yay for thinking ovens.
You used a lot of words to simply say "passwords chosen by humans are typically easier to crack than randomly generated ones."
Explicitly encrypting the file with symmetric encryption doesn't make anything any easier than using public key encryption; it's the choice of symmetric key that matters (since one is used in both cases).
- Michael T. Babcock (Yes, I blog)
What if the content is hidden....say like 7zip encryption where file content can be hidden. Would it work then?
It's interesting to note that this would catch all but my strongest personal passwords. I tend to rely on english-like words.
Aside from swapping steps 2 and 3 (due to ease of implementation, and IMHO likeliness of being used by a victim) I would add three more steps:
Step 4: crack with minor (i.e. easily remembered) non-alphanumeric character additions. Up to now, "tits!" would foil this password algorithm. Coincidentally, this would also catch my strongest personal password.
Step 5: crack with geometric keyboard combinations. "qwerty" is probably in the dictionary attack, but there are many other rows, directions, and starting points to modify this, not to mention keyboard layouts.
Step 6: crack with knowledge of foreign languages, so it's not just English-like permutations. This would probably be non-trivial to implement, though extremely valuable if you're not targeting primarily English speakers.
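For what it's worth, steps 4 and 5 could be sketched roughly like this. It is only an illustration: the symbol list, the QWERTY row strings, and the function names are assumptions of mine, not anything from TFA or from Elcomsoft's tool.

```python
# Hypothetical candidate generators for "step 4" and "step 5".
# SYMBOLS and QWERTY_ROWS are illustrative assumptions.
SYMBOLS = "!?.#$@*"
QWERTY_ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]


def with_symbol_suffixes(words, symbols=SYMBOLS):
    """Step 4: append one easily remembered symbol to each dictionary word."""
    for word in words:
        for symbol in symbols:
            yield word + symbol


def keyboard_walks(rows=QWERTY_ROWS, min_len=4, max_len=8):
    """Step 5: straight runs along a keyboard row, in both directions."""
    for row in rows:
        for line in (row, row[::-1]):  # left-to-right and right-to-left
            for start in range(len(line)):
                for length in range(min_len, max_len + 1):
                    if start + length <= len(line):
                        yield line[start:start + length]


if __name__ == "__main__":
    print(list(with_symbol_suffixes(["hello"]))[:3])
    print(sum(1 for _ in keyboard_walks()), "keyboard-walk candidates")
```

Diagonal walks and other keyboard layouts would need a proper key-position map, which this sketch leaves out.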
Isn't the point of altruistic distributed computing like folding@home or seti@home that the machines would be on anyway? Fans running, displays on, a/c going, etc. Keeping the processor busy on a typical desktop machine isn't likely to add hugely to the power consumption, is it? This is fuel that mostly was going to be burned anyway, so why not get some useful information while we're at it?
Put the big safe in an expensive car, and the little safe in an old Holden. (Or use some American classic if you prefer.)
Semi-automatic amateur armchair Australian philosopher; conjecture ready at any moment...
What do you do after you've done all that? And don't forget to count the energy used to make the solar panels. Feel guilty about that for at least 35 years. And how much energy does it take to generate one human baby? That has to count. And have you shaded your planet from the sun recently? It is throwing thousands of kilowatts at the earth every second! Get outside and cover as much as possible to keep it cool.
Semi-automatic amateur armchair Australian philosopher; conjecture ready at any moment...
No.
Say that it takes 59 minutes to find the password. If you pay for 10 VMs, it will cost you 10 hours' worth. If you pay for 1,000 VMs, it will still cost you 1,000 hours' worth, because you can't cancel them after they have done 59 minutes' worth of work.
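A quick sketch of that billing arithmetic, assuming each VM is billed in whole hours rounded up (as the numbers above imply) and that the work splits perfectly evenly across VMs. The function name and parameters are made up for illustration.

```python
def billed_vm_hours(total_work_minutes, n_vms):
    """VM-hours billed when every VM's runtime is rounded up to a whole hour.

    Assumes perfect linear scaling and whole-hour billing (both assumptions).
    """
    # Ceiling division with integers, to avoid floating-point surprises.
    minutes_per_vm = -(-total_work_minutes // n_vms)
    hours_per_vm = -(-minutes_per_vm // 60)
    return n_vms * hours_per_vm


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, "VMs ->", billed_vm_hours(59, n), "billed hours")
```

Under those assumptions, adding VMs beyond the point where each one finishes inside its first billed hour only adds cost.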
A "known plaintext attack" is a specific cryptographic technique where you use known plaintext material to help break the key. A very simple example might be a Caesar cypher where you know the word ROME is in the message. You could then try subtracting the values of ROME against the letters in the message BUUBD LSPNF, and you'd quickly find the last four letters where the differences are all -1,-1,-1,-1, thus the key is to shift each letter by one, yielding the message ATTACK ROME. Without the plaintext you'd have to solve the cypher the old fashioned way.
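The ROME example can be sketched in a few lines. This is a toy illustration of the idea, not a general tool: it assumes an uppercase A-Z alphabet, ignores spaces, and returns the first position where the crib lines up with a constant difference. The function names are my own.

```python
def find_caesar_shift(ciphertext, crib):
    """Slide the crib along the ciphertext; return (shift, position) where
    every letter differs from the crib by the same amount, else None."""
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    crib = crib.upper()
    for start in range(len(letters) - len(crib) + 1):
        window = letters[start:start + len(crib)]
        diffs = {(ord(c) - ord(p)) % 26 for c, p in zip(window, crib)}
        if len(diffs) == 1:  # constant difference: a Caesar shift fits here
            return diffs.pop(), start
    return None


def caesar_decrypt(ciphertext, shift):
    """Shift every letter back by `shift`; leave other characters alone."""
    out = []
    for ch in ciphertext.upper():
        if ch.isalpha():
            out.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    shift, position = find_caesar_shift("BUUBD LSPNF", "ROME")
    print(shift, position, caesar_decrypt("BUUBD LSPNF", shift))
```

With a crib this short, a constant difference could also turn up by coincidence in a longer message, so a real attempt would check that the full decryption reads sensibly.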
The ULTRA project that decrypted Enigma didn't use a known plaintext attack either. They couldn't reverse engineer the keys from the cribs they obtained. Instead, they used the cribs to solve a different problem (which happens to be the same problem I'm describing.) They had to put in "stop words" to get the bombes to mechanically stop spinning when they encountered a possible solution.
What I described is not a known plaintext attack. It is simply testing the output of the algorithm against possible solutions. No algorithm is hardened against this because this is the normal function of the cypher.
John
Nothing is an absolute in cryptography. You still have to make guesses as to when you hit upon the correct answer. And an adversary has every incentive to not make it easy for you.
But the adversary might be lazy, and let the tools do their jobs by default. When they decrypt their archive, they probably want to immediately use the results without going through a second deobfuscation step. Never discount the value of human nature and laziness -- they could save you tons of work.
As a cryptanalyst, you have to look for shortcuts. If I were given foo.7z and was told it was encrypted, I'd only have a few formats to try. It could be a bare 7z format (described here), or a compressed version. I'd have to figure out what possible artifacts I could look for in the decrypted file. But knowing it's in 7z format makes the job easier. (Of course, 7zip's key strengthening routine would not make it easier, as the key is encrypted 2^18 times!)
I'm not saying it's easy. There are thousands of file formats in common use out there. And I could only hope that the adversary is using one of them. But it's really not much different from the problem that "magic" already solves. And I could probably leverage a .magic file to help with the identification task.
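As a rough sketch of what that test could look like: check each trial decryption against a handful of well-known file signatures and stop at the first plausible hit. The signature table is a tiny hand-picked subset, and the function names and the (key, plaintext) pairs are hypothetical; a real .magic database covers far more formats.

```python
# A few well-known leading byte signatures ("magic numbers").
SIGNATURES = {
    b"7z\xbc\xaf\x27\x1c": "7z archive",
    b"PK\x03\x04": "ZIP archive",
    b"\x1f\x8b": "gzip stream",
    b"%PDF-": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
}


def identify(candidate):
    """Return a format name if the bytes start with a known signature."""
    for magic, name in SIGNATURES.items():
        if candidate.startswith(magic):
            return name
    return None


def first_plausible(candidates):
    """Scan (key, trial_plaintext) pairs; return the first recognizable one."""
    for key, plaintext in candidates:
        kind = identify(plaintext)
        if kind is not None:
            return key, kind
    return None


if __name__ == "__main__":
    trials = [("guess1", b"\x00\x13garbage"), ("guess2", b"%PDF-1.4 ...")]
    print(first_plausible(trials))
```

Short signatures such as the two-byte gzip header will match random output about once in 65,536 tries, so a hit is only a "stop", and a longer check of the file structure would still be needed to confirm it.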
John
The last time I measured the difference between an Athlon 2400 CPU sitting idle at a Windows Explorer desktop (somewhere around 3% CPU usage) and the same CPU with its load pegged at 100%, the difference was 60 watts extra.
The reason I went looking was because I had installed distributed.net's cracker on one day, and the next when I went into the room I noticed that it was uncomfortably warmer than the previous day. So I used my battery backup's internal statistics to measure current draw, and I was very surprised to see the delta. I've since confirmed similar behavior with my latest desktop using a Kill-A-Watt, but I don't remember the exact numbers for load vs. idle.
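To put that delta in energy terms, here is a back-of-the-envelope calculation. It assumes the 60 W difference holds constant around the clock; the function name is just for illustration.

```python
def extra_energy_wh(delta_watts, hours):
    """Extra energy in watt-hours for a constant extra power draw."""
    return delta_watts * hours


if __name__ == "__main__":
    daily_wh = extra_energy_wh(60, 24)
    print(daily_wh, "Wh/day, or", daily_wh / 1000, "kWh/day")
```

That works out to 1440 Wh, or 1.44 kWh, per machine per day under that assumption.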
John
Sorry, yes, I meant watt-hours, not kilowatt-hours. It was a think-o. I did the calculations several years ago, and I didn't bother to go look them up again when I posted this morning. It's probably time to redo them anyway, as CPUs are now much more energy efficient than they were back then.
The point is that the amount of fuel being burned to re-confirm a statistical theory was staggering. We knew it would take roughly X tests to crack a DES key, and it did. Proving it once was exciting. We knew it would take Y tests to crack an RC5-56 key, and it did. Proving that was somewhat less exciting. There was no point in burning up further fuel just to prove that we could crack an RC5-64 key in Z tests. Fine, we know it'll work, save the damn fuel.
John
Go back and reread what I said. "Consider the value of the outcome." That means you should make sure that what you're spending energy on is worth it, not that you shouldn't use energy.
I don't feel particularly guilty about consuming energy (certainly not enough to do anything more serious about it than to use mass transit.) But I seriously question the wisdom or validity of using energy to confirm that it's going to take roughly 50% of 2^72 tests to brute force the RC5-72 challenge. This after already proving that it took about 85% of 2^56 tests to brute force the RC5-56 challenge, and a similar percentage of 2^64 tests to brute force the RC5-64 challenge. It is an insane waste of energy.
At least SETI@home has the unknown factor that some people can believe in. Folding@home and the World Computing Grid may provide actual scientific or humanitarian benefits. Rendering WoW at 72 frames per second gives the player an immersive experience. All those offer benefits (or potential benefits) to one or more people.
So is cracking PGP providing a benefit? In TFA's case, yes, they're trying to recover some lost files for a client. For us to repeat the experience, just to prove we can copy their efforts to brute force a PGP passphrase? Sheer waste.
John
There was a group called CyberLocator trying to do some authentication and geo/time stamping around GPS and a citadel model, though originally as a regional authentication scheme for online casinos, using raw GPS signal information and not calculated results, as part of the passphrase. They had some patents and stuff on it. Their authentication server had to have at least 3 of the same satellites in view to see the raw signal variances in the radio waves, which are apparently available on many but not all GPS chips. Because of time issues, it's substantially harder to fake your position information along with the regular secure passphrase transfer. I was able to track down one of the founders by following the casino related VC trail a few years ago when I was doing a project for work, but they probably went bust not long after and the patents got sold to the four winds. Good idea, I guess the marketing and associated costs weren't that hot.
There was a related thing called GeoCodex for doing location related encryption/geo-encryption, but that's either gone black or went bust. That too seemed to be a well executed idea, but things just didn't fall into place.