Codebreaking - Taking the First Step?
Master Spy asks: "Here's something that the Slashdot community might be able to help with. If you receive a message in code how do you take the first step? Back in the days of WWII it was easier. The codebreakers at Bletchley Park already knew that the messages were encoded using an Enigma machine so all they had to do was work out the positions of the rotors using brain power, the Bombe or later the Colossus machine. American codebreakers also knew the basic details about the methods the Japanese used but now however things are more complicated. Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' How do you know how the message has been encrypted? It could be an Enigma machine, it could have been XOR'd with a second message or a one-time pad or it could use some form of software encryption such as Blowfish or DES. Before you start ripping the message apart for decoding how do codebreakers find out what method has been used to encode the message?"
Well, if this was easy, codebreaking wouldn't be any fun. Don't forget that both the Germans and the Japanese had a variety (tens if not hundreds) of different cyphers in circulation, so it wasn't exactly as simple as assuming it's Enigma or Code Purple. :)
As to how it's done, that comes down to analysing the text: frequency analysis of 1-grams, 2-grams, etc. Simple substitution will exhibit one fingerprint (though different languages will obviously differ), something like a Playfair or Vigenère square will have another, and DES-encrypted text a completely different structure. Obviously, on a small enough sample there may not be enough information to latch onto...
But with a larger sample, it's mostly a combination of good tools, experience, and guesswork.
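To make the "fingerprint" idea concrete, here is a rough Python sketch of one classic statistic, the index of coincidence (the chance that two letters picked at random from the text are the same). It is my own illustration, not anything the codebreakers above necessarily use in this form; the function names are made up for the example. A simple substitution only relabels the letters, so the value is unchanged, while a polyalphabetic cipher like Vigenère flattens the distribution and tends to pull the value down toward that of random letters.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn at random from text are equal."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Monoalphabetic example: shift every letter by the same amount."""
    out = []
    for c in plaintext.upper():
        if "A" <= c <= "Z":
            out.append(chr((ord(c) - 65 + shift) % 26 + 65))
        else:
            out.append(c)
    return "".join(out)


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Polyalphabetic example: the shift cycles through the key letters."""
    out = []
    i = 0  # counts letters only, so spaces do not advance the key
    for c in plaintext.upper():
        if "A" <= c <= "Z":
            shift = ord(key[i % len(key)].upper()) - 65
            out.append(chr((ord(c) - 65 + shift) % 26 + 65))
            i += 1
        else:
            out.append(c)
    return "".join(out)


if __name__ == "__main__":
    sample = "Frequency analysis works best on long samples of ordinary text"
    print(index_of_coincidence(sample))
    print(index_of_coincidence(caesar_encrypt(sample, 7)))
    print(index_of_coincidence(vigenere_encrypt(sample, "LEMON")))
```

On a sample this short the numbers are noisy, which is exactly the "small sample" problem mentioned above; the statistic only becomes a dependable fingerprint with much more ciphertext.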
Step One: "Acquire more samples."
When you have less data than a smallish key (and that message has no more than 28 * 8 = 224 bits, probably much less), the data can (most likely) decrypt to anything at all with the proper key. If that's all you really have, then you need to pursue non-code-breaking methods of finding out what it is.
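A quick Python sketch of why so little data can "decrypt to anything": if we assume, purely for illustration, that the 28 characters were produced by XOR with a key as long as the message, then for any plaintext of the same length there is a key that yields it. The candidate plaintexts below are invented for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# The sample message from the question with the spaces dropped: 28 characters.
ciphertext = b"sdjekdYqkP1Nt$%GGl9)MHrYD+++"
print(len(ciphertext) * 8)  # 224 bits at most

# Hypothetical plaintexts, each exactly 28 bytes long.
candidates = [
    b"ATTACK AT DAWN ON THE BRIDGE",
    b"RETREAT TO THE HILLS AT NOON",
]

for plaintext in candidates:
    # The key that "explains" this plaintext is simply ciphertext XOR plaintext.
    key = xor_bytes(ciphertext, plaintext)
    # Decrypting with that key gives back the plaintext we chose.
    print(key.hex(), xor_bytes(ciphertext, key))
```

Both keys are equally valid as far as the ciphertext is concerned, so nothing in the 28 characters alone can tell you which message (if either) was sent.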
And of course what to do next depends on the characteristics of that additional data. A lot of cryptanalysis assumes you have knowledge of the encryption method; this is because it's "easy" to obtain by reading code, but "easy" is a relative term. It's easier than just guessing, but still hard. Without knowledge of the algorithm, you need to luck out and hope they used one with a distinct signature. If they didn't, you're probably out of luck on a single person's resources, because all of the "good" algorithms should be effectively indistinguishable from noise after encryption.
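One crude way to check the "indistinguishable from noise" part, sketched in Python: measure the Shannon entropy of the byte values. This is only a first-pass test of my own choosing, not a real distinguisher. Output from a good cipher should sit close to 8 bits per byte, like random data, while plain text or a weak classical cipher sits far lower. Here `os.urandom` merely stands in for good ciphertext.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((k / n) * math.log2(k / n) for k in Counter(data).values())


if __name__ == "__main__":
    english = b"the quick brown fox jumps over the lazy dog " * 200
    noise = os.urandom(len(english))  # stand-in for strong ciphertext
    print(byte_entropy(english))
    print(byte_entropy(noise))
```

Note that this needs thousands of bytes to mean anything; a 28-character message is far too short for the estimate to be useful, and a high score only says "looks random", not which algorithm produced it.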
1. Get some books. Schneier's book is the best starting point.
.. move up the scale until you can understand the way more sophisticated codes are broken (for instance differential cryptanalysis). It gets harder at this point and well outside the realm of practicality but if you get this far, you will be able to break any cryptographically weak cipher (which includes the products of many companies, unfortunately).
2. Learn statistics (and basic number theory). You can discover a lot about a message by its statistical properties.
3. BREAK LOTS OF CODES. Without experience, you are lost. Start by breaking substitution and Caesar ciphers (easy with statistics), then Vigenere/Gronsfeld ciphers (harder but still "crypto for dummies"), then try XOR ciphers (they can be solved easily in an interesting way)... then try to understand how WEP is broken... then DeCSS.
4. If you become advanced enough, you can start reading papers on cryptanalysis. Many of them are surprisingly easy to understand once you understand number theory. However, it is much more difficult to *discover* some of the stuff these guys come up with; it's pretty amazing.
Anyway, to summarize, understand the statistics involved and PRACTICE until you can just look at a substitution cipher and understand what it says... just by the letter frequencies! If you are trying to break a simple code you need lots of ciphertext to analyze.
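For anyone who wants to try the "easy with statistics" step, here is a minimal Python sketch of breaking a Caesar cipher by letter frequencies alone. It tries all 26 shifts and keeps the one whose letter counts best match English under a chi-squared score. The frequency table is the commonly quoted approximate one (in percent), and the function names are just for this example.

```python
from collections import Counter

# Approximate English letter frequencies, in percent.
ENGLISH_FREQ = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}


def shift_text(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` places, leaving everything else alone."""
    out = []
    for c in text:
        if c.isascii() and c.isalpha():
            base = ord("A") if c.isupper() else ord("a")
            out.append(chr((ord(c) - base + shift) % 26 + base))
        else:
            out.append(c)
    return "".join(out)


def chi_squared(text: str) -> float:
    """Lower means the letter counts look more like English."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n == 0:
        return float("inf")
    counts = Counter(letters)
    score = 0.0
    for letter, percent in ENGLISH_FREQ.items():
        expected = n * percent / 100
        score += (counts.get(letter, 0) - expected) ** 2 / expected
    return score


def break_caesar(ciphertext: str):
    """Return (shift, plaintext) for the best-scoring of the 26 shifts."""
    best = min(range(26), key=lambda s: chi_squared(shift_text(ciphertext, -s)))
    return best, shift_text(ciphertext, -best)


if __name__ == "__main__":
    message = ("Without experience you are lost so start by breaking "
               "substitution and Caesar ciphers which are easy with "
               "statistics and then move on to harder systems once the "
               "simple ones stop being a challenge")
    print(break_caesar(shift_text(message, 3)))
```

As the advice above says, this only works with enough ciphertext: on a handful of letters the frequencies are too noisy and the wrong shift can easily score best.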
And don't forget: sometimes you don't need to break a code at all. As a poster above wrote, sometimes context is enough. Sometimes an external clue will give the code away. How do you know what to look for? Experience!
If you are interested, I would suggest that you start by reading The Code Book by Simon Singh. It gives a good overview of the history of the battle between cryptography and cryptanalysis, and how ciphers have evolved to defeat methods of codebreaking. It's an interesting and entertaining read and you might gain some insight on how you would approach this particular cipher.
BTW, I have a truly marvellous solution to your cipher which this textarea is too small to contain.
A book titled 'System Identification And Key-Clustering', by Dr. I. J. Kumar is available from Aegean Park Press. It deals with defining a methodology for identifying cryptosystems and narrowing the key space applicable for a given message. This is quite what you want, but be warned - it is not for the faint of heart...