Codebreaking - Taking the First Step?

Posted by Cliff on Thursday February 20, 2003 @10:20AM from the journeys-of-a-thousand-miles dept.

Master Spy asks: "Here's something that the Slashdot community might be able to help with. If you receive a message in code how do you take the first step? Back in the days of WWII it was easier. The codebreakers at Bletchley Park already knew that the messages were encoded using an Enigma machine so all they had to do was work out the positions of the rotors using brain power, the Bombe or later the Colossus machine. American codebreakers also knew the basic details about the methods the Japanese used but now however things are more complicated. Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' How do you know how the message has been encrypted? It could be an Enigma machine, it could have been XOR'd with a second message or a one-time pad or it could use some form of software encryption such as Blowfish or DES. Before you start ripping the message apart for decoding how do codebreakers find out what method has been used to encode the message?"

11 of 83 comments

how by igs · 2003-02-20 10:28 · Score: 5, Informative

Well, if this was easy, codebreaking wouldn't be any fun. Don't forget that both the Germans and the Japanese had a variety (tens if not hundreds) of different cyphers in circulation, so it wasn't exactly as simple as assuming it's Enigma or Code Purple.
As to how it's done, that comes down to analysing the text: frequency analysis of 1-grams, 2-grams, etc. Simple substitution will exhibit one fingerprint (though different languages will obviously be different), something like a Playfair or Vigenere square will have another, and DES-encrypted text a completely different structure. Obviously, with a small enough sample there may not be enough information to latch onto...
But with a larger sample, it's mostly a combination of good tools, experience, and guesswork :)
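
For anyone who wants to try the n-gram counting mentioned above, here is a minimal sketch in Python. The function name ngram_counts and the sample string are made up for illustration; real work would compare the counts against reference tables for the suspected language.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count overlapping n-grams over the letters of text (case-folded, non-letters dropped)."""
    letters = [c for c in text.lower() if c.isalpha()]
    return Counter("".join(letters[i:i + n]) for i in range(len(letters) - n + 1))

# Hypothetical sample; a real intercept would need to be much longer.
sample = "the theory is that the thing works"
print(ngram_counts(sample, 1).most_common(3))  # most frequent single letters
print(ngram_counts(sample, 2).most_common(3))  # most frequent letter pairs
```

A simple substitution keeps the shape of these tables and only relabels the symbols, which is the "fingerprint" being described.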
1. Re:how by ChadN · 2003-02-20 13:59 · Score: 2, Informative
  
  Simply put, any new ad-hoc cipher that was not designed by experts will probably be subject to flaws that could be revealed by statistical tests, assuming a sufficient amount of ciphertext.
  
  Additionally, any ciphertext that was encrypted by a well-designed cipher (and I'll include DES in this example, despite its relatively small keysize by modern standards) will NOT be much harder to decrypt simply because the cipher is unknown. Even if you had LOTS of ciphertext and tried against every known published cipher, along with billions of variants (i.e., additional rounds for each one, etc.) the extra workload would be modest to minute, compared to the work of actually searching the keyspace (and in the case where there is no known plaintext, analyzing the de-ciphered text for probable plaintext).
  
  Most modern protocols go well out of their way to advertise the ciphers used to encode messages, precisely because that bit of information is of no real extra security as long as the key is kept secret (and is well chosen). To not do so would make deploying things a nightmare.
  
  So I think, in the end, few experts would argue that using the most commonly known good cipher, with a well-chosen key, is any less difficult to 'break' than using an obscure and secret cipher. Especially if that cipher is not one that is also widely deployed as a secure cipher. The real hard work is in finding the key. And the fun work is in finding ad-hoc ciphers that people think are secure because the method is secret.
  
  --
  "It's overkill, of course. But you can never have too much overkill." - Anonymous Slashdot Coward
Step 1: statistical analysis by Gadzinka · 2003-02-20 10:33 · Score: 2, Informative

Do the statistical analysis on the encrypted data, in several ways. If all you get is a seemingly random stream of data with an equal distribution of all values, then you've got a raw stream encrypted by a modern, quite strong cipher.
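
One rough way to check for an "equal distribution of all values" is the Shannon entropy of the byte histogram. This is only a sketch: the function name byte_entropy is made up here, and a value near 8 bits per byte is a hint of strong encryption (or compression), not proof.

```python
import math
import os
from collections import Counter

def byte_entropy(data):
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for perfectly uniform bytes."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Stand-ins for real traffic: English text versus random bytes.
english = b"it was the best of times it was the worst of times" * 20
print(round(byte_entropy(english), 2))           # noticeably below 8
print(round(byte_entropy(os.urandom(4096)), 2))  # close to 8
```

Short samples always score low simply because they cannot contain many distinct byte values, so this only means something with plenty of ciphertext.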

Good luck ;)

--
Bastard Operator From 193.219.28.162
Step One: by Jerf · 2003-02-20 10:36 · Score: 3, Informative

Step One: "Acquire more samples."

When you have less data than a smallish key (and that message has no more than 28 * 8 = 224 bits, probably much less), the data can (most likely) decrypt to anything at all with the proper key. If that's all you really have, then you need to pursue non-code-breaking methods of finding out what that is.
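
To make the "decrypt to anything at all" point concrete, here is a small sketch that assumes the message was XORed with a key as long as itself (a one-time pad). That assumption, the two guessed plaintexts, and the names key_for and xor are all made up for illustration; nothing here says how the original message was really produced.

```python
def xor(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(d ^ k for d, k in zip(data, key))

def key_for(ciphertext: bytes, plaintext: bytes) -> bytes:
    """Return the pad that would turn this ciphertext into this plaintext."""
    if len(ciphertext) != len(plaintext):
        raise ValueError("ciphertext and plaintext must be the same length")
    return xor(ciphertext, plaintext)

# The intercepted message from the question, spaces removed: 28 characters.
ciphertext = b"sdjekdYqkP1Nt$%GGl9)MHrYD+++"

# Two invented 28-character guesses; each has a perfectly valid key.
guess_a = b"ATTACK AT DAWN ON THE BRIDGE"
guess_b = b"RETREAT AT ONCE TO THE RIVER"

for guess in (guess_a, guess_b):
    key = key_for(ciphertext, guess)
    print(key.hex(), "->", xor(ciphertext, key).decode())
```

Both keys are equally plausible, so the ciphertext alone cannot tell you which plaintext (if either) is right.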

And of course what to do next depends on the characteristics of that additional data. A lot of cryptanalysis assumes you have knowledge of the encryption method; this is because it's "easy" to obtain by reading code, but "easy" is a relative term. It's easier than just guessing, but still hard. Without knowledge of an algorithm, you need to luck out and hope they used one with a distinct signature. If they didn't, you're probably basically out of luck on a single person's resources, because all of the "good" algorithms should be effectively indistinguishable from noise after encryption.
Well... it all depends... by j.e.hahn · 2003-02-20 10:37 · Score: 2, Informative

Code breaking is hard by its very nature. You're trying to find an unknown message by inverting (or short-circuiting) an unknown process.

If you think of things mathematically, you're looking to find a plaintext p in the set of all possible plaintexts P and some function f from the set of all ciphertexts to the set of all plaintexts where f(c)=p. This means both f and p are unknown, and while multiple solutions may exist they are likely of "measure zero" in 2 very large spaces. (Let's assume we have a suitable measure for such things, and not worry about the real details.)

To a mathematician, finding a general solution to the above would be a Fields Medal winning sort of thing. The reality is that you need more information. If you got a large message you should start checking letter/symbol counts, followed by the counts of various character pairings, etc. The goal is often to come up with a statistical model to see if you can build a plausible f. Another thing is to try common functions (xor with various values, etc.) on the stream and see what happens. Sometimes that'll give you a clue. But most of the time it involves a little luck, a little intuition and a lot of perseverance.
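
The "xor with various values and see what happens" idea can be sketched as below. It assumes the whole message was XORed with one repeated byte and that the plaintext is English; the scoring rule (fraction of letters and spaces) and all the names are invented for this example.

```python
LETTERS_AND_SPACE = frozenset(
    b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ "
)

def xor_with_byte(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(b ^ key for b in data)

def english_score(candidate: bytes) -> float:
    """Crude plausibility score: fraction of bytes that are ASCII letters or spaces."""
    if not candidate:
        return 0.0
    return sum(b in LETTERS_AND_SPACE for b in candidate) / len(candidate)

def guess_single_byte_key(ciphertext: bytes):
    """Try all 256 keys and keep the one whose output looks most like text."""
    key = max(range(256), key=lambda k: english_score(xor_with_byte(ciphertext, k)))
    return key, xor_with_byte(ciphertext, key)

# Hypothetical message and key, just to exercise the search.
secret = b"meet me at the usual place at noon"
intercepted = xor_with_byte(secret, 0x5A)
print(guess_single_byte_key(intercepted))
```

If no key produces anything text-like, that is also a clue: the guess about f was wrong and it is time to try another family of functions.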
how to perform cryptanalysis by Anonymous Coward · 2003-02-20 10:41 · Score: 5, Informative

1. Get some books. Schneier's book is the best starting point.

2. Learn statistics (and basic number theory). You can discover a lot about a message by its statistical properties.

3. BREAK LOTS OF CODES. Without experience, you are lost. Start by breaking substitution and Caesar ciphers (easy with statistics), then Vigenere/Gronsfeld ciphers (harder but still "crypto for dummies"), then try XOR ciphers (they can be solved easily in an interesting way)... then try to understand how WEP is broken... DeCSS .. move up the scale until you can understand the way more sophisticated codes are broken (for instance differential cryptanalysis). It gets harder at this point and well outside the realm of practicality but if you get this far, you will be able to break any cryptographically weak cipher (which includes the products of many companies, unfortunately).

4. If you become advanced enough, you can start reading papers on cryptanalysis. Many of them are surprisingly easy to understand once you understand number theory. However, it is much more difficult to *discover* some of the stuff these guys come up with, it's pretty amazing.

Anyway, to summarize, understand the statistics involved and PRACTICE until you can just look at a substitution cipher and understand what it says... just by the letter frequencies! If you are trying to break a simple code you need lots of ciphertext to analyze.
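
As a first practice target, a Caesar cipher can be broken with nothing but letter frequencies. The sketch below tries all 26 shifts and keeps the one whose letter counts best match English by a chi-squared score. The frequency table holds approximate percentages quoted from memory, and the function names are made up for this example.

```python
import string
from collections import Counter

# Approximate English letter frequencies in percent (rounded, assumed values).
ENGLISH_FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7, "s": 6.3,
    "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8, "u": 2.8, "m": 2.4,
    "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0, "p": 1.9, "b": 1.5, "v": 0.98,
    "k": 0.77, "j": 0.15, "x": 0.15, "q": 0.095, "z": 0.074,
}

def shift_text(text, shift):
    """Rotate ASCII letters by shift positions, leaving everything else alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

def chi_squared(text):
    """Lower means the letter counts look more like English."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    if not letters:
        return 0.0
    counts = Counter(letters)
    total = 0.0
    for letter, percent in ENGLISH_FREQ.items():
        expected = len(letters) * percent / 100
        total += (counts[letter] - expected) ** 2 / expected
    return total

def break_caesar(ciphertext):
    """Return (shift, plaintext) for the best-scoring of the 26 possible shifts."""
    shift = min(range(26), key=lambda s: chi_squared(shift_text(ciphertext, -s)))
    return shift, shift_text(ciphertext, -shift)

plain = ("it was the best of times it was the worst of times "
         "it was the age of wisdom it was the age of foolishness")
print(break_caesar(shift_text(plain, 3)))
```

With only a handful of letters the statistics get unreliable, which is why you need lots of ciphertext.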

And don't forget: sometimes you don't need to break a code at all. As a poster above wrote, sometimes context is enough. Sometimes an external clue will give the code away. How do you know what to look for? Experience!
1. Re:how to perform cryptanalysis by Garin · 2003-02-20 13:19 · Score: 4, Informative
  
  Actually, I disagree. I don't think Schneier's book is the best place to start. It's a fine book, no doubt, but it says very little about real cryptology from a theoretical standpoint, or from the point of view of teaching you to develop or break codes.
  
  If you're a math god, start with the Handbook of Applied Cryptography by Menezes, van Oorschot, and Vanstone.
  
  If your math isn't quite as godly, start with Thomas Barr's "Invitation to Cryptology". It's an excellent starter book for anyone with even a little bit of mathematical skill. You really don't need much but some high school math, maybe a bit of first-year algebra and stuff, and a willingness to do the chapter problems.
  
  --
  In any field, find the strangest thing and then explore it. -John Archibald Wheeler
Reading Material by dhwang · 2003-02-20 10:51 · Score: 4, Informative

If you are interested, I would suggest that you start by reading The Code Book by Simon Singh. It gives a good overview of the history of the battle between cryptography and cryptanalysis, and how ciphers have evolved to defeat methods of codebreaking. It's an interesting and entertaining read and you might gain some insight on how you would approach this particular cipher.

BTW, I have a truly marvellous solution to your cipher which this textarea is too small to contain.
Always starts in the same place by onegecko · 2003-02-20 10:53 · Score: 2, Informative

You start with known encryption methods (simplest first) and by process of elimination you keep going until you get a clue. A good cryptologist has information about everything from Bacon's method to the most recent ciphers and their algorithms.

The folks who cracked the Enigma started the same way. The Poles started the process and sent their results to England, where it was completed.

A code fragment that short, though, would be darned impossible to crack unless you get more.

But you can already see patterns: word length, multiple "+" characters (maybe an indicator of end-of-phrase or something?).

But that's -- basically -- how you do it. Educated guesses and grunt work (either by you or computer). Unless it's Quantum encryption which is spoiled as soon as you intercept, so you can't decode it.

Check out The Code Book for some great -- albeit basic -- information about methodology and history.
Cryptosystem identification literature by fiffilinus · 2003-02-20 22:15 · Score: 3, Informative

A book titled 'System Identification And Key-Clustering', by Dr. I. J. Kumar, is available from Aegean Park Press. It deals with defining a methodology for identifying cryptosystems and narrowing the key space applicable for a given message. This is pretty much what you want, but be warned - it is not for the faint of heart...
How to start. by Eivind · 2003-02-20 23:37 · Score: 2, Informative
1. Try to get more text coded with the same cryptosystem (and preferably the same key). Cracking anything based on 25 bytes of ciphertext is going to be hard.
2. Look for statistics. Run character-statistics. Do they look like normal text, only with different symbols ? If so you have a monoalphabetic substitution-cipher, crackable in 5 seconds by a computer or 5 minutes by hand. Repeat for digraphs or trigraphs. Any result different from "all combinations equally likely" (or close) gives you a hint.
3. Try to xor the text with a copy of itself shifted various places left and right. Observe how many nulls you get with various displacements. If you get a jump in nulls for a certain shift, you're likely dealing with a periodic substitution-cipher. Again easily crackable if the period is not too long and you have enough ciphertext. (Enough here is something like 20 times the period. So if the period is 50 you'd need a kilobyte of ciphertext to easily attack it, more or less.)
If the text looks completely random under all statistical analysis you can think of, and stays that way even when xored with itself shifted various ways, odds are you're dealing with something a bit more serious, and you'll need more expertise than you can gain from an "Ask Slashdot" article to crack it.
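
Step 3 can be sketched like this: XORing the text with a shifted copy of itself gives a null wherever the two bytes are equal, so counting equal pairs is the same thing. The function names and the repeating-key XOR used to make a test ciphertext are assumptions for the demo, not anything from a real intercept.

```python
def coincidences(data: bytes, shift: int) -> int:
    """Count positions where data equals itself shifted by `shift` (the nulls of the XOR)."""
    return sum(1 for a, b in zip(data, data[shift:]) if a == b)

def coincidence_rates(data: bytes, max_shift: int) -> dict:
    """Fraction of matching positions for each shift from 1 to max_shift."""
    return {
        shift: coincidences(data, shift) / (len(data) - shift)
        for shift in range(1, max_shift + 1)
    }

def repeating_key_xor(data: bytes, key: bytes) -> bytes:
    """Toy periodic cipher used only to produce demo ciphertext."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Hypothetical plaintext and 5-byte key; look for a jump at multiples of the period.
plaintext = (b"it was the best of times it was the worst of times it was the age "
             b"of wisdom it was the age of foolishness it was the epoch of belief "
             b"it was the epoch of incredulity it was the season of light")
ciphertext = repeating_key_xor(plaintext, b"\x13\x9c\x47\xe0\x2b")
for shift, rate in coincidence_rates(ciphertext, 12).items():
    print(shift, round(rate, 3))
```

At a shift equal to a multiple of the period, both bytes were enciphered with the same key byte, so the rate rises toward that of plain text against shifted plain text; at other shifts it stays near the level you would see for random bytes.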
Good luck !