Codebreaking - Taking the First Step?

Posted by Cliff on Thursday February 20, 2003 @10:20AM from the journeys-of-a-thousand-miles dept.

Master Spy asks: "Here's something that the Slashdot community might be able to help with. If you receive a message in code, how do you take the first step? Back in the days of WWII it was easier. The codebreakers at Bletchley Park already knew that the messages were encoded using an Enigma machine, so all they had to do was work out the positions of the rotors using brain power or the Bombe (the later Colossus machine was built to attack the Lorenz cipher rather than Enigma). American codebreakers also knew the basic details of the methods the Japanese used. Now, however, things are more complicated. Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' How do you know how the message has been encrypted? It could be an Enigma machine, it could have been XOR'd with a second message or a one-time pad, or it could use some form of software encryption such as Blowfish or DES. Before you start ripping the message apart for decoding, how do codebreakers find out what method has been used to encode the message?"

20 of 83 comments

Medium by peripatetic_bum · 2003-02-20 10:22 · Score: 4, Insightful

Let the medium decide how to approach this.
I.e., if it came over the net, look at the wrappers.
If over the radio, look at the spectrum.
It's what's around the message that will break the message.

--
Sigs are dangerous coy things
how by igs · 2003-02-20 10:28 · Score: 5, Informative

Well, if this was easy, codebreaking wouldn't be any fun. Don't forget that both the Germans and the Japanese had a variety (tens if not hundreds) of different ciphers in circulation, so it wasn't exactly as simple as assuming it's Enigma or Purple.
As to how it's done, that has to do with analysing the text: frequency analysis of 1-grams, 2-grams, etc. Simple substitution will exhibit one fingerprint (though different languages will obviously be different), something like a Playfair or Vigenère square will have another, and DES-encrypted text a completely different structure. Obviously, on a small enough sample there may not be enough information to latch onto...
But with a larger sample, it's mostly a combination of good tools, experience, and guesswork :)
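
One way to picture the "fingerprint" idea is the index of coincidence, a standard statistic that the post above does not name. The following is only a rough Python sketch: the sample sentence, the Caesar shift standing in for "simple substitution", and the function names are all assumptions made for illustration.

```python
from collections import Counter
import string


def index_of_coincidence(text: str) -> float:
    """Chance that two letters picked at random from `text` are identical."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


def caesar_shift(text: str, shift: int) -> str:
    """Toy monoalphabetic substitution: rotate every letter by `shift`."""
    out = []
    for c in text.upper():
        if c in string.ascii_uppercase:
            out.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(c)
    return "".join(out)


# Illustrative inputs (assumptions, not data from the thread).
PLAIN = ("It is a truth universally acknowledged that a single man in "
         "possession of a good fortune must be in want of a wife")
FLAT = string.ascii_uppercase * 20  # every letter equally frequent

ic_plain = index_of_coincidence(PLAIN)
ic_subst = index_of_coincidence(caesar_shift(PLAIN, 7))
ic_flat = index_of_coincidence(FLAT)

print(f"English plaintext:     {ic_plain:.4f}")
print(f"simple substitution:   {ic_subst:.4f}")
print(f"flat letter histogram: {ic_flat:.4f}")
```

A simple substitution only relabels the letters, so its value matches the plaintext exactly, while text whose letters are all about equally frequent (roughly what a polyalphabetic cipher with a long key, or a modern cipher mapped onto letters, tends towards) scores close to 1/26. On a sample as short as the one in the question the statistic is too noisy to mean much.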
I've seen that. by orthogonal · 2003-02-20 10:33 · Score: 4, Funny

Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++'

Yeah, 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' showed up on my SETI@home screen too.

This is clearly the signature of the Grays from Cygnus Prime. You don't want to communicate with them.

The Grays of Cygnus Prime are evil. They will put chips in your head.

They will use the chips to make you do bad things. Like posting to Slashdot.

--
Opinions on the Twiddler2 hand-held keyboard?
Step One: by Jerf · 2003-02-20 10:36 · Score: 3, Informative

Step One: "Acquire more samples."

When you have less data than a smallish key (and that message has no more than 28 * 8 = 224 bits, probably much less), the data can (most likely) decrypt to anything at all with the proper key. If that's all you really have, then you need to pursue non-code-breaking methods of finding out what it is.
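
To make the "can decrypt to anything" point concrete, here is a small Python sketch. It assumes, purely for illustration, that the cipher is a byte-wise XOR with a key as long as the message; the two candidate plaintexts are made up, and the spaces in the intercepted text are kept as ordinary bytes for convenience.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# The intercepted text from the question, treated as raw ASCII bytes.
CIPHERTEXT = b"sdjek dYqkP 1Nt$% GGl9) MHrYD +++"

# Two made-up candidate plaintexts (hypothetical), padded to the same length.
GUESSES = [
    b"ATTACK AT DAWN ON THE NORTH GATE.".ljust(len(CIPHERTEXT)),
    b"NOTHING TO REPORT, ALL QUIET HERE".ljust(len(CIPHERTEXT)),
]

for guess in GUESSES:
    key = xor_bytes(CIPHERTEXT, guess)      # the key that "explains" this guess
    recovered = xor_bytes(CIPHERTEXT, key)  # decrypting with that key
    print(key.hex(), "->", recovered.decode("ascii"))
```

Every guess comes with a key that produces it, so with this little ciphertext and an unknown key of comparable size, nothing in the message itself can tell the guesses apart.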

And of course what to do next depends on the characteristics of that additional data. A lot of cryptanalysis assumes you have knowledge of the encryption method; this is because it's "easy" to obtain by reading code, but "easy" is a relative term. It's easier than just guessing, but still hard. Without knowledge of the algorithm, you need to luck out and hope they used one with a distinct signature. If they didn't, you're probably out of luck on a single person's resources, because all of the "good" algorithms should be effectively indistinguishable from noise after encryption.
encryption breaking by Dot.Sig · 2003-02-20 10:36 · Score: 3, Funny

"Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++'"

I transmit a polite reply saying, "No, I am NOT interested in being your love monkey no matter how much you lust after me." Gee whiz! The things people say when they think nobody is listening!
how to perform cryptanalysis by Anonymous Coward · 2003-02-20 10:41 · Score: 5, Informative

1. Get some books. Schneier's book is the best starting point.

2. Learn statistics (and basic number theory). You can discover a lot about a message by its statistical properties.

3. BREAK LOTS OF CODES. Without experience, you are lost. Start by breaking substitution and Caesar ciphers (easy with statistics), then Vigenere/Gronsfeld ciphers (harder but still "crypto for dummies"), then try XOR ciphers (they can be solved easily in an interesting way)... then try to understand how WEP is broken... DeCSS... move up the scale until you can understand the way more sophisticated codes are broken (for instance differential cryptanalysis). It gets harder at this point and well outside the realm of practicality, but if you get this far, you will be able to break any cryptographically weak cipher (which includes the products of many companies, unfortunately).

4. If you become advanced enough, you can start reading papers on cryptanalysis. Many of them are surprisingly easy to understand once you understand number theory. However, it is much more difficult to *discover* some of the stuff these guys come up with; it's pretty amazing.

Anyway, to summarize, understand the statistics involved and PRACTICE until you can just look at a substitution cipher and understand what it says... just by the letter frequencies! If you are trying to break a simple code you need lots of ciphertext to analyze.

And don't forget: sometimes you don't need to break a code at all. As a poster above wrote, sometimes context is enough. Sometimes an external clue will give the code away. How do you know what to look for? Experience!
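
As a first exercise of the "easy with statistics" kind, a Caesar cipher can be broken by trying all 26 shifts and keeping the one whose letter counts look most like English. The Python below is a rough sketch, not a reference implementation: the frequency table is approximate, the chi-squared score is one common choice among several, and the sample sentence and function names are assumptions.

```python
import string
from typing import Tuple

# Approximate relative frequencies of letters in English text (percent).
ENGLISH_FREQ = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}


def shift_text(text: str, shift: int) -> str:
    """Upper-case `text` and rotate each letter by `shift` positions."""
    out = []
    for c in text.upper():
        if c in string.ascii_uppercase:
            out.append(chr((ord(c) - 65 + shift) % 26 + 65))
        else:
            out.append(c)
    return "".join(out)


def chi_squared(text: str) -> float:
    """Lower score = letter counts closer to typical English."""
    letters = [c for c in text if c in string.ascii_uppercase]
    n = len(letters)
    if n == 0:
        return float("inf")
    score = 0.0
    for letter, pct in ENGLISH_FREQ.items():
        expected = n * pct / 100
        score += (letters.count(letter) - expected) ** 2 / expected
    return score


def break_caesar(ciphertext: str) -> Tuple[int, str]:
    """Return the most English-looking (shift, plaintext) pair."""
    scored = [(chi_squared(shift_text(ciphertext, -s)), s) for s in range(26)]
    _, best = min(scored)
    return best, shift_text(ciphertext, -best)


# Illustrative message (an assumption, not data from the thread).
PLAIN = ("It is a truth universally acknowledged that a single man in "
         "possession of a good fortune must be in want of a wife")
SECRET = shift_text(PLAIN, 11)

print(SECRET)
print(break_caesar(SECRET))
```

The same scoring idea carries over to Vigenere once the key length is guessed, since each key position is then just its own Caesar cipher. It needs a reasonable amount of text; a few dozen letters can easily fool it.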
1. Re:how to perform cryptanalysis by Garin · 2003-02-20 13:19 · Score: 4, Informative
  
  Actually, I disagree. I don't think Schneier's book is the best place to start. It's a fine book, no doubt, but it says very little about real cryptology from a theoretical standpoint, or from the point of view of teaching you to develop or break codes.
  
  If you're a math god, start with the Handbook of Applied Cryptography by Menezes, van Oorschot, and Vanstone.
  
  If your math isn't quite as godly, start with Thomas Barr's "Invitation to Cryptology". It's an excellent starter book for anyone with even a little bit of mathematical skill. You really don't need much but some high school math, maybe a bit of first-year algebra and stuff, and a willingness to do the chapter problems.
  
  --
  In any field, find the strangest thing and then explore it. -John Archibald Wheeler
Reading Material by dhwang · 2003-02-20 10:51 · Score: 4, Informative

If you are interested, I would suggest that you start by reading The Code Book by Simon Singh. It gives a good overview of the history of the battle between cryptography and cryptanalysis, and how ciphers have evolved to defeat methods of codebreaking. It's an interesting and entertaining read and you might gain some insight on how you would approach this particular cipher.

BTW, I have a truly marvellous solution to your cipher which this textarea is too small to contain.
Look at it. by Glonoinha · 2003-02-20 10:52 · Score: 4, Insightful

Take a bunch of encoded stuff and simply look at it, watch for patterns over the course of the data as a whole. For a small sample set

'sdjek dYqkP 1Nt$% GGl9) MHrYD +++'

this is not going to do you much good, but if you have reams of encoded / encrypted data just stare at it for a while, look at it in a way that you look through it (like those hidden picture things) and after a while you will recognize patterns and have something with which to work.

There is a fine line between the high quality software engineer and mild autism. Ever watch 'Rain Man' or 'A Beautiful Mind' and think - hey that guy would be a BAD ASS developer ...

Helps to be able to think in 6+ dimensions when you are cracking codes, and a photographic memory helps too.

I should probably post this as an AC - last thing I need is the CIA / NSA figuring out what I am capable of :p

--
Glonoinha the MebiByte Slayer
Simple by splattertrousers · 2003-02-20 10:53 · Score: 5, Funny

Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' How do you know how the message has been encrypted?
First you djc,s dk%33R +++ (110), then you sD##N KDL:: Ds03k -332+. From there, it's a trivial matter of just 3!Wop mclDI a002g a!22# with the sklj3 V3iia aq@@1 +1867 -5309.
Duh.
Homeland Security by lostindenver · 2003-02-20 10:57 · Score: 3, Insightful

If you are an American, even posting this is probably a violation of some four-letter-acronym or terrorist-prevention law. I mean, why not just ask how to make a model rocket, why don't ya?
Learn about the traffic. by rjh · 2003-02-20 11:03 · Score: 3, Insightful

Stop thinking about the encrypted bits. Start thinking about who sent these bits and who these bits were sent to. Think about the application which created the data. Think about what purpose the data is going to be used for.

Once you have this information, you'll be much better equipped to figure out what the basic structure underpinning the cipher is. For instance, if the data is part of a realtime encrypted stream, I'd think "stream cipher" and look at RC4 or SEAL. If the data's part of a pen-and-paper arrangement with all values mod 26, I'd think "Solitaire". If the data's a pen-and-paper arrangement meant for communicating between two deep-cover espionage agents, I'd think "one-time pad". If the data's something pulled off a disk drive, I'd think of Matt Blaze's ECB+OFB algorithm. Etc.

What it boils down to is, this question is pretty arbitrary. Very rarely will you have no metainformation about the plaintext. Seek out as much metainformation as you can, and use the metainformation to make educated guesses, cribs, etc.
Code breaking by crmartin · 2003-02-20 11:08 · Score: 3, Interesting

Here's a partial answer:

(1) There is always the possibility that you simply won't. In fact, a properly used one-time pad cipher is indistinguishable from noise. It's also a major pain in the ass to use, because you must somehow transmit as many bits of key as you want to send bits of message, and your one-time pad is only as good as your method of transmitting the key.

(2) If there is some kind of message in the signal and a cipher is involved other than a one-time pad or something isomorphic to one, then there will be some degree of redundancy in it. This is a theorem of information theory. Statistical measures will eventually reveal that the redundancy exists.

(3) At that point, there are lots of approaches. A good, readable, and interesting introduction to these, along with the history of such things, is David Kahn's The Codebreakers. Bruce Schneier's Applied Cryptography is a good, more technical introduction for the computer geek. I've also heard good things about the Handbook of Applied Cryptography, but I don't actually know the book.

But as someone notes above, it's an inherently hard problem to simply identify the cipher, and modern ciphers like RSA are, as far as we know, computationally intractable, because the only known general attack requires factoring a very large number that is the product of two large primes.

(4) You give up and hire a pretty young woman to talk the marine guards into letting you at the code room. (Details of this approach are left as an exercise for the interested reader.) Sometimes the old fashioned ways are best.
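
Points (1) and (2) can be illustrated with a crude redundancy measure: the entropy of the byte histogram. This Python sketch is only a first-order check (it ignores digrams and longer patterns), and the sample text, the toy byte substitution, and the names are assumptions made for illustration.

```python
import math
import os
from collections import Counter


def bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy of the byte histogram, in bits per byte."""
    n = len(data)
    if n == 0:
        return 0.0
    counts = Counter(data)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


# Illustrative plaintext (an assumption), repeated to get a usable sample size.
SAMPLE_TEXT = (b"It is a truth universally acknowledged that a single man in "
               b"possession of a good fortune must be in want of a wife. ") * 40

# Toy substitution cipher: every byte value is replaced by a fixed other value.
SUBSTITUTED = bytes((b + 13) % 256 for b in SAMPLE_TEXT)

# One-time pad: XOR with fresh random key bytes, one per message byte.
PADDED = bytes(a ^ b for a, b in zip(SAMPLE_TEXT, os.urandom(len(SAMPLE_TEXT))))

print(f"plaintext:    {bits_per_byte(SAMPLE_TEXT):.2f} bits/byte")
print(f"substitution: {bits_per_byte(SUBSTITUTED):.2f} bits/byte")
print(f"one-time pad: {bits_per_byte(PADDED):.2f} bits/byte")
```

The substitution leaves the redundancy of the plaintext untouched, while the padded output sits close to the 8 bits per byte expected of noise. Output from a good modern cipher should also look like noise by this crude measure, which is why statistics of this kind mostly help against classical or broken schemes.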
There's no such thing as code-breaking today. by 3-State+Bit · 2003-02-20 11:10 · Score: 4, Interesting

The only, only thing you can expect to learn is who's communicating with whom [and when / how much information is exchanged] ( you probably know this already ) , and what protocol they're using ( it's probably unbreakable ).

Chances are, if you are intercepting an encrypted stream, you are intercepting an unbreakably encrypted stream.

Perhaps you are thinking that if only you knew what protocol the stream is using, you might look online and see if that protocol has been cracked.

Don't waste your time.

The chances are approximately 0 that the stream you are intercepting is using a protocol that has been cracked, or that it is using a keyspace you can brute-force for under a few hundred thousand dollars, or in under a matter of years.

Sorry -- you have a higher chance (almost infinitely higher -- as I said, the chance you will succeed in what you are asking to do is approximately 0) of port-scanning the machine at the source or the destination and 0wning it than you do of breaking the stream.

I don't say this to mean you should give up -- just that you're phrasing your question wrong. Don't discount the 0wning avenue of attack.

For every million desktop machines communicating over TCP/IP, only a matter of a few dozen will have 0 exploitable security weaknesses. (However, most security weaknesses are unknown.)

Find out what kind of machine is at the source and the destination, then 0wn one of them. Chances are almost overwhelming that it's possible, if not with a remote exploit, then through social engineering. (Send an attachment that will be opened on either end of the communication, or induce either end to visit a web page in a browser that is exploitable, which is basically every browser except Lynx.)

If they browse with Netscape or Internet Explorer, chances are almost overwhelming that they can be owned.

It's not that hard to get someone to browse to a certain page, if you know anything at all about who that person is.

Back to your original question: gone are the days when protocols were breakable by any hotshot think tank. Today only implementations are, and rarely at the level you're trying to address. Don't break the code -- break into the system.

Hope this helps.
Social engineering solution by mbstone · 2003-02-20 11:41 · Score: 4, Funny

You social-engineer the NSA or other TLA with teraflop codebreaking computer-capability into helping you crack the message. For example, consider the following method used by an Idahoan to get his potato field plowed:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>& gt ;
An old man lived alone in southern Idaho. It was early spring, and he wanted to spade a garden plot to prepare it for planting potatoes. But it was very hard work and he just didn't have the energy.

You see, his only son, who would have helped him, was in prison.

The old man wrote a letter to his son and mentioned his predicament.

A week later, he received a note back, which said, "For heaven's sake, Dad, whatever you do, don't dig up that section of the garden! That's where I buried the GUNS!"

The next morning, bright and early, a dozen police showed up and dug up the entire garden, without finding any guns.

Looking out the kitchen window, the old man thought "Now, what in the world is going on here?" Confused, he wrote another letter to his son telling him what happened, and asked him for advice.

Another week passed and his son's reply arrived in the mailbox. The old man carried the letter up to the house, sat down at the kitchen table and read, "Now plant your potatoes, Dad. It's the best I could do for you under the circumstances."
echo /dev/urandom /dev/ttyS0 by Emrikol · 2003-02-20 11:45 · Score: 4, Funny

sdjek dYqkP 1Nt$% GGl9) MHrYD +++

NO CARRIER

Damn line noise...

Good old memories!

--
You're all bastards!
Hrmph.. by penguin_punk · 2003-02-20 12:22 · Score: 3, Funny

Thanks Cliff, now everyone knows Osama's slashdot nick.

--
HURD - Hurd's Under Research & Development
sdjek dYqkP 1Nt$% GGl9) MHrYD +++ by pizza_milkshake · 2003-02-20 12:50 · Score: 4, Funny

sdjek dYqkP 1Nt$% GGl9) MHrYD +++
this is obviously perl code
It's garbage by lkaos · 2003-02-20 19:25 · Score: 4, Insightful

sdjek dYqkP 1Nt$% GGl9) MHrYD +++

Two things give it away:

The spaces are too regular. You'd be quite hard pressed to form a coherent sentence with any character occurring at such regular five-character intervals.

So then perhaps the spaces are irrelevant. Then the next questionable aspect is the final three +'s. Now, if your code didn't at least work in groups of three, the mathematical likelihood of three +'s occurring in a row would be small.

So then, what would make the most sense is some kind of consistent bit manipulation, at least in cycles of three characters. But then the doubled G's and the unique characters ($%) make that unlikely too.

So what makes the most sense? Just random typing.

Look at the first set of characters:

sdjek

Just type it a few times... It's quite natural. You might as well have used asdf (I bet your typing style isn't perfect... you probably favor your right hand).

If you examine each other character grouping, you'll see that none of them are very hard to reach.

Also, it gets the KISS approval, which in most circumstances is the winning vote.

--
int func(int a);
func((b += 3, b));
Cryptosystem identification literature by fiffilinus · 2003-02-20 22:15 · Score: 3, Informative

A book titled 'System Identification And Key-Clustering', by Dr. I. J. Kumar is available from Aegean Park Press. It deals with defining a methodology for identifying cryptosystems and narrowing the key space applicable for a given message. This is quite what you want, but be warned - it is not for the faint of heart...