Codebreaking - Taking the First Step?
Master Spy asks: "Here's something that the Slashdot community might be able to help with. If you receive a message in code how do you take the first step? Back in the days of WWII it was easier. The codebreakers at Bletchley Park already knew that the messages were encoded using an Enigma machine so all they had to do was work out the positions of the rotors using brain power, the Bombe or later the Colossus machine. American codebreakers also knew the basic details about the methods the Japanese used but now however things are more complicated. Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' How do you know how the message has been encrypted? It could be an Enigma machine, it could have been XOR'd with a second message or a one-time pad or it could use some form of software encryption such as Blowfish or DES. Before you start ripping the message apart for decoding how do codebreakers find out what method has been used to encode the message?"
Let the medium decide how you approach this.
I.e., if it came over the net, look at the wrappers.
If over the radio, look at the spectrum.
It's what's around the message that will break the message.
Sigs are dangerous coy things
The better you know what's out there to use, the better the chance of recognizing what you're up against.
Take a bunch of encoded stuff and simply look at it; watch for patterns over the course of the data as a whole. For a small sample set like 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' this is not going to do you much good, but if you have reams of encoded / encrypted data, just stare at it for a while. Look at it the way you look through one of those hidden picture things, and after a while you will recognize patterns and have something with which to work.
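For what it's worth, the "stare at it until patterns show up" step can be roughed out in code. Here's a minimal Python sketch (the function names are mine, not from any standard tool) that computes two common first-look statistics: the index of coincidence and the empirical entropy per symbol. English letter text tends to have an index of coincidence around 0.066, while uniformly random letters come out near 0.038, so a value close to the English figure hints at a transposition or simple substitution rather than a modern cipher. With a sample as short as the one in the question, the numbers are mostly noise.

```python
from collections import Counter
from math import log2


def index_of_coincidence(text):
    """Probability that two symbols drawn from `text` without replacement match."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def shannon_entropy(text):
    """Empirical entropy of `text` in bits per symbol."""
    n = len(text)
    if n == 0:
        return 0.0
    counts = Counter(text)
    return -sum((c / n) * log2(c / n) for c in counts.values())


if __name__ == "__main__":
    # The sample from the question, with the (probably irrelevant) spaces removed.
    sample = "sdjek dYqkP 1Nt$% GGl9) MHrYD +++".replace(" ", "")
    print("index of coincidence:", index_of_coincidence(sample))
    print("entropy (bits/symbol):", shannon_entropy(sample))
```

Run it over reams of intercepted traffic rather than one line and the statistics start to mean something.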
There is a fine line between the high-quality software engineer and mild autism. Ever watch 'Rain Man' or 'A Beautiful Mind' and think, hey, that guy would be a BAD ASS developer?
Helps to be able to think in 6+ dimensions when you are cracking codes, and a photographic memory helps too.
I should probably post this as an AC; the last thing I need is the CIA / NSA figuring out what I am capable of.
Glonoinha the MebiByte Slayer
If you are an American, even posting this is probably a violation of some four-letter-acronym or terrorism-prevention law. I mean, why not just ask how to make a model rocket, why don't ya?
Stop thinking about the encrypted bits. Start thinking about who sent these bits and who these bits were sent to. Think about the application which created the data. Think about what purpose the data is going to be used for.
Once you have this information, you'll be much better equipped to figure out what the basic structure underpinning the cipher is. For instance, if the data is part of a realtime encrypted stream, I'd think "stream cipher" and look at RC4 or SEAL. If the data's part of a pen-and-paper arrangement with all values mod 26, I'd think "Solitaire". If the data's a pen-and-paper arrangement meant for communicating between two deep-cover espionage agents, I'd think "one-time pad". If the data's something pulled off a disk drive, I'd think of Matt Blaze's ECB+OFB algorithm. Etc.
What it boils down to is, this question is pretty arbitrary. Very rarely will you have no metainformation about the plaintext. Seek out as much metainformation as you can, and use the metainformation to make educated guesses, cribs, etc.
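To make the "educated guesses from metainformation" idea a little more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the function name `first_guesses` is made up, and the only facts it leans on are that DES and Blowfish both work on 64-bit (8-byte) blocks and that pen-and-paper systems working mod 26 produce letters only. It yields hints to follow up on, not an identification.

```python
import string
from typing import List

BLOCK_BYTES = 8  # DES and Blowfish both use 64-bit blocks

# Byte values that count as "letters or whitespace" for the mod-26 check.
LETTERS = set(string.ascii_letters.encode())
WHITESPACE = set(b" \t\r\n")


def first_guesses(ciphertext: bytes) -> List[str]:
    """Return rough hints about what family of cipher to look at first."""
    hints = []
    if not ciphertext:
        return hints

    body = [b for b in ciphertext if b not in WHITESPACE]
    if body and all(b in LETTERS for b in body):
        # Letters only: consistent with a pen-and-paper system working mod 26.
        hints.append("letters only: think pen-and-paper, values mod 26")
    elif len(ciphertext) % BLOCK_BYTES == 0:
        # Whole number of 8-byte blocks: consistent with a 64-bit block cipher.
        hints.append("multiple of 8 bytes: think 64-bit block cipher (DES, Blowfish)")
    else:
        # Ragged length: a raw block cipher would have padded it out.
        hints.append("ragged length: think stream cipher or a streaming mode")
    return hints


if __name__ == "__main__":
    print(first_guesses(b"QWERT YUIOP ASDFG"))
    print(first_guesses(bytes(range(16))))
    print(first_guesses(bytes(range(13))))
```

In practice the sender, recipient, and application will tell you far more than the length ever will, so treat checks like these as a tie-breaker at best.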
sdjek dYqkP 1Nt$% GGl9) MHrYD +++
Two things give it away:
The spaces are too regular. You'd be quite hard pressed to form a coherent sentence with the same character occurring at every fifth position.
So then perhaps the spaces are irrelevant. The next questionable aspect is the three +'s at the end. Unless your code at least worked in groups of three, the mathematical likelihood of three +'s occurring in a row would be small.
So then, what would make the most sense is some kind of consistent bit manipulation, at least in cycles of three characters. But then the double G's and the unique characters ($%) make that unlikely too.
So what makes the most sense? Just random typing.
Look at the first set of characters:
sdjek
Just type it a few times... It's quite natural. You might as well have used asdf (I bet your typing style isn't perfect... you probably favor your right hand).
If you examine each other character grouping, you'll see that none of them are very hard to reach.
Also, it gets the KISS approval, which in most circumstances is the winning vote.
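The eyeball checks above (regular spacing, the trailing +++, the mix of symbols) are easy to tabulate. A quick Python sketch, with a made-up helper name `describe_groups`, that splits the sample on spaces and counts character classes per group:

```python
import string

SAMPLE = "sdjek dYqkP 1Nt$% GGl9) MHrYD +++"


def describe_groups(message):
    """Split `message` on whitespace and count character classes in each group."""
    rows = []
    for group in message.split():
        lower = sum(ch in string.ascii_lowercase for ch in group)
        upper = sum(ch in string.ascii_uppercase for ch in group)
        digits = sum(ch in string.digits for ch in group)
        rows.append({
            "group": group,
            "length": len(group),
            "lower": lower,
            "upper": upper,
            "digits": digits,
            # Whatever is left over is punctuation or other symbols.
            "other": len(group) - lower - upper - digits,
        })
    return rows


if __name__ == "__main__":
    for row in describe_groups(SAMPLE):
        print(row)
```

Five groups of five followed by a group of three is the regularity that makes the spaces look like formatting rather than content.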
int func(int a);
func((b += 3, b));