Codebreaking - Taking the First Step?
Master Spy asks: "Here's something that the Slashdot community might be able to help with. If you receive a message in code, how do you take the first step? Back in the days of WWII it was easier. The codebreakers at Bletchley Park already knew that the messages were encoded using an Enigma machine, so all they had to do was work out the positions of the rotors using brain power, the Bombe or later the Colossus machine. American codebreakers also knew the basic details of the methods the Japanese used. Now, however, things are more complicated. Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' How do you know how the message has been encrypted? It could be an Enigma machine, it could have been XOR'd with a second message or a one-time pad, or it could use some form of software encryption such as Blowfish or DES. Before you start ripping the message apart for decoding, how do codebreakers find out what method has been used to encode the message?"
Let the medium decide how you approach this.
I.e., if it came over the net, look at the wrappers.
If it came over the radio, look at the spectrum.
It's what's around the message that will break the message.
Sigs are dangerous coy things
Well, if this was easy, codebreaking wouldn't be any fun. Don't forget that both the Germans and the Japanese had a variety (tens if not hundreds) of different cyphers in circulation, so it wasn't exactly as simple as assuming it's Enigma or Code Purple. :)
As to how it's done, that has to do with analysing the text: frequency analysis of 1-grams, 2-grams, etc. Simple substitution will exhibit one fingerprint (though different languages will obviously be different), something like a Playfair or Vigenère square will have another, and DES-encrypted text a completely different structure. Obviously, on a small enough sample there may not be enough information to latch onto...
But with a larger sample, it's mostly a combination of good tools, experience, and guesswork.
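To make the "fingerprint" idea concrete, here is a minimal Python sketch of two of the usual first measurements: n-gram counts and the index of coincidence. The function names are made up for illustration. As a rough guide, English plaintext (and any simple substitution of it) has an index of coincidence around 0.066, while uniformly random letters come out near 0.038, so a low value hints at a polyalphabetic or modern cipher. On a sample as short as the one in the question, neither number is reliable.

```python
from collections import Counter

def only_letters(text):
    """Keep A-Z only, upper-cased, so spacing and punctuation don't skew counts."""
    return [c for c in text.upper() if "A" <= c <= "Z"]

def ngram_counts(text, n):
    """Count overlapping n-grams (n=1 gives single-letter frequencies)."""
    letters = only_letters(text)
    return Counter("".join(letters[i:i + n]) for i in range(len(letters) - n + 1))

def index_of_coincidence(text):
    """Probability that two letters drawn at random from the text are equal."""
    letters = only_letters(text)
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))
```

Running ngram_counts(ciphertext, 1).most_common(5) and index_of_coincidence(ciphertext) on a large intercept is a cheap first pass before trying anything cleverer.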
Suppose you are listening to a transmission and you receive the following: 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++'
Yeah, 'sdjek dYqkP 1Nt$% GGl9) MHrYD +++' showed up on my SETI@home screen too.
This is clearly the signature of the Grays from Cygnus Prime. You don't want to communicate with them.
The Grays of Cygnus Prime are evil. They will put chips in your head.
They will use the chips to make you do bad things. Like posting to Slashdot.
Opinions on the Twiddler2 hand-held keyboard?
1. Get some books. Schneier's book is the best starting point.
.. move up the scale until you can understand the way more sophisticated codes are broken (for instance differential cryptanalysis). It gets harder at this point and well outside the realm of practicality but if you get this far, you will be able to break any cryptographically weak cipher (which includes the products of many companies, unfortunately).
2. Learn statistics (and basic number theory). You can discover a lot about a message by its statistical properties.
3. BREAK LOTS OF CODES. Without experience, you are lost. Start by breaking substitution and Caesar ciphers (easy with statistics), then Vigenere/Gronsfeld ciphers (harder but still "crypto for dummies"), then try XOR ciphers (they can be solved easily in an interesting way)... then try to understand how WEP is broken... DeCSS
4. If you become advanced enough, you can start reading papers on cryptanalysis. Many of them are surprisingly easy to understand once you understand number theory. However, it is much more difficult to *discover* some of the stuff these guys come up with. It's pretty amazing.
Anyway, to summarize, understand the statistics involved and PRACTICE until you can just look at a substitution cipher and understand what it says... just by the letter frequencies! If you are trying to break a simple code, you need lots of ciphertext to analyze.
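As an illustration of "easy with statistics", here is a small Python sketch that breaks a Caesar cipher by trying all 26 shifts and keeping the one whose letter counts best match English under a chi-squared score. The frequency table holds approximate textbook percentages, and the function names are invented for this example. It assumes the plaintext is English and that there is enough of it; a handful of letters will not do.

```python
import string
from collections import Counter

# Approximate English letter frequencies in percent (rounded textbook values).
ENGLISH_FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7, "s": 6.3,
    "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8, "u": 2.8, "m": 2.4,
    "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0, "p": 1.9, "b": 1.5, "v": 1.0,
    "k": 0.8, "j": 0.15, "x": 0.15, "q": 0.1, "z": 0.07,
}

def caesar_shift(text, k):
    """Shift ASCII letters by k positions (k may be negative); leave the rest alone."""
    out = []
    for c in text:
        if c in string.ascii_lowercase:
            out.append(chr((ord(c) - 97 + k) % 26 + 97))
        elif c in string.ascii_uppercase:
            out.append(chr((ord(c) - 65 + k) % 26 + 65))
        else:
            out.append(c)
    return "".join(out)

def chi_squared(text):
    """Lower means the letter counts look more like English."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    n = len(letters)
    if n == 0:
        return float("inf")
    counts = Counter(letters)
    return sum((counts[ch] - n * f / 100) ** 2 / (n * f / 100)
               for ch, f in ENGLISH_FREQ.items())

def crack_caesar(ciphertext):
    """Return (shift used to encrypt, recovered plaintext) for the best-scoring shift."""
    best = min(range(26), key=lambda k: chi_squared(caesar_shift(ciphertext, -k)))
    return best, caesar_shift(ciphertext, -best)
```

The same scoring function is the building block for Vigenere: once you guess the key length, each column of the ciphertext is just a Caesar cipher.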
And don't forget: sometimes you don't need to break a code at all. As a poster above wrote, sometimes context is enough. Sometimes an external clue will give the code away. How do you know what to look for? Experience!
If you are interested, I would suggest that you start by reading The Code Book by Simon Singh. It gives a good overview of the history of the battle between cryptography and cryptanalysis, and how ciphers have evolved to defeat methods of codebreaking. It's an interesting and entertaining read and you might gain some insight on how you would approach this particular cipher.
BTW, I have a truly marvellous solution to your cipher which this textarea is too small to contain.
Take a bunch of encoded stuff and simply look at it; watch for patterns over the course of the data as a whole. For a small sample set like
'sdjek dYqkP 1Nt$% GGl9) MHrYD +++'
this is not going to do you much good :p, but if you have reams of encoded / encrypted data, just stare at it for a while, looking through it the way you would at one of those hidden picture things, and after a while you will recognize patterns and have something with which to work.
There is a fine line between the high quality software engineer and mild autism. Ever watch 'Rain Man' or 'A Beautiful Mind' and think, hey, that guy would be a BAD ASS developer?
Helps to be able to think in 6+ dimensions when you are cracking codes, and a photographic memory helps too.
I should probably post this as an AC - last thing I need is the CIA / NSA figuring out what I am capable of
Glonoinha the MebiByte Slayer
First you djc,s dk%33R +++ (110), then you sD##N KDL:: Ds03k -332+. From there, it's a trivial matter of just 3!Wop mclDI a002g a!22# with the sklj3 V3iia aq@@1 +1867 -5309.
Duh.
The only, only thing you can expect to learn is who's communicating with whom (and when, and how much information is exchanged; you probably know this already), and what protocol they're using (it's probably unbreakable).
Chances are, if you are intercepting an encrypted stream, you are intercepting an unbreakably encrypted stream.
Perhaps you are thinking that if only you knew what protocol the stream is using, you might look online and see if that protocol has been cracked.
Don't waste your time.
The chances are approximately 0 that the stream you are intercepting is using a protocol that has been cracked, or that it is using a keyspace you can brute-force for under a few hundred thousand dollars, or in under a matter of years.
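To put rough numbers on the brute-force point, here is a back-of-the-envelope Python sketch. The search rate of one billion keys per second is purely an assumption picked for illustration, not a measurement of any real hardware, and the function name is made up.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_search(key_bits, keys_per_second):
    """Expected years to hit the key, assuming half the keyspace is tried on average."""
    expected_trials = 2 ** key_bits / 2
    return expected_trials / keys_per_second / SECONDS_PER_YEAR

# ASSUMPTION: a hypothetical rig testing 1e9 keys per second.
ASSUMED_RATE = 1e9

for bits in (40, 56, 128):
    print(f"{bits}-bit key: about {years_to_search(bits, ASSUMED_RATE):.3g} years")
```

At that assumed rate a 56-bit key takes a bit over a year on average, while a 128-bit key takes more than 10^21 years, which is why throwing more hardware at a modern cipher gets you nowhere.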
Sorry -- you have a higher chance (almost infinitely higher -- as I said, the chance you will succeed in what you are asking to do is approximately 0) of port-scanning the machine at the source or the destination and 0wning it than you do of breaking the stream.
I don't say this to mean you should give up -- just that you're phrasing your question wrong. Don't discount the 0wning avenue of attack.
For every million desktop machines communicating over TCP/IP, only a matter of a few dozen will have 0 exploitable security weaknesses. (However, most security weaknesses are unknown.)
Find out what kind of machine is at the source and the destination, then 0wn one of them. Chances are almost overwhelming that it's possible, if not with a remote exploit, then through social engineering. (Send an attachment that will be opened on either end of the communication, or induce either end to visit a web page in a browser that is exploitable, which is basically every browser except Lynx.)
If they browse with Netscape or Internet Explorer, chances are almost overwhelming that they can be owned.
It's not that hard to get someone to browse to a certain page, if you know anything at all about who that person is.
Back to your original question: gone are the days that protocols were breakable by any hotshot think tank. Today only implementations are, and rarely at the level you're trying to address. Don't break the code -- break into the system.
Hope this helps.
You social-engineer the NSA or other TLA with teraflop codebreaking computer-capability into helping you crack the message. For example, consider the following method used by an Idahoan to get his potato field plowed:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
An old man lived alone in southern Idaho. It was early spring, and he wanted to spade a garden plot to prepare it for planting potatoes. But it was very hard work and he just didn't have the energy.
You see, his only son, who would have helped him, was in prison.
The old man wrote a letter to his son and mentioned his predicament.
A week later, he received a note back, which said, "For heaven's sake, Dad, whatever you do, don't dig up that section of the garden! That's where I buried the GUNS!"
The next morning, bright and early, a dozen police showed up and dug up the entire garden, without finding any guns.
Looking out the kitchen window, the old man thought "Now, what in the world is going on here?" Confused, he wrote another letter to his son telling him what happened, and asked him for advice.
Another week passed and his son's reply arrived in the mailbox. The old man carried the letter up to the house, sat down at the kitchen table and read, "Now plant your potatoes, Dad. It's the best I could do for you under the circumstances."
NO CARRIER
Damn line noise...
Good old memories!
You're all bastards!
this is obviously perl code
sdjek dYqkP 1Nt$% GGl9) MHrYD +++
Two things give it away:
The spaces are too regular. You'd be quite hard-pressed to form a coherent sentence with any character occurring at every 5n-th position.
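The regularity is easy to check mechanically. Here is a tiny Python sketch (the function names are invented for this example) that measures the group lengths and tests whether a message looks like fixed-width blocks, the way traffic was traditionally sent in five-character groups, rather than natural word spacing.

```python
def group_lengths(message):
    """Lengths of the whitespace-separated groups in the message."""
    return [len(group) for group in message.split()]

def is_block_formatted(message):
    """True if every group but the last has the same width and the last is no longer."""
    lengths = group_lengths(message)
    if len(lengths) < 2:
        return False
    width = lengths[0]
    return all(n == width for n in lengths[:-1]) and 0 < lengths[-1] <= width

sample = "sdjek dYqkP 1Nt$% GGl9) MHrYD +++"
print(group_lengths(sample))       # [5, 5, 5, 5, 5, 3]
print(is_block_formatted(sample))  # True
```

If the check passes, the spaces are most likely formatting and can be stripped before any further analysis.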
So then perhaps the spaces are irrelevant. Then the next questionable aspect is the last three characters, the +++. Now, if your code didn't at least work in groups of three, the mathematical likelihood of three +'s in a row occurring would be small.
So then, what would make most sense is some kind of consistent bit manipulation, at least in cycles of three characters. But the double Gs and the unique characters (%$) make that unlikely too.
So what makes the most sense? Just random typing.
Look at the first set of characters:
sdjek
Just type it a few times... It's quite natural. You might as well have used asdf (I bet your typing style isn't perfect... you probably favor your right hand).
If you examine each other character grouping, you'll see that none of them are very hard to reach.
Also, it gets the KIS approval, which in most circumstances is the winning vote.
int func(int a);
func((b += 3, b));