Secret Data: Steganography v Steganalysis
gManZboy writes "Two researchers in China have taken a look at the steganography vs. steganalysis arms race. Steganography (hiding data) has drawn more attention recently, as those concerned about information security have recognized that illicit use of the technique might become a threat (to companies or even states). Researchers have thus increased study of steganalysis, the detection of embedded information."
There's some truth to the idea of a hidden message in comic strips.
During the '50s and '60s the Air Force used a particular comic strip ("Smokey Stover", I think: http://www.toonopedia.com/smokey.htm, also the origin of "foo" and "foo fighter") to train recon photo interpreters. The artist would hide his wife's name somewhere in every strip, and the new recruits would have to find it.
There are some people that if they don't know, you can't tell 'em.
Steganography is typically used within a closed group. It is typically not used between strangers. Therefore, you don't need to publicize your steganographic protocols beyond a small group of people.
Furthermore, if you take the trouble to hide your data with steganography, chances are that you will also encrypt it. In this scenario, the two accomplish different goals. Steganography ensures that no one realizes that you have communicated at all, and cryptography ensures that even if the steganography is compromised, they cannot tell what it was you were sending.
Steganography is gold to any mole who needs to transmit information from inside a hostile organization to his people on the outside. So long as the hostile org cannot tell that he is communicating, he is safe. Once they figure it out, he is busted.
Or for anyone transmitting information across an untrusted medium for that matter. If you use PGP to protect your Internet mail, the Feds are going to know that you have _something_ going on and that they might want to keep extra tabs on you. If you also use steganographic techniques, you'll never show up on their radar in the first place.
sigs are hazardous to your health
There's a good story on something vaguely related that has to do with the frequency of digits in measured numbers. (That is, not every digit is equally likely to appear: the leading positions of a number favor low digits, like "1".) People who were falsifying accounting records were caught because the numbers they used were "too random".
Actually, the fault here is that they didn't understand the target. Expenses have no "natural" size, so they're likely to be scale invariant. Basically, you're looking for a distribution whose leading digits look the same after every value is multiplied by a constant C. A uniform 1..9 fails that test; try C=2: 2,4,6,8,10,12,14,16,18... suddenly you have 5 leading 1s.
Turns out the right distribution follows Benford's law (share of leading digits 1 through 9):
30.1% 17.6% 12.5% 9.7% 7.9% 6.7% 5.8% 5.1% 4.6%
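For anyone who wants to check those numbers, here is a rough Python sketch. It assumes the usual statement of Benford's law, P(d) = log10(1 + 1/d), and the helper names (benford_expected, leading_digit, leading_digit_shares) are made up for this post. It also replays the doubling example and uses powers of two as a stand-in for scale-invariant data, which is my choice of test data, not anything from real accounting records.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit 1..9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(n):
    """Leading decimal digit of a positive integer (hypothetical helper)."""
    return int(str(n)[0])

def leading_digit_shares(values):
    """Observed share of each leading digit 1..9 in a list of positive integers."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

expected = benford_expected()
percentages = [round(100 * expected[d], 1) for d in range(1, 10)]

# The doubling example: uniform 1..9 multiplied by C=2.
doubled = [2 * n for n in range(1, 10)]
leading_ones = sum(1 for v in doubled if leading_digit(v) == 1)

# Powers of two as a stand-in for scale-invariant data, then rescaled by 3.
powers = [2 ** n for n in range(500)]
shares = leading_digit_shares(powers)
shares_scaled = leading_digit_shares([3 * v for v in powers])

if __name__ == "__main__":
    print("Benford %:", percentages)
    print("leading 1s after doubling 1..9:", leading_ones)
    for d in range(1, 10):
        print(d, round(shares[d], 3), round(shares_scaled[d], 3))
```

The point of the last loop is that the two observed columns stay close to each other and to the expected shares, which is what scale invariance should look like; a uniform spread of leading digits would not survive the rescaling.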
The second example you have is that the human "RNG" is flawed.
A computer doesn't really suffer from this problem. The steganography problem really comes down to this:
1. Find randomness in source data
2. Replace random data with pseudorandom data
Of course, if you overwrite non-random data, you're doing it wrong. If you're going to use the LSB, you need to verify that it is random, or find the portion of it that is random (which is kinda what you're doing when you pick the LSB from a pixel anyway).
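To make the two steps concrete, here is a minimal Python sketch of plain LSB replacement over raw 8-bit samples. Everything in it is an assumption for illustration: the function names (embed_lsb, extract_lsb, lsb_one_fraction) are made up, the cover is just a bytes object rather than a real image format, and the payload is assumed to be encrypted or compressed already so that its bits look random. The "randomness check" is only the fraction of 1 bits in the LSB plane, which is a crude necessary test, nowhere near a real steganalysis.

```python
import random

def embed_lsb(cover: bytes, payload: bytes) -> bytearray:
    """Overwrite the least significant bit of each cover sample with payload bits."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("payload too large for cover")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the payload bit
    return stego

def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back out of the LSBs, most significant bit first."""
    out = bytearray()
    for k in range(n_bytes):
        value = 0
        for sample in stego[8 * k: 8 * k + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)

def lsb_one_fraction(samples: bytes) -> float:
    """Crude check: share of samples whose LSB is 1 (about 0.5 if the LSBs look random)."""
    return sum(s & 1 for s in samples) / len(samples)

if __name__ == "__main__":
    rng = random.Random(0)
    cover = bytes(rng.randrange(256) for _ in range(256))  # stand-in for noisy pixel data
    secret = b"hi there"
    stego = embed_lsb(cover, secret)
    print(extract_lsb(stego, len(secret)))
    print(lsb_one_fraction(cover), lsb_one_fraction(stego))
```

Each sample changes by at most 1, which is why this is invisible to the eye. It is also why it only works if the LSBs were noise to begin with: embed into a flat region where every LSB is 0 and the payload sticks out immediately.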
The biggest problem is really to hide it in a "reasonable" way.
Perfect steganography should replace all randomness with noise.
Perfect compression should eliminate all randomness.
In other words, steganography operates on the thin slice between good compression (jpg, mp3, divx) and perfect compression. It's much easier to hide information in bmp, wav, uncompressed avi, but it also looks damn obvious.
Kjella
Live today, because you never know what tomorrow brings