Battling Steganography
An anonymous reader submitted a fairly thin little story about a researcher who is Battling
Steganography. I can certainly see the appeal of the study but it really seems like a needle in a hay stack sort of project. And when you actually can detect one technique, new and better techniques will crop up and take its place.
Was it just me, or did the article make it seem like anyone that would use steganography would be a criminal? Since when in a 'free' country should the ability to hide a message be of interest to the "legal community"?
It's hard to tell the cool to chill, my favorite hotel room has a view to an ill.
So, how can the algorithms mentioned in the article (which is interesting, but rather short on facts...) distinguish between the noise added by a steganographically embedded encrypted message and the noise caused by a slightly underspecced A to D converter?
You're right, there isn't too much of a difference between random noise and an encrypted communication. If you had a pure digital stream that had just been converted from analog, you could stick data in the least significant bits and no one would be the wiser. For example, a CD is just a sequence of 16 bit words iterated 44,100 times a second; you could just replace the least significant bit in each word with bits from your hidden message and it would be indistiguishable from random noise.
The problem arises when you try to compress digital information. These compression algorithms use the most optimum way to represent data that they can find and discard the least significant data, so they would completely destroy the afore mentioned hidden message. To hide data in a compressed file you need to play with how the compression mechanism stores the data, and the resulting file is most probably not going to be optimally compressed when you're done. What this guy is doing is looking at how the information was compressed, extract the overlying data that was being stored, and making sure the compression algorithm was indeed optimal. If there are any odd quirks in the compressed data or it doesn't look like the compression was optimal, it may be because data is hidden inside.
I hope this is a good enough explanation. I'm short on the examples but the underlying ideas are pretty basic.
If steganography can be made "turnkey", it'll work
for most of today's privacy requirements.
You might think that it'd be easy to detect,
or simple to prevent, but that's simply not true.
Unless someone lists all the ways in which one
can hide information, and a fantastically fast
approach to testing any given communication on the
net against those techniques. Otherwise, to
read a steganographically-encoded message,
each recipient will need to figure out which of
all the messages intercepted even includes the
data you're looking for, and what was used in
this particular instance. Hell, one might even
have two or more different techniques applied
in a single message. Like this message does.
Sort of.
....
Now we have more people looking at steganography. This can only make it more effective. Sure, the methods we have now might be broken but what about the next ones, the ones that don't show up on the statistical analysis that he appears to be using.
Bleh!