USAF Wants To Find Steganographic Content

Posted by timothy on Saturday January 10, 2004 @09:29PM from the sir-yes-sir-we-must-examine-porn-sir dept.

Bud Higgins writes "The U.S. Air Force has posted a Small Business Technology Transfer Program (STTR) solicitation in which they seek proposals for the automated detection of steganographic content. They seek an application that should run both unobtrusively in the background and in a manual mode, and provide the user the capability to scan all email attachments, downloaded materials and accessed files with an appropriate steganalysis algorithm, reporting any abnormal results (i.e. the presence of steganography). I personally don't think that is feasible, but maybe a good programmer can prove me wrong. A link to the solicitation AF04-T008 can be found here. For those who are not familiar with the SBIR/STTR program, it provides up to $850k for 3 years of research." This sounds very similar to what Niels Provos did over a several-year period at University of Michigan's CITI and released under a free license. I hope the USAF doesn't spend too much of my money without considering extending that research.

Feasible? by jmv · 2004-01-10 21:37 · Score: 5, Informative

...reporting any abnormal results (i.e. the presence of steganography). I personally don't think that is feasible...

I think it probably depends on where you hide the data. For instance, it's probably harder to hide data in the LSBs of an image than in, e.g., a file that's supposed to be white noise ("Hey, my mic doesn't work, it only records noise. See for yourself"). Of course, the less data you encode, the harder it is to detect.

--
Opus: the Swiss army knife of audio codec
1. Re:Feasible? by RomulusNR · 2004-01-10 21:51 · Score: 5, Insightful
  
  Uh, sure, the "this is supposed to be random noise" trick will work about as long as the average spam-filter-avoidance trick lasts.
  
  "The enemy is sending out an abnormally large amount of random noise data. Must just be having microphone trouble. Nothing to see here."
  
  Roger that.
  
  No +1, cause I've been drinking...
  
  --
  Terrorists can attack freedom, but only Congress can destroy it.
Hrm by Cave+Dweller · 2004-01-10 21:38 · Score: 5, Insightful

Those of you paranoid enough will probably chime in with something along the lines of "Yeah, but Echelon probably has something like this built-in already!". Anyway, isn't the point of steganography to hide information in such a way that you *cannot reliably* tell whether the information was there in the first place?

I'm not sure what they're looking for here; perhaps a better steganography algorithm?
Well I hope it's better than stegdetect then... by argan0n · 2004-01-10 21:51 · Score: 5, Informative

As stegdetect (last time I checked) easily fails on files created with steghide

--
argan0n
Re:Oh yeah? by Soko · 2004-01-10 21:58 · Score: 5, Insightful

Take off the tinfoil hat, dude. Checking all pics on the net for steganographic info is virtually impossible - just too much info to sort through in a reasonable time frame.

They likely want this to scan documents leaving their internal network in an attempt to catch people who are sending out sensitive or secret info. To me this looks like the USAF is plugging a leak, not going on the hunt.

Soko

--
"Depression is merely anger without enthusiasm." - Anonymous
Re:stego wrapped pgp by Ronald+Dumsfeld · 2004-01-10 22:33 · Score: 5, Interesting

Maybe statistical analysis can determine if a given image or other medium is possibly hiding information. But if that information is encrypted, doesn't it look like random data without the key? Without knowing the key or even the cipher used to encrypt it... how can it be shown to actually be information? "That's just random noise/corruption in my images, your honor... I don't know what you're talking about"

Statistical analysis can indeed detect where hidden information is placed into an image, usually by noticing that the balance of the image is off. In fact, using encrypted data is more likely to stand out because images are not usually populated with statistically random data.
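
To make the "balance is off" idea concrete, here is a minimal sketch of one such check: a chi-square statistic over pairs of values (2k, 2k+1). It is an illustration chosen for this note, not necessarily the method any particular detector uses. The cover data is synthetic and the names `pair_chi_square` and `embed_random_lsb` are made up for the example. Overwriting LSBs with random-looking bits tends to equalize the two counts in each pair, so an unusually low statistic is the warning sign.

```python
import random
from collections import Counter

def pair_chi_square(samples):
    """Chi-square statistic over value pairs (2k, 2k+1) of 8-bit samples.

    Random LSBs push both counts in a pair toward their mean, so a LOW
    value hints that the LSB plane was overwritten.
    """
    counts = Counter(samples)
    stat = 0.0
    for even in range(0, 256, 2):
        a, b = counts[even], counts[even + 1]
        expected = (a + b) / 2
        if expected > 0:
            stat += (a - expected) ** 2 / expected
    return stat

def embed_random_lsb(samples, rng):
    """Replace every LSB with a random bit (stand-in for an encrypted payload)."""
    return [(s & ~1) | rng.getrandbits(1) for s in samples]

rng = random.Random(0)
# Hypothetical cover: even values are about three times as common as odd ones.
cover = [2 * rng.randrange(128) + (0 if rng.random() < 0.75 else 1)
         for _ in range(20000)]
stego = embed_random_lsb(cover, rng)

print("cover:", round(pair_chi_square(cover)))
print("stego:", round(pair_chi_square(stego)))
```

A real detector would also have to cope with partial embedding (only some samples touched), which weakens the signal considerably.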

Here's a piece on scanning Usenet for hidden images. As a broadcast medium you'd expect it to be most frequently used as you can anonymously post material and it is well-nigh impossible to locate the intended recipient.

--
Where's the Kaboom?
There's supposed to be an Earth-shattering Kaboom.
Here's an interesting little by freidog · 2004-01-10 22:41 · Score: 5, Informative

paper (pdf) on detection of steganographic messages based on simple statistical analysis of the image. It seems to work well against 2 of the 3 major steganographic encodings they tried.
Rubbish by dmiller · 2004-01-10 23:41 · Score: 5, Informative

It is trivial to write a program to discover content that has been stegged. A jpeg with hidden content would be quite easy to find if the areas with content were significantly different from those without.

The point of steganography is to hide information so that its presence cannot be detected. This means hiding information below the noise floor of the media. Information hidden in this way cannot be practically detected, assuming the stego is halfway decent, and the message to be hidden appears random (easily accomplished by encrypting it first).

Sure, *if* you had access to the unaltered original, then you could detect that it had been altered, but any competent steganographer would encrypt the hidden information first.

It would be possible with time and processing power to discover what bits were stegged if you used /dev/urandom to get the data.

This sentence demonstrates that you don't understand either /dev/urandom or steganography.

Knowing your processor type and kernel implementation, the powers that be could find patterns in the data and look for those (or absence of those) in your message. But if the randomness is of a natural type then the difficulty increases by a massive amount.

More misinformed rubbish - kernel implementation and processor type have little to do with the algorithms underlying the /dev/urandom implementation. Furthermore, /dev/urandom is based on "natural type" entropy (i.e. randomness derived from unpredictable physical processes).

So if you have to hide something from the feds then become a scientist and collect lots of data from nature. It should have an element of randomness that allows you to steg your secrets in the data.

Or, you could go and take a regular photo. Plenty of real, nature-derived randomness there.
Of course this is feasible! by jetmarc · 2004-01-11 00:25 · Score: 5, Interesting

> I personally don't think that is feasible

Of course this is feasible! At least with today's steganography software.

What the software does is overwrite apparently insignificant portions of the "container" data (the audio/picture/text/whatever file that transports the smaller hidden file). The steganographers say (rightfully) that, by encrypting the hidden data with a strong-enough algorithm, it is indistinguishable from random data. I.e., no one (without the key used for encryption) would be able to tell if it's encrypted data or perfectly random data.
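
The overwrite step described above can be sketched in a few lines. This is a toy illustration, not the code of any actual stego tool: `embed_lsb` and `extract_lsb` are hypothetical names, the container is a made-up list of integer samples, and the payload would be ciphertext in practice.

```python
def embed_lsb(samples, payload):
    """Overwrite the LSB of successive samples with the payload bits (MSB first)."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("container too small for payload")
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the LSB, then set it to the payload bit
    return out

def extract_lsb(samples, n_bytes):
    """Read n_bytes back out of the LSBs, in the same bit order."""
    data = bytearray()
    for i in range(n_bytes):
        byte = 0
        for s in samples[8 * i: 8 * i + 8]:
            byte = (byte << 1) | (s & 1)
        data.append(byte)
    return bytes(data)

container = [1000 + 3 * i for i in range(64)]  # hypothetical audio samples
secret = b"hi"                                 # stand-in for encrypted data
stego = embed_lsb(container, secret)
print(extract_lsb(stego, len(secret)))
```

Each sample changes by at most 1, which is why the result sounds (or looks) the same as the original.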

However, the programmers of steganographic software now go one step further and say (wrongly!) that images and audio files carry random noise in their least significant bits (LSB). Certainly, the lowest of those 16 bits of CD quality audio does not carry much data. And granted, 16 bits give 96dB of dynamic range while analog master tapes (studio quality) only have about 80dB, and microphone technology hardly touches 96dB. The LSB of an audio wave file definitely is noisy, no doubt about that.
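
For reference, the 96dB figure follows from the usual rule of thumb that each bit of linear PCM adds about 6dB. A small sketch of the arithmetic (the function name is made up for illustration):

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(16), 2))  # 16-bit CD audio
print(round(dynamic_range_db(1), 2))   # contribution of a single bit
```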

But (big "BUT"), it is far from being perfectly random. In the LSB you might find 50Hz/60Hz hum from the building's electrical cabling. You might find characteristic noise that's typical for your brand of microphone, or even a kind of "noise fingerprint" that could be used to distinguish your microphone from others of the same brand (much like crime investigators can distinguish typewriters by analyzing the blackmail letter). Actually, an experiment showed that when cutting all but the LSB of a music wave file, the tune still remains recognizable!

What the stego programmers do is to replace that LSB (or even 4 least significant bits) with perfectly (pseudo) random data. That's a difference! I can just cut all but the LSB and check if it statistically matches perfect random data (white noise) or if "some of" the music tune is "somehow" in there (e.g. by correlation, a DSP technique).
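
A minimal sketch of that check, under loud assumptions: the "recording" is synthetic (a 440Hz tone whose LSB is forced to follow a 50Hz hum), and instead of correlating against the tune itself it uses a simpler stand-in, the lag-1 autocorrelation of the LSB stream. White noise scores near 0; LSBs with structure do not.

```python
import math
import random

def lsb_lag1_correlation(samples):
    """Lag-1 correlation of the LSB stream, with bits mapped to -1/+1.

    Near 0 suggests white noise; clearly non-zero means the LSBs carry structure.
    """
    x = [1 if s & 1 else -1 for s in samples]
    return sum(a * b for a, b in zip(x, x[1:])) / (len(x) - 1)

RATE = 8000  # hypothetical sample rate in Hz
rng = random.Random(1)

# Hypothetical one-second recording: a 440 Hz tone with 50 Hz hum in the LSB.
tone = [int(10000 * math.sin(2 * math.pi * 440 * i / RATE)) for i in range(RATE)]
hum = [1 if math.sin(2 * math.pi * 50 * i / RATE) > 0 else 0 for i in range(RATE)]
clean = [(t & ~1) | h for t, h in zip(tone, hum)]

# Stego version: every LSB replaced by a pseudo-random bit.
stego = [(s & ~1) | rng.getrandbits(1) for s in clean]

print("clean:", round(lsb_lag1_correlation(clean), 3))
print("stego:", round(lsb_lag1_correlation(stego), 3))
```

Real recordings are nowhere near this clean, so a practical detector would need a threshold tuned on known-good files rather than a fixed cutoff.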

The same applies to pictures. If the pictures were scanned, the lower bits will contain artefacts characteristic of the particular scanner used. Digital photos exhibit "signatures" of the CCD/CMOS chip used in the digicam. Etc.

The steganographers know this, while the programmers of stegano software deliberately ignore it. It's a solvable problem, but infinitely difficult. If you know what the stegano-detection software is looking for, you can easily avoid it. Just encrypt your hidden data to "perfect random" and then transform it (by adding data, thus losing efficiency) to exhibit almost the same "fingerprint" signature as the data you are going to overwrite. In the case of an audio wave file, impress a bit of the tune on your data.

But obviously, you can't reach perfection, because a 100% match means that you overwrite the original data with a 100% copy of it (-> you have stored 0 bytes of hidden data). Or you know how the detector works, what thresholds it uses to bin the file as "steganographic", and stay a little below the threshold. But that puts you on the risky side. Will they change the thresholds? Will they check for other characteristics as well, something that you didn't address in your steganographic software?

That's why the steganographic programmers (not researchers!) ignore this problem. It has no practical solution. It's so much easier to just ignore it, and offer you the choice between 4 and 8 bits of hidden data per 16 bits of wave data (as e.g. "Scramdisk" does, a recommendable hard disk encryption program). This is better than nothing, but it is far from "not feasible" to detect!

Marc