USAF Wants To Find Steganographic Content
Bud Higgins writes "The U.S. Air Force has posted a Small Business Technology Transfer Program (STTR) solicitation in which they seek proposals for the automated detection of steganographic content. They seek an application that should run both unobtrusively in the background and in a manual mode, and provide the user the capability to scan all email attachments, downloaded materials and accessed files with an appropriate steganalysis algorithm, reporting any abnormal results (i.e. the presence of steganography). I personally don't think that is feasible, but maybe a good programmer can prove me wrong. A link to the solicitation AF04-T008 can be found here. For those who are not familiar with the SBIR/STTR program, it provides up to $850k for 3 years of research." This sounds very similar to what Niels Provos did over a several-year period at University of Michigan's CITI and released under a free license. I hope the USAF doesn't spend too much of my money without considering extending that research.
...reporting any abnormal results (i.e. the presence of steganography). I personally don't think that is feasible...
I think it probably depends on where you hide the data. For instance, it's probably harder to hide data in the LSBs of an image than in, e.g., a file that's supposed to be white noise ("Hey, my mic doesn't work, it only records noise. See for yourself"). Of course, the less data you encode, the harder it is to detect it.
Opus: the Swiss army knife of audio codec
Technically I don't see what would be impossible about it. Somehow create a way to transfer the text in letters to a sophisticated OTR and pull the lines out. Run tests on the text, if that is possible. It might take a mighty powerful computer system to handle the load, but it is probably doable now or in the future. Sounds like a spendy machine.
Those of you paranoid enough will probably chime in with something along the lines of "Yeah, but Echelon probably has something like this built-in already!". Anyway, isn't the point of steganography to hide information in such a way that you *cannot reliably* tell whether the information was there in the first place?
I'm not sure what they're looking for here; perhaps a better steganography algorithm?
Suuuuure, Carnivore anyone?
I work for a company that is funded through a SBIR grant, so on behalf of the company I work for and to all tax paying Americans let me just say: Thank You!
It really is an interesting government program. All the IP we generate with the money stays with us. However in the interest of equitable return to the taxpayer, we have decided to release all of our core software components GPL. (Okay, okay this also helps when it comes time for our semi-annual review, to show that we aren't just soaking the taxpayers.) We hope to turn a profit partially by our user interface components (non-core code that we are not releasing) and also through support.
Trying to get one of these grants is highly competitive, but if you have a really good idea and don't want the vulture capitalists to "fund" you, this is a great program.
Education is a better safeguard of liberty than a standing army.
Edward Everett (1794 - 1865)
It is trivial to write a program to discover content that has been stegged. A jpeg with hidden content would be quite easy to find if the areas with content were significantly different from those without. The problem comes when the data is similar to the carrier.
If you had hidden your message in bogus scientific data taken from a near random source then it would be very difficult to see the areas that contained stegged data.
It would be possible with time and processing power to discover what bits were stegged if you used /dev/urandom to get the data. Knowing your processor type and kernel implementation the powers that be could find patterns in the data and look for those (or absence of those) in your message. But if the randomness is of a natural type then the difficulty increases by a massive amount.
So if you have to hide something from the feds then become a scientist and collect lots of data from nature. It should have an element of randomness that allows you to steg your secrets in the data.
blog and junk
I work with the AF on contracting options..
....AC
Usually when you see a contract for research that already exists, it is a means for the AF to pay for the additional research, usually paying the guy or the company that came up with the research in the first place...
The other main issue with DOD (Dept of Defense) contracting is that you have to sell actual widgets not just research - so this contract method is a way to get the widgets paid for...
Maybe statistical analysis can determine if a given image or other medium is possibly hiding information. But if that information is encrypted, doesn't it look like random data without the key? Without knowing the key or even the cipher used to encrypt it... how can it be shown to actually be information? "That's just random noise/corruption in my images, your honor... I don't know what you're talking about"
1. Win contract.
2. Base new software on Mr. Provos' work.
3. Profit!!
In an IT world where profit is linked to enterprise software, this will be a very interesting piece of work for somebody. Kudos to the winner. I would bid myself if I was a US citizen!
As stegdetect (last time I checked) easily fails on files created with steghide
argan0n
One thing that does surprise me is that they have allowed the Air Force guys to look at this at all, it seems much more like an Army or NSA thing.
"Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
I'd expect that a fair amount of first-order steg would be detectable by a process that examined all patterns in a data stream, and spotted that or those patterns that were UNLIKE the other patterns in the data, based on some heuristic.
Of course, if you were to steg with an OTP or some such (i.e. your steg is based on deviance from a known data set), you'd more easily escape such detection.
Terrorists can attack freedom, but only Congress can destroy it.
I hope the USAF doesn't spend too much of my money without considering extending that research.
Sorry to break it to you but taxpayer dollars are not "your money." It ceases to be yours when you pay taxes. Otherwise, I would be able to say, "No, you can't build that road, I won't allow it since it's my money that you're using." It's part of the implicit social contract between government and its citizens: The people recognise that there are certain things that require public funding for the good of everyone, and so grant our elected representatives the right to decide how to use that money. You have control over it insomuch as you can vote for your representatives and in referendums, but you cannot take the attitude that you get to control where every dollar that you pay in taxes goes. If that were the case then nothing would ever get done, because projects are -always- beneficial to some people and worthless to others. If people could say e.g. "No, you can't use my tax money to build that school as I don't have kids and so I'm not getting anything out of it" or "No, I don't want my tax dollars going into road construction, I don't even own a car" then there would be no schools, no roads, no public facilities, etc. So, yes, you are certainly entitled to have a say in how tax dollars are spent, but it's in the context of your representative or through voter initiatives, and not on the basis of "that's my money you're spending there."
In "Unification" (Star Trek episode 108), the cloaked Klingon ship that delivers Picard and Spock into Romulan territory sends a coded message to Enterprise that is piggybacked on surrounding Romulan transmissions. If the Romulans were not able to discover this in their time, what makes the USAF think they'll be able to do it now?
But I had this little idea. Suppose we "pollute" normal images with random data with say 1% redundancy. What I mean is, whenever you create an image you take some random data and steganographically embed it in the image. Write a gimp plugin or something so that the process is transparent and automatic. Your file only becomes 1% bigger, so it's no big deal. Not everyone needs to do this, just sufficiently many people so that the vast majority of the positives of stego detection systems are going to be false positives. As long as the message is encrypted before embedding, it is provably impossible to tell a genuine stego image from a false positive, assuming that the underlying encryption isn't broken. So you get a secure stegosystem with 1% efficiency "for free".
[dons tinfoil hat]
We'd all better soon start doing something like this, given where governments are going.
The "solution" can be implemented with the current laws and regulations, and I think the programmer is only a small part to make this system work. A lot of enforcement authorities have to come together and the current evidence suggests that they will come together. Of course, it is a moot point that by the time they figure this out, people would have learned to hide data in other creative ways - the eternal cat-and-rat game ...
Consider this
If Adobe (and others) could be forced to include in their code methods to detect currencies (see the Slashdot story "Photoshop CS Adds Banknote Image Detection, Blocking?") and not disclose it till they were caught by some vigilant users, what makes us so smug that other major companies with "closed" software are not already in-bed-with-the-feds? So, it is conceivable that the automatic detection may be going on and we wouldn't be any wiser.
See the Adobe example of how such "spyware" can be forced to run "unobtrusively."
Major email providers like Yahoo and Hotmail already provide automatic scanning for viruses, AOL is including automatic scanning for spyware, MicroTrend (?) already has Online Virus Scanning of your Hard Drive (!), and so under the threat of the Patriot Act (and its ilk) many of these companies can be forced to scan everything that goes in and out of their systems.
This is the key. Now the threshold for "abnormal" has been reduced so much (almanac carriers as potential terrorists, CAPPS passenger detection based on names and 15 flights were cancelled last month based on this, anti-war protestors as possible terrorists and hence being tailed by the Feds etc.) that the problem of false alarms no longer dogs the current administration and law enforcement agencies.
This is the crux. When the error threshold is reduced so much that the high rates of error are no longer problematic, then any solution (whether efficient or not) can be implemented. Who cares whether it works well or not. Till now the false alarms were the things that stopped such 1984-ish scenarios from unfolding. Once you accept high errors, and accept even high collateral damage as the price of doing "business," you can have a solution to almost anything implemented. Whether it deserves to be implemented or not is a whole different issue. But who cares? You got nothing to hide, right?
To see a world in a grain of sand, and then to step back and see the beach where the sand lies
Just reformat any image traveling through a USAF system to destroy any hidden messages. It's cheaper, takes less time, and will force the sender to use less secure means.
People who bite the hand that feeds them usually lick the boot that kicks them
A use for the code I wrote to sort porn based on image content. I can see it now. Project JISM: Joint Image Statistical Modeling. And my mom said my chronic masturbation wouldn't get me anywhere.
If the steg'd data has obvious headers and block formatting, a weak algorithm could leave enough of a pattern in the output file to be detectable. And of course some applications of stego are used to embed cleartext data...
Proponents of stego sometimes suggest its use in environments where even the suspicion of crypto is enough to risk persecution and/or prosecution. The other "trick" to detecting stego is that "normal" JPG/BMP/WAV/MP3/AVI/MPEG files tend to not actually show a high degree of random noise: the seemingly random data in the LSB tends to have a pattern imposed by the encoder used and the input device.
I'd guess that this problem is more of an issue on highly-processed information from clean sources. You wouldn't expect random noise on an MP3 file ripped off the latest pop album release, but it wouldn't be out of place on a .SHN "bootleg" recording of a TMBG live concert from a handheld DAT recorder...
I do not deploy Linux. Ever.
... from steganographic content. Either he knows it's there (so he won't report it, surely) or he doesn't know (so he does not extract the potentially dangerous content). A scan for steganographic content should be performed by ISPs or by something like Carnivore.
Anyway the USAF initiative is more clever than it seems, because vital steganographic content (terrorist plans and so) must be hidden in "popular" files, to make it hard for the good guys to find out the intended audience of the message. So a user level scan might be somewhat helpful.
It will also give a good excuse to people caught surfing for porn ("I am just helping out the USAF, dear!").
---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
It is easy to 'steganohide' content in uncompressed noisy files like tiff or wav. But that content gets destroyed by lossy compression, which is what most multimedia formats use (jpeg, mpeg, divx, mp3, ...). If it survives, it's called a watermark, but (un)fortunately nobody has yet found a watermark algorithm which is robust against lossy codecs and adding some more noise.
So you have to steganohide your content after compressing. But compressed files have much less noise, and that noise is not random noise but has statistical quirks. If you just hide your content as white noise and add it to the file, that's detectable, because it changes the statistical behaviour of the file!
Instead you have to write a specific steganographic algorithm for each lossy compression format you want to hide content in! It has to respect the 'format noise character'. That's what Niels Provos did for pnm and jpeg with outguess.
... for watching porn during work :-)
Thomas S. Iversen
I wonder if they've talked to this guy
He claims to have a system which can detect modifications to photographic images.
Any tampering with a photographic image causes detectable statistical changes. These changes can indicate that the image may have been edited to change the content or possibly that steganographic data has been added.
A paper (pdf) on detection of steganographic messages based on simple statistical analysis of the image. It seems to work well against 2 of the 3 major steganographic encodings they tried.
It is very clear this is an impossible task. All one needs to do is run a standard PUBLIC KEY ENCRYPTION (you can get the code from www.openssl.org), then stow the encrypted bits into the noise in the target file.
It can be stowed as replaced low order bits where the address of the bit is generated via a hashing function.
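To make the "address of the bit is generated via a hashing function" idea concrete, here is a minimal sketch. It is only an illustration: the choice of SHA-256 over a key plus a counter, and the function names `positions`, `embed` and `extract`, are my own assumptions, not anything from the solicitation or from an existing stego tool.

```python
import hashlib


def positions(key: bytes, n_slots: int, count: int) -> list[int]:
    """Derive `count` distinct carrier indices by hashing key + counter.

    Hypothetical scheme: anyone with the same key derives the same indices.
    """
    if count > n_slots:
        raise ValueError("payload does not fit in the carrier")
    chosen, seen, counter = [], set(), 0
    while len(chosen) < count:
        digest = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        idx = int.from_bytes(digest[:8], "big") % n_slots
        if idx not in seen:  # skip collisions so no slot is used twice
            seen.add(idx)
            chosen.append(idx)
        counter += 1
    return chosen


def embed(carrier: bytes, payload_bits: list[int], key: bytes) -> bytes:
    """Overwrite the low-order bit of the key-selected carrier bytes."""
    out = bytearray(carrier)
    slots = positions(key, len(out), len(payload_bits))
    for bit, idx in zip(payload_bits, slots):
        out[idx] = (out[idx] & 0xFE) | bit
    return bytes(out)


def extract(stego: bytes, n_bits: int, key: bytes) -> list[int]:
    """Read the low-order bits back in the same key-derived order."""
    return [stego[idx] & 1 for idx in positions(key, len(stego), n_bits)]
```

Without the key an observer does not know which low-order bits to read or in what order, which is the point being made here. Note that this says nothing about whether the changed statistics of the carrier give the game away.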
Even IF ( a really big IF here) it is possible to determine which bits were flipped (XOR) or stowed, one is still faced with knowing the arbitrary hashing function that was used.
If one is so lucky as to find the hashing function, one is still faced with cracking the public key encryption, and this has been shown to be impossible. Oh, and the hashing function itself can be derived from the public key encryption code. If so, then it is provable that the hash cannot be derived, much less the message that is being hidden.
For years it has been feasible to hide messages in any commonly available digital or even non-digital data streams.
The only messages they are likely to detect are very poorly encoded ones or ones that are deliberately poorly encoded so they can be found.
Yet I am sure there are many people who will gladly produce some literary fiction and take the money and run. We've all seen a lot of this in reports commissioned by official agencies.
For any such system to work, it would have to basically be the greatest code-cracking machine on the face of the planet. More than that, though, would be the implications of false-positives. Let's say I send a photoshopped picture of, oh, I don't know, Natalie Portman to a buddy who works for the Air Force. The system, working under the operating parameters it's set to work with, picks up on a specific pattern of bits in the picture and determines that it's a coded message. The coded message is decoded to, inexplicably, reveal GPS coordinates, a date/timestamp, and the phrase "Free XXXXXX" (or some equally suspect verbiage). What would YOU think the "message" meant?
Given enough processing power, even /dev/rand can produce terrorist messages. It's the million-monkey problem, except with thermonuclear weapons.
"Why Subscribe?" Good question...
Imagine if steganographic checking software was to be mandatory on all computers containing DRM. And removing it would be a felony. Remember boys and girls, owning a computer is a privilege, not a "right".
Think it can't happen? Think again, we have the Patriot Act as the front runner for this kinda shit. Seriously, I'm voting Libertarian this election. I'm tired of the same old Demo/Repub bull shit!! Arrtrrggghhhhhhaaaa
Life is not for the lazy.
Following the link, it says Niels Provos analyzed 2 million eBay pics and then 1 million Usenet messages... mount 192.168.1.1:/sten/incoming and warm up grep and cp!
Laurence.
The point of steganography is to hide information so that its presence cannot be detected. This means hiding information below the noise floor of the media. Information hidden in this way cannot be practically detected, assuming the stego is halfway decent, and the message to be hidden appears random (easily accomplished by encrypting it first).
Sure, *if* you had access to the unaltered original, then you could detect that it had been altered, but any competent steganographer would encrypt the hidden information first.
This sentence demonstrates that you don't understand either /dev/urandom or steganography.
More misinformed rubbish: kernel implementation and processor type have little to do with the algorithms underlying the /dev/urandom implementation. Furthermore, /dev/urandom is based on "natural type" entropy (i.e., randomness derived from unpredictable physical processes).
So if you have to hide something from the feds then become a scientist and collect lots of data from nature. It should have an element of randomness that allows you to steg your secrets in the data.
Or, you could go and take a regular photo. Plenty of real, nature-derived randomness there.
The USAF. What is it all about... is it good, or is it whack?
Which country do you claim to come from?
In audio that is. Say you decide to start hiding stuff in live performance music, as in fan recorded data. Much of that is distributed in 24-bit format since we are talking about hardcore people here. Well, this is good already, seeing as you aren't going to find 24-bit converters that really get 24 bits of SNR. So you have plenty of inherent noise to begin with. Add to that the noise of a concert and you've plenty to mask the signal with.
Think of a code consisting of selectively placed LOL, OMG, ROTFLMAO, HEH, WOW, SUXORZ, ROXORZ, C00L, WOOT and several dozen smileys. Place them at random places in a blog message and send them over some IM network. Indistinguishable from the billions of messages that cruise the network daily.
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
The problem with this sort of project is that if you analyse a billion images looking for steg, using a billion different types of analysis, then you will find some form of hidden message.
Unfortunately that hidden message will just be a statistical glitch and not really exist.
The Bible Code is an excellent example of how the act of looking for hidden messages can allow you to find hidden meaning where there isn't any.
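The arithmetic behind this is easy to sketch. The numbers below (a one-in-a-million false positive rate, a one-in-a-million prior, and so on) are made-up assumptions chosen purely for illustration, not measurements of any real detector.

```python
def expected_false_positives(n_items: float, false_positive_rate: float) -> float:
    """Average number of clean items that get flagged anyway."""
    return n_items * false_positive_rate


def prob_at_least_one_false_positive(n_tests: int, false_positive_rate: float) -> float:
    """Chance that at least one of n independent tests fires on a clean item."""
    return 1.0 - (1.0 - false_positive_rate) ** n_tests


def prob_flag_is_real(prior: float, detection_rate: float,
                      false_positive_rate: float) -> float:
    """Bayes' rule: probability that a flagged item really carries a message."""
    true_hits = prior * detection_rate
    false_hits = (1.0 - prior) * false_positive_rate
    return true_hits / (true_hits + false_hits)


# Hypothetical scenario: a billion images, a detector wrong once per million.
flagged_clean = expected_false_positives(1e9, 1e-6)  # roughly a thousand

# Hypothetical scenario: run 1000 different analyses on one clean image.
some_glitch = prob_at_least_one_false_positive(1000, 0.001)  # well over half

# Hypothetical scenario: one image in a million is stego, detector is 99% accurate.
real = prob_flag_is_real(1e-6, 0.99, 0.01)  # far below one percent
```

So even a very good detector, pointed at enough innocent data with enough different tests, mostly reports glitches.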
What they have is monitoring of certain phone numbers, fax numbers, satellite numbers, etc., plus spoken/written keywords and stuff like that. But if the communication is encrypted, it can't be decrypted in real time and it will be sent off to be processed.
Now, you can deduce lots of things from traffic analysis alone. But assuming you have a crypto system which is good (non-snakeoil), NSA and friends can just try to bruteforce their way around it, or they might use some tricks known to them related to the properties of the ciphers. It's gonna take them time.
But you should know that most of the stuff out there is plain text. No encryption, nothing. Just known encodings. And those can be analyzed with machines.
In any case, your steganographic stuff is safe, unless you're on the watch lists. Even your encrypted comms are most likely safe, provided you avoid binary-only shit from the USA and exercise due caution and good crypto hygiene.
To dream up technical solutions (OK, ideas) for human problems? Is it to do with the US's yearning to regain its position on the top of the technology R&D tree?
Doesn't this latest research grant smack of a Bush-backed "We want an all-encompassing system to catch bad people. Oh and we reckon steganography is the answer too"?
"It's not your information. It's information about you" - John Ford, Vice President, Equifax
> I personally don't think that is feasible
Of course this is feasible! At least with today's steganography software.
What the software does is overwrite apparently insignificant portions of the "container" data (the audio/picture/text/whatever file that transports the smaller hidden file). The steganographers say (rightfully) that, by encrypting the hidden data with a strong-enough algorithm, it is indistinguishable from random data. I.e., no one (without the key used for encryption) would be able to tell if it's encrypted data or perfectly random data.
However, the programmers of steganographic software now go one step further and say (wrongly!) that images and audio files carry random noise in their least significant bits (LSB). Certainly, the lowest of those 16 bits of CD quality audio does not carry much data. And granted, 16 bits give 96dB of dynamic range while analog master tapes (studio quality) only have about 80dB, and microphone technology hardly touches 96dB. The LSB of an audio wave file definitely is noisy, no doubt about that.
But (big "BUT"), it is far from being perfectly random. In the LSB you might find 50Hz/60Hz hum from the building's electrical cabling. You might find characteristic noise that's typical for your brand of microphone, or even a kind of "noise fingerprint" that could be used to distinguish your microphone from others of the same brand (much like crime investigators can distinguish typewriters by analyzing the blackmail letter). Actually, an experiment showed that when cutting all but the LSB of a music wave file, the tune remains still recognizable!
What the stego programmers do is to replace that LSB (or even 4 least significant bits) with perfectly (pseudo) random data. That's a difference! I can just cut all but the LSB and check if it statistically matches perfect random data (white noise) or if "some of" the music tune is "somehow" in there (e.g. by correlation, a DSP technique).
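A toy version of "cut all but the LSB and check whether it looks like white noise" might look like the following. Everything here is an assumption for illustration: the carrier is a synthetic slow ramp standing in for real audio, and the statistic (how often neighbouring LSBs agree) is a deliberately crude stand-in for a proper correlation test. Real steganalysis tools use far more refined statistics.

```python
import random


def lsb_plane(samples: list[int]) -> list[int]:
    """Keep only the least significant bit of every sample."""
    return [s & 1 for s in samples]


def adjacent_agreement(bits: list[int]) -> float:
    """Fraction of neighbouring bit pairs that are equal.

    Ideal white noise gives about 0.5; structure left over from the
    signal pushes the value away from 0.5.
    """
    same = sum(1 for a, b in zip(bits, bits[1:]) if a == b)
    return same / (len(bits) - 1)


def replace_lsbs(samples: list[int], payload_bits: list[int]) -> list[int]:
    """Naive embedding: overwrite each LSB with one payload bit."""
    return [(s & ~1) | b for s, b in zip(samples, payload_bits)]


rng = random.Random(2004)  # fixed seed so the demo is repeatable
carrier = [(i // 4) % 256 for i in range(20000)]   # synthetic, slowly rising signal
payload = [rng.getrandbits(1) for _ in carrier]    # stands in for encrypted data
stego = replace_lsbs(carrier, payload)

clean_score = adjacent_agreement(lsb_plane(carrier))  # clearly above 0.5
stego_score = adjacent_agreement(lsb_plane(stego))    # close to 0.5
```

The clean carrier's LSB plane still follows the signal, while the stego file's LSB plane looks like a coin toss, and that difference is exactly what a detector can key on.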
The same applies for pictures. If the pictures were scanned, the lower bits will contain artefacts characteristic to the particular scanner used. Digital photos exhibit "signatures" of the CCD/CMOS chip used in the digicam. Etc.
The steganographers know this, while the programmers of stegano software deliberately ignore it. It's a solvable problem, but infinitely difficult. If you know what the stegano-detection software is looking for, you can easily avoid it. Just encrypt your hidden data to "perfect random" and then transform it (by adding data, thus losing efficiency) to exhibit almost the same "fingerprint" signature as the data you are going to overwrite. In the case of an audio wave file, impress a bit of the tune on your data.
But obviously, you can't reach perfection, because a 100% match means that you overwrite the original data with a 100% copy of it (so you have stored 0 bytes of hidden data). Or you know how the detector works, what thresholds it uses to bin the file as "steganographic", and stay a little below the threshold. But that puts you on the risky side. Will they change the thresholds? Will they check for other characteristics as well, something that you didn't address in your steganographic software?
That's why the steganographic programmers (not researchers!) ignore this problem. It has no practical solution. It's so much easier to just ignore it, and offer you the choice between 4 and 8 bits of hidden data per 16 bits of wave data (like e.g. "Scramdisk" does, a recommendable harddisk encryption software). This is better than nothing, but it is far from "not feasible" to detect!
Marc
as an excuse to automatically screen US-inbound emails and then levy an extortionate fee to process a visitor's visa?
Or do you think all of the emails will just go somewhere else instead?
The United States. I was a Naval aviator and a graduate of Top Gun. My call sign is Maverick.
As others stated, (as always in cryptography) if the stegging user isn't stupid (meaning he would encrypt before stegging), the data to be stegged would be as random as the data that you steg it in. There is no possibility to tell one set of random data from another set of random data. I think they do it for discovering stupid spies.
Adobe was not forced to include currency detection.
Re-encode outgoing pictures?
That'd surely remove any stego content.
People are saying that adding stegged content to a (compressed) file adds redundancy, which can be detected.
I also read here that compressing the data and adding it, would still add redundancy. Is this correct?
What about compressing, then encrypting the data? I always thought that compression and encryption both attempt to minimise the redundancy of a set of data (that is, push it towards maximum entropy per bit). How can it be detected if it's random?
"Smoking helps you lose weight - one lung at a time" -- A. E. Neumann
The idea is to detect the likely presence of stego, not to decode it; that's an entirely different thing.
Analyzing a jpg or png to statistically determine if it's "clean" or has a message in it is not all that difficult. Decoding that message is a totally unrelated feat, more likely reserved for cryptographers.
They might try compressing the images - an image with a large amount of non-random text hidden within it should compress somewhat more than a standard compressed image.
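Here is a rough sketch of that compression test using Python's standard `zlib` module. As a simplifying assumption it compresses only the LSB plane of the raw pixel bytes rather than the whole image, and the "image" is just random bytes standing in for a noisy photo. The message text and helper names are made up for the example.

```python
import random
import zlib


def pack_bits(bits: list[int]) -> bytes:
    """Pack a list of 0/1 values into bytes, 8 bits per byte, MSB first."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def lsb_compression_ratio(pixels: bytes) -> float:
    """Compressed size of the LSB plane divided by its raw size."""
    plane = pack_bits([p & 1 for p in pixels])
    return len(zlib.compress(plane, 9)) / len(plane)


def embed_bytes(pixels: bytes, message: bytes) -> bytes:
    """Naive embedding of an unencrypted message into consecutive LSBs."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message does not fit in the carrier")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


rng = random.Random(7)  # fixed seed so the demo is repeatable
noisy_image = bytes(rng.getrandbits(8) for _ in range(16000))  # stand-in for pixel data
with_text = embed_bytes(noisy_image, b"attack at dawn " * 133)

clean_ratio = lsb_compression_ratio(noisy_image)  # near 1: noise does not compress
stego_ratio = lsb_compression_ratio(with_text)    # much smaller: text is redundant
```

This only catches cleartext or otherwise redundant payloads. If the message is compressed or encrypted first, as others in the thread point out, the LSB plane stays incompressible and this particular test sees nothing.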
--- Bwah?
What if, instead of trying to hide something in a specific image for example, you gave the steganographic software a selection of say 100 images and got it to choose which one would be most suitable to hide the data in, so it was hardest to find? While it might take a lot of processing power to do this for a large selection, it would make finding it a lot harder. Oh wait, we're supposed to be making it easier :P. How about banning all steganographic software and research under the PATRIOT III act, and then only criminals will use it? I'm not sure what the USAF is trying to get at here. If someone just thought it would be cool to do then fine, but if they are hoping to use it to catch terrorists then it's stupid: you can't go through every email, IM, phone call, sms, fax, snail mail, telegram, VoIP call and website in the world looking for something dodgy. Even if none of it was encrypted there's just too much!
This comment does not represent the views or opinions of the user.
so, like - wow! - this sure has spawned an interesting debate about how hiding messages within random data 'disguised' as plain old emails could be possible, or maybe not, and maybe someone could find it and filter it out... wow, impressive.
So, why do we want to look for such messages? Are terrorists from the middle east supposed to be passing messages around with this technology that even the finest scientists at Slashdot's secret underground laboratory can't even seem to agree would be possible? Here I am thinking the terrorists use, like, you know, stolen cell phones and stuff. Maybe it's the Russkies - they've still got moles planted in the US you know! Maybe it's the nutjobs in the US who occasionally cause big trouble passing around the secret messages - but then that's not the air force's business.
The 'wheres' and 'whys' of this technology are what conjure up the most questions in my mind.
RTFM; please, I beg you.
I can imagine terrorists or criminals starting to use open source software in the future because of this. Then some marketing or PR department of some large closed source company or any sworn enemy of open source (e.g. SCO) would start spouting FUD about open source and damage its credibility. Worse, it could push the government to regulate it.
EvilCON - Made Famous by
Wouldn't it be easier to just overwrite the extra bits where information can be hidden than to analyze it? Mike
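As a sketch of what "just overwrite the extra bits" could look like for the simplest case (raw samples with the payload in the lowest bit), assuming the data has already been decoded to a byte per sample. The function name is made up, and the seeded `random.Random` is only there to keep the example repeatable; a real filter would presumably draw from a stronger source.

```python
import random


def scrub_lsbs(samples: bytes, rng: random.Random) -> bytes:
    """Replace the lowest bit of every sample with a fresh random bit.

    Each sample changes by at most 1, so the picture or sound is
    visually/audibly the same, but any LSB-embedded payload is gone.
    """
    return bytes((s & 0xFE) | rng.getrandbits(1) for s in samples)


original = bytes(range(256))
scrubbed = scrub_lsbs(original, random.Random(1))
```

This only wipes payloads that live in the lowest bit of raw samples. Schemes that hide data elsewhere (for example in the coefficients of a compressed format) would need a different scrubber, which is the same per-format problem the hiders face.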
The USAF and the government in general are interested in a lot of interesting things. Why does this deserve frontpage attention??
Hide it in porn. Can you imagine the NSA guy telling his supervisor he has to look at tons of nuns-in-chains-pics because Bin Laden might hide in there? Tubgirl anyone? MWUHAHAHAHA!
Steganography was a problem during World War II. Mail was subject to inspection and censorship. There were concerns about espionage and attempts to evade censorship. Mail was checked for invisible ink and anything else that might be used to hide messages. Some people used steganography to save money. Since there were special subsidized postal rates for mailing newspapers, messages could be sent by using a pin to poke holes in the paper, spelling out the characters of the message. Some soldiers tried to evade the censor so that they could tell their family where they were located. Censors were suspicious of weather reports and other statistical information that might be used to hide messages.
Mea navis aericumbens anguillis abundat
In these days when the FBI thinks possession of an almanac makes you suspicious... what happens to you if some half-baked experimental steganography-detection program looks at billions of .jpgs, gets to an image you've included in an eBay auction description, and detects some not-quite-decodable signal just above the noise that it interprets as "there's definitely something hidden in that image, even though we can't tell what"?
How do you prove that you're innocent?
How do you prove that your image does NOT contain steganography?
Worse yet, suppose you are using steganography--say, a watermark to prevent people from stealing your image. Will the FBI believe what you tell them is the decoded content?
I mean, a few decades ago some nutcase analyzed Shakespeare's First Folio and decided that it was printed in a mixture of two slightly different fonts that constituted a binary code with a message proving that it had been written by Sir Francis Bacon. (No kidding). That proves that it's easy for someone who's looking for steganography to find it, whether it's there or not.
"How to Do Nothing," kids activities, back in print!
then again most of /. is not the general populace
[Fuck Beta]
o0t!
You'd have to go around obtaining lots of original recordings. Like using a one-time pad, with stego you can't use the same source twice, nor can you use a source that's already available. You need to be the sole source. Otherwise the enemy can do a binary comparison and see that there's something different, possibly hidden data.
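The binary comparison is about as simple as detection gets. A minimal sketch, assuming the enemy has both files decoded to equal-length byte strings (the function names are made up for the example):

```python
def changed_positions(original: bytes, suspect: bytes) -> list[int]:
    """Indices at which the suspect copy differs from the known original."""
    if len(original) != len(suspect):
        raise ValueError("files differ in length; compare decoded samples instead")
    return [i for i, (a, b) in enumerate(zip(original, suspect)) if a != b]


def looks_like_lsb_embedding(original: bytes, suspect: bytes) -> bool:
    """True if something changed and every change touches only the lowest bit."""
    changed = changed_positions(original, suspect)
    return bool(changed) and all((original[i] ^ suspect[i]) == 1 for i in changed)
```

For example, with `original = bytes([10, 20, 30, 40])` and `suspect = bytes([10, 21, 30, 41])`, the changed positions are 1 and 3 and both differ only in the lowest bit, which is just the pattern a naive LSB embedder leaves behind.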
Is anyone else reminded of John Nash's condition at the end of A Beautiful Mind? :-)
Seems to me that searching for steganographics will be a difficult task. Just like 40-bit vs. 128-bit SSL, the steganography will just get a *little* bit more complex and will be orders of magnitude harder to discover.
If the data is compressed and/or encrypted prior to being stegged, then even if the data is correctly extracted, it will be impossible to determine whether it is actual data or just noise.
Conceptually, the execution bounds for looking for these "hidden" messages seem not too different from trying to factor large numbers. Take an image, and distill it into two parts, one of which is a hidden message you know nothing about, and the other is the final image with the hidden message removed.
This is my sig.
The original poster doesn't believe that it's possible to detect steganographic content. There have been lots of technical follow-ups that suggest it might be possible, but almost nobody has mentioned the funding issue. The task is most likely possible simply because there's been an STTR solicitation published. Many of the STTR and SBIR solicitations are designed by their authors to fund existing projects known to the authors. These "solicitations" provoke very few proposal submissions, occasionally even just the one from the expected recipient of the funds.
Don't get me wrong - this isn't a scam. The funding groups are usually genuinely interested in having what they specify developed, sometimes wind up buying lots of it once the development is complete, and in most cases all qualified bidders are truly considered. It's just that the solicitations are often written so narrowly that only a select few bidders can qualify.
But hey, at least the bidders are required to be small businesses, not like those Halliburton contracts for Iraq!
Clearly, this problem is undecidable in general.
Once again, you can spend $850,000 on engineers who will try and fail... or you can spend $5 on coffee for a mathematician.
We hope to turn a profit partially by our user interface components (non-core code that we are not releasing) and also through support.
LOL
The problem with the LSBs of an image is that they aren't quite random. Unless the image is raytraced or otherwise artificially produced, there's a fair amount of order there. Even a raytraced image might not be quite random.
The same holds with audio. For instance, encrypted data is white noise, but concert noise is "pink noise" which has a characteristic spectrum. The noise produced by converters is closer to white, but it isn't quite either. People like Niels Provos have been studying this for a while, trying to find out which bits they can change without altering the statistics of the image or audio, but with limited success. As of last year (don't know how it is this year), all published steganography schemes at least a few months old had been broken.
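To make the "LSBs aren't quite random" point concrete, here is a rough sketch in the spirit of the chi-square tests used against plain LSB replacement. The cover data is a toy stand-in (8-bit samples that happen to favour even values), and the function names are invented for this example. Real images are far messier.

```python
import random
from collections import Counter


def pairs_of_values_chi2(samples):
    """Chi-square statistic over value pairs (2k, 2k+1).

    Overwriting LSBs with random bits tends to equalise the two counts in each
    pair, so a suspiciously small statistic hints at LSB replacement.
    """
    counts = Counter(samples)
    chi2 = 0.0
    for even in range(0, 256, 2):
        n_even, n_odd = counts[even], counts[even + 1]
        expected = (n_even + n_odd) / 2
        if expected > 0:
            chi2 += (n_even - expected) ** 2 / expected
    return chi2


def replace_lsbs(samples, rng):
    """Overwrite every LSB with a random bit, standing in for an encrypted payload."""
    return [(s & ~1) | rng.getrandbits(1) for s in samples]


# Toy cover: only even values occur, 200 times each (an assumption for illustration).
cover = [2 * (i % 50) for i in range(10_000)]
stego = replace_lsbs(cover, random.Random(1))

cover_chi2 = pairs_of_values_chi2(cover)
stego_chi2 = pairs_of_values_chi2(stego)
print(cover_chi2, stego_chi2)
```

The cover scores high because its pairs are lopsided. The stego copy scores low because the random payload flattened them, which is the order that got destroyed.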
I hereby place the above post in the public domain.
Detecting encrypted steganography would be difficult. It would involve statistical analysis of the "unimportant" bits of a known good media sample (be it image, audio, even an executable) and comparing it to the suspect message.
This would involve a tremendous database on the part of the USAF. More importantly, if the people using the steganography had a similar database (and code that could encrypt their hidden text to match the properties of the "known good"s), then the messages would be undetectable.
A better (but more controversial) approach would be this: The USAF modifies every picture/audio stream/etc that goes to the outside world. Only the least significant bits (the places where the encrypted message is likely to hide) would be changed -- to gibberish. Then it doesn't matter if the message was stego-ed or not -- it's unreadable now.
Only 2 problems I see with this:
1) Doesn't match what the USAF asked for, which was a way to DETECT stego. I feel that this is OK because the AF's original goal is WAY too broad and open-ended. Stego isn't limited to pictures. It can use music, text, code (using redundancy in certain instructions in the x86 instruction set). In short, there are too many possible channels for something to be stego-ed through.
2) It's an overt measure. If you wanted to let these stego-ed messages get to their intended recipients, and then monitor what Bob the Spy was then doing, you'd be SOL. But still, if this was a known policy, it would be tremendously useful.
Oh, and for those who say "The data is being tampered with! That's inherently wrong!", if the data was so important that its modification would cause problems, then the original steganography would be automatically detected.
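The scrubbing idea in this comment (overwrite the least significant bits of everything leaving the network) might look roughly like the sketch below. It works on raw bytes rather than a real image format, and `embed_bits`, `extract_bits` and `scrub_lsbs` are hypothetical helpers written just for this illustration.

```python
import os


def embed_bits(carrier: bytes, bits: list[int]) -> bytes:
    """Naive LSB stego: store one payload bit in the lowest bit of each carrier byte."""
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_bits(carrier: bytes, count: int) -> list[int]:
    """Read back the lowest bit of the first `count` bytes."""
    return [b & 1 for b in carrier[:count]]


def scrub_lsbs(carrier: bytes) -> bytes:
    """What the gateway would do: replace every LSB with fresh random gibberish."""
    noise = os.urandom(len(carrier))
    return bytes((c & 0xFE) | (n & 1) for c, n in zip(carrier, noise))


carrier = bytes(range(64, 192))            # 128 bytes standing in for pixel data
secret = [1, 0, 1, 1, 0, 0, 1, 0] * 16     # 128 payload bits

stego = embed_bits(carrier, secret)
scrubbed = scrub_lsbs(stego)

print(extract_bits(stego, 128) == secret)     # the payload survives normal delivery
print(extract_bits(scrubbed, 128) == secret)  # almost certainly False after scrubbing
```

No byte moves by more than one step, so the picture still looks the same. Note that this only kills payloads that live in the LSBs.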
So what if they know the data is there? If it's encrypted, what can they do?
My other car is first.
wouldn't a stego'd image be indistinguishable from one that had been recompressed?
In the free world the media isn't government run; the government is media run.
I suppose I could have the software on a USB device that could encrypt the data for me, but since I can't get external email on that system I'd have to carry it out of there with me (maybe on the USB device). If I can do that, I can carry it anywhere so why would I risk sending this info from military computers when I can head to the internet cafe, the library in town or Kinko's?
A lot of military folk live on base and may get internet service provided by the military so they could check messages entering and leaving that way, but not on the base my wife works at. They get their connections 3rd-party and it never passes through military routers first.
From what I've [not] seen of my wife's secure work environment, I'd bet the Air Force would get a lot further with the money by providing additional security training to their "com-nazi's" and improving the physical security of their secret information.
They may already be trying to do some sort of scanning of outgoing attachments, because their Exchange servers seem to fold, spindle, and mutilate about two-thirds of the legitimate attachments my wife tries to send home. Then again, I've never seen a network that was "down" as often as theirs is so it may just be inexperience at the controls. Seriously, you can't take an airman out of bootcamp, send him to a few classes and expect them to be able to manage a complex network running Windows.
"terrorism" and "pedophilia" are the root passwords to the Constitution
Detecting steganographic content is probably difficult IMHO, but I would imagine it is easier analysing images to see if they contain differences from the norm, than looking for information with brute force. Most images for example are clean, but if you're seeing a colour variation in a pattern over an image that should contain one colour, this should be easier.
However even with this relatively simple method, the processing power used to analyse every image opened or detached from a host pc (this would probably be a trojan), would take a fair amount of resource time. The delays on the pc alone would probably alert the user.
Although if the need is for a Trojan, they should have probably been more circumspect about announcing their need for it.
It means a lot to them. They have narrowed down the source. Now instead of placing bugs on ten thousand communication lines, they only have to place one.
"Only the small secrets need to be protected. The big ones are kept secret by public incredulity." - Marshall McLuhan
What they really seem to want is an excuse to scan everybody's email and other net traffic as it flows over the net. That's scary.
Is it illegal to open a UDP connection to any random IP address and send series of packets containing completely random data? Would this trigger a probe somewhere by the government?
Another method of hiding a conversation would be to simply have two people connect to the same game server at a certain time of the day. The two people would then simply convert the data they want to send to gun coordinate aiming info. Since both clients receive each others gun coordinates all you have to do is decode this info on the other end. There are millions of ways to hide information these days. Are the chats that occur during most games monitored by the government in the first place?
Just asking.
+2
Once you accept high errors, and accept even high collateral damage as the price of doing "business,"
You've made a very important point there. The future isn't 1984 it's Brazil.
[-- Trust the Monkey --]
Monitoring communications to intercept internal leaks or spies, and then trying to obtain the actual incriminating plaintext? Or merely trying to thwart such communications? The latter is much easier to accomplish than the former. They could simply set up their email gateways to recode all stegano transport data formats (pictures, sound etc) on the fly, thus most likely killing any embedded stegano content, without affecting the usability of non-malicious information too much--a JPEG of Bubba shooting Iraqis will still be viewable after recoding.
with turning lead into gold. No, I take that back: we can turn lead into gold in principle with particle accelerators. Steganography, however, is provably undetectable when it is done correctly.
But our new overlords, the US military, won't be stopped by such little details from wasting their money, like the lords and monarchs before them wasted money on alchemy.
Sure, the Feds could possibly pay Adobe to add a "tell the Feds" bit to Photoshop's Stego feature, if Photoshop had such a thing. But stego isn't the kind of thing you typically ship integrated with other products - it'd be a separate image-manipulation program or audio-manipulation program, or perhaps a plugin for programs that support such things (e.g. a Mozilla stego-reader thing.) And they can't control all sources of image bits, much less image manipulation programs.
The problem is pretty hard - how good are the models you can make of each type of image source and its noise components, how good are the image manipulation programs at transforming noise-like encrypted data into something that matches the statistics of the image noise, what traces does the stego program leave so it can find its own images?
The problem from an honest eavesdropper's perspective is how to keep the false positive rate low enough to wade through the huge amounts of image data, raw or manipulated, on the net to find the potentially very very small amount of real stego. (Hint: the amount of binary data on Usenet is probably well over 100 Megabits per second, and spam still counts if you're looking for stego.) On the other hand, a dishonest eavesdropper only has to maintain an attempt at verisimilitude "We found pr0n on the suspected terrorists' computers, but real religious fanatics wouldn't have that so it must be Steganography! That means they're guilty guilty guilty! And this picture of Osama has his Left Eye Winking and his right middle finger mostly extended! That means that the attack is planned for This Month, in an American City next to a bend in a river!" Sure, it's bogus, but the almanac stuff is bogus too - it only has to keep the sheep feeling nervous about what the Feds might find next, because terrorists are a threat to our nation's Precious Bodily Fluids. An automated stego detector is fine if you want to claim that there are 10,000 suspected terrorist chatter messages per day, but you can't actually issue PR alarms as fast as you get false positives because it'd be way too fast to maintain credibility; crying "Wolf" is something you do at controlled intervals if you want to be believed.
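The false-positive problem above is ordinary base-rate arithmetic, and a quick calculation shows how bad it gets. All three numbers below are invented for illustration (one stego image per million, a detector that catches 99% of them and wrongly flags 1% of clean images). Nothing in the solicitation gives real figures.

```python
def positive_predictive_value(prevalence, true_positive_rate, false_positive_rate):
    """Fraction of flagged items that really contain stego (Bayes' rule)."""
    flagged_real = prevalence * true_positive_rate
    flagged_clean = (1 - prevalence) * false_positive_rate
    return flagged_real / (flagged_real + flagged_clean)


# Assumed numbers, not measurements.
ppv = positive_predictive_value(
    prevalence=1e-6,           # one image in a million carries a payload
    true_positive_rate=0.99,   # detector catches 99% of real stego
    false_positive_rate=0.01,  # and raises a false alarm on 1% of clean images
)
print(f"{ppv:.6f}")
```

Under those assumptions only about one flagged image in ten thousand is real, which is why an honest eavesdropper drowns in alarms.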
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
This is a near-impossible task to do within a relatively acceptable time frame, and I mean both for the development of such a tool and also the time it would take to trawl through images, sound, video and even written TEXT!
The issues some people are talking about, with regards to color changes and inconsistencies with colors in regions of images etc., have all been solved in the stego world through spread spectrum technologies. Unlike simple intuitive methods such as Wong's bitwise encoding of data via usage of weaknesses in the human visual system, spread spectrum algorithms hide the data in the frequency domain, and even then before the data is "embedded" it's usually compressed and encrypted, and passed through filters such that an area in the target image can be found so that the stego data become invariant to DCT compression techniques etc.
All these factors lead me to believe that there is the possibility of building a tool that can produce a probability rating of whether or not an image has stego data, but to what degree that probability is reliable would be another question to answer. But to develop a system that would detect and extract the stego data, well, that will most likely be impossible, because there is no watermarking method that is invariant (meaning you can't extract a watermark from a piece of data when there is no watermark embedded in the image, video etc.).
All in all I think this kind of proposal is like chasing the white rabbit.
Arash Partow
Arash Partow's Philosophy: Be a person who knows what they don't know, and not a person who doesn't know.
Interesting... looking for things being too random...
So then to counter this, the steg programs need to encode data in such a way that the various nonrandom patterns originally present in the unaltered files are preserved.
It seems like this would become a mathematical arms race where, on one side, analyzers are developing new statistical tests for patterns, and on the other side, programmers for steg programs must keep patching their programs to account for these types of patterns.
Steg programs need two inputs: an encrypted text to hide (the message), and a random stream of data to hide it in (the "medium"). The only way that the output can be identified as possibly containing a steganographic message is if the statistical properties of the hidden message are in some way distinct from those of the medium.
That implies that an effective steg program would do some analysis of the statistical properties of the medium prior to hiding the message, and would adapt the statistical properties of the encrypted message to blend in. For example, they might make a message hidden in audio look like Boltzmann noise (assuming there were no other pseudo-random artifacts created by the recording equipment and audio encoding scheme).
Only snag I see is that, if several parameters are adjustable, the values of those parameters would also need to be known on the receiving end.
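The snag about shared parameters is easy to see in a toy version. In the sketch below the only parameter is a key that picks which bytes of the medium carry the message. The helper names are hypothetical, the medium is just random bytes, and Python's `random` module is not cryptographically secure, so treat this as a picture of the idea rather than something to use.

```python
import random


def to_bits(data: bytes) -> list[int]:
    """Split bytes into a flat list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def from_bits(bits: list[int]) -> bytes:
    """Inverse of to_bits."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def embed(medium: bytes, message: bytes, key: int) -> bytes:
    """Hide the message bits in the LSBs of key-selected positions of the medium."""
    bits = to_bits(message)
    positions = random.Random(key).sample(range(len(medium)), len(bits))
    out = bytearray(medium)
    for pos, bit in zip(positions, bits):
        out[pos] = (out[pos] & 0xFE) | bit
    return bytes(out)


def extract(stego: bytes, length: int, key: int) -> bytes:
    """Recover `length` bytes; the receiver must know both the key and the length."""
    positions = random.Random(key).sample(range(len(stego)), length * 8)
    return from_bits([stego[pos] & 1 for pos in positions])


medium = bytes(random.Random(0).getrandbits(8) for _ in range(4096))
stego = embed(medium, b"meet at dawn", key=1234)

print(extract(stego, 12, key=1234))  # prints b'meet at dawn'
```

Every extra adjustable parameter (which statistics were matched, how many bits per sample, and so on) would have to reach the receiver the same way the key and length do here.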
Get your teeth into a small slice: the cake of liberty
Picture this: steganographic propaganda aimed at government spooks.
Searching for steganography is like airport security, and equally futile. Both assume that it's possible to recognize anything that can possibly be used to do ill, even when you don't know what it is, how it works, or what it's for. 99% of the time, you'll have a false alarm; the other 1% of the time, you'll find a really dumb crook who wasn't competent enough to do any real harm anyway. (If he was, you wouldn't have caught him.)
I have a suggestion, by analogy with the crypto geeks who always encrypt, just so that any REAL messages will be lost in the chaff.
Add a doodad to The Gimp. What this doodad does is slurp a bunch of bytes from /dev/urandom and stego them into every single image file it saves. At least make it a checkbox, and checked by default for any lossy compression format.
Stego is supposed to be visually undetectable; that means this won't actually hurt your prized pr0n collection. But it will chaff the heck out of The Man.
Lots of original recordings? No problem. Just whip out your trusty digital camera, switch it to movie mode, and record your kitten playing with a string. Or run a garage band or get a DV camera. There are lots of hobbies that can serve as a cover story for having disks full of uncompressed audio or video files.
Clueless poster and clueless moderators.
Do a spectral analysis before and after compression of noise and then compare that result to the before and after of compressed data. Use a decent compression algorithm. Now encrypt the compressed noise and compare that to the compressed and encrypted data and just for curiosity compare those to uncompressed, unencrypted noise. Use a decent noise source just as you would decent encryption.
What you will find is that compressed and encrypted data is statistically indistinguishable from random noise. Simply compressed data is statistically indistinguishable from most pseudo-random noise generators. Good encryption algorithms offer no improvement over compression and encryption compared to truly random noise, other than providing a smaller comparative sample size for any given amount of compressible data, which is desirable.
Put another way, there is no detectable redundancy in random noise and if done properly there will not be any detectable redundancy in encrypted data either. Redundancy testing using statistical methods of spectral analysis is an excellent quality measure of compressed and/or encrypted data.
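A crude version of the redundancy test described above can be run with nothing but the standard library. The sketch measures byte-level Shannon entropy, which is only a first-order check and far weaker than a proper spectral analysis. It also skips the encryption step, since the standard library has no cipher. The word list and helper name are made up.

```python
import math
import os
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; 8.0 would be a perfectly flat byte histogram."""
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


# Redundant "plaintext": random picks from a small vocabulary (illustrative only).
rng = random.Random(42)
words = ["attack", "dawn", "river", "bridge", "north", "south", "delay", "abort",
         "signal", "train", "harbor", "market", "friday", "monday", "seven", "nine"]
plaintext = " ".join(rng.choice(words) for _ in range(50_000)).encode("ascii")

compressed = zlib.compress(plaintext, 9)
noise = os.urandom(len(compressed))

for label, blob in (("plaintext", plaintext), ("compressed", compressed), ("noise", noise)):
    print(f"{label:10s} {len(blob):7d} bytes  {byte_entropy(blob):.3f} bits/byte")
```

On this measure the compressed stream should land close to the random noise and well above the plaintext. A determined analyst would go on to look at headers and higher-order structure, as the next paragraph warns.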
There are however a couple of things to look out for when using typically available compression and encryption products, namely the almost universal incorporation of identifying file headers, which do serve as fingerprints. It is therefore important to strip these headers prior to transmission and send only the compressed/encrypted data. Also beware of predefined data such as decoding dictionaries embedded within the encrypted file format and use other algorithms as necessary. As long as you encrypt after compression you need not worry about such dictionaries, commonplace in compressed file formats. Note that it serves virtually no purpose to compress after encryption anyway since there will not be any redundancy in the data to compress.
Steganography is really nothing more than security by obscurity in and of itself and, if one is not careful, can be detected even though the incorporated data is indistinguishable from noise. As such it is a prerequisite that the transfer medium contain a sufficient amount of noise that the data can be contained within it. In the absence of the original, unmodified transfer medium to use for comparison, the stego'd medium will not be readily detectable and the encrypted data will not be extractable as long as that data is consistent with the expected statistical analysis of the noise contained within the source medium.
Re-read that last sentence a few times.
If there are known statistical anomalies in the noise signature of any given transfer medium and the stego'd version doesn't contain those anomalies, then that is an identifier of stego'd media. It may not be of value in extracting the embedded information, let alone decrypting the data, but there is strategic value in knowing that a data set or data stream has been subjected to steganography, in the same way there is strategic value in knowing that data has been encrypted without knowing the plain text of that data.
If your stego'd data cannot be distinguished from random noise then the transfer noise channel needs to be equally random. If the noise channel is not truly random then the stego'd data signature should be comparable to the signature of the noise channel.
There are ways around this requirement if the data size is small in relation to the set size of the noise channel in the transfer medium and that data is interleaved spread spectrum. The result is that the data does not significantly alter the signature of the noise channel, or otherwise keeps any difference below the statistical threshold of detection, with that level being an arbitrary value set by the observer.
In conclusion it is equally important to analyse and understand the noise channel in the transfer medium as it is the data to be injected and the composite result in comparison. Choose your com
Unleash a worm upon the world. Let it spread, scan the disks of the infected machines for JPEG files, and stego-encode randomly picked data encrypted by a randomly generated key into them. In just few hours, the false positives in any stego-detecting systems shoot up by many orders of magnitude, effectively rendering them useless.
Got the noise? Now you can guess at content: perhaps standard English phrases. Know both and you're well on the way to breaking it.
For example, if you suspect that people might be using it to talk about terrorism, you might want to guess that the message contains a phrase like "allah akbar" or somesuch.
plz ignore
Forensic investigators tend to decode encrypted data on a hard drive examination using one of two methods:
1. Find an encrypted file and compel the suspect to disclose the key
2. Find old plaintext of the file
When you encrypt something, it's often stored on disk in a plaintext format before encryption. After deleting that file, the bits in it can be recovered in many cases using forensic techniques. Also the encryption program or other programs you use on the file may store the bits in memory that ends up being written to swap space.
IOW encryption is useless unless you're very careful about where the plaintext goes. I would assume that steganography follows the same rules.
Clueless poster and clueless moderators.
Indeed.
It's simply astonishing to me that the only stego method the whole of Slashdot's commentators (the ones I've read so far) are capable of thinking of is hiding data in the least significant parts of data (LSB). Which is, in reality, about as powerful a technique as rot-13 is encryption-wise.
Writing something in LSB doesn't survive _any_ data manipulation, filtering, re-coding or pretty much anything else. If you want to hide something, you hide it in the MOST significant part of the data, where your payload is guaranteed to survive as long as the host data does.
You generally achieve this by spread spectrum encoding, which is roughly a method of splitting the power of your signal over a large number of the most significant data bins (frequencies, various transformation factors or whatnot). By using this technique, not only is your data imperceptible, although it is hidden in MSBs (of sorts), it is also hidden by the fact that only by having the key for selecting the right data bins can you decipher the stego data.
Spread spectrum techniques can be made unbelievably robust. So much that you could embed a message in a picture, print it out, scan it back in, crop half of it, and still be able to recover the message (now that's a nice James Bond trick).
Granted, usable payload wouldn't be on the order of 1/10th of the carrier data (as with LSB techniques), more on the order of 1/10000th, but large volumes of carrier data these days are easy to come by.
Feel free to google for more info.
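For anyone who wants something more concrete than a search, here is a bare-bones sketch of the spread spectrum idea described above. It assumes the carrier is already a list of transform coefficients (faked here with Gaussian noise), spends 2000 coefficients on each payload bit, and uses a key-seeded sequence of +1/-1 chips. The constants and function names are invented for this example, and a real scheme would be far more careful about which coefficients it touches.

```python
import random

CHIPS_PER_BIT = 2000  # carrier coefficients spent on each payload bit (illustrative)
STRENGTH = 3.0        # embedding amplitude, small next to the carrier's own spread


def chip_sequence(key: int, length: int) -> list[int]:
    """Key-dependent pseudo-random sequence of +1/-1 chips."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(carrier: list[float], bits: list[int], key: int) -> list[float]:
    """Smear each bit across a whole block by adding +/- STRENGTH times the chips."""
    chips = chip_sequence(key, len(bits) * CHIPS_PER_BIT)
    out = list(carrier)
    for i, chip in enumerate(chips):
        sign = 1 if bits[i // CHIPS_PER_BIT] else -1
        out[i] += STRENGTH * sign * chip
    return out


def extract(signal: list[float], bit_count: int, key: int) -> list[int]:
    """Correlate each block with the chips; the sign of the sum gives the bit."""
    chips = chip_sequence(key, bit_count * CHIPS_PER_BIT)
    bits = []
    for b in range(bit_count):
        block = range(b * CHIPS_PER_BIT, (b + 1) * CHIPS_PER_BIT)
        correlation = sum(signal[i] * chips[i] for i in block)
        bits.append(1 if correlation > 0 else 0)
    return bits


rng = random.Random(7)
message = [rng.getrandbits(1) for _ in range(32)]
carrier = [rng.gauss(0.0, 10.0) for _ in range(32 * CHIPS_PER_BIT)]  # stand-in coefficients

stego = embed(carrier, message, key=99)
damaged = [x + rng.gauss(0.0, 5.0) for x in stego]  # simulate re-coding or filtering noise

print(extract(damaged, 32, key=99) == message)   # right key: payload recovered
print(extract(damaged, 32, key=100) == message)  # wrong key: almost certainly garbage
```

The payload survives added noise because each bit is averaged over thousands of coefficients. That same averaging is why the usable capacity is so small.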
Wow. Two alcohol-influenced comments made the same night went from 1 to 5.
I need to drink more.
Terrorists can attack freedom, but only Congress can destroy it.